
Debris-flow susceptibility analysis using fluvio-morphological parameters and data mining: application to the Central-Eastern Pyrenees


ORIGINAL PAPER


G. G. Chevalier • V. Medina • M. Hürlimann • A. Bateman

Received: 12 January 2012 / Accepted: 8 January 2013 / Published online: 30 January 2013
© Springer Science+Business Media Dordrecht 2013

Abstract Based on debris-flow inventories and using a geographical information system, the susceptibility models presented here take into account fluvio-morphologic parameters gathered for every first-order catchment. Data mining techniques are applied to the morphometric parameters to build and test three different models. The first model is a logistic regression analysis based on weighting the parameters. The other two are classification trees, which are rather novel susceptibility models. These techniques make it possible to gather the data needed to evaluate the performance of the models tested, with and without optimization. The analysis was performed in the Catalan Pyrenees and covered an area of more than 4,000 km². Results for the training dataset show that the performance of the optimized models lies within the previously reported range in terms of AUC, although close to its lower end (near 70 %). When the models are applied to the test set, the quality of most results decreases. However, of the three models, logistic regression seems to offer the best prediction, as the results for the training and test sets are very similar in terms of performance. Trees are better at extracting laws from a training set, but validation through a test set gives results that are unacceptable for a prediction at regional scale. Although they omit geological and vegetation parameters, fluvio-morphologic models based on data mining can be used in the framework of a regional debris-flow susceptibility assessment in areas where only a digital elevation model is available.

Keywords Debris flows · Susceptibility · Morphometry · Data mining

G. G. Chevalier · M. Hürlimann
Department of Geotechnical Engineering and Geosciences, UPC—Barcelona Tech, J. Girona 1-3 (D2), 08034 Barcelona, Spain

G. G. Chevalier (corresponding author) · V. Medina · A. Bateman
Sediment Transport Research Group (GITS), Department of Hydraulic, Marine and Environmental Engineering, UPC—Barcelona Tech, J. Girona 1-3 (D1), 08034 Barcelona, Spain
e-mail: [email protected]

Nat Hazards (2013) 67:213–238
DOI 10.1007/s11069-013-0568-3

1 Introduction

Debris-flow hazard poses a substantial threat in mountainous environments (Hungr et al. 1984; Iverson 1997). Debris-flow occurrence and recurrence leave traces in the landscape. Debris flows are generally found at high elevations and on steep slopes and are likely to follow established channels (Coussot and Meunier 1996; Hungr et al. 2001). Various classifications of this devastating erosional phenomenon exist, which have helped to harmonize its understanding worldwide (e.g., Coussot and Meunier 1996; Jakob 2005a). Recently, Hungr et al. (2012) proposed an update to the classification they proposed in Hungr et al. (2001), which is widely used in the literature and also in this paper.

Hazard is generally assessed in terms of frequency, intensity and location (Jakob and

Hungr 2005; Gentile et al. 2008). Susceptibility, or spatial occurrence, accounts for

location and, therefore, tries to predict future occurrences (Guzzetti et al. 2005).

In many cases, susceptibility is assessed through field inventory mapping or heuristic classification of the terrain. However, it can benefit from computer modeling (e.g., Jakob 2005b). Analyses of the landscape that report the locations of past events are numerous and form the backbone of debris-flow and landslide susceptibility assessments. Moreover, many different approaches have been proposed for landslides (Guzzetti et al. 2006).

Debris-flow susceptibility models rely on meaningful parameters that are gathered from past events and are appropriate to describe the phenomenon (e.g., Iverson 1997); they aim at predicting the location of future activity. Morphometric indicators have already proved to contribute greatly to landslide studies because they are easy to determine. Models following this trend have been developed for debris-flow hazard (Coe et al. 2004; Chen and Yu 2011).

Model performance is an issue that has emerged with the increasing number of models found in the literature. Estimating model quality, comparing models and evaluating their performance through different techniques have been the focus of recent studies on debris-flow and landslide susceptibility (Guzzetti et al. 2006; Carrara et al. 2008; Frattini et al. 2010). When compared to complex statistical models (Chung and Fabbri 2003; Remondo et al. 2003), simple heuristic methodologies and analyses of landslide susceptibility seem to give a similar level of performance (Guinau et al. 2005). This, however, does not imply a similar spatial distribution in the resulting maps (Sterlacchini et al. 2011).

In this study, data mining techniques were applied. They have the advantage of handling large quantities of data, and classification trees simplify the presentation of the results (Wan et al. 2008; Wan and Lei 2009). Trees are not common in the literature, as opposed to matrices (Fawcett 2006; Frattini et al. 2010). These techniques make it easier to support modeling results and to assess the robustness of the models, because success and prediction rate curves are easily obtained (e.g., Santacana et al. 2003). Care and attention are necessary when interpreting these results (Blahut et al. 2010).

Catchments are often taken as a fundamental geomorphic unit (Coehlo-Netto et al. 2006). Debris-flow susceptibility assessments are frequently carried out at catchment scale (Okunishi and Suwa 2001; Bacchini and Zannoni 2003; Melelli and Taramelli 2004; Catani et al. 2005). Generally, four scales of landslide zoning maps are considered, depending on the indicative range of scales and the typical area of zoning (Fell et al. 2008): detailed, large, medium and small. Catchment scale is often referred to as detailed, and Guzzetti et al. (2006) report that statistics are best suited to landslide susceptibility studies of large areas at small scale.


In the Pyrenees, debris flows are scarce but pose a serious threat, as past occurrences have dramatically shown (e.g., Alcoverro et al. 1999). They are mainly triggered when high-intensity, short-duration rainfall events hit the range (Hürlimann et al. 2003; Portilla et al. 2010). Susceptibility to shallow landslides in the Pyrenees has already been addressed in past studies (Baeza and Corominas 2001; Baeza et al. 2010). Most of these studies consider a regional scale and use a statistical approach. They have guided the work presented herein.

The main goal of this study is to elaborate a debris-flow susceptibility analysis at regional scale using fluvio-morphological parameters and headwater catchments as the study unit. The approach is meant to be reproducible in remote areas where little is known about the physics of the erosional processes and where no data other than a digital elevation model (DEM) are available. Another goal was to produce results that are easy to understand and, therefore, comprehensible to non-experts. To achieve this, the applicability of decision trees is checked.

The paper first describes the study area, made up of four test areas, and the elaboration of the inventory of past debris flows. A fluvio-morphological analysis is then carried out, and its methodology is detailed. The selected parameters are introduced, and the results of crossing the geographical information system (GIS) information with the debris-flow inventory are shown. The following section explains the learning process. First, the paper focuses on the sets and methods used before presenting the models; second, it evaluates the performance and credibility of the models.

2 Study area and debris-flow inventories

2.1 General settings

The Pyrenees extend over 430 km onshore along an East–West axis and mark the boundary between France and Spain, with the Principality of Andorra lying within the range (Fig. 1). This study focuses on the Central-Eastern Pyrenees, whose rocks range from Ordovician to Devonian in age and include Tardy-Hercynian intrusions. The material making up the Pyrenees started to uplift some 40 million years ago (Munoz 1992; Teixell 1998; ICC 2003). A dense and complex fault network characterizes Pyrenean stratigraphy. There is, however, little tectonic activity, and exhumation rates are low (Fitzgerald et al. 1999; Lynn 2005).

Pyrenean relief starts at sea level near the eastern and western extremities and reaches over 3,400 m above sea level (m asl). Past glaciations have carved U-shaped valleys and cirques into the landscape, which testify to the erosive power and the extent of the regional glaciers. Deglaciation destabilized steep slopes during the last glacial cycle. Current landslide activity is a remnant of this destabilization and induces a discontinuous sequence of deposition, emphasized by colluvium lying over bedrock or till. Such outcrops are common in the Axial Pyrenees, which, together with the pre-Pyrenees, is comprehensively described by the ECORS Pyrenees Team (1988).

Summers are characterized by dryness and convective storms. During the rest of the year, humidity peaks in autumn, which has the highest precipitation. These extreme seasonal variations result from the latitudinal position of the range within a temperate zone (Cuadrat and Pita 1997; Martin and Olcina 2001). Yearly precipitation, ranging from 850 to 1,200 mm, is influenced by the high relief combined with prevailing winds from the west.


2.2 Study area and debris-flow inventory

The study presented below is based on sampling the continuum of regional catchments and slopes. The study area and its extent are dictated by past studies, interpretations of aerial views and recent field investigations conducted in the region. Figure 2 shows a flowchart of the methodology of the study.

Four zones have been identified and make up the study area (Fig. 3). Berga lies mostly within the pre-Pyrenees. NWCat (North West Catalonia), Andorra and Mollo fall into the Axial Pyrenees. These zones cover over 4,000 km² and have been defined to represent typical environments of the Central-Eastern Pyrenees, where debris flows have been triggered in the past (e.g., Portilla et al. 2010).

Fig. 1 Pyrenean geological context (after ECORS team 1988). The study areas are highlighted in red

Fig. 2 Flowchart of the methodology


The term reactive has been borrowed from medical vocabulary and is understood as showing a response to a stimulus. In this study, the response is a debris flow, and the stimulus is a rainfall event. When clear signs of debris-flow activity were witnessed or reported, reactivity was assigned to the corresponding catchment in the inventory. Debris flows can travel great distances, but the study focuses on the source catchments and does not account for runout. Moreover, the origin of the debris flow is not a criterion of distinction: landslide-triggered and in-channel debris flows are equally retained in the inventory.

Because debris flows are not systematically surveyed in the Central-Eastern Pyrenees, most debris-flow events are not reported. Therefore, an inventory of past debris flows was needed. The debris flows used in this study, which define the reactive catchments, have been gathered and digitized from past studies or analyses, aerial picture surveys and contemporary interactive surveys. No information on the triggering mechanism is sought or reported in this study. The sources of the inventory, which spreads over four zones (Fig. 3), are given in Table 1. As Guzzetti et al. (2006) highlighted, a multi-temporal landslide inventory seems to produce better results than clustering the landslide events in time intervals and studying the susceptibility within these intervals. The inventory elaborated for this study follows these guidelines, and no temporal distinction is considered or shown here.

Debris flows in the Berga zone were determined using an existing database (Clotet and Gallart 1984; Baeza 1994) and contemporary image providers. The existing database covers different erosional processes. A filter was therefore necessary, and only debris flows involving at least 1,000 m³ of material were kept in this zone after preliminary analysis. Aerial pictures from 2008, visible in Google Earth™, were used to determine where unreported debris flows could have occurred. Criteria such as changes of vegetation in the landscape, landslide scar(s), clear visibility of a torrent, gully or stream where roughness could be assessed, or the presence of potential deposition fans were considered. Based on Guthrie and Evans (2007), the time period covered by these criteria ranges from 50 to 100 years.

Fig. 3 Central-Eastern Pyrenees highlighting the study area. Insets show (a) a DEM view of Andorra and (b) a DEM view of NWCat, Berga and Mollo. In (a) and (b), white points mark the debris flows used in this study for each zone. For the DEM, the darker the terrain, the higher the elevation


In the Mollo zone, an unusually high-intensity rainfall event caused a great flood and many surficial slope failures in 1940, some of which developed into debris flows (Parde 1941). An analysis of these failures through the interpretation of 1956/57 aerial pictures made it possible to determine the debris flows related to this event (Portilla 2010). These debris flows are used to determine the reactive catchments. No more recent information was obtained for this zone.

In the NWCat zone, Google Earth™ proved useful for analyzing 2008 aerial pictures. Together with the criteria listed above, it allowed traces of debris-flow activity to be recognized in the landscape. In addition, further sets of aerial pictures available for this zone were consulted. On the one hand, black-and-white aerial pictures from flights in 1975/76 and 1982/84 were studied. On the other hand, the Catalan Cartographic Institute (ICC) shares aerial pictures of the zone from 1956/57 online through the Internet application OrtoXpres 1.0 (URL: www.ortoxpres.cat/client/icc/). Both sources of information were used.

Finally, debris flows in the Andorra zone were determined by (1) a compilation of events derived from the interpretation of aerial pictures taken between 2003 and 2008 (Google Earth™) using the criteria mentioned above and (2) a compilation of historic debris flows that occurred in the zone, based on reports of past events. This zone was used to validate the proposed models, whereas the other zones were used to build them.

Table 2 shows, for each zone, the total area, number of catchments, mean area per catchment, mean elevation per catchment, number of reactive catchments and total number of debris-flow paths (see Fig. 2 for the flowchart of the methodology). Berga contains 45 % of the total number of catchments but only 26 % of the reactive catchments. Its catchments also display the lowest mean elevation. Mollo, although the smallest zone with 11 % of the total number of catchments, accounts for 42 % of the reactive catchments. The smallest catchments are found in Mollo. NWCat is the biggest test area, with more than 50 % of the total area and 44 % of all the catchments. The mean elevation is highest in this area, and 32 % of the reactive catchments are found there.

3 Fluvio-morphological analysis

3.1 Digital elevation model analysis

The landscape can be recreated in a GIS through a 5 × 5 m DEM (obtained by digitizing contours from existing maps). The DEM of the study area was used to discretize the landscape and extract headwater catchments.

Table 1 Source of information in the elaboration of the inventory

Zone      Existing field data or inventories          Aerial pictures analyzed
                                                      Google Earth™ and          Standard aerial pictures
                                                      OrtoXpres 1.0, 2008–2009   1975/76 and 1982/84
Berga     Clotet and Gallart (1984), Baeza (1994)     X                          X
Mollo     Portilla (2010)                             -                          -
NWCat     nd                                          X                          X
Andorra   nd                                          X                          -

See text for references to OrtoXpres 1.0 and existing field data


Imperfections of the DEM were filled, and the flow direction and flow accumulation commands followed. The next step consisted of editing stream lines and catchment polygons. The definition of streams requires a minimum drainage area for initiating a stream, which was set at 1 km²; for this reason, no catchment has a drainage area smaller than this value. The literature provides numerous studies concerning the initiation of streams and the minimum contributing area to consider in order to locate the “best starting point” of a stream (Montgomery and Dietrich 1988, 1989; Tarboton et al. 1991; Tarboton and Ames 2001). The problem is mainly related to the stream density (or stream-area ratio), which affects the representation of the drainage network. In this work, the minimum contributing area is fixed arbitrarily, based on the information collected in the database. Similar studies have been carried out using a smaller minimum drainage area for initiating streams, but their study area was smaller than the one considered herein (Carrara et al. 1991). Stream lines were then processed as drainage lines. Aggregated upstream catchments were generated, and finally, catchments were created. Given the number of catchments created, special attention must be paid to the identity of catchments and streams throughout the elaboration of the database.
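As a rough illustration of the stream-initiation rule described above, the following minimal Python sketch converts the 1 km² minimum drainage area into a number of 5 × 5 m cells and flags the cells of a flow-accumulation grid that reach it. The function names and the toy grid are hypothetical; they are not taken from the GIS workflow used in the study.

```python
# Values quoted in the text: 5 m DEM cells, 1 km2 minimum drainage area.
CELL_SIZE_M = 5.0
MIN_DRAINAGE_AREA_KM2 = 1.0


def min_contributing_cells(cell_size_m=CELL_SIZE_M,
                           min_area_km2=MIN_DRAINAGE_AREA_KM2):
    """Number of upstream cells needed before a cell counts as a stream."""
    cell_area_m2 = cell_size_m ** 2
    return int(round(min_area_km2 * 1_000_000 / cell_area_m2))


def stream_mask(flow_accumulation, threshold_cells):
    """Flag cells whose upstream cell count reaches the threshold."""
    return [[acc >= threshold_cells for acc in row]
            for row in flow_accumulation]


threshold = min_contributing_cells()
# Hypothetical flow-accumulation values (upstream cell counts).
toy_accumulation = [[10, 40_000], [39_999, 50_000]]
mask = stream_mask(toy_accumulation, threshold)
```

With these values, a stream starts once 40,000 cells drain to a location, which is why no catchment in the database is smaller than 1 km².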

The analysis of the landscape is generally performed using a DEM representation of the land surface. Features such as slope, curvature and flow accumulation, among many others, are obtained through DEM processing. However, some issues should be taken into account. Geometrical parameters such as slope, flow direction or aspect raise several complex concerns. As a simplified illustration of the problem, consider the slope: in a DEM, every cell has 8 neighbors, which in most cases do not share a planar surface, whereas three points are enough to define a plane. Consequently, for a single cell, 14 different choices for slope are available. The problem is not trivial, and many results are possible. Similar issues could be described for the computation of flow direction or flow accumulation. These problems have been widely analyzed (Tarboton 1997; Wilson and Gallant 2000; Pike 2002).

Table 2 Area, number of catchments, mean area and mean elevation per catchment, number of reactive catchments and number of debris flows for the three zones of the training set, with “All” summarizing the whole training set. The last row gives the test set, Andorra. Percentages shown are relative to the training set

Zone      Total area    Number of     Mean area per     Mean elevation    Number of reactive    Number of
          (km² (%))     catchments    catchment (km²)   per catchment     catchments (# (%))    debris flows (#)
                        (# (%))                         (m asl)
Berga     1,386 (34)    457 (45)      2.27              1,380             20 (26)               36
Mollo     547 (13)      119 (11)      2.21              1,712             33 (42)               139
NWCat     2,223 (53)    446 (44)      2.32              1,942             25 (32)               76
All       4,156 (100)   1,022 (100)   2.29              1,664             78 (100)              251
Andorra   468           113 (–)       2.31              2,208             41 (–)                91

Concerning the analysis carried out in this work, the most relevant choice was to use the approach of O’Callaghan and Mark (1984), commonly known as D8. This approach has several limitations; for instance, it is not able to model flow divergence in ridge areas. However, it properly captures the basin area, which is the main concern of this paper. Flow accumulation is computed with the Jenson and Domingue (1988) algorithm, and the slope follows the Burrough and McDonnell (1998) approach.
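The D8 rule mentioned above can be sketched in a few lines of Python: each cell drains to the single neighbor, among its eight, that offers the steepest downward gradient. This is only a minimal illustration under stated assumptions (a small list-of-lists DEM, a 5 m cell size, ties resolved in favor of the first neighbor examined); it is not the GIS implementation used in the study, and the function name is hypothetical.

```python
import math

# Row/column offsets of the eight neighbors of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def d8_direction(dem, row, col, cell_size=5.0):
    """Return the (d_row, d_col) offset of the steepest downslope neighbor.

    Returns None when no neighbor is lower (pit or flat cell).
    """
    best_offset, best_gradient = None, 0.0
    for d_row, d_col in NEIGHBOURS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
            # Diagonal neighbors are farther away by a factor sqrt(2).
            distance = cell_size * math.hypot(d_row, d_col)
            gradient = (dem[row][col] - dem[r][c]) / distance
            if gradient > best_gradient:
                best_offset, best_gradient = (d_row, d_col), gradient
    return best_offset


# Hypothetical 3 x 3 elevation grid (m).
toy_dem = [[10, 10, 10],
           [10, 9, 8],
           [10, 8, 5]]
centre_direction = d8_direction(toy_dem, 1, 1)
corner_direction = d8_direction(toy_dem, 2, 2)
```

Because each cell passes all of its flow to exactly one neighbor, the rule cannot split flow on ridges, which is the limitation noted in the text.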

3.2 Parameters selected

First-order catchments are the unit of this study and support a series of fluvio-morphological parameters (Table 3), each applied to one of two spatial features: catchments (polygons) or streams (lines) (reminder: the minimum contributing area for catchments and streams is 1 km²). They were derived from the DEM (Fig. 2). Information was gathered from topography, slope, stream order and orientation (aspect) raster files using the zonal statistics and zonal geometry tools (both commands of “Spatial Analyst”).

Area and perimeter have been defined by counting the number of cells within the polygons and making up their boundaries, respectively. Maximum, minimum and mean elevations were derived directly from the DEM topographic information. The mean slope of a catchment was calculated by averaging the slope values of all the cells making up the catchment. The same logic is applied to mean orientation. Length and width have been determined for each catchment by extrapolating its shape to an ellipsoid; length is used in the calculation of the fluvio-morphological ratios, but length and width are not used as parameters.

The length of the streams contained within first-order catchments was extracted from initiation to outlet. Apart from the stream length, different values of slope were gathered: average slope, 200-m slope and outlet slope. The average slope is the slope over the entire segment of stream within a catchment. The 200-m slope is the average slope over 200 m, starting at the outlet and going upstream. The outlet slope is similar to the 200-m slope but is gathered over only 50 m.
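The three stream slopes differ only in the length of the reach measured upstream from the outlet. The sketch below shows one possible way to compute them from a longitudinal profile; the profile format (distance upstream from the outlet, elevation), the linear interpolation and the function name are assumptions made for illustration, not the procedure used in the GIS.

```python
import math


def reach_slope_deg(profile, reach_m):
    """Mean slope (degrees) over `reach_m` metres upstream of the outlet.

    `profile` is a list of (distance_from_outlet_m, elevation_m) pairs,
    sorted by distance, with at least two points. If the stream is
    shorter than `reach_m`, the whole stream is used.
    """
    d0, z0 = profile[0]
    for (d1, z1), (d2, z2) in zip(profile, profile[1:]):
        if d2 - d0 >= reach_m:
            # Interpolate the elevation at the upstream end of the reach.
            target = d0 + reach_m
            z_end = z1 + (z2 - z1) * (target - d1) / (d2 - d1)
            return math.degrees(math.atan((z_end - z0) / reach_m))
    d_end, z_end = profile[-1]
    return math.degrees(math.atan((z_end - z0) / (d_end - d0)))


# Hypothetical stream profile: 1 km long, rising 300 m from the outlet.
toy_profile = [(0.0, 1000.0), (100.0, 1010.0), (300.0, 1050.0), (1000.0, 1300.0)]
outlet_slope = reach_slope_deg(toy_profile, 50.0)     # Se, over 50 m
slope_200 = reach_slope_deg(toy_profile, 200.0)       # S200, over 200 m
average_slope = reach_slope_deg(toy_profile, 1000.0)  # whole stream
```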

Table 3 List of parameters including abbreviations, units, equations when relevant and what feature (catchment or stream) is considered

Parameter                   Units            Equation                      Applied to
Area (A)                    km²              (–)                           Catchment
Perimeter (P)               km               (–)                           Catchment
Maximum elevation (Hmax)    m asl            (–)                           Catchment
Minimum elevation (Hmin)    m asl            (–)                           Catchment
Mean elevation (Hm)         m asl            (–)                           Catchment
Mean slope (SmC)            Degrees          (–)                           Catchment
Mean orientation (O)        Degrees (N–S)    (–)                           Catchment
Mean slope (SmS)            Degrees          (–)                           Stream
200-m slope (S200)          Degrees          (–)                           Stream
Exutory slope (Se)          Degrees          (–)                           Stream
Length (Ls)                 m                (–)                           Stream
Melton ratio (MR)           (–)              $dH/\sqrt{A}$                 Catchment
Form factor (FF)            (–)              $A/L^{2}$                     Catchment
Basin elongation (BE)       (–)              $2\sqrt{A}/(L\sqrt{\pi})$     Catchment
Lemniscate ratio (LR)       (–)              $L^{2}\pi/(4A)$               Catchment

dH = Hmax − Hmin; L is the catchment’s length when its shape is extrapolated to an ellipsoid


Morpho-hydrological ratios have long been used to characterize catchments (references in Zavoianu 1978). They are often easy to determine and involve few parameters. Four of them were considered in this study: the Melton ratio, or ruggedness number, is an index of average catchment slope (Melton 1965); basin elongation compares the longest dimension of the basin to the diameter of a circle of the same area as the basin (Schumm 1956); the form factor gives information about the shape of a catchment (Horton 1932); and the lemniscate ratio is a measure of how closely the shape of the catchment approaches a lemniscate (Chorley 1957).
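The four ratios follow directly from the equations of Table 3 and can be written as short Python functions. The function and argument names are illustrative, and the conversion of the relief from metres to kilometres in the Melton ratio is an assumption made here so that the ratio is dimensionless; Table 3 does not state the units used in the equations.

```python
import math


def melton_ratio(h_max_m, h_min_m, area_km2):
    """MR = dH / sqrt(A); relief converted to km (assumption)."""
    relief_km = (h_max_m - h_min_m) / 1000.0
    return relief_km / math.sqrt(area_km2)


def form_factor(area_km2, length_km):
    """FF = A / L^2."""
    return area_km2 / length_km ** 2


def basin_elongation(area_km2, length_km):
    """BE = 2 sqrt(A) / (L sqrt(pi)): equal-area circle diameter over length."""
    return 2.0 * math.sqrt(area_km2) / (length_km * math.sqrt(math.pi))


def lemniscate_ratio(area_km2, length_km):
    """LR = L^2 pi / (4 A)."""
    return length_km ** 2 * math.pi / (4.0 * area_km2)


# Sanity check on a circular catchment of diameter 2 km:
# basin elongation and lemniscate ratio should both equal 1.
circle_length_km = 2.0
circle_area_km2 = math.pi * circle_length_km ** 2 / 4.0
circle_be = basin_elongation(circle_area_km2, circle_length_km)
circle_lr = lemniscate_ratio(circle_area_km2, circle_length_km)
```

The more elongated the catchment, the further basin elongation drops below 1 and the further the lemniscate ratio rises above 1.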

3.3 Statistical results

Reactive catchments have been identified and parameters extracted. Table 4 presents

statistical results for the 14 parameters showing mean, standard deviation, minimum,

maximum and median values. They concern reactive catchments, as well as all other

catchments present in the three zones making up the training set.

The results obtained generally agree well with published data regarding the area (Rickenmann 1999; Welsh and Davies 2010), altitude (Blahut et al. 2010), mean slope of catchments (Rickenmann and Zimmermann 1993; Jakob and Hungr 2005) and Melton ratio (Portilla et al. 2010; Welsh and Davies 2010).

Figure 4 depicts relationships between parameters through four bi-dimensional combinations. Distinguishing between catchments based on fluvio-morphological parameters is not straightforward, as neither class of catchments is ever found clustered on its own: a given value on the X-axis (or Y-axis) shows both non-reactive and reactive catchments. Nevertheless, thresholds emerge that are capable of separating the two classes of catchments. Although a small number of reactive catchments do not respect these simple rules of occurrence, thresholds of 11 km² in area (Fig. 4a), 20° in mean slope (Fig. 4a), 6 km in stream length (Fig. 4b), 0.25 in Melton ratio (Fig. 4b) or 1,500 m asl in maximum elevation (Fig. 4c) could restrict the spatial occurrence of reactive catchments. However, not all of these thresholds are relevant for discriminating between reactive and non-reactive catchments.
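To make the idea of such threshold rules concrete, the sketch below screens a catchment against the five values quoted above. The text gives the threshold values but not the direction of each inequality; the directions used here (small, steep, short-stream, rugged and high catchments) are assumptions, as are the dictionary keys and the two example catchments.

```python
def within_reactive_envelope(catchment):
    """True when a catchment satisfies all five threshold rules.

    Threshold values come from the text; the inequality directions
    are assumptions made for this illustration.
    """
    return (
        catchment["area_km2"] <= 11.0
        and catchment["mean_slope_deg"] >= 20.0
        and catchment["stream_length_km"] <= 6.0
        and catchment["melton_ratio"] >= 0.25
        and catchment["h_max_m"] >= 1500.0
    )


# Hypothetical catchments, for illustration only.
steep_catchment = {"area_km2": 2.5, "mean_slope_deg": 27.0,
                   "stream_length_km": 1.6, "melton_ratio": 0.6,
                   "h_max_m": 2100.0}
gentle_catchment = {"area_km2": 12.0, "mean_slope_deg": 12.0,
                    "stream_length_km": 7.5, "melton_ratio": 0.15,
                    "h_max_m": 900.0}
steep_passes = within_reactive_envelope(steep_catchment)
gentle_passes = within_reactive_envelope(gentle_catchment)
```

Such a screen can only rule catchments out; many non-reactive catchments also pass it, which is why the text turns to data mining techniques.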

Some parameters show extreme values for debris-flow occurrence (area, elevations, Melton ratio), whereas others better highlight clusters (basin elongation, slope). For this reason, data mining and statistical techniques were applied to the training set and are discussed in the following sections.

4 Learning process

4.1 Generalities

In this section, the learning process and the resulting knowledge are described and evaluated. The target variable of the analysis is reactivity, which has two classes: reactive and non-reactive. The aim is to define a model able to predict debris-flow susceptibility, expressed as a probability, at catchment scale.

Standard procedures belonging to data mining are used (e.g., Witten et al. 2011). The

structure of the work follows the classical approach: (1) Procedures and algorithms are run

in order to obtain knowledge as a consequence of a learning process, (2) the knowledge is

tuned (or optimized) and (3) the resulting knowledge is validated with the test set.

It should be pointed out that the main class is the reactive class that is why it is defined

as ‘‘positive,’’ and non-reactive class is defined as ‘‘negative.’’ Figure 5 exemplifies the

Nat Hazards (2013) 67:213–238 221

123

Tab

le4

Sta

tist

ical

tab

lesh

ow

ing

mea

n,

stan

dar

dd

evia

tio

n,

min

imu

m,

max

imu

man

dm

edia

nv

alu

eso

fth

e1

4p

aram

eter

sin

ves

tig

ated

.In

no

rmal

fon

tar

eth

e1

,022

catc

hm

ents

mak

ing

up

our

trai

nin

gse

t,an

din

bold

and

ital

icar

eth

e78

reac

tive

catc

hm

ents

AP

Hm

ax

Hm

inS

mO

Ls

Mea

n

B2

.27

2.6

29

.18

9.8

81

,763

2,0

601

,037

1,1

792

4.7

42

7.1

61

79

17

11

.37

1.6

6

M2

.21

3.0

18

.62

9.6

12

,076

2,3

211

,312

1,4

132

4.1

82

5.9

81

71

17

11

.21

1.6

1

NC

2.3

22

.32

9.0

79

.28

2,3

90

2,6

041

,449

1,3

912

7.6

93

2.1

91

72

20

91

.35

1.3

3

Sta

nd

ard

dev

iati

on

B1

.34

1.0

72

.97

2.6

84

46

36

92

88

25

95

.15

4.0

53

92

91

.21

1.2

6

M1

.46

2.0

72

.58

3.3

24

93

30

54

07

29

73

.96

3.6

04

24

91

.05

1.3

6

NC

1.5

51

.19

2.7

12

.04

40

92

31

44

72

39

5.0

93

.79

55

51

1.1

61

.08

Min

imu

m

B1

.00

1.0

15

.08

5.9

68

16

1,3

764

73

80

29

.15

19

.56

74

12

90

.007

0.0

6

M1

.00

1.0

15

.48

5.4

81

,166

1,9

216

16

1,0

421

5.9

82

0.6

98

78

70

.007

0.0

9

NC

1.0

01

.03

5.1

26

.28

1,2

42

2,2

Table 4 continued. Summary statistics (mean, standard deviation, minimum, maximum and median) of the morphometric parameters, including Save, S200, Se, MR, BE, FF and LR, for each zone. B Berga, M Mollo, NC NWCat. [Table values not legible in this copy.]

format of the matrices presented in the results and uses the terms positive (reactive) and negative (non-reactive) in context, as well as true class (input) and predicted class (output).

4.2 Cost matrix definition

The ratio of reactive to non-reactive catchments is unbalanced: 78 reactive catchments against 944 non-reactive ones. This biases the learning process toward the most frequent class (non-reactivity). Data mining procedures provide several tools to reduce this unwanted effect. The standard approach is to assign a cost to the misclassification of a given class by applying a cost matrix (Witten et al. 2011).

The open question is which costs to use in the matrix. Safety requirements dictate that the higher cost should be assigned to a false negative (FN), that is, a reactive catchment classified as non-reactive. The cost matrix is used throughout the learning process and causes the sets to be weighted. The purpose of introducing the cost matrix is to increase the influence of the reactive catchments.

To select the cost matrix used in this work, a comprehensive sensitivity analysis was carried out using the CART algorithm (see below for an explanation of the algorithm), and two values emerged from it: 12 and 15. Both values were therefore candidates; each one was tested and the best fit was analyzed. The cost obtained naturally depends on the tool selected for the computations (the CART algorithm). Each algorithm has its own specifications, so a cost determined with one specific tool is particular to that tool; nevertheless, the values determined through the analysis are assumed to be suitable for the different data mining tools used in this work.
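The effect of the two candidate costs on the class balance can be illustrated with a short sketch. It assumes, as a simplification not stated in the text, that a false-negative cost c is equivalent to giving each reactive catchment a weight of c and each non-reactive catchment a weight of 1; the function names are illustrative only.

```python
# Class counts in the training set, as stated in the text.
N_REACTIVE = 78
N_NON_REACTIVE = 944


def weighted_class_totals(fn_cost, n_reactive=N_REACTIVE,
                          n_non_reactive=N_NON_REACTIVE):
    """Return the effective (weighted) size of each class.

    Assumption: a false-negative cost of `fn_cost` acts like a weight of
    `fn_cost` on every reactive catchment and a weight of 1 on every
    non-reactive catchment.
    """
    return fn_cost * n_reactive, n_non_reactive


def balancing_cost(n_reactive=N_REACTIVE, n_non_reactive=N_NON_REACTIVE):
    """Cost at which both weighted classes would have the same size."""
    return n_non_reactive / n_reactive


for cost in (12, 15):
    reactive_w, non_reactive_w = weighted_class_totals(cost)
    print(cost, reactive_w, non_reactive_w,
          round(reactive_w / non_reactive_w, 3))
```

Under this assumption, a cost of 12 brings the weighted reactive class (936) almost level with the non-reactive class (944), while a cost of 15 tips the balance toward the reactive class (1,170 against 944).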

Fig. 4 Bi-dimensional relationships showing non-reactive catchments (gray points) and reactive catchments (black points): a mean slope as a function of area; b stream length as a function of Melton ratio; c Melton ratio as a function of maximum elevation; d form factor as a function of maximum elevation. Legend is shown at the top


4.3 Classifiers

Two sets were considered for the learning process described below: (1) the training set (or learning set), which consists of 1,022 catchments spread over three zones (Berga, NWCat and Mollo); and (2) the test set, which includes 113 catchments located in Andorra (Fig. 3). Fourteen parameters were gathered, and reactivity (presence of past debris flows) was assigned to the catchments of both sets. Three classifiers were selected to learn from the training set. First, a logistic regression was fitted. Then, two classification trees were considered: C4.5 (J48) and CART.

4.3.1 Logistic regression

Traditionally, the first choice for fitting data is a linear regression. It is a simple and effective method, but it only captures linear relations: if the data exhibit nonlinear relations, the fitted straight line will not accurately reproduce their behavior. To address this issue, logistic regression (LR) was preferred (Landwehr et al. 2005; Witten et al. 2011). The theoretical basis of LR is simple, and the result can be represented by the following function:

$$f(z) = \frac{1}{1 + e^{-z}}, \qquad z = w_0 + w_1 a_1 + \cdots + w_k a_k \tag{1}$$

where $a_i$ are the attribute values and $w_i$ are the attribute weights. Eq. (1) can therefore be regarded as a membership function.
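A minimal sketch of Eq. (1) as a membership function is given here for illustration. The function names and the example weights are hypothetical and are not the fitted values of this study.

```python
import math


def logistic(z):
    """Logistic function f(z) = 1 / (1 + exp(-z)) from Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-z))


def membership(weights, attributes):
    """Degree of membership to the reactive class, between 0 and 1.

    `weights[0]` is the intercept w0; `weights[1:]` are paired with the
    attribute values a1..ak in order.
    """
    z = weights[0] + sum(w * a for w, a in zip(weights[1:], attributes))
    return logistic(z)


# Hypothetical one-attribute model: w0 = -1.0, w1 = 0.001, a1 = 1500.
print(membership([-1.0, 0.001], [1500.0]))
```

With the default threshold of 0.5, a catchment is labeled reactive whenever z is positive, since f(0) = 0.5.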

4.3.2 Classification tree

Two algorithms were used to construct the classification trees, CART and C4.5 (J48), giving two resulting trees (Breiman et al. 1984; Breiman et al. 1995). Gini's diversity index (Gini 1912) was selected as the splitting criterion: it defines the order in which questions appear in the trees.
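As an illustration of the splitting criterion, the sketch below computes the Gini diversity index of a node from its class counts and the size-weighted index of a candidate split. The function names are illustrative, and the example split is hypothetical; only the root counts (944 non-reactive, 78 reactive) come from the text.

```python
def gini_impurity(class_counts):
    """Gini diversity index of a node: 1 - sum of squared class shares."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def split_gini(left_counts, right_counts):
    """Size-weighted Gini index of the two children of a candidate split."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n_total = n_left + n_right
    return (n_left / n_total) * gini_impurity(left_counts) \
        + (n_right / n_total) * gini_impurity(right_counts)


# Root node of the unweighted training set: 944 non-reactive, 78 reactive.
print(round(gini_impurity([944, 78]), 3))

# Hypothetical split, for illustration only (not a split from the paper).
print(round(split_gini([700, 20], [244, 58]), 3))
```

A split is preferred when it lowers the weighted index relative to the parent node; the unweighted root is already fairly "pure" (about 0.141) because the non-reactive class dominates, which is one way to see why the raw set yields trees that favor that class.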

The optimal size of the final tree is an important issue for any decision tree algorithm. Problems such as overfitting the training data or generalizing poorly to new samples are often due to a tree that is too large. On the other hand, an excessively small tree may not

Fig. 5 Matrix model as used and later reported in this work (after Fawcett 2006). N the total of non-reactive in the true class; P the total of reactive in the true class


capture important structural information in the samples (the organization of the data into types or groups of types). However, deciding where a classification tree algorithm should stop is difficult, because it is impossible to tell whether adding an extra node will dramatically decrease the error. This well-known problem is called the "horizon effect." A common strategy to overcome it is to grow the tree until each node contains a small number of instances and then prune it to remove the nodes that do not provide additional information. Pruning (1) reduces the complexity of the final classifier, (2) improves predictive accuracy by reducing overfitting (a common problem in data mining, which consists of finding patterns in the training set that are not present in the general set) and (3) eliminates sections of the classifier likely to be based on noisy or erroneous data (which could be associated with the principle of regionalization).

5 Results

5.1 Logistic regression

The first run of the construction algorithm used the training set without any tuning and without the cost matrix. The logistic regression obtained was:

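$$
\begin{aligned}
z = {} & -2.6455 + 0.00123\,H_{\max} - 0.00085\,H_{\min} + 0.11379\,\mathrm{MR} \\
& + 0.00188\,\mathrm{SmC} + 0.00185\,O + 0.12799\,A \\
& - 0.10574\,\mathrm{Ls} - 0.00003\,P - 0.17712\,\mathrm{BE} \\
& - 0.11440\,\mathrm{LR} + 0.06086\,\mathrm{Se} - 0.07900\,\mathrm{S200} \\
& + 0.00340\,\mathrm{SmS}
\end{aligned}
\tag{2}
$$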

The resulting confusion matrix is shown in Table 5A. The results are extremely poor because the cost matrix was not used; they are not a consequence of the lack of optimization. When optimization algorithms were used, applying the algorithm to the training set yielded only trivial knowledge, namely:

$$z = -1.3611 + 0.00066\,H_{\max} \tag{3}$$

Considering that the threshold of the logistic regression is set at f(z) = 0.5 (the output reactivity can take any value between 0 and 1, and by default the middle of this range is the positive/negative threshold), the regression can be reinterpreted as:

$$
f(z) = 0.5 \;\rightarrow\; z = 0.0, \qquad
0.0 = -1.3611 + 0.00066\,H_{\max} \;\rightarrow\; H_{\max} = \frac{1.3611}{0.00066} = 2052.72\ \mathrm{m}
\tag{4}
$$

"All the catchments having a maximum elevation over 2,052 m asl are reactive" is a conclusion that shows the weakness of this process. The confusion matrix for the training set is the same as in Table 5A.
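The elevation threshold implied by a one-variable logistic regression can be recomputed with the short sketch below; the function name is illustrative. With the rounded coefficients as printed (intercept -1.3611, slope 0.00066), the ratio evaluates to about 2,062 m. The 2,052.72 m reported in the text is presumably obtained from unrounded coefficients; this is an assumption, since a slope of about 0.000663 would reproduce it and would still round to 0.00066.

```python
import math


def decision_threshold(intercept, slope, probability=0.5):
    """Attribute value x at which f(intercept + slope * x) = probability."""
    # Inverse of the logistic function (logit); equals 0 for probability 0.5.
    z = math.log(probability / (1.0 - probability))
    return (z - intercept) / slope


# Coefficients as printed in the one-variable regression of the text.
print(round(decision_threshold(-1.3611, 0.00066), 2))

# A slightly different, hypothetical slope that still rounds to 0.00066.
print(round(decision_threshold(-1.3611, 0.000663), 2))
```

The comparison shows how sensitive the threshold is to the last digits of the slope, which is worth keeping in mind when reusing published coefficients.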

When the cost matrix was used with a value of 12, the weighted logistic regression can

be expressed as:


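$$
\begin{aligned}
z = {} & -0.9220 + 0.00137\,H_{\max} - 0.00103\,H_{\min} + 0.28304\,\mathrm{MR} \\
& - 0.00297\,\mathrm{SmC} + 0.00105\,O + 0.21793\,A \\
& - 0.10705\,\mathrm{Ls} - 0.00006\,P - 0.60802\,\mathrm{BE} \\
& - 0.18096\,\mathrm{LR} + 0.055790\,\mathrm{Se} - 0.069141\,\mathrm{S200} \\
& + 0.00565\,\mathrm{SmS} + 0.24360\,\mathrm{FF}
\end{aligned}
\tag{5}
$$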

Using this regression, the classification of reactive catchments clearly improved, at the expense of the non-reactive ones (Table 5B).

Linear regressions have a major limitation: the maximum complexity is one coefficient per input variable. Therefore, the maximum number of degrees of freedom in the tuning process is the number of input variables plus one. Another important point is that the input variables are not normalized, which means that the coefficient values do not provide information about the relative importance of each variable. This non-optimized version of the fitted curve was applied to the test set, and the confusion matrix is presented in Table 5C.

5.1.1 Optimization of the logistic regression

The resulting equation should also be optimized to reduce the influence of overfitting, so that it remains useful for sets other than the training one.

A tenfold cross-validation was applied to the weighted logistic regression obtained in the previous section. Cross-validation consists of randomly dividing the training set into 10 parts of similar size, running the classifier 10 times, each time holding out a different part for testing and training on the remaining nine, and averaging the error estimates to obtain an overall error estimate (Witten et al. 2011). The new optimized regression equation is:

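$$
\begin{aligned}
z = {} & -1.84110 + 0.00078\,H_{\max} - 0.00044\,H_{\min} + 0.71571\,\mathrm{MR} \\
& - 0.00735\,\mathrm{SmC} + 0.00105\,O + 0.15465\,A \\
& - 0.07577\,\mathrm{Ls} - 0.14204\,\mathrm{LR} + 0.04088\,\mathrm{Se} \\
& - 0.05077\,\mathrm{S200} + 0.00356\,\mathrm{SmS}
\end{aligned}
\tag{6}
$$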
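The tenfold cross-validation procedure can be sketched in a few lines. This is a generic illustration using only the standard library, not the implementation used by the authors; `fit_and_score` is a hypothetical callback that trains a classifier on the training indices and returns its error on the held-out indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random partition
    folds = [indices[i::k] for i in range(k)]     # k parts of similar size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validated_error(n_samples, fit_and_score, k=10, seed=0):
    """Average the error returned by `fit_and_score` over the k folds."""
    errors = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(errors) / len(errors)


# With the 1,022 training catchments, each fold holds 102 or 103 of them.
sizes = sorted(len(test) for _, test in k_fold_indices(1022))
print(sizes)
```

Each catchment is used exactly once for testing and nine times for training, so the averaged error is less dependent on one particular split than a single hold-out estimate.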

Table 5 Confusion matrices in relation to the logistic regression: (A) Confusion matrix obtained for the training set; (B) Confusion matrix obtained for the weighted training set; (C) Success in test set using the weighted non-optimized logistic regression; (D) Confusion matrix for the weighted set using the optimized logistic regression; (E) Success in test set using the optimized logistic regression

                  Non-reactive   Reactive
(A)
  Non-reactive    944            0
  Reactive        78             0
(B)
  Non-reactive    649            295
  Reactive        21             57
(C)
  Non-reactive    41             31
  Reactive        15             26
(D)
  Non-reactive    629            315
  Reactive        27             51
(E)
  Non-reactive    41             31
  Reactive        12             29


The confusion matrix for the training set is presented in Table 5D. The results are slightly worse than those obtained with the non-optimized regression on the training set, as expected when overfitting is reduced. Finally, when this optimized regression is applied to the test set, the resulting confusion matrix (Table 5E) improves on that of the non-optimized regression (Table 5C): 29 instead of 26 reactive catchments are correctly classified.

5.2 C4.5 (J48)

A preliminary result is shown to highlight the limitation of the set used in the tree construction. If the C4.5 tree is constructed using the raw training set without a cost matrix, the resulting tree has 7 leaves and 13 nodes (Fig. 6). Table 6A shows its confusion matrix.

From the point of view of the reactive class, the results in the confusion matrix are poor (Table 6A). The unbalanced ratio of reactive to non-reactive catchments favors the non-reactive class: the tree predicts non-reactive catchments well but captures the reactive ones poorly, whereas from a susceptibility point of view a good capture of reactive catchments is preferable. Because of the small number of reactive catchments in the set (when the cost matrix is not applied), the tree has only one leaf belonging to the reactive class.

A remedy is to use the cost matrix. A cost of 15 for false negatives (FN) was chosen for the analysis. The resulting tree has 48 leaves and 95 nodes; it is more complex in order to capture the particularities of the reactive catchments and is not shown because of its size. The confusion matrix obtained for this unpruned tree on the weighted set is given in Table 6B. Before optimization, the results obtained on the test set are poor (Table 6C): only 7 of the 41 reactive catchments are identified. The sensitivity to reactive catchments should be improved, which justifies the following optimization.

5.2.1 Optimization of C4.5 (J48)

The tuning algorithm uses a tenfold cross-validation method to optimize the tree. The

resulting pruned and weighted tree has 14 leaves and 27 nodes (Fig. 7). Its confusion

matrix is visible in Table 6D.

Fig. 6 Classification tree J48 constructed with the raw set and with no optimization of the algorithm. Leaves with a non-null probability of reactivity are not reported


Compared with the unpruned tree, the results on the training set are worse, as expected after reducing overfitting. Once pruned using the cross-validation, the tree is applied to the test set (Table 6E), and the results should be compared with those obtained before optimization (Table 6C). Success in classifying the reactive catchments has improved (14 instead of 7 of the 41 reactive catchments), whereas the accuracy for the non-reactive ones has decreased (52 instead of 59 of 72).

5.3 CART

The CART algorithm provides a weighted unpruned tree, constructed by applying the cost matrix with a value of 12. This tree has 161 nodes and 81 leaves and is not plotted because of its size.

The accuracy in classifying the training set is high (Table 7A), whereas the test set gives very poor results (Table 7B). From the susceptibility point of view, the result is unacceptable, which justifies the following optimization.

5.3.1 Optimization of CART

A tenfold cross-validation was applied to the training set. It was then necessary to determine the most efficient level of pruning for this tree. A comprehensive sensitivity analysis was carried out, from which an optimum pruning level between 13 and 16 was defined; in this analysis, it is fixed at 15. The pruned tree (Fig. 8) then served as the basis for a tenfold cross-validation. The corresponding confusion matrix is given in Table 7C, and the results for the test set are presented in Table 7D. Because of the high weight assigned to reactive catchments, the resulting classification tree performs better for reactive catchments and has low accuracy for non-reactive catchments.

Table 6 Confusion matrices related to the C4.5 classifier: (A) Confusion matrix obtained for the raw training set; (B) Confusion matrix obtained for the unpruned tree and the training set using the weighted database; (C) Success in test set using the unpruned tree; (D) Confusion matrix for the pruned tree obtained with the weighted set; (E) Success in test set using the pruned tree

                  Non-reactive   Reactive
(A)
  Non-reactive    941            3
  Reactive        78             0
(B)
  Non-reactive    851            93
  Reactive        0              78
(C)
  Non-reactive    59             13
  Reactive        34             7
(D)
  Non-reactive    634            310
  Reactive        22             56
(E)
  Non-reactive    52             20
  Reactive        27             14


6 Evaluation and credibility

6.1 Definition of measuring performances indices

Several factors affect the success of the classification process. The main factors conditioning the performance of the learning algorithm are: (1) class distribution (the ratio between the classes involved in the classification), (2) misclassification cost (the cost matrix is user defined, so its suitability is not guaranteed), (3) size of the training and test sets and (4) the selected algorithm.

To analyze the performance of the learning process, several standard indices are selected: confusion matrix, precision, recall, F-measure, success rate and weighted success rate. The indices are defined as follows (see also Fig. 5):

Fig. 7 Pruned version of the classification tree obtained using the C4.5 algorithm (with weighted database)

Table 7 Confusion matrices in relation to the CART classifier: (A) Confusion matrix for the unpruned tree obtained using the weighted set; (B) Success in test set using the unpruned tree; (C) Confusion matrix for the weighted set using the pruned tree; (D) Success in test set

                  Non-reactive   Reactive
(A)
  Non-reactive    898            46
  Reactive        0              78
(B)
  Non-reactive    65             7
  Reactive        37             4
(C)
  Non-reactive    683            261
  Reactive        27             51
(D)
  Non-reactive    39             33
  Reactive        25             16


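$$\text{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$\text{F-measure} = \frac{2\,TP}{2\,TP + FP + FN} \tag{9}$$

$$\text{Success rate} = \frac{TP + TN}{TP + FP + TN + FN} \tag{10}$$

$$\text{Weighted success rate} = \frac{W_{TP}\,TP + W_{TN}\,TN}{W_{TP}\,TP + W_{FP}\,FP + W_{TN}\,TN + W_{FN}\,FN} \tag{11}$$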

These indices are usually computed on the test set, which generally determines the accuracy of the developed classification tools. The best classifier is thus chosen "a posteriori," on the basis of its test set results. The indices are applied first to the reactive class and then to the non-reactive class (Table 8). The results show that the logistic regression performs best, although the C4.5 classification tree performs better for some indices related to the non-reactive class. As a general conclusion, the global performance does not exceed 70 %.
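As a worked check, the indices of Eqs. (7)-(11) can be recomputed for the reactive class from the test-set confusion matrix of the optimized logistic regression (Table 5E: TN = 41, FP = 31, FN = 12, TP = 29). The sketch below reads the rows of Table 5 as the true class and the columns as the predicted class. For the weighted success rate, it assumes a weight of 12 on the truly reactive catchments (TP and FN) and 1 on the others; the paper does not state these weights explicitly, but this choice reproduces the 0.690 reported in Table 8.

```python
def _ratio(numerator, denominator):
    """Safe division: undefined indices are returned as NaN."""
    return numerator / denominator if denominator else float("nan")


def performance_indices(tp, fp, tn, fn, w_pos=1.0, w_neg=1.0):
    """Indices of Eqs. (7)-(11) for the positive (reactive) class.

    Assumption: W_TP = W_FN = w_pos (truly reactive catchments) and
    W_TN = W_FP = w_neg (truly non-reactive catchments).
    """
    weighted_correct = w_pos * tp + w_neg * tn
    weighted_total = weighted_correct + w_neg * fp + w_pos * fn
    return {
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "f_measure": _ratio(2 * tp, 2 * tp + fp + fn),
        "success_rate": _ratio(tp + tn, tp + fp + tn + fn),
        "weighted_success_rate": _ratio(weighted_correct, weighted_total),
    }


# Optimized logistic regression on the test set (Table 5E).
result = performance_indices(tp=29, fp=31, tn=41, fn=12, w_pos=12.0)
for name, value in result.items():
    print(name, round(value, 3))
```

The rounded values (0.483, 0.707, 0.574, 0.619 and 0.690) match the logistic regression column of Table 8 for the reactive class.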

The performance of the different classifiers is compared below for both training and test sets, and the best classifier is chosen "a priori."

6.2 Measuring relative performance

The main difference from the previous section is that the comparison here is carried out using the training set, without considering the results on the test set. Analyzing both performances is not redundant: the test set is also subject to overfitting, and the best fit on the test set is not necessarily the global best fit.

Fig. 8 CART classification tree after pruning at level 15


A classical approach to measuring the performance of a model was considered: the receiver operating characteristic (ROC). Each point on a ROC curve represents a classifier obtained using a different threshold value for a method (considering that the classifier used is probabilistic and not deterministic).

Changes in the optimization algorithm, sample distributions or cost matrix can also be represented on the ROC curve, as observed in Fig. 9. In the following, ROC curves are constructed by measuring the success of the classification. Fig. 9 compares the results for the three classifiers. The area under the curve (AUC) is 0.694 for the logistic regression, 0.659 for the C4.5 tree and 0.675 for the CART tree. Compared with existing AUC values reported for susceptibility analysis, these results are low (Carrara et al. 2008; Frattini et al. 2010). For instance, Frattini et al. (2010) report AUC values ranging from 0.64 to 0.84 for five different models.

Finally, each classifier was plotted in a ROC graph (Fig. 10). The format of such graphs allows the results for the training and test sets to be shown together, whether the sets are weighted or not and whether the algorithms are optimized or not. The more clustered the results of a given classifier, the better the classifier. Conventionally, a classifier performs better the closer it lies to the upper left corner of the graph (Fawcett 2006).
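The position of a classifier in a ROC graph follows directly from its confusion matrix. The sketch below places the three optimized classifiers using their test-set matrices (Tables 5E, 6E and 7D), reading rows as the true class and columns as the predicted class, an interpretation consistent with the logistic regression and C4.5 recall values of Table 8. The distance to the upper left corner is used here only as an illustrative summary; it is not an index used in the paper.

```python
import math

# Test-set confusion matrices of the optimized classifiers.
TEST_SET = {
    "LR (Table 5E)": {"tn": 41, "fp": 31, "fn": 12, "tp": 29},
    "C4.5 (Table 6E)": {"tn": 52, "fp": 20, "fn": 27, "tp": 14},
    "CART (Table 7D)": {"tn": 39, "fp": 33, "fn": 25, "tp": 16},
}


def roc_point(tp, fp, tn, fn):
    """Return (FPR, TPR), the coordinates of a classifier in ROC space."""
    return fp / (fp + tn), tp / (tp + fn)


def distance_to_ideal(fpr, tpr):
    """Euclidean distance to the ideal upper left corner (FPR=0, TPR=1)."""
    return math.hypot(fpr, 1.0 - tpr)


for name, matrix in TEST_SET.items():
    fpr, tpr = roc_point(**matrix)
    print(name, round(fpr, 3), round(tpr, 3),
          round(distance_to_ideal(fpr, tpr), 3))
```

On the test set, the logistic regression point lies closest to the upper left corner of the three, in line with the ranking discussed in the text.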

The susceptibility model based on a logistic regression seems to give better results than the classification trees. The classification trees, however, are better at extracting susceptibility laws from a training set, but they are greatly affected by a regionalization issue

Table 8 Performance measuring indexes for the (A) reactive class and (B) non-reactive class, in the case of the test set

                          Logistic regression   C4.5    CART
(A)
  Precision               0.483                 0.412   0.298
  Recall                  0.707                 0.341   0.359
  F-measure               0.574                 0.373   0.326
  Success rate            0.619                 0.584   0.477
  Weighted success rate   0.690                 0.381   0.383
(B)
  Precision               0.774                 0.658   0.609
  Recall                  0.569                 0.722   0.542
  F-measure               0.656                 0.689   0.574
  Success rate            0.619                 0.584   0.477
  Weighted success rate   0.690                 0.381   0.383

Fig. 9 ROC curves comparing the CART tree, the C4.5 (J48) tree and the logistic regression. TPR true positive rate, FPR false positive rate


as the rules created apply poorly to the test set. For instance, the optimized CART tree is excellent when the training set is considered, but when the test set is considered, the results are largely unacceptable.

6.3 Susceptibility maps

The results reported so far are expressed in terms of performance indices and ratios. Such susceptibility assessments are not easy to use and are difficult to understand for the end users they are intended for. Susceptibility maps, by contrast, are a common output of hazard assessments. Figure 11 shows four of these maps, corresponding to two optimized models (LR and CART) in two zones (NWCat and Andorra). Predicted non-reactive catchments are shown in white, predicted reactive catchments in gray, and proven reactive catchments are outlined in black. In NWCat, both models predict the proven reactive catchments well, but it is also easy to see that CART predicts fewer reactive catchments than

Fig. 10 Recapitulative ROC graph presenting all the classifiers encountered, classified in two families (training and testing sets): raw dB "non-weighted set," wdB "weighted set" and opt "optimized"


LR. In Andorra, LR also predicts more reactive catchments than CART, but it also predicts the proven reactive catchments better. Figure 11 is a good example of the spatial variability of predicted patterns within a study area due to statistical techniques (Sterlacchini et al. 2011).

When multiple susceptibility maps are produced, evaluating the spatial agreement between them helps the users of the hazard assessment to choose the most suitable map (in other words, the model with the best prediction). Sterlacchini et al. (2011) estimated how much 13 predictions differed from one another, with the aim of finding the best model. Determining the best model is a difficult task. In this study, it is affected by the division of the training set into three zones and by the use of a test set, all of which display different morphological characteristics of reactive catchments owing to parameters purposely not considered in this study (such as lithology and sediment availability). For this reason, the best regional model is the one giving the highest level of performance in Andorra (the test set), although other models may prove more accurate in a specific zone (Fig. 11).

Fig. 11 Susceptibility maps of one zone of the training set (NWCat) and the test set (Andorra). Catchments filled in gray are reactive catchments resulting from the models. Catchments filled in white are non-reactive catchments resulting from the models. Catchments with black contour are the reactive catchments present in the dataset when catchments without contour are the non-reactive catchments in the dataset


Combinations and superimpositions of models are another idea to which the research led, although they are not tackled in this study. On the one hand, a confidence index can be given to each model. On the other hand, susceptibility maps can be superimposed: the intersections of reactive catchments defined by the models could be compared with the catchments that appear as reactive when the algorithms are combined.
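One simple way to superimpose model outputs is sketched below, for illustration only. The catchment identifiers are hypothetical, and the per-catchment agreement fraction is a stand-in introduced here; it is not the per-model confidence index mentioned in the text.

```python
def combine_predictions(predicted_by_model):
    """Return (catchments flagged by all models, flagged by at least one)."""
    sets = [set(ids) for ids in predicted_by_model.values()]
    if not sets:
        return set(), set()
    return set.intersection(*sets), set.union(*sets)


def agreement_fraction(predicted_by_model):
    """Fraction of models that flag each catchment as reactive."""
    n_models = len(predicted_by_model)
    counts = {}
    for ids in predicted_by_model.values():
        for catchment_id in set(ids):
            counts[catchment_id] = counts.get(catchment_id, 0) + 1
    return {cid: count / n_models for cid, count in counts.items()}


# Hypothetical catchment identifiers predicted reactive by each model.
predictions = {"LR": [1, 2, 3, 5], "CART": [2, 3, 4]}
agreed, flagged_by_any = combine_predictions(predictions)
print(sorted(agreed), sorted(flagged_by_any))
print(agreement_fraction(predictions))
```

The intersection gives a conservative map on which all models agree, while the union gives a precautionary map; which one is preferable depends on the cost attached to false negatives.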

Finally, the close performance of the two models discussed makes them interchangeable, depending on the task and the objectives sought. If one wants to organize a field campaign in search of debris flows, CART is more appropriate, as it identifies fewer false positives (FP). Otherwise, logistic regression is better suited to hazard studies.

7 Conclusion

Debris flows are a geological hazard that also affects the Central-Eastern Pyrenees. Using different means of debris-flow reconnaissance, 534 debris flows were identified and digitized in a GIS, overlaid on a 5 × 5 m DEM from which the drainage network and catchments had been extracted.

From the statistical results of 14 fluvio-morphological parameters, similarities with past studies emerged that gave credibility to our inventory, although the unit at which the work was conducted limited the comparisons, especially with local studies, which often characterize the path of the hazard itself rather than the landscape where the phenomenon takes place.

Based on 78 reactive catchments and 944 non-reactive catchments, the models suffer from overfitting, encouraged by the unbalanced ratio of reactive to non-reactive catchments. Introducing a cost matrix to weight the database was necessary to overcome the problem. Moreover, applying the models to the test set generally gives a poor match, which likely reflects the parameters chosen. Simplifying the algorithms through optimization (pruning the trees) allows the models to be exported better to a test set. Among the models tested here, the logistic regression gives better results than the decision trees when a test set is considered. The decision trees are better at extracting rules from a training set but are hardly applicable to a test set, even after optimization.

Inherent limitations of our approach include the omission of parameters recognized to

play a relevant role in debris-flow susceptibility assessment. Geology, vegetation and

especially sediment availability are generally closely related to debris-flow spatial

occurrence. It is strongly believed that incorporating such information would benefit the

results obtained from the models.

The validation of the models is a necessary step, which in our case gave poor results. The relevance of the test set plays a direct role in the validation results. In our case, Andorra, a high-mountain environment, was chosen to validate models that had been computed from both high-mountain environments and medium-mountain environments similar to the pre-Pyrenees. The test set should reflect the same environments as the training dataset. Regionalization thus influences the validation results. The study of this influence is beyond the scope of this article but should be considered in future studies.

The inventory also limits the results, as (1) not all the inventory is represented in the study (251 out of 534) and (2) different types of debris flows are mixed. Limiting the study to one type of debris flow might have improved the results of the models, but the


choice of the study area might then have had to be reconsidered too. The environment plays a role in the type of debris flows (here pre-Pyrenees and Axial Pyrenees). Moreover, some debris flows are related to a single extreme rainfall (extreme in the quantity of water received by headwater catchments) and cannot be representative of a prolonged, less extreme event. This reflects the role played by sediment availability in a catchment.

Simple methodologies for defining the study unit and the different sets, reproducibility of the work and straightforward understanding of the results have been sought throughout this analysis. The approach best suits places where little information is available; it is addressed to entities dealing with debris-flow hazards with limited means, resources or sources of information; and it is a first step toward a regional hazard assessment, which would need further studies to improve the error estimates of the models. The drawbacks of our study may explain the rather poor success rate obtained, which could be improved by further refining the parameters, the unit of study or the inventory.

Acknowledgments This research was financially supported by the European project IMPRINTS (EC FP7 contract ENV-2008-1-226555), the Spanish DEBRISCATCH project (contract CGL2008-00299/BTE) and the Spanish project CGL2009-13039 from the Ministerio de Ciencia e Innovacion. The authors would like to thank the Institut Geologic de Catalunya and the Institut Cartografic de Catalunya for the supply of the DEM. The manuscript improved thanks to two anonymous reviewers, who are thanked for their comments.

References

Alcoverro J, Corominas J, Gomez M (1999) The Barranco de Aras flood of 7 August 1996 (Biescas, Central Pyrenees, Spain). Eng Geol 51:237–255

Bacchini M, Zannoni A (2003) Relations between rainfall and triggering of debris-flow: case study of Cancia (Dolomites, Northeastern Italy). Nat Hazard Earth Syst Sci 3:71–79

Baeza C (1994) Evaluacion de las condiciones de rotura y la movilidad de los deslizamientos superficiales mediante el uso de tecnicas de analisis multivariante. PhD thesis, Universitat Politecnica de Catalunya, Barcelona, Spain (in Spanish)

Baeza C, Corominas J (2001) Assessment of shallow landslide susceptibility by means of multivariate statistical techniques. Earth Surf Process Landf 26:1251–1263

Baeza C, Lantada N, Moya J (2010) Validation and evaluation of two multivariate statistical models for predictive shallow landslide susceptibility mapping of the Eastern Pyrenees (Spain). Environ Earth Sci 61:507–523

Blahut J, van Westen CJ, Sterlacchini S (2010) Analysis of landslide inventories for accurate prediction of debris-flow source areas. Geomorphology 119:36–51. doi:10.1016/j.geomorph.2010.02.017

Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Chapman and Hall, London

Breiman L, Friedman J, Olshen R, Stone C, Steinberg D, Colla P (1995) Cart: classification and regression trees. Chapman and Hall, London

Burrough PA, McDonell RA (1998) Principles of Geographical Information Systems. Oxford University Press, New York, p 190

Carrara A, Cardinali M, Detti R, Guzzetti F, Pasqui V, Reichenbach P (1991) GIS techniques and statistical models in evaluating landslide hazard. Earth Surf Process Landf 16:427–445

Carrara A, Crosta G, Frattini P (2008) Comparing models of debris-flow susceptibility in the alpine environment. Geomorphology 94:353–378. doi:10.1016/j.geomorph.2006.10.033

Catani F, Casagli N, Ermini L, Righini G, Menduni G (2005) Landslide hazard and risk mapping at catchment scale in the Arno River basin. Landslides 2:329–342. doi:10.1007/s10346-005-0021-0

Chen C-Y, Yu FC (2011) Morphometric analysis of debris flows and their source areas using GIS. Geomorphology 129:387–397. doi:10.1016/j.geomorph.2011.03.002

Chorley RJ (1957) Illustrating the laws of morphometry. Geol Mag 94:140–150

Chung CF, Fabbri AG (2003) Validation of spatial prediction models for landslide hazard mapping. Nat Hazards 30(3):451–472


Clotet N, Gallart F (1984) Inventari de degradacions de vessants originades pels aiguats de novembre de1982, a les altes conques del Llobregat i Cardener. Servei Geologic de la Generalitat de Catalunya,Barcelona, Spain (in Spanish)

Coe JA, Godt JW, Baum RL, Buckman RC Michael JA (2004) Landslide susceptibility from topography inGuatemala. In: Lacerda WA, Ehrlich M, Fontoura AB, Sayao A (eds), Landslides evaluation andstabilization, Proceedings 1Xth symposium on landslides, Balkema, Leiden, pp 69–78

Coehlo-Netto AL, Avelar AS, Fernandes MC, Lacerda WA (2006) Landslide susceptibility in a moun-tainous geoecosystem, Tijuca Massif, Rio de Janeiro: the role of morphometric subdivision of theterrain. Geomorphology 87:120–131

Coussot P, Meunier M (1996) Recognition, classification and mechanical description of debris flows. EarthSci Rev 40:209–227

Cuadrat JM, Pita MF (1997) Climatologia. Catedra, Madrid, Spain (in Spanish)ECORS Pyrenees Team (1988) The ECORS deep reflection seismic survey across the Pyrenees. Nature

331:508–510Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874Fell R, Corominas J, Bonnard C, Cascini L, Leroi E, Savage WZ, On behalf of the JTC-1 Joint Technical

Committee on Landslides and Engineered Slopes (2008) Guidelines for landslide susceptibility, hazardand risk zoning for land use planning. Eng Geol 102:85–98

Fitzgerald PG, Munoz JA, Coney PJ, Baldwin SL (1999) Asymmetric exhumation across the Pyreneanorogen: implications for the tectonic evolution of a collisional orogen. Earth Planet Sci Lett173(3):157–170

Frattini P, Crosta G, Carrara A (2010) Techniques for evaluating the performance of landslide susceptibilitymodels. Eng Geol 111:62–72

Gentile F, Bisantino T, Trisorio Liuzzi G (2008) Debris flow risk analysis in South Gargano watersheds (Southern Italy). Nat Hazards 44:1–17

Gini C (1912) Variabilita e mutabilita. C. Cuppini, Bologna, 156 p. Reprinted in Memorie di metodologica statistica (1955), Pizetti E, Salvemini T (eds). Rome (in Italian)

Guinau M, Pallas R, Vilaplana JM (2005) A feasible methodology for landslide susceptibility assessment in developing countries: a case-study of NW Nicaragua after Hurricane Mitch. Eng Geol 80:316–327

Guthrie RH, Evans SG (2007) Work, persistence, and formative events: the geomorphic impact of landslides. Geomorphology 88:266–275

Guzzetti F, Reichenbach P, Cardinali M, Galli M, Ardizzone F (2005) Probabilistic landslide hazard assessment at the basin scale. Geomorphology 71:272–299

Guzzetti F, Reichenbach P, Ardizzone F, Cardinali M, Galli M (2006) Estimating the quality of landslide susceptibility models. Geomorphology 81:166–184

Horton RE (1932) Drainage basin characteristics. Trans Am Geophys Union 13:350–361

Hungr O, Morgan GC, Kellerhals R (1984) Quantitative analysis of debris torrent hazards for design of remedial measures. Can Geotech J 21:663–677

Hungr O, Evans SG, Bovis MJ, Hutchinson JN (2001) A review of the classification of landslides of the flow type. Environ Eng Geosci 3:221–238

Hungr O, Leroueil S, Picarelli L (2012) Varnes classification of landslide types, an update. In: XI international symposium on landslides and engineered, Banff, Canada, pp 47–58

Hurlimann M, Corominas J, Moya J, Copons R (2003) Debris-flow events in the eastern Pyrenees: preliminary study on initiation and propagation. In: Rickenmann D, Chen CL (eds) Debris-flow hazard mechanics: prediction, mitigation and assessment. Millpress, Rotterdam, pp 115–126

ICC, Institut Cartografic de Catalunya (2003) Mapa geologic de Catalunya 1:250 000. ICC, Barcelona

Iverson RM (1997) The physics of debris flows. Rev Geophys 35(3):245–296

Jakob M (2005a) A size classification for debris flows. Eng Geol 79:151–161

Jakob M (2005b) Debris flow hazard assessment. In: Jakob M, Hungr O (eds) Debris flow hazard assessment and related phenomena, chapter 17. Springer, Heidelberg

Jakob M, Hungr O (2005) Debris flow hazard assessment and related phenomena. Springer, Heidelberg

Jenson SK, Domingue JO (1988) Extracting topographic structure from digital elevation data for geographic information system analysis. Photogramm Eng Remote Sensing 54(11):1593–1600

Landwehr N, Hall M, Frank E (2005) Logistic model trees. Mach Learn 59:161–205

Lynn G (2005) Macrogeomorphology and erosional history of the post-orogenic Pyrenean mountain belt. Ph.D. thesis, The University of Edinburgh, Edinburgh, pp 388

Martin VJ, Olcina CJ (2001) Climas y Tiempos de Espana. Alianza (ed), Madrid (in Spanish)

Melelli L, Taramelli A (2004) An example of debris-flows hazard modeling using GIS. Nat Hazard Earth Syst Sci 4:347–358

Melton MA (1965) The morphologic and paleoclimatic significance of alluvial deposits in southern Arizona. J Geol 73:1–38

Montgomery DR, Dietrich WE (1988) Where do channels begin? Nature 336:232–234

Montgomery DR, Dietrich WE (1989) Source areas, drainage density, and channel initiation. Water Resour Res 25:1907–1918

Munoz JA (1992) Evolution of a continental collision belt: ECORS-Pyrenees crustal balanced cross-section. In: McClay KR (ed) Thrust tectonics. Chapman and Hall, London, pp 235–246

O’Callaghan JF, Mark DM (1984) The extraction of drainage networks from digital elevation data. Comp Vis Graph Image Process 28(3):323–344

Okunishi K, Suwa H (2001) Assessment of debris-flow hazards of alluvial fans. Nat Hazard 23:259–269

Parde M (1941) La formidable crue d’octobre 1940 dans les Pyrenees-Orientales. Rev Geog Pyre et du Sud-Ouest 12:237–279 (in French)

Pike RJ (2002) A bibliography of terrain modeling (geomorphometry), the quantitative representation of topography. USGS Open file report 02-465

Portilla M (2010) Analisis del campo de densidad de movimientos en masa del area Mollo-Queralbs. Internal report BarcelonaTech-UPC (in Spanish)

Portilla M, Chevalier G, Hurlimann M (2010) Description and analysis of the debris flows occurred during 2008 in the Eastern Pyrenees. Nat Hazards Earth Syst Sci 10:1635–1645. doi:10.5194/nhess-10-1635-2010

Remondo J, Gonzalez A, Diaz de Teran JR, Cendrero A, Fabbri A, Chung C-JF (2003) Validation of landslide susceptibility maps; examples and applications from a case study in Northern Spain. Nat Hazards 30:437–449

Rickenmann D (1999) Empirical relationships for debris flows. Nat Hazards 19:47–77

Rickenmann D, Zimmermann M (1993) The 1987 debris flows in Switzerland: documentation and analysis. Geomorphology 8:175–189

Santacana N, Baeza C, Corominas J, Paz AD, Marturia J (2003) A GIS-based multivariate statistical analysis for shallow landslide susceptibility mapping in La Pobla de Lillet Area (Eastern Pyrenees, Spain). Nat Hazards 30:281–295

Schumm SA (1956) The evolution of drainage systems and slopes in badlands at Perth Amboy, New Jersey. Bull Geol Soc Am 67:597–646

Sterlacchini J, Ballabio C, Blahut J, Masetti M, Sorichetta A (2011) Spatial agreement of predicted patterns in landslide susceptibility maps. Geomorphology 125:51–61

Tarboton DG (1997) A new method for the determination of flow directions and contributing areas in grid digital elevation models. Water Resour Res 33(2):309–319

Tarboton DG, Ames DP (2001) Advances in the mapping of flow networks from digital elevation data. World Water and Environmental Resources Congress. ASCE, Orlando, FL

Tarboton DG, Bras RL, Rodriguez-Iturbe I (1991) On the extraction of channel networks from digital elevation data. Hydrol Process 5:81–100

Teixell A (1998) Crustal structure and orogenic material budget in the West-Central Pyrenees. Tectonics 17:395–406

Wan S, Lei TC (2009) A knowledge-based decision support system to analyze the debris-flow problems at Chen-Yu-Lan River, Taiwan. Knowl Based Syst 22:580–588. doi:10.1016/j.knosys.2009.07.008

Wan S, Lei TC, Huang PC, Chou TY (2008) The knowledge rules of debris flow event: a case study for investigation Chen Yu Lan River, Taiwan. Eng Geol 98:102–114

Welsh A, Davies TRH (2010) Identification of alluvial fans susceptible to debris-flow hazards. Landslides. doi:10.1007/s10346-010-0238-4

Wilson JP, Gallant JC (2000) Terrain analysis: principles and applications. Wiley, London

Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, Burlington

Zavoianu I (1978) Morphometry of drainage basins. Elsevier, Amsterdam
