Committees of hydrological models specialized on high and low flows


Dimitri Solomatine1,2 (presenting), Nagendra Kayastha1 and Vadim Kuzmin3

1 UNESCO-IHE Institute for Water Education, Delft, The Netherlands
2 Delft University of Technology, The Netherlands
3 Russian State Hydrometeorological University

Motivation

Theory of modelling:
- Complex systems (processes) are comprised of multiple components (simpler sub-processes)
- Simple models often cannot adequately reflect this complexity

Idea:
- Instead of one model, use several specialised models, each representing such a sub-process (hydrometeorological situation)
- Optimize the way they are combined

This may allow for better response in changing conditions

Solomatine and Kayastha: Committee models, IAHS, Gothenburg, 2013


Outline

- Committee modelling: examples and experiences
- Building specialized models
- Fuzzy committee of specialized models
- Case studies:
  - Leaf catchment, USA
  - Lissbro, Sweden
- Results and conclusions
- Ideas for future work



Combination of models (committees, ensembles, modular models): earlier work

Shamseldin, A. Y., O'Connor, K. M. and Liang, G. C. (1997). Methods for combining the outputs of different rainfall–runoff models. J. Hydrol., 197(1–4), 203–229.

Xiong, L., Shamseldin, A. Y. and O’Connor, K. M. (2001). A nonlinear combination of the forecasts of rainfall-runoff models by the first-order Takagi-Sugeno fuzzy system. J. Hydrol., 245(1), 196–217.

Abrahart, R. J. and See, L. M. (2002). Multi-model data fusion for river flow forecasting: an evaluation of six alternative methods based on two contrasting catchments. Hydrol. Earth Syst. Sci., 6, 655–670.

Solomatine, D. P. and Siek, M. (2006). Modular learning models in forecasting natural phenomena. Neural Networks, 19, 215–224.

Oudin, L., Andréassian, V., Mathevet, T., Perrin, C. and Michel, C. (2006). Dynamic averaging of rainfall-runoff model simulations from complementary model parameterization. Water Resources Research, 42.

Corzo, G. and Solomatine, D. P. (2007). Baseflow separation techniques for modular ANN modelling in flow forecasting. Hydrol. Sci. J., 52(3), 491–507.

Fenicia, F., Solomatine, D. P., Savenije, H. H. G. and Matgen, P. (2007). Soft combination of local models in a multi-objective framework. HESS, 11, 1797–1809.

Toth, E. (2009). Classification of hydro-meteorological conditions and multiple artificial neural networks for streamflow forecasting. HESS, 13, 1555–1566.

Kayastha, N., Ye, J., Fenicia, F. and Solomatine, D. P. (2013). Fuzzy committees of specialised rainfall-runoff models: further enhancements. HESSD, 10, 675–697, doi:10.5194/hessd-10-675-2013.


Limitations of “single model” approach

- Complexity of the hydrological processes
- The simplicity of the “conceptual” modelling paradigm often leads to errors in representing the full complexity of the physical processes in the catchment
- A single model often cannot capture the full variability of the system response:
  - flow values vary over orders of magnitude
  - the variance of the error depends on the flow value
- A single aggregate measure of model performance is traditionally used

Divide and conquer… Small is beautiful…


Steps in building a committee of RR models

- Identification of specialized models, e.g.:
  - “soft separation” scheme to identify “low flows” and “high flows” (Fenicia et al., 2007)
  - baseflow separation (Corzo and Solomatine, 2007)
  - identifying rising and falling limbs (Jain and Srinivasulu, 2006)
  - separation by threshold value of flow (Willems, 2009)
  - transformation of flow (Oudin et al., 2006)
- Definition of objective functions (errors)
- Calibration of specialised models (multi-objective, single-objective)
- Combination of models (committees, ensembles, averaging)
- Performance check


Modular models for modelling sub-processes

[Diagram: a complex process split into Sub-process 1, Sub-process 2, ..., Sub-process K]

Splitting into sub-processes:
- Domain experts (humans) specify such processes
- Computational intelligence algorithms discover “hidden” processes based on observed data
- Combination of both approaches

D.P. Solomatine (2006). Optimal modularization of learning models in forecasting environmental variables. iEMSs 3rd Biennial Meeting: Summit on Environmental Modelling and Software (A. Voinov, A. Jakeman, A. Rizzoli, eds.)

Solomatine & Xue (2004). M5 model trees compared to neural networks: application to flood forecasting in the upper reach of the Huai River in China. ASCE J. Hydrologic Engineering, 9(6), 491-501.

… Papers by Shamseldin et al., 1997; Abrahart & See, 2002; Jain et al., 2006; Oudin et al., 2006; etc.


Modular models: Methods of data splitting

Using machine learning methods to group (cluster) data corresponding to different hydrometeorological regimes:
- k-means, fuzzy c-means
- Self-organising maps

Applying hydrological knowledge for flow separation:
- Tracer-based methods
- Threshold-based flow separation
- Constant-slope method for baseflow separation
- Recursive filter for baseflow separation

G. Corzo and D.P. Solomatine. (2007) Baseflow separation techniques for modular artificial neural network modelling in flow forecasting. Hydrological Sciences J., 52(3), 491-507.
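The clustering route to data splitting can be illustrated with a minimal sketch of k-means (Lloyd's algorithm) in NumPy. This is not the implementation used in the cited work: the function name `kmeans`, the deterministic initialisation and the sample (precipitation, discharge) pairs are all hypothetical and chosen only for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    X = np.asarray(X, dtype=float)
    # Deterministic start (an illustrative assumption): k samples spread evenly
    # along the data sorted by the last column (here, discharge).
    order = np.argsort(X[:, -1])
    centers = X[order[np.linspace(0, len(X) - 1, k).astype(int)]]
    for _ in range(n_iter):
        # Distance of every sample to every center, shape (n_samples, k).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical (precipitation, discharge) samples: three dry and three wet days.
samples = np.array([[0.0, 1.0], [0.2, 1.2], [0.1, 0.9],
                    [5.0, 10.0], [4.8, 9.5], [5.2, 10.5]])
labels, centers = kmeans(samples, k=2)
```

Each cluster label would then select the subset of data on which a separate modular model is trained. In practice the inputs are usually scaled before clustering so that no variable dominates the distance.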


Modular models using clustering

Modular Models are built for each cluster of data

[Figure: Bagmati training data set. Time series of precipitation (mm/hr), discharge (m³/s) and forecast discharge (m³/s); k-means clusters 1 and 2 shown in the plane precipitation (t-1) vs precipitation and in the space of P (current precipitation), Q (current discharge) and Qt+1 (forecast discharge)]


Optimal model structure using recursive filter for baseflow separation

[Figure: two panels titled “Baseflow Bagmati (Nepal)” showing total flow and baseflow (discharge) against time (days), one panel per value of the recession coefficient: a = 0.01 and a = 0.99]

Recursive filter (Eckhardt, 2005):

$$b_k = \frac{(1 - BFI_{max})\, a\, b_{k-1} + (1 - a)\, BFI_{max}\, Q_k}{1 - a\, BFI_{max}}$$

Parameter BFImax = 0.5 (Chapman and Maxwell, 1996)

[Flowchart elements: measured flow; baseflow filter (parameters BFImax, a); Model 1; Model 2; calculate error (RMSE); model optimization by Genetic Algorithms (GA)]

G. Corzo and D.P. Solomatine (2007). Knowledge-based modularization and global optimization of ANN models in hydrologic forecasting. Neural Networks, 20, 528–536
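The recursive baseflow filter on this slide can be sketched in a few lines of Python. The recursion follows the Eckhardt (2005) form with parameters a and BFImax; the function name, the initial value `b[0] = BFImax * Q[0]` and the cap `b_k <= Q_k` are assumptions made for this illustration, and the flow values are hypothetical.

```python
import numpy as np

def eckhardt_baseflow(q, a=0.99, bfi_max=0.5):
    """Recursive baseflow filter: returns the baseflow series b for total flow q."""
    q = np.asarray(q, dtype=float)
    b = np.empty_like(q)
    b[0] = bfi_max * q[0]  # assumed initial condition (not given on the slide)
    for k in range(1, len(q)):
        b_k = ((1.0 - bfi_max) * a * b[k - 1]
               + (1.0 - a) * bfi_max * q[k]) / (1.0 - a * bfi_max)
        b[k] = min(b_k, q[k])  # baseflow cannot exceed total flow
    return b

# Constant flow: the filter settles at BFImax * Q.
steady = eckhardt_baseflow([10.0] * 5, a=0.9, bfi_max=0.5)

# A hypothetical flood event: baseflow reacts slowly to the peak.
event_flow = [2.0, 2.0, 20.0, 8.0, 3.0]
event_base = eckhardt_baseflow(event_flow, a=0.99, bfi_max=0.5)
```

The difference `q - b` gives the quick-flow component, so the two series can feed two separate modular models; in the cited work the filter parameters are tuned together with the models by a genetic algorithm.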


Performance of the Modular Model using recursive filter vs Single (global) model

[Figure: hydrograph fragment for the Bagmati catchment comparing the MM (modular model), GM (global model) and Target series]

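Committee models

States-based dynamic averaging (A)
i) Soil moisture accounting (Oudin et al., 2006)
ii) Other states: quick and slow flows

The committee output $Q_{c(\mathrm{states})}$ combines $Q_{Cr(Q)}$ and $Q_{Cr(\log Q)}$, the simulations obtained with calibration criteria on $Q$ and on $\log(Q)$, using complementary weights that depend on the state $s$.

Inputs-based dynamic averaging

$$Q_{c(\mathrm{inputs})} = I_{low}\,Q_{LF,i} + I_{hig}\,Q_{HF,i}, \qquad I_{low} = \frac{P_{max} - P_i}{P_{max}}, \qquad I_{hig} = \frac{P_i}{P_{max}}$$

Outputs-based dynamic averaging
i) Fuzzy committee (Fenicia et al., 2007; Kayastha et al., 2013) (B)

$$Q_{c,i} = \frac{m_{LF}\,Q_{LF,i} + m_{HF}\,Q_{HF,i}}{m_{LF} + m_{HF}}$$

$$m_{LF} = \begin{cases} 1, & \text{if } h < \gamma \\ 1 - \left(\dfrac{h - \gamma}{\delta - \gamma}\right)^{N}, & \text{if } \gamma \le h < \delta \\ 0, & \text{if } h \ge \delta \end{cases}
\qquad
m_{HF} = \begin{cases} 0, & \text{if } h < \gamma \\ \left(\dfrac{h - \gamma}{\delta - \gamma}\right)^{1/N}, & \text{if } \gamma \le h < \delta \\ 1, & \text{if } h \ge \delta \end{cases}$$

ii) Weights based on observed and simulated flows (C)

$$Q_{c,i} = Q_{LF,i}\left(\frac{Q_{obs,max} - Q_{obs,i}}{Q_{obs,max}}\right)^{2} + Q_{HF,i}\left(\frac{Q_{obs,i}}{Q_{obs,max}}\right)^{2}$$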
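The inputs-based scheme and the observed-flow scheme (C) on this slide are simple weighted sums, which the following sketch makes concrete. It is an illustration under assumptions: the function names are hypothetical, the formulas are read as precipitation-based weights (P_max - P_i)/P_max and P_i/P_max, and squared relative-flow weights for scheme C; all numbers are invented.

```python
import numpy as np

def committee_inputs(q_lf, q_hf, precip):
    """Inputs-based averaging: weights follow relative precipitation."""
    precip = np.asarray(precip, dtype=float)
    i_hig = precip / precip.max()   # weight of the high-flow model
    i_low = 1.0 - i_hig             # weight of the low-flow model
    return i_low * np.asarray(q_lf, float) + i_hig * np.asarray(q_hf, float)

def committee_observed(q_lf, q_hf, q_obs):
    """Scheme C: squared weights based on relative observed flow."""
    q_obs = np.asarray(q_obs, dtype=float)
    h = q_obs / q_obs.max()
    return (np.asarray(q_lf, float) * (1.0 - h) ** 2
            + np.asarray(q_hf, float) * h ** 2)

# Hypothetical outputs of the low-flow and high-flow models on three days.
q_lf = [1.0, 2.0, 3.0]
q_hf = [2.0, 4.0, 6.0]
q_by_inputs = committee_inputs(q_lf, q_hf, precip=[0.0, 5.0, 10.0])
q_by_observed = committee_observed(q_lf, q_hf, q_obs=[0.0, 5.0, 10.0])
```

On the middle day the squared weights of scheme C sum to 0.5 rather than 1, so the combined flow (1.5) lies below both model outputs. This is the kind of non-conservative behaviour that the closing slides raise as an open question.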

Specialized models

Two models are built, one for low flows and one for high flows. Each objective function (wRMSE) weights the flows differently, by applying weighting factors to the model residuals.

Error on low flows:

$$N_{LF} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2\, w_{LF,i}}$$

Error on high flows:

$$N_{HF} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2\, w_{HF,i}}$$

Fenicia, F., Solomatine, D. P., Savenije, H. H. G. and Matgen, P. Soft combination of local models in a multi-objective framework. HESS, 11, 1797-1809, 2007.
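A minimal sketch of the two weighted objective functions follows. It assumes the power-type weights used elsewhere in these slides, w_HF = (Q_o/Q_o,max)^N and w_LF = ((Q_o,max - Q_o)/Q_o,max)^N with N = 2 by default; the function name and the sample flows are hypothetical.

```python
import numpy as np

def weighted_rmse(q_sim, q_obs, regime, power=2):
    """Weighted RMSE that emphasises either high or low observed flows."""
    q_sim = np.asarray(q_sim, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    h = q_obs / q_obs.max()          # relative flow, between 0 and 1
    if regime == "high":
        w = h ** power               # large weight on peaks
    elif regime == "low":
        w = (1.0 - h) ** power       # large weight on low flows
    else:
        raise ValueError("regime must be 'high' or 'low'")
    return float(np.sqrt(np.mean(w * (q_sim - q_obs) ** 2)))

# Hypothetical data: the model misses the lowest and the highest flow by 1.
q_obs = [1.0, 2.0, 4.0]
q_sim = [2.0, 2.0, 3.0]
rmse_hf = weighted_rmse(q_sim, q_obs, "high")
rmse_lf = weighted_rmse(q_sim, q_obs, "low")
```

Calibrating the same model structure once against `rmse_hf` and once against `rmse_lf` yields the two specialised parameterisations that the committee later combines.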


Fuzzy committee of specialised models (1)

The parameters (γ, δ) of the membership functions have to be optimized carefully.

[Diagram: inputs R, E feed the low-flow model (output QLF) and the high-flow model (output QHF); the combiner (fuzzy committee) produces Qc. Annotation on the membership plot: for this range of flow both models work]



Further enhancements (2012-2013): optimization of all parameters of the weighting schemes and fuzzy membership functions

Experiments conducted on calibration data, and model verification on test data

Tested optimization algorithms:
- Multi-objective: NSGA-II (Deb, 2001)
- Single-objective: Genetic algorithm – GA (Goldberg, 1989), Adaptive cluster covering – ACCO (Solomatine, 1999)

Tested on three case studies: Alzette, Bagmati and Leaf catchments

Kayastha, Ye, Fenicia, Solomatine. (2013) Fuzzy committees of specialised rainfall-runoff models: further enhancements, HESSD, 10, 675-697, doi:10.5194/hessd-10-675-2013



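Influence of different weighting schemes used in the objective functions for calibration of the high- and low-flow models: quadratic (N = 2), cubic (N = 3)

$$RMSE_{LF(HF)} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2\, w_{LF,i\,(HF,i)}}$$

a) $W_{LF,i} = l^{N}$

c) $W_{HF,i} = h^{N}$

b), d) piecewise (threshold-type) alternatives for $W_{LF,i}$ and $W_{HF,i}$, defined by conditions on $l$ and $h$ [branch expressions not recoverable from the extracted slide]

e) $l = \dfrac{Q_{o,max} - Q_{o,i}}{Q_{o,max}}, \qquad h = \dfrac{Q_{o,i}}{Q_{o,max}}$

Cont.: Kayastha et al. (2013)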



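The shape of the membership functions is controlled by the parameters (γ, δ), which govern the switch between flow regimes (low flow and high flow).

$$Q_{c,i} = \frac{m_{LF}\,Q_{LF,i} + m_{HF}\,Q_{HF,i}}{m_{LF} + m_{HF}}$$

$$m_{LF} = \begin{cases} 1, & \text{if } h < \gamma \\ 1 - \left(\dfrac{h - \gamma}{\delta - \gamma}\right)^{M}, & \text{if } \gamma \le h < \delta \\ 0, & \text{if } h \ge \delta \end{cases}
\qquad
m_{HF} = \begin{cases} 0, & \text{if } h < \gamma \\ \left(\dfrac{h - \gamma}{\delta - \gamma}\right)^{1/M}, & \text{if } \gamma \le h < \delta \\ 1, & \text{if } h \ge \delta \end{cases}$$

[Figure a: membership functions for M = 1. Figure b: membership functions for M = 2]

Cont.: Kayastha et al. (2013)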
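The fuzzy combiner can be sketched for the linear case (exponent equal to 1), where the two memberships form a simple ramp between γ and δ. The function names and the sample values are hypothetical, and the exponent is deliberately left out, so this is an illustration of the idea rather than the authors' implementation.

```python
import numpy as np

def memberships(h, gamma, delta):
    """Linear membership functions: returns (m_LF, m_HF) for relative flow h."""
    h = np.asarray(h, dtype=float)
    # 0 below gamma, 1 above delta, linear in between (requires delta > gamma).
    ramp = np.clip((h - gamma) / (delta - gamma), 0.0, 1.0)
    return 1.0 - ramp, ramp

def fuzzy_committee(q_lf, q_hf, q_obs, gamma, delta):
    """Membership-weighted average of low-flow and high-flow model outputs."""
    q_obs = np.asarray(q_obs, dtype=float)
    h = q_obs / q_obs.max()
    m_lf, m_hf = memberships(h, gamma, delta)
    return (m_lf * np.asarray(q_lf, float)
            + m_hf * np.asarray(q_hf, float)) / (m_lf + m_hf)

# Hypothetical flows: a low, an intermediate and a peak value.
q_obs = [1.0, 5.0, 10.0]
q_lf = [1.1, 4.0, 6.0]   # low-flow model output
q_hf = [2.0, 6.0, 9.5]   # high-flow model output
q_c = fuzzy_committee(q_lf, q_hf, q_obs, gamma=0.25, delta=0.75)
```

Below γ only the low-flow model is used, above δ only the high-flow model, and in between both contribute. Because the weights are normalised by their sum, this combiner always returns a value between the two model outputs.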


Case study : Leaf catchment

Area: 1924 km²; 10 years of daily data; 6 years for calibration, 4 years for verification

[Figure, panels “Calibration” and “Verification”: one of the identified Pareto-optimal fronts, obtained with the weighting scheme used to separate flow regimes, with the corresponding objective function values in the calibration and verification periods]


Current study: Lissbro, Sweden (1)

Lissbro catchment, Sweden: 97.0 km², 17 years of daily data

HBV model (13 parameters) was used

Optimization algorithm used for calibration: Adaptive cluster covering – ACCO (Solomatine, 1995, 1999)

All experiments are conducted on calibration data and verified on test data

Complete period: 01/01/1984 - 31/12/2010

Calibration periods:
- P1: 01/01/1984 - 31/12/1988
- P2: 01/01/1989 - 31/12/1993
- P3: 01/01/1994 - 31/12/1998
- P4: 01/01/1999 - 31/12/2003
- P5: 01/01/2006 - 31/12/2010


Current study: Lissbro, Sweden (2)

Statistical properties of data

| Statistical property | Period1 | Period2 | Period3 | Period4 | Period5 | Period5+1 |
|---|---|---|---|---|---|---|
| Period from (day-month-year) | 01-Jan-1984 | 01-Jan-1989 | 01-Jan-1994 | 01-Jan-1999 | 01-Jan-2006 | 01-Jan-1984 |
| Period to (day-month-year) | 31-Dec-1988 | 31-Dec-1993 | 31-Dec-1998 | 31-Dec-2003 | 31-Dec-2010 | 31-Dec-2010 |
| Number of data | 1827 | 1826 | 1826 | 1826 | 1826 | 10076 |
| Streamflow, average (m³/s) | 0.98 | 0.90 | 1.06 | 1.29 | 1.30 | 1.10 |
| Streamflow, minimum (m³/s) | 0.07 | 0.01 | 0.01 | 0.05 | 0.04 | 0.01 |
| Streamflow, maximum (m³/s) | 6.20 | 6.74 | 6.35 | 7.30 | 7.74 | 10.50 |
| Streamflow, standard deviation (m³/s) | 1.03 | 0.94 | 1.22 | 1.20 | 1.20 | 1.14 |
| Precipitation, average (mm/d) | 2.18 | 2.04 | 2.15 | 2.25 | 2.34 | 2.19 |
| Precipitation, minimum (mm/d) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Precipitation, maximum (mm/d) | 33.30 | 41.80 | 37.40 | 57.30 | 52.20 | 67.70 |
| Temperature, average (°C) | 5.68 | 7.21 | 6.50 | 7.28 | 7.05 | 6.81 |
| Temperature, minimum (°C) | -23.90 | -13.20 | -17.20 | -16.90 | -17.30 | -23.90 |
| Temperature, maximum (°C) | 21.20 | 24.60 | 24.20 | 23.70 | 24.70 | 24.70 |
| Evapotranspiration, average (mm/d) | 1.32 | 1.41 | 1.38 | 1.44 | 1.44 | 1.40 |
| Evapotranspiration, minimum (mm/d) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Evapotranspiration, maximum (mm/d) | 4.50 | 5.00 | 4.90 | 4.90 | 5.00 | 5.00 |



Conceptual model: HBV

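Conceptual lumped model with 3 tanks; 13 parameters to calibrate

[Diagram: HBV model structure with storages SP, SM, UZ and LZ, fluxes SF, RF, IN, EA, R, CFLUX and PERC, and outflow Q = Q0 + Q1 routed through the transform function]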

SF – Snow

RF – Rain

EA – Evapotranspiration

SP – Snow cover

IN – Infiltration

R – Recharge

SM – Soil moisture

CFLUX – Capillary transport

UZ – Storage in upper reservoir

PERC – Percolation

LZ – Storage in lower reservoir

Q0 – Fast runoff component

Q1 – Slow runoff component

Q – Total runoff


| Parameter | Description | Units | Lower | Upper |
|---|---|---|---|---|
| FC | Maximum soil moisture content | mm | 100 | 300 |
| LP | Limit for potential evapotranspiration | - | 0.1 | 1 |
| ALFA | Response box parameter | - | 0 | 2 |
| BETA | Exponential parameter in soil routine | - | 1 | 4 |
| K | Recession coefficient for upper tank | mm/d | 0.005 | 0.5 |
| K4 | Recession coefficient for lower tank | mm/d | 0.001 | 0.3 |
| PERC | Percolation from upper to lower response box | mm/d | 0 | 5 |
| CFLUX | Maximum value of capillary flow | mm/d | 0 | 2 |
| MAXBAS | Transfer function parameter | d | 1 | 6 |
| RCF | Refreezing coefficient | - | 1 | 1.2 |
| SCF | Snowfall correction factor | - | 0.5 | 1.2 |
| CFMAX | Degree day factor | mm/ºC/d | 1 | 4 |
| TT | Threshold temperature | ºC | -2 | 1 |


Objective functions used

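Nash and Sutcliffe Efficiency (NSE):

$$NSE = 1 - \frac{\sum_{i=1}^{n} (Q_{c,i} - Q_{o,i})^2}{\sum_{i=1}^{n} (Q_{o,i} - \bar{Q}_{o})^2}$$

Root mean squared error (RMSE):

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{c,i} - Q_{o,i})^2}$$

Log transformed flows (Oudin et al. 2006):

$$NSE_{\log Q} = 1 - \frac{\sum_{i=1}^{n} \left(\ln(Q_{s,i}) - \ln(Q_{o,i})\right)^2}{\sum_{i=1}^{n} \left(\ln(Q_{o,i}) - \overline{\ln(Q_{o})}\right)^2}$$

Weighted RMSE on high flows (Fenicia et al. 2007):

$$RMSE_{HF} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2 \left(\frac{Q_{o,i}}{Q_{o,max}}\right)^{2}}$$

Weighted RMSE on low flows (Fenicia et al. 2007):

$$RMSE_{LF} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2 \left(\frac{Q_{o,max} - Q_{o,i}}{Q_{o,max}}\right)^{2}}$$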
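The unweighted criteria on this slide (NSE, NSE on log-transformed flows, RMSE) can be computed as in the following sketch. The function names and sample flows are hypothetical; the log variant assumes strictly positive flows, which is a limitation of this illustration and not something the slide specifies.

```python
import numpy as np

def nse(q_sim, q_obs):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 equals the mean-flow benchmark."""
    q_sim = np.asarray(q_sim, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    return float(1.0 - np.sum((q_sim - q_obs) ** 2)
                 / np.sum((q_obs - q_obs.mean()) ** 2))

def log_nse(q_sim, q_obs):
    """NSE on log-transformed flows; assumes all flows are strictly positive."""
    return nse(np.log(q_sim), np.log(q_obs))

def rmse(q_sim, q_obs):
    """Root mean squared error."""
    q_sim = np.asarray(q_sim, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    return float(np.sqrt(np.mean((q_sim - q_obs) ** 2)))

q_obs = [1.0, 2.0, 3.0]
perfect = nse(q_obs, q_obs)                 # simulation equals observation
benchmark = nse([2.0, 2.0, 2.0], q_obs)     # simulation equals the observed mean
error = rmse([2.0, 2.0, 3.0], [1.0, 2.0, 4.0])
perfect_log = log_nse(q_obs, q_obs)
```

The log transform compresses peaks, so `log_nse` is more sensitive to errors on low flows, while plain `nse` and `rmse` are dominated by errors on high flows.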


Experiment 1 (Level 2): single model, calibration on 5 subsets

Single model performance (NSE) over the 5 pre-defined periods (NSE is used for optimization):

| | Period1 | Period2 | Period3 | Period4 | Period5 | Period5+1 |
|---|---|---|---|---|---|---|
| Period1 | 0.66 | 0.82 | 0.78 | 0.77 | 0.69 | |
| Period2 | 0.61 | 0.87 | 0.75 | 0.73 | 0.64 | |
| Period3 | 0.60 | 0.81 | 0.78 | 0.70 | 0.66 | |
| Period4 | 0.60 | 0.71 | 0.70 | 0.81 | 0.69 | |
| Period5 | 0.54 | 0.75 | 0.69 | 0.77 | 0.67 | |
| Period5+1 (whole) | | | | | | 0.77 |

Means of 5 performances:

| Mean | Calibration | Verification |
|---|---|---|
| NSE | 0.76 | 0.70 |
| Log NSE | 0.84 | 0.67 |
| RMSE_HF | 0.29 | 0.28 |
| RMSE_LF | 0.35 | 0.36 |
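The mean NSE values can be cross-checked from the 5 × 5 matrix of period scores. The sketch below assumes (the slide does not state it explicitly) that diagonal entries, where the same period is used on both axes, are calibration scores and that the off-diagonal entries are verification scores.

```python
import numpy as np

# NSE of the single model for the five periods, as listed on this slide.
nse_matrix = np.array([
    [0.66, 0.82, 0.78, 0.77, 0.69],
    [0.61, 0.87, 0.75, 0.73, 0.64],
    [0.60, 0.81, 0.78, 0.70, 0.66],
    [0.60, 0.71, 0.70, 0.81, 0.69],
    [0.54, 0.75, 0.69, 0.77, 0.67],
])

on_diagonal = np.eye(5, dtype=bool)
mean_calibration = float(nse_matrix[on_diagonal].mean())    # same period on both axes
mean_verification = float(nse_matrix[~on_diagonal].mean())  # all cross-period scores

print(f"calibration: {mean_calibration:.2f}, verification: {mean_verification:.2f}")
```

Under that assumption the two means round to the reported 0.76 (calibration) and 0.70 (verification), which supports this reading of the table.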


Experiment-1: Comparing committee models to the single one

A) State-based dynamic averaging (Oudin et al., 2006)

B) Output-based dynamic averaging using fuzzy committee (Fenicia et al., 2007; Kayastha et al., 2013)

C) Output-based dynamic averaging using observed flows

| | Period1 | Period2 | Period3 | Period4 | Period5 | Period5+1 |
|---|---|---|---|---|---|---|
| Period1 | 0.56 | 0.73 | 0.70 | 0.72 | 0.63 | |
| Period2 | 0.49 | 0.74 | 0.59 | 0.64 | 0.59 | |
| Period3 | 0.54 | 0.78 | 0.78 | 0.76 | 0.69 | |
| Period4 | 0.51 | 0.66 | 0.67 | 0.76 | 0.60 | |
| Period5 | 0.53 | 0.71 | 0.65 | 0.75 | 0.66 | |
| Period5+1 | | | | | | 0.72 |

Mean of NSE:

| Models | Calibration | Verification |
|---|---|---|
| Single model | 0.76 | 0.70 |
| Cmte A | 0.70 | 0.65 |
| Cmte B | 0.77 | 0.70 |
| Cmte C | 0.77 | 0.71 |


Experiment-2: single model, calibration on 1+2+3

Model performance (NSE) over the 2 pre-defined periods:
- Calibration period: Periods 1 + 2 + 3
- Verification period: Periods 4 + 5

| Models | Calibration (Period 1+2+3) | Verification (Period 4+5) |
|---|---|---|
| Single parameterized | 0.78 | 0.66 |
| A | 0.75 | 0.70 |
| B | 0.79 | 0.67 |
| C | 0.79 | 0.72 |

Periods: P1: 01/01/1984 - 31/12/1988; P2: 01/01/1989 - 31/12/1993; P3: 01/01/1994 - 31/12/1998; P4: 01/01/1999 - 31/12/2003; P5: 01/01/2006 - 31/12/2010

In verification, the committees are better than the single model


Experiment -2: fragments of hydrograph

[Figure: fragments of the hydrograph (Experiment 2)]


Model errors for various years: the committee model is more robust

Single model vs committee model (fuzzy): lower SD for the committee model

[Figure panels: yearly NSE; yearly NSE for low flows; yearly bias (Qsim/Qobs)]


Model errors for different periods: the committee model is more robust

Single model vs committee model (fuzzy): lower SD for the committee model

[Figure: NSE for the different periods]


Conclusions (1)

Splitting the data into small subsets does not allow committee models to become significantly better than a single model

However, using a larger set (P1+P2+P3) for calibration makes committee models more accurate than a single model

Mean of NSE:

| Models | Experiment 1 (P1, P2, P3, P4, P5 separately): Calibration | Experiment 1: Verification | Experiment 2: Calibration (P1+P2+P3) | Experiment 2: Verification (P4+P5) |
|---|---|---|---|---|
| Single parameterized | 0.76 | 0.70 | 0.78 | 0.66 |
| A | 0.70 | 0.65 | 0.75 | 0.70 |
| B | 0.77 | 0.70 | 0.79 | 0.67 |
| C | 0.77 | 0.71 | 0.79 | 0.72 |

(Slide annotation: “Lower than single param. model”)


Conclusions (2)

Committees of specialized models:
- can be used when an overall model fails to identify triggers and thresholds in the catchment behaviour
- can be seen as a natural way of introducing additional complexity and hence adaptivity

Committee models were initially developed for improving the accuracy of short-term forecasts, and they do it well.

However, our hypothesis is that the inherent capacity of committee models to react to short-term changes in hydrometeorological conditions may provide the mechanism for capturing long-term changes as well


Further work

Defining what is a “hydrometeorological regime” at different time scales

To try to deal with non-stationarity by using non-stationary parameters (and machine-learning them)

Developing more adaptive dynamic combination schemes able to deal with the long term changes in regimes

“Philosophical questions” to think about:
- When several models are combined, the notion of “state” seems to disappear. Is it a problem?
- Some combination schemes are not conservative, i.e. may generate or lose water (“violate physics”). Is it bad?
- We calibrate models knowing that the data are bad (especially for peaks). Is it right?




Thank you very much!
