This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the author's institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier's archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright



  • Author's personal copy

    Review

Methods used for the development of neural networks for the prediction of water resource variables in river systems: Current status and future directions

    Holger R. Maier a,*, Ashu Jain b, Graeme C. Dandy a, K.P. Sudheer c

a School of Civil, Environmental, and Mining Engineering, The University of Adelaide, Adelaide, SA 5005, Australia
b Department of Civil Engineering, Indian Institute of Technology Kanpur, Kanpur 208 016, India
c Department of Civil Engineering, Indian Institute of Technology Madras, Chennai 600 036, India

Article info

Article history:
Received 11 December 2009
Received in revised form 16 February 2010
Accepted 16 February 2010
Available online 19 March 2010

Keywords:
Artificial neural networks
Water resources
River systems
Forecasting
Prediction
Modelling process
Model development
Review

Abstract

Over the past 15 years, artificial neural networks (ANNs) have been used increasingly for prediction and forecasting in water resources and environmental engineering. However, despite this high level of research activity, methods for developing ANN models are not yet well established. In this paper, the steps in the development of ANN models are outlined and taxonomies of approaches are introduced for each of these steps. In order to obtain a snapshot of current practice, ANN development methods are assessed based on these taxonomies for 210 journal papers that were published from 1999 to 2007 and focus on the prediction of water resource variables in river systems. The results obtained indicate that the vast majority of studies focus on flow prediction, with very few applications to water quality. Methods used for determining model inputs, appropriate data subsets and the best model structure are generally obtained in an ad-hoc fashion and require further attention. Although multilayer perceptrons are still the most popular model architecture, other model architectures are also used extensively. In relation to model calibration, gradient based methods are used almost exclusively. In conclusion, despite a significant amount of research activity on the use of ANNs for prediction and forecasting of water resources variables in river systems, little of this is focused on methodological issues. Consequently, there is still a need for the development of robust ANN model development approaches.

© 2010 Elsevier Ltd. All rights reserved.

    1. Introduction

Over the last 15 years or so, the use of artificial neural networks (ANNs) for the prediction and forecasting of water resource variables has become a well-established research area. In the early years (1992–1998), ANNs were considered a novel modelling approach and, consequently, research efforts were directed primarily towards the application of ANNs to different types of problems and case studies in order to assess their utility as an alternative modelling approach. The large amount of research activity in this area led to a number of review papers in 2000 and 2001 (Maier and Dandy, 2000; ASCE, 2000a, b; Dawson and Wilby, 2001), which not only confirmed the potential of ANNs for the prediction and forecasting of water resource variables, but also identified a number of challenges that needed to be addressed in order to ensure that ANNs become a mature modelling approach that can sit comfortably alongside other approaches in the toolkit of hydrological and water resource modelers.

In their review, Maier and Dandy (2000) suggested that there needed to be a shift in the focus of ANN research from the application of ANNs to various water resources case studies to addressing a number of methodological issues. Attention to good practice in model development is vitally important in all modelling efforts (Jakeman et al., 2006; Robson et al., 2008; Welsh, 2008), but is particularly important in the development of ANN models, as they are developed using available data and not based on underlying physical processes explicitly, thereby increasing the chances of developing a model that is not very meaningful. Consequently, the focus of this review paper is on the methodologies that are used in the development of ANN models for the prediction and forecasting of water quantity (e.g. flow, level) and quality variables in river systems. The steps in the ANN model development process are outlined, taxonomies of the methods that are available at each of these steps are presented, and the methods that have been utilized in 210 papers that have been published in well-known international journals from 1999 to 2007 are analysed in relation to these taxonomies. This provides a snapshot of the methods that are used at the various steps of the model development process during this time period. It should be noted that this paper does not evaluate the performance of ANN models relative to other water quantity and quality models, nor does it critically evaluate the latest advancements in ANN modelling. These should be the subject of other review papers. Throughout this paper, in-depth descriptions of the methodologies are not given, as readers are expected to be familiar with ANN modelling and the various methods employed therein. Information on the basic concepts of ANNs is given in many papers and textbooks (e.g. Flood and Kartam, 1994; Hassoun, 1995; Maren et al., 1990; Masters, 1993; Rojas, 1996; Bishop, 2004).

* Corresponding author. Tel.: +61 8 8303 4139; fax: +61 8 8303 4359. E-mail address: [email protected] (H.R. Maier).

Contents lists available at ScienceDirect: Environmental Modelling & Software. Journal homepage: www.elsevier.com/locate/envsoft

1364-8152/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.envsoft.2010.02.003

Environmental Modelling & Software 25 (2010) 891–909

The remainder of this paper is organized as follows. In Section 2, details are given on how the database of papers was assembled, as well as an overview of the research activity in the use of ANNs for the prediction and forecasting of water quality and quantity variables in river systems from 1999 to 2007. This period was chosen as it follows on from the time period covered in the review by Maier and Dandy (2000) (i.e. 1992–1998). In Section 3, a brief outline of the steps in the ANN model development process is provided, followed by the taxonomies of available options at the various steps in this process, against which the 210 papers are assessed in terms of the modelling approaches adopted. In Section 4, a summary and conclusions of the findings of this paper are provided.

    2. Overview

The articles reviewed in this paper are taken from the following international refereed journals (the numbers in brackets are the journals' 2008 ISI impact factors): Advances in Water Resources (2.235), Civil Engineering and Environmental Systems (0.425), Environmental Modelling and Software (2.659), Environmetrics (0.719), Hydrological Processes (2.002), Hydrological Sciences Journal (1.216), Hydrology and Earth System Sciences (2.167), International Journal of Water Resources Development (0.738), Journal of Environmental Engineering (1.085), Journal of Hydroinformatics (0.681), Journal of Hydrologic Engineering (1.007), Journal of Hydrology (2.305), Journal of the American Water Resources Association (1.208), Journal of Water Resources Planning and Management (1.275), Nordic Hydrology (1.194), Stochastic Environmental Research and Risk Assessment (0.951), Water Resources Management (1.350), Water Resources Research (2.398), and Water SA (0.721). These journals were chosen because they are widely recognized international journals in the fields of hydrology and surface water resources. A keyword search of the ISI Web of Science was then conducted for these journals for the period 1999–2007 using the search term "Neural Networks", resulting in 516 articles. This list was refined manually to exclude papers focusing on rainfall, groundwater, lakes and reservoirs, parameter estimation, etc., resulting in 210 selected papers focusing on the prediction and forecasting of water quantity and quality variables in river systems.

Details of the selected papers, including year of publication, authors, study location and variable predicted, are given in Table 1. The distribution of papers by year of publication is given in Fig. 1. As can be seen, there has been a strongly increasing trend in the number of papers published since 2001, with 46 papers published in 2007 alone.

The number of papers in which water quantity and water quality variables were predicted is given in Fig. 2. As can be seen, water quantity variables were predicted in more than 90% of the papers, of which flow was by far the most popular (see Table 1). Water quality variables were predicted in fewer than 10% of the papers.

The distribution of time steps considered is given in Fig. 3. As can be seen, a daily time step was used in 105 of the 210 papers reviewed, followed by hourly (49 papers) and monthly (34 papers)

Table 1. Details of papers reviewed.

Authors (year) | River System(s) | Variable(s)
Savic et al. (1999) | Kirkton River, Scotland | Flow
Maier and Dandy (1999) | River Murray, Australia | Salinity
Danh et al. (1999) | Da Nhim & La Nga Rivers, Vietnam | Flow
See and Openshaw (1999) | Ouse River, England | Level
Sajikumar and Thandaveswara (1999) | River Lee, UK & Thuthapuzha River, India | Flow
Zealand et al. (1999) | Winnipeg River, Canada | Flow
Frakes and Yu (1999) | Susquehanna River, USA | Flow & Nitrate
Jain et al. (1999) | Indravati River, India | Flow
Campolo et al. (1999a) | Tagliamento River, Italy | Level
Campolo et al. (1999b) | Arno River, Italy | Level
Dawson and Wilby (1999) | River Mole, England | Flow
Coulibaly et al. (2000b) | Eight River Systems, Canada | Flow
Elshorbagy et al. (2000) | Little River & Reed Creek, USA; English River, Canada | Flow
Coulibaly et al. (2000a) | Chute-du-Diable River, Canada | Flow
Gautam et al. (2000) | River Tono, Japan | Flow
Zhang and Govindaraju (2000) | Council Grove, El Dorado and Marion Rivers, USA | Flow
Tingsanchali and Gautam (2000) | Pasak River & Nan River, Thailand | Flow
Liong et al. (2000) | River systems in Bangladesh | Level
Imrie et al. (2000) | Rivers Trent & Dove, UK | Flow
Anmala et al. (2000) | Council Grove, El Dorado and Marion Rivers, USA | Flow
Kim and Barros (2001) | Williamsburg, Raystown, Loyalsockville and Newport Rivers, USA | Flow
Khalil et al. (2001) | English, Oslinka, Graham, Halfway and Nagagami Rivers, Canada | Flow
Hu et al. (2001) | Hanjiang and Jingsh Rivers, China | Flow
Elshorbagy et al. (2001) | Little River & Reed Creek, USA | Flow
Coulibaly et al. (2001a) | Chute-du-Diable River, Canada | Flow
Chang and Chen (2001) | Da-cha River, Taiwan | Flow
Coulibaly et al. (2001b) | Chute-du-Diable River, Canada | Flow
Chang et al. (2001) | Da-chia River, Taiwan | Flow
Lischeid (2001) | Lehstenbach, Germany | SO4
Xu and Li (2002) | Saikawa River, Japan | Flow
Xiong and O'Connor (2002) | 11 Rivers in China, Australia, Malaysia, Nepal, Bangladesh, Vietnam, Ireland & Thailand | Flow
Sivakumar et al. (2002a) | Chao Phraya River, Thailand | Flow
Sivakumar et al. (2002b) | Coaracy Nunes/Araguari River, Brazil | Flow
Shim et al. (2002) | Han River Basin, Korea | Flow
Rajurkar et al. (2002) | Narmada River, India | Flow
Ochoa-Rivera et al. (2002) | Tagus River, Spain | Flow
Hsu et al. (2002) | Leaf River, USA | Flow
Elshorbagy et al. (2002a) | English River, Canada | Flow
Elshorbagy et al. (2002b) | English River, Canada | Flow
Chang et al. (2002a) | Da-Chia River, Taiwan | Flow
Chang et al. (2002b) | Lanyoung River, Taiwan | Flow
Cannon and Whitfield (2002) | 21 Rivers, Canada | Flow
Cameron et al. (2002) | River South Tyne, England | Flow / Level
Dawson et al. (2002) | River Yangtze, China | Flow
Brath et al. (2002) | Sieve River, Italy | Flow
Birikundavyi et al. (2002) | Mistassibi River, Canada | Flow
Liong and Sivapragasam (2002) | Ganga, Jamuna, Brahmputra, Meghna Rivers, Bangladesh | Level
Bowden et al. (2002) | River Murray, Australia | Salinity
Zhang and Govindaraju (2003) | Back Creek and Indian-Kentuck Creek, USA | Flow


Wilby et al. (2003) | Test River, England | Flow
Suen and Eheart (2003) | Upper Sangamon River, USA | Nitrate
Sudheer et al. (2003) | Baitarni River, India | Flow
Solomatine and Dulal (2003) | Sieve River, Italy | Flow
Markus et al. (2003) | Sangamon River, USA | Nitrate
Lischeid and Uhlenbrook (2003) | Brugga River, Germany | Flow & Silica
Laio et al. (2003) | Tanaro River, Italy | Flow
Kim and Valdes (2003) | Conchos River, Mexico | Palmer Drought Severity Index
Huynh Ngoc and Nguyen Duc Anh (2003) | Upper Red River, Vietnam | Level
Deka and Chandramouli (2003) | Brahamputra River, India | Flow
Coulibaly (2003) | Chute-du-Diable River, Canada | Flow
Cigizoglu (2003b) | Goksu River, Turkey | Flow
Cigizoglu (2003a) | Göksu, Lamas and Ermenek Rivers, Turkey | Flow
Chibanga et al. (2003) | Kafue River, Zambia | Flow
Campolo et al. (2003) | River Arno, Italy | Level
Anctil et al. (2003) | Serein River, France | Flow
Abebe and Price (2003) | Sieve River, Italy | Flow
Tayfur et al. (2003) | Laboratory experiments (data from earlier work) | Sediment
Khu and Werner (2003) | Bukit Timah River, Singapore | Flow
Gaume and Gosset (2003) | Marne River, France | Flow
Jain and Indurthy (2003) | Salado Creek, USA | Flow
Phien and Kha (2003) | Red River, Vietnam | Level
Wenrui et al. (2004) | Apalachicola River, USA | Flow
Tomasino et al. (2004) | Po River, Italy | Flow
Sudheer and Jain (2004) | Narmada River, India | Flow
Shu and Burn (2004) | 404 Catchments in UK | Flow
Riad et al. (2004) | Ourika River, Morocco | Flow
Rajurkar et al. (2004) | Krishna & Narmada, India; Bird Creek, USA; Brosna, Ireland; Garrapatas, Colombia; Kizu, Japan; Pampanga, Philippines | Flow
Pan and Wang (2004) | Wu Tu River, Taiwan | Flow
Nayak et al. (2004) | Baitarani River, India | Flow
Moradkhani et al. (2004) | Salt River, USA | Flow
Lin and Chen (2004) | Fei-Tsui River, Taiwan | Flow
Kumar et al. (2004) | Hemavathi River, India | Flow
Kisi (2004a) | Tongue River, USA | Sediment
Jain et al. (2004) | Kentucky River, USA | Flow
Huang et al. (2004) | Apalachicola River, USA | Flow
Cigizoglu (2004) | Schuylkill River, USA | Sediment
Chiang et al. (2004) | Lan-Yang River, Taiwan | Flow
Chang et al. (2004) | Da-Chia River, Taiwan | Flow
Castellano-Mendez et al. (2004) | Xallas River, Spain | Flow
Anctil and Lauzon (2004) | Kavi, Ivory Coast; Leaf & Salt Fork, USA; San Juan, Canada; Serein & Volpajola, France | Flow
Anctil et al. (2004b) | Serein River, France | Flow
Anctil et al. (2004a) | Serein River, France and Leaf River, USA | Flow
Kisi (2004b) | Göksudere River, Turkey | Flow
Solomatine and Xue (2004) | Huai River, China | Flow
Agarwal and Singh (2004) | Narmada River, India | Flow
Jain and Srinivasulu (2004) | Kentucky River, USA | Flow
Teegavarapu and Elshorbagy (2005) | Little River & Reed Creek, USA | Flow
Sivapragasam and Muttil (2005) | Chehalis River, Morse Creek & Bear Branch, USA | Flow
Sivapragasam and Liong (2005) | Tryggevaelde River, Denmark | Flow
Shrestha et al. (2005) | Neckar River, Germany | Flow
Pan and Wang (2005) | Wu-Tu River, Taiwan | Flow
Nayak et al. (2005) | Kolar River, India | Flow
Kumar et al. (2005) | Malaprabha River, India | Flow
Kisi (2005) | Quebrada Blanca & Rio Valenciano Stns., USA | Sediment
Kingston et al. (2005a) | River Murray, Australia | Salinity
Khalil et al. (2005) | Sevier River, USA | Flow
Jeong and Kim (2005) | Geum River, Korea | Flow
Hu et al. (2005) | Seven Rivers in China | Flow
Goswami et al. (2005) | Brosna River, Ireland | Flow
Coulibaly and Baldwin (2005) | Saint-Lawrence River, Canada & Nile River, Egypt | Flow, Volume
Coulibaly et al. (2005) | Kipawa & Matawin Rivers, Canada | Flow
Cigizoglu (2005a) | Seytan, Hayrabolu and Ergene Rivers, Turkey | Flow
Cigizoglu (2005b) | Synthetic data | Flow
Cigizoglu and Kisi (2005) | Filyos River, Turkey | Flow
Chen and Ji (2005) | Yellow River, China | Flow
Chang et al. (2005) | Lan-Yang River, Taiwan | Flow
Chandramouli and Deka (2005) | Bharadhapuza River, India | Flow
Bowden et al. (2005) | River Murray, Australia | Salinity
Agarwal et al. (2005) | Vamsadhara River, India | Sediment
Bruen and Yang (2005) | Citywest & Dargle Rivers, Ireland | Flow
de Vos and Rientjes (2005) | Geer River, Belgium | Flow
Kingston et al. (2005b) | Boggy Creek, Australia | Flow
Hettiarachchi et al. (2005) | Six rivers, England | Flow
Anctil and Rat (2005) | 47 rivers in France & USA | Flow
Chau et al. (2005) | Yangtze River, China | Level
Deka and Chandramouli (2005) | Brahmaputra River, India | Flow
Sudheer (2005) | Narmada River, India | Flow
Wu et al. (2005) | North Buffalo Creek, USA | Flow
Schumann and Lauener (2005) | Gornera River, Switzerland | Flow
Giustolisi and Laucelli (2005) | Luzzi and Liguori Rivers, Italy | Flow
Doan et al. (2005) | Wabash & Mississippi Rivers, USA and rivers in Bangladesh | Flow, Level
Wang et al. (2006) | Yellow River, China | Flow
Srivastava et al. (2006) | West Branch Brandywine Creek, USA | Flow
Sahoo et al. (2006) | Manoa and Palolo Streams, USA | Flow, Turbidity, Specific Conductance, DO, pH, water temp.
Sahoo and Ray (2006) | Waiakeakua and Manoa Streams, USA | Flow
Pereira and dos Santos (2006) | Tamanduatei River, Brazil | Flow, Level
Panagoulia (2006) | Acheloos River, Greece | Flow
Nilsson et al. (2006) | Bulken and Skarsvatn Rivers, Norway | Flow
Melesse and Wang (2006) | Red River, USA | Flow
Lin et al. (2006) | Lancang River, China | Flow
Khan and Coulibaly (2006) | Serpent & Chute-du-Diable Rivers, Canada | Flow
Keskin et al. (2006) | Dim Stream, Turkey | Flow
Karunasinghe and Liong (2006) | Mississippi & Wabash Rivers, USA and synthetic data | Flow
Kang et al. (2006) | Youngsan River, Korea | Flow


Cigizoglu and Kisi (2006) | Schuylkill River, USA | Sediment
Chetan and Sudheer (2006) | Kolar River, India | Flow
Chen et al. (2006) | Choshui River, Taiwan | Flow
Chau (2006) | Shing Mun River, Hong Kong | Level
Anctil et al. (2006) | Loire River, France | Flow
Alvisi et al. (2006) | Reno River, Italy | Level
Ahmad and Simonovic (2006) | Red River, Canada | Flow
Antar et al. (2006) | River Nile, Ethiopia and Sudan | Flow
Lauzon et al. (2006) | Loire River, France | Flow
Chen and Adams (2006a) | Bei River, China | Flow
Dawson et al. (2006) | 850 catchments, UK | Flow
Jain and Srinivasulu (2006) | Kentucky River, USA | Flow
Jia and Culver (2006) | Buck Mountain Run River, USA | Flow
Lohani et al. (2006) | Narmada River, India | Flow
Chen and Adams (2006b) | Bei River, China | Flow
Garbrecht (2006) | Fort Cobb Watershed, USA | Flow
Kim et al. (2006) | Geum River, Korea | Flow
Raghuwanshi et al. (2006) | Siwane River, India | Flow, Sediment
Tayfur and Guldal (2006) | Catchment in Tennessee Basin, USA | Sediment
Parasuraman et al. (2006) | English River, Canada | Flow
Toth and Brath (2007) | River Sieve and River Reno, Italy | Flow
Piotrowski et al. (2007) | Murray Burn, Scotland | Concentration of tracer
Chen and Yu (2007a) | Lan-Yan River, Taiwan | Level
Abrahart and See (2007) | Hypothetical – Xianjiang rainfall-runoff model emulator | Flow
Chau (2007) | Shing Mun River, Hong Kong | Level
Gopakumar et al. (2007) | Achencoil River, India | Flow
Alp and Cigizoglu (2007) | Juniata River, USA | Sediment
Kisi and Cigizoglu (2007) | Filyos and Ergene Rivers, Turkey | Flow
Kamp and Savenije (2007) | Alzette River Basin, Luxembourg | Runoff, River Discharge, Salinity and Secchi Depth
Amenu et al. (2007) | Upper Sangamon River Basin, USA | Flow, Nitrate
Nor et al. (2007) | Sungai Bekok and Sungai Ketil Catchments, Malaysia | Flow
Parasuraman and Elshorbagy (2007) | Little River and Reed Creek, USA | Flow
Ochoa-Rivera et al. (2007) | Jucar River, Spain | Flow
Ahmed and Sarma (2007) | Pagladia River, India | Flow
Nayak et al. (2007) | Narmada River, India & Kentucky River, USA | Flow
Srivastav et al. (2007) | Kolar River, India | Flow
Zou et al. (2007) | Synthetic data for the Loch Raven Reservoir, USA | DO, Chl a, total phosphorus and ammonia
Shamseldin et al. (2007) | 8 catchments from Nepal (1), China (3), Ireland (1), Vietnam (1), Malaysia (1) and Thailand (1) | Flow
Yu and Liong (2007) | Tryggevaelde catchment & Mississippi River at Vicksburg, USA | Flow
de Vos and Rientjes (2007) | Greer River Basin, Belgium | Flow
Tayfur et al. (2007) | Upper Tiber, Italy | Flow
Sivapragasam et al. (2007) | Periyar River, India | Flow
Pulido-Calvo and Portela (2007) | Tua and Ca Rivers, Portugal | Flow
Pang et al. (2007) | 8 Watersheds in China | Flow
Muluye and Coulibaly (2007) | Churchill Falls, Canada | Flow
Mas and Ahlfeld (2007) | Gates Brook, MA, USA | Faecal Coliform
Lohani et al. (2007) | Narmada River, India | Sediment
Kisi (2007) | North Platte River, USA | Flow
Iliadis and Maris (2007) | 70 mountainous watersheds, Cyprus | Annual water supply
Hu et al. (2007) | Darong River, China | Flow
Han et al. (2007) | Bird Creek, Oklahoma, USA | Flow
Goswami and O'Connor (2007) | Bronsa, Ireland; Le Guindy Plouguiel, France | Flow
El-Shafie et al. (2007) | Nile River (inflow to Aswan High Dam), Egypt | Flow
Elgaali and Garcia (2007) | Arkansas River, USA | Water available for diversion
Diamantopoulou et al. (2007) | Axios & Strymon Rivers, Greece | 6 WQ params for Axios (nitrates, specific conductivity, DO, Na, Mg, Ca) and 3 for Strymon (nitrates, spec. cond., DO)
Corzo and Solomatine (2007) | Bagmati basin, Nepal; Sieve basin, Italy; Brue basin, UK | Flow
Chiang et al. (2007) | Wu-Tu River, Taiwan | Flow
Chen and Yu (2007b) | Lang-Yang River, Taiwan | Level
Chang et al. (2007c) | Chen-Eu-Lan River, Taiwan | Debris Flow
Chang et al. (2007a) | Da-Chia River, Taiwan | Flow
Chang et al. (2007b) | Upstream sections of both Da-Chia and Kee-Lung river basins, Taiwan | Flow
Bae et al. (2007) | Songgyang Dam, North Han River, South Korea | Dam inflow
Aqil et al. (2007a) | Cilalawi River, Indonesia | Flow
Aqil et al. (2007b) | Cilalawi River, Indonesia | Flow
Alp and Cigizoglu (2007) | Juniata River, Pennsylvania, USA | Sediment
Abrahart et al. (2007) | River Ouse, England | Flow


time steps. It should be noted that a number of different time steps were used in some of the papers reviewed.

    3. Methods used for ANN model development

    3.1. Introduction

The main steps in the development of ANN prediction models, as well as the way the data flow through, and the outcomes achieved at, the different steps, are given in Fig. 4. It should be noted that the model development steps covered here represent a subset of the 10 steps presented by Jakeman et al. (2006), which cover the full scientific process, including formulating a hypothesis, collecting appropriate observations and data, and reviewing the hypothesis.

The first step in the model development process presented here is the choice of appropriate model output(s) (i.e. the variable(s) to be predicted) and a set of potential model input variables from the available data. Although ANNs are data-driven models, it is up to the modeler to choose which input variables should be considered as part of the model development process. This can be done based on a priori knowledge and/or the availability of data. The resulting data set constitutes the Selected Data (Unprocessed) (Fig. 4). It should be noted that once the model outputs have been chosen, the number of nodes in the output layer has also been determined (Fig. 4). Next, the unprocessed data, which consist of measured values of the potential model input variables, as well as the model output variable(s), have to be processed (e.g. scaled, lagged) so that they are in a suitable form for the subsequent steps of the model development process. Once the processed database of potential model inputs and outputs has been assembled (Selected Data (Processed)), the actual model can be developed.
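The processing step described above (scaling and lagging) can be sketched as follows; the series, the lag choice and the function names are illustrative assumptions, not a prescription from the paper:

```python
# Sketch of the data-processing step: scale a series to [0, 1] and build
# lagged input/output pairs for a one-step-ahead prediction model.

def scale(series):
    """Linearly rescale values to the range [0, 1]."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

def make_lagged_pairs(series, n_lags):
    """Use the previous n_lags values as inputs, the next value as output."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y

flows = [12.0, 15.0, 14.0, 18.0, 22.0, 19.0, 17.0, 21.0]  # illustrative data
scaled = scale(flows)
X, y = make_lagged_pairs(scaled, n_lags=3)
```

Scaling keeps the inputs within the active range of the transfer functions, and lagging turns a single time series into the input/output pairs an ANN requires.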

    All ANN prediction models take the following form:

Y = f(X, W) + ε (1)

where Y is the vector of model outputs; X the vector of model inputs; W the vector of model parameters (connection weights); f(·) the functional relationship between model outputs, inputs and parameters; and ε the vector of model errors.

Consequently, in order to develop an ANN model, the vector of model inputs (X), the form of the functional relationship (f(·)), which is governed by the network architecture (e.g. multi-layer perceptron) and geometry (e.g. the number of hidden layers and nodes, type of transfer function), and the vector of model parameters (W), which includes the connection and bias weights, need to be defined.
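As a concrete illustration of Eq. (1), the sketch below evaluates f(X, W) for a one-hidden-layer multilayer perceptron with tanh transfer functions and a linear output node; the weights and layer sizes are arbitrary examples, not a calibrated model:

```python
import math

def forward(x, W_hidden, b_hidden, W_out, b_out):
    """One-hidden-layer MLP: tanh hidden nodes, linear output node."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(W_out, hidden)) + b_out

x = [0.2, 0.5]                        # two model inputs (X)
W_hidden = [[0.4, -0.3], [0.1, 0.8]]  # connection weights into two hidden nodes
b_hidden = [0.0, 0.1]                 # hidden bias weights
W_out = [0.6, -0.2]                   # connection weights into the output node
b_out = 0.05                          # output bias weight
y = forward(x, W_hidden, b_hidden, W_out, b_out)
```

The architecture fixes the form of f(·); calibration then estimates W so that the errors ε are small on the calibration data.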

The vector of appropriate model inputs is determined during the Input Selection step (Fig. 4). This can be achieved either by using a model-free approach, which uses statistical measures of significance, or a model-based approach, as part of which appropriate inputs are selected based on the performance of models with different sets of inputs. In the latter case, steps 5 and 6 in Fig. 4 have to be repeated for each set of model inputs tried. Once the vector of model inputs has been selected, the number of model inputs, and hence the number of nodes in the input layer of the ANN model, are known (Fig. 4).
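A minimal sketch of the model-free route, using the magnitude of the linear (Pearson) correlation between candidate lags and the output as the statistical measure of significance; the data and the 0.5 threshold are illustrative assumptions, and many studies use more sophisticated measures:

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

series = [3.0, 4.0, 6.0, 5.0, 7.0, 9.0, 8.0, 10.0, 12.0, 11.0]

# Correlate each candidate lag of the series with the next value.
candidates = {}
for lag in (1, 2, 3):
    candidates[lag] = pearson(series[:-lag], series[lag:])

# Retain only lags whose correlation magnitude exceeds the threshold.
selected = [lag for lag, r in candidates.items() if abs(r) > 0.5]
```

Each retained lag becomes one node in the input layer; the screen is model-free because no ANN has to be trained to apply it.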

The resulting Model Development Data are usually divided into calibration and validation subsets. The calibration data are used to estimate the unknown model parameters (connection weights) and the validation data are used to validate the performance of the calibrated model on an independent data set. If cross-validation is used as the stopping criterion, the calibration data are divided into training and testing subsets (Fig. 4).
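One possible sketch of this division, assuming a simple random split with illustrative 60/20/20 proportions; for strongly autocorrelated time series, contiguous blocks or more careful sampling schemes are often preferred:

```python
import random

def divide(samples, frac_valid=0.2, frac_test=0.2, seed=42):
    """Split samples into training, testing and validation subsets."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * frac_valid)
    n_test = int(len(shuffled) * frac_test)
    valid = shuffled[:n_valid]                  # held out for final validation
    test = shuffled[n_valid:n_valid + n_test]   # used for cross-validation
    train = shuffled[n_valid + n_test:]         # used to estimate the weights
    return train, test, valid

data = list(range(100))  # stand-ins for input/output samples
train, test, valid = divide(data)
```

The training and testing subsets together form the calibration data; the validation subset is touched only once, to assess the final calibrated model.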

Next, the functional form of the model, f(·), needs to be selected, which depends on the model architecture (e.g. multilayer perceptron, radial basis function), as well as an appropriate number of hidden nodes and how they are arranged (e.g. single layer, two layers). It should be noted that while the selection of an appropriate model

Fig. 3. Number of times various time steps have been used.

Fig. 1. Distribution of papers by year of publication.

    Fig. 2. Number of times water quantity and water quality variables were predicted.


structure is required for most ANN architectures, it is superfluous for some, such as Generalised Regression Neural Networks (GRNN), which have a fixed structure.

While the choice of an appropriate model architecture is a function of modeler preference, the optimal model structure generally needs to be determined using an iterative process. This involves selecting a network with a certain structure (e.g. number of hidden nodes, transfer functions), calibrating (training) the selected ANN model, as part of which an estimate of the vector of model parameters (W) is obtained, evaluating its performance, and then repeating the Calibration and Evaluation steps for different network configurations (Fig. 4). Once the network configuration that performs best on the calibration data is identified, the calibrated model needs to be validated using an independent data set. As ANNs are prone to overfitting the calibration data, cross-validation is generally used, as part of which the calibration data are divided into training and testing subsets, which enables the performance of models with different network configurations to be validated during the model calibration phase to ensure overfitting of the training data has not occurred.
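The cross-validation stopping rule described above can be sketched as follows; the error curves are fabricated purely to show the control flow, and the patience parameter is an illustrative assumption:

```python
# Simulated per-epoch errors: training error keeps falling, but the error on
# the independent testing subset turns upward once overfitting sets in.
train_err = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18, 0.13, 0.10, 0.08]
test_err  = [1.10, 0.80, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.66]

best_epoch, best_err = 0, float("inf")
patience, bad_epochs = 2, 0  # allow a short run of non-improving epochs
for epoch, err in enumerate(test_err):
    if err < best_err:
        best_epoch, best_err = epoch, err  # remember the best weights so far
        bad_epochs = 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # stop once the testing error keeps rising
            break
```

Training halts at the point where the testing error is minimised (epoch 4 here), even though the training error would have continued to fall.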

In the subsequent sections, the input selection, data division, model architecture selection, model structure selection, model calibration (training) and model evaluation stages of the ANN model

    Fig. 4. Steps in ANN model development process.


development process are considered in more detail. In each sub-section, the purpose and importance of the particular step in the ANN model development process considered are introduced, followed by a taxonomy of the main options available to modelers. Next, the options selected in the 210 papers considered are reviewed in light of these taxonomies, thereby presenting a snapshot of the ANN model development approaches used from 1999 to 2007. It should be noted that this information is presented in terms of the number of times a particular method has been used in the papers reviewed. This is because some papers used multiple methods, and details about some of the methods addressed were not provided in some of the papers. Consequently, the total number of times a particular method has been used in the papers reviewed can be more or fewer than the number of papers reviewed (i.e. 210).

    3.2. Input selection

3.2.1. Introduction

One of the most important steps in the ANN model development process is the determination of an appropriate set of inputs (X). However, this task is generally given little attention in ANN modelling and most inputs are determined on an ad-hoc basis or using a priori system knowledge (Maier and Dandy, 2000). This can result in the inclusion of too few or too many model inputs, both of which are undesirable.

The consequence of excluding one or more significant inputs is that the resulting model is not able to develop the best possible input–output relationship, given the available data. The omission of important model inputs is more likely to occur in time series applications, where the potential model inputs consist of not only different input variables, but also their lagged values (unless recurrent neural network architectures are used). This increases the number of potential model inputs (as distinct from input variables) considerably, and in many previous studies, lags have been chosen on an ad-hoc basis (Maier and Dandy, 2000), with the associated danger that important system dynamics have not been captured.

The inclusion of too many inputs is usually caused by input redundancy, where some of the selected inputs provide significant information, but are related to each other and therefore provide redundant information. This can cause a number of problems. Firstly, redundant inputs increase the likelihood of overfitting (overtraining). This is because a larger number of inputs generally increases network size, and hence the number of connection weights (i.e. model parameters) that need to be calibrated. As the number of training samples is generally fixed, the addition of redundant model inputs increases the ratio of the number of connection weights to the number of training samples, thus increasing the likelihood of overfitting, while not providing any additional information to the model. Secondly, the inclusion of redundant model inputs introduces additional local minima in the error surface in weight space. For example, if two model inputs (x1 and x2) are highly correlated, and thus essentially represent the same input information, there are likely to be many combinations of weights that will result in identical model performance. If the underlying relationship is y = x1, then a unique relationship exists if either x1 (y = x1) or x2 (y = x2) is included as a model input. However, if both inputs are included (y = w1x1 + w2x2), the same model output is obtained for a large number of combinations of weights (e.g. w1 = 1 & w2 = 0, w1 = 0 & w2 = 1, w1 = 0.5 & w2 = 0.5, w1 = 0.3 & w2 = 0.7, etc.). The presence of local minima in weight space makes it more difficult to find an optimal set of weights, as well as resulting in input–output relationships that are not unique, making it more difficult to extract physical meaning from calibrated models.
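The non-uniqueness caused by a redundant input can be demonstrated numerically. In this sketch (all numbers are illustrative), x2 duplicates x1 and the true relationship is y = x1, so every weight pair with w1 + w2 = 1 reproduces the data exactly:

```python
# Toy demonstration of weight non-uniqueness under input redundancy:
# x2 duplicates x1 and y = x1, so any pair with w1 + w2 = 1 gives zero
# calibration error. All numbers are illustrative.

x1 = [0.1, 0.5, 0.9]
x2 = list(x1)          # perfectly redundant second input
y = list(x1)           # underlying relationship: y = x1

def sse(w1, w2):
    """Sum of squared errors of the model y_hat = w1*x1 + w2*x2."""
    return sum((w1 * a + w2 * b - t) ** 2 for a, b, t in zip(x1, x2, y))

for w1, w2 in [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (0.3, 0.7)]:
    print(w1, w2, round(sse(w1, w2), 12))   # zero error in every case
```

Because all four weight pairs fit equally well, a calibration algorithm has no basis for preferring one over another, which is exactly why physical interpretation of the calibrated weights breaks down.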

3.2.2. Taxonomy

A number of techniques are available for assessing the significance of the relationship between potential model inputs and output(s), as shown in Fig. 5. The primary distinction is between Model Free and Model Based approaches. Model Based approaches rely on the development (structure selection, calibration and evaluation) of a number of ANN models with different inputs to determine which of the candidate inputs should be included. The primary disadvantage of this approach is that it is time consuming, as a number of ANNs have to be developed. In addition, it has the potential for masking the effect different model inputs have on model performance, as the latter is also a function of network structure (e.g. number of hidden nodes), which ideally should be optimized for each input set investigated, and the quality of the calibration process, which is a function of a number of user-defined parameters (e.g. learning rate and momentum if the back-

[Fig. 5 layout: Input Significance splits into Model Based approaches (Ad-Hoc, e.g. developing models with different inputs; Stepwise, e.g. constructive, pruning; Global, e.g. genetic algorithm, particle swarm algorithm; Sensitivity Analysis) and Model Free approaches (Ad-Hoc, e.g. available data, domain knowledge; Analytical: Linear, e.g. correlation, or Non-linear, e.g. mutual information).]

Fig. 5. Taxonomy of approaches to determining input significance.



propagation algorithm is used), which should also be optimized for each candidate input set. Consequently, it is difficult to isolate the impact of different model inputs on model performance.

Options for selecting which input combinations to try as part of Model Based approaches include an ad-hoc approach, where the model developer selects which combinations of model inputs should be tried, a stepwise approach, where inputs are systematically added (constructive) or removed (pruning), or a global approach, where a global optimization algorithm, such as a genetic algorithm, is used to select the combination of inputs that maximizes model performance. Another approach is to develop an ANN model with a relatively large number of inputs and to use sensitivity analysis to determine which inputs should be excluded.

In contrast to Model Based approaches, Model Free approaches to input selection do not rely on the performance of trained ANN models for the selection of appropriate inputs. As shown in Fig. 5, model free approaches can be divided into two categories: ad-hoc and analytical. As part of ad-hoc model free approaches, inputs are selected by the model developer on an arbitrary basis or based on domain knowledge, for example. When an analytical model free approach to input selection is used, a statistical measure of significance is generally used to assess the strength of the relationship between potential model inputs and outputs. The most commonly used measure of statistical dependence for input selection is correlation, which has the disadvantage of only measuring linear dependence between variables. This is particularly relevant in the context of developing ANN models, as ANNs are generally used in preference to linear modelling approaches, such as linear regression, in cases where input–output relationships are suspected to be highly non-linear, which is often the case in water resources problems. Consequently, the use of non-linear statistical dependence measures, such as mutual information, is more appropriate for determining inputs to ANN models.
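The weakness of correlation as an input significance measure can be shown with a small numerical sketch. The data and the crude equal-width histogram estimate of mutual information below are illustrative assumptions, not the estimators used in the reviewed papers: for y = x² on a symmetric range, the Pearson correlation is essentially zero even though y is completely determined by x, while mutual information still flags the dependence.

```python
# Linear correlation vs a crude binned mutual-information estimate for the
# strongly dependent but non-linear relationship y = x**2. Illustrative only.

import math

x = [i / 50.0 - 1.0 for i in range(101)]   # symmetric grid on [-1, 1]
y = [v * v for v in x]                     # fully dependent, non-linear

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = math.sqrt(sum((p - ma) ** 2 for p in a))
    vb = math.sqrt(sum((q - mb) ** 2 for q in b))
    return cov / (va * vb)

def binned_mi(a, b, bins=4):
    """Very rough mutual information (nats) from equal-width histograms."""
    def idx(v, lo, hi):
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)
    n = len(a)
    pa, pb, pab = {}, {}, {}
    la, ha, lb, hb = min(a), max(a), min(b), max(b)
    for p, q in zip(a, b):
        i, j = idx(p, la, ha), idx(q, lb, hb)
        pa[i] = pa.get(i, 0) + 1
        pb[j] = pb.get(j, 0) + 1
        pab[(i, j)] = pab.get((i, j), 0) + 1
    return sum(c / n * math.log(c / n / (pa[i] / n * pb[j] / n))
               for (i, j), c in pab.items())

print(round(pearson(x, y), 3))   # ~0: correlation misses the dependence
print(round(binned_mi(x, y), 3))  # clearly positive: MI detects it
```

A correlation-based filter would discard x as an input here, whereas a mutual-information filter would retain it, which is the argument for non-linear dependence measures made above.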

In order to overcome the problem of redundant inputs discussed earlier, Input Independence needs to be considered in addition to Input Significance. As can be seen in Fig. 6, there are two main approaches to accounting for input independence, namely dimensionality reduction and filtering. The aim of the former is to reduce the dimensionality of the input space by eliminating correlated candidate inputs. There are two main approaches to achieving this, including rotation of the input vectors, as is the case in principal component analysis, or clustering of the input space and choosing representative inputs from each cluster for further consideration. Dimensionality reduction generally forms the first part of a two step process, the second of which is to select the inputs that have the most significant relationship with the model output(s) using one of the methods in Fig. 5.

The second approach to account for input independence in the input selection process is filtering. The most prominent example of this is the constructive stepwise model-building process, as part of which the candidate input that has the most significant relationship with the model output(s) is selected first, followed by the candidate input that has the next biggest additional impact and so on. Classic examples of this are the partial correlation and partial mutual information input variable selection algorithms (see May et al., 2008a, b), which combine a stepwise partial modelling approach that caters for input independence (redundancy) with an analytical measure of statistical dependence that caters for input significance. The stepwise, constructive model based approach achieves a similar aim, although the criteria used to decide when to stop adding candidate inputs are less well defined. Other model based input selection approaches, such as global optimization and stepwise pruning approaches, cannot be combined with filtering approaches and have to rely on dimensionality reduction approaches to cater for input independence (redundancy).

3.2.3. Results

As illustrated in Fig. 7, a model free input selection approach was used 146 times, compared with 72 occasions on which a model based approach was used. Of the model free approaches, ad-hoc methods were most popular with applications in 79 papers, followed by linear analytical approaches, such as correlation, which were used on 60 occasions. A non-linear analytical method was only used seven times, which seems inconsistent with the non-linear premise that underpins all ANN models.

In 37 of the 72 instances where a model based input selection approach was used, this was done in an ad-hoc fashion. A stepwise model building approach was used 14 times, followed by sensitivity analysis of trained models (seven times) and use of a global search procedure (five times).

Input independence was only considered in 18 of the 210 papers reviewed (Fig. 8). Filtering was the most commonly used approach, with 10 applications, followed by clustering and rotation, which were applied six and two times, respectively.

3.2.4. Conclusion

The results of the review reveal that there is a need to pay greater attention to the input selection step in the development of ANN models. The inputs selected can have a significant impact on model performance, yet ad-hoc approaches to input selection (either model based or model free) were used in the majority of papers surveyed. While analytical model free approaches were also popular, almost all of these used a linear method for determining input significance, which is counter to the premise of using a non-linear model, such as

    Fig. 6. Taxonomy of approaches to accounting for input independence.

Fig. 7. Number of times various methods of determining input significance have been used.



an ANN, as the actual model. Consequently, there is a need to make greater use of non-linear analytical approaches to input selection.

The issue of input independence was ignored in almost all of the papers reviewed, which can have significant negative impacts on model performance and the ability to extract any meaningful information about underlying physical processes from trained ANN models. Consequently, this issue requires increased attention.

    3.3. Data division

3.3.1. Introduction

As part of the ANN model development process, the available data are generally divided into training, testing and validation subsets. The training set is used to estimate the unknown connection weights, the testing set is used to decide when to stop training in order to avoid overfitting and/or which network structure is optimal, and the validation set is used to assess the generalisation ability of the trained model. As ANNs, like all empirical models, perform best when they are not used to extrapolate beyond the range of the training data, all patterns that are contained in the available data need to be included in the training set. Similarly, since the test set is used to determine when to stop training and/or which network geometry is optimal, it needs to be representative of the training set and should therefore also contain all of the patterns that are present in the available data. If all of the available patterns are used to calibrate the model, then the most challenging evaluation of the generalisation ability of the model is if all of the patterns are also part of the validation set. Consequently, the training, testing and validation sets should have the same statistical properties in order to develop the best possible model, given the available data.

3.3.2. Taxonomy

The methods for dividing the available data into appropriate subsets can be divided into supervised and unsupervised approaches (Fig. 9). Unsupervised approaches do not take the statistical properties of the data subsets into account explicitly and only stratified unsupervised approaches attempt to ensure that the statistical properties of the subsets are similar. For example, a self-organizing map (SOM) (see Kalteh et al., 2008) can be used to cluster the available data and to allocate data samples from each cluster to the training, testing and validation subsets, thereby ensuring that patterns from different regions of the multivariate input–output space are represented in each subset. In the random unsupervised approach, the data are divided into their respective subsets on a random basis. In the physics based approach, the data are divided into various classes based on knowledge about the underlying physical processes or domain knowledge. In the ad-hoc approach, data might be divided such that the first XX observations are allocated to the training set, the next YY observations are allocated to the testing set and the final ZZ observations are allocated to the validation set. However, this does not take any account of the statistical properties of the data subsets, making it difficult to know whether the best possible model, given the available data, has been developed or whether model performance based on the validation set is representative of model performance under a range of conditions. For example, the patterns in the validation set might only be representative of average conditions, thereby inflating model performance on the validation data. Alternatively, the validation data might contain rare events not used during training (calibration), thereby diminishing the apparent capability of the model to capture the relationship contained in the available data.

The explicit goal of supervised data division methods is to ensure that the statistical properties of the various subsets are similar. This can be achieved by using a trial-and-error approach, as part of which manual adjustments are made to the composition of the various subsets until an arbitrarily satisfactory level of agreement between the statistical properties of the various data subsets has been reached, or by using a formal optimization approach to minimize a measure of difference between the statistical properties of the data subsets.
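A minimal version of such a supervised approach can be sketched as a random search: generate many candidate 60/20/20 splits, score each by how closely the subset means and standard deviations agree, and keep the best. The scoring rule, split proportions and toy data are illustrative assumptions rather than any specific published method.

```python
# Hedged sketch of supervised data division: random candidate splits are
# scored by agreement of subset means/standard deviations and the
# best-scoring split is retained. Data and proportions are illustrative.

import random
import statistics

def split_score(train, test, valid):
    """Sum of absolute differences in mean and st. dev. between subsets."""
    stats = [(statistics.mean(s), statistics.pstdev(s))
             for s in (train, test, valid)]
    return sum(abs(a - b)
               for (m1, s1), (m2, s2) in zip(stats, stats[1:])
               for a, b in ((m1, m2), (s1, s2)))

def best_split(data, trials=200, seed=0):
    rng = random.Random(seed)
    n = len(data)
    i, j = 6 * n // 10, 8 * n // 10          # 60/20/20 split points
    best, best_score = None, float("inf")
    for _ in range(trials):
        d = data[:]
        rng.shuffle(d)
        cand = (d[:i], d[i:j], d[j:])
        score = split_score(*cand)
        if score < best_score:
            best, best_score = cand, score
    return best, best_score

data = [float(i % 17) for i in range(100)]   # arbitrary toy series
(train, test, valid), score = best_split(data)
print(len(train), len(test), len(valid), round(score, 3))
```

A formal optimization approach (e.g. a genetic algorithm, as in Fig. 9) would replace the random search loop with a guided search over subset memberships, but the objective, a measure of statistical difference between subsets, is the same.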

3.3.3. Results

In the papers reviewed, unsupervised data division methods were used 177 times (Fig. 10). Among these, ad-hoc data division was the predominant method used. Only a small number of papers used the more sophisticated stratified and physics based approaches. An equally small number used a random data division approach.

Supervised data division methods were only used on 24 occasions, with an approximately equal split between trial and error and optimization based approaches for achieving similar statistical properties between the various data subsets. It should be noted that data division was not discussed in some of the papers reviewed.

3.3.4. Conclusion

Even though the way the data are divided can have a significant impact on model performance, and the validity of the results presented, data division was conducted in an ad-hoc fashion on almost 150 occasions. Consequently, there is a need to pay increased attention to data division in the ANN model development process.

[Fig. 9 layout: Data Division splits into Supervised approaches (Trial and Error; Optimisation, e.g. genetic algorithm) and Unsupervised approaches (Ad-Hoc; Stratified, e.g. self-organising map; Random; Physics Based).]

Fig. 9. Taxonomy of approaches to data division.

Fig. 8. Number of times various methods of determining input independence have been used.



    3.4. Model architecture selection

3.4.1. Introduction

Model (network) architecture determines the overall structure and information flow in ANN models. Consequently, it has a significant impact on the functional form of the relationship between model inputs and output(s), f().

3.4.2. Taxonomy

Traditionally, ANN architectures have been divided into feed-forward and recurrent networks (Fig. 11). In feed-forward networks, the information propagation is only in one direction, i.e. from the input layer to the output layer. Multilayer Perceptrons (MLPs) are the most common form of feed-forward model architecture. Other feed-forward network architectures in use include Generalised Regression Neural Networks (GRNNs), Radial Basis Function (RBF) networks, Neurofuzzy networks and Support Vector Machines (SVMs).

An MLP uses three or more layers of artificial neurons with linear aggregation functions and linear and/or non-linear activation functions. The input layer neurons simply pass on the weighted inputs to the subsequent layer neurons. The possibility of using non-linear activation functions at the hidden and output layers of an MLP provides the capability of capturing the complexity and non-linearity inherent in the systems being modeled. A GRNN is capable of approximating any function using input and output data like an MLP but differs in its structure, which consists of four layers: an input layer, a pattern layer, a summation layer, and an output layer. Unlike MLPs, GRNNs do not rely on iterative procedures for their training and are based on a standard statistical technique called kernel regression. RBF networks are motivated by locally tuned biological neurons, whose response characteristics are bounded in a small range of the input stimuli. The structure of RBF networks is similar to that of the feed-forward MLPs and consists of three layers: an input layer, a hidden layer, and an output layer. The major difference is that the hidden layer neurons are specified by radial basis functions and the output layer neurons necessarily use linear activation functions. The training of an RBF network is usually a two stage process, in which the basis functions are established at the hidden layer in the first stage and the weights connecting hidden layer and output layer neurons are directly determined in the second stage. Neurofuzzy networks are based on an integration of neural networks and fuzzy logic. The learning capability of neural networks is exploited to design the complex fuzzy system (or generation of IF-THEN rules) in a Neurofuzzy model. Neurofuzzy models offer the advantages of both fields and have provided more accurate results than a simple ANN model in many hydrological applications. Recently, SVMs have attracted the attention of some researchers. SVMs are machine learning algorithms in which the empirical risk in terms of prediction error and the structural risk associated with the model structure are minimized simultaneously.

While feed-forward architectures are the most popular architectures among researchers, recurrent neural networks have also received some attention. In recurrent networks, information may propagate not only in the forward direction but also in the backward direction through feedback loops. The output layer neurons may feed back the output to input and/or hidden layer neurons. The existence of a feedback mechanism in recurrent networks makes it simpler for a neural network to model highly dynamic systems with time delays.

Environmental and hydrological systems are extremely complex, non-linear, and dynamic in nature, involving a wide variety of physical variables that exhibit significant spatio-temporal variation and are often inter-related and uncertain in nature, thereby posing major challenges to the scientific community involved in modelling such systems. A single technique may not be able to capture the complex nature of environmental and hydrological systems. Consequently, a number of hybrid modelling approaches have been developed to exploit the advantages of the available modelling paradigms in order to capture the complexities involved in such systems (Fig. 11).

    In this paper, hybrid modelling frameworks that include ANNmodels have been divided into the following three classes (Fig. 11):

(a) Data intensive: a data intensive approach is one that attempts to classify the data in accordance with different dynamics, and separate models are then developed for the separate classes. The data classification can be either soft or hard depending on the methods employed. As part of soft classification, unsupervised learning methods can be used (e.g. Kohonen's self organizing map (SOM)) to identify the input–output patterns belonging to a particular class. Alternatively, hard approaches, such as domain knowledge about the physical system, can be used for data classification. Once the data have been classified using either soft or hard approaches, each category of data can be modeled separately using neural networks or process based approaches. Such a data intensive approach offers the advantage of being able to model clustered data generated by different dynamics.

Fig. 10. Number of times various methods of data division have been used.

    Fig. 11. Taxonomy of model architectures.



(b) Model intensive: a model intensive approach is one that employs different models for different sub-components of the overall physical system and then aggregates the various responses calculated from the different models. On the other hand, it is possible to model the same process using two different types of models and then combine the outputs from two or more models to obtain the desired output.

(c) Technique intensive: a technique intensive approach is one in which a neural network is combined with a different technique (e.g. regression, time series, or conceptual) with the objective of developing a hybrid modelling framework that is capable of exploiting the advantages offered by different techniques. For example, a neural network / time series hybrid model offers the advantage of first removing the deterministic trends from the data, enabling any nonlinear relationships that remain to be modeled using neural networks. Similarly, it is possible to combine conceptual and/or regression techniques with neural networks to develop hybrid models in order to achieve superior model performance.

3.4.3. Results

The results obtained indicated that there has been a significant amount of activity on the development and evaluation of alternative network architectures in order to improve model performance between 1999 and 2007. While multilayer perceptrons (MLPs), which have been used traditionally in applications in hydrology and water resources (Maier and Dandy, 2000), were still by far the most popular network architecture, MLP performance was compared with that of alternative feedforward network architectures, recurrent architectures and a variety of hybrid architectures in a large number of studies (Fig. 12). The number of studies in which alternative architectures were applied was reasonably uniform, varying between 5 and 20, compared with 178 instances where MLPs were used.

3.4.4. Conclusion

Much effort has been directed towards the evaluation of existing ANN architectures and the development and evaluation of new ANN architectures. The latter has been primarily in the form of hybrid ANN architectures that aim to exploit the strengths and eliminate the weaknesses of different modelling approaches. However, given the wide variety of hybrid modelling approaches and the range of applications to which they have been applied, it is not possible to draw any conclusions as to which model architecture should be used in a particular circumstance. This should be the focus of future research efforts.

    3.5. Model structure selection

3.5.1. Introduction

Model (network) structure, together with model (network) architecture, defines the functional form of the relationship between model inputs and output(s), f(). Determination of an appropriate network structure involves the selection of a suitable number of hidden nodes, how they are arranged (e.g. number of layers, number of nodes per layer) and how they process incoming signals (e.g. type of transfer function, etc.). The optimal network structure generally strikes a balance between generalisation ability and network complexity (e.g. network size and the number of free parameters). If network complexity is too low or an inappropriate functional form is selected, the network might be unable to capture the desired relationship. However, if network complexity is too high, the network might have decreased generalisation ability and processing speed, could be more difficult to calibrate and might be less transparent.

3.5.2. Taxonomy

The taxonomy of methods for determining the optimal ANN structure is shown in Fig. 13. The methods can be classified into three types: global, stepwise, or ad-hoc. In the first method, the structure of an ANN model in terms of hidden layers and/or hidden neurons is arrived at using global methods based on competitive evolution found in nature, e.g. genetic algorithms, particle swarm optimization, simulated annealing, etc. Using this approach, it is possible to simultaneously optimize network parameters (e.g. network weights) and structure (e.g. the number of hidden layer nodes). If used appropriately, global methods are likely to result in the best ANN structure and/or parameters; however, they are computationally expensive.

Alternatively, a stepwise trial and error procedure can be used (Fig. 13), in which a basic ANN structure is first assumed, which is modified with each trial with the objective of achieving a structure that is neither too complex nor too simple. The stepwise methods can further be divided into two categories, one based on pruning algorithms and the second based on constructive approaches. A pruning algorithm starts with a sufficiently complex ANN structure that is assumed to be capable of capturing the complexities involved in the physical system being modeled. Then, the connection weights and associated neurons (based on a rating system of their magnitude) are successively removed, one at a time, until model performance deteriorates significantly. On the other hand, in a constructive algorithm, one starts with the simplest ANN structure, which is successively made more complex by adding hidden neurons/layers one at a time and calibrating the resulting model. This process is repeated until there is no significant improvement in model performance. Pruning

Fig. 13. Taxonomy of methods for optimising model structure.

Fig. 12. Number of times various model architectures have been used.



and constructive algorithms can also be computationally intensive, as ANN models with many different structures generally need to be trained, and manually examined, before arriving at the optimal structure.

Other approaches to determining an appropriate network structure, such as using a trial-and-error approach to determine the optimal number of hidden nodes, rather than a strict constructive or pruning approach, or selecting a network structure based on experience and/or intuition, have been classified as ad-hoc.
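The constructive algorithm can be sketched as a simple loop that grows the hidden layer one node at a time and stops once the error no longer improves meaningfully. The `train_and_score` callable, the tolerance `tol` and the toy error curve are all hypothetical stand-ins for real model calibration and evaluation.

```python
# Hedged sketch of the constructive stepwise approach: add hidden nodes
# one at a time and stop when the improvement drops below a tolerance.
# `train_and_score(n)` stands in for calibrating an ANN with n hidden
# nodes and returning its (held-out) error; numbers are illustrative.

def constructive_search(train_and_score, max_hidden=10, tol=0.01):
    """Grow the hidden layer until the error improvement is < tol."""
    best_n, best_err = 1, train_and_score(1)
    for n in range(2, max_hidden + 1):
        err = train_and_score(n)
        if best_err - err < tol:      # no significant improvement: stop
            break
        best_n, best_err = n, err
    return best_n, best_err

# Toy error curve: big gains up to 4 hidden nodes, then diminishing returns.
errors = {1: 1.0, 2: 0.6, 3: 0.35, 4: 0.2, 5: 0.195, 6: 0.19}
n, err = constructive_search(lambda k: errors.get(k, 0.19))
print(n, err)
```

A pruning algorithm would run the same loop in reverse, starting large and removing weights or neurons until performance deteriorates significantly; in both cases the cost is one full calibration per candidate structure, which is why these methods are computationally intensive.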

3.5.3. Results

As can be seen in Fig. 14, an ad-hoc approach to determining the structure of ANN models was by far the most popular, with 115 applications. Of the structured approaches, constructive stepwise approaches were used 52 times, whereas pruning and global approaches were only used on a small number of occasions (7 and 11, respectively).

3.5.4. Conclusion

Despite the important role network structure plays in determining the desired relationship between model inputs and outputs, little effort has been directed into this area of the ANN model development process, with most studies adopting an ad-hoc approach to determining an appropriate network structure. There has been reasonable adoption of constructive, stepwise model building approaches, but the use of global optimization methods has received little attention. In order to ensure that the best possible ANN models are being developed, this step in the model development process requires further attention.

    3.6. Model calibration

3.6.1. Introduction

The aim of model calibration (ANN training) is to find a set of model parameters (e.g. connection weights) that enables a model with a given functional form to best represent the desired input/output relationship. If overfitting is not considered to be a problem and the training data are representative of the modelling domain, this is achieved when a suitable error measure between actual and predicted training outputs is minimised. If overfitting is a possibility, optimal generalisation ability is achieved when a suitable error measure between actual and predicted outputs in the test set is minimised, provided that the training and testing data are representative of the modelling domain.

Determination of the combination of model parameter values (i.e. weights) that minimises the training or testing error is not a simple problem. As each combination of parameter values generally results in a different model error, an error surface exists in parameter (i.e. weight) space. This is illustrated for a model with a single parameter in Fig. 15, where different values of the model parameter generally result in different model errors. It can be seen that the degree of difficulty in finding the parameter value or combination of parameter values that results in the smallest model error is affected by the ruggedness of the error surface. Ruggedness is a measure of the number, spacing and steepness of the craters and valleys in the error surface. As can be seen in Fig. 15(a), if the error surface is smooth, there are fewer local minima, and the global optimum can be found more easily. In contrast, as illustrated in Fig. 15(b), if the error surface is more rugged, it generally has more local minima, and the global optimum is more difficult to find. The degree of ruggedness of an error surface is usually problem dependent and is affected by the number of model parameters, among other things. As the number of model parameters increases, so does the size of the search space and, generally, the number of local optima. In addition, a larger number of parameters makes it more difficult to interpret the model and increases the risk of allowing spurious modes of model behaviour and fitting bad data, such as outliers and other anomalies. Consequently, it is important to find the model with the smallest number of inputs and parameters that is able to describe the underlying relationship in the data, as discussed previously.

3.6.2. Taxonomy

Due to the difficulty of the ANN calibration problem outlined above, ANN calibration is generally conducted using a suitable optimization algorithm. The vast majority of these approaches are

[Fig. 15 layout: panel (a) shows a smooth error surface and panel (b) a rugged error surface, each plotting Error against Parameter 1, with the global and local optima marked.]

Fig. 15. Error surface with different degrees of ruggedness for a model with one parameter.

    0

    20

    40

    60

    80

    100

    120

    140

    Global Pruning (Stepwise) Constructive(Stepwise)

    Ad-Hoc

    sr

    ep

    aP

    f

    o

    re

    bm

    uN

    Fig. 14. Number of times various model structure determination methods have beenused. Fig. 16. Taxonomy of calibration (training) methods.

    H.R. Maier et al. / Environmental Modelling & Software 25 (2010) 891e909902

  • Author's personal copy

    deterministic, in the sense that they attempt to identify a singleparameter vector that minimises an error measure between pre-dicted model outputs and their corresponding measured values forthe training set. These methods generally belong to either local orglobal optimization approaches (Fig.16). Localmethods usually workon gradient information, and are therefore prone to becomingtrapped in local optima if the error surface is reasonably rugged.However, these methods are generally computationally efcient.Gradient methods can be further sub-divided into rst-ordermethods (e.g. back-propagation) or second-order methods (e.g.Newton's method). Global optimization methods, such as geneticalgorithms, have an increased ability to nd global optima in theerror surface, although this is generally at the expense of computa-tional efciency.
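    A minimal sketch of first-order calibration follows: a one-input MLP with tanh hidden nodes and a linear output, trained by back-propagation (online gradient descent). The network size, learning rate and toy target function are illustrative assumptions, not any specific method from the reviewed papers.

```python
import math
import random

random.seed(0)
J, lr = 4, 0.05                                  # hidden nodes, learning rate (assumed)
x_data = [i / 10.0 for i in range(21)]           # inputs in [0, 2]
y_data = [math.sin(2.0 * x) for x in x_data]     # toy target function

w1 = [random.uniform(-1, 1) for _ in range(J)]   # input-to-hidden weights
b1 = [random.uniform(-1, 1) for _ in range(J)]   # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(J)]   # hidden-to-output weights
b2 = random.uniform(-1, 1)                       # output bias

def forward(x):
    h = [math.tanh(w1[j] * x + b1[j]) for j in range(J)]
    return h, b2 + sum(w2[j] * h[j] for j in range(J))

def sse():
    # Sum of squared errors over the training set.
    return sum((forward(x)[1] - y) ** 2 for x, y in zip(x_data, y_data))

sse_before = sse()
for epoch in range(2000):
    for x, y in zip(x_data, y_data):
        h, out = forward(x)
        d_out = 2.0 * (out - y)                  # derivative of squared error w.r.t. output
        for j in range(J):
            d_h = d_out * w2[j] * (1.0 - h[j] ** 2)  # error signal propagated back through tanh
            w2[j] -= lr * d_out * h[j]
            w1[j] -= lr * d_h * x
            b1[j] -= lr * d_h
        b2 -= lr * d_out
sse_after = sse()
```

Each weight update uses only first-derivative information; a second-order method would additionally use (an approximation of) the curvature of the error surface to take better-scaled steps, which is the source of its improved convergence speed.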

    In order to account for parameter uncertainty during the calibration process, stochastic calibration methods can be used. These approaches can be used to obtain distributions of the model parameters, rather than finding a single parameter vector. This has the advantage that prediction limits can be obtained. In order to achieve this, Bayesian methods are commonly used.
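    One simple way to sketch this idea (not the method of any particular reviewed study) is a random-walk Metropolis sampler for a single-parameter model, which yields a distribution of parameter values, and hence interval estimates, rather than one optimised value. The data, noise level, prior and proposal settings below are all invented for illustration.

```python
import math
import random

random.seed(1)
x_data = [1.0, 2.0, 3.0, 4.0]
y_data = [2.1, 3.9, 6.2, 7.8]    # hypothetical observations, roughly y = 2x
sigma = 0.2                      # assumed observation-error standard deviation

def log_post(a):
    # Log-posterior for model y = a * x: flat prior + Gaussian likelihood.
    sse = sum((y - a * x) ** 2 for x, y in zip(x_data, y_data))
    return -sse / (2.0 * sigma ** 2)

a, samples = 1.0, []
for i in range(20000):
    prop = a + random.gauss(0.0, 0.1)            # random-walk proposal
    if math.log(random.random()) < log_post(prop) - log_post(a):
        a = prop                                 # accept the proposed value
    if i >= 5000:                                # discard burn-in samples
        samples.append(a)

ordered = sorted(samples)
mean_a = sum(samples) / len(samples)
lo, hi = ordered[int(0.025 * len(ordered))], ordered[int(0.975 * len(ordered))]
```

The interval `[lo, hi]` summarises parameter uncertainty directly, which is what allows prediction limits, rather than point predictions, to be reported.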

    3.6.3. Results

    The results in Fig. 17 illustrate that deterministic calibration methods were used predominantly (193 times), although there were 17 studies that embraced Bayesian and other stochastic approaches in order to account for parameter uncertainty. Of the deterministic calibration methods, first-order approaches, such as the back-propagation algorithm, were used most frequently, with 103 applications. However, second-order methods, such as the Levenberg–Marquardt algorithm, were also used extensively, with 64 applications. Use of other local and global optimization algorithms was limited.

    3.6.4. Conclusion

    In the majority of studies, first-order local search procedures, such as the back-propagation algorithm, were used, although second-order methods were also used extensively in order to improve the computational efficiency of ANN calibration. However, there was little work on investigating the potential benefits of using global optimization techniques in terms of improving the predictive ability of ANN models, which is an area worthy of further exploration. In addition, although some work was done on the incorporation of parameter uncertainty into ANN model calibration, this also presents an area of future research.

    3.7. Model evaluation

    3.7.1. Introduction

    In order to determine which network structure is optimal, the performance of a calibrated model is evaluated against one or more criteria. This also applies to determining the optimal set of model inputs, if a model based input selection approach is used. As discussed previously, if overfitting is not considered to be a problem, model performance is assessed using the training data, whereas the test data are used for this purpose if overfitting is a concern.

    3.7.2. Taxonomy

    ANN model performance is usually assessed using a quantitative error metric. A taxonomy of the commonly used metrics is given in Fig. 18. Squared errors are based on the squares of the differences between actual and modelled output values. Commonly employed metrics belonging to this category include the sum of squared errors (SSE), root mean square error (RMSE) and the Nash–Sutcliffe efficiency (E). A feature of squared error metrics is that they tend to be dominated by errors with high magnitudes. Alternatively, absolute errors can be used, which are based on the absolute differences between actual and modelled outputs and include measures such as the total sum of absolute deviations (TSAD) and the mean sum of absolute deviations (MSAD). While absolute errors provide information on the magnitude of the error, they do not provide information on the performance of the model in terms of overall under- or over-prediction. This problem can be overcome by considering the total or mean sum of the differences without taking absolute values, resulting in total bias (TBIAS) and mean bias (MBIAS) statistics. In order to allow the performance of models with outputs of different magnitudes to be compared more easily, relative error metrics, such as the average absolute relative error (AARE), the normalized root mean square error (NRMSE) and the normalized mean bias error (NMBE), can be used. Finally, a measure of the empirical error between actual and modelled outputs can be obtained by using product difference moment error statistics, of which the Pearson correlation coefficient is the most well-known.

    Information criteria, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), consider model complexity in addition to model error. Consequently, they have the potential to result in more parsimonious models.
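    Using the common Gaussian-error forms AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n), where k is the number of parameters (other variants exist in the literature), a quick hypothetical comparison shows how the complexity penalty can favour the smaller network even when its error is slightly larger:

```python
import math

def aic(sse, n, k):
    # Akaike information criterion, Gaussian-error form.
    return n * math.log(sse / n) + 2 * k

def bic(sse, n, k):
    # Bayesian information criterion: a stronger penalty for n >= 8.
    return n * math.log(sse / n) + k * math.log(n)

n = 100                          # hypothetical number of data points
small = {"sse": 25.0, "k": 10}   # smaller network, slightly larger error
large = {"sse": 24.0, "k": 60}   # larger network, marginally smaller error

# The tiny error reduction does not justify 50 extra weights, so both
# criteria score the more parsimonious model lower (i.e. better).
```
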

    Fig. 17. Number of times various calibration methods have been used.

    Fig. 18. Taxonomy of performance evaluation.

    In addition to the metrics mentioned above, there are a number of other statistics that can be used in order to evaluate model performance. An example of these are threshold statistics (TS), which are capable of providing the distribution of the number of data points predicted from an ANN model having various levels of absolute relative error (ARE). In addition, the performance of ANN models can also be based on the accuracy of predicting particular time series (e.g. hydrograph) characteristics, such as errors in estimating peak flow, timing of the peak and total volume.
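    A threshold statistic profile of the kind described above might be computed as follows; the observed/modelled values and the thresholds are hypothetical.

```python
obs = [10.0, 20.0, 30.0, 40.0, 50.0]   # hypothetical observed values
mod = [10.5, 18.0, 33.0, 39.0, 65.0]   # hypothetical modelled values

# Absolute relative error (ARE) of each prediction, in percent.
are = [abs(m - o) / o * 100.0 for o, m in zip(obs, mod)]

def ts(threshold_pct):
    # Percentage of data points predicted with ARE below the threshold.
    return 100.0 * sum(1 for a in are if a < threshold_pct) / len(are)

# Profile over several ARE thresholds (5%, 10%, 25%).
profile = {t: ts(t) for t in (5, 10, 25)}
```

Unlike a single aggregate score, the profile shows how prediction quality is distributed across the data set, e.g. whether most points are predicted well while a few are badly missed.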

    3.7.3. Results

    The results obtained indicate that a range of performance criteria were used in most studies (Fig. 19). This is considered good practice, as different criteria capture different performance characteristics, as discussed previously. While squared error metrics were most widely used (170 times), measures based on absolute and relative errors, as well as correlation, were also used extensively. As can be seen from Fig. 19, other non-standard, problem-specific evaluation criteria were used relatively frequently. However, the use of information criteria was restricted to a small number of studies.

    3.7.4. Conclusion

    Review of the 210 papers has indicated that a range of performance criteria was used in most of the studies. This increases confidence in the evaluation of the performance of the models developed, as different performance criteria generally emphasize different aspects of predictive performance. However, increased use of information criteria, such as the AIC and BIC, could be beneficial in an effort to balance predictive performance with model parsimony.

    4. Summary and conclusions

    Since the period 1992–1998, which is the subject of the review paper by Maier and Dandy (2000), research activity in the field of forecasting and prediction of water quantity and quality variables in rivers using ANNs has increased dramatically. From 1992 to 1998, the average number of journal papers published was 6.1 per year. This has increased to an average of 23.3 papers per year for the period of this review paper (1999–2007). This is despite the fact that a restricted journal list was considered for this paper and that the review was restricted to prediction in rivers, meaning that prediction of a number of water resource variables, such as rainfall, was excluded. Even within the period covered by this paper, there has been a marked increase in the number of papers published in the later years, with an average of 38 papers per year from 2005 to 2007.

    As was the case from 1992 to 1998, the primary application area has been flow forecasting and prediction. Very few papers have focused on other water quantity variables and even fewer have considered water quality. If anything, the emphasis on flow modelling has increased in recent years, rather than diminished. Consequently, there is a need to broaden the application area of ANN models to focus on other predictive variables, especially those concerned with water quality. Given the universal function approximation capability of ANNs, they would seem to be ideally suited to modelling the complex relationships that are a feature of water quality processes. However, one factor limiting the application of ANNs in the water quality modelling arena might be the lack of good quality, long term data.

    The adoption of appropriate input determination approaches was an area identified as deficient by Maier and Dandy (2000) and, based on the findings of this study, not much has changed in the subsequent 9 years. In the vast majority of studies reviewed in this paper, inputs were determined using an ad-hoc approach, either model based or model free. While it is pleasing to see that analytical, model free approaches were used 67 times, non-linear approaches were only used in 7 of these. Using a linear approach to identify which of the potential input variables have a significant relationship with the model output is not appropriate for the development of ANN models, as ANN models are generally used because of their ability to represent non-linear relationships between input and output data. Consequently, there is a need to adopt non-linear model input selection approaches (e.g. May et al., 2008a, b).

    Another aspect of input selection that has received even less attention is the issue of input independence. While models with redundant inputs might perform well from the perspective of being able to obtain a good match to the calibration data, they increase model complexity and parameter uncertainty. As a result, this issue needs to receive increased attention in order to reduce the uncertainty surrounding ANN model outputs and to enable research into knowledge extraction from ANNs to proceed with increased confidence.

    Maier and Dandy (2000) concluded that data division was not carried out adequately in most of the 43 papers reviewed in their study. Unfortunately, the same still applies today. In the 210 papers that were the subject of this review, attempts to ensure that the statistics of the various data subsets were similar were only made on 24 occasions, whereas ad-hoc data division methods were used 148 times. This can cast serious doubts on the quality and repeatability of the results obtained, as different data splits are likely to result in different calibrated models and different model performance on the validation data. Consequently, there is a need to consider well established data sampling approaches for the division of the available data into the requisite model development and evaluation subsets (e.g. May et al., 2010).

    In relation to model architecture, there was a significant amount of research activity in the nine years covered by this review. Maier and Dandy (2000) found that feedforward networks were used almost exclusively from 1992 to 1998, most of which were MLPs. While MLPs were still found to be the dominant ANN architecture in this paper, they were used as a benchmark against which to compare alternative architectures in many of the papers reviewed. There was a significant amount of experimentation with other types of feedforward architectures, such as generalised regression neural networks, radial basis function networks, neurofuzzy models and support vector machines, recurrent networks and, most importantly, different types of hybrid network architectures. The development of hybrid ANN model architectures is an important advance, as it emphasizes that ANNs have a role to play not only as an alternative to traditional modelling approaches, but also as a complementary modelling tool that can be used to improve the performance of existing approaches. The acknowledgment that ANNs should be used in circumstances that exploit their strengths, rather than as a panacea for the shortcomings of more traditional modelling approaches, is part of the evolution of ANNs towards a mature modelling approach that can sit comfortably alongside more traditional approaches in the toolkit of hydrological modellers.

    Fig. 19. Number of times various performance evaluation criteria have been used.

    The way in which optimal model structures are obtained is an area that has received little attention in the papers that form part of this review. As was the case in the findings of Maier and Dandy (2000), optimal network geometries were generally obtained using ad-hoc approaches, primarily using trial and error. While there was some increase in the use of systematic approaches to determining the optimal number of hidden nodes during this review period compared with the previous one, the development and application of methods for determining the optimal model structure remains an area of ongoing work.

    There was also little activity in relation to model calibration during the time period considered for this paper. As was the case from 1992 to 1998, first-order local optimization methods were by far the most common, although an increasing number of second-order local methods were used in the papers published between 1999 and 2007. However, surprisingly, there was little adoption of global optimization methods, which have been found to outperform more traditional methods when used in conjunction with other water resource modelling approaches in recent years. It was good to see that some effort was devoted towards the development of Bayesian and other stochastic approaches to model calibration in order to enable parameter uncertainty to be taken into account and to enable confidence limits on predictions to be obtained, but there is a need to expand this work into the future.

    In the majority of papers reviewed, different methods were used to evaluate model performance, which is considered good practice. However, there is scope for improving the way models are evaluated by applying the various measures in a consistent and informed manner (see Dawson et al., 2007). In addition, in order to enable better comparison of ANN development methods across studies, the use of open access data sets should be encouraged.

    5. Recommendations for future research

    Based on the review of 210 papers on the prediction and forecasting of water quantity and quality variables in rivers conducted in this paper, the following recommendations for future work are made:

    1. More work needs to be undertaken on the prediction of water quality variables (e.g. May and Sivakumar, 2009; Dellana and West, 2009) in order to further test the utility of ANN models as a predictive tool in hydrology and water resources. Even though there are fewer water quality data than rainfall-runoff data, there are still sufficient water quality data available to develop ANN models. The fact that ANNs are data-driven, and are thus able to make best use of existing data, should give them an advantage over process-driven water quality models, which require data on all variables, which are often more difficult to obtain.

    2. Work should continue on the development and evaluation of hybrid model architectures that attempt to draw on the strengths of alternative modelling approaches (e.g. Lin et al., 2008). Given the amount of work that has already been done in this area, a review of this emerging field of research would seem timely.

    3. Greater attention should be paid to the input variable selection and data division steps of the ANN model development process. Currently adopted ad-hoc methods in both of these areas have the potential to significantly degrade model performance and therefore need to be replaced with state-of-the-art approaches. In relation to input variable selection, non-linear approaches that are able to account for input independence should be used (e.g. May et al., 2008a, 2008b, 2009; Fernando et al., 2009). As far as data division is concerned, appropriate sampling techniques should be used to ensure that the data in all subsets are representative of each other (see May et al., 2009, 2010).

    4. There has been increasing adoption of second-order local methods for model calibration, but the use of global optimization methods is still limited. Consequently, there is scope for comparative studies investigating the relative performance of various global and local optimization algorithms in the context of ANN model calibration (training).

    5. Research into the best way to incorporate uncertainty into ANN models should be continued. Current work on the incorporation of parameter uncertainty via Bayesian and other stochastic calibration methods should be extended to include other types of uncertainty (e.g. Kingston et al., 2005a).

    6. Appropriate methods for determining the optimal ANN model structure remain elusive. Although some work has been done on this recently (e.g. Kingston et al., 2008), this is an area that requires further research. Increased utilization of ANN architectures that have a fixed structure, such as generalised regression neural networks, might also be worthy of consideration.

    Acknowledgments

    The authors would like to thank Tim Rowan and Gayani Fernando for their assistance in obtaining the papers that form the basis of this review, Rob May for his initial work on the flowchart depicting the steps in the ANN model development process, and the eight anonymous reviewers of this paper, whose thoughtful and insightful comments have improved the quality of this paper.

    References

    Abebe, A.J., Price, R.K., 2003. Managing uncertainty in hydrological models using complementary models. Hydrological Sciences Journal (Journal Des Sciences Hydrologiques) 48 (5), 679–692.

    Abrahart, R.J., See, L.M., 2007. Neural network modelling of non-linear hydrological relationships. Hydrology and Earth System Sciences 11 (5), 1563–1579.

    Abrahart, R.J., Heppenstall, A.J., See, L.M., 2007. Timing error correction procedure applied to neural network rainfall-runoff modelling. Hydrological Sciences Journal (Journal Des Sciences Hydrologiques) 52 (3), 414–431.

    Agarwal, A., Singh, R.D., 2004. Runoff modelling through back propagation artificial neural network with variable rainfall-runoff data. Water Resources Management 18 (3), 285–300.

    Agarwal, A., Singh, R.D., Mishra, S.K., Bhunya, P.K., 2005. ANN-based sediment yield river basin models for Vamsadhara (India). Water SA 31 (1), 95–100.

    Ahmad, S., Simonovic, S.P., 2006. An intelligent decision support system for management of floods. Water Resources Management 20 (3), 391–410.

    Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model for synthetic streamflow generation. Water Resources Management 21 (6), 1015–1029.

    Alp, M., Cigizoglu, H.K., 2007. Suspended sediment load simulation by two artificial neural network methods using hydrometeorological data. Environmental Modelling & Software 22 (1), 2–13.

    Alvisi, S., Mascellani, G., Franchini, M., Bardossy, A., 2006. Water level forecasting through fuzzy logic and artificial neural network approaches. Hydrology and Earth System Sciences 10 (1), 1–17.

    Amenu, G.G., Markus, M., Kumar, P., Demissie, M., 2007. Hydrologic applications of MRAN algorithm. Journal of Hydrologic Engineering 12 (1), 124–129.

    Anctil, F., Lauzon, N., 2004. Generalisation for neural networks through data sampling and training procedures, with applications to streamflow predictions. Hydrology and Earth System Sciences 8 (5), 940–958.

    Anctil, F., Rat, A., 2005. Evaluation of neural network streamflow forecasting on 47 watersheds. Journal of Hydrologic Engineering 10 (1), 85–88.

    Anctil, F., Perrin, C., Andreassian, V., 2003. ANN output updating of lumped conceptual rainfall/runoff forecasting models. Journal of the American Water Resources Association 39 (5), 1269–1279.

    Anctil, F., Michel, C., Perrin, C., Andreassian, V., 2004a. A soil moisture index as an auxiliary ANN input for stream flow forecasting. Journal of Hydrology 286 (1–4), 155–167.

    Anctil, F., Perrin, C., Andreassian, V., 2004b. Impact of the length of observed records on the performance of ANN and of conceptual parsimonious rainfall-runoff forecasting models. Environmental Modelling & Software 19 (4), 357–368.

    Anctil, F., Lauzon, N., Andreassian, V., Oudin, L., Perrin, C., 2006. Improvement of rainfall-runoff forecasts through mean areal rainfall optimization. Journal of Hydrology 328 (3–4), 717–725.

    Anmala, J., Zhang, B., Govindaraju, R.S., 2000. Comparison of ANNs and empirical approaches for predicting watershed runoff. Journal of Water Resources Planning and Management – ASCE 126 (3), 156–166.


    Antar, M.A., Elassiouti, I., Allam, M.N., 2006. Rainfall-runoff modelling using artificial neural networks technique: a Blue Nile catchment case study. Hydrological Processes 20 (5), 1201–1216.

    Aqil, M., Kita, I., Yano, A., Nishiyama, S., 2007a. A comparative study of artificial neural networks and neuro-fuzzy in continuous modeling of the daily and hourly behaviour of runoff. Journal of Hydrology 337 (1–2), 22–34.

    Aqil, M., Kita, I., Yano, A., Nishiyama, S., 2007b. Neural networks for real time catchment flow modeling and prediction. Water Resources Management 21 (10), 1781–1796.

    ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000a. Artificial neural networks in hydrology. I: preliminary concepts. Journal of Hydrologic Engineering, ASCE 5 (2), 115–123.

    ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000b. Artificial neural networks in hydrology. II: hydrologic applications. Journal of Hydrologic Engineering 5 (2), 124–137.

    Bae, D.H., Jeong, D.M., Kim, G., 2007. Monthly dam inflow forecasts using weather forecasting information and neuro-fuzzy technique. Hydrological Sciences Journal (Journal Des Sciences Hydrologiques) 52 (1), 99–113.

    Birikundavyi, S., Labib, R., Trung, H.T., Rousselle, J., 2002. Performance of neural networks in daily streamflow forecasting. Journal of Hydrologic Engineering 7 (5), 392–398.

    Bishop, C.M., 2004. Neural Networks for Pattern Recognition. Oxford University Press, Oxford.

    Bowden, G.J., Maier, H.R., Dandy, G.C., 2002. Optimal division of data for neural network models in water resources applications. Water Resources Research 38 (2), 1010.

    Bowden, G.J., Maier, H.R., Dandy, G.C., 2005. Input determination for neural network models in water resources applications. Part 2. Case study: forecasting salinity in a river. Journal of Hydrology 301 (1–4), 93–107.

    Brath, A., Montanari, A., Toth, E., 2002. Neural networks and non-parametric methods for improving real-time flood forecasting through conceptual hydrological models. Hydrology and Earth System Sciences 6 (4), 627–639.

    Bruen, M., Yang, J.Q., 2005. Functional networks in real-time flood forecasting – a novel application. Advances in Water Resources 28 (9), 899–909.

    Cameron, D., Kneale, P., See, L., 2002. An evaluation of a traditional and a neural net modelling approach to flood forecasting for an upland catchment. Hydrological Processes 16 (5), 1033–1046.

    Campolo, M., Andreussi, P., Soldati, A., 1999a. River flood forecasting with a neural network model. Water Resources Research 35 (4), 1191–1197.

    Campolo, M., Soldati, A., Andreussi, P., 1999b. Forecasting river flow rate during low-flow periods using neural networks. Water Resources Research 35 (11), 3547–3552.

    Campolo, M., Soldati, A., Andreussi, P., 2003. Artificial neural network approach to flood forecasting in the River Arno. Hydrological Sciences Journal (Journal Des Sciences Hydrologiques) 48 (3), 381–398.

    Cannon, A.J., Whitfield, P.H., 2002. Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology 259 (1–4), 136–151.

    Castellano-Mendez, M., Gonzalez-Manteiga, W., Febrero-Bande, M., Prada-Sanchez, J.M., Lozano-Calderon, R., 2004. Modelling of the monthly and daily behaviour of the runoff of the Xallas river using Box-Jenkins and neural networks methods. Journal of Hydrology 296 (1–4), 38–58.

    Chandramouli, V., Deka, P., 2005. Neural network based decision support model for optimal reservoir operation. Water Resources Management 19 (4), 447–464.

    Chang, F.J., Chen, Y.C., 2001. A counterpropagation fuzzy-neural network modeling approach to real time streamflow prediction. Journal of Hydrology 245 (1–4), 153–164.

    Chang, F.J., Hu, H.F