
  • ORIGINAL ARTICLE

    Artificial neural networks models for predicting effective drought index: Factoring effects of rainfall variability

    Muthoni Masinde

    Received: 9 February 2013 / Accepted: 26 March 2013
    © Springer Science+Business Media Dordrecht 2013

    Mitig Adapt Strateg Glob Change
    DOI 10.1007/s11027-013-9464-0

    M. Masinde (*)
    Department of Information Technology, Central University of Technology, Private Bag X20539, Bloemfontein, 9300, South Africa
    e-mail: [email protected]

    M. Masinde
    e-mail: [email protected]
    URL: http://www.muthonimasinde.net

    Abstract Though most factors that trigger droughts cannot be prevented, accurate, relevant and timely forecasts can be used to mitigate their impacts. Drought forecasts must define a drought's severity, onset, cessation, duration and spatial distribution. Given the high probability of drought occurrence in Kenya, the country's heavy reliance on rain-fed agriculture and its lack of effective drought mitigation strategies, Kenya is highly vulnerable to the impacts of droughts. Current drought forecasting approaches used in Kenya cannot provide both short- and long-term forecasts, and they do not indicate the severity of the drought. In this paper, a combination of Artificial Neural Networks and the Effective Drought Index is presented as a potential candidate for addressing these drawbacks. This is demonstrated using forecasting models that were built using thirty years of weather data from four weather stations (representing 3 agro-ecological zones) in Kenya. Experiments with various input/output combinations were carried out, and the drought forecasting network models were implemented in Matrix Laboratory's (MATLAB) Neural Network Toolbox. The models incorporate forecasted rainfall values in order to account for unexpected extreme climate variations. With accuracies as high as 98 %, the solution is a substantial improvement on the solutions currently in use in Kenya.

    Keywords Drought Forecasts . Artificial Neural Networks (ANNs) . Effective Drought Index (EDI) . Available Water Resource Index (AWRI) . Rainfall Variations . Kenya

    1 Introduction

    Droughts are among the most expensive disasters in the world and their negative impacts span economic, social and environmental aspects of the affected society. It is for this reason that research on drought has attracted considerable interest from environmentalists, ecologists, hydrologists, meteorologists, geologists, economists and agricultural scientists (Mishra and Desai 2006). In Byun and Wilhite (1999), literature on droughts is classified under 4 categories: (1) causes of droughts (atmospheric science); (2) frequency and severity of droughts in a given region; (3) impacts of droughts; and (4) response, mitigation and preparedness for droughts. This paper focuses on some aspects of the last category. Natural disasters triggered by critical climate extremes, especially droughts, continue to affect millions of people in Africa. According to the World Disasters Report of 2011, Africa accounted for over 50 % of the droughts that occurred in the world between 2001 and 2011 (Armstrong et al. 2011).

    Kenya has consistently contributed the highest number of people affected by natural disasters on the continent (Deely et al. 2010). Droughts are heavily felt in Kenya because of three factors: (1) regularity: droughts are so regular in Kenya that they were, for instance, experienced in 50 % of the years between 1980 and 2008; (2) like most other developing countries in Africa, Kenya's rain-fed agriculture is the backbone of the economy. Over 80 % of Kenyans rely on the agricultural sector, which contributes 24 % of the Gross Domestic Product (GDP) (Mwagore 2002), and the sector is highly sensitive to droughts; and (3) the inability of Kenya's population to prepare for and adapt to droughts. Kenya is a developing country with 46 % of the rural population living below the poverty line, and the government's priority list is filled with items such as providing basic education, implementing a democratic constitution, peace initiatives and providing basic healthcare. This leaves no room for developing drought mitigation strategies. Two examples of recent droughts that affected Kenya are the 1999–2002 drought in western and central Kenya, which affected 23 million people (WHO Collaborating Centre for Research on the Epidemiology of Disasters (CRED) 2012), and the 2005–2006 drought in the northern and eastern regions of Kenya, which left over 70 % of the livestock dead (CERF 2009).

    Droughts are quantified using indices such as the Effective Drought Index (EDI). EDI is able to quantify droughts in absolute terms and also provide answers to: (1) the when, (2) the how long (onset to cessation) as well as (3) the severity of droughts/floods (Byun and Wilhite 1999; Mishra and Singh 2010). EDI quantifies droughts in terms of drought classes composed of positive and negative real values; for example, −2.50 indicates extreme drought, +3.28 indicates extreme floods and 0.98 indicates close to normal wetness. EDI further qualifies droughts by providing the Available Water Resource Index (AWRI), which can, for example, reliably inform a farmer of the amount of available soil moisture on any given day.

    Though well-developed drought indices such as EDI perform very well in mapping droughts in the spatial and time dimensions, they only detect events that are already happening (Masinde and Bagula 2011). In this paper, Artificial Neural Networks (ANNs) play the role of forecasting/predicting future droughts. The numerical approaches that are used in Kenya produce forecasts with low accuracies, 70 % at best (Mutua 2011). With such forecasts, it becomes difficult to make sound policy decisions to mitigate droughts.

    The work presented in this paper is part of a larger project whose main objective was to develop an effective, sustainable and relevant home-grown Drought Early Warning System (DEWS) for Sub-Saharan Africa by making use of Artificial Intelligence technologies to integrate: (1) indigenous knowledge on droughts; (2) scientific weather forecasts; (3) Wireless Sensor Networks (WSNs); and (4) mobile phones (Masinde and Bagula 2012). The main component of this DEWS is Drought Monitoring and Prediction. While the evaluation of the strengths of EDI as a tool for monitoring droughts in terms of their severity, onset, cessation and probability of occurrence has been described by Masinde and Bagula (2012), this paper tackles the Drought Prediction sub-component of DEWS. It describes a drought-forecasting component that combines Artificial Neural Networks (ANNs) and the Effective Drought Index (EDI) to forecast drought for periods ranging from one day to four years, with accuracies ranging from 98 % to 75 %. This is a marked improvement on existing solutions, whose forecast accuracy is about 70 %.

    The models have been tested using historical weather data from 4 weather stations in Kenya. Though the integration of EDI and ANNs has been attempted elsewhere in the world, especially in South-East Asia, no research known to the author has attempted to model the micro (daily) level of detail as has been done in this paper. In designing the long-term category of network models, the correlation among EDI values of similar calendar months was utilised. For instance, given EDI values for February 2008, February 2009, February 2010 and February 2011, one can forecast the EDI value for February 2013. Forecasted rainfall values are used as input to the ANNs models to take care of unexpected extreme variations in rainfall patterns.

    2 Drought forecasting in Kenya

    Droughts are the most common disasters in Kenya; they result from variations in weather/climate, but there are occasional floods too. In the period 1999 to 2008, Kenya accounted for 32.85 % of the people affected by natural disasters in Africa (Collins et al. 2009). In August 2011, Kenya was among the countries in the Horn of Africa that experienced a devastating drought described as the worst in 60 years (http://www.bbc.co.uk/news/world-africa-13944550, last accessed on 10 October 2012). Such disasters cannot be avoided but can be managed through effective early warning systems. When they occur, droughts affect more than 25 % of Kenya's population, in addition to ripple effects such as inadequate hydro-electric power supply, increased commodity prices and loss of jobs.

    Currently, monitoring of climatic/weather variations in Kenya is the mandate of the Kenya Meteorological Department (KMD). The Department is handicapped in the sense that all it has are sparsely distributed weather stations. Its Climatological Section currently manages 3 main types of stations (http://www.meteo.go.ke/): (1) 700 rainfall stations, (2) 62 temperature stations and (3) 27 synoptic stations. The Agrometeorological Section, on the other hand, manages 13 stations related to agriculture; data is remitted from these stations every 10 days. Apart from the normal meteorological observations, the Agrometeorological Section also observes soil temperature, sunshine duration, radiation, pan evaporation and potential evapotranspiration. All this data is stored in semi-automated formats at the Department's headquarters in Nairobi and is available to interested stakeholders on request. The data used in this research was obtained from this Department. The Meteorological Department uses the data collected to provide five main types of forecasts: (1) daily forecast for main cities/towns in Kenya; (2) four-day forecast; (3) seven-day forecast; (4) monthly forecast; and (5) seasonal forecast. The last of these is mainly issued twice a year, a few weeks before each of the two main rainy seasons. The country has two main rainy seasons: (1) October-November-December (OND) and (2) March-April-May (MAM); MAM is the main season. Rainfall sometimes also occurs in the period June-July-August (JJA). Among other factors (not part of this research work), the amount of rain and the moisture index in each of the above seasons are used to classify Kenya into 14 climate and agro-ecological zones (http://www.meteo.go.ke).


    KMD uses statistical and dynamical models to generate seasonal rainfall forecasts for each region in Kenya; these are then compared against the forecasts from neighbouring countries, as well as against the forecasts from several global dynamical models issued by meteorological agencies in Europe and the United States. These are then compiled to give a probabilistic forecast of rainfall under three categories: above-, near- or below-normal rainfall. One of the shortcomings of KMD's approach is that its forecasts provide conceptual indications of droughts/floods without giving operational indicators. This makes it difficult for key stakeholders to develop solid strategic plans. In contrast, the Effective Drought Index (EDI) is able to quantify droughts in absolute terms and also provide answers to: (1) the when, (2) the how long (onset to termination) as well as (3) the severity of droughts/floods. Combining EDI with Artificial Neural Networks (ANNs) leads to the more effective drought forecasting solution described in this paper.

    3 Drought severity indices and effective drought index

    Drought is an insidious hazard of nature which, according to Elsa et al. (2008), qualifies as a hazard because it is a natural accident of unpredictable occurrence but of recognizable recurrence. As a disaster, drought corresponds to the failure of the precipitation regime, causing the disruption of the water supply to natural and agricultural ecosystems. There is not yet a universally accepted definition of drought. Palmer came to this conclusion as early as 1965 when he stated: “Drought means various things to various people depending on their specific interest. To the farmer drought means a shortage of moisture in the root zone of his crops. To the hydrologist, it suggests below average water levels in the streams, lakes, reservoirs, and the like. To the economist, it means a shortage which affects the established economy” (Palmer 1965). Since then, attempts have been made to define the term drought. There are three main categories of drought: meteorological, hydrological and agricultural (Wilhite and Glantz 1985; Dracup et al. 1980). Meteorological drought is defined as a period of abnormally dry weather sufficiently prolonged for the lack of water/rainfall to cause serious hydrologic imbalance in the affected area (Huschke 1970). Meteorological drought is a precursor of the other categories of drought and hence the most critical.

    According to Panu and Sharma (2002), the severity of a drought is a function of the drought duration and of the probability distribution of the drought variable and its autocorrelation structure. In meteorological drought, the severity is defined in the form of indices such as the Palmer Drought Severity Index (PDSI) (Palmer 1965). PDSI is the best-known index; historically, it has been the most commonly implemented. Its weaknesses, however, range from its complexity to poor applicability (the underlying computation is based on the climate of the Midwestern United States). This has led to several variants and alternative indices, such as: (1) the Standardized Precipitation Index (SPI), which measures the deviation of the actual precipitation from the average conditions in a given area (McKee et al. 1993); (2) the Palmer Hydrological Drought Index (PHDI), which is used for water supply monitoring; (3) the self-calibrated PDSI, which provided a solution to the inappropriately high frequency of extreme drought (Wells et al. 2004); (4) the Normalized Difference Vegetation Index (NDVI), which takes advantage of the reflective and absorptive characteristics of plants in the red and near-infrared portions of the electromagnetic spectrum; and (5) AWRI, which expresses the actual amount of available water (Wilhite and Glantz 1985). Like the PDSI, SPI has its own share of popularity; however, its major weakness (among others) is its prediction granularity, which is monthly. As a result, it is not as accurate as required because it does not reflect daily/weekly patterns. Byun and Wilhite (1999) developed the EDI to address the shortcomings of SPI.

    EDI differs from the rest of the indices in a number of ways; it was developed to address weaknesses identified in the drought indices existing at the time. Some desirable features of EDI are: (1) it more accurately calculates the current level of available water resources; (2) it considers drought continuity, not just a limited period, and can therefore diagnose prolonged droughts that continue for several years; (3) it is computed using precipitation alone; and (4) it considers daily water accumulation with a weighting function for the passage of time. The calculation therefore takes into account the fact that the quantity of rainfall that can be used as a water resource drops gradually over time after the rain has fallen. Effective Precipitations (EPs) are used to compute the deficiency or surplus of water resources for a particular date and place. EP here refers to the summed value of daily precipitation with a time-dependent reduction function; it makes use of Eq. 1 below.

    $$EP_i = \sum_{n=1}^{i} \left[ \frac{\sum_{m=1}^{n} P_m}{n} \right] \qquad (1)$$

    where $P_m$ is the precipitation of m days before and the index i represents the duration of summation (DS) in days. Here i = 365 is used, that is, summation over a year, which is the most dominant precipitation cycle worldwide. The value obtained with i = 365 can then serve as a representative value of the total water resources available or stored over a long period.

    For instance, if i = 2, then m varies from 1 to 2 and $EP_2$ becomes $P_1 + (P_1 + P_2)/2$. Once the daily EP is computed, a series of indices can be calculated to highlight different characteristics of a station's water resources. These are: Mean Effective Precipitation (MEP), Deviation of EP (DEP) and the standardised value of DEP (EDI). EDI produced satisfactory results in measuring drought severity and was adopted (and adapted) to analyse the 200-year drought climatology of Seoul, Korea (Kim et al. 2009) and to analyse droughts in Kenya (Masinde and Bagula 2011).
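The summation in Eq. 1 and the i = 2 example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Fortran program used in this study: the function names are hypothetical, the input list is assumed to hold the most recent day first, and the standardisation step is a simplification that uses a single mean and standard deviation for the whole series, whereas an operational EDI computation may use climatological values for each calendar day.

```python
from statistics import mean, pstdev


def effective_precipitation(precip, i=365):
    """EP_i per Eq. 1: sum over n = 1..i of the mean of the n most recent daily totals.

    precip[0] is P_1 (the most recent day), precip[1] is P_2, and so on.
    """
    if len(precip) < i:
        raise ValueError("need at least i daily precipitation values")
    running_total = 0.0
    ep = 0.0
    for n in range(1, i + 1):
        running_total += precip[n - 1]  # P_1 + ... + P_n
        ep += running_total / n         # add the n-day mean
    return ep


def standardise(ep_series):
    """Simplified DEP and EDI: deviation of EP from its mean (MEP),
    divided by the standard deviation of those deviations.
    Assumption: one mean and one standard deviation for the whole series."""
    mep = mean(ep_series)
    dep = [ep - mep for ep in ep_series]
    sd = pstdev(dep)
    if sd == 0:
        raise ValueError("EP series is constant; EDI is undefined")
    return [d / sd for d in dep]


# Worked example for i = 2 with P_1 = 4 mm and P_2 = 2 mm:
# EP_2 = P_1 + (P_1 + P_2) / 2 = 4 + 3 = 7
print(effective_precipitation([4.0, 2.0], i=2))
```

Because each term divides a running total by n, rain that fell recently contributes to many terms while older rain contributes to few, which is how the time-dependent reduction arises.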

    4 Artificial neural networks in forecasting droughts

    Artificial Intelligence (AI) refers to the computing paradigm that aims to develop solutions that mimic human perception, learning and reasoning to solve problems. Two AI techniques stand out when it comes to modelling environmental systems such as droughts: Intelligent Agents and Artificial Neural Networks (ANNs). ANNs are composed of simple elements (inspired by biological nervous systems) that mimic the neurons of the brain; they are designed to operate in parallel to achieve a goal. The network function is determined by the connections between elements. This function can be altered by varying the values of the connections (weights) between these elements; this is referred to as training the network.

    Chang-Shian et al. (2010) described and compared the performance of AI techniques in modelling environmental systems; they singled out the ANNs approach as being excellent in solving data-intensive and multivariable problems with unclear mapping rules. ANNs are highly interconnected systems that operate by mimicking the operation of the human brain. ANNs perform seven categories of tasks: pattern classification, clustering, function approximation, prediction, optimisation, retrieval by content and process control.

    For decades, pattern recognition techniques, regression models, and autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models for statistical time series have been used for drought forecasting. ANNs provide one of the nonlinear, non-stationary alternative models for drought forecasting, as explained in Mishra and Desai (2006). ANNs have several advantages to this end; they can be used to solve problems in nonlinear/unknown multivariate and less controlled environments (Bodri and Cermak 2001). The greatest strength of ANNs, though, is their ability to weave together various mathematical components capable of tackling very complex physical systems such as droughts. ANNs are also flexible and less assumption-dependent, and this has seen them find applications in modelling extremely complex hydrological domains such as rainfall-runoff, stream flows, water quality and precipitation estimation (American Society of Civil Engineers (ASCE) Task Committee on Application of Artificial Neural Networks in Hydrology 2000).

    In most hydrological applications, the multi-layer perceptron (MLP) neural network model is adopted with feed-forward back propagation (BPN) as the training algorithm (Rumelhart et al. 1986; Freitas and Billib 1997; Antoni et al. 2001; Bodri and Cermak 2001; Kung et al. 2006; Chang-Shian et al. 2010). Antoni et al. (2001) used seven climatic variables obtained from 127 weather stations in Croatia to develop empirical drought models using BPN. Mishra and Desai (2006) observed that most ANNs implementations adopted the multi-layer feed-forward neural network and were based on BPN.

    BPN has been applied in areas ranging from modelling animal/plant population growth to predicting hydrological/meteorological phenomena such as droughts and floods. BPN has three general layers (input, hidden and output) and two processing types (learning/training and recalling). Data are fed to the system at the input layer(s), processing takes place within the hidden layer(s) and, finally, results are produced at the output layer(s) (Morid et al. 2007). Besides the difficulty of determining the representativeness of the learning data, using BPN in hydraulic modelling is faced with the problem that BPN is a non-linear black box that does not consider or explain the underlying physical process of the watershed (Chang-Shian et al. 2010). This is the weakness that is addressed by Decision Group BPN (Chang-Shian et al. 2010).

    Using a precipitation data set for 36 years (1965 to 2001), Mishra and Desai (2006) compared the performance of BPN and ARIMA for different Standardized Precipitation Index (SPI) lead-times. Applying both the Recursive Multi-step Neural Network (RMSNN) and the Direct Multi-step Neural Network (DMSNN) approaches, they found that RMSNN performed better than both DMSNN and ARIMA for a one-month lead-time and that DMSNN performed better than the other two for a lead-time of four months. Other projects in which ANNs have been used for predicting droughts/floods include drought prediction in north-east Brazil (Freitas and Billib 1997) and summer flood forecasting in Moravia (Bodri and Cermak 2001). Shin and Salas (2000) used ANNs to quantify the spatial and temporal patterns of meteorological droughts for the south-western region of Colorado.

    Though ANNs were introduced by McCulloch and Pitts (1943), it was the arrival of the back-propagation training algorithm for feed-forward ANNs in 1986 that boosted the application of ANNs in several real-life areas such as drought forecasting (Panu and Sharma 2002).

    Maier and Dandy (2000) described five key considerations for the successful implementation of ANNs:

    (a) ANNs performance evaluation criteria: how will the performance of the model be judged? The relevant criterion for this study was the accuracy of the predicted results. This also happens to be the most commonly used criterion (Demuth et al. 2009).

    (b) Data set division criteria: this deals with how to share the available data set among the training, validation and testing phases of ANNs. Most neural network tools, such as the MATLAB Neural Network Toolbox, adopt 70:15:15 percentages for training, validation and testing respectively (Demuth et al. 2009). This data set division criterion was adopted in this paper.

    (c) Data pre-processing, with the aim of ensuring some form of standardisation. In this paper, all rainfall data was first subjected to EDI computation and hence was already standardised/homogenised. Further, the mapminmax function implemented in the MATLAB Neural Network Toolbox was applied. The latter maps all the input values into the range [−1, 1]. This in turn ensured that all variables involved had equal representation during the training.

    (d) Determining model inputs: this is based on a priori knowledge of causal variables in conjunction with inspection of time series plots of potential inputs and outputs. In this paper, an array of weather parameters such as daily readings of temperature, rainfall, relative humidity and wind direction was considered. From the rainfall values, both daily and monthly EDI and AWRI were computed and used extensively as inputs/outputs of the neural networks.

    (e) Selecting a suitable network architecture, which is made up of two aspects: the type of connection among the nodes (are there loops?), and the geometry of the network, that is, how many hidden layers there are and how many nodes are in each of the hidden layers. The number of nodes at the input and output layers is generally determined by the problem domain.

    5 Materials and methods

    5.1 Case study with experimental design

    Yin (1993) defines the case study approach as an empirical enquiry that investigates a contemporary phenomenon within its real-life context, when the boundaries between phenomenon and context are not clearly evident and in which multiple sources of evidence are used. The case study design was used to analyse historical daily weather data covering over 30 years for four weather stations in Kenya: first, to determine whether EDI could be used to qualify/quantify droughts and, second, to investigate the suitability of ANNs for forecasting droughts. The design of the case studies followed a three-phase experiment made up of pilot, exploratory and confirmatory experiments. This led to the variant termed Case Study with Experimental Design, shown in Fig. 1. Each of the case studies started with a Pilot Single-Case-Study of one weather station, followed by recursive Exploratory Multiple-Case-Studies of two weather stations. In the last phase of the case study design, a detailed Confirmatory Single-Case-Study using data from the fourth weather station was carried out.

    5.2 Data sample

    Drought impacts are more pronounced in highly populated areas whose main economic activity is agricultural production. In Kenya, these areas are found in the western and central regions, the Rift Valley and the highlands, and in a narrow strip along the Indian Ocean coast. Four weather stations (Dagoretti, Embu, Kakamega and Makindu) were selected to represent four of these regions (excluding the Rift Valley). Data composed of daily readings of temperature (highest and lowest), wind (direction and speed), relative humidity (at 06Z and at 12Z), atmospheric pressure and rainfall was sourced from KMD. In this paper, only daily precipitation data for the years 1979 to 2009 from these stations was used; this translated into 436531 records (Table 1).

    5.3 EDI calculation

    Two phases, (1) data cleaning and (2) data analysis, were carried out as per the flow chart shown in Fig. 2. A Java program was written to convert the original data format to the format accepted by the EDI Fortran program. A Fortran program developed by Byun and Wilhite (1999), available at http://atmos.pknu.ac.kr/~intra2/down_src.php (last accessed on 10 October 2012), was then used to compute EDI values. This program uses Eq. 1 (presented in Section 3) to compute daily/monthly EDIs and outputs them into text files. The raw data acquired from KMD was first cleaned and formatted to suit the pre-defined format (generating an input file to the EDI program) for each of the 4 weather stations. For each input file, an output file containing the EDI and AWRI values was generated. These were then exported and stored in a MySQL database, from which various data analysis reports were generated. Some of these can be found in Masinde and Bagula (2011).

    [Figure 1: diagram with three steps and an "ANNs Design Experiments" label. Pilot Single-Case-Study, data for 1 weather station (Embu): formulate theory. Exploratory Multiple-Case-Studies, data for 2 weather stations (Dagoretti and Makindu): test theory. Confirmatory Single-Case-Study, data for 1 weather station (Kakamega): confirm and extend theory.]

    Fig. 1 The 3 steps of the case study with experimental design

    Table 1 Geo-data of the weather stations studied

    Name                     Dagoretti           Embu      Kakamega  Makindu
    WMO number               63741               63720     63687     63766
    ICAO                     HKNC                HKEM      HKKG      HKMU
    Year opened              1954                1975      1957      1904
    Latitude                 01°18′S             00°30′S   00°17′N   2°17′S
    Longitude                36°45′E             37°27′E   34°47′E   37°50′E
    Elevation                1,798 m             1,493 m   2,133 m   1,000 m
    Annual average rainfall  1,023 mm            1,250 mm  1,913 mm  593 mm
    Ecological zone          11                  11        13        10
    Geographical region      Highland (Nairobi)  Central   Western   Coast

    ICAO (International Civil Aviation Organisation), WMO (World Meteorological Organisation)

    The values of the annual average rainfall in this table were calculated (by the author) from daily rainfall values for each station for the years 1979 to 2009.

    [Figure 2: flow chart. Phase I, Data Cleaning, contains: weather data in text files (source: KMD); extract rainfall data for 4 stations; identify and remove data gaps; using a Java program, reformat the data to Day, Month, Year and Station; formatted data. Phase II, Data Analysis, contains: EDI program; computed monthly and daily EDI; weather database (MySQL); EDI web-based decision support system; Excel charts generator.]

    Fig. 2 Flow chart showing the data cleaning and analysis phases for the Kenya case study

    5.4 Artificial neural networks models development

    5.4.1 Overview

    The ANNs models were developed using the output from the EDI computation described above. The ANNs described in this work utilised only EDI, AWRI and precipitation values for 30 years (1980 to 2009); an attempt to include temperature (and other parameters) led to large errors because, among other reasons, precipitation (used to derive EDI and AWRI) is itself a function of these parameters. Further, precipitation is a function of many other linear and nonlinear functions (Weichert and Burger 1998). The network models were created by combining various inputs/targets; for each station, the data set contained daily or monthly readings of precipitation, EDI and AWRI. The format of the daily EDI/AWRI file is shown in Table 2, while the monthly one had the format [Month/Year|Precipitation|AWRI|EDI]. To normalise the input values, the default data pre-processing function, mapminmax, which maps the range of input values to the range [−1, 1], was used.
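The rescaling to [−1, 1] described above is a linear min-max mapping. The following Python sketch illustrates the arithmetic only; it is not the MATLAB function itself, the name `map_min_max` is hypothetical, and raising an error for constant input is a choice made here for the illustration.

```python
def map_min_max(values, lo=-1.0, hi=1.0):
    """Linearly rescale values so that min(values) maps to lo and max(values) to hi."""
    x_min, x_max = min(values), max(values)
    if x_min == x_max:
        # Assumption for this sketch: constant input is rejected.
        raise ValueError("constant input cannot be rescaled")
    scale = (hi - lo) / (x_max - x_min)
    return [lo + (v - x_min) * scale for v in values]


# Daily precipitation totals in mm (illustrative values, not station data).
scaled = map_min_max([0.0, 5.0, 10.0])
print(scaled)
```

The smallest input lands on −1, the largest on +1 and everything else falls proportionally in between, so variables with very different units (for example AWRI and EDI) end up on a comparable scale before training.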

    5.4.2 Network architecture

    One hidden layer with 20 neurons was selected. During network building, experiments varying the number of neurons were carried out, but the default value of 20 outperformed the alternatives and was therefore retained. For example, during the training of the ANNs for predicting the EDI value for Dagoretti with a 7-day lead-time, the number of hidden neurons was varied between 10 and 40 at an interval of 5. The Mean Square Error (MSE) and Regression (R) values were used to rank the networks; the network with 20 hidden neurons had the best performance.
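The ranking step described above can be sketched as follows. This is a hypothetical illustration rather than the procedure run in MATLAB: R is assumed here to be the Pearson correlation between targets and network outputs, ties in MSE are broken by R as a choice made for this sketch, and the numbers in the usage example are toy values invented for illustration, not results from this study.

```python
from math import sqrt


def mse(targets, outputs):
    """Mean square error between target and predicted values."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between targets and outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / sqrt(var_t * var_o)


def rank_candidates(targets, outputs_by_size):
    """Order hidden-layer sizes from best to worst: lowest MSE first, then highest R."""
    scored = [
        (mse(targets, out), -pearson_r(targets, out), size)
        for size, out in outputs_by_size.items()
    ]
    return [size for _, _, size in sorted(scored)]


# Toy values only: predictions from three hypothetical networks on four test points.
targets = [0.0, 1.0, 2.0, 3.0]
candidates = {
    10: [0.5, 1.5, 1.5, 2.0],
    20: [0.1, 0.9, 2.1, 3.0],
    40: [0.0, 2.0, 1.0, 3.0],
}
print(rank_candidates(targets, candidates))
```

In practice each candidate's outputs would come from a network trained with that number of hidden neurons and evaluated on held-out data.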

    5.4.3 Data division criteria and network training

    For each data set, the ratio of 70 %:15 %:15 % for training, validation and testing respectively was used. Splitting of the data during training was done randomly using the dividerand() function. Training was carried out using the Levenberg-Marquardt backpropagation (trainlm) algorithm. This was selected because it is the fastest backpropagation algorithm in the MATLAB ANNs toolbox that was used in this research (Demuth et al. 2009).
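As an illustration of the random 70 %:15 %:15 % division, the sketch below shuffles sample indices and cuts them into three groups. It is a Python approximation of the idea, not the toolbox's dividerand(); the function name `divide_random`, the fixed seed and the sample count of 1000 are assumptions made for the example.

```python
import random


def divide_random(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly split sample indices into training, validation and test groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(train * n_samples)
    n_val = round(val * n_samples)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])  # the remainder forms the test group


# Hypothetical data set of 1000 input-target samples.
train_idx, val_idx, test_idx = divide_random(1000)
```

Every index lands in exactly one group, so the validation and test samples are never seen during weight updates.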

    Table 2 Format of daily EDI output

    Date         Total Precipitation   AWRI    EDI
    1980-01-01   0.0                   117.0   0.66
    1981-01-31   0.0                   208.1   0.96
    2009-12-31   26.6                  146.7   0.20


  • 5.4.4 Network geometry

    The networks were first designed to predict future EDI and/or AWRI values for a given combination of EDI (E), precipitation (P) and AWRI (W) for the 6 previous days/months. The networks were categorized as follows:

    - 7 networks with 2 outputs (EDI and AWRI)
    - 7 networks with EDI as the output
    - 7 networks with AWRI as the output

    A simple Java program was then created to convert input files in the format of Table 2 to the following inputs-targets mappings.

    (i) Homogeneous inputs, homogeneous output example:

    - Network 9: $E_{n+1} = f(E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5})$. Given EDI values for the last 6 days, forecast the EDI value for the next day.

    (ii) Heterogeneous inputs, homogeneous output example:

    - Network 14: $E_{n+1} = f((P_n, P_{n-1}, P_{n-2}, P_{n-3}, P_{n-4}, P_{n-5}), (E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5}), (W_n, W_{n-1}, W_{n-2}, W_{n-3}, W_{n-4}, W_{n-5}))$. Given Precipitation, EDI and AWRI values for the last 6 days, forecast the EDI value for the next day.

    (iii) Homogeneous inputs, heterogeneous output example:

    - Network 2: $E_{n+1}, W_{n+1} = f(E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5})$. Given EDI values for the last 6 days, forecast the EDI and AWRI values for the next day.

    (iv) Heterogeneous inputs, heterogeneous output example:

    - Network 7: $E_{n+1}, W_{n+1} = f((E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5}), (W_n, W_{n-1}, W_{n-2}, W_{n-3}, W_{n-4}, W_{n-5}), (P_n, P_{n-1}, P_{n-2}, P_{n-3}, P_{n-4}, P_{n-5}))$. Given EDI, AWRI and Precipitation values for the last 6 days, forecast the EDI and AWRI values for the next day.
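The inputs-targets mappings listed above amount to sliding a six-value window over each series. The paper's conversion was done with a Java program that is not reproduced here; the following Python sketch only illustrates the mapping, and the function name `make_samples` and the synthetic series are hypothetical.

```python
def make_samples(inputs, targets, window=6):
    """Build next-step samples from equally long series.

    inputs / targets: dicts mapping a series name (e.g. "E", "W", "P") to a list.
    Each input row holds x_n, x_{n-1}, ..., x_{n-5} for every input series;
    each target row holds the value at n+1 for every target series.
    """
    length = len(next(iter(inputs.values())))
    X, Y = [], []
    for n in range(window - 1, length - 1):
        row = []
        for series in inputs.values():
            row.extend(series[n - k] for k in range(window))  # most recent first
        X.append(row)
        Y.append([series[n + 1] for series in targets.values()])
    return X, Y


# Synthetic placeholder series, not real station data.
edi = list(range(10, 20))
awri = list(range(100, 110))
precip = list(range(0, 10))

# Homogeneous input and output (EDI only), as in example (i).
X9, Y9 = make_samples({"E": edi}, {"E": edi})
# Heterogeneous inputs and outputs, as in example (iv).
X7, Y7 = make_samples({"E": edi, "W": awri, "P": precip}, {"E": edi, "W": awri})
```

With ten values per series and a window of six, four samples are produced; the heterogeneous case simply concatenates one six-value window per input series, giving 18 inputs per sample.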

    5.4.5 ANNs performance analysis

    Both R and MSE values were used to select the network models with the best performance. Root Mean Square Error (RMSE) and Percentage RMSE were then used to determine the implications of the errors on the resulting forecasts. For Regression Analysis, the higher the value, the higher the rank; R is the measure of the correlation between the outputs and the targets. In the case of MSE, the smaller the value, the better the network; it refers to the average squared difference between outputs and targets. However, the MSE value is bound to be directly proportional to the size of the output values. For example, the outputs/targets for Networks 15 to 21 are those of AWRI, whose mean for Embu weather station is 194.58, while those for Networks 8 to 14 are EDI values that range between -2.28 and 4.32 (an absolute mean of 0.76521) for the same station. In order to get the actual implications of the networks' errors, RMSE and the percentage of the RMSE were computed.
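The four measures used here can be written down compactly. The Python sketch below is illustrative only: the function names are hypothetical, and the reference value passed to `pct_rmse` is left as a parameter because this section does not fix which reference value was used for each index. The worked figure uses the Embu AWRI mean of 194.58 quoted above together with the MSE of 64.9521 reported for Network 17 at Embu in Section 6.3.

```python
import math


def mse(outputs, targets):
    """Average squared difference between network outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


def rmse(outputs, targets):
    """Square root of the MSE, in the same units as the forecast variable."""
    return math.sqrt(mse(outputs, targets))


def pct_rmse(rmse_value, reference):
    """RMSE expressed as a percentage of a reference magnitude."""
    return 100.0 * rmse_value / reference


def pearson_r(outputs, targets):
    """Correlation coefficient between outputs and targets."""
    n = len(targets)
    mean_o, mean_t = sum(outputs) / n, sum(targets) / n
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in outputs))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in targets))
    return cov / (spread_o * spread_t)


# AWRI, Embu: an MSE of 64.9521 against a mean AWRI of 194.58.
awri_rmse = math.sqrt(64.9521)
awri_pct = pct_rmse(awri_rmse, 194.58)  # roughly 4 % of the mean AWRI
```

Expressing the error relative to the size of the variable is what allows an AWRI network with an MSE in the tens to be compared with an EDI network whose MSE is a small fraction.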


  • 6 Results and discussion

    6.1 Rainfall and temperature variability in Kenya

    As shown in Figs. 3 and 4, data analysis points towards more frequent and pronounced rainfall variations as well as a rise in temperatures in Kenya. There is evidence of extreme within-season variability, especially of the onset of the rains, leading to unseasonal rainfall patterns. Considering that the design and operation of ANNs is heavily dependent on learning historical patterns and using these to predict the future, the rainfall variability being witnessed in Kenya and elsewhere in the world would lead to larger errors in the predicted drought values. Rainfall values forecasted by KMD were incorporated into the ANNs models to address such situations.

    6.2 Pilot phase - Experiments results

    In order to select the best network to use, data for one weather station (Embu) was used; results of the runs are tabulated in Table 3. An ideal network would be one that predicts EDI and/or AWRI given precipitation only; however, these networks (1, 8 and 15) had the worst performance. Secondly, selecting a network that uses all 3 inputs (P, E and W) and gives both EDI and AWRI as outputs is ideal because it ensures that only one network is needed to solve the current problem. This scenario is represented by Network 7, which had the best performance in Category 1 of the networks. What about creating two different networks that use all three inputs (P, E and W) to predict EDI and AWRI respectively? This is achieved in Networks 14 and 21. Though Network 14 had the best performance in Category 2, Network 21 was third after Networks 19 and 17 in Category 3.

    The networks with average (Rank 1 to 3) performance and that have Precipitation as an input were selected for further investigation. These are Networks 7, 11, 14, 20 and 21. Finally, in order to exhaustively investigate the best networks for building a bi-network (two networks, for EDI and AWRI respectively), the two best performing networks in each of these categories were selected.

    Networks 1, 8 and 15, which have precipitation alone as input, had the worst performance. An attempt to use EDI to compute AWRI, and vice versa, did not yield desirable results. This is an indication that the networks could not find a usable relationship between the two. This made Networks 2, 3, 10 and 16 give poor performance. Finally, combining EDI and Precipitation to predict AWRI, or AWRI with Precipitation to predict EDI, did not work either. This made Networks 4, 6, 13 and 18 not perform well.

    6.3 Exploratory phase I - Experiments results

    From the results in Table 3 and the analysis above, 10 Neural Network models (5, 7, 9, 11, 12, 14, 17, 19, 20 and 21) were identified for further analysis during the Exploratory Experiments. Using data for Embu, Dagoretti and Makindu weather stations, the 10 Neural Networks were trained and their performances analysed. For each weather station, a table similar to Table 4 was generated.

    For Dagoretti, Network 9 had the lowest MSE value (0.01887) in the category with EDI as output (Networks 9, 11, 12 and 14). For the R value, Networks 9 and 11 had equal values (0.99029). For the AWRI category, the one-input Network 17 had the best performance. For Embu, in the category of networks for computing EDI (Networks 9, 11, 12 and 14), Network 9 had the lowest MSE (0.0179) as well as the strongest R (0.99068). Similarly, in the category of networks that compute AWRI, Network 17 had the lowest MSE (64.9521) and highest R (0.99412). For Makindu, Network 9 performed better among the networks with EDI as output but, unlike the case of Embu and Dagoretti, Network 21 (3 inputs) had the least MSE and Network 19 (which has EDI as the second input) had the highest value of R. Since the improvements (51.4371 to 47.7839 (7 %) for MSE and 0.99307 to 0.99361 (0.05 %) for R) were too low considering that extra input values are needed, Network 17 was selected to represent this category.

    Fig. 3 Temperature variability in selected weather stations in Kenya

    Fig. 4 Rainfall variability in selected weather stations in Kenya

    Networks 5 and 7 have both EDI and AWRI as outputs, which makes it difficult to interpret the implications of their MSE values. Besides, these networks had weaker Regression values for all the weather stations and were therefore not considered in building the final prediction models. Further, including extra inputs to calculate either EDI or AWRI values did not significantly improve the networks' performance; as such, only the two single-input networks (Network 9 for EDI and Network 17 for AWRI) were considered in the next phases. With error (percentage RMSE) values ranging from 3 % to 7 % on EDI, Network 9 generated values that are within an acceptable error range. For example, an EDI value of 4.32 (Extreme Floods) with an error of 3 % will yield 4.32 ± 0.13. Similarly, Network 17 has error values between 4 % and 8 %, and an AWRI value of 194.6 would yield 194.6 ± 8.1.

    Table 3 Performance of the pilot phase neural networks

    Network     Inputs (I)   Targets (T)   Mean Square Error (MSE)                 Regression (R)
                P  E  W      E  W          Tr         Vl         Ts         Rn     Tr    Vl    Ts    Rn
    Category 1: Two outputs (EDI and AWRI)
    Network1    Y  N  N      Y  Y          2,340.90   2,358.28   2,358.05   7      0.58  0.68  0.62  7
    Network2    N  Y  N      Y  Y          1,169.10   1,365.17   1,287.50   4      0.94  0.94  0.95  6
    Network3    N  N  Y      Y  Y          41.24      40.19      47.63      5      0.76  0.77  0.75  3
    Network4    Y  Y  N      Y  Y          795.01     861.62     885.25     3      0.96  0.96  0.96  5
    Network5    N  Y  Y      Y  Y          43.00      32.74      42.15      2      0.97  0.98  0.97  1
    Network6    Y  N  Y      Y  Y          39.45      51.13      50.33      6      0.76  0.76  0.74  4
    Network7    Y  Y  Y      Y  Y          35.66      59.27      45.75      1      0.97  0.97  0.97  1
    Category 2: Output EDI only
    Network8    Y  N  N      Y  N          0.84       0.82       0.85       7      0.36  0.32  0.32  7
    Network9    N  Y  N      Y  N          0.02       0.03       0.02       4      0.99  0.99  0.99  1
    Network10   N  N  Y      Y  N          0.38       0.35       0.38       6      0.78  0.78  0.78  6
    Network11   Y  Y  N      Y  N          0.02       0.02       0.02       2      0.99  1.00  0.99  1
    Network12   N  Y  Y      Y  N          0.02       0.02       0.02       3      0.99  0.99  0.99  1
    Network13   Y  N  Y      Y  N          0.31       0.31       0.33       5      0.82  0.82  0.82  5
    Network14   Y  Y  Y      Y  N          0.02       0.02       0.02       1      0.99  0.99  0.99  1
    Category 3: Output AWRI only
    Network15   Y  N  N      N  Y          4,672.09   4,847.33   4,743.63   7      0.50  0.50  0.46  7
    Network16   N  Y  N      N  Y          2,447.19   2,566.56   2,669.63   6      0.78  0.77  0.77  6
    Network17   N  N  Y      N  Y          87.57      77.81      80.48      2      0.99  0.99  0.99  1
    Network18   Y  Y  N      N  Y          1,655.01   1,834.05   1,703.69   5      0.85  0.85  0.85  5
    Network19   N  Y  Y      N  Y          84.57      91.60      72.05      1      0.99  0.99  0.99  1
    Network20   Y  N  Y      N  Y          77.17      78.17      118.44     4      0.99  1.00  0.99  1
    Network21   Y  Y  Y      N  Y          80.67      79.87      97.39      3      0.99  0.99  0.99  1

    P Precipitation, E Effective Drought Index, W Available Water Resource Index (AWRI), Tr Training, Vl Validation, Ts Testing, Y Yes (included in the inputs/targets), N No (not included in the inputs/targets), Rn Rank (the rank of the network by performance (MSE and R); for example, Network 7 has the lowest error in Category 1, while Network 14 and Network 19 have the lowest error in Categories 2 and 3 respectively)

    Though the ranking of the networks was based on the average of the performance (MSE and R) over the three data sets (Tr, Vl and Ts), the value of Vl is preferred because it is the measure of the network's ability to carry out future forecasting (classify data it has not encountered before).

    6.4 Exploratory phase II - Forecast models results

    Using the results above, neural networks were created to forecast future values of EDI (Network 9) and AWRI (Network 17) for the short-term, medium-term and long-term time-scales as follows:

    6.4.1 Short-Term forecasts

    (a) D-Days-Lead-Time Forecast

    The lead-times (in days) considered were 1, 2, 3, 4, 5, 6, 7, 12 and 14. Given a specific day n, to forecast the EDI or AWRI value d days from n, the following expression was used:

    $$\text{Forecast}[E_{n+d}] = f(E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5})$$

    where $E_n$ is the EDI value for day n and the six inputs are the values for days n down to n-5. That is, in order to forecast future values, 6 past values are input into the neural network. In the case of AWRI, E is replaced with W. For example, to forecast the value of EDI for a 5-days Lead-Time from 21st January (26th January), the following expression applies:

    $$\text{Forecast}[E_{21+5}] = f(E_{21}, E_{20}, E_{19}, E_{18}, E_{17}, E_{16})$$

    For the purpose of training and testing the neural networks, the daily EDI/AWRI values for years 1980 to 2009 were used. For each of the 3 weather stations, input files for each of the neural networks (representing the forecast durations above) were created. In all the networks except 12b and 14b, 6 values (EDI or AWRI) were included as inputs to the neural network models. In an attempt to find out whether increasing the number of inputs would significantly improve network performance, the number of inputs was increased to 12 and 14 in 12b and 14b respectively. The results of this network model are shown in Figs. 5 and 6; they indicate that the performance of the network declines as the number of days being forecast increases; for instance, forecasting 3 days ahead gives a better performance than forecasting 14 days ahead.

    Table 4 Performance of the exploratory phases neural networks - Dagoretti

    Network     Mean Square Error                          Regression
                Tr        Vl        Ts        Average      Tr        Vl        Ts        Average
    Network5    34.4754   27.7476   38.8559   33.6930      0.98315   0.98450   0.98408   0.98391
    Network7    35.1539   35.1539   33.3021   34.5366      0.97997   0.99816   0.98864   0.98892
    Network9    0.0188    0.0178    0.0200    0.01887      0.99018   0.99068   0.99001   0.99029
    Network11   0.0165    0.0254    0.0149    0.01899      0.99126   0.98751   0.99211   0.99029
    Network12   0.0166    0.0199    0.0221    0.0195       0.99138   0.98955   0.98865   0.98986
    Network14   0.0174    0.0151    0.0240    0.0188       0.99110   0.99159   0.98793   0.99021
    Network17   68.5308   59.6619   72.7853   66.9927      0.99378   0.99442   0.99334   0.99385
    Network19   68.4546   81.1977   53.5809   67.7444      0.99373   0.99271   0.99503   0.99382
    Network20   64.7377   75.1657   78.6352   72.8462      0.99399   0.99311   0.99332   0.99347
    Network21   68.8056   75.3141   60.2966   68.1388      0.99382   0.99237   0.99457   0.99359

    Increasing the number of inputs (previous days' values) did not bring a significant improvement in network performance. For example, increasing the number of inputs from 6 to 14 resulted in a change from 13.61 to 13.55 %, 15.21 to 15.29 % and 9.25 to 6.52 % for Dagoretti, Embu and Makindu respectively.

    (b) D-Days-Lead-Time Forecast with Precipitation

    In the era of climate change, extreme climate variations may trigger precipitation during periods when it does not normally rain, or lead to precipitation amounts that are below or above the expected normal for the area/region. Conventionally, ANNs-based forecasting solutions learn from past events/patterns and use this knowledge to forecast future trends. This means that, faced with extreme climate variations, an ANNs-based drought forecast solution would result in poor forecast skill. In order to take care of such events, the D-Days-Lead-Time Forecasting neural networks described in (a) above were modified to include forecasted precipitation values for the lead-time considered. For example, to forecast the drought (EDI and AWRI) values for the 5-days Lead-Time counting from 21st January (26th January), the anticipated daily precipitation values (as forecasted by professional weather forecasting institutions such as the Kenya Meteorological Department) for the 5 days are included as inputs to the networks:

    $$\text{Forecast}[E_{21+5}] = f(E_{21}, E_{20}, E_{19}, E_{18}, E_{17}, E_{16}, P_{22}, P_{23}, P_{24}, P_{25}, P_{26})$$

    where $P_i$ represents the forecasted (approximated) precipitation value for day i.

    As shown in Figs. 7 and 8, with accuracies of 94.13 to 97.03 %, 93.57 to 97.62 % and 96.25 to 99.09 % for Dagoretti, Embu and Makindu respectively, the networks in the EDI category (Network 9) displayed excellent performance. Further, the networks had correlation coefficient values between 0.98 and 0.99. It must be noted that the impressive performance of the D-Days-Lead-Time Forecast was achieved because the actual historical precipitation values were used as input. The real networks are expected to use values forecasted by meteorological institutions. These values have accuracies as low as 70 %, and this will therefore slightly affect the neural networks' performance.

    Fig. 5 MSE graph for D-Days-Lead-Time for selected weather stations in Kenya
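The two D-Days-Lead-Time variants differ only in whether forecast precipitation for the lead days is appended to the six past index values. The Python sketch below illustrates how one input vector could be assembled; the function name `lead_time_sample` is hypothetical, and the series are synthetic placeholders indexed by day of the month so that the result can be read against the 21st January example.

```python
def lead_time_sample(index_series, n, d, precip_forecast=None, window=6):
    """Return (inputs, target) for forecasting index_series[n + d] from day n.

    inputs are E_n, E_{n-1}, ..., E_{n-5}; when precip_forecast is given,
    the forecast precipitation for days n+1 ... n+d is appended.
    """
    if n - (window - 1) < 0 or n + d >= len(index_series):
        raise IndexError("not enough data around day n for this lead-time")
    inputs = [index_series[n - k] for k in range(window)]
    if precip_forecast is not None:
        inputs.extend(precip_forecast[n + 1:n + d + 1])
    return inputs, index_series[n + d]


# Synthetic placeholders: position i stands for day i of January.
edi = list(range(0, 32))
precip = [100 + i for i in range(32)]

# Variant (a): 5-days lead-time from day 21 using past EDI values only.
plain_inputs, plain_target = lead_time_sample(edi, n=21, d=5)
# Variant (b): the same forecast with precipitation for days 22 to 26 appended.
rain_inputs, rain_target = lead_time_sample(edi, n=21, d=5, precip_forecast=precip)
```

In operational use the appended precipitation values would come from an external forecast rather than from the historical record, which is the source of the accuracy caveat noted above.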

    6.4.2 Medium and long term forecasts

    Similar to the daily EDI, monthly EDI/AWRI values are computed using monthly precipitation totals. This is the data that was used for the following medium-term and long-term forecasts.

    Fig. 6 Regression graph for D-Days-Lead-Time for selected weather stations in Kenya

    Fig. 7 MSE graph for D-Days-Lead-Time (with precipitation) for selected weather stations in Kenya


  • (a) M-Months-Lead-Time Forecast

    Six inputs (previous months) were selected as the standard input in training neural networks to forecast monthly EDI/AWRI values. The lead-times considered were 2, 3, 4, 5, 6, 7 and 12 months.

    $$\text{Forecast}[E_{n+m}] = f(E_n, E_{n-1}, E_{n-2}, E_{n-3}, E_{n-4}, E_{n-5})$$

    where $E_n$ is the EDI value for month n and the six inputs are the values for months n down to n-5.

    For example, to forecast the value of EDI for a 5-months Lead-Time from January 2012 (that is, a forecast for June 2012), the following expression applies:

    $$\text{Forecast}[E_{\text{Jun2012}}] = f(E_{\text{Jan2012}}, E_{\text{Dec2011}}, E_{\text{Nov2011}}, E_{\text{Oct2011}}, E_{\text{Sep2011}}, E_{\text{Aug2011}})$$

    As shown in Table 5, the performance of the neural networks decreases as the length of the forecast duration increases. The error rates for the majority of the networks are above 30 %, resulting in forecasts with below 70 % accuracy.

    (b) Y-Years-Lead-Time Forecast

    Using the expressions below, forecasts for Y (1, 2, 3 and 4) years were developed and tested for the 3 weather stations. Here M is the calendar month and y is the most recent year with observed values.

    Network 9: $E_{M,y+1} = f(E_{M,y-5}, E_{M,y-4}, E_{M,y-3}, E_{M,y-2}, E_{M,y-1}, E_{M,y}, P_{y-5}, P_{y-4}, P_{y-3}, P_{y-2}, P_{y-1}, P_{y}, P_{y+1})$

    Network 17: $W_{M,y+1} = f(W_{M,y-5}, W_{M,y-4}, W_{M,y-3}, W_{M,y-2}, W_{M,y-1}, W_{M,y}, P_{y-5}, P_{y-4}, P_{y-3}, P_{y-2}, P_{y-1}, P_{y}, P_{y+1})$

    Illustration: to forecast EDI/AWRI values for August 1986, the neural network requires the EDI/AWRI values for August 80, 81, 82, 83, 84 and 85. It also requires the annual precipitation totals for these years as well as an approximate annual precipitation value for the year 1986.

    Fig. 8 Regression graph for D-Days-Lead-Time(with precipitation) for selected weather stations in Kenya


  • 6.5 Confirmatory phase - Results of evaluating forecast models

    From the exploratory phase, the following networks were identified for this confirmation: D-Days-Lead-Time Forecast, D-Days-Lead-Time Forecast with Precipitation and Y-Years-Lead-Time Forecast. A data set for Kakamega weather station with the same dimensions as the data for Dagoretti, Embu and Makindu was used to evaluate the performance of the ANNs models. For each forecast category, the already trained networks were retrieved and used to evaluate the respective input-output combinations created using the Kakamega data set.

    In all three categories of forecasts, the results showed consistency in performance. For instance, the D-Days-Lead-Time Forecast had accuracies of 88 % to 94 % for EDI and 88 % to 95 % for AWRI. Just as was the case in the Exploratory Phase, the D-Days-Lead-Time Forecast with Precipitation had much higher accuracies of up to 98 %. The fact that the ANNs models gave these impressive results on data not encountered before helped to re-affirm that ANNs can indeed be used to enhance the accuracy of drought forecasts.

    7 Evaluation and conclusion

    7.1 Evaluation

    Artificial neural networks as implemented in MATLAB's Neural Network Toolbox were used to develop the drought prediction models discussed in this paper. The ANNs models developed accept as input a combination of EDI/AWRI and precipitation values and generate a set of EDI/AWRI values representing predicted drought. For example, a 2-Years Lead-Time (2 years in advance) Forecasting model could be used to forecast the 1998 floods as follows:

    $$\text{Forecast}[E_{\text{January},1998}] = f(E_{\text{January},1996}, E_{\text{January},1995}, E_{\text{January},1994}, E_{\text{January},1993}, E_{\text{January},1992}, E_{\text{January},1991}, P_{1991}, P_{1992}, P_{1993}, P_{1994}, P_{1995}, P_{1996}, P_{1997}, P_{1998})$$

    Table 5 Network performance for the M-Months-Lead-Time Forecast - Network 9

    LT     MSE                     RMSE                    %RMSE                   R
           D      E      M         D      E      M         D      E      M         D       E       M
    2      0.552  0.556  0.514     0.743  0.746  0.717     26.75  29.01  24.37     0.6868  0.6713  0.7432
    3      0.750  0.758  0.625     0.866  0.871  0.791     31.17  33.86  28.59     0.5483  0.4733  0.6317
    4      0.903  0.947  0.845     0.950  0.973  0.919     34.19  37.86  36.74     0.3288  0.3907  0.4862
    5      0.893  0.658  0.715     0.945  0.811  0.846     34.00  31.55  31.97     0.6583  0.5045  0.6686
    6      0.995  0.730  0.801     0.998  0.855  0.895     35.90  33.24  35.15     0.3551  0.4204  0.5255
    7      0.862  0.997  0.868     0.928  0.999  0.932     33.42  38.84  37.60     0.3359  0.1837  0.2477
    12     0.809  1.087  1.009     0.900  1.043  1.004     27.15  34.93  36.27     0.3970  0.2593  0.4995

    LT Lead-Time; D Dagoretti station; E Embu station; M Makindu station; MSE Mean Square Error; RMSE Root Mean Square Error, computed by calculating the square root of the MSE; %RMSE Percentage Root Mean Square Error, calculated by computing the percentage of the actual mean values (EDI or AWRI) that the RMSE represents

    Values 2, 3, ..., 12 represent the lead-time of the forecast in months. For example, entry 5 implies that the forecast is for the 5th month given the previous 6 months. For example, given the values for June, July, August, September, October and November 2011, the network forecasts the value for May 2012


  • To forecast EDI values for January 1998, the neural network requires the EDI values for January 91, 92, 93, 94, 95 and 96. It also requires the annual precipitation totals for these years as well as approximate annual precipitation values for the years 1997 and 1998. Values for other calendar months are computed in a similar fashion. That is, given 6 EDI and precipitation values for 6 years and predicted precipitation values for the lead period (1997 and 1998), the ANNs model for 2-years lead-time outputs the EDI values (Fi) for 1998. The accuracy of this value (Fi) was determined by its closeness to the actual drought value (Ai) for this period (1998). Figure 9 is a sample plot of such relationships as generated by the ANNs model for 2-years lead-time for the period October 1997 to September 1999 using data for Makindu weather station.

    MSE was then used to determine the accuracy of the forecasts. The best performance was retrieved from the MATLAB Artificial Neural Network Toolkit as illustrated in Fig. 10. In the case of the 2-Years lead-time EDI forecast model for Makindu, the best performance (MSE = 0.37097) was attained after 16 epochs; an epoch here refers to one presentation of the entire training set to the neural network.

    Throughout the experiments, validation error was used to evaluate the performance of our network models. For instance, the data for Makindu for the Y-Years-Lead-Time category (see Table 6) had MSE values of 0.205, 0.233, 0.267 and 0.270 for 1, 2, 3 and 4 years lead-times respectively. The AWRI models for Makindu for the same network category gave values ranging from 3,037 to 4,661. In order to put the MSE values into perspective, the Percentage RMSE, which is easier to interpret than the MSE, was further computed. A value of 2.89 % implied that the network had an accuracy of 97.11 %. This analysis was performed for all models, and networks with accuracies of over 70 % were considered acceptable. Networks with correlation values of 0.9 and above were considered acceptable.
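The acceptance checks described above reduce to simple arithmetic on the Percentage RMSE and R. The Python sketch below is illustrative: the function name `check_model` is hypothetical, and the two criteria are reported separately because the text does not state whether both had to hold at once. The figures used are the Makindu and Dagoretti entries of Table 6.

```python
def check_model(pct_rmse, r, min_accuracy=70.0, min_r=0.9):
    """Derive accuracy from the Percentage RMSE and test both stated thresholds."""
    accuracy = 100.0 - pct_rmse
    return {
        "accuracy": accuracy,
        "accuracy_ok": accuracy > min_accuracy,  # accuracies of over 70 %
        "r_ok": r >= min_r,                      # correlation of 0.9 and above
    }


# Makindu, 2-years lead-time (Table 6): %RMSE 13.02, R 0.9107.
makindu_2y = check_model(13.02, 0.9107)
# Dagoretti, 1-year lead-time (Table 6): %RMSE 17.03, R 0.8521.
dagoretti_1y = check_model(17.03, 0.8521)
```

Under this reading, both models clear the accuracy threshold, while only the Makindu model also reaches the 0.9 correlation level.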

    7.2 Conclusion

    Most of the drought indices applied in forecasting droughts in Kenya fall short of providing the severity of the drought. In addressing this drawback, a combination of ANNs and EDI was adopted. This was evaluated by developing forecasting models for time-scales ranging from one day to four years. The models were then tested using weather data for over 30 years for 4 weather stations in Kenya. By including forecasted precipitation values as inputs to the ANNs models, our solution took care of the effects of unprecedented climate variations. It also exploited the correlation among EDI values of calendar months to forecast annual droughts. With accuracies ranging from 75 % to 98 %, the solution is a great enhancement to the solutions currently in use in Kenya.

    Fig. 9 Predicted versus actual values sample graph

    In developing the ANNs models, three phases were followed:

    Phase I: - Using 30 years (1979-2009) of data for Embu Station, 21 neural networks were created using different combinations of Rainfall (Precipitation), EDI and AWRI values. From these, the 10 best performing networks were selected for further investigation during Phase II.

    Phase II: - Data for three weather stations (Dagoretti, Embu and Makindu) for 30 years (1979-2009) was used to further evaluate the 10 networks selected above. After thorough analysis, 2 networks (for forecasting EDI and AWRI respectively) were identified and used to run the following forecasts: (1) Next-Day Forecast; (2) D-Days-Lead-Time Forecast; (3) D-Days-Lead-Time With Precipitation Forecast; (4) Next-Month Forecast; (5) M-Months-Lead-Time Forecast; (6) Next-Year Forecast; and (7) Y-Years-Lead-Time Forecast. Overall, the D-Days-Lead-Time With Precipitation Forecast (for up to 7 days Lead-Time) had accuracy levels of over 94 %, while the Y-Years-Lead-Time Forecast (for up to 4 years Lead-Time) had accuracies of over 75 %. The Next-Month Forecast had poorer performance; Dagoretti and Embu had accuracies of 76-87 % and 72-80 % for the EDI and AWRI network categories respectively. However, monthly forecasts could still be achieved using the Y-Years-Lead-Time Forecast. ANNs are known to rely on historical trends to forecast future values, but in the case of droughts, phenomena such as climate change may lead to erroneous forecasts due to unexpected increases/decreases in precipitation. In this paper, the ANNs forecasting models have addressed this by incorporating forecasted precipitation values issued by professional meteorological institutions such as the KMD. The fact that the latter forecasts can have accuracies as low as 70 % has been factored into the design, and it was found that this does not significantly affect the accuracies of the ANNs outputs.

    Fig. 10 ANNs evaluation - Best MSE sample graph

    Table 6 Network performance for the Y-Years-Lead-Time Forecast - Network 9

    LT (years)   MSE                     RMSE                    %RMSE                   R
                 D      E      M         D      E      M         D      E      M         D       E       M
    1            0.224  0.218  0.205     0.473  0.467  0.453     17.03  18.17  11.81     0.8521  0.8761  0.8516
    2            0.305  0.305  0.233     0.552  0.552  0.482     19.87  21.47  13.02     0.8389  0.8618  0.9107
    3            0.294  0.294  0.267     0.542  0.542  0.516     19.51  21.10  14.46     0.7953  0.8504  0.8682
    4            0.285  0.315  0.270     0.533  0.561  0.520     19.20  21.83  14.60     0.8487  0.8167  0.8527

    Phase III: - This is the validation of the results achieved above; a data set from Kakamega weather station (not used in the previous two phases) was utilised. The resulting accuracies were similar to those achieved in Phase II.

    The work presented in this paper is part of on-going research, and one of the immediate tasks is to put the ANNs models described here onto a web-based interface to allow end users (such as meteorologists and staff at meteorological departments) to access and use the system. This way, it will be possible to use the ANNs models together with the seasonal climate forecasts carried out in Kenya and beyond.

    Acknowledgement Special thanks to the Kenya Meteorological Department for providing the data set used in this research.

    References

    Antoni O, Josip K, Antun M, Dragan B (2001) Ecol Model 138:255-263

    Armstrong S, Mark C, Randolph K, Dan M (2011) World Disasters Report - Focus on Hunger and Malnutrition. International Federation of Red Cross and Red Crescent Societies, Geneva

    ASCE Task Committee On Application Of Artificial Neural Networks In Hydrology (2000) Artificial Neural Networks in Hydrology: II. Hydrologic Applications. J Hydrol Eng 5:115-123

    Bodri L, Cermak V (2001) Prediction of extreme precipitation using a neural network: application to summer flood occurrence in Moravia. Journal of Environmental Informatics 45:155-167

    Byun H, Wilhite DA (1999) Objective quantification of drought severity and duration. J Clim 12:2747-2756

    CERF (Central Emergency Response Fund) (2009) CERF funds jump start emergency aid operations in Kenya. United Nations Economic Commission for Africa. Addis Ababa, Ethiopia

    Chang-Shian C, Boris P, Chen F, Chou N, Chao-Chung Y (2010) Development and application of a decision group Back-Propagation Neural Network for flood forecasting. J Hydrol 385:173-182

    Collins A, Nick M, Michele M, Anne M, Maarten VA (2009) World Disasters Report - Focus on Early Warning, Action, 1st edn. International Federation of Red Cross and Red Crescent Societies, Geneva

    Deely S, David D, Jorgelina H, Cassidy J (2010) World Disasters Report - Focus on Urban Risk, 1st edn. International Federation of Red Cross and Red Crescent Societies, Geneva

    Demuth H, Mark B, Martin H (2009) MATLAB Neural Network Toolbox User's Guide. 6

    Dracup JA, Lee KS, Paulson EG Jr (1980) On the definition of droughts. Water Resour Res 16:289-296

    Elsa E, Moreira C, Coelho A, Paulo AA (2008) SPI-Based Drought Category Prediction Using Loglinear Models. J Hydrol 354:116-130

    Freitas MAS, Billib MHA (1997) Drought prediction and characteristic analysis in semiarid Ceará, northeast Brazil. Sustainability of Water Resources under Increasing Uncertainty, IAHS, No. 240, 105

    Huschke RE (1970) Glossary of Meteorology. Amer. Meteor. Soc., 638 pp

    Kim D, Hi-Ryong B, Ki-Seon C (2009) Evaluation, modification, and application of the Effective Drought Index to 200-Year drought climatology of Seoul, Korea. J Hydrol 378:1-12

    Kung H, Hua J, Chen C (2006) Drought forecast model and framework using wireless sensor networks. J Inf Sci Eng 22:751-769

    Maier HR, Dandy GC (2000) Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ Model Softw 15(1):101-124


    Masinde M, Bagula A (2011) The Role of ICTs in Quantifying the Severity and Duration of Climatic Variations - Kenya's Case. Proceedings of ITU Kaleidoscope 2011: The Fully Networked Human? - Innovations for Future Networks and Services (K-2011), 12-14 December 2011, IEEE Xplore, pp 1-8

    Masinde M, Bagula A (2012) ITIKI: bridge between African indigenous knowledge and modern science of drought prediction. Knowl Manag Dev J 7(03):274-290

    McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115-133

    McKee TB, Doesken NJ, Kleist J (1993) The relationship of drought frequency and duration to time scales. In: 8th Conference on Applied Climatology, January. American Meteorological Society, Boston, Massachusetts, pp 17-23

    Mishra AK, Desai VR (2006) Drought Forecasting Using Feed-Forward Recursive Neural Network. Ecol Model 198:128-138

    Mishra AK, Singh VP (2010) A Review of Drought Concepts. J Hydrol 391:202-216

    Morid S, Smakhtin V, Bagherzadeh K (2007) Drought forecasting using artificial neural networks and time series of drought indices. Int J Climatol 27:2103-2111

    Mutua S (2011) Strengthening Drought Early Warning At The Community and District Levels: Analysis Of Traditional Community Warning Systems In Wajir & Turkana

    Mwagore D (2002) Land use in Kenya: The case for a national land use policy. Printfast Kenya Ltd., Nairobi

    Palmer WC (1965) Meteorological Drought. Research Paper 45. US Department of Commerce, Weather Bureau, Washington, DC, pp 1-58

    Panu US, Sharma TC (2002) Challenges in drought research: some perspectives and future directions. Hydrological Sciences-Journal

    Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. Parallel distributed processing, Vol. 1. MIT Press, Cambridge, pp 318-362

    Shin HS, Salas JD (2000) Regional drought analysis based on neural networks. J Hydrol Eng 5(2):145-155

    Weichert A, Burger G (1998) Linear versus nonlinear techniques in downscaling. Clim Res 10:83-93

    Wells N, Goddard S, Hayes MJ (2004) A self-calibrating palmer drought severity index. J Clim 17:2335-2351

    WHO Collaborating Centre For Research On The Epidemiology of Disasters (CRED) (2012, last update) Emergency Events Database (EM-DAT) [Homepage of Government of Belgium, for WHO], [Online]. Available: http://www.emdat.be/ [May/11, 2012]

    Wilhite DA, Glantz MH (1985) Understanding the drought phenomenon: the role of definitions. Water International 10:111-120

    Yin RK (1993) Applications of Case Study Research. Sage, Newbury Park
