
Math Geosci DOI 10.1007/s11004-013-9483-0

Hybrid Model for Urban Air Pollution Forecasting: A Stochastic Spatio-Temporal Approach

Ana Russo · Amílcar O. Soares

Received: 3 October 2012 / Accepted: 28 July 2013 © International Association for Mathematical Geosciences 2013

Abstract Air pollution is usually driven by a complex combination of factors in which meteorology, physical obstacles, and interactions between pollutants play significant roles. Considering the characteristics of urban atmospheric pollution and its consequent impacts on human health and quality of life, forecasting models have emerged as an effective tool to identify and forecast air pollution episodes. The overall objective of the present work is to produce forecasts of pollutant concentrations with high spatio-temporal resolution and to quantify the uncertainty in those forecasts. Therefore, a new approach was developed based on a two-step methodology. Firstly, neural network models were used to generate short-term temporal forecasts based on air pollution and meteorology data. The accuracy of those forecasts was then evaluated against an independent set of historical data. Secondly, local conditional distributions of the observed values with respect to the predicted values were used to perform spatial stochastic simulations for the entire geographic area of interest. With this approach the spatio-temporal dispersion of a pollutant can be predicted, while accounting for both the temporal uncertainty in the forecast (reflecting the neural network's efficiency at each monitoring station) and the spatial uncertainty as revealed by the spatial variograms. Based on an analysis of the results, our proposed method offers a highly promising alternative for the characterization of urban air quality.

Keywords Air quality · Neural networks · Stochastic simulation · PM10 · Uncertainty

A. Russo (B) Instituto Dom Luiz, Faculdade de Ciências da Universidade de Lisboa, Universidade de Lisboa, Campo Grande, Edifício C8, Piso 3, 1749-016 Lisboa, Portugal e-mail: [email protected]

A.O. Soares CERENA, Instituto Superior Técnico, Universidade Técnica de Lisboa, Lisboa, Portugal



1 Introduction

Urban air pollution (AP) is a complex mixture of toxic components that may induce acute and chronic pathologic responses in vulnerable individuals, especially children and people with cardiac and respiratory insufficiencies (Kolehmainen et al. 2001). In 2009, about 20 % of the urban population in the European Union (EU) was exposed to PM10 levels above the limit value (EEA 2011). Confronted with the direct impact of AP on human health and the environment, European authorities passed legislation aimed at improving air quality (AQ), by approving several Directives and Council Decisions (EEA 2011). These legal instruments set limits or targets for ambient concentrations, total emissions, and specific sources or sectors (EEA 2011). In the case of non-compliance with Directive 2008/50/EC, AQ management plans must be implemented in the affected areas (EEA 2011). These plans usually incorporate numerical models in order to achieve sustainable AQ management. Numerical modeling allows harmful situations to be identified or predicted through the integration of different AQ-related components (e.g., topography or weather data), yielding a set of tools that can be used to comply with AQ standards and therefore, ultimately, to preserve human health. As such, numerical modeling contributes decisively to AQ management. Different types of AQ models have been applied in different contexts to characterize and forecast the dispersion of air pollutants (e.g., box, Gaussian plume, persistence, regression, deterministic and statistical models). The most straightforward approaches can provide quick solutions; however, they rely on highly simplified assumptions (Luecken et al. 2006). Deterministic dispersion models, such as the UAM-Urban Airshed Model (Morris and Meyers 1990), the ROM-Regional Oxidant Model (Lamb 1983), CHIMERE (Monteiro et al. 2005), and the CMAQ-Community Multiscale Air Quality Model (Sokhi et al. 2006; Luecken et al. 2006) are driven by the objective quantification of chemical reactions and the physical transport of pollutants. Nevertheless, AP dispersion is usually so complex that reliability is achieved at the cost of producing the forecasts on coarse grids, which in most cases results in forecasts that are useless for the management of critical local situations. Additionally, deterministic dispersion models require a large amount of accurate input data and are computationally expensive as well as time consuming (Dutot et al. 2007).

Since the pioneering work of Bilonick (1983, 1985) and continuing to the most recent hybrid models coupled to deterministic models (Russo et al. 2008), geostatistical modeling has evolved tremendously. These models are able to mimic the dynamic behavior of physical phenomena, and specifically AP behavior. The mapping of pollutant concentrations has been treated by classical estimators of kriging (Atkinson and Lloyd 2001), stochastic simulation (Pereira 1999; Russo et al. 2008), Markov random fields (Cressie et al. 1999), and the co-simulation of contaminants (Franco et al. 2006). In addition, secondary information and time components have been introduced into the frameworks of most geostatistical models (Kyriakidis and Journel 1999). These models have been widely used to characterize AQ in urban areas, where pollutant sources are considered diffuse, and in industrial areas, with localized emission sources (Nunes and Soares 2005; Soares and Pereira 2007; Russo et al. 2008). Considering the complexity of dispersion phenomena, non-stationary situations have been addressed in several spatial models (Monestiez et al. 1997) and space–time simulations (Kyriakidis and Journel 1999). Nonetheless, mainstream geostatistical models are basically interpolation exercises (estimation, simulation) using a set of variables in a spatial or spatio-temporal domain. The temporal prediction of a pollutant (i.e., extrapolation in the time domain) usually requires knowledge of both the main trends and the complex patterns of physical dispersion phenomena, that is, the dispersion of the main pollutants, the meteorological conditions, and the relevant chemical reactions. However, the types of information and knowledge needed to produce AQ predictions with the required rigor for urban areas are usually not compatible with the stationary assumptions of most geostatistical models. Geostatistical space–time simulation models allow for the characterization of uncertainty by supplying equiprobable images that reproduce patterns of spatial continuity quantified by the observations available. These space–time models can incorporate complex combinations of factors, which are usually non-linear, and thereby play a significant role in AQ forecasting (Kyriakidis and Journel 1999; Pereira 1999; Nunes and Soares 2005; Soares and Pereira 2007).

The overall objective of the work presented here is to produce AP forecasts with high spatial and temporal resolution and to quantify the uncertainty in those predictions. To do so, a new approach was developed based on a two-step methodology which combines neural network (NN) short-term temporal forecasting and spatial stochastic simulation. First, an NN model is implemented at each monitoring station for short-term temporal forecasting of a pollutant's concentration, taking into account the historical behavior of that pollutant and historical meteorological data. NNs are mathematical models that have been widely applied to temporal predictions of AP (Gardner and Dorling 1999, 2000; Cobourn et al. 2000; Hooyberghs et al. 2005; Turias et al. 2007; Lal and Tripathy 2012). Indeed, several NN models have already been tested, mostly for forecasting hourly AP averages (Perez et al. 2000; Kolehmainen et al. 2001) or daily maxima time series (Perez and Reyes 2002). Their potential compared to that of other approaches (Perez et al. 2000; Agirre-Basurko et al. 2006), when applied to different pollutants and prediction time lags, has also been evaluated (Gardner and Dorling 2000; Hooyberghs et al. 2005). Afterwards, the local conditional distributions of the observed temporal values with respect to the value predicted by the NN are used to perform fine-grid spatial stochastic simulations for the entire area of interest and then to evaluate the spatial uncertainty in the dispersion of the pollutant under study. The proposed methodology couples the advantages of the temporal prediction capabilities of NN models with the spatial dispersion and uncertainty assessment of geostatistical models.

Nearly 75 % of European citizens live in urban areas, where AQ is worse than in rural areas (EEA 2010). Thus, the ability to forecast urban AQ emerges as a priority for guaranteeing the quality of life of inhabitants of urban centers. Here, we present an application of our model by analyzing PM10 levels in the most densely populated area in Portugal (Lisbon) and its surroundings. In the next section (Sect. 2) the proposed methodology is described, followed by Sect. 3, which presents the application of the proposed method to the space–time characterization of PM10 concentration. Section 3 also includes a brief description of the data and of the case study area, and finally, the discussion of the results. Section 4 includes some brief conclusions and final remarks.


2 Hybrid Model for Urban Air Pollution Forecasting: A Stochastic Spatio-Temporal Approach

2.1 Short-Term Forecasts of the Local Conditional Probability Distributions at the Monitoring Stations

Let us consider Z(x, t), the pollutant's concentration at spatial location x and time period t, and M(x, t), the general notation for the meteorological conditioning data. Denoting the present time as t0, and the spatial locations of the Nm monitoring stations as xα, the objective of the first step was to calculate at any location xα the conditional probability

$$p\bigl(Z(x_\alpha, t_0) \mid Z(x_\alpha, t_{0-i}),\ M(x_\alpha, t_{0-i}),\ i = 1, \dots, N_T\bigr). \tag{1}$$

Equation (1) expresses the probability of forecasting the concentration of a pollutant Z(xα, t0) at instant t0, taking into account both its concentration Z(xα, t0−i) and the meteorological conditions M(xα, t0−i) of previous time periods t0−i (i = 1, …, NT). In order to calculate Eq. (1), our approach replaces the conditioning data [Z(xα, t0−i), M(xα, t0−i)] with a function ϕ that summarizes all the conditioning data in a “predicted” value at (xα, t0)

$$Z^*(x_\alpha, t_0) = \varphi\bigl(Z(x_\alpha, t_{0-i}),\ M(x_\alpha, t_{0-i}),\ i = 1, \dots, N_T\bigr). \tag{2}$$

The function ϕ can be any predictor; here its construction is achieved using NN modeling (Trigo and Palutikof 1999; Cobourn et al. 2000). An advantage of NN models is that they can be trained to find the best mathematical relationship between predictors and targets, without predefined restrictions (Trigo and Palutikof 1999). For this study a multi-layer NN was chosen. The NN model used was trained and implemented based on the historical time series Z(xα, t0−i) and on the meteorological conditions M(xα, t0−i) (i = 1, …, NT) of the local monitoring stations xα (Sect. 3). As conditioning data were available at the monitoring stations, the respective ϕ could be obtained at those locations as well

$$Z^*(x_\alpha, t_0) = \varphi\bigl(Z(x_\alpha, t_{0-i}),\ M(x_\alpha, t_{0-i}),\ i = 1, \dots, N_T\bigr). \tag{3}$$
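As an illustration of the conditioning data that enter ϕ in Eqs. (2) and (3), the following minimal sketch pairs each day t0 with the pollutant and meteorological values of the preceding days. The function name `build_records`, the toy values, and the choice of two meteorological variables are hypothetical; the predictor ϕ itself (a multi-layer NN in this study) is not implemented here.

```python
from typing import List, Sequence, Tuple


def build_records(
    pm10: Sequence[float],
    met: Sequence[Sequence[float]],
    n_lags: int = 1,
) -> Tuple[List[List[float]], List[float]]:
    """Pair each day t0 with the values of the n_lags previous days.

    pm10[t] plays the role of Z(x_alpha, t) and met[t] of M(x_alpha, t)
    at a single monitoring station (hypothetical layout).
    """
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t0 in range(n_lags, len(pm10)):
        row: List[float] = []
        for i in range(1, n_lags + 1):
            row.append(pm10[t0 - i])   # Z(x_alpha, t0-i)
            row.extend(met[t0 - i])    # M(x_alpha, t0-i)
        inputs.append(row)
        targets.append(pm10[t0])       # Z(x_alpha, t0), the value to predict
    return inputs, targets


# Toy series: four days of PM10 and two meteorological variables per day.
pm10 = [30.0, 42.0, 55.0, 38.0]
met = [[12.0, 0.6], [14.0, 0.5], [15.0, 0.4], [11.0, 0.7]]
X, y = build_records(pm10, met, n_lags=1)
```

Any regression model trained on `X` and `y` could then stand in for ϕ; the choice of model is independent of how the records are assembled.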

Hence, local bivariate distributions of predicted and observed values were estimated

$$F\bigl[Z(x_\alpha, t_{0-i}),\ Z^*(x_\alpha, t_{0-i})\bigr]. \tag{4}$$

This is based on independent historical data that were not used in the construction of the NN prediction model at each monitoring station. For the present time period t0, z∗(xα, t0) was predicted using the local NN model at the monitoring stations xα, after which the local conditional distribution functions were estimated based on the historical bivariate distribution obtained from Eq. (4)

$$F\bigl[Z(x_\alpha, t_0) \mid Z^*(x_\alpha, t_0) = z^*\bigr]. \tag{5}$$

Details of the calculation of Eq. (5) are presented in the Appendix. For the sake of simplicity, in the following Z(xα, t0) is denoted as Z(x), Z∗(xα, t0) as Z∗(x), and z∗ as m. Note that the bi-distributions of Eq. (4) were calculated with an independent set of historical data. These conditional distributions indicate the local accuracy of the temporal predictions at each monitoring station.
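To make the step from the bivariate distribution of Eq. (4) to the conditional distribution of Eq. (5) concrete, the sketch below builds an empirical conditional cdf from historical (predicted, observed) pairs. It is not the Appendix procedure: selecting the pairs whose forecast lies within a fixed window around m is an assumption made for illustration, and the names `conditional_sample`, `conditional_cdf`, and `half_width` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def conditional_sample(
    pairs: Sequence[Tuple[float, float]], m: float, half_width: float
) -> List[float]:
    """Observed values whose historical forecast fell within m +/- half_width.

    Each pair is (z_star, z): the NN forecast and the value later observed.
    The windowing rule is an illustrative assumption.
    """
    return sorted(z for z_star, z in pairs if abs(z_star - m) <= half_width)


def conditional_cdf(sample: Sequence[float], z: float) -> float:
    """Empirical estimate of F[Z <= z | Z* close to m] from a sorted sample."""
    if not sample:
        raise ValueError("no historical pairs close to the forecast value")
    return bisect_right(sample, z) / len(sample)


# Toy history of (forecast, observation) pairs at one station.
history = [
    (20.0, 18.0), (22.0, 25.0), (40.0, 35.0),
    (41.0, 47.0), (43.0, 44.0), (60.0, 52.0),
]
sample = conditional_sample(history, m=42.0, half_width=3.0)
prob = conditional_cdf(sample, 45.0)
```

A narrow spread of `sample` around m corresponds to a station where the NN forecasts have been accurate; a wide spread signals larger temporal uncertainty at that station.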


The problem normally posed by the implementation of an NN model is that it produces a single temporal or spatio-temporal result. Thus, the reliability of the model cannot be confirmed. The approach here succeeds in predicting the spatio-temporal distribution of a pollutant's concentration by also accounting for the uncertainty in the time prediction (reflecting the efficiency of the NN at each local monitoring station). After the concentration of a pollutant has been forecasted and the conditional probability distributions of the observed values estimated for α = 1, …, Nm and t = 1, …, NT, the objective of the second step of the proposed methodology is to use the local conditional distributions for that period to perform spatial simulations for the entire area of interest. We chose to use direct sequential simulation (DSS) with local distributions, as it allows for an evaluation of the spatial uncertainty in a pollutant's concentration (Sect. 2.2).

2.2 Stochastic Simulation with Local Conditional Distributions

A set of realizations of Z(., .) for instant t0 was obtained based on the local distributions F[Z|Z∗ = m] determined at the monitoring station locations xα. Among the existing algorithms that are based on the spatial re-sampling of conditional distributions, P-field simulation (Srivastava 1992) implies knowledge of the local cumulative distribution functions (cdfs) on the entire grid of nodes to be simulated, and not only those at the locations of the experimental samples. One option is to adapt sequential Gaussian simulation (Gomez-Hernandez and Journel 1993; Goovaerts 1997) by transforming the local distributions; however, that would imply the spatial stationarity of the Gaussian transformation (in practice, the same Gaussian transform function is applied at any location xα, which is not a valid assumption in this case). Hence, for our model we adapted the basics of the model of direct co-DSS with bi-distributions (Horta and Soares 2010) to the direct simulation with local cdfs.

Let us set the first nodes to be visited in the path of the sequential algorithm as precisely the Nm monitoring stations. At each “experimental” node (monitoring station location), a local mean and its variance are calculated at a given time and a simulated value Z(l)(xα, t0) is obtained from the local distributions F[Z|Z∗ = m] following the algorithm of Horta and Soares (2010). The remaining N − Nm nodes are then sequentially simulated following the traditional DSS algorithm (Soares 2001). The DSS algorithm with local cdfs can be summarized in two parts. Firstly, the spatial locations of local cdfs (in this case, the locations of the monitoring stations) are visited in a random path and at each one the simple kriging (sk) mean and its variance are calculated based on the previously simulated values. A simulated value Z(l)(xα, t0) is obtained from the local distributions F[Z|Z∗ = m] according to the DSS algorithm with bi-distributions (Horta and Soares 2010). Secondly, the remaining N − Nm nodes are visited sequentially and Z(l)(xi, t0), i = 1, …, N − Nm, is obtained from the global cdf by following the traditional approach (DSS algorithm).

The image representing the previous instant (t0−1) of a pollutant's concentrations is used as the local spatial trend. As we have an average image of simulated realizations Z(l)(x, t0−1), l = 1, …, Ns, for the period t0−1, calculated with the observations at t0−1

$$O(x, t_{0-1}) = \frac{1}{N_s} \sum_{l=1}^{N_s} Z^{(l)}(x, t_{0-1}), \tag{6}$$

then it is possible to use O(x, t0−1) as the local trend of the sk estimates for the next period t0

$$Z(x_\alpha, t_0)^* - O(x, t_{0-1}) = \sum_{\alpha} \lambda_\alpha(x, t_0)\, Z(x_\alpha, t_0) - O(x, t_{0-1}). \tag{7}$$

This procedure is repeated for all the NT time instants. It thus allows forecasting of the space–time distribution of AP, accounting for spatial as well as temporal uncertainties. An application to PM10 concentration forecasting is presented in the next section.
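The average image of Eq. (6), which serves as the local trend for the next period, is a cell-by-cell mean over the Ns realizations. A minimal sketch follows, assuming each realization is stored as a nested list of rows on a common grid; the name `mean_image` and the 2 × 2 toy grids are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def mean_image(realizations: Sequence[Grid]) -> Grid:
    """Cell-by-cell average of N_s simulated images, as in Eq. (6)."""
    n_s = len(realizations)
    n_rows = len(realizations[0])
    n_cols = len(realizations[0][0])
    return [
        [sum(r[i][j] for r in realizations) / n_s for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Two toy realizations for the previous day on a 2 x 2 grid.
realizations = [
    [[30.0, 40.0], [50.0, 60.0]],
    [[34.0, 44.0], [46.0, 60.0]],
]
trend = mean_image(realizations)  # plays the role of O(x, t0-1)
```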

3 Space–Time Application of PM10 Concentration Forecasts for the Greater Lisbon Area

This illustrative example focuses on the city of Lisbon (Fig. 1), a large coastal city in central Portugal, and specifically on its urban/industrial suburban areas. Lisbon has 2.6 million inhabitants and accounts for approximately 37 % of the Portuguese gross domestic product (INE 2012). The area is covered by a conventional AQ monitoring network comprising urban, traffic, industrial and suburban monitoring stations that record the atmospheric concentrations of major pollutants. Over the course of the last two decades, Lisbon has experienced several pollution episodes during which the legal limits of PM10, O3, and NO2 were repeatedly exceeded (APA 2008).

Fig. 1 Case study area

In the following, we present a space–time model for the short-term prediction of extreme episodes, based on the methodology described in Sect. 2.

3.1 Data

The datasets consisted of the daily concentrations of several pollutants (NO, NO2, CO, PM10) measured at 12 monitoring stations within the greater Lisbon area (Fig. 1) for a period of 5 years (from 1/1/2002 to 31/12/2006). Data from the first 4 years were used to construct the models, whereas data from the last year (2006) were used for independent evaluation. Hourly meteorological observations were also available. Daily values of boundary layer height (BLH) and sea-level pressure (SLP), obtained from the ECMWF (European Centre for Medium-Range Weather Forecasts), were used as well. Specifically, SLP data series were employed to determine prevailing circulation weather types (CWT) based on the methodology proposed by Trigo and DaCamara (2000). Annual PM10 cycles for all stations were computed. Figure 2(a) shows the daily and monthly mean surface PM10 values measured at a single monitoring station. Figure 2(b) summarizes the monthly distributions of PM10 concentrations in box–whiskers plots for the period between 2002 and 2006 and for all monitoring stations. As seen in Fig. 2(b), the PM10 time series do not show a clear annual cycle.

3.2 Selection of Variables

Based on the available 4 years of common period datasets, a collection of records was constructed consisting of the input vector (which included the previous-day mean meteorological variables), the previous-day mean atmospheric concentrations of NO, NO2, CO, and PM10, and the corresponding target, that is, the present-day PM10 concentrations. Pre-processing consisted of computing the cross-correlation for different time lags; additionally, stepwise regression was conducted between the meteorological and the AQ variables. Variables were selected independently for each monitoring station through backward stepwise regression (BSR), which allows variables to be sequentially removed from a full (all regression terms included) model. With this approach the best set of variables for every monitoring station is determined, thus retaining the smallest subset of statistically significant variables to predict PM10 automatically at each one. Originally, the following 15 potential variables were considered: the previous-day mean PM10 concentration, the 0 a.m. PM10 concentration, the previous-day mean values of CO, NO2, and NO concentrations, the previous-day maximum PM10 concentration, and the previous-day mean values of maximum temperature, wind direction and intensity, humidity, radiance, BLH (with three different time lags), and CWT. From the lag analysis, it was possible to decide that only the daily time lags in the meteorological and AQ variables were to be considered. BSR analysis revealed that, for the majority of the monitoring stations, the most significant variables in the prediction of PM10 were the previous-day PM10 concentration, followed by the 0 a.m. and previous-day maximum PM10 concentrations, and the previous-day values of NO2 concentration, wind direction and humidity.

Fig. 2 (a) Time series of daily (gray dashed line) and monthly (black line) mean PM10 values measured at Entrecampos station between the years 2002 and 2006; (b) box–whiskers plots of the monthly distribution of PM10 concentrations for the period 2002–2006

3.3 Forecasting Daily Contaminants with Neural Networks

The first step in the proposed approach consists of forecasting the present-day (t0) PM10 concentration using NN models, producing an individual temporal forecast for each of the monitoring stations. In their construction, an AP system is considered as one that receives information from various sets of inputs and responds by producing a specific output. This model assumes no prior knowledge about the relationship between input and output variables. Although the predominance of certain variables was evident from the BSR analysis (Sect. 3.2), the NN models developed for the monitoring stations did not use a standard common set of variables as predictors; rather, predictor variables were used in the construction of each NN model according to the BSR selection for that station, so the models were independent of one another. Next, the NN models were trained with data from the first 4 years (01/01/2002 to 31/12/2005). The weight values that associate inputs and outputs were determined by an optimization procedure known as a learning algorithm, specifically the least mean square rule, which produces a unique solution corresponding to the absolute minimum value of the error surface (Trigo and Palutikof 1999).

Table 1 Calibration/validation results for each monitoring station

Station    PC      Sp       RMSE
E          0.83    62.02    13.23
O          0.79    49.36    11.11
AL         0.81    52.96    14.90
L          0.79    60.30    11.02
ESC        0.80    58.82    13.96
R          0.77    46.61    12.11
LAR        0.82    58.19     9.97
LRS        0.79    53.35    10.85
CC         0.70    48.24    11.87
QM         0.77    50.48     9.58
MM         0.76    51.90     8.94
OD         0.83    61.09    12.52
Average    0.79    54.44    11.67

The possibility that an NN model might overfit or underfit should always be taken into consideration. It is possible to keep such risks under control by using appropriate validation techniques, such as cross-validation. Thus, a cross-validation was applied, dividing the available period into four sets and completing the calibration-validation procedure four times independently; that is, from the 4 years of data available, data from 3 years were used to build the model and data from the remaining year were used for validation. In every run, the equivalent of one full year was withheld for validation purposes. For each validation year, a set of performance measures was computed between the observed values and the NN forecasts, namely, the Pearson correlation coefficient (PC), the root mean square error (RMSE), and the skill against persistence (Sp). Each performance measure was analyzed for each year and then averaged over the complete period, resulting in an average value for each monitoring station (Table 1).
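The validation scheme and the three performance measures can be sketched as follows. The text does not give a formula for Sp, so the sketch assumes the common form Sp = 100 × (1 − MSE_model / MSE_persistence), where persistence forecasts each day with the previous day's observation; that definition, the function names, and the toy values are assumptions made for illustration only.

```python
from math import sqrt
from typing import Iterator, List, Sequence, Tuple


def pearson(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient (PC) between observations and forecasts."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return sxy / sqrt(sxx * syy)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error (RMSE)."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def skill_against_persistence(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Assumed definition: 100 * (1 - MSE_model / MSE_persistence).

    Persistence predicts obs[t-1] for day t, so the first day is skipped.
    """
    n = len(obs) - 1
    mse_model = sum((o - p) ** 2 for o, p in zip(obs[1:], pred[1:])) / n
    mse_pers = sum((o - prev) ** 2 for o, prev in zip(obs[1:], obs[:-1])) / n
    return 100.0 * (1.0 - mse_model / mse_pers)


def leave_one_year_out(years: Sequence[int]) -> Iterator[Tuple[List[int], int]]:
    """Yield (calibration years, validation year) for each run."""
    for held_out in years:
        yield [y for y in years if y != held_out], held_out


obs = [30.0, 50.0, 40.0, 60.0]    # toy observed PM10
pred = [32.0, 46.0, 43.0, 55.0]   # toy NN forecasts
pc, err, sp = pearson(obs, pred), rmse(obs, pred), skill_against_persistence(obs, pred)
folds = list(leave_one_year_out([2002, 2003, 2004, 2005]))
```

Under this assumed definition, a positive Sp means the model beats persistence and Sp = 100 would be a perfect forecast.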

Note that the PM10 performance results for the validation period are better than those obtained by Demuzere et al. (2009) in their Holland study and similar to those reported by Hooyberghs et al. (2005) for Belgium. The performances reported by those authors were r²(PM10) equal to 42 % and r(PM10) between 65 % and 80 %, respectively. In short, based on the results obtained for the validation periods, it is reasonable to conclude that the NN model generalizes well for independent data, that is, for data not used in NN construction.

After calibration and validation with data from 2002–2005, the models were used to produce daily average PM10 forecasts for a period of 1 year. For this purpose an independent 1-year sample (1/1/2006 to 31/12/2006) had been withheld in order to evaluate the performance of the models for the respective monitoring stations during individual daily average predictions. The PM10 forecasts were then compared with the actual values of the pollutants at the monitoring stations and with persistence. Due to a certain level of memory that characterizes air pollutants, persistence corresponds to a benchmark model considerably more difficult to surpass than climatology or randomness (Demuzere et al. 2009).

Fig. 3 Time series plots of the NN results versus the actual observed PM10 values for the Avenida Liberdade (AL) monitoring station

Performance measures between the observed and modeled values were then computed for all monitoring stations for the year 2006. Additionally, observed and modeled values were plotted as time series (Fig. 3) and as scatter plots (Fig. 4) for each station, in order to support the later spatial simulation. Scatter plots and the correlation coefficients for two days, 29/4/2006 and 7/8/2006, determined at the Av. Liberdade monitoring station, are presented (Fig. 4). These two days were intentionally chosen to illustrate the potential of the proposed method, as they represent two daily periods characterized by a relatively sharp change in AQ relative to the previous day and are hence difficult to forecast. The scatter plot results for 29/4/2006 show generally lower PM10 values than on the previous day, that is, 28/4/2006. By contrast, on 7/8/2006, the PM10 values were generally higher than on 6/8/2006. Consequently, the forecasts for 29/4/2006 were usually underestimated while those for 7/8/2006 were mostly overestimated. These results influenced the conditional spatial simulations. Also, the simulated NN results for 29/4/2006 were generally closer to the perfect correlation curve (the red line y = x, on which predictions (y) equal observations (x)), whereas those for 7/8/2006 were usually farthest away from the y = x curve. Based on each day's forecast (red dot on each scatter plot of Fig. 4), bivariate distributions were determined following the methodology presented in Horta and Soares (2010), which is summarized in the Appendix.

In most cases, forecast models (e.g., plume models, Gaussian models, and statistical temporal models) produce an individual value as the result. Deterministic models and common geostatistical models (e.g., kriging) provide a three-dimensional spatial view of an individual realization. These types of models are of limited utility because they do not incorporate uncertainty but do incorporate simplifications. Through the use of NN models, we are able to express the uncertainty in the temporal forecast, a step that is fundamental to the conditional spatial simulation (Sect. 3.4). For more detail on how local bivariate distributions are determined, see the Appendix. Although the simulated NN results for 29/4/2006 and 7/8/2006 were quite different, the bivariate distributions for the two days were, at least to some extent, still similar.

Fig. 4 Scatter plots of the NN results versus the actual observed PM10 values for the Avenida Liberdade monitoring station. The filled dot represents the NN forecast for (a) 29/4/2006 and (b) 7/8/2006

3.4 Conditional Spatial Simulations

Based on the bivariate distributions between the predicted values resulting from the independent NN forecasts and the previously determined equivalent real values, conditional distributions were estimated for each monitoring station according to the methodology described in Sect. 2.2. Then, in order to reproduce the PM10 spatial dispersion, a conditional spatial simulation was performed by applying the methodology described in Sect. 2.2. Figure 5 illustrates the second step of the proposed approach, showing a trend image for the previous day (t0−1) and the conditional distributions of real PM10 at the monitoring station locations for the present day (t0).

Fig. 5 Schematic representation of the second stage of the proposed method, showing a trend image for the previous day (t0−1) and the conditional distributions of real PM10 at the monitoring station locations for the present day (t0)

A set of Nr = 50 equiprobable simulated realizations was generated by the algorithm described in Sect. 2.2, using 100-m × 100-m grids for each day, with the trend being the simulated average image of the previous day. Subsequently, an average image was computed based on the 50 equiprobable simulated images. Figures 6(a) and 6(b) represent the trend used as the base historical images for the next day's events (29/4/2006 and 7/8/2006). As expected, the 7/8/2006 event presents considerably higher PM10 values for the entire case study area than the 29/4/2006 event.

Two situations arise from the occurrence of a relatively sharp change in the contaminant's content compared to that of the previous day, namely, the need to forecast (1) a substantial increase or (2) a reduction in the concentrations relative to the observations of the respective previous day (evaluated by the 12-hour trend between 12 UTC of the previous day and 0 UTC of the day to be forecasted). For illustration purposes and to assess the average and extreme behaviors of PM10, three scenarios for the PM10 spatial dispersion are shown in Figs. 7(a–c) and 8(a–c) through quantile forecasts for the 29/4/2006 and 7/8/2006 events. Figures 7(a) and 8(a) show the 10th percentile average image for the 29/4/2006 and 7/8/2006 events, Figs. 7(b) and 8(b) the 50th percentile, and Figs. 7(c) and 8(c) the 90th percentile, respectively. This percentile representation allows for an assessment of the spatial uncertainty of the predictions. As is evident in a comparison of the forecasted maps with the observed PM10 values for each day (Figs. 6(a) and 6(b)), the observations fall within the range of risk between the 10th and 90th percentiles.

Fig. 6 Average image (50 simulations) representing the spatial dispersion of PM10 on (a) 29/4/2006 and (b) 7/8/2006
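Percentile maps such as those of Figs. 7 and 8 can be obtained cell by cell from the set of simulated realizations. The sketch below assumes linear interpolation between order statistics, which is one of several percentile conventions and not necessarily the one used in this study; the names `percentile` and `percentile_map` and the toy grids are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0..100), interpolating linearly between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_map(realizations: Sequence[Grid], q: float) -> Grid:
    """Apply the percentile to every grid cell across all realizations."""
    n_rows = len(realizations[0])
    n_cols = len(realizations[0][0])
    return [
        [percentile([r[i][j] for r in realizations], q) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Five toy realizations on a 1 x 2 grid: cell values 0, 10, ..., 40 and 1, 11, ..., 41.
realizations = [[[float(10 * k + j) for j in range(2)]] for k in range(5)]
p10 = percentile_map(realizations, 10)
p50 = percentile_map(realizations, 50)
p90 = percentile_map(realizations, 90)
```

The spread between the 10th and 90th percentile maps at a given cell is a direct measure of the local uncertainty of the forecast.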


Fig. 7 Average image representing the (a) 10th, (b) 50th, and (c) 90th percentile forecasts for 29/4/2006


Fig. 8 Average image representing the (a) 10th, (b) 50th, and (c) 90th percentile forecasts for 7/8/2006


3.5 Discussion

Sharp transitions in pollutant levels from one day to the next are among the most difficult situations to forecast by means of machine learning techniques, which learn from historical events and thus require a certain amount of repetitiveness. As these situations are uncommon relative to the mean behavior, prediction becomes a difficult task. Nevertheless, the NN approach performed reasonably well on 29/4/2006 and 7/8/2006, that is, both situations were well forecasted within the range of risk defined by the 10th and 90th percentiles.

Geographic areas repeatedly subjected to daily PM10 values higher than 50 μg/m3 or located close to major roads, intersections, or transport infrastructures are considered pollution hotspots. The health and quality of life of people living and working close to those areas may be at risk due to increased levels of AP and noise (EC 2006). Therefore, the correct prediction of high pollutant levels and the identification of pollution hotspots are vital in order to equip the responsible entities with the tools to produce alerts and to facilitate sustainable AQ management. The simulated images (Figs. 7 and 8) show the areas with higher and lower PM10 values; some of those areas are hotspots. Critical risk areas are those with PM10 values higher than the 50 μg/m3 threshold. These were also well forecasted within the risk intervals of the 10th and 90th percentiles. The choice among the percentile forecasts might be based on an evaluation of the main spatial trend of the previous day (12 hours ahead). Through this hybrid approach, the uncertainty resulting from the prediction errors and from the local spatial variability of pollutants is quantified by the simulated maps, giving rise to critical risk areas. Thus, the resulting images provide a threshold of risk for use in risk assessment studies and by decision-makers. It is worth noting that the time prediction could be performed by using an alternative machine learning method or by deterministic simulation, as long as the chosen method is able to integrate as many factors into the prediction as the NN does. This means that the hybrid approach proposed in this study is, above all, a methodological framework, coupling a time predictor (in this case an NN) with a new geostatistical simulation methodology for the spatial characterization of a pollutant's concentration, in which the uncertainty in the time prediction is accounted for.
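As a small illustration of how percentile maps can be screened against the 50 μg/m3 threshold mentioned above, the sketch below flags grid cells that exceed it. The labels "likely" (10th percentile above the threshold) and "possible" (90th percentile above the threshold) are hypothetical conventions introduced here for illustration, not categories defined in this study, and the toy grids are invented values.

```python
from typing import List

Grid = List[List[float]]

PM10_LIMIT = 50.0  # daily PM10 threshold in ug/m3 cited in the text


def exceeds(grid: Grid, limit: float = PM10_LIMIT) -> List[List[bool]]:
    """Flag every grid cell whose value is above the threshold."""
    return [[value > limit for value in row] for row in grid]


# Toy 2 x 2 percentile maps for one forecast day.
p10 = [[35.0, 52.0], [48.0, 61.0]]
p90 = [[49.0, 70.0], [58.0, 90.0]]

likely = exceeds(p10)     # exceedance even under the low scenario
possible = exceeds(p90)   # exceedance only under the high scenario or in both
```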

4 Conclusions and Final Remarks

Our hybrid model for the prediction of AP in urban areas successfully addresses the typical issues related to the prediction of complex dynamic phenomena. It does so through the use of two main components: (i) NN AQ forecasting at the monitoring stations based on historical AQ and meteorological data; and (ii) stochastic simulation with bi-distributions in order to determine the spatial distribution of AQ in a fine grid. By integrating information from different scales and types we were able to identify the main characteristics associated with the occurrence of pollution episodes, as well as other space–time characteristics (e.g., intra-annual variability and climate variability, trends, and extremes). The best predictors among the available concentrations of pollutants and weather variables were selected independently for each monitoring station, which allowed optimization of the NN model for each one. Typically, NN models produce a unique forecast result that corresponds to a certain observed value. Consequently, any uncertainty in the result is not accounted for and thus propagates spatially when the result is extended to the spatial domain. One of the advantages of our approach is the inclusion of the uncertainty that arises from the prediction errors and from the spatial local variability of pollutants, which are evaluated through a bivariate distribution between predicted and real contaminant values. Therefore, uncertainty is accounted for and quantified by the simulated maps, thereby identifying critical risk areas.

The combination of NN models and stochastic simulation allows for the forecasting of pollutant concentrations with high spatial-temporal resolution and the identification of critical risk areas. The latter provide a threshold of risk for risk assessment studies in addition to supporting contingency plans and providing information for decision makers (assuming that it is possible to obtain data in real time). Moreover, the modeling tool developed in this study is applicable to local as well as regional domains and is able to identify patterns and predict future behavior. This hybrid methodology was successfully applied to the greater Lisbon area. Importantly, however, the flexible methodology allows our approach to be applied to other areas as it depends only on data availability. In addition, high peak pollutant values were reproduced in most cases by each model. The simplicity and cost efficiency of the proposed model, together with its performance capabilities, make it a very promising approach to the characterization of urban AQ. For the greater Lisbon area, for example, its application will contribute to efforts aimed at producing an integrated AQ surveillance system. However, certain aspects should be taken into account if improvements and further developments of the methodology are to be achieved. Primarily, maps of forecasted pollutant concentrations resulting from deterministic dispersion models can be used as local trends to condition the stochastic simulations, mostly during periods of abrupt changes in AP and in order to downweight local trends of a previous instant. In addition, the determination of trends and anomalies in the time series of a pollutant and the non-linear determination of predictors might optimize NN forecast performances.

Acknowledgements The authors acknowledge the Instituto de Meteorologia and Agência Portuguesa do Ambiente for the meteorological and environmental data, respectively. The authors also acknowledge the Fundação para a Ciência e Tecnologia from the Science, Technology and Superior Education Ministry, for supporting this research through grant SFRH/BD/27765/2006.

Appendix: Local Conditional Distributions of Observed Values Given the Predicted Value of a Contaminant

Bivariate distributions between predicted and equivalent real values $[Z^*(x_\alpha, t_{0-i}), Z(x_\alpha, t_{0-i})]$ ($\alpha = 1, \ldots, N_m$; $i = 1, \ldots, N_T$) are calculated at each monitoring station using an independent set of historical data not used in the training step of the NN model. These local bi-distributions measure the local accuracy of the temporal predictions at each monitoring station. Experimental bi-plots of two monitoring stations with different prediction accuracies are associated with different uncertainties. For example, a bi-plot with a significant cloud spread of high values has a high level of uncertainty regarding the real observed values for a certain predicted high value. A bi-plot with a tighter cloud of low and high values has less uncertainty regarding the real observed values. Based on the bi-plots, the local conditional distribution is determined for each day and for each location.

For a certain predicted value $z^*(x_\alpha, t_0)$ at instant $t_0$, the local conditional distribution $F[Z(x_\alpha, t_0) \mid Z^*(x_\alpha, t_0) = z^*]$ can be calculated from the bi-distributions $[Z^*(x_\alpha, t_{0-i}), Z(x_\alpha, t_{0-i})]$. A practical implementation of the evaluation of the local conditional distributions is presented in Horta and Soares (2010) and can be summarized as follows. First, the user must define the minimum number of data $N_c$ in each class of the conditional histogram $F[Z(x_\alpha, t_{0-i}) \mid Z^*(x_\alpha, t_0) = z^*]$. The pairs of values $[Z(x_i), Z^*(x_i)]$ are then ranked in increasing order of $Z^*$. For a given conditioning value $m = Z^*(x_j)$, the corresponding $Z(x_j)$ is found at the $j$th position of the ordered pairs $[Z(x_i), Z^*(x_i)]$. Hence, $F[Z(x) \mid Z^*(x) = m]$ comprises the $N_c$ values of $Z$ closest to $Z(x_j)$ in the rank-ordered list.
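The ranking procedure summarized above can be sketched in a few lines. This is only an illustrative reading of the description, not the implementation of Horta and Soares (2010): it assumes that "closest" means the $N_c$ pairs whose rank in $Z^*$ is nearest to the conditioning value, taken as a window of consecutive ranks that is shifted inward at the ends of the list. The function names and the sample pairs are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def local_conditional_sample(pairs: Sequence[Tuple[float, float]],
                             z_star: float, n_c: int) -> List[float]:
    """Observed values Z of the n_c pairs ranked closest to z_star.

    pairs holds (predicted Z*, observed Z) values from the historical
    validation set of one monitoring station.
    """
    if not 1 <= n_c <= len(pairs):
        raise ValueError("n_c must be between 1 and the number of pairs")
    ranked = sorted(pairs)                    # increasing order of Z*
    predicted = [p for p, _ in ranked]
    j = bisect_left(predicted, z_star)        # rank of the conditioning value
    # Window of n_c consecutive ranks around j, kept inside the list
    # (assumed handling of the tails).
    start = max(0, min(j - n_c // 2, len(ranked) - n_c))
    return sorted(obs for _, obs in ranked[start:start + n_c])


def conditional_cdf(sample: Sequence[float], z: float) -> float:
    """Empirical F[Z <= z | Z* = z_star] from the selected sample."""
    return sum(v <= z for v in sample) / len(sample)


# Made-up (predicted, observed) PM10 pairs for a single station.
history = [(10.0, 12.0), (20.0, 18.0), (30.0, 35.0), (40.0, 38.0),
           (50.0, 62.0), (60.0, 55.0), (70.0, 80.0)]
sample = local_conditional_sample(history, z_star=50.0, n_c=3)
# sample is [38.0, 55.0, 62.0]
```

With these made-up pairs, a predicted value of 50 yields a conditional sample in which two of the three observed values exceed 50, which is the kind of local uncertainty the stochastic simulation is meant to carry into the spatial domain.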

References

Agirre-Basurko E, Ibarra-Berastegi G, Madariaga I (2006) Regression and multilayer perceptron-based models to forecast hourly O3 and NO2 levels in the Bilbao area. Environ Model Softw 21:430–446

APA–Agência Portuguesa do Ambiente (2008) Evolução da qualidade do ar em Portugal entre 2001 e 2005. Report

Atkinson PM, Lloyd CD (2001) Ordinary and indicator kriging of monthly mean nitrogen dioxide concentrations in the United Kingdom. In: Monestiez P et al (eds) GeoENV VIII-geostatistics for environmental applications. Kluwer Academic, Norwell, pp 33–44

Bilonick RA (1983) Risk qualified maps of hydrogen ion concentration for the New York state area for 1966–1978. Atmos Environ 17:2513–2524

Bilonick RA (1985) The space–time distribution of sulfate deposition in the northeastern United States. Atmos Environ 19:1829–1845

Cobourn W, Dolcine L, French M, Hubbard M (2000) A comparison of nonlinear regression and neural network models for ground-level ozone forecasting. J Air Waste Manage Assoc 50:1999–2009

Cressie N, Kaiser MS, Daniels MJ, Aldworth J, Lee J, Lahiri SN, Cox LH (1999) Spatial analysis of particulate matter in an urban environment. In: Gomez-Hernandez J, Soares A, Froidevaux R (eds) GeoENV II-geostatistics for environmental applications. Kluwer Academic, Dordrecht, pp 41–52

Demuzere M, Trigo R, Arellano V, van Lipzig N (2009) The impact of weather and atmospheric circulation on O3 and PM10 levels at a rural mid-latitude site. Atmos Chem Phys 9:2695–2714

Dutot AL, Rynkiewicz J, Steiner FE, Rude J (2007) A 24-h forecast of ozone peaks and exceedance levels using neural classifiers and weather predictions. Environ Model Softw 22:1261–1269

EC–European Community (2006) Development of a methodology to assess population exposed to high levels of noise and air pollution close to major transport infrastructure. Final report. Entec UK limited

EEA–European Environment Agency (2011) Air quality in Europe. Technical report No 12/2011

EEA–European Environment Agency (2010) The European environment–state and outlook 2010: synthesis. European Environment Agency, Copenhagen

Franco C, Soares A, Delgado J (2006) Geostatistical modelling of heavy metal contamination in the topsoil of Guadiamar river margins (Spain) using a stochastic simulation technique. Geoderma 136(3–4):852–864

Gardner M, Dorling S (2000) Statistical surface ozone models: an improved methodology to account for non-linear behaviour. Atmos Environ 34:21–34

Gardner M, Dorling S (1999) Neural network modelling and prediction of hourly NOx and NO2 concentrations in urban air in London. Atmos Environ 33:709–719

Gomez-Hernandez J, Journel AG (1993) Joint sequential simulation of multi-Gaussian fields. In: Soares A (ed) Geostatistics TROIA’92. Kluwer Academic, Dordrecht, pp 85–94

Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford Univ. Press, New York

Hooyberghs J, Mensink C, Dumont G, Fierens F, Brasseur O (2005) A neural network forecast for daily average PM10 concentrations in Belgium. Atmos Environ 39:3279–3289

Horta A, Soares A (2010) Direct sequential co-simulation with joint probability distributions. Math Geosci 42:269–292

INE–Instituto Nacional de Estatística. www.ine.pt. Accessed 17 July 2012

Kolehmainen M, Martikainen H, Ruuskanen J (2001) Neural networks and periodic components used in air quality forecasting. Atmos Environ 35:815–825

Kyriakidis P, Journel A (1999) Geostatistical space–time models: a review. Math Geol 31(6):651–685

Lal B, Tripathy SS (2012) Prediction of dust concentration in open cast coal mine using artificial neural network. Atmos Pollut Res 3:211–218

Lamb RG (1983) A regional scale (1000 km) model of photochemical air pollution, part I: theoretical formulation. EPA 600/3-83-035. US Environmental Protection Agency, Research Triangle Park, NC

Luecken DJ, Hutzell WT, Gipson GL (2006) Development and analysis of air quality modeling simulations for hazardous air pollutants. Atmos Environ 40:5087–5096

Monestiez P, Meiring W, Sampson PD, Guttorp P (1997) Modelling non-stationary spatial covariance structure from space–time monitoring data. Ciba Foundation symposium 210:38–48; discussion 48–51, 68–78

Monteiro A, Vautard R, Borrego C, Miranda A (2005) Long-term simulations of photo oxidant pollution over Portugal using the CHIMERE model. Atmos Environ 39(17):3089–3101

Morris RE, Meyers TC (1990) User’s guide for the Urban Airshed Model. In: User’s manual for UAM (CB-IV). EPA-450/4-90-007A, vol. I. US Environmental Protection Agency, Research Triangle Park

Nunes C, Soares A (2005) Geostatistical space–time simulation model. Environmetrics 16:393–404

Pereira MJ (1999) Air quality modelling and simulation. PhD Dissertation, Instituto Superior Técnico, Lisbon, Portugal

Perez P, Reyes J (2002) Prediction of maximum of 24-h average of PM10 concentrations 30 h in advance in Santiago, Chile. Atmos Environ 36:4555–4561

Perez P, Trier A, Reyes J (2000) Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile. Atmos Environ 34:1189–1196

Russo A, Trigo R, Soares A (2008) Stochastic modelling applied to air quality space–time characterization. In: Soares A, Pereira MJ, Dimitrakopoulos R (eds) GeoENV VI-geostatistics for environmental applications. Springer, Berlin, pp 83–93

Soares A (2001) Sequential direct simulation and co-simulation. Math Geol 33(8):911–926

Soares A, Pereira MJ (2007) Space–time modelling of air quality for environmental-risk maps: a case study in South Portugal. Comput Geosci 33(10):1327–1336

Sokhi RS, San José R, Kitwiroon N, Fragkou E, Pérez JL, Middleton DR (2006) Prediction of ozone levels in London using the MM5–CMAQ modelling system. Environ Model Softw 21:566–576

Srivastava RM (1992) Reservoir characterization with probability field simulation. SPE Paper 24753

Trigo R, DaCamara C (2000) Circulation weather types and their impact on the precipitation regime in Portugal. Int J Climatol 20:1559–1581

Trigo R, Palutikof J (1999) Simulation of daily temperatures for climate change scenarios over Portugal: a neural network model approach. Clim Res 13:45–59

Turias I, González F, Martin M, Galindo P (2007) Prediction models of CO, SPM and SO2 concentrations in the Campo de Gibraltar region, Spain: a multiple comparison strategy. Environ Monit Assess 143:131–146
