Atmospheric Environment 40 (2006) 554–566
www.elsevier.com/locate/atmosenv

A hierarchical Bayesian approach to the spatio-temporal modeling of air quality data

A. Riccio a,*, G. Barone b, E. Chianese b, G. Giunta a

a Department of Applied Sciences, University of Naples "Parthenope", Via De Gasperi 5, 80133, Napoli, Italy
b Department of Chemistry, University of Naples "Federico II", Via Cintia, 80126, Napoli, Italy

* Corresponding author. Fax: +39 081 5522293. E-mail address: [email protected] (A. Riccio).

Received 15 April 2005; received in revised form 23 September 2005; accepted 28 September 2005

doi:10.1016/j.atmosenv.2005.09.070

Abstract

The statistical evaluation of an air quality model is part of a broader process, generally referred to as 'model assessment', which includes sensitivity analysis and other tools. The evaluation is usually carried out by comparing model-predicted data with point-wise observations. However, this analysis rests on several (implicit) assumptions which are difficult, if not impossible, to assess: e.g. unbiased observations, measurement errors small enough in comparison to the typical usage of observed data, observations representative of the true area-averaged values within each computational cell, numerical model errors small enough in comparison to mis/un-represented physics/chemistry, and so on.

In this work we address the problem of comparing point-measured data with cell-averaged model values. We present a Bayesian approach for the space-time interpolation of measured data and the prediction of cell-averaged values.

We used cell-averaged observations to validate the results from the CAMx air quality model. We found that a relevant fraction of the model bias can be explained by the subgrid spatial variability. This analysis may be important in all cases in which one is interested in a model and/or process comparison exercise.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Bayesian space-time interpolation; Sub-grid variability; Model evaluation; CAMx model

1. Introduction

Air quality models are a form of highly complex scientific hypothesis concerning natural and anthropogenic processes. Their success derives from the ability to systematically address several important environmental issues (where measurements play a limited role), such as source apportionment, air quality forecast, evaluation of long-term trends, and evaluation of the impact of future pollutant sources, e.g. freeways, incinerators, industrial complexes, and so on (Seinfeld and Pandis, 1998).

In the framework delineated by some recent European directives (1996/62/EC and 2002/3/EC), models play a relevant role: they integrate in a clear conceptual framework our understanding of atmospheric processes and their interactions, and fill in the gap between emission fluxes and air quality. According to the US EPA, the primary objective of the next generation of air quality models is to improve the environmental community's ability to evaluate the impact of air quality management practices for multiple pollutants at multiple scales (http://www.epa.gov/asmdnerl/).

The evaluation of an air quality model is of primary importance from a scientific, as well as a regulatory, point of view. However, several complex statistical issues accompany any validation procedure. Modeled data and measurements are affected by errors of a different nature. Measurements suffer from missing observations, systematic bias, and errors connected to the measurement and acquisition devices; modeled data are affected by uncertainties from mis/un-represented physics/chemistry, numerical errors, and so on. Furthermore, it is reasonable to hypothesize that measurements are influenced by local processes which are not equally represented by the model (the 'change of support' problem, as defined in the geostatistical literature; Cressie, 1993), due to the implicit and explicit diffusion introduced into the model for numerical/stability reasons. For example, McNair et al. (1996) found that individual concentration measurements only approximately represented the true area-averaged concentrations within a computational grid cell, and that significant spatial variations exist. The problem of comparing point measurements with cell-averaged model values is ignored in the majority of model evaluation studies. Moreover, measurements are not usually available at the same spatial locations as the model output, so a preliminary interpolation step is unavoidable; yet the interpolation of non-stationary spatio-temporal data, interacting on a wide variety of scales, is still an open statistical issue (Guttorp, 2003).

In recent years, the Bayesian statistical view has been acknowledged to be the most natural approach for combining various information sources while managing their associated uncertainties in a statistically consistent manner (Berliner, 2003; Riccio, 2005). Also, the comparison of point-referenced measured data with gridded data from numerical models is an area of active research in the statistical sciences (Fuentes and Raftery, 2005). In the following, we apply a hierarchical Bayesian approach to the statistical space-time modeling of measured ozone data, and use it to validate the results of the CAMx (Comprehensive Air quality Model with eXtensions, see http://www.camx.com) air quality model.

In Section 2 we briefly introduce the CAMx model set up. In Section 3 a Bayesian model evaluation approach is described; we specify a simple model for measurements in terms of an unobserved ground truth and estimate it in a Bayesian framework. We show how to consistently interpolate the measurements from monitoring sites to locations for which modeled data are available. Bayesian predictive analysis is employed to estimate cell-averaged values, conditioned on observed data; our approach also takes into account the uncertainties about the different spatial support and the lack of stationarity in the data. Finally, in Section 4 we evaluate the capabilities of our approach in reproducing the spatio-temporal characteristics of observed data; we use the results from our statistical model to evaluate the CAMx performance, exploring the differences with a 'traditional' evaluation approach (i.e. by comparing observed data with bi-linearly interpolated CAMx data).

2. The CAMx set up

The CAMx model, version 4.02, was used to simulate the air quality dynamics over the European region from 1 June to 31 August 2001. The most important physical representations used in this work are briefly summarized in the following.

Meteorological inputs (wind speed in each horizontal direction, temperature, pressure, relative humidity, water content, turbulent diffusivity) were derived from the MM5 model. The MM5 model was forced by the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis (TOGA archive). The CAMx grid used the same projection and horizontal grid structure/resolution as MM5; a pre-processing program simply windowed data from a portion of the MM5 grid and output the subset to the CAMx files. Fig. 1 shows the location of the CAMx grid points. Sixteen layers were used in the vertical direction, with a resolution of about 70 m near the surface. The turbulent diffusion coefficient in the vertical direction was calculated using the McNider and Pielke (1981) local gradient approach, while the horizontal diffusion coefficients were determined within CAMx using a deformation approach based on the method of Smagorinsky (1963). Gas-phase chemistry was based on the SAPRC99 chemical mechanism (Carter, 2000), adapted to include a 'bulk' aerosol chemistry.

The 2001 anthropogenic emission data from the EMEP database were used. They consist of annual emitted quantities, given for the 11 SNAP activity sectors and for the NOx, CO, SOx, NMVOC and NH3 families. The spatial distribution of emissions from the EMEP grid to the CAMx grid was performed using an intermediate grid at 30″ resolution. Since the soil type is known on this grid, a better emission apportionment was possible. VOC speciation was made according to the AEAT suggested profiles (Passant, 2002). Biogenic emissions of isoprene and terpenes were computed from vegetation and soil inventories (Simpson et al., 1999) of the model domain, using reference temperature and PAR conditions; these emissions were then tuned according to the meteorological conditions prevailing during the simulation period.

Fig. 1. Observed mean ozone concentration (upper left panel, color scale in µg/m³), interpolated over neighboring CAMx grid cells. The first left singular vector (upper right panel) of ozone observed data interpolated over the same cells. The first right singular vector (lower panel); only the data for the first five days (1 June to 5 June) are shown. The inner rectangle indicates the extension of the CAMx grid domain, circles the locations of EMEP monitoring stations, and dots the CAMx grid points. The singular vectors are reported in arbitrary units because they are scaled by the corresponding singular value.

3. The statistical model

3.1. Notation

Our statistical analysis utilizes two datasets: EMEP measurements,

$$O_{\{t,x\}} \equiv \{O_{(t,x)} : (t,x) \in \mathcal{O}\}$$

and CAMx modeled data,

$$M_{\{t,y\}} \equiv \{M_{(t,y)} : (t,y) \in \mathcal{M}\}.$$

We can look at $(t,x)$ and $(t,y)$ as the indexes of two lattices, $\mathcal{O}$ and $\mathcal{M}$, respectively; $\{\cdot,\cdot\}$ indicates a set of time and spatial locations. The spatial points of lattice $\mathcal{M}$ correspond to the grid points at which modeled data are available (the cell centers of the CAMx grid used to discretize the European region, see Fig. 1). $O_{\{t,x\}}$ represents the ozone observations available from the EMEP database (http://www.emep.int), at spatial locations $x$ and at the same time levels as those of lattice $\mathcal{M}$.

Due to their inherent uncertainties, it is helpful to set up a stochastic framework in order to analyze and compare the properties of the two datasets. $M_{\{t,y\}}$ and $O_{\{t,x\}}$ can be viewed as two realizations arising from the discretization of a continuous stochastic spatio-temporal process. We denote by $Z_{\{t,s\}}$ a realization of this process for spatial locations $s$ and times $t$.

Throughout the paper we also use the following notation: for a random vector $Y$, $[Y]$ represents the joint probability density function of $Y$, and $[Y \mid X]$ the conditional density of $Y$ given $X$; the symbol '$\sim$' denotes 'is distributed as', i.e. $Y \sim N(0, \boldsymbol{\Sigma})$ means that $Y$ is a normally distributed random vector with zero mean and covariance $\boldsymbol{\Sigma}$.

The dependence on time and/or space is dropped when clear from the context, i.e. $O$ and $O_{\{t,x\}}$ both indicate the observations measured at all available times $t$ and locations $x$. We also use the subscript $(t,\cdot)$ to indicate data at a specific time $t$ and all spatial locations, while $(\cdot,s)$ indicates data available at all times but at a specific spatial location $s$.

3.2. Methodology

The final goal of this work is the development of a statistical evaluation procedure for air quality models. To this aim, we need to estimate the probability density function of the 'true' process, $[Z]$, i.e. how it evolves in space and time, and how it relates to the two datasets, $M$ and $O$.

We can envision two problems:

1. Observations can be viewed as point data, while the data produced by the air quality model can be viewed as block data, i.e. areal averages over the model grid cells.

2. Observations are collected at points corresponding to lattice $\mathcal{O}$, while inference is needed at points corresponding to a different lattice $\mathcal{M}$.

In Section 3.2.1 we propose a Bayesian approach for the solution of the first problem, and in Section 3.2.2 a Bayesian interpolation procedure to solve the second.

3.2.1. The prediction of cell-averaged values

Let us assume that observed data have been interpolated from lattice $\mathcal{O}$ to lattice $\mathcal{M}$. This means that $[\boldsymbol{\theta}, Z_{\{t,y\}} \mid O]$, the posterior probability distribution function (PPDF), has been evaluated; $\boldsymbol{\theta}$ is the array of parameters introduced by the interpolation procedure.

The Bayesian solution to problem 1, i.e. the prediction at cell-averaged locations, can be cast as a problem equivalent to the prediction at a new set of locations, $\{t,s\}$, given the predicted values at sites $\{t,y\}$:

$$[Z_{\{t,s\}} \mid Z_{\{t,y\}}, O] = \int [Z_{\{t,s\}} \mid \boldsymbol{\theta}, Z_{\{t,y\}}, O]\,[\boldsymbol{\theta}, Z_{\{t,y\}} \mid O]\, d\boldsymbol{\theta}. \tag{1}$$

By drawing $Z_{\{t,s\}}$ from $[Z_{\{t,s\}} \mid \boldsymbol{\theta}^*, Z^*, O]$, where $(\boldsymbol{\theta}^*, Z^*) \sim [\boldsymbol{\theta}, Z_{\{t,y\}} \mid O]$, we obtain a sample from (1), which provides any desired inference about the true process at the selected locations and times.

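Assuming a Gaussian form for the joint probability distribution function,

$$[Z_{\{t,s\}}, Z_{\{t,y\}} \mid \boldsymbol{\theta}, O] \sim N\left( \begin{bmatrix} \boldsymbol{\mu}_{\{t,s\}}(\boldsymbol{\theta}) \\ \boldsymbol{\mu}_{\{t,y\}}(\boldsymbol{\theta}) \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{\{t,ss\}}(\boldsymbol{\theta}) & \boldsymbol{\Sigma}_{\{t,sy\}}(\boldsymbol{\theta}) \\ \boldsymbol{\Sigma}'_{\{t,sy\}}(\boldsymbol{\theta}) & \boldsymbol{\Sigma}_{\{t,yy\}}(\boldsymbol{\theta}) \end{bmatrix} \right), \tag{2}$$

$\boldsymbol{\mu}_{\{t,s\}}(\boldsymbol{\theta})$ and $\boldsymbol{\Sigma}_{\{t,ss\}}(\boldsymbol{\theta})$ indicate the mean and covariance of $Z_{\{t,s\}}$ at locations $\{t,s\}$, respectively; $\boldsymbol{\mu}_{\{t,y\}}(\boldsymbol{\theta})$ and $\boldsymbol{\Sigma}_{\{t,yy\}}(\boldsymbol{\theta})$ the mean and covariance at locations $\{t,y\}$; and $\boldsymbol{\Sigma}_{\{t,sy\}}(\boldsymbol{\theta})$ the covariance between $Z_{\{t,s\}}$ and $Z_{\{t,y\}}$. From standard multivariate statistical analysis (Mardia et al., 1980), it can be shown that

$$[Z_{\{s\}} \mid Z_{\{y\}}, \boldsymbol{\theta}, O] \sim N\left( \boldsymbol{\mu}_{\{s\}} + \boldsymbol{\Sigma}_{\{sy\}} \boldsymbol{\Sigma}_{\{yy\}}^{-1} (Z_{\{y\}} - \boldsymbol{\mu}_{\{y\}}),\; \boldsymbol{\Sigma}_{\{ss\}} - \boldsymbol{\Sigma}_{\{sy\}} \boldsymbol{\Sigma}_{\{yy\}}^{-1} \boldsymbol{\Sigma}'_{\{sy\}} \right), \tag{3}$$

where the dependence on $t$ and $\boldsymbol{\theta}$ has been dropped for clarity. Analogously, the prediction at model grid cells $\{B\} \equiv (B_1, \ldots, B_k)$ is

$$[Z_{\{B\}} \mid Z_{\{y\}}, \boldsymbol{\theta}, O] \sim N\left( \boldsymbol{\mu}_{\{B\}} + \boldsymbol{\Sigma}_{\{By\}} \boldsymbol{\Sigma}_{\{yy\}}^{-1} (Z_{\{y\}} - \boldsymbol{\mu}_{\{y\}}),\; \boldsymbol{\Sigma}_{\{BB\}} - \boldsymbol{\Sigma}_{\{By\}} \boldsymbol{\Sigma}_{\{yy\}}^{-1} \boldsymbol{\Sigma}'_{\{By\}} \right). \tag{4}$$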

$\boldsymbol{\mu}_{\{B\}}$ can be interpreted as the expected value of the spatial average of process $Z$ over the cells $\{B\}$. Formally, the cell-averaged mean and covariances are

$$\mu_{B_i} = |B_i|^{-1} \int_{B_i} Z_{(t,s)}\, ds, \tag{5a}$$

$$\Sigma_{B_i B_{i'}} = |B_i|^{-1} |B_{i'}|^{-1} \int_{B_i} \int_{B_{i'}} \Sigma_{ss'}\, ds\, ds', \tag{5b}$$

$$\Sigma_{B_i y} = |B_i|^{-1} \int_{B_i} \Sigma_{sy}\, ds, \tag{5c}$$

with $|B_i|$ denoting the area of the $i$th cell. As outlined in Gelfand et al. (2001), we can evaluate (5a–c) via Monte Carlo integration, i.e. we can select $L$ uniformly distributed points over cell $B_i$, $\{s_l : l = 1, \ldots, L\}$ with $s_l \in B_i$, and replace (5a) by

$$\mu_{B_i} \approx \frac{1}{L} \sum_{l=1}^{L} Z_{(t,s_l)}. \tag{6}$$

Cell covariances can be estimated similarly:

$$\Sigma_{B_i B_{i'}} \approx \frac{1}{L^2} \sum_{l=1}^{L} \sum_{m=1}^{L} \Sigma_{s_l s'_m}, \tag{7a}$$

$$\Sigma_{B_i y} \approx \frac{1}{L} \sum_{l=1}^{L} \Sigma_{s_l y}, \tag{7b}$$

where $s_l \in B_i$ and $s'_m \in B_{i'}$. These approximations can be made arbitrarily accurate by letting $L$ be sufficiently large. In this work we used $L = 16$, and the results did not change with larger values. Eqs. (6) and (7a–b), inserted into Eq. (4), allow us to draw samples of the cell-averaged process.
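The Monte Carlo block averaging of Eqs. (7a–b) and the conditional moments of Eq. (4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the isotropic exponential covariance on planar coordinates, the rectangular cells, and the function names `exp_cov`, `block_covariances` and `conditional_block` are all assumptions made for the example. The same sampled points are used for both integrals in (7a), a simplification that keeps the estimated block covariance positive semi-definite.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, length=1.0):
    """Exponential covariance between point sets a (n, 2) and b (m, 2)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-dist / length)

def block_covariances(cells, y, n_pts=16, seed=0):
    """Monte Carlo estimates of Eqs. (7a)-(7b).

    cells: (k, 4) rectangles (xmin, xmax, ymin, ymax); y: (n, 2) point sites.
    Returns Sigma_BB (k, k) and Sigma_By (k, n).
    """
    rng = np.random.default_rng(seed)
    k = len(cells)
    # n_pts uniformly distributed points inside every cell
    px = rng.uniform(cells[:, None, 0], cells[:, None, 1], size=(k, n_pts))
    py = rng.uniform(cells[:, None, 2], cells[:, None, 3], size=(k, n_pts))
    pts = np.stack([px, py], axis=-1).reshape(k * n_pts, 2)
    # Average the point-to-point covariances over the sampled points
    c_bb = exp_cov(pts, pts).reshape(k, n_pts, k, n_pts).mean(axis=(1, 3))
    c_by = exp_cov(pts, y).reshape(k, n_pts, len(y)).mean(axis=1)
    return c_bb, c_by

def conditional_block(mu_b, mu_y, z_y, c_bb, c_by, c_yy):
    """Mean and covariance of the Gaussian in Eq. (4)."""
    w = np.linalg.solve(c_yy, c_by.T).T      # Sigma_By Sigma_yy^{-1}
    return mu_b + w @ (z_y - mu_y), c_bb - w @ c_by.T

# Two unit cells and three point sites (illustrative values only)
cells = np.array([[0.0, 1.0, 0.0, 1.0], [1.0, 2.0, 0.0, 1.0]])
y = np.array([[0.5, 0.5], [1.5, 0.5], [1.0, 1.5]])
c_bb, c_by = block_covariances(cells, y, n_pts=16)
c_yy = exp_cov(y, y)
z_y = np.array([1.0, -1.0, 0.5])
mean, cov = conditional_block(np.zeros(2), np.zeros(3), z_y, c_bb, c_by, c_yy)
```

A draw of the cell-averaged process then follows from a multivariate normal with these moments, repeated for each posterior sample of the parameters and of the process at the grid points.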

3.2.2. The interpolation procedure

The interpolation procedure was implemented within a conditional hierarchical Bayesian framework. In essence, this hierarchical approach is based on the formulation of three basic conditional models (Berliner, 1996):

Stage 1: Data model: $[O \mid Z, \boldsymbol{\theta}_1]$

Stage 2: Process model: $[Z \mid \boldsymbol{\theta}_2]$

Stage 3: Parameter model: $[\boldsymbol{\theta}_1, \boldsymbol{\theta}_2]$

$\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$ represent generic arrays of parameters introduced in the modeling. Bayesian probability theory ensures that inference and prediction on the process and parameters can be obtained from the PPDF:

$$[Z, \boldsymbol{\theta}_1, \boldsymbol{\theta}_2 \mid O] \propto [O \mid Z, \boldsymbol{\theta}_1] \times [Z \mid \boldsymbol{\theta}_2] \times [\boldsymbol{\theta}_1, \boldsymbol{\theta}_2]. \tag{8}$$

The major advantage of this conditional approach is that statistical and physical reasoning can be more easily introduced at each stage of the hierarchy (Berliner, 2003). Similar interpolation procedures have already been applied in the environmental sciences (Berliner et al., 2003; Wikle and Berliner, 2005; Wikle et al., 2001). See also Wikle (2003) for an overview of applications of Bayesian hierarchical models in the environmental sciences.

The data model:

$$O_{(t,\cdot)} = K Z_{(t,\cdot)} + \boldsymbol{\varepsilon}_{(t,\cdot)}. \tag{9}$$

The data model relates observations to the 'true' process, $Z_{\{t,y\}}$. For each monitoring site, matrix $K$ considers the nearest 10 grid points within a cutoff distance of 150 km and weights those points by the inverse squared distance. We also assumed that the 'measurement errors', $\varepsilon_{(t,x)}$, were normally and independently distributed in space and time with a constant variance, i.e. $\boldsymbol{\varepsilon}_{(t,\cdot)} \sim N(0, I\sigma_\varepsilon^2)$, where $I$ is the identity matrix.
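One way to assemble a matrix like $K$ is sketched below. The text specifies only the neighbor count, the cutoff and the inverse-square weighting; normalizing each row to sum to one, the haversine formula for distances, the Earth radius value, and the names `great_circle_km` and `build_K` are assumptions of this example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km; inputs in degrees, broadcastable."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin((p2 - p1) / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def build_K(sites, grid, n_nearest=10, cutoff_km=150.0, eps_km=1e-6):
    """sites: (n, 2) lat/lon of stations; grid: (m, 2) lat/lon of grid points."""
    dist = great_circle_km(sites[:, None, 0], sites[:, None, 1],
                           grid[None, :, 0], grid[None, :, 1])
    K = np.zeros_like(dist)
    for i, d in enumerate(dist):
        idx = np.argsort(d)[:n_nearest]      # nearest grid points
        idx = idx[d[idx] <= cutoff_km]       # drop those beyond the cutoff
        if idx.size == 0:
            continue                         # no grid point in range: row stays zero
        w = 1.0 / np.maximum(d[idx], eps_km) ** 2
        K[i, idx] = w / w.sum()              # assumed normalization: rows sum to one
    return K

# Hypothetical half-degree grid and two stations
grid = np.array([(la, lo) for la in np.arange(44, 48, 0.5)
                 for lo in np.arange(5, 9, 0.5)])
sites = np.array([[46.0, 7.0], [45.2, 6.3]])
K = build_K(sites, grid)
```

With this construction a station that coincides with a grid point takes essentially all of its weight from that point, which is the limiting behavior of inverse-distance weighting.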

Of course, the $O$ values are non-negative, since we are modeling the ozone concentration, but Eq. (9) does not guarantee non-negativity. From a practical point of view, ozone values rarely fall below 30–40 µg m⁻³, and we did not experience any serious problem with the normality assumption. In more general settings, e.g. those with a pronounced deviation from Gaussianity, model (9) is not adequate. Although plausible, model (9) is admittedly rather simple, but the investigation of the appropriate model at this stage is a research problem in itself, well beyond the scope of this work.

The process model: The process model was given as conditional on two processes, denoted by $\boldsymbol{\mu}_{\{t,y\}}$ and $X_{\{t,y\}}$:

$$Z_{(t,y)} = \mu_{(t,y)} + X_{(t,y)} + \gamma_{(t,y)}. \tag{10}$$

$\mu_{(t,y)}$ models a spatio-temporal trend surface, and $X_{(t,y)}$ the short-scale correlations. The $\gamma_{(t,y)}$'s are Gaussian random variables which model noise and represent the unexplained variations at this second level of the hierarchy, $\boldsymbol{\gamma}_{(t,\cdot)} \sim N(0, I\sigma_\gamma^2)$.

For the statistical interpolation of spatio-temporal data, the simplest approach is to model the large-scale spatial trend as a function of some covariates, e.g. a polynomial function of latitude and longitude (Fuentes, 2000; Wikle et al., 1998). However, we found that simple spatial dependence relationships were inadequate for this problem, due to the complex large-scale ozone spatial pattern (see Fig. 1). The mean level increases from the northwestern countries to the southeastern Mediterranean area, but also depends on the altitude (see the alpine Austrian region, Fig. 1). It is believed that the higher levels at high-elevation sites reflect inputs from the free troposphere (Vingarzan, 2004). Superimposed on the large-scale spatial trend, a seasonal cycle, forced by the variability of the solar actinic flux, is present; the amplitude of the seasonal cycle depends on the time of the year and on altitude as well.

A singular value decomposition (SVD) analysis (Press et al., 1992) was employed to capture the spatio-temporal trend of ozone data. The first left and right singular vectors of the matrix of observed data (one column for each time series) were calculated, and interpolated over neighboring grid points using a thin-plate spline. Not surprisingly, the first left singular vector (the strongest spatial pattern) closely resembles the mean ozone concentration, while the first right singular vector (the strongest temporal pattern) captures the seasonal cycle (see Fig. 1). $\mu_{(t,y)}$ was defined as the value from the first singular vectors interpolated over the CAMx grid cell centers. Thanks to this approach, we were able to model the complex spatial and seasonal behavior of observed ozone data; moreover, non-stationarity was partially taken into account through the spatio-temporally varying mean.
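The leading singular vectors can be extracted as in the sketch below, which uses synthetic data. It assumes the data matrix is arranged with one row per station and one column per time level, so that the left singular vector is the spatial pattern as described above; the function name `first_mode` and the synthetic field are illustrative, and the thin-plate spline interpolation step is not shown.

```python
import numpy as np

def first_mode(obs):
    """obs: (n_sites, n_times). Returns the first left singular vector (spatial),
    the first singular value, the first right singular vector (temporal),
    and the rank-1 trend they define."""
    u, s, vt = np.linalg.svd(obs, full_matrices=False)
    u1, s1, v1 = u[:, 0], s[0], vt[0]
    # Singular vectors are defined up to a common sign; make the spatial one positive
    if u1.sum() < 0:
        u1, v1 = -u1, -v1
    return u1, s1, v1, s1 * np.outer(u1, v1)

# Synthetic example: a spatial gradient modulated by a daily cycle, plus noise
rng = np.random.default_rng(0)
spatial = np.linspace(40.0, 100.0, 6)
hours = np.arange(120)
temporal = 1.0 + 0.3 * np.sin(2 * np.pi * hours / 24)
obs = np.outer(spatial, temporal) + rng.normal(scale=1.0, size=(6, 120))
u1, s1, v1, trend = first_mode(obs)
```

The rank-1 product of the two vectors, scaled by the singular value, gives a trend value at every station and time; the values at grid cell centers would come from interpolating the vectors in space.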

We found that the correlations of the short-scale process, $X_{(t,y)}$, decay exponentially, both in time and space, but with a spatially varying variance. For this reason, we assumed a non-stationary, separable form for the covariance function:

$$\mathrm{Cov}(X_{(t,s)}, X_{(t',s')}) = \kappa(s)\,\kappa(s')\,\sigma_x^2\, e^{-\|s-s'\|/(d\,\Delta x)}\, e^{-|t-t'|/(\tau\,\Delta t)}, \tag{11}$$

where $\|\cdot\|$ denotes the great-circle distance (the shortest distance between any two points along the surface of the Earth), and $d$ and $\tau$ dictate the decay of spatial and temporal correlations, respectively. $\Delta x$ and $\Delta t$ are appropriate scale parameters. $\kappa$ was assumed to be a relatively smooth spatial field, with $\log(\kappa) \sim N(0, C)$, where $C$ is an exponential correlation function with a strong spatial dependence (i.e. $C(s,s') = e^{-\|s-s'\|/(25\Delta x)}$). $s$, $s'$, $t$ and $t'$ are generic spatial and time locations.
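Eq. (11) translates almost directly into code. In the sketch below the default parameter values are the posterior means reported in Table 1 and the scales $\Delta x \approx 81$ km and $\Delta t \approx 1$ hour given in Section 4.1; the haversine distance, the Earth radius and the names `great_circle_km` and `cov_X` are assumptions of the example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km; inputs in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dlam = np.radians(lon2) - np.radians(lon1)
    a = np.sin((p2 - p1) / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def cov_X(t1, s1, t2, s2, kappa1=1.0, kappa2=1.0,
          sigma_x=17.22, d=1.083, tau=1.812, dx_km=81.0, dt_h=1.0):
    """Separable covariance of Eq. (11); s1, s2 are (lat, lon), t1, t2 in hours."""
    dist = great_circle_km(s1[0], s1[1], s2[0], s2[1])
    spatial = np.exp(-dist / (d * dx_km))
    temporal = np.exp(-abs(t1 - t2) / (tau * dt_h))
    return kappa1 * kappa2 * sigma_x ** 2 * spatial * temporal

same = cov_X(0.0, (45.0, 10.0), 0.0, (45.0, 10.0))    # variance at one point
lagged = cov_X(0.0, (45.0, 10.0), 1.812, (45.0, 10.0)) # one e-folding time later
apart = cov_X(0.0, (45.0, 10.0), 0.0, (46.0, 10.0))    # about 111 km apart
```

Because the function factors into a purely spatial and a purely temporal term, the covariance matrix over all space-time points has Kronecker structure, which is what makes the computations in the next paragraph tractable.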

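The separability between space and time was assumed essentially for computational reasons, since it makes the computation for the change of support problem feasible. For example, if $S$ denotes the number of spatial locations ($\approx 2652$, the number of CAMx grid cells), $T$ the number of time levels ($\approx 3648$, the number of hours during the simulation period), and $\boldsymbol{\Sigma}_X$ the $(ST) \times (ST)$ covariance matrix, then

$$\boldsymbol{\Sigma}_X = \boldsymbol{\Sigma}_{(1)} \otimes \boldsymbol{\Sigma}_{(2)}, \tag{12}$$

where $\boldsymbol{\Sigma}_{(1)}$ is an $S \times S$ matrix, $\boldsymbol{\Sigma}_{(2)}$ is a $T \times T$ matrix, and '$\otimes$' is the Kronecker product. By the properties of the Kronecker product, it can be shown that $|\boldsymbol{\Sigma}_X| = |\boldsymbol{\Sigma}_{(1)}|^T |\boldsymbol{\Sigma}_{(2)}|^S$ and $\boldsymbol{\Sigma}_X^{-1} = \boldsymbol{\Sigma}_{(1)}^{-1} \otimes \boldsymbol{\Sigma}_{(2)}^{-1}$. In other words, even though $\boldsymbol{\Sigma}_X$ is $(ST) \times (ST)$, we only need the determinant and the inverse of an $S \times S$ and a $T \times T$ matrix, so that the problem remains computationally tractable.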
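The two Kronecker identities can be checked numerically on a small case. The sketch below uses toy one-dimensional exponential correlation matrices (an assumption made for brevity) and also shows a solve that never forms the large matrix; the helper names `exp_corr` and `kron_solve` are illustrative.

```python
import numpy as np

def exp_corr(n, scale):
    """Exponential correlation matrix on a regular 1-D index (toy example)."""
    idx = np.arange(n)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / scale)

S, T = 4, 5
sig1 = exp_corr(S, 1.083)        # spatial factor, S x S
sig2 = exp_corr(T, 1.812)        # temporal factor, T x T
full = np.kron(sig1, sig2)       # (ST) x (ST), built here only for checking

# log|Sigma_X| = T log|Sigma_(1)| + S log|Sigma_(2)|
logdet_factored = T * np.linalg.slogdet(sig1)[1] + S * np.linalg.slogdet(sig2)[1]
logdet_full = np.linalg.slogdet(full)[1]

# Sigma_X^{-1} = Sigma_(1)^{-1} kron Sigma_(2)^{-1}
inv_factored = np.kron(np.linalg.inv(sig1), np.linalg.inv(sig2))

def kron_solve(a, b, v):
    """Solve (a kron b) x = v using only the small factors.

    With row-major reshaping, (a kron b) vec(X) = vec(a X b^T)."""
    V = v.reshape(a.shape[0], b.shape[0])
    X = np.linalg.solve(a, V)          # a^{-1} V
    X = np.linalg.solve(b, X.T).T      # (a^{-1} V) b^{-T}
    return X.reshape(-1)

v = np.arange(S * T, dtype=float)
x = kron_solve(sig1, sig2, v)
```

For the sizes quoted in the text, the factored route works with one matrix of order about 2.7 thousand and one of order about 3.6 thousand instead of a single matrix of order close to ten million.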

We found that this simple approach was adequate to reproduce the spatio-temporal correlations of observed data, as shown in Section 4.

The parameter model: To complete the hierarchy we must specify the distributions for all the parameters from the previous stages, i.e. $\{\sigma_\varepsilon^2, \sigma_\gamma^2, \sigma_x^2, d, \tau, \kappa\}$.

In this work, we used conjugate distributions for all parameters, i.e. inverse gamma distributions for the variances and for the $d$ and $\tau$ parameters, and normal distributions for all other parameters. Due to this choice, the derivation and implementation of the full posterior distributions was straightforward. The prior means and variances for the hyperparameters were estimated in a preliminary exploratory phase, using observed data from the same time period.
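As an illustration of why conjugacy makes the implementation straightforward, the sketch below shows the standard inverse-gamma update for the variance of zero-mean Gaussian residuals, which is the kind of step a Gibbs sampler would use for a noise variance such as $\sigma_\varepsilon^2$. The prior moments, the synthetic residuals and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def ig_from_moments(mean, std):
    """Shape a and scale b of an inverse gamma with the given mean and std.

    Uses mean = b / (a - 1) and var = mean^2 / (a - 2), valid for a > 2."""
    a = 2.0 + (mean / std) ** 2
    b = mean * (a - 1.0)
    return a, b

def sample_variance(resid, a, b, rng):
    """One Gibbs draw of sigma^2 given residuals r_i ~ N(0, sigma^2)
    and the prior sigma^2 ~ InvGamma(a, b)."""
    a_post = a + 0.5 * resid.size
    b_post = b + 0.5 * np.sum(resid ** 2)
    # If G ~ Gamma(shape=a_post, scale=1/b_post), then 1/G ~ InvGamma(a_post, b_post)
    return 1.0 / rng.gamma(shape=a_post, scale=1.0 / b_post)

rng = np.random.default_rng(0)
resid = rng.normal(scale=1.35, size=5000)      # synthetic residuals
a, b = ig_from_moments(mean=4.0, std=4.0)      # hypothetical prior on the variance
draws = np.array([sample_variance(resid, a, b, rng) for _ in range(2000)])
sigma_hat = np.sqrt(draws.mean())
```

With thousands of residuals the data term dominates the prior, which mirrors the remark in Section 4.1 that the posterior is dominated by the likelihood.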

3.3. Numerical exploration

From Eq. (8), it is seen that all information about the parameter vector and the process is contained in the PPDF. Once $[Z, \boldsymbol{\theta} \mid \cdot]$ is formulated, one can estimate $Z$ and $\boldsymbol{\theta}$, and the corresponding probability bounds. A Markov Chain Monte Carlo approach was used to explore the PPDF. The basic procedure of Monte Carlo simulation is to draw a large set of samples $\{\boldsymbol{\theta}^{(i)}\}_{i=1}^{N}$ from a target distribution (the PPDF in this work). One can then estimate the expectation of any function $f(\boldsymbol{\theta})$ by the sample mean:

$$E(f) = \int [\boldsymbol{\theta}, \cdot \mid \cdot]\, f(\boldsymbol{\theta})\, d\boldsymbol{\theta} \approx \frac{1}{N} \sum_{i=1}^{N} f(\boldsymbol{\theta}^{(i)}). \tag{13}$$

Note also that the $N$ samples can be used to obtain the MAP (maximum a posteriori) estimate

$$\boldsymbol{\theta}_{\mathrm{MAP}} = \arg\max_{\boldsymbol{\theta}^{(i)}} [\boldsymbol{\theta}, \cdot \mid \cdot]. \tag{14}$$

The $\{\boldsymbol{\theta}^{(i)}\}_{i=1}^{N}$ samples were obtained by Gibbs sampling (Gilks et al., 1996) for all parameters, with the exception of the $d$, $\tau$ and $\kappa$ spatio-temporal dependence parameters, for which a standard distribution is not available; in this case a random-walk Metropolis algorithm (Metropolis et al., 1953) was used. To facilitate sampling, we transformed to $\theta = \log d$, $\lambda = \log \tau$ and $\pi = \log \kappa$, and subsequently used Gaussian proposals on these transformed parameters; the random walk variances were automatically adjusted during an initial phase to achieve an acceptance ratio of approximately 50%. Iterations were performed until convergence, as measured by the Gelman and Rubin test (Gelman and Rubin, 1992). Posterior means/MAP values for all quantities were estimated after convergence ($\approx 1000$ iterations), and errors were computed by batching, to account for the correlation in the Markov chain (Roberts, 1996). Three thousand iterations were used to gain confidence in the estimated errors.
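A random-walk Metropolis step on a log-transformed positive parameter, with step-size tuning during an initial phase, might look like the sketch below. It is not the authors' sampler: the toy Gamma target, the tuning rule (rescaling the step every 50 iterations), the iteration counts and the names `metropolis_log_rw` and `batch_se` are assumptions. The sample mean and the sampled MAP at the end correspond to Eqs. (13) and (14).

```python
import numpy as np

def metropolis_log_rw(log_post, x0, n_iter=3000, n_adapt=1000,
                      target=0.5, step=1.0, seed=0):
    """Random-walk Metropolis for a positive scalar x, proposing on theta = log x."""
    rng = np.random.default_rng(seed)
    theta = np.log(x0)
    # The density of theta = log x includes the Jacobian dx/dtheta = exp(theta)
    lp = log_post(np.exp(theta)) + theta
    samples = np.empty(n_iter)
    accepted = 0
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal()    # Gaussian proposal
        lp_prop = log_post(np.exp(prop)) + prop
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
            accepted += 1
        samples[i] = np.exp(theta)
        # Tune the step only during the initial phase, every 50 iterations
        if i < n_adapt and (i + 1) % 50 == 0:
            step *= np.exp(accepted / 50 - target)
            accepted = 0
    return samples[n_adapt:], step

def batch_se(x, n_batches=20):
    """Standard error of the mean from batch means (allows for autocorrelation)."""
    means = np.array([b.mean() for b in np.array_split(x, n_batches)])
    return means.std(ddof=1) / np.sqrt(n_batches)

# Toy target: Gamma(shape=3, rate=2), with mean 1.5 and mode 1.0
log_gamma = lambda x: 2.0 * np.log(x) - 2.0 * x
draws, step = metropolis_log_rw(log_gamma, x0=1.0, n_iter=21000, n_adapt=1000, seed=1)
post_mean = draws.mean()                         # Eq. (13) with f(x) = x
x_map = draws[np.argmax(log_gamma(draws))]       # Eq. (14), over sampled values
se = batch_se(draws)
```

Proposing on the log scale keeps every draw positive without rejecting proposals at the boundary, at the cost of the Jacobian term added to the log-density.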


4. Results

4.1. Inference and model assessment

In this section we present the results from our interpolation procedure and explore the capabilities of this approach in reproducing the statistical properties of observed data. Table 1 shows the prior means and standard deviations for all parameters, and the corresponding posterior estimates.

We let $\Delta x$ be the CAMx grid resolution ($\approx 81$ km), and $\Delta t$ the temporal resolution of the CAMx and EMEP data ($\approx 1$ hour). Due to the large amount of data, the posterior distribution is dominated by the likelihood. In many cases the posterior means do not significantly differ from their prior counterparts, indicating that these parameters, for example $\sigma_\gamma$, were correctly estimated in the preliminary exploratory phase; in some cases, for example for $\sigma_x$, $d$ and $\tau$, the posterior means differ from their prior counterparts, indicating 'Bayesian learning'. As can be noted from the posterior estimates of the $d$ and $\tau$ parameters, 1.083 and 1.812, respectively, the residuals from the $X$ process are significantly correlated over distances comparable to the size of the CAMx grid cells, with a significant contribution coming from ozone values in the previous hours.

Recently, a workshop on Bayesian hierarchical modeling (Cressie, 2000) produced a 'position paper' on outstanding issues related to this topic. The role of several diagnostic methods for model assessment was discussed; among others, cross-validation and posterior predictive checks were primarily suggested for checking model adequacy.

Table 1
Gibbs sampler results

| Model parameter | Prior mean | Prior std. dev. | Posterior mean (std. error) |
|---|---|---|---|
| $\sigma_\varepsilon$ (µg m⁻³) | 2.0 | 1.0 | 1.35 (0.12) |
| $\sigma_\gamma$ (µg m⁻³) | 10.0 | 5.0 | 11.82 (0.35) |
| $\sigma_x$ (µg m⁻³) | 14.0 | 7.0 | 17.22 (0.21) |
| $d$ | 2.0 | 2.0 | 1.083 (0.016) |
| $\tau$ | 1.0 | 1.0 | 1.812 (0.021) |
| 'mean $\kappa$' | 1.0 | 1.0 | 1.09 (0.02) |

We used cross-validation to test whether the statistical model correctly predicts observed data in space and/or time. For example, consider the data represented in Fig. 2. We ran a separate Markov chain, but left out the EMEP data at 1500 UTC (the time at which the maximum ozone concentration is usually attained), corresponding to $\approx 5\%$ of the total amount of data, and we then compared the posterior medians at the EMEP monitoring sites with the withheld observations. Fig. 2b shows a close correspondence with observed data; experiments with a greater amount of removed data (up to 10% of the total amount) gave similar results, with nearly linear relationships between observed values and posterior medians. Figs. 2c and 2d show an important feature of Bayesian analysis: as already discussed in Section 3.3, the value of any function can be estimated by means of the Monte Carlo procedure. In this example the lower and upper quartiles are shown, thus summarizing the uncertainty in the estimation of the spatial map.
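Given posterior draws at the withheld sites, the medians and quartiles used in this comparison are simple sample quantiles. The sketch below uses synthetic draws and synthetic withheld observations; the summary scores (bias, RMSE and interquartile coverage) and the function names are choices of this example, not quantities reported in the paper.

```python
import numpy as np

def posterior_summary(draws):
    """draws: (n_draws, n_sites). Lower quartile, median, upper quartile per site."""
    q25, q50, q75 = np.percentile(draws, [25, 50, 75], axis=0)
    return q25, q50, q75

def holdout_scores(obs, draws):
    """Compare withheld observations with posterior summaries at the same sites."""
    q25, q50, q75 = posterior_summary(draws)
    bias = np.mean(q50 - obs)
    rmse = np.sqrt(np.mean((q50 - obs) ** 2))
    coverage = np.mean((obs >= q25) & (obs <= q75))  # about 0.5 if well calibrated
    return bias, rmse, coverage

# Synthetic check: draws and withheld data scattered around the same truth
rng = np.random.default_rng(0)
truth = np.linspace(60.0, 120.0, 200)
draws = truth + rng.normal(scale=5.0, size=(2000, 200))
obs = truth + rng.normal(scale=5.0, size=200)
bias, rmse, coverage = holdout_scores(obs, draws)
```

An interquartile coverage far from one half would suggest that the posterior spread is too narrow or too wide relative to the withheld data.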

We were also interested in assessing whether the model reproduces the spatio-temporal correlations of observed data. This was checked by calculating the spatial correlations between time series (drawn from the PPDF) at different time lags. Fig. 3 shows the results of this comparison. The spatial correlations for observed data (upper left corner of Fig. 3) decay exponentially from about 0.7 at distances comparable to the CAMx grid resolution to about 0.2 at large distances; the same pattern can be discerned for the spatial correlations of data drawn from the PPDF. Uncertainties in replicating the spatial correlations are quantified by standard deviations whose magnitude is about 0.1. The bins at short distances are obtained by averaging over a smaller number of elements, but this effect is automatically incorporated in the Bayesian estimation procedure and is reflected in greater uncertainties at short distances. The close correspondence between the correlations of observed and replicated data indicates that the decay of spatio-temporal correlations is correctly reproduced.
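A distance-binned, time-lagged correlation of the kind plotted in Fig. 3 can be computed as in the sketch below. The 100 km bin width, the use of each station pair once (upper triangle only) and the synthetic series are assumptions; the function name `binned_correlation` is illustrative.

```python
import numpy as np

def binned_correlation(series, dist_km, lag=0, bins=np.arange(0, 1001, 100)):
    """Mean correlation between station pairs, grouped by inter-site distance.

    series: (n_times, n_sites); dist_km: (n_sites, n_sites); lag in time steps.
    Returns one value per distance bin (NaN where a bin has no pairs)."""
    n_t, n_s = series.shape
    a = series[: n_t - lag]
    b = series[lag:]
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    corr = a.T @ b / a.shape[0]      # corr[i, j]: site i at t against site j at t + lag
    iu = np.triu_indices(n_s, k=1)   # each pair of distinct sites once
    values = corr[iu]
    which = np.digitize(dist_km[iu], bins) - 1
    out = np.full(len(bins) - 1, np.nan)
    for k in range(len(bins) - 1):
        sel = which == k
        if sel.any():
            out[k] = values[sel].mean()
    return out

# Synthetic check: a common white-noise signal plus independent site noise
rng = np.random.default_rng(0)
n_t, n_s = 5000, 8
series = rng.normal(size=(n_t, 1)) + rng.normal(size=(n_t, n_s))
pos = 50.0 * np.arange(n_s)                    # stations on a line, 50 km apart
dist = np.abs(pos[:, None] - pos[None, :])
c0 = binned_correlation(series, dist, lag=0)   # about 0.5 in the filled bins
c1 = binned_correlation(series, dist, lag=1)   # about 0 for white noise
```

Applying the same function to each posterior draw of the time series, and then taking the mean and standard deviation across draws in every bin, would give the circles and vertical bars described in the caption of Fig. 3.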

Finally, it is instructive to examine the sensitivity of our model to data availability. It is clear from Fig. 1 that the monitoring stations are irregularly distributed: the majority reside in the northern part of the domain, while only two monitoring stations are present in the Mediterranean area. Fig. 4 shows the posterior densities of the area-averaged $Z$ process at two grid cells: the first is a grid cell in central Europe with three monitoring stations located within it, while the second is located near the urban area of Rome (central Italy). As expected, the posterior density for the second case is 'wider', indicating a greater uncertainty in the posterior assessment of the 'true' value. Of course, we do not expect our interpolation approach to work well in regions where data are scarce or absent; for this reason, the results of the interpolation procedure are examined only for those cells containing at least one monitoring station.

Fig. 2. (a) Observed (EMEP) ozone concentration (µg m⁻³) at monitoring stations at 1500 UTC, 2 June 2001. (b) Corresponding posterior median, (c) lower quartile, and (d) upper quartile values (EMEP data at that time excluded), obtained by drawing samples from the PPDF.

4.2. The quantification of spatial inhomogeneity effects

One objective of this work was to address the issue related to the comparison of site measurements against grid cell estimates. In our analysis, data from 115 monitoring stations were used; for 44 stations the nearest neighbor is located within a distance $\Delta x$ ($\approx 81$ km); for 88 stations, i.e. $\approx 75\%$ of the total number, within a distance $2\Delta x$.

A first indication of the strength of local effects is gained by comparing measured concentrations located close to one another; if two sites are in the same, or a neighboring, grid cell, but report very different pollutant concentrations, this may indicate that local gradients or inhomogeneities are significant, and this should be taken into account when


Fig. 3. Decay of spatial correlations of ozone time series for observations (‘x’) and data drawn from the PPDF (circles with vertical bars) at different temporal lags. The extension of vertical bars indicates the standard deviation. [Six panels, Δt = 0, 1, 2, 3, 4 and 5 hr; horizontal axis: inter-site distance (km), 0 to 1000; vertical axis: correlation coefficient, 0 to 1.]


Fig. 4. Posterior density of the area-averaged Z process for two selected grid cells at 1500 UTC, 2 June 2001. Left panel: results from the cell located at 47° N, 6°30′ E. This cell includes three EMEP stations, CH002 (46°49′ N, 6°57′ E), CH004 (47°3′ N, 6°59′ E) and FR014 (47°11′ N, 6°30′ E). Right panel: results from the cell located at 42°42′ N, 13°11′ E. This cell does not contain any monitoring station, but is located near the EMEP station IT001 (42°6′ N, 12°38′ E), in the central Mediterranean area. [Both panels: horizontal axis in µg/m3, 60 to 120; vertical axis: posterior density, 0 to 0.2.]

Table 2
Comparison (µg m⁻³) of ozone observations from monitoring stations located within a predefined distance

Statistical index \ distance                    Δx          2Δx         3Δx
Absolute bias (mean/max)                        16.2/30.9   17.3/40.2   19.1/50.4
Root mean square error (mean/max)               20.9/36.8   22.1/51.2   24.0/56.8
Differences in peak concentration (mean/max)    11.7/34.4   12.8/47.6   14.5/47.6


evaluating the model performance. In order to quantify the importance of local effects, observations from monitoring stations located in neighboring CAMx grid cells were compared. Table 2 shows such a comparison for station pairs located within selected distances. It should be noted that, even for distances equal to the CAMx grid resolution, differences are comparable to the magnitude of typical model errors.

As explained in Section 3.2.1, the cell-averaged observed data can be obtained from Eq. (4). Intuitively, we expect the Z{B} spatial field to be smoother than the corresponding point estimate, Z{y}, because Eq. (4) essentially consists in the statistical implementation of an area-weighted averaging process: the weights are equal because the L points are uniformly distributed over each cell, but the errors are correlated and the estimation of spatial gradients is implicitly included in the conditional mean. Also, from a physical point of view, we expect the CAMx spatial fields to be smoother than observations, because of the implicit (and explicit) smoothing associated with the numerical diffusion of the finite difference schemes or with the physically-based subgrid scale turbulent mixing. The combined effects of numerical and physical diffusion usually result in a smoother modeled field at spatial scales near to or lower than the grid resolution. The smoothing effect introduced by the statistical interpolation approach is clearly reproduced in Table 3; note the narrower interquartile range of the cell-averaged CDF; also, the mean and the median are not significantly affected by the averaging procedure, but the variance is reduced by ≈ 15%.
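Why an equal-weight average over correlated points reduces the variance only moderately can be seen from a small numerical sketch. It is not an implementation of Eq. (4): the exponential covariance, the 300 km range, the unit point variance and the 5 × 5 arrangement of points in an 81 km cell are assumptions chosen only for illustration.

```python
import numpy as np

def block_average_variance(points_km, sill, range_km):
    """Variance of the equal-weight mean of a field sampled at points_km,
    assuming an exponential covariance sill * exp(-distance / range_km)."""
    diff = points_km[:, None, :] - points_km[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    cov = sill * np.exp(-dist / range_km)
    w = np.full(len(points_km), 1.0 / len(points_km))   # equal weights
    return float(w @ cov @ w)                           # w' C w

# L = 25 points uniformly arranged over a square cell of side 81 km.
side_km = 81.0
g = (np.arange(5) + 0.5) * side_km / 5
xx, yy = np.meshgrid(g, g)
points = np.column_stack([xx.ravel(), yy.ravel()])

var_point = 1.0                                         # variance at a single point
var_block = block_average_variance(points, var_point, range_km=300.0)
var_if_independent = var_point / len(points)            # what uncorrelated errors would give

print(var_if_independent, var_block, var_point)
```

With positively correlated errors the variance of the cell average lies between the independent-error value (1/L of the point variance) and the point variance itself, which is consistent with a modest rather than a drastic reduction.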

4.3. The assessment of the CAMx model performance

Due to the large number of data (92 × 24 hourly values from 115 monitoring stations) we only give a snapshot of the overall air quality model performance, without discussing any particular episode.

Traditional model evaluation statistics can give a first insight into the model performance. Several statistical performance measures (US EPA, 1991), e.g. the mean normalized bias error (MNBE), the mean normalized gross error (MNGE), and the unpaired peak prediction accuracy (UPPA), have been recommended for air quality model evaluation


Table 3
Descriptive parameters (µg m⁻³) of observed and cell-averaged CDFs for all locations and time levels

                        Raw observations    Cell-averaged obs. data
Mean                    76.5                73.9
Standard deviation      23.2                21.5
2.5th percentile        21.1                22.3
25.0th percentile       58.7                52.0
50.0th percentile       75.9                70.8
75.0th percentile       96.2                93.1
97.5th percentile       143.2               140.7

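Table 4
Definition of the US EPA recommended statistical measures

MNBE    $\overline{\left(M_{(B,t)} - Z^{*}_{(B,t)}\right) / Z^{*}_{(B,t)}}$
MNGE    $\overline{\left|M_{(B,t)} - Z^{*}_{(B,t)}\right| / Z^{*}_{(B,t)}}$
UPPA    $\overline{\left(M^{\max}_{(B,t)} - Z^{*\max}_{(B,t)}\right) / Z^{*\max}_{(B,t)}}$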


procedures. Their definitions are listed in Table 4; $Z^{*}_{(B,t)}$ is a sample from the PPDF of the cell-averaged observed value at cell B and time t, and $M_{(B,t)}$ the corresponding CAMx predicted value. The overline indicates the mean over time and grid cells; for the UPPA measure, ‘max’ indicates the maximum value predicted for each day and the overline the mean over all cells. We also evaluated a number of additional statistics to get a better picture of the overall CAMx performance, including the root mean square error (RMSE), partitioned into its systematic (RMSEs) and unsystematic (RMSEu) components, and the error in amplitude of the diurnal variation (DVAE). The RMSEs was calculated by regressing the CAMx predicted values onto the cell-averaged observations and then evaluating the root mean square of residuals between the cell-averaged observations and regressed values; the RMSEu is the complementary part of the RMSE, i.e.

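$$
\begin{cases}
\mathrm{RMSE}_s = \left(\overline{r^{2}_{(B,t)}}\right)^{1/2},\\[4pt]
\mathrm{RMSE}_u^{2} = \mathrm{RMSE}^{2} - \mathrm{RMSE}_s^{2},
\end{cases}
$$

where $r_{(B,t)} = \tilde{M}_{(B,t)} - Z^{*}_{(B,t)}$ and $\tilde{M}_{(B,t)} = a + b\,Z^{*}_{(B,t)}$; a and b are linear regression coefficients from a straightforward least squares fit. The error in amplitude of the diurnal variation consists in the difference between the daily maximum and minimum concentrations

$$
\mathrm{DVAE} = \overline{M^{\max}_{(B,t)} - Z^{*\max}_{(B,t)} - M^{\min}_{(B,t)} + Z^{*\min}_{(B,t)}}.
$$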
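The measures defined above can be sketched in a few lines. This is an illustrative implementation, not the code used for Table 5: the function name, the assumed array layout (cells × days × 24 hourly values) and the choice of averaging the daily extrema over both cells and days are assumptions, and the demonstration data are synthetic.

```python
import numpy as np

def evaluation_statistics(model, obs):
    """Evaluation measures for arrays of shape (n_cells, n_days, 24).

    model : CAMx-like predicted values (hypothetical input)
    obs   : cell-averaged observations, strictly positive (hypothetical input)
    """
    m, z = model.ravel(), obs.ravel()
    mnbe = np.mean((m - z) / z)
    mnge = np.mean(np.abs(m - z) / z)

    # Daily extrema, unpaired in time within each day.
    m_max, z_max = model.max(axis=2), obs.max(axis=2)
    m_min, z_min = model.min(axis=2), obs.min(axis=2)
    uppa = np.mean((m_max - z_max) / z_max)
    dvae = np.mean((m_max - m_min) - (z_max - z_min))

    rmse = np.sqrt(np.mean((m - z) ** 2))
    b, a = np.polyfit(z, m, 1)                 # least squares fit: m ~ a + b z
    rmse_s = np.sqrt(np.mean((a + b * z - z) ** 2))
    # Guard against tiny negative values caused by rounding.
    rmse_u = np.sqrt(max(rmse ** 2 - rmse_s ** 2, 0.0))
    return {"MNBE": mnbe, "MNGE": mnge, "UPPA": uppa, "DVAE": dvae,
            "RMSE": rmse, "RMSEs": rmse_s, "RMSEu": rmse_u}

# Synthetic demonstration: a model that overpredicts every value by 10%.
rng = np.random.default_rng(0)
hours = np.arange(24)
diurnal = 70.0 + 30.0 * np.sin(2.0 * np.pi * (hours - 9) / 24.0)
obs_demo = diurnal + rng.normal(0.0, 5.0, size=(4, 10, 24))
model_demo = 1.1 * obs_demo
stats = evaluation_statistics(model_demo, obs_demo)
print(stats)
```

For a purely proportional error such as this one, the whole RMSE falls into the systematic component, because the regression line reproduces the model values exactly and leaves no unsystematic residual.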

In order to highlight the effects of this evaluation approach, the US EPA recommended measures and the RMSEs/u were evaluated only for those cells including more than one monitoring station, and the results compared with those obtained from a ‘traditional’ approach, i.e. by comparing raw observations with the CAMx bi-linearly interpolated data; moreover, as is customary in the statistical evaluation of air quality models, all these measures were evaluated with and without a cutoff threshold: they were also evaluated for those cells and times corresponding to observations exceeding the cutoff level of 120 µg m⁻³ (≈ 60 ppb); in this case they are a measure of the model’s ability to reproduce the highest observed concentrations. Data from monitoring stations with a percentage of missing values greater than 25% were discarded.

Although there is no objective criterion set forth for a satisfactory model performance, US EPA recommends the use of a cutoff threshold and a maximum level of 5–15% for the MNBE, 30–35% for the MNGE and 15–20% for the UPPA. Data reported in Table 5 show that our results (using the CAMx bi-linearly interpolated data and the cutoff) are in line with the US EPA recommendations: the CAMx model generally underestimates the highest observations, with an MNBE of −14%, an MNGE of 17.2%, and a UPPA of −7.3%; also, note that the amplitude of the diurnal variation is overestimated by the CAMx model by about 6%, and about two thirds of the RMSE can be attributed to the systematic component, i.e. the CAMx systematically underpredicts concentrations above the cutoff threshold. These results are sensible, since, recalling our earlier discussion on the effects of numerical and physical diffusion, the CAMx model should not be able to correctly reproduce the highest concentrations, and these measures just reflect this inability since they were evaluated using data corresponding to observations above the cutoff threshold. However, note that results are cutoff-dependent: the MNGE is significantly larger if the cutoff is not applied and the sign of the MNBE even changes. The change in sign has a clear explanation, too: as Hogrefe et al. (2001) outlined, the pair observation/predicted value is excluded from the


Table 5
Results of the statistical evaluation procedure. The numbers in parentheses were obtained using a cutoff threshold of 120 µg m⁻³; the numbers outside parentheses represent the same measures evaluated without cutoff

                  Bi-linearly interpolated values    Cell-averaged values (MAP values)
MNBE (%)          16.0 (−14.0)                       9.5 (−10.0)
MNGE (%)          32.3 (17.2)                        24.2 (14.1)
UPPA (%)          3.0 (−7.3)                         6.2 (−4.0)
DVAE (µg m⁻³)     5.8 (5.8)                          6.9 (6.9)
RMSEs (µg m⁻³)    15.6 (27.4)                        12.3 (21.8)
RMSEu (µg m⁻³)    22.5 (13.0)                        19.4 (11.5)


computation of the bias, regardless of the predicted concentration, if the observed concentration is below the cutoff threshold, so that some model overpredictions are excluded; on the other hand, some model underpredictions (observed concentration above and modeled concentration below the cutoff) are included in the estimate. This creates a ‘negative bias of the bias estimate’. Historically, this result was deemed desirable from a regulatory perspective, since it was presumed to lead to a more stringent emission reduction strategy; of course, this has nothing to do with a scientifically sound model evaluation approach. The larger MNGE indicates that, to a certain extent, the CAMx model was arranged so as to better reproduce the peak ozone concentrations. Also, note that the unsystematic component of the RMSE prevails if no cutoff is used, indicating that the previous systematic underprediction was a spurious result introduced by having considered only the highest concentrations.
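The selection effect behind this ‘negative bias of the bias estimate’ can be reproduced with purely synthetic data. In the sketch below, both the ‘observations’ and the ‘model’ are noisy versions of a common hypothetical ‘truth’ with identical error statistics; the lognormal distributions and all numerical values except the 120 µg m⁻³ cutoff are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
cutoff = 120.0                                   # cutoff level used in the text

# Hypothetical 'truth' and two equally noisy, mutually independent versions of it.
truth = 90.0 * np.exp(rng.normal(0.0, 0.25, n))
obs = truth * np.exp(rng.normal(0.0, 0.15, n))
model = truth * np.exp(rng.normal(0.0, 0.15, n))

def mnbe(model_values, obs_values):
    """Mean normalized bias error, as a fraction."""
    return float(np.mean((model_values - obs_values) / obs_values))

mnbe_all = mnbe(model, obs)                      # all pairs
keep = obs > cutoff                              # selection on the observed value only
mnbe_cutoff = mnbe(model[keep], obs[keep])       # pairs with observation above cutoff

print(f"MNBE without cutoff: {100 * mnbe_all:.1f}%")
print(f"MNBE with cutoff:    {100 * mnbe_cutoff:.1f}%")
```

Although the synthetic model is no worse than the synthetic observations, selecting pairs on the observed value alone favors cases in which the observation error is positive, so the bias computed above the cutoff turns negative while the bias computed over all pairs does not.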

In a recent paper, Hogrefe et al. (2001) introduced the concept of ‘inherent’ and ‘reducible’ uncertainty. They defined the inherent uncertainty as the model inability to capture the observed fluctuations that are caused by processes acting on scales not resolvable by the model grid cell size. The reducible uncertainty arises from imperfect scientific understanding, e.g. mis/unrepresented physics/chemistry. Even a ‘perfect’ model would not be able to fully reproduce the spatio-temporal variability of observed data (assuming the observations were ‘perfect’), due to the inherent uncertainty. In order to point out to what extent the results about CAMx performance can be attributed to the inherent uncertainty, the third column of Table 5 reports the same measures evaluated using the MAP values of the cell-averaged observed data. As expected, the averaging procedure introduces a greater amount of smoothness, inflating the differences in the simulated diurnal variations predicted by the CAMx model. However, the majority of measures indicate a better CAMx model performance; the biases, both relative and absolute, are reduced by about one third, and the systematic component of the root mean square error is reduced by a considerable amount, particularly when the cutoff is applied.

5. Conclusions

In this work, we propose a Bayesian space-time interpolation approach and a methodology to address the problem of the comparison between point-wise observations and cell-averaged values. While traditional statistics may be useful in outlining the general performance of an air quality model, they do not give any insight into the strengths and shortcomings of the model; in particular, they are unable to quantify the effects of unresolved subgrid variability. Most current methods of analysis do not explicitly account for the difference in support, and interpolating the modeled data to match the location of observed data, or vice versa, does not avoid this problem, since spatial scales are not equally represented.

We showed how to exploit Bayesian predictive analysis to estimate cell-averaged values, conditioned on point-referenced observed data. Then, we used the cell-averaged observations to evaluate the performance of the CAMx model. Our analysis showed that a relevant fraction of the CAMx model bias may be explained by the subgrid spatial variability.

The results from this statistical analysis can be used to provide a better insight into the model’s ability to reproduce the physics/chemistry of the problem, since it is able to mitigate the effects of the inherent error component.

Acknowledgements

This work has been supported by the ‘Centro Regionale di Competenza Analisi e Monitoraggio del Rischio Ambientale’ (AMRA), Campania Region, Italy.

References

Berliner, L.M., 1996. Hierarchical Bayesian time series models. In: Hanson, K.M., Silver, R.N. (Eds.), Maximum Entropy and Bayesian Methods. Kluwer Academic Publishers, New York, pp. 15–22.

Berliner, L.M., 2003. Physical-statistical modeling in geophysics. Journal of Geophysical Research-Atmospheres 108 (D24).

Berliner, L.M., Milliff, R.F., Wikle, C.K., 2003. Bayesian hierarchical modeling of air-sea interaction. Journal of Geophysical Research-Oceans 108 (C4), 1413–1430.

Carter, W.P.L., 2000. Programs and files implementing the SAPRC-99 mechanism and its associates emissions processing procedures for Models-3 and other regional models. <http://pah.cert.ucr.edu/~carter/SAPRC99.htm>.

Cressie, N.A.C., 1993. Statistics for Spatial Data. Wiley, New York.

Cressie, N.A.C., 2000. Position Paper from the Workshop on Hierarchical Modeling in Environmental Statistics. Columbus, Ohio, May 14-16. <http://www.stat.ohiostate.edu/~sses/WS2K/>.

Fuentes, M., 2000. Statistical assessment of geographic areas of compliance with air quality standards. Technical Report, Department of Statistics, North Carolina State University. <http://www4.stat.ncsu.edu/~fuentes/>.

Fuentes, M., Raftery, A.E., 2005. Model evaluation and spatial interpolation by Bayesian combination of observations with outputs from numerical models. Biometrics 61, 36–45.

Gelfand, A.E., Zhu, L., Carlin, B.P., 2001. On the change of support problem for spatio-temporal data. Biostatistics 2, 31–45.

Gelman, A., Rubin, D.B., 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7, 457–472.

Gilks, W.R., Richardson, S., Spiegelhalter, D.J., 1996. Markov Chain Monte Carlo in Practice. Chapman & Hall, CRC, Boca Raton, Florida.

Guttorp, P., 2003. Environmental statistics: A personal view. International Statistical Review 71, 169–179.

Hogrefe, C., Rao, S.T., Kasibhatla, P., Hao, W., Sistla, G., Mathur, R., McHenry, J., 2001. Evaluating the performance of regional-scale photochemical modeling systems: Part II–ozone predictions. Atmospheric Environment 35, 4175–4188.

Mardia, K.V., Kent, J.T., Bibby, J.M., 1980. Multivariate Analysis. Academic Press, San Diego, California.

McNair, L.A., Hurley, R.A., Russell, A.G., 1996. Spatial inhomogeneity in pollutant concentrations, and their implications for air quality model evaluation. Atmospheric Environment 30, 4291–4301.

McNider, R.T., Pielke, R.A., 1981. Diurnal boundary layer development over sloping terrain. Journal of Atmospheric Science 38, 2198–2212.

Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E., 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087–1092.

Passant, N.R., 2002. Speciation of UK emissions of NMVOC. AEAT/ENV/R/0545 Report.

Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes. Cambridge University Press, London.

Riccio, A., 2005. A Bayesian approach for the spatiotemporal interpolation of environmental data. Monthly Weather Review 2, 430–440.

Roberts, W.R., 1996. Markov chain concepts related to sampling algorithms. In: Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in Practice. Chapman & Hall, pp. 45–57.

Seinfeld, J.H., Pandis, S.N., 1998. Atmospheric Chemistry and Physics. From Air Pollution to Climate Change. Wiley, New York.

Simpson, D., Winiwarter, W., Borjesson, G., Cinderby, S., Ferreiro, A., Guenther, A., Hewitt, C.N., Janson, R., Khalil, M.A.K., Owen, S., Pierce, T.E., Puxbaum, H., Shearer, M., Steinbrecher, S., Svennson, B.H., Tarrason, L., Oquist, M.G., 1999. Inventorying emissions from nature in Europe. Journal of Geophysical Research 104, 8113–8152.

Smagorinsky, J., 1963. General circulation experiments with the primitive equations: I. The Basic Experiment. Monthly Weather Review 91, 99–164.

US EPA, 1991. Guideline for regulatory application of the Urban Airshed Model. EPA-450/4-91-013, United States Environmental Protection Agency, Research Triangle Park, NC 27711.

Vingarzan, R., 2004. A review of surface ozone background levels and trends. Atmospheric Environment 38, 3431–3442.

Wikle, C.K., 2003. Hierarchical models in environmental science. International Statistical Review 71, 181–199.

Wikle, C.K., Berliner, L.M., 2005. Combining information across spatial scales. Technometrics 47, 80–91.

Wikle, C.K., Berliner, L.M., Cressie, N., 1998. Hierarchical Bayesian space-time models. Environmental and Ecological Statistics 5, 117–154.

Wikle, C.K., Milliff, R.F., Nychka, D., Berliner, L.M., 2001. Spatio-temporal hierarchical Bayesian modeling: tropical ocean surface winds. Journal of the American Statistical Association 96, 382–397.