A comparison of CMAQ-based and observation-based statistical models relating ozone to meteorological parameters


Atmospheric Environment 45 (2011) 3481–3487


Jerry Davis*, William Cox, Adam Reff*, Pat Dolwick

Office of Air Quality Planning and Standards, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA

Article info

Article history:
Received 15 September 2010
Received in revised form 22 December 2010
Accepted 31 December 2010

Keywords: Ozone; CMAQ; Statistical models

* Corresponding authors. E-mail addresses: [email protected] (J. Davis), [email protected] (A. Reff), [email protected] (P. Dolwick).

1352-2310/$ – see front matter © 2011 Elsevier Ltd. doi:10.1016/j.atmosenv.2010.12.060

Abstract

Statistical relationships between ground-level daily maximum 8-h ozone (O3) concentrations and multiple meteorological parameters were developed for data drawn from ambient measurements and values that were simulated with the U.S. Environmental Protection Agency’s (EPA) Community Multiscale Air Quality (CMAQ) model. This study used concurrent and co-located data from both sources during the O3 season (May 1–September 30) for a four-year period (2002–2005). Regression models were developed for 74 areas across the Eastern U.S. The most important meteorological parameters used in the model were found to be daily maximum temperature and the daily average relative humidity (RH). Average morning and afternoon wind speed as well as factors for the day of the week and years were also included in the statistical models. R² values above 60% were obtained for the majority of the locations in the analysis for both the ambient and CMAQ statistical models. Analysis of the covariate-specific effects revealed a tendency for the CMAQ model to underestimate how O3 increases with temperature. These results suggest that air quality forecasts that incorporate the CMAQ model may be underestimating the climate penalty on future O3 concentrations from warmer temperatures.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Research clearly indicates that meteorological conditions strongly influence surface ozone (O3) concentrations. Given a specific emissions loading, it is often the meteorological conditions that determine whether a violation of Federal standards occurs and whether sensitive populations experience adverse health impacts. Statistical models that relate meteorological parameters to O3 concentrations can provide forecasts of next-day O3 levels, as well as allow for a retrospective analysis of historical O3 trends to determine the adequacy of air quality control measures. A comprehensive review of the statistical methods used to relate O3 levels to meteorological conditions is given in Thompson et al. (2001) and more recent analyses have been completed by Zheng et al. (2007) and Camalier et al. (2007). Summarizing broadly, these statistical models are able to explain a significant amount of the variability in daily O3 levels across the U.S. by considering a limited number of meteorological variables. Typically these variables include temperature, some measure of atmospheric humidity, wind speed and direction, boundary layer height, and solar radiation or cloud cover. Results from these studies typically explain 50–70% of the variability (R²) of daily O3 concentrations.


The aforementioned analyses are based on statistical models using measured ambient O3 levels and observed meteorological parameters. In this work, we investigate those same relationships in O3 and meteorological data generated by atmospheric simulation models. Specifically, the meteorological inputs and O3 outputs for a multi-year application of the Community Multiscale Air Quality (CMAQ) (Byun and Schere, 2006) model are used. The goal is to determine if these two relationships (response variable to covariates in the ambient data versus response variable to covariates in the CMAQ data) are comparable, and identify parameters that are potential sources of error. If there is a systematic difference in the response of O3 to meteorological parameters, then this can serve as a useful diagnostic evaluation to identify elements of the model code or inputs which need further investigation. In particular, given the increasing concern about the possible impacts of climate change on future air quality levels (EPA, 2009b), an assessment of the model relationship between temperature and O3 is needed (Bloomer et al., 2009).

2. Methods

2.1. Meteorology and O3 data

This analysis used data from 74 cities throughout the eastern U.S. (locations are given in the Supplementary information and can be seen in the results displayed in Fig. 2) collected during the peak O3 season (May through September) from 2002 to 2005. The observed meteorological data were based on two separate datasets from the National Climatic Data Center (NCDC): 1) surface data were drawn from the Integrated Surface Database (ISD) (NCDC, 2008), which contains surface data for dozens of meteorological parameters on an hourly basis at nearly 7000 sites across the U.S.; 2) upper air variables used in this analysis are based on information from the Integrated Global Radiosonde Archive (IGRA) dataset (Durre et al., 2006). Surface and upper air data are combined by pairing each surface site with its nearest upper air neighbor. This pairing is only done for surface sites within three degrees of their nearest upper air neighbor. Observed daily maximum 8-h O3 data were extracted from the EPA Air Quality System (AQS) database (EPA, 2009a). For each of the 74 cities in this analysis, O3 data were pulled from the O3 monitoring location associated with the highest O3 concentrations in that area on that day.
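The surface-to-upper-air pairing rule described above can be illustrated with a short Python sketch. This is not the authors' code (their analysis was done in R). The site identifiers and coordinates are hypothetical. Measuring "three degrees" as plain Euclidean distance in latitude/longitude space, and treating the cutoff as inclusive, are both assumptions, since the text does not say how the distance was computed.

```python
import math


def pair_with_nearest_upper_air(surface_sites, upper_air_sites, max_deg=3.0):
    """Pair each surface site with its nearest upper-air site.

    Each site is a dict with 'id', 'lat', and 'lon' (degrees).
    Distance is Euclidean distance in degree space (an assumption).
    Surface sites farther than max_deg from every upper-air site are dropped.
    upper_air_sites is assumed to be non-empty.
    """
    pairs = {}
    for s in surface_sites:
        def dist(u, s=s):
            return math.hypot(s["lat"] - u["lat"], s["lon"] - u["lon"])

        nearest = min(upper_air_sites, key=dist)
        if dist(nearest) <= max_deg:  # inclusive cutoff is an assumption
            pairs[s["id"]] = nearest["id"]
    return pairs


# Hypothetical sites, for illustration only.
surface = [
    {"id": "SFC_A", "lat": 35.9, "lon": -78.8},
    {"id": "SFC_B", "lat": 30.0, "lon": -90.0},
]
upper_air = [
    {"id": "UA_1", "lat": 36.1, "lon": -79.9},
    {"id": "UA_2", "lat": 33.0, "lon": -86.0},
]
pairs = pair_with_nearest_upper_air(surface, upper_air)
```

In this toy input, SFC_B's nearest upper-air site is five degrees away, so it is left unpaired and would be excluded from any analysis needing upper-air variables such as dt850.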

Simulated meteorological and O3 data were extracted from the input and output files of a series of consecutive (2002–2005) annual CMAQ runs performed by EPA. These simulations were performed for the eastern U.S. using a 279 × 240 grid of 12 km × 12 km cells, and are fully described elsewhere (EPA, 2010a,b,c,d). The modeling runs used a 2002 base emissions inventory with varying temporal resolution by sector. Area source and non-utility point source emissions used annual emissions totals equally allocated across the year and did not vary with temperature. Emissions from electric generating units were based on hourly measurements from continuous emissions monitoring. The on-road and non-road mobile emissions were based on monthly averages using climatological average temperatures with an average diurnal profile. Biogenic emissions were provided on an hourly basis and are dependent upon the temperature estimates from the meteorological modeling. With the exception of the monthly average temperatures used for the mobile sector, the key emissions inputs are linked closely with the input temperatures from the meteorological modeling. Daily 8-h maximum O3 values were extracted from the CMAQ outputs. In order to account for the use of area-wide, highest-site daily O3 values in the ambient data, we analyzed the modeled meteorological data from the cells matching the locations of the observed meteorological data, but selected the modeled O3 data from the highest of the surrounding nine grid cells. The meteorological inputs to CMAQ that were used in this statistical analysis were generated by the fifth generation Mesoscale Model (MM5) (Grell et al., 1994). The MM5 physics configuration and model version were held constant across the 2002 through 2005 simulations.

Regression models were built separately for the ambient and CMAQ datasets between the daily 8-h max O3 concentration (response variable) and multiple concurrent meteorological parameters (explanatory variables or covariates). There were no missing CMAQ data and less than 1% of the ambient data were missing. The meteorological variables drawn from both MM5 and NCDC datasets are as follows: tmax (maximum daily temperature, °C), wsavgam (average morning wind speed, 7–10 AM LST, m s⁻¹), wsavgpm (average afternoon wind speed, 1 PM–4 PM LST, m s⁻¹), uavg (average daily east/west wind component, m s⁻¹), vavg (average daily north/south wind component, m s⁻¹), rhavg (average daily RH, %), dt850 (temperature difference between 850 hPa level and the surface, °C), rain (rain/no rain binary indicator), yrf (year factor), and jday (Julian day).

2.2. Statistical analysis

All statistical calculations were done using the R statistical language (R Development Core Team, 2007). To build our statistical models for ozone, we used non-parametric regression models, which provide a distribution-free basis for predicting the response variable over the range of the data and are helpful in dealing with the non-linearities that are frequently present in the relationship between ozone levels and meteorology (Davis et al., 1998; Davis and Speckman, 1999). The framework for our models is based on generalized linear model (GLM) theory (McCullagh and Nelder, 1989). Previous statistical analyses of ozone have been successful using this class of models (Zheng et al., 2007; Camalier et al., 2007).

Fig. 1. O3 concentrations predicted by the GLM regression models vs. O3 concentrations extracted from the datasets of ambient concentrations (left) and CMAQ simulations (right). Data shown are pooled from the 74 study sites for which individual regression models were developed. The dashed lines are the 1:1 lines.

Fig. 2. R² values (%) for the O3 statistical models. The last plot is the difference between ambient and CMAQ R² values.

Generalized additive models (GAM) (Hastie and Tibshirani, 1990; Wood, 2006) are an extension of the GLM approach and can be run within the R GLM function. GAMs allow for covariates like those used in simple regression models (x_i^T β) as well as for functional forms of the covariates (indicated by f()), which provides greater flexibility when modeling complex non-linear processes. Below is the form of the equation used in this work:

$$ g(\mu_i) = \beta_0 + f_j(x_{i,j}) + \cdots + f_p(x_{i,p}) + \mathrm{yrf} \qquad (1) $$

The g() in Equation (1) is referred to as the link function (McCullagh and Nelder, 1989; Dobson and Barnett, 2008), which specifies the relationship between the linear portion of the model on the right hand side of the equation and the expected response μ_i. The natural log-link function has been found to be the most appropriate link function to stabilize the variance of the O3 data (Davis and Speckman, 1999), and so is used in this work. The parameter β_0 represents the overall mean and f_j(x_{i,j}) is the value of the smoothing function associated with the ith value of the explanatory variable j, where j = 1, …, p. The term yrf is a factor which represents the effect of a given year on O3 levels. A natural cubic spline (Green and Silverman, 1994; Hastie and Tibshirani, 1990; Wood, 2006) with three degrees of freedom was used as the smoothing function in our work to allow for a non-linear response between each meteorological covariate and O3 concentration. Our basic model also contains a covariate for the Julian day, which has been fitted using a natural cubic spline.
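Because the link is the natural log, Equation (1) can be inverted to give the expected O3 concentration directly. The restatement below is only a sketch in the notation already defined, with the sum written out over the p smoothed covariates (j = 1, …, p). It introduces nothing beyond the log link itself:

```latex
\mu_i = \exp\!\Big(\beta_0 + \sum_{j=1}^{p} f_j(x_{i,j}) + \mathrm{yrf}\Big)
      = e^{\beta_0}\, e^{\mathrm{yrf}} \prod_{j=1}^{p} e^{f_j(x_{i,j})}
```

Each covariate therefore acts multiplicatively on the expected O3 level, which is why covariate effects under this model are naturally read as percent changes.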

When a log-link function is used in a GLM, the effect of each covariate is defined as the percent change in the response variable for a unit change in the given covariate assuming that all other covariates remain constant. We approximated the linear effect on the O3 response variable using the difference between the model predicted O3 at the 75th and 25th percentiles of the meteorology covariate divided by the difference between these two percentiles of the covariate. The standard errors of these effects were obtained by bootstrap resampling of the original data 200 times followed by a complete refitting of the data (Davison and Hinkley, 1997). The output from this procedure is a new population of 200 simulated effects from which the standard errors can be calculated.
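The interquartile effect and its bootstrap standard error can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code. The spline-based model is replaced by a single-covariate least-squares fit of log(O3), used only as a stand-in. The data are synthetic. Evaluating each bootstrap refit at the percentiles of the original covariate is a choice made here for simplicity, since the text does not specify it.

```python
import math
import random
import statistics


def fit_log_linear(x, y):
    """Stand-in model (assumption): least-squares fit of log(y) = a + b*x.

    Returns a function that predicts y for a given covariate value.
    """
    log_y = [math.log(v) for v in y]
    mean_x, mean_ly = statistics.fmean(x), statistics.fmean(log_y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y)) / sxx
    a = mean_ly - b * mean_x
    return lambda x0: math.exp(a + b * x0)


def iqr_effect(predict, q25, q75):
    """Predicted change between the 25th and 75th covariate percentiles,
    divided by the distance between those percentiles."""
    return (predict(q75) - predict(q25)) / (q75 - q25)


def bootstrap_se(x, y, q25, q75, n_boot=200, seed=0):
    """Resample (x, y) pairs with replacement, refit, recompute the effect;
    the standard deviation of the simulated effects is the standard error."""
    rng = random.Random(seed)
    n = len(x)
    effects = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        effects.append(iqr_effect(fit_log_linear(xb, yb), q25, q75))
    return statistics.stdev(effects)


# Synthetic data: O3 rising with daily maximum temperature (hypothetical).
rng = random.Random(42)
tmax = [rng.uniform(15.0, 35.0) for _ in range(300)]
ozone = [math.exp(3.0 + 0.03 * t + rng.gauss(0.0, 0.1)) for t in tmax]

q25, _, q75 = statistics.quantiles(tmax, n=4, method="inclusive")
effect = iqr_effect(fit_log_linear(tmax, ozone), q25, q75)
se = bootstrap_se(tmax, ozone, q25, q75)
```

The 200 resamples match the count stated in the text. Replacing `fit_log_linear` with a spline-based fit would leave the rest of the procedure unchanged.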

Effects and standard errors were calculated for each of the covariates for both the ambient meteorology and CMAQ meteorology for all locations. The differences in the effects were also calculated. A composite standard error (SEcomp) was calculated as:

$$ SE_{comp} = \left[ (SE_{amb})^2 + (SE_{cmaq})^2 \right]^{0.5} \qquad (2) $$

To test the hypothesis that there is no difference between a particular ambient effect and CMAQ effect, we calculate an appropriate t-statistic using the following equation:

$$ t = \left( \mathrm{Effect}_{amb} - \mathrm{Effect}_{cmaq} \right) / SE_{comp} \qquad (3) $$

If the absolute value of the t-statistic is larger than two, the null hypothesis is rejected and we conclude that there is a significant difference between the ambient and CMAQ effects for that particular parameter. Although the t-statistic is useful for determining statistical significance, we must differentiate that from practical significance. For example, when the differences in the effects are large but statistically insignificant because the standard error estimate is large, this suggests that there is too much uncertainty in the effect estimates to draw any conclusions about the difference between the observation-based and CMAQ-based models. Furthermore, if a small difference in the effects is found to be statistically significant, we might still conclude that the difference is not practically significant. In other words, the difference is small enough that we do not expect it to impact any conclusions we make when using the CMAQ model for a particular application.
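The composite standard error, the t-statistic, and the "larger than two" decision rule can be written as a few lines of Python. This is a minimal sketch: the function names are chosen here for illustration and the input numbers are hypothetical, not results from the study.

```python
import math


def composite_se(se_amb, se_cmaq):
    """Composite standard error: root sum of squares of the two SEs."""
    return math.sqrt(se_amb ** 2 + se_cmaq ** 2)


def effect_t_statistic(effect_amb, effect_cmaq, se_amb, se_cmaq):
    """Difference between ambient and CMAQ effects, scaled by the composite SE."""
    return (effect_amb - effect_cmaq) / composite_se(se_amb, se_cmaq)


def differs_significantly(t, threshold=2.0):
    """Reject the no-difference hypothesis when |t| exceeds the threshold."""
    return abs(t) > threshold


# Hypothetical effects and standard errors for one site and one covariate.
t_stat = effect_t_statistic(effect_amb=3.5, effect_cmaq=2.0, se_amb=0.3, se_cmaq=0.4)
significant = differs_significantly(t_stat)
```

A statistically significant result from this check says nothing about practical significance, which still has to be judged from the size of the difference itself.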

3. Results

3.1. GLM model results

To evaluate the fit of the GLM models, scatterplots and coefficients of determination (R²) were separately generated for the ambient and CMAQ results. Fig. 1 shows all actual and fitted O3 values pooled together into a single scatterplot for both the ambient and CMAQ datasets. The left plot shows statistical predictions vs. actual ambient O3 values, and the right plot is the analogous graph for CMAQ data. The plots reveal similar fits of the GLM equations to the two datasets, although the CMAQ results tend to group more closely together than the ambient. This is due to CMAQ’s tendency to underestimate the range of the 8-h maxima (i.e., overestimate lower values, underestimate higher values) (Appel et al., 2007; Tong and Mauzerall, 2006). Similar visual patterns can be seen in the scatterplots of the individual study sites, which are shown in the Supplementary information. It is encouraging that the MM5/CMAQ statistical model shows generally equivalent skill in predicting CMAQ O3 levels as is seen in the ambient statistical model. This finding is an initial indication that the underlying physics and chemistry of the air chemical-transport model are capturing the basic meteorological/O3 relationships that have been observed in actual atmospheric data.

Fig. 2(a) and (b) show maps of the R² values of all the site-specific GLM fits for the ambient and CMAQ data. The ambient models exhibit the strongest fit over the northern part of this domain, where values range from 70 to 80%. Ambient R² values over the rest of the domain range from 50 to 70%, with an overall average of 70%. The higher R² values in the northern vs. the southern portion of the domain might result from more active weather systems that occur in the O3 season in the north. In the south, little frontal activity is experienced during the O3 season, so conditions are favorable for O3 formation on the majority of the days in our dataset. The regression models fit to CMAQ data yielded R² values in the same range (50–80%, with an overall average of 66%) but do not show any consistent geographic differences (see Discussion). A map of the R² differences (Fig. 2(c)) shows that the ambient models provided slightly better fits over most of the 74 locations in this study. Exceptions are generally located in the southeastern U.S.

3.2. Covariate effects

The statistical models for both the observed and CMAQ data explained a substantial portion of the variance in the O3 data with a small subset of the explanatory variables. Significant covariates were identified by F-statistics; distributions of these F-values for each variable across the 74 study sites are shown in Fig. 3. These boxplots show that tmax, rhavg, and, to a lesser extent, jday, were the most significant predictors of O3 for the majority of sites. The jday variable was kept in the models for its usefulness in improving the power of the GLM regression; however, further exploration of the ambient and model differences was limited to tmax and rhavg.

Term plots provide visual insight into how daily peak 8-h O3 concentrations respond to temperature and RH. Fig. 4 shows spatially disparate examples of term plots from four sample sites throughout the modeling domain. The dashed lines in the plots show the 95% confidence intervals and the density of the data is indicated by the rug above the x-axis. Note that ambient temperature data density appears to be much less than that of the CMAQ, but this is an artifact of the rounded format in the ISD and did not hinder interpretation.

Fig. 3. F-statistics of regression model covariates across the 74 study sites. [Boxplot panels: Ambient and CMAQ; x-axis covariates: dt850, jday, rain, rhavg, tmax, uavg, vavg, wsavgam, wsavgpm, yrf; y-axis: F-statistic.]

The term plots of temperature indicate some important differences between the ambient data and CMAQ. For example, we see that the air quality modeling successfully replicates the feature that O3 concentrations are higher on the warmest days, but the slope of this relationship is not as steep in the CMAQ model data as it is in the observed model. In addition, the CMAQ model appears to underestimate the increase in O3 that results from warmer temperatures at three of the four sites in Fig. 4 (Akron, Savannah, and Milwaukee). Similarly, the CMAQ model appears to underestimate the strength of the inverse relationship between O3 and daily average RH at these sample sites. In Akron, the CMAQ modeling indicates a certain RH threshold of about 75–80% where O3 levels drop sharply, whereas the ambient data show a consistent negative trend over the entire distribution of RH values. Term plots of the remaining 70 cities are provided in the Supplementary information.

To examine the O3 response to the various parameters on a domain-wide scale, all the site-specific effects were mapped and plotted in Fig. 5. The upper maps show the station-specific effects, the bottom maps show the t-statistic (Equation (3)), and the scatterplots provide direct visual comparison of the ambient and CMAQ effect values.

The effects for tmax are presented in Fig. 5(a). Here, the scatterplot indicates that the percent change for tmax is almost always positive and is typically greater in magnitude for models built from ambient data. This suggests that CMAQ is not completely capturing the sensitivity of changes in O3 levels to changes in tmax. The t-statistic map indicates that in the southeastern part of the country there is generally no significant difference in the effects for the ambient and CMAQ cases, while in portions of the midwest and the northeast there are significant differences with the ambient values exceeding the CMAQ values.

Fig. 4. A comparison of the response curves based on the statistical models for the ambient data and the CMAQ data. The dashed lines are for the 95% confidence interval. The small vertical lines at the bottom of the figure indicate data density.

Fig. 5. A comparison of the effects values based on the t-statistic obtained from the statistical models using the ambient data and the CMAQ data.

Effects of RH on O3 are shown in Fig. 5(b) and are generally negative. The scatterplot shows that ambient effects are almost always more negative (i.e., greater dropoff in O3 levels with higher humidities) than the CMAQ effects. The effects map of the ambient model shows a distinct north–south gradient whose location is misplaced by the CMAQ model. Additionally, CMAQ generally predicts effects that are weaker than observed in the ambient data, with values approaching zero at the northern sites. The map of t-statistics indicates that these differences are significant at the majority of the 74 sites.

4. Discussion

To our knowledge, this is the first exploration and evaluation of statistical relationships between O3 and meteorological parameters in the CMAQ air quality model. The fit of the GLM model (Equation (1)) in ambient data was generally captured by CMAQ and results are comparable to previous models that use only ambient data (Davis and Speckman, 1999). In addition, the effect terms indicated the degree to which individual meteorology parameters are contributing to CMAQ O3 simulations. Ambient temperature effects were observed to increase more than those of CMAQ in the northern portion of this modeling domain, which is consistent with the two distinct zones in the map of the t-statistics. The effects maps of RH similarly showed a southwest-to-northeast increase, with a distinct phase-shift in the CMAQ map relative to the ambient map. These ambient-CMAQ differences might be a result of uncertainties in location-specific emissions of major source categories (e.g., biogenic emissions, peaking units from power plants, greater mobile source emissions, etc.) that are input to CMAQ, although we cannot rule out the possibility of uncertainties in the chemistry or meteorological inputs at this point either.

These results provide both direction for future development and improvement of CMAQ, as well as an initial indication of expected accuracy of climate forecasts with the current state of the model. One of the most common approaches to estimate the potential impacts of climate change on air quality has been to dynamically downscale global climate model simulations of future meteorology via regional climate models and then input those regional-scale future meteorological estimates into an air quality model like CMAQ (Hogrefe et al., 2004; Nolte et al., 2008). Jacob and Winner (2009) summarized numerous chemical-transport modeling studies of the impact of climate change on O3 levels and indicated that the consensus from these studies was that O3 was likely to increase in the future by 1–10 ppb, primarily as a result of projected increases in temperature. The projected O3 increases resulting from climate change scenarios are also affected by numerous factors beyond warmer temperatures (e.g., changes in moisture, storm tracks, and biogenic emissions). More analysis is needed to determine if the CMAQ underestimation of the sensitivity of O3 to daily maximum temperatures seen in this model application has ramifications for downscaled regional climate and air quality modeling efforts and their predictions of the future climate penalty.

Acknowledgements

We would like to thank Lucille Bender, Mike Dudek and Cliff Stanley from Computer Sciences Corporation for preparing the datasets for this work. Additionally, we thank Ellen Baldridge, Prakash Bhave, Brian Eder, Kristen Foley, Phil Lorang, Sharon Phillips, Norm Possiel, and Rich Scheffe of EPA for helpful comments and discussion.

Appendix. Supplementary information

Supplementary information related to this article can be found online at doi:10.1016/j.atmosenv.2010.12.060.

References

Appel, K.W., Gilliland, A.B., Sarwar, G., Gilliam, R.C., 2007. Evaluation of the Community Multiscale Air Quality (CMAQ) model version 4.5: sensitivities impacting model performance part I – ozone. Atmos. Environ. 41, 9603–9615.

Bloomer, B.J., Stehr, J.W., Piety, C.A., Salawitch, R.J., Dickerson, R.R., 2009. Observed relationships of ozone air pollution with temperature and emissions. Geophys. Res. Lett. 36, L09803.

Byun, D., Schere, K., 2006. Review of the governing equations, computational algorithms, and other components of the Models-3 Community Multiscale Air Quality (CMAQ) modeling system. Appl. Mech. Rev. 59, 51–77.

Camalier, L., Cox, W., Dolwick, P., 2007. The effects of meteorology on ozone in urban areas and their use in assessing ozone trends. Atmos. Environ. 41, 7127–7137.

Davis, J.M., Eder, B.K., Nychka, D., Yang, Q., 1998. Modeling the effects of meteorology on ozone in Houston using cluster analysis and generalized additive models. Atmos. Environ. 32, 2505–2520.

Davis, J.M., Speckman, P., 1999. A model for predicting maximum and 8 h average ozone in Houston. Atmos. Environ. 33, 2487–2500.

Davison, A.C., Hinkley, D.V., 1997. Bootstrap Methods and Their Application. Cambridge University Press, New York.


Dobson, A.J., Barnett, A.G., 2008. An Introduction to Generalized Linear Models. Chapman & Hall/CRC, New York.

Durre, I., Vose, R.S., Wuertz, D.B., 2006. Overview of the integrated global radiosonde archive. J. Clim. 19, 53–68.

EPA, January 2009a. Air Quality System Technology Transfer Network Website. http://www.epa.gov/ttn/airs/airsaqs/index.htm.

EPA, December 2009b. Endangerment and Cause or Contribute Findings for Greenhouse Gases Under Section 202(a) of the Clean Air Act. Tech. Rep. EPA-HQ-OAR-2009-0171. U.S. Environmental Protection Agency, Washington DC. Available at: http://www.epa.gov/climatechange/endangerment.html.

EPA, February 2010a. Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2002, Annual Report. Tech. Rep. EPA-600/R-10/017. U.S. Environmental Protection Agency, Research Triangle Park, North Carolina.

EPA, February 2010b. Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2003, Annual Report. Tech. Rep. EPA-600/R-10/018. U.S. Environmental Protection Agency, Research Triangle Park, North Carolina.

EPA, February 2010c. Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2004, Annual Report. Tech. Rep. EPA-600/R-10/019. U.S. Environmental Protection Agency, Research Triangle Park, North Carolina.

EPA, February 2010d. Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2005, Annual Report. Tech. Rep. EPA-600/R-10/020. U.S. Environmental Protection Agency, Research Triangle Park, North Carolina.

Green, P.J., Silverman, B.W., 1994. Nonparametric Regression and Generalized Linear Models: A Roughness Approach. Chapman & Hall, New York.

Grell, G., Dudhia, J., Stauffer, D., 1994. A Description of the Fifth Generation Penn State/NCAR Mesoscale Model (MM5). Tech. Rep. TN-398+STR, NCAR.

Hastie, T.J., Tibshirani, R.J., 1990. Generalized Additive Models. Chapman & Hall, London.

Hogrefe, C., Lynn, B., Civerolo, K., Ku, J.-Y., Rosenthal, J., Rosenzweig, C., Goldberg, R., Gaffin, S., Knowlton, K., Kinney, P.L., 2004. Simulating changes in regional air pollution over the eastern United States due to changes in global and regional climate and emissions. J. Geophys. Res. 109, D22301. doi:10.1029/2004JD004690.

Jacob, D.J., Winner, D.A., 2009. Effect of climate change on air quality. Atmos. Environ. 43, 51–63.

McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models. Chapman & Hall, London.

NCDC, 2008. Data Documentation for Data Set 3505: Integrated Surface Data. Tech. Rep. DSI-3505. National Climatic Data Center, Asheville, NC.

Nolte, C.G., Gilliland, A.B., Hogrefe, C., Mickley, L.J., 2008. Linking global to regional models to assess future climate impacts on surface ozone levels in the United States. J. Geophys. Res. 113, D14307. doi:10.1029/2007JD008497.

R Development Core Team, 2007. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0. http://www.R-project.org/.

Thompson, M.L., Reynolds, J., Cox, L.H., Guttorp, P., Sampson, P.D., 2001. A review of statistical methods for the meteorological adjustment of tropospheric ozone. Atmos. Environ. 35, 617–630.

Tong, D.Q., Mauzerall, D.L., 2006. Spatial variability of summertime tropospheric ozone over the continental United States: implications of an evaluation of the CMAQ model. Atmos. Environ. 40 (17), 3041–3056.

Wood, S.N., 2006. Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC, New York.

Zheng, J., Swall, J.L., Cox, W.M., Davis, J.M., 2007. Interannual variation in meteorologically adjusted ozone levels in the eastern United States: a comparison of two approaches. Atmos. Environ. 41, 705–716.