An objective comparison of CMAQ and REMSAD performances


Atmospheric Environment 40 (2006) 4920-4934
www.elsevier.com/locate/atmosenv

Edith Gego(a), P. Steven Porter(b), Christian Hogrefe(c), John S. Irwin(d)

(a) 308 Evergreen Drive, Idaho Falls, ID 83401, USA
(b) University of Idaho, 1776 Science Center Drive, Idaho Falls, ID 83402, USA
(c) Atmospheric Sciences Research Center, University at Albany, ES 351, State University of New York, Albany, NY 12222, USA
(d) 1900 Pony Run Road, Raleigh, NC 27615-7415, USA

Corresponding author. Tel.: +1 208 523 5873; fax: +1 208 282 7975. E-mail address: e.gego@onewest.net (E. Gego).

Received 8 February 2005; received in revised form 17 October 2005; accepted 22 December 2005

doi:10.1016/j.atmosenv.2005.12.045
1352-2310/$ - see front matter (c) 2006 Elsevier Ltd. All rights reserved.

Abstract

Photochemical air quality modeling systems are the primary tools used in regulatory applications to assess the impact of different emission reduction strategies aimed at reducing air pollutant concentrations to levels considered safe for public health. Two such modeling systems are the community multiscale air quality (CMAQ) model and the regional modeling system for aerosols and deposition (REMSAD). To facilitate their inter-comparison, the United States Environmental Protection Agency performed simulations of air quality over the contiguous United States during year 2001 (horizontal grid cell size of 36 x 36 km) with CMAQ and REMSAD driven by identical emission and meteorological fields. Here, we compare the abilities of CMAQ and REMSAD to reproduce measured aerosol nitrate and sulfate concentrations. Model estimates are compared to observations reported by the interagency monitoring of protected visual environment (IMPROVE) network and the clean air status and trend network (CASTNet). Root mean squared errors are calculated for simulation/observation pairs from ten geographic regions and 12 seasons (months). Following the application of the Wilcoxon matched-pair signed rank test, we conclude that CMAQ is more skillful than REMSAD for simulation of aerosol sulfate. Simulations of particulate nitrate concentrations by CMAQ and REMSAD can seldom be differentiated, leading to the conclusion that both models perform equally for this pollutant species.
(c) 2006 Elsevier Ltd. All rights reserved.

Keywords: Aerosol sulfate; Aerosol nitrate; Wilcoxon signed rank test; Evaluation metric; Photochemical model

1. Introduction

Models are the principal tools used by governmental agencies to develop emission reduction strategies aimed at achieving safe, and therefore admissible, air quality. Models are indeed the only tool that allows testing of the impact of different reduction strategies on air quality and, therefore, facilitates decisions about the most suitable alternatives. Two of the most prominent modeling systems are the community multiscale air quality (CMAQ) model (Byun and Ching, 1999) and the regional modeling system for aerosols and deposition (REMSAD) (ICF Consulting, 2002). To promote model-to-model comparison of these two modeling systems, the United States Environmental Protection Agency (US EPA) recently used CMAQ and REMSAD to simulate air quality over the contiguous US during year 2001, with both models responding to identical inputs (meteorology, emissions, etc.). Our objective is to use the results of these simulations to compare the ability of REMSAD and CMAQ to reproduce measured aerosol nitrate and sulfate concentrations.

Our evaluation of the respective strengths and weaknesses of CMAQ and REMSAD relies on calculation of the root mean squared errors (RMSE) between model estimates and corresponding observations. In an effort to unveil the areas and time periods where the quality of CMAQ and REMSAD estimates significantly differ from each other, simulation results were organized into ten geographical areas and monthly periods for calculation of the evaluation metric (RMSE). Objectivity in our assessment is attained by submitting matching sets of the evaluation metric, characterizing CMAQ and REMSAD respectively, to a statistical test of comparison of means.

2. Models

Air quality estimates were produced by CMAQ (2004 release version) and REMSAD (version 7.6) using nearly identical meteorological and emission fields. CMAQ and REMSAD are three-dimensional Eulerian air quality modeling systems designed to simulate the chemistry, transport, and deposition of airborne pollutants. The two systems mostly differ from each other in their modeling of chemistry.
Details about CMAQ and REMSAD can be found at http://www.epa.gov/asmdnerl/models3/doc/science/ and http://remsad.saintl.com, respectively.

The meteorological fields used in CMAQ and REMSAD were produced by MM5, the fifth-generation Penn State University (PSU)/National Center for Atmospheric Research (NCAR) mesoscale model (Grell et al., 1994). MM5 (version 5) was used to reconstruct meteorology over the continental United States from 1 January 2001 to 31 December 2001 with a horizontal resolution of 36 km. Vertically, the domain comprises 34 layers, with the surface layer approximately 50 m deep. Topographic information was developed using the NCAR and the United States Geological Survey (USGS) terrain databases. Vegetation type and land use information was developed using the NCAR/PSU databases provided with MM5. Initial and boundary conditions were extracted from the NCAR ETA reanalysis archives. An analysis-nudging technique was used to nudge predictions (winds, temperature and the mixing ratio) toward surface and aloft observations. Thermodynamic variables were not nudged within the boundary layer. The model was run with a 5 1/2-day window and a restart at 12:00 GMT (Greenwich mean time) every fifth day.

[Fig. 1. Regions (I-X) identified in the contiguous US and location of the CASTNet and IMPROVE observation sites included in the study.]
Further details about the MM5 setting, such as the physical options utilized, are available in McNally (2003).

The MM5 fields were processed by the meteorology-chemistry interface preprocessor (MCIP) version 2.2 to provide linkage to the air quality models. See details about MCIP at http://www.epa.gov/asmdnerl/models3/doc/science/chap12.pdf.

Anthropogenic emission fields from fixed sources were obtained with the sparse matrix operator kernel emission model (SMOKE) (Carolina Environmental Programs, 2003) processing the US EPA National Emissions Inventory for 2001. Emissions from mobile sources were prepared with the MOBILE 6 module (US EPA, 2003); biogenic emissions were estimated with BEIS3.12 (http://www.epa.gov/asmdnerl/biogen.html) in conjunction with the MM5-derived meteorological estimates. Model-ready emission data with a horizontal grid size of 36 km x 36 km were created from the emission fields by the emission-chemistry interface processor (ECIP).

3. Observations

Observations used to judge model performance are aerosol sulfate and nitrate concentrations reported by the interagency monitoring of protected visual environment (IMPROVE) network and the clean air status and trend network (CASTNet). The IMPROVE network was designed to supervise air quality in pristine environments whereas CASTNet sites are located mostly in rural, not necessarily pristine, situations. In the western United States, though, newly added CASTNet sites are often collocated with an IMPROVE counterpart. Sampling and analysis protocols adopted by CASTNet and IMPROVE are also very different: CASTNet samples are 7-day integrated averages while IMPROVE samples are 24-h averages collected every third day.

Table 1
Number of IMPROVE and CASTNet sites in each region

Region     I   II  III   IV   V   VI  VII  VIII  IX   X   Total
CASTNet    3    7    3    8   1   14   12     6  14   5      73
IMPROVE    9   12   12   24   5    4    6     4   5   5      86

[Table 2. Comparison of CMAQ and REMSAD estimates of sulfate concentrations to observations at IMPROVE sites: RMSE (ug/m3) characterizing CMAQ and REMSAD estimates, by month (Jan.-Dec.) and region (I-X).]

The air sampler at a CASTNet site is a non-size-selective three-stage filter pack located 10 m above ground level. Filters are not equipped with a particle size-limiting device, but the flow rate utilized and the height of the instrument are thought to prohibit entrance of coarse particles (Finkelstein, 2003, personal communication).
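Because the two networks integrate over different windows (24-h IMPROVE samples collected every third day versus 7-day integrated CASTNet samples), model output must be aggregated to the matching window before any comparison. A minimal sketch of the two aggregations (function names and daily values are illustrative assumptions; the paper does not describe its processing code):

```python
from statistics import mean

def improve_series(daily_values, start_day=0, period=3):
    """IMPROVE-style record: one 24-h sample kept every `period` days."""
    return daily_values[start_day::period]

def castnet_series(daily_values, period=7):
    """CASTNet-style record: each value is a `period`-day integrated
    average; trailing partial weeks are discarded."""
    n_full = len(daily_values) // period
    return [mean(daily_values[i * period:(i + 1) * period])
            for i in range(n_full)]

# Made-up month of daily model sulfate estimates (ug/m3)
daily = [1.0 + 0.1 * d for d in range(28)]
improve_like = improve_series(daily)   # samples on days 0, 3, 6, ...
castnet_like = castnet_series(daily)   # one mean per full week
```

Only the aggregates on matching windows would then enter the RMSE calculations described in Section 4.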
The nitrate and sulfate ions interpreted as particulate species are collected on the first of the three consecutive filters, which is composed of Teflon. CASTNet concentrations are standardized to a temperature of 25 C and a pressure of 1013 mb before being reported.

The IMPROVE air sampler consists of four modules located 3 m above ground level and equipped with a device that stops particles larger than 2.6 um. Sulfate concentration is determined from the sulfur found on a Teflon filter. Nitrate is determined from particles caught on a nylon filter that is preceded by an acidic vapor diffusion denuder, which eliminates nitric acid vapor (non-particulate nitrate). Measured concentrations are reported at ambient temperature and pressure conditions.

While aerosol sulfates are collected on the same type of substrate (Teflon filter) at both IMPROVE and CASTNet sites, nitrate interpreted as particulate material is collected on a Teflon filter at CASTNet sites and a nylon filter at IMPROVE sites. These differences in sampling equipment probably explain the differences between nitrate concentrations reported by the two networks at almost collocated sites (Gego et al., 2005). Further details about the sampling protocols utilized by each network, as well as the data they provide, are available at http://vista.cira.colostate.edu/IMPROVE/ and http://www.epa.gov/CASTNet.

4. Methods

4.1. Model evaluation metric

A variety of evaluation metrics are available to assess a model's ability to reproduce past observations (e.g., Hanna, 1994). For the model-to-model comparison presented here, we chose to utilize the RMSE. This metric seemed better suited for our purpose than the model bias, another very commonly used evaluation metric, because it is calculated as the average of positive values only (the squared errors) and, therefore, will not mask model flaws the way the addition of positive and negative errors can. The RMSE between a set of model predictions and the observed values the model is supposed to reproduce is calculated as

    RMSE = sqrt( (1/n) * sum over (x,t) of [M(x,t) - O(x,t)]^2 ),

with M(x,t) the model prediction at location x and time t, O(x,t) the observation at location x and time t, and n the number of model/observation pairs.

Because of the sampling protocol differences mentioned above, RMSEs are calculated separately for the two networks. Model grid cell estimates are compared to data collected at the observation site located in the corresponding cell. No interpolation was carried out to account for the change of support volume from a model cell average to a point measurement.

[Table 3. Differentiation of the performances of CMAQ and REMSAD to simulate aerosol sulfate with the Wilcoxon matched-pair test (statistic T'; probability levels p less than 5% underlined; C: CMAQ significantly better than REMSAD; R: REMSAD significantly better than CMAQ), by region and by month for the IMPROVE and CASTNet networks.]

4.2. Geographic and seasonal subdivisions

A single RMSE value can conceivably characterize the whole simulation.
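The RMSE formula, evaluated separately for each subset of matched model/observation pairs, can be sketched as follows (pure Python; the record layout and names are illustrative assumptions, not from the paper):

```python
import math
from collections import defaultdict

def rmse(pairs):
    """RMSE = sqrt((1/n) * sum((M - O)^2)) over matched
    (model, observation) pairs."""
    return math.sqrt(sum((m - o) ** 2 for m, o in pairs) / len(pairs))

def rmse_by_region_month(records):
    """One RMSE per (region, month) combination, each computed from
    all model/observation pairs falling in that combination."""
    groups = defaultdict(list)
    for region, month, model_value, obs_value in records:
        groups[(region, month)].append((model_value, obs_value))
    return {key: rmse(pairs) for key, pairs in groups.items()}

# Invented records: (region, month, model estimate, observation), ug/m3
records = [
    ("I", "Jan", 0.9, 1.0), ("I", "Jan", 1.4, 1.2),
    ("II", "Jan", 2.0, 2.5), ("II", "Jan", 1.8, 2.1),
]
table = rmse_by_region_month(records)
```

Grouping before averaging is what preserves the region- and month-level contrasts that Section 4.2 sets out to exploit.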
However, since we wish to determine when and where the quality of CMAQ and REMSAD estimates significantly differ from each other, we avoided this fierce averaging and organized simulation results into ten geographical areas and 12 seasons (months) prior to RMSE calculations. Two RMSEs, characterizing each of the models respectively, were calculated for each combination of region and month by incorporating all pairs of relevant observations and model values. The every-3-day sampling schedule at IMPROVE sites led to the following sample sizes for the months from January through December 2001: 11, 9, 11, 10, 10, 10, 10, 11, 10, 10, 10, 9. Similarly, CASTNet monthly sample sizes (based on weekly sampling durations) were: 4, 4, 5, 4, 4, 4, 5, 5, 4, 4, 3. Only sites with fewer than 15% missing values were used.

Ten geographic regions (Fig. 1) were identified in the continental US on the basis of the modes of variation observed for aerosol sulfate (Irwin et al., 2004). These regions also reflect the broad natural, topographic and climatic features encountered. The area extending from Texas north to the Nebraska-South Dakota border is not represented in this study because it contains only one monitoring site. Table 1 recapitulates the number of observation sites operated by CASTNet and IMPROVE in each region.

4.3. Model-to-model comparison

To compare CMAQ and REMSAD performances and determine whether or not they significantly differ from each other, we calculated the differences between matched pairs of RMSEs, respectively characterizing CMAQ and REMSAD in each region/month, and evaluated the mean of these differences. If CMAQ and REMSAD skills are similar, the paired RMSEs should be about equal and the mean of their differences should therefore be close to 0. If the skill of one model considerably surpasses the other, the mean of the paired differences will be significantly different from zero.

The statistical test chosen to perform this comparison is the Wilcoxon matched-pairs signed rank test (WMP), a non-parametric counterpart of the matched-pairs Student t-test. The non-parametric option releases us from concerns about the normality of the underlying population of RMSE differences with only limited loss of power. As in a paired Student's t-test, the null hypothesis of the WMP test states that the mean of the paired RMSE differences is zero; i.e., the mean of RMSEs characterizing CMAQ equals that of RMSEs characterizing REMSAD. Unlike the Student t-test, the WMP test is not performed on the RMSE values but on their ranks.

Practically speaking, applying the WMP test first requires calculation of the differences between matching RMSE values. The absolute values of these differences are then ranked from least to greatest, and the ranks are assigned the sign of the corresponding RMSE difference. Finally, the sums of the positive ranks (T+) and the negative ranks (T-) are calculated. If the performances of CMAQ and REMSAD are similar, the positive and negative rank-sums should be approximately equal (T+ close to T-). A large gap between the positive and negative rank-sums indicates that one model performs consistently better than the other. The smaller of the two T values is selected as the test statistic (T'), and the probability of observing a T' value equal to or smaller than that observed if the null hypothesis is true (often simply referred to as the probability level, p) is identified. The null hypothesis is rejected (meaning that the qualities of the two models' estimates are significantly different) if p falls below a threshold value, usually set at 5%.

When assessing within-month performance contrasts, we take the ten paired regional RMSEs for a given month, calculate their differences and submit these differences to the WMP test. When assessing performance contrasts within each region, we compare the 12 paired monthly RMSEs characterizing a given region. In addition, we compare the paired RMSEs encountered during the 6 months with highest concentrations, hence contrasting the models' skill at reproducing high-concentration seasons.
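The ranking procedure just described can be written out in a few lines. This is a from-scratch illustration of the T' statistic only (the tail probability p would then be read from the exact Wilcoxon distribution or a statistics package); the data are invented monthly RMSE pairs, scaled to integers so that the tie handling is exact:

```python
def wilcoxon_t_prime(rmse_a, rmse_b):
    """Wilcoxon matched-pairs signed rank statistic T'.

    Differences between matching RMSE values are computed, their
    absolute values ranked from least to greatest (tied |d| receive
    the average of the ranks they span; zero differences are dropped),
    each rank takes the sign of its difference, and T' is the smaller
    of the positive and negative rank sums."""
    diffs = [a - b for a, b in zip(rmse_a, rmse_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        avg_rank = (i + j) / 2 + 1          # 1-based average rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(t_plus, t_minus), t_plus, t_minus

# Invented example: six monthly sulfate RMSEs of two models, in 0.01 ug/m3
model_a = [71, 38, 30, 47, 40, 51]
model_b = [74, 36, 35, 38, 41, 52]
t_prime, t_plus, t_minus = wilcoxon_t_prime(model_a, model_b)
```

With six non-zero differences the two rank sums must total 6 x 7 / 2 = 21; here T+ = 9 and T- = 12, so T' = 9, a gap far too small to reject the null hypothesis at the 5% level for n = 6.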
5. Results

5.1. Aerosol sulfate

5.1.1. Comparison with IMPROVE observations

The RMSE values characterizing the ability of CMAQ and REMSAD to reproduce sulfate observations for each month and region are shown in Table 2. The T' values calculated from the information provided in Table 2 and the respective probability levels are summarized in Table 3. To facilitate inspection of the results, tests that confirm model performances to be significantly different (p < 5%) are underlined and the model whose quality was found better (CMAQ or REMSAD) is identified. Note that the probability levels shown in Table 3 are associated with each individual test; caution is advised if considering several tests together.

Considering the 12 monthly RMSEs within each region, model performance could be differentiated in regions II, IV, VIII, and IX but not elsewhere. Of these regions, CMAQ had a lower RMSE in all but region II. When focusing only on the six months of high concentrations (April to September), significant differences between the two models are found in regions II, III, IV, VII, and IX. Once again, CMAQ had a lower RMSE in all these regions but region II.

[Fig. 2. Time series of the mean measured sulfate concentrations at IMPROVE sites and the corresponding CMAQ and REMSAD estimates in four regions: II (panel a), VII (panel b), IX (panel c) and X (panel d); biweekly moving averages.]

Fig. 2 displays the time series of average measurements and corresponding model estimates in four of the ten regions identified in the contiguous US: region II (panel a), region VII (panel b), region IX (panel c) and region X (panel d). Region II was chosen as representative of the western US, while the three eastern regions have the highest annual mean sulfate concentrations. IMPROVE data (24-h concentrations) and corresponding model estimates show high short-term variability, a condition that prevents their clear display. We therefore chose to display the temporal average of 5 successive sampling days, i.e. a quasi-biweekly signal since sampling events are separated by 3 days in the IMPROVE protocol, rather than individual observations. Temporal averaging attenuates the extremes, therefore improving visualization. Each line in Fig. 2 corresponds to the biweekly signal of the mean observations and model estimates at all sites in the region displayed.
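The quasi-biweekly display signal, i.e. the average of 5 successive sampling events, is a plain moving average; a short sketch with made-up numbers:

```python
def moving_average(samples, window=5):
    """Mean of each run of `window` successive sampling events; with
    IMPROVE's one-sample-every-third-day schedule, window=5 spans
    roughly two weeks. Returns an empty list if the series is shorter
    than the window."""
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]

# Made-up 24-h concentrations from successive sampling days (ug/m3)
series = [2.0, 1.0, 3.0, 2.0, 2.0, 4.0, 1.0]
smoothed = moving_average(series)
```

Each output value replaces a raw sampling event with the mean of its five-event neighborhood, which is exactly the attenuation of extremes the display relies on.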
Although the largest in the western US, sulfate concentrations in region II (panel a) are modest in comparison to those observed in the eastern US. Interestingly, the CMAQ and REMSAD signals are very close, but neither resembles the pattern observed there. Yet, REMSAD daily estimates (not displayed, but substituted for by the biweekly signals) were often slightly closer to reality than CMAQ's, justifying why the WMP test showed REMSAD to be superior. In region X, both models follow the same pattern, but the amplitudes of CMAQ's fluctuations more closely resemble those of the observations. Results are similar in region VII (panel b), where CMAQ predictions seem more accurate than REMSAD's during the high-concentration season but less accurate during the low-concentration seasons. That explains why the WMP test pertaining to the entire year did not show CMAQ to be better while the test performed on the high-concentration months only (6 monthly RMSE values) did.

In inspecting model performance by month, it appears that April, July and August yielded contrasted differences, each time in favor of CMAQ. Complementing this comparison, Fig. 3 indicates the model (CMAQ or REMSAD) that led to the most accurate results (smallest RMSE value) for each region during the 6 months of the high-concentration season.

[Fig. 3. Identification of the most accurate model (CMAQ or REMSAD) for sulfate simulation during the 6 months of high sulfate concentrations (April to September); IMPROVE network.]

5.1.2. Comparison with CASTNet observations

[Fig. 4. Time series of the mean measured sulfate concentrations at CASTNet sites and the corresponding CMAQ and REMSAD estimates in four regions: II, VI, VII and IX.]

Fig. 4 displays the temporal evolution of sulfate in regions II, VI, VII and IX, the three latter identified as the most polluted areas in the CASTNet network. Because the weekly average concentrations measured by CASTNet do not show the short-term variability present in IMPROVE data, each graph displays raw information and not a moving average of several sampling events. As observed with IMPROVE data, both models constantly underestimate sulfate concentrations in region II. Elsewhere, CMAQ seems better than REMSAD at simulating high-concentration periods.

Table 4 summarizes the RMSEs calculated for each month and region, while the T' values and corresponding probability levels are indicated in Table 3.

[Table 4. Comparison of CMAQ and REMSAD estimates of sulfate concentrations to observations at CASTNet sites: RMSE (ug/m3) characterizing CMAQ and REMSAD estimates, by month (Jan.-Dec.) and region (I-X).]

With CASTNet data being the basis for comparison rather than IMPROVE's, CMAQ estimates proved better in regions II, IV, VII and VIII. Of these four regions, only region VII experiences high sulfate concentrations. In examining the months from April to September, CMAQ estimates seem more realistic than REMSAD's everywhere but in regions I and V, where the quality of the two models could not be differentiated.
Note that CASTNet maintains only three sites in region I and a single site in region V (Table 1).

The WMP test applied to sulfate estimates sorted by month showed that CMAQ was the better model during the 6 months of high concentrations, i.e., from April to September, corroborating the regional contrasts during this period, and again during November. CMAQ estimates were more accurate than REMSAD's in all ten regions for the months of April, May and August, as proven by the null values of T'.

5.1.3. Differences between the evaluation results obtained with IMPROVE and CASTNet

As stated in Section 3, the frequency and duration of a sampling event are quite different in the IMPROVE and CASTNet protocols. While IMPROVE data describe 24-h average concentrations measured every 3 days, CASTNet data are 7-day integrated samples. Because of these differences, IMPROVE and CASTNet data allow assessment of different model skills: the ability of a model to reproduce day-to-day variations or longer-term (weekly) variations. In the case of sulfate, CMAQ's skill shows better than REMSAD's when assessed by comparison with CASTNet data, but not so notably with IMPROVE data. This finding tends to show that CMAQ's edge over REMSAD resides in its ability to better reproduce changes in weekly averages, but not the day-to-day fluctuations. The preceding comments need to be considered cautiously, since differences in the location and the number of sites per region for each network may also explain these results.

5.2. Aerosol nitrate

5.2.1. Comparison with IMPROVE observations

[Fig. 5. Time series of the mean measured nitrate concentrations at IMPROVE sites and the corresponding CMAQ and REMSAD estimates in four regions: II (panel a), VI (panel b), VII (panel c) and IX (panel d).]

Fig. 5 presents the temporal evolution of aerosol nitrate in regions II, VI, VII and IX. Generally speaking, air quality in the western US is better than in the East. However, in terms of nitrate pollution, region II ranks as the third worst region, just after regions VII and IX. As seen in panel a, measured concentrations in region II do not diminish as sharply during late spring and summer as in the other three regions. CMAQ and REMSAD over-predict this concentration decrease and therefore underestimate the extent of nitrate pollution in region II during summertime. Both models simulate region VI (panel b) reasonably well, although REMSAD largely exaggerates late-fall concentrations. In region VII (panel c), both models over-predict high concentrations but are faithful to observations during the low-concentration season. Finally, CMAQ and REMSAD overestimate concentrations in region IX during most of the year, especially during the high-concentration period (cold season).

Table 5 summarizes the RMSEs that characterize the goodness of CMAQ and REMSAD nitrate estimates for each month and region. Table 6 shows the corresponding T' and probability levels. In analyzing their year-long performance, it appears that CMAQ and REMSAD did not simulate regions IV, VII or X equally well. More specifically, REMSAD was significantly better at simulating region IV, an area with extremely low concentrations, while CMAQ failed at reproducing the general pattern observed. On the other hand, CMAQ simulated regions VII and X more faithfully. Fig. 6 and panel c of Fig. 5 detail the observed and simulated temporal evolution of nitrate in the areas with contrasted model performances. Carrying out the WMP test on results describing the months of high concentration only (from November to April) led to similar results, although the superiority of CMAQ in region VII could no longer be proven.
As illustrated in Fig. 7, which indicates the most accurate model for each region during the 6 months of high nitrate concentrations, the performances of REMSAD and CMAQ are close. The WMP test applied to results sorted by month (Table 6) showed that only April and November led to significantly different model performances, with REMSAD estimates more accurate in April and CMAQ's more appropriate in November.

[Table 5. Comparison of CMAQ and REMSAD estimates of nitrate concentrations to observations at IMPROVE sites: RMSE (ug/m3) characterizing CMAQ and REMSAD estimates, by month (Jan.-Dec.) and region (I-X).]

[Table 6. Differentiation of the performances of CMAQ and REMSAD to simulate aerosol nitrate with the Wilcoxon matched-pair test (statistic T'; probability levels p less than 5% underlined; C: CMAQ significantly better than REMSAD; R: REMSAD significantly better than CMAQ), by region and by month for the IMPROVE and CASTNet networks.]

[Fig. 6. Time series of the mean measured nitrate concentrations at IMPROVE sites and the corresponding CMAQ and REMSAD estimates in the Southwest (region IV, panel a) and the South-Atlantic (region X, panel b) regions; biweekly moving averages.]

5.2.2. Comparison with CASTNet observations

Table 7 shows the RMSE characterizing nitrate estimates by region and month, CASTNet observations being used as the basis for comparison. Analyzing all monthly results with the WMP test (Table 6) showed that the models' performance was distinguishable only in regions I and III, regions with relatively low nitrate levels, with REMSAD more accurate in region I and CMAQ more accurate in region III.
Dec.17 22 12 12 26 7 2116.11 31.25 6.54 6.54 46.09 1:86C27.8320 18 9 3 18 27 1124.61 18.75 3:22C0:49C18.75 50.00 5.27icates that REMSAD is signicantly better than CMAQ.Q estimates REMSAD estimates32.251.50.7503/31 6/30 9/30(b)Region XMPROVE sites and the corresponding CMAQ and REMSADmoving averages.ARTICLE IN PRESSE. Gego et al. / Atmospheric Environment 40 (2006) 49204934 4931January FebruaryMarch AprilNovember DecemberCMAQ more accurate than REMSADREMSAD more accurate than C MAQ that region V, sampled by a single CASTNetmonitor, is the only region where model perfor-mance was different, with REMSAD better thanCMAQ. Also illustrating this absence of contrast,Fig. 8 presents the evolution of nitrate concentra-tions in the four regions previously depicted: regionsII, VI, VII and IX. As already observed withIMPROVE data, CMAQ and REMSAD predic-tions are too low in region II during the summer-time (panel a). Both models reproduce regions VIand VII quite well with fairly the same degree ofinaccuracy, justifying why WMP test did notdistinguish their respective skills. Finally, predic-tions in region IX (panel d) are overestimated byboth models, especially during the months of highconcentrations.Examination of model performance for eachmonth with the WMP test (Table 6) shows thatREMSAD was better than CMAQ for simulation ofFebruary (month with high concentrations), whileCMAQ was more accurate for August and Septem-ber, both months with low concentrations. ModelFig. 7. Identication of the most accurate model for nitratesimulation during the 6 months of high observed nitrateconcentrationsIMPROVE network.performances could not be differentiated for the 9other months.Generally speaking, whereas CMAQ was oftenshown superior for simulation of aerosol sulfate, itsskill at reproducing nitrate seems to be comparableto that of REMSAD.6. 
6. Summary

Ironically, while the modeling community devotes a great amount of attention and energy to rigorous quantification of the relevant chemical and physical processes, the results of a model evaluation are often communicated in qualitative terms. Statements such as "the model is doing fairly well" or "the model has been greatly improved" are often made without quantitative supporting evidence. Subjectivity can be even more treacherous when judging the relative performance of two models. When can one conclude that a model is better than another? Does a simple visual examination of model outputs suffice? Must one model prove superior for all times and points in the model domain?

The US EPA has endeavored to facilitate the model-to-model comparison of the CMAQ and REMSAD models by performing an annual simulation of air quality over the contiguous US using both models driven by identical inputs (meteorology, emissions, etc.). Here we attempted to make both qualitative and quantitative assessments of the respective skills of CMAQ and REMSAD at simulating aerosol nitrate and sulfate. Graphs of observed and modeled time series obtained with CMAQ and REMSAD were compared. On rare occasions, a visual examination of these graphs was sufficient to decide upon the best model, such as when one of them provided estimates systematically closer to observations than the other. In most cases, the prevalence of one model over the other appeared weak, transient and/or local. In these instances, to remove any subjectivity from our interpretation, we calculated a standard evaluation metric (the RMSE) to characterize the goodness of each model and submitted matched pairs of this evaluation metric to a statistical test of comparison of means (the Wilcoxon matched-pairs signed rank test). In an effort to unveil the areas and time periods where the quality of CMAQ and REMSAD estimates significantly differs, simulation results were organized into ten geographical areas and monthly periods for calculation of the RMSEs.
The WMP test was used to determine the significance of the differences between CMAQ and REMSAD performances during each month simulated and within each region.

Table 7
Comparison of CMAQ and REMSAD estimates of nitrate concentrations to observations at CASTNet sites

RMSE (µg m⁻³) characterizing CMAQ estimates, by month and region

Month   I     II    III   IV    V     VI    VII   VIII  IX    X
Jan.   0.36  1.15  0.22  0.36  0.68  2.86  1.08  0.70  1.41  1.43
Feb.   0.37  0.53  0.42  0.35  1.28  1.46  0.81  1.31  1.48  1.18
Mar.   0.41  1.40  0.12  0.38  0.70  1.69  1.42  0.91  1.93  1.42
Apr.   0.80  0.73  0.33  0.45  0.51  1.20  0.83  1.23  2.57  1.45
May    0.59  0.84  0.17  0.41  0.33  0.87  0.45  0.40  1.14  1.03
Jun.   0.27  0.90  0.19  0.43  0.29  0.57  0.26  0.12  0.51  0.78
Jul.   0.47  1.05  0.17  0.44  0.20  0.49  0.14  0.16  0.48  0.64
Aug.   0.50  0.95  0.22  0.37  0.40  0.38  0.20  0.19  0.31  0.76
Sep.   0.46  1.10  0.14  0.33  0.36  0.54  0.36  0.20  0.54  0.68
Oct.   0.29  1.51  0.19  0.36  0.36  0.67  1.20  1.23  1.45  1.24
Nov.   0.27  1.01  0.16  0.45  0.67  2.06  1.69  1.31  2.27  1.94
Dec.   0.21  0.31  0.19  0.26  0.84  2.21  0.92  0.46  1.05  1.01

RMSE (µg m⁻³) characterizing REMSAD estimates, by month and region

Month   I     II    III   IV    V     VI    VII   VIII  IX    X
Jan.   0.53  0.94  0.23  0.27  0.41  2.44  1.33  1.19  1.37  1.66
Feb.   0.30  0.54  0.39  0.34  1.14  1.54  0.63  0.99  1.24  1.06
Mar.   0.31  1.44  0.18  0.34  0.59  1.71  1.69  1.11  1.93  1.61
Apr.   0.64  0.58  0.37  0.63  0.63  0.91  0.76  0.93  1.87  1.32
May    0.52  0.92  0.29  0.47  0.39  1.04  0.37  0.36  1.04  1.11
Jun.   0.31  0.92  0.21  0.45  0.26  1.16  0.42  0.39  1.25  0.74
Jul.   0.32  1.05  0.17  0.44  0.12  0.75  0.25  0.36  0.89  0.62
Aug.   0.22  0.97  0.23  0.37  0.40  0.69  0.21  0.62  1.10  0.78
Sep.   0.41  1.10  0.19  0.35  0.43  0.90  0.44  0.48  1.23  0.79
Oct.   0.27  1.55  0.20  0.39  0.45  1.31  0.93  1.39  1.54  1.12
Nov.   0.32  1.04  0.21  0.46  0.32  2.08  1.57  1.37  2.28  1.84
Dec.   0.15  0.33  0.17  0.22  0.55  1.30  1.43  1.46  2.05  1.37

Fig. 8. Time series of the mean measured nitrate concentrations at CASTNet sites and the corresponding CMAQ and REMSAD estimates in four regions: (a) region II; (b) region VI; (c) region VII; (d) region IX.

The results of this analysis can be summarized as follows:

- In the case of sulfate, significant differences in the quality of CMAQ and REMSAD estimates were found for about half the tests performed (for 3 to 7 months out of the 12 months of simulation, depending on the observation network considered, and in three to seven of the ten regions individualized, depending on the observation network and the length of the simulated period, i.e., all year vs. the 6 months of high concentrations). When differences between CMAQ and REMSAD proved significant, it was almost exclusively in favor of CMAQ. The exception to that statement is region II (California), where REMSAD estimates match IMPROVE data more closely. CMAQ was shown significantly better in three to four regions, whether assessed with IMPROVE or CASTNet data, and better at reproducing all six months of high concentrations when compared with CASTNet data. CMAQ superiority was not as prevalent, although still present, if assessed with IMPROVE observations, leading us to speculate that the strength of the CMAQ model resides not in its ability to simulate day-to-day variations but in the longer-term (weekly) fluctuations.

- In the case of nitrate, significant differences in the quality of CMAQ and REMSAD estimates were found in less than 20% of the tests performed, with no model performing consistently better. For instance, if using the IMPROVE data as the basis for comparison, REMSAD was found a better model for simulating region IV, but CMAQ simulation of region X was more faithful.
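The two quantitative tools behind these comparisons, the region-by-month RMSE and the Wilcoxon matched-pairs statistic T0, are simple enough to sketch in a few lines of Python. This is a minimal sketch, not the authors' code: the function names are ours, the input series are the region I (IMPROVE) monthly RMSEs from Table 5, and the critical value of 13 is the standard two-sided 5% threshold for n = 12 pairs. With plain average-rank tie handling the sketch gives T = 31.5, in line with (though not identical to) the T0 = 31 reported in Table 6 for that region.

```python
from math import sqrt

def rmse(obs, mod):
    """Root-mean-square error between paired observed and modeled values."""
    return sqrt(sum((o - m) ** 2 for o, m in zip(obs, mod)) / len(obs))

def wilcoxon_t(rmse_a, rmse_b):
    """Wilcoxon matched-pairs signed-rank statistic T0: the smaller of the
    positive- and negative-rank sums of the paired differences."""
    # Round so tied |differences| compare equal despite float noise; drop zeros.
    diffs = [round(a - b, 6) for a, b in zip(rmse_a, rmse_b)]
    diffs = [d for d in diffs if d != 0]
    abs_sorted = sorted(abs(d) for d in diffs)
    def avg_rank(v):  # tied |differences| share their average rank
        hits = [i + 1 for i, x in enumerate(abs_sorted) if x == v]
        return sum(hits) / len(hits)
    w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return min(w_plus, w_minus)

# Monthly RMSEs (µg/m3) for region I, IMPROVE network (Table 5), Jan.-Dec.;
# each such entry would itself come from rmse(obs, model) over one region-month:
cmaq   = [1.70, 0.99, 0.31, 0.70, 0.41, 0.31, 0.22, 0.47, 0.34, 0.22, 0.92, 0.79]
remsad = [1.72, 1.02, 0.32, 0.61, 0.29, 0.13, 0.23, 0.21, 0.35, 0.24, 1.10, 0.68]

t = wilcoxon_t(cmaq, remsad)
# Two-sided 5% critical value of T for n = 12 pairs is 13 (reject when T <= 13):
print(t, "significant" if t <= 13 else "not significant")  # prints: 31.5 not significant
```

A small T means one model's errors are consistently smaller than the other's; here T = 31.5 lies well above 13, matching the finding that region I nitrate performance is indistinguishable on the IMPROVE network.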
Similarly, while February was better simulated by REMSAD, CMAQ proved better at reproducing the CASTNet data of August and September. As a result, the only statement we consider fair concerning the simulation of nitrate is that both models seem to perform reasonably, and equally, well.

Acknowledgements

This research was partially funded by the US Department of Commerce through contracts with Dr. E. Gego (EA133R-03-SE-0710), with the University of Idaho to Dr. P. S. Porter (EA133R-03-SE-0372), and with the State University of New York to Dr. C. Hogrefe (EA133R-03-SE-0650). The views presented are those of the authors and do not reflect the views or policies of the US Department of Commerce.

References

Byun, D.W., Ching, J.K.S. (Eds.), 1999. Science algorithms of the EPA Models-3 Community Multiscale Air Quality Model (CMAQ) modeling system. EPA/600/R-99/030, US Environmental Protection Agency, Office of Research and Development, Washington, DC, 20460.

Carolina Environmental Programs, 2003. Sparse Matrix Operator Kernel Emission (SMOKE) Modeling System. University of North Carolina, Carolina Environmental Programs, Research Triangle Park, NC.

Hanna, S.R., 1994. Mesoscale meteorological model evaluation techniques with emphasis on needs of air quality models. In: Pielke, R.A., Pearce, R.P. (Eds.), Mesoscale Modeling of the Atmosphere. Meteorological Monographs, vol. 25. American Meteorological Society, Boston, MA, pp. 47-58.

ICF Consulting, 2002. User's Guide to the Regional Modeling System for Aerosols and Deposition (REMSAD), Version 7. Systems Applications International/ICF Consulting, San Rafael, CA 94903, 153pp.

Irwin, J.S., Gego, E., Hogrefe, C., Jones, J.M., Rao, S.T., 2004. Comparison of sulfate concentrations simulated by two regional-scale models with measurements from the IMPROVE network. Ninth Conference on Harmonization within Atmospheric Dispersion Modeling for Regulatory Purposes, Garmisch-Partenkirchen, Germany.

McNally, D., 2003.
Annual application of MM5 for calendar year 2001. Prepared for US EPA by D. McNally, Alpine Geophysics, Arvada, CO, 179pp.

US Environmental Protection Agency, 2003. User's guide to MOBILE6.1 and MOBILE6.2. EPA Office of Air and Radiation, EPA420-R-03-010, Assessment and Standards Division, Office of Transportation and Air Quality, US Environmental Protection Agency, 262pp.

Further reading

Eder, B., Yu, S., 2004. A performance evaluation of the 2004 release of MODELS-3 CMAQ. Preprints of the 27th NATO/CCMS International Technical Meeting on Air Pollution Modeling and Its Applications, Banff, Canada, pp. 166-173.

Gego, E., Porter, P.S., Irwin, J.S., Hogrefe, C., Rao, S.T., 2005. Assessing the comparability of ammonium, nitrate and sulfate concentrations measured by three air quality monitoring networks. Pure and Applied Geophysics 162, 1919-1939.

Grell, G.A., Dudhia, J., Stauffer, D., 1994. A description of the fifth-generation Penn State/NCAR Mesoscale Model (MM5). NCAR Technical Note, NCAR/TN-398+STR.
