Assessment of the High-Resolution Rapid Refresh Model’s Ability to Predict Mesoscale Convective Systems Using Object-Based Evaluation

JAMES O. PINTO, JOSEPH A. GRIM, AND MATTHIAS STEINER

Research Applications Laboratory, National Center for Atmospheric Research,* Colorado

(Manuscript received 24 September 2014, in final form 23 April 2015)

ABSTRACT

An object-based verification technique that keys off the radar-retrieved vertically integrated liquid (VIL) is used to evaluate how well the High-Resolution Rapid Refresh (HRRR) predicted mesoscale convective systems (MCSs) in 2012 and 2013. It is found that the modeled radar VIL values are roughly 50% lower than observed. This mean bias is accounted for by reducing the radar VIL threshold used to identify MCSs in the HRRR, which allows for a fairer evaluation of the model’s skill at predicting MCSs. Using an optimized VIL threshold for each summer, it is found that the HRRR reproduces the first (i.e., counts) and second moments (i.e., size distribution) of the observed MCS size distribution averaged over the eastern United States, as well as their aspect ratio, orientation, and diurnal variations. Despite threshold optimization, the HRRR tended to predict too many (few) MCSs at lead times less (greater) than 4 h because of lead time–dependent biases in the modeled radar VIL. The HRRR predicted too many MCSs over the Great Plains and too few MCSs over the southeastern United States during the day. These biases are related to the model’s tendency to initiate too many MCSs over the Great Plains and too few MCSs over the southeastern United States. Additional low biases found over the Mississippi River valley region at night revealed a tendency for the HRRR to dissipate MCSs too quickly. The skill of the HRRR at predicting specific MCS events increased between 2012 and 2013, coinciding with changes in both the model physics and in the methods used to assimilate the three-dimensional radar reflectivity.

1. Introduction

The improved prediction of mesoscale convective systems (MCSs) can enhance public safety and improve the decision-making of a number of industries (such as aviation, construction, utility and road maintenance, and outdoor recreation) that require timely and accurate forecasts of high-impact weather. Many of these industries could use more specific forecast information regarding the potential properties of convective storms. For example, aviation planners need information related to the size, orientation, and organization of convective areas to more efficiently route air traffic across the United States (e.g., Krozel et al. 2007; Steiner et al. 2010). In addition, understanding the performance characteristics of high-resolution ensemble forecasts of convection and their intrinsic properties is an important component of the National Weather Service’s (NWS) warn-on-forecast vision (Stensrud et al. 2009, 2013).

High-resolution models continue to evolve as computational resources increase, higher-resolution observations become available, and the representation of physical processes and data assimilation techniques improve. As model resolution increases, more detail with regard to the forecasted storm properties becomes available. Understanding how well storm features, such as their size and organization, are reproduced is key for continued model development efforts and for the interpretation of model forecasts.

During the late 1980s, feature identification techniques were developed and applied to observational datasets. Augustine and Howard (1988) developed a method for identifying MCSs in satellite data, and others have done the same for radar (e.g., Dixon and Wiener 1993; Jirak et al. 2003). More recently, feature-based techniques have been used in the evaluation of numerical models. Some of the first automated, feature-based model evaluation studies were performed by Ebert and McBride (2000), with a proliferation of various implementations since, including the evaluation of high-resolution deterministic forecasts (e.g., Baldwin and Lakshmivarahan 2003; Davis et al. 2006; Lack et al. 2010; Ebert and Gallus 2009; Caine et al. 2013) and, more recently, high-resolution model ensembles (Gallus 2010; Johnson and Wang 2013; Johnson et al. 2013).

* The National Center for Atmospheric Research is sponsored by the National Science Foundation.

Corresponding author address: James O. Pinto, Research Applications Laboratory, National Center for Atmospheric Research, 3450 Mitchell Ln., Boulder, CO 80301. E-mail: [email protected]

DOI: 10.1175/WAF-D-14-00118.1

© 2015 American Meteorological Society

The time dimension is less frequently included in feature-based forecast verification methods. Davis et al. (2006) used feature-based verification software called the Method for Object-Based Diagnostic Evaluation (MODE) to assess how well the Weather Research and Forecasting (WRF) Model, version 1.3 (WRF1.3; Skamarock and Klemp 2008), predicts the spatiotemporal evolution of rain systems exceeding 400 km² in area. They found that, generally, WRF1.3 produced too many large storms (maximum dimension L > 120 km) and that the predicted storms lasted longer than observed. They also found evidence that storm timing (given by the difference in modeled and observed storm centroid times) was delayed in the model, possibly because of their slow formation. Using an object-based technique, Burghardt et al. (2014) found that a subkilometer-resolution implementation of WRF, version 3.3 (WRF3.3), tended to initiate too many storms over the high plains in an area of complex terrain. Clark et al. (2014) evaluated the ability of four members of the Center for Analysis and Prediction of Storms (CAPS) high-resolution ensemble with varying treatments of microphysics to predict the timing and evolution of rain systems using a modified version of MODE that included the time dimension. Their results demonstrated that predictions of the onset and dissipation of rain objects (including both mesoscale convective systems and smaller, less-organized areas of convection) were highly sensitive to the treatment of cloud and precipitation physics.

In the present study, storm-tracking software developed for radar analysis and forecasting has been modified to evaluate model skill at predicting the occurrence, characteristics, and timing (especially the onset) of MCSs. A key aspect of this work, which is distinct from most other object-based model evaluations, is that we adjust the threshold used to identify storm objects in the modeled vertically integrated liquid (VIL) field to remove mean bias prior to performing the model evaluation. We also extend recent object-based verification work by assessing regional variations in model performance. The datasets used in this study are described in section 2. The object-based verification technique is described in section 3, with results of its application to radar data given in section 4. In section 5, we describe a method in which the threshold used to identify MCSs in the model is optimized in order to more fairly evaluate the skill of a high-resolution numerical weather prediction model whose performance is charted over two summers. Finally, the results are summarized and put into context in section 6.

2. Datasets

a. Radar dataset description

Vertically integrated liquid derived from radar data is often used by NWS meteorologists to detect areas of intense, possibly hail-containing convection across the country (e.g., Kitzmiller et al. 1995; Billet et al. 1997). Vertically integrated liquid is also used by aviation meteorologists and aviation planners to identify areas of convection that are hazardous for flight. In this study, we use VIL data from the Corridor Integrated Weather System (CIWS; Crowe and Miller 1999; Evans and Ducot 2006), which was developed for tactical decision-making by the aviation industry. The VIL field developed for CIWS is generated following the method described by Smalley and Bennett (2002), in which the reflectivity from each sweep of each available radar is first quality controlled and then converted to VIL, using the empirical relationship developed by Greene and Clark (1972):

$$\mathrm{VIL} = 3.44 \times 10^{-6} \sum_{i=0}^{N} \left[ \tfrac{1}{2}\left( Z_{e_i} + Z_{e_{i+1}} \right) \right]^{4/7} dh_i \,, \qquad (1)$$

where Ze is the equivalent reflectivity factor, dh is the depth of each radar volume, and N is the number of radar volumes. Unlike other VIL products [e.g., the multiradar/multisensor (MRMS) method uses a maximum reflectivity of 56 dBZ in the VIL calculation], the layer mean reflectivity is not capped at any value. The two-dimensional VIL field obtained from each radar is then combined by advecting VIL to a common time and then taking the maximum plausible VIL value in areas where the radar swaths overlap. The VIL map includes radar data collected from each of the WSR-88D network radars (134 NEXRADs) operated by the NWS, 11 Terminal Doppler Weather Radars (TDWRs) operated by the Federal Aviation Administration (FAA), and five Canadian C-band (5 cm) radars north of the Canadian–U.S. border that are operated by Environment Canada.
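To make the layer-by-layer form of (1) concrete, the following is a minimal sketch of the calculation for a single vertical column, assuming the reflectivity is supplied in dBZ and converted to the equivalent reflectivity factor Ze (mm⁶ m⁻³) before the 4/7-power weighting is applied; the function and variable names are illustrative and this is not the CIWS code.

```python
import numpy as np

def vil_from_profile(ze_dbz, dh_m):
    """Eq. (1): VIL (kg m^-2) from reflectivity (dBZ) at N+1 levels and the
    depths dh_m (m) of the N layers between them."""
    ze = 10.0 ** (np.asarray(ze_dbz, dtype=float) / 10.0)    # dBZ -> Ze (mm^6 m^-3)
    layer_mean_ze = 0.5 * (ze[:-1] + ze[1:])                 # mean Ze of each layer
    return 3.44e-6 * np.sum(layer_mean_ze ** (4.0 / 7.0) * np.asarray(dh_m, dtype=float))

# Example: a 20-40 dBZ column spanning six 1-km-deep layers
print(round(vil_from_profile([20, 28, 34, 40, 36, 30, 22], [1000.0] * 6), 2))
```

Applying the same expression to every grid column of the mosaic yields the two-dimensional VIL field described above.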

The resulting VIL dataset, which spans the entire United States and the southern Canadian provinces and extends roughly 100 km offshore, has a horizontal resolution of 1 km and is updated every 5 min. For this study, only hourly data are used. In addition, only data collected over the United States are used in our analyses, because the Canadian radars tend to attenuate rapidly in the heavy precipitation events that are of interest in this study. Gaps in radar coverage over the western United States limit the utility of the observational dataset in this region; thus, much of the analysis herein focuses on the eastern two-thirds of the country.

b. Model dataset description

The High-Resolution Rapid Refresh (HRRR) model has been under development at the National Oceanic and Atmospheric Administration/Earth System Research Laboratory/Global Systems Division (NOAA/ESRL/GSD) since 2008 (Benjamin et al. 2009, 2011). Model outputs used in this study are from the summer months [June, July, and August (JJA)] of 2012 and 2013. The HRRR was run as part of the FAA’s summer evaluation (Alexander et al. 2011), serving as the model forecast input field for Consolidated Storm Prediction for Aviation (CoSPA; Pinto et al. 2010) forecasts. The HRRR was run once per hour at a convection-permitting grid spacing of 3 km. While this grid spacing is too coarse to resolve all scales of motion that are important in the development, organization, and evolution of convective storms (Bryan et al. 2003), it is believed that the larger-scale convective features of MCSs are well resolved when using this grid spacing (e.g., Weisman et al. 1997; Schwartz et al. 2009).

The data assimilation and physics of the HRRR and its parent model were modified between 2012 and 2013. The model configurations utilized during these two years are listed in Table 1. The 2013 configuration of the HRRR became operational at NCEP in September 2014. During both years, the HRRR was initialized and driven at the lateral boundaries using version 2 of the 13-km Rapid Refresh (RAP; Weygandt et al. 2011) model run in experimental mode at NOAA/ESRL. Both the RAP model and the HRRR use Gridpoint Statistical Interpolation analysis system (GSI) based three-dimensional variational data assimilation (3DVAR) to assimilate a wealth of observational data (Benjamin et al. 2013). However, two changes expected to have a positive impact on the simulation of convection were made between 2012 and 2013. First, the PBL scheme was changed from Mellor–Yamada–Janjić (MYJ) in 2012 to Mellor–Yamada–Nakanishi–Niino (MYNN) in 2013. This change was expected to improve the simulation of the boundary layer, as found by Coniglio et al. (2013) for simulations with convection-permitting resolution. Second, three-dimensional radar reflectivity data,¹ which were only assimilated into the RAP in 2012, were assimilated into both the RAP and the 3-km HRRR in 2013. This change was expected to improve the initialization of storm information on the 3-km grid, thus providing a more accurate depiction of ongoing convection at the earlier forecast lead times. We present evidence in section 5 that this is indeed the case.

Because VIL is used to identify MCSs in the model, the treatment of microphysics and the subsequent calculation of radar reflectivity are critical. The cloud microphysics scheme of Thompson et al. (2008) is used to simulate cloud and precipitation processes. Profiles of snow, rain, and graupel produced by the Thompson et al. microphysics scheme are converted to a reflectivity factor using the radar equation and assumptions pertaining to the shape of the size distribution of each precipitation type [see Koch et al. (2005) for details]. Profiles of reflectivity are then used to estimate the "radar retrieved" VIL using the empirical relationship developed for WSR-88D NEXRAD data following (1). While the equation used to calculate VIL from radar reflectivity data in the model is the same as that used on observed reflectivity, no attempt has been made to account for the impact of radar coverage gaps in the calculation of the simulated reflectivity field obtained from the model. Despite this simplifying assumption (which would lead to overestimates of radar-retrieved VIL), the model actually underforecasts VIL. In fact, the frequency of VIL values exceeding 3.5 kg m⁻² is underforecasted by as much as a factor of 5, especially at the longer lead times (Fig. 1). Thus, the expected overprediction of VIL that would result from neglecting the impact of radar coverage gaps in the model-derived VIL is masked by biases associated with assumptions in the microphysics parameterization and subsequent computations of radar reflectivity that contribute to the underprediction of VIL.

TABLE 1. HRRR configurations used in 2012 and 2013. Note that the RAP, the parent model of the HRRR, was also updated. Details of these updates may be found in Benjamin et al. (2013).

                            2012 configuration                          2013 configuration
WRF version                 3.3.1                                       3.4.1
Radar data assimilation     RAP only                                    RAP and HRRR
Radiative transfer          RRTM longwave; Dudhia shortwave             RRTM longwave; Goddard shortwave
PBL                         MYJ                                         MYNN
Land surface model          RUC, version 3.3.1-Smirnova (six levels)    RUC, version 3.3.1-Smirnova (nine levels)

¹ The three-dimensional radar reflectivity mosaic was produced as part of the MRMS data suite produced by the National Severe Storms Laboratory in Norman, Oklahoma.

3. Storm identification

The object-based technique used in this study was adapted from the Thunderstorm Identification Tracking Analysis and Nowcasting (TITAN) software, developed by Dixon and Wiener (1993), which was originally designed to identify and track convective storms in the reflectivity field of an individual radar. The software provides several characteristics of each storm it identifies, including location, orientation, size, aspect ratio, storm trends, and motion. The TITAN software has since been adapted to work on radar mosaic data and, more recently, model output (Pinto et al. 2007; Caine et al. 2013). In this study, the adapted TITAN software is used to assess the timing and location of MCSs in both radar mosaic data and HRRR forecasts. Convective objects are identified in the observed VIL field using a threshold of 3.5 kg m⁻². This VIL threshold, which is typically used to identify convection, corresponds to column maximum reflectivity values ranging from 30 to 40 dBZ, following Hallowell et al. (1999). To be considered an MCS, the convective object (which may include gaps between convective cells of up to 20 km) must have a maximum dimension (measured along the object’s major axis) exceeding 100 km for at least two consecutive hours. This MCS definition is similar to that used in previous studies to investigate the occurrence of MCSs (e.g., Houze 1993; Geerts 1998; Parker and Johnson 2000; Coniglio et al. 2010), except that our definition includes less-organized convective areas or clusters of convective cells that are typical of the southeastern United States.
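As a rough illustration of these criteria (not the TITAN implementation), the sketch below identifies candidate MCS objects in a single hourly VIL grid, assuming a Cartesian grid with 1-km spacing; gaps of up to roughly 20 km between convective cells are bridged with a binary closing, and the maximum dimension is taken along the leading principal axis of each object’s pixels. The persistence requirement (exceeding 100 km for at least two consecutive hours) would be checked by repeating this on successive hourly fields and tracking the objects, which is not shown. All names are illustrative.

```python
import numpy as np
from scipy import ndimage

def mcs_candidates(vil, vil_threshold=3.5, gap_km=20.0, min_length_km=100.0, dx_km=1.0):
    """Label convective objects in a 2D VIL grid (kg m^-2) and keep those whose
    major-axis length exceeds min_length_km, allowing gaps of roughly gap_km."""
    convective = vil >= vil_threshold
    half_gap = int(round(gap_km / (2.0 * dx_km)))
    # closing with a ~10-km-radius element bridges gaps of up to ~20 km
    closed = ndimage.binary_closing(
        convective, structure=np.ones((2 * half_gap + 1, 2 * half_gap + 1)))
    labels, nobj = ndimage.label(closed)
    keep = []
    for k in range(1, nobj + 1):
        rows, cols = np.nonzero(labels == k)
        if rows.size < 2:
            continue
        pts = np.column_stack([cols, rows]).astype(float) * dx_km
        centered = pts - pts.mean(axis=0)
        major_axis = np.linalg.eigh(np.cov(centered.T))[1][:, -1]   # leading principal axis
        span = centered @ major_axis
        if span.max() - span.min() >= min_length_km:
            keep.append(k)
    return labels, keep
```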

In this study, the initiation of an MCS is defined as occurring if no preexisting MCSs are present within the last 2 h and within a search radius of 125 km extending from all edges of the MCS in question. The search radius is based on the median speed of MCSs reported in the literature (e.g., Carbone et al. 2002) and is comparable to the separation distance of 144 km used by Davis et al. (2006) to distinguish separate convective storm objects. The use of look-back times of 1 and 2 h to identify MCS initiation (MCS-I) is similar to the 1–2-h MCS "genesis" stage described by Coniglio et al. (2010), during which time storm cells may first merge to form a near-continuous line of convection exceeding 100 km in length. Figure 2 illustrates the procedure used to identify MCS-I events. In this example, five MCS objects were identified at the analysis time. Objects 4 and 5 had clearly existed previously, as indicated by the blue and magenta objects from 1 and 2 h in the past, respectively. Farther west, objects 1 and 2 had no preexisting MCSs within 125 km in the past 2 h and are classified as MCS-I events. The convective storms evident to the east and southeast of objects 4 and 5 in the VIL field were too small and too far apart to be classified as MCSs. Finally, object 3, which should have been classified as an MCS-I event, was not identified as such because it formed within 125 km of object 4, which was present 2 h earlier. This missed MCS-I event represents a flaw in the algorithm; however, this situation was found to occur less than 10% of the time in a comparison with manually classified MCS-Is. Furthermore, since the same technique is used on both observations and model data, the overall impact on the comparisons is expected to be small.
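A minimal sketch of this initiation test is given below, assuming each MCS outline is available as a set of boundary points in a km-based map projection and that the MCSs from the previous two hourly analyses are at hand; the 125-km search from the object’s edges is approximated by the minimum point-to-point distance between boundaries. All names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def is_mcs_initiation(candidate_xy, prior_mcs_xy, search_km=125.0):
    """candidate_xy: (n, 2) boundary points (km) of an MCS at the analysis time.
    prior_mcs_xy: list of boundary-point arrays for all MCSs found 1 h and 2 h earlier.
    Returns True when no prior MCS lies within search_km of the candidate's edges."""
    for prior in prior_mcs_xy:
        if cdist(candidate_xy, prior).min() <= search_km:
            return False
    return True
```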

4. Observed MCS properties

The algorithm described above was used to detect MCSs and MCS-I events in the observed VIL mosaic obtained from CIWS for the summers of 2012 and 2013. Table 2 lists the mean macroscale properties of MCSs observed during these two summers. There was a 15% increase in the number of MCS events observed between 2012 and 2013, but their median size, shape, and orientation were basically unchanged (<4% change).

FIG. 1. Observed and modeled frequencies of VIL exceeding either 1.5 or 3.5 kg m⁻² as a function of lead time for the eastern United States (i.e., east of 105°W) for JJA 2012.

The median orientation, given as the angle at which the major axis of the storm is oriented clockwise from due north (values 0°–180°), ranged between 75° and 80° (indicating that the MCSs tended to extend from the west-southwest to the east-northeast). The aspect ratio, given as the ratio of the MCS major radius to its minor radius, indicates that the length of MCSs was roughly 3 times their width. MCS-I events were recorded a factor of 10 less often than MCS occurrences in both years. This large difference between MCS and MCS-I counts is due to the fact that a single MCS can be counted multiple times (each hour of its existence) over the course of its life cycle, whereas each MCS-I is treated as a singular event. Sensitivity studies, in which each algorithmic parameter was separately varied, revealed that MCS and MCS-I counts were most sensitive to the VIL threshold and permissible gap size, with the MCS counts being more sensitive than the MCS-I counts to changes in the MCS-defining characteristics. A 1 kg m⁻² increase in VIL threshold resulted in a 40% (30%) decrease in the number of MCSs (MCS-Is), while a 10-km increase in permissible gap size resulted in a 100% (50%) increase in the number of MCSs (MCS-Is). The sensitivity of the observed MCS counts to the VIL threshold reveals the importance of accounting for mean bias offsets between the modeled and observed VIL fields prior to evaluating model skill.

TABLE 2. MCS characteristics observed for the summers of 2012 and 2013.

Year   No. of MCSs   No. of MCS-Is   Median MCS (MCS-I) size (km)   Median aspect ratio   Mode orientation (°)
2012   8270          873             143 (123)                      2.8                   77
2013   9529          929             145 (123)                      2.9                   78

FIG. 2. Schematic representation of the MCS-I identification method. Red, blue, and magenta indicate the current, 1-h-previous, and 2-h-previous locations of the MCS, respectively. Colors from more recent times overlap those from previous times. Filled contour underlay gives the VIL obtained 1 h before present. The ellipses indicate the approximate search area for each MCS identified at the current time, with the orange ellipses indicating the MCS-I events and the cyan ellipses indicating which MCSs are not considered MCS-Is. Each current MCS is also numbered to aid discussion in the text.

Maps of the observed MCS frequency of occurrence were generated using 1° grid boxes (Fig. 3). For each MCS detected, the grid boxes containing any part of the MCS were determined and the MCS count in each of these grid boxes was increased by one. The frequency is then determined by dividing the MCS count by the number of observation times in the period of interest. The frequency is then scaled to give the number of observed MCSs occurring within each grid box per week for a given time period (daytime is defined as 1100–2000 LST, and nighttime comprises the remaining hours). The largest seasonal (JJA) differences in MCS frequency between 2012 and 2013 are found over the central portion of both the Great Plains (GP) and Mississippi River valley (MRV) longitudinal bands at night (Fig. 3). The limited MCS activity observed over the central United States in 2012 corresponded with one of the driest summers on record in this region (Hoerling et al. 2013; Basara et al. 2013). In 2013, MCS activity over the central United States increased dramatically over 2012 and was much closer to normal.
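A minimal sketch of the gridding behind these frequency maps, assuming each MCS object is available as the set of 1° boxes it touches at a given hourly observation time and that the weekly scaling simply uses the number of hourly observation times per week; names and the exact normalization are illustrative.

```python
import numpy as np

def mcs_frequency_map(mcs_boxes_per_time, nlat=30, nlon=60, obs_times_per_week=168.0):
    """mcs_boxes_per_time: iterable over observation times, each an iterable of MCSs,
    each MCS given as a set of (lat_index, lon_index) 1-deg boxes it overlaps.
    Returns the number of MCSs per week in each grid box."""
    counts = np.zeros((nlat, nlon))
    ntimes = 0
    for mcss_at_time in mcs_boxes_per_time:
        ntimes += 1
        for boxes in mcss_at_time:
            for i, j in boxes:                 # each box touched by this MCS: +1
                counts[i, j] += 1
    return counts / max(ntimes, 1) * obs_times_per_week
```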

Regional variations in the observed diurnal cycle of MCSs are depicted in Fig. 4. Results from the three longitudinal bands and two additional evaluation boxes shown in Fig. 3 are given. Results from the Texas (TX) box represent a subset of the MCS activity occurring within the GP band, while the Southeast (SE) box overlaps the southern portions of both the MRV and Appalachians (APP) longitudinal bands. Both the amplitude (i.e., percentage of all MCSs occurring in a region) and phase (i.e., timing) of the diurnal cycle vary by region. Consistent with previous climatological studies (e.g., Carbone and Tuttle 2008), the center of the time (LST²) of peak MCS activity (defined as the midpoint between the clearly positively and negatively sloping portions of each curve and given in local solar time) occurs about 4 h later in the GP band (1900 LST) than in the APP band (1500 LST). There is also evidence to support an earlier ramp-up in MCS activity in the two southern regions (i.e., SE and TX) compared to their corresponding longitudinal band(s). This north–south variation in the timing of the ramp-up is related to the later initiation of MCSs to the north, as seen in Fig. 5b. The diurnal cycles with the largest amplitudes occur in the SE and APP regions, while the smallest-amplitude diurnal cycle is found for the MRV band. Very similar diurnal cycles were observed for each region in 2013 (not shown).

FIG. 3. Observed seasonal mean MCS occurrence rates (No. per week) for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2100–1100 LST) for JJA (left) 2012 and (right) 2013. Labeled bands and boxes indicate regions used in additional analyses described in the text.

² LST is used instead of UTC for many analyses, since the contiguous United States spans ~58° of longitude and, thus, local solar time varies by ~3.9 h across the country.

While the observed ratio of MCS to MCS-I events is consistent between the two summers (Table 2), differences are evident in the observed spatial patterns of MCS-I frequency (Fig. 5a). These maps were created by counting all MCS-I events occurring within overlapping 5° × 5° boxes spaced 1° apart. This larger box size (compared to that used to build the MCS frequency maps) was necessary because of the relatively infrequent nature of MCS-I events. To remove the effects of offshore convection and the Canadian radar attenuation issue, the MCS-I frequencies are limited to only those occurring within nonoceanic grid boxes and within the continental United States. MCS-I frequency maxima are observed over the GP and APP bands, peaking over the Florida peninsula. In both years, the MCS-I frequency maximum in the APP band extends from the Gulf Coast northward along the spine of the Appalachians into western New York. This peak frequency of MCS-I over the Appalachians is consistent with that reported by Parker and Ahijevych (2007). Peak MCS-I activity in the GP band varies notably between 2012 and 2013, shifting from the Dakotas in 2012 to Nebraska and western Kansas in 2013, with the area and magnitude of the maximum increasing in 2013.

The median time of MCS-I events shown in Fig. 5b is found following a two-step method. First, a distribution of MCS-I counts versus LST is generated and smoothed using a cyclic 5-h boxcar average. Then, the time corresponding with the minimum MCS-I count in the smoothed distribution is found and used as the starting point to compute the median LST value using a circular time-of-day coordinate system. The median time of MCS-I events increases as one moves north and west across the United States (Fig. 5b), varying from around 1200 LST in the southeastern United States to 1800–2200 LST over much of the central and northern portions of the GP band. Localized areas of even later MCS-I activity are evident in the GP and MRV bands, especially in 2013. Summer-to-summer variations are also evident in northern Texas, where MCS-I occurred notably later in 2013, and in the northern third of the MRV band, which experienced much earlier MCS-I in 2013. The observed MCS-I frequency maxima found over the GP region in Fig. 5 are generally located upstream of the MCS frequency maxima shown in Fig. 3. The relationship between MCS-I location and subsequent downstream MCS frequency maxima is consistent with that previously discussed by Carbone and Tuttle (2008).
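A minimal sketch of the two-step median-time calculation described above: the hourly MCS-I counts are smoothed with a cyclic 5-h boxcar, the quietest hour of the smoothed cycle defines where the wrapped 24-h axis is cut, and the median is then taken on that unwrapped axis. Names are illustrative.

```python
import numpy as np

def median_mcs_initiation_time(lst_hours):
    """lst_hours: local solar times (0-24 h) of the MCS-I events at one map point."""
    times = np.asarray(lst_hours, dtype=float) % 24.0
    counts, _ = np.histogram(times, bins=24, range=(0.0, 24.0))
    kernel = np.ones(5) / 5.0                                # cyclic 5-h boxcar
    padded = np.concatenate([counts[-2:], counts, counts[:2]])
    smoothed = np.convolve(padded, kernel, mode="valid")     # length 24, centered
    start = np.argmin(smoothed)                              # quietest hour of the cycle
    shifted = (times - start) % 24.0                         # unwrap the day at that hour
    return (np.median(shifted) + start) % 24.0
```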

In the SE region, MCSs tend to form through the clustering and upscale growth of convective storms (e.g., Geerts 1998). The MCS-I frequency maxima in the SE region are coincident in time and space (Figs. 5a,b) with the peaks in the MCS frequency map shown in Fig. 3. In the region denoted by the SE box in Fig. 3, MCS-I generally occurs between 1100 and 1400 LST and peak MCS activity occurs between 1300 and 1700 LST (Fig. 4). Similar to that described by Geerts (1998), the timing of peak MCS-I and MCS frequencies in the SE region indicates that MCSs tend to form and dissipate over a short time period with limited propagation.

FIG. 4. Normalized diurnal cycle of MCSs by region (GP, MRV, and APP bands, as well as the two smaller SE and TX boxes shown in Fig. 3) for JJA 2012. Total numbers of MCS events for each region are listed in the legend.

5. Evaluation of HRRR

The same storm identification algorithm used to detect MCSs and MCS-Is in the observations (see section 4) is used to evaluate the HRRR. Evaluations are performed both in a statistical sense (i.e., in which the statistical properties of observed and modeled storms are compared without being matched in time and location) and using standard pairwise metrics (in which modeled and observed storms are matched based on proximity to one another, as described in more detail below). Since MCS identification can be highly sensitive to the VIL threshold used to identify convection, and because the modeled VIL field can be significantly biased depending on the choice of microphysical parameterization (e.g., Clark et al. 2014) and the assumptions used to compute the reflectivity, an attempt is made to remove the mean bias from the synthetic-radar-based VIL field (as shown in Fig. 1) prior to applying the MCS identification software.

a. The VIL threshold optimization

To account for model bias caused by assumptions in the microphysics and computation of radar-derived VIL, and also potential biases in the observations, the threshold used to identify MCSs is chosen using an optimization procedure. This is done by comparing modeled MCS size distributions obtained using a range of VIL thresholds to the distribution obtained from the observations using the standard VIL threshold of 3.5 kg m⁻². This procedure is illustrated in Fig. 6. It is seen that using the standard threshold of 3.5 kg m⁻² on the model VIL results in large underforecasts of the number of MCSs in each size bin. In the example shown in Fig. 6, a VIL threshold of 1.5 kg m⁻² produces the best match between the 6-h forecast of MCSs and those that were observed. This finding is consistent with that shown in Fig. 1, where the frequency of 6-h modeled VIL exceeding 1.5 kg m⁻² corresponds with the frequency of observed VIL exceeding 3.5 kg m⁻². However, because bias in the modeled VIL field varies with forecast lead time (Fig. 1), using a single threshold to detect MCSs in the model forecasts will translate into lead time–dependent biases in the modeled MCS frequencies. These lead time–dependent biases are an intrinsic property of the model and are not related to assumptions used in the computation of radar reflectivity.

FIG. 5. Maps of (a) observed MCS-I frequency (No. per week per unit area for the actual land area within a 500 km² search box) and (b) median time of MCS-I events during (left) 2012 and (right) 2013. The frequency is computed by summing over all times of day. The median time of MCS-I is only computed at points with at least one MCS-I per week per unit land area.
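A minimal sketch of the threshold-selection step described above (and illustrated in Fig. 6): candidate thresholds are applied to the modeled VIL, MCS size distributions are built with the same bins used for the observations (which use the fixed 3.5 kg m⁻² threshold), and the candidate whose distribution is closest to the observed one is kept. The misfit measure and all names are illustrative assumptions; the text does not specify the exact metric.

```python
import numpy as np

def optimize_vil_threshold(model_mcs_sizes_by_threshold, observed_mcs_sizes, bins):
    """model_mcs_sizes_by_threshold: dict mapping a candidate threshold (kg m^-2)
    to the modeled MCS major-axis lengths (km) found with that threshold.
    observed_mcs_sizes: lengths (km) of observed MCSs at the 3.5 kg m^-2 threshold."""
    obs_hist, _ = np.histogram(observed_mcs_sizes, bins=bins)
    best_threshold, best_misfit = None, np.inf
    for threshold, sizes in sorted(model_mcs_sizes_by_threshold.items()):
        mod_hist, _ = np.histogram(sizes, bins=bins)
        misfit = np.abs(mod_hist - obs_hist).sum()   # total count mismatch over size bins
        if misfit < best_misfit:
            best_threshold, best_misfit = threshold, misfit
    return best_threshold
```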

Sensitivity studies revealed that this optimized threshold can be obtained using a period as short as a week, depending on the amount of convective activity. It is important to note that only the VIL threshold is varied during the optimization procedure. All other criteria used to identify MCSs in the model are identical to those used to identify MCSs in the observations. Using this optimization procedure, it was found that VIL thresholds of 1.5 and 1.75 kg m⁻² worked best for the summers of 2012 and 2013, respectively. These thresholds are roughly a factor of 2 less than those used to detect MCSs in the observations. This may seem concerning, since one might think these storms will appear to be too weak to be considered MCSs, but one must remember that we chose these thresholds to account for model biases that lie predominantly in the model data postprocessing used to generate the radar-derived VIL field. When one compares the retrieved VIL field with the composite reflectivity field, it becomes clear that the threshold optimization method does a nice job of detecting convective storms that would otherwise have been missed when using the retrieved VIL field alone (Fig. 7).

Application of these optimized thresholds resulted in median storm sizes, aspect ratios, and modal orientations that are nearly identical to the observed values reported in Table 2. The 2013 data provide a representative comparison of observed MCS counts with those obtained from the model using a threshold of 1.75 kg m⁻² (Table 3). Biases in the counts of MCSs and MCS-Is obtained from all available 6-h forecasts are small (<10%) when evaluated over the United States east of 105°W (eUS), but can reach nearly a factor of 2 when evaluated regionally. Results given in Table 3 reveal that there were 45% too many MCSs predicted over the GP region and 15% too few predicted over the SE region in 2013 (where the MCS-I count was also dramatically underforecasted, by 39%). Temporal and regional variations in model performance will be discussed in more detail below, along with how the model performance changed between 2012 and 2013.

FIG. 6. Sensitivity of the MCS size distribution to the choice of VIL threshold using 6-h forecasts and observations collected during 6–12 Jul 2012. The black line depicts the observed MCS size distribution obtained using a threshold of 3.5 kg m⁻².

FIG. 7. HRRR 6-h forecast valid at 0500 UTC 9 Jul 2013 of (a) column max reflectivity (dBZ) and (b) VIL (kg m⁻²) derived from modeled radar reflectivity. The white contours indicate storms identified as MCSs using a VIL threshold of 1.75 kg m⁻². Only the large area of convection over southwestern North Dakota would have been classified as an MCS using the standard VIL threshold of 3.5 kg m⁻², despite the fact that the modeled composite reflectivity within each MCS contour exceeds 35 dBZ.

b. Space–time variations in model performance

Spatiotemporal variations in the model’s skill at predicting MCSs were evaluated in two ways. First, regional variations in the seasonal mean statistical properties of modeled and observed MCSs and MCS-Is are compared. Then, a simple technique that matches modeled and observed objects while allowing for small position errors is used to assess the model’s skill at predicting specific MCS events.

1) STATISTICAL ASSESSMENT

While threshold optimization removes the impact of mean bias in the VIL field, biases in the prediction of MCSs that are a function of lead time, valid time period (i.e., day vs night), and region remain (Fig. 8). The model generally predicts too many MCSs during the earlier lead times and too few MCSs at the longer lead times. While this dependence on lead time is evident in both years, it was somewhat reduced in 2013. In both years, the model tended to overpredict the frequency of MCSs during both daytime and nighttime periods in the GP band while underpredicting the daytime MCS frequency in the SE and APP regions. While the model bias in some regions was consistent from one year to the next (e.g., the nighttime high bias in the GP region), other regions experienced notable changes in performance. For example, the frequency of daytime MCSs in the APP and MRV regions was better captured in 2013.

The diurnal cycle of MCS frequency of occurrence obtained from model forecasts of varying lead times is compared to the observed MCS frequency in Fig. 9. The MCS frequency for a given longitudinal band is obtained by dividing the number of MCSs obtained for a given band, lead time, and valid time by the corresponding total number of modeled MCSs obtained for a given band and lead time summed over all valid times. This normalization effectively removes the model biases shown in Fig. 8 so that any deviations between the forecasted and observed diurnal cycles may be interpreted as amplitude and/or timing errors for a given forecast lead time.
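A minimal sketch of this normalization, assuming MCS counts have been binned by band, lead time, and valid time; each lead time’s counts are divided by that lead time’s total over all valid times so that only the shape of the diurnal cycle remains. Names are illustrative.

```python
import numpy as np

def normalized_diurnal_cycle(counts):
    """counts: MCS counts for one band with shape (n_lead_times, 24 valid hours).
    Returns the fraction of each lead time's MCSs occurring at each valid hour."""
    totals = counts.sum(axis=1, keepdims=True)          # total MCSs per lead time
    return counts / np.where(totals > 0, totals, 1.0)   # avoid division by zero
```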

All lead times of the model generally capture the observed regional variations in the phase and amplitude of the MCS diurnal cycle in both years, with the 2013 forecasts deviating from the observations less than the 2012 forecasts. Comparing results for the three longitudinal bands, the largest deviations are evident for the GP band in 2012 for lead times of 10 h or greater. The GP diurnal cycle obtained at these longer lead times appears to be several hours out of phase compared to the observed diurnal cycle, with both the ramp-up of MCS activity and the center time of the MCS frequency maximum occurring several hours later than observed (Fig. 9a). The opposite effect is found for the GP band in 2013, when the modeled ramp-up, peak, and decay phase of the diurnal cycle occur about 1–2 h ahead of that observed for all forecast lead times. The modeled timing and amplitude of the MCS diurnal cycle in the MRV band is well captured in both years, with the 2013 forecasts showing slightly less deviation from the observed diurnal cycle than those found in 2012. Finally, the amplitude of MCS activity in the APP band was closer to that observed in 2013, but the timing (ramp-up and decay of MCS frequency) remains delayed by about 2 h.

TABLE 3. Modeled and observed MCS and MCS-I counts for the summer of 2013.

                 Observed              6-h model forecast, T = 1.75 kg m⁻²
Region       MCS        MCS-I          MCS        MCS-I
eUS          9529       929            10 240     850
GP           2497       288            3622       322
MRV          3483       288            3013       196
APP          3164       313            3198       296
SE           2990       264            2550       160

FIG. 8. Modeled minus observed MCS frequency vs lead time obtained using optimized thresholds for (a) daytime (1100–2000 LST) and (b) nighttime (2100–1100 LST) for three longitudinal bands (GP, MRV, and APP) and two regions (TX and SE) during (left) 2012 and (right) 2013. The actual observed MCS frequencies (No. per week) for each region and time period are given in the legend.

A more detailed inspection of how well the model captures temporal and regional variations in MCS frequency of occurrence is shown in Fig. 10. The frequency differences shown in Fig. 10 are for 6-h forecasts valid during three time periods: daytime, nighttime, and all times. In 2012, the largest positive biases are found over Louisiana and Florida for forecasts valid during daytime hours, where the model predicts up to five occurrences per week more than observed (i.e., an overprediction by more than a factor of 2). The model also overpredicts the frequency of MCSs over vast areas of the GP band (particularly the northern half) and the northeastern United States during the day and at night. In contrast, the model tended to underforecast the frequency of daytime MCSs in the southeastern United States, especially near the coast of the Carolinas and northern Florida, in both years. Other regions characterized by underforecasting biases include the western slope of the southern Appalachians and eastern Texas during the day, and parts of the MRV band at night. The daytime high biases in the northeastern United States, Louisiana, and Florida were less pronounced in 2013. In contrast, the magnitude of the biases in the GP region (positive bias) and central MRV region (negative bias) increased in 2013.

FIG. 9. Frequency of modeled and observed MCSs for the (a) GP, (b) MRV, and (c) APP longitudinal belts for JJA (left) 2012 and (right) 2013. Frequencies are found by dividing the MCS count found at a given valid time by the total number of MCSs detected in the observations or the model forecast (for a given lead time) during the 92-day period.

The MCS frequency biases shown in Fig. 10 can be attributed, in part, to biases in the initiation of the MCSs in the model. Comparing Figs. 5 and 11, it is seen that while the model was able to roughly approximate the location of the observed MCS-I hot spots (i.e., especially the high plains portion of the GP band and Florida), it overforecasted the MCS-I frequency maximum over portions of the GP band (in terms of the extent of the local maximum) and grossly underforecasted the MCS-I frequency in the SE region and APP band (in terms of both extent and magnitude), except over parts of Florida, where the model overforecasted the frequency of MCS-I (Fig. 11a). These MCS-I biases are generally consistent with the MCS frequency biases shown in Fig. 10. Lesser MCS-I underforecasting biases are also evident throughout the MRV band, which does not seem to fully explain the nighttime MCS forecasting bias that is evident in 2013 (Fig. 10).

FIG. 10. Model 6-h forecasted MCS occurrence rate minus the observed occurrence rate for (a) all times, (b) daytime (1100–2000 LST), and (c) nighttime (2000–1100 LST) for JJA (left) 2012 and (right) 2013. The black box denotes the central MRV region used in Fig. 18.

The model seems to capture the larger-scale variations in the timing of MCS-I fairly well in both years. As seen in Fig. 4, MCS-I generally occurs later in the day as one moves north and west, with the latest MCS-Is in the central and northern portions of the MRV and GP bands. South-to-north variations in the median timing of MCS-Is are generally captured by the model fairly well in 2012; however, in 2013 the modeled MCS-I is much too early in the central portions of the GP band and much too late over southern Texas (Fig. 11). In the APP band, there is a tendency for the modeled MCS-I to occur later than observed, with this late timing bias being larger in 2013. Finally, the model shows an area of later-forming MCSs similar in size to that observed; however, the modeled area of late-forming MCSs is displaced several hundred kilometers to the southeast of the observed area of late-forming MCSs.

2) MATCHED MODELED AND OBSERVED MCS ANALYSES

Thus far, we have only compared the statistical properties of modeled and observed MCSs. In this section, we discuss the model’s skill at predicting specific MCS events using a simple object-matching technique. In this object-matching technique, modeled MCS objects that are within 100 km of an observed MCS are given credit for a hit, as shown in Fig. 12 and Table 4. The position and area covered by each MCS are defined using a centroid location and a 72-vertex polygon (one vertex every 5°). Each modeled and observed MCS object is then expanded by extending each vertex of the polygon by 50 km. Extending both modeled and observed MCS objects by 50 km allows for forecast errors of up to 100 km to be considered a hit. Hits, misses, false alarms, and correct nulls are determined following the rules given in Table 4. It is important to note that the standard skill scores computed using this technique are impacted by the size of the modeled and observed storms and how much they overlap. This upscaling of the modeled and observed MCSs is designed to give credit for near misses but not to penalize for additional false alarms caused by the MCS expansions.

FIG. 11. Maps of (a) seasonal mean MCS-I occurrence rate obtained from 6-h forecasts and (b) median time of MCS-I in the 6-h forecasts for JJA (left) 2012 and (right) 2013. The median time of MCS-I is only computed at points with at least one MCS-I per week per 500 km². The VIL threshold for identifying MCSs is shown at the top right.
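A minimal sketch of the grid-square classification summarized in Fig. 12 and Table 4, assuming the observed and forecast MCS coverage are available as boolean masks on a common grid in a km-based projection; the 50-km polygon expansion is approximated here by a binary dilation of each mask. Names and the grid spacing are illustrative.

```python
import numpy as np
from scipy import ndimage

def contingency_counts(obs_mask, fcst_mask, dx_km=10.0, buffer_km=50.0):
    """Classify each grid cell as hit, miss, false alarm, or correct null following
    Table 4 (2 = MCS present, 1 = within the 50-km buffer, 0 = neither)."""
    r = int(round(buffer_km / dx_km))
    footprint = np.ones((2 * r + 1, 2 * r + 1), dtype=bool)
    obs = np.where(obs_mask, 2, np.where(ndimage.binary_dilation(obs_mask, footprint), 1, 0))
    fcst = np.where(fcst_mask, 2, np.where(ndimage.binary_dilation(fcst_mask, footprint), 1, 0))
    hit = ((fcst == 2) & (obs >= 1)) | ((fcst == 1) & (obs == 2))
    false_alarm = (fcst == 2) & (obs == 0)
    miss = (fcst == 0) & (obs == 2)
    null = ~(hit | false_alarm | miss)
    return {"hit": int(hit.sum()), "miss": int(miss.sum()),
            "false_alarm": int(false_alarm.sum()), "null": int(null.sum())}
```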

The hit, miss, false alarm, and correct null counts are summed over the period from 1 June to 31 August for a given forecast lead time using the 3 × 3 contingency table provided in Table 4, from which the following standard skill scores are computed (Wilks 2006):

frequency bias: $\mathrm{FBIAS} = n_{\mathrm{fcst}}/n_{\mathrm{obs}}$ , (2)

false alarm ratio: $\mathrm{FAR} = \mathrm{false\ alarms}/(\mathrm{hits} + \mathrm{false\ alarms})$ , and (3)

probability of detection: $\mathrm{POD} = \mathrm{hits}/(\mathrm{hits} + \mathrm{misses})$ . (4)

Using Table 4, the frequency bias is obtained from the number of forecast points with values greater than or equal to 2 (nfcst) and the number of observed points with values greater than or equal to 2 (nobs).

In addition, because we are trying to compare the skill for two seasons with varying event frequencies, we use the symmetric extreme dependency score (SEDS), which is equitable and not impacted by the frequency of occurrence of the event being verified (Hogan et al. 2009):

$$\mathrm{SEDS} = \ln[\mathrm{FBIAS}\,(\mathrm{FRQ})^2] / \ln[\mathrm{POD}\,(\mathrm{FRQ})] - 1 \,, \qquad (5)$$

where the event frequency is

$$\mathrm{FRQ} = (\mathrm{hits} + \mathrm{misses})/(\mathrm{hits} + \mathrm{misses} + \mathrm{correct\ nulls} + \mathrm{false\ alarms}) \,. \qquad (6)$$
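A minimal sketch of Eqs. (2)–(6) applied to seasonal sums of the contingency counts; nfcst and nobs are taken here as the numbers of forecast and observed grid squares with a location value of 2, consistent with the text above. Names are illustrative.

```python
import math

def mcs_skill_scores(hit, miss, false_alarm, null, nfcst, nobs):
    """Eqs. (2)-(6): FBIAS, FAR, POD, FRQ, and SEDS from seasonal count totals.
    nfcst / nobs are the numbers of forecast / observed grid squares with value 2."""
    fbias = nfcst / nobs                                          # (2)
    far = false_alarm / (hit + false_alarm)                       # (3)
    pod = hit / (hit + miss)                                      # (4)
    frq = (hit + miss) / (hit + miss + null + false_alarm)        # (6)
    seds = math.log(fbias * frq ** 2) / math.log(pod * frq) - 1   # (5)
    return {"FBIAS": fbias, "FAR": far, "POD": pod, "FRQ": frq, "SEDS": seds}
```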

The seasonal mean skill scores are computed for 2012 and 2013 for three regions: the eastern United States (east of 105°W), and the SE and GP regions shown in Fig. 3. As with the statistical comparisons discussed in the previous section, the skill scores are a strong function of valid time, forecast lead time, and region (Figs. 13–16). The dashed line shown in these figures can be used to determine how the skill score of the 0000 UTC forecast varies with lead time. The mean skill scores for any other forecast issuance time may be found by following a line parallel to the dashed line that originates at the issuance time of interest.

The diurnal variation in frequency bias is shown for each region in Fig. 13. This figure provides more detailed information on the diurnal variation of the bias than that given in Fig. 8. Figure 13 also includes curves that indicate the diurnal variation in the observed MCS frequency for each region. Comparing the curves in each panel reveals that the observed diurnal cycle of MCS frequency for the GP region is almost completely out of phase with that found for the SE region. Diurnal variations in the skill scores may be described in terms of the observed phase of MCS activity (i.e., dissipation vs initiation and growth). For example, in the SE region, the frequency bias peaks (more so in 2013) in forecasts issued between 0000 and 0800 UTC, which are valid during the dissipation phase of MCSs in this region, indicating the model’s tendency to dissipate storms too slowly there. At the same time, the frequency bias in the SE region was reduced during the storm initiation and growth phase (1600–2200 UTC) in 2013. In the GP region, forecasts issued after 1200 and before 0000 UTC had large frequency biases. These high biases are found primarily during the initiation phase of MCSs in this region, which is consistent with the overprediction of MCS-Is and MCSs in this region (Figs. 10 and 11). These high biases expanded in 2013 despite a 40% increase in observed MCS activity over 2012.

FIG. 12. Schematic demonstrating how contingency tables are created. The red-filled contours depict the truth field (given a value of 2) and the yellow-filled contours represent a 50-km expansion of the truth field (given a value of 1). The dark blue contours indicate the model-predicted MCS locations (= 2) and the light blue contour indicates the model-predicted MCS locations extended by 50 km (= 1). The grid lines represent a 1° grid, with each grid square being classified as either a hit (indicated by an "H"), miss (indicated by an "M"), false alarm (indicated by an "F"), or correct null (indicated by an "N") based on the classification method shown in Table 4.

TABLE 4. Description of the 3 × 3 contingency table used in the matched comparison of modeled and observed MCSs. For the "Modeled MCS location value" and "Observed MCS location value," a value of 2 indicates the exact location of an MCS, a value of 1 indicates the 50-km buffer around the MCS, and a value of 0 indicates that no MCSs are present within 50 km.

                                       Observed MCS location value
                                       2        1               0               Totals
Modeled MCS location value      2      Hit      Hit             False alarm     nfcst
                                1      Hit      Correct null    Correct null    Not used
                                0      Miss     Correct null    Correct null    nfcst,null
Totals                                 nobs     Not used        nobs,null

Like the frequency bias, each of the skill scores (i.e., SEDS, POD, and FAR) varies with lead time, valid time, region, and year (Figs. 14–16). For the eastern U.S. domain, the best performing forecasts were issued between 2000 and 0400 UTC (note the skill pickup along either side of the dashed line that identifies all of the valid times for the 0000 UTC forecast in Fig. 14). This valid time period corresponds with the observed period of enhanced MCS activity in the GP region (Fig. 13c). This period of peak skill is characterized by higher POD and lower FAR values than at other times of day. In contrast, model forecasts issued between 0600 and 1600 UTC (i.e., the period corresponding with reduced observed MCS activity compared to other times of day) perform worst (Fig. 14). These relationships are consistent with the conjecture that the assimilation of radar reflectivity improves model skill when appreciable convective activity is present at forecast initialization time.

Overall, the skill (given by SEDS) of the MCS forecasts is greater in 2013, particularly for longer-lead forecasts verifying between 0600 and 1200 UTC (i.e., the overnight period). These increases in SEDS are coincident with simultaneous increases in POD and decreases in FAR in the 2013 forecasts (Figs. 14b,c). This skill pickup was not simply due to the increased frequency of the observed MCS events (seen in Fig. 13c), as discussed by Jolliffe (2008), since SEDS is not sensitive to these fluctuations (Hogan et al. 2009). However, the year-to-year change in observed MCS frequency indicates that there were likely appreciable differences in the mean environmental conditions of this region between 2012 and 2013, which may have contributed to changes in model performance.

FIG. 13. Model bias as a function of lead time and valid time of day in the (a) eastern U.S. and the (b) SE and (c) GP domains during (left) 2012 and (right) 2013. Solid curves indicate relative variations in observed MCS frequency with time of day for a given region (the max observed frequency is given for each region). Dashed diagonal lines indicate the lead time/valid time coordinates of the 0000 UTC forecasts.

Regional variations in model performance are ex-

plored by studying two regions (SE and GP) with vastly

different storm formation mechanisms and statistical

performance characteristics [see section 5b(1)]. Values

of SEDS vary more strongly with lead time in the SE

region and more strongly with valid time in the GP re-

gion (Figs. 15 and 16, respectively). In the SE region,

SEDS values are consistently highest for the shortest

lead times, while in the GP region, SEDS peaks during

the evening or overnight period (between 0000 and

1200UTC). Variations in SEDS are related to variations

in POD in the SE region, and both POD and the FAR in

the GP region. While the POD values are comparable

between the two regions, the FAR values are much

greater in the GP region. The cause of the high FAR

values in the GP region after ;1200 UTC, which is

consistent with the overprediction of MCS frequencies

in this region discussed in section 5b(1), is further ex-

plored in section 5c.

Values of SEDS in 2013 tend to be higher than those

found for the 2012 forecasts in both regions. In the

SE region, this increase in skill is limited to forecast

lead times of 3–8h that are valid between 1400 and

1700 UTC. In the GP region, SEDS increases in 2013 for

forecasts issued between 0000 and 0600 UTC and for

lead times of up to 10 h, while it decreases slightly at lead

times longer than 8h for forecasts issued between 1600

and 0000 UTC (i.e., issue times for which convective

activity is limited, as seen in Fig. 13c). In both regions,

the POD of the first few lead hours of forecasts issued in

2013 increases dramatically over that found in 2012.

This increase in POD at the short lead times is consistent with what one might expect through improved assimilation of radar reflectivity. The increase in SEDS extends to longer lead times in the GP region where larger, longer-lived storms are more common than in the SE region; thus, their assimilation into the model can have a longer impact on the model's performance. These findings are also consistent with Stratman et al. (2013), who discussed improvements associated with the assimilation of radar reflectivity into the CAPS 4-km model ensemble.

FIG. 14. Model performance as a function of lead time and valid time of day as indicated by (a) SEDS, (b) POD, and (c) FAR for the eastern U.S. domain during (left) 2012 and (right) 2013. Dashed diagonal lines indicate the location of lead times associated with the 0000 UTC forecasts.

c. Exploring the cause for model biases

The evaluation presented above indicates that there

are three consistent biases in the model’s prediction of

MCSs. The three main biases are 1) underprediction of

daytime MCSs in the SE, 2) overprediction of MCSs in

the high plains and northern GP throughout the day, and

3) underprediction of nighttime MCSs in the central

MRV. In this section, each of these biases is explored in

more detail.

1) DAYTIME MCS UNDERFORECASTING BIAS IN

THE SOUTHEASTERN UNITED STATES

The importance of evaluating the model at finer scales

is evident when comparing longitudinal band averages

of MCS frequencies (Fig. 8) with those obtained using a

100-km grid (Fig. 10). It is seen that while model-

forecasted MCS frequencies had limited bias in the

APP longitudinal band (Fig. 8), the southern and eastern

areas within the APP band had large negative biases,

especially during the daytime (Fig. 10). As mentioned

earlier, a large fraction of the observed MCS-I events

are missed in the southeastern third of the APP band in

both years. Matched pair analyses revealed that only

18% of the observed MCS-I events in this region were

simulated to within ±2 h in 2013. Of the missed MCS-I

events, 25% of the time the model failed to initiate

convection at any scale while the rest of the time con-

vection was predicted, but storms failed to grow large enough to attain MCS status (Fig. 17). Here, it is seen that, for each observed MCS-I event, the size of the modeled convective storms tended to remain below the MCS criterion (i.e., 100 km). Quantification of the biases in the modeled environmental conditions associated with these missed MCS-I events is needed to guide future model improvement efforts.

FIG. 15. As in Fig. 14, but for the SE domain.
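The matched-pair calculation quoted above (18% of observed MCS-I events reproduced to within ±2 h) amounts to a windowed matching of initiation times. The sketch below is a minimal illustration in Python; the function, data layout, and example times are our own stand-ins, and a real implementation would also enforce spatial overlap between the observed and modeled objects.

from datetime import datetime, timedelta

def fraction_matched(observed_times, modeled_times, window_hours=2):
    # Fraction of observed MCS-I events with a modeled MCS-I within +/- window_hours.
    window = timedelta(hours=window_hours)
    hits = sum(1 for t_obs in observed_times
               if any(abs(t_mod - t_obs) <= window for t_mod in modeled_times))
    return hits / len(observed_times) if observed_times else float("nan")

# Hypothetical initiation times for one day in the SE subregion:
obs = [datetime(2013, 7, 10, 18), datetime(2013, 7, 10, 23)]
mod = [datetime(2013, 7, 10, 19, 30)]
print(fraction_matched(obs, mod))   # -> 0.5 (only the 1800 UTC event is matched)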

2) MCS OVERFORECASTING BIAS OVER THE HIGH

PLAINS AND NORTHERN PLAINS

Model forecasts of MCSs exhibited a high bias in the

GP longitudinal band throughout the day at all but the

longest forecast lead times (Fig. 8). The high bias was

most pronounced over the high plains (i.e., western half

of the GP band) and northern Great Plains (Fig. 10),

with areas experiencing the highest biases correspond-

ing directly with areas where MCS-Is were also over-

predicted. The overprediction of MCS-Is over the high

plains is consistent with that reported by Burghardt et al.

(2014), who found similar biases (the modeled MCS-Is

being too early and occurring too often) in a set of high-

resolution simulations of terrain-aided convective initi-

ation over the high plains.

In terms of the matched skill scores, the model was

able to detect more than 80% of the MCS events oc-

curring during the period of enhanced MCS activity at

night, but at the expense of FAR values of over 50%

(Fig. 16). These FAR values are much larger than the

30% values found during the period of peak activity in

the SE region (Fig. 15). Further inspection of the false

alarm cases in the GP region revealed that they were

primarily due to the model having too many MCS-I

events, with many of these events producing an MCS

lasting several hours. About 10% of the time, areas of

primarily nonconvective precipitation were incorrectly

identified as MCSs using the optimized threshold values.

FIG. 16. As in Fig. 14, but for the GP domain.


3) NIGHTTIME MCS UNDERFORECASTING BIASES

OVER THE CENTRAL MISSISSIPPI RIVER

VALLEY

In contrast to the GP region, the model tended to

underforecast the occurrence of MCSs in the MRV re-

gion during overnight hours (Fig. 8), especially in 2013.

The largest underforecasts in this longitudinal band

(with a deficit of nearly 40 per week in 2013) occurred

over the northern and central parts of the MRV band

(Fig. 10), where the model was unable to reproduce the

local maximum in MCS activity observed to the south

and west of Lake Michigan in 2013 (Fig. 3). Similar to

the SE region, this underforecasting bias can be linked,

in part, to an underprediction of MCS-I events. This is

evident when comparing Figs. 5 and 11, which indicate

that the model MCS-I frequency in 2013 is about 50% of

the observed frequency over much of the central MRV

band centered near Missouri. In addition to this MCS-I

underprediction, the MCS underforecasting bias can

also be attributed to the model dissipating nighttime

MCSs too quickly, as demonstrated in Fig. 18. In this

figure, the number of observed and forecasted (given as a function of lead time for every other issue time) MCSs is

summed over the entire summer. The forecasted MCSs

are for all forecast lead times and issued every even hour

of the day; thus, each curve overlaps the next by 12 h.

The number of MCSs predicted between 2000 and 1200

LST starts near the observed number of MCSs, but then

decreases with forecast lead time. In other words, over

the course of each model run valid during this time pe-

riod, the model tends to dissipate MCSs too quickly.

Thus, the low bias in the MRV band at night is caused

by a combination of underforecasted MCS-Is and a

tendency to dissipate storms too quickly.
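The dissipation tendency summarized above can be diagnosed with a simple aggregation of MCS counts by issue time and lead time, in the spirit of Fig. 18. The sketch below is illustrative only; the record format and names are assumptions, not the processing actually used to build the figure. Comparing these counts against the observed counts valid at the same times reveals the lead-time-dependent shortfall.

from collections import defaultdict

def counts_by_issue_and_lead(forecast_mcs_records):
    # forecast_mcs_records: iterable of (issue_hour_utc, lead_hour) pairs,
    # one per forecasted MCS object accumulated over the season (hypothetical format).
    counts = defaultdict(lambda: defaultdict(int))
    for issue_hour, lead_hour in forecast_mcs_records:
        counts[issue_hour][lead_hour] += 1
    return counts

# Toy example: a 0000 UTC run whose MCS count falls off with lead time,
# mimicking the too-rapid overnight dissipation seen in the MRV region.
records = [(0, 3)] * 40 + [(0, 8)] * 25 + [(0, 13)] * 12
print(dict(counts_by_issue_and_lead(records)[0]))   # {3: 40, 8: 25, 13: 12}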

6. Summary and context of results

An object-based verification technique has been used

to evaluate the HRRR's skill at predicting the occurrence,

properties, and initiation of MCSs. The software, TITAN (Dixon and Wiener 1993), was originally developed to identify and track convective storms in radar data and has since been adapted to assess the performance of

high-resolution model forecasts of convection (e.g.,

Pinto et al. 2007; Caine et al. 2013). While a number of

studies have used object-based techniques to evaluate

convective precipitation forecasts in the past, few have

attempted to quantify regional variations in perfor-

mance, and fewer still have quantified the model’s

ability to predict the initiation of larger storm systems

over a region spanning many different climate zones. In

addition, unlike most studies, this work employs a

method to remove the mean bias in the modeled VIL

field prior to identification of storm objects and the

computation of performance statistics. This is done by

adjusting the VIL threshold used to identify MCSs in the model forecasts until the modeled MCS size distribution closely matches that obtained from radar observations using a VIL threshold of 3.5 kg m−2. This procedure effectively removes the impact of mean biases in both the model and the observational dataset.
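A minimal sketch of this threshold-matching step is given below (Python). The object-identification routine, the candidate thresholds, the size bins, and the histogram distance are all our own stand-ins; the text states only that the threshold is adjusted until the modeled MCS size distribution closely matches the observed one.

import numpy as np

def optimize_vil_threshold(model_vil_fields, observed_sizes_km,
                           identify_mcs_sizes, candidates=np.arange(1.0, 3.6, 0.25)):
    # Pick the model VIL threshold whose MCS size distribution best matches the
    # observed one (observed MCSs identified with the fixed 3.5 kg m-2 radar threshold).
    # identify_mcs_sizes(fields, threshold) -> array of MCS max dimensions (km);
    # it is a placeholder for the object-identification step.
    bins = np.arange(100, 1100, 100)                      # size bins (km), illustrative
    obs_hist, _ = np.histogram(observed_sizes_km, bins=bins)
    best_thresh, best_err = None, np.inf
    for thresh in candidates:
        mod_hist, _ = np.histogram(identify_mcs_sizes(model_vil_fields, thresh), bins=bins)
        err = np.abs(mod_hist - obs_hist).sum()           # simple L1 distance between histograms
        if err < best_err:
            best_thresh, best_err = thresh, err
    return best_thresh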

It was found that using an optimized VIL threshold was critical in evaluating the HRRR's skill at predicting MCSs. Optimized thresholds of 1.5 and 1.75 kg m−2 were used to detect MCSs in the HRRR runs produced in 2012 and 2013, respectively. Both of these thresholds are at least 50% less than the 3.5 kg m−2 value used to identify areas of convection in the observed VIL field. While these thresholds are relatively low, and not what one would typically relate to convection, it was shown that they effectively capture areas of convection that might be seen using other fields such as composite maximum reflectivity. As such, a key aspect of this study is to make users of these data aware of potential biases in the modeled VIL field and to allow for a fair assessment of the model's skill when using this field to identify MCSs, rather than penalize the model for issues predominantly related to postprocessing.

FIG. 17. Distributions of modeled (red) and observed (black) storm sizes occurring within the region bound by 29.5°N, 86.5°W and 33.5°N, 80.5°W (a subregion within the SE region) that were obtained for a subset of MCS-I events observed in 2013 in which the max dimension of the modeled convective storms failed to reach 100 km. Note that the modeled storms larger than 100 km that were present in the subregion during an observed MCS-I event did not initiate within ±2 h of the observed event.

FIG. 18. Total number of modeled MCSs (3–14-h lead times of forecasts issued every even hour are shown, with the line for the 0000 LST run being blue and more reddish and brown colors being forecasts issued later in the day) and observed (connected dots) MCSs obtained during JJA 2013 for the central MRV region denoted in Fig. 10.

Once the mean bias is removed using a single VIL

threshold, other biases related to model physics and

predictability issues can be explored. It is found that the

remaining biases are a function of lead time, region, and

time of day. Generally, the frequency of occurrence of

predicted MCSs decreases with forecast lead time.

When using a single optimized threshold, the HRRR

predicts too many (as many as a factor of 2 more than

observed in the GP region during the day) MCSs at the

shorter lead times and too few (nearly 50% less than

observed in the SE region at night) at the longer lead

times (see Fig. 8 for details). This dependence on lead

time was less pronounced in 2013, which is indicative of

an overall improvement in the HRRR’s skill at pre-

dicting MCSs. Regional variations in bias of the fore-

casted MCS frequencies obtained with the HRRR

using a single optimized threshold varied from −33% in the SE during the day to +75% in the GP region at

night in both years (Fig. 7). It should be noted that if one

had used 3.5 kg m−2 to detect MCSs, the range of varia-

tions in the frequency bias reported above would be

similar but values would all be negative.
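Read as the relative departure of the forecast MCS count N_f from the observed count N_o (our interpretation of the frequency bias plotted in Fig. 7; the text does not state the definition explicitly), the quoted endpoints correspond to

bias = 100% x (N_f - N_o) / N_o,

so that -33% implies N_f is roughly 0.67 N_o (SE region, daytime) and +75% implies N_f is roughly 1.75 N_o (GP region, night).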

More detailed analyses revealed that the HRRR

consistently overestimated the frequency of occurrence

of MCSs over the Great Plains during both day and night

(Fig. 10). This consistent overforecasting bias was re-

lated to the simulation of too many MCS-I events on the

high plains in the late afternoon. Recent studies have

shown that this bias may be insensitive to model reso-

lution, as Burghardt et al. (2014) found similar biases in

the initiation of convection over the high plains at a

model grid spacing of 400 m. In contrast to the high bias

found over the Great Plains, the model tended to un-

derestimate MCS frequency in the SE during the day

and in the MRV band at night. The low bias in the SE

region was related to the model underforecasting the

number of MCS-I events while the low bias in the MRV

region was related to the model’s tendency to dissipate

storms too quickly. While the model did a better job

simulating MCS-I frequency over the Appalachians in

2013, the timing in this region was worse than in 2012

with MCS-Is occurring 2–3 h later than observed. Nonetheless, broadly speaking, the model captured the regional variations in the timing of MCS-Is well,

with good representation of the later initiation of MCSs

as one moves north and west across the United States.

Smaller-scale interannual variability appears to have

been captured as well. Future changes in the HRRR’s

skill at predicting MCSs and MCS-I events can be

tracked at the NCAR/Research Applications Labora-

tory MCS verification website, which provides weekly

and monthly performance graphics (http://rap.ucar.edu/

projects/mcsprediction/statistics/).

Using optimized thresholds also allowed for a more fair

evaluation of model skill. It was found that both SEDS and

POD increased notably between 2012 and 2013 while FAR

was reduced. This increase in POD was observed

throughout the day for lead times of 3–5 h, with notable

increases extending to longer lead times during the storm

growth period in the SE region and at night in the GP re-

gion. Both periods in which the increase in SEDS extended

to longer lead times coincided with forecast issue times that were more likely to have existing storms. The large increase in POD found at short lead times and the increase in SEDS of longer-lead forecasts that likely had ongoing convective activity at forecast issue time are consistent with what one

might expect in response to improved assimilation of radar

reflectivity. A more detailed investigation of the relation-

ship between radar reflectivity assimilation and model

performance is beyond the scope of this paper.

The object-based technique used in this study could

easily be extended to evaluate model performance as a

function of storm properties. For example, one could

explore the relationship between skill and storm size or

skill and storm organization (e.g., linear vs circular

MCSs, broken squall lines, etc.) by computing skill

scores as a function of permissible gap size or aspect

ratio. The technique could also be used to assess the

performance of high-resolution model ensemble fore-

casts following the work of Gallus (2010). Taking this a

step further, one could apply the object-based technique

on model ensemble forecasts to generate probabilistic

forecasts of storm properties. Pinto et al. (2013) dem-

onstrated this type of forecast system in which they

generated forecasts of the likelihood of convective

storms exceeding a given size threshold to aid aviation

planners in optimally selecting air traffic flow structures

in the presence of large convective storms.
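As a concrete illustration of the first of these extensions (a sketch only; the object schema, property bins, and matching inputs are hypothetical), one could recompute a score such as POD within bins of a chosen storm attribute, for example object aspect ratio:

import numpy as np

def pod_by_property(observed_objects, matched_flags, prop="aspect_ratio",
                    bin_edges=(1.0, 1.5, 2.5, 5.0, 10.0)):
    # observed_objects: list of dicts with storm attributes (hypothetical schema);
    # matched_flags: True where the observed object was matched by a forecast object.
    # Returns POD within each property bin.
    values = np.array([obj[prop] for obj in observed_objects])
    matched = np.array(matched_flags, dtype=bool)
    pods = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (values >= lo) & (values < hi)
        pods.append(matched[in_bin].mean() if in_bin.any() else np.nan)
    return pods

# Toy usage with two observed MCS objects:
objs = [{"aspect_ratio": 1.2}, {"aspect_ratio": 3.0}]
print(pod_by_property(objs, [True, False]))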

Acknowledgments. The authors greatly appreciated

access to CIWS data provided by Dr. Haig Iskenderian

of the MIT Lincoln Laboratory. We would also like to

thank Drs. Stan Benjamin, Steve Weygandt, and Curtis


Alexander of NOAA/ESRL/GSD for providing the

HRRR data as well as several stimulating discussions

on HRRR performance over the years. The thoughtful

comments by Drs. Tammy Weckwerth, Russ Schumacher,

Matthew Bunkers, and another reviewer (anonymous)

were greatly appreciated and helped sharpen the dis-

cussion of the material.

This research has been conducted in response to re-

quirements of the Federal Aviation Administration

(FAA), supplemented by funding provided by the Na-

tional Science Foundation (NSF) to NCAR. The views

expressed are those of the authors and do not necessarily

represent the official policy or position of either the FAA or the NSF.

REFERENCES

Alexander, C., S. Weygandt, S. G. Benjamin, T. G. Smirnova, J. M.

Brown, P. Hofmann, and E. James, 2011: The High Resolution

Rapid Refresh (HRRR): Recent and future enhancements,

time-lagged ensembling, and 2010 forecast evaluation activi-

ties. Proc. 24th Conf. on Weather and Forecasting/20th Conf.

on Numerical Weather Prediction, Seattle, WA, Amer. Me-

teor. Soc., 12B.2. [Available online at https://ams.confex.com/

ams/91Annual/webprogram/Paper183065.html.]

Augustine, J. A., and K. W. Howard, 1988: Mesoscale convective

complexes over the United States during 1985. Mon. Wea.

Rev., 116, 685–701, doi:10.1175/1520-0493(1988)116<0685:MCCOTU>2.0.CO;2.

Baldwin, M. E., and S. Lakshmivarahan, 2003: Development of an

events oriented verification system using data mining and

image processing algorithms. Preprints, Third Conf. on Arti-

ficial Intelligence, Long Beach, CA, Amer. Meteor. Soc., 4.6.

[Available online at http://ams.confex.com/ams/pdfpapers/

57821.pdf.]

Basara, J. B., J. N. Maybourn, C. M. Peirano, J. E. Tate, P. J.

Brown, J. D. Hoey, and B. R. Smith, 2013: Drought and as-

sociated impacts in the Great Plains of the United States—A

review. Int. J. Geosci., 4, 72–81, doi:10.4236/ijg.2013.46A2009.

Benjamin, S. G., S. Weygandt, T. G. Smirnova, M. Hu, S. E.

Peckham, J. M. Brown, K. Brundage, and G. S. Manikin, 2009:

Assimilation of radar reflectivity data using a diabatic digital

filter: Applications to the Rapid Update Cycle and Rapid

Refresh and initialization of High Resolution Rapid Refresh

forecasts with RUC/RR grids. Preprints, 13th Conf. on In-

tegrated Observing and Assimilation Systems for Atmosphere,

Oceans, and Land Surface (IOAS-AOLS), Phoenix, AZ,

Amer. Meteor. Soc., 7B.3. [Available online at https://ams.

confex.com/ams/pdfpapers/150469.pdf.]

——, ——, C. Alexander, J. M. Brown, T. G. Smirnova,

P. Hofmann, E. James, and G. DiMego, 2011: NOAA's hourly-updated 3-km HRRR and RUC/Rapid Refresh—Recent

(2010) and upcoming changes toward improving weather

guidance for air-traffic management. Proc. Second Aviation,

Range, and Aerospace Meteorology Special Symp. on

Weather–Air Traffic Management Integration, Seattle, WA,

Amer. Meteor. Soc., 3.2. [Available online at https://ams.

confex.com/ams/91Annual/webprogram/Paper185659.html.]

——, and Coauthors, 2013: Data assimilation and model updates in

the 2013 Rapid Refresh (RAP) and High-Resolution Rapid

Refresh (HRRR) analysis and forecast systems. NCEP/EMC

Meeting, Washington, DC, NCEP/EMC/Model Evaluation

Group. [Available online at http://ruc.noaa.gov/pdf/NCEP_

HRRR_RAPv2_6jun2013-Benj-noglob.pdf.]

Billet, J., M. DeLisi, B. G. Smith, and C. Gates, 1997: Use of re-

gression techniques to predict hail size and the probability of

large hail. Wea. Forecasting, 12, 154–164, doi:10.1175/1520-0434(1997)012<0154:UORTTP>2.0.CO;2.

Bryan, G. H., J. C. Wyngaard, and J. M. Fritsch, 2003: Resolution

requirements for the simulation of deep moist convection. Mon.

Wea. Rev., 131, 2394–2416, doi:10.1175/1520-0493(2003)131<2394:RRFTSO>2.0.CO;2.

Burghardt, B. J., C. Evans, and P. J. Roebber, 2014: Assessing the

predictability of convection initiation in the high plains using

an object-based approach. Wea. Forecasting, 29, 403–418,

doi:10.1175/WAF-D-13-00089.1.

Caine, S., T. P. Lane, P. May, C. Jakob, S. T. Siems, M. J. Manton,

and J. O. Pinto, 2013: Statistical assessment of tropical

convection-permitting model simulations using a cell-tracking

algorithm. Mon. Wea. Rev., 141, 557–581, doi:10.1175/

MWR-D-11-00274.1.

Carbone, R. E., and J. D. Tuttle, 2008: Rainfall occurrence in the U.S.

warm season: The diurnal cycle. J. Climate, 21, 4132–4146,

doi:10.1175/2008JCLI2275.1.

——, ——, D. Ahijevych, and S. B. Trier, 2002: Inferences

of predictability associated with warm season precipita-

tion episodes. J. Atmos. Sci., 59, 2033–2056, doi:10.1175/1520-0469(2002)059<2033:IOPAWW>2.0.CO;2.

Clark, A. J., R. G. Bullock, T. L. Jensen, M. Xue, and F. Kong,

2014: Application of object-based time-domain diagnostics

for tracking precipitation systems in convection allowing

models. Wea. Forecasting, 29, 517–542, doi:10.1175/

WAF-D-13-00098.1.

Coniglio, M. C., J. Y. Hwang, and D. J. Stensrud, 2010: Environ-

mental factors in the upscale growth and longevity of MCSs

derived from Rapid Update Cycle analyses. Mon. Wea. Rev.,

138, 3514–3539, doi:10.1175/2010MWR3233.1.

——, J. Correia, P. T. Marsh, and F. Kong, 2013: Verification of

convection-allowing WRF Model forecasts of the planetary

boundary layer using sounding observations. Wea. Fore-

casting, 28, 842–862, doi:10.1175/WAF-D-12-00103.1.

Crowe, B. A., and D. W. Miller, 1999: The benefits of using

NEXRAD vertically integrated liquid water as an aviation

weather product. Preprints, Eighth Conf. on Aviation, Range,

and Aerospace Meteorology, Dallas, TX, Amer. Meteor. Soc.,

168–171.

Davis, C. A., B. G. Brown, and R. G. Bullock, 2006: Object-based

verification of precipitation forecasts. Part II: Application to

convective rain systems. Mon. Wea. Rev., 134, 1785–1795,

doi:10.1175/MWR3146.1.

Dixon, M., and G. Wiener, 1993: TITAN: Thunderstorm identifi-

cation, tracking, analysis and nowcasting—A radar-based

methodology. J. Atmos. Oceanic Technol., 10, 785–797,

doi:10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2.

Ebert, E. E., and J. L. McBride, 2000: Verification of precipitation

in weather systems: Determination of systematic errors.

J. Hydrol., 239, 179–202, doi:10.1016/S0022-1694(00)00343-7.

——, and W. A. Gallus Jr., 2009: Toward better understanding of

the contiguous rain area (CRA) method for spatial forecast

verification. Wea. Forecasting, 24, 1401–1415, doi:10.1175/

2009WAF2222252.1.

Evans, J. E., and E. R. Ducot, 2006: Corridor Integrated Weather

System. MIT Lincoln Lab. J., 16, 59–80.


Gallus, W. A., Jr., 2010: Application of object-based verification

techniques to ensemble precipitation forecasts. Wea. Fore-

casting, 25, 144–158, doi:10.1175/2009WAF2222274.1.

Geerts, B., 1998: Mesoscale convective systems in the southeast

United States during 1994–95: A survey. Wea. Forecasting,

13, 860–869, doi:10.1175/1520-0434(1998)013<0860:MCSITS>2.0.CO;2.

Greene, D. R., and R. A. Clark, 1972: Vertically integrated liquid

water—A new analysis tool. Mon. Wea. Rev., 100, 548–552,

doi:10.1175/1520-0493(1972)100<0548:VILWNA>2.3.CO;2.

Hallowell, R. G., and Coauthors, 1999: The Terminal Convective

Weather Forecast Demonstration. Preprints, Eighth Conf. on

Aviation, Range, and Aerospace Meteorology, Dallas, TX,

Amer. Meteor. Soc., 200–204.

Hoerling, M., and Coauthors, 2013: An interpretation of the origins

of the 2012 Central Great Plains drought. NOAA Drought

Task Force Assessment Rep., 50 pp. [Available online at

ftp://ftp.oar.noaa.gov/CPO/pdf/mapp/reports/2012-Drought-

Interpretation-final.web-041113.pdf.]

Hogan, R. J., E. J. O’Connor, and A. J. Illingworth, 2009: Verifi-

cation of cloud fraction forecasts.Quart. J. Roy. Meteor. Soc.,

135, 1494–1511, doi:10.1002/qj.481.

Houze, R. A., Jr., 1993: Cloud Dynamics. Academic Press, 573 pp.

Jirak, I. L., W. R. Cotton, and R. L. McAnelly, 2003: Satellite and

radar survey of mesoscale convective system development. Mon.

Wea. Rev., 131, 2428–2449, doi:10.1175/1520-0493(2003)131<2428:SARSOM>2.0.CO;2.

Johnson, A., and X. Wang, 2013: Object-based evaluation of a

storm-scale ensemble during the 2009 NOAA Hazardous

Weather Testbed Spring Experiment. Mon. Wea. Rev., 141, 1079–1098, doi:10.1175/MWR-D-12-00140.1.

——, ——, F. Kong, and M. Xue, 2013: Object-based evaluation of

the impact of horizontal grid spacing on convection-allowing

forecasts. Mon. Wea. Rev., 141, 3413–3425, doi:10.1175/

MWR-D-13-00027.1.

Jolliffe, I. T., 2008: The impenetrable hedge: A note on propriety,

equitability and consistency. Meteor. Appl., 15, 25–29,

doi:10.1002/met.60.

Kitzmiller, D. H., W. E. McGovern, and R. F. Saffle, 1995: The

WSR-88D severe weather potential algorithm. Wea. Fore-

casting, 10, 141–159, doi:10.1175/1520-0434(1995)010<0141:TWSWPA>2.0.CO;2.

Koch, S. E., B. S. Ferrier, J. S. Kain, M. T. Stoelinga, E. J. Szoke,

and S. J. Weiss, 2005: The use of simulated radar reflectivity

fields in the diagnosis of mesoscale phenomena from high-

resolution WRF Model forecasts. Preprints, 11th Conf. on

Mesoscale Processes/32nd Conf. on Radar Meteorology, Al-

buquerque, NM, Amer. Meteor. Soc., J4J.7. [Available online

at https://ams.confex.com/ams/pdfpapers/97032.pdf.]

Krozel, J., J. S. B. Mitchell, V. Polishchuk, and J. Prete, 2007:

Maximum flow rates for capacity estimation in level flight with

convective weather constraints. Air Traffic Control Quart., 15, 209–238.

Lack, S., G. J. Limpert, and N. I. Fox, 2010: An object-oriented

multiscale verification scheme. Wea. Forecasting, 25, 79–92,

doi:10.1175/2009WAF2222245.1.

Parker, M. D., and R. H. Johnson, 2000: Organizational modes of

midlatitude mesoscale convective systems. Mon. Wea. Rev.,

128, 3413–3436, doi:10.1175/1520-0493(2001)129<3413:OMOMMC>2.0.CO;2.

——, and D. Ahijevych, 2007: Convective episodes in the east-

central United States. Mon. Wea. Rev., 135, 3707–3727,

doi:10.1175/2007MWR2098.1.

Pinto, J. O., C. Phillips, M. Steiner, R. Rasmussen, N. Oien,

M. Dixon, W. Wang, and M. Weisman, 2007: Assessment of

the statistical characteristics of thunderstorms simulated with

the WRF Model using convection permitting resolution. 33rd

Int. Conf. on Radar Meteorology, Cairns, QLD, Australia, 5.5.

[Available online at https://ams.confex.com/ams/pdfpapers/

123712.pdf.]

——, W. Dupree, S. Weygandt, M. Wolfson, S. Benjamin, and

M. Steiner, 2010: Advances in the Consolidated Storm Pre-

diction for Aviation (CoSPA). Preprints, 14th Conf. on Avi-

ation, Range, and Aerospace Meteorology, Atlanta, GA,

Amer. Meteor. Soc., J11.2. [Available online at https://ams.

confex.com/ams/pdfpapers/163811.pdf.]

——, J. A. Grim, D. Ahijevych, and M. Steiner, 2013: An auto-

mated system for detecting large-scale convective storms:

Application to model evaluation. Proc. 16th Conf. on Avia-

tion, Range, and Aerospace Meteorology, Austin, TX, Amer.

Meteor. Soc., 9.4A. [Available online at https://ams.confex.

com/ams/93Annual/webprogram/Paper222079.html].

Schwartz, C. S., and Coauthors, 2009: Next-day convection-

allowing WRF Model guidance: A second look at 2-km ver-

sus 4-km grid spacing. Mon. Wea. Rev., 137, 3351–3372,

doi:10.1175/2009MWR2924.1.

Skamarock, W. C., and J. B. Klemp, 2008: A time-split non-

hydrostatic atmosphere model for weather research and

forecasting applications. J. Comput. Phys., 227, 3465–3485, doi:10.1016/j.jcp.2007.01.037.

Smalley, D. J., and B. J. Bennett, 2002: Using ORPG to enhance

NEXRAD products to support FAA critical systems. Pre-

prints, 10th Conf. on Aviation, Range, and Aerospace Meteo-

rology, Portland, OR, Amer. Meteor. Soc., 3.6. [Available

online at https://ams.confex.com/ams/pdfpapers/38861.pdf.]

Steiner, M., R. Bateman, D. Megenhardt, Y. Liu, M. Xu,

M. Pocernich, and J. Krozel, 2010: Translation of ensemble

weather forecasts into probabilistic air traffic capacity impact.

Air Traffic Control Quart., 18, 229–254.

Stensrud, D. J., and Coauthors, 2009: Convective-scale warn-on-

forecast: A vision for 2020.Bull. Amer. Meteor. Soc., 90, 1487–

1499, doi:10.1175/2009BAMS2795.1.

——, and Coauthors, 2013: Progress and challenges with warn-

on-forecast. Atmos. Res., 123, 2–16, doi:10.1016/j.atmosres.2012.04.004.

Stratman, D. R., M. C. Coniglio, S. E. Koch, and M. Xue, 2013: Use of multiple verification methods to evaluate forecasts of con-

vection from hot- and cold-start convection allowing models.

Wea. Forecasting, 28, 119–138, doi:10.1175/WAF-D-12-00022.1.

Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008:

Explicit forecasts of winter precipitation using an improved

bulk microphysics scheme: Part II: Implementation of a new

snow parameterization. Mon. Wea. Rev., 136, 5095–5115,

doi:10.1175/2008MWR2387.1.

Weisman, M. L., W. C. Skamarock, and J. B. Klemp, 1997:

The resolution dependence of explicitly modeled convective

systems. Mon. Wea. Rev., 125, 527–548, doi:10.1175/1520-0493(1997)125<0527:TRDOEM>2.0.CO;2.

Weygandt, S., and Coauthors, 2011: The Rapid Refresh—

Replacement for the RUC, pre-implementation development

and evaluation. Proc. 24th Conf. on Weather and Forecasting/

20th Conf. on Numerical Weather Prediction, Seattle, WA,

Amer. Meteor. Soc., 12B.1. [Available online at https://ams.

confex.com/ams/91Annual/webprogram/Paper183027.html.]

Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences.

2nd ed. Academic Press, 648 pp.
