THE MODEL EVALUATION TOOLS (MET) SHOW AND TELL


Tressa L. Fowler, John Halley Gotway, Randy Bullock

August 2009

MET is a set of tools for evaluating model forecasts.

Overview

  http://www.dtcenter.org/met/users/index.php

On the MET website

MET is…

Data preprocessing tools:

  Place data in the format(s) expected by the statistics tools

MET is…

Individual Forecast verification tools

  Traditional methods
    Gridded obs
    Point obs
    Confidence intervals

  Spatial methods
    Object-based
    Neighborhood
    Wavelet

MET is…

(Cumulative) Analysis tools

  Summarize statistics across cases

  Stratify according to various criteria (e.g., lead time)

Gen-Poly-Mask Tool

[Figure: MET workflow flowchart. Inputs: gridded GRIB (observation analyses, model forecasts), PrepBufr point obs, ASCII point obs. Reformatting tools: PCP Combine, Gen Poly Mask, PB2NC, ASCII2NC, producing gridded NetCDF, NetCDF mask, and NetCDF point obs. Statistics tools: Point Stat, Grid Stat, MODE, Wavelet Stat, producing STAT, ASCII, NetCDF, and PS output. Analysis tools: Stat Analysis, MODE Analysis, producing ASCII output. Legend: "= optional".]

Lat/lon polyline file (name on the first line, then one "lat lon" vertex per line):

  CONUS
  31.1931 -120.4211
  31.2291 -120.4976
  31.2650 -120.5741
  31.3009 -120.6123
  31.3369 -120.6506
  31.3728 -120.6888
  31.4087 -120.6888
  31.4447 -120.7270
  …

Define once, apply many times

[Figure: two accumulated precip panels]
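The idea of turning a lat/lon polyline into a reusable mask can be sketched in a few lines of Python. This is not the Gen-Poly-Mask implementation: `read_polyline`, `point_in_polygon`, and `make_mask` are hypothetical helpers, the ray-casting test treats lat/lon as planar coordinates, and the square "BOX" region is made up for illustration.

```python
def read_polyline(text):
    """Parse polyline text: a name line followed by 'lat lon' vertex lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    name = lines[0]
    vertices = [tuple(float(v) for v in ln.split()[:2]) for ln in lines[1:]]
    return name, vertices


def point_in_polygon(lat, lon, vertices):
    """Ray-casting test; the polygon closes implicitly (last vertex -> first)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        lat1, lon1 = vertices[i]
        lat2, lon2 = vertices[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def make_mask(grid_lats, grid_lons, vertices):
    """Return a 2D list of 0/1 flags, one per (lat, lon) grid point."""
    return [[1 if point_in_polygon(la, lo, vertices) else 0 for lo in grid_lons]
            for la in grid_lats]


# Made-up square region, purely for illustration.
name, verts = read_polyline(
    "BOX\n30.0 -110.0\n30.0 -100.0\n40.0 -100.0\n40.0 -110.0\n"
)
mask = make_mask([25.0, 35.0], [-105.0, -95.0], verts)
```

Once computed, the same `mask` could be applied to any field on the same grid, which is the point of "define once, apply many times".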

Masking Options

  Masking for Grid-Stat, Point-Stat, and MODE:

  1. Output of Gen-Poly-Mask
     [Figure: accumulated precip]

  2. Gridded data field and threshold
     Example: Surface Pressure <= 900mb
     [Figure: accumulated precip]

  3. Lat/Lon polyline file
     Example: the CONUS polyline (31.1931 -120.4211, 31.2291 -120.4976, …)
     [Figure: accumulated precip]

  4. Pre-defined grid
     Example: Grid = "DTC166"
     [Figure: accumulated precip]
     NCEP grids:
     •  83 of them
     •  Named "GNNN"
     •  New custom grids require code change

PCP-Combine Tool


  Stands for "Precip-Combine"

  Functionality:
    Mathematically combines data across multiple files.
    Add precipitation over 2 files with or without the same accumulation interval.
    Sum precipitation over 2 or more files with the same accumulation interval.
    Subtract precipitation in 2 files.

  Example to follow:
    Verify WRF-NMM and WRF-ARW precipitation using StageII analysis.
    Verify a 6-hour accumulation initialized at 20060501 00Z and valid at 20060502 00Z.

PCP-Combine: Add

  Add together 2 WRF-NMM 3-hourly accumulations:
  3-hour + 3-hour = 6-hour

  pcp_combine -add \
    wrfpcp_ruc13_21.tm00 3 wrfpcp_ruc13_24.tm00 3 \
    NMM_06A_2006050200V.nc

PCP-Combine: Subtract

  Subtract 2 WRF-ARW accumulations:
  24-hour - 18-hour = 6-hour

  pcp_combine -subtract \
    wrftwo_ruc13_24.tm00 24 wrftwo_ruc13_18.tm00 18 \
    ARW_06A_2006050200V.nc

PCP-Combine: Sum

  Sum up 6 StageII 1-hourly accumulations:
  6 * 1-hour = 6-hour

  pcp_combine -sum \
    00000000_000000 1 20060502_000000 6 \
    ST2_06A_2006050200V.nc -pcpdir ST2
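The add, subtract, and sum operations amount to grid-point arithmetic on accumulation fields. The sketch below shows that arithmetic on plain Python lists. It is an illustration only, not the PCP-Combine code: the function names are hypothetical and the precipitation values are made up.

```python
def add_fields(a, b):
    """Point-by-point sum of two equally shaped 2D accumulation fields."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def subtract_fields(a, b):
    """Point-by-point difference a - b (e.g. 24-hour minus 18-hour)."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sum_fields(fields):
    """Sum any number of fields that share one accumulation interval."""
    total = fields[0]
    for field in fields[1:]:
        total = add_fields(total, field)
    return total


# Made-up 1x2 grids of accumulated precipitation (mm).
acc_3h_f21 = [[1.0, 0.0]]   # 3-hour accumulation ending at forecast hour 21
acc_3h_f24 = [[2.5, 0.5]]   # 3-hour accumulation ending at forecast hour 24
acc_6h_add = add_fields(acc_3h_f21, acc_3h_f24)       # 3-hour + 3-hour

acc_24h = [[10.0, 4.0]]
acc_18h = [[6.5, 3.5]]
acc_6h_sub = subtract_fields(acc_24h, acc_18h)        # 24-hour - 18-hour

hourly = [[[0.5, 0.0]] for _ in range(6)]             # six 1-hour accumulations
acc_6h_sum = sum_fields(hourly)                       # 6 * 1-hour
```

All three routes end in a 6-hour accumulation valid at the same time, which is what makes the resulting fields comparable.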

MODE Tool: Method for Object-Based Diagnostic Evaluation


[Figure: MODE PostScript summary page, annotated with: object pictures; model info (name, times); object-attribute weights; object-pair total interest; parameter summary (threshold, smoothing radius, match/merge, stats, MMI)]

copyright 2009, UCAR, all rights reserved

[Figure: Forecast Objects and Observation Objects Overlaid]

[Figure: Merging Objects with Double Threshold Method]

[Figure: Attributes for Pairs of Cluster Objects]

MODE Tool - Output

  PostScript
    object pictures
    parameter summary
    total interest for each object pair

  ASCII
    object sizes, shapes, positions
    stats for simple, paired objects and clusters
    standard contingency table stats on smoothed and thresholded fields (objects)

  netCDF
    gridded object fields
    view with ncview

MODE-Analysis Tool

  Aggregate MODE output across multiple cases.

  Functionality:

  Use the Summary option to summarize column(s) of data:

  mode_analysis -summary -lookin mode_output/wrf4ncep/40km/ge03 \
    -fcst -cluster -area_min 100 \
    -column area -column axis_ang -column length

  Total mode lines read = 393
  Total mode lines kept = 17

  Field      N     Min      Max     Mean   StdDev     P10      P25      P50      P75      P90       Sum
  area      17  180.00  8393.00  2955.06  2246.49  624.80  1206.00  2662.00  3958.00  5732.20  50236.00
  axis_ang  17  -88.63    85.66    12.62    64.35  -70.77   -63.86    35.04    74.37    79.24    214.60
  length    17   25.25   234.76   124.41    60.99   48.85    65.37   116.67   169.37   204.57   2114.90

  Use the ByCase option to see matched/unmatched counts/areas for each case:

  mode_analysis -bycase -lookin mode_output/wrf4ncep/40km/ge03 \
    -single -cluster -area_min 100

  Total mode lines read = 393
  Total mode lines kept = 17

  Fcst Valid Time        Matched  Unmatched  # Fcst Matched  # Fcst Unmatched  # Obs Matched  # Obs Unmatched
  Apr 26, 2005 00:00:00     2685          0               1                 0              1                0
  May 13, 2005 00:00:00     3958          0               1                 0              1                0
  May 14, 2005 00:00:00    11695          0               2                 0              2                0
  May 18, 2005 00:00:00     4295          0               1                 0              1                0
  May 19, 2005 00:00:00     1206          0               1                 0              1                0
  May 25, 2005 00:00:00     4457          0               2                 0              2                0

Wavelet-Stat Tool


Wavelet-Stat Tool: Overview

  Implements the Intensity-Scale verification technique of Casati et al. (2004)
  Evaluates skill as a function of intensity and spatial scale of the error.

  Method:
    Threshold raw forecast and observation to create binary images.
    Decompose binary thresholded fields using wavelets (Haar as default).
    For each scale, compute the Mean Squared Error (MSE) and Intensity Skill Score (ISS).
    At what spatial scale is this forecast skillful?

[Figure: Difference (F-O) for precip > 0 mm; wavelet decomposition of the difference]

  Handling missing data:
    Set to zero for precipitation.
    Set to mean of field for continuous variables.
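The threshold-then-decompose step can be sketched in one dimension. The code below is a simplified illustration under stated assumptions, not the Wavelet-Stat implementation: it works on a 1D series instead of a 2D tile, it uses block averages (equivalent to Haar smoothing) instead of a wavelet library, and it stops at the MSE per scale without computing the skill score. The function names and sample values are hypothetical.

```python
def block_means(values, width):
    """Replace each run of `width` values by its mean (Haar-style smoothing)."""
    out = []
    for start in range(0, len(values), width):
        block = values[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def mse_by_scale(fcst, obs, threshold):
    """Threshold both series, then split the binary error by Haar scale.

    Returns (total_mse, per_scale_mse). The last entry of per_scale_mse is
    the domain-mean component. The series length must be a power of two.
    """
    n = len(fcst)
    assert n == len(obs) and (n & (n - 1)) == 0, "length must be a power of two"
    # Binary error: forecast event flag minus observed event flag.
    diff = [float((f > threshold) - (o > threshold)) for f, o in zip(fcst, obs)]
    total_mse = sum(d * d for d in diff) / n

    per_scale = []
    previous = diff
    width = 2
    while width <= n:
        coarser = block_means(diff, width)
        # Detail lost when going from the finer to the coarser average.
        detail = [p - c for p, c in zip(previous, coarser)]
        per_scale.append(sum(x * x for x in detail) / n)
        previous = coarser
        width *= 2
    per_scale.append(sum(p * p for p in previous) / n)
    return total_mse, per_scale


# Made-up precipitation series (mm); events are values > 0 mm.
fcst = [0.0, 1.2, 3.4, 0.0, 0.0, 0.0, 2.1, 0.5]
obs = [0.0, 0.0, 2.2, 1.9, 0.0, 0.0, 0.0, 0.0]
total_mse, per_scale_mse = mse_by_scale(fcst, obs, threshold=0.0)
```

Because the Haar components are orthogonal, the per-scale values add up to the total MSE, so each one can be read as the share of the error at that scale.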

Wavelet-Stat Tool: Configure

  2^n x 2^n grid

  Tiling options:
    Automatic tile selection
    User-defined tile(s)
    Pad to nearest 2^n x 2^n

Wavelet-Stat Tool: Wavelets

  Haar, centered
    Used in Casati et al. (2004)
    Default configuration
    Discontinuous data
    1 member

  Daubechies, centered
    9 members

  B-spline, centered
    11 members

[Figure: Haar wavelet; Daubechies (10) decomposition]

Wavelet-Stat Tool: Output

1.  ASCII STAT file
    •  ISC (Intensity Skill-Score) line for each tile/threshold/scale
    •  Header columns
    •  Mean-Squared Error (MSE) and Intensity Skill Score (ISC)
    •  Fcst & Obs Energy Squared (FENERGY2, OENERGY2)
    •  Base Rate (BASER) and Frequency Bias (FBIAS)

2.  NetCDF file
    •  For each tile/threshold/scale
    •  Forecast, Observation, and Difference fields

3.  PostScript summary plot
    •  Difference field image for each tile/threshold/scale

PB2NC Tool


  Stands for "PREPBUFR to NetCDF"

  Functionality:
    Filters and reformats PREPBUFR point observations into intermediate NetCDF format.
    Configuration file specifies:
      Observation types, locations, elevations, quality marks, times, and variables to retain or derive for use in Point-Stat.

  Data formats:
    Reads PREPBUFR using NCEP's BUFRLIB.
    Writes point NetCDF as input to Point-Stat.
    Note: v2.0 no longer requires CWORDSH to pre-process PREPBUFR files.

PREPBUFR

  BUFR is the World Meteorological Organization (WMO) standard binary code for the representation and exchange of observational data.
    http://www.nco.ncep.noaa.gov/sib/decoders/BUFRLIB/
    http://www.ecmwf.int/products/data/software/

  The PREPBUFR format is produced by NCEP for analyses and data assimilation. The system that produces this format:
    Assembles observations dumped from a number of sources
    Encodes
      information about the observational error for each data type
      background (first guess) interpolated to each data location
    Performs both rudimentary multi-platform quality control and more complex platform-specific quality control.

PB2NC Tool: Run

  pb2nc ndas.t00z.prepbufr.tm12.20070401.nr \
    out/sample_pb.nc PB2NCConfig_tutorial -v 2

  Reading Config File: PB2NCConfig_default
  Creating NetCDF File: out/sample_pb.nc
  Processing PrepBufr File: ndas.t00z.prepbufr.tm09.20070401.nr
  Blocking PrepBufr file to: /tmp/pb2nc_1705_0_blk.pb
  PrepBufr Time Center: 20070331_150000
  Searching Time Window: 20070331_133000 to 20070331_163000
  Processing 70884 PrepBufr messages...
  5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70% 75% 80% 85% 90% 95% 100%
  Total PrepBufr Messages processed = 70884
  Rejected based on message type = 0
  Rejected based on station id = 0
  Rejected based on valid time = 0
  Rejected based on masking grid = 0
  Rejected based on masking polygon = 0
  Rejected based on elevation = 0
  Rejected based on pb report type = 0
  Rejected based on input report type = 0
  Rejected based on instrument type = 0
  Rejected based on zero observations = 24153
  Total PrepBufr Messages retained = 46731
  Total observations retained or derived = 142709

PREPBUFR NDAS files are for the North-American Data Assimilation System.

PREPBUFR GDAS files are for the Global Data Assimilation System.

Output NetCDF file is named out/sample_pb.nc

ASCII2NC Tool


  Stands for "ASCII to NetCDF"

  Functionality:
    Reformat ASCII point observations into intermediate NetCDF format.
    One input ASCII format supported (10 columns).
    No configuration file.

  Data formats:
    Reads MET specific ASCII format with point obs.
    Writes point NetCDF as input to Point-Stat.
    Future: support additional standard ASCII formats.

ASCII2NC Tool: Format

  ADPUPA 72365 20070331_120000 35.03 -106.62 1618.0 11 837.0 1618 273.05   *TMP
  ADPUPA 72365 20070331_120000 35.03 -106.62 1618.0 17 837.0 1618 271.85   *DPT
  ADPUPA 72365 20070331_120000 35.03 -106.62 1618.0 52 837.0 1618 92       *RH
  ADPUPA 72365 20070331_120000 35.03 -106.62 1618.0 11 826.0 1724 274.55   *TMP
  …..

  Msg    STID  ValidTime       Lat   Lon     Elev   GC Lvl   Hgt  Ob

  Ob assigns value to variable

  Msg        Message type
  STID       WMO Station ID
  ValidTime  Valid time for observation
  Lat        Latitude [North]
  Lon        Longitude [East]
  Elev       Elevation [m] (Note: currently not used by MET code so can be filled with -9999.)
  GC         GRIB code for variable (e.g., AccPrecip = 61; MSLP = 2; Temp = 11, etc.)
             http://www.cpc.ncep.noaa.gov/products/wesley/opn_gribtable.html
  Lvl        Pressure [mb] or Accumulation Interval [hr]
  Hgt        Height above Mean Sea Level [m MSL] (Note: currently not used by MET code so can be filled with -9999.)
  Ob         Observed value

  Use "-9999" for missing data
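For readers preparing their own observation files, a short Python sketch of reading the 10-column format described on this slide may help. It is an illustration only, not part of ASCII2NC: `PointOb`, `parse_point_ob`, and `is_missing` are hypothetical names, and the choice of numeric types per column is an assumption based on the column key.

```python
from collections import namedtuple

# Field names follow the column key: Msg STID ValidTime Lat Lon Elev GC Lvl Hgt Ob
PointOb = namedtuple("PointOb", "msg stid valid_time lat lon elev gc lvl hgt ob")


def parse_point_ob(line):
    """Split one 10-column observation line into typed fields."""
    cols = line.split()
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    msg, stid, valid_time = cols[:3]
    lat, lon, elev = (float(c) for c in cols[3:6])
    gc = int(cols[6])                      # GRIB code, e.g. 11 for temperature
    lvl, hgt, ob = (float(c) for c in cols[7:10])
    return PointOb(msg, stid, valid_time, lat, lon, elev, gc, lvl, hgt, ob)


def is_missing(value, fill=-9999.0):
    """True when a numeric column holds the missing-data flag."""
    return value == fill


# First sample line from the slide.
ob = parse_point_ob(
    "ADPUPA 72365 20070331_120000 35.03 -106.62 1618.0 11 837.0 1618 273.05"
)
```

A check like `is_missing(ob.elev)` would then flag the columns that were filled with -9999.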

ASCII2NC Tool: Run

  ascii2nc sample_obs.txt \
    sample_ascii.nc -v 2

Output NetCDF file

Result of ncdump -h:

  netcdf sample_ascii {
  dimensions:
      mxstr = 15 ;
      hdr_arr_len = 3 ;
      obs_arr_len = 5 ;
      nhdr = 5 ;
      nobs = UNLIMITED ; // (2140 currently)
  variables:
      char hdr_typ(nhdr, mxstr) ;
          hdr_typ:long_name = "message type" ;
      char hdr_sid(nhdr, mxstr) ;
          hdr_sid:long_name = "station identification" ;
      char hdr_vld(nhdr, mxstr) ;
          hdr_vld:long_name = "valid time" ;
          hdr_vld:units = "YYYYMMDD_HHMMSS UTC" ;
      float hdr_arr(nhdr, hdr_arr_len) ;
          hdr_arr:long_name = "array of observation station header values" ;
          hdr_arr:_fill_value = -9999.f ;
          hdr_arr:columns = "lat lon elv" ;
          … ;
      float obs_arr(nobs, obs_arr_len) ;
          obs_arr:long_name = "array of observation values" ;
          obs_arr:_fill_value = -9999.f ;
          obs_arr:columns = "hdr_id gc lvl hgt ob" ;
          obs_arr:hdr_id_long_name = "index of matching header data" ;
          … ;

Result of ncdump -v obs_arr:

  obs_arr =
    0, 7, 837, 1618, 1618,
    1, 11, 837, 1618, 273.05,
    2, 17, 837, 1618, 271.85,
    3, 52, 837, 1618, 92,
    4, 53, 837, 1618, 0.00417,
    5, 7, 826, 1724, 1724,
    6, 11, 826, 1724, 274.55,
    7, 17, 826, 1724, 272.15,
    8, 52, 826, 1724, 84,
    9, 53, 826, 1724, 0.00432,
    10, 7, 815.3, 1829, 1829,
    11, 11, 815.3, 1829, 276.45,
    12, 17, 815.3, 1829, 265.75,
    13, 52, 815.3, 1829, 45,
    14, 53, 815.3, 1829, 0.0027,
    15, 7, 815, 1832, 1832,
    16, 11, 815, 1832, 276.55,
    17, 17, 815, 1832, 265.55,
    18, 52, 815, 1832, 44,
    19, 53, 815, 1832, 0.00266,
    20, 7, 784.7, 2134, 2134,
    21, 11, 784.7, 2134, 274.05,
    22, 17, 784.7, 2134, 264.15,
    …

MET Statistics modules (Point and Grid Stat): Traditional verification measures

  Gridded and point verification
    Multiple interpolation and matching options

  Statistics
    Continuous - RMSE, BCRMSE, Bias, Correlation, etc.
    Categorical - POD, FAR, CSI, GSS, Odds Ratio, etc.
    Probabilistic - Brier Score, Reliability, ROC, etc. in v2.0

Continuous Example

                      Observed Event         Observed Non-event

  Forecast Event      532 (Hits)             219 (False Alarms)

  Forecast Non-event  393 (Misses)           1,627 (Correct No's)
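To connect the counts in this contingency table with the categorical statistics named earlier (POD, FAR, CSI, GSS, Odds Ratio), here is a small Python sketch using the standard textbook definitions. It is not MET code, and `categorical_stats` is a hypothetical helper name.

```python
def categorical_stats(hits, false_alarms, misses, correct_negatives):
    """Standard 2x2 contingency-table scores."""
    total = hits + false_alarms + misses + correct_negatives
    # Hits expected by chance, used by the Gilbert Skill Score.
    hits_random = (hits + false_alarms) * (hits + misses) / total
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + false_alarms + misses),
        "GSS": (hits - hits_random)
               / (hits + false_alarms + misses - hits_random),
        "ODDS_RATIO": (hits * correct_negatives) / (false_alarms * misses),
    }


# Counts from the table: hits, false alarms, misses, correct no's.
stats = categorical_stats(532, 219, 393, 1627)
for name, value in stats.items():
    print(f"{name:10s} {value:.3f}")
```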

Verifying Probabilities

  Probabilistic verification methods added for:
    Grid-Stat, Point-Stat, and Stat-Analysis

  Define Nx2 contingency table using:
    Multiple forecast probability thresholds
    One observation threshold

  Example:
    FCST: Probability of precip [0.00, 0.25, 0.50, 0.75, 1.00]
    OBS: Accumulated precip > 0.00

Verifying Probabilities: Example

  Verify probability of precip with total precip:

fcst_thresh[] = [ "ge0.00 ge0.25 ge0.50 ge0.75 ge1.00" ];

Verifying Probabilities: Output

  Statistical Output:

  Nx2 Table Counts

  Joint/Conditional factorization table with calibration, refinement, likelihood, and base rate by threshold

  Receiver Operating Characteristic (ROC) plot points by threshold

  Reliability, resolution, uncertainty, area under ROC Curve, and Brier Score
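As a rough illustration of the Nx2 table and the Brier Score listed above, the Python sketch below bins forecast probabilities by the thresholds and counts observed events and non-events in each bin. The binning convention (a probability of exactly 1.0 goes in the last bin), the function names, and the sample data are assumptions made for this sketch and may differ from what MET does.

```python
def nx2_table(probs, obs_events, thresholds):
    """Count observed events / non-events in each forecast-probability bin.

    `thresholds` are bin edges: [0.0, 0.25, 0.5, 0.75, 1.0] gives 4 bins.
    """
    nbins = len(thresholds) - 1
    table = [[0, 0] for _ in range(nbins)]   # [event count, non-event count]
    for p, event in zip(probs, obs_events):
        row = nbins - 1                      # default: last bin (covers p == 1.0)
        for i in range(nbins):
            if thresholds[i] <= p < thresholds[i + 1]:
                row = i
                break
        table[row][0 if event else 1] += 1
    return table


def brier_score(probs, obs_events):
    """Mean squared difference between probability and the 0/1 outcome."""
    return sum((p - (1.0 if e else 0.0)) ** 2
               for p, e in zip(probs, obs_events)) / len(probs)


# Made-up forecasts of probability of precip and observed accumulations (mm).
probs = [0.1, 0.3, 0.6, 0.9, 1.0, 0.0]
precip = [0.0, 0.0, 2.5, 1.0, 4.0, 0.2]
obs_events = [amount > 0.0 for amount in precip]   # OBS: accumulated precip > 0.00

table = nx2_table(probs, obs_events, [0.0, 0.25, 0.5, 0.75, 1.0])
bs = brier_score(probs, obs_events)
```

Each row of `table` corresponds to one probability category, so the row sums give the refinement distribution and the event fraction per row gives the calibration.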

Verifying Probabilities: Output

[Table: joint/conditional factorization by probability category; cell values not recoverable]

  Column groups: Joint Distribution, Calibration, Refinement, Likelihood, Base Rate
  Columns: p(yi, o1), p(yi, o2), p(yi), p(o1|yi), p(yi|o1), p(yi|o2)
  Rows (probability category): 0, 0.25, 0.5, 0.75, 1

Accounting for Uncertainty

  Observational
  Model
    Model parameters
    Physics
    Verification scores
  Sampling
    Verification statistic is a realization of a random process.
    What if the experiment were re-run under identical conditions?

MET Statistics modules (Point and Grid Stat): Confidence Intervals (CIs)

  MET provides two CI approaches
    Normal
    Bootstrap

  CIs are critical for appropriate and meaningful interpretation of verification results
    Ex: Regional comparisons

[Figure: PODY versus 1-PODN, both axes from 0.0 to 1.0, by Region (CONUS, LMV)]

Confidence Intervals (CI’s)

  Parametric
    Assume the observed sample is a realization from a known population distribution with possibly unknown parameters (e.g., normal).
    Normal approximation CI's are most common.
    Quick and easy.
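A minimal Python sketch of a normal-approximation CI, here for the mean of a sample of forecast errors. It assumes independent, roughly normal errors; `normal_ci_mean` is a hypothetical name, the error values are made up, and this is not the formula set MET applies to each statistic.

```python
from statistics import NormalDist, mean, stdev


def normal_ci_mean(sample, alpha=0.05):
    """Normal-approximation CI for the mean of an independent sample."""
    n = len(sample)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # about 1.96 for alpha = 0.05
    half_width = z * stdev(sample) / n ** 0.5
    center = mean(sample)
    return center - half_width, center + half_width


errors = [0.4, -1.2, 0.8, 0.1, -0.3, 1.5, -0.6, 0.2]   # made-up forecast errors
lower, upper = normal_ci_mean(errors)
```

If the interval contains zero, the sample gives no clear evidence of a mean bias at the chosen confidence level.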

Confidence Intervals (CI’s)

  Nonparametric
    Assume the distribution of the observed sample is representative of the population distribution.
    Bootstrap CI's are most common.
    Can be computationally intensive, but easy enough.

Bootstrap Confidence Intervals (CI’s)

  Resample from data with replacement.
  Calculate statistic θ
  Repeat to get empirical distribution (histogram) of θ.
  Count in on both ends to get CI (percentile method)
  Use BCa (bias-corrected and accelerated) to adjust for bias and skewness in resampling.
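The percentile method above can be sketched in Python as follows. This is a bare-bones illustration, not MET's bootstrap code: it implements only the percentile interval (no BCa adjustment), and the function names, resample count, and error values are assumptions.

```python
import random


def sample_mean(values):
    return sum(values) / len(values)


def bootstrap_percentile_ci(sample, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `statistic` evaluated on `sample`."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement, recompute the statistic, and sort the results.
    replicates = sorted(
        statistic([rng.choice(sample) for _ in range(n)])
        for _ in range(n_resamples)
    )
    # "Count in" alpha/2 of the replicates from each end.
    k = int(n_resamples * alpha / 2.0)
    return replicates[k], replicates[n_resamples - 1 - k]


errors = [0.4, -1.2, 0.8, 0.1, -0.3, 1.5, -0.6, 0.2]   # made-up forecast errors
lower, upper = bootstrap_percentile_ci(errors, sample_mean)
```

Any statistic that takes a list of values can be passed in place of `sample_mean`, which is the main attraction of the nonparametric approach.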

Neighborhood Methods

  In MET, neighborhood methods are in the grid_stat tool.

  Smoothing filters in MET:
    Minimum
    Maximum
    Median
    Mean

  Fractional coverage
    Fractions Brier Score
    Fractions Skill Score

  See Ebert (2008) for a good summary and comparison of these techniques (and references).

Neighborhood verification methods

[Figure: observed and forecast fields. From Mittermaier 2008]

Ebert (2008; Met Applications) describes the neighborhood methods in MET

Example: Fractions Skill Score (Roberts and Lean, MWR, 2008)
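A compact Python sketch of the Fractions Skill Score, following the usual definition: FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2)), where Pf and Po are neighborhood event fractions. This is an illustration, not the grid_stat code: the helper names are hypothetical, windows are clipped at the domain edge (an assumption), and the 5x5 fields are made up.

```python
def neighborhood_fractions(field, threshold, half_width):
    """Fraction of points >= threshold in the square window around each point."""
    ny, nx = len(field), len(field[0])
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            count = hits = 0
            for jj in range(max(0, j - half_width), min(ny, j + half_width + 1)):
                for ii in range(max(0, i - half_width), min(nx, i + half_width + 1)):
                    count += 1
                    hits += field[jj][ii] >= threshold
            fractions[j][i] = hits / count
    return fractions


def fractions_skill_score(fcst, obs, threshold, half_width):
    """FSS for one threshold and one neighborhood size."""
    pf = neighborhood_fractions(fcst, threshold, half_width)
    po = neighborhood_fractions(obs, threshold, half_width)
    num = sum((a - b) ** 2 for ra, rb in zip(pf, po) for a, b in zip(ra, rb))
    den = sum(a * a + b * b for ra, rb in zip(pf, po) for a, b in zip(ra, rb))
    return 1.0 - num / den if den > 0 else float("nan")


# Made-up 5x5 fields: one rain cell, displaced by one grid point in the forecast.
obs = [[0.0] * 5 for _ in range(5)]
fcst = [[0.0] * 5 for _ in range(5)]
obs[2][2] = 5.0
fcst[2][3] = 5.0

fss_point = fractions_skill_score(fcst, obs, threshold=1.0, half_width=0)
fss_3x3 = fractions_skill_score(fcst, obs, threshold=1.0, half_width=1)
```

With no neighborhood the displaced cell scores as a complete miss, while the 3x3 neighborhood gives partial credit for being close.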

Neighborhood Methods Smoothing

  Minimum, Maximum, Median, Mean

[Figure: panels showing the original, min, max, median, and mean fields]

Stat Analysis Tools

Used to:
  Filter
  Summarize
  Aggregate results over many times, leads, thresholds, domains, etc.

Stat Analysis Tool

ASCII text output

  V2.0 WRF … ADPUPA G212 … TMP P850-750 … >278.00 CTC 401 192 11 24 174 UW_MEAN 1

  V2.0 WRF … ADPSFC G212 … TMP P850-750 … >278.00 CTC 167 25 23 0 119 UW_MEAN 1

  ADPUPA:
              OBS Y   OBS N   Total
  FCST Y        192      11     203
  FCST N         24     174     198
  Total         216     185     401

  ADPSFC:
              OBS Y   OBS N   Total
  FCST Y         25      23      48
  FCST N          0     119     119
  Total          25     142     167

Stat Analysis Tool

Stat Analysis Output

              OBS Y   OBS N   Total
  FCST Y        217      34     251
  FCST N         24     293     317
  Total         241     327     568

  JOB_LIST: -job aggregate -vx_mask G212 -line_type CTC -fcst_thresh >278.000 -var TMP -level P850-750 -dump_row out/aggr_ctc_job.stat

  COL_NAME: TOTAL FY_OY FY_ON FN_OY FN_ON INTERP_MTHD INTERP_PNTS

  CTC: 568 217 34 24 293
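The aggregated CTC line is the column-wise sum of the two input CTC lines. The short Python sketch below reproduces that arithmetic with the counts shown on these slides. It is only an illustration of the idea, not the Stat-Analysis code, and `aggregate_ctc` is a hypothetical name.

```python
CTC_COLUMNS = ("TOTAL", "FY_OY", "FY_ON", "FN_OY", "FN_ON")


def aggregate_ctc(rows):
    """Sum contingency-table counts column by column."""
    return dict(zip(CTC_COLUMNS, (sum(col) for col in zip(*rows))))


adpupa = (401, 192, 11, 24, 174)   # CTC counts from the ADPUPA line
adpsfc = (167, 25, 23, 0, 119)     # CTC counts from the ADPSFC line
combined = aggregate_ctc([adpupa, adpsfc])
print(combined)
```

Statistics such as POD or CSI are then computed from the summed counts, not by averaging the per-case scores.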

Imminent MET Tools

  MODE time domain
  Ensemble forecast verification
  Satellite data ingest
  Cloud verification

Further Details

  For more detail on METv2.0:
    MET User's Guide
      www.dtcenter.org/met/users/docs/overview.php
    README within the MET release

Thank You

For more information:

http://www.dtcenter.org/met/users/
