National Weather Service River Forecast Verification

Page 1: National Weather Service  River Forecast Verification

National Weather Service River Forecast Verification

Peter Gabrielsen
July 2006 HIC Meeting

July 10, 2006

Page 2: National Weather Service  River Forecast Verification

Background

As a result of the 2005 NOAA Audit Plan: "The Assistant Administrator for Weather Services should develop, document and implement a timeline and action plan for completing a comprehensive river forecast verification program as soon as practicable"

In 1996 the NRC stated that the verification of hydrologic forecasts is inadequate

Page 3: National Weather Service  River Forecast Verification

Background (cont.)

Research has shown:
    Little is known about the skill of hydrologic forecasts
    Forecasts depend upon imperfect mathematical descriptions governing runoff and routing
    Hydrologic forecasts depend on meteorological forecasts and therefore include the uncertainty of those meteorological forecasts
    Verification leads to improved forecast skill

Page 4: National Weather Service  River Forecast Verification

Background (cont.)

Team was chartered November 2005
Representatives from five NWS regions and OHD
Expert input from trusted scientists is being used:
    OHD
    OCWWS
    Universities
    RFCs

Page 5: National Weather Service  River Forecast Verification

Team Charter

Vision: Provide easy access to enhanced river forecast verification data which will be used to improve our scientific and operational techniques and services.

Mission: Assess forecaster, program management and user needs for verification data. Inventory current national and regional verification practices and identify unmet needs. Establish requirements for a comprehensive national system to verify hydrologic forecasts and guidance products which satisfy these needs. This system should identify sources of error and skill in the forecasts across the entire forecast process.

Page 6: National Weather Service  River Forecast Verification

Charter (cont.)

Success Criteria/Deliverables: Deliver an NWS river forecast verification plan which measures skill and error in the forecast process. The plan includes a conceptualized solution and a definition of operational requirements.

Page 7: National Weather Service  River Forecast Verification

Charter (cont.)

Team Membership:
    Julie Demargne (OHD)
    Peter Gabrielsen (ER)
    Bill Lawrence (SR)
    Scott Lindsey (AR)
    Mary Mullusky (OCWWS)
    Noreen Schwein (CR)
    Scott Staggs (WR)
    Kevin Werner (WR)
    Tom Adams (ER)
    William Marosi (NWSEO)

Page 8: National Weather Service  River Forecast Verification

Verification System

Prior to proposing verification standards, the hydrologic forecast process must be described

Page 9: National Weather Service  River Forecast Verification

RFC Hydrologic Forecasting Process (flow diagram; box labels listed below)

Data Processing and Quality Control
    Observed: precipitation, temperature, stage, flow, snow depth, dewpoint, wind speed, sky cover, freezing level, snow water equivalent, potential evaporation
    Forecast: precipitation, temperature, stage, flow, freezing level

Model Calibration
    Historical data analysis
    Parameter Estimation
    Parameter Calibration
    Operational Implementation

Model Parameters

Model States

Data Assimilation

Hydrologic and Hydraulic Models
    Rainfall/Runoff
    Snow accumulation and ablation
    Unit Graph
    Consumptive Use
    Routing
    Dynamic Routing
    Rating Curves
    Reservoir
    Statistical Water Supply (SWS)

Postprocessor

Forecast types
    Short-term Deterministic Forecast
    Short-term Probabilistic Forecast (Ensemble)
    Long-term Statistical Forecast (Water Supply)
    Long-term Probabilistic Forecast (Ensemble)

Forecaster Analysis Review
    Quality Control
    Run-time mods
    Reality Check

Final Product Issuance: comparison to action-required stage; appropriate action pursued

Page 10: National Weather Service  River Forecast Verification

Raw Model Hydrologic Forecast

Contribution of RFC staff correcting bad data

Contribution of hydrologic forecaster through runtime-mods

Contribution of HAS function

Contribution of forecast processing enhancements

Input errors and model errors (parameters, model states, model structure)

Data QC to correct input errors

Perfect Hydrologic Forecast

Runtime-mods to correct input and model errors (parameters, model states)

Adjustments of observed and forecast data (QPF/MPE, MAT, etc.)

Enhanced calibration to correct model parameter errors

Enhanced data assimilation process to correct initial model states errors

Enhanced/new input data to correct input and model errors

Experimental / Operational Hydrologic Forecast

Corrections of all input, model, and forecaster analysis errors

Enhanced/new hydrologic/hydraulic model to correct model deficiencies

Enhanced post-processor to correct output forecast errors

Operational Hydrologic Forecast

Page 11: National Weather Service  River Forecast Verification

Role and Setup of the Verification System

Purpose
    Monitor forecast quality over time
    Monitor quality at various steps in the forecast process
    Improve forecast quality
    Assist prioritization of forecast system enhancements

Page 12: National Weather Service  River Forecast Verification

Uses of Verification Results

Verification System
    Describe forecast performance
    Past and recent
    Operational
    Control (or baseline)
    Experimental
    Specific time periods

Page 13: National Weather Service  River Forecast Verification

Verification (applied at each step of the forecast process)

Model Setup: Calibration, Operation Installation
State Updating: Data Quality Control, Runtime Simulations, Data Assimilation
Forecast Computation: Hydrologic and Hydraulic Models, Postprocessor
Product Review and Issuance: Forecaster Analysis and Quality Control

An effective verification process must quantify the characteristics of the forecast system and offer a means to analyze why forecasts behave the way they do at various steps in the forecast process

Page 14: National Weather Service  River Forecast Verification

Uses of Verification Results

Customers
    Hydrologic program managers
    Emergency managers
    Scientists/Researchers
    Hydrologic forecasters
    Everyday customers

Use Modes
    Operational
    Experimental/Research

Page 15: National Weather Service  River Forecast Verification

Verification System Components

Administrative: describe the efficiency
    Logistical aspects: type, quantity, duration and frequency
    Forecast skill

Scientific: describe the reliability
    Forecast skill
    Forecast system error analysis

Page 16: National Weather Service  River Forecast Verification

National Baseline Verification System

Logistical
    characterizing point forecasts by service type, frequency and location;
    characterizing areal forecasts by service type, frequency and location;
    identifying daily the number of issued forecasts by type and location;
    quantifying the person effort required to set up a basin for forecasting, including data gathering, calibration, model setup and implementation efforts;
    quantifying the person effort required to issue each type of forecast, including manual quality control of input data, forecaster run-time modifications and forecaster review and analysis;
    quantifying the timeliness of issued forecasts

Page 17: National Weather Service  River Forecast Verification

Categories of Verification Metrics

Categorical: statistics related to a predefined threshold or range of values (e.g., above flood stage, minor).

Error: statistics that measure various differences between forecast and observed values (including timing errors).

Correlation: statistics that measure the correspondence between ordered pairs (e.g., crest forecasts vs. QPF, forecast and observed stages).

Distribution Properties: statistics that summarize the characteristics of a set of values.

Page 18: National Weather Service  River Forecast Verification

Categories of Verification Metrics (cont.)

Skill Scores: statistics that measure accuracy relative to a standard reference or control set of forecasts.

Conditional Statistics: metrics computed based on the occurrence of a particular event or events, such as a specific range of observations or forecasts.

Statistical Significance: measures the uncertainty of the computed values of verification metrics.

Page 19: National Weather Service  River Forecast Verification

Verification Systems

National Baseline Verification System
    Administrative in nature
    Logistical measures
    Skill measures

Comprehensive Verification System
    Administrative
    Scientific

Page 20: National Weather Service  River Forecast Verification

Verification System Requirements

Selection of forecasts to be verified
    time attributes (days, months, seasons, years, as well as lead time)
    service attributes (national, regional, RFCs, groups, locations)
    individual forecaster within guidelines agreed to by the NWS and the NWSEO
    basin attributes (response time, size, slope, aspect, elevation, snow, non-snow)
    forecast or observed events (crest timing, rising and falling hydrographs)

Page 21: National Weather Service  River Forecast Verification

Verification System Requirements

Archiving
    Time attributes (days, months, years, seasons)
    Service attributes (national, regional, RFCs, forecaster, groups, locations)
    Basin attributes (response time, size, slope, aspect, elevation)

Hindcasting
    Different QPFs (e.g., perfect QPF, zero, actual, persistence)
    Different FMATs (e.g., perfect FMAT, actual, persistence)
    Different freezing levels
    Different MAPEs
    Different reservoir forecasts
    Different QPEs (e.g., point-based MAP, MAPX, Q2)
    Different sets of model parameters
    Different models, including the post-processing and state updating models

Page 22: National Weather Service  River Forecast Verification

Additional recommendations

OHD should assign a program manager for verification.

Establish formal verification focal points at each RFC.

Create national river forecast performance goals. This should be accomplished once the software has been fielded and some experience gained with the metrics.

Ensure adequate hydrologic verification training, and use of the system, is captured in OSIP documentation.

Publish findings in peer reviewed journals (e.g., BAMS, EOS) to inform the research community of our plans.

Ensure an end-to-end assessment and verification of the elements in the hydrologic forecasting process that are outside of the control of the RFC forecaster or produced by other agencies.

Page 23: National Weather Service  River Forecast Verification

Additional recommendations (cont.)

OHD needs to establish a team to define the raw model to enable the users to assess the impact of various steps (e.g., calibration, quality control, run-time modifications) on the forecast performance.

Archiving of the data needed to support the verification software should begin within 30 days of the data being defined.

Ensure continuity with other activities that support this verification plan.

Brief the National Performance Management Committee (NPMC) and ensure incorporation of the RFC hydrologic verification requirements.

Page 24: National Weather Service  River Forecast Verification

Background Information

Page 25: National Weather Service  River Forecast Verification

National Baseline Verification System Metrics

Categorical
    Deterministic: POD, FAR, LTD
    Probabilistic: Brier Score, Ranked Probability Score

Error (Accuracy)
    Deterministic: RMSE, MAE, ME, Bias

Correlation
    Deterministic: Pearson Correlation Coefficient

Page 26: National Weather Service  River Forecast Verification

National Baseline Verification System Metrics (cont.)

Skill Score
    Deterministic: RMSE Skill Score
    Probabilistic: Rank Probability Skill Score, Brier Skill Score

Confidence
    Deterministic: Sample size
    Probabilistic: Sample size

Probabilistic forecasts should also be verified as deterministic forecasts, using the mean or some predetermined exceedance level
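The reduction of a probabilistic forecast to a single deterministic value, as recommended above, might be sketched as follows. This is only an illustration under stated assumptions: the ensemble values are made up, the function names `ensemble_mean` and `exceedance_value` are hypothetical, and the exceedance value is taken from a simple empirical ranking of the members rather than from any method named in the slides.

```python
import math

def ensemble_mean(members):
    """Average of all ensemble members."""
    return sum(members) / len(members)

def exceedance_value(members, exceedance_prob):
    """Largest member value that at least `exceedance_prob` of the
    members equal or exceed (simple empirical ranking, an assumption)."""
    ordered = sorted(members, reverse=True)
    # Position of the member that closes out the requested fraction.
    index = math.ceil(exceedance_prob * len(ordered)) - 1
    index = min(len(ordered) - 1, max(0, index))
    return ordered[index]

# Hypothetical 10-member ensemble of forecast stages (feet).
ensemble = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 15.0, 18.0, 20.0, 24.0]

deterministic_mean = ensemble_mean(ensemble)         # ensemble mean
deterministic_p50 = exceedance_value(ensemble, 0.5)  # 50% exceedance level

print(deterministic_mean, deterministic_p50)
```

Either value can then be scored with the deterministic metrics (RMSE, MAE, ME, Bias) alongside the probabilistic scores.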

Page 27: National Weather Service  River Forecast Verification

Verification System Requirements

Analysis of skill and error sources
    Impact of input data errors
    Impact of model errors
    Impact of forecast analysis

Computation of verification metrics and results presentation

Dissemination and training

Page 28: National Weather Service  River Forecast Verification

Verification metric categories and metrics for deterministic and probabilistic forecasts

1. Categorical
    Deterministic forecast verification metrics: Probability Of Detection (POD), False Alarm Rate (FAR), Critical Success Index (CSI), Lead Time of Detection (LTD), Peirce Skill Score (PSS), Gerrity Score (GS)
    Probabilistic forecast verification metrics: Brier Score (BS), Rank Probability Score (RPS)

2. Error
    Deterministic forecast verification metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Error (ME), Bias (%), Linear Error in Probability Space (LEPS)
    Probabilistic forecast verification metrics: Continuous RPS

3. Correlation
    Deterministic forecast verification metrics: Pearson Correlation Coefficient, Ranked correlation coefficient, scatter plots

4. Distribution Properties
    Deterministic forecast verification metrics: Mean, variance, higher moments
    Probabilistic forecast verification metrics: Wilcoxon rank sum test, variance of forecasts, variance of observations, ensemble spread, Talagrand Diagram (or Rank Histogram)

Page 29: National Weather Service  River Forecast Verification

Verification metric categories and metrics for deterministic and probabilistic forecasts (cont.)

5. Skill Score
    Deterministic forecast verification metrics: Root Mean Squared Error Skill Score (SS-RMSE) (with reference to persistence, climatology, lagged persistence), Wilson Score (WS), Linear Error in Probability Space Skill Score (SS-LEPS)
    Probabilistic forecast verification metrics: Rank Probability Skill Score, Brier Skill Score (with reference to persistence, climatology, lagged persistence)

6. Conditional Statistics
    Deterministic forecast verification metrics: Relative Operating Characteristic (ROC) and ROC Area, reliability measures, discrimination diagram, other discrimination measures
    Probabilistic forecast verification metrics: ROC and ROC Area, other resolution measures, Reliability diagram, discrimination diagram, other discrimination measures

7. Confidence
    Deterministic forecast verification metrics: Sample size, Confidence Interval (CI)
    Probabilistic forecast verification metrics: Ensemble size, sample size, Confidence Interval (CI)

Page 30: National Weather Service  River Forecast Verification

Definition of Metrics for the National Baseline Verification System

Probability of Detection (POD): Percentage of (categorical) events forecast correctly.

False Alarm Ratio (FAR): Percentage of (categorical) forecast events that did not verify.

Lead Time of Detection (LTD): The average lead time of all forecasts that fall into the correct observed category.
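As a rough illustration of the POD and FAR definitions above, the sketch below counts hits, misses and false alarms for a single category (stage at or above flood stage). The flood stage of 10.0 ft, the sample stages, and the function name `pod_far` are all hypothetical; the scores are returned as fractions, so multiply by 100 for the percentages quoted in the definitions.

```python
def pod_far(forecast_events, observed_events):
    """Return (POD, FAR) as fractions from paired True/False event flags."""
    pairs = list(zip(forecast_events, observed_events))
    hits = sum(1 for f, o in pairs if f and o)              # forecast and observed
    misses = sum(1 for f, o in pairs if not f and o)        # observed, not forecast
    false_alarms = sum(1 for f, o in pairs if f and not o)  # forecast, not observed

    # POD: share of observed events that were forecast.
    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    # FAR: share of forecast events that did not verify.
    far = (false_alarms / (hits + false_alarms)
           if (hits + false_alarms) else float("nan"))
    return pod, far

# Hypothetical flood stage and paired forecast/observed stages (feet).
flood_stage = 10.0
forecast_stage = [9.0, 11.0, 12.5, 8.0, 10.5]
observed_stage = [9.5, 10.2, 9.8, 10.4, 11.0]

forecast_events = [s >= flood_stage for s in forecast_stage]
observed_events = [s >= flood_stage for s in observed_stage]

pod, far = pod_far(forecast_events, observed_events)
print(f"POD = {pod:.2f}, FAR = {far:.2f}")
```

In this made-up sample there are two hits, one miss and one false alarm, so two of the three observed events were detected and one of the three forecast events did not verify.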

Page 31: National Weather Service  River Forecast Verification

Definition of Metrics for the National Baseline Verification System

Root Mean Square Error (RMSE): The square root of the average of the squared differences between forecasts and observations.

Mean Absolute Error (MAE): The average of the absolute value of the differences between forecasts and observations.

Mean Error (ME): The average difference between forecasts and observations.

Bias (%): The ME expressed as a percentage of the mean observation.
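The four error metrics defined above can be computed together from paired forecasts and observations. The following is a minimal sketch with made-up values; the function name `error_metrics` and the dictionary keys are assumptions for illustration, and the error is taken as forecast minus observation.

```python
import math

def error_metrics(forecasts, observations):
    """Compute RMSE, MAE, ME and Bias (%) for paired values."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    me = sum(errors) / n                              # mean error
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean square error
    mean_obs = sum(observations) / n
    bias_pct = 100.0 * me / mean_obs                  # ME as % of mean observation
    return {"rmse": rmse, "mae": mae, "me": me, "bias_pct": bias_pct}

# Hypothetical paired forecast and observed stages (feet).
forecasts = [12.0, 15.0, 9.0, 20.0]
observations = [10.0, 16.0, 9.0, 17.0]

metrics = error_metrics(forecasts, observations)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

ME keeps the sign of the errors, so over- and under-forecasts can cancel; MAE and RMSE do not, and RMSE weights large errors more heavily.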

Page 32: National Weather Service  River Forecast Verification

Definition of Metrics for the National Baseline Verification System

Brier Skill Score (BSS): A skill score based on BS values. The recommended reference forecasts are persistence and climatology.

Ranked Probability Skill Score (RPSS): A skill score based on RPS values. The recommended reference forecasts are persistence and climatology.

Sample Size: A count of the forecasts involved in the calculation of a metric, appropriate to the type of forecast (e.g., categorical forecasts should count forecasts and observations by category).

Page 33: National Weather Service  River Forecast Verification

Definition of Metrics for the National Baseline Verification System

Brier Score (BS): The mean squared error of probabilistic two-category forecasts where the observations are either 0 (no occurrence) or 1 (occurrence) and forecast probability may be arbitrarily distributed between occurrence and non-occurrence.

Ranked Probability Score (RPS): The mean squared error of probabilistic multi-category forecasts where observations are 1 (occurrence) for the observed category and 0 for all other categories and forecast probability may be arbitrarily distributed between all categories.
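A small sketch of the two probabilistic scores defined above, using made-up probabilities. One assumption goes beyond the slide text: the RPS here follows the common textbook formulation, which compares cumulative forecast and observed probabilities across the ordered categories and is left unnormalized. The function names are hypothetical.

```python
from itertools import accumulate

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and the
    0/1 outcome over a set of two-category forecasts."""
    n = len(probabilities)
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / n

def ranked_probability_score(category_probs, observed_category):
    """RPS for one forecast over ordered categories, computed on
    cumulative probabilities (assumed textbook form, unnormalized)."""
    cum_forecast = list(accumulate(category_probs))
    # Observed cumulative distribution: 0 below the observed category, 1 from it on.
    cum_observed = [1.0 if k >= observed_category else 0.0
                    for k in range(len(category_probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))

# Hypothetical probabilities of exceeding flood stage, with 0/1 outcomes.
flood_probs = [0.9, 0.2, 0.6, 0.1]
flood_observed = [1, 0, 1, 0]
bs = brier_score(flood_probs, flood_observed)

# Hypothetical three-category forecast (e.g., below, near, above normal);
# the middle category (index 1) was observed.
rps = ranked_probability_score([0.2, 0.5, 0.3], observed_category=1)

print(f"BS = {bs:.3f}, RPS = {rps:.3f}")
```

Both scores are negatively oriented: 0 is a perfect forecast, and larger values are worse.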

Page 34: National Weather Service  River Forecast Verification

Definition of Metrics for the National Baseline Verification System

Correlation Coefficient: A measure of the linear association between forecasts and observations.

Skill Score: In general, skill scores are the percentage difference between verification scores for two sets of forecasts (e.g., operational forecasts and climatology).

Root Mean Squared Error Skill Score (SS-RMSE): A skill score based on RMSE values. The recommended reference forecasts are persistence and climatology.
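To make the skill score idea concrete, the sketch below computes an RMSE-based skill score against a persistence reference. It assumes the usual form SS = 1 - RMSE(forecast) / RMSE(reference), which is the fractional improvement over the reference (multiply by 100 for a percentage); the slides do not spell out the formula. All values and function names are hypothetical.

```python
import math

def rmse(forecasts, observations):
    """Root mean square error of paired forecasts and observations."""
    n = len(observations)
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n
    )

def rmse_skill_score(forecasts, reference, observations):
    """1 = perfect, 0 = no better than the reference, < 0 = worse."""
    return 1.0 - rmse(forecasts, observations) / rmse(reference, observations)

# Hypothetical observed stages and operational forecasts (feet).
observations = [10.0, 12.0, 15.0, 13.0]
forecasts = [11.0, 12.0, 14.0, 13.0]

# Persistence reference: the previous observation is used as the forecast.
# The first value (9.0) is an assumed observation before the period starts.
persistence = [9.0, 10.0, 12.0, 15.0]

ss_rmse = rmse_skill_score(forecasts, persistence, observations)
print(f"SS-RMSE vs persistence = {ss_rmse:.3f}")
```

Swapping `persistence` for a climatological series gives the skill relative to climatology, the other reference recommended above.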