Threshold Selection for Precipitation Extremes

Uli Schneider* and Philippe Naveau**

EGU, April 27, 2004

* Geophysical Statistics Project, NCAR
** Dept. of Applied Mathematics, University of Colorado, and Lab. des Sciences du Climat et de l’Environnement, CNRS

Outline

Extreme value theory (threshold models, advantages and limitations)

A new approach – folding

Conclusions

Extreme value theory

In classical statistics: model the AVERAGE behavior of a process.

In extreme value theory: model the EXTREME behavior (the tail of a distribution).

Extreme value analyses usually have to deal with very small data sets!

Threshold models

Model exceedances over a high threshold u: X − u | X > u.

[Figure: Daily precipitation for Boulder, Colorado [1/100 in], with the threshold u marked. Horizontal axis: years (1950 to 2000); vertical axis: daily precipitation (0 to 400).]

The generalized Pareto distribution (GPD)

The distribution of Y := X − u | X > u converges (as u → ∞) to

$$H(y) = 1 - \left(1 + \frac{\xi y}{\sigma}\right)^{-1/\xi}.$$

H(y) is called the “Generalized Pareto” distribution (GPD) and has 2 parameters:

shape parameter ξ

scale parameter σ
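The GPD formula above can be sketched in Python as follows. This is a minimal illustration, not code from the talk: the function name `gpd_cdf` and the simulated exceedances are assumptions made for the example. The ξ = 0 case is handled as the exponential limit, and SciPy's `genpareto` (whose shape argument `c` plays the role of ξ) is used for the fit.

```python
import numpy as np
from scipy.stats import genpareto


def gpd_cdf(y, xi, sigma):
    """H(y) = 1 - (1 + xi*y/sigma)^(-1/xi) for y >= 0 (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    if xi == 0.0:
        # Limit of the formula as xi -> 0: the exponential distribution.
        return 1.0 - np.exp(-y / sigma)
    # For xi < 0 the support has an upper endpoint; clipping at 0 gives H = 1 beyond it.
    base = np.maximum(1.0 + xi * y / sigma, 0.0)
    return 1.0 - base ** (-1.0 / xi)


# Illustration on simulated exceedances (assumed values xi = 0.1, sigma = 2.0).
rng = np.random.default_rng(0)
exceedances = genpareto.rvs(0.1, scale=2.0, size=500, random_state=rng)

# Maximum-likelihood fit with the location fixed at 0, since exceedances start at 0.
xi_hat, _, sigma_hat = genpareto.fit(exceedances, floc=0)
```

Fixing the location at 0 (`floc=0`) reflects that the threshold u has already been subtracted from the data.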

Extrapolation beyond the data

Return levels (quantiles) outside the data range are often the quantity of interest:

Given m, what is the return level z that is exceeded with probability 1/m?

$$P(X > z) = \frac{1}{m}$$

E.g. for precipitation: the “infamous” 100-year flood.

Easy to compute once the parameters of the model are estimated.
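As a sketch of that computation: if the tail is modeled as P(X > x) = ζ_u (1 + ξ(x − u)/σ)^(−1/ξ) for x > u, with ζ_u = P(X > u), then solving P(X > z) = 1/m gives z = u + (σ/ξ)((m ζ_u)^ξ − 1). The slides do not spell this out; the function name `return_level`, the symbol `zeta_u`, and the numbers in the usage lines are assumptions for illustration, and m is counted in observations here.

```python
import math


def return_level(m, u, xi, sigma, zeta_u):
    """Level z with P(X > z) = 1/m under a GPD tail above u (hypothetical helper).

    zeta_u is the probability of exceeding the threshold u.
    """
    if m * zeta_u <= 1.0:
        raise ValueError("1/m must be smaller than zeta_u so that z lies above u")
    if abs(xi) < 1e-12:
        # Exponential tail (xi = 0): limit of the general formula.
        return u + sigma * math.log(m * zeta_u)
    return u + sigma / xi * ((m * zeta_u) ** xi - 1.0)


# Made-up parameter values, only to show the call.
m_obs = 100 * 365  # "100-year" level if there is one observation per day
z_100 = return_level(m_obs, u=50.0, xi=0.2, sigma=30.0, zeta_u=0.05)
```

For daily data, an m-year return level corresponds to roughly 365·m observations, which is why the usage line multiplies by 365.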

Advantages and limitations

From a theoretical viewpoint

(+) “universal” approach

(-) asymptotic result: convergence in u and in the sample size might be very slow.

From a statistical viewpoint

Choosing the threshold is a trade-off: a high threshold yields a better GPD approximation, whereas a low threshold leaves more data points.

Goodness of fit: is it reasonable to remove observations in order to fit a pre-fixed model?

From a scientific viewpoint: the threshold determines the answer to the question “What is an extreme value?”
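The threshold trade-off can be made concrete with a small experiment: fit a GPD to the exceedances over several candidate thresholds and record how many points each fit rests on. This is a sketch on simulated standard normal data, not the Boulder record; the function name `gpd_fits_by_threshold` and the threshold grid are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import genpareto


def gpd_fits_by_threshold(x, thresholds):
    """Return (u, number of exceedances, xi_hat, sigma_hat) for each threshold u."""
    x = np.asarray(x, dtype=float)
    rows = []
    for u in thresholds:
        exceedances = x[x > u] - u
        # Location fixed at 0 because the threshold is already subtracted.
        xi_hat, _, sigma_hat = genpareto.fit(exceedances, floc=0)
        rows.append((u, exceedances.size, xi_hat, sigma_hat))
    return rows


rng = np.random.default_rng(0)
sample = rng.standard_normal(20_000)
table = gpd_fits_by_threshold(sample, [0.5, 1.0, 1.5, 2.0, 2.5])

for u, n_exc, xi_hat, sigma_hat in table:
    print(f"u={u:.1f}  n={n_exc:5d}  xi={xi_hat:+.3f}  sigma={sigma_hat:.3f}")
```

Raising u shrinks the sample behind each fit, so the parameter estimates become noisier even as the GPD approximation itself improves.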

Folding – idea

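Main idea: Want to use information from the data below the threshold as well.

[Diagram: an observation below u is mapped by F into [0, F(u)), rescaled uniformly (“unif.”) onto [F(u), 1), and mapped back by F⁻¹, which moves it above u. An observation above u is kept where it is.]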

Folding – formula

$$Y(u) := \begin{cases} F^{-1}\!\left(\dfrac{\bar F(u)}{F(u)}\,F(X) + F(u)\right) & \text{if } X \le u, \\[1ex] X & \text{if } X > u, \end{cases} \qquad \text{where } \bar F = 1 - F.$$

If X ∼ F(x), then Y(u) has the same distribution as X | X > u.
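For a known F the folding map can be written down directly. The sketch below uses the standard normal as F purely for illustration; the function name `fold` and its arguments are assumptions, not notation from the talk. Points with X ≤ u are sent through F, rescaled from [0, F(u)) onto [F(u), 1), and mapped back with F⁻¹; points above u are left unchanged.

```python
import numpy as np
from scipy.stats import norm


def fold(x, u, cdf, ppf):
    """Apply the folding map Y(u) for a known distribution function (hypothetical helper).

    cdf is F and ppf is its inverse F^{-1}.
    """
    x = np.asarray(x, dtype=float)
    F_u = cdf(u)
    y = x.copy()
    below = x <= u
    # (1 - F(u)) / F(u) * F(X) + F(u) lies in [F(u), 1] whenever X <= u.
    p = (1.0 - F_u) / F_u * cdf(x[below]) + F_u
    y[below] = ppf(p)
    return y


rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = fold(x, u=1.0, cdf=norm.cdf, ppf=norm.ppf)
```

Every folded value lies above u, so all 1000 points can be used to describe the tail, rather than only the roughly 16% of a standard normal sample that exceeds u = 1.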

Problem: F is unknown.

Idea

Estimate F (in the “middle” of the distribution) with the empirical cdf F_n.

Estimate F⁻¹ with a “preliminary” GPD.
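One possible reading of this two-step idea is sketched below; the slides give no implementation, so every detail here is an assumption. The tail above u is modeled as F(x) = F_n(u) + (1 − F_n(u)) H(x − u) with H a preliminary GPD fitted to the exceedances, in which case the folding formula reduces to Y = u + H⁻¹(F_n(X) / F_n(u)) for X ≤ u. The ratio F_n(X) / F_n(u) is the rank of X among the below-threshold points divided by their count; the sketch divides by count + 1 instead (an added assumption) so that the largest such point is not sent to infinity.

```python
import numpy as np
from scipy.stats import genpareto, rankdata


def fold_estimated(x, u):
    """Fold data below u using the empirical cdf and a preliminary GPD (illustrative only)."""
    x = np.asarray(x, dtype=float)
    below = x <= u

    # Step 1: preliminary GPD fitted to the conventional exceedances over u.
    xi0, _, sigma0 = genpareto.fit(x[~below] - u, floc=0)

    # Step 2: empirical-cdf ratio F_n(X) / F_n(u), with a +1 in the denominator
    # (assumption) to keep the values strictly below 1.
    q = rankdata(x[below]) / (below.sum() + 1.0)

    # Step 3: invert with the preliminary GPD to place the points above u.
    y = x.copy()
    y[below] = u + genpareto.ppf(q, xi0, scale=sigma0)
    return y, xi0, sigma0


rng = np.random.default_rng(2)
x = rng.standard_normal(500)
y, xi0, sigma0 = fold_estimated(x, u=1.0)

# Refit the GPD on all folded points, now all above the threshold.
xi_fold, _, sigma_fold = genpareto.fit(y - 1.0, floc=0)
```

The refit at the end uses every observation, which is the point of folding; how much it helps depends on how good the preliminary fit and the empirical cdf are.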

Folding – simulation results

Normal dist. with 100 data points (“F unknown”)

[Figure: RETURN LEVELS. Return levels plotted against m (years), for m from 20 to 100. Legend: red = "true", green = with folding, white = conventional.]

Normal dist. with 100 data points (“F known”)

[Figure: RETURN LEVELS. Return levels plotted against m (years), for m from 20 to 100. Legend: red = "true", green = with folding, white = conventional.]

Cauchy dist. with 100 data points (“F unknown”)

[Figure: RETURN LEVELS. Return levels plotted against m (years), for m from 20 to 100. Legend: red = "true", green = with folding, white = conventional.]

Cauchy dist. with 100 data points (“F known”)

[Figure: RETURN LEVELS. Return levels plotted against m (years), for m from 20 to 100. Legend: red = "true", green = with folding, white = conventional.]

Folding – An analytical result

Assuming that ξ = 0, it can be shown that the variance of the estimator for σ using the folding procedure, Var(σ̂), for a GPD(ξ = 0, σ) can be reduced compared to the conventional estimator.

$$\mathrm{Var}(\hat\sigma) \le \frac{\sigma}{n}$$

The reduction in variance is a function of the threshold u and the “quality” of the approximation for F.

Simulation results suggest that the folding works better for heavy-tailed (Cauchy) distributions.

Conclusions

Increasing the threshold according to model fit diagnostics may be misleading in assessing the quality of the fit.

Using more information from below the threshold seems to yield more robust estimates.

Using the folding procedure may lead to more freedom to “define” extreme values in applications.

APPENDIX – GPD convergence

[Figure: shape parameter for normal distribution (simulated).]
