Lecture 5
March 10, 2014
The peaks over threshold method, point processes
Peaks over threshold method (POT)
• Those events are considered extreme which exceed a given, high threshold
• Advantages:
  • one may use more data
  • the estimators are not influenced by the small block maxima
• Disadvantages:
  • it depends on the threshold choice
  • the declustering needed in case of time series (deciding which maxima belong to one event) is not always unambiguous
Theoretical foundations
Let X1, X2, …, Xn be independent, identically distributed random variables. If the normalized maximum of this sequence converges to an extreme-value distribution (with parameters µ, σ, ξ), then

$$P(X-u \le y \mid X > u) \approx 1-\left(1+\frac{\xi y}{\tilde\sigma}\right)^{-1/\xi}$$

if $y > 0$ and $1+\xi y/\tilde\sigma > 0$, where $\tilde\sigma = \sigma + \xi(u-\mu)$.

(Generalized Pareto distribution, GPD.)

The asymptotics is valid if both n and u tend to infinity.
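A numerical sketch of this result on simulated data (the sample size, tail index, and threshold quantile below are arbitrary choices), using scipy's `genpareto`:

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(42)

# i.i.d. sample with a Pareto-type tail: P(X > x) = x^(-4), so xi = 1/4
x = rng.pareto(4.0, size=100_000) + 1.0

# excesses over a high threshold u
u = np.quantile(x, 0.99)
excess = x[x > u] - u

# fit the GPD to the excesses; the location is fixed at 0 by construction
xi_hat, _, sigma_hat = genpareto.fit(excess, floc=0.0)
print(f"xi_hat = {xi_hat:.3f} (theory: 0.25), sigma_hat = {sigma_hat:.3f}")
```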
Return levels
The p-quantile $x_p$ of the generalized Pareto model, i.e. the level with $P(X > x_p) = p$:

$$x_p = u + \frac{\sigma}{\xi}\left[\left(\frac{\zeta_u}{p}\right)^{\xi}-1\right], \quad \text{if } \xi \ne 0;$$

$$x_p = u + \sigma\log\frac{\zeta_u}{p}, \quad \text{if } \xi = 0,$$

where $\zeta_u = P(X > u)$, estimated by $\hat\zeta_u = n_u/n$.

If we observe annually on average $n_u$ maxima over the threshold, then the quantile with $p = 1/(T\,n_u)$ is exceeded once in every T years on average.

If $\hat\xi < 0$, the estimated upper endpoint of the distribution is $\hat{x}_{\max} = u - \hat\sigma/\hat\xi$.
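A small sketch of these formulas with made-up parameter values (u, σ, ξ, ζ_u and the observation frequency N below are all hypothetical). With N observations per year, the T-year return level is the quantile with unconditional exceedance probability p = 1/(N·T); note ζ_u/p = T·n_u with n_u = N·ζ_u exceedances per year.

```python
import math

def return_level(u, sigma, xi, zeta_u, p):
    """Level x_p with P(X > x_p) = p under the POT model
    (zeta_u = P(X > u), GPD excesses over u)."""
    if abs(xi) > 1e-12:
        return u + sigma / xi * ((zeta_u / p) ** xi - 1.0)
    return u + sigma * math.log(zeta_u / p)

def exceed_prob(x, u, sigma, xi, zeta_u):
    """Survival function implied by the same model (for x > u)."""
    if abs(xi) > 1e-12:
        return zeta_u * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)
    return zeta_u * math.exp(-(x - u) / sigma)

# hypothetical values: threshold 10, GPD(sigma = 2, xi = 0.2),
# P(X > u) = 0.02, N = 365 observations per year
u, sigma, xi, zeta_u, N, T = 10.0, 2.0, 0.2, 0.02, 365, 100
p = 1.0 / (N * T)                 # exceeded once per T years on average
x_T = return_level(u, sigma, xi, zeta_u, p)
print(f"{T}-year return level: {x_T:.2f}")

# round trip: the implied exceedance probability recovers p
assert abs(exceed_prob(x_T, u, sigma, xi, zeta_u) - p) < 1e-12
```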
Threshold selection
The mean excess plot: we plot the average of X − u for different thresholds u (over those observations which satisfy X > u) as a function of u.
If the Pareto model holds, this curve is approximately linear.
Interpreting it may be difficult due to its wild fluctuations near the maximum of the observations.
An alternative: consider the estimated values of the parameters for different thresholds.
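The mean excess values themselves are easy to compute; a sketch on simulated exponential data, for which the mean excess function is constant (ξ = 0, so the GPD line e(u) = (σ + ξu)/(1 − ξ) is flat; the scale and threshold grid are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# exponential data: the true mean excess function is constant (= 2)
x = rng.exponential(scale=2.0, size=200_000)

# mean excess e(u) = E[X - u | X > u] on a grid of thresholds
thresholds = np.quantile(x, [0.5, 0.7, 0.9, 0.95, 0.99])
mean_excess = [np.mean(x[x > u] - u) for u in thresholds]
for u, e in zip(thresholds, mean_excess):
    print(f"u = {u:5.2f}   e(u) = {e:.3f}")
# for GPD excesses e(u) is linear in u; here (xi = 0) it stays flat near 2
```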
S&P 500, daily losses
• Data for the period 1960–1987
• daily losses in %
• Threshold: 1.5%
• MLE: γ = 0.177, σ = 0.415
• 40-year return level: 6.57, conf. int.: (4.9; 10)
• The largest observed loss (13.10.2008): 9.05%
[Figure: fitted excess distribution F_u(x−u), x on log scale]

[Figure: S&P500's daily %falls with threshold u = 1.5%, 06.01.60–06.01.85]
Estimation methods, based on quantiles
• Hill estimator
• Pickands estimator
• Method of moments
• ML estimator for the exponential regression
Hill estimator
For a heavy-tailed distribution: $1-F(x) = x^{-\alpha}\ell(x)$, with $\ell$ slowly varying.

The distribution of log X: $P(\log X > u) = P(X > e^u) = e^{-\alpha u}\,\ell(e^u)$.

Quantile function (with $\gamma = 1/\alpha$):

$$Q(1-p) = p^{-\gamma}\,\ell^*(1/p) \;\Rightarrow\; \log Q(1-p) = -\gamma\log p + \log\ell^*(1/p)$$

The ordered sample of X1, X2, …, Xn: $X^*_1 \le X^*_2 \le \dots \le X^*_n$. The elements of the ordered sample are consistent estimators for the respective quantiles.

Plot: $\log X^*_{n-j+1}$ vs. $\log\frac{n+1}{j}$; for heavy tails this is asymptotically linear, with steepness (slope) $\gamma$.
Hill estimator/2

Estimation of the steepness (least squares over the top k points of the quantile plot):

$$\hat\gamma = \frac{\frac{1}{k}\sum_{j=1}^{k}\left(\log X^*_{n-j+1} - \log X^*_{n-k}\right)}{\frac{1}{k}\sum_{j=1}^{k}\left(\log\frac{n+1}{j} - \log\frac{n+1}{k+1}\right)}$$

The denominator is near to 1 if k is large. Thus the Hill estimator:

$$H_{k,n} = \frac{1}{k}\sum_{j=1}^{k}\log X^*_{n-j+1} - \log X^*_{n-k}$$

Properties: it depends on k.
• k small: large variance
• k large: substantial bias
A compromise must be found. Not too robust!
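The estimator is a few lines of code; a sketch on simulated Pareto data with known γ = 0.5 (sample size and the k values are arbitrary choices):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator H_{k,n} of gamma = 1/alpha from the k largest points."""
    xs = np.sort(np.asarray(x))          # X*_1 <= ... <= X*_n
    # mean of log X*_{n-j+1}, j = 1..k, minus log X*_{n-k}
    return np.mean(np.log(xs[-k:])) - np.log(xs[-k - 1])

rng = np.random.default_rng(1)
x = rng.pareto(2.0, size=100_000) + 1.0  # P(X > t) = t^(-2), gamma = 0.5
for k in (50, 500, 5000):
    print(k, round(hill_estimator(x, k), 3))
```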
Properties of the Hill estimator
• It is weakly consistent under weak dependence if k → ∞, k/n → 0
• Strongly consistent in the iid case if, in addition, k/log log n → ∞ holds
• Under further conditions $\sqrt{k}\,(\hat\alpha - \alpha) \to N(0, \alpha^2)$ (asymptotic normality)
• But in practice it often fails, as the conditions are hard to check and it is not robust
Quantile estimation

[Figure: Pareto quantile plot of S&P 500 daily losses in %: log(empirical quantiles) vs. −log(j/(n+1)), with the slope fitted above log((n+1)/(k+1)), i.e. from log X*_{n−k}]
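A sketch of how the quantile plot's coordinates, and a least-squares slope over the top k points, can be computed (simulated Pareto data with γ = 0.5; the choice k = 500 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.pareto(2.0, size=10_000) + 1.0)   # gamma = 0.5
n = len(x)
j = np.arange(1, n + 1)

# Pareto quantile plot coordinates: (-log(j/(n+1)), log X*_{n-j+1})
qx = -np.log(j / (n + 1.0))
qy = np.log(x[::-1])

# least-squares slope over the k largest observations estimates gamma
k = 500
slope = np.polyfit(qx[:k], qy[:k], 1)[0]
print(round(slope, 3))
```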
Point processes
• A useful tool; models are easy to generalise with their help
• The sample is a sequence of random points; their number is n
• The extremes are interesting for us: we deal with points falling into (t, ∞)
Poisson point processes
• Generalisation of the one-dimensional Poisson process
• N(B): the number of observations falling into the Borel set B
• Properties:
  • N(A) and N(B) are independent for any disjoint sets A and B
  • λ ≥ 0 is a given measure over the Borel sets (the intensity); N(B) has Poisson distribution with parameter λ(B)
• The process is called homogeneous if λ is the Lebesgue measure (taken over the bounded Borel sets B)
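A minimal simulation sketch of the homogeneous case (the window [0,10]² and the two half-window counts are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# homogeneous Poisson process on [0,10] x [0,10] with unit intensity:
# the total count is Poisson(lambda(B)) = Poisson(area)
area = 100.0
n_points = rng.poisson(area)
points = rng.uniform(0.0, 10.0, size=(n_points, 2))

# counts in the two disjoint halves (each of measure 50) are
# independent Poisson(50) variables
in_A = int(np.sum(points[:, 0] < 5.0))
in_B = n_points - in_A
print(n_points, in_A, in_B)
```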
Two dimensional extremes
• Now let the marginal distributions be unit Pareto: P(X > z) = 1/z, z > 1
• The points X_i/n also form a point process, which converges over the region [0, ∞) × [0, ∞) \ {(0, 0)}
• Property of the intensity measure: λ(B/m) = mλ(B) (homogeneous of order −1)
• Transformation to polar coordinates:
$$\lambda(dr, d\omega) = \frac{dr}{r^2}\,dH(\omega)$$
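In practice the margins are standardised first; a sketch of the usual rank transform to (approximately) unit Pareto margins (the normal input data and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(5000, 2))          # arbitrary margins

n = x.shape[0]
ranks = x.argsort(axis=0).argsort(axis=0) + 1   # ranks 1..n per column
u = ranks / (n + 1.0)                           # pseudo-uniforms in (0,1)
z = 1.0 / (1.0 - u)                             # unit Pareto: P(Z > t) = 1/t

# sanity check: the empirical P(Z > 2) should be near 1/2
print(round(float(np.mean(z[:, 0] > 2.0)), 3))
```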
Max-stable processes
• Let T be an index set (in the basic definition these were precipitation measuring stations); then $\{Y_t : t \in T\}$ is a max-stable process if and only if its finite-dimensional distributions are max-stable
• Example: let $(r_i, s_i)$ be a Poisson point process over the set $(0,\infty)\times S$, with intensity measure
$$\frac{dr}{r^2}\,dH(s),$$
where S is an arbitrary measurable set and H a measure over S
The construction of Smith (1990)
• Let f be such that $\int_S f(s,t)\,dH(s) = 1$ for all t, and define
$$Y_t = \max_i \{r_i\, f(s_i, t)\}, \quad t \in T$$
• $r_i$ is the strength of the i-th storm and $s_i$ is its location
• This implies
$$P(Y_t < y_t \;\; \forall t \in T) = \exp\left(-\int_S \max_{t\in T}\frac{f(s,t)}{y_t}\,dH(s)\right),$$
so
  • the margins of Y are unit Fréchet
  • Y is max-stable
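The construction can be simulated directly. The sketch below works in one dimension with f(s, ·) a standard normal density centred at s, S = [−10, 10] with H the Lebesgue measure, and storm strengths r_i = (H-mass)/Γ_i built from unit-rate Poisson arrivals Γ_i, which realises the intensity dr/r² · dH(s); all of these concrete choices are illustrative.

```python
import math
import numpy as np

rng = np.random.default_rng(5)

def smith_realisation(t, n_storms=200, s_low=-10.0, s_high=10.0):
    """One realisation of Y_t = max_i r_i * f(s_i, t) (Smith's construction)."""
    mass = s_high - s_low                               # total H-mass of S
    gamma = np.cumsum(rng.exponential(size=n_storms))   # unit-rate arrivals
    r = mass / gamma                        # points with intensity dr/r^2
    s = rng.uniform(s_low, s_high, size=n_storms)
    # f(s, t): standard normal density in t centred at s (integrates to 1 in s)
    f = np.exp(-0.5 * (t - s) ** 2) / math.sqrt(2.0 * math.pi)
    return float(np.max(r * f))

# the margin at any fixed t should be unit Frechet: P(Y_t <= y) = exp(-1/y)
sims = np.array([smith_realisation(0.0) for _ in range(2000)])
print(round(float(np.mean(sims <= 1.0)), 3), round(math.exp(-1.0), 3))
```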
Examples
• |T| = 1: one-dimensional max-stable distribution
• T = {1, 2}, S = [0, 1], H: Lebesgue measure, with the f(s, t) below, is just the two-dimensional logistic model
• Gaussian extreme-value process: f(s, t) gives the density function of the normal distribution with mean s and covariance matrix Σ (as a function of t)
For the logistic model:

$$f(s,t) = \left(\frac{1}{\alpha}-1\right)\bigl(s(1-s)\bigr)^{-1-1/\alpha}\left(s^{-1/\alpha}+(1-s)^{-1/\alpha}\right)^{\alpha-2}\cdot\begin{cases} s, & \text{if } t = 1\\ 1-s, & \text{if } t = 2\end{cases}$$
Simulated Gauss process
The process here was used for modelling snow depth in Switzerland
Model-fitting
• Estimation of the one-dimensional margins
• Estimation of the two-dimensional dependence (extremal index)
• For parametric models (e.g. the Gaussian one), an approximate (pairwise) maximum likelihood estimator can be calculated; we come back to this question later
Multivariate POT models
• It would be good to use more data and real observations
• Classic model: the observations should exceed the threshold in all coordinates
  • Margins: GPD
• Certain parametric models may be transferred
• The R package evd can be used
• But: the number of usable observations is low
Alternative definition (BGPD II)
• It considers every observation that exceeds the threshold in at least one coordinate
• The margins will not be GPDs
• There are just a few models which can be identified
• Estimation: with the R package mgpd, but it is not easy in more than two dimensions
Definition
• G is a d-dimensional GPD if
$$G(x) = \frac{1}{-\log F(0)}\,\log\frac{F(x)}{F(x\wedge 0)},$$
where F is an MGEV with 0 < F(0) < 1.

Especially:
G(x) = 0 if x < 0, and
G(x) = 1 − log F(x)/log F(0) if x > 0.
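A quick numerical check of the definition; the bivariate F below (independent Gumbel margins) is just one convenient choice of MGEV with 0 < F(0) < 1.

```python
import math

def F(x1, x2):
    """A bivariate GEV with independent Gumbel margins; 0 < F(0,0) < 1."""
    return math.exp(-(math.exp(-x1) + math.exp(-x2)))

def G(x1, x2):
    """Multivariate GPD built from F:
    G(x) = log( F(x) / F(x ^ 0) ) / (-log F(0))."""
    num = math.log(F(x1, x2)) - math.log(F(min(x1, 0.0), min(x2, 0.0)))
    return num / (-math.log(F(0.0, 0.0)))

# for x > 0 (componentwise) this reduces to 1 - log F(x) / log F(0)
x1, x2 = 1.0, 0.5
lhs = G(x1, x2)
rhs = 1.0 - math.log(F(x1, x2)) / math.log(F(0.0, 0.0))
print(round(lhs, 6), round(rhs, 6))
assert abs(lhs - rhs) < 1e-12
# and G(x) = 0 when x < 0 componentwise
assert G(-1.0, -2.0) == 0.0
```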
Comparison
The estimates based on bootstrap samples show that the estimators which are based on more data are indeed more reliable.
References
• Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997) Modelling Extremal Events for Insurance and Finance. Springer.
• Beirlant, J. and Matthys, G. (2000) Quantile estimation for heavy-tailed data.
• Smith, R. L. (1990) Max-Stable Processes and Spatial Extremes. www.stat.unc.edu/postscript/rs/spatex.pdf
• Coles, S. and Tawn, J. (1991) Modelling extreme multivariate events. Journal of the Royal Statistical Society, Series B, 53, p. 377–392.
• Rootzén, H. and Tajvidi, N. (2006) The multivariate generalized Pareto distribution. Bernoulli 12, p. 917–930.
• Rakonczai, P. (2012) Multivariate Threshold Models with Applications to Wind Speed Data. Ph.D. thesis.