343
Modeling The Variance of a Time Series Peter Bloomfield Introduction Time Series Models First Wave Second Wave Stochastic Volatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Modeling The Variance of a Time Series Peter Bloomfield Department of Statistics North Carolina State University July 31, 2009 / Ben Kedem Symposium

Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Modeling The Variance of a Time Series

Peter Bloomfield

Department of StatisticsNorth Carolina State University

July 31, 2009 / Ben Kedem Symposium

Page 2: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 3: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Ben Kedem

Ben has made many contributions to time seriesmethodology.A common theme is that some unobserved (latent)series controls either:

the values of the observed data, orthe distribution the observed data.

In a stochastic volatility model, a latent series controlsspecifically the variance of the observed data.We relate stochastic volatility models to other timeseries models.

Page 4: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 5: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Correlation

Time series modeling is not just about correlation...

http://xkcd.com/552/

Page 6: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Time Domain Approach

The time domain approach to modeling a time seriesYt focuses on the conditional distribution ofYt |Yt−1,Yt−2, . . . .One reason for this focus is that the joint distribution ofY1,Y2, . . . ,Yn can be factorized as

f1:n(y1, y2, . . . , yn)

= f1(y1) f2|1(y2 |y1 ) . . . fn|n−1:1(yn |yn−1, yn−2, . . . , y1 ) .

So the likelihood function is determined by theseconditional distributions.

Page 7: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The conditional distribution may be defined by:the conditional mean,

µt = E(Yt |Yt−1 = yt−1,Yt−2 = yt−2, . . . ) ;

the conditional variance,

ht = Var(Yt |Yt−1 = yt−1,Yt−2 = yt−2, . . . ) ;

the shape of the conditional distribution.

Page 8: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Forecasting

The conditional distribution of Yt |Yt−1,Yt−2, . . . alsogives the most complete solution to the forecastingproblem:

We observe Yt−1,Yt−2, . . . ;what statements can we make about Yt?

The conditional mean is our best forecast, and theconditional standard deviation measures how far webelieve the actual value might differ from the forecast.The conditional shape, usually a fixed distribution suchas the normal, allows us to make probability statementsabout the actual value.

Page 9: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 10: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

First Wave

The first wave of time series methods focused on theconditional mean, µt .

The conditional variance was assumed to be constant.The conditional shape was either normal or unspecified.

Need only to specify the form of

µt = µt (yt−1, yt−2, . . . ) .

Time-homogeneous:

µt = µ(yt−1, yt−2, . . . ) ,

depends on t only through yt−1, yt−2, . . . .

Page 11: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Autoregression

Simplest form:µt = a linear function of a small number of values:

µt = φ1yt−1 + · · ·+ φpyt−p.

Equivalently, and more familiarly,

Yt = φ1Yt−1 + · · ·+ φpYt−p + εt ,

where εt = Yt − µt satisfies

E(εt |Yt−1,Yt−2, . . . ) = 0,Var(εt |Yt−1,Yt−2, . . . ) = h.

Page 12: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Recursion

Problem: some time series need large p.Solution: recursion; include also some past values ofµt :

µt = φ1yt−1 + · · ·+ φpyt−p + ψ1µt−1 + · · ·+ ψqµt−q.

Equivalently, and more familiarly,

Yt = φ1Yt−1 + · · ·+ φpYt−p + εt + θ1εt−1 + · · ·+ θqεt−q.

This is the ARMA (AutoRegressive Moving Average)model of order (p,q).

Page 13: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 14: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Two Years of S&P 500: Changing Variance

−10

−5

05

10

2008 2009

Page 15: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Second Wave

The second wave of time series methods added afocus on the conditional variance, ht .Now need to specify the form of

ht = ht (yt−1, yt−2, . . . ) .

Time-homogeneous:

ht = h(yt−1, yt−2, . . . ) ,

depends on t only through yt−1, yt−2, . . . .

Page 16: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

ARCH

Simplest form: ht a linear function of a small number ofsquared εs:

ht = ω + α1ε2t−1 + · · ·+ αqε

2t−q.

Engle, ARCH (AutoRegressive ConditionalHeteroscedasticity):

proposed in 1982;Nobel Prize in Economics, 2003 (shared with the lateSir Clive Granger).

Page 17: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Recursion

Problem: some time series need large q.Solution: recursion; include also some past values ofht :

ht = ω + α1ε2t−1 + · · ·+ αqε

2t−q.+ β1ht−1 + · · ·+ βpht−p.

Bollerslev, 1987; GARCH (Generalized ARCH; noNobel yet, nor yet a Knighthood).Warning! note the reversal of the roles of p and q fromthe convention of ARMA(p,q).

Page 18: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

GARCH(1,1)

The simplest GARCH model has p = 1,q = 1:

ht = ω + αε2t−1 + βht−1

If α + β < 1, there exists a stationary process with thisstructure.If α + β = 1, the model is called integrated:IGARCH(1,1).

Page 19: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 20: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Stochastic Volatility

In a stochastic volatility model, an unobserved (latent)process Xt affects the distribution of the observedprocess Yt, specifically the variance of Yt .Introducing a “second source of variability” is appealingfrom a modeling perspective.

Page 21: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Simple Example

For instance:Xt satisfies

Xt − µ = φ (Xt−1 − µ) + ξt ,

where ξt are i.i.d. N(

0, σ2ξ

).

If |φ| < 1, this is a (stationary) autoregression, but ifφ = 1 it is a (non-stationary) random walk.Yt = σtηt , where σ2

t = σ2(Xt ) is a non-negative functionsuch as

σ2(Xt ) = exp(Xt )

and ηt are i.i.d.(0,1)–typically Gaussian, but also t .

Page 22: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Conditional Distributions

So the conditional distribution of Yt given Yt−1,Yt−2, . . .and Xt ,Xt−1, . . . is simple:

Yt |Yt−1,Yt−2, . . . ,Xt ,Xt−1, · · · ∼ N(

0, σ2(Xt )).

But the conditional distribution of Yt given onlyYt−1,Yt−2, . . . is not analytically tractable.In particular,

ht (yt−1, yt−2, . . . ) = Var(Yt |Yt−1 = yt−1,Yt−2 = yt−2, . . . )

is not a simple function.

Page 23: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Difficulties

Analytic difficulties cause problems in:estimation;forecasting.

Computationally intensive methods, e.g.:Particle filtering;Numerical quadrature.

Page 24: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 25: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Stochastic Volatility and GARCH

Stochastic volatility models have the attraction of anexplicit model for the volatility, or variance.Is analytic difficulty the unavoidable cost of thatadvantage?

Page 26: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 27: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The Latent Process

We construct a latent process by:

X0 ∼ Γ

2,τ2

2

),

and for t > 0

Xt = BtXt−1,

where

θBt ∼ β(ν

2,12

)and Bt are i.i.d. and independent of X0.

Page 28: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The Observed Process

The observed process is defined for t ≥ 0 by

Yt = σtηt

where

σt =1√Xt,

and ηt are i.i.d. N(0,1) and independent of Xt.Equivalently: given Xu = xu,0 ≤ u, andYu = yu,0 ≤ u < t ,

Yt ∼ N(0, σ2t )

with the same definition of σt .

Page 29: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Constraints

SinceVar(Y0) = E

(X−1

0

),

we constrain ν > 2 to ensure that

E(

X−10

)<∞.

RequiringE(

X−1t

)= E

(X−1

0

)for all t > 0 is also convenient, and is met if

θ =ν − 2ν − 1

.

Page 30: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Comparison with Earlier Example

This is quite similar to the earlier example, with φ = 1:Write X ∗t = − log (Xt ).Then

X ∗t = X ∗t−1 + ξ∗t ,

whereξ∗t = − log (Bt ) .

In terms of X ∗t ,σ2

t = exp(X ∗t ) .

Page 31: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Differences

A key constraint is that now φ = 1, so X ∗t is a(non-stationary) random walk, instead of a (stationary)auto-regression.Also X ∗t is non-Gaussian, where in the earlierexample, the latent process was Gaussian.Also X ∗t has a drift, because

E(ξ∗t ) 6= 0.

Of course, we could include a drift in the earlierexample.

Page 32: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Matched Simulated Random Walks

0 20 40 60 80 100

−2.

0−

1.0

0.0

nu = 10Gaussian

Page 33: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Matched Simulated Random Walks

0 20 40 60 80 100

−8

−6

−4

−2

0

nu = 5Gaussian

Page 34: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

So What?

So our model is not very different from (a carefullychosen instance of) the earlier example.So does it have any advantage?Note: the inverse Gamma distribution is the conjugateprior for the variance of the Gaussian distribution.

Page 35: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Marginal distribution of Y0

Marginal distribution of Y0:

Y0 ∼√

h0 t∗(ν)

where

h0 =τ2

ν − 2and t∗(ν) is the standardized t-distribution (i.e., scaledto have variance 1).

Page 36: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Conditional distributions of X0 and X1|Y0

Conjugate prior/posterior property: conditionally onY0 = y0,

X0 ∼ Γ

(ν + 1

2,τ2 + y2

02

).

Beta multiplier property: conditionally on Y0 = y0,

X1 = B1X0 ∼ Γ

2, θ

(τ2 + y2

02

)].

Page 37: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Conditional distribution of Y1|Y0

The conditional distribution of X1|Y0 differs from thedistribution of X0 only in scale, so conditionally onY0 = y0,

Y1 ∼√

h1 t∗(ν),

where

h1 =θ

ν − 2

(τ2 + y2

0

)= θh0 + (1− θ)y2

0 .

Hmm...so the distribution of Y1|Y0 differs from thedistribution of Y0 only in scale...I smell a recursion!

Page 38: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The Recursion

Write Yt−1 = (Yt−1,Yt−2, . . . ,Y0).For t > 0, conditionally on Yt−1 = yt−1,

Yt ∼√

ht t∗(ν),

where

ht = θht−1 + (1− θ)y2t−1.

Page 39: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The Structure

That is, Yt is IGARCH(1,1) with t(ν)-distributedinnovations.Constraints:

ω = 0;β = 1− α = ν−2

ν−1 .

So we can have a stochastic volatility structure, and stillhave (I)GARCH structure for the observed processYt.

Details and some multivariate generalizations sketchedout at http://www4.stat.ncsu.edu/~bloomfld/talks/sv.pdf

Page 40: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

The Structure

That is, Yt is IGARCH(1,1) with t(ν)-distributedinnovations.Constraints:

ω = 0;β = 1− α = ν−2

ν−1 .

So we can have a stochastic volatility structure, and stillhave (I)GARCH structure for the observed processYt.Details and some multivariate generalizations sketchedout at http://www4.stat.ncsu.edu/~bloomfld/talks/sv.pdf

Page 41: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 42: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Same Two Years of S&P 500

−10

−5

05

10

2008 2009

Page 43: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Two Years of S&P 500

Data: 500 log-returns for the S&P 500 index, from05/24/2007 to 05/19/2009.Maximum likelihood estimates:

τ2 = 4.37

θ = 0.914⇒ ν = 12.6.

With ν unconstrained:

τ2 = 3.37

θ = 0.918ν = 9.93.

Page 44: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Comparison

Constrained result has less heavy tails and lessmemory than unconstrained result.Likelihood ratio test:

−2 log(likelihood ratio) = 0.412

assuming ∼ χ2(1),P = 0.521,

so differences are not significant.With more data, difference becomes significant.

Page 45: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Outline

1 Introduction

2 Time Series ModelsFirst WaveSecond Wave

3 Stochastic Volatility

4 Stochastic Volatility and GARCHA Simple Tractable ModelAn Application

5 Summary

Page 46: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Summary

Latent processes are useful in time series modeling.GARCH and Stochastic Volatility are both valuabletools for modeling time series with changing variance.GARCH fits naturally into the time domain approach.Stochastic Volatility is appealing but typicallyintractable.Exploiting conjugate distributions may bridge the gap.

Thank you!

Page 47: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Modeling TheVariance of aTime Series

PeterBloomfield

Introduction

Time SeriesModelsFirst Wave

Second Wave

StochasticVolatility

StochasticVolatility andGARCHA Simple TractableModel

An Application

Summary

Summary

Latent processes are useful in time series modeling.GARCH and Stochastic Volatility are both valuabletools for modeling time series with changing variance.GARCH fits naturally into the time domain approach.Stochastic Volatility is appealing but typicallyintractable.Exploiting conjugate distributions may bridge the gap.

Thank you!

Page 48: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

ON OPTIMAL PREDICTIVE INFERENCE

IN LOG-GAUSSIAN RANDOM FIELDS

Victor De Oliveira

Department of Management Sciences

and Statistics

University of Texas at San Antonio

[email protected]

http://faculty.business.utsa.edu/vdeolive

1

Page 49: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

LOG-GAUSSIAN RANDOM FIELDS

The random field Z(s), s ∈ D, D ⊂ Rd and

d ≥ 1, is log-Gaussian if Y (s), s ∈ D, with

Y (s) = log(Z(s)), is Gaussian.

Here we assume that the mean and covariance

functions of Y (·) are given by

EY (s) = µY

covY (s), Y (u) = CY (s, u)

where µY ∈ R unknown and CY (s, u) a known

covariance function in Rd, satisfying that for

all s ∈ D, CY (s, s) = σ2Y .

It follows the mean and covariance functions

of Z(·) are given by

EZ(s) = expµY +σ2

Y

2 =: µZ

covZ(s), Z(u) = µ2Z( expCY (s, u) − 1).

2

Page 50: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

PREDICTIVE INFERENCE

DATA: Z = (Z(s1), . . . , Z(sn)) measured at

sampling locations s1, . . . , sn ∈ D. Let

s0 ∈ D an unmeasured location in D

B ⊂ D a subregion of interest

GOALS:

• Obtain predictor of Z(s0)

(point prediction).

• Obtain prediction interval for Z(s0)

(interval prediction)

• Obtain preditor of Z(B) = 1|B|

B Z(s)ds

(block prediction).

3

Page 51: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

POINT PREDICTION

Let Z0 be a predictor for Z0 (within a family)

and L(Z0, Z0) a loss function.

The optimal predictor of Z0 is the predictor

that minimizes the risk function

r(Z0) = EL(Z0, Z0).

For the squared error loss the risk function

becomes the mean squared prediction error

MSPE(Z0) = E(Z0 − Z0)2.

Notation: All quantities that depend on the

prediction location s0 would be written with

the subscript ‘0’.

5

Page 52: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

If µY were known, the optimal predictor and

its MSPE are given by

Z∗0 = EZ0 | Z

= expY ∗0 +

σ∗0Y

2

MSPE(Z∗0) = var(Z0) − var(Z∗

0)

= µ2Z( expσ2

Y − expc′0Y Σ−1Y c0Y ),

where

Y ∗0 = EY0 | Y

= µY + c′0Y Σ−1

Y (Y − µY 1)

σ∗0Y = var(Y0 | Y)

= σ2Y − c

′0Y Σ−1

Y c0Y ;

ΣY,ij = CY (si, sj)

c0Y,i = CY (s0, si).

These are known in the geostatistical

literature as the simple kriging predictors and

simple kriging variances of Z0 and Y0.

6

Page 53: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Unbiased Prediction:

In practice the most used predictor is the

lognormal kriging predictor

ZLK0 = expY OK

0 +1

2(σ2

Y − λ′0Y ΣY λ0Y ),

MSPE(ZLK0 ) = µ2

Z( expσ2Y +expλ′

0Y ΣY λ0Y −2expλ′0Y ΣY λ0Y −m0Y );

where

Y OK0 = λ′

0Y Y (the BLUP of Y0)

λ′0Y = (c0Y +

1 − 1′Σ−1Y c0Y

1′Σ−1Y 1

1)′Σ−1Y .

By construction ZLK0 satisfies the following

optimality property:

7

Page 54: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Proposition 1. The predictor ZLK0 minimizes

E(log(Z0) − log(Z0))2 over the class of pre-

dictors of the form Z0 = expλ′0 log(Z) + k0,

where λ0 ∈ Rn and k0 ∈ R are constrained such

that EZ0 = EZ0 for every µY ∈ R.

Recently, Cox (2004) noted a stronger

optimality property:

Proposition 2. The predictor ZLK0 minimizes

E

(Z0−Z0)2

expc′0Y Σ−1Y log(Z)

over the class of all

unbiased predictors of Z0.

These optimality properties are somewhat

unsatisfactory.

The former holds in the transformed log-scale

rather than in the original scale of measure-

ment.

The latter, although holds in the original scale,

is with respect to a weighted squared error loss

function with little intuitive appeal.

8

Page 55: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Optimal Point Prediction:

Consider the family of predictors

P0 = Z0 = expa′0Y + k0 : k0 ∈ R, a0 ∈ R

n, a′01 = 1

which includes many special cases:

ZLK0 , ZN

0 = expY OK0 and

ZML0 = expY OK

0 +1

2(σ2

Y − c′0Y Σ−1

Y c0Y ),

ZB0 = expY OK

0 +1

2(σ2

Y +λ′0Y ΣY λ0Y −2λ′

0Y c0Y ),

Theorem 1. The predictor in the family P0

that minimizes E(Z0 − Z0)2 is given by

ZME0 = expY OK

0 +1

2(σ2

Y −λ′0Y ΣY λ0Y −2m0Y ),

and its MSPE is given by

MSPE(ZME0 ) = µ2

Z( expσ2Y − expλ′

0Y ΣY λ0Y − 2m0Y ).

where m0Y =1−1′Σ−1

Y c0Y

1′Σ−1Y 1

.

9

Page 56: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

BLOCK PREDICTION

A related problem is the prediction of:

Z(B) =1

|B|

BZ(s)ds, B ⊂ D,

based on (point) data Z.

Examples where inference about this arises:

• Environmental assessment

• Precision farming

Two predictors have been proposed in the

geostatistical literature:

The lognormal kriging block predictor

Z(B)LK =1

|B|

BZLK(s)ds.

A block predictor motivated by the assumptionof “preservation of lognormality”

Z(B)PL = exp

Y (B)OK +1

2(σ2

Y − λ′Y (B)ΣY λY (B))

,

where Y (B)OK = λ′Y (B)Y is the BLUP of

Y (B) =∫

B Y (s)ds/|B| based on Y.

10

Page 57: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Optimal Block Prediction:

Consider the family of block predictors

PB =

Z(B) =1

|B|

B

expY OK(s) + k(s)ds : k(s) ∈ C(B)

,

where C(B) is the space of bounded and Lebesgue

measurable functions on B.

Theorem 2. The predictor in the family of

predictors PB that minimizes E(Z(B)−Z(B))2

is given by

Z(B)ME =1

|B|

BZME(s)ds,

where ZME(s) is the optimal point predictor

given before, and MSPE(Z(B)ME) is given by

µ2Z

|B|2

B

B

(

expCY (s,u)−expλ′Y (s)ΣY λY (u)−mY (s)−mY (u)

)

dsdu.

11

Page 58: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Considering now the family of block predictors

PB = Z(B) = expY (B)OK + kB : kB ∈ R.

Theorem 3. The predictor in the family of

predictors PB that minimizes E(Z(B)−Z(B))2

is given by

Z(B)MP = exp Y (B)OK +1

2(σ2

Y − 3λ′Y (B)ΣY λY (B))

+ log

(

1

|B|

B

eλ′

Y (B)cY (s)ds

)

and MSPE(Z(B)MP ) is given by

µ2Z

|B|2

(

B

B

expCY (s,u)dsdu

− |B|2 exp

2 log( 1

|B|

B

eλ′

Y (B)cY (s)ds

)

− λ′Y (B)ΣY λY (B)

)

.

12

Page 59: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Remarks.

From the above results follow that:

• ZLK0 is inadmissible, in the sense that

MSPE(ZME0 ) ≤ MSPE(ZLK

0 ) for all µY ∈ R.

• Z(B)LK and Z(B)PL are both inadmissible.

• Z(B)ME and Z(B)MP , and their MSPEs

can not be compared analytically;

they are compared numerically.

13

Page 60: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

COMPARISON OF PREDICTORS

Let the region D = [0,1] × [0,1] and randomfield Z(s) = expY (s), where Y (s), s ∈ D isGaussian with

EY (s) = µY , CY (s, u) = σ2Y exp−

l

θY;

l = ||s − u|| is Euclidean distance, µY ∈ R andσ2

Y , θY > 0.Data on Z(·) is observed at n = 50 samplinglocations chosen at random.

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

x−coordinate

y−co

ordi

nate

14

Page 61: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 62: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Point Predictors:

We compare the values of ZME0 and ZLK

0

by predicting Z(s0) for locations:

s0 = (0.5,0.5), (0.3,0.8) and (0.9,0.9).

For that note

ZME0

ZLK0

= exp−m0Y =EZME

0

EZ0,

which does not depend on the observed data.

To compare the predictors in terms of their

MSPE’s we use the predictive efficiency of

ZME0 relative to ZLK

0

RMSPE(ZME0 , ZLK

0 ) =MSPE(ZME

0 )

MSPE(ZLK0 )

.

15

Page 63: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.0 0.5 1.0 1.5 2.0

−0.06

−0.04

−0.02

0.00

sigma

Z(s_0

)^ME/Z

(s_0)^

LK −

1

s_0(0.5,0.5)(0.3,0.8)(0.9,0.9)

0.0 0.2 0.4 0.6 0.8 1.0

−0.06

−0.04

−0.02

0.00

theta

Z(s_0

)^ME/Z

(s_0)^

LK −

1

s_0(0.5,0.5)(0.3,0.8)(0.9,0.9)

16

Page 64: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.0 0.5 1.0 1.5 2.0

0.994

0.996

0.998

1.000

sigma

RMSP

E(Z(s

_0)^M

E,Z(s_

0)^LK

)

s_0(0.5,0.5)(0.3,0.8)(0.9,0.9)

0.0 0.2 0.4 0.6 0.8 1.0

0.994

0.996

0.998

1.000

theta

RMSP

E(Z(s

_0)^M

E,Z(s_

0)^LK

)

s_0(0.5,0.5)(0.3,0.8)(0.9,0.9)

17

Page 65: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Block Predictors:

We compare the block predictors of Z(B) for

the sub-regions B shown below:

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

x−coordinate

y−coord

inate

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

x−coordinate

y−coord

inate

The predictors are approximated by noting that

Z(B)LK = ESZLK(S) , Z(B)ME = ESZ

ME(S),

where ZLK(·) and ZME(·) are point predictors

and expectation is taken with respect to

S ∼ unif(B).

18

Page 66: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.00.5

1.01.5

2.0

−0.010 −0.006 −0.002 0.002

sigma

Z(B)^ME/Z(B)^LK − 1

0.00.5

1.01.5

2.0

−0.010 −0.006 −0.002 0.002

sigma

Z(B)^ME/Z(B)^LK − 1

0.00.2

0.40.6

0.81.0

−0.010 −0.006 −0.002 0.002

theta

Z(B)^ME/Z(B)^LK − 1

0.00.2

0.40.6

0.81.0

−0.010 −0.006 −0.002 0.002

theta

Z(B)^ME/Z(B)^LK − 1

19

Page 67: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.00.5

1.01.5

2.0

0.990 0.994 0.998

sigma

RMSPE(Z(B)^ME,Z(B)^LK)

0.00.5

1.01.5

2.0

0.990 0.994 0.998

sigma

RMSPE(Z(B)^ME,Z(B)^LK)

0.00.2

0.40.6

0.81.0

0.990 0.994 0.998

theta

RMSPE(Z(B)^ME,Z(B)^LK)

0.00.2

0.40.6

0.81.0

0.990 0.994 0.998

theta

RMSPE(Z(B)^ME,Z(B)^LK)

20

Page 68: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.00.5

1.01.5

2.0

−0.2 0.0 0.2 0.4 0.6

sigma

Z(B)^ME/Z(B)^MP − 1

0.00.5

1.01.5

2.0

−0.2 0.0 0.2 0.4 0.6

sigma

Z(B)^ME/Z(B)^MP − 1

0.00.2

0.40.6

0.81.0

−0.2 0.0 0.2 0.4 0.6

theta

Z(B)^ME/Z(B)^MP − 1

0.00.2

0.40.6

0.81.0

−0.2 0.0 0.2 0.4 0.6

theta

Z(B)^ME/Z(B)^MP − 1

21

Page 69: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.00.5

1.01.5

2.0

0.2 0.6 1.0 1.4

sigma

RMSPE(Z(B)^ME,Z(B)^MP)

0.00.5

1.01.5

2.0

0.2 0.6 1.0 1.4

sigma

RMSPE(Z(B)^ME,Z(B)^MP)

0.00.2

0.40.6

0.81.0

0.2 0.4 0.6 0.8 1.0 1.2 1.4

theta

RMSPE(Z(B)^ME,Z(B)^MP)

0.00.2

0.40.6

0.81.0

0.2 0.6 1.0 1.4

theta

RMSPE(Z(B)^ME,Z(B)^MP)

22

Page 70: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

The Non-constant Mean Case:

Suppose now that

µY (s) =p

j=1

βjfj(s),

where β = (β1, . . . , βp)′ ∈ Rp are unknown re-

gression parameters, (f1(s), . . . , fp(s))′ are known

location-dependent covariates.

In this case we have:

• The result on optimal point prediction

(Theorem 1) can be easily extended.

• The results on optimal block prediction

(Theorems 2 and 3) cannot be extended.

23

Page 71: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

FIRST CONCLUSIONS

• New point and block predictors for log-Gaussian

processes have been proposed that improve

upon existing ones.

• The lognormal kriging point and block

predictors have (near) optimality properties in

the original scale.

• The lognormal kriging block predictor is

substantially better than the block predictor

motivated by “permanence of lognormality”.

Also, the best predictor in PB is substantially

better than the the best predictor in PB.

• For random fields with non-constant mean

the optimal results also hold for point predic-

tion, but not for block prediction.

24

Page 72: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

INTERVAL PREDICTION

Here we assume that the mean and covariance

functions of Y (·) are given by

EY (s) =p

j=1

βjfj(s)

covY (s), Y (u) = C(s,u)

f1(s), . . . , fp(s) known covariates

β = (β1, . . . , βp)′ ∈ Rp unknown parameters

C(s, u) parametric covariance function in Rd

satisfying C(s, s) = σ2 > 0

25

Page 73: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

OBSERVED DATA:

Noisy measurements of the random field Z(·)

at known sampling locations s1, . . . , sn ∈ D:

Zi,obs = Z(si)εi, i = 1, . . . , n,

log(εi)iid∼ N(0, σ2

ε ) are measurement errors

distributed independently of Z(·), and σ2ε ≥ 0.

GOAL:

obtain prediction interval for Z0 = Z(s0), the

unobserved value of the process at s0 ∈ D,

based on Z = Zi,obsni=1.

Model parameters are β ∈ Rp and ϑ ∈ Θ ⊂ Rq

include σ2ε , σ2 and other parameters in C(s, u).

26

Page 74: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Common approach to construct prediction

intervals for Z(·) is to transform prediction

intervals for Y (·). Let

Y = log(Z) and Y0 = log(Z0)

The BLUP of Y0 based on Y and its mean

squared prediction error are

Y0(ϑ) = λ′0(ϑ)Y , σ2

0(ϑ) = σ2−2λ′0(ϑ)c0(ϑ)+λ′

0(ϑ)Σϑλ0(ϑ)

with

λ′0(ϑ) = (c0(ϑ) + X(X ′Σ−1

ϑ X)−1(x0 − X ′Σ−1ϑ c0(ϑ)))′Σ−1

ϑ

X = (fj(si))n×p, x0 = (f1(s0), . . . , fp(s0))′

Σϑ, c0(ϑ) are the n × n and n × 1 matrices:

Σϑ,ij = C(si, sj)+σ2ε 1i = j, c0(ϑ)i = C(s0, si).

Σϑ is positive definite for any ϑ ∈ Θ.

We start by assuming that ϑ is known.

27

Page 75: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

It follows

(

Y0

Y0(ϑ)

)

∼ N2

((

x′0β

x′0β

)

,

(

σ2 λ′0(ϑ)c0(ϑ)

λ′0(ϑ)c0(ϑ) λ′

0(ϑ)Σϑλ0(ϑ)

))

so T = Y0 − Y0(ϑ) ∼ N(0, σ20(ϑ)) is a pivot for

the prediction of Y0.

Then a 1 − α prediction interval for Y0 is

Y0(ϑ) ± Φ−1(1 − α/2)σ0(ϑ)

and a 1 − α prediction interval for Z0 is

expY0(ϑ) ± Φ−1(1 − α/2)σ0(ϑ)

We denote this PI as IN0 (α, ϑ) and call it the

standard 1 − α prediction interval for Z0.

28

Page 76: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

SHORTEST PREDICTION INTERVALS:

KNOWN COVARIANCE CASE

Consider the family of 1−α prediction intervalsfor Z0

F0 = ( expY0(ϑ) − Φ−1(1 − γ)σ0(ϑ),

expY0(ϑ) − Φ−1(1 − α + γ)σ0(ϑ)) : γ ∈ [0, α)

which includes the standard prediction interval

(obtained for γ = α/2).

THEOREM. Let α ∈ (0,1), ϑ ∈ Θ and s0 ∈ D.

Then the shortest prediction interval in F0 is

the one corresponding to the value

γ = γopt0 = γopt

0 (α, ϑ) ∈ (0, α/2)

which is the (unique) solution to the equation

Φ−1(1 − γ) − Φ−1(1 − α + γ) = 2σ0(ϑ)

Hence the shortest 1 − α PI for Z0 in F0 is

IS0 (α, ϑ) = ( expY0(ϑ) − Φ−1(1 − γopt

0 )σ0(ϑ),

expY0(ϑ) − Φ−1(1 − α + γopt0 )σ0(ϑ))

29

Page 77: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

COMPARISON

Let RL(IN0 (α, ϑ), IS

0 (α, ϑ)) be defined as

len(IS0 (α, ϑ))

len(IN0 (α, ϑ))

=expΦ−1(1 − α + γopt

0 )σ0(ϑ) − exp−Φ−1(1 − γopt0 )σ0(ϑ)

expΦ−1(1 − α/2)σ0(ϑ) − exp−Φ−1(1 − α/2)σ0(ϑ)

Consider D = [0,1] × [0,1] and random field

Z(s) = expY (s), where Y (s), s ∈ D is Gaus-

sian with constant mean and Matern covari-

ance function

C(s, u) =σ2

2θ2−1Γ(θ2)(

l

θ1)θ2Kθ2(

l

θ1),

l = ||s − u|| is Euclidean distance

ϑ = (σ2, θ1, θ2) are covariance parameters.

We consider the cases θ2 = 0.5 and θ2 = 1.5

Also assume σ2ε = 0 (no measurement error).

30

Page 78: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

x−coordinate

y−co

ordi

nate

Fifty sampling locations () chosen at random

within the region D and prediction locations

s0 = (0.5,0.5), (0.3,0.8) and (0.9,0.9) (×).

31

Page 79: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

32

Page 80: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

0.51.0

1.52.0

0.2 0.4 0.6 0.8 1.0

σ2

θ

34

Page 81: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

0.5 1.0 1.5 2.0

0.2

0.4

0.6

0.8

1.0

σ2

θ

θ2 = 0.5

35

Page 82: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Findings:

• Length reductions in the range 1–35%

• Length reductions decrease when confidence

level increases

• Length reductions decrease when

smoothness of the process increases

• The largest reductions are obtained in mod-

els with highly asymmetric marginals (σ2 large)

and moderate to weak dependence (θ small)

36

Page 83: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

SHORTEST PREDICTION INTERVALS:

UNKNOWN COVARIANCE CASE

The previous prediction intervals depend on ϑ.

The most immediate fix to this problem is to

use IN0 (α, ϑ) and IS

0 (α, ϑ), where ϑ = ϑ(z) is

an estimate of ϑ.

These are called plug-in prediction intervals.

The drawback is that plug-in PIs have coverage

properties that differ from the nominal cover-

age properties, usually having smaller coverage

than the desired coverage since these PIs in-

tervals do not take into account the sampling

variability of the parameter estimates.

A solution is to calibrate these plug-in PIs:

Cox (1975) and Beran (1990).

37

Page 84: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Calibrated Prediction Intervals

The coverage probability function of IS0 (α, ϑ)

is defined as

π0(α, ϑ) = PϑZ0 ∈ IS0 (α, ϑ(Z))

We start by estimating π0(·, ϑ) with π0(·, ϑ).

The basic idea of calibrating plug-in prediction

intervals is to find αc ∈ (0,1) for which it holds,

exactly or approximately, that

π0(αc, ϑ) = 1 − α,

The calibrated prediction interval is IS0 (αc, ϑ),

which by construction has coverage probability

close to 1 − α.

π0(·, ϑ) is usually not available in closed form

so it needs to be approximated.

38

Page 85: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Bootstrap Calibration

Let π∗0(·, ϑ) be a Monte Carlo estimate of π0(·, ϑ).

One way is simulate (Z∗j , Z

∗0j), say B times,

from the lognormal model with parameters β =β(z) and ϑ = ϑ(z), and estimate π0(x, ϑ) with

1

B

B∑

j=1

1Y0(ϑ∗j)−Φ−1(1−γopt∗

0j )σ0(ϑ∗j) < Y ∗

0j < Y0(ϑ∗j)+Φ−1(1−x+γopt∗

0j )σ0(ϑ∗j)

Y ∗0j = log(Z∗

0j), ϑ∗j = ϑ(Z∗

j) and γopt∗0j = γopt

0 (x, ϑ∗j).

A better way is to use ‘Rao-Backwellization’

based on the identity

π0(α, ϑ) = Eϑ

Φ(U0(α, ϑ,Y) − η0(0, ϑ,Y)

τ0(ϑ)

)

− Φ(L0(α, ϑ,Y) − η0(0, ϑ,Y)

τ0(ϑ)

)

L0(x, ϑ,Y) = Y0(ϑ) − Φ−1(1 − γopt0 )σ0(ϑ)

U0(x, ϑ,Y) = Y0(ϑ) + Φ−1(1 − x + γopt0 )σ0(ϑ)

Expectation is wrt Y when β = 0.

39

Page 86: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Algorithm:

Step 1. Compute the ML (or REML) estimate ϑ = ϑ(z)from the observed data z.

Step 2. Simulate B independent and identically distributedbootstrap samples Y∗

j : 1 ≤ j ≤ B from the Gaus-

sian random field Y (s), s ∈ D with β = 0 and

ϑ = ϑ.

Step 3. For each j = 1, . . . , B, compute the estimateϑ∗

j = ϑ(exp(Y∗j)) based on the bootstrap sample

Y∗j .

Step 4. For each s0 ∈ D where a PI is sought, computeL∗

0j = L0(x, ϑ∗j ,Y

∗j), U∗

0j = U0(x, ϑ∗j ,Y

∗j) and for

x ∈ (0,1) estimate π0(x, ϑ) by

π∗0(x, ϑ) =

1

B

B∑

j=1

[Φ(U∗

0j − η∗0j

τ0) − Φ(

L∗0j − η∗

0j

τ0)]

where η∗0j = η0(0, ϑ,Y∗

j) and τ0 = τ0(ϑ).

Finally, αc is found as the solution (in x) of

π∗0(x, ϑ) = (1 − α)

40

Page 87: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Illustration:

0.90 0.92 0.94 0.96 0.98 1.00

0.80

0.85

0.90

0.95

1.00

1 − x

π 0* (x, ϑ

)

s0

(0.5,0.5)(0.3,0.8)(0.9,0.9)

0.90 0.92 0.94 0.96 0.98 1.00

0.80

0.85

0.90

0.95

1.00

1 − xπ 0* (x

, ϑ)

s0

(0.5,0.5)(0.3,0.8)(0.9,0.9)

0.90 0.92 0.94 0.96 0.98 1.00

0.80

0.85

0.90

0.95

1.00

1 − x

π 0* (x, ϑ

)

s0(0.5,0.5)(0.3,0.8)(0.9,0.9)

0.90 0.92 0.94 0.96 0.98 1.00

0.80

0.85

0.90

0.95

1.00

1 − x

π 0* (x, ϑ

)

s0(0.5,0.5)(0.3,0.8)(0.9,0.9)

When the data have no nugget the effect of calibration

is often minor.

But when the data contain measurement error the effect

of calibration tends to be substantial.

41

Page 88: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Comparison of plug-in and calibrated PIs

σ2 0.1 1.0θ 0.1 0.5 0.1 0.5

σ2ε = 0

plug-in 1.068 0.551 5.620 3.004standard [.939] [.951] [.940] [.948]plug-in 1.033 0.545 4.450 2.779shortest [.939] [.950] [.942] [.948]

calibrated 1.124 0.557 6.111 3.050standard [.949] [.953] [.951] [.951]calibrated 1.068 0.548 4.694 2.795shortest [.946] [.951] [.949] [.949]

σ2ε = σ2

4plug-in 1.011 0.522 6.131 2.897

standard [.899] [.895] [.899] [.896]plug-in 0.981 0.518 4.908 2.686shortest [.900] [.895] [.909] [.902]

calibrated 1.164 0.544 8.597 3.041standard [.934] [.908] [.936] [.910]calibrated 1.098 0.531 6.750 2.775shortest [.928] [.904] [.937] [.911]

σ2ε = 0

plug-in 1.225 0.727 6.507 4.025standard [.933] [.936] [.934] [.933]plug-in 1.174 0.716 4.945 3.537shortest [.934] [.937] [.937] [.936]

calibrated 1.313 0.779 7.367 4.422standard [.948] [.952] [.949] [.949]calibrated 1.232 0.750 5.400 3.776shortest [.944] [.948] [.947] [.947]

σ2ε = σ2

4plug-in 1.175 0.697 6.960 3.907

standard [.891] [.896] [.890] [.896]plug-in 1.129 0.686 5.345 3.445shortest [.894] [.897] [.903] [.901]

calibrated 1.377 0.761 10.168 4.390standard [.932] [.923] [.933] [.923]calibrated 1.285 0.731 7.636 3.755shortest [.927] [.917] [.935] [.921]

44

Page 89: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

EXAMPLE:

Data on cadmium (Cd) concentrations (in ppm)measured at 259 locations in a region of about15 km2 in the Switzerland, collected in 1992.

Measurements at 100 locations are used forvalidation.

Exploratory analysis suggests the “best” modelis the log-Gaussian random field associated withconstant mean, nugget and exponential covari-ance function:

β1 = 0.084, σ2 = 0.4, θ1 = 0.177 and σ2ε = 0.073.

log(Cd ) concentration (log(ppm))

rela

tive

frequ

ency

−3 −2 −1 0 1 2 3

0.0

0.1

0.2

0.3

0.4

0.5

0.0 0.5 1.0 1.5 2.0

0.0

0.1

0.2

0.3

0.4

0.5

0.6

distance (km)

sem

ivario

gram

of l

og(C

d)

45

Page 90: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

plug-in calibratedstandard shortest RL standard shortest RL

1 1.169 1.053 0.900 1.176 1.053 0.8902 4.763 4.201 0.882 4.853 4.230 0.8713 4.527 3.908 0.863 4.663 3.999 0.8584 3.550 3.094 0.871 3.636 3.130 0.8615 3.337 2.881 0.863 3.433 2.940 0.8566 2.999 2.615 0.872 3.100 2.680 0.8647 3.984 3.494 0.877 4.075 3.568 0.8758 2.738 2.363 0.863 2.832 2.401 0.8479 2.813 2.479 0.881 2.873 2.502 0.87110 3.817 3.326 0.871 3.911 3.393 0.867

The relative lengths of the shortest prediction intervals

with respect to the standard prediction intervals vary

from 0.84 to 0.89, with an average of 0.86, so on av-

erage the shortest prediction intervals are about 14%

shorter than the standard prediction intervals.

For each of the 100 validation locations computed each

type of 95% prediction interval for the Cd true value,

and determined whether of not each Cd observed value

falls into the corresponding prediction interval.

The proportion of 95% plug-in standard, plug-in short-

est, calibrated standard and calibrated shortest predic-

tion intervals covering the corresponding Cd observed

values were, respectively, 0.93,0.93,0.93 and 0.95. The

calibrated shortest prediction intervals seem to have cov-

erage close to nominal.

46

Page 91: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

A Tale about Hockey Sticks!

[email protected]!

Page 92: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Before the Tale, A few Chance Encounters!

Gerald R. North!Atmos. Sci.!Texas A&M!

Page 93: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Sampling Errors were a concern for the proposed Tropical Rainfall Measuring Mission.!

•  1985!

TRMM!

Page 94: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

I was trying to learn statistics.!I read engineering books, economics books, Bulmer, etc.!Then . . .!

Page 95: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

One of the key papers demonstrating the plausibility of TRMM!

Page 96: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

TRMM Orbits!

Page 97: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Launched in 1997, TRMM is still flying!

Page 98: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Not all of my encounters were uncorrelated!

Page 99: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 100: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Just a sample:!

Page 101: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Fast Forward to the Present Century!

Page 102: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Global Warming Goes to Washington (2006)!

Page 103: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Mann, Bradley, Hughes, 1999: Geophys. Res. Lett., p759.!

Did Earth cool gradually, then heat up fast?!

Page 104: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

It’s the Notorious Hockey Stick!!

1000AD! 2000AD!http://en.wikipedia.org/wiki/Hockey_stick_controversy!

Page 105: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Past Climates can be Estimated from Proxy Data.!

Page 106: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Instrument records go back over a century.!

Page 107: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 108: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Trend in deg C/decade since 1950!

NRC Report 2006 !

Page 109: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Hockey Stick Time Line!

•  Mann, Hughes, Bradley (1998, 1999)!•  Intergovernmental Panel on Climate Change

(IPCC 2001)!•  Enter the Amateurs (M&M, 2005)!•  Enter Congressman Barton, then Boehlert (06)!•  Enter the National Academy of Sciences!•  Battling Banjos on the Hill!

Page 110: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

2001 IPCC features the Hockey Stick: it becomes an Icon !

Page 111: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Enter the Amateurs (M&M, 2005)!

Steve!McIntyre!

Page 112: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

A very crude estimate of global temps was featured in the 2000 IPCC Report. !

Page 113: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Enter Congressman Joe Barton!

•  Wegman Report!

Chairman, Energy & Commerce Committee

(2006)

Wegman,

Scott, Said

Page 114: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

NRC Report!

North, Wallace, Christy, Cuffey, Turekian, Druffle, Otto-Bliesner, Nychka, Bloomfield, Biondi, Roberts, Dickinson

12 Anonymous Referees, 2 Monitors

SURFACE TEMPERATURE RECONSTRUCTIONS FOR THE LAST 2,000 YEARS Committee on Surface Temperature Reconstructions for the Last 2,000 Years

Board on Atmospheric Sciences and ClimateDivision on Earth and Life Studies NATIONAL RESEARCH COUNCIL OF THE NATIONAL ACADEMIESTHE NATIONAL ACADEMIES PRESS Washington, D.C.

Page 115: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Jerry & Ed with 1990 graphic.!

From Geotimes, Sept 2006!

Climatologist Gerald North (foreground) and statistician Edward Wegman testified in front of the House Energy and Commerce Subcommittee on Oversight and Investigations in July about the famed hockey stick climate analysis. Photograph is by Christine McCarty, House Committee on Energy and Commerce.

Page 116: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Kinds of Information Available!

•  Tree Rings!•  Ice Cores and Isotopes!•  Bore Holes (ice and ground)!•  Glacial Lengths!•  Historical Records!•  Sediments (lake and oceanic)!•  Corals, Pollen, Caves, Entomology!

Thermometer Records for the Last 150 years

Proxies for Extrapolation Back in Time:!

Page 117: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

How to Reconstruct the Field (Tree Ring Example)!

•  Ring Widths are Correlated to Environmental Conditions!

•  Use the Instrumental Period to Set the Temp Scale on the Ring Widths!

•  Statistical Issues: !– Stationarity, Confounding Variables!– Range Match!– Verification Period!

Page 118: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Making a Record from many Trees

Page 119: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 120: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Reconstructing a Temperature Field using proxies, e.g., tree rings!

Page 121: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Tree Ring Sites!

Page 122: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Ice Cores!

•  Chronology (counting back, other)!•  Gas Bubbles & Volcanic Ash!•  Temps from 18O Isotopes !•  Temperatures Down the Bore Hole!

Page 123: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Lonnie Thompson’s group collects ice cores.

Page 124: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

NRC 2006 Graphic of STR:

Page 125: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 126: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

NRC Committee Concludes!•  Last 100 yrs: Temps up 0.6 deg C!

– (highly likely ~95% Confident)!•  30 yr averages warmest in 400 yrs!

– (likely~2 to 1 odds) # #!•  30 yr averages warmest 1000 yrs!

– (Plausible ~reasonable, not possible to bring a convincing argument against - no numbers) !

http://epw.senate.gov/pressitem.cfm?id=257697&party=rep!

“Today’s NAS report reaffirms what I have been saying all along, that Mann's ‘hockey stick’ is broken,” Senator Inhofe said. “Today’s report refutes Mann's prior assertions that there was no Medieval Warm Period or Little Ice Age.”!

Page 127: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 128: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

My chance brushes with Ben, Ta-Hsin, Peter, and Manny, (Ed W., Too) have enriched my life and I think we’ve done some good. !

At least we had fun tryin’.!

Page 129: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

1

Introducing a Fractional INAR(1) Model for Time Series of Counts

Harry Pavlopouloswww.stat-athens.aueb.gr/~hgp/

Department of StatisticsAthens University of Economics and Business

76 Patission

Str., GR-10434 Athens, Greece

Title Page

Page 130: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

2Outline

1.

Motivation by an interest on series of pixel counts and relevant links to the “Threshold Method”

for prediction of SARR.

2.

Introduce Randomized Binomial Thinning and a class of Fractional INAR(1) models.

3.

Calculation of Moments up to 2nd

order, under the assumption of

stationarity.

4.

Simulation and Inference.

5.

Application on daily series of global rain-rate fields.

OUTLINE

Presenter
Presentation Notes
outline
Page 131: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

3

( ) T1tj,it )A(n,,2,1j,i;AR

== L be a time series of

instantaneous realizations (measurements) of random“marks” over the n2(A) pixels Ai,j

of a 2D Marked Regular Lattice (MRL) configuration of a random field probed over a fixed

domain A.

MOTIVATION

WLOG:

assume pixels of square shape (with fixed side length),

and non-negative marks(e.g. environmental / geophysical MRL).

Ai,j

Motivation

Let

Page 132: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

4

0

50100

150200

West -> to -> East Distance (Km)

0

50

100

150

200North -> to -> South Distance (Km)

010

2030

4050

6070

Rai

n R

ate

(mm

/hr)

240x240 Km^2 TOGA-COARE Region, MIT Radar

Cruise-1, Date 921110, Time 23:21 UTCSARR lattice

Page 133: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

5West -> to -> East Distance (Km)

Sou

th ->

to ->

Nor

th D

ista

nce

(Km

)

0 50 100 150 200 250

050

100

150

200

250

MIT Radar, Cruise-1, Date 921110, Time 23:21 UTCClipping Threshold = 0 mm/hr

Clip 0 mm/hr

Page 134: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

6West -> to -> East Distance (Km)

Sou

th ->

to ->

Nor

th D

ista

nce

(Km

)

0 50 100 150 200 250

050

100

150

200

250

MIT Radar, Cruise-1, Date 921110, Time 23:21 UTCClipping Threshold = 1 mm/hr

Clip 1 mm/hr

Page 135: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

7West -> to -> East Distance (Km)

Sou

th ->

to ->

Nor

th D

ista

nce

(Km

)

0 50 100 150 200 250

050

100

150

200

250

MIT Radar, Cruise-1, Date 921110, Time 23:21 UTCClipping Threshold = 2 mm/hr

Clip 2 mm/hr

Page 136: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

8

West -> to -> East Distance (Km)

Sou

th ->

to ->

Nor

th D

ista

nce

(Km

)

0 50 100 150 200 250

050

100

150

200

250

MIT Radar, Cruise-1, Date 921110, Time 23:21 UTCClipping Threshold = 5 mm/hr

Clip 5 mm/hr

Page 137: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

9

( ) 0ut,nt,n A,uF1)A,u(F≥

−=

is a functional of

random tail-probabilitiescorresponding to the instantaneous

empirical spatial cumulative

distribution functional

(IESCDF):

( ) 0ut,n A,uF≥

u-COVER :( ) ( )[ ]

0u

)A(n

1j,ij,it

2t,n uARI)A(nA,uF

≥=

⎭⎬⎫

⎩⎨⎧

>⋅= ∑

Time series of pixel counts where a MRL exceeds a fixed u-threshold

( )[ ]Tt

An

jijitt uARIAuX

,...,2,1

)(

1,,:),(

== ⎭⎬⎫

⎩⎨⎧

>= ∑

IECDF

is a random functional, parameterized by the u-threshold, providing an important statistical summary of the MRL over A

at each instant t.

motivate an interest

to model

(over-dispersed and persistent) time series of counts.

Page 138: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

10

of the probed underlying spatio-temporal random fieldwhen n(A)

is large enough and/or |Ai,j

|/|A| is small enough (Increasing Domain / Infill Asymptotics).

Lahiri

S.N., Kaiser M.S., Cressie

N., Hsu N.J.(JASA 1999, 94: 86-97)

• IESCDF is a natural predictor of instantaneous spatial-cdf

(ISCDF)

( ) [ ]0uA

t1

t,t, dau)a(RIAA,uF1)A,u(F≥

−∞∞

⎭⎬⎫

⎩⎨⎧

≤⋅=−= ∫

Analogously, u-COVER

is a natural predictor of ISCDF-tails.

Under appropriate assumptions of spatial-temporal homogeneity, rendering stationary ISCDF, modelling

the numerator of

u-COVER

by

a stationary model suitable for time series of counts, could (in

principle) facilitate temporal prediction of spatial tail-probabilities.

ISCDF

Aa);a(R t ∈

Page 139: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

11

Prediction of spatial sample moments (i.e. moments of IESCDF) may also be facilitated by implementation of the “Threshold Method”,

provided that the underlying random field has

sufficient variability in its intermittency between zero and positive marks.

∫∑

∫∑

∫∫

∞+

∞−=

∞+

∞−∞

=

+∞

∞−∞

=⋅=

=⋅=

=⋅=

);()()(

1:)(

);()()(

1:)(

);()(1:)(

,

)(

1,,2

][

,

)(

1,,

)(2

)(

,,,

,)(

,

AudFuARAn

AR

AudFuARAn

AR

AudFudaaRA

AR

tnk

An

jiji

kt

kt

tk

An

jiji

kt

kt

jitk

A

kt

jiji

kt

ji

Spatial Moments

Prediction

Page 140: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

12

THRESHOLD METHOD

Kedem

& Pavlopoulos

[1991, JASA]Short, Shimizu, Kedem

[1993, JAMet.]

Shimizu, Short, Kedem

[1993, J.Met.Soc.Japan]Shimizu & Kayano

[1994, J.Met.Soc.Japan]

Mase

[1996, Ann.Inst.Statistical.Math.]Sakaguchi

& Mase

[2003, J.Japan Statist.Soc.]

( ) ttnkk

t AuFuAR εβ +⋅= ,)()( ,][

0)(||)(

0)(||)()(

,)(

,)(

,)(

,)(

>>

>=

jik

tjik

t

jik

tjik

tk ARuARP

ARAREuβ

within an “optimal”

range of u-thresholds, facilitating/justifying prediction of spatial moments by linear regression on ECDF-tails.

Threshold Method

Page 141: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

13INAR(1)

INAR(1) Model: McKenzie

(1985) ; Al-Osh & Alzaid

(1987)

ttt XpX ε+= −1o tε I.I.D.

innovations ; tsst X <⊥ε

∑−

=− =

1

1,1 :

tX

iitt ZXp o

( ) ),(~ 111 pXBXXp ttt −−−o

Binomial p-Thinning:

]1,0[)0(1)1( ,, ∈==−== pZPZP itit

IID array, independently of i,tZ tX

( )( ) ( ) XuiXpuiXp eppEeEu ⋅⋅⋅ ⋅+−== 1)( ooϕ

Page 142: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

14

Stationary Distribution

Stationary Solution

( ) ( )∏∞

=

⋅+−==0j

jjXX z1GzE)z(G t ααε

( ) 1z,zE)z(G t ≤= εε

( )∞<

≥∑∞

= 1j

t

jjP ε

(strictly) stationary

∑∞

=−=

0jjt

jD

tX εα o

( ) ∞<<⇔ tEp ε&1

Discrete Self-Decomposable Law

tX

Page 143: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

15INAR(1) Moments

2nd

Order Moments

( ) &1 p

XE tX −== εμμ ( ) 2

22

1 ppXVar tX −

+⋅== εε σμσ

( ) 2,)( Xk

kttX pXXCovk σγ ⋅== −

( ) 0,)( ≥== −k

kttX pXXCorrkρ

1)(111)(1)(2

≥⇔≥⎟⎟⎠

⎞⎜⎜⎝

⎛+⎟⎟

⎞⎜⎜⎝

⎛+== εε

μσ ID

ppIDXID

X

X

( )∑∞

−∞=

+⋅−⋅+⋅

=⋅⋅=k

ikXX pp

pekS 2

2

cos21)(1)(

ωπσμγ

πω εεω

Page 144: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

16Model X or ε

)z1(G)z(G)z(G XX ⋅+−= ααε

Model Marginal or Innovation Law ?

)(Geom)1,1(B~)(Geom~X θαεθ ⋅−⇒

Nontrivial~),(NB~X ελβ ⇒

( )λαελ ⋅−⇒ )1(P~)(P~X

( )∞<

≥∑∞

= 1j jjP ε

Model ε-law sο

that

&1)(ID ≥ε

subject to feasibility of inference and simulation.

Page 145: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

17m-Poisson

Innovations

Mixed Poisson

InnovationsPavlopoulos

& Karlis

(Environmetrics

2008; 19: 369-393)

,1,,0 1 ≤≤ mππ L

( ) L,2,1,0,1,!1

=≥⋅⋅== ∑=

− xmx

exPm

i

xi

itiλπε λ

( ) ∏ ∑∞

= =

⎟⎠

⎞⎜⎝

⎛−⋅⋅⋅=

0 1)1(exp

j

m

i

jiiX zzG αλπ

,0,, m1 >λλ L 11

=∑=

m

iiπ

( ) ∑=

−⋅⋅=m

iii zzG

1)1(exp λπε

( ) ( )∑∑ ∑∑=

= =

=

∞<−⋅=⎟⎟⎠

⎞⎜⎜⎝

⎛⋅≤

≥ m

ii

j

m

i

ji

ij

t iejj

jP11 11

1!

λπλπε

( ) εεεεε μμλπμσλπμ >−+== ∑∑==

m

iii

m

iii

1

22

1;

Page 146: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

18

∑=

⋅=m

1i

)i(t

)i(tt Q Λε

ZZ ∈∈ ⊥ tttt ΛQ

( ) ( )m1)m(

t)1(

tt p,,p,1M~Q,,Q LL=Q

( ) ( ))(P,),(P~,, m1)m(

t)1(

tt θθΛΛ LL ⊥=Λindependent sequences of I.I.D. random vectors

A Convenient Representation

( )∑=

−− +⋅=+=m

1i

)i(t1t

)i(tt1tt XQXX Λαεα oo

Convenient Representation

⇒=∑=

1Qm

1i

)i(t

Page 147: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

19PB mixtures

Poisson-Binomial Convolutions( ) ( ) ( )i1t

)i(t1t1t1t

)i(t1t ;,XPB~XXXX λαΛαΛα −−−−− +=+ oo

( ) ( )i1t

m

1i

)i(t1tt ;,XPBQ~XX λα−

=− ⋅∑

( ) ∑−

=

+−−−−− −⋅⋅⎟⎟

⎞⎜⎜⎝

⎛−

⋅⋅=1t

1ti

xz

0k

kzxkz1tki

1ti )1(kz

x!k

exz ααλπ λ

( ) ( )1tti

m

1ii1t1ttt xxpxXxXP −

=−− ⋅=== ∑ πθ

( X t

| X t-1

) ~ mixture of m

Poisson-Binomial

r. v.

More generally, the k-step predictive/conditional law

of ( X t

| X t -

k

) is a mixture of m k

Poisson-Binomial

r. v. with parameters determined

by α, λ1 , …, λm , p1 , …, pm .

Page 148: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

20TOGA-COARE Data

TOGA-COARE DataCruise-1: 10 November -

9 December, 1992)

MIT-Radar (Doppler Precipitation)Location: (2oS,156oE) China Sea, SW Pacific OceanTemporal resolution (sampling frequency) 20 min.Regular section of 184 scans: 22/11(02:01) –

24/11(15:21)

Total duration of section: 61 hr

; 20 min

Pixel SARR Marks: R=(Z/230)0.8

Spatial Scales of probed subregions

(centrally nested):240, 120, 60, 32, 16, 8, 4, 2 Km

Clipping threshold levels of pixel marks:0, 1, 2, 5, 10, 20 mm/hr

Total of 8x6 = 48 time series

Page 149: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

21

Table 1 (sample stats)

Page 150: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

22Figure 1 (Tails: MOM vs

CML)

count of pixels

tail

prob

abilit

y

0 50 100 150 200 250 300

0.0

0.4

0.8

scale = 120 km, threshold=5mm/hr

count of pixels

tail

prob

abilit

y

0 50 100 150 200 250 300

0.0

0.4

0.8

scale = 120 km, threshold=5mm/hr

count of pixels

tail

prob

abilit

y

0 50 100 150

0.0

0.4

0.8

scale = 60 km, threshold=1mm/hr

count of pixels

tail

prob

abilit

y

0 50 100 150

0.0

0.2

0.4

0.6

scale = 60 km, threshold=1mm/hr

count of pixels

tail

prob

abilit

y

0 10 20 30 40

0.0

0.4

scale = 60 km, threshold=20mm/hr

count of pixels

tail

prob

abilit

y0 10 20 30 40

0.0

0.2

0.4

scale = 60 km, threshold=20mm/hr

count of pixels

tail

prob

abilit

y

0 5 10 15 20

0.0

0.2

0.4

scale = 32 km, threshold=10mm/hr

count of pixels

tail

prob

abilit

y

0 5 10 15 20

0.0

0.10

0.25

scale = 32 km, threshold=10mm/hr

Page 151: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

23Figure 3

(Case: 120 Km @ 5 mm/hr)

lag ( x 20 minutes)

acf v

alue

0 5 10 15 20 25 30

-0.2

0.2

0.6

1.0

(a) ACF

frequency

pow

er d

ensi

ty

0.0 0.5 1.0 1.5 2.0 2.5 3.0

050

0015

000

(b) Spectrum

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

020

040

0

(c) 1-step predictions from one replication

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

010

020

030

0

(d) 1-step percentile predictions from 1000 reps

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

020

040

0

(e) 2-step predictions from one replication

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

010

020

030

0(f) 2-step percentile predictions from 1000 reps

Page 152: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

24Figure 4

(Case: 60 Km @ 1 mm/hr)

lag ( x 20 minutes)

acf v

alue

0 5 10 15 20 25 30

-0.2

0.2

0.6

1.0

(a) ACF

frequency

pow

er d

ensi

ty

0.0 0.5 1.0 1.5 2.0 2.5 3.0

010

0030

00

(b) Spectrum

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

050

100

200

(c) 1-step predictions from one replication

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

050

100

(d) 1-step percentile predictions from 1000 reps

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

050

100

200

(e) 2-step predictions from one replication

time (x 20 minutes)

coun

t of p

ixel

s

0 50 100 150

050

100

(f) 2-step percentile predictions from 1000 reps

Page 153: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

25

Randomized Binomial Thinning

∑=

=X

iP iZXP

1)(:o

1)( ≥iP iZ

X is a non-negative integer-valued random variable, P is a [0,1]-valued random variable. (X , P) may be jointly distributed or stochastically independent.

array of binary r.v.’s, which conditionally on P = p are I.I.D., following Bin(1,p), independently of X.

ppPiZPpPiZP PP ===−=== ||0)(1||1)(

Page 154: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

26RBT vs

BT & properties

( )( ) ),(;

..;

pxBxXpPXP

BTRBTeiXppPXPD

D

===

⊃==

o

oo

(always)

( )( ) ( ) ( ) Xui

XP

XPuiXPui

ePPEu

XPeEEeE⋅

⋅⋅⋅⋅

⋅+−=

=

1)(

,||

o

oo

ϕ

( ) ( ) XPPXPPD

ooo 2121 =

( ) ( ) ( )YPXPYXPD

ooo +=+if X,Y,P are mutually independent OR if X,Y are independent conditionally on P.

Page 155: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

27Fractional INAR(1)

Fractional INAR Model: tttt XPX ε+= −1o tε I.I.D.

innovations: non-negative integer-valued noise

( ) ( ) ( )tsXPtand stt <⊥∀ ;, σε

∑−

=− =

1

11 )(:

tX

ittt iZXP o

Auto-correlated thinning process:

[0,1]-valued tP

ttP ε⊥

so that, conditionally on Pt , the array Z t (i) ; i≥1

is comprised of I.I.D.

Bin(1,Pt

) random variables, independently of σ(Xs

; s < t).

( )]1,0[.;

1;;;)(

,,~;)(

2

2)0(||

2

2

∈+

=

=

∨−−− etca

aeeeaf

HFGNaafP

aaa

ttt σμ

Page 156: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

28Iterations -

Stationarity

k-step Iteration (future to present):

t

k

iitikt

k

i

i

jjktkt

D

kt XPPX oo ⎟⎟⎠

⎞⎜⎜⎝

⎛+⎟⎟

⎞⎜⎜⎝

⎛+= ∏∑ ∏

=+−+

=

=−+++

1

1

1

1

0

εε

Representation of Stationary Solution! Necessary Conditions !!!Does a Stationary Solution Exist? || || ||Under What Conditions?

\/ \/ \/

∞-step Iteration (present to remote past):

iti

i

jjtt

D

t PX −

=

=−∑ ∏ ⎟⎟⎠

⎞⎜⎜⎝

⎛+= εε o

1

1

0

( )( )( )

( )⎪⎩

⎪⎨

∞→→<=∞<=

⇒−

=⎟⎠

⎞⎜⎝

⎛+= ∑

= tasPPEPE

EPPE

t

tP

t

PiiX

,0...1

1...1

11

1 μεμ

μμμμ

εε

ε

Page 157: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

29Mean & Variance

1-step Iteration (future to present):

( ) ( ) ( )( )

PX

XPttttttX

tttt

XPEXPXPEEXPEEXE

μμμ

μμμμμμε

ε

εεε

−=

+=+=+=+=

+++

+++

1

,|| 111

111

o

o

Stationary Mean

( ) ( ) ( ) ( ) ( )( )

( ) ( )

( )( )2222

22

22

21

211

22221

21

2222

112

12

12

1

111

1

12

,||2

2

PPP

PP

PP

PX

tttttXPXX

ttttXPXX

ttttttt

PXPPXE

XPXPEE

XPEEXPEEXE

μμσμμμσ

μσμμσσ

μμμμσμσ

μμμμσμσ

εε

εεεε

εεε

εεε

−−−−+

+−−

+=

+−+++=+

+++=+

++=

+++

++

+++++

o

oo

Stationary Variance

Page 158: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

30Overdispersion

Index of Dispersion under Stationarity

( ) ( )( )( )

( )

11

111

111

11

2

2

22

2

22

2

>⇒≥∴

−−>⇔>

−−−−+

++−−

−==

X

P

PX

PPP

PPP

PP

P

X

XX

δδ

μσμδδ

μμσμμσδδ

μσμ

μσδ

ε

εε

εε

Page 159: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

31ACV(k)

ACVF(1)

( ) ( ) ( )( )( )

( )2

2221

112

111

)1(1

,||)1(

XPXP

X

XXPXttX

tttttXXX

ttttttt

XPE

XPXXPEE

XXPEXEXXE

σμγμ

μμ

μσμμμμμ

μμμγ

ε

ε

εε

ε

=⇒−

=

++=+=

+=+

+=

+

++

+++

o

o

( ) ( ) ( )( ) ( )( )( )( )( )

( ) )1()2(

)1()2(2222

2222211222

PXXXPX

XXPPXPXXX

ttttttttttt XXPPEXPEXEXXE

γμσσμγ

μσμγμμμμμμγ

εε

εε

++=⇒

++++=+⇒

++= ++++++ oo

ACVF(2)

( ) ⎟⎟⎠

⎞⎜⎜⎝

⎛++⎟⎟

⎞⎜⎜⎝

⎛+−= ∏∑ ∏

=

=

=−

k

iiXX

k

i

i

jjkXXPX PEPEk

1

221

1

1

0

22)( μσμμμμγ ε

ACVF( k ≥

3)

Page 160: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

32Calculating E(P1…Pk)

( ) ( )( ) ( )

( ) ( ) ( ) ( )( )

( ) ( )

ttt

t

tt

ttt

tjti

HHHat

tii

tii

t

iiii

t

iit

ttt

LLVthatsuchmatrixtriangularlowergularnonL

VofrseigenvectoofmatrixorthogonalQLQ

jijijijiV

ofseigenvalueNIIDW

WEaEPPEthen

HRFGNaaPIf

'sin:

:',...,'',...,

1||21||2

||

0&)1,0(

expexp...

,12/1,0,~&exp

11

,...,1,...1

2222

11

1

2

1

21

22

=−−

=

⎟⎟⎠

⎞⎜⎜⎝

⎛−−+−−+−=−=

⎭⎬⎫

⎩⎨⎧

⎟⎠

⎞⎜⎝

⎛−−=

⎭⎬⎫

⎩⎨⎧

⎟⎠

⎞⎜⎝

⎛−=

≤≤>∈−=

==

==

==∑∑

μμωω

σγ

λ

ωλ

σμ

Page 161: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

33Case μ=0

& t=1

The Case of FGN with μ=0

( )

( ) ( )∏∏

∑∑

=

=

==

+=+=

⎭⎬⎫

⎩⎨⎧

⎟⎠

⎞⎜⎝

⎛−=

⎭⎬⎫

⎩⎨⎧

⎟⎠

⎞⎜⎝

⎛−=

t

ii

t

ii

t

iii

t

iit WEaEPPE

1

2/1*2

1

2/1

1

2

1

21

121

expexp...

λσλ

λ

tttii VVofseigenvalue 2

*1

* 20σ

λ =≥ =

( ) ( ) ( )( ) ( )( ) ( )( ) ( )

( ) ( ) 122/12

122222

2/1221

*1

2141

212exp

21exp

2

−−

+−+=

+−−=−=

+=−===

=

σσ

σμσ

σμ

λ

tPtP

ttP

aEPE

aEPEPE

For t=1:

Page 162: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

34Case μ=0

& t=2,3

The Case of FGN with μ=0

HH 44;4 *2

*1 −== λλFor t=2:

For t=3:

( ) ( )2

116429854295429

94212

*3,2

*1

−−⋅+⋅−+⋅−±+⋅−=

−⋅+=

HHHHHHH

HH

λ

λ

Page 163: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

350-Inflated Poisson

0-inflated Posson

innovations

( )

22 )1(

!1)1(

!

)1,0(11;0;1,0

θθσ

θμ

θθε

θε

ε

ε

θ

θθ

θθ

⋅−⋅+⋅=

⋅=

⋅−

⋅−=⋅⋅==

∈−−

=>∈==

−−

−−

lll

l

keepl

kekP

eplepP

kk

t

t

Overdispersion

Page 164: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

36MOM inference

Method of Moments Inference

( )

( ) ( )

( ) ( )

( ) ( )XXXXT

XXXXT

XXXXT

XXT

S

XT

X

t

T

ttXX

t

T

ttXX

t

T

ttXX

T

ttX

T

ttX

−⋅−==

−⋅−==

−⋅−==

−−

==

==

+

=

+

=

+

=

=

=

3

3

1

2

2

1

1

1

1

1

222

1

1)3()3(

1)2()2(

1)1()1(

11

1

γγ

γγ

γγ

σ

μ

)

)

)

Page 165: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

37Conclusion

Concluding Remarks

• SUFFICIENT & NECESSARY CONDITIONS FOR STATIONARITY

• REMEDY INSTABILITIES OF FITTED MODEL BY INTRODUCING μ ≠ 0

• SOLVE NUMERICALLY A HIGHLY NON-LINEAR 5X5 SYSTEM OF MOM EQUATIONS

• COME UP WITH ALTERNATIVE INFERENCE PROCEDURES (other than Method of Moments)

FAREWELL DEAR

BENJAMIN

Page 166: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Distributions of statistics of hidden state sequences through

the sum-product algorithm

Donald E.K. Martin, NC State Univ. John A.D. Aston, Warwick Univ.

Page 167: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Outline

• Introduction• Background Information • Computation of Distributions in Hidden

State Sequences• Conclusion

Page 168: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Introduction

• Let be an observed sequence

• is a corresponding hidden state sequence

• We study inference for statistics of conditional on

1 , , no o≡o …

1 , , ns s≡s …

so

Page 169: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• States can serve as labels for -- Krogh (1997), Durbin et al. (1988) DNA-- Hamilton (1989) business cycles

(may be interested in “runs”)

• States can be the true values of a noisy observed sequence

-- McEliece (1998) transmitted codewords-- Baxter (1982) noisy image pixels

o

Page 170: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Past approaches to such problems:

-- determine Viterbi (most likely) sequence and obtain the statistic from that sequence-- sample from

-- Aston and Martin (2007) introduced a Markov chain based method to compute exact distributions for statistics of HMM states

( )p s o

Page 171: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Background Information

• The model• Factor graphs• Sum-product algorithm

Page 172: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• We use the discriminative model, a conditional random field

1( ) ( , )( ) c c

c

pZ ∈

= Ψ∏C

s o s oo

( ) ( , )c ccZ

∈= Ψ∑∏s C

o s o

( , ) exp ( , )c c l l clfλ⎡ ⎤Ψ = ⎣ ⎦∑s o s o

The model

Page 173: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• If each takes on values then takes on , and thus, in general, computation of is intractable for large

• We consider models with dependence structure that allows exact inference to be performed

js s

( ) ( , )c ccZ

∈= Ψ∑∏s C

o s on

KnK

Page 174: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Factor graphs

• Corresponding to the model is a factor graph , a bipartite graph with two types of nodes: -- variable nodes-- factor nodes (with edges connecting the functions to their arguments)

• Graph describes how a global function factors into a product of local functions

( , )F V E=

Page 175: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

1 5( , , )p s s =o… [ ] 14 4 5 3 3 4 2 1 2 4 1 1( ) ( , , ) ( , , ) ( , , , ) ( , )Z s s s s s s s s− Ψ Ψ Ψ Ψo o o o o

Factor graph corresponding to the joint distribution

Page 176: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Note that “generative” models such as Bayesian networks and HMMs can be represented by our model

e.g. for HMMs1( , , )

1 211( , ) ( ) ( )

t t t ts s o

nt t ttt

p s s o sξ ξ−Ψ

−==∏s o

( ) ( , )Z p= ∑ so s o

( ) ( , ) / ( )p p Z=s o s o o

Page 177: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

The sum-product algorithm (SPA), Kschischang et al. (2001)

• Operates in a cycle-free factor graph to compute marginal functions by exploiting the way a global function factors

• Nodes are treated as “processors” and “messages” are sent between them

• Essentially equivalent to the “generalized distributive law” (Aji and McEliece, 2000) over junction trees

Page 178: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• A variety of algorithms in artificial intelligence, signal processing and digital communications can be derived as instances of the SPA-- forward/backward algorithm for HMMs-- Viterbi algorithm (max-product semiring)-- interative “turbo” decoding -- Pearl’s belief propagation for Bayesian networks -- Kalman filter-- certain FFT algorithms

Page 179: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

[ ]4 3 1 2

11 1

43 442 4

54 5

14 4 5 3 3 4 2 1 2 4 1 1

, ( )( )

( )

( )

( ) ( , , ) ( , , ) ( , , , ) ( , )s

ss

s

s s s s ss

s

s

Z s s s s s s s sμ

μμ

μ

Ψ →

Ψ →Ψ →

Ψ →

−= Ψ Ψ Ψ Ψ∑ ∑ ∑o o o o o5

5( ) ( )s

p s p=∑o s o∼

4 5

5

5( ) ( )ss

Z sμΨ →=∑o

3Ψ2Ψ

2s3s

1s 4s

5s

1 1 1( )s sμΨ →

2 4 4( )s sμΨ →3 4 4( )s sμΨ →

4 5 5( )s sμΨ →

Page 180: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• A perfect elimination sequence of the SPA is one where marginalization can be carried out without enlarging local domains

• More than one perfect elimination sequence can exist

Page 181: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• To obtain :

• is treated as the root of the graph • Initial messages are sent from leaves• Message sent from node to its parent after it

has received messages from its children

js

( )jp s o

( )\

( ) ( )t c d t

d t c

s t s tne s

s sμ μ→Ψ Ψ →Ψ ∈ Ψ

= ∏

\ \

( ) ( , ) ( )c v u c

c u c

s v c c s us s s

s sν ν

μ μΨ → →Ψ∈

⎛ ⎞= Ψ⎜ ⎟⎜ ⎟

⎝ ⎠∑ ∏s s

s o

;

Page 182: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

For a factor graph with cycles:

• Obtain a spanning tree

• A variable is “stretched” by including it in variable nodes along the unique path of from the variable to nodes of that can be reached from it on a path of length two

• Delete redundant edges

F

T

FT

Page 183: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

[ ] 15 4 5 4 3 5 3 2 4 2 1 3 1 1 2( ) ( ) ( , , ) ( , , ) ( , , ) ( , , ) ( , , )p Z s s s s s s s s s s−= Ψ Ψ Ψ Ψ Ψs o o o o o o o

Page 184: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Computation of Distributions of Statistics of a Hidden State

Sequence

• We compute distributions of a count statisticassociated with a pattern

• is the order of evaluating

( )( )(1)1( , , ) , ,n n nhθ σ σ θ θ Θ= ∈Θ=… …

1, , nσ σ… nθ

Page 185: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• A sequential ordering is most convenient when the pattern has words of length greater than one

• For that case we introduce to keep track of pattern progress

• If the pattern consists of singletons, we can evaluate the statistic in any order (e.g. a multinomial distribution)

1 1, , , ,n ns sσ σ =… …

1( , , )t tq g s s= …

Page 186: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Let

• We first obtain an vector that has entries

• Then we obtain a vector holding probabilities for through

(1) ( )( , ) , ,( , )Q q q ωθ θ×Θ= …

1ω× nγ( ) ( )P r , , j

n nq qθ θ⎡ ⎤=⎣ ⎦

nγ1Θ ×

n nω

γ γΘ×

= Λ

Page 187: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• For each component of ,

• The computation is similar to computing (we also must keep track of the value of the statistic)

( )(( , ) )jn qγ θ nγ

( ) ( ) ( )(( , ) ) Pr( ,( , ) ( , ) )j j jn n nq q qγ θ θ θ= =∑

ss o

( ) ( )( )( ) ( , ) ( ), ( )jp I q g hθ⎡ ⎤= =⎣ ⎦∑s

s o s s

( )Z o

Page 188: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Vectors serve as indicators of the value of the statistic

where are matrices with a one in the position if and only if with the occurrence of .

( )0 1 0 0 Tϕ =1 1 0n nn σ σ σϕ ϕ−

= Δ Δ Δ

jσΔ ω ω×( ),i j ( ) ( )( , ) ( , )j iq qθ θ→

Page 189: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Using our representation of we obtain: ( )p s o

( )( )( )1

1

01

01

( , )( ) ( , )

( , )

n

n

k

c ccn

n c cc kc cc

Zσ σ

σ σσ

ϕγ ϕ−

=

Δ Δ Ψ= Ψ Δ =

Ψ

∑ ∑ ∏∑∏ ∏ ∑∏s

s

s oo s o

s o

Page 190: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• When is a perfect elimination sequence of the SPA, factors can be efficiently distributed throughout sums

( )\

( , , ) ( , , )t c d t

d t c

t t t t t tne

q qσ σσ

μ σ θ μ σ θ→Ψ Ψ →Ψ ∈ Ψ

= ∏

\ \

( , , ) ( , ) ( , , )c u r r c

c u r c u

u u u c c s s r r rs

q s qσσ σ

μ σ θ μ θΨ → →Ψ∈

⎛ ⎞= Ψ Δ⎜ ⎟⎜ ⎟

⎝ ⎠∑ ∏s s

s o

( )( ) ( )

1( )

( ) ( , , )u d ud un n

u c u c

n s s u n nne ss s

Z s qγ μ θ−Ψ →Ψ ∈

∈ ∈

= Δ∑ ∏ ∏s s

o

n nγ γ=Λ

1 , , nσ σ…

Page 191: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Examples

• Let

• No pattern progress is needed (patterns are the symbols )

( )1 , ,t t Ktθ η η= …

[ ] 11 1 2 3 4 2 2 3 4 5 3 4 5 6 9( ) ( ) ( , , , , ) ( , , , , ) ( , , , , )p Z s s s s s s s s s s s s−= Ψ Ψ Ψs o o o o o

4 2 3 4 7 5 3 7 8 10 ( , , , , ) ( , , , , )s s s s s s s s× Ψ Ψo o

( 1 ) ( ), , Ks s…1

1n K

+ −⎛ ⎞= Θ ≡ ⎜ ⎟−⎝ ⎠

Page 192: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• The matrices for the transitions based on have a one in location if column corresponds to and row to

tσΔ

( )lt sσ =

( )K1,1 1, 1,, , , ,t t l tη η η− − −… …( ),i j

j

i ( )1,1 1, 1,, , 1, ,t t l t Kη η η− − −+… …

Page 193: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many
Page 194: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Passed messages:1( , ) 1

j ds jsμ θ→Ψ = ( ) ( ) ( ) ( ) ( ) ( ), 1,1 , 6,3 , 9,3 , 8,5 , 10,5j d =

5 **** 10 8

8 10

**** 3 5 3 7 8 10 0,

( , ) ( , , , , )s s ss s

s s s s sμ θ ϕΨ → = Δ Δ Ψ∑ o

4 **** **** 1 4 2 3 4 7( , ) ( , , , , )s s s s s sμ θΨ → =Ψ o

**** 1 4 **** 5 ******** 3 **** 1 **** 3( , ) ( , ) ( , )s s ss s sμ θ μ θ μ θ→Ψ Ψ → Ψ →=

1 1 7 1 1 **** 1 2*** ***1 7

*** 5 1 1 2 3 4 1 1 **** 3 *** 5,

( , ) ( , , , , ) ( , ) ( , ) ( , )s s s s s ss s

s s s s s s s sμ θ μ θ μ θ μ θΨ → →Ψ →Ψ →Ψ= Δ Δ Ψ =∑ o

2 2 3 2** ***2 3

** 7 2 2 3 4 5 *** 5,

( , ) ( , , , , ) ( , )s s s ss s

s s s s s sμ θ μ θΨ → →Ψ= Δ Δ Ψ∑ o

3 6 9**6 9

* * 9 3 4 5 6 9,

( , ) ( , , , , )s s ss s

s s s s sμ θΨ → = Δ Δ Ψ∑ o

4 5 2 ** 3 **

4 5

110 ** 7 ** 9( ) ( , ) ( , )s s s s

s sZ s sγ μ θ μ θ−

Ψ → Ψ →= Δ Δ∑ ∑o

Page 195: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Consider the computation of the distribution of the number of overlapping occurrences of in .

Let

1 111L = ( )1 5, ,s s=s …

[ ] 14 1 5 3 1 3 2 1 2 4 1 4( ) ( ) ( , , ) ( , , ) ( , , , ) ( , )p Z s s s s s s s s−= Ψ Ψ Ψ Ψs o o o o o o

Page 196: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

1 1′′Ψ = Ψ

2 1 2 3 4 5 4 1 5 3 1 3 2 1 2 4( , , , , , ) ( , , ) ( , , ) ( , , , )s s s s s s s s s s s s′′Ψ =Ψ Ψ Ψo o o o

Page 197: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Transitions of values (for matrices)

after one after zero_______________________________

( ),q θ

( ), 0ε ( )1, 0 ( ), 0ε

( ),q θ

( )1, 0 ( )11,0 ( ), 0ε

( )11,0 ( )111,1 ( ), 0ε

( ),1ε ( )1,1 ( ),1ε

( )111,1 ( )111,2 ( ),1ε

( )111,2 ( )111,3 ( ), 2ε

Δ

Page 198: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• If states are listed in the order

( ,0)ε (1,0) (11,0) ( ,1)ε (1,1) (111,1) ( ,2)ε (111,2) (111,3)

1 1 1 0 0 0 0 0 00 0 0 1 1 1 0 0 00 0 0 0 0 0 1 1 00 0 0 0 0 0 0 0 1

⎛ ⎞⎜ ⎟⎜ ⎟Λ =⎜ ⎟⎜ ⎟⎝ ⎠

Page 199: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• To reduce the number of values, we introduce an Aho-Corasick automaton

set of prefixes of patternsuch that is

the longest suffix of that is in terminal states are words of the pattern

• The automaton is then minimized using the Hopcroft (1971) algorithm

q

( , , , , )QD Q Tχ δ ε=

Q:Q Q Qδ χ× → ( , )Q q aδ

qa QT

Page 200: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

• Supposed that we are interested in the number of overlapping occurrences of the Chi motif ,

• The are 30 pattern prefixes, but only 10 states in the minimal automaton

L GNTGGTGG= , , ,N A C G Tχ∈ =

0 1

3

6 7 8 9

2

4 5

A,C,T

G

G

A,C,T

T

G

T

A,C

G

G G T G G

A,C

G

A,C,TA,C,T

A,C

G

A,C,T

A,C,T

A,C

T

Page 201: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Conclusion

• We have given a method to compute exact distributions of statistics of hidden state sequences

• The method can be applied to a variety of models, both discriminative and generative

• For certain elimination orders, computation is as efficient as computing marginals

Page 202: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Future work

• Analyze data sets• Applications, e.g. in information/coding

theory: can we facilitate decoding based on new decoding criteria?

• Approximate inference (bounds on deviation from true distributions?)

Page 203: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

References

• A. M. Aji and R. J. McEliece, IEEE Transactions on Information Theory, 46, 325-343 (2000).

• J. A. D. Aston and D. E. K. Martin, Annals of Applied Statistics, 1(2), 585-611 (2007).

• F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, IEEE Transactions on Information Theory, 47(2), 498-519, (2001).

Page 204: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Thank you!!

Page 205: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Classification of Eye Movement Data

Ritaja Sur and Benjamin Kedem

Department of Mathematics,University of Maryland, College Park

Advances in Statistics and Applied Probability: Unified Approaches

July 31, 2009

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 206: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 207: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 208: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Background

Land and Mcleod (2000) - For a successful hit cricketersfixate on future points of contact of the ball and the ground.

Reina and Scwartz (2003) studied monkey’s fixation points ina repeated drawing task.

Mataric and Pomplun (1998) studied movement imitation andshowed that people tend to fixate on end effectors.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 209: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Questions

What visual inputs are captured when we intend to imitate amovement?

What are the computations performed by the brain inmovement imitation?

Whether there is any difference when a person intends toimitate?

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 210: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Data Description

The eye movement data was obtained from the work of LiorNoy.

Eye gaze as a response to the motion of the hand wasrecorded using a eye tracker.

Coordinates of the wrist, elbow and shoulder were recorded.

Number of human subjects = 7.

Number of distinct hand movements = 10.

Number of conditions (Watch and Imitate) = 2.

Data recorded in x- and y- coordinates.

Length of time series ranges from 801 (6.68s) to 1609(13.41s) data points.

r-coordinate rt = (x2t + y2

t )1/2 considered for analysis.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 211: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Time

x− co

ordina

te

0 200 400 600 800 1000 1200

−10

−50

Figure: x- coordinate for Subject 5 Movement 2 (Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 212: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Time

y− co

ordina

te

0 200 400 600 800 1000 1200

−20

−15

−10

−50

5

Figure: y- coordinate for Subject 5 Movement 2 (Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 213: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

−10 −5 0

−20

−15

−10

−50

5

x−coordinate

y−co

ordina

te

Figure: x- coordinate vs y- coordinate for Subject 5 Movement 2(Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 214: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Time

r− co

ordina

te

0 400 800 1200

510

1520

0 20 40 60 80

0.00.2

0.40.6

0.81.0

Lag

ACF

Figure: Time series plot and Autocorrelation plot of the r- coordinate forSubject 5 Movement 2 (Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 215: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Time

r− co

ordina

te

0 400 800 1200

−20

24

6

0 20 40 60 80

0.00.2

0.40.6

0.81.0

Lag

ACF

Figure: Time series plot and Autocorrelation plot of the differenced r-coordinate for Subject 5 Movement 2 (Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 216: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

0 5 10 15 20 25 30

−0.2

0.00.2

0.40.6

0.81.0

Lag

Partia

l ACF

Figure: Partial Autocorrelation plot of the differenced x- coordinate forSubject 5 Movement 2 (Watch).

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 217: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Non-parametric approach using HOC

Let ∇ be the difference operator defined as.

∇Zt ≡ Zt − Zt−1

In general, for k = 0, 1, 2,..., ∇kZt is given by

∇kZt =k∑

j=0

(k

j

)(−1)jZt−j

where ∇0Zt ≡ Zt . For each k = 1, 2, ...., we further obtain the binaryclipped process Xt(k) and the HOC counts Dk .

Xt(k) =

1, if ∇k−1Zt ≥ 0

0, if ∇k−1Zt < 0

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 218: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

HOC continued

Dk =N∑

t=1

[Xt(k)− Xt−1(k)]2

∆k ≡

D1 if k = 1

Dk − Dk−1 if k = 2, 3, ....,K − 1

(N − 1)− DK−1 if k = K

Distance measure from white Gaussian noise:

ψ2 =K∑

k=1

(∆k −mk)2

mk

where mk= E (∆k) can be obtained using

E [Dk ] = (N − 1)[1

2+

1

πsin−1(

k − 1

k)]

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 219: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Subject 1, Movement 9

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 220: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Application to eye data

For each eye signal, the calculations of the Dk ’s are based ona 751-point segments, corresponding to t = 50, 51, ...., 800.

The respective ψ2 distances from white noise were computedusing Dk ’s for k = 1, 2, ..., 8.

For each subject, the average ψ2 values were computed fromthe 10 movements.

To test the hypothesis that ψ2 values differ significantly underthe watch and imitate conditions for each subject, wilcoxonrank-sum test was performed.

To get further insight from the 70 cases for each of the twoconditions, each of the 70 time series was divided into twoequal parts. Essentially, this implies as if the subjects wereviewing 20 movements instead of 10.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 221: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Results using ψ2Statistic

Table: r- Average ψ2 distance from white noise.

Subject Watch1 Imitate1 H0 : W 1 = I1 Watch2 Imitate2 H0 : W 2 = I21 53.317 85.527 0.0116 33.237 50.629 0.00372 37.064 48.627 0.1088 27.884 32.373 0.19173 92.091 66.617 0.0526 52.357 44.101 0.06374 75.271 79.421 0.2894 40.421 46.265 0.19175 51.149 50.656 0.4559 32.863 33.162 0.34906 105.241 53.506 0.0014 60.175 34.186 0.00267 43.894 122.453 1.08E-05 26.643 74.597 < 0.005

Average 65.43 72.40 39.08 45.04

Median 53.2 66.62 33.24 44.10

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 222: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Subject 6, Subject 7

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 223: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Distance measures - 1

Let xt and yt be two zero mean stationary time series of length n.The various distance measures considered are:

Euclidean Distance:

dEUCL(x , y) =

√√√√ n∑t=1

(xt − yt)2

Distance based on estimated autoregressive weights:

dAR(x , y) =

√√√√ ∞∑j=1

(πj ,x − πj ,y )2

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 224: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Distance measures - 2

Distance based on estimated autocorrelations:

dACFU(x , y) =√

(ρx − ρy )′(ρx − ρy )

dACFG (x , y) =√

(ρx − ρy )′Ω(ρx − ρy )

Distance based on periodogram coordinates:

dNP(x , y) =

√√√√[n/2]∑j=1

[NPx(wj)− NPy (wj)]2

dLNP(x , y) =

√√√√[n/2]∑j=1

[logNPx(wj)− logNPy (wj)]2

dKLFD(x , y) =

[n/2]∑j=1

[NPx(wj)

NPy (wj)− log

NPx(wj)

NPy (wj)− 1]

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 225: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Distance measures - 3

Distance based on HOC:

dDelta(x , y) =

√√√√ K∑k=1

(∆k,x −∆k,y )2

dD(x , y) =

√√√√ K∑k=1

(Dk,x − Dk,y )2

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 226: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Simulation study

Simulated 1000 replications of AR(0.3) and AR(0.4)processes.

Length of each time series = n.

ACFU, ACFG calculated for lag L.

Ω is a diagonal matrix in ACFG with geometrically decayingweights with p = 0.05.

Clustering algorithm - Complete linkage (hierarchical) andK-means (non-hierarchical).

Calculate missclassification percentages.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 227: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Table: Misclassification Percentages in Hierarchical clustering forclassification of AR(0.3) and AR(0.4) processes.

n EUCL AR L ACFU ACFG FREQ NP LNP KL Delta D

100 50 50 5 45 47 LOW 49 50 50 42 448 46 47 HIGH 50 50 45

10 48 47 ALL 50 50 4625 45 49

200 49 46 5 46 47 LOW 49 49 50 42 478 45 50 HIGH 50 50 50

10 47 49 ALL 50 50 4450 46 47

100 50 44500 45 50 5 34 36 LOW 50 49 43 35 34

8 46 43 HIGH 50 50 5010 47 40 ALL 50 49 4350 37 39

100 50 45250 50 38

1000 47 37 5 45 38 LOW 49 50 42 18 348 44 39 HIGH 50 50 49

50 38 42 ALL 50 50 49100 39 39250 49 41

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 228: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Non-Hierarchical Clustering - K-Means

Table: Misclassification Percentages in Non-Hierarchical clustering(K-Means) for classification of AR(0.3) and AR(0.4) processes.

n EUCL AR L ACFU FREQ NP LNP Delta D100 50 33 5 39 LOW 48 49 40 44

8 41 HIGH 48 3910 41 ALL 48 4925 42

200 50 24 5 34 LOW 48 50 36 428 37 HIGH 49 35

10 38 ALL 48 5050 41

100 43500 50 14 5 21 LOW 47 47 26 34

8 23 HIGH 46 2010 25 ALL 47 4650 29

100 31250 31

1000 50 5 5 9 LOW 48 49 16 168 9 HIGH 24 10

50 9 ALL 20 10100 9250 1

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 229: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Clustering

Consider a particular movement.

Corresponding to each movement, there are eye gaze timeseries of seven subjects under Watch condition and sevenunder imitate condition.

Use Delta metric for classification.

Use Complete linkage algorithm.

1, 2, 3, ..., 7 correspond to Watch condition.

8, 9, 10, ..., 14 correspond to Imitate condition.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 230: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

14

12 8 10

5 13 2

7 9

11

3 6 1 4

010

020

030

040

0

Cluster Dendrogram

hclust (*, "complete")d

Heigh

t

Figure: Classification of eye data for movement 1 using the Delta metric.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 231: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

14

11 3 1 6 13

7 4 8 2 9 12 5 10

010

020

030

040

0

Cluster Dendrogram

hclust (*, "complete")d

Heigh

t

Figure: Classification of eye data for movement 4 using the Delta metric.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 232: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Summary

Based on the ψ2 distances, we conclude that in some casesthe individuals (Subjects 1, 6 and 7) viewed the movementsdifferently under the two conditions.

Using the HOC measure in clustering, it can be said thatSubject 7 viewed the movements differently under the twoconditions.

The results obtained from clustering do not in general give aclear picture.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 233: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

References

Kedem, B.(1994).Time Series Analysis by Higher OrderCrossings. IEEE Press,Piscataway, N.J.

Noy, L.(2009).Studying Eye Movements in MovementImitation. PhD Thesis, Weizmann Institute of Science,Rehovot, Israel.

Caiado, J., and Crato, N., and Pena, D.(2006).Aperiodogram-based metric for time seriesclassification.Computational Statistics & DataAnalysis,50,2668-2684.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 234: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

Acknowledgements

We are grateful to Lior Noy (Harvard Medical School) for hisadvice and for providing the datasets.

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 235: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

Study of eye movementsDiscrimination using HOC

Classification using HOC

THANK YOU

Ritaja Sur and Benjamin Kedem Classification of Eye Movement Data

Page 236: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for Multiple-SampleSemiparametric Density Ratio Models

Guanhua Lu

Dr. Ben Kedem Symposium

July 31, 2009

Guanhua Lu Semiparametric Density Ratio Model

Page 237: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Outline

1 IntroductionA Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

2 Asymptotic Theory for θ and GAsymptotic Theory for θAsymptotic Theory for G(t)

3 Data AnalysisA Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Guanhua Lu Semiparametric Density Ratio Model

Page 238: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Length-biased Sampling

Sampling Scheme: Lengths of items are distributed according tothe cdf G (to be estimated). The probability of selecting anyparticular item is proportional to its length, then the lengths ofsampled items are distributed according to the length-biased cdf

F(y) =1µ

∫ y

0xdG(x), y ≥ 0, (1)

where µ =∫∞

0 xdG(x) <∞.The Nonparametric maximum likelihood estimator (NPMLE) forG was obtained through empirical likelihood.

Guanhua Lu Semiparametric Density Ratio Model

Page 239: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Biased Sampling/Selection Bias:

Vardi[1985] and Gill, Vardi and Wellner[1988] considered thefollowing biased sampling model by assuming general weightfunctions

Fj(y) = Wj(G)−1∫ y

−∞wj(x)dG(x), j = 1, . . . ,m, (2)

where the normalization constant Wj(G) =∫∞−∞ wj(x)dG(x).

NPMLE for G was obtained by combining information fromseveral independent samples, and was shown asymptoticallyefficient.

Guanhua Lu Semiparametric Density Ratio Model

Page 240: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Case-control Study: (Prentice and Pyke[1979])Suppose that m disease groups are defined. Let D = j denote thedevelopment of the jth disease, and let D = 0 indicate thedisease-free group. The probability that an individual withcharacteristics x develops disease D = j can be specified in termsof a logistic regression model as

P(D = j|x) = exp(αj + β′j x)/m∑

i=0

exp(αi + β′i x). (3)

Formula (3) leads to the density ratio model

P(x | D = j)/P(x | D = 0) = exp(α∗j + β′j x).

Let gj(x) and g(x) be the densities of the jth group thedisease-free group respectively,

gj(x) = exp(α∗j + β′j x)g(x), j = 1, . . . ,m. (4)

Guanhua Lu Semiparametric Density Ratio Model

Page 241: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Case-control Study: (Prentice and Pyke[1979])Suppose that m disease groups are defined. Let D = j denote thedevelopment of the jth disease, and let D = 0 indicate thedisease-free group. The probability that an individual withcharacteristics x develops disease D = j can be specified in termsof a logistic regression model as

P(D = j|x) = exp(αj + β′j x)/m∑

i=0

exp(αi + β′i x). (3)

Formula (3) leads to the density ratio model

P(x | D = j)/P(x | D = 0) = exp(α∗j + β′j x).

Let gj(x) and g(x) be the densities of the jth group thedisease-free group respectively,

gj(x) = exp(α∗j + β′j x)g(x), j = 1, . . . ,m. (4)

Guanhua Lu Semiparametric Density Ratio Model

Page 242: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Case-control Study: (Prentice and Pyke[1979])Suppose that m disease groups are defined. Let D = j denote thedevelopment of the jth disease, and let D = 0 indicate thedisease-free group. The probability that an individual withcharacteristics x develops disease D = j can be specified in termsof a logistic regression model as

P(D = j|x) = exp(αj + β′j x)/m∑

i=0

exp(αi + β′i x). (3)

Formula (3) leads to the density ratio model

P(x | D = j)/P(x | D = 0) = exp(α∗j + β′j x).

Let gj(x) and g(x) be the densities of the jth group thedisease-free group respectively,

gj(x) = exp(α∗j + β′j x)g(x), j = 1, . . . ,m. (4)

Guanhua Lu Semiparametric Density Ratio Model

Page 243: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Recent ExtensionsGilbert, Lele & Vardi[1999] applied the selection biased modelfor assessing from vaccine trial data how efficacy of an HIVvaccine varies with characteristics (genotype and phenotype) ofexposing virus.

Qin & Zhang[1997] and Zhang[2000] considered the two-samplecase, and studied the asymptotic theory. A Kolmogorov-Smirnovtype statistic was constructed for testing the goodness of fit ofmodel (4).Fokianos et al.[2001] studied model (4) based on multiplesamples for one-way layout with the distortion function xreplaced by a more general form h(x) and designed test forhomogeneity among different samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 244: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Recent ExtensionsGilbert, Lele & Vardi[1999] applied the selection biased modelfor assessing from vaccine trial data how efficacy of an HIVvaccine varies with characteristics (genotype and phenotype) ofexposing virus.Qin & Zhang[1997] and Zhang[2000] considered the two-samplecase, and studied the asymptotic theory. A Kolmogorov-Smirnovtype statistic was constructed for testing the goodness of fit ofmodel (4).

Fokianos et al.[2001] studied model (4) based on multiplesamples for one-way layout with the distortion function xreplaced by a more general form h(x) and designed test forhomogeneity among different samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 245: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Recent ExtensionsGilbert, Lele & Vardi[1999] applied the selection biased modelfor assessing from vaccine trial data how efficacy of an HIVvaccine varies with characteristics (genotype and phenotype) ofexposing virus.Qin & Zhang[1997] and Zhang[2000] considered the two-samplecase, and studied the asymptotic theory. A Kolmogorov-Smirnovtype statistic was constructed for testing the goodness of fit ofmodel (4).Fokianos et al.[2001] studied model (4) based on multiplesamples for one-way layout with the distortion function xreplaced by a more general form h(x) and designed test forhomogeneity among different samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 246: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Data Structure: Suppose we have m + 1 independent samples,

X0 = (x01, . . . , x0n0)′ ∼ g(x)

X1 = (x11, . . . , x1n1)′ ∼ g1(x)

...

Xm = (xm1, . . . , xmnm)′ ∼ gm(x). (5)

X0 is referred as the reference sample with unknown distribution.

Density Ratio Model:

gj(x) = exp(αj + β′jh(x))g(x), j = 1, . . . ,m, (6)

where αj is a scalar, βj is a p× 1 vector, h(x) is a p× 1predetermined distortion function.

Guanhua Lu Semiparametric Density Ratio Model

Page 247: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Data Structure: Suppose we have m + 1 independent samples,

X0 = (x01, . . . , x0n0)′ ∼ g(x)

X1 = (x11, . . . , x1n1)′ ∼ g1(x)

...

Xm = (xm1, . . . , xmnm)′ ∼ gm(x). (5)

X0 is referred as the reference sample with unknown distribution.

Density Ratio Model:

gj(x) = exp(αj + β′jh(x))g(x), j = 1, . . . ,m, (6)

where αj is a scalar, βj is a p× 1 vector, h(x) is a p× 1predetermined distortion function.

Guanhua Lu Semiparametric Density Ratio Model

Page 248: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Example 1. (Normal Distribution)

Xj ∼ gj(x) = N(µj, σ2j ), j = 0, 1, . . . ,m. The weight function

wj(x|αj, βj) = gj(x)/g0(x) = exp(αj + βj(x, x2)′),αj = log(σ0/σj) + µ2

j /(2σ2j )− µ2

0/(2σ20),

βj = (µ0/σ20 − µj/σ

2j , 1/(2σ2

0)− 1/(2σ2j ))′.

The distortion function

h(x) = (x, x2)′.

h(x) degenerates to x2 if µj = 0, and weight functions reduce to

wj(x) = exp(αj + βjx2).

Guanhua Lu Semiparametric Density Ratio Model

Page 249: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Example 2. (Gamma Distribution)

Xj ∼ gj(x) = Gamma(αγj, βγ), j = 0, 1, . . . ,m with a commonscale parameter βγ .The weight function

wj(x|αj, βj) = gj(x)/g0(x) = exp(αj + βj log(x)),

αj = logΓ(αγ0)Γ(αγj)

+ (Γ(αγj)− Γ(αγ0)) logβγ ,

βj = αγj − αγ0.

The distortion function is

h(x) = log(x).

Guanhua Lu Semiparametric Density Ratio Model

Page 250: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Example 3. (Log Normal Distribution)

Xj ∼ gj(x) = LN(αj, σ2), j = 0, 1, . . . ,m with a common σ2

parameter.The weight function

wj(x|αj, βj) = gj(x)/g0(x) = exp(αj + βj log(x)),

(αj, βj) = (µ2

0 − µ2j

2σ2 ,µ0 − µj

σ2 ).

The distortion function is

h(x) = log(x).

Guanhua Lu Semiparametric Density Ratio Model

Page 251: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Preparation for Estimation

Parameters: α = (α1, . . . , αm)′, β = (β′1, . . . , β′m)′,

θ = (α′,β′)′.

Pooled Sample: t = (t1, . . . , tn)′ = (X′0,X′1, . . . ,X

′m)′, where

n = n0 + n1 + · · ·+ nm, the total sample size.

Empirical Likelihood:

L(θ,G) =n∏

i=1

pi

m∏j=1

nj∏i=1

exp(αj + β′jh(xji)), (7)

where pi = dG(ti) is the mass at ti.

Guanhua Lu Semiparametric Density Ratio Model

Page 252: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Preparation for Estimation

Parameters: α = (α1, . . . , αm)′, β = (β′1, . . . , β′m)′,

θ = (α′,β′)′.

Pooled Sample: t = (t1, . . . , tn)′ = (X′0,X′1, . . . ,X

′m)′, where

n = n0 + n1 + · · ·+ nm, the total sample size.

Empirical Likelihood:

L(θ,G) =n∏

i=1

pi

m∏j=1

nj∏i=1

exp(αj + β′jh(xji)), (7)

where pi = dG(ti) is the mass at ti.

Guanhua Lu Semiparametric Density Ratio Model

Page 253: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

Preparation for Estimation

Parameters: α = (α1, . . . , αm)′, β = (β′1, . . . , β′m)′,

θ = (α′,β′)′.

Pooled Sample: t = (t1, . . . , tn)′ = (X′0,X′1, . . . ,X

′m)′, where

n = n0 + n1 + · · ·+ nm, the total sample size.

Empirical Likelihood:

L(θ,G) =n∏

i=1

pi

m∏j=1

nj∏i=1

exp(αj + β′jh(xji)), (7)

where pi = dG(ti) is the mass at ti.

Guanhua Lu Semiparametric Density Ratio Model

Page 254: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

A Profiling Procedure for Estimation

1. For fixed θ, maximize only the product term∏n

i=1 pi from theempirical likelihood subject to the m + 1 constraints

n∑i=1

pi = 1,n∑

i=1

pi[wj(ti)− 1] = 0, j = 1, . . . ,m, (8)

where the weights wj(t) = exp(αj + β′jh(t)), j = 1, . . . ,m.

2. Solve pi by applying the method of Lagrange multipliers,

pi = (n +m∑

j=1

λj[wj(ti)− 1])−1, i = 1, . . . , n. (9)

Guanhua Lu Semiparametric Density Ratio Model

Page 255: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

A Profiling Procedure for Estimation

1. For fixed θ, maximize only the product term∏n

i=1 pi from theempirical likelihood subject to the m + 1 constraints

n∑i=1

pi = 1,n∑

i=1

pi[wj(ti)− 1] = 0, j = 1, . . . ,m, (8)

where the weights wj(t) = exp(αj + β′jh(t)), j = 1, . . . ,m.

2. Solve pi by applying the method of Lagrange multipliers,

pi = (n +m∑

j=1

λj[wj(ti)− 1])−1, i = 1, . . . , n. (9)

Guanhua Lu Semiparametric Density Ratio Model

Page 256: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

3. Substituting pi’s back into (7), the profile log likelihood as afunction of θ only,

`(θ) = −n log n0 −n∑

i=1

log[1 + ρ1w1(ti) + · · ·+ ρmwm(ti)]

+m∑

j=1

nj∑i=1

(αj + β′jh(xji)). (10)

4. θ obtained from the score equations,

∂`

∂αj= −

n∑i=1

ρjwj(ti)∑mk=0 ρkwk(ti)

+ nj = 0

∂`

∂βj= −

n∑i=1

ρjwj(ti)h(ti)∑mk=0 ρkwk(ti)

+nj∑

i=1

h(xji) = 0. (11)

Guanhua Lu Semiparametric Density Ratio Model

Page 257: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

3. Substituting pi’s back into (7), the profile log likelihood as afunction of θ only,

`(θ) = −n log n0 −n∑

i=1

log[1 + ρ1w1(ti) + · · ·+ ρmwm(ti)]

+m∑

j=1

nj∑i=1

(αj + β′jh(xji)). (10)

4. θ obtained from the score equations,

∂`

∂αj= −

n∑i=1

ρjwj(ti)∑mk=0 ρkwk(ti)

+ nj = 0

∂`

∂βj= −

n∑i=1

ρjwj(ti)h(ti)∑mk=0 ρkwk(ti)

+nj∑

i=1

h(xji) = 0. (11)

Guanhua Lu Semiparametric Density Ratio Model

Page 258: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

5. The solution of the score equations gives the MLE θ = (α′, β′)′,

and consequently by substitution also

pi =1n0· 1∑m

j=0 ρj exp(αj + βjh(ti))(12)

6. The estimator for G

G(t) =n∑

i=1

piI(ti ≤ t)

=1n0·

n∑i=1

I(ti ≤ t)∑mj=0 ρj exp(αj + βjh(ti))

. (13)

Guanhua Lu Semiparametric Density Ratio Model

Page 259: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

5. The solution of the score equations gives the MLE θ = (α′, β′)′,

and consequently by substitution also

pi =1n0· 1∑m

j=0 ρj exp(αj + βjh(ti))(12)

6. The estimator for G

G(t) =n∑

i=1

piI(ti ≤ t)

=1n0·

n∑i=1

I(ti ≤ t)∑mj=0 ρj exp(αj + βjh(ti))

. (13)

Guanhua Lu Semiparametric Density Ratio Model

Page 260: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Outline

1 IntroductionA Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

2 Asymptotic Theory for θ and GAsymptotic Theory for θAsymptotic Theory for G(t)

3 Data AnalysisA Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Guanhua Lu Semiparametric Density Ratio Model

Page 261: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Assumptions

The first and second moments of h(t) with respect to eachsample are finite,∫

h(t)wj(t)dG(t) <∞,∫

h(t)h′(t)wj(t)dG(t) <∞, (14)

j = 0, 1, . . . ,m.

Sample fractions ρj = nj/n0, j = 0, 1, . . . ,m are finite andremain fixed as the total sample size

n =m∑

j=0

nj →∞.

Guanhua Lu Semiparametric Density Ratio Model

Page 262: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Assumptions

The first and second moments of h(t) with respect to eachsample are finite,∫

h(t)wj(t)dG(t) <∞,∫

h(t)h′(t)wj(t)dG(t) <∞, (14)

j = 0, 1, . . . ,m.

Sample fractions ρj = nj/n0, j = 0, 1, . . . ,m are finite andremain fixed as the total sample size

n =m∑

j=0

nj →∞.

Guanhua Lu Semiparametric Density Ratio Model

Page 263: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Taylor expansion of ∂`(θ)/∂θ at "true" θ0,

0 =∂`(θ)∂θ

=∂`(θ0)∂θ

+∂2`(θ∗)∂θ2 (θ − θ0), (15)

where θ∗ is between θ and θ0.

Provided that Sn(θ) = ∂2`(θ)/∂θ2 is positive-definite,

√n(θ − θ0) = −

[1n

Sn(θ∗)]−1 1√

n∂`(θ0)∂θ

.

Under the density ratio model (6)

Eθ0(∂`(θ)/∂θ)2 6= −Eθ0

Sn(θ)

since contributions to the score statistic ∂`(θ0)/∂θ fromindividual samples do not in general have mean zero.

Guanhua Lu Semiparametric Density Ratio Model

Page 264: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Taylor expansion of ∂`(θ)/∂θ at "true" θ0,

0 =∂`(θ)∂θ

=∂`(θ0)∂θ

+∂2`(θ∗)∂θ2 (θ − θ0), (15)

where θ∗ is between θ and θ0.

Provided that Sn(θ) = ∂2`(θ)/∂θ2 is positive-definite,

√n(θ − θ0) = −

[1n

Sn(θ∗)]−1 1√

n∂`(θ0)∂θ

.

Under the density ratio model (6)

Eθ0(∂`(θ)/∂θ)2 6= −Eθ0

Sn(θ)

since contributions to the score statistic ∂`(θ0)/∂θ fromindividual samples do not in general have mean zero.

Guanhua Lu Semiparametric Density Ratio Model

Page 265: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Taylor expansion of ∂`(θ)/∂θ at "true" θ0,

0 =∂`(θ)∂θ

=∂`(θ0)∂θ

+∂2`(θ∗)∂θ2 (θ − θ0), (15)

where θ∗ is between θ and θ0.

Provided that Sn(θ) = ∂2`(θ)/∂θ2 is positive-definite,

√n(θ − θ0) = −

[1n

Sn(θ∗)]−1 1√

n∂`(θ0)∂θ

.

Under the density ratio model (6)

Eθ0(∂`(θ)/∂θ)2 6= −Eθ0

Sn(θ)

since contributions to the score statistic ∂`(θ0)/∂θ fromindividual samples do not in general have mean zero.

Guanhua Lu Semiparametric Density Ratio Model

Page 266: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Procedure to Develop the Asymptotic Theory for θ

1. Derive the structure of limit matrix − 1n Sn

a.s.−→ S

2. Derive the covariance matrix of the score statistic ∂`(θ0)

∂θ.

3. Prove the strong consistency of θ as an estimator of θ.

4. Formulate the asymptotic normality of√

n(θ − θ0).

Guanhua Lu Semiparametric Density Ratio Model

Page 267: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Procedure to Develop the Asymptotic Theory for θ

1. Derive the structure of limit matrix − 1n Sn

a.s.−→ S

2. Derive the covariance matrix of the score statistic ∂`(θ0)

∂θ.

3. Prove the strong consistency of θ as an estimator of θ.

4. Formulate the asymptotic normality of√

n(θ − θ0).

Guanhua Lu Semiparametric Density Ratio Model

Page 268: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Procedure to Develop the Asymptotic Theory for θ

1. Derive the structure of limit matrix − 1n Sn

a.s.−→ S

2. Derive the covariance matrix of the score statistic ∂`(θ0)

∂θ.

3. Prove the strong consistency of θ as an estimator of θ.

4. Formulate the asymptotic normality of√

n(θ − θ0).

Guanhua Lu Semiparametric Density Ratio Model

Page 269: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Procedure to Develop the Asymptotic Theory for θ

1. Derive the structure of limit matrix − 1n Sn

a.s.−→ S

2. Derive the covariance matrix of the score statistic ∂`(θ0)

∂θ.

3. Prove the strong consistency of θ as an estimator of θ.

4. Formulate the asymptotic normality of√

n(θ − θ0).

Guanhua Lu Semiparametric Density Ratio Model

Page 270: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Notation

Ajj′ =∫

wj(t)wj′(t)∑mk=0 ρkwk(t)

dG(t), Bjj′ =∫

wj(t)wj′(t)h(t)∑mk=0 ρkwk(t)

dG(t)

Cjj′ =∫

wj(t)wj′(t)h(t)h′(t)∑mk=0 ρkwk(t)

dG(t)

Ej = E(h(xji)) =∫

wj(t)h(t)dG(t) Ej =∫

wj(t)h(t)h′(t)dG(t)

Vj = Var(h(xji))

=∫

wjh(t)h′(t)dG(t)−∫

h(t)wjdG(t)∫

h′(t)wjdG(t)

= Ej − EjE′j,

where Bjj′ and Ej are p× 1 vectors, and Cjj′ , Ej and Vj are all p× pmatrices.

Guanhua Lu Semiparametric Density Ratio Model

Page 271: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Notation Cont’d

A = (Aij)m×m, B = (Bij)mp×m, C = (Cij)mp×mp

ρ = diag(ρ1, . . . , ρm)m×m

E =

E1 · · · 0...

. . ....

0 · · · Em

mp×m

E =

E1 · · · 0...

. . ....

0 · · · Em

mp×mp

1m =

1 · · · 1...

. . ....

1 · · · 1

m×m

V =

V1 · · · 0...

. . ....

0 · · · Vm

mp×mp

where 0 is a p× 1 vector of 0’s, and 0 is a p× p matrix of 0’s.

Guanhua Lu Semiparametric Density Ratio Model

Page 272: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Structure of the Limit Matrix

S =1∑m

k=0 ρk

(S11 S12S21 S22

), (16)

where

S11 = ρ− ρAρ

S12 = ρE′ − ρB′(ρ⊗ Ip)S21 = S′12 = Eρ− (ρ⊗ Ip)Bρ

S22 = (ρ⊗ Ip)E − (ρ⊗ Ip)C(ρ⊗ Ip). (17)

Ip is the p× p identity matrix, ⊗ denotes the kronecker product.

Guanhua Lu Semiparametric Density Ratio Model

Page 273: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Structure of the Covariance Matrix

Λ = Var(

1√n∂`

∂θ

)=

1∑mk=0 ρk

(Λ11 Λ12Λ21 Λ22

),

Λ11 = ρAρ− ρAρAρ− ρ1mρ + ρAρ1mρ + ρ1mρAρ

−ρAρ1mρAρ

Λ12 = ρAE′(ρ⊗ Ip)− ρAρB′(ρ⊗ Ip)− ρ1mE′(ρ⊗ Ip)+ρAρ1mE′(ρ⊗ Ip) + ρ1mρB′(ρ⊗ Ip)− ρAρ1mB′(ρ⊗ Ip)

Λ21 = Λ′12 = (ρ⊗ Ip)EAρ− (ρ⊗ Ip)BρAρ− (ρ⊗ Ip)E1mρ

+(ρ⊗ Ip)E1mρAρ + (ρ⊗ Ip)Bρ1mρ− (ρ⊗ Ip)B1mρAρ

Λ22 = −(ρ⊗ Ip)C(ρ⊗ Ip)− (ρ⊗ Ip)BρB′(ρ⊗ Ip)+(ρ⊗ Ip)BE′(ρ⊗ Ip) + (ρ⊗ Ip)EB′(ρ⊗ Ip) + (ρ⊗ Ip)V

−(ρ⊗ Ip)E1mE′(ρ⊗ Ip) + (ρ⊗ Ip)Bρ1mE′(ρ⊗ Ip)+(ρ⊗ Ip)E1mρB′(ρ⊗ Ip)− (ρ⊗ Ip)Bρ1mρB′(ρ⊗ Ip).

Guanhua Lu Semiparametric Density Ratio Model

Page 274: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Lemma 2.1 (Connection between S and Λ)The limit matrix S and the covariance matrix Λ is connected by

Λ11 = S11 − S11(1m + ρ−1)S11

Λ12 = S12 − S11(1m + ρ−1)S12

Λ21 = S21 − S21(1m + ρ−1)S11

Λ22 = S22 − S21(1m + ρ−1)S12.

Therefore, we have

Σdef= S−1ΛS−1 = S−1 −

m∑k=0

ρk

(1m + ρ−1 0

0 0

). (18)

Guanhua Lu Semiparametric Density Ratio Model

Page 275: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Theorem 2.1 (Asymptotic Theory for θ)Suppose that the density ratio model (6) and the Assumption (14)hold, and S is positive-definite, then

(a) the solution θ to the score equation system (11) is a stronglyconsistent estimator for θ0.

(b) as n→∞,

√n(

α−α0

β − β0

)d→ N(p+1)m(0,Σ), (19)

where Σ = S−1ΛS−1.

Guanhua Lu Semiparametric Density Ratio Model

Page 276: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

A Brief Review of Empirical Process

Let T1, . . . ,Tn be a real-valued random sample from a distributionfunction F.

Empirical measure

Pn =1n

n∑i=1

δTi .

Empirical distribution

Fn(t) = PnI[x<t] =1n

n∑i=1

I[Ti<t].

Guanhua Lu Semiparametric Density Ratio Model

Page 277: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Consider Fn(t) as a random function Fn(t, ω).

Glivenko-Cantelli Theorem (SLLN)

|Fn − F|∞a.s.−→ 0

under the supremum norm |Fn − F|∞ = supt|Fn(t)− F(t)|.

Donsker Theorem (CLT)√

n(Fn − F) d−→ GF

in the Skorohod space D[−∞,∞]. The Brownian bridge GF hasmean 0 and covariance

EGF(t)GF(s) = F(t ∧ s)− F(t)F(s).

Guanhua Lu Semiparametric Density Ratio Model

Page 278: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Consider Fn(t) as a random function Fn(t, ω).

Glivenko-Cantelli Theorem (SLLN)

|Fn − F|∞a.s.−→ 0

under the supremum norm |Fn − F|∞ = supt|Fn(t)− F(t)|.

Donsker Theorem (CLT)√

n(Fn − F) d−→ GF

in the Skorohod space D[−∞,∞]. The Brownian bridge GF hasmean 0 and covariance

EGF(t)GF(s) = F(t ∧ s)− F(t)F(s).

Guanhua Lu Semiparametric Density Ratio Model

Page 279: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Define the empirical process of the reference sample:

G(t) = 1n0

∑n0i=1 I[x0i<t].

Procedure to Prove Weak Convergence:

Approximation: G(t) ≈ H1(t)− H2(t) uniformly in t.

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))=√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Covariance Structure and Finite-dimensional Convergence.

Tightness.

Guanhua Lu Semiparametric Density Ratio Model

Page 280: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Define the empirical process of the reference sample:

G(t) = 1n0

∑n0i=1 I[x0i<t].

Procedure to Prove Weak Convergence:

Approximation: G(t) ≈ H1(t)− H2(t) uniformly in t.

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))=√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Covariance Structure and Finite-dimensional Convergence.

Tightness.

Guanhua Lu Semiparametric Density Ratio Model

Page 281: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Define the empirical process of the reference sample:

G(t) = 1n0

∑n0i=1 I[x0i<t].

Procedure to Prove Weak Convergence:

Approximation: G(t) ≈ H1(t)− H2(t) uniformly in t.

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))=√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Covariance Structure and Finite-dimensional Convergence.

Tightness.

Guanhua Lu Semiparametric Density Ratio Model

Page 282: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Define the empirical process of the reference sample:

G(t) = 1n0

∑n0i=1 I[x0i<t].

Procedure to Prove Weak Convergence:

Approximation: G(t) ≈ H1(t)− H2(t) uniformly in t.

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))=√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Covariance Structure and Finite-dimensional Convergence.

Tightness.

Guanhua Lu Semiparametric Density Ratio Model

Page 283: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Define the empirical process of the reference sample:

G(t) = 1n0

∑n0i=1 I[x0i<t].

Procedure to Prove Weak Convergence:

Approximation: G(t) ≈ H1(t)− H2(t) uniformly in t.

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))=√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Covariance Structure and Finite-dimensional Convergence.

Tightness.

Guanhua Lu Semiparametric Density Ratio Model

Page 284: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

An Approximation of G(t)

Define

H1(t; θ) =1n0·

n∑i=1

I(ti ≤ t)∑mk=0 ρkwk(ti;αk, βk)

.

Assume H1(t) = H1(t; θ0). G(t) is a realization of H1(t; θ) at θ.

As n→∞, we have, uniformly in t

∂H1(t; θ0)∂αj

a.s.−→ −ρjAj(t) = ρj

∫wj(y)I(y ≤ t)∑m

k=0 ρkwk(y)dG(y)

∂H1(t; θ0)∂βj

a.s.−→ −ρjBj(t) = ρj

∫wj(y)h(y)I(y ≤ t)∑m

k=0 ρkwk(y)dG(y).

Guanhua Lu Semiparametric Density Ratio Model

Page 285: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

An Approximation of G(t)

Define

H1(t; θ) =1n0·

n∑i=1

I(ti ≤ t)∑mk=0 ρkwk(ti;αk, βk)

.

Assume H1(t) = H1(t; θ0). G(t) is a realization of H1(t; θ) at θ.

As n→∞, we have, uniformly in t

∂H1(t; θ0)∂αj

a.s.−→ −ρjAj(t) = ρj

∫wj(y)I(y ≤ t)∑m

k=0 ρkwk(y)dG(y)

∂H1(t; θ0)∂βj

a.s.−→ −ρjBj(t) = ρj

∫wj(y)h(y)I(y ≤ t)∑m

k=0 ρkwk(y)dG(y).

Guanhua Lu Semiparametric Density Ratio Model

Page 286: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Denote A(t) = (A1(t), . . . ,Am(t))′, B(t) = (B′1(t), . . . ,B′m(t))′.

Lemma 3.1 (Approximation of G(t))

G(t) has an approximation uniformly in t,

G(t) = H1(t)− H2(t) + Rn(t),

where H1(t) is defined as before, and

H2(t) =1n

(A′(t)ρ, B′(t)(ρ⊗ Ip)

)S−1

∂`(θ0)∂α

∂`(θ0)

∂β

, (20)

and the remainder term Rn(t) satisfiessup−∞<t<∞ |Rn(t)| = op(n−1/2).

Guanhua Lu Semiparametric Density Ratio Model

Page 287: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Covariance Structure

Notice that Eθ0[H1(t)− G] = 0, Eθ0

[H2(t)] = 0,

Cov[√

n(H1(t)− G(t)− H2(t)),√

n(H1(s)− G(s)− H2(s))]

= n[

E(

(H1(t)− G(t))(H1(s)− G(s)))

−E(

(H1(t)− G(t))H2(s))

−E(

H2(t)(H1(s)− G(s)))

+ E(

H2(t)H′2(s))]

.

Guanhua Lu Semiparametric Density Ratio Model

Page 288: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Covariance Structure

Cov[√

n(H1(t)− G(t)− H2(t)),√

n(H1(s)− G(s)− H2(s))]

=

(m∑

k=0

ρk

)m∑

j=1

ρjAj(t ∧ s)

−[A′(t)ρ, B′(t)(ρ⊗ Ip)

]S−1

[ρA(s)

(ρ⊗ Ip)B(s)

]. (21)

Guanhua Lu Semiparametric Density Ratio Model

Page 289: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Lemma 3.2 (Finite-dimensional Convergence)

For any finite set (t1, . . . , tk) of points on the real line, let Gn denote√n(H1(t)− G− H2(t)), then we have

(Gn(t1), . . . ,Gn(tk))d→ Nk(0,∆),

where Nk is a mean-zero k-dimensional multivariate normaldistribution with covariance matrix ∆, of which the (i, j)th element isdetermined by (21).

Guanhua Lu Semiparametric Density Ratio Model

Page 290: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Techniques for Proving Tightness

Donsker ClassAssume supf∈F ‖f (x)− Pf‖ <∞, for every x. Under thiscondition, the empirical process Pnf : f ∈ F can be viewed as amap into `∞(F), where `∞(F) is a set of uniformly bounded realfunctions in F .F is called P-Donsker class if the empirical processes based on P andindexed by F satisfy

Gn =√

n(Pn − P) d→ G, in `∞(F),

where G is a tight Borel measurable element in `∞(F).

Guanhua Lu Semiparametric Density Ratio Model

Page 291: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Examples of Donsker ClassesThe collection of all indicator functions of the form I(−∞,t] is aDonsker class.

If F is a Donsker class with ‖P‖F <∞ and g is a uniformlybounded, measurable function, then F · g is a Donsker class.

Refer to Example 2.10.10 of Van der Vaart and Wellner (1996), p.192

Any finite dimensional vector space F of measurable functionsis a Donsker class.

Refer to Lemma 2.6.15, Van der Vaart and Wellner (1996) Page 146

Guanhua Lu Semiparametric Density Ratio Model

Page 292: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Tightness of√

n(H1(t)− G(t))

H1(t)− G(t) =1n0

m∑j=1

nj∑i=1

I(xji ≤ t)∑mk=0 ρkwk(xji)

− 1n0

n0∑i=1

∑mk=1 ρkwk(x0i)I(x0i ≤ t)∑m

k=0 ρkwk(x0i)

H1j(t) =1n0

nj∑i=1

I(xji ≤ t)∑mk=0 ρkwk(xji)

,

H10(t) =1n0

n0∑i=1

∑mk=1 ρkwk(x0i)I(x0i ≤ t)∑m

k=0 ρkwk(x0i)

Guanhua Lu Semiparametric Density Ratio Model

Page 293: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

E [H1j(t)] = ρjAj(t), E [H10(t)] =m∑

j=1

ρjAj(t)

H1(t)− G(t) =m∑

j=1

H1j(t)− H10(t)

=m∑

j=1

[H1j(t)− ρjAj(t)]−

H10(t)−m∑

j=1

ρjAj(t)

.

Guanhua Lu Semiparametric Density Ratio Model

Page 294: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let F be the collection of all indicator functions of the formI(−∞,t].

PXj is the law of Xj, the jth sample, j = 0, 1, . . . ,m.

Define uniformly bounded functions

f0(y) =∑m

k=1 ρkwk(y)∑mk=0 ρkwk(y)

, fj(y) =ρj∑m

k=0 ρkwk(y),

j=1,. . . ,m.

From the previous example, F · fi, i = 0, 1, . . . ,m, arePXj-Donsker classes, j = 0, 1, . . . ,m

Guanhua Lu Semiparametric Density Ratio Model

Page 295: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let F be the collection of all indicator functions of the formI(−∞,t].

PXj is the law of Xj, the jth sample, j = 0, 1, . . . ,m.

Define uniformly bounded functions

f0(y) =∑m

k=1 ρkwk(y)∑mk=0 ρkwk(y)

, fj(y) =ρj∑m

k=0 ρkwk(y),

j=1,. . . ,m.

From the previous example, F · fi, i = 0, 1, . . . ,m, arePXj-Donsker classes, j = 0, 1, . . . ,m

Guanhua Lu Semiparametric Density Ratio Model

Page 296: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let F be the collection of all indicator functions of the formI(−∞,t].

PXj is the law of Xj, the jth sample, j = 0, 1, . . . ,m.

Define uniformly bounded functions

f0(y) =∑m

k=1 ρkwk(y)∑mk=0 ρkwk(y)

, fj(y) =ρj∑m

k=0 ρkwk(y),

j=1,. . . ,m.

From the previous example, F · fi, i = 0, 1, . . . ,m, arePXj-Donsker classes, j = 0, 1, . . . ,m

Guanhua Lu Semiparametric Density Ratio Model

Page 297: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let F be the collection of all indicator functions of the formI(−∞,t].

PXj is the law of Xj, the jth sample, j = 0, 1, . . . ,m.

Define uniformly bounded functions

f0(y) =∑m

k=1 ρkwk(y)∑mk=0 ρkwk(y)

, fj(y) =ρj∑m

k=0 ρkwk(y),

j=1,. . . ,m.

From the previous example, F · fi, i = 0, 1, . . . ,m, arePXj-Donsker classes, j = 0, 1, . . . ,m

Guanhua Lu Semiparametric Density Ratio Model

Page 298: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let Pnj = 1nj

∑nji=1 δxji be the empirical measure of the jth sample.

Then we have

√nj(Pnj − PXj)(I(−∞,t]fj)

=√

nj

[1nj

nj∑i=1

ρjI(xji ≤ t)∑mk=0 ρkwk(xji)

− ρj

∫wj(y)I(y ≤ t)∑m

k=0 ρkwk(y)dG(y)

]

= (ρj/

m∑k=0

ρk)1/2 √n(H1j − ρjAj(t)), j = 1, . . . ,m,

and, similarly,

√n0(Pn0 − PX0)(I(−∞,t]f0) = (1/

m∑k=0

ρk)1/2 √n

H10(t)−m∑

j=1

ρjAj(t)

Guanhua Lu Semiparametric Density Ratio Model

Page 299: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

By Donsker’s Theorem,

√nj(Pnj − PXj)(I(−∞,t]f0) d→ Wj in D[−∞,∞],

where j= 0, 1,. . . , m, and Wj’s are mean-zero Gaussian processes.Therefore,

√n(H1j − ρjAj(t)), j = 0, 1, . . . ,m, are tight in D[−∞,∞].

From the decomposition of√

n(H1(t)− G(t)),

Lemma 3.3 Tightness of√

n(H1(t)− G(t))√

n(H1(t)− G(t)) is tight in D[−∞,∞].

Guanhua Lu Semiparametric Density Ratio Model

Page 300: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Tightness of√

nH2(t)

Define functions

Ul(y) =ρlwl(y)∑m

k=0 ρkwk(y), Vl(y) =

ρlwl(y)h(y)∑mk=0 ρkwk(y)

,

where l = 0, 1, . . . ,m.

Define spaces

U = SpanUk : k = 0, 1, . . . ,mV = SpanVk : k = 0, 1, . . . ,m

U and V are both Donsker classes

Guanhua Lu Semiparametric Density Ratio Model

Page 301: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Tightness of√

nH2(t)

Define functions

Ul(y) =ρlwl(y)∑m

k=0 ρkwk(y), Vl(y) =

ρlwl(y)h(y)∑mk=0 ρkwk(y)

,

where l = 0, 1, . . . ,m.

Define spaces

U = SpanUk : k = 0, 1, . . . ,mV = SpanVk : k = 0, 1, . . . ,m

U and V are both Donsker classes

Guanhua Lu Semiparametric Density Ratio Model

Page 302: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Tightness of√

nH2(t)

Define functions

Ul(y) =ρlwl(y)∑m

k=0 ρkwk(y), Vl(y) =

ρlwl(y)h(y)∑mk=0 ρkwk(y)

,

where l = 0, 1, . . . ,m.

Define spaces

U = SpanUk : k = 0, 1, . . . ,mV = SpanVk : k = 0, 1, . . . ,m

U and V are both Donsker classes

Guanhua Lu Semiparametric Density Ratio Model

Page 303: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Let (a1(t), . . . , am(t), b′1(t), . . . , b′m(t)) = (A′(t)ρ, B′(t)(ρ⊗ Ip))S−1,Then we have a decomposition for

√nH2(t),

√nH2(t) =

√n

n

(a1(t), . . . , am(t), b′1(t), . . . , b′m(t)

)∂`

∂θ

=n0

n

[∑mj=1

∑mk=0 ρkρj

√nj(Pnj − PXj)(∑m

l=0l 6=jρjaj(t)Ul

)−∑m

j=1∑m

l=0l 6=j

∑mk=0 ρkρl

√nl(Pnl − PXl) (ρlaj(t)Uj)

+∑m

j=1

∑mk=0 ρkρj

√nj(Pnj − PXj)(∑m

l=0l 6=jρjb′j(t)Vl

)−∑m

j=1∑m

l=0l 6=j

∑mk=0 ρkρl

√nl(Pnl − PXl)

(ρlb′j(t)Vj

)]. (22)

Guanhua Lu Semiparametric Density Ratio Model

Page 304: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Lemma 3.4 Tightness of√

nH2(t)√

nH2(t) is tight in D[−∞,∞].

Theorem 3.1 Weak Convergence of√

n(G− G)

The process√

n(G− G) converges weakly to a zero-mean Gaussianprocess W with continuous sample paths in D[−∞,∞], and thecovariance matrix is determined by

EW(t)W(s) =

(m∑

k=0

ρk

)m∑

j=1

ρjAj(t ∧ s)

−(

A′(t)ρ, B′(t)(ρ⊗ Ip))

S−1(

ρA(s)(ρ⊗ Ip)B(s)

).

Guanhua Lu Semiparametric Density Ratio Model

Page 305: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Lemma 3.4 Tightness of√

nH2(t)√

nH2(t) is tight in D[−∞,∞].

Theorem 3.1 Weak Convergence of√

n(G− G)

The process√

n(G− G) converges weakly to a zero-mean Gaussianprocess W with continuous sample paths in D[−∞,∞], and thecovariance matrix is determined by

EW(t)W(s) =

(m∑

k=0

ρk

)m∑

j=1

ρjAj(t ∧ s)

−(

A′(t)ρ, B′(t)(ρ⊗ Ip))

S−1(

ρA(s)(ρ⊗ Ip)B(s)

).

Guanhua Lu Semiparametric Density Ratio Model

Page 306: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Weak Convergence of√

n(G(t)− G(t)):

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))≈√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Variance-covariance structure and finite-dimensionalconvergence can be obtained similarly as for

√n(G(t)− G(t)).

Tightness is followed from the fact that both√

n(G(t)− G(t))and√

n(G(t)− G(t)) are tight.

Guanhua Lu Semiparametric Density Ratio Model

Page 307: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Weak Convergence of√

n(G(t)− G(t)):

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))≈√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Variance-covariance structure and finite-dimensionalconvergence can be obtained similarly as for

√n(G(t)− G(t)).

Tightness is followed from the fact that both√

n(G(t)− G(t))and√

n(G(t)− G(t)) are tight.

Guanhua Lu Semiparametric Density Ratio Model

Page 308: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Weak Convergence of√

n(G(t)− G(t)):

Decomposition:√

n(G(t)− G(t)) =√

n(G(t)− G(t)) +√

n(G(t)− G(t))≈√

n(H1(t)− G(t)− H2(t)) +√

n(G(t)− G(t)).

Variance-covariance structure and finite-dimensionalconvergence can be obtained similarly as for

√n(G(t)− G(t)).

Tightness is followed from the fact that both√

n(G(t)− G(t))and√

n(G(t)− G(t)) are tight.

Guanhua Lu Semiparametric Density Ratio Model

Page 309: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

Asymptotic Theory for θAsymptotic Theory for G(t)

Theorem 3.2 Weak Convergence of√

n(G− G)

The process√

n(G(t)− G(t)) converges weakly to a zero-meanGaussian process in D[−∞,∞], with covariance matrix given by

Cov√

n(G(t)− G(t)),√

n(G(s)− G(s)) =(m∑

k=0

ρk

)(G(t ∧ s)− G(t)G(s)−

m∑j=1

ρjAj(t ∧ s))

+(

A′(s)ρ, B′(s)(ρ⊗ Ip))

S−1(

ρA(t)(ρ⊗ Ip)B(t)

). (23)

Guanhua Lu Semiparametric Density Ratio Model

Page 310: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Outline

1 IntroductionA Historic Review of Biased Sampling ModelsSemiparametric Density Ratio ModelEstimation

2 Asymptotic Theory for θ and GAsymptotic Theory for θAsymptotic Theory for G(t)

3 Data AnalysisA Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Guanhua Lu Semiparametric Density Ratio Model

Page 311: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Simulation Study for Parameters

Generate random samples X0 ∼ N(0, 1), X1 ∼ N(0, 2) andX2 ∼ N(0, 4) with density functions g(x), g1(x) and g2(x)respectively.

Fit the following density ratio model:

g1(x) = g(x) exp(α1 + β1x2),g2(x) = g(x) exp(α2 + β2x2). (24)

True parameters(α1, α2, β1, β2) = (−0.34657,−0.69315, 0.25000, 0.37500).

Calculate average bias and sample variance based on 1000combined random samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 312: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Simulation Study for Parameters

Generate random samples X0 ∼ N(0, 1), X1 ∼ N(0, 2) andX2 ∼ N(0, 4) with density functions g(x), g1(x) and g2(x)respectively.

Fit the following density ratio model:

g1(x) = g(x) exp(α1 + β1x2),g2(x) = g(x) exp(α2 + β2x2). (24)

True parameters(α1, α2, β1, β2) = (−0.34657,−0.69315, 0.25000, 0.37500).

Calculate average bias and sample variance based on 1000combined random samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 313: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Simulation Study for Parameters

Generate random samples X0 ∼ N(0, 1), X1 ∼ N(0, 2) andX2 ∼ N(0, 4) with density functions g(x), g1(x) and g2(x)respectively.

Fit the following density ratio model:

g1(x) = g(x) exp(α1 + β1x2),g2(x) = g(x) exp(α2 + β2x2). (24)

True parameters(α1, α2, β1, β2) = (−0.34657,−0.69315, 0.25000, 0.37500).

Calculate average bias and sample variance based on 1000combined random samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 314: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Simulation Study for Parameters

Generate random samples X0 ∼ N(0, 1), X1 ∼ N(0, 2) andX2 ∼ N(0, 4) with density functions g(x), g1(x) and g2(x)respectively.

Fit the following density ratio model:

g1(x) = g(x) exp(α1 + β1x2),g2(x) = g(x) exp(α2 + β2x2). (24)

True parameters(α1, α2, β1, β2) = (−0.34657,−0.69315, 0.25000, 0.37500).

Calculate average bias and sample variance based on 1000combined random samples.

Guanhua Lu Semiparametric Density Ratio Model

Page 315: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Table: Bias and variance of parameter estimates.

Sample Size Bias(α1) Bias(β1) Bias(α2) Bias(β2)(50, 50, 50) -0.01663 0.02337 -0.03752 0.03508

(50, 50, 100) -0.00022 0.00856 -0.02041 0.02142(50, 100, 50) -0.01865 0.02550 -0.03797 0.03338(100, 50, 50) -0.00326 0.00511 -0.02925 0.01811

(200, 200, 200) -0.00017 0.00217 -0.00303 0.00439Sample Size Var(α1) Var(β1) Var(α2) Var(β2)(50, 50, 50) 0.02425 0.01731 0.03362 0.01672

(50, 50, 100) 0.02085 0.01497 0.02276 0.01421(50, 100, 50) 0.01961 0.01623 0.03168 0.01658(100, 50, 50) 0.01773 0.00929 0.02674 0.00828

(200, 200, 200) 0.00611 0.00391 0.00837 0.00374

Guanhua Lu Semiparametric Density Ratio Model

Page 316: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Table: 95% Confidence intervals for parameters.

Sample Size α1 α2

(50, 50, 50) (-0.64010, -0.05305) (-1.04119, -0.34511)(50, 50, 100) (-0.63505, -0.05810) (-0.98391, -0.40238)(50, 100, 50) (-0.60031, -0.09284) (-1.02833, -0.35796)(100, 50, 50) (-0.60319, -0.08996) (-1.00900, -0.37730)

(200, 200, 200) (-0.49333, -0.19981) (-0.86717, -0.51913)Sample Size β1 β2

(50, 50, 50) (0.02029, 0.47971) (0.14742, 0.60258)(50, 50, 100) (0.02349, 0.47651) (0.15889, 0.59111)(50, 100, 50) (0.03402, 0.46598) (0.15327, 0.59673)(100, 50, 50) (0.06846, 0.43154) (0.19693, 0.55307)

(200, 200, 200) (0.13515, 0.36485) (0.26121, 0.48879)

Guanhua Lu Semiparametric Density Ratio Model

Page 317: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Statistic for Goodness-of-fit Test

Define test statistic

∆n(t) =√

n |G− G|, ∆n = sup−∞≤t≤∞

∆n(t)

Let wα be the α-quantile of the distribution ofsup−∞≤t≤∞ |W(t)|. By weak convergence of

√n(G− G),

limn→∞

P(∆n ≥ w1−α) = limn→∞

P( sup−∞≤t≤∞

√n |G− G| ≥ w1−α)

= P( sup−∞≤t≤∞

|W(t)| ≥ w1−α) = α.

We reject the density ratio model (6) at level α if

∆n > w1−α.

Guanhua Lu Semiparametric Density Ratio Model

Page 318: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Statistic for Goodness-of-fit Test

Define test statistic

∆n(t) =√

n |G− G|, ∆n = sup−∞≤t≤∞

∆n(t)

Let wα be the α-quantile of the distribution ofsup−∞≤t≤∞ |W(t)|. By weak convergence of

√n(G− G),

limn→∞

P(∆n ≥ w1−α) = limn→∞

P( sup−∞≤t≤∞

√n |G− G| ≥ w1−α)

= P( sup−∞≤t≤∞

|W(t)| ≥ w1−α) = α.

We reject the density ratio model (6) at level α if

∆n > w1−α.

Guanhua Lu Semiparametric Density Ratio Model

Page 319: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Statistic for Goodness-of-fit Test

Define test statistic

∆n(t) =√

n |G− G|, ∆n = sup−∞≤t≤∞

∆n(t)

Let wα be the α-quantile of the distribution ofsup−∞≤t≤∞ |W(t)|. By weak convergence of

√n(G− G),

limn→∞

P(∆n ≥ w1−α) = limn→∞

P( sup−∞≤t≤∞

√n |G− G| ≥ w1−α)

= P( sup−∞≤t≤∞

|W(t)| ≥ w1−α) = α.

We reject the density ratio model (6) at level α if

∆n > w1−α.

Guanhua Lu Semiparametric Density Ratio Model

Page 320: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 321: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 322: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 323: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 324: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 325: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Test Procedure Based on Bootstrap

1. Estimate G, G1, . . . , Gm from (X0,X1, . . . ,Xm) and calculate ∆n.

2. Generate X∗0 ,X∗1 , . . . ,X

∗m from G, G1, . . . , Gm respectively.

3. Obtain the estimate G∗ for G based on (X∗0 ,X∗1 , . . . ,X

∗m) and

empirical cdf G∗ for X∗0 , then calculate

∆∗n = sup−∞≤t≤∞

√n |G∗ − G∗|.

!!√

n(G∗ − G∗) d→ W, W is also the limit process of√

n(G− G).

4. Repeat step 3 by bootstrapping from (X∗0 ,X∗1 , . . . ,X

∗m).

5. Thus we can approximate the quantiles of ∆n by those of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 326: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Example 1. (Correctly Specified Model)

Simulated samples X0 ∼ N(0, 1),X1 ∼ N(0, 2) and X2 ∼ N(0, 4)with sample sizes (n0, n1, n2) = (50, 60, 80).

Fit the following model with distortion function h(x) = x2

g1(x) = g(x) exp(α1 + β1x2),g2(x) = g(x) exp(α2 + β2x2).

(α1, α2, β1, β2) = (−0.576,−0.84, 0.436, 0.535). The value ofthe proposed test statistic ∆n = 1.05, and the observed p-value isP(∆∗n > 1.05) = 0.904 based on 1000 bootstrap replications of∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 327: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Figure: G (solid curve), G1 (blue dotted curve), G2 (red dash-dot curve),empirical cdf G (green dashed curve).

Guanhua Lu Semiparametric Density Ratio Model

Page 328: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Example 2. (Misspecified Model)

Use the same data as in Example 1.

Intentionally fit the following misspecified model with distortionfunction h(x) = x

g1(x) = g(x) exp(α1 + β1x),g2(x) = g(x) exp(α2 + β2x).

(α1, α2, β1, β2) = (−0.00072,−0.03,−0.0015, 0.032). Thevalue of the proposed test statistic ∆n = 2.31, and the observedp-value is P(∆∗n > 2.31) = 0.007 based on 1000 bootstrapreplications of ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 329: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Figure: G (solid curve), G1 (blue dotted curve), G2 (red dash-dot curve),empirical cdf G (green dashed curve).

Guanhua Lu Semiparametric Density Ratio Model

Page 330: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Example 3. (With Relaxed Weight Functions)

Use the same data as in Example 1.

Fit the following model with distortion function h(x) = (x, x2)′

g1(x) = g(x) exp(α1 + γ1x + β1x2),g2(x) = g(x) exp(α2 + γ2x + β2x2).

(α1, α2, γ1, γ2, β1, β2) =(−0.562,−0.860, 0.023, 0.139, 0.427, 0.539). The value of theproposed test statistic ∆n = 0.92, and the observed p-value isP(∆∗n > 0.92) = 0.89 based on 1000 bootstrap replications of∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 331: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Confidence Bands (with Equal Widths)

The limit of ∆n = sup−∞≤t≤∞√

n |G(t)− G(t)| agrees with thelimit of its bootstrap counterpart∆∗n = sup−∞≤t≤∞

√n |G∗(t)− G(t)| almost surely.

Approximate the quantiles of ∆n by those of the distribution of∆∗n.

For α ∈ (0, 1), let

wn1−α = infy|P∗(∆∗n ≤ y) ≥ 1− α,

then a 1− α level bootstrap confidence band for G is given by(G(·)− wn

1−α/√

n, G(·) + wn1−α/

√n). (25)

Guanhua Lu Semiparametric Density Ratio Model

Page 332: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Pointwise Confidence Intervals

Let V(ti) be the estimated variance for G(ti) at each point ti from thecovariance matrix (23). Then a 1− α level pointwise confidenceinterval for G(ti) is given by(

G(ti)− z1−α/2

√V(ti), G(ti) + z1−α/2

√V(ti)

), (26)

where z1−α/2 satisfies P(Z ≤ z1−α/2) = 1− α/2 with Z ∼ N(0, 1).

Guanhua Lu Semiparametric Density Ratio Model

Page 333: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Bonferroni Simultaneous Confidence Intervals

The 1− α Bonferroni simultaneous confidence intervals given by(G(ti)− tn−1

1−α/2n

√V(ti), G(ti) + tn−1

1−α/2n

√V(ti)

), (27)

where tn−11−α/2n is the (1− α

2n) percent cutoff point of the tn−1distribution with degree of freedom n− 1.

Guanhua Lu Semiparametric Density Ratio Model

Page 334: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Figure: G (black thick curve), 95% CB (blue curve), 95% Bonferronisimultaneous CI (red dotted curve), 95% pointwise CI(green dashed curve).

Guanhua Lu Semiparametric Density Ratio Model

Page 335: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Data: Hosmer & Lemeshow[1989, Ch. 1] used the logisticmodel to analyse the relationship between age and the status ofcoronary heart disease based on 100 subjects participating in astudy. Let x denote age and y = 1 or 0 represent the presence orabsence of coronary heart disease. (xi, yi), i = 1, . . . , 100.

Model:P(x|y = 1)P(x|y = 0)

= exp(α+ βx).

Estimation: (α, β) = (−5.0276, 0.1109). The value of theproposed test statistic ∆n = 0.2199, and the observed p-value isP(∆∗n > 0.2199) = 0.970 based on 1000 bootstrap replicationsof ∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 336: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Figure: Solid and dash (red) curves on the upper left represent estimated cdf andempirical cdf for P(x|y = 0), respectively; solid and dash (green) curves on the lowerright represent estimated cdf and empirical cdf for P(x|y = 1), respectively.

Guanhua Lu Semiparametric Density Ratio Model

Page 337: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Fit a More Complex Model:

P(x|y = 1)P(x|y = 0)

= exp(α+ βx + γx2). (28)

Wald Test: When H0 : γ = 0 is true under model (28),√

nγ → N(0, σ2γ),

where σ2γ is the asymptotic variance of γ. Let σ2

γ be theempirical version of σ2

γ on the basis of G, we can use the statistic

T =√

σ2γ

to test H0 : γ = 0 is true under model (28).Estimation: (α, β, γ) = (−3.9589, 0.0613, 0.0006). The Waldstatistic T = 0.2557, and the observed p-value is 0.798. Thissuggests to accept H0 : γ = 0.

Guanhua Lu Semiparametric Density Ratio Model

Page 338: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Odds-linear Model:

P(y = 1|x)1− P(y = 1|x)

= α0 + β0x. (29)

This model is equivalent to

P(x|y = 1)P(x|y = 0)

= expα+ r(x;β), (30)

where r(x;β) = log(1 + (β0/α0)x). The asymptotic results canbe easily extended to model (30).

Estimation: (α, β) = (−7.62, 47.47). The value of the proposedtest statistic ∆n = 1.55, and the observed p-value isP(∆∗n > 1.55) = 0.005 based on 1000 bootstrap replications of∆∗n.

Guanhua Lu Semiparametric Density Ratio Model

Page 339: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Billingsley, P. (1999). Convergence of Probability Measures,New York: Wiley.

Donsker, M.D. (1952). Justification and extension of Doob’sheuristic approach to the Kolmogorov-Smirnov theorems, Annalsof Mathematical Statistics, Vol. 23, 277-281.

Ethier, S. N. and Kurtz, T. G. (1985). Markov Processes:Characterization and Convergence, New York: Wiley.

Fokianos, K., Kedem, B., Qin, J., and Short, D.A. (2001). Asemiparametric approach to the one-way layout. Technometrics,Vol. 43, 56-65.

Gilbert, P.B. (2000). Large sample theory of maximum likelihoodestimates in semiparametric biased sampling models. Annals ofStatistics, Vol. 28, 151-194.

Guanhua Lu Semiparametric Density Ratio Model

Page 340: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Gilbert, P.B., Lele, S.R. and Vardi, Y. (1999). Maximumlikelihood estimation in semiparametric selection bias modelswith application to AIDS vaccine trials. Biometrika, Vol. 86,27-43.

Gill, R. D., Vardi, Y. and Wellner, J. A. (1988). Large sampletheory of empirical distribution in biased sampling models.Annals of Statistics, Vol. 16, 1069-1112.

Prentice, R. L. and Pyke, R. (1979). Logistic disease incidencemodels and case-control studies. Biometrika, Vol. 66, 403-411.

Qin, J. (1993). Empirical likelihood in biased sample problems.Annals of Statistics, Vol. 21, 1182-1196.

Qin, J. (1998). Inferences for case-control and semiparametrictwo-sample density ratio models. Biometrika, Vol. 85 , 619-630

Guanhua Lu Semiparametric Density Ratio Model

Page 341: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Qin, J. and Lawless, J. F. (1994). Empirical likelihood andgeneral estimating equations. Annals of Statistics, Vol. 22,300-325.

Qin, J. and Zhang, B. (1997). A Goodness of Fit Test for LogisticRegression Models Based on Case-Control Data. Biometrika, Vol84, 609-618.

Van der Vaart, A. W. (1998). Asymptotic Statistics, CambridgeUniversity Press.

Van der Vaart, A. W. and Wellner, J. A. (1996). WeakConvergence and Empirical Processes with Applications toStatistics, New York: Springer.

Vardi, Y. (1982). Nonparametric estimation in the presence oflength bias. Annals of Statistics, Vol. 10, 616-20.

Vardi, Y. (1985). Empirical distribution in selection bias models.Annals of Statistics, Vol. 13, 178-203.

Guanhua Lu Semiparametric Density Ratio Model

Page 342: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Wellner, J. A. (1992). Empirical Processes in Action: A Review.International statistical Review, Vol 60, 247-269.

Zhang, B. (2000). A goodness of fit test formultiplicative-intercept risk models based on case-control data.Statistica Sinica, 10, 839-865.

Guanhua Lu Semiparametric Density Ratio Model

Page 343: Modeling The Variance of a Time Seriesbnk/Ked_Symp_Friday.pdfVolatility Stochastic Volatility and GARCH A Simple Tractable Model An Application Summary Ben Kedem Ben has made many

IntroductionAsymptotic Theory for θ and G

Data Analysis

A Simulation Study for Estimation of ParametersGoodness of Fit and Confidence BandsAn Application to Coronary Heart Disease Data

Thank You!

Happy Brithday to Dr.Kedem!

Guanhua Lu Semiparametric Density Ratio Model