
Rob J Hyndman

State space models

2: Structural models

Outline

1 Simple structural models

2 Linear Gaussian state space models

3 Kalman filter

4 Kalman smoothing

5 Time varying parameter models




State space models

[Diagram: state transition graph x_{t−1} → x_t → x_{t+1} → ⋯, with each observation y_t linked to the previous state x_{t−1}.]

ETS state vector: x_t = (ℓ_t, b_t, s_t, s_{t−1}, …, s_{t−m+1})

ETS models
• y_t depends on x_{t−1}.
• The same error process affects x_t | x_{t−1} and y_t | x_{t−1}.

State space models

[Diagram: state transition graph x_t → x_{t+1} → ⋯, with each observation y_t linked to the current state x_t.]

Structural models
• y_t depends on x_t.
• A different error process affects x_t | x_{t−1} and y_t | x_t.

Local level model

Stochastically varying level (a random walk) observed with noise:

y_t = ℓ_t + ε_t
ℓ_t = ℓ_{t−1} + ξ_t

• ε_t and ξ_t are independent Gaussian white noise processes.
• Compare ETS(A,N,N), where ξ_t = αε_{t−1}.
• Parameters to estimate: σ²_ε and σ²_ξ.
• If σ²_ξ = 0, then y_t ∼ NID(ℓ_0, σ²_ε).

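As an illustrative sketch (the variances and series length are assumed values), the R code below simulates a local level series and recovers the two variances with StructTS:

set.seed(123)
n <- 200
sigma2_eps <- 1     # assumed observation noise variance
sigma2_xi  <- 0.1   # assumed level noise variance

# Simulate l_t = l_{t-1} + xi_t and y_t = l_t + eps_t
level <- cumsum(rnorm(n, sd = sqrt(sigma2_xi)))
y <- ts(level + rnorm(n, sd = sqrt(sigma2_eps)))

# Fit the local level model; coef() returns the estimated variances
fit <- StructTS(y, type = "level")
coef(fit)   # estimated state and observation variances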

Local linear trend model

Dynamic trend observed with noise:

y_t = ℓ_t + ε_t
ℓ_t = ℓ_{t−1} + b_{t−1} + ξ_t
b_t = b_{t−1} + ζ_t

• ε_t, ξ_t and ζ_t are independent Gaussian white noise processes.
• Compare ETS(A,A,N), where ξ_t = (α + β)ε_{t−1} and ζ_t = βε_{t−1}.
• Parameters to estimate: σ²_ε, σ²_ξ and σ²_ζ.
• If σ²_ζ = σ²_ξ = 0, then y_t = ℓ_0 + t b_0 + ε_t.
• The model is a time-varying linear regression.

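A minimal sketch of the special case (all values below are assumed): simulate the recursions with the state variances set to zero and check that the level is exactly the deterministic trend ℓ_0 + t b_0.

set.seed(1)
n <- 100
sigma2_eps <- 0.5; sigma2_xi <- 0; sigma2_zeta <- 0   # state noise switched off
l0 <- 10; b0 <- 0.5                                   # assumed starting values

l <- b <- numeric(n)
l_prev <- l0; b_prev <- b0
for (t in 1:n) {
  l[t] <- l_prev + b_prev + rnorm(1, sd = sqrt(sigma2_xi))
  b[t] <- b_prev + rnorm(1, sd = sqrt(sigma2_zeta))
  l_prev <- l[t]; b_prev <- b[t]
}
y <- l + rnorm(n, sd = sqrt(sigma2_eps))

# With sigma2_xi = sigma2_zeta = 0 the level is exactly the deterministic trend
all.equal(l, l0 + (1:n) * b0)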

Basic structural model

y_t = ℓ_t + s_{1,t} + ε_t
ℓ_t = ℓ_{t−1} + b_{t−1} + ξ_t
b_t = b_{t−1} + ζ_t
s_{1,t} = −∑_{j=1}^{m−1} s_{j,t−1} + η_t
s_{j,t} = s_{j−1,t−1},   j = 2, …, m−1

• ε_t, ξ_t, ζ_t and η_t are independent Gaussian white noise processes.
• Compare ETS(A,A,A).
• Parameters to estimate: σ²_ε, σ²_ξ, σ²_ζ and σ²_η.
• Deterministic seasonality if σ²_η = 0.

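A minimal sketch of the seasonal recursion (m, the variance and the starting states are assumed), showing that any m consecutive seasonal terms sum to approximately η_t, i.e. close to zero:

set.seed(42)
m <- 4                      # assumed seasonal period (quarterly)
n <- 60
sigma2_eta <- 0.05          # assumed seasonal disturbance variance

s <- c(1, -0.5, 0.2)        # assumed initial states (s_{1,0}, ..., s_{m-1,0})
seasonal <- numeric(n)
for (t in 1:n) {
  s1_new <- -sum(s) + rnorm(1, sd = sqrt(sigma2_eta))  # s_{1,t}
  s <- c(s1_new, s[1:(m - 2)])                         # shift: s_{j,t} = s_{j-1,t-1}
  seasonal[t] <- s[1]
}

# Sums over consecutive blocks of m seasonal terms stay close to zero
round(sapply(seq(1, n - m + 1, by = m),
             function(i) sum(seasonal[i:(i + m - 1)])), 2)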

Trigonometric models

y_t = ℓ_t + ∑_{j=1}^{J} s_{j,t} + ε_t
ℓ_t = ℓ_{t−1} + b_{t−1} + ξ_t
b_t = b_{t−1} + ζ_t
s_{j,t} = cos(λ_j) s_{j,t−1} + sin(λ_j) s*_{j,t−1} + ω_{j,t}
s*_{j,t} = −sin(λ_j) s_{j,t−1} + cos(λ_j) s*_{j,t−1} + ω*_{j,t}

where λ_j = 2πj/m.

• ε_t, ξ_t, ζ_t, ω_{j,t} and ω*_{j,t} are independent Gaussian white noise processes.
• ω_{j,t} and ω*_{j,t} have the same variance σ²_{ω,j}.
• Equivalent to the BSM when σ²_{ω,j} = σ²_ω and J = m/2.
• Choose J < m/2 for fewer degrees of freedom.

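A minimal sketch (m and J are assumed example values) of the block-diagonal transition matrix implied by the cos/sin recursion above, one 2 × 2 rotation block per harmonic:

m <- 12                     # assumed seasonal period (monthly)
J <- 3                      # keep only the first three harmonics (J < m/2)

lambda <- 2 * pi * (1:J) / m
blocks <- lapply(lambda, function(l)
  matrix(c(cos(l), -sin(l), sin(l), cos(l)), nrow = 2))  # [cos sin; -sin cos]

# Block-diagonal transition matrix for the 2J seasonal states
G_seas <- matrix(0, 2 * J, 2 * J)
for (j in 1:J) {
  idx <- (2 * j - 1):(2 * j)
  G_seas[idx, idx] <- blocks[[j]]
}
round(G_seas, 3)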

ETS vs Structural models

• ETS models are much more general as they allow non-linear (multiplicative) components.
• ETS allows automatic forecasting due to its larger model space.
• Additive ETS models are almost equivalent to the corresponding structural models.
• ETS models have a larger parameter space. Structural model parameters are always non-negative (variances).
• Structural models are much easier to generalize (e.g., add covariates).
• It is easier to handle missing values with structural models.


Structural models in R

library(fpp)   # oil, ausair and austourists are in the fpp package

StructTS(oil, type = "level")
StructTS(ausair, type = "trend")
StructTS(austourists, type = "BSM")

fit <- StructTS(austourists, type = "BSM")
decomp <- cbind(austourists, fitted(fit))
colnames(decomp) <- c("data", "level", "slope", "seasonal")
plot(decomp, main = "Decomposition of International visitor nights")

Structural models in R

[Figure: "Decomposition of International visitor nights" — panels for data, level, slope and seasonal components, 2000–2010.]

ETS decomposition

[Figure: "Decomposition by ETS(A,A,A) method" — panels for observed, level, slope and season components, 2000–2010.]


Linear Gaussian SS models

Observation equation:  y_t = f′x_t + ε_t
State equation:        x_t = G x_{t−1} + w_t

• State vector x_t of length p; G is a p × p matrix; f is a vector of length p.
• ε_t ∼ NID(0, σ²), w_t ∼ NID(0, W).

Local level model: f = G = 1, x_t = ℓ_t.

Local linear trend model: f′ = [1 0],

x_t = [ℓ_t, b_t]′,   G = [1 1; 0 1],   W = diag(σ²_ξ, σ²_ζ)

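A minimal sketch connecting the general form to the local linear trend case (the variances and initial state are assumed): define f, G and W and simulate from the two equations.

set.seed(2)
n <- 150
f <- c(1, 0)
G <- matrix(c(1, 0, 1, 1), nrow = 2)     # [1 1; 0 1]
W <- diag(c(0.05, 0.01))                 # diag(sigma2_xi, sigma2_zeta), assumed
sigma2 <- 1                              # assumed observation variance

x <- c(10, 0.2)                          # assumed initial state (l_0, b_0)
y <- numeric(n)
for (t in 1:n) {
  x <- as.vector(G %*% x) + rnorm(2, sd = sqrt(diag(W)))   # state equation
  y[t] <- sum(f * x) + rnorm(1, sd = sqrt(sigma2))         # observation equation
}
plot(ts(y), main = "Simulated local linear trend")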

Basic structural model

Linear Gaussian state space form:

y_t = f′x_t + ε_t,   ε_t ∼ N(0, σ²)
x_t = G x_{t−1} + w_t,   w_t ∼ N(0, W)

f′ = [1 0 1 0 ⋯ 0],   W = diag(σ²_ξ, σ²_ζ, σ²_η, 0, …, 0)

x_t = (ℓ_t, b_t, s_{1,t}, s_{2,t}, s_{3,t}, …, s_{m−1,t})′

G = [ 1  1   0   0  ⋯   0   0
      0  1   0   0  ⋯   0   0
      0  0  −1  −1  ⋯  −1  −1
      0  0   1   0  ⋯   0   0
      0  0   0   1  ⋯   0   0
      ⋮  ⋮            ⋱      ⋮
      0  0   0   ⋯   0   1   0 ]

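A minimal sketch that builds f and G above for a given seasonal period (m = 4 is an assumed example value):

m <- 4
p <- m + 1                          # states: level, slope, s_1, ..., s_{m-1}

f <- c(1, 0, 1, rep(0, m - 2))      # picks out the level and the current seasonal

G <- matrix(0, p, p)
G[1, 1:2] <- c(1, 1)                # l_t = l_{t-1} + b_{t-1}
G[2, 2]   <- 1                      # b_t = b_{t-1}
G[3, 3:p] <- -1                     # s_{1,t} = -(s_{1,t-1} + ... + s_{m-1,t-1})
for (j in seq_len(p - 3)) G[j + 3, j + 2] <- 1   # s_{j,t} = s_{j-1,t-1}
G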



Kalman filter

Notation:

x_{t|t} = E[x_t | y_1, …, y_t]          P_{t|t} = V[x_t | y_1, …, y_t]
x_{t|t−1} = E[x_t | y_1, …, y_{t−1}]    P_{t|t−1} = V[x_t | y_1, …, y_{t−1}]
y_{t|t−1} = E[y_t | y_1, …, y_{t−1}]    v_{t|t−1} = V[y_t | y_1, …, y_{t−1}]

Forecasting:
y_{t|t−1} = f′x_{t|t−1}
v_{t|t−1} = f′P_{t|t−1}f + σ²

Updating or state filtering:
x_{t|t} = x_{t|t−1} + P_{t|t−1}f v_{t|t−1}^{−1}(y_t − y_{t|t−1})
P_{t|t} = P_{t|t−1} − P_{t|t−1}f v_{t|t−1}^{−1}f′P_{t|t−1}

State prediction:
x_{t+1|t} = G x_{t|t}
P_{t+1|t} = G P_{t|t}G′ + W

• Iterate for t = 1, …, T.
• Assume we know x_{1|0} and P_{1|0}.
• These are just conditional expectations, so they give minimum MSE estimates.
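A minimal base-R sketch of these recursions (the example matrices and variances are assumed; this is not the StructTS implementation). It uses a diffuse start P_{1|0} = kI with a large k, and skips the updating step when y_t is missing, as discussed on the following slides.

kalman_filter <- function(y, f, G, W, sigma2,
                          x0 = rep(0, length(f)),
                          P0 = 1e7 * diag(length(f))) {
  n <- length(y); p <- length(f)
  x_pred <- x0; P_pred <- P0                 # x_{1|0} and P_{1|0}
  xf <- matrix(NA, n, p); v <- e <- numeric(n)
  for (t in 1:n) {
    # Forecasting
    v[t] <- as.numeric(t(f) %*% P_pred %*% f) + sigma2
    e[t] <- y[t] - sum(f * x_pred)
    # Updating or state filtering (skipped if y_t is missing)
    if (is.na(y[t])) {
      x_filt <- x_pred; P_filt <- P_pred
    } else {
      K <- P_pred %*% f / v[t]
      x_filt <- x_pred + as.vector(K) * e[t]
      P_filt <- P_pred - K %*% t(f) %*% P_pred
    }
    xf[t, ] <- x_filt
    # State prediction
    x_pred <- as.vector(G %*% x_filt)
    P_pred <- G %*% P_filt %*% t(G) + W
  }
  list(filtered = xf, e = e, v = v)
}

# Example: filter a simulated series with the local linear trend matrices
set.seed(3)
y <- cumsum(rnorm(100, 0.1, 0.3)) + rnorm(100, sd = 0.5)
out <- kalman_filter(y, f = c(1, 0), G = matrix(c(1, 0, 1, 1), 2),
                     W = diag(c(0.05, 0.01)), sigma2 = 0.25)
plot(ts(y)); lines(out$filtered[, 1], col = "blue")   # filtered level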

Kalman recursions

[Diagram: the filter cycles through 1. state prediction (filtered state at time t−1 → predicted state at time t), 2. forecasting (predicted state → forecast observation at time t), and 3. state filtering (forecast combined with the observation at time t → filtered state at time t).]

Initializing the Kalman filter

• Need x_{1|0} and P_{1|0} to get started.
• Common approach for structural models: set x_{1|0} = 0 and P_{1|0} = kI for a very large k.
• There are lots of research papers on optimal initialization choices for Kalman recursions.
• The ETS approach was to estimate x_{1|0} and avoid P_{1|0} by assuming the error processes are identical.
• A random x_{1|0} could be used with ETS models, and then a form of Kalman filter would be required for estimation and forecasting. This gives more realistic prediction intervals.


Local level model

y_t = ℓ_t + ε_t,   ε_t ∼ NID(0, σ²)
ℓ_t = ℓ_{t−1} + u_t,   u_t ∼ NID(0, q²)

Kalman recursions:

y_{t|t−1} = ℓ_{t−1|t−1}
v_{t|t−1} = p_{t|t−1} + σ²
ℓ_{t|t} = ℓ_{t−1|t−1} + p_{t|t−1}v_{t|t−1}^{−1}(y_t − y_{t|t−1})
p_{t+1|t} = p_{t|t−1}(1 − v_{t|t−1}^{−1}p_{t|t−1}) + q²

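The same recursions written as a short scalar loop (illustrative; the variance values used for the Nile data below are rough guesses, not fitted values):

local_level_filter <- function(y, sigma2, q2, l0 = 0, p0 = 1e7) {
  n <- length(y)
  l <- v <- numeric(n)
  l_prev <- l0; p_pred <- p0                    # l_{0|0} and p_{1|0}
  for (t in 1:n) {
    v[t] <- p_pred + sigma2                              # v_{t|t-1}
    l[t] <- l_prev + p_pred * (y[t] - l_prev) / v[t]     # l_{t|t}
    p_pred <- p_pred * (1 - p_pred / v[t]) + q2          # p_{t+1|t}
    l_prev <- l[t]
  }
  list(level = l, v = v)
}

# Rough variance guesses for the Nile data (shipped with base R)
f1 <- local_level_filter(Nile, sigma2 = 15000, q2 = 1500)
plot(Nile); lines(ts(f1$level, start = start(Nile)), col = "blue")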


Handling missing values

Forecasting:
y_{t|t−1} = f′x_{t|t−1}
v_{t|t−1} = f′P_{t|t−1}f + σ²

Updating or state filtering:
x_{t|t} = x_{t|t−1} + P_{t|t−1}f v_{t|t−1}^{−1}(y_t − y_{t|t−1})
P_{t|t} = P_{t|t−1} − P_{t|t−1}f v_{t|t−1}^{−1}f′P_{t|t−1}

State prediction:
x_{t|t−1} = G x_{t−1|t−1}
P_{t|t−1} = G P_{t−1|t−1}G′ + W

• Iterate for t = 1, …, T, starting with x_{1|0} and P_{1|0}.
• If y_t is missing, ignore the correction terms (greyed out on the original slide), so that x_{t|t} = x_{t|t−1} and P_{t|t} = P_{t|t−1}.


Multi-step forecasting

Forecasting:
y_{t|t−1} = f′x_{t|t−1}
v_{t|t−1} = f′P_{t|t−1}f + σ²

Updating or state filtering:
x_{t|t} = x_{t|t−1} + P_{t|t−1}f v_{t|t−1}^{−1}(y_t − y_{t|t−1})
P_{t|t} = P_{t|t−1} − P_{t|t−1}f v_{t|t−1}^{−1}f′P_{t|t−1}

State prediction:
x_{t|t−1} = G x_{t−1|t−1}
P_{t|t−1} = G P_{t−1|t−1}G′ + W

• Iterate for t = T+1, …, T+h, starting with x_{T|T} and P_{T|T}.
• Treat future values as missing.
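A minimal sketch of multi-step forecasting (the matrices, x_{T|T} and P_{T|T} are assumed values): iterate only the state prediction step h times and read off forecast means and variances.

h <- 8
f <- c(1, 0); G <- matrix(c(1, 0, 1, 1), 2)      # local linear trend
W <- diag(c(0.05, 0.01)); sigma2 <- 0.25         # assumed variances
xT <- c(50, 0.5); PT <- diag(c(1, 0.1))          # assumed x_{T|T} and P_{T|T}

fc_mean <- fc_var <- numeric(h)
x <- xT; P <- PT
for (j in 1:h) {
  x <- as.vector(G %*% x)                         # x_{T+j|T}
  P <- G %*% P %*% t(G) + W                       # P_{T+j|T}
  fc_mean[j] <- sum(f * x)                        # point forecast
  fc_var[j]  <- as.numeric(t(f) %*% P %*% f) + sigma2
}
cbind(forecast = fc_mean,
      lower = fc_mean - 1.96 * sqrt(fc_var),
      upper = fc_mean + 1.96 * sqrt(fc_var))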

Kalman filter

What's so special about the Kalman filter?

• Very general equations for any model in state space format.
• Any model in state space format can easily be generalized.
• Optimal MSE forecasts.
• Easy to handle missing values.
• Easy to compute the likelihood.


Likelihood calculation

θ = all unknown parameters
f_θ(y_t | y_1, y_2, …, y_{t−1}) = one-step forecast density

Likelihood:

L(y_1, …, y_T; θ) = ∏_{t=1}^{T} f_θ(y_t | y_1, …, y_{t−1})

Gaussian log-likelihood:

log L = −(T/2) log(2π) − (1/2) ∑_{t=1}^{T} log v_{t|t−1} − (1/2) ∑_{t=1}^{T} e_t² / v_{t|t−1}

where e_t = y_t − y_{t|t−1}.

All terms are obtained from the Kalman filter equations.

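A small helper (illustrative) that evaluates this log-likelihood from the one-step errors e_t and variances v_{t|t−1} returned by a Kalman filter pass such as the kalman_filter sketch above; in practice it is maximised numerically over the unknown variances.

gaussian_loglik <- function(e, v) {
  ok <- !is.na(e)                     # skip missing observations
  Tn <- sum(ok)
  -Tn / 2 * log(2 * pi) - 0.5 * sum(log(v[ok])) - 0.5 * sum(e[ok]^2 / v[ok])
}

# e.g. (assuming the kalman_filter sketch above and a series y):
# negll <- function(par) {
#   out <- kalman_filter(y, f = c(1, 0), G = matrix(c(1, 0, 1, 1), 2),
#                        W = diag(exp(par[1:2])), sigma2 = exp(par[3]))
#   -gaussian_loglik(out$e, out$v)
# }
# opt <- optim(c(0, 0, 0), negll)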

Structural models in R

fit <- StructTS(austourists, type = "BSM")
fc <- forecast(fit)
plot(fc)

[Figure: "Forecasts from Basic structural model" — austourists forecasts, 2000–2012.]


Kalman smoothing

We want an estimate of x_t | y_1, …, y_T where t < T, that is, x_{t|T}.

x_{t|T} = x_{t|t} + A_t (x_{t+1|T} − x_{t+1|t})
P_{t|T} = P_{t|t} + A_t (P_{t+1|T} − P_{t+1|t}) A′_t

where A_t = P_{t|t}G′ (P_{t+1|t})^{−1}.

• Uses all the data, not just previous data.
• Useful for estimating missing values: y_{t|T} = f′x_{t|T}.
• Useful for seasonal adjustment when one of the states is a seasonal component.

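A minimal sketch of the smoothing recursion for the scalar local level case (the Nile variance values are rough guesses); the forward pass is the Kalman filter, the backward pass applies the formulas above:

local_level_smooth <- function(y, sigma2, q2, l0 = 0, p0 = 1e7) {
  n <- length(y)
  lf <- pf <- lp <- pp <- numeric(n)    # filtered/predicted means and variances
  l_prev <- l0; p_prev <- p0
  for (t in 1:n) {                      # forward pass: Kalman filter
    lp[t] <- l_prev; pp[t] <- p_prev + q2               # l_{t|t-1}, p_{t|t-1}
    v <- pp[t] + sigma2
    lf[t] <- lp[t] + pp[t] * (y[t] - lp[t]) / v         # l_{t|t}
    pf[t] <- pp[t] - pp[t]^2 / v                        # p_{t|t}
    l_prev <- lf[t]; p_prev <- pf[t]
  }
  ls <- lf; ps <- pf                    # backward pass: l_{t|T}, p_{t|T}
  for (t in (n - 1):1) {
    A <- pf[t] / pp[t + 1]                              # A_t = p_{t|t} G / p_{t+1|t}
    ls[t] <- lf[t] + A * (ls[t + 1] - lp[t + 1])
    ps[t] <- pf[t] + A^2 * (ps[t + 1] - pp[t + 1])
  }
  list(filtered = lf, smoothed = ls)
}

sm <- local_level_smooth(Nile, sigma2 = 15000, q2 = 1500)   # rough guesses
plot(Nile); lines(ts(sm$smoothed, start = start(Nile)), col = "blue")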

Kalman smoothing in R

fit <- StructTS(austourists, type = "BSM")
sm <- tsSmooth(fit)

plot(austourists)
lines(sm[, 1], col = "blue")            # smoothed level (tsSmooth)
lines(fitted(fit)[, 1], col = "red")    # filtered level (fitted)
legend("topleft", col = c("blue", "red"), lty = 1,
       legend = c("Smoothed level", "Filtered level"))

Kalman smoothing in R

[Figure: austourists with the smoothed level and filtered level overlaid, 2000–2010.]

Kalman smoothing in R

fit <- StructTS(austourists, type = "BSM")
sm <- tsSmooth(fit)

plot(austourists)

# Seasonally adjusted data
aus.sa <- austourists - sm[, 3]
lines(aus.sa, col = "blue")

Kalman smoothing in R

[Figure: austourists with the seasonally adjusted series overlaid, 2000–2010.]

Kalman smoothing in R

x <- austourists
miss <- sample(1:length(x), 5)
x[miss] <- NA
fit <- StructTS(x, type = "BSM")
sm <- tsSmooth(fit)
estim <- sm[, 1] + sm[, 3]

plot(x, ylim = range(austourists))
points(time(x)[miss], estim[miss], col = "red", pch = 1)
points(time(x)[miss], austourists[miss], col = "black", pch = 1)
legend("topleft", pch = 1, col = c(2, 1),
       legend = c("Estimate", "Actual"))

Kalman smoothing in R

[Figure: austourists with five observations removed; smoothed estimates and the actual values are marked at the missing times, 2000–2010.]


Time varying parameter models

Linear Gaussian state space model:

y_t = f′_t x_t + ε_t,   ε_t ∼ N(0, σ²_t)
x_t = G_t x_{t−1} + w_t,   w_t ∼ N(0, W_t)

Kalman recursions:

y_{t|t−1} = f′_t x_{t|t−1}
v_{t|t−1} = f′_t P_{t|t−1}f_t + σ²_t
x_{t|t} = x_{t|t−1} + P_{t|t−1}f_t v_{t|t−1}^{−1}(y_t − y_{t|t−1})
P_{t|t} = P_{t|t−1} − P_{t|t−1}f_t v_{t|t−1}^{−1}f′_t P_{t|t−1}
x_{t|t−1} = G_t x_{t−1|t−1}
P_{t|t−1} = G_t P_{t−1|t−1}G′_t + W_t


Structural models with covariates

Local level with covariate:

y_t = ℓ_t + βz_t + ε_t
ℓ_t = ℓ_{t−1} + ξ_t

f′_t = [1  z_t],   x_t = [ℓ_t, β]′,   G = [1 0; 0 1],   W_t = diag(σ²_ξ, 0)

• Assumes z_t is fixed and known (as in regression).
• The estimate of β is given by x_{T|T}.
• Equivalent to simple linear regression with a time-varying intercept.
• Easy to extend to multiple regression with additional terms.

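A minimal sketch of this setup with invented data (covariate, variances and true β are all assumed): filter with the time-varying f_t = (1, z_t) and read the estimate of β from x_{T|T}.

set.seed(4)
n <- 200
z <- rnorm(n)                                    # assumed covariate
beta <- 2
level <- cumsum(rnorm(n, sd = 0.2))              # assumed true level path
y <- level + beta * z + rnorm(n, sd = 0.5)

G <- diag(2); W <- diag(c(0.04, 0)); sigma2 <- 0.25   # assumed variances
x <- c(0, 0); P <- 1e7 * diag(2)                      # diffuse start
for (t in 1:n) {
  x <- as.vector(G %*% x); P <- G %*% P %*% t(G) + W  # state prediction
  ft <- c(1, z[t])                                    # f_t = (1, z_t)
  v <- as.numeric(t(ft) %*% P %*% ft) + sigma2
  K <- P %*% ft / v
  x <- x + as.vector(K) * (y[t] - sum(ft * x))        # state filtering
  P <- P - K %*% t(ft) %*% P
}
x[2]    # estimate of beta (second element of x_{T|T})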

Time varying regression

Simple linear regression with time-varying parameters:

y_t = ℓ_t + β_t z_t + ε_t
ℓ_t = ℓ_{t−1} + ξ_t
β_t = β_{t−1} + ζ_t

f′_t = [1  z_t],   x_t = [ℓ_t, β_t]′,   G = [1 0; 0 1],   W_t = diag(σ²_ξ, σ²_ζ)

• Allows a linear regression with parameters that change slowly over time.
• Parameters follow independent random walks.
• Estimates of the parameters are given by x_{t|t} or x_{t|T}.


Updating (“online”) regression

The same idea can be used to estimate a regression iteratively as new data arrive.

Simple linear regression with updating parameters:

y_t = ℓ_t + β_t z_t + ε_t
ℓ_t = ℓ_{t−1} + ξ_t
β_t = β_{t−1} + ζ_t

f′_t = [1  z_t],   x_t = [ℓ_t, β_t]′,   G = [1 0; 0 1],   W_t = [0 0; 0 0]

• Updated parameter estimates are given by x_{t|t}.
• Recursive residuals are given by y_t − y_{t|t−1}.

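A minimal sketch of the W_t = 0 case with invented data: the recursions reduce to recursive least squares, and the final filtered state agrees closely with an ordinary least squares fit.

set.seed(5)
n <- 300
z <- rnorm(n)
y <- 1.5 + 0.8 * z + rnorm(n)            # assumed true intercept and slope

sigma2 <- 1
x <- c(0, 0); P <- 1e7 * diag(2)          # diffuse start for (intercept, beta)
beta_path <- matrix(NA, n, 2)             # updated estimates x_{t|t}
for (t in 1:n) {
  ft <- c(1, z[t])
  v <- as.numeric(t(ft) %*% P %*% ft) + sigma2
  K <- P %*% ft / v
  x <- x + as.vector(K) * (y[t] - sum(ft * x))   # recursive update
  P <- P - K %*% t(ft) %*% P                      # W_t = 0: no state noise added
  beta_path[t, ] <- x
}
round(rbind(kalman = x, ols = coef(lm(y ~ z))), 4)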
