
Time Series Analysis, Lecture 2, 2019

Contents

1 Wold’s Decomposition
2 VAR
3 MLE and Hypothesis Testing for VAR
4 Estimating the Effects of Shocks to the Economy
5 Identification Problem
6 Variance Decomposition

Nan Li, Department of Finance, ACEM, SJTU

7 Standard Error for Impulse Response Functions
7.1 Confidence Intervals and the Bootstrap
7.2 VAR Diagnostics
8 Granger Causality
9 Kalman Filter
9.1 State-Space Representation
9.2 Kalman Filter Algorithm
9.3 Innovation Representation
9.4 Convergence Results
9.5 Serially Correlated Measurement Errors
9.6 MLE Estimation of the Parameters
9.7 Smoothing
9.8 Statistical Inference with the Kalman Filter
9.9 Application


1 Wold’s Decomposition

Any stationary ARMA model can be written in the form

$$x_t = \sum_{j=0}^{\infty} \theta_j \varepsilon_{t-j}$$

where $\varepsilon_t$ is the white-noise error one would make in forecasting $x_t$ as a linear function of lagged $x_t$, and where the $\theta_j$'s are square summable with $\theta_0 = 1$.

• Wold’s Decomposition Theorem says this result is in fact fundamental for any covariance-stationary time series, not just stationary ARMAs!


Theorem 1 (Wold’s Decomposition) Any zero-mean covariance-stationary process $x_t$ can be represented in the form

$$x_t = \sum_{j=0}^{\infty} \theta_j \varepsilon_{t-j} + \eta_t$$

where

1. $\theta_0 = 1$ and $\sum_{j=0}^{\infty} \theta_j^2 < \infty$,

2. $\varepsilon_t$ is white noise and $\varepsilon_t = x_t - \hat{E}(x_t|x_{t-1}, x_{t-2}, \ldots)$, the one-step-ahead linear forecast error,

3. all the roots of $\theta(L)$ are on or outside the unit circle, i.e. (unless $x_t$ is a unit-root process) the MA polynomial is invertible,

4. $\eta_t$ is uncorrelated with $\varepsilon_{t-j}$ for any $j$ and is linearly deterministic, i.e. $\eta_t = \hat{E}(\eta_t|x_{t-1}, x_{t-2}, \ldots)$.

(Here $\hat{E}(\cdot|\cdot)$ denotes the linear projection.)


Remark 1 $\sum_{j=0}^{\infty} \theta_j \varepsilon_{t-j}$ is called the linearly indeterministic component; if $\eta_t = 0$, the process is called purely linearly indeterministic (linearly regular).

Remark 2 $\hat{E}(\varepsilon_t|x_{t-1}, x_{t-2}, \ldots) = 0$.

Remark 3 The $\theta_j$ and $\varepsilon_t$ are unique. Idea of proof: rewrite $x_t$ as a sum of its forecast errors.

Remark 4 Extension to nonstationary time series: same as above, except that $\eta_t$ is a linear combination of its own past (not necessarily deterministic).

Remark 5 $\varepsilon_t$ need not be normally distributed, and need not be i.i.d.

Remark 6 $\hat{E}(x_t|x_{t-1}, x_{t-2}, \ldots) \neq E[x_t|x_{t-1}, x_{t-2}, \ldots]$ in general: the Wold forecast is the best linear predictor, while the conditional expectation may be nonlinear.


Remark 7 $\varepsilon_t$ need not be the true structural shock.

Remark 8 Wold’s decomposition is the unique linear representation in which the shocks are linear forecast errors; uniqueness does not carry over to nonlinear representations.


Example 1 (Non-invertible shocks)

$$x_t = \eta_t + 2\eta_{t-1}, \quad \eta_t \text{ i.i.d.}, \ \sigma_\eta^2 = 1$$

$x_t$ is stationary, but the MA polynomial is not invertible; hence $\eta_t$ cannot be expressed as a forecast error of $x_t$.

Solution: any MA($\infty$) can be expressed as an invertible MA($\infty$), which is unique and whose shocks are said to be the fundamental innovations of $x_t$.

• The Wold MA($\infty$) representation is the fundamental representation: if two time series have the same Wold representation, they are the same time series up to second moments/linear forecast errors.
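The fundamental representation for Example 1 can be checked by hand: flip the non-invertible root $2$ to $1/2$ and rescale the shock variance to $\sigma_\varepsilon^2 = 4$ (these values are derived here for illustration, not stated above). Both parameterizations imply identical autocovariances, hence the same Wold representation. A minimal sketch:

```python
# Non-invertible MA(1): x_t = eta_t + 2*eta_{t-1}, sigma_eta^2 = 1.
# Candidate invertible counterpart: x_t = eps_t + 0.5*eps_{t-1}, sigma_eps^2 = 4.
gamma0_noninv = 1.0**2 + 2.0**2          # var(x_t) = 1 + 4 = 5
gamma1_noninv = 2.0 * 1.0                # cov(x_t, x_{t-1}) = 2
sigma2_eps = 4.0
gamma0_inv = sigma2_eps * (1.0 + 0.5**2)
gamma1_inv = sigma2_eps * 0.5
print(gamma0_noninv == gamma0_inv, gamma1_noninv == gamma1_inv)  # True True
```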

2 VAR

• Proposed by Chris Sims in the 1970s and 1980s

• Major subsequent contributions by others (Bernanke, Blanchard-Watson, Blanchard-Quah)
• Useful to organize data
— VARs serve as a "battleground" between alternative economic theories
— VARs can be used to quantitatively construct a particular model

• Question that can (in principle) be addressed by a VAR:
— How does the economy respond to a particular shock?
— The answer can be useful:
∗ for discriminating between models
∗ for estimating parameters of a given model

• VARs can't actually address such a question on their own:
— identification problem
— need extra assumptions ... structural VAR (SVAR)


3 MLE and Hypothesis Testing for VAR

1. The conditional likelihood function for a vector autoregression

$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \ldots + \Phi_p Y_{t-p} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \Sigma)$$

Given the initial $p$ observations, the conditional likelihood of $\theta = (c, \Phi_1, \Phi_2, \ldots, \Phi_p, \Sigma)$ is

$$f(Y_T, Y_{T-1}, \ldots, Y_1 | Y_0, Y_{-1}, \ldots, Y_{-p+1}; \theta) = \prod_{t=1}^{T} f(Y_t | Y_{t-1}, Y_{t-2}, \ldots, Y_{-p+1}; \theta)$$

and

$$Y_t | Y_{t-1}, Y_{t-2}, \ldots, Y_{-p+1} \sim N(\Pi' X_t, \Sigma)$$

where

$$X_t = \begin{bmatrix} 1 & Y'_{t-1} & Y'_{t-2} & \ldots & Y'_{t-p} \end{bmatrix}'_{(np+1)\times 1}, \qquad \Pi' = \begin{bmatrix} c & \Phi_1 & \Phi_2 & \ldots & \Phi_p \end{bmatrix}_{n\times(np+1)}$$


Hence

$$\mathcal{L}(\theta) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\Sigma^{-1}| - \frac{1}{2}\sum_{t=1}^{T}(Y_t - \Pi'X_t)'\Sigma^{-1}(Y_t - \Pi'X_t)$$

2. Maximum likelihood estimates of $\Pi$ and $\Sigma$:

$$\hat{\Pi}' = \left[\sum_{t=1}^{T} Y_t X_t'\right]\left[\sum_{t=1}^{T} X_t X_t'\right]^{-1}$$

which is the same as "OLS regression equation-by-equation", and

$$\hat{\Sigma} = \frac{1}{T}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t', \qquad \hat{\varepsilon}_t = Y_t - \hat{\Pi}'X_t$$
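A minimal numerical sketch of the equation-by-equation OLS estimator on a simulated bivariate VAR(1); all parameter values and the sample size are illustrative, not taken from the text:

```python
import numpy as np

# Simulate Y_t = c + Phi1 Y_{t-1} + eps_t with illustrative parameters.
rng = np.random.default_rng(0)
c = np.array([0.1, -0.2])
Phi1 = np.array([[0.5, 0.1],
                 [0.0, 0.3]])
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = c + Phi1 @ Y[t - 1] + 0.5 * rng.standard_normal(2)

# X_t = [1, Y'_{t-1}]'; Pi_hat' = (sum Y_t X_t')(sum X_t X_t')^{-1} is just OLS.
X = np.column_stack([np.ones(T - 1), Y[:-1]])
Pi_hat, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
resid = Y[1:] - X @ Pi_hat
Sigma_hat = resid.T @ resid / (T - 1)   # MLE: divide by T, no dof correction
print(np.round(Pi_hat.T, 2))            # row i: [c_i, row i of Phi1]
```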


3. Likelihood ratio test. Evaluated at the MLE,

$$\mathcal{L}(\hat{\Sigma}, \hat{\Pi}) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Sigma}^{-1}| - \frac{1}{2}\sum_{t=1}^{T}(Y_t - \hat{\Pi}'X_t)'\hat{\Sigma}^{-1}(Y_t - \hat{\Pi}'X_t) = -\frac{Tn}{2}\log(2\pi) + \frac{T}{2}\log|\hat{\Sigma}^{-1}| - \frac{Tn}{2}$$

since $\sum_t \hat{\varepsilon}_t'\hat{\Sigma}^{-1}\hat{\varepsilon}_t = \operatorname{tr}(\hat{\Sigma}^{-1}\sum_t \hat{\varepsilon}_t\hat{\varepsilon}_t') = Tn$. Hence

$$2(\mathcal{L}_1 - \mathcal{L}_0) = T(\log|\hat{\Sigma}_0| - \log|\hat{\Sigma}_1|) \sim \chi^2(l)$$

where $l$ is the number of restrictions; for example, for a test of $p_1$ ($H_1$) vs. $p_0$ ($H_0$) lags ($p_1 > p_0$) in an $n$-variable VAR, $l = n^2(p_1 - p_0)$.

Modified likelihood ratio test for small-sample bias (Sims 1980):

$$(T - k)(\log|\hat{\Sigma}_0| - \log|\hat{\Sigma}_1|) \sim \chi^2(l), \qquad k = 1 + np_1$$

which is less likely to reject the null hypothesis in small samples.

4. Asymptotic distribution of $\hat{\Pi}$ and $\hat{\Sigma}$: $\hat{\Pi}$ and $\hat{\Sigma}$ are consistent, and

$$\begin{bmatrix} \sqrt{T}\,(\operatorname{vec}(\hat{\Pi}_T) - \operatorname{vec}(\Pi)) \\ \sqrt{T}\,(\operatorname{vech}(\hat{\Sigma}_T) - \operatorname{vech}(\Sigma)) \end{bmatrix} \xrightarrow{\,L\,} N\left(0,\ \begin{bmatrix} \Sigma \otimes Q^{-1} & 0 \\ 0 & 2D_n^+(\Sigma \otimes \Sigma)(D_n^+)' \end{bmatrix}\right)$$


where $Q = E(X_t X_t')$, $D_n$ is the unique (duplication) matrix such that $D_n \operatorname{vech}(\Sigma) = \operatorname{vec}(\Sigma)$, and $D_n^+ = (D_n'D_n)^{-1}D_n'$, so that $D_n^+ D_n = I$.

5. Wald test of a general hypothesis of the form $R\operatorname{vec}(\Pi) = r$:

$$\sqrt{T}\,(R\operatorname{vec}(\hat{\Pi}_T) - r) \xrightarrow{\,L\,} N\left(0,\ R(\hat{\Sigma}_T \otimes \hat{Q}_T^{-1})R'\right)$$

hence

$$T\,(R\operatorname{vec}(\hat{\Pi}_T) - r)'\left[R(\hat{\Sigma}_T \otimes \hat{Q}_T^{-1})R'\right]^{-1}(R\operatorname{vec}(\hat{\Pi}_T) - r) \sim \chi^2(m)$$

where $m$ is the number of restrictions (rows of $R$).

4 Estimating the Effects of Shocks to the Economy

• Vector autoregression for an $N \times 1$ vector of observed variables:

$$X_t = A_1 X_{t-1} + \ldots + A_p X_{t-p} + \varepsilon_t, \qquad E\varepsilon_t\varepsilon_t' = V$$


• The $A$'s, $\varepsilon$ and $V$ can easily be obtained by OLS
• Problem: $\varepsilon$ is a statistical innovation
— We want impulse response functions to the fundamental economic shocks $\eta_t$:

$$\varepsilon_t = C\eta_t, \qquad E\eta_t\eta_t' = I, \qquad CC' = V$$

— Impulse response to the $i$th shock, where $C_i$ denotes the $i$th column of $C$:

$$X_t - E_{t-1}X_t = C_i\eta_{it}$$
$$E_t X_{t+1} - E_{t-1}X_{t+1} = A_1 C_i\eta_{it}$$


5 Identification Problem

• We know the $A$'s and $V$; we need to get $C$
• Identification problem: not enough restrictions to pin down $C$
— $N^2$ unknown elements in $C$
— Only $N(N+1)/2$ equations in $CC' = V$
— Need more identifying restrictions!
— Ambiguity of the impulse response function for a VAR (or VMA):

$$\text{VAR}: \ A(L)X_t = \varepsilon_t, \quad A(0) = I, \quad E(\varepsilon_t\varepsilon_t') = \Sigma$$
$$\text{VMA}: \ X_t = B(L)\varepsilon_t, \quad B(0) = I, \quad E(\varepsilon_t\varepsilon_t') = \Sigma$$

where $B(L) = A(L)^{-1}$. If $\Sigma$ is not diagonal, the system is in general unidentified: the shocks and impulse responses are not identified. To show this, for any full-rank $Q$ such that $QQ' = I$, we have

$$X_t = B(L)\varepsilon_t = \tilde{B}(L)\eta_t$$

where $\tilde{B}(L) = B(L)Q^{-1}$ and $\eta_t = Q\varepsilon_t$. Hence $B(L)\varepsilon_t$ and $\tilde{B}(L)\eta_t$ are observationally equivalent but have different impulse responses.


• Orthogonalization assumptions:

1. Sims orthogonalization: $B(0)$ is lower triangular and $E(\eta_t\eta_t') = I$

(a) The first variable is affected only by its own shock contemporaneously; the other variable absorbs all the contemporaneous correlation between the additional shocks and the first shock. Note that the original system, with $B(0) = I$ (or $A(0) = I$), restricts each shock to affect only its own variable contemporaneously, but those shocks cannot also be mutually orthogonal unless $\Sigma$ is diagonal.

(b) In terms of the MA representation,

$$\begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} = B(L)\eta_t = \begin{bmatrix} B^0_{11} & 0 \\ B^0_{21} & B^0_{22} \end{bmatrix}\begin{bmatrix} \eta_{1t} \\ \eta_{2t} \end{bmatrix} + B^1\eta_{t-1} + \ldots$$

(c) In terms of the AR representation, $A(L) = B(L)^{-1}$; $B(0)$ lower triangular implies that $A(0)$ is lower triangular, or

$$A^0_{11}x_{1t} = -A^1_{11}x_{1t-1} - A^1_{12}x_{2t-1} + \ldots + \eta_{1t}$$
$$A^0_{22}x_{2t} = -A^0_{21}x_{1t} - A^1_{21}x_{1t-1} - A^1_{22}x_{2t-1} + \ldots + \eta_{2t}$$

that is, estimate the system by OLS with contemporaneous $x_{1t}$ in the $x_{2t}$ equation, but not vice versa.


• Homework: show that the OLS residuals $\eta_{1t}$ and $\eta_{2t}$ are uncorrelated.

(d) How to find $A^0$ or $B^0$? A Cholesky decomposition will do the job.
• Example:

$$X_t = AX_{t-1} + \varepsilon_t, \qquad E(\varepsilon_t\varepsilon_t') = \Sigma$$

Let $\eta_t = C\varepsilon_t$; then $C$ should satisfy $E(\eta_t\eta_t') = C\Sigma C' = I$, with $C$ lower triangular. The Cholesky decomposition of $\Sigma$ gives us $C^{-1}$.

Note: if $\Sigma$ is Hermitian (symmetric) and positive definite, then $\Sigma$ can be decomposed as $\Sigma = PP^*$, where $P$ is a lower triangular matrix with strictly positive diagonal entries and $P^*$ denotes the (conjugate) transpose of $P$. This is the Cholesky decomposition (chol in Matlab).

(e) The order of the variables in the VAR matters for the interpretation of the impulse responses; ideally it is determined by economic theory.
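A minimal sketch of the Cholesky orthogonalization (the covariance matrix below is illustrative): factor $\Sigma = PP'$ with $P$ lower triangular, so that $\eta_t = P^{-1}\varepsilon_t$ has identity covariance.

```python
import numpy as np

# Illustrative innovation covariance matrix.
Sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])
P = np.linalg.cholesky(Sigma)       # lower triangular, positive diagonal
P_inv = np.linalg.inv(P)
cov_eta = P_inv @ Sigma @ P_inv.T   # covariance of orthogonalized shocks
print(np.round(cov_eta, 10))        # identity matrix (up to rounding)
```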

2. Example: recursiveness assumption
(a) The Fed's policy rule

$$R_t = f(\Omega_t) + e_{Rt}$$

where $f$ is a linear function, $\Omega_t$ is the set of variables that the Fed looks at, and $e_{Rt}$ is the time-$t$ policy shock.


(b) What does this rule represent?
• Literal interpretation: the structural policy rule of the central bank
• A combination of the structural rule and other "stuff" (see Clarida-Gertler)

$$\text{True policy rule:} \quad R_t = \alpha E[X_{t+1}|F_t] + e_{Rt} = f(z_t) + e_{Rt}$$

where $z_t$ is all the time-$t$ data that generate the information set $F_t$ in $E(\cdot|F_t)$.

(c) What is a monetary policy shock?
• Shocks to the preferences of the monetary authority
• Strategic considerations can lead to exogenous variation in policy (self-fulfilling expectation traps in Albanesi, Chari and Christiano)
• Technical factors like measurement error (Bernanke and Mihov)

(d) Problem: not enough assumptions to identify $e_{Rt}$
• Assume:
— policy shocks $e_{Rt}$ are orthogonal to $\Omega_t$
— $\Omega_t$ contains current prices, wages, aggregate quantities, and lagged variables
• Economic content of this assumption:
— The Fed sees prices and output when it makes its choice of $R_t$
— Prices and output don't respond at time $t$ to $e_{Rt}$


• The response of other variables can be obtained by regressing them on current and lagged $e_{Rt}$
• In the VAR

$$A(L)X_t = \varepsilon_t, \qquad A(L) = I - A_1 L - A_2 L^2 - \ldots - A_p L^p, \qquad \varepsilon_t = C\eta_t, \quad CC' = \Sigma$$

To think about the recursiveness assumption, it is convenient to work with

$$A_0 = C^{-1} \ \ (\text{so } A_0^{-1}A_0^{-1\prime} = \Sigma), \qquad \tilde{A}(L) = A_0 A(L)$$

The recursiveness assumption is then represented as

$$X_t = \begin{bmatrix} X_{1t} \\ R_t \\ X_{2t} \end{bmatrix} \quad \text{and} \quad A_0 = \begin{bmatrix} A_{11} & 0 & 0 \\ \vec{a}_{21} & a_{22} & 0 \\ A_{31} & \vec{a}_{32} & A_{33} \end{bmatrix} \tag{**}$$

where $R_t$ is the interest rate (the middle equation is the policy rule), $X_{1t}$ collects the $k_1$ variables whose current and lagged values do appear in the policy rule, and $X_{2t}$ collects the $k_2$ variables whose current values do not appear in the policy rule.


(e) The zero restrictions on $A_0$ are implied by the recursiveness assumption:
— Zeros in the middle row: current values of $X_{2t}$ do not appear in the policy rule
— Zeros in the first block of rows ensure that the monetary policy shock does not affect $X_{1t}$: the first block of zeros prevents a direct effect via $R_t$; the second block of zeros prevents an indirect effect via $X_{2t}$

(f) There are many $A_0$ which satisfy the zero restrictions and

$$A_0^{-1}A_0^{-1\prime} = \Sigma \tag{*}$$

• One normalization: lower triangular $A_0$ with positive diagonal elements
• $A_0^{-1}$ is then the lower triangular Cholesky factor of $\Sigma$

(g) Proposition:
• All $A_0$ matrices that satisfy (*) and the zero restrictions imply the same value for the column of $A_0^{-1}$ which corresponds to $e_{Rt}$, so we can work with the lower triangular Cholesky factor of $\Sigma$ without loss of generality
• Suppose we change the ordering of the variables within $X_{1t}$ and within $X_{2t}$, but always pick the lower triangular Cholesky factor of $\Sigma$; then the dynamic impulse response to $e_{Rt}$ is unaffected

3. Blanchard-Quah Orthogonalization (Long-Run Identification):


(a) Restricts the long-run response of one variable to the other shock to be zero, i.e. restricts $B(1)$ to be lower triangular:

$$X_t = B(L)\eta_t, \quad E(\eta_t\eta_t') = I, \qquad \sum_{j=0}^{\infty}\frac{\partial X_{t+j}}{\partial \eta_t} = B(1)$$

(b) Why do we care? For a system specified in changes,

$$\Delta X_t = B(L)\eta_t, \qquad \lim_{j\to\infty}\frac{\partial X_{t+j}}{\partial \eta_t} = \sum_{j=0}^{\infty}B_j = B(1)$$

$B(1)$ gives the (limiting) long-run response of the level of $X_t$ to $\eta$ shocks.

(c) E.g., in a DSGE model, the technology shock is the only shock that has a long-run impact on the level of labor productivity; in a long-run risk model, only the "permanent shock" has a long-run impact on the level of consumption and dividends; "demand shocks" have no long-run effect on GNP.
• There are two types of technology shocks, neutral and capital-embodied:

$$Y_t = Z_{1t}F(K_t, L_t), \qquad K_{t+1} = (1-\delta)K_t + Z_{2t}I_t$$


• These are the only shocks that can affect the log level of labor productivity
• The only shock which also has a long-run effect on the relative price of capital is a capital-embodied technology shock ($Z_{2t}$)
• These identification strategies require that the variables in the VAR be covariance stationary
• Advantage of this approach:
— No need to make all the usual assumptions required to construct Solow-residual-based measures of technology shocks, such as a functional-form assumption for the production function, or corrections for labor hoarding, capital utilization and time-varying markups
• Disadvantage: some models don't satisfy the identification assumption
— Endogenous growth models where all shocks affect productivity in the long run
— Standard models when there are permanent shocks to the tax rate on capital income
• Reference: Francis, Owyang and Theodorou (2003)

(d) Implementation: suppose you estimate the system by OLS and get $\hat{A}$ and $\hat{\Sigma}$:

$$X_t = A_1 X_{t-1} + \ldots + A_p X_{t-p} + \varepsilon_t$$


Let $\eta_t = C^{-1}\varepsilon_t$, such that $E(\eta_t\eta_t') = I$ and

$$X_t = A_1 X_{t-1} + \ldots + A_p X_{t-p} + C\eta_t$$

Define $B(L) = A(L)^{-1} = (I - A_1 L - A_2 L^2 - \ldots - A_p L^p)^{-1}$; then

$$\sum_{j=0}^{\infty}\frac{\partial X_{t+j}}{\partial \eta_t} = B(1)C = A(1)^{-1}C$$

$C$ should satisfy the following restrictions:
• (exclusion restriction) $B(1)C$ is lower triangular
• $CC' = \Sigma$
• (sign restriction) the (1,1) element of $B(1)C$ is positive

Solution: get the Cholesky decomposition $B(1)\Sigma B(1)' = PP'$ and let $C = B(1)^{-1}P$.

(e) In particular, for a VAR(1),

$$X_t = AX_{t-1} + C\eta_t, \qquad \sum_{j=0}^{\infty}\frac{\partial X_{t+j}}{\partial \eta_t} = (I - A)^{-1}C$$
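The long-run identification recipe for the VAR(1) case can be sketched numerically as follows; the coefficient matrix $A$ and covariance $\Sigma$ below are illustrative:

```python
import numpy as np

# Blanchard-Quah identification for a VAR(1): choose C with CC' = Sigma and
# B(1)C = (I - A)^{-1} C lower triangular, via C = B(1)^{-1} P, PP' = B(1) Sigma B(1)'.
A = np.array([[0.4, 0.1],
              [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
B1 = np.linalg.inv(np.eye(2) - A)        # B(1) = A(1)^{-1} = (I - A)^{-1}
P = np.linalg.cholesky(B1 @ Sigma @ B1.T)
C = np.linalg.inv(B1) @ P                # C = B(1)^{-1} P
print(np.round(C @ C.T - Sigma, 10))     # zero matrix: CC' = Sigma holds
print(np.round(B1 @ C, 4))               # lower triangular long-run response
```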


6 Variance Decomposition

How much of the $k$-step-ahead forecast error variance is due to shocks to a specified variable?

$$X_t = C(L)\eta_t, \qquad E(\eta_t\eta_t') = I$$

$$\operatorname{var}_t(X_{t+k}) = C_0C_0' + C_1C_1' + \ldots + C_{k-1}C_{k-1}'$$

Decompose $C_jC_j'$ as $\sum_{\tau=1}^{n} C_j I_\tau C_j'$, where $I_\tau$ has a one in the $(\tau,\tau)$ position and zeros elsewhere; then we have

$$\operatorname{var}_t(X_{t+k}) = \sum_{\tau=1}^{n}\left(\sum_{j=0}^{k-1} C_j I_\tau C_j'\right) = \sum_{\tau=1}^{n}\nu_{k,\tau}$$

Letting $k \to \infty$,

$$\operatorname{var}(X_t) = \sum_{\tau=1}^{n}\nu_\tau = \sum_{\tau=1}^{n}\left(\sum_{j=0}^{\infty} C_j I_\tau C_j'\right)$$

• VAR(1) representation


$$Y_t = AY_{t-1} + C\eta_t, \quad E(\eta_t\eta_t') = I, \qquad \frac{\partial Y_{t+k}}{\partial \eta_t} = A^k C$$

$$\operatorname{var}_t(Y_{t+k}) = \sum_{j=0}^{k-1} A^j CC'(A^j)' = \sum_{\tau=1}^{n}\nu_{k,\tau}, \qquad \nu_{k,\tau} = \sum_{j=0}^{k-1} A^j C I_\tau C'(A^j)'$$

$$\operatorname{var}(Y_t) = \sum_{j=0}^{\infty} A^j CC'(A^j)' = \sum_{\tau=1}^{n}\nu_\tau,$$

$$\nu_\tau = \sum_{j=0}^{\infty} A^j C I_\tau C'(A^j)' = CI_\tau C' + A\left(\sum_{j=1}^{\infty} A^{j-1} C I_\tau C'(A^{j-1})'\right)A' = CI_\tau C' + A\nu_\tau A'$$

Alternatively, we can compute $\nu_{k,\tau}$ recursively:

$$\nu_{k+1,\tau} = CI_\tau C' + A\nu_{k,\tau}A', \ \text{for } k \geq 1, \qquad \nu_{1,\tau} = CI_\tau C'$$
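The recursion above can be sketched numerically; $A$ and $C$ below are illustrative, and the shares are the fraction of each variable's forecast error variance attributed to each shock:

```python
import numpy as np

# FEVD for Y_t = A Y_{t-1} + C eta_t via nu_{k+1,tau} = C I_tau C' + A nu_{k,tau} A'.
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])
C = np.array([[1.0, 0.0],
              [0.3, 0.8]])
n, K = 2, 25
nu = [np.zeros((n, n)) for _ in range(n)]   # nu[tau]: shock tau's contribution
for _ in range(K):
    for tau in range(n):
        I_tau = np.zeros((n, n))
        I_tau[tau, tau] = 1.0
        nu[tau] = C @ I_tau @ C.T + A @ nu[tau] @ A.T
total = sum(nu)                             # = var_t(Y_{t+K})
shares = np.array([np.diag(v) / np.diag(total) for v in nu])
print(np.round(shares, 3))  # shares[tau, i]: share of var(Y_i) due to shock tau
```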


7 Standard Error for Impulse Response Functions

Standard errors can be obtained:

- analytically, from the distribution of the AR parameters
- by Monte Carlo, for Gaussian residuals
- by bootstrap, for small samples and non-Gaussian residuals

7.1 Confidence Intervals and the Bootstrap

• Estimation produces:

$$X_t = \hat{A}(L)X_{t-1} + \hat{\varepsilon}_t, \qquad \hat{\varepsilon}_t, \ t = 1, 2, \ldots, T$$

• Bootstrap


1. Generate $r = 1, \ldots, R$ artificial data sets, each of length $T$.
— For the $r$th dataset, draw

$$\lambda^r_t \in \text{Uniform}[0,1], \quad t = 1, \ldots, T$$

— Convert to integers in $\{1, 2, \ldots, T\}$:

$$\lambda^r_t \leftarrow \text{integer}(\lambda^r_t \times T), \quad t = 1, \ldots, T$$

— Draw shocks $\hat{\varepsilon}_{\lambda^r_1}, \ldots, \hat{\varepsilon}_{\lambda^r_T}$
— Generate artificial data:

$$X^r_t = \hat{A}(L)X^r_{t-1} + \hat{\varepsilon}_{\lambda^r_t}, \quad t = 1, \ldots, T$$

2. Suppose the statistic of interest is $\phi$ (could be a vector of impulse response functions, serial correlation coefficients, etc.):

$$\phi^r = f(X^r_1, \ldots, X^r_T), \quad r = 1, 2, \ldots, R$$


— Compute, with $\hat{\phi}$ the point estimate from the actual data,

$$\sigma_\phi = \left[\frac{1}{R}\sum_{r=1}^{R}(\phi^r - \hat{\phi})^2\right]^{1/2}$$

— Report

$$\hat{\phi} \pm 2 \times \sigma_\phi$$
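A minimal sketch of the scheme above for the simplest case, an AR(1) coefficient; the data-generating parameters ($0.6$, $T = 200$, $R = 500$) are illustrative:

```python
import numpy as np

# Bootstrap standard error for an AR(1) coefficient.
rng = np.random.default_rng(1)
T = 200
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

def ar1_ols(x):
    """OLS slope of x_t on x_{t-1} (no intercept, for brevity)."""
    return (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])

a_hat = ar1_ols(y)
resid = y[1:] - a_hat * y[:-1]

R = 500
a_boot = np.empty(R)
for r in range(R):
    shocks = rng.choice(resid, size=T - 1, replace=True)  # resample residuals
    yr = np.zeros(T)
    for t in range(1, T):
        yr[t] = a_hat * yr[t - 1] + shocks[t - 1]         # artificial data
    a_boot[r] = ar1_ols(yr)

se = a_boot.std()
print(a_hat - 2 * se, a_hat + 2 * se)   # reported interval: a_hat +/- 2 se
```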

7.2 VAR Diagnostics

• Whether or not to take first differences is important; examples: hours and productivity, consumption and dividends
• Choosing the VAR lag length:


— Construct $s(p)$:

$$\text{Akaike}: \ s(p) = \log(\det\hat{\Sigma}_p) + (m + m^2 p)\frac{2}{T}$$

$$\text{Hannan-Quinn}: \ s(p) = \log(\det\hat{\Sigma}_p) + (m + m^2 p)\frac{2\log(\log(T))}{T}$$

$$\text{Schwarz}: \ s(p) = \log(\det\hat{\Sigma}_p) + (m + m^2 p)\frac{\log(T)}{T}$$

where $T$ is the sample size, $m$ is the number of variables, and $p$ is the number of lags.
— Choose the optimal $p$:

$$\hat{p} = \arg\min_p s(p)$$

— With $T = 170$:

$$\frac{2}{T} = 0.0118; \qquad \frac{2\log(\log(T))}{T} = 0.0192; \qquad \frac{\log(T)}{T} = 0.0302$$

— Akaike penalizes $p$ the least
∗ Hannan-Quinn and Schwarz (or Bayesian information criterion, BIC) are consistent
∗ In population, Akaike has positive probability of overshooting the true $p$
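The three criteria differ only in the penalty term; a minimal sketch of lag selection on simulated data ($T = 170$ as in the numerical comparison above; the VAR(1) parameters are illustrative):

```python
import numpy as np

# Lag-length selection: s(p) = log det(Sigma_hat_p) + (m + m^2 p) * penalty.
rng = np.random.default_rng(2)
T, m = 170, 2
A1 = np.array([[0.5, 0.2],
               [0.0, 0.4]])
Y = np.zeros((T, m))
for t in range(1, T):
    Y[t] = A1 @ Y[t - 1] + rng.standard_normal(m)

def sigma_hat(p):
    """Residual covariance of a VAR(p) fit by OLS with an intercept."""
    X = np.column_stack([np.ones(T - p)] +
                        [Y[p - j - 1:T - j - 1] for j in range(p)])
    B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)
    E = Y[p:] - X @ B
    return E.T @ E / (T - p)

penalties = {"Akaike": 2 / T,
             "Hannan-Quinn": 2 * np.log(np.log(T)) / T,
             "Schwarz": np.log(T) / T}
for name, pen in penalties.items():
    s = [np.log(np.linalg.det(sigma_hat(p))) + (m + m * m * p) * pen
         for p in range(1, 6)]
    print(name, 1 + int(np.argmin(s)))   # chosen lag length per criterion
```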


— These information criteria can be used to compare estimated models only when the numerical values of the dependent variable are identical for all estimates being compared. The models being compared need not be nested, unlike the case when models are compared using an F or likelihood ratio test.


8 Granger Causality

1. Basic idea
• A forecasting relation
• Different from a genuine "cause precedes effect" relation

2. Definition: $w_t$ Granger causes $y_t$ if $w_t$ helps to forecast $y_t$ given past $y_t$, i.e., for $s > 0$,

$$\text{MSE}[\hat{E}(y_{t+s}|y_t, y_{t-1}, \ldots)] > \text{MSE}[\hat{E}(y_{t+s}|y_t, y_{t-1}, \ldots, w_t, w_{t-1}, \ldots)]$$

• Autoregressive representation:

$$y_t = a(L)y_{t-1} + b(L)w_{t-1} + \delta_t$$
$$w_t = c(L)y_{t-1} + d(L)w_{t-1} + \nu_t$$

$w_t$ does not Granger cause $y_t$ iff $b(L) = 0$; or, writing the system as

$$A(L)\begin{bmatrix} y_t \\ w_t \end{bmatrix} = \begin{bmatrix} \delta_t \\ \nu_t \end{bmatrix}, \qquad A(L) = \begin{bmatrix} I - La(L) & -Lb(L) \\ -Lc(L) & I - Ld(L) \end{bmatrix} \equiv \begin{bmatrix} a^*(L) & b^*(L) \\ c^*(L) & d^*(L) \end{bmatrix}$$


$w_t$ does not Granger cause $y_t$ iff $b^*(L) = 0$.

• MA representation:

$$\begin{bmatrix} y_t \\ w_t \end{bmatrix} = A(L)^{-1}\begin{bmatrix} \delta_t \\ \nu_t \end{bmatrix} = \frac{1}{a^*(L)d^*(L) - b^*(L)c^*(L)}\begin{bmatrix} d^*(L) & -b^*(L) \\ -c^*(L) & a^*(L) \end{bmatrix}\begin{bmatrix} \delta_t \\ \nu_t \end{bmatrix} \equiv \begin{bmatrix} \bar{a}(L) & \bar{b}(L) \\ \bar{c}(L) & \bar{d}(L) \end{bmatrix}\begin{bmatrix} \delta_t \\ \nu_t \end{bmatrix}$$

$w_t$ does not Granger cause $y_t$ iff the Wold moving-average matrix lag polynomial is lower triangular.
$w_t$ does not Granger cause $y_t$ iff $y$'s bivariate Wold representation is the same as its univariate Wold representation.

• Univariate representation. Consider the pair of univariate Wold representations

$$y_t = e(L)\xi_t, \qquad \xi_t = y_t - \hat{E}(y_t|y_{t-1}, y_{t-2}, \ldots)$$
$$w_t = f(L)\mu_t, \qquad \mu_t = w_t - \hat{E}(w_t|w_{t-1}, w_{t-2}, \ldots)$$


$w_t$ does not Granger cause $y_t$ iff $E(\mu_t\xi_{t+j}) = 0$ for all $j > 0$, i.e. the univariate innovations of $w_t$ are uncorrelated with future univariate innovations of $y_t$.
Proof: under the null,

$$y_t = \bar{a}(L)\delta_t = e(L)\xi_t, \qquad w_t = \bar{c}(L)\delta_t + \bar{d}(L)\nu_t = f(L)\mu_t$$

$\Rightarrow$ $\mu_t$, as a combination of current and past $\delta$ and $\nu$, is uncorrelated with $\delta_{t+j}$, hence uncorrelated with $\xi_{t+j}$, for all $j > 0$.

Note: $E(\mu_t\xi_{t+j}) = 0$ $\Rightarrow$ past $\mu$'s do not help to forecast $\xi$ $\Rightarrow$ $\mu$'s do not help to forecast $y_t$ $\Rightarrow$ $w_t = f(L)\mu_t$ does not help to forecast $y_t$.
If $w_t$ does not Granger cause $y_t$, then the response of $y$ to $w$ shocks is zero.
• Effect on projections: $w$ does not Granger cause $y$ iff $E(w_t|\text{all } y) = E(w_t|\text{current and past (no future) } y)$.
Proof: under the null,

$$w_t = c(L)a(L)^{-1}y_t + d(L)\nu_t$$

3. Test of Granger causality: an F test of

$$H_0: b_1 = b_2 = \ldots = b_p = 0$$


Run an unconstrained regression and save

$$RSS_1 = \sum_{t=1}^{T}\hat{u}_t^2$$

Run a constrained regression, which is a univariate AR($p$) for $y$, and save

$$RSS_0 = \sum_{t=1}^{T}\hat{e}_t^2$$

Let

$$F = \frac{(RSS_0 - RSS_1)/p}{RSS_1/(T - 2p - 1)} \sim F(p, T - 2p - 1)$$

which is asymptotically equivalent to

$$S_2 = \frac{T(RSS_0 - RSS_1)}{RSS_1} \sim \chi^2(p)$$
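A minimal sketch of the F test with $p = 1$, on simulated data in which $w$ genuinely helps forecast $y$ (the data-generating coefficients are illustrative):

```python
import numpy as np

# F-test of H0: b_1 = ... = b_p = 0 (w does not Granger cause y), p = 1.
rng = np.random.default_rng(3)
T = 300
w = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    w[t] = 0.5 * w[t - 1] + rng.standard_normal()
    y[t] = 0.4 * y[t - 1] + 0.5 * w[t - 1] + rng.standard_normal()

p = 1
X1 = np.column_stack([np.ones(T - 1), y[:-1], w[:-1]])  # unconstrained
X0 = np.column_stack([np.ones(T - 1), y[:-1]])          # constrained: AR(1) in y
b1, *_ = np.linalg.lstsq(X1, y[1:], rcond=None)
b0, *_ = np.linalg.lstsq(X0, y[1:], rcond=None)
RSS1 = np.sum((y[1:] - X1 @ b1) ** 2)
RSS0 = np.sum((y[1:] - X0 @ b0) ** 2)
F = ((RSS0 - RSS1) / p) / (RSS1 / (T - 2 * p - 1))
print(F)   # large F => reject "w does not Granger cause y"
```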

4. Interpreting Granger-causality tests
• It is not necessary that one variable of a pair Granger cause the other and vice versa; example: money growth and GNP
- Question: the fed funds rate and the stock market?


• Warning: Granger causality is not causality!
5. Granger causality in a multivariate context
• Estimation:

$$y_{1t} = c_1 + A_1'x_{1t} + A_2'x_{2t} + \varepsilon_{1t}$$
$$y_{2t} = c_2 + B_1'x_{1t} + B_2'x_{2t} + \varepsilon_{2t}$$
$$H_0: A_2 = 0$$

which is equivalent to estimating

$$y_{1t} = c_1 + A_1'x_{1t} + A_2'x_{2t} + \varepsilon_{1t}$$
$$y_{2t} = d + D_0'y_{1t} + D_1'x_{1t} + D_2'x_{2t} + \nu_{2t}$$
$$H_0: A_2 = 0$$

Proof:

$$f(y_t|x_t; \theta) = f(y_{1t}|x_t; \theta)\,f(y_{2t}|y_{1t}, x_t; \theta)$$
$$\operatorname{var}(y_{2t}|y_{1t}, x_t) = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$$
$$E(y_{2t}|y_{1t}, x_t) = E(y_{2t}|x_t) + \Sigma_{21}\Sigma_{11}^{-1}[y_{1t} - E(y_{1t}|x_t)]$$


• Test: likelihood ratio test,

$$2[\mathcal{L}(\hat{\theta}) - \mathcal{L}(\hat{\theta}^{(0)})] = T(\log|\hat{\Sigma}_{11}(0)| - \log|\hat{\Sigma}_{11}|) \sim \chi^2(\text{number of restrictions})$$

• If $A_2 = 0$ and $B_1 = 0$, does it mean that there is no relation between $y_{1t}$ and $y_{2t}$ at all? Not necessarily:
- contemporaneous linear dependence may be present;
- see Geweke's test of linear dependence and decomposition of linear dependence.


9 Kalman Filter

• An algorithm for sequentially updating a linear projection for the system
• Deduces the restrictions that the models of the economy and of data collection impose on the "innovation representation" of the dynamics of the variables of interest
• Using the Kalman filter we can
— calculate exact finite-sample forecasts
— compute the exact likelihood function for a Gaussian ARMA process
— factorize the spectral density*
— estimate VARs with coefficients that change over time*


9.1 State-Space Representation

$$\text{state equation:} \quad \xi_{t+1} = F\xi_t + Cv_{t+1}$$
$$\text{observation equation:} \quad y_t = A'x_t + H'\xi_t + w_t$$

$\xi_t$: vector of state variables
$y_t$: vector of observed variables
$x_t$: vector of exogenous or predetermined variables, which provide no information about $\xi_{t+s}$ or $v_{t+s}$
$v_t$: vector of white-noise or martingale-difference shocks
$w_t$: vector of white-noise or martingale-difference measurement errors

$$E(v_t v_\tau') = \begin{cases} I & \text{for } t = \tau \\ 0 & \text{otherwise} \end{cases}, \qquad E(w_t w_\tau') = \begin{cases} R & \text{for } t = \tau \\ 0 & \text{otherwise} \end{cases}$$

$$E(v_t w_\tau') = 0 \ \text{for all } t \text{ and } \tau, \qquad CC' = Q$$


We need assumptions about $\xi_1$:

$$E(v_t\xi_1') = 0, \qquad E(w_t\xi_1') = 0, \qquad \text{for } t = 1, 2, \ldots, T.$$

In addition, assume that $\xi_1$ is a random vector with known mean and covariance matrix:

$$E(\xi_1) = \hat{\xi}_1, \qquad E[(\xi_1 - E(\xi_1))(\xi_1 - E(\xi_1))'] = \Sigma_1$$

Examples of state-space representations

AR($p$):

$$(y_t - \mu) = \phi_1(y_{t-1} - \mu) + \phi_2(y_{t-2} - \mu) + \ldots + \phi_p(y_{t-p} - \mu) + \varepsilon_t$$

$$\xi_t = \begin{bmatrix} y_t - \mu \\ y_{t-1} - \mu \\ \vdots \\ y_{t-p+1} - \mu \end{bmatrix}, \quad F = \begin{bmatrix} \phi_1 & \phi_2 & \ldots & \phi_{p-1} & \phi_p \\ 1 & 0 & \ldots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \ldots & 1 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} \sigma \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad v_{t+1} = \varepsilon_{t+1}/\sigma$$

$$y_t = y_t, \quad A' = \mu, \quad x_t = 1, \quad H' = \begin{bmatrix} 1 & 0 & \ldots & 0 \end{bmatrix}, \quad w_t = 0, \quad R = 0$$
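A minimal check of the companion form for an AR(2) with $\mu = 0$ (the parameter values are illustrative): simulating the state equation and reading off $y_t = H'\xi_t$ must reproduce the direct AR(2) recursion driven by the same shocks.

```python
import numpy as np

# Companion-form state-space for an AR(2): xi_t = (y_t, y_{t-1})'.
phi1, phi2, sigma = 0.5, 0.3, 1.0
F = np.array([[phi1, phi2],
              [1.0,  0.0]])
C = np.array([sigma, 0.0])
H = np.array([1.0, 0.0])

rng = np.random.default_rng(4)
T = 50
eps = rng.standard_normal(T)

xi = np.zeros(2)
y_ss = np.empty(T)
for t in range(T):
    xi = F @ xi + C * (eps[t] / sigma)   # state equation, v_t = eps_t / sigma
    y_ss[t] = H @ xi                     # observation equation (w_t = 0)

y = np.zeros(T)
for t in range(T):
    y[t] = (phi1 * (y[t - 1] if t >= 1 else 0.0)
            + phi2 * (y[t - 2] if t >= 2 else 0.0) + eps[t])

print(np.allclose(y_ss, y))   # True
```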


MA(1)

yt = µ+ εt + θεt−1

ξt =

[εtεt−1

], F =

[0 01 0

], C =

[σ0

],

vt+1 = εt+1/σ

yt = yt, A′ = µ, xt = 1, H ′ =

[1 θ

], wt = 0, R = 0,

or

ξt =

[εt + θε−1

θεt

], F =

[0 10 0

], C =

[σσθ

],

vt+1 = εt+1/σ

yt = yt, A′ = µ, xt = 1, H ′ =

[1 0

], wt = 0, R = 0,
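The AR(p) companion form above can be assembled mechanically. A minimal sketch, assuming NumPy; the function name `ar_state_space` is ours, not the lecture's:

```python
import numpy as np

def ar_state_space(phi, sigma):
    """Companion-form state-space matrices for the AR(p) example:
    xi_t = (y_t - mu, ..., y_{t-p+1} - mu)', C = (sigma, 0, ..., 0)', H' = (1, 0, ..., 0)."""
    p = len(phi)
    F = np.zeros((p, p))
    F[0, :] = phi                    # first row carries phi_1, ..., phi_p
    if p > 1:
        F[1:, :-1] = np.eye(p - 1)   # identity block shifts the lags down
    C = np.zeros((p, 1)); C[0, 0] = sigma
    H = np.zeros((p, 1)); H[0, 0] = 1.0
    return F, C, H

F, C, H = ar_state_space([0.5, 0.3], sigma=2.0)
# stationarity check: all eigenvalues of F inside the unit circle
assert np.all(np.abs(np.linalg.eigvals(F)) < 1)
```

The same pattern extends to either MA(1) representation by swapping in the F, C, H′ given above.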

• Applications in finance and macroeconomics
— Real interest rate (Fama and Gibbons 1982), business cycle (Stock and Watson 1991), market expectation of inflation (Hamilton 1985), capital stock (Li 2005), etc.

— Estimation of a model specified at a finer time interval than pertains to the available data

Kalman Filter Algorithm

Observed data: {y_t}_{t=1}^T, {x_t}_{t=1}^T. We want to construct linear least squares forecasts of ξ_t and y_t based on the data observed through date t:

ξ̂_{t+1|t} = Ê(ξ_{t+1} | Y_t)

where Y_t = (y_t′, ..., y_1′, x_t′, ..., x_1′)′. The MSE of the forecast is

P_{t+1|t} = E[(ξ_{t+1} − ξ̂_{t+1|t})(ξ_{t+1} − ξ̂_{t+1|t})′]

• Idea: we accomplish this by constructing an innovation process {ỹ_t} such that [ỹ_t, ..., ỹ_1, E(ξ_1)] forms an orthogonal basis for the information set [y_t, ..., y_1, E(ξ_1)], and then recursively calculating the projection of ξ_{t+1} on [ỹ_t, ..., ỹ_1, E(ξ_1)]. The orthogonal basis is constructed using the Gram–Schmidt process.

Let ỹ_1 be the residual from a regression of y_1 on E(ξ_1) = ξ̂_{1|0}:

ỹ_1 = y_1 − A′x_1 − H′ξ̂_{1|0}

We can check that E[ỹ_1] = 0, and that [ỹ_1, ξ̂_{1|0}] and [y_1, ξ̂_{1|0}] span the same linear space. Note that here we use the exogeneity of x_t:

E[ξ_t | x_t, Y_{t−1}] = E[ξ_t | Y_{t−1}]

Next, form ỹ_2 as the residual from a regression of y_2 on [ỹ_1, ξ̂_{1|0}]:

ỹ_2 = y_2 − Ê(y_2 | ỹ_1, ξ̂_{1|0})

Then E[ỹ_2] = 0, E[ỹ_2 ỹ_1′] = 0 and E[ỹ_2 ξ̂_{1|0}′] = 0; [ỹ_2, ỹ_1, ξ̂_{1|0}] and [y_2, y_1, ξ̂_{1|0}] span the same linear space. Continuing in this way, form

ỹ_t = y_t − Ê(y_t | ỹ_{t−1}, ..., ỹ_1, ξ̂_{1|0})

{ỹ_t} is the innovation representation of {y_t}, and [ỹ_t, ..., ỹ_1, E(ξ_1)] forms an orthogonal basis for the information set [y_t, ..., y_1, E(ξ_1)].


9.2 Kalman Filter Algorithm

• Step 0 (starting point): if E(ξ_1) and Σ_1 = E[(ξ_1 − E(ξ_1))(ξ_1 − E(ξ_1))′] are known, set

ξ̂_{1|0} = E(ξ_1),  P_{1|0} = Σ_1

Otherwise, if the eigenvalues of F are all inside the unit circle, then ξ_t is weakly stationary, and we can solve for E(ξ_1) and Σ_1 directly:

E(ξ_{t+1}) = F E(ξ_t)  ⟹  E(ξ_t) = 0

Σ = FΣF′ + Q  ⟹  vec(Σ) = [I − F⊗F]^{−1} vec(Q)

⟹  ξ̂_{1|0} = 0,  vec(P_{1|0}) = [I − F⊗F]^{−1} vec(Q)


• Step 1: construct ỹ_t from ξ̂_{t|t−1} and P_{t|t−1}

E(y_t | x_t, ξ_t) = A′x_t + H′ξ_t

ŷ_{t|t−1} ≡ Ê(y_t | x_t, Y_{t−1}) = A′x_t + H′Ê(ξ_t | x_t, Y_{t−1}) = A′x_t + H′ξ̂_{t|t−1}

ỹ_t = y_t − ŷ_{t|t−1} = H′(ξ_t − ξ̂_{t|t−1}) + w_t

with MSE

E[ỹ_t ỹ_t′] = H′P_{t|t−1}H + R

• Step 2: update the inference about ξ_t

ξ̂_{t|t} = Ê(ξ_t | x_t, y_t, Y_{t−1}) = Ê(ξ_t | Y_t) = ξ̂_{t|t−1} + Γ_t ỹ_t

Γ_t = E[(ξ_t − ξ̂_{t|t−1}) ỹ_t′] E(ỹ_t ỹ_t′)^{−1} = P_{t|t−1}H(H′P_{t|t−1}H + R)^{−1}

where

E[(ξ_t − ξ̂_{t|t−1}) ỹ_t′] = E[(ξ_t − ξ̂_{t|t−1})(H′(ξ_t − ξ̂_{t|t−1}) + w_t)′] = P_{t|t−1}H


• Step 3: forecast ξ_{t+1} based on Y_t

ξ̂_{t+1|t} = F ξ̂_{t|t} = F ξ̂_{t|t−1} + F(ξ̂_{t|t} − ξ̂_{t|t−1})

where

ξ̂_{t|t} − ξ̂_{t|t−1} = Γ_t ỹ_t = Γ_t(y_t − A′x_t − H′ξ̂_{t|t−1})

⟹  ξ̂_{t+1|t} = F ξ̂_{t|t−1} + K_t ỹ_t

where K_t is the "Kalman gain matrix":

K_t = FΓ_t = F P_{t|t−1} H (H′P_{t|t−1}H + R)^{−1}

Note that

ξ̂_{t+1|t} = F^t E(ξ_1) + Σ_{j=1}^{t} F^{t−j} K_j ỹ_j

• Step 4: update P_{t+1|t} = E[(ξ_{t+1} − ξ̂_{t+1|t})(ξ_{t+1} − ξ̂_{t+1|t})′]

ξ_{t+1} − ξ̂_{t+1|t} = (F − K_tH′)(ξ_t − ξ̂_{t|t−1}) + C v_{t+1} − K_t w_t

Note ỹ_t = H′(ξ_t − ξ̂_{t|t−1}) + w_t. Hence

P_{t+1|t} = (F − K_tH′)P_{t|t−1}(F − K_tH′)′ + CC′ + K_tRK_t′
          = F P_{t|t−1} F′ − K_t H′ P_{t|t−1} F′ + Q
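Steps 1 through 4 can be collected into a single recursion. A minimal sketch, assuming NumPy; the A′x_t term is dropped for brevity and the function name is ours:

```python
import numpy as np

def kalman_filter(y, F, H, Q, R, xi0, P0):
    """Steps 1-4 in code (A'x_t term omitted).
    Returns filtered states xi_{t|t} and the final P_{t+1|t}."""
    xi_pred, P_pred = xi0.astype(float), P0.astype(float)
    xi_filt = []
    for yt in y:
        # Step 1: innovation ytilde_t and its MSE S_t = H'P_{t|t-1}H + R
        innov = yt - H.T @ xi_pred
        S = H.T @ P_pred @ H + R
        # Step 2: update with Gamma_t = P_{t|t-1} H S_t^{-1}
        Gamma = P_pred @ H @ np.linalg.inv(S)
        xi_filt.append(xi_pred + Gamma @ innov)
        # Step 3: forecast with Kalman gain K_t = F Gamma_t
        K = F @ Gamma
        xi_pred = F @ xi_filt[-1]
        # Step 4: Riccati update P_{t+1|t} = F P F' - K H'P F' + Q
        P_pred = F @ P_pred @ F.T - K @ H.T @ P_pred @ F.T + Q
    return np.array(xi_filt), P_pred
```

With R = 0 and scalar H = 1 the observation reveals the state, so the filtered state equals the data exactly, which makes a convenient sanity check.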


9.3 Innovation Representation

ξ̂_{t+1|t} = F ξ̂_{t|t−1} + K_t ỹ_t
y_t = A′x_t + H′ξ̂_{t|t−1} + ỹ_t
E[ỹ_t ỹ_t′] = H′P_{t|t−1}H + R

is a time-varying innovation representation of the original state-space representation, starting from the initial conditions ξ̂_{1|0} and P_{1|0}.

Equivalently,

ỹ_t = y_t − A′x_t − H′ξ̂_{t|t−1}
ξ̂_{t+1|t} = F ξ̂_{t|t−1} + K_t ỹ_t

recursively filters out a record of innovations {ỹ_t}_{t=1}^T from ξ̂_{1|0} and {y_t}_{t=1}^T. This is called a "whitening filter": it transforms the serially correlated process {y_t} into a serially uncorrelated (i.e., "white") process {ỹ_t}_{t=1}^T.

{ỹ_t}_{t=1}^T is called a fundamental white noise for the {y_t}_{t=1}^T process.
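The two-line whitening recursion above can be sketched directly. A minimal illustration, assuming NumPy; the A′x_t term is omitted and the function name is ours:

```python
import numpy as np

def whitening_filter(y, F, H, Q, R, xi0, P0):
    """Recursively compute ytilde_t = y_t - H' xi_{t|t-1} and update
    xi_{t+1|t} = F xi_{t|t-1} + K_t ytilde_t  (A'x_t term omitted)."""
    xi, P = xi0.astype(float), P0.astype(float)
    out = []
    for yt in y:
        innov = yt - H.T @ xi             # ytilde_t
        out.append(innov)
        S = H.T @ P @ H + R               # E[ytilde ytilde'] = H'P_{t|t-1}H + R
        K = F @ P @ H @ np.linalg.inv(S)  # Kalman gain K_t
        xi = F @ xi + K @ innov
        P = F @ P @ F.T - K @ H.T @ P @ F.T + Q
    return np.array(out)
```

In the scalar case with H = 1 and R = 0, the gain is K = F, so the filter reduces to ỹ_t = y_t − F y_{t−1} after the first observation.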


9.4 Convergence Results

• If F has all eigenvalues inside the unit circle, and Q and R are positive semidefinite symmetric matrices (at least one of them strictly positive definite), then {P_{t+1|t}} is a monotonically nonincreasing sequence and converges as t → ∞ to a unique steady-state matrix P satisfying

P = F[P − PH(H′PH + R)^{−1}H′P]F′ + Q

The steady-state value of the Kalman gain matrix,

K = FPH(H′PH + R)^{−1}

has the property that the eigenvalues of (F − KH′) all lie on or inside the unit circle.

• Use the Kalman filter to find the Wold decomposition:

ξ̂_{t+1|t} = F ξ̂_{t|t−1} + K(y_t − H′ξ̂_{t|t−1}) = (F − KH′)ξ̂_{t|t−1} + K y_t


⟹  ξ̂_{t+1|t} = [I − (F − KH′)L]^{−1} K y_t

ŷ_{t+1|t} = H′ξ̂_{t+1|t} = H′[I − (F − KH′)L]^{−1} K y_t

ỹ_{t+1} = y_{t+1} − ŷ_{t+1|t} = {I − H′[I − (F − KH′)L]^{−1} K L} y_{t+1}

⟹  y_{t+1} = {I − H′[I − (F − KH′)L]^{−1} K L}^{−1} ỹ_{t+1}
           = {I + H′[I − FL]^{−1} K L} ỹ_{t+1}


9.5 Serially Correlated Measurement Errors

If the measurement error w_t is serially correlated:

state equation: ξ_{t+1} = Fξ_t + Cv_{t+1}
observation equation: y_t = H′ξ_t + w_t
w_t = Dw_{t−1} + η_t

E(η_t η_t′) = R
E[v_t η_s′] = 0 for all t and s

The idea is to transform y_t into y*_t such that the corresponding measurement error is serially uncorrelated. Define

y*_t = y_{t+1} − Dy_t = H′ξ_{t+1} + w_{t+1} − DH′ξ_t − Dw_t = (H′F − DH′)ξ_t + H′Cv_{t+1} + η_{t+1}


Thus {ξ_t, y*_t} is governed by the state-space system

state equation: ξ_{t+1} = Fξ_t + Cv_{t+1}
observation equation: y*_t = H*′ξ_t + w*_t

where H*′ = H′F − DH′, and w*_t = H′Cv_{t+1} + η_{t+1} is the new "measurement noise", which is contemporaneously correlated with v_{t+1} but not serially correlated:

E(Cv_{t+1} w*_t′) = CC′H = QH
E(w*_t w*_t′) = H′QH + R

We can do the same thing to find the innovation representation {u_t} of {y*_t}:

ξ̂_{t+1|t} = F ξ̂_{t|t−1} + K_t u_t
y*_t = H*′ξ̂_{t|t−1} + u_t

The only thing we need to change is

E(u_t u_t′) = H*′P_{t|t−1}H* + (H′QH + R)


P_{t+1|t} = (F − K_tH*′)P_{t|t−1}(F − K_tH*′)′ + Q + K_t(H′QH + R)K_t′ − QHK_t′ − K_tH′Q

For the original process {y_t}, an alternative state-space representation is obtained by combining the innovation representation of y*_t with y_{t+1} = Dy_t + y*_t. Hence

state equation:
[ ξ̂_{t+1|t} ]   [ F    0 ] [ ξ̂_{t|t−1} ]   [ K_t ]
[ y_{t+1}   ] = [ H*′  D ] [ y_t        ] + [ I   ] u_t

observation equation:
y_t = [0  I] (ξ̂_{t|t−1}′, y_t′)′ + [0] u_t


9.6 MLE Estimation of the Parameters

• Let θ be the vector of parameters in F, H, A, Q, R. The estimates are obtained by maximizing the log-likelihood of θ given {y_t}_{t=1}^T:

f(y_T, y_{T−1}, ..., y_1; θ) = f_T(y_T | y_{T−1}, ..., y_1) f_{T−1}(y_{T−1} | y_{T−2}, ..., y_1) ··· f_1(y_1)

y_1 ∼ N(H′ξ̂_{1|0}, H′P_{1|0}H + R)
y_t | y_{t−1}, ..., y_1 ∼ N(H′ξ̂_{t|t−1}, H′P_{t|t−1}H + R)

On the other hand,

ỹ_t ∼ N(0, H′P_{t|t−1}H + R)

So if ξ_t is stationary with E(ξ_1) = 0, then f_1(y_1) = g_1(ỹ_1), and since

y_t = H′ξ̂_{t|t−1} + ỹ_t

we have

f_t(y_t | y_{t−1}, ..., y_1) = g_t(ỹ_t)

where g_t denotes the density of ỹ_t.


Hence the log-likelihood of y^T is

Σ_{t=1}^T log g_t(ỹ_t) = −(1/2) Σ_{t=1}^T [ n log(2π) + log|H′P_{t|t−1}H + R| + ỹ_t′(H′P_{t|t−1}H + R)^{−1} ỹ_t ]
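This log-likelihood accumulates naturally inside the filter recursion. A sketch under the same simplifications as before (no A′x_t term, NumPy assumed, names ours):

```python
import numpy as np

def kalman_loglik(y, F, H, Q, R, xi0, P0):
    """Gaussian log-likelihood: sum over t of
    -0.5 [n log 2pi + log|H'P_{t|t-1}H + R| + ytilde' S^{-1} ytilde]."""
    xi, P = xi0.astype(float), P0.astype(float)
    n = y.shape[1]
    ll = 0.0
    for yt in y:
        S = H.T @ P @ H + R                      # H'P_{t|t-1}H + R
        innov = yt - H.T @ xi                    # ytilde_t
        ll += -0.5 * (n * np.log(2.0 * np.pi)
                      + np.log(np.linalg.det(S))
                      + innov @ np.linalg.inv(S) @ innov)
        K = F @ P @ H @ np.linalg.inv(S)         # Kalman gain K_t
        xi = F @ xi + K @ innov
        P = F @ P @ F.T - K @ H.T @ P @ F.T + Q  # Riccati update
    return float(ll)
```

A degenerate check: with F = 0 and P_{1|0} = Q, each y_t is i.i.d. N(0, Q + R), so the result must match the sum of the corresponding normal log-densities.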

• Initialization
— Stationary process: ξ̂_{1|0} = E(ξ_1), and P_{1|0} = Σ_1
— *Nonstationary process: put a diffuse prior on ξ_1

• Identification: without restrictions on F, H, A, Q, R, the parameters of the state-space representation are unidentified. Example:

ξ_{t+1} = [ε_{1,t+1}, ε_{2,t+1}]′,  y_t = ε_{1t} + ε_{2t}

— Global identification at θ_0: for any other θ, there exists y^T such that f(y^T; θ) ≠ f(y^T; θ_0)
— Local identification at θ_0: the information matrix is nonsingular in a neighborhood of θ_0


9.7 Smoothing

We want to form inferences about the state variables based not only on historical data but on the entire available sample, that is, ξ̂_{t|T} = Ê(ξ_t | Y_T).

Step 1:

Ê(ξ_t | ξ_{t+1}, Y_t) = ξ̂_{t|t} + J_t(ξ_{t+1} − ξ̂_{t+1|t})

J_t = E[(ξ_t − ξ̂_{t|t})(ξ_{t+1} − ξ̂_{t+1|t})′] E[(ξ_{t+1} − ξ̂_{t+1|t})(ξ_{t+1} − ξ̂_{t+1|t})′]^{−1} = P_{t|t}F′P_{t+1|t}^{−1}

Step 2:

Ê(ξ_t | ξ_{t+1}, Y_T) = Ê(ξ_t | ξ_{t+1}, Y_t) = ξ̂_{t|t} + J_t(ξ_{t+1} − ξ̂_{t+1|t})

Step 3:

ξ̂_{t|T} = Ê(ξ_t | Y_T) = ξ̂_{t|t} + J_t(ξ̂_{t+1|T} − ξ̂_{t+1|t})


Step 4:

P_{t|T} = P_{t|t} + J_t(P_{t+1|T} − P_{t+1|t})J_t′

To summarize: the smoothed sequence is generated by a backward recursion, starting from ξ̂_{T|T} and P_{T|T}.
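The four steps can be sketched as a forward filter pass followed by the backward recursion. A minimal illustration, assuming NumPy; the forward pass is included so the snippet is self-contained, and the names are ours:

```python
import numpy as np

def kalman_smoother(y, F, H, Q, R, xi0, P0):
    """Forward Kalman filter, then the backward recursion
    xi_{t|T} = xi_{t|t} + J_t (xi_{t+1|T} - xi_{t+1|t}), J_t = P_{t|t} F' P_{t+1|t}^{-1},
    P_{t|T}  = P_{t|t} + J_t (P_{t+1|T} - P_{t+1|t}) J_t'."""
    T = y.shape[0]
    xi_p, P_p = [xi0.astype(float)], [P0.astype(float)]  # xi_{t|t-1}, P_{t|t-1}
    xi_f, P_f = [], []                                   # xi_{t|t}, P_{t|t}
    for t in range(T):                                   # forward pass (Section 9.2)
        S = H.T @ P_p[t] @ H + R
        G = P_p[t] @ H @ np.linalg.inv(S)                # Gamma_t
        xi_f.append(xi_p[t] + G @ (y[t] - H.T @ xi_p[t]))
        P_f.append(P_p[t] - G @ H.T @ P_p[t])
        xi_p.append(F @ xi_f[t])
        P_p.append(F @ P_f[t] @ F.T + Q)
    xi_s, P_s = [None] * T, [None] * T                   # backward pass from xi_{T|T}, P_{T|T}
    xi_s[-1], P_s[-1] = xi_f[-1], P_f[-1]
    for t in range(T - 2, -1, -1):
        J = P_f[t] @ F.T @ np.linalg.inv(P_p[t + 1])
        xi_s[t] = xi_f[t] + J @ (xi_s[t + 1] - xi_p[t + 1])
        P_s[t] = P_f[t] + J @ (P_s[t + 1] - P_p[t + 1]) @ J.T
    return np.array(xi_s), np.array(P_s)
```

With R = 0 and scalar H = 1 the filtered states already equal the data, so smoothing changes nothing and P_{t|T} = 0, which gives a simple check.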


9.8 Statistical Inference with the Kalman Filter

We have assumed that the true value of θ is used to construct ξ̂ and P, but in practice we use an estimate θ̂ instead:

E[(ξ_t − ξ̂_{t|T}(θ̂))(ξ_t − ξ̂_{t|T}(θ̂))′ | Y_T]
  = E[(ξ_t − ξ̂_{t|T}(θ_0))(ξ_t − ξ̂_{t|T}(θ_0))′ | Y_T]      (filter uncertainty)
  + E[(ξ̂_{t|T}(θ_0) − ξ̂_{t|T}(θ̂))(ξ̂_{t|T}(θ_0) − ξ̂_{t|T}(θ̂))′ | Y_T]      (parameter uncertainty)

To measure these two parts of uncertainty we can use Monte Carlo simulation:

θ | Y_T ∼ N(θ̂, (1/T) Î^{−1})

Take M draws θ^{(j)} from this distribution, and calculate

(1/M) Σ_{j=1}^M (ξ̂_{t|T}(θ^{(j)}) − ξ̂_{t|T}(θ̂))(ξ̂_{t|T}(θ^{(j)}) − ξ̂_{t|T}(θ̂))′


to estimate the "parameter uncertainty", and use

(1/M) Σ_{j=1}^M P_{t|T}(θ^{(j)})

to estimate the "filter uncertainty". The sum of the two estimates the MSE of ξ̂_{t|T}(θ̂) around the true value of ξ_t.


9.9 Applications

• Aggregation over time
— Averaging over time:

ξ_{t+1} = Fξ_t + v_{t+1}, t = 0, 1, 2, ...
y_t = H′ξ_t

Expand the state space by including enough lags, X_t = [ξ_t, ξ_{t−1}, ..., ξ_{t−p}]′:

X_{t+1} = FX_t + Cw_{t+1}
y_t = GX_t

— Skip sampling: the data are sampled every τ > 0 periods,

ξ_{t+τ} = F_τ ξ_t + v^τ_{t+τ}, t = 0, τ, 2τ, ...
y_t = H′ξ_t

where

F_τ = F^τ
v^τ_{t+τ} = F^{τ−1}Cw_{t+1} + F^{τ−2}Cw_{t+2} + ... + Cw_{t+τ}
E[v^τ_{t+τ} v^{τ′}_{t+τ}] = CC′ + FCC′F′ + ... + F^{τ−1}CC′(F^{τ−1})′


This is represented in a state-space system as

ξ_{s+1} = F_τ ξ_s + v^τ_{s+1}, s = 0, 1, 2, ...
y_s = H′ξ_s
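The skip-sampled matrices F_τ and E[v^τ v^τ′] can be computed directly from F and C. A sketch, assuming NumPy; the function name is ours:

```python
import numpy as np

def skip_sample_system(F, C, tau):
    """Skip-sampled transition matrix F_tau = F^tau and shock covariance
    E[v^tau v^tau'] = sum_{j=0}^{tau-1} F^j CC' (F^j)'."""
    Q = C @ C.T
    F_tau = np.linalg.matrix_power(F, tau)
    Q_tau = sum(np.linalg.matrix_power(F, j) @ Q @ np.linalg.matrix_power(F, j).T
                for j in range(tau))
    return F_tau, Q_tau
```

For scalar F = 0.9, C = 1, τ = 2, this gives F_τ = 0.81 and CC′ + FCC′F′ = 1 + 0.81 = 1.81.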

• Estimating the dynamics of unobserved variables in the economy: real interest rate (Fama and Gibbons 1982), business cycle (Stock and Watson 1991), market expectation of inflation (Hamilton 1985), intangible capital stock (Li 2005), etc.
