67

Slides 2014 Panel Data

Embed Size (px)

DESCRIPTION

Panel data class in Gèneva.

Citation preview

Page 1: Slides 2014 Panel Data

An Introduction to Panel Data

Econometrics

Joan Llull

Advanced Time Series and Panel DataBarcelona GSE

Spring Term 2014

joan.llull [at] movebarcelona [dot] eu

Page 2: Slides 2014 Panel Data

Outline of the course

1. Introduction to Panel Data

2. Static Models

2.1 The xed eects model. Within groups estimation.

2.2 The random eects model. Error components.

2.3 Testing for correlated individual eects.

3. Dynamic Models

3.1 Autoregressive models with individual eects.

3.2 Dierenced GMM estimation.

3.3 System GMM estimation.

An Introduction to Panel Data Econometrics. Spring Term 2013 2

Page 3: Slides 2014 Panel Data

Introduction to Panel Data

An Introduction to Panel Data Econometrics. Spring Term 2013 3

Page 4: Slides 2014 Panel Data

Panel data

Panel data consists of repeated observations of a given cross-

section of individuals over time. Examples.

Individuals can be persons, households, rms, countries,...

It is dierent from repeated cross-sections.

Main advantages of panel data:

Unobserved heterogeneity

Dynamic responses and error components

An Introduction to Panel Data Econometrics. Spring Term 2013 4

Page 5: Slides 2014 Panel Data

Micro and macro panel data

Micro panel data usually has large N , small T (e.g. household

surveys).

Macro panel data usually have longer T , but smaller N (e.g.

countries).

In this course, we will review asymptotic results only for the case

of xed T , N→∞ (micro panels).

Hence, the results that we are after are closer to cross-section

approaches than to time series.

An Introduction to Panel Data Econometrics. Spring Term 2013 5

Page 6: Slides 2014 Panel Data

Employment equations for U.K. rms

We will use the same example all over the course.

Consider the following equation for rm i demand of employ-

ment in year t:

nit = β0 + β1wit + β2kit + ηi + vit,

where

nit is log employment

wit is log wage

kit is log capital

ηi + vit is unobserved

Available in Gretl: data from Arellano and Bond (1991).

An Introduction to Panel Data Econometrics. Spring Term 2013 6

Page 7: Slides 2014 Panel Data

General notation

We will consider the following model:

yit = x′itβ + (ηi + vit)

where xit and β are vectors:

xit =

x1it...

xKit

and β =

β1...

βK

yit and xit are observed, and ηi and vit are unobserved

An Introduction to Panel Data Econometrics. Spring Term 2013 7

Page 8: Slides 2014 Panel Data

Matrix notation

We can write the model piling observations for each individual:

yi = Xiβ + (ηi + vi)

where yi, β, and vi are vectors, and Xi is a matrix:

yi =

yi1...yiT

, Xi =

x1i1 . . . xKi1...

. . ....

x1iT . . . xKiT

,β =

β1...

βK

, and vi =

vi1...viT

,

and ηi = ηiιT , where ιT is a size T vector of ones.

An Introduction to Panel Data Econometrics. Spring Term 2013 8

Page 9: Slides 2014 Panel Data

Matrix notation (cont'd)

And, on top we can pile over all observations:

y = Xβ + (η + v)

where yi, β, η, and vi are vectors, and Xi is a matrix:

y =

y11...

y1T...

yN1...

yNT

=

y1...

yN

, X =

X1...

XN

,η =

η1...

ηN

, and v =

v1...

vN

.

An Introduction to Panel Data Econometrics. Spring Term 2013 9

Page 10: Slides 2014 Panel Data

Pooled OLS

Dene: u ≡ η + v ⇔ uit ≡ ηi + vit.

Then, we have y = Xβ + u.

A simple approach would be to do standard OLS:

βOLS = (X ′X)−1X ′y.

We assume that E[xitvit] = 0 (xit is predetermined).

Therefore, the properties of βOLS depend on E[xitηi]

An Introduction to Panel Data Econometrics. Spring Term 2013 10

Page 11: Slides 2014 Panel Data

Pooled OLS in Employment equations

In our previous example, we would redene our regression as:

nj = β0 + β1wj + β2kj + uj ,

where I used subindex j to emphasize that each observation it isconsidered as one independent observation.

In Gretl we would do:

ols n const w k --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 11

Page 12: Slides 2014 Panel Data

Properties of Pooled OLS

The properties of βOLS depend on E[xitηi].

If E[xitηi] = 0 ⇒E[xitvit]=0

E[xjuj ] = 0 (random eects):

βOLS is consistent as N→∞, or T →∞, or both.

it is ecient only if σ2η = 0.

cross-section produce (asymptotically) the same results.

If E[xitηi] 6= 0⇒ E[xjuj ] 6= 0 (xed eects):

βOLS is inconsistent as N→∞, or T →∞, or both.

cross-section results are inconsistent as well (but panel helpsin constructing a consistent estimator).

Panel data is particularly very useful in the second case.

An Introduction to Panel Data Econometrics. Spring Term 2013 12

Page 13: Slides 2014 Panel Data

Employment equations for U.K. rms

In our example of employment in U.K. rms:

Most productive rms may pay higher wages and yet hire

more workers.

Larger plants can use larger amounts of capital and yet hire

more workers.

...

An Introduction to Panel Data Econometrics. Spring Term 2013 13

Page 14: Slides 2014 Panel Data

Static Models

An Introduction to Panel Data Econometrics. Spring Term 2013 14

Page 15: Slides 2014 Panel Data

General assumptions for static models

Our model is given by:

yit = x′itβ + ηi + vit

with

E[xitηi] 6= 0 (xed e.) or E[xitηi] = 0 (random e.).

E[xitvis] = 0 ∀ s, t (strict exogeneity): rules out eects ofpast vis on current xit (and thus lagged dependent variables).

E[ηi] = E[vit] = E[ηivit] = 0 (error components).

E[vitvis] = 0 ∀ s, t (serially uncorrelated shocks).

ηi ∼ iid(0, σ2η), vit ∼ iid(0, σ2

v) (iid+homoskedasticity).

An Introduction to Panel Data Econometrics. Spring Term 2013 15

Page 16: Slides 2014 Panel Data

The xed eects model.

Within groups estimation.

An Introduction to Panel Data Econometrics. Spring Term 2013 16

Page 17: Slides 2014 Panel Data

Within groups transformation

Consider the following transformation of the model:

yit = yit − yi yi =1

T

T∑t=1

yit.

Notice that:

ηi =1

T

T∑t=1

ηi =1

TTηi = ηi.

Therefore:

yit = (xit − xi)′β + (ηi − ηi) + (vit − vi) = x′itβ + vit.

An Introduction to Panel Data Econometrics. Spring Term 2013 17

Page 18: Slides 2014 Panel Data

Within groups estimator

Given the previous assumptions,

E[xitvit] = 0.

Therefore, OLS on the transformed model,

βWG =(X ′X

)−1X ′y,

provide a consistent estimator regardless of whether E[xitηi] 6= 0or E[xitηi] = 0

This consistency is an advantage of βWG compared to cross-

section OLS, pooled OLS or GLS (to be seen below).

An Introduction to Panel Data Econometrics. Spring Term 2013 18

Page 19: Slides 2014 Panel Data

Disadvantages of within groups estimator

This advantage comes at a cost:

Not ecient:

When N→∞ but T is xed, less ecient that e.g. βGLS ifE[xitηi] = 0.

If E[xitηi] 6= 0, it is only ecient when all regressors arecorrelated with ηi.

It does not allow to identify coecients for (even almost)

time-invariant regressors:

yit = x′itβ +w′iγ + ηi + vit

An Introduction to Panel Data Econometrics. Spring Term 2013 19

Page 20: Slides 2014 Panel Data

The role of strict exogeneity

In the case where N→∞ and T is xed, consistency depends on

strict exogeneity.

To see it, recall that

xit = xit −1

T(xi1 + ...+ xiT ) and vit = vit −

1

T(vi1 + ...+ viT ).

Therefore E[xitvit] = 0 requires E[xitvis] = 0 ∀ s, t unless T →∞.

This has motivated the development of dynamic panel data

models, to relax this assumption.

An Introduction to Panel Data Econometrics. Spring Term 2013 20

Page 21: Slides 2014 Panel Data

Within Groups in Employment equations

In our example, Gretl results from the pooled estimation were:

ols n const w k --robust

If we now estimate βWG, we get:

panel n const w k --fixed-effects --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 21

Page 22: Slides 2014 Panel Data

Least Squares Dummy Variables (LSDV)

The Within Groups estimator can also be obtained by including

a set of N individual dummy variables:

yit = x′itβ + η1D1i + ...+ ηNDNi + vit,

where Dhi = 1h = i (e.g. D1i takes the value of 1 for the

observations on individual 1 and 0 for all other observations).

OLS estimation of this model gives numerically equivalent es-

timates to WG (that's why βWG is a.k.a. βLSDV ).

This gives intuition on why WG is not very ecient if there is

only limited time-series variation (degrees of freedom are NT −K −N = N(T − 1)−K)

An Introduction to Panel Data Econometrics. Spring Term 2013 22

Page 23: Slides 2014 Panel Data

LSDV in Employment equations

In Gretl, we can generate individual (rm) dummies and estimate

by OLS, to check that it delivers the same results:

discrete unit

dummify unit

ols n w k Dunit ∗ --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 23

Page 24: Slides 2014 Panel Data

First-dierenced OLS (FDLS)

Another transformation we can consider is rst dierences:

∆yit = ∆x′itβ + ∆vit, for i = 1, ..., N ; t = 2, ..., T

where ∆yit = yit − yit−1.

It is an alternative way to get rid of time-invariant individual

eects (∆ηi = ηi − ηi = 0), so OLS on the dierenced model is

consistent.

An Introduction to Panel Data Econometrics. Spring Term 2013 24

Page 25: Slides 2014 Panel Data

Some properties of FDLS

Consistency requires E[∆xit∆vit] = 0 which is (implied by but)

weaker than strict exogeneity.

WG is more ecient than FDLS under classical assumptions.

FDLS is more ecient if vit is random walk (i.e. ∆vit = εit ∼iid(0, σ2

ε)).

FDLS (and not WG) is consistent if vi1, ..., vit−2, vit+1, ..., viTare correlated with xit, as long as vit and vit−1 are uncorrelated

with xit.

An Introduction to Panel Data Econometrics. Spring Term 2013 25

Page 26: Slides 2014 Panel Data

FDLS in Employment equations

In Gretl, we can generate st dierences and estimate by OLS,

to get FDLS:

diff n w k

ols d n d w d k --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 26

Page 27: Slides 2014 Panel Data

The random eects model.

Error components.

An Introduction to Panel Data Econometrics. Spring Term 2013 27

Page 28: Slides 2014 Panel Data

Uncorrelated eects

Now we go back to the assumption of uncorrelated or random

eects: E[xitηi] = 0.

In this case, pooled (and cross-section) OLS estimates are con-

sistent, but not ecient.

The ineciency is provided by the serial correlation induced

by ηi:

E[uituis] = E[(ηi + vit)(ηi + vis)] = E[η2i ] = σ2

η

The variance of the unobservables (under classical assumptions) is:

E[u2it] = E[η2

i ] + E[v2it] = σ2

η + σ2v

An Introduction to Panel Data Econometrics. Spring Term 2013 28

Page 29: Slides 2014 Panel Data

Error structure

Therefore, the variance-covariance matrix of the unobservables is

E[uiu′i] =

σ2η + σ2

v σ2η . . . σ2

η

σ2η σ2

η + σ2v . . . σ2

η...

.... . .

...

σ2η σ2

η . . . σ2η + σ2

v

= Ωi,

whose dimensions are T × T , and E[uiu′h] = 0 ∀ i 6= h, or

E[uu′] =

Ω1 0 . . . 00 Ω2 . . . 0...

.... . .

...

0 0 . . . ΩN

= Ω,

which is block-diagonal with dimension NT ×NT .

An Introduction to Panel Data Econometrics. Spring Term 2013 29

Page 30: Slides 2014 Panel Data

Generalized Least Squares (GLS)

Under the classical assumptions, GLS (or random eects) estima-

tor is consistent and ecient if E[xitηi] = 0:

βGLS =(X ′Ω−1X

)−1X ′Ω−1y.

If E[xitηi] 6= 0 GLS is inconsistent as N→∞ and T is xed.

This estimator is unfeasible because we do not know σ2η and σ

2v .

An Introduction to Panel Data Econometrics. Spring Term 2013 30

Page 31: Slides 2014 Panel Data

Theta-dierencing

βGLS is equivalent to OLS on the following transformation of

the model:

y∗it = x∗it′β + u∗it,

where

y∗it = yit − (1− θ)yi,

and

θ2 =σ2v

σ2v + Tσ2

η

.

Two special cases:

If σ2η = 0 (i.e. no individual eect), OLS is ecient.

If T →∞, then θ→ 0, and y∗it→ yit = yit − yi: WG is ecient.

An Introduction to Panel Data Econometrics. Spring Term 2013 31

Page 32: Slides 2014 Panel Data

Between Groups (BG)

A consistent but inecient way to estimate β if E[xitηi] = 0 is

to do OLS on the following cross-section:

yi = x′iβ + ηi + vi, i = 1, ..., N

Between groups is not ecient, and, in practice, it is only used

to estimate σ2η when implementing feasible GLS.

An Introduction to Panel Data Econometrics. Spring Term 2013 32

Page 33: Slides 2014 Panel Data

Feasible GLS (FGLS)

βGLS is unfeasible because we do not know σ2η and σ

2v .

A consistent estimator of σ2v is provided by the WG residuals:

ˆvit = yit − x′itβWG

σ2v =

ˆv′ ˆv

N(T − 1)−K

Then, a consistent estimator of σ2η is given by the BG residuals:

ˆui = yi − x′iβBG

σ2u =

(σ2η +

1

Tσ2v) =

ˆu′ ˆu

N −K

σ2η = σ2

u −1

Tσ2v

An Introduction to Panel Data Econometrics. Spring Term 2013 33

Page 34: Slides 2014 Panel Data

Feasible GLS in Employment equations

In our previous example, Gretl results from the WG estimation

were:

panel n const w k --fixed-effects --robust

If we now estimate βFGLS , we get:

panel n const w k --random-effects --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 34

Page 35: Slides 2014 Panel Data

Testing for correlated individual eects.

An Introduction to Panel Data Econometrics. Spring Term 2013 35

Page 36: Slides 2014 Panel Data

Testing for correlated individual eects

With xed T , it is useful to test whether some of the included

explanatory variables are correlated with the unobserved eects.

βWG is consistent regardless of E[xitηi] being equal to zero or

not.

βFGLS is consistent only if E[xitηi] = 0.

⇒ we can test whether both estimates are similar!

An Introduction to Panel Data Econometrics. Spring Term 2013 36

Page 37: Slides 2014 Panel Data

Hausman test

The Hausman test does exactly this comparison:

h = q′[avar(q)]−1qa∼ χ2(K)

under the null hypothesis E[xitηi] = 0, where

q = βWG − βFGLS ,

and

avar(q) = avar(βWG

)− avar

(βFGLS

).

The test require classical assumptions (under which the FGLS

is more ecient that WG).

An Introduction to Panel Data Econometrics. Spring Term 2013 37

Page 38: Slides 2014 Panel Data

Hausman test in Employment equations

Gretl output for random eects includes the Hausman test:

panel n const w k --random-effects --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 38

Page 39: Slides 2014 Panel Data

Dynamic Models

An Introduction to Panel Data Econometrics. Spring Term 2013 39

Page 40: Slides 2014 Panel Data

Autoregressive models with individual

eects.

An Introduction to Panel Data Econometrics. Spring Term 2013 40

Page 41: Slides 2014 Panel Data

Autorregressive panel data model

We will consider the following model:

yit = αyit−1 + ηi + vit |α| < 1.

Other regressors can be included, but main results unaected.

We assume that we observe yi0, and, hence, this model is dened

for i = 1, ..., N and t = 1, ..., T .

Two important properties:

E[yit−1ηi] = σ2η

(1−αt−1

1−α

)> 0 since yit−1 includes ηi.

E[yit−1vit−1] = σ2v > 0 for the same reason.

Hence, yit−1 is correlated with the individual eects, and it

is not strictly exogenous.An Introduction to Panel Data Econometrics. Spring Term 2013 41

Page 42: Slides 2014 Panel Data

Properties of pooled OLS and WG estimators

Even assuming E[yit−1vit] = 0, still pooled OLS delivers

plimN→∞

αOLS > α,

because E[yit−1ηi] = σ2η

(1−αt−1

1−α

)> 0.

Doing the within groups transformation we see that

plimN→∞

αWG < α

because E[yit−1vit] = −Aσ2v < 0.

(A =

(1−α)(1+T

(1−αt−1−αT−1−t

))+αT

(1−αT−1

)T2(1−α)2

)

WG bias vanishes as T →∞ (bias not small even with T = 15).

Supposedly consistent estimators that give α >> αOLS or α << αWG should

be seen with suspicion.

An Introduction to Panel Data Econometrics. Spring Term 2013 42

Page 43: Slides 2014 Panel Data

Instrumental variables

Consider the model in rst dierences:

∆yit = α∆yit−1 + ∆vit.

OLS on the rst dierences is inconsistent:

E[∆yit−1∆vit] = −σ2v < 0.

However, if E[yit−1vit] = 0 (serially uncorrelated shocks), then

yit−2 or ∆yit−2 are valid instruments for ∆yit−1:

Relevance: E[∆yit−2∆yit−1] = E[yit−2∆yit−1] = E[−y2it−2] 6= 0.

Orthogonality: E[∆yit−2∆vit] = E[yit−2∆vit] = 0.

An Introduction to Panel Data Econometrics. Spring Term 2013 43

Page 44: Slides 2014 Panel Data

Anderson and Hsiao

The 2SLS estimators of this type were suggested by Anderson

and Hsiao (1981):

αAH =(

∆y′−1∆y−1

)−1∆y′−1∆y,

where

∆y−1 = Z(Z ′Z

)−1Z ′∆y−1,

where Z can be y−2 or ∆y−2.

A minimum of three periods (T = 2 plus yi0) is needed to

implement it.

This approach is only ecient if T = 2 (otherwise, additional

instruments are available).

An Introduction to Panel Data Econometrics. Spring Term 2013 44

Page 45: Slides 2014 Panel Data

Dierenced GMM estimation.

An Introduction to Panel Data Econometrics. Spring Term 2013 45

Page 46: Slides 2014 Panel Data

GMM in 3 slides (I): the setup

GMM nds parameter estimates that come as close as possible

to satisfy orthogonality conditions (or moment conditions) in

the sample.

E.g., consider K regressors β and L instruments zi:

ui = yi − f(xi,β) zi = g(xi).

The model species L moment conditions: E[ziui] = 0.

Sample analogue:

bN (β) =1

N

N∑i=1

ziui(β)

An Introduction to Panel Data Econometrics. Spring Term 2013 46

Page 47: Slides 2014 Panel Data

GMM in 3 slides (II): the estimation

Two possible cases cases:

L = K (just identied): bN (βGMM ) = 0.

L > K (overidentied): βGMM = arg minβ bN (β)′WNbN (β).

WN is a positive denite weighting matrix.

Optimal GMM (ecient) uses(

1N

∑Ni=1 E[ziuiu

′iz′i])−1

, the in-

verse of the variance-covariance matrix, as weighting matrix.

An Introduction to Panel Data Econometrics. Spring Term 2013 47

Page 48: Slides 2014 Panel Data

GMM in 3 slides (III): asymptotic properties

βGMM is a consistent estimator of β.

It is asymptotically normal, with the following variance:

avar(βGMM ) = (D′WD)−1D′WSWWD(D′WD)−1

where

D = plimN→∞

∂bN (β)

∂β′

W = plimN→∞

WN

SW =1

N

N∑i=1

E[ziuiu′iz′i]

An Introduction to Panel Data Econometrics. Spring Term 2013 48

Page 49: Slides 2014 Panel Data

Back to the AR(1) panel data: assumptions

Our model is given by:

yit = αyit−1 + ηi + vit, i = 1, ..., N ; t = 1, ..., T

with

E[ηi] = E[vit] = E[ηivit] = 0 (error components).

E[vitvis] = 0 ∀ s, t (serially uncorrelated shocks).

E[yi0vit] = 0 ∀ t = 1, ..., T (predeterm. initial cond.).

An Introduction to Panel Data Econometrics. Spring Term 2013 49

Page 50: Slides 2014 Panel Data

Moment conditionsGiven previous assumptions, several moment conditions:

Equation Instruments Orthogonality cond.

∆yi2 = α∆yi1 + ∆vi2 yi0 E[∆vi2yi0] = 0

∆yi3 = α∆yi2 + ∆vi3 yi0, yi1 E[∆vi3

(yi0yi1

)]= 0

∆yi4 = α∆yi3 + ∆vi4 yi0, yi1, yi2 E

∆vi4

yi0yi1yi2

= 0

......

...

∆yiT = α∆yiT−1 + ∆viT yi0, yi1, yi2, ..., yiT−2 E

∆viT

yi0yi1yi2...

yiT−2

= 0

We end up with (T − 1)T/2 moment conditions.An Introduction to Panel Data Econometrics. Spring Term 2013 50

Page 51: Slides 2014 Panel Data

Moment conditions in matrix form

We can write these moment conditions as E[Z′i∆vi] = 0, where

Zi =

yi0 0 0 0 . . . 0 0 . . . 00 yi0 yi1 0 . . . 0 0 . . . 0...

......

.... . .

...... . . .

...0 0 0 0 . . . yi0 yi1 . . . yiT−2

and ∆vi =

∆vi2∆vi3...

∆viT

,

and the sample analogue is

bN (α) =1

N

N∑i=1

Z′i∆vi(α)

Hence, the GMM estimator (proposed by Arellano and Bond, 1991) is

αGMM = arg minα

(1

N

N∑i=1

∆v′i(α)Zi

)WN

(1

N

N∑i=1

Z′i∆vi(α)

)=

= (∆y′−1ZWNZ′∆y−1)−1∆y′−1ZWNZ

′∆y

An Introduction to Panel Data Econometrics. Spring Term 2013 51

Page 52: Slides 2014 Panel Data

Optimal weighting matrix

The optimal weighting matrix (ecient GMM) is:

WN =

(1

N

N∑i=1

E[Z′i∆viv′iZi]

)−1

The sample analogue is obtained in a two-step procedure:

WN =

(1

N

N∑i=1

[Z′i∆vi(α)∆v′i(α)Zi]

)−1

Windeijer (2005) proposes a nite sample correction of the variance thataccounts for α being estimated.

The most common one-step (and rst-step) matrix uses the structure ofE[∆vi∆v

′i]:

E[∆vi∆v′i] = σ2

v

2 −1 0 0 . . . 0−1 2 −1 0 . . . 0...

......

. . ....

0 0 0 0 . . . 2

An Introduction to Panel Data Econometrics. Spring Term 2013 52

Page 53: Slides 2014 Panel Data

GMM on Employment equationsBy default, Gretl estimates a one-step GMM:

dpanel 1; n

However, we can also do the two-step GMM:

dpanel 1; n --two-step --asymptotic

When doing two-step, we want to do small sample correction:

dpanel 1; n --two-step

An Introduction to Panel Data Econometrics. Spring Term 2013 53

Page 54: Slides 2014 Panel Data

GMM on Employment equations (cont'd)The command is exible enough so that we can estimate Anderson-Hsiao:

dpanel 1; n; GMM(n,2,2)

Let us also compute OLS:

lags 1; n

ols n n 1 --robust

and Within Groups:

ols n n 1 Dunit ∗ --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 54

Page 55: Slides 2014 Panel Data

Potential limitations of Arellano-Bond

Weak instruments:

When α→ 1, relevance of the instrument decreases.

Instruments are still valid, but have poor small sample properties.

Monte Carlo evidence shows that with α > 0.8, estimator behavespoorly unless huge samples available.

There are alternatives in the literature.

Overtting:

Too many instruments if T relative to N is relatively large.

We might want to restrict the number of instruments used.

It is always good practice to check robustness to dierent combina-tions of instruments.

An Introduction to Panel Data Econometrics. Spring Term 2013 55

Page 56: Slides 2014 Panel Data

Dynamic linear model

Once we include regressors, the model is

yit = αyit−1 + x′itβ + ηi + vit |α| < 1.

We maintain the previous assumptions: error components, se-

rially uncorrelated shocks, and predetermined initial conditions.

Therefore, moment conditions of the kind

E[yit−s∆vit] = 0 t = 2, ..., T, s ≥ 2

are still valid.

An Introduction to Panel Data Econometrics. Spring Term 2013 56

Page 57: Slides 2014 Panel Data

Assumptions on regressors

Dierent assumptions regarding xit will enable dierent addi-

tional orthogonality conditions:

xit can be correlated or uncorrelated with ηi.

xit can endogenous, predetermined, or strictly exogenous

with respect to vit.

For instance, if assumptions are analogous to those for yit−1, we

may use xit−1 (and longer lags) as instruments:

Zi =

yi0 x′i0 x′i1 . . . 0 . . . 0 0′ . . . 0′

......

.... . .

... . . ....

......

...0 0′ 0′ . . . yi0 . . . yiT−2 x′i0 . . . x′iT−1

.

An Introduction to Panel Data Econometrics. Spring Term 2013 57

Page 58: Slides 2014 Panel Data

Employment equations with regressorsGretl command is easily extended to include regressors:

dpanel 1; n w k

And we can do OLS:

ols n n 1 w k --robust

and Within Groups in order to compare:

ols n n 1 w k Dunit ∗ --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 58

Page 59: Slides 2014 Panel Data

Specication tests

There are several relevant aspects for the validity of the estima-

tion that can be tested formally.

Orthogonality conditions: are they small enough to not

reject that they are zero (overidentifying restrictions)

Validity of a subset of more restrictive assumptions (dier-

ence Sargan test, Hausman test)

Serial correlation in the data: vital for the validity of

the instruments (Arellano-Bond test).

An Introduction to Panel Data Econometrics. Spring Term 2013 59

Page 60: Slides 2014 Panel Data

Sargan/Hansen overidentifying restrictions test

The null hypothesis is whether the orthogonality conditions are satised(i.e. moments are equal to zero).

The test can only be implemented if the problem is overidentied (other-wise the sample moments are exactly zero by construction).

The test is:

S = NJN (β) = N

1

N

N∑i=1

ˆu′iZi

(1

N

N∑i=1

Z′iuiu′iZi

)−1

1

N

N∑i=1

Z′i ˆui

,

where u are predicted residuals from the rst step and ˆu are those predictedfrom the second stage,and

Sa∼ χ2(L−K).

An Introduction to Panel Data Econometrics. Spring Term 2013 60

Page 61: Slides 2014 Panel Data

Testing stronger assumptions

Some of the assumptions that we make are stronger than others.

If the problem is overidentied, we can test whether results change

if we include or exclude the orthogonality conditions generated

by them.

If they are true, increase eciency, but if not, inconsistent!

Two ways of testing it:

Overidentifying restrictions (dierence in Sargan should

be close to zero: ∆S = S − SA a∼ χ2(L− LA))

Dierences in coecients (Hausman test: q = δAGMM −

δGMM )

An Introduction to Panel Data Econometrics. Spring Term 2013 61

Page 62: Slides 2014 Panel Data

Direct test for serial correlation

The test was proposed by Arellano-Bond (1991).

Tests for the presence of second order autocorrelation in the

rst-dierenced residuals.

If dierences in residuals are second-order correlated, some in-

struments would not be valid!

The test is:

m2 =∆v−2

′∆v∗

se

a∼ N (0, 1),

where ∆v−2 is the second lagged residual in dierences, and ∆v∗is the part of the vector of contemporaneous rst dierences for

the periods that overlap with the second lagged vector.

Values close to zero do not reject the hypothesis of no serial

correlation.

An Introduction to Panel Data Econometrics. Spring Term 2013 62

Page 63: Slides 2014 Panel Data

System GMM estimation.

An Introduction to Panel Data Econometrics. Spring Term 2013 63

Page 64: Slides 2014 Panel Data

Additional orthogonality conditions

Without loss of generality, we will go back to the autoregressive

model:

yit = αyit−1 + ηi + vit |α| < 1

Recall our (T − 1)T/2 moment conditions:

E[yit−s∆vit] = 0 t = 2, ..., T ; s ≥ 2

System GMM (Arellano and Bover, 1995) uses additional con-

ditions:

E[∆yisηi] = 0,

or, alternatively,

E[∆yisuiT ] = 0, uiT = ηi + uiT

An Introduction to Panel Data Econometrics. Spring Term 2013 64

Page 65: Slides 2014 Panel Data

The System GMM estimator

Analogously to the rst-dierenced GMM, the estimator will be

given by E[(Z∗)′u∗i ] = 0:

αSys−GMM =((X∗)′Z∗WN (Z∗)′X∗

)−1X∗Z∗WN (Z∗)′y∗,

where,

Z∗i =

(Zi 0 . . . 00 ∆yi1 . . . ∆yiT−1

),u∗i =

(ui

ηi + viT

), X∗i =

(∆y−1i

yiT−1

)and y∗i =

(∆yiyiT

),

This estimator is more ecient, as it uses additional moment conditions.

It reduces small sample bias, especially when α→ 1

An Introduction to Panel Data Econometrics. Spring Term 2013 65

Page 66: Slides 2014 Panel Data

System-GMM on Employment equationsRemember our Gretl for the two-step rst-dierenced GMM with smallsample correction:

dpanel 1; n --two-step

Two-step system-GMM with small sample correction delivers:

dpanel 1; n --system --two-step

And the one-step system-GMM delivers:

dpanel 1; n --system

An Introduction to Panel Data Econometrics. Spring Term 2013 66

Page 67: Slides 2014 Panel Data

System GMM in employment equations (cont'd)And with regressors, two-step System-GMM:

dpanel 1; n w k --two-step --system

Versus OLS:

ols n n 1 w k --robust

and Within Groups estimates:

ols n n 1 w k Dunit ∗ --robust

An Introduction to Panel Data Econometrics. Spring Term 2013 67