Transcript
Page 1: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 1/75]

Econometric Analysis of Panel Data

William Greene

Department of Economics

Stern School of Business

Page 2: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 2/75]

Endogeneity y = X+ε, Definition: E[ε|x]≠0 Why not?

Omitted variables Unobserved heterogeneity (equivalent to omitted

variables) Measurement error on the RHS (equivalent to omitted

variables) Structural aspects of the model Endogenous sampling and attrition Simultaneity (?)

Page 3: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 3/75]

Instrumental Variable Estimation One “problem” variable – the “last” one yit = 1x1it + 2x2it + … + KxKit + εit E[εit|xKit] ≠ 0. (0 for all others) There exists a variable zit such that

E[xKit| x1it, x2it,…, xK-1,it,zit] = g(x1it, x2it,…, xK-1,it,zit)In the presence of the other variables, zit “explains” xit

E[εit| x1it, x2it,…, xK-1,it,zit] = 0 In the presence of the other variables, zit and εit are

uncorrelated. A projection interpretation: In the projection XKt =θ1x1it,+ θ2x2it + … + θk-1xK-1,it + θK zit, θK ≠ 0.

Page 4: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 4/75]

The First IV Study: Natural Experiment(Snow, J., On the Mode of Communication of Cholera, 1855)

http://www.ph.ucla.edu/epi/snow/snowbook3.html

London Cholera epidemic, ca 1853-4 Cholera = f(Water Purity,u)+ε.

‘Causal’ effect of water purity on cholera? Purity=f(cholera prone environment (poor,

garbage in streets, rodents, etc.). Regression does not work.

Two London water companies

Lambeth Southwark

Main sewage discharge

Paul Grootendorst: A Review of Instrumental Variables Estimation of Treatment Effects…http://individual.utoronto.ca/grootendorst/pdf/IV_Paper_Sept6_2007.pdf

River Thames

Page 5: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 5/75]

IV Estimation

Cholera=f(Purity,u)+ε Z = water company Cov(Cholera,Z)=δCov(Purity,Z) Z is randomly mixed in the population

(two full sets of pipes) and uncorrelated with behavioral unobservables, u)

Cholera=α+δPurity+u+ε Purity = Mean+random variation+λu Cov(Cholera,Z)= δCov(Purity,Z)

Page 6: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 6/75]

Cornwell and Rupert DataCornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 YearsVariables in the file are

EXP = work experienceWKS = weeks workedOCC = occupation, 1 if blue collar, IND = 1 if manufacturing industrySOUTH = 1 if resides in southSMSA = 1 if resides in a city (SMSA)MS = 1 if marriedFEM = 1 if femaleUNION = 1 if wage set by union contractED = years of educationLWAGE = log of wage = dependent variable in regressions

These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155.  See Baltagi, page 122 for further analysis.  The data were downloaded from the website for Baltagi's text.

Page 7: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 7/75]

Page 8: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 8/75]

Specification: Quadratic Effect of Experience

Page 9: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 9/75]

The Effect of Education on LWAGE

1 2 3 4 ... ε

What is ε? ,... + everything elAbil seity, Motivation

Ability, Motivation = f( , , , ,...)

LWAGE EDUC EXP

EDUC GENDER SMSA SOUTH

2EXP

Page 10: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 10/75]

What Influences LWAGE?

1 2

3 4

Ability, Motivation

Ability, Motivat

( , ,...)

...

ε( )

Increased is associated with increases in

ion

Ability

Ability, Motivati( , ,on

LWAGE EDUC X

EXP

EDUC X

2EXP

2

...) and ε( )

What looks like an effect due to increase in may

be an increase in . The estimate of picks up

the effect of and the hidden effect of .

Ability, Motivation

Ability

Ability

EDUC

EDUC

Page 11: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 11/75]

An Exogenous Influence

1 2

3 4

( , , ,...)

...

ε( )

Increased is asso

Abili

ciate

ty, Motivation

Ability, Motivation

Ability, Motiva

d with increases in

( , , ,ti .n .o

LWAGE EDU Z

Z

Z

C X

EXP

EDUC X

2EXP

2

.) and not ε( )

An effect due to the effect of an increase on will

only be an increase in . The estimate of picks up

the effect of only.

Ability, Motiv

ation

EDUC

EDUC

ED

Z

Z

UC

is an Instrumental Variable

Page 12: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 12/75]

Instrumental Variables

Structure LWAGE (ED,EXP,EXPSQ,WKS,OCC,

SOUTH,SMSA,UNION) ED (MS, FEM)

Reduced Form: LWAGE[ ED (MS, FEM), EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION ]

Page 13: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 13/75]

Two Stage Least Squares Strategy

Reduced Form: LWAGE[ ED (MS, FEM,X), EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION ]

Strategy (1) Purge ED of the influence of everything but MS, FEM

(and the other variables). Predict ED using all exogenous information in the sample (X and Z).

(2) Regress LWAGE on this prediction of ED and everything else.

Standard errors must be adjusted for the predicted ED

Page 14: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 14/75]

OLS

Page 15: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 15/75]

The weird results for the coefficient on ED happened because the instruments, MS and FEM are dummy variables. There is not enough variation in these variables.

Page 16: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 16/75]

Source of Endogeneity

LWAGE = f(ED, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) +

ED = f(MS,FEM, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + u

Page 17: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 17/75]

Remove the Endogeneity LWAGE = f(ED,

EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + u +

LWAGE = f(ED, EXP,EXPSQ,WKS,OCC, SOUTH,SMSA,UNION) + u +

Strategy Estimate u Add u to the equation. ED is uncorrelated with

when u is in the equation.

Page 18: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 18/75]

Auxiliary Regression for ED to Obtain Residuals

Page 19: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 19/75]

OLS with Residual (Control Function) Added

2SLS

Page 20: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 20/75]

A Warning About Control Functions

Sum of squares is not computed correctly because U is in the regression.A general result. Control function estimators usually require a fix to the estimated covariance matrix for the estimator.

Page 21: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 21/75]

On Sat, May 3, 2014 at 4:48 PM, … wrote:

Dear Professor Greene,

I am giving an Econometrics course in Brazil and we are using your great textbook. I got a question which I think only you can help me. In our last class, I did a formal proof that var(beta_hat_OLS) is lower or equal than var(beta_hat_2SLS), under homoscedasticity. We know this assertive is also valid under heteroscedasticity, but a graduate student asked me the proof (which is my problem). Do you know where can I find it?

Page 22: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 22/75]

Page 23: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 23/75]

Page 24: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 24/75]

Page 25: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 25/75]

The General Problem1 2

1 1

2 2

2

1 2

2

Cov( , ) , K variables

Cov( , ) , K variables

is

OLS regression of y on ( , ) cannot estimate ( , )

consistently. Some other estimator is needed.

Additional structure:

endogenous

y X X

X 0

X 0

X

X X

X

1 2

= + where Cov( , ) but Cov( , )= .

An estimator based on ( , , ) may

be able to estimate ( , ) consistently.

instrumental variable

Z V V 0 Z 0

(IV X) X Z

Page 26: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 26/75]

Instrumental Variables Framework: y = X + , K variables in X. There exists a set of K variables, Z such that

plim(Z’X/n) 0 but plim(Z’/n) = 0

The variables in Z are called instrumental variables.

An alternative (to least squares) estimator of is

bIV = (Z’X)-1Z’y

We consider the following: Why use this estimator? What are its properties compared to least

squares? We will also examine an important application

Page 27: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 27/75]

IV Estimators

* Consistent

bIV = (Z’X)-1Z’y

= (Z’X/n)-1 (Z’X/n)β+ (Z’X/n)-1Z’ε/n = β+ (Z’X/n)-1Z’ε/n β* Asymptotically normal (same approach to proof as for

OLS)* Inefficient – to be shown.* By construction, the IV estimator is consistent. We have

an estimator that is consistent when least squares is not.

Page 28: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 28/75]

IV Estimation

Why use an IV estimator? Suppose that X and are not uncorrelated. Then least squares is neither unbiased nor consistent.

Recall the proof of consistency of least squares:

b = + (X’X/n)-1(X’/n).

Plim b = requires plim(X’/n) = 0. If this does not hold, the estimator is inconsistent.

Page 29: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 29/75]

A Popular MisconceptionIf only one variable in X is correlated with , the other

coefficients are consistently estimated. False.

The problem is “smeared” over the other coefficients.

1

111

21-1

1

1

1

Suppose only the first variable is correlated with

0Under the assumptions, plim( /n) = . Then

...

.

q0 q

plim = plim( /n)... .... q

times

K

ε

X'ε

b- β X'X

-1 the first column of Q

Page 30: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 30/75]

Consistency and Asymptotic Normality of the IV Estimator

-1

-1

-1

ˆ ( )

( )

ˆplim plim( N) N

ˆAsy.Var[ Asy.Var[ N]

( N) [plim N] (1/

-1ZX

-1 -1ZX XZ

-1 -1ZX XZ

β= Z'X Z'y

=β + Z'X Z'ε

(β - β )= Z'X/ (Z'ε/ )

= Q 0

β - β]= Q Z'ε/ Q

= 1/ Q Z'ΩZ/ Q =

i,t it it-1 -1

i

N)

(Assuming "well behaved" data)

ˆN ( N) N = ( N) N N

Invoke Slutsky and Lindberg-Feller for N to assert asymptotic normality.

ˆEstimate Asy.Var[ ] with

-1 -1ZX ZZ XZQ Ω Q

z(β - β )= Z'X/ Z'X/ w

w

β2

,t it it -1 -1ˆ(y )

[ ] [ ][ ]N or (N-K)

x βZ'X Z'Z XZ

Page 31: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 31/75]

Asymptotic Covariance Matrix of bIV

1IV

1 -1IV IV

2 1 -1IV IV

( ) '

( )( ) ' ( ) ' ' ( )

E[( )( ) '| ] ( ) ' ( )

b Z'X Z

b b Z'X Z Z X'Z

b b X,Z Z'X Z Z X'Z

Page 32: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 32/75]

Asymptotic EfficiencyAsymptotic efficiency of the IV estimator. The

variance is larger than that of LS. (A large sample type of Gauss-Markov result is at work.)

(1) It’s a moot point. LS is inconsistent.(2) Mean squared error is uncertain:

MSE[estimator|β]=Variance + square of bias.

IV may be better or worse. Depends on the data

Page 33: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 33/75]

Two Stage Least Squares

How to use an “excess” of instrumental variables(1) X is K variables. Some (at least one) of the K variables in X are correlated with ε.(2) Z is M > K variables. Some of the variables in Z are also in X, some are not. None of the variables in Z are correlated with ε.(3) Which K variables to use to compute Z’X and

Z’y?

Page 34: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 34/75]

Choosing the Instruments Choose K randomly? Choose the included Xs and the remainder

randomly? Use all of them? How? A theorem: (Brundy and Jorgenson, ca. 1972)

There is a most efficient way to construct the IV estimator from this subset: (1) For each column (variable) in X, compute the

predictions of that variable using all the columns of Z. (2) Linearly regress y on these K predictions.

This is two stage least squares

Page 35: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 35/75]

Algebraic Equivalence

Two stage least squares is equivalent to (1) each variable in X that is also in Z is

replaced by itself. (2) Variables in X that are not in Z are replaced

by predictions of that X using All other variables in X that are not correlated with ε All the variables in Z that are not in X.

Page 36: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 36/75]

The weird results for the coefficient on ED happened because the instruments, MS and FEM are dummy variables. There is not enough variation in these variables.

Page 37: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 37/75]

2SLS Algebra

1

1

ˆ

ˆ ˆ ˆ( )

But, = ( ) and ( ) is idempotent.

ˆ ˆ ( )( ) ( ) so

ˆ ˆ( ) = a real IV estimator by the definition.

ˆNote, plim( /n) =

-1

2SLS

-1Z Z

Z Z Z

2SLS

X Z(Z'Z) Z'X

b X'X X'y

Z(Z'Z) Z'X I -M X I -M

X'X= X' I -M I -M X= X' I -M X

b X'X X'y

X' 0

-1

ˆ since columns of are linear combinations

of the columns of , all of which are uncorrelated with

( ) ] ( )

2SLS Z Z

X

Z

b X' I -M X X' I -M y

Page 38: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 38/75]

Asymptotic Covariance Matrix for 2SLS

2 1 -1IV IV

2 1 -12SLS 2SLS

General Result for Instrumental Variable Estimation

E[( )( ) '| ] ( ) ' ( )

ˆSpecialize for 2SLS, using = ( )

ˆ ˆ ˆ ˆE[( )( ) '| ] ( ) ' ( )

Z

b b X,Z Z'X Z Z X'Z

Z X= I -M X

b b X,Z X'X X X X'X

2 1 -1

2 1

ˆ ˆ ˆ ˆ ˆ ˆ ( ) ' ( )

ˆ ˆ ( )

X'X X X X'X

X'X

Page 39: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 39/75]

2SLS Has Larger Variance than LS

2 -1

2 -1

A comparison to OLS

ˆ ˆAsy.Var[2SLS]= ( ' )

Neglecting the inconsistency,

Asy.Var[LS] = ( ' )

(This is the variance of LS around its mean, not )

Asy.Var[2SLS] Asy.Var[LS] in the matrix sense.

Com

X X

X X

β

-1 -1 2

2 2Z Z

pare inverses:

ˆ ˆ{Asy.Var[LS]} - {Asy.Var[2SLS]} (1/ )[ ' ' ]

(1/ )[ ' '( ) ]=(1/ )[ ' ]

This matrix is nonnegative definite. (Not positive definite

as it might have some rows and columns

X X- X X

X X- X I M X X M X

which are zero.)

Implication for "precision" of 2SLS.

The problem of "Weak Instruments"

Page 40: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 40/75]

Estimating σ2

2

2 n1i 1 in

Estimating the asymptotic covariance matrix -

a caution about estimating .

ˆSince the regression is computed by regressing y on ,

one might use

ˆ (y )ˆ

This is i

2sls

x

x'b

2 n1i 1 in

nconsistent. Use

(y )ˆ

(Degrees of freedom correction is optional. Conventional,

but not necessary.)

2slsx'b

Page 41: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 41/75]

Robust estimation of VC

-1 2 -1i,t it it

Counterpart to the White estimator allows heteroscedasticity

ˆˆ ˆ ˆ ˆˆ ˆEst.Asy.Var[ ]=( ) (y ) ( ) it itX'X x β x x X'X

“Actual” X

“Predicted” X

Page 42: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 42/75]

2SLS vs. Robust Standard Errors+--------------------------------------------------+| Robust Standard Errors |+---------+--------------+----------------+--------+|Variable | Coefficient | Standard Error |b/St.Er.|+---------+--------------+----------------+--------+ B_1 45.4842872 4.02597121 11.298 B_2 .05354484 .01264923 4.233 B_3 -.00169664 .00029006 -5.849 B_4 .01294854 .05757179 .225 B_5 .38537223 .07065602 5.454 B_6 .36777247 .06472185 5.682 B_7 .95530115 .08681261 11.000 +--------------------------------------------------+| 2SLS Standard Errors |+---------+--------------+----------------+--------+|Variable | Coefficient | Standard Error |b/St.Er.|+---------+--------------+----------------+--------+ B_1 45.4842872 .36908158 123.236 B_2 .05354484 .03139904 1.705 B_3 -.00169664 .00069138 -2.454 B_4 .01294854 .16266435 .080 B_5 .38537223 .17645815 2.184 B_6 .36777247 .17284574 2.128 B_7 .95530115 .20846241 4.583

Page 43: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 43/75]

Endogeneity Test? (Hausman)

Exogenous Endogenous

OLS Consistent, Efficient Inconsistent 2SLS Consistent, Inefficient Consistent

Base a test on d = b2SLS - bOLS

Use a Wald statistic, d’[Var(d)]-1d

What to use for the variance matrix? Hausman: V2SLS - VOLS

Page 44: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 44/75]

Hausman Test

Page 45: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 45/75]

Hausman Test: One at a Time?

Page 46: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 46/75]

Endogeneity Test: Wu

Considerable complication in Hausman test (text, pp. 234-237)

Simplification: Wu test. Regress y on X and estimated for the

endogenous part of X. Then use an ordinary Wald test.

Page 47: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 47/75]

Wu Test

Page 48: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 48/75]

Regression Based Endogeneity Test

it it it

An easy t test. (Wooldridge 2010, p. 127)

y q

= a set of M instruments.

Write = +

Can be estimated by ordinary least squares.

Endogeneity concerns correlation between v and .

ˆAdd v

itx δ

Z

q Zπ v

it it it it

= q - to the equation and use OLSˆ

ˆy q v + { error}

Simple t test on whether equals 0.

ˆEven easier, algebraically identical, (Wu, 1973), add

to the equation and do the same tes

it

z

x δ

q

t.

Page 49: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 49/75]

Testing Endogeneity of WKS(1) Regress WKS on 1,EXP,EXPSQ,OCC,SOUTH,SMSA,MS. U=residual, WKSHAT=prediction(2) Regress LWAGE on 1,EXP,EXPSQ,OCC,SOUTH,SMSA,WKS, U or WKSHAT+---------+--------------+----------------+--------+---------+----------+|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|+---------+--------------+----------------+--------+---------+----------+ Constant -9.97734299 .75652186 -13.188 .0000 EXP .01833440 .00259373 7.069 .0000 19.8537815 EXPSQ -.799491D-04 .603484D-04 -1.325 .1852 514.405042 OCC -.28885529 .01222533 -23.628 .0000 .51116447 SOUTH -.26279891 .01439561 -18.255 .0000 .29027611 SMSA .03616514 .01369743 2.640 .0083 .65378151 WKS .35314170 .01638709 21.550 .0000 46.8115246 U -.34960141 .01642842 -21.280 .0000 -.341879D-14+---------+--------------+----------------+--------+---------+----------+|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|+---------+--------------+----------------+--------+---------+----------+ Constant -9.97734299 .75652186 -13.188 .0000 EXP .01833440 .00259373 7.069 .0000 19.8537815 EXPSQ -.799491D-04 .603484D-04 -1.325 .1852 514.405042 OCC -.28885529 .01222533 -23.628 .0000 .51116447 SOUTH -.26279891 .01439561 -18.255 .0000 .29027611 SMSA .03616514 .01369743 2.640 .0083 .65378151 WKS .00354028 .00116459 3.040 .0024 46.8115246 WKSHAT .34960141 .01642842 21.280 .0000 46.8115246

Page 50: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 50/75]

General Test for Endogeneity

2

Extending the Wu test

There are M K instruments .

ˆTo carry out the test, compute column

by column with regressions on .

Compute by OLS the extended model

ˆ

Use

1 1 2 2

2

1 1 2 2 2

y= X β +X β +ε

Z

X

Z

y= X β +X β + X + error

0 an F test to test H : 0

Page 51: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 51/75]

Alternative to Hausman’s Formula?

H test requires the difference between an efficient and an inefficient estimator.

Any way to compare any two competing estimators even if neither is efficient?

Bootstrap? (Maybe)

Page 52: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 52/75]

Page 53: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 53/75]

Weak Instruments Symptom: The relevance condition, plim Z’X/n not zero,

is close to being violated. Detection:

Standard F test in the regression of xk on Z. F < 10 suggests a problem.

F statistic based on 2SLS – see text p. 351. Remedy:

Not much – most of the discussion is about the condition, not what to do about it.

Use LIML? Requires a normality assumption. Probably not too restrictive.

Page 54: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 54/75]

Weak Instruments (cont.)

-1ˆplim = + [Cov( )] Cov( )

If Cov( ) is "small" but nonzero, small

Cov( ) may hugely magnify the effect.

IV is not only inefficient, may be very badly

biased by "weak" instruments.

Solutions

β β Z,X Z,ε

Z,ε

Z,X

? Can one "test" for weak instruments?

Page 55: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 55/75]

Weak Instruments

-1 -1ZX XZ

-1XZ

Which is better?

LS is inconsistent, but probably has smaller variance

LS may be more precise

IV is consistent, but probably has larger variance

ˆAsy.Var[ ] =

may be lZZβ Q Ω Q

Q

ZX

arge. (Compared to what?)

Strange results with "small"

IV estimator tends to resemble OLS (bias) (not a

function of sample size).

Contradictory result. Suppose z is perfectly correlated

Q

with x. IV MUST be the same as OLS.

Page 56: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 56/75]

A study of moral hazardRiphahn, Wambach, Million: “Incentive Effects in the Demand for Healthcare”Journal of Applied Econometrics, 2003

Did the presence of the ADDON insurance influence the demand for health care – doctor visits and hospital visits?

For a simple example, we examine the PUBLIC insurance (89%) instead of ADDON insurance (2%).

Page 57: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 57/75]

Application: Health Care Panel Data

German Health Care Usage Data, 7,293 Individuals, Varying Numbers of PeriodsVariables in the file areData downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice.  This is a large data set.  There are altogether 27,326 observations.  The number of observations ranges from 1 to 7.  (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987).  Note, the variable NUMOBS below tells how many observations there are for each person.  This variable is repeated in each row of the data for the person.  (Downloaded from the JAE Archive) DOCTOR = 1(Number of doctor visits > 0) HOSPITAL = 1(Number of hospital visits > 0) HSAT =  health satisfaction, coded 0 (low) - 10 (high)   DOCVIS =  number of doctor visits in last three months HOSPVIS =  number of hospital visits in last calendar year PUBLIC =  insured in public health insurance = 1; otherwise = 0 ADDON =  insured by add-on insurance = 1; otherswise = 0 HHNINC =  household nominal monthly net income in German marks / 10000. (4 observations with income=0 were dropped) HHKIDS = children under age 16 in the household = 1; otherwise = 0 EDUC =  years of schooling AGE = age in years MARRIED = marital status EDUC = years of education

Page 58: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 58/75]

Evidence of Moral Hazard?

Page 59: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 59/75]

Regression Study

Page 60: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 60/75]

Endogenous Dummy Variable

Doctor Visits = f(Age, Educ, Health, Presence of Insurance, Other unobservables)

Insurance = f(Expected Doctor Visits, Other unobservables)

Page 61: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 61/75]

Approaches (Parametric) Control Function: Build a

structural model for the two variables (Heckman)

(Semiparametric) Instrumental Variable: Create an instrumental variable for the dummy variable (Barnow/Cain/ Goldberger, Angrist, Current generation of researchers)

(?) Propensity Score Matching (Heckman et al., Becker/Ichino, Many recent researchers)

Page 62: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 62/75]

Heckman’s Control Function Approach Y = xβ + δT + E[ε|T] + {ε - E[ε|T]} λ = E[ε|T] , computed from a model for whether T = 0 or 1

Magnitude = 11.1200 is nonsensical in this context.

Page 63: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 63/75]

Instrumental Variable ApproachConstruct a prediction for T using only the exogenous informationUse 2SLS using this instrumental variable.

Magnitude = 23.9012 is also nonsensical in this context.

Page 64: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 64/75]

Propensity Score Matching Create a model for T that produces probabilities for T=1: “Propensity Scores” Find people with the same propensity score – some with T=1, some with T=0 Compare number of doctor visits of those with T=1 to those with T=0.

Page 65: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 65/75]

Treatment Effect

Earnings and Education: Effect of an additional year of schooling

Estimating Average and Local Average Treatment Effects of Education when Compulsory Schooling Laws Really Matter Philip Oreopoulos AER, 96,1, 2006, 152-175

Page 66: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 66/75]

Treatment Effects and Natural Experiments

Page 67: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 67/75]

How do panel data fit into this? We can use the usual models.

We can use far more elaborate models We can study effects through time

Observations are surely correlated. The same individual is observed more than once Unobserved heterogeneity that appears in the

disturbance in a cross section remains persistent across observations (on the same ‘unit’).

Procedures must be adjusted. Dynamic effects are likely to be present.

Page 68: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 68/75]

Appendix: Structure and

Regression

Page 69: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 69/75]

Least Squares Revisited

-1 -1

-1

-1 -1

( ) ( )

plim plim( N) plim( N)

plim ( ) ( )= ( N) ( N)

[ N]

-1XX

-1 -1XX XX

-1 -1 -1XX XX XX

b= X'X X'y=β + X'X X'ε

b=β + X'X/ X'ε/ = β +Q γ

b- b = β + X'X X'ε - β +Q γ X'X/ X'ε/ - Q γ

Asy.Var[b - Q γ]= Q Asy.Var X'ε/ Q

i,t it it-1

-1

(1 N) [plim N] (1 N)

(Assuming "well behaved" data)

plim ) ( N) NN

= ( N) N ( plim )

Invoke Slutsky and Lind

-1 -1 -1 -1XX XX XX XX XX

-1XX

= / Q X'ΩX/ Q = / Q Ω Q

xN(b - b = X'X/ Q γ

X'X/ w - w

berg-Feller for ( plim ) to assert asymptotic

normality. is also asymptotically normally distributed, but inconsistent.

N w - w

b

Page 70: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 70/75]

Inference with IV Estimators

(1) Wald Statistics:

ˆ ˆ ˆ( ) ( )

(E.g., the usual 't-statistics')

(2) A type of F statistic:

ˆ ˆCompute SSUA=( )'( ) without restrictions

ˆ ˆˆ ˆCompute SSR=( )'( ) wit

-1

u u

R R

Rβ- q ' {Est.Asy.Var[β]} Rβ- q

y Xβ y Xβ

y Xβ y Xβ

h restrictions

ˆ ˆˆ ˆCompute SSU=( )'( ) without restrictions

(SSR SSU) / JF = ~F[J ,N K]

SSUA/(N-K)

U Uy Xβ y Xβ

Page 71: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 71/75]

Comparing OLS and IV

-1XX

-1 -1XX XX

XX

plim = +

Asy.Var[ ] =

Precision = Mean squared error[b| ]

= Asy.Var[b] + [plim( - )][plim( - )]

=

XX

Least squares b

b β Q

b Q Ω Q

β

b β b β

Q -1 -1 -1 -1XX XX XX

-1 -1ZX XZ

-1 -1ZX XZ

ˆ

ˆ plim =

ˆ Asy.Var[ ] =

ˆ Precision = Mean squared error[ | ]

=

XX

ZZ

ZZ

Ω Q Q Q

Instrumental variables β

β β

β Q Ω Q

β β

Q Ω Q

Page 72: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 72/75]

Testing for Endogeneity(?)

it it it

-1

A test for endogeneity? Consider one variable:

y q ==> = +

q may be endogenous. = a set of M instruments.

(1) OLS: = ( ) Inconsistent if Cov[q, ] 0.

itx δ y Xβ ε

Z

b X'X X'y

-1

2 -1 2,2SLS ,OLS

Consistent and efficient if Cov[q, ] = 0.

ˆ ˆ ˆ ˆ(2) 2SLS: ( ) Always consistent.

Inefficient if Cov[q, ] = 0.

ˆ ˆ ˆHausman test? [ ]'{ ( ) (ˆ ˆ

β = X'X X'y

β - b X'X

-1 1

2

-1 -1 12,2SLS

ˆ) } [ ]'

(a) Need to use the same estimator of .

1 ˆ ˆˆ ˆ[ ]' {( ) ( ) } [ ]'ˆ

(b) Even with this fix, the resulting matrix is singular.

X'X β - b

β - b X'X X'X β - b

Page 73: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 73/75]

Structure vs. Regression

Reduced Form vs. Stuctural Model Simultaneous equations origin

Q(d) = a0 + a1P + a2I + e(d) (demand)Q(s) = b0 + b1P + b2R + e(s) (supply)Q(.) = Q(d) = Q(s)What is the effect of a change in I on Q(.)?(Not a regression)

Reduced form: Q = c0 + c1I + c2R + v.(Regression)

Modern concepts of structure vs. regression: The search for causal effects.

Page 74: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 74/75]

Implications The structure is the theory The regression is the conditional mean There is always a conditional mean

It may not equal the structure It may be linear in the same variables What is the implication for least squares

estimation? LS estimates regressions LS does not necessarily estimate structures Structures may not be estimable – they may not be

identified.

Page 75: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 75/75]

Structure and Regression

Simultaneity? What if E[ε|x]≠0 y=x+ε, x=δy+u. Cov[x, ε]≠0

x is not the regression? What is the regression?

Reduced form: Assume ε and u are uncorrelated. y = [/(1- δ)]u + [1/(1- δ)]ε x= [1/(1- δ)]u + [δ /(1- δ)]ε Cov[x,y]/Var[x] =

The regression is y = x + v, where E[v|x]=0

2 2 2 2 2

2 2 2 2

[ ] /[ ]

(1 )(1/ ) where w= /[ ]u u

u uw w

Page 76: Part 2A: Basic Econometrics [ 1/75] Econometric Analysis of Panel Data William Greene Department of Economics Stern School of Business

Part 2A: Basic Econometrics [ 76/75]

Structure vs. RegressionSupply = a + b*Price + c*Capacity

Demand = A + B*Price + C*Income


Recommended