James J. Heckman University of Chicago Econ 312 This draft ...jenni.uchicago.edu/econ312/Slides/topic7/panel... · 26-05-2006 · Panel Data Analysis Part II — Feasible Estimators

Panel Data AnalysisPart II — Feasible Estimators

James J. HeckmanUniversity of Chicago

Econ 312This draft, May 26, 2006

1 Feasible GLS estimation

How to do feasible GLS?

To do feasible GLS estimation we follow the following two-stepprocess:

(1) Perform OLS regression, fit ˆ and form residuals ˆ(2) Estimate 2 and 2 using the estimated residuals.

Step 2 of the above procedure is done as follows:

Take ˆ · =X=1

ˆ = ˆ +X=1

˜

1

Then we have (using the assumptions that [ 0 0 ] = 0 and[ ] = 0)

X=1

( ˆ)2

1= 2 + 2 ( )

ˆ is unbiased for but not consistent - why? For fixed, wecannot generate consistent estimates.

2

Impose restrictionX=1

ˆ = 0.

"X=1

X=1

(ˆ ˆ ·)2#

= ( 1 ( 1)] 2 ( )

= [ ( 1) ( 1)] 2

where ˆ · =1X

=1

3

we use residuals from the regression to estimate 2

ˆ2 =

IX=1

X=1

(ˆ ˆ ·)2

I( 1) ( 1)

4

Then from equations ( ) and ( ) we have:

ˆ2 =

X=1

( ˆ)2

12

,

(but this may go negative: if so, set ˆ2 at zero). Then, simplyuse X

=1

( ˆ)2

I 1

to estimate ˆ2 which is always positive, consistent.

5

• Assuming normality for the error term, we get that theFGLS estimator is unbiased, i.e.: (Feasible GLS) =(ˆGLS)

• Further, by a theorem of Taylor - only 17% reduction ine ciency by estimating

Prather than knowing it.

6

Mundlak Problem: Example of a use of panel data to ferretout and identify a cross sectional relationship. Mundlak posedthe problem that is correlated with (e.g.. is managerialability; is inputs)

( ( )) 6= 0, and we have a specification error bias:hˆi= +

£( 0 ) 1 0 ¤ 6= 0 since

( 0 ) 6= 0.

7

OLS is inconsistent and biased. One way to eliminate theproblem: Use the within estimator:

I -0¸

= I -0¸

+ I -0¸

· = [ ·]0 + ·

On the transformed data, we get an estimator that is unbiasedand consistent.

Estimator of fixed e ect not consistent (we acquire an inciden-tal parameters problem but we can eliminate it asfixed, we get a consistent estimator.)

8

Notice, however, if there exists a variable that stays constantover the spell for all persons, we cannot estimate the associated.

ˆ = + ·

where and are variables and associated coe cients thatstay fixed over spells (we can regress estimated fixed e ects onthe provided that they stay constant over the spell are notcorrected with the ).

9

• In a cross section context, we have that without someother information, the model is not identified unless wecan invoke IV estimation.

• F.E. estimator is a conditional version of R.E. estima-tor // R.E. estimator: + both random values, wecondition on values of .

10

How To Test For the Presence of Bias?

0 : No Bias in the OLS estimator:OLS and between estimator are biased, Within estimator

is unbiased.

ˆ vs. ˆ

ˆ = + ( ) 1

IX=1

0[I -1 0]

ˆB = + (B ) 1

"X=1

0 1 0#.

11

Under 0

COV³ˆ ˆ

´= 0

Independently distributed under a normality assumption.we can test (just pool the standard errors).

12

Strict Exogeneity Test

Basic idea

( | ) 6= 0. where ( 1 )

Failure of this is failure of strict exogeneity in the time seriesliterature.

Regression Function (Scalar Case)

( | ) = 0 + 1 1 + 2 2 + 3 3 +

where denotes linear projection. Then, in Mundlak’s prob-lem, we get that = 1

= 0 + 1 + [ 0 + 1 1 + 2 2 + ] +

13

Then, we can test to see whether or not future and past valuesof enter the equation [if so, we get a violation of strictexogeneity in this set up].

Notice we can estimate 2 (from first equation), 1 (from sec-ond equation) and so forth

can estimate 1 [but, we cannot separate out the interceptsin this equation. Nor can we identify variables that don’t varyover time.] This is just a control function in the sense of Heck-man and Robb (1985, 1986).

14

Chamberlain’s Strict Exogeneity Test= + , = 1 I, = 1

is strictly exogenous if ( | ) = 0

model can be fitted by OLS.We can test, in time series

= + +an extraneous variable

+

= 1 = 1

We have strict exogeneity in the process if = 0 (assumption:is correlated over time: + ) a future value of a variable

is in the equation (that doesn’t belong) we can do an exacttest.

15

Consider special error structure: (one factor setup)

= + i.i.d.

( | 1 2 3 )=X=1

Then if we relax the strict exogeneity assumption, we have that

( | [ 1 ]) = +X=1

( | ) =

16

Array the into a supervector

= DIAG{ }+

in all regressions, we have that stays fixed we cantest this assumption.

When applying this test in particular economic situations, wemust interpret the results with caution. For e.g., in the ap-plication of this test to the situation in the permanent incomehypothesis, the significance of the coe cients of future valuescan not be ruled out under the model.

17

Example: Chamberlain test with T = 3 periods

Simple regression setting with = + i.i.d.,we have:

= 1 1 + 2 2 + 3 3 +

Then

1 = 1 1 + 1 1 + 2 2 + 3 3 + + 1

2 = 2 2 + 1 1 + 2 2 + 3 3 + + 2

3 = 3 3 + 1 1 + 2 2 + 3 3 + + 3

18

For a factor structure,= + i.i.d. .

Then:

1 = 1 1 + 1( 1 1 + 2 2 + 3 3) + 1 + 1

2 = 2 2 + 2( 1 1 + 2 2 + 3 3) + 2 + 2

3 = 3 3 + 3( 1 1 + 2 2 + 3 3) + 3 + 3

19

Can Identify

1 2 1 3 ( 1 + 1 1)

2 1 2 3 2 + 2 2

( 3 + 3 3) 3 1 3 2 · · · · · · · · ·

Normalize: set 1 1 then we can identify, 2, 3, 1 2 and3.

20

2 Maximum likelihood panel data es-timators

Consider these models from a more general viewpoint, we canform di erent maximum likelihood estimators of the parame-ters of interest.

Assume = + Write

= ( 1 1 ) = 1 I

21

is an i.i.d. random vector with distribution depending on

˜= ( 1 I) = ( )

(treat as a parameter)

£ =Y=1

(˜

¯̄̄˜)

( | 1 ).

Max £ w.r.t. = ˆ .

22

³ˆ´

9 as

fixed.

In general, ˆ 9 as I because of this. Not like in lin-ear models (in general, roots of these equations interconnectedand we have problems). A joint system of equations

$= 0

$= 0 = 1

23

This set of likelihood equations can be solved using three dis-tinct concepts:

1. Marginal Likelihood;

2. Conditional Likelihood; and

3. Integrated Likelihood.

24

2.1 Marginal Likelihood (or Ancillary Like-lihood):

Find (if possible) ( ) independent of thei.e. find some statistic = ( ) such that ( | )

£Marginal =Y=1

( | )

$ ˆ

Then we can form the ML estimators for (the parameters ofinterest) without worrying about the

25

We say that is ancillary for given with respect to originalmodel. (This is really b-ancillarity). An example of this is thewithin estimator.

= + +

i.i.d. N (0 2 )

26

= 2= 2 1

is called an ancillary statistic distribution is independent is:

| ˜N ( ( 2 1)¯̄2 2)

Thus an example of the Marginal likelihood estimator is thefirst di erence estimator, which is almost identical to the “within”estimator. Here, the within estimator would also be a Marginallikelihood estimator.

27

Because0[

0]

0= 0

We can always break up the distribution of into two pieces

= I0¸

+0

( | ) = ( | )| {z }This portion ind ofMarginal Likelihood

( · | )| {z }This is a su cientstatistic for

28

2.2 Maximum Likelihood Second Principle

Find , a su cient statistic for such that

( | su cient statistic for ) is ind. of

Find

= ( ) so that ( | ) = ( | )

Can throw away , e.g., = 1 + 2

1 + 2˜ ( + ( 1 + 2) 22 + 4 2)

29

Transform observation

μ1

2

¶ μ2 1

2 + 1

¶( 2 1 2 + 1 | ) = 0

( 1 2) = ( 2 1 1 + 2)

= ( 2 1 | ) ( 1 + 2 | )

but

( 2 1 2 + 1 | )

= ( 2 1 | )

conditional likelihood function is the same as in previouscase.

30

2.3 Integrated L.F. or Random E ects Esti-mator

Pick a density for (other methods do not require this)

pdf of ( | ).

For each person

( | ) =

Z( | ) ( | )

$ =Y=1

( | )

Suppose it is normal, N ¡0 2¢.

31

When we integrate out in the above using normality, we get

N ( 2 + 2 )11

· · · · · · 1

Problem becomes one of estimating

( 2 and distribution function of ).

32

Two possible methods:

(a) Assume ( | ) is a known finite parameter distribution(function of ) and estimate ( 2 ) (maybe too).

(b) Nonparametric estimation (e.g., Heckman-Singer). Thenestimate 2 ( ).

33

Mundlak Point:The within estimator is the GLS estimator in all cases if

= ¯ +

The more general point is that if we permit fixed e ects tobe functions of exogenous variables, the between and withinestimators will in general di er. Lee (as cited in Judge, et al.)shows how special the Mundlak point is.

34

Suppose = +

N (0 2)

If = ·

Marginal = Con = Int. = Within = ˆMLE

in a regression setting. Mundlak’s point is this: Suppose that

= · + (then is ind of )

= + · + +

35

Now what is the random e ect estimator?

= ( ·) + ·( + ) + +

intuitively: you get info only on from within. Apply GLS

=0¸= ˜

where = 1

r1

1 +

¸(refer to Section 3.2 of Part I).

36

Thus, we get the GLS transformation as:

˜ = ˜( ·) + ˜ ·( + ) + ˜( + )

In general,0¸

· = ·(1 )

37

Documents

James J. Heckman University of Chicago Econ 312 This draft ...jenni.uchicago.edu/econ312/Slides/topic7/panel... · 26-05-2006 · Panel Data Analysis Part II — Feasible Estimators