1/62: Topic 2.3 – Panel Data Binary Choice Models

Microeconometric Modeling

William Greene
Stern School of Business
New York University
New York NY USA

2.3 Panel Data Models for Binary Choice

Concepts

• Unbalanced Panel
• Attrition Bias
• Inverse Probability Weight
• Heterogeneity
• Population Averaged Model
• Clustering
• Pooled Model
• Quadrature
• Maximum Simulated Likelihood
• Conditional Estimator
• Incidental Parameters Problem
• Partial Effects
• Bias Correction
• Mundlak Specification
• Variable Addition Test

Models

• Random Effects Probit
• Fixed Effects Probit
• Fixed Effects Logit
• Dynamic Probit
• Mundlak Formulation
• Correlated Random Effects Model

Application: Health Care Panel Data

German Health Care Usage Data

Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals and 27,326 observations altogether. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987).

Variables in the file are

DOCTOR   = 1(Number of doctor visits > 0)
HOSPITAL = 1(Number of hospital visits > 0)
HSAT     = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS   = number of doctor visits in last three months
HOSPVIS  = number of hospital visits in last calendar year
PUBLIC   = insured in public health insurance = 1; otherwise = 0
ADDON    = insured by add-on insurance = 1; otherwise = 0
HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
HHKIDS   = children under age 16 in the household = 1; otherwise = 0
EDUC     = years of schooling
AGE      = age in years
MARRIED  = marital status

Unbalanced Panels

Group Sizes

Most theoretical results are for balanced panels.

Most real world panels are unbalanced.

Often the gaps are caused by attrition.

The major question is whether the gaps are ‘missing completely at random.’ If not, the observation mechanism is endogenous, and at least some methods will produce questionable results.

Researchers rarely have any reason to treat the data as nonrandomly sampled. (This is good news.)

Unbalanced Panels and Attrition ‘Bias’

Test for ‘attrition bias.’ (Verbeek and Nijman, Testing for Selectivity Bias in Panel Data Models, International Economic Review, 1992, 33, 681-703.)
• Variable addition test using covariates of presence in the panel
• Nonconstructive – what to do next?

Do something about attrition bias. (Wooldridge, Inverse Probability Weighted M-Estimators for Sample Stratification and Attrition, Portuguese Economic Journal, 2002, 1: 117-139.)
• Stringent assumptions about the process
• Model based on probability of being present in each wave of the panel

Inverse Probability Weighting

Panel is based on those present at WAVE 1, $N_1$ individuals.

Attrition is an absorbing state. No reentry, so $N_1 \ge N_2 \ge \dots \ge N_T$.

Sample is restricted at each wave to individuals who were present at the previous wave.

$d_{it} = 1$[Individual $i$ is present at wave $t$].

$d_{i1} = 1 \;\forall i$, and $d_{it} = 0 \Rightarrow d_{i,t+1} = 0$.

$x_{i1}$ = covariates observed for all $i$ at entry that relate to likelihood of being present at subsequent waves (health problems, disability, psychological well being, self employment, unemployment, maternity leave, student, caring for family member, ...).

Probit model for $d_{it} = 1[\delta' x_{i1} + w_{it} > 0]$, $t = 2,\dots,T$. $\hat\pi_{it}$ = fitted probability.

Assuming attrition decisions are independent, $\hat P_{it} = \prod_{s=1}^{t} \hat\pi_{is}$.

Inverse probability weight: $\hat W_{it} = d_{it} / \hat P_{it}$.

Weighted log likelihood: $\log L_W = \sum_{i=1}^{N} \sum_{t=1}^{T} \hat W_{it} \log L_{it}$ (No common effects.)
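The weight construction can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the example probabilities are hypothetical, the fitted wave probabilities are taken as given (they would come from the probit models for presence), and the wave 1 probability is set to 1 because everyone is present at entry.

```python
def inverse_probability_weights(present, fitted_probs):
    """Return the weights d_it / P_it for one individual.

    present[t]      : 1 if the individual is observed at wave t+1, else 0
    fitted_probs[t] : fitted probability of being present at wave t+1 given
                      presence at the previous wave (1.0 for wave 1)
    """
    weights = []
    cumulative = 1.0  # P_it = product of the fitted probabilities through wave t
    for d, pi_hat in zip(present, fitted_probs):
        cumulative *= pi_hat
        weights.append(d / cumulative)
    return weights


# Hypothetical individual: present in waves 1-3, absent in wave 4.
example_weights = inverse_probability_weights([1, 1, 1, 0], [1.0, 0.8, 0.5, 0.9])
```

Observations that survive unlikely waves receive larger weights; waves in which the individual is absent receive weight zero and drop out of the weighted log likelihood.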

Panel Data Binary Choice Models

Random Utility Model for Binary Choice

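$U_{it} = \alpha + \beta' x_{it} + \varepsilon_{it}$ + Person $i$ specific effect

Fixed effects using “dummy” variables

$U_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$

Random effects using omitted heterogeneity

$U_{it} = \alpha + \beta' x_{it} + \varepsilon_{it} + u_i$

Same outcome mechanism: $Y_{it} = 1[U_{it} > 0]$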

Ignoring Unobserved Heterogeneity

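Assuming strict exogeneity; $\mathrm{Cov}(x_{it}, u_i + \varepsilon_{it}) = 0$

$y_{it}^* = x_{it}'\beta + u_i + \varepsilon_{it}$

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}] = \mathrm{Prob}[u_i + \varepsilon_{it} > -x_{it}'\beta]$

Using the same model format:

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}] = F\big(x_{it}'\beta / \sqrt{1 + \sigma_u^2}\big) = F(x_{it}'\delta)$

This is the 'population averaged model.'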

Ignoring Heterogeneity in the RE Model

Ignoring heterogeneity, we estimate $\delta$ not $\beta$.

Partial effects are $\delta f(x_{it}'\delta)$ not $\beta f(x_{it}'\beta)$.

$\beta$ is underestimated, but $f(x_{it}'\beta)$ is overestimated. ($\delta$ is too small; the scale factor $f(x_{it}'\delta)$ is too large.)

Which way does it go? Maybe ignoring $u$ is ok?

Not if we want to compute probabilities or do statistical inference about $\beta$. Estimated standard errors will be too small.

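Partial effect of Age, evaluated at Age = 43.527:

Pooled: $\partial P / \partial \mathrm{Age} = \phi(-.37176 + .01625\,\mathrm{Age}) \times .01625 = .006128$

RE: $\partial P / \partial \mathrm{Age} = \phi\big(\sqrt{1 - .45298}\,(-.53689 + .02338\,\mathrm{Age})\big) \times \sqrt{1 - .45298} \times .02338 = .00648$

Population average coefficient is off by 43% but the partial effect is off by 6%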
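The arithmetic on this slide can be checked directly. The sketch below assumes a probit model, so the density is the standard normal pdf, and uses the fact that with $\rho = \sigma_u^2/(1+\sigma_u^2)$ the scale factor $1/\sqrt{1+\sigma_u^2}$ equals $\sqrt{1-\rho}$. The coefficient values are the ones quoted on the slide; the variable names are illustrative.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


age = 43.527

# Pooled (population averaged) probit: partial effect = phi(index) * coefficient.
pooled_pe = norm_pdf(-0.37176 + 0.01625 * age) * 0.01625

# Random effects probit: rescale the structural index and coefficient
# by sqrt(1 - rho) before evaluating the density.
rho = 0.45298
scale = math.sqrt(1.0 - rho)
re_pe = norm_pdf(scale * (-0.53689 + 0.02338 * age)) * scale * 0.02338

coef_ratio = 0.02338 / 0.01625   # structural vs. pooled coefficient
pe_ratio = re_pe / pooled_pe     # RE vs. pooled partial effect
```

The coefficient ratio is far from one while the ratio of partial effects is close to one, which is the point of the comparison.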

Ignoring Heterogeneity (Broadly)

Presence will generally make parameter estimates look smaller than they would otherwise.

Ignoring heterogeneity will definitely distort standard errors.

Partial effects based on the parametric model may not be affected very much.

Is the pooled estimator ‘robust?’ Less so than in the linear model case.

Pooled vs. RE Panel Estimator

----------------------------------------------------------------------
Binomial Probit Model
Dependent variable               DOCTOR
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
Constant|    .02159         .05307        .407     .6842
     AGE|    .01532***      .00071      21.695     .0000     43.5257
    EDUC|   -.02793***      .00348      -8.023     .0000     11.3206
  HHNINC|   -.10204**       .04544      -2.246     .0247      .35208
--------+-------------------------------------------------------------
Unbalanced panel has 7293 individuals
--------+-------------------------------------------------------------
Constant|   -.11819         .09280      -1.273     .2028
     AGE|    .02232***      .00123      18.145     .0000     43.5257
    EDUC|   -.03307***      .00627      -5.276     .0000     11.3206
  HHNINC|    .00660         .06587        .100     .9202      .35208
     Rho|    .44990***      .01020      44.101     .0000
--------+-------------------------------------------------------------

Partial Effects

----------------------------------------------------------------------
Partial derivatives of E[y] = F[*] with
respect to the vector of characteristics
They are computed at the means of the Xs
Observations used for means are All Obs.
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Elasticity
--------+-------------------------------------------------------------
        |Pooled
     AGE|    .00578***      .00027      21.720     .0000      .39801
    EDUC|   -.01053***      .00131      -8.024     .0000     -.18870
  HHNINC|   -.03847**       .01713      -2.246     .0247     -.02144
--------+-------------------------------------------------------------
        |Based on the panel data estimator
     AGE|    .00620***      .00034      18.375     .0000      .42181
    EDUC|   -.00918***      .00174      -5.282     .0000     -.16256
  HHNINC|    .00183         .01829        .100     .9202      .00101
--------+-------------------------------------------------------------

The Effect of Clustering

• $Y_{it}$ must be correlated with $Y_{is}$ across periods
• Pooled estimator ignores correlation
• Broadly, $y_{it} = E[y_{it}|x_{it}] + w_{it}$, where $E[y_{it}|x_{it}] = \mathrm{Prob}(y_{it} = 1|x_{it})$ and $w_{it}$ is correlated across periods

Assuming the marginal probability is the same, the pooled estimator is consistent. (We just saw that it might not be.)

Ignoring the correlation across periods generally leads to underestimating standard errors.

‘Cluster’ Corrected Covariance Matrix

$C$ = the number of clusters

$n_c$ = number of observations in cluster $c$

$H$ = negative inverse of second derivatives matrix

$g_{ic}$ = derivative of log density for observation $i$ in cluster $c$
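The estimator's formula did not survive on this slide, so the sketch below rests on an assumption: the usual cluster "sandwich" form, $\mathrm{Est.Var}[\hat\beta] = H \big[\tfrac{C}{C-1} \sum_{c=1}^{C} (\sum_{i=1}^{n_c} g_{ic})(\sum_{i=1}^{n_c} g_{ic})'\big] H$, with $H$ the negative inverse Hessian as defined above. The function name and the data layout are hypothetical.

```python
def cluster_covariance(H, scores_by_cluster):
    """Cluster-corrected covariance H * [C/(C-1) * sum_c g_c g_c'] * H.

    H                 : K x K negative inverse Hessian (list of lists)
    scores_by_cluster : list of clusters; each cluster is a list of
                        length-K score vectors g_ic
    """
    K = len(H)
    C = len(scores_by_cluster)

    # Sum the scores within each cluster, then accumulate outer products.
    meat = [[0.0] * K for _ in range(K)]
    for cluster in scores_by_cluster:
        g_c = [sum(g[k] for g in cluster) for k in range(K)]
        for a in range(K):
            for b in range(K):
                meat[a][b] += g_c[a] * g_c[b]

    factor = C / (C - 1)  # finite-cluster degrees of freedom correction
    meat = [[factor * m for m in row] for row in meat]

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(K)) for j in range(K)]
                for i in range(K)]

    return matmul(matmul(H, meat), H)


# Tiny illustrative case: one parameter, two clusters of two observations.
example_cov = cluster_covariance([[0.5]], [[[1.0], [1.0]], [[-1.0], [-1.0]]])
```

Summing scores within a cluster before forming the outer product is what lets within-cluster correlation enter the variance; with one observation per cluster the expression reduces to the familiar robust (non-clustered) sandwich apart from the correction factor.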

Cluster Correction: Doctor

----------------------------------------------------------------------
Binomial Probit Model
Dependent variable               DOCTOR
Log likelihood function     -17457.21899
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
        | Conventional Standard Errors
Constant|   -.25597***      .05481      -4.670     .0000
     AGE|    .01469***      .00071      20.686     .0000     43.5257
    EDUC|   -.01523***      .00355      -4.289     .0000     11.3206
  HHNINC|   -.10914**       .04569      -2.389     .0169      .35208
  FEMALE|    .35209***      .01598      22.027     .0000      .47877
--------+-------------------------------------------------------------
        | Corrected Standard Errors
Constant|   -.25597***      .07744      -3.305     .0009
     AGE|    .01469***      .00098      15.065     .0000     43.5257
    EDUC|   -.01523***      .00504      -3.023     .0025     11.3206
  HHNINC|   -.10914*        .05645      -1.933     .0532      .35208
  FEMALE|    .35209***      .02290      15.372     .0000      .47877
--------+-------------------------------------------------------------

Random Effects

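Assuming strict exogeneity; $\mathrm{Cov}(x_{it}, u_i + \varepsilon_{it}) = 0$

$y_{it}^* = \alpha + x_{it}'\beta + u_i + \varepsilon_{it}$

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}] = \mathrm{Prob}[u_i + \varepsilon_{it} > -\alpha - x_{it}'\beta]$

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}] = F\big((\alpha + x_{it}'\beta) / \sqrt{1 + \sigma_u^2}\big) = F(\delta_0 + x_{it}'\delta)$

This is the 'population averaged model.'

With $u_i = \sigma_u v_i$:

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}, u_i] = \mathrm{Prob}[\varepsilon_{it} > -\alpha - x_{it}'\beta - \sigma_u v_i]$

$\mathrm{Prob}[y_{it} = 1 \mid x_{it}, u_i] = F(\alpha + x_{it}'\beta + \sigma_u v_i)$

This is the structural model.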

Quadrature – Butler and Moffitt (1982)

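This method is used in most commercial software since 1982.

$u_i \sim N[0, \sigma_u^2]$, so $u_i = \sigma_u v_i$ where $v_i \sim N[0,1]$.

$\log L = \sum_{i=1}^{N} \log \int_{-\infty}^{\infty} \Big[\prod_{t=1}^{T_i} F(y_{it}, \alpha + x_{it}'\beta + \sigma_u v_i)\Big] \frac{1}{\sqrt{2\pi}} \exp\big(-v_i^2/2\big)\, dv_i$

$\quad = \sum_{i=1}^{N} \log \int_{-\infty}^{\infty} g(v) \frac{1}{\sqrt{2\pi}} \exp\big(-v^2/2\big)\, dv$

(make a change of variable to $w = v/\sqrt{2}$)

$\quad = \sum_{i=1}^{N} \log \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} g(\sqrt{2}\,w) \exp(-w^2)\, dw$

The integral can be computed using Hermite quadrature:

$\quad \approx \sum_{i=1}^{N} \log \frac{1}{\sqrt{\pi}} \sum_{h=1}^{H} w_h\, g(\sqrt{2}\,z_h)$

The values of $w_h$ (weights) and $z_h$ (nodes) are found in published tables such as Abramowitz and Stegun (or on the web). H is by choice. Higher H produces greater accuracy (but takes longer).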

Quadrature Log Likelihood

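After all the substitutions, the function to be maximized:

$\log L = \sum_{i=1}^{N} \log \frac{1}{\sqrt{\pi}} \sum_{h=1}^{H} w_h \prod_{t=1}^{T_i} F\big(y_{it}, \alpha + x_{it}'\beta + \sigma_u \sqrt{2}\, z_h\big)$

$\quad = \sum_{i=1}^{N} \log \frac{1}{\sqrt{\pi}} \sum_{h=1}^{H} w_h \prod_{t=1}^{T_i} F\big(y_{it}, \alpha + x_{it}'\beta + \theta z_h\big)$, with $\theta = \sigma_u \sqrt{2}$

Not simple, but feasible.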
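A minimal sketch of this quadrature log likelihood for a random effects probit follows. It is not the implementation used for the results in these slides: it hard-codes a 5-point Gauss-Hermite rule (H = 5, far fewer points than production software would use), takes F to be the normal cdf, and assumes a hypothetical data layout in which each individual is a list of (y, x) pairs.

```python
import math

# 5-point Gauss-Hermite rule for integrals of exp(-w^2) * f(w) (standard table values).
GH_NODES = [-2.020182870456086, -0.958572464613819, 0.0,
            0.958572464613819, 2.020182870456086]
GH_WEIGHTS = [0.01995324205904591, 0.393619323152241, 0.945308720482942,
              0.393619323152241, 0.01995324205904591]


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_outcome(y, index):
    """F(y, index): probit probability of the observed outcome y in {0, 1}."""
    return norm_cdf(index) if y == 1 else norm_cdf(-index)


def re_probit_loglik(groups, alpha, beta, sigma_u):
    """Quadrature approximation to the random effects probit log likelihood.

    groups : list of individuals; each individual is a list of (y, x) pairs,
             where x is a list of covariates (hypothetical layout).
    """
    loglik = 0.0
    for obs in groups:
        total = 0.0
        for z_h, w_h in zip(GH_NODES, GH_WEIGHTS):
            shift = sigma_u * math.sqrt(2.0) * z_h  # theta * z_h
            prod = 1.0
            for y, x in obs:
                index = alpha + sum(b * xk for b, xk in zip(beta, x))
                prod *= prob_outcome(y, index + shift)
            total += w_h * prod
        loglik += math.log(total / math.sqrt(math.pi))
    return loglik


# Hypothetical two-person panel with one covariate.
example_groups = [[(1, [0.5]), (0, [1.0])], [(1, [-0.2])]]
example_loglik = re_probit_loglik(example_groups, 0.1, [0.3], 1.0)
```

Setting `sigma_u` to zero removes the heterogeneity, and because the quadrature weights sum to $\sqrt{\pi}$ the function then returns the pooled probit log likelihood, which makes a convenient sanity check.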

Simulation Based Estimator

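$\log L = \sum_{i=1}^{N} \log \int_{-\infty}^{\infty} \Big[\prod_{t=1}^{T_i} F(y_{it}, \alpha + x_{it}'\beta + \sigma_u v_i)\Big] \frac{1}{\sqrt{2\pi}} \exp\big(-v_i^2/2\big)\, dv_i$

$\quad = \sum_{i=1}^{N} \log \int_{-\infty}^{\infty} g(v_i) \frac{1}{\sqrt{2\pi}} \exp\big(-v_i^2/2\big)\, dv_i$

This equals $\sum_{i=1}^{N} \log E[g(v_i)]$.

The expected value of the function of $v_i$ can be approximated by drawing R random draws $v_{ir}$ from the population N[0,1] and averaging the R functions of $v_{ir}$. We maximize

$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} F(y_{it}, \alpha + x_{it}'\beta + \sigma_u v_{ir})$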
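The simulated log likelihood can be sketched the same way. This illustration assumes a probit F, pseudo-random normal draws from Python's `random` module (the application in these slides reports Halton draws instead), and a hypothetical data layout of (y, x) pairs per individual. In an actual estimation the draws for each individual would be generated once and held fixed across iterations of the optimizer; the fixed seed here only mimics that.

```python
import math
import random


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulated_loglik(groups, alpha, beta, sigma_u, R=200, seed=12345):
    """Simulated random effects probit log likelihood with R draws per individual.

    groups : list of individuals; each is a list of (y, x) pairs (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so repeated calls reuse the same draws
    loglik = 0.0
    for obs in groups:
        average = 0.0
        for _ in range(R):
            v_ir = rng.gauss(0.0, 1.0)
            prod = 1.0
            for y, x in obs:
                index = alpha + sum(b * xk for b, xk in zip(beta, x)) + sigma_u * v_ir
                prod *= norm_cdf(index) if y == 1 else norm_cdf(-index)
            average += prod
        loglik += math.log(average / R)
    return loglik


# Hypothetical two-person panel with one covariate.
sim_groups = [[(1, [0.5]), (0, [1.0])], [(1, [-0.2])]]
sim_value = simulated_loglik(sim_groups, 0.1, [0.3], 1.0)
```

Averaging inside the log is what distinguishes this from simply adding noise to the index: each individual's likelihood contribution is the mean over draws of the product across that individual's periods.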

Random Effects Model: Quadrature

----------------------------------------------------------------------
Random Effects Binary Probit Model
Dependent variable               DOCTOR
Log likelihood function     -16290.72192  Random Effects
Restricted log likelihood   -17701.08500  Pooled
Chi squared [   1 d.f.]       2820.72616
Estimation based on N =  27326, K =   5
Unbalanced panel has   7293 individuals
--------+-------------------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]   Mean of X
--------+-------------------------------------------------------------
Constant|   -.11819         .09280      -1.273     .2028
     AGE|    .02232***      .00123      18.145     .0000     43.5257
    EDUC|   -.03307***      .00627      -5.276     .0000     11.3206
  HHNINC|    .00660         .06587        .100     .9202      .35208
     Rho|    .44990***      .01020      44.101     .0000
--------+-------------------------------------------------------------
        |Pooled Estimates
Constant|    .02159         .05307        .407     .6842
     AGE|    .01532***      .00071      21.695     .0000     43.5257
    EDUC|   -.02793***      .00348      -8.023     .0000     11.3206
  HHNINC|   -.10204**       .04544      -2.246     .0247      .35208
--------+-------------------------------------------------------------

Random Parameter Model

----------------------------------------------------------------------
Random Coefficients Probit Model
Dependent variable               DOCTOR    (Quadrature Based)
Log likelihood function     -16296.68110   (-16290.72192)
Restricted log likelihood   -17701.08500
Chi squared [   1 d.f.]       2808.80780
Simulation based on 50 Halton draws
--------+-------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]
--------+-------------------------------------------------
        |Nonrandom parameters
     AGE|    .02226***      .00081      27.365     .0000   ( .02232)
    EDUC|   -.03285***      .00391      -8.407     .0000   (-.03307)
  HHNINC|    .00673         .05105        .132     .8952   ( .00660)
        |Means for random parameters
Constant|   -.11873**       .05950      -1.995     .0460   (-.11819)
        |Scale parameters for dists. of random parameters
Constant|    .90453***      .01128      80.180     .0000
--------+-------------------------------------------------------------

Using quadrature, a = -.11819. Implied from these estimates is $\rho = .90453^2 / (1 + .90453^2) = .449998$, compared to .44990 using quadrature.
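The conversion between the estimated scale parameter and the implied correlation is a one-line calculation. The sketch below assumes the probit normalization in which the idiosyncratic disturbance has unit variance, so that $\rho = \sigma_u^2/(1+\sigma_u^2)$; the function names are illustrative.

```python
import math


def implied_rho(sigma_u):
    """Correlation implied by sigma_u when Var(eps_it) is normalized to 1."""
    return sigma_u ** 2 / (1.0 + sigma_u ** 2)


def sigma_from_rho(rho):
    """Inverse mapping: sigma_u implied by a given rho (0 <= rho < 1)."""
    return math.sqrt(rho / (1.0 - rho))


rho_from_simulation = implied_rho(0.90453)       # scale parameter from the table above
sigma_from_quadrature = sigma_from_rho(0.44990)  # Rho reported by the quadrature estimator
```

The two estimators report the same quantity on different scales, so converting one to the other is the natural way to compare them.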

A Dynamic Model

$y_{it} = 1[x_{it}'\beta + \gamma y_{i,t-1} + \varepsilon_{it} + u_i > 0]$

Two similar 'effects'
• Unobserved heterogeneity
• State dependence = state 'persistence'

$\Pr(y_{it} = 1 \mid y_{i,t-1}, \dots, y_{i0}, x_{it}, u_i) = F[x_{it}'\beta + \gamma y_{i,t-1} + u_i]$

How to estimate $\beta$, $\gamma$, marginal effects, F(.), etc?

(1) Deal with the latent common effect

(2) Handle the lagged effects: This encounters the initial conditions problem.

Dynamic Probit Model: A Standard Approach

(1) Conditioned on all effects, joint probability

$P(y_{i1}, y_{i2}, \dots, y_{iT} \mid y_{i0}, x_i, u_i) = \prod_{t=1}^{T} F(x_{it}'\beta + \gamma y_{i,t-1} + u_i,\; y_{it})$

(2) Unconditional density; integrate out the common effect

$P(y_{i1}, y_{i2}, \dots, y_{iT} \mid y_{i0}, x_i) = \int P(y_{i1}, y_{i2}, \dots, y_{iT} \mid y_{i0}, x_i, u_i)\, h(u_i \mid y_{i0}, x_i)\, du_i$

(3) Density for heterogeneity

$h(u_i \mid y_{i0}, x_i) = N[\alpha + \theta y_{i0} + x_i'\delta,\; \sigma_u^2]$, $x_i = [x_{i1}, x_{i2}, \dots, x_{iT}]$, so

$u_i = \alpha + \theta y_{i0} + x_i'\delta + \sigma_u w_i$ (contains every period of $x_{it}$)

(4) Reduced form

$P(y_{i1}, y_{i2}, \dots, y_{iT} \mid y_{i0}, x_i) = \int \prod_{t=1}^{T} F(\alpha + x_{it}'\beta + \gamma y_{i,t-1} + \theta y_{i0} + x_i'\delta + \sigma_u w_i,\; y_{it})\, h(w_i)\, dw_i$

This is a random effects model

Simplified Dynamic Model

Projecting $u_i$ on all observations expands the model enormously.

(3) Projection of heterogeneity only on group means

$h(u_i \mid y_{i0}, x_i) = N[\alpha + \theta y_{i0} + \bar x_i'\delta,\; \sigma_u^2]$ so

$u_i = \alpha + \theta y_{i0} + \bar x_i'\delta + \sigma_u w_i$

(4) Reduced form

$P(y_{i1}, y_{i2}, \dots, y_{iT} \mid y_{i0}, x_i) = \int \prod_{t=1}^{T} F(\alpha + x_{it}'\beta + \gamma y_{i,t-1} + \theta y_{i0} + \bar x_i'\delta + \sigma_u w_i,\; y_{it})\, h(w_i)\, dw_i$

Mundlak style correction with the initial value in the equation.

This is (again) a random effects model

A Dynamic Model for Public Insurance

• Age
• Household Income
• Kids in the household
• Health Status

Add initial value, lagged value, group means
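Building the augmented regressor set for one individual can be sketched as follows. The function name, the dictionary keys, and the choice to average the covariates over the estimation periods t = 1,...,T (rather than including period 0) are assumptions made for illustration.

```python
def augment_dynamic(y, x):
    """Build rows for a dynamic random effects model for one individual.

    y : outcomes for periods 0..T (period 0 supplies the initial value)
    x : covariate lists for periods 0..T
    Returns one row per estimation period t = 1..T containing the current
    covariates, the lagged outcome, the initial outcome, and the group means.
    """
    T = len(y) - 1
    K = len(x[0])
    # Group means over the estimation periods (an assumption; see lead-in).
    x_mean = [sum(x[t][k] for t in range(1, T + 1)) / T for k in range(K)]
    rows = []
    for t in range(1, T + 1):
        rows.append({
            "y": y[t],
            "x": x[t],
            "y_lag": y[t - 1],   # state dependence term
            "y_init": y[0],      # initial condition
            "x_mean": x_mean,    # Mundlak-style group means
        })
    return rows


# Hypothetical individual observed in periods 0, 1, 2 with one covariate.
example_rows = augment_dynamic([1, 0, 1], [[1.0], [2.0], [4.0]])
```

The resulting rows can be passed to any random effects probit routine, since the reduced form is a random effects model in the expanded covariate set.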

Dynamic Common Effects Model

Application: Stewart, JAE, 2007

• British Household Panel Survey (1991-1996)
• 3060 households retained (balanced) out of 4739 total.
• Unemployment indicator (0,1)
• Data features
  • Panel data – unobservable heterogeneity
  • State persistence: “Someone unemployed at t-1 is more than 20 times as likely to be unemployed at t as someone employed at t-1.”

Application: Direct Approach

GHK Simulation/Estimation

The presence of autocorrelation and state dependence in the model invalidates the simple maximum likelihood procedures we examined earlier. The appropriate likelihood function is constructed by formulating the probabilities as

$\mathrm{Prob}(y_{i,0}, y_{i,1}, \dots) = \mathrm{Prob}(y_{i,0}) \times \mathrm{Prob}(y_{i,1} \mid y_{i,0}) \times \dots \times \mathrm{Prob}(y_{i,T} \mid y_{i,T-1})$.

This still involves a T = 7 order normal integration, which is approximated in the study using a simulator similar to the GHK simulator.

Problems with Dynamic RE Probit

• Assumes $y_{i,0}$ and the effects are uncorrelated
• Assumes the initial conditions are exogenous – OK if the process and the observation begin at the same time, not if different.
• Doesn’t allow time invariant variables in the model.
• The normality assumption in the projection.

Distributional Problem

Normal distributions assumed throughout

Normal distribution for the unique component, εi,t

Normal distribution assumed for the heterogeneity, ui

Sensitive to the distribution?

Alternative: Discrete distribution for ui. Heckman and Singer style, latent class model. Conventional estimation methods.

Why is the model not sensitive to normality for εi,t but it is sensitive to normality for ui?

Fixed Effects Modeling

Advantages:
• Allows correlation of covariates and heterogeneity

Disadvantages:
• Complications of computing all those dummy variable coefficients – solved problem (Greene, 2004)
• No time invariant variables – not solvable in the FE context
• Incidental Parameters problem – persistent small T bias, does not go away

Strategies
• Unconditional estimation
• Conditional estimation – Rasch/Chamberlain
• Hybrid conditional/unconditional estimation
• Bias corrections
• Mundlak estimator

Fixed Effects Models

Estimate with dummy variable coefficients

$U_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$

Can be done by “brute force” for 10,000s of individuals

$\log L = \sum_{i=1}^{N} \sum_{t=1}^{T_i} \log F(y_{it}, \alpha_i + x_{it}'\beta)$

• F(.) = appropriate probability for the observed outcome
• Compute $\beta$ and $\alpha_i$ for i = 1,…,N (may be large)
• See FixedEffects.pdf in course materials.

Unconditional Estimation

Maximize the whole log likelihood

Difficult! Many (thousands) of parameters.

Feasible – NLOGIT (2004) (“Brute force”)

Fixed Effects Health Model

Groups in which $y_{it}$ is always = 0 or always = 1. Cannot compute $\alpha_i$.

Conditional Estimation

• Principle: $f(y_{i1}, y_{i2}, \dots \mid \text{some statistic})$ is free of the fixed effects for some models.
• Maximize the conditional log likelihood, given the statistic.
• Can estimate $\beta$ without having to estimate $\alpha_i$.
• Only feasible for the logit model. (Poisson and a few other continuous variable models. No other discrete choice models.)

Binary Logit Conditional Probabilities

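$\mathrm{Prob}(y_{it} = 1 \mid x_{it}) = \dfrac{e^{\alpha_i + x_{it}'\beta}}{1 + e^{\alpha_i + x_{it}'\beta}}$

$\mathrm{Prob}\Big(Y_{i1} = y_{i1}, Y_{i2} = y_{i2}, \dots, Y_{iT_i} = y_{iT_i} \,\Big|\, \sum_{t=1}^{T_i} y_{it}\Big) = \dfrac{\exp\big(\sum_{t=1}^{T_i} y_{it} x_{it}'\beta\big)}{\sum_{\Sigma_t d_{it} = S_i} \exp\big(\sum_{t=1}^{T_i} d_{it} x_{it}'\beta\big)} = \dfrac{\prod_{t=1}^{T_i} \exp(y_{it} x_{it}'\beta)}{\sum_{\text{all different ways that } \Sigma_t d_{it} \text{ can equal } S_i} \prod_{t=1}^{T_i} \exp(d_{it} x_{it}'\beta)}$

Denominator is summed over all the different combinations of $T_i$ values of $y_{it}$ that sum to the same sum as the observed $\sum_{t=1}^{T_i} y_{it}$. If $S_i$ is this sum, there are $\binom{T_i}{S_i}$ terms. May be a huge number. An algorithm by Krailo and Pike makes it simple.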
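The denominator is the elementary symmetric polynomial of degree $S_i$ in the terms $\exp(x_{it}'\beta)$, so it can be built up one period at a time instead of enumerating every sequence. The sketch below uses that recursion, which is the idea usually attributed to Krailo and Pike; it is offered as an illustration under that assumption, not as a transcription of their published algorithm, and the function names are hypothetical. A brute-force enumeration is included only to check the result.

```python
import math
from itertools import combinations


def conditional_denominator(xb, S):
    """Sum over all 0/1 sequences d with sum(d) == S of exp(sum_t d_t * xb_t).

    xb : list of index values x_it'beta for one individual
    """
    # f[s] = sum over subsets of size s of the periods processed so far.
    f = [1.0] + [0.0] * S
    for value in xb:
        e_t = math.exp(value)
        for s in range(S, 0, -1):  # descending so each period is used at most once
            f[s] += e_t * f[s - 1]
    return f[S]


def conditional_prob(y, xb):
    """Conditional probability of the observed sequence y given its sum."""
    numerator = math.exp(sum(y_t * v for y_t, v in zip(y, xb)))
    return numerator / conditional_denominator(xb, sum(y))


def brute_force_denominator(xb, S):
    """Direct enumeration of all C(T, S) terms, for checking only."""
    return sum(math.exp(sum(xb[t] for t in idx))
               for idx in combinations(range(len(xb)), S))


# Hypothetical index values for T = 7 periods.
example_xb = [0.3, -0.5, 1.2, 0.1, -0.7, 0.4, 0.9]
example_prob = conditional_prob([1, 0, 0, 0, 1, 1, 1], example_xb)
```

The recursion costs on the order of $T_i \times S_i$ operations, whereas enumeration grows with $\binom{T_i}{S_i}$.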

Example: Two Period Binary Logit

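$\mathrm{Prob}(y_{it} = 1 \mid x_{it}) = \dfrac{e^{\alpha_i + x_{it}'\beta}}{1 + e^{\alpha_i + x_{it}'\beta}}$

$\mathrm{Prob}\Big(Y_{i1} = y_{i1}, Y_{i2} = y_{i2}, \dots, Y_{iT_i} = y_{iT_i} \,\Big|\, \sum_{t=1}^{T_i} y_{it}, \text{data}\Big) = \dfrac{\exp\big(\sum_{t=1}^{T_i} y_{it} x_{it}'\beta\big)}{\sum_{\Sigma_t d_{it} = S_i} \exp\big(\sum_{t=1}^{T_i} d_{it} x_{it}'\beta\big)}$

$\mathrm{Prob}\Big(Y_{i1} = 0, Y_{i2} = 0 \,\Big|\, \sum_{t=1}^{2} y_{it} = 0, \text{data}\Big) = 1$

$\mathrm{Prob}\Big(Y_{i1} = 1, Y_{i2} = 0 \,\Big|\, \sum_{t=1}^{2} y_{it} = 1, \text{data}\Big) = \dfrac{\exp(x_{i1}'\beta)}{\exp(x_{i1}'\beta) + \exp(x_{i2}'\beta)}$

$\mathrm{Prob}\Big(Y_{i1} = 0, Y_{i2} = 1 \,\Big|\, \sum_{t=1}^{2} y_{it} = 1, \text{data}\Big) = \dfrac{\exp(x_{i2}'\beta)}{\exp(x_{i1}'\beta) + \exp(x_{i2}'\beta)}$

$\mathrm{Prob}\Big(Y_{i1} = 1, Y_{i2} = 1 \,\Big|\, \sum_{t=1}^{2} y_{it} = 2, \text{data}\Big) = 1$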
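As a worked step, the two-period case shows directly why the fixed effect drops out. The derivation below follows from the logit probabilities above and independence across periods given $\alpha_i$; the last equality, which rewrites the result as a logit in the differenced covariates with $\Lambda$ the logistic cdf, is added here for illustration.

```latex
\begin{aligned}
\Pr\big(Y_{i1}=1,\,Y_{i2}=0 \mid y_{i1}+y_{i2}=1\big)
 &= \frac{\Pr(Y_{i1}=1,\,Y_{i2}=0)}{\Pr(Y_{i1}=1,\,Y_{i2}=0) + \Pr(Y_{i1}=0,\,Y_{i2}=1)} \\[4pt]
 &= \frac{e^{\alpha_i + x_{i1}'\beta}}{e^{\alpha_i + x_{i1}'\beta} + e^{\alpha_i + x_{i2}'\beta}} \\[4pt]
 &= \frac{e^{x_{i1}'\beta}}{e^{x_{i1}'\beta} + e^{x_{i2}'\beta}}
  = \Lambda\big((x_{i1}-x_{i2})'\beta\big).
\end{aligned}
```

The second line uses the fact that both joint probabilities share the factor $1/[(1+e^{\alpha_i + x_{i1}'\beta})(1+e^{\alpha_i + x_{i2}'\beta})]$, which cancels; the common factor $e^{\alpha_i}$ then cancels as well.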

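Example: Seven Period Binary Logit

$\mathrm{Prob}[y_i = (1,0,0,0,1,1,1) \mid X_i] = \dfrac{\exp(\alpha_i + x_{i1}'\beta)}{1 + \exp(\alpha_i + x_{i1}'\beta)} \times \dfrac{1}{1 + \exp(\alpha_i + x_{i2}'\beta)} \times \dots \times \dfrac{\exp(\alpha_i + x_{i7}'\beta)}{1 + \exp(\alpha_i + x_{i7}'\beta)}$

There are 35 different sequences of $y_{it}$ (permutations) that sum to 4. For example, $y^*_{it|p=1}$ might be (1,1,1,1,0,0,0). Etc.

$\mathrm{Prob}\Big[y_i = (1,0,0,0,1,1,1) \,\Big|\, X_i, \sum_{t=1}^{7} y_{it} = 4\Big] = \dfrac{\exp\big(\sum_{t=1}^{7} y_{it} x_{it}'\beta\big)}{\sum_{p=1}^{35} \exp\big(\sum_{t=1}^{7} y^*_{it|p} x_{it}'\beta\big)}$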

With T = 50, the number of permutations of sequences of y ranging from sum = 0 to sum = 50 ranges from 1 for 0 and 50, to $2.3 \times 10^{12}$ for 15 or 35, up to a maximum of $1.3 \times 10^{14}$ for sum = 25. These are the numbers of terms that must be summed for a model with T = 50. In the application below, the sum ranges from 15 to 35.
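These counts are binomial coefficients and can be reproduced with the standard library. The variable names are illustrative; `math.comb` requires Python 3.8 or later.

```python
import math

# Number of 0/1 sequences of length T with a given sum S is C(T, S).
terms_T7_S4 = math.comb(7, 4)      # the seven-period example
terms_T50_S0 = math.comb(50, 0)
terms_T50_S15 = math.comb(50, 15)  # same as C(50, 35) by symmetry
terms_T50_S25 = math.comb(50, 25)  # the largest count for T = 50
```

Because the count is symmetric in S and T - S, the burden is the same for an individual with 15 ones as for one with 35.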

The sample is 200 individuals each observed 50 times.

The data are generated from a probit process with b1 = b2 = .5, but the model is fit as a logit. The estimated coefficients obey the familiar relationship: logit ≈ 1.6 × probit.

Estimating Partial Effects

“The fixed effects logit estimator of β immediately gives us the effect of each element of x_i on the log-odds ratio… Unfortunately, we cannot estimate the partial effects… unless we plug in a value for α_i. Because the distribution of α_i is unrestricted – in particular, E[α_i] is not necessarily zero – it is hard to know what to plug in for α_i. In addition, we cannot estimate average partial effects, as doing so would require finding E[Λ(x_it β + α_i)], a task that apparently requires specifying a distribution for α_i.”

(Wooldridge, 2010)

MLE for Logit Constant Terms

Step 1. Estimate $\beta$ with Chamberlain's conditional estimator.

Step 2. Treating $\hat\beta$ as if it were known, estimate $\alpha_i$ from the first order condition

$\bar y_i = \dfrac{1}{T_i} \sum_{t=1}^{T_i} \dfrac{e^{\alpha_i} e^{x_{it}'\hat\beta}}{1 + e^{\alpha_i} e^{x_{it}'\hat\beta}} = \dfrac{1}{T_i} \sum_{t=1}^{T_i} \dfrac{c_{it}}{\delta_i + c_{it}}$

Estimate $\delta_i = 1/\exp(\alpha_i)$, so that $\alpha_i = -\log \delta_i$.

$c_{it} = \exp(x_{it}'\hat\beta)$ is treated as known data.

Solve one equation in one unknown for each $\alpha_i$.

Note there is no solution if $\bar y_i$ = 0 or 1.

Iterating back and forth does not maximize logL.
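Step 2 is a one-dimensional root-finding problem for each individual, because the right-hand side of the first order condition is strictly increasing in $\alpha_i$. The sketch below solves it by bisection; the function names, the search bracket, and the example index values are assumptions made for illustration, and any scalar root finder would serve equally well.

```python
import math


def logistic(z):
    """Numerically stable logistic cdf."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve_alpha(ybar, xb, lo=-30.0, hi=30.0, iterations=100):
    """Solve ybar = (1/T) * sum_t Lambda(alpha + xb_t) for alpha by bisection.

    ybar : group mean of y for the individual (must lie strictly between 0 and 1)
    xb   : index values x_it'beta_hat, treated as known data
    """
    if not 0.0 < ybar < 1.0:
        raise ValueError("no finite solution when the group mean is 0 or 1")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        mean_prob = sum(logistic(mid + v) for v in xb) / len(xb)
        if mean_prob < ybar:
            lo = mid   # fitted mean too small: alpha must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical individual with T = 4 periods and three ones.
example_alpha = solve_alpha(0.75, [0.2, -0.4, 0.1, 0.6])
```

The explicit check on the group mean reflects the note above: when every outcome is 0 or every outcome is 1, the equation has no finite root.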

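An Approximate Solution for E[α_i]

$P_{it} = \Lambda(\alpha_i + x_{it}'\beta)$

$\beta$ has been consistently estimated using $b$.

Let $\bar y_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$ estimate $P_i$, and $\bar x_i = \frac{1}{T} \sum_{t=1}^{T} x_{it}$.

To an error O(1/T), $\bar y_i$ estimates $\Lambda(\alpha_i + b'\bar x_i)$.

Overall, approximate plim $\frac{1}{nT} \sum_{i,t} \Lambda(\alpha_i + b'x_{it}) \approx \Lambda(\alpha^0 + b'\bar x)$ and use plim $\frac{1}{n} \sum_{i=1}^{n} \bar y_i$ = plim $\frac{1}{nT} \sum_{i,t} y_{it} = \bar y$. Solve for $\alpha^0 = \log \dfrac{\bar y}{1 - \bar y} - b'\bar x$.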

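An Approximate Solution for αi

$$P_{it} = \Lambda(\alpha_i + \mathbf{x}_{it}'\boldsymbol\beta)$$

The FOC for $\alpha_i$ is

$$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} \Lambda(\alpha_i + \mathbf{x}_{it}'\boldsymbol\beta)$$

(Note, no solution if $\bar{y}_i$ = 0 or 1.)

β has been consistently estimated using b.

Equate $\bar{y}_i$ to $\frac{1}{T}\sum_{t=1}^{T} \Lambda(\alpha_i + \mathbf{b}'\mathbf{x}_{it})$. Use a first order Taylor series around $\alpha^0$:

$$\hat{\alpha}_i \approx \alpha^0 + \frac{\bar{y}_i - \frac{1}{T}\sum_{t=1}^{T} \Lambda(\alpha^0 + \mathbf{b}'\mathbf{x}_{it})}{\frac{1}{T}\sum_{t=1}^{T} \Lambda(\alpha^0 + \mathbf{b}'\mathbf{x}_{it})\,[1 - \Lambda(\alpha^0 + \mathbf{b}'\mathbf{x}_{it})]}$$

(Will tend to give large but finite values when $\bar{y}_i$ = 0 or 1.)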
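The Taylor step is a single Newton update from the expansion point, so it stays finite even when all of an individual's outcomes are equal. A minimal sketch, assuming the index values b'x_it are available; the function name, the expansion point 0.25, and the data are hypothetical.

```python
import numpy as np

def logistic(z):
    """Logistic CDF, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def one_step_alpha(y_i, xb_i, alpha0):
    """First-order Taylor approximation to alpha_i around alpha0.

    y_i    : 0/1 outcomes for one individual
    xb_i   : index values b'x_it, with b taken as known
    alpha0 : expansion point, for example an estimate of E[alpha_i]
    """
    p0 = logistic(alpha0 + xb_i)          # Lambda evaluated at alpha0
    gap = np.mean(y_i) - np.mean(p0)      # ybar_i minus mean fitted probability
    slope = np.mean(p0 * (1.0 - p0))      # derivative of mean probability in alpha
    return alpha0 + gap / slope

xb_i = np.array([0.2, -0.5, 0.1, 0.7, -0.3])
alpha0 = 0.25                             # assumed value, for illustration only
interior = one_step_alpha(np.array([1, 0, 1, 1, 0]), xb_i, alpha0)
all_ones = one_step_alpha(np.ones(5), xb_i, alpha0)   # case ybar_i = 1
```

For the interior case the one-step value nearly satisfies the exact first order condition; for the all-ones case it is larger but finite, where the exact condition has no solution.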


Systematic overestimation by αi, or random noise? A second run of the experiment:


Fixed Effects Logit Health Model: Conditional vs. Unconditional


Incidental Parameters Problems: Conventional Wisdom

General: The unconditional MLE is biased in samples with fixed T, even when the FEM is the right model, except in special cases such as linear or Poisson regression. The conditional estimator (which bypasses estimation of αi) is consistent.

Specific: Upward bias (experience with probit and logit) in estimators of β.
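The upward bias can be seen numerically in the smallest case, T = 2. The sketch below simulates a fixed effects logit and compares the conditional estimator with the unconditional MLE, computed by maximizing out each αi for a given slope. Every design choice (n = 2000, β = 1, standard normal αi and xit, the seed, the search interval) is an assumption for illustration. For the logit with T = 2, a known result is that the unconditional estimate is exactly twice the conditional one, and the final ratio should reproduce that.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 2000, 1.0                       # assumed design, T = 2
alpha = rng.normal(size=n)                # individual effects
x = rng.normal(size=(n, 2))               # one regressor, two periods
p = 1.0 / (1.0 + np.exp(-(alpha[:, None] + beta * x)))
y = (rng.uniform(size=(n, 2)) < p).astype(float)

# Individuals with y_i1 + y_i2 equal to 0 or 2 carry no information on beta.
keep = y.sum(axis=1) == 1
xs, ys = x[keep], y[keep]

def loglik_conditional(b):
    """Chamberlain's conditional log likelihood for T = 2."""
    dx = xs[:, 1] - xs[:, 0]
    s = 2.0 * ys[:, 1] - 1.0              # +1 if the 1 occurs in period 2
    return -np.sum(np.log1p(np.exp(-s * dx * b)))

def loglik_profile(b):
    """Unconditional log likelihood with each alpha_i maximized out."""
    lo = np.full(len(xs), -50.0)
    hi = np.full(len(xs), 50.0)
    for _ in range(60):                   # bisection on the FOC for alpha_i
        mid = 0.5 * (lo + hi)
        f = (1.0 / (1.0 + np.exp(-(mid[:, None] + b * xs)))).sum(axis=1) - 1.0
        lo = np.where(f < 0, mid, lo)
        hi = np.where(f < 0, hi, mid)
    idx = (0.5 * (lo + hi))[:, None] + b * xs
    return np.sum(ys * idx - np.log1p(np.exp(idx)))

def argmax_1d(f, lo=-5.0, hi=5.0, iters=60):
    """Golden-section search for the maximizer of a concave function."""
    g = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc > fd:                       # maximizer lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:                             # maximizer lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

b_cond = argmax_1d(loglik_conditional)
b_unc = argmax_1d(loglik_profile)
ratio = b_unc / b_cond
```

Both objective functions are concave in the scalar slope, which is why a simple bracketing search is enough here.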


A Monte Carlo Study of the FE Estimator: Probit vs. Logit

Estimates of Coefficients and Marginal Effects at the Implied Data Means

Results are scaled so the desired quantities being estimated (β, δ, marginal effects) all equal 1.0 in the population.


Bias Correction Estimators

Motivation: Undo the incidental parameters bias in the fixed effects probit model:
(1) Maximize a penalized log likelihood function, or
(2) Directly correct the estimator of β

Advantages
• For (1), estimates αi, so enables partial effects
• Estimator is consistent under some circumstances
• (Possibly) corrects in dynamic models

Disadvantages
• No time invariant variables in the model
• Practical implementation
• Extension to other models? (Ordered probit model (maybe); see JBES 2009)


A Mundlak Correction for the FE Model

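Fixed Effects Model:

$$y_{it}^* = \alpha_i + \mathbf{x}_{it}'\boldsymbol\beta + \varepsilon_{it}, \quad i = 1,\dots,N;\; t = 1,\dots,T_i$$

$$y_{it} = 1 \text{ if } y_{it}^* > 0,\; 0 \text{ otherwise.}$$

Mundlak (Wooldridge, Heckman, Chamberlain), ...

$$\alpha_i = \alpha + \bar{\mathbf{x}}_i'\boldsymbol\delta + u_i \quad \text{(Projection, not necessarily conditional mean)}$$

where u is normally distributed with mean zero and standard deviation $\sigma_u$ and is uncorrelated with $\bar{\mathbf{x}}_i$ or $(\mathbf{x}_{i1}, \mathbf{x}_{i2}, \dots, \mathbf{x}_{iT})$.

Reduced form random effects model

$$y_{it}^* = \alpha + \bar{\mathbf{x}}_i'\boldsymbol\delta + \mathbf{x}_{it}'\boldsymbol\beta + \varepsilon_{it} + u_i, \quad i = 1,\dots,N;\; t = 1,\dots,T_i$$

$$y_{it} = 1 \text{ if } y_{it}^* > 0,\; 0 \text{ otherwise.}$$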
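In practice the reduced form only requires adding each individual's time-averaged regressors to the design matrix before fitting a random effects model. The sketch below shows that data step for an unbalanced panel; the function name and the toy panel are hypothetical, and the estimation step itself (a random effects probit routine) is not shown.

```python
import numpy as np

def add_group_means(ids, X):
    """Append each individual's time-averaged regressors to every row.

    ids : individual identifier per row, shape (N_obs,)
    X   : time-varying regressors, shape (N_obs, K)
    Returns the augmented matrix [X, Xbar_i], shape (N_obs, 2K).
    """
    _, inverse, counts = np.unique(ids, return_inverse=True, return_counts=True)
    sums = np.zeros((counts.size, X.shape[1]))
    np.add.at(sums, inverse, X)            # sum rows within each individual
    means = sums / counts[:, None]
    return np.hstack([X, means[inverse]])  # copy each mean back to its rows

# Hypothetical unbalanced panel: individual 1 has 3 periods, individual 2 has 2.
ids = np.array([1, 1, 1, 2, 2])
X = np.array([[1.0], [2.0], [3.0], [10.0], [20.0]])
Z = add_group_means(ids, X)
```

The coefficients on the appended columns play the role of the projection coefficients in the reduced form, so a joint test that they are zero is a test of the random effects restriction.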


Mundlak Correction


A Variable Addition Test for FE vs. RE

The Wald statistic of 45.27922 and the likelihood ratio statistic of 40.280 are both far larger than 11.07, the 5 percent critical value of the chi-squared distribution with 5 degrees of freedom. The hypothesis that the group means have zero coefficients is rejected, which suggests that for these data, the fixed effects model is the preferred framework.
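The comparison with the critical value can be checked with a few lines of arithmetic. This sketch uses only the standard library and a closed form for the chi-squared upper tail that holds for 5 degrees of freedom specifically; the function name is hypothetical, and the two statistics are the values reported above.

```python
import math

def chi2_sf_5df(x):
    """Upper-tail probability of a chi-squared variate with 5 degrees of freedom.

    Closed form for this odd number of degrees of freedom; not valid for other df.
    """
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * (1.0 + x / 3.0))

wald, lr, critical = 45.27922, 40.280, 11.07
p_wald = chi2_sf_5df(wald)                # p-value of the Wald statistic
p_lr = chi2_sf_5df(lr)                    # p-value of the LR statistic
reject = wald > critical and lr > critical
```

Evaluating the tail function at 11.07 returns approximately 0.05, which confirms that 11.07 is the 5 percent critical value used in the comparison.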


Fixed Effects Models Summary

• Incidental parameters problem if T < 10 (roughly)
• Inconvenience of computation
• Appealing specification
• Alternative semiparametric estimators?
    • Theory not well developed for T > 2
    • Informative only about slopes (not, e.g., predictions or marginal effects)
• Ignoring the heterogeneity definitely produces an inconsistent estimator (even with cluster correction!)
• A Hobson’s choice
• The Mundlak correction is a useful, commonly used approach.