Econometric Methods and Applications I, Lecture 4

Non-linear regression models

a) The probit and tobit models as examples
b) Interpretation of the models
c) Relevant estimation methods (ML)
d) Final considerations about regressions

Literature: Wooldridge (2002, 2010): 15.1-15.4, 15.6, 17.1-17.4


Introduction (1)

> If the conditional independence assumption (CIA) is valid, the causal parameter in the simplest binary-D framework is obtained by estimating

$$E(Y \mid X = x, D = 1) - E(Y \mid X = x, D = 0)$$

and aggregating over the appropriate distribution of X

> Now we consider the case in which a linear approximation of these conditional expectations is inadequate

Introduction (2)

> Example

• True conditional expectation of Y: $E(Y \mid X = x) = g(x, \theta)$

• Function used in estimation by linear regression: $x\beta$

• Implied error term by function used in estimation:

$$U = Y - X\beta \;\Rightarrow\; E(U \mid X = x) = E(Y - X\beta \mid X = x) = E(Y \mid X = x) - x\beta = g(x, \theta) - x\beta \neq 0$$

> Implied error term is correlated with variables used in estimation ⇒ OLS/FGLS is inconsistent

> Non-linear models may lead to a better approximation of $g(x, \theta)$ and may avoid inconsistent estimation
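The example above can be illustrated with a small simulation. This is only a sketch under assumed values: the functional form g(x, θ) = exp(θx) with θ = 0.5, the standard normal X, and the sample size are illustrative choices, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)

# Assumed true conditional expectation: g(x, theta) = exp(theta * x), theta = 0.5
y = np.exp(0.5 * x) + rng.normal(size=n)

# Linear regression of y on a constant and x (OLS via least squares)
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat

# The residuals average to zero overall, but not conditional on x:
# the implied error g(x, theta) - x*beta varies systematically with x
overall_mean = u_hat.mean()
tail_mean = u_hat[x > 1.5].mean()            # positive: the line underpredicts here
middle_mean = u_hat[np.abs(x) < 0.5].mean()  # negative: the line overpredicts here
print(overall_mean, tail_mean, middle_mean)
```

The conditional means of the residuals differ from zero in different regions of x, which is the sample counterpart of E(U | X = x) ≠ 0.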

Probit model (1-1)

> Example 1: With a binary outcome variable, linear regression is generally not attractive, because

$$E(Y \mid X = x) = 1 \times P(Y = 1 \mid X = x) + 0 \times P(Y = 0 \mid X = x) = P(Y = 1 \mid X = x)$$

• A probability is bounded between zero and one

• The probability (usually) corresponds to a cumulative distribution function (details later), which is generally not linear in its arguments (exception: uniform distribution)
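A minimal sketch of the first point, assuming a data-generating process of our own choosing (P(Y = 1 | X = x) = Φ(1.5x) with X normal with standard deviation 2): a linear regression fitted to a binary outcome produces "probabilities" outside the unit interval.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(scale=2.0, size=n)

# Assumed true response probability (illustrative, not from the lecture)
p_true = norm.cdf(1.5 * x)
y = (rng.uniform(size=n) < p_true).astype(float)

# Linear probability model: OLS of the binary outcome on a constant and x
X = np.column_stack([np.ones(n), x])
b_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)
p_lpm = X @ b_lpm

# Share of fitted values that are not valid probabilities
share_outside = np.mean((p_lpm < 0) | (p_lpm > 1))
print(p_lpm.min(), p_lpm.max(), share_outside)
```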


Probit model (2)

> Model for $P(Y = 1 \mid X = x)$ based on a linear index model and a non-linear link function

$$Y = 1(X\beta + U > 0) \;\Rightarrow$$

$$\begin{aligned}
E(Y \mid X = x) &= P(Y = 1 \mid X = x) \\
&= P(X\beta + U > 0 \mid X = x) \\
&= P(U > -X\beta \mid X = x) \\
&= 1 - P(U \le -X\beta \mid X = x) \\
&= 1 - F(-x\beta)
\end{aligned}$$

> F(.) denotes the cdf of U

Probit model (3)

> By making different distributional assumptions concerning F, we obtain different models (logit, probit, etc.)

> Only in one special case (U is distributed uniformly in a fixed interval) is F(.) a linear function (linear probability model)

> Generally, this expression simplifies for symmetric distributions:

$$E(Y \mid X = x) = 1 - F(-x\beta) \overset{U \text{ symmetric}}{=} F(x\beta)$$

Probit model (4)

> By assuming that U is normally distributed with mean zero and variance $\sigma^2$, we obtain the probit model:

$$E(Y \mid X = x) = \Phi\left(\frac{x\beta}{\sigma}\right), \qquad \Phi: \text{cdf of standard normal distribution}$$

Remember: $A \sim N(0, \sigma^2) \Rightarrow P(A \le a) = \Phi\left(\frac{a}{\sigma}\right)$

Probit model (5)

> General identification problem of binary choice models

• The following two models lead to the same dependent variable:

$$Y = 1(X\beta + U > 0) \quad \text{and} \quad Y = 1(X\beta\sigma + U\sigma > 0); \quad \sigma > 0$$

• Therefore, it is impossible to distinguish the models empirically

• Some (convenient) normalisation is needed, usually $\sigma = 1$
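The identification problem can be checked numerically. In this sketch the coefficient value and the scaling factors are arbitrary illustrative choices: multiplying both the index and the error by any σ > 0 leaves every observed Y unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
x = rng.normal(size=n)
u = rng.normal(size=n)
beta = 0.8  # illustrative coefficient

# Baseline model: Y = 1(x*beta + U > 0)
y_base = x * beta + u > 0

# Rescaled models: Y = 1(x*beta*sigma + U*sigma > 0) for several sigma > 0
same_outcome = [
    np.array_equal(y_base, x * beta * sigma + u * sigma > 0)
    for sigma in (0.1, 2.0, 50.0)
]
print(same_outcome)
```

Because the data cannot distinguish (β, σ) from (βσ, σ²-scaled errors), only the ratio β/σ is estimable, which is why a normalisation is imposed.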

Tobit model (1-1)

> Second example for a non-linear model: Tobit

• Motivation: Some dependent variables cannot fall below (rise above) some threshold

− e.g. earnings cannot be negative

> Again, modelling is based on a latent linear index:

$$Y = 1(X\beta + U > 0)\,\underbrace{(X\beta + U)}_{Y^*}$$


Tobit model (2)

> Assume U is normally distributed with mean 0 and variance $\sigma^2$

$$E(Y \mid X = x) = \Phi\left(\frac{x\beta}{\sigma}\right) x\beta + \sigma\,\phi\left(\frac{x\beta}{\sigma}\right), \qquad \phi: \text{pdf of standard normal distribution}$$

> Derivation of E(Y | X) is somewhat complex (Wooldridge, 2002, Ch. 16.2)

> Consider only the subpopulation having positive values of y:

$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\left(\frac{x\beta}{\sigma}\right)}{\Phi\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\left(\frac{x\beta}{\sigma}\right)$$
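Both conditional means can be checked against a simulation. The following is a sketch only; the index value xβ = 0.7 and σ = 2 are assumed for illustration, and the function names are ours.

```python
import numpy as np
from scipy.stats import norm

def tobit_mean(xb, sigma):
    """E(Y | X = x) for Y = 1(Y* > 0) Y*, Y* = x*beta + U, U ~ N(0, sigma^2)."""
    c = xb / sigma
    return norm.cdf(c) * xb + sigma * norm.pdf(c)

def tobit_mean_positive(xb, sigma):
    """E(Y | X = x, Y > 0) = x*beta + sigma * lambda(x*beta / sigma)."""
    c = xb / sigma
    return xb + sigma * norm.pdf(c) / norm.cdf(c)

rng = np.random.default_rng(3)
xb, sigma = 0.7, 2.0  # illustrative index value and error scale

# Simulate the censored outcome at a fixed x
y = np.maximum(0.0, xb + rng.normal(scale=sigma, size=1_000_000))
sim_mean = y.mean()
sim_mean_pos = y[y > 0].mean()

print(sim_mean, tobit_mean(xb, sigma))
print(sim_mean_pos, tobit_mean_positive(xb, sigma))
```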

Tobit model (3)

> Estimation in the complete sample

$$E(Y \mid X = x) = \Phi\left(\frac{x\beta}{\sigma}\right) x\beta + \sigma\,\phi\left(\frac{x\beta}{\sigma}\right)$$

> OLS is inconsistent because of the neglected nonlinearity $\Phi\left(\frac{x\beta}{\sigma}\right)$ and the omitted variable $\phi\left(\frac{x\beta}{\sigma}\right)$

Tobit model (4)

> Estimation in the subsample with Y > 0

$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\left(\frac{x\beta}{\sigma}\right)}{\Phi\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\left(\frac{x\beta}{\sigma}\right)$$

> OLS in the population with positive Y is inconsistent because of the omitted variable

$$\lambda\left(\frac{x\beta}{\sigma}\right) = \frac{\phi\left(\frac{x\beta}{\sigma}\right)}{\Phi\left(\frac{x\beta}{\sigma}\right)}$$

Effects of interest (1)

> In case of a binary treatment D, we want to compute

$$E[\theta(x) \mid D = d] = E\Big[\underbrace{E(Y \mid D = 1, X = x) - E(Y \mid D = 0, X = x)}_{\theta(x)} \,\Big|\, D = d\Big]$$

> Assume the simplest model without D × X interactions

$$Y^* = X\beta + D\gamma + U; \qquad U \sim N(0, \sigma^2)$$

$$\text{Probit:} \quad \theta^{\text{Probit}}(x) = \Phi\left(\frac{x\beta + \gamma}{\sigma}\right) - \Phi\left(\frac{x\beta}{\sigma}\right)$$

$$\text{Tobit:} \quad \theta^{\text{Tobit}}(x) = \Phi\left(\frac{x\beta + \gamma}{\sigma}\right)(x\beta + \gamma) + \sigma\,\phi\left(\frac{x\beta + \gamma}{\sigma}\right) - \Phi\left(\frac{x\beta}{\sigma}\right) x\beta - \sigma\,\phi\left(\frac{x\beta}{\sigma}\right)$$

Effects of interest (2)

> If D is continuous, we may be more interested in how the conditional expectation changes for very small changes in D

$$\text{Probit:} \quad \frac{\partial E(Y \mid D = d, X = x)}{\partial d} = \phi\left(\frac{x\beta + d\gamma}{\sigma}\right) \frac{\gamma}{\sigma}$$

$$\text{Tobit:} \quad \frac{\partial E(Y \mid D = d, X = x)}{\partial d} = \Phi\left(\frac{x\beta + d\gamma}{\sigma}\right) \gamma$$
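As a sanity check, the two derivative formulas can be compared with central finite differences of the corresponding conditional expectations. This is a sketch with arbitrary parameter values; the function names are hypothetical helpers, not part of any library.

```python
import numpy as np
from scipy.stats import norm

def probit_mean(xb, d, gamma, sigma):
    """E(Y | D = d, X = x) in the probit model."""
    return norm.cdf((xb + d * gamma) / sigma)

def tobit_mean(xb, d, gamma, sigma):
    """E(Y | D = d, X = x) in the tobit model."""
    idx = xb + d * gamma
    return norm.cdf(idx / sigma) * idx + sigma * norm.pdf(idx / sigma)

def probit_marginal(xb, d, gamma, sigma):
    """Analytic derivative with respect to d (probit)."""
    return norm.pdf((xb + d * gamma) / sigma) * gamma / sigma

def tobit_marginal(xb, d, gamma, sigma):
    """Analytic derivative with respect to d (tobit)."""
    return norm.cdf((xb + d * gamma) / sigma) * gamma

# Illustrative values (assumed, not from the lecture)
xb, d, gamma, sigma = 0.3, 1.2, -0.5, 1.5
h = 1e-6

num_probit = (probit_mean(xb, d + h, gamma, sigma)
              - probit_mean(xb, d - h, gamma, sigma)) / (2 * h)
num_tobit = (tobit_mean(xb, d + h, gamma, sigma)
             - tobit_mean(xb, d - h, gamma, sigma)) / (2 * h)

print(num_probit, probit_marginal(xb, d, gamma, sigma))
print(num_tobit, tobit_marginal(xb, d, gamma, sigma))
```

In both models the derivative is γ times a strictly positive factor, so its sign equals the sign of γ while its size depends on x, d and the other parameters.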

Effects of interest (3)

> The coefficient $\gamma$ is informative about the sign of the effect, but not about its magnitude, which also depends on the other coefficients and the control variables (this is different in the linear regression model)

Estimation (1)

> Minimizing the squared deviation between actual and predicted individual outcomes (least squares principle)

• is not efficient (due to the implied heteroscedasticity)

> Probit and Tobit models are usually estimated by maximum likelihood or the generalized method of moments

• both estimation methods will be discussed in more detail in Econometric Methods and Applications II

Estimation (2)

> Basic idea of Maximum Likelihood (ML)

• Choose the unknown coefficients in such a way that the observed sample is most likely to come from an underlying population described by the chosen values of the coefficients

> Properties of ML

• When the model is correctly specified and some further regularity conditions are met, ML is consistent, asymptotically efficient, and asymptotically normally distributed
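The ML idea can be sketched for the probit model with σ normalised to 1. The simulated data, the true coefficient values, and the choice of a generic BFGS optimiser with numerical gradients are all assumptions made for illustration; dedicated routines in statistical packages would normally be used instead.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_negloglik(beta, X, y):
    """Negative probit log-likelihood (sigma normalised to 1)."""
    idx = X @ beta
    # log P(Y=1|x) = log Phi(x*beta); log P(Y=0|x) = log(1 - Phi(x*beta)) = log Phi(-x*beta)
    return -np.sum(y * norm.logcdf(idx) + (1 - y) * norm.logcdf(-idx))

# Simulated data with assumed true coefficients
rng = np.random.default_rng(4)
n = 20_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.3, -0.8])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(float)

# Choose the coefficients that make the observed sample most likely
res = minimize(probit_negloglik, x0=np.zeros(2), args=(X, y), method="BFGS")
beta_ml = res.x
print(beta_ml)
```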

Estimation (3)

> Basic idea of the Generalized Method of Moments (GMM)

• The model implies that the residual V (not U) has conditional expectation 0

$$E\Big[\underbrace{Y - E(Y \mid X = x)}_{V} \,\Big|\, X = x\Big] = E(V \mid X = x) = 0; \qquad \text{Probit:} \quad V = Y - \Phi\left(\frac{X\beta}{\sigma}\right)$$

• Thus, it is uncorrelated with all functions of X. This defines a set of moment conditions (equalities) that hold in the population for the true parameters

• Choose the parameters such that the sample analogues of those moments (very often mean functions) come as close as possible to fulfilling the same conditions in the sample

• Under correct specification, GMM is (usually) $\sqrt{N}$-consistent and asymptotically normally distributed
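A minimal sketch of the moment idea for the probit residual, assuming σ = 1 and using X itself as the functions of X that V must be uncorrelated with (one possible choice, which makes the problem exactly identified, so the sample moments can be set exactly to zero). Data and parameter values are simulated and illustrative.

```python
import numpy as np
from scipy.optimize import root
from scipy.stats import norm

def sample_moments(beta, X, y):
    """Sample analogue of E[X' V] = 0 with V = Y - Phi(X beta), sigma = 1."""
    v = y - norm.cdf(X @ beta)
    return X.T @ v / len(y)

# Simulated data with assumed true coefficients
rng = np.random.default_rng(4)
n = 20_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.3, -0.8])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(float)

# Solve the two sample moment conditions for the two parameters
sol = root(sample_moments, x0=np.zeros(2), args=(X, y))
beta_mm = sol.x
moments_at_solution = sample_moments(beta_mm, X, y)
print(beta_mm, moments_at_solution)
```

With more moment conditions than parameters, the moments can no longer all be set to zero and a weighted quadratic form in the sample moments is minimised instead.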

Estimation (4)

> Probit and Tobit models are usually estimated by ML

> Tobit model: There is a particularly simple 2-step GMM estimator when considering observations with positive y:

$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\frac{\phi\left(\frac{x\beta}{\sigma}\right)}{\Phi\left(\frac{x\beta}{\sigma}\right)} = x\beta + \sigma\,\lambda\left(\frac{x\beta}{\sigma}\right)$$

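Estimation (5)

$$E(Y \mid X = x, Y > 0) = x\beta + \sigma\,\lambda\left(\frac{x\beta}{\sigma}\right)$$

> 1st step:

• Estimate a probit model (for the indicator 1(Y > 0), using all observations) to obtain consistent estimates of $\beta / \sigma$

• Use them to compute a consistent estimate of $\lambda\left(\frac{x\beta}{\sigma}\right)$ for every observation:

$$\hat\lambda_i = \lambda\left(x_i \widehat{\tfrac{\beta}{\sigma}}\right) = \frac{\phi\left(x_i \widehat{\tfrac{\beta}{\sigma}}\right)}{\Phi\left(x_i \widehat{\tfrac{\beta}{\sigma}}\right)}$$

> 2nd step

• Use $\hat\lambda_i$ as additional regressor in the regression of Y on X in the subsample with Y > 0 (Heckit)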
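The two steps can be sketched as follows. Everything here is an illustrative assumption: the simulated design, the true values β = (0, 1) and σ = 1, and the use of a generic optimiser for the first-step probit. Standard errors, which would need to account for the estimated regressor, are not computed.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Simulated tobit data (assumed design and parameter values)
rng = np.random.default_rng(5)
n = 50_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.0, 1.0]), 1.0
y = np.maximum(0.0, X @ beta_true + rng.normal(scale=sigma_true, size=n))
pos = y > 0

# Step 1: probit of 1(Y > 0) on X in the full sample estimates beta / sigma
def negloglik(delta):
    idx = X @ delta
    return -np.sum(np.where(pos, norm.logcdf(idx), norm.logcdf(-idx)))

delta_hat = minimize(negloglik, x0=np.zeros(2), method="BFGS").x

# Estimated lambda(x_i * beta / sigma) = phi(.) / Phi(.) for every observation
idx_hat = X @ delta_hat
lam_hat = norm.pdf(idx_hat) / norm.cdf(idx_hat)

# Step 2: OLS of Y on X and lambda_hat in the subsample with Y > 0;
# the coefficient on lambda_hat estimates sigma
Z = np.column_stack([X[pos], lam_hat[pos]])
coef, *_ = np.linalg.lstsq(Z, y[pos], rcond=None)
beta_2step, sigma_2step = coef[:2], coef[2]

# For comparison: OLS on the positive subsample without lambda_hat
beta_naive, *_ = np.linalg.lstsq(X[pos], y[pos], rcond=None)
print(beta_2step, sigma_2step, beta_naive)
```

In this design the naive subsample regression visibly understates the slope, whereas adding the estimated λ recovers values close to the assumed truth.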

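Computing the effects of interest (1)

$$\hat\theta^{\text{Probit}}(x_i) = \Phi\left(x_i \widehat{\tfrac{\beta}{\sigma}} + \widehat{\tfrac{\gamma}{\sigma}}\right) - \Phi\left(x_i \widehat{\tfrac{\beta}{\sigma}}\right)$$

$$\hat\theta^{\text{Tobit}}(x_i) = \Phi\left(\frac{x_i\hat\beta + \hat\gamma}{\hat\sigma}\right)(x_i\hat\beta + \hat\gamma) + \hat\sigma\,\phi\left(\frac{x_i\hat\beta + \hat\gamma}{\hat\sigma}\right) - \Phi\left(\frac{x_i\hat\beta}{\hat\sigma}\right) x_i\hat\beta - \hat\sigma\,\phi\left(\frac{x_i\hat\beta}{\hat\sigma}\right)$$

$$\widehat{ATE} = \frac{1}{N} \sum_{i=1}^{N} \hat\theta(x_i)$$

$$\widehat{ATET} = \frac{1}{N_1} \sum_{i=1}^{N} d_i\,\hat\theta(x_i)$$

$$\widehat{ATENT} = \frac{1}{N - N_1} \sum_{i=1}^{N} (1 - d_i)\,\hat\theta(x_i)$$

where $N_1 = \sum_{i=1}^{N} d_i$ is the number of treated observations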
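The averaging step can be sketched as below. The covariate values, treatment indicators and "estimates" are made-up numbers standing in for the output of a fitted model, and the function names are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def theta_probit(xb_s, gamma_s):
    """Probit effect of D at index xb_s = x*beta/sigma, with gamma_s = gamma/sigma."""
    return norm.cdf(xb_s + gamma_s) - norm.cdf(xb_s)

def theta_tobit(xb, gamma, sigma):
    """Tobit effect of D: difference of E(Y | D, X = x) between D = 1 and D = 0."""
    def mean(idx):
        return norm.cdf(idx / sigma) * idx + sigma * norm.pdf(idx / sigma)
    return mean(xb + gamma) - mean(xb)

def average_effects(theta_i, d):
    """Average individual effects over all, treated, and non-treated observations."""
    d = np.asarray(d, dtype=float)
    n, n1 = len(d), d.sum()
    return {
        "ATE": theta_i.mean(),
        "ATET": (d * theta_i).sum() / n1,
        "ATENT": ((1 - d) * theta_i).sum() / (n - n1),
    }

# Made-up data and coefficient values standing in for estimates
x = np.array([-1.0, 0.0, 0.5, 2.0])
d = np.array([1, 0, 1, 0])
beta_hat, gamma_hat, sigma_hat = 0.8, 0.5, 1.0

eff_tobit = average_effects(theta_tobit(x * beta_hat, gamma_hat, sigma_hat), d)
eff_probit = average_effects(theta_probit(x * beta_hat, gamma_hat), d)
print(eff_tobit)
print(eff_probit)
```

Since the effect θ(x) varies with x, the three averages generally differ whenever treated and non-treated observations have different covariate distributions.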

Computing the effects of interest (3)

> The same logic applies to the continuous outcomes

$$\text{Probit:} \quad \frac{\partial \hat{E}(Y \mid D = d, X = x)}{\partial d} = \phi\left(x \widehat{\tfrac{\beta}{\sigma}} + d\,\widehat{\tfrac{\gamma}{\sigma}}\right) \widehat{\tfrac{\gamma}{\sigma}}$$

$$\text{Tobit:} \quad \frac{\partial \hat{E}(Y \mid D = d, X = x)}{\partial d} = \Phi\left(\frac{x\hat\beta + d\hat\gamma}{\hat\sigma}\right) \hat\gamma$$

> Averaging is over the various populations as before

> Most statistical software packages also provide the values of these derivatives or discrete changes of D (and other X) for a particular value of D and X, usually the sample mean

Final considerations about regressions (1)

> Linear and non-linear regressions are tools to remove differences in the outcome variables due to observable variables

> Whether this is enough to uncover causal effects depends on the (non-)existence of other (unobservable) differences also related to selection (i.e. other factors influencing D and Y)

> Regressions uncover causal effects if

• conditional expectations are of a known linear or non-linear form

$$E[Y - E(Y \mid X = x) \mid X = x] = E[Y - g(X, \theta) \mid X = x] = E(U \mid X = x) = 0$$

• CIA holds

Final considerations about regressions (2)

> Regressions for causal inference

• If effect heterogeneity is expected, include (enough) interaction terms D × X

• Always check whether the coefficient has an interpretation as an effect or whether more complex calculations are required, in particular in models with D × X interactions and in non-linear models

• If unsure about non-linearities, or if substantial effect heterogeneity of unknown form is expected ⇒ use more flexible semi- or non-parametric methods (as discussed in the course “Flexible estimation in practice”)!

Final considerations about regressions (3)

> Regressions for causal inference: Fatal mistakes

• Bad controls

− Conditioning on variables influenced by D (simultaneous equation bias)

− Controls measured with error related to D (measurement error bias)

• Missing variables

− Variables related to Y and D are not in the data (omitted variable bias)

• Specification error of the conditional expectation functions acts like a missing variable (or a measurement error)

− The misspecified part of the true regression becomes part of the error term and may violate the E(U | X) = 0 condition