
    HE3021, NTU Lecture 7 FENG Qu

    LECTURE 7: BINARY CHOICE MODELS 

    1 Modeling Choice Decision

    2 Binary Choice Models: LPM, Probit, Logit

    3 Estimation and Testing


    1. Modeling Choice Decision

Example of a (binary) choice decision: having another child

Other examples:

• female labor force participation: work or be a housewife

• marriage decision: married or single

• applying for mortgage: accepted or denied

• admitted or not by NTU

• go to graduate school or work: yes or no

• vote or not vote

• buy or not buy

(Q: Common feature?)

Binary choice: y = 1 (yes) or y = 0 (no), where y is the decision variable.


    How to model people’s choice decision?

Example: having another child

Economics: benefit vs. (opportunity) cost analysis (Gary Becker, 1981)

$y = 1$ for having another child; $y = 0$ for not having one.

$u_1$ denotes the benefit from having another child, e.g., the policy packages above, tax rebates, the baby bonus, a bigger HDB flat, subsidized childcare, happiness, etc.

$u_0$ denotes the (opportunity) cost, e.g., the economic costs (of pregnancy, birth, child-raising, education, healthcare), leisure, forgone pay increases, etc.

Then, the decision can be modeled by:

$y = 1$ if $u_1 - u_0 > 0$; $y = 0$ otherwise


    Another example: female labor force participation

    Economics: the outcome of a market process

• Demanders: offer a wage based on labor's expected marginal product

• Suppliers: decide whether or not to accept the offer, depending on whether it exceeds their own reservation wage

From the woman's side, being in the labor force or not is a trade-off between the wage and the alternatives: taking care of the kids, housekeeping, leisure, ….

$y = 1$ for working; $y = 0$ for not being in the labor force. If $u_1$ denotes the utility from working and $u_0$ denotes the utility from staying at home, then the decision can be modeled by:

$y = 1$ if $u_1 - u_0 > 0$; $y = 0$ otherwise


As in linear regression models, suppose that the difference $u_1 - u_0$ can be explained by observable characteristics and an unobserved error term:

$u_1 - u_0 = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + e.$

Thus,

$P(y = 1 \mid x) = P(u_1 - u_0 > 0) = P(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + e > 0)$

Suppose the random variable $e$ follows a distribution with CDF $F(\cdot)$. Thus

$P(y = 1 \mid x) = P(e > -(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k))$

$= 1 - F(-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k))$

If the distribution is symmetric, then $1 - F(-z) = F(z)$, so

$P(y = 1 \mid x) = F(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$ and

$P(y = 0 \mid x) = 1 - P(y = 1 \mid x) = 1 - F(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k).$

E.g., if $e$ follows the standard normal distribution, then $F(\cdot) = \Phi(\cdot)$, which gives the probit model below; a standard logistic $e$ gives the logit model.

2. Binary Choice Models: LPM, Probit, Logit

LPM: OLS with a binary dependent variable ($y = 1$ or $0$)

$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + u_i, \quad i = 1, \ldots, n.$

    Example 1: birth intention in Singapore (data set: babybonus.dta)

dep. var. (y): yb = 1 for intention to have another child; 0 otherwise

indep. var. (x): (a scale measure of) the policy package (cpl), current number of children (num), husband's monthly income (hmi), wife's education (we), wife's age (wa)

. reg yb cpl num hmi we wa

      Source |       SS       df       MS              Number of obs =     122
-------------+------------------------------           F(  5,   116) =   12.96
       Model |  10.9213986     5  2.18427972           Prob > F      =  0.0000
    Residual |  19.5458145   116  .168498401           R-squared     =  0.3585
-------------+------------------------------           Adj R-squared =  0.3308
       Total |  30.4672131   121   .25179515           Root MSE      =  .41049

------------------------------------------------------------------------------
          yb |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         cpl |   .1865787    .076402     2.44   0.016     .0352548    .3379026
         num |  -.2759397   .0454467    -6.07   0.000    -.3659526   -.1859267
         hmi |  -.0005362   .0362719    -0.01   0.988    -.0723773    .0713049
          we |  -.0551932   .0474809    -1.16   0.247    -.1492351    .0388486
          wa |  -.0175984   .0085417    -2.06   0.042    -.0345164   -.0006805
       _cons |   1.521159   .2663598     5.71   0.000     .9935997    2.048718
------------------------------------------------------------------------------


Result:

The policy package (cpl) has a large positive effect on people's birth intention.

(Q: how to interpret the coefficient .187?)

Prediction:

Stata command: predict yhat, xb

(Q: What does the predicted value $\hat{y} = 0.499$ mean? What about 1.121 and -.016?)
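A minimal sketch of the full prediction step as a do-file (assuming babybonus.dta is in the working directory; the summary shows whether any fitted values fall outside [0, 1]):

    use babybonus.dta, clear
    reg yb cpl num hmi we wa
    predict yhat, xb     // fitted value = predicted probability in the LPM
    sum yhat             // min and max reveal predictions below 0 or above 1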


First, for the LPM, $E[y \mid x] = P(y = 1 \mid x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$, so

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k = \widehat{P}(y = 1 \mid x)$

The predicted value $\hat{y}$ is the predicted probability of "success" (having another child); 0.499 is couple 1's predicted probability of having another child, given the other factors.

Second, $\beta_j$ is the partial effect of $x_j$ on the probability of "success" $P(y = 1 \mid x)$:

$\beta_j = \dfrac{\partial E[y \mid x]}{\partial x_j} = \dfrac{\partial P(y = 1 \mid x)}{\partial x_j}, \quad j = 1, \ldots, k$

$\hat{\beta}_j$: the estimated partial effect (the ceteris paribus interpretation)

E.g., $\hat{\beta}_1 = .187$ can be interpreted as: one additional unit of the policy package increases the probability of having another child by 18.7 percentage points, holding other factors fixed.


    Example 2: Women’s Labor Force Participation (data set: MROZ.dta)

$inlf_i = \beta_0 + \beta_1 \, nwifeinc_i + \beta_2 \, educ_i + \beta_3 \, kidslt6_i + u_i, \quad i = 1, \ldots, n$

inlf: 1 for being in the labor force and 0 for being a housewife;

    nwifeinc: husband’s earning

    educ: wife’s education

kidslt6: number of children less than 6 years old

    Estimation: run the multiple regression:

. reg inlf nwifeinc educ kidslt6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  3,   749) =   34.38
       Model |  22.3586557     3  7.45288523           Prob > F      =  0.0000
    Residual |    162.3691   749  .216781175           R-squared     =  0.1210
-------------+------------------------------           Adj R-squared =  0.1175
       Total |  184.727756   752  .245648611           Root MSE      =   .4656

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0077404    .001519    -5.10   0.000    -.0107224   -.0047583
        educ |   .0572465   .0077912     7.35   0.000     .0419513    .0725418
     kidslt6 |  -.2227047   .0325987    -6.83   0.000    -.2867004    -.158709
       _cons |   .0737593   .0931678     0.79   0.429    -.1091417    .2566604
------------------------------------------------------------------------------
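E.g., the kidslt6 estimate says that each additional child under 6 lowers the participation probability by about 22.3 percentage points, holding nwifeinc and educ fixed.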


Advantages of LPM:

• easy to implement: OLS

• simple to interpret results

• straightforward to test hypotheses

• OLS estimator is consistent

(Q: why?)

Disadvantages of LPM:

1. heteroskedasticity: use heteroskedasticity-robust inference (exercise; see the sketch below)

2. the predicted probability $\hat{y}$ could be < 0 or > 1!
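A minimal sketch of the robust-inference fix for Example 2 as a do-file (assuming MROZ.dta is in the working directory):

    use MROZ.dta, clear
    reg inlf nwifeinc educ kidslt6, vce(robust)   // same coefficients, heteroskedasticity-robust SEs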


    Graphic Interpretation of LPM

Example: explain mortgage application outcomes by the debt payments-to-income (P/I) ratio

LPM: [figure: binary mortgage application outcome plotted against the P/I ratio, with the fitted linear probability line]


    Probit and Logit Models

LPM model:

$E[y \mid x] = P(y = 1 \mid x) = G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$

Probit model: standard normal CDF $\Phi(\cdot)$

$G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k) = \Phi(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$

Logit model: logistic CDF

$G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k) = \dfrac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}$

For CDFs, $0 \le \Phi(\cdot) \le 1$ and $0 \le \frac{\exp(\cdot)}{1 + \exp(\cdot)} \le 1$, so the estimated probabilities

$\hat{y} = \Phi(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k)$ for the probit model and

$\hat{y} = \dfrac{\exp(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k)}{1 + \exp(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k)}$ for the logit model

always lie in the unit interval.


Probit regression of women's labor force participation

Stata command: probit inlf nwifeinc educ kidslt6

.1667 is the coefficient of educ from the probit regression (.057 in the LPM).

(Q: why so different?)

Compare the estimation results with those of the LPM:

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

Probit regression                                 Number of obs   =        753
                                                  LR chi2(3)      =      98.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.45302                       Pseudo R2       =     0.0960

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0231133   .0045451    -5.09   0.000    -.0320216    -.014205
        educ |   .1666664   .0235149     7.09   0.000      .120578    .2127547
     kidslt6 |  -.6525247   .0996887    -6.55   0.000     -.847911   -.4571383
       _cons |  -1.245253   .2714193    -4.59   0.000    -1.777225   -.7132807
------------------------------------------------------------------------------


Logit regression of women's labor force participation

Stata command: logit inlf nwifeinc educ kidslt6

.274 is the coefficient of educ from the logit regression (.057 in the LPM and .1667 in the probit).

(Q: why so different?)

. logit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.55427
Iteration 2:   log likelihood = -465.55673
Iteration 3:   log likelihood = -465.55373

Logistic regression                               Number of obs   =        753
                                                  LR chi2(3)      =      98.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -465.55373                       Pseudo R2       =     0.0958

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0385731   .0078653    -4.90   0.000    -.0539887   -.0231574
        educ |   .2741035   .0399976     6.85   0.000     .1957097    .3524973
     kidslt6 |  -1.068074    .167187    -6.39   0.000    -1.395755    -.740394
       _cons |  -2.046709   .4552589    -4.50   0.000       -2.939   -1.154418
------------------------------------------------------------------------------


Partial (or marginal) effect on $P(y = 1 \mid x) = G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$

ceteris paribus effect: the effect of a one-unit change in $x_j$ on the probability of success $P(y = 1 \mid x) = G(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$, holding the other factors fixed.

(i) Continuous $x_j$:

$\dfrac{\partial P(y = 1 \mid x)}{\partial x_j} = G'(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)\,\beta_j, \quad j = 1, \ldots, k$

For the probit model, $G(z) = \Phi(z)$ and $G'(z) = \phi(z)$.

For the logit model, $G(z) = \dfrac{\exp(z)}{1 + \exp(z)}$ and $G'(z) = g(z) = \dfrac{\exp(z)}{(1 + \exp(z))^2}$.

(For the LPM, $G'(z) = 1$.)

(ii) Discrete $x_j$, e.g., $x_1$ changing from 0 to 1: the partial effect is defined as

$G(\beta_0 + \beta_1 + \beta_2 x_2 + \cdots + \beta_k x_k) - G(\beta_0 + \beta_2 x_2 + \cdots + \beta_k x_k)$


    Remarks:

1. Unlike in the LPM, the partial effects in probit and logit models are not constant; they depend on the values of $x$. The slope parameter $\beta_j$ is NOT the partial effect of $x_j$ on the probability of "success", so the interpretations of the coefficients in these 3 models are different and not directly comparable.

2. Since $G' > 0$ for probit and logit, the direction of the partial effect of $x_j$ depends on the sign of $\beta_j$.

3. Calculation of marginal effects at the mean values of the regressors in probit and logit regressions: Stata command: mfx


    Probit regression:

Following the probit regression, run the Stata command: mfx

    (Note: at the mean values of x)

. probit inlf nwifeinc educ kidslt6
  (output as above)

. mfx

Marginal effects after probit
      y  = Pr(inlf) (predict)
         =  .57228348

------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]       X
---------+--------------------------------------------------------------------
nwifeinc |  -.0090691      .00178   -5.08   0.000  -.012566 -.005572    20.129
    educ |   .0653958      .00921    7.10   0.000   .047335  .083457   12.2869
 kidslt6 |  -.2560349      .03923   -6.53   0.000  -.332929 -.179141   .237716
------------------------------------------------------------------------------
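As a rough check, the educ row can be reproduced by hand: at the means the probit index is $-1.245 - .0231 \times 20.129 + .1667 \times 12.287 - .6525 \times .2377 \approx 0.182$, so $\phi(0.182) \approx 0.392$ and the marginal effect of educ is $0.392 \times .1667 \approx .0654$, matching the dy/dx column.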


Logit regression:

    (Q: any interesting finding from these results?)

. logit inlf nwifeinc educ kidslt6
  (output as above)

. mfx

Marginal effects after logit
      y  = Pr(inlf) (predict)
         =  .57219848

------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]       X
---------+--------------------------------------------------------------------
nwifeinc |  -.0094422      .00193   -4.90   0.000  -.013222 -.005663    20.129
    educ |   .0670971      .00977    6.87   0.000   .047943  .086251   12.2869
 kidslt6 |  -.2614511      .04111   -6.36   0.000  -.342023  -.18088   .237716
------------------------------------------------------------------------------


The mean values of the regressors:

. sum nwifeinc educ kidslt6

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    nwifeinc |       753    20.12896     11.6348   .0290575         96
        educ |       753    12.28685    2.280246          5         17
     kidslt6 |       753    .2377158     .523959          0          3

LPM result:

. reg inlf nwifeinc educ kidslt6
  (output as above)

This empirical example tells us that although the estimates of the $\beta_j$'s differ across the LPM, probit and logit regressions, their partial effects evaluated at the mean values of the regressors are very close. (Q: why does this make sense?)


Partial (marginal) effects in the LPM: $\hat{\beta}_j$

Partial effects in the probit regression: $\phi(\cdot)\,\hat{\beta}_j$

Partial effects in the logit regression: $\dfrac{\exp(\cdot)}{(1 + \exp(\cdot))^2}\,\hat{\beta}_j$

A simple rule for comparing coefficients in these 3 models: the partial effects are considered to be approximately equal,

$\hat{\beta}_{LPM} \approx \phi(\cdot)\,\hat{\beta}_{probit} \approx \dfrac{\exp(\cdot)}{(1 + \exp(\cdot))^2}\,\hat{\beta}_{logit}$

Since $\phi(0) \approx 0.4$ for probit and $\dfrac{\exp(0)}{(1 + \exp(0))^2} = 0.25$ for logit, we obtain

$\hat{\beta}_{LPM} \approx 0.4\,\hat{\beta}_{probit} \approx 0.25\,\hat{\beta}_{logit}$

or

$\hat{\beta}_{probit} \approx 2.5\,\hat{\beta}_{LPM}$, $\hat{\beta}_{logit} \approx 4\,\hat{\beta}_{LPM}$, and $\hat{\beta}_{probit} \approx 0.625\,\hat{\beta}_{logit}$

Example above: $\hat{\beta}_{LPM} = 0.057$, $\hat{\beta}_{probit} = 0.167$, $\hat{\beta}_{logit} = 0.274$
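Checking this rule with the educ coefficients above: $2.5 \times 0.057 \approx 0.14$ (vs. the probit 0.167) and $4 \times 0.057 \approx 0.23$ (vs. the logit 0.274); likewise $0.625 \times 0.274 \approx 0.17 \approx 0.167$. The rule is rough, but it gives the right order of magnitude.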


Calculation of the predicted probability $\hat{y}$ in probit/logit models:

Stata command: predict ypr, pr  (after the probit/logit regression)

Check whether $\hat{y}$ lies in the unit interval and compare the predicted probabilities in the probit and logit models.
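A minimal sketch of this check as a do-file (assuming the MROZ data are loaded; ppr and lpr are hypothetical variable names):

    quietly probit inlf nwifeinc educ kidslt6
    predict ppr, pr      // probit predicted probabilities
    quietly logit inlf nwifeinc educ kidslt6
    predict lpr, pr      // logit predicted probabilities
    sum ppr lpr          // both lie in [0, 1] by construction
    corr ppr lpr         // the two sets of predictions are typically very close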

    Note:

In Stata 11, the calculation of marginal effects has 3 cases: the marginal effect at the means, the marginal effect at a representative value, and the average marginal effect. The Stata commands are, respectively:

margins, dydx(*) atmeans
margins, dydx(*) at(nwifeinc=0 educ=6 kidslt6=1)
margins, dydx(*)


    3. Estimation and Testing: Probit and Logit

We need to calculate the likelihood function in probit and logit models:

$P(y_i = 1 \mid x_i) = G(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}), \quad P(y_i = 0 \mid x_i) = 1 - G(\cdot)$

or, equivalently, $f(y_i) = G(\cdot)^{y_i}\,[1 - G(\cdot)]^{1 - y_i}$

Then the likelihood function is

$L(\beta_0, \beta_1, \ldots, \beta_k) = \prod_{i=1}^{n} f(y_i) = \prod_{i=1}^{n} G(\cdot)^{y_i}\,[1 - G(\cdot)]^{1 - y_i}$

or

$\ln L(\beta_0, \beta_1, \ldots, \beta_k) = \sum_{i=1}^{n} \{\, y_i \ln G(\cdot) + (1 - y_i) \ln[1 - G(\cdot)] \,\}$

as a function of the unknown parameters $\beta_0, \beta_1, \ldots, \beta_k$.

$G(z) = \Phi(z)$ for the probit model; $G(z) = \dfrac{\exp(z)}{1 + \exp(z)}$ for the logit model.

Maximizing $\ln L(\beta_0, \beta_1, \ldots, \beta_k)$ gives the probit (or logit) estimates of the $\beta$'s.
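To see what the probit command does under the hood, here is a minimal sketch that hand-codes this probit log likelihood with Stata's ml routine (myprobit is a hypothetical program name; assumes the MROZ data are loaded). It should reproduce the probit estimates above:

    capture program drop myprobit
    program define myprobit
        args lnf xb                                  // lnf: obs-level log likelihood; xb: linear index
        quietly replace `lnf' = ln(normal(`xb'))  if $ML_y1 == 1
        quietly replace `lnf' = ln(normal(-`xb')) if $ML_y1 == 0
    end

    ml model lf myprobit (inlf = nwifeinc educ kidslt6)
    ml maximize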


For probit and logit models, we can't solve for the maximum explicitly. We need to use numerical methods (iterations), e.g., the iteration log that Stata reports:

. probit inlf nwifeinc educ kidslt6

Iteration 0:   log likelihood =  -514.8732
Iteration 1:   log likelihood = -466.34923
Iteration 2:   log likelihood =  -465.4538
Iteration 3:   log likelihood = -465.45302

  (estimation output as above)

Properties of MLE:

• consistent

• asymptotically normal

• asymptotically efficient


Hypothesis testing in probit and logit models: same as in OLS

Example 1: $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ in the logit model

Stata commands:

quietly logit inlf nwifeinc educ kidslt6
test nwifeinc educ kidslt6

. test nwifeinc educ kidslt6

 ( 1)  [inlf]nwifeinc = 0
 ( 2)  [inlf]educ = 0
 ( 3)  [inlf]kidslt6 = 0

          chi2(  3) =   78.00
        Prob > chi2 =    0.0000

$H_0$ is rejected since the p-value is 0.0000.

Example 2: linear restriction $H_0: 2\beta_2 - \beta_3 = 0$

. test 2*educ - kidslt6 = 0

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0

          chi2(  1) =   65.05
        Prob > chi2 =    0.0000

$H_0$ is rejected since the p-value is 0.0000.


Example 3: $H_0: 2\beta_2 - \beta_3 = 0$ and $\beta_1 + \beta_2 = 0$

Stata command:

test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

. test (2*educ - kidslt6 = 0) (educ + nwifeinc = 0)

 ( 1)  2*[inlf]educ - [inlf]kidslt6 = 0
 ( 2)  [inlf]nwifeinc + [inlf]educ = 0

          chi2(  2) =   69.16
        Prob > chi2 =    0.0000

$H_0$ is rejected since the p-value is 0.0000.