Week 4: Instrumental Variables Estimation and Two Stage Least Squares (Wooldridge Chapter 15)

Tsun-Feng Chiang*

*School of Economics, Henan University, Kaifeng, China

March 24, 2014


Instrumental Variables Estimation One Endog. Variable, No Exog. Variable other than One IV

Omitted Variables (A Simple Case)

Suppose there is a two-variable regression model with an important explanatory variable x2,

y = β0 + β1x1 + β2x2 + e (eq. 1),

where e is the error term, which is uncorrelated with x1 and x2. However, the variable x2 is not observable, so x2 is absorbed together with e into a composite error term u,

y = β0 + β1x1 + u (eq. 2),

If x2 is not correlated with x1, then u is also uncorrelated with x1 and OLS gives an unbiased estimator of β1. If they are correlated (Cov(u, x1) ≠ 0, i.e. the exogeneity assumption is violated), the OLS estimators are biased.



To fix the problem of endogeneity:

1 Proxy Variable

Find a variable x∗2 closely related to the unobservable x2 to replace x2,

y = δ0 + β1x1 + δ2x∗2 + e (eq. 3)

If x∗2 is a good proxy for x2, OLS on (eq. 3) estimates the same β1 as in (eq. 1) without bias.

2 Instrumental Variable (IV)

When x and u are correlated, to obtain consistent estimators of β0 and β1 we first need to find a variable, say z, that satisfies two properties:

   1 z is uncorrelated with u

   Cov(z, u) = 0

   2 z is correlated with the observable x (x1 for this simple case)

   Cov(z, x) ≠ 0

Then z is an instrumental variable for x.



For the first property, it is impossible to test whether z and u are uncorrelated because u is unobservable. In most cases, studies use economic reasoning to explain why the z they pick is uncorrelated with the omitted variables.

Example (1): mother's education as an IV for education

log(wage) = β0 + β1education + u

Example (2): distance as an IV for the lectures missed

score = β0 + β1skipped + u

For the second property, a formal test is possible by running a simple regression by OLS,

x = π0 + π1z + v

Use a t test for the null hypothesis H0 : π1 = 0. If it is rejected, z and x are correlated, so the second property is satisfied.
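The relevance check above can be sketched in base R. This is only an illustration on simulated data: the sample size, the coefficient 0.5 and all variable names are assumptions made for the example, not values from the textbook.

```r
# Simulated data (hypothetical): z shifts x, so pi1 should be far from zero.
set.seed(1)
n <- 500
z <- rnorm(n)
x <- 1 + 0.5 * z + rnorm(n)

# First-stage (relevance) regression: x = pi0 + pi1 * z + v
first_stage <- lm(x ~ z)
coefs <- summary(first_stage)$coefficients

# t statistic and p-value for H0: pi1 = 0
t_stat  <- coefs["z", "t value"]
p_value <- coefs["z", "Pr(>|t|)"]

# Reject H0 at the 5% level when the p-value is below 0.05
relevant <- p_value < 0.05
print(coefs)
```

Note that this only checks the second property; nothing in the data can confirm Cov(z, u) = 0.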



IV Estimators

Start from Cov(z, y), replace y using (eq. 2), and use the fact that β0 is a constant,

Cov(z, y) = β1Cov(z, x1) + Cov(z,u)

By the first property of z, Cov(z, u) = 0, and by the second property Cov(z, x1) ≠ 0, so the population β1 is,

β1 = Cov(z, y)/Cov(z, x1)

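Therefore, the sample analog for β1 is the IV estimator

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_{i1} - \bar{x}_1)}$$

Once $\hat{\beta}_1$ is known, the intercept follows immediately,

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}_1$$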
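As a rough illustration of the formulas above, the IV estimates can be computed directly from sample covariances and means. The data are simulated under assumed values (true β1 = 2, an unobserved factor `a` that enters both x1 and u); none of these numbers come from the lecture.

```r
# Simulated data (hypothetical): a is unobserved and makes x1 endogenous.
set.seed(2)
n  <- 5000
z  <- rnorm(n)                    # instrument, independent of a
a  <- rnorm(n)                    # omitted factor
x1 <- 0.8 * z + a + rnorm(n)
u  <- a + rnorm(n)                # composite error, correlated with x1
y  <- 1 + 2 * x1 + u

# IV estimator: ratio of sample covariances (the 1/(n-1) factors cancel)
beta1_iv <- cov(z, y) / cov(z, x1)
beta0_iv <- mean(y) - beta1_iv * mean(x1)

# OLS slope for comparison; it picks up the correlation between x1 and u
beta1_ols <- cov(x1, y) / var(x1)

print(c(iv = beta1_iv, ols = beta1_ols))
```

In this design the OLS slope drifts above 2 while the IV slope stays close to it, which is the point of using z.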



Statistical Inference with the IV Estimator

The two properties of z ensure that the IV estimators are consistent. For statistical inference, we also need a homoskedasticity assumption on the variance of u conditional on z, not x1:

E(u²|z) = σ² = Var(u)

Combining the two properties of z with the homoskedasticity assumption, it can be shown that the asymptotic variance of $\hat{\beta}_1$ is

$$\frac{\sigma^2}{n \sigma_{x_1}^2 \rho_{x_1,z}^2} \quad \text{(eq. 4)}$$

where $\sigma_{x_1}^2$ is the population variance of x1, and $\rho_{x_1,z}$ is the population correlation between x1 and z. (eq. 4) provides a way to obtain a standard error for $\hat{\beta}_1$: it can be estimated by approximating its three components,



Statistical Inference with the IV Estimator (continued)

1 $\sigma_{x_1}^2$ is approximated by the sample variance of x1, or $\frac{1}{n} SST_{x_1}$.

2 $\rho_{x_1,z}^2$ is approximated by $R_{x_1,z}^2$, the R-squared from regressing x1 on z.

3 $\sigma^2$ is approximated by the estimated variance of u,

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2$$

where $\hat{u}_i$ are the IV residuals and $\hat{\beta}_0$, $\hat{\beta}_1$ are the IV estimates: $\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1}$, i = 1, 2, · · · , n.

Therefore, the standard error for $\hat{\beta}_1$ is the square root of

$$\frac{\hat{\sigma}^2}{SST_{x_1} R_{x_1,z}^2}$$

If the correlation between x1 and z is perfect ($R_{x_1,z}^2 = 1$), the IV variance is identical to the OLS variance. Since $R_{x_1,z}^2 \le 1$, the IV variance is never smaller than the OLS variance.
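The three components can be assembled by hand as a check on the formula. The sketch below uses simulated data with assumed parameter values; the variable names are illustrative only.

```r
# Simulated data (hypothetical), same structure as a simple IV model.
set.seed(3)
n  <- 2000
z  <- rnorm(n)
a  <- rnorm(n)                       # unobserved factor in both x1 and u
x1 <- 0.8 * z + a + rnorm(n)
y  <- 1 + 2 * x1 + a + rnorm(n)

# IV estimates
beta1_iv <- cov(z, y) / cov(z, x1)
beta0_iv <- mean(y) - beta1_iv * mean(x1)

# Component 3: sigma^2 from the IV residuals (which use the actual x1)
u_hat      <- y - beta0_iv - beta1_iv * x1
sigma2_hat <- sum(u_hat^2) / (n - 2)

# Components 1 and 2: total variation in x1 and R-squared of x1 on z
sst_x1 <- sum((x1 - mean(x1))^2)
r2_x1z <- summary(lm(x1 ~ z))$r.squared

# Standard error of the IV slope
se_iv <- sqrt(sigma2_hat / (sst_x1 * r2_x1z))

# Same expression without the R-squared factor, to show its effect
se_without_r2 <- sqrt(sigma2_hat / sst_x1)
print(c(se_iv = se_iv, se_without_r2 = se_without_r2, r2 = r2_x1z))
```

Dividing by an R-squared below one inflates the standard error, and the weaker the instrument, the larger the inflation.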



Example 15.1

Figure: Return to Education for Married Women (OLS Estimation)

Figure: Return to Education for Married Women (Correlation between the Endogenous Variable and the IV)



Example 15.1 (continued)

Figure: Return to Education for Married Women (IV Estimation)

Import the data
> mroz = read.csv(file.choose(), header = TRUE)

To run the IV estimation in R, first download and load the package AER
> install.packages("AER")
> library("AER")

Use the command ivreg in AER, where you set the model and pick the IV after the | symbol
> mroz_iv <- ivreg(lwage ~ educ | fatheduc, data = mroz)
> summary(mroz_iv)



Example 15.2

Figure: Return to Education for Men (Correlation between the Endogenous Variable and the IV)

Figure: Return to Education for Men (IV Estimation)



Properties of IV with a Poor Instrumental Variable

A weak correlation between z and x leads to large standard errors; even a weak correlation between z and u leads to inconsistent estimators, no matter how large the sample is. To see the latter, the probability limit of the IV estimator $\hat{\beta}_1$ can be shown to be

$$\text{plim}\, \hat{\beta}_1 = \beta_1 + \frac{Corr(z,u)}{Corr(z,x)} \cdot \frac{\sigma_u}{\sigma_x}$$

where σu and σx are the standard deviations of u and x in the population, respectively. If Corr(z, u) ≠ 0, then $\hat{\beta}_1$ is asymptotically biased, and the inconsistency can be very large if Corr(z, x) is small.

Similarly, the probability limit of the OLS estimator $\tilde{\beta}_1$ when x is endogenous can be derived as

$$\text{plim}\, \tilde{\beta}_1 = \beta_1 + Corr(x,u) \cdot \frac{\sigma_u}{\sigma_x}$$

So IV is preferred to OLS on asymptotic bias grounds only if |Corr(z, u)/Corr(z, x)| < |Corr(x, u)|; if we choose a z for which this inequality fails, IV is not preferred to OLS.
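To make the comparison concrete, the two asymptotic bias terms can be evaluated for given correlations. The helper functions and the numbers plugged in below are hypothetical, chosen only to show how a weak and slightly invalid instrument can do worse than OLS.

```r
# Asymptotic bias of IV: (Corr(z,u) / Corr(z,x)) * (sd_u / sd_x)
asy_bias_iv <- function(corr_zu, corr_zx, sd_u, sd_x) {
  (corr_zu / corr_zx) * (sd_u / sd_x)
}

# Asymptotic bias of OLS: Corr(x,u) * (sd_u / sd_x)
asy_bias_ols <- function(corr_xu, sd_u, sd_x) {
  corr_xu * (sd_u / sd_x)
}

# Hypothetical values: the instrument is only slightly correlated with u
# (0.05) but also only weakly correlated with x (0.1).
bias_iv  <- asy_bias_iv(corr_zu = 0.05, corr_zx = 0.1, sd_u = 1, sd_x = 1)
bias_ols <- asy_bias_ols(corr_xu = 0.2, sd_u = 1, sd_x = 1)

# IV is worse than OLS here because |0.05 / 0.1| exceeds |0.2|
iv_worse <- abs(bias_iv) > abs(bias_ols)
print(c(bias_iv = bias_iv, bias_ols = bias_ols))
```

With these inputs the IV bias is 0.5 against an OLS bias of 0.2, even though Corr(z, u) is only a quarter of Corr(x, u).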



Example 15.3 An Example of a Poor IV

Figure: Cigarette Price as an IV

Figure: Estimating the Effect of Smoking on Birth Weight


Instrumental Variables Estimation One Endog. Variable, Multiple Exog. Variables and One IV

IV Estimation of the Multiple Regression Model

Suppose there is one endogenous explanatory variable in addition to the dependent variable y. Call the endogenous explanatory variable y2 and the dependent variable y1. Then the regression model can be written as,

y1 = β0 + β1y2 + β2z1 + · · ·+ βkzk−1 + u1 (eq. 5)

where z1, z2, · · · , zk−1 are assumed to be exogenous variables, i.e. Cov(zj, u1) = 0 for all j = 1, 2, · · · , k − 1. Another assumption is E(u1) = 0. (eq. 5), in which endogenous variables appear on both sides, is called the structural equation.

The IV for y2, called zk, should not be one of the zj in (eq. 5), and needs to satisfy two conditions: (i) u1 and zk are uncorrelated; and (ii) y2 and zk are partially correlated. (i) can be justified by economic reasoning; for (ii), run the regression model,



y2 = π0 + π1z1 + π2z2 + · · ·+ πk−1zk−1 + πkzk + v2 (eq. 6)

(eq. 6) is called the reduced form equation, in which the endogenous variable is written in terms of exogenous variables. (ii) requires πk ≠ 0. Under (i), (ii), and the assumptions that E(u1) = 0 and Cov(zj, u1) = 0, zk is a valid IV for y2.

Method of Moments

To estimate β, consider the assumptions for (eq. 5):

E(u1) = 0,Cov(zj ,u1) = 0, j = 1,2, · · · , k − 1.

And the first condition for a valid IV,

Cov(zk ,u1) = 0



Method of Moments (Continued)

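Given E(u1) = 0, it can be shown that Cov(zj, u1) = E(zju1) = 0 for j = 1, 2, · · · , k − 1 and Cov(zk, u1) = E(zku1) = 0. Therefore, the sample analogs of these moment conditions are

$$E(u_1) = 0 \Rightarrow \sum_{i=1}^{n} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \cdots - \hat{\beta}_k z_{i,k-1}) = 0$$

$$E(z_1 u_1) = 0 \Rightarrow \sum_{i=1}^{n} z_{i1} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \cdots - \hat{\beta}_k z_{i,k-1}) = 0$$

$$\vdots$$

$$E(z_{k-1} u_1) = 0 \Rightarrow \sum_{i=1}^{n} z_{i,k-1} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \cdots - \hat{\beta}_k z_{i,k-1}) = 0$$

$$E(z_k u_1) = 0 \Rightarrow \sum_{i=1}^{n} z_{ik} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \cdots - \hat{\beta}_k z_{i,k-1}) = 0$$

Solve these k + 1 equations to obtain $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \cdots, \hat{\beta}_k$.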
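In matrix form the k + 1 sample moment conditions read Z'(y1 − Xβ) = 0, which is a linear system in β. The sketch below solves it in base R for a small simulated case with one included exogenous variable (so k = 2); the data-generating values and names are assumptions for illustration.

```r
# Simulated data (hypothetical): y2 is endogenous through a, zk is the IV.
set.seed(5)
n  <- 5000
z1 <- rnorm(n)                         # included exogenous variable
zk <- rnorm(n)                         # excluded exogenous variable (the IV)
a  <- rnorm(n)                         # unobserved factor
y2 <- 0.5 * z1 + 0.8 * zk + a + rnorm(n)
u1 <- a + rnorm(n)
y1 <- 1 + 2 * y2 - 1 * z1 + u1

# X holds the regressors of the structural equation,
# Z holds the variables that appear in the moment conditions.
X <- cbind(1, y2, z1)
Z <- cbind(1, zk, z1)

# Solve Z'(y1 - X b) = 0, i.e. (Z'X) b = Z'y1
beta_iv <- solve(t(Z) %*% X, t(Z) %*% y1)

# The sample moments evaluated at the solution should be numerically zero
moments <- t(Z) %*% (y1 - X %*% beta_iv)
print(beta_iv)
```

The system has a unique solution only when Z'X is invertible, which is the sample counterpart of the requirement πk ≠ 0.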



Example 15.4 Using College Proximity as an IV for Education

Figure: Return to Education for Men (Correlation between the Endogenous Variable and the IV)



Example 15.4 (Continued)

Figure: Return to Education for Men (IV Estimation)


Instrumental Variables Estimation One Endog. Variable, Multiple Exog. Variables and Multiple IV

Two Stage Least Squares

Nothing restricts us to a single IV for one endogenous variable. Following (eq. 5), we can find more exogenous variables not in (eq. 5), such as zk, zk+1, zk+2, · · · , zm, to be IVs for y2. Like the other exogenous variables z1, z2, · · · , zk−1, these IVs should be exogenous to u1, i.e. Cov(zk, u1) = Cov(zk+1, u1) = · · · = Cov(zm, u1) = 0.

The reduced form equation then should be extended to

y2 = π0 + π1z1 + · · ·+ πk−1zk−1 + πkzk + · · ·+ πmzm + v2 (eq. 7)

The best IV for y2 is the linear combination of the zj in (eq. 7), which is called y∗2,

y∗2 = π0 + π1z1 + · · ·+ πk−1zk−1 + πkzk + · · ·+ πmzm (eq. 8)

For y∗2 to be a valid IV, it must be that

at least one πj ≠ 0 for j = k, k + 1, · · · , m

In practice, we run (eq. 7) using OLS and test the null hypothesis:

H0 : πk = πk+1 = · · · = πm = 0

If rejected, then calculate the fitted values,

$$\hat{y}_2 = \hat{\pi}_0 + \hat{\pi}_1 z_1 + \cdots + \hat{\pi}_{k-1} z_{k-1} + \hat{\pi}_k z_k + \cdots + \hat{\pi}_m z_m$$

where $\hat{y}_2$ is the best IV, which will be used in estimating β. Given E(u1) = 0 and Cov(zj, u1) = E(zju1) = 0 for j = 1, 2, · · · , k − 1, we have k moment conditions. But in (eq. 5) there are k + 1 parameters to be estimated! Because $\hat{y}_2$ is a linear combination of exogenous variables, it is itself exogenous. So the last moment condition is,

$$\sum_{i=1}^{n} \hat{y}_{i2} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \cdots - \hat{\beta}_k z_{i,k-1}) = 0$$

Now there are just enough conditions to estimate the k + 1 parameters.



The IV estimator using $\hat{y}_2$ is also called the two stage least squares (2SLS) estimator, because in the first stage we run (eq. 7) to obtain $\hat{y}_2$, the fitted values that estimate y∗2, which are then used to estimate β in the second stage (eq. 5).

y2 is correlated with u1; its fitted value $\hat{y}_2$, however, is uncorrelated with u1, since the unobservable part of y2 that causes the endogeneity has been discarded.
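The two stages can be run by hand with two calls to lm. This is a sketch on simulated data with assumed coefficients (true β1 = 2) and made-up variable names, meant only to show the mechanics.

```r
# Simulated data (hypothetical): z2 and z3 are excluded instruments for y2.
set.seed(6)
n  <- 5000
z1 <- rnorm(n)
z2 <- rnorm(n)
z3 <- rnorm(n)
a  <- rnorm(n)                                  # source of endogeneity
y2 <- 0.5 * z1 + 0.6 * z2 + 0.6 * z3 + a + rnorm(n)
y1 <- 1 + 2 * y2 - 1 * z1 + a + rnorm(n)

# Stage 1: reduced form, regress y2 on ALL exogenous variables
stage1 <- lm(y2 ~ z1 + z2 + z3)
y2_hat <- fitted(stage1)

# Stage 2: replace y2 by its fitted values in the structural equation
stage2    <- lm(y1 ~ y2_hat + z1)
beta_2sls <- coef(stage2)

# OLS on the structural equation, for comparison
beta_ols <- coef(lm(y1 ~ y2 + z1))

print(rbind(two_stage = beta_2sls, ols = beta_ols))
```

The point estimates from the second stage are the 2SLS estimates, but the standard errors that lm reports for it are not valid, because its residuals are built from y2_hat rather than y2. A dedicated routine such as ivreg in the AER package computes them correctly.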

Example 15.5 (Return to Education for Working Women)

Figure: 2SLS: Mother’s Education and Father’s Education



Multicollinearity and 2SLS

For (eq. 5) estimated by OLS, the variance of the estimator of β1 is,

$$Var(\hat{\beta}_1) = \frac{\sigma^2}{SST_2 (1 - R_2^2)}$$

where $\sigma^2 = Var(u_1)$; $SST_2$ is the total sample variation in y2; $R_2^2$ is the R-squared from regressing y2 on all the other independent variables in (eq. 5).

By 2SLS, the variance of the estimator of β1 is,

$$Var(\hat{\beta}_1) = \frac{\sigma^2}{\widehat{SST}_2 (1 - \hat{R}_2^2)}$$

where $\widehat{SST}_2$ is the total sample variation in $\hat{y}_2$; $\hat{R}_2^2$ is the R-squared from regressing $\hat{y}_2$ on all other exogenous variables in the structural equation.



Multicollinearity and 2SLS (continued)

By construction, $\hat{y}_2$ has smaller variation than y2 ⇒ $\widehat{SST}_2 < SST_2$. Second, the correlation between $\hat{y}_2$ and the exogenous variables in (eq. 5) is often much higher than the correlation between y2 and these variables ⇒ $R_2^2 < \hat{R}_2^2$. Multicollinearity makes $\hat{R}_2^2$ much larger, and with it the standard error of the 2SLS estimator.

Multiple Endogenous Explanatory Variables

When the model has more than one endogenous explanatory variable, we need at least as many excluded exogenous variables as there are included endogenous explanatory variables in the structural equation. Both the order condition and the rank condition need to be checked.

Linear Hypothesis Test after 2SLS Estimation

The R-squared from 2SLS can be negative, and so can an F statistic computed from it in the usual way.


IV Solutions to Errors-in-Variables Problems

IV can solve not only the omitted variable problem but also the measurement error problem. Consider the following model,

y = β0 + β1x∗1 + β2x2 + u

where x∗1 is unobservable. Suppose there is an imperfect measure of x∗1, called x1, with x1 = x∗1 + e1, where e1 is the measurement error. The equation above becomes

y = β0 + β1x1 + β2x2 + (u − β1e1)

The new error term is now correlated with x1 via e1, so we need to find an IV for x1 that is uncorrelated with e1 but correlated with x1. One possibility is to obtain another imperfect measure of x∗1, say z1, to be the IV. Let z1 = x∗1 + a1. z1 is correlated with x1 because they are both measures of x∗1. For z1 to be a valid IV, a1 and e1 have to be uncorrelated so that z1 is uncorrelated with the error term.
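A small simulation can illustrate the idea of using a second noisy measure as the instrument. Everything below is assumed for the example: the true coefficient on x∗1 is set to 2, both measurement errors have unit variance, and the variable names are hypothetical.

```r
# Simulated data (hypothetical): x1 and z1 are two noisy measures of x_star.
set.seed(7)
n      <- 5000
x_star <- rnorm(n)                 # unobservable true variable
x2     <- rnorm(n)                 # exogenous regressor
e1     <- rnorm(n)                 # measurement error in x1
a1     <- rnorm(n)                 # measurement error in z1, independent of e1
x1     <- x_star + e1
z1     <- x_star + a1
y      <- 1 + 2 * x_star + 0.5 * x2 + rnorm(n)

# OLS with the mismeasured x1: the slope is attenuated toward zero
beta_ols <- coef(lm(y ~ x1 + x2))

# IV (2SLS by hand): first stage on z1 and x2, then use the fitted values
x1_hat  <- fitted(lm(x1 ~ z1 + x2))
beta_iv <- coef(lm(y ~ x1_hat + x2))

print(rbind(ols = beta_ols, iv = beta_iv))
```

In this design the OLS slope on x1 settles near 1, half the true value, while the IV slope is close to 2. The second-stage standard errors from lm are again not the valid 2SLS ones.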



Example 15.6 Using Two Test Scores as Indicators of Ability

Figure: Using IQ as a Proxy (Example 9.3)



Example 15.6 (Continued)

Figure: Using IQ as a measure of the unobservable variable and KWW as an IV


Testing for Endogeneity

Since 2SLS estimators are less efficient than OLS estimators when the explanatory variables are exogenous, it is useful to test whether endogeneity exists and whether 2SLS is necessary. Suppose the explanatory variable y2 is suspected to be endogenous in this structural equation,

y1 = β0 + β1y2 + β2z1 + · · ·+ βkzk−1 + u1

As usual, we find exogenous variables not included in the model to use as IVs and run the reduced form equation,

y2 = π0 + π1z1 + π2z2 + · · ·+ πk−1zk−1 + πkzk + · · ·+ πmzm + v2

Each zj, j = 1, 2, · · · , m, is exogenous, so y2 is uncorrelated with u1 if and only if u1 and v2 are uncorrelated. The relationship between u1 and v2 can be written as u1 = δ1v2 + e1, so u1 and v2 are uncorrelated when δ1 = 0. Replace u1 in the structural equation by δ1v2 + e1, replace v2 by the residuals $\hat{v}_2$ estimated from the reduced form equation, estimate

$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \cdots + \beta_k z_{k-1} + \delta_1 \hat{v}_2 + error$$

by OLS, and test H0 : δ1 = 0. If rejected, y2 is endogenous.

Example 15.7 Return to Education for Working Women

Figure: Testing for Endogeneity of Education



My R Code (Example 15.7):

Import the data:
> mroz = read.csv(file.choose(), header = TRUE)

Estimate the reduced form for educ by OLS:
> educ_ols <- lm(educ ~ exper + I(exper^2) + motheduc + fatheduc, data = mroz)

Extract the residuals, and name them resid:
> resid <- residuals(educ_ols)

Add resid as a new variable to the data mroz:
> mroz$resid <- resid

Add resid to the structural equation, estimate by OLS, and check the t statistic on resid:
> wage_ols <- lm(lwage ~ educ + exper + I(exper^2) + resid, data = mroz)

> summary(wage_ols)


Testing for Overidentifying Restrictions

In the case of one endogenous variable y2 and more than one IV, we can effectively test whether some of them are correlated with the structural error. Take the simplest case, where there are two IVs, zk and zk+1, as an example. Suppose we use only zk as an IV for y2 and obtain the IV estimates $\hat{\beta}$. Then calculate the IV residuals,

$$\hat{u}_1 = y_1 - \hat{\beta}_0 - \hat{\beta}_1 y_2 - \hat{\beta}_2 z_1 - \cdots - \hat{\beta}_k z_{k-1}$$

Because zk+1 is not used, we can test whether zk+1 and $\hat{u}_1$ are correlated. For this test to be useful, we must assume that zk and u1 are uncorrelated. By the same logic, we can use zk+1 as an IV and test the correlation between zk and $\hat{u}_1$. Each additional IV gives one more overidentifying restriction. If there are three IVs, there are two overidentifying restrictions to test, and so on.



In general, when there are more IVs than endogenous explanatory variables, use the following three steps to test the overidentifying restrictions:

1 Estimate the structural equation using 2SLS and obtain the residuals, $\hat{u}_1$.

2 Regress $\hat{u}_1$ on all exogenous variables. Obtain the R-squared, say, $R_1^2$.

3 Under the null hypothesis that all IVs are uncorrelated with u1, $nR_1^2 \sim \chi_q^2$ asymptotically, where n is the sample size and q is the number of overidentifying restrictions, i.e. the number of IVs from outside the model minus the total number of endogenous explanatory variables. We test H0: all IVs are exogenous against H1: at least one IV is not exogenous.

Adding IVs could improve the asymptotic efficiency of 2SLS, but it requires that any new IVs are exogenous. With the typical sample sizes available, adding too many IVs can cause severe biases in 2SLS.
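The three-step overidentification test can be sketched in base R as follows. The data are simulated with two valid excluded instruments and one endogenous variable, so q = 1; all names and parameter values are assumptions for illustration.

```r
# Simulated data (hypothetical): zk and zk1 are both valid instruments.
set.seed(9)
n   <- 2000
z1  <- rnorm(n)
zk  <- rnorm(n)
zk1 <- rnorm(n)
a   <- rnorm(n)
y2  <- 0.5 * z1 + 0.6 * zk + 0.6 * zk1 + a + rnorm(n)
y1  <- 1 + 2 * y2 - 1 * z1 + a + rnorm(n)

# Step 1: 2SLS estimates, then residuals computed with the ACTUAL y2
y2_hat <- fitted(lm(y2 ~ z1 + zk + zk1))
b      <- coef(lm(y1 ~ y2_hat + z1))
u1_hat <- y1 - b["(Intercept)"] - b["y2_hat"] * y2 - b["z1"] * z1

# Step 2: regress the residuals on all exogenous variables
r2_1 <- summary(lm(u1_hat ~ z1 + zk + zk1))$r.squared

# Step 3: n * R-squared compared with a chi-square with q degrees of freedom
q       <- 2 - 1     # excluded IVs minus endogenous explanatory variables
stat    <- n * r2_1
p_value <- pchisq(stat, df = q, lower.tail = FALSE)

print(c(statistic = stat, p_value = p_value))
```

A small p-value would be evidence against H0 that all IVs are exogenous; the test cannot say which instrument is the invalid one.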



Example 15.8 Return to Education

Figure: One Restriction

$nR_1^2 = 428(0.0009) = 0.3852$

Figure: Two Restrictions

$nR_1^2 = 428(0.0026) = 1.11$
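The two statistics can be compared with chi-square critical values in R. The sketch assumes, following the figure captions, that the first test has one overidentifying restriction and the second has two; the 5% significance level is a choice made for the illustration.

```r
n <- 428

# Test statistics n * R-squared from the two auxiliary regressions
stat_one <- n * 0.0009   # one overidentifying restriction
stat_two <- n * 0.0026   # two overidentifying restrictions

# 5% critical values of the chi-square distribution (about 3.84 and 5.99)
crit_one <- qchisq(0.95, df = 1)
crit_two <- qchisq(0.95, df = 2)

# p-values
p_one <- pchisq(stat_one, df = 1, lower.tail = FALSE)
p_two <- pchisq(stat_two, df = 2, lower.tail = FALSE)

print(c(stat_one = stat_one, crit_one = crit_one, p_one = p_one))
print(c(stat_two = stat_two, crit_two = crit_two, p_two = p_two))
```

Both statistics are well below their critical values, so under these assumptions the overidentifying restrictions are not rejected in either case.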


Applying 2SLS to Pooled Cross Sections and Panel Data

As with models of pooled cross sections and panel data estimated by OLS, time dummy variables are often added to allow for aggregate time effects. These dummy variables are exogenous. Applying 2SLS to these models causes no new difficulties.



Example 15.9 Effect of Education on Fertility

Figure: Applying 2SLS to Pooled Cross Section (IV: Meduc and Feduc)



Example (Continued)

Figure: Test for Endogeneity


1st Midterm

Date: Tuesday, April 1st, 2014

Time: 9:30 am ∼ 11:30 am

Location: The Conference for Lecture

Coverage: Wooldridge Chapter 13, 14 and 15

Others: Closed Book, Closed Notes, No Calculator Needed.
