Econometrics

Exercises in Advanced Microeconometrics

By Bertel Schjerning, Rasmus Jørgensen and Jakob Johansen

Autumn 2015 — Week 4

Contents

1 Relaxing the Strict Exogeneity Assumption to Sequential Exogeneity in the Linear Unobserved Effects Model
  1.1 Difference (FD) and Fixed Effects (FE) estimation
  1.2 IV and GMM estimation
    1.2.1 Predetermined Variables as Instruments
    1.2.2 Potential Instruments
    1.2.3 Estimation
    1.2.4 Orthogonal deviations
    1.2.5 Test for Overidentifying Restrictions
    1.2.6 The Rank Condition and Weak Instruments

2 The Exercise: The Effect of Fertility on Female Labor Market Participation
  2.1 Endogeneity Issues
  2.2 A Potentially Exogenous Instrument

3 The Questions

More Topics in Linear Unobserved Effects Models

Until now we have considered linear unobserved effects models with strictly exogenous regressors, i.e. the explanatory variables are assumed to be uncorrelated with past, present and future idiosyncratic errors. In this problem set, we will relax the strict exogeneity assumption such that the idiosyncratic errors may be correlated with future values of the conditioning variables. We say that these conditioning variables are predetermined with respect to the time-varying errors. Prominent examples of models with predetermined variables are dynamic models and models with feedback effects.

1 Relaxing the Strict Exogeneity Assumption to Sequential Exogeneity in the Linear Unobserved Effects Model

Our model of interest can still be written as

$$y_{it} = x_{it}\beta + c_i + u_{it}, \qquad t = 1, 2, \dots, T.$$

Until now, we have considered regression models where the errors are assumed to be mean independent of past, present and future values of $x_{it}$, i.e.

$$E(u_{it} \mid x_{i1}, \dots, x_{iT}, c_i) = 0. \qquad (1)$$

We refer to these variables as being strictly exogenous conditional on $c_i$. However, since this assumption rules out certain kinds of feedback from $u_{it}$ to future values of $x_{it}$, we now consider models in which the errors satisfy sequential moment conditions of the form:

$$E(u_{it} \mid x_{it}, x_{i,t-1}, \dots, x_{i1}, c_i) = 0. \qquad (2)$$

That is, in addition to allowing $c_i$ and $x_{it}$ to be correlated, we will now allow $u_{it}$ to be correlated with future values of $x_{it}$. When this restriction holds, we say that $x_{it}$ is sequentially exogenous conditional on the unobserved effect. Given our model, this assumption is equivalent to

$$E(y_{it} \mid x_{it}, x_{i,t-1}, \dots, x_{i1}, c_i) = E(y_{it} \mid x_{it}, c_i) = x_{it}\beta + c_i.$$

Intuition: Sequential exogeneity allows a correlation between a shock and a subsequent explanatory variable, but not between an explanatory variable and a subsequent shock. That is, we allow agents to react to a shock (for example by having a child in response to an adverse employment shock), but we do not allow a decision to expose the agent to a different type of shock. An easy way of remembering this is that it is fine for some shock to cause changes in an $x_{itk}$, since we usually make no assumptions on these (rather, we condition on them), but if a particular $x_{itk}$ could alter the distribution of $u_{i,t+1}$, then the shock is no longer (conditionally) "random".

1.1 Difference (FD) and Fixed Effects (FE) estimation

To show inconsistency of the estimators, recall that for pooled OLS of some variable, $y_{it}$, on a set of explanatory variables, $x_{it}$, we can write the probability limit of $\hat{\beta}$ as

$$\operatorname{plim}(\hat{\beta}) = \beta + \left[ T^{-1} \sum_{t=1}^{T} E(x_{it}'x_{it}) \right]^{-1} \left[ T^{-1} \sum_{t=1}^{T} E(x_{it}'u_{it}) \right].$$

Now we can see why the FD estimator is inconsistent. The model in first differences can be formulated as

$$\Delta y_{it} = \Delta x_{it}\beta + \Delta u_{it}, \qquad t = 2, \dots, T,$$

(so here $\Delta y_{it}$ takes the place of $y_{it}$ in the expression above, etc.). The FD estimator can be derived by estimating the above model by pooled OLS. Under (2) we note, however, that

$$E(\Delta u_{it}\,\Delta x_{it}) = E[(u_{it} - u_{i,t-1})(x_{it} - x_{i,t-1})] = -E(u_{i,t-1}x_{it}) \neq 0.$$

Hence, FD is inconsistent and this inconsistency does not depend on $T$.
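The inconsistency of FD can be illustrated with a small simulation. The sketch below is not part of the exercise code: the data generating process, the feedback parameter `delta` and all variable names are assumptions made for illustration. The regressor reacts to last period's shock, $x_{it} = 0.5c_i + \delta u_{i,t-1} + v_{it}$, so it is sequentially but not strictly exogenous. With unit variances, the FD bias implied by the plim expression is $-\delta/(2\delta^2 + 2)$.

```matlab
% Simulated example (assumed DGP, for illustration only)
rng(1);                                   % fix the seed for reproducibility
N = 20000; T = 4; beta = 1; delta = 1;

c = randn(N,1);                           % unobserved effect c_i
u = randn(N,T+1);                         % columns hold u_i0, u_i1, ..., u_iT
v = randn(N,T);

% x_it depends on c_i and on the previous shock u_i,t-1 (feedback)
x = 0.5*c*ones(1,T) + delta*u(:,1:T) + v;
y = beta*x + c*ones(1,T) + u(:,2:T+1);    % y_it = beta*x_it + c_i + u_it

% First differences remove c_i but not the feedback
dx = x(:,2:T) - x(:,1:T-1);
dy = y(:,2:T) - y(:,1:T-1);
b_fd = (dx(:)'*dx(:)) \ (dx(:)'*dy(:));   % pooled OLS on differenced data

bias_theory = -delta/(2*delta^2 + 2);     % -E(u_{t-1}x_t) / E(dx^2)
fprintf('FD estimate: %.3f, beta + theoretical bias: %.3f\n', ...
        b_fd, beta + bias_theory);
```

Increasing `N` moves the estimate closer to `beta + bias_theory` rather than to `beta`, and changing `T` does not remove the bias, in line with the statement above.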


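The fixed effects estimator is inconsistent too. To see this, note that with the within transformation $\ddot{y}_{it} = y_{it} - \bar{y}_i$ (and likewise for $x_{it}$ and $u_{it}$) we have

$$\begin{aligned}
E[\ddot{x}_{it}'\ddot{u}_{it}] &= E[(x_{it} - \bar{x}_i)'(u_{it} - \bar{u}_i)] \\
&= \underbrace{E[x_{it}'u_{it}]}_{=0} - E[x_{it}'\bar{u}_i] - E[\bar{x}_i'u_{it}] + E[\bar{x}_i'\bar{u}_i] \\
&= E[\bar{x}_i'\bar{u}_i] - E[x_{it}'\bar{u}_i] - E[\bar{x}_i'u_{it}].
\end{aligned}$$

These expressions are generally non-zero. For example, $E[\bar{x}_i'\bar{u}_i]$ contains products of $u_{it}$ and $x_{i,t+1}$, and these are not uncorrelated under sequential exogeneity.

Under certain conditions (weak dependence), the correlations die out and the inconsistency of the FE estimator can be shown to be of order $O(T^{-1})$, so it vanishes as $T$ grows. See lecture note 3.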

1.2 IV and GMM estimation

Consider the model in first differences

$$\Delta y_{it} = \Delta x_{it}\beta + \Delta u_{it}, \qquad t = 2, \dots, T.$$

As we have seen, this transformation only gets rid of the endogeneity induced by the unobserved individual effect. IV or GMM estimation techniques are therefore needed to estimate $\beta$ consistently in models with sequentially exogenous regressors. Hence, we need instruments $z_{it}$ for $\Delta x_{it}$ (or $x_{it}$ if the model is estimated in levels).

1.2.1 Predetermined Variables as Instruments

It turns out that from the sequential exogeneity assumption, we are able to obtain sequential moment restrictions of the form

$$E(z_{is}'u_{it}) = 0, \qquad s = 1, 2, \dots, t,$$

where the $z_{it}$'s are predetermined variables that can be used as potential instruments for $\Delta x_{it}$. That is, even though (2) rules out putting $u_{it}$ together with $x_{i,t+h}$, it is fine to put $u_{it}$ and $x_{i,t-h}$ together, $h \geq 1$.

For the model in first differences this implies the following $t-1$ moment conditions for the error term from period $t$:

$$E(z_{is}'e_{it}) = 0, \qquad s = 1, 2, \dots, t-1, \qquad e_{it} \equiv u_{it} - u_{i,t-1},$$

so we can use $z^o_{i,t-1} = (z_{i1}, z_{i2}, \dots, z_{i,t-1})$ as potential instruments for $\Delta x_{it}$. Note that sequential exogeneity implies that we can use any function of $z^o_{i,t-1}$ as potential instruments for $\Delta x_{it}$. In particular, we can use lagged differences $\Delta z_{i,t-h}$ where $1 \leq h \leq t-1$ or lagged levels $z_{i,t-1}, \dots, z_{i1}$.

1.2.2 Potential Instruments

The choice of instruments depends on the type of endogeneity of the explanatory variables, or on the available external instruments. A couple of examples of valid instruments are given here:

1. If $x_{it}$ is strictly exogenous conditional on $c_i$, then $\Delta x_{it}$ is strictly exogenous and can be used as its own instrument. In fact, we can use $\Delta x_{is}$ as an instrument for $\Delta x_{it}$ for any $s = 2, \dots, T$.

2. If $x_{it}$ is sequentially exogenous conditional on $c_i$, $\Delta x_{it}$ is sequentially exogenous, and we can use $x^o_{i,t-1} = (x_{i1}, x_{i2}, \dots, x_{i,t-1})$ as potential instruments for $\Delta x_{it}$.

3. If $x_{it}$ is endogenous due to correlation with $c_i$ and contemporaneous correlation with $u_{it}$, we need an external instrument $z_{it}$:

(a) If the external instrument $z_{it}$ is strictly exogenous, we can use $z^o_{iT} = (z_{i1}, z_{i2}, \dots, z_{iT})$ as potential instruments for $\Delta x_{it}$.

(b) If the external instrument $z_{it}$ is sequentially exogenous, we can use $z^o_{i,t-1} = (z_{i1}, z_{i2}, \dots, z_{i,t-1})$ as potential instruments for $\Delta x_{it}$.

1.2.3 Estimation

Now write the model in matrix notation

$$\Delta y_i = \Delta X_i\beta + \Delta u_i \qquad (3)$$

and define the matrix of instruments as

$$Z_i = \begin{pmatrix}
Z_{i2} & 0 & \cdots & 0 \\
0 & Z_{i3} & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & Z_{iT}
\end{pmatrix}_{(T-1)\times L} \qquad (4)$$

where $Z_{it}$ are the valid instruments at time $t$. The instrument matrix $Z_i$ has $T-1$ rows corresponding to the $T-1$ time periods in the system (3). Note that each $Z_{it}$ may contain different instruments since different instruments are used at each time period. Let $L$ denote the number of columns in $Z_i$.
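As an illustration of the block-diagonal structure in (4), the following sketch builds $Z_i$ for a single individual and a single predetermined instrument, so that the block for period $t$ is $Z_{it} = (z_{i1}, \dots, z_{i,t-1})$. The numbers in `z` are made up, and the loop is a hand-written illustration, not the course functions.

```matlab
% One individual, T = 4 periods, one predetermined instrument (made-up values)
T = 4;
z = [10 20 30 40];                 % z_i1, ..., z_iT

blocks = cell(1, T-1);
for t = 2:T
    blocks{t-1} = z(1:t-1);        % instruments available for the period-t equation
end

Zi = blkdiag(blocks{:});           % (T-1) x L block-diagonal instrument matrix
L  = size(Zi, 2);                  % number of moment restrictions
disp(Zi);
```

Each row of `Zi` corresponds to one equation of the system (3), and the number of non-zero entries grows by one per period because one more lag becomes available.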

The $L$ instruments give $L$ moment restrictions $E[Z_i'e_i(\beta)] = 0_{L\times 1}$, where $e_i(\beta) = \Delta y_i - \Delta X_i\beta$ is $(T-1)\times 1$. The sample analogue is

$$\frac{1}{N}\sum_{i=1}^{N} Z_i'e_i(\beta) = 0_{L\times 1}.$$

The parameters of interest, $\beta$, can be estimated consistently and efficiently by GMM (Generalized Method of Moments). The (possibly nonlinear) GMM criterion function, $J : \mathbb{R}^K \to \mathbb{R}$, is given by

$$J(\beta) = e(\beta)'ZW^{-1}Z'e(\beta),$$

where $W$ is a weight matrix given by

$$W = N^{-1}\sum_{i=1}^{N} Z_i'e_i(\beta)e_i(\beta)'Z_i.$$

The GMM estimator picks an estimate $\hat{\beta}$ that minimizes the GMM criterion. In the linear model, it can be shown that $\beta$ can be estimated by least squares methods. This is possible because the model of interest is linear in the parameters. The linear GMM estimator is

$$\hat{\beta}_{GMM} = \left(\Delta X'ZW^{-1}Z'\Delta X\right)^{-1}\Delta X'ZW^{-1}Z'\Delta Y.$$

If the weighting matrix $W^{-1}$ is known, then this estimator solves the minimization problem stated above directly, without having to use numerical minimization. This makes computation a lot easier, and it should be stressed that it is entirely due to the model being linear.

To actually implement the linear GMM estimator we need an estimate of $W$. However, $W$ is also a function of $\beta$, the very object we set out to estimate in the first place. One way to go about this problem is to assume a particular (yet incorrect) form of $W$. This will make it possible to estimate $\beta$ consistently, albeit inefficiently. Assume, e.g., $W = Z'Z$, such that

$$\hat{\beta}_{GMM} = \left(\Delta X'Z(Z'Z)^{-1}Z'\Delta X\right)^{-1}\Delta X'Z(Z'Z)^{-1}Z'\Delta Y = \hat{\beta}_{2SLS},$$

which we recognize as the two-stage least squares instrumental variables estimator. One problem with the IV estimator is that the moment restrictions are weighted equally. Hence, moment restrictions with weak instruments are given too much weight. In order to obtain a more efficient weighting of the moment restrictions, we may therefore calculate $W^{-1}$ from the 2SLS residuals and estimate $\hat{\beta}_{3SLS}$ using this weight matrix. Therefore, the parameters of the linear GMM problem are estimated in three steps:

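Procedure: 3-step linear GMM estimation.

1. Estimate $\hat{\beta}_{2SLS} = \left(\Delta X'Z(Z'Z)^{-1}Z'\Delta X\right)^{-1}\Delta X'Z(Z'Z)^{-1}Z'\Delta Y$

2. Compute $\hat{W} = N^{-1}\sum_{i=1}^{N} Z_i'\hat{e}_i\hat{e}_i'Z_i$, where $\hat{e}_i$ are the residuals from step 1

3. Estimate $\beta$ by applying $\hat{\beta}_{3SLS} = \left(\Delta X'Z\hat{W}^{-1}Z'\Delta X\right)^{-1}\Delta X'Z\hat{W}^{-1}Z'\Delta Y$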

Sometimes, $\beta$ is estimated using an algorithm where $W$ and $\hat{\beta}_{GMM}$ are repeatedly updated until $\hat{\beta}_{GMM}$ converges, i.e. steps 2 and 3 in the above procedure are repeated until convergence. This is done in order to improve the finite sample properties of the GMM estimator. Yet another procedure exists, called continuously updated GMM, wherein the dependence of the weighting matrix on the parameters is made explicit, $W = W(\beta)$, and the steps are carried out simultaneously.
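The three-step procedure can be sketched in a few lines of MATLAB. This is a hand-written illustration on simulated data, not the course's GMM() function. The data generating process, the dimensions and all variable names are assumptions, and the stacked instrument matrix is taken to be ordered by individual with `R` rows per individual.

```matlab
% Simulated over-identified linear model (assumed DGP, for illustration only)
rng(2);
N = 2000; R = 3; L = 3; beta = 0.5;     % N individuals, R rows each, L instruments
Z  = randn(N*R, L);                     % stacked instrument matrix
e  = randn(N*R, 1);
dX = Z*[1; 0.5; 0.25] + e;              % regressor, endogenous through e
dY = dX*beta + 0.8*e + randn(N*R, 1);   % error is correlated with dX via e

ZX = Z'*dX;                             % L x K cross moments
ZY = Z'*dY;                             % L x 1 cross moments

% Step 1: 2SLS, i.e. linear GMM with W = Z'Z
W1 = Z'*Z;
b1 = (ZX'*(W1\ZX)) \ (ZX'*(W1\ZY));

% Step 2: weight matrix from step-1 residuals, one term per individual
res = dY - dX*b1;
W = zeros(L);
for i = 1:N
    rows = (i-1)*R + (1:R);
    g = Z(rows,:)'*res(rows);           % Z_i' * e_i, an L x 1 vector
    W = W + g*g';
end
W = W/N;

% Step 3: linear GMM with the estimated weight matrix
b2 = (ZX'*(W\ZX)) \ (ZX'*(W\ZY));
fprintf('2SLS: %.4f   two-step GMM: %.4f   true beta: %.4f\n', b1, b2, beta);
```

The backslash operator is used instead of explicit inverses for numerical stability. Iterated GMM would simply wrap steps 2 and 3 in a loop that stops when `b2` no longer changes.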

1.2.4 Orthogonal deviations

As we discussed in problem set 2, the first difference transformation may induce serial correlation in the errors. For instance, if the errors in the original model are iid $(0, \sigma^2)$, then the first differenced errors will be serially correlated. Arellano and Bover (1995) proposed an alternative method to first differences which "can be regarded as doing first differences to eliminate the [unobserved] effects plus a GLS transformation to remove the serial correlation induced by differencing".¹ This transformation method is known as orthogonal deviations. The required transformation is given by the $(T-1)\times T$ matrix

$$A = (DD')^{-1/2}D,$$

where $D$ is the $(T-1)\times T$ first difference operator:

$$D = \begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix}_{(T-1)\times T}$$

$(DD')^{-1/2}$ is the upper triangular Cholesky matrix of $(DD')^{-1}$. In MATLAB, this can be obtained from the following piece of code:

¹ M. Arellano and O. Bover (1995): "Another Look at the Instrumental Variable Estimation of Error-Components Models", Journal of Econometrics, 68, 29-51.

    D = [zeros(T-1,1), eye(T-1)] - eye(T-1,T);   % First differences
    A = chol((D*D')^-1)*D;                       % Orthogonal deviations

To estimate the model in orthogonal deviations, simply transform the data using A instead of D (using D would give ordinary first differences).
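A quick numerical check can make the properties of $A$ concrete. The sketch below, with $T = 4$ chosen purely for illustration, verifies that the rows of $A$ are orthonormal (so iid errors stay serially uncorrelated after the transformation), that $A$ wipes out anything constant over time such as $c_i$, and that row $t$ of $A$ only involves periods $t, \dots, T$.

```matlab
T = 4;                                          % illustrative panel length
D = [zeros(T-1,1), eye(T-1)] - eye(T-1,T);      % first-difference operator
A = chol(inv(D*D'))*D;                          % orthogonal deviations

err_orth  = norm(A*A' - eye(T-1));   % ~0: transformed iid errors remain uncorrelated
err_const = norm(A*ones(T,1));       % ~0: time-constant effects are removed
err_past  = norm(tril(A,-1));        % ~0: row t puts no weight on periods before t

fprintf('%g  %g  %g\n', err_orth, err_const, err_past);
disp(D*D');                          % tridiagonal: differenced iid errors are correlated
```

The last property is what keeps lagged values of predetermined variables valid as instruments after the transformation, since the transformed error for period $t$ contains only current and future shocks.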

1.2.5 Test for Overidentifying Restrictions

The IV/GMM approach gives overidentifying restrictions, which can be used to test the validity of the moment restrictions. The test is called a Sargan test or a J-test:

$$J = \Delta\hat{u}'Z\hat{W}^{-1}Z'\Delta\hat{u} \sim \chi^2(L-K),$$

where $L$ is the number of moment restrictions and $K$ is the number of parameters to estimate (in this case the dimension of $\beta$).

Since the model under sequential exogeneity is identified using a subset of the moments available under strict exogeneity, we can directly test the validity of the additional moment restrictions that strict exogeneity implies (noting, however, that the test is only valid given that sequential exogeneity holds). Let $J_0$ and $J_1$ be the Sargan test statistics under the sequential and strict exogeneity assumptions, with $L_0$ and $L_1$ moment restrictions respectively, so that $L_1 > L_0$. Under the null that the additional moments are valid, $J_1 - J_0$ is $\chi^2$-distributed with degrees of freedom equal to the number of additional moments, $L_1 - L_0$.
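As a worked example of this comparison, the sketch below takes the strict-exogeneity statistic minus the sequential-exogeneity statistic, using the 1-step Sargan statistics reported for the static model in the results table of Section 3. The p-value is computed from the regularized incomplete gamma function so that no toolbox is needed. Treating the 1-step statistics as directly comparable is an assumption made for illustration, since the two statistics are computed with different weight matrices.

```matlab
% Sargan statistics for the static model, 1-step GMM (from the results table)
J_str = 49.3029;  df_str = 22;    % strict exogeneity
J_seq = 18.2802;  df_seq = 10;    % sequential exogeneity

dJ  = J_str - J_seq;              % difference-in-Sargan statistic
ddf = df_str - df_seq;            % number of additional moment restrictions

% Upper tail of chi-square(ddf): P(X > dJ) = Q(ddf/2, dJ/2)
p = gammainc(dJ/2, ddf/2, 'upper');
fprintf('dJ = %.4f, df = %d, p-value = %.4f\n', dJ, ddf, p);
```

Because both models have the same number of parameters, the difference in degrees of freedom equals the difference in the number of moments. A small p-value speaks against the additional moments implied by strict exogeneity.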

1.2.6 The Rank Condition and Weak Instruments

Recall that in order to identify the parameters, the rank condition $E(Z_i'\Delta X_i) \neq 0$ must be satisfied (where $Z_i$ are the instruments used); more precisely, this matrix must have full column rank $K$. Hence, $Z_i$ must be a strong predictor of $\Delta X_i$ in order to avoid the weak instrument problem. Otherwise, the model will be poorly identified.

2 The Exercise: The Effect of Fertility on Female Labor Market Participation

We consider the relationship between female labour force participation and fertility. The data set dpdat.txt comes from the Panel Study of Income Dynamics (PSID) for the years 1986-1989. The sample consists of 1,442 women aged between 18 and 55 in 1986 who are married or cohabiting. The variables of interest are


Variable   Content
y0         Participation
x1         Fertility
x2         Children aged 2-6
x5         Children of the same sex (male or female)
x7         Schooling level 1
x10        Age
x11        Race
y1         Lagged participation
Year       Year of observation

We consider the following linear probability model for Female Labor Market Participation:

$$y_{it} = \gamma + \rho y_{i,t-1} + \beta_1\,\text{fertility}_{it} + \beta_2\,\text{kids2-6}_{it} + \delta_1\,\text{educ1}_{it} + \delta_2\,\text{educ2}_{it} + \delta_3\,\text{age}_{it} + \delta_4\,\text{race}_{it} + c_i + u_{it} \qquad (5)$$

The dependent variable $y_{it}$ is a binary indicator of labor market participation in year $t$. Fertility is also a dummy variable, which takes the value one if the age of the youngest child at time $t+1$ is 1. The equation also includes an indicator of whether the woman has a child aged between 2 and 6, and other socio-demographic characteristics.

2.1 Endogeneity Issues

Fertility: The time-varying errors are likely to include idiosyncratic labor market shocks to participation (e.g., promotion or dismissal). Since these shocks are likely to affect fertility decisions, the fertility indicator should be treated as an endogenous variable in the following sense:

F1: Fertility is correlated with the contemporaneous time-varying errors, i.e. $E[\text{fertility}_{it}u_{it}] \neq 0$.

F2: Fertility is correlated with the individual-specific effects, i.e. $E[\text{fertility}_{it}c_i] \neq 0$.

Kids 2-6: The presence of a child aged 2-6 is the result of past fertility decisions, and so it should be treated as

K1: a predetermined variable.

K2: a variable that is correlated with individual effects, $c_i$.

Lagged Participation: If the model contains a lagged dependent variable, then we explicitly allow for the transmission of current shocks into future values of the explanatory variable. Therefore,

L1: the lagged dependent variable is by construction predetermined and hence violates the strict exogeneity assumption.

L2: For the same reasons, $y_{i,t-1}$ will be correlated with the individual effects, $c_i$.

To see this, consider the example below:

Example 1 (Dynamic unobserved effects model): Consider an AR(1) model with additional explanatory variables

$$y_{it} = z_{it}\gamma + \rho_1 y_{i,t-1} + c_i + u_{it}, \qquad t = 1, 2, \dots, T,$$

such that $x_{it} = (z_{it}, y_{i,t-1})$. To see that this model violates the strict exogeneity assumption, observe that

$$E(u_{it}y_{it}) = E(u_{it}z_{it})\gamma + \rho_1 E(u_{it}y_{i,t-1}) + E(u_{it}c_i) + E(u_{it}^2).$$

If we assume that $z_i$ is strictly exogenous, that $u_{it}$ is uncorrelated with $y_{i,t-1}$ (as sequential exogeneity requires) and that $u_{it}$ is independent of the unobserved effect, this reduces to

$$E(u_{it}y_{it}) = E(u_{it}^2) > 0.$$

This implies that $E(u_{it}x_{i,t+1}) \neq 0$ (since $y_{it}$ is a part of $x_{i,t+1}$), and the dynamic linear unobserved effects model therefore violates the strict exogeneity assumption.

Under sequential exogeneity, we have

$$E(y_{it} \mid x_{it}, x_{i,t-1}, \dots, x_{i1}, c_i) = E(y_{it} \mid z_{it}, y_{i,t-1}, z_{i,t-1}, y_{i,t-2}, \dots, z_{i1}, y_{i0}, c_i)$$

and the sequential exogeneity assumption implies

$$E(y_{it} \mid z_{it}, y_{i,t-1}, z_{i,t-1}, y_{i,t-2}, \dots, z_{i1}, y_{i0}, c_i) = E(y_{it} \mid z_{it}, y_{i,t-1}, c_i) = z_{it}\gamma + \rho_1 y_{i,t-1} + c_i.$$

Hence, the model is a correct specification of the conditional mean. Equivalently, the regressor $y_{i,t-1}$ is sequentially exogenous conditional on the unobserved effect, $c_i$.

2.2 A Potentially Exogenous Instrument

Same sex: An indicator of whether a woman has two children of the same sex (male or female). The motivation for this instrument is that women with two children of the same sex have a significantly higher probability of having a third child (couples continue having children until they have a full set). However, the instrument is not necessarily strictly exogenous:

S1: We expect Same sex to be predetermined: It may be affected by past labour shocks through past fertility decisions (given that the sample contains data for women with fewer than two children).

S2: We expect Same sex to be correlated with the fixed effect, as it will be a predictor for preferences for children (given that the sample contains data for women with fewer than two children).

3 The Questions

In the following, we shall estimate the model under different assumptions regarding the exogeneity of the explanatory variables and instruments. Using OLS, 2SLS, FE and different GMM estimators, we will estimate both a static and a dynamic version of (5). That is, for each method, estimate the model in (5) with and without lagged participation, $y_{i,t-1}$, in the set of conditioning variables.

Throughout the exercise, you will be asked to replicate the results of the following table.

Female Labour Force Participation
Linear probability model (N=1442, 1986-1989)

                                                 GMM (Str. Ex)       GMM (Seq. Ex)
Variable           OLS      2SLS        FE    1-step    2-step    1-step    2-step

Fertility      -0.1553   -1.0103   -0.0565   -0.0769   -0.0531   -0.1322   -0.1009
               -8.1667   -2.1465   -3.8048   -2.8288   -2.0653   -2.1525   -1.6997
Kids 2-6       -0.0778   -0.2425    0.0006   -0.0051   -0.0024   -0.0859   -0.0720
               -5.1935   -2.6304    0.0434   -0.3577   -0.1842   -2.6588   -2.3192
Sargan               -         -         -   49.3029   48.7363   18.2802   18.1236
df                   0         0         0        22        22        10        10

Models including lagged participation
Y(t-1)          0.6284    0.6155    0.0353    0.3561    0.3843    0.2874    0.3190
               41.8886   30.1413    1.6833    8.2597    9.9012    6.3337    7.5080
Fertility      -0.0898   -0.3332   -0.0563   -0.0929   -0.0720   -0.1417   -0.1331
               -5.2487   -1.3187   -3.7533   -3.0908   -2.4894   -2.2298   -2.1238
Kids 2-6       -0.0195   -0.0673   -0.0000   -0.0161   -0.0184   -0.1000   -0.0849
               -2.1269   -1.3463   -0.0010   -1.0740   -1.2780   -3.5441   -3.0783
Sargan               -         -         -   52.9032   51.4411   25.3089   24.2667
df                   0         0         0        27        27        15        15

Note: Heteroskedasticity-robust t-ratios are shown below the estimates.

To make the results comparable, always report White's heteroskedasticity-consistent standard errors. [Hint: Use the function robust()]

Note: The code at the bottom of the Ex Ante file will produce a table like the one above. However, it assumes that, for the dynamic models, the coefficient on the lagged dependent variable (or on the change in it) comes before the coefficients on the other explanatory variables. Otherwise, the estimates will have switched places.

We will discuss how these different estimation strategies are able to take account of differentaspects of the endogeneity problems mentioned above: F1, F2, K1, K2, L1, L2, S1 and S2.

1. Replicate column 1: Estimate the static and dynamic version of (5) using pooled OLS (thisis done for you). Which of the endogeneity problems F1, F2, K1, K2, L1, L2, S1 and S2 areneglected here?

2. Replicate column 2: Estimate the static and a dynamic version of (5) using pooled 2SLS where the variable "Same sex" is used as an instrument for fertility. Which of the endogeneity problems F1, F2, K1, K2, L1, L2, S1 and S2 are neglected here?

Below we will estimate models where we get rid of the individual specific component. Therefore, exclude the constant, age, race, and education dummies from the regression.

3. Replicate column 3: Estimate the static and a dynamic version of (5) using fixed effects. Which of the endogeneity problems F1, F2, K1, K2, L1, L2, S1 and S2 are neglected here?

4. Replicate columns 4-5: Estimate (in orthogonal deviations) the static and a dynamic version of (5) using GMM with the following instrumental variables: All lags and leads of "kids 2-6" and "Same sex". For the dynamic model, also use lags of participation up to $t-2$. Report parameter estimates along with the Sargan test statistic for overidentifying restrictions. Which of the endogeneity problems F1, F2, K1, K2, L1, L2, S1 and S2 are neglected here?

Hints for programming:

1. First-differencing: Do orthogonal deviations by transforming the data with A using the function permgeneral(), e.g. ∆y = permgeneral(A, y, n, T). permgeneral() is a generalized version of the perm function we used in problem sets 2 and 3, able to take transformation matrices of, say, size $(T-1)\times T$ (note that this means that it "automatically" removes period 1, which we had to do manually earlier on).

2. Creating the instrument matrix.

• The function zstex() creates a matrix of instruments treating the variable as strictly exogenous. That is, it creates a matrix Z with elements

$$Z_i = \begin{pmatrix}
z^o_{iT} & 0 & \cdots & 0 \\
0 & z^o_{iT} & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & z^o_{iT}
\end{pmatrix}_{(T-1)\times L}$$

where $z^o_{iT} = (z_{i1}, \dots, z_{iT})$ and $L = T(T-1)\cdot\#\{\text{columns in } z\}$. Hence, past, present and future values of $z_{it}$ are stored in the instrument matrix $Z_i$.

• The function zpred() creates a matrix of instruments treating the variable as predetermined. That is, it creates a matrix Z with elements

$$Z_i = \begin{pmatrix}
z^o_{i1} & 0 & \cdots & 0 \\
0 & z^o_{i2} & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & z^o_{i,T-1}
\end{pmatrix}_{(T-1)\times L}$$

such that only past realizations of $z_{it}$ are used as instruments at time $t$, i.e. $z^o_{i,t-1} = (z_{i1}, \dots, z_{i,t-1})$, which gives $L = \frac{1}{2}T(T-1)\cdot\#\{\text{columns in } z\}$.

• In the static model, all instruments are assumed to be strictly exogenous, so the function zstex() is the relevant one!

• In the dynamic model, you also have a predetermined instrument (lagged participation), so you will have to use both of the functions zstex() and zpred().

3. Do GMM estimation using the function GMM().
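A simple way to check that the instrument matrix has been built as intended is to count its columns and compare $L - K$ with the degrees of freedom of the Sargan test in the results table. The sketch below does this for the static model. It assumes $T = 4$ (the years 1986-1989), two instrument variables ("kids 2-6" and "Same sex") and $K = 2$ slope coefficients (fertility and kids 2-6); the variable names are illustrative only.

```matlab
T  = 4;      % years 1986-1989
nz = 2;      % instrument variables: kids 2-6 and same sex
K  = 2;      % static model: coefficients on fertility and kids 2-6

% Strictly exogenous: all T values of z in each of the T-1 equations
L_stex = T*(T-1)*nz;

% Predetermined: only z_i1, ..., z_i,t-1 in the equation for period t
L_pred = T*(T-1)/2*nz;

fprintf('strict exogeneity:     L = %d, df = %d\n', L_stex, L_stex - K);
fprintf('sequential exogeneity: L = %d, df = %d\n', L_pred, L_pred - K);
```

Under these assumptions the implied degrees of freedom are 22 and 10, which agree with the static-model rows of the results table.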


5. Replicate columns 6-7: Estimate (in orthogonal deviations) the static and a dynamic version of (5) with GMM using the following instrumental variables: Lags of "kids 2-6" and "Same sex" up to $t-1$. For the dynamic model, also use lags of participation up to $t-2$. Which of the endogeneity problems F1, F2, K1, K2, L1, L2, S1 and S2 are neglected here?
