Estimating Marginal Returns to Education Jenna Stearns Department of Economics University of California, Santa Barbara

Estimating Marginal Returns to Educationecon.ucsb.edu/~pjkuhn/Ec250A/StudentsPapers/Stearns... · 2013-03-26 · Estimating returns to education is an essential part of determining


Estimating Marginal Returns to Education

Jenna Stearns

Department of Economics

University of California, Santa Barbara


1 Introduction

Estimating returns to education is an essential part of determining optimal schooling decisions, at both the individual and the social level. Because education is costly, individuals

make choices about how much time to spend in school based on the difference between the

marginal return to an additional year of schooling and the cost. From a policy perspective,

average marginal returns to education across the population drive investment decisions in

education as well as laws regarding mandatory schooling. However, the causal effects of

education on earnings are difficult to measure. Although economists have been estimating

returns to education for decades, the variation in empirical results and the continued activity in this area both suggest that there is no clear consensus on how best to measure

marginal returns to education, or even how big they are.

In reality, people are heterogeneous and make decisions based on differences in individual characteristics that are both observed and unobserved by the economist. Because of this,

people react to policies, standards, and choices in different ways. Economists would like to

know, on average, how the marginal return to schooling changes as the number of years of

education increases, and would also like to be able to evaluate policies that change the probability of attaining a certain level of schooling. Estimating the marginal returns to education

can help do both of these things.

The purpose of this literature review is not to list the various empirical estimates of

marginal returns to education found in studies using different methods and different data.

Instead, I aim to provide an overview of the influential methods and papers in the field,

and to identify and explain some of the primary challenges in estimating marginal returns

to schooling. These issues likely explain much of the variation in reported returns. I also explore potential avenues for future research that can help better identify marginal returns to

education.

The remainder of this paper is organized as follows. Section 2 describes two distinct

ways of thinking about marginal returns to education. Section 3 discusses the estimation

techniques and problems with a more traditional definition. Section 4 describes the marginal

treatment effect, another way of interpreting marginal returns. Section 5 compares the estimated parameters from these different methods in two empirical papers that estimate the marginal returns to college. Finally, Section 6 discusses areas for future research.

2 What are Marginal Returns to Education?

There are two related but fundamentally different ways of defining marginal returns to

education, and both are used in the literature to answer different types of questions. This

section identifies and explains these two concepts; subsequent sections relate each back to

some of the influential literature in which they are used.

The first way to think about marginal returns to education is in line with the traditional

idea of marginal benefits: each additional unit of a good provides some additional utility.

People want to consume more of the good until the benefits from the last unit are equal to

the cost of obtaining it. The same concept is true of education. Individuals should choose to

stay in school until the marginal return is no longer greater than the marginal cost.1 Thus, we

are interested in how the marginal return to schooling changes as the amount of schooling

increases. In other words, holding constant the individual, economists are interested in how

the marginal return changes over time.

Of course, the obvious problem with estimating an individual’s marginal return to education is that only one outcome per individual is realized. That is, if Tom completes twelve years of schooling, only his return to twelve years of schooling can be estimated from his observed earnings. Economists do not observe the counterfactual: his earnings had he gotten a college degree or had he dropped out of school in eleventh grade. To estimate individual-level marginal returns in this context, economists must make the assumption that all individuals have the same returns to education, and attempt to control for the differences driving schooling decisions. More commonly, the average marginal return is what is actually estimated.

The second way to define a marginal return to education is to hold constant the level of schooling and look at returns across individuals. To distinguish it from the above definition, this marginal return will be called the marginal treatment effect (MTE), which is consistent with the definition of the MTE used in the literature.2 The MTE measures how the return to education for individuals who are indifferent between staying in school and leaving changes as the fraction of the population with a fixed level of schooling increases. In other words, holding constant the level of schooling, we want to know how returns vary over individuals. More specifically, we want to know the return of the person on the margin.

1 Empirically, returns to education are generally estimated using a measure of earnings. However, theoretically, returns to education are often defined more broadly to include components of utility that are more difficult to measure in the data. For example, some people (e.g., Ph.D. students) enjoy learning and derive some intrinsic benefit from knowledge; for others education is simply a means to an end or a signal in the labor market.

These two definitions are equivalent if returns to education are the same for everyone.

However, as discussed in detail below, marginal returns to schooling are not homogeneous.

Therefore, each of these concepts helps answer a specific question. If economists are interested in how individuals make schooling decisions, the first definition is more useful. If

instead economists want to assess the impact of a policy aimed at changing the number of

people who complete a certain level of education, then the second is preferred.

3 Marginal Return to Education

3.1 Estimation

Any analysis of marginal returns to education starts with the assumption that individuals

decide on their optimal amount of education by comparing the benefits to the costs. Benefits

include improvements in earnings over the course of the lifetime, as well as non-monetary

gains such as access to more desirable jobs, self-worth, and the joy of learning (“psychic earnings”). Costs usually are thought of as a combination of money spent directly on education,

the forgone value of the time spent obtaining education, and disutility of studying.

Becker and Chiswick (1966) play an important role in developing the literature on estimating marginal returns to education. Their canonical model of human capital views education as an investment decision, where the costs are compared to the discounted stream of

expected future benefits. Thus, schooling is an endogenous decision. The simple model says

that total earnings over the lifetime are the sum of the returns to investments in education

plus the earnings from any original human capital:

2 The concept of the MTE was first introduced by Bjorklund and Moffitt (1987) as a way to estimate marginal returns, and was more formally defined by Heckman (1997).


\[ E_i = X_i + \sum_{j=1}^{m} r_{ij} C_{ij}, \]

where Ei is the lifetime earnings of person i, Cij is the amount spent by person i on the jth

unit of investment in education, rij is the marginal rate of return on the jth unit of investment,

and Xi is the return from any original human capital. Becker and Chiswick point out that

each individual invests in education until the marginal rate of return on a dollar of investment is equal to the marginal “interest” cost of that dollar. If the rate of return is assumed to

be constant across units of schooling, then the marginal return is equal to the average return.

Becker (1967) extends this analysis. In a slightly more detailed model, he lays out how to

solve for a condition for the optimal amount of schooling. Again, this is the point at which

the marginal benefits equal the marginal costs. He also suggests that individual heterogeneity in this optimal choice can arise from one of two sources. Individuals can differ in their marginal returns to schooling, or they can face different marginal costs of schooling. However, if individuals all face the same costs (what he calls “equality of opportunity”) or have

the same benefits (“equality of ability”) then the marginal return to a given level of schooling

is the same for everyone.

A main limitation of the Becker model is that schooling is the only specified source of

human capital. If schooling and the error are uncorrelated, then an Ordinary Least Squares

(OLS) regression will produce an unbiased estimate of the return to a year of education.

However, if they are correlated, the estimate will be biased. Becker identifies two obvious

reasons why schooling is likely correlated with the error in this model. First, human capital can also be accumulated through on-the-job training (experience). Years of schooling and

years of experience, conditional on age, are highly correlated. Second, if ability is unobserved

and the return to education varies by ability, then estimates of the marginal return to education will also be biased. If more able individuals make greater investments in education, then

the estimated marginal rate of return to schooling is actually the return to schooling plus the

returns to ability and other unobserved forms of human capital.

Mincer (1974) extends Becker’s work to try to address part of this problem. He includes

in his model a measure of on-the-job training and experience. Importantly, he also shows that

percent changes in earnings are strictly proportional to the absolute differences in schooling.


In other words, log earnings are a linear function of years of education:

\[ \ln Y_i = \ln Y_0 + r S_i + u_i, \]

where Yi is the level of earnings of individual i, and Y0 is the mean level of earnings of

someone with zero years of schooling. The coefficient r is the marginal return to a year of education, which Mincer assumes to be constant in the simplest model, but can be subscripted

by S and allowed to vary.

In this framework of log earnings, Mincer argues, years of work experience should enter

additively and not multiplicatively. Additionally, the experience term is concave, resulting

in the following earnings equation for an individual with t years of experience:

\[ \ln Y_i = \ln Y_0 + r S_i + \beta_1 t_i - \beta_2 t_i^2 + \varepsilon_i. \tag{1} \]

The above equation is the basis for many studies estimating the returns to schooling. It

can be extended in several ways. There is no reason to assume that marginal returns to education are constant; for example, adding a nonlinear schooling term ($S_i^2$) captures how

marginal returns are changing as the level of schooling changes. An interaction term between

schooling and experience may help predict marginal returns as well. Mincer specifies the

following equation to estimate the marginal effects of education on log earnings using a

cross-sectional distribution of annual earnings in 1959 for white men:

\[ \ln Y_i = \alpha + r_1 S_i - r_2 S_i^2 - \gamma t_i S_i + \beta_1 t_i - \beta_2 t_i^2 + \varepsilon_i. \]

He estimates

\[ \ln Y_i = 4.87 + 0.255 S_i - 0.0029 S_i^2 - 0.0043 t_i S_i + 0.148 t_i - 0.0018 t_i^2. \]

Then the marginal returns at different levels of schooling can be approximated. The marginal

return to the Sth year of education is given by:

\[ r_S = \frac{d(\ln Y)}{dS} = 0.255 - 0.0058\,S - 0.0043\,t. \]


Clearly, the marginal returns are decreasing in the level of schooling. For someone with eight

years of experience (t = 8), the marginal return to the eighth year of education in this sample

is 17.4 percent. The marginal return is 15.1 percent for the twelfth year of schooling, and is

12.8 percent for the sixteenth year.
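The arithmetic behind these figures can be checked with a short script. This is only an illustrative sketch: the coefficients are the point estimates quoted above, and the function name `marginal_return` is a hypothetical helper rather than anything taken from Mincer's analysis.

```python
# Point estimates quoted in the text for Mincer's 1959 cross-section.
R1 = 0.255      # coefficient on S
R2 = 0.0029     # coefficient on S^2 (enters the equation with a minus sign)
GAMMA = 0.0043  # coefficient on t*S (enters the equation with a minus sign)


def marginal_return(schooling, experience):
    """Return d(ln Y)/dS = r1 - 2*r2*S - gamma*t at the given S and t."""
    return R1 - 2 * R2 * schooling - GAMMA * experience


# Marginal return to the 8th, 12th and 16th year, holding experience at t = 8.
for years in (8, 12, 16):
    print(f"S = {years:2d}: {100 * marginal_return(years, 8):.1f} percent")
```

Evaluating the derivative at other values of t shows how the interaction term lowers the estimated marginal return for more experienced workers.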

Although this specification is a good illustration of a simple way to estimate non-constant

marginal returns empirically, Mincer notes that both the nonlinear schooling term and the interaction term become insignificant when other covariates (namely number of weeks worked

in 1959) are controlled for. Ignoring possible sources of bias, which are discussed in the

following section, Mincer’s simple model in (1) seems to do a good job of estimating the

marginal return to schooling.

In his 2001 survey, Card extends the Mincer model described above. He shows that one

can allow for individual heterogeneity to affect both the intercept of the log earnings equation and the slope, through the coefficient on schooling. Individual intercepts are estimated by including an individual-level fixed effect in the regression equation, and the average marginal return to education is just the expectation of the individual marginal return.

3.2 Problems with OLS Estimation

There is significant cause for concern that OLS estimates of marginal returns to education

are biased. Bias here is meant in the asymptotic sense: the probability limit of the estimator differs from the average marginal return to schooling in the population. The

issue of bias is well-acknowledged throughout the literature using OLS to estimate returns

to education, and Card (2001) has a particularly nice discussion about some of the main

problems.

First, suppose that log earnings are linear in years of education, and that there is no individual heterogeneity in returns. Even if these assumptions are true, OLS is biased if there is

correlation between an individual’s unobserved ability and the marginal cost of schooling. If

the marginal costs are lower for people who are more able (i.e., those who would earn more

at any level of schooling than those of lower ability would), then OLS is biased upward. This

is known as ability bias.
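A small simulation can make the direction of ability bias concrete. The sketch below is not drawn from any of the cited papers: the data-generating process, the true return of 0.08, and every other parameter value are assumptions chosen purely for illustration, and `ols_slope` and `simulate_ols` are hypothetical helper names.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_ols(ability_effect_on_schooling, n=20000, seed=0):
    """Simulate log earnings with a homogeneous return and estimate it by OLS.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_return = 0.08
    schooling, log_earnings = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        # Higher ability lowers the marginal cost of schooling, so more able
        # people choose more schooling when the loading below is positive.
        s = 12 + ability_effect_on_schooling * ability + rng.gauss(0, 1)
        # Ability also raises earnings directly at every level of schooling.
        y = 1.0 + true_return * s + 0.10 * ability + rng.gauss(0, 0.3)
        schooling.append(s)
        log_earnings.append(y)
    return ols_slope(schooling, log_earnings)


biased = simulate_ols(ability_effect_on_schooling=2.0)
unbiased = simulate_ols(ability_effect_on_schooling=0.0)
print(f"OLS slope, schooling correlated with ability:   {biased:.3f}")
print(f"OLS slope, schooling uncorrelated with ability: {unbiased:.3f}")
```

Under these assumed parameters the probability limit of the first slope is 0.08 + 0.10 × 2/5 = 0.12, so OLS overstates the true return of 0.08 even though returns are homogeneous.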

If we allow for heterogeneity in returns, then there are even more issues. People with


higher returns to education have an incentive to acquire more education, all else equal. Even assuming there is no ability bias, cross-sectional estimates likely yield upward biased estimates

of the average marginal return to education. This endogeneity, or comparative advantage,

bias arises from the fact that differences in the earnings-education relationship result from

differences in returns as opposed to differences in preferences for education, costs, or ability.

More simply stated, OLS estimates are upward biased if an individual’s return to schooling

is positively correlated with the amount of education chosen. If there is ability bias on top of

this, the upward bias in OLS is even larger.

Education can also be mismeasured or misreported. Empirically, economists only observe information on rounded years of schooling rather than a continuous measure, and self-reported schooling information is not always accurate. Classical measurement error in schooling causes attenuation bias in OLS: the estimated coefficient is pulled toward zero. Because returns to schooling are generally assumed to be positive, this is a downward bias. Griliches (1977) argues that measurement error is enough of

a problem to at least partially offset any upward bias from the sources mentioned above.

Angrist and Krueger (1999) conclude that the reliability of self-reported schooling is about

85-90 percent, which implies that the resulting measurement error is enough to offset modest

ability bias. However, Card (2001) points out that measurement error in schooling is mean-regressive. People with the highest levels of education cannot over-report, and those with the

lowest levels cannot under-report. This means that conventional estimates of measurement

error are likely overstated and the true attenuation bias in OLS estimates of marginal returns

to education is smaller than previously thought.
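The link between reliability and attenuation can be illustrated with a simulation of the classical case. This is a sketch under assumed values: true schooling has a standard deviation of 3 years and reporting error a standard deviation of 1 year, which gives a reliability of 9/(9 + 1) = 0.9, in the range quoted above. It does not model the mean-regressive error that Card describes, and the helper names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(error_sd, n=20000, seed=1):
    """OLS slope of log earnings on reported schooling (hypothetical parameters)."""
    rng = random.Random(seed)
    true_return = 0.08
    reported, log_earnings = [], []
    for _ in range(n):
        s = 12 + rng.gauss(0, 3)                      # true years of schooling
        y = 1.0 + true_return * s + rng.gauss(0, 0.3)
        reported.append(s + rng.gauss(0, error_sd))   # classical reporting error
        log_earnings.append(y)
    return ols_slope(reported, log_earnings)


clean = simulate_attenuation(error_sd=0.0)
noisy = simulate_attenuation(error_sd=1.0)
reliability = 9 / (9 + 1)  # var(true schooling) / var(reported schooling)
print(f"slope with accurate schooling: {clean:.3f}")
print(f"slope with reported schooling: {noisy:.3f}")
print(f"true return x reliability:     {0.08 * reliability:.3f}")
```

With classical error the slope shrinks toward the true return multiplied by the reliability ratio, so a reliability of 0.9 removes roughly a tenth of the estimated return in this setup.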

Theoretically, the direction of the overall bias in OLS estimates is ambiguous. However,

the literature concludes that it is most likely positive. Ability and endogeneity biases dominate measurement error. This means that many of the empirical estimates of the marginal

returns to education in the literature are probably overstating the true return.

3.3 Instrumental Variables and Marginal Returns to Education

The instrumental variables (IV) method is often used as an alternative to OLS estimation

of marginal returns to education in order to overcome the ability bias and measurement error

issues. In the absence of heterogeneity in returns, if there exists an observable instrument that


affects education choices but is uncorrelated with ability, then an IV estimator based on this

instrument will yield a consistent estimate of the average marginal return to schooling. When

marginal returns are the same for everyone, any valid instrument will identify the same

parameter. If there is heterogeneity in returns to schooling (endogeneity bias), however, a

stronger independence assumption between the instrument, individual ability, and the error

in the schooling equation is needed to produce a consistent estimate. Examples of common

instruments used to estimate returns to education include distance to college (Card, 1993),

tuition costs (Kane and Rouse, 1995), and minimum mandatory schooling laws (Angrist and

Krueger, 1991; Oreopoulos, 2006).

The stronger assumption is usually violated by these sorts of instruments. Different instruments thus measure different effects, depending on which individuals are induced to

change their optimal education choice by a change in the instrument. Imbens and Angrist

(1994) formalize the notion that when there is heterogeneity in returns, IV actually measures

a local average treatment effect (LATE). The LATE parameter is consistently estimated given

the instrument satisfies the standard assumptions, but it consistently estimates the marginal

return to education only for a select subset of the population: those whose schooling decision

is affected by a change in the instrument.
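The LATE interpretation can be illustrated with a stylized simulation. The sketch below assumes a binary schooling decision (college or not), a binary instrument, and three hypothetical groups with different returns; none of these numbers come from the cited studies. The Wald (IV) estimator is the difference in mean earnings across instrument values divided by the difference in college attendance.

```python
import random


def simulate_wald(n=60000, seed=2):
    """Wald/IV estimate with a binary instrument and heterogeneous returns.

    Hypothetical setup: college (D) is binary and the population consists of
    always-takers, compliers, and never-takers with different returns.
    """
    rng = random.Random(seed)
    # (population share, return to college) for each hypothetical type; the
    # shares match the thresholds used to draw types below.
    types = {"always": (0.3, 0.10), "complier": (0.4, 0.30), "never": (0.3, 0.20)}
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}  # per Z: sum of Y, sum of D, count
    for _ in range(n):
        draw = rng.random()
        kind = "always" if draw < 0.3 else ("complier" if draw < 0.7 else "never")
        z = 1 if rng.random() < 0.5 else 0
        # Only compliers change their college decision when Z changes.
        d = 1 if kind == "always" or (kind == "complier" and z == 1) else 0
        y = 1.0 + types[kind][1] * d + rng.gauss(0, 0.3)
        sums[z][0] += y
        sums[z][1] += d
        sums[z][2] += 1
    mean_y = {z: sums[z][0] / sums[z][2] for z in (0, 1)}
    mean_d = {z: sums[z][1] / sums[z][2] for z in (0, 1)}
    wald = (mean_y[1] - mean_y[0]) / (mean_d[1] - mean_d[0])
    ate = sum(share * gain for share, gain in types.values())
    return wald, ate


wald, ate = simulate_wald()
print(f"Wald (IV) estimate:            {wald:.3f}")
print(f"population average return:     {ate:.3f}")
```

In this assumed population the IV estimate settles near the compliers' return of 0.30 rather than the population average of 0.21, which is the sense in which different instruments, moving different people, identify different parameters.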

In comparison to OLS, IV estimates are unaffected by classical measurement error. This

is one reason why IV estimates are generally larger than OLS estimates. In the presence of

heterogeneity, when IV estimates the LATE parameter, it could produce a larger parameter

than OLS because of whom the instrument affects. For example, if the individuals affected are

more credit constrained but have high returns to schooling, then IV will overestimate the

average marginal return to education of the sample population.

The validity of the IV estimator depends crucially on the assumption that the instrument is uncorrelated with the error. Even small violations of this assumption can produce large biases in the estimated parameter, particularly when the instrument is only weakly correlated with schooling. Carneiro and Heckman (2002) show that several commonly used instruments, including distance to college and tuition, are correlated with ability. If ability is not controlled for as a covariate, then these instruments are “bad” and result in upward biased IV estimates. This is problematic because many commonly used data sources do not include a measure of individual ability, and thus it cannot be controlled for.


4 Marginal Treatment Effects

4.1 Relationship between Marginal Returns and Marginal Treatment Effects

If the instrument is valid, the LATE parameter is a consistent estimate of the marginal

return to education for a specific subset of the population. However, because of the extreme

sensitivity to the choice of instrument, it is often difficult to establish exactly what the LATE

parameter is measuring. Furthermore, instruments such as the ones mentioned above tend

to affect a particular level of schooling: changing the compulsory schooling age from 16 to

17, for example, does not directly affect the returns of people who would have completed

sixteen years of education anyway.3 Because the LATE estimates the return to education

of those affected by the instrument, it is not measuring the return to an additional year of

education for the average individual. For these reasons, Heckman (1997) argues that LATE

does not necessarily produce a parameter that is economically interesting, nor does it capture

how marginal returns are changing over the level of education. It fits somewhere in between

the two definitions discussed in section 2.

From a policy perspective, the second definition of a marginal return to education is often more relevant. Economists want to know what the marginal return to schooling is for

the people affected by a policy change, so that they can compare the benefits and costs of the

policy. In a series of papers, Heckman and Vytlacil (Heckman, 1997; Heckman and Vytlacil,

1999, 2001a) attempt to define such a parameter. The marginal treatment effect is the effect of treatment for individuals who are indifferent between taking and not taking treatment. The MTE is the limiting form of the LATE as the change in the probability of treatment induced by the instrument becomes arbitrarily small. Unlike the LATE, however, the MTE parameter does not depend on

the choice of instrument. Additionally, because the MTE identifies the return to education

at every margin, it can be used to construct any treatment parameter of interest.4 In particular, Carneiro, Heckman, and Vytlacil (2010) show how the MTE can be used to identify

the marginal return to education of individuals affected by a specific educational policy. The

following section outlines the framework used to identify the MTE.

3 General equilibrium effects are ignored here, as in the majority of the literature. However, they are important to consider. See Heckman, Lochner, and Taber (1998) for a discussion of general equilibrium effects in education.

4 In practice, the MTE may not be defined everywhere. This limitation is discussed below.


4.2 A Generalized Roy Model to Estimate the MTE

This section summarizes the generalized Roy Model used in much of the econometric

literature deriving treatment effect parameters (e.g., Imbens and Angrist, 1994; Heckman and

Vytlacil, 2001b; Heckman, Urzua, and Vytlacil, 2006). This framework is directly applicable

to estimating the marginal effects of schooling on earnings. For simplicity, suppose that there

are only two levels of education. Individuals can choose to go to college or not go to college.

Because this decision is likely correlated with unobserved ability, a valid instrument is used

to solve the selection bias issues. This instrument must affect the decision to go or not go to

college, but must not affect earnings in any other way.

For each individual i, assume there are two potential outcomes, (Y0i, Y1i), corresponding to earnings if the individual does not go to college and does go to college, respectively.

Earnings in each state are a function of a vector of observable random variables Xi and an

unobserved random variable Ui:

\begin{align*}
Y_{0i} &= \mu_0(X_i) + U_{0i} \\
Y_{1i} &= \mu_1(X_i) + U_{1i}.
\end{align*}

There is no reason to impose that µ(Xi) is linearly separable, but separability between the

observable and unobservable characteristics is assumed.

Further, the outcome of individual i depends upon the college decision, given by:

\[ D_i^{\star} = \mu_D(Z_i) - U_{Di}, \]

where the instrument $Z_i$ is a vector of observed random variables and $U_{Di}$ is an unobserved random variable. The vector of instruments can contain all of the $X_i$s and must include at least one instrument that is excludable from the outcome equation. The value of $D_i^{\star}$ is not observed by the economist, but an indicator of treatment, $D_i$, is. The treatment indicator is defined as:

\[ D_i = \begin{cases} 1 & \text{if } D_i^{\star} > 0 \\ 0 & \text{if } D_i^{\star} \le 0 \end{cases} \]


Although $D_i^{\star}$ is not observed, both the realized outcome of individual i and the treatment indicator are observed:

\[ Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}. \tag{2} \]

This can be written as:

\[ Y_i = \mu_0(X_i) + D_i[\mu_1(X_i) - \mu_0(X_i) + U_{1i} - U_{0i}] + U_{0i}. \]

The difference U1i − U0i can be interpreted as the idiosyncratic gain from education, and the

average gain from going to college is µ1(Xi)− µ0(Xi).
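The rewritten outcome equation follows from substituting the two potential-outcome equations into (2) and collecting the terms multiplied by the treatment indicator. As a sketch of the algebra, using only the symbols already defined in this section:

```latex
\begin{align*}
Y_i &= D_i Y_{1i} + (1 - D_i) Y_{0i} \\
    &= D_i\,[\mu_1(X_i) + U_{1i}] + (1 - D_i)\,[\mu_0(X_i) + U_{0i}] \\
    &= \mu_0(X_i) + D_i\,[\mu_1(X_i) - \mu_0(X_i) + U_{1i} - U_{0i}] + U_{0i}.
\end{align*}
```

The bracketed term is the individual's total gain from college, which splits into the average gain and the idiosyncratic gain described above.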

In this framework, Heckman and Vytlacil (2001a) define the conditional probability of

going to college as:

\[ P(z) \equiv \Pr(D_i = 1 \mid Z_i = z, X_i = x) = F_{U_{Di}}(\mu_D(z)), \]

where $F_{U_{Di}}$ is the cumulative distribution function (CDF) of the random variable $U_{Di}$ with

realization u. Empirically, the selection probability can be estimated using a probit or a logit

model. Given the properties of a CDF and the assumptions above, it is possible to assume

UDi ∼ Uniform[0, 1] without loss of generality.5 One way to think about this is that the

transformed values of UDi represent the quantiles of the original values. This normalization

allows us to avoid making an assumption about the true distribution of the error. It also then

follows that µD(z) = P (z). Furthermore, because of the uniform distribution, we can define

UDi = FUDi(UDi). Notice that Di = 1 if and only if P (Zi) ≥ UDi.
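The normalization can be checked numerically. The sketch below assumes, purely for illustration, that the unobservable in the college decision is standard normal and that the index takes the hypothetical form shown in the code. It confirms that comparing the propensity score with the quantile of the unobservable reproduces the decisions implied by the latent index, and that the transformed unobservable behaves like a Uniform[0, 1] draw.

```python
import random
from statistics import NormalDist

rng = random.Random(3)
std_normal = NormalDist()  # assumed distribution of the unobservable U_D

mismatches = 0
quantiles = []
for _ in range(10000):
    mu_d = 0.5 * rng.gauss(0, 1)   # hypothetical index mu_D(Z_i)
    u_d = rng.gauss(0, 1)          # unobservable in the college decision
    d_latent = 1 if mu_d - u_d > 0 else 0       # D_i from the latent index
    p_z = std_normal.cdf(mu_d)                  # propensity score P(Z_i)
    u_tilde = std_normal.cdf(u_d)               # U_D expressed as a quantile
    d_normalized = 1 if p_z > u_tilde else 0    # decision after normalizing
    mismatches += d_latent != d_normalized
    quantiles.append(u_tilde)

mean_quantile = sum(quantiles) / len(quantiles)
print("decisions that differ:", mismatches)
print("mean of normalized U_D:", round(mean_quantile, 3))
```

Because the CDF is strictly increasing, applying it to both sides of the selection rule leaves every decision unchanged, which is why no distributional assumption on the original error is needed.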

The marginal treatment effect is the mean return to going to college for people who are indifferent between going and not going, conditional on both the observed and unobserved characteristics that determine the college decision:

\[ \mathrm{MTE}(x, u) = E[Y_{1i} - Y_{0i} \mid X_i = x, U_{Di} = u]. \]

Because $P(z) = u$ for the person indifferent to selecting or not selecting treatment ($D_i^{\star} = 0$), the MTE can be estimated by taking the derivative of the conditional expectation of the outcome with respect to the probability of treatment:6

\[ \mathrm{MTE}(x, P(z)) = \frac{\partial E[Y_i \mid X_i = x, P(Z_i) = P(z)]}{\partial P(z)}. \]

5 See Heckman and Vytlacil (2001a) for a proof of this within the latent variable framework.

Clearly, the MTE is only defined over the support of P (Zi), but is otherwise not sensitive to

the instrument choice. This means that in regions of common support, two instruments will

yield identical estimates of the MTE.
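The derivative formula suggests a simple estimation recipe: fit the conditional mean of earnings as a smooth function of the propensity score and differentiate the fit. The sketch below is a simplified illustration, not the nonparametric procedure used in the cited papers. It assumes no covariates, a propensity score that is known rather than estimated, and a hypothetical MTE curve of 0.4 − 0.4u, for which a quadratic in the propensity score is the correct specification of the conditional mean. The helper functions are hypothetical and written with the standard library only.

```python
import random


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(p, y):
    """Least-squares coefficients (b0, b1, b2) of y on 1, p, p^2."""
    power_sums = [sum(v ** k for v in p) for k in range(5)]
    xtx = [[power_sums[j + k] for k in range(3)] for j in range(3)]
    xty = [sum(yi * v ** j for v, yi in zip(p, y)) for j in range(3)]
    return solve_linear(xtx, xty)


rng = random.Random(4)
propensity, outcome = [], []
for _ in range(100000):
    p = rng.uniform(0.05, 0.95)   # propensity score P(Z_i), taken as known
    u_d = rng.random()            # normalized unobservable, Uniform[0, 1]
    d = 1 if p >= u_d else 0      # attend college iff P(Z_i) >= U_D
    gain = 0.4 - 0.4 * u_d + rng.gauss(0, 0.1)  # Y1 - Y0, so MTE(u) = 0.4 - 0.4u
    y0 = 1.0 + rng.gauss(0, 0.2)
    propensity.append(p)
    outcome.append(y0 + d * gain)

b0, b1, b2 = fit_quadratic(propensity, outcome)


def mte_hat(u):
    """Derivative of the fitted E[Y | P = p] with respect to p, at p = u."""
    return b1 + 2 * b2 * u


for u in (0.2, 0.5, 0.8):
    print(f"u = {u}: estimated MTE {mte_hat(u):.3f}, true MTE {0.4 - 0.4 * u:.3f}")
```

The estimate is only meaningful for values of u inside the simulated support of the propensity score, here [0.05, 0.95], which mirrors the support restriction noted above.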

4.3 Advantages and Disadvantages of the MTE

The marginal treatment effect has two primary advantages over other ways of estimating marginal returns. First, it is a natural way to characterize heterogeneity in returns to education because it shows how returns vary with Xi and UDi. By estimating the MTE for all values of UDi, it is possible to identify the returns to college at any relevant margin. In contrast, the LATE parameter estimates returns at unidentified intervals. The MTE aids in the analysis of variation in returns to education across populations, and also allows a more accurate analysis of policy changes, as described below. Second, Heckman and Vytlacil (2001a) show that all treatment effect parameters7 can be expressed as different weighted averages of the MTE, where the weights all integrate to one. In this sense, the MTE unifies all of the treatment effect parameters. It can be used to estimate other treatment effects when endogeneity bias prevents consistent estimation of these parameters directly.

However, there are some important limitations of the MTE. Most importantly, estimation

is empirically challenging. The MTE is identifiable only over the support of P (Zi). If the

support is not the full unit interval, the MTE will not be defined everywhere. If the CDF is

not smooth enough (the conditional probability of going to college is not continuous), the

MTE cannot be estimated either.

Finally, the marginal treatment effect is in itself generally not the parameter of interest. It identifies the average return at very specific margins. Usually, economists are interested in the returns over a wider interval. This is not such an issue because it is relatively easy to construct other treatment parameters once the MTE is defined over the relevant range. The next section describes how the MTE can be used to construct a parameter that estimates the return to education of people affected by specific policies.

6 The derivative of a conditional expectation can be estimated using standard nonparametric regression techniques (see Heckman and Vytlacil, 2001b).

7 E.g., the average treatment effect, effect of treatment on the treated, LATE, and policy relevant treatment effects.

4.4 Policy Relevant Treatment Effects

The MTE can be used to derive treatment effect parameters that directly answer the policy questions at hand. The main problem with IV estimation in the presence of heterogeneity in returns to schooling is that it is not always clear what effect the LATE parameter is identifying. However, using the MTE, economists can derive a treatment parameter that answers

a specific question.

Consider a policy that affects the probability of going to college, but that does not directly

affect earnings or unobservables in the decision process. Let D∗i be the college choice made

under the alternative policy and P ∗(Zi) be the probability that D∗i = 1 conditional on Zi.

Then the earnings outcome under the alternative policy is:

\[ Y_i^{*} = D_i^{*} Y_{1i} + (1 - D_i^{*}) Y_{0i} \]

and the outcome under the baseline policy is as in (2). The mean effect of going from the

baseline policy to the alternative policy per person shifted is the policy relevant treatment

effect (PRTE), defined by Heckman and Vytlacil (2001b). It is defined wherever $E[D_i \mid X_i = x] \neq E[D_i^* \mid X_i = x]$ as:

\[ \mathrm{PRTE} = \frac{E[Y_i \mid \text{Alternative Policy}, X_i = x] - E[Y_i \mid \text{Baseline Policy}, X_i = x]}{E[D_i \mid \text{Alternative Policy}, X_i = x] - E[D_i \mid \text{Baseline Policy}, X_i = x]}, \]

which can be alternatively expressed as a weighted average of the MTE.8 The PRTE depends

on the policy change only through the distribution of $P^*(Z_i)$. Thus, the CDFs of $P^*(Z_i)$ and

$P(Z_i)$, along with the MTE, are sufficient to calculate the average return to education of the

people affected by the policy change. Unless the instrument used corresponds exactly to the

policy change, this parameter is different from the LATE estimate.
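
To make the weighted-average representation concrete, the Python sketch below computes a PRTE from an MTE curve and two sets of propensity scores. It is a minimal illustration under stated assumptions, not an estimator from the literature: the MTE curve, the propensity scores, and the names `empirical_cdf` and `prte` are all hypothetical. The weights are the difference between the baseline and alternative CDFs, divided by the change in mean attendance (alternative minus baseline) so that they integrate to one.

```python
import numpy as np


def empirical_cdf(sample, grid):
    """Share of the sample at or below each grid point."""
    return np.searchsorted(np.sort(sample), grid, side="right") / len(sample)


def prte(mte, p_base, p_alt, grid_size=4001):
    """Weighted average of mte(u) over u in [0, 1] using CDF-difference weights."""
    u = np.linspace(0.0, 1.0, grid_size)
    shift = np.mean(p_alt) - np.mean(p_base)   # change in mean attendance
    if np.isclose(shift, 0.0):
        raise ValueError("PRTE is undefined when mean attendance does not change")
    weights = (empirical_cdf(p_base, u) - empirical_cdf(p_alt, u)) / shift
    integrand = mte(u) * weights
    # Trapezoid rule written out explicitly to avoid version-specific helpers.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u)))


# Hypothetical inputs: a declining MTE and a policy that raises every
# propensity score by 0.1.
mte = lambda u: 0.6 - 0.8 * u
p_base = np.linspace(0.2, 0.6, 5001)
p_alt = p_base + 0.1
prte_value = prte(mte, p_base, p_alt)
print(round(prte_value, 3))
```

In this example the people shifted into college are those whose unobservable lies between their baseline and alternative propensity scores, so the result is close to the assumed MTE evaluated at the mean baseline score plus half the shift (about 0.24 here). Both sets of scores must lie inside the unit interval for the weights to integrate to one.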

The PRTE can only be identified if the support of $P^*(Z_i)$ is contained in the support of

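8 $\mathrm{PRTE} = \int_0^1 \mathrm{MTE}(x, u)\,\omega_{\mathrm{PRTE}}(x, u)\,du$, where $\omega_{\mathrm{PRTE}}(x, u) = \dfrac{F_{P|X}(u \mid x) - F_{P^*|X}(u \mid x)}{E[P^*(Z_i) \mid X_i = x] - E[P(Z_i) \mid X_i = x]}$.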


$P(Z_i)$. This often means that the support must be the full unit interval, which again is a

requirement that is empirically challenging to satisfy. The marginal policy relevant treat-

ment effect (MPRTE) corresponds to a marginal change from a baseline policy and does not

run into this problem. It still answers economically interesting questions and, according to

Carneiro, Heckman and Vytlacil (2010), is the appropriate parameter with which to conduct

cost-benefit analysis of policy changes. The MPRTE is expressed as a weighted average of the

MTE as well, and places positive weight on the MTE only for values of u where the density

of $P(Z_i)$ is positive ($f_P(u) \neq 0$), so it is identified under only the assumption that $P(Z_i)$ is a

continuous random variable.

5 Empirical Estimation of Marginal Returns to Education

In a homogeneous world, the OLS, IV, and MTE methods of estimating returns to ed-

ucation would all produce the same parameter. However, in practice heterogeneity in re-

turns, ability bias, and measurement error complicate the analysis in the ways previously

discussed. How much do these factors affect different estimation methods?

Carneiro, Heckman, and Vytlacil (2011) use data from the National Longitudinal Survey

of Youth (NLSY) of 1979 to show that returns to college vary across individuals. Further-

more, they provide evidence that the people in their sample act on the knowledge about

their idiosyncratic returns to education. By calculating the MTE, they compare marginal pol-

icy relevant treatment effects and the average treatment effect to the OLS and IV estimates of

the return to a year of college. Their findings suggest that both OLS and IV are substantially

upward biased compared to the average return to a year of college in the sample, as esti-

mated by the average treatment effect. Just as importantly, they show that different policies

produce different marginal policy relevant treatment effect parameters. The average return

to a year of college of the people affected by a policy that changes the probability of going to

college by a fixed amount for everyone, for example, is similar in size to the average treat-

ment effect. However, the people affected by a policy that changes the probability of going

to college by a small proportion have an average return only one third the size of the IV esti-

mate. This suggests that the IV/LATE estimates may be wildly off the mark if they are used

to estimate the returns to individuals affected by a certain policy change and the instrument


does not correspond exactly to that policy.
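
The contrast between the two marginal policies can be illustrated with a short Python sketch. It assumes the MPRTE weights that arise as limits of the PRTE weights when the policy change becomes small: the density of the propensity score for an additive shift, and the density multiplied by u and divided by the mean propensity score for a proportional shift. Under those assumptions both parameters reduce to simple sample averages. The MTE curve, the propensity scores, and the function names are hypothetical, and the numbers are not the estimates reported by Carneiro, Heckman, and Vytlacil (2011).

```python
import numpy as np


def mprte_additive(mte, p):
    """MPRTE for P* = P + alpha as alpha -> 0: the average of MTE(P)."""
    return float(np.mean(mte(p)))


def mprte_proportional(mte, p):
    """MPRTE for P* = (1 + alpha) * P as alpha -> 0: a P-weighted average of MTE(P)."""
    return float(np.mean(p * mte(p)) / np.mean(p))


# Hypothetical declining MTE and an evenly spread set of propensity scores.
mte = lambda u: 0.6 - 0.8 * u
p = np.linspace(0.2, 0.8, 601)

additive = mprte_additive(mte, p)
proportional = mprte_proportional(mte, p)
print(f"additive shift:     {additive:.3f}")
print(f"proportional shift: {proportional:.3f}")
```

Because the proportional policy moves people with high propensity scores the most, it places more weight on high values of u, where the assumed MTE is lowest. With a declining MTE the proportional MPRTE is therefore smaller than the additive one, which is the qualitative pattern described above.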

Moffitt (2008) uses a similar approach to estimate the MTE. He uses data on the earnings

of 33-year-old men in the United Kingdom in 1993 to estimate the returns to higher education.

Again, education is considered a binary choice. The OLS and IV estimates of the return to col-

lege are very similar in size to those found by Carneiro, Heckman, and Vytlacil (2011) despite

using different data and different instruments.9 He shows further evidence for heterogeneity

in returns to education, and that the marginal returns to college fall as the proportion of the

population with higher education rises. Encouragingly, he finds that the shape of the MTE

over different values of $U_D$ (for a given level of the observable characteristics) in his sample

is the same as the one Carneiro, Heckman, and Vytlacil estimate in their study. While Moffitt does

not calculate MPRTEs, his results imply that both OLS and IV overstate the return to college

in his sample.

6 Directions for Future Research

The existing empirical work on estimating marginal treatment effects (and thus policy

relevant treatment effects) is subject to several limitations. The current empirical literature

limits educational decisions to a binary choice of whether or not to go to college. From

this marginal return to college, economists can back out an approximate marginal return

to a year of college (as in Carneiro, Heckman, and Vytlacil (2011)). However, this analysis

is somewhat misleading. Empirically, we know that returns to schooling are not constant

across years. Specifically, returns are highest in degree years. More work is needed to be able

to apply this methodology to multiple schooling level outcomes. Heckman, Urzua, and Vyt-

lacil (2006) extend their theoretical analysis of treatment effects to more than two outcomes

using an ordered choice model. Ordered choice is theoretically correct for education because

individuals must complete a grade before moving on to the next one. However, no one has

been able to empirically estimate marginal treatment effects with multiple schooling choices.

This is primarily because of data concerns. To estimate the MTE at any point, the conditional

support of the probability of attaining that level of education must be the full unit interval.

9 The estimates reported in Carneiro, Heckman, and Vytlacil (2011) are the true parameters divided by four, so that they can be interpreted as the return to one year of college. Moffitt reports estimates of the return to a college education, so his estimates are approximately four times as large.


Adding more choices significantly increases the demands on the data.

Nevertheless, the non-linear returns are important to think about, especially when esti-

mating policy relevant treatment effects. A policy that increases the probability of starting

college will have a very different impact on earnings than a policy that increases the prob-

ability of graduating from college. A significant proportion of individuals who start college

never receive a degree. This implies that students are learning about their costs and returns to

education and updating their optimal schooling choices. While actual earnings data is used

to estimate returns to schooling, it is an individual’s perceived return that influences educa-

tional choices. In a world of imperfect information, Jensen (2010) points out that there is no

reason to expect the level of education chosen to be either individually or socially efficient.

Using data on eighth grade boys in the Dominican Republic, he finds that perceived returns

to secondary education are substantially lower than measured returns. Students who are

provided with information about the observed returns to schooling in the area stay in school

significantly longer than students who do not receive this information.

Stinebrickner and Stinebrickner (2012) use data from a college that serves mostly low in-

come students to examine the role of learning in the college dropout decision. In contrast to

Jensen (2010), they find evidence that students overestimate returns to education. As they

learn about their academic ability and psychic costs to education, they update their beliefs

about their individual return to schooling. The authors find that in their sample, dropout

would be reduced by 40 percent if learning about ability did not occur. Because perceived

returns and perceived costs affect education decisions, it is important to understand these

perceptions. This is a very relevant area for future work. If low-income students are less

informed about the true costs and returns to education, then policies that disseminate infor-

mation may be as important as policies that change the costs of schooling. Future research

that explores the relationship between perceived marginal returns and actual marginal re-

turns to education will help guide educational policy.

Additionally, all of the empirical work on marginal treatment effects thus far has focused

on white males. This is because more data is available for this group. In most surveys the sample of minorities

is smaller, and the sample of females with wage data is smaller than that of males

because traditionally fewer women have entered the labor market. Yet, evidence suggests


that working women have a higher rate of return to college degrees than do working men

(Dwyer, Hodson, and McCloud, 2013). If the observed marginal returns to education differ

for different groups of people (men, women, minorities, etc.), then policies that change the

probability of getting a certain level of education are going to have different effects depending

on which groups of people they primarily affect. In order to do cost-benefit

analysis of educational policies that are targeted at particular groups, it is important to be

able to estimate marginal returns for people other than white men. Further work on both

the theoretical and empirical sides of estimating marginal treatment effects will hopefully

result in techniques that can be applied more generally. This is especially relevant in terms

of analyzing the effectiveness of policies that promote education through affirmative action

or that aim to push women into STEM fields. Before economists can draw conclusions about

whether such programs are good or bad, more research needs to be done in order to learn

how effective they are at improving outcomes.

Finally, marginal returns to education have mostly been analyzed in a partial equilibrium

setting. However, this is a simplistic view of the world. As the supply of workers with

a certain level of education changes, wages will adjust and the returns to education will

change as well. Consider a simple signaling model. If education is only a signal to employers

about ability, as the proportion of individuals who get a college degree rises, for example, the

conditional expectation of a college-educated person’s ability falls and the signal becomes

less valuable in the market. This drives observed returns to education down.

It is important to remember that economists cannot measure true returns to education.

Wages do not capture the intrinsic value that education provides. There are likely network

effects and other non-monetary benefits to education as well. When interpreting either the

marginal return to an extra year of education or the marginal treatment effect, it is important

to think carefully about what the parameter is measuring and what it is not. The enormous

amount of heterogeneity in returns highlights the need for continued discussion about how

to best estimate marginal returns to education.


References

[1] Angrist, Joshua, and Alan Krueger. 1991. “Does Compulsory School Attendance Affect

Schooling and Earnings?” Quarterly Journal of Economics 106(4): 979-1014.

[2] Angrist, Joshua, and Alan Krueger. 1999. “Empirical Strategies in Labor Economics.” In

Handbook of Labor Economics, Volume 3A, edited by Orley Ashenfelter and David Card.

Amsterdam and New York: North Holland.

[3] Becker, Gary. 1967. Human Capital and the Personal Distribution of Income. Ann Arbor,

Michigan: University of Michigan Press.

[4] Becker, Gary, and Barry Chiswick. 1966. “Education and the Distribution of Earnings.”

American Economic Review 56(1): 358-369.

[5] Bjorklund, Anders, and Robert Moffitt. 1987. “The Estimate of Wage Gains and Welfare

Gains in Self-Selection Models.” The Review of Economics and Statistics 69(1): 42-49.

[6] Card, David. 1993. “Using Geographic Variation in College Proximity to Estimate the

Return to Schooling.” National Bureau of Economic Research Working Paper 4483.

[7] Card, David. 2001. “Estimating the Return to Schooling: Progress on Some Persistent

Econometric Problems.” Econometrica 69(5): 1127-1160.

[8] Carneiro, Pedro, and James Heckman. 2002. “The Evidence on Credit Constraints in

Post-Secondary Schooling.” Economic Journal 112(482): 705-734.

[9] Carneiro, Pedro, James Heckman, and Edward Vytlacil. 2010. “Evaluating Marginal Pol-

icy Changes and the Average Effect of Treatment for Individuals at the Margin.” Econo-

metrica 78: 377-394.

[10] Carneiro, Pedro, James Heckman, and Edward Vytlacil. 2011. “Estimating Marginal Re-

turns to Education.” American Economic Review 101: 2754-2781.

[11] Dwyer, Rachel, Randy Hodson, and Laura McCloud. 2013. “Gender, Debt, and Drop-

ping Out of College.” Gender and Society 27: 30-55.


[12] Griliches, Zvi. 1977. “Estimating the Returns to Schooling: Some Econometric Prob-

lems.” Econometrica 45: 1-22.

[13] Heckman, James. 1997. “Instrumental Variables: A Study of Implicit Behavioral As-

sumptions Used in Making Program Evaluations.” The Journal of Human Resources 32(3):

441-462.

[14] Heckman, James, Lance Lochner, and Christopher Taber. 1998. “General Equilibrium

Treatment Effects: A Study of Tuition Policy.” American Economic Review 88: 381-386.

[15] Heckman, James, Sergio Urzua, and Edward Vytlacil. 2006. “Understanding Instrumental

Variables in Models with Essential Heterogeneity.” National Bureau of Economic Research

Working Paper 12574.

[16] Heckman, James and Edward Vytlacil. 1999. “Instrumental Variables and Latent Variable

Models for Identifying and Bounding Treatment Effects.” Proceedings of the National Academy of Sciences 96(8):

4730-4734.

[17] Heckman, James and Edward Vytlacil. 2001a. “Local Instrumental Variables.” In Nonlin-

ear Statistical Modeling: Proceedings of the Thirteenth International Symposium in Economic

Theory and Econometrics: Essays in Honor of Takeshi Amemiya, edited by Cheng Hsiao,

Kimio Morimune, and James Powell, 1-46. New York: Cambridge University Press.

[18] Heckman, James and Edward Vytlacil. 2001b. “Policy-Relevant Treatment Effects.”

American Economic Review 91(2): 107-111.

[19] Imbens, Guido and Joshua Angrist. 1994. “Identification and Estimation of Local Aver-

age Treatment Effects.” Econometrica 62(2): 467-475.

[20] Jensen, Robert. 2010. “The (Perceived) Returns to Education and the Demand for School-

ing.” Quarterly Journal of Economics 125(2): 515-548.

[21] Kane, Thomas, and Cecilia Rouse. 1995. “Labor Market Returns to Two- and Four-Year

Colleges.” American Economic Review 85(3): 600-614.

[22] Mincer, Jacob. 1974. Schooling, Experience and Earnings. New York: Columbia University

Press.


[23] Moffitt, Robert. 2008. “Estimating Marginal Treatment Effects in Heterogeneous Popu-

lations.” Annals of Economics and Statistics 91-92: 239-261.

[24] Oreopoulos, Philip. 2006. “Estimating Average and Local Average Treatment Effects of

Education when Compulsory Schooling Laws Really Matter.” American Economic Review

96(1): 152-175.

[25] Stinebrickner, Todd and Ralph Stinebrickner. 2012. “Learning about Academic Ability

and the College Dropout Decision.” Journal of Labor Economics 30(4): 707-748.
