55
Monotonicity in IV and fuzzy RD designs - A Guide to Practice Mario Fiorini 1 and Katrien Stevens 2 1 Economics Discipline Group, University of Technology Sydney 2 School of Economics, University of Sydney June 19, 2014 Abstract Whenever treatment effects are heterogeneous and there is sorting into treat- ment based on the gain, monotonicity is a condition that both Instrumental Vari- able and fuzzy Regression Discontinuity designs have to satisfy for their estimate to be interpretable as a LATE. However, applied economic work rarely discusses this important assumption. We provide insights in how to think about monotonicity. While there is no test of monotonicity in itself, the joint hypotheses of indepen- dence and monotonicity lead to testable implications. Yet, these implications are necessary but not sufficient for independence and monotonicity to hold. Therefore, we argue that monotonicity can and should be investigated using a mix of economic intuition and data patterns, just like other untestable assumptions in an IV or fuzzy RD design. We provide examples in a variety of settings as a guide to practice. JEL classification : C1, C21, C26, C31, C36. Keywords : essential heterogeneity, monotonicity assumption, LATE, instrumental vari- able, regression discontinuity We thank Jerome Adda, Martin Huber, Markus J¨ antti, Toru Kitagawa, Tobias Klein, Matthew Lindquist, Peter Siminski, Olena Stavrunova and seminar participants at the Swedish Institute for Social Research (SOFI), Australasian Meetings of the Econometrics Society, University of Melbourne, University of Technology Sydney, University of Sydney, University of New South Wales, Monash University and Australasian Labour Econometrics Workshop for helpful comments and suggestion. 1

Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Monotonicity in IV and fuzzy RD designs - A Guideto Practice

Mario Fiorini1 and Katrien Stevens2

1Economics Discipline Group, University of Technology Sydney2School of Economics, University of Sydney

June 19, 2014

Abstract

Whenever treatment effects are heterogeneous and there is sorting into treat-ment based on the gain, monotonicity is a condition that both Instrumental Vari-able and fuzzy Regression Discontinuity designs have to satisfy for their estimate tobe interpretable as a LATE. However, applied economic work rarely discusses thisimportant assumption. We provide insights in how to think about monotonicity.While there is no test of monotonicity in itself, the joint hypotheses of indepen-dence and monotonicity lead to testable implications. Yet, these implications arenecessary but not sufficient for independence and monotonicity to hold. Therefore,we argue that monotonicity can and should be investigated using a mix of economicintuition and data patterns, just like other untestable assumptions in an IV or fuzzyRD design. We provide examples in a variety of settings as a guide to practice.

JEL classification: C1, C21, C26, C31, C36.

Keywords : essential heterogeneity, monotonicity assumption, LATE, instrumental vari-able, regression discontinuity

We thank Jerome Adda, Martin Huber, Markus Jantti, Toru Kitagawa, Tobias Klein, MatthewLindquist, Peter Siminski, Olena Stavrunova and seminar participants at the Swedish Institute for SocialResearch (SOFI), Australasian Meetings of the Econometrics Society, University of Melbourne, Universityof Technology Sydney, University of Sydney, University of New South Wales, Monash University andAustralasian Labour Econometrics Workshop for helpful comments and suggestion.

1

Page 2: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

1 Introduction

In the early 90’s, work by Imbens and Angrist (1994), Angrist and Imbens (1995) and An-grist, Imbens, and Rubin (1996) provided the theoretical foundation for the identificationof the Local Average Treatment Effect (LATE): the treatment effect for those individualswho are affected by the instrument. Heckman, Urzua, and Vytlacil (2006) stress that thisidentification result is crucial when (i) the gain from treatment is heterogeneous acrossthe population and (ii) there is sorting into treatment based on the gain from treatment.Heckman, Urzua, and Vytlacil (2006) define any context where both (i) and (ii) occuras essential heterogeneity. Under essential heterogeneity, the identification of the LATErequires the additional assumption of monotonicity: for a given change in the value of theinstrument, it can not be that some individuals increase treatment intensity while othersdecrease treatment intensity. Hahn, Todd, and Van der Klaauw (2001) show that underessential heterogeneity, the assumption of monotonicity is also needed for the identifica-tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicitydoes not hold, the IV and fuzzy RD estimates are generally uninterpretable. Heckman,Urzua, and Vytlacil (2006) point out that there is an asymmetry in the way empiricalresearchers discuss heterogeneity: outcomes of choices are often allowed to be hetero-geneous, choices themselves are not. But since choices are plausibly also heterogeneousin the sense that individuals might respond differently to changes in some variable ofinterest (such as the instrument or the discontinuity), monotonicity cannot be assumedto hold a priori.

Given the importance of the monotonicity condition in both IV and fuzzy RD de-signs, it is remarkable that this condition is seldom investigated in applied studies. Thisis in stark contrast to the lengthy discussions dedicated to the IV independence and rankconditions, and to the RD discontinuity (in the probability of treatment) and continuity(in the conditional regression function) conditions. There are a number of possible expla-nations for this missing step. There seems to be a general consensus that monotonicityis fundamentally untestable and there is no clear framework to think about monotonicityin practice.

The aim of this paper is to argue that applied work can and should investigate mono-tonicity using a mix of economic intuition and data patterns, as it is generally done forother assumptions in an IV or fuzzy RD design. In doing so, we link the theoreticalliteratures on Instrument Variables (Imbens and Angrist (1994), Angrist and Imbens(1995), Angrist, Imbens, and Rubin (1996)), Regression Discontinuity (Hahn, Todd, andVan der Klaauw (2001), Lee and Lemieux (2010)) and Essential Heterogeneity (Heck-man and Vytlacil (2005), Heckman, Urzua, and Vytlacil (2006)) with a focus on theapplied-microeconomics profession. More specifically, we make three contributions.

First, we show how informative the IV and fuzzy RD estimates are under variousdegrees of heterogeneity in treatment effects, varying degrees of sorting on gain and thedegree of violation of monotonicity. Using a simple model of selection into treatment,we provide general numerical examples. We find that even a small violation of mono-tonicity yields un-interpretable estimates unless we are in the extreme case of either noheterogeneity in treatment effects or no sorting on gain (i.e. no essential heterogeneity).

Second, we emphasize that the joint hypotheses of independence and monotonicityyield testable implications and thus that monotonicity can be partly tested. Angrist and

2

Page 3: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Imbens (1995) suggest a method for checking monotonicity in the case of multivaluedtreatment: if monotonicity holds, the cumulative distribution functions (CDFs) of theendogenous treatment variable should not cross for alternative values of the instrument.In other words, one CDF should first-order stochastically dominate the other. We stressthat stochastic dominance (SD) is an implication of both the independence and mono-tonicity conditions, and that it can be formally tested using existing SD tests (see forexample Barrett and Donald (2003)). In more recent studies, Huber and Mellace (2014)and Kitagawa (2013) develop tests for binary treatment settings. The former develop atest using inequality moment constraints regarding the mean potential outcomes. Kita-gawa (2013) instead designs a test based on the negativity of potential outcome densities.We highlight that i) the independence and monotonicity conditions are needed only undersorting on gain, a point surprisingly neglected by this literature on instrument validity,ii) while these tests are designed in an IV setting, they can easily be extended to thefuzzy RD design and iii) the testable implications are necessary but not sufficient for in-dependence and monotonicity to hold. Hence these tests are informative but not definite.We therefore argue that a thorough discussion of monotonicity needs to rely on economicintuition and data patterns, in addition to formal testing.

As a third contribution we go through different studies in a variety of settings thatadopt either the IV or fuzzy RD estimator. Our goal here is to provide the appliedresearcher with a guide to practice. Hence, for each study we try to make a case for oragainst monotonicity. We show how the monotonicity assumption has sometimes beenoverlooked and how possible violations could have been detected.

Finally, we also discuss recent theoretical work on monotonicity which attempts torelax the monotonicity condition. Klein (2010), de Chaisemartin (2012) and Huber andMellace (2012) establish alternative conditions under which it is still possible to interpretthe IV and fuzzy RD estimates as a LATE even though monotonicity is violated. Theapplicability of these conditions is highly dependent on the context.

The remaining of the paper is organized as follows. Section 2 discusses the monotonic-ity condition in an IV and RD setting, while section 3 illustrates interpretability of theestimates under a violation of monotonicity. Section 4 looks into testing the monotonic-ity condition. We provide a thorough discussion of the monotonicity condition in severalexisting studies in section 5 and recent attempts at relaxing monotonicity are discussedin section 6. Section 7 concludes.

2 The Monotonicity Assumption

To illustrate the notion of essential heterogeneity and monotonicity in a simple way, letus define a latent utility model where the endogenous variable (D) is binary, and theinstrumental variable (Z) has support in Z. For each individual i:

Yi = β0i + β1iDi (Outcome) (1)

D∗i = α0i + α1iZi (Latent Utility)

with

Di =

1 if D∗i > 0 (Observed Treatment)0 if D∗i ≤ 0

3

Page 4: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

All the coefficients β0, β1, α0, α1 are random variables with non-degenerate distribu-tions. Here β1 is the gain from treatment that is heterogeneous across individuals, andα1 is the heterogeneous response to the instrument. The case of essential heterogeneityarises when E(β1|D = 1) 6= E(β1|D = 0), that is when the treatment is partly drivenby the gain from treatment. Essential heterogeneity is different from sorting on levelsE(β0|D = 1) 6= E(β0|D = 0).1 The model can be generalized with the inclusion of covari-ates (X), in which case essential heterogeneity arises when E(β1|D = 1, X) 6= E(β1|D =0, X). In the rest of the paper we keep the conditioning on X implicit.

2.1 Monotonicity in the IV design

Following the notation in Imbens and Angrist (1994), let Y (0) be the response withouttreatment for a given individual. Y (1) is the response with treatment for the sameindividual. Define for each z ∈ Z, D(z) as the individual’s treatment value when Z = z.Imbens and Angrist (1994) show that, provided a random variable Z satisfying the 3following conditions is available, βIV1 identifies the average treatment effect for thoseindividuals who are affected by the instrument (LATE).

IV1. P (z) = E[D|Z = z] is a non trivial function of z (rank)

IV2. [Y (0), Y (1), D(z)z∈Z ] is jointly independent of Z (independence)

IV3. for any two points of support z, w ∈ Z, Either Di(z) ≥ Di(w) ∀i, Or Di(z) ≤Di(w) ∀i (monotonicity)

Condition IV1 is the rank condition. Condition IV2 requires that the instrument isindependent of both potential outcomes and potential treatment assignment. This isstronger than the standard IV exclusion restriction: in model (1) it not only impliesthat Z is independent of β0 but also that Z is independent of β1 and D(z)z∈Z . Thelatter independence between Z and the counterfactual treatment values under alternativevalues of the instrument D(z)z∈Z is sometimes referred to as “type” independence.2

The strengthening of the exclusion restriction to this independence condition togetherwith IV3 are needed to identify a LATE whenever there is essential heterogeneity.3 Themonotonicity assumption requires that, for every individual i, a change in the value ofthe instrument from w to z must either leave the treatment unchanged or change the

1The econometrician often estimates (1) without allowing for heterogeneity:

Yi = β0 + β1Di +

UYi︷ ︸︸ ︷

(β0i − β0) + (β1i − β1)Di

D∗i = α0 + α1Zi + (α0i − α0) + (α1i − α1)Zi︸ ︷︷ ︸

UDi

Both sorting on gain and on levels are then a source of endogeneity.2Individuals are classified into types depending on the counterfactual treatment values. For instance,

when both the treatment and the instrument are binary there are only four types: compliers, defiers,always-takers and never-takers. We define these types more formally later.

3With sorting on gain Cov(Z, β0) = 0 is not sufficient for identification. See footnote 5 in Heckman,Urzua, and Vytlacil (2006) and the related discussion for a proof.

4

Page 5: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

treatment in the same direction. Monotonicity is violated if, because of the same changein the value of the instrument from w to z, some individuals respond by getting thetreatment (“switching in”) while others stop getting it (“switching out”).4 Monotonicityis a condition on counterfactuals: it refers to an individual’s treatment in 2 alternativestates of the world Z = w and Z = z.

The IV estimator for any two points of support z, w in Z is given by

βIV1 (z, w) ≡ E[Y |Z = z]− E[Y |Z = w]

E[D|Z = z]− E[D|Z = w]

Following Angrist, Imbens, and Rubin (1996), and assuming both IV1 and IV2, wecan interpret the IV estimate of β1 as

βIV1 (z, w) = λ×E[Y (1)− Y (0)|D(z)−D(w) = 1] + (2)

(1− λ)×E[Y (1)− Y (0)|D(z)−D(w) = −1]

where

λ =P [D(z)−D(w) = 1]

P [D(z)−D(w) = 1]− P [D(z)−D(w) = −1]

Equation 2 is very informative. It says that

• βIV1 (z, w) does not “use” individuals who do not respond to changes in the value ofthe instrument: D(z)−D(w) = 0.

• if monotonicity holds, such that for instance Di(z)−Di(w) ≥ 0 ∀i, then λ = 0 andIV estimates the effect for those individuals that are induced to take the treatmentbecause of the instrument (LATE-in): βIV1 (z, w) = E[Y (1)− Y (0)|D(z)−D(w) =1]. Alternatively, if monotonicity holds because D(z)−D(w) ≤ 0 ∀i then λ = −1,and IV estimates the effect for those individuals that stop getting the treatmentbecause of the instrument (LATE-out): βIV1 (z, w) = E[Y (1)−Y (0)|D(z)−D(w) =−1].

• if monotonicity does not hold and individuals respond differently to a given changein the value of Z, such that for some individual D(z)−D(w) = 1 while for othersD(z)−D(w) = −1, then the IV estimate is not a treatment effect for any specificindividual or group of individuals. Importantly, it is not a weighted average of theLATE-in and LATE-out but it takes a more extreme value since either λ > 1 orλ < 0.

• if the return to treatment is heterogeneous but there is no sorting on gain, thenmonotonicity is not required. Without sorting on gain the expected return fromtreatment is the same among those who switch in and out, and equal to the averagetreatment effect (ATE):

βIV1 (z, w) = LATE-in = LATE-out = E[Y (1)− Y (0)]

4A necessary condition for monotonicity to be violated in (1) is α1 positive for some individuals andnegative for some others.

5

Page 6: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

2.2 Monotonicity in the fuzzy RD design

Let the setting be the same as in equation (1), with v ∈ Z. Also let

limv↓v0

P [D = 1|Z = v] 6= limv↑v0

P [D = 1|Z = v]

such that the probability of treatment jumps at the threshold v0, without requiringthe jump to be equal to 1.5 Hahn, Todd, and Van der Klaauw (2001) point out thatthere is a very close analogy between the fuzzy Regression Discontinuity design and theIV estimators since both can be expressed as a Wald estimator:

βRD1 ≡ limv↓v0 E[Y |Z = v]− limv↑v0 E[Y |Z = v]

limv↓v0 E[D|Z = v]− limv↑v0 E[D|Z = v]

which measures a similar LATE to βIV1 (z, w) if z = v0 + ε, w = v0 − ε and ε isarbitrarily small. Hahn, Todd, and Van der Klaauw (2001) show that under essentialheterogeneity, regression discontinuity estimation identifies β1 for those individuals withZ = v0 who are affected by the threshold (LATE at v0) under the following conditions

RD1. limv↓v0 P [D = 1|Z = v] 6= limv↑v0 P [D = 1|Z = v] (RD)

RD2. E[Y (0)|Z = v] is continuous in v at v0 (continuity)

There exists a small number ξ > 0 such that for all 0 < e < ξ, and

RD3. [β1, D(v − e), D(v + e)] is jointly independent of Z near v0 (independence)

RD4. Either Di(v0 + e) ≥ Di(v0 − e) ∀i, Or Di(v0 + e) ≤ Di(v0 − e) ∀i (monotonicity)

Condition RD1 is the RD equivalent of the rank condition in the IV setting. ConditionRD2 implies that in the absence of treatment, individuals close to the threshold v0 aresimilar. These first two conditions must hold whether there is essential heterogeneityor not. Similar to IV estimation, whenever there is essential heterogeneity a strongerindependence condition together with monotonicity are needed to identify a LATE at v0.Note again that monotonicity is a condition on counterfactuals: for every individual i,crossing the threshold must either leave the treatment unchanged or change the treatmentin the same direction. Invoking the reasoning in Angrist, Imbens, and Rubin (1996), wecan express the RD estimate of β1 as6

βRD1 = lime→0

λ×E[Y (1)− Y (0)|D(v0 + e)−D(v0 − e) = 1] + (3)

(1− λ)×E[Y (1)− Y (0)|D(v0 + e)−D(v0 − e) = −1]

where

λ =P [D(v0 + e)−D(v0 − e) = 1]

P [D(v0 + e)−D(v0 − e) = 1]− P [D(v0 + e)−D(v0 − e) = −1]

5When the jump in treatment probability is equal to 1, monotonicity is satisfied by definition. Thiscase is defined in the literature as a sharp regression continuity design.

6Proof in appendix A.

6

Page 7: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Equation (3) is the equivalent of equation (2) in an RD setting. Thus, if monotonicityholds, λ is equal to either 0 or −1, and the RD estimate can be interpreted as a LATEat the threshold. When monotonicity is violated either λ > 1 or λ < 0, and the RDestimate measures neither a LATE for any particular group of individuals nor a weightedaverage of LATEs.

2.3 Monotonicity when the treatment is multi-valued

Angrist and Imbens (1995) discuss the interpretation of the IV estimate when the treat-ment D is a multivalued random variable with support D = 0, 1, ..., K and K > 1.Assume that IV1 (rank) and IV2 (independence) hold. Extending the discussion by An-grist and Imbens (1995), we can express the IV estimate of β1 for any two points ofsupport z, w in Z as follows (proof in appendix C)

βIV1 (z, w) =1

Ω×

K∑k=1

E[Yk − Yk−1|D(z) ≥ k > D(w)]× P [D(z) ≥ k > D(w)] (4)

−E[Yk − Yk−1|D(w) ≥ k > D(z)]× P [D(w) ≥ k > D(z)]

where

Ω =K∑k=1

(P [D(z) ≥ k > D(w)]− P [D(w) ≥ k > D(z)])

From equation (4) we can see that

• βIV1 (z, w) does not rely on individuals who are not affected by the instrument:D(z) = D(w).

• If assumption IV3 (monotonicity) holds, such that P [D(z) ≥ k > D(w)] = 1 ∀k,equation (4) simplifies to

βIV1 (z, w) =K∑k=1

E[Yk − Yk−1|D(z) ≥ k > D(w)]× P [D(z) ≥ k > D(w)]∑Kk=1 P [D(z) ≥ k > D(w)]

(5)

Angrist and Imbens (1995) refer to this parameter as the average causal response(ACR). It is a weighted average of causal responses to a unit change in treatment, forthose whose treatment status is affected by the instrument. Note that individualswhose treatment value changes with more than one unit are counted several times inthe weights and enter multiple conditioning sets in the treatment effects. A similardiscussion applies if instead monotonicity holds such that P [D(w) ≥ k > D(z)] =1 ∀k.

• if monotonicity does not hold, then for a given change in the value of Z someindividuals increase treatment value D(z) ≥ k > D(w) while others decrease it

7

Page 8: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

D(w) ≥ k > D(z) for at least some k. Equation (4) shows that in this case boththe numerator and the denominator include contributions from individuals affectedin either direction. Without monotonicity, the IV estimate is thus not a weightedaverage of treatment effects and cannot be given a useful interpretation.

The close analogy between the fuzzy Regression Discontinuity design and the IVestimators extends to the case of multivalued treatment with binary instrument in astraightforward way. Both can still be expressed as Wald estimators. Lee and Lemieux(2010) show that monotonicity is also required in the fuzzy RD setting with a multi-valued treatment, in which case the interpretation of the RD estimator is still the sameas that of the IV estimator.

2.4 Monotonicity when the instrument is multi-valued

Let Z be a multivalued random variable with support Z = 0, 1, ..., J and with J > 1.Assume that IV1 (rank) and IV2 (independence) hold. Define g(Z) as a scalar functionfrom the support of Z to the real space. Using g(Z) as an instrument, the IV estimatoris now given by7

βIV1 ≡Cov(Y, g(Z))

Cov(D, g(Z))

In order to interpret the IV estimator Imbens and Angrist (1994) and Angrist and Imbens(1995) supplement IV3 (monotonicity) with another condition

IV4. (i) either ∀z, w ∈ Z, E[D|Z = z] ≤ E[D|Z = w] implies g(z) ≤ g(w); or, ∀z, w ∈Z, E[D|Z = z] ≤ E[D|Z = w] implies g(z) ≥ g(w) and (ii) Cov(D, g(Z)) 6= 0

Note that while monotonicity is a condition across individuals, IV4 is a condition on therelation between E[D|Z] and g(Z): IV3 does not imply IV4 and viceversa.8 ConditionIV4 is satisfied by construction when Z is binary, or when Z is a discrete random variablethat enters g(Z) in the form of mutually exclusive dummy variables. Otherwise IV4 isnot guaranteed to hold.

Under IV1-IV4, let the points of support of Z be ordered such that ` < m impliesE[D|Z = `] < E[D|Z = m]. Angrist and Imbens (1995) show that we can write the IVestimate of β1 as a weighted average of Wald estimators:

βIV1 =J∑j=1

µjβIV1 (j, j − 1) (6)

where

βIV1 (j, j − 1) =E[Y |Z = j]− E[Y |Z = j − 1]

E[D|Z = j]− E[D|Z = j − 1]

7The case where g(Z) = Z is the simplest case, but this notation generalizes the estimator to anyfunctional form of g(Z) and to the case where Z is a vector.

8Because IV3 implies that every individual has to respond to the instrument in the same direction,Heckman and Vytlacil (2005) and Heckman, Urzua, and Vytlacil (2006) rename IV3 with the term“uniformity”.

8

Page 9: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

and

µj = (E[D|Z = j]− E[D|Z = j − 1])

∑J`=j π`(E[D|Z = `]− E[D])∑J

`=0 π`E[D|Z = `](E[D|Z = `]− E[D])

with π` = P [Z = `]. Given the ordering in Z we also have that 0 ≤ µj ≤ 1 and∑Jj=1 µj = 1. Equation (6) indicates that, under a binary treatment,

• when monotonicity IV3 and IV4 are satisfied then βIV1 is a weighted average ofLATEs E[Y (1)− Y (0)|D(j)−D(j − 1) = 1]. This is because each Wald estimatorhas a LATE interpretation (given IV3) and the µj weights are non-negative (givenIV4).

• when monotonicity IV3 is satisfied but IV4 is not then βIV1 is not a weighted averageof LATEs. While each Wald estimator has a LATE interpretation, some µj weightscan be negative.

• when monotonicity IV3 is violated but IV4 is satisfied then βIV1 is again not aweighted average of LATEs. Now some Wald estimators no longer has a LATEinterpretation, even though all the µj weights are non-negative.

When the treatment is multi-valued the same conclusions apply with the only differ-ence that each Wald estimator has an ACR interpretation (under monotonicity).

The analogy between the fuzzy Regression Discontinuity design and the IV estimatorsis less direct now. Let Z be the running variable in an RD setting, with v ∈ Z. Considernow the case where there are multiple discontinuities or cutoffs (vf ) for f = 0, . . . , F andF > 0, such that the probability of treatment jumps at each threshold

limv↓vf

P [D = 1|Z = v] 6= limv↑vf

P [D = 1|Z = v] ∀f

One way to approach this problem is to split the sample and run a separate RD re-gression at each cutoff, doing so for both the treatment and the outcome equation. Thisapproach yields a vector of F + 1 treatment effects, each interpretable as a LATE de-pending on whether monotonicity is satisfied. Note that this is not exactly how discretemultivalued instruments are used. If Z has support Z = z0, z1, . . . , zJ then the instru-ments often enter the g(Z) function in the form of mutually exclusive dummy variables(similarly to RD) but the outcome equation is estimated over the whole sample (contraryto separate RD regressions). The result is a weighted average of LATEs as describedearlier. Of course one could take the same approach for the fuzzy RD by pooling all theobservations together while using dummy variables to identify threshold specific effect.

For instance, using a linear spline specification, the RD design can now be describedby:

D =α0 +F∑f=0

α1f × 1[Z > vf ] + δDf × 1[Z > vf ]× (Z − vf )

Y =β0 + β1D +

F∑f=0

δYf × 1[Z > vf ]× (Z − vf )

9

Page 10: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

See for instance Brollo, Nannicini, Perotti, and Tabellini (2013), Clark and Royer(2013) and Dobbie and Skiba (2013) for settings with multiple discontinuities. The caseof a continuous instrument is instead unlikely in an RD setting since that would imply acontinuum of cutoffs.

3 Interpretation of IV and RD estimates: Sensitivity

to Key Assumptions

The discussion in section 2 highlights that in a world with essential heterogeneity, mono-tonicity is an important condition. When monotonicity is violated the IV and RD esti-mates cannot be interpreted. However, the researcher might be interested in knowing towhat degree the presence of essential heterogeneity and a violation of monotonicity area problem. What if the treatment effects are “roughly” homogeneous? What if there isjust a “bit” of sorting on gain? What if only a “few” individuals violate monotonicity?To answer this question we set up a generalized Roy model with an exogenous variableZ affecting the treatment decision. The key characteristics of this model are i) treat-ment effects are heterogeneous, ii) selection into treatment is based on the gain, and iii)the impact of Z is heterogeneous across individuals. Under i) and ii) we know that aninstrumental variable or fuzzy regression discontinuity approach has to satisfy the mono-tonicity condition. The goal of this Roy model is to investigate the extent to which theIV and fuzzy RD estimates can be interpreted as a LATE of interest as i), ii) and iii) arestrengthened or weakened.

For each individual i let Y0i be the outcome if she does not take the treatment, andY1i be the outcome if she does:

Y1i = α + β + U1i

Y0i = α + U0i

Thus the gain from treatment is heterogeneous and given by βi = Y1i − Y0i = β +U1i − U0i.

3.1 Selection

Individuals decide whether or not to take treatment partly on the basis of their gain(sorting on gain) and partly on the basis of an exogenously determined variable Z (in-strument). Dropping the i subscript for notational simplicity, let the treatment D bedetermined as follows:

D =

1 if Y1 − Y0 + γZ > 0⇔ β > −γZ0 if Y1 − Y0 + γZ ≤ 0⇔ β ≤ −γZ

Here γZ could be interpreted as the cost or taste for treatment. To keep things simple,we define both the instrumental variable Z and its coefficient γ as binary: z ∈ 0, 1, andγ = γL or γ = γH . A key point of the model is that we set γL < 0 and γH > 0. Thus,the instrument “pushes” some individuals out of treatment (γ < 0) and other individuals

10

Page 11: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

intro treatment (γ > 0). The proportion of individuals with γL and γH are given by pγLand pγH = 1− pγL .

Now we can define how individuals make their treatment decision based on theirrealization of Z, γ, U1 and U0. Given sorting on gain, there is a cut-off value of β abovewhich individuals take treatment. That cut-off value is affected by the instrument, asindicated in table 1. The treatment decisions under alternative values of Z allow us todistinguish between 4 types: always-takers (AT), never-takers (NT), compliers (CM) anddefiers (DF).9

Table 1: Counterfactual Choices(D(Z = 1), D(Z = 0)

)γ = γL

β ≤ 0 0 < β ≤ −γL β > −γLZ = 0 D = 0 D = 1 D = 1Z = 1 D = 0 D = 0 D = 1type NT DF AT

γ = γHβ ≤ −γH −γH < β ≤ 0 β > 0

Z = 0 D = 0 D = 0 D = 1Z = 1 D = 0 D = 1 D = 1type NT CM AT

We maintain that conditions IV1 (rank) and IV2 (independence) are satisfied. Therank condition requires that P (D = 1|Z = 1) − P (D = 1|Z = 0). Under IV2 (indepen-dence)

• P (D = 1|Z = 1) = pAT + pCM

• P (D = 1|Z = 0) = pAT + pDF

where pAT , pCM and pDF are the proportions of always-takers, compliers and defiersrespectively. Thus the rank condition implies that pCM 6= pDF . Moreover, since inour model types are defined by the pair (β, γ), type independence requires that theseparameters are independent of Z. To simplify our discussion below, we also constrain βand γ to be uncorrelated though our results generalize to the case where they are not.Finally, we assume that the treatment effects are distributed normally: β ∼ N(β, σβ).

9In the literature these types are defined as

• Always-Takers (AT): take treatment irrespective of the instrument value(D(Z = 1) = 1, D(Z =

0) = 1)

• Never-Takers (NT): do not take treatment irrespective of the instrument value(D(Z = 1) =

0, D(Z = 0) = 0)

• Compliers (CM): take treatment when Z = 1 and not otherwise(D(Z = 1) = 1, D(Z = 0) = 0

)• Defiers (DF): do not take treatment when Z = 1 and take treatment otherwise

(D(Z = 1) =

0, D(Z = 0) = 1)

11

Page 12: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

We can then define the probability of observing each type as a function of γL, γH ,pγL and the distribution of the gain from treatment f(β). Figure 1 illustrates where thedifferent types are located along the β distribution.10

Figure 1: Types over the β distribution

• pAT = pγL × P [β > −γL] + pγH × P [β > 0]:Always-takers (ATs) take treatment irrespective of the instrument. Therefore, theymust have a high return to treatment. In particular, ATs with γL must havean especially large β. Since γL < 0, the instrument tends to push them out oftreatment, hence they need a large and positive β to compensate for it and remaintreated.

• pNT = pγL × P [β ≤ 0] + pγH × P [β ≤ −γH ]:Never-takers (NTs) do not take treatment irrespective of the instrument. Therefore,they must have a low return to treatment. In particular, NTs with γH must have anespecially low β. Since γH > 0, the instrument tends to push them into treatment,in which case they need a very negative β to compensate for it and remain untreated.

• pCM = 0 + pγH × P [−γH < β ≤ 0]:Compliers (CMs) are on the margin of taking treatment. They switch into treatmentonly when the instrument changes from 0 to 1. Therefore they must have β ≤ 0and γ = γH such that Z = 1 pushes them into treatment, but they must alsohave β > −γH otherwise the push is not sufficient to compensate for the negativetreatment effect.

• pDF = pγL × P [0 < β ≤ −γL] + 0:Defiers (DFs) are also on the margin of taking treatment. However, they switch outof treatment if the instrument changes from 0 to 1. Therefore they must have β > 0and γ = γL such that Z = 1 pushes them out of treatment, but they must alsohave β < −γL otherwise the push is not sufficient to compensate for the positivetreatment effect.

10Figure 1 is obtained using the parametrization described in table 2a.

12

Page 13: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Given the location of types along the β distribution, one can easily derive the AverageTreatment Effects of the various types. See appendix B for the formal expressions.

3.2 Violating Monotonicity - Baseline

In this section we investigate the effect of violating the monotonicity condition. We doso under our baseline parametrization, while in the next section we alter the baseline.Figure 1 was obtained using the parametrization in table 2a, and letting U1 ∼ N(µ1, σ1),U0 ∼ N(µ0, σ0) and Cov(U1, U0) = σ01. The baseline parametrization is chosen to besimple and at the same time to ensure that β is heterogeneous (σ01 6= 1) and that therank condition is satisfied (pγL 6= 0.5). The resulting proportions of types is describedin table 2b. Most of the population is given by always-takers, followed by never-takers.However, there are also both compliers and defiers. In this Roy model, with a binarytreatment and a binary instrument

βIV = λ× LATECM + (1− λ)× LATEDF (7)

whereλ =

pCMpCM − pDF

Whenever both pCM 6= 0 and pDF 6= 0 monotonicity is violated. Thus, in our example,no economic information can be recovered from βIV . It is neither a LATE nor is it aweighted average of the LATE for compliers and defiers, but it is more extreme sinceλ > 1 (see table 2c).11

Table 2: Baseline

(a) Parametrization

γL γH pγL β µ1 µ0 σ1 σ0 σ10-1 1 0.25 0 0 0 1 1 -1

(b) Types

pAT pNT pCM pDF pAT + pNT + pCM + pDF λ0.452134 0.356403 0.143597 0.0478656 1 1.5

(c) ATE, LATEs and IV estimate

ATE ATEAT ATENT LATECM LATEDF βIV

0 1.71287 -2.04142 -0.489673 0.489673 -0.979345

11In table 2c βIV = 2LATECM (and βIV = −2LATEDF ). This occurs because LATECM =−LATEDF and λ = 1.5. It might be tempting to conclude that if we had a good guess of λ then it wouldbe possible to recover the LATECM and LATEDF from the βIV . However, LATECM = −LATEDFonly because f(β) is centred around 0 and γH = −γL. More generally, equation 7 is an equation in threeunknowns. Therefore, the LATECM and LATEDF cannot be linked and even knowledge of λ would notallow us to recover any LATE.

13

Page 14: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

3.3 Altering the baseline: interpretation of βIV

In this section we alter the parameters in the baseline scenario and show how the decisionsof individuals are affected. The goal is to illustrate how the monotonicity bias in βIV

changes with the degree of heterogeneity in the treatment effect (σ2β), the impact of the

instrument (γL and γH), the heterogeneity in the impact of the instrument (pγL) and thedegree of sorting on gain.

3.3.1 Heterogeneity in the treatment effect: σβ

Since β = β + U1 − U0 then σβ =√σ21 + σ2

0 − 2σ01. In the baseline we set σ0 = σ1 = 1and σ01 = −1 in order to get a positive standard deviation of σβ = 2. However, if U1

and U0 are positively correlated then β has a smaller standard deviation. In figure 2a-cwe illustrate the types over the β distribution as a function of σβ. As σβ decreases thedistribution narrows around the Average Treatment Effect (β = 0). Figure 2d shows theproportion of types: as σβ goes towards zero all observations get concentrated withinthe −γH and −γH interval. Thus, the proportion of always-takers and never-takers falls.Moreover, the proportions of compliers and always-takers converge. The same is truefor the proportions of never-takers and defiers. Because of the symmetry around thethreshold, λ is constant. Figure 2f shows the ATE, the LATEs for compliers and defiers,and βIV . All the LATEs and the βIV tend to the ATE as σβ goes towards zero: in theabsence of treatment effect heterogeneity there is no monotonicity requirement. However,for positive σβ the βIV is always more extreme and far from the LATEs (since λ > 1).

(a) σβ = 0.7 (b) σβ = 1.41 (c) σβ = 2 (baseline)

(d) Proportion of types (e) λ (f) ATE, LATEs and βIV

Figure 2: Sensitivity to σβ

14

Page 15: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

3.3.2 Impact of the instrument: size of γL and γH

We focus here on symmetric changes in the values of γL and γH . Intuitively, as the mag-nitude of γL and γH increases, the instrument has a stronger impact on the treatmentdecisions. Figures 3a-c show what happens if the size of both γ’s increases: the thresh-olds of β inbetween which compliers and defiers exist shift out. Hence the fraction ofcompliers and defiers rises, while the fraction of always-takers and never-takers shrinks.This is presented in figure 3d. As γ grows, the proportion of compliers and always-takersconverges, and the same applies to the fraction of defiers and never-takers. This is be-cause we are symmetrically changing γL and γH around the threshold β=0. This againresults in λ being constant, as can be seen in figure 3e. Figure 3f then shows that themagnitude of the LATEs on compliers and defiers grows with γ. With λ constant thisresults in a βIV that is always more extreme and diverges from these LATEs. A violationof monotonicity is thus a problem for whatever value of γ; the intensity of the problemdoes not seem to vary with the impact of the instrument. As γ goes to zero, the LATEsand βIV tend towards the ATE: this is an uninteresting limit case since it implies theinstrument has no impact at all.

(a) γL, γH = −0.5, 0.5 (b) γL, γH = −1, 1 (baseline) (c) γL, γH = −4, 4

(d) Proportion of types (e) λ (f) ATE, LATEs and βIV

Figure 3: Sensitivity to γ

3.3.3 Heterogeneity in the response to the instrument: pγL

In this section we investigate what happens when we vary the degree of heterogeneity inthe response to the instrument. In our model, this heterogeneity exists since γ can beeither positive (γH) or negative (γL): individuals with positive γ are needed to generate

15

Page 16: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

compliers, while individuals with negative γ are needed to generate defiers.12 As long as0 < pγL < 1 there are both compliers and defiers causing monotonicity to be violated.Figure 4a-c illustrates the types over the β distribution as a function of pγL . As pγL falls,there are fewer individuals negatively affected by the instrument: thus the proportionof defiers decreases while the proportion of compliers increases. In the limit case wherepγL = 0 (no heterogeneity in the response to the instrument) there are no defiers. Theopposite is true if pγL = 1: there are no more compliers. Figure 4f shows that onceagain βIV cannot be interpreted as a LATE of interest unless we are in the limit cases ofpγL = 0 (leading to λ = 1) or pγL = 1 (leading to λ = 0): no heterogeneity in the responseto the instrument. As as soon as we depart from either of the limit cases βIV becomesuninformative rapidly. An especially problematic situation occurs when pγL is close to0.5 in which case pCM ≈ pDF and λ→∞13. Note that the LATECM and the LATEDFare constant over different values of pγL since f(β) and the fraction of individuals in the[−γH ,−γL] interval are also constant.

(a) pγL = 0.1 (b) pγL = 0.25 (baseline) (c) pγL = 0.9

(d) Proportion of types (e) λ (f) ATE, LATEs and βIV

Figure 4: Sensitivity to pγL

3.3.4 Other forms of Heterogeneity: γL

There is another way to alter the degree of heterogeneity in this model aside from shiftingpγL . In our baseline γH = −γL, that is the magnitude of the response to the instrumentis the same between individuals with different γ, albeit in opposite directions. However,

12Strictly speaking, there is heterogeneity in the response to treatment also if γL and γH have thesame sign. In that case, heterogeneity does not create a violation of monotonicity: there are either nocompliers or no defiers.

13The limit case itself (pγL = 0.5) is uninteresting since the instrument does not satisfy the rankcondition (pCM = pDF ).

16

Page 17: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

it could be that γH 6= −γL. Thus, holding γH constant, shifting the value of γL leadsto different proportions of defiers (for a given proportion of compliers). Figure 5a-cillustrates the types over the β distribution as a function of γL: the closer γL is tozero the smaller the proportion of defiers. A very negative γL has the opposite effect.Figure 5d shows the proportion of types as a function of ψL, a parameter re-scaling thebaseline: γL = ψLγ

BL (with γBL = −1). For ψL = 0 then γL = 0: this is a limit case

with heterogeneity in the response to the instrument (pγL = 0.25 in the baseline) but nodefiers since no individual is pushed out of treatment by the instrument.14 If ψL > 0then γL becomes a negative number, λ > 1 and βIV cannot be interpreted as a LATE ofinterest. In our example λ has an horizontal asymptote since, even for very negative γL,the proportion of defiers is always smaller than the proportion of compliers. Note thata shift in the value of γL also has implications for the impact of the instrument. As wediscussed earlier, the rank condition demands that pCM 6= pDF . Figure 5d clearly showsthat the difference in the proportion of compliers and defiers is largest for γL = 0, rightbecause in that case pDF = 0.

(a) γL = −0.5 (b) γL = −1 (baseline) (c) −γL = −2

(d) Proportion of types (e) λ (f) ATE, LATEs and βIV

Figure 5: Sensitivity to γL

3.3.5 Size of the average treatment effect: β

In the baseline setup of our model, the average treatment effect β = 0 and it equals thethreshold value of β = 0 which separates compliers and defiers.15 We chose this valuefor simplicity: it makes the distribution of types symmetric around the threshold (since

14This case is equivalent to a situation where γ is heterogeneous but only takes positive values.15This threshold value is purely defined by the treatment decision: under Z=0, individuals with β

above the threshold take treatment, while those with β below do not take treatment.

17

Page 18: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

γH = −γL) and the LATEs for compliers and defiers are symmetric. Nevertheless, thisthreshold value β = 0 and the ATE are not necessarily related in our model: a changein β results in a shift of the probability density function f(β), but does not affect thethreshold value.

Figure 6a-c illustrates the types over the β distribution as a function of the ATE.As the ATE increases, the proportion of always-takers increases accordingly, with a cor-responding reduction in the proportion of never-takers. The proportion of compliersand defiers also decreases as fewer individuals are located in the [−γH ,−γL] interval.16

Importantly, λ first tends to +∞ and then towards 0. The former occurs because theproportions of compliers and defiers are positive but identical, while the latter resultsfrom the proportions of compliers and defiers tending to zero. Both cases are uninter-esting because the instrument does not have an impact in the first stage. Overall, figure6f shows that βIV is always more extreme than any of the LATEs and deviates stronglyfrom them. Note also the special (coincidental) case where βIV is the same as the ATE:at a β slightly above 5.

(a) β = 0 (baseline) (b) β = 0.5 (c) β = 3

(d) Proportion of types (e) λ (f) ATE, LATEs and βIV

Figure 6: Sensitivity to β

3.3.6 Sorting on gain

In the model above, we assume that individuals sort into treatment based on the gain,and treatment decisions are potentially affected by the instrument Z. Theoreticallywe know that monotonicity is not a problem in the limit case of no sorting on gain:

16In figure 6f, the ATEs for compliers and defiers look constant but in fact they are rising slightly withβ.

18

Page 19: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

βIV = LATECM = LATEDF = ATE. However, is a small degree of sorting on gainsufficient to render βIV uninformative?

To address this, we adjust our model slightly by including a homogenous parameterθ ≥ 0 on the gain from treatment:

D =

1 if θ (Y1 − Y0) + γZ > 0⇔ β > −γ

θZ

0 if θ (Y1 − Y0) + γZ ≤ 0⇔ β ≤ −γθZ

Individuals make a treatment decision based on a signal about β. In economic terms,θ 6= 1 can be interpreted as individuals having imperfect information about the actualreturn to treatment β, or the existence of some constraints that precludes β from beingused in the treatment decision (θ = 0).

From this adjusted treatment equation, it is clear that introducing θ affects treatmentdecisions in a similar way as changing the values of γL and γH . The impact of increasingsorting on gain (θ rising) is similar to the effect of γL and γH falling (see figures 3a-f).Considering the impact on βIV , a violation of monotonicity is a problem with both littleor strong sorting on gain.

In the limit case of extreme sorting on gain (θ = +∞), the instrument does not affectthe treatment decision anymore (equivalent to γ = 0). In the other extreme of no sortingon gain (θ = 0), the monotonicity assumption is not needed; the treatment decision nowsimplifies to

D =

1 if γZ > 00 if γZ ≤ 0

Here, treatment decisions are purely driven by the instrument and there is no morethreshold value of β above or below which individuals change treatment decision. If onlyZ matters, no one takes treatment under Z=0, thus there cannot be always-takers ordefiers. Individuals with a negative impact of the instrument (γ < 0) will never taketreatment, while a positive γ will always push individuals into treatment: compliers. Inthis limit case, monotonicity is not a problem because defiers are nonexistent.

3.4 Regression Discontinuity

The same Roy model can be used to explain the importance of monotonicity in a fuzzyRegression Discontinuity Design. Let the selection equation be

D =

1 if Y1 − Y0 + γ1[Z > v0] > 0⇔ β > −γ1[Z > v0]0 if Y1 − Y0 + γ1[Z > v0] ≤ 0⇔ β ≤ −γ1[Z > v0]

Thus treatment now depends on Z in a discontinuous way: if Z is larger than a cut-offvalue v0 there is an additional effect determined by γ. All other elements of the model stayunchanged, that is γ ∈ γL, γH and the proportion of individuals with the two values ofγ are given by pγL and pγH = 1− pγL . We also maintain that around the threshold valuev0 the assumptions of discontinuity in the probability of treatment (RD1), continuity inthe conditional regression function (RD2) and independence (RD3) are satisfied.

19

Page 20: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

3.4.1 Sharp RD

In a sharp design, treatment is known to depend in a deterministic way on Z. Forinstance, all individuals with Z > v0 take treatment and viceversa. This is an examplewhere all individuals are compliers. The model above would generate a sharp RD onlyif there is no sorting on gain and if γ is homogeneous: β does not enter the selectionequation (this rules out the existence of always-takers and never-takers) and pγL = 0(this rules out the existence of defiers). Otherwise either β or γ introduce randomness inthe treatment decision which results in a fuzzy RD.17

3.4.2 Fuzzy RD

The fuzzy design differs from the sharp design in that the treatment assignment is not adeterministic function of Z but there are additional variables unobserved by the econo-metrician that determine assignment to treatment. In the model these variables are βand γ. Thus, in the presence of sorting on gain or with 0 < pγL < 1 we have a fuzzyRD. Similarly to table 1, individuals can then be classified into types according to theirindividual return to treatment β and their response γ to crossing the threshold v0:

Table 3: Counterfactual Choices - RD

γ = γLβ ≤ 0 0 < β ≤ −γL β > −γL

[Z ≤ v0] D = 0 D = 1 D = 1[Z > v0] D = 0 D = 0 D = 1type NT DF AT

γ = γHβ ≤ −γH −γH < β ≤ 0 β > 0

[Z ≤ v0] D = 0 D = 0 D = 1[Z > v0] D = 0 D = 1 D = 1type NT CM AT

Under independence, the distribution of types around the threshold v0 would be thesame as in figure 1. Thus, the probability of observing each of the four types, the ATEand all the LATEs can be computed as before. Importantly, the RD estimate of β canagain be expressed as

βRD = λ× LATECM + (1− λ)× LATEDFwhere

λ =pCM

pCM − pDFThe only difference is that now we are using observations in a narrow window (or at thelimit) around the threshold value v0.

17Alternatively, provided β does not enter the selection equation, γ could still be heterogeneous aslong as its support is either strictly positive or strictly negative.

20

Page 21: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

3.5 Discussion

How informative is βIV (or βRD) when our economic world does not rule out essentialheterogeneity and heterogeneous responses to the instrument (or forcing variable)? Ageneralized Roy model suggests that the bias is very sensitive to both essential hetero-geneity and violations of the monotonicity condition. There are extremes cases whenmonotonicity is not an issue. These extremes cases are given by either the absence ofheterogeneity in the treatment effect, the absence of sorting of gain, or the absence ofheterogeneity in the response to the instrument. However, once we depart from these ex-tremes cases, in the sense that none of them applies, then βIV (or βRD) quickly becomesuninformative.

4 Can we test for Monotonicity?

Virtually any applied study that takes an Instrumental Variable approach tests the rankcondition (IV1). The same study would also provide a lengthy discussion of the indepen-dence condition (IV2) even though no formal test would be included. The monotonicitycondition (IV3) is often overlooked, let alone tested for. A similar remark can be madeabout studies that take a fuzzy Regression Discontinuity approach. They would firstclearly show and investigate the size of the discontinuity at the threshold (RD1). Theywould then test whether, around the same threshold, the observable covariates are sim-ilar on either side and whether the density of the running variable is continuous. Thesetwo tests are considered the best possible internal validity checks of RD2 and RD3.18

Lee and Lemieux (2010) point out that the availability of such tests is one advantageof the Regression Discontinuity design over the IV approach. No test of monotonicity isproposed in the RD literature.

In this section we review a few tests of instrument validity. All the strategies proposedin the literature are not tests of monotonicity alone, but tests of both independence andmonotonicity. At the same time, the null hypothesis in any of these tests is a necessary butnot sufficient condition for independence and monotonicity to hold. Thus, rejecting thenull hypothesis is a clear indication that at least one of the two identification assumptionsis unwarranted. Failing to reject the null hypothesis is a positive indication but notdefinite evidence. This is an area of active research.

4.1 Binary Treatment

Let both the treatment and instrument be binary with support in 0, 1 such that thesample can be partitioned in four types depending on the treatment under the two alter-native values of the instrument (D(Z = 1), D(Z = 0)): compliers (CM), always-takers(AT), never-takers (NT) and defiers (DF). Under independence (IV2) and monotonic-ity (IV3) we can identify the proportion of each type in a cross-section. For instance,an individual i observed with (Z = 1,D = 0) could either be a never-taker or a defier.However, under monotonicity we can rule out the existence of defiers. Then individual

18To fully test RD2 and RD3 we would also need to test the similarity of the unobserved characteristics,unobserved treatment effects and unobserved counterfactual treatment decisions on either side of thecutoff.

21

Page 22: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

i must be a never-taker. Similarly, if an individual j is observed with (Z = 0,D = 1),individual j must be an always-taker. Table 4 summarizes the types depending on theobserved values of the treatment D and the instrument Z.

Table 4: Types under observed (Z,D)-values

Zobs

0 1Dobs 0 never-taker, complier never-taker, defiera

1 always-taker, defiera always-taker, complier

a By the monotonicity assumption

Let pNT and pAT be the proportion of never-takers and always-takers respectively.Because of the independence condition the instrument must be independent of the types.Thus, P (D = 0|Z = 1) = pNT and P (D = 1|Z = 0) = pAT . By the monotonicityassumption pDF = 0 and thus the proportion of compliers can be derived from the data:pCM = 1 − pNT − pAT . At this point some additional notation is useful. Let fz,d(y)denote the observed distribution of outcomes Y in the sub-sample defined by Z = z andD = d. Also let gtype,d(y) denote the unobserved outcome distribution for a given typeand D = d, noting that for always-takers d = 1 and for never-takers d = 0 only. Forclarity, in the remaining of section 4 we use the bold font to denote statistics that areobservable from the data without the need of maintaining any assumption. Imbens andRubin (1997) show that

f0,1(y) = gAT (y) (8a)

f1,0(y) = gNT (y) (8b)

f1,1(y) =

(pCM

pCM + pAT

)gCM,1(y) +

(pAT

pCM + pAT

)gAT (y) (8c)

f0,0(y) =

(pCM

pCM + pNT

)gCM,0(y) +

(pNT

pCM + pNT

)gNT (y) (8d)

Thus, under independence and monotonicity we can interpret the estimable f1,0(y)and f0,1(y) as the outcome distributions for never-takers and always-takers respectively.Moreover, it is then straightforward to derive the distribution of outcomes for compliersunder each value of the treatment gCM,1(y) and gCM,0(y):

gCM,1(y) =

(pAT + pCM

pCM

)f1,1(y)−

(pATpCM

)f0,1(y) (9a)

gCM,0(y) =

(pNT + pCM

pCM

)f0,0(y)−

(pNTpCM

)f1,0(y) (9b)

4.1.1 Test based on Non-negative Outcome Densities

The two potential outcome densities (9a) and (9b) can be estimated using the data. Sinceprobability densities are non-negative by definition, Imbens and Rubin (1997) point out

22

Page 23: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

that negative density estimates in some region of the outcome support are indicative of aviolation of at least one of the conditions used in deriving these densities: independenceand monotonicity. Kitagawa (2013) uses this idea to construct a formal test to measurethe degree of violation of non-negativity.

Let Y be the support of outcome Y and let d ∈ 0, 1 be the observed treatment.Kitagawa (2013) defines p(y, d) and q(y, d) as the (conditional) probability density func-tions of (y, d) ∈ Y × 1, 0, given Z = 1 and Z = 0 respectively. p(y, d) and q(y, d)are identifiable from the data: p(y, 1) represents the treated outcome density under Z=1(including always-takers and compliers), while q(y, 1) is the treated outcome density un-der Z=0 (including always-takers, and possibly defiers). Similarly, p(y, 0) and q(y, 0)represent untreated outcome densities under Z = 1 and Z = 0 respectively. Linking hisnotation back to (9):

p(y, 1) = f1,1(y) P (D = 1|Z = 1) = f1,1(y) (pAT + pCM)

q(y, 1) = f0,1(y) P (D = 1|Z = 0) = f0,1(y) pAT

p(y, 0) = f1,0(y) P (D = 0|Z = 1) = f1,0(y) pNT

q(y, 0) = f0,0(y) P (D = 0|Z = 0) = f0,0(y) (pNT + pCM)

Kitagawa (2013) shows that if independence and monotonicity hold in the population,then p(y, 1) must nest q(y, 1) and q(y, 0) must nest p(y, 0) (proposition 2.1 in his paper).19

He provides a visual illustration of his proposition, which is presented here in figures 7aand 7b. Figure 7a illustrates nested densities, in which case we cannot reject IV validity.If, by contrast, we observe densities as in figure 7b, one can refute at least one of theinstrumental validity conditions since some of the nonnegativity conditions are violatedon some subsets of the outcome support. These subsets are labeled as V1 and V2 infigure 7b. Kitagawa (2013)’s test procedure uses a Kolmogorov-Smirnov type of teststatistic that quantifies a maximal violation of non-negativity over the outcome support.He then applies a bootstrap-based procedure to construct critical values. Nevertheless,he emphasizes that non-negativity is a necessary but not sufficient condition for thejoint restriction of independence and monotonicity to hold. Hence accepting the nullhypothesis is not definite evidence of instrument validity. He also shows that this testprocedure can be extended to a setting with a multivalued (discrete) instrument.

Kitagawa (2013) illustrates his test by applying it to the labor market data from Card(1995), who evaluates the returns to education based on the 1966 and 1976 waves of theU.S. National Longitudinal Survey of Young Men (NLSYM). Card uses a dummy for

19Kitagawa (2013)’s null hypothesis implies p(y, 1) ≥ q(y, 1) and p(y, 0) ≤ q(y, 0) (∀y ∈ Y) and isequivalent to non-negativity of the treated and untreated outcome densities for compliers:

p(y, 1)− q(y, 1) = f1,1(y) (pAT + pCM )− f0,1(y) pAT

= gCM,1(y) pCM

q(y, 0)− p(y, 0) = f0,0(y) (pNT + pCM )− f1,0(y) pNT

= gCM,0(y) pCM

23

Page 24: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

(a) Nested densities: IV validity not refuted

(b) Non-nested densities: IV validity rejected

Figure 7: Kitagawa (2013), figures 1 and 2, pages 7-8

proximity to a 4-year college in 1966 as an instrument for the potentially endogenous de-cision of going to college. Kitagawa (2013) redefines education level as a binary indicatorfor a four year college degree, which indicates one’s education to be 16 years or moresuch that it roughly corresponds to a four year college degree. He focuses on the (binary)proximity-to-college instrument only. He shows Kernel density estimates for the observedoutcome densities p(y, d) and q(y, d) (for d ∈ 0, 1) and finds that they intersect, both inthe full sample (without controlling for demographic characteristics) and when focusingon a restricted sample of white males in metropolitan areas and not living in southernstates. His bootstrap-based test procedure yields p-values equal to zero, which impliesthat instrument validity is refuted both with and without these covariates. He also applieshis test to the setting of Angrist (1990) where the impact of veteran status on earnings ismeasured, using the Vietnam draft lottery as an instrument for veteran status. Kitagawa(2013) does not reject IV validity in this setting.

4.1.2 Inequality Moment Constraints

Huber and Mellace (2014) derive alternative testable implications by replacing the densi-ties in (8) with mean potential outcomes. Let E(Yz,d|type, Z = z) denotes the unobserv-able mean outcome for a given type, in the sub-sample defined by Z = z and D = d:

24

Page 25: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

E(Y|D = 1,Z = 0) = E(Y0,1|AT,Z = 0) (12a)

E(Y|D = 0,Z = 1) = E(Y1,0|NT,Z = 1) (12b)

E(Y|D = 1,Z = 1) =

(pCM

pCM + pAT

)E(Y1,1|CM,Z = 1)+(

pATpCM + pAT

)E(Y1,1|AT,Z = 1) (12c)

E(Y|D = 0,Z = 0) =

(pCM

pCM + pNT

)E(Y0,0|CM,Z = 0)+(

pNTpCM + pNT

)E(Y0,0|NT,Z = 0) (12d)

Since these are means, rather than densities, their negativity is not informative.From (12a) we can see that the mean potential outcome of the always-takers underD = 1 and Z = 0 is point identified: E(Y1,0|NT,Z = 1) corresponds to the observableE(Y|D = 1,Z = 0). Moreover, by the independence assumption E(Y0,1|AT,Z = 0) =E(Y1,1|AT,Z = 1) such that

E(Y|D = 1,Z = 0) = E(Y0,1|AT,Z = 0) = E(Y1,1|AT,Z = 1)

Now note that, from (12c), we can see that E(Y|D = 1,Z = 1) is a weighted averageof mean potential outcomes for compliers and always-takers. Using (12c), Huber andMellace (2014) derive bounds on the mean potential outcome of always-takers, whichyields testable implications. Let q correspond to the proportion of always-takers in thepopulation observed under (D = 1, Z = 1), pAT

pCM+pAT, and define yq as the qth conditional

quantile of the outcome Y given D = 1 and Z = 1. By the results of Horowitz andManski (1995) E(Y|D = 1,Z = 1,Y ≤ yq) is the sharp lower bound of the mean poten-tial outcome of the always-takers, implying that all the always-takers are concentratedin the lower tail of the distribution. Similarly, E(Y|D = 1,Z = 1,Y ≥ y1−q) is the up-per bound when assuming that any always-taker occupies a higher rank in the outcomedistribution than any complier. Therefore, independence and monotonicity imply that

E(Y|D = 1,Z = 1,Y ≤ yq) ≤ E(Y|D = 1,Z = 0) ≤ E(Y|D = 1,Z = 1,Y ≥ y(1−q))(13)

Equivalent arguments hold for the mixed outcome equation of never-takers and com-pliers, using (12b) and (12d) instead. Let r = pNT

pCM+pNTand define yr as the rth conditional

quantile of the outcome Y , given D = 0 and Z = 0:

E(Y|D = 0,Z = 0,Y ≤ yr) ≤ E(Y|D = 0,Z = 1) ≤ E(Y|D = 0,Z = 0,Y ≥ y(1−r))(14)

Using (13) and (14) Huber and Mellace (2014) construct the following null hypothesis:

H0 :

E(Y|D = 1,Z = 1,Y ≤ yq)− E(Y|D = 1,Z = 0)

E(Y|D = 1,Z = 0)− E(Y|D = 1,Z = 1,Y ≥ y(1−q))E(Y|D = 0,Z = 0,Y ≤ yr)− E(Y|D = 0,Z = 1)

E(Y|D = 0,Z = 1)− E(Y|D = 0,Z = 0,Y ≥ y(1−r))

0000

(15)

25

Page 26: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Under a violation of independence or monotonicity conditions (or both) at least oneand at most two constraints might be binding. This is the case because violations of thefirst and second as well as of the third and fourth constraints are mutually exclusive,respectively. Furthermore, even if no inequality constraint is violated, the independenceor monotonicity conditions may not be satisfied because the test detects violations onlyif they are large enough such that the point identified mean outcomes of the always-takers and/or never-takers lie outside their respective bounds in the mixed populations.Since H0 is defined by inequality restrictions, standard two-sided tests cannot be applied.Huber and Mellace (2014) consider three different methods to test H0: a simple bootstraptest with Bonferroni adjustment (bs), the minimum p-value test of Bennett (2009) (withpartial (B.p) and full (B.f) recentering) and the smoothed indicator-based method ofChen and Szroeter (2014) (CS).

Figure 8: Huber and Mellace (2014), table 5, page 23.

Like in Kitagawa (2013), Huber and Mellace (2014) test the inequality moment con-straints in the setting of Card (1995) where the returns to education are measured usingan indicator for proximity to college as an instrument for years of schooling. They testH0 in the full sample (i.e. unconditionally) and in four different subsamples in order tocontrol for covariates that are potentially correlated with both the instrument and theoutcome. The subsamples differ in terms of father’s education (12 or more years versusless than 12 years) and living in an urban or rural area. Figure 8 shows the p-valuesusing the three different methods to test H0. Unconditionally, all tests reject the null atthe 1% level (solid line rectangle). In contrast, the null is never rejected in any of thesubsamples (dashed line rectangle). Huber and Mellace (2014) take this as evidence thatwhile college proximity is most likely not an unconditionally valid instrument, it mightbe so conditional on covariates.

4.2 Multivalued Treatment

Angrist and Imbens (1995) point out that the monotonicity assumption has a testableimplication whenever the treatment takes more than two values: stochastic dominance(SD). If monotonicity holds such that Di(z) ≥ Di(w) ∀i then the CDF of the observedtreatment D given Z = z will dominate the CDF of observed D given Z = w. Angristand Imbens (1995) do not discuss the importance of the independence condition for thestochastic dominance result, but implicitly assume that it is satisfied. Here we first brieflyformalize and emphasize the result that stochastic dominance is not a test of monotonicity

26

Page 27: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

alone, but rather a tests of both independence and monotonicity. Then we discuss Angristand Imbens (1995) application of the test.

Proposition 1. Let treatment D be multivalued with support D and let Fj be the observedtreatment CDF conditional on the jth value of the instrument. Suppose D(z)z∈Z isjointly independent of Z (independence) and for any two points of support z, w ∈ Z,Di(z) ≥ Di(w) ∀i (monotonicity). Then Fw(d) ≥ Fz(d) ∀d ∈ D.

Proof. By monotonicity P [Di(z) ≥ k] ≥ P [Di(w) ≥ k] for every individual i and forevery value k. Then P [D(z) ≥ k] ≥ P [D(w) ≥ k] ∀k.

By independence P [D(z) ≥ k|Z = z] ≥ P [D(w) ≥ k|Z = w] ∀k. Finally, bydefinition of D(z) as the treatment when Zi = z, we have the SD result for the observedtreatment outcomes:

P [D ≥ k|Z = z] ≥ P [D ≥ k|Z = w] ∀k

or equivalentlyFw(d) ≥ Fz(d) ∀d ∈ D

Note that in Proposition 1 we do not exploit the full independence condition (IV2)but only require the weaker type independence:

IV5. D(z)z∈Z is jointly independent of Z (type independence)

Angrist and Imbens (1995) do not provide a formal test of first-order SD. However,since the concept has been used extensively in many contexts such as decision theoryand decision analysis, tests of SD have been developed independently. For instance,Barrett and Donald (2003) show how to analytically compute the p-value when the twoCDFs are derived from independent random samples. Let Nz and Nw be the numberof observations under two alternative values of the instrument. Now let the null andalternative hypothesis be H0 : Fw(d) ≥ Fz(d) for all d and H1 : Fw(d) < Fz(d) for somed. Thus we are testing the hypothesis that the CDF for the treatment values observedunder Z = z stochastically dominates the CDF for the the treatment values observedunder Z = w.20 The test statistic for first-order stochastic dominance is given by

S =

(Nz ×Nw

Nz +Nw

)1/2

supd

(Fz(d)− Fw(d)

)Barrett and Donald (2003) then show that one can compute a p-value by exp(−2(S)2).

Angrist and Imbens (1995) then investigate monotonicity using Angrist and Krueger(1991) setting where quarter of birth is used as an instrument (Z) for years of schooling(D). They mostly restrict the attention to a binary instrument (first vs fourth quarter

27

Page 28: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

(a) Binary instrument, page 438

(b) Multivalued instrument, page 440

Figure 9: Angrist and Imbens (1995)

of birth). Angrist and Krueger (1991) maintain that because of school entry laws, indi-viduals born in the fourth quarter (Z = 4) would attain more schooling than those bornin the first quarter (Z = 1).21

Figure 9a is extracted from Angrist and Imbens (1995) paper and shows that the

20The alternative hypothesis is simply the converse of the null and implies that there is at least sometreatment value at which Fz is strictly larger than Fw. In other words stochastic dominance fails at somepoint. As formulated, one can in principle distinguish between the case where Fz and Fw coincide andthe case where Fw dominates Fz by reversing the roles they play in the hypotheses and redoing the tests.

21Angrist and Krueger (1991) explain that, for men in the 1980 US Census who were born in the 1930sand 1940s, school districts typically required a student to have turned age six by January 1 of the yearin which he or she entered school. Therefore, students born earlier in the year entered school at an olderage and attained the legal dropout age at an earlier point in their educational careers than students bornlater in the year.

28

Page 29: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

two CDFs do not cross: the schooling CDF for individuals born in the fourth quarterfirst-order stochastically dominates the CDF for individuals born in the first quarter.Angrist and Imbens (1995) take this as important evidence in favour of the monotonic-ity assumption in this example. Moreover, Angrist and Imbens (1995) also extend theirdiscussion to all quarters of birth. Since in their data first-order SD holds for any adja-cent pair of quarters (figure 9b), they conclude that there is evidence that any adjacentpair of quarters can be used to define a binary instrumental variable that satisfies themonotonicity assumption. Angrist and Imbens (1995) conclude that the IV estimatescan be interpreted as the return to schooling for those individuals induced to take moreschooling because they were born in the fourth rather than the first quarter (LATE).

Two recent studies investigate monotonicity according to the Angrist and Imbens(1995) procedure: Barua and Lang (2009) and Aliprantis (2012). Both papers study theeffect of school entry age on educational attainment. To our knowledge this is one of thefirst applied literatures to explicitly discuss and question the monotonicity assumption.Estimating the causal effect of school entry age is problematic because of the endogeneitydue to sorting on levels and sorting on gain. For instance, children who are less intellectu-ally and/or emotionally mature are also more likely to face delayed school entry (sortingon levels). Moreover, parents might make school entry decisions based on the (partial)knowledge of the gain from starting school later. Bedard and Dhuey (2006) suggest solv-ing the endogeneity problem by using the legal entry age (LEA) as an instrument forschool entry age. LEA is the age at which a child could start school given his/her birthdate and given the country/state-specific school entry cutoff date. Bedard and Dhuey(2006) show that actual entry age is strongly correlated with the LEA in their data.

Both Barua and Lang (2009) and Aliprantis (2012) independently argue that mono-tonicity is not likely to hold in this setting. Figure 10a is extracted from Barua and Lang(2009) and shows the CDFs for those individuals born in the first and last quarters of1952 using reported age and grade at the time of the 1960 US Census. The CDFs clearlycross multiple times.22 Barua and Lang (2009) conclude that figure 10a gives clear evi-dence that monotonicity is violated when quarter of birth or legal entry age are used asan instrument for entry age.

Aliprantis (2012) investigates the schooling of six year old children in the Early Child-hood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K) data set. Figure 10bis extracted from Aliprantis (2012). It shows the schooling CDFs for children born thequarter before (Z = 1) and the quarter after (Z = 0) the school entry cutoff date: theCDFs of school entry age cross.23 Aliprantis (2012) uses figure 10b to strengthen hisargument that monotonicity fails in this instance as well.

Later in the paper we discuss another study using the same strategy to identify theeffect of starting school at different ages on educational attainment. There we presenta thorough discussion of why monotoniticy is likely to be violated. Nevertheless, weargue that the evidence against SD presented in Barua and Lang (2009) and Aliprantis

22Barua and Lang (2009) do not observe the exact date of birth, but only the quarter. Thus, allindividuals born in the first quarter are categorized as being born 1 March, while individuals born inthe fourth quarter are categorized as being born 1 December. The Age at Entry values in figure 10a arederived accordingly.

23In the figure, schooling attainment at age 6 is measured in quarters and thus takes discrete valuesin steps of 0.25 years.

29

Page 30: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

(a) Barua and Lang (2009), page 7 (b) Aliprantis (2012), page 328

Figure 10: Applications of the monotonicity test

(2012) could also be due to a violation of the independence condition. SD relies onboth assumptions. For instance Buckles and Hungerman (2013) show that season ofbirth is not random but is associated with maternal characteristics: winter births aredisproportionally realized by teenagers and the unmarried. If date of birth is not randomthen neither LEA is, violating the independence condtion.

4.2.1 Failing to reject SD even though independence and monotonicity areviolated

Stochastic dominance is a necessary but not sufficient condition for independence andmonotonicity to hold. Evidence against SD is a clear indication that at least one ofthe two identification assumptions is unwarranted. Failing to reject the SD is a positiveindication but not definite evidence. To illustrate this point we construct a simple exampleusing the Angrist and Krueger (1991) setting, where quarter of birth partly determinesyears of schooling through a combination of school entry cutoff date and legal drop outage. Compulsory school attendance age varies by state but the minimum is generally age16. This context is used by Angrist and Imbens (1995) to apply their monotonicity test.Here we think of counterfactuals, where each of 10 individuals could be born either inthe 1st or 4th quarter. Depending on the quarter of birth, the individuals might make adifferent decision as to when to leave school.

In our example in table 5 most children have less schooling when born in the firstquarter. For instance, Jenny would drop out before completing High School if born inthe first quarter, but completes High School if born in the fourth quarter. Similarly,Jeanie also stays in schooling longer, attaining college education, if born in the fourthquarter. However, this is not true for all individuals: John, James and Mike have lessschooling when born in the fourth quarter. Thus, monotonicity does not hold. Let usconsider independence in this example. There are four different types of individuals, ascan be seen in the last column of table 5. For example, John, Mary and Megan are ofthe same type (tA): they have the same counterfactual schooling outcomes. Suppose weobserve all J-named individuals when born in QoB=1 while we observe all the M-namedindividuals when born in QoB=4 (the observed choices are bolded in the table). When

30

Page 31: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Table 5: Independence and monotonicity are violated

Date of Birth TypeIndividual QoB=1 QoB=4

Choice Schooling Choice SchoolingJenny Dropout 11 High-School 12 tAJohn High-School 12 Dropout 11 tBJames High School 12 Dropout 11 tBJeanie High School 12 College 16 tCJulia College 16 College 16 tDMike High School 12 Dropout 11 tBMary Dropout 11 High School 12 tAMegan Dropout 11 High School 12 tAMel High School 12 College 16 tCMolly High School 12 College 16 tC

Values in bold indicate actual observed students/choices

Figure 11: SD holds while independence and monotonicity are violated (Table 5).

31

Page 32: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

comparing the two groups, only three out of five individuals are of the same type. Hence,the potential treatment outcomes are not independent of the instrument (QoB). Figure11 then plots the CDFs obtained using the observations in bold: the CDFs do not crosseven though both independence and monotonicity are violated.24

4.3 Discussion

The tests discussed in Kitagawa (2013), Huber and Mellace (2014) and Angrist andImbens (1995) are similar in spirit. All of them test implications of independence andmonotonicity: for Kitagawa (2013) it is the non-negativity of the outcome densities, forHuber and Mellace (2014) it is the inequality restrictions on the mean potential outcomes,for Angrist and Imbens (1995) it is stochastic dominance of the treatment CDFs. Allthese implications are necessary but not sufficient for independence and monotonicity tohold. There are some important differences between the tests though. Kitagawa (2013)and Huber and Mellace (2014) tests apply to settings where the treatment is binary whilethe stochastic dominance test only applies to cases where the treatment is multivalued.Moreover, stochastic dominance only exploits type independence (IV5) while Kitagawa(2013) and Huber and Mellace (2014) testable implications also require independencebetween the potential outcomes (Y (1), Y (0)) and the instrument. Thus the Kitagawa(2013) and Huber and Mellace (2014) tests are less informative as tests of monotonicityonly, but represent stronger internal validity checks since they test for more conditions.

Another element that these tests have in common is that they are discussed exclusivelyin an IV setting. However, there is no reason not to apply them to the fuzzy RD settingas well. Instead of comparing densities, moment inequalities or CDFs for two alternativevalues of the instrument, we can do so for observations on either side of a threshold.

5 Discussing Monotonicity in practice

All the tests discussed in section 4 rely on necessary but not sufficient conditions for bothindependence and monotonicity to hold. One can argue that these tests are mainly a checkon internal validity. More specifically, we see these tests as a complement, rather thana substitute, to a thorough discussion of the monotonicity condition. In this section ourgoal is to maintain that monotonicity should also be investigated like any other conditionrelying on a combination of economic intuition and data patterns. We go through threedifferent studies in various settings that adopt either the IV or fuzzy RD estimator. Foreach study, we try to make a case for or against monotonicity.

We do not scrutinize the independence condition since it is extensively discussed inthese papers, and in applied work more generally. Similarly, we do not investigate or testessential heterogeneity. This would lengthen the discussion considerably and it is notour focus. Yet, at least on intuitive grounds we cannot see how essential heterogeneitycould be ruled out a priori in any of these studies. Heckman, Urzua, and Vytlacil (2006)and Heckman, Schmierer, and Urzua (2010) provide a thorough discussion of essentialheterogeneity and describe a way to test it.

24One could also come up with a plausible example where the CDFs do not cross even though mono-tonicity holds but independence is violated, or vice versa.

32

Page 33: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

5.1 Angrist (1990): Vietnam lottery draft

Angrist (1990) uses Vietnam era draft lottery numbers (Z) as an instrument to estimatethe effect of veteran status (D) on subsequent earnings (Y ). These lottery numbers(known as Random Sequence Numbers, RSN), were randomly assigned and influencedthe probability of military service. Individuals with a number below or equal to the drafteligibility cutoff value were draft eligible (Z = 1). Figure 12 is extracted from Angristand Chen (2011): it shows the conditional probability of military service by RSN forwhites and non-whites. The cutoff value is different over the years, with the largest being195 for the cohort born in 1950.

Figure 12: Angrist and Chen (2011) page 102.

This is the simplest setting where both the treatment and the instrument are binaryand can easily be modelled as in equation (1). If we assume that D∗ = α0 +α1Z and thatα1 is homogeneous then monotonicity is clearly satisfied. Thus, with α1 > 0 individualswould always be more likely to serve in Vietnam when drafted (Z=1). In this model, formonotonicity to be violated we need α1 to be heterogeneous. One could imagine that thereis a group of individuals that would like to serve in the army independently of the lotterynumber (type A: always-takers). There was certainly a substantial group of individualsthat would go to Vietnam only if drafted (type B: compliers); for them αB1 > 0. If every

33

Page 34: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

individual belongs to either type A or B, there is heterogeneity in the way individualsrespond to the instrument but monotonicity still holds since Di(Z = 1) ≥ Di(Z = 0) ∀i.By contrast, monotonicity fails if there is also a group of individuals who would go toVietnam only if not drafted (type C: defiers); for them αC1 < 0. While this is possible it isvery unlikely. Thus, in the Angrist (1990) setting one could argue that the monotonicityassumption is not a major concern.25 This is confirmed in Kitagawa (2013) who applieshis test of independence and monotonicity to the Angrist (1990) data and cannot rejectthat the assumptions hold. While Angrist (1990) takes an IV approach, this settings alsolends itself to a fuzzy RD approach given the clear discontinuity in the probability ofveteran status as shown in figure 12.

Keane (2010) provides a hypothetical example, similar to Angrist (1990), where mono-tonicity is more likely to fail. Suppose we use draft eligibility as an IV for completedschooling in an earnings equation. This seems as sensible as using it as an IV for militaryservice, since draft eligibility presumably affects schooling while being uncorrelated withunobserved characteristics. Thus D = 1 if the individual went to college and D = 0otherwise; Y and Z are the same as above. Amongst the draft eligible group, somestay in education longer than they otherwise would have, as a draft avoidance strategy(type B: compliers); αB1 > 0. Others get less schooling than they otherwise would have,either because their school attendance is directly interrupted by the draft, or becausethe threat of school interruption lowers the option value of continuing school (type C:defiers); αC1 < 0. This behaviour is more plausible than in Angrist (1990). It implies thatmonotonicity is violated since the instrument, draft eligibility, lowers schooling for someand raises it for others.26

5.2 Black, Devereux, and Salvanes (2011): school entry ageeffects on test scores and adult outcomes

Black, Devereux, and Salvanes (2011) investigate the effect of school entry age (D = EA)on military IQ test scores and adult outcomes (Y ) in Norway. To deal with endogeneity,they use Legal Entry Age as an instrument (Z = LEA). LEA is the age at which a childcould start school given his/her birth date and given the country or state-specific schoolentry cutoff date. In Norway, school starts towards the end of August and children areexpected to enter school in the calendar year they turn 7 (implying a January 1st cutoffdate). Black, Devereux, and Salvanes (2011) use both an IV and a RD-type approach,with the latter relying only on children born one month either side of the cutoff date.Note that they only observe month of birth, so they cannot narrow the sample anyfurther around the cutoff date, neither include a trend in date of birth. This RD-typeapproach is implemented to account for either potential manipulation of the date of birthby parents and/or the seasonality of births. In this section we focus on the monotonicityassumption in the RD approach by centering our attention around the December-Januarydiscontinuity. Nevertheless, the reasoning equally applies to all months of birth in a moregeneral IV setting. Figure 13b plots the LEA by month of birth: the LEA is fully

25Notice that monotonicity would be satisfied by construction if all drafted individuals were forced togo to Vietnam such that Di(Z = 1) = 1 ∀i.

26See for example Card and Lemieux (2001) and Angrist and Chen (2011) for a discussion of theimpact of the draft lottery on educational decisions.

34

Page 35: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

determined by the date of birth.27 In this context, monotonicity clearly holds in theextreme case where all children start school on-time (EA = LEA): all December-bornchildren would enter school 11 months older had they been born in January, and viceversa.

Estimating the causal effect of school entry age is problematic because of the endo-geneity due to sorting on levels and sorting on gain. For instance, children who are lessintellectually and/or emotionally mature are also more likely to face delayed school entry(a practice also known as “red-shirting”). Moreover, parents might make school entrydecisions based on the (partial) knowledge of the gain from starting school later. For ex-ample, this gain could depend on the intellectual and/or emotional maturity of the child,or on the relative age of her classmates. Thus this is plausibly a context with essentialheterogeneity: (i) the gain from treatment is heterogeneous across the population and(ii) there is some degree of sorting into treatment based on the gain from treatment.

In a next step, we exploit data patterns to scrutinise the monotonicity assumption.Although Black, Devereux, and Salvanes (2011) do not discuss monotonicity, they pro-vide useful information. Table 13a is taken from their paper and shows the proportionof children who enter school on-time, before and after the expected school entry age.Throughout the year, a very large fraction of children start school in the year they turn7 (On Time). However, about 15% of December-borns are red-shirted (Late), while 10%of January-borns start school before the year they turn 7 (Early). Overall, the youngestchildren in an eligible school entry cohort (Oct-Dec borns) are the most likely to be red-shirted, while the oldest ones (Jan-Feb borns) are the most likely to start school early.This entry age behaviour is consistent with parents/educators making school-entry de-cisions based on either a child’s absolute or relative age. Figure 13c replicates figure13b but we now add the observed EA patterns as shown in the table. The size of thecircles mirrors the proportions by month of birth. The largest circles are found alongthe LEA line, reflecting the high on-time entry rates. The smaller circles on the dottedline reflect early school entry (EA < LEA), which occurs mainly among children born inJanuary-February. Instead, the smaller circles on the dashed lines reflect delayed schoolentry (EA > LEA), which is most common among October-December borns.

We now focus on December and January born children and consider counterfactualsto discuss monotonicity. Since all children are assumed to be born on a given day ineither month, it is possible to distinguish 9 types based on actual and counterfactual EAbehaviour (see table 6). The sign in each cell indicates the change in EA if a child is born inJanuary rather than December. For instance, type E represents December-born on-timeschool entrants who would also enter school on-time had they been born in January. Thisimplies an increase in EA for members of type E: EAi(Dec) < EAi(Jan), ∀i with typei =tE. The (+) sign reflects the associated increase in EA. Assuming type independence,observed behaviour of January-born children can function as a counterfactual for actualDecember-borns, and vice versa. Figures 13a and 13c thus suggest that type E is themost prevalent type.

Since late school entry among January-borns does not occur, we choose to ignore typesC, F and I. Similarly, since early school entry among December-borns does not occur,

27In Black, Devereux, and Salvanes (2011) LEA is defined as 7.7 − (month of birth−1)12 . In constructing

the figures we assume children are born on the first day of the month and that school starts September1st.

35

Page 36: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

(a) Black, Devereux, and Salvanes (2011) page 458.

(b) LEA by month of birth

(c) LEA and observed school Entry Age

Figure 13: Monotonicity in Black, Devereux, and Salvanes (2011)

36

Page 37: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Table 6: Monotonicity

January BornDecember born Early On Time LateEarly tA(+) tB(+) tC(+)On Time tD(-) tE(+) tF (+)Late tG(-) tH(-) tI(+)

The (+) term indicates that children enter schoololder when born in January. Viceversa, the (-) termindicates that children enter school younger whenborn in January.

we choose to ignore types A, B and C.28 Finally, we find it unlikely that December-bornlate entrants would instead enter school early had they been born in January, so we alsoignore type G. Any other December-born on-time entrants would enter school early hadthey been born in January (type D). Crossing the cutoff date implies a drop in EA for thistype: EAi(Dec) > EAi(Jan), ∀i with typei = tD. Since 10% of January-born childrenenter school early, existence of this type cannot be ruled out. Similarly, type H representDecember-borns entering school late, but who would enter school on-time if they hadbeen born in January. This again implies a drop in EA: EAi(Dec) > EAi(Jan), ∀i withtypei = tH . Figures 13a and 13c again suggest that the existence of this type cannot beruled out. Crucially, the existence of either of the (−) types D or H alongside the mostnumerous type E (+) creates a violation of monotonicity.29 Thus, economic intuition anddata patterns raise some concern about the validity of the monotonicity assumption.

To complement our discussion we apply the SD test. From table 13a and figure 13c wecan derive the school entry age under each value of the instrument. Thus for children bornDecember 1st EAi ∈ 5.75, 6.75, 7.75 depending on whether they enter early, on-timeor late respectively. Similarly, for children born January 1st EAi ∈ 6.66, 7.67, 8.67.We can then use the proportions in table 13a to draw the CDFs. Note that none ofthe December born children enter school early (EA = 5.75) and none of the Januaryborn children enter late (EA = 8.67). Hence there are only four points of support inconstructing the CDFs.

Figure 14 shows that the CDFs cross. We can test whether the crossing is statisticallysignificant by applying Barrett and Donald (2003) procedure. Let ND and NJ be thenumber of the December and January born children respectively. Similarly let FD(x) andFJ(x) be the CDFs of EA by month of birth (or equivalently, LEA). Finally let the nulland alternative hypothesis be H0 : FD(x) ≥ FJ(x) for all x and H1 : FD(x) < FJ(x) forsome x. Thus we are testing the hypothesis that the CDF for the January born childrenstochastically dominates the CDF for the December born children. The test statistic forfirst-order stochastic dominance is given by

28We choose to ignore these types for simplicity, but including them in the discussion does not changeany of the intuition provided.

29The above reasoning equally applies to children born in January, had they been born in Decemberinstead. For example, type E also includes January-born on-time entrants who would also start schoolon-time had they been born in December. The (+) sign then indicates an increase in EA when movingfrom a counterfactual birth in December to an actual birth in January.

37

Page 38: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Figure 14: Stochastic Dominance under alternative values of the instrument

S =

(ND ×NJ

ND +NJ

)1/2

supx

(FJ(x)− FD(x))

Black, Devereux, and Salvanes (2011) have a sample of N=104,023 children born inDecember and January. Assuming that ND = NJ = N/2 leads to a test statistic of24.189448 compared to a critical value of 1.5174 at a 1% level of significance. The p-value is zero. Thus we reject stochastic dominance.30 We can also invert the null andalternative hypothesis to H0 : FJ(x) ≥ FD(x) for all x and H1 : FJ(x) < FD(x) for somex but this leads to an even larger test statistic of 120.94724. Thus the test indicates thateither independence or monotonicity (or both) are violated. Together with the earlierdiscussion, this evidence raises concerns about the internal validity of the identificationapproach.

In this school entry age setting, it is important to emphasize a difference betweenthe IV and a “genuine” RD approach. A true RD design captures the impact exactlyat the threshold. In the school entry age setting violations of monotonicity become lessand less of an issue as we shrink the RD sample to children born closer to the cutoffdate. For instance, if we had sufficient data we could select children born a day oneither side of the cutoff date only. Then for the (−) types D or H the difference inentry age under the two alternative values of the instrument would be at most 2 days:0 < EA(Dec 31)− EA(Jan 1) ≤ 2 days. This difference tends to zero as we shrink theRD sample to just a few moments before and after the cutoff and we thus capture theeffect right at the discontinuity: the impact of a rise in entry age of 12 months (typeE). In this setting, this is an important advantage of using a genuine RD approach asopposed to an IV (or RD-type) approach. 31

30We do not know exactly how many children were born in December versus January. However, theconclusion is not sensitive to a (reasonable) imbalance in ND and NJ . For example, if three quarters ofchildren were born in December (ND=78,017) and the rest in January (NJ=26,006), the test statistic is

still much larger than the critical value (S=13.6066).31Alternatively, one could include a trend in date of birth which, if correctly specified, would also

allow us to capture the effect right at the discontinuity while also using observations further from thecutoff date. In practice Black, Devereux, and Salvanes (2011) only observe month of birth implying that0 < EA(Dec) − EA(Jan) ≤ 2 months. Hence they can neither select children born very close to thecutoff date neither include a trend.

38

Page 39: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

A variety of studies use the same intuition in an IV or fuzzy RD setting to estimatethe causal effect of school entry age for different countries and/or outcomes: Bedard andDhuey (2006), Datar (2006), Puhani and Weber (2007), McEwan and Shapiro (2008),Elder and Lubotsky (2009), Fredriksson and Ockert (2009), Muhlenweg and Puhani(2010),Muhlenweg, Blomeyer, Stichnoth, and Laucht (2012). None of these studies inves-tigate the monotonicity assumption. The idea of using LEA as an instrument for schoolentry age is also very similar to the Angrist and Krueger (1991) idea of using quarterof birth as an instrument for schooling. In both cases the date of birth provides thevariation in the instrument.

5.3 Clark and Royer (2013): returns to education using an ex-pansion in compulsory schooling

Clark and Royer (2013) investigate the effect of education (D) on health (Y ). To dealwith endogeneity, they use two changes to British compulsory schooling laws (Z) thatgenerated differences in educational attainment across cohorts. The first reform raisedthe minimum school leaving age from 14 to 15: it was implemented on 1 April 1947and therefore it affected individuals born from April 1933 onwards. The second reformraised the minimum school leaving age from 15 to 16: it was implemented on 1 September1972 and it therefore affected individuals born from September 1957 onwards. Figure 15 isextracted from their paper and shows the impact of the schooling laws on different cohortsby quarter of birth. Both reforms had large impacts. The first reform affected about 50%of the population while the second reform affected about 25% of the population.

Figure 15: Clark and Royer (2013) page 2092

To identify the treatment effect, Clark and Royer (2013) use a fuzzy RD approachwhere the discontinuities are given by the 1 April 1933 and 1 September 1957 cutoffs indate of birth. In the estimation a local linear regression is adopted, selecting individualsborn 7 years around the cutoffs and a bandwidth of 46 months. They also include trendsin month of birth (see equation (1) in their paper). Clark and Royer (2013) find no effectof education on health, a result that stands in sharp contrast to previous estimates of the

39

Page 40: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

effects of education on health. They argue that this result is possibly due to their novelfuzzy RD approach.

In this context, monotonicity holds in the case where the schooling reforms inducedall individuals to get more education, or at least not less of it. Clark and Royer (2013)do not discuss monotonicity but again we can use the information provided in the paperto scrutinise the assumption. Table 16a is extracted from their paper and it showsthe estimated effect of the compulsory schooling changes. The first column reports theeffect on the years of schooling: the positive coefficients clearly shows that both reformsincreased the average years of schooling. The following columns report the effect bycumulate years of schooling: these columns show that the 1947 reform increased theproportion of individuals staying in education beyond 9 and 10 years of schooling but thesame reform actually decreased the proportion of individuals staying in education beyondthe 11, 12 and 13 years of schooling. This latter result is mostly true for men, see therectangular selection in the table.

We can use the results in their table to consider counterfactuals. There are severalpossible types of individuals based on actual and counterfactual behaviour which wesummarize in table 7. The sign in each cell indicates the change in years of schoolingif an individual is born before or after the 1947 school reform. The reform certainlyincreased the average years of schooling, so for monotonicity to hold what is required isthat no one belongs to a cell below the main diagonal (=), because that would imply adecrease in schooling due to the reform (-). It is plausible that no individual who would beattaining 9 or fewer years of schooling before the 1947 reform would attain less educationafter a reform that made it illegal (even if the data show that not everyone obeyed thelaw). The results in table 16a support this conclusion. What we cannot exclude though,is that someone who would attain 11 or more years of education before the reform wouldattain fewer years after, as suggested by the significantly positive coefficients in table16a.32 Note that we can use the information in the table to explicitly derive the CDFsfor men born before and after April 1933 (1947 reform). This is illustrated in figure 16b:the CDFs clearly cross.33

One might ask what kind of economic behaviour would explain such a pattern. Heck-man, Lochner, and Taber (1998) consider a setting where changes in tuition fees are usedas an instrument (Z) to measure the impact of having a college degree (D) on earnings(Y ). Their focus is on illustrating general equilibrium effects in skill prices. The idea isthat a quantitatively large tuition-reduction policy financed by a proportional tax canhave two-way effects. On the one hand, it can increase enrolment by directly reducing thecost of attending college. This is a standard partial equilibrium effect. On the other hand,it can decrease enrolment by reducing the wage premium of going to college, because ofthe increased supply of college-educated individuals. This is a general equilibrium (GE)effect. In their setting monotonicity would be violated if for a group of individuals thepartial equilibrium effect dominates and pushes them into taking the treatment, while for

32These coefficients reflect net flows: they could result from a large number of defiers (taking lesseducation after the reform) that are not completely compensated by the flow of compliers (taking moreeducation after the reform).

33Table 16a already provides evidence of statistically significant differences in the proportions obtaining9, 10 ,11, 12 and 13 (or fewer) years of schooling across both reforms. Therefore we do not formally testthe CDF crossing.

40

Page 41: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Figure 16: Monotonicity in Clark and Royer (2013)

(a) Clark and Royer (2013) page 2103

(b) stochastic dominance

41

Page 42: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Table 7: Monotonicity - Years of Schooling

Born AfterBorn before 9 10 11 12 13 14+9 = + + + + +10 - = + + + +11 - - = + + +12 - - - = + +13 - - - - = +14+ - - - - - =

The (+) term indicates that individuals would at-tain more schooling if born after the schooling re-form. Viceversa, the (-) term indicates that indi-viduals would attain less schooling.

another group of individuals the general equilibrium effect outweighs the partial equilib-rium effect and stops them from taking the treatment. When simulating the impact ofa tuition subsidy, Heckman, Lochner, and Taber (1998) find evidence that both groupsare present.34 In the setting of Clark and Royer (2013) we also have a quantitativelylarge policy, a schooling reform that affects 50% of the population. Thus, one possiblereason for the pattern observed by Clark and Royer (2013) is that GE effects are at play,reducing the incentive to attain 11 or more years of schooling for some individuals.

It is also important to discuss the presence of essential heterogeneity in the settingof Clark and Royer (2013). They use this reform to study the impact of education onhealth outcomes such as mortality, self-reported health status, self-reported health be-haviours such as smoking and drinking and clinical health measures collected by a nurse.A differential return to education in terms of health seems plausible, as individuals likelydiffer in the behavioural response to better health information (e.g. through peers orother sources) or better access to health care depending on their income and background.Sorting on gain is perhaps less intuitive in this setting. Would individuals take intoaccount their health return to education (or the return in terms of well-being more gen-erally) when deciding how much education to attain? We might expect that the healthreturn to education is at least a weaker driver of education decisions than the earningsreturn. However, the key issue is whether the health return to education (β) is correlatedwith educational attainment (D). For this to happen it is not necessary that individualstake into account their β when making education decisions. Suppose for instance thatindividuals do not know their health return (β) but they know and consider their earn-ings return (ρ) when making education decisions: Cov(ρ,D) 6= 0. Now suppose that thehealth return and earnings return to education are positively correlated: smarter andmore motivated individuals might get larger returns to education for a variety of differ-

34Let Vj be the discounted present value of earnings from choice j, with j = C (College) or j = H(High-School). In the model, individuals choose both the type of skill and the quantity of each skill j toacquire. The price of skill j thus enters Vj multiplicatively, while tuition enters additively. Individualswith a larger VC will lose more from a decrease in the price of skill C. Those with a high VH will benefitmore from an increase in the price of skill H. A change in tuition, however, affects everyone equally.This is what generates the two-way flow.

42

Page 43: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

ent outcomes such as earnings, health, networks, happiness. In this world Cov(β,D) 6= 0even though individuals do not directly take into account their β when making educationdecisions. In other words, sorting on gain should be interpreted in the broad sense of anon-zero correlation between treatment and returns. In the setting of Clark and Royer(2013) this is not implausible. Thus, we would argue that this study could and shouldhave investigated the monotonicity assumption.35

Finally, it is worth noting that the same British compulsory schooling reforms havebeen extensively used as an instrumental variable in various contexts such as earningsand labor activity (Devereux and Hart (2010), Harmon and Walker (1995), Oreopoulos(2006)); citizenship and political involvement (Milligan, Moretti, and Oreopoulos (2004));health of offspring (Lindeboom, Llena-Nozal, and van der Klaauw (2009)); fertility andteenage childbearing (Silles (2011)). None of these papers investigate monotonicity. Sim-ilar reforms have also been used in other countries to estimate the returns to educationfor a variety of outcomes.

6 Relaxing the Monotonicity Condition

A number of recent papers have tried to establish alternative conditions under whichit is still possible to interpret the IV and fuzzy RD estimates as a LATE even thoughmonotonicity is violated. Since these conditions mostly apply to a setting where theendogenous treatment variable is binary, we discuss them in a context where the Vietnamlottery draft is used as an instrument in measuring the return to college. Thus D = 1if individuals went to college and D = 0 otherwise. Similarly, Z = 1 if individuals weredraft eligible and Z = 0 otherwise. In this case the sample can be partitioned intofour types: compliers, defiers, always-takers and never-takers. In this context, compliersare individuals who go to college only when drafted (D(1) = 1, D(0) = 0), likely as adraft-avoidance strategy. For clarity, we relabel compliers as draft-avoiders (DA) in thissection. Defiers instead are individuals who do not obtain a college degree purely dueto the draft (D(1) = 0, D(0) = 1). We relabel defiers as dropouts (DP). Always-takers(AT) and never-takers (NT) complete the types space and are defined as standard in theliterature. In section 5.1 we discussed how monotonicity is likely to be violated becauseboth draft-avoiders and dropouts might be present. Thus it is informative to knowwhether monotonicity can be relaxed in this context. While we find all of the alternativeconditions discussed below rather restrictive in this setting, it is of course possible thatthey are more plausible in other contexts.

35Given the inclusion of trends on either side of the cutoffs, Clark and Royer (2013) effectively compareindividuals born one month apart. However, even if we could compare individuals born just one dayapart, the monotonicity issue would remain, as opposed to the case of school entry age discussed above.That is because education is a discrete variables, which is measured in years.

43

Page 44: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

6.1 Violations of monotonicity occurring at random

Klein (2010) shows how to recover the LATE for violations of monotonicity occurring atrandom. Let us rewrite the IV estimate in equation (2) as

βIV1 (z, w) = E[Y (1)− Y (0)|D(z)−D(w) = 1] (16)

− (1− λ) (E[Y (1)− Y (0)|D(z)−D(w) = 1]− E[Y (1)− Y (0)|D(z)−D(w) = −1])︸ ︷︷ ︸bias

When P [D(z) − D(w) = −1] 6= 0 such that λ 6= 1, one can see equation (16) as theLocal Average Treatment Effect for those individuals taking the treatment because ofthe instrument, plus a bias term. Klein (2010) provides an approximation of this biaswhen violations of monotonicity occur at random. In the general setting of equation(1), this implies that there is some random variable V causing individuals to respondto the instrument differently, with V being independent of the instrumental variable Zand (Y (0), Y (1), α0). In other words, V is a random variable that causes α1 to have adifferent sign for different individuals, in turn affecting the treatment status of (at least)some individuals.

Consider the context of measuring the return to college using the Vietnam lotterydraft. Whether individuals are draft-avoiders or dropouts could be driven by risk pref-erences, patriotism or pacifism, (bad) luck etc. The bias correction proposed by Klein(2010) requires that elements driving this behaviour (types) are independent of earningsability, the return to college or anything else that affects earnings or the decision to goto college (such as health, parental background).

6.2 Comvivors: More compliers than defiers

Let β1 = Y (1) − Y (0) have support B. Similarly let (Y (0), Y (1)) have support Y .de Chaisemartin (2012) replaces the monotonicity condition with either

IV-A1. P [D(z) > D(w)|β = b] ≥ P [D(z) < D(w)|β = b] for every b ∈ B or

IV-A2. P [D(z) > D(w)|(Y0, Y1) = (y0, y1)] ≥ P [D(z) < D(w)|(Y0, Y1) = (y0, y1)] forevery (y0, y1) ∈ Y

In other words, IV-A1 allows for both individuals switching in and out of treatment, aslong as the proportion of individuals switching in is larger than the proportion switchingout for any given value of the treatment effect. Under IV-A1 de Chaisemartin (2012)shows that we can identify a LATE for that fraction of individuals switching in that isin excess of the proportion switching out. de Chaisemartin (2012) refers to this group as“comvivors” (compliers surviving the defiers). Condition IV-A2 is stronger than IV-A1but it is partly testable using Kitagawa (2013) test while IV-A1 is not testable.

We can consider condition IV-A1 in the Vietnam lottery draft and return to collegesetting. For simplicity, assume that B is a set with two values of support, βL1 < βH1 , whereβ1 is the return to college. Let us also assume that in the absence of the lottery draft, theprobability of attaining a college degree is strongly increasing in β1 (sorting on gain). Wemight then expect never-takers and draft-avoiders to be drawn disproportionally fromthe pool with a low return to college: P (type = DA|β1 = βL1 ) > P (type = DA|β1 = βH1 ).

44

Page 45: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Similarly, we might expect always-takers and dropouts to be drawn disproportionally fromthe pool with a large return to college: P (type = DP |β1 = βH1 ) > P (type = DP |β1 =βL1 ). Condition IV-A1 rules out that P (type = DA|β1 = βH1 ) < P (type = DP |β1 = βH1 )while P (type = DA|β1 = βL1 ) > P (type = DP |β1 = βL1 ). Thus in this simplified setting,condition IV-A1 is only plausible if there is little sorting on gain.36

6.3 Local Monotonicity

Huber and Mellace (2012) replace the monotonicity condition with the following. In thissection, for notational clarity we write P [X|(Y0, Y1) = (y0, y1)] as P [X|(y0, y1)].

IV-A3. P [D(z) ≥ D(w)|(y1, y0)] = 1 or P [D(z) ≤ D(w)|(y1, y0)] = 1, ∀ (y1, y0) ∈ Y

Condition IV-A3 allows again for both individuals switching in and out of treatment.However, for any given realization of the potential outcomes (y1, y0), there are only indi-viduals switching in or only individuals switching out. Hence the name local monotonicity.Under condition IV-A3, Huber and Mellace (2012) show that one can separately identifythe LATE for the individuals switching in and for the individuals switching out.

One way to interpret IV-A3 in the Vietnam lottery draft and college setting is to as-sume that Y is a set with two values of support, (yL0 , yL1 ), (yH0 , y

H1 ), where (y0, y1) are the

potential earnings with and without college, and yLj < yHj for j = 0, 1. Once again assumethat in the absence of the lottery draft, the probability of attaining a college degree isincreasing in β1 (sorting on gain) but also, let β1 be increasing in (y1, y0) (for example, thereturn to college is increasing in ability). Using a similar reasoning as above, due to sort-ing on gain we would expect never-takers and draft-avoiders to be drawn disproportionallyfrom the pool of low potential earners: P (type = DA|(yL0 , yL1 )) > P (type = DA|(yH0 , yH1 )).Likewise, we would expect always-takers and dropouts to be disproportionally drawn fromthe pool with high potential earnings: P (type = DP |(yH0 , yH1 )) > P (type = DP |(yL0 , yL1 )).Condition IV-A3 would then require that P (type = DP |(yL0 , yL1 )) = 0 and P (type =DA|(yH0 , yH1 )) = 0. Thus in this setting IV-A3 is plausible with perfect sorting on gain.

7 Conclusion

Several studies have emphasised that IV and fuzzy RD estimates can be interpreted as alocal average treatment effect only if monotonicity is satisfied. This requirement appliesin a context with essential heterogeneity, a situation with heterogeneous treatment effectsand sorting into treatment based on the gain. Monotonicity is a restriction on the impactof the instrument on the treatment. It implies that a change in the instrument fromvalue w to z affects the treatment of all individuals in the same direction. If monotonic-ity does not hold then the IV and fuzzy RD estimates are generally not interpretable.Surprisingly, very few of the applied studies that rely on IV and fuzzy RD designs dis-cuss monotonicity. This is in stark contrast to the lengthy discussions dedicated to othertestable and untestable conditions like the IV independence and rank conditions, and tothe RD discontinuity (in the probability of treatment) and continuity (in the conditional

36Without sorting on gain there is no need to assume monotonicity altogether as discussed in section2.1.

45

Page 46: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

regression function) conditions. Given the importance of the monotonicity condition,this is an important missing step in evaluating internal validity. There are a number ofpossible explanations for this missing step. First, there seems to be a general consensusthat monotonicity is fundamentally untestable. Second, there is no clear framework tothink about monotonicity in practice.

In this paper we illustrate how informative the IV and fuzzy RD estimates are oncemonotonicity is violated, and do so under various degrees of treatment effect heterogene-ity and sorting on gain. We find that, under essential heterogeneity, interpretability islost even for minor violations of monotonicity. This reinforces the importance of investi-gating the monotonicity assumption. We argue that monotonicity can be partly tested.More precisely, the joint hypothesis of independence and monotonicity leads to testableimplications. Nevertheless these testable implications are necessary but not sufficient forindependence and monotonicity to hold. We then investigate monotonicity in differentapplied studies using economic intuition and data patterns. We claim that monotonic-ity can be reasonably assumed to hold in some cases. In other settings monotonicity isdebatable. We provide a thorough discussion as a guide to practice. Therefore, in ourview the applied literature is not “monotonically” hopeless. IV and fuzzy RD designscan be incredibly powerful in dealing with endogeneity problems. A discussion of themonotonicity assumption is just an extra step that should be included to validate theresults. As the key role of the monotonicity condition gains recognition, recent workis trying to either develop alternative tests or find ways of replacing monotonicity withweaker assumptions. This is a promising area of current and future research.

46

Page 47: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Appendix A Monotonicity in the fuzzy RD design

In this section we want to illustrate what the Wald estimator in (3) measures if themonotonicity assumption RD4 does not hold. Let e > 0 denote an arbitrary smallnumber.

Consider first the numerator. The mean difference in outcomes for individuals aboveand below the discontinuity point is

E[Yi|Zi = v0 + e]− E[Yi|Zi = v0 − e] =

E[Yi(0) + (Yi(1)− Yi(0))Di|Zi = v0 + e]− E[Yi(0) + (Yi(1)− Yi(0))Di|Zi = v0 − e] =

E[Yi(0)|Zi = v0 + e]− E[Yi(0)|Zi = v0 − e]+E[(Yi(1)− Yi(0))Di|Zi = v0 + e]− E[(Yi(1)− Yi(0))Di|Zi = v0 − e]

By assumption RD2, the first two terms cancel out at the limit e→0. By assumptionRD3, the remaining terms can be written as

E[(Yi(1)− Yi(0))Di(v0 + e)]− E[(Yi(1)− Yi(0))Di(v0 − e)] =

E[(Yi(1)− Yi(0))(Di(v0 + e)−Di(v0 − e))] =

1× E[Yi(1)− Yi(0)|Di(v0 + e)−Di(v0 − e) = 1]× P [Di(v0 + e)−Di(v0 − e) = 1]−1× E[Yi(1)− Yi(0)|Di(v0 + e)−Di(v0 − e) = −1]× P [Di(v0 + e)−Di(v0 − e) = −1]

The denominator can be rewritten:

E[Di|Zi = v0 + e]− E[Di|Zi = v0 − e] =

E[Di(v0 + e)]− E[Di(v0 − e)] =

E[Di(v0 + e)−Di(v0 − e)] =

P [Di(v0 + e)−Di(v0 − e) = 1]− P [Di(v0 + e)−Di(v0 − e) = −1]

The Wald estimator in (3) can then be rewritten as follows:

lime↓0E[Yi|v0 + e]− lime↑0E[Yi|v0 − e]lime↓0E[Di|v0 + e]− lime↑0E[Di|v0 − e]

=

lime→0(1 + λ)× E[Yi(1)− Yi(0)|Di(v0 + e)−Di(v0 − e) = 1]−

λ× E[Yi(1)− Yi(0)|Di(v0 + e)−Di(v0 − e) = −1]

where

λ =P [Di(v0 + e)−Di(v0 − e) = −1]

P [Di(v0 + e)−Di(v0 − e) = 1]− P [Di(v0 + e)−Di(v0 − e) = −1]

47

Page 48: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Appendix B Average Treatment Effects in the Roy

model

Using figure 1, we can write down the average treatment effects for the different types.

• Average Treatment Effect (ATE):

E[β] = β =

∫ +∞

−∞β f(β) dβ

• ATE on compliers (LATECM):

E[β|CM ] =1

P [−γH < β ≤ 0]

∫ 0

−γHβ f(β) dβ

The ATE on compliers is the expected β over the interval where compliers arelocated. Each β is weighted by the probability density f(β), adjusted for thefraction of individuals in that interval (P [−γH < β ≤ 0]).37

• ATE on defiers (LATEDF ):

E[β|DF ] =1

P [0 < β ≤ −γL]

∫ −γL0

β f(β) dβ

The ATE on defiers is obtained in a similar way as that for compliers, but over theinterval of β where defiers are located: [0,−γL].

• ATE on always-takers (LATEAT ):

E[β|AT ] = wAT1

P [0 < β ≤ −γL]

∫ −γL0

β f(β) dβ

+ (1− wAT )1

P [β > −γL]

∫ +∞

−γLβ f(β) dβ

The always-takers are spread over two different intervals of β. Over each intervalthere is an expected β. Each of these are then assigned a weight equal to thefraction of always-takers that are located in that interval (wAT and 1 − wAT , seebelow).

• ATE on never-takers (LATENT ):

E[β|NT ] = wNT1

P [−γH < β ≤ 0]

∫ 0

−γHβ f(β) dβ

+ (1− wNT )1

P [β ≤ −γH ]

∫ −γH−∞

β f(β) dβ

37Note that the sum of weights 1P [−γH<β≤0]

∫ 0

−γH f(β) dβ = 1F (0)−F (−γH)

∫ 0

−γH f(β) dβ = 1.

48

Page 49: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Also never-takers are spread over 2 intervals of β. Over each interval there is anexpected β. Each of these are then assigned a weight equal to the fraction of allnever-takers that are located in that interval (respectively wNT and 1 − wNT , seebelow).

The weights wAT and wNT are as follows:

wAT =pγH P [0 < β < −γL]

pAT

1− wAT = 1− pγH P [0 < β < −γL]

pAT=P [β > −γL]

pAT

wNT =pγL P [−γH < β < 0]

pNT

1− wNT = 1− pγL P [−γH < β < 0]

pNT=P [β < −γH ]

pNT

49

Page 50: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Appendix C Monotonicity in an IV design with Mul-

tivalued Treatment

To write the IV estimate of β1 in the case of a multivalued treatment and binary instru-ment, we take elements from the proof in the Appendix of Angrist and Imbens (1995) andgeneralize it by allowing violations of the monotonicity assumption. LetD ∈ 0, 1, . . . , Kbe the treatment with K > 1, and let Z ∈ w, z be the binary instrument. Then let I(A)be an indicator function for the event A. Define the following indicators: ψZk = I(D ≥ k)for Z = w, z and k = 0, 1, . . . , K + 1; χ = I(Z = z) and thus 1 − χ = I(Z = w). Notethat ψZ0 = 1 and ψZK+1 = 0 for all Z. Then the outcome Y can be written as

Y = χ× YDz + (1− χ)× YDw

=

χ×

K∑k=0

Yk(ψzk − ψzk+1)

+

(1− χ)×

K∑k=0

Yk(ψwk − ψwk+1)

where Yk is the outcome for an individual with D = k. Then, under the rank andindependence conditions (IV1 and IV2), the numerator of the Wald estimator can berewritten as

E(Y |Z = z)− E(Y |Z = w)

= E

[K∑k=0

Yk(ψzk − ψzk+1 − ψwk + ψwk+1)

]

= E

[K∑k=1

(Yk − Yk−1)(ψzk − ψwk)

]

=K∑k=1

1× E [Yk − Yk−1|ψzk − ψwk = 1] P [ψzk − ψwk = 1]

+ 0× E [Yk − Yk−1|ψzk − ψwk = 0] P [ψzk − ψwk = 0]

+ (−1)× E [Yk − Yk−1|ψzk − ψwk = −1] P [ψzk − ψwk = −1]

=K∑k=1

E [Yk − Yk−1|D(z) ≥ k > D(w)] P [D(z) ≥ k > D(w)]

−K∑k=1

E [Yk − Yk−1|D(w) ≥ k > D(z)] P [D(w) ≥ k > D(z)]

50

Page 51: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Similarly we can write the denominator of the Wald estimator as

E(D|Z = z)− E(D|Z = w)

= E

[K∑k=0

k × (ψzk − ψzk+1 − ψwk + ψwk+1)

]

= E

[K∑k=1

(ψzk − ψwk)

]

=K∑k=1

1× P [ψzk − ψwk = 1] + 0× P [ψzk − ψwk = 0] + (−1)× P [ψzk − ψwk = −1]

=K∑k=1

P [D(z) ≥ k > D(w)]− P [D(w) ≥ k > D(z)]

Then, the Wald estimator is given by

βIV1 (z, w) =E(Y |Z = z)− E(Y |Z = w)

E(D|Z = z)− E(D|Z = w)

=1

Ω

K∑k=1

E [Yk − Yk−1|D(z) ≥ k > D(w)]P [D(z) ≥ k > D(w)]−

E [Yk − Yk−1|D(w) ≥ k > D(z)]P [D(w) ≥ k > D(z)]

where Ω =∑K

k=1P [D(z) ≥ k > D(w)]− P [D(w) ≥ k > D(z)]

51

Page 52: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

References

Aliprantis, Dionissi. 2012. “Redshirting, Compulsory Schooling Laws, and EducationalAttainment.” Journal of Educational and Behavioral Statistics 37 (2):316–338.

Angrist, Joshua D. 1990. “Lifetime earnings and the Vietnam era draft.” The AmericanEconomic Review 80 (3):313–336.

Angrist, Joshua D. and Stacey H. Chen. 2011. “Schooling and the Vietnam-Era GI Bill:Evidence from the Draft Lottery.” American Economic Journal: Applied Economics3 (2):96–118.

Angrist, Joshua D. and Guido W. Imbens. 1995. “Two-stage least squares estimationof average causal effects in models with variable treatment intensity.” Journal of theAmerican statistical Association 90 (430):431–442.

Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. 1996. “Identification ofCausal Effects Using Instrumental Variables.” Journal of the American StatisticalAssociation 91 (434):444–455.

Angrist, Joshua D. and Alan B. Krueger. 1991. “Does Compulsory School AttendanceAffect Schooling and Earnings?” The Quarterly Journal of Economics 106 (4):979–1014.

Barrett, Garry F. and Stephen G. Donald. 2003. “Consistent tests for stochastic domi-nance.” Econometrica 71 (1):71–104.

Barua, Rashmi and Kevin Lang. 2009. “School Entry, Educational Attainment andQuarter of Birth: A Cautionary Tale of LATE.” Working Paper 15236, NationalBureau of Economic Research.

Bedard, Kelly and Elizabeth Dhuey. 2006. “The Persistence Of Early Childhood Ma-turity: International Evidence Of Long-Run Age Effects.” The Quarterly Journal ofEconomics 121 (4):1437–1472.

Bennett, Christopher J. 2009. “Consistent and Asymptotically Unbiased MinP Testsof Multiple Inequality Moment Restrictions.” Vanderbilt University Department ofEconomics Working Papers 0908, Vanderbilt University Department of Economics.

Black, Sandra E., Paul J. Devereux, and Kjell G. Salvanes. 2011. “Too Young to Leavethe Nest: The Effects of School Starting Age.” The Review of Economics and Statistics93 (2):455–467.

Brollo, Fernanda, Tommaso Nannicini, Roberto Perotti, and Guido Tabellini. 2013. “ThePolitical Resource Curse.” The American Economic Review forthcoming.

Buckles, Kasey and Daniel M. Hungerman. 2013. “Season of Birth and Later Outcomes:Old Questions, New Answers.” The Review of Economics and Statistics 95 (5):711–724.

52

Page 53: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Card, David. 1995. “Using Geographic Variation in College Proximity to Estimate theReturn to Schooling.” In Aspects of Labor Market Behaviour: Essays in Honour of JohnVanderkamp, edited by L.N. Christofides, E.K. Grant, and R. Swidinsky. University ofToronto Press, 201–222.

Card, David and Thomas Lemieux. 2001. “Going to College to Avoid the Draft: TheUnintended Legacy of the Vietnam War.” The American Economic Review 91 (2):97–102.

Chen, Le-Yu and Jerzy Szroeter. 2014. “Testing multiple inequality hypotheses: Asmoothed indicator approach.” Journal of Econometrics 178 (P3):678–693.

Clark, Damon and Heather Royer. 2013. “The Effect of Education on Adult Mortalityand Health: Evidence from Britain.” American Economic Review 103 (6):2087–2120.

Datar, Ashlesha. 2006. “Does delaying kindergarten entrance give children a head start?”Economics of Education Review 25 (1):43–62.

de Chaisemartin, Clement. 2012. “All you need is LATE.” CREST and Paris School ofEconomics - unpublished.

Devereux, Paul J. and Robert A. Hart. 2010. “Forced to be Rich? Returns to CompulsorySchooling in Britain.” The Economic Journal 120 (549):1345–1364.

Dobbie, Will and Paige Marta Skiba. 2013. “Information Asymmetries in ConsumerCredit Markets: Evidence from Payday Lending.” American Economic Journal: Ap-plied Economics 5 (4):256–82.

Elder, Todd E. and Darren H. Lubotsky. 2009. “Kindergarten Entrance Age and Chil-dren’s Achievement: Impacts of State Policies, Family Background, and Peers.” TheJournal of Human Resources 44 (3):641–683.

Fredriksson, Peter and Bjorn Ockert. 2009. “The effect of school starting age on schooland labor market performance.” Stockholm University, Department of Economics -unpublished.

Hahn, Jinyong, Petra Todd, and Wilbert Van der Klaauw. 2001. “Identification and Es-timation of Treatment Effects with a Regression-Discontinuity Design.” Econometrica69 (1):201–09.

Harmon, C. and I. Walker. 1995. “Estimates of the Economic Return to Schooling forthe United Kingdom.” The American Economic Review 85 (5):1278–86. Available athttp://ideas.repec.org/a/aea/aecrev/v85y1995i5p1278-86.html.

Heckman, James J., Lance Lochner, and Christopher Taber. 1998. “General-EquilibriumTreatment Effects: A Study of Tuition Policy.” The American Economic Review88 (2):381–386.

Heckman, James J., Daniel Schmierer, and Sergio Urzua. 2010. “Testing the CorrelatedRandom Coefficient Model.” Journal of Econometrics 158 (2):177 – 203. SpecificationAnalysis in Honor of Phoebus J. Dhrymes.

53

Page 54: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Heckman, James J., Sergio Urzua, and Edward Vytlacil. 2006. “Understanding Instru-mental Variables In Models With Essential Heterogeneity.” The Review of Economicsand Statistics 88 (3):389–432.

Heckman, James J. and Edward Vytlacil. 2005. “Structural Equations, Treatment Effects,and Econometric Policy Evaluation.” Econometrica 73 (3):669–738.

Horowitz, Joel L and Charles F Manski. 1995. “Identification and Robustness withContaminated and Corrupted Data.” Econometrica 63 (2):281–302.

Huber, Martin and Giovanni Mellace. 2012. “Relaxing monotonicity in the identificationof local average treatment effects.” Economics Working Paper Series 1212, Universityof St. Gallen, School of Economics and Political Science.

———. 2014. “Testing instrument validity for LATE identification based on inequalitymoment constraints.” The Review of Economics and Statistics Forthcoming (1143).

Imbens, Guido W. and Joshua D. Angrist. 1994. “Identification and Estimation of LocalAverage Treatment Effects.” Econometrica 62 (2):467–75.

Imbens, Guido W. and Donald B. Rubin. 1997. “Estimating outcome distributionsfor compliers in instrumental variables models.” The Review of Economic Studies64 (4):555–574.

Keane, Michael P. 2010. “Structural vs. atheoretic approaches to econometrics.” Journalof Econometrics 156 (1):3 – 20.

Kitagawa, Toru. 2013. “A Bootstrap Test for the Instrument Validity in HeterogeneousTreatment Effect Model.” University College London - unpublished.

Klein, Tobias J. 2010. “Heterogeneous treatment effects: Instrumental variables withoutmonotonicity?” Journal of Econometrics 155 (2):99 – 116.

Lee, David S. and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Eco-nomics.” Journal of Economic Literature 48 (2):281 – 355.

Lindeboom, Maarten, Ana Llena-Nozal, and Bas van der Klaauw. 2009. “Parental educa-tion and child health: Evidence from a schooling reform.” Journal of Health Economics28 (1):109 – 131.

McEwan, Patrick J. and Joseph S. Shapiro. 2008. “The Benefits of Delayed PrimarySchool Enrollment: Discontinuity Estimates Using Exact Birth Dates.” The Journalof Human Resources 43 (1):1–29.

Milligan, Kevin, Enrico Moretti, and Philip Oreopoulos. 2004. “Does education improvecitizenship? Evidence from the United States and the United Kingdom.” Journal ofPublic Economics 88 (910):1667 – 1695.

Muhlenweg, Andrea, Dorothea Blomeyer, Holger Stichnoth, and Manfred Laucht. 2012.“Effects of age at school entry (ASE) on the development of non-cognitive skills: Evi-dence from psychometric data.” Economics of Education Review 31 (3):68 – 76.

54

Page 55: Monotonicity in IV and fuzzy RD designs - A Guide to Practice€¦ · tion of a LATE in a fuzzy regression discontinuity (RD) approach. When monotonicity does not hold, the IV and

Muhlenweg, Andrea M. and Patrick A. Puhani. 2010. “The Evolution of the School-EntryAge Effect in a School Tracking System.” The Journal of Human Resources 45 (2):407– 438.

Oreopoulos, Philip. 2006. “Estimating Average and Local Average Treatment Effectsof Education when Compulsory Schooling Laws Really Matter.” American EconomicReview 96 (1):152–175.

Puhani, Patrick and Andrea Weber. 2007. “Does the early bird catch the worm?” Em-pirical Economics 32 (2-3):359 – 386.

Silles, Mary A. 2011. “The Effect of Schooling on Teenage Childbearing: Evidence UsingChanges in Compulsory Education Laws.” Journal of Population Economics 24 (2):761– 777.

55