35
Impact Evaluation Impact Evaluation Sebastian Galiani Sebastian Galiani November 2006 November 2006 Causal Inference Causal Inference

Impact Evaluation Sebastian Galiani November 2006 Causal Inference

Embed Size (px)

Citation preview

Page 1: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

Impact EvaluationImpact Evaluation

Sebastian GalianiSebastian Galiani

November 2006November 2006

Causal InferenceCausal Inference

Page 2: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

2WBISARHDN WBISARHDN

Motivation The research questions that motivate

most studies in the health sciences are causal in nature. For example:

What is the efficacy of a given drug in Impact: given population? What fraction of deaths from a given disease could have been avoided by a given treatment or policy?

Page 3: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

3WBISARHDN WBISARHDN

Motivation

The most challenging empirical questions in economics also involve causal-effect relationships:

Does school decentralization improve schools quality?

Page 4: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

4WBISARHDN WBISARHDN

Motivation Interest in these questions is motivated

by: Policy concerns

Does privatization of water systems improve child health?

Theoretical considerations

Problems facing individual decision makers

Page 5: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

5WBISARHDN WBISARHDN

Causal Analysis The aim of standard statistical analysis, typified

by likelihood and other estimation techniques, is to infer parameters of a distribution from samples drawn of that distribution.

With the help of such parameters, one can:

1. Infer association among variables,

2. Estimate the likelihood of past and future events,

3. As well as update the likelihood of events in light of new evidence or new measurement.

Page 6: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

6WBISARHDN WBISARHDN

Causal Analysis These tasks are managed well by standard

statistical analysis as long as experimental conditions remain the same.

Causal analysis goes one step further:

Its aim is to infer aspects of the data generation process.

With the help of such aspects, one can deduce not only the likelihood of events under static conditions, but also the dynamics of events under changing conditions.

Page 7: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

7WBISARHDN WBISARHDN

Causal Analysis This capability includes:

1. Predicting the effects of interventions2. Predicting the effects of spontaneous changes3. Identifying causes of reported events

This distinction implies that causal and associational concepts do not mix.

Page 8: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

8WBISARHDN WBISARHDN

Causal AnalysisThe word cause is not in the vocabulary of standard

probability theory.

All Probability theory allows us to say is that two events are mutually correlated, or dependent –meaning that if we find one, we can expect to encounter the other.

Scientists seeking causal explanations for complex phenomena or rationales for policy decisions must therefore supplement the language of probability with a vocabulary for causality.

Page 9: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

9WBISARHDN WBISARHDN

Causal Analysis

Two languages for causality have been proposed:

1. Structural equation modeling (ESM) (Haavelmo 1943).

2. The Neyman-Rubin potential outcome model (RCM) (Neyman, 1923; Rubin, 1974).

Page 10: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

10WBISARHDN WBISARHDN

The Rubin Causal Model

Define the population by U. Each unit in U is denoted by u.

For each u U, there is associated a value Y(u) of the variable of interest Y, which we call: the response variable.

Let A be a second variable defined on U. We call A an attribute of the units in U.

Page 11: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

11WBISARHDN

The key notion is the potential for exposing or not exposing each unit to the action of a cause:

Each unit has to be potentially exposable to any one of the causes.

Thus, Rubin takes the position that causes are only those things that could be treatments in hypothetical experiments.

An attribute cannot be a cause in an experiment, because the notion of potential exposability does not apply to it.

Page 12: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

12WBISARHDN

For simplicity, we assume that there are just two causes or level of treatment.

Let D be a variable that indicates the cause to which each unit in U is exposed:

controltoexposedisuunitif

treatmenttoexposedisuunitif

c

tD

In a controlled study, D is constructed by the experimenter. In an uncontrolled study, it is determined by factors beyond the experimenter’s control.

Page 13: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

13WBISARHDN

The values of Y are potentially affected by the particular cause, t or c, to which the unit is exposed.

Thus, we need two response variables:

Yt(u), Yc(u)

Yt is the value of the response that would be observed if the unit were exposed to t and

Yc is the value that would be observed on the same unit if it were exposed to c.

Page 14: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

14WBISARHDN

Let D also be expressed as a binary variable:

D = 1 if D = t and D = 0 if D = c

Then, the outcome of each individual can be written as:

Y(U) = D Y1 + (1 – D) Y0

Page 15: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

15WBISARHDN

Definition: For every unit u treatment {Du = 1 instead of Du = 0} causes the effect

u = Y1(u) – Y0(u)

This definition of a causal effect assumes that the treatment status of one individual does not affect the potential outcomes of other individuals.

Fundamental Problem of Causal Inference: It is impossible to observe the value of Y1(u) and Y0(u) on the same unit and, therefore, it is impossible to observe the effect of t on u.

Another way to express this problem is to say that we cannot infer the effect of treatment because we do not have the counterfactual evidence i.e. what would have happened in the absence of treatment.

Page 16: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

16WBISARHDN

Given that the causal effect for a single unit u cannot be observed, we aim to identify the average causal effect for the entire population or for sub-populations.

The average treatment effect ATE of t (relative to c) over U (or any sub-population) is given by:

ATE =E [Y1(u) – Y0(u)]

= E [Y1(u)] – E [Y0(u)]

(1)01 YY

Page 17: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

17WBISARHDN

The statistical solution replaces the impossible-to-observe causal effect of t on a specific unit with the possible-to-estimate average causal effect of t over a population of units.

Although E(Y1) and E(Y0) cannot both be calculated, they can be estimated.

Most econometrics methods attempt to construct from observational data consistent estimates of

01 YandY

Page 18: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

18WBISARHDN

Consider the following simple estimator of ATE:

(2)0]D|Y[-1]D|Y[ˆ01

Note that equation (1) is defined for the whole population, whereas equation (2) represents an estimator to be evaluated on a sample drawn from that population

Page 19: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

19WBISARHDN

Let equal the proportion of the population that would be assigned to the treatment group.

Decomposing ATE, we have:

0D|YY)1(1D|YY 0101

0100

11

YY]0D|Y[)1(]1D|Y[

]0D|Y[)1(]1D|Y[

}0{}1{ )1( DD

Page 20: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

20WBISARHDN

If we assume that

]0D|Y[)1(]0D|Y[

]1D|Y[)1(]1D|Y[

00

11

0]D|Y[1]D|Y[and0]D|Y[1]D|Y[ 0011

0]D|Y[-1]D|Y[ˆ01

0]D|Y[-1]D|Y[ 01

Which is consistently estimated by its sample analog estimator:

Page 21: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

21WBISARHDN

Thus, a sufficient condition for the standard estimator to consistently estimate the true ATE is that:

0]D|Y[1]D|Y[and0]D|Y[1]D|Y[ 0011 In this situation, the average outcome under the treatment

and the average outcome under the control do not differ between the treatment and control groups.

In order to satisfy these conditions, it is sufficient that treatment assignment D be uncorrelated with the potential outcome distributions of Y1 and Y2.

The principal way to achieve this uncorrelatedness is through random assignment of treatment.

Page 22: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

22WBISARHDN

In most circumstances, there is simply no information available on how those in the control group would have reacted if they had received the treatment instead.

This is the basis for an important insight into the potential biases of the standard estimator (2).

After a bit of algebra, it can be shown that:

ityHeterogeneTreatment

0}{D1}{D

DifferenceBaseline

00 )1(0]D|Y[1]D|Y[ˆ

Page 23: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

23WBISARHDN

This equation specifies the two sources of biases that need to be eliminated from estimates of causal effects from observational studies.

1. Selection Bias: Baseline difference. 2. Treatment Heterogeneity.

Most of the methods available only deal with selection bias, simply assuming that the treatment effect is constant in the population or by redefining the parameter of interest in the population.

Page 24: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

24WBISARHDN WBISARHDN

Treatment on the Treated ATE is not always the parameter of interest.

In a variety of policy contexts, it is the average treatment effect for the treated that is of substantive interest:

TOT =E [Y1(u) – Y0(u)| D = 1] = E [Y1(u)| D = 1] – E [Y0(u)| D = 1]

Page 25: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

25WBISARHDN WBISARHDN

Treatment on the Treated

The standard estimator (2) consistently estimates TOT if:

0]D|Y[1]D|Y[ 00

Page 26: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

26WBISARHDN WBISARHDN

Structural Equation Modeling

Structural equation modeling was originally developed by geneticists (Wright 1921) and economists (Haavelmo 1943).

Page 27: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

27WBISARHDN WBISARHDN

Structural Equations Definition: An equation

y = β x + ε (8)

is said to be structural if it is to be interpreted as follows:

In an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value y of Y is given by β x + ε, where ε is not a function of the settings x and z.

This definition is in the spirit of Haavelmo (1943), who explicitly interpreted each structural equation as a statement about a hypothetical controlled experiment.

Page 28: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

28WBISARHDN

Thus, to the often asked question, “Under what conditions can we give causal interpretation to structural coefficients?”

Haavelmo would have answered: Always!

According to the founding father of SEM, the conditions that make the equation y = β x + ε structural are precisely those that make the causal connection between X and Y have no other value but β, and ensuring that nothing about the statistical relationship between x and ε can ever change this interpretation of β.

Page 29: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

29WBISARHDN

The average causal effect: The average causal effect on Y of treatment level x is the difference in the conditional expectations:

E(Y|X = x) – E(Y|X = 0)

In the context of dichotomous interventions (x = 1), this causal effect is called the average treatment effect (ATE).

Page 30: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

30WBISARHDN WBISARHDN

Representing Interventions Consider the structural model M:

z = fz(w)

x = fx(z, )

y = fy(x, u)

We represent an intervention in the model through a mathematical operator denoted d0(x).

d0(x) simulates physical interventions by deleting certain functions from the model, replacing them by a constant X = x, while keeping the rest of the model unchanged.

Page 31: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

31WBISARHDN

To emulate an intervention d0(x0) that holds X constant (at X = x0) in model M, replace the equation for x with x = x0, and obtain a new model, Mx0

z = fz(w)

x = x0

y = fy(x, u)

The joint distribution associated with the modified model, denoted P(z, y| d0(x0)) describes the post-intervention (“experimental”) distribution.

From this distribution, one is able to assess treatment efficacy by comparing aspects of this distribution at different levels of x0.

Page 32: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

32WBISARHDN WBISARHDN

Structural Parameters Definition: The interpretation of a structural

equation as a statement about the behavior of Y under a hypothetical intervention yields a simple definition for the structural parameters.

The meaning of β in the equation y = β x + ε is simply

(x)]d|E[Yx o

Page 33: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

33WBISARHDN WBISARHDN

Counterfactual Analysis in Structural Models

Consider again model Mxo. Call the solution of Y the potential response of Y to x0.

We denote it as Yx0(u, , w).

This entity can be given a counterfactual interpretation, for it stands for the way an individual with characteristics (u, , w) would respond, had the treatment been x0, rather than the x = fx(z, ) actually received by the individual.

Page 34: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

34WBISARHDN

In our example,

Yx0(u, , w) = Yx0(u) = y = fy(x0, u)

• This interpretation of counterfactuals, cast as solutions to modified systems of equations, provides the conceptual and formal link between structural equation modeling and the Rubin potential-outcome framework.

• It ensures us that the end results of the two approaches will be the same.

• Thus, the choice of model is strictly a matter of convenience or insight.

Page 35: Impact Evaluation Sebastian Galiani November 2006 Causal Inference

35WBISARHDN WBISARHDN

References Judea Pearl (2000): Causality: Models, Reasoning and

Inference, CUP. Chapters 1, 5 and 7. Trygve Haavelmo (1944): “The probability approach in

econometrics”, Econometrica 12, pp. iii-vi+1-115. Arthur Goldberger (1972): “Structural Equations Methods in

the Social Sciences”, Econometrica 40, pp. 979-1002. Donald B. Rubin (1974): “Estimating causal effects of

treatments in randomized and nonrandomized experiments”, Journal of Educational Psychology 66, pp. 688-701.

Paul W. Holland (1986): “Statistics and Causal Inference”, Journal of the American Statistical Association 81, pp. 945-70, with discussion.