Pitfalls in Analysis of Survey Data
Ec798, Lecture 4
Dilip Mookherjee
Fall 2010
1. Introduction
Purpose is to alert you to the key practical issues in analysis of survey data
By `analysis’, I mean making inferences concerning effectiveness of particular policies or programs, or behavioral patterns
Effectiveness assessment requires comparison of observed outcomes with a counterfactual: what would have happened in the absence of the program
Intro, contd.
Assessing counterfactuals requires appropriate benchmarks of comparison, and/or a theory which predicts how people and institutions would have behaved if the program had not been instituted, and how this would have changed the observed outcomes
Requires considerable creativity and ingenuity, in addition to an understanding of local context and institutions
Intro, contd.
Most people are prone to drawing inferences based on cross-sectional evidence (comparing areas with and without the program) or time-series evidence (comparing outcomes before and after the program), without being careful about assessing counterfactuals
This matters for how you react to and learn from almost any data pertaining to the effects of a given development program --- i.e., how you evaluate the work of others who make claims about effectiveness based on their analysis
Intro, contd.
Courses on statistics or econometrics will emphasize the assumptions needed to make valid inferences from data, how to assess the validity of these assumptions, and what to do if there is substantial doubt about their validity
In this session I will try to give you a practitioner’s perspective on this, based on my own experience
Will eschew technicalities, and provide an intuitive common-sense account
Pitfalls and Qualifications to Statistical Inference
I will try to give you a laundry list of the most common pitfalls and qualifications to what can be learned from analysis of statistical data concerning program effectiveness
And the most common techniques available for overcoming these
Even if you are not going to do this kind of analysis, it’s important for you to understand what others are doing, to review it critically, and raise appropriate questions
Laundry List
Pitfalls, concerning bias (of estimates):
Selection Bias and Endogeneity (reverse causality, omitted variables)
Measurement Error
Functional Form (non-linearity, censoring, truncation)
Qualifications, mainly concerning precision (calculating standard errors correctly):
Heteroscedasticity
Serial Correlation
Clustering
Selection Bias
Let’s say you compare outcomes of a program between areas which had and didn’t have it: e.g., decentralization of forest management to local user groups: how does forest degradation vary between areas with and without such decentralization? Or compare health of children in villages that received a sanitation program with villages lacking such a program
Selection Problem, contd.
Problem is that areas with more degraded forests may have been more likely to have forest user groups in the first place; the sanitation program is likely to have been targeted to villages with dirtier water and greater poverty
If so, your cross-sectional differences will underestimate the true effect of the program
However, you cannot be sure of the direction of the bias: communities more concerned about deforestation or health may have lobbied harder to get these programs
Selection Problem, contd.
Maybe you can get around this by looking at effects of the program before and after its implementation in the areas in which it was implemented
Let’s say you have a panel data-set and see an improvement after the program
But what if the areas which didn’t receive the program also witnessed an improvement? Maybe there was something else that was going on that explains the improvement in both sorts of areas?
Selection Problem, contd.
Then maybe you can compare the changes before and after in the treatment and control areas? (The diff-of-diff estimate)
Can we stop here? Can we trust/test the diff-of-diff estimate? What assumptions are needed? And so on…
Are there contexts where cross-sectional comparisons yield valid (unbiased) estimates? When might they be better than the panel data based diff-of-diff estimate?
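To make the contrast concrete, here is a minimal simulated sketch (all numbers invented, not from any actual survey): a program is purposively placed in worse-off areas, so the cross-sectional comparison of levels is badly biased, while the diff-of-diff comparison of changes recovers the true effect.

```python
import numpy as np

# Invented illustration: purposive placement biases the cross-section,
# diff-of-diff compares changes and recovers the true effect.
rng = np.random.default_rng(0)
n = 5000
true_effect = 2.0

# Areas with low baseline outcomes are more likely to get the program.
baseline = rng.normal(10, 2, n)
treated = (baseline + rng.normal(0, 1, n)) < 10   # purposive placement

# Common time trend of +1 for everyone; the program adds true_effect.
followup = baseline + 1.0 + true_effect * treated + rng.normal(0, 1, n)

# Naive cross-sectional estimate: compare post-program levels.
cross_section = followup[treated].mean() - followup[~treated].mean()

# Diff-of-diff: compare changes over time across the two groups.
did = ((followup[treated] - baseline[treated]).mean()
       - (followup[~treated] - baseline[~treated]).mean())

print(f"cross-section: {cross_section:.2f}")  # badly biased downward
print(f"diff-of-diff:  {did:.2f}")            # close to the true effect of 2
```

The diff-of-diff works here because the purposive placement operates only through baseline levels, which difference out; this is exactly the assumption questioned below.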
Pitfall No. 1: Endogeneity
Selection problems are part of a wider concern about endogeneity of program placement
One form of endogeneity: reverse causality (is forest degradation affecting creation of forest user groups? Is health driving placement of the sanitation program or the other way around?)
The other form is omitted variable bias: maybe some third, unobserved variable, such as the underlying social capital of the community, is driving both deforestation and user group formation?
Other Examples of Endogeneity Problems
Suppose you are interested in effectiveness of a price subsidy program for rice on rice consumption: does consumption cause price or the other way around? Do underlying tastes for rice affect both price and consumption?
Are small farms more productive than large farms? Or do more productive farms tend to be smaller (owing to greater subdivision)? Is unobserved soil quality driving both size and productivity?
Endogeneity Examples, contd.
What is the effectiveness of a fertilizer distribution program on farm productivity? Does fertilizer application drive productivity? Or is it the case that more hardworking, motivated farmers tend to respond to the program more actively and apply more fertilizer?
Does under-nutrition cause low productivity/earnings, or the other way around?
Pitfall No. 2: Measurement Error
Is the independent variable measured accurately?
Problems measuring income, consumption based on survey responses (recall, aggregation, purposive..)
May not have data concerning program implementation at a disaggregated enough scale (e.g., interested in village-level effects but only have program intensity at province level)
Measurement Error, contd.
`Iron Law of Econometrics’: measurement error in the independent variable (only) causes an under-estimate of the program effect (attenuation bias)
Intuitively this is because the estimate of the effect is based on how the independent and dependent variables co-vary, relative to the variation of the independent variable
Example of Attenuation Bias
Suppose you over-estimated placement of an effective fertilizer distribution program: some villages that didn’t get the program are mistakenly believed to have got it
Then you would be assessing program effectiveness by comparing mean farm yields in villages that are thought to have got the program, with those that appear not to have
You would under-estimate the effectiveness as some low-yield villages are mistakenly believed to have got the program
But Note That:
Measurement error in the dependent variable does not matter (for bias): if productivity is measured with error, this pertains to both kinds of villages equally
Not all kinds of independent variable error matter (e.g., when data is at a higher level of aggregation: the measurement error is orthogonal to the measured value of the independent variable, so it washes out in the aggregate)
Measurement error cannot reverse the sign of the effect, or raise its quantitative magnitude (unlike endogeneity problems)
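These three points can be checked in a small invented simulation: classical measurement error in the independent variable attenuates the OLS slope by the reliability ratio var(x) / (var(x) + var(error)), while error in the dependent variable leaves the slope unbiased.

```python
import numpy as np

# Invented illustration of attenuation bias; all numbers are made up.
rng = np.random.default_rng(1)
n = 20000
x = rng.normal(0, 1, n)            # true program intensity, var = 1
y = 3.0 * x + rng.normal(0, 1, n)  # true effect = 3

x_noisy = x + rng.normal(0, 1, n)  # measured with error, var(error) = 1
y_noisy = y + rng.normal(0, 1, n)  # error in the dependent variable

def ols_slope(a, b):
    # Bivariate OLS slope: cov(a, b) / var(a).
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

print(ols_slope(x, y))        # ~3.0: unbiased with the true variable
print(ols_slope(x_noisy, y))  # ~1.5: attenuated by 1/(1+1) = 0.5
print(ols_slope(x, y_noisy))  # ~3.0: still unbiased, just less precise
```

Note the attenuated estimate keeps the right sign and a smaller magnitude, consistent with the claim that measurement error cannot reverse or inflate the effect.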
How Can You Tell How Serious Endogeneity or Measurement Error is?
Have to rely on your understanding of the situation, and your prior expectations based on theory
There is no easy test or measure
What you can do is analyze the data differently so as to correct for the problems, and see how much of a difference this makes
How to Correct for Endogeneity
Approach 1: Control for possible omitted variables: collect data on those and include them in the regression
What about unobserved omitted variables? Here panel data can be very useful: use fixed effects to control for unobserved heterogeneity
E.g., in the analysis of user groups and deforestation, unobserved `social capital’ which potentially affects both formation of user groups and deforestation is effectively controlled for, by looking at effects of formation of user groups on changes in forest quality
No longer comparing levels across areas, but changes over time --- the diff-of-diff estimate
Other Examples of Diff-of-Diff
Productivity variation by farm size or fertilizer application: control for soil fertility as best as you can, control for farmer ability/motivation with farmer fixed effects, for unobserved plot quality with plot fixed effects
Need data for same farmer over time as he changes scale of cultivation (for farmer fixed effects), for productivity of separate plots (for plot fixed effects) with differential fertilizer application
Assumptions Underlying Diff-of-Diff
Have to still assume that program placement or its timing was exogenous (i.e., uncorrelated with the error term)
At the level of changes over time, placement was not purposive (e.g., can you rule out the possibility that the creation of user groups was just one of many simultaneous changes, another of which was really driving the improvement?)
Test by looking at pre-program trends, other policies etc.
Other Assumptions underlying D-o-D
Effects of unobserved omitted variables are linear and additive so they can be washed out by looking at changes over time
No significant increase in measurement error when looking at changes over time (if panel responses are based on recall, a lot of the reported changes may just be the result of recall errors)
If this is the case, the cross-sectional estimate may involve less bias
Often, significant cross-country regression results disappear in panel data: we don’t know whether to interpret this as evidence of significant OV bias in the cross-section, or of significant attenuation bias in the panel
Instrumental Variables
Another qualification: DoD deals with unobserved heterogeneity, but not reverse causality (nutrition-earnings example)
IV estimator: the most commonly used method to deal with endogeneity problems and measurement error
Idea is to find an instrument for the independent variable: a source of variation in the independent variable which logically cannot have a direct impact on the dependent variable
Examples of IV/Natural Experiments
UK water quality-mortality study (1853 London cholera epidemic): which of two companies was supplying water to any given street
Cuban boatlift effect on labor supply in Miami
Middle East events that affect international oil prices, which shift the price of rice owing to higher transport costs, but not the consumption demand for rice
Regression discontinuity: class-size effects on learning; minimum wage laws across state borders
IV Assumptions
Two key assumptions for an instrument to be valid:
It has to predict significant variation in the independent variable in question (water quality/labor supply/rice price/class-size): the first-stage F
Exclusion restriction: conditional on the effect on the independent variable (and other controls) there is no direct effect on the dependent variable
No statistical tests for the exclusion restriction; based on theory and institutional knowledge
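A minimal simulated sketch of the IV idea (loosely in the spirit of the oil-price/rice example, with all numbers invented): an unobserved confounder biases OLS, while the instrument, which moves the regressor but is excluded from the outcome equation, recovers the true effect.

```python
import numpy as np

# Invented illustration of IV: z moves x but has no direct effect on y.
rng = np.random.default_rng(2)
n = 20000
z = rng.normal(0, 1, n)                 # instrument (e.g., a cost shock)
u = rng.normal(0, 1, n)                 # unobserved confounder
x = 0.8 * z + u + rng.normal(0, 1, n)   # endogenous regressor
y = 1.0 * x + u + rng.normal(0, 1, n)   # true effect = 1, but corr(x, u) != 0

# OLS is biased upward because u enters both x and y.
ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV (Wald) estimate: cov(z, y) / cov(z, x).
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS: {ols:.2f}  IV: {iv:.2f}")  # OLS biased upward; IV close to 1
```

The exclusion restriction is built into the simulation by construction (z appears nowhere in the y equation); in real data that restriction is exactly what cannot be tested.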
Pitfall No. 3: Functional Form
Regression estimates of continuous treatment effects are based on the hypothesis of a linear relationship between independent and dependent variable (e.g., the effect of a drug does not depend on dosage)
In many cases, may expect this to be wrong (water on productivity, age on earnings, community heterogeneity on collective action, gender empowerment on ROSCA participation)
In other cases, may not know what pattern to expect
Additional problem: the program effect may be heterogeneous (a very serious practical problem)
Testing Linearity
Include higher-order terms, log transformations, interaction effects, etc.
Non-parametric analysis
Both have practical problems which can be resolved only with sufficient data
Can do only with respect to one variable at a time
Censoring and Truncation Bias
Particular form of functional form problem: limited dependent variables
Sometimes the variable is zero or one (e.g., member of a group or not, road built or not)
Sometimes it is endogenously truncated: you cannot work a negative number of hours, or collect a negative quantity of firewood
Ignoring the inherent nonlinearity of the data can give rise to significantly biased estimates
Censoring and Truncation Biases
What can you do?
Assume a functional form for the distribution of errors: e.g., probit or logit regressions for 0-1 variables, tobits for truncated variables
Results could be sensitive to what you assume here
Some newer methods don’t depend so much on error distributions (semi-parametric methods, such as LAD)
Warning: cannot easily extend to panel estimators such as diff-of-diff!
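To see why ignoring the censoring biases estimates, here is an invented simulation in the spirit of the hours-worked example: the latent outcome is censored at zero, and OLS on the censored variable understates the true slope (the kind of bias a tobit is designed to correct).

```python
import numpy as np

# Invented illustration of censoring bias; all numbers are made up.
rng = np.random.default_rng(3)
n = 20000
x = rng.normal(0, 1, n)
latent = 1.0 + 2.0 * x + rng.normal(0, 2, n)  # true slope = 2
hours = np.maximum(latent, 0.0)               # hours cannot be negative

def ols_slope(a, b):
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

slope_latent = ols_slope(x, latent)    # unbiased on the latent variable
slope_censored = ols_slope(x, hours)   # biased toward zero by censoring

print(f"latent slope:   {slope_latent:.2f}")   # ~2.0
print(f"censored slope: {slope_censored:.2f}") # noticeably below 2.0
```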
Qualifications Laundry List
Problems emphasized in many econometrics texts concern correct assessment of the precision of estimates (how to calculate standard errors):
Heteroscedasticity
Serial correlation
Clustering (more important, less often discussed in textbooks)
Ignoring these may cause you to overlook more precise estimates, and more importantly overestimate your precision/statistical significance (thus biasing inferences)
Heteroscedasticity
Where precision varies with the `size’ of the independent variable, OLS is not the most precise estimator (the data needs to be re-weighted), and the standard errors are incorrectly calculated
STATA can make these corrections for you (`robust’ or White-corrected standard errors)
Case for quantile regressions
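The White correction mentioned above can be computed by hand with the standard `sandwich’ formula; the data-generating process below is invented for illustration, with error variance growing in x so the classical formula understates the slope’s standard error.

```python
import numpy as np

# Invented illustration of White/robust (HC0) standard errors.
rng = np.random.default_rng(4)
n = 10000
x = rng.uniform(1, 5, n)
y = 2.0 * x + rng.normal(0, x**2, n)     # error sd grows with x
X = np.column_stack([np.ones(n), x])     # regressors with intercept

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y                 # OLS coefficients
resid = y - X @ beta

# Classical SE assumes one common error variance.
s2 = resid @ resid / (n - 2)
se_classical = np.sqrt(s2 * XtX_inv[1, 1])

# White "sandwich" SE uses squared residuals observation by observation
# (what STATA's robust option does).
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt((XtX_inv @ meat @ XtX_inv)[1, 1])

print(f"slope {beta[1]:.2f}, classical SE {se_classical:.4f}, "
      f"robust SE {se_robust:.4f}")
```

Here the robust standard error comes out larger than the classical one, because the largest error variances coincide with high-leverage observations.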
Serial Correlation
Problem when you have repeated observations for the same agent or unit over time: if they are not independent, treating them as such means you overestimate the precision of your estimates
Problem with macro time series data, also with panel data
Can test for severity (e.g., the Durbin-Watson statistic) and correct the standard error estimates
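The Durbin-Watson statistic is simple to compute from regression residuals; a sketch on made-up AR(1) data (a value near 2 suggests no first-order serial correlation; values well below 2 signal positive autocorrelation):

```python
import numpy as np

# Invented AR(1) time series to illustrate the Durbin-Watson statistic.
rng = np.random.default_rng(5)
T = 500
e = np.zeros(T)
for t in range(1, T):
    e[t] = 0.7 * e[t - 1] + rng.normal()   # positively autocorrelated errors
x = rng.normal(0, 1, T)
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# DW = sum of squared first differences of residuals / sum of squares.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(f"Durbin-Watson: {dw:.2f}")  # well below 2, flagging serial correlation
```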
Clustering
Most serious problem is when the data is clustered (by village, industry, location, etc.) and different observations in each cluster are not independent: again results in an overestimate of precision (underestimation of s.e.’s)
STATA cluster command can correct your s.e. estimate (you have to specify the `level’ of clustering)
This can often blow the standard errors sky-high, at which point the statistical significance of all your results can disappear
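A hand-computed version of the cluster correction makes the point vivid; everything below is simulated, with treatment varying only at the village level (as with program placement) and a shared village-level shock, so observations within a village are not independent.

```python
import numpy as np

# Invented illustration of cluster-robust standard errors.
rng = np.random.default_rng(6)
n_villages, per_village = 100, 20
village = np.repeat(np.arange(n_villages), per_village)
n = n_villages * per_village

# Treatment varies only at the village level.
treated_village = rng.random(n_villages) < 0.5
x = treated_village[village].astype(float)
shock = rng.normal(0, 1, n_villages)[village]   # shared within village
y = 1.0 + 0.5 * x + shock + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Naive SE treats all n observations as independent.
se_naive = np.sqrt(resid @ resid / (n - 2) * XtX_inv[1, 1])

# Cluster-robust sandwich: sum score contributions within each village
# (what STATA's cluster option does, up to finite-sample factors).
meat = np.zeros((2, 2))
for g in range(n_villages):
    idx = village == g
    s = X[idx].T @ resid[idx]      # cluster-level score
    meat += np.outer(s, s)
se_cluster = np.sqrt((XtX_inv @ meat @ XtX_inv)[1, 1])

print(f"naive SE {se_naive:.3f}  cluster SE {se_cluster:.3f}")
```

With 20 correlated observations per village, the cluster-corrected standard error is several times the naive one, exactly the `blown sky high’ phenomenon described above.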
Concluding Comments
Many filters and pinches of salt are involved here, but these are absolutely fundamental to separating garbage from real evidence
Pitfalls (concerning bias) and Qualifications (concerning precision) are distinct, but both can result in misleading inferences
There are lots of techniques for detecting and correcting these problems
Cannot rely on `technical’ fixes alone: no substitute for good and sufficient data, common sense, intuition, theory and institutional knowledge
Ultimately to be useful and compelling, the analysis must be simple and clear