Nature gives us correlations…
Evaluation Research (8521), Prof. Jesse Lecy
Lecture 0
Policy / Program (Black Box)
Input Outcome
The Program Evaluation Mindset
(something happens here)
Outcome = f(program, input level)
Outcome = b0 + b1 · Input + ε
The slope tells us how much impact we expect a program to have when we spend one additional unit of input.
The outcome is some function of the program and the amount of inputs into the process. It can sometimes be represented by this simple input-output equation.
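The input-output equation can be sketched with simulated data. This is a minimal illustration, not course material: the sample size, the "true" intercept and slope, and the noise level are all assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 program sites with different input levels.
inputs = rng.uniform(0, 100, size=200)
true_b0, true_b1 = 10.0, 0.5             # assumed "true" intercept and slope
noise = rng.normal(0, 5, size=200)       # the error term (epsilon)
outcome = true_b0 + true_b1 * inputs + noise

# Ordinary least squares fit of: Outcome = b0 + b1 * Input + error
X = np.column_stack([np.ones_like(inputs), inputs])
coef = np.linalg.lstsq(X, outcome, rcond=None)[0]
b0, b1 = coef

print(f"estimated intercept b0 = {b0:.2f}")
print(f"estimated slope     b1 = {b1:.2f}")
```

The estimated slope b1 is read as the expected change in the outcome for one additional unit of input.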
[Figure: Dosage and Response. x-axis: Caffeine (mm); y-axis: Heart Rate (per min)]
HeartRate = b0 + b1 · Caffeine + ε
Effect
HeartRate = b0 + b1 · Caffeine + ε
Heart Rate
Treatment (Caffeine)
Control (No caffeine)
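When caffeine is coded as a 0/1 treatment indicator, the slope b1 in the heart-rate equation equals the difference in mean heart rate between the treatment and control groups. A minimal sketch with simulated data follows; the group sizes, the baseline heart rate of 60, and the effect of 5 beats per minute are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical experiment: 1 = treatment (caffeine), 0 = control (no caffeine).
caffeine = rng.integers(0, 2, size=n)
heart_rate = 60 + 5 * caffeine + rng.normal(0, 4, size=n)  # assumed effect: +5 bpm

# Effect estimated as a simple difference in group means.
diff_in_means = (heart_rate[caffeine == 1].mean()
                 - heart_rate[caffeine == 0].mean())

# Same effect estimated as the regression slope on the 0/1 indicator.
X = np.column_stack([np.ones(n), caffeine])
b0, b1 = np.linalg.lstsq(X, heart_rate, rcond=None)[0]

print(f"difference in means = {diff_in_means:.3f}")
print(f"regression slope b1 = {b1:.3f}")
```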
[Figure: Dosage and Response. x-axis: # of Potato Chips; y-axis: Heart Rate (per min)]
http://www.radiolab.org/2010/oct/08/its-alive/ 4:15-
[Figure: Dosage and Response. x-axis: Caffeine (mm); y-axis: Heart Rate (per min)]
[Figure: City Density and Productivity. x-axis: Walking Speed; y-axis: Number of Patents Per 10000 Residents]
How do we know when the interpretation is causal?
Effect? Effect?
NATURE GIVES US CORRELATIONS
[Figure: scatterplots of y vs. x, y vs. z, and z vs. x, with a diagram relating x, y, and z]
Example #1
[Figure: scatterplots of y vs. x, y vs. z, and z vs. x, with a diagram relating x, y, and z]
Example #2
[Figure: scatterplots of y vs. x, y vs. z, and z vs. x, with a diagram relating x, y, and z]
Example #3
[Figure: scatterplots of y vs. x and x vs. y, with two diagrams relating x and y]
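One way a correlation between x and y can arise without x causing y is a common cause z. The sketch below simulates that case. It is one assumed data-generating process, not a reconstruction of the plotted examples, and the coefficients and noise levels are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Assumed structure: z causes both x and y; x has NO effect on y.
z = rng.normal(100, 30, size=n)
x = 1.0 * z + rng.normal(0, 20, size=n)
y = 1.5 * z + rng.normal(0, 20, size=n)


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) of y on the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]


corr_xy = np.corrcoef(x, y)[0, 1]
naive_slope = ols(y, x)[1]            # y regressed on x alone
controlled_slope = ols(y, x, z)[1]    # y regressed on x, holding z fixed

print(f"correlation(x, y)           = {corr_xy:.2f}")
print(f"slope on x, ignoring z      = {naive_slope:.2f}")
print(f"slope on x, controlling z   = {controlled_slope:.2f}")
```

In this setup x and y are strongly correlated, yet the slope on x falls to roughly zero once z is included.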
Examples of Poor Causal Inference
1. Ice cream consumption causes polio
2. Investments in public buildings create economic growth
3. Early retirement and health decline
4. Hormone replacement therapy and heart disease:
In a widely-studied example, numerous epidemiological studies showed that women who were taking combined hormone replacement therapy (HRT) also had a lower-than-average incidence of coronary heart disease (CHD), leading doctors to propose that HRT was protective against CHD. But randomized controlled trials showed that HRT caused a small but statistically significant increase in risk of CHD. Re-analysis of the data from the epidemiological studies showed that women undertaking HRT were more likely to be from higher socio-economic groups (ABC1), with better than average diet and exercise regimes. The use of HRT and decreased incidence of coronary heart disease were coincident effects of a common cause (i.e. the benefits associated with a higher socioeconomic status), rather than cause and effect as had been supposed. (Wikipedia)
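The pattern described in the HRT example (an observational comparison pointing one way, a randomized trial pointing the other) can be reproduced in a small simulation. Every number below is hypothetical and chosen only to illustrate the mechanism: a 10% baseline CHD risk, a 1-point increase from HRT, and a protective effect of socioeconomic status (SES) that also makes HRT use more likely. None of these values come from the actual studies.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
ses = rng.normal(0, 1, size=n)   # standardized socioeconomic status


def chd_risk(hrt, ses):
    """Hypothetical CHD probability: HRT raises risk, higher SES lowers it."""
    return np.clip(0.10 + 0.01 * hrt - 0.03 * ses, 0.0, 1.0)


# Observational world: higher-SES women are more likely to choose HRT.
p_take = 1.0 / (1.0 + np.exp(-1.5 * ses))
hrt_obs = rng.random(n) < p_take
chd_obs = rng.random(n) < chd_risk(hrt_obs, ses)
obs_diff = chd_obs[hrt_obs].mean() - chd_obs[~hrt_obs].mean()

# Randomized trial: HRT assigned by coin flip, independent of SES.
hrt_rct = rng.random(n) < 0.5
chd_rct = rng.random(n) < chd_risk(hrt_rct, ses)
rct_diff = chd_rct[hrt_rct].mean() - chd_rct[~hrt_rct].mean()

print(f"observational difference in CHD rate (HRT - no HRT): {obs_diff:+.4f}")
print(f"randomized difference in CHD rate    (HRT - no HRT): {rct_diff:+.4f}")
```

Under these assumptions the observational difference is negative (HRT looks protective) while the randomized difference is positive, because randomization breaks the link between SES and treatment.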
Examples of Complex Causal Inference
MODERN PROGRAM EVAL
To Experiment or Not Experiment
http://www.youtube.com/watch?v=exBEFCiWyW0
CASE STUDY – EDUCATION REFORM
Classroom Size and Performance
http://www.publicschoolreview.com/articles/19
State Laws Limiting Class Size
Notwithstanding the ongoing debate over the pros and cons of reducing class sizes, a number of states have embraced the policy of class size reduction. States have approached class size reduction in a variety of ways. Some have started with pilot programs rather than state-wide mandates. Some states have specified optimum class sizes while other states have enacted mandatory maximums. Some states have limited class size reduction initiatives to certain grades or certain subjects. Here are three examples of the diversity of state law provisions respecting class size reduction.
California – The state of California became a leader in promoting class size reduction in 1996, when it commenced a large-scale class size reduction program with the goal of reducing class size in all kindergarten through third grade classes from 30 to 20 students or less. The cost of the program was $1 billion annually.
Florida – Florida residents in 2002 voted to amend the Florida Constitution to set the maximum number of students in a classroom. The maximum number varies according to the grade level. For prekindergarten through third grade, fourth grade through eighth grade, and ninth grade through 12th grade, the constitutional maximums are 18, 22, and 25 students, respectively. Schools that are not already in compliance with the maximum levels are required to make progress in reducing class size so that the maximum is not exceeded by 2010. The Florida legislature enacted corresponding legislation, with additional rules and guidelines for schools to achieve the goals by 2010.
Georgia – Maximum class sizes depend on the grade level and the class subject. For kindergarten, the maximum class size is 18 or, if there is a full-time paraprofessional in the classroom, 20. Funding is available to reduce kindergarten class sizes to 15 students. For grades one through three, the maximum is 21 students; funding is available to reduce the class size to 17 students. For grades four through eight, 28 is the maximum for English, math, science, and social studies. For fine arts and foreign languages in grades K through eight, however, the maximum is 33 students. Maximums of 32 and 35 students are set for grades nine through 12, depending on the subject matter of the course. Local school boards that do not comply with the requirements are subject to losing funding for the entire class or program that is out of compliance.
Class Size Case Study - The Theory:
[Diagram, Scenario 1: Class Size and Test Scores]
[Diagram, Scenario 2: Class Size, Test Scores, and SES (?)]
[Figure: scatterplot. x-axis: Class Size; y-axis: Test Score]
Example: Classroom Size
[Figure: scatterplot with ∆X and ∆Y marked. x-axis: Class Size; y-axis: Test Score Residuals]
Slope = ∆Y / ∆X
The regression coefficient represents a slope. In policy we think of the slope as an input-output formula: if I decrease class size (input), standardized test scores increase (output).
Note changes in slopes and standard errors when you add variables.
The Naïve Model:
[Figure: scatterplot. x-axis: Class Size; y-axis: Test Score]
TestScore = b0 + b1 · ClassSize + e
With Teacher Skill as a Control:
[Figure: scatterplot. x-axis: Class Size; y-axis: Test Score Residuals]
TestScore = b0 + b1 · ClassSize + b2 · TeachSkill + e
Add SES to the Model:
[Figure: scatterplot. x-axis: Class Size; y-axis: Test Score Residuals]
TestScore = b0 + b1 · ClassSize + b2 · TeachSkill + b3 · SES + e
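The three class-size models can be compared on simulated data to see how the class-size slope and its standard error move as controls are added. This is a sketch under assumed relationships, not the course dataset: teacher skill is taken to be unrelated to class size, SES is taken to be correlated with class size, and the true class-size effect is set to -2.

```python
import numpy as np


def ols_with_se(y, *regressors):
    """OLS coefficients and classical standard errors (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))


rng = np.random.default_rng(4)
n = 1000

# Hypothetical data-generating process (all coefficients are assumptions).
ses = rng.normal(0, 1, size=n)
teach_skill = rng.normal(0, 1, size=n)                    # unrelated to class size
class_size = 25 - 3 * ses + rng.normal(0, 3, size=n)      # higher SES, smaller classes
test_score = (500 - 2 * class_size + 20 * teach_skill
              + 30 * ses + rng.normal(0, 20, size=n))

models = {
    "naive": (class_size,),
    "+ TeachSkill": (class_size, teach_skill),
    "+ TeachSkill + SES": (class_size, teach_skill, ses),
}

results = {}
for name, regs in models.items():
    beta, se = ols_with_se(test_score, *regs)
    results[name] = (beta[1], se[1])     # class-size slope and its SE
    print(f"{name:20s} slope = {beta[1]:6.2f}   SE = {se[1]:.3f}")
```

In this setup, adding a control that is unrelated to class size (teacher skill) mainly shrinks the standard error, while adding a control that is correlated with class size (SES) moves the slope itself.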
Example: Classroom Size
Why are slopes and standard errors changing when we add “control” variables?
[Figure: scatterplot. x-axis: Class Size; y-axis: Test Score Residuals]
How do we interpret results causally?
[Diagram: Class Size, Test Scores, SES, Teacher Skill, and X]
COURSE OUTLINE
The Origins of Modern Program Evaluation
• The “Great Society” introduced unprecedented levels of spending on social services – marks the dawn of the modern welfare state.
• Econometrics also comes of age, creating tools that provide the opportunity for rigorous analysis of social programs.
We need effective programs, not expensive programs
Modern Program Evaluation
Course Objectives:
1. Understanding why regressions are biased
Seven Deadly Sins of Regression:
1. Multicollinearity
2. Omitted variable bias
3. Measurement error
4. Selection / attrition
5. Misspecification
6. Population heterogeneity
7. Simultaneity
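As one illustration from the list above, measurement error in the explanatory variable biases the slope toward zero. The sketch below uses simulated data with an assumed true slope of 2 and measurement noise as large as the variation in the true variable; these values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000

# Hypothetical setup: the true slope of y on x is 2.0.
x_true = rng.normal(0, 1, size=n)
y = 2.0 * x_true + rng.normal(0, 1, size=n)

# We only observe x with random measurement error (same variance as x_true).
x_noisy = x_true + rng.normal(0, 1, size=n)


def slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_true = slope(x_true, y)
slope_noisy = slope(x_noisy, y)

print(f"slope using true x  = {slope_true:.2f}")
print(f"slope using noisy x = {slope_noisy:.2f}")
```

With equal variances for the true variable and the error, the expected slope is cut roughly in half, since the attenuation factor is var(x) / (var(x) + var(error)).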
2. Understand tools of program evaluation
– Fixed effect models
– Instrumental variables
– Matching
– Regression discontinuity
– Time series
– Survival analysis
3. How to talk to economists (and other bullies)
4. Correctly apply and critique evaluation designs
Experiments
• Pretest-posttest control group
• Posttest only control group
Quasi-Experiments
• Pretest-posttest comparison group
• Posttest only comparison group
• Interrupted time series with comparison group
Reflexive Design
• Pretest-posttest design
• Simple time series
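Three of the designs listed above can be compared on one simulated dataset: the reflexive pretest-posttest design, the posttest only comparison group, and the pretest-posttest comparison group (estimated here as a difference-in-differences). The data are hypothetical. The groups are assumed to differ by 5 points at baseline, both groups share a trend of 3 points, and the true program effect is 4 points.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000

# Hypothetical non-randomized study: treated units start out 5 points higher.
treated = rng.random(n) < 0.5
baseline = 50 + 5 * treated + rng.normal(0, 10, size=n)
trend, effect = 3.0, 4.0     # assumed secular trend and true program effect

pre = baseline + rng.normal(0, 2, size=n)
post = baseline + trend + effect * treated + rng.normal(0, 2, size=n)

# Reflexive design (pretest-posttest, treated group only): picks up the trend.
reflexive = post[treated].mean() - pre[treated].mean()

# Posttest only comparison group: picks up the baseline difference.
posttest_only = post[treated].mean() - post[~treated].mean()

# Pretest-posttest comparison group: difference in changes between groups.
did = ((post[treated] - pre[treated]).mean()
       - (post[~treated] - pre[~treated]).mean())

print(f"reflexive estimate        = {reflexive:.2f}  (trend + effect)")
print(f"posttest-only estimate    = {posttest_only:.2f}  (baseline gap + effect)")
print(f"pre/post comparison group = {did:.2f}  (effect)")
```

Only the last estimate recovers the assumed effect here, and only because the simulation gives both groups the same trend.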
Course Structure
First half: Understanding bias
– No text, course notes online
– Weekly homework
Second half: Evaluation design
– Text is required
– Campbell Scores
policy-research.net/programevaluation
Evaluating Internal Validity: The Campbell Scores
A Competing Hypothesis Framework
Omitted Variable Bias
• Selection
• Nonrandom Attrition
Trends in the Data
• Maturation
• Secular Trends
Study Calibration
• Testing
• Regression to the Mean
• Seasonality
• Study Time-Frame
Contamination Factors
• Intervening Events
• Measurement Error
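Regression to the mean, one of the competing hypotheses listed above, can be shown with a simulation in which no program exists at all. The test-score scale, the noise levels, and the rule of selecting the bottom 10% on the pretest are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Hypothetical students: stable ability plus independent test-day noise.
ability = rng.normal(70, 10, size=n)
pretest = ability + rng.normal(0, 10, size=n)
posttest = ability + rng.normal(0, 10, size=n)   # no intervention happens

# "Enroll" only the lowest-scoring 10% on the pretest.
selected = pretest < np.percentile(pretest, 10)

gain_selected = posttest[selected].mean() - pretest[selected].mean()
gain_everyone = posttest.mean() - pretest.mean()

print(f"average gain, selected low scorers = {gain_selected:.1f}")
print(f"average gain, all students         = {gain_everyone:.1f}")
```

The selected group shows a large apparent gain even though nothing was done, which is why a pretest-posttest comparison on a group chosen for extreme scores needs a comparison group.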
Homework Policy
• Homework problems each week 1st half of semester
– Graded pass/fail
– Submit via D2L please
– Work in groups is strongly encouraged
• Campbell Scores are due each class for the second half of the semester
• Midterm Exam (30%)
– Confidence intervals, Standard error of regression, Bias
• Final Exam (20%)
– Covers evaluation design and Internal validity
No late homework accepted! 50% of final grade.
See syllabus for policy on turning in by email.
[Figure: histogram, Midterm Spring 2012. x-axis: Grade; y-axis: Student Count]
1st Qu. 75.5, Median 89, Mean 83.47, 3rd Qu. 96
[Figure: histogram, Midterm Fall 2011. x-axis: Grade; y-axis: Student Count]
1st Qu. 82, Median 86, Mean 87.21, 3rd Qu. 97.5
[Figure: bar chart, Average Time Spent on Class by Students: Spring 2012. x-axis: Hours Per Week (0-2, 3-4, 4-8, 9-14, 15+); y-axis: 0% to 50%]
[Figure: bar chart, Average Time Spent on Class by Students: Fall 2011. x-axis: Hours Per Week (0-2, 3-4, 4-8, 9-14, 15+); y-axis: 0% to 50%]