Regression Discontinuity Design Case Study: National Evaluation of Early Reading First
Peter Z. Schochet
Decision Information Resources, Inc.
Overview
ERF program
Evaluation research questions
Regression discontinuity design
Conclusions
ERF Program
Part of the No Child Left Behind Act
3-year ERF grants provided to a collaboration of preschools
– Funding focus is on low-income children
Goal: Enhance the language, cognitive, and early reading skills of preschool children
ERF Funds Are Intended to:
Provide professional development for teachers
Create high-quality and print-rich environments
Promote the use of scientifically proven literacy methods and instructional materials
Identify preschool children at risk for reading failure
Study Research Questions
What are the impacts of ERF on:
– Children’s language and literacy?
– Quality of language and literacy instruction, practice, and materials?
KEY DESIGN FEATURES
Study Focus Is on FY 2003 ERF Grant Applicants
700 sites submitted pre-applications
126 invited to submit full applications
Random Assignment Was Not Possible
ED required that funds be awarded based on rankings of applications
– Applications were “scored”
  – 30 sites were funded, with scores ≥ 74
  – 96 sites were unfunded, with scores < 74
– Scoring criteria were set a priori
– Favorable conditions for an RD design
The Ideal ERF RD Design: Compare “73s” to “75s”: Almost Experimental
[Figure: reading measure (y-axis) against application score (x-axis), with the cutoff marked and the mean for the “75s” compared with the mean for the “73s”.]
But There Are Not Enough 73s and 75s: Need to Include Other Sites
Score       Number of Sites   Status
42 to 53    22                Unfunded
54 to 63    21                Unfunded
64 to 73    21                Unfunded
74 to 83    18                Funded
84 to 95    12                Funded
Cutoff value = 74
Sampling
All funded sites agreed to participate
Sorted unfunded sites by their scores
– Sites with largest scores contacted first
– 64 sites contacted
– 37 agreed to participate
Obtained lists of classrooms in sites
– Sampled 3 classrooms per site
– Selected 9 four-year-olds per classroom
94 percent parental consent rate
The RD Method Visually
[Figure: Estimated regression lines. Site-level outcome (y-axis) against application score (x-axis), with lines for unfunded and funded sites and the impact marked at the cutoff.]
Key Identifying Assumption
There must be a continuous relationship between the outcome measure and the application score
Differences-in-Means Estimates Could Be Biased
[Figure: outcome measure (y-axis) against application score (x-axis) for unfunded and funded sites; the gap between the simple means is labeled as a spurious impact.]
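A minimal sketch of why simple means mislead, using made-up numbers rather than ERF data. The hypothetical outcome rises linearly with the application score and funding has no effect at all, yet the funded-minus-unfunded difference in means is large. Comparing only the “73s” and “75s” shrinks the gap to the small amount explained by the slope.

```python
from statistics import mean

CUTOFF = 74  # funding cutoff from the slides

# Assumption: one hypothetical site at each integer score from 42 to 95.
scores = list(range(42, 96))

# Assumption: the outcome depends only on the score (slope 0.5),
# so the true impact of funding is exactly zero.
outcome = [20 + 0.5 * s for s in scores]

funded = [y for s, y in zip(scores, outcome) if s >= CUTOFF]
unfunded = [y for s, y in zip(scores, outcome) if s < CUTOFF]

# Simple difference in means over all sites: picks up the score trend.
naive_impact = mean(funded) - mean(unfunded)

# "73s" versus "75s": sites just below and just above the cutoff.
y_at = dict(zip(scores, outcome))
local_impact = y_at[75] - y_at[73]

print("all sites:", naive_impact)
print("73s vs 75s:", local_impact)
```

With real data there are too few sites right at the cutoff for the second comparison to be precise, which is why the design brings in sites farther from 74 and models the score explicitly.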
RD Designs Require Larger Samples Than Experimental Designs
Controlling for the application score reduces power
Design effects are 3 to 4
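One way to see where a design effect of this size can come from, offered as an illustration and not as the study's own calculation. In a linear model, holding the residual variance fixed, controlling for the score inflates the variance of the impact estimate by 1 / (1 - r^2), where r is the correlation between funding status T and the score. The scores below are hypothetical (one site at each integer score from 42 to 95), not the ERF applicant data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


CUTOFF = 74  # funding cutoff from the slides

# Assumption: hypothetical scores spread evenly over the observed range.
scores = list(range(42, 96))
treated = [1 if s >= CUTOFF else 0 for s in scores]

r = pearson(scores, treated)

# Variance inflation from including the score alongside T.
design_effect = 1 / (1 - r ** 2)

print("correlation of T and score:", round(r, 3))
print("design effect:", round(design_effect, 2))
```

For these assumed scores the design effect falls between 3 and 4. Read as a sample-size multiplier, that means roughly three to four times as many sites as a randomized design would need for the same precision.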
The Correct Functional Form Specification Is Crucial for Obtaining Unbiased Estimates
[Figure: Assume linearity when the true relationship is nonlinear. Outcome measure (y-axis) against application score (x-axis); the linear fit produces a spurious impact at the cutoff.]
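The misspecification problem can be sketched with made-up data. Here the hypothetical outcome is a smooth quadratic in the score with no funding effect. A model that assumes linearity reports a sizable "impact" at the cutoff, and adding a squared term removes it. The small least-squares helper `ols` is written out only to keep the example self-contained; it is not part of the study.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


CUTOFF = 74
scores = list(range(42, 96))          # assumption: hypothetical sites
c = [s - CUTOFF for s in scores]      # score centered at the cutoff
T = [1 if ci >= 0 else 0 for ci in c]

# Assumption: smooth quadratic relationship, true impact of funding = 0.
y = [40 + 0.8 * ci + 0.015 * ci ** 2 for ci in c]

# Misspecified model: linear in the score.
b_lin = ols([[1.0, t, ci] for t, ci in zip(T, c)], y)
# Correctly specified model: adds the squared score.
b_quad = ols([[1.0, t, ci, ci ** 2] for t, ci in zip(T, c)], y)

b1_linear, b1_quadratic = b_lin[1], b_quad[1]
print("impact, linear f(SCORE):   ", round(b1_linear, 3))
print("impact, quadratic f(SCORE):", round(b1_quadratic, 3))
```

The linear estimate is far from zero even though no site was affected by funding, while the quadratic estimate is zero up to rounding error.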
There Also Has to Be a Clear Functional Form Relationship
[Figure: % of classrooms in site that engage in an activity (y-axis, 0 to 100) against application score (x-axis).]
Basic Regression (HLM) Model
Y = β0 + β1*T + β2*f(SCORE-74) + u
– Y = Outcome measure (child or teacher level)
– T = 1 if funded site, 0 if unfunded
– f(SCORE) = Function of application score
– β1 = Impact estimate
– u = Error term accounting for site and classroom-level clustering
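A minimal sketch of estimating β1 from this model, under several stated assumptions: the data are made up and noise-free, f is taken to be linear, and the fit is ordinary least squares, so the site- and classroom-level clustering in u (the HLM part) is ignored. The helper `ols` is written out only so the example runs on its own; a statistics package would normally do this step.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


CUTOFF = 74
scores = list(range(42, 96))                       # hypothetical sites
T = [1 if s >= CUTOFF else 0 for s in scores]      # 1 if funded

# Assumption: outcome built with a known impact of 5 and slope 0.4.
Y = [30 + 5 * t + 0.4 * (s - CUTOFF) for s, t in zip(scores, T)]

# Design matrix columns: intercept, T, f(SCORE-74) with f linear.
X = [[1.0, t, s - CUTOFF] for s, t in zip(scores, T)]
b0, b1, b2 = ols(X, Y)

print("impact estimate (b1):", round(b1, 6))
print("score slope (b2):    ", round(b2, 6))
```

Centering the score at 74 makes the intercept the predicted outcome for an unfunded site at the cutoff, so β1 reads directly as the jump at the cutoff.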
Selecting f(SCORE)
Graph Y on SCORE
Add SCORE², SCORE³, and T*SCORE interaction terms and test for significance
Use nonparametric methods
Specification tests
– Impacts = 0 using baseline data
– Impacts = 0 at “artificial” cutoff values
– Adding covariates should not change impacts
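The “artificial” cutoff test can be sketched as follows, again with made-up, noise-free data. Only unfunded sites are used, so no site on either side of a fake cutoff received funding, and a well-specified model should estimate an impact of zero there. The cutoff values 55, 60, and 65 and the helper `ols` are illustrative choices, not taken from the study.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


TRUE_CUTOFF = 74

# Assumption: hypothetical unfunded sites only (scores below 74).
scores = list(range(42, TRUE_CUTOFF))
# Assumption: outcome is linear in the score; none of these sites is funded.
y = [30 + 0.4 * (s - TRUE_CUTOFF) for s in scores]

placebo_impacts = {}
for fake_cutoff in (55, 60, 65):  # illustrative artificial cutoffs
    fake_T = [1 if s >= fake_cutoff else 0 for s in scores]
    X = [[1.0, t, s - fake_cutoff] for s, t in zip(scores, fake_T)]
    placebo_impacts[fake_cutoff] = ols(X, y)[1]

for cutoff, impact in placebo_impacts.items():
    print(f"artificial cutoff {cutoff}: impact = {impact:.6f}")
```

A clearly nonzero estimate at a fake cutoff would point to a misspecified f(SCORE) rather than to any real effect.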
Interpretation of Impact Estimates
In the “73-75” model, results pertain only to sites around the cutoff value
Results “generalize” more broadly using the parametric approach
– Does this stretch the results too far?
But not using the nonparametric approach
Conclusions
RD designs can produce rigorous impact estimates under the right conditions:
– Need exogenous “scores”
– Scores and outcomes must have a smooth relationship that can be credibly modeled
But there are limitations to the RD approach:
– Need larger samples than an experimental design
– Generalizability
– Nonresponse a problem in unfunded sites
EXTRA SLIDES
Data: Fall 2004 and Spring 2005 (Spring Response Rates)
Child assessments (97%)
Teacher behavioral ratings (96%)
Teacher/classroom observations (79%)
Parent surveys (69%)
Teacher surveys (91%)
Center director interviews (88%)
Child Assessment Instruments
Instrument Name | Primary Dimension(s) Tapped
Preschool Comprehensive Test of Phonological and Print Processing (Pre-CTOPP) | Print Awareness; Phonological Awareness
Expressive One-Word Picture Vocabulary Test (EOWPVT) | Vocabulary Skills
Preschool Language Scale (PLS-4) | Auditory Comprehension
Social Competence & Behavior Evaluation (30-item) - Teacher Rating | Social Competence; Anger-aggression; Anxiety-withdrawal
Observations and Surveys
Instrument Name | Primary Dimension(s) Tapped
Observations
  Teacher Behavior Rating Scale | Literacy Environment
  ECERS-R (11 subscales) | Preschool quality with emphasis on use of language and communication
Surveys
  Teachers & Center Directors | Demographics; Background; Classroom characteristics; Professional development; Funding sources
  Parents | Demographics; Child preschool experience; Literacy resources available; Weekly non-school literacy activities
Classroom Attendance Data | Days attended; Proportion of available days attended