Regression Discontinuity Design Case Study: National Evaluation of Early Reading First

Peter Z. Schochet

Decision Information Resources, Inc.

Overview

ERF program

Evaluation research questions

Regression discontinuity design

Conclusions

ERF Program

Part of the No Child Left Behind Act

3-year ERF grants provided to a collaboration of preschools

– Funding focus is on low-income children

Goal: Enhance the language, cognitive, and early reading skills of preschool children

ERF Funds Are Intended to:

Provide professional development for teachers

Create high-quality and print-rich environments

Promote the use of scientifically proven literacy methods and instructional materials

Identify preschool children at risk for reading failure

Study Research Questions

What are the impacts of ERF on:

– Children’s language and literacy?

– Quality of language and literacy instruction, practice, and materials?

KEY DESIGN FEATURES

Study Focus Is on FY 2003 ERF Grant Applicants

700 sites submitted pre-applications

126 invited to submit full applications

Random Assignment Was Not Possible

ED required that funds be awarded based on rankings of applications

– Applications were “scored”

  – 30 sites were funded with scores ≥ 74

  – 96 unfunded sites with scores < 74

– Scoring criteria were set a priori

– Favorable conditions for an RD design

The Ideal ERF RD Design: Compare “73s” to “75s”: Almost Experimental

[Figure: reading measure (y-axis) plotted against application score (x-axis), with the cutoff marked and the mean for the 73s compared with the mean for the 75s.]

But There Are Not Enough 73s and 75s: Need to Include Other Sites

Score       Number of Sites   Status
42 to 53    22                Unfunded
54 to 63    21                Unfunded
64 to 73    21                Unfunded
(Cutoff value = 74)
74 to 83    18                Funded
84 to 95    12                Funded

Sampling

All funded sites agreed to participate

Sorted unfunded sites by their scores

– Sites with largest scores contacted first

– 64 sites contacted

– 37 agreed to participate

Obtained lists of classrooms in sites

– Sampled 3 classrooms per site

– Selected 9 four-year-olds per classroom

94 percent parental consent rate

The RD Method Visually

[Figure: Estimated Regression Lines. Site-level outcome (y-axis) plotted against application score (x-axis), with separate fitted lines for unfunded and funded sites; the impact is marked at the cutoff.]

Key Identifying Assumption

There must be a continuous relationship between the outcome measure and the application score

Differences-in-Means Estimates Could Be Biased

[Figure: outcome measure (y-axis) plotted against application score (x-axis) for unfunded and funded sites. The gap between the simple means of the two groups is labeled a spurious impact.]
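The bias described on this slide can be reproduced with a small simulation. The sketch below is illustrative only: the outcome model, slope, noise level, sample size, and uniform score distribution are all assumptions, not values from the ERF evaluation. It sets the true impact to zero, then compares the simple difference in means with an estimate that controls for the application score.

```python
import numpy as np

CUTOFF = 74  # funding cutoff from the slides

rng = np.random.default_rng(0)
n = 2000                                  # hypothetical sample size
score = rng.uniform(42, 95, size=n)       # assumed score distribution
funded = (score >= CUTOFF).astype(float)

# Assumed outcome model: rises smoothly with the score, true impact = 0.
outcome = 20 + 0.5 * (score - 42) + rng.normal(0, 3, size=n)

# Simple difference in means between funded and unfunded observations.
diff_in_means = outcome[funded == 1].mean() - outcome[funded == 0].mean()

# Regression that controls for the centered score (linear f).
X = np.column_stack([np.ones(n), funded, score - CUTOFF])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
rd_estimate = beta[1]

print(f"difference in means:     {diff_in_means:.2f}")
print(f"score-adjusted estimate: {rd_estimate:.2f}")
```

Because funded sites have higher scores and the outcome rises with the score, the difference in means is large even though nothing changes at the cutoff; the score-adjusted estimate stays close to zero.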

RD Designs Require Larger Samples Than Experimental Designs

Controlling for the application score reduces power

Design effects are 3 to 4
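One way to see where a design effect of this size can come from is the correlation between funding status and the score. Under a linear f, and holding the residual variance and the funded share fixed, the variance of the impact estimate is inflated by 1 / (1 - r^2), where r is the correlation between T and SCORE (the usual variance inflation factor). The sketch below assumes, purely for illustration, that scores are spread evenly between 42 and 95.

```python
import numpy as np

CUTOFF = 74
score = np.linspace(42, 95, 5301)          # assumed: evenly spread scores
funded = (score >= CUTOFF).astype(float)

# Correlation between the funding indicator and the score.
r = np.corrcoef(funded, score)[0, 1]

# Variance inflation relative to a design where T is unrelated to the score.
design_effect = 1.0 / (1.0 - r**2)
print(f"corr(T, SCORE) = {r:.3f}, design effect = {design_effect:.2f}")
```

Under these assumptions the result falls inside the 3 to 4 range quoted on the slide; a different score distribution or cutoff position would give a different value.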

The Correct Functional Form Specification Is Crucial for Obtaining Unbiased Estimates

[Figure: outcome measure (y-axis) plotted against application score (x-axis). Assuming linearity when the true relationship is nonlinear produces a spurious impact.]
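The misspecification problem can be illustrated with a noise-free example. In the sketch below the true outcome is an assumed smooth cubic function of the score with no jump at the cutoff; the cubic coefficient and the score grid are hypothetical. Fitting a linear f produces a nonzero "impact", while a specification that contains the correct terms does not.

```python
import numpy as np

CUTOFF = 74
score = np.linspace(42, 95, 531)           # assumed score grid
x = score - CUTOFF                         # score centered at the cutoff
funded = (x >= 0).astype(float)

# Assumed truth: smooth cubic in the score with NO jump at the cutoff.
outcome = 40 + 0.001 * x**3

def impact_estimate(score_terms):
    """OLS coefficient on the funded indicator, given the score terms used for f."""
    X = np.column_stack([np.ones_like(x), funded, *score_terms])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta[1]

linear = impact_estimate([x])
cubic = impact_estimate([x, x**2, x**3])
print(f"impact estimate with linear f: {linear:.2f}")
print(f"impact estimate with cubic f:  {cubic:.2f}")
```

The example leaves out noise so that the only source of error is the functional form; with real data the same comparison would also involve sampling variability.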

There Also Has to Be a Clear Functional Form Relationship

[Figure: % of classrooms in a site that engage in an activity (y-axis, 0 to 100) plotted against application score (x-axis, 40 to 100).]

Basic Regression (HLM) Model

$$Y = \beta_0 + \beta_1 T + \beta_2 f(\mathrm{SCORE} - 74) + u$$

– $Y$ = Outcome measure (child or teacher level)

– $T$ = 1 if funded site, 0 if unfunded

– $f(\cdot)$ = Function of the application score (centered at the cutoff value, 74)

– $\beta_1$ = Impact estimate

– $u$ = Error term accounting for site- and classroom-level clustering
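A minimal sketch of fitting this model on simulated data follows. It is not the evaluation's HLM: it swaps in ordinary least squares with a cluster-robust (sandwich) standard error, clusters only at the site level, and uses a linear f. The site counts (30 funded, 37 unfunded) and the 3 x 9 children per site come from the slides; the scores within each group, the true impact of 5, the slope, and the variance components are assumptions.

```python
import numpy as np

CUTOFF = 74
rng = np.random.default_rng(1)

# Site counts from the slides; the scores within each group are assumed.
site_score = np.concatenate([
    rng.integers(74, 96, size=30),   # 30 funded sites (scores 74 to 95)
    rng.integers(42, 74, size=37),   # 37 participating unfunded sites
]).astype(float)
n_sites, kids_per_site = site_score.size, 27   # 3 classrooms x 9 children

site = np.repeat(np.arange(n_sites), kids_per_site)
score = site_score[site]
funded = (score >= CUTOFF).astype(float)

# Assumed data-generating process: impact of 5, linear f,
# a shared site-level shock plus child-level noise.
site_shock = rng.normal(0, 3, size=n_sites)[site]
child_noise = rng.normal(0, 10, size=site.size)
y = 50 + 5 * funded + 0.2 * (score - CUTOFF) + site_shock + child_noise

# OLS fit of Y on an intercept, T, and the centered score.
X = np.column_stack([np.ones_like(y), funded, score - CUTOFF])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Cluster-robust (sandwich) covariance with sites as clusters.
bread = np.linalg.inv(X.T @ X)
meat = np.zeros((3, 3))
for g in range(n_sites):
    s = X[site == g].T @ resid[site == g]
    meat += np.outer(s, s)
se = np.sqrt(np.diag(bread @ meat @ bread))

print(f"impact estimate: {beta[1]:.2f} (cluster-robust SE {se[1]:.2f})")
```

Clustering matters here because the score, and therefore T, varies only across sites, so children within a site do not provide independent information about the impact.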

Selecting f(SCORE)

Graph Y on SCORE

Add SCORE², SCORE³, and T*SCORE interaction terms and test for significance

Use nonparametric methods

Specification tests

– Impacts = 0 using baseline data

– Impacts = 0 at “artificial” cutoff values

– Adding covariates should not change impacts
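The artificial-cutoff check can be sketched as follows. The simulated data, the true impact of 5, and the placebo cutoff of 60 are assumptions for illustration; `jump_at` is a hypothetical helper, not part of the study's code. The placebo estimate uses only observations below the real cutoff so that the genuine discontinuity at 74 cannot leak into it.

```python
import numpy as np

CUTOFF = 74
rng = np.random.default_rng(2)
n = 3000                                   # hypothetical sample size
score = rng.uniform(42, 95, size=n)        # assumed score distribution

# Assumed truth: linear in the score with an impact of 5 at the real cutoff.
y = 30 + 0.4 * (score - CUTOFF) + 5 * (score >= CUTOFF) + rng.normal(0, 2, size=n)

def jump_at(cutoff, keep):
    """Estimated discontinuity at `cutoff`, using only observations in `keep`."""
    s, out = score[keep], y[keep]
    t = (s >= cutoff).astype(float)
    X = np.column_stack([np.ones_like(s), t, s - cutoff])
    beta, *_ = np.linalg.lstsq(X, out, rcond=None)
    return beta[1]

real = jump_at(CUTOFF, np.ones(n, dtype=bool))
placebo = jump_at(60, score < CUTOFF)      # artificial cutoff, unfunded range only

print(f"estimate at real cutoff 74:       {real:.2f}")
print(f"estimate at artificial cutoff 60: {placebo:.2f}")
```

A placebo estimate far from zero would suggest that the chosen f(SCORE) does not track the underlying relationship, which is the same concern the baseline-data and covariate checks address.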

Interpretation of Impact Estimates

In the “73-75” model, results pertain only to sites around the cutoff value

Results “generalize” more broadly using the parametric approach

– Does this stretch the results too far?

But not using the nonparametric approach

Conclusions

RD designs can produce rigorous impact estimates under the right conditions:

– Need exogenous “scores”

– Scores and outcomes must have a smooth relationship that can be credibly modeled

But there are limitations to the RD approach:

– Need larger samples than an experimental design

– Generalizability

– Nonresponse a problem in unfunded sites

EXTRA SLIDES

Data: Fall 2004 and Spring 2005 (Spring Response Rates)

Child assessments (97%)

Teacher behavioral ratings (96%)

Teacher/classroom observations (79%)

Parent surveys (69%)

Teacher surveys (91%)

Center director interviews (88%)

Child Assessment Instruments

Instrument Name: Primary Dimension(s) Tapped

Preschool Comprehensive Test of Phonological and Print Processing (Pre-CTOPP): Print Awareness, Phonological Awareness

Expressive One-Word Picture Vocabulary Test (EOWPVT): Vocabulary Skills

Preschool Language Scale (PLS-4): Auditory Comprehension

Social Competence & Behavior Evaluation (30-item), Teacher Rating: Social Competence, Anger-aggression, Anxiety-withdrawal

Observations and Surveys

Instrument Names: Primary Dimension(s) Tapped

Observations

– Teacher Behavior Rating Scale: Literacy Environment

– ECERS-R (11 subscales): Preschool quality with emphasis on use of language and communication

Surveys

– Teachers & Center Directors: Demographics, Background, Classroom characteristics, Professional development, Funding Sources

– Parents: Demographics, Child preschool experience, Literacy resources available, Weekly non-school literacy activities

Classroom Attendance Data

– Days attended, Proportion of available days attended
