


Effective programmes in reading and mathematics: lessons from the Best Evidence Encyclopaedia¹

Robert E. Slavin*

Johns Hopkins University, Baltimore, MD, USA, and University of York, York, UK

(Received 30 July 2012; final version received 19 March 2013)

*Email: [email protected]

School Effectiveness and School Improvement, 2013, Vol. 24, No. 4, 383–391, http://dx.doi.org/10.1080/09243453.2013.797913
© 2013 Taylor & Francis

This article summarises findings from systematic reviews of research on primary and secondary mathematics, primary and secondary reading, and programmes for struggling readers. All reviews used a common set of procedures, requiring comparisons with control groups and a duration of at least 12 weeks. Across hundreds of qualifying studies, a clear pattern emerged. Programmes providing extensive professional development in well-structured methods such as cooperative learning and the teaching of metacognitive skills produce much more positive effect sizes than those evaluating either curricular reforms or computer-assisted instruction.

Keywords: primary and secondary mathematics programmes; primary and secondary reading programmes; struggling readers; educational research review

Introduction

Until recent years, educational policy and practice have been minimally affected by the findings of research. Research has played a role in justifying policies or practices when they are advocated for other reasons, and research is often cited in arguments for or against various decisions, but it is rare that educators or policy makers truly consult the evidence to find out how to improve outcomes for pupils. Instead, educational practice is shaped by politics, marketing, fads, or other considerations, and as a result, educational change follows the pendulum swings characteristic of any field driven by fashion rather than evidence. That is, innovations appear with great fanfare, promising transformational impacts on outcomes for children. They are widely adopted for a period of time, and then decline or disappear after they inevitably fail to achieve the miraculous outcomes they promised. Perhaps more often, they disappear simply because educators or policy makers get tired of them and gravitate toward the next (untested) innovation. Consider past innovations such as the open classroom, whole language in reading, new math, and many more. In each case, research on the innovative practice did not begin in earnest until the innovation had already been widely adopted. Because research takes time, the outcomes of innovations often do not appear until the innovation is already on the wane for reasons unrelated to outcomes. The failure to base educational policies and practices on evidence explains the glacial pace of change in student outcomes over time.

Fields such as medicine, agriculture, and technology make constant progress because they pay close attention to evidence, advancing steadily as proven practices are adopted to replace less effective methods. Practitioners and policy makers in those fields have confidence that reliable evidence supports new products or procedures and are therefore eager to adopt those that work (and ignore those that do not). Innovators in universities, private companies, and government itself are confident that if they can create and validate new products and methods, they will have an impact on practice. Yet none of this happens in education, where demonstrated effectiveness is rarely a factor in whether a given innovation becomes influential.




Evidence-based reform in education

Evidence of effectiveness is beginning to make a difference in educational practice and policy, although progress remains slow and unsteady. Government support for rigorous evaluations of educational programmes has greatly expanded in the US, the UK, The Netherlands, and elsewhere, and the findings of these evaluations are beginning to matter. In the US, the Obama Administration, through its Investing in Innovation (i3) competition, has made $950 million in grants over 3 years to help programmes that met a high standard of evidence to disseminate themselves throughout the US and to help newer programmes build up their evidence bases and prepare for dissemination. US education legislation increasingly mentions evidence of effectiveness as a criterion for funding, and the Congress is currently considering regulations that would give additional points in education grant competitions to programmes that meet the evidence standards used in the i3 competition, which include a requirement for at least one randomised or two high-quality matched evaluations with positive outcomes. In the UK, the newly created Education Endowment Foundation (EEF) received £125 million to distribute over 15 years to fund the development, evaluation, and dissemination of proven programmes. It is already funding 55 third-party, mostly randomised evaluations of promising approaches. In The Netherlands, the Top Institute for Education Research (TIER) is carrying out research and development on important issues of practice.

An important tool in the movement toward evidence-based reform in education is the appearance of series of systematic reviews of research. These apply consistent standards of evidence to the research on all programmes in a given area. In the UK, the Evidence for Policy and Practice Information and Co-ordinating (EPPI) Centre has produced many such reviews (www.EPPI.IOE.ac.uk), and in the US, the federal What Works Clearinghouse (www.IES.ED.Gov/NCEE/WWC/) is continuing to review the evidence supporting programmes in several areas. The Best Evidence Encyclopaedia (BEE), supported by the Johns Hopkins University and the University of York (with US and UK editions), has completed reviews of research on studies of primary and secondary reading and mathematics. The BEE reviews have all been published in journals (primary reading: Slavin, Lake, Chambers, Cheung, & Davis, 2009; reading for struggling learners: Slavin, Lake, Davis, & Madden, 2011; secondary reading: Slavin, Cheung, Groff, & Lake, 2008; primary mathematics: Slavin & Lake, 2008; secondary mathematics: Slavin, Lake, & Groff, 2009), as well as being made available in various forms at www.bestevidence.org and www.bestevidence.org.uk.

The BEE reviews all report findings for various categories of interventions. As the reviews have appeared, patterns have emerged across subjects and year levels. The purpose of the present article is to present the main findings of the BEE reviews of research on reading and mathematics in an attempt to draw broader principles of effective practice in these important areas of the curriculum. All of the BEE reviews have used the same standards and procedures, with minor variations for the different subjects. These procedures are described in the following sections.


Review methods

The purpose of each BEE review is to place all types of programmes intended to enhance achievement on a common scale, to provide educators and policy makers with meaningful, unbiased information that they can use to select programmes most likely to make a difference with their pupils. The reviews emphasise practical programmes that are or could be used at scale. They therefore emphasise large studies done over significant time periods that used standard measures, to maximise the usefulness of the reviews to educators. The reviews also seek to identify common characteristics of programmes likely to make a difference in achievement. These syntheses are intended to include all kinds of approaches to teaching and to group them in categories, always including curricula, instructional technology, and instructional process programmes, and often adding combinations of these. Curricula primarily encompass core reading and maths textbooks and curriculum series. Instructional technology refers to programmes that use technology to enhance reading or mathematics achievement. This mostly includes traditional supplementary computer-assisted instruction (CAI) or information and communication technologies (ICT) programmes, in which students are sent to computer labs or to computers in the classroom for additional practice. Instructional process programmes rely primarily on professional development to give teachers effective strategies for teaching. These include programmes focusing on cooperative learning, classroom management, motivation, and metacognitive skills.

The review methods used in all BEE reviews are adaptations of a technique called best-evidence synthesis (Slavin, 1986, 2008). Best-evidence syntheses seek to apply consistent, well-justified standards to identify unbiased, meaningful information from experimental studies, discussing each study in some detail and pooling effect sizes across studies in substantively justified categories. The method is very similar to meta-analysis (Cooper, 1998; Lipsey & Wilson, 2001), adding an emphasis on narrative description of each study's contribution. It is similar to the methods used by the What Works Clearinghouse, with a few important exceptions noted in the following sections. See Slavin (2008) for an extended discussion of and rationale for the procedures used in all of these reviews.

Literature search procedures

A broad literature search was carried out in an attempt to locate every study that could possibly meet the inclusion requirements. Electronic searches were made of educational databases (ERIC, PsycINFO, Dissertation Abstracts) using various combinations of keywords (e.g., “elementary or primary students”, “reading”, “mathematics”, “achievement”) and the years 1970–2009. Results were then narrowed by subject area (e.g., “reading/mathematics intervention”, “educational software”, “academic achievement”, “instructional strategies”). In addition to looking for studies by key terms and subject area, we conducted searches by programme name. Web-based repositories and education publishers’ websites were also examined. We attempted to contact producers and developers of programmes to check whether they knew of studies that we had missed. Citations were obtained from other reviews of programmes, including those of the What Works Clearinghouse. We also conducted searches of recent tables of contents of key journals. Studies were sought and accepted from every type of published or unpublished source, but we systematically scanned the tables of contents of the following journals from 2000 to 2009: American Educational Research Journal, Reading Research Quarterly, Journal for Research in Mathematics Education, Journal of Educational Research, Journal of Educational Psychology, Reading and Writing Quarterly, British Educational Research Journal, and Learning and Instruction. Citations of studies appearing in the studies found in the first wave were followed up.



Effect sizes

In general, effect sizes were computed as the difference between experimental and control individual pupil posttests after adjustment for pretests and other covariates, divided by the unadjusted posttest control group standard deviation. If the control group SD was not available, a pooled SD was used. Procedures described by Lipsey and Wilson (2001) and Sedlmeier and Gigerenzer (1989) were used to estimate effect sizes when unadjusted standard deviations were not available, as when the only standard deviation presented was already adjusted for covariates or when only gain score SDs were available. If pretest and posttest means and SDs were presented but adjusted means were not, effect sizes for pretests were subtracted from effect sizes for posttests.
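To make these rules concrete, here is a minimal Python sketch of the two main cases, assuming the relevant summary statistics have already been extracted from a study report; the function and variable names are illustrative and are not taken from the BEE reviews themselves.

```python
def es_from_adjusted_means(adj_post_exp, adj_post_ctrl, sd_post_ctrl):
    # Covariate-adjusted posttest difference divided by the UNADJUSTED
    # control-group posttest standard deviation.
    return (adj_post_exp - adj_post_ctrl) / sd_post_ctrl


def es_from_raw_means(post_exp, post_ctrl, sd_post_ctrl,
                      pre_exp, pre_ctrl, sd_pre_ctrl):
    # When only unadjusted pretest and posttest means are reported:
    # posttest effect size minus pretest effect size.
    es_post = (post_exp - post_ctrl) / sd_post_ctrl
    es_pre = (pre_exp - pre_ctrl) / sd_pre_ctrl
    return es_post - es_pre
```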

Effect sizes were pooled across studies for each programme and for various categories of programmes. This pooling used means weighted by the final sample sizes. The reason for using weighted means is to maximise the importance of large studies, as the previous reviews and many others have found that small studies tend to overstate effect sizes (see Rothstein, Sutton, & Borenstein, 2005; Slavin & Smith, 2009).
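As a simple illustration of this weighting (a sketch only, not the reviews' own code), a sample-size-weighted mean effect size can be computed as follows.

```python
def pooled_effect_size(studies):
    # studies: iterable of (effect_size, final_sample_size) pairs.
    total_n = sum(n for _, n in studies)
    return sum(es * n for es, n in studies) / total_n


# A small study with a large effect barely moves the pooled estimate once a
# much larger study with a modest effect is included:
# pooled_effect_size([(0.44, 40), (0.09, 2500)]) is roughly +0.10.
```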

Criteria for inclusion

Criteria for inclusion of studies in this review were as follows (a rough sketch of the quantitative parts of these criteria as screening checks appears after the list):

(1) The studies evaluated classroom programmes for reading or maths. Studies of variables, such as use of ability grouping, block scheduling, or single-sex classrooms, were not reviewed.

(2) The studies compared children taught in classes using a given programme to those in control classes using an alternative programme or standard methods.

(3) Studies could have taken place in any country, but the report had to be available in English.

(4) Random assignment or matching with appropriate adjustments for any pretest differences (e.g., analyses of covariance) had to be used. Studies without control groups, such as pre-post comparisons and comparisons to “expected” scores, were excluded.

(5) Pretest data had to be provided, unless studies used random assignment of at least 30 units (individuals, classes, or schools) and there were no indications of initial inequality. Studies with pretest differences of more than 50% of a standard deviation were excluded because, even with analyses of covariance, large pretest differences cannot be adequately controlled for, as underlying distributions may be fundamentally different (Shadish, Cook, & Campbell, 2002).

(6) The dependent measures included quantitative measures of achievement, such as standardised reading or mathematics measures. Experimenter-made measures were accepted if they were comprehensive measures of reading or maths that would be fair to the control groups, but measures of objectives inherent to the experimental programme (and unlikely to be emphasised in control groups) were excluded. Studies using measures inherent to treatments, usually made by the experimenter or programme developer, have been found to be associated with much larger effect sizes than are measures that are independent of treatments (Slavin & Madden, 2011), and for this reason, effect sizes from treatment-inherent measures were excluded. Measures individually administered by the children's own teachers were also excluded, on the basis that such assessments are susceptible to bias.



(7) A minimum study duration of 12 weeks was required. This requirement is intended to focus the review on practical programmes intended for use for the whole year, rather than brief investigations. Study duration is measured from the beginning of the treatments to posttest, so, for example, an intensive 8-week intervention in the autumn of Year 1 would be considered a year-long study if the posttest were given in May.

(8) Studies had to have at least 15 students and two teachers in each treatment group.
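As signalled above, the quantitative parts of these criteria can be expressed as simple screening checks. The sketch below is illustrative only: the class and field names are assumptions, and the substantive judgements (for example, whether a measure is fair to the control group) are reduced to flags rather than modelled.

```python
from dataclasses import dataclass


@dataclass
class Study:
    has_control_group: bool                  # criteria (2) and (4)
    pretest_diff_sd: float                   # |pretest difference| in SD units, criterion (5)
    duration_weeks: int                      # treatment start to posttest, criterion (7)
    students_per_group: int                  # smallest treatment group, criterion (8)
    teachers_per_group: int                  # smallest treatment group, criterion (8)
    only_treatment_inherent_measures: bool   # criterion (6)


def qualifies(study: Study) -> bool:
    return (study.has_control_group
            and study.pretest_diff_sd <= 0.5
            and study.duration_weeks >= 12
            and study.students_per_group >= 15
            and study.teachers_per_group >= 2
            and not study.only_treatment_inherent_measures)
```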

Limitations

It is important to note several limitations of the BEE reviews. First, the reviews focused on experimental or quasi-experimental studies using quantitative measures of achievement. There is much to be learned from qualitative and correlational research that can add depth and insight to understanding the effects of programmes, but this research is not reviewed here. Second, the reviews focus on replicable programmes used in realistic school settings over periods of at least 12 weeks. This emphasis is consistent with the purpose of providing educators with useful information about the strength of evidence supporting various practical programmes, but it does not attend to shorter, more theoretically driven studies that may also provide useful information, especially to researchers. Finally, the reviews focus on traditional measures of performance, primarily standardised tests. These are useful in assessing the practical outcomes of various programmes and are fair to control as well as experimental teachers, who are equally likely to be trying to help their students do well on these assessments. The reviews do not report on experimenter-made measures of content taught in the experimental group but not the control group, even though results on such measures may also be of importance to researchers or educators (see Slavin & Madden, 2011).

Categories of research design

Four categories of research designs were included in the BEE reviews. Randomised experiments (R) were those in which pupils, classes, or schools were randomly assigned to treatments and data analyses were at the level of random assignment. When schools or classes were randomly assigned but there were too few schools or classes to justify analysis at the level of random assignment, the study was categorised as a randomised quasi-experiment (RQE) (Slavin, 2008). Matched (M) studies were ones in which experimental and control groups were matched on key variables at pretest, before posttests were known. Studies using fully randomised designs (R) are preferable to randomised quasi-experiments (RQE), but all randomised experiments are less subject to bias than matched studies.
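A minimal sketch of the classification described above, assuming the judgement about whether there are enough randomly assigned clusters to analyse at the level of assignment has already been made; the names are illustrative, not drawn from the reviews.

```python
from enum import Enum


class Design(Enum):
    R = "randomised experiment"
    RQE = "randomised quasi-experiment"
    M = "matched"


def classify_design(randomly_assigned: bool,
                    analysed_at_assignment_level: bool) -> Design:
    # Matched studies are any qualifying studies without random assignment.
    if not randomly_assigned:
        return Design.M
    return Design.R if analysed_at_assignment_level else Design.RQE
```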

Results

Table 1 summarises the mean effect sizes from the reviews of research on primary and secondary reading and mathematics. Means were weighted by sample size, so the larger studies have more influence than smaller ones. The table entries were derived from a total of 346 studies that met the BEE inclusion requirements and evaluated programmes for pupils in general (for research on struggling readers, see Slavin et al., 2011).



What is striking about Table 1 is the consistency of effect sizes across subjects and year levels. These trends are discussed in the following sections.

Curriculum

A total of 77 studies evaluated innovative textbooks, curriculum series, and other approaches whose theory of action focuses on the idea that changing the content of what is taught will improve pupils' learning. Overall, the average effect size for such studies is near zero (ES = +0.06). Somewhat larger effect sizes were seen in beginning reading (ES = +0.13), where curriculum innovations generally involved introducing textbooks with more of a phonetic emphasis. In mathematics, most studies involved adopting textbooks or curriculum series with more of an emphasis on problem solving and understanding of mathematical ideas, but some studies evaluated Saxon Math, a more algorithmic, back-to-the-basics approach. None of the programmes showed notably positive impacts.

Technology

There were 130 qualifying studies of the use of various types of technology in reading and mathematics. Once again, the effect sizes were modest in all categories. Technology innovations had their largest effects in primary mathematics (ES = +0.19), but of these, the higher quality randomised studies had much lower effects, averaging +0.10. A review of technology applications in reading by Cheung and Slavin (in press) found that effect sizes were somewhat higher for innovative uses of technology, especially whole-class innovations in which computers played a limited role (such as Read 180 and Writing to Read) and embedded multimedia applications in which brief video segments are woven into teachers' lessons. However, the most widely studied and most widely used applications, integrated learning systems and other stand-alone, individualised software in which pupils work through a set of exercises at their own instructional level, had minimal impacts in mathematics and in reading.

Instructional process

Instructional process programmes, such as cooperative learning, classroom management, and teaching of metacognitive skills, had the most positive outcomes in every category. A total of 100 studies met the inclusion criteria, and an additional 39 studies evaluated combinations of instructional process and curriculum innovations or of instructional process and technology. In all of these approaches, the theory of action focuses on improving teachers' effectiveness by training them in specific, well-structured approaches that engage pupils in helping each other to learn academic content, increasing pupils' motivation, making more effective use of class time, and helping pupils use study strategies or learning-to-learn skills. Most instructional process programmes teach teachers to use multiple strategies, such as cooperative learning, metacognitive strategy instruction, and effective classroom management approaches. The 100 instructional process studies had an average effect size of +0.27, and the 39 studies of combined approaches had a nearly identical mean, ES = +0.26. Outcomes were particularly positive in primary schools, both in mathematics (ES = +0.33) and beginning reading (ES = +0.31). The largest number of studies evaluated various forms of cooperative learning, such as Peer Assisted Learning Strategies (PALS) and Student Teams-Achievement Divisions (STAD), and these also had the most positive outcomes. Among studies of combined approaches, cooperative learning also played a prominent role in programmes such as Success for All in beginning reading and Read 180 in secondary reading.

Table 1. Weighted mean effect sizes by programme category.

                         Curricula      Technology     Instructional Process   Combined
Mathematics-Primary      +0.10 (13)     +0.19 (38)     +0.33 (36)              –
Mathematics-Secondary    +0.03 (40)     +0.08 (40)     +0.18 (22)              –
Reading-Beginning        +0.13 (8)      +0.11 (10)     +0.31 (18)              +0.28 (22)
Reading-Upper Primary    +0.07 (16)     +0.06 (34)     +0.23 (10)              +0.29 (6)
Reading-Secondary        –              +0.10 (8)      +0.21 (14)              +0.22 (11)
Weighted Mean            +0.06 (77)     +0.11 (130)    +0.27 (100)             +0.26 (39)

Note: Numbers of qualifying studies are shown in parentheses.



Methodological findings

The BEE reviews typically compared effect sizes across studies with different methodological characteristics, and once again, outcomes were strikingly similar across subjects and year levels.

Randomised versus matched assignment

In every comparison, there were no important differences in effect sizes between studies in which pupils, classes, teachers, or schools were assigned at random and those in which matched assignment was used (see Slavin, 2008). Within reviews, there was a pattern of larger effect sizes for matched studies of technology applications than for randomised studies (Cheung & Slavin, 2013, in press), but there were no such trends for other categories of programmes. It is important to note that randomised designs should still be preferred to matched designs to rule out possible effects of selection bias. Also, it seems likely that in areas in which self-selection is likely at the pupil level, such as evaluations of after-school or summer school programmes that pupils may choose to attend, random assignment is probably essential, as the willingness to attend is a serious biasing factor that cannot be statistically controlled for. The same would be true for studies in which there is systematic assignment of pupils, as in evaluations of special education or gifted programmes. However, in the studies of school programmes included in the main BEE reviews, the possibility of selection bias exists only at the teacher or school level, not the pupil level, and at least a portion of such bias may be controlled out by statistical controls for pretests, which are required for matched studies by the BEE inclusion standards.

Small versus large studies

As reported by Slavin and Smith (2009), based on data from the primary and secondary mathematics studies that qualified for the BEE, small studies produce much larger effect sizes than large ones. Mean effect sizes for studies with sample sizes of less than 50 averaged +0.44, those with sample sizes of 51–100 averaged +0.29, and those with sample sizes of 101–150 averaged +0.22. In contrast, studies with a sample size of more than 2000 had a mean effect size of only +0.09. For this reason, effect sizes are weighted by sample size in BEE reviews, so the reported effect sizes are strongly influenced by large studies.



Measures inherent to treatments

In evaluations of educational programmes, researchers often include measures of content presented to the experimental group but not the control group, or use outcome measures created by the study's authors that appear to contain content more aligned with the experimental than the control treatment. For example, a few studies of ICT administered outcome measures on the computer, with questions like those the experimental group had been practising. The control pupils, of course, had no experience of either the content or the computers. An analysis by Slavin and Madden (2011) found that effect sizes for such measures inherent to the experimental treatment were far larger than those for measures of content studied equally by all groups. For this reason, measures inherent to the experimental treatment were excluded in all BEE reviews.

Conclusion

The individual BEE reviews contain much information unique to each subject and year level, but looking across all of them, important patterns emerge. On valid measures of achievement in studies of at least 12 weeks' duration, the kinds of programmes consistently associated with achievement gains in reading and mathematics are those that provide teachers with extensive professional development on classroom teaching strategies designed to increase pupils' motivation, make more effective use of time, teach pupils effective study strategies, and so on. In particular, positive effects were found for many programmes that incorporated cooperative learning, usually in combination with other strategies.

The implications of these findings are clear. No programme works in every study or circumstance, but the BEE reviews support the idea that significant gains in learning are most likely for interventions that change the core teaching practices of classroom teachers, using extensive training, coaching, and follow-up to help teachers make effective and lasting changes in their daily teaching. Technology and curriculum innovations can support or supplement changes in teaching practices, but they do not have important effects on learning in themselves. Much additional research is needed to expand the range of effective practices and to better understand how, why, and under what circumstances various instructional process approaches produce learning gains, but research to date supports the idea that it is in these types of innovations that effective and replicable approaches are most likely to be found.

Note
1. Invited keynote address presented at the second meeting of EARLI SIG 18, Centre for Educational Effectiveness and Evaluation, Leuven, Belgium, August 25–27, 2010. This manuscript was accepted under the guest editorship of Jan Van Damme.

Notes on contributor
Robert Slavin is currently Director of the Center for Research and Reform in Education at Johns Hopkins University, part-time Professor at the Institute for Effective Education at the University of York (England), and Chairman of the Success for All Foundation. He received his BA in Psychology from Reed College in 1972 and his PhD in Social Relations in 1975 from Johns Hopkins University. Dr. Slavin has authored or co-authored more than 300 articles and book chapters on such topics as cooperative learning, comprehensive school reform, ability grouping, school and classroom organization, desegregation, mainstreaming, research review, and evidence-based reform. Dr. Slavin is the author or co-author of 24 books, including Educational Psychology: Theory into Practice (Allyn & Bacon, 1986, 1988, 1991, 1994, 1997, 2000, 2003, 2006, 2009), Cooperative Learning: Theory, Research, and Practice (Allyn & Bacon, 1990, 1995), Show Me the Evidence: Proven and Promising Programs for America's Schools (Corwin, 1998), Effective Programs for Latino Students (Erlbaum, 2000), Educational Research in the Age of Accountability (Allyn & Bacon, 2007), and Two Million Children: Success for All (Corwin, 2009). He received the American Educational Research Association's Raymond B. Cattell Early Career Award for Programmatic Research in 1986, the Palmer O. Johnson Award for the best article in an AERA journal in 1988, the Charles A. Dana Award in 1994, the James Bryant Conant Award from the Education Commission of the States in 1998, the Outstanding Leadership in Education Award from the Horace Mann League in 1999, the Distinguished Services Award from the Council of Chief State School Officers in 2000, the Palmer O. Johnson Award for the best article in an AERA journal in 2008, and the AERA Review of Research Award in 2009; he was appointed a Member of the National Academy of Education in 2009 and an AERA Fellow in 2010.



References
Cheung, A., & Slavin, R. E. (2013). The effectiveness of educational technology applications for enhancing mathematics achievement in K-12 classrooms: A meta-analysis. Educational Research Review, 9, 88–113.
Cheung, A., & Slavin, R. E. (in press). Effects of educational technology applications on reading outcomes for struggling readers: A best-evidence synthesis. Reading Research Quarterly.
Cooper, H. (1998). Synthesizing research (3rd ed.). Thousand Oaks, CA: Sage.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Rothstein, H. R., Sutton, A. J., & Borenstein, M. (Eds.). (2005). Publication bias in meta-analysis: Prevention, assessment, and adjustments. Chichester, UK: John Wiley.
Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105, 309–316.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton-Mifflin.
Slavin, R. E. (1986). Best-evidence synthesis: An alternative to meta-analytic and traditional reviews. Educational Researcher, 15(9), 5–11.
Slavin, R. E. (2008). Perspectives on evidence-based research in education – What works? Issues in synthesizing education program evaluations. Educational Researcher, 37, 5–14.
Slavin, R. E., Cheung, A., Groff, C., & Lake, C. (2008). Effective reading programs for middle and high schools: A best-evidence synthesis. Reading Research Quarterly, 43, 290–322.
Slavin, R. E., & Lake, C. (2008). Effective programs in elementary mathematics: A best-evidence synthesis. Review of Educational Research, 78, 427–515.
Slavin, R. E., Lake, C., Chambers, B., Cheung, A., & Davis, S. (2009). Effective reading programs for the elementary grades: A best-evidence synthesis. Review of Educational Research, 79, 1391–1466.
Slavin, R. E., Lake, C., Davis, S., & Madden, N. (2011). Effective programs for struggling readers: A best-evidence synthesis. Educational Research Review, 6, 1–26. http://dx.doi.org/10.1016/j.edurev.2010.07.002
Slavin, R. E., Lake, C., & Groff, C. (2009). Effective programs in middle and high school mathematics: A best-evidence synthesis. Review of Educational Research, 79, 839–911.
Slavin, R. E., & Madden, N. A. (2011). Measures inherent to treatments in program effectiveness reviews. Journal of Research on Educational Effectiveness, 4, 370–380.
Slavin, R. E., & Smith, D. (2009). The relationship between sample sizes and effect sizes in systematic reviews in education. Educational Evaluation and Policy Analysis, 31, 500–506.
