  • 8/2/2019 Systemwide Asses Sprog

    1/54

This publication is the result of research that forms part of a program supported by a grant to the Australian Council for Educational Research by State, Territory and Commonwealth governments. The support provided by these governments is gratefully acknowledged.

The views expressed in this publication are those of the author and not necessarily those of the State, Territory and Commonwealth governments.

A POLICYMAKER'S GUIDE TO
Systemwide Assessment Programs

    by

    Margaret Forster

First published 2001 by the Australian Council for Educational Research Ltd.
19 Prospect Hill Road, Camberwell, Melbourne, Victoria 3124, Australia.

Copyright 2001 Australian Council for Educational Research

All rights reserved. Except as provided for by Australian copyright law, no part of this book may be reproduced without written permission from the publisher.

ISBN 0-86431-359-4

    Printed by


Page 1

Table of Contents

Introduction 2
What are the purposes of systemwide assessment programs? 3
Why the interest in systemwide assessment programs? 4
What are some examples of systemwide assessment programs? 6
How are data collected? 7
How are data reported? 8
How are trends monitored? 20
How are data used to improve learning? 27
What concerns have been raised? 37
A case study 44
A checklist of considerations 46
Useful websites 48
End Notes 49


The assessment programs considered in this handbook focus on the collection and analysis of systemwide information. Two kinds of information usually are collected:

data on student achievement in particular subject areas at particular grade levels; and

limited background information (characteristics of students).

Achievement data usually are collected through standardised tests administered to either representative samples or entire cohorts of students. Background information is collected by means of questionnaires completed by students or teachers.

Page 2

INTRODUCTION

This guide provides policy makers with research-based information about systemwide assessment programs.

Good decision making at all levels of an education system is facilitated by easily accessible, relevant, and reliable information.

Many indicators provide useful input to educational decision making, but the most important indicators are those which address the central concern of education: the promotion of student learning.

Education systems monitor student learning (with the fundamental intention of promoting learning) by collecting, analysing and reporting student achievement data. Given that state, national and international achievement studies are both time-consuming and expensive, it seems prudent to reflect on this effort:

What is the purpose of these programs?

How are data reported and used?

How can we ensure that data will provide evidence for informed decision making?


Systemwide assessment programs provide systematic and regular measures of student learning. They are designed to investigate and monitor the health of an education system and to improve student learning by providing information to stakeholders at different levels of the system. They provide

policy makers with information to monitor standards over time, to monitor the impact of particular programs, and to make decisions about resource allocation; and

schools (principals, councils) and teachers with information about whole school, class and individual pupil performance that they can use to make decisions about resource allocation and to support learning in the classroom.

Full-cohort programs provide:

parents with information about their child's progress to assist them to make decisions about the best ways to support their child; and

students with information about their progress to assist them to take an active role in monitoring their own learning.

Programs that are not full cohort can provide this information for a limited number of students.

Page 3

WHAT ARE THE PURPOSES OF SYSTEMWIDE ASSESSMENT PROGRAMS?

The purpose of a systemwide monitoring program

The purpose of British Columbia's Foundation Skills Assessment (FSA) program is stated explicitly.1 The program is intended to

provide information to districts about the performance of their students in relation to provincial expectations and standards in order to assist districts to plan for improvement;

provide information to the public about the performance of students provincially in relation to expectations and trends over time;

measure the achievement of students in reading comprehension, first-draft writing, and selected components of numeracy;

determine if there are any trends in student performance at the district and provincial levels; and

determine if there are groups of students who underperform with respect to provincial standards.


Page 4

WHY THE INTEREST IN SYSTEMWIDE ASSESSMENT PROGRAMS?

The management of an education system is a complex and expensive operation. If decisions are to be informed, then dependable information on educational outputs is required. Systemwide programs provide this information for system level monitoring and resource allocation.

Of increasing interest, however, is the role large-scale assessment programs can play as agents of reform and accountability: to provide both direction and motivation to schools, teachers, parents and students. In the State of South Australia, for example, assessment is seen as the missing link in earlier curriculum planning and programming which was not informed, as a matter of course, by student achievement information.2 In British Columbia, State assessments are seen as part of the ongoing process of educational reform, as Figure 1 below illustrates.3

"[In the United States] the testing enterprise in K-12 education has mushroomed in the last quarter-century; Americans want numbers when they look at students, schools, state education systems, and how America's students compare to those of other countries. Among political leaders, testing is turning into a means of reform, rather than just a way of finding out whether reforms have been effective."4

In some countries, student achievement data collected through systemwide assessment programs are used as a measure of schools' contributions to student learning. In some, schools are rewarded or punished on the basis of students' results (see pages 34-35). In others, public comparisons of schools' achievements in the form of league tables are made.

Figure 1 Assessment and educational reform: a continuous process

[Figure: a continuous cycle linking annual provincial assessment, district assessment results, interpretation of district results, relevant information collected by the district, recommendations and goals, development of an action plan, monitoring of action plan progress, revisiting of goals, and assessment follow-through.]


Page 5

United States government publications emphasise the need for system-driven accountability measures to improve student learning:

"No school improvement can succeed without real accountability for results. Turning around low-performing schools requires that state and district leaders take active steps to set high expectations for schools and students, establish the means to measure performance against those expectations, and create policies to identify and provide assistance to those schools and students that fail to meet high standards for performance."5

Commentary on this reform and accountability agenda focuses on its understandable political appeal:

"Compared with reforms such as targeting instructional time, professional development for teachers, and reducing class sizes, state assessment programs are relatively inexpensive. The assessments also can be mandated (unlike changes in classroom practice), can be rapidly implemented, and have a public visibility."6

Significant commentary also focuses on the unintended negative consequences of assessment-driven reform (see pages 38-44).

The recent role of achievement data in assessment-driven reform is illustrated in the lower feedback loop in Figure 2. Once assessments of student learning (data collection) have been reported and evaluated (figure centre), information is then disaggregated to provide evaluations at school level. These evaluations are publicised. Rewards and sanctions are applied to schools to encourage improvements in student learning.

The upper feedback loop in Figure 2 illustrates the traditional approach to system use of student achievement data. Once assessments of student learning (data collection) have been reported and evaluated, decisions are then made at system level about the best ways to improve student learning through the allocation of resources to disadvantaged schools, to programs, and to teacher professional development.

Figure 2 Using achievement data for improvement and accountability

[Figure: a central sequence (collect data, report data, evaluate findings) feeds two loops. The upper loop, "System level improvement (and accountability)", leads to resource allocation to schools, to programs, and to professional development. The lower loop, "School level accountability (and improvement)", leads to disaggregating findings at school level, publicising school level findings, and applying rewards and sanctions to schools.]


Page 6

WHAT ARE SOME EXAMPLES OF SYSTEMWIDE ASSESSMENT PROGRAMS?

The table below lists systemwide assessment programs referred to in this handbook. Education systems collect a range of achievement data at key year levels of schooling. Most monitoring systems collect data on literacy and numeracy achievement; some, on a range of other learning outcomes.

Some examples of systemwide assessment programs

Program | Year level/s assessed | Learning outcomes assessed | Website information

Australia
Australian Capital Territory | 3, 5, 7, 9 | Literacy, numeracy | www.decs.act.gov.au/schools/assessment
New South Wales | 3, 5, 7 | Literacy, numeracy | www.det.nsw.edu.au
Western Australia | 3, 5, 7, 10 | Literacy, numeracy, science, technology, the arts, health & physical education, studies of society & environment | www.eddept.wa.edu.au/centoff/annrep99/AnnualReport.pdf
South Australia | 3, 5 | Literacy, numeracy |
Victoria | 3, 5, 7 | Literacy, numeracy, science, studies of society and environment in alternate years | www.bos.vic.edu.au
Northern Territory | 3, 5, 7 | | www.ntde.gov.au
Queensland | 3, 5 | Literacy, numeracy | www.qscc.quld.edu.au
Tasmania | 3, 5 | Literacy, numeracy | www.tased.edu.au

Canada
British Columbia | 4, 7, 10 | Reading, numeracy, first-draft writing | www.bced.gov.bc.ca/assessments

United States
Colorado | 4, 8 | Reading, Language Arts | www.ced.state.co.us/
Kentucky | 4, 5, 7, 8 | Maths | www.kde.state.ky.us/
North Carolina | 4, 8 | | www.dpi.state.nc.us
Texas | 4, 8 | | www.tea.state.tx.us
Tennessee | 4, 8 | | www.state.tn.us/education/


Page 7

HOW ARE DATA COLLECTED?

Full-cohort testing

Some education systems collect student achievement data through full-cohort assessment programs. The Year 3 and Year 5 literacy and numeracy assessments in Australian States and Territories, in many states of the United States, and in parts of Canada are examples, although in some instances full-cohort means the cohort of government schools only.

Typically, data in these programs are collected through the use of machine-scored paper and pen tests, although for some learning outcomes, students' extended responses or on-the-spot performances are assessed by classroom teachers or by centrally trained assessors (for example, writing outcomes, and physical education outcomes). Data are aggregated to provide summaries of group performances, either at the level of a system or at the level of a school.

An advantage of these programs is that they can provide reliable information to all parents on individual student progress in a few crucial areas of school learning.

Sample surveys

Some systems collect achievement data through sample surveys which are designed to provide summary information at the system level only. The US National Assessment of Educational Progress (NAEP) is an example of a survey of this kind conducted at a national level. But some systems conduct similar surveys at a State or district level; for example, the Western Australian Monitoring Standards in Education Program.

Programs of this kind are based on the performances of carefully drawn representative samples of students. Samples may be drawn to ensure adequate representation of particular categories of students so that the average performances of students in those categories can be compared and monitored.

Although sample surveys cannot provide all parents with information on the progress of individual students, or local school communities with information on school results, sample surveys have a number of important advantages over full-cohort testing.

Sample surveys are capable of providing evidence about a rich and varied set of learning goals. Full-cohort testing programs inevitably address only those outcomes that can be assessed for many thousands of students at a time. This constraint limits the range of learning outcomes that the program is able to address.

Because sample surveys usually do not report on individual students, it is not necessary for all students to attempt the same set of assessment tasks. Different students can attempt different but overlapping sets of tasks (known as a multiple-matrix design) to allow system reporting on a wide range of valued curriculum goals.

Sample surveys are also less expensive overall (though more expensive per student) than full-cohort testing, and tend to be less intrusive into classroom time (though they may require extensive commitment from a very small number of teachers and students).
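The multiple-matrix design described above can be sketched in a few lines of code. This is a minimal illustration under assumed numbers (a 30-item pool split into three overlapping 15-item booklets), not a description of any actual program's design: each sampled student attempts only one booklet, the booklets overlap so results can be linked to a common scale, yet the sample as a whole covers every item.

```python
# A minimal sketch of a multiple-matrix design (illustrative numbers,
# not any actual program's design): each sampled student attempts one
# booklet; booklets overlap; the sample covers the full item pool.

ITEM_POOL = [f"item_{i:02d}" for i in range(30)]  # 30 valued curriculum items

# Three overlapping 15-item booklets: A overlaps B, and B overlaps C.
BOOKLETS = {
    "A": ITEM_POOL[0:15],
    "B": ITEM_POOL[10:25],
    "C": ITEM_POOL[15:30],
}

def assign_booklets(student_ids):
    """Rotate the booklets across the sampled students."""
    names = sorted(BOOKLETS)
    return {sid: names[i % len(names)] for i, sid in enumerate(student_ids)}

students = [f"student_{i}" for i in range(12)]
assignment = assign_booklets(students)

# No student attempts more than 15 of the 30 items...
assert all(len(BOOKLETS[b]) == 15 for b in assignment.values())

# ...yet collectively the sampled students cover every item in the pool.
covered = set()
for booklet in assignment.values():
    covered.update(BOOKLETS[booklet])
assert covered == set(ITEM_POOL)
```

In practice the linking of overlapping booklets onto a common scale is done with item response modelling; the sketch shows only the assignment logic that lets a system report on more outcomes than any single student is tested on.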


Page 8

HOW ARE DATA REPORTED?

There are multiple audiences for reports of student achievement, including education systems, school communities, parents, students, and the general public. These audiences usually are interested in different levels of detail about educational achievements.

Data are reported in summary or disaggregated form, and against a range of reference points; for example, curriculum standards, proficiency scales, and expectations. In each instance, findings are communicated using a variety of techniques including graphs, numerical presentations (tables), written descriptions, and rankings (league tables).

The table below summarises some ways in which data are reported.

The following examples 1-10 illustrate reports of systemwide results. Examples 11-13 illustrate reports of school results, and example 14 illustrates an individual student report.

SOME WAYS IN WHICH ACHIEVEMENT DATA ARE REPORTED

Systemwide results
Averages and distributions
Against national norms
Against standards frameworks (including described proficiency scales)
Against performance expectations (including standards, goals, benchmarks)
Against international benchmarks
For subgroups of students (including gender, cultural background, language background)
Against background variables
In curriculum areas
Item-by-item

School results
Averages
For subgroups of students
For subgroups of schools (including district, geographic region)
In curriculum areas
Item-by-item
Against systemwide cohort achievement
Against school level performance goals (including value-added)

Student results
Individual student results


Page 9

Example 1 Systemwide: averages and distributions

Most education systems provide summary reports of the mean achievements of students. In this example, the summary achievements of Year 3, Year 7 and Year 10 Western Australian students in writing are shown graphically and (over the page) in tabular form.

The performance scale (0-800), against which achievement is reported, is marked out in levels. A description of achievement at each of these levels is provided to the left of the display.7

[Figure: Student performance in writing. Distributions for Year 3, Year 7 and Year 10 students (mean, girls, boys, ATSI and NESB subgroups) are plotted against the 0-800 performance scale, marked out in levels 1-7.]

Level 8: Students strive to write with assurance, precision and vitality. They explore complex themes and issues in a variety of styles that compel readers' interest and attention.

Level 7: Students explore ideas about texts and issues in a precise and organised way. They express themselves precisely when writing for complex purposes and they try to match text type, structure, tone and vocabulary to their subject and purpose.

Level 6: Students write in a variety of sustained ways to explore complex issues and ideas. They select information to influence readers. They make their meaning clear for readers by using correct punctuation, spelling and grammar and by manipulating words and the structure of the text.

Level 5: Students use a variety of text types to write at length and with some sense of complexity. They write sustained, logically organised texts that substantiate or elaborate ideas. They show a sense of the requirements of the reader and experiment with manipulating prose for effect.

Level 4: Students have a sound basic knowledge of how to use English. They use familiar ideas and information in their writing, showing control over the way some basic text types are written. They present ideas logically with limited elaboration. They try to adjust their writing to meet readers' needs.

Level 3: Students write longer texts using ideas and information about familiar topics for particular purposes and known audiences. They use many of the linguistic structures and features of a small range of text types and make attempts at spelling new words according to spelling patterns and conventions.

Level 2: Students produce brief written texts understood by others and which include related ideas and information about familiar topics. Students have a beginning knowledge of the conventions of written texts.

Level 1: Students show an emerging awareness of the nature, purposes and conventions of written language. They experiment with using written symbols for conveying ideas and messages.


Page 10

Example 1 cont.

Year | Number of students | Mean | Mean (in level) | Standard deviation
1992 Year 3 | 1426 | 285 | 2 | 96
1992 Year 7 | 1497 | 477 | 4 | 127
1992 Year 10 | 1143 | 564 | 5 | 142
1995 Year 3 | 1682 | 262 | 2 | 139
1995 Year 7 | 1610 | 475 | 4 | 134
1995 Year 10 | 1563 | 551 | 5 | 136

Example 2 Systemwide: against national norms

Many State and district education systems report systemwide summary achievement against national norms. For example, many States of the US use National Assessment of Educational Progress (NAEP) data and TIMSS results as reference points for student achievement on State tests. The table below shows results on the Connecticut Mastery Test 1999 compared with a national sample of students.8

State level normative information: mean national percentile ranking*

Grade | Mathematics | Reading comprehension | Written communication
Grade 4 | 68 | 57 | 64
Grade 6 | 66 | 58 | 68
Grade 8 | 68 | 54 | 66

* Normative information is provided to indicate how well the average student in Connecticut performs compared to a United States national sample. For example, it is estimated that fourth grade students who achieved the state average score on the CMT mathematics test would have scored better than 68% of students nationally.
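The footnote's reading of a mean national percentile ranking can be made concrete with a small calculation. The numbers below are invented for illustration (not actual CMT or national data): the percentile rank of a score is simply the percentage of the reference sample scoring below it.

```python
# Illustration of reading a national percentile ranking (invented
# numbers, not actual CMT or national data): a score's percentile rank
# is the percentage of the reference sample scoring below it.

def percentile_rank(score, reference_scores):
    """Percentage of the reference sample scoring below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)

# Hypothetical national reference sample: 200 students, one at each
# score from 0 to 199.
national_sample = list(range(200))

# A hypothetical state average score of 136 would place the average
# student above 68% of this national sample.
print(percentile_rank(136, national_sample))  # → 68.0
```

Real normative reporting estimates this from a weighted national norming sample rather than a simple count, but the interpretation of the reported number is the same.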


Page 11

Example 3 Systemwide: against standards frameworks

Increasingly, systems are reporting summary achievement against levels of a standards framework. This example shows the summary reporting of Grade 4 students' reading achievement against National Assessment of Educational Progress (NAEP) proficiency scales.9

North Carolina NAEP State Results

1994 Reading Grade 4
Proficient level and above: 30%
Basic level and above: 59%

1996 Math Grade 4
Proficient level and above: 21%
Basic level and above: 64%


Page 12

Example 4 Systemwide: against standards frameworks (proficiency scales)

As part of its Monitoring Standards in Education sample assessment program, Western Australia assesses and reports the speaking achievements of Year 3, 7 and 10 students. This example shows the achievements of Year 3 students in small group discussion against described speaking proficiency levels.10 The achievements are shown graphically. Four levels of the 8-level proficiency scale are described below.

[Figure: two bar charts showing the percentage of Year 3 students at each proficiency level (0-7): "Year 3 Expository Speaking, nominated student in a small group" and "Year 3 Narrative Speaking, nominated student in a small group".]

Speaking proficiency levels

Level 8: Students use group work to explore complex concepts. They negotiate agreements in groups where there are disagreements or conflicting personalities, managing discussions sensitively and intelligently and concluding with positive summaries of achievement.

Level 7: Students make significant contributions to independent work groups and are aware of and able to modify their own behaviour where necessary. They attempt to arrive at consensus. Students discuss the topic at a sophisticated level.

Level 6: Students explore ideas in discussions by comparing their ideas with those of their peers and building on others' ideas to advance the discussion. They generate a comprehensive and detailed response to the topic.

Level 5: Students work well in formal groups where they take on roles, responsibilities and tasks. They consider challenging issues, give considered reasons for opinions and ideas, and constructively discuss the presentation of those ideas.


Page 13

Example 5 Systemwide: against performance expectations

Where systems have set proficiency standards (the level at which students are expected to perform), summary and disaggregated data often are reported as the percentage of students achieving the proficiency standard.

This example shows the Reading/Language Arts achievements of Grade 3 students in Oregon.11 The first row shows the achievement of the Grade 3 cohort (based on an eighty-eight per cent participation rate). The second and third rows show the achievements of students from low socio-economic backgrounds. Federal government Title 1 funds provide additional educational support for children in need. Schools with more than fifty per cent poverty are eligible for a school-wide program. Targeted assistance programs provide funds directly to the neediest students. The entries at the bottom of the table show the achievements of students from a non-English speaking background.

Group | % Standard not met | % Meet standard | % Exceed standard
All students | 23.0 | 42.0 | 35.0
Title 1 school wide | 32.0 | 44.5 | 23.5
Title 1 targeted | 50.2 | 42.4 | 7.4
Percent of school in poverty 0-34% | 17.0 | 40.0 | 43.0
Percent of school in poverty 75-100% | 46.0 | 40.0 | 14.0
Limited English Proficient students | 72.3 | 25.5 | 2.2
Migrant students | 53.8 | 41.6 | 4.6

Example 6 Systemwide: against international benchmarks

Although the age/grade groups of students involved in national and international assessments are often different from State and district target grades, in some systems there is enough overlap to be of interest. For example, many States of the US use Third International Mathematics and Science Study (TIMSS) results as a reference point for student achievement on State tests.

British Columbia, as part of its State assessment report, details student achievement on the national School Achievement Indicators Program (SAIP) and in 1998 included 1995 TIMSS results.


Page 14

Example 7 Systemwide: for subgroups of students

Most systems report the achievements of subgroups of the population, including breakdowns by gender, cultural and language background, geographic region, and type of community.

The first graph shows the achievements of students from a language background other than English (LBOTE), Aboriginal and Torres Strait Islander students (ATSI), boys, girls and all students in Year 5 Mathematics, Western Australia.12

[Figure: Year 5 Mathematics achievement by subgroup (All, Girls, Boys, ATSI, LBOTE) plotted against the 0-800 performance scale, marked out in levels 2-6.]

The second graph shows the achievements of Grade 4 students in different regions of Newfoundland and Labrador on the complete battery of Canadian Basic Skills tests.13

[Figure: composite scores by region, as percentile ranks: Avalon 59, South 55, Central 47, West 47, Labrador 47.]

The third graph shows the reading and viewing achievements of Year 5 students in Queensland, Australia: proportionally more girls are in the top scoring groups.14

[Figure: 1998 Year 5 Aspects of Literacy, Reading & Viewing score distributions by gender (boys: N = 23751; girls: N = 22922).]


Page 15

Example 8 Systemwide: against background variables

Some systems collect and report information about students' attitudes and behaviours.

British Columbia, for example, reports data on students' reading, writing and mathematics attitudes and behaviours, including calculator and computer use, at Grades 4, 8 and 10.15 Changes in attitudes and behaviours as students move from Grade 4 through to Grade 10 are of particular interest. In Grade 4, 68% of students reported they read in their spare time almost every day. The percentage of students reading in their spare time every day dropped to 51% at Grade 7, and 35% at Grade 10.

Example 9 Systemwide: in curriculum areas

Some systems report achievement by curriculum area within a test. This example shows the achievements of students in South Australia on different aspects of numeracy. Included are the achievements of boys and girls.16 The reported means are based on scaled scores (not raw scores), allowing for the comparison of achievements on sub-sets of test items. For example, in 1997, students performed less well in space than in number or measurement.

Year 5 1997 Numeracy aspects: Number, Space, Measurement

Group | Number | Measurement | Space
All students | 58.3 | 58.5 | 56.4
Boys | 58.4 | 59.0 | 56.4
Girls | 58.3 | 58.0 | 56.5


Page 16

Example 10 Systemwide: item-by-item

It is also possible to report the performances of groups of students by item. For example, Queensland reports the achievements of boys, girls, Aboriginal and Torres Strait Islander students, students from language backgrounds other than English, and different groups of language speakers on each test item. The table below shows the performance of students from an English speaking background (ESB), students from a non-English speaking background (NESB) with English as their first language (non-ESL), and students from a non-English speaking background (NESB) with English as a second language (ESL), on a number of 1998 Year 5 numeracy items.17 On these items, students from an NESB + ESL background performed better than their NESB and ESB peers. Apart from item 40, all of these items are from the Number strand and demand calculation skills of varying levels of difficulty.

Item number and item description | % answering correctly: ESB | NESB + non-ESL | NESB + ESL
01 Add amounts of money (calculator available) | 82.0 | 80.0 | 84.0
07 Calculate using the correct order of operations (calculator available) | 50.5 | 47.2 | 57.0
08 Calculate using the correct order of operations (calculator available) | 40.9 | 45.5 | 50.0
10 Calculate the remainder of a division example (calculator available) | 38.2 | 36.4 | 46.4
20 Multiply a 2 digit number by a single digit number involving regrouping | 61.5 | 59.7 | 69.4
21 Subtract 3 digit numbers with regrouping | 68.1 | 64.3 | 72.0
38 Interpret a table and order lengths involving decimals | 20.7 | 18.8 | 26.5
40 Interpret visual information and solve problem involving distances around a 3D shape | 5.8 | 4.3 | 6.5
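An item-by-item breakdown of this kind is produced by straightforward aggregation over individual response records. The sketch below uses invented responses and only two group labels (not Queensland data); it shows the tallying step that yields a "percent answering correctly" cell for each group on each item.

```python
from collections import defaultdict

# Invented (group, item, correct?) response records for illustration;
# real programs aggregate millions of such records.
responses = [
    ("ESB", "item_01", True), ("ESB", "item_01", False),
    ("ESB", "item_07", True), ("ESB", "item_07", True),
    ("ESL", "item_01", True), ("ESL", "item_01", True),
    ("ESL", "item_07", True), ("ESL", "item_07", False),
]

# Tally correct answers and attempts per (group, item) cell.
tallies = defaultdict(lambda: [0, 0])  # (group, item) -> [correct, attempts]
for group, item, correct in responses:
    cell = tallies[(group, item)]
    cell[0] += int(correct)
    cell[1] += 1

# Report percent answering correctly, item by item, for each group.
for (group, item), (correct, attempts) in sorted(tallies.items()):
    print(f"{item}  {group}: {100 * correct / attempts:.1f}% correct")
```

The same tallying, keyed on any background variable collected through the student questionnaires, produces each of the subgroup breakdowns shown in Examples 7-10.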


Page 17

Example 11 School results: averages

In many systems the performances of individual schools are reported publicly. For example, in Pennsylvania the mean performance of schools on the Pennsylvania System of School Assessment (PSSA), sorted by county and district, is available on the Department of Education website. Average scaled scores for schools in the Brownsville district are shown below.18

Average scaled scores for Year 5

County | District | School | Math | Reading
Fayette | Brownsville | Cardale El Sch | 1170 | 1190
Fayette | Brownsville | Central El Sch | 1220 | 1300
Fayette | Brownsville | Colonial El Sch | 1220 | 1260
Fayette | Brownsville | Coox-Donahey El Sch | 1170 | 1210
Fayette | Brownsville | Hiller El Sch | 1200 | 1250

Example 12 School results: for subgroups of students

Many systems report information directly to schools. This school report is provided to schools in the Australian Capital Territory. It shows whole school and subgroup achievement in listening at Year 3 level.19


    Page

    18

Example 13 School results: against standards frameworks

The following report is provided to schools in South Australia. The report shows the school's results against the State proficiency scales (Bands).20

Basic Skills Testing Program 1995, Year 5, Aspects of literacy
School: ##
No of students: 96

    Percentage of students in skill bands

    Reading Language Literacy

Band 4 State 45 37 42

School 52 36 43

    Band 3 State 22 32 27

    School 23 31 32

    Band 2 State 18 19 19

    School 16 22 16

    Band 1 State 15 11 12

    School 8 9 8


Example 14 Student results: individual student achievement

Some full cohort assessment programs provide individual student reports. This Queensland Year 6 report shows an individual student's numeracy results on Level 2, Level 3 and Level 4 items and against systemwide performances.21

[Student report: Aspects of Numeracy, with scales for Number, Measurement and Space. A marker shows this student's result on each scale, which runs from the lowest possible score to the highest possible score; 60% of students tested in the state scored in the shaded range. Each item is flagged to show whether this student answered it correctly or not.]

Level 4

Number: Represent common fractions on a number line. Use place value to compare and order numbers.

Measurement: Convert measurements using common metric prefixes. Use conventional units of mass. Calculate areas of rectangles.

Space: Visualise position and describe it on a map using distance, direction and co-ordinates. Make use of conventions relating to co-ordinate pairs. Visualise locations and understand directional language when reading maps.

Level 3

Number: Completely solve division problems by interpreting remainders.

Measurement: Use a calculator to add lengths expressed as decimals. Compare areas by counting units.

Space: Visualise and follow paths using co-ordinates.

Level 2

Number: Continue whole number patterns involving addition. Divide a whole number by a 1-digit number. Subtract a 3-digit number from another involving regrouping.

Measurement: Recognise that different units can be used to measure the same length. Locate a date on a calendar. Use an object as a repeated unit of measurement.

Space: Compare properties of 3D shapes. Recognise features of a 3D object that are shown in a 2D diagram. Recognise the same shape within arrangements and patterns.

Number, Measurement and Space (within the shaded state range)

Number: Recognise equivalent fractions. Continue number patterns. Place whole numbers in order. Subtract one 3-digit number from another. Partly solve division problems by interpreting remainders. Multiply by a 1-digit number. Interpret whole numbers written in words and use a calculator for adding whole numbers. Represent word problems as number sentences. Use a calculator for subtracting whole numbers. Add 3-digit whole numbers. Multiply small whole numbers. Subtract small whole numbers. Add 2-digit whole numbers. Recognise place value in whole numbers.

Measurement: Calculate time intervals. Compare and measure length to the nearest graduation. Read time on a clock. Read a thermometer scale to the nearest marked graduation.

Space: Identify a right angle. Recognise 3D shapes from a description of their surface. Choose shapes that can cover a region with no gaps or overlaps. Select a flat shape that will fold to make a prism. Interpret placement of objects in drawings.


HOW ARE TRENDS MONITORED?

Systems monitor the health of an education system by studying trend data. Some ways in which trends are monitored are summarised in the table below. Examples 15–23 provide detailed illustrations.

SOME WAYS IN WHICH TRENDS ARE MONITORED

Statewide trends

Changes in averages and distributions

Changes in percentage of students above or below national norms

Changes in percentage of students at levels of a standards framework

Changes in percentage of students above or below performance expectations

Changes in the achievements of subgroups of students (including relative growth of subgroups)

School trends

Changes in averages and distributions

Changes in the achievements of subgroups of students

Changes in school rankings (league tables)

Success in meeting performance goals (including value added)


Example 15 Systemwide trends: averages and distributions

The Australian Capital Territory monitors student achievement by comparing the median achievements and distributions over time and between year levels.22


Example 16 Systemwide trends: percentages in levels

One of the ways in which South Australia monitors student achievement over time is by tracking the percentage of students working within a skill band on a described proficiency scale. The graph illustrates the percentage of Year 3 students achieving skill bands 1, 2, 3, 4 and 5 in numeracy in 1995, 1996, and 1997.23 In 1997, the drop in the mean in Year 3 was accompanied by an increase of 2% of students in skill bands 1 and 2, and a 5% decrease in skill bands 4 and 5. There was a 4% increase of students in skill band 3.

Example 17 Systemwide trends: reaching performance expectations

Connecticut monitors student achievement by comparing over time the percentages of students at or above State goals set for each grade.24 Goals are set for each grade on each of three tests (mathematics, reading comprehension, and written communication). For example, the State goal for reading at each grade level is set at 8 on a scale of 2–12. The table shows the percentage of students who performed at or above the State goal on all three tests.

Connecticut Mastery Test 1993–1999 results

    Year Grade 4 Grade 6 Grade 8

    1993 19.3 23.1 21.5

1994 23.3 23.7 25.4

1995 27.1 24.3 28.4

    1996 30.1 30.0 36.5

    1997 32.8 30.2 36.4

    1998 34.9 33.8 40.4

    1999 34.5 38.1 41.5

    Change +15.2 +15.0 +20.0

[Graph for Example 16: percentage of Year 3 students in numeracy skill bands 1–5 in 1995, 1996 and 1997.]


Example 18 Systemwide trends: reaching performance expectations

Western Australia tracks the percentage of students at or above the fitness standards, for different ages, established in 1994 by the Australian Council for Health, Physical Education and Recreation (ACHPER). In 1998, the state reported a decline with age in the percentage of students who achieved the appropriate ACHPER minimum standard of cardiorespiratory endurance. This decline was greatest for girls, with 60% of Year 10 girls not achieving the minimum standards for their age.25

Cardiorespiratory endurance: percentage of students achieving at or above the ACHPER benchmark

Year 7: Girls 70.8, Boys 62.3
Year 10: Girls 38.6, Boys 52.3

Example 19 Systemwide trends: reaching performance expectations

Connecticut also tracks the percentage of students, over the three test administrations, scoring above the goals set for each grade. The table shows the reading achievements of three different cohort groups from 4th to 6th to 8th grade.26

    Grade 4 Grade 6 Grade 8

    year % year % year %

    Cohort 1 1993 44.6 1994 59.4 1995 64.2

    Cohort 2 1994 45.0 1995 60.0 1996 66.4

    Cohort 3 1995 47.7 1996 60.3 1997 67.6


Example 20 Systemwide trends: subgroups of students

Western Australia tracks the mean achievements (on the Monitoring Standards in Education performance scale) of students in Years 3, 7 and 10 in writing. The table shows mean writing scores for subgroups of the population.27

    Summary of subgroup performances in writing 1995

    Year 3 Year 7 Year 10

    All 262 475 551

    Girls 285 502 583

    Boys 238 450 517

    ATSI* 140 372 448

    NESB** 229 428 521

    *Aboriginal and Torres Strait Islander students

    ** Students from a non-English speaking background

Example 21 Systemwide trends: subgroups of students

Queensland tracks the mean achievements of students in Years 3, 5 and 6 to investigate the gender gap. The table below shows that in literacy (reading and viewing, spelling, and writing) girls consistently outperform boys, with the gap being widest at Year 5 and reduced at Year 6.28 Boys appear to catch up most noticeably during Year 6 in reading and viewing.

Mean scores: literacy (reading and viewing)

Year Cohort All Boys Girls Gender gap in favour of girls

1998 Year 3 490.7 482.6 499.1 +16.5

1998 Year 5 592.6 582.6 602.8 +20.2

1997 Year 6 649.3 643.0 656.1 +13.1

1996 Year 6 629.5 621.5 638.5 +17.0


    School trends

Some education systems disaggregate State achievement data to evaluate achievement at the school level with the aim of monitoring and comparing the contribution individual schools make to pupils' progress.

At the crudest level, school averages are compared (and sometimes reported in the form of league tables). However, schools and their contexts differ from one another and these differences significantly influence progress and achievement. Raw results can be misleading indicators of the value added by a school if they are not adjusted for intake differences.30

Value added approaches

In response to this concern a number of value added approaches have been investigated. Value added is the calculation of the contribution that schools make to pupils' progress.31 Taking a value added approach means finding an accurate way to analyse performance which takes account of factors that have been found empirically to be associated with performance but over which schools have little or no control. These factors include prior attainment, sex, ethnic grouping, date of birth, level of special education need, and social disadvantage. Typically, pupils' prior attainment plus the overall level of social disadvantage in the school (as measured by free school meals) can account for as much as 80% of the apparent difference between schools.32

The term value added also has been used more broadly (and sometimes confusingly) to describe a whole range of connected but distinct activities including

making like with like comparisons of schools' (or departments' or classes') performances;

representing pupils' progress as well as their achievement;

identifying which schools/departments/classes are currently performing above or below predictions; and

identifying which individual pupils are likely to perform above or below predictions.33
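The core of the adjustment idea can be sketched numerically. This is an illustrative sketch only, under strong simplifying assumptions: a school's value added is estimated here as the mean residual after adjusting pupil outcomes for prior attainment alone, with invented pupils and scores. Real programs adjust for many more background factors and use multilevel statistical models.

```python
# Illustrative sketch only: "value added" as the mean residual after
# adjusting pupil outcomes for prior attainment. All data are invented.

def fit_line(xs, ys):
    """Ordinary least squares fit for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# (prior attainment, outcome, school) for each pupil -- hypothetical data
pupils = [
    (40, 52, "A"), (55, 66, "A"), (62, 75, "A"),
    (45, 50, "B"), (58, 60, "B"), (70, 73, "B"),
]

a, b = fit_line([p[0] for p in pupils], [p[1] for p in pupils])

value_added = {}
for prior, outcome, school in pupils:
    residual = outcome - (a + b * prior)      # progress beyond prediction
    value_added.setdefault(school, []).append(residual)

for school, res in sorted(value_added.items()):
    print(school, round(sum(res) / len(res), 1))  # A 3.9, B -3.9
```

A like-with-like comparison in a real system would replace the single predictor with prior attainment plus contextual factors such as free school meal rates.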


Example 22 Systemwide trends: subgroups of students

By tracking the same cohort of students, South Australia monitors the relative growth of subgroups of the population. This graph illustrates numeracy growth from Year 3 to Year 5.29 The scale 0–9 represents growth on the calibrated tests. The largest growth occurs for the youngest students, but as this is a small group of students the results are relatively unreliable. Aboriginal students show the most growth, with the oldest students showing least growth.

[Graph: numeracy growth from Year 3 to Year 5 on a 0–9 scale for subgroups: all students, boys, girls, youngest, most common age, oldest, ATSI, NESB and ESB students.]


Example 23 School trends: value added

In Chicago, the value a school adds to students' learning is calculated on the basis of a set of grade profiles.34 For each school, for each grade, the profile is based on two pieces of information: input status and learning gain. Input status captures the background knowledge and skills that students bring to their next grade of instruction and is based on students' test scores on the Iowa Test of Basic Skills (ITBS) from the previous spring. Learning gain is the degree to which the end-of-year ITBS results have improved over the input status for the same group of students.

The profile is organised around data from a base year. Students who move into and out of a school during the academic year do not count in the productivity profile for that year. A statistical model is used to smooth trend lines so that variability in the data from year to year does not obscure any overall pattern. The model also is used to adjust trend estimates for other factors that might be changing over time besides school effectiveness (for example, the school's ethnic composition, the percentage of low-income students, and retention rates).

To evaluate schools' contributions to student learning, productivity profiles are classified into one of nine patterns using a dual indicator comparison scheme which considers both the learning gain trends and the output trends. For example, a school whose grade profile shows an increasing output trend with an input trend of the same rate is classified as "no change" and contributing less to students' learning than a school with an output trend which is increasing at a faster rate than the input trend ("up"). In computing profiles for each grade, averages are calculated from across adjacent grades, providing a more stable estimate than single grade averages would provide. (Improving productivity in one grade tends to be followed by some declines in the next, and vice versa.)
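The two quantities behind these profiles can be sketched as follows. This is a hypothetical illustration with invented scores, not the actual Chicago computation, which also smooths trends statistically and adjusts for demographic change.

```python
# Hypothetical sketch: "input status" is the previous spring's scores for a
# cohort; "learning gain" is the average end-of-year improvement over it.

def learning_gain(input_scores, output_scores):
    """Average end-of-year improvement over input status for one cohort."""
    gains = [out - inp for inp, out in zip(input_scores, output_scores)]
    return sum(gains) / len(gains)

input_status = [48.0, 52.0, 55.0, 61.0]   # previous spring's scores (invented)
end_of_year = [53.0, 58.0, 59.0, 66.0]    # same students, end of this year

print(learning_gain(input_status, end_of_year))  # 5.0
```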

Figure 3 shows the reading productivity trends for Fillmore Elementary School. The bottom trend line illustrates input trends; the top illustrates output trends. The distance between the two trend lines illustrates productivity.

Figure 3 Reading productivity for Fillmore Elementary School
Note: Percentages associated with each grade productivity profile are the percentage improvement in learning gains over the base year period (1991).

[Figure 3: 1991 and 1996 input and output trend lines by grade, with profile classifications Mixed, Increasing Output, Up and No Change. Percentage improvement in learning gains over the 1991 base year: Grade 3, -14%; Grade 4, -33%; Grade 5, 12%; Grade 6, 60%; Grade 7, 12%; Grade 8, -1%.]

Grade productivity profiles from all individual elementary schools in the system also are aggregated to show the overall productivity of Chicago schools. Figure 4 below illustrates overall ITBS mathematics productivity. The output trends are up for all grades and learning gain trends show improvements for the middle and upper grades. Grades three and four show little change in learning gains. The grade three data are particularly interesting. Although the output trend is positive, the Learning Gain Index is down by 4%. This indicates that the gains in achievement at the end of the grade are largely attributable to improvements prior to grade three. If output trend only had been considered, then one might have mistakenly concluded that third grades were improving system wide.

Figure 4 Mathematics productivity profile for Chicago public schools, 1987–1996
Note: LGI = Learning Gain Index, computed for 1992–1996

[Figure 4: input and output trend lines by grade. LGI by grade: Grade 3, -4%; Grade 4, 2%; Grade 5, 19%; Grade 6, 7%; Grade 7, 27%; Grade 8, 63%.]

HOW ARE DATA USED TO IMPROVE LEARNING?

Student achievement data collected through systemwide programs usually are used for two closely related purposes: accountability and improvement.

For example, as part of their commitment to equal opportunity, education systems monitor the achievements of students from different geographic, gender and ethnic backgrounds to ensure that all students enjoy equal access to education. On the basis of achievement data, they may allocate additional resources to programs targeting a particular subgroup of students (system accountability to the public to provide resources equitably).

Systems that have set improvement goals check progress towards system targets. On the basis of achievement data, they may allocate additional resources to programs targeting low-achieving schools (system improvement purpose).

In some countries system managers also encourage schools to use system level achievement data for accountability and improvement. Schools are supported to use data to compare the achievements of their students with past performances, and with the performances of students in other schools, and to set school goals for improvement.


In some countries, schools are required to respond to centrally collected achievement data. For example, in twenty-three states in the US, schools are held directly accountable for education outcomes. These States have policies for intervening and mandating major changes in schools judged to be low performing on the basis of student achievement on State mandated tests. In some cases, States or districts provide technical assistance and additional resources to help redesign or restructure chronically low performing schools. In some jurisdictions, schools have been reconstituted. This often involves replacing school principals and removing teachers.35

The table below summarises some ways in which assessment data are used by system managers for improvement and accountability purposes at different levels of the education system. These strategies are elaborated below with detailed examples. Example 24 illustrates one Canadian province's systematic approach to using data at all levels of the education system to improve student learning.


SOME SYSTEM USES OF ASSESSMENT DATA

System
Improvement: Allocating resources; Motivating research
Accountability: Informing curriculum and performance standards

School
Improvement: Providing professional development; Setting goals; Allocating resources
Accountability: Applying sanctions and offering rewards to schools on the basis of contributions to student achievement (including value added)

Classroom
Improvement: Providing curriculum feedback; Motivating change

Student
Improvement: Informing learning; Motivating learning
Accountability: Allowing or refusing grade promotion

Parent
Improvement: Communicating progress
Accountability: Informing decision making (school selection)

Pre-service Training
Improvement: Informing course focus

Community
Accountability: Informing about standards

Educational Research
Improvement: Guiding research


    Example 24 A systematic approach to using achievement data

The ways in which assessment data are to be used at all levels of the education system are made explicit in British Columbia.36 Information is used

by the Province and Districts to

report on the results of student learning in selected areas of the curriculum;

assist in policy, program and curriculum development;

facilitate public and professional discussions on student learning;

analyse results of particular populations of students to determine if they require additional support or focused attention;

by schools to

facilitate discussions about student learning;

assist in the development and review of school growth plans;

by students and parents as

an additional external source of information about a student's performance in relation to provincial standards.

Example 25 System level: directing resources

Student achievement data from systemwide achievement studies can be used as a basis for allocating resources.

For example, in California the Immediate Intervention Underperforming Schools Program allocates additional resources to schools scoring in the bottom half of the Statewide distribution of the Standardized Testing and Reporting (STAR) Program. Schools may volunteer or may be randomly selected for planning grants to work with an external evaluator and a community team to identify barriers to school performance and to develop an action plan to improve student achievement.37

In Queensland, State sector schools with students performing in the bottom 15% of the cohort on the literacy or numeracy Statewide tests are allocated additional funds to provide intervention programs designed specifically for those students.


Example 26 System level: motivating research

Student achievement data from systemwide achievement studies can be used to motivate research. For example, the literacy achievements of Queensland students relative to the achievement of students in other Australian States motivated two reviews: a study of the literacy practices in schools, and a study of the State testing program.

Example 27 System level: informing standards

Some systems use student achievement data to inform reviews of curriculum and performance standards. For example, the Victorian (State) Board of Studies commissioned a study to compare the English, science and mathematics expectations contained in the revised Curriculum and Standards Framework with State and Territory data as well as with international achievement data. The intention was to confirm the level of expectation with reference to actual student performance.38

Example 28 School level: providing professional development

Some systems provide direct assistance to schools to encourage them to pursue data-driven improvements. For example, the Maryland State Department of Education has a web page to help schools to analyse their State data. Achievement data on a variety of key dimensions are presented in simple graphs for each school. The data are disaggregated by subject, gender, race, and grade. Schools can compare results to similar schools in the State. Worksheets also are provided to guide schools to investigate instructional practices, chart data, and identify further data they need to collect.39

Similarly, in the State of Victoria, Australia, schools can use data to compare their students' results with State and like school benchmarks to learn about their effectiveness. They can compare their current performance levels with their own past performance, and the performance of similar schools, to plan for improved achievement and to set performance expectations for themselves.40


Example 29 Classroom level: providing curriculum feedback

Data from full cohort State and district programs can be used to provide feedback to schools and teachers on student achievement in relation to aspects of the curriculum. On the basis of objective information, teachers are then able to adjust their teaching strategies. For example, the Australian Capital Territory provides schools and teachers with student achievement data on each test question. The report below has been annotated to assist teachers to see the kinds of listening questions the students in this school found most difficult.41


British Columbia provides similar information to assist teachers to interpret district assessment results.42 A series of tables indicating the proportions of students answering each test question correctly are provided. Also provided is the proportion of students who selected particular incorrect alternatives and commentary (where possible) on what students are typically doing when answering questions incorrectly.

The Grade 4 results for patterns and relations are listed below. The State numeracy assessment addresses number, patterns and relations, shape and space, and statistics and probability skills.


Item 2 (53% correct): A word problem involving division and finding a remainder (dividing a number of objects into sets of a given size and finding how many are left over). Comments on incorrect responses: more than one quarter, 28%, subtracted rather than divided; 13 per cent divided correctly but found an incorrect remainder.

Item 21 (55% correct): A word problem involving multiplication, subtraction and division (finding the greatest number of an item which can be purchased with change from an earlier purchase). Comments on incorrect responses: common errors were incorrect calculations, 19%, ignoring part of the information, 10%, and using only part of the information in the calculations, 10%.

Item 25 (72% correct): Find the number of missing votes in a class of students using information shown in a tally.


Encouraging the use of data at all levels of the system: a case study

British Columbia engaged an interpretation panel of representatives of educational and community organisations to review and comment on the results of the 1999 State assessments.43 As well as commenting on strengths and areas requiring attention, the panel made recommendations regarding steps that could be taken to improve BC students' reading comprehension, writing and numeracy skills.

Strategies were suggested at every level of the education system. Some of the recommendations are listed below:

To teachers, principals, and superintendents

increase the amount of direct instruction in reading from Kindergarten to Grade 12

emphasize that all teachers should teach reading strategies in all subject fields, not just in English Language Arts

select reading materials that will engage boys

encourage students to write daily

increase emphasis on applications of mathematics and problem solving

develop math intervention programs similar to reading intervention programs

To the ministry

increase access to ESL programs

provide updated writing reference sets (a classroom assessment resource) with samples of writing from provincial assessments

provide additional support for implementation of mathematics Integrated Resource Packages

To parents and guardians

encourage children to read regularly and read to your children regularly

promote numeracy and problem solving in the home

emphasize to children the importance of mathematics in our lives

To teacher education programs

require that all education students take at least one course on the teaching of reading

require that all education students take a course in the teaching of numeracy

To educational researchers

increase research on strategies for teaching boys to read

look into how different types of learners acquire reading, writing and numeracy skills

research effective support strategies for numeracy

conduct research on the relationship between student achievement and computer use/physical activities


Example 30 School level: accountability sanctions and rewards

Using State and national achievement data to hold schools accountable for improving student learning is of increasing interest in developed countries.

Twenty-three States in the US have policies for intervening and mandating major changes in low performing schools, and 17 States grant this authority at district level. In some cases, this means that States or districts provide technical assistance and additional resources to help redesign or restructure chronically low performing schools. In some jurisdictions, schools have been reconstituted, which often involves replacing school principals and removing teachers. For example, in Kentucky low performing schools are assigned distinguished educators from other districts to assist in reform efforts. Schools that continue to drop far behind expectations are assigned state managers who evaluate all school personnel and make recommendations and changes to improve school performance.44

    Subgroup achievement in Texas

The US State of Texas disaggregates student achievement data to measure both schools' progress and the progress of students of different racial, ethnic and economic backgrounds.45 To make adequate yearly progress, schools must obtain an acceptable rating from the State's accountability system: a rating which requires at least forty per cent of all students and student groups to pass the Texas Assessment of Academic Skills, a dropout rate of no more than six per cent, and an attendance rate of at least ninety-four per cent. School districts can be disenfranchised and principals removed if sustained levels of performance are poor.
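The rating rule just described reduces to three threshold checks. The function and figures below are a hypothetical sketch of that logic, not the Texas system's actual implementation.

```python
# Hypothetical check of the "acceptable" rating criteria described above:
# >= 40% passing for all students and every student group, a dropout rate
# of no more than 6%, and attendance of at least 94%. Group names and
# figures are invented.

def acceptable(pass_rates_by_group, dropout_rate, attendance_rate):
    return (all(rate >= 40.0 for rate in pass_rates_by_group.values())
            and dropout_rate <= 6.0
            and attendance_rate >= 94.0)

school = {"all": 62.0, "economically disadvantaged": 44.5, "hispanic": 51.0}

print(acceptable(school, dropout_rate=4.2, attendance_rate=95.1))  # True
print(acceptable(school, dropout_rate=7.5, attendance_rate=95.1))  # False
```

Because every student group must clear the 40% bar, a single low-performing subgroup is enough to withhold the rating.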

    Achievement targets in Kentucky

The US State of Kentucky also has established a clear definition of adequate progress as part of the State accountability system.46 Student performance on assessments is classified: novice, apprentice, proficient, distinguished. Each classification is assigned a score (0, 40, 100, 140). The performance index for the school is defined as the average of these scores. A target score of 100 is set as a goal to be achieved within 20 years by all schools, and schools are expected to move a tenth of the way from their baseline performance toward the goal of 100 each biennium. For example, adequate progress for a school with a baseline index score of 30 would have a goal of 37 after two years (ie 30 plus 10% of 70, the difference between the baseline of 30 and the long-term target of 100).
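This arithmetic can be sketched directly. The function names below are invented, but the classification scores, the averaging and the one-tenth rule follow the description above.

```python
# Kentucky-style adequate progress arithmetic, as described in the text.

SCORES = {"novice": 0, "apprentice": 40, "proficient": 100, "distinguished": 140}
TARGET = 100  # long-term goal to be reached by all schools

def performance_index(classifications):
    """School index: the average of the scores assigned to classifications."""
    return sum(SCORES[c] for c in classifications) / len(classifications)

def biennial_goal(baseline):
    """Move a tenth of the way from the baseline toward the target of 100."""
    return baseline + (TARGET - baseline) / 10

print(biennial_goal(30))  # 37.0 -- the example given in the text
print(performance_index(["novice", "apprentice", "proficient", "apprentice"]))  # 45.0
```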

Schools that exceed the goals are eligible for financial awards and schools that fall behind are designated "in decline". The lowest performing schools, "schools in crisis", are those whose performance declines by more than five per cent of their baseline for two consecutive assessment cycles.


    School accreditation in Colorado

New Colorado education accreditation indicators include academic achievement indicators. For example, schools are expected to increase the percentage of 4th grade students scoring at the proficient level or higher by 25% within three years. A district at 40% Proficient or Advanced would need to improve to the 50% level within three years.47
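The Colorado target amounts to a 25% relative increase, ie multiplying the current proficient-or-higher percentage by 1.25. The helper below is a hypothetical illustration of that arithmetic.

```python
# A 25% relative increase within three years, as in the Colorado example:
# a district at 40% Proficient or Advanced must reach 50%.

def three_year_target(current_pct):
    return current_pct * 1.25

print(three_year_target(40.0))  # 50.0
```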

    Awards and recognition in North Carolina

The North Carolina Accountability Model for schools establishes growth/gain standards for each elementary, middle, and high school in the State. Schools that attain specified levels of growth/gain are eligible for incentive awards or other recognition.48 To be eligible for incentive awards, schools must not have excessive exemptions and must test at least 98% of their eligible students in K-8, and at least 95% of students enrolled in specific courses or grades in high school.

For example, Schools of Excellence make expected growth/gain and have at least 90% of their students performing at or above grade level. They are recognised in a Statewide event, receive a dated banner to hang in the school, and a certificate. In addition, they receive whatever incentive award they earn as having made either expected or exemplary gains. Schools making exemplary growth receive a certificate and financial awards of $1500 per person for certified staff and $500 per person for teacher assistants.
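The eligibility and recognition rules above can be combined into a single sketch. The thresholds come from the text, but the function and its return labels are invented, and the real model involves further categories and growth/gain calculations not shown here.

```python
# Hypothetical combination of the North Carolina rules described above.

def recognition(tested_pct_k8, tested_pct_hs, made_expected_gain, pct_at_grade_level):
    # Eligibility: test at least 98% of eligible students in K-8 and at
    # least 95% in specific high school courses or grades.
    if tested_pct_k8 < 98.0 or tested_pct_hs < 95.0:
        return "ineligible for incentive awards"
    # Schools of Excellence: expected growth/gain plus at least 90% of
    # students performing at or above grade level.
    if made_expected_gain and pct_at_grade_level >= 90.0:
        return "School of Excellence"
    return "eligible for incentive awards"

print(recognition(99.0, 96.0, True, 92.0))   # School of Excellence
print(recognition(97.0, 96.0, True, 92.0))   # ineligible for incentive awards
```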


Example 31 School level: accountability to parents and the public

Providing parents and the wider public with information about school achievement is of increasing interest in some countries. The most publicised strategy is to present results in the form of league tables. As well as providing information to parents and the public, it is assumed that league tables will generate competition and stimulate higher educational performance across the system.

    Some States are experimenting with other forms of public accountability. For example, in 2000, the Australian Capital Territory Department of Education reviewed how information about students' literacy and numeracy achievements was presented to parents and what information should be made publicly available. They presented five models for consideration, including publishing information about average school results on Statewide tests, the distributions of students' results on Statewide tests for each school, the performance of schools over time in relation to a base year, schools' results against literacy and numeracy standards, and the progress of groups of students through school.49

    For example, using the last model, information that indicates the extent of improvement in student performance from Years 3 to 5 and 7 to 9 is published. This enables the comparison of the rate of improvement across schools as well as that of individual schools.

    The intention is that this information will give parents, carers and the community an indication of progress over time, and whether mainstream and intervention school programs are actually making a difference to students' learning. Student movement between schools needs to be taken into account to provide accurate data.

    [Figure: Comparison of School Performance — average school and system results at Year 3 (1997) and Year 5 (1999)]
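The gain comparison behind the chart can be sketched in a few lines. All scale scores below are invented for illustration (they are not ACT data); only the Year 3 (1997) to Year 5 (1999) framing comes from the text:

```python
# Compare a school's Year 3 -> Year 5 gain with the system's gain.
# All numbers are hypothetical scale scores, invented for illustration.
school = {"year3_1997": 410.0, "year5_1999": 470.0}
system = {"year3_1997": 420.0, "year5_1999": 468.0}

school_gain = school["year5_1999"] - school["year3_1997"]
system_gain = system["year5_1999"] - system["year3_1997"]

# A school can start below the system average yet improve faster than
# the system between the two testing points.
print(f"school gain: {school_gain}, system gain: {system_gain}")
```

This is the sense in which the model compares rates of improvement rather than levels of achievement.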


    Over recent decades, a great deal has been learned about the ways in which large-scale assessment programs convey values and impact on practice, and about the unforeseen and unintended consequences of particular approaches to assessment. With the increasing emphasis in some countries on the use of student achievement data for accountability as well as improvement purposes, a new set of concerns has arisen.

    There is a general concern about the emphasis placed on test scores:

    In mandating tests, policy makers have created the illusion that test performance is synonymous with the quality of education.51

    Technocratic models of school reform threaten to turn accountability into a narrow, mechanistic discussion based on numbers far removed from the gritty reality of classrooms.52

    And, consequently, recommendations have been made to consider multiple indicators of performance:

    Don't put all of the weight on a single test when making important decisions about students and schools (ie retention, promotion, probation, rewards). Instead, seek multiple indicators of performance. Include performance assessments and other indicators of success such as attendance, students taking Advanced Placement courses, etc.53

    Specific concerns raised through systematic research about the reporting and evaluation of systemwide assessment data are summarised in the following table and discussed in detail below.


    Example 32 School level: holding students accountable

    Some States of America are using achievement results to hold students more accountable. In an effort to end social promotion, a number of States require districts and schools to use State standards and assessments to determine whether students can be promoted. For example, in Chicago, students who perform below minimum standards at key transition grades must participate in a seven-week summer bridge program and pass a test before moving on to the next grade. In 1997 about half of the 41 000 students who were required to attend the summer program passed the test. They showed an average one-and-a-half-year gain in their reading and mathematics scores.50

    WHAT CONCERNS HAVE BEEN RAISED?


    Over interpreting improvement in test scores

    The problem: It is possible for test scores to go up without an increase in student learning in the domain the test addresses. This can happen if teachers teach to a non-secure test. What came to be known as the Lake Wobegon Effect (below) is an example of inflated impressions of student achievement.


    CONCERNS RAISED ABOUT THE REPORTING AND EVALUATION OF SYSTEMWIDE ACHIEVEMENT DATA

    Over interpreting improvement in test scores (this page)
    Over interpreting systemwide trend data (page 39)
    Underestimating the negative impact of programs on teacher behaviour (page 40)
    Underestimating improvement and accountability tensions (page 40)
    Underestimating the negative consequences of accountability measures (page 40)
    Overestimating the strength of league tables (page 42)
    Underestimating the problems of value added measures (page 42)
    Assuming that summative information will inform teaching (page 43)
    Ignoring the role of teachers in reform (page 44)

    Example 33 The Lake Wobegon Effect: inflated impressions of student achievement

    The mushrooming of standardised tests started in the US in the 1970s with minimum competency testing. By 1987 John Cannell, a physician in West Virginia, noticed that many States and schools were claiming that their students were above average. An investigation revealed that students' scores almost everywhere were above average. Cannell concluded that

    standardized, nationally normed achievement tests give children, parents, school systems, legislatures and the press inflated and misleading reports on achievement levels.54

    In his assessment of Cannell's concerns, Robert Linn summarised his response:

    There are many reasons for the Lake Wobegon effect ... among the many are the use of old norms, the repeated use of the same test year after year, the exclusion of students from participation in accountability testing programs at a higher rate than they are excluded from norming studies, and the narrow focusing of instruction on the skills and question types used on the test.55


    A solution: Assessments can lead teaching without a negative impact if the assessments are standards-referenced; that is, if the tests are systematically constructed to address publicly available standards. Teachers then teach to the standard, not to specific test items.56

    The problem: Rises (or falls) in test scores, which make politically attractive headlines, may be insignificant due to sampling or measurement error, or based on invalid comparisons where there have been changes in the testing format, administration (eg testing time allowed), or exclusion policies.

    For example, trend results from States and districts in the US that include a shift from an old to a new test show that where a new test replaces one that has been in use for several years there is a sudden drop in achievement. This drop is followed by a steady improvement (sawtooth effect).57

    A solution: When monitoring trends over time, it is important to report measurement errors and to ensure that like comparisons are made. In general, for monitoring purposes, the group mean is a more reliable statistic than the percentage of students achieving a particular performance standard or working at a particular level of a standards framework.
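The claim that a group mean is more stable than a percentage above a cut-score can be checked with a small simulation. Everything below is invented for illustration: a hypothetical score scale (mean 500, SD 100) with the cut-score placed at the mean, where the percentage-above statistic is at its most volatile:

```python
import random
import statistics

random.seed(0)

CUT = 500.0  # hypothetical cut-score on an invented 500/100 score scale

def one_cohort(n=200):
    """Draw one cohort; return (mean score, % at or above the cut-score)."""
    scores = [random.gauss(500, 100) for _ in range(n)]
    return statistics.fmean(scores), 100 * sum(s >= CUT for s in scores) / n

# Simulate many cohorts and compare the sampling volatility of the two
# statistics, each relative to its own scale.
means, pcts = zip(*(one_cohort() for _ in range(500)))
rel_mean = statistics.stdev(means) / 500  # ~7 points on a 500-point mean
rel_pct = statistics.stdev(pcts) / 50     # ~3.5 points on a 50% base

print(f"relative volatility: mean {rel_mean:.3f}, percentage {rel_pct:.3f}")
```

Year-to-year wobble in a percent-above-standard figure can therefore be several times larger, relative to its scale, than wobble in the mean, which is the sense of the advice above.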

    Over interpreting systemwide trend data

    The problem: When systemwide trends are being monitored over time, average score increases at the year levels tested are sometimes observed. These increases do not necessarily indicate that schools are performing better over time; for example, there may be no increase over time in cohort growth between the year levels tested. Disaggregating summary data by school can also give a distorted picture of growth. A NAEP study (example 34) illustrates these problems in the context of national monitoring, but the same issue arises in the context of other systemwide monitoring programs.


    Example 34 NAEP: over interpreting summary trend data

    A redesign of NAEP in the early 1980s allowed the tracking of a cohort of students, in addition to measuring the level of 4th, 8th and 12th grade students at a given time. In most cases, the students' average NAEP scores were slightly higher at each grade level than they were 20 or 25 years ago. However, the cohort growth between the fourth and the eighth grade was the same as, or lower than, it was during the earliest period for which there are NAEP data. What should be concluded? Is the education system performing better or worse over time?

    When the achievement of States was compared, there was little difference in the cohort growth between the fourth and eighth grade. While the State of Maine scored highest in the nation and the State of Arkansas lowest, both States had the same cohort growth, 52 points on the NAEP scale (in mathematics) between the fourth and eighth grade. What should be concluded? Are Maine and Arkansas at the two ends of the school quality continuum, or are they actually equal?58
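The level-versus-growth distinction in the NAEP example comes down to simple arithmetic. The grade-level scores below are invented placeholders; only the 52-point growth figure comes from the text:

```python
# Two states with very different score levels but identical cohort growth.
# Grade-level scores are hypothetical; the 52-point growth matches the
# NAEP mathematics example in the text.
states = {
    "Maine":    {"grade4": 232, "grade8": 284},
    "Arkansas": {"grade4": 210, "grade8": 262},
}

for name, scores in states.items():
    growth = scores["grade8"] - scores["grade4"]
    print(f"{name}: grade 8 level = {scores['grade8']}, cohort growth = {growth}")

# Ranking states by level puts Maine first and Arkansas last; ranking by
# how much the cohort grew between the two grades makes them equal.
```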


    A solution: Care needs to be taken when drawing inferences from summary statistics.

    Underestimating the negative impact of programs on teacher behaviour

    As well as providing useful information for educational decision making, large-scale assessment programs play an important role in communicating values and expectations and have an influence on teacher behaviour. It is dangerous to think that assessment is a neutral measuring instrument which only requires further technical developments to make it more effective.59

    The problem: Poorly designed assessment systems may provide little support to learning and, at worst, may distort and undermine curriculum intentions, encourage superficial learning, and lower students' sights on satisfying minimal requirements.60 If sanctions are attached to test results, then teachers typically emphasise what is being tested, thus narrowing and fragmenting the curriculum. The US experience with minimum competency testing (opposite) provides an example of the unintended negative consequences of assessment programs.

    A solution: Well-designed assessment systems, which do not focus only on the achievement of minimally acceptable standards, can reinforce curriculum intentions, bringing the intended curriculum and the actual curriculum into closer alignment. They can provide a basis for valuable conversations among teachers about learning and its assessment, and between teachers, students and parents about individuals' current levels of progress, their strengths and weaknesses, and the kinds of learning experiences likely to be effective in supporting further learning.

    Underestimating improvement and accountability tensions

    The problem: Developing tests that can be used to hold schools accountable and also to improve instruction may result in a conflict of design and place undue pressure on teachers and schools. Tests used for accountability purposes are usually limited in scope and are thus incapable of providing a comprehensive picture of student achievement. Also, if test results are to be used to inform teaching, then they need to be administered early in the year; tests for accountability purposes need to be administered at the end of the year.

    If assessment results are used to draw conclusions about the performances of individual teachers or schools, or to allocate resources, then schools may attempt to manipulate data. For example, in Chile, some schools, realising that their rank in the national league tables depended on the reported socio-economic groupings of their students, overestimated the extent of poverty among their students to help boost their position.61

    A solution: Clarify the different purposes of systemwide tests and provide teacher professional development to assist teachers to use centrally-collected data to inform teaching. Ensure that the indicators on which decisions are made are incorruptible. Monitor schools' responses to the assessment program.

    Underestimating the negative consequences of accountability measures

    The problem: Some evidence suggests that the unintended negative effects of high stakes accountability uses often outweigh the intended positive effects.62

    For example, those opposed to the threat of reconstitution of schools in the


    United States argue that it is a strategy which blames teachers for school failure, demoralising the profession, while doing little to solve the underlying problems that contribute to low performance. (Those in favour of the strategy believe that the threat of reconstitution helps to motivate improvement, particularly in low level or probationary schools. Improvement in these schools is cited as evidence of the positive effect.) Evidence suggests that the impact will be positive or negative depending on the circumstances, which include strong leadership, collective responsibility, a clear break with the past, and professional development and capacity building.66

    A solution: Continue to monitor the unintended consequences of high stakes accountability assessments.


    Example 35 US minimum competency testing: unintended negative consequences

    Minimum competency tests were introduced in the 1970s and 1980s to establish whether students were achieving the minimum levels of knowledge and skill expected of students in particular grades (for example, end of high school). As many commentators have observed, a common response by American teachers to minimum competency tests was to focus their teaching efforts on the foundational skills assessed by these tests and to concentrate their attention on students who had not yet achieved these skills. This was sometimes at the expense of extending the knowledge and skills of higher achieving students. According to some writers, these tests not only constrained classroom teaching, but also had dubious benefits for the students they were designed to serve:

    Minimum competency tests are often used as a policy tool to require that students meet some basic level of achievement, usually in reading, writing and computation, with the intention that the use of these tests will lead to the elimination of these educational problems ... [However,] the empirical findings show that the negative consequences far outweigh the few positive results ... For example, Griffin and Heidorn (1996) showed that minimum competency tests do not help those they are most intended to help: students at the lowest end of the achievement distribution ... There have been several studies focusing on the effects of minimum competency testing on curriculum, teaching, and learning. Most of these studies have been critical of their negative effects on curriculum and instruction.63

    Other writers are less damning, pointing to evidence of a relationship between minimum competency testing and improved performances among lower-achieving students.64 Nevertheless, because minimum competency tests generally were perceived not to have been effective in raising educational standards, by the 1990s there was a trend away from large-scale tests focused on the achievement of minimally acceptable standards to tests focused on newly valued world class standards.65


    Overestimating the strength of league tables

    The problem: The evidence suggests that the unintended negative consequences of league tables outweigh any positive contribution they might make. Where league tables do not take students' intake characteristics into account, schools can be blamed or praised for achievement on the basis of factors beyond the influence of the school.

    Using pupil assessment to place schools in rank order, as a way of forcing change in the curriculum or in teaching methods, is recognised as being unfair unless some measure of value-added by the school is used. Even then, ranking may alienate those at the bottom and fail to motivate those at the top of the order; it does not support the message that all schools can improve. Giving schools individual targets is heralded as a way of ensuring that all schools are expected to contribute to raising standards.

    Also, media enthusiasm for the use of league tables often conflicts with the public's interest in having clearly understandable information. Example 36 illustrates the problem.67

    A solution: Discourage the use of league tables. If schools are ranked publicly, make the limitations of the data clear.

    Underestimating the problems of value added measures

    The problem: Value added measures are complex, and a single index can give a distorted picture of school performance.

    .. value added measurement of performance is a very interesting and important development in education, but it has some drawbacks ... there is no valid value added measure which is simple to compute and understand, and at the same time gives true and useful insights into school performance. The statistical analysis is complex and not easily explained to the statistical lay-person. Furthermore, there is no magic number, and never will be, which can summarise all that is good or bad about a school, thus doing away with the need for careful and conscientious professional judgement based on a wide range of different kinds of evidence.69

    A solution: Use multiple indices of performance and continue to be aware of the limitations of value added information.
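What a value added index is, in its barest form, can be sketched in a few lines: regress outcomes on prior attainment and report each school's mean residual. All the data below are invented, and operational systems use far more elaborate multilevel, multi-cohort models; the sketch only shows why the index is a single derived number rather than a direct observation:

```python
# A bare-bones value added sketch: fit outcome ~ intake by least squares,
# then report each school's mean residual as its "value added".
# All scores are invented; real systems use multilevel models.
data = [
    # (school, intake_score, outcome_score)
    ("A", 40, 58), ("A", 55, 70), ("A", 60, 78),
    ("B", 45, 52), ("B", 50, 60), ("B", 70, 76),
]

xs = [x for _, x, _ in data]
ys = [y for _, _, y in data]
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)

# Least-squares slope and intercept for outcome on intake.
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# Value added = mean residual (actual minus predicted outcome) per school.
value_added = {}
for school in sorted({s for s, _, _ in data}):
    residuals = [y - (intercept + slope * x) for s, x, y in data if s == school]
    value_added[school] = sum(residuals) / len(residuals)

# School A sits above the regression line on average, school B below it,
# even though B's highest-intake student outscores most of A's students.
print(value_added)
```

Even in this toy version, the index depends on the model chosen (here a single straight line), which is one reason the quoted caution against a "magic number" applies.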


    Example 36 Media misuse of league tables

    In response to the introduction of the Tennessee Value Added Assessment System, Tennessee newspapers printed school league tables even where schools many rankings apart had negligible score differences. The newspapers did not report the evaluators' clear statement that school scores were unstable and could not be relied on for clear distinctions in performance.

    In 1996, one newspaper transformed the value added scores into percentile rankings, even though the technical documentation for the scores did not support the interpretation.68


    The problem: It takes time to collect value added information, and by the time information from a particular school has been analysed and provided to the school, the information refers to the achievements of students who entered that school several years previously. Its usefulness for making judgements about school effectiveness for future students may be dubious, especially if there have been staff changes.

    A solution: Where information is analysed on a yearly basis, make adjustments for prior contributing factors that extend over two or more years in time. Do not judge schools, or teachers within those schools, by the achievements of a single cohort of students, but on their performance over time.

    The problem: There is a lack of evidence for the positive impact of value added information.

    .. the enormous national investment in performance data has been something of an act of faith; we need further empirical evidence to help answer such questions as: How do staff in schools actually make use of value added data? Is the effort worthwhile? Do value added measures help to improve education in practice? Under what conditions and with what prerequisites? What kinds of professional development and support are necessary? All of this is crying out to be explored empirically by building up a systematic evidence base.70

    A solution: Monitor the impact of value added information.

    Assuming that summative information will inform teaching

    There is ongoing discussion, particularly in the United Kingdom, about the need to distinguish between formative and summative assessment purposes: formative assessment being assessment for learning (to feed directly into the teaching-learning cycle); summative assessment, the assessment of learning (for reporting purposes). The assumption that a single assessment can effectively serve both these purposes is contentious.71

    The problem: It is assumed that summative information will inform teaching. However,

    there is general agreement that where there are both formative and summative purposes [for assessment], there will invariably be a drift towards more emphasis on the summative functions which inform managerial concerns for accountability and evaluation. The formative functions, which inform teaching and learning, are likely to be diminished.72

    For example, in Chile, almost two thirds of teachers reported that they did not use the special manual that dealt with the pedagogical implications of the national test results.73 (Studies of teacher use of other systemwide achievement data may result in equally sobering findings.)

    A solution: Provide professional development for teachers to assist them to use achievement data to inform teaching.


    Ignoring the role of teachers in reform

    In the United Kingdom, much has been written about the black box of the classroom.

    In terms of systems engineering, present policy seems to treat the classroom as a black box. Certain inputs from the outside are fed in or make demands: pupils, teachers, other resources, management rules and requirements, parental anxieties, tests with pressures to score highly, and so on. Some outputs follow, hopefully pupils who are more knowledgeable and competent ... But what is happening inside?74

    The problem: The collection of student achievement data will not by itself improve standards. It has been known for a long time that the most effective way of improving the quality of education for individual pupils is for teachers in schools to evaluate what they are doing and to make the necessary changes.75

    A solution: Provide professional development for teachers to assist them to use achievement data to inform teaching.


    A CASE STUDY

    The US National Education Goals Panel reports progress on 33 indicators linked to eight National Education Goals. In the 1997 report two States, North Carolina and Texas, stood out for realising positive gains on the greatest number of indicators. An analysis of the reforms in both States was undertaken to identify the factors that could and could not account for their progress. The findings of the study are summarised here.76

    Factors commonly associated with student achievement which did not explain the test score gains included real per pupil spending, teacher/pupil ratios, the number of teachers with advanced degrees, and the experience level of teachers in the system.

    Two plausible explanations for test score gains were proposed: the way in which policies were developed, implemented and sustained (the policy environment); and the policies themselves.

    The policy environment

    Leadership from the business community

    In both States the business community played a critical leadership role in developing and sustaining reform, including funding organisations that brought together the business, education and policy-making communities. Business involvement was also characterised by the presence of a few business leaders who became deeply involved.

    Political leadership

    Political leadership was essential at critical points in the reform process. The passage of legislation involved coalitions from both parties, and the business community remained a consistent external voice.

    Consistency of the reform agenda

    Despite changes in Governors and legislators, the reform agenda has been maintained.


    The policies