DOCUMENT RESUME
ED 399 524 CS 012 607
AUTHOR Abdullah, Mardziah Hayati, Comp.
TITLE Standardized and Alternative Assessment. Hot Topic Guide 59.
INSTITUTION Indiana Univ., Bloomington. School of Education.
PUB DATE 96
NOTE 207p.
PUB TYPE Information Analyses (070); Guides - Non-Classroom Use (055)
EDRS PRICE MF01/PC09 Plus Postage.
DESCRIPTORS Annotated Bibliographies; Class Activities; Elementary Secondary Education; Liberal Arts; Performance Based Assessment; *Standardized Tests; *Student Evaluation; *Testing; Workshops
IDENTIFIERS *Alternative Assessment; Authentic Assessment
ABSTRACT
One of a series of educational packages designed for implementation either in a workshop atmosphere or through individual study, this Hot Topic guide presents a variety of materials designed to assist educators in designing and implementing classroom projects and activities centering on the topic of standardized and alternative assessment. The Hot Topic guide contains guidelines for workshop use; an overview of standardized and alternative assessment; and seven articles (from scholarly and professional journals) and ERIC documents on the topic. A 19-item annotated bibliography of items in the ERIC database on the topic is attached. (RS)
***********************************************************************
Reproductions supplied by EDRS are the best that can be made from the original document.
***********************************************************************
HOT TOPIC GUIDE 59
Standardized and Alternative Assessment
TABLE OF CONTENTS:
HELPFUL GUIDELINES FOR WORKSHOP USE
Suggestions for using this Hot Topic Guide as a professional development tool.

OVERVIEW/LECTURE
Standardized and Alternative Assessment
by Mardziah Hayati Abdullah

ARTICLES AND ERIC DOCUMENTS
The Need for a New Science of Assessment
Raising Standardized Test Scores and the Origins of Test Score Pollution
Performance-Based Assessment and Educational Equity
Assessment and the Morality of Testing
Assessment Worthy of the Liberal Arts
The Morality of Test Security
Testing and Tact
BIBLIOGRAPHY
A collection of selected references and abstracts obtained directly from the ERIC database.
Compiler: Mardziah Hayati Abdullah
Indiana University, Bloomington. School of Education. 1996

Series Editors: Carl Smith, Eleanor Macfarlane, and Christopher Essex

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it.
Minor changes have been made to improve reproduction quality.
Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

PERMISSION TO REPRODUCE AND DISSEMINATE THIS MATERIAL HAS BEEN GRANTED BY
C.
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

BEST COPY AVAILABLE
In-Service Workshops and Seminars:
Suggestions for Using this Hot Topic Guide as a Professional Development Tool

Before the Workshop:
Carefully review the materials presented in this Hot Topic Guide. Think about how these concepts and projects might be applied to your particular school or district.
As particular concepts begin to stand out in your mind as being important, use the Bibliography section (found at the end of the packet) to seek out additional resources dealing specifically with those concepts.
Look over the names of the teachers and researchers who wrote the packet articles and/or are listed in the Bibliography. Are any of the names familiar to you? Do any of them work in your geographical area? Do you have colleagues or acquaintances who are engaged in similar research and/or teaching? Perhaps you could enlist their help and expertise as you plan your workshop or seminar.
As you begin to plan your activities, develop a mental "movie" of what you'd like to see happening in the classroom as a result of this in-service workshop or seminar. Keep this vision in mind as a guide to your planning.

During the Workshop:
Provide your participants with a solid grasp of the important concepts that you have acquired from your reading, but don't load them down with excessive detail, such as lots of hard-to-remember names, dates or statistics. You may wish to use the Overview/Lecture section of this packet as a guide for your introductory remarks about the topic.
Try modeling the concepts and teaching strategies related to the topic by "teaching" a minilesson for your group.
Remember, if your teachers and colleagues ask you challenging or difficult questions about the topic, that they are not trying to discredit you or your ideas. Rather, they are trying to prepare themselves for situations that might arise as they implement these ideas in their own classrooms.
If any of the participants are already using some of these ideas in their own teaching, encourage them to share their experiences.
Even though your workshop participants are adults, many of the classroom management principles that you use every day with your students still apply. Workshop participants, admittedly, have a longer attention span and can sit still longer than your second-graders; but not that much longer. Don't have a workshop that is just a "sit down, shut up, and listen" session. Vary the kinds of presentations and activities you provide in your workshops. For instance, try to include at least one hands-on activity so that the participants will begin to get a feel for how they might apply the concepts that you are discussing in your workshop.
Try to include time in the workshop for the participants to work in small groups. This time may be a good opportunity for them to formulate plans for how they might use the concepts just discussed in their own classrooms.
Encourage teachers to go "a step further" with what they have learned in the workshop. Provide additional resources for them to continue their research into the topics discussed, such as books, journal articles, Hot Topic Guides, teaching materials, and local experts. Alert them to future workshops/conferences on related topics.
11/94
After the Workshop:
Follow up on the work you have done. Have your workshop attendees fill out an End-of-Session Evaluation (a sample is included on the next page). Emphasize that their responses are anonymous. The participants' answers to these questions can be very helpful in planning your next workshop. After a reasonable amount of time (say a few months or a semester), contact your workshop attendees and inquire about how they have used, or haven't used, the workshop concepts in their teaching. Have any surprising results come up? Are there any unforeseen problems?
When teachers are trying the new techniques, suggest that they invite you to observe their classes. As you discover success stories among teachers from your workshop, share them with the other attendees, particularly those who seem reluctant to give the ideas a try.
Find out what other topics your participants would like to see covered in future workshops and seminars. There are nearly sixty Hot Topic Guides, and more are always being developed. Whatever your focus, there is probably a Hot Topic Guide that can help. An order form follows the table of contents in this packet.
Are You Looking for University Course Credit?
Indiana University's Distance Education program is offering new one-credit-hour Language Arts Education minicourses on these topics:

Elementary:
Language Learning and Development
Varied Writing Strategies
Parents and the Reading Process
Exploring Creative Writing with Elementary Students

Secondary:
Varied Writing Strategies
Thematic Units and Literature
Exploring Creative Writing with Secondary Students

K-12:
Reading across the Curriculum
Writing across the Curriculum
Organization of the Classroom

Course Requirements:
These minicourses are taught by correspondence. Minicourse reading materials consist of Hot Topic Guides and ERIC/EDINFO Press books. You will be asked to write Goal Statements and Reaction Papers for each of the assigned reading materials, and a final Synthesis paper.

"I really enjoyed working at my own pace.... It was wonderful to have everything so organized... and taken care of in a manner where I really felt like I was a student, however 'distant' I was...."
--Distance Education student

Three-Credit-Hour Courses are also offered (now with optional videos!):
Advanced Study in the Teaching of:
Reading in the Elementary School
Language Arts in the Elementary School
Secondary School English/Language Arts
Reading in the Secondary School

Writing as a Response to Reading
Developing Parent Involvement Programs
Critical Thinking across the Curriculum
Organization and Administration of a School Reading Program

For More Information:
For course outlines and registration instructions, please contact:

Distance Education Office
Smith Research Center, Suite 150
2805 East 10th Street
Bloomington, IN 47408-2698
1-800-759-4723 or (812) 855-5847
Planning a Workshop Presentation Worksheet
Major concepts you want to stress in this presentation:
1)
2)
3)
Are there additional resources mentioned in the Bibliography that would be worth locating? Which ones? How could you get them most easily?
Are there resource people available in your area whom you might consult about this topic and/or invite to participate? Who are they?
What would you like to see happen in participants' classrooms as a result of this workshop? Be as specific as possible.
Plans for followup to this workshop: [peer observations, sharing experiences, etc.]
Agenda for Workshop Planning Sheet
Introduction/Overview:
[What would be the most effective way to present the major concepts that you wish to convey?]
Activities that involve participants and incorporate the main concepts of this workshop:
1)
2)
Applications:
Encourage participants to plan a mini-lesson for their educational setting that draws on these concepts. [One possibility is to work in small groups, during the workshop, to make a plan and then share it with other participants.]
Your plan to make this happen:
Evaluation:
[Use the form on the next page, or one you design, to get feedback from participants about your presentation.]
END-OF-SESSION EVALUATION
Now that today's meeting is over, we would like to know how you feel and what you think about
the things we did so that we can make them better. Your opinion is important to us. Please
answer all questions honestly. Your answers are confidential.
1. Check (✓) to show if today's meeting was
[ ] Not worthwhile   [ ] Somewhat worthwhile   [ ] Very worthwhile
2. Check (✓) to show if today's meeting was
[ ] Not interesting   [ ] Somewhat interesting   [ ] Very interesting
3. Check (✓) to show if today's leader was
[ ] Not very good   [ ] Just O.K.   [ ] Very good
4. Check (✓) to show if the meeting helped you get any useful ideas about how you can make positive changes in the classroom.
[ ] Very little   [ ] Some   [ ] Very much
5. Check (✓) to show if today's meeting was
[ ] Too long   [ ] Too short   [ ] Just about right
6. Check (✓) whether you would recommend today's meeting to a colleague.
[ ] Yes   [ ] No
7. Check (✓) to show how useful you found each of the things we did or discussed today.
Getting information/new ideas.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
Seeing and hearing demonstrations of teaching techniques.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
Getting materials to read.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
Listening to other teachers tell about their own experiences.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
Working with colleagues in a small group to develop strategies of our own.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
Getting support from others in the group.
[ ] Not useful   [ ] Somewhat useful   [ ] Very useful
8. Please write one thing that you thought was best about today:
9. Please write one thing that could have been improved today:
10. What additional information would you have liked?
11. Do you have any questions you would like to ask?
12. What additional comments would you like to make?
Thank you for completing this form.
STANDARDIZED AND ALTERNATIVE ASSESSMENT
Overview by Mardziah Hayati Abdullah
M.S. Language Education, Indiana University
We evaluate every day. We make judgments about the most common things without even realizing it when we comment on how a dish tastes, which TV program is more entertaining or how to perform a chore better. Understandably, we would expect evaluation to be an important part of a system in which so much national interest and expenditure is invested: education. Evaluation should inform educators about how well they have done or are doing, and it should indicate how to educate better. However, educational assessment has other social and political consequences which significantly impact society and thus provoke debate. This overview introduces some of the major issues surrounding educational assessment and provides a guide to further reading on particular aspects.
Trends in Assessment
The practice of evaluating human ability and performance in an organized manner has been around for centuries, dating as far back as 2000 B.C. when Chinese civil examinations were established in a move toward appointing civil servants based on meritocracy instead of unfair preference. Since then, trends in assessment have undergone changes in response to changes in the philosophy of education, assumptions about learning, and political ideologies.
In the 400's B.C., the Greek philosopher Socrates used oral conversational methods for examining rhetorical abilities in presenting and defending arguments. Much later, in the 1700's, early testing in the United States also involved oral examinations conducted by faculty who used their expert judgment to determine the quality of their students' performances, much like the oral defenses required of doctoral students today. Horace Mann started using written standardized tests in Massachusetts and Connecticut in the 1830's, leading to the development of Mann's Boston Survey in 1846, the first printed test for large-scale assessment of student achievement in various disciplines. The tests were discontinued because the results were not used.
Standardized examinations were again recommended after 1895 when Joseph Rice conducted tests in a number of large school systems that yielded large differences in math scores among schools. In the 1900's Thorndike, often called the father of the educational testing movement, persuaded educators to measure human change. There was a call for immediate, demonstrable results, and by 1915, testing had become the primary means of evaluating schools. A lack of trust in teachers developed because of the assumption that they were biased in evaluating students, and standardized norm-referenced tests emerged so that individual students' or groups' scores could be compared against those of other individuals or groups on a bell-shaped normal curve. Dewey's progressive education movement favored education based on natural learning and practical problems in the 1930's, and while that may have called for more authentic testing, standardized norm-referenced testing was still employed, leading to a boom in the technical development of tests to compare schools on a larger scale.
As educators realized that standardized test outcomes did not adequately reflect the complexity of learning, opposition to standardized testing grew. The 1980's saw a rise in the use of criterion-referenced tests that, in reality, are also standardized, but students are assessed against a set of criteria instead of against other students. By the 1990's, however, educators were raising and documenting more and more problems with standardized testing. In answer, educators explored alternatives to standardized testing which fall under the umbrella term alternative assessment. Different forms of alternative assessment are still being developed and tried enthusiastically by some educators, but they, too, face opposition. In Chapter 1 and part of Chapter 8 of the book Toward a New Science of Educational Testing and Assessment, which have been included in this Hot Topic guide, Berlak discusses some of the epistemological contentions in the history of evaluation.
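The contrast drawn above between norm-referenced and criterion-referenced scoring can be made concrete with a small sketch. It is only an illustration: the student names, the scores and the cut score of 60 are invented for the example, and the z-score is used here as one common way of locating a score on a normal curve, not as the procedure of any particular test.

```python
from statistics import mean, pstdev


def norm_referenced(scores):
    """Return each student's z-score: distance from the group mean,
    measured in standard deviations (a comparison against other students)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Everyone scored the same, so nobody differs from the group.
        return {name: 0.0 for name in scores}
    return {name: (score - mu) / sigma for name, score in scores.items()}


def criterion_referenced(scores, cut_score):
    """Return whether each student met a fixed criterion (the cut score),
    regardless of how anyone else performed."""
    return {name: score >= cut_score for name, score in scores.items()}


# Hypothetical data, invented for illustration only.
scores = {"Ana": 62, "Ben": 70, "Cal": 78}
z_scores = norm_referenced(scores)
mastery = criterion_referenced(scores, cut_score=60)

print(z_scores)  # Ana falls below the group mean, Cal above it
print(mastery)   # all three meet the criterion
```

In this sketch the same three scores rank the students against one another under the norm-referenced view, while under the criterion-referenced view all three simply meet the standard. Both procedures remain standardized, since every student is scored by the same rule.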
Testing and assessment
Before engaging in a discussion on standardized testing and alternative assessment, I think it is important to read Chapter 1 in the reading text for this guide, Assessing Student Performance: Exploring the Purpose and Limits of Testing, in which Wiggins (1993) makes an enlightening distinction between testing and assessment.
In a nutshell, a test is a "one-shot" procedure that is assumed to measure the test-taker's ability or knowledge. It requires the test-taker to provide uniform, 'correct' responses to items formulated and scored by other parties using criteria over which the test-taker has no say. Thus, Wiggins claims, to "test" a student is "a practice of determining whether the student has mastered what is orthodox" (p. 10). Tests are often "secure" (the contents kept secret) before they are administered. The role of human judgment in scoring is deliberately minimized. Standardized items and scoring also necessarily minimize responsiveness to individual test-takers and contexts. Assessment, in contrast, is "a comprehensive, multi-faceted analysis of performance; it must be judgment-based and personal" (p. 13). By Wiggins' definition, assessment requires the systematic collection of data about a student's performance using multiple and varied techniques, including assessors' observations. Unlike scores obtained from one-shot tests, assessment involves the "integration of (diverse) information in a summary judgment" about a student (p. 13) and necessitates the assessor's personal and subjective evaluation of a particular individual's performance in a particular context. The student has more say in his own assessment.
If one accepts the distinctions described above, then testing cannot be assessment, and vice versa. Wiggins' book clarifies aspects of evaluation that further distinguish the two, such as the practice of assessment (Chapter 2), test security versus tact (Chapter 4) and the nature of feedback (Chapter 6). (Note: Chapter 6 is not a required reading for this Hot Topic guide but is included in the one on Measurement Issues.)
Standardized testing
A standardized test is typically one in which the same questions, in the same form, are given to everyone who is being evaluated. The test may be scored by a machine or by different raters following a common rubric. On one hand, standardized testing procedures have made evaluation convenient by offering neatly packaged and often easily obtainable test items. Machine scoring of multiple-choice questions has also made scoring easier and free of human error, and even Wiggins acknowledges that there are virtues in minimizing bias and error in human judgment. However, more and more educators are presenting strong arguments against standardized testing. For instance, bias can also be built into the questions themselves, as test items inevitably reflect particular perspectives that may not be shared by test-takers. Thus, incorrect responses need not indicate erroneous understanding.
Berlak discusses four assumptions underlying what he considers to be the current psychometric paradigm of standardized testing, along with counter-arguments that support the emergence of a contextual or curriculum-sensitive paradigm. One assumption is that there can be a single, established consensus about what a test score means about individuals all over the world, while the argument is that there are "plural and contradictory" perspectives on what it means to be able or competent in a certain area. Secondly, the psychometric paradigm assumes that 'scientific' techniques and instruments are value-neutral and thus should yield objective measurements; this is countered by the assertion that there are different learning situations and culturally ingrained perspectives, making it morally wrong to measure students' performance without considering those factors. Another assumption is that cognitive learning can be separated from affective learning and measured likewise, but the contextual paradigm argues that cognitive ability cannot be assessed independently from other human factors. Finally, the psychometric paradigm sees a need for centralized control, whereas assessment reform calls for decentralized decision making.
Standardized testing has also raised important issues of equity, deprofessionalization of teachers and the negative use of test scores for purposes such as placement, tracking and grade retention. Pages 5-18 of Linda Darling-Hammond's article Symposium: Equity in Educational Assessment (included in this guide) present these issues convincingly, highlighting the need to consider the consequential validity of testing. Haladyna, Nolen and Haas, in their 1991 article, Raising Standardized Achievement Test Scores and the Origins of Test Score Pollution (found in this guide), point out yet another consequence: unethical practices that schools engage in as a result of society's concern with standardized scores as indicators of performance. The escalating academic demand has even led to readiness screening and grade retention being imposed on children as young as kindergartners (Shepard & Smith, 1988). With these and other criticisms being leveled against standardized testing, assessment, as Wiggins defines it, is increasingly being sought as an alternative evaluation approach.
Alternative assessment
Alternative assessment has come to be used as an umbrella term that covers authentic assessment and performance assessment, and incorporates techniques such as portfolio assessment, exhibitions of mastery, projects, profiles and discourse assessment.
Authentic assessment refers to the assessment of achievement or ability shown through performances which are modeled on real-life tasks or practices in beyond-school settings. It is performance-based as it evaluates performances such as the ability to integrate, apply or produce knowledge in contexts representative of or similar to real-life situations that (ideally) have personal value for the student. The forms and criteria of assessment are similar to those that might be used in real situations. Thus, authentic assessment is likely to have a combination of some or all of the following characteristics: ongoing, continuous, personalized, negotiable, with multiple indicators, and possibly conducted by more than one assessor, although this list is not exhaustive. Fred Newmann and Douglas Archbald discuss criteria for authentic assessment in Chapter 4 of the Berlak et al. book (not included in the guide). In Chapter 7 (also not included), the same writers describe practices for assessing academic achievement that meet one or more of those criteria.
Wiggins' book offers one of the best discussions available on what should constitute performance assessment. Gitomer (in press) defines a performance task as "one that simultaneously requires the use of knowledge, skills and values that are recognized as important in a domain of study and is qualitatively consistent with tasks that members of discipline-based communities might conceivably engage in." The alternative assessment techniques mentioned earlier represent performance tasks that vary in scope of content, method of initiation and method of presentation. Texts by the following (not required reading but listed under references at the end of this overview) provide more detailed descriptions of these techniques: Newmann in Berlak et al. (discourse assessment), Mabry (demonstrations of mastery), Stenmark (performance tasks), Archbald and Newmann (portfolios and profiles). (The last text is a required reading for the Hot Topic guide on Measurement Issues.)
How do alternative assessment techniques differ from standardized tests in language education? Discourse assessment, as described by Newmann, is an example: it is the assessment of a student's ability to present coherent, 'whole' (as opposed to fragmented) responses in discursive forms such as narratives or arguments, which reflect synthesized, personal, critical and contextually appropriate use of content and language. It also helps to understand what discourse assessment is not, namely, the checking of answers to questions, both of which are presented in a sequence that has no obvious meaning or purpose beyond that of testing discrete items of knowledge. Discourse assessment is in principle authentic assessment, as it requires a form of discursive performance that we engage in in real-life situations, such as when we write letters, argue to defend our position or choices, negotiate with others or present information. However, authenticity comes only with contextual appropriateness; hence we must consider the nature of the performance task and criteria. For instance, if students in a rural village (who are likely to become the future leaders) were facing the possibility of their only soccer field being used for the construction of teachers' quarters, an authentic assessment task would be for them to present strong oral or written arguments for stopping the housing project. The assessor must also be familiar with the discursive elements that are culturally appropriate for that situation; otherwise an inauthentic assessment will be rendered. A standardized test necessarily excludes such personalized attention.
Indeed, Wiggins and Mabry view alternative assessment within the framework of a personalized assessment paradigm. Going beyond aligning curriculum with assessment, which is the crux of a contextual paradigm, a personalized paradigm includes student-sensitive content, variable times and settings, greater student selection and the essential feature of self-assessment by students in addition to evaluation by others. The intent is to find out about a particular student in a particular setting over a period of time, and to give him the opportunity to show what he does best in the way he does it best. Undoubtedly, there are numerous difficulties associated with such a paradigm: teachers and students need far more power than they have at the moment, and they need to be very clear about the nature of performance assessment; schools, curricula and schedules have to be restructured to accommodate more flexible assessment practices; and society has to stop evaluating schools in terms of statistical scores. Despite the problems, however, personalized alternative assessment may at the moment be the only way to counter the absence of student-centeredness in standardized testing.
Rubrics and scoring
Assessment reform involves more than merely changing the nature of the tasks. A lot of thought has to be given to what kind of corresponding change there should be in the way performance tasks are scored. For example, some teachers vehemently oppose giving grades, so should grades be used at all? If they are not, how does one document student achievement? Keeping descriptive records of students' progress and performance is one option. Keeping portfolios of students' work is another. But if grades have to be given, what are the standards by which they are determined? There is no easy answer. For example, although the Vermont portfolio assessment program was an attempt to move away from standardized testing, its writing assessment still used an assessment rubric that required every student to meet the same specified criteria broken down analytically into dimensions such as purpose, details, organization and mechanics. Essentially, the students were still being evaluated in a standardized manner.
Test security
The perceived need for test security is another major difference between standardized testing and alternative assessment. The assumption that there are certain 'correct' answers necessitates keeping the content of tests confidential, while assessment calls for open negotiation between assessor and student.
One reasonable argument for supporting test security is Wiggins': open knowledge of test content may cause students to focus only on limited domains of learning. However, open performance-based assessment can be designed to ensure the need to integrate knowledge from various domains. Test situations that I can think of which may warrant security are ones which need to assess the appropriateness of spontaneous response, such as spontaneous medical decisions; in such cases, I feel it is justifiable to keep both test content and format unknown to the student before the test, but discussion on results must be open.
There are, however, many reasons why tests should not be secure. The first concern is with authenticity. Many forms of assessment in the real world beyond educational settings do not practice the kind of secrecy adhered to in school settings, particularly in standardized testing; thus, going through secure tests is not an authentic assessment experience. Security generally requires one-shot testing in artificially constructed settings (for example, secure rooms); consequently, the validity of interpretations made of students' achievement in secure tests can be questioned, since it is not an assessment of authentic use of knowledge.
Secondly, although test security is intended to ensure credibility and fairness in test administration, the irony is that secrecy actually leads us to challenge those very aspects. Secrecy cuts off dialogue between student and assessor and makes it impossible for students (as well as teachers and parents, in the case of state-wide tests) to raise objections or ask for clarification before tests are administered. Inevitably, this leads to doubts about the credibility of the tests, although objections are rarely entertained. In some situations, hours-long quarantine may be imposed on students before an exam in which there is suspected leakage of test information. Both the validity and reliability of student performance under such conditions can be seriously called into question; more often than not, however, no one does so, particularly not the powerless students, even though they are the real victims. Thus, the moral dimension of secrecy in testing is also a substantial cause for concern. In addition, to students who are denied knowledge of the standards and criteria by which they are judged, tests are both arbitrary and unfair. Secrecy in high-stakes testing also leads to high anxiety, which in turn often results in unethical practices, such as selling and buying 'leaked' test questions, and cheating during exams.
Thirdly, there can be very little or no instructional value when results are kept secret even after the test. Students are also denied consultation with or guidance from adults during assessment, nor do they have access to resource materials, restrictions that do not often exist in the real world. In addition, because the whole testing procedure is shrouded in secrecy and only selected personnel are allowed access to test planning and design (in standardized tests), test security leads to an over-reliance on and excessive belief in 'expert' assessment authorities and a distrust of teacher competence. Thus teachers have little control over testing imposed by strangers on their students, whom they know best.
Reliability and validity
The distinctions between standardized testing and alternative assessment are tied in to issues of reliability and validity. Both Wiggins' and Berlak's texts for this topic address these constructs to some extent. Reliability refers to the consistency or replicability of scores or performances, while validity, as Messick defines it, is the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment (Messick, 1989). Both standardized testing and alternative assessment attempt to achieve a degree of reliability and validity. Standardized tests, however, have a greater concern with reliability and are "intrinsically prone to sacrifice validity" (Wiggins, p. 4); they are usually simplified for the sake of precision in scoring, and thus lose the authentic complexity of real-life tasks. In addition, test security demands test-taking under artificial circumstances, as has been previously mentioned, rendering questionable the validity of the inferences one can draw from test scores produced under such circumstances. Alternative assessment, on the other hand, by focusing on more authentic ill-structured tasks, considering individual students to a greater degree, and allowing negotiation between the student and teacher, reduces the degree of reliability but raises the degree of validity of the inferences that can be made about the student's performance. The Hot Topic guide on Measurement Issues discusses these measurement constructs in greater depth.
Conclusion
Like most other problems that impact society in a significant way, the issues surrounding educational testing and assessment are not easily resolved because they are tied to ideology, politics and social systems. Standardized testing is familiar and comfortable, while numerous obstacles still have to be overcome in order to successfully reform assessment. Standardized test results are the basis for reproducing a stratified society, but true assessment reform, particularly assessment without traditional grades, may lead to the restructuring of society. The question is whether society is ready to move into a new paradigm.
References (not listed in this Hot Topic guide)
Books:
Mabry, L. (1992). Performance assessment. In Debra D. Bragg (Ed.), Alternative approaches to outcomes assessment (p.109-128). University of California, Berkeley: National Center for Research on Vocational Education.
Messick, S. (1989). Chap. 1 in Robert L. Linn (Ed.), Educational Measurement. NY: American Council on Education, Macmillan.
Stenmark, J.K. (1991). Mathematics assessment: Myths, models, good questions, and practical suggestions. Reston, VA: NCTM.
Articles:
Archbald, D.A. & F. Newmann (1992). Approaches to assessing academic achievement. In Harold Berlak, Fred M. Newmann, Elizabeth Adams, Doug A. Archbald, Tyrrell Burgess, John Raven & Thomas A. Romberg, Toward a new science of educational testing and assessment. Albany: State University of New York Press.
Newmann, F. & D.A. Archbald (1992). The nature of authentic academic achievement. In Harold Berlak, Fred M. Newmann, Elizabeth Adams, Doug A. Archbald, Tyrrell Burgess, John Raven & Thomas A. Romberg, Toward a new science of educational testing and assessment. Albany: State University of New York Press.
Shepard, L.A. & M.L. Smith (1989). Escalating academic demand in kindergarten: Counterproductive policies. The Elementary School Journal, Vol. 89, No. 2, p.135-45.
1
The Need for a New Science of Assessment
Harold Berlak
Introduction
The idea that schooling for all is essential for social progress and economic growth grew up alongside the development of industrial capitalism during the tail end of the nineteenth and early decades of the twentieth century. By the 1990s, the aspiration for universal schooling has come a long way toward realization, though many American youth still do not complete secondary school.¹
While universal provision of schooling is still widely seen as a noble, if unrealized goal, there is a growing consensus that the system of public education that has evolved over the course of this century in the United States is in serious trouble. Public officials, corporate leaders, and ordinary citizens are increasingly dissatisfied with the quality of the education provided by the nation's schools to the great majority of children. While the margins of the American political scene, left and right, have long been critical of schools (albeit with quite different ideas of the problems and solutions), with the exception of racial desegregation, discussions of elementary and secondary schooling policy over the last 25 years were virtually absent in the national media, in the platforms of the national political parties, or in campaigns for national, state, or even local public office. For brief interludes, following the launching of Sputnik in the late 1950s and in the mid-1960s during Lyndon Johnson's "war on poverty," public attention focused on schools, but this interest was not sustained.
This changed in 1983 with publication of A Nation at Risk, a report of the National Commission on Excellence in Education (1983). It made national news with its assertion that American education was threatened by "a rising tide of mediocrity," and with its frequently cited lines: "If an unfriendly foreign power attempted to impose on America the mediocre educational performance that exists today, we might well have viewed it as an act of war. As it stands, we have allowed this to happen to ourselves. ... We have, in effect been committing an act of unthinking unilateral disarmament."
Why this report received so much attention is a matter of some conjecture. Very serious problems, particularly in, but not restricted to, inner city and poor rural schools, had existed and been widely known for many years. In spite of the report's claims to the contrary, what had changed were not the problems² (though undoubtedly they had gotten worse) but the public's and elected officials' response. The reason for wide notice of A Nation at Risk had more to do with the particular historical moment it appeared than with the originality or profundity of its analysis. In the early eighties, the failures of the US economy had just begun to penetrate the nation's consciousness. Dominating the news were the galloping US trade deficit; the failures of US industry; plant closings; and dramatic increases in unemployment, particularly in the older industrial cities. What this report offered was an explanation for these apparently inexplicable events, an explanation which was eagerly embraced by the mainstream press and corporate America, and widely repeated in the national media. The report told the American public that a major cause, if not the major cause, of America's fall from grace as the world's pre-eminent economic and industrial power was the failure of the nation's schools to educate a competent, dedicated workforce.
This was a palatable diagnosis of the nation's economic malaise that suited the times. It placed blame, not on the basic structural problems of the US economy, nor on the failures of corporate leaders and politicians to address the changing world economy, and to do something to relieve the accumulating social problems and the gross disparities between rich and poor; but on the politically impotent: the nation's elementary and secondary school teachers, nameless educational bureaucrats, and unskilled and/or unmotivated workers.
A Nation at Risk was not the work of right-wing ideologues. Terrell Bell, who initiated the report, and who was appointed by Ronald Reagan as his first secretary of education, was at the time widely regarded as a middle-of-the-road professional, and the eighteen-member National Commission on Excellence Bell appointed included, among others, the retired chairman of the board of Bell Laboratories, two professors from Harvard and University of California at Berkeley respectively, four university presidents (including Yale), a former governor of Minnesota, the immediate past-president of the National School Boards Association, two principals, two school board members, the superintendent of schools from Albuquerque, and the 1981-82 teacher-of-the-year, a high school foreign language teacher from an affluent suburb of New York City.
Whatever its deficiencies, the Nation at Risk drew public attention to the schools, and this attention, contrary to the expectations of many, has continued to the present. The report and the wide attention it received stimulated responses from virtually every organization and group with an interest in educational policy. Since 1983 countless reports, articles, and books have been written or commissioned by every major foundation, dozens of minor ones, policy think-tanks across the political spectrum, associations of corporate executives and educational professionals, teachers' unions, children's and parents' advocacy groups, formal and ad hoc organizations of state and local educational officials, as well as by individual journalists and scholars. While there are major differences in the policy recommendations, very few reports contest the Nation at Risk's view of the economy, and none with dissenting views have received wide public notice.³
All this talk about education did, however, galvanize latent public discontent with the schools and create a political climate for change. Since 1983 virtually every governmental agency and administrative unit at the state, county, and school district levels that held some responsibility for elementary and secondary schools has initiated and implemented some reforms. State legislatures, governors, state and local education officers, the major foundations and think tanks, the two leading national teachers unions, and even the 1988 presidential candidates, Bush and Dukakis, felt the need to respond to the clamor for educational excellence.
Many of the responses can be passed off as media hype and political rhetoric. But there were also many concrete measures undertaken. I make no effort here to recount and analyze these efforts in any detail, a monumental undertaking far beyond the purview of this chapter. However, some effort to make sense of these intended reforms is essential if we are to understand the current movement for developing new forms of educational assessment and testing.
An Analysis of the Reform Movement: The Role of Testing
Two competing tendencies about how political decisions should be made and who should make them are represented by recent efforts to reform the nation's schools. One tendency is toward decentralization of authority and decision-making by those who are most immediately affected by those decisions. This view is often coupled with a distrust of centralized authority and a disdain for experts and intellectuals. From this perspective, "bottom-up" change is valorized along with direct, grassroots or participatory democracy.
The second tendency in this society is toward centralization of authority and decision-making, with responsibility for the difficult decisions left to the man or woman at the top: the CEO, the chief of staff. In the case of schools, the superintendent or principal must be a tough-minded leader, able to shape up the troops, delegate responsibility and hold subordinates accountable for their performance. Efficiency and immediate, demonstrable results are valorized, and while democracy is not necessarily rejected, it is representative democracy and delegation of authority to those who know best which is endorsed, with little tolerance for participatory democracy, which is seen as chaotic and in the end as encouraging the lowest common denominator in terms of process and product.
The relative strength of these two tendencies and the ambivalence many Americans feel about how to reform schools are evident in the multiplicity of proposals advanced and policies instituted since 1983.
The language that has dominated the discourse about school reform has been that of crisis, of disaster, of imminent threat to the very survival of the nation. I have already quoted A Nation at Risk with its military metaphors. Here are the words of A Nation Prepared, the second-most influential report, published by the Carnegie Forum on Education and the Economy (1986), created and supported by the Carnegie Corporation of New York:
America's ability to compete in the world markets is eroding. The productivity growth of our competitors outdistances our own. As jobs requiring little skills are automated or go offshore and demand increases for the highly skilled, the pool of educated and skilled people grows smaller and the backwater of the unemployable rises. Large numbers of American children are in limbo, ignorant of the past and unprepared for the future. Many are dropping out not just out of school but out of productive society. As in past economic and social crises, Americans turn to education. They rightly demand an improved supply of young people with the knowledge, the spirit, the stamina and the skills to make the nation once again fully competitive. (p. 2)
In times of national crisis, it is no surprise that the strongest impulse by politicians most directly responsible for schools is to use their authority by employing the tools they understand and know best. In the United States, basic responsibility for schools resides with the states. Eight years after publication of A Nation at Risk virtually every state had instituted a combination of top-down measures intended to raise educational standards. These measures include requirements for academic courses, new or strengthened controls over textbook adoptions, mandated use of state curriculum guidelines which in some instances are closely aligned to required tests, and more prescriptive regulations for certifying teachers. But, by far the most common measure is statewide testing programs throughout the grades that, in effect, increased the proportion of education dollars spent at the state level, and strengthened the control of the state's chief educational officer and/or state department of education.
While it is difficult to generalize about several thousand school districts, many, particularly the larger urban systems, responded much like state departments of education by tightening and centralizing bureaucratic control over curriculum, pedagogy, grading, student discipline, and personnel selection. In addition to the newly devised or revised state "basic skills" tests, and the standardized achievement tests which have been used for many years almost universally throughout the grades, some districts instituted their own district-wide tests, in some cases going so far as to specify textbooks for each grade level, and to link mandated tests to these texts.
The role of the federal government under Reagan-Bush is contradictory. On the one hand their administrations greatly reduced or eliminated programs supporting educational research and development, curriculum and staff development, as well as programs that aided particularly needy populations, using the justification that schools are primarily the responsibility of local and state governments. On the other hand, the Department of Education, whose elevation to cabinet-level status was bitterly opposed by Reagan and right-wing groups prior to 1980, in the ensuing years became an increasingly active instrument in efforts of right-wing forces within the federal government to shape local and state schooling policy through, for example, selective enforcement of and in some cases opposition to agreements reached by local and state school officials and the courts on civil rights issues, active advocacy of a national core curriculum, national assessment, and so-called "freedom of choice" plans which would, in effect, divert public funds to private schools.
Among the more visible efforts by the federal government to shape schooling practice is the annual media event staged by the secretary of education upon publication of the "wall chart," which ranks the states' educational performance based on standardized test scores. In some instances a form of this annual ritual is repeated by states publicizing rankings of school districts, and by the central administrations of school districts releasing to the press rankings of individual schools within districts.
What explains the enormous emphasis on tests? I have suggested that a primary reason for this emphasis is that tests are a means of maintaining centralized control, providing those higher up in the educational bureaucracy (central office administrators, school board members, state education officials, legislators, etc.) with relative rankings of organizational units (classrooms, schools, districts, etc.) and/or students and teachers. This, however, is not an adequate explanation since it does not account for widespread popular support for the use of tests. While there is increasingly vocal criticism of tests among professionals and by the national media, there is still remarkably little evidence of widespread discontent with current forms of testing. Indeed, many support increased testing, including African-American and Latino-American parents who are convinced that their children, who consistently score lower on standardized and criterion-referenced tests, have been and continue to be victimized by low expectations on the part of teachers and school officials. For many within these communities, the only credible indicator of improved educational performance is improved performance on standardized tests. The irony in this is that, while the demand for more professional accountability is certainly justified, any gains on such tests are often temporary and local. The technology of these tests assumes there will be winners and losers, and in our society the winners are invariably the more affluent and the losers the poor and powerless.
Efforts to reform schools from the center continue, but a counter tendency toward more democratic school-level control has become more visible recently for several reasons, including organized opposition to centralized control by teachers unions, parent groups, and local school boards, and a growing conviction that mandating changes from above has not worked. What a few years ago was a fringe view that genuine changes in the end must occur in individual classrooms, which is not possible without active participation of teachers and without a large measure of autonomy within each school, has become increasingly accepted as the common wisdom by the public policy establishment and the mainstream press.⁴ Several states, while tightening centralized control, have encouraged school-level decision-making by altering state regulations to permit principals and teachers more say about school expenditures, curriculum and staffing. Also several districts scattered across the country (New York City, Buffalo, and Dade County, Florida, are the most frequently mentioned in the press) not only tolerate but appear to foster school-level decision-making. However, although talk about, and arguments for, teacher empowerment and school-level governance are commonplace, it is the rare exception rather than the rule for central office bureaucracies to yield power.
This ambivalence over who should call the shots, the authorities at the center or the local school community, is probably nowhere more clearly exemplified than in the previously cited Carnegie report, A Nation Prepared. On the one hand, the report celebrates the role of the teacher and provides what it calls "a scenario," a hypothetical example of a high school run by the school staff in close collaboration with the local community. On the other hand, however, the report makes no recommendations as to how centralized administrative control by school districts or the state is to be relinquished. Its key and sole concrete proposal is creating a new National Board for Professional Teaching Standards which would, in effect, centralize the certification of an elite cadre of master or lead teachers whom they assume would transform the schools.
If there is any consensus after almost eight years of intensive public discussion and activity, it is that tinkering with regulations and issuing more administrative mandates will not suffice, and that what is needed is perestroika, a basic restructuring of the entire system. Restructuring is one of those words, like democracy and accountability, that have an inexhaustible number of possible meanings, each aflame with ideological passion. At very least it implies an unfreezing of the central office bureaucracy and a shift in authority and the power of decision-making from existing to new formations.
In spite of the calls for perestroika, decentralizing authority, and empowering teachers and principals to institute changes from below, there has not been any wide-scale restructuring of the system. Except for some well-publicized exceptions, the evidence is that, overall, the system has become more and not less centralized over the past eight or so years (Sarason, 1989).
While there are several interconnected factors at work, one, if not the single most significant, in holding the current system in place, indeed in strengthening the current structures, is testing. Not any tests, but the particular forms of standardized and criterion-referenced testing which have become the main instruments of reform. Here we have the major paradox of the reform movement of the eighties: significant improvements in the quality of schooling are impossible without structural changes, but increased dependence on mass-administered tests at all levels has had the effect of strengthening existing structures and forms of control.
The culprit is not educational assessment and testing per se. Rather, the argument I make here and in Chapter 8 is that the particular forms of testing in widest use for increasing accountability are rooted in a social science paradigm which takes as a given the necessity for centralized control.
Use of such tests is not the sole cause for the failures to restructure schools. Reforming schools or any social institution is a complex business. It requires a commitment by national, state, and local public officials, and professional educators to critically examine their own longstanding practices and patterns of organizational control. It takes persistence and inordinate courage by leaders and governing bodies to dislodge entrenched, centralized bureaucratic power. If we know anything at all about politics and human behavior, it is that many endorse the need for change, but few risk challenging the many vested individual and institutional interests in maintaining business-as-usual. There are thousands of organizational entities, and tens of thousands of individuals within national and state governments, colleges and universities, foundations, publishing