ERIC/RCS: Recent Developments in Testing Reading Comprehension
Author(s): Karl Koenke
Source: The Reading Teacher, Vol. 33, No. 4 (Jan., 1980), pp. 506-510
Published by: Wiley on behalf of the International Reading Association
Stable URL: http://www.jstor.org/stable/20195057
Accessed: 28/06/2014 15:15
This content downloaded from 46.243.173.151 on Sat, 28 Jun 2014 15:15:28 PM. All use subject to JSTOR Terms and Conditions.
ERIC/RCS
Recent developments in testing reading comprehension
KARL KOENKE
Though Roger Farr's [ED 033 258] Reading: What Can Be Measured?, published in 1969, remains the profession's most complete statement on the testing of
reading ability, a number of studies available through the ERIC system raise the
possibility of major changes in the testing of reading comprehension during the 1980s and suggest more immediate adjustments in current testing practices.
A chance for change
A number of reports from the Center for the Study of Reading, funded by the
National Institute of Education (NIE), point to both a need and a possibility for
change in testing reading comprehension. Durkin's [ED 162 259] study of reading instruction in 39 classrooms in central Illinois suggests that instruction in and
assessment of reading comprehension are often identical. Observations by Durkin
and her assistants reveal that to develop reading comprehension teachers use
workbook pages and ask students questions. There is little if any explanation of
methods to be used in answering questions; the emphasis is on correctness of the
response. That is assessment – testing.
One result of this study could be a redoubling of efforts by writers and
publishers of manuals and materials to provide the teacher with help in meeting Durkin's criterion for comprehension instruction: "Teacher does/says something to help children understand or work out the meaning of more than a single, isolated word" (p. 8).
It is also possible that test makers will be influenced by the finding that teachers ask many questions. Perhaps there will be an even greater emphasis on
listening comprehension and oral response. That would mean more use of check
lists or tape recorders. In addition, some test makers might be influenced by
Durkin's definition of comprehension assessment: "Teacher does/says something in order to learn whether what was read was comprehended. Efforts could take a variety of forms – for instance, orally-posed questions; written exercises; request for picture of unpictured character in a story" (p. 11).
Anderson and others [ED 157 036] have described their attempts to develop and pilot a model for building domain-referenced tests of reading comprehension.
506 The Reading Teacher January 1980
This is the type that the National Assessment of Educational Progress (NAEP) has been using, but without the development strategies and controls made
apparent in the Anderson and others paper. Domain-referenced assessment,
according to Petrosky [ED 159 599] in a discussion of the third NAEP assessment of reading comprehension, provides the best standardized procedure for finding out what students know and what teachers and schools can do. This assessment
procedure eliminates the ambiguity about the meaning of test scores by sampling representative items from a well-defined set of tasks, by referring to the logical
relationship between a set of items in a test and a well-defined domain
represented by those items, and by estimating the kinds of behavior students are
capable of within a defined domain.

Anderson and others piloted their model for building a domain-referenced
reading comprehension test within the domain (or what might be called the skill) of finding the main idea. The variables within the passage that they chose to
manipulate or control are quite interesting: whether the main point is stated, the
frequency with which it is stated, the number of supporting ideas, cues to the main point statement, cues to details, vocabulary, syntax, structure, style, length
of passage, and number of major points in the passage. Test makers in general
have a long way to go to reach the specificity in testing that Anderson and others
require.
Royer and Cunningham [ED 157 040] show that comprehension theory contributes to the measurement of reading comprehension. They espouse neither
the "bottom-up" theory (major concern and control of the written material), nor
the "top-down" theory (major concern and study of the reader). Rather they stress the interaction between the reader and the material. Their analysis of
existing tests led to the conclusion that test passages are likely to draw broadly from the reader's knowledge of the world. Failure to comprehend, therefore, can
be attributed to either a lack of specific knowledge or lack of skill development. Existing tests do not distinguish between these two factors. Royer and
Cunningham conclude that existing reading comprehension tests are acceptable as
predictive instruments but are not useful for diagnosis or assessment of gain. The
researchers caution that some disadvantaged children might not be tested fairly when existing tests are used to show gain in skill development. They suggest
creating tests that match students' prior knowledge with the topical content of the
test passages, and that control for reasoning and inferential ability on the part of
the reader. Though these factors cannot be completely excluded, controls are
necessary to more clearly specify what it is that the tests do test.
Adjusting present practice
In addition to the possibility of extensive changes in the longer term, the
current literature on the testing of reading comprehension presents varied
suggestions for adjustments. Jenkins and Pany's [ED 134 938] analysis of five
reading achievement tests and seven basal reading series shows that there is
discernible curriculum bias in the standardized reading tests. Scores on the
Paragraph Meaning Subtest of the Stanford Achievement Test (SAT), for
example, are related to the vocabulary taught through the second grade in various
basal series. If a child has been taught the words in the Ginn 360 series, a grade
equivalent score of 1.9 is expected on the SAT Paragraph Meaning Subtest; for the Macmillan series, 2.5; for both the SRA series and the Bank Street Readers, 2.9; for the Sullivan series, 3.1; for Economy, 3.2; and for the Houghton Mifflin
series, 3.4. From high to low, the expected scores vary by a year and a half
(Jenkins and Pany, p. 21).
Teachers must understand the specific relationship between the tests they use
and those used by others. Cumulative records or clear communication between
schools and teachers is critical. For example, a teacher measuring the reading
comprehension of a new class with the SAT may test one child who has been
reading in the Ginn 360 series for two years and another who has been
reading in Houghton Mifflin for the same length of time. The teacher might place the first child among those children who have not succeeded as well as expected,
while considering the second child quite talented. However, the real difference
between the two children is the specific words they have learned as a result of
their basal reading series. The need for adjustment seems clear.
Jenkins and Pany also point out that teachers should be aware of the difference
between scores on common achievement tests and those from the following tests
used by reading teachers and others: Peabody Individual Achievement Test
(PIAT), Slosson Oral Reading Test (SORT), and Wide Range Achievement Test
(WRAT). There are differences across tests and across basal series. The second
grader who achieved a 1.9 on the SAT comprehension section after two years in
the Ginn 360 program would be expected to score 2.5 on the WRAT, 2.7 on the
SORT, and 2.8 on the PIAT. The child who achieved 3.4 on the SAT after two
years in the Houghton Mifflin series would be expected to score only 2.9 on the
WRAT, but 3.4 on the SORT, and 3.8 on the PIAT.

As a result of another study of the WRAT and the PIAT, Harmer and
Williams [ED 160 982] conclude that the two tests should be used interchangeably only with caution and understanding of the differences. While there is a moderate
to high correlation between the test scores, the two instruments have distinctly
different strengths and weaknesses. The WRAT tends to be more accurate for
younger children (kindergarten and primary grades). They concluded that differences in test content, testing procedures, test format, and the method of
scoring may account for the differences in grade equivalent scores.
Another analysis of reading comprehension tests has been undertaken by
Carroll [ED 155 616], who focuses on the construct of comprehension that each
of seven reading comprehension tests claimed to measure. The 367 items from
these tests were classified by two judges according to four taxonomic levels:
recognition and recall, inference, evaluation, and appreciation. Inference items
and recall tasks together accounted for 81% to 100% of the items on the various
tests. Evaluation and appreciation items were not only less numerous, they were
also less well developed. (After finding a low rate of interjudge agreement in item
categorization, Carroll also concluded that there is a need to control the
subjectivity in this type of analysis.)

A few other studies that might serve as a basis for the adjustment of our
present practices in testing reading comprehension can be briefly mentioned.
Fleisher and others [ED 159 664] found that the ability to read words and
phrases in isolation quickly did not affect reading comprehension scores. Once
again, this exemplifies the idea that a child can "call" words quickly without
understanding what is read. The use of subtests involving flashed phrases or
word lists must therefore be carefully thought through.
Parrish and others [ED 159 648] discuss the development of culture-fair,
informal reading inventories, using as an example an informal reading inventory
reflecting the experience of a Spanish-speaking child. The idea for this approach came as a result of Parrish's teaching in the South Pacific where there are
obvious differences in culture between the writers of the tests and the children
who actually take them.
In another study, Asher [ED 159 661] found that children, regardless of race,
comprehend high-interest material better than low-interest material and that there
is a high degree of overlap in interests across races. This finding and that of
Steffensen and others [ED 159 660], who also have studied the effects of
experience on comprehension, seem to support Parrish's ideas concerning
specially written informal reading inventories.
It seems fitting to conclude with a quote from Roger Farr [Owoc ED 159 628]:
"To be most useful, tests should be dictated by curriculum. Those who administer
tests should be fully familiar with the instructional program that teaches the
behavior the test purports to measure. This means understanding the major
features of the program and their sequence in the program in relation to the goals
of the program. It also means knowing the particular students – their interests,
backgrounds, abilities, and needs – that the program instructs" (p. 6).
Obtaining ERIC Materials
The Educational Resources Information Center Clearinghouse on Reading and Communication Skills (ERIC/RCS) is one of sixteen ERIC clearinghouses. It is jointly sponsored by the National Institute of Education and the National Council of Teachers of English.

ED (ERIC Document) numbers identify materials announced in monthly issues
of Resources in Education (RIE). Except as otherwise noted, these materials have
been reproduced in microfiche and may be viewed at libraries that contain ERIC
microfiche collections or may be purchased (in microfiche or paper copy) from
the ERIC Document Reproduction Service. (Some are also available from the
publisher mentioned.) For complete ordering information and current prices
(based on number of pages), consult the most recent issue of RIE or write to
ERIC/RCS, 1111 Kenyon Road, Urbana, IL 61801, U.S.A. Prices should be obtained before placing orders.
EJ (ERIC Journal) numbers identify journal articles announced in monthly issues of Current Index to Journals in Education (CIJE). These articles may be
obtained from libraries; many are also available as reprints from University Microfilms International. For current information on prices and availability,
consult the most recent issue of CIJE.
CS numbers identify documents recently acquired by ERIC/RCS; they will be announced and indexed in forthcoming issues of CIJE (journal articles) or RIE
(other documents). Check the cross-reference index in these issues for the
appropriate ED or EJ number.
References
Anderson, Thomas H. and others. Development and Trial of a Model for Developing Domain-Referenced Tests of Reading Comprehension. Technical Report No. 86. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 69 pp. [ED 157 036]
Asher, Steven R. Influence of Topic Interest on Black Children and White Children's Comprehension. Technical Report No. 99. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 35 pp. [ED 159 661]
Carroll, Karen. A Taxonomic Examination of Seven Reading Comprehension Tests for the Intermediate Grades. Master's thesis. Rutgers, The State University of New Jersey, New Brunswick, N.J., 1978. 145 pp. [ED 155 616]
Durkin, Dolores. What Classroom Observations Reveal about Reading Comprehension Instruction. Technical Report No. 106. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 89 pp. [ED 162 259]
Farr, Roger. Reading: What Can Be Measured? Newark, Del.: International Reading Association, 1969. 305 pp. [ED 033 258; PC not available from EDRS]
Fleisher, Lisa S. and others. Effects on Poor Readers' Comprehension of Training in Rapid Decoding. Technical Report No. 103. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 39 pp. [ED 159 664]
Harmer, William R. and Fern Williams. The Wide Range Achievement Test and the Peabody Individual Achievement Test: A Comparative Study. Paper presented at the International Reading Association annual convention, Houston, Texas, May 1978. 11 pp. [ED 160 982]
Jenkins, Joseph R. and Darlene Pany. Curriculum Biases in Reading Achievement Tests. Technical Report No. 16. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1976. 24 pp. [ED 134 938]
Owoc, Paul, Ed. Reading and Measurement. St. Ann, Mo.: Central Midwestern Regional Educational Laboratory, 1978. 9 pp. [ED 159 628]
Parrish, Berta E. and others. New Directions in Reading Education. Volume I. Tempe, Ariz.: Arizona State University, 1978. 89 pp. [ED 159 648]
Petrosky, Anthony R. The 3rd National Assessment of Reading and Literature Versus Norm- and Criterion-Referenced Testing. Paper presented at the International Reading Association annual convention, Houston, Texas, May 1978. 13 pp. [ED 159 599]
Royer, James M. and Donald J. Cunningham. On the Theory and Measurement of Reading Comprehension. Technical Report No. 91. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 55 pp. [ED 157 040]
Steffensen, Margaret S. and others. A Cross-Cultural Perspective on Reading Comprehension. Technical Report No. 97. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 41 pp. [ED 159 660]
Young Children's Use of Language
May 16-17, 1980
A conference cosponsored by the Graduate
School of Education of Rutgers University and the International Reading Association
The conference will examine functional, developmental, and ethnographic approaches to the study of children's language in school and other contexts.
Participants will include Joan Tough, University of Leeds, England; Courtney Cazden, Harvard Graduate School of Education; Virginia Shipman and
Rodney Cocking, Educational Testing Service; Katharine Nelson, City
University of New York; Marion Blank, New Jersey College of Medicine and
Dentistry; Donald Graves, University of New Hampshire; Yetta Goodman,
University of Arizona; Janet Emig, Rutgers University; Irene Athey, Rutgers
University; Louise Rosenblatt, author of Literature as Exploration and The
Reader, The Text, The Poem; and Nancy Martin, University of London.
For more information, write: Dr. Robert P. Parker, Associate Professor
Graduate School of Education
Rutgers-The State University New Brunswick, New Jersey 08903
telephone: (201) 932-7614