ERIC/RCS: Recent Developments in Testing Reading Comprehension
Author(s): Karl Koenke
Source: The Reading Teacher, Vol. 33, No. 4 (Jan., 1980), pp. 506-510
Published by: Wiley on behalf of the International Reading Association
Stable URL: http://www.jstor.org/stable/20195057
Accessed: 28/06/2014 15:15
This content downloaded from 46.243.173.151 on Sat, 28 Jun 2014 15:15:28 PM. All use subject to JSTOR Terms and Conditions.
ERIC/RCS
Recent developments in testing reading comprehension
KARL KOENKE
Though Roger Farr's [ED 033 258] Reading: What Can Be Measured?, published in 1969, remains the profession's most complete statement on the testing of
reading ability, a number of studies available through the ERIC system raise the
possibility of major changes in the testing of reading comprehension during the 1980s and suggest more immediate adjustments in current testing practices.
A chance for change
A number of reports from the Center for the Study of Reading, funded by the
National Institute of Education (NIE), point to both a need and a possibility for
change in testing reading comprehension. Durkin's [ED 162 259] study of reading instruction in 39 classrooms in central Illinois suggests that instruction in and
assessment of reading comprehension are often identical. Observations by Durkin
and her assistants reveal that to develop reading comprehension teachers use
workbook pages and ask students questions. There is little if any explanation of
methods to be used in answering questions; the emphasis is on correctness of the
response. That is assessment – testing.
One result of this study could be a redoubling of efforts by writers and
publishers of manuals and materials to provide the teacher with help in meeting Durkin's criterion for comprehension instruction: "Teacher does/says something to help children understand or work out the meaning of more than a single, isolated word" (p. 8).
It is also possible that test makers will be influenced by the finding that teachers ask many questions. Perhaps there will be an even greater emphasis on
listening comprehension and oral response. That would mean more use of check
lists or tape recorders. In addition, some test makers might be influenced by
Durkin's definition of comprehension assessment: "Teacher does/says something in order to learn whether what was read was comprehended. Efforts could take a variety of forms – for instance, orally-posed questions; written exercises; request for picture of unpictured character in a story" (p. 11).
Anderson and others [ED 157 036] have described their attempts to develop and pilot a model for building domain-referenced tests of reading comprehension.
506 The Reading Teacher January 1980
This is the type that the National Assessment of Educational Progress (NAEP) has been using, but without the development strategies and controls made
apparent in the Anderson and others paper. Domain-referenced assessment,
according to Petrosky [ED 159 599] in a discussion of the third NAEP assessment of reading comprehension, provides the best standardized procedure for finding out what students know and what teachers and schools can do. This assessment
procedure eliminates the ambiguity about the meaning of test scores by sampling representative items from a well-defined set of tasks, by referring to the logical
relationship between a set of items in a test and a well-defined domain
represented by those items, and by estimating the kinds of behavior students are
capable of within a defined domain.

Anderson and others piloted their model for building a domain-referenced
reading comprehension test within the domain (or what might be called the skill) of finding the main idea. The variables within the passage that they chose to
manipulate or control are quite interesting: whether the main point is stated, the
frequency with which it is stated, the number of supporting ideas, cues to the main point statement, cues to details, vocabulary, syntax, structure, style, length
of passage, and number of major points in the passage. Test makers in general
have a long way to go to reach the specificity in testing that Anderson and others
require.
Royer and Cunningham [ED 157 040] show that comprehension theory contributes to the measurement of reading comprehension. They espouse neither
the "bottom-up" theory (major concern and control of the written material), nor
the "top-down" theory (major concern and study of the reader). Rather they stress the interaction between the reader and the material. Their analysis of
existing tests led to the conclusion that test passages are likely to draw broadly from the reader's knowledge of the world. Failure to comprehend, therefore, can
be attributed to either a lack of specific knowledge or lack of skill development. Existing tests do not distinguish between these two factors. Royer and
Cunningham conclude that existing reading comprehension tests are acceptable as
predictive instruments but are not useful for diagnosis or assessment of gain. The
researchers caution that some disadvantaged children might not be tested fairly when existing tests are used to show gain in skill development. They suggest
creating tests that match students' prior knowledge with the topical content of the
test passages, and that control for reasoning and inferential ability on the part of
the reader. Though these factors cannot be completely excluded, controls are
necessary to more clearly specify what it is that the tests do test.
Adjusting present practice
In addition to the possibility of extensive changes in the longer term, the
current literature on the testing of reading comprehension presents varied
suggestions for adjustments. Jenkins and Pany's [ED 134 938] analysis of five
reading achievement tests and seven basal reading series shows that there is
discernible curriculum bias in the standardized reading tests. Scores on the
Paragraph Meaning Subtest of the Stanford Achievement Test (SAT), for
example, are related to the vocabulary taught through the second grade in various
basal series. If a child has been taught the words in the Ginn 360 series, a grade
equivalent score of 1.9 is expected on the SAT Paragraph Meaning Subtest; for the Macmillan series, 2.5; for both the SRA series and the Bank Street Readers, 2.9; for the Sullivan series, 3.1; for Economy, 3.2; and for the Houghton Mifflin
series, 3.4. From high to low, the expected scores vary by a year and a half
(Jenkins and Pany, p. 21).
Teachers must understand the specific relationship between the tests they use
and those used by others. Cumulative records or clear communication between
schools and teachers is critical. For example, a teacher measuring the reading
comprehension of a new class with the SAT may test one child who has been
reading in the Ginn 360 series for two years and another who has been
reading in Houghton Mifflin for the same length of time. The teacher might place the first child among those children who have not succeeded as well as expected,
while considering the second child quite talented. However, the real difference
between the two children is the specific words they have learned as a result of
their basal reading series. The need for adjustment seems clear.
Jenkins and Pany also point out that teachers should be aware of the difference
between scores on common achievement tests and those from the following tests
used by reading teachers and others: Peabody Individual Achievement Test
(PIAT), Slosson Oral Reading Test (SORT), and Wide Range Achievement Test
(WRAT). There are differences across tests and across basal series. The second
grader who achieved a 1.9 on the SAT comprehension section after two years in
the Ginn 360 program would be expected to score 2.5 on the WRAT, 2.7 on the
SORT, and 2.8 on the PIAT. The child who achieved 3.4 on the SAT after two
years in the Houghton Mifflin series would be expected to score only 2.9 on the
WRAT, but 3.4 on the SORT, and 3.8 on the PIAT.

As a result of another study of the WRAT and the PIAT, Harmer and
Williams [ED 160 982] conclude that the two tests should be used interchangeably only with caution and understanding of the differences. While there is a moderate
to high correlation between the test scores, the two instruments have distinctly
different strengths and weaknesses. The WRAT tends to be more accurate for
younger children (kindergarten and primary grades). They concluded that differences in test content, testing procedures, test format, and the method of
scoring may account for the differences in grade equivalent scores.
Another analysis of reading comprehension tests has been undertaken by
Carroll [ED 155 616], who focuses on the construct of comprehension that each
of seven reading comprehension tests claimed to measure. The 367 items from
these tests were classified by two judges according to four taxonomic levels:
recognition and recall, inference, evaluation, and appreciation. Inference items
and recall tasks together accounted for 81% to 100% of the items on the various
tests. Evaluation and appreciation items were not only less numerous, they were
also less well developed. (After finding a low rate of interjudge agreement in item
categorization, Carroll also concluded that there is a need to control the
subjectivity in this type of analysis.)

A few other studies that might serve as a basis for the adjustment of our
present practices in testing reading comprehension can be briefly mentioned.
Fleisher and others [ED 159 664] found that the ability to read words and
phrases in isolation quickly did not affect reading comprehension scores. Once
again, this exemplifies the idea that a child can "call" words quickly without
understanding what is read. The use of subtests involving flashed phrases or
word lists must therefore be carefully thought through.
Parrish and others [ED 159 648] discuss the development of culture-fair,
informal reading inventories, using as an example an informal reading inventory
reflecting the experience of a Spanish-speaking child. The idea for this approach came as a result of Parrish's teaching in the South Pacific where there are
obvious differences in culture between the writers of the tests and the children
who actually take them.
In another study, Asher [ED 159 661] found that children, regardless of race,
comprehend high-interest material better than low-interest material and that there
is a high degree of overlap in interests across races. This finding and that of
Steffensen and others [ED 159 660], who also have studied the effects of
experience on comprehension, seem to support Parrish's ideas concerning
specially written informal reading inventories.
It seems fitting to conclude with a quote from Roger Farr [Owoc ED 159 628]:
"To be most useful, tests should be dictated by curriculum. Those who administer
tests should be fully familiar with the instructional program that teaches the
behavior the test purports to measure. This means understanding the major
features of the program and their sequence in the program in relation to the goals
of the program. It also means knowing the particular students – their interests,
backgrounds, abilities, and needs – that the program instructs" (p. 6).
Obtaining ERIC Materials
The Educational Resources Information Center Clearinghouse on Reading and Communication Skills (ERIC/RCS) is one of sixteen ERIC clearinghouses. It is jointly sponsored by the National Institute of Education and the National Council of Teachers of English.

ED (ERIC Document) numbers identify materials announced in monthly issues
of Resources in Education (RIE). Except as otherwise noted, these materials have
been reproduced in microfiche and may be viewed at libraries that contain ERIC
microfiche collections or may be purchased (in microfiche or paper copy) from
the ERIC Document Reproduction Service. (Some are also available from the
publisher mentioned.) For complete ordering information and current prices
(based on number of pages), consult the most recent issue of RIE or write to
ERIC/RCS, 1111 Kenyon Road, Urbana, IL 61801, U.S.A. Prices should be obtained before placing orders.
EJ (ERIC Journal) numbers identify journal articles announced in monthly issues of Current Index to Journals in Education (CIJE). These articles may be
obtained from libraries; many are also available as reprints from University Microfilms International. For current information on prices and availability,
consult the most recent issue of CIJE.
CS numbers identify documents recently acquired by ERIC/RCS; they will be announced and indexed in forthcoming issues of CIJE (journal articles) or RIE
(other documents). Check the cross-reference index in these issues for the
appropriate ED or EJ number.
References
Anderson, Thomas H. and others. Development and Trial of a Model for Developing Domain-Referenced Tests of Reading Comprehension. Technical Report No. 86. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 69 pp. [ED 157 036]
Asher, Steven R. Influence of Topic Interest on Black Children and White Children's Comprehension. Technical Report No. 99. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 35 pp. [ED 159 661]
Carroll, Karen. A Taxonomic Examination of Seven Reading Comprehension Tests for the Intermediate Grades. Master's thesis. Rutgers, The State University of New Jersey, New Brunswick, N.J., 1978. 145 pp. [ED 155 616]
Durkin, Dolores. What Classroom Observations Reveal about Reading Comprehension Instruction. Technical Report No. 106. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 89 pp. [ED 162 259]
Farr, Roger. Reading: What Can Be Measured? Newark, Del.: International Reading Association, 1969. 305 pp. [ED 033 258; PC not available from EDRS]
Fleisher, Lisa S. and others. Effects on Poor Readers' Comprehension of Training in Rapid Decoding. Technical Report No. 103. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 39 pp. [ED 159 664]
Harmer, William R. and Fern Williams. The Wide Range Achievement Test and the Peabody Individual Achievement Test: A Comparative Study. Paper presented at the International Reading Association annual convention, Houston, Texas, May 1978. 11 pp. [ED 160 982]
Jenkins, Joseph R. and Darlene Pany. Curriculum Biases in Reading Achievement Tests. Technical Report No. 16. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1976. 24 pp. [ED 134 938]
Owoc, Paul, Ed. Reading and Measurement. St. Ann, Mo.: Central Midwestern Regional Educational Laboratory, 1978. 9 pp. [ED 159 628]
Parrish, Berta E. and others. New Directions in Reading Education. Volume I. Tempe, Ariz.: Arizona State University, 1978. 89 pp. [ED 159 648]
Petrosky, Anthony R. The 3rd National Assessment of Reading and Literature Versus Norm- and Criterion-Referenced Testing. Paper presented at the International Reading Association annual convention, Houston, Texas, May 1978. 13 pp. [ED 159 599]
Royer, James M. and Donald J. Cunningham. On the Theory and Measurement of Reading Comprehension. Technical Report No. 91. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 55 pp. [ED 157 040]
Steffensen, Margaret S. and others. A Cross-Cultural Perspective on Reading Comprehension. Technical Report No. 97. Cambridge, Mass.: Bolt, Beranek and Newman, Inc. and Urbana, Ill.: Center for the Study of Reading, University of Illinois, 1978. 41 pp. [ED 159 660]
Young Children's Use of Language
May 16-17, 1980
A conference cosponsored by the Graduate
School of Education of Rutgers University and the International Reading Association
The conference will examine functional, developmental, and ethnographic approaches to the study of children's language in school and other contexts.
Participants will include Joan Tough, University of Leeds, England; Courtney Cazden, Harvard Graduate School of Education; Virginia Shipman and
Rodney Cocking, Educational Testing Service; Katharine Nelson, City
University of New York; Marion Blank, New Jersey College of Medicine and
Dentistry; Donald Graves, University of New Hampshire; Yetta Goodman,
University of Arizona; Janet Emig, Rutgers University; Irene Athey, Rutgers
University; Louise Rosenblatt, author of Literature as Exploration and The
Reader, The Text, The Poem; and Nancy Martin, University of London.
For more information, write: Dr. Robert P. Parker, Associate Professor
Graduate School of Education
Rutgers-The State University New Brunswick, New Jersey 08903
telephone: (201) 932-7614