Seeking Validity Evidence of Passage-Based Computer Adaptive Reading Test with Item-Level Selection
Liru Zhang, Delaware DOE
Shudong Wang, NWEA
Presented at the 2015 NCSA Annual Conference, San Diego, CA
Background (1)
At the heart of the Common Core State Standards (CCSS) in English language arts/Literacy (ELA/LIT) is a shift of instruction to center on text. The standards focus on the growing complexity of texts (or passages) and on using evidence collected from texts to present analyses and well-defended claims.
To align K-12 assessments with the CCSS, students are expected to read and comprehend grade-appropriate passages across content categories on a variety of topics and to respond to a range of passage-dependent questions that require inferences based on the passage, rather than questions that can be answered from prior knowledge and experience (Coleman and Pimentel, 2012).
Background (2)
With innovative technology, the advantages of online testing, and the encouragement of educational policy, computerized adaptive testing (CAT) has been widely implemented in K-12 assessment.
In CAT, the adaptive process is typically based on item-level selection. The ultimate goals are to satisfy the test specifications, match the provisional student ability, and control the item exposure rate.
In reading comprehension, students are expected to read and comprehend grade-appropriate passages across a range of genres and a variety of topics. Some texts, such as those in philosophy, literature, or scientific research, may be more difficult to comprehend than others because of their knowledge demands, multiple resources, structure, and/or features.
Purposes and Methods
The current study investigates whether item-level selection can achieve content balance at both the item level and the passage level, in alignment with the CCSS for passage-based reading comprehension.
Following the Standards for Educational and Psychological Testing (1999), the study collected supporting evidence (e.g., on the item selection procedure and item exposure rates) and validity evidence of parallel construct across individual tests to ensure that the content standards are adequately represented.
The CAT reading comprehension test is Rasch-based with a fixed length of 50 on-grade and off-grade items. Test results are reported on a vertical scale and in four performance levels. Student responses were collected from grades 5 and 9 in the fall and spring operations.
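The Rasch-based, item-level selection described above can be illustrated with a minimal Python sketch. It assumes maximum-information selection under simple per-category quotas, which is one common approach and not necessarily the algorithm used in this study. The item pool, field names, and quota values are hypothetical, and exposure control is omitted.

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)

def select_next_item(theta, pool, administered, remaining_quota):
    """Return the most informative unused item whose category still has quota."""
    candidates = [
        item for item in pool
        if item["id"] not in administered
        and remaining_quota.get(item["category"], 0) > 0
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item_information(theta, item["b"]))

# Hypothetical pool: difficulty b on the logit scale, category = content standard.
pool = [
    {"id": "A", "b": -1.0, "category": "Std. 2"},
    {"id": "B", "b": 0.2, "category": "Std. 4"},
    {"id": "C", "b": 0.6, "category": "Std. 2"},
]

# The Std. 4 quota is used up, so item B is excluded even though its
# difficulty is closest to the provisional ability estimate of 0.0.
next_item = select_next_item(
    0.0, pool, administered=set(),
    remaining_quota={"Std. 2": 1, "Std. 4": 0},
)
print(next_item["id"])  # prints C
```

The sketch shows why a content constraint can pull the selection away from the best-matching item: the quota filter is applied before the information criterion.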
Process for Analyses
Item Pool Review and Evaluation: Each grade had two overlapping item pools, the initial pool used in the fall and the enhanced pool used in the spring. The item pool evaluation focused on whether the pools were sufficient to support the constraints in the test specifications.
Empirical Analyses focused on the adequacy of validity evidence for content balance and parallel construct across individual tests by student achievement level, in terms of the number of passages and associated items per passage; the content category, topic, gender, Lexile, and difficulty level of passages; the use of on-/off-grade items and passages; and conditional exposure and overlap rates.
Expert Review focused on content balance at the passage and item levels, based on a randomly selected sample of 100 individual tests per grade per operation by student achievement level.
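The conditional exposure and overlap rates mentioned above can be computed as in the following Python sketch. It uses commonly cited definitions (exposure rate as the share of tests containing an item, overlap rate as the average share of items common to a pair of tests); the study does not state its exact formulas, so these definitions, the function names, and the toy data are assumptions.

```python
from collections import Counter
from itertools import combinations

def exposure_rates(tests):
    """Share of individual tests in which each item was administered."""
    counts = Counter(item for test in tests for item in set(test))
    return {item: n / len(tests) for item, n in counts.items()}

def mean_overlap_rate(tests, test_length):
    """Average proportion of items shared by each pair of individual tests."""
    pairs = list(combinations(tests, 2))
    if not pairs:
        return 0.0
    shared = sum(len(set(a) & set(b)) for a, b in pairs)
    return shared / (len(pairs) * test_length)

def conditional_exposure(tests_by_level):
    """Exposure rates computed separately within each achievement level."""
    return {level: exposure_rates(tests) for level, tests in tests_by_level.items()}

# Toy data: three 3-item tests from one hypothetical achievement level.
tests = [["a", "b", "c"], ["a", "b", "d"], ["a", "e", "f"]]
rates = exposure_rates(tests)
overlap = mean_overlap_rate(tests, test_length=3)
print(rates["a"], round(overlap, 3))
```

Conditioning on achievement level matters because an item can have a low overall exposure rate while being seen by nearly every student within one level.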
Test Specifications

Grade 5
Constraint               Min.  Max.  %
Item Content Category
  Standard 2             30    40    60-80
  Standard 4             10    20    20-40
Passage Type
  Inf.                   4     5     50
  Lit.                   4     5     50
  Total                  8     10
Item Cognitive Level
  DOK 1                  0     20    0-30
  DOK 2                  0     40    0-60
  DOK 3                  0     6     0-10
On-Gr                    40    50
Off-Gr                   0     10
MC                       49    50
TE                       0     1

Grade 9
Constraint               Min.  Max.  %
Item Content Category
  Standard 2             30    40    60-80
  Standard 4             10    20    20-40
Passage Type
  Inf.                   6     7     70
  Lit.                   2     3     30
  Total                  8     10
Item Cognitive Level
  DOK 1                  0     20    0-30
  DOK 2                  0     40    0-60
  DOK 3                  0     6     0-10
On-Gr                    40    50
Off-Gr                   0     10
MC                       49    50
TE                       0     1
Matching Test Spec. in Grade 5

Constraint      Spec. N  Spec. %  Pool (%)  Oper. Mean (N)  Oper. Min.-Max.
Passage-Level   8-10                        7.1-8.9         6-13
Informational   4-5      50       52        3.7-5.0         2-5
Literary        4-5      50       48        3.5-3.9         2-8
Item-Level      50                          50
Std. 2          30-40    60-80    78-80     36.4-37.2       31-40
Std. 4          10-20    20-40    20-22     12.8-13.6       10-19
On-Gr           40-50    80-100   44-48     43.3-43.8       39-50
Off-Gr          0-10     0-20     52-56     6.2-6.7         0-11
MC              49-50    98-100             49.7-49.9       48-50
TE              0-1      0-2                0.1-0.3         0-2
Matching Test Spec. in Grade 9

Constraint      Spec. N  Spec. %  Pool (%)  Oper. Mean (N)  Oper. Min.-Max.
Passage-Level   8-10                        7.2-8.0         6-10
Informational   6-7      70       70-76     4.1-4.8         3-7
Literary        2-3      30       24-30     3.1-3.2         1-5
Item-Level      50                          50
Std. 2          30-40    60-80    78        37.3-37.8       31-40
Std. 4          10-20    20-40    22        12.2-12.7       10-19
On-Gr           40-50    80-100   43-52     44.8            39-50
Off-Gr          0-10     0-20     48-57     5.2             0-11
MC              49-50    98-100             49.6-49.9       44-50
TE              0-1      0-2                0.1-0.4         0-6
Sample Individual Test 1 – Grade 5 (AL 1)

Content Category               N. Items  Type  Gender  On-Grade  Lexile  Length
Topic      Animal              7         I     N       Y         1020L   815
           Career              6         I     N       Y         1250L   825
Genre      Biography           1         I     F       Y         880L    632
           Biography           6         I     N       Y         1110L   791
           Legend-Folktale     1         L     M       Y         870L    582
           Realistic Fiction   1         L     M       N         500L    341
           Realistic Fiction   5         L     M       Y         1130L   668
           Realistic Fiction   1         L     F       Y         740L    981
           Realistic Fiction   1         L     F       Y         800L    898
           Realistic Fiction   6         L     F       N         570L    984
Structure  Pair                8         L     M       Y         890L    112
           Pair                7         I     M       Y         1290L   681
Total      12                  50        5/7   4/5/3   10/2      CCSS: 830L-1010L
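As a worked check of Sample Individual Test 1 against the Grade 5 test specifications, the Python sketch below tallies the passage-level counts and flags any that fall outside the min./max. limits. The function and variable names are hypothetical, and the on-/off-grade item counts assume that every item inherits the on-/off-grade status of its passage, which the source does not state.

```python
# Grade 5 limits (min, max) taken from the test specifications.
GRADE5_SPEC = {
    "passages": (8, 10),
    "informational": (4, 5),
    "literary": (4, 5),
    "on_grade_items": (40, 50),
    "off_grade_items": (0, 10),
}

# Sample Individual Test 1, one tuple per passage: (n_items, type, on_grade).
SAMPLE_TEST_1 = [
    (7, "I", True), (6, "I", True), (1, "I", True), (6, "I", True),
    (1, "L", True), (1, "L", False), (5, "L", True), (1, "L", True),
    (1, "L", True), (6, "L", False), (8, "L", True), (7, "I", True),
]

def summarize(passages):
    """Tally passage counts by type and item counts by on-/off-grade status."""
    return {
        "passages": len(passages),
        "informational": sum(1 for _n, kind, _on in passages if kind == "I"),
        "literary": sum(1 for _n, kind, _on in passages if kind == "L"),
        # Assumption: items share the on-/off-grade status of their passage.
        "on_grade_items": sum(n for n, _kind, on in passages if on),
        "off_grade_items": sum(n for n, _kind, on in passages if not on),
    }

def find_violations(summary, spec):
    """Return {constraint: (observed, min, max)} for every out-of-range count."""
    return {
        name: (summary[name], lo, hi)
        for name, (lo, hi) in spec.items()
        if not lo <= summary[name] <= hi
    }

summary = summarize(SAMPLE_TEST_1)
violations = find_violations(summary, GRADE5_SPEC)
print(violations)  # {'passages': (12, 8, 10), 'literary': (7, 4, 5)}
```

Under these assumptions the item-level on-/off-grade counts (43 and 7) stay within limits while the passage count and the number of literary passages do not.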
Sample Individual Test 2 – Grade 5 (AL 4)

Content Topic                  N. Items  Type  Gender  On-Grade  Lexile  Length
Topic      Career              6         I     N       Y         1250L   825
           Sports              1         I     M       Y         910L    717
Format     How-to-Do           8         I     N       Y         970L    679
Genre      Poem                6         L     N       Y         1100L   100
           Realistic Fiction   8         L     N       N         960L    575
           Realistic Fiction   7         L     F       Y         970L    1011
Structure  Pair                6         I     M       Y         1000L   550
           Pair                8         L     M       Y         890L    1012
Total      8                   50        4/4   1/3/4   7/1       CCSS: 830L-1010L
Sample Individual Test 3 – Grade 9 (AL 1)

Content Topic                  N. Items  Type  Gender  On-Grade  Lexile  Length
Topic      Food                8         I     N       N         1290L   609
           History             7         I     M       Y         970L    1135
           Science             7         I     N       Y         1320L   1171
Format     How-to-Do           7         I     N       Y         1160L   1588
Genre      Realistic Fiction   8         L     F       Y         1060L   821
           Realistic Fiction   2         L     M       Y         1280L   900
Structure  Pair                2         L     N       N         1610L   991
           Pair                9         L     F       Y         820L    1031
Total      8                   50        4/4   2/2/4   6/2       CCSS: 1050L-1260L
Sample Individual Test 4 – Grade 9 (AL 4)

Content Topic                  N. Items  Type  Gender  On/Off    Lexile  Length
Topic      Entertainment       5         I     N       Y         1070L   1188
           Entertainment       4         I     N       Y         1090L   1130
           Environment         7         I     N       Y         1300L   1048
           Health              8         I     N       Y         1290L   761
Genre      Biography           6         I     M       N         1210L   376
           Realistic Fiction   7         L     M       Y         970L    800
           Realistic Fiction   7         L     M       Y         980L    1136
           Realistic Fiction   6         L     F       Y         930L
Total      8                   50        5/3   1/3/4   7/1       CCSS: 1050L-1260L
Findings and Implications (1)
Compared with the test specifications, content balance at the item level is satisfied for Standards 2 and 4, both on average and within the min./max. limits, in both grades. The proportion of on-grade and off-grade items is generally met in the fixed-length test.
At the passage level, the total number of passages varies greatly from student to student, as shown in the sample tests: from 8 for a student at achievement level 4 to 12 for a student at achievement level 1. The proportions of the two passage types, informational and literary, failed to meet the targets in operation.
In reading, passages and their associated items are related to each other, but each has its own coding categories and evaluation system. Addressing all constraints and balancing them at both levels is much more complicated in practice than presumed. If they are not balanced, satisfying the constraints at one level can come at the expense of the other level.
Findings and Implications (2)
When students repeatedly receive reading passages from certain genres with similar topics, or in the same format or structure, it not only limits the breadth of their reading exposure but also introduces bias into testing.
According to the content expert review, one pairing per test is desirable, because paired passages increase the reading demands with an additional passage. More importantly, the cognitive load increases as students are asked to make inferences and draw conclusions across passages, not just within each passage.
When students read passages with only 1-2 associated items in order to satisfy the test specifications in a fixed-length test, the reading demands swell unexpectedly, especially for young readers.
Findings and Implications (3)
To achieve content balance in passage-based adaptive reading tests, it is indispensable that all constraints at the item level and the passage level be considered simultaneously.
In CAT, sufficient item pools and well-established content constraints in the test specifications are necessary to ensure adequate content balance and comparable construct across individual tests.
Thank you!