WHAT IS PREDICTIVE ASSESSMENT?
DISCOVERY EDUCATION ASSESSMENT
New Mexico Benchmark Assessments
Reading and Mathematics, Grades 3-10
What is Predictive Assessment?
EXECUTIVE SUMMARY
1. Are Discovery Education Predictive Assessments reliable?
The median Reading reliability for the 0809 NM benchmark assessments, grades 3-8, was .78 with a median sample size of 574. The median Mathematics reliability for the 0809 NM benchmark assessments, grades 3-8, was .83 with a median sample size of 574.
2. Do Discovery Education Predictive Assessments have content validity?
Discovery Education Assessments ensure content validity by using each state’s curriculum
standards for reading and mathematics. The DEA New Mexico reporting categories for Reading and Mathematics grades 3-8 are based on the US Reporting Categories. The DEA New Mexico
reporting categories for Reading and Mathematics grades 9 & 10 are based on the New Mexico
Short-Cycle Assessments (SCA) requirements.
3. Do Discovery Education Predictive Assessments match state standardized tests?
4. Can Discovery Education Predictive Assessments predict proficiency levels?
Yes, there is a greater than 90% accuracy rate for predicting combined state proficiency percentages. The median state Proficiency Prediction Score for the Reading tests was 94% and
the median state Proficiency Prediction Score for Mathematics tests was 94%.
5. Can the use of Discovery Education Predictive Assessments improve student learning?
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Yes. These benchmark assessments are scored on a vertical scale using state-of-the-art Rasch
psychometric modeling. Thus, reliable estimates of student growth can be made over time.
7. Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
Two matched control group studies—one in Birmingham, Alabama, and the other in Nashville,
Tennessee—support the claim that Discovery Education Predictive Assessments help schools demonstrate significant improvement in student proficiency.
What is Predictive Assessment?
NEW MEXICO
An Overview of Standards and Scientifically-Based Evidence
Supporting the Discovery Education Assessment Test Series
Since its inception at Vanderbilt University in 2000, ThinkLink Learning, now a part of
Discovery Education, has focused on the use of formative assessments to improve K-12 student
learning and performance. Bridging the gap between university research and classroom practice, Discovery Education Assessment offers effective and user-friendly assessment products that
provide classroom teachers and students with the feedback needed to strategically adapt their
teaching and learning activities throughout the school year.
Discovery Education Assessment has pioneered a unique approach to formative assessments
using a scientifically research-based continuous improvement model that maps diagnostic
assessments to each state’s high stakes test. Discovery Education Assessment’s Predictive State-Specific Benchmark tests are aligned to the content assessed by each state test, allowing teachers
to track student progress toward the standards and objectives used for accountability purposes.
Furthermore, Discovery Education Assessment subscribes to the Standards for Educational and
Psychological Testing articulated by the consortium of the American Educational Research
Association, the American Psychological Association, and the National Council on Measurement
in Education. This document, “What is Predictive Assessment?”, outlines how Discovery Education Assessment addresses the following quality testing standards:
1. Are Discovery Education Predictive Assessments reliable?
Test reliability provides evidence that test questions are consistently measuring a given
construct, such as mathematics ability or reading comprehension. Furthermore, high test
reliability indicates that the measurement error for a test is low.
2. Do Discovery Education Predictive Assessments have content validity?
Content validity evidence shows that test content is appropriate for the particular constructs that
are being measured. Content validity is measured by agreement among subject matter experts
about test material and alignment to state standards, by highly reliable training procedures for item writers, by thorough reviews of test material for accuracy and lack of bias, and by
examination of depth of knowledge of test questions.
3. Do Discovery Education Predictive Assessments match state standardized tests?
Criterion validity evidence demonstrates that test scores predict scores on an important criterion
variable, such as a state’s standardized test.
4. Can Discovery Education Predictive Assessments predict proficiency levels?
WHAT IS PREDICTIVE ASSESSMENT?
Proficiency predictive validity evidence supports the claim that a test can predict a state’s
proficiency levels. High accuracy levels show that a high degree of confidence can be placed in
the vendor’s prediction of student proficiency.
5. Can the use of Discovery Education Predictive Assessments improve student learning?
Consequential validity outlines how the use of these predictive assessments facilitates important
consequences, such as the improvement of student learning and student performance on state
standardized tests.
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Growth models depend on a highly rigorous and valid vertical scale to measure student
performance over time. A vendor’s vertical scales should be constructed using advanced
statistical methodologies such as Rasch measurement models and other state-of-the-art
psychometric techniques.
7. Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
In the No Child Left Behind Act of 2001, the U.S. Department of Education outlined six major
criteria for “scientifically-based research” to be used by consumers of educational measurements
and interventions. Accordingly, a vendor’s test:
(i) employs systematic, empirical methods that draw on observation and experiment;
(ii) involves rigorous data analyses that are adequate to test the stated hypotheses and justify
the general conclusions drawn;
(iii) relies on measurements or observational methods that provide reliable and valid data
across evaluators and observers, across multiple measurements and observations, and
across studies by the same or different investigators;
(iv) is evaluated using experimental or quasi-experimental designs in which individuals,
entities, programs or activities are assigned to different conditions and with
appropriate controls to evaluate the effects of the condition of interest, with a
preference for random-assignment experiments, or other designs to the extent that
those designs contain within-condition or across-condition control.
(v) ensures experimental studies are presented in sufficient detail and clarity to allow for
replication or, at a minimum, offer the opportunity to build systematically on their
findings;
(vi) has been accepted by a peer-reviewed journal or approved by a panel of
independent experts through a comparably rigorous, objective and scientific review.
TEST RELIABILITY
1. Are Discovery Education Predictive Assessments reliable?
Test reliability provides evidence that test questions are consistently measuring a given
construct, such as mathematics ability or reading comprehension. Furthermore, high test reliability indicates that the measurement error for a test is low. Reliabilities are calculated using
Cronbach’s alpha.
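As a point of reference, Cronbach’s alpha for a matrix of scored item responses can be computed as in the following minimal sketch (standard formula; the operational calculation may treat missing responses differently):

    import numpy as np

    def cronbach_alpha(item_scores):
        # item_scores: a students-by-items matrix of 0/1 scored responses
        items = np.asarray(item_scores, dtype=float)
        k = items.shape[1]                              # number of items
        item_variances = items.var(axis=0, ddof=1)      # variance of each item
        total_variance = items.sum(axis=1).var(ddof=1)  # variance of total scores
        return (k / (k - 1.0)) * (1.0 - item_variances.sum() / total_variance)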
Table 1 and Table 2 present test reliabilities and sample sizes for Discovery Education Predictive
Assessments in the subject areas of Reading and Mathematics for New Mexico grades 3-8.
Reliabilities for grades 3-8 are based on Discovery Education Assessments given in the fall,
winter and spring 2008-2009 school year.
The median Reading reliability was .78 with a median sample size of 574. The median
Mathematics reliability was .83 with a median sample size of 574.
Table 1: Test reliabilities for the 0809 DEA Reading Benchmark assessments, grades 3-8.
Reading Reliabilities
Fall 2008 Winter 2009 Spring 2009
N Reliability N Reliability N Reliability
Grade 3 459 0.77 563 0.82 546 0.83
Grade 4 535 0.78 596 0.78 601 0.80
Grade 5 589 0.78 643 0.75 645 0.76
Grade 6 575 0.79 627 0.83 621 0.79
Grade 7 223 0.78 228 0.76 227 0.79
Grade 8 250 0.77 254 0.75 243 0.78
Median 497 0.78 580 0.77 574 0.79
Table 2: Test reliabilities for the 0809 DEA Mathematics Benchmark assessments, grades 3-8.
Mathematics Reliabilities
Fall 2008 Winter 2009 Spring 2009
N Reliability N Reliability N Reliability
Grade 3 460 0.82 564 0.81 549 0.83
Grade 4 532 0.84 597 0.83 598 0.84
Grade 5 585 0.84 648 0.76 646 0.80
Grade 6 572 0.79 615 0.74 634 0.82
Grade 7 215 0.75 237 0.80 227 0.78
Grade 8 253 0.84 245 0.80 237 0.83
Median 496 0.83 581 0.80 574 0.83
Table 3 and Table 4 present test reliabilities and sample sizes for Discovery Education Predictive
Assessments in the subject areas of Reading and Mathematics for New Mexico grades 9 and 10.
Reliabilities for grades 9 and 10 are based on Discovery Education Assessments given in the fall,
winter and spring 2009-2010 school year.
Table 3: Test reliabilities for the 0910 DEA Reading Benchmark assessments, grades 9 & 10.
Reading Reliabilities
Fall 2009 Winter 2010 Spring 2010
N Reliability N Reliability N Reliability
Grade 9 283 0.79 284 0.80 263 0.78
Grade 10 279 0.85 273 0.84 228 0.88
Table 4: Test reliabilities for the 0910 DEA Mathematics Benchmark assessments, grades 9 & 10.
Mathematics Reliabilities
Fall 2009 Winter 2010 Spring 2010
N Reliability N Reliability N Reliability
Grade 9 290 0.49 285 0.68 252 0.71
Grade 10 300 0.66 284 0.57 231 0.79
CONTENT VALIDITY
2. Do Discovery Education Predictive Assessments have content validity?
Content validity evidence shows that test content is appropriate for the particular constructs that are being measured. Content validity is measured by agreement among subject matter experts
about test material and alignment to state standards, by highly reliable training procedures for
item writers, by thorough reviews of test material for accuracy and lack of bias, and by examination of depth of knowledge of test questions.
To ensure content validity of all tests, Discovery Education Assessment carefully aligns the content of its assessments to a given state’s content standards and the content sampled by the
respective high stakes test. To this end, Discovery Education Assessment employs one of the leading alignment research methodologies, the Webb Alignment Tool (WAT), which has continually supported the alignment of our tests to state-specific content standards both in breadth (i.e., the number of standards and objectives sampled) and depth (i.e., the cognitive complexity of standards
and objectives). All Discovery Education Assessment tests are thus state specific and feature
reporting categories matching those of a given state’s large-scale assessment used for accountability purposes.
The following reporting categories are used on Discovery Education Predictive Assessments. The DEA New Mexico reporting categories for Reading and Mathematics are based on the
United States Reporting Categories for grades 3-8. The DEA New Mexico 9th and 10th grade reporting categories for reading and math are based on the New Mexico Short-Cycle Assessments (SCA) requirements.
We continually update our assessments to reflect the most current version of a state’s standards.
Reading US Reporting Categories: Grades 3-8
Text & Vocabulary Understanding
Interpretation
Extend Understanding
Comprehension Strategies
Sentence Construction
Composition
Proofreading/Editing
Reading New Mexico Reporting Categories: Grades 9 & 10
Reading and Listening for Comprehension
Writing and Speaking for Expression
Literature and Media
Math US Reporting Categories: Grades 3-8
Number and Number Relations
Computation and Numerical Estimation
Operation Concepts
Measurement
Geometry and Spatial Sense
Data Analysis, Statistics and Probability
Patterns, Functions, Algebra
Problem Solving and Reasoning
Math New Mexico Reporting Categories: Grades 9 & 10
Data Analysis and Probability
Algebra
Geometry
CRITERION VALIDITY
3. Do Discovery Education Predictive Assessments match state standardized tests?
PROFICIENCY PREDICTIVE VALIDITY
4. Can Discovery Education Predictive Assessments predict state proficiency levels?
Proficiency predictive validity supports the claim that a test can predict a state’s proficiency
levels. High accuracy levels show that a high degree of confidence can be placed in our test
predictions of student proficiency. Two measures of predictive validity are calculated. If only summary data for a school or district are available, the Proficiency Prediction Score is tabulated.
When individual student-level data are available, an additional index, the Proficiency Success
Rate, is also calculated. Both measures are explained in the following sections with examples drawn from actual data from the states.
Proficiency Prediction Score
The Proficiency Prediction Score is used to determine the accuracy of predicted proficiency status. Under the NCLB legislation, it is important that states and school districts help students
progress from a “Not Proficient” status to one of “Proficient”. The Proficiency Prediction Score is
based on the percentage of correct proficiency classifications (Not Proficient/Proficient). If a state uses two or more classifications for “Proficient” (such as “Proficient” and “Advanced”), the
percentage of students in these two or more categories would be added together. Also, if a state
uses two or more categories for “Not Proficient” (such as “Below Basic” and “Basic”), the percentage of students in these two or more categories would be added together. To see how to
use this score, let’s assume a school district had the following data based on its annual state test
and a Discovery Education Assessment Spring benchmark assessment. Let’s use data from a
Grade 4 Mathematics Test as an example:
Predicted Percent Proficient or higher = 70%
Actual Percent Proficient or higher on the State Test = 80%
The error rate for these predictions is as follows:
Error Rate = |Actual Percent Proficient - Predicted Percent Proficient|
Error Rate = |80% - 70%| = 10%
In this example, Discovery Education Assessment underpredicted the percent of students proficient by 10%. The absolute value (the symbols | |) of the error rate is used to account for cases where Discovery Education Assessment overpredicts the percent of students proficient and the calculation is negative (e.g., Actual - Predicted = 70% - 80% = -10%; absolute value is 10%).
The Proficiency Prediction Score is calculated as follows:
Proficiency Prediction Score = 100% - Error Rate
In this example, the score is as follows:
Proficiency Prediction Score = 100% - 10% = 90%.
A higher Proficiency Prediction Score indicates a larger number or percentage of correct
proficiency predictions. In this example, Discovery Education Assessment had a score of 90%,
which indicates 9 correct classifications for every 1 misclassification. Discovery Education
Assessment uses information from these scores to improve its benchmark assessments every year.
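The same calculation can be written compactly. The short sketch below simply restates the formulas above in Python (function and variable names are ours, for illustration only):

    def proficiency_prediction_score(actual_pct_proficient, predicted_pct_proficient):
        # Error Rate = |Actual - Predicted|; Proficiency Prediction Score = 100% - Error Rate
        error_rate = abs(actual_pct_proficient - predicted_pct_proficient)
        return 100.0 - error_rate

    # Worked example from the text: Grade 4 Mathematics
    print(proficiency_prediction_score(actual_pct_proficient=80, predicted_pct_proficient=70))  # 90.0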
The state of New Mexico uses a four level proficiency scale: Beginning, Near Proficient,
Proficient and Advanced. The following table shows the proficiency prediction scores for the
Reading and Mathematics assessments. On average, Discovery Education Assessments had a 94% proficiency prediction score on the reading tests and a 94% proficiency prediction score on
the math tests. The figures on the following pages compare the student proficiency level percentages predicted by the Discovery Education Assessment 0809B test with those on the 2009 NM State assessments.
Proficiency Prediction Score for New Mexico
Reading Math
Grade 3 94.10 94.30
Grade 4 98.00 98.60
Grade 5 98.50 97.00
Grade 6 94.70 91.30
Grade 7 93.00 95.20
Grade 8 84.40 81.80
Grade 11 94.90 97.20
Figure: Grade 3 Reading Proficiency Level Percentages (DEA vs. NM State, percent of students at Beginning, Near Proficient, Proficient, and Advanced)
Figure: Grade 3 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 4 Reading Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 4 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 5 Reading Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 5 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 6 Reading Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 6 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 7 Reading Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 7 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 8 Reading Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 8 Math Proficiency Level Percentages (DEA vs. NM State)
Figure: Grade 10/11 Reading Proficiency Level Percentages (DEA Grade 10 vs. NM State Grade 11)
Figure: Grade 10/11 Math Proficiency Level Percentages (DEA Grade 10 vs. NM State Grade 11)
CONSEQUENTIAL VALIDITY
5. Can the use of Discovery Education Predictive Assessments improve student learning?
Consequential validity outlines how the use of benchmark assessments facilitates important consequences, such as the improvement of student learning and student performance on state standardized tests. The matched control group studies from Birmingham, Alabama and Metro Nashville, Tennessee, presented under question 7 below, provide this evidence.
GROWTH MODELS
6. Can Discovery Education Predictive Assessments be used to measure growth over time?
Growth models depend on a highly rigorous and valid vertical scale to measure student performance over time. Discovery Education Assessment vertical scales are constructed using
Rasch measurement models with state-of-the-art psychometric techniques.
The accurate measurement of student achievement over time is becoming increasingly important
to parents, teachers, and school administrators. Student “growth” within a grade and across
grades has also been sanctioned by the U. S. Department of Education as a reliable way to measure student proficiency in Reading and Mathematics and to satisfy the requirements of
Adequate Yearly Progress (AYP) under the No Child Left Behind Act. Accurate measurement
and recording of individual student achievement can also help with issues of student mobility: as
students move within a district or state, records of individual student achievement can help new schools address the needs of this mobile population.
The assessment of student achievement over time is even more important with the use of benchmark tests. Discovery Education Assessment Benchmark tests provide a snapshot of
student progress toward state standards at up to four points during the school year. These
benchmark tests are scientifically linked, so that the reporting of student proficiency levels is both reliable and valid.
How is the growth score created?
Discovery Education Assessment added a scientifically based, vertically scaled growth score to its family of benchmark tests in 2007-08. These growth scores are based on the Rasch
measurement model, a state-of-the-art psychometric technique for scaling ability (e.g., Wright &
Stone, 1979; Wright & Masters, 1982; Linacre 1999; Smith & Smith, 2004; Wilson, 2005). To accomplish vertical scaling, common items are embedded across assessments to enable the
psychometric linking of tests at different points in time. For example, a Grade 3 mathematics
benchmark test administered mid-year might contain below grade level and above grade level
items. Performance on these off grade level items provides an accurate measurement of how much growth occurs across grades. Furthermore, benchmark tests within a grade are also linked
with common items, once again to assess change at different points in time within a grade.
Discovery Education Assessment uses established psychometric procedures to build calibrated item banks and linked tests (e.g., Ingebo, 1997; Kolen & Brennan, 2004).
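For reference, the dichotomous Rasch model that underlies this scaling expresses the probability that student n answers item i correctly in terms of the student's ability (theta_n) and the item's difficulty (b_i); this is the standard form of the model, and the operational estimation details are not described in this document:

    P(X_{ni} = 1) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

Because abilities and item difficulties are expressed on the same logit scale, common items shared across forms and grades allow scores from different tests to be placed on a single vertical scale.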
Why use such a rigorous vertical scale? Isn’t student growth similar across grades? Don’t students change as much from Grade 3 to Grade
4 as they do from Grade 7 to Grade 8? Previous research on the use of vertical scales has
demonstrated that student growth is not linear; that is, growth in student achievement is
different from grade to grade (see Young 2006). For instance, Figure 11 below shows preliminary Discovery Education Assessment vertically scaled growth results. This graph shows
growth from Grades 3 to 10 in Mathematics as measured by Discovery Education Assessment’s
Spring benchmark tests. Typically, students have larger gains in mathematics achievement in the elementary grades, with growth slowing somewhat in middle and high school, a pattern also reported by other major testing companies.
Figure 11: Vertically Scaled Growth Results for Discovery Education Assessment Mathematics
Tests.
(Chart: Discovery/ThinkLink Growth Score in Mathematics; median scale score by grade, Grades 3 through 10, on a vertical scale of approximately 1300 to 1700.)
What is unique about the Discovery Education Assessment vertical growth scores?
Student growth can now be accurately measured at four points in time in each grade level. Discovery Education Assessment benchmark tests are administered up to four times yearly:
Early Fall, Late Fall, Winter, and Spring. For each time period, we report scale scores and
accompanying statistics. Most testing companies only allow the measurement of student growth
at two points in time: Fall and Spring. Discovery Education Assessment benchmark tests provide normative information to assess student growth multiple times each year. Figure 12
illustrates this growth for Grade 4 Mathematics using our benchmark assessments.
Figure 12: Within-Year Growth Results for Discovery Education Assessment Mathematics
Tests.
(Chart: Discovery/ThinkLink Growth in Grade 4 Mathematics; median scale score at the Early Fall, Late Fall, Winter, and Spring ThinkLink benchmark tests, vertical axis approximately 1380 to 1520.)
New Mexico Growth Scale
The following tables and figures illustrate average student scale scores on the Discovery Education Assessment vertical growth scale for the 0809 Reading and Mathematics tests across three periods: Fall, Winter, and Spring.
Reading Average Scale Scores
Fall 2008 Winter 2009 Spring 2009
Grade 3 1395 1397 1430
Grade 4 1412 1429 1472
Grade 5 1485 1504 1502
Grade 6 1532 1529 1578
Grade 7 1557 1562 1601
Grade 8 1572 1585 1605
Grade 9 1604 1636 1626
Grade 10 1672 1657 1581
Math Average Scale Scores
Fall 2008 Winter 2009 Spring 2009
Grade 3 1398 1352 1404
Grade 4 1440 1449 1456
Grade 5 1478 1487 1516
Grade 6 1516 1545 1584
Grade 7 1557 1560 1597
Grade 8 1595 1616 1630
Grade 9 1617 1611 1682
Grade 10 1638 1621 1638
(Chart: 0809 NM Reading Average Student Scale Scores, by grade (3-10) and test period (Fall, Winter, Spring).)
(Chart: 0809 NM Math Average Student Scale Scores, by grade (3-10) and test period (Fall, Winter, Spring).)
NCLB SCIENTIFICALLY-BASED RESEARCH
7. Are Discovery Education Predictive Assessments based on scientifically-based research
advocated by the U. S. Department of Education?
Discovery Education Assessment has also adhered to the criteria for “scientifically-based
research” put forth in the No Child Left Behind Act of 2001. “What is Predictive Assessment?”
has outlined how the reliability and validity of Discovery Education Predictive Assessments meet the following criteria for scientifically-based research set forth by NCLB:
(i) employs systematic, empirical methods that draw on observation and
experiment;
(ii) involves rigorous data analyses that are adequate to test the stated hypotheses
and justify the general conclusions drawn;
(iii) relies on measurements or observational methods that provide reliable and valid
data across evaluators and observers, across multiple measurements and
observations, and across studies by the same or different investigators;
Discovery Education Assessment also provides evidence of meeting the following scientifically-based research criterion:
(iv) is evaluated using experimental or quasi-experimental designs in which
individuals, entities, programs or activities are assigned to different conditions
and with appropriate controls to evaluate the effects of the condition of interest,
with a preference for random-assignment experiments, or other designs to the
extent that those designs contain within-condition or across-condition control.
Case Study One: Birmingham, Alabama City Schools
Larger schools and school districts typically do not participate in experimental or quasi-experimental studies due to logistical and ethical concerns. However, a unique situation in
Birmingham, Alabama afforded Discovery Education Assessment the opportunity to investigate the efficacy of its benchmark assessments with respect to a quasi-control group. In 2003-2004, approximately one-half of the schools in Birmingham City used Discovery Education
Predictive Assessments whereas the other half did not. At the end of the school year,
achievement results for both groups were compared revealing a significant improvement on the SAT10 for those schools that used the Discovery Education Predictive Assessments as opposed to
those that did not. Discovery Education Assessment subsequently compiled a brief report titled
the “Birmingham Case Study”. Excerpts from the case study are included below:
This study is based on data from elementary and middle schools in the City of Birmingham,
Alabama. In 2002-03, no Birmingham Schools used Discovery Education’s Predictive
Assessment Series. Starting in 2003-04, 20 elementary and 9 middle schools used the Discovery Education Assessment program. All Birmingham schools took the Stanford Achievement Test
Tenth Edition (SAT10) at the end of both school years. The SAT10 is administered yearly as part
of the State of Alabama’s School Accountability Program. The State of Alabama uses
improvement in SAT10 percentiles to gauge school progress and as part of its NCLB reporting.
National percentiles on the SAT10 are reported by subject and grade level. A single national
percentile is reported for all students within a subject and grade level (this analysis is
subsequently referred to as ALL STUDENTS). Furthermore, national percentiles are disaggregated by various subgroups within a school. For the comparisons that follow, the national percentiles
for students
classified as utilizing free and reduced lunch (referred to below as POVERTY) were used. All
percentiles have been converted to Normal Curve Equivalents (NCE) to allow for averaging of results.
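For readers unfamiliar with NCEs, the conversion is a monotone transform of the percentile rank onto an equal-interval scale with a mean of 50 and a standard deviation of 21.06, which is what makes averaging appropriate. A minimal sketch of the standard conversion (illustrative only; published conversion tables may differ slightly at the extremes):

    from scipy.stats import norm

    def percentile_to_nce(percentile):
        # NCE = 50 + 21.06 * z, where z is the normal deviate of the percentile rank
        z = norm.ppf(percentile / 100.0)
        return 50.0 + 21.06 * z

    print(round(percentile_to_nce(50), 1))  # 50.0
    print(round(percentile_to_nce(99), 1))  # approximately 99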
The Discovery Education Assessment schools comprise the experimental group in this study. The Birmingham schools that did not use Discovery Education Assessment comprise the matched
comparison group. The following charts show SAT10 National Percentile changes for DEA
Schools vs. Non-DEA Schools in two grade levels (Grades 5 and 6) for three subjects (Language, Mathematics, and Reading) for two groups of students (ALL STUDENTS and POVERTY students). In general, there was a significant decline or no improvement in SAT10 scores from 2002-03 to 2003-04 for most non-DEA schools. This trend, however, did not happen in the schools using Discovery Education Assessment: instead, there was a marked improvement, with most grades posting increases in language, math, and reading. In grade levels where Discovery Education Assessment schools did decline, the decline was much smaller than in schools that did not use Discovery Education Assessment.
As a result of the improvement that many of these schools made in school year 2003-04, the
Birmingham City Schools selected Discovery Education Assessment to be used with all of the
schools in school year 2004-05. The Birmingham City Schools also chose to provide professional development in each school to help all teachers become more familiar with the
concepts of standardized assessment and better utilize data to focus instruction.
(Chart: SAT 10 Math Comparisons, DEA vs. Non-DEA Schools in Birmingham, AL; NCE means for 2002-03 and 2003-04, for All Schools and Poverty Schools.)
(Chart: SAT 10 Grade 5 Reading Comparison, DEA vs. Non-DEA Schools in Birmingham, AL; NCE means for 2002-03 and 2003-04, for All Schools and Poverty Schools.)
(Chart: SAT 10 Grade 6 Language Comparison, DEA vs. Non-DEA Schools; NCE means for 2002-03 and 2003-04, for All Schools and Poverty Schools.)
The Birmingham, AL school system provided the following lunch status percentages for grades 5 and 6: Free 83%, Reduced 6%, Not Free/Reduced 11%.
Case Study Two: Metro Nashville, Tennessee City Schools
Metro Nashville schools that used Discovery Education Assessment made greater
improvements in AYP than Metro Nashville schools that didn’t use Discovery Education Assessment. During the 2004-2005 school year, sixty-five elementary and middle schools in Metro Nashville, representing over 20,000 students, used Discovery Education assessments.
Fifty-two elementary and middle schools, representing over 10,000 students, did not use
Discovery Education assessments. The improvement in the percent of students at the
Proficient/Advanced level from 2004 to 2005 is presented in the graphs below. The results compare DEA schools versus non-DEA schools in Metro Nashville. Discovery Education Assessment schools showed more improvement in AYP status from 2004 to 2005 both when all schools are combined and when elementary and middle schools are analyzed separately.
(Chart: Improvement in Reading TCAP Scores, 2005 over 2004; percent improvement for DEA Schools vs. Non-DEA Schools, Combined (n=20190), Elementary (n=5217), and Middle School (n=14948).)
(Chart: Improvement in Math TCAP Scores, 2005 over 2004; percent improvement for DEA Schools vs. Non-DEA Schools, Combined (n=21549), Elementary (n=5765), and Middle School (n=15759).)
The following frequency percentages summarize the NCLB demographic data provided to DEA by the Metro Nashville Public School System for the Elementary Schools.
Gender: Male 49%, Female 46%, No Gender Provided 5%.
Ethnicity: Black 43.45%, White 31.54%, Hispanic 15.99%, Asian 3.46%, American Indian 0.13%, Other 0.08%, No Ethnicity Provided 5.34%.
Free Lunch Status: Free/Reduced 67%, Not Free/Reduced 27%, No Status Provided 6%.
English Language Learners: Non-ELL Students 81.43%, ELL Students 12.96%, No Status Provided 5.61%.
Special Education: Not in Special Education 83%, Special Education 11%, No Status Provided 6%.
The following frequency percentages summarize the NCLB demographic data provided to DEA by the Metro Nashville Public School System for the Middle Schools.
Gender: Male 48%, Female 47%, No Gender Provided 5%.
Ethnicity: Black 46.43%, White 30.71%, Hispanic 14.40%, Asian 3.33%, American Indian 0.18%, Other 0.05%, No Ethnicity Provided 4.91%.
Free Lunch Status: Free/Reduced 66%, Not Free/Reduced 29%, No Status Provided 5%.
English Language Learners: Non-ELL Students 86.39%, ELL Students 8.52%, No Status Provided 5.09%.
Special Education: Not in Special Education 85%, Special Education 10%, No Status Provided 5%.
(v) ensures experimental studies are presented in sufficient detail and clarity to
allow for replication or, at a minimum, offer the opportunity to build
systematically on their findings;
Consumers are encouraged to request additional data or further details for the examples listed in
this overview. Discovery Education Assessment also compiles Technical Manuals specific to
each school district and/or state. Accumulated data are of sufficient detail to permit adequate
psychometric analyses, and their results have been consistently replicated across school districts and states. Past documents of interest include among others: “A Multi-State Comparison of
Proficiency Predictions for Fall 2006” and “A Multi-State Look at ‘What is Predictive
Assessment?’.” Furthermore, the “What is Predictive Assessment?” series of documents is
available for multiple states. Please check the Discovery Education website
www.discoveryeducation.com for document updates.
(vi) has been accepted by a peer-reviewed journal or approved by a panel of
independent experts through a comparably rigorous, objective and scientific
review;
Discovery Education Assessment tests and results have been incorporated and analyzed in the
following peer-reviewed manuscripts and publications:
Jacqueline Shrago and Dr. Michael Smith of ThinkLink Learning contributed a chapter on
formative assessment to “Online Assessment and Measurement: Case Studies from Higher
Education, K-12, and Corporate” by Scott Howell and Mary Hricko in 2006.
Dr. Elizabeth Vaughn-Neely, Associate Professor, Dept. of Leadership & Counselor Education,
the University of Mississippi (Ole Miss), and Dr. Marjorie Reed, Associate Professor, Dept. of Psychology, Oregon State University, presented their peer-reviewed findings based on their joint research and work with schools using Discovery Education Assessment at:
Society for Research in Child Development, Atlanta, GA, April 2005
Kappa Delta Pi Conference, Orlando, FL, November 2005
Society for the Scientific Study of Reading, July 2006
Two Ed.S. dissertations have also been published:
Dr. Juanita Johnson, Union University
Dr. Monica Eversole, Richmond, KY
Please contact us for other specific information requests. We welcome your interest in the
evidence supporting the efficacy of our Discovery Education Assessment tests.
Procedures for Item and Test Review
Discovery Education Assessment has established policies to review items in each
benchmark assessment for appropriate item statistics and for evidence of bias.
Furthermore, the collection of items that form a specific test is reviewed for alignment to
state or national standards and for appropriate test difficulty. This document outlines in
more detail how these processes are implemented.
Item Statistics
P-values and item discrimination indices are calculated for each item based on the
number of students that completed a benchmark assessment.
P-values represent the percentage of students tested who answered the item correctly.
Based on p-values alone, the following item revision procedures are followed:
Items with p-values of .90 or above are considered too “easy” and are revised or
replaced.
Items with p-values of .20 or below are considered too “hard” and are revised or
replaced.
Item discrimination indices (biserial correlations) are calculated for each item. Items with
low biserial correlations (less than .20) are revised or replaced.
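As an illustration of these revision rules, the sketch below flags items using the thresholds just listed (the data structure and names are ours; for example, item 24 of the Grade 4 Reading table later in this document would be flagged as both too hard and low in discrimination):

    def flag_items(item_stats):
        # item_stats: {item_number: (p_value, biserial)}
        flags = {}
        for item, (p_value, biserial) in item_stats.items():
            reasons = []
            if p_value >= 0.90:
                reasons.append("too easy")
            if p_value <= 0.20:
                reasons.append("too hard")
            if biserial < 0.20:
                reasons.append("low discrimination")
            if reasons:
                flags[item] = reasons
        return flags

    # Example values taken from the Grade 4 Reading item table below
    print(flag_items({24: (0.19, 0.13), 6: (0.87, 0.31)}))  # {24: ['too hard', 'low discrimination']}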
Test Content Validity and Statistics
The blueprints of state tests are monitored for changes in state standards. When state
blueprints have changed, Discovery Education Assessment revises the blueprints for
benchmark tests used in that state. Items may be revised or replaced where necessary to
match any revised blueprint and to maintain appropriate test content validity.
P, A, B, and C tests within the same year are compared to verify that all tests are of
comparable difficulty. When necessary, items are revised to maintain the necessary level
of difficulty within each test.
The predictability of Discovery Education benchmark assessments is examined closely.
When necessary, items are revised or replaced to improve the predictability of benchmark
assessments. In order to maintain high levels of predictability, Discovery Education
Assessment strives to revise or replace less than 20% of the items in any benchmark
assessment each year.
Differential Item Functioning
Differential Item Functioning (DIF) analyses are performed on items from tests where the
overall sample size is large (n = 1000 or more per test) and the sample size for each
subgroup meets minimum standards (usually 300 or more students per subgroup). DIF
analyses for Gender (males and females) and Ethnicity (Caucasian vs. African-American,
or Caucasian vs. Hispanic) are routinely performed when sample size minimums are met.
DIF analyses are conducted using Rasch modeling through the computer program
WINSTEPS. This program calculates item difficulty parameters for each DIF subgroup
and compares these logit estimates. DIF Size is calculated using industry standard
criteria (see p.1070 “An Adjustment for Sample Size in DIF Analysis”, Rasch
Measurement Transactions, 20:3, Winter 2006).
The following criteria are used to determine DIF Size:
Negligible: 0 logits to .42 logits (absolute value)
Moderate: .43 logits to .63 logits (absolute value)
Large: .64 logits and up (absolute value)
Items with Large DIF Size are marked for the content team to review. Subject matter
content experts analyze an item to determine the cause of Gender or Ethnic DIF. If the
content experts can determine this cause, the item is revised to remove the gender or
ethnicity bias. If the cause cannot be determined, the item is replaced with a different
item that measures the same sub-skill.
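The DIF Size classification can be stated directly in terms of the logit difference between subgroups. The sketch below restates the thresholds above (for example, the 0.56-logit Gender DIF reported for item 1 of the 0910A Grade 9 Reading form later in this document would be classified as Moderate):

    def dif_size(logit_difference):
        # Classify DIF size from the absolute difference in Rasch item
        # difficulty (in logits) between two subgroups.
        d = abs(logit_difference)
        if d >= 0.64:
            return "Large"
        if d >= 0.43:
            return "Moderate"
        return "Negligible"

    print(dif_size(0.56))   # Moderate
    print(dif_size(-0.12))  # Negligible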
TEST AND QUESTION STATISTICS, RELIABILITY, AND PERCENTILES
The following section reports test and question statistics, reliability, and percentiles for the New
Mexico benchmark tests, for grades 3-10, Reading and Mathematics. The grade 3-8 benchmark tests were administered in New Mexico in spring of 2009. The grade 9-10 benchmarks were administered in winter of 2010. These benchmark tests are representative samples of over 1000
benchmark tests developed by Discovery Education Assessment. Benchmark tests are revised each year based on test and question statistics, particularly low item discrimination indices and
significant DIF.
The following statistics are reported:
Number of Students: Number of students used for calculation of test statistics.
Number of Items: Number of items in each benchmark test (including common items used for scaling purposes).
Number of Proficiency Items: Number of items used to calculate the proficiency level.
Mean: Test mean in terms of number correct.
Standard Deviation: Test standard deviation.
Reliability: Cronbach’s alpha.
SEM: Standard Error of Measurement (SEM) for the test.
Scale Score: Discovery Education Assessment Scale Score for each number correct (scale scores are vertically scaled using Rasch measurement; scale scores from grades K-12 range from 1000 to 2000).
Percentiles: Percentage of students below each number correct score.
Stanines: Standard nine scores that range from 1 to 9.
Question P-values: The proportion correct for each item.
Biserial: Item discrimination using biserial correlation.
Rasch Item Difficulty: Rasch item difficulty parameter calculated using WINSTEPS.
DIF Gender: Rasch item difficulty difference (Male vs. Female).
DIF Ethnicity: Rasch item difficulty difference (White vs. Black).
DIF Size: Negligible, 0 to .42 logits (absolute value); Moderate, .43 to .63 logits (absolute value); Large, .64 logits and up (absolute value).
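The reported SEM follows the classical test theory relationship between the test standard deviation and reliability. A small sketch of that relationship (assuming the classical formula, which reproduces the tabled values):

    import math

    def standard_error_of_measurement(sd, reliability):
        # Classical SEM = SD * sqrt(1 - reliability)
        return sd * math.sqrt(1.0 - reliability)

    # Grade 3 Reading, 0809 Spring New Mexico: SD = 5.58, reliability = 0.83
    print(round(standard_error_of_measurement(5.58, 0.83), 2))  # 2.30, matching the table below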
Reading Grade 3
0809 Spring New Mexico
Test Statistics
Number of Students 546
Number of Items 28
Mean 17.40
Std. Deviation 5.58
Reliability 0.83
Std. Error of Measurement 2.30
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.70 0.45 -0.38 0 1010 15 1389
2 0.88 0.37 -1.77 1 1104 16 1402
3 0.58 0.29 0.23 2 1162 17 1415
4 0.79 0.41 -0.99 3 1197 18 1428
5 0.72 0.47 -0.50 4 1224 19 1442
6 0.55 0.46 0.43 5 1246 20 1456
7 0.30 0.31 1.71 6 1265 21 1472
8 0.48 0.34 0.74 7 1282 22 1489
9 0.74 0.50 -0.66 8 1298 23 1508
10 0.61 0.38 0.11 9 1312 24 1530
11 0.73 0.54 -0.55 10 1326 25 1557
12 0.81 0.36 -1.13 11 1339 26 1593
13 0.75 0.51 -0.73 12 1352 27 1650
14 0.61 0.48 0.12 13 1364 28 1744
15 0.61 0.45 0.10 14 1377
16 0.55 0.38 0.42
17 0.58 0.44 0.23
18 0.64 0.46 -0.07
19 0.61 0.43 0.11
20 0.69 0.47 -0.32
21 0.58 0.40 0.24
22 0.34 0.24 1.52
23 0.59 0.40 0.21
24 0.70 0.48 -0.41
25 0.55 0.42 0.39
26 0.64 0.36 -0.04
27 0.40 0.41 1.17
28 0.66 0.41 -0.18
Math Grade 3
0809 Spring New Mexico
Test Statistics
Number of Students 549
Number of Items 32
Mean 18.76
Std. Deviation 5.98
Reliability 0.83
Std. Error of Measurement 2.47
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.41 0.38 0.92 0 1000 17 1378
42 0.53 0.29 0.29 1 1080 18 1389
43 0.66 0.30 -0.33 2 1138 19 1400
44 0.55 0.36 0.21 3 1173 20 1411
45 0.57 0.46 0.11 4 1200 21 1423
46 0.62 0.50 -0.16 5 1221 22 1435
47 0.71 0.41 -0.64 6 1240 23 1448
48 0.69 0.42 -0.52 7 1256 24 1462
49 0.79 0.42 -1.08 8 1271 25 1476
50 0.62 0.49 -0.13 9 1285 26 1492
51 0.62 0.35 -0.14 10 1298 27 1510
52 0.53 0.45 0.32 11 1310 28 1532
53 0.75 0.35 -0.83 12 1322 29 1557
54 0.50 0.49 0.46 13 1334 30 1592
55 0.52 0.36 0.35 14 1345 31 1648
56 0.65 0.32 -0.28 15 1356 32 1741
57 0.85 0.29 -1.59 16 1367
58 0.50 0.28 0.43
59 0.68 0.38 -0.45
60 0.61 0.41 -0.09
61 0.50 0.29 0.44
62 0.32 0.31 1.39
63 0.66 0.49 -0.32
64 0.43 0.45 0.8
65 0.59 0.34 0.01
66 0.33 0.20 1.32
67 0.88 0.37 -1.87
68 0.52 0.41 0.35
69 0.72 0.50 -0.65
70 0.62 0.50 -0.12
71 0.48 0.37 0.53
72 0.34 0.36 1.26
Reading Grade 4
0809 Spring New Mexico
Test Statistics
Number of Students 601
Number of Items 28
Mean 16.39
Std. Deviation 5.05
Reliability 0.80
Std. Error of Measurement 2.26
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.51 0.42 0.41 0 1060 15 1449
2 0.63 0.31 -0.16 1 1154 16 1463
3 0.82 0.45 -1.36 2 1212 17 1476
4 0.25 0.3 1.77 3 1249 18 1490
5 0.66 0.43 -0.33 4 1277 19 1504
6 0.87 0.31 -1.76 5 1300 20 1519
7 0.75 0.41 -0.84 6 1319 21 1535
8 0.50 0.5 0.44 7 1337 22 1553
9 0.75 0.51 -0.85 8 1353 23 1572
10 0.70 0.51 -0.56 9 1368 24 1595
11 0.46 0.31 0.65 10 1383 25 1622
12 0.26 0.19 1.73 11 1397 26 1659
13 0.75 0.54 -0.87 12 1410 27 1716
14 0.66 0.38 -0.32 13 1423 28 1811
15 0.36 0.39 1.13 14 1436
16 0.42 0.34 0.83
17 0.63 0.55 -0.17
18 0.61 0.36 -0.09
19 0.45 0.43 0.69
20 0.61 0.3 -0.11
21 0.67 0.49 -0.38
22 0.64 0.45 -0.24
23 0.59 0.31 0
24 0.19 0.13 2.17
25 0.37 0.17 1.11
26 0.73 0.44 -0.74
27 0.77 0.46 -0.97
28 0.80 0.49 -1.17
Math Grade 4
0809 Spring New Mexico
Test Statistics
Number of Students 598
Number of Items 32
Mean 17.57
Std. Deviation 6.33
Reliability 0.84
Std. Error of Measurement 2.53
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.67 0.39 -0.60 0 1056 17 1445
42 0.87 0.41 -2.00 1 1150 18 1457
43 0.65 0.43 -0.50 2 1207 19 1468
44 0.72 0.30 -0.91 3 1242 20 1479
45 0.57 0.48 -0.09 4 1268 21 1491
46 0.65 0.26 -0.49 5 1289 22 1504
47 0.58 0.31 -0.14 6 1308 23 1517
48 0.39 0.34 0.80 7 1324 24 1531
49 0.74 0.42 -0.98 8 1339 25 1546
50 0.49 0.51 0.29 9 1353 26 1562
51 0.58 0.45 -0.13 10 1366 27 1581
52 0.45 0.27 0.50 11 1378 28 1603
53 0.67 0.47 -0.60 12 1390 29 1630
54 0.23 0.23 1.76 13 1401 30 1665
55 0.49 0.52 0.31 14 1412 31 1722
56 0.52 0.42 0.17 15 1423 32 1817
57 0.69 0.37 -0.71 16 1434
58 0.61 0.49 -0.28
59 0.72 0.49 -0.87
60 0.53 0.42 0.14
61 0.59 0.46 -0.17
62 0.50 0.48 0.28
63 0.34 0.47 1.09
64 0.44 0.44 0.55
65 0.56 0.52 -0.04
66 0.41 0.45 0.71
67 0.48 0.43 0.34
68 0.31 0.10 1.24
69 0.40 0.25 0.74
70 0.74 0.51 -1.03
71 0.59 0.52 -0.16
72 0.4 0.47 0.74
Reading Grade 5
0809 Spring New Mexico
Test Statistics
Number of Students 645
Number of Items 28
Mean 16.10
Std. Deviation 4.81
Reliability 0.76
Std. Error of Measurement 2.36
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.94 0.36 -2.67 0 1093 15 1490
2 0.82 0.41 -1.36 1 1188 16 1504
3 0.71 0.41 -0.65 2 1247 17 1517
4 0.41 0.37 0.83 3 1284 18 1531
5 0.38 0.27 0.99 4 1312 19 1546
6 0.55 0.45 0.18 5 1336 20 1561
7 0.55 0.40 0.18 6 1356 21 1577
8 0.65 0.56 -0.31 7 1374 22 1594
9 0.55 0.36 0.17 8 1391 23 1613
10 0.74 0.23 -0.82 9 1407 24 1636
11 0.50 0.40 0.43 10 1422 25 1663
12 0.51 0.39 0.39 11 1436 26 1698
13 0.39 0.27 0.96 12 1450 27 1755
14 0.41 0.24 0.83 13 1464 28 1849
15 0.50 0.35 0.42 14 1477
16 0.67 0.43 -0.40
17 0.30 0.33 1.39
18 0.81 0.48 -1.26
19 0.61 0.30 -0.09
20 0.33 0.26 1.23
21 0.66 0.32 -0.38
22 0.65 0.38 -0.33
23 0.76 0.49 -0.92
24 0.42 0.40 0.79
25 0.73 0.43 -0.76
26 0.60 0.43 -0.07
27 0.63 0.46 -0.20
28 0.29 0.16 1.44
Math Grade 5
0809 Spring New Mexico
Test Statistics
Number of Students 646
Number of Items 32
Mean 17.79
Std. Deviation 5.64
Reliability 0.80
Std. Error of Measurement 2.52
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.76 0.44 -1.10 0 1112 17 1504
42 0.74 0.51 -0.94 1 1205 18 1515
43 0.57 0.38 -0.06 2 1262 19 1526
44 0.68 0.32 -0.61 3 1298 20 1538
45 0.17 0.13 2.12 4 1324 21 1550
46 0.52 0.33 0.19 5 1346 22 1563
47 0.53 0.28 0.14 6 1364 23 1576
48 0.37 0.26 0.93 7 1381 24 1590
49 0.70 0.43 -0.72 8 1396 25 1606
50 0.43 0.31 0.59 9 1409 26 1622
51 0.64 0.50 -0.42 10 1422 27 1641
52 0.57 0.44 -0.06 11 1435 28 1663
53 0.58 0.41 -0.09 12 1447 29 1690
54 0.62 0.44 -0.31 13 1459 30 1725
55 0.68 0.39 -0.62 14 1470 31 1782
56 0.77 0.38 -1.14 15 1481 32 1876
57 0.67 0.37 -0.54 16 1492
58 0.76 0.46 -1.04
59 0.54 0.36 0.09
60 0.46 0.34 0.46
61 0.57 0.49 -0.03
62 0.70 0.31 -0.71
63 0.84 0.27 -1.67
64 0.46 0.40 0.47
65 0.37 0.41 0.91
66 0.17 0.17 2.17
67 0.56 0.44 -0.01
68 0.51 0.33 0.22
69 0.54 0.47 0.08
70 0.31 0.36 1.22
71 0.60 0.40 -0.20
72 0.41 0.26 0.69
Reading Grade 6
0809 Spring New Mexico
Test Statistics
Number of Students 621
Number of Items 28
Mean 17.33
Std. Deviation 4.94
Reliability 0.79
Std. Error of Measurement 2.26
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.57 0.29 0.34 0 1155 15 1541
2 0.60 0.32 0.19 1 1250 16 1555
3 0.74 0.38 -0.58 2 1307 17 1568
4 0.62 0.37 0.06 3 1343 18 1582
5 0.41 0.31 1.10 4 1370 19 1596
6 0.78 0.53 -0.85 5 1393 20 1611
7 0.34 0.48 1.46 6 1412 21 1628
8 0.75 0.34 -0.69 7 1430 22 1645
9 0.54 0.30 0.45 8 1446 23 1665
10 0.41 0.44 1.10 9 1461 24 1688
11 0.77 0.43 -0.79 10 1475 25 1716
12 0.80 0.39 -0.97 11 1489 26 1752
13 0.74 0.39 -0.58 12 1502 27 1811
14 0.78 0.54 -0.86 13 1515 28 1906
15 0.64 0.47 -0.04 14 1528
16 0.85 0.45 -1.39
17 0.87 0.44 -1.60
18 0.54 0.42 0.48
19 0.69 0.46 -0.31
20 0.36 0.30 1.35
21 0.63 0.44 0.01
22 0.25 0.26 1.95
23 0.68 0.49 -0.28
24 0.83 0.45 -1.25
25 0.35 0.21 1.42
26 0.82 0.42 -1.12
27 0.39 0.23 1.20
28 0.60 0.36 0.19
Math Grade 6
0809 Spring New Mexico
Test Statistics
Number of Students 634
Number of Items 32
Mean 19.12
Std. Deviation 5.78
Reliability 0.82
Std. Error of Measurement 2.45
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.61 0.46 -0.04 0 1172 17 1555
42 0.69 0.22 -0.44 1 1265 18 1566
43 0.82 0.21 -1.30 2 1321 19 1577
44 0.69 0.43 -0.43 3 1355 20 1588
45 0.84 0.38 -1.46 4 1381 21 1600
46 0.57 0.46 0.18 5 1402 22 1612
47 0.28 0.29 1.65 6 1420 23 1624
48 0.81 0.28 -1.17 7 1436 24 1638
49 0.26 0.25 1.75 8 1450 25 1652
50 0.67 0.39 -0.35 9 1464 26 1668
51 0.56 0.45 0.22 10 1476 27 1686
52 0.44 0.34 0.82 11 1488 28 1707
53 0.66 0.46 -0.28 12 1500 29 1733
54 0.77 0.31 -0.89 13 1511 30 1767
55 0.48 0.48 0.60 14 1522 31 1823
56 0.49 0.35 0.55 15 1533 32 1916
57 0.45 0.40 0.74 16 1544
58 0.61 0.50 -0.02
59 0.76 0.43 -0.84
60 0.69 0.45 -0.44
61 0.65 0.40 -0.22
62 0.67 0.35 -0.31
63 0.76 0.39 -0.83
64 0.50 0.38 0.50
65 0.55 0.40 0.29
66 0.50 0.43 0.50
67 0.35 0.53 1.25
68 0.80 0.46 -1.11
69 0.52 0.29 0.41
70 0.75 0.39 -0.78
71 0.35 0.21 1.28
72 0.58 0.39 0.14
Reading Grade 7
0809 Spring New Mexico
Test Statistics
Number of Students 227
Number of Items 28
Mean 17.57
Std. Deviation 5.05
Reliability 0.79
Std. Error of Measurement 2.32
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.67 0.26 -0.17 0 1189 15 1563
2 0.75 0.30 -0.63 1 1283 16 1575
3 0.61 0.39 0.12 2 1340 17 1587
4 0.72 0.37 -0.47 3 1375 18 1600
5 0.83 0.49 -1.22 4 1401 19 1613
6 0.63 0.38 0.01 5 1423 20 1627
7 0.60 0.32 0.19 6 1442 21 1642
8 0.22 0.22 2.17 7 1458 22 1658
9 0.78 0.47 -0.80 8 1474 23 1677
10 0.88 0.47 -1.68 9 1488 24 1698
11 0.65 0.42 -0.06 10 1501 25 1724
12 0.60 0.49 0.19 11 1514 26 1758
13 0.70 0.43 -0.37 12 1526 27 1815
14 0.78 0.41 -0.83 13 1539 28 1908
15 0.53 0.45 0.51 14 1551
16 0.41 0.24 1.11
17 0.54 0.31 0.47
18 0.45 0.29 0.92
19 0.68 0.48 -0.25
20 0.61 0.43 0.12
21 0.72 0.40 -0.44
22 0.61 0.31 0.15
23 0.80 0.40 -0.99
24 0.68 0.39 -0.22
25 0.40 0.40 1.18
26 0.49 0.42 0.70
27 0.63 0.39 0.06
28 0.59 0.46 0.23
Math Grade 7
0809 Spring New Mexico
Test Statistics
Number of Students 227
Number of Items 32
Mean 18.44
Std. Deviation 5.04
Reliability 0.78
Std. Error of Measurement 2.36
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.82 0.30 -1.35 0 1190 17 1578
42 0.92 0.22 -2.31 1 1284 18 1589
43 0.79 0.42 -1.16 2 1341 19 1600
44 0.32 0.40 1.32 3 1376 20 1612
45 0.62 0.48 -0.18 4 1402 21 1623
46 0.36 0.55 1.09 5 1424 22 1635
47 0.66 0.26 -0.36 6 1442 23 1648
48 0.67 0.53 -0.43 7 1458 24 1661
49 0.52 0.47 0.31 8 1473 25 1676
50 0.63 0.40 -0.25 9 1487 26 1692
51 0.18 0.37 2.24 10 1499 27 1710
52 0.92 0.26 -2.31 11 1512 28 1731
53 0.62 0.33 -0.18 12 1523 29 1757
54 0.80 0.36 -1.19 13 1535 30 1792
55 0.86 0.39 -1.72 14 1546 31 1848
56 0.57 0.36 0.05 15 1557 32 1941
57 0.63 0.20 -0.23 16 1567
58 0.30 0.33 1.42
59 0.63 0.49 -0.21
60 0.09 -0.05 3.08
61 0.78 0.29 -1.07
62 0.80 0.35 -1.22
63 0.26 0.33 1.63
64 0.12 0.41 2.72
65 0.55 0.50 0.16
66 0.35 0.16 1.16
67 0.61 0.37 -0.12
68 0.53 0.50 0.27
69 0.69 0.14 -0.55
70 0.77 0.46 -1.02
71 0.37 0.32 1.02
72 0.70 0.34 -0.60
Reading Grade 8
0809 Spring New Mexico
Test Statistics
Number of Students 243
Number of Items 28
Mean 17.07
Std. Deviation 4.82
Reliability 0.78
Std. Error of Measurement 2.26
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.86 0.30 -1.52 0 1170 15 1571
2 0.48 0.22 0.72 1 1265 16 1585
3 0.93 0.25 -2.39 2 1324 17 1598
4 0.58 0.50 0.23 3 1361 18 1612
5 0.85 0.38 -1.41 4 1390 19 1627
6 0.74 0.37 -0.58 5 1414 20 1642
7 0.39 0.31 1.17 6 1434 21 1658
8 0.36 0.30 1.34 7 1453 22 1675
9 0.84 0.35 -1.34 8 1470 23 1695
10 0.49 0.26 0.68 9 1486 24 1717
11 0.67 0.48 -0.19 10 1501 25 1744
12 0.50 0.42 0.62 11 1516 26 1780
13 0.49 0.35 0.68 12 1530 27 1837
14 0.74 0.36 -0.58 13 1544 28 1931
15 0.56 0.40 0.35 14 1558
16 0.73 0.53 -0.56
17 0.66 0.35 -0.17
18 0.41 0.40 1.09
19 0.36 0.40 1.34
20 0.58 0.41 0.25
21 0.46 0.39 0.82
22 0.76 0.50 -0.71
23 0.44 0.38 0.92
24 0.69 0.39 -0.33
25 0.36 0.49 1.34
26 0.49 0.37 0.70
27 0.89 0.20 -1.77
28 0.75 0.36 -0.69
Math Grade 8
0809 Spring New Mexico
Test Statistics
Number of Students 237
Number of Items 32
Mean 18.14
Std. Deviation 5.90
Reliability 0.83
Std. Error of Measurement 2.43
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
41 0.65 0.38 -0.40 0 1203 17 1612
42 0.85 0.33 -1.65 1 1299 18 1624
43 0.44 0.24 0.67 2 1359 19 1635
44 0.71 0.37 -0.72 3 1397 20 1647
45 0.79 0.41 -1.23 4 1425 21 1660
46 0.55 0.48 0.13 5 1448 22 1673
47 0.32 0.36 1.30 6 1467 23 1686
48 0.77 0.34 -1.06 7 1485 24 1701
49 0.28 0.15 1.50 8 1500 25 1717
50 0.76 0.43 -1.00 9 1515 26 1734
51 0.55 0.44 0.11 10 1529 27 1754
52 0.35 0.34 1.12 11 1542 28 1777
53 0.56 0.32 0.09 12 1554 29 1806
54 0.65 0.31 -0.40 13 1566 30 1844
55 0.58 0.31 -0.04 14 1578 31 1906
56 0.62 0.33 -0.20 15 1589 32 2000
57 0.38 0.47 0.94 16 1601
58 0.51 0.55 0.31
59 0.47 0.37 0.52
60 0.69 0.53 -0.61
61 0.86 0.51 -1.73
62 0.84 0.41 -1.55
63 0.65 0.47 -0.38
64 0.37 0.49 1.01
65 0.37 0.24 1.01
66 0.43 0.52 0.69
67 0.37 0.28 1.03
68 0.64 0.42 -0.33
69 0.66 0.55 -0.45
70 0.66 0.52 -0.42
71 0.52 0.40 0.27
72 0.29 0.36 1.47
Reading Grade 9
0910 Winter New Mexico
Test Statistics
Number of Students 284
Number of Items 31
Mean 16.18
Std. Deviation 5.61
Reliability .80
Std. Error of Measurement 2.51
Question Statistics columns (Item No., P-Value, Biserial, Rasch Item Difficulty), followed on some rows by Scale Score conversion pairs (No. Correct, Scale Score):
1 0.49 0.18 0.2 0 1251 16 1643
2 0.61 0.30 -0.42 1 1345 17 1655
3 0.75 0.43 -1.17 2 1402 18 1666
4 0.42 0.38 0.52 3 1438 19 1678
5 0.55 0.46 -0.1 4 1464 20 1690
6 0.73 0.52 -1.02 5 1486 21 1702
7 0.47 0.41 0.28 6 1505 22 1716
8 0.23 0.11 1.54 7 1521 23 1730
9 0.45 0.37 0.37 8 1536 24 1746
10 0.51 0.25 0.07 9 1550 25 1763
11 0.62 0.43 -0.45 10 1563 26 1784
12 0.68 0.43 -0.79 11 1576 27 1810
13 0.44 0.3 0.43 12 1587 28 1844
14 0.88 0.29 -2.08 13 1599 29 1899
15 0.58 0.54 -0.26 14 1610 30 1992
16 0.51 0.44 0.05 15 1621 31 1643
17 0.39 0.23 0.64
18 0.62 0.45 -0.43
19 0.40 0.54 0.59
20 0.61 0.47 -0.42
21 0.61 0.53 -0.43
22 0.45 0.38 0.35
23 0.34 0.03 0.93
24 0.44 0.42 0.4
25 0.37 0.42 0.74
26 0.46 0.31 0.32
27 0.39 0.41 0.64
28 0.33 0.28 0.93
29 0.81 0.36 -1.56
30 0.50 0.44 0.1
31 0.51 0.50 0.07
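Each "No. Correct / Scale Score" panel above is a raw-score-to-scale-score conversion table. Under a Rasch model, the ability estimate for a given raw score is the theta at which the expected number correct (the test characteristic curve) equals that raw score, and the reported scale score is a linear rescaling of theta. A minimal sketch of that conversion is given below; the item difficulties, slope, and intercept are illustrative placeholders, not the operational values used to produce these tables.

import numpy as np
from scipy.optimize import brentq

def rasch_theta_for_raw(raw_score, difficulties):
    """Solve for the theta at which the Rasch test characteristic curve equals raw_score."""
    def tcc_minus_raw(theta):
        expected_correct = np.sum(1.0 / (1.0 + np.exp(-(theta - difficulties))))
        return expected_correct - raw_score
    return brentq(tcc_minus_raw, -8.0, 8.0)

# Illustrative item difficulties (in logits) and a hypothetical linear rescaling to the reporting scale
difficulties = np.linspace(-2.0, 2.0, 31)
slope, intercept = 45.0, 1500.0   # placeholders; the operational constants are not given in these tables

for raw in range(1, 31):          # extreme scores (0 and 31) need a separate adjustment such as raw +/- 0.5
    theta = rasch_theta_for_raw(raw, difficulties)
    print(raw, round(slope * theta + intercept))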
0910A NM Reading 9th Grade Assessment
Columns: NM Item Number, Also found in (test / grade), Question #, Biserial, DIF Gender, DIF Eth
1 KY / 9th 10 0.3 0.56 0.27
2 FL / 9th 11 0.38 0.09 0.22
3 - - - - -
4 IL / 9th 16 0.32 0.06 0.04
5 MS / 9th 17 0.33 NA NA
6 - - - - -
7 - - - - -
8 - - - - -
9 FL / 9th 30 0.5 0.09 0.09
10 - - - - -
11 MS / 9th 8 0.18 NA NA
12 FL / 9th 19 0.45 0.08 0.44
13 - - - - -
14 IL / 9th 5 0.16 0.39 0.46
15 IL / 9th 7 0.43 0.08 0.36
16 IL / 9th 18 0.47 0.02 0.03
17 - - - - -
18 - - - - -
19 - - - - -
20 - - - - -
21 IL / 9th 14 0.39 0.02 0.35
22 - - - - -
23 - - - - -
24 - - - - -
25 - - - - -
26 FL / 9th 28 0.39 0.21 0.11
27 - - - - -
28 KY / 9th 23 0.45 0.29 0.04
29 - - - - -
30 - - - - -
31 - - - - -
0910B NM Reading 9th Grade Assessment
Columns: NM Item #, Also found in (test / grade), Item #, Biserial, DIF Eth, DIF Gen
1 KY / 9th 3 0.31 0.35 0.04
2 WI / 9th 2 0.24 NA NA
3 - - - - -
4 KY / 9th 6 0.45 0.51 0.08
5 - - - - -
6 - - - - -
7 IL / 9th 10 0.43 0.08 0.24
8 - - - - -
9 IL / 9th 11 0.49 0.1 0.17
10 FL / 9th 19 0.4 0.25 0.04
11 - - - - -
12 IL / 9th 22 0.44 0.07 0.08
13 MS / 9 23 0.38 0.01 0.15
14 - - - - -
15 IL / 9th 14 0.52 0.09 0.08
16 FL / 9th 13 0.49 0.16 0.08
17 - - - - -
18 FL / 9th 12 0.48 0.0 0.03
19 US / HS 20 0.49 NA NA
20 FL / 9th 9 0.44 0.49 0.05
21 - - - - -
22 IL / 9th 24 0.38 0.23 0.12
23 - - - - -
24 - - - - -
25 - - - - -
26 IL / 9th 26 0.46 0.02 0.16
27 - - - - -
28 - - - - -
29 - - - - -
30 KY / 9th 32 0.41 0.03 0.04
31 KY / 9th 33 0.52 0.12 0.32
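The DIF Gender and DIF Eth columns in these cross-form tables summarize how differently two matched subgroups performed on each item. These tables do not specify the exact differential item functioning statistic, so the sketch below illustrates one common approach, a standardized p-difference computed within total-score strata, purely as an example. All names and data in the sketch are hypothetical.

import numpy as np

def standardized_p_difference(item, total, group):
    """Focal-minus-reference difference in item proportion correct,
    averaged over total-score strata (a simple standardization-style DIF index)."""
    edges = np.quantile(total, np.linspace(0, 1, 11))             # decile boundaries of the total score
    strata = np.clip(np.searchsorted(edges, total, side="right") - 1, 0, 9)

    diffs, weights = [], []
    for s in np.unique(strata):
        in_stratum = strata == s
        reference = item[in_stratum & (group == 0)]
        focal = item[in_stratum & (group == 1)]
        if len(reference) and len(focal):
            diffs.append(focal.mean() - reference.mean())
            weights.append(in_stratum.sum())
    return float(np.average(diffs, weights=weights))

# Hypothetical data: 300 students, one scored item, a total test score, and a 0/1 group indicator
rng = np.random.default_rng(1)
total = rng.integers(0, 32, size=300)
item = (rng.random(300) < 0.5).astype(int)
group = rng.integers(0, 2, size=300)
print(round(standardized_p_difference(item, total, group), 2))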
Reading Grade 10
0910 Spring New Mexico
Test Statistics
Number of Students 273
Number of Items 31
Mean 17.81
Std. Deviation 6.14
Reliability 0.84
Std. Error of Measurement 2.43
Question Statistics columns: Item No., P-Value, Biserial, Rasch Item Difficulty
Scale Scores columns (two No. Correct / Scale Score pairs per row): No. Correct, Scale Score, No. Correct, Scale Score
1 0.56 0.38 0.13 0 1260 16 1633
2 0.65 0.41 -0.34 1 1353 17 1644
3 0.61 0.47 -0.15 2 1409 18 1655
4 0.81 0.50 -1.37 3 1444 19 1667
5 0.63 0.61 -0.24 4 1469 20 1678
6 0.67 0.43 -0.46 5 1490 21 1690
7 0.61 0.52 -0.13 6 1508 22 1703
8 0.91 0.44 -2.3 7 1524 23 1717
9 0.71 0.5 -0.67 8 1539 24 1731
10 0.77 0.42 -1.03 9 1552 25 1747
11 0.54 0.33 0.2 10 1565 26 1765
12 0.47 0.28 0.55 11 1577 27 1786
13 0.40 0.42 0.94 12 1589 28 1812
14 0.75 0.38 -0.94 13 1600 29 1847
15 0.50 0.34 0.42 14 1611 30 1903
16 0.58 0.55 0.04 15 1622 31 1997
17 0.70 0.41 -0.63
18 0.70 0.57 -0.65
19 0.61 0.62 -0.13
20 0.34 0.32 1.25
21 0.40 0.17 0.94
22 0.53 0.52 0.26
23 0.53 0.33 0.26
24 0.46 0.41 0.62
25 0.75 0.53 -0.96
26 0.46 0.38 0.62
27 0.59 0.47 -0.05
28 0.23 0.28 1.9
29 0.50 0.41 0.42
30 0.48 0.40 0.53
31 0.39 0.23 0.97
0910A NM Reading 10th Grade Assessment
Columns: NM Item Number, Also Found in (test / grade), Question #, Biserial, DIF Gender, DIF Eth
1 IL / 10th 7 0.51 0.2 0.08
2 - - - - -
3 - - - - -
4 IL / 10th 10 0.2 0.27 0.38
5 DC / 10th 2 0.46 0.19 0.26
6 - - - - -
7 KY / 10th 6 0.49 0.35 0.02
8 - - - - -
9 - - - - -
10 - - - - -
11 IL / 10th 2 0.41 0.1 0.0
12 US / HS 22 0.44 0.24 0.16
13 - - - - -
14 TN / HS 13 0.44 0.05 0.05
15 - - - - -
16 DC / 10th 34 0.42 0.03 1.2
17 - - - - -
18 - - - - -
19 - - - - -
20 FL / 10th 17 0.6 0.48 0.38
21 FL / 10th 18 0.54 0.56 0.25
22 - - - - -
23 TN / HS 17 0.41 0.15 0.07
24 FL / 10th 19 0.53 0.29 0.09
25 FL / 10th 20 0.39 0.0 0.03
26 - - - - -
27 IL / 10th 21 0.35 0.47 0.46
28 - - - - -
29 - - - - -
30 - - - - -
31 - - - - -
0910B NM Reading 10th Grade Assessment
Columns: NM Item #, Also found in (test / grade), Item #, Biserial, DIF Eth, DIF Gen
1 - - - - -
2 IL / 10th 2 0.45 0.31 0.3
3 IL / 10th 3 0.39 0.08 0.01
4 IL / 10th 4 0.47 0.27 0.29
5 - - - - -
6 TN / HS 13 0.46 0.24 0.12
7 - - - - -
8 KY / 10th 4 0.46 0.08 0.19
9 KY / 10th 6 0.5 0.12 0.02
10 - - - - -
11 KY / 10th 8 0.28 0.27 0.13
12 IL / 10th 10 0.29 0.0 0.05
13 KY / 10th 18 0.45 0.02 0.13
14 - - - - -
15 KY / 10th 20 0.39 0.19 0.26
16 DC / 10th 11 0.52 0.18 0.3
17 KY / 10th 21 0.52 0.08 0.23
18 - - - - -
19 DC / 10th 6 0.46 0.42 0.06
20 DC / 10th 4 0.2 0.81 0.14
21 DC / 10th 5 0.3 0.91 0.21
22 KY / 10th 24 0.48 0.48 0.39
23 - - - - -
24 DC / 10th 25 0.44 0.11 0.23
25 DC / 10th 26 0.43 0.46 0.44
26 TN / HS 18 0.33 0.7 0.45
27 TN / HS 19 0.29 0.4 0.22
28 - - - - -
29 - - - - -
30 IL / 10th 23 0.47 0.04 0.01
31 KY / 10th 29 0.36 0.28 0.46
Math Grade 9
0910 Winter New Mexico
Test Statistics
Number of Students 290
Number of Items 32
Mean 10.79
Std. Deviation 3.54
Reliability 0.49
Std. Error of Measurement 2.53
0910A New Mexico Test & Item Stats for 9th Grade Math
Columns: NM Item Number, NM Item Average, NM Biserial, Rasch Item Difficulty, Also Found in (test / grade), Item #, Biserial, DIF Gen, DIF Eth
41 0.43 0.34 -0.5 IL / 9th 76 0.35 0.22 0.07
42 0.65 0.26 -1.45 IL / 9th 42 0.35 0.21 0.00
43 0.65 0.36 -1.45 AL / HS 4 0.39 0.05 0.39
44 0.35 0.24 -0.11 KY / 10th 56 0.31 0.03 0.03
45 0.14 -0.06 1.16 IL / 10th 51 0.02 0.39 0.86
46 0.23 0.32 0.50 IL / 9th 73 0.23 0.12 0.28
47 0.19 0.05 0.77 US / HS 2 0.1 0.42 0.01
48 0.26 0.12 0.33 IL / HS 28 0.51 0.4 0.22
49 0.46 0.12 -0.60 - - - - -
50 0.45 0.42 -0.57 - - - - -
51 0.24 0.21 0.46 IL / 10th 70 0.29 0.36 0.2
52 0.37 0.24 -0.19 IL / 10th 72 0.29 0.01 0.18
53 0.37 -0.02 -0.22 - - - - -
54 0.23 0.07 0.48 AL / HS 37 0.28 0.31 0.81
55 0.28 0.06 0.22 - - - - -
56 0.37 0.56 -0.2 DC / 10th 5 0.51 0.17 1.57
57 0.18 0.34 0.84 FL / 9th 58 0.42 0.02 0
58 0.22 -0.06 0.59 - - - - -
59 0.14 0.16 1.16 - - - - -
60 0.40 0.24 -0.36 IL / 9th 74 0.32 0.17 0.05
61 0.32 0.37 0.01 - - - - -
62 0.18 0.01 0.84 - - - - -
63 0.43 0.32 -0.48 IL / 9th 63 0.4 0.21 0.04
64 0.34 0.37 -0.07 IL / 9th 56 0.37 0.12 0.04
65 0.51 0.21 -0.84 FL / 9th 46 0.39 0.29 0.06
66 0.35 0.37 -0.11 IL / 9th 65 0.51 0.21 0.2
67 0.29 0.29 0.20 FL / 9th 45 0.4 0.14 0.24
68 0.38 0.21 -0.27 FL / 9th 65 0.26 0.27 0.11
69 0.31 0.36 0.08 - - - - -
70 0.50 0.41 -0.79 FL / 10th 49 0.47 0.02 0.06
71 0.36 0.42 -0.16 KY / 10th 72 0.42 0.05 0.62
72 0.20 0.20 0.72 FL / 10th 67 0.36 0.2 0.11
Math Grade 10
0910 Winter New Mexico
Test Statistics
Number of Students 300
Number of Items 32
Mean 12.01
Std. Deviation 4.16
Reliability 0.66
Std. Error of Measurement 2.43
0910A New Mexico Test & Item Stats for 10th Grade Math
Columns: NM Item Number, NM Item Average, NM Biserial, Rasch Item Difficulty, Also Found in (test / grade), Item #, Biserial, DIF Gen, DIF Eth
41 0.04 0.09 2.66 - - - - -
42 0.57 0.48 -0.98 FL / 10th 50 0.55 0.29 0.44
43 0.35 0.21 0.02 - - - - -
44 0.4 0.32 -0.25 IL / 10th 55 0.33 0.05 0.35
45 0.09 -0.15 1.84 AL / HS 18 0.18 0.04 0.26
46 0.77 0.45 -1.99 AL / HS 22 0.39 0.12 0.45
47 0.23 0.31 0.67 - - - - -
48 0.57 0.39 -1.00 IL / 10th 54 0.40 0.35 0.15
49 0.54 0.48 -0.85 IL / HS 2 0.37 0.34 0.06
50 0.14 0.2 1.34 US / HS 22 0.23 0.14 0.31
51 0.48 0.11 -0.61 IL / 10th 42 0.19 0.22 0.26
52 0.87 0.2 -2.78 - - - - -
53 0.36 0.38 -0.05 - - - - -
54 0.29 0.27 0.29 US / HS 6 0.26 0.23 0.05
55 0.53 0.37 -0.81 - - - - -
56 0.58 0.43 -1.05 - - - - -
57 0.31 0.29 0.20 - - - - -
58 0.09 -0.01 1.80 - - - - -
59 0.28 0.41 0.36 FL / 10th 61 0.52 0.00 0.24
60 0.28 0.1 0.38 - - - - -
61 0.38 0.38 -0.16 IL / 10th 61 0.33 0.14 0.08
62 0.25 0.16 0.51 KY / 10th 62 0.31 0.2 0.04
63 0.51 0.42 -0.75 AL / HS 15 0.33 0.00 0.1
64 0.14 -0.01 1.34 - - - - -
65 0.26 0.42 0.46 KY / 10th 70 0.47 0.05 0.23
66 0.57 0.32 -0.98 KY / 10th 41 0.37 0.1 0.47
67 0.27 0.38 0.42 KY / 10th 42 0.4 0.09 0.01
68 0.46 0.28 -0.49 IL / 10th 75 0.30 0.12 0.23
69 0.48 0.5 -0.58 DC / 10th 11 0.42 0.15 0.46
70 0.2 0.14 0.87 DC / 10th 19 0.29 0.11 0.23
71 0.15 0.22 1.20 - - - - -
72 0.58 0.36 -1.05 - - - - -
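The reliability coefficients reported in the test-statistics blocks above are internal-consistency estimates for dichotomously scored items. Whether the operational value is KR-20, coefficient alpha, or a Rasch-based index is not stated in these tables, so the sketch below shows KR-20 only as an illustration, using randomly generated placeholder data simulated from a Rasch-like model.

import numpy as np

def kr20(responses):
    """Kuder-Richardson 20 reliability for a 0/1 scored response matrix (students x items)."""
    k = responses.shape[1]                          # number of items
    p = responses.mean(axis=0)                      # proportion correct per item
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of raw total scores
    return (k / (k - 1)) * (1 - np.sum(p * (1 - p)) / total_var)

# Hypothetical data: abilities and difficulties drawn at random, responses generated from a Rasch model
rng = np.random.default_rng(2)
ability = rng.normal(size=(300, 1))                 # one latent trait value per student
difficulty = rng.normal(size=(1, 32))               # one difficulty per item
prob_correct = 1.0 / (1.0 + np.exp(-(ability - difficulty)))
responses = (rng.random((300, 32)) < prob_correct).astype(int)

print(round(kr20(responses), 2))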