CCSSO Criteria for High-Quality Assessments
Technical Issues and Practical Application of Assessment Quality Criteria
Background
• Test Validity
• Madaus call for Test Monitoring Board
• Unified Concept of Test Validity
• AERA/APA/NCME Standards for Educational and Psychological Testing
• Peer-Review
• Criteria for High-Quality Assessments
Technical Quality
• Indicate Progress toward College & Career Readiness
• Valid for Required and Intended Purposes
• Ensuring Reliability
• Valid and Consistent Test Score Interpretations within and across Years
• Accessibility to ALL students, including English Learners and Students with Disabilities
• Transparency of Test Design and Expectations
• Meeting Requirements for Data Privacy and Ownership
Progress Toward College & Career Readiness
• Description of process for developing performance level descriptors and setting performance standards, including:
– Involvement of higher education & career experts
– Use of external evidence to inform standards
– Evidence that external benchmarks are valid for intended purpose
• Description of studies to be conducted to evaluate validity of standards over time
Valid for Required and Intended Purposes
• Well-articulated validity evaluation that focuses on whether:
– test score uses are valid
– scoring and reporting structures are consistent with the structure of state standards
– total test scores and sub-scores are related to external variables as expected
– assessments lead to intended outcomes
– test forms have content validity and score reports are useful
Ensuring Reliability
• Reliability of test scores for total population and reported sub-populations
• Precision of scores at cut points and consistency of student classifications
• Generalizability for relevant sources
– Variability within and across groups
– Variability among schools
– Consistency across test forms
– Consistency across rater scores
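As an illustration of the reliability evidence listed above, the following is a minimal sketch, not a method named in the criteria. It assumes internal consistency is summarized with Cronbach's alpha and that score precision is expressed as a classical standard error of measurement (SEM). The function names and the sample data are hypothetical.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    Assumes every student has a score on every item (no missing data).
    """
    k = len(scores[0])  # number of items
    # Sample variance of each item across students.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total score across students.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(score_sd, reliability):
    """Classical SEM: score SD times sqrt(1 - reliability)."""
    return score_sd * sqrt(1 - reliability)


# Hypothetical item scores: 5 students by 4 dichotomous items.
sample = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(sample)
sem = standard_error_of_measurement(score_sd=10.0, reliability=alpha)
print(round(alpha, 3), round(sem, 3))
```

The same functions could be run separately for each reported sub-population. Precision at specific cut points would normally need a conditional SEM, which this sketch does not compute.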
Consistency in Score Interpretations Within and Across Years
• Assessment Forms
– Comparability within years
– Linking across years
– Consistency in meaning across achievement levels
• Score Scales
– Method used to transform raw to scale score coherent with test design and intended claims
– Scaling procedures support intended interpretations
– Evidence supports validity of vertical scales
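To make the raw-to-scale transformation concrete, here is a minimal sketch that assumes a simple linear transformation fixed by two anchor points. This is an illustrative assumption only: operational programs often derive scale scores from IRT-based procedures, and the function name and anchor values below are hypothetical.

```python
def make_linear_scale(raw_lo, raw_hi, scale_lo, scale_hi):
    """Return a function that maps raw scores to scale scores linearly.

    The mapping sends raw_lo to scale_lo and raw_hi to scale_hi.
    """
    slope = (scale_hi - scale_lo) / (raw_hi - raw_lo)
    intercept = scale_lo - slope * raw_lo

    def to_scale(raw):
        return slope * raw + intercept

    return to_scale


# Hypothetical anchors: raw scores 0..50 reported on a 100..200 scale.
to_scale = make_linear_scale(raw_lo=0, raw_hi=50, scale_lo=100, scale_hi=200)
print(to_scale(25))
```

Documenting the anchors and the resulting slope and intercept is one way a program could show that the transformation is coherent with the test design.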
Accessibility for ALL Students
• Principles of Universal Design
– Description of item development process used to reduce construct irrelevance
– Sample items and interfaces that reflect principles of Universal Design
• Appropriate Accommodations
– Accessibility features
– Access to translations and definitions
– Construct validity of accessibility features
• Evidence that the test items and accessibility features permit ELL and SWD to demonstrate their knowledge, skills, and abilities
Transparency of Test Design
• Test blueprints demonstrate range of standards covered
• Release plan yields representative sample of items on regular basis and across grades
• Sample items with annotations and scoring rubrics available
• Item development specifications available
Data Privacy
• Assurance of student privacy protection that complies with all federal and state requirements
• Assurance of state ownership of all data
• State is provided all underlying data in timely and usable manner to support secondary state analyses
• Description of secure data management procedures
Challenges to Implementation
• Validity and Reliability are not dichotomous and acceptable levels vary based on test use
– Guidance on what are acceptable levels for different uses
• Timing of availability of evidence
– New Programs may only have descriptions of plans
– Existing Programs have student data to perform analyses
• Objectivity, Transparency, and Reliability of Quality Review
Proposed Phased Approach
• Phase I: Content and Test Design
– Focus on alignment with standards, item quality, and accessibility features
– Occurs after initial item development and test form construction
• Phase II: Test Characteristics
– Focus on validity, reliability, and generalizability
– Occurs after field testing and/or first operational use
• Phase III: Program Implementation
– Focus on Test Administration, Reporting, and Test/Item Pool Maintenance
– Occurs after second or third year of operational administration