CCSSO Criteria for High-Quality Assessments: Technical Issues and Practical Application of Assessment Quality Criteria

Background

• Test Validity
• Madaus call for Test Monitoring Board
• Unified Concept of Test Validity
• AERA/APA/NCME Standards for Educational and Psychological Testing
• Peer-Review
• Criteria for High-Quality Assessments

Technical Quality

• Indicate Progress toward College & Career Readiness
• Valid for Required and Intended Purposes
• Ensuring Reliability
• Valid and Consistent Test Score Interpretations within and across Years
• Accessibility to ALL students, including English Learners and Students with Disabilities
• Transparency of Test Design and Expectations
• Meeting Requirements for Data Privacy and Ownership

Progress Toward College & Career Readiness

• Description of process for developing performance level descriptors and setting performance standards, including:
  – Involvement of higher education & career experts
  – Use of external evidence to inform standards
  – Evidence that external benchmarks are valid for intended purpose
• Description of studies to be conducted to evaluate validity of standards over time

Valid for Required and Intended Purposes

• Well-articulated validity evaluation that focuses on:
  – the validity of test score uses
  – scoring and reporting structures consistent with structure of state standards
  – total test and sub-scores are related to external variables as expected
  – assessments lead to intended outcomes
  – content validity of test forms and usefulness of score reports

Ensuring Reliability

• Reliability of test scores for total population and reported sub-populations

• Precision of scores at cut points and consistency of student classifications

• Generalizability for relevant sources
  – Variability within and across groups
  – Variability among schools
  – Consistency across test forms
  – Consistency across rater scores
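
The reliability evidence listed above can be illustrated with a few classical test theory statistics. The sketch below is only one possible set of indices, not a method named in these slides: it assumes Cronbach's alpha for internal consistency, the standard error of measurement for score precision, and Cohen's kappa for agreement between two raters. All function names and sample data are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import pvariance


def cronbach_alpha(scores):
    """Internal-consistency estimate; scores is a list of examinee rows of item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across examinees, and variance of total scores.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(score_sd, reliability):
    """Classical SEM: SD * sqrt(1 - reliability)."""
    return score_sd * sqrt(1 - reliability)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same responses."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal score distribution.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical data: 4 examinees x 3 items, and two raters scoring 6 responses.
    item_scores = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
    print("alpha:", round(cronbach_alpha(item_scores), 3))
    print("SEM:", round(standard_error_of_measurement(10.0, 0.91), 3))
    print("kappa:", round(cohen_kappa([2, 3, 3, 1, 4, 2], [2, 3, 2, 1, 4, 2]), 3))
```

A review would typically report such statistics for the total population and for each reported sub-population, and compare the standard error near each cut point with the distance between cut points.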

Consistency in Score Interpretations Within and Across Years

• Assessment Forms
  – Comparability within years
  – Linking across years
  – Consistency in meaning across achievement levels
• Score Scales
  – Method used to transform raw to scale score coherent with test design and intended claims
  – Scaling procedures support intended interpretations
  – Evidence supports validity of vertical scales
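
As an illustration of what linking forms can involve, the sketch below uses linear equating, which places Form X scores on the Form Y scale by matching means and standard deviations. This is an assumption for illustration only: the slides do not name a linking method, and operational programs often use IRT-based procedures instead. The function name and sample scores are hypothetical, and the example assumes the two forms were taken by randomly equivalent groups.

```python
from statistics import mean, pstdev


def linear_equate(form_x_scores, form_y_scores):
    """Return a function mapping a Form X raw score onto the Form Y scale.

    Assumes randomly equivalent groups took each form (an illustrative assumption).
    """
    mean_x, mean_y = mean(form_x_scores), mean(form_y_scores)
    sd_x, sd_y = pstdev(form_x_scores), pstdev(form_y_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance")
    slope = sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


if __name__ == "__main__":
    # Hypothetical raw scores from two forms.
    to_form_y = linear_equate([10, 20, 30], [12, 24, 36])
    print(round(to_form_y(25), 2))
```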

Accessibility for ALL Students

• Principles of Universal Design
  – Description of item development process used to reduce construct irrelevance
  – Sample items and interfaces that reflect principles of Universal Design
• Appropriate Accommodations
  – Accessibility features
  – Access to translations and definitions
  – Construct validity of accessibility features

• Evidence that the test items and accessibility features permit ELL and SWD to demonstrate their knowledge, skills, and abilities

Transparency of Test Design

• Test blueprints demonstrate range of standards covered

• Release plan yields representative sample of items on regular basis and across grades

• Sample items with annotations and scoring rubrics available

• Item development specifications available

Data Privacy

• Assurance of student privacy protection that complies with all federal and state requirements

• Assurance of state ownership of all data

• State is provided all underlying data in timely and usable manner to support secondary state analyses

• Description of secure data management procedures

Challenges to Implementation

• Validity and Reliability are not dichotomous, and acceptable levels vary based on test use
  – Guidance on what levels are acceptable for different uses
• Timing of availability of evidence
  – New Programs may only have descriptions of plans
  – Existing Programs have student data to perform analyses
• Objectivity, Transparency, and Reliability of Quality Review

Proposed Phased Approach

• Phase I: Content and Test Design
  – Focus on alignment with standards, item quality, and accessibility features
  – Occurs after initial item development and test form construction
• Phase II: Test Characteristics
  – Focus on validity, reliability, and generalizability
  – Occurs after field testing and/or first operational use
• Phase III: Program Implementation
  – Focus on Test Administration, Reporting, and Test/Item Pool Maintenance
  – Occurs after second or third year of operational administration
