
7. Testing

Testing: Big Questions
• How do teachers construct tests?
• How are teacher-made tests like/unlike standardized tests?
• What information comes from test results?

7.1 Instructional Objectives
7.2 Teacher-Developed Tests in the Classroom
7.3 Formative Evaluation
7.4 Classroom Grading Approaches
7.5 Criterion-Referenced Testing
7.6 Norm-Referenced Testing
7.7 Interpreting Norm-Referenced Test Scores
7.8 Validity
7.9 Reliability
7.10 Test Bias
7.11 Using Tests Appropriately
7.12 Summary

7.1 Instructional Objectives

Objectives: Checklist for learning
• More specific than goals
• What students should know or be able to do by end of lesson ➔ descriptive verbs!
• Taxonomies provide hierarchies of increasing sophistication
• Bloom: cognitive, affective, psychomotor

Bloom’s taxonomies
• Cognitive most used
• 6 levels: remember, comprehend, apply, analyze, evaluate, create
• Objective: “Students will compare and contrast yurts and tipis in 3 key features.”
• Note the task, level (analysis), and criteria ➔ “mastery learning” system

7.2 Teacher-Developed Tests in the Classroom

Classroom assessment
Backward planning as a “best practice”:
1. Write objective with taxonomy-level verb and criteria for mastery
2. Create assessment/test that fits the objective
3. Plan learning activities that support and prepare students for mastery

Classroom tests
• Essay: for comprehension, analysis; needs criteria
• Multiple choice, matching: for recognition
• T/F, fill blanks: for recall
• Problem-solving: for application/analysis
➔ Consider pros/cons and the kind of students who benefit

Performance-based or authentic assessment 1
• Portfolio showing progress
• Exhibition, e.g. posters
• Demonstration, e.g. slide shows, videos
• For individual or group assessment

Authentic assessment 2
Rubric with criteria for scoring (posted for all to see):

Criterion    10 points   5 points
Sources      Over 5      Under 5
Facts        Over 10     Under 10
Format       Correct     Errors
Graphics     Over 5      Under 5
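A rubric like this is mechanical enough to compute. Below is a minimal sketch of a rubric scorer; the function name, dictionary layout, and sample project counts are hypothetical illustrations, not part of the chapter.

```python
RUBRIC = {
    # criterion: (threshold, points if over, points if under)
    "sources":  (5,  10, 5),   # over 5 sources  -> 10 pts, else 5
    "facts":    (10, 10, 5),   # over 10 facts   -> 10 pts, else 5
    "graphics": (5,  10, 5),   # over 5 graphics -> 10 pts, else 5
}

def score_project(counts: dict, format_correct: bool) -> int:
    """Total points for one student project under the rubric above."""
    total = 10 if format_correct else 5   # format: correct = 10, errors = 5
    for criterion, (threshold, over, under) in RUBRIC.items():
        total += over if counts[criterion] > threshold else under
    return total

# 7 sources, 12 facts, 4 graphics, correct format -> 10 + 10 + 5 + 10 = 35
print(score_project({"sources": 7, "facts": 12, "graphics": 4}, True))
```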

7.3 Formative Evaluation

Formative assessments 1
• Assess/evaluate learning needs before instruction (aka “pretest”)
• Determine previous knowledge on topic or skill
• Determine readiness for skill or topic

Formative assessments 2
• Check understanding, monitor progress during learning cycle
• Spot errors for re-teaching
• Give feedback and suggestions
• Check readiness for final (summative) assessment (aka “posttest”)

7.4 Classroom Grading Approaches

Assigning grades 1
When a student gets a grade for work, what does he/she think it means?
• This is what I am worth
• This is how I compare with classmates
• This is what the teacher thinks of me
• This is how well I learned

Assigning grades 2
• Letter grades: A, B, C, D, F
• Absolute: 10 points per letter
• Curve (relative): comparative scaling (force a bell curve?) ➔ see the sketch below
• Descriptive (short or long)
• Performance rating (with rubric/criteria)
• Mastery checklist (# of attempts not important)
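Of the options above, grading on a curve is the most computational: each letter depends on a score’s distance from the class average in standard deviations. A minimal sketch, with hypothetical grade boundaries (the slide does not specify any):

```python
from statistics import mean, stdev

def curve_grades(scores: list[float]) -> list[str]:
    """Map each score to a letter by its z-score (distance from the mean)."""
    avg, sd = mean(scores), stdev(scores)
    def letter(score: float) -> str:
        z = (score - avg) / sd            # standard deviations from the mean
        if z >= 1.5:  return "A"
        if z >= 0.5:  return "B"
        if z >= -0.5: return "C"
        if z >= -1.5: return "D"
        return "F"
    return [letter(s) for s in scores]

# Hypothetical class scores; grades depend only on relative standing.
print(curve_grades([55, 62, 70, 78, 85, 93]))   # ['D', 'D', 'C', 'C', 'B', 'B']
```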

7.5 Criterion-Referenced Testing

Criterion referencing 1
• Emphasis on mastery of specific skills/objectives
• Good for topics that can be broken into small objectives
• Good for topics that have a hierarchy of skills (e.g. math)
• Must master skill A before you can understand and master skill B

Criterion referencing 2
• Set-up: objective and performance criteria to prove mastery of each skill (e.g. 80% correct answers)
• No comparisons (and no time constraints?) ➔ move to the next level at your own pace (see the sketch below)
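A minimal sketch of the mastery gate just described: a learner advances only after meeting the performance criterion (80% correct here) on the current skill. The skill names and helper function are hypothetical:

```python
MASTERY_CRITERION = 0.80   # performance criterion: 80% correct answers

def next_skill(skill_sequence: list[str], proportion_correct: dict) -> str:
    """Return the first skill in the hierarchy not yet mastered."""
    for skill in skill_sequence:
        if proportion_correct.get(skill, 0.0) < MASTERY_CRITERION:
            return skill
    return "all skills mastered"

# Hypothetical math hierarchy: skill A must be mastered before skill B.
skills = ["single-digit addition", "multi-digit addition", "multiplication"]
print(next_skill(skills, {"single-digit addition": 0.95,
                          "multi-digit addition": 0.70}))
# -> "multi-digit addition": stay on it until the criterion is met
```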

7.6 Norm-Referenced Testing

Norm referencing
• “Standardized”
• Comparative with other students
• Achievement tests (what has been learned, e.g. state/graduation test)
• Aptitude tests (predict future success, e.g. IQ, SAT, GRE)

7.7 Interpreting Norm-Referenced Test Scores

Analyzing test results (1)
• Raw scores ➔ derived (comparative) scores
• “Normed” with large samples of test-takers
• Norming = fitted onto normal distribution (bell curve)
• Bell curve: mean/average (skewed by extremes), median (middle #), and mode (most frequent) are the same

Analyzing test results (2)
Statistical descriptors:
• Areas of the distribution marked by standard deviations = deviations from the average
• Example: IQ tests, 100 = avg.; about 34% of scores fall within one standard deviation on either side of the average
• Z-scores: # of standard deviations +/- from the average
• Stanines: #5 in center; 1-4 below, 6-9 above

Analyzing test results (3)
More statistical descriptors:
• Percentiles = % of students performing the same or below
• Example: 80th percentile = performs as well as or better than 80% of others
• Grade-level equivalents = score expressed as grade and month of the school year
• Example: 3.4 = 3rd grade, 4th month
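The derived scores in slides (2) and (3) are all simple functions of the raw score and the norming parameters. A minimal sketch, assuming an IQ-style scale (mean 100, SD 15, as in the example above); the function names are hypothetical:

```python
import math

MEAN, SD = 100, 15   # hypothetical norming parameters (IQ-style scale)

def z_score(raw: float) -> float:
    """Number of standard deviations above (+) or below (-) the average."""
    return (raw - MEAN) / SD

def stanine(raw: float) -> int:
    """Nine bands, each half an SD wide, with band 5 centered on the mean."""
    return max(1, min(9, math.floor(z_score(raw) * 2 + 5.5)))

def percentile_rank(raw: float, all_scores: list[float]) -> float:
    """Percent of test-takers performing the same or below."""
    return 100 * sum(s <= raw for s in all_scores) / len(all_scores)

print(z_score(115))                       # 1.0 -> one SD above average
print(stanine(115))                       # 7
norm_sample = [85, 90, 100, 100, 110, 115, 120, 130]
print(percentile_rank(110, norm_sample))  # 62.5
```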

7.8 Validity

How is a test valid?
• Validity: accuracy measure
• Content: matches what was in the curriculum
• Face: appropriate format
• Criterion-related: items match objectives
• Predictive: matches future performance
• Construct: matches other tests
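Predictive validity, at least, can be checked numerically: correlate the test’s scores with the later performance they claim to predict. A minimal sketch with hypothetical data pairs:

```python
from statistics import correlation   # Python 3.10+

# Hypothetical data: aptitude scores and the later performance they
# are supposed to predict (e.g. first-year GPA).
aptitude_scores = [480, 520, 560, 600, 640, 700]
later_gpas      = [2.4, 2.9, 2.7, 3.3, 3.5, 3.8]

# A strong positive coefficient supports predictive validity.
print(f"predictive validity r = {correlation(aptitude_scores, later_gpas):.2f}")
```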

7.9 Reliability

How is a test reliable?
• Reliability = consistency
• Test-retest
• Alternate/parallel (versions)
• Split-half = odds/evens
• Kuder-Richardson = 1 test administration
• Perfect = 1.0, but .80 OK
• 0 = no correlation
• Negative value = as one factor goes up, the other goes down
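The split-half method above can be computed directly: correlate each student’s score on the odd items with their score on the even items, then apply the Spearman-Brown correction to estimate full-test reliability. A minimal sketch with hypothetical answer data:

```python
from statistics import correlation   # Python 3.10+

# Hypothetical answer matrix: rows = students, columns = items
# (1 = correct, 0 = wrong).
answers = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]

odd_scores  = [sum(row[0::2]) for row in answers]   # items 1, 3, 5
even_scores = [sum(row[1::2]) for row in answers]   # items 2, 4, 6

r_half = correlation(odd_scores, even_scores)
# Spearman-Brown correction: estimate reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)

print(f"half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")
```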

7.10 Test Bias

Can a test be biased?
• If content or format favors one SES, race, culture, gender, or learning style
• Shows up in form/content of a test question or answer
• Partial solution: test in students’ native language
• Not bias: males vary more than females in achievement scores

7.11 Using Tests Appropriately

Testing: Use wisely
• Check validity and the standard error of estimate (score +/-)
• Check reliability and the standard error of measurement (a confidence interval caused by the degree of unreliability) ➔ see the sketch below
• Consider how scores and results will be used
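The standard error of measurement mentioned above has a standard formula, SEM = SD × sqrt(1 − reliability), and the observed score ± one SEM gives roughly a 68% confidence band. A minimal sketch with hypothetical numbers:

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(observed: float, sd: float, reliability: float):
    """Roughly 68% confidence interval: observed score +/- one SEM."""
    e = sem(sd, reliability)
    return observed - e, observed + e

# Hypothetical IQ-style test: SD = 15, reliability = .80 ("OK" per the slide)
print(round(sem(15, 0.80), 1))            # 6.7 points
print(confidence_band(100, 15, 0.80))     # about (93.3, 106.7)
```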

7.12 Summary

Testing the test
• What are you trying to find out, and at what point in the learning cycle?
• Does a test report skill achievement or compare students?
• Does a test measure what it should, consistently and without bias toward any learner?
