Measurement and Evaluation of Thinking Skills (HOTS)
ASSOC. PROF DR KAMISAH OSMAN
7 APRIL 2008
Measurement and Evaluation
What may happen in the measurement and evaluation of science teaching when we try to measure higher order thinking?
“We have never studied the content of these questions.”
“This test is too easy. I can answer them by closing my eyes.”
“I scored 65% on the test, but I still failed the test, which doesn’t make sense to me.”
What is it all about?
[Figure: students (x marks) mapped against measurement objectives 1.1, 1.3, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3 and 4.1]
Key to Good Assessment of HOTS
Alignment among assessment (measurement + evaluation), curriculum, and instruction
Necessary Conditions for Good Assessment: Planning, Developing, Administering, Scoring, Analyzing, Grading
Test Planning: Test Grid
# of items (points)
Topic | Remember: M-C | Understand: C-R | Apply: Performance | Sub-total
Substances | 5(5) | 2(6) | |
Mixture | 8(8) | 1(5) | |
Conservation | 4(4) | 1(3) | |
Physical change | 5(5) | | |
Chemical change | 4(4) | | 2 |
Sub-total | 26(26) | 4(14) | 2(10) | 32(50)
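A test grid is easy to get wrong when items are added or moved, so it can help to re-add the columns mechanically. The sketch below is only an illustration: the dictionary layout and the function names are hypothetical, and the cell values are copied from the grid above. The performance tasks appear only in the stated sub-totals because the grid does not show how their 10 points are split by topic.

```python
# Hypothetical encoding of the test grid: topic -> {format: (items, points)}.
GRID = {
    "Substances":      {"M-C": (5, 5), "C-R": (2, 6)},
    "Mixture":         {"M-C": (8, 8), "C-R": (1, 5)},
    "Conservation":    {"M-C": (4, 4), "C-R": (1, 3)},
    "Physical change": {"M-C": (5, 5)},
    "Chemical change": {"M-C": (4, 4)},
}

# Sub-totals and grand total exactly as stated in the grid.
STATED_SUBTOTALS = {"M-C": (26, 26), "C-R": (4, 14), "Performance": (2, 10)}
STATED_TOTAL = (32, 50)


def column_totals(grid):
    """Sum (items, points) per question format over all topics."""
    totals = {}
    for cells in grid.values():
        for fmt, (items, points) in cells.items():
            n, p = totals.get(fmt, (0, 0))
            totals[fmt] = (n + items, p + points)
    return totals


def grand_total(subtotals):
    """Add the per-format sub-totals into one (items, points) pair."""
    items = sum(n for n, _ in subtotals.values())
    points = sum(p for _, p in subtotals.values())
    return items, points


computed = column_totals(GRID)
for fmt in ("M-C", "C-R"):
    # Each recomputed column should agree with the stated sub-total.
    assert computed[fmt] == STATED_SUBTOTALS[fmt], fmt
assert grand_total(STATED_SUBTOTALS) == STATED_TOTAL
```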
Developing multiple-choice (MC) questions
Guideline 1: The stem of an item should be meaningful by itself and should present a definite problem
Poor:
A scientist…
a. Consults the writing of Aristotle.
b. Debates with fellow scientists.
c. Makes a careful observation during experiments.
d. Thinks about the probability.
Better:
How does a scientist discover new facts?
a. Consulting the writing of Aristotle.
b. Debating with fellow scientists.
c. Making careful observations during experiments.*
d. Thinking about the probability.
Developing MC questions
Guideline 2: Make all choices plausible to uninformed students
Poor:
What are electrons?
a. Negative particles*
b. Positive particles
c. Neutral particles
d. Mechanical tools
Better:
What are electrons?
a. Negative particles*
b. Positive particles
c. Neutral particles
d. Nuclei of atoms
Ways to make choices plausible
1. Use students’ common misconceptions or errors
2. Use the textbook language or other phraseology that has the “appearance of truth”
3. Use distracters that are parallel in form and grammatically consistent with the item’s stem
4. Make the distracters similar to the correct answer in length, vocabulary, sentence structure, and complexity of thought
Developing MC questions
Guideline 3: Arrange the responses in a logical order if it exists
Poor:
How many bonding electrons does an oxygen atom have?
a. 3
b. 5
c. 6
d. 2*
e. 4
Better:
How many bonding electrons does an oxygen atom have?
a. 2*
b. 3
c. 4
d. 5
e. 6
Developing MC questions
Guideline 4: Avoid extraneous clues to the correct and incorrect choices
Poor:
Increasing the temperature will increase the pressure of a gas in a sealed container because
a. No container expansion
b. Gas particles constantly move
c. Gas particles collide with each other
d. More gas particles collide with each other and with the container walls*
Better:
Increasing the temperature will increase the pressure of a gas in a sealed container because
a. Gas particles move more rapidly
b. Gas particles expand bigger
c. Gas particles collide more with each other
d. Gas particles collide more with the container*
Developing MC questions
Guideline 5: Avoid using the “none of the above”, “all of the above” and “I don’t know” alternatives
Poor:
According to Boyle’s law, which of the following changes will occur to the pressure of a gas at a given temperature when the volume of the gas is increased?
a. increase
b. decrease*
c. no change
d. none of the above
Better:
According to Boyle’s law, which of the following changes will occur to the pressure of a gas at a given temperature when the volume of the gas is increased?
a. increase
b. decrease*
c. increase first, then decrease
d. no change
Summary: Checklist for good MC questions
• The stem presents a clear problem
• The stem is stated as a question
• The choices are equally plausible
• The choices are in alpha-numerical or other logical order
• The choices are consistent in length and contain no extraneous clues
• The choices contain only one best or correct answer
• “None of the above” or “all of the above” choices are avoided
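Some items on this checklist are mechanical enough to screen automatically before a human review. The sketch below is a rough illustration, not a validated tool: `check_item`, the warning texts, and the 1.5x length threshold are all assumptions made for this example. Plausibility of distractors still needs a subject expert.

```python
BANNED = ("none of the above", "all of the above", "i don't know")


def check_item(stem, choices, answer_index):
    """Return a list of checklist warnings for one multiple-choice item."""
    warnings = []

    # Checklist: the stem is stated as a question.
    if not stem.strip().endswith("?"):
        warnings.append("stem is not stated as a question")

    # Checklist: avoid "none of the above" style alternatives.
    lowered = [c.strip().lower().replace("’", "'") for c in choices]
    if any(c in BANNED for c in lowered):
        warnings.append("uses a 'none/all of the above' style choice")

    # Checklist: numeric choices should be in a logical (ascending) order.
    try:
        values = [float(c) for c in choices]
    except ValueError:
        values = None  # not a purely numeric item
    if values is not None and values != sorted(values):
        warnings.append("numeric choices are not in ascending order")

    # Checklist: length should not give the answer away.
    # The 1.5x threshold is an arbitrary assumption for this sketch.
    lengths = [len(c) for c in choices]
    others = [n for i, n in enumerate(lengths) if i != answer_index]
    if others and lengths[answer_index] > 1.5 * max(others):
        warnings.append("correct choice is much longer than the distractors")

    return warnings


stem = "How many bonding electrons does an oxygen atom have?"
print(check_item(stem, ["3", "5", "6", "2", "4"], 3))
print(check_item(stem, ["2", "3", "4", "5", "6"], 0))
```

With the two oxygen items from Guideline 3, the first call reports the ordering problem and the second returns an empty list.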
Advantages and Limitations of M-C Questions
Advantages
• Easy to score
• Objective to score
• Large coverage
• Good at assessing specific knowledge and understanding or lower order thinking skills (LOTS)
• Incorrect answers provide valuable information on students’ learning difficulties
Limitations
• Limited in assessing higher order thinking skills (HOTS)
• Guessing
• Reading comprehension
• Time consuming to write good M-C questions
How to measure higher order thinking skills (HOTS)
• First of all: you don’t have to use MC to assess HOTS; there are many other question formats that assess HOTS better than MC does
• Understand the difference among different cognitive levels
• Use combinations of question formats
• Develop appropriate multiple-choice questions
Lower Order Thinking Skills (LOTS)
Remember: recognize (identify), recall (retrieve)
Understand: interpret (clarify, paraphrase, represent, translate), exemplify (illustrate, instantiate), classify (categorize, subsume), summarize (abstract, generalize), infer (conclude, extrapolate, interpolate, predict), compare (contrast, map, match), explain (construct, model)
Apply: execute (carry out), implement (use)
Higher Order Thinking Skills (HOTS)
Analyze: differentiate (discriminate, distinguish, focus, select), organize (find coherence, integrate, outline, parse, structure), attribute (deconstruct)
Evaluate: check (coordinate, detect, monitor, test), critique (judge)
Create: generate (hypothesize), plan (design), produce (construct)
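When reviewing a draft test, it can be useful to tag each item with its cognitive process and see how much of the test targets HOTS. The lookup below is a minimal sketch built from the lists above; the names `PROCESSES`, `LEVELS`, `classify`, and `hots_share` are hypothetical, and tagging an item with a verb remains a human judgment.

```python
# Category -> thinking level, following the LOTS/HOTS split above.
LEVELS = {
    "remember": "LOTS", "understand": "LOTS", "apply": "LOTS",
    "analyze": "HOTS", "evaluate": "HOTS", "create": "HOTS",
}

# Cognitive process -> category, copied from the lists above.
PROCESSES = {
    "recognize": "remember", "recall": "remember",
    "interpret": "understand", "exemplify": "understand",
    "classify": "understand", "summarize": "understand",
    "infer": "understand", "compare": "understand", "explain": "understand",
    "execute": "apply", "implement": "apply",
    "differentiate": "analyze", "organize": "analyze", "attribute": "analyze",
    "check": "evaluate", "critique": "evaluate",
    "generate": "create", "plan": "create", "produce": "create",
}


def classify(process):
    """Return (category, level) for a cognitive process verb."""
    category = PROCESSES[process.lower()]
    return category, LEVELS[category]


def hots_share(item_processes):
    """Fraction of items whose tagged process is a HOTS process."""
    hots = sum(1 for p in item_processes if classify(p)[1] == "HOTS")
    return hots / len(item_processes)


# Example: a four-item draft tagged by the test writer.
print(classify("critique"))
print(hots_share(["recall", "interpret", "critique", "plan"]))
```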
Original Terms (highest to lowest): Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge
New Terms (highest to lowest): Creating, Evaluating, Analysing, Applying, Understanding, Remembering
Using combinations of question formats: M-C + M-C
After a large ice-cube has melted in a beaker of water, how will the water level change?
a. higher
b. lower
c. the same*
Why do you think so? Choose all that apply.
a. The mass of water displaced is equal to the mass of the ice.*
b. Ice has more volume than water.
c. Water is denser than ice.
d. Ice cube decreases the temperature of water.
e. Water molecules in water occupy more space than in ice.
Using combinations of question formats: M-C + Constructed Response
After a large ice-cube has melted in a beaker of water, how will the water level change?
a. higher
b. lower
c. the same*
Why do you think so? Please justify your choice:
Using combinations of question formats: Performance + M-C
Using the materials provided at your table, create a model of the human heart. You should use the blue and red play-doh to represent de-oxygenated and oxygenated blood. Be sure to create and label the following:
Left Atrium (2 pts.)
Right Atrium (2 pts.)
Left Ventricle (2 pts.)
Right Ventricle (2 pts.)
Aorta (2 pts.)
Pulmonary Vein (2 pts.)
Pulmonary Artery (2 pts.)
In the heart, the mixing of oxygen-rich and oxygen-poor blood is prevented by the
a. mitral valve
b. tricuspid valve
c. septum*
d. pericardium
Developing appropriate M-C questions for HOTS
1. Providing a factual statement, ask students to analyze.
The Sun is the only body in our solar system that gives off large amounts of light and heat. Why can we see the Moon?
A. It is nearer the Earth than the Sun
B. It is reflecting light from the Sun*
C. It is the biggest object in the solar system
D. It is without an atmosphere
Developing appropriate M-C questions for HOTS
2. Providing a diagram, ask students to identify elements:
In the cell on the right, which letter correctly identifies the portion that first receives a signal?
a. A*
b. B
c. C
d. D
e. E
Developing appropriate M-C questions for HOTS
3. Providing data, ask students to develop a hypothesis:
Amounts of oxygen produced in a pond at different depths are shown below:
Location | Oxygen
Top meter | 4 g/m³
Second meter | 3 g/m³
Third meter | 1 g/m³
Bottom meter | 0 g/m³
Which statement is a reasonable hypothesis based on the data in the table?
A. More oxygen production occurs near the surface because there is more light there.*
B. More oxygen production occurs near the bottom because there are more plants there.
C. The greater the water pressure, the more oxygen production occurs.
D. The rate of oxygen production is not related to depth.
Developing appropriate M-C questions for HOTS
4. Providing a statement, ask students to evaluate its validity:
The crews of two boats at sea can communicate by shouting to each other, and so can the crews of two nearby spaceships in space. How valid is this statement?
A. Valid
B. Partially valid*
C. Invalid
D. Not enough information to make a judgment
Developing constructed response questions for assessing HOTS
Short constructed response (SCR) questions require answers ranging from one word to a few sentences.
Extended constructed response (ECR) questions require students to write a few sentences or a short paragraph.
Essay (E) questions require students to write a few paragraphs to a few pages.
General guidelines for writing constructed-response questions
1. Define the task completely and specifically.
Poor: State whether you think pesticide should be used on farms.
Better: State the environmental effects of pesticide use on farms.
Avoid ambiguous words
Possible student interpretations of the word “discuss”:
• Explain in my own words, maybe with an introduction, something in the middle and a conclusion
• Analyze at length
• Present analogies and comparisons
• Tell all I know, as much as possible
• Put down facts
• …
General guidelines for writing constructed-response questions
2. Give explicit directions such as the length, grading guideline, and time to complete.
Poor: State whether you think pesticide should be used on farms.
Better: State whether you think pesticide should be used on farms. Defend your position as follows:
a. Identify any positive benefits associated with pesticide use.
b. Identify any negative effects associated with pesticide use.
c. Compare positive benefits against negative effects.
d. Suggest whether better alternatives to pesticide are available.
Your essay should be no more than 2 double-spaced pages. Two of the points will be used to evaluate the sentence structure, punctuation, and spelling. (10 points)
General guidelines for writing constructed-response questions
3. Do not provide optional questions for students to choose
Different questions may measure completely different constructs, which makes comparisons among students difficult
General guidelines for writing constructed-response questions
4. Define scoring clearly and appropriately with a scoring rubric
Analytic vs Holistic
[Figure: an analytic rubric rates each trait (Trait 1, Trait 2, Trait 3) separately on a 4-to-1 scale; a holistic rubric assigns one overall score on a 5-to-1 scale]
Holistic Scoring Rubric
Quality Category (Score) | Characteristics
Distinguished (4) | Student can apply all the skills needed to investigate a self-selected issue or problem.
Experienced (3) | Student can use appropriate skills needed to investigate a self-selected issue or problem.
Average (2) | Student can use some investigative skills to investigate a problem identified by another person.
Novice (1) | Student has difficulty in demonstrating skills needed to study a provided problem.
Analytic Scoring Rubric
Attribute | Score 3 | Score 2 | Score 1
Source | Lists more than three sources; includes title, page, date; includes author | Lists three sources; includes title, page, date; includes author | Lists two sources; includes title, page, and date
Report Title | Relates to main topic; capitalizes correctly; captures a reader’s interest | Relates to main topic; capitalizes correctly | Relates to main topic
Content | Many related topics; informative; well organized | Relates to topic; informative | Relates to topic
Paragraphs | Complete sentences; correct spelling; correct punctuation; correct capitalization; neatly done | Complete sentences; correct spelling; correct punctuation; correct capitalization | Complete sentences
Illustrations | More than one titled; labeled correctly; neat appearance | Titled; labeled correctly; neat appearance | Titled; labeled
Activities | Relates to issue question; informative; written conclusion | Relates to issue question; informative; written conclusion | Relates to issue question
Cover | Title; name; date; illustration | Title; name; date | Title; name
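One way to see why analytic scoring gives more specific feedback is to record one rating per attribute and report both the total and the attributes that fell short. The sketch below assumes the seven attributes and the 1 to 3 scale of the analytic rubric above; the function name `score_report` and the sample ratings are hypothetical.

```python
ATTRIBUTES = [
    "Source", "Report Title", "Content", "Paragraphs",
    "Illustrations", "Activities", "Cover",
]
MIN_SCORE, MAX_SCORE = 1, 3


def score_report(ratings):
    """Return (total, attributes rated below the top score).

    `ratings` maps every rubric attribute to an integer score of 1, 2 or 3.
    """
    missing = [a for a in ATTRIBUTES if a not in ratings]
    if missing:
        raise ValueError(f"unrated attributes: {missing}")
    for attribute in ATTRIBUTES:
        if not MIN_SCORE <= ratings[attribute] <= MAX_SCORE:
            raise ValueError(f"score out of range for {attribute}")

    total = sum(ratings[a] for a in ATTRIBUTES)
    needs_work = [a for a in ATTRIBUTES if ratings[a] < MAX_SCORE]
    return total, needs_work


# Made-up ratings for one student report.
ratings = {a: 3 for a in ATTRIBUTES}
ratings["Content"] = 2
ratings["Cover"] = 1
total, needs_work = score_report(ratings)
print(total, "out of", MAX_SCORE * len(ATTRIBUTES), needs_work)
```

A holistic rubric would return only a single number for the same report, which is quicker to produce but tells the student less about what to improve.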
Holistic vs Analytic
Holistic
• Easy to construct
• Efficient to score
• Clear implication
• Vague feedback
• Less informative for students to answer the question
Analytic
• Time consuming to construct
• Time consuming to score
• Unclear implication
• Specific feedback
• Informative for students to answer the question
Guidelines for scoring essay questions
• Essays are scored anonymously
• Essays are scored question by question across students
• Each essay is graded twice independently to ensure consistency/reliability
• Appropriate scoring rubrics are developed and applied consistently
Multiple Faculty, TAs
• Common curriculum
• Common learning opportunities
• Develop and agree on a common test grid
• Same scoring rubrics
• Consider item banking
Developing vs. Adopting
There are many standardized tests or item banks (e.g. http://www.flaguide.org/)
Standardized tests have established validity, reliability, and absence of bias
The key is the match between the test coverage and the curriculum/instruction
Necessary conditions for good assessment: Planning, Developing, Administering, Scoring, Analyzing, Grading
Administering tests
Order questions: easy to difficult; SRC questions first, SCR questions next, and ECR questions last
Give complete instructions before students begin: test purpose, time allowance, basis for responding, methods of recording, appropriateness of guessing
Use equivalent forms or different item orders and recording sheets to avoid cheating
Ensure adequate physical setting
Avoid unnecessary interaction with students
Start and end the test at the same time
Scoring tests
Hand scoring vs. optical scanning
Need to establish inter-rater reliability for constructed response questions
Correct for guessing when appropriate (e.g. speeded tests)
Corrected Score = R - W/(n-1), where R is the number of right answers, W is the number of wrong answers, and n is the number of choices per item
Correct for cheating
Harpp-Hogan index (H-H) = EEIC/D
EEIC is the number of exact errors in common
D is the number of different responses
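Both scoring formulas are simple enough to try on a small example. The sketch below is an illustration under stated assumptions: in the guessing correction, n is read as the number of choices per item, and in the Harpp-Hogan index, D is read as the number of items on which the two students gave different responses. The function names and the sample answer strings are made up.

```python
import math


def corrected_score(right, wrong, n_choices):
    """Correction for guessing: R - W/(n-1). Omitted items are not counted."""
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (n_choices - 1)


def harpp_hogan(answers_a, answers_b, key):
    """H-H index = EEIC / D for a pair of answer sheets.

    EEIC: items where both students chose the same wrong answer.
    D:    items where the two students chose different answers (assumption).
    """
    triples = list(zip(answers_a, answers_b, key))
    eeic = sum(1 for a, b, k in triples if a == b and a != k)
    d = sum(1 for a, b, _ in triples if a != b)
    if d == 0:
        # Identical sheets: the ratio is undefined; report infinity only
        # when there is at least one shared error.
        return math.inf if eeic else 0.0
    return eeic / d


# 30 right and 8 wrong on five-choice items: 30 - 8/4
print(corrected_score(30, 8, 5))

# Two students share two identical errors and differ on one item.
print(harpp_hogan("ABCDA", "ABCDB", "ABDCA"))
```

A high index only flags a pair of answer sheets for a closer look; it is not proof of copying by itself.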
Item and test analysis
Item analysis: item response patterns, item difficulty, item discrimination, etc.
Test analysis: reliability (α), criterion-related validity, bias, prediction-related validity, etc.
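Item difficulty and item discrimination can be computed directly from a table of right/wrong scores. The sketch below uses two common conventions that the slides do not spell out, so treat them as assumptions: difficulty is the proportion of students answering correctly, and discrimination is the difference in that proportion between the top and bottom 27% of students ranked by total score. The function names and the tiny data set are hypothetical.

```python
def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0/1 scores)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(scores, item, fraction=0.27):
    """Upper-group minus lower-group proportion correct for one item.

    `scores` is a list with one row per student; each row holds 0/1 item scores.
    """
    ranked = sorted(scores, key=sum, reverse=True)  # best total score first
    k = max(1, round(len(ranked) * fraction))       # size of each group
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


# Four students, three items (made-up data).
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
first_item = [row[0] for row in scores]
print(item_difficulty(first_item))
print(item_discrimination(scores, 0))
```

An item with a discrimination near zero, or a negative one, is answered about as well by weak students as by strong ones and is usually worth reviewing.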
Grading
Lake Wobegon Effect:
In 1988 it was reported that 70% of the students, 90% of the 15,000 school districts, and all 50 states in the US were scoring above the national norms on norm-referenced achievement tests in elementary schools (Cannell, 1988)
Criterion-referenced grading based on standards
Commonly used standards: pass/fail, A/B/C/F, …
The essential part of standard setting is to decide a cut-off score
Deciding the cut-off score: M-C test
If the number of test questions is more than 20 and π0 is within the range of .50 to .80, the approximate cut-off score X can be calculated as follows:
X = ((n - α)/α) × π0 + ((α - 1)/α) × M + .5
M is the mean score on the test, α is the test reliability, and n is the total number of questions
π0 is the true cut-off score if measurement quality is perfect
X is the approximate cut-off score given the measurement error (α)
Example (n = 28, π0 = .75, X0 = 21)
Approximate cut-off score X by reliability (α) and average score on the test (M)
Reliability (α) | M = 17 | M = 21 | M = 25
.6 | 23.43 | 20.76 | 18.10
.8 | 21.75 | 20.75 | 19.75
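The example table can be recomputed with a few lines of code. The sketch below assumes the cut-off formula reads X = ((n - α)/α) × π0 + ((α - 1)/α) × M + .5, where α stands for the test reliability and π0 for the true cut-off expressed as a proportion of items (here .75, i.e. 21 of 28 items). The function name and parameter names are hypothetical.

```python
def approximate_cutoff(n, pi0, reliability, mean):
    """Approximate cut-off score X for an n-item M-C test.

    n           -- total number of questions (should be more than 20)
    pi0         -- true cut-off as a proportion of items (.50 to .80)
    reliability -- test reliability, between 0 and 1
    mean        -- mean score on the test
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return ((n - reliability) / reliability * pi0
            + (reliability - 1) / reliability * mean
            + 0.5)


# Recompute the example grid for n = 28 and pi0 = .75.
for reliability in (0.6, 0.8):
    row = [approximate_cutoff(28, 0.75, reliability, m) for m in (17, 21, 25)]
    print(reliability, [round(x, 2) for x in row])
```

The pattern is the point of the example: the lower the class average and the lower the reliability, the higher the cut-off has to be set to guard against students passing through measurement error alone.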
Norm-referenced grading or curving grading
Z = (X-μ)/σ
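As a small illustration of the Z formula, the sketch below standardizes a set of raw scores. It treats the class as the whole population, so μ and σ are the class mean and the population standard deviation; that choice, the function name, and the sample scores are assumptions for this example.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Convert raw scores X to Z = (X - mu) / sigma."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("all scores are identical; Z is undefined")
    return [(x - mu) / sigma for x in raw_scores]


# Made-up class of three students.
for raw, z in zip([60, 70, 80], z_scores([60, 70, 80])):
    print(raw, round(z, 2))
```

Letter grades can then be assigned by Z ranges, which is what makes the grade depend on the rest of the class rather than on a fixed standard.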
Norm-referenced vs. criterion-referenced
• Number of students
• Characteristics of students
• Purpose of testing
• Use of testing results
• Quality of tests
Other grading issues
1. Components of Grades (achievement, effort, attitude)
2. Combining Scores for the Final Grade (equate before weight before aggregate)
3. Translating Final Grades to Letter Grades (pre-determined scheme)
4. Reporting Grades (clear definition)
Putting all things together: VRA
validity
reliability
absence of bias
Congratulations! You have survived two hours of preaching; you can preach to others now.
Questions?