TERA & PROMS 2013, Taiwan
Evaluation of in-house item banks by administering actual CATs
Tetsuo KIMURA (Niigata Seiryo University)
TERA-PROMS 2013, Kaohsiung, Taiwan
August 5, 2013
UCAT
Outline
• Background & Previous Studies
  ▫ What is CAT?
  ▫ UCAT and Moodle UCAT
  ▫ Construction of Item Bank
• Current Study
  ▫ Sample & Method
  ▫ Results
  ▫ Conclusion
What is CAT?
Paper-Pencil Test
Computerized Test
Computer Adaptive Test
Interview Test
Self-scoring flexilevel test (Lord, 1971)
Binet’s IQ test (Binet & Simon, 1905)
Adaptive Test
Binet’s IQ test (Binet & Simon, 1905)
The First Adaptive Test
Flexilevel Test (Lord, 1971)
The middle difficulty item, number 11 in difficulty-order ① ② ③ ④

1. A slightly easier item, number 10 in difficulty-order ① ② ③ ④
2. A slightly easier item, number 9 in difficulty-order ① ② ③ ④
3.
・・・
10. The easiest item, number 1 in difficulty-order ① ② ③ ④

1. A slightly harder item, number 12 in difficulty-order ① ② ③ ④
2. A slightly harder item, number 13 in difficulty-order ① ② ③ ④
3.
・・・
10. The hardest item, number 21 in difficulty-order ① ② ③ ④
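The routing behind the flexilevel sheet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lord's own specification: it assumes that a correct answer sends the test taker to the next unanswered harder item, a wrong answer to the next unanswered easier item, and that the test stops after (n + 1) / 2 items (11 of the 21 items shown). The function name `run_flexilevel` and the `is_correct` callback are hypothetical.

```python
from typing import Callable, List


def run_flexilevel(n_items: int, is_correct: Callable[[int], bool]) -> List[int]:
    """Return the difficulty ranks (1 = easiest) administered to one test taker.

    Assumed rules (not stated on the slide): start at the middle item,
    move to the next unused harder item after a correct answer and to the
    next unused easier item after a wrong answer, stop after (n + 1) / 2 items.
    """
    if n_items % 2 == 0:
        raise ValueError("n_items must be odd so that a middle item exists")

    middle = (n_items + 1) // 2       # item 11 when n_items == 21
    test_length = (n_items + 1) // 2  # 11 items are answered in total
    next_easier = middle - 1          # next unused item in the easier column
    next_harder = middle + 1          # next unused item in the harder column

    administered: List[int] = []
    current = middle
    for _ in range(test_length):
        administered.append(current)
        if is_correct(current):
            current = next_harder
            next_harder += 1
        else:
            current = next_easier
            next_easier -= 1
    return administered


if __name__ == "__main__":
    # Hypothetical test taker who answers items up to rank 13 correctly.
    print(run_flexilevel(21, lambda rank: rank <= 13))
```

Because every test taker answers the same number of items but follows a different path, the sheet works as a paper-based adaptive test without any computer.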
What is CAT?
Individualization of test
Efficiency of measurement
1. item selection suitable to each test taker
2. shortening of test administration time
3. improvement of measurement accuracy
Previous Studies
• Rasch-based CAT program
  ▫ Linacre (1987). UCAT: a BASIC computer-adaptive testing program.
  ▫ Kimura, Ohnishi & Nagaoka (2012). Moodle UCAT: a computer-adaptive test module for Moodle based on the Rasch model.
    ⇒ ACP (SG) & Version2 (JP) cooperative project
• Construction of item banks for CAT
  ▫ Kimura (2009). Construction of a Moodle-based placement test and possibility of a Moodle-based computer adaptive test.
  ▫ Kimura & Nagaoka (2010). Towards the construction of item banks for moodle-based in-house computer adaptive English tests.
Construction of item bank
Pretesting
Item analysis & elimination of misfit
More pretests with new items and anchored items
Item bank
Calibrated items
Anchored items
Types of items used in the study
All the items were adopted from the Eiken Test, Grade Pre-1 to Grade 3, with the permission of the Society for Testing English Proficiency (STEP).
Listening comprehension (Lng)
Reading comprehension (Rdg)
Vocabulary and grammar (Vgm)
Listening comprehension with dialogue (Dlg)
Listening comprehension with monologue (Mlg)
Construction of item bank: Common Person Linking (Dlg & Mlg ⇒ Lng)
r = .86
Mlg = Dlg × 1.18 + 0.06
r = .89
Dlg = Mlg × 0.85 + 0.05
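As a small illustration, the two linking equations can be read as linear conversions between measures on the Dlg and Mlg scales. The sketch below only applies the coefficients shown on the slide; the function names are hypothetical, and treating the equations as conversion formulas (rather than as plain regression lines) is an assumption.

```python
def dlg_to_mlg(dlg_measure: float) -> float:
    """Convert a Dlg measure (logits) with the slide's equation Mlg = Dlg x 1.18 + 0.06."""
    return dlg_measure * 1.18 + 0.06


def mlg_to_dlg(mlg_measure: float) -> float:
    """Convert an Mlg measure (logits) with the slide's equation Dlg = Mlg x 0.85 + 0.05."""
    return mlg_measure * 0.85 + 0.05


if __name__ == "__main__":
    for measure in (-1.0, 0.0, 1.0):
        print(measure, round(dlg_to_mlg(measure), 2), round(mlg_to_dlg(measure), 2))
```

The two equations were estimated separately, so converting a measure one way and back does not return exactly the starting value.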
Current item banks
Vgm          N     AVG     SD
G1.5 (B2)    73    1.57    0.84
G2 (B1)      69    0.52    0.81
G2.5 (A2)    67   -0.47    0.91
G3 (A1)      49   -1.41    0.80
Total       258    0.19    1.37

Lng          N     AVG     SD
G1.5 (B2)    44    1.26    1.42
G2 (B1)     109    0.77    1.11
G2.5 (A2)    75    0.35    1.05
G3 (A1)      80   -0.90    1.33
Total       308    0.30    1.43
CAT Algorithm: Item Selection (logit bias)
Moodle UCAT
LL and UL can be adjusted by entering a logit value in the Logit bias box in the CAT settings window.
UL_Biased = UL + Bias
LL_Biased = LL + Bias
A positive logit value decreases the chance of answering correctly.
A negative logit value increases the chance of answering correctly.
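The effect of the logit bias can be illustrated with the Rasch model, in which the probability of a correct answer depends only on ability minus item difficulty. The sketch below is not the Moodle UCAT source code. It assumes that LL and UL are the lower and upper difficulty limits of the item-selection window and that the bias is simply added to both; the function names are hypothetical.

```python
import math
from typing import List, Tuple


def rasch_probability(theta: float, difficulty: float) -> float:
    """Rasch model: probability that a person at theta answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def biased_window(ll: float, ul: float, bias: float) -> Tuple[float, float]:
    """Shift the difficulty window by the logit bias (assumed: bias is added to LL and UL)."""
    return ll + bias, ul + bias


def eligible_items(difficulties: List[float], ll: float, ul: float, bias: float = 0.0) -> List[int]:
    """Indices of items whose difficulty falls inside the biased window."""
    low, high = biased_window(ll, ul, bias)
    return [i for i, b in enumerate(difficulties) if low <= b <= high]


if __name__ == "__main__":
    bank = [-1.0, 0.0, 0.7, 1.2, 2.0]  # made-up item difficulties in logits
    print(eligible_items(bank, ll=-0.5, ul=0.5, bias=0.0))  # window around 0 logits
    print(eligible_items(bank, ll=-0.5, ul=0.5, bias=1.0))  # window shifted 1 logit harder
    print(rasch_probability(0.0, 0.0), rasch_probability(0.0, 1.0))
```

With a bias of 0 the targeted item matches the ability estimate and the chance of success is 0.5; a positive bias moves the window toward harder items, so the chance of success drops below 0.5.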
Current Study: Sample & Method
Test takers: About 160 Japanese university freshmen whose majors are nursing and social welfare
Some students were eliminated from the data because they had not completed the CAT properly.
Eiken grade   Item banks
              Vgm    Lng
Pre 1st       115    113
2nd           105    108
Pre 2nd        95    104
3rd            85     91

CAT conditions
• Initial ability estimate: 0.0 logit (100 units)
• Ending condition: number of items (16 items)
  S.E. theoretically reaches as low as 0.5 logit (Linacre, 2006)
• Logit bias: 0 (the targeted probability of answering correctly is 0.5)
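The 0.5 logit figure for 16 items follows from the Rasch model: each item contributes information p(1 - p), which is at most 0.25 when the item is perfectly targeted (p = 0.5), and the standard error is the reciprocal square root of the total information. The short sketch below reproduces that arithmetic; the function names are hypothetical.

```python
import math
from typing import Iterable


def rasch_information(theta: float, difficulty: float) -> float:
    """Fisher information of one Rasch item at ability theta: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)


def standard_error(theta: float, difficulties: Iterable[float]) -> float:
    """Model standard error of the ability estimate: 1 / sqrt(total information)."""
    total = sum(rasch_information(theta, b) for b in difficulties)
    return 1.0 / math.sqrt(total)


if __name__ == "__main__":
    # Best case: all 16 items sit exactly at the test taker's ability.
    print(standard_error(0.0, [0.0] * 16))
    # Off-target items carry less information, so the S.E. is larger.
    print(standard_error(0.0, [1.5] * 16))
```

In practice items are never perfectly targeted, which is why a realistic criterion is somewhat above 0.5 logits.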
Current Study: results
Vgm: More than 90% of 157 test takers ended their CAT with S.E. less than .55 logits.
Lng: More than 90% of 130 test takers ended their CAT with S.E. less than .55 logits.
[Figure: Item exposure rate (frequency per 100 test takers); panels: Vgm, Lng]
Current Study: results
[Figure: Item exposure rate (frequency per 100 test takers) and current item banks; panels: Vgm, Lng]
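An item exposure rate expressed as frequency per 100 test takers can be computed from the list of items each test taker received. The sketch below is a minimal illustration, not the analysis script used in the study; the function name, the data layout (one list of item IDs per test taker) and the sample IDs are assumptions.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List


def exposure_per_100(administrations: List[Iterable[Hashable]],
                     item_ids: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Exposure rate of each item: number of test takers who saw it per 100 test takers."""
    n_takers = len(administrations)
    if n_takers == 0:
        raise ValueError("at least one test taker is required")
    # Count each item at most once per test taker.
    counts = Counter(item for items in administrations for item in set(items))
    return {item: 100.0 * counts[item] / n_takers for item in item_ids}


if __name__ == "__main__":
    # Hypothetical logs for four test takers and a four-item bank.
    logs = [["a", "b"], ["a", "c"], ["a"], ["b"]]
    print(exposure_per_100(logs, ["a", "b", "c", "d"]))
```

Items that are never selected show a rate of 0, and items selected for nearly everyone approach 100, which makes gaps and overused regions of the bank easy to see when the rates are plotted against item difficulty.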
Current Study: Conclusions
• More items with lower difficulty should be added to both item banks.
• If the CATs are administered to students at an advanced level, more items with higher difficulty will need to be added to both item banks.
• If the cut-off point of the test is set between 0 and 3 logits for Vgm and between -1 and 3 logits for Lng, the current item banks can serve the CAT well.
Thank you for listening.
Tetsuo Kimura [email protected]
Files for Moodle UCAT https://github.com/VERSION2-Inc/moodle_ucat
Acknowledgements: Part of the present study was supported by a Grant-in-Aid for Scientific Research for 2010-2012 (No. 22520590) from the Japan Society for the Promotion of Science.