Running a computer based exam:
the good, the bad and the ugly
Mark Rutter
Background
• Behavioural Methodology module
• Level 5 module covering the practical issues associated with designing, conducting and interpreting animal behaviour studies
• Studied by 2nd year ABW and 4th year ABW top-ups (Hons and non-Hons)
• Complements Research Methods (but not a formal co-requisite)
• I’m the sole tutor
Changes for 2010/11
• Following external examiner comments that we over-assess our students…
• … I dropped the assignment and made this 100% exam (with HoD & AVSC approval)
• We also ran the exam as one of the UMF e-assessment pilot modules
• Decided to keep largely the same exam format as previous years, just run it on a computer
• (To allow a comparison of sorts between paper-based exams and e-assessment)
Exam structure
Section  Type                 Content                      Marks
A        Multiple choice      20 MCQ. Negative marking     20
B        Short answer         One word up to a sentence    20
C        Data interpretation  Plot a graph. Interpret it   20
         Animal tracking      Plot tracking data on map
D        Expt. design         For a given scenario         20
E        Essay                One essay from two titles    20
• 2.5 hours i.e. ½ hour per section. 26 students.
Getting questions into QMark
• Started with questions in a Word document
• Cut-and-paste (via Notepad to strip formatting) into QMark (with Carl’s help)
• Advantages:
– Do not need QMark access to write questions
– Easier to moderate questions (internal and ExEx)
• Disadvantage:
– Extra work?
– Will not work with more advanced questions
RUNNING THE E-EXAM
Two ‘tests’
• Ran a ‘mock-mock’ with staff
– Several problems identified and corrected
• Ran a ‘mock’ with students
– Could not access exam at first (ID number issue)!
– But it was fine once we got started, other than…
• Subsequent student feedback
– Carl and I had put each exam section in its own QMark section, requiring students to complete the exam in order
– Students wanted to do the exam sections in any order, and wanted to be able to go back to previous sections
– So we moved all the questions into a single QMark section
The ‘real’ exam
• On the day:
– Again, students could not access the exam at first!
– (Different problem – a language setting issue then meant they were locked out as they only had one ‘attempt’)
– Problem sorted in 15 mins, and I gave them 5 min extra
– Then everything ran smoothly!
– (Had a back-up ‘paper’ paper just in case)
• Gave students a choice of writing the essay (and the longer answers) either in an answer book or by typing it on the computer
• 21/26 (81%) chose typing on the computer
MARKING THE ANSWERS
Auto vs manual marking
Section  Type                 Marking
A        Multiple choice      All marked automatically
B        Short answer         Single words auto-marked
C        Data interpretation  Numeric answers marked automatically
         Animal tracking
D        Expt. design         Manually marked
E        Essay                Manually marked
• 32/45 (71%) of the questions were auto-marked
Manual marking
• Initially tried to do this in QMark…
• … but not all the answers could be accessed (?)
• Julie suggested exporting everything to Excel
• Typed essays easier to mark than handwritten essays
• This lets you mark each answer in turn (this is good!)
Student ID B09 Give an example of when one-zero sampling is appropriate (1 mark)
???????? recording rare occurances of behaviour
???????? play behaviour in young animals
???????? wehn observing behavour that frequenstly stops and starts- observing play in young animals
???????? observing a individual behaviour
???????? with an incative animal which does not perform may behaviours at a time
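Marking each answer in turn amounts to grouping the exported responses by question rather than by student. A minimal Python sketch of that regrouping, assuming the export has been saved as CSV; the column names `student_id`, `question` and `answer`, the IDs, and the B10 row are hypothetical, since the actual export layout is not shown in these slides:

```python
import csv
import io
from collections import defaultdict

def group_by_question(csv_text):
    """Return {question: [(student_id, answer), ...]} from CSV text."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["question"]].append((row["student_id"], row["answer"]))
    return dict(grouped)

# Hypothetical export: column names, IDs and the B10 row are illustrative only.
sample = """student_id,question,answer
S1,B09,recording rare occurrences of behaviour
S2,B09,play behaviour in young animals
S1,B10,focal sampling
"""

by_question = group_by_question(sample)
# All answers to one question can now be read (and marked) together.
for student, answer in by_question["B09"]:
    print(student, "->", answer)
```

Reading all answers to one question side by side is also what makes the ‘post-hoc’ refinement of accepted answers practical.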
Some extra QA
• Question by question allows for ‘post-hoc’ refinement e.g. decided to accept ‘chewing’ and ‘ruminating’
• All the auto-marked ‘one-word’ answers checked
• Corrected a mistake (required correct Capitalisation)
• Marks collated in an Excel spreadsheet
• Six sets of answers selected to be checked
• For these, all answers marked and added up manually
• Marks agreed with the auto-marked answers (good!)
• All this was printed off (A3) for moderation and ExEx
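The two fixes above (accepting more than one word, and not requiring exact capitalisation) can be illustrated with a small check. This is only a sketch of a post-hoc re-check done outside QMark, not QMark's own matching logic; the function name `is_correct` is hypothetical:

```python
def is_correct(response, accepted):
    """Case- and whitespace-insensitive match against a list of accepted answers."""
    normalised = response.strip().casefold()
    return normalised in {a.strip().casefold() for a in accepted}

# Accepted words taken from the slide's example; the responses are made up.
accepted_answers = ["chewing", "ruminating"]
print(is_correct("Ruminating ", accepted_answers))
print(is_correct("grazing", accepted_answers))
```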
STUDENT PERFORMANCE
Compare with previous years
• Simplest approach is to consider the mean and the spread of marks for the module…
• …but this does not take into account that some cohorts are ‘better’, on average, than others
• So I compared each student’s module mark with their average mark for the whole 2nd year
• Scatter plot with a fitted line
[Figure, 2008/09: scatter plot of each student's mark for this module (y) against their mean mark for all modules (x), both axes 0 to 100. Fitted line y = 1.0929x - 0.8009, R² = 0.7935. This module mean = 67.4; all modules mean = 62.4]
[Figure, 2009/10: same axes. Fitted line y = 1.2219x - 11.301, R² = 0.8724. This module mean = 55.6; all modules mean = 54.7]
[Figure, 2010/11: same axes. Fitted line y = 1.3018x - 13.729, R² = 0.5147. This module mean = 62.8; all modules mean = 58.8]
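For anyone wanting to repeat the comparison, the fitted line and R² on each plot are an ordinary least-squares fit of module mark on year average. A minimal sketch in plain Python, using made-up marks (the students' actual marks are not reproduced in these slides); `fit_line` is a hypothetical helper name:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data: year averages (x) and module marks (y) for five students.
year_average = [45.0, 52.0, 58.0, 64.0, 71.0]
module_mark = [41.0, 50.0, 63.0, 66.0, 80.0]
slope, intercept, r_squared = fit_line(year_average, module_mark)
print(f"y = {slope:.4f}x + {intercept:.4f}, R2 = {r_squared:.4f}")
```

A slope near 1 with a small intercept would mean students scored about the same on this module as on their other modules.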
ITEM ANALYSIS
Item analysis
• As well as (supposedly?) saving time, computer based exams allow the analysis of the results (called item analysis)
• e.g. for each question:
– How difficult was it? (proportion getting it right)
– How discriminating was it? (did better students get it correct, and the worse ones get it wrong?)
• Gives quantitative, objective data about whether your questions are any good
Question difficulty: proportion of students answering correctly

Question  Marks for each of the 26 students  # correct  difficulty
A4   3 3 3 0 -1 0 -1 -1 3 3 0 -1 3 3 0 3 0 -1 -1 -1 -1 3 -1 0 -1 -1   9   0.35
A13  3 0 0 3 3 3 3 0 -1 3 0 3 0 3 0 0 -1 -1 0 3 3 0 0 0 0 -1   10   0.38
A15  3 3 3 -1 3 3 3 -1 3 -1 -1 3 0 -1 0 -1 -1 3 3 -1 3 -1 -1 0 -1 -1   11   0.42
A9   3 3 3 3 0 -1 0 3 -1 0 3 3 3 -1 0 3 -1 -1 3 -1 -1 -1 3 -1 -1 -1   11   0.42
A1   3 3 3 0 -1 0 0 -1 3 3 -1 3 0 3 0 0 -1 3 3 0 3 -1 -1 0 -1 3   11   0.42
A5   3 3 3 -1 3 3 3 3 -1 3 -1 -1 3 -1 3 -1 -1 -1 3 3 -1 -1 -1 -1 -1 -1   12   0.46
A10  -1 0 3 3 -1 3 -1 -1 3 -1 3 -1 3 -1 3 -1 3 0 3 3 -1 -1 3 3 3 -1   13   0.50
A3   3 3 3 3 3 -1 -1 0 3 3 -1 3 3 3 3 3 -1 -1 0 3 -1 0 0 3 3 -1   15   0.58
A14  0 0 3 3 3 0 3 0 3 -1 3 3 0 3 0 3 0 3 3 3 3 0 0 3 0 3   15   0.58
A20  3 3 3 0 3 3 3 0 0 3 3 0 0 3 3 3 3 3 0 3 0 0 3 3 -1 -1   16   0.62
A19  3 3 3 3 3 3 3 3 3 3 0 3 3 3 3 3 0 3 0 3 0 0 0 0 -1 -1   17   0.65
A6   3 3 3 3 -1 3 3 3 3 0 3 3 -1 -1 3 -1 -1 3 3 3 3 -1 3 -1 3 -1   17   0.65
A8   3 3 3 3 3 3 3 3 3 3 3 3 3 -1 -1 3 3 3 3 3 -1 3 -1 0 -1 -1   18   0.69
A17  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 -1 3 3 3 3 -1 -1 -1   21   0.81
A7   3 3 3 3 -1 3 3 3 3 -1 3 3 3 -1 3 3 3 3 3 3 3 -1 3 3 3 -1   21   0.81
A16  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 -1 3 3   24   0.92
A11  3 3 3 3 3 3 3 3 3 3 3 -1 3 3 3 3 3 3 3 3 3 3 3 -1 3 3   24   0.92
A2   3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 3 3   25   0.96
A12  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3   26   1.00
A18  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3   26   1.00
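The difficulty figure is simply the number of correct answers divided by the number of students. A short Python sketch using the A4 row from the item data; it assumes a mark of 3 denotes a correct answer, which is consistent with the ‘# correct’ column but is not stated explicitly on the slide:

```python
def difficulty(marks, correct_mark=3):
    """Return (number correct, proportion correct) for one question.

    Assumes `correct_mark` is the mark awarded for a correct answer.
    """
    n_correct = sum(1 for m in marks if m == correct_mark)
    return n_correct, n_correct / len(marks)

# Marks for question A4, one per student, copied from the item data.
a4 = [3, 3, 3, 0, -1, 0, -1, -1, 3, 3, 0, -1, 3,
      3, 0, 3, 0, -1, -1, -1, -1, 3, -1, 0, -1, -1]
n_correct, proportion = difficulty(a4)
print(n_correct, round(proportion, 2))  # 9 0.35
```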
How discriminating is a question?
• 1.00 means all those in the top group got it right and all those in the bottom got it wrong (this is good!)
• 0.00 means equal numbers in the top and bottom groups got it right
• Negative values mean more of the bottom group got it correct than the top group (this is bad!)
\[
\text{Discrimination index} = \frac{(\text{No. of students in the top quartile getting the question right}) - (\text{No. of students in the bottom quartile getting the question right})}{\text{No. of students in a quartile}}
\]
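As a worked example of the formula, here is a Python sketch applied to the A19 row of the item data, with students ordered from highest to lowest overall mark. It assumes a mark of 3 means ‘correct’ and uses groups of 7 students (about 27% of 26), the grouping used in the deck's item table rather than strict quartiles:

```python
def discrimination_index(marks_ranked, group_size, correct_mark=3):
    """(correct in top group - correct in bottom group) / group size.

    `marks_ranked` must be ordered from the best to the worst student overall.
    """
    top = sum(1 for m in marks_ranked[:group_size] if m == correct_mark)
    bottom = sum(1 for m in marks_ranked[-group_size:] if m == correct_mark)
    return (top - bottom) / group_size

# Marks for question A19, students ranked by overall exam mark.
a19 = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3,
       3, 3, 3, 0, 3, 0, 3, 0, 0, 0, 0, -1, -1]
index = discrimination_index(a19, group_size=7)
print(round(index, 2))  # 0.86
```

All 7 of the top group and 1 of the bottom group answered A19 correctly, giving (7 - 1) / 7.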
Discrimination index
• What level of discrimination is appropriate?
• QMark (Pope, 2009) suggests:
Negative         Major problem – find out why!
0 to 0.15        Low – review to determine why
0.16 to 0.29     Moderate outcome discrimination
0.30 to 0.49     High
0.5 or greater   Very high
• Chiavaroli and Familiari (2011) suggest 0.2
• Abdel-Hameed et al. (2005) suggest 0.3
• McAlpine (2002) suggests 0.4
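The QMark bands can be applied mechanically once an index has been calculated. A small Python sketch; the function name `qmark_band` is hypothetical, and rounding to two decimal places before banding is an assumption made here because the published bands leave gaps (for example between 0.15 and 0.16):

```python
def qmark_band(index):
    """Map a discrimination index to the QMark (Pope, 2009) band names."""
    value = round(index, 2)  # assumption: band on the two-decimal value
    if value < 0:
        return "Negative"
    if value <= 0.15:
        return "Low"
    if value <= 0.29:
        return "Moderate"
    if value <= 0.49:
        return "High"
    return "Very high"

# Indices expressed as (top - bottom) / 7, as in the item table.
for label, index in [("A19", 6 / 7), ("A9", 3 / 7), ("A3", 2 / 7), ("A10", -1 / 7)]:
    print(label, round(index, 2), qmark_band(index))
```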
Discrimination index by question
All 26 students, ranked according to overall exam mark. Top 27% of students = first 7 columns; bottom 27% of students = last 7 columns.

Overall exam marks, in rank order:
82.7 81.0 80.0 74.3 70.3 69.3 69.0 67.7 67.0 64.7 63.7 60.7 56.7 56.7 55.7 55.7 55.3 54.3 54.3 54.0 52.0 50.3 47.0 46.0 41.0 35.7

Question  Marks (students in rank order)  top  bottom  index  band
A19  3 3 3 3 3 3 3 3 3 3 0 3 3 3 3 3 0 3 0 3 0 0 0 0 -1 -1   7   1   0.86   V. high
A8   3 3 3 3 3 3 3 3 3 3 3 3 3 -1 -1 3 3 3 3 3 -1 3 -1 0 -1 -1   7   2   0.71   V. high
A15  3 3 3 -1 3 3 3 -1 3 -1 -1 3 0 -1 0 -1 -1 3 3 -1 3 -1 -1 0 -1 -1   6   1   0.71   V. high
A5   3 3 3 -1 3 3 3 3 -1 3 -1 -1 3 -1 3 -1 -1 -1 3 3 -1 -1 -1 -1 -1 -1   6   1   0.71   V. high
A9   3 3 3 3 0 -1 0 3 -1 0 3 3 3 -1 0 3 -1 -1 3 -1 -1 -1 3 -1 -1 -1   4   1   0.43   High
A13  3 0 0 3 3 3 3 0 -1 3 0 3 0 3 0 0 -1 -1 0 3 3 0 0 0 0 -1   5   2   0.43   High
A17  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 -1 3 3 3 3 -1 -1 -1   7   4   0.43   High
A20  3 3 3 0 3 3 3 0 0 3 3 0 0 3 3 3 3 3 0 3 0 0 3 3 -1 -1   6   3   0.43   High
A3   3 3 3 3 3 -1 -1 0 3 3 -1 3 3 3 3 3 -1 -1 0 3 -1 0 0 3 3 -1   5   3   0.29   Moderate
A4   3 3 3 0 -1 0 -1 -1 3 3 0 -1 3 3 0 3 0 -1 -1 -1 -1 3 -1 0 -1 -1   3   1   0.29   Moderate
A6   3 3 3 3 -1 3 3 3 3 0 3 3 -1 -1 3 -1 -1 3 3 3 3 -1 3 -1 3 -1   6   4   0.29   Moderate
A16  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 -1 3 3   7   5   0.29   Moderate
A1   3 3 3 0 -1 0 0 -1 3 3 -1 3 0 3 0 0 -1 3 3 0 3 -1 -1 0 -1 3   3   2   0.14   Low
A2   3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 -1 3 3 3   7   6   0.14   Low
A7   3 3 3 3 -1 3 3 3 3 -1 3 3 3 -1 3 3 3 3 3 3 3 -1 3 3 3 -1   6   5   0.14   Low
A11  3 3 3 3 3 3 3 3 3 3 3 -1 3 3 3 3 3 3 3 3 3 3 3 -1 3 3   7   6   0.14   Low
A12  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3   7   7   0.00   Low
A14  0 0 3 3 3 0 3 0 3 -1 3 3 0 3 0 3 0 3 3 3 3 0 0 3 0 3   4   4   0.00   Low
A18  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3   7   7   0.00   Low
A10  -1 0 3 3 -1 3 -1 -1 3 -1 3 -1 3 -1 3 -1 3 0 3 3 -1 -1 3 3 3 -1   3   4   -0.14   Negative (!)

Mean discrimination index across the 20 questions: 0.31
(2 slides showing exam questions have been removed)
Conclusions
• Will I use an e-exam again? Yes!
• Student mock exam worked well – keep this
• What will I do differently?
– Revise questions based on item analysis
– Modify some questions so more are marked automatically
– More testing earlier in the process
– Forget the scheduler and have live activation
– Mark and check answers in Excel (or Access?)
Acknowledgements
• Thanks to Abigail, Henry, Julie, Roger, Martyn, Sandra E and Sandra G
• But most of all Carl!