DESCRIPTION
Slides to accompany a presentation at AMEE 2012. The presentation describes our recent research showing that assessors' judgements within medical education are importantly influenced by recent experience through a process known as the contrast effect.
peter.yeates@manchester.ac.uk @brainstormpete
You’re Certainly Relatively Competent
Judgmental relativity in performance assessments: the influence of recent experience on Mini-CEX score choices
Peter Yeates, Karen Mann, Paul O’Neill, Kevin Eva
Background
Mini-CEX assessments
Background
• Scores by assessors highly variable
– 40% of observed score variance
– Range 1-6 on 9 point scale
• Novel enquiry to understand judgement processes
• ? Assessors comparing trainees rather than making criterion-referenced judgements
– Yeates, et al, 2012
Background
Recent experience of other trainees: comparison could cause two possible effects:
• Assimilation
• Contrast
Or: No influence
Research questions
1. Does recent observation of either “good” or “poor” performances influence assessors’ Mini-CEX scores?
– Assimilation / Contrast / No influence
2. If so, do other influences mediate this effect?
Methods
• Internet-based experimental design
• Consultant physicians
– Nationwide recruitment (England & Wales)
• Randomised to groups
• Blinded to intervention
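[Figure: study design. Each group scored three performances in an intervention phase and then three in a comparison phase, giving a score after each.]
Group A: intervention G1, G2, G3; comparison B1, B2, B3
Group B: intervention P1, P2, P3; comparison B1, B2, B3
Analysis: rANOVA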
Results
• 41 participants completed
– 32% female
– 11 out of 14 postgrad deaneries
– 13 different medical specialities
• Groups comparable at baseline on:
– Gender: 35% vs 29% (non-sig)
– Duration of consultancy: 13 yrs vs 8 yrs (p=0.03)
Group B scored 0.67 higher on 6 point scale than group A (F(1, 39) = 12.0, p = 0.001); Cohen’s d = 0.63 (moderate effect)
[Figure: Group mean scores for intervention and comparison phases. x-axis: phase (Intervention, Comparison); y-axis: mean scores (1-5); series: Group A, Group B.]
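The effect size quoted on this slide can be illustrated with a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. The function name `cohens_d` and the score lists are hypothetical and purely illustrative; they are not the study data, and the slides do not state exactly how d was derived from the rANOVA.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (group_b minus group_a) using the pooled sample SD.

    Assumption: two independent groups; this is a generic textbook
    formula, not necessarily the calculation used in the study.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)


# Hypothetical scores for borderline performances (NOT the study data).
scores_group_a = [2, 3, 3, 4, 3, 2]
scores_group_b = [3, 4, 4, 5, 3, 4]

d = cohens_d(scores_group_a, scores_group_b)
print(f"Cohen's d = {d:.2f}")
```

By a common rule of thumb, values around 0.5 are read as moderate and around 0.8 as large, which is consistent with the slide describing d = 0.63 as a moderate effect.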
Methods Follow up study:
[Figure: follow-up study design, with a score given after each performance.]
Group A (descending proficiency): G1, G2, B1, B2, P1, P2
Group B (ascending proficiency): P1, P2, B1, B2, G1, G2
Methods Follow up study:
Additional measures:
• Memory: “What percentage of trainees would do better?” Consider existing memory of trainees
• Insight: “How confident do you feel about the scores you gave?” High confidence, low manipulation = insight
Results Follow up study:
Group A: Good to Poor
Group B: Poor to Good
F(1, 47) = 9.80, p = 0.003, Cohen’s d = 0.52
[Figure: Group A and B mean scores by level of performance. x-axis: level of performance (Good, Borderline, Poor); y-axis: mean scores (1-6); series: Group A, Group B; two points are marked with asterisks (*).]
Results Follow up study:
Group A: Good to Poor
Group B: Poor to Good
F(1, 46) = 16.0, p < 0.001, Cohen’s d = 0.67
[Figure: Participants' "Percent Better" ratings by Level and Group. x-axis: level of performance (Good, Borderline, Poor); y-axis: Percent Better ratings (0-100); series: Group A, Group B; points are marked with * and ~.]
Results Follow up study:
Confidence ratings:
– Uniformly high (median = 6 out of 7)
– No variation by level
– No significant interaction between group effect and confidence (i.e. high and low confidence assessors just as susceptible)
Discussion
Recent experience caused a Contrast Effect
Theoretical:
• Competence based on relative rather than absolute criteria
• Robust despite appeals to long term memory
• Assessors lack insight into susceptibility
Practical:
• ? Fairness / safety of exams
References:
Azar, et al, 2007 Journal of Socio-Economics 36: 1-14
Gingerich, et al, 2011 Academic Medicine 86: s1-s7
Ginsburg, et al, 2010 Academic Medicine 85(5): 780-786
Govaerts, et al, 2011 Advances in Health Sciences Education 16(2): 151-65
Kogan, et al, 2011 Medical Education 45(10): 1048-60
Mussweiler, et al, 2003 Psychological Review 110(3): 472-489
Wedell, et al, 2005 Basic and Applied Social Psychology 27(3): 213-28
Yeates, et al, 2012 Advances in Health Sciences Education. On-line ahead of print
Questions ?
Recent experience vs. Hawk / Dove differences
Hawk / Dove Index (HDI):
– Participant z-score
– How far from the middle of the group?
Predictors: HDI, Group
Outcome
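The slide describes the Hawk / Dove Index only as a participant z-score. A minimal sketch of one way such an index could be computed follows, assuming (this is not stated in the slides) that it is based on each assessor's mean score and that negative values indicate harsher "hawk" assessors. The function name `hawk_dove_index`, the assessor labels, and the numbers are all hypothetical.

```python
from statistics import mean, stdev


def hawk_dove_index(assessor_means):
    """Return each assessor's z-score relative to all assessors.

    Assumption: the index is the assessor's mean score standardised
    against the whole group (sample SD). Negative = harsher than the
    group average ("hawk"), positive = more lenient ("dove").
    """
    values = list(assessor_means.values())
    group_mean = mean(values)
    group_sd = stdev(values)
    return {
        name: (value - group_mean) / group_sd
        for name, value in assessor_means.items()
    }


# Hypothetical mean scores per assessor (NOT the study data).
assessor_means = {
    "assessor_1": 2.5,
    "assessor_2": 3.0,
    "assessor_3": 3.5,
    "assessor_4": 4.0,
    "assessor_5": 4.5,
}

hdi = hawk_dove_index(assessor_means)
for name, z in hdi.items():
    print(f"{name}: HDI = {z:+.2f}")
```

An index of zero places an assessor exactly in the middle of the group, which matches the slide's "how far from the middle of the group?" framing.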
Hawk / Dove Results
• Overall model explained 50% of observed score variance
– r² = 0.50, p < 0.001
• Hawk / Dove accounted for 18%
– Change in r² = 0.18, p = 0.006
• Group (recent experience) then accounted for a further 24%
– Change in r² = 0.24, p < 0.001
• As a result, Group (recent experience) accounted for more score variation than assessors’ fixed Hawk / Dove differences
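The "change in" R-squared figures above suggest that predictors were entered in steps, with the Hawk / Dove index first and Group second. The sketch below shows how such a stepwise change in R-squared can be computed for two predictors, using the standard closed-form expression based on pairwise correlations. All names and data are hypothetical, and this is an illustration of the general technique rather than the authors' analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r2_two_predictors(x1, x2, y):
    """R-squared of a linear regression of y on x1 and x2 (with intercept)."""
    r_y1, r_y2, r_12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


# Hypothetical data (NOT the study data): one row per assessor.
hdi = [-1.2, -0.5, 0.0, 0.4, 1.3, -0.8, 0.2, 0.6]   # Hawk / Dove index
group = [0, 0, 0, 0, 1, 1, 1, 1]                    # 0 = Group A, 1 = Group B
score = [2.0, 2.5, 3.0, 3.0, 4.5, 3.0, 3.5, 4.0]    # outcome score

r2_step1 = pearson_r(hdi, score) ** 2               # step 1: HDI only
r2_step2 = r2_two_predictors(hdi, group, score)     # step 2: HDI + Group
r2_change = r2_step2 - r2_step1                     # variance added by Group

print(f"Step 1 R^2 = {r2_step1:.2f}")
print(f"Step 2 R^2 = {r2_step2:.2f}")
print(f"Change in R^2 for Group = {r2_change:.2f}")
```

The change in R-squared at step 2 is the share of outcome variance that Group explains over and above the predictor entered first, so its size depends on the order of entry whenever the predictors are correlated.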
[Figure: scores by video. x-axis: Video (1-6); y-axis: Score (1.00-6.00).]