Slides to accompany a presentation at AMEE 2012. The presentation describes our recent research showing that assessors' judgements within medical education are importantly influenced by recent experience through a process known as the contrast effect.
[email protected]..... @brainstormpete
You’re Certainly Relatively Competent
Judgmental relativity in performance assessments: the influence of recent experience on Mini-CEX score choices
Peter Yeates, Karen Mann, Paul O’Neill, Kevin Eva
Background
• Scores by assessors highly variable
– 40% of observed score variance
– Range 1-6 on 9 point scale
• Novel enquiry to understand judgement processes
• ? Assessors comparing trainees rather than making criterion-referenced judgements
– Yeates, et al, 2012
Background
Recent experience of other trainees
Comparison could cause two possible effects:
• Assimilation
• Contrast
Otherwise: no influence
Research questions
1. Does recent observation of either “good” or “poor” performances influence assessors’ Mini-CEX scores?
– Assimilation / Contrast / No influence
2. If so, do other influences mediate this effect?
Methods
• Internet-based experimental design
• Consultant physicians
– Nationwide recruitment (England & Wales)
• Randomised to groups
• Blinded to intervention
[Figure: study design. Groups A and B; intervention phase performances G1, G2, G3 and P1, P2, P3; comparison phase performances B1, B2, B3 for both groups; each performance is scored; analysis by rANOVA.]
Results
• 41 participants completed
– 32% female
– 11 out of 14 postgrad deaneries
– 13 different medical specialities
• Groups comparable at baseline on:
– Gender: 35% vs 29% (non-sig)
– Duration of consultancy: 13 yrs vs 8 yrs (p=0.03)
[Figure: Group mean scores for intervention and comparison phases. X-axis: phase (Intervention, Comparison); y-axis: mean scores; series: Group A, Group B.]
Group B scored 0.67 higher than Group A on a 6 point scale (F(1, 39) = 12.0, p = 0.001; Cohen’s d = 0.63, a moderate effect)
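For readers unfamiliar with the effect size quoted above, the following is a minimal sketch of one common way to compute Cohen's d for two independent groups, using the pooled standard deviation. The slides do not say which variant of d was used, and the scores below are invented purely for illustration; they are not the study data. The function name `cohens_d` is hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_x, group_y):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_x, n_y = len(group_x), len(group_y)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_x - 1) * variance(group_x) + (n_y - 1) * variance(group_y)
    ) / (n_x + n_y - 2)
    return (mean(group_x) - mean(group_y)) / math.sqrt(pooled_var)


# Hypothetical per-assessor mean scores (NOT the study data).
group_b = [4.0, 3.5, 4.5, 3.0, 4.0]
group_a = [3.0, 3.5, 2.5, 3.0, 4.0]

d = cohens_d(group_b, group_a)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, values of d around 0.5 are described as moderate, which matches the wording on the slide.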
Methods Follow up study:
[Figure: follow-up study design. Group A (descending proficiency): G1, G2, B1, B2, P1, P2. Group B (ascending proficiency): P1, P2, B1, B2, G1, G2. Each performance is scored.]
Methods Follow up study:
Additional measures:
• Memory: “What percentage of trainees would do better?”
– Consider existing memory of trainees
• Insight: “How confident do you feel about the scores you gave?”
– High confidence, low manipulation = insight
Results Follow up study:
Group A: Good to Poor
Group B: Poor to Good
[Figure: Group A and B mean scores by level of performance. X-axis: level of performance (Good, Borderline, Poor); y-axis: mean scores (1-6); series: Group A, Group B.]
F(1, 47) = 9.80, p = 0.003, Cohen’s d = 0.52
Results Follow up study:
Group A: Good to Poor
Group B: Poor to Good
F(1, 46) = 16.0, p < 0.001, Cohen’s d = 0.67
[Figure: Participants' "Percent Better" ratings by Level and Group. X-axis: level of performance (Good, Borderline, Poor); y-axis: Percent Better ratings (0-100); series: Group A, Group B.]
Results Follow up study:
Confidence ratings:
– Uniformly high (median = 6 out of 7)
– No variation by level
– No significant interaction between group effect and confidence (i.e. high and low confidence assessors just as susceptible)
Discussion
Recent experience caused a Contrast Effect
Theoretical:
• Competence based on relative rather than absolute criteria
• Robust despite appeals to long term memory
• Assessors lack insight into susceptibility
Practical:
• ? Fairness / safety of exams
References:
Azar, et al, 2007 Journal of Socio-economics 36: 1-14
Gingerich, et al, 2011 Academic Medicine 86: s1-s7
Ginsburg, et al, 2010 Academic Medicine 85(5): 780-786
Govaerts, et al, 2011 Advances in Health Sciences Education 16(2): 151-65
Kogan, et al, 2011 Medical Education 45(10): 1048-60
Mussweiler, et al, 2003 Psychological Review 110(3): 472-489
Wedell, et al, 2005 Basic and Applied Social Psychology 27(3): 213-28
Yeates, et al, 2012 Advances in Health Sciences Education. On-line ahead of print
Questions ?
Recent experience vs. Hawk / Dove differences
Hawk / Dove Index (HDI):
– Participant z-score
– How far from the middle of the group?
Predictors: HDI, Group
Outcome: score
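As a rough illustration of a z-score based Hawk / Dove Index, the sketch below standardises each assessor's mean score against the other assessors, so that the index says how far each assessor sits from the middle of the group. The slides do not specify which scores the index was computed from, so the input values here are invented, and the function name `hawk_dove_index` is hypothetical.

```python
from statistics import mean, stdev


def hawk_dove_index(assessor_means):
    """Return each assessor's z-score relative to the group of assessors."""
    centre = mean(assessor_means)
    spread = stdev(assessor_means)  # sample standard deviation
    return [(m - centre) / spread for m in assessor_means]


# Hypothetical mean score given by each assessor (NOT the study data).
assessor_means = [3.0, 3.5, 4.0, 4.5, 5.0]

hdi = hawk_dove_index(assessor_means)
for raw, z in zip(assessor_means, hdi):
    # Under this scoring direction, negative values are stricter than the
    # group average and positive values are more lenient.
    print(f"mean score {raw:.1f} -> HDI {z:+.2f}")
```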
Hawk / Dove Results
• Overall model explained 50% of observed score variance
– r² = 0.50, p < 0.001
• Hawk / Dove accounted for 18%
– Change in r² = 0.18, p = 0.006
• Group (recent experience) then accounted for a further 24%
– Change in r² = 0.24, p < 0.001
• As a result, Group (recent experience) accounted for more score variation than assessors’ fixed Hawk / Dove differences
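The "change in r2" figures above come from entering predictors in steps and seeing how much extra variance each step explains. The sketch below shows that idea for two predictors, using the standard closed-form expression for a two-predictor R² in terms of pairwise Pearson correlations. It is an illustration under stated assumptions, not the authors' analysis: the data are invented, group membership is coded 0/1, and the helper names (`pearson_r`, `r2_one`, `r2_two`) are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r2_one(x, y):
    """R-squared of a simple regression of y on x."""
    return pearson_r(x, y) ** 2


def r2_two(x1, x2, y):
    """R-squared of a regression of y on x1 and x2 (predictors not collinear)."""
    r1, r2, r12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


# Hypothetical data (NOT the study data): HDI z-scores, group coded 0/1, scores.
hdi = [-1.2, -0.5, 0.1, 0.8, 1.1, -0.9, 0.0, 0.7]
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [2.5, 3.0, 3.2, 3.9, 4.6, 3.4, 4.0, 4.5]

step1 = r2_one(hdi, score)          # step 1: HDI only
step2 = r2_two(hdi, group, score)   # step 2: HDI plus Group
delta_r2 = step2 - step1            # extra variance explained by Group

print(f"step 1 R2 = {step1:.2f}, step 2 R2 = {step2:.2f}, change = {delta_r2:.2f}")
```

Entering Hawk / Dove first and Group second, as on the slide, means the change at step 2 reflects only what recent experience explains beyond assessors' stable stringency.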
[Figure: Score (1.00 to 6.00) by Video (1 to 6).]