
A Programmatic Approach to Assessment
ADEA/ADEE Meeting: Shaping the Future of Dental Education, 8–9 May, London, UK

Cees van der Vleuten, Maastricht University, The Netherlands
www.maastrichtuniversity.nl/she · www.ceesvandervleuten.com
With gratitude to the support of LiftUPP!

Overview

• From practice to research
• From research to theory
• From theory to practice
• Conclusions

The Toolbox

• MCQ, MEQ, OEQ, SIMP, write-ins, key feature, progress test, PMP, SCT, viva, long case, short case, OSCE, OSPE, DOCEE, SP-based test, video assessment, MSF, Mini-CEX, DOPS, assessment center, self-assessment, peer assessment, incognito SPs, portfolio …

The competency pyramid: Knows → Knows how → Shows how → Does

• Knows: fact-oriented assessment (MCQ, write-ins, oral …)
• Knows how: scenario- or case-based assessment (MCQ, write-ins, oral …)
• Shows how: performance assessment in vitro (assessment centers, OSCE …)
• Does: performance assessment in vivo (in situ performance assessment, 360°, peer assessment)

The way we climbed …

Characteristics of instruments:

• Validity
• Reliability
• Educational impact
• Acceptability
• Cost

Validity: what are we assessing?

• Curricula have changed from an input orientation to an output orientation

• We went from haphazard learning to integrated learning objectives, to end objectives, and now to (generic) competencies

• We went from teacher-oriented programs to learning-oriented, self-directed programs

Competency frameworks

CanMEDS:
• Medical expert
• Communicator
• Collaborator
• Manager
• Health advocate
• Scholar
• Professional

ACGME:
• Medical knowledge
• Patient care
• Practice-based learning & improvement
• Interpersonal and communication skills
• Professionalism
• Systems-based practice

GMC:
• Good clinical care
• Relationships with patients and families
• Working with colleagues
• Managing the workplace
• Social responsibility and accountability
• Professionalism


Validity: what are we assessing?

• Standardized assessment (fairly established)
• Unstandardized assessment (emerging)

Messages from validity research

• There is no magic bullet; we need a mixture of methods to cover the competency pyramid

• We need BOTH standardized and non-standardized assessment methods

• For standardized assessment, quality control around test development and administration is vital

• For unstandardized assessment, the users (the people) are vital.

Method reliability as a function of testing time

Method                          1 h     2 h     4 h     8 h
MCQ [1]                         0.62    0.77    0.87    0.93
Case-based short essay [2]      0.68    0.81    0.89    0.94
PMP [1]                         0.36    0.53    0.69    0.82
Oral exam [3]                   0.50    0.67    0.80    0.89
Long case [4]                   0.60    0.75    0.86    0.92
OSCE [5]                        0.54    0.70    0.82    0.90
Practice video assessment [7]   0.62    0.77    0.87    0.93
Incognito SPs [8]               0.61    0.76    0.86    0.93
Mini-CEX [6]                    0.73    0.84    0.92    0.96

[1] Norcini et al., 1985. [2] Stalenhoef-Halling et al., 1990. [3] Swanson, 1987. [4] Wass et al., 2001. [5] Van der Vleuten, 1988. [6] Norcini et al., 1999. [7] Ram et al., 1999. [8] Gorter, 2002.
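The stepwise gains in the table above are consistent with the Spearman–Brown prophecy formula, which predicts reliability when testing time (or the number of items) is multiplied by a factor k. A minimal sketch, assuming the published coefficients were extrapolated this way; applying it to the MCQ row reproduces the tabulated values:

```python
def spearman_brown(r1: float, k: float) -> float:
    """Predict reliability after multiplying testing time by k,
    given reliability r1 for one unit of testing time."""
    return k * r1 / (1 + (k - 1) * r1)

# MCQ row: reliability 0.62 at 1 hour of testing time
for hours in (2, 4, 8):
    print(hours, round(spearman_brown(0.62, hours), 2))
# -> 2 0.77, 4 0.87, 8 0.93, matching the table
```

The same check works for the other rows, e.g. Mini-CEX (0.73 at 1 h) extrapolates to 0.96 at 8 h.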

Reliability as a function of sample size (Moonen et al., 2013)

[Figure: generalizability (G) coefficient (0.65–0.90) versus number of observations (4–12) for Mini-CEX (KPB), OSATS, and MSF, with the G = 0.80 threshold marked.]

Effect of aggregation across methods (Moonen et al., 2013)

Method      Sample needed stand-alone    Sample needed as composite
Mini-CEX    8                            5
OSATS       9                            6
MSF         9                            2

Results: reliability per instrument
• All years: 8 Mini-CEX (KPB), 9 OSATS, 9 MSF
• First year: 6 Mini-CEX, 6 OSATS, 6 MSF

Combined across instruments
• All years: 7 Mini-CEX, 8 OSATS, 1 MSF, or 5 Mini-CEX, 6 OSATS, 2 MSF
• First year: 5 Mini-CEX, 6 OSATS, 1 MSF
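The stand-alone sample sizes above can be read as the Spearman–Brown relationship run in reverse: given the reliability of a single observation, solve for the number of observations needed to reach G = 0.80. A sketch under stated assumptions — the single-observation reliabilities below are illustrative values chosen for the example, not figures reported by Moonen et al.:

```python
import math

def n_needed(r1: float, target: float = 0.80) -> int:
    """Observations needed to reach the target reliability, by inverting
    the Spearman-Brown prophecy formula:
        target = n*r1 / (1 + (n-1)*r1)  =>  n = target*(1-r1) / (r1*(1-target))
    """
    return math.ceil(target * (1 - r1) / (r1 * (1 - target)))

# Assumed single-observation reliabilities, for illustration only:
print(n_needed(0.35))  # -> 8  (comparable to the Mini-CEX stand-alone sample)
print(n_needed(0.30))  # -> 10
```

The formula makes the practical point visible: small gains in per-observation reliability sharply reduce the sampling burden.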

Messages from reliability research

• Acceptable reliability is only achieved with large samples of test elements (contexts, cases) and assessors

• No method is inherently better than any other (that includes the new ones!)

• Objectivity is NOT equal to reliability
• Many subjective judgments are pretty reproducible/reliable.

Educational impact: how does assessment drive learning?

• The relationship is complex (cf. Cilliers, 2011, 2012)
• But the impact is often very negative:
  – Poor learning styles
  – Grade culture (grade hunting, competitiveness)
  – Grade inflation (e.g. in the workplace)

• A lot of REDUCTIONISM!
  – Little feedback (a grade is the poorest form of feedback one can get)
  – Non-alignment with curricular goals
  – Non-meaningful aggregation of assessment information
  – Few longitudinal elements
  – Tick-box exercises (OSCEs, logbooks, work-based assessment).

• "All learners construct knowledge from an inner scaffolding of their individual and social experiences, emotions, will, aptitudes, beliefs, values, self-awareness, purpose, and more … if you are learning …, what you understand is determined by how you understand things, who you are, and what you already know."

Peter Senge, Director of the Center for Organizational Learning at MIT (as cited in van Ryn et al., 2014)

Messages from learning-impact research

• No assessment without (meaningful) feedback

• Narrative feedback has a lot more impact on complex skills than scores

• Provision of feedback is not enough (feedback is a dialogue)

• Longitudinal assessment is needed.

Overview

• From practice to research
• From research to theory
• From theory to practice
• Conclusions

Limitations of the single-method approach

• No single method can do it all
• Each individual method has (significant) limitations

• Each single method is a considerable compromise on reliability, validity, and educational impact

Implications

• Validity: a multitude of methods is needed
• Reliability: a lot of (combined) information is needed

• Learning impact: assessment should provide (longitudinal) meaningful information for learning

Programmatic assessment

• A curriculum is a good metaphor; in a program of assessment:
  – Elements are planned, arranged, coordinated
  – The program is systematically evaluated and reformed

• But how? (The literature provides extremely little support!)

Programmatic assessment

• Dijkstra et al., 2012: 73 generic guidelines

• Still to be done:
  – Further validation
  – A feasible (self-assessment) instrument

• ASPIRE assessment criteria

Building blocks for programmatic assessment (1)

• Every assessment is but one data point (Δ)
• Every data point is optimized for learning:
  – Information-rich (quantitative, qualitative)
  – Meaningful
  – Variation in format

• Summative versus formative is replaced by a continuum of stakes

• The number of data points is proportionally related to the stakes of the decision to be taken.

Continuum of stakes, number of data points, and their function

No stake → very high stake

One data point:
• Focused on information
• Feedback-oriented
• Not decision-oriented

Intermediate progress decisions:
• More data points needed
• Focus on diagnosis, remediation, prediction

Final decisions on promotion or selection:
• Many data points needed
• Focused on a (non-surprising) heavy decision

Assessment information as pixels

Classical approach to aggregation

• Method 1 to assess skill A → Σ
• Method 2 to assess skill B → Σ
• Methods 3 and 4 to assess skill C → Σ

More meaningful aggregation

• Methods 1–4 each contribute data points
• Aggregation is per skill: Σ for skill A, Σ for skill B, Σ for skill C, Σ for skill D

Overview

• From practice to research
• From research to theory
• From theory to practice
• Conclusions

From theory back to practice

• Existing best practices:
  – Veterinary education, Utrecht
  – Cleveland Learner Clinic, Cleveland, Ohio
  – Dutch specialty training in General Practice
  – McMaster Modular Assessment Program in Emergency Medicine
  – Graduate entry program, Maastricht


Physician–clinical investigator program

• 4-year graduate entry program
• Competency-based (CanMEDS) with emphasis on research
• PBL program:
  – Year 1: classic PBL
  – Year 2: real-patient PBL
  – Year 3: clerkship rotations
  – Year 4: participation in research and health care

• High expectations of students in terms of motivation, promotion of excellence, and self-directedness

The assessment program

• Assessment in modules: assignments, presentations, end examination, etc.
• Longitudinal assessment: assignments, reviews, projects, progress tests, evaluation of professional behavior, etc.
• All assessment is informative and low-stake, formative
• The portfolio is the central instrument

Cross-module assessment of professional behavior

[Diagram: Modules 1–4 in sequence, each followed by a progress test (PT 1–4); longitudinal, cross-module assessment of knowledge, skills, and professional behavior feeds a portfolio that is discussed in recurring counselor meetings; cross-module assessment of knowledge takes place in the Progress Test.]

Longitudinal total test scores across 12 measurement moments and predicted future performance

Maastricht Electronic portfolio (ePass)

• Comparison between the score of the student and the average score of his/her peers.
• Every blue dot corresponds to an assessment form included in the portfolio.


Coaching by counselors

• Coaching is essential for successful use of reflective learning skills
• The counselor gives advice/comments (whether asked or not)
• He/she counsels when choices have to be made
• He/she guards and discusses study progress and the development of competencies

Decision-making by committee

• Committee of counselors and externals
• The decision is based on portfolio information, the counselor's recommendation, and competency standards
• Deliberation is proportional to the clarity of the information
• Decisions are justified when needed; a remediation recommendation may be provided

Strategy to establish trustworthiness

Criterion         Strategy                Potential assessment strategy (sample)
Credibility       Prolonged engagement    Training of examiners
                  Triangulation           Tailored volume of expert judgment based on certainty of information
                  Peer examination        Benchmarking examiners
                  Member checking         Incorporate learner view
                  Structural coherence    Scrutiny of committee inconsistencies
Transferability   Time sampling           Judgment based on broad sample of data points
                  Thick description       Justify decisions
Dependability     Stepwise replication    Use multiple assessors who have credibility
Confirmability    Audit                   Give learners the possibility to appeal the assessment decision

Progress test embedded in programmatic assessment: use of information and feedback to self-direct learning

Overview

• From practice to research
• From research to theory
• From theory to practice

• Conclusions

Conclusions 1: the way forward

• We have to stop thinking in terms of individual assessment methods

• A systematic and programmatic approach is needed, longitudinally oriented

• Every method of assessment may be functional (old and new; standardized and unstandardized)

• Professional judgment is imperative (similar to clinical practice)

• Subjectivity is dealt with through sampling and procedural bias-reduction methods (not with standardization or objectification).

Conclusions 2: the way forward

• The programmatic approach to assessment optimizes:
  – The learning function (through information richness)
  – The pass/fail decision function (through the combination of rich information)

Further reading: www.ceesvandervleuten.com
