Simulation-Based Summative Assessments in Orthopaedic Surgery

8th Annual Meeting of the Consortium of ACS-Accredited Education Institutes

Special Panel: Advances in Summative Assessment of Surgeons Using Simulation

Robert A. Pedowitz, MD, PhD

Professor Emeritus of Orthopaedic Surgery

David Geffen School of Medicine at UCLA

Disclosure: Robert Pedowitz, MD, PhD

Consultant: Virtamed

Chair of the Fundamentals of Arthroscopic Surgery Training (FAST) Program:

• Arthroscopy Association of North America

• American Academy of Orthopaedic Surgeons

• American Board of Orthopaedic Surgery

Summative Assessments: Ortho

ASSET (Arthroscopic Surgical Skill Evaluation Tool): Diagnostic Knee Scope

Virtual Reality Simulation Training: Diagnostic Knee Arthroscopy

FAST (Fundamentals of Arthroscopic Surgery Training Program): Knot Tying

Copernicus Initiative: Proficiency-based Progression Training (Bankart Procedure)

Assessment Strategies

ASSET: Likert scales

Virtual Reality: VR to OR (Likert scales)

FAST: Biomechanical measures

Copernicus Initiative: Binary scores

ASSET (Arthroscopic Surgical Skill Evaluation Tool)

Am J Sports Med. 2013 June; 41(6): 1229–1237.

Koehler, Amsdell, Arendt, Bisson, Braman, Butler, Cosgarea, Harner, Garrett, Olson, Warme, Nicandri

ASSET

Task deconstruction: Diagnostic knee arthroscopy

Content validity: Delphi panel, 7 experts

Construct validity: Video recordings; 28 residents & 2 attending surgeons

Strong inter-rater reliability and test-retest reliability

Max score: 40

Koehler et al, AJSM 2013

Various Assessment Strategies





Improving Residency Training in Arthroscopic Knee Surgery with Use of a Virtual-Reality Simulator: A Randomized Blinded Study

J Bone Joint Surg Am. 2014;96:1798-806

Cannon, Garrett, Hunter, Sweeney, Eckhoff, Nicandri, Hutchinson, Johnson, Bisson, Bedi, Hill, Koh, Reinig

Cannon et al., 2014

48 PGY3 Ortho residents randomized:

• VR training to proficiency (mean 11 hours on the VR trainer)

• Standard educational routine

Video recorded diagnostic knee scope on a live patient

Each recording reviewed by a pair of blinded raters

Cannon et al., JBJS 2014

Global Ratings of Arthroscopy Performance, Mean ± SD (%)

Measure                 Simulator Trained   Non-Simulator Trained   P Value
                        (N = 27)            (N = 21)
Procedural checklist    63 ± 20             52 ± 21                 0.031
Visualization scale     61 ± 20             58 ± 21                 0.34
Probing scale           64 ± 24             48 ± 24                 0.016
Global ratings          64 ± 20             57 ± 19                 0.061

Cannon et al., JBJS 2014






Enhancement of arthroscopic knot-tying performance with the FAST workstation and knot-tester

Pedowitz, Nicandri, Angelo, Ryu, Gallagher

Arthroscopic Knot Tying

A challenging surgical skill, best learned with a deliberate sequence:

• Rope

• Suture

• Cannulas

• Video control

• Patient care (fluids)

Assessment: Observation

Arthroscopic knot performance (as opposed to appearance)

Good suture repair constructs demonstrate knot security and loop security

Construct failure defined as > 3 mm of loop expansion

FAST Knot Tester

Study 1

5 sequential knots with #2 Fiberwire

Knot of choice: All with 3 reversed hitches & posts

Faculty: 20 surgeons at OLC course (19.9 ± 8.25 years in practice; 381 ± 150 scopes per year)

Orthopaedic Residents (All PGY4 / PGY5)

Copernicus Study (Bankart Procedure)

Group A: Standard resident’s course

Group B: Structured curriculum

Group C: Proficiency based progression

Knot Performance         Faculty     Group A      Group B      Group C
                         (n = 20)    Residents    Residents    Residents
                                     (n = 14)     (n = 14)     (n = 16)
0/5 Failed               12          3            3            11
1/5 Failed               0           3            5            2
2/5 Failed               4           7            2            2
3/5 Failed               2           0            1            1
4/5 Failed               0           1            1            0
5/5 Failed               2           0            2            0
Knots that failed        24 / 100    21 / 70      26 / 70      9 / 80
(> 3 mm displacement)    = 24%       = 30%        = 37%        = 11%
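The "knots that failed" row follows directly from the distribution rows: each participant tied 5 knots, so the failed-knot count is the sum of (failures per person × number of people). A minimal Python sketch of that arithmetic, using the counts tabulated on this slide (the variable and function names are illustrative, not from the study):

```python
# Number of participants with k failed knots (k = 0..5), copied from the slide.
FAILURE_DISTRIBUTION = {
    "Faculty": [12, 0, 4, 2, 0, 2],
    "Group A": [3, 3, 7, 0, 1, 0],
    "Group B": [3, 5, 2, 1, 1, 2],
    "Group C": [11, 2, 2, 1, 0, 0],
}
KNOTS_PER_PERSON = 5  # each participant tied 5 sequential knots


def knot_failure_summary(counts):
    """Return (failed knots, total knots, failure rate) for one group."""
    participants = sum(counts)
    failed = sum(k * n for k, n in enumerate(counts))
    total = participants * KNOTS_PER_PERSON
    return failed, total, failed / total


summary = {g: knot_failure_summary(c) for g, c in FAILURE_DISTRIBUTION.items()}
for group, (failed, total, rate) in summary.items():
    print(f"{group}: {failed} / {total} = {rate:.1%}")
```

Running the sketch is a quick way to cross-check the slide's percentages against its own counts.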

Logistic Regression Analysis

Binary outcome (knots passed): Pass (≤ 3 mm) or Fail (> 3 mm)

Group                               Knots failed   Odds ratio         p
Faculty (n = 20; tied 100 knots)    24%            Reference group
Group A Residents                   30%            0.74               0.38
Group B Residents                   37%            0.53               0.07
Group C Residents                   11%            2.84               0.01
(Proficiency-Based Progression)
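As a rough illustration of what these odds ratios express, the sketch below computes a crude (unadjusted) odds ratio of passing a knot for each resident group against the faculty reference, using the pass/fail knot counts from the Study 1 table. This is an assumption-laden simplification, not the analysis reported on the slide: it ignores that each surgeon tied five knots. The crude values for Groups A and B (about 0.74 and 0.53) line up with the reported estimates; the crude value for Group C (about 2.49) does not match the reported 2.84, for reasons the slides do not state.

```python
def odds_ratio_of_passing(passed, failed, ref_passed, ref_failed):
    """Crude odds ratio of passing a knot, relative to a reference group."""
    return (passed / failed) / (ref_passed / ref_failed)


# (knots passed, knots failed), derived from the Study 1 table.
FACULTY = (76, 24)  # reference group
RESIDENT_GROUPS = {
    "Group A": (49, 21),
    "Group B": (44, 26),
    "Group C": (71, 9),
}

unadjusted = {
    name: odds_ratio_of_passing(p, f, *FACULTY)
    for name, (p, f) in RESIDENT_GROUPS.items()
}
for name, value in unadjusted.items():
    print(f"{name}: unadjusted odds ratio = {value:.2f}")
```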

Study 2

Arthroscopic knot of choice with 3 reversed half-hitches & posts

Faculty and surgeon-attendees at the 2013 AANA Fall Course (n = 30)

                         Faculty       Attendee       Not Defined   Total
                         (n = 10)      (n = 12)       (n = 8)       (n = 30)
Years in Practice:       20.5 ± 7.6    14.6 ± 12.4    19.0 ± 9.3    17.1 ± 19.3
Mean ± SD (range)        (3 – 30)      (1 – 40)       (9 – 32)      (1 – 40)

Knot Performance
0/5 Failed               5             3              4             12
1/5 Failed               3             3              3             9
2/5 Failed               1             3              0             4
3/5 Failed               1             3              1             5
4/5 Failed               0             0              0             0
5/5 Failed               0             0              0             0
Knots that failed        8 / 50        18 / 60        6 / 40        32 / 150
(> 3 mm displacement)    = 16%         = 30%          = 15%         = 21.3%

Combined Data: Experienced Surgeons

                      Study 1                    Study 2
                      Faculty at AANA            Faculty and Attendees at
                      Copernicus Course          AANA Fall Course
                      (n = 20)                   (n = 30)
0/5 Failed            12                         12
1/5 Failed            0                          9
2/5 Failed            4                          4
3/5 Failed            2                          5
4/5 Failed            0                          0
5/5 Failed            2                          0
Knots that failed     24 / 100 = 24.0%           32 / 150 = 21.3%

Overall failure rate: 56/250 knots = 22.4%

Setting a performance benchmark

Knot performance of the combined experienced surgeons (Study 1, n = 20; Study 2, n = 30), by pass threshold:

If perfection (0/5 failures): 52% would fail the test

If threshold ≤ 1/5 failures: 34% would fail the test

If threshold ≤ 2/5 failures: 18% would fail the test
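The three "would fail the test" percentages can be reproduced from the combined distribution of the 50 experienced surgeons, reading each threshold as the maximum number of failed knots (out of 5) still allowed. A minimal Python sketch, with illustrative names that are not from the study:

```python
# Surgeons with k failed knots (k = 0..5): Study 1 faculty plus Study 2 participants.
STUDY_1 = [12, 0, 4, 2, 0, 2]
STUDY_2 = [12, 9, 4, 5, 0, 0]
COMBINED = [a + b for a, b in zip(STUDY_1, STUDY_2)]


def percent_failing(distribution, max_failures_allowed):
    """Percentage of surgeons with more failed knots than the threshold allows."""
    total = sum(distribution)
    failing = sum(distribution[max_failures_allowed + 1:])
    return 100 * failing / total


for allowed in (0, 1, 2):
    pct = percent_failing(COMBINED, allowed)
    print(f"At most {allowed}/5 failures allowed: {pct:.0f}% would fail")
```

The results (52%, 34%, 18%) match the slide figures only under the "at most" reading of each threshold.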

Based upon AANA expert faculty and experienced arthroscopic surgeons:

“Acceptable” Performance: ≤ 2/5 failures using FAST protocol
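Put together, the failure definition and the benchmark amount to a simple scoring rule for a five-knot test. The sketch below assumes the benchmark means at most 2 failed knots out of 5 (the reading consistent with the 18% figure) and that a knot fails when loop expansion exceeds 3 mm. The function names and the sample readings are hypothetical.

```python
FAILURE_THRESHOLD_MM = 3.0  # construct failure: > 3 mm of loop expansion
MAX_FAILED_KNOTS = 2        # assumed benchmark: at most 2 of 5 knots may fail
KNOTS_PER_TEST = 5


def count_failed_knots(loop_expansions_mm):
    """Count knots whose loop expansion exceeds the failure threshold."""
    return sum(1 for mm in loop_expansions_mm if mm > FAILURE_THRESHOLD_MM)


def is_acceptable(loop_expansions_mm):
    """True if a five-knot test meets the assumed benchmark."""
    if len(loop_expansions_mm) != KNOTS_PER_TEST:
        raise ValueError("expected exactly 5 knot measurements")
    return count_failed_knots(loop_expansions_mm) <= MAX_FAILED_KNOTS


# Hypothetical knot-tester readings in millimetres, for illustration only.
print(is_acceptable([1.0, 2.5, 3.0, 4.2, 0.8]))  # one knot above 3 mm
print(is_acceptable([3.5, 4.0, 5.0, 1.0, 1.0]))  # three knots above 3 mm
```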






A new educational approach: From the apprenticeship model to proficiency-based progression

Development Process

Task deconstruction: Bankart procedure

Step & error definition (face/content validity)

Metrics: Stress testing & metric refinement

Inter-rater reliability

Delphi panel

Construct validity (joint models and cadavers)

Proficiency benchmarks

Randomized controlled trial

Bankart Procedure (45 Surgical Steps)

Arthroscopy Portals

Arthroscopic Instability Assessment

Capsulolabral Mobilization / Glenoid Preparation

1st Anchor Preparation / Insertion

1st Suture Delivery / Management

1st Knot Tying

2nd Anchor Preparation / Insertion

2nd Suture Delivery / Management

2nd Knot Tying

3rd Anchor Preparation / Insertion

3rd Suture Delivery / Management

3rd Knot Tying

Arthroscopic Instability Assessment

Metric Development / Validation - Bankart Procedure

Task Deconstruction

Operational Definitions: Steps & potential intra-operative errors

Construct validation of metrics on cadaver

Stress test & refine steps & intra-operative errors

Delphi Panel: Consensus on steps (45) and intra-op errors (77)

Construct validation of shoulder model

Didactic Bankart

Orientation Video

Simulation Model

Benchmark

Didactic Proficiency

Benchmark Bankart Outcome

Benchmark

Binary Assessments (Y/N)

Procedure-Specific Shoulder Model

Construct Validity: Shoulder Model

Prospective, randomized, blinded comparisons for intra-operative performance of a Bankart procedure

Baseline assessment: visuospatial, perceptual, and psychomotor performance

Random allocation:

Group A (Control): Lectures / Knot Tying Practice

Group B (Simulation): Lectures / Knot Tying Practice; Train on Simulator

Group C (PBP): Online Orientation Video & Didactic Material: Test to Proficiency; Lectures / Knot Tying: Test to Proficiency; Train on Simulator: Test to Proficiency

Practice Arthroscopic Bankart Skills on Cadaver Shoulders

Perform Assisted / Unaided Bankart Repair on Fresh Cadaver (video)

Independent reviews of videos done in pairs blinded to group; IRR = 0.93

Multicenter, prospective, randomized, blinded trial of proficiency-based progression simulation training for an arthroscopic Bankart procedure: PGY 4 and PGY 5 residents

Residency training program                            No. of residents
1. East Amherst                                       1
2. Loyola University Medical Center                   1
3. New York Medical College                           1
4. Northwestern University                            3
5. NS-LIJ                                             3
6. NS-LIJ - Plainview Hosp                            4
7. Ohio State University                              1
8. Philadelphia Coll of Osteopathic Medicine          1
9. Rush University Medical Center                     2
10. St Louis                                          1
11. Tufts                                             4
12. Tulane University                                 2
13. UCSF                                              3
14. University of Illinois at Chicago                 1
15. University of Louisville                          1
16. University of Miami School of Medicine            1
17. University of Michigan                            1
18. University of Mississippi                         2
19. University of Tennessee                           3
20. University of Utah Medical Center                 6
21. Yale University Medical Center                    2
Total                                                 44

Motor Skill Assessments

Baseline psychomotor parameters

Knot tying

Shoulder model

Cadaver training

Full Procedure on a Cadaver Shoulder: Summative Assessment

N = 44, random assignment:

Group A (N = 14): On-line material; traditional weekend AANA Resident Course

Group B (N = 14): On-line material; suturing and knot tying course; Bankart Shoulder Model

Group C (N = 16): On-line material to proficiency; suturing and knot tying course to proficiency; Bankart Shoulder Model to proficiency

Assisted but Unaided Video-recorded Cadaver Bankart Repair

Independently Assessed by two Experienced Faculty

Failed to

demonstrate

proficiency

n = 0

n = 1

Ʃ = 2 N = 12 N = 14 N = 12 Ʃ = 4

Trial design

Weekend course

Did not

complete index

procedure

n = 4*

N =

6

N = 2

[Figure: Bankart procedure steps completed. Mean no. of steps completed and 95% confidence intervals, by group (A, B, C).]

Steps completed: p < 0.001 for both comparisons shown

A = Traditional training

B = New curriculum but no PBP requirement

C = New curriculum with a PBP requirement

ALL Group C trainees included

[Figure: Objectively assessed intra-operative errors. Mean no. of errors and 95% confidence intervals, by group (A, B, C).]

Total errors: p < 0.01 and p = 0.039 for the comparisons shown; differences annotated as ~56% and ~41%

B = New curriculum but no PBP requirement

C = New curriculum with a PBP requirement

ALL Group C trainees included

[Figure: Sentinel errors. Mean no. of sentinel errors and 95% confidence intervals, by group (A, B, C).]

Sentinel errors: p < 0.023 and p < 0.351 for the comparisons shown; differences annotated as ~69% and ~44%

B = New curriculum but no PBP requirement

C = New curriculum with a PBP requirement

ALL trainees included in Group C

Proficiency Benchmark

• Benchmark based on mean faculty performance with outliers excluded (> 2 SD)

• Pass score defined as:

  • Completion of a 3 anchor repair

  • No more than 3 procedural errors

  • No more than 1 sentinel error
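The pass score is a conjunction of three binary checks, which makes it easy to express as a predicate. The sketch below is one possible encoding under stated assumptions: the record type and its field names are hypothetical, and it treats procedural and sentinel errors as separately counted, which the slide does not spell out.

```python
from dataclasses import dataclass


@dataclass
class BankartAssessment:
    """Hypothetical summary of one scored cadaver Bankart repair."""
    anchors_completed: int
    procedural_errors: int
    sentinel_errors: int


def achieved_benchmark(result: BankartAssessment) -> bool:
    """Apply the three pass-score criteria listed on the slide."""
    return (
        result.anchors_completed >= 3       # completion of a 3 anchor repair
        and result.procedural_errors <= 3   # no more than 3 procedural errors
        and result.sentinel_errors <= 1     # no more than 1 sentinel error
    )


# Illustrative records only, not study data.
print(achieved_benchmark(BankartAssessment(3, 2, 1)))
print(achieved_benchmark(BankartAssessment(3, 4, 0)))
```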

Outcome: Achieved Benchmark?

                                        No          Yes         Total
Group A                                 10 (71%)    4 (29%)     14 (100%)
Group B                                 9 (64%)     5 (36%)     14 (100%)
Group C (Met all proficiency steps)     3 (25%)     9 (75%)     12 (100%)
Group C- (Failed a proficiency step)    2 (50%)     2 (50%)     4 (100%)
Group C (Overall)                       5 (31%)     11 (69%)    16 (100%)

Logistic Regression Analysis: Proficiency Benchmark Demonstrated (Y/N)

Group                           Achieved benchmark   Odds ratio        p
Group A (Traditional Course)    29%                  Reference Group
Group B                         36%                  1.4               0.121    1.4 times more likely
Group C (All C subjects)        69%                  5.5               0.033    >5 times more likely
Group C (PBP only)              75%                  7.5               0.036    >7 times more likely
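The reported odds ratios can be recovered from the 2×2 counts in the "Achieved Benchmark?" table with the cross-product formula, taking Group A as the reference. A minimal Python sketch (the names are illustrative; it reproduces only the point estimates, not the p-values from the logistic regression):

```python
def cross_product_odds_ratio(yes, no, ref_yes, ref_no):
    """Odds ratio of achieving the benchmark versus the reference group."""
    return (yes * ref_no) / (no * ref_yes)


# (achieved benchmark, did not achieve), from the outcome table.
GROUP_A = (4, 10)  # reference group (traditional course)
COMPARISONS = {
    "Group B": (5, 9),
    "Group C (all subjects)": (11, 5),
    "Group C (PBP only)": (9, 3),
}

benchmark_odds_ratios = {
    name: cross_product_odds_ratio(yes, no, *GROUP_A)
    for name, (yes, no) in COMPARISONS.items()
}
for name, value in benchmark_odds_ratios.items():
    print(f"{name}: odds ratio = {value:.1f}")
```

Here the crude ratios (about 1.4, 5.5, and 7.5) agree with the values on the slide.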

Summative Assessments: Challenges

ASSET: Video analysis, experts, Likert scales (Is tool reliable if scored real time?)

VR Knee Scope: Video, experts, Likert scales (Will it transfer to VR computer scoring?)

FAST Knot Tying: Biomechanical test, direct assessment (Benchmarks correct?)

Copernicus Study: Binary scores, steps & errors (Is development process scalable?)
