Dependency Hand Out

7/29/2019 Dependency Hand Out

1/24

Item Dependency in an Objective Structured Clinical Examination

Cherdsak Iramaneerat

Carol M. Myford

Rachel Yudkowsky

University of Illinois at Chicago


2/24

Objective Structured Clinical Examination

Objective structured clinical examination (OSCE)

An assessment approach used in medical education in which the clinical competence of residents is evaluated using multiple stations of standardized clinical tasks

Standardized patients (SP)

Lay persons trained to portray a scripted patient presentation in a standardized fashion


3/24

Conditional Item Independence

A basic assumption of the Rasch model

After accounting for the latent trait, item responses on a test are independent.

$$\mathrm{Cov}(x_j, x_k \mid \theta) = 0$$

(here $\theta$ denotes the latent trait)

Item dependence can lead to inaccurate estimation of item parameters, test statistics, and resident competency.

Item dependence can lead to overestimation of reliability and test information.


4/24

Items in OSCE

In each OSCE station, items are linked to the same clinical task and are rated by the same SP.

A resident's level of performance on one item may be dependent on his/her level of performance on other items in the same OSCE station.


5/24

Purposes

1. To check for the existence of item dependency in an OSCE

2. To outline an alternative approach for analyzing rating data using a many-facet Rasch measurement (MFRM) model to ameliorate the problem of item dependency

3. To compare reliability estimates, parameter estimates, and fit statistics obtained from MFRM analyses when local dependence is present/absent


6/24

Participants

79 residents from one Midwestern medical school

68 internal medicine residents

66% Male

34% Female

11 family medicine residents

45% Male

55% Female


7/24

Tasks (OSCE Stations)

A communication skills assessment

Six OSCE stations of simulated clinical scenarios

1. Patient education

2. Informed consent

3. Treatment refusal

4. Elderly abuse

5. Giving bad news

6. Physical examination


8/24

Rating Scale

A modification of the communication skills rating form of the American Board of Internal Medicine

18 items asking for agreement ratings

Five-point Likert Scale

1 (Strongly disagree) to 5 (Strongly agree)


9/24

Items

1. You greeted me warmly...
2. You were friendly...
3. You treated me like we were on the same level...
4. You let me tell my story without interruption...
5. You were truthful...
6. You never ignored what I had to say...
7. You discussed options with me...
8. You made sure that I understood the options...
9. You allowed me to make my own decision...
10. You encouraged me to ask questions...
11. You were patient...
12. You never avoided my questions...
13. You clearly explained the problem...
14. You clearly explained what should be expected...
15. You used plain language, not medical jargon...
16. You were careful in approaching sensitive issues...
17. You displayed a positive attitude...
18. I will choose this physician as my personal physician.


10/24

Analyses

$$\ln\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_{ik}$$

$P_{nijk}$ = Probability of resident n receiving a rating of k on item i in station j

$P_{nij(k-1)}$ = Probability of resident n receiving a rating of k-1 on item i in station j

$B_n$ = Level of communication competence of resident n

$D_i$ = Difficulty of item i

$C_j$ = Difficulty of OSCE station j

$F_{ik}$ = Difficulty of receiving a rating of k relative to k-1 for item i
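To make the model concrete, the following is a minimal sketch of how the adjacent-category log-odds, ln(P_nijk / P_nij(k-1)) = B_n - D_i - C_j - F_ik, translate into a probability for each of the five rating categories. The function name and all numeric values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def category_probabilities(b_n, d_i, c_j, thresholds):
    """Return [P(rating=1), ..., P(rating=K)] for one resident-item-station combination.

    b_n        : competence of resident n (logits)
    d_i        : difficulty of item i (logits)
    c_j        : difficulty of station j (logits)
    thresholds : F_ik values for k = 2..K (difficulty of category k relative to k-1)
    """
    # The log-odds of adjacent categories are B_n - D_i - C_j - F_ik, so the
    # log-numerator of category k is the running sum of those terms.
    log_num = [0.0]  # lowest category is the reference
    for f_ik in thresholds:
        log_num.append(log_num[-1] + (b_n - d_i - c_j - f_ik))
    # Subtract the maximum before exponentiating for numerical stability.
    max_log = max(log_num)
    weights = [math.exp(v - max_log) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: a five-point scale needs four thresholds.
probs = category_probabilities(b_n=1.0, d_i=0.2, c_j=-0.1,
                               thresholds=[-1.5, -0.5, 0.5, 1.5])
for rating, p in enumerate(probs, start=1):
    print(f"P(rating = {rating}) = {p:.3f}")
```

The probabilities sum to 1, and the log of the ratio of any two adjacent entries recovers B_n - D_i - C_j - F_ik for that step.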


11/24

Local Independence

Yen's Q3 statistic (Yen, 1984, 1993)

The correlation of the residuals for a pair of items after partialling out the latent trait estimate

Fisher's Z approach (Shen, 1996)

A modification of Yen's Q3 statistic

Adjusting residuals by the accuracy of the resident communication competence measure

Establishing a practical significance level


12/24

Fisher's Z Index

1. Calculate the standardized residuals for each rating of resident n on item i

$d_{ni} = (\text{observed rating} - \text{expected rating}) / SE_n$

2. Correlate the standardized residuals for all pairs of items i, j in each OSCE station

3. Compute Fisher's Z statistic

$$Z_{ij} = \frac{1}{2}\log\left(\frac{1 + r_{ij}}{1 - r_{ij}}\right)$$
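The three steps can be sketched in a few lines of Python. This is an illustration only: the observed ratings, expected ratings, and standard errors below are hypothetical, the function names are invented for this example, and the transform used is the standard Fisher Z, Z = (1/2) log((1 + r) / (1 - r)), applied to the correlation between the standardized residuals of two items.

```python
import math

def standardized_residuals(observed, expected, se):
    """Step 1: d_ni = (observed - expected) / SE_n, one value per resident."""
    return [(o - e) / s for o, e, s in zip(observed, expected, se)]

def pearson_r(x, y):
    """Step 2: Pearson correlation between two lists of residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Step 3: Fisher's Z transform of a correlation (undefined at r = +/-1)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

# Hypothetical data for five residents on two items in the same station.
se = [0.30, 0.25, 0.35, 0.28, 0.32]  # SE of each resident's measure
d_item_a = standardized_residuals([4, 3, 5, 2, 4], [3.6, 3.2, 4.1, 2.9, 3.8], se)
d_item_b = standardized_residuals([5, 3, 4, 2, 4], [3.9, 3.4, 4.0, 3.0, 3.7], se)

r_ab = pearson_r(d_item_a, d_item_b)
z_ab = fisher_z(r_ab)
print(f"r = {r_ab:.3f}, Fisher's Z = {z_ab:.3f}")
```

In a full analysis the expected ratings and standard errors would come from the fitted Rasch model, and the computation would be repeated for every item pair within each station.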


13/24

Alternative Approach

Treating each OSCE station as a scoring unit

Average the ratings from all items in one station and multiply by 10 to produce a station score.

Integers ranging from 10 (poor performance) to 50 (excellent performance)
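As a small illustration, the station score computation might look like the sketch below. The slides state only that the scores are integers from 10 to 50; rounding to the nearest integer with Python's built-in `round` is an assumption made here, and the function name and example ratings are hypothetical.

```python
def station_score(item_ratings):
    """Average the 1-5 item ratings in one station and multiply by 10.

    Rounding to the nearest integer is an assumption; the source does not
    say how non-integer results were converted to integer scores.
    """
    if not item_ratings:
        raise ValueError("a station needs at least one item rating")
    return round(10 * sum(item_ratings) / len(item_ratings))

# Hypothetical ratings on the 18 items for one resident in one station.
ratings = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 4, 4]
print(station_score(ratings))
```

All ratings of 1 give a score of 10 and all ratings of 5 give a score of 50, matching the stated range.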


14/24

Alternative Analysis

$$\ln\left[\frac{P_{njk}}{P_{nj(k-1)}}\right] = B_n - C_j - F_{jk}$$

$P_{njk}$ = Probability of resident n receiving a station score of k in station j

$P_{nj(k-1)}$ = Probability of resident n receiving a station score of k-1 in station j

$B_n$ = Level of communication competence of resident n

$C_j$ = Difficulty of OSCE station j

$F_{jk}$ = Difficulty of receiving a station score of k relative to k-1 for station j


15/24

Item Dependency

Scoring units           Mean Fisher's Z    SD      Percentage of significant Fisher's Z
Independent items       -0.03              0.05     3
Items within station     0.12              0.10    65
Station scores          -0.09              0.08    27


16/24

Resident Separation Reliability

Using items as scoring units

A resident separation reliability = 0.94

Using stations as scoring units

A resident separation reliability = 0.74
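For readers unfamiliar with the statistic, separation reliability is commonly described as the share of observed variance in the measures that is not attributable to measurement error. The sketch below follows that general definition; it is not necessarily the exact computation performed by the software used in this study, and the function name, the use of the sample variance, and the measures and standard errors shown are all assumptions made for illustration.

```python
def separation_reliability(measures, standard_errors):
    """Estimate (observed variance - mean error variance) / observed variance.

    measures        : resident measures in logits
    standard_errors : standard error of each measure in logits
    """
    n = len(measures)
    mean = sum(measures) / n
    # Sample variance of the measures (assumed; a population variance is also used in practice).
    observed_var = sum((m - mean) ** 2 for m in measures) / (n - 1)
    # Mean square measurement error across residents.
    error_var = sum(se ** 2 for se in standard_errors) / n
    # "True" variance cannot be negative, so floor it at zero.
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var

# Hypothetical measures and standard errors for six residents.
measures = [0.4, 0.9, 1.3, 1.6, 2.0, 2.5]
standard_errors = [0.20, 0.18, 0.19, 0.21, 0.22, 0.25]
print(f"separation reliability = {separation_reliability(measures, standard_errors):.2f}")
```

Under this definition, larger standard errors relative to the spread of the measures lower the reliability, which is one reason fewer scoring units per resident can reduce it.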


17/24

Resident Communication Competence Measures

Scoring units     Minimum (logits)   Maximum (logits)   Mean (logits)   SD (logits)
Items              0.20              2.65               1.39            0.52
Station scores    -0.37              0.68               0.13            0.23


18/24

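Resident Communication Competence Measures

[Figure: plot of resident communication competence measures. Horizontal axis: measures based on item scores analysis (0.0 to 3.0). Vertical axis: ticks from -0.4 to 0.8; the axis label is not recoverable.]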


19/24

Misfitting Residents

                Item scores                   Station scores
Criteria        Underfitting   Overfitting    Underfitting   Overfitting
Infit MnSq      15             0              27             13
Infit Zstd       8             8               2              4
Outfit MnSq     19             0              25             13
Outfit Zstd     14            10               5              1


20/24

Station Difficulty Measures

Scoring units     Minimum (logits)   Maximum (logits)   Mean (logits)   SD (logits)
Items             -1.46              0.66               0               0.86
Station scores    -0.22              0.14               0               0.14


21/24

Misfitting Stations

                Item scores                   Station scores
Criteria        Underfitting   Overfitting    Underfitting   Overfitting
Infit MnSq      2              0              0              0
Infit Zstd      3              3              0              0
Outfit MnSq     3              0              2              0
Outfit Zstd     3              3              0              0


22/24

Item Dependency in MFRM Analyses

A violation of a basic assumption of the model

Results

Overestimation of separation reliability estimates

Poorer fit of resident communication competency measures (according to standardized fit statistics)

Poorer fit of station difficulty measures (according to both standardized and unstandardized fit statistics)


23/24

Suggestions

When conducting an MFRM analysis of a data set that has items linked to the same task or raters

Check for violation of the local independence assumption

If item dependency is a problem: combine ratings from multiple items into a station score

Alleviates the item dependency problem

Causes loss of information and a decrease in resident separation reliability


24/24

Questions and Comments

Cherdsak Iramaneerat

Department of Educational Psychology

College of Education

University of Illinois at Chicago

[email protected]
