Fragility Index

Yes, a talk about a statistic…

14 June

FRAGILITY INDEX (the rise of the P value)

@kathyrowan101

de Winter and Dodou (2015) PeerJ

RCTs – best evidence

• RCTs are fundamentally hypothesis testing instruments (a hypothesis being an educated guess about how things work)

• RCTs are a type of scientific experiment which aims to reduce bias when testing new/existing treatments

• RCTs permit the use of probability theory to express the likelihood of a difference in outcome between treatments/groups

P value

• P value - probability of obtaining a result equal to or more extreme than observed when the null hypothesis is true

• Null hypothesis H0 – default position – assumed to be true until evidence indicates otherwise

• If the observed data are unlikely under the null hypothesis, then the null hypothesis is rejected

• If the observed data are consistent with the null hypothesis, then the null hypothesis is not rejected

An analogy might help…?

• Analogous to a criminal trial

• Defendant assumed to be innocent (null)

• Until proven guilty (null is rejected)

• Beyond reasonable doubt (agreed threshold)

P value

• Agreed threshold traditionally set at 5% (P<0.05)
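The definition and the decision rule above can be written compactly. This is only a notational sketch: the symbols T (the test statistic), t_obs (its observed value) and α (the agreed threshold) are labels introduced here, not taken from the slides, and "more extreme" is written as one direction for simplicity.

```latex
p = \Pr\left(T \ge t_{\mathrm{obs}} \mid H_0\right),
\qquad
\text{reject } H_0 \text{ if } p < \alpha,
\qquad
\alpha = 0.05 \text{ by convention.}
```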

Fisher’s exact test

• Used in the analysis of contingency tables

• Used when sample sizes are small (valid for all sample sizes)

• Calculates significance of deviation from the null hypothesis exactly

• It can’t be calculated in your head…!
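For a 2×2 table the "exact" calculation comes from the hypergeometric distribution. As a sketch, assume the four cell counts are labelled a, b (first row) and c, d (second row), with n = a + b + c + d; these labels are introduced here for illustration. With the row and column totals held fixed, the probability of one particular table is:

```latex
P(a, b, c, d)
  = \frac{\binom{a+b}{a}\,\binom{c+d}{c}}{\binom{n}{a+c}}
  = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}
```

One common convention for the two-sided P value is to add up this probability over every table with the same totals that is no more probable than the observed one.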

RCTs – size and events are important

              Bad outcome   Good outcome
Treatment A             1             99
Treatment B             9             91

P=0.02

              Bad outcome   Good outcome
Treatment A           200           1800
Treatment B           250           1750

P=0.02

              Bad outcome   Good outcome
Treatment A         1 → 2        99 → 98
Treatment B             9             91

P=0.06

              Bad outcome   Good outcome
Treatment A     200 → 201    1800 → 1799
Treatment B           250           1750

P=0.02
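The one-event shift shown above can be checked directly. The following is a minimal sketch, assuming the P values come from a two-sided Fisher's exact test that sums all tables no more probable than the observed one; the function name `fisher_exact_two_sided` is hypothetical and written here only for illustration, using the standard library alone.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k bad outcomes in row 1 (margins fixed).
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))


# Small trial: one patient in Treatment A moves from good to bad outcome.
small_before = fisher_exact_two_sided(1, 99, 9, 91)
small_after = fisher_exact_two_sided(2, 98, 9, 91)

# Large trial: the same one-event shift.
large_before = fisher_exact_two_sided(200, 1800, 250, 1750)
large_after = fisher_exact_two_sided(201, 1799, 250, 1750)

print(f"small trial: {small_before:.3f} -> {small_after:.3f}")
print(f"large trial: {large_before:.3f} -> {large_after:.3f}")
```

In the small trial the single event carries the result across the 0.05 threshold; in the large trial it does not.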

P value

• A shift of only a few events can change interpretation…

              Bad outcome   Good outcome
Treatment A             2             98
Treatment B             9             91

P=0.02 (chi-square)

P=0.06 (Fisher’s exact)
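The two tests can be run side by side on this table. This is a sketch under stated assumptions: the chi-square version below is the uncorrected Pearson statistic with one degree of freedom, and both function names are hypothetical helpers written for this example. Exact figures depend on details such as continuity correction, so they need not reproduce the rounded values on the slide; what the sketch shows is that the two tests fall on opposite sides of 0.05.

```python
from math import comb, erfc, sqrt


def chi_square_p(a, b, c, d):
    """Uncorrected Pearson chi-square P value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in row 1 (margins fixed).
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum the tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))


chi_p = chi_square_p(2, 98, 9, 91)
fisher_p = fisher_exact_two_sided(2, 98, 9, 91)
print(f"chi-square: {chi_p:.3f}   Fisher's exact: {fisher_p:.3f}")
```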

P value

• Simple recalculation with the correct statistical test can change interpretation…

Primary and secondary hypotheses

P value

• Significant subgroups(?!) – hypothesis generating

FRAGILITY INDEX

de Winter and Dodou (2015) PeerJ

• Calculates how many events (f) are required to change a significant result to a non-significant one

• Designed for dichotomous outcomes

Table

              Bad outcome   Good outcome
Treatment A             4             96
Treatment B            17             83

              Bad outcome   Good outcome
Treatment A             5             95
Treatment B            17             83

FRAGILITY INDEX EQUALS FOUR
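The step shown in the tables (4 → 5 bad outcomes in Treatment A) can be repeated until significance is lost. The following is a minimal sketch, assuming the usual procedure: convert one non-event to an event at a time in the group with fewer events, recalculating a two-sided Fisher's exact test until P reaches 0.05 or more. The function names and the `alpha` parameter are hypothetical, introduced here for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in row 1 (margins fixed).
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum the tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of event switches needed before P is no longer below alpha."""
    # Always modify the group with fewer events.
    if events_a > events_b:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a
    added = 0
    while events_a + added <= n_a:
        p = fisher_exact_two_sided(events_a + added, n_a - events_a - added,
                                   events_b, n_b - events_b)
        if p >= alpha:
            return added  # 0 means the result was never significant
        added += 1
    return added


# The example from the slides: 4/100 versus 17/100 bad outcomes.
fi = fragility_index(4, 100, 17, 100)
print("Fragility Index:", fi)
```

A return value of zero corresponds to a trial whose reported result is already non-significant when recalculated with Fisher's exact test.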

                 Walsh et al. 2014             Evaniew et al. 2015         Ridgeon et al. 2016
Trials           RCTs in NEJM, Lancet, JAMA,   RCTs on spine surgery       Multicentre (>1) RCTs
                 Annals, BMJ (RCT in MeSH)                                 in critically ill
Dates            2004-2010                     2009-2014                   No date restriction
Inclusion        P<0.05 result in abstract     P<0.05 result in abstract   P<0.05 result for mortality
RCTs             399                           40                          56
Patients         median 682 (IQR 15-112604)    median 132 (IQR 79-208)     median 127 (IQR 79-326)
Primary result   66%                           58%                         52%

                         Walsh et al. 2014     Evaniew et al. 2015   Ridgeon et al. 2016
Fragility Index          median 8 (IQR 3-18)   median 2 (IQR 1-3)    median 2 (IQR 1-3.5)
Range                    0-808                 0-39                  0-48
FI zero                  10%                   20%                   20%
FI ≤3                    25%                   75%                   75%
FI ≤ loss to follow-up   53%                   65%                   87.5%

Fragility Index

• Significant results of many RCTs hinge on very few events

• Results could potentially be overturned if missing/loss to follow-up data were known

• Reporting the Fragility Index may aid interpretation and help guard against over-interpretation

• You can calculate it yourself…!

http://fragilityindex.com/
