
Observational Studies Based on Rosenbaum (2002) David Madigan Rosenbaum, P.R. (2002). Observational Studies (2nd edition). Springer


Observational Studies

Based on Rosenbaum (2002)

David Madigan

Rosenbaum, P.R. (2002). Observational Studies (2nd edition). Springer


Introduction

• An empirical study in which:

“The objective is to elucidate cause-and-effect relationships in which it is not feasible to use controlled experimentation”

• Examples:

• smoking and heart disease

• vitamin C and cancer survival

• DES and vaginal cancer

• aspirin and mortality

• cocaine and birthweight

• diet and mortality


Asthma Study

• Have data on 2,000 kids

• What is the effect of tobacco experimentation on asthma?

Variable                   Levels
Sex                        Male, Female
Ethnicity                  African American, Asian, Hispanic, Other, White
Smoking at Home            Yes, No
Tobacco Experimentation    Yes, No
Asthma Self-Diagnosis      Yes, No
Asthma ISAAC               Yes, No


Cameron and Pauling Vitamin C

•Gave Vitamin C to 100 terminally ill cancer patients

•For each patient found 10 controls matched for age, gender, cancer site, and tumor type

•Vitamin C patients survived four times longer than controls

•Later randomized study found no effect of vitamin C

•Turns out the control group was formed from patients already dead…

LESSONS:

- observational studies are tricky

- randomized study is the gold standard

why?


Why does randomization work?


•The two groups are comparable at baseline

•Could do a better job manually matching patients on 18 characteristics listed, but no guarantees for other characteristics

•Randomization did a good job without being told what the 18 characteristics were

•Chance assignment could create some imbalances but the statistical methods account for this properly
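
The balancing property described in the bullets above can be illustrated with a small simulation. This is only a sketch: the covariate names, their distributions, and the sample size are hypothetical and are not taken from the trial on the slide. The point is that the assignment step never looks at any covariate, yet the treated-minus-control mean differences come out small, including for a covariate nobody recorded.

```python
import random
import statistics

def randomize(n_units, n_treated, rng):
    """Return a 0/1 assignment with exactly n_treated ones, chosen uniformly at random."""
    z = [1] * n_treated + [0] * (n_units - n_treated)
    rng.shuffle(z)
    return z

def mean_difference(values, z):
    """Treated-minus-control difference in means for one covariate."""
    treated = [v for v, zi in zip(values, z) if zi == 1]
    control = [v for v, zi in zip(values, z) if zi == 0]
    return statistics.mean(treated) - statistics.mean(control)

rng = random.Random(0)
n_units = 2000

# Hypothetical baseline covariates; "unmeasured" stands for a characteristic
# that was never recorded. The assignment step below does not use any of them.
covariates = {
    "age": [rng.gauss(60, 10) for _ in range(n_units)],
    "blood_pressure": [rng.gauss(130, 15) for _ in range(n_units)],
    "unmeasured": [rng.gauss(0, 1) for _ in range(n_units)],
}

z = randomize(n_units, n_units // 2, rng)
balance = {name: mean_difference(vals, z) for name, vals in covariates.items()}
for name, diff in balance.items():
    print(f"{name}: treated minus control mean = {diff:.3f}")
```

Any single run can still show small chance imbalances, which is what the randomization-based test accounts for.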


The Hypothesis of No Treatment Effect

• In a randomized experiment, can test this hypothesis essentially without making any assumptions at all

• “no effect” formally means for each patient the outcome would have been the same regardless of treatment assignment

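• Test statistic, e.g., proportion(D|TT) - proportion(D|PCI)

[Figure: the six possible assignments of four patients, two to TT and two to PCI, with each patient's outcome (D or L) held fixed under the null hypothesis]

Assignment   Patient 1 (D)   Patient 2 (D)   Patient 3 (L)   Patient 4 (L)
1            TT              TT              PCI             PCI
2            TT              PCI             TT              PCI
3            TT              PCI             PCI             TT
4            PCI             TT              TT              PCI
5            PCI             TT              PCI             TT
6            PCI             PCI             TT              TT

Each assignment has P = 1/6; one of the six is the observed assignment.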
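
The randomization test on this slide can be sketched by brute-force enumeration. The code below assumes four patients with fixed outcomes D, D, L, L and two of them assigned to TT, so there are six equally likely assignments. Which assignment was actually observed is not recoverable here, so the choice of `observed` is an illustrative assumption, as is the one-sided direction of the test.

```python
from fractions import Fraction
from itertools import combinations

# Outcomes are fixed under the null hypothesis: 1 = D, 0 = L.
outcomes = [1, 1, 0, 0]

def statistic(tt_indices):
    """proportion(D | TT) - proportion(D | PCI) for a given set of TT patients."""
    tt = [outcomes[i] for i in tt_indices]
    pci = [outcomes[i] for i in range(len(outcomes)) if i not in tt_indices]
    return Fraction(sum(tt), len(tt)) - Fraction(sum(pci), len(pci))

# All ways to pick 2 of the 4 patients for TT; each has probability 1/6.
assignments = list(combinations(range(4), 2))
null_dist = [statistic(a) for a in assignments]

# Assumption for illustration: the two L patients received TT.
observed = statistic((2, 3))

# One-sided P-value: chance of a statistic at least as favourable to TT.
p_value = Fraction(sum(1 for t in null_dist if t <= observed), len(null_dist))
print(p_value)  # prints 1/6
```

No model for the outcomes is needed: the only probabilities used are those created by the random assignment itself.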


Estimates, etc.

• Note: the probability distribution needed for the test is known, not assumed or modeled

• Randomized experiment provides unbiased estimator of the average treatment effect

• Internal versus external validity

• Confidence intervals by inverting tests

• Partially ordered outcomes, censoring, multivariate outcomes, etc.


Overt Bias in Observational Studies

“An observational study is biased if treatment and control groups differ prior to treatment in ways that matter for the outcome under study”

Overt bias: a bias that can be seen in the data

Hidden bias: involves factors not in the data

Can adjust for overt bias…


Overt Bias

$M$ units, $j = 1, \ldots, M$

$x_j$: covariate vector

$Z_j$: treatment (assume binary 0 or 1)

$\pi_j = \Pr(Z_j = 1)$ (unknown)

$$\Pr(Z_1 = z_1, \ldots, Z_M = z_M) = \prod_{j=1}^{M} \pi_j^{z_j} (1 - \pi_j)^{1 - z_j}$$

An observational study (OS) is free of hidden bias if the $\pi_j$'s are known to depend only on the $x_j$'s (i.e., $\pi_j = \lambda(x_j)$, with $\lambda$ unknown)

(so two units with same x have same prob of getting the treatment)


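Stratifying on x

• Suppose we can group units into $S$ strata with identical x's, with $n_s$ units in stratum $s$, of which $m_s = \sum_i Z_{si}$ are treated, and common treatment probability $\lambda_s$. Then:

$$\Pr(Z = z) = \prod_{s=1}^{S} \lambda_s^{m_s} (1 - \lambda_s)^{n_s - m_s}$$

• Conditional on the $m_s$'s, all $z$'s are equally likely…just like in a uniform randomized experiment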
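
The claim that conditioning on the number treated in each stratum removes the unknown treatment probabilities can be checked by enumeration. In the sketch below the stratum sizes, treated counts, and the per-stratum treatment probabilities (`lambdas`) are all hypothetical values chosen for illustration. The probability of each assignment is built unit by unit, then normalised over all assignments with the same treated counts.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical strata: n_s units in stratum s, m_s of them treated.
strata_sizes = [4, 3]
treated_counts = [2, 1]
# Hypothetical treatment probabilities per stratum (unknown in practice).
lambdas = [Fraction(1, 4), Fraction(2, 3)]

def model_probability(assignment):
    """Pr(Z = z) when each unit in stratum s is treated independently with prob lambda_s."""
    prob = Fraction(1)
    for treated, n_s, lam in zip(assignment, strata_sizes, lambdas):
        for unit in range(n_s):
            prob *= lam if unit in treated else 1 - lam
    return prob

# All assignments with exactly m_s treated units in each stratum.
per_stratum = [list(combinations(range(n), m))
               for n, m in zip(strata_sizes, treated_counts)]
assignments = list(product(*per_stratum))

total = sum(model_probability(a) for a in assignments)
conditional = [model_probability(a) / total for a in assignments]
print(len(assignments), set(conditional))
```

Changing the values in `lambdas` leaves the conditional probabilities unchanged, which is why the stratified analysis can proceed as in a randomized experiment.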


Stratifying on the Propensity Score

• Obviously exact matching not always possible

• Idea: form strata comprising units with the same $\lambda$'s (i.e. could have $x_{si} \neq x_{sj}$ but $\lambda(x_{si}) = \lambda(x_{sj})$)

• Problem: don’t know the $\lambda$'s

• Solution: estimate them (logistic regression, SVM, decision tree, etc.)

• Form strata containing units with “similar” probability of treatment
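
The steps on this slide (estimate the treatment probabilities, then stratify on them) can be sketched end to end on simulated data. Everything here is an assumption made for illustration: a single covariate, a logistic treatment model, a true effect of 1.0, a hand-written gradient-ascent fit in place of a library routine, and five equal-sized strata.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def mean(vals):
    return sum(vals) / len(vals)

def fit_logistic(x, z, steps=200, lr=1.0):
    """Fit Pr(Z=1 | x) = sigmoid(a + b*x) by gradient ascent on the mean log-likelihood."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, zi in zip(x, z):
            resid = zi - sigmoid(a + b * xi)
            grad_a += resid
            grad_b += resid * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def stratified_effect(scores, z, y, n_strata=5):
    """Size-weighted average of within-stratum treated-minus-control mean differences."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(order) // n_strata
    total, weight = 0.0, 0
    for s in range(n_strata):
        end = (s + 1) * size if s < n_strata - 1 else len(order)
        idx = order[s * size:end]
        treated = [y[i] for i in idx if z[i] == 1]
        control = [y[i] for i in idx if z[i] == 0]
        if treated and control:  # skip strata lacking one of the groups
            total += len(idx) * (mean(treated) - mean(control))
            weight += len(idx)
    return total / weight

# Simulated data: x drives both treatment and outcome, so it confounds.
rng = random.Random(1)
n, true_effect = 2000, 1.0
x = [rng.gauss(0, 1) for _ in range(n)]
z = [1 if rng.random() < sigmoid(xi) else 0 for xi in x]
y = [2 * xi + true_effect * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

a, b = fit_logistic(x, z)
scores = [sigmoid(a + b * xi) for xi in x]  # estimated propensity scores

naive = mean([yi for yi, zi in zip(y, z) if zi == 1]) - \
        mean([yi for yi, zi in zip(y, z) if zi == 0])
adjusted = stratified_effect(scores, z, y)
print(f"naive = {naive:.2f}, stratified = {adjusted:.2f}, truth = {true_effect}")
```

Stratification only removes the overt part of the bias: some imbalance remains inside each stratum, and nothing here addresses covariates that are not in the data.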


Matched Analysis

Using a model with 29 covariates to predict VHA use, we were able to obtain an accuracy of 88 percent (receiver-operating-characteristic curve, 0.88) and to match 2265 (91.1 percent) of the VHA patients to Medicare patients. Before matching, 16 of the 29 covariates had a standardized difference larger than 10 percent, whereas after matching, all standardized differences were less than 5 percent.
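
The balance check quoted above relies on standardized differences. The slide does not say how they were computed, so the sketch below assumes one common definition: the difference in group means divided by the square root of the average of the two group variances, expressed in percent. The function name and the sample values are hypothetical.

```python
import math
import statistics

def standardized_difference(treated, control):
    """Difference in means as a percentage of the pooled standard deviation.

    Assumed definition: 100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2).
    """
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return 100 * (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

# Hypothetical covariate values for two small groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
d = standardized_difference(group_a, group_b)
print(f"standardized difference = {d:.1f} percent")
```

Because the measure is scaled by the spread of the covariate and not by the sample size, it can be compared before and after matching even though matching changes the number of units.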


Conclusions

VHA patients had more coexisting conditions than Medicare patients. Nevertheless, we found no significant difference in mortality between VHA and Medicare patients, a result that suggests a similar quality of care for acute myocardial infarction.


What about hidden bias?

• Sensitivity analysis!

• Consider two units $j$ and $k$ with the same x. With hidden bias they may not have the same $\pi$

• Consider this inequality:

$$\frac{1}{\Gamma} \le \frac{\pi_j (1 - \pi_k)}{\pi_k (1 - \pi_j)} \le \Gamma$$

• Sensitivity analysis will consider various $\Gamma$'s


An equivalent latent variable model

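$$\log\left(\frac{\pi_j}{1 - \pi_j}\right) = \kappa(x_j) + \gamma u_j, \qquad 0 \le u_j \le 1$$

for two units $j$ and $k$ with the same x:

$$\frac{\pi_j (1 - \pi_k)}{\pi_k (1 - \pi_j)} = \exp\{\gamma (u_j - u_k)\}$$

where $u_j - u_k$ is between $-1$ and $1$, so

$$\exp(-\gamma) \le \frac{\pi_j (1 - \pi_k)}{\pi_k (1 - \pi_j)} \le \exp(\gamma)$$

so the model implies the previous inequality with $\Gamma = \exp(\gamma)$ (implication goes the other way too)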
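
A quick numerical check of this implication: if the log odds of treatment equal a common term plus gamma times an unobserved u in [0, 1], then the odds ratio for any two units sharing that common term stays between exp(-gamma) and exp(gamma). The value of gamma, the range of the common term `kappa`, and the random draws below are arbitrary choices for illustration.

```python
import math
import random

def treatment_prob(kappa, gamma, u):
    """pi such that log(pi / (1 - pi)) = kappa + gamma * u."""
    return 1.0 / (1.0 + math.exp(-(kappa + gamma * u)))

gamma = math.log(2)       # arbitrary choice, corresponds to Gamma = 2
Gamma = math.exp(gamma)

rng = random.Random(0)
ratios = []
for _ in range(1000):
    kappa = rng.uniform(-3, 3)          # shared by both units (same x)
    u_j, u_k = rng.random(), rng.random()
    pi_j = treatment_prob(kappa, gamma, u_j)
    pi_k = treatment_prob(kappa, gamma, u_k)
    ratios.append(pi_j * (1 - pi_k) / (pi_k * (1 - pi_j)))

print(f"min ratio = {min(ratios):.3f}, max ratio = {max(ratios):.3f}, Gamma = {Gamma:.3f}")
```

The extremes are approached only when one unit has u near 1 and the other has u near 0, which is the worst case the sensitivity analysis guards against.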


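Matched Pairs

• Strata of size 2, one gets the treatment, one doesn’t

$$\Pr(Z = z) = \prod_{s=1}^{S} \left[\frac{\exp(\gamma u_{s1})}{\exp(\gamma u_{s1}) + \exp(\gamma u_{s2})}\right]^{z_{s1}} \left[\frac{\exp(\gamma u_{s2})}{\exp(\gamma u_{s1}) + \exp(\gamma u_{s2})}\right]^{z_{s2}}$$

• If $\gamma = 0$, every unit has the same chance of treatment

• Standard test statistic for matched pairs is:

$$T = t(Z, r) = \sum_{s=1}^{S} d_s \sum_{i=1}^{2} c_{si} Z_{si}$$

where $d_s$ is the rank of $|r_{s1} - r_{s2}|$, $c_{s1} = 1$ if $r_{s1} > r_{s2}$ and 0 otherwise, and $c_{s2} = 1$ if $r_{s2} > r_{s1}$ and 0 otherwise

This is the sum of the ranks for pairs in which treated unit > control unit (the Wilcoxon signed rank test)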
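
The matched-pairs statistic can be computed directly from its definition: rank the absolute within-pair differences, then add up the ranks of the pairs in which the treated unit has the larger response. The sketch below uses made-up responses, represents each pair as a hypothetical tuple `(r1, r2, z1)` where `z1` is 1 if the first unit is the treated one, and assumes no ties and no zero differences.

```python
def signed_rank_statistic(pairs):
    """Sum of ranks of |r1 - r2| over pairs where the treated unit has the larger response.

    Each pair is (r1, r2, z1): the two responses and a flag that is 1 if unit 1
    is treated (so unit 2 is treated when z1 == 0). Assumes no ties.
    """
    abs_diffs = [abs(r1 - r2) for r1, r2, _ in pairs]
    # d[s] = rank of the s-th absolute difference (1 = smallest).
    order = sorted(range(len(pairs)), key=lambda s: abs_diffs[s])
    d = [0] * len(pairs)
    for rank, s in enumerate(order, start=1):
        d[s] = rank

    total = 0
    for s, (r1, r2, z1) in enumerate(pairs):
        z2 = 1 - z1
        c1 = 1 if r1 > r2 else 0
        c2 = 1 if r2 > r1 else 0
        total += d[s] * (c1 * z1 + c2 * z2)
    return total

# Hypothetical matched pairs.
pairs = [(51, 40, 1), (35, 30, 0), (62, 42, 1), (41, 44, 0), (70, 55, 1)]
T = signed_rank_statistic(pairs)
print(T)  # prints 13
```

In this example the absolute differences 11, 5, 20, 3, 15 receive ranks 3, 2, 5, 1, 4, and only the second pair has the control unit ahead, so the statistic is 3 + 5 + 1 + 4.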


More on Matched Pairs

$$T = t(Z, r) = \sum_{s=1}^{S} d_s \sum_{i=1}^{2} c_{si} Z_{si}$$

• No hidden bias => know the null distribution of $T$ because $s$th pair contributes $d_s$ with prob ½ and 0 with prob ½

• with hidden bias, the $s$th pair contributes $d_s$ with prob:

$$p_s = \frac{c_{s1} \exp(\gamma u_{s1}) + c_{s2} \exp(\gamma u_{s2})}{\exp(\gamma u_{s1}) + \exp(\gamma u_{s2})}$$

and zero with prob $1 - p_s$

• so null distribution of $T$ is unknown…


Even More on Matched Pairs

• The P-value we are after is $\Pr(T \ge T_{obs})$

• easy to see that:

$$\frac{1}{1 + \Gamma} \le p_s \le \frac{\Gamma}{1 + \Gamma}$$

• Lower bound on P-value: $\Pr(T^- \ge T_{obs})$, where $T^-$ is the sum of $S$ quantities, the $s$th one being $d_s$ with prob $p_s^- = 1/(1 + \Gamma)$ and 0 otherwise

• Upper bound likewise using $p_s^+ = \Gamma/(1 + \Gamma)$

• This directly provides bounds on P-values for fixed $\Gamma$
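
The bounding argument can be turned into a short exact calculation. For a fixed sensitivity parameter (written `Gamma` in the code), each pair contributes its rank with probability 1/(1 + Gamma) for the lower bound and Gamma/(1 + Gamma) for the upper bound, and the tail probability is obtained by building the distribution of the sum one pair at a time. The ranks, the observed statistic, and the function names below are hypothetical, and ties are assumed away.

```python
def upper_tail(ranks, p, t_obs):
    """Pr(T >= t_obs) when pair s contributes ranks[s] with prob p and 0 otherwise."""
    dist = {0: 1.0}  # maps each attainable total to its probability
    for d in ranks:
        new = {}
        for total, prob in dist.items():
            new[total] = new.get(total, 0.0) + prob * (1 - p)
            new[total + d] = new.get(total + d, 0.0) + prob * p
        dist = new
    return sum(prob for total, prob in dist.items() if total >= t_obs)

def p_value_bounds(ranks, t_obs, Gamma):
    """(lower, upper) bounds on the one-sided P-value for a given Gamma >= 1."""
    lower = upper_tail(ranks, 1 / (1 + Gamma), t_obs)
    upper = upper_tail(ranks, Gamma / (1 + Gamma), t_obs)
    return lower, upper

# Hypothetical study: 10 pairs with ranks 1..10 and an observed statistic of 50.
ranks = list(range(1, 11))
t_obs = 50
results = {Gamma: p_value_bounds(ranks, t_obs, Gamma) for Gamma in (1, 2, 3)}
for Gamma, (lower, upper) in results.items():
    print(f"Gamma = {Gamma}: {lower:.4f} <= P-value <= {upper:.4f}")
```

With `Gamma` equal to 1 the two bounds coincide with the usual randomization P-value, and the interval widens as `Gamma` grows. Reading off the smallest value at which the upper bound crosses the chosen significance level gives the kind of summary reported in the examples that follow.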


Smoking & Lung Cancer Example

• Hammond (1964) paired 36,975 heavy smokers to non-smokers. Matched on age, race, plus 16 other factors

Γ    Minimum     Maximum
1    < 0.0001    < 0.0001
2    < 0.0001    < 0.0001
3    < 0.0001    < 0.0001
4    < 0.0001    0.0036
5    < 0.0001    0.03
6    < 0.0001    0.1


Asthma Study

• Need a $\Gamma$ of three to make the effect of tobacco experimentation on asthma become non-significant