Causal Inference Methodology for Comparisons of Hospital Quality of Care by Katherine Daignault A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Biostatistics, Dalla Lana School of Public Health University of Toronto © Copyright 2019 by Katherine Daignault


Causal Inference Methodology for Comparisons of Hospital Quality of Care

by

Katherine Daignault

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Biostatistics, Dalla Lana School of Public Health

University of Toronto

© Copyright 2019 by Katherine Daignault


Abstract

Causal Inference Methodology for Comparisons of Hospital Quality of Care

Katherine Daignault

Doctor of Philosophy

Graduate Department of Biostatistics, Dalla Lana School of Public Health

University of Toronto

2019

In a national or provincial health care system, where limited financial resources are available to improve patient care, it is necessary to be able to evaluate the current care practices of hospitals to determine where resources are best spent. Assessment of hospital care quality is achieved by comparing each hospital’s performance to some reference level of care, often the average care level in the system, a process termed standardization. Standardization allows adjustment for differences in patient characteristics between hospitals, which would otherwise unduly penalize hospitals that treat sicker patients. The quality and quantity of information available to make such adjustments, or lack thereof, can bias estimates of a hospital’s performance, resulting in misleading assessments of quality. Further, the goal of profiling care is not just to identify areas in which care disparities exist, but ultimately to intervene on care to improve patient outcomes.

In this thesis, I take advantage of the causal nature of such comparisons (i.e., poor care leads to poor outcomes) and propose new statistical methods under a causal inference framework. First, I illustrate the current limitations of a standard hospital comparison analysis using U.S. prostate cancer data. Second, I develop a doubly robust estimator for the standardized mortality ratio (SMR) when the reference is the system average level of care. I show that this estimator will provide unbiased estimates of the SMR as long as one of the component models is correctly specified. Third, I show that one assumption needed for the above estimator can be relaxed only for this reference comparison. Fourth, I adapt causal mediation analysis methods to derive a decomposition of the hospital effect on patient outcomes that may act through a mediating process, and develop two estimators for this decomposition. This allows quantification of the effect an intervention to improve care may have on patient outcomes, so that hospitals can be prioritized according to which would benefit most from government resources. Finally, I illustrate the proposed mediation methods on Ontario kidney cancer data. This thesis provides valuable tools to effectively identify and target hospitals in which care improvement is most needed.
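As a rough illustration of the observed-to-expected comparison described above, the following minimal sketch computes an indirectly standardized mortality ratio on toy data. It is not the thesis's doubly robust estimator: it assumes a single discrete case-mix stratum per patient and uses plain stratum-specific system-wide death rates as the "average level of care". The function name `indirect_smr`, the helper `make`, and the hospital and stratum labels are all hypothetical.

```python
from collections import defaultdict


def indirect_smr(records):
    """Observed / expected deaths per hospital (illustrative sketch only).

    records: list of (hospital, stratum, died) tuples, died in {0, 1}.
    Expected deaths apply the system-wide stratum-specific death rates
    to each hospital's own case-mix (indirect standardization).
    """
    stratum_n = defaultdict(int)
    stratum_deaths = defaultdict(int)
    for _, stratum, died in records:
        stratum_n[stratum] += 1
        stratum_deaths[stratum] += died
    # System-wide death rate within each case-mix stratum.
    rate = {s: stratum_deaths[s] / stratum_n[s] for s in stratum_n}

    observed = defaultdict(int)
    expected = defaultdict(float)
    for hospital, stratum, died in records:
        observed[hospital] += died
        expected[hospital] += rate[stratum]
    return {h: observed[h] / expected[h] for h in observed if expected[h] > 0}


def make(hospital, stratum, n, deaths):
    """Hypothetical helper: n patients in a stratum, of whom `deaths` died."""
    return [(hospital, stratum, 1)] * deaths + [(hospital, stratum, 0)] * (n - deaths)


# Toy data: both hospitals have the same crude death rate (2 of 6),
# but hospital A treats more patients from the higher-risk stratum.
records = (
    make("A", "old", 4, 2) + make("A", "young", 2, 0)
    + make("B", "old", 2, 1) + make("B", "young", 4, 1)
)
smr = indirect_smr(records)
```

In this toy setting the crude rates are identical, yet hospital A's ratio falls below 1 and hospital B's above 1, which is the kind of case-mix penalty that standardization is meant to remove.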


Acknowledgements

I want to extend a huge thank you to my supervisor, Olli Saarela, for all the time and patience he has devoted to me throughout this degree. He has been a wonderful mentor and has always left me in awe of the extent of his expertise. He is a constant source of inspiration for me and I would not be where I am today without him. I also want to extend my gratitude to my committee members, Wendy Lou and Eleanor Pullenayegum. This thesis would not have been possible without their continued support and encouragement. I was also fortunate enough to have had the opportunity to work with a wonderful group of collaborators in the Urology Department at Princess Margaret Cancer Centre, under the lead of Drs. Antonio Finelli and Keith Lawson. Their projects allowed me to gain insight into the field of health services research and served as important motivation for the methods developed for my thesis.

My appreciation goes out to the Biostatistics department for their continued support of and advocacy for their students. Their support led to the creation and renovation of the Biostatistics PhD offices, which has been one of the most important factors in my successful completion of this dissertation. Whether through academic or emotional support, or just goofing off, the room and the people in it have been an integral part of this journey. In particular, I want to thank Osvaldo, Kaviul, Thai-Son, Michela, Myrtha, Sudip, Tim, Jen, and Konstantin for all the laughs and support throughout. You’ve all made my time at Dalla Lana an unforgettable one.

My experience in this program was not always an easy one, and I feel extremely lucky to have some of the most wonderful and unwaveringly supportive friends anyone could ask for. To Kuan, thank you for always being the life of any room you enter, for always making me laugh, for putting all of life’s problems into perspective for me, and for being just a great friend. To Thea, thank you for always making time for me, for helping me get accustomed to a new city and life, for our wonderfully unproductive work lunches, and for being one of my oldest and most cherished friends. To Marie, you have been my rock for so many years, there are just no words capable of expressing my gratitude. I quite literally could not have done this without you and your unending emotional support.

It goes without saying that none of this would have been possible without the love and support of my family and particularly my parents. Thank you for a lifetime of encouragement to pursue my passions and for continually believing in me, even when I couldn’t believe in myself. Finally, to my love and best friend, Ryan, you are the best thing that has ever happened to me. You are endlessly supportive, caring, funny, and intelligent. You make me a better person and I cannot imagine having done this without you by my side.


Contents

1 Introduction

1.1 Preliminary Background and Motivation

1.2 Thesis Outline

1.3 Authorship Contributions

2 Background and Literature Review

2.1 Quality Comparisons

2.1.1 Brief History of Quality Comparisons

2.1.2 Choice of Quality Indicator

2.1.3 Various Uses for Quality Indicators

2.2 Standardization Methods

2.2.1 Direct and Indirect Standardization: An Introduction

2.2.2 Case-mix Adjustment Methods

2.2.3 Direct versus Indirect: Which is appropriate?

2.2.4 Hospital Standardized Mortality Ratio for Mortality

2.3 Causal Models and Inference

2.3.1 Introduction to Causal Inference and Potential Outcomes

2.3.2 Causal Inference and Quality Indicators

2.3.3 A Comment on Indirect Standardization and the Positivity Assumption

2.3.4 Doubly Robust Estimation of Causal Effects

2.3.5 Traditional Mediation Analysis Methods

2.3.6 Counterfactual Approach to Mediation Analysis

2.4 Thesis Contributions

3 Prostate cancer quality of care disparities and their impact on patient mortality

3.1 Abstract

3.2 Introduction

3.3 Methods

3.3.1 Data

3.3.2 Study cohort

3.3.3 Measurement of quality of care

3.3.4 Statistical Analysis

3.4 Results


3.5 Discussion

3.6 Conclusions

3.7 Figures

3.8 Supplemental Tables and Figures

4 Doubly Robust Estimator for Indirectly Standardized Mortality Ratios

4.1 Abstract

4.2 Introduction

4.3 Proposed Estimator

4.3.1 Notation and assumptions

4.3.2 Direct versus indirect standardization

4.3.3 Doubly robust estimation in direct standardization

4.3.4 Causal estimand under indirect standardization

4.3.5 Proposed doubly robust estimator

4.4 Simulation

4.5 Discussion

4.6 Appendix A: Proofs for equations (4.4)-(4.7)

4.7 Appendix B: Consistency of the Proposed Estimator

4.7.1 A note on correctly specified models

4.7.2 Consistency for correctly specified models

4.7.3 Consistency under misspecified assignment model

4.7.4 Consistency under misspecified outcome model

5 Effect of Positivity Violations on Hospital Quality of Care Comparisons

5.1 Abstract

5.2 Introduction

5.3 Direct Standardization and Positivity

5.3.1 Notation and Assumptions in Causal Inference

5.3.2 Directly Standardized Hospital Comparisons

5.4 Positivity Violations on Indirectly Standardized SMRs

5.4.1 Comparison to Another Hospital

5.4.2 Comparison to an Average Hospital

5.4.3 Comparison to Average Nationwide Care

5.5 Toy Example of Positivity Violations

5.6 Discussion

6 Causal Mediation Analysis for Standardized Mortality Ratios

6.1 Abstract

6.2 Introduction

6.3 Causal Estimand and Total Effect Decomposition

6.3.1 Notation

6.3.2 Causal estimand for SMR


6.3.3 Total effect decomposition of SMR

6.4 Proposed Estimators

6.4.1 Proposed model-based estimators

6.4.2 Proposed semi-parametric estimators

6.5 Simulation study

6.6 Application to Ontario Kidney Cancer Data

6.7 Discussion

6.8 Supplementary Digital Content

6.8.1 eAppendix 1. Assumptions

6.8.2 eAppendix 2. Derivation of model-based estimators

6.8.3 eAppendix 3. Derivation of semi-parametric estimators

6.8.4 eAppendix 4. Additional simulation details and results

6.8.5 eAppendix 5. Sample R code for simulations

7 Using Causal Mediation Analysis to Target Minimally Invasive Surgery Rates to Improve Length of Stay after Surgical Treatment of Kidney Cancer

7.1 Abstract

7.2 Introduction

7.3 Materials and Methods

7.3.1 Data and Study Cohort

7.3.2 Causal Mediation Analysis for Hospital Comparisons

7.3.3 Estimation of Effect Decomposition

7.3.4 Standard Errors for the Estimated SMRs

7.4 Results

7.4.1 Description of the Data

7.4.2 Mediation Analysis Results

7.4.3 Comparison of Error Estimation Methods

7.5 Discussion

7.6 Supplemental Figures

8 Discussion

8.1 Limitations and Future Considerations

8.1.1 Causal Inference and Assumptions

8.1.2 Variables for Case-mix Adjustment

8.1.3 Sensitivity of Results to Assumptions

8.1.4 Variability of Proposed Estimators

8.1.5 Profiling using Multiple Indicators

8.1.6 Quality Improvement over Time

8.2 Impact of Thesis

Bibliography


List of Tables

2.1 Population characteristics needed for standardization for K strata based on patient characteristics in each index and reference population (e.g. age group or gender). The total observed events and crude rates within each population involve calculating the total events or crude rate per patient strata, and summing over all strata.

S3.1 Quality indicator definitions and inclusion criteria

S3.2 Descriptive statistics of patients from each QI cohort in the training set.

S3.3 Descriptive statistics for outcome subsets in training and validation set.

4.1 Difference between the standardization methods. The asterisk refers to the standard population, k indicates the covariate strata, π_k is the estimated event rate, and E is the expected outcome.

5.1 Hypothetical example of comparing rate of hip fracture treatment within 24 hours (Y) between three hospitals (Z) while adjusting for age of patient (X). Crude rate is the rate of treatment in each hospital, unadjusted for X.

5.2 Empirical proportions based on hypothetical data (Table 5.1), for use in causal effect estimation. Note: the conditional outcome proportion for hospital 1 is given value NI for non-identifiable.

7.1 Descriptive statistics for population of n = 4001 Ontario kidney cancer patients undergoing radical nephrectomies across 60 hospitals. Here, DX refers to diagnosis, NX refers to nephrectomy, and ACG score is the Adjusted Clinical Group score (Starfield et al., 1991).


List of Figures

2.1 Basic causal mechanism.

2.2 A simple mediation model, with exposure Z, mediator M, confounder X and outcome Y.

3.1 Nationwide hospital-level benchmarking of prostate cancer quality of care. Case-mix adjusted performance for individual hospitals (circles, size proportional to hospital volume) benchmarked for quality according to disease-specific quality indicators. Vertical dashed red line represents the average nationwide hospital performance. The y axis represents the inverse standard error of the case-mix adjusted performance measure, with the dot-dash blue funnel giving the unadjusted 95% non-rejection region for the null of equivalence between observed and expected performance and the dashed red funnel giving the non-rejection region after Bonferroni correction. Between-hospital heterogeneity in performance is reported on each plot in terms of the I² statistic.

3.2 Concordance in quality indicators for identifying outlier hospitals. Venn diagrams display the concordance in classifying outlier hospitals between the individual QIs.

3.3 Impact of hospital quality on patient outcomes. Unadjusted and case-mix adjusted associations between hospital-level quality, measured by the PC-QS, and overall mortality. Values displayed reflect hazard ratio (HR) when comparing hospitals with a positive vs. negative PC-QS. CI = confidence interval.

3.4 Hospital structure features associated with quality. Associations between hospital quality, measured by the PC-QS, and hospital volume (left panel), facility type (middle panel), and geographical location (right panel).

3.5 Impact of hospital level quality on race and insurance status associations with patient outcomes. Associations between race and insurance status with the rate of salvage therapy (surgery or radiation) [S], ADT initiation [ADT], 30-day mortality [30], 90-day mortality [90] and overall mortality [M], adjusted for both case-mix as well as hospital PC-QS.

S3.1 A-J: Model estimates (95% CI) of QI case-mix adjustment models

S3.2 Yearly trend in outlier status for each QI. Red circles are poor performers, blue circles are superior performers, black line is smoothed average time trend.

S3.3 Associations of QIs with outcomes of interest, adjusted for case-mix.

S3.4 Distribution of the PC-QS in the validation set.

4.1 The postulated causal mechanism (U is a non-confounder latent variable representing the correlation between potential outcomes for an individual).


4.2 Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, assignment model (4.6) only and doubly robust estimators (4.9) when true SMR = 1.0 for all hospitals.

4.3 Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, assignment model (4.6) only and doubly robust estimators (4.9) when true level of care varies across hospitals.

4.4 Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, modified assignment model (4.12) only and doubly robust estimators (4.9) when true level of care varies across hospitals.

5.1 The postulated causal mechanism.

6.1 The postulated causal mechanism.

6.2 Causal relationship for simulated data. U1, U2 are non-confounder latent variables representing individual-level correlation among the potential binary mediator and potential binary outcome values respectively.

6.3 Total effect decomposition for five providers using (a) model-based and (b) semi-parametric estimators. Bars are the means of the sampling distribution for each hospital, and error bars represent 2.5th and 97.5th percentiles of sampling distributions. NDE indicates natural direct effect; NIE, natural indirect effect; SMR, standardized mortality ratio; TE, total effect.

6.4 Total, indirect, and direct effect sampling distributions of proposed estimators when an indirect effect exists, but no direct effect exists.

6.5 Total, natural indirect (mediated through minimally invasive surgery) and natural direct (not mediated through minimally invasive surgery) hospital effects on length of stay for the 10 largest Ontario hospitals, with distribution of 500 bootstrap resamples and whiskers corresponding to 95% percentile intervals. The standardized mortality ratios (SMRs) refer to the ratio of observed versus expected (under average level of care) length of stay for the patient case-mix of a given hospital.

S6.1 Total, indirect and direct effect sampling distributions of proposed estimators when a direct effect exists, but no indirect effect exists via the provider-mediator pathway.

S6.2 Total, indirect and direct effect sampling distributions of proposed estimators when a direct effect exists, but no indirect effect exists via mediator-outcome pathway.

7.1 Flow diagram illustrating the database merging and cohort defining steps resulting in the general analysis dataset from which we defined our analysis cohort.

7.2 Causal model representing the effect of hospital on patient length of stay (LOS) that may be mediated by performance of minimally invasive surgery (MIS). Case-mix factors include patient level demographic and disease-progression information.

7.3 Number of patients in cohort per hospital. ‘Others’ is a pooled category combining hospitals who treated fewer than 9 patients. The red line indicates the cut point for pooling hospitals who treat fewer than 50 patients.


7.4 Distribution of pairwise standardized mean differences (SMD) between hospitals for each covariate, as well as the mediator MIS and outcome LOS.

7.5 Funnel plot of case-mix adjusted minimally invasive surgery proportions. Circles represent hospital standardized mortality ratios, proportional to their volume, plotted against the inverse of their estimated standard error. Red indicates hospitals classified as poor outliers, blue for superior outliers.

7.6 Funnel plot of case-mix adjusted length of stay. Circles represent hospital standardized mortality ratios, proportional to their volume, plotted against the inverse of their estimated standard error. Red indicates hospitals classified as poor outliers, blue for superior outliers.

7.7 Caterpillar plot of the parameter estimates and 95% confidence intervals of the mediator model used in the model-based and semi-parametric estimators of the total effect decomposition. Here, all hospitals treating fewer than 50 patients are pooled into a single category (‘Others’).

7.8 Caterpillar plot of the parameter estimates and 95% confidence intervals of the outcome model used in the model-based estimators of the total effect decomposition. Here, all hospitals treating fewer than 50 patients are pooled into a single category (‘Others’).

7.9 Boxplots of bootstrap sampling distribution of model-based and semi-parametric estimators of the total effect decomposition when pooling hospitals who treat fewer than 50 patients and fitting the multinomial model specified in (7.10). Whiskers of boxplots represent 95% confidence intervals.

7.10 Boxplots of 95% confidence intervals of model-based and semi-parametric estimators of the total effect decomposition when pooling hospitals who treat fewer than 9 patients and fitting a constrained multinomial model specified in (7.11). Variability was estimated via a 50 iteration non-parametric bootstrap and a Normal approximation was used for the confidence intervals.

7.11 Differences in point estimates between the use of unconstrained and constrained multinomial assignment models for both semi-parametric and model-based estimators of the total, indirect and direct effects for the 29 large hospitals in Ontario.

7.12 Margin of error for the 95% confidence intervals of the model-based estimators for each of the variance estimation methods: 500 iteration non-parametric bootstrap using the unconstrained multinomial model, the approximate Bayesian method, the 50 iteration non-parametric bootstrap with Normal approximation and the 125 iteration non-parametric bootstrap using constrained multinomial model.

7.13 Margin of error for the 95% confidence intervals of the semi-parametric estimators for each of the variance estimation methods: 500 iteration non-parametric bootstrap using the unconstrained multinomial model, the 50 iteration non-parametric bootstrap with Normal approximation and the 125 iteration non-parametric bootstrap using constrained multinomial model.

S7.1 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in age group. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.


S7.2 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in ACG score. Red means small imbalances, yellow means larger imbalances.

Legend shows the distribution of pairwise SMDs.

S7.3 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in Charlson comorbidity score. Red means small imbalances, yellow means larger

imbalances. Legend shows the distribution of pairwise SMDs.

S7.4 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in days from diagnosis to nephrectomy. Red means small imbalances, yellow

means larger imbalances. Legend shows the distribution of pairwise SMDs.

S7.5 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in income quintile. Red means small imbalances, yellow means larger imbalances.

Legend shows the distribution of pairwise SMDs.

S7.6 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in sex. Red means small imbalances, yellow means larger imbalances. Legend

shows the distribution of pairwise SMDs.

S7.7 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in tumour size (cm). Red means small imbalances, yellow means larger imbalances.

Legend shows the distribution of pairwise SMDs.

S7.8 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in tumour stage. Red means small imbalances, yellow means larger imbalances.

Legend shows the distribution of pairwise SMDs.

S7.9 Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing

imbalance in year of diagnosis. Red means small imbalances, yellow means larger imbalances.

Legend shows the distribution of pairwise SMDs.

S7.10 Caterpillar plot of the parameter estimates and 95% confidence intervals of the mediator

model used in the model-based and semi-parametric estimators of the total effect decompo-

sition. Here, all hospitals treating fewer than 9 patients are pooled into a single category

(‘Others’).

S7.11 Caterpillar plot of the parameter estimates and 95% confidence intervals of the outcome

model used in the model-based estimators of the total effect decomposition. Here, all hospitals

treating fewer than 9 patients are pooled into a single category (‘Others’).

S7.12 Boxplots of 95% confidence intervals of model-based and semi-parametric estimators of the

total effect decomposition when pooling hospitals that treat fewer than 9 patients and fitting

a constrained multinomial model specified in (7.11). Variability was estimated via a 125

iteration non-parametric bootstrap.

S7.13 Boxplots of 95% confidence intervals of model-based estimators of the total effect decomposi-

tion when pooling hospitals who treat fewer than 9 patients and fitting a constrained multi-

nomial model specified in (7.11). Variability was estimated via an approximate Bayesian

method that resamples fitted model parameters.

Chapter 1

Introduction

1.1 Preliminary Background and Motivation

With increased availability of routinely collected administrative and clinical data, institutional quality com-

parisons have become popular in recent years. These data are often used to assess the quality of patient care

being provided at various levels of a health care system in an effort to improve care through policy deci-

sions and optimal resource allocation. Such assessments can be made at any level of the health care setting

of interest including administrative subregions, hospitals or physicians; here the focus shall be on hospital

comparisons. The assessment of quality of care is usually only meaningful when there is known variability

in the care being provided between hospitals. When such variation exists, valid measures of disease-specific

care must be identified for quality assessment purposes. Such measures, termed quality indicators, are

standardized, evidence-based measures of the quality of patient care that may be used in conjunction with

administrative data to measure and track patient outcomes and clinical performance. These then may be

used, either individually or through the development of composite measures of care, to assess hospital per-

formance. Often, indicators are used to benchmark hospitals relative to some reference level of care. This

is done to identify hospitals that exhibit superior or poor performance in an attempt to target hospitals for

quality improvement initiatives. When hospitals are classified as outliers, certain methodological considerations

must be taken into account regarding the adequacy of the adjustment for patient-level demographic or disease-specific

characteristics that may be associated with the indicator of care, termed the patient case-mix. Such adjust-

ment is necessary as often larger hospitals may be responsible for the treatment of sicker patients and thus

may seem to provide worse care in unadjusted comparisons (Shahian and Normand, 2008). Further, it is

important to consider the causal relationship between the hospital of care, how patients are treated, and

their post-treatment outcomes (Donabedian, 1988). Without the existence of such a causal pathway of care,

it would be difficult to conceive reasonable interventions to improve patient care. This thesis contributes to

the area of hospital quality comparisons by developing statistical methodology that helps identify poor care

providers and quantifies how possible interventions on hospital practices will improve patient outcomes. To

this end, this thesis frames quality comparisons using the causal inference framework and develops statisti-

cal methodology that addresses the issue of inadequate case-mix adjustment. Further, this thesis develops

methodology to help policy makers identify which aspects of care lead to worse patient outcomes (i.e. identify

areas for intervention) through exploitation of the causal pathway of care by adapting methods from causal

mediation analysis.

Indicators of quality can measure any element along the pathway of care, which can be divided into struc-

tural elements (e.g. hospital volume), process elements (e.g. pertaining to what was actually done to the

patient) and outcome elements (e.g. reflecting some aspect of the health status of the patient) (Donabedian,

1988). Although each of these types of indicators has advantages and disadvantages (Birkmeyer

et al., 2004), process measures are generally regarded as the optimal choice for assessing quality.

Process measures have the distinct advantage of being actionable which is appealing when interventions on

care quality are of interest. For such measures to be useful, Donabedian (1978) notes that it is essential for

there to exist a causal relationship between structural, process and outcome measures. However, there has

been some reluctance to address such quality comparisons in an explicit causal framework (Dowd, 2011).

Further, indicators must only reflect variation in the level of care, so variation between hospitals due to

patient differences must be adjusted for, often through the use of standardization methods.

Indirect standardization is the most commonly used method of adjusting for the effect of patient case-

mix on quality indicators. Standardization works by comparing the observed outcomes of a hospital to the

outcomes that would be expected based on some reference population. Indirectly standardized measures

can be interpreted as measuring the expected outcome if patients within a specific hospital instead received

some reference level of care (Keiding and Clayton, 2014). The choice of reference care level can be that of

another hospital in the system, that of an average hospital in the system, or the average care level within

the system itself. While all of these reference levels are useful, comparison to the average care level in

the system is particularly relevant to policy makers who must allocate limited resources across a provincial

or national health care system. Upon selection of a reference level of care, the observed indicator can then

be standardized through the use of a quantity such as the standardized mortality ratio (SMR), an observed

to expected ratio (SMR = O/E). Here, the observed patient outcomes of a particular hospital are scaled to

account for the effect of the patient case-mix of that hospital. Standardization is often achieved through

the fitting of regression models to the observed data, followed by calculation of the expected outcome E

based on the predicted outcomes from these models. This process allows adjustment for case-mix differences

between hospitals. However, misspecification of these models (e.g. not adjusting for an important patient

characteristic or misspecifying the functional forms of the relationship) is a serious concern that can lead to

misleading assessments of quality.
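
As a minimal sketch of the O/E calculation described above, the following example indirectly standardizes a toy data set against the system-wide average. It is only an illustration under stated assumptions: the function name `indirect_smr`, the record layout and the toy counts are hypothetical, and stratum-specific system-wide event rates stand in for the fitted regression model (they correspond to a saturated model with a single categorical case-mix variable).

```python
from collections import defaultdict


def indirect_smr(records):
    """Indirectly standardized ratio O/E per hospital.

    records: iterable of (hospital, stratum, event) with event in {0, 1}.
    The reference level of care is the system-wide event rate within each
    case-mix stratum (an assumption made for this illustration).
    """
    records = list(records)

    # Reference model: system-wide event rate in each case-mix stratum.
    events = defaultdict(int)
    counts = defaultdict(int)
    for _, stratum, event in records:
        events[stratum] += event
        counts[stratum] += 1
    ref_rate = {s: events[s] / counts[s] for s in counts}

    # O: observed events; E: events expected under the reference rates
    # applied to the hospital's own patients.
    observed = defaultdict(float)
    expected = defaultdict(float)
    for hospital, stratum, event in records:
        observed[hospital] += event
        expected[hospital] += ref_rate[stratum]

    return {h: observed[h] / expected[h] for h in observed if expected[h] > 0}


def cell(hospital, stratum, n, n_events):
    """Expand a (hospital, stratum) count into patient-level records."""
    return ([(hospital, stratum, 1)] * n_events
            + [(hospital, stratum, 0)] * (n - n_events))


# Hypothetical toy data: hospital B treats mostly high-risk patients.
records = (cell("A", "low", 8, 1) + cell("A", "high", 2, 1)
           + cell("B", "low", 2, 0) + cell("B", "high", 8, 4))

smr = indirect_smr(records)
for hospital in sorted(smr):
    print(hospital, round(smr[hospital], 3))
```

In this toy example hospital B has the higher crude event rate (4/10 versus 2/10) but the lower standardized ratio, because most of its patients fall in the high-risk stratum.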

In addition to properly adjusting for case-mix differences between hospitals, quality comparisons should

also make use of the causal pathway between the hospital of treatment and the patient outcomes of interest.

Causal inference allows evidence-based conclusions to be drawn regarding the causal effects of an exposure

on an outcome. To do so, causal models are used to conceptualize the possible causal relationships and

mechanisms at play within the system under study. For quality comparisons, the exposure being considered

is receiving treatment at some hospital and interest lies in specifying the causal effect of this exposure on

patient outcomes, or simply, the care that the patient received. Causal models enable the mathematical

formalization of causal effects that may occur due to potential interventions on the exposure (Petersen and

van der Laan, 2014). Often, potential outcomes notation is employed to specify the causal effect of interest,

and is used to refer to an outcome that may have occurred under some alternative exposure (i.e. through

some intervention) that may be contrary to the exposure received (Rubin, 1974). Despite the natural causal

interpretation of quality comparisons and standardization, addressing such comparisons using an explicit

causal inference framework has not been widely adopted. Causal estimands (i.e. the causal effect to be

estimated) have been developed, in the context of a binary exposure, for the risk difference and ratio among

the exposed (Shinozaki and Matsuyama, 2015). In the context of hospital comparisons, causal estimands

have been developed for the directly standardized risk difference and the SMR when comparing to an average

hospital’s level of care (Varewyck et al., 2014), and for the indirectly standardized excess risk (Varewyck

et al., 2016). However, no causal estimand has been formulated for the SMR when comparing a hospital’s

care to the national/provincial average care level.
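
To make the potential outcomes notation concrete, the following is a generic sketch for a binary exposure; the symbols A, Y(a), RD and RR are illustrative and are not necessarily the notation adopted later in the thesis. Let A = 1 denote exposure and let Y(a) denote the outcome that would be observed under exposure level a. The risk difference and risk ratio among the exposed can then be written as:

```latex
\[
\mathrm{RD}_{\text{exposed}} = E[Y(1) \mid A = 1] - E[Y(0) \mid A = 1],
\qquad
\mathrm{RR}_{\text{exposed}} = \frac{E[Y(1) \mid A = 1]}{E[Y(0) \mid A = 1]} .
\]
```

Under the usual consistency assumption, E[Y(1) | A = 1] equals the observed risk E[Y | A = 1], so the ratio has the same observed-to-expected form as an SMR; only the counterfactual term in the denominator needs to be estimated.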

By adopting the causal inference framework, it is possible to propose methods for dealing with model

misspecification and for assessing the possible impact that intervening on certain aspects of care may have

on patient outcomes. Doubly robust (DR) estimation (Bang and Robins, 2005) addresses the former by

employing two different models, rather than one, within a single estimator, only one of which need be

correctly specified for unbiased estimation of the causal estimand. By having two opportunities to correctly

adjust for patient differences, the quality indicator may more accurately reflect the care being provided and

may prove more valid in identifying poor care providers. Both Varewyck et al. (2014) and Shinozaki and

Matsuyama (2015) have proposed such estimators for their causal estimands. The latter goal of quantifying

the impact of interventions on care can be addressed through the adoption of causal mediation analysis.

Mediation analysis allows the total effect of a hospital on the outcome to be decomposed into the indirect

(mediated) effect, the effect of the hospital on the outcome that is attributed to a certain process of care, and

the remaining direct (unmediated) hospital effect. By decomposing this causal care pathway, it is possible

to determine the hospitals at which an intervention on a particular process of care will result in the greatest

improvement in patient outcomes. Causal effect decompositions have been proposed for the risk and mean

differences (VanderWeele, 2009), odds ratios (VanderWeele and Vansteelandt, 2010) and risk differences

among the exposed (Vansteelandt and VanderWeele, 2012), yet none consider the SMR when comparing to

a national/provincial average level of care.
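
The doubly robust property described above can be sketched with a generic augmented inverse probability weighted estimator of E[Y(1)] for a binary exposure. This is not the estimator developed in this thesis for the SMR; the function names, the single binary covariate and the toy counts are assumptions made purely for illustration.

```python
def aipw_mean(data, propensity, outcome_model):
    """Augmented IPW (doubly robust) estimate of E[Y(1)].

    data: list of (x, a, y) tuples with binary exposure a.
    propensity(x): working model for P(A = 1 | X = x).
    outcome_model(x): working model for E[Y | A = 1, X = x].
    """
    total = 0.0
    for x, a, y in data:
        e = propensity(x)
        m = outcome_model(x)
        # IPW term minus an augmentation term built from the outcome model.
        total += a * y / e - (a - e) / e * m
    return total / len(data)


def cell(x, a, n, n_events):
    """Expand a (covariate, exposure) count into individual records."""
    return [(x, a, 1)] * n_events + [(x, a, 0)] * (n - n_events)


# Hypothetical toy data: exposure and outcome both depend on x, and the
# exposure has no effect, so the true E[Y(1)] is 0.5*0.1 + 0.5*0.5 = 0.3.
data = (cell(0, 1, 20, 2) + cell(0, 0, 80, 8)
        + cell(1, 1, 80, 40) + cell(1, 0, 20, 10))


def good_ps(x):   # correctly specified exposure model
    return 0.8 if x == 1 else 0.2


def bad_ps(x):    # misspecified: ignores x
    return 0.5


def good_om(x):   # correctly specified outcome model
    return 0.5 if x == 1 else 0.1


def bad_om(x):    # misspecified: crude risk among the exposed
    return 0.42


print(round(aipw_mean(data, good_ps, bad_om), 3))  # exposure model correct
print(round(aipw_mean(data, bad_ps, good_om), 3))  # outcome model correct
print(round(aipw_mean(data, bad_ps, bad_om), 3))   # both misspecified
```

On these toy data the estimate recovers the true value of 0.3 whenever at least one of the two working models is correct, and falls back to the confounded crude risk of 0.42 when both are misspecified.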

1.2 Thesis Outline

In this thesis, I address the issues briefly outlined above by considering the indirectly standardized mortality

ratio under a causal inference framework. I adopt potential outcome notation to define the causal estimand

of interest for comparing a hospital’s care level to that of the national/provincial average level. By doing so,

I develop statistical methodology to address model misspecification in the adjustment for case-mix differ-

ences using doubly robust estimators, and to quantify the benefit to patient outcomes of an intervention to

improve care using causal mediation analysis. This is a manuscript-based thesis with 8 chapters including

an introduction, literature review, five journal articles, and a discussion with possible directions for future

work. At the time of submission of this thesis, Chapter 4 has been published in a peer-reviewed journal, Chapter

6 has been accepted for publication, and Chapters 3, 5 and 7 are in preparation for submission. The five

manuscripts that appear as chapters in this thesis are (in order of appearance):

1. Lawson, K.A., Daignault, K., Aboussaly, R., Khanna, A., Goldenberg, M., Hamilton, R.J., Loblaw, A.,

Warde, P., Saarela, O., and Finelli, A. Prostate Cancer Quality of Care Disparities and their Impact

on Patient Mortality; in preparation for submission.

2. Daignault, K. and Saarela, O. (2017). Doubly Robust Estimator for Indirectly Standardized Mortality

Ratios. Epidemiologic Methods 6, 1: 20160016.

3. Daignault, K. and Saarela, O. Effect of Positivity Violations on Hospital Quality of Care Comparisons;

in preparation for submission.

4. Daignault, K., Lawson, K.A., Finelli, A., and Saarela, O. (2019). Causal Mediation Analysis for

Standardized Mortality Ratios. Epidemiology (in press).

5. Daignault, K., Lawson, K.A., Finelli, A., and Saarela, O. Using Causal Mediation Analysis to Target

Minimally Invasive Surgery Rates to Improve Length of Stay after Surgical Treatment of Kidney

Cancer; in preparation for submission.

Details of the contributions of each author to the manuscripts can be found below.

This thesis begins, in Chapter 2, by providing background and a literature review of quality indicators and

comparisons, standardization methods, and causal models and inference with specific focus on doubly robust

estimation and mediation analysis. In the causal inference section of Chapter 2, the notation of potential

outcomes and the assumptions surrounding their use will be introduced, as well as those needed for mediation

analysis. Chapter 3 provides a motivating example of hospital profiling in the case of prostate cancer care in

the United States. There is much evidence that the care provided to prostate cancer patients varies between

hospitals (Ellison et al., 1999; Harlan et al., 2001; Crook et al., 2002; Potosky et al., 2004; Krupski et al.,

2005). Further, a number of studies have attempted to define valid indicators of prostate cancer care for use

in quality assessment studies (Miller and Saigal, 2009; Nag et al., 2018; Ortelli et al., 2018). Therefore, there

is a clear need for benchmarking the quality of prostate cancer care. This chapter profiles hospitals across

the U.S. against the average level of care nationwide using indirect standardization, classifies hospitals as

providing poor or superior care, develops a composite score across multiple indicators of care, and considers

associations of this score with patient outcomes and hospital characteristics. It motivates the need for the

development of novel methods that address inadequate case-mix adjustment and model misspecification, as

well as the desire to illustrate the impact of poor care on patient outcomes, which this thesis goes on to fulfill.

Chapter 4 approaches quality comparisons using the causal inference framework by defining the causal

estimand for the SMR when comparing to the national/provincial average level of care, developing a DR esti-

mator for this estimand, proving that the proposed DR estimator consistently estimates the causal estimand

and illustrating the DR property through a simulation study. One of the main assumptions made in Chapter

4 is positivity, which simply states that a patient with any given set of characteristics must have some chance,

however small, of being treated at any hospital. Violation of positivity means that the causal estimand

of interest will not be identifiable. Chapter 5 provides a comparison of direct and indirect standardization

with respect to this assumption and mathematically shows that indirect standardization in comparison to

a national/provincial average is the only method that is not affected by violations of this critical assumption.
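
As a rough, purely empirical illustration of the positivity assumption, the following sketch lists the combinations of hospital and case-mix stratum in which no patients were observed. The function name and record layout are hypothetical, and an empty cell in a finite sample is only a symptom of a possible violation, not a formal test of the assumption.

```python
from collections import Counter


def empty_cells(records):
    """Return the (hospital, stratum) cells that contain no patients.

    records: iterable of (hospital, stratum) pairs, one per patient.
    """
    cells = Counter(records)
    hospitals = sorted({h for h, _ in cells})
    strata = sorted({s for _, s in cells})
    # A missing key in a Counter counts as zero patients.
    return [(h, s) for h in hospitals for s in strata if cells[(h, s)] == 0]


# Hypothetical toy data: hospital B never treats high-risk patients.
patients = [("A", "low"), ("A", "high"), ("A", "high"), ("B", "low")]
print(empty_cells(patients))
```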

Chapter 6 continues under the causal framework and defines the causal estimand for the SMR in the

mediated case, where the effect of the hospital on the outcome may or may not be mediated through

some process of care, and derives the decomposition of the total hospital effect on the outcome. Then two

sets of estimators, one model-based and one semi-parametric, are proposed for this decomposition. The

performance of these estimators is compared through a simulation study and a brief illustration of the use

of the decomposition is presented, in the context of surgical treatment of kidney cancer. Chapter 7 presents

a more detailed analysis using the proposed methods from Chapter 6, and compares two approaches for

dealing with the presence of low volume hospitals, as well as various methods for obtaining the sampling

distribution of the estimates. A discussion on the limitations and potential future directions of this work

can be found in Chapter 8.

1.3 Authorship Contributions

As the papers that make up this thesis have been written with collaborators and are not all first-author

manuscripts, I have outlined the authorship contributions of each paper, in the order that they appear in

the thesis:

1. (Chapter 3): KAL, RA, OS and AF devised the research question. KAL drafted the majority of the

manuscript. KD and OS wrote the statistical methods section of the manuscript, planned and executed

the statistical analysis, and produced all tables and figures for the manuscript. RA ran the analysis

code on location. AK, MG, RJH, AL, PW and AF were involved in the analysis and interpretation of

the results, as well as critical revision of the manuscript.

2. (Chapter 4): OS proposed the research problem. KD and OS jointly developed the proposed methods.

KD drafted the original manuscript and OS helped in the editing and revision process. KD is the

corresponding author.

3. (Chapter 5): KD and OS jointly developed the proposed methods. KD drafted the original manuscript

and OS helped in the editing and revision process.

4. (Chapter 6): OS proposed the research problem. KD and OS jointly developed the proposed methods.

KAL and AF provided access to the data used in the application, along with valuable expert subject-matter

knowledge. KD drafted the original manuscript and OS helped in the editing and revision process.

KD is the corresponding author.

5. (Chapter 7): OS proposed the research problem. KD and OS jointly developed the proposed methods.

KAL and AF provided access to the data, along with valuable expert subject-matter knowledge. KD drafted

the original manuscript and OS helped in the editing and revision process.

Chapter 2

Background and Literature Review

2.1 Quality Comparisons

Quality comparisons refer broadly to the method of assessing individual institutional performance by some

metric (i.e. indicator) used to measure quality, and then comparing across multiple institutions to determine

whether performance/quality is adequate. Indicators used to define performance or quality will depend on

the specific field/area being assessed, whether it be health care or education (Goldstein and Spiegelhalter,

1996). Quality comparisons have been in use for many decades, but increased access to routinely collected

administrative data has renewed interest in institutional quality assessment, especially in health care settings.

2.1.1 Brief History of Quality Comparisons

The need to assess some aspect of an institution’s performance relative to other similar institutions is

natural, especially when trying to identify areas for improvement. Historically, one of the most notable

usages of quality comparisons in the area of health care was by Florence Nightingale (Spiegelhalter, 1999),

who developed a “coxcomb” diagram to display how reforms implemented in a Scutari hospital at which she

was superintendent led to reduced mortality compared to London military hospitals in 1857 (Nightingale,

1858). She also went on to advocate for the collection of hospital and surgical statistics which she claimed

would

“enable us to ascertain the relative mortality of different hospitals, as well as of different diseases

and injuries at the same and at different ages, the relative frequency of different diseases and

injuries among the classes which enter hospitals in different countries, and in different districts

of the same country” (Nightingale, 1863).

Nightingale’s way of assessing quality in health care has been deemed an “epidemiological” approach to

quality comparisons (Spiegelhalter, 1999), in that it focuses on populations rather than individual patients.

An alternative view to quality comparisons, deemed a “clinical” approach, was introduced by Ernest

Amory Codman and is based on a case-by-case assessment of patient outcomes (Spiegelhalter, 1999). Codman’s

“End Result Idea” was for every hospital to follow each patient after treatment for long

enough to evaluate whether treatment was successful and if not, to determine why, so as to avoid similar

failures in the future (Codman, 1934). While this way of assessing quality might be of interest to physi-

cians or surgeons as a means of improving performance through reflective analysis of outcomes, it was not

widely adopted by hospitals. Both Nightingale and Codman saw the value of assessing the quality of care of

hospitals but also highlighted concerns regarding fair comparisons between hospitals and the difficulty of

evaluating care in such a complex, multidisciplinary organization.

In the 1980s and 1990s, the United Kingdom saw an increase in the use of performance/quality indicators

with a view to holding the public sector, in particular the health care and education sectors, accountable for

their activities (Goldstein and Spiegelhalter, 1996; Freeman, 2002; Draper and Gittoes, 2004). Numerous

concerns were raised regarding the reliability of these indicators in terms of the indicators chosen, the

data quality, the statistical methods employed, and the interpretation and impact of the analysis. Much

progress on the statistical aspects of institutional profiling was made, with emphasis on improving case-mix

adjustment methods (Goldstein and Spiegelhalter, 1996; Christiansen and Morris, 1997; Normand et al.,

1997; Burgess et al., 2000; Howley and Gibberd, 2003; Huang et al., 2005; Gajewski et al., 2008; Jones

and Spiegelhalter, 2011), alternative criteria for benchmarking institutions (Normand et al., 1997; Burgess

et al., 2000; Spiegelhalter et al., 2012), and visualization of profiling results (Marshall and Spiegelhalter,

1998; Spiegelhalter, 2005b; Jones et al., 2008). Research and debate on these concerns are still ongoing. The

remainder of this discussion on quality indicators will focus on the health care context.

2.1.2 Choice of Quality Indicator

Quality indicators (QI) can be broadly classified into three main types (Donabedian, 1988), representing the

general components of care which a patient experiences, namely structural elements, process elements and

outcome elements. This decomposition of the care pathway is motivated by the notion that good structural

elements promote good process which in turn leads to good outcomes, but only if there exists a relationship

between them (Donabedian, 1978). There are advantages and disadvantages to the use of each of these

elements as indicators of care (Donabedian, 1988; Birkmeyer et al., 2004), summarized below.

Structural Indicators

Structural indicators are variables that reflect the physical setting in which care is being provided such as

material and human resources and organizational structure. These indicators are meant to be surrogate

measures of quality and are of most use when there is a demonstrable relationship with patient outcomes.

For example, it has been shown that receiving treatment in a high volume hospital leads to fewer post-operative

complications and lower mortality (Begg et al., 2002; Lawson et al., 2017a). The greatest advantage

to using structural indicators to assess quality is that they can be easily and inexpensively extracted from

administrative data. However, evidence of a relationship between structural variables and patient outcomes is sparse,

especially in the case of non-fatal outcomes. Further, structural indicators can only be measured using obser-

vational data so relationships between these indicators and outcomes may be spurious or due to unmeasured

confounding. Most importantly, structural indicators are not easily actionable, i.e. the intervention is vague

or not well-defined (e.g. increase hospital volume, but how to go about achieving this?), and thus they have

limited value for quality improvement initiatives.

Process Indicators

Conversely, process indicators reflect practices that are actually carried out by the hospital in caring for

patients. Here, there is a much clearer relationship between process measures and patient outcomes. Such

indicators can be considered fairer measures of quality because they reflect care that patients actually receive

rather than the proxies for outcomes represented by structural indicators. The most appealing argument

for the use of process measures is that they can be directly acted upon in an effort to improve patient care

because it is clear what the target of an intervention should be (Lilford et al., 2004). The main disadvantage

to using process indicators lies in the need for clinical data, which may be more costly and

difficult to acquire. Such granular disease-specific data is necessary for accurate cohort definitions as each

process indicator is measuring care for a specific procedure/treatment that may not be applicable to the

general patient population (Birkmeyer et al., 2004). Databases now exist, such as the Discharge Abstract

Database in Ontario and similar ones available through the Canadian Institute for Health Information, that

routinely collect data on hospital processes and thus enable the derivation of such indicators.

Outcome Indicators

Outcome indicators broadly reflect the effect of care on the health status of the patient, such as complication

rate, length of stay, mortality, or health-related quality of life. Such outcomes are appealing because they

are often considered the “bottom line” for patients and hospitals alike and thus have face validity. Another

advantage is that the act of measuring outcomes may in turn improve patient outcomes through awareness

(Birkmeyer et al., 2004). However, there are numerous disadvantages to the use of outcomes as indicators

of care. Outcomes often take much longer to measure and require patient follow-up as compared to pro-

cess measures which can often be measured quickly. By extension, some outcomes such as morbidity and

mortality are so infrequent for some diseases that it is not meaningful to assess quality of care on so few

cases. Further, definitions of certain outcomes may vary between hospitals and thus may not be consistently

recorded (Julious et al., 2001). Additionally, when adverse outcomes are observed, it is often not possible

to determine which aspect of the care received is responsible for such outcomes, making it difficult to develop

meaningful quality improvement initiatives (Donabedian, 1988). Finally, it is often difficult to adequately

adjust such outcome indicators for all possible confounders due to the multi-dimensional nature of certain

outcomes (e.g. mortality) and the limited availability of the necessary information to do so.

While all three indicators have distinct advantages, process measures tend to be favoured above all oth-

ers when health care quality improvement is the goal. To this end, QIs are often used to detect hospitals

giving poor care so interventions can be targeted towards those hospitals in need.

2.1.3 Various Uses for Quality Indicators

For care improvement initiatives, interest often lies in identifying hospitals that are providing poor or outlying

care. Detection of such outlier hospitals involves determining whether the observed event rate of the

indicator deviates substantially from some expected event rate according to an alternative level of care (i.e.

reference or benchmark). This deviation can be measured in terms of a standardized difference or a ratio,

and will be discussed further in the next section. The cutoffs for hospitals being substantially different

from the benchmark (i.e. outliers) are often based on statistical significance, adjusted for multiple hospital

comparisons using either Bonferroni or false discovery rate corrections (Jones et al., 2008), but could be

based on clinical significance as well, or some combination of the two. Generally, overall classification of a

hospital as an outlier in care should be made based on multiple QIs representing different facets of the care

being provided to patients. Each QI can be used individually to classify a hospital as an outlier in a single

aspect of care, followed by some sort of aggregation of these classifications to obtain a measure of overall

performance (Lawson et al., 2017a). Such benchmarking or hospital classification analyses only have merit

if there is meaningful variability in the hospital practices being assessed.

There are a number of ways to visualize the variability of hospital-specific QIs within a system and

identify outlying hospitals. The first graphical method is termed a “caterpillar” plot and simply plots each

standardized QI with a corresponding confidence interval for each hospital (Jones et al., 2008). An outlier

hospital is one whose confidence interval does not contain a chosen benchmark level of performance. A natu-

ral extension is to display the results in a “league table” which is a caterpillar plot that has the standardized

QIs ordered from smallest to largest (Marshall and Spiegelhalter, 1998). One issue with league tables is that

they are often made public which causes attention to be placed simply on the rank order of the hospitals, and

especially the “winners” and “losers”, disregarding the extensive statistical noise surrounding the rankings.

While this does promote accountability of the hospital, it may lead to the possibility that hospitals might

manipulate their data or refuse to cooperate with profiling initiatives to avoid a negative ranking (Goldstein

and Spiegelhalter, 1996; Normand and Shahian, 2007). Further, the rankings of each hospital will change

depending on whether a standardized difference or ratio is used as well as with different choices of indicator.

As an alternative, “funnel” plots can be used to display the results of outlier classification (Spiegelhalter,

2005b). Here, the indicator for each hospital is plotted against a measure of its precision and a funnel is

drawn representing 2 standard deviations from a target level of care. Such plots are often used to detect

publication bias in meta-analysis (Egger et al., 1997). Funnel plots provide no means of ranking the hospitals.
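As a rough illustration of the funnel described above, the following sketch flags hospitals whose observed event proportion falls outside limits placed 2 standard deviations from a target. The binomial normal approximation for the standard deviation, the function names and the hospital counts are all assumptions made for this example; they are not taken from the cited references.

```python
import math

def funnel_limits(target, n, width=2.0):
    """Lower/upper limits for a proportion observed on n patients.

    Assumes a binomial normal approximation: sd = sqrt(target * (1 - target) / n).
    """
    sd = math.sqrt(target * (1.0 - target) / n)
    return max(0.0, target - width * sd), min(1.0, target + width * sd)

def classify(events, n, target):
    """Label a hospital relative to the funnel around the target."""
    lower, upper = funnel_limits(target, n)
    rate = events / n
    if rate > upper:
        return "high"
    if rate < lower:
        return "low"
    return "in control"

# Hypothetical hospitals: name -> (number of events, number of patients)
target = 0.10
hospitals = {"A": (12, 100), "B": (30, 150), "C": (5, 200)}
flags = {name: classify(e, n, target) for name, (e, n) in hospitals.items()}
print(flags)
```

Because the limits narrow as the number of patients grows, the same observed proportion can be unremarkable for a small hospital and flagged for a large one.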

The identification of outlying care practices is an attempt to understand the variability in care observed

between hospitals. Variability in the indicators can generally be attributed to data quality, differences in

care practices, random chance, and differences in patient characteristics. The last must be accounted for

in any quality comparison analysis so that variability in the QI reflects only variability in care practices of

hospitals. This is often achieved through the standardization of QIs.

2.2 Standardization Methods

A valid QI should only reflect variability in the quality of care across hospitals, not variations in care due

to patient characteristics. Adjusting for differences in patient populations, i.e. patient case-mix, allows the

removal of differences in care practices due to, for example, larger hospitals treating sicker patients and

thus having possibly worse outcomes (Neuberger et al., 2010). These adjustments are most commonly made

through one of two standardization methods: direct or indirect.

2.2.1 Direct and Indirect Standardization: An Introduction

The choice between using direct or indirect standardization depends on the particular comparison to be made.

Consider the characteristics of two populations of patients shown in Table 2.1 below. The crude event rate is

calculated separately for each population, and does not account for differences in the patient-level stratum

membership between the index and reference populations. Standardization therefore attempts to apply

certain characteristics of an index or study population to another reference or standard population in order

to remove differences in the patient strata between these populations. The crucial difference between direct

and indirect standardization is which population is the “target” or population of interest (Miettinen, 1972):

direct standardization holds the standard or reference population as the target, while indirect standardization

considers the index population as the target.

                           Index population                             Reference population
Stratum membership         $S_1, \ldots, S_K$                           $R_1, \ldots, R_K$
Event rates per stratum    $\alpha_1, \ldots, \alpha_K$                 $\pi_1, \ldots, \pi_K$
No. events per stratum     $S_1\alpha_1, \ldots, S_K\alpha_K$           $R_1\pi_1, \ldots, R_K\pi_K$
Total observed events      $\sum_k S_k\alpha_k$                         $\sum_k R_k\pi_k$
Crude rate                 $\sum_k S_k\alpha_k / \sum_k S_k$            $\sum_k R_k\pi_k / \sum_k R_k$

Table 2.1: Population characteristics needed for standardization for K strata based on patient characteristics in each index and reference population (e.g. age group or gender). The total observed events and crude rates within each population involve calculating the total events or crude rate per patient stratum, and summing over all strata.

Direct standardization can be seen as attempting to standardize the case-mix of the index population so

that it resembles that of the reference population (Pouw et al., 2013). By applying the event rate of each

stratum of the index population (αk) to the membership of each stratum in the reference population (Rk) and then

averaging over all strata, direct standardization calculates the expected rate if the index population had

the same stratum membership as the reference population (Keiding and Clayton, 2014), or alternatively the

expected rate if the reference population experienced the event rate of the index. In hospital profiling terms,

direct standardization calculates the expected outcome had all patients in a health care system received the

care level of a particular hospital. This directly standardized rate (DSR) can be mathematically written as

$$\mathrm{DSR} = \frac{\sum_k R_k \alpha_k}{\sum_k R_k}$$

following the notation in Table 2.1. The DSR can be used to create a standardized difference by subtracting

it from the observed rate in the reference population.

If, instead of a rate, a standardized ratio of the number of events is desired, then a directly standardized

quantity often called the comparative mortality figure (CMF) can be calculated as

$$\mathrm{CMF} = \frac{\sum_k R_k \alpha_k}{\sum_k R_k \pi_k}.$$

The CMF is the ratio of the expected number of events in the reference population, if the event rate were

that of the index population, over the observed number of events in the reference under its own event rate.
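The two directly standardized quantities can be computed from the stratum-level inputs of Table 2.1. The sketch below uses three hypothetical strata; the numbers and function names are illustrative assumptions, not data from this thesis.

```python
# Hypothetical stratum-level inputs following the notation of Table 2.1.
R = [1000, 1000, 1000]        # reference stratum membership R_k
alpha = [0.10, 0.05, 0.20]    # index stratum-specific event rates alpha_k
pi = [0.08, 0.06, 0.15]       # reference stratum-specific event rates pi_k

def directly_standardized_rate(R, alpha):
    """DSR: index rates applied to the reference membership, divided by its size."""
    return sum(r * a for r, a in zip(R, alpha)) / sum(R)

def comparative_mortality_figure(R, alpha, pi):
    """CMF: expected events in the reference under index rates over its observed events."""
    expected = sum(r * a for r, a in zip(R, alpha))
    observed = sum(r * p for r, p in zip(R, pi))
    return expected / observed

dsr = directly_standardized_rate(R, alpha)
cmf = comparative_mortality_figure(R, alpha, pi)
print(round(dsr, 4), round(cmf, 4))
```

With these inputs the reference population would be expected to have 350 events under the index rates, against 290 under its own rates, so the CMF exceeds 1.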

In contrast, indirect standardization can be seen as attempting to standardize the event rate in the

index population so that it resembles that of the reference population (Pouw et al., 2013). To calculate

an indirectly standardized quantity, each stratum-specific event rate from the reference population (πk)

is applied to the membership of each stratum in the index population (Sk) and then an average over the

strata is taken. This yields the expected event rate if the index population had the same event rate as the

reference population (Keiding and Clayton, 2014). For hospital comparisons, indirect standardization yields

the expected outcome among the patients of one hospital if they had received the event rate of some reference

hospital. An indirectly standardized quantity often used is the standardized mortality/morbidity ratio (SMR)

which can be calculated as

$$\mathrm{SMR} = \frac{\sum_k S_k \alpha_k}{\sum_k S_k \pi_k} \equiv \frac{O}{E},$$

a ratio of observed to expected events, which is similar in form to the CMF but uses the stratum membership

of the index instead of the reference population. In the hospital comparison setting, the SMR simply refers to

this ratio of observed to expected events, regardless of whether the quality indicator of interest is mortality,

morbidity or some other process measure of care. If instead an absolute standardized quantity is needed,

the indirectly standardized rate (ISR) may be used. The ISR is calculated as

$$\mathrm{ISR} = \frac{\sum_k R_k \pi_k}{\sum_k R_k} \times \frac{\sum_k S_k \alpha_k}{\sum_k S_k \pi_k},$$

which is just the SMR multiplied by the crude rate from the reference population. This can be subtracted

from the observed event rate in the index to obtain a standardized difference.
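The indirectly standardized quantities follow the same pattern, with the index stratum membership as weights. This is a minimal sketch with hypothetical stratum counts and rates in the notation of Table 2.1; the numbers and function names are assumptions for illustration only.

```python
# Hypothetical stratum-level inputs following the notation of Table 2.1.
S = [100, 200, 50]            # index stratum membership S_k
alpha = [0.10, 0.05, 0.20]    # index stratum-specific event rates alpha_k
R = [1000, 1000, 1000]        # reference stratum membership R_k
pi = [0.08, 0.06, 0.15]       # reference stratum-specific event rates pi_k

def standardized_mortality_ratio(S, alpha, pi):
    """SMR: observed index events over those expected under reference rates (O / E)."""
    observed = sum(s * a for s, a in zip(S, alpha))
    expected = sum(s * p for s, p in zip(S, pi))
    return observed / expected

def indirectly_standardized_rate(S, alpha, R, pi):
    """ISR: crude reference rate multiplied by the SMR."""
    crude_reference = sum(r * p for r, p in zip(R, pi)) / sum(R)
    return crude_reference * standardized_mortality_ratio(S, alpha, pi)

smr = standardized_mortality_ratio(S, alpha, pi)
isr = indirectly_standardized_rate(S, alpha, R, pi)
print(round(smr, 4), round(isr, 4))
```

Here the index population has 30 observed events against 27.5 expected, giving an SMR slightly above 1.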

2.2.2 Case-mix Adjustment Methods

Adjustment for case-mix in the standardization methods above can be achieved in a number of ways. When

the number of strata characterizing the patients is small, calculating the expected number of events can

easily be done in a model-free manner illustrated in the formulae of the previous section. But as the number

of covariates needed to properly adjust for patient differences increases, model-based techniques such as

regression modelling become the preferred method (Spiegelhalter, 2005b).

The simplest regression modelling approach involves regressing the outcome Yi on patient covariates Xi

using either an external standard (i.e. entirely independent reference) or internal standard (i.e. overlapping

with the index) population. When performing direct standardization of the outcome, assuming Yi is Normally

distributed, a linear model would be fit including the covariates Xi as well as a vector of indicator variables

representing fixed hospital effects Zi, as

$$E[Y_i \mid X_i, Z_i, \beta] = \beta_0 + \beta_1' x_i + \beta_2' z_i.$$

In this case, the hospital-specific indicator of quality for the jth hospital corresponds to the jth element of

the vector β2. The number of parameters to be estimated becomes quite large if there are a large number

of hospitals being compared; further fitting hospital-patient interactions will magnify this issue. Often the

outcome will not be Normally distributed and a generalized linear model may be required. Here the hospital

coefficients would no longer correspond directly to the indicator of quality. Indirect standardization, by contrast,

allows for case-mix adjustment without the necessity of including hospital effects and thus avoids estimating

an excessive number of parameters. In the case of a

binary outcome, a generalized linear model of the form

$$p_i = P(Y_i = 1 \mid X_i = x_i, \phi) = \mathrm{expit}\{\phi_0 + \phi_1' x_i\} = RS_i$$

would be used, with RS denoting the risk score. Fitted values pi for each patient in the index population are

extracted then summed over the patients in each hospital z to obtain hospital-specific standardized QIs. For

the case of a model constructed under an external standard population, a prevalence correction (Wijensinha

et al., 1983) can be applied to the fitted values to ensure that the total number of events is the same in both

the study and reference populations (∑i Yi = ∑i pi). As an alternative correction, one could model the risk

score from the external model as a covariate in a secondary regression on the outcome, such as

$$\mathrm{logit}\{p_i^*\} = \phi_0^* + \phi_1^{*\prime}\, \mathrm{logit}(RS_i).$$

A more sophisticated modelling approach involves adding either fixed or random hospital effects (Goldstein

and Spiegelhalter, 1996) to the secondary regression when using an external standard or to the original model

for an internal standard. DeLong et al. (1997) demonstrated that while there was reasonable concordance

between these methods in classifying hospitals as outliers, methods involving random hospital effects gave

more conservative estimates than the others.
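The regression-based version of indirect standardization under an internal standard can be sketched as follows: a logistic model without hospital effects is fitted to all patients, the fitted risk scores are summed within each hospital to give expected events, and these are compared with observed events. The simulated data, the single covariate `severity`, the hospital effects and the small Newton-Raphson fitter are all assumptions made to keep the example self-contained; in practice an established regression routine would be used.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit of a logistic regression; X must contain an intercept column."""
    phi = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ phi)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        phi = phi + np.linalg.solve(hessian, gradient)
    return phi

# Simulated (hypothetical) system of 5 hospitals with differing case-mix and care.
rng = np.random.default_rng(0)
n, n_hosp = 3000, 5
hospital = rng.integers(0, n_hosp, size=n)
severity = rng.normal(loc=0.3 * hospital, scale=1.0, size=n)
hospital_effect = np.array([-0.5, 0.0, 0.0, 0.0, 0.5])
y = rng.binomial(1, expit(-1.0 + 0.8 * severity + hospital_effect[hospital]))

# Case-mix model on the internal standard (all patients), with no hospital terms.
X = np.column_stack([np.ones(n), severity])
phi_hat = fit_logistic(X, y)
risk_score = expit(X @ phi_hat)

# Observed and expected events per hospital, and the resulting O/E ratios.
observed = np.bincount(hospital, weights=y, minlength=n_hosp)
expected = np.bincount(hospital, weights=risk_score, minlength=n_hosp)
smr = observed / expected
print(np.round(smr, 3))
```

Because the model includes an intercept and is fitted to the same patients it is applied to, the total expected events equal the total observed events, so no prevalence correction is needed in this internal-standard sketch.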

A number of statistical issues can arise with these simple regression methods, such as obtaining inaccurate

estimates for small hospitals, clustering of patients within hospitals, and multiple comparisons (Normand

and Shahian, 2007). Hierarchical models can address some of these issues, and a number of methods have

been proposed, such as using posterior tail probabilities of multilevel hierarchical models to profile hospitals

(Normand et al., 1997), using empirical Bayes shrinkage estimators to stabilize the estimates for non-outlying

hospitals (Thomas et al., 1994), or developing approximate Bayesian hospital-level credible intervals for out-

lier classification (Gajewski et al., 2008), to list a few. In a comparison between the simple regression model

approach (developed on an internal standard population) and the hierarchical approach of Normand et al.

(1997), it has been demonstrated (Austin et al., 2001) that there is poor agreement in the classification

of outliers, with the Bayesian method giving conservative estimates. Further comparison between various

Bayesian and frequentist fixed and random effect models shows the Bayesian random effect methods produce

wider credible intervals than the corresponding confidence intervals of the fixed effect methods and thus

correctly classify poor outliers with low frequency (Racz and Sedransk, 2010). While hierarchical models can

help to avoid some statistical issues, if the goal of a hospital profiling analysis is to identify outlier hospitals,

then methods that produce conservative or shrinkage estimates may not be entirely appropriate.

Finally, propensity score methods (Rosenbaum and Rubin, 1983) can correct for covariate imbalance in

hospital comparisons. Shahian and Normand (2008) use the propensity of being treated at each hospital to

assess whether the hospitals can be directly comparable with respect to their patient profiles, and Huang

et al. (2005) adjust for covariate imbalance through stratification by the propensity of membership in a physician group. Compared to

traditional regression approaches for case-mix adjustment in the presence of multiple hospitals, propensity

score approaches, such as stratification, weighting and adjustment, perform similarly, except for propensity

score matching in the case of small sample sizes of hospitals (Brakenhoff et al., 2018). In many observational

data settings, inverse propensity score weighting is often used to estimate causal effects (Funk et al., 2011)

for both direct and indirect standardization comparisons.
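A minimal sketch of inverse propensity score weighting for a directly standardized comparison is given below. It assumes a single discrete case-mix stratum, so that the propensity of being treated at each hospital can be estimated by simple cell proportions; the simulated data and all variable names are hypothetical. With such a saturated propensity model, the weighted estimate coincides with the stratified direct standardization over the pooled patient population, which the second function checks.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_hosp, n_strata = 4000, 3, 4

# Hypothetical data: x is a discrete case-mix stratum, z the treating hospital.
x = rng.integers(0, n_strata, size=n)
assign_probs = np.array([[0.5, 0.3, 0.2],
                         [0.4, 0.3, 0.3],
                         [0.3, 0.3, 0.4],
                         [0.2, 0.3, 0.5]])   # row k: P(Z = z | X = k)
z = np.array([rng.choice(n_hosp, p=assign_probs[k]) for k in x])
y = rng.binomial(1, 0.05 + 0.05 * x + 0.03 * z)

# Saturated propensity estimate: cell counts divided by stratum totals.
cell_counts = np.zeros((n_strata, n_hosp))
np.add.at(cell_counts, (x, z), 1)
propensity = cell_counts / cell_counts.sum(axis=1, keepdims=True)

def ipw_mean(hosp):
    """Weighted estimate of the event rate had all patients been treated at hosp."""
    weights = (z == hosp) / propensity[x, hosp]
    return float(np.mean(weights * y))

def stratified_mean(hosp):
    """Stratum-specific rates of hosp averaged over the pooled stratum distribution."""
    stratum_share = cell_counts.sum(axis=1) / n
    rates = np.array([y[(z == hosp) & (x == k)].mean() for k in range(n_strata)])
    return float(np.sum(rates * stratum_share))

ipw_estimates = [ipw_mean(h) for h in range(n_hosp)]
print(np.round(ipw_estimates, 4))
```

The weights are only defined when every hospital treats at least some patients in every stratum; with continuous or high-dimensional case-mix, the cell proportions would be replaced by a fitted propensity model.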

2.2.3 Direct versus Indirect: Which is appropriate?

There is some debate regarding which standardization method is the most appropriate for comparing events

from different populations, especially in the area of quality of care comparisons for multiple hospitals. Pouw

et al. (2013) note that direct standardization is the most appropriate method to use for ranking the quality

of care because each hospital’s adjusted indicator is based on the same reference population as those of

other hospitals, allowing comparisons to reflect care disparities on a single universal population. Further,

Pouw et al. (2013) state that indirect standardization may be subject to interactions between hospital and

case-mix variables, meaning that hospitals with the same level of care may appear to be providing different

care due to differences in the stratum membership of hospitals.

Despite these criticisms, the indirectly standardized SMR is more often used in practice over the directly

standardized CMF. This may be driven by a number of practical advantages when faced with large numbers

of hospitals. One of these advantages often emphasized is that it does not require estimation of the stratum-

specific rates in the study population (αk from Table 2.1), which can be heavily influenced by random

variability when stratum membership is small (Schoenbach and Rosamond, 2000; Zaslavsky, 2001; Keiding

and Clayton, 2014). Also, direct standardization is often performed using regression models that include both

patient-level covariates as well as hospital effects. The presence of a large number of hospitals often requires

introduction of random effects to be able to estimate such hospital effects. As noted in Normand et al.

(1997), estimates for small hospitals from such multilevel models may shrink towards the population mean

which makes poor performance much more difficult to detect. The use of random hospital effects is however

questionable as such multilevel models assume that the hospitals are a sample from some super-population

of hospitals. In the context of province-wide or nationwide administrative data, the hospitals would not be

considered a sample and so such multilevel models serve as a computational convenience in dealing with small

hospitals through shrinkage. Further, by modelling these hospital effects in direct standardization, it raises

the question of whether hospital-patient interactions should also be modelled (Varewyck et al., 2016) in order

to fully remove all extraneous patient and hospital variability. However, when making quality comparisons

for all hospitals within a province or even a country, modelling hospital effects translates to estimating a

large number of parameters. Thus modelling hospital-patient interactions further contributes to “the curse

of dimensionality” (Varewyck et al., 2016). Finally, smoothing methods may be required to estimate hospital

effects if some hospitals are small. Indirect standardization does not require modelling hospital effects and

therefore avoids these common issues. Despite numerous criticisms regarding the appropriateness of using

indirect standardization for quality comparisons, practical limitations of direct standardization have left

indirect methods as the preferred strategy for case-mix adjustment in quality comparisons. The controversy

regarding the appropriate method of standardization may also be due in part to a lack of understanding of

what the SMR itself represents as a causal quantity.

2.2.4 Hospital Standardized Mortality Ratio for Mortality

One of the most popular applications of indirect standardization and standardized mortality ratios in par-

ticular is to compare overall mortality between hospitals using what is termed the hospital standardized

mortality ratio (HSMR) as seen in Jarman et al. (1999, 2010) and Wen et al. (2008) to name a few. Of inter-

est to all parties involved in a health care system would be reduction in patient and hospital-wide mortality

so it is natural that mortality would be used as an indicator of care. However, while some outcome measures

(of which mortality is one) can have some usefulness as quality indicators (Donabedian, 1978), mortality,

particularly hospital-wide or overall mortality, has a number of conceptual and practical disadvantages as

an indicator of care.

A few advantages to using hospital-wide mortality as an indicator of care would be 1) when hospitals only

admit a small number of cases of a particular diagnosis and thus combining mortality from many diagnoses

could alleviate some sample size concerns, and 2) when the indicator might reflect more structural aspects

of care that mortality based on a single diagnosis would not make apparent (Shahian et al., 2012). However,

the difficulty in using overall hospital-wide mortality as a quality metric is its nature as an aggregation of

mortality across many diagnoses. Due to this aggregation, the HSMR as a screening tool for quality can be

seen as having low sensitivity, i.e. most problems with quality of care do not cause death, as well as having

low specificity, i.e. most deaths do not reflect poor quality (Scott et al., 2011). In addition, due to the

aggregation of disease-specific mortality, an HSMR that indicates a hospital has higher mortality rates than

expected does not provide any information concerning what aspect of the care is deficient and causing such

inflated rates (Lilford and Pronovost, 2010). Thus ranking hospitals based on their overall mortality can be

misleading, especially when consideration is taken of the limitations to case-mix adjustment for hospital-wide

mortality.

Case-mix adjustment is meant to account for differences in the observed hospital-wide mortality due to

variation in the patient case-mix of the hospitals. However, when deaths due to various diseases are com-

bined, case-mix adjustment may not fully capture all case-mix differences. This is either due to recording

inconsistencies or discrepancies in the definitions used to define adjustment variables between hospitals (van

Gestel et al., 2012), or due to omitting important prognostic factors in adjustment, such as how well patients

were previously cared for prior to admission to the current hospital (Lilford et al., 2004). The latter can be

termed a “referral bias” (van Gestel et al., 2012), since the adjustment does not account for patients having

been transferred or referred from other hospitals and so the mortality rates observed may reflect some aspect

of care of the referring hospital. Another issue that arises is the “case-mix adjustment fallacy”, where

after adjusting for case-mix it is believed that the indicator now solely reflects variations in care (Lilford

et al., 2004). The variability in the outcome can be broken down into 4 sources of variation, only one of

which is due to the differential case-mix, while one represents the variation in the quality of care. Thus,

concluding that by adjusting for case-mix, the indicator now solely represents variations in care does not

consider variability also caused by definitions/data quality and chance (Lilford et al., 2004).

Further, it has been shown in Shahian et al. (2010) that the ranking of hospitals based on their HSMR

can change dramatically depending on how the case-mix adjustment is implemented, indicating sensitivity of

rankings to patient exclusion criteria, variable definitions and coding, as well as methodological implementa-

tions. It further highlights the absence of a concrete association between hospital-wide mortality and quality

(Scott et al., 2011; Shahian et al., 2010). It appears there is consensus that hospital-wide mortality should

not be used to measure quality of care, yet there is ongoing interest in investigating the causal link between

measures of quality and patient outcomes, such as disease specific or short term mortality, to determine the

possible impact of changes to care practices on outcomes. However, this thesis will not consider mortality

as an indicator of the quality of care being provided. Chapter 3 of this thesis provides an illustration of

quality comparisons in the context of prostate cancer care in the United States by employing indirect stan-

dardization methods to identify outlying hospitals in care and investigating associations between poor care

and poor patient outcomes.

2.3 Causal Models and Inference

Donabedian (1978, 1988) detailed the importance of the existence of a causal relationship between structural,

process and outcome indicators of care, as discussed in section 2.1.2. Further, standardization can be seen

to have a natural causal interpretation by considering the expected event that a certain patient population

would have had they received the level of care of some other population that may be contrary to the care

they actually received (Zaslavsky, 2001). Yet, despite such hospital comparisons being fundamentally causal

questions, there has been some reluctance in health services research to address these in an explicit causal

framework (Dowd, 2011), including formulating the objects of inference as causal contrasts, and explicitly

stating the assumptions needed for identifying them based on observed data.

2.3.1 Introduction to Causal Inference and Potential Outcomes

Causal inference focuses on the specification and estimation of the causal effect of some exposure on some

outcome (Holland, 1986). As can be seen in Figure 2.1, the causal pathways between the exposure Z and

the outcome Y can be affected by patient covariates X. The influence of X on the relationship between Z and Y is removed under

randomization; however, in non-randomized or observational studies, X would need to be accounted for in

the causal pathway.

[Figure 2.1: Basic causal mechanism. Diagram relating patient covariates X, exposure Z and outcome Y.]

A common causal effect in clinical research, in the simple case of a binary exposure (z ∈ {0, 1}), considers

the change in the outcome for an individual that is a direct result of the exposure, often written as Y1 − Y0

(Rubin, 1974), where Yz represents the outcome that would occur under exposure z. The “Fundamental

Problem of Causal Inference” is that an individual i can only receive one of these two exposures and thus it

is impossible to identify this individual causal effect (Holland, 1986). The unobservable term is denoted the

potential or counterfactual outcome since it reflects the possible outcome value that would have occurred

under an exposure that was contrary to what was given (Hernan, 2004). However, rather than focusing

on the causal effect of an individual, the field of causal inference makes use of the population of units to

compute population average causal effects, such as E[Y1 − Y0], the average treatment effect (ATE), or the

average treatment effect among the treated (ATT), E[Y1 − Y0 | Z = 1]. According to Hernan (2004), the

notion of counterfactuals can be traced as far back as Hume (1748), with the formalization of counterfactuals

for randomized experiments by Neyman in 1923 (Splawa-Neyman et al., 1990), with Rubin (1974) further

extending it to both randomized and non-randomized studies.

For population causal effects like the ATE or ATT, one needs to consider the proportion of individuals

from some population that would have experienced the outcome under exposure z. In direct standardization,

this proportion takes the form P (Yz = 1) ≡ E[Yz], which represents the proportion of individuals who would have

experienced the event had the entire population received exposure z, and is used to estimate effects like

the ATE. Alternatively, indirect standardization gives the proportion of subjects in exposure group z who

would have experienced the event if they had been given exposure z∗, P (Yz∗ = 1 | Z = z), which can be

used to estimate the ATT. In order to estimate these causal proportions using observable data, one needs to

make three assumptions: exchangeability, positivity, and consistency.

A2.1 Exchangeability states that the observed exposure does not predict the counterfactual outcome (Green-

land and Robins, 1986; Hernan, 2004), written mathematically as

P (Yz = 1 | Z = 0) = P (Yz = 1 | Z = 1).

Exchangeability can be achieved through randomization, however for observational data, we require a

modified “conditional exchangeability” assumption, Yz ⊥⊥ Z | X, where we condition on confounding or

case-mix factors X that may influence the effect of the exposure on the outcome (Hernan and Robins,

2006).

A2.2 Positivity states that it is possible for any subject characterized by covariates X = x to receive any

exposure, namely 0 < P (Z = z | X) < 1 for all z and X combinations (Westreich and Cole, 2010).

The purpose of the positivity assumption is to ensure that the conditional distribution P (Y = 1 | Z =

z,X = x) used in the estimation process is well-defined (Hernan and Robins, 2006).

A2.3 Consistency (Cole and Frangakis, 2009) simply states that the observed outcome is equivalent to

the potential outcome for the exposure actually given, which can be written mathematically as Y =

(1− Z)Y0 + ZY1 (Hernan, 2004) in the case of a binary exposure. This can be written more generally

in the case of multiple exposures as Y = YZ , where z ∈ {1, . . . , p} and reflects the notion that out of all

the p exposures, one can only observe the potential outcome corresponding to the exposure received.

These assumptions combined allow the causal estimands using potential outcomes notation to be rewritten

using notation for observable quantities. For example, we may write the ATE in terms of observed data as

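$$
\begin{aligned}
E[Y_1 - Y_0] &= E_X\{E[Y_1 \mid X] - E[Y_0 \mid X]\} \\
&= E_X\{E[Y_1 \mid Z = 1, X] - E[Y_0 \mid Z = 0, X]\} && \text{(A2.1)} \\
&= E_X\{E[Y \mid Z = 1, X] - E[Y \mid Z = 0, X]\} && \text{(A2.3)} \\
&= \sum_x \{E[Y \mid Z = 1, X = x] - E[Y \mid Z = 0, X = x]\}\, P(X = x) && \text{(A2.2)}
\end{aligned}
$$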

where first the exchangeability assumption is used, followed by consistency and finally positivity, and P (X =

x) represents the covariate distribution of the standard population. Positivity ensures that the conditional

distribution P (Y | Z,X) is identifiable. While potential outcomes have been presented here for the case

of a binary exposure, it is straightforward to consider potential outcomes Yz for a multinomial exposure

z ∈ {1, . . . , p}, as would be required for the hospital quality comparison context. This results in many more

potential outcomes to consider and highlights the need to consider causal estimands that are not simply

pairwise comparisons between hospitals, as there would be p(p− 1)/2 such comparisons.
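For a binary exposure and a single binary covariate, the last line of the identification result for the ATE can be evaluated directly from stratum-specific counts. The counts below are hypothetical and chosen so that the covariate is associated with both exposure and outcome; the standard population is taken to be the pooled sample, which is an assumption of this sketch.

```python
# counts[(x, z)] = (number of patients, number of events); hypothetical numbers.
counts = {
    (0, 1): (100, 10), (0, 0): (400, 20),
    (1, 1): (400, 120), (1, 0): (100, 20),
}
strata = sorted({x for x, _ in counts})
n_total = sum(n for n, _ in counts.values())

def rate(x, z):
    """Empirical E[Y | Z = z, X = x]."""
    n, events = counts[(x, z)]
    return events / n

def p_x(x):
    """Empirical P(X = x) in the pooled (standard) population."""
    return (counts[(x, 0)][0] + counts[(x, 1)][0]) / n_total

# Standardized estimate: sum over x of {E[Y|Z=1,x] - E[Y|Z=0,x]} P(X = x).
standardized_ate = sum((rate(x, 1) - rate(x, 0)) * p_x(x) for x in strata)

def crude_rate(z):
    """Event rate among patients with exposure z, ignoring the covariate."""
    n = sum(counts[(x, z)][0] for x in strata)
    events = sum(counts[(x, z)][1] for x in strata)
    return events / n

crude_difference = crude_rate(1) - crude_rate(0)
print(round(standardized_ate, 3), round(crude_difference, 3))
```

In this example the crude difference is more than twice the standardized one, because exposed patients are concentrated in the higher-risk stratum.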

2.3.2 Causal Inference and Quality Indicators

While there is much literature involving standardization methods, as seen in section 2.2.2, very little

addresses quality comparisons in an explicit causal framework, despite their obvious causal nature. However, the

methods in Section 2.2.2 can be used to estimate the expected counterfactual outcome required for estima-

tion of the causal effect of interest (Shahian and Normand, 2008). Depending on whether a risk difference

or ratio is desired to profile hospitals, and the choice of reference or standard population (see section 2.2.1),

the causal effect of interest will take different forms, and thus it is necessary to explicitly define the causal

estimand using the potential outcomes framework. Once the causal estimand for the quality comparison of

interest has been defined, the quality indicator can be obtained by estimating this causal estimand using the

assumptions above.

Rubin et al. (2004) advocated for the use of potential outcomes in profiling educational institutions and

discussed the importance of defining one’s causal quantity of interest explicitly as a means of clarifying the

estimation goals and understanding the limitations of the available data. In the health care setting, Shahian

and Normand (2008) introduce hospital profiling as a causal problem and discuss the notion of counterfactuals

within this context. Their main focus is on evaluating the practicality of making pairwise hospital

comparisons of quality if the distribution of patient covariates is dissimilar. They propose comparing the

distribution of propensity scores between patient populations of the hospitals being profiled, such as the

populations of two separate hospitals, or even one hospital population with the remaining patient popula-

tion of all hospitals.

Varewyck et al. (2014) define a causal estimand for a directly standardized risk difference, which considers

the expected outcome had all individuals in the population been treated at hospital z, namely E[Yz] for each

z ∈ {1, . . . , p}. They further provide a causal estimand for the indirectly standardized mortality ratio, given

by

\[
\mathrm{SMR} = \frac{E[Y_z \mid Z = z]}{\frac{1}{m}\sum_{z^*=1}^{p} E[Y_{z^*} \mid Z = z]},
\]

where the denominator represents the expected outcome had the individuals in hospital z been treated at

an average hospital, denoted by the equally weighted average across all hospitals. They also propose and

compare different strategies for estimating their directly standardized causal estimand E[Yz], yet propose

nothing further for their causal estimand of the SMR, with comparisons being made to an average hospital.
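
As a rough illustration of how the SMR estimand above might be computed, the following sketch forms a simple plug-in version from fitted outcome-model predictions. It is not the estimator of Varewyck et al. (2014) nor the one developed later in this thesis; the function name `smr_plug_in` and the input layout (a matrix `pred` of fitted values for every patient under every hospital) are assumptions made for illustration, and the denominator relies on the exchangeability and consistency assumptions stated earlier.

```python
import numpy as np

def smr_plug_in(y, z, pred):
    """Plug-in SMR for each hospital (illustrative sketch only).

    y    : (n,) observed outcomes
    z    : (n,) hospital labels coded 0, ..., p-1
    pred : (n, p) array, pred[i, k] = fitted E[Y | Z = k, X = x_i]
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z)
    pred = np.asarray(pred, dtype=float)
    p = pred.shape[1]
    smr = np.empty(p)
    for k in range(p):
        own = z == k                      # patients actually treated at hospital k
        observed = y[own].mean()          # estimates E[Y_z | Z = z]
        # Average over hospital k's patients of the equally weighted
        # mean of the predictions across all p hospitals.
        expected = pred[own].mean()
        smr[k] = observed / expected
    return smr

# Toy usage with made-up numbers: two hospitals, two patients each.
toy = smr_plug_in(
    y=[1, 0, 1, 1],
    z=[0, 0, 1, 1],
    pred=[[0.5, 0.5], [0.5, 0.5], [0.5, 1.0], [0.5, 1.0]],
)
```

A value above one would suggest that a hospital's own patients fared worse than expected under the equally weighted average of all hospitals, subject to the outcome model being adequate.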

A subsequent paper by the same authors (Varewyck et al., 2016) discusses the impact on estimation

bias of omitting interactions between hospital effect and case-mix variables. They consider pairwise com-

parisons between hospitals and formulate causal estimands for both directly standardized risk difference and

the indirectly standardized excess risk. Finally, Shinozaki and Matsuyama (2015) express the standardized

risk difference and ratio among the exposed (indirectly standardized measures) as causal estimands in the

context of a binary exposure as E[Yz | Z = 1]. While their setup is restricted to a binary exposure and is not

framed in the hospital profiling context, these methods can be adapted to the comparison of two hospitals,

where the counterfactual represents the expected outcome if patients of z received the care provided by z∗.

While there has been some work framing quality comparisons as a causal problem, each contribution

considers either different standardized measures or different reference populations or care levels. However,

none focus on developing the causal estimand for the indirectly standardized SMR when the reference

population is all hospitals in a province or country and comparisons are being made to the provincial

or national average care level respectively. This particular comparison is relevant for policy makers who

are interested in determining the optimal areas to allocate funds for health care improvement initiatives

(Varewyck et al., 2014) as it considers how hospitals would care for their own patients if they provided

average care. Chapters 4 and 6 develop explicit causal estimands for the SMR when comparing to the average

national or provincial level of care.

2.3.3 A Comment on Indirect Standardization and the Positivity Assumption

The purpose of the positivity assumption is to ensure that there is no set of covariates X for which it is not

possible to receive an exposure/treatment z. This assumption can be likened to identifiability in that it will

not be possible to estimate a causal quantity of interest if there are subsets in X for which we have no data

for exposure z. This assumption is generally used to ensure that a conditional distribution of the potential

outcome on the exposure and covariates can be expressed, which in turn is necessary for the estimation of

the causal quantities of interest.

For the application of causal inference to quality comparisons, the positivity assumption is essentially

saying that for any combination of hospital and patient covariates, there is a chance, however small, that a

patient with covariates determined by X would receive treatment in hospital z. It is especially important

to remember the distinction here that X is a random variable and we have collected realizations of X in x.

Thus when dealing with positivity, we are not saying that for specific realizations of X = x that we observed,

there is a chance to be treated in z. Instead we are saying that, given a patient with similar characteristics

to those in X, there is a non-zero probability that a hospital z would treat such a patient. While there are

a number of patient-level factors which would immediately violate the positivity assumption if included in

the set of X’s, such as the patient’s postal code, if these factors are not confounders then they need not be

included at all. Specifically, X only needs to be the set of covariates that satisfy both the exchangeability

and positivity assumptions. So long as none of the covariates in X contain information about the specific

exposure received (in this case the specific hospital of treatment) that would prevent any patient from being

treated at that hospital, it would appear that positivity should be satisfied except in the application of

this methodology to instances where the disease is rare or requires specialized facilities for treatment that

potentially not all hospitals would be able to offer.

However, in the case where positivity does not hold, it is of interest to know whether this violation

actually affects the estimation when the causal comparison being made is to the average national care level.

It will be shown in Chapter 5 that even when positivity fails, these hospitals only contribute a zero term to

the expected outcome calculation, and therefore we are still considering an average over all hospitals that

treat similar kinds of patients as the index hospital in question, unlike direct standardization, which would be

susceptible to positivity violations.
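
One informal way to look for practical positivity problems in data is to tabulate patients by covariate stratum and hospital and to flag empty cells. The sketch below assumes the covariates in X have already been discretized into integer strata; the function name `empty_cells` and this coding are illustrative assumptions. An empty cell is only a warning sign in a finite sample, not proof that the underlying probability is zero.

```python
import numpy as np

def empty_cells(stratum, hospital, n_strata, n_hospitals):
    """Return (stratum, hospital) pairs in which no patient was observed."""
    counts = np.zeros((n_strata, n_hospitals), dtype=int)
    # Unbuffered accumulation so that repeated index pairs are all counted.
    np.add.at(counts, (np.asarray(stratum), np.asarray(hospital)), 1)
    return [(int(s), int(h)) for s, h in np.argwhere(counts == 0)]

# Toy usage: stratum 1 is never observed at hospital 1.
flagged = empty_cells(stratum=[0, 0, 1], hospital=[0, 1, 0],
                      n_strata=2, n_hospitals=2)
```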

2.3.4 Doubly Robust Estimation of Causal Effects

Model misspecification can be due to a number of reasons, including unmeasured confounders, omission of

observed confounders, and misspecification of the functional form of the model. Doubly robust (DR) methods

are an attempt to overcome or alleviate such issues by incorporating both a propensity score model, P (Z | X),

and an outcome model, E[Y | Z,X], into a single estimator (Bang and Robins, 2005; Funk et al., 2011).

By modelling the effect of patient covariates on the exposure (i.e. propensity score) and the effect of the

exposure and covariates on the outcome (i.e. outcome model), DR methods incorporate information from

the multiple causal pathways in Figure 2.1. In general, a DR estimator for a marginal mean µz under direct

standardization, such as the components of the ATE from section 2.3.1, will combine fitted values from an

outcome model m(X, z, φ) ≡ E[Y | Z,X] and a propensity model e(X, z, γ) ≡ P (Z | X) in the following way

(Funk et al., 2011):

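\begin{align}
\mu_z^{DR} &= n^{-1}\sum_{i=1}^{n} m(x_i, z, \phi) + n^{-1}\sum_{i=1}^{n} \frac{\mathbb{1}\{Z_i = z\}}{e(x_i, z, \gamma)}\left[Y_i - m(x_i, z, \phi)\right] \tag{2.1}\\
&= n^{-1}\sum_{i=1}^{n} \frac{\mathbb{1}\{Z_i = z\}}{e(x_i, z, \gamma)}\, Y_i + n^{-1}\sum_{i=1}^{n}\left[1 - \frac{\mathbb{1}\{Z_i = z\}}{e(x_i, z, \gamma)}\right] m(x_i, z, \phi) \tag{2.2}
\end{align}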

where the nuisance parameters (φ, γ) are estimated using standard statistical techniques, such as maximum

likelihood. In equation (2.1), the second sum, an average of weighted residuals Yi −m(xi, z, φ), converges to zero

when the outcome model is correctly specified, removing the possibly misspecified propensity model from the estimator.

Alternatively, when the propensity model is correctly specified, equation (2.2) shows that the sum containing

1 − 1{Zi=z}/e(xi, z, γ) converges to zero, effectively canceling the contribution of the misspecified outcome

model. The advantage of using DR methods is that they only require one of the included models to be

correctly specified for consistent estimation of the causal effect (Funk et al., 2011).
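
To make the two algebraically equivalent forms concrete, the following sketch evaluates equations (2.1) and (2.2) from already-fitted values. The function names `dr_mean` and `dr_mean_alt` and the argument layout are illustrative assumptions; fitting the outcome and propensity models is left out, and the arrays `m_hat` and `e_hat` stand for m(xi, z, φ) and e(xi, z, γ) evaluated at the estimated nuisance parameters.

```python
import numpy as np

def dr_mean(y, z_obs, z, m_hat, e_hat):
    """Doubly robust estimate of mu_z in the form of equation (2.1)."""
    y, m_hat, e_hat = (np.asarray(a, dtype=float) for a in (y, m_hat, e_hat))
    ind = (np.asarray(z_obs) == z).astype(float)     # indicator 1{Z_i = z}
    # Outcome-model mean plus inverse-probability-weighted residuals.
    return m_hat.mean() + np.mean(ind / e_hat * (y - m_hat))

def dr_mean_alt(y, z_obs, z, m_hat, e_hat):
    """The same estimate written in the form of equation (2.2)."""
    y, m_hat, e_hat = (np.asarray(a, dtype=float) for a in (y, m_hat, e_hat))
    ind = (np.asarray(z_obs) == z).astype(float)
    # Inverse-probability-weighted outcomes plus an augmentation term.
    return np.mean(ind / e_hat * y) + np.mean((1.0 - ind / e_hat) * m_hat)

# Toy usage with made-up fitted values.
toy_21 = dr_mean([1, 0, 1, 0], [1, 1, 0, 0], 1, [0.5] * 4, [0.5] * 4)
toy_22 = dr_mean_alt([1, 0, 1, 0], [1, 1, 0, 0], 1, [0.5] * 4, [0.5] * 4)
```

Both functions return the same number for any inputs, since (2.2) is a rearrangement of (2.1); which form is more convenient depends on whether one wishes to emphasize the outcome-model or the weighting component.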

DR methods were originally developed in the context of missing data and involved the use of a model

for the missingness mechanism and one for the distribution of the complete data (Bang and Robins, 2005).

The extension to the field of causal inference was natural since the counterfactual outcomes are unobserved

for certain exposures. The construction of DR methods for causal inference was originally formulated by

Scharfstein et al. (1999). This was later extended by many (Robins, 2000; Lunceford and Davidian, 2004;

Neugebauer and van der Laan, 2005) with the mathematical theory underlying these methods laid out in

Robins and Rotnitzky (2001) and van der Laan and Robins (2003).

The need for, and importance of, standardizing quality indicators have been emphasized in section 2.2. It has

been demonstrated that inadequate case-mix adjustment or model misspecification can lead to misleading

assessments of quality (Spiegelhalter, 2005a; Neuberger et al., 2010). However, very few DR methods have

been developed in the context of quality comparisons, despite their obvious appeal. Varewyck et al. (2014)

propose a DR estimator for their causal estimand of the directly standardized risk, E[Yz], but they do not

develop an analogous DR estimator for their indirectly standardized causal estimand of the SMR, where

again comparison is being made to the care of an average hospital. Further, Shinozaki and Matsuyama

(2015) also propose a DR estimator for their standardized risk difference and ratio among the exposed;

however, their methods are only applicable to the pairwise comparison of hospitals. When the comparison

of interest is to the national or provincial average care level, no DR estimator has been developed in this

context for indirect standardization. Chapter 4 of this thesis frames the causal estimand of the SMR for this

comparison and develops a corresponding DR estimator.

2.3.5 Traditional Mediation Analysis Methods

Donabedian (1988) emphasized the interest in using outcome measures as indicators of care and also empha-

sized that the pathway of care from structure to process to outcome can only be decomposed and used to

assess care if a causal relationship exists between these components (Donabedian, 1978). Further, there is a

clear desire to use outcomes to assess hospital care, as seen in the discussion around the use of mortality as

an indicator in section 2.2.4. If outcomes are of interest but process measures are more intervenable and a

causal relationship has been shown to exist between the two, then it might be of interest to instead consider

whether observed variations in the outcomes between hospitals are caused by variations in the processes.

Causal mediation analysis can be used to answer such a question by taking into account the causal pathway

from structure to outcome through process.

Mediation analysis methods can be broken down into two main areas: traditional approaches and coun-

terfactual approaches. Traditional mediation analysis methods, popularized by Baron and Kenny (1986)

but preceded by a number of others (Hyman, 1955; Alwin and Hauser, 1975; Judd and Kenny, 1981; Sobel,

1982), consider estimating the causal effect of an exposure Z on an outcome Y that may or may not be

acting through a mediator M , as seen in Figure 2.2. Estimation often proceeds in these methods by fitting

three regression models: one for the outcome, conditional on the mediator, exposure and covariates needed

to adjust for confounding (equation (2.3)), one for the outcome that omits the mediator (equation (2.4)),

and one for the mediator, conditional on the exposure and covariates (equation (2.5)). In the case where

Figure 2.2: A simple mediation model, with exposure Z, mediator M , confounder X and outcome Y .

both the mediator and outcome are continuous, the models would be specified as

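\begin{align}
E[Y \mid Z = z, M = m, X = x] &= \theta_0 + \theta_1 z + \theta_2 m + \theta_3' x \tag{2.3}\\
E[Y \mid Z = z, X = x] &= \eta_0 + \eta_1 z + \eta_3' x \tag{2.4}\\
E[M \mid Z = z, X = x] &= \beta_0 + \beta_1 z + \beta_2' x. \tag{2.5}
\end{align}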

In the original methods of Baron and Kenny (1986), these models did not include adjustment for covariates.

Here the direct effect (Z → Y ) of exposure Z on the outcome Y , unmediated by M , is simply θ1, which can

be regarded as the effect on Y for a fixed value of m (VanderWeele, 2015). The indirect effect (Z →M → Y )

on outcome Y of changes in Z that operate through M is β1θ2. This result is obtained by taking β1, the

effect on the mediator of a unit change in the exposure (Z →M), and plugging this value into m of equation

(2.3), so that the combined coefficient of β1θ2 measures the effect on the outcome that results from the effect

of Z on M . This is often termed the “product method”.

There are other similar methods for estimating direct and indirect effects. The “difference method”

(Susser, 1973) involves fitting two different outcome models: the first includes the exposure and the mediator

(as in equation (2.3)) and the second omits the mediator (as in equation (2.4)). Then an indirect effect

can be estimated by taking the difference between the coefficients of the exposure in the two models, η1 − θ1,

that is, the total effect of the exposure minus its direct effect.

The “MacArthur approach” (Kraemer et al., 2008) allows for non-linear relations among the variables to

qualify as mediators of the exposure-outcome relationship as long as there exists a relationship between the

exposure and the mediator. They further propose that M can be defined as a mediator if Z precedes M and

M precedes Y in time, Z and M are correlated, and either M or the Z ×M interaction is significant in the

outcome model.
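
The product and difference methods can be compared on simulated data with a short sketch such as the one below. The data-generating values, the helper `ols`, and the use of a single continuous confounder are assumptions chosen for illustration. With ordinary least squares and the same covariates in every model, the two indirect effect estimates coincide exactly in linear models without interaction, which the sketch makes visible.

```python
import numpy as np

def ols(design, response):
    """Least-squares coefficients; `design` must include the intercept column."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)                                        # confounder X
z = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)      # binary exposure Z
m = 0.5 + 1.0 * z + 0.3 * x + rng.normal(size=n)              # mediator M
y = 1.0 + 2.0 * z + 1.5 * m + 0.7 * x + rng.normal(size=n)    # outcome Y

ones = np.ones(n)
theta = ols(np.column_stack([ones, z, m, x]), y)   # outcome model (2.3)
eta = ols(np.column_stack([ones, z, x]), y)        # outcome model without M (2.4)
beta = ols(np.column_stack([ones, z, x]), m)       # mediator model (2.5)

direct = theta[1]                        # theta_1
indirect_product = beta[1] * theta[2]    # beta_1 * theta_2
indirect_difference = eta[1] - theta[1]  # eta_1 - theta_1
```

In this simulated setting the true direct effect is 2.0 and the true indirect effect is 1.0 × 1.5 = 1.5, so both estimates can be checked against known values.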

The previous methods can be generalized into the area of path analysis. Path analysis, introduced by

Wright (1921), has laid the foundation for many other approaches to mediation analysis (Judd and Kenny,

1981; Baron and Kenny, 1986; MacKinnon, 2008). These methods are now more commonly referred to as

structural equation modelling (SEM) approaches (VanderWeele, 2015) and allow estimation of direct and

indirect effects through modelling covariance and correlation matrices. However, they are often criticized for

not adequately addressing confounding when inferring a causal relationship. SEM has been adopted into

the counterfactual approach (VanderWeele and Vansteelandt, 2009; Imai et al., 2010a; Pearl, 2011), which

explicitly states confounding assumptions to allow for inference of causal relationships.

2.3.6 Counterfactual Approach to Mediation Analysis

The counterfactual approach to mediation analysis was developed to explicitly state assumptions about con-

founding necessary for a causal interpretation, as well as to address non-linearity and interactions of the

causal relationship (Robins and Greenland, 1992; Pearl, 2001). This approach allows for more generalized

counterfactual- or potential outcomes-based definitions of direct and indirect effects (Robins and Greenland,

1992; Pearl, 2001; VanderWeele, 2009; VanderWeele and Vansteelandt, 2010; Imai et al., 2010a,b), which can

be estimated using regression techniques similar to above, provided that certain no confounding assumptions

hold.

The potential outcomes notation used in mediation analysis (Robins and Greenland, 1992; Pearl, 2001) is

similar to that of section 2.3.1; however, it is extended to refer to the joint exposure (Z,M) by creating

a double subscript potential outcome notation, Yzm. This refers to the potential outcome had the subject

received exposure z while having their mediator value fixed at level m. Since the level of the mediator can

change depending on the exposure, it is also necessary to define a potential mediator Mz, which denotes

the value the mediator would naturally take if the subject had received exposure z. Using this potential

outcome and mediator notation, one can define four causal effects that may be of interest when considering,

for example, a risk difference:

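\begin{align*}
\mathrm{CDE} &= E[Y_{zm} - Y_{z^*m}] && \text{(controlled direct effect)}\\
\mathrm{NDE} &= E[Y_{zM_{z^*}} - Y_{z^*M_{z^*}}] && \text{(natural direct effect)}\\
\mathrm{NIE} &= E[Y_{zM_z} - Y_{zM_{z^*}}] && \text{(natural indirect effect)}\\
\mathrm{TE} &= E[Y_{zM_z} - Y_{z^*M_{z^*}}] && \text{(total effect)}
\end{align*}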

The CDE reflects the expected effect on Y of receiving exposure z instead of z∗ while the mediator is held

fixed at level m. The NDE represents the effect of the exposure on the outcome that would remain if we

were to disable the pathway from Z →M (VanderWeele, 2015) by letting the mediator take its natural value

under z∗. The NIE captures the effect on Y of changing the mediator from its natural value under z∗ to its

natural value under z, while the exposure itself is held at z. Finally, the TE captures the combined effect of the exposure on

the outcome that may or may not go through the mediator. One property of the counterfactual approach

is that, when considering a risk difference, the total effect is decomposed into the sum of the natural direct

and indirect effects, TE = NIE + NDE, in the following way:

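\begin{align*}
\mathrm{TE} &= E[Y_{zM_z} - Y_{z^*M_{z^*}}]\\
&= E[Y_{zM_z} - Y_{zM_{z^*}} + Y_{zM_{z^*}} - Y_{z^*M_{z^*}}]\\
&= E[Y_{zM_z} - Y_{zM_{z^*}}] + E[Y_{zM_{z^*}} - Y_{z^*M_{z^*}}]\\
&= \mathrm{NIE} + \mathrm{NDE}
\end{align*}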

A number of assumptions are needed in order for these causal estimands to be estimated using the

observed data and simple regression models:

A2.4 Consistency of the outcome: similar to assumption A2.3 in the unmediated case; for the subgroup with

observed exposure Z = z and observed mediator M = m, the observed outcome Y is equal to Yzm.

A2.5 Consistency of the mediator: again similar to assumption A2.3; here the observed mediator for subjects

with exposure Z = z will be equal to Mz.

A2.6 No unmeasured confounding of the exposure-outcome relationship: Yzm ⊥⊥ Z | X, which is similar to

the exchangeability assumption in A2.1

A2.7 No unmeasured confounding of the mediator-outcome relationship: Yzm ⊥⊥M | (Z,X)

A2.8 No unmeasured confounding of the exposure-mediator relationship: Mz ⊥⊥ Z | X

A2.9 No unmeasured confounders of the mediator-outcome relationship that are effects of the exposure:

Yzm ⊥⊥Mz∗ | (Z,X)

Assumptions A2.4 and A2.5 (and similarly A2.3) are always made in causal inference, and there is some de-

bate regarding whether these are assumptions or simply definitions (Cole and Frangakis, 2009). To estimate

the CDE, only A2.6 and A2.7 need to be assumed, while the NDE and NIE require the addition of A2.8 and

A2.9 (VanderWeele, 2009). Randomization of the exposure would only address confounding of the Z → Y

and Z →M relationships. In general, even if the exposure is randomized to subjects, the mediator will not be,

so randomization does not satisfy all no confounding assumptions in mediation analysis (Judd and Kenny,

1981; James and Brett, 1984; MacKinnon, 2008).

A central reason for the development of counterfactual mediation analysis was to incorporate exposure-

mediator interactions into the estimation of the causal effects (Robins and Greenland, 1992; Pearl, 2001).

For a continuous outcome and mediator, one can incorporate this interaction by modifying the model in

equation (2.3) as

\[
E[Y \mid Z = z, M = m, X = x] = \theta_0 + \theta_1 z + \theta_2 m + \theta_3' x + \theta_4 z m.
\]

In conjunction with the mediator model of equation (2.5), the various effects of interest can be estimated in

a similar manner to the previous section as:

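\begin{align*}
\mathrm{CDE} &= (\theta_1 + \theta_4 m)(z - z^*)\\
\mathrm{NDE} &= (\theta_1 + \theta_4\beta_0 + \theta_4\beta_1 z^* + \theta_4\beta_2' x)(z - z^*)\\
\mathrm{NIE} &= (\theta_2\beta_1 + \theta_4\beta_1 z)(z - z^*).
\end{align*}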

If no exposure-mediator interaction exists, then the interaction term in the above model will have coefficient

θ4 = 0, resulting in the following estimates:

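\begin{align*}
\mathrm{CDE} &= (\theta_1 + 0 \cdot m)(z - z^*) = \theta_1(z - z^*)\\
\mathrm{NDE} &= (\theta_1 + 0 \cdot \beta_0 + 0 \cdot \beta_1 z^* + 0 \cdot \beta_2' x)(z - z^*) = \theta_1(z - z^*)\\
\mathrm{NIE} &= (\theta_2\beta_1 + 0 \cdot \beta_1 z)(z - z^*) = \theta_2\beta_1(z - z^*).
\end{align*}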

Here the CDE and NDE are equivalent to θ1(z − z∗) and identical to the direct effect of Baron and Kenny

(1986) when z − z∗ = 1, while the NIE is equal to θ2β1(z − z∗), which will be equal to θ2β1, the indirect

effect of Baron and Kenny (1986), also when z − z∗ = 1. However, in general, the product method and the

difference method do not yield indirect effect estimates that agree with those given by the counterfactual

method (MacKinnon and Dwyer, 1993; VanderWeele and Vansteelandt, 2010). By using the counterfactual

approach to mediation analysis, in contrast, it is possible to decompose the total effect into its natural direct

and indirect components even in the presence of exposure-mediator interactions or non-linearity, unlike with the methods

of Preacher et al. (2007) and Kraemer et al. (2008). It also coincides with the criteria for mediation outlined

in the MacArthur approach (Kraemer et al., 2008).
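
The closed-form expressions for the CDE, NDE and NIE under the interaction model can be evaluated directly once the regression coefficients are available. The sketch below assumes a single scalar covariate x and takes the coefficients as given; the function name `mediation_effects` and the example coefficient values are hypothetical.

```python
def mediation_effects(theta, beta, z, z_star, m, x):
    """Evaluate CDE, NDE, NIE and TE for the linear model with a Z x M interaction.

    theta = (theta0, theta1, theta2, theta3, theta4) from the outcome model
    beta  = (beta0, beta1, beta2) from the mediator model (2.5)
    """
    t0, t1, t2, t3, t4 = theta
    b0, b1, b2 = beta
    dz = z - z_star
    cde = (t1 + t4 * m) * dz
    nde = (t1 + t4 * (b0 + b1 * z_star + b2 * x)) * dz
    nie = (t2 * b1 + t4 * b1 * z) * dz
    # On the difference scale the total effect is the sum NDE + NIE.
    return {"CDE": cde, "NDE": nde, "NIE": nie, "TE": nde + nie}

# Hypothetical coefficients, with and without an exposure-mediator interaction.
with_interaction = mediation_effects(
    theta=(1.0, 2.0, 1.5, 0.5, 0.25), beta=(0.5, 1.0, 0.25),
    z=1, z_star=0, m=2, x=1,
)
no_interaction = mediation_effects(
    theta=(1.0, 2.0, 1.5, 0.5, 0.0), beta=(0.5, 1.0, 0.25),
    z=1, z_star=0, m=2, x=1,
)
```

Setting the interaction coefficient to zero recovers θ1 for both direct effects and θ2β1 for the indirect effect when z − z∗ = 1, in line with the reduction described above.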

Each of the above effects has particular usefulness depending on the context of the analysis. The CDE

may be important for policy purposes in certain situations, as it considers the effect of the exposure if

an intervention were to fix the mediator level at a specific value (Pearl, 2001; Robins, 2003; VanderWeele,

2013). Alternatively, the NDE and NIE may be relevant for evaluating the actions of various mechanisms

and determining the importance of different pathways, as well as for effect decomposition (Robins, 2003;

Joffe et al., 2007). In the context of hospital comparisons, traditional mediation analysis of the effect of

certain structural characteristics of the hospital, such as for-profit status of nursing homes (Flynn et al.,

2010) and academic vs. non-academic hospitals (Rochon et al., 2014), has been considered. However, when

the hospital itself is considered as the exposure, interest lies in the decomposition of the hospital effect

on whatever outcome measure is being considered.

Such causal effect decompositions have been formulated for a number of different measures though not

specifically in the context of hospital comparisons. VanderWeele (2009) developed a decomposition for both

the risk and mean differences. Vansteelandt and VanderWeele (2012) derive the total effect decomposition

for the risk difference (RD) among the exposed as

\begin{align*}
E[Y_{zM_z} - Y_{z^*M_{z^*}} \mid Z = z] &= E[Y_{zM_z} - Y_{z^*M_z} \mid Z = z] + E[Y_{z^*M_z} - Y_{z^*M_{z^*}} \mid Z = z]\\
\Rightarrow\ RD^{TE} &= RD^{NIE} + RD^{NDE}.
\end{align*}

Finally VanderWeele and Vansteelandt (2010) also provide a decomposition for the odds ratio as

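\begin{align*}
\frac{P(Y_{zM_z} = 1)/\{1 - P(Y_{zM_z} = 1)\}}{P(Y_{z^*M_{z^*}} = 1)/\{1 - P(Y_{z^*M_{z^*}} = 1)\}}
&= \frac{P(Y_{zM_z} = 1)/\{1 - P(Y_{zM_z} = 1)\}}{P(Y_{zM_{z^*}} = 1)/\{1 - P(Y_{zM_{z^*}} = 1)\}}
\times \frac{P(Y_{zM_{z^*}} = 1)/\{1 - P(Y_{zM_{z^*}} = 1)\}}{P(Y_{z^*M_{z^*}} = 1)/\{1 - P(Y_{z^*M_{z^*}} = 1)\}}\\
\Rightarrow\ OR^{TE} &= OR^{NDE} \times OR^{NIE}
\end{align*}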

which differs from that of the risk difference among the exposed by being multiplicative in nature rather than

additive. The methods used to estimate such decompositions can be classified into parametric model-based

estimators (VanderWeele, 2009; Baron and Kenny, 1986) or semi-parametric weighted estimators (Lange

et al., 2012). However, there is no total effect decomposition for the SMR in the hospital profiling context.

Chapter 6 of this thesis thus provides a TE decomposition for the SMR when comparison is made to the

provincial or national average care level and develops model-based and semi-parametric estimators for this

decomposition. Chapter 7 illustrates the use of the proposed mediation methodology on Ontario kidney

cancer data.
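
The multiplicative nature of the odds ratio decomposition discussed earlier in this section can be checked numerically with a small sketch. The three counterfactual probabilities passed in are hypothetical values, not estimates from any data, and the function names are chosen for illustration only.

```python
def odds(p):
    """Odds corresponding to a probability p in (0, 1)."""
    return p / (1.0 - p)

def odds_ratio_decomposition(p_z_mz, p_z_mzstar, p_zstar_mzstar):
    """Return (OR_TE, OR_NDE, OR_NIE) from three counterfactual probabilities.

    p_z_mz         : P(Y_{z M_z} = 1)
    p_z_mzstar     : P(Y_{z M_{z*}} = 1)
    p_zstar_mzstar : P(Y_{z* M_{z*}} = 1)
    """
    or_nie = odds(p_z_mz) / odds(p_z_mzstar)          # mediator changes, exposure held at z
    or_nde = odds(p_z_mzstar) / odds(p_zstar_mzstar)  # exposure changes, mediator held at M_{z*}
    or_te = odds(p_z_mz) / odds(p_zstar_mzstar)
    return or_te, or_nde, or_nie

# Hypothetical probabilities for illustration.
or_te, or_nde, or_nie = odds_ratio_decomposition(0.5, 0.4, 0.2)
```

The product of the direct and indirect odds ratios reproduces the total effect odds ratio because the intermediate odds cancel, whereas on the risk difference scale the corresponding terms add.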

2.4 Thesis Contributions

This thesis begins, in Chapter 3, with an illustration of the current method for profiling hospital care using

indirectly standardized mortality ratios. The analysis emphasizes the need for sufficient case-mix adjustment

and proper model specification, in addition to linking assessments of quality of care to patient outcomes.

I then adopt the causal inference framework in Chapter 4 to propose a causal estimand for the indirectly

standardized SMR when the reference level of care is the provincial average, followed by a novel doubly

robust estimator to address model misspecification. In Chapter 5, I prove that the causal estimand proposed

in Chapter 4 is not susceptible to violations of the positivity assumption in Section 2.3.1. Chapter 6 extends

the causal inference framework by adapting mediation analysis methods for the indirectly standardized

SMR. I define the causal estimand in the mediated case and prove the existence of a meaningful total

effect decomposition. I further propose two novel sets of estimators for this decomposition which can be

used to quantify the impact of improvements to care on patient outcomes. Finally, Chapter 7 illustrates

the use of the proposed methods of Chapter 6 on Ontario kidney cancer data to quantify the effect of an

intervention on minimally invasive surgery on surgical length of stay. I further consider two approaches for

fitting multinomial hospital assignment models in the presence of small hospitals, as well as a number of

approaches for obtaining estimated confidence intervals.

Chapter 3

Prostate cancer quality of care disparities and their impact on patient mortality

3.1 Abstract

Background: A paucity of real-world data exists highlighting the degree to which prostate cancer quality

of care variations occur at a provider-level, independent of differences in case-mix. Further, it is unknown

whether such variations lead to differences in patient outcomes.

Objectives: To benchmark hospital-level performance across a composite of expert-defined quality indica-

tors (QIs) and subsequently determine associations between hospital-level quality and prostate cancer patient

mortality.

Design, settings, participants: Men diagnosed with localized prostate cancer were identified from the

National Cancer Database between 2004 and 2014. Two cohorts were evaluated: a training cohort, utilized to

benchmark hospital-level quality across individual QIs, and a validation cohort, utilized to estimate associations

between composite hospital-level quality performance and patient mortality.

Outcome measures and statistical analysis: Hospital-level quality of care was measured across a mul-

tidisciplinary panel of previously reported disease-specific, expert-defined QIs. A composite measure of

prostate cancer quality, the prostate cancer quality score (PC-QS), was derived, and associations between PC-QS

and patient demographics and outcomes as well as hospital features were assessed.

Results: After adjusting for case-mix, 2-38% of hospitals were identified as performing below the national

average for a given QI. Hospitals with higher quality scores displayed larger referral volumes and were more

commonly academic-affiliated. Prostate cancer patients treated at hospitals with higher quality scores had

improved overall survival rates (adjusted hazard ratio [confidence interval]: 0.93 [0.87-0.98]). After

adjusting for hospital-level quality, significant racial and insurance status outcome disparities persist.

Conclusions: Data-driven benchmarking of hospital-level quality performance reveals the widespread dis-

parities that exist in prostate cancer care and their negative effects on patient outcomes.

Patient summary: We employed a statistical benchmarking method to reveal that widespread variations in

prostate cancer quality of care are due to provider-level performance deficiencies. In turn, our analysis

highlights that provider-level performance variations impact patient outcomes, including overall survival.

3.2 Introduction

Prostate cancer has been a focus for quality indicator (QI) development not only given its prevalence but

also the highly variable patterns of care reported that patients receive (Hoffman et al., 2014; Chamie et al.,

2015). To date, significant effort towards defining optimal QIs for prostate cancer care has been made, with

an initial set of disease-specific indicators developed by the RAND organization in 2000 and subsequently

expanded by the PCPI (Physician Consortium for Performance Improvement), PQRS (Physician Quality

Reporting System) and NQF (National Quality Forum) (Spencer et al., 2003; Herrel et al., 2016).

Despite many QIs being proposed, a paucity of data exists demonstrating their benchmarking validity.

While patient- and provider-level variation for several of these metrics has been reported, these studies have

not been comprehensive and are limited by a combination of inadequate case-mix adjustment, small sam-

ple sizes or the use of Medicare claims data (Spencer et al., 2008; Schroeck et al., 2014a). Moreover, data

demonstrating whether adherence to these metrics is associated with improved patient outcomes has been

either negative or lacking (Schroeck et al., 2014b). Further, a number of these indicators are not readily

captured within comprehensive cancer databases limiting their feasibility. Consequently, widespread adop-

tion of existing prostate cancer QIs into real-world quality benchmarking programs has not been realized,

underscored by recent efforts to define alternative metrics (Nag et al., 2016).

Given the lack of rigorous data validating existing prostate cancer QIs as quality benchmarking tools, the

true extent and clinical impact of variations in prostate cancer care are currently unknown. To better clarify

this, prospectively captured data from the United States National Cancer Database (NCDB) were analyzed

to systematically determine associations between hospital-level performance across readily identified prostate

cancer QIs and patient mortality (Boffa et al., 2017). A composite measure of prostate cancer quality (PC-

QS; prostate cancer quality score) was derived, and associations between PC-QS and patient mortality,

hospital type and geographical location as well as race and insurance status were determined.

3.3 Methods

3.3.1 Data

This study employed the National Cancer Database (NCDB), which prospectively collects hospital-level data from Commission on Cancer accredited facilities across the United States and Puerto Rico. Data are collected through standardized templates by certified tumour registrars, with over 34 million individual patient records across 1500 hospitals accumulated as of 2016 (Boffa et al., 2017). The NCDB covers approximately 70% of newly diagnosed cancer patients nationwide, making it the largest clinical cancer registry in the world. This study was approved by the research ethics boards at the University Health Network in Toronto, Ontario (Study Number 16-5978) and the Cleveland Clinic (Study Number 17-1630).

3.3.2 Study cohort

International Classification of Diseases for Oncology (3rd Edition) codes were used to identify all patients with a diagnosis of prostate cancer within the NCDB between 2004 and 2014. Notably, data prior to 2004 were excluded due to incomplete comorbidity information, whereas data on active surveillance were only available, and thus incorporated into the analysis, from 2010 onwards. Patients with metastatic disease were excluded from this analysis.

3.3.3 Measurement of quality of care

Hospital-level quality of care was benchmarked according to published prostate cancer-specific QIs previously identified through a modified Delphi process by expert consensus panels (Spencer et al., 2003; Nag et al., 2016). Only those QIs readily captured through the NCDB were employed in this analysis. A summary of all 10 QIs utilized, including inclusion and exclusion criteria, is presented in Table S3.1.

3.3.4 Statistical Analysis

Hospitals in the NCDB were randomly divided into two groups (a training set and a validation set) to facilitate determination of individual QIs for inclusion in the PC-QS, and subsequent independent validation. Descriptive statistics on the training cohort are included in Tables S3.2-S3.3.
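The hospital-level split described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `split_hospitals`, the 50/50 fraction, the seed, and the toy patient records are all assumptions made for illustration. The point it shows is that randomization happens over hospitals, so every patient follows the hospital that treated them and no hospital contributes to both sets.

```python
import random

def split_hospitals(hospital_ids, train_fraction=0.5, seed=2019):
    """Randomly partition hospital identifiers into training and validation sets.

    train_fraction and seed are illustrative assumptions, not values from the study.
    """
    ids = sorted(set(hospital_ids))   # de-duplicate; sort so the result is reproducible
    rng = random.Random(seed)         # local generator, leaves global random state alone
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return set(ids[:n_train]), set(ids[n_train:])

# Hypothetical patient records: each patient belongs to exactly one hospital.
patients = [
    {"patient": 1, "hospital": "H01"},
    {"patient": 2, "hospital": "H01"},
    {"patient": 3, "hospital": "H02"},
    {"patient": 4, "hospital": "H03"},
    {"patient": 5, "hospital": "H04"},
]

train_ids, valid_ids = split_hospitals(p["hospital"] for p in patients)
train_patients = [p for p in patients if p["hospital"] in train_ids]
valid_patients = [p for p in patients if p["hospital"] in valid_ids]
```

Splitting by hospital rather than by patient keeps the validation hospitals entirely unseen during QI selection.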

In the training set, case-mix adjusted QIs were derived through indirect standardization, whereby the standardized ratio of observed to expected performance was calculated for each hospital, adjusting for all clinicopathological variables within the NCDB (Figure S3.1A-J), as previously reported by our group (Lawson et al., 2017a). To detect outlier hospitals, the adjusted QIs were compared to the national average using z-tests, with p-values calculated on the logit scale for all indicators except length of stay (LOS) and time to treatment (log scale) (Spiegelhalter, 2005b). Notably, when assessing time-trends for outlier status, year of surgery and year of diagnosis were removed from the case-mix adjustment. Between-hospital heterogeneity in the adjusted QIs was assessed through the meta-analysis I² statistic and Q-test. For each QI, hospitals were classified as poor outliers or superior outliers based on performing worse or better than the national average, respectively, using a Bonferroni-corrected p-value threshold at the 5% level (Jones et al., 2008).
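The outlier rule in this paragraph can be sketched in Python for a proportion-type QI. This is a simplified illustration under stated assumptions, not the study's implementation: the expected counts are taken as given (in the study they come from the case-mix model), the standard error is a delta-method approximation evaluated at the national rate, the Bonferroni correction is assumed to be over the number of hospitals, and the function and field names (`classify_outliers`, `observed`, `expected`, `n`) are hypothetical.

```python
import math
from statistics import NormalDist

def logit(p):
    return math.log(p / (1.0 - p))

def classify_outliers(hospitals, national_rate, higher_is_better, alpha=0.05):
    """Label each hospital as 'poor', 'superior' or 'non-outlier' for one QI."""
    threshold = alpha / len(hospitals)   # Bonferroni correction (assumed: per hospital)
    std_normal = NormalDist()
    labels = {}
    for h in hospitals:
        # Indirect standardization: observed/expected ratio scaled by the national rate.
        adjusted = (h["observed"] / h["expected"]) * national_rate
        adjusted = min(max(adjusted, 1e-6), 1.0 - 1e-6)   # keep the logit finite
        # Approximate SE of the logit under the null (delta method; an assumption).
        se = 1.0 / math.sqrt(h["n"] * national_rate * (1.0 - national_rate))
        z = (logit(adjusted) - logit(national_rate)) / se
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))    # two-sided z-test
        if p_value >= threshold:
            labels[h["name"]] = "non-outlier"
        elif (z > 0) == higher_is_better:
            labels[h["name"]] = "superior"
        else:
            labels[h["name"]] = "poor"
    return labels

# Toy data for a QI where a higher proportion is worse (e.g. positive margins).
hospitals = [
    {"name": "A", "n": 1000, "observed": 300, "expected": 200.0},
    {"name": "B", "n": 1000, "observed": 100, "expected": 200.0},
    {"name": "C", "n": 100, "observed": 21, "expected": 20.0},
]
labels = classify_outliers(hospitals, national_rate=0.2, higher_is_better=False)
```

For the two continuous QIs the same comparison would be made on the log scale rather than the logit scale.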


The PC-QS composite score was constructed by including all individual QIs demonstrating significant interhospital variation in the training set, based on the Q-test. Further, QIs for which superior hospital-level performance was associated with inferior patient outcomes were excluded, to match our a priori hypothesis that poor quality is associated with inferior outcomes. Patient outcomes analyzed at the hospital level included the need for salvage therapy (surgery or radiation), androgen deprivation therapy (ADT) initiation, 30-day mortality, 90-day mortality and overall mortality. The three mortality outcomes are directly reported within the NCDB. Salvage therapy was defined as the receipt of any additional local therapy, and was identified from the NCDB using the Radiation Surgery Sequence variable. Notably, in this study ADT initiation was used as a surrogate for progression to advanced disease, and was reported as the time of hormonal therapy administration, excluding those cases wherein ADT was given within 3 months of primary radiation therapy. In all cases, quality-outcome associations were derived after adjusting for case-mix variation between hospitals for all possible clinicopathological variables within the NCDB (Figure S3.1A-J). Outcome models were fitted using generalized estimating equations to account for within-hospital correlation. For each hospital, the PC-QS represents the summative performance across QIs, where superior outlier status receives one point, poor outlier status deducts one point, and non-outlier status receives zero points.
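The scoring rule at the end of this paragraph is simple enough to state as code. The Python sketch below assumes each hospital's outlier status per QI has already been determined; the dictionary keys and the function name `pc_qs` are hypothetical labels chosen for the example, and the toy statuses are not study data.

```python
# Points per outlier status, as described in the text.
POINTS = {"superior": 1, "non-outlier": 0, "poor": -1}

def pc_qs(status_by_qi, included_qis):
    """Sum the points over the QIs retained in the composite score."""
    return sum(POINTS[status_by_qi[qi]] for qi in included_qis)

# Toy hospital: statuses on five QIs, one of which is excluded from the composite.
status_by_qi = {
    "positive_margin_T2": "superior",
    "length_of_stay": "poor",
    "readmission": "non-outlier",
    "lymph_node_dissection": "superior",
    "time_to_treatment": "superior",   # excluded below, so it does not count
}
included_qis = [
    "positive_margin_T2",
    "length_of_stay",
    "readmission",
    "lymph_node_dissection",
]
score = pc_qs(status_by_qi, included_qis)   # 1 - 1 + 0 + 1
```

Because each QI contributes at most one point in either direction, the score is bounded by the number of included QIs, and a hospital with offsetting superior and poor results lands near zero.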

To validate the clinical utility of the PC-QS, associations between hospital-level PC-QS and overall patient mortality were investigated in the validation set of hospitals using regression models analogous to those described above. Similarly, associations between PC-QS and hospital volume, location and type of institution were also determined. P-values comparing hospitals with positive versus negative PC-QS scores (zero scores omitted) were computed using chi-square tests for hospital location and two-sample Wilcoxon tests for hospital volume and type. We investigated associations of race and insurance status with patient outcomes by fitting regression models as above, while adjusting for both patient case-mix and the PC-QS value for the hospital in which each patient was treated. Statistical analyses were performed in SAS software version 9.3 (SAS Institute, NC, USA) and the R statistical environment version 3.4.1 (R Foundation for Statistical Computing, Vienna, Austria).

3.4 Results

Variations in prostate cancer care were assessed at the hospital level across the training set of hospitals (N=600) according to 10 disease-specific, expert-defined QIs (Table S3.1). After adjustment for case-mix variation at each site (all variables included in Figure S3.1A-J), both mixed models and I² statistics demonstrated significant interhospital variation across all QIs assessed, with p < 0.001 (I²: 88-99.5%) (Figure 3.1). Collectively, between 2% and 38% of hospitals performed significantly below the national average (i.e. poor outliers) as defined by one of the 10 QIs assessed (Figure 3.1). Notably, outlier status was consistent across the study period, with minimal crossover being observed (Figure S3.2).

Interestingly, minimal concordance for identifying the same outlier hospitals was observed across the individual QIs (Cronbach's α = 0.28), suggesting that a composite measure capable of integrating performance across the QIs was required to benchmark hospital-level quality of care with high content validity (Figure 3.2). Hence, we derived a composite measure of prostate cancer quality, the PC-QS, by integrating the performance of each hospital across QIs that demonstrated significant interhospital variation. Importantly, we excluded two QIs (time to treatment and active treatment proportion), as superior performance on these metrics was associated with poor patient outcomes (Figure S3.3), highlighting a lack of construct validity for these individual QIs.
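The concordance statistic quoted in this paragraph, Cronbach's α, can be computed with the textbook formula α = k/(k - 1) × (1 - Σ item variances / variance of totals). The Python sketch below treats each QI as an item and each hospital as a subject, with items coded -1, 0 or 1 by outlier status; that coding, the function name, and the toy data are assumptions for illustration, since the exact inputs used in the study are not specified here.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one QI's scores for the same n hospitals."""
    k = len(items)
    sum_item_variances = sum(variance(scores) for scores in items)
    totals = [sum(values) for values in zip(*items)]   # per-hospital total across QIs
    return (k / (k - 1)) * (1.0 - sum_item_variances / variance(totals))

# Two QIs that flag exactly the same hospitals: perfect concordance.
concordant = cronbach_alpha([[1, 0, -1, 0], [1, 0, -1, 0]])

# Two QIs that flag different hospitals: no concordance.
discordant = cronbach_alpha([[1, 0, -1, 0], [0, 1, 0, -1]])
```

The first call returns 1 and the second returns 0 (up to floating-point rounding), which brackets the reported value of 0.28 and shows why a low α motivates combining the QIs rather than relying on any single one.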

Widespread variation was observed in the PC-QS when applied to a separate validation set of over 600 NCDB hospitals (Figure S3.4). To assess the construct validity of this metric and determine whether variations in quality of care impact patient outcomes, associations between PC-QS and patient mortality were derived. Overall, prostate cancer patients treated at hospitals with a positive vs. negative PC-QS score displayed lower rates of overall mortality, which persisted after case-mix adjustment (adjusted hazard ratio [confidence interval]: 0.93 [0.87-0.98]) (Figure 3.3).

After deriving and validating the PC-QS as a composite measure of quality, we next sought to evaluate various hospital structural features as putative drivers of quality variations. For this, hospital-level associations between the PC-QS and hospital volume, facility type and geographic location were assessed across the validation set of hospitals. Overall, hospitals with a positive PC-QS (i.e. superior performance) displayed higher volume and were more likely to be academically affiliated relative to hospitals with a negative PC-QS (p < 0.01) (Figure 3.4). However, no significant association between PC-QS and geographical location was observed (Figure 3.4).

While socioeconomic and racial disparities in prostate cancer care have previously been reported, minimal evidence exists to validate that these disparities are not a result of poor access to high quality care (Schmid et al., 2016). Therefore, we next assessed whether associations of race (focused on black patients) and socioeconomic status with patient outcomes exist after controlling for hospital-level quality with the PC-QS. Notably, multivariable models adjusted for patient and tumour factors as well as PC-QS demonstrated significant associations between race (black vs. white) and higher rates of ADT initiation, 30-day, 90-day and overall mortality (Figure 3.5). With respect to socioeconomic status, patients without insurance demonstrated higher rates of salvage surgery and radiotherapy, ADT initiation and overall mortality, whereas patients with private insurance displayed lower rates of ADT initiation, 30-day, 90-day and overall mortality (Figure 3.5). Collectively, these results demonstrate that racial and socioeconomic outcome disparities persist after adjusting for hospital-level quality.

3.5 Discussion

While many QIs have been proposed as putative measures of prostate cancer quality, minimal evidence exists concerning their benchmarking validity. To bridge this knowledge gap, we systematically determined the ability of readily measured, case-mix adjusted QIs to benchmark prostate cancer provider care across the prospectively maintained National Cancer Database. This analysis not only prioritizes those QIs that should be integrated into existing prostate cancer quality improvement programs, but also provides a previously unappreciated, comprehensive picture of care disparities and their impact on patient outcomes.

Seminal work pioneered by Litwin and others developed a list of putative QIs for localized prostate cancer and subsequently documented that widespread variations in quality of care exist (Spencer et al., 2003; Miller et al., 2008). However, these early studies relied on intensive human resources to manually extract data elements from the NCDB in order to capture many of the QIs, thereby limiting the analysis to a sample size of just over 5000 patients and a 1-year study period. Moreover, as associations between QI compliance and patient outcomes were not determined, the construct validity of the metrics remained unknown. Attempts to derive construct validity for a small number of these QIs were subsequently reported by Schroeck et al. (2014b, 2015) and Sohn et al. (2016), utilizing data from SEER-Medicare and the Comparative Effectiveness Analysis of Surgery and Radiation study, respectively, yet both of these analyses failed to demonstrate quality-outcome associations. Further, a population-based study utilizing data from Ontario, Canada, revealed construct validity for 2 radiation-specific QIs; however, these metrics are not well captured within comprehensive cancer databases and were derived from data collected between 1990 and 1999 (Webber et al., 2013). As such, our analysis represents the first comprehensive and systematic assessment of quality-outcome associations utilizing granular case-mix adjusted data for prostate cancer.

Derivation of the PC-QS as a composite measure not only provides improved content validity and ease of reporting, but also facilitated our ability to determine quality-outcome associations by reducing statistical noise, consistent with prior reports (Lawson et al., 2017a; Dimick et al., 2012). The PC-QS captured significant hospital-level variations across the NCDB, and demonstrated construct validity with the observation that prostate cancer patients treated at positive vs. negative PC-QS hospitals had lower rates of overall mortality. This supports the PC-QS as a foundational prostate cancer composite measure of quality that should be adopted and iteratively improved as additional QIs are validated. Critically, as the PC-QS is readily derived from the NCDB, this database serves as a practical vehicle to disseminate the PC-QS for audit-level feedback.

The PC-QS also facilitated the systematic analysis of potential hospital-level drivers of quality variations, including the finding that hospital quality is associated with higher volumes and academic affiliation. While this suggests that centralization of care may provide a mechanism to improve quality, this hypothesis must be confirmed prospectively. Interestingly, the PC-QS additionally allowed us to further clarify mechanisms for racial and socioeconomic disparities in prostate cancer outcomes by demonstrating their persistence despite adjusting for hospital-level quality (Schmid et al., 2016). This aids in guiding further studies seeking to better understand this long-reported observation, with the investigation of physician-level access issues and timely access to care being priorities for future work (Moses et al., 2017; Maurice et al., 2017; Godley, 2003).

This study has important limitations. First, certain clinical variables are not collected within the NCDB (e.g. BMI), limiting our ability to case-mix adjust for all possible confounders. Second, we restricted quality-outcome associations to select patient outcomes available within the NCDB. Hence, future quality-outcome studies for those QIs that did not display construct validity should be performed with more granular patient-centred outcomes, including functional endpoints (e.g. sexual, urinary, and rectal dysfunction) and cancer-specific mortality. Data initiatives such as the American Urological Association Quality (AQUA) Registry and the TrueNTH Global Registry are notable examples that should substantially improve assessment of QI benchmarking validity (Gandaglia et al., 2016; Evans et al., 2017). Third, the outlier classification was based on statistical rather than clinical significance and, as a consequence, we are not able to adequately benchmark the performance of small hospitals due to a lack of statistical power with the described approach. Lastly, as these data are derived from the NCDB, validation of the PC-QS in the context of non-U.S. healthcare systems is warranted to ensure external validity.

3.6 Conclusions

Data-driven benchmarking of hospital-level quality performance reveals widespread disparities in prostate cancer care that are associated with poor patient outcomes. The PC-QS serves as a validated, readily determined composite metric of hospital-level prostate cancer quality to be used as a benchmarking tool for audit-level feedback and quality improvement.

3.7 Figures


[Figure 3.1: ten funnel-plot panels, one per QI. In each panel the x axis is the case-mix adjusted proportion (or the case-mix adjusted mean in days, for time to first treatment and length of stay) and the y axis is 1/SE. Panel annotations are summarized below as number of hospitals (number of patients); the heterogeneity p-value is <0.001 in every panel.]

QI                                  Lower outliers   Non-outliers     Upper outliers   I²
Positive margin proportion T2       61 (38419)       493 (87038)      39 (18073)       93.4%
Positive margin proportion T3       27 (9886)        488 (27267)      26 (3648)        88%
Active surveillance proportion      90 (6561)        429 (10383)      24 (4712)        89.9%
Active treatment proportion         18 (5971)        518 (63780)      94 (20020)       92.1%
Time to first treatment             74 (10257)       499 (71149)      57 (18151)       95.4%
Length of stay                      91 (78474)       272 (47317)      226 (53907)      98.9%
Readmission proportion              67 (63581)       515 (113697)     13 (6283)        89.4%
Lymph node dissection proportion    148 (61321)      268 (41762)      177 (67073)      99.5%
ADT with EBRT proportion            56 (4894)        411 (23500)      39 (4746)        91.7%
Appropriate EBRT dose proportion    82 (9819)        311 (28468)      118 (22756)      98.1%

Figure 3.1: Nationwide hospital-level benchmarking of prostate cancer quality of care. Case-mix adjusted performance for individual hospitals (circles, size proportional to hospital volume) benchmarked for quality according to disease-specific quality indicators. Vertical dashed red line represents the average nationwide hospital performance. The y axis represents the inverse standard error of the case-mix adjusted performance measure, with the dot-dash blue funnel giving the unadjusted 95% non-rejection region for the null of equivalence between observed and expected performance and the dashed red funnel giving the non-rejection region after Bonferroni correction. Between-hospital heterogeneity in performance is reported on each plot in terms of the I² statistic.


Figure 3.2: Concordance in quality indicators for identifying outlier hospitals. Venn diagrams display the concordance in classifying outlier hospitals between the individual QIs.


[Figure 3.3: two forest-plot panels, odds ratios (OR, log scale) and hazard ratios (HR, log scale). Plotted values are tabulated below.]

Outcome              Model        OR     95% CI
30 day mortality     unadjusted   0.73   (0.54, 0.99)
30 day mortality     adjusted     0.79   (0.58, 1.08)
90 day mortality     unadjusted   0.83   (0.64, 1.08)
90 day mortality     adjusted     0.91   (0.70, 1.19)
Salvage therapy      unadjusted   0.70   (0.57, 0.87)
Salvage therapy      adjusted     0.48   (0.38, 0.61)

Outcome              Model        HR     95% CI
ADT initiation       unadjusted   0.65   (0.55, 0.77)
ADT initiation       adjusted     0.77   (0.67, 0.89)
Overall mortality    unadjusted   0.73   (0.64, 0.83)
Overall mortality    adjusted     0.91   (0.84, 0.98)

Figure 3.3: Impact of hospital quality on patient outcomes. Unadjusted and case-mix adjusted associations between hospital-level quality, measured by the PC-QS, and overall mortality. Values displayed reflect hazard ratio (HR) when comparing hospitals with a positive vs. negative PC-QS. CI = confidence interval.

[Figure 3.4: three panels comparing hospitals with a negative vs. positive PC-QS. Left: hospital volume on a log scale (p-value: 0.007). Middle: proportion of hospitals by type, with categories Community, Comprehensive, Academic and Integrated (p-value: 0.001). Right: proportion of hospitals by location, with categories East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central and West South Central (p-value: 0.126).]

Figure 3.4: Hospital structure features associated with quality. Associations between hospital quality, measured by the PC-QS, and hospital volume (left panel), facility type (middle panel), and geographical location (right panel).


[Figure 3.5: two forest-plot panels, odds ratios (OR, log scale) and hazard ratios (HR, log scale). Plotted values are tabulated below; outcome codes are defined in the caption and asterisks are reproduced from the figure.]

Patient characteristic            Outcome   OR      95% CI
Not insured vs medicare/other     30        1.48    (0.74, 2.98)
Not insured vs medicare/other     90        1.26    (0.65, 2.45)
Not insured vs medicare/other     S         1.02    (0.85, 1.24)
Private vs medicare/other         30        0.68*   (0.50, 0.92)
Private vs medicare/other         90        0.69*   (0.53, 0.89)
Private vs medicare/other         S         0.96    (0.92, 1.01)
Black vs white race               30        2.19*   (1.56, 3.06)
Black vs white race               90        1.81*   (1.35, 2.41)
Black vs white race               S         1.22*   (1.13, 1.31)
Hispanic vs white race            30        1.15    (0.56, 2.39)
Hispanic vs white race            90        1.18    (0.69, 2.01)
Hispanic vs white race            S         1.00    (0.89, 1.12)

Patient characteristic            Outcome   HR      95% CI
Not insured vs medicare/other     M         1.18*   (1.07, 1.30)
Not insured vs medicare/other     ADT       1.40*   (1.23, 1.59)
Private vs medicare/other         M         0.82*   (0.79, 0.85)
Private vs medicare/other         ADT       0.86*   (0.81, 0.90)
Black vs white race               M         1.23*   (1.17, 1.28)
Black vs white race               ADT       1.22*   (1.15, 1.30)
Hispanic vs white race            M         0.70*   (0.65, 0.75)
Hispanic vs white race            ADT       1.00    (0.88, 1.14)

Figure 3.5: Impact of hospital level quality on race and insurance status associations with patient outcomes. Associations between race and insurance status with the rate of salvage therapy (surgery or radiation) [S], ADT initiation [ADT], 30-day mortality [30], 90-day mortality [90] and overall mortality [M], adjusted for both case-mix as well as hospital PC-QS.


3.8 Supplemental Tables and Figures

[Figure S3.1A: forest plot for positive margin proportion T2, odds ratio on a log scale. Plotted values are tabulated below.]

Variable                                       OR     95% CI
Age (10 years)                                 0.96   (0.94, 0.99)
Year of surgery                                0.99   (0.99, 1.00)
Time from diagnosis to surgery (years)         0.68   (0.62, 0.76)
Black vs white race                            1.07   (1.02, 1.13)
Native vs white race                           1.15   (0.83, 1.59)
Asian vs white race                            1.11   (0.99, 1.25)
Other vs white race                            0.76   (0.59, 0.97)
Hispanic vs white race                         1.14   (1.05, 1.24)
Not insured vs medicare/other                  1.00   (0.87, 1.14)
Private vs medicare/other                      0.93   (0.90, 0.97)
Medicaid vs medicare/other                     1.29   (1.15, 1.45)
29+% vs <14% not completed high school         0.98   (0.92, 1.04)
20-28.9% vs <14% not completed high school     1.02   (0.97, 1.07)
14-19.9% vs <14% not completed high school     1.04   (1.00, 1.08)
<$30,000 vs $46,000+ income                    1.01   (0.94, 1.08)
$30,000-$34,999 vs $46,000+ income             1.03   (0.98, 1.09)
$35,000-$45,999 vs $46,000+ income             1.05   (1.00, 1.09)
Urban/rural score (1-9)                        1.02   (1.01, 1.03)
Charlson score 1 vs 0                          1.09   (1.05, 1.13)
Charlson score 2+ vs 0                         1.12   (1.00, 1.25)
Regional positive nodes found vs not           1.43   (1.18, 1.73)
Lymph-vascular invasion present vs not         1.51   (1.34, 1.70)
Nodal disease vs not                           1.12   (0.69, 1.82)
Great circle distance (100 miles)              0.94   (0.93, 0.95)
Prostate Specific Antigen (100 ng/ml)          1.49   (1.30, 1.71)
Gleason score 7 vs 6                           1.48   (1.43, 1.52)
Gleason score 8 vs 6                           1.49   (1.39, 1.59)
Gleason score 9 vs 6                           1.68   (1.52, 1.87)
Gleason score 10 vs 6                          1.11   (0.65, 1.90)
Path. stage T2a vs unspec. T2                  0.42   (0.39, 0.46)
Path. stage T2b vs unspec. T2                  0.79   (0.72, 0.88)
Path. stage T2c vs unspec. T2                  1.01   (0.95, 1.08)

Figure S3.1A: Model estimates from case-mix adjustment of positive margin proportion among T2 stage patients QI, with corresponding 95% confidence interval.


[Figure S3.1B: forest plot for positive margin proportion T3, odds ratio on a log scale. Plotted values are tabulated below.]

Variable                                       OR     95% CI
Age (10 years)                                 0.94   (0.91, 0.98)
Year of surgery                                0.95   (0.94, 0.96)
Time from diagnosis to surgery (years)         0.83   (0.73, 0.95)
Black vs white race                            1.19   (1.11, 1.27)
Native vs white race                           0.74   (0.48, 1.13)
Asian vs white race                            0.95   (0.83, 1.09)
Other vs white race                            1.17   (0.88, 1.56)
Hispanic vs white race                         1.18   (1.06, 1.32)
Not insured vs medicare/other                  1.00   (0.85, 1.17)
Private vs medicare/other                      0.93   (0.88, 0.98)
Medicaid vs medicare/other                     1.06   (0.90, 1.23)
29+% vs <14% not completed high school         1.02   (0.93, 1.11)
20-28.9% vs <14% not completed high school     1.09   (1.02, 1.17)
14-19.9% vs <14% not completed high school     1.06   (1.00, 1.12)
<$30,000 vs $46,000+ income                    0.99   (0.90, 1.09)
$30,000-$34,999 vs $46,000+ income             1.03   (0.95, 1.11)
$35,000-$45,999 vs $46,000+ income             1.06   (1.00, 1.13)
Urban/rural score (1-9)                        1.01   (1.00, 1.02)
Charlson score 1 vs 0                          1.10   (1.04, 1.16)
Charlson score 2+ vs 0                         1.23   (1.08, 1.41)
Regional positive nodes found vs not           1.29   (1.19, 1.40)
Lymph-vascular invasion present vs not         1.32   (1.23, 1.43)
Nodal disease vs not                           0.88   (0.68, 1.13)
Great circle distance (100 miles)              0.95   (0.94, 0.96)
Prostate Specific Antigen (100 ng/ml)          2.80   (2.39, 3.28)
Gleason score 7 vs 6                           0.86   (0.81, 0.91)
Gleason score 8 vs 6                           0.98   (0.91, 1.06)
Gleason score 9 vs 6                           1.20   (1.11, 1.30)
Gleason score 10 vs 6                          1.44   (1.13, 1.83)
Path. stage T3a vs unspec. T3                  0.86   (0.79, 0.93)
Path. stage T3b vs unspec. T3                  0.94   (0.86, 1.02)

Figure S3.1B: Model estimates from case-mix adjustment of positive margin proportion among T3 stage patients QI, with corresponding 95% confidence interval.


[Figure S3.1C: forest plot for active surveillance proportion, odds ratio on a log scale. Plotted values are tabulated below.]

Variable                                       OR     95% CI
Age (10 years)                                 1.83   (1.71, 1.95)
Year of Diagnosis (since 2004)                 1.44   (1.39, 1.49)
Black vs white race                            1.11   (0.98, 1.25)
Native vs white race                           0.56   (0.19, 1.59)
Asian vs white race                            1.26   (0.97, 1.64)
Other vs white race                            1.53   (0.93, 2.52)
Hispanic vs white race                         0.82   (0.67, 1.01)
Not insured vs medicare/other                  2.64   (1.97, 3.54)
Private vs medicare/other                      1.00   (0.91, 1.10)
Medicaid vs medicare/other                     1.20   (0.91, 1.60)
29+% vs <14% not completed high school         0.74   (0.63, 0.88)
20-28.9% vs <14% not completed high school     0.71   (0.62, 0.81)
14-19.9% vs <14% not completed high school     0.83   (0.75, 0.93)
<$30,000 vs $46,000+ income                    1.30   (1.08, 1.57)
$30,000-$34,999 vs $46,000+ income             1.25   (1.08, 1.44)
$35,000-$45,999 vs $46,000+ income             1.23   (1.10, 1.37)
Urban/rural score (1-9)                        0.97   (0.95, 1.00)
Charlson score 1 vs 0                          0.62   (0.55, 0.71)
Charlson score 2+ vs 0                         1.22   (0.95, 1.56)
Great circle distance (100 miles)              1.04   (1.01, 1.06)
Prostate Specific Antigen (100 ng/ml)          1.70   (0.25, >10)
Clin. stage T2a vs < T2a                       0.84   (0.73, 0.97)

Figure S3.1C: Model estimates from case-mix adjustment of active surveillance proportion QI, with corresponding 95% confidence interval.


Active treatment proportion. Odds ratio (OR; plotted on a log scale in the original figure) with 95% CI.

| Variable | OR | 95% CI |
|---|---|---|
| Age (10 years) | 0.61 | (0.58, 0.63) |
| Year of Diagnosis (since 2004) | 0.99 | (0.98, 0.99) |
| Black vs white race | 0.56 | (0.52, 0.60) |
| Native vs white race | 0.42 | (0.29, 0.63) |
| Asian vs white race | 1.07 | (0.89, 1.30) |
| Other vs white race | 0.73 | (0.51, 1.04) |
| Hispanic vs white race | 0.54 | (0.49, 0.61) |
| Not insured vs medicare/other | 0.38 | (0.33, 0.44) |
| Private vs medicare/other | 1.14 | (1.06, 1.22) |
| Medicaid vs medicare/other | 0.52 | (0.46, 0.59) |
| 29+% vs <14% not completed high school | 0.92 | (0.83, 1.03) |
| 20−28.9% vs <14% not completed high school | 0.97 | (0.88, 1.06) |
| 14−19.9% vs <14% not completed high school | 0.98 | (0.90, 1.06) |
| <$30,000 vs $46,000+ income | 0.68 | (0.61, 0.77) |
| $30,000−$34,999 vs $46,000+ income | 0.86 | (0.78, 0.95) |
| $35,000−$45,999 vs $46,000+ income | 0.88 | (0.82, 0.96) |
| Urban/rural score (1−9) | 1.00 | (0.98, 1.01) |
| Charlson score 1 vs 0 | 1.08 | (1.00, 1.17) |
| Charlson score 2+ vs 0 | 0.62 | (0.54, 0.71) |
| Great circle distance (100 miles) | 0.97 | (0.95, 0.99) |
| Prostate Specific Antigen (100 ng/ml) | 0.23 | (0.21, 0.26) |
| Gleason score 7 vs 6 | 1.48 | (1.37, 1.59) |
| Gleason score 8 vs 6 | 1.38 | (1.28, 1.50) |
| Gleason score 9 vs 6 | 1.35 | (1.24, 1.47) |
| Gleason score 10 vs 6 | 0.92 | (0.76, 1.10) |
| Clin. stage unspec. T2 vs all T1 | 0.87 | (0.77, 0.98) |
| Clin. stage T2a vs all T1 | 1.52 | (1.35, 1.72) |
| Clin. stage T2b vs all T1 | 1.62 | (1.41, 1.85) |
| Clin. stage T2c vs all T1 | 1.16 | (1.08, 1.24) |
| Clin. stage unspec. T3 vs all T1 | 1.12 | (0.95, 1.32) |
| Clin. stage T3a vs all T1 | 1.51 | (1.29, 1.77) |
| Clin. stage T3b vs all T1 | 1.50 | (1.25, 1.80) |
| Clin. stage T4 vs all T1 | 0.63 | (0.49, 0.81) |

Figure S3.1D: Model estimates from case-mix adjustment of active treatment proportion QI, with corresponding 95% confidence interval.
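For readers who want to relate the plotted odds ratios to the underlying model coefficients, the following is a minimal sketch of the usual conversion. It assumes the intervals are Wald-type intervals computed on the log-odds scale; the figure does not say how they were obtained. The function name `odds_ratio_with_ci` and the inputs `beta = -0.50` and `se = 0.02` are hypothetical, chosen only so that the result resembles the Age (10 years) row (OR 0.61, 95% CI 0.58 to 0.63).

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald interval, beta +/- z * se, built on the log-odds scale
    and then exponentiated. This is an illustrative assumption, not a
    statement of how the thesis computed its intervals.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error (not taken from the thesis).
or_value, lower, upper = odds_ratio_with_ci(beta=-0.50, se=0.02)
print(f"OR = {or_value:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, it appears symmetric around the point estimate when the odds ratio axis is drawn on a log scale, as in the original plot.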


Log time to first treatment. Regression coefficient (Coef; axis range −1.0 to 1.0 in the original figure) with 95% CI.

| Variable | Coef | 95% CI |
|---|---|---|
| Age (10 years) | −0.07 | (−0.08, −0.06) |
| Year of Diagnosis (since 2004) | 0.00 | (−0.00, 0.00) |
| Black vs white race | 0.15 | (0.13, 0.17) |
| Native vs white race | −0.04 | (−0.19, 0.11) |
| Asian vs white race | 0.18 | (0.14, 0.23) |
| Other vs white race | 0.03 | (−0.07, 0.14) |
| Hispanic vs white race | 0.13 | (0.09, 0.17) |
| Not insured vs medicare/other | −0.01 | (−0.06, 0.05) |
| Private vs medicare/other | −0.11 | (−0.13, −0.09) |
| Medicaid vs medicare/other | 0.05 | (0.01, 0.10) |
| 29+% vs <14% not completed high school | 0.01 | (−0.02, 0.04) |
| 20−28.9% vs <14% not completed high school | 0.02 | (−0.00, 0.04) |
| 14−19.9% vs <14% not completed high school | 0.04 | (0.02, 0.06) |
| <$30,000 vs $46,000+ income | −0.07 | (−0.10, −0.03) |
| $30,000−$34,999 vs $46,000+ income | −0.03 | (−0.06, −0.01) |
| $35,000−$45,999 vs $46,000+ income | −0.04 | (−0.06, −0.02) |
| Urban/rural score (1−9) | −0.00 | (−0.01, −0.00) |
| Charlson score 1 vs 0 | −0.13 | (−0.15, −0.11) |
| Charlson score 2+ vs 0 | −0.23 | (−0.28, −0.19) |
| Great circle distance (100 miles) | 0.01 | (0.00, 0.01) |
| Prostate Specific Antigen (100 ng/ml) | −0.05 | (−0.08, −0.01) |
| Gleason score 7 vs 6 | 0.04 | (0.02, 0.06) |
| Gleason score 8 vs 6 | 0.13 | (0.10, 0.15) |
| Gleason score 9 vs 6 | −0.01 | (−0.03, 0.02) |
| Gleason score 10 vs 6 | −0.22 | (−0.27, −0.17) |
| Clin. stage unspec. T2 vs all T1 | −0.11 | (−0.14, −0.07) |
| Clin. stage T2a vs all T1 | 0.16 | (0.13, 0.19) |
| Clin. stage T2b vs all T1 | 0.20 | (0.17, 0.23) |
| Clin. stage T2c vs all T1 | −0.03 | (−0.05, −0.01) |
| Clin. stage unspec. T3 vs all T1 | 0.06 | (0.02, 0.10) |
| Clin. stage T3a vs all T1 | 0.09 | (0.05, 0.12) |
| Clin. stage T3b vs all T1 | 0.13 | (0.09, 0.17) |
| Clin. stage T4 vs all T1 | −0.92 | (−0.99, −0.85) |
| Nodal disease vs not | 0.06 | (0.01, 0.11) |

Figure S3.1E: Model estimates from case-mix adjustment of time to first treatment QI, with corresponding 95% confidence interval.
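Because the outcome in this model is on the log scale, each coefficient can be read as a multiplicative effect on time to first treatment. The sketch below assumes a natural-log transform (the panel title says only "Log time to first treatment") and uses a hypothetical helper, `percent_change`. The inputs are the Age (10 years) coefficient and interval reported in the figure.

```python
import math


def percent_change(coef):
    """Percent change in the geometric-mean outcome per unit increase in a
    covariate, for a linear model fitted to the natural log of the outcome.

    The natural-log assumption is illustrative; the source does not state
    the base of the logarithm.
    """
    return 100.0 * (math.exp(coef) - 1.0)


# Age (10 years) row: coefficient -0.07, 95% CI (-0.08, -0.06).
estimate, lower, upper = (percent_change(c) for c in (-0.07, -0.08, -0.06))
print(f"{estimate:.1f}% ({lower:.1f}%, {upper:.1f}%)")
```

Under these assumptions, each additional 10 years of age corresponds to a time to first treatment that is about 6.8% shorter, holding the other covariates fixed.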


Log length of stay. Regression coefficient (Coef; axis range −1.0 to 1.0 in the original figure) with 95% CI.

| Variable | Coef | 95% CI |
|---|---|---|
| Age (10 years) | 0.02 | (0.02, 0.03) |
| Year of surgery | −0.04 | (−0.04, −0.03) |
| Time from diagnosis to surgery (years) | −0.10 | (−0.11, −0.09) |
| Black vs white race | 0.09 | (0.08, 0.09) |
| Native vs white race | 0.08 | (0.04, 0.13) |
| Asian vs white race | 0.04 | (0.03, 0.06) |
| Other vs white race | −0.01 | (−0.04, 0.02) |
| Hispanic vs white race | 0.07 | (0.06, 0.08) |
| Not insured vs medicare/other | 0.04 | (0.03, 0.06) |
| Private vs medicare/other | −0.03 | (−0.03, −0.02) |
| Medicaid vs medicare/other | 0.11 | (0.09, 0.12) |
| 29+% vs <14% not completed high school | −0.02 | (−0.03, −0.01) |
| 20−28.9% vs <14% not completed high school | 0.01 | (−0.00, 0.01) |
| 14−19.9% vs <14% not completed high school | 0.01 | (0.01, 0.02) |
| <$30,000 vs $46,000+ income | 0.02 | (0.01, 0.03) |
| $30,000−$34,999 vs $46,000+ income | 0.01 | (0.01, 0.02) |
| $35,000−$45,999 vs $46,000+ income | 0.01 | (0.00, 0.02) |
| Urban/rural score (1−9) | 0.01 | (0.01, 0.01) |
| Charlson score 1 vs 0 | 0.06 | (0.05, 0.06) |
| Charlson score 2+ vs 0 | 0.14 | (0.13, 0.16) |
| Regional positive nodes found vs not | 0.02 | (0.00, 0.03) |
| Lymph−vascular invasion present vs not | 0.01 | (−0.00, 0.02) |
| Nodal disease vs not | 0.12 | (0.07, 0.16) |
| Great circle distance (100 miles) | −0.01 | (−0.01, −0.01) |
| Prostate Specific Antigen (100 ng/ml) | 0.05 | (0.04, 0.07) |
| Gleason score 7 vs 6 | −0.01 | (−0.01, −0.00) |
| Gleason score 8 vs 6 | 0.03 | (0.02, 0.04) |
| Gleason score 9 vs 6 | 0.04 | (0.02, 0.05) |
| Gleason score 10 vs 6 | 0.08 | (0.04, 0.12) |
| Path. stage T2a vs unspec. T2 | −0.04 | (−0.05, −0.02) |
| Path. stage T2b vs unspec. T2 | −0.04 | (−0.05, −0.02) |
| Path. stage T2c vs unspec. T2 | −0.04 | (−0.05, −0.03) |
| Path. stage unspec. T3 vs unspec. T2 | −0.00 | (−0.02, 0.02) |
| Path. stage T3a vs unspec. T2 | −0.04 | (−0.05, −0.02) |
| Path. stage T3b vs unspec. T2 | −0.01 | (−0.03, −0.00) |
| Path. stage T4 vs unspec. T2 | −0.01 | (−0.04, 0.03) |

Figure S3.1F: Model estimates from case-mix adjustment of length of stay QI, with corresponding 95% confidence interval.


Readmission proportion. Odds ratio (OR; plotted on a log scale in the original figure) with 95% CI.

| Variable | OR | 95% CI |
|---|---|---|
| Age (10 years) | 1.03 | (0.97, 1.08) |
| Year of surgery | 0.98 | (0.97, 0.99) |
| Time from diagnosis to surgery (years) | 0.31 | (0.25, 0.40) |
| Black vs white race | 1.34 | (1.23, 1.47) |
| Native vs white race | 1.08 | (0.58, 2.04) |
| Asian vs white race | 0.85 | (0.66, 1.08) |
| Other vs white race | 0.97 | (0.62, 1.54) |
| Hispanic vs white race | 0.89 | (0.74, 1.06) |
| Not insured vs medicare/other | 1.16 | (0.91, 1.47) |
| Private vs medicare/other | 0.85 | (0.79, 0.92) |
| Medicaid vs medicare/other | 1.49 | (1.22, 1.82) |
| 29+% vs <14% not completed high school | 1.44 | (1.27, 1.64) |
| 20−28.9% vs <14% not completed high school | 1.19 | (1.08, 1.31) |
| 14−19.9% vs <14% not completed high school | 1.06 | (0.97, 1.15) |
| <$30,000 vs $46,000+ income | 0.78 | (0.68, 0.90) |
| $30,000−$34,999 vs $46,000+ income | 0.76 | (0.68, 0.85) |
| $35,000−$45,999 vs $46,000+ income | 0.87 | (0.80, 0.95) |
| Urban/rural score (1−9) | 1.04 | (1.02, 1.06) |
| Charlson score 1 vs 0 | 1.09 | (1.01, 1.18) |
| Charlson score 2+ vs 0 | 1.30 | (1.08, 1.58) |
| Regional positive nodes found vs not | 0.95 | (0.76, 1.18) |
| Lymph−vascular invasion present vs not | 1.10 | (0.93, 1.31) |
| Nodal disease vs not | 1.47 | (0.87, 2.49) |
| Great circle distance (100 miles) | 0.90 | (0.88, 0.93) |
| Prostate Specific Antigen (100 ng/ml) | 0.85 | (0.65, 1.12) |
| Gleason score 7 vs 6 | 0.97 | (0.90, 1.03) |
| Gleason score 8 vs 6 | 0.93 | (0.82, 1.06) |
| Gleason score 9 vs 6 | 1.07 | (0.92, 1.24) |
| Gleason score 10 vs 6 | 1.52 | (0.95, 2.43) |
| Path. stage T2a vs unspec. T2 | 1.24 | (1.03, 1.50) |
| Path. stage T2b vs unspec. T2 | 1.28 | (1.01, 1.61) |
| Path. stage T2c vs unspec. T2 | 1.26 | (1.06, 1.49) |
| Path. stage unspec. T3 vs unspec. T2 | 1.11 | (0.81, 1.51) |
| Path. stage T3a vs unspec. T2 | 1.25 | (1.04, 1.50) |
| Path. stage T3b vs unspec. T2 | 1.51 | (1.24, 1.85) |
| Path. stage T4 vs unspec. T2 | 1.71 | (1.12, 2.61) |

Figure S3.1G: Model estimates from case-mix adjustment of readmission proportion QI, with corresponding 95% confidence interval.


Lymph node dissection proportion. Odds ratio (OR; plotted on a log scale in the original figure) with 95% CI.

| Variable | OR | 95% CI |
|---|---|---|
| Age (10 years) | 0.97 | (0.95, 0.99) |
| Year of surgery | 0.98 | (0.98, 0.99) |
| Time from diagnosis to surgery (years) | 0.63 | (0.59, 0.67) |
| Black vs white race | 0.94 | (0.91, 0.97) |
| Native vs white race | 1.34 | (1.05, 1.72) |
| Asian vs white race | 1.28 | (1.17, 1.39) |
| Other vs white race | 0.95 | (0.81, 1.11) |
| Hispanic vs white race | 0.86 | (0.81, 0.92) |
| Not insured vs medicare/other | 1.09 | (0.99, 1.20) |
| Private vs medicare/other | 1.00 | (0.98, 1.03) |
| Medicaid vs medicare/other | 1.03 | (0.94, 1.13) |
| 29+% vs <14% not completed high school | 0.91 | (0.87, 0.96) |
| 20−28.9% vs <14% not completed high school | 0.90 | (0.87, 0.94) |
| 14−19.9% vs <14% not completed high school | 1.00 | (0.97, 1.03) |
| <$30,000 vs $46,000+ income | 1.01 | (0.96, 1.06) |
| $30,000−$34,999 vs $46,000+ income | 0.99 | (0.95, 1.03) |
| $35,000−$45,999 vs $46,000+ income | 0.96 | (0.93, 0.99) |
| Urban/rural score (1−9) | 0.99 | (0.99, 1.00) |
| Charlson score 1 vs 0 | 1.01 | (0.98, 1.04) |
| Charlson score 2+ vs 0 | 0.97 | (0.89, 1.05) |
| Lymph−vascular invasion present vs not | 1.46 | (1.35, 1.58) |
| Nodal disease vs not | 5.63 | (3.34, 9.50) |
| Great circle distance (100 miles) | 1.11 | (1.11, 1.12) |
| Prostate Specific Antigen (100 ng/ml) | 2.42 | (2.18, 2.68) |
| Gleason score 7 vs 6 | 2.03 | (1.98, 2.08) |
| Gleason score 8 vs 6 | 4.79 | (4.53, 5.06) |
| Gleason score 9 vs 6 | 6.45 | (5.96, 6.97) |
| Gleason score 10 vs 6 | 5.22 | (3.83, 7.12) |
| Path. stage T2a vs unspec. T2 | 0.79 | (0.73, 0.86) |
| Path. stage T2b vs unspec. T2 | 1.10 | (1.01, 1.21) |
| Path. stage T2c vs unspec. T2 | 0.85 | (0.79, 0.91) |
| Path. stage unspec. T3 vs unspec. T2 | 0.94 | (0.84, 1.06) |
| Path. stage T3a vs unspec. T2 | 1.33 | (1.23, 1.44) |
| Path. stage T3b vs unspec. T2 | 1.64 | (1.50, 1.79) |
| Path. stage T4 vs unspec. T2 | 1.26 | (1.03, 1.54) |

Figure S3.1H: Model estimates from case-mix adjustment of lymph node dissection proportion QI, with corresponding 95% confidence interval.


ADT with EBRT proportion. Odds ratio (OR; plotted on a log scale in the original figure) with 95% CI.

| Variable | OR | 95% CI |
|---|---|---|
| Age (10 years) | 0.94 | (0.91, 0.97) |
| Year of Diagnosis (since 2004) | 1.04 | (1.03, 1.05) |
| Black vs white race | 1.02 | (0.96, 1.09) |
| Native vs white race | 1.25 | (0.77, 2.03) |
| Asian vs white race | 0.77 | (0.66, 0.90) |
| Other vs white race | 0.85 | (0.60, 1.22) |
| Hispanic vs white race | 0.95 | (0.85, 1.06) |
| Not insured vs medicare/other | 0.80 | (0.67, 0.95) |
| Private vs medicare/other | 1.03 | (0.98, 1.10) |
| Medicaid vs medicare/other | 0.89 | (0.78, 1.02) |
| 29+% vs <14% not completed high school | 0.88 | (0.80, 0.96) |
| 20−28.9% vs <14% not completed high school | 0.93 | (0.87, 1.01) |
| 14−19.9% vs <14% not completed high school | 0.93 | (0.88, 1.00) |
| <$30,000 vs $46,000+ income | 1.03 | (0.94, 1.14) |
| $30,000−$34,999 vs $46,000+ income | 1.06 | (0.98, 1.15) |
| $35,000−$45,999 vs $46,000+ income | 1.00 | (0.94, 1.07) |
| Urban/rural score (1−9) | 1.02 | (1.01, 1.04) |
| Charlson score 1 vs 0 | 1.03 | (0.96, 1.11) |
| Charlson score 2+ vs 0 | 1.01 | (0.87, 1.18) |
| Great circle distance (100 miles) | 1.01 | (0.99, 1.04) |
| Prostate Specific Antigen (100 ng/ml) | 0.82 | (0.74, 0.91) |
| Gleason score 7 vs 6 | 2.26 | (2.08, 2.46) |
| Gleason score 8 vs 6 | 3.36 | (3.09, 3.65) |
| Gleason score 9 vs 6 | 3.92 | (3.58, 4.28) |
| Gleason score 10 vs 6 | 4.02 | (3.43, 4.71) |
| Clin. stage unspec. T2 vs all T1 | 1.05 | (0.93, 1.18) |
| Clin. stage T2a vs all T1 | 1.23 | (1.13, 1.33) |
| Clin. stage T2b vs all T1 | 1.47 | (1.35, 1.60) |
| Clin. stage T2c vs all T1 | 1.03 | (0.97, 1.09) |
| Clin. stage unspec. T3 vs all T1 | 1.24 | (1.11, 1.39) |
| Clin. stage T3a vs all T1 | 1.68 | (1.51, 1.87) |
| Clin. stage T3b vs all T1 | 1.37 | (1.22, 1.54) |
| Clin. stage T4 vs all T1 | 1.31 | (1.05, 1.64) |
| Nodal disease vs not | 1.07 | (0.93, 1.23) |

Figure S3.1I: Model estimates from case-mix adjustment of concurrent EBRT and ADT therapies within 3 months QI, with corresponding 95% confidence interval.


Appropriate EBRT dose proportion. Odds ratio (OR; plotted on a log scale in the original figure) with 95% CI.

| Variable | OR | 95% CI |
|---|---|---|
| Age (10 years) | 1.00 | (0.97, 1.03) |
| Year of Diagnosis (since 2004) | 1.20 | (1.19, 1.21) |
| Black vs white race | 0.92 | (0.87, 0.97) |
| Native vs white race | 0.70 | (0.48, 1.03) |
| Asian vs white race | 1.34 | (1.15, 1.56) |
| Other vs white race | 1.11 | (0.78, 1.58) |
| Hispanic vs white race | 0.76 | (0.69, 0.83) |
| Not insured vs medicare/other | 1.02 | (0.86, 1.21) |
| Private vs medicare/other | 0.96 | (0.91, 1.01) |
| Medicaid vs medicare/other | 1.09 | (0.96, 1.23) |
| 29+% vs <14% not completed high school | 1.05 | (0.97, 1.13) |
| 20−28.9% vs <14% not completed high school | 1.07 | (1.00, 1.14) |
| 14−19.9% vs <14% not completed high school | 1.07 | (1.01, 1.13) |
| <$30,000 vs $46,000+ income | 0.83 | (0.76, 0.91) |
| $30,000−$34,999 vs $46,000+ income | 0.79 | (0.74, 0.85) |
| $35,000−$45,999 vs $46,000+ income | 0.92 | (0.87, 0.98) |
| Urban/rural score (1−9) | 0.96 | (0.95, 0.97) |
| Charlson score 1 vs 0 | 0.92 | (0.86, 0.98) |
| Charlson score 2+ vs 0 | 0.86 | (0.75, 0.98) |
| Great circle distance (100 miles) | 1.00 | (0.98, 1.02) |
| Prostate Specific Antigen (100 ng/ml) | 0.83 | (0.74, 0.93) |
| Gleason score 7 vs 6 | 1.14 | (1.08, 1.21) |
| Gleason score 8 vs 6 | 1.11 | (1.03, 1.19) |
| Gleason score 9 vs 6 | 1.01 | (0.94, 1.09) |
| Gleason score 10 vs 6 | 0.92 | (0.77, 1.11) |
| Clin. stage unspec. T2 vs all T1 | 0.82 | (0.74, 0.90) |
| Clin. stage T2a vs all T1 | 0.98 | (0.92, 1.05) |
| Clin. stage T2b vs all T1 | 0.94 | (0.88, 1.01) |
| Clin. stage T2c vs all T1 | 0.90 | (0.85, 0.96) |
| Clin. stage unspec. T3 vs all T1 | 0.75 | (0.66, 0.86) |
| Clin. stage T3a vs all T1 | 1.23 | (1.08, 1.40) |
| Clin. stage T3b vs all T1 | 1.04 | (0.90, 1.21) |
| Clin. stage T4 vs all T1 | 0.58 | (0.43, 0.79) |

Figure S3.1J: Model estimates from case-mix adjustment of appropriate EBRT dose QI, with corresponding 95% confidence interval.


Trend in Positive margin proportion T2 by year

Calendar year

Cas

e−m

ix a

djus

ted

prop

ortio

n0.

00.

20.

40.

60.

81.

0

2004 2006 2008 2010 2012

● ●

● ●

● ●●

●●●

●●

●● ●

●● ●

●●

● ●●

● ●

●●

● ●

●●

●●

●●

● ●

● ●

● ●●

● ●

● ●

●●

● ●●

●●

●●

● ● ● ●

● ●

● ● ●

●●

●●

●●

●●

● ● ● ●

● ●

● ●

●●

● ●

●●

● ●

● ●

● ● ●

● ●

● ●

● ●

● ●

● ●

● ●

●●● ● ●

● ●

● ● ●

● ●

● ● ● ● ●

● ● ●

● ●

●●

● ●

● ●

● ● ●

● ●

●●

● ●

● ●

●●

●● ●

● ●

● ●

●●

●● ● ● ● ● ●

● ●

● ● ●

● ●● ●

● ●

●●

● ●

●●

●● ● ● ● ● ●

● ●

● ● ●

● ●●

● ●

● ● ●

● ● ●

● ●

●●

●●

●●

●●

● ● ● ● ●

● ● ● ● ● ● ●

● ● ●

● ●

● ● ●

● ●

● ●

● ● ●

● ●

● ●

● ●

● ●

●●

●● ● ● ● ●

● ●

● ●

● ●

●●

●●

● ●

●●

●●

●●

● ●

● ● ●

● ●

●●

● ●

● ● ●

●●

● ●

● ●

●● ●

● ●

● ●

● ●

● ●

● ●

●●

● ● ●

● ●

● ●

● ●

●●

● ●

●●

●●

● ●

● ●

● ●

● ●

●●

●●

●●

● ● ●

●●

●●

●●

●●

●●

● ●● ●

●●

● ●

● ●

●●

● ●

●●

● ● ●

●●

● ●

●●

● ● ●●

● ●

● ●

● ●

● ●

● ●

●●

●●

● ● ●

● ●●

● ●

●●

● ●

●●

●●

●●

● ●

●●

●●

●●

●●

● ●

● ●

●●

● ●

● ● ●

● ●

● ●

●●

●●

●●

● ●

●●

●●

● ●

●●

● ●

● ●

●●

● ●

● ● ● ●

● ●

● ●

●●

●●

●●

● ●

● ●

● ●

● ●

●● ●

● ●

● ● ●

● ●

● ●

●●

● ●

● ● ● ●

● ●

● ●

● ●

●●

●●

● ●

●● ● ●

● ● ● ● ●

● ● ●

● ●●

● ● ●● ●

●● ● ● ●

● ● ●●

● ● ● ●

● ●

●●

●●

●●

●●

● ●

●●

● ● ● ● ●

● ●

● ● ● ● ●

● ● ●●

●●

● ●

● ●

● ●

● ●

●●

● ● ● ● ●

● ● ●●

● ●

●●

●●

●●

● ●

● ● ● ●

● ● ● ●

● ● ● ● ●

●●●

●●

● ●

● ●

● ●

●●

●●

● ●

●● ● ● ●

● ●

●●

●●

● ● ●

● ●

●●

● ●

● ● ●

● ●

● ●

● ●●

● ● ●●

●●

● ●

●● ●

● ●

●●●

● ●

● ●

● ●

●●

● ● ●

● ●●

● ●●

● ●

● ● ● ●

● ●

●●

●●

● ●

● ●●

● ●●

● ●

● ●

● ●

●●

● ● ●

● ●

● ●

●●

● ●

●●

● ●

●● ●

● ●

● ●

●●

●●

● ●

●●

● ● ●

●●

● ● ●

● ●

● ●

● ●●

● ●

● ●

●●

● ●

●●

●●

●●

● ●●

●●

●●

● ● ● ● ●

●●

●●

● ●●

●●

● ●

● ●

●●

●●

●●

● ● ●

●●

●●

● ● ●● ● ●

● ● ● ●

● ● ●

● ●

● ●

● ●

● ● ●

● ●

● ●

● ●

● ● ●

● ●

●●

●●

● ●

● ●●

●●

●●●

●●

●●

● ● ● ●

● ●

●●

● ●

● ●

● ● ● ●

● ● ● ●

●● ●

● ●

● ●● ● ●

● ●

● ●

● ●

● ●●

●●

●●

●●

● ●

● ●

●● ● ● ● ●

● ●

●● ●

●●

● ●● ● ● ●

● ●

●● ●

● ● ●●

●●

● ●●

● ● ● ● ●

●●

● ●

●●

● ● ● ●

● ●●

● ●● ●

●●

●●

● ●

● ● ● ●● ●

● ● ● ● ●●

● ●

● ● ●

●● ●

●●

● ●

● ●●

●●

●●

●●

● ●

● ●●

● ●●

●● ●

●●

● ●

● ●

● ●

● ● ●

●●

● ● ●

● ●

●●

●●

● ●●

●● ● ● ● ● ● ● ● ● ●● ● ●

● ● ● ●

● ● ● ●

● ● ●●

● ● ●

● ●● ● ● ●

●●

● ●

● ●●

● ● ● ● ● ●

● ●

● ●●

● ● ●●

● ●●

● ●●

●●

● ● ● ● ● ● ● ●

● ●●

● ●● ●

● ● ● ● ● ● ●

● ●

● ●

●●

●● ●

● ● ●● ●●

●●

● ●●

● ●

●●

●●

●● ●

● ● ● ●

● ●● ●

●●

●●

● ●

● ●● ●

●● ● ● ● ● ● ●

● ●● ●

● ●

● ● ●

● ● ● ● ● ● ● ● ● ●

● ● ●

●●

●●

● ●

●●

● ●

● ● ● ●

● ●

● ● ● ● ●●

● ●

● ● ● ●● ● ● ●

●● ●

● ●

●●

●● ●

● ●●

● ●

● ●

●●

●●

●●

●●

● ●

● ●●

●●

●●

●●

●●

● ●

● ●

● ● ●

●●

●●

● ●●

●●

●●

●●

● ●● ●

●●

● ●

● ●

●●

●●

● ●

●●

●●

●●

● ●●

● ●●

●●

●●

● ●●

● ● ●

●●

● ●

● ●

● ●

● ●●

●●

●●

●●

● ● ●

●●

●●

● ● ● ●

● ●

● ●

● ● ●

● ● ●●

●●

●● ●

● ●●

Trend in Positive margin proportion T3 by year

Calendar year

Cas

e−m

ix a

djus

ted

prop

ortio

n0.

00.

20.

40.

60.

81.

0

2004 2006 2008 2010 2012

● ●

●●

● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●● ●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ● ● ● ●

●●

● ●

● ●

●●● ●

● ●●

● ●

●●

● ● ●

●●

● ●

●●

● ●

●●

●●

● ●

● ●

● ● ●

● ●

● ●

● ●

●●

●●

● ●

● ● ● ● ●

● ●

● ●

● ●

●● ●

● ● ●

● ●

●●

●●

●●

●●

●●

●●

● ● ●

● ●

●●

● ●

●●

● ●

●●

● ● ●● ● ● ●

● ●

● ● ●

● ●

● ●

●●

●● ●

● ●

● ●

●●

●●

●●

● ●

● ●

● ●

● ● ●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●● ●

●●

● ● ●

● ● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●● ●

●●

● ●

●●

●●

● ● ●

● ● ● ●

● ●

●●

● ●

●●

● ●

● ●

●●●

● ●

● ●

●●

● ●

●●

● ●

●●

●●

●●

● ● ●

●●

● ●

●●

● ● ●

●●

●●

● ●

● ●

●●

●●

● ●

● ●

●●

●●

● ●

●●

● ●

●●

● ●

●●

●●

●● ●

●●

● ●

●●

● ●

● ●

● ●

● ● ●

● ●

●●

●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

● ●

●●

●● ● ● ●

● ●

● ●

● ● ● ● ●

● ●

●● ●

● ●

● ●

● ● ●

● ●

● ●

●●

●● ● ● ● ● ● ● ● ●● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

● ● ● ●

● ●

● ● ●

●●

● ●

●●

●●

● ● ●

● ●

●● ●

●● ● ● ● ●

● ●

● ●

●● ●

● ●●

● ●

● ●

●● ●

● ●● ●

●●

●●

● ●

● ●

●●

●●

●●

●●

● ●

●●

●● ●

● ●

● ●

● ●

●●

●●

●●

● ● ●

●●

●●

● ●

● ●

●●

●●

●●

● ●

●●

● ●

Trend in Active surveillance proportion by year

Calendar year

Cas

e−m

ix a

djus

ted

prop

ortio

n0.

00.

20.

40.

60.

81.

0

2010 2011 2012 2013

●●

● ●

● ●

● ●

●● ● ●

● ●● ●

● ● ● ●● ●

● ● ●●

● ●

● ●●

●●

●● ●

●● ●

● ● ●●

●●

● ●

● ●

● ●● ● ●

●●

●● ●

●●

●● ●

●● ●● ● ● ●●

●●

● ● ●

●●

● ● ● ●

●●

● ●●

● ●

● ●● ● ● ●● ● ● ●● ●●

● ●

● ● ● ●● ●

● ● ● ●● ●

●●

●●

● ● ●● ●

●●

● ●●

● ●

● ● ●● ● ●

● ● ●

●●

●● ●

● ● ● ●● ● ● ●● ● ●● ●●●

● ●

● ●

● ●

● ● ●● ● ●●●●

●●

● ● ● ●● ● ●

● ●

●●

●● ●

●●

● ●

● ●

● ●● ●

● ●

●●

●●

●● ●● ●

● ●

● ● ●● ● ●

●●

● ●● ● ●● ●● ●

● ●

● ● ●● ●

●●

●● ● ●●

●● ● ●● ●

●● ● ●

●●

● ●● ●● ●●

●●

●●● ●

●●

● ● ●● ● ● ●●

● ●

● ●

● ●

● ●

● ●●

● ● ●

● ●● ●

● ● ●● ● ●

●● ● ●●

● ●

● ●

●●

● ● ● ●● ● ●● ● ●● ● ●● ●

●● ●

● ●

●●

● ● ● ●● ●●

●●

● ●● ●● ●● ● ●

● ● ●●● ●

●● ●

● ●●

● ●● ●● ● ●● ●●

● ●

● ● ●

● ● ● ●

●● ●

● ●●● ● ●● ● ●● ●

●● ● ●● ●

● ●

● ● ●

●●

● ●

● ●● ●

● ● ●

● ●● ●●

●●

●●

●● ● ●● ●

●●

●●

● ● ●● ● ● ●

● ● ●

●●

●●

● ● ●

●●

● ● ● ●● ● ●●

●● ● ● ●

●●

● ●

●●

●● ● ●●

●●

● ●● ● ● ●● ● ● ●

● ●

● ● ●

● ●

●●

●● ● ● ●

● ●

● ● ●●

●●

● ●

●●

● ●

● ●

●●

●● ●● ● ●

● ● ● ●

● ● ● ●

● ● ●● ● ●

● ●● ● ● ●● ● ● ●

●●

● ●

●●

● ●

● ●●

● ●

● ● ●

●● ●● ●

● ● ● ●

● ●

●● ● ● ●● ●●

●●

● ● ● ●

●● ● ●● ● ●● ● ●● ● ●● ● ● ●● ●

● ●● ●

● ● ●● ●● ● ● ● ●● ● ● ●● ● ●●●

●●

●●● ● ● ●● ● ● ●● ● ● ●●

● ●

● ● ●

●● ● ● ●●

● ● ● ●

● ● ●● ● ● ●● ● ● ●● ●● ● ● ●● ● ● ●●

●●

●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ●

●●● ●

● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ●● ● ● ●● ● ● ●● ●

●●

● ● ● ●●

● ● ●

● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●●

● ●

●●

●●

● ● ● ●● ● ●

● ● ●● ●

●● ● ●

● ● ●

●●

● ●

● ●●

● ●● ● ● ●● ● ● ●● ● ● ●● ● ●

● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ●

● ● ● ●● ● ● ●● ● ●

● ● ● ●● ●

●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●

●●

● ● ● ●

●● ● ● ●● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●

● ● ●●

● ●● ● ● ●●

●●

●● ● ● ●

● ●

●● ●●

●●

● ●

● ●

● ● ●

●●

● ●

●●●

●●

●●

● ●

●●

●●

● ●

● ● ●

● ●

● ●

●●

●●

●●

● ●●

Trend in Active treatment proportion by year

Calendar year

Cas

e−m

ix a

djus

ted

prop

ortio

n0.

00.

20.

40.

60.

81.

0

2004 2006 2008 2010 2012

●● ●

●●

●●

● ●

● ● ●●

● ●●

●● ● ●

●●

● ● ●

●●

● ●

●●

● ●●

●● ●

● ●● ●

●●

● ●● ●

●●

●●

●● ●

●●

● ●

●●

● ● ●

● ● ●

● ●

●●

● ●

● ● ●

●●

● ● ●

● ●

● ●●

● ●

● ● ● ● ●●

● ●

● ● ●

●● ●

●●

● ●

● ●●

● ●

●●

● ● ● ● ● ●●

● ● ●

●●

● ●

●● ●

●●

● ●

● ●

● ●

● ●

●●

●●

● ●

●●

● ● ●

● ● ● ● ● ●

● ●

● ● ●● ●

● ● ●

●●

● ●

● ● ●

●●

●●

● ●

●●

●●

●●

●● ●

● ●

●●

●●

● ●●

●●

●●

● ● ●

● ●●

●● ●

●●

●●

● ●●

●●

●● ●

● ● ● ● ● ● ● ●

● ●

● ● ● ●●

●● ● ●

● ●

● ●

● ●

●●

● ●

●●

●● ●

●●

● ●●

●●

● ● ●

●● ●

●●

●● ● ●

● ●

●●

●●

● ● ●●

●●

● ●

●●

●●

● ● ●

● ●

●●

● ●●

●●

●●

●●

●●

● ● ● ●●

●●

●●

●●

●●

● ●● ●

●●

● ●● ●

● ● ●●

●●

●●

●●

●●

●●

● ●

● ● ● ● ●●

● ● ● ● ●

●● ●

● ●

●●

● ●

● ●

● ●● ●

●●

● ●

● ●

●●●

●●

●●

● ●

● ● ●

●● ●

● ●

●●

●●

● ●

● ●●

●●●

●●

● ●

● ●

●●

●●

●●

● ●●

● ●

● ●● ●

● ●

●● ●

● ●

●● ●

● ●●

● ●

● ●

● ●

●●

●● ●

● ●

● ● ● ●●

●●

●● ● ●

●●

●●

●●

● ● ●

● ● ●

●● ●

● ●

● ●

●●

● ●

●●

● ●●

● ● ●

●● ●

● ● ● ●

● ●

●●

●●

● ● ●

● ●●

●●

● ●

● ●●

● ●●

●●

●●

● ● ● ● ●

● ●●

●● ●

● ●

● ●

● ●● ●

●●

●●●

● ●

●●

● ●

●●

● ● ●●

● ●

● ●

● ●

●●

● ● ●

●●

● ●

●●

●●

●●● ● ●

● ● ●

●●

● ●

● ●

● ● ● ●

●●

● ●

●●

●●

● ● ● ●

● ●●

●●

● ●

● ●

● ●

● ●

● ●

●●

●●

●●

● ● ●

● ●● ●

● ●

●● ●

● ●

●●

●●

●●

●●

● ●●

●● ● ●

●●

●●

● ●●

●●

●● ●

● ●

●●

●●

●●

●● ● ● ●

● ●

● ●

● ●

● ● ●

● ●

●●

● ●

●●

●●

● ●

●●

●●

●●

● ● ●

● ● ●

● ●●

●●

●●

● ●

● ●●

● ●

●●

●●

● ●

● ●●

●●

●●

● ● ●

● ●●

●●

●● ●

●● ●

● ● ● ●

● ●

● ●●

● ●

● ●

●●

●●

●●

● ● ●

● ●

●●

● ●

●●

●●

● ● ●

● ●

●●

●●

● ●

● ● ●

●●

● ●

● ●●

●●

●●

● ●

● ● ●

● ●

● ●●

● ● ● ● ●

● ●

● ● ●●

● ● ●

●●

●●

●●

● ●●

● ● ●

●● ● ●

● ● ●

● ●

●●

● ●●

●●

●●

● ●

● ● ●

● ●

●●

● ●

●●

●●

● ● ●●

● ● ●

● ● ● ●

● ●

● ●

● ●●

●●

● ●●

● ●

●●

●●

●●

●●

●●

● ●

● ●●

● ●

●●

●●

● ●

●●

● ●

●●

● ●

● ● ● ●

●●

●●

●●

●●

● ●

●●

● ● ●

● ● ●

●● ● ●

●●

● ●●

●●

●● ● ● ●

● ● ●

●●

●●

●●

●● ● ● ●

●●

●●

● ●

● ● ●●

● ●●

●●

●●●

● ● ●

● ●●

●●

●●

● ● ● ● ●

● ●

●●

● ● ● ● ●

● ●

● ● ●

●●

● ●

● ● ●

●●

●●

● ● ●

● ●

● ●●

● ●

● ●

● ● ●

●●

● ●

● ● ●

● ●●

●●

● ● ●

● ● ● ●

●● ●

●●

● ●

● ●

●●

●● ●

●● ●

●●

● ●

●●

● ● ● ●

●●

● ●

● ●

●●

●● ●

●●

● ●● ●

●● ●

●● ●

●●

●●

● ●

●●

●● ●

●● ●

● ● ●

● ●

●●

●●

●●

● ●●

● ●

● ●

●●

● ●

● ●

● ● ●● ●

● ● ● ● ●

●●

● ● ● ● ●

● ● ●

●●

●● ● ● ● ● ●

●●

● ●

● ● ● ●

● ●

● ●

●●

●●

● ● ●

● ●

● ●

● ● ●

●●

●●

● ●

●●

●●

● ● ●

●●

●●

● ●

●● ●

● ●●

●● ●

● ●●

●●

●●

●●

● ●

●● ●

●●

●●

● ●●

● ●

●●

● ●

● ● ● ● ●

●●

● ●●

● ●

●●

● ●●

● ●

● ●●

●●

● ● ● ●● ●●

● ●

● ●

● ●

● ●

●●

● ●

● ● ●

● ●

●●

●●

●●

●●

● ●●

● ● ●

●●

●●

●●

● ● ● ● ● ● ●

●●

●●

●●

● ●

● ●

● ●

● ● ●

● ●

●●

● ●

● ● ● ●

● ● ●

● ●

●●

● ●

● ● ●

● ●

●● ● ● ●

● ● ● ●●●

● ● ● ●

● ●

●● ●

●●

● ●

●●●

● ●

●●

● ● ● ● ●

●●

● ●

●●

● ●

● ● ●

●●

● ●

● ●

● ● ●

●●

●● ●

●●

●● ●

●●

●●

● ●

● ●

● ● ●●

●●

●●

●●

● ●

●●

● ●

● ●

● ●

● ●

●●

● ●

●●

●● ● ● ● ●

● ●

● ● ●

●●

● ●●

● ●

● ●

● ●

● ●●

●●

●● ●

● ●

● ●

● ●

● ●

● ● ● ●

● ●

●●

●●

●●

● ● ●● ● ●

● ●

●●

● ●

●●

● ●

●●

● ●

● ● ● ●

●● ●

●●

●●

●●

● ●

● ●

● ●

● ●●

●● ●

● ●

● ●●

● ●

●●

● ● ●●

● ● ●

● ● ● ● ●● ● ●

●●

●●

●●

● ●

●●

●● ●

●●

●●

●●

● ●

●●

●● ●

● ●

● ● ●

● ●

●●

● ● ●●

●● ●

● ●●

● ● ●

● ●

●● ●

●●

● ●

● ● ●

● ●

●●

●●

●● ●

● ● ●

●● ●

●●

●●

●●

●● ●

●●

● ●

● ●

●●

● ●

● ●●

● ●

● ●

● ●

●●

●●

● ●

● ●

● ●●

● ●

●●

● ● ●

● ●

●●

●●

● ● ● ●

● ●

●● ●

● ● ●

●●

● ● ●

●●

●●

●●

●● ● ●

●● ●

● ● ●

● ●●

●●

● ● ●● ●

● ●

●●

● ●

●●

● ● ●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

● ●

●● ●● ●

● ●

●●

● ●

●●

●●

● ●●

●●

● ●

● ●

● ●

● ●

●●

●●

●●

● ●● ● ●

● ●

●●

●●

●● ● ●

●● ● ●

● ●● ● ● ● ● ●

●● ● ● ● ● ● ●● ● ● ● ● ● ●

● ●

● ● ●● ● ● ●

●● ●

● ● ● ● ● ● ● ● ●●

● ●

●●

● ● ● ●● ●●

●●

●●

● ● ●●●

● ●● ●

● ● ●●

● ●

● ● ●● ●

●● ● ● ● ●● ●

● ●

●● ● ● ● ● ● ●

● ●● ● ●

● ●

● ●

● ●●

●● ● ● ● ● ● ● ●● ● ●

● ● ●●

●● ● ● ● ● ● ●

●● ● ●

● ● ● ● ●● ●

● ● ● ● ●●

● ● ● ● ● ● ●

●● ● ● ● ● ●

● ● ●● ● ●

● ●● ● ●

●● ●

●● ● ●

● ● ● ●

●● ● ● ● ●

● ●●

●● ●

● ● ●●

● ●●● ● ● ● ● ● ● ●

●●

● ●●

● ● ● ●● ● ● ● ●

●●

● ● ● ● ● ●●

● ● ● ● ● ● ● ●●●

●● ● ●

●● ●

●● ● ● ● ● ● ●

●● ●

●● ●

●●

● ● ●

●●

● ●●

●●● ● ● ● ●

● ● ●

●●

● ● ● ● ● ● ● ●●

●●

● ● ●

●●

● ● ●●

●● ●

● ● ●

● ●●

●●

● ●●

●●

●●

●● ●

●● ● ● ●●

●● ●

● ●

●●

●●

●●

●● ●● ●

●● ● ● ● ● ●

● ● ● ● ● ●● ●

●● ●

●● ● ● ●

●●

● ● ● ● ● ●

●●

●●

● ●● ●

●●

● ● ●● ● ● ●

●●

●●

●●

●●

● ● ●● ● ● ● ●

● ● ● ●●

● ● ● ●

● ● ●● ● ● ● ●

● ●

● ●●

● ●

● ●

● ● ●●

●●

●● ● ●

●●

● ●●

● ●●

● ●

● ●

● ● ● ●

●●

●●

●● ●● ● ● ●

●●

●●

● ● ● ● ● ●

● ●●● ● ●

●● ● ● ●

● ● ●●

● ●● ●

●●

●●

● ●●

● ●

● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ●

● ●

●● ●

● ●

● ● ● ● ●●

● ● ● ●

●●

● ●●● ● ● ●

●●

● ●● ●

●●

● ● ● ●

●● ● ● ● ● ● ●● ●● ● ●

● ●

●● ●

● ●

●●

● ●●

●● ●

● ●

●● ● ● ●

● ● ● ●

● ● ● ● ● ● ● ● ●●

● ● ● ●

●●

● ● ● ● ●● ● ●

●● ● ●

●●

● ● ● ●

●● ● ● ● ●

●●

●● ● ●● ● ● ●

●●

●●

● ● ● ● ●●

● ● ●● ● ● ● ● ●

●● ● ● ● ● ● ●

●●

●●

● ● ●●

●●

●●

●●

●● ● ● ●●

● ●

● ●

●● ● ●●

● ● ● ● ● ● ● ●● ●

● ●●

● ●

● ●●

●● ● ● ●

●●

●● ●

● ●●

●●

● ● ●● ●

● ●●

● ●

Trend in Time to first treatment by year

Calendar year

Cas

e−m

ix a

djus

ted

mea

n (d

ays)

020

4080

160

360

2004 2006 2008 2010 2012

●●

●●

● ●

●●

●●

● ●●

● ●● ●

● ●● ● ●

● ●

●● ●

●●

● ●

●●

● ● ● ●● ● ● ●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ● ●

● ●

● ● ●

● ●

● ●

●●

●●

●●

●●

● ●● ● ●

● ●

● ●● ●●

● ● ●

●●

●●

●●

●●

●●

●● ● ●

●●

●●

● ●

● ●

● ●●

● ●

● ● ●

●●

● ●

● ●

●● ●

●●

●●

●●

●●

●●

●●

● ●

●●

● ●

●●

● ●● ●

● ●

●●

●●

● ●

●●

●●

● ●

●●

● ●

●●

●●

●●

●●

● ● ●

● ●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

●● ● ●

●●

● ● ●●

●●

●●

●●

●●

● ● ●

●●

●●

●● ● ●● ●

●● ●

● ●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●●

● ● ●●

● ●

●●

●●

●●

● ●

● ●● ●

●●

● ●

●●

● ●

●●

● ● ● ●

●● ●

●● ●

●●

●●

●●

● ●

●●

● ●

● ●

● ●

●●

●●

●●

●●

● ●

●● ●

●●

●●

●●

●●

●●

● ●

● ●●

●● ●

●●

● ●

●●

● ● ●

● ●

●●

● ●

●●

●●

● ●

●●

●●

●●

●●

●● ●

● ●● ● ●

● ●● ●

●●

● ●● ●

●●

●●

● ●●

● ● ● ● ●

●●

● ● ●

●●

● ●

● ●

● ● ●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

● ●●

● ●

●●

●●

●●

● ●●

●●

● ● ●●

●●

●●

●●

●● ● ●

●●

● ●● ●

● ●●

● ●● ●

● ● ●

● ●

●●

● ●

●●

● ●

● ●

● ●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●● ●

●●

●●

●●

●●

● ●●

●●

●●

●●

●●

●●

●●

● ●●

●●

●●

● ●●

● ●

●●

● ●

● ● ● ●

●● ● ●

● ●

● ●●

●●

●●

● ●●

●●

● ●● ●

●●

●●

● ●

● ●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●●

● ●

●●●

●●

● ●● ●

●●

●●

●●

●● ●

●●

● ●

●●

● ●●

●●

●●

●●

● ●

● ●

● ●● ●

● ●

●●

●●

●●

●●

●●

●●

●●

● ● ●

● ●

●●

●●

●●

●●

● ●

●●

●●

●●

● ●

●●

● ●

● ●

●● ●

● ●

●●

● ●● ●

● ● ●●

● ●●

●●

●●

● ●●

●●

● ●

●●●

●●

● ●

● ●

●●

● ●●

●●

●●

● ●

●● ●

●●

● ●●

●●

● ●

●●

●●

●●

● ● ●

●●

●●

● ● ● ●

●●

●●

● ●●

●●

●● ●

●●

●●

● ● ●

● ●● ● ● ●

● ●●

●●

●●

●●

● ● ●●

●●

●●

● ●

●●

●●

● ●

● ●●●

●●

● ●

●●

● ●●

●●

●●

● ●●

●● ●

●●

●●

● ●

●●

●● ●

●●

●●

●●

●●

●● ●

●●

●● ● ●

● ● ● ●

●●

●●

●●

●● ●

●●

●●

● ● ● ●●

● ● ●●

● ●●

●●

●●

● ●

●●

●●

● ●

●● ●

● ●

● ● ●● ●

● ●

●●

●●

●● ●

● ●

●● ● ● ●

● ●●

● ● ● ●

●●

● ●

●● ● ●

●●

●●

●●

● ●

● ● ●

● ● ●

●●

●●

●● ●

●●

● ●●

● ●

● ● ●

● ● ●

●● ●

●● ●

● ●●

●●

● ●●

●●

●●

●●

●●

●● ●

●● ●

●●

● ●●

●●

● ●

●●

●●

●●

●●

●●

●●

● ●

●●

● ●

●●

● ●●

● ●

● ●

●●

● ●

●●

●●

● ●

● ●

●●

● ●

● ●●

●●

● ●●

●● ●

●●

●●

● ●

●●

● ●●

●●

● ●

● ● ● ●●

● ● ●●

●●

●●

●● ●

●●

●●

●●

●● ●

●●

● ●●

●●

●●

●●

●●

● ●●

●●

●●

● ●

●●

●●

●●

●● ●

●●

● ●

●●

●●

● ●

●●

●●

●●●

● ●●

●●

●●

●●

● ●

●●

●●

● ●

●●

●● ●

● ●

● ●

● ●● ●

●●

●● ●

●●

●●

● ●

● ●

●●

●●

● ● ● ● ● ● ● ● ● ●●

●●

● ● ● ●●

● ●

●●

●●

●●

●●

●●

● ●

●● ●

● ●

●●●

●●

● ●

●● ●

●●

●●

●●

●●

● ●

●●

● ●

● ●

● ●

●●

● ●

●●

● ●

● ●● ●

● ●

●●

●●

● ●

●●

●●

● ●●

●●

● ●

●●

●●

●●

● ●● ●

● ●●

●●

● ● ●

● ●●

●●

●●

● ●●

● ●

●●

●●

●●

● ●●

● ●

● ●● ● ●

●●

●●

●●

● ● ●● ●

● ●

● ●●

●●

●●

● ●●

●●

● ●

● ●

●●

● ●●

● ●

● ●

●●

●●

●● ● ● ●● ●

●● ●

● ●● ●

● ●●

●● ●

●●

●●

● ●

●●

●●

● ●

● ●●

● ●●

●●

● ●

● ●●

● ●●

●●

● ●

●●

● ●

●●

●●

● ●

●●

●●

●●

● ●

● ●

● ●

●●

●●

● ●

●●

● ●

●●

●●

●●

● ● ●

●●

●●

●●

● ●

●● ●

● ●

●●

● ●

●●

● ●

● ●

●●

●●

●●

●●

●●

●●

●● ●

●●

● ●●

●●

● ●

●●

● ●

●●

●● ● ●●

●● ●

●●

● ●

●●

● ● ●● ● ● ●

●● ●

● ●

●● ●

●●

● ● ●● ●

● ●●

● ●

●●

●●

●●

●●

● ●●

●● ●● ●

● ● ● ●●

●● ● ● ●

●● ● ●

●●

●●

●●

● ● ●●

● ●

● ●

● ●● ●

●●

●●

● ●

●●

● ●

●●

●●

● ●

●●

●●

●●

● ●●

●● ●

●●

● ●

●●

● ●

●●

●●

● ● ●●

●●

●●

● ●●

●●

● ●●

●●

●● ●

●●

●●

● ●

●●

● ●

●●

●●

●●●

●●

● ●●

● ●

● ●

●●

●● ●

●● ●

● ●●

●●●

●● ● ●

●● ●

● ●●

● ●●

● ●

● ●

●●

● ●● ● ●

● ●●

● ● ●

●●

●● ● ●

●●

●●

●● ●

● ●

●●

●●

●●

●●

● ● ●●

●●

●●

●●

● ●

●●

●●

●● ●

●●

● ● ● ●

●●●

●●

●●

● ●

●●

●●

●●

●●

● ● ●

● ● ● ● ●● ● ● ● ●

●●

●● ● ● ● ●

●●

●●

●● ●

●●

● ● ●●

●●

●●

●●

●●

● ● ● ● ● ● ● ●● ●

●●

● ●

●●

●● ●

● ● ● ● ●●

●●

●●

●●

●●

●●

●●

● ● ● ● ●●

● ●

● ●

● ●●

●●

● ●●

●●

● ●●

Trend in Length of stay by year

Calendar year

Cas

e−m

ix a

djus

ted

mea

n (d

ays)

01

23

45

79

2004 2006 2008 2010 2012

[Figure S3.2 panel: Trend in Readmission proportion by year. x-axis: Calendar year (2004 to 2012); y-axis: Case-mix adjusted proportion (0.0 to 1.0).]

[Figure S3.2 panel: Trend in Lymph node dissection proportion by year. x-axis: Calendar year (2004 to 2012); y-axis: Case-mix adjusted proportion (0.0 to 1.0).]

[Figure S3.2 panel: Trend in ADT with EBRT proportion by year. x-axis: Calendar year (2004 to 2012); y-axis: Case-mix adjusted proportion (0.0 to 1.0).]

[Figure S3.2 panel: Trend in Appropriate EBRT dose proportion by year. x-axis: Calendar year (2004 to 2012); y-axis: Case-mix adjusted proportion (0.0 to 1.0).]


Figure S3.2: Yearly trend in outlier status for each QI. Red circles are poor performers, blue circles are superior performers, black line is smoothed average time trend.


Panel: 30 day mortality (forest plot; x-axis: OR, log scale)

| Variable | OR | 95% CI |
| --- | --- | --- |
| Positive margin T2 upper outlier | 1.57 | (0.96, 2.56) |
| Positive margin T3 upper outlier | 1.60 | (0.96, 2.64) |
| Act. Surv. upper outlier | 0.71 | (0.44, 1.13) |
| Act. Tx. upper outlier | 1.04 | (0.44, 2.45) |
| log Time to Tx. upper outlier | 0.60 | (0.39, 0.94) |
| logLOS upper outlier | 1.57 | (1.09, 2.27) |
| Readmission upper outlier | 1.25 | (0.64, 2.43) |
| Lymph. Dis. upper outlier | 1.01 | (0.71, 1.44) |
| Concurrent ADT upper outlier | 1.12 | (0.59, 2.10) |
| EBRT dose upper outlier | 0.44 | (0.15, 1.28) |

Panel: 90 day mortality (forest plot; x-axis: OR, log scale)

| Variable | OR | 95% CI |
| --- | --- | --- |
| Positive margin T2 upper outlier | 1.13 | (0.72, 1.78) |
| Positive margin T3 upper outlier | 1.18 | (0.67, 2.09) |
| Act. Surv. upper outlier | 0.77 | (0.49, 1.23) |
| Act. Tx. upper outlier | 0.88 | (0.38, 2.01) |
| log Time to Tx. upper outlier | 0.68 | (0.46, 1.01) |
| logLOS upper outlier | 1.37 | (0.98, 1.92) |
| Readmission upper outlier | 1.13 | (0.57, 2.23) |
| Lymph. Dis. upper outlier | 1.08 | (0.78, 1.51) |
| Concurrent ADT upper outlier | 0.93 | (0.53, 1.63) |
| EBRT dose upper outlier | 0.46 | (0.19, 1.16) |

Panel: Overall mortality (forest plot; x-axis: OR, log scale)

| Variable | OR | 95% CI |
| --- | --- | --- |
| Positive margin T2 upper outlier | 1.16 | (1.03, 1.32) |
| Positive margin T3 upper outlier | 1.26 | (1.08, 1.47) |
| Act. Surv. upper outlier | 0.87 | (0.76, 1.00) |
| Act. Tx. upper outlier | 1.06 | (0.91, 1.24) |
| log Time to Tx. upper outlier | 0.86 | (0.78, 0.96) |
| logLOS upper outlier | 1.16 | (1.07, 1.25) |
| Readmission upper outlier | 1.15 | (0.97, 1.36) |
| Lymph. Dis. upper outlier | 0.99 | (0.92, 1.06) |
| Concurrent ADT upper outlier | 0.96 | (0.85, 1.10) |
| EBRT dose upper outlier | 0.93 | (0.85, 1.01) |

Panel: Salvage therapy (forest plot; x-axis: OR, log scale)

| Variable | OR | 95% CI |
| --- | --- | --- |
| Positive margin T2 upper outlier | 1.76 | (1.29, 2.39) |
| Positive margin T3 upper outlier | 2.11 | (1.45, 3.06) |
| Act. Surv. upper outlier | 1.02 | (0.65, 1.60) |
| Act. Tx. upper outlier | 1.78 | (1.16, 2.74) |
| log Time to Tx. upper outlier | 1.52 | (1.00, 2.33) |
| logLOS upper outlier | 1.47 | (1.13, 1.90) |
| Readmission upper outlier | 0.90 | (0.58, 1.39) |
| Lymph. Dis. upper outlier | 0.91 | (0.70, 1.18) |
| Concurrent ADT upper outlier | 0.96 | (0.70, 1.31) |
| EBRT dose upper outlier | 0.84 | (0.60, 1.17) |

Panel: ADT initiation (forest plot; x-axis: OR, log scale)

| Variable | OR | 95% CI |
| --- | --- | --- |
| Positive margin T2 upper outlier | 1.11 | (0.89, 1.38) |
| Positive margin T3 upper outlier | 1.24 | (0.89, 1.71) |
| Act. Surv. upper outlier | 0.94 | (0.72, 1.22) |
| Act. Tx. upper outlier | 0.80 | (0.63, 1.02) |
| log Time to Tx. upper outlier | 1.61 | (1.27, 2.05) |
| logLOS upper outlier | 1.27 | (1.07, 1.52) |
| Readmission upper outlier | 1.04 | (0.65, 1.67) |
| Lymph. Dis. upper outlier | 1.10 | (0.94, 1.29) |
| Concurrent ADT upper outlier | 0.76 | (0.56, 1.03) |
| EBRT dose upper outlier | 0.93 | (0.77, 1.13) |

Figure S3.3: Associations of QIs with outcomes of interest, adjusted for case-mix.
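The odds ratios in Figure S3.3 are plotted on a log scale, so a reported interval can be read back into an approximate standard error. The sketch below is illustrative only: it assumes the intervals are Wald-type and symmetric on the log-odds scale, which the figure does not state, and the function name `log_scale_summary` is hypothetical. It uses the logLOS upper outlier row from the 30 day mortality panel, OR 1.57 (1.09, 2.27).

```python
import math


def log_scale_summary(or_est, ci_low, ci_high, z=1.96):
    """Return (log OR, approximate SE, geometric midpoint of the CI).

    Assumption (not stated in the source): the interval is a Wald-type
    95% CI, i.e. exp(log OR +/- z * SE), so it is symmetric on the log scale.
    """
    log_or = math.log(or_est)
    # Half-width of the interval on the log scale, divided by z.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    # Under the symmetry assumption this should be close to or_est.
    midpoint = math.sqrt(ci_low * ci_high)
    return log_or, se, midpoint


# logLOS upper outlier, 30 day mortality panel of Figure S3.3.
log_or, se, midpoint = log_scale_summary(1.57, 1.09, 2.27)
print(f"log OR = {log_or:.3f}, SE = {se:.3f}, CI midpoint = {midpoint:.2f}")
```

If the geometric midpoint of an interval sits far from the reported OR, the symmetry assumption does not hold for that row and the implied standard error should not be trusted.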


[Figure S3.4 histogram. x-axis: PC-QS (−5 to 6); y-axis: Number of hospitals (0 to 200).]

Figure S3.4: Distribution of the PC-QS in the validation set.


Table S3.1: Quality indicator definitions and inclusion criteria

| Quality Indicator | Definition | Inclusion | Exclusion |
| --- | --- | --- | --- |
| Positive Margin Rate T2 (PMT2) | Positive margin rate for pT2 patients. | Surgery as primary treatment; pT2 disease | M1 disease |
| Positive Margin Rate T3 (PMT3) | Positive margin rate for pT3 patients. | Surgery as primary treatment; pT3 disease | M1 disease |
| Active Surveillance Proportion (AS) | Proportion of low-risk patients undergoing AS. | Gleason 6 AND PSA ≤ 10 AND ≤ cT2a AND ≤ 25% cores positive | M1 disease |
| Active Treatment Proportion (AT) | Proportion of high-risk patients undergoing primary surgery or radiation therapy. | Gleason ≥ 8 OR PSA > 20 OR ≥ cT2c; cN0M0 disease | M1 disease |
| Time to Treatment (TT) | Time from biopsy to primary treatment. | Gleason ≥ 8 OR PSA > 20 OR ≥ cT2c; Surgery or radiation as primary treatment | M1 disease |
| Length of Stay (LOS) | Length of in-hospital stay following radical prostatectomy. | All patients with surgery as primary therapy | M1 disease |
| Readmission Proportion (RP) | Readmission proportion following radical prostatectomy. | All patients with surgery as primary therapy | M1 disease |
| Lymph Node Dissection Proportion (LND) | Performance of a lymph node dissection at the time of radical prostatectomy for intermediate and high risk disease. | Gleason ≥ 7 OR PSA > 10 OR ≥ pT2b; Surgery as primary therapy | M1 disease |
| Concurrent ADT & EBRT (ADT-EBRT) | Proportion of high-risk patients receiving ADT with primary radiation therapy. | Gleason ≥ 8 AND PSA > 20 AND ≥ pT2c; Radiation as primary therapy | M1 disease |
| EBRT Dose (RTD) | Proportion of patients undergoing primary radiation therapy receiving a dose of 75-80 Gy. | All cN0M0 patients receiving radiation as primary therapy | M1 disease |


Table S3.2: Descriptive statistics of patients from each QI cohort in the training set. Entries are Median (IQR) or N (%).

| Statistic | T2 Positive Margins | T3 Positive Margins | AS with Positive Cores | Active Treatment | Time to First Treatment |
| --- | --- | --- | --- | --- | --- |
| Number of patients | 147377 | 43571 | 21690 | 89773 | 99560 |
| Age (years) | 61 (56-66) | 62 (67-77) | 62 (57-68) | 66 (60-72) | 67 (60-73) |
| Year of Diagnosis | 2009 (2007-2011) | 2009 (2007-2011) | 2011 (2010-2012) | 2009 (2006-2011) | 2009 (2006-2011) |
| PSA (ng/mL) | 5.2 (4.1-7.2) | 6.5 (4.7-10.5) | 5 (4-6.4) | 9.1 (5.3-25.8) | 9 (5.3-25) |
| Great Circle Distance (miles) | 14.9 (6.4-40.3) | 15.5 (6.5-43) | 11.8 (5.3-27.9) | 10.4 (4.5-24.7) | 10.4 (4.5-24.6) |
| Missing or N/A | 1519 | | | | |
| Year of Surgery | 2009 (2007-2011) | 2009 (2007-2011) | 2011 (2010-2012) | 2009 (2007-2011) | 2009 (2007-2011) |
| Missing or N/A | 0 | | | | |
| Time to Treatment (days) | 68 (47-97) | 64 (46-91) | 85 (59-125) | 78 (50-118) | 78 (50-118) |
| Missing or N/A | 0 | | | | |
| Length of Stay (days) | 1 (1-2) | 1 (1-2) | 1 (1-2) | 2 (1-2) | 2 (1-2) |
| Missing or N/A | 6024 | | | | |
| Tumour Size (mm) | 14 (9-20) | 20 (15-28) | 10 (5-15) | 18 (12-25) | 18 (12-25) |
| Missing or N/A | 99881 | | | | |
| EBRT Dose (Gy) | 64.8 (46-68.4) | 64.8 (45-68.4) | 75.6 (50.4-78) | 54 (45-76) | 54 (45-75.6) |
| Missing or N/A | 145470 | | | | |
| Proportion of Cores Positive | 0.3 (0.2-0.5) | 0.5 (0.3-0.8) | 0.14 (0.1-0.2) | 0.5 (0.3-0.8) | 0.5 (0.3-0.8) |
| Missing or N/A | 106769 | | | | |
| Gleason Category | | | | | |
| 6 | 71663 (48.63) | 6482 (14.88) | 21690 (100) | 17101 (19.05) | 17811 (17.89) |
| 7 | 66417 (45.07) | 24912 (57.18) | 0 (0) | 26964 (30.04) | 29297 (29.43) |
| 8 | 6623 (4.49) | 5903 (13.55) | 0 (0) | 27263 (30.37) | 30570 (30.71) |
| 9 | 2559 (1.74) | 5947 (13.65) | 0 (0) | 16965 (18.9) | 19958 (20.05) |
| 10 | 115 (0.08) | 327 (0.75) | 0 (0) | 1480 (1.65) | 1924 (1.93) |
| Charlson Comorbidity Index | | | | | |
| 0 | 124677 (84.6) | 35786 (82.13) | 18471 (85.16) | 75873 (84.52) | 84087 (84.46) |
| 1 | 20352 (13.8) | 6861 (15.75) | 2810 (12.96) | 11836 (13.18) | 13211 (13.27) |
| 2 | 2348 (1.6) | 924 (2.12) | 409 (1.89) | 2064 (2.3) | 2262 (2.27) |
| Positive Margins | | | | | |
| Missing | 0 (0) | 0 (0) | 11119 (51.26) | 51118 (56.94) | 55679 (55.93) |
| Negative | 124718 (84.6) | 24425 (56.06) | 9203 (42.43) | 27092 (30.18) | 30611 (30.75) |
| Positive | 22659 (15.4) | 19146 (43.94) | 1368 (6.31) | 11563 (12.88) | 13270 (13.33) |
| Active Surveillance | | | | | |
| Missing | 86684 (58.8) | 22707 (52.11) | 0 (0) | 54050 (60.21) | 60091 (60.36) |
| No | 60693 (41.2) | 20864 (47.89) | 18297 (84.36) | 35342 (39.37) | 39469 (39.64) |
| Yes | 0 (0) | 0 (0) | 3393 (15.64) | 381 (0.42) | 0 (0) |
| Active Treatment | | | | | |
| Missing | 0 (0) | 0 (0) | 5 (0.02) | 0 (0) | 0 (0) |
| No | 0 (0) | 0 (0) | 4236 (19.53) | 6274 (6.99) | 0 (0) |
| Yes | 147377 (100) | 43571 (100) | 17449 (80.45) | 83499 (93.01) | 99560 (100) |
| Readmission | | | | | |
| Missing | 2339 (1.59) | 758 (1.74) | 77 (0.36) | 1427 (1.59) | 1531 (1.54) |
| No | 141461 (96) | 41695 (95.69) | 21366 (98.51) | 87036 (96.95) | 96463 (96.89) |
| Yes | 3577 (2.43) | 1118 (2.57) | 247 (1.14) | 1310 (1.46) | 1566 (1.57) |
| Lymph Node Dissection | | | | | |
| Missing | 311 (0.21) | 72 (0.17) | 64 (0.3) | 343 (0.38) | 392 (0.39) |
| No | 60074 (40.8) | 9605 (22.04) | 17595 (81.12) | 60809 (67.74) | 67365 (67.66) |
| Yes | 86992 (59) | 33894 (77.79) | 4031 (18.58) | 28621 (31.88) | 31803 (31.94) |


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set. Entries are N (%).

| Statistic | T2 Positive Margins | T3 Positive Margins | AS with Positive Cores | Active Treatment | Time to First Treatment |
| --- | --- | --- | --- | --- | --- |
| Concurrent EBRT and ADT | | | | | |
| Missing | 145343 (98.6) | 36903 (84.7) | 15237 (70.25) | 43961 (48.97) | 43135 (43.33) |
| No | 1813 (1.2) | 5237 (12.02) | 6258 (28.85) | 27245 (30.35) | 33297 (33.44) |
| Yes | 221 (0.2) | 1431 (3.28) | 195 (0.9) | 18567 (20.68) | 23128 (23.23) |
| EBRT dose 75-80 Gy | | | | | |
| Missing | 145470 (98.7) | 37279 (85.56) | 18484 (85.22) | 56157 (62.55) | 57837 (58.09) |
| No | 1881 (1.3) | 6234 (14.31) | 1582 (7.29) | 23777 (26.49) | 29637 (29.77) |
| Yes | 26 (0) | 58 (0.13) | 1624 (7.49) | 9839 (10.96) | 12086 (12.14) |
| Clinical T-stage | | | | | |
| Missing | 19220 (13.04) | 5724 (13.14) | | | |
| 0, A or IS | 50 (0.03) | 14 (0.03) | | | |
| 1 | 554 (0.38) | 126 (0.29) | 124 (0.57) | 261 (0.29) | 312 (0.31) |
| 1a | 497 (0.34) | 76 (0.17) | 130 (0.6) | 287 (0.32) | 430 (0.43) |
| 1b | 329 (0.22) | 77 (0.18) | 41 (0.19) | 434 (0.48) | 864 (0.87) |
| 1c | 94100 (63.85) | 22751 (52.22) | 19423 (89.55) | 35856 (39.94) | 38098 (38.27) |
| 2 | 6761 (4.59) | 2572 (5.9) | 1972 (9.09) | 3872 (4.31) | 4194 (4.21) |
| 2a | 11640 (7.9) | 3761 (8.63) | | 6143 (6.84) | 7062 (7.09) |
| 2b | 3154 (2.14) | 2390 (5.49) | | 4958 (5.52) | 5790 (5.82) |
| 2c | 10797 (7.33) | 2719 (6.24) | | 28338 (31.57) | 31242 (31.38) |
| 3 | 117 (0.08) | 754 (1.73) | | 2483 (2.77) | 3039 (3.05) |
| 3a | 108 (0.07) | 1562 (3.58) | | 4017 (4.47) | 4550 (4.57) |
| 3b | 40 (0.03) | 1014 (2.33) | | 2578 (2.87) | 3066 (3.08) |
| 4 | 10 (0.01) | 31 (0.07) | | 546 (0.61) | 913 (0.92) |
| Pathological T-stage | | | | | |
| Missing | 0 (0) | 0 (0) | 11147 (51.39) | 51857 (57.76) | 56814 (57.07) |
| 0, A or IS | 0 (0) | 0 (0) | 56 (0.26) | 26 (0.03) | 25 (0.03) |
| 1 | 0 (0) | 0 (0) | 8 (0.04) | 4 (0) | 10 (0.01) |
| 1a | 0 (0) | 0 (0) | 17 (0.08) | 13 (0.01) | 18 (0.02) |
| 1b | 0 (0) | 0 (0) | 3 (0.01) | 22 (0.02) | 35 (0.04) |
| 1c | 0 (0) | 0 (0) | 248 (1.14) | 286 (0.32) | 318 (0.32) |
| 2 | 7473 (5.07) | 0 (0) | 391 (1.8) | 991 (1.1) | 1121 (1.13) |
| 2a | 22578 (15.32) | 0 (0) | 2124 (9.79) | 1970 (2.19) | 2216 (2.23) |
| 2b | 5890 (4) | 0 (0) | 180 (0.83) | 881 (0.98) | 1079 (1.08) |
| 2c | 111436 (75.61) | 0 (0) | 6810 (31.4) | 18674 (20.8) | 20793 (20.88) |
| 3 | 0 (0) | 2619 (6.01) | 30 (0.14) | 862 (0.96) | 1050 (1.05) |
| 3a | 0 (0) | 27809 (63.82) | 600 (2.77) | 7694 (8.57) | 8600 (8.64) |
| 3b | 0 (0) | 13143 (30.16) | 67 (0.31) | 6122 (6.82) | 6995 (7.03) |
| 4 | 0 (0) | 0 (0) | 9 (0.04) | 371 (0.41) | 486 (0.49) |
| Lymph-vascular Invasion | | | | | |
| Missing | 91935 (62.4) | 24929 (57.21) | 9294 (42.85) | 68880 (76.73) | 76924 (77.26) |
| No | 53822 (36.5) | 14780 (33.92) | 12255 (56.5) | 17981 (20.03) | 19332 (19.42) |
| Yes | 1620 (1.1) | 3862 (8.86) | 141 (0.65) | 2912 (3.24) | 3304 (3.32) |
| Nodal Status | | | | | |
| Missing | 341 (0.2) | 103 (0.24) | 144 (0.66) | 639 (0.71) | 775 (0.78) |
| All Negative | 86577 (58.8) | 30640 (70.32) | 4099 (18.9) | 27339 (30.45) | 29593 (29.72) |
| Positive nodes found | 648 (0.4) | 3267 (7.5) | 8 (0.04) | 1982 (2.21) | 2952 (2.97) |
| No nodes examined | 59811 (40.6) | 9561 (21.94) | 17439 (80.4) | 59813 (66.63) | 66240 (66.53) |


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Statistic | T2 Positive Margins | T3 Positive Margins | AS with Positive Cores | Active Treatment | Time to First Treatment
(each cohort column reports Median/N followed by IQR/%)

His

tolo

gic

al

Typ

eA

den

oca

rcin

om

a147364

99.9

43596

100

21688

99.9

989769

100

99556

100

Sci

rrh

ou

sad

enoca

rcin

om

a11

02

02

0.0

14

04

0S

up

erfi

cial

spre

ad

ing

ad

enoc.

10

00

00

00

00

Basa

lce

llad

enoca

rcin

om

a1

00

00

00

00

0U

rban

/R

ura

lM

issi

ng

4675

3.1

71263

2.9

531

2.4

52758

3.0

73102

3.1

21

met

ro,

at

least

1m

illion

pop

64298

43.6

319213

44.1

10066

46.4

138228

42.5

842119

42.3

12

met

ro,

250K

to1

million

34055

23.1

19597

22.0

35127

23.6

420038

22.3

222447

22.5

53

met

ro,

less

than

250K

16644

11.2

95026

11.5

42351

10.8

411084

12.3

512251

12.3

14

urb

an

pop

at

least

20K

,n

ear

met

ro7660

5.2

2425

5.5

71121

5.1

74650

5.1

85208

5.2

3

5u

rban

pop

at

least

20K

3392

2.3

1047

2.4

399

1.8

42119

2.3

62388

2.4

6u

rban

pop

at

least

2.5

K,

nea

rm

etro

8195

5.5

62478

5.6

91089

5.0

25711

6.3

66280

6.3

1

7u

rban

pop

at

least

2.5

K4769

3.2

41424

3.2

7586

2.7

2855

3.1

83228

3.2

48

com

ple

tely

rura

l,n

ear

met

ro1718

1.1

7539

1.2

4196

0.9

1125

1.2

51188

1.1

99

com

ple

tely

rura

l1971

1.3

4559

1.2

8224

1.0

31205

1.3

41349

1.3

5C

ensu

sare

ah

ou

seh

old

inco

me

Mis

sin

g5474

3.7

11549

3.5

6694

3.2

3229

3.6

3560

3.5

8L

ess

than

$30,0

00

14054

9.5

44521

10.3

82126

9.8

11808

13.1

512248

12.3

$30,0

00

-$3

4,9

99

23316

15.8

26978

16.0

23325

15.3

315932

17.7

517589

17.6

7$3

5,0

00

-$4

5,9

99

39409

26.7

411676

26.8

5781

26.6

524286

27.0

527302

27.4

2$4

6,0

00+

65124

44.1

918847

43.2

69764

45.0

234518

38.4

538861

39.0

3C

ensu

sare

ah

igh

sch

ool

dro

pou

tM

issi

ng

5486

3.7

21554

3.5

7695

3.2

3234

3.6

3567

3.5

829%

or

more

17031

11.5

65306

12.1

82802

12.9

214464

16.1

115148

15.2

120%

-28.9

%28555

19.3

88737

20.0

54238

19.5

419693

21.9

421670

21.7

714%

-19.9

%34009

23.0

810150

23.3

4941

22.7

820994

23.3

923355

23.4

6L

ess

than

14%

62296

42.2

717824

40.9

19014

41.5

631388

34.9

635820

35.9

8In

sura

nce

Sta

tus

Mis

sin

g2733

1.8

5626

1.4

4255

1.1

81453

1.6

21582

1.5

9N

ot

Insu

red

1733

1.1

8734

1.6

8259

1.1

91694

1.8

91649

1.6

6P

rivate

97782

66.3

525700

58.9

812836

59.1

837906

42.2

240953

41.1

3M

edic

aid

2018

1.3

7803

1.8

4408

1.8

82484

2.7

72394

2.4

Med

icare

41237

27.9

815178

34.8

47584

34.9

744596

49.6

851246

51.4

7O

ther

Gover

nm

ent

1874

1.2

7530

1.2

2348

1.6

1640

1.8

31736

1.7

4R

ace

Mis

sin

g3795

2.5

8998

2.2

9225

1.0

41136

1.2

71327

1.3

3W

hit

e119209

80.8

935128

80.6

217305

79.7

867809

75.5

376945

77.2

9B

lack

16620

11.2

84779

10.9

72740

12.6

314309

15.9

414480

14.5

4N

ati

ve

Am

eric

an

262

0.1

8101

0.2

344

0.2

198

0.2

2206

0.2

1A

sian

2245

1.5

2918

2.1

1392

1.8

12029

2.2

62198

2.2

1O

ther

653

0.4

4209

0.4

890

0.4

1398

0.4

4434

0.4

4H

isp

an

ic4593

3.1

21438

3.3

894

4.1

23894

4.3

43970

3.9

9


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Statistic | T2 Positive Margins | T3 Positive Margins | AS with Positive Cores | Active Treatment | Time to First Treatment
(each cohort column reports Median/N followed by IQR/%)

Dia

gn

ost

icC

on

firm

ati

on

Mis

sin

g0

00

00

0113

0.1

3125

0.1

3N

o1

00

036

0.1

77

0.0

110

0.0

1Y

es14377

100

43571

100

21654

99.8

389653

99.8

799425

99.8

6R

ad

ical

Pro

state

ctom

yN

o0

00

011324

52.2

152279

58.2

357430

57.6

8Y

es147377

100

43571

100

10366

47.7

937494

41.7

742130

42.3

2O

pen

Su

rger

yM

issi

ng

87579

59.4

323780

54.5

810856

50.0

571330

79.4

678775

79.1

2N

o49257

33.4

215860

36.4

9177

42.3

114329

15.9

616130

16.2

Yes

10541

7.2

3931

9.0

21657

7.6

44114

4.5

84655

4.6

8P

rim

ary

Rad

iati

on

Tre

atm

ent

Mis

sin

g1746

1.1

8749

1.7

279

0.3

6716

0.8

809

0.8

1N

o143580

97.4

236071

82.7

915081

69.5

342317

47.1

442252

42.4

4Y

es2051

1.3

96751

15.4

96530

30.1

146740

52.0

656499

56.7

5P

rim

ary

EB

RT

Mis

sin

g1759

1.1

9778

1.7

987

0.4

895

11010

1.0

1N

o143600

97.4

436090

82.8

318321

84.4

754280

60.4

655657

55.9

Yes

2018

1.3

76703

15.3

83282

15.1

334598

38.5

442893

43.0

8P

rim

ary

AD

TM

issi

ng

2168

1.4

7700

1.6

172

0.3

31154

1.2

91277

1.2

8N

o132756

90.0

834769

79.8

19040

87.7

847790

53.2

351636

51.8

6Y

es3432

2.3

35373

12.3

3918

4.2

336485

40.6

442219

42.4

1N

ot

ad

min

iste

red

du

eto

kn

ow

nre

aso

n9021

6.1

22729

6.2

61660

7.6

54344

4.8

44428

4.4

5

cN0M

0N

o35509

24.0

910607

24.3

41972

9.0

90

012451

12.5

1Y

es111868

75.9

132964

75.6

619718

90.9

189773

100

87109

87.4

9cN

1N

o147273

99.9

343263

99.2

921682

99.9

689773

100

97761

98.1

9Y

es104

0.0

7308

0.7

18

0.0

40

01799

1.8

1R

ad

iati

on

-Su

rger

yS

equ

ence

No

rad

iati

on

or

surg

ery

143583

97.4

336066

82.7

821393

98.6

383626

93.1

592416

92.8

2R

ad

iati

on

bef

ore

surg

ery

37

0.0

325

0.0

658

0.0

673

0.0

7R

ad

iati

on

aft

ersu

rger

y2014

1.3

76718

15.4

2203

0.9

45282

5.8

86194

6.2

2R

ad

.b

oth

bef

ore

an

daft

ersu

rger

y7

0.0

28

0.0

19

0.0

1

Intr

aop

erati

ve

radia

tion

10

11

0.0

56

0.0

16

0.0

1O

ther

10

10

Seq

uen

ceu

nkn

ow

n1743

1.1

8754

1.7

383

0.3

8792

0.8

8861

0.8

6


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Cells give Median/N with IQR/% in parentheses.

Statistic | Length of Stay | Readmission | Lymph Node Dissection | Concur. EBRT & ADT (3 mths) | EBRT Dose
Number of patients | 184370 | 189912 | 176495 | 36899 | 88309
Age (years) | 61 (56-66) | 61 (56-66) | 61 (56-66) | 71 (65-76) | 70 (64-75)
Year of Diagnosis | 2009 (2007-2011) | 2009 (2007-2011) | 2009 (2007-2011) | 2008 (2006-2011) | 2008 (2006-2011)
PSA (ng/mL) | 5.4 (4.2-7.8) | 5.4 (4.2-7.8) | 5.5 (4.3-8.1) | 11.8 (6.3-28.7) | 7.2 (5-12)
Great Circle Distance (miles) | 15.2 (6.5-41.7) | 15 (6.4-41.1) | 15 (6.4-40.6) | 8.3 (3.8-18.7) | 8 (3.8-17.3)
  Missing or N/A
Year of Surgery | 2009 (2007-2011) | 2009 (2007-2011) | 2009 (2007-2011) | 2008 (2006-2010) | 2007 (2005-2010)
  Missing or N/A
Time to Treatment (days) | 67 (47-96) | 67 (47-95) | 67 (47-95) | 103 (71-141) | 94 (63-135)
  Missing or N/A
Length of Stay (days) | 1 (1-2) | 1 (1-2) | 1 (1-2) | 1 (0-2) | 1 (0-3)
  Missing or N/A
Tumour Size (mm) | 15 (10-21) | 15 (10-21) | 15 (11-22) | 15 (9-25) | 10 (5-25)
  Missing or N/A
EBRT Dose (Gy) | 64.8 (45-68.4) | 64.8 (45-68.4) | 64.8 (45-68.4) | 54 (45-76) | 72 (45-77.4)
  Missing or N/A
Proportion of Cores Positive | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.6 (0.3-0.8) | 0.4 (0.2-0.7)
  Missing or N/A
Gleason Category
  6 | 75555 (40.98) | 77567 (40.84) | 62379 (35.34) | 4535 (12.29) | 30665 (34.72)
  7 | 88108 (47.79) | 90761 (47.79) | 92161 (52.22) | 10070 (27.29) | 38032 (43.07)
  8 | 12081 (6.55) | 12541 (6.6) | 12721 (7.21) | 13023 (35.29) | 11670 (13.21)
  9 | 8192 (4.44) | 8585 (4.52) | 8766 (4.97) | 8335 (22.59) | 7187 (8.14)
  10 | 434 (0.24) | 458 (0.24) | 468 (0.27) | 936 (2.54) | 755 (0.85)
Charlson Comorbidity Index
  0 | 154644 (83.88) | 159519 (84) | 147896 (83.8) | 32354 (87.68) | 78285 (88.65)
  1 | 26543 (14.4) | 27132 (14.29) | 25509 (14.45) | 3739 (10.13) | 8278 (9.37)
  2 | 3183 (1.73) | 3261 (1.72) | 3090 (1.75) | 806 (2.18) | 1746 (1.98)
Positive Margins
  Missing | 1106 (0.6) | 1267 (0.67) | 1177 (0.67) | 36861 (99.9) | 88256 (99.94)
  Negative | 142996 (77.56) | 146949 (77.38) | 134075 (75.97) | 28 (0.08) | 42 (0.05)
  Positive | 40268 (21.84) | 41696 (21.96) | 41243 (23.37) | 10 (0.03) | 11 (0.01)
Active Surveillance
  Missing | 105348 (57.14) | 108220 (56.98) | 100179 (56.76) | 22705 (61.53) | 55567 (62.92)
  No | 79022 (42.86) | 81692 (43.02) | 76316 (43.24) | 14194 (38.47) | 32742 (37.08)
  Yes | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0)
Active Treatment
  Missing | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0)
  No | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0)
  Yes | 184370 (100) | 189912 (100) | 176495 (100) | 36899 (100) | 88309 (100)
Readmission
  Missing | 2380 (1.29) | | 2926 (1.66) | 390 (1.06) | 1004 (1.14)
  No | 177367 (96.2) | 185145 (97.49) | 169201 (95.87) | 36406 (98.66) | 87149 (98.69)
  Yes | 4623 (2.51) | 4767 (2.51) | 4368 (2.47) | 103 (0.28) | 156 (0.18)


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Cells give Median/N with IQR/% in parentheses.

Statistic | Length of Stay | Readmission | Lymph Node Dissection | Concur. EBRT & ADT (3 mths) | EBRT Dose
Lymph Node Dissection
  Missing | 365 (0.2) | 407 (0.21) | 0 (0) | 66 (0.18) | 139 (0.16)
  No | 67272 (36.49) | 68882 (36.27) | 62165 (35.22) | 36343 (98.49) | 87721 (99.33)
  Yes | 116733 (63.31) | 120623 (63.52) | 114330 (64.78) | 490 (1.33) | 449 (0.51)
Concurrent EBRT and ADT
  Missing | 176494 (95.73) | 181259 (95.44) | 167562 (94.94) | 0 (0) | 1153 (1.31)
  No | 6384 (3.46) | 7016 (3.69) | 7220 (4.09) | 19243 (52.15) | 61566 (69.72)
  Yes | 1492 (0.81) | 1637 (0.86) | 1713 (0.97) | 17656 (47.85) | 25590 (28.98)
EBRT dose 75-80 Gy
  Missing | 176975 (95.99) | 181768 (95.71) | 168075 (95.23) | 849 (2.3) | 0 (0)
  No | 7317 (3.97) | 8060 (4.24) | 8336 (4.72) | 24419 (66.18) | 53254 (60.3)
  Yes | 78 (0.04) | 84 (0.04) | 84 (0.05) | 11631 (31.52) | 35055 (39.7)
Clinical T-stage
  Missing | 23580 (12.79) | 24512 (12.91) | 22948 (13) | 0 (0) | 0 (0)
  0, A or IS | 63 (0.03) | 67 (0.04) | 53 (0.03) | 0 (0) | 0 (0)
  1 | 657 (0.36) | 688 (0.36) | 579 (0.33) | 98 (0.27) | 284 (0.32)
  1a | 545 (0.3) | 567 (0.3) | 444 (0.25) | 28 (0.08) | 143 (0.16)
  1b | 398 (0.22) | 406 (0.21) | 369 (0.21) | 37 (0.1) | 131 (0.15)
  1c | 113411 (61.51) | 116473 (61.33) | 107343 (60.82) | 14633 (39.66) | 55496 (62.84)
  2 | 8880 (4.82) | 9323 (4.91) | 8563 (4.85) | 1419 (3.85) | 3369 (3.82)
  2a | 15043 (8.16) | 15309 (8.06) | 13413 (7.6) | 3124 (8.47) | 10393 (11.77)
  2b | 5394 (2.93) | 5540 (2.92) | 5500 (3.12) | 2873 (7.79) | 5852 (6.63)
  2c | 12928 (7.01) | 13316 (7.01) | 13509 (7.65) | 9507 (25.76) | 8385 (9.5)
  3 | 830 (0.45) | 876 (0.46) | 901 (0.51) | 1532 (4.15) | 1250 (1.42)
  3a | 1566 (0.85) | 1672 (0.88) | 1689 (0.96) | 1849 (5.01) | 1625 (1.84)
  3b | 986 (0.53) | 1063 (0.56) | 1082 (0.61) | 1415 (3.83) | 1153 (1.31)
  4 | 89 (0.05) | 100 (0.05) | 102 (0.06) | 384 (1.04) | 228 (0.26)
Pathological T-stage
  Missing | 0 (0) | 0 (0) | 0 (0) | 35526 (96.28) | 86015 (97.4)
  0, A or IS | 0 (0) | 0 (0) | 0 (0) | 9 (0.02) | 14 (0.02)
  1 | 0 (0) | 0 (0) | 0 (0) | 1 (0) | 12 (0.01)
  1a | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 2 (0)
  1b | 0 (0) | 0 (0) | 0 (0) | 3 (0.01) | 5 (0.01)
  1c | 0 (0) | 0 (0) | 0 (0) | 183 (0.5) | 562 (0.64)
  2 | 7174 (3.89) | 7464 (3.93) | 3903 (2.21) | 105 (0.28) | 239 (0.27)
  2a | 21749 (11.8) | 22366 (11.78) | 10143 (5.75) | 137 (0.37) | 385 (0.44)
  2b | 5697 (3.09) | 5800 (3.05) | 5912 (3.35) | 150 (0.41) | 246 (0.28)
  2c | 107518 (58.32) | 110314 (58.09) | 111852 (63.37) | 577 (1.56) | 657 (0.74)
  3 | 2415 (1.31) | 2589 (1.36) | 2648 (1.5) | 64 (0.17) | 51 (0.06)
  3a | 26680 (14.47) | 27603 (14.53) | 27967 (15.85) | 59 (0.16) | 53 (0.06)
  3b | 12378 (6.71) | 12974 (6.83) | 13251 (7.51) | 60 (0.16) | 50 (0.06)
  4 | 759 (0.41) | 802 (0.42) | 819 (0.46) | 25 (0.07) | 18 (0.02)
Lymph-vascular Invasion
  Missing | 112524 (61.03) | 115724 (60.94) | 107215 (60.75) | 33436 (90.61) | 80573 (91.24)
  No | 66565 (36.1) | 68626 (36.14) | 63747 (36.12) | 3264 (8.85) | 7498 (8.49)
  Yes | 5281 (2.86) | 5562 (2.93) | 5533 (3.13) | 199 (0.54) | 238 (0.27)


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Cells give Median/N with IQR/% in parentheses.

Statistic | Length of Stay | Readmission | Lymph Node Dissection | Concur. EBRT & ADT (3 mths) | EBRT Dose
Nodal Status
  Missing | 436 (0.24) | 460 (0.24) | 386 (0.22) | 427 (1.16) | 899 (1.02)
  All Negative | 113144 (61.37) | 116864 (61.54) | 110393 (62.55) | 620 (1.68) | 1156 (1.31)
  Positive nodes found | 3820 (2.07) | 4015 (2.11) | 4081 (2.31) | 284 (0.77) | 65 (0.07)
  No nodes examined | 66970 (36.32) | 68573 (36.11) | 61635 (34.92) | 35568 (96.39) | 86189 (97.6)
Histological Type
  Adenocarcinoma | 184363 (100) | 189897 (99.99) | 176480 (99.99) | 36898 (100) | 88306 (100)
  Scirrhous adenocarcinoma | 5 (0) | 13 (0.01) | 13 (0.01) | 1 (0) | 2 (0)
  Superficial spreading adenoc. | 1 (0) | 1 (0) | 1 (0) | 0 (0) | 1 (0)
  Basal cell adenocarcinoma | 1 (0) | 1 (0) | 1 (0) | 0 (0) | 0 (0)
Urban/Rural
  Missing | 5697 (3.09) | 5930 (3.12) | 5477 (3.1) | 1041 (2.82) | 2461 (2.79)
  1 metro, at least 1 million pop | 80885 (43.87) | 82806 (43.6) | 76979 (43.62) | 15263 (41.36) | 38587 (43.7)
  2 metro, 250K to 1 million | 41867 (22.71) | 43556 (22.93) | 40399 (22.89) | 8599 (23.3) | 20347 (23.04)
  3 metro, less than 250K | 20949 (11.36) | 21617 (11.38) | 19979 (11.32) | 4759 (12.9) | 11273 (12.77)
  4 urban pop at least 20K, near metro | 9572 (5.19) | 9987 (5.26) | 9374 (5.31) | 1832 (4.96) | 4116 (4.66)
  5 urban pop at least 20K | 4340 (2.35) | 4437 (2.34) | 4120 (2.33) | 902 (2.44) | 1944 (2.2)
  6 urban pop at least 2.5K, near metro | 10272 (5.57) | 10598 (5.58) | 9947 (5.64) | 2461 (6.67) | 5289 (5.99)
  7 urban pop at least 2.5K | 6081 (3.3) | 6183 (3.26) | 5744 (3.25) | 1095 (2.97) | 2318 (2.62)
  8 completely rural, near metro | 2207 (1.2) | 2250 (1.18) | 2110 (1.2) | 481 (1.3) | 999 (1.13)
  9 completely rural | 2500 (1.36) | 2548 (1.34) | 2366 (1.34) | 466 (1.26) | 975 (1.1)
Census area household income
  Missing | 6718 (3.64) | 6987 (3.68) | 6489 (3.68) | 1196 (3.24) | 2795 (3.17)
  Less than $30,000 | 17903 (9.71) | 18478 (9.73) | 17381 (9.85) | 5187 (14.06) | 11907 (13.48)
  $30,000-$34,999 | 29116 (15.79) | 30082 (15.84) | 28116 (15.93) | 6681 (18.11) | 15198 (17.21)
  $35,000-$45,999 | 49191 (26.68) | 50860 (26.78) | 47371 (26.84) | 10354 (28.06) | 24146 (27.34)
  $46,000+ | 81442 (44.17) | 83505 (43.97) | 77138 (43.71) | 13481 (36.53) | 34263 (38.8)
Census area high school dropout
  Missing | 6735 (3.65) | 7004 (3.69) | 6504 (3.69) | 1198 (3.25) | 2806 (3.18)
  29% or more | 21334 (11.57) | 22152 (11.66) | 20765 (11.77) | 6343 (17.19) | 14688 (16.63)
  20%-28.9% | 35830 (19.43) | 37082 (19.53) | 34698 (19.66) | 8335 (22.59) | 19505 (22.09)
  14%-19.9% | 42814 (23.22) | 43894 (23.11) | 40918 (23.18) | 8923 (24.18) | 20979 (23.76)
  Less than 14% | 77657 (42.12) | 79780 (42.01) | 73610 (41.71) | 12100 (32.79) | 30331 (34.35)
Insurance Status
  Missing | 3291 (1.78) | 3350 (1.76) | 2886 (1.64) | 710 (1.92) | 1622 (1.84)
  Not Insured | 2379 (1.29) | 2441 (1.29) | 2361 (1.34) | 692 (1.88) | 1290 (1.46)
  Private | 119382 (64.75) | 122713 (64.62) | 113301 (64.2) | 9718 (26.34) | 25567 (28.95)
  Medicaid | 2716 (1.47) | 2786 (1.47) | 2684 (1.52) | 1184 (3.21) | 2508 (2.84)
  Medicare | 54369 (29.49) | 56210 (29.6) | 53022 (30.04) | 23649 (64.09) | 55100 (62.39)
  Other Government | 2233 (1.21) | 2412 (1.27) | 2241 (1.27) | 946 (2.56) | 2222 (2.52)


Table S3.2 Continued: Descriptive statistics of patients from each QI cohort in the training set.

Cells give Median/N with IQR/% in parentheses.

Statistic | Length of Stay | Readmission | Lymph Node Dissection | Concur. EBRT & ADT (3 mths) | EBRT Dose
Race
  Missing | 4711 (2.56) | 4775 (2.51) | 4409 (2.5) | 359 (0.97) | 971 (1.1)
  White | 149071 (80.85) | 153594 (80.88) | 142172 (80.55) | 27592 (74.78) | 66161 (74.92)
  Black | 20524 (11.13) | 21166 (11.15) | 20353 (11.53) | 6180 (16.75) | 14507 (16.43)
  Native American | 344 (0.19) | 359 (0.19) | 346 (0.2) | 79 (0.21) | 176 (0.2)
  Asian | 3093 (1.68) | 3161 (1.66) | 2936 (1.66) | 802 (2.17) | 1816 (2.06)
  Other | 828 (0.45) | 859 (0.45) | 803 (0.45) | 143 (0.39) | 322 (0.36)
  Hispanic | 5799 (3.15) | 5998 (3.16) | 5476 (3.1) | 1744 (4.73) | 4356 (4.93)
Diagnostic Confirmation
  Missing | 0 (0) | 0 (0) | 0 (0) | 88 (0.24) | 187 (0.21)
  No | 1 (0) | 1 (0) | 1 (0) | 7 (0.02) | 13 (0.01)
  Yes | 184369 (100) | 189911 (100) | 176494 (100) | 36804 (99.74) | 88109 (99.77)
Radical Prostatectomy
  No | 0 (0) | 0 (0) | 0 (0) | 36873 (99.93) | 88276 (99.96)
  Yes | 184370 (100) | 189912 (100) | 176495 (100) | 26 (0.07) | 33 (0.04)
Primary Radiation Treatment
  Missing | 106722 (57.88) | 2293 (1.21) | 2368 (1.34) | 0 (0) | 0 (0)
  No | 63513 (34.45) | 178862 (94.18) | 165091 (93.54) | 0 (0) | 0 (0)
  Yes | 14135 (7.67) | 8757 (4.61) | 9036 (5.12) | 36899 (100) | 88309 (100)
Primary EBRT
  Missing | 2392 (1.3) | 2336 (1.23) | 2410 (1.37) | 0 (0) | 0 (0)
  No | 174003 (94.38) | 178903 (94.2) | 165127 (93.56) | 0 (0) | 0 (0)
  Yes | 7975 (4.33) | 8673 (4.57) | 8958 (5.08) | 36899 (100) | 88309 (100)
Primary ADT
  Missing | 2722 (1.48) | 2589 (1.36) | 2665 (1.51) | 243 (0.66) | 1021 (1.16)
  No | 161788 (87.75) | 166773 (87.82) | 154216 (87.38) | 8197 (22.21) | 41458 (46.95)
  Yes | 8398 (4.55) | 8852 (4.66) | 8778 (4.97) | 27367 (74.17) | 41612 (47.12)
  Not administered due to known reason | 11462 (6.22) | 11698 (6.16) | 10836 (6.14) | 1092 (2.96) | 4218 (4.78)
cN0M0
  No | 44392 (24.08) | 45661 (24.04) | 42456 (24.06) | 4011 (10.87) | 0 (0)
  Yes | 139978 (75.92) | 144251 (75.96) | 134039 (75.94) | 32888 (89.13) | 88309 (100)
cN1
  No | 183960 (99.78) | 189477 (99.77) | 176056 (99.75) | 35838 (97.12) | 88309 (100)
  Yes | 410 (0.22) | 435 (0.23) | 439 (0.25) | 1061 (2.88) | 0 (0)
Radiation-Surgery Sequence

No

rad

iati

on

or

surg

ery

174001

94.3

8178861

94.1

8165088

93.5

436844

99.8

588217

99.9

Rad

iati

on

bef

ore

surg

ery

60

0.0

358

0.0

30

00

0R

ad

iati

on

aft

ersu

rger

y7908

4.2

960

0.0

38970

5.0

850

0.1

592

0.1

Rad

.b

oth

bef

ore

an

daft

ersu

rger

y6

08689

4.5

87

00

00

0

Intr

aop

erati

ve

radia

tion

10

70

10

00

00

Oth

er1

02371

1.3

40

00

0S

equ

ence

un

kn

ow

n2394

1.3

2294

1.2

10

00

00

0


Table S3.3: Descriptive statistics for outcome subsets in training and validation set.

Cells give Median/N with IQR/% in parentheses.

Statistic | 30-Day Mortality, Validation | 30-Day Mortality, Training | 90-Day Mortality, Validation | 90-Day Mortality, Training | Overall Mortality, Validation | Overall Mortality, Training
Number of patients | 175173 | 170456 | 173534 | 169398 | 391030 | 381616
Age (years) | 61 (56-66) | 61 (56-66) | 61 (56-66) | 61 (56-66) | 65 (59-71) | 65 (59-71)
Year of Diagnosis | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10)
PSA (ng/mL) | 5.4 (4.2-7.9) | 5.4 (4.2-7.8) | 5.4 (4.2-7.9) | 5.4 (4.2-7.8) | 6 (4.4-9.3) | 5.9 (4.4-9.2)
Great Circle Distance (miles) | 14.3 (6.3-36) | 14.9 (6.4-40.8) | 14.2 (6.3-35.8) | 14.9 (6.4-40.5) | 11.3 (5-27.2) | 11.2 (4.9-27.6)
  Missing or N/A
Year of Surgery | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10) | 2008 (2006-10) | 2009 (2007-11) | 2009 (2007-11)
  Missing or N/A
Time to Treatment (days) | 68 (47-97) | 67 (47-95) | 68 (47-97) | 67 (47-95) | 80 (53-120) | 78 (52-118)
  Missing or N/A
Length of Stay (days) | 2 (1-2) | 2 (1-2) | 2 (1-2) | 2 (1-2) | 1 (1-2) | 1 (1-2)
  Missing or N/A
Tumour Size (mm) | 15 (10-20) | 15 (10-21) | 15 (10-20) | 15 (10-21) | 15 (9-20) | 15 (9-20)
  Missing or N/A
EBRT Dose (Gy) | 64.8 (45-68) | 64.8 (45-68.4) | 64.8 (45-68) | 64.8 (45-68.4) | 66.6 (45-77) | 70 (45-76.4)
  Missing or N/A
Proportion of Cores Positive | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.3 (0.2-0.6) | 0.3 (0.2-0.6)
  Missing or N/A
Follow-up for mortality (months) | 55.7 (33.8-79.6) | 55.9 (34.4-80.2) | 56.2 (34.3-79.8) | 56.2 (34.8-80.4) | 53.4 (31.6-77.5) | 54.7 (32.5-79.5)
  Missing or N/A: 0 0
Vital status
  Missing | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0)
  Dead | 8311 (4.74) | 7887 (4.63) | 8311 (4.79) | 7887 (4.66) | 42985 (10.99) | 43836 (11.49)
  Alive | 166862 (95.26) | 162569 (95.37) | 165223 (95.21) | 161511 (95.34) | 348045 (89.01) | 337780 (88.51)
Follow-up for ADT initiation (days): 1624 (933-2376); 1634 (959-2397)
  Missing or N/A: 0 0
ADT initiation status
  No | 169064 (96.51) | 164592 (96.56) | 167472 (96.51) | 163561 (96.55) | 352055 (90.03) | 341820 (89.57)
  Yes | 6109 (3.49) | 5864 (3.44) | 6062 (3.49) | 5837 (3.45) | 38975 (9.97) | 39796 (10.43)
Death within 30 days of surgery
  Missing | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 213146 (54.51) | 210030 (55.04)
  No | 174933 (99.86) | 170236 (99.87) | 173294 (99.86) | 169178 (99.87) | 177588 (45.42) | 171318 (44.89)
  Yes | 240 (0.14) | 220 (0.13) | 240 (0.14) | 220 (0.13) | 296 (0.08) | 268 (0.07)
Death within 90 days of surgery
  Missing | 1639 (0.94) | 1058 (0.62) | 0 (0) | 0 (0) | 214874 (54.95) | 211130 (55.33)
  No | 173183 (98.86) | 169082 (99.19) | 173183 (99.8) | 169082 (99.81) | 175652 (44.92) | 170041 (44.56)
  Yes | 351 (0.2) | 316 (0.19) | 351 (0.2) | 316 (0.19) | 504 (0.13) | 445 (0.12)
Salvage Treatment
  Missing | 3007 (1.72) | 2337 (1.37) | 3001 (1.73) | 2331 (1.38) | 58005 (14.83) | 53796 (14.1)
  No | 164134 (93.7) | 159784 (93.74) | 162504 (93.64) | 158735 (93.71) | 321665 (82.26) | 316843 (83.03)
  Yes | 8032 (4.59) | 8335 (4.89) | 8029 (4.63) | 8332 (4.92) | 11360 (2.91) | 10977 (2.88)


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

Cells give Median/N with IQR/% in parentheses.

Statistic | 30-Day Mortality, Validation | 30-Day Mortality, Training | 90-Day Mortality, Validation | 90-Day Mortality, Training | Overall Mortality, Validation | Overall Mortality, Training
Gleason Category
  6 | 71108 (40.59) | 70591 (41.41) | 70445 (40.59) | 70121 (41.39) | 174649 (44.66) | 172895 (45.31)
  7 | 84464 (48.22) | 81253 (47.67) | 83654 (48.21) | 80761 (47.68) | 161536 (41.31) | 155485 (40.74)
  8 | 11161 (6.37) | 10634 (6.24) | 11055 (6.37) | 10576 (6.24) | 31596 (8.08) | 30682 (8.04)
  9 | 8002 (4.57) | 7583 (4.45) | 7946 (4.58) | 7547 (4.46) | 21112 (5.4) | 20434 (5.35)
  10 | 438 (0.25) | 395 (0.23) | 434 (0.25) | 393 (0.23) | 2137 (0.55) | 2120 (0.56)
Charlson Comorbidity Index
  0 | 147276 (84.07) | 143440 (84.15) | 145917 (84.09) | 142545 (84.15) | 331957 (84.89) | 327922 (85.93)
  1 | 24834 (14.18) | 24135 (14.16) | 24585 (14.17) | 23985 (14.16) | 50708 (12.97) | 46128 (12.09)
  2 | 3063 (1.75) | 2881 (1.69) | 3032 (1.75) | 2868 (1.69) | 8365 (2.14) | 7566 (1.98)
Positive Margins
  Missing | 1237 (0.71) | 1192 (0.7) | 1222 (0.7) | 1181 (0.7) | 215528 (55.12) | 212712 (55.74)
  Negative | 135915 (77.59) | 131895 (77.38) | 134526 (77.52) | 131078 (77.38) | 138707 (35.47) | 132698 (34.77)
  Positive | 38021 (21.7) | 37369 (21.92) | 37786 (21.77) | 37139 (21.92) | 36795 (9.41) | 36206 (9.49)
Active Surveillance
  Missing | 113973 (65.06) | 108702 (63.77) | 113525 (65.42) | 108310 (63.94) | 258139 (66.02) | 256435 (67.2)
  No | 61200 (34.94) | 61754 (36.23) | 60009 (34.58) | 61088 (36.06) | 126491 (32.35) | 120025 (31.45)
  Yes | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 6400 (1.64) | 5156 (1.35)
Active Treatment
  Missing | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1127 (0.29) | 1073 (0.28)
  No | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 37373 (9.56) | 34607 (9.07)
  Yes | 175173 (100) | 170456 (100) | 173534 (100) | 169398 (100) | 352530 (90.15) | 345936 (90.65)
Readmission
  Missing | 3310 (1.89) | 2999 (1.76) | 3282 (1.89) | 2988 (1.76) | 7376 (1.89) | 5610 (1.47)
  No | 168113 (95.97) | 163233 (95.76) | 166546 (95.97) | 162218 (95.76) | 379524 (97.06) | 371037 (97.23)
  Yes | 3750 (2.14) | 4224 (2.48) | 3706 (2.14) | 4192 (2.47) | 4130 (1.06) | 4969 (1.3)
Lymph Node Dissection
  Missing | 386 (0.22) | 385 (0.23) | 384 (0.22) | 379 (0.22) | 1913 (0.49) | 1467 (0.38)
  No | 65082 (37.15) | 62394 (36.6) | 64591 (37.22) | 61973 (36.58) | 284616 (72.79) | 277496 (72.72)
  Yes | 109705 (62.63) | 107677 (63.17) | 108559 (62.56) | 107046 (63.19) | 104501 (26.72) | 102653 (26.9)
Concurrent EBRT and ADT
  Missing | 167275 (95.49) | 162205 (95.16) | 165639 (95.45) | 161149 (95.13) | 221581 (56.67) | 212248 (55.62)
  No | 6222 (3.55) | 6698 (3.93) | 6219 (3.58) | 6696 (3.95) | 132042 (33.77) | 133678 (35.03)
  Yes | 1676 (0.96) | 1553 (0.91) | 1676 (0.97) | 1553 (0.92) | 37407 (9.57) | 35690 (9.35)
EBRT dose 75-80 Gy
  Missing | 167720 (95.75) | 162689 (95.44) | 166082 (95.71) | 161632 (95.42) | 286642 (73.3) | 283089 (74.18)
  No | 7378 (4.21) | 7691 (4.51) | 7377 (4.25) | 7690 (4.54) | 70774 (18.1) | 62650 (16.42)
  Yes | 75 (0.04) | 76 (0.04) | 75 (0.04) | 76 (0.04) | 33614 (8.6) | 35877 (9.4)


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

Cells give Median/N with IQR/% in parentheses.

Statistic | 30-Day Mortality, Validation | 30-Day Mortality, Training | 90-Day Mortality, Validation | 90-Day Mortality, Training | Overall Mortality, Validation | Overall Mortality, Training
Clinical T-stage
  Missing | 24731 (14.12) | 24219 (14.21) | 24666 (14.21) | 24118 (14.24) | 0 (0) | 0 (0)
  0, A or IS | 116 (0.07) | 58 (0.03) | 116 (0.07) | 58 (0.03) | 0 (0) | 0 (0)
  1 | 780 (0.45) | 605 (0.35) | 777 (0.45) | 601 (0.35) | 2251 (0.58) | 1808 (0.47)
  1a | 550 (0.31) | 523 (0.31) | 542 (0.31) | 519 (0.31) | 3653 (0.93) | 3700 (0.97)
  1b | 471 (0.27) | 368 (0.22) | 468 (0.27) | 365 (0.22) | 2584 (0.66) | 2472 (0.65)
  1c | 105009 (59.95) | 102195 (59.95) | 103926 (59.89) | 101474 (59.9) | 259544 (66.37) | 254560 (66.71)
  2 | 6964 (3.98) | 8053 (4.72) | 6896 (3.97) | 7997 (4.72) | 16635 (4.25) | 17993 (4.71)
  2a | 14114 (8.06) | 13694 (8.03) | 13977 (8.05) | 13622 (8.04) | 40734 (10.42) | 39157 (10.26)
  2b | 5361 (3.06) | 4941 (2.9) | 5268 (3.04) | 4925 (2.91) | 17255 (4.41) | 16822 (4.41)
  2c | 13143 (7.5) | 12403 (7.28) | 13034 (7.51) | 12336 (7.28) | 35253 (9.02) | 32891 (8.62)
  3 | 839 (0.48) | 807 (0.47) | 835 (0.48) | 804 (0.47) | 3267 (0.84) | 3292 (0.86)
  3a | 1975 (1.13) | 1526 (0.9) | 1928 (1.11) | 1522 (0.9) | 5330 (1.36) | 4618 (1.21)
  3b | 993 (0.57) | 969 (0.57) | 975 (0.56) | 963 (0.57) | 3307 (0.85) | 3119 (0.82)
  4 | 127 (0.07) | 95 (0.06) | 126 (0.07) | 94 (0.06) | 1217 (0.31) | 1184 (0.31)
Pathological T-stage
  Missing | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 219822 (56.22) | 215284 (56.41)
  0, A or IS | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 194 (0.05) | 148 (0.04)
  1 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 56 (0.01) | 51 (0.01)
  1a | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 152 (0.04) | 141 (0.04)
  1b | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 50 (0.01) | 77 (0.02)
  1c | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1660 (0.42) | 1319 (0.35)
  2 | 5983 (3.42) | 6654 (3.9) | 5939 (3.42) | 6607 (3.9) | 6548 (1.67) | 7471 (1.96)
  2a | 21401 (12.22) | 20512 (12.03) | 21239 (12.24) | 20394 (12.04) | 21624 (5.53) | 20391 (5.34)
  2b | 5403 (3.08) | 5472 (3.21) | 5335 (3.07) | 5443 (3.21) | 5363 (1.37) | 5727 (1.5)
  2c | 103556 (59.12) | 99077 (58.12) | 102604 (59.13) | 98432 (58.11) | 99431 (25.43) | 95070 (24.91)
  3 | 2637 (1.51) | 2461 (1.44) | 2621 (1.51) | 2448 (1.45) | 2376 (0.61) | 2170 (0.57)
  3a | 23390 (13.35) | 24081 (14.13) | 23092 (13.31) | 23932 (14.13) | 21991 (5.62) | 22652 (5.94)
  3b | 11770 (6.72) | 11432 (6.71) | 11683 (6.73) | 11380 (6.72) | 10687 (2.73) | 10346 (2.71)
  4 | 1033 (0.59) | 767 (0.45) | 1021 (0.59) | 762 (0.45) | 1076 (0.28) | 769 (0.2)
Lymph-vascular Invasion
  Missing | 119454 (68.19) | 114678 (67.28) | 118904 (68.52) | 114226 (67.43) | 312021 (79.79) | 305310 (80)
  No | 51762 (29.55) | 51673 (30.31) | 50748 (29.24) | 51102 (30.17) | 74327 (19.01) | 71486 (18.73)
  Yes | 3957 (2.26) | 4105 (2.41) | 3882 (2.24) | 4070 (2.4) | 4682 (1.2) | 4820 (1.26)
Nodal Status
  Missing | 369 (0.21) | 430 (0.25) | 368 (0.21) | 429 (0.25) | 3334 (0.85) | 3191 (0.84)
  All Negative | 105846 (60.42) | 104452 (61.28) | 104771 (60.37) | 103831 (61.29) | 102808 (26.29) | 101304 (26.55)
  Positive nodes found | 4195 (2.39) | 3481 (2.04) | 4124 (2.38) | 3460 (2.04) | 4666 (1.19) | 3962 (1.04)
  No nodes examined | 64763 (36.97) | 62093 (36.43) | 64271 (37.04) | 61678 (36.41) | 280222 (71.66) | 273159 (71.58)
Histological Type
  Adenocarcinoma | 175164 (99.99) | 170441 (99.99) | 173525 (99.99) | 169383 (99.99) | 391005 (99.99) | 381597 (100)
  Scirrhous adenocarcinoma | 4 (0) | 13 (0.01) | 4 (0) | 13 (0.01) | 15 (0) | 15 (0)
  Superficial spreading adenoc. | 4 (0) | 1 (0) | 4 (0) | 1 (0) | 9 (0) | 3 (0)
  Basal cell adenocarcinoma | 1 (0) | 1 (0) | 1 (0) | 1 (0) | 1 (0) | 1 (0)


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

Cells give Median/N with IQR/% in parentheses.

Statistic | 30-Day Mortality, Validation | 30-Day Mortality, Training | 90-Day Mortality, Validation | 90-Day Mortality, Training | Overall Mortality, Validation | Overall Mortality, Training
Urban-Rural
  Missing | 4796 (2.74) | 5344 (3.14) | 4728 (2.72) | 5276 (3.11) | 10449 (2.67) | 12393 (3.25)
  1 metro, at least 1 million pop | 89473 (51.08) | 74426 (43.66) | 88564 (51.04) | 74041 (43.71) | 198184 (50.68) | 170106 (44.58)
  2 metro, 250K to 1 million | 33232 (18.97) | 39036 (22.9) | 32958 (18.99) | 38841 (22.93) | 72595 (18.57) | 83787 (21.96)
  3 metro, less than 250K | 18628 (10.63) | 19488 (11.43) | 18500 (10.66) | 19384 (11.44) | 43017 (11) | 44568 (11.68)
  4 urban pop at least 20K, near metro | 8299 (4.74) | 8992 (5.28) | 8191 (4.72) | 8930 (5.27) | 20601 (5.27) | 19770 (5.18)
  5 urban pop at least 20K | 2956 (1.69) | 3941 (2.31) | 2930 (1.69) | 3886 (2.29) | 6549 (1.67) | 8353 (2.19)
  6 urban pop at least 2.5K, near metro | 9314 (5.32) | 9491 (5.57) | 9252 (5.33) | 9418 (5.56) | 20778 (5.31) | 21830 (5.72)
  7 urban pop at least 2.5K | 4620 (2.64) | 5448 (3.2) | 4573 (2.64) | 5360 (3.16) | 10257 (2.62) | 11804 (3.09)
  8 completely rural, near metro | 1821 (1.04) | 1987 (1.17) | 1814 (1.05) | 1973 (1.16) | 4343 (1.11) | 4289 (1.12)
  9 completely rural | 2034 (1.16) | 2303 (1.35) | 2024 (1.17) | 2289 (1.35) | 4257 (1.09) | 4716 (1.24)
Census area household income
  Missing | 6410 (3.66) | 6250 (3.67) | 6313 (3.64) | 6164 (3.64) | 14220 (3.64) | 14379 (3.77)
  Less than $30,000 | 16274 (9.29) | 16515 (9.69) | 16134 (9.3) | 16364 (9.66) | 45059 (11.52) | 43350 (11.36)
  $30,000-$34,999 | 25062 (14.31) | 26933 (15.8) | 24855 (14.32) | 26728 (15.78) | 60645 (15.51) | 62331 (16.33)
  $35,000-$45,999 | 44056 (25.15) | 45666 (26.79) | 43649 (25.15) | 45390 (26.79) | 100269 (25.64) | 101967 (26.72)
  $46,000+ | 83371 (47.59) | 75092 (44.05) | 82583 (47.59) | 74752 (44.13) | 170837 (43.69) | 159589 (41.82)
Census area high school dropout
  Missing | 6436 (3.67) | 6264 (3.67) | 6339 (3.65) | 6178 (3.65) | 14263 (3.65) | 14411 (3.78)
  29% or more | 20712 (11.82) | 19616 (11.51) | 20532 (11.83) | 19483 (11.5) | 54660 (13.98) | 52899 (13.86)
  20%-28.9% | 32789 (18.72) | 33238 (19.5) | 32509 (18.73) | 33015 (19.49) | 80025 (20.47) | 78260 (20.51)
  14%-19.9% | 37961 (21.67) | 39539 (23.2) | 37594 (21.66) | 39291 (23.19) | 86599 (22.15) | 88540 (23.2)
  Less than 14% | 77275 (44.11) | 71799 (42.12) | 76560 (44.12) | 71431 (42.17) | 155483 (39.76) | 147506 (38.65)
Insurance Status
  Missing | 1871 (1.07) | 3159 (1.85) | 1858 (1.07) | 3126 (1.85) | 5804 (1.48) | 6445 (1.69)
  Not Insured | 2644 (1.51) | 2146 (1.26) | 2612 (1.51) | 2103 (1.24) | 6721 (1.72) | 5246 (1.37)
  Private | 114610 (65.43) | 110699 (64.94) | 113464 (65.38) | 110020 (64.95) | 189306 (48.41) | 181665 (47.6)
  Medicaid | 2859 (1.63) | 2431 (1.43) | 2840 (1.64) | 2399 (1.42) | 8621 (2.2) | 7453 (1.95)
  Medicare | 50818 (29.01) | 49947 (29.3) | 50413 (29.05) | 49699 (29.34) | 173820 (44.45) | 174886 (45.83)
  Other Government | 2371 (1.35) | 2074 (1.22) | 2347 (1.35) | 2051 (1.21) | 6758 (1.73) | 5921 (1.55)
Race
  Missing | 3017 (1.72) | 4583 (2.69) | 2976 (1.71) | 4560 (2.69) | 6224 (1.59) | 7308 (1.92)
  White | 142244 (81.2) | 138221 (81.09) | 140944 (81.22) | 137405 (81.11) | 306995 (78.51) | 299152 (78.39)
  Black | 19083 (10.89) | 18787 (11.02) | 18915 (10.9) | 18647 (11.01) | 52343 (13.39) | 51116 (13.39)
  Native American | 302 (0.17) | 303 (0.18) | 301 (0.17) | 299 (0.18) | 807 (0.21) | 735 (0.19)
  Asian | 2673 (1.53) | 2746 (1.61) | 2641 (1.52) | 2722 (1.61) | 6815 (1.74) | 7125 (1.87)
  Other | 1108 (0.63) | 763 (0.45) | 1095 (0.63) | 756 (0.45) | 2183 (0.56) | 1609 (0.42)
  Hispanic | 6746 (3.85) | 5053 (2.96) | 6662 (3.84) | 5009 (2.96) | 15663 (4.01) | 14571 (3.82)


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

[Rotated table page; cell values not recoverable from the extracted text. Columns: 30-Day Mortality, 90-Day Mortality and Overall Mortality, each split into Validation and Training and reporting Median/N and IQR/%. Rows on this page: Diagnostic Confirmation (Missing; No; Yes); Radical Prostatectomy (No; Yes); Primary Radiation Treatment (Missing; No; Yes); Primary EBRT (Missing; No; Yes); Primary ADT (Missing; No; Yes; Not administered due to known reason); cN0M0 (No; Yes); cN1 (No; Yes).]


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

[Rotated table page; cell values not recoverable from the extracted text. Columns: Salvage Therapy and ADT Initiation, each split into Validation and Training and reporting Median/N and IQR/%. Number of patients: Salvage Therapy 367016 (Validation) and 358966 (Training); ADT Initiation 433325 (Validation) and 420258 (Training). Rows on this page: Age (years); Year of Diagnosis; PSA (ng/mL); Great Circle Distance (miles); Year of Surgery; Time to Treatment (days); Length of Stay (days); Tumour Size (mm); EBRT Dose (Gy); Proportion of Cores Positive; Follow-up for mortality (months); Vital status (Missing; Dead; Alive); Follow-up for ADT initiation (days); ADT initiation status (No; Yes); Death within 30 days of surgery (Missing; No; Yes); Death within 90 days of surgery (Missing; No; Yes); Salvage Treatment (Missing; No; Yes); Gleason Category (6; 7; 8; 9; 10). Continuous variables carry an additional "Missing or N/A" row.]


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

[Rotated table page; cell values not recoverable from the extracted text. Columns: Salvage Therapy and ADT Initiation, each split into Validation and Training and reporting Median/N and IQR/%. Rows on this page: Charlson Comorbidity Index (0; 1; 2); Positive Margins (Missing; Negative; Positive); Active Surveillance (Missing; No; Yes); Active Treatment (Missing; No; Yes); Readmission (Missing; No; Yes); Lymph Node Dissection (Missing; No; Yes); Concurrent EBRT and ADT (Missing; No; Yes); EBRT dose 75-80 Gy (Missing; No; Yes); Clinical T-stage (Missing; 0, A or IS; 1; 1a; 1b; 1c; 2; 2a; 2b; 2c; 3; 3a; 3b; 4).]


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

[Rotated table page; cell values not recoverable from the extracted text. Columns: Salvage Therapy and ADT Initiation, each split into Validation and Training and reporting Median/N and IQR/%. Rows on this page: Pathological T-stage (Missing; 0, A or IS; 1; 1a; 1b; 1c; 2; 2a; 2b; 2c; 3; 3a; 3b; 4); Lymph-vascular Invasion (Missing; No; Yes); Nodal Status (Missing; All Negative; Positive nodes found; No nodes examined); Histological Type (Adenocarcinoma; Scirrhous adenocarcinoma; Superficial spreading adenoc.; Basal cell adenocarcinoma); Urban-Rural (Missing; 1 metro, at least 1 million pop; 2 metro, 250K to 1 million; 3 metro, less than 250K; 4 urban pop at least 20K, near metro; 5 urban pop at least 20K; 6 urban pop at least 2.5K, near metro; 7 urban pop at least 2.5K; 8 completely rural, near metro; 9 completely rural); Census area household income (Missing; Less than $30,000; $30,000-$34,999; $35,000-$45,999; $46,000+).]


Table S3.3 Continued: Descriptive statistics for outcome subsets in training and validation set.

[Rotated table page; cell values not recoverable from the extracted text. Columns: Salvage Therapy and ADT Initiation, each split into Validation and Training and reporting Median/N and IQR/%. Rows on this page: Census area high school dropout (Missing; 29% or more; 20%-28.9%; 14%-19.9%; Less than 14%); Insurance Status (Missing; Not Insured; Private; Medicaid; Medicare; Other Government); Race (Missing; White; Black; Native American; Asian; Other); Hispanic; Diagnostic Confirmation (Missing; No; Yes); Radical Prostatectomy (No; Yes); Primary Radiation Treatment (Missing; No; Yes); Primary EBRT (Missing; No; Yes); Primary ADT (Missing; No; Yes; Not administered due to known reason); cN0M0 (No; Yes); cN1 (No; Yes).]


Chapter 4

Doubly Robust Estimator for Indirectly Standardized Mortality Ratios

* The content of this chapter has been published in volume 6, issue 1 of the journal Epidemiologic Methods

in 2017 (Daignault and Saarela, 2017). There, I frame hospital profiling using indirectly standardized

mortality ratios in the causal inference framework to develop an explicit causal estimand for the SMR under

the national/provincial average level of care. I then develop a doubly robust estimator for this estimand and

illustrate the doubly robust property through a simulation. The manuscript in its entirety follows.

4.1 Abstract

Routinely collected administrative and clinical data are increasingly being utilized for comparing quality of

care outcomes between hospitals. This problem can be considered in a causal inference framework, as such

comparisons have to be adjusted for hospital-specific patient case-mix, which can be done using either an

outcome or assignment model. It is often of interest to compare the performance of hospitals against the

average level of care in the health care system, using indirectly standardized mortality ratios, calculated as

a ratio of observed to expected quality outcome. A doubly robust estimator makes use of both outcome

and assignment models in the case-mix adjustment, requiring only one of these to be correctly specified

for valid inferences. Doubly robust estimators have been proposed for direct standardization in the quality

comparison context, and for standardized risk differences and ratios in the exposed population, but as far

as we know, not for indirect standardization. We present the causal estimand in indirect standardization in

terms of potential outcome variables, propose a doubly robust estimator for this, and study its properties.

We also consider the use of a modified assignment model in the presence of small hospitals.

Keywords: quality indicators, causal inference, indirect standardization, direct standardization, doubly

robust estimation, provider profiling


4.2 Introduction

Institutional comparisons have become popular in recent years as a means of assessing the care levels of

hospitals for the purpose of resource allocation and policy decisions. With the increasing availability of

large administrative databases comprising patient data from multiple hospitals, there is a need for reliable

statistical methods for such comparisons that address issues common to these data formats.

In this paper, we consider statistical methods for institutional comparisons for binary outcomes resulting

in proportion-type quality indicators, such as the proportion of patients treated with a particular procedure,

or proportion experiencing complications from a treatment procedure. In particular, we focus on comparisons

made using the standardized mortality ratio (SMR), a ratio of observed to expected outcomes. As patients

cannot be randomized to hospitals for treatment, adjustment for case-mix must be made since, for instance,

high volume hospitals may also receive more complex cases (Shahian and Normand, 2008). Such adjustment

can be made through standardization where the choice between direct or indirect methods depends on the

particular comparison of interest. Adopting methods from causal inference, direct standardization can be

seen as comparing the potential expected outcomes had all patients in the standard population experienced

the care level of a given hospital. Such a comparison would be of particular interest for determining how

each hospital might care for the average population. However, policy makers might be interested in how

best to allocate resources across the hospital system. In this case, indirect standardization could be more

appropriate, as it contrasts the observed outcomes for patients treated in a specific hospital to their potential

expected outcomes had these patients experienced the care level of some reference system. In particular,

comparing to an average nationwide care level is relevant when the data available capture all hospitals from

across the country, and thus standardization is relative to a nationwide average standard level of care. An

example would be assessing the quality of surgical care for rectal cancer using the positive margin proportion

as the quality indicator (Massarweh et al., 2014) and data from the National Cancer Data Base (NCDB,

Raval et al., 2009), which captures hospitals across the United States.

Regardless of the institutional comparison of interest, statistical adjustment is required to attempt to

ensure that any differences in indicators are due solely to differences in actual institutional performance.

One such method is to calculate the propensity score for each hospital (Shahian and Normand, 2008). By

determining the probability of being treated at each institution based on patient characteristics (i.e. the

propensity score), it is then possible to compare the observed outcomes using the propensity score through

simple matching or stratification, or through weighting in regression modelling. Risk adjustment can also

be implemented using outcome models (Spiegelhalter, 2005b), which directly give the expected outcome

conditional on patient characteristics. By summing over the patients in each hospital, the expected outcome

needed for the SMR is obtained. A comprehensive summary of the evolution of standardization methods

can be found in Keiding and Clayton (2014).
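
The outcome-model route described above can be sketched as follows. This is a minimal illustration, not the analysis used in the paper: it assumes that a case-mix model fitted to the pooled data has already produced a predicted event probability for each patient. The data values and the helper name `smr_by_hospital` are hypothetical. The expected count for a hospital is the sum of its patients' predicted probabilities, and the SMR is the observed count divided by that sum.

```python
from collections import defaultdict


def smr_by_hospital(records):
    """Compute observed/expected ratios per hospital.

    records: iterable of (hospital, observed_outcome, predicted_probability),
    where predicted_probability is assumed to come from an outcome model
    fitted elsewhere (hypothetical input, not produced here).
    """
    observed = defaultdict(float)
    expected = defaultdict(float)
    for hospital, y, p_hat in records:
        observed[hospital] += y        # O: observed number of events
        expected[hospital] += p_hat    # E: sum of predicted probabilities
    return {h: observed[h] / expected[h] for h in observed}


# Hypothetical patients: (hospital, outcome, predicted probability)
records = [
    ("A", 1, 0.25), ("A", 0, 0.25), ("A", 1, 0.5),
    ("B", 0, 0.5), ("B", 0, 0.25), ("B", 1, 0.25),
]
smr = smr_by_hospital(records)
print(smr)
```

In this toy data both hospitals have an expected count of one event, so the ratios differ only through the observed counts.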

A common issue that arises in any modelling scenario is that of model misspecification, which can be due

to a number of reasons, including unmeasured confounders, omission of observed confounders in the model,

and misspecification of the functional form of relationships. When making institutional comparisons in an

effort to identify under/over-performing hospital practices, model misspecification can have a potentially


serious effect on the classification of these hospitals as outliers. An attempt to overcome or at least alleviate

such issues is to use doubly robust (DR) methods that incorporate both the propensity score and the outcome

model into a single estimator (Bang and Robins, 2005; Funk et al., 2011). We propose a DR estimator for

the SMR under indirect standardization, where the causal quantity being estimated is specified through the

expected potential outcome had the patients treated in a given hospital experienced a system-wide average

level of care. In the context of institutional quality comparisons, a DR estimator has been proposed for

direct standardization (Varewyck et al., 2014). In addition, Shinozaki and Matsuyama (2015) propose a DR

estimator for standardized risk differences and ratios in the exposed population. While the intended use of

their estimator was not for the purpose of institutional comparisons, it may be adopted in order to make

pairwise comparisons between hospitals, namely to estimate the expected potential outcome had patients

treated in hospital A been treated in hospital B. The causal comparison being made in Shinozaki and

Matsuyama (2015) differs from that of the estimator proposed below, as we attempt to compare each hospital to an

average level of care in a healthcare system instead of to a given reference hospital’s level of care.

The paper proceeds as follows. In Section 2 we review the ideas of direct and indirect standardization, and

specify the causal estimand in indirect standardization using potential outcomes notation. The proposed

DR estimator for the SMR under indirect standardization is developed and shown to be asymptotically

consistent. Simulation study results demonstrating the doubly robust property of the proposed estimator

are presented in Section 3. A discussion follows in Section 4.

4.3 Proposed Estimator

4.3.1 Notation and assumptions

Throughout the paper, $Y \in \{0, 1\}$ is the observed binary quality outcome variable, $Z \in \{1, \ldots, m\}$ is the

hospital in which the patient was actually treated, and $X \equiv (X_1, \ldots, X_p)$ is a vector of patient-level

characteristics relevant to case-mix adjustment, capturing for example demographic information, medical history,

and disease progression. The triples $W \equiv (Y, Z, X)$ are assumed independent and identically distributed

across the patients. As is the convention in the causal inference literature, we denote by $Y_z$ the potential

outcome that would have been observed had the patient been treated in hospital $z$. Throughout, we make

the following standard causal assumptions. We assume that $X$ is sufficient to control for any confounding

(conditional exchangeability) so that $(Y_1, \ldots, Y_m) \perp\!\!\!\perp Z \mid X$. In addition, we assume consistency, under which

the observed outcome is determined by $Y = \sum_{z=1}^{m} 1_{\{Z=z\}} Y_z$ (Hernan and Robins, 2006), where $1_{\{Z=z\}}$ is

the indicator function, which takes the value 1 if the condition $Z = z$ is true and 0 otherwise.

Finally, we assume positivity, under which all patients have a non-zero probability of being treated at any

hospital, i.e. $P(Z = z \mid X) > 0$ for all $z \in \{1, \ldots, m\}$ and all values of $X$.
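
As a hedged illustration of these three assumptions, the following toy simulation builds data in which they hold by construction. All numbers are hypothetical, hospitals are indexed 0, ..., m-1 rather than 1, ..., m, and the single binary covariate stands in for the vector X.

```python
import random

random.seed(0)
m = 3  # number of hospitals (hypothetical)


def simulate_patient():
    x = random.choice([0, 1])  # binary case-mix covariate
    # Potential outcomes Y_0, ..., Y_{m-1} depend on X only.
    risk = [0.1 + 0.1 * x + 0.05 * z for z in range(m)]
    y_pot = [int(random.random() < r) for r in risk]
    # Hospital assignment depends on X only, so the potential outcomes are
    # independent of Z given X (conditional exchangeability). Every weight
    # is positive, so P(Z = z | X) > 0 for all z (positivity).
    weights = [3, 2, 1] if x == 0 else [1, 2, 3]
    z = random.choices(range(m), weights=weights)[0]
    # Consistency: Y = sum_z 1{Z = z} Y_z.
    y = sum(int(z == k) * y_pot[k] for k in range(m))
    return x, z, y, y_pot


patients = [simulate_patient() for _ in range(2000)]

# The observed outcome equals the potential outcome of the hospital attended.
consistent = all(y == y_pot[z] for _, z, y, y_pot in patients)
# Crude empirical positivity check: every (x, z) combination is observed.
cells = {(x, z) for x, z, _, _ in patients}
positivity_ok = len(cells) == 2 * m
print(consistent, positivity_ok)
```

In real data only `x`, `z` and `y` are observed, so consistency and exchangeability cannot be checked in this way; only the positivity check carries over.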

4.3.2 Direct versus indirect standardization

In this section, we briefly review the two common standardization procedures used in epidemiology, and

discuss their causal interpretation in the quality comparison context. The main difference between the two

standardization methods is that direct standardization provides the expected outcome if the standard population

were to experience the event rate observed in the index population, whereas indirect standardization

provides the expected outcome if the index population had experienced the event rate from the standard

population. Table 4.1 illustrates the different elements from each population used to compute

the expected outcome. Each standardization method computes the expected outcome by applying a covariate

stratum-specific (e.g. age, gender) event rate to a stratum-specific population size.

Table 4.1: Difference between the standardization methods. The asterisk refers to the standard population, $k$ indicates the covariate strata, $\pi_k$ is the estimated event rate, and $E$ is the expected outcome.

Method     Standard Population   Index Population   Expected Outcome
Direct     $n^*_k$               $\pi_k$            $E = \frac{1}{\sum_k n^*_k} \sum_k n^*_k \pi_k$
Indirect   $\pi^*_k$             $n_k$              $E = \frac{1}{\sum_k n_k} \sum_k n_k \pi^*_k$

Direct standardization, as seen in the first row of Table 4.1, takes the event rate of the index population

and applies it to the standard population in each stratum, and then averages over all covariate strata in the

standard population, resulting in $E = (\sum_k n^*_k)^{-1} \sum_k n^*_k \pi_k$. Direct standardization assumes that the stratum

membership in the standard population is known, while the stratum-specific rates must be estimated from the

index population. In the present context, the index population consists of the patients treated in a given hospital, while

the standard population may be another hospital or, in the case of nationwide comparisons, the entire patient

population across all hospitals. The case-mix adjustment required for the quality comparisons usually

involves a large number of covariate strata, and therefore the stratum-specific event rates are in practice

found using regression modelling techniques. The causal estimand under direct standardization, where the

standard population is all hospitals nationwide, can be written as $E[Y_z]$ as in Varewyck et al. (2014), which

under the causal assumptions of Section 4.3.1 can be expressed as $E[Y_z] = \sum_x E[Y \mid Z = z, X = x] P(X = x)$ (e.g.

Hernan and Robins, 2006). This can be interpreted as the expected outcome had all patients experienced

the care level of hospital $z$.
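
A small worked example of the direct standardization formula, with two hypothetical age strata, may help fix ideas. The stratum sizes, event rates and the function name `direct_standardized_rate` are illustrative assumptions only.

```python
def direct_standardized_rate(n_std, rate_index):
    """E = sum_k n*_k pi_k / sum_k n*_k.

    n_std:      stratum sizes n*_k in the standard population
    rate_index: stratum-specific event rates pi_k in the index population
    """
    total = sum(n_std.values())
    return sum(n_std[k] * rate_index[k] for k in n_std) / total


# Hypothetical inputs
n_std = {"younger": 600, "older": 400}          # standard population sizes
rate_index = {"younger": 0.05, "older": 0.20}   # index hospital event rates

expected = direct_standardized_rate(n_std, rate_index)
print(round(expected, 4))  # (600 * 0.05 + 400 * 0.20) / 1000
```

Here the index hospital's rates are applied to the standard population's age distribution, giving (30 + 80) / 1000 = 0.11.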

In contrast, indirect standardization takes the event rate in each covariate stratum in the standard population,

applies it to the stratum membership in the index population, and then averages over the membership

of all strata in the index population, as $E = (\sum_k n_k)^{-1} \sum_k n_k \pi^*_k$ (see row 2 of Table 4.1). More conceptually,

indirect standardization can be thought of as determining the expected outcome if patients in the index population

were to experience the same event rate as the standard population. In this case, the stratum-specific

event rates are obtained from the standard population, while the stratum membership is determined from

the index population. Because the stratum-specific event rates do not need to be estimated from the index

population, indirect standardization is still applicable when the index population is small, provided that the

standard population is large. Finally, the expected event counts are contrasted with the observed ones through

the standardized mortality ratio $\mathrm{SMR} = O/E$.
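
The indirect calculation and the resulting SMR can be sketched in the same way. The stratum sizes, standard rates, observed count and the function name `indirect_expected_rate` below are hypothetical values chosen for illustration.

```python
def indirect_expected_rate(n_index, rate_std):
    """E = sum_k n_k pi*_k / sum_k n_k.

    n_index:  stratum sizes n_k in the index population (one hospital)
    rate_std: stratum-specific event rates pi*_k in the standard population
    """
    total = sum(n_index.values())
    return sum(n_index[k] * rate_std[k] for k in n_index) / total


# Hypothetical inputs
n_index = {"younger": 20, "older": 80}          # patients in the index hospital
rate_std = {"younger": 0.04, "older": 0.15}     # standard population rates
observed_events = 16                            # events observed in the hospital

expected_rate = indirect_expected_rate(n_index, rate_std)
expected_events = expected_rate * sum(n_index.values())
smr = observed_events / expected_events
print(round(expected_events, 2), round(smr, 3))
```

With these numbers the hospital would be expected to have 0.8 + 12 = 12.8 events under the standard rates, so 16 observed events give an SMR of 1.25.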

The specific causal comparison being made in indirect standardization is determined by the choice of

standard population. Suppose for instance that the comparison of interest is how patients from hospital z

(index population) would fare if they were instead treated at hospital z′ (standard population). The causal


estimand for indirect standardization must feature a conditional expectation of the form E[· | Z = z]. When

comparing two hospitals, this becomes simply E[Yz′ | Z = z], with SMR = E[Yz | Z = z]/E[Yz′ | Z = z].

The latter corresponds to the risk ratio for the exposure effect among the exposed discussed by Shinozaki and

Matsuyama (2015). However, to express the causal estimand in comparison to the nationwide average care level,

instead of fixing the subscript of the potential outcome, we need to consider this as a random variable. This

corresponds to a hypothetical intervention of randomly assigning a patient actually treated in hospital z to

be treated in one of the hospitals under comparison. To express this, as a notational device, we define a new

random variable A to denote the unobserved potential hospital assignment, where A ∈ {1, . . . ,m}. Further,

we let (Y1, . . . , Ym) ⊥⊥ A | (Z,X) and A ⊥⊥ Z | X, so that we have the causal relationships presented

in Figure 4.1. We can now generally express the causal estimand in indirect standardization through the

conditional expectation E[YA | Z = z]. Several interesting special cases may be obtained by choosing the

hypothetical assignment probabilities P (A | X). The usual exposure effect among the exposed comparison

would be obtained by taking P (A = z′ | X) = 1. Comparison to the care level of an average provider would

be obtained by taking P (A = a | X) = 1/m for all a ∈ {1, . . . ,m} (see Section 4.3.4 and Varewyck et al.,

2014). However, herein we are specifically interested in comparisons to nationwide care level. In Section 4.3.4

we show that the corresponding causal estimand is specified by choosing the hypothetical assignment proba-

bilities as equal to the actual ones, effectively weighting the hospitals in the average by their patient volumes.

[Figure 4.1: The postulated causal mechanism (U is a non-confounder latent variable representing the correlation between potential outcomes for an individual). Nodes of the graph: Z, U, A, X, Y and (Y1, . . . , Ym).]

Regardless of the standardization method used, it is necessary to estimate the respective expected po-

tential outcomes. When the number of covariate strata becomes too large, it is common to fit an outcome

model to the data and to estimate the particular expected number from the fitted values. However, such

estimates are subject to the possibility of bias from model misspecification due to any number of factors.

One attempt to protect against misspecification of the outcome model used in estimation is to instead use

a doubly robust estimator.


4.3.3 Doubly robust estimation in direct standardization

Doubly robust (DR) estimation attempts to eliminate bias due to misspecification of a single model by

utilizing two separate models in the estimation process (Funk et al., 2011). By doing so, it incorpo-

rates as much information about the causal pathway between the outcome, the exposure and the co-

variates/confounders as possible (see Figure 4.1). DR estimators combine the use of an outcome model,

m(X, z, φ) ≡ E[Y | Z = z,X;φ], and a propensity/assignment model, e(X, z, γ) ≡ P (Z = z | X; γ),

parametrized with respect to φ and γ respectively, into an estimator such that, as long as one of the models

is correctly specified, the results should be unbiased or at least consistent (Bang and Robins, 2005).

In general, DR estimators of a mean effect are composed of three terms that contain either the outcome

model, the propensity/assignment model, or both such that the terms that contain the misspecified model

will cancel thereby resulting in estimation using only the correct model. Therefore, making use of fitted

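outcome and assignment probabilities m(X, z, φ̂) and e(X, z, γ̂), where the estimated model parameters are

denoted by φ̂ and γ̂ respectively, a DR estimator (Robins et al., 2007) for a marginal mean µz under direct

standardization can be written as

\begin{align}
\hat{\mu}^{DR}_z &= n^{-1} \sum_{i=1}^{n} m(x_i, z, \hat{\phi}) + n^{-1} \sum_{i=1}^{n} \frac{1_{\{Z_i = z\}}}{e(x_i, z, \hat{\gamma})} \left[ Y_i - m(x_i, z, \hat{\phi}) \right] \tag{4.1} \\
&= n^{-1} \sum_{i=1}^{n} \frac{1_{\{Z_i = z\}}}{e(x_i, z, \hat{\gamma})} Y_i + n^{-1} \sum_{i=1}^{n} \left[ 1 - \frac{1_{\{Z_i = z\}}}{e(x_i, z, \hat{\gamma})} \right] m(x_i, z, \hat{\phi}). \tag{4.2}
\end{align}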

Here the outcome model estimator, with hospital effect, has been augmented by weighting by the inverse

of the assignment probability. An estimator of this form is doubly robust if either the outcome or the assignment/propensity model is correctly specified. This is evident as the second term in (4.1) will in large samples have mean zero if the outcome model is correctly specified, leaving the first term to provide the estimate, while the second term of (4.2) will likewise have mean zero in large samples if the propensity/assignment model is correctly specified, leaving only the first, inverse probability weighted term to provide the estimate. Varewyck

et al. (2014) used such an estimator for estimating the potential full population risk, E[Yz], had all patients

received the care level of hospital z, a directly standardized quantity.
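The algebraic equivalence of (4.1) and (4.2) can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the fitted values m_hat (standing for m(x_i, z, φ̂)) and e_hat (standing for e(x_i, z, γ̂)) are made-up numbers rather than output from fitted models.

```python
def dr_mean_form_41(y, in_z, m_hat, e_hat):
    """Form (4.1): outcome-model mean plus inverse-probability-weighted residuals."""
    n = len(y)
    model_term = sum(m_hat) / n
    residual_term = sum(
        d / e * (yi - m) for yi, d, m, e in zip(y, in_z, m_hat, e_hat)
    ) / n
    return model_term + residual_term


def dr_mean_form_42(y, in_z, m_hat, e_hat):
    """Form (4.2): inverse-probability-weighted mean plus outcome-model augmentation."""
    n = len(y)
    ipw_term = sum(d / e * yi for yi, d, e in zip(y, in_z, e_hat)) / n
    augmentation = sum((1 - d / e) * m for d, m, e in zip(in_z, m_hat, e_hat)) / n
    return ipw_term + augmentation


# Hypothetical data for six patients; in_z[i] = 1 if patient i was treated in hospital z.
y = [1, 0, 0, 1, 1, 0]
in_z = [1, 1, 0, 0, 1, 0]
m_hat = [0.6, 0.3, 0.2, 0.7, 0.5, 0.4]   # assumed fitted outcome probabilities
e_hat = [0.5, 0.4, 0.3, 0.6, 0.5, 0.2]   # assumed fitted assignment probabilities

mu_41 = dr_mean_form_41(y, in_z, m_hat, e_hat)
mu_42 = dr_mean_form_42(y, in_z, m_hat, e_hat)
```

Both functions return the same value for any inputs, since (4.2) is a rearrangement of the terms in (4.1).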

An issue that arises in direct standardization is the need to specify hospital effects in the outcome model.

In the case of a large nationwide database of hospitals, some of them small in volume, this requires the esti-

mation of a large number of parameters which might not be feasible without smoothing/shrinkage. Although

such smoothing could be employed through mixed effect models, this might be a questionable approach if

the purpose of the institutional comparison is to identify outliers. Varewyck et al. (2014) discuss possible

ways to reduce shrinkage, such as clustered mixed effect models on hospitals or Firth corrected fixed effects

logistic regression as the outcome model.

Additionally, direct standardization raises the question as to whether modelling hospital-patient interac-

tions would also be needed, especially in the case of hospitals that specialize in particular patient subgroups,

such as children or the elderly (Varewyck et al., 2016). Including interaction terms would further contribute

to the large number of parameters that require estimation in the outcome model. Varewyck et al. (2016)


have shown that the omission of hospital-patient interactions in the models used for standardization can

introduce bias into the estimated excess risks.

To sum up, in the case of nationwide comparisons involving hundreds or thousands of hospitals, many

of these small volume, direct standardization may be more problematic due to the large number of hos-

pital effects and patient-hospital interaction effects that would need to be estimated to ensure unbiased

estimation. However, if comparison to an average level of care as the reference is of interest, indirect stan-

dardization avoids the issue of modelling hospital effects as well as hospital-patient interactions, which we

will demonstrate in Section 4.3.4. Nevertheless, indirect standardization still requires specification of an

outcome model. We thus propose a doubly robust estimator for the standardized mortality ratio under

indirect standardization. In order to do this, we must first express the SMR as a causal estimand.

4.3.4 Causal estimand under indirect standardization

As per the discussion in Section 4.3.2, we define the causal estimand for hospital z in indirect standardization

as

\[
\mathrm{SMR} = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}, \tag{4.3}
\]

where the observed response in the numerator results simply from considering the potential outcomes of the

patients of hospital z had they been treated in hospital z (i.e. the consistency assumption). In contrast,

the expected response in the denominator depends on the specified target assignment regime, P (A | X). As

mentioned in Section 4.3.2, notable special cases may be obtained by choosing the assignment probabilities.

First, consider the target assignment regime that gives patients an equal probability of being treated at each

hospital, P (A = a | X) = m^{-1}. We then show in Appendix A that, for this choice of assignment regime,

(4.3) is equivalent to

\[
\mathrm{SMR} = \frac{E[Y_z \mid Z = z]}{m^{-1} \sum_{a=1}^{m} E[Y_a \mid Z = z]}, \tag{4.4}
\]

which is the causal estimand briefly considered by Varewyck et al. (2014). This takes an equally weighted

average across all hospitals in the denominator and thus corresponds to using the care level of an average

provider as the reference in indirect standardization. In contrast, we want to use the national average level

of care as the reference, and choose as the target assignment regime P (A = a | X) = P (Z = a | X).

Then the denominator of (4.3) involves an average across all hospitals but weighted by their actual volume.

For the causal estimand for hospital z under this special case, we introduce the shorthand notation θz ≡ SMR.

Now, utilizing the causal assumptions listed in Section 4.3.1, and the additional conditional independence

properties (Y1, . . . , Ym) ⊥⊥ A | (Z,X) and A ⊥⊥ Z | X, it can be shown (Appendix A) that (4.3) can be


expressed in terms of observable quantities as

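\begin{align}
\theta_z &= \frac{\sum_x P(Y = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x \sum_a P(Y = 1 \mid X = x, Z = a) P(A = a \mid X = x) P(X = x \mid Z = z)} \notag \\
&= \frac{E[Y \mid Z = z]}{E\{E[Y \mid X] \mid Z = z\}}, \tag{4.5}
\end{align}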

where the denominator corresponds to using an outcome model without hospital effects, and averaging the

predictions from such a model over the patients of hospital z. This is similar to the indirect standardization

approach considered by e.g. Faris et al. (2003) and Tang et al. (2015), and is the appropriate modelling

approach when the reference is chosen as the average level of care in the health care system. While it depends

on the context whether this is the relevant comparison, we will now demonstrate how to obtain a simple

doubly robust estimator for the causal SMR under such standardization. We show in Appendix A that

similar manipulations of the causal estimand (that resulted in the equivalence of (4.3) and (4.5)) further

result in two other equivalent expressions in terms of observable quantities, that is,

\[
\theta_z = \frac{E[1_{\{Z = z\}} Y]}{E[P(Z = z \mid X) Y]} \tag{4.6}
\]

and

\[
\theta_z = \frac{E[1_{\{Z = z\}} Y]}{E\{E[Y \mid X] P(Z = z \mid X)\}}. \tag{4.7}
\]

Thus, under the causal assumptions, the causal estimand θz can be written in terms of observable quantities

in three equivalent forms that could be estimated using either an outcome model (4.5), an assignment model

(4.6), or a combination of both models (4.7). Therefore, we may utilize all three forms in a doubly robust

estimator.
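The equivalence of the three observable forms can be verified numerically on a small discrete distribution. The sketch below uses a hypothetical joint distribution of a binary covariate X, three hospitals Z and a binary outcome Y (all probabilities are made up), and evaluates (4.5), (4.6) and (4.7) by direct enumeration; the helper names are illustrative only.

```python
from itertools import product

# Hypothetical discrete distribution: binary covariate X, three hospitals Z.
P_X = {0: 0.6, 1: 0.4}
P_Z_GIVEN_X = {0: {1: 0.5, 2: 0.3, 3: 0.2},
               1: {1: 0.2, 2: 0.3, 3: 0.5}}
P_Y1_GIVEN_XZ = {(0, 1): 0.10, (0, 2): 0.15, (0, 3): 0.20,
                 (1, 1): 0.30, (1, 2): 0.40, (1, 3): 0.50}


def joint(x, a, y):
    """P(X = x, Z = a, Y = y) for the observed data."""
    p_y1 = P_Y1_GIVEN_XZ[(x, a)]
    return P_X[x] * P_Z_GIVEN_X[x][a] * (p_y1 if y == 1 else 1.0 - p_y1)


def expectation(f):
    """E[f(X, Z, Y)] by enumerating the joint distribution."""
    return sum(f(x, a, y) * joint(x, a, y)
               for x, a, y in product(P_X, (1, 2, 3), (0, 1)))


def outcome_given_x(x):
    """E[Y | X = x]: hospital-specific risks weighted by actual volumes."""
    return sum(P_Y1_GIVEN_XZ[(x, a)] * P_Z_GIVEN_X[x][a] for a in (1, 2, 3))


def theta_forms(z):
    """Evaluate the observable expressions (4.5), (4.6) and (4.7) for hospital z."""
    p_z = expectation(lambda x, a, y: float(a == z))
    observed = expectation(lambda x, a, y: (a == z) * y)  # E[1{Z=z} Y]
    form_45 = (observed / p_z) / (
        expectation(lambda x, a, y: (a == z) * outcome_given_x(x)) / p_z)
    form_46 = observed / expectation(lambda x, a, y: P_Z_GIVEN_X[x][z] * y)
    form_47 = observed / expectation(
        lambda x, a, y: outcome_given_x(x) * P_Z_GIVEN_X[x][z])
    return form_45, form_46, form_47


forms = {z: theta_forms(z) for z in (1, 2, 3)}
```

For every hospital the three entries of `forms[z]` agree up to floating-point error, because here the true outcome and assignment probabilities are used; differences between the forms arise only once these quantities are replaced by estimates from possibly misspecified models.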

4.3.5 Proposed doubly robust estimator

As equations (4.5) - (4.7) are all equivalent and contain either one or both of an outcome model and

propensity/assignment model, we may now write the causal SMR as

\[
\theta_z = \frac{E[1_{\{Z = z\}} Y]}{E[P(Z = z \mid X) Y]} + \frac{E[Y \mid Z = z]}{E\{E[Y \mid X] \mid Z = z\}} - \frac{E[1_{\{Z = z\}} Y]}{E\{E[Y \mid X] P(Z = z \mid X)\}}. \tag{4.8}
\]

This motivates the proposed DR estimator for the SMR of hospital z under indirect standardization

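\[
\hat{\theta}_z \equiv \frac{\sum_{i=1}^{n} 1_{\{Z_i = z\}} Y_i}{\sum_{i=1}^{n} e(x_i, z, \hat{\gamma}) Y_i} + \frac{\sum_{i=1}^{n} 1_{\{Z_i = z\}} Y_i}{\sum_{i=1}^{n} 1_{\{Z_i = z\}} m(x_i, \hat{\phi})} - \frac{\sum_{i=1}^{n} 1_{\{Z_i = z\}} Y_i}{\sum_{i=1}^{n} m(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma})}. \tag{4.9}
\]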

Here m(xi, φ) ≡ E[Yi | Xi = xi, φ] is an outcome model parametrized in terms of φ. In the case of a

binary outcome variable, this would be a logistic regression model of the form

\[
m(x_i, \phi) \equiv \operatorname{expit}\{\phi_0 + \phi_1' x_i\}, \tag{4.10}
\]


where φ ≡ (φ0, φ1). The corresponding parameter estimates are denoted by φ̂. Further, e(xi, z, γ) ≡ P (Zi =

z | Xi = xi, γ) is a multinomial assignment probability model parametrized in terms of γ, given by

\[
e(x_i, z, \gamma) \equiv \frac{\exp(\gamma_{0z} + \gamma_{1z}' x_i)}{1 + \sum_{a=2}^{m} \exp(\gamma_{0a} + \gamma_{1a}' x_i)}, \quad z = 2, \ldots, m \tag{4.11}
\]

and e(xi, 1, γ) = 1 − ∑_{z=2}^{m} e(xi, z, γ), with γ ≡ (γ02, . . . , γ0m, γ12, . . . , γ1m) denoting the collection of all the

parameters, and the corresponding parameter estimates denoted by γ̂.
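Given fitted values from the two models, the estimator (4.9) reduces to a few sums. The following sketch is one possible way to compute it, together with the outcome-model-only and assignment-model-only ratios; the function name and the toy inputs are hypothetical, and m_hat and e_hat are assumed to hold the fitted values m(x_i, φ̂) and e(x_i, z, γ̂) obtained elsewhere (for example from fits of (4.10) and (4.11)).

```python
def smr_estimates(y, z_obs, z, m_hat, e_hat):
    """SMR of hospital z from the outcome model, the assignment model and (4.9).

    y      : observed binary outcomes
    z_obs  : observed hospital of each patient
    m_hat  : assumed fitted outcome probabilities m(x_i, phi_hat)
    e_hat  : assumed fitted probabilities e(x_i, z, gamma_hat) for this hospital z
    """
    in_z = [1 if zi == z else 0 for zi in z_obs]
    observed = sum(d * yi for d, yi in zip(in_z, y))

    expected_assignment = sum(e * yi for e, yi in zip(e_hat, y))   # first term of (4.9)
    expected_outcome = sum(d * m for d, m in zip(in_z, m_hat))     # second term of (4.9)
    expected_both = sum(m * e for m, e in zip(m_hat, e_hat))       # third term of (4.9)

    assignment_only = observed / expected_assignment
    outcome_only = observed / expected_outcome
    cancellation = observed / expected_both
    return {
        "assignment": assignment_only,
        "outcome": outcome_only,
        "dr": assignment_only + outcome_only - cancellation,
    }


# Hypothetical data for eight patients in two hospitals.
y = [1, 0, 1, 0, 1, 0, 0, 1]
z_obs = [1, 1, 2, 2, 1, 2, 1, 2]
m_hat = [0.5, 0.25, 0.5, 0.25, 0.5, 0.25, 0.25, 0.5]
e_hat = [0.6, 0.4, 0.6, 0.4, 0.6, 0.4, 0.4, 0.6]

estimates = smr_estimates(y, z_obs, z=1, m_hat=m_hat, e_hat=e_hat)
```

The same function would be called once per hospital, each time with the column of fitted assignment probabilities belonging to that hospital.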

The estimator (4.9) is applied in turn to each hospital z = 1, . . . ,m, with the parameters φ and γ estimated

through fitting the regression models (4.10) and (4.11) to the pooled patient population. We note that the

outcome model m(xi, φ) no longer contains a term for the hospital effect (as opposed to m(xi, z, φ) in (4.1)),

and thus we are estimating fewer parameters compared to the outcome model used for direct standardization.

However, the hospital assignment model requires estimation of hospital-level regression parameters, and thus

it is worth considering the case where the observational database may contain information on small hospitals,

in which very few patients are being treated. In such situations, there may not be sufficient data to estimate

the model parameters for all hospitals in the multinomial assignment model. However, we still want to

include all the hospitals in the standardization since the reference is the national average level of care. We

therefore also propose a modification of the multinomial assignment model (4.11) that only specifies covariate

effects for the hospitals that are large enough. Suppose that, out of m hospitals, the first l hospitals are

‘small’ and the rest are ‘large’. We may then pool the small hospitals together as the reference category and

specify the multinomial assignment model as

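\[
e(x_i, z, \gamma) =
\begin{cases}
\dfrac{\exp(\gamma_{0z})}{1 + \sum_{a=2}^{l} \exp(\gamma_{0a}) + \sum_{a=l+1}^{m} \exp(\gamma_{0a} + \gamma_{1a}' x_i)} & \text{for } z = 2, \ldots, l \\[2ex]
\dfrac{\exp(\gamma_{0z} + \gamma_{1z}' x_i)}{1 + \sum_{a=2}^{l} \exp(\gamma_{0a}) + \sum_{a=l+1}^{m} \exp(\gamma_{0a} + \gamma_{1a}' x_i)} & \text{for } z = l + 1, \ldots, m
\end{cases}
\tag{4.12}
\]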

and e(xi, 1, γ) = 1 − ∑_{z=2}^{m} e(xi, z, γ). While the assignment model (4.12) is obviously misspecified for the

small hospitals z = 2, . . . , l, it will still help in estimation of the SMRs for the large hospitals, and thus is

an improvement over using only the outcome model, as the estimation of θz for z = l + 1, . . . ,m will be

doubly robust.
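To make the structure of the pooled model (4.12) concrete, the sketch below evaluates its assignment probabilities for a single covariate vector. The function name, the dictionary-based parameter layout and the parameter values are all hypothetical; only the formula itself is taken from (4.12).

```python
import math


def pooled_assignment_probs(x, intercepts, slopes, n_small):
    """Assignment probabilities e(x, z) for z = 1, ..., m under model (4.12).

    intercepts[z] : gamma_0z for z = 2, ..., m
    slopes[z]     : gamma_1z (list) for the large hospitals z = n_small + 1, ..., m
    n_small       : l, the number of small hospitals (hospital 1 is the reference)
    """
    m = max(intercepts)
    linear = {}
    for z in range(2, m + 1):
        eta = intercepts[z]
        if z > n_small:  # covariate effects only for the large hospitals
            eta += sum(g * xi for g, xi in zip(slopes[z], x))
        linear[z] = eta
    denominator = 1.0 + sum(math.exp(eta) for eta in linear.values())
    probs = {z: math.exp(eta) / denominator for z, eta in linear.items()}
    probs[1] = 1.0 - sum(probs.values())
    return probs


# Hypothetical parameters: m = 5 hospitals, of which the first l = 3 are small.
intercepts = {2: -1.0, 3: -0.5, 4: 0.5, 5: 1.0}
slopes = {4: [0.5, 0.5], 5: [1.0, 1.0]}

probs_a = pooled_assignment_probs([0.2, 0.0], intercepts, slopes, n_small=3)
probs_b = pooled_assignment_probs([1.5, 1.0], intercepts, slopes, n_small=3)
```

Under this parametrization the relative sizes of the small hospitals do not depend on the covariates: the ratio of the probabilities for hospitals 2 and 1 equals exp(γ02) for every x, which is exactly the sense in which the model is misspecified for the small hospitals.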

While the DR estimator in (4.9) is composed of three terms as required by the form of (4.1) and (4.2), as

well as using two models for estimation, it is worth noting that the assignment model is not actually being

utilized in inverse probability weighting. Nevertheless, our estimator does in fact have the doubly robust

property, namely that θ̂z converges in probability to θz as long as either the outcome or assignment model

is correctly specified, the proof of which can be found in Appendix B. We will demonstrate this property in

a simulation study in the following section.
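The cancellation behind this property can be sketched heuristically as follows. This is not a substitute for the proof in Appendix B: the limits $m^*(X)$ and $e^*(X)$ of the fitted (possibly misspecified) models are notation introduced here for illustration, and the sketch assumes that the sample averages below converge and that their limits are nonzero.

\begin{align*}
&\text{Write } \hat{\theta}_z = \frac{N_n}{D_{1n}} + \frac{N_n}{D_{2n}} - \frac{N_n}{D_{3n}}, \quad
N_n = n^{-1}\sum_{i=1}^{n} 1_{\{Z_i = z\}} Y_i, \quad
D_{1n} = n^{-1}\sum_{i=1}^{n} e(x_i, z, \hat{\gamma}) Y_i, \\
&\qquad D_{2n} = n^{-1}\sum_{i=1}^{n} 1_{\{Z_i = z\}} m(x_i, \hat{\phi}), \quad
D_{3n} = n^{-1}\sum_{i=1}^{n} m(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma}). \\[1ex]
&\text{Outcome model correct, } m^*(X) = E[Y \mid X]: \\
&\qquad D_{1n} \to E[e^*(X) Y] = E\{e^*(X) E[Y \mid X]\} = \lim D_{3n}
\;\Rightarrow\; \hat{\theta}_z \to \frac{E[1_{\{Z = z\}} Y]}{E\{1_{\{Z = z\}} E[Y \mid X]\}}, \text{ which equals (4.5)}. \\[1ex]
&\text{Assignment model correct, } e^*(X) = P(Z = z \mid X): \\
&\qquad D_{2n} \to E[1_{\{Z = z\}} m^*(X)] = E\{P(Z = z \mid X) m^*(X)\} = \lim D_{3n}
\;\Rightarrow\; \hat{\theta}_z \to \frac{E[1_{\{Z = z\}} Y]}{E[P(Z = z \mid X) Y]}, \text{ which equals (4.6)}.
\end{align*}

In each case the term built from the misspecified model alone shares its limit with the third term and is cancelled by it, so that only a ratio based on the correctly specified model remains.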


4.4 Simulation

We now present the results of a simulation study that illustrates the doubly robust property of the proposed

estimator θ̂z (4.9). To this end, we purposefully kept the number of hospitals in the simulation small. We

simulated 1000 datasets according to the causal pathway in Figure 4.1. Each dataset consists of n = 1000

patients that are assigned into m = 5 hospitals. Each patient has p = 2 measured covariates which are

associated with both the hospital assignment and the quality outcome: X1i ∼ N(0, 1), a standard normal

variable, which, to demonstrate model misspecification, we transform into V1i = |X1i|/√(1 − 2/π) such that more

extreme values of X1i represent increased risk, and X2i ∼ Bernoulli(0.5). We also generate another standard

normal random variable Ui to represent the similarity among the potential outcomes for each patient (see

Figure 4.1). The binary potential outcomes are generated as

Yzi ∼ Bernoulli(expit(α0z + α1V1i + α2X2i + α3Ui))

independently for z = 1, . . . , 5, where the coefficients were chosen as (α1, α2, α3) = (0.5, 1.5, 1.0) for all

simulations and α0 = (α01, . . . , α05), which dictates the level of the true quality of care of each hospital, are

chosen according to two different scenarios. In the first, there is no difference in the quality of care between

the hospitals, and thus we set α0 = (0, . . . , 0), corresponding to SMR=1 for each hospital. In the second,

we set α0 = (0,−1, 0, 1, 0) such that 3 hospitals have SMR near 1 while one hospital has SMR larger than 1

and one has SMR smaller than 1.

The observed hospital assignment for each patient is generated as Zi ∼ Multinomial(p1i, . . . , p5i), where

pzi = expit(β0z + β1zV1i + β2zX2i) for z = 2, . . . , 5 and p1i = 1 − ∑_{z=2}^{5} pzi. Here β0 = (β02, . . . , β05)

dictates the volumes of the hospitals, and β1 = (β12, . . . , β15) and β2 = (β22, . . . , β25) dictate how the

hospital assignment probabilities depend on the patient-level characteristics. We let β1 = (0, 0, 0.5, 1) and

β2 = (−1,−0.5, 0.5, 1) for all simulations, while for the hospital volume we consider β0 = (−1,−0.5, 0.5, 1),

which results in three small volume and two large volume hospitals. This choice of β0 results in hospitals

1-3 having average sizes of 57.6, 17.0 and 30.7, while hospitals 4 and 5 have average sizes of 181.8 and 712.9

respectively, for all simulations.

Finally, the observed outcome Y is given by the potential outcome corresponding to the hospital assign-

ment of each patient, as required by the consistency assumption. Although the true SMRs are not directly

specified by the parameters in the data generating mechanism, we estimated the true SMRs from the simu-

lated potential outcomes using definition (4.3) and the true assignment probabilities, averaged over the 1000

simulation rounds.
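The data-generating mechanism described above can be sketched in a few lines of Python using only the standard library. This is an illustrative reconstruction, not the code used for the reported results. In particular, the hospital assignment probabilities are generated here from a multinomial logit with hospital 1 as the reference category, as in (4.11); this is an assumption made so that the five probabilities always sum to one. The function and variable names and the random seed are likewise hypothetical.

```python
import math
import random

ALPHA = (0.5, 1.5, 1.0)            # (alpha1, alpha2, alpha3)
BETA0 = (-1.0, -0.5, 0.5, 1.0)     # intercepts for hospitals 2..5
BETA1 = (0.0, 0.0, 0.5, 1.0)       # effects of V1 for hospitals 2..5
BETA2 = (-1.0, -0.5, 0.5, 1.0)     # effects of X2 for hospitals 2..5


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def simulate_dataset(n, alpha0, rng):
    """Generate n patients with covariates, potential outcomes, hospital and outcome."""
    scale = math.sqrt(1.0 - 2.0 / math.pi)
    patients = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        v1 = abs(x1) / scale              # transformed covariate V1
        x2 = int(rng.random() < 0.5)
        u = rng.gauss(0.0, 1.0)           # latent variable shared by potential outcomes
        potential = [
            int(rng.random() < expit(a0 + ALPHA[0] * v1 + ALPHA[1] * x2 + ALPHA[2] * u))
            for a0 in alpha0
        ]
        # ASSUMPTION: multinomial-logit assignment with hospital 1 as reference.
        weights = [1.0] + [math.exp(b0 + b1 * v1 + b2 * x2)
                           for b0, b1, b2 in zip(BETA0, BETA1, BETA2)]
        z = rng.choices([1, 2, 3, 4, 5], weights=weights)[0]
        patients.append({"x1": x1, "v1": v1, "x2": x2, "z": z,
                         "y": potential[z - 1],   # consistency: Y = Y_Z
                         "potential": potential})
    return patients


rng = random.Random(2019)  # arbitrary seed
data = simulate_dataset(1000, alpha0=(0.0, -1.0, 0.0, 1.0, 0.0), rng=rng)
volumes = {z: sum(p["z"] == z for p in data) for z in range(1, 6)}
```

Because all potential outcomes are retained for every simulated patient, quantities such as the denominator of (4.3) can be approximated directly from the simulated data, which is not possible with real observational data.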

For each dataset, under these specifications, we compute the SMR for each hospital using the estimators

based on both the outcome model (4.5) and the assignment model (4.6) alone and the proposed doubly

robust estimator (4.9) when all models are correctly specified. We then misspecify each model in turn and

then simultaneously, and compare the performance of all three estimators under each scenario. The type

of misspecification considered here is that of misspecifying the functional form of a covariate, i.e. using the

original untransformed variable X1i in place of V1i.


[Figure 4.2: Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, assignment model (4.6) only and doubly robust estimators (4.9) when true SMR = 1.0 for all hospitals. Panels: No misspecification; Outcome misspecified; Assignment misspecified; Both misspecified. Each panel shows the Outcome, Assignment and DR estimators for Hospitals 1 to 5 on an SMR axis from 0.4 to 1.6, with the true value marked.]

Figure 4.2 presents the sampling distribution of the three SMR estimators (based on equations (4.5), (4.6)

and (4.9)) as well as the true value for this ratio under the scenario where there is no difference between the

quality of care. We see that when there is no misspecification of the models (top left panel), the sampling

distributions of the three estimators are nearly identical within each hospital. When either the outcome

model (top right) or the assignment model (bottom left) alone is misspecified, we see that the doubly robust

estimator in both cases produces results similar to the estimator featuring only the correctly specified model,

demonstrating the double robustness property, while the estimators featuring only the misspecified model

produce biased results. When both models are misspecified, all three estimators are biased, but the doubly

robust estimator does not introduce additional bias compared to the other two estimators. In each of the


four misspecification scenarios considered in Figure 4.2, due to their small volume, the sampling distributions

of hospitals 1-3 exhibit more variability than those of hospitals 4 and 5.

[Figure 4.3: Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, assignment model (4.6) only and doubly robust estimators (4.9) when true level of care varies across hospitals. Panels: No misspecification; Outcome misspecified; Assignment misspecified; Both misspecified. Each panel shows the Outcome, Assignment and DR estimators for Hospitals 1 to 5 on an SMR axis from 0.2 to 1.4, with the true value marked.]

Figure 4.3 presents the sampling distributions under the scenario where the SMR is allowed to vary

across hospitals. When all models are correctly specified, the three estimators produce a similar sampling

distribution of SMRs for each hospital. Once again, when one of the models is misspecified, we see that the

doubly robust estimator and the estimator featuring only the correctly specified model produce nearly iden-

tical results while the estimator featuring only the misspecified model produces biased results. As expected,

when both models are misspecified, all three estimators produce biased estimates.


[Figure 4.4: Sampling distributions of observed-to-expected ratios based on outcome model (4.5) only, modified assignment model (4.12) only and doubly robust estimators (4.9) when true level of care varies across hospitals. Panels: No misspecification; Outcome misspecified; Assignment misspecified; Both misspecified. Each panel shows the Outcome, Assignment and DR estimators for Hospitals 1 to 5 on an SMR axis from 0.2 to 1.4, with the true value marked.]

Although we did not simulate a scenario where there are hospitals so small that the full multinomial

assignment model cannot be fitted, it is of interest to consider the effect of pooling hospitals in the assignment

model, as discussed in Section 4.3.5. This requires fitting a multinomial logistic model of the form (4.12)

where only the intercept terms are estimated for hospitals 1, 2 and 3, and both intercept terms and regression

coefficients are estimated for hospitals 4 and 5. The results of this scenario are presented in Figure 4.4.

Relative to Figure 4.3, there is a small difference in the bias of the assignment model based estimator when

the models are correctly specified, yet we do not see much difference when it is misspecified. The doubly

robust estimator, as expected, consistently estimates the true SMR when the outcome model is correctly

specified, despite the presence of the added misspecification to the assignment model. When the outcome


model is misspecified and the doubly robust estimator relies on the assignment model for estimation, there is

additional bias introduced by the use of the modified multinomial model for hospitals 1, 2, and 3. However,

the double robustness property still applies to hospitals 4 and 5, as discussed in Section 4.3.5.

4.5 Discussion

The doubly robust estimator for the SMR under indirect standardization that we have proposed in equation

(4.9) has been shown (in Appendix B) to be an asymptotically consistent estimator when either the outcome

or the assignment model is correctly specified and we have also demonstrated this property through simula-

tions. The simulation results demonstrated that our proposed estimator is robust to model misspecification

of one but not both of the models used for estimation, but performs no worse than the outcome model esti-

mator or the assignment model estimator when both models are misspecified. Some authors have discussed

scenarios where doubly robust estimators have the potential to increase bias (Kang and Schafer, 2007). We

did not encounter these in our simulations, although as a caution it should be noted that the results of the

simulation study apply only to the types of misspecification that we considered.

When small hospitals are present, a modified multinomial assignment model, pooling some of the hospi-

tals, can be used to avoid problems in estimating covariate effects. While the modified assignment model is

inherently a type of misspecified model, we see that the bias introduced by its use only concerns the small hos-

pitals, with the proposed estimator still demonstrating the double robustness property for the large hospitals.

The simulation results also demonstrated that there is little difference in the variance of the sampling

distributions of the estimated SMRs, regardless of the estimator being used. To explain this, we note that

the numerators of the three forms for the causal estimand (equations (4.5), (4.6) and (4.7)) can all be estimated through the same quantity, ∑_{i=1}^{n} 1{Zi=z}Yi, which has a binomial sampling variance. Therefore, if the

variance resulting from the estimation of the denominator terms is small, the variance of all three estimators

will be similar. Secondly, though it is known that inverse probability weighted estimators can be highly

variable even when a correctly specified propensity model is used (Kang and Schafer, 2007), our estimator is

not actually employing the assignment model in inverse probability weighting and therefore is not subjected

to this added variability.

An important consideration to be made is the estimation of the variance of the proposed doubly robust

estimator. It is a common practice in indirect standardization to assume that the estimated expected number

of events in the SMR contributes no variability to the overall estimate of the SMR and thus confidence inter-

vals are built solely on the variability of the observed counts. Faris et al. (2003) have shown that, by ignoring

the modelling error in the estimated expected counts, bias is introduced into the confidence intervals and

can result in misclassification of hospitals as outliers. In the case where the expected counts are estimated

using a logistic model of the binary outcome on a single risk score that incorporates the information from

many patient characteristics deemed relevant, Tang et al. (2015) have proposed an asymptotic distribution

for the SMR from which confidence intervals can be obtained. They also show that these asymptotic confi-

dence intervals perform similarly to intervals obtained through a bootstrap procedure. We thus suggest that


confidence intervals for the proposed doubly robust estimator be computed via bootstrap, and the derivation

of an explicit form for the variance of our estimator is left as future work.
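A percentile bootstrap of the kind suggested here could be organized as in the sketch below. It is a simplified illustration under stated assumptions: patients are resampled with replacement from the pooled data, the estimator is recomputed from scratch (including any model fitting) in every resample, and, to keep the example self-contained, the estimator used is a plain outcome-model SMR with pooled stratum-specific rates on one categorical covariate rather than the doubly robust estimator (4.9). All function names and the toy data are hypothetical.

```python
import random


def outcome_model_smr(rows, z=1):
    """O/E for hospital z, with E from pooled stratum-specific event rates.

    rows are (x, hospital, y) tuples; the 'outcome model' here is simply the
    pooled event rate within each level of a single categorical covariate x.
    """
    totals, events = {}, {}
    for x, _, y in rows:
        totals[x] = totals.get(x, 0) + 1
        events[x] = events.get(x, 0) + y
    rate = {x: events[x] / totals[x] for x in totals}
    observed = sum(y for _, h, y in rows if h == z)
    expected = sum(rate[x] for x, h, _ in rows if h == z)
    return observed / expected


def percentile_bootstrap_ci(rows, estimator, n_boot=500, level=0.95, seed=2019):
    """Percentile interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        estimator([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper


def make_rows(x, hospital, n_patients, n_events):
    return [(x, hospital, 1 if i < n_events else 0) for i in range(n_patients)]


# Hypothetical data: two covariate strata, two hospitals.
rows = (make_rows(0, 1, 40, 6) + make_rows(1, 1, 20, 10)
        + make_rows(0, 2, 80, 8) + make_rows(1, 2, 60, 18))

point = outcome_model_smr(rows)
lower, upper = percentile_bootstrap_ci(rows, outcome_model_smr)
```

Because the expected count is re-estimated within each resample, the resulting interval reflects the variability of both the observed and the expected counts. With very small hospitals a resample may contain no patients from the hospital of interest, a case this sketch does not handle.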

The present methodological work has several possible extensions. One is to consider doubly robust

estimators for composite scores based on multiple quality indicators. This is motivated by the fact that

policy makers would likely base their decisions for the allocation of funding and resources on multiple

dimensions of quality of care. Further, the proposed framework can be generalized to also incorporate within-hospital comparisons over time, in addition to between-hospital comparisons in the same time period. Here

again we need to be able to remove the confounding due to changes in the patient population over time,

possibly in a doubly robust way.


4.6 Appendix A: Proofs for equations (4.4)-(4.7)

Throughout, we make use of the notation and assumptions introduced in Sections 4.3.1 and 4.3.2. First,

under a general target assignment regime P (A | X) we can write

\begin{align}
\mathrm{SMR} &= \frac{E_{X \mid Z = z}\{E[Y_z \mid X, Z = z]\}}{E_{A, X \mid Z = z}\{E[Y_A \mid A, X, Z = z]\}} \notag \\
&= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x \sum_a P(Y_a = 1 \mid A = a, Z = z, X = x) P(A = a, X = x \mid Z = z)} \notag \\
&= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(X = x, Z = z)}{\sum_x \sum_a P(Y_a = 1 \mid Z = z, X = x) P(A = a, X = x, Z = z)} \notag \\
&= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(Z = z \mid X = x) P(X = x)}{\sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a, Z = z \mid X = x) P(X = x)} \notag \\
&= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(Z = z \mid X = x) P(X = x)}{\sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(Z = z \mid X = x) P(X = x)}. \tag{4.13}
\end{align}

In the above, the third equality followed from the conditional independence property (Y1, . . . , Ym) ⊥⊥ A | (Z,X)

and the fifth equality from the conditional independence property A ⊥⊥ Z | X, both of which are

taken to be true by the definition of A. To show the equivalence between (4.3) and (4.4), under the target

assignment regime where P (A = a | X) = m^{-1} we can further write this as

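\begin{align}
\mathrm{SMR} &= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(X = x \mid Z = z)} \notag \\
&= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{m^{-1} \sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(X = x \mid Z = z)} \tag{4.14} \\
&= \frac{P(Y_z = 1 \mid Z = z)}{m^{-1} \sum_a P(Y_a = 1 \mid Z = z)} \notag \\
&= \frac{E[Y_z \mid Z = z]}{m^{-1} \sum_a E[Y_a \mid Z = z]}. \notag
\end{align}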

On the other hand, under the hypothetical assignment regime under which P (Z = z | X) = P (A = z | X),

starting from the general form (4.13) above, we can express the causal parameter θz in the form

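\begin{align}
\theta_z &= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x \sum_a P(Y = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(X = x \mid Z = z)} \notag \\
&= \frac{\sum_x P(Y = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x \sum_a P(Y = 1 \mid Z = a, X = x) P(Z = a \mid X = x) P(X = x \mid Z = z)} \notag \\
&= \frac{P(Y = 1 \mid Z = z)}{\sum_x P(Y = 1 \mid X = x) P(X = x \mid Z = z)} \notag \\
&= \frac{E[Y \mid Z = z]}{\sum_x E[Y \mid X = x] P(X = x \mid Z = z)} \notag \\
&= \frac{E[Y \mid Z = z]}{E\{E[Y \mid X] \mid Z = z\}}, \tag{4.15}
\end{align}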


which proves equality (4.5). We note that (4.15) contains a term E[Y | X] which could be estimated by

fitting an outcome model. An alternative form may be obtained as

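\begin{align}
\theta_z &= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(Z = z \mid X = x) P(X = x)}{\sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(Z = z \mid X = x) P(X = x)} \notag \\
&= \frac{\sum_x P(Y = 1 \mid X = x, Z = z) P(Z = z \mid X = x) P(X = x)}{\sum_x P(Z = z \mid X = x) \sum_a P(Y = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(X = x)} \notag \\
&= \frac{E[1_{\{Z = z\}} Y]}{\sum_x \sum_{y=0}^{1} P(Z = z \mid X = x) \, y \, P(Y = y \mid X = x) P(X = x)} \notag \\
&= \frac{E[1_{\{Z = z\}} Y]}{E[P(Z = z \mid X) Y]}, \tag{4.16}
\end{align}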

which proves equality (4.6). Expression (4.16) only involves a term P (Z = z | X) which could be estimated

by fitting a multinomial assignment model. Finally we can derive one more expression for θz, beginning from

Equation (4.13), as

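\begin{align}
\theta_z &= \frac{\sum_x P(Y_z = 1 \mid X = x, Z = z) P(Z = z \mid X = x) P(X = x)}{\sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(Z = z \mid X = x) P(X = x)} \notag \\
&= \frac{E[1_{\{Z = z\}} Y]}{\sum_x P(Z = z \mid X = x) P(Y = 1 \mid X = x) P(X = x)} \notag \\
&= \frac{E[1_{\{Z = z\}} Y]}{E\{P(Z = z \mid X) P(Y = 1 \mid X)\}} \notag \\
&= \frac{E[1_{\{Z = z\}} Y]}{E\{E[Y \mid X] P(Z = z \mid X)\}}, \tag{4.17}
\end{align}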

which proves equality (4.7). Expression (4.17) is the final term in the proposed doubly-robust estimator.

The denominator combines two terms that could be estimated by an outcome model and a multinomial

assignment model, and serves as the cancellation term to achieve double robustness.


4.7 Appendix B: Consistency of the Proposed Estimator

In this appendix we show that (4.9) is a consistent estimator. This will be done asymptotically using the

Law of Large Numbers combined with Slutsky’s theorem. We show here that the estimator is consistent

when all models are correctly specified, as well as when each model in turn is misspecified.

4.7.1 A note on correctly specified models

We assume that the triples $W_i \equiv (X_i, Y_i, Z_i)$ and $W_j \equiv (X_j, Y_j, Z_j)$ are independent and identically distributed for $i \neq j$. Further, we assume that $\hat{\phi} \xrightarrow{P} \phi_0$ as $n \to \infty$, namely, the estimator of the outcome model parameters converges in probability to some unknown constant. We say that the relationship between an outcome variable and the covariates is correctly specified when

\[
E[Y_i \mid X_i, \phi_0] = E[Y_i \mid X_i].
\]

Since the parameter $\phi_0$ is unknown and must be estimated by $\hat{\phi}$, by the continuous mapping theorem, $m(x_i, \hat{\phi}) \equiv E[Y_i \mid X_i, \hat{\phi}] \xrightarrow{P} E[Y_i \mid X_i, \phi_0]$. Therefore, if the model is correctly specified, we have

\[
m(x_i, \hat{\phi}) \xrightarrow{P} E[Y_i \mid X_i, \phi_0] = E[Y_i \mid X_i] \quad \text{as } n \to \infty.
\]

The above is the case when we are considering an outcome model. For the case of a correctly specified assignment model with estimated parameters $\hat{\gamma}$, we have that

\[
\hat{\gamma} \xrightarrow{P} \gamma_0 \;\Rightarrow\; e(x_i, z, \hat{\gamma}) \equiv P(Z_i = z \mid X_i, \hat{\gamma}) \xrightarrow{P} P(Z_i = z \mid X_i, \gamma_0)
\]

and thus if the assignment model is correctly specified, we have

\[
e(x_i, z, \hat{\gamma}) \xrightarrow{P} P(Z_i = z \mid X_i, \gamma_0) = P(Z_i = z \mid X_i).
\]

4.7.2 Consistency for correctly specified models

Our proof of general consistency will show that the numerators and denominators of each term in the

summation converge in probability to an expectation that is equivalent to the numerator and denominators

of the quantity of interest. Then, by Slutsky’s theorem, we see that, since all three summation terms are

equivalent, the summation itself converges in probability to the estimand, and therefore our estimator is

consistent.

Numerators: Let $g(W_i) = \mathbf{1}_{\{Z_i=z\}} Y_i$, namely the numerator of all three terms in equation (4.9). Then by the Law of Large Numbers (LLN) we have

\[
\frac{1}{n} \sum_{i=1}^{n} g(W_i) \xrightarrow{P} E[g(W)] = E[\mathbf{1}_{\{Z=z\}} Y],
\]

where we can write

\[
\begin{aligned}
E[\mathbf{1}_{\{Z=z\}} Y] &= \sum_x P(Y = 1 \mid Z = z, X = x) P(Z = z, X = x) \\
&= \sum_x P(Y_z = 1 \mid Z = z, X = x) P(Z = z, X = x) \\
&= P(Y_z = 1 \mid Z = z) P(Z = z) \\
&= P(Z = z) E[Y_z \mid Z = z].
\end{aligned}
\]

Thus we have that

\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i \xrightarrow{P} P(Z = z) E[Y_z \mid Z = z].
\]

Denominators: Here we will show, for each term in the estimator, that each denominator converges in

probability to P (Z = z)E[YA | Z = z] and thus, by Slutsky’s theorem, the ratio converges in probability to

the causal estimand.

1. Let $g(W_i; \hat{\gamma}) = e(x_i, z, \hat{\gamma}) Y_i$, the denominator of the first term in (4.9). Under the assumption that the assignment model is correctly specified,

\[
\frac{1}{n} \sum_{i=1}^{n} g(W_i; \hat{\gamma}) \xrightarrow{P} E[g(W; \gamma_0)] = E[P(Z = z \mid X) Y],
\]

where, under the causal assumptions made in Section 4.3, we may write

\[
\begin{aligned}
E[P(Z = z \mid X) Y] &= \sum_x P(Z = z \mid X = x) P(Y = 1 \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y = 1 \mid Z = a, X = x) P(Z = a \mid X = x) P(X = x) P(Z = z \mid X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a \mid X = x) P(X = x) P(Z = z \mid X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a, Z = z \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a, X = x \mid Z = z) P(Z = z) \\
&= \sum_x P(Y_A = 1 \mid Z = z, X = x) P(X = x \mid Z = z) P(Z = z) \\
&= P(Z = z) E[Y_A \mid Z = z],
\end{aligned}
\]

so we have that

\[
\frac{1}{n} \sum_{i=1}^{n} e(x_i, z, \hat{\gamma}) Y_i \xrightarrow{P} P(Z = z) E[Y_A \mid Z = z]
\]

and therefore, by Slutsky’s theorem, we have that the first term of the estimator in (4.9) of the main paper converges to the causal estimand,

\[
\frac{n^{-1} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i}{n^{-1} \sum_{i=1}^{n} e(x_i, z, \hat{\gamma}) Y_i} \xrightarrow{P} \frac{P(Z = z) E[Y_z \mid Z = z]}{P(Z = z) E[Y_A \mid Z = z]} = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}.
\]

2. Let $g(W_i; \hat{\phi}) = \mathbf{1}_{\{Z_i=z\}} m(x_i, \hat{\phi})$, the denominator of the middle term of (4.9) in the main paper. Under the assumption that the outcome model is correctly specified,

\[
\frac{1}{n} \sum_{i=1}^{n} g(W_i; \hat{\phi}) \xrightarrow{P} E[g(W; \phi_0)] = E\{\mathbf{1}_{\{Z=z\}} E[Y \mid X]\},
\]

where we may write

\[
\begin{aligned}
E\{\mathbf{1}_{\{Z=z\}} E[Y \mid X]\} &= \sum_x E[Y \mid X = x] P(X = x, Z = z) \\
&= \sum_x P(Y = 1 \mid X = x) P(X = x, Z = z) \\
&= \sum_x \sum_a P(Y = 1 \mid Z = a, X = x) P(Z = a \mid X = x) P(X = x, Z = z) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(X = x, Z = z) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a \mid X = x) P(Z = z \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a, Z = z \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = a, X = x) P(A = a, Z = z, X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, X = x) P(A = a \mid Z = z, X = x) P(X = x, Z = z) \\
&= \sum_x P(Y_A = 1 \mid Z = z, X = x) P(X = x, Z = z) \\
&= \sum_x P(Y_A = 1 \mid Z = z, X = x) P(X = x \mid Z = z) P(Z = z) \\
&= P(Z = z) E[Y_A \mid Z = z].
\end{aligned}
\]

Therefore we have that the denominator of the outcome model-based term of (4.9) converges as

\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} m(x_i, \hat{\phi}) \xrightarrow{P} P(Z = z) E[Y_A \mid Z = z],
\]

so using Slutsky’s theorem, we have

\[
\frac{n^{-1} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i}{n^{-1} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} m(x_i, \hat{\phi})} \xrightarrow{P} \frac{P(Z = z) E[Y_z \mid Z = z]}{P(Z = z) E[Y_A \mid Z = z]} = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}
\]

and so the term that uses only the outcome model is a consistent estimator for our quantity of interest.

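3. Let $g(W_i; \hat{\phi}, \hat{\gamma}) = m(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma})$, the denominator of the last term in (4.9) of the main paper. Under the assumption that both models are correctly specified, then by the LLN

\[
\frac{1}{n} \sum_{i=1}^{n} g(W_i; \hat{\phi}, \hat{\gamma}) \xrightarrow{P} E[g(W; \phi_0, \gamma_0)] = E\{E[Y \mid X] P(Z = z \mid X)\},
\]

where we may write

\[
\begin{aligned}
E\{E[Y \mid X] P(Z = z \mid X)\} &= \sum_x P(Z = z \mid X = x) P(Y = 1 \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Z = z \mid X = x) P(Y = 1 \mid Z = a, X = x) P(Z = a \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Z = z \mid X = x) P(Y_a = 1 \mid Z = a, A = a, X = x) P(A = a \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Z = z \mid X = x) P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a, Z = z \mid X = x) P(X = x) \\
&= \sum_x \sum_a P(Y_a = 1 \mid Z = z, A = a, X = x) P(A = a, X = x \mid Z = z) P(Z = z) \\
&= \sum_x P(Y_A = 1 \mid Z = z, X = x) P(X = x \mid Z = z) P(Z = z) \\
&= P(Z = z) P(Y_A = 1 \mid Z = z) \\
&= P(Z = z) E[Y_A \mid Z = z],
\end{aligned}
\]

so we have that

\[
\frac{1}{n} \sum_{i=1}^{n} m(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma}) \xrightarrow{P} P(Z = z) E[Y_A \mid Z = z]
\]

and therefore, by Slutsky’s theorem,

\[
\frac{n^{-1} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i}{n^{-1} \sum_{i=1}^{n} m(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma})} \xrightarrow{P} \frac{P(Z = z) E[Y_z \mid Z = z]}{P(Z = z) E[Y_A \mid Z = z]} = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}.
\]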

Finally, since each term in the summation is a consistent estimator of the causal estimand θz, we can use

Slutsky again to show that the entire estimator is a consistent estimator for θz:

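\[
\hat{\theta}_z \xrightarrow{P} \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} - \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} + \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} = \theta_z.
\]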

Thus, estimator (4.9) is asymptotically consistent, when the models are correctly specified.

4.7.3 Consistency under misspecified assignment model

Now we can check the consistency of the estimator when each of the models in turn is misspecified, in order to show the doubly robust property. We begin by assuming that the assignment model is misspecified, but the outcome model remains correct. The misspecified assignment model, denoted by an asterisk, is assumed to converge to a limit different from the true assignment probability, $e^*(x_i, z, \hat{\gamma}) \xrightarrow{P} P^*(Z_i = z \mid X_i, \gamma_0) \neq P(Z_i = z \mid X_i)$. The second term in (4.9) (the estimator for equation (4.5)) will consistently estimate the causal quantity of interest, as the outcome model is correctly specified and the assignment model is not present in this term. Now consider the denominator of the first term of the sum, $\sum_{i=1}^{n} e^*(x_i, z, \hat{\gamma}) Y_i$, when the assignment model is misspecified. We have by the law of large numbers

\[
\frac{1}{n} \sum_{i=1}^{n} e^*(x_i, z, \hat{\gamma}) Y_i \xrightarrow{P} E[P^*(Z = z \mid X, \gamma_0) Y].
\]

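For the third term in the summation, we have by the LLN that, for the denominator,

\[
\frac{1}{n} \sum_{i=1}^{n} m(x_i, \hat{\phi}) e^*(x_i, z, \hat{\gamma}) \xrightarrow{P} E\{E[Y \mid X] P^*(Z = z \mid X, \gamma_0)\},
\]

where we can write

\[
\begin{aligned}
E\{P^*(Z = z \mid X, \gamma_0) E[Y \mid X]\} &= \sum_x P^*(Z = z \mid X = x, \gamma_0) P(Y = 1 \mid X = x) P(X = x) \\
&= \sum_x \sum_{y=0}^{1} P^*(Z = z \mid X = x, \gamma_0)\, y\, P(Y = y \mid X = x) P(X = x) \\
&= E[P^*(Z = z \mid X, \gamma_0) Y],
\end{aligned}
\]

which is equivalent to the asymptotic denominator of the first term above, so the first and third terms cancel in the limit. Thus, using information from the previous section, we have

\[
\hat{\theta}_z \xrightarrow{P} \frac{P(Z = z) E[Y_z \mid Z = z]}{E[P^*(Z = z \mid X, \gamma_0) Y]} - \frac{P(Z = z) E[Y_z \mid Z = z]}{E[P^*(Z = z \mid X, \gamma_0) Y]} + \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} = \theta_z.
\]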

Therefore, when the assignment model is misspecified but the outcome model is correct, we have that the

doubly robust estimator remains a consistent estimator.

4.7.4 Consistency under misspecified outcome model

As we have shown earlier, the numerators converge as follows:

\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i \xrightarrow{P} E[\mathbf{1}_{\{Z=z\}} Y],
\]

where it is possible to write the asymptotic numerator alternatively as

\[
\begin{aligned}
E[\mathbf{1}_{\{Z=z\}} Y] &= \sum_x P(Y = 1 \mid Z = z, X = x) P(Z = z \mid X = x) P(X = x) \\
&= \sum_x P(Y = 1 \mid Z = z, X = x) P(X = x \mid Z = z) P(Z = z) \\
&= P(Z = z) P(Y = 1 \mid Z = z) \\
&= P(Z = z) E[Y \mid Z = z].
\end{aligned}
\]

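Further, the misspecified outcome model, denoted by an asterisk, is assumed to converge to a limit different from the true expected outcome, $m^*(x_i, \hat{\phi}) \xrightarrow{P} E^*[Y_i \mid X_i, \phi_0] \neq E[Y_i \mid X_i]$. We also have by the LLN that

\[
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} m^*(x_i, \hat{\phi}) \xrightarrow{P} E\{\mathbf{1}_{\{Z=z\}} E^*[Y \mid X, \phi_0]\}
\]

and

\[
\frac{1}{n} \sum_{i=1}^{n} m^*(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma}) \xrightarrow{P} E\{E^*[Y \mid X, \phi_0] P(Z = z \mid X)\}.
\]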

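The first of these may be expressed as

\[
\begin{aligned}
E\{\mathbf{1}_{\{Z=z\}} E^*[Y \mid X, \phi_0]\} &= \sum_x P(Z = z \mid X = x) P(X = x) E^*[Y \mid X = x, \phi_0] \\
&= \sum_x P(X = x \mid Z = z) P(Z = z) E^*[Y \mid X = x, \phi_0] \\
&= P(Z = z) E\{E^*[Y \mid X, \phi_0] \mid Z = z\}.
\end{aligned}
\]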

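We note that under the misspecified outcome model, for the middle term of (4.9) we have the convergence

\[
\frac{\sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i}{\sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} m^*(x_i, \hat{\phi})} \xrightarrow{P} \frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{\mathbf{1}_{\{Z=z\}} E^*[Y \mid X, \phi_0]\}},
\tag{4.18}
\]

and for the third term of (4.9) the convergence

\[
\frac{\sum_{i=1}^{n} \mathbf{1}_{\{Z_i=z\}} Y_i}{\sum_{i=1}^{n} m^*(x_i, \hat{\phi}) e(x_i, z, \hat{\gamma})} \xrightarrow{P} \frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{E^*[Y \mid X, \phi_0] P(Z = z \mid X)\}}.
\tag{4.19}
\]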

Here for the right-hand side of (4.18) we get

\[
\begin{aligned}
\frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{\mathbf{1}_{\{Z=z\}} E^*[Y \mid X, \phi_0]\}} &= \frac{E[Y \mid Z = z]}{E\{E^*[Y \mid X, \phi_0] \mid Z = z\}} \\
&= \frac{P(Y = 1 \mid Z = z)}{\sum_x E^*[Y \mid X = x, \phi_0] P(X = x \mid Z = z)} \\
&= \frac{\sum_x P(Y = 1 \mid X = x, Z = z) P(X = x \mid Z = z)}{\sum_x E^*[Y \mid X = x, \phi_0] P(X = x \mid Z = z)} \\
&= \frac{\sum_x P(Y = 1 \mid Z = z, X = x) P(Z = z \mid X = x) P(X = x)}{\sum_x E^*[Y \mid X = x, \phi_0] P(Z = z \mid X = x) P(X = x)} \\
&= \frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{E^*[Y \mid X, \phi_0] P(Z = z \mid X)\}},
\end{aligned}
\]

which is the right-hand side of (4.19).

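Therefore, we have that

\[
\hat{\theta}_z \xrightarrow{P} \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} - \frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{E^*[Y \mid X, \phi_0] P(Z = z \mid X)\}} + \frac{E[\mathbf{1}_{\{Z=z\}} Y]}{E\{E^*[Y \mid X, \phi_0] P(Z = z \mid X)\}} = \theta_z
\]

and thus, when the outcome model is misspecified but the assignment model is correctly specified, the DR estimator remains a consistent estimator. Together with Section 4.7.3, this gives the doubly robust property that we require.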
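
The cancellation argument of Sections 4.7.3 and 4.7.4 can be illustrated with a small simulation. The sketch below is not the estimation procedure of the main paper: it assumes that (4.9) has the form "assignment term + outcome term − combined term" as described in this appendix, uses saturated stratum frequencies in place of fitted parametric models, and mimics misspecification by models that ignore X. The data-generating probabilities and all function names are hypothetical.

```python
import numpy as np

# Hypothetical data-generating process: 3 covariate strata, 3 hospitals.
P_X = np.array([0.5, 0.3, 0.2])                      # P(X = x)
P_Z_X = np.array([[0.6, 0.3, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.1, 0.3, 0.6]])                  # P(Z = z | X = x)
P_Y_XZ = np.array([[0.10, 0.15, 0.20],
                   [0.20, 0.30, 0.25],
                   [0.40, 0.50, 0.35]])              # P(Y = 1 | X = x, Z = z)


def simulate(n, rng):
    x = rng.choice(3, size=n, p=P_X)
    # Inverse-CDF draw of Z given X from the rows of P_Z_X.
    thresholds = np.cumsum(P_Z_X[x], axis=1)[:, :-1]
    z = (rng.random(n)[:, None] > thresholds).sum(axis=1)
    y = (rng.random(n) < P_Y_XZ[x, z]).astype(float)
    return x, z, y


def true_theta(hosp):
    num = np.sum(P_Y_XZ[:, hosp] * P_Z_X[:, hosp] * P_X)   # E[1{Z=z} Y]
    e_y_x = np.sum(P_Y_XZ * P_Z_X, axis=1)                 # E[Y | X = x]
    return num / np.sum(e_y_x * P_Z_X[:, hosp] * P_X)      # form (4.17)


def three_term_estimate(x, z, y, hosp, m_hat, e_hat):
    """Assumed form of (4.9): assignment term + outcome term - combined term.

    m_hat[k] plays the role of m(x, phi-hat) and e_hat[k] of e(x, z, gamma-hat)
    for stratum k.
    """
    in_hosp = (z == hosp)
    num = np.sum(in_hosp * y)
    t_assign = num / np.sum(e_hat[x] * y)
    t_outcome = num / np.sum(in_hosp * m_hat[x])
    t_both = num / np.sum(m_hat[x] * e_hat[x])
    return t_assign + t_outcome - t_both


def run_demo(n=400_000, hosp=2, seed=1):
    rng = np.random.default_rng(seed)
    x, z, y = simulate(n, rng)
    # "Correct" models: saturated stratum-specific frequencies.
    m_fit = np.array([y[x == k].mean() for k in range(3)])
    e_fit = np.array([(z[x == k] == hosp).mean() for k in range(3)])
    # "Misspecified" models: ignore X altogether.
    m_bad = np.full(3, y.mean())
    e_bad = np.full(3, (z == hosp).mean())

    def est(m, e):
        return float(three_term_estimate(x, z, y, hosp, m, e))

    return {
        "truth": float(true_theta(hosp)),
        "both_correct": est(m_fit, e_fit),
        "assignment_misspecified": est(m_fit, e_bad),
        "outcome_misspecified": est(m_bad, e_fit),
        "both_misspecified": est(m_bad, e_bad),
    }
```

In this setup `run_demo()` should return estimates close to the true value whenever at least one of the two models is the saturated one, while the estimate with both models ignoring X reduces to the crude ratio of the hospital's outcome rate to the overall rate and is visibly off. The saturated fits make the cancellation exact in finite samples, which is a convenience of the sketch rather than a property of parametric fits.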

Chapter 5

Effect of Positivity Violations on

Hospital Quality of Care Comparisons

5.1 Abstract

The assumption of positivity is a standard assumption made in the causal inference literature to ensure

that models used to estimate a causal effect of interest are identifiable. Violations of this assumption often

occur in health services research due to certain types of patients (i.e. with particular covariate values) not

receiving treatment at some hospitals, especially those with a low volume of patients. Comparing the quality

of care at these hospitals requires adjustment for patient-level factors that may confound the effect of the

hospital on the outcome of interest. This is often done through the use of direct or indirect standardization

approaches. This paper demonstrates the effect of positivity violations on both directly and indirectly

standardized quantities, where the reference comparison can be to another hospital, an average hospital or

the average provincial/national level. In particular, we demonstrate that only the indirectly standardized

mortality ratio where the reference comparison is the average provincial level is not vulnerable to positivity

violations, unlike the other causal contrasts considered.

5.2 Introduction

Estimating the causal effect of an exposure on some outcome is of interest in many studies. Randomized studies ensure that exposures are randomly allocated across different subject profiles, allowing causal

effects to be estimated explicitly. Observational studies, however, require adjustment for all variables that

confound the exposure-outcome relationship in order for the causal effect to be identifiable based on such

data. Further, insufficient adjustment for such variables results in biased estimates of the causal effect due

to unmeasured confounding. Though confounder adjustment is necessary in observational studies, it is well

known (Cochran, 1957) that there are issues with effect estimation when there are covariate values for which

one or more exposures are not observed. In particular, there must be a sufficient number of observations

of each exposure across all confounder strata for causal effects to be identifiable. In causal inference, this

has been termed positivity (Hernan and Robins, 2006) and is a necessary assumption alongside those of

consistency (Cole and Frangakis, 2009) and exchangeability (Greenland and Robins, 1986).

In health services research, quality comparisons often aim to compare the quality of patient care across

all hospitals in a province or country (Raval et al., 2009; Massarweh et al., 2014), where hospital of treatment functions as the multi-level exposure. Despite such comparisons being causal in nature (Donabedian,

1988), not much has been done to formalize them in explicit causal terms (Varewyck et al., 2014, 2016;

Daignault and Saarela, 2017). Positivity violations are a large concern for profiling studies as randomization

to hospitals cannot occur. Further, various characteristics of the hospitals, such as being highly-specialized

institutions (e.g. children’s hospital), in addition to certain patient characteristics (e.g. patient postal code)

can cause strong violations of positivity.

Positivity violations can be classified into two types: random and deterministic violations (Westreich and

Cole, 2010). Deterministic violations occur when participants with at least one level of a confounder variable

simply cannot receive at least one level of the exposure. In the hospital profiling setting, one example would

be that adults cannot be treated at a children’s hospital and therefore, for some age groups, there can be no

data regarding the effect of this exposure on the outcome for this group. In contrast, random violations occur

when at least one level of the exposure has not been observed for at least one or more levels of the confounder

variables purely by chance. For example, during a specified data collection window, one hospital may happen

to only treat seniors whereas other hospitals were observed to treat both seniors and non-seniors. In particular, random positivity violations can happen much more easily in the hospital profiling context due to a large

number of hospitals treating very few patients. Regardless of the mechanism driving the positivity violation,

the result of both the random and deterministic violations is non-identifiability of the causal effect of interest.

Detection of positivity violations is not straightforward, yet there have been some methods proposed.

Cole and Hernan (2008) investigated non-positivity by searching for sparsity in contingency tables of continuous variables categorized by groups or quintiles. Cheng et al. (2010) used matching based on propensity

scores to exclude subjects contributing to positivity violations and thus avoid model extrapolation in their

effect estimation. However, the causal effect being estimated is no longer the exposure effect for the total

study population but rather for the population of matched subjects. Wang et al. (2006) proposed a diagnostic based on the parametric bootstrap to quantify the extent to which inference for a given causal effect is

affected by positivity violations. However, given the multi-level nature of the exposure in hospital profiling

and the large number of covariates likely needed for adequate confounder adjustment, many of these ways

of detecting the presence of positivity violations may be impractical or infeasible.

Standardization methods are frequently used when profiling hospital care to adjust for confounders by

comparing the observed indicator of care to what would be expected under some reference level of care.

Direct standardization considers how the entire population under study would fare if given the care level of

one hospital. In contrast, indirect standardization considers how the population of one hospital would fare

under either the care of another hospital, the care of an average hospital, or a provincial/national average

level of care. Each different reference level will result in a different causal estimand, and thus each may

be affected by positivity violations in different ways. This paper will focus on four causal effect estimands

that may be used in hospital profiling analyses: the directly standardized risk difference (Varewyck et al.,

2014) and three forms for the indirectly standardized mortality ratio (SMR), an observed-to-expected ratio,

where the reference is the care of another hospital, or an average hospital’s care (Varewyck et al., 2014) or a

province-wide or nationwide average care level (Daignault and Saarela, 2017). In Section 5.3, we introduce

the causal estimand for direct standardization and show that it is susceptible to positivity violations. In

Section 5.4, we present the causal estimands for the SMR under the three alternative reference levels of care

above. We then demonstrate that the indirectly standardized SMR where reference is made to the average

national/provincial level of care is unaffected by violations of positivity and that the estimated causal effect

still reflects a comparison to this average. These results are then illustrated in Section 5.5 using a toy

example of hospital comparisons in which one hospital has not treated any patients younger than 60 years

of age, followed by a short discussion in Section 5.6.

5.3 Direct Standardization and Positivity

5.3.1 Notation and Assumptions in Causal Inference

We will consider for simplicity a binary quality outcome variable, Y ∈ {0, 1}, but remark that the results that follow are generalizable to other variable types. We let Z ∈ {1, . . . , p} be the observed hospital assignment variable, and X ≡ {X1, . . . , Xq} be a set of patient-level characteristics relevant to case-mix adjustment, capturing for example demographic information, medical history, and disease progression. The triples W ≡ (Y, Z, X) are assumed independent and identically distributed across patients. As is the convention in the causal inference literature, we denote by $Y_z$ the potential outcome that would have been observed had the patient been treated in hospital z. The assumption of positivity can be stated mathematically as

\[
0 < P(Z = z \mid X = x) < 1 \quad \text{for all } z \in \{1, \ldots, p\} \text{ and all covariate combinations } x.
\]

In addition to the assumption of positivity, we also assume conditional exchangeability (i.e. no unmeasured confounders), $(Y_1, \ldots, Y_p) \perp\!\!\!\perp Z \mid X$, and consistency, $Y = \sum_{z=1}^{p} \mathbf{1}_{\{Z=z\}} Y_z$, where $\mathbf{1}_{\{Z=z\}}$ is an indicator function taking the value 1 if the condition is met, and 0 otherwise.

[Figure 5.1: The postulated causal mechanism (diagram with nodes X, Z, Y and A).]

In order to denote the potential outcome that refers to a comparison to some population-level expected

level of care, we introduce the variable A ∈ {1, . . . , p} to represent a hypothetical “randomized” hospital

assignment mechanism, similar to VanderWeele et al. (2014). The variable A is defined so that the following conditional independence relationships hold, as depicted in Figure 5.1: (Y1, . . . , Yp) ⊥⊥ A | (Z, X) and A ⊥⊥ Z | X.

Now let $X \in S \subset \mathbb{R}^q$ be the case-mix variables from the set of observable covariates. Suppose, for some hospital denoted by a∗, there exist covariate values x ∈ V ⊂ S such that P (Z = a∗ | X = x) = 0 for all x ∈ V (i.e. positivity is violated for these x values) but P (Z = a∗ | X = x) > 0 for all x ∈ S \ V (i.e. no positivity violations for all other x values). In the example above in which one hospital a∗ has only treated seniors, V corresponds to the non-senior values of the patient age variable, and the positivity violation occurs only for that specific hospital, but not the others, since they treated both age groups.
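
In a dataset, the set V for each hospital can be approximated by tabulating hospital by covariate stratum and looking for empty cells. The following is a minimal sketch, assuming the covariates have already been coded into a finite number of integer strata; the function name `positivity_violations` and the six-patient example are hypothetical. An empty cell identifies an empirical violation but cannot by itself distinguish a random violation from a deterministic one.

```python
import numpy as np


def positivity_violations(x, z, n_strata, n_hospitals):
    """Tabulate patients by (covariate stratum, hospital) and list empty cells.

    x, z: integer arrays of stratum and hospital labels, coded from 0.
    Returns the empirical P(Z = z | X = x) and the (stratum, hospital)
    pairs in which no patient was observed.
    """
    counts = np.zeros((n_strata, n_hospitals), dtype=int)
    np.add.at(counts, (x, z), 1)
    # Strata with no patients at all would produce NaN rows here.
    with np.errstate(invalid="ignore", divide="ignore"):
        p_hat = counts / counts.sum(axis=1, keepdims=True)
    empty = [(int(s), int(h)) for s, h in zip(*np.where(counts == 0))]
    return p_hat, empty


# Hypothetical data: stratum 0 = non-senior, 1 = senior; hospital 2 only saw seniors.
x_obs = np.array([0, 0, 1, 1, 1, 0])
z_obs = np.array([0, 1, 0, 1, 2, 1])
p_hat, empty_cells = positivity_violations(x_obs, z_obs, n_strata=2, n_hospitals=3)
```

Here `empty_cells` contains the single pair (0, 2), that is, non-seniors at hospital 2, which plays the role of x ∈ V for a∗ = 2.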

5.3.2 Directly Standardized Hospital Comparisons

Direct standardization is regarded as the appropriate method of standardization to be employed for ranking

of hospitals (Pouw et al., 2013). In practice, however, indirect standardization is more commonly used.

Direct standardization is the more appropriate method for ensuring comparability of hospitals because it

considers how all patients would fare had they received the care level provided at a specific hospital. This

is an appealing comparison as all hospitals in theory should provide good care to all types of patients.

Varewyck et al. (2014) define a causal estimand for a directly standardized hospital comparison as E[Yz],

which is interpreted as the expected outcome had all patients been treated in z. This causal estimand can

be used to compare hospital care by considering either a risk difference or ratio for pairwise comparisons

between hospitals. Using the consistency and conditional exchangeability assumptions detailed previously,

it is possible to express this causal estimand using observable data as

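\[
\begin{aligned}
E[Y_z] &= E_X\{E[Y_z \mid X]\} \\
&= E_X\{E[Y_z \mid Z = z, X]\} && \text{(exchangeability)} \\
&= E_X\{E[Y \mid Z = z, X]\} && \text{(consistency)} \\
&= \sum_x E[Y \mid Z = z, X = x] P(X = x).
\end{aligned}
\tag{5.1}
\]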

Assuming positivity is necessary as the causal estimand cannot be estimated non-parametrically, without

model extrapolation, from this last equation if the conditional expectation E[Y | Z = z,X = x] is not well-

defined (Hernan and Robins, 2006). This can be seen mathematically by considering the setup outlined in the

previous section. We can split the summation over all covariate values in equation (5.1) into those contained in V (i.e. covariate values that violate positivity) and all others, in S \ V . Consider the causal estimand for the outcome had all patients been treated in hospital a∗. We can rewrite (5.1) as

\[
\begin{aligned}
E[Y_{a^*}] &= \sum_x E[Y \mid Z = a^*, X = x] P(X = x) \\
&= \sum_{x \in V} E[Y \mid Z = a^*, X = x] P(X = x) + \sum_{x \in S \setminus V} E[Y \mid Z = a^*, X = x] P(X = x),
\end{aligned}
\tag{5.2}
\]

where the conditional expectation in the first summation of (5.2) is not well-defined because P (Z = a∗ | X = x) = 0 for all x ∈ V . In short, because there was a set of patients, defined by covariate values in V , who were not treated in a∗, it is not possible to estimate the directly standardized risk without extrapolation. Model extrapolation may mitigate identifiability problems but would require making parametric model assumptions that may not be appropriate (Petersen and van der Laan, 2014). Without such modelling assumptions, the only alternative would be to omit hospital a∗ from the comparisons.
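
To make the failure concrete, the following minimal sketch evaluates a non-parametric plug-in version of (5.1) from cell-level summaries. The counts and outcome means are hypothetical numbers invented for illustration, with hospital index 2 playing the role of a∗ and stratum 0 the role of V; the empty cell is coded as NaN.

```python
import numpy as np

# Hypothetical cell summaries: rows = covariate stratum, columns = hospital.
# Hospital 2 treated no patients in stratum 0, so that cell mean is undefined.
counts = np.array([[50, 40, 0],
                   [30, 40, 60]])
mean_y = np.array([[0.10, 0.20, np.nan],
                   [0.30, 0.25, 0.40]])    # observed E[Y | Z = z, X = x]


def directly_standardized_risk(hosp):
    """Plug-in version of (5.1): sum over x of E[Y | Z = hosp, X = x] P(X = x)."""
    p_x = counts.sum(axis=1) / counts.sum()       # marginal case mix P(X = x)
    return float(np.sum(mean_y[:, hosp] * p_x))   # NaN whenever a needed cell is empty
```

In this sketch `directly_standardized_risk(0)` and `directly_standardized_risk(1)` are ordinary weighted averages, whereas `directly_standardized_risk(2)` is NaN: the first summation in (5.2) has nothing to average, and only a model that extrapolates into the empty cell could fill it.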

5.4 Positivity Violations on Indirectly Standardized SMRs

Indirect standardization considers how patients of hospital z would fare under an alternative reference level

of care. While direct standardization allows hospitals to be compared to each other, indirectly standardized

quantities are not intended to be compared between hospitals (as in ranking), but rather a separate benchmark is constructed for each hospital which serves as the comparison. The benchmark level of care can be

the care level of another hospital, an average hospital, or a national or provincial average care level. The

latter two comparisons are particularly relevant for policy makers who would be interested in allocating a

finite set of resources or funds to improve the care a hospital provides to its own patients. Each reference level of care leads to a distinct causal estimand, and we show the effect of positivity violations on each in turn.

5.4.1 Comparison to Another Hospital

Causal estimands for all indirectly standardized comparisons must take the form of a conditional expectation,

such as E[· | Z = z], which will reflect the expected outcome under some reference for the patient population

of a specific hospital z. The potential outcome will define the comparison of interest in each case. Here,

since we have fixed the value of Z in the condition, we must use the hypothetical assignment indicator A

introduced in Section 5.3.1 to represent the reference hospital in the potential outcome, as E[Ya | Z = z].

When a = z, the estimand represents the observed outcome. If we are interested in estimating the effect of

being treated at hospital a∗ versus hospital z, the SMR can be written as

\[
\mathrm{SMR}_z = \frac{E[Y_z \mid Z = z]}{E[Y_{a^*} \mid Z = z]}.
\tag{5.3}
\]

To estimate the numerator and denominator, the estimand must be expressed using observable quantities

using the assumptions as in Section 5.3. The denominator can be written as

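\[
\begin{aligned}
E[Y_{a^*} \mid Z = z] &= E_{X \mid Z}\{E[Y_{a^*} \mid Z = z, X]\} \\
&= E_{X \mid Z}\{E[Y_{a^*} \mid X]\} && \text{(exchangeability)} \\
&= E_{X \mid Z}\{E[Y_{a^*} \mid Z = a^*, X]\} && \text{(exchangeability)} \\
&= E_{X \mid Z}\{E[Y \mid Z = a^*, X]\} && \text{(consistency)} \\
&= \sum_x E[Y \mid Z = a^*, X = x] P(X = x \mid Z = z),
\end{aligned}
\tag{5.4}
\]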

which takes a nested expectation of the estimand, conditional on the population of hospital z. The numerator can be re-expressed in a similar way.

Once again, to see the effect of a positivity violation on this causal estimand, consider that P (Z = a∗ | X = x) = 0 for hospital a∗ and covariates x ∈ V . Then the summation in equation (5.4) can be partitioned into those covariates that do and do not violate positivity,

\[
\begin{aligned}
E[Y_{a^*} \mid Z = z] &= \sum_x E[Y \mid Z = a^*, X = x] P(X = x \mid Z = z) \\
&= \sum_{x \in V} E[Y \mid Z = a^*, X = x] P(X = x \mid Z = z) \\
&\quad + \sum_{x \in S \setminus V} E[Y \mid Z = a^*, X = x] P(X = x \mid Z = z).
\end{aligned}
\]

Similarly to the directly standardized risk, the expectation in the first summation again is not identifiable as

there is no data available to estimate the outcome effect for hospital a∗ given the observed covariate values

in V . Thus the indirectly standardized SMR when comparing to the care of another hospital is not estimable

if positivity does not hold.

5.4.2 Comparison to an Average Hospital

The causal estimand for an SMR that makes a comparison to an average hospital was given by Varewyck

et al. (2014) as

\[
\mathrm{SMR}_z = \frac{E[Y_z \mid Z = z]}{\frac{1}{p} \sum_{a=1}^{p} E[Y_a \mid Z = z]}.
\tag{5.5}
\]

To use the notational device A here, it is necessary to specify the hypothetical target assignment regime

P (A | X) that would result in a comparison to an average hospital. In this case, the required assignment

regime is $P(A = a \mid X) = p^{-1}$ for all a ∈ {1, . . . , p}. To write the causal estimand in terms of observed data, we take a conditional nested expectation of both the numerator and denominator as in Section 5.4.1. The denominator can be written as

\[
\begin{aligned}
\frac{1}{p} \sum_{a=1}^{p} E[Y_a \mid Z = z] &= \frac{1}{p} \sum_{a=1}^{p} E_{X \mid Z}\{E[Y_a \mid Z = z, X]\} \\
&= \frac{1}{p} \sum_{a=1}^{p} \sum_x E[Y \mid Z = a, X = x] P(X = x \mid Z = z),
\end{aligned}
\tag{5.6}
\]

where the exchangeability and consistency assumptions were used similarly to Section 5.4.1. Notice that

(5.6) is equivalent to estimating (5.4) for every hospital and averaging over them. Since (5.4) and (5.6) are

related in this way, positivity has a similar effect on the causal estimand given in (5.5) as in (5.3). Again

consider that only one hospital a∗ is subject to positivity violations for covariate values x ∈ V . The double sum over covariate values x and hospitals can now be partitioned into the following three components:

\begin{align}
(5.6) &= \frac{1}{p} \left[ \sum_{a=1}^{p} \left( \sum_{x \in V} E[Y \mid Z = a, X = x] P(X = x \mid Z = z) \right. \right. \tag{5.7} \\
&\qquad \left. \left. + \sum_{x \in S \setminus V} E[Y \mid Z = a, X = x] P(X = x \mid Z = z) \right) \right] \notag \\
&= \frac{1}{p} \left[ \sum_{a \neq a^*} \sum_{x \in V} E[Y \mid Z = a, X = x] P(X = x \mid Z = z) \right. \notag \\
&\qquad + \sum_{x \in V} E[Y \mid Z = a^*, X = x] P(X = x \mid Z = z) \tag{5.8} \\
&\qquad \left. + \sum_{a=1}^{p} \sum_{x \in S \setminus V} E[Y \mid Z = a, X = x] P(X = x \mid Z = z) \right] \notag
\end{align}

where first the sum is split into covariate values x ∈ V and x ∈ S \ V . Then the hospital summation in (5.7) is further partitioned into hospital a∗ and all other hospitals a ≠ a∗. Equation (5.8) is the term in which positivity violations affect estimation of the causal effect, as the combination of hospital a∗ and covariate values x ∈ V results in the term E[Y | Z = a∗, X = x] being impossible to estimate. Note that (5.8) is the same as the non-identifiable first summation in the partition of (5.4), and thus the effect of non-positivity is not surprising. Again, without further modelling assumptions, hospital a∗ would have to be dropped from the calculation of the reference level of care.

5.4.3 Comparison to Average Nationwide Care

The final indirectly standardized comparison to consider is to the national/provincial average level of care.

The hypothetical assignment regime needed to define this comparison is P (A = a | X) = P (Z = a | X) for

all a ∈ {1, . . . , p}, which involves again an average across all hospitals (as in Section 5.4.2), but weighted by

the actual volume of each hospital. The SMR for this comparison, given in Daignault and Saarela (2017), is

\[
\mathrm{SMR}_z = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}.
\tag{5.9}
\]

It is possible to rewrite the denominator in terms of observable data as

\[
\begin{aligned}
E[Y_A \mid Z = z] &= E_{A, X \mid Z}\{E[Y_A \mid A, X, Z = z]\} \\
&= \sum_x \sum_{a=1}^{p} E[Y_a \mid Z = z, A = a, X = x] P(A = a, X = x \mid Z = z) \\
&= \sum_x \sum_{a=1}^{p} E[Y_a \mid Z = z, X = x] \frac{P(A = a, Z = z \mid X = x) P(X = x)}{P(Z = z)} \\
&= \sum_x \sum_{a=1}^{p} E[Y \mid Z = a, X = x] \frac{P(A = a \mid X = x) P(X = x) P(Z = z \mid X = x)}{P(Z = z)},
\end{aligned}
\tag{5.10}
\]

where both the original assumptions and the conditional independence assumptions needed for the use of the

notational device A were used. Note that the form of (5.10) is more complicated than (5.6). This is because

the potential outcome contained a random exposure variable and thus a conditional expectation across both

A and X was needed. Now we consider the effect of positivity violations on this estimand by partitioning

the summations into the same three components as in Section 5.4.2:

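$$
\begin{aligned}
(5.10) ={}& \sum_{x \in S \setminus V} \sum_{a=1}^{p} \frac{E[Y \mid Z = a, X = x]\, P(A = a \mid X = x)\, P(X = x)\, P(Z = z \mid X = x)}{P(Z = z)} \\
&+ \sum_{x \in V} \sum_{a \neq a^*} \frac{E[Y \mid Z = a, X = x]\, P(A = a \mid X = x)\, P(X = x)\, P(Z = z \mid X = x)}{P(Z = z)} \\
&+ \sum_{x \in V} \frac{E[Y \mid Z = a^*, X = x]\, P(A = a^* \mid X = x)\, P(X = x)\, P(Z = z \mid X = x)}{P(Z = z)}
\end{aligned}
\tag{5.11}
$$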

The last line (5.11) can now be written, by using the hypothetical assignment regime P (A = a | X) = P (Z = a | X),

as

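$$
\begin{aligned}
(5.11) &= \sum_{x \in V} \frac{E[Y \mid Z = a^*, X = x]\, P(A = a^* \mid X = x)\, P(X = x)\, P(Z = z \mid X = x)}{P(Z = z)} \\
&= \sum_{x \in V} \frac{E[Y \mid Z = a^*, X = x]\, \underbrace{P(Z = a^* \mid X = x)}_{=0}\, P(X = x)\, P(Z = z \mid X = x)}{P(Z = z)} \\
&= 0.
\end{aligned}
$$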

Note here that even though, for hospital a∗ and covariate values x ∈ V , the conditional expectation E[Y |Z = a∗, X = x] remains unidentifiable due to positivity violations, it does not matter for the estimation

of the causal estimand. This is because the unidentifiable term is removed by multiplication with zero,

caused by the very positivity violation that makes that term unidentifiable. The causal estimand can then

be simplified to

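$$
\begin{aligned}
E[Y_A \mid Z = z] ={}& \sum_{x \in S \setminus V} \sum_{a=1}^{p} E[Y \mid Z = a, X = x]\, P(A = a, X = x \mid Z = z) \\
&+ \sum_{x \in V} \sum_{a \neq a^*} E[Y \mid Z = a, X = x]\, P(A = a, X = x \mid Z = z) + 0
\end{aligned}
$$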

which indicates that even in the presence of positivity violations, the causal estimand can still be estimated.

Further the SMR retains its interpretation as a comparison to a national average level of care for the hospital-

specific patient population because the hospital at which positivity fails is still included in the average but

given weight zero.


5.5 Toy Example of Positivity Violations

To illustrate the effects of positivity violations for each of the directly and indirectly standardized causal

estimands, we will consider a toy example of patients being treated for hip fractures in 3 hospitals. Here,

Y is a binary variable indicating whether or not a patient admitted to the ER with a hip fracture received

treatment within 24 hours, Z ∈ {1, 2, 3} is the hospital of treatment indicator, and X will be a binary

variable taking value 1 if the patient is 60 years of age or older and 0 if younger than 60. The

hypothetical data are found in Table 5.1.

                          Hospital 1    Hospital 2    Hospital 3    Combined
Treated within 24 h (Y)   No    Yes     No    Yes     No    Yes     No    Yes
Age (X) < 60               0      0      3     15      6      5      9     20
Age (X) ≥ 60               4     10      2      9      7      3     13     22
Crude rate                  0.714         0.828         0.381         0.656

Table 5.1: Hypothetical example of comparing rate of hip fracture treatment within 24 hours (Y ) between three hospitals (Z) while adjusting for age of patient (X). Crude rate is the rate of treatment in each hospital, unadjusted for X.

From a health services perspective, there appears to be variability in the proportion of patients treated

within 24 hours between hospitals. Hospital 1 also has no data on patients with ages < 60, thus a positivity

violation exists for this hospital. In order to calculate the causal estimands discussed in Sections 5.3 and 5.4

without turning to regression modeling, the analogous empirical proportions would be used in place of the

conditional mean E[Y | Z,X] ≡ P (Y | Z,X), assignment propensity P (Z | X), and covariate distribution

P (X). The resulting empirical proportions can be found in Table 5.2. It is very clear that, based on the

data collected, the probability of being treated within 24 hours given the patient is younger than 60 years

and is seen in hospital 1 cannot be calculated since the probability that such a patient would be seen at

hospital 1 is 0.
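As an illustration, the empirical proportions described above can be computed directly from the counts in Table 5.1. The following is a minimal Python sketch; the dictionary layout and the names `counts`, `cell_total`, `p_x`, `p_z`, `p_z_given_x` and `p_y_given_zx` are our own illustrative choices, and an empty cell is marked with `None` to stand for a non-identifiable proportion.

```python
from fractions import Fraction

# Counts from Table 5.1: counts[z][x] = (not treated within 24 h, treated within 24 h).
# x = 0 is age < 60, x = 1 is age >= 60.
counts = {
    1: {0: (0, 0), 1: (4, 10)},
    2: {0: (3, 15), 1: (2, 9)},
    3: {0: (6, 5), 1: (7, 3)},
}


def cell_total(z, x):
    """Number of patients of hospital z with covariate value x."""
    return sum(counts[z][x])


n = sum(cell_total(z, x) for z in counts for x in (0, 1))

# Covariate distribution P(X = x)
p_x = {x: Fraction(sum(cell_total(z, x) for z in counts), n) for x in (0, 1)}

# Hospital volume P(Z = z)
p_z = {z: Fraction(sum(cell_total(z, x) for x in (0, 1)), n) for z in counts}

# Assignment propensity P(Z = z | X = x)
p_z_given_x = {
    (z, x): Fraction(cell_total(z, x), sum(cell_total(a, x) for a in counts))
    for z in counts for x in (0, 1)
}

# Conditional outcome proportion P(Y = 1 | Z = z, X = x); None marks an empty cell
p_y_given_zx = {
    (z, x): Fraction(counts[z][x][1], cell_total(z, x)) if cell_total(z, x) else None
    for z in counts for x in (0, 1)
}

print(p_z_given_x[(1, 0)])   # zero propensity: the positivity violation
print(p_y_given_zx[(1, 0)])  # None: the proportion cannot be computed
```

Exact fractions are used so that the values can be compared one-to-one with the tabulated proportions.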

Suppose we were interested in estimating the causal estimands for both directly and indirectly standard-

ized measures. Consider the directly standardized comparison between hospital 1 and 2, E[Y1]−E[Y2]. We

would have no issues estimating the second term, as

$$
\begin{aligned}
E[Y_2] &= \sum_{x=0}^{1} E[Y \mid Z = 2, X = x]\, P(X = x) \\
&= P(Y = 1 \mid Z = 2, X = 0)\, P(X = 0) + P(Y = 1 \mid Z = 2, X = 1)\, P(X = 1) \\
&= (15/18)(29/64) + (9/11)(35/64) = 0.825.
\end{aligned}
$$

However, it is not possible to estimate the same for Z = 1 because we have no value for P (Y = 1 | Z =

1, X = 0) and thus we cannot make any causal comparison involving hospital 1 using direct standardization.
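The directly standardized calculation above can be reproduced with a short Python sketch. The proportions are those of Table 5.2; the function name `direct_standardized_mean` and the use of `None` to flag the empty cell are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction

# P(X = x) and P(Y = 1 | Z = z, X = x) from Table 5.2; None marks the empty cell
p_x = {0: Fraction(29, 64), 1: Fraction(35, 64)}
p_y = {
    (1, 0): None,             (1, 1): Fraction(10, 14),
    (2, 0): Fraction(15, 18), (2, 1): Fraction(9, 11),
    (3, 0): Fraction(5, 11),  (3, 1): Fraction(3, 10),
}


def direct_standardized_mean(z):
    """Plug-in estimate of E[Y_z], or None if any required cell is empty."""
    terms = []
    for x, px in p_x.items():
        if p_y[(z, x)] is None:
            # Positivity violation: E[Y | Z = z, X = x] has no estimate,
            # and its weight P(X = x) is not zero, so the sum is undefined.
            return None
        terms.append(p_y[(z, x)] * px)
    return sum(terms)


e_y2 = direct_standardized_mean(2)
e_y1 = direct_standardized_mean(1)
print(round(float(e_y2), 3))  # 0.825
print(e_y1)                   # None
```

The sketch makes explicit why direct standardization fails here: the empty cell is weighted by P (X = 0), which is positive, so nothing removes it from the sum.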


Empirical Proportions

Covariate level   Probability                 Hospital 1   Hospital 2   Hospital 3   All
                  P(Z = z)                    14/64        29/64        21/64
Age < 60          P(X = 0)                    -            -            -            29/64
                  P(Z = z | X = 0)            0/29         18/29        11/29        -
                  P(Y = 1 | Z = z, X = 0)     NI           15/18        5/11         -
Age ≥ 60          P(X = 1)                    -            -            -            35/64
                  P(Z = z | X = 1)            14/35        11/35        10/35        -
                  P(Y = 1 | Z = z, X = 1)     10/14        9/11         3/10         -

Table 5.2: Empirical proportions based on hypothetical data (Table 5.1), for use in causal effect estimation. Note: the conditional outcome proportion for hospital 1 is given value NI for non-identifiable.

For the indirectly standardized pairwise comparison in equation (5.3), a similar issue arises

for the estimation of E[Y1 | Z = z], regardless of which hospital’s patient population we

consider. For example, suppose we wish to know how the population of hospital 2 would fare if treated at

hospital 1. The estimand would be estimated as

$$
\begin{aligned}
E[Y_1 \mid Z = 2] &= \sum_{x=0}^{1} E[Y \mid Z = 1, X = x]\, P(X = x \mid Z = 2) \\
&= \sum_{x=0}^{1} P(Y = 1 \mid Z = 1, X = x)\, P(Z = 2 \mid X = x)\, P(X = x) / P(Z = 2) \\
&= (\mathrm{NI})(18/29)(29/64)(64/29) + \dots = \text{not defined}
\end{aligned}
$$

which is again not well-defined. So the indirectly standardized pairwise comparison also cannot be estimated

under positivity violations. As mentioned in Section 5.4.2, the term that contains the positivity violation

for the indirectly standardized SMR with a comparison to the average hospital is identical in form to that

of the pairwise comparison above, meaning that the causal comparison to an average hospital is also not

estimable.

Finally, we can consider the comparison to the average level of care between the three hospitals. As

shown before, this comparison is the only one that results in an estimate of the causal estimand, regardless

of which hospital population we are interested in, so we can consider without loss of generality the population


of hospital 3:

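$$
\begin{aligned}
E[Y_A \mid Z = 3] &= \sum_{x=0}^{1} \sum_{a=1}^{3} \frac{E[Y \mid Z = a, X = x]\, P(Z = a \mid X = x)\, P(X = x)\, P(Z = 3 \mid X = x)}{P(Z = 3)} \\
&= \underbrace{(\mathrm{NI})(0)}_{=0} \frac{(29/64)(11/29)}{(21/64)} + (15/18)(18/29) \frac{(29/64)(11/29)}{(21/64)} + (5/11)(11/29) \frac{(29/64)(11/29)}{(21/64)} + \dots \\
&= 0 + 0.661 = 0.661.
\end{aligned}
$$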

The first term, even though there is no well-defined estimate for P (Y = 1 | Z = 1, X = 0), receives a weight

of zero from the P (Z = 1 | X = 0) term thus we may consider that this term in fact just contributes 0 to

the average over all hospitals and covariate values, and we are able to obtain an estimate of the causal effect.

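Therefore, taking the crude rate of hospital 3 from Table 5.1, 8/21 = 0.381, as the estimate of E[Y3 | Z = 3], we are able to calculate an SMR for hospital 3 as

$$\mathrm{SMR}_{z=3} = \frac{E[Y_3 \mid Z = 3]}{E[Y_A \mid Z = 3]} = \frac{0.381}{0.661} \approx 0.58$$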

even though there exists non-positivity in the data.
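The reference level E[YA | Z = 3] computed above can be checked with the following Python sketch, which uses the proportions of Table 5.2. The function name `average_care_reference` is our own; the sketch simply skips any hospital and covariate combination with zero assignment propensity, which mirrors the zero weight in the derivation.

```python
from fractions import Fraction

# Empirical proportions from Table 5.2
p_x = {0: Fraction(29, 64), 1: Fraction(35, 64)}
p_z = {1: Fraction(14, 64), 2: Fraction(29, 64), 3: Fraction(21, 64)}
p_z_given_x = {
    (1, 0): Fraction(0, 29),  (2, 0): Fraction(18, 29), (3, 0): Fraction(11, 29),
    (1, 1): Fraction(14, 35), (2, 1): Fraction(11, 35), (3, 1): Fraction(10, 35),
}
p_y = {
    (1, 0): None,             (1, 1): Fraction(10, 14),
    (2, 0): Fraction(15, 18), (2, 1): Fraction(9, 11),
    (3, 0): Fraction(5, 11),  (3, 1): Fraction(3, 10),
}


def average_care_reference(z):
    """Plug-in estimate of E[Y_A | Z = z] under P(A = a | X) = P(Z = a | X)."""
    total = Fraction(0)
    for x, px in p_x.items():
        # P(X = x | Z = z) obtained through Bayes' rule
        population_weight = px * p_z_given_x[(z, x)] / p_z[z]
        for a in p_z:
            if p_z_given_x[(a, x)] == 0:
                continue  # zero weight removes the non-identifiable cell
            total += p_y[(a, x)] * p_z_given_x[(a, x)] * population_weight
    return total


reference_3 = average_care_reference(3)
print(round(float(reference_3), 3))  # 0.661
```

Unlike the directly standardized calculation, the non-identifiable proportion is never touched, because its weight P (Z = 1 | X = 0) is exactly zero.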

5.6 Discussion

The assumption of positivity is often made in the inference of causal effects from observational data, yet in

practice is rarely assessed (Mortimer et al., 2005). As we have demonstrated, the choice of standardization

method and measure will be affected by positivity violations in different ways. Only the indirectly standard-

ized mortality ratio can actually be estimated in the presence of positivity violations, but only when the

comparison is to the national/provincial average level of care. For those that cannot be estimated, Petersen

et al. (2012) suggest a few ways of mitigating the effects of positivity violations, including restricting the set

of adjustment covariates to exclude those that cause violations, restricting the sample to exclude subjects

with limited or no variability within the exposure assignment, or redefining the causal effect to be estimated.

The first involves balancing the bias due to positivity violations and the bias due to insufficient confounder

adjustment, while the second alters the target population for inference (Westreich and Cole, 2010), possibly

leading to erroneous causal conclusions being drawn. In general, it is recommended that one abstains from

including covariates that would naturally induce positivity violations, such as patient postal code. While

such a covariate is a near perfect predictor of the probability of treatment at a hospital, in this case, the

trade-off between large positivity bias and likely much smaller confounder bias favours omitting this covariate

from the adjustment. Rather than attempting to adjust for neighbourhood or rurality which would cause

positivity violations, another option would be to instead adjust for characteristics of the neighbourhoods,

such as those derived in the Ontario Marginalization Index (Matheson, 2018), as these are more likely to be

confounders but should not violate positivity.

When regression modeling is employed for covariate adjustment, model extrapolation may be used to

obtain estimates of the causal effect. However, such approaches themselves require additional parametric

assumptions to be made that may not be suitable (Petersen et al., 2012). Further, if modeling approaches are

applied blindly without attention to the presence of possible positivity violations in the data, significant bias


in the causal estimates may be overlooked. The indirectly standardized SMR when comparing against the

average system-wide care level will not be as susceptible to issues arising from model extrapolation as other

standardized measures. Not only are the fitted values for covariates involved in positivity violations given

zero weight, but indirect standardization involves only extracting fitted values for the observed covariates

within a single hospital. Even in the situation where a hospital has treated patients that no other hospital

has treated (i.e. positivity violated completely), one can still obtain an SMR = 1 for this hospital since only

the data from that hospital will contribute to the average care considered (i.e. due to non-positivity, fitted

values from all other hospitals receive a weight of 0). Therefore even under complete positivity violation,

it is at least possible to estimate a causal effect as well as retain the original reference comparison without

the need to restrict covariate adjustment or redefine the causal estimand. This may have contributed to the

popularity of the indirect SMR over other directly standardized measures in hospital profiling; while indirect

standardization may not provide exactly what is needed, it will always give you something.
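The complete positivity violation scenario described above can be illustrated with a small Python sketch. The counts are entirely hypothetical (two hospitals, each treating a covariate stratum the other never sees), and the function names `n_cell` and `smr` are our own.

```python
from fractions import Fraction

# Hypothetical counts[z][x] = (outcome absent, outcome present).
# Hospital 1 only treats x = 0 and hospital 2 only treats x = 1.
counts = {1: {0: (6, 4), 1: (0, 0)}, 2: {0: (0, 0), 1: (3, 7)}}


def n_cell(z, x):
    return sum(counts[z][x])


def smr(z):
    """Indirectly standardized SMR against the volume-weighted average care."""
    n_z = sum(n_cell(z, x) for x in (0, 1))
    observed = Fraction(sum(counts[z][x][1] for x in (0, 1)), n_z)
    expected = Fraction(0)
    for x in (0, 1):
        if n_cell(z, x) == 0:
            continue  # stratum absent from hospital z's own population
        n_x = sum(n_cell(a, x) for a in counts)
        for a in counts:
            if n_cell(a, x) == 0:
                continue  # hospital a receives assignment weight zero
            rate = Fraction(counts[a][x][1], n_cell(a, x))
            expected += rate * Fraction(n_cell(a, x), n_x) * Fraction(n_cell(z, x), n_z)
    return observed / expected


print(smr(1), smr(2))  # both equal 1
```

In this extreme case the reference level for each hospital is built only from its own data, so the ratio is 1 by construction and carries no information about relative quality.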


Chapter 6

Causal Mediation Analysis for

Standardized Mortality Ratios

* The content of this chapter has been published in the journal Epidemiology, volume 30 in July 2019

(Daignault et al., 2019). There, I continue in the causal inference framework and propose a total effect

decomposition for the hospital effect on an outcome that may be mediated through some process of care. I

propose two estimators for this decomposition and compare their performance through a simulation study as

well as illustrate their use through an application to Ontario kidney cancer data. The complete manuscript,

as published, follows.

6.1 Abstract

Indirectly standardized mortality ratios (SMR) are often used to compare patient outcomes between health

care providers as indicators of quality of care. Observed differences in the outcomes raise the question of

whether these could be causally attributable to earlier processes or outcomes in the pathway of care that

the patients received. Such pathways can be naturally addressed in a causal mediation analysis framework.

Adopting causal mediation models allows the total provider effect on outcome to be decomposed into direct

and indirect (mediated) effects. This in turn enables quantification of the improvement in patient outcomes

due to a hypothetical intervention on the mediator. We formulate the effect decomposition for the indi-

rectly standardized SMR when comparing to a health care system-wide average performance, propose novel

model-based and semi-parametric estimators for the decomposition, study the properties of these through

simulations, and demonstrate their use through application to Ontario kidney cancer data.

Keywords: causal inference, effect decomposition, indirect standardization, mediation analysis, provider

profiling, standardized mortality ratio


6.2 Introduction

Quality improvement in health care should ideally be focused towards initiatives that can demonstrate mea-

surable benefits on patient outcomes. It is common to use patient outcomes to compare the care provided

by hospitals, administrative regions, or surgeons; henceforth, without loss of generality, we refer to compar-

isons between providers. This approach is motivated by the notion that some aspect of the care provided is

associated with patient outcomes. For example, from a clinical perspective, the presence of positive surgical

margins is assessed pathologically following a radical prostatectomy for prostate cancer. If positive margins

are detected, the patient may be referred for salvage radiation therapy (Thompson et al., 2013). Health

service researchers would be interested in determining whether variations observed in the salvage therapy

rates between providers are causally linked to the rate of positive margins. In another example, observed

variations in length of stay after radical nephrectomy for early stage kidney cancer may be causally linked to

the rate of minimally invasive versus open surgery. Therefore, statistical analysis of such pathways between

providers and patient outcomes would provide insight into whether some aspect of the care received from a

provider could contribute towards worse outcomes, and if so, by how much.

In this article, we consider binary or continuous patient outcomes that can be summarized by a quality

indicator in the form of a proportion or an average, such as the proportion of prostate cancer patients needing

salvage radiation therapy, or average length of hospitalization after radical nephrectomy for kidney cancer.

Fair comparison of the quality indicators between providers requires adjustment for patient case-mix (Shahian

and Normand, 2008), that is, for differences in the characteristics of the provider-specific patient populations.

Standardization methods are most commonly employed for this purpose, where the choice between direct

and indirect standardization methods will depend on the comparison of interest. Direct standardization esti-

mates the expected outcomes had the entire standard population experienced the covariate-conditional rates

of the study population. Indirect standardization instead estimates the expected outcomes had the study

population experienced the covariate-conditional rates of the standard population. The latter is a practical

advantage if the standard population is large compared to the study population, as indirect standardization

only requires estimation of the covariate-conditional rates in the standard population. In the present con-

text, the study population are the patients treated by a given provider, and the standard population is either

the patient population of a reference provider or the entire system-wide patient population. The between

provider comparisons and methods to adjust for patient case-mix are most naturally formulated in a causal

modeling framework (Varewyck et al., 2014; Daignault and Saarela, 2017).

The standardized mortality ratio (SMR), an indirectly standardized quantity, is a ratio of observed to

expected outcomes and is commonly used to compare provider-specific performance to the average performance of

the health care system. Despite its name, the SMR can be used to compare patient outcomes other than

mortality (Wolfe, 1994). Although less appropriate for ranking providers (Pouw et al., 2013), the SMR has

the advantage of not requiring modeling of the provider effects or of interactions between provider effects and

case-mix factors (i.e. provider-case-mix interactions) (Varewyck et al., 2016), while comparison to the

average system-wide performance avoids numerous pairwise comparisons between the providers if the latter

are not of interest.


Causal mediation analysis can be used to formalize the notion that patient outcomes are influenced by

both the provider at which they receive treatment and a particular process the provider actually performs

by decomposing the total provider effect on patient outcomes into a direct and an indirect (mediated) effect.

Causal mediation analysis of the effect of a particular structural characteristic of the providers (e.g. academic

versus non-academic hospital (Rochon et al., 2014)) has been considered, and can be carried out using con-

ventional mediation analysis methods. Herein, however, we are specifically interested in the decomposition of

the provider effect itself on the standardized mortality ratio, with multiple providers as the exposure levels.

Causal effect decompositions have been formulated for risk and mean differences (VanderWeele, 2009), odds

ratios (VanderWeele and Vansteelandt, 2010), and risk differences among the exposed population (Vansteelandt

and VanderWeele, 2012), and have been estimated through either parametric model-based estimators (VanderWeele,

2009; Baron and Kenny, 1986) or semi-parametric weighted estimators (Lange et al., 2012). However, as

far as we know, causal mediation analysis has not been considered for SMRs in the provider profiling context.

The objectives of this article are as follows. First, we use potential outcomes notation to express the

indirectly standardized SMR as a causal contrast in the mediated case and demonstrate that it can be

decomposed into direct and indirect (mediated) effects. Second, we propose novel model-based and semi-

parametric estimators for this decomposition. Third, we compare the performance of these estimators

through a simulation study. Last, we illustrate our methods using Ontario kidney cancer data. A brief

discussion follows.

6.3 Causal Estimand and Total Effect Decomposition

6.3.1 Notation

We let Y ∈ {0, 1} or Y ∈ R be the observed binary or continuous outcome variable used to construct a

quality indicator (e.g. salvage radiation therapy following radical prostatectomy or length of hospital stay

after radical nephrectomy), Z ∈ {1, . . . , p} be the provider (e.g. the hospital that performed the surgery),

M ∈ {0, 1} the observed binary mediator (e.g. presence of positive margins in the pathology report or

indicator for minimally invasive versus open surgery), and X ≡ (X1, . . . , Xq) a vector of patient-level covari-

ates for case-mix adjustment, such as demographics, disease-progression, and medical history. We assume

that the quadruples W ≡ (Y,M,X,Z) are independent and identically distributed across patients. As is

the convention in causal mediation analysis, we denote by $Y_{zm}$ the potential outcome had the patient been

treated by provider z ∈ {1, . . . , p} at the mediator level m ∈ {0, 1}. Similarly, we denote by $M_z$ the potential

mediator level had the patient been treated by provider z. Therefore, we may for instance denote by $Y_{zM_{z^*}}$

the potential outcome had a patient been treated by z but had the mediator level that would have naturally

been observed had they been treated by provider z∗, while otherwise receiving the care level of z.

The total effect (TE) in the comparison of study provider z to reference provider z∗ can now be decom-

posed into natural indirect and natural direct effects as

$$
\mathrm{TE} = E[Y_{zM_z}] - E[Y_{z^*M_{z^*}}] = \left(E[Y_{zM_z}] - E[Y_{zM_{z^*}}]\right) + \left(E[Y_{zM_{z^*}}] - E[Y_{z^*M_{z^*}}]\right) = \mathrm{NIE} + \mathrm{NDE}.
$$


The estimation of this decomposition could proceed in the usual way through model or weighting based

methods. However, the resulting p(p − 1)/2 pairwise comparisons may not all be of interest (with p = 50,

there would already be 1,225 pairwise comparisons), and arguably it might be unrealistic (in terms of the

required positivity assumptions) to consider expected outcomes had the entire patient population been

treated by a given provider. Thus, here we concentrate on indirectly standardized comparisons, formulated

for each provider’s own patient population (i.e. conditional on Z = z). In the potential outcomes framework,

these can be formulated as (Daignault and Saarela, 2017)

$$\mathrm{SMR}_z = \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]}, \tag{6.1}$$

where the random variable A ∈ {1, . . . , p} is introduced as a notational device to correspond to a hypothetical

“randomized” target assignment regime with chosen assignment probabilities. For previous use of similar

notation representing random draws of potential outcomes, we refer to VanderWeele et al. (2014, p. 303-304).

The interpretation of equation (6.1) depends on the choice of target assignment regime, P (A | X). We note

that by choosing P (A = z∗ | X) = 1 for a given reference provider z∗, with a binary outcome we would

obtain the usual risk ratio for the effect of treatment among the treated, E[Yz | Z = z]/E[Yz∗ | Z = z]. However, in our

context of multiple providers as exposure levels, we are interested in comparisons to a system-wide average

performance as the reference. Thus, we choose P (A = a | X) = P (Z = a | X) which results in a comparison

to the average level of care that patients similar to those treated by provider z would receive in the system.

The causal relationships between the variables introduced in this section are presented in Figure 6.1.

Figure 6.1: The postulated causal mechanism (diagram with nodes X, Z, M, Y and A).

6.3.2 Causal estimand for SMR

The SMR in equation (6.1) can be expressed in the mediated case as

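$$
\begin{aligned}
\mathrm{SMR}^{TE}_z &= \frac{E[Y_z \mid Z = z]}{E[Y_A \mid Z = z]} = \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \\
&= \frac{\text{Expected outcome of patients of provider } z \text{ treated at the care level of provider } z}{\text{Expected outcome of patients of provider } z \text{ treated at the average care level in the system}}
\end{aligned}
\tag{6.2}
$$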


where Mz refers to the value that the mediator would naturally take for a patient treated by z. The natural

direct effect (NDE) and natural indirect effect (NIE) SMRs for provider z can now be defined as

$$
\begin{aligned}
\mathrm{SMR}^{NDE}_z &= \frac{E[Y_{zM_A} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \\
&= \frac{\text{Expected outcome of patients of provider } z \text{ treated at the average level of the mediator, and at the care level of provider } z \text{ otherwise}}{\text{Expected outcome of patients of provider } z \text{ treated at the average care level in the system}}
\end{aligned}
\tag{6.3}
$$

and

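$$
\begin{aligned}
\mathrm{SMR}^{NIE}_z &= \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{zM_A} \mid Z = z]} \\
&= \frac{\text{Expected outcome of patients of provider } z \text{ treated at the care level of provider } z}{\text{Expected outcome of patients of provider } z \text{ treated at the average level of the mediator, and at the care level of provider } z \text{ otherwise}}.
\end{aligned}
\tag{6.4}
$$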

In other words, the NIE corresponds to the effect of intervening on the mediator (e.g. the positive margin

rate in hospital z) from the observed level Mz to the average level MA, while the NDE corresponds to the

provider effect that remains after this intervention (i.e. the effect due to any other aspect of care by z, other

than positive margins).

6.3.3 Total effect decomposition of SMR

A multiplicative total effect decomposition for mediation analysis holds for the SMR, as

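$$
\begin{aligned}
\mathrm{SMR}^{TE}_z &= \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \\
&= \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \times \frac{E[Y_{zM_A} \mid Z = z]}{E[Y_{zM_A} \mid Z = z]} \\
&= \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{zM_A} \mid Z = z]} \times \frac{E[Y_{zM_A} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \\
&= \mathrm{SMR}^{NIE}_z \times \mathrm{SMR}^{NDE}_z,
\end{aligned}
\tag{6.5}
$$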

where the components are as in equations (6.3) and (6.4), and are themselves SMRs. Here it is obvious that

if provider z performs at the average level for the mediator, we have that $\mathrm{SMR}^{TE}_z = \mathrm{SMR}^{NDE}_z$, and if on the

other hand z is average in all other aspects, $\mathrm{SMR}^{TE}_z = \mathrm{SMR}^{NIE}_z$.
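The multiplicative decomposition can be verified numerically with a minimal Python sketch. The three expected outcomes below are hypothetical values chosen purely for illustration; they are not taken from any data in this work.

```python
import math

# Hypothetical expected outcomes for the patients of one provider z
e_z_Mz = 0.30  # E[Y_{z M_z} | Z = z]: care and mediator level of provider z
e_z_MA = 0.25  # E[Y_{z M_A} | Z = z]: care of z, system-average mediator level
e_A_MA = 0.20  # E[Y_{A M_A} | Z = z]: system-average care and mediator level

smr_te = e_z_Mz / e_A_MA   # total effect SMR
smr_nie = e_z_Mz / e_z_MA  # natural indirect effect SMR
smr_nde = e_z_MA / e_A_MA  # natural direct effect SMR

# The total effect factors into the indirect and direct components
assert math.isclose(smr_te, smr_nie * smr_nde)
print(smr_te, smr_nie, smr_nde)
```

Setting `e_z_Mz` equal to `e_z_MA` (provider z average on the mediator) gives an indirect SMR of 1, so the total effect coincides with the direct effect, in line with the remark above.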


We note that an alternative effect decomposition could be written as

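$$
\begin{aligned}
\mathrm{SMR}^{TE}_z &= \frac{E[Y_{zM_z} \mid Z = z]}{E[Y_{AM_z} \mid Z = z]} \times \frac{E[Y_{AM_z} \mid Z = z]}{E[Y_{AM_A} \mid Z = z]} \\
&= \mathrm{SMR}^{NDE^*}_z \times \mathrm{SMR}^{NIE^*}_z,
\end{aligned}
\tag{6.6}
$$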

which resembles the one considered by Vansteelandt and VanderWeele (2012). In equation (6.6), the NIE

corresponds to the effect of intervening on the mediator from the observed level Mz to the average level MA

with all other aspects of care at the average level, while the NDE corresponds to the provider effect due to

all other aspects of care, while keeping the mediator at the observed level. Vansteelandt and VanderWeele

(2012) noted that the estimation of their decomposition would require fewer assumptions than mediation

analysis usually does, since the mediator is held at the observed level. However, we argue that in the present

context of provider profiling equation (6.6) is less relevant, as it does not correspond to the effect of inter-

vening on the mediator in provider z while keeping other things fixed.

We also note that it would be possible to define controlled direct effect SMRs as

$$\mathrm{SMR}^{CDE_m}_z = \frac{E[Y_{zm} \mid Z = z]}{E[Y_{Am} \mid Z = z]},$$

with the mediator controlled at the level m. However, we do not pursue this for two reasons: it does not

allow for decomposition of the total effect, and it considers an intervention where everyone would receive the

same level of care in terms of the mediator. In the context of the kidney cancer example, it is more realistic

that even after an intervention, a hospital would still continue to perform both types of surgeries, with the

proportion of either type depending on the case-mix that the hospital treats. Thus, we proceed with the

effect measures (6.3) and (6.4), and the decomposition in equation (6.5), and derive estimators for it in the

following section.

6.4 Proposed Estimators

6.4.1 Proposed model-based estimators

Throughout, we make some standard causal assumptions for causal mediation analysis (VanderWeele and

Vansteelandt, 2009), listed in the eAppendix 1. In the following estimators, the numerators and denominators

of equations (6.3) and (6.4) are estimated separately and the ratio taken to provide estimates for the NIE and

NDE. The derivation of the following model-based estimators can be found in eAppendix 2. The numerator

of equation (6.4), which corresponds to the observed outcome rate in provider z, can be expressed in terms

of observable quantities, using the assumptions and relations in eAppendix 1, as

$$
E[Y_{zM_z} \mid Z = z] = \sum_{x} \sum_{m=0}^{1} E[Y \mid Z = z, X = x, M = m]\, P(M = m \mid Z = z, X = x)\, P(X = x \mid Z = z), \tag{6.7}
$$


which involves averaging over predicted mediator levels for patients of provider z. Similarly, the denominator

of equation (6.4) and numerator of equation (6.3) can be re-expressed as

$$
E[Y_{zM_A} \mid Z = z] = \sum_{x} \sum_{a=1}^{p} \sum_{m=0}^{1} E[Y \mid M = m, Z = z, X = x]\, P(M = m \mid Z = a, X = x)\, P(Z = a \mid X = x)\, P(X = x \mid Z = z), \tag{6.8}
$$

which again involves averaging over the predicted mediator levels as well as predictive probabilities of being

treated by a given provider for patients in provider z. Finally the denominator of equation (6.3) can similarly

be expressed as

$$
E[Y_{AM_A} \mid Z = z] = \sum_{x} \sum_{a=1}^{p} \sum_{m=0}^{1} E[Y \mid M = m, Z = a, X = x]\, P(M = m \mid Z = a, X = x)\, P(Z = a \mid X = x)\, P(X = x \mid Z = z). \tag{6.9}
$$

For estimation purposes, we substitute parametric models for the components in equations (6.7)-(6.9),

with the exception of P (X = x | Z = z) where we use the empirical covariate distribution of patients of provider

z. We shall denote by $f(x, m, z; \phi) \equiv E[Y_i \mid X_i = x, Z_i = z, M_i = m, \phi]$ the outcome model parameterized

by φ. In the case of a binary outcome variable, this can be a logistic regression model of the form

$$f(x, m, z; \phi) \equiv \operatorname{expit}\{\phi_0 + \phi_1' x + \phi_2 m + \phi_3 z\} \tag{6.10}$$

where $\phi = (\phi_0, \phi_1, \phi_2, \phi_3)$. The corresponding maximum likelihood parameter estimates are denoted by

$\hat{\phi}$. Further, we denote predictive probabilities given by the mediator model, parameterized by α, by

$g(x, m, z; \alpha) \equiv P(M_i = m \mid Z_i = z, X_i = x, \alpha)$. In the case of a binary mediator, these would again

be derived from a logistic regression model as

$$g(x, m, z; \alpha) \equiv \left[\operatorname{expit}\{\alpha_0 + \alpha_1' x + \alpha_2 z\}\right]^{m} \left[1 - \operatorname{expit}\{\alpha_0 + \alpha_1' x + \alpha_2 z\}\right]^{1-m} \tag{6.11}$$

where $\alpha = (\alpha_0, \alpha_1, \alpha_2)$, with corresponding parameter estimates denoted by $\hat{\alpha}$. Finally, the assignment

probabilities $e(x, z; \gamma) \equiv P(Z_i = z \mid X_i = x, \gamma)$ are derived from a multinomial regression model for the

provider assignment probability parameterized in terms of γ, given by

$$e(x, z; \gamma) \equiv \frac{\exp(\gamma_{0z} + \gamma_{1z}' x)}{1 + \sum_{a=2}^{p} \exp(\gamma_{0a} + \gamma_{1a}' x)}, \quad z = 2, \ldots, p \tag{6.12}$$

and $e(x, 1; \gamma) = 1 - \sum_{z=2}^{p} e(x, z; \gamma)$ with $\gamma = (\gamma_{02}, \ldots, \gamma_{0p}, \gamma_{12}, \ldots, \gamma_{1p})$ denoting the collection of all

parameters, with the corresponding parameter estimates denoted by $\hat{\gamma}$.
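To make the three working models concrete, the following Python sketch evaluates them for given coefficients. It is only an illustration under stated assumptions: the function names are our own, the coefficient values in the usage lines are hypothetical, and the provider terms φ3 and α2 are represented as lookups from provider label to coefficient (one possible coding of a categorical provider effect), which is our assumption rather than something specified above.

```python
import math


def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))


def dot(coef, x):
    return sum(c * v for c, v in zip(coef, x))


def outcome_model(x, m, z, phi0, phi1, phi2, phi3):
    """f(x, m, z; phi) = P(Y = 1 | X = x, M = m, Z = z) for a binary outcome."""
    return expit(phi0 + dot(phi1, x) + phi2 * m + phi3[z])


def mediator_model(x, m, z, alpha0, alpha1, alpha2):
    """g(x, m, z; alpha) = P(M = m | Z = z, X = x) for a binary mediator."""
    p1 = expit(alpha0 + dot(alpha1, x) + alpha2[z])
    return p1 if m == 1 else 1.0 - p1


def assignment_model(x, z, gamma0, gamma1, p):
    """e(x, z; gamma): multinomial logit with provider 1 as the reference."""
    scores = {1: 1.0}  # exp(0) for the reference provider
    for a in range(2, p + 1):
        scores[a] = math.exp(gamma0[a] + dot(gamma1[a], x))
    return scores[z] / sum(scores.values())


# Hypothetical coefficients for p = 3 providers and one covariate
x = [1.0]
gamma0 = {2: 0.3, 3: -0.2}
gamma1 = {2: [0.5], 3: [-0.1]}
probs = [assignment_model(x, z, gamma0, gamma1, p=3) for z in (1, 2, 3)]
print(sum(probs))  # the assignment probabilities sum to one (up to rounding)
```

In practice the coefficients would be replaced by their maximum likelihood estimates obtained from the pooled patient population.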

Therefore, we propose the following model-based estimators for estimation of the total effect based on


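equations (6.7) and (6.9)

$$
\widehat{\mathrm{SMR}}^{TE}_z = \frac{\sum_{i=1}^{n} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, z; \hat{\phi})\, g(x_i, m, z; \hat{\alpha})}{\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, a; \hat{\phi})\, g(x_i, m, a; \hat{\alpha})\, e(x_i, a; \hat{\gamma})}, \tag{6.13}
$$

the natural direct effect based on equations (6.8) and (6.9)

$$
\widehat{\mathrm{SMR}}^{NDE}_z = \frac{\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, z; \hat{\phi})\, g(x_i, m, a; \hat{\alpha})\, e(x_i, a; \hat{\gamma})}{\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, a; \hat{\phi})\, g(x_i, m, a; \hat{\alpha})\, e(x_i, a; \hat{\gamma})} \tag{6.14}
$$

and finally for the natural indirect effect based on equations (6.7) and (6.8)

$$
\widehat{\mathrm{SMR}}^{NIE}_z = \frac{\sum_{i=1}^{n} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, z; \hat{\phi})\, g(x_i, m, z; \hat{\alpha})}{\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i = z\}}\, f(x_i, m, z; \hat{\phi})\, g(x_i, m, a; \hat{\alpha})\, e(x_i, a; \hat{\gamma})}. \tag{6.15}
$$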

The estimators in equations (6.13), (6.14) and (6.15) are applied in turn to each provider z = 1, . . . , p with

parameters φ, γ, and α estimated by fitting the regression models specified above to the pooled patient pop-

ulation. These model-based estimators involve fitting an outcome model conditional on provider assignment

as well as on the mediator and patient covariates. When data are available from a large number of providers,

some of which may be small in volume, such models require estimating a large number of parameters which

may not be feasible without shrinkage. Allowing for possible provider–mediator or provider–case-mix inter-

actions would further add to the number of parameters to be estimated. Thus, in the following section, we

propose alternative semi-parametric estimators of the total effect decomposition that do not require fitting

an outcome model with provider effects.
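As a rough illustration of how equations (6.13) to (6.15) combine the three fitted models, the following Python sketch computes the model-based SMRs for one provider. The callables `f`, `g` and `e` are hypothetical stand-ins for the fitted outcome, mediator and assignment models; their names and signatures are assumptions made here, and the sketch is not the thesis's R implementation.

```python
def model_based_smrs(z, X, Z, f, g, e, p):
    """Model-based TE, NDE and NIE SMRs for provider z (binary mediator).

    f(x, m, a): fitted outcome mean given covariates x, mediator m, provider a
    g(x, m, a): fitted P(M = m | X = x, Z = a)
    e(x, a):    fitted P(Z = a | X = x)
    """
    s_zz = 0.0  # sum over provider-z patients of f(., z) g(., z)
    s_za = 0.0  # outcome under z, mediator and assignment averaged over a
    s_aa = 0.0  # outcome and mediator both averaged over a
    for x, zi in zip(X, Z):
        if zi != z:          # indicator 1{Z_i = z}
            continue
        for m in (0, 1):
            s_zz += f(x, m, z) * g(x, m, z)
            for a in range(1, p + 1):
                s_za += f(x, m, z) * g(x, m, a) * e(x, a)
                s_aa += f(x, m, a) * g(x, m, a) * e(x, a)
    return {"TE": s_zz / s_aa, "NDE": s_za / s_aa, "NIE": s_zz / s_za}
```

Because the three ratios share the same three sums, the returned total effect always equals the product of the returned direct and indirect effects.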

6.4.2 Proposed semi-parametric estimators

The derivation of the following semi-parametric estimators can be found in eAppendix 3, and broadly follows

the ideas outlined by Lange et al. (2012). As in the previous section, the numerator and denominator of

each causal contrast will be estimated separately and their ratio taken to provide estimates of each of the

effects of interest. Once again, we employ the notation and assumptions detailed in eAppendix 1. We

can derive an alternative expression for the numerator of equations (6.2) and (6.4) to that shown in equation

(6.7) as

$$
E[Y_{zM_z} \mid Z = z] = \sum_{x} \sum_{z} \sum_{m=0}^{1} \sum_{y=0}^{1} \frac{y\, 1_{\{Z = z\}}}{P(Z = z)}\, P(Y = y, M = m, Z = z, X = x) \tag{6.16}
$$

which can be seen as weighting the outcome for each patient by the proportion of patients treated by provider

z and then averaging over the patients in provider z. A similar derivation yields the following expression for

the numerator of equation (6.3) and denominator of equation (6.4)

$$
E[Y_{zM_A} \mid Z = z] = \sum_{x} \sum_{a=1}^{p} \sum_{m=0}^{1} \sum_{y=0}^{1} \left[ 1_{\{Z = z\}}\, y\, \frac{P(M = m \mid Z = a, X = x)}{P(M = m \mid Z = z, X = x)}\, \frac{P(Z = a \mid X = x)}{P(Z = z)} \right] \times P(Y = y, M = m, Z = z, X = x) \tag{6.17}
$$

which again can be viewed as weighting each outcome by first, the ratio of the probability of a mediator value

given a patient was treated by a versus z, then the ratio of the probability of being assigned to provider a

versus z, followed by averaging over the patients of z. Finally we obtain an expression for the denominator

of equations (6.2) and (6.3) as

$$
E[Y_{AM_A} \mid Z = z] = \sum_{x} \sum_{a=1}^{p} \sum_{m=0}^{1} \sum_{y=0}^{1} y\, \frac{P(Z = z \mid X = x)}{P(Z = z)}\, P(Y = y, M = m, Z = a, X = x) \tag{6.18}
$$

where here the patient outcomes are only weighted by the ratio of the probability of provider assignment to

z given covariates to the volume of provider z. These expressions motivate the proposed semi-parametric

estimator for the causal estimand of the total effect given in equation (6.2), by taking the ratio of equations

(6.16) and (6.18),

$$
\widehat{\mathrm{SMR}}^{TE}_z = \frac{\sum_{i=1}^{n} Y_i\, 1_{\{Z_i = z\}}}{\sum_{i=1}^{n} Y_i\, e(x_i, z; \hat{\gamma})} \tag{6.19}
$$

where the entire patient population is used to fit the hospital assignment model. Similarly, a proposed

semi-parametric estimator for the causal estimand of the natural direct effect given in equation (6.3), given

by taking the ratio of equations (6.17) and (6.18), is

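$$
\widehat{\mathrm{SMR}}^{NDE}_z = \frac{\sum_{i=1}^{n} \sum_{a=1}^{p} 1_{\{Z_i = z\}}\, Y_i\, \dfrac{g(x_i, m_i, a; \hat{\alpha})}{g(x_i, m_i, z; \hat{\alpha})}\, e(x_i, a; \hat{\gamma})}{\sum_{i=1}^{n} Y_i\, e(x_i, z; \hat{\gamma})} \tag{6.20}
$$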

where g(·) is the same logistic model for the mediator as defined in equation (6.11). Finally, a semi-parametric

estimator of the natural indirect effect shown in equation (6.4) is given by

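$$
\widehat{\mathrm{SMR}}^{NIE}_z = \frac{\sum_{i=1}^{n} Y_i\, 1_{\{Z_i = z\}}}{\sum_{i=1}^{n} \sum_{a=1}^{p} 1_{\{Z_i = z\}}\, Y_i\, \dfrac{g(x_i, m_i, a; \hat{\alpha})}{g(x_i, m_i, z; \hat{\alpha})}\, e(x_i, a; \hat{\gamma})}, \tag{6.21}
$$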

a ratio of equations (6.16) and (6.17). The performance of both the proposed model-based and semi-

parametric estimators will be illustrated in the following section.
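The semi-parametric estimators in equations (6.19) to (6.21) need only the observed data and the fitted mediator and assignment models. The Python sketch below shows one way the three sums could be accumulated; the function name and the callables `g` and `e` are hypothetical placeholders for the fitted models, assumed here for illustration only.

```python
def semiparametric_smrs(z, Y, Z, M, X, g, e, p):
    """Semi-parametric TE, NDE and NIE SMRs for provider z.

    g(x, m, a): fitted P(M = m | X = x, Z = a)
    e(x, a):    fitted P(Z = a | X = x)
    """
    observed = 0.0  # sum_i Y_i 1{Z_i = z}
    weighted = 0.0  # sum_i sum_a 1{Z_i = z} Y_i [g(a) / g(z)] e(x_i, a)
    expected = 0.0  # sum_i Y_i e(x_i, z), taken over all patients
    for y, zi, m, x in zip(Y, Z, M, X):
        expected += y * e(x, z)
        if zi == z:
            observed += y
            for a in range(1, p + 1):
                weighted += y * g(x, m, a) / g(x, m, z) * e(x, a)
    return {"TE": observed / expected,
            "NDE": weighted / expected,
            "NIE": observed / weighted}
```

The ratio `g(x, m, a) / g(x, m, z)` is undefined when the fitted mediator probability under provider `z` is zero, so in practice both mediator levels must be observed for each provider.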

6.5 Simulation study

We now present simulation results to illustrate that the total effect decomposition in equation (6.5) holds

for both the model-based and semi-parametric estimators, as well as to compare the performance of both

sets of proposed estimators. To this end, we maintain a small number of providers (p = 5) in our simulation

of n = 1,000 patients according to the causal pathway in Figure 6.2. The providers increase in patient

volume, with mean volumes ranging from 30 to 380 for providers 1 and 5 respectively. The details of the

data generation mechanism can be found in eAppendix 4. We performed the simulation and application in

the next section using R software and sample code for implementation of these methods can be found in

eAppendix 5.

Figure 6.2: Causal relationship for simulated data. U1, U2 are non-confounder latent variables representing individual-level correlation among the potential binary mediator and potential binary outcome values, respectively. [Diagram nodes: X, Z, A, M1, . . . , Mp, M, U1, Y10, . . . , Yp0, Y11, . . . , Yp1, U2, Y.]

To illustrate the total effect decomposition, we present only the case where both a natural direct and

indirect effect exist for ease of presentation. Similarly, to compare the performance of the model-based and

the semi-parametric estimators in the previous sections, we present only the case where no direct provider–

outcome effect exists (NDE = 1 for all z), but there exists both a provider–mediator and mediator–outcome

effect. Two further scenarios in which the NIE = 1 for all z can be found in eAppendix 4, alongside the

details for generating all scenarios. In each case, we simulated 100 datasets and estimated the total effect,

natural direct effect and natural indirect effect using both estimation approaches.

The total effect decomposition under both the model-based and semi-parametric estimators is presented

in Figure 6.3. We plotted the SMRs for the TE, NDE, and NIE on the log scale, but the y-axis is labelled on

the untransformed SMR scale; thus, the bars have an additive rather than a multiplicative interpretation. It

can be easily seen that, for each provider, adding the signed height of the NIE bar to that of the NDE

bar gives the height of the TE bar, thus illustrating that the decomposition of the total effect in equation

(6.5) holds for the model-based estimators (Figure 6.3a) and for the semi-parametric estimators (Figure 6.3b).

It can be noted that the error bars in Figure 6.3b are slightly longer than in Figure 6.3a and are much

longer in both figures for provider 1. These attributes can also be seen in the subsequent figure comparing

the performance of both sets of estimators. Recall that provider 1 has the smallest volume (n = 30) and thus

displays much more sampling variability than the other providers. This can be seen in all three simulation

scenarios presented in Figure 6.4 and Figures S6.1 and S6.2. As providers become larger, the sampling

variability of both the model-based and semi-parametric estimators decreases, as seen most evidently in

Figure 6.4 for the NDE. We also see that, when estimating the NIE, the semi-parametric estimator exhibits

larger sampling variability across all providers, but performs similarly to the model-based estimator when

estimating the NDE in Figure 6.4. The estimators have nearly identical sampling distributions for the TE,

but are slightly biased for provider 1. The results for the scenarios when the NIE is set to 1 (Figures S6.1

and S6.2) show similar performance, with slightly more sampling variability observed in both estimators of

the indirect effect when no indirect effect exists via the provider-mediator pathway.

[Figure: bar charts of SMR by provider (Providers 1 to 5) with TE, NDE and NIE bars; panel (a) "TE decomposition for model-based estimators", panel (b) "TE decomposition for semi-parametric estimators".]

Figure 6.3: Total effect decomposition for five providers using (a) model-based and (b) semi-parametric estimators. Bars are the means of the sampling distribution for each hospital, and error bars represent 2.5th and 97.5th percentiles of sampling distributions. NDE indicates natural direct effect; NIE, natural indirect effect; SMR, standardized mortality ratio; TE, total effect.

[Figure: three panels (Total Effect, Indirect Effect, Direct Effect) showing sampling distributions of the model-based and semi-parametric estimators for Providers 1 to 5, with true values marked.]

Figure 6.4: Total, indirect, and direct effect sampling distributions of proposed estimators when an indirect effect exists, but no direct effect exists.

6.6 Application to Ontario Kidney Cancer Data

We illustrate use of the proposed estimators using patient-level kidney cancer data between the years 2004

and 2014 from the Ontario-wide provincial health care databases available through the Institute for Clinical

Evaluative Sciences. There is evidence that minimally invasive surgery can result in shorter length of stay

for patients undergoing radical nephrectomies (Semerjian et al., 2015; Bragayrac et al., 2016). We wish to

determine how much of the observed variation in length of stay between hospitals is attributable to the

rate at which each hospital performs minimally invasive surgery or due to other practices. This study was

approved by the University Health Network Research Ethics Board.

We considered a cohort of patients who underwent radical nephrectomy for T1 or T2 stage renal cell

carcinoma, were older than 18 years of age and were diagnosed after 2004. We identified 4451 patients who

met these criteria, distributed over 73 different hospitals in Ontario. The mean length of stay was 5.12 days

with a range of 0 to 170 days, while 56% of patients received minimally invasive surgery. In this patient

subgroup, we consider shorter length of stay to be an indicator for better quality of care, and thus SMRs

of less than one correspond to better than average performance. Variables used for case-mix adjustment

consisted of age, sex, income quintile, rural versus urban residence, year of diagnosis, Charlson comorbidity

score, Adjusted Clinical Group (ACG) score (Starfield et al., 1991), days from diagnosis to nephrectomy,

tumor size, and stage of disease. We fitted a linear model for the log(length of stay+1) outcome, and logistic

and multinomial logistic models for the mediator and hospital assignment, respectively, all adjusted for the

same case-mix factors. We computed the total, direct and indirect effect SMRs via both proposed estimation

approaches for each of the hospitals. We repeated this over 500 bootstrap resamples to obtain approximate

sampling distributions. To reduce variability in the semi-parametric estimators, we truncated the weights at

the 99th percentile.
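Weight truncation of the kind described above can be sketched as follows. The thesis does not state how the percentile was computed or exactly which weights were truncated, so the linear-interpolation percentile and the function names below are assumptions made purely for illustration.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def truncate_weights(weights, q=99.0):
    """Cap every weight at the q-th percentile of the weight distribution."""
    cap = percentile(weights, q)
    return [min(w, cap) for w in weights]
```

Truncation trades a small amount of bias for reduced variance: a few very large weights, typically arising from small fitted mediator probabilities in the denominator, no longer dominate the weighted sums.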

For the purposes of our illustration, we present the results for the 10 largest hospitals in terms of their

patient volume. Figure 6.5 presents the point estimates and bootstrap sampling distributions using both

estimation approaches for these hospitals in decreasing order of volume. We observed that while the point

estimates are comparable, generally the semi-parametric estimators are more variable than the model-based

estimators. In the leftmost panel, Institution 1 performs the best in terms of the TE standardized ratio of

observed versus expected (under average level of care) length of stay. The NIE standardized ratio of the

same institution is substantially smaller than one, demonstrating that higher rates of MIS in this institution

indeed do explain part of its good performance in terms of length of stay. However, the NDE standardized

ratio, accounting for all other practices of Institution 1, is still smaller than one, suggesting that there are

still other factors that contribute towards the short average length of stay of Institution 1. Both in terms

of minimally invasive surgery and other aspects of care, Institution 1 performs above average, and thus no

interventions need to be considered.

Institution 2 also has shorter than expected length of stay (TE < 1). However, the NIE for this provider

is greater than 1, suggesting that if Institution 2 could increase its minimally invasive surgery rate to the

provincial average level, it could further improve its average length of stay. Institution 7 might be one

targeted for intervention (e.g. increasing capacity to perform minimally invasive surgery) as its TE and NIE

on length of stay are both above one, the latter appreciably so. By increasing the minimally invasive surgery

rate to provincial average level, the estimated NDEs suggest that Institution 7 could reduce its length of

stay to at least provincial average level.

[Figure: three panels (Total Effect, Indirect Effect, Direct Effect), each with an SMR axis from 0.7 to 1.3; rows Inst 1 to Inst 10 with provider volumes 486, 221, 172, 167, 156, 147, 138, 116, 114 and 106; semi-parametric and model-based bootstrap distributions shown for each institution, with the original estimates marked.]

Figure 6.5: Total, natural indirect (mediated through minimally invasive surgery) and natural direct (not mediated through minimally invasive surgery) hospital effects on length of stay for the 10 largest Ontario hospitals, with distribution of 500 bootstrap resamples and whiskers corresponding to 95 percentile intervals. The standardized mortality ratios (SMRs) refer to the ratio of observed versus expected (under average level of care) length of stay for the patient case-mix of a given hospital.

6.7 Discussion

In general, the model-based estimators exhibit less sampling variability than the semi-parametric estimators,

but it comes at the expense of required specification of the outcome model (eq. (6.10)), as well as possible

provider–case-mix and provider–mediator interactions. The proposed semi-parametric estimators are thus an

appealing alternative as they do not require the use of an outcome model, or specification and estimation of

interaction terms. On the other hand, since they are not able to benefit from model extrapolation, they possibly

require more in terms of numbers of patients and events; for example, application of the semi-parametric

estimators requires that both levels of the mediator are observed for each provider. Both approaches can

be adapted for a continuous mediator; however, the weights in the semi-parametric approach may become

more unstable, as densities will need to be substituted for probabilities (Lange et al., 2012).

Sampling variability of both proposed estimators is seen to decrease as the volume of the providers

increases, and the ability to detect direct and indirect effects will depend on the volume. In addition,

for models that involve fitting provider effects, the presence of small providers may require application of

shrinkage through random effect models. A comparison of fixed and random effects models with indirect

standardization for provider profiling (Austin et al., 2003) has shown that fixed effects models involving

provider effects have a higher sensitivity at detecting outlying providers but lower specificity than random

effects models. Further, Normand et al. (1997) argue that shrinkage towards the mean may be justifiable if

the estimates for small providers are drastically different than the mean as such estimates would likely be

imprecise. However, as shrinkage in the outcome model would muffle the effect of the provider and mediator

on the outcome, i.e. the effect of interest, the semi-parametric estimators avoid the use of the outcome

model at the expense of larger sampling variability. Omitting small providers from the analysis would not be

desirable as the reference in indirect standardization would then no longer include all providers nationwide,

but preferably, these could be combined for the analysis.

The decomposition of the total effect of the provider on the outcome into direct and indirect effects

allows the measurement of the effect that a hypothetical intervention on the mediator may have on the

outcome, leading to a better understanding of aspects of care to prioritize improvement. We are currently

working on applying these methods to administrative data in the context of analyzing quality variations in

cancer care. As funding agencies and policy makers would likely base their recommendations on multiple

facets of care, a possible extension of the present work would be the inclusion of multiple either parallel or

serial mediators along the provider-outcome pathway. Although we have outlined our methods in the health

services research context, we note that they could be applied similarly in comparison of multiple geographical

regions, where causal models are also applicable (Moreno-Betancur et al., 2017). The present work could

also be extended to consideration of possible hierarchical exposure levels, such as surgeons within hospitals

within administrative subdivisions, by introducing further indexing with respect to the different levels of

comparison. The present notation applies for a single layer of hierarchy, such as either surgeons or hospitals,

or a cross-classification such as surgeon by hospital (relevant in case some surgeons operate in more than

one hospital); when estimating the marginal hospital effects, the possible surgeon effects will be absorbed

into these, as in our application.

6.8 Supplementary Digital Content

6.8.1 eAppendix 1. Assumptions

We assume counterfactual consistency of the outcome and the mediator, that is, when $Z = z$ and $M = m$ the counterfactual $Y_{zm}$ equals the observed $Y$, and similarly $M_z = M$ when $Z = z$. We also assume no unmeasured confounders for any of the causal pathways, namely (i) the mediator-outcome ($Y_{zm} \perp\!\!\!\perp M \mid Z, X$), (ii) the exposure-outcome ($Y_{zm} \perp\!\!\!\perp Z \mid X$), and (iii) the exposure-mediator relationship ($M_z \perp\!\!\!\perp Z \mid X$). Finally, we (iv) assume that $Y_{zm} \perp\!\!\!\perp M_{z^*} \mid Z, X$ for any combination of $z$, $z^*$ and $m$, which states that there are no confounders of the mediator-outcome relationship that are effects of the exposure. In addition to the above assumptions, we define $A$ so that it has similar conditional independence properties (a) $Y_{zm} \perp\!\!\!\perp A \mid Z, X$, (b) $M_z \perp\!\!\!\perp A \mid Z, X$, (c) $Y_{zm} \perp\!\!\!\perp M_{z^*} \mid A, Z, X$ and (d) $A \perp\!\!\!\perp Z \mid X$, so that we have the causal relationship presented in Figure 6.1.

6.8.2 eAppendix 2. Derivation of model-based estimators

Here, and in eAppendix 3 below, we make use of the notation and causal relations defined in the notation section,

as well as the assumptions listed above. Further, the asymptotic consistency of the estimators derived in

eAppendix 2 and 3 can be shown through Slutsky’s lemma and the continuous mapping theorem, assuming

that the parameter estimates of the models converge towards their true values. We present the derivations

of both estimators for a binary mediator and outcome, but note that the derivations proceed similarly for a

continuous outcome.

The derivation of equation (6.7) which becomes the numerator of both the total effect and indirect effect

(eqs. (6.2), (6.4)) is as follows:

$$
\begin{aligned}
E[Y_{zM_z} \mid Z = z] &= E_{M_z, X \mid Z}\left\{ E[Y_{zM_z} \mid Z = z, M_z, X] \right\} \\
&= \sum_{x} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid M_z = m, Z = z, X = x)\, P(M_z = m, X = x \mid Z = z) \\
&\overset{(iv)}{=} \sum_{x} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid Z = z, X = x)\, P(M_z = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&\overset{(ii)}{=} \sum_{x} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid M = m, Z = z, X = x)\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{m=0}^{1} P(Y = 1 \mid M = m, Z = z, X = x)\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{m=0}^{1} E[Y \mid M = m, Z = z, X = x]\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z)
\end{aligned}
$$

where we have employed the standard causal assumptions (ii) and (iv), as well as consistency of both potential

outcome and mediator. Now we present the derivation of equation (6.8) which serves as the numerator of

the direct effect (eq. (6.3)) and the denominator of the indirect effect (eq. (6.4)):

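$$
\begin{aligned}
E[Y_{zM_A} \mid Z = z] &= E_{M_A, A, X \mid Z}\left\{ E[Y_{zM_A} \mid Z = z, M_A, A, X] \right\} \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid Z = z, M_a = m, A = a, X = x) \\
&\qquad \times P(M_a = m, A = a, X = x \mid Z = z) \\
&\overset{(iv)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid Z = z, A = a, X = x)\, P(M_a = m \mid A = a, X = x, Z = z) \\
&\qquad \times P(A = a \mid Z = z, X = x)\, P(X = x \mid Z = z) \\
&\overset{(a),(b),(d)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid Z = z, X = x)\, P(M_a = m \mid X = x, Z = z) \\
&\qquad \times P(A = a \mid X = x)\, P(X = x \mid Z = z) \\
&\overset{(i),(iii)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{zm} = 1 \mid M = m, Z = z, X = x)\, P(M_a = m \mid X = x, Z = a) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y = 1 \mid M = m, Z = z, X = x)\, P(M = m \mid X = x, Z = a) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z)
\end{aligned}
$$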

where again the standard causal assumptions (i), (iii) and (iv) as well as consistency of potential outcome

and mediator, and the causal relationships (a), (b), and (d) defined for the A notation are used. We also

employ the target assignment regime P (A = a | X = x) = P (Z = a | X = x) in the derivation. Finally,

we can derive equation (6.9) which becomes the denominator of both the total effect and the direct effect

(eqs. (6.2), (6.3)), using the causal relationships (a) - (d) imposed in the definition of A, standard causal

assumptions (i) and (iii) along with consistency of potential outcome and mediator, as well as the target

assignment regime specified above:

$$
\begin{aligned}
E[Y_{AM_A} \mid Z = z] &= E_{M_A, A, X \mid Z}\left\{ E[Y_{AM_A} \mid Z = z, M_A, A, X] \right\} \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{am} = 1 \mid Z = z, M_a = m, A = a, X = x) \\
&\qquad \times P(M_a = m, A = a, X = x \mid Z = z) \\
&\overset{(c)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{am} = 1 \mid A = a, Z = z, X = x)\, P(M_a = m \mid A = a, Z = z, X = x) \\
&\qquad \times P(A = a \mid Z = z, X = x)\, P(X = x \mid Z = z) \\
&\overset{(a),(b),(d)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{am} = 1 \mid Z = z, X = x)\, P(M_a = m \mid Z = z, X = x) \\
&\qquad \times P(A = a \mid X = x)\, P(X = x \mid Z = z) \\
&\overset{(i),(iii)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y_{am} = 1 \mid M = m, Z = a, X = x)\, P(M_a = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} P(Y = 1 \mid M = m, Z = a, X = x)\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z)
\end{aligned}
$$

6.8.3 eAppendix 3. Derivation of semi-parametric estimators

First we consider the derivation of equation (6.16), which becomes the numerator of the total effect (eq.

(6.2)) and the indirect effect (eq. (6.4)):

$$
\begin{aligned}
E[Y_{zM_z} \mid Z = z] &= E_{X \mid Z}\left\{ E_{M_z \mid X, Z}\left( E[Y_{zM_z} \mid Z = z, X, M_z] \right) \right\} \\
&= \sum_{x} \sum_{m=0}^{1} E[Y_{zm} \mid M_z = m, Z = z, X = x]\, P(M_z = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&\overset{(iv)}{=} \sum_{x} \sum_{m=0}^{1} E[Y_{zm} \mid Z = z, X = x]\, P(M_z = m \mid Z = z, X = x)\, P(X = x \mid Z = z) \\
&\overset{(i)}{=} \sum_{x} \sum_{m=0}^{1} E[Y_{zm} \mid M = m, Z = z, X = x]\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{m=0}^{1} E[Y \mid M = m, Z = z, X = x]\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{y=0}^{1} \sum_{x} \sum_{m=0}^{1} y\, P(Y = y \mid M = m, Z = z, X = x)\, P(M = m \mid Z = z, X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{y=0}^{1} \sum_{x} \sum_{m=0}^{1} y\, P(Y = y \mid M = m, Z = z, X = x)\, P(M = m \mid Z = z, X = x) \\
&\qquad \times \frac{P(Z = z \mid X = x)\, P(X = x)}{P(Z = z)} \\
&= \sum_{y=0}^{1} \sum_{z} \sum_{x} \sum_{m=0}^{1} y\, 1_{\{Z = z\}}\, \frac{1}{P(Z = z)}\, P(Y = y, M = m, Z = z, X = x) \\
&= \sum_{y=0}^{1} \sum_{m=0}^{1} \sum_{x} y\, P(Y = y, M = m, X = x \mid Z = z)
\end{aligned}
$$

where we have used assumptions (i) and (iv) as well as consistency of the potential outcome and mediator.

In a similar way, we can derive the result in equation (6.17), which becomes the numerator of the direct

effect (eq. (6.3)) and the denominator of the indirect effect (eq. (6.4)) as

$$
\begin{aligned}
E[Y_{zM_A} \mid Z = z] &= E_{X \mid Z}\left\{ E_{A \mid Z, X}\left( E_{M_A \mid Z, A, X}\left\{ E[Y_{zM_A} \mid Z = z, M_A, A, X] \right\} \right) \right\} \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{zm} \mid Z = z, A = a, M_a = m, X = x]\, P(M_a = m \mid Z = z, A = a, X = x) \\
&\qquad \times P(A = a \mid Z = z, X = x)\, P(X = x \mid Z = z) \\
&\overset{(b)-(d)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{zm} \mid Z = z, A = a, X = x]\, P(M_a = m \mid Z = z, X = x) \\
&\qquad \times P(A = a \mid X = x)\, P(X = x \mid Z = z) \\
&\overset{(a),(iii)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{zm} \mid Z = z, X = x]\, P(M_a = m \mid Z = a, X = x)\, P(Z = a \mid X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&\overset{(i)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{zm} \mid M = m, Z = z, X = x]\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y \mid M = m, Z = z, X = x]\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z) \\
&= \sum_{y=0}^{1} \sum_{x} \sum_{a} \sum_{m=0}^{1} \sum_{z} 1_{\{Z = z\}}\, y\, P(Y = y \mid M = m, Z = z, X = x)\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, \frac{P(Z = z \mid X = x)\, P(X = x)}{P(Z = z)} \\
&= \sum_{y=0}^{1} \sum_{x} \sum_{a} \sum_{m=0}^{1} \sum_{z} \left[ 1_{\{Z = z\}}\, y\, \frac{P(M = m \mid Z = a, X = x)}{P(M = m \mid Z = z, X = x)}\, \frac{P(Z = a \mid X = x)}{P(Z = z)} \right] \\
&\qquad \times P(Y = y, M = m, Z = z, X = x) \\
&= \sum_{y=0}^{1} \sum_{m=0}^{1} \sum_{a} \sum_{x} \left[ y\, \frac{P(M = m \mid Z = a, X = x)}{P(M = m \mid Z = z, X = x)}\, P(Z = a \mid X = x) \right] \\
&\qquad \times P(Y = y, M = m, X = x \mid Z = z)
\end{aligned}
$$

where we have used the causal relations for defining the A notation (i.e. (a)-(d)), as well as the standard

causal assumption (i), (iii) and consistency of potential outcome and mediator. This derivation also uses the

target provider assignment regime P (A = a | X = x) = P (Z = a | X = x), that is providers are weighted

by their actual size.

Finally, we can derive the result in equation (6.18) which serves as the denominator of both the total

effect (eq. (6.2)) and the direct effect (eq. (6.3)), shown below. Once again, the standard causal assumptions

(i)-(iii) are used, while the causal relations (a)-(d) created to define the A notation are also used, as well as

the target provider assignment regime as above and consistency of potential outcome and mediator.

$$
\begin{aligned}
E[Y_{AM_A} \mid Z = z] &= E_{X \mid Z}\left\{ E_{A \mid X, Z}\left( E_{M_A \mid A, X, Z}\left\{ E[Y_{AM_A} \mid Z = z, M_A, A, X] \right\} \right) \right\} \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{am} \mid M_a = m, A = a, Z = z, X = x]\, P(M_a = m \mid A = a, Z = z, X = x) \\
&\qquad \times P(A = a \mid Z = z, X = x)\, P(X = x \mid Z = z) \\
&\overset{(b)-(d)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{am} \mid A = a, Z = z, X = x]\, P(M_a = m \mid Z = z, X = x) \\
&\qquad \times P(A = a \mid X = x)\, P(X = x \mid Z = z) \\
&\overset{(a),(iii)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{am} \mid Z = z, X = x]\, P(M_a = m \mid Z = a, X = x)\, P(Z = a \mid X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&\overset{(i),(ii)}{=} \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y_{am} \mid M = m, Z = a, X = x]\, P(M = m \mid Z = a, X = x)\, P(Z = a \mid X = x) \\
&\qquad \times P(X = x \mid Z = z) \\
&= \sum_{x} \sum_{a} \sum_{m=0}^{1} E[Y \mid M = m, Z = a, X = x]\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, P(X = x \mid Z = z) \\
&= \sum_{y=0}^{1} \sum_{a} \sum_{m=0}^{1} \sum_{x} y\, P(Y = y \mid M = m, Z = a, X = x)\, P(M = m \mid Z = a, X = x) \\
&\qquad \times P(Z = a \mid X = x)\, \frac{P(Z = z \mid X = x)\, P(X = x)}{P(Z = z)} \\
&= \sum_{y=0}^{1} \sum_{a} \sum_{m=0}^{1} \sum_{x} y\, \frac{P(Z = z \mid X = x)}{P(Z = z)}\, P(Y = y, M = m, Z = a, X = x)
\end{aligned}
$$

6.8.4 eAppendix 4. Additional simulation details and results

Data generation

Each patient has two covariates, both associated with provider assignment, mediator, and outcome, and distributed as $X_{1i} \sim \mathrm{Bernoulli}(0.5)$ and $X_{2i} \sim N(0, 1)$. We also generate random variables $U_{1i}, U_{2i} \overset{\text{i.i.d.}}{\sim} N(0, 1)$ to represent the correlated nature of potential outcome and mediator values for each patient (see Figure 6.2). The observed provider assignment is generated as $Z_i \sim \mathrm{Multinomial}(\pi_{1i}, \ldots, \pi_{5i})$ for each patient, where $\pi_{zi} = \mathrm{expit}(\gamma^*_{0z} + \gamma^*_{1z} X_{1i} + \gamma^*_{2z} X_{2i})$ for $z = 2, \ldots, 5$ and $\pi_{1i} = 1 - \sum_{z=2}^{5} \pi_{zi}$. Here $\gamma^*_1 = (\gamma^*_{12}, \ldots, \gamma^*_{15})$ and $\gamma^*_2 = (\gamma^*_{22}, \ldots, \gamma^*_{25})$ determine how the provider assignment depends on the patient characteristics and $\gamma^*_0 = (\gamma^*_{02}, \ldots, \gamma^*_{05})$ dictates the volume of the providers. We let $\gamma^*_1 = (-1.0, -0.5, 0.5, 1.0)$ and $\gamma^*_2 = (0.0, 0.0, 0.5, 1.0)$ for all simulations, while for provider volume we let $\gamma^*_0 = (2, 2, 2, 2)$, which results in mean provider volumes of 30, 156, 180, 257 and 380 for providers 1 to 5, respectively, across all simulations.

The binary potential mediators are generated using a latent variable method such that $M_{zi} = 1_{\{\mu_{zi} + r_i \geq 0\}}$, where $\mu_{zi} = \alpha^*_{0z} + \alpha^*_1 X_{1i} + \alpha^*_2 X_{2i} + \alpha^*_3 U_{1i}$ is the linear predictor (on the logit scale) for $m = 1$ and $r_i \sim \mathrm{Logistic}(0, 1)$ is a random error term. Here $\alpha^*_0 = (\alpha^*_{01}, \ldots, \alpha^*_{05})$ corresponds to the effect of each provider on the value of the mediator and depends on the scenario being considered, while we let $(\alpha^*_1, \alpha^*_2, \alpha^*_3) = (0.75, 0.5, 1.0)$ determine how the mediator depends on the patient characteristics for all simulations. The observed mediator for each patient corresponds to their observed provider assignment.
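The latent-variable construction of the potential mediators can be sketched in Python as below. This is an illustrative reading of the description above rather than the thesis's R code: the function names are hypothetical, µ is treated as a linear predictor on the logit scale, and a single logistic error is shared across all providers for a given patient, as the subscript on r_i suggests.

```python
import math
import random

def logistic_draw(rng):
    """Draw from Logistic(0, 1) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:          # avoid log(0)
        u = rng.random()
    return math.log(u / (1.0 - u))

def draw_potential_mediators(x1, x2, u1, alpha0, alpha=(0.75, 0.5, 1.0), rng=random):
    """Return [M_1i, ..., M_pi] for one patient.

    alpha0 holds the provider-specific intercepts; alpha holds the
    covariate and latent-variable coefficients from the simulation setup.
    """
    r = logistic_draw(rng)   # one error term per patient, shared across providers
    shared = alpha[0] * x1 + alpha[1] * x2 + alpha[2] * u1
    return [1 if a0 + shared + r >= 0 else 0 for a0 in alpha0]
```

Under this construction P(M_zi = 1) given the covariates and U1 equals expit(µ_zi), and because the error is shared, a patient's potential mediator values are monotone in the provider intercepts.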

Similarly, the binary potential outcomes are also generated according to a latent variable method where the linear predictor for the outcome is computed as

$$
\mu_{zmi} = \phi^*_{0zm} + \phi^*_1 X_{1i} + \phi^*_2 X_{2i} + \phi^*_3 U_{2i}
$$

for $z = 1, \ldots, 5$ and $m = 0, 1$, and the random error is again generated as $r_i \sim \mathrm{Logistic}(0, 1)$. Then the potential outcomes are generated as $Y_{zmi} = 1_{\{\mu_{zmi} + r_i \geq 0\}}$ for each $m = 0, 1$. Again, we let $(\phi^*_1, \phi^*_2, \phi^*_3) = (1.5, 0.75, 1.0)$ determine how the outcome depends on patient characteristics. Meanwhile, $\phi^*_{00} = (\phi^*_{010}, \ldots, \phi^*_{050})$ and $\phi^*_{01} = (\phi^*_{011}, \ldots, \phi^*_{051})$ denote the effect of the provider assignment on the outcome for $m = 0, 1$, respectively, and again the values depend on the simulation scenario being considered. The observed outcome corresponds to the observed provider assignment and mediator value for each patient.

Simulation scenarios

To generate the existence of both a natural direct and indirect effect for the demonstration of the to-

tal effect decomposition (Figure 6.3), we let α∗0 = (−1, 1, 0,−1, 1) to create a provider-mediator effect,

φ∗00 = (1,−1, 0, 1,−1) and φ∗01 = φ∗00 + 1 to create a provider-outcome and mediator-outcome effect simul-

taneously. We thus simulate 100 datasets as above and estimate the TE, NDE and NIE using equations

(6.13) - (6.15) for the model-based estimators and equations (6.19) - (6.21) for the semi-parametric estimators.

To compare the performance of the model-based and the semi-parametric estimators in the previous

sections under the null (either NDE = 1 or NIE = 1), we consider three further scenarios.

1. We consider the case where no direct provider-outcome effect exists (NDE = 1 for all z), but there

exists both a provider-mediator and mediator-outcome effect. We thus let α∗0 = (−1, 1, 0,−1, 1),

φ∗00 = (0, 0, 0, 0, 0) and φ∗01 = φ∗00 + 1.

2. Next, we consider the presence of a provider-outcome effect but no provider-mediator effect exists (NIE

= 1 for all z). We now let α∗0 = (0, 0, 0, 0, 0), φ∗00 = (1,−1, 0, 1,−1) and φ∗01 = φ∗00 + 1.

3. Finally, we consider an alternative way for setting the NIE = 1, in which there is a provider-mediator

effect but no mediator-outcome relationship. To this end, we let α∗0 = (−1, 1, 0,−1, 1) and φ∗00 = φ∗01 =

(1,−1, 0, 1,−1).

Under each of these scenarios, we again simulate 100 datasets according to above and estimate the TE, NDE

and NIE using both sets of estimators. Scenario 1 is presented in the main text (Figure 6.4) while scenarios

2 and 3 are presented here.
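The four parameter configurations above can be collected in one place before a simulation routine is called. The sketch below is illustrative only: the helper `make_scenario` and the list `scenarios` are hypothetical names, and the vectors simply transcribe the values stated in the text, with φ∗01 obtained by shifting φ∗00 by 1 whenever a mediator-outcome effect is present and by 0 when it is absent.

```r
# Hypothetical helper: alpha0 holds the provider effects on the mediator,
# phi00 / phi01 the provider effects on the outcome when the mediator is 0 / 1.
make_scenario <- function(alpha0, phi00, mediator_shift) {
  list(alpha0 = alpha0, phi00 = phi00, phi01 = phi00 + mediator_shift)
}

scenarios <- list(
  # both a direct and an indirect effect (total effect decomposition)
  both_effects = make_scenario(c(-1, 1, 0, -1, 1), c(1, -1, 0, 1, -1), 1),
  # scenario 1: no provider-outcome effect, NDE = 1
  null_nde     = make_scenario(c(-1, 1, 0, -1, 1), rep(0, 5), 1),
  # scenario 2: no provider-mediator effect, NIE = 1
  null_nie_a   = make_scenario(rep(0, 5), c(1, -1, 0, 1, -1), 1),
  # scenario 3: no mediator-outcome effect, NIE = 1
  null_nie_b   = make_scenario(c(-1, 1, 0, -1, 1), c(1, -1, 0, 1, -1), 0)
)

str(scenarios$null_nde)
```

In the sample R code of eAppendix 5, the vectors `ma`, `ya0` and `ya1` appear to play the roles of `alpha0`, `phi00` and `phi01` respectively.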

Additional simulation results

The results of scenario 2, in which the NIE is set to 1 for each provider by removing the provider effect on

the mediator, are presented in Figure S6.1. Once again, as provider size increases, the sampling variability

decreases. Comparing Figure S6.1 to Figure 6.4, we see that in general the estimation of the NIE when it is

set to 1 is much more variable than when it is allowed to vary between providers. Both estimators perform

similarly when estimating the NDE and TE, while the model-based estimator is slightly less variable than

the semi-parametric when estimating the NIE.

Figure S6.2 presents similar results to Figure S6.1; however, the NIE is now set to 1 by removing the

mediator effect on the outcome while retaining the provider effect on the mediator. In this case, the difference

in sampling variability between the semi-parametric and model-based estimators for the NIE is much larger,

yet this difference decreases as provider volume increases. Both estimators have similar sampling

variability for the NDE and TE. In particular, the TE shows no extra variability compared to the NDE in

any of the simulation scenarios, despite the large sampling variability of the NIE.


[Figure: panels for Total Effect, Indirect Effect and Direct Effect; rows for Providers 1 to 5, each with model-based and semi-parametric estimators; true values marked.]

Figure S6.1: Total, indirect and direct effect sampling distributions of proposed estimators when a direct effect exists, but no indirect effect exists via the provider-mediator pathway.

[Figure: panels for Total Effect, Indirect Effect and Direct Effect; rows for Providers 1 to 5, each with model-based and semi-parametric estimators; true values marked.]

Figure S6.2: Total, indirect and direct effect sampling distributions of proposed estimators when a direct effect exists, but no indirect effect exists via the mediator-outcome pathway.

6.8.5 eAppendix 5. Sample R code for simulations

library(mlogit)
library(nnet)

library(tableone)
library(survey)

expit <- function(x) {1/(1+exp(-x))}
logit <- function(p) {log(p)-log(1-p)}

# simulation parameters
za <- c(2,2,2,2) # hospital size
zb <- c(-1.0, -0.5, 0.5, 1.0) # how hospital treats patient
zc <- c(0.0, 0.0, 0.5, 1.0)

# dictates difference in effect of mediator across hospitals (0's => SMR = 1)
ma <- c(-1,1,0,-1,1)
mb1 <- 0.75


mb2 <- 0.5
mc <- 1.0

# dictates true SMR
ya0 <- c(1,-1,0,1,-1)
ya1 <- ya0   # equal to ya0: no mediator-outcome effect (scenario 3 in the text)
yb1 <- 1.5
yb2 <- 0.75
yc <- 1.0

nhosp <- 5
nobs <- 1000
nmed <- 2


## function that simulates decomposed quantities
decomp <- function(nrun, nobs, nhosp, nmed, za, zb, zc, ma, mb1, mb2,
                   mc, ya0, ya1, yb1, yb2, yc){
  set.seed(1)

  # initialize results storage
  true_TEnum <- matrix(NA, nrow=nrun, ncol=nhosp)
  true_TEdenom <- matrix(NA, nrow=nrun, ncol=nhosp)
  true_NDEnum <- matrix(NA, nrow=nrun, ncol=nhosp)

  stand_TEnum <- matrix(NA, nrow=nrun, ncol=nhosp)
  stand_TEdenom <- matrix(NA, nrow=nrun, ncol=nhosp)
  stand_NDEnum <- matrix(NA, nrow=nrun, ncol=nhosp)

  est_TEnum <- matrix(NA, nrow=nrun, ncol=nhosp)
  est_TEdenom <- matrix(NA, nrow=nrun, ncol=nhosp)
  est_NDEnum <- matrix(NA, nrow=nrun, ncol=nhosp)

  n <- matrix(NA, nrow=nrun, ncol=nhosp)

  for(iter in 1:nrun){
    # generate covariates

    x1 <- rbinom(nobs, 1, 0.5)
    x2 <- rnorm(nobs)
    u1 <- rnorm(nobs)
    u2 <- rnorm(nobs)

    # generate hospital assignment
    pz <- exp(matrix(za, nobs, nhosp-1, byrow=TRUE)
              + matrix(zb, nobs, nhosp-1, byrow=TRUE) * matrix(x1, nobs, nhosp-1, byrow=FALSE)
              + matrix(zc, nobs, nhosp-1, byrow=TRUE) * matrix(x2, nobs, nhosp-1, byrow=FALSE))
    pz <- cbind(1/(1+rowSums(pz)), pz/(1+matrix(rowSums(pz), nobs, nhosp-1, byrow=FALSE)))

    # generate potential mediator
    # (trailing '+' keeps each sum a single expression; a leading '+' on a new
    #  line would start a separate statement and silently drop the term)
    em <- matrix(ma, nobs, nhosp, byrow=TRUE) +
      mb1 * matrix(x1, nobs, nhosp, byrow=FALSE) +
      mb2 * matrix(x2, nobs, nhosp, byrow=FALSE) +
      mc * matrix(u1, nobs, nhosp, byrow=FALSE)
    pm <- expit(em)
    mres <- matrix(rlogis(nobs, location=0.0), nobs, nhosp, byrow=FALSE)
    mall <- mres + em >= 0.0

    # generate potential outcomes
    my0 <- matrix(ya0, nobs, nhosp, byrow=TRUE) +
      yb1 * matrix(x1, nobs, nhosp, byrow=FALSE) +
      yb2 * matrix(x2, nobs, nhosp, byrow=FALSE) +
      yc * matrix(u2, nobs, nhosp, byrow=FALSE)
    py0 <- expit(my0)
    my1 <- matrix(ya1, nobs, nhosp, byrow=TRUE) +
      yb1 * matrix(x1, nobs, nhosp, byrow=FALSE) +
      yb2 * matrix(x2, nobs, nhosp, byrow=FALSE) +
      yc * matrix(u2, nobs, nhosp, byrow=FALSE)
    py1 <- expit(my1)
    yres <- matrix(rlogis(nobs, location=0.0), nobs, nhosp, byrow=FALSE)
    yall.m0 <- yres + my0 >= 0.0
    yall.m1 <- yres + my1 >= 0.0

    # a list of all potential outcomes, indexed by m.
    yall <- list(yall.m0 = yall.m0, yall.m1 = yall.m1)

    # create observed dataset


    z <- rep(NA, nobs) # observed hospital assignment
    for(i in 1:nobs){
      z[i] <- sample(1:nhosp, 1, prob=pz[i,])
    }
    n[iter,] <- as.numeric(table(z))

    m <- mall[cbind(1:nobs, z)] # observed mediator value

    y <- rep(NA, nobs)
    for(i in 1:nobs){
      y[i] <- yall[[m[i]+1]][i,z[i]]
    }

    ## observed dataset
    obs.dat <- data.frame(outcome = y, site = as.factor(z), mediator = m, x1 = x1, x2 = x2)

    # storage sized by nobs rather than a hard-coded 1000
    m.pred <- z.pred <- z.pred0 <- matrix(NA, nrow=nobs, ncol=nhosp)
    y.pred0all <- y.pred1all <- matrix(NA, nrow=nobs, ncol=nhosp)

    # prediction models
    out <- glm(outcome ~ as.factor(mediator) + as.factor(site) + as.factor(x1) + x2,
               data=obs.dat, family = binomial(link=logit))
    med <- glm(mediator ~ as.factor(site) + as.factor(x1) + x2, data=obs.dat,
               family = binomial(link=logit))
    hos <- multinom(site ~ as.factor(x1) + x2, data = obs.dat)
    hos0 <- multinom(site ~ 1, data = obs.dat)

    # compute predicted values within a run for all patients and hospitals
    z.pred <- fitted(hos, type='response')
    z.pred0 <- fitted(hos0, type='response')

    sites <- as.numeric(sort(unique(obs.dat$site)))
    for(i in 1:nhosp){
      ind <- which(obs.dat$site == sites[i])

      # with potential mediator
      true_TEnum[iter, i] <- sum(py0[ind, sites[i]]*(1 - pm[ind, sites[i]])
                                 + py1[ind, sites[i]]*pm[ind, sites[i]]) # E[Y_zMz | Z = z]
      true_TEdenom[iter, i] <- sum(rowSums((py0[ind, ] * (1 - pm[ind, ])
                                            + py1[ind, ] * pm[ind, ]) * pz[ind, ])) # E[Y_AMA | Z = z]

      # numerator of NDE and denominator of NIE
      true_NDEnum[iter, i] <- sum(rowSums((py0[ind, sites[i]] * (1 - pm[ind,])
                                           + py1[ind, sites[i]] * pm[ind, ]) * pz[ind, ])) # E[Y_zMA | Z = z]

      for(j in 1:nhosp){
        new <- data.frame(outcome = obs.dat$outcome, mediator = obs.dat$mediator,
                          x1 = obs.dat$x1, x2 = obs.dat$x2, site = sites[j])

        new0all <- data.frame(outcome = obs.dat$outcome, mediator = FALSE,
                              x1 = obs.dat$x1, x2 = obs.dat$x2, site = sites[j])
        new1all <- data.frame(outcome = obs.dat$outcome, mediator = TRUE,
                              x1 = obs.dat$x1, x2 = obs.dat$x2, site = sites[j])

        m.pred[,j] <- predict(med, newdata = new, type='response')
        y.pred0all[,j] <- predict(out, newdata = new0all, type='response')
        y.pred1all[,j] <- predict(out, newdata = new1all, type='response')
      }

      # E[Y_{zMz}|Z=z]
      stand_TEnum[iter,i] <- sum(y.pred0all[ind, sites[i]] * (1-m.pred[ind, sites[i]])
                                 + y.pred1all[ind,sites[i]]*m.pred[ind,sites[i]])
      est_TEnum[iter, i] <- sum(obs.dat$outcome[ind]/z.pred0[ind, sites[i]])

      # E[Y_{AM_A} | Z = z]
      stand_TEdenom[iter, i] <- sum(rowSums((y.pred0all[ind, ] * (1-m.pred[ind, ])
                                             + y.pred1all[ind, ]*m.pred[ind,]) * z.pred[ind,]))
      est_TEdenom[iter, i] <- sum(obs.dat$outcome * (z.pred[,sites[i]]/z.pred0[,sites[i]]))

      # E[Y_{zM_A} | Z = z]
      stand_NDEnum[iter,i] <- sum(rowSums((y.pred0all[ind,sites[i]]*(1-m.pred[ind,])
                                           + y.pred1all[ind,sites[i]]*m.pred[ind,]) * z.pred[ind,]))
      est_NDEnum[iter, i] <- sum(rowSums(obs.dat$outcome[ind]
          * (z.pred[ind,]/z.pred0[ind, sites[i]])
          * (ifelse(matrix(obs.dat$mediator[ind]==1, length(ind), nhosp), m.pred[ind,],
                    1-m.pred[ind,])/ifelse(obs.dat$mediator[ind]==1,
                    m.pred[ind,sites[i]], 1-m.pred[ind,sites[i]]))))


    }
  }
  return(list(true_TEnum = true_TEnum,
              true_TEdenom = true_TEdenom,
              true_NDEnum = true_NDEnum,
              stand_TEnum = stand_TEnum,
              stand_TEdenom = stand_TEdenom,
              stand_NDEnum = stand_NDEnum,
              est_TEnum = est_TEnum,
              est_TEdenom = est_TEdenom,
              est_NDEnum = est_NDEnum,
              n = n))
}
res <- decomp(100, nobs, nhosp, nmed, za, zb, zc, ma, mb1, mb2, mc, ya0, ya1, yb1, yb2, yc)

trueTE <- res$true_TEnum/res$true_TEdenom
trueNDE <- res$true_NDEnum/res$true_TEdenom
trueNIE <- res$true_TEnum/res$true_NDEnum

standTE <- res$stand_TEnum/res$stand_TEdenom
standNIE <- res$stand_TEnum/res$stand_NDEnum
standNDE <- res$stand_NDEnum/res$stand_TEdenom

estTE <- res$est_TEnum/res$est_TEdenom
estNIE <- res$est_TEnum/res$est_NDEnum
estNDE <- res$est_NDEnum/res$est_TEdenom


Chapter 7

Using Causal Mediation Analysis to Target Minimally Invasive Surgery Rates to Improve Length of Stay after Surgical Treatment of Kidney Cancer

7.1 Abstract

Process measures (e.g. procedures) are preferable to patient outcomes for targeting hospital quality im-

provement as interventions to improve care are more definable; yet the end goal is improving outcomes.

Causal mediation analysis allows decomposition of the total hospital effect on outcomes into indirect effect

acting through a specific process and direct effect comprising all other pathways. The effect of a hypothetical

intervention on the process can then be quantified and interventions targeted where greatest improvement in

patient outcomes may occur. We present results of a mediation analysis assessing the impact of minimally

invasive (MIS) vs. open surgery on length of hospital stay in surgical treatment of kidney cancer patients

in Ontario. The intervention considered is to bring MIS proportion to the provincial average. We discuss

implementation of the methods in presence of low volume hospitals and compare approaches for estimating

the variability of the effect decomposition.

7.2 Introduction

Benchmarking hospital patient care in kidney cancer has received much attention in the literature (Gore

et al., 2012; Patzer and Pastan, 2013; Wallis et al., 2016; Lawson et al., 2017b) due in part to increasing

availability of administrative and population level databases. Such studies aim to identify variation in the

care provided and to classify hospitals as superior or poor care providers, relative to some benchmark, for

the purpose of quality improvement initiatives. Indicators of disease-specific quality can be measures of

structural, process or outcome elements of care (Donabedian, 1988). While outcome measures are appealing

as they are considered the bottom line for patients and hospitals alike (Birkmeyer et al., 2004), process mea-

sures are the most natural choice of indicator as they represent what is actually being done for the patient


and thus represent clear areas on which to intervene (Lilford et al., 2004). Regardless of the choice of quality

indicator (QI), adjustment for differences in patient characteristics, termed case-mix, between hospitals is

needed to allow the QI to solely reflect variations in care, rather than differences in patient populations

(Shahian and Normand, 2008). Adjustment is commonly made using the indirect standardization method,

often resulting in an observed-to-expected ratio called the standardized mortality ratio (SMR), which allows

the care of each hospital to be assessed on its own patient population. The most common reference level of

care used to benchmark and standardize hospitals under indirect standardization is the average national or

provincial level of care. This comparison is relevant for policy makers who must allocate limited funds or

resources across a province or country to improve patient care. To this end, comparing to the average care

in a system allows identification of the hospitals most in need of quality improvement measures.

Identifying outlier hospitals is the first step to initiating quality of care improvement measures. However,

further information about the potential benefits to patient outcomes following some intervention to improve

care would allow policy makers to prioritize hospitals targeted for care interventions. In order to make such

causal conclusions, a causal relationship between the process being used as an indicator and the patient

outcome of interest must have been established. Then by adopting causal mediation analysis methods, de-

tailed in Daignault et al. (2019), it is possible to decompose the total hospital effect on patient outcomes

into the effect that can be attributed to the process (i.e. the natural indirect effect) and the effect that does

not involve the process (i.e. the natural direct effect). By doing so, it is possible to assess the effect that a

hypothetical intervention on the process would have on certain patient outcomes. The hypothetical inter-

vention in Daignault et al. (2019) is to intervene on the process to bring it to the average level nationwide

or province-wide. This type of analysis would allow policy makers to target their interventions on hospitals

that would see the largest benefit to patient outcomes.

This paper will investigate the effect that a hospital has on patient length of stay (LOS) after surgery for

kidney cancer patients that may or may not be attributed to the effect of minimally invasive nephrectomy

surgery (MIS) in Ontario hospitals. There has been substantial research into establishing a relationship

between MIS for nephrectomies and patient LOS for kidney cancer patients (Tan et al., 2014; Tarin et al.,

2014; Semerjian et al., 2015; Bragayrac et al., 2016; Pereira et al., 2018). The hypothetical intervention

in this case would be to raise the rate of minimally invasive surgery being performed at a hospital to the

Ontario provincial average rate. Section 7.3 outlines the mediation analysis approach proposed by Daignault

et al. (2019) for decomposing the indirectly standardized SMR into a direct and indirect (mediated) effect,

as well as the models used to estimate these effects. The data used for this analysis, made available through

the Institute for Clinical Evaluative Sciences, will be discussed here, as well as the criteria for defining the

patient cohort. Section 7.4 presents the results of the mediation analysis of MIS on LOS. We also present

a comparison of the performance of different computational methods for estimating the hospital assignment

probabilities as well as for estimating confidence intervals for the total, direct and indirect effects. We end

with a short discussion in Section 7.5.


7.3 Materials and Methods

7.3.1 Data and Study Cohort

The data used for this analysis are from a number of prospectively collected patient-level linked databases

available through the Institute for Clinical Evaluative Sciences in Ontario, including the Discharge Abstract

Database (DAD), the Ontario Cancer Registry (OCR), and the Ontario Health Insurance Plan (OHIP)

database. These databases contain information on cancer diagnosis, pathology records, disease progression,

patient demographic characteristics, hospital billing information and treatment information. The flowchart

in Figure 7.1 details the procedure to obtain a cohort of patients who underwent a nephrectomy for kidney

cancer treatment, including the number of patients identified at each step.

Beginning with the DAD, we first removed patients who were hospitalized for procedures or interventions

that did not correspond to either a partial or radical nephrectomy, then further removed those patients who

received both surgeries in the same hospitalization. By linking these patients with the cancer registry (OCR),

we restricted these nephrectomy hospitalizations to patients with a kidney cancer diagnosis, according to

the International Classification of Diseases (ICD-9). The goal of the analysis was to assess the impact of

minimally invasive nephrectomy surgery on length of stay for primary treatment of kidney cancer. Therefore,

we further restricted our general cohort to first nephrectomies dated after or no more than 90 days prior to

the date of kidney cancer diagnosis.

Patient pathology records included information on disease progression including tumour size, stage, grade,

and histology. We removed any records that were identical duplicates with respect to patient identifier, report

date, procedure type and the above disease progression variables. We further removed remaining duplicate

records with identical dates but no observed conflicts in tumour size, stage, grade and histology. As tumour

size was needed for proper case-mix adjustment, records with missing tumour size were dropped. The re-

maining pathology reports were linked by patient identifier to the cohort of kidney cancer patients with first

nephrectomies occurring no more than 90 days prior to the diagnosis date, resulting in the cohort labelled

Cohort 1 in Figure 7.1. To ensure that each nephrectomy was linked to only a single pathology

record, we kept a match when there was exactly one surgery and one report; otherwise we took the closest report

dated within 30 days of the procedure date, requiring no conflicting data when multiple reports shared the

same date. Cohort 2 further excludes patients from Cohort 1 with missing tumour stage.

Finally, we restrict the OHIP billing data by codes related to the performance of nephrectomies. We then

remove entries with anesthesia and assistant related billings, so that we retain the billings related only to the

procedure itself. Then we remove any billings that occur on the same day that are not related to a diagnosis

of kidney cancer, followed by those that indicate both a radical and partial nephrectomy were performed

on the same day. Finally, entries with multiple billings on the same day were combined prior to merging

with Cohort 2. Merging with cohort 2 is done by matching first nephrectomies with the closest OHIP billing

within 90 days of the surgery for entries with non-missing institution identifier. Cohort 3 is the final general

analysis cohort from which we may define the indicator-specific cohorts of interest. The minimally invasive

surgery cohort is obtained by restricting cohort 3 to patients with a kidney cancer diagnosis who underwent


[Figure 7.1 flow diagram. Box contents, in the order extracted:]

23,402 hospitalizations with procedure codes 673, 6730, 674, 6741 or intervention codes 1PC87, 1PC89, 1PC91

23,330 nephrectomy hospitalizations without partial and radical procedure during the same hospitalization

21,192 nephrectomies for patients with cancer registry diagnostic code C649

28,956 patients with cancer registry diagnostic code C649

17,441 pathology reports (5,733 'old' and 11,708 'new')

16,344 pathology reports after removing full duplicates with respect to id, report date, tumor size, stage, grade, histology and procedure type

15,974 pathology reports after removing duplicates with no observed conflict in tumor size, stage, grade and histology in the same day

15,842 pathology reports with non-missing tumor size

62,617 OHIP billings with billing codes S411, S413, S415, S416, S420, S423

22,903 OHIP billings with anesthesia and assistant related billings removed

22,883 OHIP billings after dropping billings on the same day without diagnosis code 189

22,759 OHIP billings after dropping partial and radical billings on the same day

22,715 OHIP billings after combining multiple primary billings on the same day

21,245 first kidney cancer nephrectomies after or no more than 90 days before the diagnosis date

Cohort 1: 14,427 first nephrectomies with matching pathology report (either only one nephrectomy and one pathology report, or closest report date within 30 days of the procedure and no conflicting data if multiple reports available on the same day)

Cohort 2: 11,046 first nephrectomies with non-missing tumor size and stage

Cohort 3: 10,583 first nephrectomies with closest OHIP billing within 90 days and non-missing institution

Legend (data sources): DAD data, OHIP data, Cancer registry data, Pathology reports

Figure 7.1: Flow diagram illustrating the database merging and cohort defining steps resulting in the general analysis dataset from which we defined our analysis cohort.


radical nephrectomy surgery, had T1 or T2 stage disease, were older than 18 years of age and were diagnosed

between 1995 and 2014.

7.3.2 Causal Mediation Analysis for Hospital Comparisons

Causal mediation analysis allows the total effect (TE) of some exposure Z on some outcome Y to be

decomposed into the natural direct effect (NDE) of Z on Y , and the natural indirect effect (NIE) of Z

on Y that acts through some mediating variable M (Baron and Kenny, 1986). In our case, we are interested

in decomposing the effect of a hospital on patient LOS that may be acting through MIS, as in Figure 7.2.

[Diagram nodes: Case-mix factors, Hospital, MIS, LOS]

Figure 7.2: Causal model representing the effect of hospital on patient length of stay (LOS) that may be mediated by performance of minimally invasive surgery (MIS). Case-mix factors include patient-level demographic and disease-progression information.

The decomposition allows the assessment of the effect a hypothetical intervention on the mediator may

have on the outcome of interest. As shown in Daignault et al. (2019), when considering an intervention to

bring the mediator to the average provincial level, the total effect decomposition on the SMR takes the form

\[
\mathrm{SMR}^{TE}_z = \mathrm{SMR}^{NIE}_z \times \mathrm{SMR}^{NDE}_z,
\]

which is calculated for each hospital in the system. The total effect SMR in this case is equivalent to a

standard quality comparison assessment using indirect standardization, where the QI is the outcome LOS.

One would then be looking to identify hospitals in which LOS is significantly longer than it would be if that

hospital was providing a provincial average level of care by checking

\[
\mathrm{SMR}^{TE}_z =
\begin{cases}
> 1 \Rightarrow \text{LOS is longer than under provincial average care} \Rightarrow \text{investigate intervention on MIS} \\
< 1 \Rightarrow \text{LOS is shorter than expected} \Rightarrow \text{no intervention on MIS needed.}
\end{cases}
\]

The NIE represents the effect on LOS if MIS rates in that hospital were changed to the average provincial


level, compared to their current level. The effect of this intervention can be seen by considering

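\[
\mathrm{SMR}^{NIE}_z =
\begin{cases}
> 1 \Rightarrow \text{intervention on MIS to bring to provincial average reduces LOS} \Rightarrow \text{intervention beneficial to patient LOS} \\
< 1 \Rightarrow \text{intervention increases LOS compared to current MIS rates} \Rightarrow \text{MIS not causing long LOS.}
\end{cases}
\]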

Thus the NIE is the measure used to assess the potential benefit to patient LOS of improving care through

intervention on MIS. The NDE then represents the remaining hospital effect on LOS once the MIS rates are

changed to the provincial average. It can be used to diagnose whether this particular intervention improves

LOS enough for this hospital to no longer be considered an outlier in care, or if further interventions are

necessary on other hospital practices. This can be assessed by considering

\[
\mathrm{SMR}^{NDE}_z =
\begin{cases}
> 1 \Rightarrow \text{after intervention, LOS is longer than under provincial average care} \Rightarrow \text{intervention needed on another practice} \\
< 1 \Rightarrow \text{after intervention, LOS is shorter than under provincial average care} \Rightarrow \text{intervention on MIS succeeded in reducing LOS.}
\end{cases}
\]

These effects will be estimated for all hospitals in Ontario treating patients identified based on the definition

in Section 7.3.1.
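As a purely illustrative reading of the three ratios, suppose a hospital had the hypothetical values SMR^TE = 1.20 and SMR^NIE = 1.15 (these numbers are invented for the example and are not results from the analysis). The multiplicative decomposition then fixes the direct effect:

```latex
\[
\mathrm{SMR}^{NDE}_z
  = \frac{\mathrm{SMR}^{TE}_z}{\mathrm{SMR}^{NIE}_z}
  = \frac{1.20}{1.15}
  \approx 1.04 .
\]
```

Under these assumed values, LOS at this hospital would be about 20% longer than under provincial average care, bringing the MIS rate to the provincial average would remove most of that excess, and the remaining direct effect of roughly 1.04 would indicate that only a small part of the excess LOS is attributable to practices other than MIS.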

7.3.3 Estimation of Effect Decomposition

The total, natural indirect and natural direct effects when considering a hypothetical intervention to bring

the mediator to the average provincial level can be estimated in two ways: model-based or semi-parametric

estimators (Daignault et al., 2019). The model-based estimators involve fitting three regression models

representing each of the main pathways in Figure 7.2 and then extracting predicted values for specific values

of the mediator and exposure for patients in each hospital. We fit a linear regression model for the outcome

LOS (y), under a log-transformation, adjusting for the mediator MIS (m), the hospital of treatment (z), and

patient-level confounders (x),

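\[
\begin{aligned}
f(x, m, z; \phi) &\equiv E[Y_i \mid X_i = x, Z_i = z, M_i = m, \phi] \\
&= \phi_0 + \phi_1' x + \phi_2 m + \sum_{z=1}^{p} \phi_{3z} \mathbf{1}_{\{Z=z\}}.
\end{aligned}
\tag{7.1}
\]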

We could also include an interaction term between the hospital (z) and the mediator (m) in the above model.

This would allow for the effect of the mediator on the outcome to differ between hospitals. However, for this

paper, we will use the outcome model without interaction. We also fit a logistic regression model for MIS,


adjusting for hospital of treatment and patient-level confounders,

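\[
\begin{aligned}
g(x, m, z; \alpha) &\equiv P(M_i = m \mid Z_i = z, X_i = x, \alpha) \\
&= \Big[\operatorname{expit}\Big\{\alpha_0 + \alpha_1' x + \sum_{z=1}^{p} \alpha_{2z} \mathbf{1}_{\{Z=z\}}\Big\}\Big]^{m}
   \Big[1 - \operatorname{expit}\Big\{\alpha_0 + \alpha_1' x + \sum_{z=1}^{p} \alpha_{2z} \mathbf{1}_{\{Z=z\}}\Big\}\Big]^{1-m}
\end{aligned}
\tag{7.2}
\]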

as well as a multinomial regression model for the hospital assignment given patient covariates

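\[
e(x, z; \gamma) \equiv P(Z_i = z \mid X_i = x, \gamma) =
\begin{cases}
\dfrac{\exp(\gamma_{0z} + \gamma_{1z}' x)}{1 + \sum_{a=2}^{p} \exp(\gamma_{0a} + \gamma_{1a}' x)}, & z = 2, \ldots, p \\[2ex]
\dfrac{1}{1 + \sum_{a=2}^{p} \exp(\gamma_{0a} + \gamma_{1a}' x)}, & z = 1.
\end{cases}
\tag{7.3}
\]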

The fitted values based on the observed data can be extracted from these models along with the predicted

values based on fixing either the hospital of treatment or the MIS status to a particular level to estimate the

effects of interest using the following model-based estimators:

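\begin{align}
\widehat{\mathrm{SMR}}^{TE}_z &=
\frac{\sum_{i=1}^{n} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, z; \hat\phi)\, g(x_i, m, z; \hat\alpha)}
     {\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, a; \hat\phi)\, g(x_i, m, a; \hat\alpha)\, e(x_i, a; \hat\gamma)},
\tag{7.4} \\
\widehat{\mathrm{SMR}}^{NDE}_z &=
\frac{\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, z; \hat\phi)\, g(x_i, m, a; \hat\alpha)\, e(x_i, a; \hat\gamma)}
     {\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, a; \hat\phi)\, g(x_i, m, a; \hat\alpha)\, e(x_i, a; \hat\gamma)},
\tag{7.5} \\
\widehat{\mathrm{SMR}}^{NIE}_z &=
\frac{\sum_{i=1}^{n} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, z; \hat\phi)\, g(x_i, m, z; \hat\alpha)}
     {\sum_{i=1}^{n} \sum_{a=1}^{p} \sum_{m=0}^{1} \mathbf{1}_{\{Z_i=z\}} f(x_i, m, z; \hat\phi)\, g(x_i, m, a; \hat\alpha)\, e(x_i, a; \hat\gamma)},
\tag{7.6}
\end{align}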

where $\{\hat\alpha, \hat\gamma, \hat\phi\}$ are the maximum likelihood estimates of the fitted MIS, hospital assignment and LOS

models respectively. Here, whenever we are considering a comparison to some provincial average level of care,

predicted values are obtained by fixing the hospital indicator to each level in turn and summing over all

hospital indicator levels.
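The model-based estimators can be sketched in a few lines of R. The example below is a minimal illustration on simulated data, not the analysis code: the data-generating values, the single covariate `x`, and all object names (`dat`, `fit_y`, `fit_m`, `fit_z`, `smr`) are assumptions made for the sketch. It fits the three working models, predicts the outcome and mediator with the hospital indicator fixed to each level in turn, and combines the predictions as in equations (7.4) to (7.6).

```r
library(nnet)  # multinom()

set.seed(42)
n <- 600; p <- 3
x <- rnorm(n)                                   # one case-mix covariate
z_int <- sample(1:p, n, replace = TRUE)         # hospital of treatment
m <- rbinom(n, 1, plogis(0.3 * x + c(-0.5, 0, 0.5)[z_int]))           # MIS indicator
y <- 1 + 0.4 * x - 0.3 * m + c(0.2, 0, -0.2)[z_int] + rnorm(n, sd = 0.3)  # log-LOS
dat <- data.frame(y = y, m = m, x = x, z = factor(z_int))
lev <- levels(dat$z)

fit_y <- lm(y ~ x + m + z, data = dat)                   # outcome model, as in (7.1)
fit_m <- glm(m ~ x + z, data = dat, family = binomial)   # mediator model, as in (7.2)
fit_z <- multinom(z ~ x, data = dat, trace = FALSE)      # assignment model, as in (7.3)
e_hat <- fitted(fit_z)                                   # n x p matrix of e(x_i, a)

# Predictions with the hospital indicator fixed to each level a in turn
f0 <- f1 <- g1 <- matrix(NA_real_, n, p)
for (a in seq_len(p)) {
  nd <- dat
  nd$z <- factor(lev[a], levels = lev)
  nd0 <- nd; nd0$m <- 0
  nd1 <- nd; nd1$m <- 1
  f0[, a] <- predict(fit_y, newdata = nd0)                   # f(x_i, 0, a)
  f1[, a] <- predict(fit_y, newdata = nd1)                   # f(x_i, 1, a)
  g1[, a] <- predict(fit_m, newdata = nd, type = "response") # P(M = 1 | x_i, a)
}

smr <- t(sapply(seq_len(p), function(k) {
  ind <- dat$z == lev[k]
  # own outcome model, own mediator distribution
  y_zz <- sum(f0[ind, k] * (1 - g1[ind, k]) + f1[ind, k] * g1[ind, k])
  # provincial average outcome model and mediator distribution
  y_aa <- sum((f0[ind, ] * (1 - g1[ind, ]) + f1[ind, ] * g1[ind, ]) * e_hat[ind, ])
  # own outcome model, provincial average mediator distribution
  y_za <- sum((f0[ind, k] * (1 - g1[ind, ]) + f1[ind, k] * g1[ind, ]) * e_hat[ind, ])
  c(TE = y_zz / y_aa, NDE = y_za / y_aa, NIE = y_zz / y_za)
}))
rownames(smr) <- paste("hospital", lev)
print(round(smr, 3))
```

By construction the three columns satisfy TE = NDE × NIE for every hospital, which gives a quick sanity check on any implementation.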

The semi-parametric estimators of the total, direct and indirect effects involve fitting the same hospital

assignment model (equation (7.3)) and the MIS mediator model (equation (7.2)) as the model-based estima-

tors but avoid the necessity of fitting the LOS outcome model (equation (7.1)). In contrast to modelling all

pathways as in the model-based estimators, the semi-parametric estimators weight the observed LOS in each

hospital by a combination of predicted values from models (7.2) and (7.3). The resulting semi-parametric

estimators have the form

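\begin{align}
\widehat{\mathrm{SMR}}^{TE}_z &=
\frac{\sum_{i=1}^{n} Y_i \mathbf{1}_{\{Z_i=z\}}}
     {\sum_{i=1}^{n} Y_i\, e(x_i, z; \hat\gamma)},
\tag{7.7} \\
\widehat{\mathrm{SMR}}^{NDE}_z &=
\frac{\sum_{i=1}^{n} \sum_{a=1}^{p} \mathbf{1}_{\{Z_i=z\}} Y_i\,
      \dfrac{g(x_i, m_i, a; \hat\alpha)}{g(x_i, m_i, z; \hat\alpha)}\, e(x_i, a; \hat\gamma)}
     {\sum_{i=1}^{n} Y_i\, e(x_i, z; \hat\gamma)},
\tag{7.8} \\
\widehat{\mathrm{SMR}}^{NIE}_z &=
\frac{\sum_{i=1}^{n} Y_i \mathbf{1}_{\{Z_i=z\}}}
     {\sum_{i=1}^{n} \sum_{a=1}^{p} \mathbf{1}_{\{Z_i=z\}} Y_i\,
      \dfrac{g(x_i, m_i, a; \hat\alpha)}{g(x_i, m_i, z; \hat\alpha)}\, e(x_i, a; \hat\gamma)}.
\tag{7.9}
\end{align}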


Here again we are taking averages of predicted values from both models over all hospitals to represent the

average provincial level of care. We therefore estimate the total effect decomposition using both the model-based and semi-parametric estimators to determine whether intervention on MIS rates in a hospital has a

benefit on patient LOS.
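The semi-parametric estimators can be sketched in the same way. The example below is again a minimal illustration on simulated data with assumed names (`dat`, `fit_m`, `fit_z`, `g_obs`, `smr_sp`) and invented data-generating values: only the mediator and hospital assignment models are fitted, and the observed outcomes are reweighted by the ratio of mediator probabilities and by the assignment probabilities, following equations (7.7) to (7.9).

```r
library(nnet)  # multinom()

set.seed(7)
n <- 600; p <- 3
x <- rnorm(n)
z_int <- sample(1:p, n, replace = TRUE)
m <- rbinom(n, 1, plogis(0.3 * x + c(-0.5, 0, 0.5)[z_int]))
y <- exp(1 + 0.4 * x - 0.3 * m + c(0.2, 0, -0.2)[z_int] + rnorm(n, sd = 0.3))
dat <- data.frame(y = y, m = m, x = x, z = factor(z_int))
lev <- levels(dat$z)

fit_m <- glm(m ~ x + z, data = dat, family = binomial)   # mediator model, as in (7.2)
fit_z <- multinom(z ~ x, data = dat, trace = FALSE)      # assignment model, as in (7.3)
e_hat <- fitted(fit_z)                                   # n x p matrix of e(x_i, a)

# g_obs[i, a]: probability of patient i's observed mediator value under hospital a
g_obs <- sapply(lev, function(a) {
  nd <- dat
  nd$z <- factor(a, levels = lev)
  p1 <- predict(fit_m, newdata = nd, type = "response")
  ifelse(dat$m == 1, p1, 1 - p1)
})

smr_sp <- t(sapply(seq_len(p), function(k) {
  ind <- dat$z == lev[k]
  y_zz <- sum(dat$y[ind])                  # numerator of (7.7)
  y_aa <- sum(dat$y * e_hat[, k])          # denominator of (7.7), over all patients
  # mediator probability ratios g(x_i, m_i, a) / g(x_i, m_i, z), one row per patient
  w <- g_obs[ind, , drop = FALSE] / g_obs[ind, k]
  y_za <- sum(dat$y[ind] * rowSums(w * e_hat[ind, , drop = FALSE]))  # numerator of (7.8)
  c(TE = y_zz / y_aa, NDE = y_za / y_aa, NIE = y_zz / y_za)
}))
rownames(smr_sp) <- paste("hospital", lev)
print(round(smr_sp, 3))
```

No outcome model appears anywhere in this sketch, which is the practical difference from the model-based version; the price is that the weights can become unstable when a hospital rarely performs one of the two surgery types.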

The hospital assignment model in (7.3) requires estimation of hospital-specific regression parameters

which may become problematic if there are hospitals with a small number of patients. In this situation,

there may not be sufficient data to estimate the model parameters for all hospitals. Therefore we also fit

two alternative hospital assignment models for use in estimation of the total, indirect and direct effects. The

first is a simpler implementation of a multinomial regression model where the small hospitals are combined

into a single category. Suppose, out of p hospitals, the first l are small, while the rest are large. The first

alternative multinomial model will be specified as

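\[
e(x, z; \gamma) =
\begin{cases}
\dfrac{\exp(\gamma_{0z} + \gamma_{1z}' x)}{1 + \sum_{a=l+1}^{p} \exp(\gamma_{0a} + \gamma_{1a}' x)}, & z = l+1, \ldots, p \\[2ex]
\dfrac{1}{1 + \sum_{a=l+1}^{p} \exp(\gamma_{0a} + \gamma_{1a}' x)}, & z = 1, \ldots, l
\end{cases}
\tag{7.10}
\]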

where the combined category serves as the reference level. The second alternative multinomial model involves

restricting the parameter estimation to hospitals that have sufficient information. We therefore specify a

model that only estimates covariate effects for the large hospitals, as in Daignault and Saarela (2017), by

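\[
e(x_i, z, \gamma) =
\begin{cases}
\dfrac{\exp(\gamma_{0z})}{1 + \sum_{a=2}^{l} \exp(\gamma_{0a}) + \sum_{a=l+1}^{p} \exp(\gamma_{0a} + \gamma_{1a} x_i)}, & z = 2, \ldots, l \\[2ex]
\dfrac{\exp(\gamma_{0z} + \gamma_{1z} x_i)}{1 + \sum_{a=2}^{l} \exp(\gamma_{0a}) + \sum_{a=l+1}^{p} \exp(\gamma_{0a} + \gamma_{1a} x_i)}, & z = l+1, \ldots, p
\end{cases}
\tag{7.11}
\]

and $e(x_i, 1, \gamma) = 1 - \sum_{z=2}^{p} e(x_i, z, \gamma)$. While this second model is intentionally misspecified for the small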

hospitals, it will still help estimation for the larger hospitals. This model can be fit in R using the VGAM

package. Note that both alternative multinomial models continue to use all hospitals in the data, ensuring

that the comparison to a provincial average care level still holds. We will use the multinomial models specified

by equations (7.10) and (7.11) in our analysis.
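As a rough illustration of the pooled model in (7.10), the following Python sketch computes assignment probabilities for a single patient. The function name, the argument layout and all numeric values are hypothetical and are not taken from the analysis, which was carried out in R; the sketch only shows how the pooled small hospitals act as the reference category.

```python
import math

def pooled_assignment_probs(x, gamma0, gamma1):
    """Assignment probabilities under a pooled multinomial model, as in (7.10).

    x      : list of covariate values for one patient
    gamma0 : dict mapping each large hospital to its intercept
    gamma1 : dict mapping each large hospital to its list of covariate coefficients
    The pooled small hospitals form the reference category (linear predictor 0).
    """
    linpred = {
        z: gamma0[z] + sum(g * xj for g, xj in zip(gamma1[z], x))
        for z in gamma0
    }
    denom = 1.0 + sum(math.exp(v) for v in linpred.values())
    probs = {z: math.exp(v) / denom for z, v in linpred.items()}
    probs["pooled"] = 1.0 / denom  # probability of the combined small-hospital category
    return probs

# Hypothetical coefficients for two large hospitals and two covariates.
example = pooled_assignment_probs(
    x=[0.5, 1.0],
    gamma0={"Inst 1": 0.2, "Inst 2": -0.1},
    gamma1={"Inst 1": [0.1, 0.0], "Inst 2": [0.0, 0.3]},
)
print(example)
```

By construction the probabilities over the large hospitals and the pooled category sum to one, so every patient in the data still contributes to the provincial reference level.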

7.3.4 Standard Errors for the Estimated SMRs

To obtain estimates of the standard errors and subsequent confidence intervals, we consider two different

resampling techniques. The first is the nonparametric bootstrap (Efron and Tibshirani, 1986) which can be

used to obtain the sampling distribution for both the model-based and semi-parametric estimators. Patients

are resampled from the overall patient population in order to allow the hospital sizes to vary. When equations

(7.3) or (7.10) are used to estimate the multinomial hospital assignment probabilities, the nonparametric

bootstrap is not too computationally intensive and thus can be allowed to run for a reasonably large number

of iterations. However, when using the constrained multinomial model (equation (7.11)), the computational

expense is quite high. Therefore, we also consider the use of a Normal approximation (Talbot et al., 2011) to

obtain confidence intervals for our SMR estimates, where the standard error is obtained from a much smaller


number of nonparametric bootstrap iterations. The second resampling technique is an approximate Bayesian

method (Gelman et al., 2013) that resamples values of the model parameters from an approximate posterior

distribution under the assumption that the model parameters are distributed according to a multivariate

Normal distribution centred at the parameter MLE of each model and with variance set to the squared

parameter standard error. Then predicted values are extracted using the resampled parameter values and

the effects of interest are computed as before. Note that the approximate Bayesian resampling can only be

applied to the model-based estimators, as for these all the uncertainty is due to the parameter estimation.
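The approximate Bayesian resampling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the analysis: parameters are drawn independently with variance equal to the squared standard error (no off-diagonal covariance), and the function names, the toy effect function and all numeric values are hypothetical.

```python
import math
import random
import statistics

def approx_bayes_interval(mle, se, effect_fn, n_draws=500, seed=1):
    """Percentile interval for effect_fn(theta), with theta drawn around the MLE.

    Simplifying assumption: each parameter is drawn independently from
    Normal(mle, se**2); a full covariance matrix is not used here.
    """
    rng = random.Random(seed)
    draws = [
        effect_fn([rng.gauss(m, s) for m, s in zip(mle, se)])
        for _ in range(n_draws)
    ]
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]

# Hypothetical effect: a ratio of two exponentiated linear predictors.
def toy_ratio(theta):
    return math.exp(theta[0]) / math.exp(theta[1])

lower, upper = approx_bayes_interval([0.30, 0.10], [0.05, 0.04], toy_ratio)
print(round(lower, 3), round(upper, 3))
```

Because only the parameters are redrawn and the patients are held fixed, this kind of resampling reflects parameter uncertainty alone, which is why it suits the model-based estimators.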

We therefore estimate the variability of the TE, NIE and NDE first by grouping all hospitals of patient

volume less than 50 and using the multinomial model specified by (7.10), and secondly by only grouping

hospitals of patient volume less than 9 and then estimating covariate effects in the multinomial model for

hospitals with more than 50 patients only, as in (7.11). The latter will be referred to as the constrained

model. Hospitals with fewer than 9 patients are grouped into a single category in the constrained model so that,

when resampling patients in the bootstrap, all hospitals will always be represented in the resamples. When

using the unconstrained multinomial model, we estimate the variability of the estimates using the sampling

distribution obtained from the nonparametric bootstrap of 500 resamples. When using the constrained model,

we compare the use of a bootstrap sample from 50 iterations in conjunction with a Normal approximation

for obtaining confidence intervals to the use of a larger 125 iteration bootstrap sampling distribution. In the

case of the model-based estimators using the constrained model, we also obtain sampling distributions from

the approximate Bayesian method using 500 iterations.
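To make the two interval constructions concrete, the Python sketch below contrasts a percentile interval from a larger bootstrap with a Normal-approximation interval whose standard error comes from a smaller bootstrap. The data, the toy ratio estimator and all names are hypothetical stand-ins; the actual estimators are those of equations (7.4)-(7.9).

```python
import random
import statistics

def bootstrap_estimates(patients, estimator, n_boot, seed=1):
    """Resample patients (not hospitals) with replacement and re-estimate,
    so that hospital sizes vary from resample to resample."""
    rng = random.Random(seed)
    n = len(patients)
    return [
        estimator([patients[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]

def percentile_interval(boot):
    """2.5 and 97.5 percentiles of the bootstrap sampling distribution."""
    cuts = statistics.quantiles(boot, n=40)
    return cuts[0], cuts[-1]

def normal_interval(point_estimate, boot, z=1.96):
    """Interval centred at the point estimate, with a bootstrap standard error."""
    se = statistics.stdev(boot)
    return point_estimate - z * se, point_estimate + z * se

# Hypothetical data: (hospital, LOS) pairs and a toy ratio of the mean LOS
# in hospital "A" to the overall mean LOS.
def toy_ratio(patients):
    in_a = [los for hosp, los in patients if hosp == "A"]
    overall = [los for _, los in patients]
    return statistics.mean(in_a) / statistics.mean(overall)

data_rng = random.Random(0)
patients = [("A" if i < 60 else "B", data_rng.lognormvariate(1.4, 0.6))
            for i in range(200)]
estimate = toy_ratio(patients)
boot_small = bootstrap_estimates(patients, toy_ratio, n_boot=50)
boot_large = bootstrap_estimates(patients, toy_ratio, n_boot=125, seed=2)
print(normal_interval(estimate, boot_small))
print(percentile_interval(boot_large))
```

The Normal-approximation interval is symmetric about the point estimate by construction, whereas the percentile interval need not be.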

7.4 Results

7.4.1 Description of the Data

Based on the cohort definition in Section 7.3.1, we identified 4079 Ontario patients diagnosed with kidney

cancer older than 18 years of age, who underwent a radical nephrectomy for T1 or T2 stage cancer, with

diagnosis occurring between 1995 and 2014. These patients were treated at 72 different hospitals in Ontario,

29 of which treated more than 50 patients between 2004 and 2014. Twelve hospitals provided only one level

of the mediator variable MIS (i.e. performed either only minimally invasive or only open surgeries) and

therefore were excluded to ensure that the mediator models are identifiable. Of the remaining 60 hospitals, 5

hospitals treated fewer than 9 patients and were thus combined to create a pooled ‘Other’ hospital category.

We therefore had 56 hospital categories for the analysis involving the constrained multinomial assignment

model (55 hospitals + 1 ‘Other’ pooled category) consisting of 4001 patients, with hospital volumes ranging

from 12 patients to 454 patients, as seen in Figure 7.3. For the unconstrained hospital assignment model, we

further pooled all hospitals that treated fewer than 50 patients, indicated by the red vertical line in Figure 7.3.

A summary of the patient characteristics of the combined population of 4001 patients, not stratified by

hospital, is provided in Table 7.1. We note that 58.4% of patients received minimally invasive nephrectomies,

while the standard deviation of length of stay is quite high at 6.39 days, motivating log-transformation for

the analysis. Many patients were diagnosed closer to the end of the study period and the majority overall

had T1 stage kidney cancer over T2 stage with relatively few co-morbidities according to the Charlson score.

[Figure 7.3: plot of hospital volume (x-axis, 0 to 400 patients) against hospital ID (y-axis, Inst 1 to Inst 56 and ‘Others’).]

Figure 7.3: Number of patients in cohort per hospital. ‘Others’ is a pooled category combining hospitals that treated fewer than 9 patients. The red line indicates the cut point for pooling hospitals that treat fewer than 50 patients.

Categorical Variables
Variable                  n (%)           Variable                    n (%)
Age Group                                 Income Quintile
  0-49 years              740 (18.5)        1                         757 (18.9)
  50-59 years             1040 (26.0)       2                         857 (21.4)
  60-69 years             1128 (28.2)       3                         773 (19.3)
  70-79 years             812 (20.3)        4                         845 (21.1)
  > 80 years              281 (7.0)         5                         769 (19.2)
Sex (Male vs Female)      2408 (60.2)     Tumour Stage (T2 vs T1)     830 (20.7)
Year of Diagnosis                         Charlson Score
  1997                    1 (0.0)           0                         15 (0.4)
  2001                    2 (0.0)           1                         7 (0.2)
  2002                    111 (2.8)         2                         2660 (66.5)
  2003                    94 (2.3)          3                         618 (15.4)
  2004                    200 (5.0)         4                         243 (6.1)
  2005                    148 (3.7)         5                         145 (3.6)
  2006                    238 (5.9)         6                         81 (2.0)
  2007                    325 (8.1)         7                         40 (1.0)
  2008                    355 (8.9)         8                         122 (3.0)
  2009                    354 (8.8)         9                         37 (0.9)
  2010                    348 (8.7)         10                        18 (0.4)
  2011                    396 (9.9)         11                        9 (0.2)
  2012                    479 (12.0)        12                        3 (0.1)
  2013                    469 (11.7)        13                        2 (0.1)
  2014                    481 (12.0)      MIS (yes vs no)             2338 (58.4)

Continuous Variables
Variable                  Mean (sd)       Variable                    Mean (sd)
ACG Score                 22.91 (11.93)   Tumor Size (cm)             5.23 (2.92)
Length of Stay (days)     5.09 (6.39)     Days from DX to NX          1.19 (1.92)

Table 7.1: Descriptive statistics for population of n = 4001 Ontario kidney cancer patients undergoing radical nephrectomies across 60 hospitals. Here, DX refers to diagnosis, NX refers to nephrectomy, and ACG score is the Adjusted Clinical Group score (Starfield et al., 1991).

Differences in the covariate distribution of patients between hospitals were determined by calculating the pairwise standardized mean differences (SMD) for each covariate for all hospitals. The distribution of the pairwise SMDs for each covariate can be seen in Figure 7.4. Minimally invasive surgery shows differences between hospitals, as do many of the other covariates, highlighting the importance of covariate adjustment. Figures 7.5 and 7.6, displaying the case-mix adjusted SMRs for each hospital in a funnel plot for both MIS and LOS, indicate substantial variability in patient LOS as well as the rate of MIS between hospitals. Further heat maps displaying covariate imbalance between hospitals for the other covariates can be found in the Supplemental Figures (SFigures S7.1 to S7.9). From these, there appear to be substantial differences between hospitals in terms of the age, sex, income, Charlson score and tumour size of their populations.
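The pairwise standardized mean differences used in this section to assess covariate imbalance can be sketched as below. The text does not state which SMD formula was used, so the common definition for a continuous covariate (absolute difference in means divided by the square root of the average of the two variances) is an assumption here, as are the function names and the toy data.

```python
import math
import statistics
from itertools import combinations

def smd(a, b):
    """Standardized mean difference between two samples of one covariate.

    Assumed definition: |mean(a) - mean(b)| / sqrt((var(a) + var(b)) / 2).
    """
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    return abs(statistics.mean(a) - statistics.mean(b)) / pooled_sd

def pairwise_smd(values_by_hospital):
    """SMD of one covariate for every unordered pair of hospitals."""
    return {
        (h1, h2): smd(values_by_hospital[h1], values_by_hospital[h2])
        for h1, h2 in combinations(sorted(values_by_hospital), 2)
    }

# Hypothetical tumour sizes (cm) in three hospitals.
tumour_size = {
    "Inst 1": [4.1, 5.0, 6.2, 3.8],
    "Inst 2": [5.5, 6.9, 7.1, 4.9],
    "Inst 3": [4.4, 5.1, 5.8, 4.0],
}
for pair, value in pairwise_smd(tumour_size).items():
    print(pair, round(value, 2))
```

With p hospitals this produces p(p-1)/2 values per covariate, which is the distribution summarised for each covariate in Figure 7.4.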

7.4.2 Mediation Analysis Results

As discussed in Section 7.3.3, the model-based estimators of the total, direct and indirect effects require modelling both the outcome and mediator pathways as in equations (7.1) and (7.2), whereas the semi-parametric estimators require only the mediator model (7.2). A summary of these models can be found in Figures 7.7 and 7.8, when all hospitals treating fewer than 50 patients have been pooled into a single category. The analogous results for the case where hospitals treating fewer than 9 patients are pooled can be found in the Supplemental Figures (SFigures S7.10 and S7.11). Importantly, the mediator MIS is significantly associated with shorter LOS, further supporting the notion that an intervention on MIS could improve LOS. Figure 7.7 shows that the odds of receiving MIS are lower for older age groups compared to the youngest age group and that patients are significantly less likely to receive MIS in all other hospitals compared to hospital 1 (the largest). Further, as the year of diagnosis becomes more recent, patients are significantly more likely to receive MIS. From the outcome model in Figure 7.8, we note that older patients have significantly longer LOS, as do those with more co-morbidities, a longer wait time from diagnosis to surgery, and T2 stage cancer, but patients with a recent diagnosis and those in a higher income quintile see significantly shorter LOS. In addition to these models, both estimation methods require a multinomial hospital assignment model. The simplest option is to use the model specified in equation (7.10), where the hospitals with fewer than 50 patients are pooled into a single category. Figure 7.9 presents the estimates for all hospitals (omitting the ‘Others’ pooled category) using both the model-based and semi-parametric approaches with models (7.1), (7.2) and (7.10). The whiskers of the boxplots represent the 2.5 and 97.5 percentiles of the sampling distribution from a 500 iteration bootstrap.

[Figure 7.4: distribution of pairwise SMD (x-axis, 0.0 to 1.5, with small, medium and large reference marks) for Income Quintile, Year of Diagnosis, Charlson Score, ACG Score, Tumor Size, Days from dx to nx, Length of Stay, Age Group, Sex, Tumor Stage and MIS.]

Figure 7.4: Distribution of pairwise standardized mean differences (SMD) between hospitals for each covariate, as well as the mediator MIS and outcome LOS.

In general, the semi-parametric estimators (equations (7.7)-(7.9)) display larger variability than the model-based estimators (equations (7.4)-(7.6)), but the estimates themselves are often quite similar with a few exceptions. Based on Section 7.3.2, hospitals that would be targets for an intervention to improve LOS are those with an estimated TE significantly larger than 1. Using this cutoff for both estimation approaches, we are able to identify hospitals 12, 13, 21, 22 and 27 as having LOS that is longer than if they were providing overall average provincial level care. The natural indirect effect then indicates whether LOS changes significantly after intervening to bring MIS rates to the provincial average level and holding all other practices constant. Indirect effects significantly larger than 1 indicate that the intervention would be able to reduce LOS. Of the hospitals targeted for intervention, the model-based estimators for hospitals 12, 13, and 21 show that the intervention would be successful in reducing LOS, yet the semi-parametric estimators, while providing estimates that are larger than one, do not show a significant improvement due to their higher variability. A direct effect significantly larger than 1 implies that the intervention, while reducing LOS, would not be sufficient to bring LOS to the provincial average level. Of the hospitals that would benefit from the intervention on MIS, both NDE estimates for hospital 12 are significantly larger than 1, indicating that there are hospital practices other than MIS that are contributing to its longer than average LOS. The model-based estimate of the NDE for hospital 21 indicates that LOS remains longer than average after intervention, yet the semi-parametric approach indicates that the intervention may reduce LOS to near provincial average levels. Finally, the model-based approach for hospital 13 indicates, like hospital 21, that further interventions are needed.

[Figure 7.5: funnel plot for minimally invasive surgery; case-mix adjusted proportion (x-axis, 0.0 to 1.0) against 1/SE (y-axis, 0 to 15). 10 lower outliers with 1806 patients, 13 non-outliers with 1079 patients, 7 upper outliers with 1116 patients; I² = 96.2% (p-value: <0.001).]

Figure 7.5: Funnel plot of case-mix adjusted minimally invasive surgery proportions. Circles represent hospital standardized mortality ratios, proportional to their volume, plotted against the inverse of their estimated standard error. Red indicates hospitals classified as poor outliers, blue for superior outliers.

[Figure 7.6: funnel plot for length of stay; case-mix adjusted mean in days (x-axis, 1 to 7) against 1/SE (y-axis, 0 to 60). 5 lower outliers with 1715 patients, 17 non-outliers with 1555 patients, 8 upper outliers with 731 patients; I² = 84.7% (p-value: <0.001).]

Figure 7.6: Funnel plot of case-mix adjusted length of stay. Circles represent hospital standardized mortality ratios, proportional to their volume, plotted against the inverse of their estimated standard error. Red indicates hospitals classified as poor outliers, blue for superior outliers.

Such methods can also be used to assess what aspects of care contribute to a hospital’s superior performance. Hospitals 1 and 10 would be identified from the TE as having significantly lower LOS than the provincial average. The NIE of hospital 10 indicates that intervening on MIS in this hospital to bring it to the provincial average would have no effect on LOS and would result in the same LOS as observed. Based on the NDE for this hospital, this is because the low LOS is driven not by MIS but by some other hospital practice. By contrast, the NIE of hospital 1, being significantly less than 1, indicates that its current MIS rates lead to shorter LOS than if it were performing MIS at the provincial average rate, so that its lower than average LOS can be attributed in part to better than average MIS practices. However, the NDE for the same hospital shows that there are still other aspects of care being provided that contribute to lower LOS.
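The reading of the TE, NIE and NDE used in the preceding paragraphs can be summarised as a small decision rule. The Python sketch below is only an illustration of that logic: the function name is hypothetical, the intervals passed to it are made-up numbers rather than results from the analysis, and "significant" is taken to mean that the 95% interval excludes 1.

```python
def interpret_decomposition(te_ci, nie_ci, nde_ci):
    """Map 95% intervals (lower, upper) on the SMR scale to the readings in the text."""
    notes = []
    if te_ci[0] > 1:
        notes.append("TE > 1: LOS longer than under provincial average care")
        if nie_ci[0] > 1:
            notes.append("NIE > 1: bringing MIS to the provincial average would reduce LOS")
        if nde_ci[0] > 1:
            notes.append("NDE > 1: practices other than MIS also lengthen LOS")
    elif te_ci[1] < 1:
        notes.append("TE < 1: LOS shorter than under provincial average care")
        if nie_ci[1] < 1:
            notes.append("NIE < 1: current MIS practice contributes to shorter LOS")
        if nde_ci[1] < 1:
            notes.append("NDE < 1: practices other than MIS also shorten LOS")
    else:
        notes.append("TE not distinguishable from 1")
    return notes

# Hypothetical intervals for a hospital flagged for intervention.
for line in interpret_decomposition((1.20, 1.60), (1.01, 1.10), (1.10, 1.50)):
    print(line)
```

A hospital whose TE interval covers 1 is not examined further in this sketch, mirroring the text, where the decomposition is interpreted only for hospitals that differ significantly from the provincial average.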

Figure 7.10 shows the same results when using the constrained multinomial assignment model from equation (7.11). Recall that, in this case, hospitals treating fewer than 9 patients were pooled and covariate effects were only estimated for hospitals treating more than 50 patients. Here the confidence intervals are calculated using a Normal approximation, with variance estimated from a 50-iteration bootstrap. We note that the confidence intervals using the Normal approximation are now forced to be centred at the estimate, whereas those from the non-parametric bootstrap may not always be centred. Also note that the intervals seem to be slightly shorter under the Normal approximation than in Figure 7.9. A more in-depth comparison of the error estimation methods is provided in the next section. However, we do see small differences between the estimates of the TE, NIE and NDE depending on whether the constrained or unconstrained multinomial model is used. Figure 7.11 presents the differences in the point estimates of the TE, NIE and NDE for both approaches, comparing the use of the unconstrained versus the constrained multinomial model in the estimation. The semi-parametric estimates of the TE, NIE and NDE show larger differences between the estimates using the unconstrained versus constrained multinomial model compared to the model-based estimates.

[Figure 7.7: caterpillar plot, Mediator Model Estimates; odds ratio on a log scale (x-axis, 0.005 to 2.000). Plotted values:]

Variable                          OR       95% CI
Intercept                         27.07    (17.18, 42.64)
Age Group 50-59yrs vs 0-49yrs     1.00     (0.79, 1.26)
Age Group 60-69yrs vs 0-49yrs     0.86     (0.68, 1.08)
Age Group 70-79yrs vs 0-49yrs     0.85     (0.66, 1.09)
Age Group 80+yrs vs 0-49yrs       0.80     (0.57, 1.12)
Male vs Female                    0.93     (0.79, 1.08)
Income Quintile                   0.99     (0.94, 1.05)
Diagnosis Year                    1.31     (1.28, 1.34)
Charlson Score                    0.88     (0.84, 0.92)
ACG score                         1.00     (1.00, 1.01)
Days from Diagnosis to Surgery    1.05     (1.00, 1.10)
Tumour size                       0.88     (0.84, 0.92)
Tumour stage T2 vs T1             0.73     (0.55, 0.96)
Inst 2 vs Inst 1                  0.03     (0.02, 0.05)
Inst 3 vs Inst 1                  0.06     (0.03, 0.10)
Inst 4 vs Inst 1                  0.23     (0.13, 0.42)
Inst 5 vs Inst 1                  0.33     (0.19, 0.60)
Inst 6 vs Inst 1                  0.03     (0.02, 0.05)
Inst 7 vs Inst 1                  0.04     (0.03, 0.07)
Inst 8 vs Inst 1                  0.03     (0.02, 0.06)
Inst 9 vs Inst 1                  0.13     (0.07, 0.24)
Inst 10 vs Inst 1                 0.06     (0.03, 0.11)
Inst 11 vs Inst 1                 0.13     (0.07, 0.23)
Inst 12 vs Inst 1                 0.05     (0.03, 0.08)
Inst 13 vs Inst 1                 0.02     (0.01, 0.03)
Inst 14 vs Inst 1                 0.03     (0.02, 0.06)
Inst 15 vs Inst 1                 0.06     (0.04, 0.12)
Inst 16 vs Inst 1                 0.12     (0.06, 0.22)
Inst 17 vs Inst 1                 0.05     (0.03, 0.09)
Inst 18 vs Inst 1                 0.04     (0.02, 0.08)
Inst 19 vs Inst 1                 0.05     (0.02, 0.09)
Inst 20 vs Inst 1                 0.08     (0.04, 0.15)
Inst 21 vs Inst 1                 0.02     (0.01, 0.03)
Inst 22 vs Inst 1                 0.02     (0.01, 0.03)
Inst 23 vs Inst 1                 0.05     (0.02, 0.09)
Inst 24 vs Inst 1                 0.04     (0.02, 0.08)
Inst 25 vs Inst 1                 0.18     (0.09, 0.38)
Inst 26 vs Inst 1                 0.14     (0.07, 0.29)
Inst 27 vs Inst 1                 0.10     (0.05, 0.21)
Inst 28 vs Inst 1                 0.03     (0.01, 0.05)
Inst 29 vs Inst 1                 0.10     (0.05, 0.21)
Others vs Inst 1                  0.04     (0.03, 0.07)

Figure 7.7: Caterpillar plot of the parameter estimates and 95% confidence intervals of the mediator model used in the model-based and semi-parametric estimators of the total effect decomposition. Here, all hospitals treating fewer than 50 patients are pooled into a single category (‘Others’).

[Figure 7.8: caterpillar plot, Outcome Model Estimates; regression coefficient (x-axis, 0.6 to 1.6). Plotted values:]

Variable                          Coef       95% CI
Intercept                         6.30E+10   (2E+07, 5E+01)
MIS vs Open surgery               0.78       (0.76, 0.80)
Age Group 50-59yrs vs 0-49yrs     1.05       (1.01, 1.09)
Age Group 60-69yrs vs 0-49yrs     1.10       (1.06, 1.14)
Age Group 70-79yrs vs 0-49yrs     1.20       (1.15, 1.24)
Age Group 80+yrs vs 0-49yrs       1.35       (1.28, 1.43)
Male vs Female                    0.98       (0.96, 1.00)
Income Quintile                   0.98       (0.97, 0.99)
Diagnosis Year                    0.99       (0.98, 0.99)
Charlson Score                    1.04       (1.03, 1.05)
ACG score                         1.00       (1.00, 1.00)
Days from Diagnosis to Surgery    1.02       (1.01, 1.02)
Tumour size                       1.01       (1.00, 1.01)
Tumour stage T2 vs T1             1.07       (1.03, 1.11)
Inst 2 vs Inst 1                  0.96       (0.90, 1.03)
Inst 3 vs Inst 1                  1.07       (0.99, 1.15)
Inst 4 vs Inst 1                  1.04       (0.97, 1.12)
Inst 5 vs Inst 1                  1.17       (1.09, 1.26)
Inst 6 vs Inst 1                  1.11       (1.03, 1.20)
Inst 7 vs Inst 1                  1.17       (1.08, 1.26)
Inst 8 vs Inst 1                  0.99       (0.90, 1.07)
Inst 9 vs Inst 1                  1.10       (1.01, 1.19)
Inst 10 vs Inst 1                 0.92       (0.85, 1.01)
Inst 11 vs Inst 1                 1.03       (0.95, 1.12)
Inst 12 vs Inst 1                 1.46       (1.34, 1.59)
Inst 13 vs Inst 1                 1.20       (1.10, 1.31)
Inst 14 vs Inst 1                 1.11       (1.02, 1.22)
Inst 15 vs Inst 1                 1.01       (0.92, 1.10)
Inst 16 vs Inst 1                 1.14       (1.04, 1.24)
Inst 17 vs Inst 1                 1.10       (1.00, 1.20)
Inst 18 vs Inst 1                 1.17       (1.07, 1.29)
Inst 19 vs Inst 1                 1.13       (1.03, 1.24)
Inst 20 vs Inst 1                 1.07       (0.97, 1.18)
Inst 21 vs Inst 1                 1.21       (1.10, 1.33)
Inst 22 vs Inst 1                 1.23       (1.11, 1.35)
Inst 23 vs Inst 1                 1.12       (1.01, 1.23)
Inst 24 vs Inst 1                 1.22       (1.10, 1.35)
Inst 25 vs Inst 1                 1.16       (1.05, 1.29)
Inst 26 vs Inst 1                 1.00       (0.90, 1.12)
Inst 27 vs Inst 1                 1.26       (1.13, 1.41)
Inst 28 vs Inst 1                 1.18       (1.05, 1.32)
Inst 29 vs Inst 1                 1.06       (0.94, 1.19)
Others vs Inst 1                  1.02       (0.97, 1.07)

Figure 7.8: Caterpillar plot of the parameter estimates and 95% confidence intervals of the outcome model used in the model-based estimators of the total effect decomposition. Here, all hospitals treating fewer than 50 patients are pooled into a single category (‘Others’).

We also note that when comparing the values in Figure 7.11 between the two estimation approaches, the model-based estimates of the effects result in predominantly positive differences, while the semi-parametric approach results in more negative differences between the two multinomial modelling approaches. This implies that, depending on the choice of multinomial model, discrepancies in the point estimates between the semi-parametric and model-based approaches may be accentuated. For example, in Figure 7.9, the point estimates of the total effect for hospital 8 are quite close together but in Figure 7.10 they have moved farther apart. Thus the choice of multinomial assignment model may have some small impact on the estimates of the total effect decomposition.

[Figure 7.9: three panels (Total Effect, SMR 0.5 to 2.0; Indirect Effect, SMR 0.6 to 1.6; Direct Effect, SMR 0.5 to 3.5) showing, for each provider Inst 1 to Inst 29, boxplots for the model-based and semi-parametric estimators with the original estimate marked by a dot. Provider volumes, Inst 1 to Inst 29: 454, 201, 163, 161, 145, 130, 124, 104, 102, 101, 99, 98, 96, 96, 94, 92, 90, 80, 78, 75, 75, 74, 74, 64, 63, 57, 55, 51, 50.]

Figure 7.9: Boxplots of bootstrap sampling distribution of model-based and semi-parametric estimators of the total effect decomposition when pooling hospitals that treat fewer than 50 patients and fitting the multinomial model specified in (7.10). Whiskers of boxplots represent 95% confidence intervals.

[Figure 7.10: three panels (Total Effect, SMR 0.5 to 3.0; Indirect Effect, SMR 0.4 to 1.4; Direct Effect, SMR 1 to 5) showing, for each provider Inst 1 to Inst 29 (volumes as in Figure 7.9), intervals for the model-based and semi-parametric estimators with the original estimate marked by a dot.]

Figure 7.10: Boxplots of 95% confidence intervals of model-based and semi-parametric estimators of the total effect decomposition when pooling hospitals that treat fewer than 9 patients and fitting a constrained multinomial model specified in (7.11). Variability was estimated via a 50 iteration non-parametric bootstrap and a Normal approximation was used for the confidence intervals.

[Figure 7.11: plot titled "Difference Between Using Constrained and Unconstrained Multinomial Model"; differences (y-axis, −0.10 to 0.05) for TE semi, TE model, NIE semi, NIE model, NDE semi and NDE model (x-axis, Estimated Total Effect Decomposition).]

Figure 7.11: Differences in point estimates between the use of unconstrained and constrained multinomial assignment models for both semi-parametric and model-based estimators of the total, indirect and direct effects for the 29 large hospitals in Ontario.

7.4.3 Comparison of Error Estimation Methods

We now compare the effect of the choice of variance estimation procedure on the conclusions of the mediation

analysis. As the multinomial model, as specified in equation (7.10), is relatively quick to run, we compare

the margins of error of the confidence intervals obtained through a 500 iteration bootstrap procedure to

other methods that use the constrained multinomial model of equation (7.11). The constrained model is

quite computationally expensive to run, therefore we compare the unconstrained version to a 125 iteration

bootstrap procedure, as well as a 50 iteration bootstrap procedure with a Normal approximation. For the

model-based approach, we also consider an approximate Bayesian procedure, as described in Section 7.3.4,

which is run for 500 iterations. The complete results of the mediation analysis are found in Figure 7.9 for the


unconstrained model bootstrap, Figure 7.10 for the constrained model with Normal approximation, Figure

S7.12 for the large bootstrap with constrained model, and Figure S7.13 for the approximate Bayesian method.

[Figure 7.12: three panels (Total Effect, ME 0.00 to 0.20; Indirect Effect, ME 0.00 to 0.10; Direct Effect, ME 0.00 to 0.20) showing the margin of error (ME) for Inst 1 to Inst 29, with series Pooled Boot., Approx. Bayes, Normal Approx. and Constrained Boot.]

Figure 7.12: Margin of error for the 95% confidence intervals of the model-based estimators for each of the variance estimation methods: 500 iteration non-parametric bootstrap using the unconstrained multinomial model, the approximate Bayesian method, the 50 iteration non-parametric bootstrap with Normal approximation and the 125 iteration non-parametric bootstrap using constrained multinomial model.

Figure 7.12 presents the margins of error of the confidence intervals for each of the variance estimation methods

listed for the model-based estimators. For the total effect confidence intervals, the approximate Bayesian

method provides wider intervals more often than the other methods. For the larger hospitals (hospitals 1-10)


the widths are relatively similar, whereas they seem to vary more as the number of patients decreases. The

intervals of the NIE tend to be much more similar and are shorter than their respective direct and total

effect counterparts.

Figure 7.13 shows analogous results for the semi-parametric estimators, omitting the approximate Bayesian

method as it is not an appropriate method for semi-parametric approaches. Note first that the estimators

are much more variable than the model-based estimators. Except for a few hospitals, such as hospitals 13

and 16, it appears that all variance estimation procedures perform similarly, resulting in confidence intervals

of similar widths. In particular, the 125 iteration bootstrap and the Normal approximation perform almost

identically for the indirect effect confidence intervals, implying that the computational expense of the large

bootstrap provides little benefit over the simpler Normal approximation.

7.5 Discussion

In this paper, we have applied mediation analysis methods for hospital comparisons, proposed by Daignault

et al. (2019), to assess whether a hypothetical intervention on minimally invasive surgery rates in hospitals in

Ontario would result in improved patient length of stay. The particular hypothetical intervention considered

is to bring the MIS rates to the provincial average level. The total effect decomposition of the hospital effect

SMR into the indirect and direct effect of MIS on length of stay has been shown to be useful in determining

at which hospitals an intervention would result in the largest improvement. Due to the nature of hospital

care comparisons, the presence of small hospitals in the data can make such an analysis difficult. While it

would be unrealistic to estimate the effect decomposition for the small hospitals, we nevertheless include

them so as to maintain the provincial average reference level for standardization. We therefore compared

two possible approaches to dealing with the need to model hospital assignment when small hospitals are

present: a general pooling approach and a constrained estimation approach. While being far simpler,

pooling hospitals with patient volumes below some cutoff may not be desirable as the cutoff can be arbitrary

and often results in the creation of a ‘mega hospital’ that is much larger than others in the data. However,

using a constrained multinomial model for hospital assignment that only estimates covariate effects for large

hospitals is computationally expensive and intentionally misspecifies the model for smaller hospitals. We

have shown that the choice of whether to pool or constrain the estimation can lead to small differences in

the estimates of the total, direct and indirect effect SMRs, causing potential disagreement between approaches.
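The general pooling approach can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `pool_small_hospitals` and the patient labels are hypothetical, while the cutoff of 9 patients and the label 'Others' follow the captions of the supplemental figures.

```python
from collections import Counter

def pool_small_hospitals(hospital_ids, cutoff=9, pooled_label="Others"):
    """Relabel hospitals treating fewer than `cutoff` patients as one category."""
    volumes = Counter(hospital_ids)
    return [h if volumes[h] >= cutoff else pooled_label for h in hospital_ids]

# Hypothetical patient-level hospital labels: two larger and three small hospitals
patients = (["Inst 1"] * 12 + ["Inst 2"] * 9 + ["Inst 3"] * 8
            + ["Inst 4"] * 5 + ["Inst 5"] * 2)

pooled = pool_small_hospitals(patients)
pooled_volumes = Counter(pooled)
print(pooled_volumes)
```

In this toy input the pooled category contains 15 patients and is therefore larger than any individual hospital, which mirrors the 'mega hospital' concern raised above.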

We further compared various methods used to obtain a measure of the variability in the estimation of the

total effect decomposition. The approximate Bayesian resampling method yielded longer confidence intervals

more often for both the model-based total and direct effects compared with the other bootstrap methods.

These remaining methods produced very similar confidence interval widths for both the model-based and

semi-parametric approaches. The constrained multinomial model was implemented using the VGAM package

in R, which is time-consuming to run. Therefore, running a bootstrapping procedure with a

reasonable number of iterations is far more computationally expensive than using a smaller bootstrap with

a Normal approximation or using the pooled category multinomial model instead. As the resulting sampling

distributions are similar, we recommend using one of these methods rather than the large non-parametric

bootstrap with the constrained multinomial model.

[Figure 7.13: three panels (Total Effect, Indirect Effect, Direct Effect) showing the margin of error (ME) for Inst 1 to Inst 29 under each method: Pooled Bootstrap, Normal Approx., Constrained Bootstrap.]

Figure 7.13: Margin of error for the 95% confidence intervals of the semi-parametric estimators for each of the variance estimation methods: 500 iteration non-parametric bootstrap using the unconstrained multinomial model, the 50 iteration non-parametric bootstrap with Normal approximation and the 125 iteration non-parametric bootstrap using constrained multinomial model.

In general, both the model-based and semi-parametric approaches produce similar estimates of the total,

direct and indirect effects, with a couple of exceptions (i.e. hospitals 13 and 16). The large difference between

approaches in the estimates for these two hospitals may be due in part to the presence of very large LOS

values for some patients (exceeding 100 days). The semi-parametric approach does not allow adjustment for

covariates on the outcome pathway, and thus such extreme values may be pulling the estimates of the total,

direct and indirect effects away from the model-based analogues. This may also contribute to the much

larger variability in the semi-parametric estimates for these hospitals. Further, we did not fit interactions

between the hospital and the mediator in the outcome model used in the model-based estimates. By not

allowing an interaction, we are implicitly assuming that the effect of MIS on LOS must be the same for all

hospitals, which may not be entirely realistic and may thus explain the discrepancies observed between the

two estimation approaches. It is left as future work to determine whether the inclusion of such hospital-

mediator interactions would bring the model-based and semi-parametric estimates closer together.

Mediation analysis methods such as these should only be applied to mediators and outcomes for which

there exists a well-established causal relationship. As the care patients receive at a hospital consists of

combinations of various practices and policies, preliminary studies in quality improvement initiatives should

focus on identification of multiple causal relationships between these practices and patient outcomes of

interest. Only once such relationships are identified can the methods in this paper be applied to each pair

of causally related process and outcome. However, in such cases, the direct effect cannot identify specific

alternative areas of improvement, only that improvement is needed elsewhere to improve patient outcomes.

Therefore, future methodological work should extend the mediation analysis presented here to include

the effect decomposition across multiple mediators acting on an outcome of interest concurrently.

7.6 Supplemental Figures

[Figure S7.1: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 1 and a histogram of counts.]

Figure S7.1: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in age group. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.2: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 1.2 and a histogram of counts.]

Figure S7.2: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in ACG score. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.3: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 0.6 and a histogram of counts.]

Figure S7.3: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in Charlson comorbidity score. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.4: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 2 and a histogram of counts.]

Figure S7.4: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in days from diagnosis to nephrectomy. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.5: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 1.5 and a histogram of counts.]

Figure S7.5: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in income quintile. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.6: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 0.5 and a histogram of counts.]

Figure S7.6: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in sex. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.7: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 0.8 and a histogram of counts.]

Figure S7.7: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in tumour size (cm). Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.8: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 1 and a histogram of counts.]

Figure S7.8: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in tumour stage. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.

[Figure S7.9: heat map with Reference hospital and Index hospital axes (hospitals labelled from Inst 1 to Inst 56, plus 'Others'); colour key for SMD with ticks from 0 to 6 and a histogram of counts.]

Figure S7.9: Heat map of pairwise standardized mean differences (SMD) between hospitals for assessing imbalance in year of diagnosis. Red means small imbalances, yellow means larger imbalances. Legend shows the distribution of pairwise SMDs.
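The pairwise SMDs summarized in Figures S7.1 to S7.9 can be illustrated with a short Python sketch. It assumes a single numeric covariate and the common definition of the SMD as the absolute mean difference divided by the square root of the average of the two group variances; the definition used for categorical covariates such as age group may differ, and the tumour sizes below are hypothetical.

```python
import math
import statistics
from itertools import combinations

def smd(x, y):
    """Absolute standardized mean difference between two groups.

    Uses the square root of the average of the two sample variances,
    so both groups need at least two values and some variability.
    """
    pooled_sd = math.sqrt((statistics.variance(x) + statistics.variance(y)) / 2)
    return abs(statistics.mean(x) - statistics.mean(y)) / pooled_sd

def pairwise_smd(covariate_by_hospital):
    """Return {(hospital_a, hospital_b): SMD} for every unordered pair."""
    return {
        (a, b): smd(covariate_by_hospital[a], covariate_by_hospital[b])
        for a, b in combinations(sorted(covariate_by_hospital), 2)
    }

# Hypothetical tumour sizes (cm) for three hospitals
tumour_size = {
    "Inst 1": [3.0, 4.0, 5.0, 6.0],
    "Inst 2": [3.5, 4.5, 5.5, 6.5],
    "Inst 3": [5.0, 6.0, 7.0, 8.0],
}

result = pairwise_smd(tumour_size)
for pair, value in result.items():
    print(pair, round(value, 2))
```

Each entry of `result` corresponds to one cell of a heat map of this kind, with larger values indicating greater imbalance between the two hospitals.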

Mediator Model Estimates (plotted on an odds ratio axis, log scale)

Variable                          OR      95% CI
Intercept                         31.23   (19.60, 49.75)
Age Group 50−59yrs vs 0−49yrs     0.99    (0.78, 1.26)
Age Group 60−69yrs vs 0−49yrs     0.84    (0.66, 1.06)
Age Group 70−79yrs vs 0−49yrs     0.84    (0.65, 1.09)
Age Group 80+yrs vs 0−49yrs       0.75    (0.53, 1.08)
Male vs Female                    0.93    (0.79, 1.09)
Income Quintile                   0.98    (0.93, 1.04)
Diagnosis Year                    1.35    (1.32, 1.39)
Charlson Score                    0.88    (0.84, 0.93)
ACG score                         1.00    (1.00, 1.01)
Days from Diagnosis to Surgery    1.04    (0.99, 1.09)
Tumour size                       0.87    (0.83, 0.91)
Tumour stage T2 vs T1             0.72    (0.54, 0.96)
Inst 2 vs Inst 1                  0.03    (0.02, 0.04)
Inst 3 vs Inst 1                  0.05    (0.03, 0.09)
Inst 4 vs Inst 1                  0.22    (0.12, 0.39)
Inst 5 vs Inst 1                  0.33    (0.18, 0.59)
Inst 6 vs Inst 1                  0.02    (0.01, 0.04)
Inst 7 vs Inst 1                  0.04    (0.02, 0.07)
Inst 8 vs Inst 1                  0.03    (0.02, 0.05)
Inst 9 vs Inst 1                  0.12    (0.07, 0.23)
Inst 10 vs Inst 1                 0.05    (0.03, 0.10)
Inst 11 vs Inst 1                 0.12    (0.07, 0.22)
Inst 12 vs Inst 1                 0.04    (0.02, 0.07)
Inst 13 vs Inst 1                 0.01    (0.01, 0.03)
Inst 14 vs Inst 1                 0.03    (0.02, 0.05)
Inst 15 vs Inst 1                 0.06    (0.03, 0.11)
Inst 16 vs Inst 1                 0.11    (0.06, 0.20)
Inst 17 vs Inst 1                 0.05    (0.02, 0.08)
Inst 18 vs Inst 1                 0.04    (0.02, 0.07)
Inst 19 vs Inst 1                 0.04    (0.02, 0.08)
Inst 20 vs Inst 1                 0.07    (0.04, 0.14)
Inst 21 vs Inst 1                 0.02    (0.01, 0.03)
Inst 22 vs Inst 1                 0.01    (0.01, 0.03)
Inst 23 vs Inst 1                 0.04    (0.02, 0.08)
Inst 24 vs Inst 1                 0.04    (0.02, 0.07)
Inst 25 vs Inst 1                 0.18    (0.08, 0.37)
Inst 26 vs Inst 1                 0.13    (0.06, 0.28)
Inst 27 vs Inst 1                 0.09    (0.04, 0.20)
Inst 28 vs Inst 1                 0.02    (0.01, 0.05)
Inst 29 vs Inst 1                 0.10    (0.05, 0.21)
Inst 30 vs Inst 1                 0.03    (0.01, 0.06)
Inst 31 vs Inst 1                 0.10    (0.05, 0.22)
Inst 33 vs Inst 1                 0.01    (0.00, 0.02)
Inst 34 vs Inst 1                 0.03    (0.01, 0.07)
Inst 35 vs Inst 1                 0.27    (0.10, 0.74)
Inst 36 vs Inst 1                 0.06    (0.03, 0.14)
Inst 37 vs Inst 1                 0.15    (0.06, 0.37)
Inst 38 vs Inst 1                 0.21    (0.08, 0.53)
Inst 39 vs Inst 1                 0.02    (0.01, 0.04)
Inst 40 vs Inst 1                 0.03    (0.01, 0.06)
Inst 41 vs Inst 1                 0.02    (0.01, 0.05)
Inst 42 vs Inst 1                 0.09    (0.03, 0.25)
Inst 43 vs Inst 1                 0.18    (0.07, 0.46)
Inst 44 vs Inst 1                 0.07    (0.03, 0.17)
Inst 45 vs Inst 1                 0.01    (0.01, 0.03)
Inst 46 vs Inst 1                 0.02    (0.01, 0.05)
Inst 47 vs Inst 1                 0.00    (0.00, 0.01)
Inst 48 vs Inst 1                 0.01    (0.01, 0.03)
Inst 49 vs Inst 1                 0.00    (0.00, 0.01)
Inst 50 vs Inst 1                 0.12    (0.04, 0.31)
Inst 51 vs Inst 1                 0.23    (0.07, 0.74)
Inst 52 vs Inst 1                 0.27    (0.06, 1.14)
Inst 53 vs Inst 1                 0.13    (0.04, 0.40)
Inst 54 vs Inst 1                 0.01    (0.00, 0.03)
Inst 55 vs Inst 1                 0.06    (0.02, 0.19)
Inst 56 vs Inst 1                 0.01    (0.00, 0.05)
Others vs Inst 1                  0.03    (0.01, 0.09)

Figure S7.10: Caterpillar plot of the parameter estimates and 95% confidence intervals of the mediator model used in the model-based and semi-parametric estimators of the total effect decomposition. Here, all hospitals treating fewer than 9 patients are pooled into a single category (‘Others’).

Outcome Model Estimates (plotted on a regression coefficient axis)

Variable                          Coef       95% CI
Intercept                         1.17E+12   (3E+08, 5E+01)
MIS vs Open surgery               0.79       (0.77, 0.81)
Age Group 50−59yrs vs 0−49yrs     1.05       (1.01, 1.09)
Age Group 60−69yrs vs 0−49yrs     1.10       (1.06, 1.14)
Age Group 70−79yrs vs 0−49yrs     1.20       (1.16, 1.25)
Age Group 80+yrs vs 0−49yrs       1.35       (1.28, 1.43)
Male vs Female                    0.98       (0.95, 1.00)
Income Quintile                   0.98       (0.97, 0.99)
Diagnosis Year                    0.99       (0.98, 0.99)
Charlson Score                    1.04       (1.03, 1.05)
ACG score                         1.00       (1.00, 1.00)
Days from Diagnosis to Surgery    1.01       (1.01, 1.02)
Tumour size                       1.01       (1.00, 1.01)
Tumour stage T2 vs T1             1.07       (1.02, 1.11)
Inst 2 vs Inst 1                  0.97       (0.91, 1.04)
Inst 3 vs Inst 1                  1.07       (1.00, 1.15)
Inst 4 vs Inst 1                  1.05       (0.97, 1.12)
Inst 5 vs Inst 1                  1.17       (1.09, 1.26)
Inst 6 vs Inst 1                  1.12       (1.04, 1.21)
Inst 7 vs Inst 1                  1.17       (1.09, 1.27)
Inst 8 vs Inst 1                  0.99       (0.91, 1.08)
Inst 9 vs Inst 1                  1.10       (1.01, 1.20)
Inst 10 vs Inst 1                 0.93       (0.86, 1.02)
Inst 11 vs Inst 1                 1.04       (0.95, 1.13)
Inst 12 vs Inst 1                 1.47       (1.35, 1.60)
Inst 13 vs Inst 1                 1.21       (1.11, 1.32)
Inst 14 vs Inst 1                 1.12       (1.03, 1.22)
Inst 15 vs Inst 1                 1.01       (0.93, 1.10)
Inst 16 vs Inst 1                 1.14       (1.05, 1.24)
Inst 17 vs Inst 1                 1.10       (1.01, 1.20)
Inst 18 vs Inst 1                 1.18       (1.08, 1.30)
Inst 19 vs Inst 1                 1.14       (1.03, 1.25)
Inst 20 vs Inst 1                 1.07       (0.97, 1.18)
Inst 21 vs Inst 1                 1.22       (1.11, 1.34)
Inst 22 vs Inst 1                 1.24       (1.13, 1.37)
Inst 23 vs Inst 1                 1.12       (1.02, 1.23)
Inst 24 vs Inst 1                 1.23       (1.11, 1.36)
Inst 25 vs Inst 1                 1.16       (1.05, 1.29)
Inst 26 vs Inst 1                 1.01       (0.90, 1.12)
Inst 27 vs Inst 1                 1.27       (1.14, 1.42)
Inst 28 vs Inst 1                 1.19       (1.06, 1.33)
Inst 29 vs Inst 1                 1.06       (0.95, 1.19)
Inst 30 vs Inst 1                 1.04       (0.92, 1.16)
Inst 31 vs Inst 1                 1.05       (0.93, 1.18)
Inst 33 vs Inst 1                 1.15       (1.02, 1.30)
Inst 34 vs Inst 1                 1.07       (0.95, 1.22)
Inst 35 vs Inst 1                 0.88       (0.78, 0.99)
Inst 36 vs Inst 1                 1.06       (0.93, 1.21)
Inst 37 vs Inst 1                 0.92       (0.81, 1.04)
Inst 38 vs Inst 1                 0.73       (0.64, 0.83)
Inst 39 vs Inst 1                 0.91       (0.80, 1.04)
Inst 40 vs Inst 1                 1.10       (0.96, 1.26)
Inst 41 vs Inst 1                 0.98       (0.86, 1.13)
Inst 42 vs Inst 1                 1.29       (1.12, 1.47)
Inst 43 vs Inst 1                 1.07       (0.93, 1.22)
Inst 44 vs Inst 1                 1.19       (1.03, 1.37)
Inst 45 vs Inst 1                 1.19       (1.03, 1.37)
Inst 46 vs Inst 1                 0.93       (0.81, 1.07)
Inst 47 vs Inst 1                 0.96       (0.83, 1.12)
Inst 48 vs Inst 1                 1.25       (1.07, 1.45)
Inst 49 vs Inst 1                 1.18       (1.02, 1.38)
Inst 50 vs Inst 1                 0.86       (0.74, 1.00)
Inst 51 vs Inst 1                 1.05       (0.91, 1.23)
Inst 52 vs Inst 1                 0.89       (0.76, 1.04)
Inst 53 vs Inst 1                 1.02       (0.85, 1.21)
Inst 54 vs Inst 1                 1.12       (0.94, 1.33)
Inst 55 vs Inst 1                 0.90       (0.75, 1.08)
Inst 56 vs Inst 1                 1.20       (0.96, 1.49)
Others vs Inst 1                  1.12       (0.95, 1.32)

Figure S7.11: Caterpillar plot of the parameter estimates and 95% confidence intervals of the outcome model used in the model-based estimators of the total effect decomposition. Here, all hospitals treating fewer than 9 patients are pooled into a single category (‘Others’).

[Figure S7.12: three panels (Total Effect, Indirect Effect, Direct Effect), each on an SMR axis, showing the model-based and semi-parametric estimators with the original estimate marked, for providers Inst 1 to Inst 29; listed volumes range from 50 (Inst 29) to 454 (Inst 1).]

Figure S7.12: Boxplots of 95% confidence intervals of model-based and semi-parametric estimators of the total effect decomposition when pooling hospitals who treat fewer than 9 patients and fitting a constrained multinomial model specified in (7.11). Variability was estimated via a 125 iteration non-parametric bootstrap.

[Figure S7.13: three panels (Total Effect, Indirect Effect, Direct Effect), each on an SMR axis, showing the model-based estimator with the original estimate marked, for providers Inst 1 to Inst 29; listed volumes range from 50 (Inst 29) to 454 (Inst 1).]

Figure S7.13: Boxplots of 95% confidence intervals of model-based estimators of the total effect decomposition when pooling hospitals who treat fewer than 9 patients and fitting a constrained multinomial model specified in (7.11). Variability was estimated via an approximate Bayesian method that resamples fitted model parameters.

Chapter 8

Discussion

8.1 Limitations and Future Considerations

8.1.1 Causal Inference and Assumptions

This thesis aimed to address common methodological problems that arise in research comparing hospital quality

of care by adopting a causal modelling framework. In particular, we considered comparing quality

using the indirectly standardized mortality ratio where the reference level of care is the national or provincial

average level of care. Quality comparisons are inherently causal (Dowd, 2011) as they attempt to quantify the

effect of the hospital of treatment (i.e. exposure) on some performance measure or outcome of interest. The

advantage of formulating standardized quality metrics such as the SMR using explicit causal language

is that it forces the researcher to be aware of the assumptions required to make valid causal conclusions

about hospital quality. These assumptions, detailed in Chapters 2, 4, and 6, consist in part of various no

unmeasured confounding assumptions, also referred to as conditional exchangeability assumptions. By being

forced to make these assumptions, the researcher must acknowledge the limitations of the available data for

case-mix adjustment as well as the possibility for misleading causal conclusions if important confounders

are not available. Not all administrative and observational databases are created equal, thus the number

of variables and the quality of the information collected will vary. By being cognizant of the assumptions

necessary for valid causal conclusions, perhaps change can be effected in how health services research data

are collected.

8.1.2 Variables for Case-mix Adjustment

The assumption of no unmeasured confounders is necessary to account for any patient level factors that

may influence the causal relationship between the exposure Z and the outcome Y. Often, as was the case

in the analyses presented in Chapters 3 and 7, these variables will include patient sociodemographic or

socioeconomic factors such as gender, measures of socioeconomic status, or race. It may be perceived that

by choosing to adjust the exposure-outcome relationship for such factors one is accepting that such factors

indeed affect the care being provided to patients. For example, inclusion of patient income level in the

adjustment may be perceived as acknowledging that low income patients are inherently treated differently

165

Page 177: Causal Inference Methodology for Comparisons of …...Abstract Causal Inference Methodology for Comparisons of Hospital Quality of Care Katherine Daignault Doctor of Philosophy Graduate

than high income patients, i.e. acknowledging that care disparities exist. However, to not adjust for such

factors would result in unfair comparisons as the hospitals that treat the low income patients would be

disproportionately penalized by assuming that all hospitals treat the same patients and thus the care must

be equal for all patients when in reality this is not the case. Therefore, the issue is not whether there are

sociodemographic disparities in care (i.e. existence of an effect ofX → Y ), which would be a different research

problem, but if such disparities exist they must be included in the adjustment in order for unconfounded

results in the institutional comparison.
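The role of case-mix adjustment described above can be illustrated with a small numerical sketch. The hospital labels, income strata, and counts below are entirely hypothetical and are not taken from the analyses in this thesis. Simple stratified indirect standardization is used here as a stand-in for the model-based estimators of the earlier chapters. Both hypothetical hospitals have identical stratum-specific mortality, so they provide the same care, but they treat different mixes of low and high income patients.

```python
# Hypothetical counts: (patients, deaths) per income stratum at two hospitals.
# Stratum-specific mortality is identical at A and B (10% low, 5% high income).
hospitals = {
    "A": {"low": (800, 80), "high": (200, 10)},
    "B": {"low": (200, 20), "high": (800, 40)},
}

def reference_rates(hospitals):
    """System-wide mortality rate within each stratum (the reference level of care)."""
    totals = {}
    for strata in hospitals.values():
        for stratum, (n, d) in strata.items():
            n_tot, d_tot = totals.get(stratum, (0, 0))
            totals[stratum] = (n_tot + n, d_tot + d)
    return {s: d / n for s, (n, d) in totals.items()}

def crude_rate(strata):
    """Unadjusted mortality rate for one hospital."""
    n = sum(n for n, _ in strata.values())
    d = sum(d for _, d in strata.values())
    return d / n

def smr(strata, rates):
    """Observed deaths divided by deaths expected under the reference rates."""
    observed = sum(d for _, d in strata.values())
    expected = sum(n * rates[s] for s, (n, _) in strata.items())
    return observed / expected

rates = reference_rates(hospitals)
for name, strata in hospitals.items():
    print(name, "crude rate:", crude_rate(strata), "SMR:", round(smr(strata, rates), 3))
```

In this sketch the crude mortality rate makes hospital A look worse than hospital B purely because A treats more low income patients, while the standardized ratio is the same for both hospitals.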

8.1.3 Sensitivity of Results to Assumptions

While such no unmeasured confounders assumptions are necessary for causal effects to be identifiable, they are inherently untestable (VanderWeele and Vansteelandt, 2010). If the quality of clinical and administrative data cannot be improved, then methods are needed to assess the impact of unmeasured confounders on causal conclusions. A number of authors have proposed sensitivity analyses for the effect of unmeasured confounders in the estimation of causal effects (Brumback et al., 2004; Blackwell, 2014), as well as in mediation analysis (VanderWeele, 2010; Hafeman, 2011; Tchetgen Tchetgen and Shipster, 2014; VanderWeele and Chiba, 2014), but not in the context of health services research and hospital comparisons. Therefore, a possible future direction for the work presented in this thesis would be to develop sensitivity analyses for the proposed doubly robust estimator (Chapter 4) and the estimators of the total effect decomposition (Chapters 6 and 7). Such sensitivity analyses would focus on assessing the impact of the conditional independence/exchangeability assumptions, as Chapter 5 has already considered sensitivity to the positivity assumption.

8.1.4 Variability of Proposed Estimators

The variability of the estimators proposed in Chapters 4 and 6 was estimated using nonparametric bootstrap methods. These methods, when used in conjunction with a multinomial hospital assignment model with constrained estimation of covariate effects, as proposed in Chapters 4 and 7, can be computationally burdensome. The lack of an explicit variance estimator for the proposed estimators has prompted this reliance on the bootstrap to obtain sampling distributions. Some authors have developed variance estimators for the SMR (Hosmer and Lemeshow, 1995; Talbot et al., 2011), but not for the case of a doubly robust estimator or mediation analysis. Therefore, one possible future direction is to develop asymptotic variance estimators for the proposed methods in order to reduce the computational expense required by the bootstrap.
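To make the computational cost of the bootstrap concrete, the following is a minimal sketch of a nonparametric bootstrap percentile interval for an SMR. It is an illustration only: the simulated data, the function names (smr_for, bootstrap_ci, simulate), and the use of simple stratum-based indirect standardization are assumptions made here, not the doubly robust or mediation estimators of Chapters 4 and 6. With those estimators, every bootstrap replicate would also require refitting the outcome and hospital assignment models, which is where the burden arises.

```python
import random

def smr_for(data, hospital):
    """Indirectly standardized ratio for `hospital`: observed deaths divided by
    deaths expected under system-wide stratum-specific mortality rates.
    `data` is a list of (hospital, stratum, died) records with died in {0, 1}."""
    n_by, d_by = {}, {}
    for _, stratum, died in data:
        n_by[stratum] = n_by.get(stratum, 0) + 1
        d_by[stratum] = d_by.get(stratum, 0) + died
    observed = expected = 0.0
    for hosp, stratum, died in data:
        if hosp == hospital:
            observed += died
            expected += d_by[stratum] / n_by[stratum]
    return observed / expected if expected > 0 else float("nan")

def bootstrap_ci(data, hospital, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval: resample patients with replacement and
    recompute the SMR on every replicate."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        est = smr_for(sample, hospital)
        if est == est:  # drop undefined (NaN) replicates
            reps.append(est)
    reps.sort()
    lower = reps[int((alpha / 2) * len(reps))]
    upper = reps[min(len(reps) - 1, int((1 - alpha / 2) * len(reps)))]
    return lower, upper

def simulate(seed=1, n=2000):
    """Hypothetical data: three hospitals of equal quality, two risk strata."""
    rng = random.Random(seed)
    risk = {"low": 0.05, "high": 0.20}
    data = []
    for _ in range(n):
        hosp = rng.choice(["A", "B", "C"])
        stratum = rng.choice(["low", "high"])
        died = 1 if rng.random() < risk[stratum] else 0
        data.append((hosp, stratum, died))
    return data

data = simulate()
point = smr_for(data, "A")
lower, upper = bootstrap_ci(data, "A")
print("SMR:", round(point, 3), "95% percentile interval:", (round(lower, 3), round(upper, 3)))
```

Each replicate repeats the full estimation on a resampled data set, so the total cost scales with the number of replicates times the cost of a single fit. A closed-form asymptotic variance would need only one fit.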

8.1.5 Profiling using Multiple Indicators

In practice, hospital comparisons of quality would be based on numerous aspects of the disease-specific care being provided. The methods presented in this thesis have focused on a single indicator of care at a time. While the methods can be applied to each indicator of interest, they do not provide a combined assessment of hospital care. In particular, the mediation analysis methods proposed in Chapter 6 decompose the total hospital effect on some patient outcome into the indirect effect acting through some hospital practice and the direct effect representing all other practices. Thus the direct effect only provides insight into whether there are remaining variations in care after intervention, but cannot be used to determine which other hospital practices are the cause. Policy makers would be interested in incorporating multiple mediators along the pathway between hospital of treatment and the patient outcome to identify the optimal target for an intervention on care. Decomposing the total hospital effect on the outcome into multiple mediator pathways, by considering multiple mediators either in parallel or in series, would provide multiple indirect effects representing how interventions on each mediator would affect the outcome. Some work has been done to incorporate multiple mediators in a mediation analysis (Albert and Nelson, 2011; VanderWeele and Vansteelandt, 2013; Daniel et al., 2015), but these approaches do not consider the SMR as the causal estimand, nor are they set in the context of hospital profiling. Thus a natural future direction would be to adapt the methods of Chapter 6 to accommodate multiple mediators along the pathway of care.

8.1.6 Quality Improvement over Time

Policy makers would also be interested in assessing whether interventions made to improve hospital care were successful by tracking each hospital’s care practices over time. Such comparisons over time could be performed at the within-hospital level, where the indicator for a specific hospital would be compared to itself at various time points in an effort to determine whether care has in fact improved, perhaps through some intervention on hospital practices. Further, between-hospital comparisons over time would also be valuable for determining whether hospitals remain outliers in care. Some work has been done to address within-hospital comparisons over time (Marshall et al., 1998; Bronskill et al., 2002; Daniels and Normand, 2006) through the use of hierarchical modelling, but not under a causal inference framework. There has also been work that considers the occurrence of regression to the mean when modelling recent changes to a quality indicator over time (Jones and Spiegelhalter, 2009; Gajewski and Dunton, 2013; Kasza et al., 2015), but this work likewise does not make use of the causal inference framework. A future direction of this work could be to develop doubly robust estimators for such hospital comparisons over time for the causal estimand derived in Chapter 4.

8.2 Impact of Thesis

With limited financial resources available to our universal health care system, and the constant improvement and introduction of novel treatments, it is imperative for policy makers and government to adequately quantify the quality of care being provided by hospitals. In particular, quality improvement should focus on initiatives that demonstrate measurable benefits for patient outcomes. With a recent shift towards increased provider transparency and accountability for patient care, there is much need for the development of measures that can benchmark the quality of care that patients receive across the country, as well as methodology that can provide fair comparisons of hospitals at a nationwide or provincial level. The statistical methods developed in this thesis contribute towards better care delivery by enabling policy makers to better identify hospitals providing poor care to patients, so that improvements can be implemented where most needed. The proposed doubly robust estimator provides fair comparisons between hospitals as long as one of the causal pathways is correctly specified. Policy makers can therefore be more confident that the results of the hospital comparisons and outlier identification reflect true hospital performance. Further, by considering the reference care level as the national/provincial average, the proposed methods rely on fewer assumptions than other comparisons, so there are fewer opportunities for misleading conclusions about the quality of care provided. Thus policy makers can be more confident in their detection of hospitals with outlying care. Finally, when devising care improvement strategies with limited resources, policy makers need to target intervention strategies at the hospitals that would benefit most. The proposed mediation analysis methods provide stakeholders with the necessary information to guide their interventions towards areas that show the greatest benefit to patient outcomes, thus avoiding costly improvement initiatives that may not result in improved care delivery. This thesis provides policy makers with valuable tools for making optimal care improvement decisions to the benefit of stakeholders and patients alike.


Bibliography

J. Albert and S. Nelson. Generalized causal mediation analysis. Biometrics, 67(3):1028–1038, September

2011.

D. Alwin and R. Hauser. The decomposition of effects in path analysis. American Sociological Review, 40:

37–47, 1975.

P. Austin, D. Alter, and J. Tu. The use of fixed- and random-effects models for classifying hospitals as

mortality outliers: a Monte Carlo assessment. Med Decis Making, 23:526–539, 2003.

P. C. Austin, C. D. Naylor, and J. V. Tu. A comparison of a Bayesian vs. a frequentist method for profiling

hospital performance. Journal of Evaluation in Clinical Practice, 7(1):35–45, 2001.

H. Bang and J. M. Robins. Doubly robust estimation in missing data and causal inference models. Biometrics,

61:962–972, 2005.

R. Baron and D. Kenny. The moderator-mediator variable distinction in social psychological research:

Conceptual, strategic, and statistical considerations. J Pers Soc Psychol, 51(6):1173–1182, 1986.

C. B. Begg, E. R. Riedel, P. B. Bach, M. W. Kattan, D. Schrag, J. L. Warren, and P. T. Scardino. Variations

in morbidity after radical prostatectomy. New England Journal of Medicine, 346(15):1138–1144, 2002.

J. D. Birkmeyer, J. B. Dimick, and N. J. Birkmeyer. Measuring the quality of surgical care: Structure, process, or outcomes? Journal of the American College of Surgeons, 198(4):626–632, 2004.

M. Blackwell. A selection bias approach to sensitivity analysis for causal effects. Political Analysis, 22(2):

169–182, 2014.

D. Boffa, J. Rosen, K. Mallin, et al. Using the national cancer database for outcomes research: A review. Journal of the American Medical Association Oncology, 3:1722–1728, 2017.

L. A. Bragayrac, D. Abbotoy, K. Attwood, F. Darwiche, J. Hoffmeyer, E. C. Kauffman, and T. Schwaab.

Outcome of minimal invasive vs open radical nephrectomy for the treatment of locally advanced renal-cell

carcinoma. Journal of Endourology, 30(8):871–876, August 2016.

T. B. Brakenhoff, K. G. Moons, J. Kluin, and R. H. Groenwold. Investigating risk adjustment methods for

health care provider profiling when observations are scarce or events rare. Health Services Insights, 11:

1–10, 2018. doi: 10.1177/1178632918785133.


S. E. Bronskill, S.-L. T. Normand, M. B. Landrum, and R. A. Rosenheck. Longitudinal profiles of health

care providers. Statistics in Medicine, 21:1067–1088, 2002.

B. Brumback, M. A. Hernan, S. Haneuse, and J. M. Robins. Sensitivity analyses for unmeasured confounding

assuming a marginal structural model for repeated measures. Statistics in Medicine, 23(5):749–767, March

2004.

J. F. Burgess, C. L. Christiansen, S. E. Michalak, and C. N. Morris. Medical profiling: improving standards

and risk adjustments using hierarchical models. Journal of Health Economics, 19:291–309, 2000.

K. Chamie, S. Williams, and J. Hu. Population-based assessment of determining treatments for prostate cancer. Journal of the American Medical Association Oncology, 1:60–67, 2015.

Y. W. Cheng, A. Hubbard, A. B. Caughey, and I. B. Tager. The association between persistent fetal

occiput posterior position and perinatal outcomes: an example of propensity score and covariate distance

matching. American Journal of Epidemiology, 171(6):656–663, 2010.

C. L. Christiansen and C. N. Morris. Improving the statistical approach to health care provider profiling.

Annals of Internal Medicine, 127:764–768, 1997.

W. G. Cochran. Analysis of covariance: Its nature and uses. Biometrics, 13(3):261–281, September 1957.

E. A. Codman. The shoulder: Rupture of the supraspinatus tendon and other lesions in or about the

subacromial bursa, volume V-XL. Thomas Todd Co., Boston, 1934.

S. R. Cole and C. E. Frangakis. The consistency statement in causal inference: a definition or an assumption?

Epidemiology, 20(1):3–5, January 2009.

S. R. Cole and M. A. Hernan. Constructing inverse probability weights for marginal structural models.

American Journal of Epidemiology, 168(6):656–664, 2008.

J. Crook, M. Milosevic, P. Catton, I. Yeung, T. Tran, C. Catton, M. McLean, T. Panzarella, and M. Haider. Interobserver variation in postimplant computed tomography contouring affects quality assessment of prostate brachytherapy. Brachytherapy, 1(2):66–73, 2002.

K. Daignault and O. Saarela. Doubly robust estimator for indirectly standardized mortality ratios. Epidemiologic Methods, 6(1), 2017. doi: 10.1515/em-2016-0016.

K. Daignault, K. A. Lawson, A. Finelli, and O. Saarela. Causal mediation analysis for standardized mortality

ratios. Epidemiology, 30:532–540, 2019.

R. Daniel, B. De Stavola, S. Cousens, and S. Vansteelandt. Causal mediation analysis with multiple mediators. Biometrics, 71(1):1–14, March 2015.

M. J. Daniels and S.-L. T. Normand. Longitudinal profiling of health care units based on continuous and

discrete patient outcomes. Biostatistics, 7(2):1–15, 2006.

E. R. DeLong, E. D. Peterson, D. M. DeLong, L. H. Muhlbaier, S. Hackett, and D. B. Mark. Comparing

risk-adjustment methods for provider profiling. Statistics in Medicine, 16:2645–2664, 1997.


J. Dimick, D. Staiger, N. Osborne, et al. Composite measures for rating hospital quality with major surgery. Health Services Research, 47:1861–1879, 2012.

A. Donabedian. The quality of medical care. American Association for the Advancement of Science, 200

(4344):856–864, 1978.

A. Donabedian. The quality of care: How can it be assessed? Journal of the American Medical Association,

260(12):1743–1748, 1988.

B. E. Dowd. Separated at birth: Statisticians, social scientists, and causality in health services research.

Health Research and Educational Trust, 46(2):397–420, 2011. doi: 10.1111/j.1475-6773.2010.01203.x.

D. Draper and M. Gittoes. Statistical analysis of performance indicators in UK higher education. Journal

of the Royal Statistical Society, Series A, 167(3):449–474, 2004.

B. Efron and R. Tibshirani. Bootstrap methods for standard errors, confidence intervals, and other measures

of statistical accuracy. Statistical Science, 1(1):54–77, 1986.

M. Egger, G. D. Smith, M. Schneider, and C. Minder. Bias in meta-analysis detected by a simple, graphical test. BMJ, 315:629–634, 1997. doi: 10.1136/bmj.315.7109.629.

L. Ellison, J. Heaney, and J. D. Birkmeyer. Trends in the use of radical prostatectomy for treatment of

prostate cancer. Effective Clinical Practice, 2:228–233, 1999.

S. Evans, J. Millar, C. Moore, et al. Cohort profile: the TrueNTH global registry - an international registry to monitor and improve localised prostate cancer health outcomes. BMJ Open, 7:e017006, 2017.

P. D. Faris, W. A. Ghali, and R. Brant. Bias in estimates of confidence intervals for health outcome report

cards. Journal of Clinical Epidemiology, 56:553–558, 2003.

L. Flynn, Y. Liang, G. L. Dickson, and L. H. Aiken. Effects of nursing practice environments on quality

outcomes in nursing homes. Journal of the American Geriatric Society, 58:2401–2406, 2010.

T. Freeman. Using performance indicators to improve health care quality in the public sector: a review of

the literature. Health Services Management Research, 15(2), May 2002.

M. J. Funk, D. Westreich, C. Wiesen, T. Sturmer, A. M. Brookhart, and M. Davidian. Doubly robust

estimation of causal effects. American Journal of Epidemiology, 173:761–767, 2011.

B. J. Gajewski and N. Dunton. Identifying individual changes in performance with composite quality indicators while accounting for regression to the mean. Medical Decision Making, 33:396–406, 2013.

B. J. Gajewski, J. D. Mahnken, and N. Dunton. Improving quality indicator report cards through Bayesian modeling. BMC Medical Research Methodology, 8(77), 2008. doi: 10.1186/1471-2288-8-77.

P. Gandaglia, F. Bray, M. Cooperberg, et al. Prostate cancer registries: Current status and future directions. European Urology, 69:998–1012, 2016.


A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis,

chapter 4: Asymptotics and connections to non-Bayesian approaches. CRC Press, third edition, 2013.

P. Godley. Racial differences in mortality among Medicare recipients after treatment for localized prostate cancer. Cancer Spectrum Knowledge Environment, 95:1702–1710, 2003.

H. Goldstein and D. J. Spiegelhalter. League tables and their limitations: Statistical issues in comparisons

of institutional performance. Journal of the Royal Statistical Society, Series A, 159:385–409, 1996.

J. L. Gore, J. L. Wright, K. B. Daratha, K. P. Roberts, D. W. Lin, H. Wessells, and M. Porter. Hospital-level

variation in the quality of urologic cancer surgery. Cancer, 118(4):987–996, February 2012.

S. Greenland and J. M. Robins. Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology, 15(3):412–418, 1986.

D. M. Hafeman. Confounding of indirect effects: A sensitivity analysis exploring the range of bias due to a

cause common to both the mediator and the outcome. American Journal of Epidemiology, 174(6):710–717,

2011.

L. Harlan, A. Potosky, F. Gilliland, R. Hoffman, P. Albertsen, A. Hamilton, J. Eley, J. Stanford, and

R. Stephenson. Factors associated with initial therapy for clinically localized prostate cancer: prostate

cancer outcomes study. Journal of the National Cancer Institute, 93(24):1864–1871, December 19 2001.

M. A. Hernan. A definition of causal effect for epidemiological research. Journal of Epidemiology and

Community Health, 58:265–271, 2004. doi: 10.1136/jech.2002.006361.

M. A. Hernan and J. M. Robins. Estimating causal effects from epidemiological data. Journal of Epidemiology

and Community Health, 60:578–586, 2006.

L. Herrel, S. Kaufman, P. Yan, et al. Health care integration and quality among men with prostate cancer. The Journal of Urology, 2016.

K. Hoffman, J. Niu, Y. Shen, et al. Physician variation in management of low-risk prostate cancer: a population-based cohort study. Journal of the American Medical Association Internal Medicine, 174:1450–1459, 2014.

P. W. Holland. Statistics and causal inference. Journal of the American Statistical Association, 81(396):

945–960, 1986.

D. W. Hosmer and S. Lemeshow. Confidence interval estimates of an index of quality performance based on

logistic regression models. Statistics in Medicine, 14:2161–2172, 1995.

P. P. Howley and R. Gibberd. Using hierarchical models to analyse clinical indicators: a comparison of the gamma-Poisson and beta-binomial models. International Journal for Quality in Health Care, 15(4):319–329, 2003.

I. C. Huang, C. Frangakis, F. Dominici, G. B. Diette, and A. W. Wo. Application of a propensity score

approach for risk adjustment in profiling multiple physician groups on asthma care. Health Services

Research, 40(1):253–278, 2005.


D. Hume. An enquiry concerning human understanding. Open Court Publishing Co. 1949, Lasalle, Ill., 1748.

H. Hyman. Survey Design and Analysis: Principles, Cases and Procedures. Free Press, Glencoe, IL., 1955.

K. Imai, L. Keele, and D. Tingley. A general approach to causal mediation analysis. Psychological Methods,

15(4):309–334, 2010a.

K. Imai, L. Keele, D. Tingley, and T. Yamamoto. Causal mediation analysis using R. In H. Vinod, editor, Advances in Social Science Research Using R, pages 129–154, New York, 2010b. Springer.

L. James and J. Brett. Mediators, moderators, and tests for mediation. Journal of Applied Psychology, 69:

307–321, 1984.

B. Jarman, S. Gault, B. Alves, A. Hider, S. Dolan, A. Cook, B. Hurwitz, and L. I. Iezzoni. Explaining differences in English hospital death rates using routinely collected data. BMJ, 318(7197):1515–1520, June 1999.

B. Jarman, D. Pieter, A. A. van der Veen, R. B. Kool, P. Aylin, A. Bottle, G. P. Westert, and S. Jones. The

hospital standardised mortality ratio: a powerful tool for Dutch hospitals to assess their quality of care?

BMJ Quality & Safety, 19(1):9–13, 2010. doi: 10.1136/qshc.2009.032953.

M. Joffe, D. Small, and C.-Y. Hsu. Defining and estimating intervention effects for groups that will develop

an auxiliary outcome. Statistical Science, 22:74–97, 2007.

H. E. Jones and D. J. Spiegelhalter. Accounting for regression-to-the-mean in tests for recent changes in

institutional performance: Analysis and power. Statistics in Medicine, 28:1645–1667, 2009.

H. E. Jones and D. J. Spiegelhalter. The identification of “unusual” health-care providers from a hierarchical

model. The American Statistician, 65(3):154–163, 2011.

H. E. Jones, D. I. Ohlssen, and D. J. Spiegelhalter. Use of the false discovery rate when comparing multiple

health care providers. Journal of Clinical Epidemiology, 61:232–240, 2008.

C. Judd and D. Kenny. Process analysis: Estimating mediation in treatment evaluations. Evaluation Review,

5:602–619, 1981.

S. A. Julious, J. Nicholl, and S. George. Why do we continue to use standardized mortality ratios for small

area comparisons? Journal of Public Health Medicine, 23(1):40–46, 2001.

J. D. Kang and J. L. Schafer. Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22:523–539, 2007.

J. Kasza, J. L. Moran, and P. J. Solomon. Assessing changes over time in healthcare provider performance:

Addressing regression to the mean over multiple time points. Biometrical Journal, 57(2):271–285, 2015.

N. Keiding and D. Clayton. Standardization and control for confounding in observational studies: A historical perspective. Statistical Science, 29:529–558, 2014.


H. Kraemer, M. Kiernan, M. Essex, and D. Kupfer. How and why the criteria defining moderators and mediators differ between the Baron & Kenny and MacArthur approaches. Health Psychology, 27(2 Suppl.):S101–S108, 2008.

T. Krupski, L. Kwan, and A. Afifi. Geographic and socioeconomic variation in the treatment of prostate

cancer. Journal of Clinical Oncology, 23:7881–7888, 2005.

T. Lange, S. Vansteelandt, and M. Bekaert. A simple unified approach for estimating natural direct and

indirect effects. Am J Epidemiol, 176(3):190–195, 2012.

K. A. Lawson, O. Saarela, R. Abouassaly, S. P. Kim, R. H. Breau, and A. Finelli. The impact of quality

variations on patients undergoing surgery for renal cell carcinoma: A National Cancer Database study.

European Urology, 72:379–386, 2017a.

K. A. Lawson, O. Saarela, Z. Liu, L. T. Lavallee, R. H. Breau, L. Wood, M. A. Jewett, A. Kapoor, S. Tanguay, R. B. Moore, R. Rendon, F. Pouliot, P. C. Black, J. Kawakami, D. Drachenberg, and A. Finelli. Benchmarking quality for renal cancer surgery: Canadian Kidney Cancer information system (CKCis) perspective. Canadian Urological Association Journal, 11(8):232–237, 2017b.

R. Lilford and P. Pronovost. Using hospital mortality rates to judge hospital performance: a bad idea that just won’t go away. BMJ, 340, 2010.

R. Lilford, M. A. Mohammed, D. J. Spiegelhalter, and R. Thomson. Use and misuse of process and outcome

data in managing performance of acute medical care: avoiding institutional stigma. The Lancet, 363:

1147–1154, 2004.

J. K. Lunceford and M. Davidian. Stratification and weighting via the propensity score in estimation of

causal treatment effects: a comparative study. Statistics in Medicine, 23(19):2937–2960, October 15 2004.

D. MacKinnon. Introduction to Statistical Mediation Analysis. Erlbaum, New York, 2008.

D. MacKinnon and J. Dwyer. Estimating mediated effects in prevention studies. Evaluation Review, 17:

144–158, 1993.

E. C. Marshall and D. J. Spiegelhalter. Reliability of league tables of in vitro fertilisation clinics: retrospective

analysis of live birth rates. BMJ, 316:1701–1705, 1998.

G. Marshall, A. L. W. Shroyer, F. L. Grover, and K. E. Hammermeister. Time series monitors of outcomes:

A new dimension for measuring quality of care. Medical Care, 36(3):348–356, March 1998.

N. N. Massarweh, C.-Y. Hu, N. You, B. K. Bednarski, M. A. Rodriguez-Bigas, J. M. Skibber, S. B. Cantor,

J. N. Cormier, B. W. Feig, and G. J. Chang. Risk-adjusted pathologic margin positivity rate as a quality

indicator in rectal cancer surgery. Journal of Clinical Oncology, 32(27):2967–2974, 2014.

F. I. Matheson. 2016 Ontario marginalization index: user guide. Ontario Agency for Health Protection and

Promotion (Public Health Ontario), Toronto, ON: Providence St. Joseph’s and St. Michael’s Healthcare,

2018. Joint publication with Public Health Ontario.


M. Maurice, D. Sundi, E. Schaeffer, et al. Risk of pathological upgrading and up staging among men with low risk prostate cancer varies by race: Results from the national cancer database. Journal of Urology, 197:627–631, 2017.

O. S. Miettinen. Components of the crude risk ratio. American Journal of Epidemiology, 96(2):168–172,

1972.

D. Miller, M. Schonlau, M. Litwin, et al. Renal and cardiovascular morbidity after partial or radical nephrectomy. Cancer, 112:511–520, 2008.

D. C. Miller and C. S. Saigal. Quality of care indicators for prostate cancer: progress toward consensus.

Urologic Oncology, 27:427–434, 2009.

M. Moreno-Betancur, J. Koplin, A. Ponsonby, et al. Measuring the impact of differences in risk factor distributions on cross-population differences in disease occurrence: a causal approach. Int J Epidemiol., 2017. Epub ahead of print. doi: 10.1093/ije/dyx194.

K. M. Mortimer, R. Neugebauer, M. J. van der Laan, and I. B. Tager. An application of model-fitting

procedures for marginal structural models. American Journal of Epidemiology, 162(4):382–388, 2005.

K. Moses, H. Orom, A. Brasel, et al. Racial/ethnic disparity in treatment for prostate cancer: Does cancer severity matter? Urology, 99:76–83, 2017.

N. Nag, J. Millar, I. Davis, et al. Development of indicators to assess quality of care for prostate cancer. European Urology Focus, 2016.

N. Nag, J. Millar, I. D. Davis, S. Costello, J. B. Duthie, S. Mark, W. Delprado, D. Smith, D. Pryor, D. Galvin,

F. Sullivan, A. C. Murphy, D. Roder, H. Elsaleh, D. Currow, C. White, M. Skala, K. L. Moretti, T. Walker,

P. De Ieso, A. Brooks, P. Heathcote, M. Frydenberg, J. Thavaseelan, and S. M. Evans. Development of

indicators to assess quality of care for prostate cancer. European Urology, 4:57–63, 2018.

J. Neuberger, S. Madden, and D. Collett. Review of methods for measuring and comparing center performance after organ transplantation. Liver Transplantation, 16:1119–1128, 2010.

R. Neugebauer and M. J. van der Laan. Why prefer double robust estimates? illustration with causal point

treatment studies. Journal of Statistical Planning and Inference, 129(1-2):405–426, 2005.

F. Nightingale. Notes on matters affecting the health, efficiency and hospital administration of the British

army, founded chiefly on the experience of the Late War. Harrison, London, 1858.

F. Nightingale. Notes on Hospitals. Longman, London, 1863.

S.-L. T. Normand and D. M. Shahian. Statistical and clinical aspects of hospital outcomes profiling. Statistical Science, 22(2):206–226, 2007.

S.-L. T. Normand, M. E. Glickman, and C. A. Gatsonis. Statistical methods for profiling providers of medical

care: Issues and applications. Journal of the American Statistical Association, 92(439):803–814, 1997.


L. Ortelli, A. Spitale, L. Mazzucchelli, and A. Bordoni. Quality indicators of clinical cancer care for prostate cancer: a population-based study in southern Switzerland. BMC Cancer, 18(733), 2018. doi: 10.1186/s12885-018-4604-2.

R. Patzer and S. Pastan. Measuring the disparity gap: quality improvement to eliminate health disparities

in kidney transplantation. American Journal of Transplantation, 13:247–248, 2013.

J. Pearl. Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, pages 411–420, San Francisco, 2001. Morgan Kaufmann.

J. Pearl. Principal stratification - a goal or a tool? International Journal of Biostatistics, 7(1), 2011. Article 20.

J. Pereira, J. Renzullill, G. Pareek, D. Moreira, R. Guo, Z. Zhang, A. Amin, A. Mega, D. Golijanin,

and B. Gershman. Perioperative morbidity of open versus minimally invasive partial nephrectomy: a

contemporary analysis of the National Surgical Quality Improvement Program. Journal of Endourology,

32(2):116–123, February 2018.

M. L. Petersen and M. J. van der Laan. Causal models and learning from data: Integrating causal modeling

and statistical estimation. Epidemiology, 25(3):418–426, 2014.

M. L. Petersen, K. E. Porter, S. Gruber, Y. Wang, and M. J. van der Laan. Diagnosing and responding

to violations in the positivity assumption. Statistical Methods in Medical Research, 21(1):31–54, February

2012.

A. Potosky, W. Davis, R. Hoffman, J. Stanford, R. Stephenson, D. Penson, and L. Harlan. Five-year outcomes

after prostatectomy or radiotherapy for prostate cancer: the prostate cancer outcomes study. Journal of

the National Cancer Institute, 96(18):1358–1367, September 15 2004.

M. E. Pouw, L. M. Peelen, H. F. Lingsma, D. Pieter, E. Steyerberg, and C. J. Kalkman. Hospital standardized

mortality ratio: Consequences of adjusting hospital mortality with indirect standardization. PLoS ONE,

8(4), 2013.

K. Preacher, D. Rucker, and A. Hayes. Addressing moderated mediation hypotheses: Theory, methods, and

prescriptions. Multivariate Behavioral Research, 42(1):185–227, 2007.

M. J. Racz and J. Sedransk. Bayesian and frequentist methods for provider profiling using risk-adjusted

assessments of medical outcomes. Journal of the American Statistical Association, 105(489):48–58, 2010.

doi: 10.1198/jasa.2010.ap07175.

M. V. Raval, K. Y. Bilimoria, A. K. Stewart, D. J. Bentrem, and C. Y. Ko. Using the NCDB for cancer

care improvement: An introduction to available quality assessment tools. Journal of Surgical Oncology,

99:488–490, 2009.

J. Robins. Semantics of causal DAG models and the identification of direct and indirect effects. In P. Green,

N. Hjort, and S. Richardson, editors, Highly Structured Stochastic Systems, pages 70–81, New York, 2003.

Oxford University Press.

J. Robins, M. Sued, Q. Lei-Gomez, and A. Rotnitzky. Comment: Performance of doubly-robust estimators

when “inverse probability” weights are highly variable. Statistical Science, 22:544–559, 2007.

J. M. Robins. Robust estimation in sequentially ignorable missing data and causal inference models.

Proceedings of the American Statistical Association Section on Bayesian Science, pages 6–10, 2000.

J. M. Robins and S. Greenland. Identifiability and exchangeability for direct and indirect effects.

Epidemiology, 3:143–155, 1992.

J. M. Robins and A. Rotnitzky. Comment on the Bickel and Kwon article “on double robustness.” Statistica

Sinica, 11:920–936, 2001.

J. Rochon, A. du Bois, and T. Lange. Mediation analysis of the relationship between institutional research

activity and patient survival. BMC Med Res Methodol, 14(9), 2014.

http://www.biomedcentral.com/1471-2288/14/9.

P. Rosenbaum and D. Rubin. The central role of the propensity score in observational studies for causal

effects. Biometrika, 70(1):41–55, 1983.

D. B. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of

Educational Psychology, 66(5):688–701, 1974.

D. B. Rubin, E. A. Stuart, and E. L. Zanutto. A potential outcomes view of value-added assessment in

education. Journal of Educational and Behavioral Statistics, 29(1):103–116, Spring 2004.

D. O. Scharfstein, A. Rotnitzky, and J. M. Robins. Adjusting for nonignorable drop-out using semiparametric

nonresponse models. Journal of the American Statistical Association, 94(448):1096–1120, 1999.

M. Schmid, C. Meyer, G. Reznor, et al. Racial differences in the surgical care of Medicare beneficiaries

with localized prostate cancer. Journal of the American Medical Association Oncology, 2:85–93, 2016.

V. J. Schoenbach and W. D. Rosamond. Understanding the Fundamentals of Epidemiology - An Evolving

Text, chapter 6. School of Public Health, UNC Chapel Hill, NC 27599-7435 USA, 2000.

F. Schroeck, S. Kaufman, B. Jacobs, et al. Regional variation in quality of prostate cancer care. Journal

of Urology, 191:957–962, 2014a.

F. Schroeck, S. Kaufman, B. Jacobs, et al. Adherence to performance measures and outcomes among

men treated for prostate cancer. Journal of Urology, 192:743–748, 2014b.

F. Schroeck, S. Kaufman, B. Jacobs, et al. Receipt of best care according to current quality of care

measures and outcomes in men with prostate cancer. Journal of Urology, 193:500–504, 2015.

I. A. Scott, C. A. Brand, G. E. Phelps, A. L. Barker, and P. A. Cameron. Using hospital standardised

mortality ratios to assess quality of care - proceed with extreme caution. Medical Journal of Australia,

194(12):645–648, 2011.

A. Semerjian, S. L. Zettervall, R. Amdur, T. W. Jarrett, and K. Vaziri. 30-day morbidity and mortality

outcomes of prolonged minimally invasive kidney procedures compared with shorter open procedures:

National Surgical Quality Improvement Program analysis. Journal of Endourology, 29(7):830–837, July

2015.

D. M. Shahian and S.-L. T. Normand. Comparison of “risk-adjusted” hospital outcomes. Circulation:

Journal of the American Heart Association, 117:1955–1963, 2008.

D. M. Shahian, R. E. Wolf, L. I. Iezzoni, L. Kirle, and S.-L. T. Normand. Variability in the measurement of

hospital-wide mortality rates. New England Journal of Medicine, 363:2530–2539, 2010.

D. M. Shahian, L. I. Iezzoni, G. S. Meyer, L. Kirle, and S.-L. T. Normand. Hospital-wide mortality as a

quality metric: conceptual and methodological challenges. American Journal of Medical Quality, 27(2):

112–123, 2012.

T. Shinozaki and Y. Matsuyama. Doubly robust estimation of standardized risk difference and ratio in the

exposed population. Epidemiology, 26:873–877, 2015.

M. Sobel. Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt,

editor, Sociological Methodology, pages 290–312. Jossey-Bass, 1982.

W. Sohn, M. Resnick, S. Greenfield, et al. Impact of adherence to quality measures for localized prostate

cancer on patient-reported health-related quality of life outcomes, patient satisfaction, and

treatment-related complications. Medical Care, 54:738–744, 2016.

B. Spencer, M. Steinberg, J. Malin, et al. Quality-of-care indicators for early-stage prostate cancer.

Journal of Clinical Oncology, 21:1928–1936, 2003.

B. Spencer, D. Miller, M. Litwin, et al. Variations in quality of care for men with early-stage prostate

cancer. Journal of Clinical Oncology, 26:3735–3742, 2008.

D. J. Spiegelhalter. Surgical audit: Statistical lessons from Nightingale and Codman. Journal of the Royal

Statistical Society, Series A (Statistics in Society), 162(1):45–58, 1999.

D. J. Spiegelhalter. Handling over-dispersion of performance indicators. Quality and Safety in Health Care,

14:347–351, 2005a. doi: 10.1136/qshc.2005.013755.

D. J. Spiegelhalter. Funnel plots for comparing institutional performance. Statistics in Medicine,

24:1185–1202, 2005b.

D. J. Spiegelhalter, C. Sherlaw-Johnson, M. Bardsley, I. Blunt, C. Wood, and O. Grigg. Statistical methods

for healthcare regulation: rating, screening and surveillance. Journal of the Royal Statistical Society,

Series A, 175(1):1–47, 2012.

J. Splawa-Neyman, D. M. Dabrowska, and T. P. Speed. On the application of probability theory to

agricultural experiments: essay on principles, section 9. Statistical Science, 5(4):465–472, 1990.

B. Starfield, J. Weiner, L. Mumford, and D. Steinwachs. Ambulatory care groups: a categorization of

diagnoses for research and management. Health Services Research, 26(1):53, 1991.

M. Susser. Causal Thinking in the Health Sciences: Concepts and Strategies of Epidemiology. Oxford

University Press, New York, 1973.

D. Talbot, T. Duchesne, J. Brisson, and N. Vandal. Variance estimation and confidence intervals for the

standardized mortality ratio with application to the assessment of a cancer screening program. Statistics

in Medicine, 30:3024–3037, 2011.

H.-J. Tan, J. S. Wolf Jr., Z. Ye, K. S. Hafez, and D. C. Miller. Population level assessment of hospital

based outcomes following laparoscopic versus open partial nephrectomy during the adoption of minimally

invasive surgery. Journal of Urology, 191:1231–1237, May 2014.

X. Tang, F. F. Gan, and L. Zhang. Standardized mortality ratio for an estimated number of deaths. Journal

of Applied Statistics, 42:1348–1366, 2015.

T. Tarin, A. Feifer, S. Kimm, L. Chen, D. Sjoberg, J. Coleman, and P. Russo. Impact of a common clinical

pathway on length of hospital stay in patients undergoing open and minimally invasive kidney surgery.

Journal of Urology, 191:1225–1230, May 2014.

E. Tchetgen Tchetgen and I. Shpitser. Estimation of a semiparametric natural direct effect model

incorporating baseline covariates. Biometrika, 101(4):849–864, December 2014.

N. Thomas, N. T. Longford, and J. E. Rolph. Empirical Bayes methods for estimating hospital-specific

mortality rates. Statistics in Medicine, 13:889–903, 1994.

I. Thompson, R. Valicenti, P. Albertsen, et al. Adjuvant and salvage radiotherapy after prostatectomy:

AUA/ASTRO guideline. J Urol, 190:441–449, 2013.

M. J. van der Laan and J. M. Robins. Unified Methods for Censored Longitudinal Data and Causality.

Springer-Verlag, New York, 2003.

Y. R. van Gestel, V. E. Lemmens, H. F. Lingsma, I. H. de Hingh, H. J. Rutten, and J. W. W. Coebergh. The

hospital standardized mortality ratio fallacy: a narrative review. Medical Care, 50(8):662–667, August

2012.

T. VanderWeele. Marginal structural models for the estimation of direct and indirect effects. Epidemiology,

20(1):18–26, January 2009.

T. VanderWeele. Bias formulas for sensitivity analysis for direct and indirect effects. Epidemiology, 21(4):

540–551, July 2010.

T. VanderWeele. Policy-relevant proportions for direct effects. Epidemiology, 24:175–176, 2013.

T. VanderWeele. Explanation in Causal Inference: Methods for Mediation and Interaction, chapter 2, pages

20–65. Oxford University Press, New York, 2015.

T. VanderWeele and Y. Chiba. Sensitivity analysis for direct and indirect effects in the presence of exposure-

induced mediator-outcome confounders. Epidemiology, Biostatistics and Public Health, 11(2), 2014. doi:

e9027.

T. VanderWeele and S. Vansteelandt. Conceptual issues concerning mediation, interventions and

composition. Statistics and Its Interface, 2(4):457–468, 2009.

T. VanderWeele and S. Vansteelandt. Odds ratios for mediation analysis for a dichotomous outcome.

American Journal of Epidemiology, 172(12):1339–1348, 2010.

T. VanderWeele and S. Vansteelandt. Mediation analysis with multiple mediators. Epidemiologic Methods,

2(1):95–115, 2013.

T. VanderWeele, S. Vansteelandt, and J. Robins. Effect decomposition in the presence of an exposure-induced

mediator-outcome confounder. Epidemiology, 25(2):300–306, Mar 2014.

S. Vansteelandt and T. VanderWeele. Natural direct and indirect effects on the exposed: Effect decomposition

under weaker assumptions. Biometrics, 68:1019–1027, December 2012.

M. Varewyck, E. Goetghebeur, M. Eriksson, and S. Vansteelandt. On shrinkage and model extrapolation in

the evaluation of clinical center performance. Biostatistics, 15:651–664, 2014.

M. Varewyck, S. Vansteelandt, M. Eriksson, and E. Goetghebeur. On the practice of ignoring center-patient

interactions in evaluating hospital performance. Statistics in Medicine, 35:227–238, 2016.

C. J. Wallis, G. Bjarnason, J. Byrne, D. C. Cheung, A. Hoffman, G. S. Kulkarni, A. B. Nathens, R. K. Nam,

and R. Satkunasivam. Morbidity and mortality of radical nephrectomy for patients with disseminated

cancer: an analysis of the National Surgical Quality Improvement Program database. Urology, 95:95–102,

2016.

Y. Wang, M. L. Petersen, D. Bangsberg, and M. J. van der Laan. Diagnosing bias in the inverse probability

of treatment weighted estimator resulting from violation of experimental treatment assignment. U.C.

Berkeley Division of Biostatistics Working Paper Series, Working Paper 211, September 2006.

C. Webber, M. Brundage, D. Siemens, et al. Quality of care indicators and their related outcomes: A

population-based study in prostate cancer patients treated with radiotherapy. Radiotherapy and Oncology,

107:358–365, 2013.

E. Wen, C. Sandoval, J. Zelmer, and G. Webster. Understanding and using the hospital standardized

mortality ratio in Canada: challenges and opportunities. Healthcare Papers, 8(4):26–36, 2008.

D. Westreich and S. R. Cole. Invited commentary: Positivity in practice. American Journal of Epidemiology,

171(6):674–677, February 2010.

A. Wijesinha, C. B. Begg, H. H. Funkenstein, and B. J. McNeil. Methodology for the differential diagnosis

of a complex data set: A case study using data from routine CT examination. Medical Decision Making,

3:133–154, 1983.

R. Wolfe. The standardized mortality ratio revisited: Improvements, innovations, and limitations. Am. J.

Kidney Dis., 24(2):290–297, August 1994.

S. Wright. Correlation and causation. Journal of Agricultural Research, 20:557–585, 1921.

A. M. Zaslavsky. Statistical issues in reporting quality data: small samples and casemix variation.

International Journal for Quality in Health Care, 13(6):481–488, 2001.
