Methods for Multilevel Analysis XH Andrew Zhou, PhD Professor, Department of Biostatistics University of Washington



Examples of Multilevel (Hierarchical) Data

• Individual - family - neighborhood
• Students - classroom - school - district
• Patient - provider - facility (the Ambulatory Care Quality Improvement Project, ACQUIP)
• Other types: multiple outcomes nested within an individual


ACQUIP Alcohol Trial

• A group-randomized trial.

• Intervention: feedback given to the providers at each visit on the patient's general perceived health status as well as the condition-specific perceived health status for 6 common conditions: chronic obstructive pulmonary disease (COPD), coronary artery disease (CAD), hypertension, depression, diabetes, and alcohol problems.

• Outcome at 1-year follow-up: (1) self-reports of advice about alcohol from their provider; binary outcome.


Hierarchical Nature of Data

• Patients - providers - facilities
• Patient characteristics, e.g. advice at baseline, co-morbidity
• Provider characteristics, e.g. panel size
• Facility characteristics, e.g. urban vs. rural


Research Questions

Was the intervention significantly associated with patient self-reports of advice about alcohol from their providers one year after the intervention?

Independent effects of patient-level, provider-level, and facility-level factors.

Quantification of provider-to-provider variability and facility-to-facility variability, and the degree to which it can be explained by patient-level, provider-level, and facility-level factors.


Research Questions, Cont

Do facilities differ in expected outcomes after controlling for individual-level, provider-level, and facility-level factors?

Do providers differ in expected outcomes after controlling for individual-level, provider-level, and facility-level factors?


Multilevel (Hierarchical) Models

A hierarchical model analysis treats the sites and the providers as random effects and partitions the total variation in the outcome into the parts attributable to each level of the hierarchy.


An example using a two-level linear model on schools

A study of the relationship between a single student-level predictor variable (say, socioeconomic status (SES)) and one student-level outcome variable (mathematics achievement) in J schools randomly drawn from the entire population of schools.


The SES-Achievement relationship in one school

Our regression model would be

$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2),$$

where $Y_i$ is the math achievement score for student $i$, $x_i$ is the socioeconomic status for student $i$, $\beta_0$ is the intercept, and $\beta_1$ is the slope.

Figure 2.1 provides a scatterplot of this relationship.


Centering in covariates

$\beta_0$ is defined as the expected achievement of a student whose SES is zero.

It may be helpful to scale the independent variable, X, so that the intercept will be meaningful.

We center SES by subtracting the mean SES from each score.

Figure 2.2 shows the regression model with centering.
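The effect of centering can be checked numerically. The sketch below uses made-up SES and achievement values (assumptions for illustration, not data from the lecture) and a hypothetical helper `fit_line`. It fits the one-school regression before and after centering: the slope should be unchanged, while the intercept becomes the mean achievement.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Illustrative values only (assumed, not taken from the study).
ses = [2.0, 3.0, 5.0, 6.0, 9.0]
math_score = [41.0, 47.0, 50.0, 58.0, 64.0]

# Fit with the raw predictor: the intercept refers to a student with SES = 0.
raw_b0, raw_b1 = fit_line(ses, math_score)

# Fit with the centered predictor: the intercept refers to a student with mean SES.
ses_centered = [s - mean(ses) for s in ses]
cen_b0, cen_b1 = fit_line(ses_centered, math_score)

print(f"raw:      intercept={raw_b0:.3f}, slope={raw_b1:.3f}")
print(f"centered: intercept={cen_b0:.3f}, slope={cen_b1:.3f}")
print(f"mean achievement: {mean(math_score):.3f}")
```

Centering changes only the location of the predictor, so it affects the interpretation of the intercept but not the estimated SES-achievement relationship.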


The SES-Achievement relationship in two schools

Figure 2.3 shows separate regression models for two schools.

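$$Y_{i1} = \beta_{01} + \beta_{11} x_{i1} + \varepsilon_{i1}, \qquad \varepsilon_{i1} \sim N(0, \sigma^2),$$

where $Y_{i1}$ is the achievement for student $i$ in school 1, $x_{i1}$ is the socioeconomic status for student $i$ in school 1, $\beta_{01}$ is the intercept, and $\beta_{11}$ is the slope in school 1.

$$Y_{i2} = \beta_{02} + \beta_{12} x_{i2} + \varepsilon_{i2}, \qquad \varepsilon_{i2} \sim N(0, \sigma^2),$$

where $Y_{i2}$ is the achievement for student $i$ in school 2.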


The two lines indicate that School 1 and School 2 differ in two ways.

(1) School 1 has a higher mean than School 2 ($\beta_{01} > \beta_{02}$).

(2) SES is less predictive of achievement in School 1 than in School 2 ($\beta_{11} < \beta_{12}$).

If students had been randomly assigned to the two schools, we could say that School 1 is both more “effective” and more “equitable”.

Of course, students are not assigned at random, so such interpretations of school effects are unwarranted without taking into account other differences in student composition.


The SES-Achievement relationship in J schools (2-level Variance Component)

$$Y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2),$$

where $Y_{ij}$ is the achievement for student $i$ in school $j$, $x_{ij}$ is the socioeconomic status for student $i$ in school $j$, $\beta_{0j}$ is the intercept, and $\beta_{1j}$ is the slope in school $j$.

For each school, effectiveness and equality are described by the pair of values $(\beta_{0j}, \beta_{1j})$.


Often sensible and convenient to assume that the intercept and slope have a bivariate normal distribution across the population of schools.

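$$E(\beta_{0j}) = \gamma_0, \qquad \operatorname{Var}(\beta_{0j}) = \tau_{00}, \qquad E(\beta_{1j}) = \gamma_1,$$

$$\operatorname{Var}(\beta_{1j}) = \tau_{11}, \qquad \operatorname{Cov}(\beta_{0j}, \beta_{1j}) = \tau_{01}.$$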


Interpretation

• $\gamma_0$: the average school mean for the population of schools
• $\tau_{00}$: the population variance among the school means
• $\gamma_1$: the average SES-achievement slope for the population of schools
• $\tau_{11}$: the population variance among the slopes
• $\tau_{01}$: the population covariance between slopes and intercepts


Figure 2.4 provides a scatterplot of the relationship between $\beta_{0j}$ and $\beta_{1j}$ for a hypothetical sample of 200 schools.

There is more dispersion among means than among slopes ($\tau_{00} > \tau_{11}$).

The two effects tend to be negatively correlated ($\tau_{01} < 0$); schools with high average achievement, $\beta_{0j}$, tend to have a weak SES-achievement relationship, $\beta_{1j}$.
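A pattern of this kind can be generated by drawing school intercepts and slopes from a bivariate normal distribution. The sketch below is only an illustration: the function names and all parameter values (average intercept 50, average slope 2, variances 9 and 1, covariance -1.5) are assumptions chosen so that means vary more than slopes and the two are negatively correlated.

```python
import math
import random


def simulate_school_effects(n_schools, gamma0, gamma1, tau00, tau11, tau01, seed=0):
    """Draw (intercept, slope) pairs from a bivariate normal distribution."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix [[tau00, tau01], [tau01, tau11]].
    l11 = math.sqrt(tau00)
    l21 = tau01 / l11
    l22 = math.sqrt(tau11 - l21 ** 2)
    pairs = []
    for _ in range(n_schools):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pairs.append((gamma0 + l11 * z1, gamma1 + l21 * z1 + l22 * z2))
    return pairs


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length lists."""
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    return sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / (len(a) - 1)


# Assumed values for illustration: 200 hypothetical schools.
schools = simulate_school_effects(
    200, gamma0=50.0, gamma1=2.0, tau00=9.0, tau11=1.0, tau01=-1.5
)
intercepts = [p[0] for p in schools]
slopes = [p[1] for p in schools]

print("variance of intercepts:", round(sample_cov(intercepts, intercepts), 2))
print("variance of slopes:    ", round(sample_cov(slopes, slopes), 2))
print("covariance:            ", round(sample_cov(intercepts, slopes), 2))
```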


Modeling the second level

• Having examined graphically how schools vary in terms of their intercepts and slopes, we wish to develop a model to predict $\beta_{0j}$ and $\beta_{1j}$ using school characteristics.

• Let $W_j$ be an indicator that takes on a value of one for Catholic schools and a value of zero for public schools.


Two-level Linear Model, Cont

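$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00}),$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}, \qquad u_{1j} \sim N(0, \tau_{11}),$$

$$\operatorname{cov}(u_{0j}, u_{1j}) = \tau_{01}.$$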


Interpretation

• $\gamma_{00}$: the mean achievement for public schools
• $\gamma_{01}$: the mean achievement difference between Catholic and public schools
• $\gamma_{10}$: the average SES-achievement slope in public schools
• $\gamma_{11}$: the mean difference in SES-achievement slope between Catholic and public schools
• $u_{0j}$: the unique effect of school $j$ on mean achievement, holding $W_j$ constant
• $u_{1j}$: the unique effect of school $j$ on the SES-achievement slope, holding $W_j$ constant


Estimation methods

It is not possible to estimate the parameters of these regression models directly because the outcomes ($\beta_{0j}$, $\beta_{1j}$) are not observed.

However, the data contain information needed for this estimation.


Estimation methods, cont

Combining models in two stages, we obtain

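$$Y_{ij} = \gamma_{00} + \gamma_{01} W_j + \gamma_{10}(X_{ij} - \bar{X}) + \gamma_{11} W_j (X_{ij} - \bar{X}) + \eta_{ij},$$

$$\eta_{ij} = u_{0j} + u_{1j}(X_{ij} - \bar{X}) + \varepsilon_{ij}.$$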


Estimation methods, Cont

The overall linear regression model is not the typical linear model assumed in standard ordinary least squares (OLS).

Efficient estimation and accurate hypothesis testing based on OLS require that the random errors are independent, normally distributed, and have constant variance.

In contrast, random errors in our overall model are dependent within each school and also have non-constant variances.


Estimation methods, cont.

The variance of random errors has the following complicated form:

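$$\operatorname{Var}\big(u_{0j} + u_{1j}(X_{ij} - \bar{X}) + \varepsilon_{ij}\big) = \tau_{00} + 2\tau_{01}(X_{ij} - \bar{X}) + \tau_{11}(X_{ij} - \bar{X})^2 + \sigma^2.$$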
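To make the dependence and the non-constant variance concrete, the sketch below evaluates the variance and the within-school covariance of the random part $u_{0j} + u_{1j}(X_{ij} - \bar{X}) + \varepsilon_{ij}$ for a few centered SES values. The helper name `composite_cov` and all numerical values are assumptions for illustration only.

```python
def composite_cov(c_i, c_k, tau00, tau01, tau11, sigma2, same_student):
    """Covariance of the random parts for two students in the same school.

    c_i and c_k are the centered SES values of the two students. When both
    arguments refer to the same student, the level-1 variance sigma2 is
    added, which gives the variance of the random part.
    """
    cov = tau00 + tau01 * (c_i + c_k) + tau11 * c_i * c_k
    if same_student:
        cov += sigma2
    return cov


# Assumed values for illustration.
params = dict(tau00=9.0, tau01=-1.5, tau11=1.0, sigma2=36.0)

# The variance changes with the student's centered SES (non-constant variance).
for c in (-2.0, 0.0, 2.0):
    var = composite_cov(c, c, same_student=True, **params)
    print(f"centered SES {c:+.1f}: variance = {var:.1f}")

# Two different students in the same school share u_0j and u_1j, so their
# random parts are correlated even though their epsilons are independent.
print("covariance:", composite_cov(-2.0, 2.0, same_student=False, **params))
```

Students in different schools share no random effects, so the corresponding covariance is zero; only the within-school blocks of the covariance matrix are non-zero.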


Estimation methods, cont

Though standard regression analysis is not appropriate, such models can be estimated by iterative maximum likelihood procedures.

Figure 2.5 provides a graphical representation of the model specified above.

Here we see two hypothetical plots of the association between $\beta_{0j}$ and $\beta_{1j}$, one for public and a second for Catholic schools.

The plots show that Catholic schools have both higher mean achievement and weaker SES effects than public schools do.


Estimation methods, Cont

• Three types of parameters to be estimated:

1. Fixed effects ($\gamma_{00}, \gamma_{01}, \gamma_{10}, \gamma_{11}$)

2. Random level-1 coefficients ($\beta_{0j}, \beta_{1j}$)

3. Variance-covariance components ($\sigma^2, \tau_{00}, \tau_{11}, \tau_{01}$)


Three common estimation methods

The maximum likelihood (ML) method is a general estimation procedure that produces estimates for the population parameters that maximize the probability of observing the data given the model.

Iterative generalized least squares (IGLS) and restricted iterative generalized least squares (RIGLS).

Bayesian method


ML method

Two different likelihood functions:

1. Full Maximum Likelihood (FML) – both the regression coefficients and the variance components are included in the likelihood function.

2. Restricted Maximum Likelihood (RML) – only the variance components are included in the likelihood function, and the regression coefficients are estimated in a second estimation step.


Comparison of these two methods

FML is more efficient and can provide estimates for both variance components and fixed effect parameters. But, FML may produce biased estimates for variance components.

RML can provide less biased estimates for the variance components and is equivalent to ANOVA estimates, which are optimal, if the groups are balanced.

FML continues to be used because (1) its computation is generally easier, and (2) it is easier to compare two models that differ in the fixed parameters using likelihood-based tests. With RML, only differences in the random part can be compared with likelihood-based tests.
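The source of the bias is easiest to see in the simplest possible case, which is an assumption made here for illustration and not a multilevel model: a single sample with one mean and one variance. There, full maximum likelihood divides the sum of squares by $n$, while restricted maximum likelihood divides by $n - 1$, accounting for the degree of freedom used to estimate the mean. The helper name and the data are made up.

```python
def variance_estimates(y):
    """Return (full ML, restricted ML) variance estimates for one sample."""
    n = len(y)
    y_bar = sum(y) / n
    ss = sum((v - y_bar) ** 2 for v in y)
    # Full ML ignores that the mean was estimated from the same data;
    # restricted ML corrects for the one fixed parameter (the mean).
    return ss / n, ss / (n - 1)


# Made-up sample for illustration.
sample = [4.0, 7.0, 5.0, 9.0, 10.0]
fml_var, rml_var = variance_estimates(sample)
print(f"FML variance estimate: {fml_var:.2f}")
print(f"RML variance estimate: {rml_var:.2f}")
```

The gap between the two estimates shrinks as the sample grows, which matches the remark that the bias matters most in small samples or with few groups.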


IGLS and RIGLS

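The combined model is

$$Y_{ij} = \gamma_{00} + \gamma_{01} W_j + \gamma_{10}(X_{ij} - \bar{X}) + \gamma_{11} W_j (X_{ij} - \bar{X}) + \eta_{ij}, \qquad \eta_{ij} = u_{0j} + u_{1j}(X_{ij} - \bar{X}) + \varepsilon_{ij},$$

$$\operatorname{var}(\eta_{ij}) = \tau_{00} + 2\tau_{01}(X_{ij} - \bar{X}) + \tau_{11}(X_{ij} - \bar{X})^2 + \sigma^2,$$

$$\operatorname{cov}(\eta_{ij}, \eta_{kj}) = \tau_{00} + \tau_{01}\big[(X_{ij} - \bar{X}) + (X_{kj} - \bar{X})\big] + \tau_{11}(X_{ij} - \bar{X})(X_{kj} - \bar{X})$$

for $i \neq k$. Or we can re-write the model as

$$Y \sim N(X\gamma, \Sigma), \qquad Y = (Y_{11}, \ldots, Y_{m_J J})', \qquad \gamma = (\gamma_{00}, \gamma_{01}, \gamma_{10}, \gamma_{11})',$$

where $m_J$ is the number of students in school $J$, $X$ is the design matrix of the fixed part, and $\Sigma$ is the covariance matrix of the random part.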


IGLS and RIGLS, Cont

If $\sigma^2$, $\tau_{00}$, $\tau_{11}$, and $\tau_{01}$ were known, then the covariance matrix, $\Sigma$, could be constructed immediately, and the estimation could be performed with generalized least squares.

However, without knowledge of the covariance matrix, the estimation method is instead an iterative process known as iterative generalized least squares (IGLS).
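The known-covariance case can be sketched directly. The code below assumes NumPy is available and uses a hypothetical helper `gls` together with made-up data: two schools of three students each, with a random intercept only, so that the covariance matrix is block-diagonal. It computes the generalized least squares estimate $(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}Y$.

```python
import numpy as np


def gls(X, y, Sigma):
    """Generalized least squares with a known covariance matrix Sigma.

    Returns the coefficient estimates and their covariance matrix.
    """
    Sinv_X = np.linalg.solve(Sigma, X)   # Sigma^{-1} X without forming the inverse
    Sinv_y = np.linalg.solve(Sigma, y)   # Sigma^{-1} y
    XtSX = X.T @ Sinv_X
    coef = np.linalg.solve(XtSX, X.T @ Sinv_y)
    return coef, np.linalg.inv(XtSX)


# Assumed values for illustration: 2 schools x 3 students, random intercept only.
tau00, sigma2 = 4.0, 1.0
block = tau00 * np.ones((3, 3)) + sigma2 * np.eye(3)
Sigma = np.kron(np.eye(2), block)        # block-diagonal: schools are independent

ses_centered = np.array([-1.0, 0.0, 1.0, -2.0, 0.5, 1.5])
X = np.column_stack([np.ones(6), ses_centered])
y = np.array([48.0, 50.0, 53.0, 40.0, 46.0, 47.0])

coef, coef_cov = gls(X, y, Sigma)
print("GLS estimates:  ", coef)
print("standard errors:", np.sqrt(np.diag(coef_cov)))
```

When `Sigma` is the identity matrix the same function reproduces ordinary least squares, which is why OLS is a natural starting point for the iterative procedure.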


IGLS and RIGLS, Cont

The first step is to start with reasonable estimates of the fixed parameters. Typically these are the estimates from Ordinary Least Squares (OLS), which assumes $\tau_{00} = \tau_{11} = \tau_{01} = 0$.

From these estimates, the raw residuals are formed:

$$\tilde{y}_{ij} = y_{ij} - \hat{\beta}_0 - \hat{\beta}_1 x_{ij}.$$


IGLS and RIGLS, Cont

Let $\tilde{Y}$ be the vector of raw residuals. It can be shown that

$$E[\tilde{Y}\tilde{Y}^T] = \Sigma.$$

The estimation of these variance components involves an application of Generalized Least Squares (GLS).

GLS is a regression technique that is used when the error terms from OLS estimation display non-random patterns, such as correlation.


IGLS and RIGLS, Cont

With the estimates of the variance components ($\sigma^2$ and the $\tau$ parameters) from GLS, the iterative procedure returns to the fixed part of the model and calculates new estimates of the fixed effects.

The procedure alternates between the fixed and random parts in this way until convergence, that is, until the parameter estimates no longer change from iteration to iteration.
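The alternation can be sketched for a deliberately simplified case. The code below assumes NumPy and a random-intercept-only model, $y_{ij} = x_{ij}'\beta + u_j + e_{ij}$, with a single school-level variance `tau` and level-1 variance `sigma2`; the function name, the simulated data, and the stopping rule are all assumptions, and this is not the implementation used by any particular multilevel package. The random-part step regresses the residual cross-products on the structure of the covariance matrix by GLS, and the fixed-part step is a GLS fit given the current variance estimates.

```python
import numpy as np


def igls_random_intercept(y, X, groups, max_iter=200, tol=1e-10):
    """Sketch of IGLS for a two-level random-intercept model."""
    rows_by_group = [np.where(groups == g)[0] for g in np.unique(groups)]
    p = X.shape[1]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start (tau = 0)
    tau, sigma2 = 0.0, float(np.var(y - X @ beta))
    for _ in range(max_iter):
        # Random part: GLS regression of residual cross-products on
        # vec(J) (for tau) and vec(I) (for sigma2), school by school.
        resid = y - X @ beta
        A, b = np.zeros((2, 2)), np.zeros(2)
        for rows in rows_by_group:
            n = len(rows)
            V_inv = np.linalg.inv(tau * np.ones((n, n)) + sigma2 * np.eye(n))
            W = np.kron(V_inv, V_inv)                   # weight for vec(resid resid')
            Z = np.column_stack([np.ones(n * n), np.eye(n).ravel()])
            target = np.outer(resid[rows], resid[rows]).ravel()
            A += Z.T @ W @ Z
            b += Z.T @ W @ target
        new_tau, new_sigma2 = np.linalg.solve(A, b)
        # Fixed part: GLS given the updated variance components.
        XtVX, XtVy = np.zeros((p, p)), np.zeros(p)
        for rows in rows_by_group:
            n = len(rows)
            V_inv = np.linalg.inv(new_tau * np.ones((n, n)) + new_sigma2 * np.eye(n))
            XtVX += X[rows].T @ V_inv @ X[rows]
            XtVy += X[rows].T @ V_inv @ y[rows]
        new_beta = np.linalg.solve(XtVX, XtVy)
        change = max(abs(new_tau - tau), abs(new_sigma2 - sigma2),
                     float(np.max(np.abs(new_beta - beta))))
        beta, tau, sigma2 = new_beta, float(new_tau), float(new_sigma2)
        if change < tol:
            break
    return beta, tau, sigma2


# Simulated data (assumed values): 30 schools x 5 students, intercept-only model.
rng = np.random.default_rng(1)
groups = np.repeat(np.arange(30), 5)
y = 10.0 + rng.normal(0.0, 2.0, size=30)[groups] + rng.normal(0.0, 1.0, size=150)
X = np.ones((150, 1))

beta_hat, tau_hat, sigma2_hat = igls_random_intercept(y, X, groups)
print("intercept:", beta_hat[0], "tau:", tau_hat, "sigma2:", sigma2_hat)
```

For this balanced, intercept-only design the result can be compared with the closed-form maximum likelihood estimates of the one-way random effects model, which gives a simple check that the iteration lands where expected.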


IGLS and RIGLS, Cont

IGLS estimation may produce biased estimates of the random parameters because it does not take into account the sampling variation of the estimates of the fixed parameters.

This may be most severe in small samples. However, unbiased estimates can be produced using Restricted Iterative Generalized Least Squares (RIGLS).

The main difference between IGLS and RIGLS is that IGLS uses maximum likelihood and RIGLS uses restricted maximum likelihood.


Bayesian method

Bayesian methods combine any prior information about the parameters with the information contained in the data to produce a posterior distribution.

Markov chain Monte Carlo (MCMC) methods are commonly used computational methods for generating a random sample from a posterior distribution.

MCMC methods are also iterative and include Gibbs sampling and Metropolis-Hastings sampling. MCMC methods tend to produce more accurate interval estimates for small samples.


Three-level binary response models for the Alcohol Drinking

Let $Y_{ijk}$ be the binary response variable indicating whether subject $i$, cared for by provider $j$ in hospital $k$, received drinking advice.

$X_{ijk}$ is the intervention status for subject $i$ cared for by provider $j$ in hospital $k$.


Three-level logistic regression

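$$\operatorname{logit}(\Pr(Y_{ijk} = 1)) = \beta_{0jk} + \beta_1 X_{ijk} + e_{ijk},$$

$$\beta_{0jk} = \beta_0 + v_k + u_{jk},$$

$$v_k \sim N(0, \sigma_v^2), \qquad u_{jk} \sim N(0, \sigma_u^2), \qquad e_{ijk} \sim N(0, \sigma_e^2).$$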
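The nesting in this model can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the trial's data-generating process: the function names, the parameter values, and the rule that intervention status alternates by facility are all made up, and the patient-level term $e_{ijk}$ is left out so that patient-level variation comes only from the binary draw.

```python
import math
import random


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_three_level(n_facilities, n_providers, n_patients,
                         beta0, beta1, sd_v, sd_u, seed=0):
    """Simulate binary outcomes for patients within providers within facilities."""
    rng = random.Random(seed)
    rows = []
    for k in range(n_facilities):
        v_k = rng.gauss(0, sd_v)          # facility-level random effect
        x = k % 2                         # hypothetical: status assigned by facility
        for j in range(n_providers):
            u_jk = rng.gauss(0, sd_u)     # provider-level random effect
            for i in range(n_patients):
                p = inv_logit(beta0 + v_k + u_jk + beta1 * x)
                y = 1 if rng.random() < p else 0
                rows.append((k, j, i, x, y))
    return rows


# Assumed values for illustration only.
data = simulate_three_level(4, 3, 10, beta0=-1.0, beta1=0.5, sd_v=0.2, sd_u=0.4)
for status in (0, 1):
    outcomes = [y for (_, _, _, x, y) in data if x == status]
    print(f"intervention={status}: proportion reporting advice = "
          f"{sum(outcomes) / len(outcomes):.2f}")
```

Patients who share a provider share `u_jk`, and patients who share a facility share `v_k`, which is what induces correlation among outcomes within the same provider or facility.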


The parameter $\sigma_e^2$ provides a natural test of whether the assumption of binomial variation is valid.

If $\sigma_e^2$ is significantly different from one, the data are said to exhibit extra-binomial variation.

If $\sigma_e^2$ is less than one, the data are said to be under-dispersed, and if $\sigma_e^2$ is greater than one, the data are said to be over-dispersed.


Two estimation methods

Two estimation methods for multilevel logistic regression models:

• A quasi-likelihood approach
• A Bayesian approach with MCMC methods

I will briefly describe these two approaches below.


Two Quasi-likelihood methods

For the quasi-likelihood approach, the first step in the estimation is to approximate the non-linear logistic regression equation using a Taylor series expansion. A Taylor series approximates a nonlinear function by an infinite series of terms.

If only the first term in the series is used, then the estimation is known as a first-order approximation.

If the second term in the series is also used, then it is referred to as a second-order approximation.

If the Taylor series is expanded about the fixed parameters only, then the estimation is known as Marginal Quasi-likelihood (MQL).


Two Quasi-likelihood methods, Cont

If the Taylor series is expanded about the fixed and the random parameters, then the estimation is known as Penalized Quasi-Likelihood (PQL).

Once the quasi-likelihood has been formed, the estimation procedures, IGLS and RIGLS, can be applied to estimate the parameter values.
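The role of the expansion point and of the order can be illustrated on the inverse-logit function itself. The sketch below is a generic Taylor-series illustration under assumed values, not the estimation algorithm: `taylor_inv_logit` is a hypothetical helper, the fixed part is taken to be 0.5, the true random effect 0.5, and the current random-effect estimate 0.4. Expanding about the fixed part alone mimics the MQL idea, and expanding about the fixed part plus the current random-effect estimate mimics the PQL idea.

```python
import math


def inv_logit(eta):
    """Inverse-logit function p(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def taylor_inv_logit(eta, eta0, order):
    """First- or second-order Taylor approximation of inv_logit about eta0."""
    p = inv_logit(eta0)
    d1 = p * (1.0 - p)                # first derivative at eta0
    d2 = d1 * (1.0 - 2.0 * p)         # second derivative at eta0
    delta = eta - eta0
    approx = p + d1 * delta
    if order == 2:
        approx += 0.5 * d2 * delta ** 2
    return approx


# Assumed values for illustration.
fixed_part, true_u, current_u_estimate = 0.5, 0.5, 0.4
eta_true = fixed_part + true_u
exact = inv_logit(eta_true)

mql_first = taylor_inv_logit(eta_true, fixed_part, order=1)
mql_second = taylor_inv_logit(eta_true, fixed_part, order=2)
pql_first = taylor_inv_logit(eta_true, fixed_part + current_u_estimate, order=1)

print(f"exact probability:              {exact:.5f}")
print(f"about fixed part, 1st order:    {mql_first:.5f}")
print(f"about fixed part, 2nd order:    {mql_second:.5f}")
print(f"about fixed + random, 1st order:{pql_first:.5f}")
```

In this example both adding the second-order term and moving the expansion point closer to the true linear predictor reduce the approximation error, which is the intuition behind preferring second-order and PQL-type approximations when random effects are large.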


Bayesian method

The MCMC method used for the logistic regression equations in this paper will be Metropolis-Hastings sampling.
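The mechanics of the sampler can be shown on a much smaller problem than the three-level model. The sketch below is a minimal random-walk Metropolis sampler (a special case of Metropolis-Hastings with a symmetric proposal) for the intercept of a single-level logistic model. The data (30 successes in 100 trials), the normal prior with standard deviation 10, the step size, and the function names are all assumptions for illustration.

```python
import math
import random


def log_posterior(beta0, successes, trials, prior_sd=10.0):
    """Log posterior (up to a constant) for logit(Pr(Y = 1)) = beta0."""
    log_lik = successes * beta0 - trials * math.log1p(math.exp(beta0))
    log_prior = -0.5 * (beta0 / prior_sd) ** 2
    return log_lik + log_prior


def metropolis(successes, trials, n_iter=6000, step=0.3, seed=0):
    """Random-walk Metropolis sampler for beta0; returns the chain of draws."""
    rng = random.Random(seed)
    current = 0.0
    current_lp = log_posterior(current, successes, trials)
    draws = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0, step)
        proposal_lp = log_posterior(proposal, successes, trials)
        # Symmetric proposal, so the acceptance probability is the
        # posterior ratio capped at one.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws


# Made-up data: 30 of 100 patients report advice.
chain = metropolis(successes=30, trials=100)
kept = chain[1000:]                                   # discard burn-in draws
probs = sorted(1.0 / (1.0 + math.exp(-b)) for b in kept)
posterior_mean = sum(probs) / len(probs)
interval = (probs[int(0.025 * len(probs))], probs[int(0.975 * len(probs))])
print(f"posterior mean of Pr(Y = 1): {posterior_mean:.3f}")
print(f"95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

Interval estimates are read directly from the quantiles of the retained draws, which is one reason simulation-based intervals can behave well in small samples.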


ACQUIP Alcohol Trial

• Binary outcome at 1-yr follow-up: (1) Self-reports of advice about alcohol patients receive from their provider.

• Patient-level covariates

• Provider-level covariates


The Alcohol Example, Cont

Random assignment at the firm level should ensure that, on average, the two groups are balanced on the baseline covariates. However, imbalance may still occur, and confounding may still present a problem.

Patient-level potential confounders: hypertension, liver disease, being a smoker in the past year, and the AUDIT score.

Provider-level potential confounders: the number of patients per provider (Panel Size) and provider training.


Alcohol example

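$$
\begin{aligned}
\operatorname{logit}(\Pr(Y_{ijk} = 1)) ={}& \beta_{0jk} + \beta_1 X_{ijk} + \beta_2 \text{AdviceAtBaseline}_{ijk} \\
&+ \beta_3 \text{Hypertension}_{ijk} + \beta_4 \text{LiverDisease}_{ijk} + \beta_5 \text{PastYearSmoker}_{ijk} \\
&+ \beta_6 \text{BaselineAUDIT}_{ijk} + \beta_7 \text{PanelSize}_{ijk} + \beta_8 \text{Fellow}_{ijk} + \beta_9 \text{NP}_{ijk} \\
&+ \beta_{10} \text{PA}_{ijk} + \beta_{11} \text{Resident}_{ijk} + \beta_{12} \text{RN}_{ijk} + e_{ijk},
\end{aligned}
$$

$$\beta_{0jk} = \beta_0 + v_k + u_{jk},$$

$$v_k \sim N(0, \sigma_v^2), \qquad u_{jk} \sim N(0, \sigma_u^2), \qquad e_{ijk} \sim N(0, \sigma_e^2).$$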


Three-level logistic regression, Cont

Here, the variables Hypertension, LiverDisease, and PastYearSmoker are dichotomous variables that are equal to one if the patient reported the condition and zero if the patient did not report the condition.

The variable BaselineAUDIT is the patient's AUDIT score at baseline and is a continuous variable that ranges from 0 to 40.

The variable PanelSize indicates the range of the provider’s panel size.

The variables Fellow, NP, PA, Resident, and RN are dichotomous variables representing the categorical variable of provider type. The referent provider type is staff physician.


Results

Table 3.5.1 shows the MQL estimates, and Table 3.5.2 shows the PQL estimates, under each combination of first-order or second-order approximation and binomial or extra-binomial assumption.


Results for fixed effects

The estimates for the fixed effects are quite stable between estimation procedures. The estimate of the intervention effect is approximately 1.35, indicating that a patient in the intervention group is more likely to report advice than a patient in the control group.

This result is not significant if a two-tailed test is used. However, this result is significant if a one-tailed test is used. The p-values for the one-tailed test range from 0.02 to 0.05 depending on which estimate is considered.


Results for fixed effects, Cont

A patient self-report of advice at baseline as well as the patient’s baseline AUDIT score are the only additional variables significantly associated with a patient self-report of advice on the one-year follow-up survey.

None of the provider-level variables are associated with a patient self-report of advice on the one-year follow-up survey.


Results for variances and covariances

Estimates of the variance components are slightly more variable across estimation procedures in this model.

The estimate of the site level variance component has increased from approximately zero to be in the range of 0.01 to 0.04. However, these estimates tend to include zero in the confidence interval, indicating as before, that there may be little or no residual clustering at the site level.

The provider level variance components estimates are between 0.01 and 0.16, thus showing the greatest variation under the different estimations.


Results for variances and covariances, cont

The majority of the residual variation is at the patient level. The estimates for the patient level variance component remain close to one and support the assumption of binomial variance at the patient level.