BIO226 Lab Session 8: Generalized Linear Mixed Effects Models (GLMMs)


Professor Brent Coull

TA: Shira Mitchell

May 3, 2012

Key Points of GLMM

1. GLMMs extend the approach of linear mixed effects models to categorical data.

2. GLMMs assume heterogeneity across individuals in a subset of regression coefficients (e.g. intercepts and slopes).

3. While Marginal Models (GEEs) focus on inferences about populations, GLMMs focus on inferences about individuals.

4. Regression parameters from GLMMs have ‘subject specific’ interpretations in terms of changes in the transformed mean response for a specific individual.

Specification of GLMM

The GLMM can be considered in 2 steps:

1. Assume the conditional distribution of each Yij, given individual-specific effects bi, belongs to the exponential family with conditional mean

g(E[Yij | bi]) = X′ij β + Z′ij bi,

where g(.) is a known link function and Zij is a known design vector (a subset of Xij) linking the random effects bi to Yij.

Specification of GLMM

2. The bi are assumed to vary independently from one individual to another, with bi ∼ N(0, G), where G is the covariance matrix of the random effects.

Note: there is an additional assumption of “conditional independence”, i.e. given bi, the responses Yi1, Yi2, ..., Yini are assumed to be mutually independent.

GLMM Example

Longitudinal Binary Response to Depression Medication

Cross-classification of responses on depression at 3 times (N=Normal, A=Abnormal)

DX      TRT       NNN  NNA  NAN  NAA  ANN  ANA  AAN  AAA
Mild    Standard   16   13    9    3   14    4   15    6
Mild    New        31    0    6    0   22    2    9    0
Severe  Standard    2    2    8    9    9   15   27   28
Severe  New         7    2    5    2   31    5   32    6

Reprinted by Agresti (2002) with permission from the original source (Koch et al., 1977, Biometrics).

The data stored in ‘depress.txt’ have already been converted into long form and contain the following 6 variables:

ID, Y (0=Abnormal, 1=Normal), Severe (0=mild, 1=severe), Drug (0=standard, 1=new), Time, and Drug*Time.
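As a rough illustration of what "long form" means for these data, the Python sketch below expands the cross-classified counts into one row per subject per time point. The ID numbering (subjects numbered consecutively in table order) and the function name expand_to_long are assumptions made for illustration. That ordering happens to agree with the two subjects shown in the PROC PRINT output, but how depress.txt was actually built is not documented here.

```python
# Counts from the cross-classification table; pattern letters are the
# responses at times 0, 1, 2 (N = normal, A = abnormal).
PATTERNS = ["NNN", "NNA", "NAN", "NAA", "ANN", "ANA", "AAN", "AAA"]
COUNTS = [
    # (severe, drug, counts per pattern)
    (0, 0, [16, 13, 9, 3, 14, 4, 15, 6]),    # mild, standard
    (0, 1, [31, 0, 6, 0, 22, 2, 9, 0]),      # mild, new
    (1, 0, [2, 2, 8, 9, 9, 15, 27, 28]),     # severe, standard
    (1, 1, [7, 2, 5, 2, 31, 5, 32, 6]),      # severe, new
]


def expand_to_long(counts=COUNTS, patterns=PATTERNS):
    """Return one dict per subject-time row: id, y, severe, drug, time, dt.

    Assumption: IDs are assigned consecutively in table order.
    """
    rows = []
    subject_id = 0
    for severe, drug, cell_counts in counts:
        for pattern, n in zip(patterns, cell_counts):
            for _ in range(n):
                subject_id += 1
                for time, letter in enumerate(pattern):
                    rows.append({
                        "id": subject_id,
                        "y": 1 if letter == "N" else 0,
                        "severe": severe,
                        "drug": drug,
                        "time": time,
                        "dt": drug * time,   # drug-by-time interaction
                    })
    return rows


long_rows = expand_to_long()
print(len(long_rows), "rows for", long_rows[-1]["id"], "subjects")
```

Under these assumptions the expansion gives 3 rows for each of 340 subjects (1020 rows), which matches the dimensions PROC NLMIXED reports for the data set.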

SAS Code

DATA depress;
  INFILE 'depress.txt';
  INPUT id y severe drug time dt;
RUN;

DATA depress;
  SET depress;
  t = time;  /* create categorical time variable */
RUN;

PROC PRINT DATA=depress(WHERE=(id=65 or id=101));
RUN;

SAS Output

Obs   id   y  severe  drug  time  dt  t
193   65   0       0     0     0   0  0
194   65   0       0     0     1   0  1
195   65   1       0     0     2   0  2
301  101   1       0     1     0   0  0
302  101   1       0     1     1   1  1
303  101   1       0     1     2   2  2

Note: dt = drug*time.

Question of interest: Do patient-specific changes in the probability of normal differ between the two treatments?

Marginal Model for Depression Data

To obtain initial parameter values for the GLMM, we fit the following Marginal Model (GEE):

logit{Pr(Yij = 1)} = ηij = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej

where:

• Yij = 0 if subject i is abnormal in period j; 1 if subject i is normal in period j

• severei = 0 if mild depression at initial diagnosis; 1 if severe depression at initial diagnosis

• drugi = 0 if standard; 1 if new drug

• timej = 0 if baseline; 1 if time 1; 2 if time 2

and we assume:

• Yij ∼ Bernoulli(exp(ηij)/(1 + exp(ηij)))

• Var(Yij) = E(Yij)(1 − E(Yij)); note that Pr(Yij = 1) = E(Yij) because Yij is binary.

• log OR(Yij, Yik) = αjk

SAS Code

PROC GENMOD DESCENDING DATA=depress;
  CLASS id t;
  MODEL y = severe drug time dt / DIST=binomial LINK=logit;
  REPEATED SUBJECT=id / WITHINSUBJECT=t LOGOR=fullclust;
RUN;

SAS Output: Log Odds Ratio Parameter Information

Parameter  Group
Alpha1     (1, 2)
Alpha2     (1, 3)
Alpha3     (2, 3)

GLMM for Depression Data

• Consider the following GLMM:

logit{Pr(Yij = 1 | bi1)} = ηij = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1

where bi1 is a random intercept that allows a different baseline probability of normal (vs abnormal) for each subject,

and we assume:

• Yij | bi1 ∼ Bernoulli(exp(ηij)/(1 + exp(ηij))), which implies that Var(Yij | bi1) = E(Yij | bi1)(1 − E(Yij | bi1)). Note: E(Yij | bi1) = Pr(Yij = 1 | bi1) because Yij is binary.

• Given bi1, the responses Yi0, Yi1, Yi2 are mutually independent.

• The bi1 are assumed to vary independently from one individual to another, with bi1 ∼ N(0, σb²).
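Because bi1 is unobserved, fitting this model requires each subject's likelihood with the random intercept integrated out. The Python sketch below illustrates that integral for a single subject under the three assumptions above. It is only a sketch: it uses a plain trapezoid rule on a fixed grid, not the adaptive Gaussian quadrature that PROC NLMIXED reports using. The function names, grid settings, and parameter values are all assumptions chosen for illustration.

```python
import math
from itertools import product


def cond_prob(beta, severe, drug, time, b):
    """Pr(Y = 1 | b) under the random-intercept logistic model."""
    b1, b2, b3, b4, b5 = beta
    eta = b1 + b2 * severe + b3 * drug + b4 * time + b5 * drug * time + b
    return 1.0 / (1.0 + math.exp(-eta))


def subject_likelihood(y, beta, sigma, severe, drug, n_grid=2001, width=8.0):
    """Marginal likelihood of one subject's responses y = (y0, y1, y2).

    Integrates prod_j Pr(y_j | b) * Normal(b; 0, sigma^2) over b with a
    trapezoid rule on [-width*sigma, width*sigma]. This is a simple
    stand-in for Gaussian quadrature, not what SAS does internally.
    """
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    norm_const = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / norm_const
        lik = 1.0
        for time, yj in enumerate(y):   # conditional independence given b
            p = cond_prob(beta, severe, drug, time, b)
            lik *= p if yj == 1 else 1.0 - p
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * lik * density * step
    return total


beta = (-0.03, -1.3, -0.05, 0.48, 1.01)   # illustrative values
sigma = 1.0                                # illustrative; must be > 0
patterns = list(product((0, 1), repeat=3))
probs = {y: subject_likelihood(y, beta, sigma, severe=0, drug=1)
         for y in patterns}
print(sum(probs.values()))   # the 8 pattern probabilities should sum to about 1
```

A full fit would sum the log of this quantity over all subjects and maximize it over the β's and σb. When σb is close to 0 the integral is essentially the ordinary logistic likelihood evaluated at bi1 = 0.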

NLMIXED in SAS

PROC NLMIXED DATA=depress QPOINTS=20;
  PARAMS beta1=-0.03 beta2=-1.3 beta3=-0.05 beta4=0.48 beta5=1.01 sigma=0.07;

  eta = beta1 + beta2*severe + beta3*drug + beta4*time + beta5*dt + b1;
  p = exp(eta)/(1 + exp(eta));

  MODEL y ~ BINARY(p);
  RANDOM b1 ~ NORMAL(0, sigma*sigma) SUBJECT=id;

  ESTIMATE 'treatment effect, time 1' beta3 + beta5;
  ESTIMATE 'treatment effect, time 2' beta3 + 2*beta5;
  ESTIMATE 'time trend standard treatment' beta4;
  ESTIMATE 'time trend new treatment' beta4 + beta5;
RUN;

NLMIXED in SAS

• PARAMS statement: lists all parameters (fixed effects and covariance parameters for the random effects) and their initial values (the default initial value is 1).

• Program statements: define the linear predictor eta (which includes fixed and random effects) and relate the mean response (p) to the linear predictor (eta).

• MODEL statement: specifies the response variable and the conditional distribution of the response given the random effects (e.g. BINARY).

• RANDOM statement (RANDOM effects ∼ distribution SUBJECT=variable): defines the random effects and their distribution (RANDOM) and the variable that determines clustering of observations within an individual (SUBJECT).

Note: PROC NLMIXED does not have a CLASS statement; therefore, it is critical that the dataset is sorted by ID prior to analysis.

Estimate Statements

Treatment effect, time 1

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1

For drug=0 and time=1,

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β4 + bi1.

For drug=1 and time=1,

logit{Pr(Yi’j = 1 | bi’1)} = β1 + β2severei’ + β3 + β4 + β5 + bi’1.

Thus, the difference = β3 + β5, assuming bi1 = bi’1 and severei = severei’.

Estimate Statements

Treatment effect, time 2

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1

For drug=0 and time=2,

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + 2β4 + bi1.

For drug=1 and time=2,

logit{Pr(Yi’j = 1 | bi’1)} = β1 + β2severei’ + β3 + 2β4 + 2β5 + bi’1.

Thus, the difference = β3 + 2β5, assuming bi1 = bi’1 and severei = severei’.

Estimate Statements

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1

Time Trend, Standard Treatment

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β4timej + bi1,

so for a given subject the log odds of normal change by β4 per time period.

Time Trend, New Treatment

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3 + (β4 + β5)timej + bi1,

so for a given subject the log odds of normal change by β4 + β5 per time period.

Specifications

Data Set                             WORK.DEPRESS
Dependent Variable                   y
Distribution for Dependent Variable  Binary
Random Effects                       b0
Distribution for Random Effects      Normal
Subject Variable                     id
Optimization Technique               Dual Quasi-Newton
Integration Method                   Adaptive Gaussian Quadrature

Dimensions

Observations Used      1020
Observations Not Used  0
Total Observations     1020
Subjects               340
Max Obs Per Subject    3
Parameters             6
Quadrature Points      20

NL Mixed Output

Parameters

beta1  beta2  beta3  beta4  beta5  sigma  NegLogLike
-0.03   -1.3  -0.05   0.48   1.01   0.07  580.980217

Iteration History

Iter  Calls  NegLogLike      Diff   MaxGrad     Slope
   1      4  580.977896  0.002321  1.640809  -12.2797
   8     20  580.969876  1.541E-7  0.000097  -3.16E-7

NOTE: GCONV convergence criterion satisfied.

Fit Statistics

-2 Log Likelihood         1161.9
AIC (smaller is better)   1173.9
AICC (smaller is better)  1174.0
BIC (smaller is better)   1196.9

NL Mixed Output: Parameter Estimates

Parameter  Estimate  Standard Error   DF  t Value  Pr > |t|  Alpha    Lower    Upper  Gradient
beta1      -0.02795          0.1641  339    -0.17    0.8649   0.05  -0.3508   0.2949   -0.0001
beta2       -1.3152          0.1546  339    -8.50    <.0001   0.05  -1.6194  -1.0110  -0.00002
beta3      -0.05970          0.2225  339    -0.27    0.7886   0.05  -0.4973   0.3779   -1.1E-6
beta4        0.4828          0.1160  339     4.16    <.0001   0.05   0.2547   0.7109  -0.00008
beta5        1.0184          0.1924  339     5.29    <.0001   0.05   0.6400   1.3969  -7.03E-7
sigma       0.06583          1.2417  339     0.05    0.9578   0.05  -2.3766   2.5083  0.000012

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1

Note: the treatment difference at baseline (beta3) is not significant, as expected in a randomized trial (RCT).

NL Mixed Output: Additional Estimates

Label                          Estimate  Standard Error   DF  t Value  Pr > |t|  Alpha   Lower   Upper
treatment effect, time 1         0.9587          0.1523  339     6.30    <.0001   0.05  0.6592  1.2582
treatment effect, time 2         1.9771          0.2663  339     7.42    <.0001   0.05  1.4533  2.5010
time trend standard treatment    0.4828          0.1160  339     4.16    <.0001   0.05  0.2547  0.7109
time trend new treatment         1.5013          0.1608  339     9.34    <.0001   0.05  1.1850  1.8175

logit{Pr(Yij = 1 | bi1)} = β1 + β2severei + β3drugi + β4timej + β5drugi ∗ timej + bi1
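The Additional Estimates are linear combinations of the fitted coefficients, so the point estimates can be checked by hand. The Python sketch below recomputes them from the rounded values in the Parameter Estimates table. The variable names are illustrative, and agreement is only expected up to rounding in the printed output. The standard errors are not reproduced because they also depend on the covariances between the coefficient estimates, which are not part of the output shown here.

```python
# Rounded estimates copied from the Parameter Estimates table.
beta3 = -0.05970   # drug (treatment difference at baseline)
beta4 = 0.4828     # time
beta5 = 1.0184     # drug * time

# Same linear combinations as the four ESTIMATE statements.
estimates = {
    "treatment effect, time 1": beta3 + beta5,
    "treatment effect, time 2": beta3 + 2 * beta5,
    "time trend standard treatment": beta4,
    "time trend new treatment": beta4 + beta5,
}

for label, value in estimates.items():
    print(f"{label:32s} {value:8.4f}")
```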

Conclusions

• Research question: are patient-specific changes in the probability of normal different between the two treatments over time? This corresponds to testing

H0: β5 = 0

• The estimate of β5 is 1.0184 (p-value < .0001). Thus, we reject H0 (no treatment difference in the change over time) and conclude that there are greater patient-specific changes in the probability of normal for the new treatment.

• The estimated odds ratio of normal comparing a patient on the new treatment to a patient on the standard treatment with the same random intercept and severity of initial diagnosis is 2.61 (1.93, 3.52) [exp(0.9587) (exp(0.659), exp(1.258))] for time 1, and 7.22 (4.28, 12.19) [exp(1.977) (exp(1.453), exp(2.501))] for time 2.

Conclusions, continued

• We estimate that the odds of normal for a subject on the standard treatment increase by a factor of 1.62 (exp(0.483)) for each time period. We estimate that the odds of normal for a subject on the new treatment increase by a factor of 4.49 (exp(1.501)) for each time period.

• The odds of normal for a subject with an initial diagnosis of severe depression are 0.27 (exp(−1.315)) times the odds of normal for a subject with mild depression and the same random intercept (i.e., lower odds of normal).

• There appears to be little heterogeneity among subjects (σb = 0.06583). Approximately 95% of patients in the standard group with an initial diagnosis of mild depression are expected to have a baseline (time=0) log odds of normal between −0.1570 and 0.1011 (−0.02795 ± 1.96 × 0.06583), or a baseline probability of normal between 0.461 = exp(−0.1570)/(1 + exp(−0.1570)) and 0.525 = exp(0.1011)/(1 + exp(0.1011)). (Lecture 20 slide 23)

Conclusions, continued

Note that when we interpret the parameter estimates from the mixed model, we interpret them at the patient level. When we report odds ratios comparing two patients, we assume that they have the same random intercept (i.e. the same baseline propensity for normal).
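The odds ratios, per-period odds multipliers, and the 95% range for the baseline probability quoted in the conclusions are simple transformations of the printed estimates. A minimal Python sketch of that arithmetic follows, assuming the rounded values from the SAS output; the helper and variable names are illustrative.

```python
import math


def inv_logit(x):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# (estimate, lower, upper) on the log-odds scale, from Additional Estimates.
treatment_effects = {
    "time 1": (0.9587, 0.6592, 1.2582),
    "time 2": (1.9771, 1.4533, 2.5010),
}
odds_ratios = {
    label: tuple(round(math.exp(v), 2) for v in values)
    for label, values in treatment_effects.items()
}

# Per-period multiplicative change in the odds of normal.
trend_standard = math.exp(0.4828)
trend_new = math.exp(1.5013)

# Severe vs mild initial diagnosis, same random intercept.
or_severe = math.exp(-1.3152)

# Approximate 95% range of baseline log odds for mild, standard-treatment
# subjects: beta1 +/- 1.96 * sigma_b, then mapped to probabilities.
beta1, sigma_b = -0.02795, 0.06583
low, high = beta1 - 1.96 * sigma_b, beta1 + 1.96 * sigma_b

print(odds_ratios)
print(round(trend_standard, 2), round(trend_new, 2), round(or_severe, 2))
print(round(inv_logit(low), 3), round(inv_logit(high), 3))
```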
