BIO656--Multilevel Models, Term 4, 2006

PART 8

Two Stage & Joint Models


SEERMED DATA

End of Life Colorectal Cancer Costs

Motivation:

[Figure: Expenditure, axis from $0 to $500,000]


Data

[Diagram labels: Patient – Physician; Cancer Diagnosis; Death; Terminal-Phase Costs (12 mos); Professional Health-Care Services; HMO, Hospice, FFS; Private Ins., Medicare; Claims (Rejected, Allowed, Co-Pay, Deductibles); Medicare Payments]

Factors: Need-based, Enabling, Predisposing


Data

[Diagram labels: Patient – Physician; Cancer Diagnosis; Death; Terminal-Phase Costs (12 mos, 3 mos); Medicare Payments]




A “Normal” Distribution

[Figure: density of Y]


A Complex Distribution

[Figure: density of Y]


Complex Distributions: Mixtures of Simple Distributions

[Figure: density of Y]

Mixtures-of-Experts Models (MEM): Jacobs, Jordan (1991), Neural Comp

Finite Mixture Models (FMM): McLachlan, Peel (2001)


A simple, two-part mixture

Outcome: $0 versus $+ (a positive cost)

1. P(Y>0)

2. E(Y|Y>0) = E(Y+)
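The two parts recombine into the overall mean through E(Y) = P(Y>0) × E(Y|Y>0). A minimal Python sketch of that arithmetic follows; the cost values and the function name `two_part_mean` are made up for illustration and are not SEERMED figures.

```python
def two_part_mean(p_positive, mean_positive):
    """Overall mean of a semicontinuous outcome: E(Y) = P(Y>0) * E(Y | Y>0)."""
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability")
    return p_positive * mean_positive

# Hypothetical monthly costs (illustrative only, not SEERMED data).
costs = [0.0, 0.0, 1200.0, 800.0, 4000.0]
positive = [y for y in costs if y > 0]

p_hat = len(positive) / len(costs)         # part 1: P(Y > 0)
mean_pos = sum(positive) / len(positive)   # part 2: E(Y | Y > 0)
overall = two_part_mean(p_hat, mean_pos)   # agrees with the plain sample mean

print(p_hat, mean_pos, overall)
```

The product of the two parts reproduces the ordinary sample mean, which is why the two pieces can be modeled separately and then recombined.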


A Two-Part Model: (Intensity & Size)

IS – logit/lognormal

1. $\operatorname{logit}\{\Pr(Y_i > 0)\} = x^\top\alpha$

2. i.) $\log_{10}(Y_i^+) = x^\top\beta + \varepsilon_i$

ii.) $\varepsilon_i \sim N(0, \sigma^2)$

0. “Tobit” model: Tobin (1958)

1. Selection (hurdle) models: (Amemiya 1984; Heckman 1976)

2. Zero-inflated models (Lambert 1992; Greene 1994)

3. Two-part models (Manning 1981; Mullahy 1998)
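To make the logit/lognormal specification concrete, here is a small simulation sketch in Python. The coefficient values, the seed, and the helper names (`expit`, `simulate_two_part`) are assumptions chosen only for illustration; the sketch draws a zero with probability 1 − Pr(Y>0) and otherwise draws log10(Y+) from a normal distribution.

```python
import math
import random

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_two_part(x, alpha, beta, sigma, rng):
    """Draw one cost from a logit/lognormal two-part model.

    x, alpha, beta are equal-length lists; sigma is the SD of log10(Y+).
    """
    eta0 = sum(a * xi for a, xi in zip(alpha, x))   # intensity linear predictor
    eta1 = sum(b * xi for b, xi in zip(beta, x))    # size linear predictor
    if rng.random() >= expit(eta0):
        return 0.0                                  # part 1: no expenditure
    return 10.0 ** rng.gauss(eta1, sigma)           # part 2: normal on the log10 scale

# Hypothetical parameter values (not estimates from the SEERMED data).
rng = random.Random(656)
alpha = [1.0, 0.5]
beta = [3.0, 0.2]
x = [1.0, 1.0]   # intercept plus one binary covariate

draws = [simulate_two_part(x, alpha, beta, 0.5, rng) for _ in range(20000)]
n_pos = sum(y > 0 for y in draws)
frac_positive = n_pos / len(draws)
mean_log10_pos = sum(math.log10(y) for y in draws if y > 0) / n_pos

print(round(frac_positive, 3), round(mean_log10_pos, 3))
```

With these assumed values the fraction of positive draws should sit near expit(1.5) and the mean of log10(Y+) near 3.2, up to simulation noise.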


Another Two-Part Model: (Intensity & Size)

IS – Probit/log-Gamma

1. $\Phi^{-1}\{\Pr(Y_i > 0)\} = x^\top\alpha$

2. i.) $\log_{10}\{E(Y_i^+)\} = x^\top\beta$

ii.) $Y_i^+ \sim \mathrm{Gamma}(\mu_i, \phi)$


A Two-Part Model: The Intensity-Size GLM

IS – GLM

h1: binary data link function

h2: continuous data link function

f: exponential family w/ dispersion


Multiple Levels 1

[Diagram: outcome split into a 0 part and a + part]


Monthly SEERMED Data

[Diagram: Month 10, Month 11, Month 12; each month's outcome has a 0 part and a + part]


HMREM1

[Diagram: one two-part (0 / +) outcome for each of Month 10, Month 11, Month 12; labels: Time, X; g1, g2; f10, f11, f12; a, b]

Multiple Levels 2

[Diagram: covariates X attached at several levels]


A 2-Part Model

1. Intensity: $\operatorname{logit}(\pi_i) = x^\top\alpha$

2. Size:

a) $\mu_i = x^\top\beta$

b) $Y_i^+ \sim f(\mu_i, \phi)$


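A Longitudinal 2-Part Model

1. Intensity: $\operatorname{logit}(\pi_{ic}) = x^\top\alpha + z^\top a_i$

2. Size:

a) $\mu_{ic} = x^\top\beta + z^\top b_i$

b) $Y_{ic}^+ \sim f(\mu_{ic}, \phi)$

3. Random Effects:

$u_i = \begin{pmatrix} a_i \\ b_i \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right), \quad \Sigma = \begin{pmatrix} \tau_{aa} & \tau_{ba} \\ \tau_{ba} & \tau_{bb} \end{pmatrix}$

1. Olsen, Schafer (2001)

2. Tooze, Grunwald, Jones (2002)

3. Yau, Lee, Ng (2002)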
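The correlated random effects are what tie the intensity and size parts together within a subject. A minimal Python sketch, assuming illustrative values for tau_aa, tau_ba, tau_bb and a hypothetical helper `draw_random_effects`, draws the pair (a_i, b_i) with a hand-written 2×2 Cholesky factor and checks the implied correlation.

```python
import math
import random

def draw_random_effects(tau_aa, tau_ba, tau_bb, rng):
    """Draw (a_i, b_i) ~ N(0, Sigma) using the Cholesky factor of the 2x2 Sigma."""
    l11 = math.sqrt(tau_aa)
    l21 = tau_ba / l11
    l22 = math.sqrt(tau_bb - l21 ** 2)   # real only if Sigma is positive definite
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return l11 * z1, l21 * z1 + l22 * z2

# Hypothetical covariance parameters (not fitted values).
tau_aa, tau_ba, tau_bb = 1.0, 0.3, 0.25
target_corr = tau_ba / math.sqrt(tau_aa * tau_bb)   # 0.3 / 0.5 = 0.6

rng = random.Random(2006)
effects = [draw_random_effects(tau_aa, tau_ba, tau_bb, rng) for _ in range(20000)]

n = len(effects)
mean_a = sum(a for a, _ in effects) / n
mean_b = sum(b for _, b in effects) / n
var_a = sum((a - mean_a) ** 2 for a, _ in effects) / n
var_b = sum((b - mean_b) ** 2 for _, b in effects) / n
cov_ab = sum((a - mean_a) * (b - mean_b) for a, b in effects) / n
sample_corr = cov_ab / math.sqrt(var_a * var_b)

print(round(target_corr, 3), round(sample_corr, 3))
```

A positive tau_ba means subjects who are more likely to incur any cost also tend to incur larger costs when they do.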


Data Analysis: 3 General Steps

1. Exploration

2. Model Fitting and Estimation

3. Diagnostics

and the greatest of these is…


Uncooked Spaghetti Plot

[Figure: log10(Cost + 1), axis 0 to 5, plotted against Month (10, 11, 12); points are labeled 0 to 3]




Figure 5: SEERMED log10 month 1 & 2

[Figure: density of Month 10 & Month 11 log10(Costs); axes: Expenditure 10, Expenditure 11 (0 to 5), Density. Annotated components: Bivariate Point Mass; Univariate Continuous Distbs.; Bivariate Continuous Distb.]

PRISM plot: Month 10 & 11 SEERMED Costs

[Figure: SEERMED Costs: Months 10 & 11. Paired-response plot of log10(y10 + 1) against log10(y11 + 1), axes 0 to 5. Annotations: D1 0.77, D2 0.36, Rho 0.56, OR 12.9. Zero/positive cell percentages: 13%, 7%; 10%, 70%. Plot components: Paired Response, Intensity, Size, Mixture plot. Random-effect labels: aa, bb, ba.]
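The OR annotation on the PRISM plot can be checked against the four cell percentages. The sketch below assumes the extracted percentages form a 2×2 table with 13% and 70% on one diagonal and 7% and 10% on the other, which is how they appear in the slide text; the function name `odds_ratio` is illustrative.

```python
def odds_ratio(p_a, p_b, p_c, p_d):
    """Cross-product ratio of the 2x2 table [[p_a, p_b], [p_c, p_d]]."""
    return (p_a * p_d) / (p_b * p_c)

# Rounded percentages as they appear on the slide, in assumed cell order.
or_from_rounded = odds_ratio(0.13, 0.07, 0.10, 0.70)
print(round(or_from_rounded, 2))
```

The rounded percentages give a value close to the reported 12.9; the small gap is presumably because the slide's OR was computed from unrounded proportions.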


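PRISM Matrix: Months 10-12

[Figure: matrix of PRISM plots for Months 10, 11, 12. Diagonal panels: densities of logy10, logy11, logy12 (axes 0 to 6). Off-diagonal panel annotations as extracted: (0.36, 0.56, 12.9, 0.77), (0.04, 0.34, 8.65, 0.54), (0.25, 0.51, 15.12, 0.83). Zero/positive cell percentages as extracted: (13%, 7%; 10%, 70%), (8%, 5%; 15%, 72%), (9%, 4%; 10%, 77%).]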


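SEERMED MREM

1. Intensity (Probit, Logistic):

$h_1(\pi_{ic}) = \alpha_0 + \alpha_1\,\mathrm{Obs} + \alpha_2\,\mathrm{Male} + \alpha_3\,\mathrm{Obs{*}Male} + a_i$

2. Size (Lognormal, Gamma):

a) $h_2(\mu_{ic}) = \beta_0 + \beta_1\,\mathrm{Obs} + \beta_2\,\mathrm{Male} + b_i$

b) $Y_{ic}^+ \sim f(\mu_{ic}, \phi)$

3. Random Effects:

$u_i = \begin{pmatrix} a_i \\ b_i \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right), \quad \Sigma = \begin{pmatrix} \tau_a^2 & \tau_{ba} \\ \tau_{ba} & \tau_b^2 \end{pmatrix}$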


Estimation

Likelihood: $L_i(\theta)$

Whoa.

But: Non-Linear Mixed Model (NLMM)

• PQL, MCEM, MCMC, …

• Adaptive Quadrature – Newton-Raphson

Zeger, Karim (1991); Davidian, Giltinan (1993); Pinheiro, Bates (1995);

McCulloch (1997); Booth et al. (2001); Rabe-Hesketh et al. (2004)
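The likelihood is hard because the random effects must be integrated out. As a rough illustration of the quadrature idea, the Python sketch below approximates the marginal probability E[expit(eta + a)] for a single random intercept a ~ N(0, tau²) with ordinary 5-point Gauss-Hermite quadrature. This is the plain, non-adaptive rule, which is simpler than the adaptive quadrature named on the slide, and the function names and input values are assumptions for illustration.

```python
import math

# 5-point Gauss-Hermite nodes and weights (weight function exp(-x^2)).
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob_positive(eta, tau):
    """Approximate E[expit(eta + a)], a ~ N(0, tau^2), by Gauss-Hermite quadrature.

    Uses the change of variable a = sqrt(2) * tau * x, so that
    E[g(a)] is approximately (1 / sqrt(pi)) * sum_k w_k * g(sqrt(2) * tau * x_k).
    """
    total = 0.0
    for x_k, w_k in zip(GH_NODES, GH_WEIGHTS):
        total += w_k * expit(eta + math.sqrt(2.0) * tau * x_k)
    return total / math.sqrt(math.pi)

conditional = expit(1.0)                     # Pr(Y>0 | a = 0)
marginal = marginal_prob_positive(1.0, 1.0)  # averaged over the random intercept
print(round(conditional, 4), round(marginal, 4))
```

The marginal probability is pulled toward 0.5 relative to the conditional one. The full two-part likelihood needs a two-dimensional integral over (a, b), and adaptive rules recenter and rescale the nodes for each subject.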


Estimation: SAS

proc nlmixed data=SEERMED;
  parms / data=parms_start;   *- starting values read from a data set -*;

  *- 1) logistic: logit{Pr( Y>0 | a )} = Xalpha + a = “eta0” -*;
  eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
  pi_c = exp(eta0) / (1+exp(eta0));

  *- 2) log-normal: E( log10(Y) | Y>0, b ) = XB + b = “eta1” -*;
  eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;

  *- log-likelihood: Gpos is the indicator of Y>0 -*;
  pi = CONSTANT('PI');
  if y=0 then ll1 = 0;
  else ll1 = -.5*log(2*pi*sigma**2) - .5*((log10y-eta1)/sigma)**2;
  ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);

  model y ~ GENERAL(ll);
  RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
run;
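For readers who do not use SAS, the per-observation log-likelihood in the NLMIXED step can be mirrored in a few lines of Python. This is only a sketch of the conditional contribution given the random effects (the integration over a and b is what NLMIXED does); it assumes log10y is log10(y) for positive y, and the function name `loglik_contribution` is hypothetical.

```python
import math

def loglik_contribution(y, eta0, eta1, sigma):
    """Conditional log-likelihood of one observation given the random effects.

    eta0: intensity linear predictor (logit scale), including a.
    eta1: size linear predictor (log10 scale), including b.
    sigma: SD of log10(Y) among positive costs.
    """
    pi_c = 1.0 / (1.0 + math.exp(-eta0))   # Pr(Y > 0 | a)
    if y == 0:
        return math.log(1.0 - pi_c)        # zero-cost part only
    resid = (math.log10(y) - eta1) / sigma  # assumes log10y = log10(y)
    ll1 = -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * resid ** 2
    return math.log(pi_c) + ll1            # positive part: intensity + size

ll_zero = loglik_contribution(0.0, 0.0, 3.0, 1.0)
ll_pos = loglik_contribution(1000.0, 0.0, 3.0, 1.0)
print(round(ll_zero, 4), round(ll_pos, 4))
```

A zero cost contributes only log(1 − pi_c), while a positive cost contributes log(pi_c) plus the normal log-density of log10(y), matching the Gpos switch in the SAS expression.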


Estimation: SAS (better)

proc nlmixed data=sanfran qpoints=10;
  parms / data=parms_start;

  *-logit-*;
  eta0 = alpha0_c + alpha1_c*obs + alpha2_c*male + alpha3_c*obsmale + a;
  expeta = exp(eta0);
  pi_c = expeta / (1+expeta);
  tau_aa = exp(logtau_a)**2;   *log-scale SD keeps tau_aa positive;

  *-lognormal-*;
  eta1 = beta0_c + beta1_c*obs + beta2_c*male + b;
  phi = 10**(log10phi);   *std dev of log10(Y+1)|b;
  tau_bb = (10**(log10tau_b))**2;

  *- RE Var: Fisher z scale keeps rho_ba inside (-1, 1) -*;
  rho_ba = (exp(2*zrho_ba) - 1) / (exp(2*zrho_ba) + 1);
  tau_ba = rho_ba*(tau_aa*tau_bb)**.5;

  *- log-likelihood -*;
  pi = CONSTANT('PI');
  if y=0 then ll1 = 0;
  else ll1 = -.5*log(2*pi*phi**2) - .5*((log10y-eta1)/phi)**2;
  ll = (1-Gpos)*log(1-pi_c) + Gpos*log(pi_c) + Gpos*(ll1);

  model y ~ GENERAL(ll);
  RANDOM a b ~ NORMAL([0,0],[tau_aa, tau_ba, tau_bb]) SUBJECT=id;
  ods output ParameterEstimates = parms_new;
run;
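What appears to make this version "better" is that every variance parameter is estimated on an unconstrained scale and then mapped back, so the optimizer cannot wander into an invalid covariance matrix. The Python sketch below reproduces the three transformations from the SAS step; the input values and the function name `covariance_from_unconstrained` are hypothetical.

```python
import math

def covariance_from_unconstrained(logtau_a, log10tau_b, zrho_ba):
    """Map unconstrained parameters to (tau_aa, tau_ba, tau_bb).

    Mirrors the SAS step: log SD for a, log10 SD for b, Fisher z for the correlation.
    """
    tau_aa = math.exp(logtau_a) ** 2
    tau_bb = (10.0 ** log10tau_b) ** 2
    rho_ba = (math.exp(2.0 * zrho_ba) - 1.0) / (math.exp(2.0 * zrho_ba) + 1.0)
    tau_ba = rho_ba * math.sqrt(tau_aa * tau_bb)
    return tau_aa, tau_ba, tau_bb

# Any moderate real inputs give a valid covariance matrix (values are illustrative).
tau_aa, tau_ba, tau_bb = covariance_from_unconstrained(-1.0, 0.3, 2.0)
determinant = tau_aa * tau_bb - tau_ba ** 2
implied_rho = tau_ba / math.sqrt(tau_aa * tau_bb)
print(tau_aa > 0, tau_bb > 0, determinant > 0, round(implied_rho, 4))
```

The correlation formula is the inverse Fisher transformation, equal to tanh(zrho_ba). In floating point the exponential can overflow for very large |zrho_ba|, so an implementation outside SAS might call `math.tanh` directly.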


SEERMED MREM Results 1


Profile ll (alpha3)

[Figure: MREM Profile Likelihood Plots for α3, the intensity model Obs*Male interaction term. y-axis: Scaled Profile Likelihood. Curves: Logit-Lognormal, Logit-Gamma, Probit*-Lognormal, Probit*-Gamma. Annotation: LR 6.]


SEERMED MREM Results 2

[Figure: PRISM plot: Month 10 & 11 SEERMED Costs, repeated. Annotations: D1 0.77, D2 0.36, Rho 0.56, OR 12.9; cell percentages 13%, 7%; 10%, 70%; random-effect labels aa, bb, ba.]


SEERMED MREM Results 2

But do these models fit?…


Data vs. MREM Models

[Figure: panels for Female and Male across Month 10, 11, 12. Intensity: Pr(Y>0), axis 0.0 to 1.0; Size: E(Y|Y>0), axis 0 to 12000. Plotted labels: Y10+, Y11+, Y12+ (observed size); P10 to P12, L10 to L12, G10 to G12 (expected under the fitted models). Legend: Obs: [symbol not recoverable], Y; Exp: P, L, G.]


Diagnostic PRISM Matrix: lognormal IS-GLMM Residuals

[Figure: Observed vs. Expected PRISM matrix. Diagonal panels: QQ Plot1, QQ Plot2, QQ Plot3 (Observed against Expected, axes 0 to 6). Cell percentages as extracted: (12%, 6%; 13%, 70%), (9%, 4%; 15%, 71%), (13%, 7%; 10%, 70%), (8%, 5%; 10%, 77%), (8%, 5%; 15%, 72%), (9%, 4%; 10%, 77%).]


Diagnostic PRISM Matrix: lognormal IS-GLMM Residuals

[Figure: Observed vs. Expected PRISM matrix. Diagonal panels: densities of Res10, Res11, Res12 (axes 0 to 6). Cell percentages as extracted: (12%, 6%; 13%, 70%), (9%, 4%; 15%, 71%), (13%, 7%; 10%, 70%), (8%, 5%; 10%, 77%), (8%, 5%; 15%, 72%), (9%, 4%; 10%, 77%).]


Review & Related Work

MEM

MREM

HMREM (HMMMM)

[Diagram labels: 0, 1, 2, +]

Ideas

1. Simple Combinations of Simple Models

2. Complex (Multi-Level) Data: Many Models & Many Pictures


Data vs. HMREM Models

[Figure: Data vs. HMMMM Models. Panels for Female and Male across Month 10, 11, 12. Intensity: Pr(Y>0), axis 0.0 to 1.0; Size: E(Y|Y>0), axis 0 to 12000. Plotted labels: Y10+, Y11+, Y12+ (observed size); G10 to G12, L10 to L12, H10 to H12, l10 to l12 (expected under the fitted models).]


Review & Related Work

• These ideas are not just for Zero-Inflated Data

• Latent Variables are useful for “connecting” things


Opportunistic Infection & IDU

[Figure: each line represents 1 subject’s time in the study. x-axis: Day in Study, starting 6 months prior to 1st interview. Markers: Opportunistic Infection; Interview: Reported Drug Use; Interview: Reported No Drug Use. Groups: Always Users, Intermittent Users, Never Users.]


But what about Possible Informative Missingness?

[Diagram labels: Drug Use; OI; Death / Dropout]


Jointly Analyze Survival & OIs

1) Logistic model:

$\operatorname{logit}\{\Pr(OI_{ij} \mid a_i)\} = \beta_0 + \beta_1 SU_{ij} + \beta_2 SU_{ij}{*}HCuse_{ij} + \beta_3 AU_{ij} + \beta_4 Period_j + a_i$

2) Survival model:

$\log\{\lambda(t)\} = \gamma_0 + \gamma_1 SU_{ij} + \gamma_2 AU_{ij} + a_i$

3) Latent effects:

$a_i \sim N(0, \tau)$

Guo & Carlin (2004)
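A shared latent effect is what lets the joint model account for informative dropout: the same a_i raises both the chance of an opportunistic infection and the hazard of death. The Python simulation below is a minimal sketch of that mechanism with covariates omitted; the intercepts (−1 and −2), the unit variance, the constant hazard, and the function name `simulate_subject` are all assumptions for illustration, not estimates from the study.

```python
import math
import random

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_subject(rng, sd_a=1.0):
    """One subject: shared latent effect drives both OI risk and hazard."""
    a = rng.gauss(0.0, sd_a)                  # latent effect a_i
    p_oi = expit(-1.0 + a)                    # per-visit OI probability (hypothetical intercept)
    hazard = math.exp(-2.0 + a)               # constant hazard (hypothetical baseline)
    time_to_event = rng.expovariate(hazard)   # exponential survival time
    return p_oi, time_to_event

rng = random.Random(58)
subjects = [simulate_subject(rng) for _ in range(4000)]

times = sorted(t for _, t in subjects)
median_time = times[len(times) // 2]
early = [p for p, t in subjects if t < median_time]    # died or dropped out early
late = [p for p, t in subjects if t >= median_time]

mean_early = sum(early) / len(early)
mean_late = sum(late) / len(late)
print(round(mean_early, 3), round(mean_late, 3))
```

Subjects who leave early have systematically higher OI risk in this sketch, so an analysis of OIs that ignores survival would see a selected, healthier sample at later visits.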


Warning!

• But “Buyer Beware”

-- Model Assumptions

-- Identifiability

-- Model Fit

-- Marginalize & Check whenever possible

• MLMs require even more due-diligence than usual


References

• Mixture Models:

– McLachlan, G. J. and Peel, D. (2001), Finite Mixture Models, John Wiley & Sons.

– Jacobs, R. A. and Jordan, M. I. (1991), “Adaptive mixtures of local experts,” Neural Computation, 3, 79–87.

• Two-Part Models:

– Tobin, J. (1958), “Estimation of Relationships for Limited Dependent Variables,” Econometrica, 26, 24–36.

– Amemiya, T. (1984), “Tobit models: A survey,” Journal of Econometrics, 24, 3–61.

– Heckman, J. (1976), “The common structure of statistical models of truncation, sample selection, and limited dependent variables, and a simple estimator for such models,” Annals of Economic and Social Measurement, 5, 475–492.

– Lambert, D. (1992), “Zero-inflated Poisson regression, with an application to defects in manufacturing,” Technometrics, 34, 1–14.

– Greene, W. (1994), “Accounting for excess zeros and sample selection in Poisson and negative binomial regression models,” Working Paper EC-94-10, Department of Economics, New York University.

– Manning, W., Newhouse, J., Orr, L., Duan, N., Keeler, E., Leibowitz, A., Marquis, M., and Phelps, C. (1981), “A two-part model of the demand for medical care: Preliminary results from the health insurance experiment,” in Health, Economics, and Health Economics, eds. van der Gaag, J. and Perlman, M., pp. 103–104.

– Mullahy, J. (1998), “Much ado about two: reconsidering retransformation and the two part model in health economics,” Journal of Health Economics, 17, 247–281.

• Longitudinal 2-part models:

– Olsen, M. K. and Schafer, J. L. (2001), “A two-part random-effects model for semicontinuous longitudinal data,” Journal of the American Statistical Association, 96, 730–745.

– Tooze, J. A., Grunwald, G. K., and Jones, R. H. (2002), “Analysis of repeated measures data with clumping at zero,” Statistical Methods in Medical Research, 11, 341–355.

– Yau, K. K. W., Lee, A. H., and Ng, A. S. K. (2002), “A zero-augmented gamma mixed model for longitudinal data with many zeros,” The Australian and New Zealand Journal of Statistics, 44, 177–183.

• Estimation:

– Zeger, S. L. and Karim, M. R. (1991), “Generalized linear models with random effects: A Gibbs sampling approach,” Journal of the American Statistical Association, 86, 79–86.

– Davidian, M. and Giltinan, D. M. (1993), “Some general estimation methods for nonlinear mixed-effects models,” Journal of Biopharmaceutical Statistics, 3, 23–55.

– Pinheiro, J. C. and Bates, D. M. (1995), “Approximations to the log-likelihood function in the nonlinear mixed-effects model,” Journal of Computational and Graphical Statistics, 4, 12–35.

– McCulloch, C. E. (1997), “Maximum likelihood algorithms for generalized linear mixed models,” Journal of the American Statistical Association, 92, 162–170.

– Booth, J. G., Hobert, J. P., and Jank, W. (2001), “A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model,” Statistical Modelling: An International Journal, 1, 333–349.

– Rabe-Hesketh, S., Skrondal, A., and Pickles, A. (2004), “Maximum likelihood estimation of limited and discrete variable models with nested random effects,” Journal of Econometrics, in press.

• Other:

– Guo, X. and Carlin, B. P. (2004), “Separate and Joint Modeling of Longitudinal and Event Time Data Using Standard Computer Packages,” The American Statistician, 58, 16–24.
