21
Statistics 262: Intermediate Biostatistics Modelling Hazard: Parametric Models Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics – p.1/??

Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Statistics 262: IntermediateBiostatistics

Modelling Hazard: Parametric Models

Jonathan Taylor & Kristin Cobb

Statistics 262: Intermediate Biostatistics – p.1/??

Page 2: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Today’s class

Hazard rates (yes, again).

Parametric models: accelerated failuremodels.

Likelihood.

Diagnostics.

Statistics 262: Intermediate Biostatistics – p.2/??

Page 3: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Hazard Rates & Densities

f(t)dt = P (t ≤ T < t + dt) .

h(t)dt = P(t ≤ T < t + dt

∣∣T > t)

=f(t)dt

S(t)

Both model instantaneous probability offailure at time t, but hazard is conditional onsurviving up to time t.

Statistics 262: Intermediate Biostatistics – p.3/??

Page 4: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Hazard Rates & Densities

Conceptually, it can be easier to modelhazard than density.

Hazard incorporates “what-ifs”: “If you surviveup to 100 days, then the probability you willfail within a day is ' h(100).”

Density does not: “The probability you will failon the 101-st day is ' f(100).”

Statistics 262: Intermediate Biostatistics – p.4/??

Page 5: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Weibull

0.0 0.5 1.0 1.5 2.0

01

23

45

Time

Haz

ard

Statistics 262: Intermediate Biostatistics – p.5/??

Page 6: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Log-logistic

0 2 4 6 8 10

0.0

0.5

1.0

1.5

2.0

Time

Haz

ard

Statistics 262: Intermediate Biostatistics – p.6/??

Page 7: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Log-normal

0 2 4 6 8 10

0.0

0.5

1.0

1.5

Time

Haz

ard

Statistics 262: Intermediate Biostatistics – p.7/??

Page 8: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Accelerated failure model

Log-linear model Wi = log Zi, V ar(Wi) = σ2:

log Ti = β0 +

p∑

j=1

Xijβj + Wi

Statistics 262: Intermediate Biostatistics – p.8/??

Page 9: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Accelerated failure model

Survival function:

S(t∣∣β,X

)= SZ

(exp

(β0 +

p∑

j=1

Xijβj

)t

).

Hazard rate:

h(t∣∣β,X

)= exp

(β0 +

p∑

j=1

βjXj

)

× hZ

(exp

(β0 +

p∑

j=1

βjXj

)t

)

Statistics 262: Intermediate Biostatistics – p.9/??

Page 10: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Accelerated failure model

“Accelerated” refers to the fact that increasein linear predictor “accelerates” your positionalong the hazard curve.

All variables have the same shape of hazardfunction, hZ.

Covariates act multiplicatively in front ofhazard and on the time scale.

Statistics 262: Intermediate Biostatistics – p.10/??

Page 11: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Likelihood of model

L(β, σ|(ti, δi), 1 ≤ i ≤ n)

n∏

i=1

(1 − F (ti|β, σ))1−δif(ti|β, σ)δi.

Only difference with this and Kaplan-Meier isthat F is a parametric model so likelihood isactually a function of parameters β, σ.

Maximization is carried out numerically.

Statistics 262: Intermediate Biostatistics – p.11/??

Page 12: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Interpretation of coefficients

Coefficient βj affects median (or any otherquantile of) survival time.

Specifically,

eβj =t50(X1, . . . , Xj + 1, . . . , Xp)

t50(X1, . . . , Xj, . . . , Xp)

= TR((X1, . . . , Xj + 1, . . . , Xp),

(X1, . . . , Xj + 1, . . . , Xp))

Another reason why it is called “accelerated”failure: i.e. for every cigarette you smoke,your median survival time goes down by afactor eβcigarette.

Statistics 262: Intermediate Biostatistics – p.12/??

Page 13: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Fisher information

Unlike linear models there is no explicitformula for variance / covariance ofparameter estimates (β̂, σ̂).

V̂ ar(σ̂, β̂) =

(−

∂2 log L

∂σ∂β

)−1 ∣∣∣∣(β̂,σ̂)

.

Generic way to estimate variance/covarianceof MLE estimates. Obeys “delta rule.”

Statistics 262: Intermediate Biostatistics – p.13/??

Page 14: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Example: variance of sample mean

If σ2 is known

− log L(µ|Z1, . . . , Zn) ∝1

2σ2

(‖Z‖2 − 2nZµ + nµ2

).

Second derivative gives:

−∂2

∂µ2log L(µ|Z1, . . . , Zn) =

n

σ2.

“Inverse” yields

V̂ ar(Z) =σ̂2

n.

Statistics 262: Intermediate Biostatistics – p.14/??

Page 15: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Residual diagnostics: Cox-Snell

Like regression models, to assess goodnessof fit, we use residuals.

Parametric model imposes a (cumulative)hazard function H(t,X, β). Leads toCox-Snell residuals (similar to “QQplot”)

ri = H(ti, Xi, β̂).

If model is correct, the collection (ri, δi)1≤i≤n

should be like a sample from a censoredExp(1) sample. Plot of ri vs. Nelson-Aalenestimator should be linear with slope 1.

Statistics 262: Intermediate Biostatistics – p.15/??

Page 16: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Residual diagnostics: martingale

Mi = δi − ri.

Martingale residuals can be used to seerelation between j-th covariate and hazard byplotting (Xij , ri) and using “lowess” or othersmoother.

Residuals are motivated the Cox model, anddefinitions are basically identical. We’ll revisitthem later.

Statistics 262: Intermediate Biostatistics – p.16/??

Page 17: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Example: HMO-HIV+

We will look at many models:Exponential: only drug.Weibull: only drug.Weibull: drug + age.Lognormal: drug + age.Loglogistic: drug + age.

Statistics 262: Intermediate Biostatistics – p.17/??

Page 18: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Exponential vs. Weibull

Exponential is special case of Weibull withα = 1.

Can test if exponential is appropriate bytesting α = 1 or not.

Statistics 262: Intermediate Biostatistics – p.18/??

Page 19: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Cox-snell: exponential

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0.0

1.0

2.0

3.0

Model hazard

Est

imat

ed h

azar

d

Statistics 262: Intermediate Biostatistics – p.19/??

Page 20: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Cox-snell: Weibull

0 1 2 3 4 5

01

23

Model hazard

Est

imat

ed h

azar

d

Statistics 262: Intermediate Biostatistics – p.20/??

Page 21: Statistics 262: Intermediate Biostatisticsstatweb.stanford.edu/.../spring.2004/notes/week6.pdf · Both model instantaneous probability of failure at time t, but hazard is conditional

Martingale: Weibull

20 25 30 35 40 45 50 55

−4

−3

−2

−1

01

Age

Mar

tinga

le r

esid

ual

Statistics 262: Intermediate Biostatistics – p.21/??