Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Statistics 262: IntermediateBiostatistics
Modelling Hazard: Parametric Models
Jonathan Taylor & Kristin Cobb
Statistics 262: Intermediate Biostatistics – p.1/??
Today’s class
Hazard rates (yes, again).
Parametric models: accelerated failuremodels.
Likelihood.
Diagnostics.
Statistics 262: Intermediate Biostatistics – p.2/??
Hazard Rates & Densities
f(t)dt = P (t ≤ T < t + dt) .
h(t)dt = P(t ≤ T < t + dt
∣∣T > t)
=f(t)dt
S(t)
Both model instantaneous probability offailure at time t, but hazard is conditional onsurviving up to time t.
Statistics 262: Intermediate Biostatistics – p.3/??
Hazard Rates & Densities
Conceptually, it can be easier to modelhazard than density.
Hazard incorporates “what-ifs”: “If you surviveup to 100 days, then the probability you willfail within a day is ' h(100).”
Density does not: “The probability you will failon the 101-st day is ' f(100).”
Statistics 262: Intermediate Biostatistics – p.4/??
Weibull
0.0 0.5 1.0 1.5 2.0
01
23
45
Time
Haz
ard
Statistics 262: Intermediate Biostatistics – p.5/??
Log-logistic
0 2 4 6 8 10
0.0
0.5
1.0
1.5
2.0
Time
Haz
ard
Statistics 262: Intermediate Biostatistics – p.6/??
Log-normal
0 2 4 6 8 10
0.0
0.5
1.0
1.5
Time
Haz
ard
Statistics 262: Intermediate Biostatistics – p.7/??
Accelerated failure model
Log-linear model Wi = log Zi, V ar(Wi) = σ2:
log Ti = β0 +
p∑
j=1
Xijβj + Wi
Statistics 262: Intermediate Biostatistics – p.8/??
Accelerated failure model
Survival function:
S(t∣∣β,X
)= SZ
(exp
(β0 +
p∑
j=1
Xijβj
)t
).
Hazard rate:
h(t∣∣β,X
)= exp
(β0 +
p∑
j=1
βjXj
)
× hZ
(exp
(β0 +
p∑
j=1
βjXj
)t
)
Statistics 262: Intermediate Biostatistics – p.9/??
Accelerated failure model
“Accelerated” refers to the fact that increasein linear predictor “accelerates” your positionalong the hazard curve.
All variables have the same shape of hazardfunction, hZ.
Covariates act multiplicatively in front ofhazard and on the time scale.
Statistics 262: Intermediate Biostatistics – p.10/??
Likelihood of model
L(β, σ|(ti, δi), 1 ≤ i ≤ n)
∝
n∏
i=1
(1 − F (ti|β, σ))1−δif(ti|β, σ)δi.
Only difference with this and Kaplan-Meier isthat F is a parametric model so likelihood isactually a function of parameters β, σ.
Maximization is carried out numerically.
Statistics 262: Intermediate Biostatistics – p.11/??
Interpretation of coefficients
Coefficient βj affects median (or any otherquantile of) survival time.
Specifically,
eβj =t50(X1, . . . , Xj + 1, . . . , Xp)
t50(X1, . . . , Xj, . . . , Xp)
= TR((X1, . . . , Xj + 1, . . . , Xp),
(X1, . . . , Xj + 1, . . . , Xp))
Another reason why it is called “accelerated”failure: i.e. for every cigarette you smoke,your median survival time goes down by afactor eβcigarette.
Statistics 262: Intermediate Biostatistics – p.12/??
Fisher information
Unlike linear models there is no explicitformula for variance / covariance ofparameter estimates (β̂, σ̂).
V̂ ar(σ̂, β̂) =
(−
∂2 log L
∂σ∂β
)−1 ∣∣∣∣(β̂,σ̂)
.
Generic way to estimate variance/covarianceof MLE estimates. Obeys “delta rule.”
Statistics 262: Intermediate Biostatistics – p.13/??
Example: variance of sample mean
If σ2 is known
− log L(µ|Z1, . . . , Zn) ∝1
2σ2
(‖Z‖2 − 2nZµ + nµ2
).
Second derivative gives:
−∂2
∂µ2log L(µ|Z1, . . . , Zn) =
n
σ2.
“Inverse” yields
V̂ ar(Z) =σ̂2
n.
Statistics 262: Intermediate Biostatistics – p.14/??
Residual diagnostics: Cox-Snell
Like regression models, to assess goodnessof fit, we use residuals.
Parametric model imposes a (cumulative)hazard function H(t,X, β). Leads toCox-Snell residuals (similar to “QQplot”)
ri = H(ti, Xi, β̂).
If model is correct, the collection (ri, δi)1≤i≤n
should be like a sample from a censoredExp(1) sample. Plot of ri vs. Nelson-Aalenestimator should be linear with slope 1.
Statistics 262: Intermediate Biostatistics – p.15/??
Residual diagnostics: martingale
Mi = δi − ri.
Martingale residuals can be used to seerelation between j-th covariate and hazard byplotting (Xij , ri) and using “lowess” or othersmoother.
Residuals are motivated the Cox model, anddefinitions are basically identical. We’ll revisitthem later.
Statistics 262: Intermediate Biostatistics – p.16/??
Example: HMO-HIV+
We will look at many models:Exponential: only drug.Weibull: only drug.Weibull: drug + age.Lognormal: drug + age.Loglogistic: drug + age.
Statistics 262: Intermediate Biostatistics – p.17/??
Exponential vs. Weibull
Exponential is special case of Weibull withα = 1.
Can test if exponential is appropriate bytesting α = 1 or not.
Statistics 262: Intermediate Biostatistics – p.18/??
Cox-snell: exponential
0.0 0.5 1.0 1.5 2.0 2.5 3.0
0.0
1.0
2.0
3.0
Model hazard
Est
imat
ed h
azar
d
Statistics 262: Intermediate Biostatistics – p.19/??
Cox-snell: Weibull
0 1 2 3 4 5
01
23
Model hazard
Est
imat
ed h
azar
d
Statistics 262: Intermediate Biostatistics – p.20/??
Martingale: Weibull
20 25 30 35 40 45 50 55
−4
−3
−2
−1
01
Age
Mar
tinga
le r
esid
ual
Statistics 262: Intermediate Biostatistics – p.21/??