Threshold Regression Models Mei-Ling Ting Lee, University of Maryland, College Park MLTLEE@UMD.EDU

Preview:

Citation preview

Threshold Regression Models

Mei-Ling Ting Lee, University of Maryland, College Park

MLTLEE@UMD.EDU

2

Outline

• An example to demonstrate the usefulness of the first-hitting time based threshold regression (TR) model.

• Introduction of the TR model

• Connections with the PH model

• Semi-parametric TR model

• Simulations

3

A non-proportional hazard example: Time to infection of kidney dialysis patients with different catheterization procedures(Nahman et al 1992, Klein & Moesberger 2003)

• Surgical group:43 patients utilized a surgically placed catheter

• Percutaneous group:76 patients utilized a percutaneous placement

of their catheter

The survival time is defined by the time to cutaneous exit-site infection.

4

Kaplan-Meier Estimate versus PH Cox Model

5

Weibull versus Lognormal

6

Loglogistic versus Gamma

7

Kaplan-Meier Estimate versus First-hitting-time based Threshold Regression Model

9

First-hitting Time Based Threshold Regression:Modeling Event Times by

a Stochastic Process Reaching a Boundary (Lee & Whitmore 2006, Statistical Sciences)

• Example: Equipment Failure: Equipment fails when its cumulative wear first reaches a failure

threshold.

Question: What is the influence of ambient temperature on failure?

10

Occupational and Environmental Health

Occupational risk: A railroad worker is exposed to diesel exhaust in the workplace. The exposure and other influences cause the worker’s health status to gradually tend downward toward a critical threshold that will result in death from a particular cause (e.g., lung cancer).

Question: Does diesel exhaust exposure increase the risk of lung cancer and, if so, to what degree?

11

y0

S0

0

time t

Process Y(t)

First hitting time S of a fixed boundary at level zero for a stochastic process of interest Y(t)

Sample path

First Hitting Time (FHT) Models

{Y(t)} : the stochastic process of interest

: the boundary set

First hitting time S defined by

S = inf { t : Y(t) ∈ B }

13

Examples of first hitting time (FHT) models:

• Wiener diffusion to a fixed boundaryProgress of multiple myeloma until death • Renewal process to a fixed count of renewal eventsTime to the nth epileptic seizure • Semi-Markov process to an absorbing state Multi-state model for disease with death as an absorbing state

14

y0

S L0

time t

Process Y(t)

Two sample paths of a stochastic process of interest:

(1) One path experiences ‘failure’ at first hitting time S

(2) One path is ‘surviving’ at end of follow up at time L

Sample paths Y(t)

16

Parameters for the FHT Model

Model parameters for the latent process Y(t) :

• Process parameters: θ = (where is the mean drift and is the variance

• Baseline level of process: Y(0) = y0

• Because Y(t) is latent, we set

17

Likelihood Inference for the FHT Model

The likelihood contribution of each sample subject is as follows.

• If the subject fails at S=s:

f (s | y0, ) = Pr [ first-hitting-time in (s, s+ds) ]

• If the subject survives beyond time L:

1- F (L | y0 ,) = Pr [ no first-hitting-time before L ]

18

0 0 01

ln , ln , 1 ln , .

where

is the failure indicator for subject

is a censored survival time if subject fails

and denote the FHT

n

i i i ii

i

i i i

L x d f t x d F t x

d i

t t s i

f F

p.d.f and complementary c.d.f.

19

Threhold Regression

Link Functions: parametric or semi-parametric

Simultaneous regressions:Possible Link functions for the baseline parameter Y(0) and drift parameter

include

• Linear combinations of covariates X1,…, Xp

• polynomial combinations of X1, …, Xp

• Regression splines• Penalized regression splines• Random effects

20

Threshold regression (TR)

More than one simultaneous regression functions with different links may be used to estimate parameters of:

1. Process Y(t): Wiener process, gamma process, etc

2. Boundary threshold: straight lines or curves

3. Time scale: calendar or running time, analytical time

21

References

• Aalen O.O. and Gjessing H.K. (2001). Understanding the shape of the hazard rate: a process point of view. Statistical Science, 16: 1-22.

• Lawless, J. F. (2003). Statistical Models and Methods for Lifetime Data, Second Edition, Wiley.

• Lee, M.-L. T. and G. A. Whitmore (2006). Threshold regression for survival analysis: modeling event times by a stochastic process reaching a boundary. Statistical Sciences.

• Aalen O.O., Borgon O, and Gjessing H.K (2008). Survival and Event History Analysis: A process Point of View. Springer.

22

Threshold Regression Interpretation of PH functions

• Most survival distributions can be related to hitting time distributions for some stochastic processes.

• Families of PH functions can be generated by varying time scales or boundaries of a TR model

• The same family of PH functions can be produced by different TR models.

• Simulation studies: both TR model and PH hold simultaneously, based on standard Brownian motion with variation of time scale. (Julia Batishev’s presentation in sec 2 of Tract 2 on July 3rd)

23

Threshold Regression

If Y(t) has a Wiener Process, then the first hitting time S has an inverse Gaussian distribution.

Consider

24

Semi-parametric Threshold Regression(Joint work with Z. Yu and W. Tu)

When Y(t) has a Wiener Process, then the first hitting time S has an inverse

Gaussian distribution.

We consider

Where the functional form of is unspecified.

25

Semi-parametric TR using Regression Spline

• Use linear link for covariates X1, …, Xp

• For covariate Z, consider the nonparametric function (z) as a linear combination of a set of basis functions Bj(z).

(z) =j jBj(z).

• Select the smoothing parameter and the number of knots.

26

27

Semi-parametric TR using Penalized Spline

In addition to regression spline, we also consider a cubic spline approach with penalty on the second derivative of the nonparametric function

28

Cross Validation

29

Simulation Procedures

30

Simulation Results

Table 1 summarizes simulation results using both spline approaches.

For the penalized spline approach, over 400 replications, the means are

=0.491 and = 0.504 (very close to the true value =0.5).

The mean of the estimated standard error are 0.311 and 0.180 which are very

close to the empirical standard errors 0.307 and 0.177.

The empirical coverage probabilities are 0.952 and 0.947.

Our simulation results show satisfactory performance of the penalized spline

approach with respect to regression coefficients and corresponding variance

estimation.

The mean of the estimate for over 400 replications is 1.99 which is very close

to the true value 2.

31

32

33

36

Analyzing Longitudinal Survival Data Using Threshold Regression:

Comparison with Cox Regression

Mei-Ling Ting Lee, University of Maryland

G. A. Whitmore, McGill University

Bernard Rosner, Harvard Medical School

37

Longitudinal Data Structure

38

Longitudinal Data Structure

Health examples:

• Annual monitoring of blood pressure

• Current status of disease

• Cohort study of smoking and lung cancer with bi-annual medical checks

39

x0

S

tm

0

A sequence of time points

Process X(t)

Figure: Longitudinal data structure for threshold regression

t1 t2

(x, z, f, c)

…..

40

Longitudinal Data Structure (cont.)

Individual observation sequences may include:

• Process level x• Covariate vector z• Failure indicator f• Censoring indicator c

41

Longitudinal Data Structure (cont.)

42

Uncoupling Longitudinal Data

Definition of uncoupling: Break each longitudinal record into a series of single records.

Handling longitudinal data is simple with uncoupling.

Under what conditions is uncoupling valid?

43

Analysis of Longitudinal Threshold Regression Data with Uncoupling

Define the observation vector for each visit j :

The longitudinal observation sequence is stopped by censoring or failure at some visit m:

44

Analysis of Longitudinal Threshold Regression Data with Uncoupling

Probability of the stopped observation sequence:

Invoke a Markov assumption:

45

A Common Analytical Situation for P(Aj | Aj-1)

Consider independent censoring c , Failure indicator f (f =1 if failed, f = 0 if not)Process reading x , and covariate z

Without readings on process { X (t) } (i.e., the process of interest is latent), probability elements for the likelihood then involve simple conditional expressions:

46

Case Illustration

Threshold regression output for the illustrative longitudinal data set

Assuming X(t) follows a standardized Brownian motion with initial status x0 to a fixed barrier at zero. We can make inferences about the influence of covariates on x0 at each visit.

47

Case Illustration

48

Case Illustration (cont.)

Threshold regression output has the familiar look of conventional regression output but offers greater insights and a richer interpretation

49

Uncoupling in Cox PH Regression

The main probability element

has the following form when Cox PH regression is uncoupled

Term h0j denotes a segment of a discrete baseline hazard function over [tj-1, tj).

52

Output

53

Case Illustration

The Nurses’ Health Study cohort data set

Questionnaire completed every two years

Interest in incidence of lung cancer

Longitudinal records from 1986 to 2000: 115,768 subjects748,007 observational intervals1,577,382 person-years at risk

The health process is latent.

54

Case Illustration (cont.)

Assume independent censoring.

Assume a Wiener diffusion process with zero mean and unit variance (Brownian motion).

• A zero mean is consistent with the data. • A latent health status scale allows the variance to be fixed arbitrarily.

The only parameter is x0, the initial health status at baseline for each observation interval.

55

Case Illustration (cont.)

For each observation interval, the covariates are:

1. Baseline cumulative smoking (pkyrs0, in pack years).

2. Baseline age (age0, in years)

A log-linear link is used, i.e., ln(x0)

56

Case Illustration (cont.)

Threshold regression output based on longitudinal uncoupling

57

Case Illustration (cont.)

Uncoupling breaks each longitudinal record in the data set into a series of single records.

Once parameters are estimated, predictive inferences can be made by splicing together forward records of a case using specified covariate conditions. In essence, splicing is the reverse of uncoupling.

58

Case Illustration (cont.)

The splicing process involves multiplying estimated conditional event probabilities as follows.

59

Partial Likelihood

Explicit representation of the initial probability and one-step transition probabilities:

60

Partial Likelihood (cont.)

Factor conditional probability

61

Partial Likelihood (cont.)

Build the partial likelihood for parameters of the parent process, running time and boundary.

Set aside the likelihood contribution of the covariate process and censoring mechanism

62

Common Analytical Situation (cont.)

Longitudinal records are broken into a series of single records with the following elements.

63

Recommended