
Preliminary Concepts

A Clinical Trial is a Medical Experiment With Human Subjects

It has two purposes :

1) Treat the patients in the trial

2) Obtain scientifically useful information for developing or choosing new or improved treatments for future patients

A General Fact About Experimental Design

An Experiment Cannot be “Optimally” Designed Until After it Has Been Conducted

A General Fact About “Optimal” Designs

Most “Optimal Designs” Are of No Practical Use Whatsoever

. . . Aside from advancing the careers of numerous college professors

A General Fact About Cancer Clinical Trials

Cancer Clinical Trials Are Almost Never Conducted Exactly As Designed

A General Fact About Statistical Models

All Statistical Models are Incorrect but Some Statistical Models Are Less Incorrect than Others

(with apologies to George Orwell)

Adaptive Decision Rules

Use a patient’s current data + data from previous patients in the trial + possibly data from other trials, to decide whether and how to :

- Choose the next patient’s treatment

- Choose patient’s next course of therapy

- Drop a treatment, or add a new treatment

- Stop the trial

- Change the trial’s goals or outcomes

- Start a new trial

The Bayesian Paradigm

$\theta$ = Parameters (probabilities, median survival, covariate effects, variances, correlations, etc.)

$p(\theta)$ = Prior distribution of $\theta$, before the data are observed, based on experience or historical data

data = What one actually observes

$L(\text{data} \mid \theta)$ = Likelihood of the observed data

$p(\theta \mid \text{data}) = \text{constant} \times L(\text{data} \mid \theta) \times p(\theta)$ = Posterior distribution of $\theta$, after observing data

Bayesian Sequential Analysis

Apply Bayes’ Law Repeatedly

At each step ($n \to n+m$), use the posterior from the previous step as the new prior, get more data, and compute an “updated posterior”

$p(\theta \mid \text{data}_n, \text{data}_{n+m}) = \text{constant} \times L(\text{data}_{n+m} \mid \theta) \times p(\theta \mid \text{data}_n)$

and use this as the basis for the next decision
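A minimal sketch of repeated updating, assuming the conjugate beta prior with binomial data that appears later in this deck. The starting prior beta(1, 1), the patient counts, and the function name `update_beta` are illustrative assumptions, not values from the talk. It shows that updating in two steps gives the same posterior as updating once with all the data.

```python
def update_beta(prior, responses, n):
    """Conjugate update: beta(a, b) prior + binomial data -> beta posterior."""
    a, b = prior
    return (a + responses, b + n - responses)

prior = (1.0, 1.0)                                # hypothetical starting prior
step1 = update_beta(prior, responses=3, n=10)     # first 10 patients
step2 = update_beta(step1, responses=5, n=10)     # next 10; prior = step1 posterior
all_at_once = update_beta(prior, responses=8, n=20)

print(step2, all_at_once)
```

Each interim decision can therefore be based on the current posterior without reprocessing earlier patients.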

Two Types of Statistical Methodologies for Clinical Trials

BAYESIAN

NON-BAYESIAN

Two Types of Practical Statistical Methodologies for Clinical Trials

BAYESIAN with good frequentist properties

NON-BAYESIAN with lots of random effects

The Two Types of Statistical Methodologies for Clinical Trials

Practical methods that actually are applied

Everything Else

Bayesian Statistics

The Most Important Dichotomy for Clinical Trial Designs

DESIGNS USED TO CONDUCT REAL TRIALS IN WHICH REAL PHYSICIANS USE REAL TREATMENTS ON REAL PATIENTS WITH REAL DISEASES

EVERYTHING ELSE

Some Facts About Cancer Clinical Trials

1. Patient outcome is highly complex

2. Patient outcome is highly stochastic

Probability modeling and statistical methods are inevitable

3. Most cancer treatments are harmful

Trade-offs between treatment efficacy and toxicity are inevitable

4. All events must be defined explicitly

5. The design must be practical, not merely decorative

The design must reflect actual medical practice, not a statistician’s fantasy

Things to Identify When Designing a Clinical Trial

1. Disease and patient subgroup (“Entry Criteria”)

2. Treatment(s) and/or multi-stage regimes to be studied

3. Standard treatment(s), if any, for these patients

4. Therapeutic paradigm

a) doses, if phase I or phase I-II

b) treatment schedule(s)

c) treatment administration mode (IV, oral, subcutaneous drug, radiation modality, etc.)

d) multi-stage treatment regimes (treatments, adaptive rules, sequencing, timing)

5. Patient outcomes/events relevant to treatment evaluation and safety monitoring

6. Important known prognostic covariates/subgroups

7. Relevant historical data or elicited prior information

a) Outcome (response, toxicity, disease progression time, survival time , etc.) rates/means with standard treatment(s)

b) Covariate effects

8. Scientific/medical objectives, relative to the clinical outcomes

a) Comparisons

b) What is to be estimated

c) Relationship between statistical/scientific goals and medical objectives

9. Institutions that will participate

10. Who supplies the drug(s) and who pays what

11. Logistical parameters

a) Maximum trial duration

b) Accrual rate (expected, or possibly a range)

c) Methods for evaluating patient outcomes

d) Time required to evaluate outcomes

e) Rx availability

f) Costs and financial limitations

g) Key personnel: Research nurses, physicians, statisticians, data managers, programmers

A Bayesian Paradigm for Designing Clinical Trials

1. Determine all relevant background information

2. Write down a reasonable probability model, likelihood $p(X \mid \theta)$ and prior $p(\theta)$

3. Elicit prior information and establish the prior $p(\theta)$

4. Write down a reasonable set of decision rules based on the posterior, $p(\theta \mid X)$, including all design parameters

5. Write down a reasonable set of clinical scenarios, each a fixed value $\theta^{\text{true}}$ or $p^{\text{true}}(X)$

6. Simulate the design under each scenario to obtain its operating characteristics (OCs)

a) sample size distribution

b) rates/probabilities of important clinical events

c) trial duration distribution

d) decision (stop, choose dose/rx, etc.) probabilities

7. Calibrate the design parameters and re-simulate until acceptable OCs are obtained

8. Design the trial as if your mother / father / wife / husband / best friend / child might be enrolled as a patient


A General Bayesian Methodology for Constructing Phase II Clinical

Trial Designs

Thall and Simon (1994)

Thall, Simon and Estey (1995, 1996)

Thall and Sung (1998)

Thall, Wooten, and Tannir (2005)

A small trial, usually single-arm in oncology, usually based on an early outcome, usually a binary indicator of treatment “response.”

The goal is to determine whether an experimental treatment, E, is sufficiently “promising” to motivate a large, randomized phase III trial of E versus a standard treatment, S.

Phase II Clinical Trial

“Response” may be

1) > 50% shrinkage of a solid tumor

2) Stable disease or better for > 3 months

3) Complete Remission (CR) of leukemia

4) Engraftment of a bone marrow transplant

“Toxicity” may be

1) Transient toxicity (nausea, vomiting, fatigue, low blood cell counts, hair loss)

2) Permanent organ damage (kidneys, liver, heart, muscles, brain)

3) Regimen-related death

Phase II Clinical Trial

1) There is no existing treatment with substantive anti-disease activity

The null mean Pr(response) = $\mu_0$ < .05

2) Objective: Determine whether E has “substantive” anti-disease effect

Stop the trial if $\Pr(\theta_E > \mu_0 \mid \text{data}) < p_L$

where $\theta_E$ = Pr(response | E). E.g., stop if 0/15 responses observed

Phase IIA “Activity” Clinical Trial

1) There exists at least one available “standard” treatment S with some anti-disease activity

Null mean Pr(response) = $\mu_S$ > .10

2) Objective: Determine whether E substantively improves anti-disease effect compared to S

Phase IIB Clinical Trial

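$X_n$ = # responses in n patients, $\theta$ = Pr(response)

$\theta \sim \text{beta}(a, b)$, $E(\theta) = \mu = a/(a+b)$, $\text{var}(\theta) = \mu(1-\mu)/(a+b+1)$

$[X_n \mid \theta] \sim \text{bin}(n, \theta)$ and $\theta \sim \text{beta}(a, b)$ $\Rightarrow$ $[\theta \mid X_n] \sim \text{beta}(a + X_n,\ b + n - X_n)$

$W_{90} = U - L$ where $\Pr(L < \theta < U) = .90$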

Beta Distributions

[Figure: Some beta pdfs for $\mu$ = .20, with $W_{90}$ = .10, .20, .30, .40]

Relationship between $W_{90}$ and a+b for beta distributions with mean .20

W90    (a, b)           a+b    1/(a+b)
.10    (34.1, 136.6)    169    .0012
.20    (8.15, 32.6)     39     .026
.30    (3.3, 13.1)      14     .071
.40    (1.4, 5.7)       5      .020

Relationship between $W_{90}$ and a+b for beta distributions with mean .40

W90    (a, b)          a+b    1/(a+b)
.10    (104, 156)      260    .0038
.15    (46.4, 69.6)    116    .0086
.20    (25.6, 38.4)    64     .016
.25    (16.4, 24.6)    41     .024
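The link between (a, b) and the interval width can be checked numerically. The sketch below assumes that W90 is the width of the equal-tailed 90% interval (5th to 95th percentile); the slides do not say which 90% interval is meant. It uses a simple midpoint-rule grid rather than a library routine, so it is only suitable for a, b > 1. The function names are illustrative.

```python
from math import exp, lgamma, log

def beta_pdf(t, a, b):
    """Density of beta(a, b) at 0 < t < 1."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(t) + (b - 1) * log(1 - t))

def beta_quantile(p, a, b, n_grid=20000):
    """Approximate quantile: accumulate midpoint-rule mass cell by cell."""
    h = 1.0 / n_grid
    mass = 0.0
    for i in range(n_grid):
        cell = beta_pdf((i + 0.5) * h, a, b) * h
        if mass + cell >= p:
            # Interpolate linearly inside the cell where p is crossed.
            return (i + (p - mass) / cell) * h
        mass += cell
    return 1.0

def w90(a, b):
    """Width of the equal-tailed 90% interval (an assumed definition)."""
    return beta_quantile(0.95, a, b) - beta_quantile(0.05, a, b)

for a, b in [(8.15, 32.6), (25.6, 38.4)]:
    print(f"beta({a}, {b}): mean={a / (a + b):.2f}, W90={w90(a, b):.3f}")
```

Both pairs are rows listed with W90 = .20 in the tables above, one for each mean.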

[Figure: beta pdfs for $\mu$ = .40; panel labels $W_{90}$ = .35, .23, .25, .29, .17, .19]

E = “Experimental” treatment

S = “Standard” treatment

$\theta_S$ = Pr(response with S)

$\theta_E$ = Pr(response with E)

Conduct a single arm trial of E, and stop the trial early based on the interim data if it is either

a) Likely that E is superior to S, or

b) Unlikely that E is superior to S

Bayesian Designs for Phase IIB Trials

$\theta_S \sim \text{beta}(a_S, b_S)$

1) Mean $E(\theta_S) = \mu_S = a_S/(a_S + b_S)$

2) $a_S + b_S$ = the effective sample size of the prior on $\theta_S$. This must be reasonably large, i.e. this prior is informative.

Priors: The Historical Standard Rx

$\theta_E \sim \text{beta}(a_E, b_E)$

Mean $E(\theta_E) = \mu_E = a_E/(a_E + b_E)$

$a_E + b_E$ = 1 or 2 (non-informative)

Priors: The Experimental Rx

[Figure: curves labeled E and S, with $\mu_E = \mu_S + \delta/2$]

M = the smallest integer n so that, for given width W = U – L, $\Pr(L < \theta_E < U \mid X_M) > p^*$ for “typical” data, e.g. $X_M = M\mu_E$

$E(\theta_E \mid X_n)$    p* = .90    p* = .95
.20                       42          59
.30                       55          81
.40                       63          89
.50                       65          93

Determining Maximum Sample Size

1) If $\Pr(\theta_E > \theta_S + \delta \mid \text{data}) < p_L$, stop and conclude E is not promising compared to S (Futility)

2) If $\Pr(\theta_E > \theta_S \mid \text{data}) > p_U$, stop and conclude E is promising compared to S (Superiority)

$\delta$ = .15-.20, $p_L$ = .01-.20, $p_U$ = .80-.99

(Thall and Simon, 1994)

Bayesian Designs for Phase IIB Trials

1) Specify a beta($a_S$, $b_S$) prior on $\theta_S$ and a beta($a_E$, $b_E$) prior on $\theta_E$, with $a_S + b_S$ “large” & $a_E + b_E$ “small”

2) $\mu_E = \mu_S$ or possibly $\mu_E = \mu_S + \delta/2$

3) Specify a maximum sample size

4) Specify ($\delta$, $p_L$, $p_U$)

5) Compute upper and lower stopping boundaries

6) Apply the stopping rules after each patient (“continuously”), or after cohorts of given size, or periodically

Implementation

1) For each of several fixed values of $\theta_E^{\text{true}}$, simulate the trial on the computer and record the design’s average behavior, i.e. its “Operating Characteristics” (OCs)

2) The OCs consist of

a) Probability distribution of N = achieved sample size

b) p+ = prob(E declared “promising”)

c) p- = prob(stop due to futility) = prob(E declared “not promising”)

3) Calibrate the design parameters based on the OCs, re-simulate, and iterate

Simulation

Example: A Chemotherapy Trial in AML

S = Fludarabine + Cytarabine (cytosine arabinoside, ara-C)

E = S + granulocyte colony stimulating factor (G-CSF, Neupogen®, Neulasta®)

1) $\theta_S \sim \text{beta}(33, 33)$: $\mu_S$ = .50, $W_{S,90}$ = .20

2) $\delta$ = .20: Targeted $\theta_E$ = .50 + .20 = .70

3) $\theta_E \sim \text{beta}(1.2, 0.8)$: $\mu_E$ = .60, $W_{E,90}$ = .88

$p_L$ = .025, $p_U$ = .95, $n_{\min}$ = 10, $n_{\max}$ = 65
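The stopping criteria compare two independent beta variables, so each interim look needs a probability of the form Pr(theta_E > theta_S + delta | data). Below is a rough numerical sketch using the priors of this example; the interim data (5 responses in 17 patients), the function names, and the grid-based integration are assumptions made for illustration, and a real implementation would use a statistics library.

```python
from math import exp, lgamma, log

def beta_pdf(t, a, b):
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(t) + (b - 1) * log(1 - t))

def prob_e_beats_s(a_e, b_e, a_s, b_s, delta=0.0, n_grid=2000):
    """Approximate Pr(theta_E > theta_S + delta) for independent betas.

    Uses Pr = integral of f_E(t) * F_S(t - delta) dt on a midpoint grid.
    """
    h = 1.0 / n_grid
    grid = [(i + 0.5) * h for i in range(n_grid)]
    pdf_s = [beta_pdf(t, a_s, b_s) for t in grid]
    left = [0.0]                      # left[j] = CDF of theta_S at j * h
    for p in pdf_s:
        left.append(left[-1] + p * h)

    def cdf_s(u):
        if u <= 0.0:
            return 0.0
        if u >= 1.0:
            return 1.0
        j = min(int(u / h), n_grid - 1)
        return left[j] + pdf_s[j] * (u - j * h)

    return sum(beta_pdf(t, a_e, b_e) * cdf_s(t - delta) for t in grid) * h

a_s, b_s = 33.0, 33.0                 # prior on theta_S from this example
a_e, b_e = 1.2, 0.8                   # prior on theta_E from this example
x, n = 5, 17                          # hypothetical interim data
post_e = (a_e + x, b_e + n - x)       # conjugate beta posterior for theta_E

p_superior = prob_e_beats_s(*post_e, a_s, b_s, delta=0.0)
p_improved = prob_e_beats_s(*post_e, a_s, b_s, delta=0.20)
print(f"Pr(E > S | data)       = {p_superior:.3f}")
print(f"Pr(E > S + .20 | data) = {p_improved:.3f}")
print("stop for futility:", p_improved < 0.025)
```

Because S is not observed in the single-arm trial, its distribution stays at the informative prior while only the posterior of theta_E changes with the data.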

Example: A Chemotherapy Trial in AML

Early stopping bounds:

1) Stop the trial if [# responses] / [# pats.] = $X_n / n$

≥ 10/10, 11/11, 12/12, …, 54/64 (Superiority)

≤ 0/7, 1/8, 2/9, 3/10, …, 33/60 (Futility)

2) Stopping bounds for cohort size = 5

≥ 10/10, 14/15, 18/20, …, 51/60 (Superiority)

≤ 3/10, 6/15, 8/20, …, 31/60 (Futility)

Operating Characteristics

Monitoring                   $\theta_E^{\text{true}}$   Pr(Stop)   Sample Size (25, 50, 75)%ile
Continuous Monitoring        .50                        .83        11, 22, 49
Continuous Monitoring        .70                        .15        65, 65, 65
Monitor Cohorts of Size 5    .50                        .77        15, 30, 60
Monitor Cohorts of Size 5    .70                        .11        65, 65, 65

[Figure: density curves labeled S and E. With 5/17 CRs observed, $\Pr(\theta_E > \theta_S \mid 5/17) = .08$]

Thall-Sung (1998) adaptation of Thall-Simon (1994) to accommodate activity trials:

Given fixed target response probability $p_0$

assume $\theta_E \sim \text{beta}(2p_0,\ 2(1-p_0))$, or maybe beta($p_0$, $1-p_0$)

1) M = maximum sample size

2) $p_L$ = stopping probability cut-off

3) Stop the trial if $\Pr[p_0 < \theta_E \mid \text{data}] < p_L$

4) The “Target Activity Level” is $p_0$ = .10 to .30, most often .20
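A small simulation sketch of this activity-trial rule, in the spirit of the simulate-and-calibrate steps described earlier. The design values p0 = .20, pL = .05, M = 30, monitoring after every patient, the two true response rates, and all function names are hypothetical choices for illustration; the posterior tail probability is computed with a plain midpoint rule rather than a library call.

```python
import random
from math import exp, lgamma, log

def prob_theta_exceeds(p0, a, b, n_grid=400):
    """Midpoint-rule approximation to Pr(theta > p0) for theta ~ beta(a, b)."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    h = (1.0 - p0) / n_grid
    total = 0.0
    for i in range(n_grid):
        t = p0 + (i + 0.5) * h
        total += exp(log_norm + (a - 1) * log(t) + (b - 1) * log(1 - t))
    return total * h

def futility_bounds(p0, p_l, max_n):
    """bounds[n] = largest response count that triggers stopping (-1 if none)."""
    a0, b0 = 2 * p0, 2 * (1 - p0)          # beta(2 p0, 2(1 - p0)) prior
    bounds = {}
    for n in range(1, max_n + 1):
        bound = -1
        for x in range(n + 1):
            # The tail probability increases with x, so stop at the first pass.
            if prob_theta_exceeds(p0, a0 + x, b0 + n - x) < p_l:
                bound = x
            else:
                break
        bounds[n] = bound
    return bounds

def simulate(theta_true, bounds, max_n, n_trials, rng):
    """Return (probability of early stopping, mean sample size)."""
    stops, total_n = 0, 0
    for _ in range(n_trials):
        x, n_final = 0, max_n
        for n in range(1, max_n + 1):
            x += rng.random() < theta_true   # one Bernoulli response
            if x <= bounds[n]:
                stops += 1
                n_final = n
                break
        total_n += n_final
    return stops / n_trials, total_n / n_trials

P0, P_L, MAX_N = 0.20, 0.05, 30              # hypothetical design parameters
bounds = futility_bounds(P0, P_L, MAX_N)
rng = random.Random(2024)
oc_inactive = simulate(0.05, bounds, MAX_N, 2000, rng)
oc_active = simulate(0.40, bounds, MAX_N, 2000, rng)
print("true rate .05:", oc_inactive)
print("true rate .40:", oc_active)
```

Changing `P_L` or `MAX_N` and re-running is the calibration loop: the cut-off is tuned until the early stopping probabilities under the scenarios of interest are acceptable.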

Phase II Equivalence Trials


Patient Safety is NEVER a Secondary Consideration in a Clinical Trial

In an ethical clinical trial, safety monitoring is inevitable, including explicit rules to stop the trial early if the rate of toxicity or rate of regimen-related death is excessively high

Patient outcome is nearly always much more complex than one binary “response” variable

Patient safety is never a secondary consideration in a clinical trial

The rate of toxicity usually is determined very unreliably in phase I

Monitoring Multiple Discrete Outcomes

(Thall, Simon and Estey, 1995; Thall and Sung, 1998)

K=3

Monitor pr(CR) and pr(death)

K=5

Monitor

1) pr(CR), or pr(CR | alive)

2) pr(TOX), or pr(TOX | alive)

3) pr(death)

A Bio-Chemotherapy Trial in Acute Leukemia

                   Complete Remission (CR)
                   Yes     No
TOXICITY   Yes     A1      A2
           No      A3      A4

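An experiment has k possible elementary outcomes, $A_1, \ldots, A_k$, with

$\theta_j = \Pr(A_j)$, for j = 1, …, k-1, and $\theta_k = 1 - \theta_1 - \cdots - \theta_{k-1}$

so $\theta = (\theta_1, \ldots, \theta_{k-1})$ is k-1 dimensional.

The Dirichlet pdf with parameters $a = (a_1, \ldots, a_k)$

$p(\theta \mid a) \propto \theta_1^{a_1 - 1}\, \theta_2^{a_2 - 1} \cdots \theta_k^{a_k - 1}$

We write “$\theta \mid a \sim \text{Dirichlet}(a)$”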

Dirichlet – Multinomial Model

$\theta \mid a \sim \text{Dirichlet}(a)$

1) $a_+ = a_1 + \cdots + a_k$ = Effective sample size

2) $E(\theta_j) = a_j / a_+ = \mu_j$

3) $\text{var}(\theta_j) = \mu_j(1 - \mu_j)/(1 + a_+)$

4) $\text{cov}(\theta_j, \theta_r) = -\mu_j \mu_r/(1 + a_+)$

The case k=2 is the Beta($a_1$, $a_2$)


$\theta \mid a \sim \text{Dirichlet}(a)$

All subvectors of $\theta$ are Dirichlet

E.g. k=4 with

$(\theta_1, \theta_2, \theta_3) \sim \text{Dirichlet}(a_1, a_2, a_3, a_4)$

$(\theta_1, \theta_2) \sim \text{Dirichlet}(a_1, a_2, a_3 + a_4)$


$\theta \mid a \sim \text{Dirichlet}(a)$

Sums of subvectors of $\theta$ are Beta

E.g. k=4 with

$(\theta_1, \theta_2, \theta_3) \sim \text{Dirichlet}(a_1, a_2, a_3, a_4)$

$(\theta_1 + \theta_2) \sim \text{Beta}(a_1 + a_2,\ a_3 + a_4)$


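                   Complete Remission
                   Yes           No
TOXICITY   Yes     $\theta_1$    $\theta_2$
           No      $\theta_4$    $\theta_3$

$(\theta_1, \theta_2, \theta_3, \theta_4) \sim \text{Dirichlet}(80, 70, 30, 120)$

$\theta_{TOX} = \theta_1 + \theta_2 \sim \text{Beta}(150, 150)$, $\mu_{TOX}$ = ½

$\theta_{CR} = \theta_1 + \theta_4 \sim \text{Beta}(200, 100)$, $\mu_{CR}$ = ⅔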
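The collapsing property makes marginal betas a matter of adding Dirichlet parameters. A minimal sketch using the Dirichlet(80, 70, 30, 120) example above; the function names and 0-based cell indexing are illustrative choices.

```python
def dirichlet_mean_var(a):
    """Means and variances of each theta_j under Dirichlet(a)."""
    a_plus = sum(a)
    means = [aj / a_plus for aj in a]
    variances = [m * (1 - m) / (1 + a_plus) for m in means]
    return means, variances

def beta_marginal(a, cells):
    """Beta parameters of the sum of theta_j over `cells` (0-based indices)."""
    inside = sum(a[j] for j in cells)
    return inside, sum(a) - inside

a = [80, 70, 30, 120]                      # (a1, a2, a3, a4) from the example
tox = beta_marginal(a, cells=[0, 1])       # theta_1 + theta_2
cr = beta_marginal(a, cells=[0, 3])        # theta_1 + theta_4
means, variances = dirichlet_mean_var(a)

print("TOX ~ Beta", tox, "mean", tox[0] / sum(tox))
print("CR  ~ Beta", cr, "mean", cr[0] / sum(cr))
print("cell means", [round(m, 3) for m in means])
```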

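$\theta \mid a \sim \text{Dirichlet}(a)$

Summing subvectors of $\theta$ gives a Dirichlet

E.g. $(\theta_1, \ldots, \theta_5) \sim \text{Dirichlet}(a_1, \ldots, a_5)$

$(\theta_1 + \theta_2,\ \theta_3 + \theta_4,\ \theta_5) \sim \text{Dir}(a_1 + a_2,\ a_3 + a_4,\ a_5)$

This corresponds to collapsing elementary events, e.g. from 5 to 3 in this example.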


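$X = (X_1, \ldots, X_k)$ ~ k-nomial with outcome probabilities $\theta_1, \ldots, \theta_k$ and

$\theta \mid a \sim \text{Dirichlet}(a)$ a priori $\Rightarrow$ a posteriori

$[\theta \mid X] \sim \text{Dirichlet}(a + X) \equiv \text{Dirichlet}(a_1 + X_1, \ldots, a_k + X_k)$

k=2 is the “binomial($X_1$, $\theta_1$) - beta($a_1$, $a_2$)” with $X_1 = X$, $X_1 + X_2 = n$ and $\theta_1 = \theta$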
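The conjugate update amounts to adding observed cell counts to the prior parameters. A short sketch; the flat prior and the counts are made-up numbers for illustration, and `dirichlet_posterior` is a hypothetical helper name.

```python
def dirichlet_posterior(a, counts):
    """Dirichlet(a) prior + multinomial counts -> Dirichlet(a + counts)."""
    if len(a) != len(counts):
        raise ValueError("a and counts must have the same length")
    return [aj + xj for aj, xj in zip(a, counts)]

prior = [1.0, 1.0, 1.0, 1.0]               # hypothetical prior, k = 4
counts = [3, 2, 4, 1]                      # hypothetical cell counts, n = 10
posterior = dirichlet_posterior(prior, counts)
post_means = [aj / sum(posterior) for aj in posterior]

# k = 2 reduces to the beta-binomial update beta(a1 + X, a2 + n - X).
beta_post = dirichlet_posterior([1.0, 1.0], [3, 7])

print(posterior, [round(m, 3) for m in post_means], beta_post)
```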


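4 elementary events

$\theta = (\theta_1, \theta_2, \theta_3, \theta_4) \sim \text{Dirichlet}(a_1, a_2, a_3, a_4)$

R = A1 ∪ A3

T = A1 ∪ A2

$\theta(R) = \theta_1 + \theta_3 \sim \text{beta}(a_1 + a_3,\ a_2 + a_4)$

$\theta(T) = \theta_1 + \theta_2 \sim \text{beta}(a_1 + a_2,\ a_3 + a_4)$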


              CR
              Yes           No
TOX    Yes    $\theta_1$    $\theta_2$
       No     $\theta_3$    $\theta_4$

$\theta_T = \theta_1 + \theta_2$ and $\theta_{CR} = \theta_1 + \theta_3$

CR & Toxicity Data from 264 AML Patients Treated With an Anthracycline + ara-C

               CR             No CR         Total
Toxicity       73 (27.7%)     63 (23.9%)    136 (51.5%)
No Toxicity    101 (38.3%)    27 (10.2%)    128 (48.5%)
Total          174 (65.9%)    90 (34.1%)    264

P(CR|Tox) = 73/136 = .54, P(CR|No Tox) = 101/128 = .79

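$\theta_S \sim \text{Dirichlet}(73, 63, 101, 27)$

$a_{S,+}$ = 73 + 63 + 101 + 27 = 264, $\mu_S$ = (.277, .239, .382, .102)

$E(\theta_{S,T}) = E(\theta_{S,1} + \theta_{S,2})$ = .515

$E(\theta_{S,CR}) = E(\theta_{S,1} + \theta_{S,3})$ = .659

Set $\mu_E = \mu_S$ with $a_{E,+}$ = 4: $\theta_E \sim \text{Dirichlet}(1.11, .955, 1.53, .409)$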
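The experimental prior keeps the historical cell means but shrinks the effective sample size to 4. A sketch of that rescaling, using the historical counts quoted above; `rescale_dirichlet` is an illustrative helper name.

```python
def rescale_dirichlet(a, effective_n):
    """Keep the Dirichlet means a_j / a_+ but set a_+ to `effective_n`."""
    total = sum(a)
    return [effective_n * aj / total for aj in a]

a_s = [73, 63, 101, 27]                    # historical counts (A1, A2, A3, A4)
a_e = rescale_dirichlet(a_s, 4)            # weak prior for E with a_+ = 4

mu_tox = (a_s[0] + a_s[1]) / sum(a_s)      # E(theta_1 + theta_2)
mu_cr = (a_s[0] + a_s[2]) / sum(a_s)       # E(theta_1 + theta_3)

print([round(v, 3) for v in a_e])
print(round(mu_tox, 3), round(mu_cr, 3))
```

With so little prior weight on E, the posterior of theta_E is driven almost entirely by the trial data after a handful of patients.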

Dirichlet Priors

[Figure: Marginal priors and the posterior of $\theta_{TOX}$ after 17 / 21 toxicities]

[Figure: Marginal priors and the posterior of $\theta_{CR}$ after 13 / 21 CRs]

Maximum sample size = 56

If X = 45 (45/56 = .81, the target) then

$\Pr(.68 < \theta_{E,CR} < .88 \mid X = 45) = .95$

That is, a posterior 95% credible interval (CI) will have width .20

Stop the trial if

1) $\Pr(\theta_{S,CR} + .15 < \theta_{E,CR} \mid \text{data}) < .05$

or

2) $\Pr(\theta_{S,TOX} + .05 < \theta_{E,TOX} \mid \text{data}) > .95$

A .05 increase (“slippage”) in $\theta_{TOX}$ is a Trade-Off for a .15 increase in $\theta_{CR}$

Decision Rules

Stop the trial if

[# CRs] / [# patients] ≤ 0/3, 1/4, 2/5, 3/6, 3/7, 4/8, 5/9, … (Futility)

[# Toxicities] / [# patients] ≥ 6/6, 7/7, 8/8, 9/9, 9/10, 10/11, 11/12, 11/13 (Toxicity)

Decision Cut-Offs

Operating Characteristics

$\mu_S$ = (.277, .239, .382, .102)

Case    True $\theta_E$             Scenario                    Prob. Stopping Early*         Sample Size (25%, 50%, 75%)
1       (.327, .239, .332, .102)    Tox ↑ 0.05                  .856 + .043 + .006 = .90      6, 12, 27
2       (.377, .239, .282, .102)    Tox ↑ 0.10                  .809 + .100 + .012 = .92      6, 12, 26
3       (.427, .089, .382, .102)    CR ↑ 0.15                   .207 + .052 + .001 = .26      45, 56, 56
4       (.627, .089, .182, .102)    CR ↑ 0.15 & Tox ↑ 0.20      .167 + .693 + .003 = .86      10, 19, 36

* Prob. Stopping Early = the sum of the stopping probabilities due to the CR rule alone, the Tox rule alone, and both rules

Monitoring Event Times in Phase II (Thall, Wooten and Tannir 2005)

An Activity Trial Based On Progression-Free Survival (PFS) Time

Patients with renal cell carcinoma, relapsed or refractory after immunotherapy

S = 5-FU+ Gemcitabine

E = Xeloda + Gemcitabine

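$T \mid \mu \sim \text{Exp}(\mu)$, $f(t \mid \mu) = \mu^{-1}\exp(-t/\mu)$

$E(T) = \mu$, $\text{var}(T) = \mu^2$

$\mu \mid a, b \sim$ Inverse Gamma (IG)(a, b)

$E(\mu) = b/(a-1)$, $\text{var}(\mu) = b^2/\{(a-1)^2(a-2)\}$

$T^o = \min\{T, C\}$, C = right-censoring time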

Exponential-Inverse Gamma Model

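For right-censored event times $T_1^o, \ldots, T_n^o$ from $T_1, \ldots, T_n \sim$ iid $\text{Exp}(\mu)$ with $\mu \sim \text{IG}(a, b)$

N = # uncensored events, and $T^+ = T_1^o + \cdots + T_n^o$ = total time on test

$\mu \mid \text{data} \sim \text{IG}(a + N,\ b + T^+)$

$E(\mu \mid \text{data}) = (b + T^+)/(a + N - 1)$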
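The conjugate update can be verified in a few lines. In this sketch $\delta_i$ denotes the event indicator (1 if patient i's time is uncensored, 0 if censored), so that $N = \sum_i \delta_i$; the slide states the result without the derivation.

```latex
\begin{align*}
L(\text{data} \mid \mu)
  &= \prod_{i=1}^{n} \left\{ \mu^{-1} e^{-T_i^o/\mu} \right\}^{\delta_i}
                     \left\{ e^{-T_i^o/\mu} \right\}^{1-\delta_i}
   = \mu^{-N} \exp\!\left(-T^+/\mu\right), \\
p(\mu)
  &\propto \mu^{-(a+1)} \exp\!\left(-b/\mu\right)
   \qquad \text{(the IG}(a,b)\text{ density)}, \\
p(\mu \mid \text{data})
  &\propto L(\text{data} \mid \mu)\, p(\mu)
   \propto \mu^{-(a+N+1)} \exp\!\left\{-(b+T^+)/\mu\right\},
\end{align*}
```

which is the kernel of an IG$(a+N,\ b+T^+)$ density. Censored patients contribute follow-up time to $T^+$ but nothing to $N$.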

Exponential-Inverse Gamma Model

Experimental Group Survival Times (censored times in parentheses)

1.5, (1.6), (2.4), (4.2), 4.5, (6.7), (8.0), (11.0), 15.0

$N_E$ = 3 deaths in $n_E$ = 9 patients, $T_E^+$ = 54.9

MLE of mean survival time = $T_E^+ / N_E$ = 54.9 / 3 = 18.3

Control Group Survival Times (censored times in parentheses)

0.5, (0.6), 1.5, 1.6, (2.0), 3.0, (3.5), (4.0), 4.8, 6.2, (10.5), (11.0), 14.5

$N_C$ = 7 deaths in $n_C$ = 13 patients, $T_C^+$ = 63.7

MLE of mean survival time = $T_C^+ / N_C$ = 63.7 / 7 = 9.1

Exponential-Inverse Gamma Example

Assume Experimental Group Survival Times ~ iid Exp($\mu_E$) and Control Group Survival Times ~ iid Exp($\mu_C$)

A priori $\mu_E$, $\mu_C$ ~ iid IG(.01, .01), i.e. $1/\mu$ ~ Gam(.01, .01), which has mean = 1, var = 100

$\mu_E$ | $N_E$ = 3 deaths, $T_E^+$ = 54.9 ~ IG(3.01, 54.91)

Posterior mean = 54.91/(3.01-1) = 27.32, var = 738.9

$\mu_C$ | $N_C$ = 7 deaths, $T_C^+$ = 63.7 ~ IG(7.01, 63.71)

Posterior mean = 63.71/(7.01-1) = 10.60, var = 22.43

Exponential-Inverse Gamma Example

$\Pr(\mu_C < \mu_E \mid \text{data}) = .87$
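The example can be reproduced with a few lines of code. The sketch below uses the survival times and the IG(.01, .01) priors given in the example; the Monte Carlo estimate of the comparison probability, the seed, the number of draws, and the function names are illustrative assumptions rather than the method used for the slide.

```python
import random

def ig_posterior(a, b, times, is_event):
    """IG(a, b) prior + right-censored exponential data -> IG posterior."""
    return a + sum(is_event), b + sum(times)

def ig_mean_var(a, b):
    """Mean and variance of IG(a, b); requires a > 2."""
    mean = b / (a - 1)
    var = b ** 2 / ((a - 1) ** 2 * (a - 2))
    return mean, var

def sample_ig(a, b, rng):
    # mu ~ IG(a, b) iff 1/mu ~ Gamma(shape a, rate b); gammavariate takes a scale.
    return 1.0 / rng.gammavariate(a, 1.0 / b)

# (time, 1 = death observed, 0 = censored), transcribed from the example
exp_data = [(1.5, 1), (1.6, 0), (2.4, 0), (4.2, 0), (4.5, 1),
            (6.7, 0), (8.0, 0), (11.0, 0), (15.0, 1)]
ctl_data = [(0.5, 1), (0.6, 0), (1.5, 1), (1.6, 1), (2.0, 0), (3.0, 1),
            (3.5, 0), (4.0, 0), (4.8, 1), (6.2, 1), (10.5, 0), (11.0, 0),
            (14.5, 1)]

post_e = ig_posterior(0.01, 0.01, *zip(*exp_data))
post_c = ig_posterior(0.01, 0.01, *zip(*ctl_data))
mean_e, var_e = ig_mean_var(*post_e)
mean_c, var_c = ig_mean_var(*post_c)

rng = random.Random(1)
draws = 100_000
wins = sum(sample_ig(*post_c, rng) < sample_ig(*post_e, rng) for _ in range(draws))
prob_e_longer = wins / draws

print(post_e, round(mean_e, 2), round(var_e, 1))
print(post_c, round(mean_c, 2), round(var_c, 2))
print(round(prob_e_longer, 2))
```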

The Exponential-Inverse Gamma Model: Application to Phase II Monitoring

$T_j$ = time to disease progression or death (“failure”) for j = S or E

$\mu_j$ = Mean time to failure

$T_j \mid \mu_j$ ~ Exponential

$\mu_S$ ~ Inverse gamma ($a_S$, $b_S$)

$\mu_E$ ~ Inverse gamma ($a_E$, $b_E$)

Observed data on each patient:

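$T^o$ = time to the event or right-censoring; $\delta$ = 1 if $T^o = T$, $\delta$ = 0 if $T^o < T$

For T with pdf f and survivor function $\bar{F}(t) = \Pr(T > t)$, the likelihood takes the usual form

$L(\text{data} \mid \mu_E) = \prod_{i=1}^{n} \{f_E(T_i^o \mid \mu_E)\}^{\delta_i}\, \{\bar{F}_E(T_i^o \mid \mu_E)\}^{1-\delta_i}$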

Event Time Likelihood

Priors

$\mu \sim \text{IG}(a, b)$, equivalently $1/\mu \sim \text{Gam}(a, b)$

$E(\mu) = b/(a-1)$, $\text{var}(\mu) = b^2/\{(a-1)^2(a-2)\}$

Elicited values from historical rx:

$E\{\text{median}(T \mid \mu_S)\}$ = 5.4 mos = $\log(2)\,E(\mu_S)$

$E(\mu_S)$ = 5.4 / log(2) = 7.8 mos

$\Pr(\mu_S > 7 \text{ mos})$ = .50, $\text{var}(\mu_S)$ = 12.2

$\mu_S \sim \text{IG}(a_S = 6.87,\ b_S = 45.76)$

For the experimental agent prior, we set

$E(\mu_E) = E(\mu_S) = 7.8$ and $\text{var}(\mu_E) = 1000$

$\mu_E \sim \text{IG}(a_E = 2.06,\ b_E = 8.27)$

Note: Establishing good IG priors can be tricky!
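For the experimental prior, the IG parameters follow from the stated mean and variance by solving the two moment formulas: var = mean² / (a - 2) gives a, and mean = b / (a - 1) gives b. A minimal sketch; `ig_from_moments` is an illustrative name, and this moment-matching step is shown only for the experimental prior, since the historical prior on S was elicited with an additional constraint.

```python
def ig_from_moments(mean, var):
    """Solve E(mu) = b/(a-1), var(mu) = b^2/((a-1)^2 (a-2)) for (a, b)."""
    if mean <= 0 or var <= 0:
        raise ValueError("mean and var must be positive")
    a = mean ** 2 / var + 2
    b = mean * (a - 1)
    return a, b

a_e, b_e = ig_from_moments(mean=7.8, var=1000.0)
print(round(a_e, 2), round(b_e, 2))
```

A very large prior variance pushes a toward 2, the boundary at which the IG variance stops being finite, which is one reason these priors need care.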

Trial Conduct

1) A Single-arm trial of E = Xeloda + Gemcitabine

2) Disease progression evaluated at 8-week intervals

3) Accrue a maximum of 84 patients

4) Monitor the data every 2 months

Trial Conduct

Stop the trial if $\Pr(\mu_S < \mu_E \mid \text{data}) < .18$

at any time (a “phase II equivalence” rule)

The cutoff .18 was chosen to obtain incorrect stopping probability .05 if the true mean($T_E$) = 10.8 months, the desired improvement

Sample Size Rationale: If 70/84 failures observed, with mean 10.3 months, then

$\Pr[8.1 < \mu_E < 12.9 \mid \text{data}] = .95$

Assuming accrual rate = 6 patients/month. Patients arrive according to a Poisson process. 1,000 reps. per case.

Operating Characteristics for the Stopping rule $\Pr(\mu_S < \mu_E \mid \text{data}) < .18$

Fixed median($T_E$)    Fixed mean($T_E$)    Prob Stop Early    Number of Patients    Trial Duration
3                      4.3                  .94                33                    5.4 mos
4                      5.8                  .45                59                    9.8 mos
7.5                    10.8                 .05                80                    13.4 mos

A More Optimistic Goal

A “3 months improvement” target:

If an improvement of $\delta$ = 3 months over the historical mean(T) = 7.8 were desired, then the stopping rule would be

$\Pr(\mu_S + 3 < \mu_E \mid \text{data}) < p_L = .038$

with $p_L$ calibrated to make $p_{STOP}$ = .05 (or whatever false stopping rate is desired) at fixed $\mu_E$ = 10.8 mos

Operating Characteristics for the Stopping rule $\Pr(\mu_S + 3 < \mu_E \mid \text{data}) < .038$

Fixed median($T_E$)    Fixed mean($T_E$)    Prob Stop Early         Number of Patients    Trial Duration
3                      4.3                  >.99 (up from .94)      28 (down from 33)     4.5 mos
4                      5.8                  .84 (up from .45)       46 (down from 59)     7.7 mos
7.5                    10.8                 .05                     81                    13.4 mos

[Figure: Early Stopping Probabilities for the “Equivalence” and “Improvement” Rules. Curves for Delta = 0 and Delta = 3; x-axis: Med Time to Failure (mos), 3 to 10; y-axis: probability, 0 to 1]

Randomized Phase II Selection Trials

A Trial of Topotecan for AML

Patient Outcomes in the Randomized Trial of Topotecan for AML Salvage

[Figure: Historical data with S = ara-C; outcome counts 25, 6, 3, 2, 10, 35 (81 patients)]

$\mu_{S,CR}$ = 9/81 = .11, $\mu_{S,TOX}$ = 41/81 = .51, $\mu_{S,DEATH}$ = 12/81 = .15

Stop arm Ej if

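1) $\Pr[\theta_{E,j}(\text{TOX}) > \theta_S(\text{TOX}) + .05 \mid \text{data}] > p_{U,TOX}$

2) $\Pr[\theta_{E,j}(\text{CR}) > \theta_S(\text{CR}) + .20 \mid \text{data}] < p_{L,CR}$

3) $\Pr[\theta_{E,j}(\text{Death}) > \theta_S(\text{Death}) \mid \text{data}] > p_{U,DEATH}$

A .05 increase (“slippage”) in $\theta$(TOX) is the trade-off for a .20 increase in $\theta$(CR).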

Within-Arm OCs of the Topotecan Trial Design

Selection Probabilities for the Topotecan Trial Design


Some General Conclusions

1) Randomized phase II selection trials provide unbiased comparisons among 2 or more experimental treatments.

2) The goal is to select the best E for future evaluation in phase III. The goal is not to achieve a specified improvement over S with a specified power.

3) If an arm is terminated, it is best to randomize all remaining patients to the remaining arms; otherwise, the null selection probabilities are inflated.
