Preliminary Concepts
A Clinical Trial is a Medical Experiment With Human Subjects
It has two purposes:
1) Treat the patients in the trial
2) Obtain scientifically useful information for developing or choosing new or improved treatments for future patients
A General Fact About Experimental Design
An Experiment Cannot be “Optimally” Designed Until
After it Has Been Conducted
A General Fact About “Optimal” Designs
Most “Optimal Designs” Are of No Practical Use Whatsoever
. . . Aside from advancing the careers of numerous college professors
A General Fact About Cancer Clinical Trials
Cancer Clinical Trials Are Almost Never Conducted
Exactly As Designed
A General Fact About Statistical Models
All Statistical Models are Incorrect but
Some Statistical Models Are Less Incorrect than Others
(with apologies to George Orwell)
Adaptive Decision Rules
Use a patient’s current data + data from previous patients in the trial + possibly data from other trials, to decide whether and how to :
- Choose the next patient’s treatment
- Choose patient’s next course of therapy
- Drop a treatment, or add a new treatment
- Stop the trial
- Change the trial’s goals or outcomes
- Start a new trial
The Bayesian Paradigm
θ = Parameters (probabilities, median survival, covariate effects, variances, correlations, etc.)
p(θ) = Prior distribution of θ, before the data are observed, based on experience or historical data
data = What one actually observes
L(data | θ) = Likelihood of the observed data
p(θ | data) = constant × L(data | θ) × p(θ) = Posterior distribution of θ, after observing data
Bayesian Sequential Analysis
Apply Bayes’ Law Repeatedly
At each step (n → n+m), use the posterior from the previous step as the new prior, get more data, and compute an “updated posterior”
p(θ | data_n, data_{n+m}) = constant × L(data_{n+m} | θ) × p(θ | data_n)
and use this as the basis for the next decision
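The repeated application of Bayes’ Law can be sketched with the beta-binomial model that the deck uses later for response rates. The prior and the interim counts below are hypothetical, chosen only for illustration. The point is that updating in two steps gives the same posterior as updating once with all the data.

```python
from fractions import Fraction

def update_beta(a, b, responses, n):
    """Conjugate update of a beta(a, b) prior with `responses` successes in `n` patients."""
    return a + responses, b + (n - responses)

# Hypothetical prior and interim counts, chosen only for illustration.
prior = (Fraction(1, 2), Fraction(1, 2))

# Step 1: the first 10 patients yield 3 responses.
posterior_n = update_beta(*prior, responses=3, n=10)

# Step 2: the previous posterior is the new prior; 5 more patients yield 2 responses.
posterior_n_plus_m = update_beta(*posterior_n, responses=2, n=5)

# Processing all 15 patients in one batch lands on the same posterior.
one_batch = update_beta(*prior, responses=5, n=15)
assert posterior_n_plus_m == one_batch
print(posterior_n_plus_m)
```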
Two Types of Statistical Methodologies for Clinical Trials
1) BAYESIAN
2) NON-BAYESIAN
Two Types of Practical Statistical Methodologies for Clinical Trials
1) BAYESIAN with good frequentist properties
2) NON-BAYESIAN with lots of random effects
The Two Types of Statistical Methodologies for Clinical Trials
1) Practical methods that actually are applied
2) Everything Else
Bayesian Statistics
The Most Important Dichotomy for Clinical Trial Designs
1) DESIGNS USED TO CONDUCT REAL TRIALS IN WHICH REAL PHYSICIANS USE REAL TREATMENTS ON REAL PATIENTS WITH REAL DISEASES
2) EVERYTHING ELSE
Some Facts About Cancer Clinical Trials
1. Patient outcome is highly complex
2. Patient outcome is highly stochastic
Probability modeling and statistical methods are inevitable
3. Most cancer treatments are harmful
Trade-offs between treatment efficacy and toxicity are inevitable
4. All events must be defined explicitly
5. The design must be practical, not merely decorative
The design must reflect actual medical practice, not a statistician’s fantasy
Things to Identify When Designing a Clinical Trial
1. Disease and patient subgroup (“Entry Criteria”)
2. Treatment(s) and/or multi-stage regimes to be studied
3. Standard treatment(s), if any, for these patients
4. Therapeutic paradigm
a) doses, if phase I or phase I-II
b) treatment schedule(s)
c) treatment administration mode (IV, oral, subcutaneous drug, radiation modality, etc.)
d) multi-stage treatment regimes (treatments, adaptive rules, sequencing, timing)
5. Patient outcomes/events relevant to treatment evaluation and safety monitoring
6. Important known prognostic covariates/subgroups
7. Relevant historical data or elicited prior information
a) Outcome (response, toxicity, disease progression time, survival time , etc.) rates/means with standard treatment(s)
b) Covariate effects
8. Scientific/medical objectives, relative to the clinical outcomes
a) Comparisons
b) What is to be estimated
c) Relationship between statistical/scientific goals and medical objectives
9. Institutions that will participate
10. Who supplies the drug(s) and who pays what
11. Logistical parameters
a) Maximum trial duration
b) Accrual rate (expected, or possibly a range)
c) Methods for evaluating patient outcomes
d) Time required to evaluate outcomes
e) Rx availability
f) Costs and financial limitations
g) Key personnel: Research nurses, physicians, statisticians, data managers, programmers,
A Bayesian Paradigm for Designing Clinical Trials
1. Determine all relevant background information
2. Write down a reasonable probability model, likelihood p(X|θ) and prior p(θ)
3. Elicit prior information and establish
4. Write down a reasonable set of decision rules based on the posterior, p(θ|X), including all design parameters
5. Write down a reasonable set of clinical scenarios, each a fixed value θ^true or p(X)^true
6. Simulate the design under each scenario to obtain its operating characteristics (OCs)
a) sample size distribution
b) rates/probabilities of important clinical events
c) trial duration distribution
d) decision (stop, choose dose/rx, etc.) probabilities
7. Calibrate the design parameters and re-simulate until acceptable OCs are obtained
8. Design the trial as if your mother / father / wife / husband / best friend / child might be enrolled as a patient
A General Bayesian Methodology for Constructing Phase II Clinical
Trial Designs
Thall and Simon (1994)
Thall, Simon and Estey (1995, 1996)
Thall and Sung (1998)
Thall, Wooten, and Tannir (2005)
Phase II Clinical Trial
A small trial, usually single-arm in oncology, usually based on an early outcome, usually a binary indicator of treatment “response.”
The goal is to determine whether an experimental treatment, E, is sufficiently “promising” to motivate a large, randomized phase III trial of E versus a standard treatment, S.
Phase II Clinical Trial
“Response” may be
1) > 50% shrinkage of a solid tumor
2) Stable disease or better for > 3 months
3) Complete Remission (CR) of leukemia
4) Engraftment of a bone marrow transplant
“Toxicity” may be
1) Transient toxicity (nausea, vomiting, fatigue, low blood cell counts, hair loss)
2) Permanent organ damage (kidneys, liver, heart, muscles, brain)
3) Regimen-related death
Phase IIA “Activity” Clinical Trial
1) There is no existing treatment with substantive anti-disease activity
The null mean Pr(response) = μo < .05
2) Objective: Determine whether E has “substantive” anti-disease effect
Stop the trial if Pr(θE > μo | data) < pL
where θE = Pr(response | E). E.g., stop if 0/15 responses observed
Phase IIB Clinical Trial
1) There exists at least one available “standard” treatment S with some anti-disease activity
Null mean Pr(response) = μS > .10
2) Objective: Determine whether E substantively improves anti-disease effect compared to S
Beta Distributions
Xn = # responses in n patients
θ = Pr(response)
θ ~ beta(a,b)
E(θ) = μ = a/(a+b)
var(θ) = μ(1-μ)/(a+b+1)
[Xn | θ] ~ bin(n,θ) and θ ~ beta(a,b) ⇒ [θ | Xn] ~ beta(a + Xn, b + n - Xn)
W90 = U - L where Pr(L < θ < U) = .90
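The beta quantities above can be computed with a short sketch. It assumes that W90 is the width of the equal-tailed interval (5th to 95th percentile), which the slides do not state, and it uses a simple midpoint-grid approximation instead of a statistics library. The interim data (4 responses in 20 patients) are hypothetical.

```python
import math

def beta_mean_var(a, b):
    """Mean and variance of beta(a, b)."""
    mu = a / (a + b)
    return mu, mu * (1 - mu) / (a + b + 1)

def beta_quantile(p, a, b, cells=20000):
    """Approximate the p-th quantile by accumulating density on a midpoint grid."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = 1.0 / cells
    cdf = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        cdf += h * math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
        if cdf >= p:
            return x
    return 1.0

def w90(a, b):
    """Width of the equal-tailed 90% interval (an assumption about how W90 is defined)."""
    return beta_quantile(0.95, a, b) - beta_quantile(0.05, a, b)

def posterior(a, b, x, n):
    """Conjugate update: beta(a, b) prior, x responses in n patients."""
    return a + x, b + n - x

a, b = 8.15, 32.6                              # a beta with mean .20
print(beta_mean_var(a, b)[0], w90(a, b))
a_post, b_post = posterior(a, b, x=4, n=20)    # hypothetical interim data
print(w90(a_post, b_post))                     # the interval narrows as data accrue
```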
[Figure: Some beta pdfs for θ with mean μ = .20 and W90 = .10, .20, .30, .40]
Relationship between W90 and a+b for beta distributions with mean .20
W90    (a, b)           a+b    1/(a+b)
.10    (34.1, 136.6)    169    .0012
.20    (8.15, 32.6)     39     .026
.30    (3.3, 13.1)      14     .071
.40    (1.4, 5.7)       5      .020
Relationship between W90 and a+b for beta distributions with mean .40
W90    (a, b)           a+b    1/(a+b)
.10    (104, 156)       260    .0038
.15    (46.4, 69.6)     116    .0086
.20    (25.6, 38.4)     64     .016
.25    (16.4, 24.6)     41     .024
[Figure: beta pdfs with mean μ = .40 and W90 = .17, .19, .23, .25, .29, .35]
Bayesian Designs for Phase IIB Trials
E = “Experimental” treatment
S = “Standard” treatment
θS = Pr(response with S)
θE = Pr(response with E)
Conduct a single arm trial of E, and stop the trial early based on the interim data if it is either
a) Likely that E is superior to S, or
b) Unlikely that E is superior to S
Priors: The Historical Standard Rx
θS ~ beta(aS, bS)
1) Mean E(θS) = μS = aS/(aS+bS)
2) aS+bS = the effective sample size of the prior on θS. This must be reasonably large, i.e. this prior is informative.
Priors: The Experimental Rx
θE ~ beta(aE, bE)
Mean E(θE) = μE = aE / (aE+bE)
aE+bE = 1 or 2 (non-informative)
[Figure: prior densities of θE and θS, with μE = μS + δ/2]
Determining Maximum Sample Size
M = the smallest integer n so that, for given width W = U – L, Pr(L < θE < U | XM) > p* for “typical” data, e.g. XM = M μE
E(Xn)/n    p*=.90    p*=.95
.20        42        59
.30        55        81
.40        63        89
.50        65        93
Bayesian Designs for Phase IIB Trials
1) If Pr(θE > θS + δ | data) < pL, stop and conclude E is not promising compared to S (Futility)
2) If Pr(θE > θS | data) > pU, stop and conclude E is promising compared to S (Superiority)
δ = .15-.20, pL = .01-.20, pU = .80-.99
(Thall and Simon, 1994)
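The two posterior probabilities in these rules have no closed form, but they are easy to approximate. The sketch below integrates the two independent beta densities on a grid; the grid method and function names are illustrative choices, not taken from the cited papers. As an illustration it uses the priors of the deck’s AML example, beta(33, 33) for S and beta(1.2, 0.8) for E, with 5, 9 and 15 responses in 17 patients.

```python
import math

def beta_cell_masses(a, b, cells):
    """Probability mass of beta(a, b) in each grid cell (midpoint rule, renormalised)."""
    logs = []
    for i in range(cells):
        x = (i + 0.5) / cells
        logs.append((a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

def prob_e_beats_s(a_e, b_e, a_s, b_s, delta=0.0, cells=2000):
    """Approximate Pr(theta_E > theta_S + delta) for independent betas, delta >= 0."""
    mass_e = beta_cell_masses(a_e, b_e, cells)
    mass_s = beta_cell_masses(a_s, b_s, cells)
    # cdf_s[j] approximates Pr(theta_S < midpoint of cell j)
    cdf_s, running = [], 0.0
    for m in mass_s:
        cdf_s.append(running + 0.5 * m)
        running += m
    shift = round(delta * cells)          # delta expressed in grid cells
    return sum(m * cdf_s[i - shift] for i, m in enumerate(mass_e) if i >= shift)

def posterior_e(a_e, b_e, responses, n):
    """Conjugate update of the prior on theta_E."""
    return a_e + responses, b_e + n - responses

A_S, B_S = 33, 33        # prior on theta_S (AML example)
A_E, B_E = 1.2, 0.8      # prior on theta_E (AML example)
for x in (5, 9, 15):
    a, b = posterior_e(A_E, B_E, x, 17)
    print(x, round(prob_e_beats_s(a, b, A_S, B_S), 2))
```

With these inputs the values should land close to the posterior probabilities quoted for 5/17, 9/17 and 15/17 CRs later in the deck. Passing `delta=0.20` gives the futility criterion instead.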
Implementation
1) Specify a beta(aS,bS) prior on θS and a beta(aE,bE) prior on θE with aS+bS “large” & aE+bE “small”
2) μE = μS or possibly μE = μS + δ/2
3) Specify a maximum sample size
4) Specify (δ, pL, pU)
5) Compute upper and lower stopping boundaries
6) Apply the stopping rules after each patient (“continuously”), or after cohorts of given size, or periodically
Simulation
1) For each of several fixed values of θE^true, simulate the trial on the computer and record the design’s average behavior, i.e. its “Operating Characteristics” (OCs)
2) The OCs consist of
a) Probability distribution of N = achieved sample size
b) p+ = prob(E declared “promising”)
c) p- = prob(stop due to futility) = prob(E declared “not promising”)
3) Calibrate the design parameters based on the OCs, re-simulate, and iterate
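The simulation loop can be sketched in a few dozen lines. This is a minimal illustration, not the authors’ software. It precomputes stopping bounds from the two posterior criteria exactly as stated earlier (futility with δ, superiority without δ) and then simulates continuously monitored trials at a fixed true response rate. The design values are borrowed from the deck’s AML example. Because the slides do not give every detail of how that example’s bounds were computed, the printed bounds and OCs need not match the example’s tables.

```python
import math
import random

CELLS = 400
LOG_X = [math.log((i + 0.5) / CELLS) for i in range(CELLS)]
LOG_1MX = [math.log(1 - (i + 0.5) / CELLS) for i in range(CELLS)]

def beta_masses(a, b):
    """Renormalised beta(a, b) probability mass on a midpoint grid."""
    logs = [(a - 1) * lx + (b - 1) * l1 for lx, l1 in zip(LOG_X, LOG_1MX)]
    top = max(logs)
    w = [math.exp(v - top) for v in logs]
    total = sum(w)
    return [v / total for v in w]

def stopping_bounds(prior_e, prior_s, delta, p_l, p_u, n_min, n_max):
    """Per n: largest x that stops for futility, smallest x that stops for superiority."""
    cdf_s, running = [], 0.0
    for m in beta_masses(*prior_s):
        cdf_s.append(running + 0.5 * m)
        running += m
    shift = round(delta * CELLS)
    lower, upper = {}, {}
    for n in range(n_min, n_max + 1):
        lower[n], upper[n] = -1, n + 1            # defaults: never stop at this n
        for x in range(n + 1):
            mass_e = beta_masses(prior_e[0] + x, prior_e[1] + n - x)
            p_improve = sum(m * cdf_s[i - shift] for i, m in enumerate(mass_e) if i >= shift)
            p_better = sum(m * c for m, c in zip(mass_e, cdf_s))
            if p_improve < p_l:
                lower[n] = x
            if p_better > p_u and upper[n] > n:
                upper[n] = x
    return lower, upper

def simulate(theta_true, lower, upper, n_min, n_max, reps=2000, seed=1):
    """Operating characteristics under a fixed true response probability."""
    rng = random.Random(seed)
    sizes, promising, futile = [], 0, 0
    for _ in range(reps):
        x = 0
        for n in range(1, n_max + 1):
            x += rng.random() < theta_true
            if n >= n_min and (x <= lower[n] or x >= upper[n]):
                break
        if x <= lower[n]:
            futile += 1
        elif x >= upper[n]:
            promising += 1
        sizes.append(n)
    sizes.sort()
    quartiles = tuple(sizes[int(q * (reps - 1))] for q in (0.25, 0.5, 0.75))
    return {"p_plus": promising / reps, "p_minus": futile / reps, "N_quartiles": quartiles}

lower, upper = stopping_bounds((1.2, 0.8), (33, 33), delta=0.20,
                               p_l=0.025, p_u=0.95, n_min=10, n_max=65)
for theta in (0.50, 0.70):
    print(theta, simulate(theta, lower, upper, 10, 65))
```

Calibration then amounts to changing `delta`, `p_l`, `p_u` or the sample size limits and re-running the loop until the OCs are acceptable.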
Example: A Chemotherapy Trial in AML
S = Fludarabine + Cytarabine (cytosine arabinoside, ara-C)
E = S + granulocyte colony stimulating factor (G-CSF, Neupogen®, Neulasta®)
1) θS ~ beta(33, 33), μS = .50, WS,90 = .20
2) δ = .20, targeted μE = .50 + .20 = .70
3) θE ~ beta(1.2, 0.8), μE = .60, WE,90 = .88
pL = .025, pU = .95, nmin = 10, nmax = 65
Example: A Chemotherapy Trial in AML
Early stopping bounds:
1) Stop the trial if [# responses] / [# pats.] = Xn / n
> 10/10, 11/11, 12/12, …, 54/64 (Superiority)
< 0/7, 1/8, 2/9, 3/10, …, 33/60 (Futility)
2) Stopping bounds for cohort size = 5
> 10/10, 14/15, 18/20, …, 51/60 (Superiority)
< 3/10, 6/15, 8/20, …, 31/60 (Futility)
Operating Characteristics
Monitoring               θE^true    Pr(Stop)    Sample Size (25, 50, 75)%ile
Continuous Monitoring    .50        .83         11, 22, 49
Continuous Monitoring    .70        .15         65, 65, 65
Cohorts of Size 5        .50        .77         15, 30, 60
Cohorts of Size 5        .70        .11         65, 65, 65
[Figure: density of θS and posterior densities of θE for three possible interim results]
5/17 CRs observed: Pr(θE > θS | 5/17) = .08
9/17 CRs observed: Pr(θE > θS | 9/17) = .61
15/17 CRs observed: Pr(θE > θS | 15/17) = .99
Phase II Equivalence Trials
Thall-Sung (1998) adaptation of Thall-Simon (1994) to accommodate activity trials:
Given a fixed target response probability p0, assume θE ~ beta(2p0, 2(1-p0)), or maybe beta(p0, 1-p0)
1) M = maximum sample size
2) pL = stopping probability cut-off
3) Stop the trial if Pr[p0 < θE | data] < pL
4) The “Target Activity Level” is p0 = .10 to .30, most often .20
Patient Safety is NEVER a Secondary Consideration in a
Clinical Trial
In an ethical clinical trial, safety monitoring is inevitable, including explicit rules to stop the trial early if the rate of toxicity or rate of regimen-related death is excessively high
Patient outcome is nearly always much more complex than one binary “response” variable
Patient safety is never a secondary consideration in a clinical trial
The rate of toxicity usually is determined very unreliably in phase I
Monitoring Multiple Discrete Outcomes
(Thall, Simon and Estey, 1995; Thall and Sung, 1998)
K=3
Monitor pr(CR) and pr(death)
K=5
Monitor
1) pr(CR), or pr(CR | alive)
2) pr(TOX), or pr(TOX | alive)
3) pr(death)
A Bio-Chemotherapy Trial in Acute Leukemia
                   Complete Remission (CR)
                   Yes    No
TOXICITY    Yes    A1     A2
            No     A3     A4
Dirichlet – Multinomial Model
An experiment has k possible elementary outcomes, A1,…,Ak, with
θj = Pr(Aj), for j=1,…,k-1, and θk = 1 - θ1 - … - θk-1
so θ = (θ1,…,θk-1) is k-1 dimensional.
The Dirichlet pdf with parameters a = (a1,…,ak) is
p(θ | a) ∝ θ1^(a1-1) θ2^(a2-1) … θk^(ak-1)
We write “θ | a ~ Dirichlet(a)”
θ | a ~ Dirichlet(a)
1) a+ = a1 + … + ak = Effective sample size
2) E(θj) = aj / a+ = μj
3) var(θj) = μj(1 - μj) / (1 + a+)
4) cov(θj, θr) = – μj μr / (1 + a+)
The case k=2 is the Beta(a1, a2)
Dirichlet – Multinomial Model
θ | a ~ Dirichlet(a)
All subvectors of θ are Dirichlet
E.g. k=4 with
(θ1, θ2, θ3) ~ Dirichlet(a1, a2, a3, a4)
(θ1, θ2) ~ Dirichlet(a1, a2, a3+a4)
Dirichlet – Multinomial Model
θ | a ~ Dirichlet(a)
Sums of subvectors of θ are Beta
E.g. k=4 with
(θ1, θ2, θ3) ~ Dirichlet(a1, a2, a3, a4)
(θ1 + θ2) ~ Beta(a1 + a2, a3 + a4)
Dirichlet – Multinomial Model
                   Complete Remission
                   Yes    No
TOXICITY    Yes    θ1     θ2
            No     θ4     θ3
(θ1, θ2, θ3, θ4) ~ Dirichlet(80, 70, 30, 120)
θTOX = θ1 + θ2 ~ Beta(150, 150), μTOX = ½
θCR = θ1 + θ4 ~ Beta(200, 100), μCR = ⅔
θ | a ~ Dirichlet(a)
Summing subvectors of θ gives a Dirichlet
E.g. (θ1,…,θ5) ~ Dirichlet(a1,…,a5)
(θ1+θ2, θ3+θ4, θ5) ~ Dir(a1+a2, a3+a4, a5)
This corresponds to collapsing elementary events, e.g. from 5 to 3 in this example.
Dirichlet – Multinomial Model
X = (X1,…,Xk) ~ k-nomial with outcome probabilities θ1,…,θk and
θ | a ~ Dirichlet(a) ⇒ a posteriori
[θ | X] ~ Dirichlet(a + X) ≡ Dirichlet(a1 + X1, …, ak + Xk)
k=2 is the “binomial(X1, θ1) - beta(a1, a2)” with X1 = X, X1 + X2 = n and θ1 = θ
Dirichlet – Multinomial Model
4 elementary events
θ = (θ1, θ2, θ3, θ4) ~ Dirichlet(a1, a2, a3, a4)
R = A1 ∪ A3
T = A1 ∪ A2
θ(R) = θ1 + θ3 ~ beta(a1 + a3, a2 + a4)
θ(T) = θ1 + θ2 ~ beta(a1 + a2, a3 + a4)
Dirichlet – Multinomial Model
              CR
              Yes    No
TOX    Yes    θ1     θ2
       No     θ3     θ4
θT = θ1 + θ2 and θCR = θ1 + θ3
CR & Toxicity Data from 264 AML Patients Treated With an Anthracycline + ara-C
               CR             No CR         Total
Toxicity       73 (27.7%)     63 (23.9%)    136 (51.5%)
No Toxicity    101 (38.3%)    27 (10.2%)    128 (48.5%)
Total          174 (65.9%)    90 (34.1%)    264
P(CR|Tox) = 73/136 = .54, P(CR|No Tox) = 101/128 = .79
Dirichlet Priors
θS ~ Dirichlet(73, 63, 101, 27)
aS,+ = 73+63+101+27 = 264, μS = (.277, .239, .382, .102)
E(θS,T) = E(θS,1 + θS,2) = .515
E(θS,CR) = E(θS,1 + θS,3) = .659
Set μE = μS with aE,+ = 4 ⇒ θE ~ Dirichlet(1.11, .955, 1.53, .409)
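The arithmetic behind these priors is short enough to sketch. The historical counts and the effective sample size of 4 come from the slides. The helper names, and the way the 21 hypothetical interim patients are split across the four cells, are illustrative assumptions (the marginal posteriors depend only on the marginal counts, here 17 toxicities and 13 CRs).

```python
def dirichlet_mean(a):
    """Mean vector of a Dirichlet(a) distribution."""
    total = sum(a)
    return [x / total for x in a]

def rescale(a, ess):
    """Keep the mean of Dirichlet(a) but change its effective sample size to `ess`."""
    return [ess * m for m in dirichlet_mean(a)]

def posterior(a, counts):
    """Conjugate update: Dirichlet(a) prior plus multinomial counts."""
    return [ai + xi for ai, xi in zip(a, counts)]

def marginal_beta(a, cells):
    """Beta parameters of the sum of the theta's in `cells` (collapsing elementary events)."""
    inside = sum(a[i] for i in cells)
    return inside, sum(a) - inside

# Cell order: (TOX & CR, TOX & no CR, no TOX & CR, no TOX & no CR)
a_S = [73, 63, 101, 27]
a_E = rescale(a_S, 4)

TOX, CR = [0, 1], [0, 2]
print([round(v, 3) for v in a_E])
print(marginal_beta(a_S, TOX), marginal_beta(a_S, CR))

# Hypothetical split of 21 patients giving 17 toxicities and 13 CRs.
interim = [11, 6, 2, 2]
a_E_post = posterior(a_E, interim)
print(marginal_beta(a_E_post, TOX), marginal_beta(a_E_post, CR))
```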
[Figure: marginal priors, and posterior of θE,TOX after 17 / 21 toxicities]
[Figure: marginal priors, and posterior of θE,CR after 13 / 21 CRs]
Maximum sample size = 56
If X = 45 (45/56 = .81, the target) then
Pr(.68 < θE,CR < .88 | X = 45) = .95
That is, a posterior 95% credible interval (CI) will have width .20
Decision Rules
Stop the trial if
1) Pr(θS,CR + .15 < θE,CR | data) < .05
or
2) Pr(θS,TOX + .05 < θE,TOX | data) > .95
A .05 increase (“slippage”) in θTOX is a trade-off for a .15 increase in θCR
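A minimal Monte Carlo sketch of these two rules follows. It relies on the marginal betas implied by the Dirichlet priors: Beta(174, 90) and Beta(136, 128) for S, and the same means with effective sample size 4 for E. The function names, the number of draws and the seed are illustrative assumptions.

```python
import random

def prob_greater(a_e, b_e, a_s, b_s, delta, draws=20000, seed=0):
    """Monte Carlo estimate of Pr(theta_E > theta_S + delta) for independent betas."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a_e, b_e) > rng.betavariate(a_s, b_s) + delta
               for _ in range(draws))
    return hits / draws

def should_stop(n, n_cr, n_tox):
    """Return (futility, toxicity) flags after n patients with n_cr CRs and n_tox toxicities."""
    cr_s, tox_s = (174, 90), (136, 128)                 # marginal priors for S
    cr_e = (4 * 174 / 264 + n_cr, 4 * 90 / 264 + n - n_cr)
    tox_e = (4 * 136 / 264 + n_tox, 4 * 128 / 264 + n - n_tox)
    futility = prob_greater(*cr_e, *cr_s, 0.15) < 0.05  # rule 1: CR gain of .15 unlikely
    toxicity = prob_greater(*tox_e, *tox_s, 0.05) > 0.95  # rule 2: TOX slippage above .05 likely
    return futility, toxicity

print(should_stop(n=3, n_cr=0, n_tox=0))
print(should_stop(n=10, n_cr=9, n_tox=5))
print(should_stop(n=10, n_cr=9, n_tox=10))
```

Scanning `should_stop` over all (n, n_cr) and (n, n_tox) pairs is one way to tabulate cut-offs of the kind listed on the next slide, although Monte Carlo error can shift a borderline entry.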
Decision Cut-Offs
Stop the trial if
[# CRs] / [# patients] < 0/3, 1/4, 2/5, 3/6, 3/7, 4/8, 5/9, … (Futility)
[# Toxicities] / [# patients] > 6/6, 7/7, 8/8, 9/9, 9/10, 10/11, 11/12, 11/13 (Toxicity)
Operating Characteristics
Case    True θE                       Scenario                    Prob. Stopping Early*        Sample Size (25%, 50%, 75%)
1       (.327, .239, .332, .102)      Tox ↑ 0.05                  .856 + .043 + .006 = .90     6, 12, 27
2       (.377, .239, .282, .102)      Tox ↑ 0.10                  .809 + .100 + .012 = .92     6, 12, 26
3       (.427, .089, .382, .102)      CR ↑ 0.15                   .207 + .052 + .001 = .26     45, 56, 56
4       (.627, .089, .182, .102)      CR ↑ 0.15 & Tox ↑ 0.20      .167 + .693 + .003 = .86     10, 19, 36
μS = (.277, .239, .382, .102)
* Prob. Stopping Early = the sum of the stopping probabilities due to the CR rule alone, the Tox rule alone, and both rules
Monitoring Event Times in Phase II
(Thall, Wooten and Tannir, 2005)
An Activity Trial Based On Progression-Free Survival (PFS) Time
Patients with renal cell carcinoma, relapsed or refractory after immunotherapy
S = 5-FU+ Gemcitabine
E = Xeloda + Gemcitabine
Exponential-Inverse Gamma Model
T | μ ~ Exp(μ), f(t | μ) = (1/μ) exp(-t/μ)
E(T) = μ, var(T) = μ²
μ | a,b ~ Inverse Gamma (IG)(a,b)
E(μ) = b/(a-1), var(μ) = b² / {(a-1)²(a-2)}
To = min{T,C}, C = right-censoring time
Exponential-Inverse Gamma Model
For right-censored event times T1^o,…,Tn^o from T1,…,Tn ~ iid Exp(μ) with μ ~ IG(a,b)
N = # uncensored events, and T+ = T1^o + … + Tn^o = total time on test
μ | data ~ IG(a + N, b + T+)
E(μ | data) = (b + T+) / (a + N - 1)
Exponential-Inverse Gamma Example
Experimental Group Survival Times (censored times in parentheses)
1.5, (1.6), (2.4), (4.2), 4.5, (6.7), (8.0), (11.0), 15.0
NE = 3 deaths in nE = 9 patients, TE+ = 54.9
MLE of mean survival time = TE+ / NE = 54.9 / 3 = 18.3
Control Group Survival Times
0.5, (0.6), 1.5, 1.6, (2.0), 3.0, (3.5), (4.0), 4.8, 6.2, (10.5), (11.0), 14.5
NC = 7 deaths in nC = 13 patients, TC+ = 63.7
MLE of mean survival time = TC+ / NC = 63.7 / 7 = 9.1
Assume
Experimental Group Survival Times ~ iid Exp(μE)
Control Group Survival Times ~ iid Exp(μC)
A priori μE, μC ~ iid IG(.01, .01), i.e. 1/μE, 1/μC ~ iid Gam(.01, .01) with mean = 1, var = 100
μE | NE = 3 deaths, TE+ = 54.9 ~ IG(3.01, 54.91)
Posterior mean = 54.91/(3.01-1) = 27.32, var = 738.9
μC | NC = 7 deaths, TC+ = 63.7 ~ IG(7.01, 63.71)
Posterior mean = 63.71/(7.01-1) = 10.60, var = 22.43
Pr(μC < μE | data) = .87
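The example can be reproduced with a short sketch. The survival times and the IG(.01, .01) priors are the ones given above. The code encodes the parenthesised times as right-censored, and it uses Monte Carlo draws for Pr(μC < μE | data); the slides do not say how that probability was computed, so the sampling approach is an assumption.

```python
import random

# Times in months; True marks an observed death, False a right-censored time.
experimental = [(1.5, True), (1.6, False), (2.4, False), (4.2, False), (4.5, True),
                (6.7, False), (8.0, False), (11.0, False), (15.0, True)]
control = [(0.5, True), (0.6, False), (1.5, True), (1.6, True), (2.0, False),
           (3.0, True), (3.5, False), (4.0, False), (4.8, True), (6.2, True),
           (10.5, False), (11.0, False), (14.5, True)]

def ig_posterior(data, a=0.01, b=0.01):
    """IG(a, b) prior on the mean -> IG(a + N, b + T+) posterior."""
    n_events = sum(1 for _, died in data if died)
    total_time = sum(t for t, _ in data)          # total time on test
    return a + n_events, b + total_time

def ig_mean(a, b):
    return b / (a - 1)

def ig_var(a, b):
    return b ** 2 / ((a - 1) ** 2 * (a - 2))

def draw_ig(a, b, rng):
    """If 1/mu ~ Gamma(shape a, rate b) then mu ~ IG(a, b); gammavariate takes a scale."""
    return 1.0 / rng.gammavariate(a, 1.0 / b)

def prob_mu_c_below_mu_e(post_c, post_e, draws=100000, seed=0):
    rng = random.Random(seed)
    hits = sum(draw_ig(*post_c, rng) < draw_ig(*post_e, rng) for _ in range(draws))
    return hits / draws

post_e = ig_posterior(experimental)
post_c = ig_posterior(control)
print(post_e, round(ig_mean(*post_e), 2), round(ig_var(*post_e), 1))
print(post_c, round(ig_mean(*post_c), 2), round(ig_var(*post_c), 2))
print(prob_mu_c_below_mu_e(post_c, post_e))
```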
The Exponential-Inverse Gamma Model: Application to Phase II Monitoring
Tj = time to disease progression or death (“failure”) for j = S or E
μj = Mean time to failure
Tj | μj ~ Exponential
μS ~ Inverse gamma (aS, bS)
μE ~ Inverse gamma (aE, bE)
Event Time Likelihood
Observed data on each patient:
T0 = time to the event or right-censoring
δ = 1 if T0 = T, δ = 0 if T0 < T
For T with pdf f and survivor function F̄(t) = Pr(T > t), the likelihood takes the usual form
L(data | μ) = ∏ i=1…n {fE(Ti0 | μ)}^δi {F̄E(Ti0 | μ)}^(1-δi)
Priors
μ ~ IG(a, b), equivalently 1/μ ~ Gam(a, b)
E(μ) = b/(a-1), var(μ) = b²/{(a-1)²(a-2)}
Elicited values from historical rx:
E{median(T | μS)} = 5.4 mos = log(2) E(μS)
so E(μS) = 5.4 / log(2) = 7.8 mos
Pr(μS > 7 mos) = .50
var(μS) = 12.2
μS ~ IG(aS = 6.87, bS = 45.76)
For the experimental agent prior, we set
E(μE) = E(μS) = 7.8 and var(μE) = 1000
μE ~ IG(aE = 2.06, bE = 8.27)
Note: Establishing good IG priors can be tricky!
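One simple way to turn an elicited mean and variance into IG parameters is to solve the two moment equations: var = mean²/(a-2) gives a = 2 + mean²/var, and then b = mean × (a-1). The sketch below applies this to the experimental prior. The helper name is hypothetical. For S, the slide’s values also reflect the elicited Pr(μS > 7 mos) = .50, so the two-moment sketch is applied to E only.

```python
def ig_from_moments(mean, var):
    """Solve E(mu) = b/(a-1) and var(mu) = b^2/((a-1)^2 (a-2)) for (a, b)."""
    a = 2 + mean ** 2 / var
    b = mean * (a - 1)
    return a, b

a_E, b_E = ig_from_moments(mean=7.8, var=1000)
print(round(a_E, 2), round(b_E, 2))
```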
Trial Conduct
1) A Single-arm trial of E = Xeloda + Gemcitabine
2) Disease progression evaluated at 8-week intervals
3) Accrue a maximum of 84 patients
4) Monitor the data every 2 months
Trial Conduct
Stop the trial if Pr(μS < μE | data) < .18
at any time (a “phase II equivalence” rule)
The cutoff .18 was chosen to obtain incorrect stopping probability .05 if the true mean(TE) = 10.8 months, the desired improvement
Sample Size Rationale: If 70/84 failures are observed, with mean 10.3 months, then
Pr[8.1 < μE < 12.9 | data] = .95
Operating Characteristics for the Stopping Rule Pr(μS < μE | data) < .18
Assuming accrual rate = 6 patients/month. Patients arrive according to a Poisson process. 1,000 reps. per case.
Fixed median(TE)    Fixed mean(TE)    Prob Stop Early    Number of Patients    Trial Duration
3                   4.3               .94                33                    5.4 mos
4                   5.8               .45                59                    9.8 mos
7.5                 10.8              .05                80                    13.4 mos
A More Optimistic Goal
A “3 months improvement” target :
If an improvement of δ = 3 months over the historical mean(T) = 7.8 were desired, then the stopping rule would be
Pr(μS + 3 < μE | data) < pL = .038
with pL calibrated to make pSTOP = .05 (or whatever false stopping rate is desired) at fixed μE = 10.8 mos
Operating Characteristics for the Stopping Rule Pr(μS + 3 < μE | data) < .038
Fixed median(TE)    Fixed mean(TE)    Prob Stop Early         Number of Patients    Trial Duration
3                   4.3               >.99 (up from .94)      28 (down from 33)     4.5 mos
4                   5.8               .84 (up from .45)       46 (down from 59)     7.7 mos
7.5                 10.8              .05                     81                    13.4 mos
[Figure: Early Stopping Probabilities for the “Equivalence” (Delta = 0) and “Improvement” (Delta = 3) Rules, plotted against Med Time to Failure (mos)]
Randomized Phase II Selection Trials
A Trial of Topotecan for AML
Patient Outcomes in the Randomized Trial of Topotecan for AML Salvage
Historical data with S = ara-C
[Figure: diagram of outcome counts (25, 6, 3, 2, 10, 35) among the 81 historical patients]
θS,CR = 9/81 = .11, θS,TOX = 41/81 = .51, θS,DEATH = 12/81 = .15
Stop arm Ej if
1) Pr[θE,j(TOX) > θS(TOX) + .05 | data] > pU,TOX
2) Pr[θE,j(CR) > θS(CR) + .20 | data] < pL,CR
3) Pr[θE,j(Death) > θS(Death) | data] > pU,DEATH
A .05 increase (“slippage”) in θ(TOX) is the trade-off for a .20 increase in θ(CR).
Within-Arm OCs of the Topotecan Trial Design
Selection Probabilities for the Topotecan Trial Design
Some General Conclusions
1) Randomized phase II selection trials provide unbiased comparisons among 2 or more experimental treatments.
2) The goal is to select the best E for future evaluation in phase III. The goal is not to achieve a specified improvement over S with a specified power.
3) If an arm is terminated, it is best to randomize all remaining patients to the remaining arms; otherwise, the null selection probabilities are inflated.