
Bayesian estimation of cost-effectiveness ratios from clinical trials


HEALTH ECONOMICS

Health Econ. 8: 191–201 (1999)

ECONOMIC EVALUATION

BAYESIAN ESTIMATION OF COST-EFFECTIVENESS RATIOS FROM CLINICAL TRIALS

DANIEL F. HEITJAN a,b,*, ALAN J. MOSKOWITZ b AND WILLIAM WHANG b

a Division of Biostatistics, Columbia University, New York, USA
b International Center for Health Outcomes and Innovation Research, Columbia University, New York, USA

SUMMARY

Estimation of the incremental cost-effectiveness ratio (ICER) is difficult for several reasons: treatments that decrease both cost and effectiveness and treatments that increase both cost and effectiveness can yield identical values of the ICER; the ICER is a discontinuous function of the mean difference in effectiveness; and the standard estimate of the ICER is a ratio. To address these difficulties, we have developed a Bayesian methodology that involves computing posterior probabilities for the four quadrants and separate interval estimates of ICER for the quadrants of interest. We compute these quantities by simulating draws from the posterior distribution of the cost and effectiveness parameters and tabulating the appropriate posterior probabilities and quantiles. We demonstrate the method by re-analysing three previously published clinical trials. Copyright © 1999 John Wiley & Sons, Ltd.

KEY WORDS: Bayesian inference; clinical trials; confidence intervals; cost-effectiveness ratios; net health benefit

* Correspondence to: Division of Biostatistics, Joseph L. Mailman School of Public Health, Columbia University, 600 W. 168th Street, New York, NY 10032, USA. E-mail: [email protected]

CCC 1057–9230/99/030191–11$17.50
Copyright © 1999 John Wiley & Sons, Ltd.

Received 28 October 1997
Accepted 29 October 1998

INTRODUCTION

A number of authors [1–9] have proposed the incremental cost-effectiveness ratio (ICER) as a measure of the cost-effectiveness of a new treatment. In a two-arm clinical trial pitting an experimental treatment against a standard, the ICER is defined to be the ratio of the mean difference in average cost to the mean difference in average effectiveness, where effectiveness may be measured by quality-adjusted life years, survival, patient response, or some other clinical outcome.

Figure 1 presents a graphical view. The vertical axis is the difference in mean cost, and the horizontal axis is the difference in mean effectiveness. If the parameters lie in quadrant I, the new treatment increases both effectiveness and cost, implying that the benefit of the new treatment comes at a price. If the parameters lie in quadrant II, the new treatment increases cost but decreases effectiveness, implying that the control dominates the experimental. If the parameters lie in quadrant III, the new treatment decreases both effectiveness and cost, implying that the new treatment achieves its cost saving at the expense of patient health. If the parameters lie in quadrant IV, the new treatment increases effectiveness while decreasing cost, implying that the experimental dominates the control. If the average cost is $g_1$ on experimental and $g_0$ on control, and the average effectiveness is $e_1$ on experimental and $e_0$ on control, then the vector of treatment-control differences has horizontal coordinate $e = e_1 - e_0$ and vertical coordinate $g = g_1 - g_0$. The ICER r is the slope of a ray from the origin to this point, or

$$\mathrm{ICER} = r = \frac{g}{e} = \frac{g_1 - g_0}{e_1 - e_0}.$$

Figure 1. Schematic representation of the space of cost and effectiveness differences to be estimated from a clinical trial. The arrows indicate the direction of change from less to more favourable ICERs.
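As a minimal illustration of the definition above, the following sketch computes the ICER and names the quadrant for a pair of treatment-control differences. The function names `icer` and `quadrant` and the numerical inputs are hypothetical and serve only as an example.

```python
def icer(cost_diff, effect_diff):
    """Slope of the ray from the origin to the point (effect_diff, cost_diff)."""
    if effect_diff == 0:
        raise ZeroDivisionError("ICER is undefined when the effectiveness difference is 0")
    return cost_diff / effect_diff


def quadrant(cost_diff, effect_diff):
    """Quadrant of the cost-effectiveness plane: effectiveness on x, cost on y."""
    if effect_diff > 0 and cost_diff > 0:
        return "I"    # more effective and more costly
    if effect_diff < 0 and cost_diff > 0:
        return "II"   # less effective and more costly: control dominates
    if effect_diff < 0 and cost_diff < 0:
        return "III"  # less effective and less costly
    if effect_diff > 0 and cost_diff < 0:
        return "IV"   # more effective and less costly: experimental dominates
    return "boundary"  # at least one difference is exactly zero


# Hypothetical (cost, effectiveness) differences; the two ratios coincide.
for cost_diff, effect_diff in [(3.0, 2.0), (-3.0, -2.0)]:
    print(quadrant(cost_diff, effect_diff), icer(cost_diff, effect_diff))
```

Both pairs give the same ratio although they fall in quadrants I and III, so the ratio by itself does not say which quadrant the differences lie in.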

Typically, we expect new treatments to be both more costly and more effective than their predecessors, and thus we expect the differences to lie in quadrant I. If uncertainty is small (thanks to a sufficiently large sample size), interval estimates of treatment effects and consequently of r will be narrow. However, when either the effectiveness or cost difference is small relative to its S.E., as often happens in clinical trials, it is unclear whether the underlying mean differences actually lie in quadrant I. Methods for estimating the ICER need to account for such uncertainties.

The standard point estimate of the ICER is

$$\hat{r} = \frac{\bar{C}_1 - \bar{C}_0}{\bar{E}_1 - \bar{E}_0},$$

where $\bar{C}_i$ is the average cost in group i and $\bar{E}_i$ the average effectiveness in group i. Currently, there are four popular approaches for interval estimation of r. In the Taylor-series method [6], the interval is $\hat{r} \pm 2 \times \mathrm{S.E.}(\hat{r})$, where $\mathrm{S.E.}(\hat{r})$ is the estimated S.E. of the approximating large-sample normal distribution of $\hat{r}$. In Fieller's method [2,3,7,9], the confidence bounds are the roots of a quadratic equation in r. In the Bonferroni method [3], one computes separate intervals for g and e, then projects the limits of these bounds toward the origin to produce a range of r values. In the bootstrap method [1,2,5–7], one draws a large number of bootstrap samples (i.e., samples with replacement from the original data) and computes the estimate $\hat{r}$ from each to produce a simulated sampling distribution of $\hat{r}$. One then summarizes this distribution to produce an interval estimate for r.
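The bootstrap approach described above can be sketched as follows. This is only an illustration under stated assumptions: the patient-level data are simulated and hypothetical, the helper names (`icer_hat`, `bootstrap_icers`, `percentile_interval`) are invented here, and the simple percentile summary is one of several possible ways to summarize the bootstrap distribution, not necessarily the one used in the cited papers.

```python
import random
from statistics import mean


def icer_hat(treated, control):
    """Point estimate: difference in mean cost over difference in mean effectiveness.

    Each arm is a list of (cost, effectiveness) pairs, one per patient.
    """
    cost_diff = mean(c for c, _ in treated) - mean(c for c, _ in control)
    effect_diff = mean(e for _, e in treated) - mean(e for _, e in control)
    return cost_diff / effect_diff


def bootstrap_icers(treated, control, n_boot=2000, seed=0):
    """Resample patients with replacement within each arm and re-estimate the ICER."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        estimates.append(icer_hat(t, c))
    return estimates


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval read off the sorted bootstrap estimates."""
    s = sorted(values)
    lo = s[int(len(s) * alpha / 2)]
    hi = s[int(len(s) * (1 - alpha / 2)) - 1]
    return lo, hi


# Hypothetical trial data, generated only to make the sketch runnable.
rng = random.Random(1)
treated = [(rng.gauss(1200, 300), rng.gauss(0.8, 0.3)) for _ in range(200)]
control = [(rng.gauss(1000, 300), rng.gauss(0.5, 0.3)) for _ in range(200)]

point_estimate = icer_hat(treated, control)
boot = bootstrap_icers(treated, control)
interval = percentile_interval(boot)
print(point_estimate, interval)
```

In this hypothetical data set the effectiveness difference is far from zero relative to its S.E., so every bootstrap ratio is well defined and positive.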

The Taylor-series method is simple but unreliable, because the sampling distribution of $\hat{r}$ may be asymmetric even in large samples, and the Taylor-series S.E. can underestimate the true sampling variability. Fieller's method works well unless the observed denominator difference $\bar{E}_1 - \bar{E}_0$ is not significantly different from zero, in which case it produces infinite intervals [3,9]. The Bonferroni method is reliable even if the treatment effects are ambiguous, but it can be overly conservative [3]. The bootstrap method has found favour recently because it requires few assumptions about the underlying distribution of the data. However, it too can fail when the effect difference is small [8].

Several authors have identified vexing problems in the interpretation and estimation of the ICER, above and beyond the usual statistical difficulties with estimation of a ratio parameter. Laska et al. [3], among others, have noted that ICERs from quadrant I can take the same values as ICERs from quadrant III. For example, a treatment that increases costs by 3 units and effectiveness by 2 units has an ICER of 1.5, the same as a treatment that decreases costs by 3 units and decreases effectiveness by 2 units. Despite their equal r values, normally one would not deem two such treatments equivalent in a cost-effectiveness sense.

Moreover, in quadrant I lower ICER values are preferred because they give more benefit per unit of cost, whereas in quadrant III higher ICER values are preferred because they give a greater saving per unit of effectiveness sacrificed. So if utility is a monotone function of r in both quadrants (decreasing in I, increasing in III), and there is an ICER value $r_0 > 0$ that implies equal utility regardless of the signs of the numerators and denominators, there can be only one such value: positive values of r less than $r_0$ will have higher utility than $r_0$ in quadrant I, but lower utility in quadrant III.

A second, more subtle, problem is the discontinuity of the ICER as a function of its constituent parameters. Specifically, for a fixed value of the cost difference, the ICER is a discontinuous function of the effectiveness difference at e = 0. For example, suppose the cost difference is g = 10. Then if e = 1, the ICER is r = 10, and as e decreases towards 0, r increases to ∞. On the other hand, if e = −1, then r = −10, but as e approaches 0 from the negative side, r approaches −∞. Figure 2 is a perspective plot of r(e, g). The dominant feature of this plot is the explosion in the value of r along the vertical axis.
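The numerical example above can be reproduced with a few lines of code. This is a small sketch with a hypothetical helper name, `icer_along_effect`, that evaluates r = g/e for g = 10 as e shrinks towards 0 from each side.

```python
def icer_along_effect(cost_diff, effect_diffs):
    """ICER for a fixed cost difference over a sequence of effectiveness differences."""
    return [(e, cost_diff / e) for e in effect_diffs]


# Fixed cost difference g = 10, with e approaching 0 from each side.
from_right = icer_along_effect(10.0, [1.0, 0.1, 0.01, 0.001])
from_left = icer_along_effect(10.0, [-1.0, -0.1, -0.01, -0.001])

for e, r in from_right + from_left:
    print(f"e = {e:+.3f}  r = {r:+.1f}")
```

Two points that are close together in the plane, one just right and one just left of e = 0, end up at opposite extremes of the r scale.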

This discontinuity has consequences for interval estimation of r. For example, if there is a large positive difference in cost but only a modest positive difference in effectiveness, many bootstrap samples will give cost and effectiveness pairs lying near the vertical axis. Points just to the left of the axis give large negative $\hat{r}$, whereas points just to the right of the axis give large positive $\hat{r}$. If one naively lines up the bootstrap estimates in a conventional histogram, the negative values of $\hat{r}$ from quadrant II are misplaced; they actually should lie at the far right, beyond the large positive values from quadrant I that represent high cost for low effect. By placing negative values at the left, one may artificially reduce both the upper and lower bootstrap confidence limits for r, thereby invalidating the coverage probability of the interval [8].

We believe that the solution to these difficulties is to consider cost-effectiveness as a two-dimensional parameter, comprising the numerical value of r together with the quadrant that the differences lie in. Moreover, we believe that the best way to estimate cost-effectiveness is to use the Bayesian paradigm [10–12] that treats model parameters as random quantities. In ICER estimation, this approach allows us to calculate the posterior probability, given the data, of any reasonable hypothesis about the parameters: for example, the hypothesis that the effects are either in quadrant I with an r value less than, say, $10,000 per life saved, or in quadrant IV, where new dominates control. Our method merely requires that one simulate from the posterior distribution and compute means and quantiles with the simulated parameter draws.

We present a brief introduction to Bayesian inference, then describe our method and apply it to three examples.

BAYESIAN INFERENCE

Modern statistics distinguishes two modes of inference. The dominant mode is frequentist statistics, which bases probability statements on the distribution of the observables given model parameters. The frequentist benchmark for a procedure is whether it achieves desirable properties on average in repeated samples. For example, in significance testing one evaluates methods in terms of their long-run probabilities of falsely rejecting a true null hypothesis (the type I error rate), and of correctly rejecting a false null hypothesis (the power). In confidence interval estimation, one evaluates methods in terms of such properties as the probability of covering the true parameter value and the average interval width, were the study to be repeated infinitely many times.

Figure 2. A perspective plot of the ICER parameter r as a function of mean cost difference $g = g_1 - g_0$ and mean effectiveness difference $e = e_1 - e_0$. The function has a singularity along the line e = 0.

In Bayesian, or subjectivist, statistics one conditions on what is known (i.e., the data) and uses probability distributions to describe uncertainty about the unknowns (i.e., model parameters). Bayes's theorem is a recipe for quantifying views about model parameters in the light of prior knowledge about the parameters and newly acquired data. Assume that we begin with a prior density f(u) that describes current uncertainty about model parameter u. We collect data x that give rise to the likelihood function $L(u; x) = f(x \mid u)$, or the probability of the observed data given u, considered as a function of u. Bayes's theorem states that the new description of uncertainty about u is the posterior density, computed as follows:

$$f(u \mid x) = \frac{f(u)\,L(u; x)}{\int f(u)\,L(u; x)\,du}.$$
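To make the formula concrete, the sketch below applies Bayes's theorem on a grid of parameter values, replacing the integral in the denominator by a sum. The setting is an assumption chosen for illustration and does not come from the paper: a single proportion u, a flat prior, and hypothetical data of 7 successes in 10 trials. The function name `grid_posterior` is likewise invented.

```python
def grid_posterior(prior, likelihood, grid):
    """Discretised Bayes's theorem: posterior weight is prior times likelihood,
    normalised so that the weights over the grid sum to 1."""
    unnormalised = [prior(u) * likelihood(u) for u in grid]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]


# Grid of midpoints on (0, 1) for a proportion u.
grid = [(i + 0.5) / 200 for i in range(200)]

# Hypothetical data: 7 successes in 10 independent trials; flat prior on u.
successes, trials = 7, 10
posterior = grid_posterior(
    prior=lambda u: 1.0,
    likelihood=lambda u: u ** successes * (1 - u) ** (trials - successes),
    grid=grid,
)

posterior_mean = sum(u * p for u, p in zip(grid, posterior))
print(round(posterior_mean, 3))
```

With a flat prior the exact posterior here is a Beta(8, 4) distribution, whose mean of 8/12 gives a check on the grid approximation.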

There is a rich literature on Bayesian methods. Some valuable introductory texts are those of Berry [10], Box and Tiao [11] and Gelman et al. [12].

The Bayesian approach is straightforward and principled in that it focuses the analyst's attention on synthesizing existing information (the prior) and modelling current data (the likelihood). With the posterior distribution in hand, one addresses hypotheses about u directly by calculating their posterior probabilities given the data. The computation of Bayesian posterior probabilities, long an impediment to practice, has become much more feasible in recent years, and this has contributed to a re-examination and broader acceptance of Bayesian ideas.


BAYESIAN ESTIMATION OF COST-EFFECTIVENESS

We assume only that one has a means of simulating draws from the posterior distribution of $(e, g) = (e_1 - e_0, g_1 - g_0)$; call this distribution $f(e, g \mid \text{data})$. In the examples of the next section, we will take this to be a bivariate normal distribution with expectation vector $(\bar{E}_1 - \bar{E}_0, \bar{C}_1 - \bar{C}_0)$ and variance–covariance matrix given by the usual estimated variance–covariance matrix of the differences in sample means. This assumption holds approximately in many clinical trials. It is not generally necessary for the method, however, provided one can produce appropriate samples from the correct posterior, whatever it may be. The method merely involves taking a large number M of draws

$$(e^{(m)}, g^{(m)}), \quad m = 1, \ldots, M$$

from this distribution. For each sampled value, compute the corresponding ICER

$$r^{(m)} = g^{(m)} / e^{(m)}, \quad m = 1, \ldots, M.$$
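The simulation step can be sketched as below, assuming the bivariate normal posterior used in the examples. The function name `draw_differences` and all numerical inputs are hypothetical; they are not the values from any of the trials discussed in the paper. The draws are generated from two independent standard normals through a 2×2 Cholesky factor.

```python
import math
import random


def draw_differences(mean_e, mean_g, var_e, var_g, cov_eg, n_draws, seed=0):
    """Draw (e, g) pairs from a bivariate normal via a 2x2 Cholesky factor."""
    rng = random.Random(seed)
    sd_e = math.sqrt(var_e)
    a = cov_eg / sd_e              # loading of g on the first standard normal
    b = math.sqrt(var_g - a * a)   # remaining standard deviation of g
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((mean_e + sd_e * z1, mean_g + a * z1 + b * z2))
    return draws


# Hypothetical posterior summaries: means, variances and covariance of (e, g).
draws = draw_differences(mean_e=0.05, mean_g=500.0,
                         var_e=0.03 ** 2, var_g=300.0 ** 2,
                         cov_eg=0.3 * 0.03 * 300.0,
                         n_draws=20000, seed=42)

# One ICER per draw: r^(m) = g^(m) / e^(m).
ratios = [g / e for e, g in draws]
print(len(ratios))
```

Any other sampler could be substituted, since the later steps only need the list of simulated (e, g) pairs.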

With these M values as data, we then estimate a number of posterior quantities:

(1) The posterior probability that the new treatment increases both cost and effectiveness, i.e. the posterior probability that the differences lie in quadrant I. We estimate this by the fraction of parameter values m that satisfy both $g^{(m)} > 0$ and $e^{(m)} > 0$. This is a simulation approximation to

$$\int_0^{\infty} \int_0^{\infty} f(e, g \mid \text{data})\, dg\, de.$$

(2) The posterior probability that the new treatment decreases both cost and effectiveness, i.e. the posterior probability that the differences lie in quadrant III. We estimate this by the fraction of parameter values m that satisfy both $g^{(m)} < 0$ and $e^{(m)} < 0$. This is a simulation approximation to

$$\int_{-\infty}^{0} \int_{-\infty}^{0} f(e, g \mid \text{data})\, dg\, de.$$

(3) The posterior probability that the new treatment dominates on both cost and effectiveness, i.e. the posterior probability that the differences lie in quadrant IV. We estimate this by the fraction of parameter values m that satisfy both $g^{(m)} < 0$ and $e^{(m)} > 0$. This is a simulation approximation to

$$\int_0^{\infty} \int_{-\infty}^{0} f(e, g \mid \text{data})\, dg\, de.$$

(4) The posterior probability that the control dominates on both cost and effectiveness, i.e. the posterior probability that the differences lie in quadrant II. We estimate this by the fraction of parameter values m that satisfy both $g^{(m)} > 0$ and $e^{(m)} < 0$. This is a simulation approximation to

$$\int_{-\infty}^{0} \int_0^{\infty} f(e, g \mid \text{data})\, dg\, de.$$
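Calculations (1) to (4) amount to tabulating the simulated pairs by quadrant. A minimal sketch follows; the function name `quadrant_probabilities` is invented, and the ten hand-written draws are hypothetical stand-ins for a large posterior sample.

```python
def quadrant_probabilities(draws):
    """Fraction of simulated (e, g) pairs falling in each quadrant."""
    counts = {"I": 0, "II": 0, "III": 0, "IV": 0}
    for e, g in draws:
        if e > 0 and g > 0:
            counts["I"] += 1      # more effective, more costly
        elif e < 0 and g > 0:
            counts["II"] += 1     # control dominates
        elif e < 0 and g < 0:
            counts["III"] += 1    # less effective, less costly
        elif e > 0 and g < 0:
            counts["IV"] += 1     # new treatment dominates
    return {q: n / len(draws) for q, n in counts.items()}


# Ten hypothetical draws (e, g), standing in for a large posterior sample.
draws = [(0.2, 50.0), (0.1, 80.0), (0.3, 20.0), (0.4, 10.0), (0.1, 5.0),
         (0.2, 30.0), (-0.1, 40.0), (-0.2, -15.0), (0.3, -25.0), (0.1, -5.0)]
probs = quadrant_probabilities(draws)
print(probs)
```

With continuous posterior draws a value exactly on an axis has probability zero, so the sketch simply leaves such a draw uncounted.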

A new treatment is worthy of adoption if it either falls anywhere in quadrant IV, or in quadrant I with an ICER that is less than some agreed-upon threshold value k. To assess the evidence for this hypothesis, we make the following calculation:

(5) The posterior probability that the new therapy is either dominant (lies in quadrant IV) or is more costly and more effective with an ICER less than k. We estimate this probability by the fraction of parameter values m that satisfy either

(a) $g^{(m)} > 0$ and $e^{(m)} > 0$ and $r^{(m)} < k$, or

(b) $g^{(m)} < 0$ and $e^{(m)} > 0$.

This is a simulation approximation to

$$\int_0^{\infty} \int_{-\infty}^{0} f(e, g \mid \text{data})\, dg\, de + \int_0^{\infty} \int_0^{ke} f(e, g \mid \text{data})\, dg\, de.$$
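Calculation (5) can be sketched in the same way. The function name `prob_acceptable`, the hypothetical draws and the threshold values are assumptions made only for this example.

```python
def prob_acceptable(draws, k):
    """Fraction of (e, g) draws that are dominant (quadrant IV) or lie in
    quadrant I with ICER g / e below the threshold k."""
    hits = 0
    for e, g in draws:
        in_quadrant_iv = e > 0 and g < 0
        in_quadrant_i_below_k = e > 0 and g > 0 and g / e < k
        if in_quadrant_iv or in_quadrant_i_below_k:
            hits += 1
    return hits / len(draws)


# Ten hypothetical draws (e, g), standing in for a large posterior sample.
draws = [(0.2, 50.0), (0.1, 80.0), (0.3, 20.0), (0.4, 10.0), (0.1, 5.0),
         (0.2, 30.0), (-0.1, 40.0), (-0.2, -15.0), (0.3, -25.0), (0.1, -5.0)]

for k in (10.0, 200.0, 1000.0):
    print(k, prob_acceptable(draws, k))
```

As k grows the probability rises from the quadrant IV probability towards the sum of the quadrant I and quadrant IV probabilities.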

From the four quadrant probabilities (which must sum to 1 to within rounding error), one can get a clear idea of the strength of the evidence for each of the major hypotheses about the effect of the new treatment on cost and effectiveness. If the new treatment either dominates the control (quadrant IV) or is dominated by it (quadrant II), further estimation of r is pointless [13,14]. If substantial probability lies in quadrant I or III, it may be desirable to have an interval estimate of r given that it lies in that quadrant. The following calculation gives an interval estimate for the ICER under the assumption that it increases both health and cost (quadrant I):

(6) Among the parameter values that lie in quadrant I, find the $100a/2$ and $100(1 - a/2)$ centiles of r. This is a $100(1 - a)\%$ probability interval for r given that the new treatment is cost- and health-increasing. The limits are the solutions $r_L$ and $r_U$ of

$$\frac{\int_0^{\infty} \int_0^{r_L e} f(e, g \mid \text{data})\, dg\, de}{\int_0^{\infty} \int_0^{\infty} f(e, g \mid \text{data})\, dg\, de} = a/2$$

and

$$\frac{\int_0^{\infty} \int_{r_U e}^{\infty} f(e, g \mid \text{data})\, dg\, de}{\int_0^{\infty} \int_0^{\infty} f(e, g \mid \text{data})\, dg\, de} = a/2.$$

Figure 3. Contours of the bivariate normal approximation to the posterior distribution of the cost and effectiveness differences in the telephone call trial example. The contours represent the 5th, 25th, 50th, 75th and 95th centiles of the distribution.

One can define an analogous interval for r given that the new treatment decreases both cost and effectiveness (quadrant III). Because ratios from quadrants I and III follow opposite orderings and generally are not comparable, we believe it is inappropriate to combine ratios from the two quadrants in a single interval.
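The quadrant-specific interval of calculation (6), and its quadrant III analogue, can be sketched as empirical centiles of the ratios within one quadrant. The function name `quadrant_interval` is invented, the draws are hypothetical, and the linear-interpolation rule for centiles is one convention among several; the paper does not say which rule its own code used.

```python
def quadrant_interval(draws, alpha=0.05, quadrant="I"):
    """Probability interval for r = g / e among draws in quadrant I or III."""
    if quadrant == "I":
        ratios = [g / e for e, g in draws if e > 0 and g > 0]
    elif quadrant == "III":
        ratios = [g / e for e, g in draws if e < 0 and g < 0]
    else:
        raise ValueError("quadrant must be 'I' or 'III'")
    if not ratios:
        raise ValueError("no draws fall in the requested quadrant")
    ratios.sort()

    def centile(p):
        # Linear interpolation between neighbouring order statistics.
        pos = p * (len(ratios) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ratios) - 1)
        return ratios[lo] + (pos - lo) * (ratios[hi] - ratios[lo])

    return centile(alpha / 2), centile(1 - alpha / 2)


# Hypothetical draws: ratios 1..100 in quadrant I and 1..50 in quadrant III.
draws = [(1.0, float(r)) for r in range(1, 101)]
draws += [(-1.0, -float(r)) for r in range(1, 51)]
draws += [(-1.0, 5.0), (1.0, -5.0)]   # quadrants II and IV are ignored here

print(quadrant_interval(draws, alpha=0.10, quadrant="I"))
print(quadrant_interval(draws, alpha=0.10, quadrant="III"))
```

The two intervals are computed separately, in keeping with the point that ratios from quadrants I and III should not be pooled.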

We have written a set of functions in the S-Plus statistical language (MathSoft, Inc., Seattle, WA) for simulating parameters from a multivariate normal posterior distribution and estimating posterior probabilities and probability intervals from simulated parameters. Code is available from the first author.

EXAMPLES

Calls example

Chaudhary and Stearns [2] presented summary data from an evaluation of the effect of an informational telephone call on well-child screening programme participation. Effectiveness was a binary variable taking value 1 for households where a child made use of the screening programme and 0 otherwise. Cost was measured as the variable cost of the telephone call. Using the data in Table I of their paper and assuming a flat prior, we calculated the posterior mean and variance of (e, g) and generated 250,000 sets of normal random vectors from this distribution. We then applied the calculations described above to the simulated parameter values.


Figure 3 shows the contours of the bivariate normal probability distribution of (e, g). There is evidence of a moderate increase in the utilization rate, and the cost of calling the households is clearly positive. From the simulations, the probability that the telephone call increases both cost and effectiveness is 99.95%, the probability that no call dominates a call is 0.05%, and the probability of the other two quadrants is 0% (as it must be, because the cost of no call was defined to be $0). We calculated the probability that the call either dominates or is cost- and effect-increasing, with r less than $90 per child screened, to be 94.5%. A 95% probability interval for r, given that the telephone call is both cost- and effectiveness-increasing, spans the range $29.4–$113.4 per child screened.

In their confidence interval calculation, Chaudhary and Stearns included bootstrap replicates that gave negative r values. These must represent values in quadrant II, which are the unlikely but possible parameters where the effect of the telephone call on utilization is negative. Because such points are more properly grouped with high positive values of r, allowing them to affect the calculation may artificially reduce both the upper and lower confidence limits. Their normal bootstrap limits did include negative values.

Figure 4. Contours of the bivariate normal approximation to the posterior distribution of the cost and effectiveness differences in the drug trial example. The contours represent the 5th, 25th, 50th, 75th and 95th centiles of the distribution.

Drug trial example

Briggs et al. [1] presented data from an evaluation of the effect of drug therapy on cost and effectiveness. Figure 4 shows contours of the bivariate normal posterior distribution of the mean cost and treatment difference. The data show strong evidence of a difference on effectiveness but weaker evidence of an effect on cost. We estimated the posterior probability that the new treatment increases both cost and effectiveness to be 96.8%, the probability that it dominates control to be 3.2%, and the probability of the other outcomes to be non-zero but negligible. The posterior probability that the new treatment either dominates control or increases cost and effect, with r less than $200 per unit change in effectiveness, is 75.7%. A 95% probability interval for r, given that the new treatment is in quadrant I, spans the range $19.9–$380.5 per unit. This is close to most of the intervals constructed by Briggs et al.; the exception is their normal bootstrap interval, which has a negative lower limit. Although we may like to think of these negative values as representing dominance of the treatment over the control, nothing in the method implies this interpretation.

Figure 5. Contours of the bivariate normal approximation to the posterior distribution of the cost and effectiveness differences in the sepsis trial example. The contours represent the 5th, 25th, 50th, 75th and 95th centiles of the distribution.

Sepsis example

Laska et al. [3] analysed a trial comparing an IL1 receptor antagonist (IL1ra) with a placebo in the treatment of sepsis, reported by Fisher et al. [15], Gordon et al. [16] and van Hout et al. [17]. We follow them in assuming a common correlation of treatment and effect. In this example, cost (in Dfl) is based on the number and type of hospital days, and effectiveness is an indicator of survival to day 28.

Figure 5 shows the contours of the bivariate normal posterior density of the treatment differences. The effect on survival is significant but not overwhelming, there is little effect on cost, and the two effects are moderately correlated. Moreover, the 95% probability contour passes through all four quadrants, implying that the data do not rule out even the possibility that placebo dominates IL1ra. From the simulations, the probability that IL1ra increases both cost and survival is 59.4%, the probability that placebo dominates IL1ra is 0.3%, the probability that IL1ra decreases both cost and survival is 0.9%, and the probability that IL1ra dominates placebo is 39.5%. The probability that IL1ra either dominates placebo or is cost- and effect-increasing, with r less than Dfl 35,000 per life saved, is 91.8%.

Figure 6 presents histograms of the simulated $r^{(m)}$ values for quadrants I and III. A 95% probability interval for r, given that IL1ra is cost- and survival-increasing, spans the range Dfl 791–Dfl 63,400 per life saved. A 95% probability interval for r, given that IL1ra is cost- and survival-decreasing, spans the range Dfl 8,400–Dfl 4,580,000 saved per life sacrificed.

Figure 6. Histograms of sampled values of r in quadrant I (6a) and quadrant III (6b) from the sepsis example. Histograms are on the log scale to reduce skewness. Labelled points are the 2.5th, 50th and 97.5th centiles.

Laska et al. computed three confidence intervals for r from these data: a two-sided Fieller interval extended from −Dfl 127,057 to Dfl 59,338 per life saved; a one-sided Fieller interval extended from −∞ to Dfl 268,027 per life saved; and a two-sided Bonferroni interval extended from −Dfl 3.39×10^6 to Dfl 4.05×10^6 per life saved. Again, there is no implication that the negative lower limits represent r values in a particular quadrant, and it is uncertain whether positive values represent ratios in quadrant I or III.

DISCUSSION

The ICER is an ill-defined parameter because it identifies pairs of cost and effectiveness differences that can have altogether different interpretations. Moreover, because it is discontinuous as a function of the effectiveness difference, points that are close in the cost-effectiveness plane may be widely separated on the ICER scale. Estimation procedures that ignore these problems will give misleading results.

We have proposed a Bayesian approach to ICER estimation. Within this paradigm one can construct an interval estimate for the ICER given that it lies in a particular quadrant, thus avoiding the problems of mixing ratios that lie on opposite sides of the vertical axis. One can also readily compute posterior probabilities for each quadrant and for the ICER being in some range of favourable values. By simulating $e_i$ and $g_i$ values, i = 0, 1, one can also compute posterior distributions of treatment cost-effectiveness ratios [3] $r_i = g_i / e_i$.

Comparison to the acceptability curve

Our method generalizes the C/E-acceptability curve of van Hout et al. [17]. In our terminology, they assume a flat prior, calculate the integral

$$F(k) = \int_{-\infty}^{\infty} \int_{-\infty}^{ke} f(e, g \mid \text{data})\, dg\, de,$$

and plot F(k) versus $k \in (0, \infty)$. We believe that our approach is more sensible for two reasons. First, the acceptability curve assumes that points in quadrant III whose ratios exceed k have the same utility as points in quadrant I whose ratios are less than k. As indicated above, because the ordering of ratios is opposite in quadrants I and III, there can be at most one value k where ratios from the opposite quadrants have the same utility. Therefore, one should not interpret equal ratios from the two quadrants as having equal utilities, which is the premise underlying the acceptability curve. Second, van Hout et al. derived their curve under a frequentist argument, but it is not clear what its properties are or how to interpret it.
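The integral F(k) is the posterior probability that g < ke over the whole plane, so with simulated draws it is the fraction of pairs below the line g = ke. The sketch below computes it next to the quadrant-restricted probability that counts only quadrant IV and quadrant I with ratio below k. Both function names and the six draws are hypothetical, chosen to show where the two quantities differ.

```python
def acceptability(draws, k):
    """F(k): fraction of (e, g) draws with g < k * e, over the whole plane."""
    return sum(1 for e, g in draws if g < k * e) / len(draws)


def prob_dominant_or_below_k(draws, k):
    """Fraction of draws in quadrant IV, or in quadrant I with g / e < k."""
    return sum(
        1 for e, g in draws
        if e > 0 and (g < 0 or (g > 0 and g / e < k))
    ) / len(draws)


# Six hypothetical draws (e, g).
draws = [(0.4, 10.0),    # quadrant I, ratio 25
         (0.1, 80.0),    # quadrant I, ratio 800
         (-0.2, -15.0),  # quadrant III, ratio 75
         (-0.2, -5.0),   # quadrant III, ratio 25
         (0.3, -25.0),   # quadrant IV
         (-0.1, 40.0)]   # quadrant II

k = 50.0
print(acceptability(draws, k), prob_dominant_or_below_k(draws, k))
```

The gap between the two numbers comes from the quadrant III draw with ratio 75, which F(k) counts as acceptable at k = 50 and the quadrant-restricted probability does not.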

Prior and likelihood approximations

Bayesian methods require specification of a prior distribution for model parameters. Unfortunately, in many situations either there are no prior studies available, or only studies of uncertain relevance. It can be difficult to know how to properly weight such information. A common solution is to use flat prior distributions that embody an a priori belief that all combinations of e and g are equally likely. Unless the prior is highly concentrated, its effect will be small in moderate to large samples. Because randomized trials are designed to be conclusive, they typically dwarf their pilot studies, suggesting that analyses with flat priors will often be satisfactory.

In the examples, we simulated from a bivariate normal distribution centered at the vector of observed effects and with dispersion matrix equal to the estimated variance matrix of the effect estimates. This is equivalent to taking a flat prior and assuming that the log-likelihood function is quadratic, which is true approximately provided the sample size is large. We adopted these assumptions merely to simplify these illustrative examples; we do not mean to imply that this is ideal practice. With modern simulation and integration methods [12], it has become relatively easy to accommodate whatever prior and likelihood are appropriate for the study at hand.

Bayesian and frequentist interval estimation

A Bayesian 95% probability interval has the interpretation that one can attach 95% posterior probability to the hypothesis that the parameter lies within the interval. By contrast, a frequentist 95% confidence interval is a single realization of a random interval that we know will cover the true value of the parameter, whatever that value may be, in 95% of a hypothetical series of replications. This interval may or may not cover the truth; we never really know. But if long-run performance matters, we can be sure that 95% of the time this procedure will work, whatever the state of nature.

There is some confusion in the health economics literature [4] about what exactly constitutes a confidence interval, and many scientists mistakenly assign Bayesian interpretations to frequentist intervals. In many situations the two modes lead to nearly identical intervals, but unfortunately ICER estimation is not such a situation. In our experience, when the two approaches do not agree, only the Bayesian method gives sensible answers.

ICER and net health benefit

The problems with ICER have led others to suggest replacing it with the incremental net health benefit (INHB) [14]. The INHB is the health benefit of a new therapy minus the health foregone had the extra cost been spent purchasing health care at the threshold cost-effectiveness rate (conventionally $50,000 per life year in the US). INHB is a monotone function of effectiveness and cost, has no discontinuities, is simple to estimate, and is readily interpretable. Yet as different as they may at first appear, INHB and ICER are closely linked. For example, the INHB is positive only if the ICER is less than the threshold rate (in quadrant I) or greater than the threshold rate (in quadrant III). Moreover, estimation of a range of rates where the INHB is near zero is equivalent to interval estimation of the ICER [9]. Thus INHB and ICER are only superficially different, and one may profitably estimate either one depending on a study’s specific objectives. With the availability of flexible Bayesian methods, it should not be necessary to abandon the ICER.
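The link between the two quantities can be checked numerically. Writing the INHB as e - g/threshold, which is one way to express the verbal definition above, the INHB is positive exactly when g < threshold × e. In quadrant I this means the ICER g/e is below the threshold; in quadrant III, where dividing by a negative e reverses the inequality, it means the ICER is above the threshold; in quadrant IV it always holds, and in quadrant II it never does. The sketch below uses hypothetical posterior summaries and the conventional $50,000 threshold; none of the numbers come from the trials analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior for (e, g): effect in life years, cost in dollars.
effect_hat = np.array([0.02, 500.0])
cov_hat = np.array([[0.05**2, 0.0],
                    [0.0, 1500.0**2]])
draws = rng.multivariate_normal(effect_hat, cov_hat, size=200_000)
e, g = draws[:, 0], draws[:, 1]

threshold = 50_000.0      # dollars per life year
inhb = e - g / threshold  # incremental net health benefit, in life years
ratio = g / e             # ICER

q1 = (e > 0) & (g > 0)
q3 = (e < 0) & (g < 0)
q4 = (e > 0) & (g < 0)

# ICER-based description of the region where the new treatment is favourable.
favourable = q4 | (q1 & (ratio < threshold)) | (q3 & (ratio > threshold))

p_inhb_positive = float((inhb > 0).mean())
p_favourable = float(favourable.mean())
print(p_inhb_positive, p_favourable)
```

The two probabilities should agree up to floating-point rounding, since they describe the same region of the (e, g) plane in two ways.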

Conclusion

In estimating cost-effectiveness it is necessary to think carefully about the parameter space and the ranking of parameter values. Such considerations bring the differences between Bayesian and frequentist methods into stark relief, and in our opinion reveal the fundamental advantages of the Bayesian approach.

REFERENCES

1. Briggs, A.H., Wonderling, D.E. and Mooney, C.Z. Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Economics 1997; 6: 327–340.

2. Chaudhary, M.A. and Stearns, S.C. Estimating confidence intervals for cost-effectiveness ratios: an example from a randomized trial. Statistics in Medicine 1996; 15: 1447–1458.

3. Laska, E.M., Meisner, M. and Siegel, C. Statistical inference for cost-effectiveness ratios. Health Economics 1997; 6: 229–242.

4. Manning, W.G., Fryback, D.G. and Weinstein, M.C. Reflecting uncertainty in cost-effectiveness analysis. In: Gold, M.R., Siegel, J.E., Russell, L.B. and Weinstein, M.C. (eds.) Cost-effectiveness in health and medicine. Oxford: Oxford University Press, 1996.

5. Obenchain, R.L. Issues and algorithms in cost-effectiveness inference. Biopharmaceutical Report 1997; 5: 1–7.

6. O’Brien, B.J., Drummond, M.F., Labelle, R.J. and Willan, A. In search of power and significance: issues in the design and analysis of stochastic cost-effectiveness studies in health care. Medical Care 1994; 32: 150–163.

7. Polsky, D., Glick, H.A., Willke, R. and Schulman, K. Confidence intervals for cost-effectiveness ratios: a comparison of four methods. Health Economics 1997; 6: 243–252.

8. Heitjan, D.F., Moskowitz, A.J. and Whang, W. Problems with interval estimates of the incremental cost-effectiveness ratio. Medical Decision Making 1999; in press.

9. Heitjan, D.F. Fieller’s method and net health benefits. Submitted for publication.

10. Berry, D.A. Statistics: a Bayesian perspective. Belmont, CA: Duxbury Press, 1996.

11. Box, G.E.P. and Tiao, G.C. Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley, 1973.

12. Gelman, A., Carlin, J., Stern, H. and Rubin, D.B. Bayesian data analysis. London: Chapman & Hall, 1995.

13. Stinnett, A.A. and Mullahy, J. The negative side of cost-effectiveness analysis. Journal of the American Medical Association 1997; 277: 1931–1932.

14. Stinnett, A.A. and Mullahy, J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Medical Decision Making 1998; 18: S68–S80.

15. Fisher, C.J., Slotman, G.J., Opal, S.M., et al. Initial evaluation of human recombinant interleukin-1 receptor antagonist in the treatment of sepsis syndrome: a randomized, open-label, placebo-controlled multicenter trial. Critical Care Medicine 1994; 22: 12–21.

16. Gordon, G.S., Fisher, C.J., Slotman, G.J., et al. Cost-effectiveness of treatment with interleukin-1 receptor antagonist (IL-1ra) in patients with sepsis syndrome. Clinical Research 1992; 40: 254A.

17. van Hout, B.A., Al, M.J., Gordon, G.S. and Rutten, F.F.H. Costs, effects and C/E-ratios alongside a clinical trial. Health Economics 1994; 3: 309–319.
