
Potential surrogate endpoints in cancer research – some considerations and examples


S. W. Duffy^a and F. P. Treasure^b,*

We present an introductory survey of the use of surrogates in cancer research, in particular in clinical trials. The concept of a surrogate endpoint is introduced and contrasted with that of a biomarker. It is emphasized that a surrogate endpoint is not universal for an indication but will depend on the mechanism of treatment. We discuss the measures of validity of a surrogate and give examples of both cancer surrogates and biomarkers on the path to surrogacy. Circumstances in which a surrogate endpoint may actually be preferred to the clinical endpoint are described. We provide pointers to the recent substantive literature on surrogates. Copyright © 2009 John Wiley & Sons, Ltd.

Keywords: surrogacy; surrogate endpoints; biomarkers; cancer; clinical trials

1. INTRODUCTION

Chambers Dictionary (1998) ‘surrogate: a person or thing standing, e.g. in a dream, for another person or thing’.

The difference between a surrogate and true endpoint is like the difference between a cheque and cash. You can often get the cheque earlier but then, of course, it may bounce. Senn [1], p. 125.

1.1. What is a surrogate?

A ‘surrogate’ endpoint, in the cancer clinical trial context, is an endpoint which is used instead of the primary clinical endpoint. The substitution is usually made because the surrogate endpoint can be obtained earlier or more cheaply or with less variability than the clinical endpoint. A surrogate endpoint could be another relevant clinical endpoint but is often an objective biological or biochemical assessment (see the discussion of ‘biomarkers’ below).

Of course it is necessary to show that the surrogate endpoint is an appropriate substitute for the primary clinical endpoint. There should be confidence that conclusions about efficacy drawn from the surrogate endpoint match the conclusions that would have been drawn from the clinical endpoint.

Surrogates are also commonly used in cancer epidemiology and cancer diagnosis (especially screening). Although this paper is mainly about pharmaceutical clinical trials we shall draw on such other contexts. A recent discussion of surrogacy in cancer screening is given in Cuzick et al. [2].

1.2. Where do surrogate endpoints come from?

Some surrogate endpoints command universal acceptance – so much so that it is easy to forget that a surrogate is being used. Lowering blood pressure has become an end in itself, when, strictly, high blood pressure is merely a surrogate for cardiovascular disease. (Much safety data – vital signs, laboratory measurements – is essentially surrogate in nature.)

If a surrogate endpoint is new or not universally accepted then its appropriateness needs to be established. This is not so easy, partly because it is not always clear what makes a good surrogate and how to demonstrate it. There can be logistical difficulties in that it may delay a clinical development programme unduly to study whether serum concentration of biomarker X is an appropriate surrogate for, say, survival.

We will use the term ‘biomarker’ as a weaker form of ‘surrogate’, though of course biomarkers are of considerable interest in their own right. A biomarker has been defined as ‘A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention’ (National Institutes of Health, [3]).

Biomarkers may arise as a by-product of clinical trials, but more usually emerge from uncontrolled case series, or epidemiological or laboratory studies. A biomarker will only achieve the status of a surrogate endpoint if it can be demonstrated that it is a suitable substitute for the clinical endpoint. A minimal requirement is that the biomarker be assessed after the start of therapy.

For an overview of biomarkers in clinical drug development, see [4].

1.3. Choice of surrogate endpoint depends on treatment

There is a strong and understandable temptation to think of a disease as having a standard surrogate endpoint attached to it.


Received 3 July 2008, Revised 21 August 2009, Accepted 9 September 2009. Published online 22 December 2009 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/pst.406

Pharmaceut. Statist. 2011, 10 34–39. Copyright © 2009 John Wiley & Sons, Ltd.

^a Cancer Research UK Centre for EMS, Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK

^b Department of Public Health and Primary Care, Eastern Cancer Registration and Information Centre, University of Cambridge, Cambridge, UK

* Correspondence to: F. P. Treasure, Department of Public Health and Primary Care, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 2SR, UK. E-mail: [email protected]


If this were the case, life would be very easy – pharmaceutical companies and regulators would agree on the official surrogate endpoint for disease Y and clinical trial design would be simplified. In particular, the requirement for development or validation would be much reduced once a surrogate had achieved the status of a standard.

Unfortunately, this may not be the case. As we discuss below, in general the choice of surrogate endpoint depends on the treatment under evaluation. (As an extreme example, reduction in the size of the primary tumour may be a reasonable surrogate endpoint for a trial of neoadjuvant cytotoxic chemotherapy but it would not be for a trial of surgery.)

There is still therefore some onus on the pharmaceutical sponsor to demonstrate that the surrogate endpoint is appropriate both to the disease and the clinical endpoint. (And there are potential difficulties – a study may compare two active treatments each of which has its own preferred surrogate endpoint.)

1.4. A surrogate endpoint is usually more precise than and sometimes may be preferred to the clinical endpoint

Another understandable temptation is to design a clinical trial (or even a whole clinical programme) to report in two stages: firstly to report the results based on the surrogate endpoint (and make decisions about the rest of the programme, or even submit for registration); secondly, some time later, to report the results based on the clinical endpoint.

There is of course the risk that efficacy may be demonstrated for the surrogate endpoint but not for the clinical endpoint. There are potential difficulties of interpretation here. For example, Day and Duffy [5] described how use of the size, lymph node status and grade of tumours diagnosed to estimate predicted mortality in a trial of different breast screening frequencies would require less than half the study size of a trial based on observed mortality. This is because the former endpoint is based on tumour attributes in all cancers diagnosed whether or not they result in deaths. The observed mortality result has variance based only on the deaths. This means that a trial powered for the surrogate endpoint will have a very imprecise estimate of the effect on the final clinical endpoint.

In some clinical circumstances the surrogate endpoint might be argued to be preferable in principle to the clinical endpoint. In cervical screening, the object is to diagnose premalignancies and treat them – avoiding the occurrence of cervical cancer and therefore death from the disease. Use of incidence of cervical cancer might be regarded as a surrogate, but because invasive cervical cancer has poor survival, and because there is a clear benefit in avoiding a diagnosis of cancer in addition to preventing death from the disease, the surrogate endpoint (incidence of cervical cancer) might be argued to be actually more clinically informative than the clinical endpoint (death from cervical cancer).

1.5. Example: Tumour response and progression-free survival

Tumour response (complete or partial), duration of response and progression-free survival time are traditional surrogate endpoints for overall survival in phase II oncology studies, where a rapid indication of efficacy is needed.

To some extent their validity is based on clinical plausibility. A therapy may therefore be designed to interfere with tumour progression. Any particular mechanism of action will generate potential new surrogates, specific to that mechanism. (See [6] for a general discussion orientated towards breast cancer.)

For example, angiogenesis is a pre-requisite of tumour growth and metastasis. Vascular cell adhesion molecule 1 (VCAM-1) would be expected to be over-expressed in the presence of tumour angiogenesis and Byrne and Bundred [7] demonstrated a strong correlation between serum VCAM-1 and intratumoural microvessel density in early breast cancer. They concluded that ‘Serum VCAM-1 is a surrogate marker of angiogenesis in breast cancer and its measurement may help in the assessment of antiangiogenic drugs currently in phase II trials’.

Cautious general accounts of the use of tumour response and progression as surrogates can be found in [8] and chapter 8.2 of [9]. Rustin et al. [10] contrast the usefulness of these endpoints with that of a serum marker (CA-125) in ovarian cancer.

2. ASSESSMENT OF A POTENTIAL SURROGATE ENDPOINT

Various criteria have been proposed for assessing whether a response variable is suitable to use as a surrogate endpoint. We present some general remarks below; for an overview of methodology see the book by Burzykowski et al. [11].

2.1. The Prentice criterion (PC)

Prentice’s [12] criterion essentially states that the distribution of the clinical endpoint conditional on the surrogate endpoint does not depend on the treatment group. This implies that using the surrogate endpoint instead of the clinical endpoint does not ‘throw away’ any relevant information about the response to treatment.

This could be demonstrated by regressing the clinical endpoint onto the surrogate endpoint. Adding a variable for treatment to the regression should not substantially improve the fit.
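As a minimal sketch of this regression check, the simulation below (our own illustration, not taken from Prentice [12]) computes the coefficient that treatment would receive if it were added to a regression of the clinical endpoint on the surrogate. The function names, effect sizes and simulated data are all assumptions made for illustration. A continuous endpoint and ordinary least squares are assumed; a real analysis would use a model suited to the endpoint, such as proportional hazards.

```python
import random
from statistics import fmean


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after regressing it on x."""
    a, b = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def adjusted_treatment_coef(treatment, surrogate, clinical):
    """Coefficient of treatment in the model clinical ~ surrogate + treatment.

    Uses the Frisch-Waugh-Lovell result: regress the part of the clinical
    endpoint not explained by the surrogate on the part of treatment not
    explained by the surrogate.
    """
    r_clinical = residuals(surrogate, clinical)
    r_treatment = residuals(surrogate, treatment)
    _, coef = simple_ols(r_treatment, r_clinical)
    return coef


def simulate(direct_effect, n=2000, seed=0):
    """Hypothetical trial: treatment shifts the surrogate by 1 unit; the
    clinical endpoint depends on the surrogate and, optionally, directly
    on treatment (direct_effect)."""
    rng = random.Random(seed)
    z = [float(i % 2) for i in range(n)]  # alternating allocation, balanced arms
    s = [zi + rng.gauss(0, 1) for zi in z]
    t = [2.0 * si + direct_effect * zi + rng.gauss(0, 1) for si, zi in zip(s, z)]
    return z, s, t


# Treatment acts only through the surrogate: adjusted coefficient near zero.
coef_via_surrogate = adjusted_treatment_coef(*simulate(direct_effect=0.0))
# Treatment also acts by a route that bypasses the surrogate: coefficient stays large.
coef_bypassing = adjusted_treatment_coef(*simulate(direct_effect=1.5))
print(round(coef_via_surrogate, 2), round(coef_bypassing, 2))
```

In the first scenario the surrogate carries all of the treatment information, as the PC requires. In the second it does not, and the residual treatment coefficient reveals this.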

Technically, the PC seems more necessary than sufficient in that it is possible to construct examples which satisfy the criterion but would not be good surrogates [13].

Further, Baker and Kramer [14] show how even with perfect correlation within treatment groups, the surrogate endpoint can give the opposite result to the clinical endpoint when the PC is not satisfied. Hence, a strong empirical correlation between a potential surrogate and the clinical endpoint is not sufficient in itself to establish surrogacy.
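A toy numerical illustration of this point follows, using invented numbers that are not taken from Baker and Kramer [14]. Within each arm the clinical endpoint is an exact linear function of the surrogate (correlation 1), but the intercept differs between arms, so the PC fails and the two endpoints point in opposite directions. Higher values are assumed to be better for both endpoints.

```python
from statistics import fmean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: clinical = surrogate + 4 in the control arm,
# clinical = surrogate + 0 in the treated arm.
control_surrogate = [1.0, 2.0, 3.0, 4.0]
control_clinical = [s + 4.0 for s in control_surrogate]
treated_surrogate = [3.0, 4.0, 5.0, 6.0]
treated_clinical = [s + 0.0 for s in treated_surrogate]

r_control = pearson(control_surrogate, control_clinical)
r_treated = pearson(treated_surrogate, treated_clinical)

# Difference in arm means (treated minus control) for each endpoint.
effect_on_surrogate = fmean(treated_surrogate) - fmean(control_surrogate)
effect_on_clinical = fmean(treated_clinical) - fmean(control_clinical)

print(r_control, r_treated, effect_on_surrogate, effect_on_clinical)
```

The surrogate improves under treatment while the clinical endpoint worsens, even though the two are perfectly correlated within each arm.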

2.2. Informal model of surrogacy

The clinical endpoint can be thought of informally as modelled by a linear combination of variables plus noise: $\sum \beta_i X_i + \varepsilon_i$. The $X_i$ may be a mixture of variables known at baseline and variables which are intermediate in the causal chain.

If the causal link between treatment and endpoint has been captured then adding treatment to the predictor would not improve the prediction.

Some of the $X_i$ will depend on the treatment. A good surrogate endpoint will involve those $X_i$ which depend on the treatment. But as different treatments will affect different $X_i$, we cannot expect to use the same surrogate endpoint for different classes of treatment. In this context, hormonal therapy, cytotoxic chemotherapy and surgery may well affect survival after a diagnosis of cancer through different causal pathways and, hence, different $X_i$.

Furthermore, the presence of the random noise $\varepsilon_i$ prevents a good surrogate from being a perfect surrogate. There is for example a substantial random component to survival and so it would be unrealistic to expect a surrogate to predict survival exactly. (There may also, of course, be a random component – such as measurement error or intra-individual variation – to the surrogate.)

Molenberghs et al. (Chapter 5 of [11]) present formally the relationship between treatment, surrogate endpoint and clinical endpoint within a regression framework.

2.3. Buyse and Molenberghs

Buyse and Molenberghs [15] recommend that $R^2$ for the between-trial association between the estimate of the treatment effect in terms of the surrogate endpoint and the estimate of the treatment effect in terms of the clinical endpoint should be close to 1. (Clearly, this would only be known after several trials had completed with both endpoints.)
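A simplified sketch of such a between-trial $R^2$ is given below. It is an unweighted calculation on per-trial point estimates and ignores the estimation error in each trial's estimates, which a full analysis would need to handle, so it should not be read as the procedure of [15]. The five pairs of treatment-effect estimates (for instance log hazard ratios) are hypothetical.

```python
from statistics import fmean


def r_squared(x, y):
    """R^2 of the simple linear regression of y on x (squared Pearson correlation)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical per-trial treatment-effect estimates, one pair per completed trial.
effect_on_surrogate = [-0.40, -0.25, -0.10, -0.30, -0.05]
effect_on_clinical = [-0.30, -0.20, -0.05, -0.28, 0.00]

trial_level_r2 = r_squared(effect_on_surrogate, effect_on_clinical)
print(round(trial_level_r2, 2))
```

A value close to 1 would indicate that trials showing a larger effect on the surrogate also tend to show a larger effect on the clinical endpoint.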

2.4. Freedman’s measure in survival analysis

Freedman et al. [16] proposed a measure of the extent to which the PC is satisfied:

$$P = 1 - \frac{\beta_A}{\beta_U}$$

where $\beta_U$ is the unadjusted regression coefficient of treatment in a model predicting the clinical endpoint and $\beta_A$ the corresponding regression coefficient adjusted for the surrogate endpoint. (For example, the $\beta$s could be the coefficients from a proportional hazards or logistic regression model.) A value of $P$ near unity would imply that the PC is being more-or-less satisfied.

The act of detecting a breast cancer by screening can be thought of as a treatment for that cancer. It is well known [17] that tumour size, node status and grade (SNG) are predictive of survival. The efficacy of screening might therefore be assessed by comparing the SNG of tumours in a group randomized to screening with cancers detected in the control group.

Freedman’s measure can be readily calculated. For example, the relative hazards for survival after a cancer detected at first screen against survival after a cancer diagnosed outside a screening programme were 0.29 and 0.57 for unadjusted and adjusted for SNG, respectively [18]. The corresponding regression coefficients are (taking natural logarithms) −1.237 and −0.562 for $\beta_U$ and $\beta_A$, respectively. Freedman’s measure is therefore 55%. Within a screening programme, where the ‘treatments’ are (a) detection by screening (second and subsequent) and (b) detection clinically between screens, Freedman’s measure increases to 86% – suggesting (as far as we can rely on point estimates) that SNG is a better surrogate endpoint within a population that has already been screened once.
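The arithmetic behind the 55% figure can be reproduced with a few lines of code. This is a sketch only: the function name is our own, and it works from point estimates of the two relative hazards without any allowance for their uncertainty.

```python
import math


def freedman_proportion(hazard_ratio_unadjusted, hazard_ratio_adjusted):
    """Freedman's measure P = 1 - beta_A / beta_U, where the betas are the
    natural logarithms of the adjusted and unadjusted hazard ratios."""
    beta_u = math.log(hazard_ratio_unadjusted)
    beta_a = math.log(hazard_ratio_adjusted)
    if beta_u == 0:
        raise ValueError("P is undefined when the unadjusted effect is zero")
    return 1 - beta_a / beta_u


# Relative hazards quoted in the text from [18]: 0.29 unadjusted, 0.57 adjusted for SNG.
p = freedman_proportion(0.29, 0.57)
print(f"{p:.0%}")  # prints 55%
```

The guard against a zero unadjusted coefficient is a reminder that the measure becomes unstable when the overall treatment effect is small.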

There are difficulties with basing an assessment of the adequacy of a surrogate on the treatment effect; see [15, 19, 20]. Approaches based on the contribution of the surrogate to the information available about the treatment effect have recently been advocated [21, 22, 23]. These methods are attractive alternatives to Freedman’s measure but have not yet been used much in practice.

2.5. Mechanism of treatment

As stated above, there may not be a universal surrogate endpoint for a clinical endpoint when there are several possible treatments. The mechanism of treatment must also be considered in a broad clinical context. There are many examples in oncology. Size of tumour may be a good surrogate endpoint for neoadjuvant cytotoxic chemotherapy but not for surgery. Suppression of ovulation is an indication that treatment with tamoxifen is working but is an unwelcome side effect for cytotoxic chemotherapy.

3. THE PATH TO SURROGACY

New potential surrogate endpoints usually emerge from epidemiological and biochemical studies. Before they attain the status of ‘surrogate’ they are usually described as ‘biomarkers’. Only a small proportion of biomarkers will be used as surrogate endpoints.

Here are some examples of biomarkers which may or may not be on the path to surrogacy.

3.1. Epstein–Barr virus in nasopharyngeal cancer

Relative risks of 30–130 for nasopharyngeal cancer are quoted with respect to a positive antibody test for the Epstein–Barr virus, in Chinese populations: for example, Cheng et al. [24] report a test sensitivity of 92% and a specificity of 93%. The 5–10% of subjects with a positive serum test contribute 80–95% of cases. Furthermore, of those who test positive, between 2% and 10% are found to have nasopharyngeal cancer on immediate examination.
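The link between sensitivity, specificity and the proportion of test-positive subjects who have the disease follows from Bayes' theorem, and can be sketched as below. The prevalence is not stated in the text, so the value used here is a purely hypothetical input chosen for illustration. The function names are our own.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test, by Bayes' theorem."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


def fraction_testing_positive(sensitivity, specificity, prevalence):
    """Share of the whole population that tests positive."""
    return sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)


# Sensitivity and specificity as quoted from [24]; prevalence is an assumption.
assumed_prevalence = 0.002
ppv = positive_predictive_value(0.92, 0.93, assumed_prevalence)
positive_share = fraction_testing_positive(0.92, 0.93, assumed_prevalence)
print(round(ppv, 4), round(positive_share, 4))
```

Because the disease is rare, most positive tests are false positives even with high sensitivity and specificity, which is why only a small percentage of test-positive subjects are found to have cancer.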

There is an obvious potential for using the presence of Epstein–Barr virus antibody as a screening test (Chen et al. [25]). Screen-detected nasopharyngeal cancers are generally at a less advanced stage than clinically detected cancers and survival worsens with stage.

It has been hypothesized that reducing the virus antibody response would have therapeutic benefit. If that is successful then the virus antibody response would be a candidate as a surrogate for survival (Chan et al. [26]).

The success of the Epstein–Barr virus antibody response as a surrogate endpoint depends critically on being confident that the antibody response lies on the causal pathway between treatment and survival. That is why it is necessary to show that reducing the antibody response has therapeutic benefit. Otherwise the presence of Epstein–Barr virus may just be a population marker for susceptibility – and changing the virus load at an individual level may have no effect on that individual’s prognosis.

3.2. Prostate specific antigen and prostate cancer

Prostate specific antigen (PSA) levels are strongly related to the presence and extent of prostate cancer. Both the time to PSA ‘failure’ and the PSA doubling time are commonly used as surrogate endpoints in clinical trials. There is a large literature on this topic; see for example Barqawi et al. [27] and Moul [28].

After prostatectomy, PSA levels fall and with recurrence, PSA levels rise. PSA doubling time of less than 3 months is associated with 52% 8-year survival whereas a doubling time of 3 months or more is associated with 99% survival (D’Amico et al. [29]).

The same large study showed that post-PSA recurrence survival after radiotherapy was much worse than that after surgery: the unadjusted hazard ratio [95% confidence limits] was 11.4 [7.8, 16.8]. After adjusting for explanatory variables other than PSA doubling time the hazard ratio reduces to 8.7 [5.6, 13.4] and when PSA doubling time is added to the model the ratio reduces to 1.0 [0.6, 1.9]. This corresponds to a Freedman measure of unity.
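As a worked check of that last statement (our own arithmetic, using the definition of $P$ from Section 2.4 and assuming the comparison is between the hazard ratio of 8.7 before PSA doubling time is added and 1.0 afterwards):

$$P = 1 - \frac{\beta_A}{\beta_U} = 1 - \frac{\ln 1.0}{\ln 8.7} = 1 - \frac{0}{2.163} = 1$$

Taking the fully unadjusted hazard ratio of 11.4 as the baseline instead gives the same value, since $\ln 1.0 = 0$ in either case.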

The disappearance of the treatment effect after adjustment for PSA doubling time is exactly what is required of a surrogate by Prentice’s criterion.

However, care has to be taken – these results come from an epidemiological study and we do not have the protection of randomization. It may well be that relatively frail patients tend both to have short PSA doubling times and to be deemed unsuitable for surgery.

In fact PSA doubling time is considered in practice a good surrogate for prostate cancer survival in clinical trials (as examples: Barqawi et al. [27] and Albertsen et al. [30]). However, as recently as 2003, D’Amico et al. [29] concluded ‘The relationship between prostate-specific antigen (PSA)-defined recurrence and prostate cancer-specific mortality remains unclear’. Buyse et al. (chapter 7 of [4]) state at the end of an extensive case study ‘…PSA does not qualify as an acceptable surrogate, regardless of how it is analyzed’. Moul [28] comments ‘The condition of biochemical failure has only been recognized in the last decade and few ‘PSA-era’ patients with biochemical recurrence have actually died of the disease’. This controversy has attracted a substantial literature: see also [31–34].

4. CANDIDATES FOR SURROGACY

4.1. VCAM-1

‘Serum vascular cell adhesion molecule 1 (VCAM-1) is a surrogate marker of angiogenesis…’ (Byrne and Bundred [7]). The authors made this assertion on the grounds that there is:

(1) a higher VCAM-1 in lymph node positive disease;
(2) a correlation of 0.61 with microvessel density.

The authors mean at least that it would be prudent clinical practice to regard those with high VCAM-1 measurement as at higher risk. They in fact take the argument further by claiming that the measurement of VCAM-1 may help in the assessment of antiangiogenic drugs – their view being based on an understanding of the underlying biochemistry.

4.2. CA15.3

In metastatic breast cancer, increases in serum levels of the onco-antigen carbohydrate antigen 15.3 (CA15.3) after treatment are correlated with poorer prognosis (Barrenetxea et al. [35]). CA15.3 is therefore suggested as an early marker of response to treatment in advanced breast cancer. Again, this is probably prudent clinical practice, but we would require more evidence than prognostic significance alone before using it as a surrogate for survival in a trial. This would include the sort of demonstration of accounting for treatment effects shown for PSA above. Conversely, it could be argued that – as survival from advanced breast cancer is relatively short – the trouble and controversy inherent in establishing a surrogate may not be justified by the potential gains.

4.3. Breast density

High breast density is the presence of a high proportion of white areas on a mammogram (breast X-ray). It is a good example of a biomarker on the path to surrogacy. High breast density is associated with high relative and attributable risks for breast cancer. In a trial of chemoprevention by tamoxifen, breast cancer incidence was reduced by 32% (Cuzick et al. [36]). In a subgroup of the trial population with available mammograms, breast density was reduced by around 10% in absolute terms and 20% in relative terms, in the tamoxifen arm compared to placebo (Cuzick et al. [37]).

It remains to be established whether (1) changing the density actually leads to a change in risk; and (2) the change in density with treatment accounts for the change in risk with treatment. These issues are being addressed via nested studies within the chemoprevention trials.

5. VALIDATION

Validation of a surrogate endpoint is difficult and time-consuming, and may not be practicable within a clinical development programme. It will usually depend on availability of data from previous studies: using the treatment of interest, or treatments with similar modes of action. A useful case study in the validation of a potential surrogate endpoint (here: recurrence of disease as a surrogate for overall survival in adjuvant therapy for colorectal cancer) is to be found in papers by Sargent et al. [38, 39]. A candidate surrogate endpoint in advanced colorectal cancer is progression-free survival; Buyse et al. [40] demonstrated its validity as a surrogate. In contrast, Burzykowski et al. [41] did not find an endpoint suitable as a surrogate for overall survival in metastatic breast cancer; in the accompanying editorial Sargent and Hayes [42] comment that, nevertheless, progression-free survival is ‘appropriate as a primary endpoint irrespective of formal surrogacy’ with overall survival.

Validation needs to include evidence that the PC is at least partly satisfied, i.e. that a considerable proportion of the treatment effect on the clinical endpoint is accounted for by the surrogate (implying a history of successful intervention). It should also include evidence of a strong association of the surrogate endpoint with the clinical endpoint at the individual level and if possible evidence of an ecological correlation between treatment effects on surrogate and clinical endpoint from previous studies of comparable treatments. For this last reason, meta-analysis data can be of value in evaluating potential surrogate endpoints. Finally, there should be biological plausibility for the surrogate endpoint – implicit in the PC is the idea that the treatment actually achieves its effect on the clinical endpoint via its effect on the surrogate endpoint. For example, breast screening achieves its effect on breast cancer mortality by first reducing the incidence of advanced stage breast cancers.

It must be remembered that the statistical contribution to the evaluation of a surrogate endpoint, though important, is only a small part of the whole: the assessment of a potential surrogate must be based firmly on scientific and clinical argument (see Temple [43] for a regulatory view).


6. CONCLUSION

Surrogate endpoints can potentially save time and subjects within a clinical trial programme. It is a long path from first signs that a biomarker might be useful to a fully validated surrogate endpoint. Surrogates used in practice range from new biochemical markers which suggest themselves by way of biology or epidemiology to endpoints which have been used for many years and enjoy a ‘well-established’ status. The challenges of validation, prospectively and retrospectively, should not be underestimated.

There is a large literature on surrogate endpoints. We recommend starting with the article by Hilsenbeck and Clark [6], the recent book by Burzykowski et al. [11] (but read chapter 7 of [4] first) and the recent review articles in Statistics in Medicine [44–46].

REFERENCES

[1] Senn S. Statistical Issues in Drug Development (2nd edn). Wiley:Chichester, 2008.

[2] Cuzick J, Cafferty FH, Edwards R, Møller H, Duffy SW. Surrogateendpoints for cancer screening trials: general principles and anillustration using the UK Flexible Sigmoidoscopy Screening Trial.Journal of Medical Screening 2007; 14:178–185.

[3] Biomarker Definition Working Group. Biomarkers and surrogateendpoints: preferred definitions and conceptual framework.Clinical Pharmacology and Therapeutics 2001; 69:89–95.

[4] Bloom JC, Dean RA. Biomarkers in Clinical Drug Development.Marcel Dekker: New York, 2003.

[5] Day NE, Duffy SW. Trial design based on surrogate end points –application to comparison of different breast screening frequen-cies. Journal of the Royal Statistical Society (A) 1996; 159:49–60.

[6] Hilsenbeck SG, Clark GM. Surrogate endpoints in chemopreven-tion of breast cancer: guidelines for evaluation of new biomarkers.Journal of Cellular Biochemistry – Supplement 1993; 17G:205–211.

[7] Byrne GJ, Bundred NJ. Surrogate markers of tumour angiogenesis.International Journal of Biological Markers 2000; 15:334–339.

[8] Tangen CM, Crowley JJ. Phase II trials using time-to-event endpoints.In Handbook of Statistics in Clinical Oncology (2nd edn), Crowley J,Ankerst DP (eds). Chapman & Hall, CRC: Boca Raton, 2006.

[9] Green S, Benedetti J, Crowley J. Clinical Trials in Oncology (2nd edn).Chapman & Hall, CRC: Boca Raton, 2003.

[10] Rustin GJS, Bast RC, Kelloff GJ, Barrett JC, Carter SK, Nisen PD,Sigman CC, Parkinson DR, Ruddon RW. Use of CA-125 in clinicaltrial evaluation of new therapeutic drugs for ovarian cancer.Clinical Cancer Research 2004; 10:3919–3926.

[11] Burzykowski T, Molenberghs G, Buyse M. The Evaluation ofSurrogate Endpoints. Springer: Heidelberg, 2005.

[12] Prentice RL. Surrogate endpoints in clinical trials: definitions and operational criteria. Statistics in Medicine 1989; 8:431–440.

[13] Begg CB, Leung DHY. On the use of surrogate end points in randomized trials. Journal of the Royal Statistical Society A 2000; 163:15–24.

[14] Baker SG, Kramer BS. A perfect correlate does not a surrogate make. BMC Medical Research Methodology 2003; 3:16.

[15] Buyse M, Molenberghs G. The validation of surrogate endpoints in randomized experiments. Biometrics 1998; 54:1014–1029.

[16] Freedman LS, Graubard BI, Schatzkin A. Statistical validation of intermediate endpoints for chronic diseases. Statistics in Medicine 1992; 11:167–178.

[17] Tabar L, Fagerberg G, Chen HH, Duffy SW, Smart CR, Gad A, Smith RA. Efficacy of breast cancer screening by age: new results from the Swedish two-county trial. Cancer 1995; 75:2507–2517.

[18] Duffy SW, Tabar L, Fagerberg G, Gad A, Grontoft O, South MC, Day NE. Breast screening, prognostic factors and survival – results from the Swedish two-county study. British Journal of Cancer 1991; 64:1133–1138.

[19] Flandre P, Saidi Y. Estimating the proportion of treatment effect explained by a surrogate marker. Statistics in Medicine 1999; 18:107–109.

[20] Molenberghs G, Buyse M, Geys H, Renard D, Burzykowski T, Alonso A. Statistical challenges in the evaluation of surrogate endpoints in randomized trials. Controlled Clinical Trials 2002; 23:607–625.

[21] Alonso A, Molenberghs G, Burzykowski T, Renard D, Geys H, Shkedy Z, Tibaldi F, Abrahantes JC, Buyse M. Prentice’s approach and the meta-analytic paradigm: a reflection on the role of statistics in the evaluation of surrogate endpoints. Biometrics 2004; 60:724–728.

[22] Qu Y, Case M. Quantifying the effect of the surrogate marker by information gain. Biometrics 2007; 63:958–960.

[23] Alonso A, Molenberghs G. Surrogate marker evaluation from an information theory perspective. Biometrics 2007; 63:180–186.

[24] Cheng W-M, Chan KH, Chen H-L, Luo R-X, Ng SP, Luk W, Zheng B-J, Ji M-F, Liang J-S, Sham JST, Wang DK, Zong Y-S, Ng MH. Assessing the risk of nasopharyngeal carcinoma on the basis of EBV antibody spectrum. International Journal of Cancer 2002; 97:489–492.

[25] Chen HH, Prevost TC, Duffy SW. Evaluation of screening for nasopharyngeal carcinoma: trial design using Markov chain models. British Journal of Cancer 1999; 79:1894–1900.

[26] Chan ATC, Teo PML, Johnson PJ. Nasopharyngeal carcinoma. Annals of Oncology 2002; 13:1007–1015.

[27] Barqawi AB, Moul JW, Ziada A, Handel L, Crawford ED. Combination of low-dose flutamide and finasteride for PSA-only recurrent prostate cancer after primary therapy. Urology 2003; 62:872–876.

[28] Moul JW. Variables in predicting survival based on treating ‘PSA-only’ relapse. Urologic Oncology 2003; 21:292–304.

[29] D’Amico AV, Moul JW, Carroll PR, Sun L, Lubeck D, Chen M-H. Surrogate end point for prostate cancer-specific mortality after radical prostatectomy or radiation therapy. Journal of the National Cancer Institute 2003; 95:1376–1383.

[30] Albertsen PC, Hanley JA, Penson DF, Fine J. Validation of increasing prostate specific antigen as a predictor of prostate cancer death after treatment of localized prostate cancer with surgery or radiation. Journal of Urology 2004; 171:2221–2225.

[31] Collette L, Burzykowski T, Carroll KJ, Newling D, Morris T, Schroder FH. Is prostate-specific antigen a valid surrogate endpoint for survival in hormonally treated patients with metastatic prostate cancer? Journal of Clinical Oncology 2005; 23:6139–6148.

[32] Petrylak DP, Ankerst DP, Jiang CS, Tangen CM, Hussain MHA, Lara PM, Jones JA, Taplin ME, Burch PA, Kohli M, Benson MC, Small EJ, Raghavan D, Crawford ED. Evaluation of prostate-specific antigen declines for surrogacy in patients treated on SWOG 99-16. Journal of the National Cancer Institute 2006; 98:516–552. [Baker SG (editorial): Journal of the National Cancer Institute 2006; 502–503].

[33] Armstrong AJ, Garrett-Mayer E, Ou Yang YC, Carducci MA, Tannock I, de Wit R, Eisenberger M. Prostate-specific antigen and pain surrogacy analysis in metastatic hormone-refractory prostate cancer. Journal of Clinical Oncology 2007; 25:3965–3970.

[34] Collette L, Buyse M, Burzykowski T. Are prostate-specific antigen changes valid surrogates for survival in hormone-refractory cancer? A meta-analysis is needed! Journal of Clinical Oncology 2007; 25:5673–5674.

[35] Barrenetxea G, Schneider J, Centeno MM, Genolla J, Lorente F, Rodríguez-Escudero FJ. CA15.3: A breast cancer marker predicting location of metastases even before treatment. European Journal of Cancer 1996; 32(Supp. 1):S17.

[36] Cuzick J, Forbes J, Edwards R et al. First results from the international breast cancer intervention study (IBIS): a randomized prevention trial. Lancet 2002; 360:817–824.

[37] Cuzick J, Warwick J, Pinney E, Warren RML, Duffy SW. Effect of tamoxifen on women at increased risk of breast cancer. Journal of the National Cancer Institute 2004; 96:621–628.

[38] Sargent DJ, Wieand HS, Haller DG, Gray R, Benedetti JK, Buyse M, Labianca R, Seitz JF, O’Callaghan CJ, Francini G, Grothey A, Connell M, Catalano PJ, Blanke CD, Kerr D, Green E, Wolmark N, Andre T, Goldberg RM, De Gramont A. Disease-free survival versus overall survival as a primary end point for adjuvant colon cancer studies: individual patient data from 20 898 patients on 18 randomized trials. Journal of Clinical Oncology 2005; 23:8664–8670.

[39] Gill S, Sargent D. End points for adjuvant therapy trials: has the time come to accept disease-free survival as a surrogate end point for overall survival? The Oncologist 2006; 11:624–629.

[40] Buyse M, Burzykowski T, Carroll K, Michiels S, Sargent DJ, Miller LL, Elfring GL, Pignon JP, Piedbois P. Progression-free survival is a surrogate for survival in advanced colorectal cancer. Journal of Clinical Oncology 2007; 25:5218–5224.

[41] Burzykowski T, Buyse M, Piccart-Gebhart MJ, Sledge G, Carmichael J, Luck H-J, Mackey JR, Nabholtz J-M, Paridaens R, Biganzoli L, Jassem J, Bontenbal M, Bonneterre J, Chan S, Basaran GA, Therasse P. Evaluation of tumor response, disease control, progression-free survival, and time to progression as potential surrogate end points in metastatic breast cancer. Journal of Clinical Oncology 2008; 26:1987–1992.

[42] Sargent DJ, Hayes DF. Assessing the measure of a new drug: is survival the only thing that matters? Journal of Clinical Oncology 2008; 26:1922–1923.

[43] Temple R. Are surrogate markers adequate to assess cardiovascular disease drugs? Journal of the American Medical Association 1999; 282:790–795.

[44] Weir CJ, Walley RJ. Statistical evaluation of biomarkers as surrogate endpoints: a literature review. Statistics in Medicine 2006; 25:183–203.

[45] Alonso A, Molenberghs G, Geys H, Buyse M, Vangeneugden T. A unifying approach for surrogate marker validation based on Prentice’s criteria. Statistics in Medicine 2006; 25:205–221.

[46] Qu Y, Case M. Quantifying the indirect treatment effect via surrogate markers. Statistics in Medicine 2006; 25:223–231.


Pharmaceut. Statist. 2011, 10 34–39. Copyright © 2009 John Wiley & Sons, Ltd.