Word count: 6061
Checkpoint inhibitors for malignant melanoma: a systematic review and meta-analysis
Dove Medical Press
Manuscript submission
Saturday 27th August 2016
Authors:
Adam Karlsson, BSc (Hons)1*
Sohag Saleh, PhD2
1Faculty of Medicine, Sir Alexander Fleming Building, Imperial College
London, Exhibition Road, London SW7 2AZ, UK
2Faculty of Medicine, 3S1c, Commonwealth Building, Hammersmith
Hospital Campus, Imperial College London, Du Cane Road, London W12
0NN, UK
* Corresponding author:
Address - Flat 31, 105 Hallam Street, W1W 5HD, London
Tel - (+44) 07446950989
Email - [email protected]
Abstract
Background and Objectives
Rates of malignant melanoma continue to increase and, until recently,
effective treatments were lacking. Since 2011, however, three
immunotherapeutic agents, known as checkpoint inhibitors, have been approved.
This review aims to establish whether these three drugs - ipilimumab,
nivolumab and pembrolizumab - offer greater efficacy and tolerability
compared to control interventions (placebo, immunotherapy, or
chemotherapy) in patients with stage III or IV unresectable cutaneous
melanoma.
Methods
A search on four major medical and scientific databases yielded 7553
records, of which seven met the inclusion criteria, with a total study
population of 3628. Only prospective, phase II or III randomized controlled
trials (RCTs) on checkpoint inhibitors for patients with unresectable
cutaneous melanoma that reported data on survival (overall, or progression-
free), tumor response, or adverse events were included. Three meta-
analyses were carried out.
Results
The hazard ratio for progression or death was 0.54 (0.44 – 0.67), and the
odds ratio for best overall response rate was 4.48 (2.77 – 7.24), both in
favor of checkpoint inhibitors. However, control treatments were associated
with a non-significantly lower rate of treatment discontinuation due to
adverse events or treatment-related adverse events, OR 1.63 (0.55 – 4.88).
Discussion
This study finds that checkpoint inhibitors are more effective than control
interventions, both in terms of survival and tumor response, and yet, no less
tolerable. PD-1 therapies (nivolumab and pembrolizumab) appear to offer
greater efficacy than CTLA-4 therapy (ipilimumab). The combination of
nivolumab and ipilimumab was the most effective, but significantly less
tolerable than monotherapy. The lack of published clinical data does,
however, limit this study.
Further research is needed in two areas in particular: first, to determine
the optimal use of checkpoint inhibitors, specifically in terms of combination
therapy, and second, to identify reliable biomarkers to predict responders
and guide treatment assignment.
KEYWORDS: checkpoint inhibitors, immunotherapy, melanoma,
ipilimumab, nivolumab, pembrolizumab
Introduction
Melanoma
Melanoma is a malignant neoplasm arising from melanocytes, the melanin-
producing cells of the body. Over the last half century, the incidence of
melanoma in most developed countries has risen more than any other form of
cancer, with rates increasing by 360% in Great Britain since the late 1970s.1-3
The current World Health Organization (WHO) estimates are that 132,000
melanomas occur each year around the world, resulting in 65,000 deaths
annually.4, 5 While genetic and phenotypic factors such as lightly pigmented
skin increase one’s risk, the main cause is thought to be environmental
exposure to the sun’s UV radiation.6 Early diagnosis and resection will cure
nine out of ten cases of stage I melanoma.7 The prognosis for regional and
distant metastatic melanoma (stage III and IV, respectively) is variable but
generally poor, with 5-year survival rates for stage III ranging from 13% to
69%, and as low as 6% in stage IV.8, 9
The poor prognosis of advanced melanoma is in part due to the limited
therapeutic options available. Surgery and radiotherapy provide mainly
palliation, and chemotherapy, most commonly with dacarbazine, has failed to
show any consistent survival benefit.10-12 Novel pharmacological agents have,
however, been developed, such as BRAF13 (v‐raf murine sarcoma viral
oncogene homolog B1) and MEK inhibitors14, as well as several
immunotherapeutic agents, most notably the class of drugs known as
checkpoint inhibitors.15
Checkpoint inhibitors
The cellular immune defense against neoplasms begins with the recognition
of a tumor antigen by a tumor-specific T-cell receptor. The interaction of
co-stimulatory and co-inhibitory molecules with their respective receptors on
T-cells, as illustrated in Figure 1 below, determines the balance between
T-cell activation and inhibition. Cytotoxic T-lymphocyte-associated antigen-4
(CTLA-4) is a co-inhibitory receptor present on the cell surface of CD4+ and
CD8+ T-cells that acts to dampen the immune response. CTLA-4 expression is
upregulated by increased T-cell activation and an inflammatory environment,
suggesting that it acts as a physiological brake on the immune response.
Through its higher affinity for the CD80 and CD86 ligands present on
antigen-presenting cells and tumor cells, CTLA-4 is able to outcompete the
co-stimulatory receptor CD28 for binding and thus negatively regulates T-cell
activation.16, 17 Similarly, PD-1 receptors expressed on T-cells and other
immune cells generate a co-inhibitory signal upon binding to their ligands,
PD-L1 and PD-L2, resulting in direct inhibition of tumor apoptosis, T-cell
exhaustion, and conversion of effector T-cells to regulatory T-cells.18
Melanoma cells are able to hijack this system by, eg, expressing co-inhibitory
molecules within the tumor microenvironment. This dampens the
immune response, and thus hampers effective tumor clearance.19, 20
Ipilimumab, a fully human IgG1 monoclonal antibody that targets CTLA-4,
along with nivolumab and pembrolizumab, humanized IgG4 monoclonal
antibodies that target PD-1, prevent the interaction between co-inhibitory
molecules and their receptors, thereby releasing the brake on the body’s
natural defense against tumors.21, 22
Ipilimumab received FDA (US Food and Drug Administration) approval in
2011, pembrolizumab and nivolumab in 2014, for the treatment of
unresectable or metastatic melanoma. Nivolumab is also licensed for
combination therapy with ipilimumab, as well as for non-small cell lung cancer
(as is pembrolizumab) and renal cell cancer.23-25 All three drugs are also
recommended for use by NICE (The National Institute for Health and Care
Excellence) in the UK.26
Rationale
Given the increasing rates of melanoma and the poor prognosis of advanced
disease, checkpoint inhibitors have the potential to greatly improve patient
outcomes. Therefore, a comprehensive overview of the evidence on the
efficacy and tolerability of this drug class is needed to ascertain its value and
identify any gaps in knowledge requiring further research and investigation.
Objectives
Figure 1. The mechanism of action of checkpoint inhibitors
A tumor specific T-cell receptor (TCR) interacts with a tumor antigen
presented on a major histocompatibility complex (MHC) initiating T-cell
activation. Activation is incomplete, though, without additional co-stimulatory
signaling through, eg, the interaction between the CD28 receptor and the B7
molecule. However, co-inhibitory signaling occurs between PD-L1/2 and the
PD-1 receptor, and through CTLA-4 outcompeting CD28 for B7 binding. The
balance is shifted in favor of T-cell activation through the use of monoclonal
antibodies (anti PD-1 and anti CTLA-4) that inhibit the interaction between
immune checkpoint molecules and their inhibitory receptors, thereby
restoring the anti-tumor immune response.
For this reason, this systematic review and meta-analysis aims to answer the
question: do the three currently approved checkpoint inhibitors - ipilimumab,
nivolumab and pembrolizumab - offer greater efficacy and tolerability
compared to control interventions, consisting of a placebo, another
immunotherapeutic agent, and/or chemotherapy (see Table 2), in patients
with stage III or IV unresectable cutaneous melanoma, in terms of
progression-free and overall survival, tumor response, and discontinuation
rates? The specific objectives are thus to identify all relevant studies, and to
use quantitative methods to compile their results. Any heterogeneity in the
results between individual drugs and studies will also be explored in order to
assess the between-drug differences in efficacy and tolerability. Finally, the
findings of this study will be placed in their context, and areas requiring further
research explored.
The authors hypothesize that checkpoint inhibitors will be found to be both
more effective and tolerable than control treatments.
Methods
Study identification
An electronic search was carried out on four databases:
- Embase Classic & Embase: 1947 – 26/3/2016
- Medline In-Process & Other Non-Indexed Citations and Medline: 1946 – 27/3/2016
- Web of Science Core Collection: All years – 27/3/2016
- Cochrane Library: All years – 27/3/2016
A similar search strategy was used for all databases, consisting of
various iterations of the drug names (eg ipilimumab, MDX-010, MDX-101,
yervoy, and BMS-734016) and of melanoma (see Appendix A for full list). No
limits, either in terms of date ranges or “NOT” search terms, were used. The
reference lists of other articles identified as relevant were manually screened
for any missing studies.
Study selection
Search results were exported to Microsoft Excel and duplicates removed. A
first screening of the titles and abstracts of the remaining reports was then
conducted, removing those that did not pertain to cutaneous malignant
melanoma and/or checkpoint inhibitors, were not original research, were not
available in English, or had a clearly inappropriate study design according to
the inclusion and exclusion criteria. The remaining articles were reviewed in
their entirety and assessed according to the inclusion and exclusion criteria,
as listed in Table 1. Two investigators, working independently, carried out
the study selection and reached a joint decision on the eligibility of studies
where their opinions differed.
Assessment of study quality
Time period
  Inclusion criteria
  - Embase - 1947 – 26th March 2016
  - Medline - 1946 – 27th March 2016
  - Web of Science - All years – 27th March 2016
  - Cochrane library - All years – 27th March 2016
  Exclusion criteria
  - None
Study design
  Inclusion criteria
  - Human
  - English language papers
  - Prospective
  - Randomized
  - Controlled
  - Phase II and III studies
  - FDA approved checkpoint inhibitor(s)
  - Cutaneous unresectable malignant melanoma
  Exclusion criteria
  - Uncontrolled
  - Retrospective
  - Follow-up studies
  - Phase I studies
  - Extended Access Programs
  - Review papers, editorials, opinion pieces, commentaries, letters,
    conference proceedings, meeting abstracts, case series, case reports
  - NSCLC*, prostate cancer, mucosal or uveal melanoma
Outcomes
  Inclusion criteria
  - Overall survival
  - Progression-free survival
  - Tumor response
  - Discontinuation rates
  - Adverse events
  Exclusion criteria
  - Quality of life
Table 1. Study inclusion and exclusion criteria
The inclusion and exclusion criteria for the systematic review and meta-
analysis are shown, divided into three categories: Time period, Study
design, and Outcomes.
* Non-small cell lung cancer
The quality of all included studies was assessed using the 2010
CONsolidated Standards Of Reporting Trials (CONSORT) checklist27,
comprising 25 items relating to the design, analysis, and interpretation of
randomized controlled trials (RCTs). A test of the strength of correlation between study
quality and primary efficacy outcome was carried out, in order to assess
whether poorer quality studies may have biased the results of the meta-
analysis (see Online Supplementary Material, Section A for full details).
A risk of bias assessment at the study level was carried out, using the criteria
provided in Review Manager (version 5.3). The risk of bias across studies was
also assessed by funnel plots to test for the presence of publication bias
(Online Supplementary Material, Section D), and by examining the source of
funding for all included studies.
Data collection
Baseline participant demographic data and outcome data were extracted into
separate spreadsheets. No data were extrapolated or directly extracted from
graphs. When multiple sets of data were reported, the data judged to be the
most robust and unbiased were extracted, eg an independent review
committee’s data in preference to investigator-assessed data.
Outcomes
The outcomes of this study relate to the efficacy and tolerability of checkpoint
inhibitors compared to control interventions:
- Primary outcome
  - Survival – Hazard ratio for progression or death
- Secondary outcomes
  - Tumor response – Odds ratio for best overall response rate (BORR)
  - Tolerability – Odds ratio for rates of discontinuation due to adverse,
    or treatment-related adverse events
The primary outcome used hazard ratios for progression, or hazard ratios for
death, based on progression-free survival (PFS), and overall survival (OS),
respectively. While OS is defined as the time from randomization to death
from any cause, PFS is the time from randomization to first disease
progression or death from any cause, whichever comes first. Due to the
disparity in the reporting of endpoints in the literature, and to ensure an
adequate sample size, these endpoints were combined for the primary
outcome meta-analysis. Importantly, a meta-analysis has shown that for
melanoma, PFS is a reliable surrogate for OS, with correlation coefficients
ranging from 0.55 to 0.96.28 Where both endpoints were reported21, 29, the
hazard ratio for progression was used, meaning that the hazard ratio for
death was used for only one study15 (see Table 3 for an overview of outcomes
reported in each study).
The secondary outcome on tumor response used BORR, defined as the
proportion of patients with a partial or a complete response as assessed by
the revised Response Evaluation Criteria In Solid Tumors (RECIST v.1.1)30
for five studies, or the modified WHO criteria31 for two studies.
The secondary outcome on tolerability was the rate of discontinuation due to
adverse events, or specifically treatment-related adverse events. The latter
was used when available, meaning that data on discontinuation due to any
adverse event were used for only one study29.
Statistical analysis
The statistical analysis was carried out using the Cochrane Review software
Review Manager (version 5.3). For the dichotomous outcomes (tumor
response and tolerability) an odds ratio was calculated based on the Mantel-
Haenszel statistical method. For the primary outcome analysis with data in the
form of hazard ratios, the generic inverse variance analysis was used. The
standard error was required for this analysis, and was manually calculated
from the 95% confidence intervals according to the equation32:
$$\mathrm{SE} = \frac{\text{upper limit} - \text{lower limit}}{3.92}$$
The weight of each study was automatically calculated as the inverse
variance of the effect estimate, meaning studies with narrower confidence
intervals were more heavily weighted.
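The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the software's own implementation. It assumes that the confidence limits are taken on the natural-log scale, which is consistent with the standard error of the log hazard ratio shown in the funnel plot. The function names and the three standard errors passed to the weighting function are hypothetical. The only values taken from this review are the limits of the pooled hazard ratio, 0.44 and 0.67, used here purely as a worked input.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of ln(HR) recovered from a 95% confidence interval.

    Assumes the interval is symmetric on the natural-log scale, so its
    width in log units equals 2 * 1.96 = 3.92 standard errors.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def inverse_variance_weights(standard_errors):
    """Relative weights (summing to 1), each proportional to 1 / SE^2."""
    raw = [1.0 / se ** 2 for se in standard_errors]
    total = sum(raw)
    return [w / total for w in raw]


# Pooled hazard ratio limits reported in this review, used as a worked input.
se_pooled = se_from_ci(0.44, 0.67)

# Hypothetical standard errors for three studies (not data from this review).
weights = inverse_variance_weights([0.10, 0.20, 0.40])

print(round(se_pooled, 4))
print([round(w, 3) for w in weights])
```

The study with the smallest standard error, and therefore the narrowest confidence interval, receives the largest weight.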
Due to the inherent heterogeneity from combining three different drugs, the
intervention treatments could not be said to be functionally equivalent,
meaning a random-effects, rather than a fixed-effect, model was used.
Tests of heterogeneity were performed in Review Manager. I2 was the
measure used as it emphasizes the effect of heterogeneity rather than merely
reporting its presence.33
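For readers unfamiliar with how a random-effects estimate and the I2 statistic are obtained, the sketch below may help. It assumes the DerSimonian-Laird estimator of the between-study variance, which is the approach generally associated with random-effects analysis in Review Manager; the manuscript itself does not name the estimator. The function name and the three input studies are hypothetical and are not data from this review.

```python
import math


def dersimonian_laird(log_effects, standard_errors):
    """Random-effects pooling of log effect sizes (DerSimonian-Laird).

    Returns the pooled log effect, its standard error, the between-study
    variance tau^2, Cochran's Q, and I^2 as a percentage.
    """
    w = [1.0 / se ** 2 for se in standard_errors]  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    return {
        "pooled": pooled,
        "se": math.sqrt(1.0 / sum(w_re)),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Hypothetical hazard ratios and standard errors (not data from this review).
hazard_ratios = [0.4, 0.6, 0.9]
result = dersimonian_laird([math.log(hr) for hr in hazard_ratios],
                           [0.1, 0.1, 0.1])

print(round(math.exp(result["pooled"]), 2), round(result["i2"], 1))
```

Because the between-study variance is added to every study's variance, the random-effects weights are more evenly spread across studies than fixed-effect weights would be.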
Missing data
Attempts were made to contact four corresponding authors to request missing
or unreported data, all without success.
Results
The Results section is sub-divided into three parts, the first providing data on
the included studies, the second on the results of the meta-analyses, and the
third on the results of the bias assessment.
Included studies
Study selection
7553 records across four databases were identified, with seven studies
ultimately meeting the inclusion criteria, as seen in Figure 2, where the
number of studies identified, reviewed, and excluded at each stage of the
study selection is listed. After duplicates were removed, 4947 records were
screened, and 295 full-text articles were assessed for eligibility.
Study design
All seven studies that met the inclusion criteria were randomized, controlled
phase II or III trials, of which five were double-blinded, one was completely
open-label22, and one was partially open-label.34 Two studies included only
ipilimumab, two only nivolumab, one only pembrolizumab, and two both
ipilimumab and nivolumab. The control arms consisted of a placebo and
checkpoint inhibitor in two studies, gp100 (peptide cancer vaccine) and
placebo in one study, dacarbazine alone or with a placebo in two studies,
and investigator-choice chemotherapy in two studies, as illustrated in Table 2
below.
Three studies had three treatment arms, meaning a choice was made by the
investigators as to which arms to compare.15, 34, 35 For the Hodi study15,
ipilimumab & gp100 was compared to gp100 alone, in order to isolate the
effects of ipilimumab. For the Larkin study35, the combination arm (nivolumab
+ ipilimumab) was compared with ipilimumab to isolate the effects of
nivolumab, as comparing with nivolumab would fail to isolate the effects of
ipilimumab given the different nivolumab doses used in the two arms. Lastly,
for the Ribas study34, the approved dose of pembrolizumab (2 mg/kg) was
compared to investigator-choice chemotherapy, rather than
pembrolizumab 10 mg/kg.
Figure 2. Study flow diagram
The top boxes show the number of records identified in each of the four
databases, followed by the total number of records, before and after
duplicates were removed. The number of records screened and excluded
on the basis of the title and abstract, along with the reasons for exclusion
follows. Below this is the number of full-text articles assessed for eligibility,
and the number of those excluded with reasons as listed. Seven studies
were included in the qualitative and quantitative synthesis (meta-analysis).
Hodi15
- Journal, Year: NEJM, 2010
- Study design: Randomized, controlled, double-blinded, phase III study
- Randomized patients – no.: 676
- Intervention arm, drug (dose) – no.: Ipilimumab (3 mg/kg) + gp100 vaccine – 403
- Control arm, drug (dose) – no.: gp100 vaccine (2 x 1mg/kg) + placebo – 136
- Additional arm, drug (dose) – no.: Ipilimumab (3mg/kg) + placebo – 137

Larkin35
- Journal, Year: NEJM, 2015
- Study design: Randomized, controlled, double-blinded, phase III study
- Randomized patients – no.: 945
- Intervention arm, drug (dose) – no.: Nivolumab (1 mg/kg) + Ipilimumab (3 mg/kg) – 314
- Control arm, drug (dose) – no.: Ipilimumab (3 mg/kg) + placebo – 315
- Additional arm, drug (dose) – no.: Nivolumab (3 mg/kg) + placebo – 316

Postow36
- Journal, Year: NEJM, 2015
- Study design: Randomized, controlled, double-blinded, phase II study
- Randomized patients – no.: 142
- Intervention arm, drug (dose) – no.: Ipilimumab (3 mg/kg) + Nivolumab (1 mg/kg) – 95
- Control arm, drug (dose) – no.: Ipilimumab (3 mg/kg) + placebo – 47

Ribas34
- Journal, Year: The Lancet Oncology, 2015
- Study design: Randomized, controlled, open-label *, phase II study
- Randomized patients – no.: 540
- Intervention arm, drug (dose) – no.: Pembrolizumab (2 mg/kg) – 180
- Control arm, drug (dose) – no.: ICC † – 179 (Paclitaxel + carboplatin, paclitaxel, carboplatin, dacarbazine, or oral
- Additional arm, drug (dose) – no.: Pembrolizumab (10 mg/kg) – 181

Robert21
- Journal, Year: NEJM, 2011
- Study design: Randomized, controlled, double-blinded, phase III study
- Randomized patients – no.: 502
- Intervention arm, drug (dose) – no.: Ipilimumab (10 mg/kg) + Dacarbazine (850mg/m2) – 250
- Control arm, drug (dose) – no.: Dacarbazine (850mg/m2) + placebo – 252

Robert29
- Journal, Year: NEJM, 2015
- Study design: Randomized, controlled, double-blinded, phase III study
- Randomized patients – no.: 418
- Intervention arm, drug (dose) – no.: Nivolumab (3 mg/kg) + placebo – 210
- Control arm, drug (dose) – no.: Dacarbazine (1000mg/m2) + placebo – 208

Weber22
- Journal, Year: The Lancet Oncology, 2015
- Study design: Randomized, controlled, open-label, phase III study
- Randomized patients – no.: 405
- Intervention arm, drug (dose) – no.: Nivolumab (3 mg/kg) – 272
- Control arm, drug (dose) – no.: ICC † – 133 (Dacarbazine or paclitaxel + carboplatin)
In total, data from 3628 patients was included. The mean age across the
seven studies ranged from 56.2 – 61.7 years, and the mean proportion of
female participants was 38% as shown in Table 2, Appendix B. The tumor-
node-metastasis (TNM) system for melanoma by the American Joint
Committee on Cancer was used in all included studies, with 2383 patients
classified as M1c, and 1143 patients classified as M0, M1a, or M1b.
All seven studies were included in the primary outcome analysis on survival
and the secondary outcome analysis on tumor response, with one study
contributing overall survival and the rest progression-free survival. The
Hodi15 study did not report data on discontinuations due to adverse, or
treatment-related adverse events, as illustrated in Table 3 below, and was
therefore not included in the secondary outcome analysis on tolerability.
Table 2. Overview of the characteristics of included studies
An overview of the basic characteristics of all included studies, showing in addition
to the first author, journal name and year of publication, the study design, number of
participants, intervention given as well as number of patients randomized to each
arm. The intervention arm and the control arm listed for each study are the
two arms compared in the meta-analysis.
* Assignment to ICC or pembrolizumab was open-label, but dose of pembrolizumab
given was double-blinded.
† Investigator-choice chemotherapy.
Columns: Author; Hazard ratio for death; Hazard ratio for death or disease
progression; Best overall response rates; Total adverse events; Total
treatment-related adverse events; Discontinuation due to adverse events;
Discontinuation due to treatment-related adverse events
Hodi ✓ ✓ * ✓ ✓ ✓
Larkin ✓ ✓ ✓ ✓ ✓
Postow ✓ ✓ ✓ ✓
Ribas ✓ ✓ ✓ ✓ ✓
Robert (2011) ✓ ✓ ✓ ✓ ✓
Robert (2015) ✓ ✓ ✓ ✓ ✓ ✓
Weber ✓ ✓ ✓ ✓
Study quality
The mean score across the seven studies for the 2010 CONSORT checklist
was 64.4%, with only one study scoring <60%. The three items of the
CONSORT checklist that were consistently addressed poorly were:
providing a hypothesis or objective, describing the randomization procedure,
and identifying any weaknesses or limitations in the study.
There was a positive correlation (Pearson’s r = 0.57) between the CONSORT
checklist score and the hazard ratio for the primary efficacy outcome,
meaning that the lower quality studies reported larger treatment effects (ie
HRs closer to 0).
Table 3. Outcomes reported in included studies
An overview of the outcomes reported in each of the seven studies, with
green ticks representing outcomes reported and data used in meta-
analysis, and black ticks outcomes that were reported but data not used in
the meta-analysis.
* Did not report the 95% confidence intervals necessary to calculate the
standard error to construct a forest plot, meaning the HR for death data was
used instead.
Heterogeneity
As seen in Table 4, there was significant heterogeneity in all meta-analyses.
Removing the lowest quality study as assessed by the CONSORT checklist,
or the two open-label studies, had no significant effect on the I2 score.
Heterogeneity I2 (%)
Meta-analysis                         All studies   Lowest quality study removed   Open-label studies removed
Primary outcome – survival            91            92                             94
Secondary outcome – tumor response    72            71                             81
Secondary outcome – tolerability      93            94                             93
Table 4. Heterogeneity scores for meta-analyses
The I2 heterogeneity score for each meta-analysis is listed, for three cases:
when all studies are included, when the lowest quality study as assessed by
the CONSORT checklist is excluded, and when the two open-label
studies22, 34 are excluded.
Meta-analyses results
Primary outcome – Hazard ratio for progression or death
This study found that the median overall survival (OS) and progression-free
survival (PFS) were consistently greater in the checkpoint inhibitor arms than
in the control arms, with an overall hazard ratio of 0.54 (0.44 – 0.67) in favor
of checkpoint inhibitors, as seen in Figure 3. The greatest advantage for
checkpoint inhibitors was seen in the two studies comparing the combination
of nivolumab and ipilimumab to ipilimumab monotherapy, a comparison which,
assuming an additive as opposed to a synergistic effect, isolates the effects
of nivolumab.35, 36 In one of these studies, the PFS in the combination arm,
11.5 months (8.9 – 16.7), was only significantly superior when compared to the
ipilimumab and placebo arm, 2.9 months (2.8 – 3.4), but not the nivolumab
and placebo arm, 6.9 months (4.3 – 9.5).35 The third greatest benefit for
checkpoint inhibitors was seen for the comparison of nivolumab with
dacarbazine.29
No statistically significant difference was found for PFS in the one study
comparing two different doses of a checkpoint inhibitor (pembrolizumab).34
The only study to cross the line of no effect was the fully open-label study22,
which reported data for only a portion of its study population (182 / 405), and
thus has a markedly wider confidence interval. The I2 score is 91%, reflecting
the poor alignment of confidence intervals amongst the studies.
Secondary outcome – Tumor response
Similar to the primary outcome on survival, all studies reporting best overall
response rates (BORR) found that checkpoint inhibitors were superior to
control interventions. The meta-analysis showed an overall effect estimate of
OR 4.48 (2.77 – 7.24) favoring checkpoint inhibitors, as seen in Figure 4. Only
two studies, both of which had ipilimumab as the checkpoint inhibitor, failed to
show a statistically significant advantage, one compared to gp100 vaccine15,
and the other to dacarbazine.21
The greatest tumor response was seen in the two studies combining
nivolumab and ipilimumab (57.6% and 58.9%)35, 36, but the BORR with
nivolumab alone was more than twice as great as with ipilimumab alone
(43.7% vs. 19.0%) in the one study reporting both.35 The BORR in the
chemotherapy arms ranged from 4.5% to 13.9%, but the dacarbazine-specific
arms had a narrower spread (10.3% and 13.9%).21, 22, 29, 34
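To relate response rates of this kind to the odds ratio scale used in the meta-analysis, the short sketch below converts two proportions into a crude odds ratio. It uses the two BORR values quoted above for nivolumab alone and ipilimumab alone (43.7% and 19.0%). The function names are hypothetical, and the result is an unadjusted illustration only: this particular pair of arms was not the comparison entered into the meta-analysis, and no confidence interval can be derived without the underlying patient counts.

```python
def odds(proportion):
    """Convert a proportion of responders into odds of response."""
    return proportion / (1.0 - proportion)


def odds_ratio(p_treatment, p_control):
    """Crude odds ratio implied by two response proportions."""
    return odds(p_treatment) / odds(p_control)


# BORR for nivolumab alone versus ipilimumab alone, as quoted in the text.
or_nivo_vs_ipi = odds_ratio(0.437, 0.190)

print(round(or_nivo_vs_ipi, 2))
```

The odds ratio of roughly 3.3 is larger than the ratio of the two response rates (about 2.3), a reminder that odds ratios overstate relative differences in proportions when the outcome is common.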
Figure 3. Forest plot for the primary outcome analysis on survival
The forest plot for the primary outcome on survival, with the hazard ratio for
progression or death along the x-axis, and the results from all seven
studies shown with the red dot representing the effect estimate and the line
through it representing the 95% confidence interval. The percentage weight
is listed next to each study. Data on the heterogeneity of the meta-analysis
is shown in the bottom-left, with the relevant measure being the I2 score.
The black diamond represents the overall effect measure, which lies clear
of the line of no effect, showing a benefit for checkpoint inhibitors as
compared to control intervention, with an overall HR of 0.54 (95% CI: 0.44 –
0.67).
Four studies had appreciably wider confidence intervals, all of which had
smaller control arms with fewer objective responses.15, 22, 34, 36 The I2 score
was 72%, suggesting moderate heterogeneity, but this is reduced to 0% when
the two studies assessing the tumor response of ipilimumab are removed.
Secondary outcome – Tolerability
Unlike the previous two meta-analyses, the overall effect estimate for
discontinuation due to adverse and treatment-related adverse events was OR
= 1.63 (0.55 – 4.88), non-significantly in favor of the control intervention as
compared to checkpoint inhibitors, as seen in Figure 5. Three studies favored
control treatment, and three favored checkpoint inhibitors, but all of the latter
crossed the line of no effect. These three studies all compared either
nivolumab or pembrolizumab monotherapy to chemotherapy. Both studies
comparing a combination of ipilimumab and nivolumab to ipilimumab alone
found that more patients discontinued in the combination arms.
The confidence intervals were poorly aligned, with a high heterogeneity score,
I2 = 93%.
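The distinction between the significant and non-significant pooled results can be checked directly from the reported confidence intervals. The sketch below takes the three pooled estimates reported in this review and, for each, tests whether the 95% interval includes 1 (the line of no effect). It also back-calculates an approximate two-sided p-value. The function name is hypothetical, and the p-values rest on the assumption of a normal approximation on the log scale; they are not values reported by the authors.

```python
import math


def summarize(estimate, lower, upper):
    """Check a ratio estimate against the line of no effect (ratio = 1)."""
    se = (math.log(upper) - math.log(lower)) / 3.92  # SE on the log scale
    z = math.log(estimate) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # normal approximation
    return {
        "crosses_no_effect": lower <= 1.0 <= upper,
        "z": z,
        "p_two_sided": p_two_sided,
    }


# Pooled estimates and 95% confidence intervals as reported in this review.
pooled = {
    "survival (HR)": (0.54, 0.44, 0.67),
    "tumor response (OR)": (4.48, 2.77, 7.24),
    "discontinuation (OR)": (1.63, 0.55, 4.88),
}
results = {name: summarize(*values) for name, values in pooled.items()}

for name, summary in results.items():
    print(name, summary["crosses_no_effect"], round(summary["p_two_sided"], 3))
```

Only the interval for discontinuation includes 1, which matches the description of that result as statistically non-significant.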
Figure 4. Forest Plot for the secondary outcome analysis on tumor response
Forest plot for the secondary outcome on tumor response, with the odds
ratio for best overall response rate (BORR) on the x-axis, and the results
from all seven studies shown with the blue dot representing the effect
estimate and the line through it representing the 95% confidence interval.
The percentage weight is listed next to each study. Data on the
heterogeneity of the meta-analysis is shown in the bottom-left, with the
relevant measure being the I2 score. The black diamond represents the
overall effect measure, which lies clear of the line of no effect, showing a
benefit for checkpoint inhibitors as compared to control interventions, with
an overall odds ratio of 4.48 (95% CI: 2.77 – 7.24).
In the one study reporting both tolerability endpoints, the order for
discontinuations due to specifically treatment-related adverse events was
pembrolizumab 10mg/kg > chemotherapy > pembrolizumab 2mg/kg.
However, for discontinuation due to all adverse events, chemotherapy rather
than pembrolizumab 2mg/kg caused the lowest rate of discontinuation.
Bias
The risk of bias was assessed at both the study and the outcome level, and
in addition, the presence of publication bias was assessed. As seen in
Table 5, most bias domains (selection bias, performance bias, attrition bias,
reporting bias, and other bias) were marked as low or unclear risk, but three
studies had one domain marked as high risk each. An unclear risk of bias was
defined as a risk of bias that was greater than low, but not sufficient to be
considered high. At the study level, the Larkin and Postow studies35, 36 had the
lowest risk of bias, while the Weber study22 had the highest (see Online
Supplementary Material, Section C for full risk of bias tables).
Figure 5. Forest plot for the secondary outcome analysis on tolerability
The forest plot for the secondary outcome on tolerability, with the odds ratio
for rates of discontinuation due to adverse and treatment-related adverse
events along the x-axis, and the results from the six studies reporting a
tolerability endpoint shown with the blue dot representing the effect estimate
and the line through it representing the 95% confidence interval. The
percentage weight is listed next to each study. Data on the heterogeneity of
the meta-analysis is shown in the bottom-left, with the relevant measure
being the I2 score. The black diamond represents the overall effect
measure, which lies towards the right, favoring control interventions, but
crosses the line of no effect meaning the results are statistically non-
significant, overall odds ratio 1.63 (95% CI: 0.55 – 4.88).
At the domain level, the random sequence generation and allocation
concealment were done well, while blinding of participants and personnel and
blinding of outcome assessment were done poorly, as shown in Table 6.
Incomplete outcome data was marked as unclear risk of bias for all seven
studies, for not adequately explaining why some patients were not evaluated
or included in the analysis. All studies were funded by, and designed in
collaboration with, the pharmaceutical company that developed or marketed
the checkpoint inhibitor, which was noted under the ‘other bias’ domain.
Neither the risk of bias assessment nor the CONSORT quality scores were used
in the weighting of the meta-analyses; they were only discussed as part of the
qualitative assessment of the included studies.
Table 5. Risk of bias assessment at the study level
The risk of bias assessment showing for each study as listed on the left, the
number of low, unclear, and high scores given for the seven parameters
assessed, represented by green, yellow, and red circles, respectively.
Table 6. Risk of bias assessment at the domain level
The risk of bias assessment showing for each domain as listed on the left,
the number of low, unclear, and high scores given across the seven studies,
represented by green, yellow, and red fields, respectively.
Publication bias
In assessing the presence of publication bias, the funnel plot for the primary
outcome meta-analysis shows an even spread of studies on either side of the
overall effect estimate line, as seen in Figure 6. There is a lack of low quality
studies with widespread effect estimates, reflecting the scarcity of published
data. There does not, however, appear to be any significant publication bias.
The funnel plots for the secondary outcome analysis on tumor response and
tolerability (Figures 1 and 2, Online Supplementary Material, Section D) are
also spread evenly around their respective overall effect estimate lines, but
are less closely clustered together due to the greater disparity in the standard
error of the log odds ratio. There are too few studies in all of the meta-
analyses carried out for any formal tests of funnel plot asymmetry to be
performed.
Figure 6. Funnel Plot for the primary outcome analysis on survival
Each study is represented as a black circle, with the hazard ratio for
progression or death (ie the result) along the x-axis, and the standard error
of the natural log of the hazard ratio (ie the reliability) on the y-axis. The
smaller the SE (log [Hazard Ratio]), the more reliable the result from that
study is, meaning less reliable studies will be found closer to the x-axis.
There is an even spread of studies on either side of the vertical blue line
representing the overall effect estimate (HR = 0.54).
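The quantity on the y-axis can be recovered from a published confidence
interval. The following is a minimal sketch, assuming a 95% interval that is
symmetric on the log scale (z = 1.96), in line with the Cochrane Handbook
guidance cited in this review32. The function name se_from_ci is illustrative
and not taken from any software used here, and the pooled hazard ratio is used
as input only because the study-level intervals are not restated in this
section.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log ratio, recovered from its confidence limits.

    Assumes the interval was built as exp(log(estimate) +/- z * SE),
    so the width on the log scale equals 2 * z * SE.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Pooled hazard ratio for progression or death: 0.54 (0.44 - 0.67)
se_log_hr = se_from_ci(0.44, 0.67)
print(f"SE(log HR) = {se_log_hr:.3f}")
```

A narrower interval gives a smaller standard error, which places the study
further from the x-axis of the funnel plot.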
Discussion
Checkpoint inhibitors used in the treatment of unresectable stage III and IV
melanoma are found to be more effective, as determined by prolonged
survival times and improved tumor responses, and yet no less tolerable than
control treatments in meta-analyses of seven randomized controlled trials.
The discussion is divided into three sections, one exploring the results of the
meta-analyses, another the limitations of this study arising due to bias in the
included studies, the outcomes used, and limitations at the review-level, and
lastly one placing the findings of this study in their wider context and exploring
the future directions of checkpoint inhibitors for the treatment of melanoma.
Meta-analyses
Efficacy
The hazard ratio for progression or death was 0.54 (0.44 – 0.67), and the
odds ratio for BORR was 4.48 (2.77 – 7.24), both in favor of checkpoint
inhibitors. The two studies finding the greatest benefit in terms of survival and
tumor response both compared the combination of nivolumab and ipilimumab
to ipilimumab alone35, 36, suggesting that combination therapy is superior to
ipilimumab monotherapy. However, the rates of
discontinuation were significantly greater in the combination arms in both
studies (OR = 3.30, and 4.18).
Given that the greatest benefit was seen in the combination studies isolating
the effects of nivolumab, followed by the two studies comparing nivolumab to
dacarbazine29, and pembrolizumab to ICC34, one could suggest that the anti-
PD-1 mAb are more effective than anti-CTLA-4 mAb. However, comparing
combination therapy A + B to drug A in order to isolate the effects of drug B
assumes that the drugs have an additive rather than a synergistic effect.
Additionally, the superiority of pembrolizumab over ipilimumab was not
statistically significant, as the confidence intervals overlapped.
Nonetheless, the Larkin study35 found that combination therapy vs ipilimumab,
and vs nivolumab yielded HR 0.42 (0.31 – 0.57) and HR 0.74 (0.60 – 0.92),
respectively, indicating that while combination therapy was undoubtedly the
most effective, those on nivolumab compared more favorably than those on
ipilimumab. Moreover, the direct comparison of nivolumab and ipilimumab
monotherapies gave a significant advantage to nivolumab, hazard ratio for
progression 0.57 (0.43 – 0.76).35. Taken together with the weakest benefit for
checkpoint inhibitors coming from studies with ipilimumab in the experimental
arm (with exception of the Weber study22 due to its markedly wider confidence
interval), PD-1 therapy does in fact appear to be superior to CTLA-4 therapy.
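The three hazard ratios quoted from the Larkin study35 can be checked against
each other. As a rough sketch, assuming proportional hazards, the hazard ratio
for nivolumab vs ipilimumab implied by the two combination comparisons is
their quotient, because the shared combination arm cancels out. The variable
names below are illustrative. No confidence interval is derived, since the two
comparisons share an arm and are therefore not independent.

```python
# Hazard ratios reported by the Larkin study, as quoted in the text
hr_combo_vs_ipi = 0.42   # combination vs ipilimumab
hr_combo_vs_nivo = 0.74  # combination vs nivolumab

# HR(nivolumab vs ipilimumab)
#   = [h_combo / h_ipi] / [h_combo / h_nivo] = h_nivo / h_ipi
hr_nivo_vs_ipi_indirect = hr_combo_vs_ipi / hr_combo_vs_nivo

print(round(hr_nivo_vs_ipi_indirect, 2))  # 0.57
```

The implied value agrees with the directly reported hazard ratio of 0.57,
which is what one would expect if the three estimates are internally
consistent.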
Data from the secondary meta-analysis on tumor response showing that both
studies on ipilimumab failed to find statistically significant advantages over
control treatments further suggests that ipilimumab lacks in efficacy compared
to the PD-1 targeted therapies.15, 21. This is despite the clear benefit of
ipilimumab on PFS and OS, which raises the question of whether the two
commonly used criteria for evaluating tumor response are suitable for
checkpoint inhibitors. Unlike traditional cytotoxic agents, immunotherapies may
mediate cytostatic rather than cytotoxic effects, or cause delayed tumor
shrinkage due to the time lag between the disinhibition of the immune
response and subsequent anti-tumor effects, meaning the traditional criteria
may miss the positive effects of immunotherapies.37-39. A new set of
immune-related response criteria (irRECIST) has been developed to better
capture the atypical tumor responses seen with immunotherapeutic agents.40.
All of the remaining studies found statistically significant differences
in BORR, although these studies all used the RECIST criteria while the
ipilimumab studies used the modified WHO criteria. If, however, one assumes
that this disparity is not due to the different criteria being used, given the data
in Table 4, Online Supplementary Material Section B, showing that the BORR
for ipilimumab monotherapy was virtually the same in two studies, one using
the modified WHO criteria15, and the other the RECIST criteria36 (10.9%, and
10.6% respectively), a reasonable interpretation is that PD-1 therapy is again
shown to be superior to CTLA-4 therapy. A third plausible explanation is that
differences in response kinetics are such that nivolumab and pembrolizumab
are simply more suitable for evaluation with traditional criteria than ipilimumab
is, and that therefore, no inference can be made about their relative efficacies
based on this particular parameter.41.
Tolerability
For the secondary outcome analysis on tolerability, checkpoint inhibitors were
shown to be non-significantly inferior to control interventions for rates of
discontinuation due to adverse, or treatment-related adverse events, OR =
1.63 (0.55 – 4.88). Importantly though, all three studies favoring control
interventions compared combination therapy to monotherapy, which would
naturally make the monotherapy control arm appear more tolerable.21, 35, 36.
The three studies favoring checkpoint inhibitors compared either
pembrolizumab34, or nivolumab22, 29 to chemotherapy, and although this may
suggest superior tolerability compared to ipilimumab, differences in trial
design prohibit such a conclusion. The three studies on PD-1 therapy all
compared monotherapy to chemotherapy, while the study on ipilimumab
compared the combination of chemotherapy and ipilimumab to chemotherapy
alone, where the monotherapy arm would naturally be expected to be more
tolerable. Nonetheless, data in Table 5, Online Supplementary Material
Section B, show that 14.8 – 17.4% discontinued ipilimumab monotherapy
due to treatment-related adverse events across two studies35, 36, compared to
only 2.2% for pembrolizumab34, and 2.6 – 7.7% for nivolumab.22, 35. While
comparing data directly across studies is confounded by differences in study
design, the Larkin study did in fact have both a nivolumab and an ipilimumab
monotherapy arm, and yet almost twice as many patients on ipilimumab
discontinued due to treatment-related adverse events (46/311) as did patients
on nivolumab (24/313).35.
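The counts quoted above can be turned into an odds ratio in the same form used
for the tolerability meta-analysis. This is a minimal sketch, assuming a
standard Wald interval on the log odds ratio; the function name and the choice
of interval are illustrative, and the resulting figure is not reported in the
Larkin study or pooled in this review.

```python
import math


def odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio (arm A vs arm B) with a Wald confidence interval."""
    a, b = events_a, total_a - events_a   # arm A: events, non-events
    c, d = events_b, total_b - events_b   # arm B: events, non-events
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Discontinuation due to treatment-related adverse events (Larkin study):
# ipilimumab 46/311 vs nivolumab 24/313
or_ipi_vs_nivo, ci_low, ci_high = odds_ratio(46, 311, 24, 313)
print(f"OR = {or_ipi_vs_nivo:.2f} ({ci_low:.2f} - {ci_high:.2f})")
```

Under these assumptions the odds of discontinuation are roughly twice as high
on ipilimumab, and the interval does not include 1.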
There is, therefore, reason to suspect that ipilimumab may be less tolerable
than the two PD-1 targeted therapies, although checkpoint inhibitors as a
class were not shown to be significantly less tolerable than control treatments.
Heterogeneity
As was highlighted in the Results section, the heterogeneity of the meta-
analyses was significant. For the primary outcome analysis the heterogeneity
was I2 = 91%. However, when considering the three drugs separately, the
studies on each drug are well aligned, despite the different control
interventions, and the use of hazard ratio for death rather than progression in
one study.15. Similarly, the substantial heterogeneity for the secondary meta-
analysis on tumor response, I2 = 72%, was reduced to 0% upon excluding the
two ipilimumab studies.15, 21. This suggests firstly, that the heterogeneity stems
from the combination of different checkpoint inhibitors into one arm rather
than inconsistent effect estimates from individual studies, and secondly, that
the two endpoints for the primary meta-analysis (OS and PFS) are sufficiently
similar to combine.
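For reference, the I2 statistic quoted above is derived from Cochran's Q and
the number of studies, following the measure of inconsistency cited in this
review33. The sketch below uses a hypothetical Q value chosen only to
illustrate the formula; the Q values of the meta-analyses themselves are not
restated in this section.

```python
def i_squared(q, k):
    """I2 (in percent) from Cochran's Q for a meta-analysis of k studies.

    I2 = max(0, (Q - df) / Q) * 100, with df = k - 1.
    """
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical inputs, not taken from the review: Q = 20 across 5 studies
print(round(i_squared(q=20.0, k=5)))  # 80
# When Q does not exceed its degrees of freedom, I2 is truncated at 0
print(i_squared(q=3.0, k=5))          # 0.0
```

Because I2 depends on how far Q exceeds its degrees of freedom, removing the
studies that disagree most with the rest can move it sharply, as seen when the
two ipilimumab studies were excluded.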
Limitations
Bias
The studies included in this systematic review and meta-analysis were of
good quality, scientifically rigorous, and at low risk of bias. A further
exploration of the bias assessment does, however, show that certain bias
domains were more relevant than others, and that the most relevant domain
varied between the different outcomes.
The primary outcome on survival, and the secondary outcome on tumor
response were objective outcomes where the impartiality of assessment of
progression of disease and tumor response were vital, meaning the most
relevant bias domain was the ‘blinding of outcome assessment’. While most
studies had an independent review committee (IRC), 5/7 studies were marked
as ‘unclear risk’ of bias, mostly due to the failure to specify who made up the
IRC. One study used only investigator-assessed tumor response, and did not
specify whether the investigators remained blinded during assessment, and was
therefore marked as high risk of bias.29. The impact on the results is,
however, believed to be minor, given that firstly, the one study marked as high
risk produced results on par with other studies, and secondly, the failure to
specify who sat on the IRC does not necessarily mean that they were either
unqualified or biased.
The secondary outcome on tolerability was more subjective, meaning the
‘blinding of participants and personnel’ was the most important bias domain.
Five studies were marked as low risk of bias, but two studies were completely
or partially open-label and were therefore marked as high risk of bias.22, 34.
Patients in the chemotherapy arms of these two trials may have been more
likely to report adverse events, given that chemotherapy is commonly known
to cause side effects. As these two studies were amongst the only three
studies favoring checkpoint inhibitors, the results of this meta-analysis may
have been biased in favor of checkpoint inhibitors. The third study favoring
checkpoint inhibitors was, however, double-blinded, and the common
denominator identified previously was that these were the only studies that
compared checkpoint inhibitor monotherapy to control treatment.
All studies were marked as unclear risk under the ‘other bias’ domain due to
the funding being provided by the patent-holding pharmaceutical company,
who, in collaboration with the authors, were responsible for the study design,
data collection and analysis of results. On a study level it is not possible to
determine whether this potential bias was relevant or not, although it is
noteworthy that all high-quality data on checkpoint inhibitors for melanoma
has been funded by the pharmaceutical industry.
Outcomes
At the outcome-level, the use of PFS and thus hazard ratios for progression
as a surrogate for the gold-standard endpoint, OS and hazard ratios for death
was necessary given the lack of published data on OS, but nonetheless a
limitation. While a meta-analysis has shown that PFS is a reliable surrogate
marker, and that the correlation is stronger for melanoma than for any other
cancer, only studies with dacarbazine in the control arm, and only one study
assessing a checkpoint inhibitor (ipilimumab) were included.28. The atypical
tumor responses seen with immunotherapy make it possible for patients to
be prematurely marked as progressing, even though a positive late response
may still occur. This uncertainty is compounded by the use of different
checkpoint inhibitors with different kinetics in both the experimental and
control arms. The same applies for the secondary outcome on tumor
response, wherein the use of the RECIST or modified WHO criteria may fail to
capture the delayed response of checkpoint inhibitors.40, 42.
However, the direction of bias is such that, if anything, the efficacy of
checkpoint inhibitors would be underestimated given that patients would have
shorter PFS, and a lower BORR if they were prematurely evaluated as having
progressive disease. In fact, the hazard ratios for progression are less
substantial, ie closer to 1.00, than the hazard ratios for death in the studies
that reported both endpoints, meaning this potential limitation did not impact
the overall findings of this study.15, 21, 29.
Review
At the review-level, the weaknesses include the high heterogeneity, which
may reduce the credibility of a meta-analysis, and suggest that the studies are
too dissimilar to pool. However, the studies on each individual drug produced
similar results, suggesting that the studies were not producing randomly
spurious results, and that the heterogeneity was a reflection of genuine
differences between the three drugs. This study has compensated for the
inherent heterogeneity from combining three drugs by firstly, reviewing the
results of each drug separately and comparing against each other, and
secondly, by not assuming a common effect estimate, and therefore choosing
a random effects model.
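To make the consequence of that choice concrete, the sketch below pools log
hazard ratios under a random effects model. It assumes the DerSimonian-Laird
estimator of the between-study variance, which is a common default but is not
stated in this section to be the method used; the function name and the three
input studies are hypothetical and serve only to show the mechanics.

```python
import math


def dersimonian_laird(log_effects, std_errors):
    """Random effects pooling of log effect sizes (DerSimonian-Laird).

    Returns the pooled log effect, its standard error, and tau^2.
    """
    # Fixed effect (inverse variance) weights and pooled estimate
    w = [1 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)

    # Cochran's Q and the method-of-moments estimate of tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance
    w_star = [1 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return pooled, se_pooled, tau2


# Hypothetical studies, not taken from the review
log_hrs = [math.log(0.4), math.log(0.6), math.log(0.8)]
ses = [0.15, 0.10, 0.12]

pooled, se_pooled, tau2 = dersimonian_laird(log_hrs, ses)
low = math.exp(pooled - 1.96 * se_pooled)
high = math.exp(pooled + 1.96 * se_pooled)
print(f"HR = {math.exp(pooled):.2f} ({low:.2f} - {high:.2f}), tau^2 = {tau2:.3f}")
```

When the studies disagree, tau^2 is greater than zero, the weights become more
equal across studies, and the pooled confidence interval widens relative to a
fixed effect analysis.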
A second potential limitation is that in four out of seven studies, the
combination of a checkpoint inhibitor and control treatment was compared to
the control treatment alone, in order to isolate the effect of the checkpoint
inhibitor. This, however, assumes that the drugs do not act synergistically,
which would exaggerate the effect of the checkpoint inhibitor. There is limited
evidence on whether checkpoint inhibitors act in an additive or synergistic way
when combined with chemotherapy or another immunotherapeutic agent. In a
mouse model with a peritoneal ID8 tumor an α-PD-1 monoclonal antibody was
shown to produce synergistic effects when combined with trabectedin, and
separate to this, the combination of non-efficacious doses of anti-PD-1 and
anti-CTLA-4 antibodies was able to significantly reduce the tumor volume in
a mouse.43-45. This is, however, weak evidence, and in this review, only two
studies allowed for an evaluation of synergism, neither one providing
especially convincing evidence. The Larkin study35 showed that in the
combination arm, the PFS was slightly greater (11.5 vs. 9.8 months) but the
BORR slightly lower (57.6% vs. 62.7%) than the combined sum of the
monotherapy arms, while the Hodi study15 found that both were lower. The
assumption of an additive effect is therefore unlikely to have significantly
biased the results of this study.
The relatively low numbers of studies (seven) and total participants (3628),
and the inclusion of only one study assessing the tolerability of ipilimumab is
another limitation. Lastly, the presence of reporting bias, specifically in the
form of time-lag bias is also relevant, as the median overall survival data is yet
to be released for several studies. Based on the funnel plot, as shown in the
Online Supplementary Material, Section D, there was, however, no significant
publication bias.
Context and future directions
The results of this study support the FDA and EMA (European Medicines
Agency) approvals, and NICE recommendation for ipilimumab, nivolumab,
and pembrolizumab.26. Previous systematic reviews and meta-analyses
looking at only CTLA-4 or PD-1 targeted therapies separately, have, like this
study, shown that ipilimumab46, nivolumab and pembrolizumab47, 48 improve
survival and tumor response. Furthermore, a recent systematic review and
meta-analysis by Yun et al49, came to similar conclusions, and also found
evidence to suggest that anti-PD-1 treatment is of greater clinical benefit than
anti-CTLA-4 treatments. It, however, did not include the recent clinical trials
conducted by Larkin et al35, and Postow et al36 in the quantitative analyses,
which were included in this study. However, it did include one study on the
unapproved drug tremelimumab, another monoclonal antibody to CTLA-4,
which did not meet our inclusion criteria, as it is not FDA or EMA approved.
Tremelimumab failed to show efficacy in its phase III clinical trial, and is
therefore no longer being pursued as a treatment for melanoma.50. Additional
evidence in favor of PD-1 targeted therapy comes from the recent KEYNOTE
006 trial, which showed greater progression-free survival, overall survival, and
objective response rates with two different dosing regimens of pembrolizumab
compared to ipilimumab.51.
As in this study, the safety of ipilimumab in particular has been of
concern, while nivolumab has in fact been shown to cause a non-significant
decrease in adverse events.47, 48. Immune-related adverse events have,
however, been reported more frequently for the combination of nivolumab and
ipilimumab than for ipilimumab alone, which is consistent with the poorer
tolerability of combination therapy reported in this study.47, 52. These immune-
related adverse events range from mild and self-limiting, to life-threatening
organ inflammation, and although they respond well to steroids, and in severe
cases infliximab, prophylactic budesonide failed to reduce the rate of grade ≥2
diarrhea in ipilimumab-treated patients.53.
Combination therapies with multiple checkpoint inhibitors and/or with other
treatments such as signal transduction (BRAF/MEK) inhibitors remain an
important avenue to explore in order to obtain the maximum survival benefit of
checkpoint inhibitors in advanced melanoma. In this study, even with a
combination of ipilimumab and nivolumab, some 40% of patients nevertheless
failed to respond to treatment, while still remaining at risk of toxicity. Predictive
biomarkers, capable of giving a pre-treatment indication of the risk:benefit
ratio in an individual patient may therefore improve the use of checkpoint
inhibitors, especially as several new checkpoint inhibitors with novel targets
are coming close to market release. This means that choosing the best
combination of drugs to prescribe may become difficult, which is especially
problematic in advanced melanoma, where the poor prognosis makes a trial
and error approach to treatment inappropriate.
Based on the findings of this study, the focus of future research should
therefore be on two areas:
1) Determining the optimal use of checkpoint inhibitors, specifically in
terms of combination therapy and the optimal duration, constellation,
and sequence of such treatment
2) Identifying reliable biomarker algorithms to predict responders and
guide treatment assignments
Identifying reliable biomarkers to guide the use of checkpoint inhibitors may
not only spare non-responders from adverse effects and maximize the benefit
in responders, but also suggest novel drug targets. Moreover, carefully
designed dose ranging studies may also be helpful in determining the duration
of treatment required to achieve optimal effect without causing undue side
effects, given the results in this study showing prolonged survival with
combination treatment despite increased rates of discontinuations.35, 36.
Studies on combination treatments, such as the phase I/II study KEYNOTE
022, which combines pembrolizumab with the MEK inhibitor trametinib and
the BRAF inhibitor dabrafenib, are currently underway54, as are studies
looking at potential biomarkers, such as tumor genomics, with a recent study
on pembrolizumab for colorectal carcinoma showing that mismatch-repair
status predicted the clinical benefit.55. While BRAF mutation status has been
shown to not affect the efficacy of checkpoint inhibitors56, further studies are
needed to clarify the usefulness of PD-L1 status as a predictive marker. The
Larkin study35 found a nominally greater tumor response in PD-L1 positive
patients treated with nivolumab alone or combined with ipilimumab as
compared to ipilimumab alone, whilst the Postow study36 found that tumor
response was independent of PD-L1 status.
Finally, validating the new immune-related response criteria, and
incorporating it into clinical trials, along with the development of clearer
guidelines on the management of checkpoint inhibitor-induced toxicities may
improve the study, and safe use of checkpoint inhibitors.
Conclusion
This meta-analysis has found that checkpoint inhibitors provide a statistically
significant advantage over control interventions for progression-free survival,
overall survival and best overall response rates in patients with unresectable
stage III or IV melanoma, without significantly worsening tolerability. The
combination of ipilimumab and nivolumab was the most effective, but not
surprisingly, less tolerable than monotherapy. Reliable and predictive
biomarkers, along with clear guidelines for the optimal use of checkpoint
inhibitors, hold the potential to improve the prognosis of patients with
advanced melanoma, and to move immunotherapy towards becoming the 4th
generation of cancer treatment, along with surgery, chemotherapy, and
radiotherapy.
References
1. Cancer Research UK. Skin cancer incidence statistics.
http://www.cancerresearchuk.org/health-professional/cancer-statistics/
statistics-by-cancer-type/skin-cancer/incidence#ref-2.
2. Thompson JF, Scolyer RA, Kefford RF. Cutaneous melanoma. Lancet.
2005;365(9460):687-701.
3. de Vries E, Bray FI, Coebergh JW, Parkin DM. Changing epidemiology of
malignant cutaneous melanoma in Europe 1953-1997: rising trends in
incidence and mortality but recent stabilizations in western Europe and
decreases in Scandinavia. Int J Cancer. 2003;107(1):119-126.
4. Lucas R, McMichael T, Smith W, Armstrong B. Solar Ultraviolet
Radiation: Global burden of disease from solar ultraviolet radiation.
2006;13.
5. WHO. Skin cancers. http://www.who.int/uv/faq/skincancer/en/index1.html.
6. Bolognia JL, Schaffer JV, Duncan KO, Ko CJ. Cutaneous melanoma. In:
Dermatology Essentials. 1st ed. Elsevier Saunders; 2014:909-928.
7. Bolognia JL, Jorizzo JL, Rapini RP, et al. Melanoma. In: Dermatology. 1st
ed. Mosby; 2003:1789-1815.
8. Balch CM, Soong SJ, Gershenwald JE, et al. Prognostic factors analysis
of 17,600 melanoma patients: validation of the American Joint Committee
on Cancer melanoma staging system. J Clin Oncol. 2001;19(16):3622-
3634.
9. Barth A, Wanek LA, Morton DL. Prognostic factors in 1,521 melanoma
patients with distant metastases. J Am Coll Surg. 1995;181(3):193-201.
10. Fletcher WS, Pommier RF, Lum S, Wilmarth TJ. Surgical treatment of
metastatic melanoma. Am J Surg. 1998;175(5):413-417.
11. Lui P, Cashin R, Machado M, Hemels M, Corey-Lisle PK, Einarson TR.
Treatments for metastatic melanoma: synthesis of evidence from
randomized trials. Cancer Treat Rev. 2007;33(8):665-680.
12. Maverakis E, Cornelius LA, Bowen GM, et al. Metastatic melanoma - a
review of current and future treatment options. Acta Derm Venereol.
2015;95(5):516-524.
13. Chapman PB, Hauschild A, Robert C, et al. Improved survival with
vemurafenib in melanoma with BRAF V600E mutation. N Engl J Med.
2011;364(26):2507-2516.
14. Flaherty KT, Robert C, Hersey P, et al. Improved survival with MEK
inhibition in BRAF-mutated melanoma. N Engl J Med. 2012;367(2):107-114.
15. Hodi FS, O'Day SJ, McDermott DF, et al. Improved Survival with
Ipilimumab in Patients with Metastatic Melanoma. N Engl J Med.
2010;363(8):711-723.
16. Walker LS, Sansom DM. The emerging role of CTLA4 as a cell-extrinsic
regulator of T cell responses. Nat Rev Immunol. 2011;11(12):852-863.
17. Wolchok JD, Saenger Y. The mechanism of anti-CTLA-4 activity and
the negative regulation of T-cell activation. Oncologist. 2008;13 Suppl. 4:2-
9.
18. Amarnath S, Mangus CW, Wang JC, et al. The PDL1-PD1 axis converts
human TH1 cells into regulatory T cells. Sci Transl Med.
2011;3(111):111ra120.
19. Singh BP, Salama AKS. Updates in Therapy for Advanced Melanoma.
Cancers. 2016;8(1):17.
20. Blank C, Brown I, Peterson AC, et al. PD-L1/B7H-1 inhibits the effector
phase of tumor rejection by T cell receptor (TCR) transgenic CD8+ T cells.
Cancer Res. 2004;64(3):1140-1145.
21. Robert C, Thomas L, Bondarenko I, et al. Ipilimumab plus Dacarbazine
for Previously Untreated Metastatic Melanoma. N Engl J Med.
2011;364(26):2517-2526.
22. Weber JS, D'Angelo SP, Minor D, et al. Nivolumab versus
chemotherapy in patients with advanced melanoma who progressed after
anti-CTLA-4 treatment (CheckMate 037): a randomised, controlled, open-
label, phase 3 trial. Lancet Oncol. 2015;16(4):375-384.
23. UpToDate. Nivolumab: Drug information.
http://www.uptodate.com.iclibezp1.cc.ic.ac.uk/contents/nivolumab-drug-
information?source=see_link.
24. UpToDate. Pembrolizumab: Drug information.
http://www.uptodate.com.iclibezp1.cc.ic.ac.uk/contents/pembrolizumab-
drug-information?source=see_link.
25. UpToDate. Ipilimumab: Drug information.
http://www.uptodate.com.iclibezp1.cc.ic.ac.uk/contents/ipilimumab-drug-
information?source=see_link.
26. National Institute for Health and Care Excellence. Treating stage IV
melanoma: Immunotherapy and targeted therapy.
http://pathways.nice.org.uk/pathways/melanoma#path=view%3A/pathways/
melanoma/treating-stage-iv-melanoma.xml&content=view-node%3Anodes-
immunotherapy-and-targeted-therapy.
27. CONSORT: Transparent reporting of trials. CONSORT 2010 Checklist.
http://www.consort-statement.org/consort-2010.
28. Flaherty KT, Hennig M, Lee SJ, et al. Surrogate endpoints for overall
survival in metastatic melanoma: a meta-analysis of randomised controlled
trials. The Lancet Oncology. 2014;15(3):297-304.
29. Robert C, Long GV, Brady B, et al. Nivolumab in Previously Untreated
Melanoma without BRAF Mutation. N Engl J Med. 2015;372(4):320-330.
30. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation
criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J
Cancer. 2009;45(2):228-247.
31. James K, Eisenhauer E, Christian M, et al. Measuring response in solid
tumors: unidimensional versus bidimensional measurement. J Natl Cancer
Inst. 1999;91(6):523-528.
32. Higgins JP, Green S. Cochrane Handbook for Systematic Reviews of
Interventions: Obtaining standard errors from confidence intervals and P
values: absolute (difference) measures.
http://handbook.cochrane.org/chapter_7/7_7_7_2_obtaining_standard_error
s_from_confidence_intervals_and.htm.
33. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring
inconsistency in meta-analyses. British Medical Journal.
2003;327(7414):557-560.
34. Ribas A, Puzanov I, Dummer R, et al. Pembrolizumab versus
investigator-choice chemotherapy for ipilimumab-refractory melanoma
(KEYNOTE-002): a randomised, controlled, phase 2 trial. The Lancet
Oncology. 2015;16(8):908-918.
35. Larkin J, Chiarion-Sileni V, Gonzalez R, et al. Combined Nivolumab and
Ipilimumab or Monotherapy in Untreated Melanoma. N Engl J Med.
2015;373(1):23-34.
36. Postow MA, Chesney J, Pavlick AC, et al. Nivolumab and Ipilimumab
versus Ipilimumab in Untreated Melanoma. N Engl J Med.
2015;372(21):2006-2017.
37. Saenger YM, Wolchok JD. The heterogeneity of the kinetics of response
to ipilimumab in metastatic melanoma: patient cases. Cancer Immun.
2008;8:1.
38. Hales RK, Banchereau J, Ribas A, et al. Assessing oncologic benefit in
clinical trials of immunotherapy agents. Ann Oncol. 2010;21(10):1944-1951.
39. Topalian SL, Sznol M, McDermott DF, et al. Survival, durable tumor
remission, and long-term safety in patients with advanced melanoma
receiving nivolumab. J Clin Oncol. 2014;32(10):1020-1030.
40. Wolchok JD, Hoos A, O'Day S, et al. Guidelines for the evaluation of
immune therapy activity in solid tumors: immune-related response criteria.
Clin Cancer Res. 2009;15(23):7412-7420.
41. Luke JJ, Ott PA. PD-1 pathway inhibitors: The next generation of
immunotherapy for advanced melanoma. Oncotarget. 2014;6(6):3479-3492.
42. Dranitsaris G, Cohen RB, Acton G, et al. Statistical Considerations in
Clinical Trial Design of Immunotherapeutic Cancer Agents. J Immunother.
2015;38(7):259-266.
43. Guo Z, Wang H, Meng F, Li J, Zhang S. Combined Trabectedin and
anti-PD1 antibody produces a synergistic antitumor effect in a murine model
of ovarian cancer. Journal of Translational Medicine. 2015;13:247.
44. Sznol M. Combined CTLA4 and PD-1 pathway blockade for treatment of
advanced cancer. 2015.
45. Curran MA, Montalvo W, Yagita H, Allison JP. PD-1 and CTLA-4
combination blockade expands infiltrating T cells and reduces regulatory T
and myeloid cells within B16 melanoma tumors. Proc Natl Acad Sci U S A.
2010;107(9):4275-4280.
46. Dequen P, Lorigan P, Jansen JP, van Baardewijk M, Ouwens MJNM,
Kotapati S. Systematic Review and Network Meta-Analysis of Overall
Survival Comparing 3 mg/kg Ipilimumab With Alternative Therapies in the
Management of Pretreated Patients With Unresectable Stage III or IV
Melanoma. Oncologist. 2012;17(11):1376-1385.
47. Jin C, Zhang X, Zhao K, Xu J, Zhao M, Xu X. The efficacy and safety of
nivolumab in the treatment of advanced melanoma: a meta-analysis of
clinical trials. Journal of OncoTargets and therapy. 2016;9:1571-1578.
48. Chen R, Peng P, Wen B, et al. Anti-Programmed Cell Death (PD)-1
Immunotherapy for Malignant Tumor: A Systematic Review and Meta-
Analysis. Translational Oncology. 2015;9(1):32-40.
49. Yun S, Vincelette ND, Green MR, Wahner Hendrickson AE, Abraham I.
Targeting immune checkpoints in unresectable metastatic cutaneous
melanoma: a systematic review and meta-analysis of anti-CTLA-4 and anti-
PD-1 agents trials. Cancer Med. 2016;5(7):1481-1491. doi:
10.1002/cam4.732.
50. Ribas A, Kefford R, Marshall MA, et al. Phase III randomized clinical
trial comparing tremelimumab with standard-of-care chemotherapy in
patients with advanced melanoma. J Clin Oncol. 2013;31(5):616-622. doi:
10.1200/JCO.2012.44.6112.
51. Robert C, Schachter J, Long GV, et al. Pembrolizumab versus
Ipilimumab in Advanced Melanoma. N Engl J Med. 2015;372(26):2521-
2532. doi: 10.1056/NEJMoa1503093.
52. Bertrand A, Kostine M, Barnetche T, Truchetet ME, Schaeverbeke T.
Immune related adverse events associated with anti-CTLA-4 antibodies:
systematic review and meta-analysis. BMC Med. 2015;13:211-015-0455-8.
doi: 10.1186/s12916-015-0455-8.
53. Weber J, Thompson JA, Hamid O, et al. A randomized, double-blind,
placebo-controlled, phase II study comparing the tolerability and efficacy of
ipilimumab administered with or without prophylactic budesonide in patients
with unresectable stage III or IV melanoma. Clin Cancer Res.
2009;15(17):5591-5598. doi: 10.1158/1078-0432.CCR-09-1024.
54. ClinicalTrials.gov. A Study of the Safety and Efficacy of Pembrolizumab
(MK-3475) in Combination With Trametinib and Dabrafenib in Participants
With Advanced Melanoma (MK-3475-022/KEYNOTE-022).
https://clinicaltrials.gov/ct2/show/NCT02130466.
55. Le DT, Uram JN, Wang H, et al. PD-1 Blockade in Tumors with
Mismatch-Repair Deficiency. N Engl J Med. 2015;372(26):2509-2520. doi:
10.1056/NEJMoa1500596.
56. Larkin J, Lao CD, Urba WJ, et al. Efficacy and Safety of Nivolumab in
Patients With BRAF V600 Mutant and BRAF Wild-Type Advanced
Melanoma: A Pooled Analysis of 4 Clinical Trials. JAMA Oncol.
2015;1(4):433-440. doi: 10.1001/jamaoncol.2015.1184 [doi].
Appendices
Appendix A – Search terms
Database | Search terms
Medline | 1. Ipilimumab; 2. MDX-010; 3. MDX-101; 4. Yervoy; 5. BMS-734016; 6. Nivolumab; 7. ONO-4538; 8. BMS-936558; 9. MDX-1106; 10. Opdivo; 11. Pembrolizumab; 12. MK-4375; 13. Lambrolizumab; 14. Keytruda; 15. Checkpoint inhib*; 16. 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15; 17. Melanoma or Melanoma/ or Melanoma skin cancer; 18. Malignant melanoma; 19. Skin tumor or Skin cancer; 20. Skin neoplasm or Skin neoplasm/; 21. Skin carcinoma; 22. 17 or 18 or 19 or 20 or 21; 23. 16 and 22
Embase | 1. Ipilimumab or Ipilimumab/; 2. MDX-010; 3. MDX-101; 4. Yervoy; 5. BMS-734016; 6. Nivolumab or Nivolumab/; 7. ONO-4538; 8. BMS-936558; 9. MDX-1106; 10. Opdivo; 11. Pembrolizumab or Pembrolizumab/; 12. MK-4375; 13. Lambrolizumab; 14. Keytruda; 15. Checkpoint inhib*; 16. 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15; 17. Melanoma or Melanoma/; 18. Melanoma skin cancer or Melanoma skin cancer/; 19. Malignant melanoma; 20. Skin tumor or Skin tumor/; 21. Skin cancer or Skin cancer/; 22. Skin carcinoma or Skin carcinoma/; 23. Skin neoplasm*; 24. 17 or 18 or 19 or 20 or 21 or 22 or 23; 25. 16 and 24
Cochrane | 1. Ipilimumab or MDX-010 or MDX-101 or Yervoy or BMS 734016; 2. Nivolumab or ONO-4538 or BMS936558 or MDX-1106 or Opdivo; 3. Pembrolizumab or MK-4375 or Lambrolizumab or Keytruda; 4. Checkpoint inhib*; 5. 1 or 2 or 3 or 4; 6. Melanoma/; 7. Melanoma or Melanoma skin cancer; 8. Malignant melanoma; 9. Skin neoplasm/; 10. Skin cancer or Skin tumor or Skin carcinoma or Skin neoplasm; 11. 6 or 7 or 8 or 9 or 10; 12. 5 and 11
Web of Science | 1. Ipilimumab; 2. MDX-010; 3. MDX-101; 4. Yervoy; 5. BMS-734016; 6. Nivolumab; 7. ONO-4538; 8. BMS-936558; 9. MDX-1106; 10. Opdivo; 11. Pembrolizumab; 12. MK-4375; 13. Lambrolizumab; 14. Keytruda; 15. Checkpoint inhib*; 16. 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15; 17. Melanoma or Melanoma skin cancer or Malignant melanoma or Skin tumor or Skin cancer or Skin neoplasm* or Skin carcinoma; 18. 16 and 17
Table 1. Search strategy for the databases
The search terms used on the four databases, with each number being an individual search and an
italicized region representing a group of similar search terms referring to the same drug or disease.
‘OR’ & ‘AND’ were used to connect separate searches.
/ MeSH search term.
* Open-ended search.
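The way the numbered searches are combined can be sketched in a few lines of Python. This is only an illustration of the OR-within-group, AND-between-groups logic described above, not the query syntax of any of the four databases; the helper names `or_group` and `combine` are hypothetical, and only a subset of the listed terms is used.

```python
def or_group(terms):
    """Join individual search terms with OR, quoting multi-word terms."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def combine(drug_terms, disease_terms):
    """AND the drug group with the disease group (the final numbered step)."""
    return or_group(drug_terms) + " AND " + or_group(disease_terms)


# Illustrative subset of the terms listed in the search strategy table.
drugs = ["Ipilimumab", "Yervoy", "Nivolumab", "Opdivo", "Pembrolizumab", "Keytruda"]
diseases = ["Melanoma", "Malignant melanoma", "Skin neoplasm*"]

query = combine(drugs, diseases)
print(query)
```

Each database applies its own field tags and truncation rules, so a string built this way would need adapting before use.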
Appendix B – Participant data
Author | Mean age | Female (%) | ECOG performance status † – no. (%) | Metastasis stage – no. (%) | Lactate dehydrogenase levels – no. (%) | Previous systemic therapy
Hodi | 56.2 | 40.7 | 0 – 374 (55.3); 1 – 291 (43.0); 2 – 9 (1.3); 3 – 1 (0.1); Unknown – 1 (0.1) | M0 – 10 (1.5); M1a – 62 (9.2); M1b – 121 (17.9); M1c – 483 (71.4) | ≤ULN – 417 (61.7); >ULN – 254 (37.6); Unknown – 5 (0.7) | Yes (chemotherapy or IL-2)
Larkin | 60.0 | 35.4 | 0 – 692 (73.2); 1 – 251 (26.6); 2 – 1 (0.1); Not reported – 1 (0.1) | M0, M1a, M1b – 397 (42.0); M1c – 548 (58.0) | ≤ULN – 589 (62.3); >ULN – 341 (36.1); Unknown – 15 (1.6) | No
Postow | 65 * | 33.1 | 0 – 116 (81.7); 1 – 24 (16.9); ≥2 – 2 (1.4) | M0 – 13 (9.2); M1a – 23 (16.2); M1b – 39 (27.5); M1c – 65 (45.8); Not reported – 2 (1.4) | ≤ULN – 106 (74.6); >ULN – 35 (24.6); Unreported – 1 (0.7) | No
Ribas | 61.7 * | 39.4 | 0 – 295 (54.6); 1 – 243 (45.0); Not reported – 2 (0.4) | M0 – 4 (0.7); M1a – 37 (6.9); M1b – 54 (10.0); M1c – 445 (82.4) | Normal – 311 (57.6); ≥110% ULN – 218 (40.4); Unknown – 11 (2.0) | Yes (ipilimumab ± BRAF and/or MEK inhibitor and/or chemotherapy)
Robert (2011) | 56.9 | 40.0 | 0 – 356 (70.9); 1 – 146 (29.1) | M0 – 14 (2.8); M1a – 80 (15.9); M1b – 126 (25.1); M1c – 282 (56.2) | ≤ULN – 297 (59.2); >ULN – 203 (40.4); Unknown – 2 (0.4) | No ** (Adjuvant therapy – 133 (26.5))
Robert (2015) | 65.0 ‡ | 41.1 | 0 – 269 (64.4); 1 – 144 (34.4); 2 – 4 (1.0) | M0, M1a, M1b – 163 (39.0); M1c – 255 (61.0) | ≤ULN – 245 (58.6); >ULN – 153 (36.6); Not reported – 20 (4.8) | No ** (Adjuvant – 68 (16.3); Neoadjuvant – 2 (0.5))
Weber | 60.0 * | 35.6 | 0 – 246 (60.7); 1 – 158 (39.0); Not reported – 1 (0.2) | M1c – 305 (75.3); Not reported – 100 (24.7) | >ULN – 185 (45.7); Not reported – 220 (54.3) | Yes (ipilimumab ± BRAF inhibitor and/or chemotherapy)
Table 2. Baseline participant demographics and disease status
An overview of baseline participant data on demographics (mean age of
study population, and proportion of female participants), and disease status
(ECOG performance status, metastasis stage, and levels of lactate
dehydrogenase). Previous treatments in the study populations are also
shown in the last column.
* The median rather than the mean age is reported.
† Eastern Cooperative Oncology Group (ECOG) performance-status scores
range from 0 to 5, where 0 is no symptoms, 1 is symptomatic but
completely ambulatory, and 2 and 3 are symptomatic and in bed for <50% and
>50% of the day, respectively.
‡ The mean of each arm was manually calculated as no data were given for
the whole study population.
** Patients were previously untreated, except for a group of patients who had
received past adjuvant or neoadjuvant therapy.
Online supplementary material
Section A – Study quality assessment
The quality of all included studies was assessed using the 2010 CONSORT
checklist. For each item, the studies were scored as ‘done’, ‘not done’, or
‘not applicable’, as seen in the example for the Hodi study15, in Table 5
below. From this an overall score of the study quality was calculated based
on the equation:
$$\text{Score (\%)} = \frac{\text{items done}}{\text{total items} - \text{not applicable}} \times 100$$
The overall mean, minimum and maximum scores were calculated, and
items consistently done poorly were noted. The results were used to test for
the strength of correlation between study quality and primary efficacy
outcome in order to assess whether poor quality studies may have biased
the results of the meta-analysis, as described in the Results section.
Additionally, the effect that removing the poorest quality studies had on the
heterogeneity measure (I²) was determined.
Section/Topic | Item No | Checklist item | Reported on page No.
Title and abstract | 1a | Identification as a randomized trial in the title | X
Title and abstract | 1b | Structured summary of trial design, methods, results, and conclusions (for specific guidance see CONSORT for abstracts) | ✓ 711
Introduction
Background and objectives | 2a | Scientific background and explanation of rationale | ✓ 712
Background and objectives | 2b | Specific objectives or hypotheses | X
Methods
Trial design | 3a | Description of trial design (such as parallel, factorial) including allocation ratio | ✓ 713
Trial design | 3b | Important changes to methods after trial commencement (such as eligibility criteria), with reasons | N/A
Participants | 4a | Eligibility criteria for participants | ✓ 712
Participants | 4b | Settings and locations where the data were collected | ✓ 712
Interventions | 5 | The interventions for each group with sufficient details to allow replication, including how and when they were actually administered | ✓ 713
Outcomes | 6a | Completely defined pre-specified primary and secondary outcome measures, including how and when they were assessed | X
Outcomes | 6b | Any changes to trial outcomes after the trial commenced, with reasons | ✓ 713
Sample size | 7a | How sample size was determined | ✓ 714
Sample size | 7b | When applicable, explanation of any interim analyses and stopping guidelines | ✓ 714
Randomization:
Sequence generation | 8a | Method used to generate the random allocation sequence | X
Sequence generation | 8b | Type of randomization; details of any restriction (such as blocking and block size) | X
Allocation concealment mechanism | 9 | Mechanism used to implement the random allocation sequence (such as sequentially numbered containers), describing any steps taken to conceal the sequence until interventions were assigned | X
Implementation | 10 | Who generated the random allocation sequence, who enrolled participants, and who assigned participants to interventions | X
Blinding | 11a | If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how | X
Blinding | 11b | If relevant, description of the similarity of interventions | X
Statistical methods | 12a | Statistical methods used to compare groups for primary and secondary outcomes | ✓ 714
Statistical methods | 12b | Methods for additional analyses, such as subgroup analyses and adjusted analyses | ✓ 717
Results
Participant flow (a diagram is strongly recommended) | 13a | For each group, the numbers of participants who were randomly assigned, received intended treatment, and were analyzed for the primary outcome | ✓ 714
Participant flow (a diagram is strongly recommended) | 13b | For each group, losses and exclusions after randomization, together with reasons | X
Recruitment | 14a | Dates defining the periods of recruitment and follow-up | X
Recruitment | 14b | Why the trial ended or was stopped | N/A
Baseline data | 15 | A table showing baseline demographic and clinical characteristics for each group | ✓ 715
Numbers analyzed | 16 | For each group, number of participants (denominator) included in each analysis and whether the analysis was by original assigned groups | ✓ 714
Outcomes and estimation | 17a | For each primary and secondary outcome, results for each group, and the estimated effect size and its precision (such as 95% confidence interval) | ✓ 718
Outcomes and estimation | 17b | For binary outcomes, presentation of both absolute and relative effect sizes is recommended | ✓ 718
Ancillary analyses | 18 | Results of any other analyses performed, including subgroup analyses and adjusted analyses, distinguishing pre-specified from exploratory | ✓ 717
Harms | 19 | All important harms or unintended effects in each group (for specific guidance see CONSORT for harms) | ✓ 720
Discussion
Limitations | 20 | Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses | X
Generalizability | 21 | Generalizability (external validity, applicability) of the trial findings | ✓ 712
Interpretation | 22 | Interpretation consistent with results, balancing benefits and harms, and considering other relevant evidence | ✓ 721
Other information
Registration | 23 | Registration number and name of trial registry | ✓ 711
Protocol | 24 | Where the full trial protocol can be accessed, if available | ✓ 712
Funding | 25 | Sources of funding and other support (such as supply of drugs), role of funders | ✓ 721
Table 1. CONSORT checklist for Hodi (2010)
A sample 2010 CONSORT checklist as it appears for the Hodi study15. The 25 items are listed in the leftmost column, numbered,
sub-categorized (37 items in total), and described in the subsequent columns, with the final column on the right containing a
green tick and the page number if the criterion was met, a red cross if it was not met, or N/A if it was not applicable.
Author | Fulfilled | Not fulfilled | Not applicable | Score | Percentage
Hodi | 23 | 12 | 2 | 23/35 | 65.7%
Larkin | 22 | 13 | 2 | 22/35 | 62.9%
Postow | 19 | 15 | 3 | 19/34 | 55.9%
Ribas | 27 | 8 | 2 | 27/35 | 77.1%
Robert (2011) | 21 | 14 | 2 | 21/35 | 60.0%
Robert (2015) | 21 | 14 | 2 | 21/35 | 60.0%
Weber | 26 | 8 | 3 | 26/34 | 76.5%
Min. | | | | | 55.9%
Mean | | | | | 65.4%
Max. | | | | | 77.1%
Table 2. CONSORT study quality scores
The overall quality scores for each study, as a fraction and a percentage, are
listed in the final two columns, along with the number of criteria that were
fulfilled, not fulfilled, or not applicable. The minimum and maximum study
scores, as well as the overall mean for all seven studies, are listed in the
bottom right.
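As a worked check of the scoring equation from Section A, the percentages and summary statistics in this table can be recomputed from the fulfilled and not-applicable counts. The snippet below is a minimal sketch rather than the authors' own calculation; the function name `consort_score` is hypothetical, and the total of 37 sub-items is taken from the CONSORT checklist caption.

```python
from statistics import mean

TOTAL_ITEMS = 37  # sub-items in the 2010 CONSORT checklist, per the checklist caption


def consort_score(done, not_applicable, total_items=TOTAL_ITEMS):
    """Percentage of applicable checklist items that were fulfilled."""
    applicable = total_items - not_applicable
    if applicable <= 0:
        raise ValueError("at least one item must be applicable")
    return 100.0 * done / applicable


# (fulfilled, not applicable) counts as listed in the table above.
counts = {
    "Hodi": (23, 2),
    "Larkin": (22, 2),
    "Postow": (19, 3),
    "Ribas": (27, 2),
    "Robert (2011)": (21, 2),
    "Robert (2015)": (21, 2),
    "Weber": (26, 3),
}

scores = {study: consort_score(d, na) for study, (d, na) in counts.items()}
lowest, average, highest = min(scores.values()), mean(scores.values()), max(scores.values())
print(f"min {lowest:.1f}%, mean {average:.1f}%, max {highest:.1f}%")
```

Rounded to one decimal place, these values agree with the minimum, mean and maximum reported in the table.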
Section B – Raw data
Author | Median overall survival – months (95% CI) | Median progression-free survival – months (95% CI) | HR death (95% CI) | HR progression (95% CI)
Hodi | Ipi & gp100 – 10.0 (8.5 – 11.5); gp100 – 6.4 (5.5 – 8.7); Ipi – 10.1 (8.0 – 13.8) | Ipi & gp100 – 2.76 (2.73 – 2.79); gp100 – 2.76 (2.73 – 2.83); Ipi – 2.86 (2.76 – 3.02) | Ipi & gp100 vs. gp100 – 0.68 (0.55 – 0.85); Ipi vs. gp100 – 0.66 (0.51 – 0.87) | Ipi & gp100 vs. Ipi – 0.81; Ipi vs. gp100 – 0.64 †
Larkin | Data immature | Nivo & Ipi – 11.5 (8.9 – 16.7); Nivo & placebo – 6.9 (4.3 – 9.5); Ipi & placebo – 2.9 (2.8 – 3.4) | Data immature | Nivo & Ipi vs. Ipi – 0.42 (0.31 – 0.57); Nivo vs. Ipi – 0.57 (0.43 – 0.76); Nivo & Ipi vs. Nivo – 0.74 (0.60 – 0.92)
Postow | Not reported | Ipi & Nivo – Data immature; Ipi – 4.4 (2.8 – 5.7) | Not reported | 0.40 (0.23 – 0.68)
Ribas | Data immature | Pembro 2mg/kg – 5.4 (4.7 – 6.0); Pembro 10mg/kg – 5.8 (5.1 – 6.4); ICC – 3.6 (3.2 – 4.1) ‡§ | Data immature | Pembro 2mg/kg vs. chemotherapy – 0.57 (0.45 – 0.73); Pembro 10mg/kg vs. chemotherapy – 0.50 (0.39 – 0.64); Pembro 10mg/kg vs. Pembro 2mg/kg – 0.91 (0.71 – 1.16)
Robert (2011) | Ipi & Dacarb. – 11.2 (9.4 – 13.6); Dacarb. – 9.1 (7.8 – 10.5) | Not reported | 0.72 (0.59 – 0.87) | 0.76 (0.63 – 0.93)
Robert (2015) | Nivo – Data immature; Dacarb. – 10.8 (9.3 – 12.1) | Nivo – 5.1 (3.5 – 10.8); Dacarb. – 2.2 (2.1 – 2.4) | 0.42 (0.25 – 0.73) § | 0.43 (0.34 – 0.56)
Weber | Data immature | Nivo – 4.7 (2.3 – 6.5); ICC – 4.2 (2.1 – 6.3) ‡§ | Data immature | 0.82 (0.32 – 2.05) **
Table 3. Primary outcome raw data on survival
The raw data as reported in the included studies for median overall and
median progression-free survival in months, and hazard ratios for death
and progression, for each study and treatment arm. Ipi, nivo, pembro, and
dacarb are short for ipilimumab, nivolumab, pembrolizumab, and
dacarbazine, respectively.
* p <0.05
† p <0.001
‡ Data from only 182 / 405 patients was reported
§ 99.79% confidence intervals
** 99.99% confidence intervals
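Because the hazard ratios above are reported at different confidence levels (95%, 99.79% and 99.99%), they need to be put on a common scale before they can be pooled. One standard approach, shown here as a sketch and not necessarily the procedure used in this review, is to recover the log hazard ratio and its standard error from the reported interval, assuming the interval is symmetric on the log scale. The function name `log_hr_and_se` is hypothetical.

```python
import math
from statistics import NormalDist


def log_hr_and_se(hr, lower, upper, level=0.95):
    """Return (log HR, standard error) recovered from a reported confidence interval.

    Assumes the interval was built as log(HR) +/- z * SE on the log scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(hr), se


# Hodi: ipilimumab vs. gp100, HR for death 0.66 (95% CI 0.51 – 0.87)
log_hr_hodi, se_hodi = log_hr_and_se(0.66, 0.51, 0.87)

# Robert (2015): HR for death 0.42 (99.79% CI 0.25 – 0.73)
log_hr_robert, se_robert = log_hr_and_se(0.42, 0.25, 0.73, level=0.9979)

print(f"Hodi:          log HR = {log_hr_hodi:.3f}, SE = {se_hodi:.3f}")
print(f"Robert (2015): log HR = {log_hr_robert:.3f}, SE = {se_robert:.3f}")
```

The resulting log hazard ratios and standard errors are the inputs that generic inverse-variance meta-analysis routines expect.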
Author | BORR – No. objective responses / Total no. patients (%)
Hodi | Ipilimumab & gp100 – 23/403 (5.7%); gp100 – 2/136 (1.5%); Ipilimumab – 15/137 (10.9%)
Larkin | Combination – 181/314 (57.6%); Nivolumab – 138/316 (43.7%); Ipilimumab – 60/315 (19.0%)
Postow * | Ipilimumab & Nivolumab – 56/95 (58.9%); Ipilimumab – 5/47 (10.6%)
Ribas | Pembrolizumab 2mg/kg – 38/180 (21.1%); Pembrolizumab 10mg/kg – 46/181 (25.4%); ICC – 8/179 (4.5%)
Robert (2011) | Ipilimumab & Dacarbazine – 38/250 (15.2%); Dacarbazine – 26/252 (10.3%)
Robert (2015) | Nivolumab – 84/210 (40.0%); Dacarbazine – 29/208 (13.9%)
Weber | Nivolumab – 38/122 (31.1%); ICC – 5/60 (8.3%)
Table 4. Secondary outcome raw data on tumor response
The raw data as reported in the included studies for best overall response
rate (BORR), i.e., the number of patients in each study and treatment arm
who achieved an objective response (complete or partial response) as a
fraction of the total number of patients.
* Combined data for BRAF wild-type tumors and BRAF V600 mutation-
positive tumors, which were reported separately in the study.
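To show how response counts of this kind translate into an odds ratio, the sketch below computes an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval for a single two-arm trial. It illustrates the per-study quantity only; it is not the pooled estimate reported in the Results, and the function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted odds ratio (arm A vs. arm B) with a Woolf-type confidence interval.

    No continuity correction is applied, so every cell must be non-zero.
    """
    a, b = events_a, total_a - events_a  # responders / non-responders, arm A
    c, d = events_b, total_b - events_b  # responders / non-responders, arm B
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Robert (2015): nivolumab 84/210 vs. dacarbazine 29/208 objective responses
or_est, or_low, or_high = odds_ratio(84, 210, 29, 208)
print(f"OR = {or_est:.2f} ({or_low:.2f} – {or_high:.2f})")
```

Trials with three arms, such as Hodi and Larkin, would need a decision on which arms to compare before a function like this could be applied.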
Author | Discontinuations due to adverse events – No. events / Total (%) | Discontinuations due to treatment-related adverse events – No. events / Total (%)
Hodi * | Not reported | Not reported
Larkin | Not reported | Combination – 114/313 (36.4%); Nivolumab – 24/313 (7.7%); Ipilimumab – 46/311 (14.8%)
Postow | Not reported | Ipilimumab & Nivolumab – 44/94 (46.8%); Ipilimumab – 8/46 (17.4%)
Ribas | Pembrolizumab 2mg/kg – 21/178 (11.8%); Pembrolizumab 10mg/kg – 24/179 (13.4%); ICC – 18/171 (10.5%) | Pembrolizumab 2mg/kg – 4/178 (2.2%); Pembrolizumab 10mg/kg – 12/179 (6.7%); ICC – 10/171 (5.8%)
Robert (2011) | Not reported | Ipilimumab & Dacarbazine – 89/247 (36.0%); Dacarbazine – 10/251 (4.0%)
Robert (2015) | Nivolumab – 14/206 (6.8%); Dacarbazine – 24/205 (11.7%) | Not reported
Weber | Not reported | Nivolumab – 7/268 (2.6%); ICC – 7/102 (6.9%)
Table 5. Secondary outcome raw data on tolerability
The raw data as reported in the included studies for the secondary outcome
on tolerability, with rates of discontinuation due to adverse events, and
treatment-related adverse events for each study and treatment arm listed.
* Study reported neither tolerability endpoint and is thus not included in the
meta-analysis.
Section C – Risk of bias assessment
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "Patients were randomly assigned to one of three study groups"
Comment: Probably done.
Allocation concealment (selection bias)
Unclear risk "The Biostatistics group in Medarex will provide a centralized randomization list to Clinical
Operations using SAS procedure PROC PLAN. The randomization will be performed in two separate
stages using different block sizes for different treatment allocation ratios."
Comment: Unclear who performed randomization and what method was used.
Blinding of participants and personnel (performance bias)
Low risk "Placebo will be utilized for both MDX-010 and melanoma peptide vaccine. The melanoma peptide
vaccine (placebo and active) will be delivered via masked syringe by s.c. injection."
"All Study Site personnel, patients, and Medarex, Inc. personnel involved in the study...will remain
blinded to treatment assignment during the course of the study."
Comment: Probably done.
Blinding of outcome assessment (detection bias)
Low risk "Tumor responses were determined by the investigators with the use of modified WHO criteria to
evaluate bidimensionally measurable lesions."
"The IRC will be blinded to patient dosage group assignments…The IRC will be comprised of at
least 2 radiologists or oncologists experienced in tumor imaging and assessment."
Comment: IRC assessed tumor-response data. Other relevant personnel were also blinded.
Incomplete outcome data (attrition bias)
Unclear risk "Efficacy analyses were performed on the intention-to-treat population, which included all patients
who had undergone randomization (676 patients). The safety population included all patients who
had undergone randomization and who had received any amount of study drug (643 patients)"
"Of the 143 patients who could not be evaluated for a response, 33 patients did not receive any
study drug and 110 patients did not have baseline or week-12 tumor assessments (or both)"
Comment: 143 patients were not evaluated for BORR, of which 110 patients, without further
explanation, were said to lack data to compare with.
Selective reporting (reporting bias)
Unclear risk Comment: Changed the primary outcome from BORR to overall survival in January 2009. Reported
all specified outcomes.
Other bias Unclear risk "Funded by Medarex and Bristol-Myers Squibb"
"The trial was designed jointly by the senior academic authors and the sponsors, Medarex and
Bristol-Myers Squibb. Data were collected by the sponsors and analyzed in collaboration with the
senior academic authors."
Comment: Lead authors received consulting fees, grants, honoraria, and fees from BMS (patent
holders for Ipilimumab).
Table 6. Risk of bias data for Hodi (2010)
The primary risk of bias assessment for the Hodi study15, listing for each of the seven study domains firstly, the review
author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "Enrolled patients were randomly assigned"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "The randomization procedures will be carried out via permuted blocks within each stratum. The
exact procedures for using the IVRS will be detailed in the IVRS manual."
Comment: Probably done.
Blinding of participants and personnel (performance bias)
Low risk "For subjects who are receiving treatment and have not progressed, the Sponsor, subjects,
investigator and site staff will be blinded to the study drug administered"
"Each investigative site must assign an unblinded pharmacist/designee, and an unblinded site
monitor will be assigned by sponsor to provide oversight of drug supply and other unblinded study
documentation."
"The Sponsor’s central protocol team (including but not limited to clinical, statistics, data
management) will remain blinded."
Comment: Used placebo and blinded staff.
Blinding of outcome assessment (detection bias)
Unclear risk "Tumor assessments for ongoing study treatment decisions will be completed by the investigator
using RECIST (Response Evaluation Criteria in Solid Tumors) 1.1 criteria. Radiographic images will
be collected for independent radiological review committee tumor assessment."
Comment: Unclear whether the investigator remained blinded. No description of the independent
radiological review committee.
Incomplete outcome data (attrition bias)
Unclear risk Comment: BOR could not be determined in 78 patients, with no explanation as to why.
Selective reporting (reporting bias)
Low risk Comment: Reported specified outcomes with the exception of median overall survival (not mature)
and PD-L1 expression as a predictive marker of efficacy.
Other bias Unclear risk "Funded by Bristol-Myers Squibb"
"The trial was designed as a collaboration between the senior academic authors and the sponsor,
Bristol-Myers Squibb. Data were collected by the sponsor and analyzed in collaboration with all the
authors."
Comment: BMS holds the patent for Ipilimumab. Authors declared receiving funds, grants, and
honoraria from pharmaceutical industry, including BMS.
Table 7. Risk of bias data for Larkin (2015)
The primary risk of bias assessment for the Larkin study35, listing for each of the seven study domains firstly, the review
author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "We randomly assigned patients in a 2:1 ratio"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "Enrolled subjects that have met all eligibility criteria will be ready to be randomized through the
IVRS"
"The randomization procedures will be carried out via permuted blocks within each stratum."
Comment: Probably done.
Blinding of participants and personnel (performance bias)
Low risk "The Sponsor, subjects, investigator and site staff will be blinded to the study drug administered"
"Each investigative site must assign an unblinded pharmacist/designee, and an unblinded site
monitor will be assigned by sponsor to provide oversight of drug supply and other unblinded study
documentation."
"In the ipilimumab-monotherapy group, the same dosing schedule was used, except that nivolumab
was replaced with matched placebo"
Comment: Used placebo and blinded staff.
Blinding of outcome assessment (detection bias)
Unclear risk "The best overall response was assessed by the investigator with the use of the Response
Evaluation Criteria in Solid Tumors"
"An independent radiology review committee was established to provide a sensitivity assessment of
objective responses,"
Comment: Unclear whether the investigator remained blinded. No description of the independent
radiology review committee.
Incomplete outcome data (attrition bias)
Unclear risk Comment: BOR could not be determined in 18 patients, with no explanation as to why.
One patient's LDH level and one patient's history of brain metastasis were not recorded.
Selective reporting (reporting bias)
Low risk Comment: All endpoints reported.
Other bias Unclear risk "Data were collected by the sponsor, Bristol-Myers Squibb, and were analyzed in collaboration with
the authors."
Comment: Study funded by BMS (patent holders of Ipilimumab). Authors declared receiving funds,
grants, and honoraria from pharmaceutical industry, including BMS (patent-holders).
Table 8. Risk of bias data for Postow (2015)
The primary risk of bias assessment for the Postow study36, listing for each of the seven study domains firstly, the review
author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "We randomly assigned (1:1:1) patients in a block size of six"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "Block randomization with a block size of six in each stratum was used. After all screening
procedures were complete, a centralized interactive voice-response system with or without web
functionality was used to allocate patients to treatment."
Comment: Probably done.
Blinding of participants and personnel (performance bias)
High risk "Individual treatment assignment between pembrolizumab and chemotherapy was open label;
investigators and patients were masked to assignment to pembrolizumab dose. A designated
pharmacist at each site who was unmasked prepared the pembrolizumab dose so that it could be
administered to the patient in a masked fashion."
"The sponsor was masked to all treatment assignments in the statistical analyses, as well as
treatment-level analysis results."
Comment: No placebo used, and assignment to chemotherapy or pembrolizumab was open-label.
Blinding of outcome assessment (detection bias)
Unclear risk "All scans were evaluated by independent central review. The independent radiologists were
masked to treatment assignments, identifying patient characteristics, and investigator-assessed
findings."
Comment: Outcomes also assessed by investigator for "sensitivity analysis", but these results were
reported separately. No description of independent central review committee.
Incomplete outcome data (attrition bias)
Unclear risk Comment: BOR was not evaluable in 71 patients, a fraction of which was due to patients being
"withdrawn by investigator" with no further explanation. 16 patients discontinued their assigned
treatment due to "physician decision" with no further explanation.
Selective reporting (reporting bias)
Low risk Comment: All outcomes reported, with the exception of median overall survival (not mature), and
time from BOR to disease progression.
Other bias Unclear risk "Merck Sharp & Dohme, a subsidiary of Merck & Co, sponsored this study".
Comment: Pharmaceutical company also helped design study and collect data. Authors declared
receiving funds, grants, and honoraria from pharmaceutical industry, including Merck & Co who
holds the pembrolizumab patent.
Table 9. Risk of bias data for Ribas (2015)
The primary risk of bias assessment for the Ribas study34, listing for each of the seven study domains firstly, the review
author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "We randomly assigned 502 patients"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "To randomize an eligible patient, the unblinded pharmacist will call IVRS to obtain a treatment
assignment."
Comment: Used an interactive voice response system. No further information on design of stratum
or blocks.
Blinding of participants and personnel (performance bias)
Low risk "The Sponsor, CRO, patients, Investigator and site staff will be blinded to the ipilimumab dose (i.e.,
placebo or 10 mg/kg). The local pharmacists in addition to a pharmacy - based CRO monitor will be
unblinded. The DMC will also be unblinded to permit a real-time ongoing assessment of safety and
efficacy."
Comment: Placebo used, and all relevant personnel were blinded, but the presentation of the placebo
was not described.
Blinding of outcome assessment (detection bias)
Unclear risk "Tumor assessments were performed by the local investigator and by a central independent review
committee."
"All efficacy end points (except survival) were based on assessments performed by the independent
review committee, whose members were not aware of the treatment assignments."
"For the purpose of final analysis of study results, an IRC will review all images from all time points
for all patients and assess response parameters as specified."
Comment: Probably done, no description of who made up the IRC.
Incomplete outcome data (attrition bias)
Unclear risk Comment: 101/502 patients had their "response not evaluated" for BOR, due to some lacking a
baseline and/or follow-up scan. Two patients had unknown LDH levels.
Selective reporting (reporting bias)
Low risk Comment: Changed primary end point from progression-free survival to overall survival.
Reported all specified outcomes with the exception of time to a response.
Other bias Unclear risk "Funded by Bristol-Myers Squibb"
"The trial was designed jointly by the senior academic authors and the sponsor, Bristol-Myers
Squibb. Data were collected by the sponsor and analyzed in collaboration with the senior academic
authors"
Comment: BMS hold the patent for Ipilimumab. Authors declared receiving funds, grants, and
honoraria from pharmaceutical industry, including BMS.
Table 10. Risk of bias data for Robert (2011)
The primary risk of bias assessment for the Robert (2011) study21, listing for each of the seven study domains firstly, the
review author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "We randomly assigned 418 previously untreated patients"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "The subject number will be assigned through an interactive voice response system (IVRS)"
"The randomization procedures will be carried out via permuted blocks within each stratum."
Comment: Probably done.
Blinding of participants and personnel (performance bias)
Low risk "The Sponsor, subjects, investigator and site staff will be blinded to the study drug administered"
"Each investigative site must assign an unblinded pharmacist/designee, and an unblinded site
monitor will be assigned to provide oversight of drug supply and other unblinded study
documentation."
Comment: Used placebo, and blinded relevant personnel, but no description of the presentation of
the placebo.
Blinding of outcome assessment (detection bias)
High risk "The best overall response was assessed by the investigator with the use of the Response
Evaluation Criteria in Solid Tumors"
"The duration of investigator-assessed progression-free survival (PFS)"
Comment: No mention of an independent review committee, nor whether the investigators remained
masked during outcome assessment.
Incomplete outcome data (attrition bias)
Unclear risk Comment: BOR could not be determined in 54/418 patients, without further explanation as to why.
LDH and BRAF status not reported for 20 and 12 patients, respectively.
Selective reporting (reporting bias)
Low risk Comment: Reported all specified outcomes with the exception of median overall survival (not
mature) and PD-L1 expression as a predictive marker of efficacy.
Other bias
Unclear risk "Funded by Bristol-Myers Squibb"
"Data were collected by the sponsor, Bristol-Myers Squibb, and analyzed in collaboration with the
academic authors."
Comment: BMS hold the patent for Nivolumab. Authors declared receiving funds, grants, and
honoraria from pharmaceutical industry, including BMS.
Table 11. Risk of bias data for Robert (2015)
The primary risk of bias assessment for the Robert (2015) study29, listing for each of the seven study domains firstly, the
review author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Bias Author’s judgment Support for judgment
Random sequence generation (selection bias)
Low risk "Participating investigators randomly assigned (with an interactive voice response system) patients"
Comment: Probably done.
Allocation concealment (selection bias)
Low risk "We used permuted blocks (block size of six) within each stratum."
Comment: Used an IVRS.
Blinding of participants and personnel (performance bias)
High risk "Treatment was given open-label because of the choices available to the investigators in the ICC
group"
Comment: An open-label study.
Blinding of outcome assessment (detection bias)
Unclear risk "Tumor assessments were done centrally by radiologists on an independent review committee who
were masked to patients’ treatment assignments."
"Confirmed response by independent radiology review committee per Response Evaluation Criteria
in Solid Tumors"
Comment: No description of the IRC.
Incomplete outcome data (attrition bias)
Unclear risk Comment: Reports data from only 182/405 patients, the number of patients "who had been
randomized at the point of the first planned assessment of objective responses". In 23/167 they
were "unable to establish" BOR, due to the lack of a scan at 9 months, without any further explanation.
Comment: Unclear whether remaining data will be published.
Selective reporting (reporting bias)
Unclear risk Comment: Reported some specified outcomes but not all, specifically: PD-L1 expression as a
predictive biomarker for objective response, overall survival (not mature), and health-related quality
of life.
Other bias
Unclear risk "Funding Bristol-Myers Squibb"
"Data collected by the funder were analyzed in collaboration with all authors."
Comment: BMS hold the patent for Nivolumab. Authors declared receiving funds, grants, and
honoraria from pharmaceutical industry, including BMS.
Table 12. Risk of bias data for Weber (2015)
The primary risk of bias assessment for the Weber (2015) study22, listing for each of the seven study domains firstly, the review
author’s judgment of the overall risk of bias (low, unclear, or high risk of bias), and secondly, the support for judgment
consisting of extracts from the study or its supplementary material as well as comments made by the review author. An
unclear risk of bias was defined as a risk of bias that was greater than low, but not sufficient to be considered high.
Section D – Publication bias assessment
Figure 1. Funnel Plot for the secondary outcome analysis on tumor response
The funnel plot for the secondary outcome analysis on tumor response,
showing each study as a black circle, with the odds ratio for best overall
response rate (BORR) along the x-axis, and the standard error of the
natural log of the odds ratio on the y-axis. The smaller the SE (log [Odds
Ratio]), the more precise the result from that study is, meaning less
precise studies will be found closer to the x-axis. There is an even spread of
studies on either side of the vertical blue line representing the overall effect
estimate (OR = 4.48).
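As an illustration of the two quantities plotted in a funnel plot of this kind, the following minimal sketch computes the natural log of the odds ratio and its standard error for a single two-arm study from its 2x2 event counts. The function name, the 0.5 continuity correction for zero cells, and the example counts are assumptions made purely for illustration; they are not data from the included trials, and this is not necessarily the exact procedure used by the software that produced the figures.

```python
import math


def log_odds_ratio_with_se(events_trt, n_trt, events_ctl, n_ctl):
    """Return (log odds ratio, SE of log odds ratio) for one two-arm study.

    Hypothetical helper for illustration only.
    """
    a = events_trt              # treatment arm, with event
    b = n_trt - events_trt      # treatment arm, without event
    c = events_ctl              # control arm, with event
    d = n_ctl - events_ctl      # control arm, without event

    # Assumption: add 0.5 to every cell if any cell is zero, a common
    # convention that keeps the odds ratio and its SE finite.
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    log_or = math.log((a * d) / (b * c))
    # Standard large-sample formula: SE = sqrt(1/a + 1/b + 1/c + 1/d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


# Hypothetical counts: 40/100 responders on treatment, 10/100 on control.
log_or, se = log_odds_ratio_with_se(40, 100, 10, 100)
print(f"OR = {math.exp(log_or):.2f}, SE(log OR) = {se:.3f}")
```

Because every cell count enters the standard error as a reciprocal, larger studies yield smaller standard errors, which is why the standard error serves as the measure of precision on the y-axis.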
Figure 2. Funnel Plot for the secondary outcome analysis on tolerability
The funnel plot for the secondary outcome analysis on tolerability, showing
each study as a black circle, with the odds ratio for rates of discontinuations
due to adverse and treatment-related adverse events along the x-axis, and
the standard error of the natural log of the odds ratio on the y-axis. The
smaller the SE (log [Odds Ratio]), the more precise the result from that
study is, meaning less precise studies will be found closer to the x-axis.
There is an even spread of studies on either side of the vertical blue line
representing the overall effect estimate (OR = 1.63).