
Research Article

Performance Evaluation of MHC Class-I Binding Prediction Tools Based on an Experimentally Validated MHC–Peptide Binding Data Set
Maria Bonsack1,2,3, Stephanie Hoppe1,2,3, Jan Winter1,3, Diana Tichy4, Christine Zeller1, Marius D. Küpper1,3, Eva C. Schitter1,3, Renata Blatnik1,2,3, and Angelika B. Riemer1,2

Abstract

Knowing whether a protein can be processed and the resulting peptides presented by major histocompatibility complex (MHC) is highly important for immunotherapy design. MHC ligands can be predicted by in silico peptide–MHC class-I binding prediction algorithms. However, prediction performance differs considerably, depending on the selected algorithm, MHC class-I type, and peptide length. We evaluated the prediction performance of 13 algorithms based on binding affinity data of 8- to 11-mer peptides derived from the HPV16 E6 and E7 proteins to the most prevalent human leukocyte antigen (HLA) types. Peptides from high to low predicted binding likelihood were synthesized, and their HLA binding was experimentally verified by in vitro competitive binding assays. Based on the actual binding capacity of the peptides, the performance of prediction algorithms was analyzed by calculating receiver operating characteristics (ROC) and the area under the curve (AROC). No algorithm outperformed others, but different algorithms predicted best for particular HLA types and peptide lengths. The sensitivity, specificity, and accuracy of decision thresholds were calculated. Commonly used decision thresholds yielded only 40% sensitivity. To increase sensitivity, optimal thresholds were calculated, validated, and compared. In order to make maximal use of prediction algorithms available online, we developed MHCcombine, a web application that allows simultaneous querying and output combination of up to 13 prediction algorithms. Taken together, we provide here an evaluation of peptide–MHC class-I binding prediction tools and recommendations to increase prediction sensitivity to extend the number of potential epitopes applicable as targets for immunotherapy.

Introduction

Immunotherapy has emerged over the past decades to be a promising approach to personalize treatment of cancer patients. The key prerequisite of successful immunotherapy is a tumor-specific antigen that allows induction and focusing of an immune attack specifically against tumor cells. Such tumor-specific antigens could be either viral proteins, in the case of virus-driven malignancies, or mutation-derived neoantigens.

Not every possible antigenic peptide will be processed and presented on the cell surface by major histocompatibility complex (MHC) molecules, and not all MHC-presented peptides are immunogenic T-cell epitopes. Immunogenicity is dependent on several factors, including protein expression, antigen processing and transport, peptide–MHC binding affinity and stability of the resulting complex, peptide competition for MHC binding, as well as the T-cell receptor (TCR) repertoire (1). Several methods have been developed to identify and to predict T-cell epitopes from a given source protein. Based on in vitro assays, such as competitive binding assays and ELISpot assays, MHC-binding affinity and immunogenicity of peptides can be experimentally assessed (2, 3). However, given the large variety of antigens and MHC allotypes, in vitro testing of all possible candidates is not feasible (4). For this reason, various in silico algorithms have been developed to predict the peptides resulting from the different steps involved in epitope presentation (5–10). The prediction of the binding likelihood of a given peptide to a specific MHC molecule is based on identified binding preferences and anchor motifs of MHC molecules (11). One of the first computational methods to predict MHC binding is SYFPEITHI (12). Other algorithms for the prediction of MHC class-I binding use various statistics, machine learning approaches, and training data sets. They facilitate prescreening of antigens and are therefore an element of (neo)epitope identification and prioritization strategies used in studies and clinical trials in the field of personalized immunotherapies (13–19).

Mass spectrometry (MS) is used to analyze the repertoire of peptides presented on the cell surface by detecting ligands eluted from MHC complexes. In contrast to MHC-binding prediction, MS identification of MHC ligands naturally includes the potentially selective steps of antigen processing and transport (reviewed in refs. 20 and 21).

1German Cancer Research Center (DKFZ), Immunotherapy and Immunoprevention, Heidelberg, Germany. 2German Center for Infection Research (DZIF), Molecular Vaccine Design, partner site Heidelberg, Heidelberg, Germany. 3Faculty of Biosciences, Heidelberg University, Heidelberg, Germany. 4German Cancer Research Center (DKFZ), Division of Biostatistics, Heidelberg, Germany.

Note: Supplementary data for this article are available at Cancer Immunology Research Online (http://cancerimmunolres.aacrjournals.org/).

M. Bonsack and S. Hoppe contributed equally to this article.

S. Hoppe is currently employed at Bristol-Myers Squibb in Munich, Germany.

Corrected online June 6, 2019.

Corresponding Author: Angelika B. Riemer, German Cancer Research Center, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany. Phone: 49-6221-423820; Fax: 49-6221-423899; E-mail: [email protected]

doi: 10.1158/2326-6066.CIR-18-0584

©2019 American Association for Cancer Research.


New MHC class-I binding predictors have been trained on MS data, with the aim to improve prediction of ligands presented by MHC (22–25). Common thresholds for discriminating binders from nonbinders indicated by the algorithms are a half-maximal inhibitory concentration (IC50) of ≤50 nM for identifying strong binders ("strong binding affinity threshold"), ≤500 nM for identifying strong and weak binders ("intermediate binding affinity threshold"), and <5,000 nM for identifying any possible binder ("low binding affinity threshold"; ref. 26). A comparison of MS data and predicted HLA-binding affinity by Bassani-Sternberg and colleagues revealed that the most commonly used threshold for predicted binding affinity (IC50: 500 nM) is far too stringent for some HLA types (27).
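These cutoffs translate directly into a simple classification rule on predicted IC50 values. The following is a minimal sketch in R; the function name and class labels are illustrative and not taken from any of the cited tools.

```r
# Minimal sketch (illustrative labels) of applying the common IC50 decision
# thresholds: <=50 nM strong, <=500 nM intermediate, <=5,000 nM low affinity.
classify_by_ic50 <- function(ic50_nm) {
  cut(ic50_nm,
      breaks = c(-Inf, 50, 500, 5000, Inf),
      labels = c("strong binder", "weak binder", "low-affinity binder", "nonbinder"))
}

classify_by_ic50(c(12, 220, 3400, 25000))
```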

Predictors are regularly updated, and methods such as MHCflurry, MSIntrinsic, or MixMHCPred are being developed (23, 24, 28). This prompted us to conduct a performance evaluation of several widely used MHC class-I binding prediction methods available online. Based on experimentally assessed binding affinity data of 743 peptides for the seven major HLA class-I types, we calculated the sensitivity, specificity, and accuracy of the predictors depending on HLA type and peptide length (29). MHC-binding predictors have been benchmarked and reviewed previously (30–34). Weekly automated benchmarking is performed for new entries to the immune epitope database (IEDB) but often covers only a small sample of peptides and HLA types (35). Here, we calculated how different common decision thresholds affect the sensitivity, specificity, and accuracy of peptide–MHC prediction results and how sensitivity can be increased by using new individually calculated decision thresholds. Additionally, we developed MHCcombine, a web application for the combination of output from several predictors, to facilitate the simultaneous use and comparison of multiple MHC class-I prediction algorithms. Thereby, we provide the means to increase the number of true HLA ligands in the prediction output, and thus the number of potential T-cell epitopes considered as candidates for immunotherapy.

Materials and Methods

Study design

The objective of this study was to provide a performance evaluation of publicly available MHC ligand prediction algorithms, based on a newly generated experimental MHC–peptide binding data set that is independent of any algorithm training data. The experimental binding data set was generated in the context of a project on therapeutic HPV16 vaccine design. Thus, it includes peptides derived from the HPV16 proteins E6 and E7, binding to the five major HLA class-I supertypes, represented by seven HLA class-I alleles. The web application MHCcombine was developed to facilitate simultaneous querying of multiple prediction methods and systematic combination of output. Likelihood of binding to each of the seven HLA molecules was predicted by 13 algorithms, all listed in Supplementary Table S1: NetMHC 4.0 (36, 37), NetMHC 3.4 (38, 39), NetMHCpan 4.0 (22), NetMHCpan 3.0 (40, 41), NetMHCpan 2.8 (40, 42), NetMHCcons 1.1 (43), PickPocket 1.1 (44), IEDB recommended (45), IEDB consensus (46), IEDB SMMPMBEC (47, 48), IEDB SMM (49), MHCflurry 1.2 (28), and SYFPEITHI (12). Predictions were obtained for all 956 possible 8- to 11-mer peptides derived from E6 and E7. Based on the respective output format of the predictor, the decision thresholds indicated by the algorithms are either an IC50 ≤ 500 nM or a median percentile rank ≤ 2. In a first step, all peptides predicted to be binders within these thresholds were experimentally assessed for binding. As this resulted in the identification of binders beyond the threshold of individual algorithms, we extended the experimental binding analysis. Data collection was stopped when only nonbinders could be detected by experimental validation. For this reason, the experimental data are concentrated on the top list of predicted binding affinities, and sample sizes vary for each HLA type and peptide length. Each experimental affinity determination was carried out with at least three biological replicates for binders and two biological replicates for nonbinders. Positive and negative controls were included in each binding assay, and all data points from experiments with valid control results were included in the analysis. Overall, experimental binding affinity has been determined for 743 peptides in the context of a specific HLA type (Supplementary Table S2). Based on the experimental results, predictions were categorized into true positives, false positives, true negatives, and false negatives. To improve prediction sensitivity, we defined new threshold recommendations individually calculated for each prediction algorithm, HLA type, and peptide length. The performance of these new thresholds was cross-evaluated by bootstrapping (100× resampling) of the HPV data set, in order to generate results that are representative for the whole data population.

Development of the web-based tool "MHCcombine"

The web application MHCcombine was developed to allow simultaneous querying of multiple online available algorithms based on different servers for peptide–MHC binding predictions. The tool automatically combines the output of up to 12 selectable methods. It returns the combined output in the format of comma-separated values (.csv), which can be read by any text and spreadsheet editor. MHCcombine can be accessed online via http://mhccombine.dkfz.de/mhccombine/.
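Because the result is a plain .csv file, it can be filtered and ranked directly in downstream scripts. A minimal R sketch is shown below; the file name and the column names (allele, prediction) are assumptions for illustration, not the tool's documented output format, and need to be adapted to the actual header of the downloaded file.

```r
# Minimal sketch of post-processing an MHCcombine .csv result in R.
# "mhccombine_output.csv" and the column names used here are illustrative placeholders.
res <- read.csv("mhccombine_output.csv", stringsAsFactors = FALSE)

# e.g., keep predictions for one allele and rank peptides by predicted affinity
a2 <- res[res$allele == "HLA-A*02:01", ]
a2 <- a2[order(a2$prediction), ]
head(a2)
```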

Epitope prediction

For the prediction of HLA class-I ligands, 13 online accessible prediction algorithms were used (accessed March 23, 2018). This study included predictive methods based on artificial neural networks [ANN; NetMHC 4.0 (36, 37), NetMHC 3.4 (38, 39), NetMHCpan 4.0 (22), NetMHCpan 3.0 (40, 41), NetMHCpan 2.8 (40, 42), and MHCflurry 1.2 (28)], scoring matrices [PickPocket 1.1 (44)], stabilized matrix method (IEDB smm; ref. 49), smm with a peptide:MHC binding energy covariance matrix (IEDB smmpmbec; refs. 47, 48), and SYFPEITHI (12), and two consensus methods [NetMHCcons 1.1 (43), IEDB consensus (46)]. We further used the method IEDB recommended, which aims to use the best possible method for a given MHC molecule based on the availability of predictors and previously observed prediction performance [Consensus > NetMHC 4.0 > SMM > NetMHCpan 3.0 > CombLib (45)]. These algorithms were used to predict 8-, 9-, 10-, and 11-mer peptides derived from the model protein sequences HPV16 E6 and E7 (UniProtKB: P03126 and P03129, respectively) binding to the HLA class-I alleles A*01:01, A*02:01, A*03:01, A*11:01, A*24:02, B*07:02, and B*15:01. The output of the methods, the predicted likelihood of peptide–MHC binding, was expressed as predicted half-maximal inhibitory concentration (IC50), median percentile rank (by IEDB consensus and IEDB recommended), or score (by SYFPEITHI).


Initially, the threshold values for intermediate binding affinity indicated by the respective algorithms (IC50 ≤ 500 nM; median percentile rank ≤ 2) were applied to select peptides for synthesis and subsequent analysis. As we found binding peptides beyond the thresholds of individual algorithms (predicted to be binders by another algorithm), in a second step the thresholds were systematically lowered. This method allowed testing of weaker predicted binding affinities until only nonbinders could be detected experimentally.
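The 956 candidate peptides correspond to every 8- to 11-mer window of the two protein sequences. A minimal sketch of this enumeration in R is given below; the sequence strings are artificial stand-ins of the correct lengths (158 and 98 residues) and must be replaced by the actual UniProtKB entries P03126 and P03129.

```r
# Minimal sketch: enumerate all 8- to 11-mer windows of the E6 and E7 sequences.
kmers <- function(seq, k) {
  n <- nchar(seq)
  if (n < k) return(character(0))
  sapply(seq_len(n - k + 1), function(i) substr(seq, i, i + k - 1))
}

# Artificial stand-ins with the correct lengths; replace with the real
# HPV16 E6 (P03126, 158 aa) and E7 (P03129, 98 aa) sequences.
e6_seq <- substr(strrep("ACDEFGHIKLMNPQRSTVWY", 8), 1, 158)
e7_seq <- substr(strrep("ACDEFGHIKLMNPQRSTVWY", 5), 1, 98)

peptides <- unlist(lapply(c(e6_seq, e7_seq),
                          function(s) unlist(lapply(8:11, function(k) kmers(s, k)))))
length(peptides)   # 956, matching the number of candidate peptides in the study
```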

Synthesis of peptides

All synthetic peptides used in this study were produced with a purity of >95%. For the solid phase synthesis, the Fmoc strategy (50, 51) was used in a fully automated multiple synthesizer Syro II (MultiSyn Tech). The synthesis was carried out on preloaded Wang resins. As coupling agent, 2-(1H-Benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU) was used. The material was purified by preparative HPLC on a Kromasil 100-10C reverse phase column (10 μm, 120 Å, 20 × 150 mm) using an eluent of 0.1% trifluoroacetic acid in water (A) and 80% acetonitrile in water (B). The peptide was eluted with a successive linear gradient of 25% B to 80% B in 30 minutes at a flow rate of 10 mL/min. The fractions corresponding to the purified peptides were lyophilized. The purified material was characterized with analytical HPLC and MS (Thermo Finnigan LCQ). Peptides were dissolved in DMSO (Sigma) at 10 mg/mL and stored in small aliquots at −80°C.

Cell lines

The EBV-transformed B-lymphoblastic cell lines (B-LCLs) 1341-8346 (HLA-B*07:02), BSM (HLA-A*02:01, HLA-B*15:01), E481324 (HLA-A*01:01), EA (HLA-A*03:01), FH8 (HLA-A*11:01), LKT3 (HLA-A*24:02), and WT100BIS (HLA-A*11:01) were obtained from the International Histocompatibility Working Group Cell Bank (IHWG Cell Bank) in 2011 (BSM, EA, LKT3, WT100BIS), 2012 (FH8), and 2016 (1341-8346, E481324) and cultured in RPMI-1640 supplemented with 15% fetal bovine serum (both from Sigma), 1 mM sodium pyruvate, and 2 mM L-glutamine (both from Corning; B-LCL medium) under standard cell culture conditions. Cells were used for experiments from passages 2 to 12. Cell lines were regularly authenticated and confirmed to be free of Mycoplasma by SNP profiling and multiplex-PCR (latest in October 2018) by Multiplexion GmbH.

Competition-based peptide–HLA-binding assays

The binding affinity of synthesized test peptides to selected HLA class-I molecules was assessed in competition-based cellular binding assays as previously published (2, 52). These assays are based on the HLA class-I binding competition of a known high-affinity fluorescein-labeled reference peptide and the test peptide of interest. In brief, cells of a B-LCL with the desired HLA expression were stripped from naturally bound peptides by citric acid buffer treatment (0.263 mol/L citric acid and 0.123 mol/L Na2HPO4) with specific pH (pH 3.1 for HLA-A*02:01, HLA-A*11:01, HLA-A*24:02, and HLA-B*07:02; and pH 2.9 for HLA-A*03:01 and HLA-B*15:01). The cells were suspended at a concentration of 4 × 10⁵ cells/mL in the B-LCL medium containing 2 μg/mL β2-microglobulin (MP Biomedicals) to reconstitute the HLA class-I complex. The cells were transferred to a 96-well plate, and a mixture of 150 nmol/L fluorescein-labeled reference peptide and serially diluted test peptide was added. Each test peptide was analyzed at eight different concentrations, ranging from 100 μmol/L to 0.78 μmol/L, in a minimum of three independent experiments for binders and a minimum of two for nonbinders. Fluorescence was measured by flow cytometry (FACS Canto II or FACS Accuri; BD Biosciences) and interpreted with FlowJo V10 (FlowJo, LLC). Background and maximum fluorescence were determined based on cells without peptide and cells with fluorescein-labeled reference peptide only, respectively. For every test peptide concentration, the mean percentage of reference peptide inhibition was calculated relative to the maximum fluorescence. The test peptide concentration that inhibits 50% binding of the fluorescein-labeled reference peptide was determined by nonlinear regression analysis based on the following equation (formula A; SigmaPlot V13.0, Systat Software). This half-maximal inhibitory concentration (IC50) indicates the binding affinity. Peptides were classified as binders (IC50 ≤ 100 μM) or nonbinders (IC50 > 100 μM or nonlinear regression analysis not possible) according to ref. 52.

y = a + b / (1 + (x/c)^d)     (A)
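In formula A, the parameter c corresponds to the concentration at half-maximal inhibition, i.e., the IC50. A minimal R sketch of such a four-parameter fit with nls() is shown below; the concentration and inhibition values are illustrative examples, not measurements from this study.

```r
# Minimal sketch (illustrative data, base R) of fitting formula A,
# y = a + b / (1 + (x/c)^d), to percent-inhibition values and reading the
# half-maximal inhibitory concentration off the fitted midpoint parameter.
conc  <- c(0.78, 1.56, 3.13, 6.25, 12.5, 25, 50, 100)   # test peptide, umol/L
inhib <- c(5, 12, 24, 40, 58, 73, 85, 92)               # % inhibition of reference peptide

fit <- nls(inhib ~ a + b / (1 + (conc / ic50)^d),
           start = list(a = 0, b = 100, ic50 = 10, d = -1))

coef(fit)[["ic50"]]   # concentration inhibiting 50% of reference peptide binding
```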

Performance evaluation of prediction algorithms

As per experimentally determined binding affinity of peptides, predictions were assessed to be true (T) or false (F). This classification was used to generate receiver operating characteristic (ROC) curves for each analyzed method, HLA molecule, and peptide length (SigmaPlot V13.0). This curve reflects the rate of true positives (TPR = TP/P; true predicted binders/total binders) on the y-axis versus the rate of false positives (FPR = FP/N; nonbinders incorrectly predicted as binders/total nonbinders) on the x-axis over all possible thresholds. Each tested peptide corresponds to a single point in the ROC space with discrete values for sensitivity (=TPR) and 1 − specificity (=FPR). The capacity of each algorithm to discriminate between binders and nonbinders was analyzed by calculating the area under the ROC curve (AROC) as an estimate of prediction performance over the range of all possible decision thresholds. A good predictor should predict a high rate of true positives (TPR, sensitivity) and a low rate of false positives (FPR, 1 − specificity). The perfect ROC curve would have a coordinate of (0;1) and consequently an area under the ROC curve (AROC) of 1. The use of AROC values for the evaluation of machine learning algorithms is well established (53). According to Lin and colleagues, AROC values above 0.9 indicate excellent prediction capability, whereas values between 0.9 and 0.8 show intermediate prediction performance and values below 0.8 indicate poor prediction capability (54).
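A ROC curve of this kind can be computed directly from the ranked prediction scores and the experimental binder labels. The sketch below uses base R and illustrative values; since a lower predicted IC50 means a stronger predicted binder, peptides are ranked by increasing predicted IC50.

```r
# Minimal sketch (illustrative data) of building a ROC curve and its area
# (AROC) from predicted IC50 values and experimentally determined binder labels.
pred_ic50 <- c(12, 40, 180, 450, 900, 2500, 6000, 15000)   # predicted IC50 (nM)
is_binder <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

ord    <- order(pred_ic50)                      # strongest predicted binders first
labels <- is_binder[ord]
tpr <- c(0, cumsum(labels) / sum(labels))       # sensitivity at each possible threshold
fpr <- c(0, cumsum(!labels) / sum(!labels))     # 1 - specificity at each possible threshold

a_roc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)   # trapezoidal rule
a_roc
```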

According to the predictions using different binding affinity thresholds, peptides were categorized into predicted binders (positives, P) and predicted nonbinders (negatives, N) for each algorithm. Based on experimental validation, these predictions were categorized into true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). Different threshold values were evaluated by calculating the respective sensitivity, specificity, accuracy (formula B), and the positive predictive value (PPV, formula C):


accuracy = (TP + TN) / (P + N)     (B)

PPV = TP / (TP + FP)     (C)
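The counts entering formulas B and C come straight from the confusion matrix of predicted versus experimentally verified binders. A minimal R sketch, here evaluated with the HLA-A2 counts reported in the Results (25 TP, 27 FP, 81 TN, 21 FN):

```r
# Minimal sketch of the threshold-dependent performance measures used here:
# sensitivity, specificity, accuracy (formula B), and PPV (formula C).
confusion_metrics <- function(tp, fp, tn, fn) {
  c(sensitivity = tp / (tp + fn),                    # fraction of true binders recovered
    specificity = tn / (tn + fp),                    # fraction of nonbinders correctly rejected
    accuracy    = (tp + tn) / (tp + fp + tn + fn),   # formula B
    ppv         = tp / (tp + fp))                    # formula C
}

confusion_metrics(tp = 25, fp = 27, tn = 81, fn = 21)   # HLA-A2 example from the Results
```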

Statistical analysis

Calculation and validation of recommended decision thresholds. Based on our experimental binding affinity data, we calculated new threshold values for each analyzed prediction algorithm, HLA type, and peptide length. These recommended threshold values were selected based on the following criteria: (i) specificity ≥ 0.66 (equal to FPR ≤ 0.33), (ii) TPR ≥ 2 × FPR, and (iii) the threshold yielding the highest possible sensitivity within the limits defined in (i) and (ii). Lacking a validation data set (a second set of similarly obtained binding affinity data), we used a bootstrapping algorithm to statistically validate our recommended threshold values and their respective sensitivity, specificity, and accuracy. In brief, the data set was randomly split into 2/3 training data and 1/3 test data per HLA allele. Applying the mentioned criteria, the optimal threshold for each prediction method was calculated based on the training data. The calculated optimal threshold was applied to the test data, and sensitivity, specificity, and accuracy were calculated. This was repeated 100 times. From these 100 runs of resampling, the median optimal threshold and the bootstrap confidence intervals for sensitivity, specificity, and accuracy were calculated. To check the reliability of the validated threshold on arbitrary data sets, a second bootstrapping was performed. Thus, we performed 100 runs of resampling a set of one third of the size of the total data set. The mean optimal threshold from the first bootstrapping as well as the strong, intermediate, and low binding affinity thresholds indicated by the methods were applied to the 100 resampling sets. In each run and for each applied threshold, sensitivity, specificity, and accuracy were calculated. After 100 runs, the confidence intervals of sensitivity, specificity, and accuracy were calculated. Differences of mean sensitivity, specificity, and accuracy of applied thresholds were compared by one-way ANOVA for repeated measures followed by the Dunnett multiple comparisons test.
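A simplified R sketch of this procedure is given below, with simulated predicted IC50 values standing in for the real prediction output of one method and one HLA type; the criteria-based cutoff is selected on each 2/3 training split and evaluated on the held-out third. It only illustrates the resampling core; the full analysis additionally compares thresholds as described above (R script available via the link in the Software section).

```r
# Minimal sketch (simulated data, simplified logic) of criteria-based threshold
# selection and its bootstrap validation.
set.seed(1)
pred_ic50 <- 10^runif(90, 1, 4.3)                    # simulated predicted IC50 (nM)
binder    <- runif(90) < 1 / (1 + pred_ic50 / 1000)  # binding more likely at low IC50

criteria_threshold <- function(score, is_binder) {
  best <- NA
  for (t in sort(unique(score))) {                   # ascending: larger cutoff = higher sensitivity
    pred <- score <= t
    tpr  <- sum(pred & is_binder) / sum(is_binder)
    fpr  <- sum(pred & !is_binder) / sum(!is_binder)
    if (fpr <= 0.33 && tpr >= 2 * fpr) best <- t     # criteria (i) and (ii); keep most tolerant cutoff
  }
  best
}

thrs <- sens <- spec <- rep(NA_real_, 100)
for (i in 1:100) {
  train   <- sample(seq_along(pred_ic50), size = round(2 / 3 * length(pred_ic50)))
  thrs[i] <- criteria_threshold(pred_ic50[train], binder[train])
  if (is.na(thrs[i])) next                           # no qualifying cutoff in this resample
  pred    <- pred_ic50[-train] <= thrs[i]
  sens[i] <- sum(pred & binder[-train]) / sum(binder[-train])
  spec[i] <- sum(!pred & !binder[-train]) / sum(!binder[-train])
}

median(thrs, na.rm = TRUE)                           # "validated" threshold
c(sensitivity = median(sens, na.rm = TRUE),
  specificity = median(spec, na.rm = TRUE))          # bootstrap medians on the test splits
```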

Comparison of criteria-based and bootstrapping-validated thresholds. Because of limited sample sizes for individual peptide lengths, the validation of recommended thresholds was only meaningful for the analysis of pooled peptide lengths. To show that prediction accuracies obtained by criteria-based thresholds are representative for the whole statistical population, they were compared with the accuracy obtained by validated thresholds by performing two-tailed Student t tests (significance, P < 0.05).
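A minimal sketch of this comparison in R, with illustrative per-predictor accuracy values rather than the study's results:

```r
# Minimal sketch (illustrative accuracies) of the two-tailed Student t test used
# to compare accuracies obtained with criteria-based versus validated thresholds.
acc_criteria  <- c(0.78, 0.81, 0.75, 0.80, 0.77, 0.79, 0.82)
acc_validated <- c(0.76, 0.80, 0.74, 0.81, 0.78, 0.77, 0.83)
t.test(acc_criteria, acc_validated, var.equal = TRUE)   # two-tailed; significance at P < 0.05
```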

Comparison of sequence motifs of the HPV16 E6/E7 peptide data set and known HLA-binding motifs. Peptides of the HPV data set were sorted according to HLA type and peptide length. Any peptide with predictions within the general thresholds of IC50 ≤ 500 nM or a median percentile rank ≤ 2 was considered "predicted." All peptides with experimentally determined actual binding affinity were considered "tested." All tested peptides with HLA-specific binding were considered "binders." Sequence motifs of these peptide sets were generated using the Seq2Logo 2.0 (55) webtool with default settings [Kullback–Leibler logo type, Hobohm1 clustering method with threshold 0.63, amino acid illustration: red, negatively charged side chains (D, E); green, polar uncharged side chains (N, Q, S, G, T, Y); blue, positively charged side chains (R, K, H); black, others (C, U, P, A, V, I, L, M, F, W)]. These motifs were compared with motifs generated from HLA-type– and length-specific human epitope entries of linear peptides from the IEDB (56).

Software. Statistical testing was performed using GraphPad Prism Version 5.04 (GraphPad Software). Bootstrap procedures for calculating optimal thresholds and validation of respective sensitivity, specificity, and accuracy have been performed using R, version 3.4.3. The R script is available via https://github.com/DKFZ-F130/cross-validation.

Results

HLA class-I ligand prediction does not match experimentally validated binding capacity

To exploit the individual strengths of the various in silico methods, 13 prediction algorithms were used for HLA class-I binding predictions of peptides derived from the HPV16 proteins E6 and E7. The prediction methods are listed in Supplementary Table S1. We predicted binding affinity of all 956 possible 8-, 9-, 10-, and 11-mer peptides derived from these proteins to each of the major HLA class-I types A1, A2, A3, A11, A24, B7, and B15, represented by the HLA-A*01:01, A*02:01, A*03:01, A*11:01, A*24:02, B*07:02, and B*15:01 alleles. To facilitate querying of all the algorithms and to systematically combine the resulting prediction output, we developed a web application, MHCcombine.

The algorithms did not uniformly predict the same peptides to be binders. Prediction resulted in different values for each peptide, including some predicted binding affinities beyond the threshold. Next, the actual binding affinity of the peptides was verified in vitro in competitive binding assays (2). Not all peptides that were predicted to be binders in silico bound experimentally and vice versa. This justified a systematic lowering of thresholds individually for each prediction method and HLA type to increase the number of binders confirmed by experimental binding affinity assessment. We iterated this procedure until no more binders were detected. Finally, the data set comprised 743 peptide–HLA-binding assessments tested in at least two independent experiments: 44 for HLA-A1, 154 for HLA-A2, 105 for HLA-A3, 135 for HLA-A11, 128 for HLA-A24, 52 for HLA-B7, and 125 for HLA-B15 (Fig. 1; Supplementary Table S2). Sequence motifs of predicted, tested, and binding peptides of the data set resemble known motifs of HLA-type–specific epitopes included in the IEDB (Supplementary Table S3).

Applying the respective default thresholds to the results of all used predictors, 242 peptides were predicted to be binders. Experimental assessment identified 278 true binders. The predicted and experimentally verified binding only partially overlapped (Fig. 1). For example, 52 peptides were predicted to be binders to HLA-A2 (TP + FP). Of these positive predictions, 25 were true and 27 false. Additionally, 21 actual binders were falsely predicted to be nonbinders (FNs), and 81 nonbinders were TNs that were predicted correctly. As we stopped the experimental binding assessment when no more binders were detected, it is fair to assume that the remaining peptides predicted as nonbinders (with even lower predicted likelihood of binding than the tested peptides) are TNs.


The following percentages are therefore only given for positive predictions, as all of these were experimentally assessed. Across all analyzed HLA types, we found that only 151 of 242 (62%) positive predictions matched with actual binding. Additionally, 127 of 278 (46%) true binders were not predicted within the given thresholds. This pronounced disparity between binding prediction and experimental assessment prompted us to conduct an analysis of the prediction performance of the used algorithms based on our HPV16 E6/E7 data set.

Prediction algorithms depend on HLA type and peptide length to discriminate binders from nonbinders

To assess the predictive performance of the individual algorithms, predictions were sorted according to the predicted binding likelihood and classified by the experimental binding result. By considering each verified binder as a positive event and each nonbinder as a negative event, performance can be analyzed in ROC curves, and the predictive strength of an algorithm independent of a threshold can be evaluated (Supplementary Fig. S1A).

ROC curves for 12 predictors are shown in Fig. 2 for HLA-A2, A3, and A24, and in Supplementary Fig. S1B for HLA-A1, A11, B7, and B15. None of the methods perfectly discriminated binders from nonbinders. Considering all peptides, regardless of their length, the 12 algorithms analyzed in these two figures differed only slightly in their prediction performance. However, differences became more pronounced when analyzing every peptide length and HLA type separately. Performances for 9- and 10-mer peptides were still very similar among predictors. They performed better than for the pooled lengths, and for HLA-A3, they reached excellent AROC values of >0.9. For 11-mers, prediction performance was dependent on the HLA molecule. Prediction of 11-mers binding to A24 yielded similar performance results between predictors, but only intermediate AROC values of approximately 0.8. The poorest 11-mer prediction performance was observed for HLA-A3, for which AROC values dropped below 0.5, which equals a random assignment of binders and nonbinders. For HLA-A2, 11-mer peptides were discriminated poorly, with considerable performance differences between methods (AROC 0.59–0.82). The analysis of 8-mer peptides showed the most pronounced differences in method performance. Here, AROC values ranged from 0.17 (IEDB SMM for HLA-A24) to 0.91 (NetMHC 4.0 for HLA-A3). Similar observations could be made for the other analyzed HLA types.

Not all algorithms allow every peptide prediction. For example, IEDB SMM and IEDB SMMPMBEC cannot predict 8- and 11-mer peptides for HLA-B15. The 13th analyzed predictor, SYFPEITHI, only allows prediction of 9- and 10-mer peptides for most HLA types. Prediction of 11-mers by SYFPEITHI is possible only for HLA-A1. Further, it does not offer prediction for HLA-B15. Because of this limitation in data, a scoring system that is different from all other analyzed predictors, and an unclear definition of predicted binders and nonbinders, SYFPEITHI was analyzed separately (Supplementary Fig. S2). AROC values were found to be below 0.8 for all HLA types and peptide lengths, indicating poorer performance than more regularly updated prediction algorithms.

In general, ANN-based pan-specific algorithms showed the best prediction performance. The AROC values of the NetMHC family were always among the highest. In contrast, IEDB SMM and IEDB SMMPMBEC could be found among the poorly performing predictors for most of the analyzed settings. Surprisingly, the newest predictors, NetMHCpan 4.0 and MHCflurry 1.2, were not able to distinctly outperform other methods. Indeed, no single algorithm performed outstandingly well. Thus, we recommend always choosing the most suitable algorithm for the specific HLA type and peptide length in question.

Commonly used decision thresholds result in low prediction sensitivity

As outlined above, we observed that actual binders could still be found beyond the commonly used binding affinity thresholds indicated by the prediction methods. Therefore, we were interested in the predictors' performance when different decision thresholds were applied. We compared the accuracy, defined as the ratio of all true predictions (TP + TN) over all data points, for indicated thresholds predicting for strong (IC50 ≤ 50 nM; percentile rank ≤ 0.5), intermediate (IC50 ≤ 500 nM; percentile rank ≤ 2), and low binding affinity (IC50 ≤ 5,000 nM). Figure 3A shows the accuracies for predictions to HLA-A2, A3, and A24 for pooled and single peptide lengths from 8- to 11-mers; Supplementary Fig. S3A shows the same analysis for HLA-A1, A11, B7, and B15. The highest accuracy that can be achieved is 1.0.

Figure 1.

Comparison of in silico–predicted and in vitro–verified HLA binding for peptides derived from HPV16 E6 and E7 proteins. The ligands to indicated HLA molecules were predicted using 13 algorithms and algorithm-indicated thresholds of either IC50 ≤ 500 nM or median percentile rank ≤ 2 and verified by in vitro competitive binding assays. Bar size represents the tested peptides per HLA type. Background color of bars indicates verified binders (black) and nonbinders (white). Checkerboard pattern highlights predicted positives (P) returned by any method. Fractions without pattern were predicted negative (N) by all methods. Experimental assessment categorized the predictions into true (T) or false (F). The table below summarizes the results. The total number of true binders (278) and predicted binders (242) is indicated to the right of the table (FN, false negatives; TP, true positives; FP, false positives; TN, true negatives).


For pooled peptide lengths, accuracy for the different thresholds varies between 0.50 and 0.85. Here, the predictors showed very similar accuracy for the same threshold. Differences between predictors again became more pronounced when single peptide lengths were analyzed. Overall, the strong binding affinity threshold resulted in the most stable accuracy across different predictors, but also resulted in the lowest accuracy values. For example, using the strong binding affinity threshold for the prediction of 8-mer binders to HLA-A24 yielded the lowest accuracy (0.24) among all predictors.


Figure 2.

Performance evaluation of HLA-binding prediction algorithms: the capacity to separate HLA-A2, A3, and A24 binders from nonbinders. Prediction performance of indicated prediction methods was analyzed by ROC curves and area under the curve (AROC). The TPR was plotted against the FPR to generate ROC curves. AROC values are shown in bar graphs. Analysis was performed for 8- to 11-mer peptides (pooled and individually) binding to HLA-A2, A3, and A24. Sample size (n) is indicated in the bottom right corner of each ROC plot.



Figure 3.

Decision threshold–dependent performance analysis of indicated HLA-binding prediction methods for HLA-A2, A3, and A24. Prediction performance was assessed at strong (IC50 ≤ 50 nM or percentile rank ≤ 0.5), intermediate ("inter," IC50 ≤ 500 nM or percentile rank ≤ 2), and low (IC50 ≤ 5,000 nM) binding affinity thresholds. For predictors that did not indicate a low binding affinity threshold, data points are blank. A, Accuracy is shown for 8- to 11-mer peptides pooled and individually. B, Applying the indicated thresholds, specificity (spec) and sensitivity (sens) of predictors are visualized as pie slices in shades of gray for 8- to 11-mers pooled.


Applying the intermediate binding affinity threshold showed slightly better accuracies in the pooled length analysis, and some very high accuracy values in single length analysis. We could not observe any major improvement in accuracy for a specific peptide length, but HLA type–dependent differences. The highest variance in accuracy across methods was observed for the low binding affinity threshold. Using this threshold, accuracy dropped dramatically for single prediction methods such as PickPocket 1.1, IEDB SMM, and IEDB SMMPMBEC. For A24, the low binding affinity threshold performed with superior accuracy compared with the intermediate and strong binding affinity thresholds. In contrast, for A2 and A3, the low binding affinity threshold generally performed least accurately. Similar observations could be made for the other analyzed HLA types. The more tolerant thresholds improved prediction performance for the HLA types A1, A24, B7, and B15 and worsened prediction performance for A2, A3, and A11 (Fig. 3A; Supplementary Fig. S3A).

As accuracy considers all true predictions, it may mask a low capability to predict TPs by correctly predicting a high number of TNs. We further analyzed the threshold-dependent prediction performance of methods by calculating sensitivity and specificity. The threshold-dependent sensitivity and specificity of binding predictions to HLA-A2, A3, and A24 are shown for pooled 8-mer to 11-mer peptides in Fig. 3B, and for HLA-A1, A11, B7, and B15 in Supplementary Fig. S3B. As expected, specificity was found to be highest and sensitivity lowest for the most stringent strong binding affinity threshold (column "strong" in the indicated figures). Specificity decreases and sensitivity increases naturally when using more tolerant thresholds (columns "inter" and "low" in the indicated figures). For HLA-A2, applying the commonly used intermediate binding affinity threshold of IC50 ≤ 500 nM resulted in a maximum sensitivity of only 0.39 (for PickPocket 1.1 and IEDB SMMPMBEC) with a coinciding minor reduction of specificity (Fig. 3B, panel HLA-A2, column "inter"). Using the low binding affinity threshold increased the sensitivity to >0.54 (Fig. 3B, panel HLA-A2, column "low"). For predictions by PickPocket 1.1, the IC50 ≤ 5,000 nM threshold increased sensitivity to 0.93, but this was accompanied by a pronounced reduction of specificity to 0.36. This trend could be observed for all predictors, however to a lesser extent (Fig. 3B; Supplementary Fig. S3B, columns "low").

For HLA-A3, using the intermediate binding affinity threshold resulted in highest sensitivity for IEDB consensus (0.52) and IEDB recommended (0.48; Fig. 3B, panel HLA-A3, column "inter"). These methods return a percentile rank as prediction score and do not indicate a third low binding affinity threshold. For HLA-A24, A1, B7, and B15, IEDB recommended and IEDB consensus showed the highest sensitivity across tested methods when the intermediate binding affinity threshold was applied (Fig. 3B; Supplementary Fig. S3B, columns "inter").

Recommended individual decision thresholds increase prediction sensitivity

For projects aimed at finding all possible HLA-binding peptides from a given protein, the gain in sensitivity by using more tolerant decision thresholds is attractive if it can be balanced against decreased specificity. However, our analysis showed that it is not generally favorable to use the low binding affinity thresholds indicated by the prediction algorithms. We calculated new threshold recommendations individually for each predictor, HLA type, and peptide length. Our recommendations are based on the following criteria: (i) a minimum specificity of 0.66 (equal to a maximum FPR of 0.33), (ii) prediction of at least twice as many TPs as false positives, and (iii) calculation of the threshold yielding the highest possible sensitivity within the limits defined in (i) and (ii). The application of these criteria is shown for HLA-A2 and two selected predictors (NetMHC 4.0 and IEDB SMM) in Fig. 4A. For predictors showing high AROC values (e.g., for ANN methods predicting for HLA-A2 binding), applying these criteria often resulted in thresholds even more tolerant than the low binding affinity thresholds indicated by the algorithms. However, for the more poorly performing methods (e.g., IEDB SMM predicting for HLA-A2 binding), threshold recommendations were often found between the intermediate and low-affinity thresholds. For single-length predictions, all threshold recommendations and associated performance measures (AROC, PPV, specificity, and sensitivity) are listed in Table 1.

In order to recommend thresholds applicable to any data set, we calculated criterion-based threshold recommendations for every HLA type and for every predictor evaluated in this study, based on bootstrapping of the HPV data set. This method resamples the HPV data set multiple times to calculate the recommended threshold that best represents the entire statistical population. The median criterion-based threshold of 100 bootstraps was termed the "validated threshold." In a second bootstrapping, the sensitivity, specificity, and accuracy associated with the validated threshold were compared with the intermediate and low binding affinity thresholds indicated by the prediction algorithms. Results for HLA-A2 are shown in Fig. 4B; results for all other analyzed HLA types are shown in Supplementary Fig. S4. For HLA-A2, A3, A11, and A24, all predictors yielded a significant and relevant (change by ≥0.1) increase in sensitivity, using the respective bootstrapping-validated threshold in comparison with the intermediate affinity threshold. This was also true for the majority of prediction methods for HLA-A1, B7, and B15. For predictions to HLA-A2, A1, A11, and A24, sensitivity of most predictors could be significantly increased applying the bootstrapping-validated threshold compared with the low binding affinity threshold. For all HLA types, some predictors, such as IEDB SMMPMBEC and IEDB SMM, showed decreased sensitivity when respective bootstrapping-validated thresholds were applied. In this case, the respective validated threshold was found to be more stringent than the low binding affinity threshold. As expected, when sensitivity was lost, concomitant specificity was gained and vice versa. However, in almost all cases the absolute gain outweighed the loss. Therefore, for HLA-A2, accuracy was relevantly increased (validated vs. low binding affinity threshold for PickPocket 1.1, IEDB SMMPMBEC, and IEDB SMM), not significantly changed (for validated vs. intermediate binding affinity threshold for PickPocket 1.1 and IEDB recommended, for validated versus low binding affinity threshold for NetMHC 4.0, NetMHC 3.4, and MHCflurry 1.2, both for NetMHCpan 2.8 and NetMHCcons 1.1), or reduced only to a minor extent (Fig. 4B). In the HLA types A24, B7, and B15, use of the validated threshold led to relevant increases in accuracy (Supplementary Fig. S4). Accuracy changes for HLA-A11 were either not significant or less relevant (<0.1). Comparing the validated threshold with the low binding affinity threshold, most methods showed an increase in accuracy for HLA-A3. For HLA-A1, changes in resulting accuracy varied.


Criteria-based thresholds match well with bootstrapping-validated thresholds

Due to the sample size of our experimental data set, this analysis was performed only for pooled peptide lengths. To investigate how well the performance calculated for the HPV data set sample represents the true performance of predictors, we directly compared the respective accuracies of criteria-based and bootstrapping-validated thresholds for each predictor. Results for


Figure 4.

Assessment of prediction performance using thresholds individually recommended for HLA-A2 binding predictions versus general intermediate and low binding affinity thresholds. General thresholds for predicting intermediate (IC50 ≤ 500 nM or percentile rank ≤ 2) and low (IC50 ≤ 5,000 nM) HLA-binding affinity were compared with individually recommended thresholds specific for each predictor and peptide length. A, In ROC curves of NetMHC 4.0 and IEDB SMM, general and recommended thresholds are presented. The red area indicates the possible location of the recommended threshold, based on the criteria FPR ≤ 0.33 (specificity ≥ 0.66) and TPR ≥ 2 × FPR, shown as red lines. Within these, the threshold resulting in the highest sensitivity (highest TPR) was selected. Colored arrows point on the last data point within the respective threshold indicated in the figure legend (downward for NetMHC 4.0 and upward for IEDB SMM). Sample size (n) is indicated in the bottom right corner of each ROC plot. B, Sensitivity, specificity, and accuracy of the recommended thresholds were validated by bootstrapping and compared with the generally used intermediate and low binding affinity thresholds in a second bootstrapping. Box plots show quartiles, and whiskers indicate the 95% interval of data. Thresholds are indicated on the y-axis. One-way ANOVA followed by the Dunnett multiple comparisons test (significance, P < 0.05) was performed to determine the significance of indicated differences of the means. ***, P < 0.001; **, P < 0.01; ns, not significant.


Table 1. Summary of HLA type– and peptide length–dependent performance indicators and threshold recommendations for the analyzed prediction methods

For each HLA type, the table reports per predictor (c) the AROC, the recommended threshold with its PPV [%], and the specificity and sensitivity at that threshold, separately for 8-, 9-, 10-, and 11-mers (a) and for pooled lengths (b, validated threshold). The predictors analyzed are NetMHC 4.0, NetMHC 3.4, NetMHCpan 4.0, NetMHCpan 3.0, NetMHCpan 2.8, NetMHCcons 1.1, PickPocket 1.1, IEDB recommended, IEDB consensus, IEDB SMMPMBEC, IEDB SMM, and MHCflurry 1.2. Sample sizes per HLA type (8-, 9-, 10-, 11-mers; pooled lengths): HLA-A1 (6, 12, 10, 16; 44), HLA-A2 (21, 38, 46, 49; 154), HLA-A3 (13, 34, 35, 23; 105), HLA-A11 (24, 40, 33, 38; 135), HLA-A24 (17, 36, 46, 29; 128), HLA-B7 (11, 13, 15, 13; 52), HLA-B15 (19, 33, 48, 25; 125).

a For single-length predictions, analysis was performed for criterion-defined thresholds.
b For pooled-length predictions, analysis was performed for bootstrapping-validated thresholds. Values of sensitivity and specificity are given as mean ± SD of 100 resampling runs.
c Units of thresholds are dependent on predictor output. IEDB recommended and IEDB consensus express results as percentile rank; all other methods predict the half-maximal inhibitory concentration (IC50) in nM.
AROC: area under the receiver operating characteristic curve (maximum 1). PPV [%]: positive predictive value given in %. Gray background: no possible threshold fit into the criteria of FPR ≤ 0.33 and TPR ≥ 2 × FPR; analysis was performed for the peptide binding affinity of the binding peptide resulting in lowest FPR and highest TPR. X: prediction not available by this predictor. Italic font: absolute number of analyzed peptides differs.
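The AROC and PPV columns of Table 1 can be recomputed from the raw binding labels and predictor scores with standard ROC utilities. A minimal sketch, assuming IC50 or percentile-rank output where lower values mean stronger predicted binding:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def area_under_roc(is_binder, score, lower_is_better=True):
    """A_ROC for one predictor, HLA type, and peptide length."""
    s = np.asarray(score, dtype=float)
    return roc_auc_score(np.asarray(is_binder, dtype=int), -s if lower_is_better else s)

def ppv_percent(is_binder, score, threshold, lower_is_better=True):
    """PPV [%]: fraction of peptides predicted as binders at the threshold that actually bind."""
    s = np.asarray(score, dtype=float)
    called = s <= threshold if lower_is_better else s >= threshold
    truth = np.asarray(is_binder, dtype=bool)
    return 100.0 * np.sum(called & truth) / max(int(np.sum(called)), 1)
```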


Results for HLA-A2, A3, and A24 are shown in Fig. 5, and for HLA-A1, A11, B7, and B15 in Supplementary Fig. S5. In most cases, criterion-based recommendation and bootstrapping validation returned the very same thresholds; in these cases, accuracy did not differ significantly. A minority of threshold pairs varied from each other. The greater the difference between the two threshold values, the greater was the difference between associated accuracies. However, even when validation calculated a different threshold, the accuracy did not necessarily differ. This was true for, e.g., IEDB SMMPMBEC predicting HLA-A2 affinity and NetMHC 4.0, NetMHCpan 4.0, and IEDB SMM predicting HLA-A24 affinity.
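The accuracy comparison shown in Fig. 5 can be sketched as below, assuming the per-resample accuracies for the two thresholds have already been collected; this is an illustration, not the authors' published R script.

```python
import numpy as np
from scipy import stats

def compare_threshold_accuracies(acc_criteria, acc_validated, alpha=0.05):
    """Student t test on mean accuracy over bootstrap resamples for two thresholds."""
    acc_criteria = np.asarray(acc_criteria, dtype=float)
    acc_validated = np.asarray(acc_validated, dtype=float)
    t_stat, p_value = stats.ttest_ind(acc_criteria, acc_validated)
    return {
        "mean_difference": float(acc_validated.mean() - acc_criteria.mean()),
        "t_statistic": float(t_stat),
        "p_value": float(p_value),
        "significant": bool(p_value < alpha),
    }
```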

Applying the recommended thresholds increases the number of predicted true binders

The use of criteria-based thresholds for individual peptide lengths and the bootstrapping-validated thresholds for the pooled lengths increased the number of true binders among the predicted peptides of the HPV data set. This is illustrated for HLA-A2 in Fig. 6 and for HLA-A1, A3, A11, A24, B7, and B15 in Supplementary Fig. S6. The first column of each heat map represents the peptides experimentally verified to be true binders (blue) or nonbinders (red). The following columns illustrate the categorized prediction output of each predictor as indicated in the figure legend, differentiating between peptides predicted either within the respective intermediate threshold only (dark red), the recommended threshold only (blue), the consensus of both thresholds (dark blue), or predicted to be nonbinders (red). For HLA-A2, applying the common thresholds of IC50 ≤ 500 nM and percentile rank ≤ 2 identified only one third of assessed true binders at best (Fig. 6) and resulted in several false-positive predictions. For the 8 and 11 residue–long peptides, the common thresholds often cover only a few true binders. For 9- and 10-mers, the intermediate binding affinity thresholds are more suitable. The number of predicted true binders could be increased by using the more tolerant recommended thresholds (predicted binders in green, predicted nonbinders in red). Applying the recommended thresholds also increased the number of FPs. This applies to other HLA types as well (Supplementary Fig. S6).
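A sketch of the four-way categorization behind the Fig. 6 heat map: the recommended cutoff is predictor- and HLA-specific (Table 1), and the function below simply compares one predicted IC50 against both cutoffs; the category labels are illustrative, not the figure's color codes.

```python
def heatmap_category(predicted_ic50, recommended_cutoff_nM, general_cutoff_nM=500.0):
    """Classify one prediction relative to the general (intermediate) and recommended cutoffs."""
    within_general = predicted_ic50 <= general_cutoff_nM
    within_recommended = predicted_ic50 <= recommended_cutoff_nM
    if within_general and within_recommended:
        return "both thresholds"
    if within_recommended:
        return "recommended threshold only"
    if within_general:
        return "intermediate threshold only"   # occurs when the recommended cutoff is stricter
    return "not predicted as binder"
```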

Discussion

Using an experimentally verified binding affinity data set of more than 750 peptides derived from HPV16 E6 and E7 proteins, we evaluated the performance of 13 currently used MHC class-I binding predictors for seven major HLA types for pooled and single-length 8-, 9-, 10-, and 11-mer peptides. We found that, overall, current ANN methods perform better than matrix-based approaches, which is in line with previous findings (54). No single prediction method performed outstandingly in discriminating binders from nonbinders, but performance was dependent on the selected HLA type and peptide length. Even the ANN predictor MHCflurry 1.2 and the MS data–trained NetMHCpan 4.0 were not generally able to outperform other methods. Thus, we recommend using either multiple methods or the algorithm that performs best for the specific HLA type and peptide length of interest. To make this feasible, we developed the web application MHCcombine, which allows querying 12 of the 13 algorithms analyzed in this study and combines the output in an ordered and searchable file. In contrast to a tool previously developed by Trost and colleagues, MHCcombine returns the individual output of several MHC-binding prediction methods and does not heuristically combine it. Further, MHCcombine allows predicting not only 9-mers, but also 8-, 10-, 11-, and 12-mers, if this is offered by the prediction algorithm, and is publicly available as a web-based tool.
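MHCcombine itself is a web application, so the snippet below only illustrates the kind of output combination described here: per-predictor result tables are merged on the peptide sequence, keeping each predictor's score in its own column rather than combining them heuristically. The column names and the helper are hypothetical, not MHCcombine's API.

```python
import pandas as pd

def combine_predictions(tables):
    """Merge per-predictor outputs (columns 'peptide' and 'score') into one wide table."""
    combined = None
    for predictor_name, table in tables.items():
        renamed = table[["peptide", "score"]].rename(columns={"score": predictor_name})
        combined = renamed if combined is None else combined.merge(renamed, on="peptide", how="outer")
    return combined

# Example (hypothetical DataFrames loaded from each tool's output):
# combined = combine_predictions({"NetMHC 4.0": netmhc_df, "MHCflurry 1.2": mhcflurry_df})
```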

Currently available MHC class-I binding prediction algorithms are trained on different sets of data derived from databases such as the IEDB. These databases comprise different numbers of biochemically determined MHC-binding peptides, MS-detected naturally processed and presented MHC ligands, or reported true T-cell epitopes for defined HLA types. In this study, we analyzed the prediction methods solely as predictors of MHC class-I binding. It is possible that if a predictor was trained on anything other than MHC-binding data, this may have negatively influenced the prediction performance as assessed herein. Apart from NetMHCpan 4.0 (mentioned above), MHC-NP and the previously developed MSIntrinsic and MixMHCPred prediction algorithms have been trained on MS-derived MHC ligand data sets (23–25). Although there is a role for MS data sets in prediction algorithms (reviewed in ref. 57), we did not include the latter three methods in our analysis, as they either do not provide predictions for our HLA types of interest or are not available as easy-to-use web applications. In addition, the MS data–based algorithms are trained on peptides that were naturally selected for peptide processing, MHC-binding affinity, peptide binding competition for MHC molecules, and bona fide peptide presentation. As our data set comprises only MHC-binding affinity data, we found it not suitable to evaluate the performance of those methods, but focused on analyzing predictors for MHC binding.

As our approach was to determine binding affinities until no binder could be found anymore, it neither reflects the whole range of predicted binding affinities nor does it contain all possible E6- and E7-derived peptides. This selection of peptides might create a bias, as binding peptides with poor prediction scores may have been missed, and the predictors' ability to correctly predict positives could have been overestimated. Depending on the predictions and the binding assay results, sample sizes differ for each HLA type and peptide length. The HLA types A1 and B7 as well as 8-mer and 11-mer peptides are underrepresented in this study, and results for these should be interpreted with caution. The preference of certain amino acids at anchor positions was comparable, although the HPV16 data set was too small to derive regions of variability or conservation. Based on the similarity in anchor positions, we do not expect significant bias resulting from the data set derived from only two viral proteins.

Previous findings indicate that many MHC ligands might be missed when the routinely used intermediate binding affinity decision thresholds (IC50 ≤ 500 nM, percentile rank ≤ 2) are applied (27, 58, 59). Indeed, comparing the sensitivity of predictors at these intermediate binding affinity thresholds showed that this yielded a maximum sensitivity of only 40%. We analyzed the impact of more tolerant thresholds on prediction performance and showed that applying the low binding affinity threshold (IC50 ≤ 5,000 nM) did not generally improve prediction performance. Therefore, we calculated new threshold recommendations based on our data set. Paul and colleagues have previously shown that allele-specific affinity thresholds increased the predictive efficacy, and recommended HLA type–specific binding affinity thresholds for 38 HLA-A and HLA-B types (60). For 27 of these 38 HLA types, they recommended IC50 values > 500 nM and < 1,000 nM.


For the remaining 11 types, among them A2 and A11, thresholds below 500 nM were recommended. However, that study focused only on 9-mer peptides and the SMM algorithm, and thus these thresholds are not generally applicable. With our study, we provide more comprehensive threshold recommendations, individually for 12 different and currently used predictors, for seven major HLA types, as well as for pooled and single peptide lengths.

The historic threshold of IC50 ≤ 500 nM, based mostly on 9- and 10-mer HLA-A2 binders (26), has been commonly used across different HLA types and peptide lengths since the mid-1990s. We demonstrate here that this threshold is not generally suited and applicable to identify binders when using MHC class-I binding prediction methods. Although strong binding affinity to MHC molecules was described to correlate well with therapy outcome (e.g., tumor rejection), it might not be the best predictor of therapeutically relevant human "self" tumor-specific epitopes, as it is likely that T cells specific to these have been deleted during negative selection in the thymus (59, 61). We previously showed that immunogenic HPV16-derived A2 peptides can be found with predicted IC50 > 500 nM (6/11 epitopes, up to an IC50 value of 4,351 nM predicted by NetMHC 4.0; ref. 62). It was reported that neoepitopes, which elicit protective immunity in mice, may have an MHC affinity well over IC50 values of 500 nM (63). Thus, we question the continued suitability of the intermediate binding affinity threshold for predicting true T-cell epitopes.

Accurate prediction of MHC-binding affinity is a prerequisite for the identification of T-cell epitopes. After prediction, peptides need to be tested for actual immunogenicity to identify promising therapy candidates. We previously identified immunogenic epitopes among our data set of HLA-A2 binding peptides through IFNγ responses of peripheral blood mononucleated cells from healthy donors (62). Four selected epitopes were validated in vivo in an MHC-humanized mouse model (64). This demonstrates that MHC class-I binding prediction, applying the commonly used strong and intermediate binding affinity thresholds, provides high prediction specificity. In therapeutic settings where many different antigens are available, few candidates per antigen might be enough to find a true epitope among them. However, for immunotherapeutic approaches in which possible antigen is rare, it becomes necessary to test any possible MHC binder in order to detect a true epitope. In this case, it is reasonable to increase the prediction sensitivity by applying the more tolerant thresholds recommended in this study, which describes how to choose the best online available prediction method for HLA type– and peptide length–specific research questions. Ultimately, improvement of MHC class-I binding prediction methods will facilitate the identification of true T-cell epitopes and thereby advance the development of epitope-specific immunotherapies.

Figure 5.

Comparison of prediction accuracies obtained by applying criteria-based and bootstrapping-validated thresholds for each analyzed predictor for HLA-A2, A3, and A24. Accuracy values are shown for using the criteria-based recommended (left) and the bootstrapping-validated thresholds (right) indicated on the x-axis. Mean accuracy of 100 resampling runs for each threshold was determined and compared by using Student t test (significance, P < 0.05). Box plots show quartiles, and whiskers indicate the 95% interval of accuracy. Differences of means are indicated above levels of significance. ***, P < 0.001; **, P < 0.01; *, P < 0.05; ns, not significant.


Data and Materials Availability
The web application MHCcombine is publicly available via http://mhccombine.dkfz.de. The source code for MHCcombine and the R-script for the cross-validation algorithm are available on GitHub (https://github.com/DKFZ-F130).

Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.

Authors' Contributions
Conception and design: M. Bonsack, S. Hoppe, A.B. Riemer
Development of methodology: M. Bonsack, S. Hoppe, J. Winter, D. Tichy, C. Zeller, R. Blatnik
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M. Bonsack, S. Hoppe, J. Winter, M.D. Küpper, E.C. Schitter, R. Blatnik
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M. Bonsack, S. Hoppe, D. Tichy, R. Blatnik
Writing, review, and/or revision of the manuscript: M. Bonsack, S. Hoppe, J. Winter, D. Tichy, R. Blatnik, A.B. Riemer
Study supervision: A.B. Riemer
Other (development and testing of web application MHCcombine): M. Bonsack, S. Hoppe, J. Winter, C. Zeller

Acknowledgments
We thank Martin Wühl and Alexandra Klevenz for technical assistance, and Cyril Mongis and Tobias Reber for their contributions to the implementation of MHCcombine. We gratefully acknowledge the flow cytometry core facility and the GMP unit of the German Cancer Research Center (DKFZ) for technical support and peptide synthesis, respectively. General funding and support was provided by DKFZ. A.B. Riemer was funded by the German Center for Infection Research (DZIF, grant number TTU 07.706), M. Bonsack and S. Hoppe by a PhD scholarship from the Helmholtz International Graduate School of the DKFZ, and R. Blatnik by a PhD scholarship from the Eduard and Melanie zur Hausen Foundation.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received September 14, 2018; revised December 19, 2018; accepted March 18, 2019; published first March 22, 2019.

References
1. Assarsson E, Sidney J, Oseroff C, Pasquetto V, Bui H-H, Frahm N, et al. A quantitative analysis of the variables affecting the repertoire of T cell specificities recognized after vaccinia virus infection. J Immunol 2007;178:7890–901.
2. Kessler JH, Benckhuijsen WE, Mutis T, Melief CJM, van der Burg SH, Drijfhout JW. Competition-based cellular peptide binding assay for HLA class I. Curr Protoc Immunol 2004;Chapter 18:Unit 18.12.
3. Wulf M, Hoehn P, Trinder P. Identification and validation of T-cell epitopes using the IFN-γ ELISPOT assay. Methods Mol Biol 2009;524:439–46.
4. Dendrou CA, Petersen J, Rossjohn J, Fugger L. HLA variation and disease. Nat Rev Immunol 2018;18:325–39.
5. Trolle T, Nielsen M. NetTepi: an integrated method for the prediction of T cell epitopes. Immunogenetics 2014;66:449–56.
6. Jørgensen KW, Rasmussen M, Buus S, Nielsen M. NetMHCstab - predicting stability of peptide-MHC-I complexes; impacts for cytotoxic T lymphocyte epitope discovery. Immunology 2014;141:18–26.
7. Nielsen M, Lundegaard C, Lund O, Kesmir C. The role of the proteasome in generating cytotoxic T-cell epitopes: insights obtained from improved predictions of proteasomal cleavage. Immunogenetics 2005;57:33–41.

Figure 6.

Effect of using individually calculated criteria-based thresholds versus commonly used thresholds for the prediction of HPV16 E6- and E7-derived ligands to HLA-A2. Heat map columns show experimental binding followed by the output of each predictor (NetMHC 4.0, NetMHC 3.4, NetMHCpan 4.0, NetMHCpan 3.0, NetMHCpan 2.8, NetMHCcons 1.1, PickPocket 1.1, IEDB recommended, IEDB consensus, IEDB SMMPMBEC, IEDB SMM, and MHCflurry 1.2), with rows grouped into 8-, 9-, 10-, and 11-mers and pooled lengths. Experimental binding (first column) was categorized into binders (blue) and nonbinders (red). Predicted binding affinity (following columns) was categorized into predicted within (blue) or beyond (red) the recommended thresholds. Dark shades of red and blue represent peptides that were additionally predicted within the general threshold, as indicated in the color legend below the heat map: binder/predicted within recommended threshold; predicted with IC50 ≤ 500 nM or percentile rank ≤ 2 and within the recommended threshold; predicted with IC50 ≤ 500 nM or percentile rank ≤ 2 but beyond the recommended threshold; nonbinder/not predicted within thresholds. Results are sorted according to experimental binding and IC50 values predicted by NetMHC 4.0.


8. Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinformatics 2007;8:424.
9. Calis JJA, Maybeno M, Greenbaum JA, Weiskopf D, De Silva AD, Sette A, et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput Biol 2013;9:e1003266.
10. Tenzer S, Peters B, Bulik S, Schoor O, Lemmel C, Schatz MM, et al. Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding. Cell Mol Life Sci 2005;62:1025–37.
11. Falk K, Rötzschke O, Stevanović S, Jung G, Rammensee HG. Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature 1991;351:290–6.
12. Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanović S. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999;50:213–9.
13. Ott PA, Hu Z, Keskin DB, Shukla SA, Sun J, Bozym DJ, et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 2017;547:217–21.
14. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell 2015;160:48–61.
15. Yadav M, Jhunjhunwala S, Phung QT, Lupardus P, Tanguay J, Bumbaca S, et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature 2014;515:572–6.
16. Rajasagi M, Shukla SA, Fritsch EF, Keskin DB, DeLuca D, Carmona E, et al. Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood 2014;124:453–62.
17. Matsushita H, Vesely MD, Koboldt DC, Rickert CG, Uppaluri R, Magrini VJ, et al. Cancer exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting. Nature 2012;482:400–4.
18. Sahin U, Derhovanessian E, Miller M, Kloke B-P, Simon P, Löwer M, et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 2017;547:222–6.
19. Schumacher TN, Schreiber RD. Neoantigens in cancer immunotherapy. Science 2015;348:69–74.
20. Caron E, Kowalewski DJ, Chiek Koh C, Sturm T, Schuster H, Aebersold R. Analysis of major histocompatibility complex (MHC) immunopeptidomes using mass spectrometry. Mol Cell Proteomics 2015;14:3105–17.
21. Schirle M, Weinschenk T, Stevanović S. Combining computer algorithms with experimental approaches permits the rapid and accurate identification of T cell epitopes from defined antigens. J Immunol Methods 2001;257:1–16.
22. Jurtz V, Paul S, Andreatta M, Marcatili P, Peters B, Nielsen M. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J Immunol 2017;199:3360–8.
23. Bassani-Sternberg M, Chong C, Guillaume P, Solleder M, Pak H, Gannon PO, et al. Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity. PLoS Comput Biol 2017;13:e1005725.
24. Abelin JG, Keskin DB, Sarkizova S, Hartigan CR, Zhang W, Sidney J, et al. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. Immunity 2017;46:315–26.
25. Giguère S, Drouin A, Lacoste A, Marchand M, Corbeil J, Laviolette F. MHC-NP: predicting peptides naturally processed by the MHC. J Immunol Methods 2013;400–401:30–6.
26. Sette A, Vitiello A, Reherman B, Fowler P, Nayersina R, Kast WM, et al. The relationship between class I binding affinity and immunogenicity of potential cytotoxic T cell epitopes. J Immunol 1994;153:5586–92.
27. Bassani-Sternberg M, Bräunlein E, Klar R, Engleitner T, Sinitcyn P, Audehm S, et al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat Commun 2016;7:13404.
28. O'Donnell TJ, Rubinsteyn A, Bonsack M, Riemer AB, Laserson U, Hammerbacher J. MHCflurry: open-source class I MHC binding affinity prediction. Cell Syst 2018;7:129–132.e4.
29. Sidney J, Peters B, Frahm N, Brander C, Sette A. HLA class I supertypes: a revised and updated classification. BMC Immunol 2008;9:1.
30. Yu K, Petrovsky N, Schönbach C, Koh JYL, Brusic V. Methods for prediction of peptide binding to MHC molecules: a comparative study. Mol Med 2002;8:137–48.
31. Peters B, Bui H-H, Frankild S, Nielson M, Lundegaard C, Kostem E, et al. A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput Biol 2006;2:e65.
32. Gowthaman U, Chodisetti SB, Parihar P, Agrewala JN. Evaluation of different generic in silico methods for predicting HLA class I binding peptide vaccine candidates using a reverse approach. Amino Acids 2010;39:1333–42.
33. Gfeller D, Bassani-Sternberg M, Schmidt J, Luescher IF. Current tools for predicting cancer-specific T cell immunity. Oncoimmunology 2016;5:e1177691.
34. Kar P, Ruiz-Perez L, Arooj M, Mancera RL. Current methods for the prediction of T-cell epitopes. Pept Sci 2018;110:e24046.
35. Trolle T, Metushi IG, Greenbaum JA, Kim Y, Sidney J, Lund O, et al. Automated benchmarking of peptide-MHC class I binding predictions. Bioinformatics 2015;31:2174–81.
36. Nielsen M, Lundegaard C, Worning P, Lauemøller SL, Lamberth K, Buus S, et al. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci 2003;12:1007–17.
37. Andreatta M, Nielsen M. Gapped sequence alignment using artificial neural networks: application to the MHC class I system. Bioinformatics 2016;32:511–7.
38. Lundegaard C, Lamberth K, Harndahl M, Buus S, Lund O, Nielsen M. NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8–11. Nucleic Acids Res 2008;36:W509–12.
39. Lundegaard C, Lund O, Nielsen M. Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers. Bioinformatics 2008;24:1397–8.
40. Hoof I, Peters B, Sidney J, Pedersen LE, Sette A, Lund O, et al. NetMHCpan, a method for MHC class I binding prediction beyond humans. Immunogenetics 2009;61:1–13.
41. Nielsen M, Andreatta M. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets. Genome Med 2016;8:33.
42. Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, Justesen S, et al. NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence. PLoS One 2007;2:e796.
43. Karosiene E, Lundegaard C, Lund O, Nielsen M. NetMHCcons: a consensus method for the major histocompatibility complex class I predictions. Immunogenetics 2012;64:177–86.
44. Zhang H, Lund O, Nielsen M. The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding. Bioinformatics 2009;25:1293–9.
45. Sidney J, Assarsson E, Moore C, Ngo S, Pinilla C, Sette A, et al. Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries. Immunome Res 2008;4:2.
46. Moutaftsi M, Peters B, Pasquetto V, Tscharke DC, Sidney J, Bui H-H, et al. A consensus epitope prediction approach identifies the breadth of murine T(CD8+)-cell responses to vaccinia virus. Nat Biotechnol 2006;24:817–9.
47. Kim Y, Sidney J, Pinilla C, Sette A, Peters B. Derivation of an amino acid similarity matrix for peptide:MHC binding and its application as a Bayesian prior. BMC Bioinformatics 2009;10:394.
48. Parker KC, Bednarek MA, Coligan JE. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol 1994;152:163–75.
49. Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics 2005;6:132.
50. Merrifield RB. Solid phase peptide synthesis. I. The synthesis of a tetrapeptide. J Am Chem Soc 1963;85:2149–54.
51. Carpino LA, Han GY. 9-Fluorenylmethoxycarbonyl amino-protecting group. J Org Chem 1972;37:3404–9.
52. Kessler JH, Mommaas B, Mutis T, Huijbers I, Vissers D, Benckhuijsen WE, et al. Competition-based cellular peptide binding assays for 13 prevalent HLA class I alleles using fluorescein-labeled synthetic peptides. Hum Immunol 2003;64:245–55.


53. Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit 1997;30:1145–59.
54. Lin HH, Ray S, Tongchusak S, Reinherz EL, Brusic V. Evaluation of MHC class I peptide binding prediction servers: applications for vaccine research. BMC Immunol 2008;9:8.
55. Thomsen MCF, Nielsen M. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion. Nucleic Acids Res 2012;40:W281–7.
56. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, et al. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res 2018;1–5.
57. Creech AL, Ting YS, Goulding SP, Sauld JFK, Barthelme D, Rooney MS, et al. The role of mass spectrometry and proteogenomics in the advancement of HLA epitope prediction. Proteomics 2018;1700259:1–10.
58. Nogueira C, Kaufmann JK, Lam H, Flechtner JB. Improving cancer immunotherapies through empirical neoantigen selection. Trends Cancer 2018;4:97–100.
59. Engels B, Engelhard VH, Sidney J, Sette A, Binder DC, Liu RB, et al. Relapse or eradication of cancer is predicted by peptide-major histocompatibility complex affinity. Cancer Cell 2013;23:516–26.
60. Paul S, Weiskopf D, Angelo MA, Sidney J, Peters B, Sette A. HLA class I alleles are associated with peptide-binding repertoires of different size, affinity, and immunogenicity. J Immunol 2013;191:5831–9.
61. Kammertoens T, Blankenstein T. It's the peptide-MHC affinity, stupid. Cancer Cell 2013;23:429–31.
62. Blatnik R, Mohan N, Bonsack M, Falkenby LG, Hoppe S, Josef K, et al. A targeted LC-MS strategy for low-abundant HLA class I-presented peptide detection identifies novel human papillomavirus T-cell epitopes. Proteomics 2018;e1700390.
63. Duan F, Duitama J, Al Seesi S, Ayres CM, Corcelli SA, Pawashe AP, et al. Genomic and bioinformatic profiling of mutational neoepitopes reveals new rules to predict anticancer immunogenicity. J Exp Med 2014;211:2231–48.
64. Kruse S, Büchler M, Uhl P, Sauter M, Scherer P, Lan TCT, et al. Therapeutic vaccination using minimal HPV16 epitopes in a novel MHC-humanized murine HPV tumor model. Oncoimmunology 2018;0:1–12.


Correction

Correction: Performance Evaluation of MHC Class-I Binding Prediction Tools Based on an Experimentally Validated MHC–Peptide Binding Data Set

In the original version of this article (1), text on page 722 of the Materials and Methods section incorrectly referred to "nmol/L" instead of "nM," and "nM" was missing in two places on page 732 of the Discussion section. An incorrect subheading of Supplementary Table S2 has also been removed.

All errors have been corrected in the latest online HTML and PDF versions of the article. The authors regret these errors.

References
1. Bonsack M, Hoppe S, Winter J, Tichy D, Zeller C, Küpper MD, et al. Performance evaluation of MHC class-I binding prediction tools based on an experimentally validated MHC–peptide binding data set. Cancer Immunol Res 2019;7:719–36.

Published online July 1, 2019. Cancer Immunol Res 2019;7:1221. doi: 10.1158/2326-6066.CIR-19-0387. ©2019 American Association for Cancer Research.
