Vaccine 23 (2005) 5212–5224

A computational resource for the prediction of peptide binding to Indian rhesus macaque MHC class I molecules

B. Peters a,∗, H.-H. Bui a, J. Sidney a, Z. Weng b, J.T. Loffredo c, D.I. Watkins c,d, B.R. Mothé a,e, A. Sette a

a La Jolla Institute for Allergy and Immunology, Vaccine Discovery – I, 3030 Bunker Hill Street, Suite 326, San Diego CA 92109, USA
b Boston University, Department of Biomedical Engineering, 44 Cummington Street, Boston MA 02215, USA
c University of Wisconsin-Madison, National Primate Research Center (WPRC), 1220 Capitol Court, Madison WI 53715, USA
d University of Wisconsin-Madison, Department of Pathology and Laboratory Medicine, 1300 University Avenue, Madison WI 53706, USA
e California State University, Department of Biological Sciences, San Marcos CA 92096, USA

Received 8 April 2005; accepted 28 July 2005
Available online 18 August 2005

∗ Corresponding author. Tel.: +1 858 228 1379; fax: +1 858 228 1008.
E-mail address: bjoern [email protected] (B. Peters).

0264-410X/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.vaccine.2005.07.086

Abstract

Non-human primates, in general, and Indian rhesus macaques, specifically, play an important role in the development and testing of vaccines and diagnostics destined for human use. To date, several frequently expressed macaque MHC molecules have been identified and their binding specificities characterized in detail. Here, we report the development of computational algorithms to predict peptide binding and potential T cell epitopes for the common MHC class I alleles Mamu-A*01, -A*02, -A*11, -B*01 and -B*17, which cover approximately two thirds of the captive Indian rhesus macaque populations. We validated this method utilizing an SIV derived data set encompassing 59 antigenic peptides. Of all peptides contained in the SIV proteome, the 2.4% scoring highest in the prediction contained 80% of the antigenic peptides. The method was implemented in a freely accessible and user friendly website at www.mamu.liai.org. Thus, we anticipate that our approach can be utilized to rapidly and efficiently identify CD8+ T cell epitopes recognized by rhesus macaques and derived from any pathogen of interest.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Indian rhesus macaque; MHC; Epitope

1. Introduction

MHC class I molecules bind peptides usually derived from proteins synthesized within a given cell [1]. These peptide–MHC complexes are then presented on the cell surface. Thus, MHC–peptide complexes allow CD8+ T cells to scan the content of cells and recognize cells that present abnormal peptides, such as those derived from proteins encoded by microbial pathogens and cancer antigens. Peptides recognized by cellular immune responses are referred to as antigenic peptides or T cell epitopes. Exact knowledge of which epitopes are presented during the course of an infection or following vaccination is crucial for the development of reagents to monitor immune responses. Exact measurement of cellular immunity is in turn key to the study of host–pathogen interactions and also for the evaluation of new vaccine constructs.

The rhesus macaque is a key non-human primate species utilized for the development of vaccines and diagnostic tools. New vaccine constructs and diagnostics are usually tested in non-human primates before the initiation of human clinical trials. Non-human primate studies often offer invaluable insights into host–pathogen interactions and provide, because of their similarities with humans, the most accurate animal model possible. However, accurate tools to measure and quantify immune responses in non-human primates are relatively scarce and, as a result, the development of vaccines and diagnostics, as well as basic immunological model studies of human disease, are hampered.


Several different computerized methods have been developed over the course of the last 15 years to predict peptide ligands for MHC molecules. Efficient predictive methods are available online for most common MHC molecules expressed by humans, as well as common mouse inbred strains utilized in immunological and vaccine studies [2–11]. The availability of such online predictive resources has been an invaluable tool for the scientific community. However, none of the currently available tools allows prediction of epitopes presented by Indian rhesus macaques.

We have previously performed in-depth studies of T cell responses against SIV infection in macaques. To analyze correlates of protection, and develop vaccine candidates, it is essential to know which epitopes are presented on infected cells. We have therefore studied which peptides bind to macaque MHC molecules, as this binding step is the most selective requirement in the MHC class I pathway. After identifying class I MHC binding peptides derived from the SIV proteome, we next determined which of these elicited immune responses from CD8+ T cells isolated from SIV infected macaques. Repeating this process for several MHC alleles representative of those found in captive macaque populations, we developed a consistent dataset of non-binding peptides, binding peptides and antigenic peptides for five common MHC molecules. Based on this knowledge, we were able to perform studies which have provided insight into kinetics of immune responses after infection, pathogenesis studies and viral escape mechanisms after natural infection and vaccination [12–28].

Here, we present a computational analysis of this peptide set, starting with the generation of peptide binding predictions for the most common macaque alleles. The overall goal of our investigation was to design reliable prediction tools to identify antigenic epitopes recognized in the majority of Indian rhesus macaques. Recent work has analyzed the sequence motifs of peptides binding to several different Mamu (Macaca mulatta) MHC class I molecules ([21–24,29,30] and Loffredo et al., manuscript in preparation). Here, we considered the frequency of the most common MHC class I alleles expressed to ensure thorough population coverage ([21,26,29,31,32] and Rehrauer et al., manuscript in preparation). Accordingly, we have developed prediction algorithms for Mamu-A*01, -A*02, -A*11, -B*01 and -B*17 alleles and integrated this information in a resource freely accessible over the internet at www.mamu.liai.org.

Testing these predictions, we found a correlation between experimental and predicted binding and, more importantly, between predicted binding affinity and antigenicity of peptides. Interestingly, the predicted binding affinity of a peptide shows a slightly higher correlation with antigenicity than its measured affinity, suggesting that the experimental error of an individual experiment is in the same order of magnitude as the error made in the prediction. These findings provide us with the ability to predict antigenic peptides effectively without undergoing time-consuming and expensive experimental approaches.

2. Materials and methods

2.1. Allele frequencies in Indian rhesus macaques

The MHC typing data from 2294 animals were kindly provided to us by the typing facility at the University of Wisconsin. Approximately one third of the animals originate from the Wisconsin National Primate Research Center, and the other two thirds are from a variety of other sources. Samples were typed for eight different class I alleles simultaneously. The eight class I alleles are Mamu-A*01, -A*02, -A*08, -A*11, -B*01, -B*03, -B*04 and -B*17, and typings were performed as previously described ([31,33] and Rehrauer et al., manuscript in preparation). With the exception of B*04 (Wisconsin = 1.7%, other sources 0.3%), the allele frequencies at Wisconsin relative to other sources were not different in a statistically significant manner.

2.2. Peptide and epitope datasets

The peptide binding and antigenicity data for the Mamu-A*01, -A*02, -A*11 and -B*17 alleles were taken from published literature [21–24,29] and our laboratory database. For the Mamu-B*01 allele, similar studies were conducted, but no antigenicity data are available yet (Loffredo et al., manuscript in preparation).

In each of these studies, the SIV proteome (defined as the set of predicted expressed sequences from the genome) was scanned for peptides containing anchor residues determined to be a prerequisite for binding to the examined allele. The SIV sequences scanned were SIVmac251 for the Mamu-A*01 allele, and the SIVmac239 sequence for all other alleles. The historic reason for this switch is that the SIVmac251 sequence was derived from a heterogeneous mix of viruses, while the SIVmac239 sequence was derived from a single isolate. Peptides with the appropriate anchor residues were then tested for binding to the corresponding MHC allele. If the measured affinity was better than IC50 = 500 nM, the peptide was classified as a binder, and further tested for antigenicity in an IFN-γ ELISPOT assay with cells from MHC-matched, SIV-infected macaques. If a peptide elicited a positive response in at least two independent samples, it was classified as an epitope.

For several alleles, additional peptides were tested for binding that were not derived from SIV. Here, these peptides were used as additional training and testing data for the binding predictions.
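The two-step labelling described in this section (binder if the measured IC50 is better than 500 nM, epitope if a binder gives a positive ELISPOT response in at least two independent samples) can be sketched as follows. This is a minimal illustration, not the authors' code: the `PeptideRecord` type, its field names and the strict `<` comparison at exactly 500 nM are assumptions, and the sample counts in the usage lines are hypothetical.

```python
from dataclasses import dataclass

BINDER_IC50_NM = 500.0    # affinity threshold stated in the text
MIN_POSITIVE_SAMPLES = 2  # independent ELISPOT-positive samples required


@dataclass
class PeptideRecord:
    """Hypothetical container for one tested peptide."""
    sequence: str
    ic50_nm: float             # measured IC50 in nM (lower = stronger binding)
    positive_samples: int = 0  # number of independent ELISPOT-positive samples


def classify(record: PeptideRecord) -> str:
    """Return 'non-binder', 'binder' or 'epitope' for one peptide."""
    # Assumption: "better than 500 nM" is read as strictly below 500 nM.
    if record.ic50_nm >= BINDER_IC50_NM:
        return "non-binder"
    # Only binders are tested for antigenicity.
    if record.positive_samples >= MIN_POSITIVE_SAMPLES:
        return "epitope"
    return "binder"


# Usage with hypothetical sample counts.
print(classify(PeptideRecord("LTPEKGWL", ic50_nm=8.0, positive_samples=2)))
print(classify(PeptideRecord("LTPEKGWL", ic50_nm=8.0, positive_samples=1)))
```

Keeping the two thresholds as named constants makes it easy to see that antigenicity is only ever assessed for peptides that pass the binding cut-off.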

2.3. Peptide synthesis

Peptides for screening were purchased from either Pepscan Systems (Lelystad, The Netherlands) or Mimotopes (Clayton, Australia), or synthesized at the Biotechnology Center at the University of Wisconsin-Madison or at A&A Labs (San Diego, CA) using standard tertiary butyloxycarbonyl or fluorenylmethyloxycarbonyl solid phase methods [34]. Peptides were resuspended at 4–20 mg/ml in 100% dimethyl sulfoxide (DMSO) (Sigma, St. Louis, MO), then diluted to required concentrations in PBS or PBS + 0.05% NP40. For ELISPOT assays, peptide stocks at 10 mg/ml in 100% DMSO were diluted to 1 mg/ml in Hank's buffered salt solution (HBSS; Gibco, Grand Island, NY). Peptides for use as radiolabeled probes were purified to >95% homogeneity by reverse phase HPLC, and composition ascertained by amino acid analysis, sequencing, and/or mass spectrometry analysis. Radiolabeling was done using the chloramine T method [35].

2.4. MHC purification and peptide-binding assay

Stable transfectants expressing specific rhesus macaque MHC class I molecules for Mamu-A*01, -A*02, -A*11, -B*01 and -B*17 were created in the HLA class I-deficient human B-cell line 721.221, as described previously [22–24,29]. MHC class I molecules were purified from cell lysates using affinity chromatography as previously described with the anti-HLA class I (-A, -B and -C) antibody W6/32 [35,36]. Protein purity, concentration, and effectiveness of depletion steps were monitored by SDS-PAGE.

Quantitative assays for peptide binding to detergent solubilized Mamu class I molecules were based on the inhibition of binding of a radiolabeled standard probe peptide using the same protocol described for the measurement of peptide binding to HLA class I molecules [35]. Briefly, 1–10 nM radiolabeled peptide was co-incubated at room temperature with 1 µM to 1 nM purified class I molecules in the presence of 1 µM human β2-microglobulin (Scripps Laboratories, San Diego, CA) and a cocktail of protease inhibitors. The radiolabeled peptides used for Mamu-A*01, -A*02, -A*11 and -B*17 assays were as previously described [22–24,29]. For Mamu-B*01, the radiolabeled peptide was the Macaque tumor rejection antigen gp96 245–253 peptide (sequence SDYLELDTI; Mamu-B*01 binding capacity 2.7 nM).

After a 2-day incubation, binding of the radiolabeled peptide to the corresponding MHC class I molecule was determined by capturing MHC/peptide complexes on Greiner Lumitrac 600 microplates (Greiner Bio-one, Longwood, FL) coated with the W6/32 antibody, and measuring bound cpm using the TopCount microscintillation counter (Packard Instrument Co.). In the case of competitive assays, the concentration of peptide yielding 50% inhibition of the binding of the radiolabeled probe peptide was calculated (IC50). Peptides were typically tested in three or more independent experiments. Under the conditions used, where [label] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are reasonable approximations of the true Kd values [37]. In each experiment, a titration of the unlabeled version of the radiolabeled probe was tested as a positive control for inhibition.

2.5. IFN-γ enzyme-linked immunospot (ELISPOT) assay

Peripheral blood mononuclear cells (PBMC) were separated from whole heparin or EDTA-treated blood by Ficoll-paque PLUS (GE Health Sciences, Piscataway, NJ) density centrifugation. The PBMC were used directly in ELISPOT assays as previously described [21–24] (Loffredo et al., manuscript in preparation). Briefly, 96-well, flat-bottomed, clear plate ELISPOT kits (U-Cytech-BV, Utrecht, The Netherlands) were used for the detection of IFN-γ. Cells were resuspended in RPMI 1640 (BioWhittaker, Walkersville, MD) supplemented with L-glutamine (Mediatech, Herndon, VA), penicillin–streptomycin (Mediatech, Herndon, VA), and 5% fetal bovine serum (FBS; HyClone, Logan, UT) (R05). The R05 also contained either 5–10 µg/ml of concanavalin A (Con A; Sigma Chemical, St. Louis, MO), 1–10 µg/ml of various MHC-restricted binder (IC50 values ≤500 nM) peptides, 1–10 µg/ml of irrelevant peptides, or no peptide. Input cell numbers were 1.0–2.0 × 10^5 PBMC in 100 µl/well in triplicate wells. Cells were then incubated 16–19 h (overnight) at 37 °C in 5% CO2.

ELISPOT plates were developed using the U-Cytech-BV activator mix consisting of a silver salt solution that precipitates at the sites of gold clusters (from the gold-labeled anti-biotin solution), visualizing the sites where IFN-γ was secreted. When black spots appeared in the wells under an inverted microscope, the wells were washed with distilled water to stop development and then air-dried.

For Mamu-A*01 and Mamu-B*17 ELISPOT assays, the wells were imaged with IP Lab Spectrum 3.23 software using a Hamamatsu C4880 series camera attached to a Nikon TE300 inverted microscope. Spots were counted manually. A spot-forming cell (SFC) was defined as a large black spot with a fuzzy border [38]. For Mamu-A*01 ELISPOT assays [21], to determine significance levels, a baseline for each peptide was first established using the average and standard deviation of the number of SFCs in three independent assays as performed on Mamu-A*01+ but SIV-naive animals. A threshold significance value corresponding to this average plus two standard deviations was then determined. In our analysis of samples from SIV-infected Mamu-A*01+ animals, a response was considered positive if the number of SFCs exceeded the threshold significance level for that specific peptide. For Mamu-B*17 ELISPOT assays [22], to determine significance levels, the average of the number of SFCs and standard deviation for each peptide was calculated. Background (sample with no peptide) levels were subtracted from each peptide average. A response was considered positive if the number of SFCs exceeded twice the level of the sample with no peptide. For Mamu-A*02 [23], -A*11 [24] and -B*01 (Loffredo et al., manuscript in preparation) ELISPOT assays, the wells were imaged with an AID ELISPOT reader (AID, Strassberg, Germany). Spots were counted by an automated system with set parameters for size, intensity, and gradient. Background (mean of wells without peptide) levels were subtracted from each well on the plate. A response was considered positive if the mean number of SFCs of triplicate sample wells exceeded background plus two standard deviations (S.D.).

Assay results are recorded as SFC per 1 × 10^6 cells. Typically, responses <50 SFC per 1 × 10^6 cells are not considered positive because these counts are not significantly above background. Wells containing Con A (positive control) were always greater than 1000 SFCs per 1 × 10^6 PBMC.
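As an illustration of the automated-reader criterion and the normalization to 1 × 10^6 cells, the following sketch scores one peptide from triplicate wells. It is only one possible reading of the description: taking the standard deviation from the no-peptide wells, and applying the 50 SFC per 10^6 cells floor to the background-subtracted mean, are assumptions of this sketch, as are all function names and the well counts used in the example.

```python
from statistics import mean, stdev

CELLS_PER_REPORT = 1_000_000  # results are reported per 1e6 cells
MIN_SFC_PER_MILLION = 50      # responses below this are not considered positive


def sfc_per_million(spots, cells_per_well):
    """Scale a spot count from one well to SFC per 1e6 input cells."""
    return spots * CELLS_PER_REPORT / cells_per_well


def is_positive(sample_wells, background_wells, cells_per_well):
    """Positive if the triplicate mean exceeds background mean + 2 S.D.

    Assumption: S.D. is computed over the no-peptide (background) wells,
    and the 50 SFC per 1e6 floor is applied after background subtraction.
    """
    bg_mean = mean(background_wells)
    threshold = bg_mean + 2 * stdev(background_wells)
    sample_mean = mean(sample_wells)
    net = sfc_per_million(sample_mean - bg_mean, cells_per_well)
    return sample_mean > threshold and net >= MIN_SFC_PER_MILLION


# Hypothetical spot counts for wells seeded with 1e5 PBMC each.
print(is_positive([30, 34, 29], [2, 3, 4], cells_per_well=100_000))
print(is_positive([5, 6, 4], [2, 3, 4], cells_per_well=100_000))
```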

2.6. SMM predictions for MHC binding

For the prediction of MHC binding, we used the stabilized matrix method (SMM), first described in ref. [39], with modifications described in ref. [40]. Basically, the algorithm takes a set of peptides of the same length with measured binding affinity as an input. It assumes that the measured binding affinities can be explained by a matrix $\mathrm{mat}(aa_{pos}, pos)$, which quantifies the contribution of each amino acid ($aa$) at each position in the peptide ($pos$) to the binding free energy. For a peptide of length $L$ with sequence $aa_{pos}$, the matrix would predict an affinity of $\log(\mathrm{IC}_{50,\mathrm{predicted}}) = \sum_{pos=1}^{L} \mathrm{mat}(aa_{pos}, pos) + \mathrm{offset}$. The core SMM algorithm then searches for matrix values for which the predicted IC50 values are close to the measured IC50 values.

More specifically, the matrix entries are determined by minimizing the distance $\Phi$ between the predicted scores for the peptide sequences in the training set and the logarithms of their associated measured IC50 values, $\Phi = \sum_{\mathrm{peptides}} \left(\log(\mathrm{IC}_{50,\mathrm{measured}}) - \log(\mathrm{IC}_{50,\mathrm{predicted}})\right)^2$. In the case of peptides for which only a lower limit of the IC50 value is known, any prediction equal to or above this threshold is considered perfect (distance = 0). To avoid over-fitting, a second term is added to the minimization function, punishing the deviation of matrix entries from zero: $\Psi = \Phi + \sum_{pos} \lambda_{pos} \sum_{aa} \mathrm{mat}(aa, pos)^2$. By minimizing this objective function with a non-zero $\lambda_{pos}$ value, a tradeoff is introduced between optimally reproducing the experimental values (including their inevitable experimental error) and minimizing the matrix entries $\mathrm{mat}(aa, pos)$. This forces all matrix entries that do not significantly lower the distance $\Phi$ towards small values.

In contrast to previous applications of the SMM method [11,39], the regularization parameter $\lambda_{pos}$ is now position dependent. This takes into account that for the different positions in the peptide a different signal to noise ratio is expected. The optimal values for the $\lambda_{pos}$ are determined through five-fold cross validation on the training set. Starting with $\lambda_{pos}$ values optimal if the predictions were based on each column alone, the final values are determined by searching for a minimal cross validated distance as a function of the $\lambda_{pos}$ through steepest descent. This procedure is repeated 10 times, each time using different random splitting of the experimental dataset into training and test data for the cross validation. The final prediction matrix is given as the mean of the 10 independently obtained matrices.

2.7. Evaluating prediction quality using ROC curves

Receiver operating characteristic (ROC) [41] curves are used to measure the quality of a prediction which classifies objects into one of two categories. In our case, binding predictions were used to predict whether or not peptides are antigenic. Given a cut-off for the predicted value, predictions for peptides are separated into positive and negative subsets. Depending on whether a peptide was found to be antigenic, a positive prediction can be either true or false. Normalizing the number of true and false positive predictions by the maximum number possible gives rates of true positive and false positive predictions. Plotting the rates of true positive predictions as a function of the rate of false positive predictions gives an ROC curve. Calculating the area under the ROC curve (= AUC value) provides a highly useful measure of prediction quality, which is 0.5 for random predictions and 1.0 for perfect predictions. The AUC value is equivalent to the probability that the predicted score for a randomly chosen antigenic peptide is lower (= better) than that of a randomly chosen peptide that is not antigenic. To assess if the AUC value of one prediction is significantly better than that of another prediction, we re-sampled the set of peptides for which predictions are made. Using bootstrapping with replacement [42], 50 new datasets were generated with a constant ratio of antigenic to not-antigenic peptides. We then calculated the difference in AUC for the two predictions on each new dataset. One prediction is significantly better than another if the distribution of the AUC values is significantly different, which we measure using a paired t-test with a p-value of 0.001.
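The probabilistic reading of the AUC value given above can be computed directly by comparing every antigenic peptide with every non-antigenic one. The sketch below does exactly that; counting ties as one half is a common convention and an assumption here, and the scores in the usage line are hypothetical predicted log(IC50) values, not data from this study.

```python
def auc(scores_antigenic, scores_other):
    """Probability that a randomly chosen antigenic peptide has a lower
    (better) predicted score than a randomly chosen non-antigenic one.

    Ties are counted as 0.5 (assumption of this sketch).
    """
    wins = 0.0
    for a in scores_antigenic:
        for b in scores_other:
            if a < b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_antigenic) * len(scores_other))


# Hypothetical predicted log(IC50) scores.
print(auc([1.2, 1.8, 2.5], [2.0, 3.1, 3.4, 4.0]))
```

This pairwise form is quadratic in the number of peptides, which is unproblematic for datasets of a few hundred peptides; the bootstrap comparison described in the text would call such a function once per resampled dataset and per prediction method.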

3. Results

3.1. Projected population coverage afforded by targeted Mamu MHC class I alleles

Previous studies have detailed the MHC binding motifs of the Mamu-A*01 [21,29], -A*02 [23], -A*11 [24], -B*17 [22], and -B*01 (Loffredo et al., manuscript in preparation) class I molecules. These alleles were chosen, for the most part, because they were shown to occur in relatively high frequencies in the macaque populations available for use in biomedical research. However, a formal quantitative analysis of the population coverage afforded by all these molecules was lacking.

To address population coverage with this set of alleles, we assembled specific Mamu allele frequencies from the Wisconsin National Primate Research Center, which performs MHC class I typing of their animals and on samples sent in from other laboratories. The frequencies of eight different MHC alleles routinely tested in a total of 2294 individual macaques studied are shown in Table 1. Except for the Mamu-A*08 allele (for which no motif data is currently available), all routinely typed alleles occurring at relatively high frequencies (3–28%) were addressed by the current study, including Mamu-A*01, -A*02, -A*11, -B*17 and -B*01.

Table 1
Phenotypic frequencies of Mamu class I alleles in a captive population of Indian-origin rhesus macaques

Animals typed positive for         Number of animals   Population coverage (%)
                                                       In typed population   Calculated from allele frequencies
Any of the eight typed alleles a   1830                79.8                  79.7
A*01, A*02, A*11, B*01 or B*17     1483                64.6                  65.7

a A*01, A*02, A*08, A*11, B*01, B*03, B*04 and B*17.

Next, we addressed the population coverage afforded by various combinations of MHC class I molecules. In the typed population, the combination of all eight routinely typed alleles afforded a coverage of approximately 80% of the animals. Focusing on the five alleles for which motifs are available afforded a population coverage of approximately two-thirds (64.6%) of the macaque population considered.

The MHC class I system of the Indian rhesus macaque is considerably complex, with a variable number of genes encoding distinct molecules and allele variants [31,43–46]. For this reason, it was unclear whether population coverage in the case of Indian rhesus class I molecules could be estimated from simple Hardy-Weinberg equilibrium equations.
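To make concrete what such a Hardy-Weinberg estimate involves, the sketch below derives a gene frequency from each phenotypic frequency, sums the gene frequencies within a locus, and converts the sum back into the fraction of animals expected to carry at least one of the alleles. The animal counts are those reported in this paper for the 2294 typed animals; the function names are hypothetical, and combining the A and B loci as if they were statistically independent is an assumption of this sketch rather than a statement of the authors' exact procedure.

```python
from math import sqrt

TOTAL_ANIMALS = 2294
# Number of animals typed positive for each targeted allele (from the paper).
POSITIVE_ANIMALS = {"A*01": 566, "A*02": 498, "A*11": 80,
                    "B*01": 637, "B*17": 240}


def gene_frequency(allele):
    """Hardy-Weinberg: animals lacking the allele make up (1 - g)^2."""
    phenotype_freq = POSITIVE_ANIMALS[allele] / TOTAL_ANIMALS
    return 1.0 - sqrt(1.0 - phenotype_freq)


def locus_coverage(alleles):
    """Fraction of animals carrying at least one of the alleles at a locus."""
    g = sum(gene_frequency(a) for a in alleles)
    return 1.0 - (1.0 - g) ** 2


def combined_coverage(cov_a, cov_b):
    """Coverage by either locus, treating the loci as independent."""
    return 1.0 - (1.0 - cov_a) * (1.0 - cov_b)


cov_a = locus_coverage(["A*01", "A*02", "A*11"])
cov_b = locus_coverage(["B*01", "B*17"])
print(f"A locus: {cov_a:.1%}, B locus: {cov_b:.1%}, "
      f"A or B: {combined_coverage(cov_a, cov_b):.1%}")
```

Under these assumptions the estimate lands at roughly 46% for the targeted A alleles, 37% for the targeted B alleles and 66% for the two loci combined.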

However, using Hardy-Weinberg equations and the data inTable 1, we calculated that 46% of the macaques from thesecaptive colonies express at least one of the targeted Mamu-Aalleles, and 37% express at least one of the targeted Mamu-Balleles. Assuming A/B linkage disequilibrium, we approx-imated that 66% of these Indian rhesus macaques wouldexpress at least one of the A or B alleles studied (Table 2).This is in good agreement with the 65% value found whenactually typing the animals that were positive for at least oneof these alleles. This data also suggests that Hardy-Weinbergequations can be applied to the rhesus MHC allelic system,d

thatt -B thec esults

TP les

A )

AAAABBBB

T

suggest that development of a resource to effectively predictcandidate epitopes for these alleles would be of considerablepractical interest.

3.2. Prediction of Mamu-A*01 binding capacityaccording to the SMM method

Accordingly, we decided to establish bioinformatic toolsto allow prediction of peptides capable of binding the alle-les mentioned in the previous section. Previous studies haveestablished that most epitopes bound by Mamu MHC class Imolecules range in size from 8 to 11 residues[29,47,48]. Weapplied the SMM method[39,40] to various published andunpublished sets of peptide binding data to generate matricesfor predicting the Mamu-A*01 binding capacity of peptides,in this size range.

As an example, the matrix for predicting binding of 8-mer peptides to Mamu-A*01 is shown inTable 3. The tablealso illustrates how the affinity of a sample peptide (withsequence LTPEKGWL), is predicted for the Mamu-A*01molecule. Specifically, by adding the matrix entries corre-sponding to each residue found at each position along thepeptide sequence and adding the offset value, a sum of 1.76is derived. This value represents the predicted log (IC50) valueof the peptide, which corresponds to an IC50 value of 58 nM.

To evaluate the quality of the predictions, and at the samet ande his,t sets.N ainedi usedt atrixg timeg et ofp allelea -d ntl achp ptideL am-pa esep oeffi-c 01,f eenm

3c

e too MMm ep-t re edic-t de

espite it’s apparent complexity.In conclusion, from these results we have determined

he set of alleles including Mamu-A*01, -A*02, -A*11,*17 and -B*01, provides coverage for the majority ofaptive population of Indian rhesus macaques. These r

able 2opulation coverage afforded by selected Mamu MHC class I molecu

llele Number of animals Frequency (%

*01 566 24.7*02 498 21.7*08 689 30.0*11 80 3.5*01 637 27.8*03 21 0.9*04 17 0.7*17 240 10.5

otal 2294 100

ime avoid utilizing the same data for matrix generationvaluation, we used five-fold cross-validation. To do the total set of peptides first was separated into five subext, a scoring matrix was established on the data cont

n four of these subsets. Finally, the resulting matrix waso make predictions for the subset not included in the meneration. This process was repeated five times, eachenerating one ‘blind’ prediction result on a unique subseptides. This same procedure was repeated for eachnd for each different peptide size.Fig. 1shows plots of preicted vs. measured log (IC50) values for peptides of differe

engths tested for binding to Mamu-A*01 molecules. Eoint in the graph corresponds to one peptide. The peTPEKGWL used in the sample prediction above, for exle, has a measured affinity of IC50 = 8 nM, which is plottedgainst its predicted affinity of 58 nM. The quality of thredictions can be measured by the linear correlation cientR2. TheR2 values ranged, in the case of Mamu-A*rom 0.56 to 0.30 indicating significant agreement betweasured and predicted values.

.3. Extension of the method to other common Mamulass I molecules

Next, we applied the methodology described abovther Mamu class I alleles. As mentioned above, the Sethod allows combining prediction results for different p

ide lengths because a quantitative IC50 value is predicted foach peptide, irrespective of its length. The combined pr

ions are shown inFig. 2 for the various alleles and pepti

Page 6: A computational resource for the prediction of peptide binding to Indian rhesus macaque MHC class I molecules

B. Peters et al. / Vaccine 23 (2005) 5212–5224 5217

Table 3
Prediction of the binding affinity of peptide LTPEKGWL to Mamu-A*01 using an SMM matrix

Amino acid            Position
(single letter code)        1      2      3      4      5      6      7      8
A                       −0.12  −0.73  −0.55  −0.09  −0.66  −0.06  −0.10   0.60
C                       −0.17   0.62   0.39   0.05  −0.11  −0.38   0.10  −0.13
D                        1.21   0.39   0.31   0.14   0.56   0.43   0.20  −0.28
E                        0.72   0.98   0.14  −0.06   0.61   0.13   0.04   1.08
F                       −0.52   0.76   0.22   0.06  −0.17  −0.27  −0.14  −0.95
G                        0.16  −0.19   0.44   0.14  −0.60   0.51   0.19  −0.29
H                        0.02   0.39   0.00   0.02   0.14  −0.30   0.09   0.75
I                       −0.11   0.29  −0.07  −0.15  −0.07  −0.17  −0.03  −0.76
K                       −0.18   1.00   0.00   0.10   0.42   0.44   0.14   0.75
L                       −0.38   0.05  −0.02  −0.01   0.21  −0.05  −0.28  −1.25
M                       −0.38  −0.19   0.00   0.04   0.06   0.03  −0.30  −1.04
N                        0.55   0.64   0.31   0.02   0.07   0.10   0.18   1.68
P                        0.41  −0.28  −1.89  −0.30  −0.58   0.00  −0.03  −0.06
Q                        0.34   0.36   0.00   0.07   0.35  −0.21   0.04  −1.18
R                       −0.04  −0.13   0.00   0.10   0.38   0.00   0.02   0.75
S                       −0.01  −1.37   0.07  −0.01  −0.57   0.15   0.04   1.66
T                       −0.14  −1.69  −0.07  −0.08  −0.29   0.09   0.02   0.86
V                       −0.71  −0.83   0.00  −0.06   0.17  −0.16  −0.03  −0.45
W                       −0.39   0.26   0.31   0.04   0.07   0.00  −0.07   0.39
Y                       −0.25  −0.33   0.40  −0.01   0.01  −0.27  −0.08   0.11

Offset value: 6.20. Example: prediction for peptide LTPEKGWL: 6.20 − 0.38 − 1.69 − 1.89 − 0.06 + 0.42 + 0.51 − 0.07 − 1.25 = 1.76 → 58 nM.
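The scoring rule behind the Table 3 example can be sketched as follows: the predicted log10(IC50) is the offset plus one matrix entry per peptide position, and the IC50 in nM is ten raised to that sum. The names `SMM_A01_8MER`, `predict_log_ic50` and `predict_ic50_nm` are hypothetical; the numbers are the two-decimal entries printed in Table 3. Summing those printed entries gives about 1.79 (roughly 62 nM) rather than the 1.76 (58 nM) quoted in the table example; the difference may stem from rounding of the published matrix values.

```python
# Mamu-A*01 8-mer SMM matrix as printed in Table 3.
# Keys are residues; list index i holds the value for peptide position i + 1.
SMM_A01_8MER = {
    "A": [-0.12, -0.73, -0.55, -0.09, -0.66, -0.06, -0.10, 0.60],
    "C": [-0.17, 0.62, 0.39, 0.05, -0.11, -0.38, 0.10, -0.13],
    "D": [1.21, 0.39, 0.31, 0.14, 0.56, 0.43, 0.20, -0.28],
    "E": [0.72, 0.98, 0.14, -0.06, 0.61, 0.13, 0.04, 1.08],
    "F": [-0.52, 0.76, 0.22, 0.06, -0.17, -0.27, -0.14, -0.95],
    "G": [0.16, -0.19, 0.44, 0.14, -0.60, 0.51, 0.19, -0.29],
    "H": [0.02, 0.39, 0.00, 0.02, 0.14, -0.30, 0.09, 0.75],
    "I": [-0.11, 0.29, -0.07, -0.15, -0.07, -0.17, -0.03, -0.76],
    "K": [-0.18, 1.00, 0.00, 0.10, 0.42, 0.44, 0.14, 0.75],
    "L": [-0.38, 0.05, -0.02, -0.01, 0.21, -0.05, -0.28, -1.25],
    "M": [-0.38, -0.19, 0.00, 0.04, 0.06, 0.03, -0.30, -1.04],
    "N": [0.55, 0.64, 0.31, 0.02, 0.07, 0.10, 0.18, 1.68],
    "P": [0.41, -0.28, -1.89, -0.30, -0.58, 0.00, -0.03, -0.06],
    "Q": [0.34, 0.36, 0.00, 0.07, 0.35, -0.21, 0.04, -1.18],
    "R": [-0.04, -0.13, 0.00, 0.10, 0.38, 0.00, 0.02, 0.75],
    "S": [-0.01, -1.37, 0.07, -0.01, -0.57, 0.15, 0.04, 1.66],
    "T": [-0.14, -1.69, -0.07, -0.08, -0.29, 0.09, 0.02, 0.86],
    "V": [-0.71, -0.83, 0.00, -0.06, 0.17, -0.16, -0.03, -0.45],
    "W": [-0.39, 0.26, 0.31, 0.04, 0.07, 0.00, -0.07, 0.39],
    "Y": [-0.25, -0.33, 0.40, -0.01, 0.01, -0.27, -0.08, 0.11],
}
OFFSET = 6.20


def predict_log_ic50(peptide, matrix=SMM_A01_8MER, offset=OFFSET):
    """Offset plus the sum of position-specific matrix entries (log10 nM)."""
    if len(peptide) != 8:
        raise ValueError("this matrix only scores 8-mer peptides")
    return offset + sum(matrix[aa][pos] for pos, aa in enumerate(peptide))


def predict_ic50_nm(peptide):
    """Convert the predicted log10(IC50) into an IC50 value in nM."""
    return 10 ** predict_log_ic50(peptide)


print(f"log10(IC50) = {predict_log_ic50('LTPEKGWL'):.2f}")
print(f"IC50 = {predict_ic50_nm('LTPEKGWL'):.0f} nM")
```

Lower scores mean stronger predicted binding, which is why the favored anchor residues (for example T at position 2 and P at position 3) carry the most negative entries.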

Fig. 1. Prediction of Mamu-A*01 binding capacity according to the SMM algorithm. Predicted vs. measured Mamu-A*01 binding affinity plots for 8-mer (A), 9-mer (B), 10-mer (C) and 11-mer (D) peptides are shown. Correlation coefficients evaluating prediction quality are given in each panel.


Fig. 2. Prediction of MHC binding capacity for five macaque MHC class I alleles using SMM matrices. Scatter plots of predicted vs. measured binding affinity for the five MHC class I alleles Mamu-A*01, -A*02, -A*11, -B*01 and -B*17. Each panel shows predictions for 8-, 9-, 10- and 11-mer peptides combined. Correlation coefficients are also shown in each panel.

lengths. Overall, R2 values were in the 0.35–0.71 range, with an average of 0.54.

Table 4 presents an overview of the peptide sets utilized for predictions, and associated R2 values. The sets of peptides of different sizes being tested for binding to various alleles ranged in size between 156 and 527 individual peptides. The fraction of peptides binding at the 500 nM level or better was usually in the 20–50% range. Notable exceptions were 8-, 10- and 11-mers in the case of Mamu-B*01 and Mamu-B*17, where less than 11% of the peptides bound, apparently reflecting a preference for ligands of nine residues in size for these two alleles.

Associated correlation coefficients for individual peptide lengths are also presented in Table 4. In general, high R2 values were observed. However, predictions of 8-mer, 10-mer and 11-mer binding to Mamu-B*01, as well as 8-mer binding to Mamu-B*17 were notably poor as indicated by correlation coefficients close to zero. These cases of poor prediction quality likely reflect the low number of good binding peptides of these lengths contained in the set of peptides tested for binding to the respective alleles.

Table 4
Overview of the correlation between measured and predicted log (IC50) affinity values

Allele + motif (a)          Peptide All peptides              Prediction  SIV derived peptides
                            length  Number of Percent         R2          Number of Number of    Number of
                                    peptides  binders (%) (b)             peptides  binders (b)  epitopes (c)
Mamu-A*01                   8       374       28              0.56        110       9            3
P2 = “STFWYNAGPLIVMCH”      9       527       39              0.49        150       17           10
P3 = “PTAC”                 10      481       37              0.38        173       9            1
C = “FLIVMWYTA”             11      293       32              0.30        94        2            1

Mamu-A*02                   8       156       37              0.40        52        12           3
P2 = “TSVAGLMI”             9       283       52              0.36        95        29           14
                            10      211       47              0.29        88        25           4
C = “MYFLWVA”               11      210       36              0.24        25        9            0

Mamu-A*11                   8       217       21              0.56        166       31           1
P2 = “EDM”                  9       465       40              0.55        163       58           4
                            10      277       42              0.54        147       38           0
C = “IVLMFWATY”             11      220       21              0.44        184       48           2

Mamu-B*01                   8       159       4               0.16        95        2            –
P2 = “DEASNGIQLIV”          9       211       34              0.76        76        2            –
                            10      204       0               0.03        78        0            –
C = “ILFVWY”                11      221       1               0.12        102       1            –

Mamu-B*17                   8       156       2               0.15        72        3            2
P2 = “HAMRFLPQKSCWYTG”      9       306       29              0.69        223       22           8
                            10      200       11              0.43        147       18           4
C = “WFY”                   11      197       6               0.36        77        7            2

(a) Motif as previously published. P2/P3/C refers to position 2/3/C-terminus of a peptide, respectively. Residues are listed in order of preference.
(b) Peptides with an IC50 value below 500 nM.
(c) All SIV derived peptides that were found to be binders were further tested for antigenicity. Epitopes are peptides eliciting a repeat response in an ELISPOT assay.

For Mamu-A*01, -A*02 and -A*11, predictions were most accurate for peptides of shorter lengths. This was true independent of the number of data points available for training, and even when normalized for the degrees of freedom of the different-length matrix predictions. This suggests that longer peptides may be inherently more difficult to predict, possibly because their position in the MHC binding groove is less constrained.

3.4. Antigenicity of SIV-derived binding peptides

SIV-derived peptides (n = 2317) represented a large fraction of our macaque MHC binding data. A total of 337 SIV-derived peptides (15%), which were associated with a measured IC50 value below 500 nM, are classified as binders [37]. Binding peptides were tested for antigenicity using PBMC from chronically and acutely SIV infected macaques in previous studies [21–24]. In the following, we use this set of peptides for a meta-analysis of the relationship between binding affinity and antigenicity. Fifty-nine peptides induced recall responses as measured by IFN-γ ELISPOT assays. Table 4 shows the allele and length distribution of the peptides tested, and the number of epitopes identified. More information is compiled in Table S1, Supplementary data, showing the 337 binding peptides, isolate of origin, source protein and position, sequence length, measured and predicted binding affinity, the results of previous antigenicity testing and the putative restricting MHC allele. For some peptides, the original publications [21–24] contain additional information which may be of interest, such as cytotoxicity and tetramer binding assays.

It is well known that a minimum binding ability of a peptide to MHC is required to induce an immune response. However, the quantitative effect of increasing affinity of a peptide for MHC (once the minimum threshold is passed) on the likelihood of a response being generated is not clearly defined. To analyze if a correlation exists between binding affinity and immunogenicity, we utilized receiver operating characteristics curves (ROC curves, see Section 2 for details). The area under the curve (AUC or AROC) is utilized to quantify the success of a prediction, with AUC = 0.5 corresponding to a random and AUC = 1.0 corresponding to a perfect prediction.

We found that peptides with a higher measured MHC binding affinity had an increased likelihood of eliciting an immune response (AUC = 0.58). These results suggest that this SIV-derived peptide data set could be utilized to derive a useful affinity threshold to predict antigenic epitopes. To our surprise, the predicted IC50 value was an even better indicator of potential antigenicity (AUC = 0.62; p < 0.001) than the measured IC50 value.
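To make the AUC figures above more concrete, the following sketch computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen epitope has a higher affinity (lower IC50) than a randomly chosen non-antigenic binder. This is a generic formulation and not necessarily the procedure described in Section 2; the function name `auc` and all IC50 values are hypothetical.

```python
import math


def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, so 0.5 corresponds to a random ranking
    and 1.0 to a perfect one.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical IC50 values (nM) of binders that were antigenic or not.
epitope_ic50 = [5, 12, 40, 150, 300]
non_epitope_ic50 = [8, 60, 90, 200, 350, 450]

# Lower IC50 means higher affinity, so rank on -log10(IC50).
pos = [-math.log10(v) for v in epitope_ic50]
neg = [-math.log10(v) for v in non_epitope_ic50]

print(f"AUC = {auc(pos, neg):.2f}")
```

The pairwise form is quadratic in the number of peptides, which is acceptable for a few hundred binders; a sort-based rank sum would scale better for larger sets.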

3.5. Identifying antigenic peptides within protein sequences

The results presented above were based on assaying peptides for antigenicity that contain the binding motif of the corresponding MHC allele, and were experimentally found to bind with high affinity. In most real life situations however, binders, and therefore epitope candidates, would be predicted directly from protein sequences. To mimic this type of scenario, all potential 8-, 9-, 10- and 11-mers from the SIV proteome were generated. For these, affinity predictions were made for all available alleles (except Mamu-B*01 for which antigenicity data is not yet available), resulting in a total of 51,738 predicted IC50 values.
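Generating every candidate peptide from a protein sequence amounts to sliding windows of length 8 to 11 along it. A minimal sketch follows; the function name `enumerate_peptides` and the toy sequence are hypothetical, and the sequence is not taken from SIV.

```python
def enumerate_peptides(protein, lengths=(8, 9, 10, 11)):
    """Yield (start, peptide) for every window of the given lengths.

    Start positions are 1-based, following the usual protein numbering.
    """
    for k in lengths:
        for start in range(len(protein) - k + 1):
            yield start + 1, protein[start:start + k]


# Toy 20-residue sequence, for illustration only.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
peptides = list(enumerate_peptides(toy_protein))

print(len(peptides))   # number of 8- to 11-mer windows in the toy sequence
print(peptides[0])     # first window and its start position
```

Each window would then be scored once per allele with the matrix of the matching length, which is how the number of predictions grows with both proteome size and the number of alleles considered.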

For this analysis, we had to modify the SMM affinity matrices described above, to take into account that peptides with non-anchor residues were specifically not tested for binding for all peptide lengths. Therefore, at main anchor positions we set the matrix values of residues for which no corresponding peptide existed in the training set 100-fold higher than that of the best (=lowest) value found for any other residue at the same position.

Fig. 3A depicts how the number of epitopes identified, and the total number of peptides predicted, varies when choosing as a cut-off the MHC binding affinity indicated on the x-axis. The corresponding fraction of known epitopes identified and the fraction of peptides predicted within SIV are shown in Fig. 3B. These results illustrate how choosing a restrictive cut-off will reduce the number of predictions, and accordingly the number of experiments needed to test the predicted peptides, at the cost of overlooking some of the epitopes. Fig. 3 provides a guideline to assist in the selection of a cut-off value suitable for the particular application considered.

Fig. 3. Identification of antigenic peptides within proteins using MHC binding predictions. (A) and (B) depict different representations of the same data: (A) total number of peptides/epitopes predicted in the SIV proteome when choosing the MHC affinity on the x-axis as the prediction cut-off. (B) Rates of predicted peptides/epitopes were calculated by dividing the numbers given in (A) by the total number of peptides contained in SIV (51,738), or the total number of SIV epitopes identified (59), respectively.

Furthermore, Table 5 illustrates how five different cut-offs would have performed in an analysis of the SIV proteome. Using a prediction cut-off of IC50 < 10 nM, a total of 39 peptides were identified. Of these 39 peptides, 13 were antigenic in SIV infected macaques. Therefore, this cut-off identified 22% (13 out of a total of 59) of all antigenic peptides known to us, while being highly discriminative in predicting only 0.1% (39 out of 51,738) of the total number of peptides in SIV. The identification of 44% of all antigenic peptides required a cut-off value of 50 nM, which predicted 185 (0.4%) of the SIV peptides, representing an about seven-fold increase in the number of peptides predicted. Cut-off values of 200 and 500 nM identified 68 and 80% of the antigenic peptides in our data set, respectively, and were associated with total positive prediction rates in the 1.3–2.4% range. The highest cut-off listed was 2000 nM, which identifies nearly all antigenic peptides (97%) but is associated with a 6.1% positive prediction rate. More cut-off values can also be easily generated from Fig. 3. Taken together, these observations demonstrate how various cut-off values for predicted binding affinities can be used to tailor sets of peptides with a greatly increased likelihood to be recognized by CD8+ T cells.

Table 5
Examples of algorithm performance for different prediction cut-offs

MHC affinity prediction
Prediction cut-off    SIV epitopes predicted    SIV peptides predicted
(IC50 nM)             ≤ cut-off                 ≤ cut-off
10                    13/59 (22%)               39/51,738 (0.1%)
50                    26/59 (44%)               211/51,738 (0.4%)
200                   40/59 (68%)               651/51,738 (1.3%)
500                   47/59 (80%)               1256/51,738 (2.4%)
2000                  57/59 (97%)               3131/51,738 (6.1%)

In a parallel series of experiments, we sought to validate the algorithms' performance following an independent approach. Our original scan of the SIV proteome for likely Mamu binding peptides was based only on the presence of primary anchor residues. With the more sophisticated binding predictions now available to us, we repeated the scan of the proteome looking for peptides predicted to bind to Mamu-A*01 with an IC50 value <500 nM. We identified 109 peptides that were not previously tested, most of which did not possess the exact anchor residues defined for Mamu-A*01. In essence, these peptides were predicted to compensate for the absence of primary anchors by having favorable residues elsewhere. Tested experimentally for MHC binding capacity, 57 out of the 109 peptides were found to bind with IC50 < 500 nM (Table S2, Supplementary data). This is a high success rate, especially as the fraction of non-anchor possessing peptides which are binders is at least 10-fold lower than that of anchor possessing peptides [49]. Taken together, these results provide independent experimental validation of the predictive algorithms.
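Returning to the cut-off trade-off summarized in Table 5, the percentages there follow directly from the listed counts. The sketch below recomputes them; the names `TABLE_5` and `cutoff_rates` are hypothetical, and the counts are those printed in Table 5.

```python
TOTAL_EPITOPES = 59        # antigenic SIV peptides in the data set
TOTAL_PEPTIDES = 51_738    # predicted IC50 values across the SIV proteome

# cut-off (nM) -> (epitopes predicted, peptides predicted), as in Table 5
TABLE_5 = {
    10: (13, 39),
    50: (26, 211),
    200: (40, 651),
    500: (47, 1256),
    2000: (57, 3131),
}


def cutoff_rates(cutoff_nm):
    """Return (fraction of epitopes recovered, fraction of peptides predicted)."""
    epitopes, peptides = TABLE_5[cutoff_nm]
    return epitopes / TOTAL_EPITOPES, peptides / TOTAL_PEPTIDES


for cutoff in sorted(TABLE_5):
    recovered, predicted = cutoff_rates(cutoff)
    print(f"{cutoff:>5} nM: {recovered:6.1%} of epitopes, "
          f"{predicted:5.2%} of peptides")
```

Reading the two columns side by side shows the cost of each additional recovered epitope in terms of extra peptides that would have to be synthesized and tested.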

3.6. Implementation of the predictive algorithms in a freely accessible website

We have implemented the binding predictions on a freely accessible website at www.mamu.liai.org. Fig. 4A shows the screen in which a user can define a prediction. First, the user has to specify the protein amino acid sequence(s) of interest using the standard single letter amino acid code. Then, the user has to specify which macaque allele he/she wants to make binding predictions for. Next, the user can specify if he/she is only interested in peptides of a specific length, between eight and eleven amino acids, or if predictions should be made for all those lengths.

Fig. 4. Screenshots of the www.mamu.liai.org website. (A) Image of the screen in which a user enters the parameters of a prediction. (B) Image of the prediction output.

For the display of results, the user can specify the sorting order and a cut-off limiting the displayed results to peptides with predicted IC50 values below that cut-off. Using such a cut-off is strongly recommended if the protein sequence(s) contain a total of more than 1000 amino acids, as transferring these results can otherwise take a long time. Fig. 4B depicts a screenshot of the table in which results are displayed. The rows in the table correspond to individual peptide binding predictions. By clicking on the corresponding table column headers, the rows can be sorted by peptide position in the protein, peptide length, peptide sequence, MHC allele the prediction is made for and predicted affinity.

To decide which peptides are of interest for further study, the user has to weigh the cost associated with making a certain number of experiments against the cost associated with missing a certain fraction of epitopes. As an example, we considered a user interested in identifying antigenic peptides of any length derived from a protein of 300 amino acids and recognized in animals expressing the Mamu-A*01 allele. In this protein, there are 1166 peptides of length between 8 and 11 residues (293 + 292 + 291 + 290). As can be seen in Table 5, 1.3% of all peptide binding predictions in SIV gave IC50 values below 200 nM. For our example, that would translate roughly to 1166 × 1.3% = 15 peptides. Again referring to Table 5, 68% of all identified SIV epitopes had a predicted IC50 value below 200 nM. The user can therefore estimate that for the 300 amino acid length protein, the set of 15 predicted peptides may contain the majority of epitopes. This estimate should only be taken as an example, as it relies on several assumptions, most importantly that the SIV data is a representative sample of antigenicity as a whole. Therefore, these estimates should only serve as a first guideline for study design, and should be re-evaluated as more experimental data becomes available.
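The back-of-the-envelope estimate for the 300 amino acid protein can be reproduced in a few lines. This is only a sketch of the arithmetic in the example; the function names are hypothetical, and the 1.3% rate at a 200 nM cut-off is the SIV-derived figure from Table 5, which may not transfer to other proteins.

```python
def count_peptides(protein_length, lengths=(8, 9, 10, 11)):
    """Number of windows of each length that fit into the protein."""
    return sum(max(protein_length - k + 1, 0) for k in lengths)


def expected_predictions(protein_length, fraction_below_cutoff):
    """Expected number of peptides predicted below the chosen cut-off."""
    return count_peptides(protein_length) * fraction_below_cutoff


n_peptides = count_peptides(300)              # 293 + 292 + 291 + 290
expected = expected_predictions(300, 0.013)   # 1.3% at the 200 nM cut-off

print(n_peptides)
print(round(expected))
```

Swapping in the rate for another cut-off from Table 5 gives a quick sense of how the experimental workload scales before any prediction is run.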

4. Discussion

We here present the first comprehensive experimental and computational study of peptide binding to MHC alleles and peptide antigenicity in Indian rhesus macaques. We have derived quantitative predictions of peptide binding to macaque MHC molecules, which are made publicly available on the web at www.mamu.liai.org.

The phenotypic frequencies of macaque MHC alleles we have targeted indicate that our predictions cover a majority of the captive macaque population. Rates of population coverage calculated based on Hardy-Weinberg equilibrium for Mamu A and B alleles agreed with the actual numbers counted when a large population of captive macaques was typed. In the future, we plan to extend our predictions to other alleles occurring at significant frequencies, and for which MHC binding motifs are known, and include them for public use on www.mamu.liai.org.

For peptide binding, we have generated predictions which output the IC50 value of a peptide binding to MHC class I molecules. Evaluating these predictions by comparing measured with predicted IC50 values in a cross validation study resulted in good agreement between the two, similar to that found in similar studies of binding predictions for human MHC alleles [39]. Our study simultaneously analyzes the quantitative binding capacity for ligands of different sizes for several different alleles.

We found a decrease in the prediction quality of peptide binding to MHC with increasing peptide length, which cannot be explained by the amount of training data available alone. This led us to speculate that longer peptides can more easily adopt different positions in the MHC binding groove.

The goal of most MHC binding experiments and predictions is to identify antigenic peptides. It is known that most antigenic peptides have a high affinity for MHC molecules. Ninety percent of known MHC class I T cell epitopes have an IC50 < 500 nM [37]. We examined here how much influence differences in binding affinity have on antigenicity for peptides that would all be considered binders. Indeed, we observed that peptides with a very high affinity are more likely to be antigenic than those with lesser affinities.

Repeating the above analysis using predicted instead of measured values for peptide affinities, we were surprised to find that the predicted affinity of a peptide was a somewhat better indicator of its antigenicity than its measured affinity. This may indicate that we are approaching the point in which the experimental error connected to measuring peptide binding is of similar or greater magnitude than the prediction error. As more data becomes available to generate predictions, this phenomenon is more and more likely to occur. This is because a non-negligible experimental error is always associated with individual experimental determinations while an accurate prediction can essentially average over different experimental errors to capture a more exact estimate of binding affinity. A slightly better correlation between predicted rather than measured peptide affinities and their antigenicity was reported before in [50].

Antigenic peptides were defined here as peptides eliciting a response in ex vivo assays using fresh PBMCs from SIV infected macaques. In such an assay, many factors other than the ability of a peptide to bind to MHC molecules influence if a peptide is antigenic or not. The level of expression of the peptide-containing protein, the efficiency with which the peptide is liberated from the protein by proteasomal cleavage and the T cell repertoire available are all examples of processes that can influence if a peptide will be recognized. Therefore, it is not surprising that a large number of binding peptides were not found to be antigenic.

Applying the newly established MHC binding predictions to screen the SIV proteome for potential binders and antigenic peptides identified a somewhat different candidate set than those previously experimentally tested. In the case of Mamu-A*01, we tested 109 newly predicted peptides, most of which did not contain the complete anchor motif for binding. More than half of these peptides did indeed bind, further validating our predictive method. Further studies will address whether the newly identified binders contain a similar fraction of antigenic peptides as the previously identified peptides.

To ensure thorough representation, using predicted IC50 cut-offs of 11, 14 and 18 nM, we identified 1, 2 and 3 antigenic SIV peptides, respectively, for each allele (except B*01). For a virus of similar sequence length and composition, these cut-offs may therefore be sufficient to identify a pool of peptides that contains at least one peptide antigenic in the context of each allele. The computational resources needed to screen a proteomic sequence are minimal, and scanning the entire SIV proteome using our website only takes a few seconds. At the same time, such a computational screening significantly reduces the amount of experimental work and easily addresses any pathogen of interest.

It is important to validate the epitope identification approach described here for different viral systems. Possible targets for further study include the influenza, vaccinia, Ebola and Lassa viruses. These are known to be infectious in rhesus macaques, making them a reasonable model for the testing of vaccines and diagnostics destined for human use.

The new tools available at www.mamu.liai.org, based on the studies reported herein, include the first set of predictive tools for rhesus macaques that should bridge a gap in knowledge and availability of tools designed to assist investigations in basic host-pathogen interactions and in the evaluation of new diagnostic and vaccine candidates alike.

Acknowledgments

This work was supported by the National Institutes of Health's contract HHSN26620040006C (Immune Epitope Database and Analysis Program), grant R24 RR15371 (MHC Bound, SIV Derived, CTL and HTL Epitopes) and 5P51 RR000167 (WNPRC).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.vaccine.2005.07.086.

References

[1] Pamer E, Cresswell P. Mechanisms of MHC class I–restricted antigen processing. Annu Rev Immunol 1998;16:323–58.

[2] Parker KC, Bednarek MA, Coligan JE. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol 1994;152(1):163–75.

[3] Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanovic S. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999;50(3–4):213–9.

[4] Donnes P, Elofsson A. Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinform 2002;3(1):25.

[5] Bhasin M, Raghava GP. SVM based method for predicting HLA-DRB1*0401 binding peptides in an antigen sequence. Bioinformatics 2004;20(3):421–3.

[6] Guan P, Doytchinova IA, Zygouri C, Flower DR. MHCPred: A server for quantitative prediction of peptide-MHC binding. Nucl Acids Res 2003;31(13):3621–4.

[7] Hakenberg J, Nussbaum AK, Schild H, Rammensee HG, Kuttler C, Holzhutter HG, et al. MAPPP: MHC class I antigenic peptide processing prediction. Appl Bioinform 2003;2(3):155–8.

[8] Hattotuwagama CK, Guan P, Doytchinova IA, Zygouri C, Flower DR. Quantitative online prediction of peptide binding to the major histocompatibility complex. J Mol Graph Model 2004;22(3):195–207.

[9] Singh H, Raghava GP. ProPred1: prediction of promiscuous MHC Class-I binding sites. Bioinformatics 2003;19(8):1009–14.

[10] Reche PA, Glutting JP, Zhang H, Reinherz EL. Enhancement to the RANKPEP resource for the prediction of peptide binding to MHC molecules using profiles. Immunogenetics 2004;56(6):405–19.

[11] Peters B, Bulik S, Tampe R, van Endert PM, Holzhutter HG. Identifying MHC class I epitopes by predicting the TAP transport efficiency of epitope precursors. J Immunol 2003;171(4):1741–9.

[12] Vogel TU, Friedrich TC, O’Connor DH, Rehrauer W, Dodds EJ, Hickman H, et al. Escape in one of two cytotoxic T-lymphocyte epitopes bound by a high-frequency major histocompatibility complex class I molecule, Mamu-A*02: a paradigm for virus evolution and persistence? J Virol 2002;76(22):11623–36.

[13] O’Connor DH, Allen TM, Watkins DI. Cytotoxic T-lymphocyte escape monitoring in simian immunodeficiency virus vaccine challenge studies. DNA Cell Biol 2002;21(9):659–64.

[14] O’Connor DH, Allen TM, Vogel TU, Jing P, DeSouza IP, Dodds E, et al. Acute phase cytotoxic T lymphocyte escape is a hallmark of simian immunodeficiency virus infection. Nat Med 2002;8(5):493–9.

[15] Allen TM, Mortara L, Mothe BR, Liebl M, Jing P, Calore B, et al. Tat-vaccinated macaques do not control simian immunodeficiency virus SIVmac239 replication. J Virol 2002;76(8):4108–12.

[16] O’Connor D, Friedrich T, Hughes A, Allen TM, Watkins D. Understanding cytotoxic T-lymphocyte escape during simian immunodeficiency virus infection. Immunol Rev 2001;183:115–26.

[17] Allen TM, O’Connor DH, Jing P, Dzuris JL, Mothe BR, Vogel TU, et al. Tat-specific cytotoxic T lymphocytes select for SIV escape variants during resolution of primary viraemia. Nature 2000;407(6802):386–90.

[18] Mothe BR, Horton H, Carter DK, Allen TM, Liebl ME, Skinner P, et al. Dominance of CD8 responses specific for epitopes bound by a single major histocompatibility complex class I molecule during the acute phase of viral infection. J Virol 2002;76(2):875–84.

[19] Mothe BR, Weinfurter J, Wang C, Rehrauer W, Wilson N, Allen TM, et al. Expression of the major histocompatibility complex class I molecule Mamu-A*01 is associated with control of simian immunodeficiency virus SIVmac239 replication. J Virol 2003;77(4):2736–40.

[20] Friedrich TC, Dodds EJ, Yant LJ, Vojnov L, Rudersdorf R, Cullen C, et al. Reversion of CTL escape-variant immunodeficiency viruses in vivo. Nat Med 2004;10(3):275–81.

[21] Allen TM, Mothe BR, Sidney J, Jing P, Dzuris JL, Liebl ME, et al. CD8(+) lymphocytes from simian immunodeficiency virus-infected rhesus macaques recognize 14 different epitopes bound by the major histocompatibility complex class I molecule mamu-A*01: implications for vaccine design and testing. J Virol 2001;75(2):738–49.

[22] Mothe BR, Sidney J, Dzuris JL, Liebl ME, Fuenger S, Watkins DI, et al. Characterization of the peptide-binding specificity of Mamu-B*17 and identification of Mamu-B*17-restricted epitopes derived from simian immunodeficiency virus proteins. J Immunol 2002;169(1):210–9.

[23] Loffredo JT, Sidney J, Wojewoda C, Dodds E, Reynolds MR, Napoe G, et al. Identification of seventeen new simian immunodeficiency virus-derived CD8+ T cell epitopes restricted by the high frequency molecule, Mamu-A*02, and potential escape from CTL recognition. J Immunol 2004;173(8):5064–76.

[24] Sette A, Sidney J, Bui HH, Del Guercio MF, Alexander J, Loffredo J, et al. Characterization of the peptide-binding specificity of Mamu-A*11 results in the identification of SIV-derived epitopes and interspecies cross-reactivity. Immunogenetics 2005.

[25] Allen TM, Jing P, Calore B, Horton H, O’Connor DH, Hanke T, et al. Effects of cytotoxic T lymphocytes (CTL) directed against a single simian immunodeficiency virus (SIV) Gag CTL epitope on the course of SIVmac239 infection. J Virol 2002;76(20):10507–11.

[26] Vogel TU, Reynolds MR, Fuller DH, Vielhuber K, Shipley T, Fuller JT, et al. Multispecific vaccine-induced mucosal cytotoxic T lymphocytes reduce acute-phase viral replication but fail in long-term control of simian immunodeficiency virus SIVmac239. J Virol 2003;77(24):13348–60.

[27] Moniuszko M, Bogdan D, Pal R, Venzon D, Stevceva L, Nacsa J, et al. Correlation between viral RNA levels but not immune responses in plasma and tissues of macaques with long-standing SIVmac251 infection. Virology 2005;333(1):159–68.

[28] Moniuszko M, Brown C, Pal R, Tryniszewska E, Tsai WP, Hirsch VM, et al. High frequency of virus-specific CD8+ T cells in the central nervous system of macaques chronically infected with simian immunodeficiency virus SIVmac251. J Virol 2003;77(22):12346–51.

[29] Sidney J, Dzuris JL, Newman MJ, Johnson RP, Kaur A, Amitinder K, et al. Definition of the Mamu A*01 peptide binding specificity: application to the identification of wild-type and optimized ligands from simian immunodeficiency virus regulatory proteins. J Immunol 2000;165(11):6387–99.

[30] Dzuris JL, Sidney J, Horton H, Correa R, Carter D, Chesnut RW, et al. Molecular determinants of peptide binding to two common rhesus macaque major histocompatibility complex class II molecules. J Virol 2001;75(22):10958–68.

[31] Knapp LA, Lehmann E, Piekarczyk MS, Urvater JA, Watkins DI. A high frequency of Mamu-A*01 in the rhesus macaque detected by polymerase chain reaction with sequence-specific primers and direct sequencing. Tissue Antigens 1997;50(6):657–61.

[32] Robinson S, Charini WA, Newberg MH, Kuroda MJ, Lord CI, Letvin NL. A commonly recognized simian immunodeficiency virus Nef epitope presented to cytotoxic T lymphocytes of Indian-origin rhesus monkeys by the prevalent major histocompatibility complex class I allele Mamu-A*02. J Virol 2001;75(21):10179–86.

[33] Evans DT, Jing P, Allen TM, O’Connor DH, Horton H, Venham JE, et al. Definition of five new simian immunodeficiency virus cytotoxic T-lymphocyte epitopes and their restricting major histocompatibility complex class I molecules: evidence for an influence on disease progression. J Virol 2000;74(16):7400–10.

[34] Ruppert J, Sidney J, Celis E, Kubo RT, Grey HM, Sette A. Prominent role of secondary anchor residues in peptide binding to HLA-A2.1 molecules. Cell 1993;74(5):929–37.

[35] Sidney J, Southwood S, Oseroff C, Del Guercio MF, Sette A, Grey H. Measurement of MHC/peptide interactions by gel filtration. In: Current protocols in immunology. John Wiley & Sons, Inc.; 1998, 18.13.11–18.13.19.

[36] Allen TM, Sidney J, del Guercio MF, Glickman RL, Lensmeyer GL, Wiebe DA, et al. Characterization of the peptide binding motif of a rhesus MHC class I molecule (Mamu-A*01) that binds an immunodominant CTL epitope from simian immunodeficiency virus. J Immunol 1998;160(12):6062–71.

[37] Sette A, Vitiello A, Reherman B, Fowler P, Nayersina R, Kast WM, et al. The relationship between class I binding affinity and immunogenicity of potential cytotoxic T cell epitopes. J Immunol 1994;153(12):5586–92.

[38] Klinman DM. ELISPOT assay to detect cytokine-secreting murine and human cells. Curr Prot Immunol 1994;6(19):1–8.

[39] Peters B, Tong W, Sidney J, Sette A, Weng Z. Examining the independent binding assumption for binding of peptide epitopes to MHC-I molecules. Bioinformatics 2003;19(14):1765–72.

[40] Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinform 2005;6(1):132.

[41] Swets JA. Measuring the accuracy of diagnostic systems. Science 1988;240(4857):1285–93.

[42] Efron B, Tibshirani RJ. An introduction to the bootstrap. Chapman & Hall; 1993.

[43] Otting N, Heijmans CM, Noort RC, de Groot NG, Doxiadis GG, van Rood JJ, et al. Unparalleled complexity of the MHC class I region in rhesus macaques. Proc Natl Acad Sci USA 2005;102(5):1626–31.

[44] Boyson JE, Shufflebotham C, Cadavid LF, Urvater JA, Knapp LA, Hughes AL, et al. The MHC class I genes of the rhesus monkey. Different evolutionary histories of MHC class I and II genes in primates. J Immunol 1996;156(12):4656–65.

[45] Cadavid LF, Watkins DI. The duplicative nature of the MHC class I genes: an evolutionary perspective. Eur J Immunogenet 1997;24(4):313–22.

[46] Geraghty DE, Daza R, Williams LM, Vu Q, Ishitani A. Genetics of the immune response: identifying immune variation within the MHC and throughout the genome. Immunol Rev 2002;190:69–85.

[47] Sette A, Sidney J, del Guercio MF, Southwood S, Ruppert J, Dahlberg C, et al. Peptide binding to the most frequent HLA-A class I alleles measured by quantitative molecular binding assays. Mol Immunol 1994;31(11):813–22.

[48] Kubo RT, Sette A, Grey HM, Appella E, Sakaguchi K, Zhu NZ, et al. Definition of specific peptide motifs for four major HLA-A alleles. J Immunol 1994;152(8):3913–24.

[49] Kast WM, Brandt RM, Sidney J, Drijfhout JW, Kubo RT, Grey HM, et al. Role of HLA-A motifs in identification of potential CTL epitopes in human papillomavirus type 16 E6 and E7 proteins. J Immunol 1994;152(8):3904–12.

[50] Honeyman MC, Brusic V, Stone NL, Harrison LC. Neural network-based prediction of candidate T-cell epitopes. Nat Biotechnol 1998;16(10):966–9.