
Nonredundant mass spectrometry: A strategy to integrate mass spectrometry acquisition and analysis


Alexander Scherl1, Patrice Francois2, Véronique Converset1, Manuela Bento2, Jennifer A. Burgess1, Jean-Charles Sanchez1, Denis F. Hochstrasser1, Jacques Schrenzel2 and Garry L. Corthals1

1 Biomedical Proteomics Research Group, Central Clinical Chemistry Laboratory
2 Genomic Research Laboratory, Department of Infectious Diseases
Geneva University Hospital, Geneva, Switzerland

Protein identification using automated data-dependent tandem mass spectrometry (MS/MS) is now a standard procedure. However, in many cases data-dependent acquisition becomes redundant acquisition as many different peptides from the same protein are fragmented, whilst only a few are needed for unambiguous identification. To increase the quality of information but decrease the amount of information, a nonredundant MS (nrMS) strategy has been developed. With nrMS, data analysis is an integral part of the overall MS acquisition and analysis, and not an endpoint as typically performed. In this nrMS workflow a matrix-assisted laser desorption/ionization-time of flight-time of flight (MALDI-TOF/TOF) instrument is used. MS and restricted MS/MS data are searched and identified proteins are used to generate an "exclusion list", after in silico digestion. Peptide fragmentation is then restricted to only the most intense ions not present in the exclusion list. This process is repeated until all peaks are accounted for or the sample is consumed. Compared to nanoLC-MS/MS, nrMS yielded similar results for the analysis of six pooled two-dimensional electrophoresis (2-DE) spots. In comparison to standard data-dependent MALDI-MS/MS for sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel band analysis, nrMS dramatically increased the number of identified proteins. It was also found that this new workflow significantly increased sequence coverage by identifying unexpected peptides, which can result from post-translational modifications.

Keywords: Matrix-assisted laser desorption/ionization-time of flight-time of flight / Nonredundant / Result-dependent acquisition

Received 28/8/03; Revised 9/10/03; Accepted 24/10/03

Proteomics 2004, 4, 917–927 917

1 Introduction

Over the past decade, mass spectrometry (MS) has become the technology of choice for protein identification. Enzymatic protein digestion, peptide mass analysis, peptide fragmentation, and data analysis by database searching are now a standard procedure. For LC-ESI-MS/MS experiments, peptide selection for fragmentation is usually done in an automated 'data-dependent' manner [1]. First, ions above a set threshold are chosen for MS/MS, and second, selected ions are filtered and fragmented by collision-induced dissociation (CID). The data are only analyzed at the end of the run. This way of performing analysis results in fragmentation of multiple peptides from the same protein and other 'ionization events' (noise). This is what we term redundant data acquisition. An inherent problem with this technique is the multiple successive fragmentation of ions that have previously been analyzed. In recent years, approaches to limit this problem have used 'dynamic exclusion', where the time spent acquiring MS/MS data on specific peptide ions is controlled, consequently increasing the sequence coverage [2]. This does not, however, limit the successive fragmentation of peptide ions that have a different charge state from those already fragmented, nor does it allow for exclusion of ions that are not peptides. Even with these imperfections, this approach has proven to be particularly useful for automated (multiple)LC-MS/MS workflows, and currently empowers high-throughput projects where many hundreds of proteins can be identified [3, 4].

Correspondence: Dr. Garry L. Corthals, Biomedical Proteomics Research Group, Geneva University Hospital/LCCC, 24 rue Micheli-du-Crest, CH-1211 Genève 14, Switzerland
E-mail: [email protected]
Fax: +41-22-3727370

Abbreviations: nrMS, nonredundant mass spectrometry; PMF, peptide mass fingerprinting; PTM, post-translational modification

2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

DOI 10.1002/pmic.200300673


The most common use of this technique, however, is the analysis of single proteins from gel bands or gel spots, where the throughput is tens of samples per day. In an attempt to tackle these problems, "real-time database searching and intelligent precursor selection" were recently presented for an LC-ESI-MS/MS setup [5]. The implementation of such a workflow using MALDI ionization is reported here.

Ideally, one would use an intelligent strategy where analysis guides acquisition. With the nonredundant mass spectrometry (nrMS) strategy proposed here, such control exists. In this workflow, data acquisition and data analysis are used consecutively. Following primary acquisition, a data analysis step is introduced. Based on the knowledge of the first identified protein in the sample, precursor selection for analysis is then filtered to avoid repetitive identification of unmodified peptides from the same protein. Further analysis is then limited to peptides that have m/z values different from the m/z values of unmodified peptides from these entries. The analysis is repeated until the sample is consumed or no further ions are unaccounted for. The nrMS workflow is shown in Fig. 1. For this type of result-dependent workflow, time is needed between two analysis steps. MALDI-MS/MS represents an ideal platform for this result-dependent acquisition because the sample is immobilized and available for further investigation. In one MS acquisition, the information about all peptides present is available before any MS/MS acquisition. A simple peptide mass fingerprint (PMF) usually results in at least one protein identification. In the case of ambiguities, the PMF result can be confirmed with one or more MS/MS spectra. If a prior peptide separation step is introduced (for example, HPLC-MALDI), the fragmentation of only a limited number of peptides should result in a positive identification of one or more database entries. A minimum amount of sample is therefore consumed before the first data analysis step. The remaining sample can then be used for further characterization of other peptides. Introducing a data analysis step at this stage is useful for many kinds of analyses. In the case of multiple proteins present in the sample, this results in an increased number of identifications. Moreover, it can also help to selectively confirm the presence of a protein of lower abundance, avoid acquisition of usual contaminants, increase the quality of the identification by selective fragmentation or limit the acquisition of selected peptides (as, for example, peptides labelled with stable isotopes). The ability to increase the number of identifications in the presence of multiple proteins and the identification of modified peptides are demonstrated in this work.

Figure 1. nrMS workflow. After digestion, peptide masses are analyzed by MS. Only a few peptides are chosen for MS/MS analysis (two in this example). Both MS and MS/MS data are then used for database searching. The identified protein is then in silico digested and the obtained peptide masses are used to build up an exclusion list. MS/MS is then performed on the most abundant peptides not present on this exclusion list, controlling subsequent acquisition. These steps are then repeated several times, until each peak is either fragmented or its association with a given protein is known.

A key component of this nrMS workflow is a TOF/TOF tandem MS analyzer equipped with a collision cell. Besides the obvious advantage of the MALDI target, which allows one to pause between acquisition and analysis, these instruments have several advantages in comparison to the typical ESI-MS/MS instruments (triple quadrupole, ion trap or quadrupole-TOF). Metastable decay (or in-source fragmentation) results in low energy fragments such as y- and b-ions as seen with the typical ESI-MS/MS instruments. However, high energy fragments, such as x-, w-, c- and d-ions, can also be obtained from CID after the first TOF mass filter [6]. The generated spectra are a result of the combination of these two types of fragmentation. Spectrum complexity may make manual de novo sequencing more difficult but still possible [7, 8]. On the other hand, these spectra, rich in information, are useful for validating ambiguous database search results [9]. In addition, a tandem instrument combined with a MALDI source and a high-frequency laser (200 Hz) provides very high-speed data acquisition and therefore allows high-throughput analysis. Analysis of selected, useful data also results in a significant reduction of the analysis time. It is also important to note that the nrMS strategy can potentially be used with any kind of MS/MS platform using a MALDI ionization source. The data quality, however, will depend on the mass accuracy of the instrument and the resolution of the mass filter. Both these parameters contribute to the successful use of an nrMS strategy and will vary between different MS/MS instruments. A fully automated workflow (under construction) is of course desirable for this approach.

The nrMS workflow was evaluated with a MALDI-TOF/TOF-MS. Three different experiments were performed using samples from Staphylococcus aureus N315 [10]. In the first experiment, three sets of six different protein spots from 2-DE gels were pooled. The results obtained from nrMS were compared to nanoLC-ESI-MS/MS analysis on the same pooled spots, and to the analysis of the nonpooled gel spots using the 'standard' data-dependent MALDI-TOF/TOF approach. In the second experiment, nrMS was used to analyze protein mixtures in 1-D SDS-PAGE gel bands loaded with a membrane extract of S. aureus N315. In the third experiment, nrMS was applied to individual protein spots from 2-DE gels and compared to a classic data-dependent MALDI-TOF/TOF analysis. nrMS proved to be valuable in all cases. This was evident from the decrease in acquisition time and number of MS/MS spectra. This ultimately results in quicker search times and less data validation.

2 Materials and methods

2.1 Reagents and chemicals

All chemicals purchased were of the highest purity available, unless otherwise stated. Milli-Q water (Millipore, Bedford, MA, USA) was used in the preparation of all buffers and solvents. Methanol, hydrochloric acid, magnesium chloride, potassium chloride, and Coomassie Brilliant Blue R-250 were purchased from Merck (Darmstadt, Germany). Acetonitrile and acrylamide:bis solution (37.5:1) were purchased from Biosolve (Valkenswaard, Holland). SDS and 2,2,2-trifluoroethanol (99.0%) were from Fluka (Buchs, Switzerland). Trifluoroacetic acid, 1,4-dithioerythritol (DTE), ammonium bicarbonate, iodoacetamide, CHAPS, glycerol, glycine, phosphate-buffered saline, porcine trypsin, and Tris were from Sigma-Aldrich (St. Louis, MO, USA). IPG strips (4–7) were purchased from Amersham Biosciences (Piscataway, NJ, USA). Agarose and molecular mass markers were from Bio-Rad (Hercules, CA, USA).

2.2 Cell culture, protein extraction and gel electrophoresis

S. aureus N315 was grown in Mueller Hinton Broth (Difco, Detroit, MI, USA) as previously described [11], with minor modifications. For total protein extracts, cells were lysed with 20 µg/mL lysostaphin (Ambicin, Applied Microbiology, Tarrytown, NY, USA) for 15 min at 37 °C, in Tris-EDTA (TE) buffer. Insoluble material was removed by centrifugation at 5000 × g for 10 min. Protoplasts were prepared using the same amount of lysostaphin in the presence of 1.1 M sucrose. Protoplasts were lysed in hypotonic buffer and membrane extracts collected by centrifugation at 33 000 × g for 1 h. 2-DE was performed in the presence of trifluoroethanol during isoelectric focusing as described earlier [12]. SDS-PAGE was performed using a standard procedure with 12.5% acrylamide [13]. 2-DE gels were analyzed using Melanie 3 software (Genebio, Geneva, Switzerland).

2.3 Creation of protein mixtures from 2-DE spots

In order to obtain controlled protein mixtures, spots from 2-DE gels were pooled. For this, four identical 2-DE gels were prepared. These gels were loaded with 1 mg of S. aureus total protein extract and stained with Coomassie Brilliant Blue R-250. Three spot sets were chosen from each of these identical gels. Each set contained six spots that were chosen at similar molecular masses to minimize experimental variation. For the first set, six spots were chosen at ~60 kDa; for the second set, six spots at ~35 kDa; and for the third set, six spots at ~20 kDa (Fig. 2).

2.4 Protein mixtures from SDS-PAGE gel bands

An SDS-PAGE gel lane was loaded with 50 µg of a membrane extract of S. aureus. Two different bands (1.5 × 6 mm) in the 60 kDa region were excised and in-gel digested with trypsin. The peptide extract was divided into two equal parts (for duplicate experiments) and spotted on a MALDI target plate.

2.5 In-gel digestion and peptide extraction

Spots were cut from 2-DE gels using a 1.4 mm inner diameter needle and placed in 0.2 mL polypropylene vials (ABgene, Epsom, UK). Gel bands of ~1.5 mm × 6 mm were cut out from the SDS-PAGE gel. Gel pieces were destained, in-gel digested with trypsin, and the peptides extracted using previously described methods [14]. Prior to peptide MS analysis, the volume of each gel spot extraction was adjusted to 5 µL by addition of 50% v/v CH3CN with 0.1% v/v TFA. Each sample was deposited on two separate wells of a 192-well MALDI target plate (0.5 µL for each well) and dried under vacuum. Matrix solution (0.5 µL of 10 mg/mL α-cyano-4-hydroxycinnamic acid in 50% v/v CH3CN, 0.1% v/v TFA) was added to the previously deposited sample and the samples were again dried under vacuum to improve cocrystallization of matrix and sample.

Figure 2. Coomassie blue-stained 2-DE gel window. Spots 1–6 are from the 60 kDa region, spots 7–12 from the 35 kDa region, spots 13–18 from the 20 kDa region.


2.6 MALDI-TOF/TOF acquisition

MS and MS/MS analysis of peptides from 2-DE gel spots was performed with a 4700 Proteomics Analyzer MALDI-TOF/TOF mass spectrometer (Applied Biosystems, Framingham, MA, USA). For MS analysis, the instrument was tuned in reflector mode. The laser intensity was set 5% higher than the threshold for ionization. MS spectra were obtained after accumulation of 1000 consecutive laser shots. For MS/MS analysis, the laser intensity was set 15% higher than the threshold for ionization. Nitrogen was used as collision gas and a medium CID pressure was selected (approximately 8 × 10⁻⁷ torr as measured by the Source 2 pressure gauge). Data-dependent MS/MS analysis was performed in automatic mode, with the ten highest peaks in MS selected for MS/MS from the most intense ion to the least intense. For MS analysis, external calibration was performed on a tryptic digest of lysozyme C. For MS/MS, external calibration was performed on fragments from the precursor at m/z 1753 from the lysozyme C digest. Calibration was performed on the fragment ions of the precursor and the precursor ion. MS/MS spectra were the result of 1500 consecutive laser shots.

2.7 nrMS data acquisition

Instrument parameters for nrMS data acquisition were the same as for the data-dependent acquisition described above. A precursor list of all peaks with a S/N > 5 was generated by the instrument software. The two most intense ions were selected in a data-dependent manner for the first round of MS/MS analysis. In the case of a positive identification, all other matching peptides from the identified protein (retrieved from the database search tool Mascot 1.8) were deleted from the precursor list. When no positive identification resulted from the search, the two next most intense precursors were selected for MS/MS. This interrogation was repeated until the sample was consumed or until all precursor ions were accounted for.
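The selection loop described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instrument software: the names `nrms_acquire` and `within_ppm` are hypothetical, the `search` callback stands in for a Mascot query followed by in silico digestion of any identified protein, `max_spectra` is a hypothetical stand-in for sample consumption, and reusing the 400 ppm peptide tolerance of the database search to match precursors against the exclusion list is an assumption.

```python
from typing import Callable, List, Optional, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def within_ppm(mz: float, ref: float, tol_ppm: float) -> bool:
    """True if mz lies within tol_ppm (parts per million) of ref."""
    return abs(mz - ref) / ref * 1e6 <= tol_ppm


def nrms_acquire(
    precursors: List[Peak],
    search: Callable[[List[float]], List[float]],
    tol_ppm: float = 400.0,             # assumption: same as the search tolerance
    per_round: int = 2,                 # two precursors per round, as in the text
    max_spectra: Optional[int] = None,  # hypothetical stand-in for sample consumption
) -> List[float]:
    """Return the m/z values fragmented, in acquisition order.

    `search` receives the m/z values fragmented in one round and returns the
    theoretical peptide m/z values of any protein identified in that round
    (an empty list when nothing was identified).
    """
    # Most intense precursors are interrogated first.
    remaining = sorted(precursors, key=lambda p: p[1], reverse=True)
    fragmented: List[float] = []
    while remaining:
        budget = per_round
        if max_spectra is not None:
            budget = min(per_round, max_spectra - len(fragmented))
            if budget <= 0:
                break  # sample considered consumed
        batch, remaining = remaining[:budget], remaining[budget:]
        batch_mz = [mz for mz, _ in batch]
        fragmented.extend(batch_mz)
        # Peptides of identified proteins form the exclusion list for this round.
        exclusion = search(batch_mz)
        remaining = [
            p for p in remaining
            if not any(within_ppm(p[0], ex, tol_ppm) for ex in exclusion)
        ]
    return fragmented


def fake_search(batch_mz: List[float]) -> List[float]:
    """Toy search: fragmenting m/z 1000.0 'identifies' a protein whose other
    tryptic peptides are expected at m/z 1500.0 and 1800.0."""
    return [1500.0, 1800.0] if 1000.0 in batch_mz else []


peaks = [(1000.0, 100.0), (1200.0, 90.0), (1500.2, 80.0),
         (1800.1, 70.0), (2000.0, 60.0)]
fragmented = nrms_acquire(peaks, fake_search)
```

In this toy run the peaks at m/z 1500.2 and 1800.1 are never fragmented, because they fall within tolerance of peptides predicted for the protein identified in the first round.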

2.8 LC-ESI-MS/MS

NanoLC-ESI-MS/MS was performed on a DecaXP ion trap (ThermoFinnigan, San Jose, CA, USA) coupled with an LC-PAL autosampler (CTC Analytics, Zwingen, Switzerland) and Rheos 2000 Micro HPLC Pump (Flux Instruments, Basel, Switzerland). For each experiment, 5 µL of sample was injected on a C18 reversed-phase column of 75 µm inner diameter packed in-house with YMS-ODS-AQ200 (Michrom Bioresources, Auburn, CA, USA). Peptides were eluted from the column using a CH3CN gradient in the presence of 0.1% formic acid. For peptide elution, CH3CN concentration was increased from 16 to 68% in 18 min. A flow splitter was used to decrease the flow rate from 250 µL/min to ~0.2 µL/min. An 1800 V potential was applied on the nano-electrospray capillary (New Objective, Woburn, MA, USA). Helium was used as collision gas. The collision energy was set at 35% of the maximum. MS/MS spectra were acquired by automatic switching between MS and MS/MS mode. The two highest peaks from each MS scan were chosen for MS/MS. MS/MS spectra were limited to one scan per precursor in one minute.
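The restriction to one MS/MS scan per precursor per minute is a form of dynamic exclusion. A minimal sketch of such a filter is given below; it is not the ion trap vendor software, the class name `DynamicExclusion` is hypothetical, and the m/z matching window `tol_da` is an assumed value that is not stated in the text.

```python
from typing import List, Tuple


class DynamicExclusion:
    """Allow at most one MS/MS scan per precursor m/z within a time window."""

    def __init__(self, window_s: float = 60.0, tol_da: float = 1.0) -> None:
        self.window_s = window_s  # one minute, as stated in the text
        self.tol_da = tol_da      # assumption: m/z matching window in Da
        self._recent: List[Tuple[float, float]] = []  # (m/z, time fragmented)

    def allow(self, mz: float, t: float) -> bool:
        """Return True and record the precursor if it may be fragmented at time t."""
        # Forget precursors whose exclusion window has expired.
        self._recent = [(m, ts) for m, ts in self._recent
                        if t - ts < self.window_s]
        if any(abs(mz - m) <= self.tol_da for m, _ in self._recent):
            return False
        self._recent.append((mz, t))
        return True


exclusion = DynamicExclusion()
first = exclusion.allow(500.3, 0.0)        # first time this ion is seen
repeat = exclusion.allow(500.5, 30.0)      # same ion again within one minute
after_window = exclusion.allow(500.5, 61.0)  # the window has expired
```

Because the filter compares m/z values only, the same peptide observed in a different charge state is not excluded, which is the limitation noted in the Introduction.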

2.9 Database search and validation

Data from MALDI MS and MS/MS acquisitions were used in a combined search with MASCOT [15] Version 1.8 after peak detection with embedded routine software ("Peak-to-Mascot"). Spectra from ESI-MS/MS analysis were converted to DTA files and regrouped using in-house software, and the database search was also performed with MASCOT 1.8. For MALDI data, MASCOT was used with the following parameters: peptide tolerance 400 ppm, fragment tolerance 0.4 Da, trypsin specificity with one possible missed cleavage site, and MALDI-TOF/TOF selected as the instrument. For the nrMS workflow, the amino acid sequences from identified proteins were used to generate a theoretical peptide list with trypsin as the endoprotease. The specificity was one missed cleavage, with trypsin hydrolysis of the peptide bond at the C-terminus of lysine and arginine, if not followed by a proline. For ESI-MS/MS data, a tolerance of 2.0 Da was chosen for the precursor and 1.0 Da for fragments. ESI-TRAP was selected as the instrument. The combined SWISS-PROT and TrEMBL database was searched without species restriction. Modifications were searched using the error-tolerant Mascot tool [16] and the Unimod modification database (http://www.unimod.org/). Database search results with a Mascot score higher than the threshold were accepted (= identified) if the top scoring protein corresponded to the correct species (S. aureus). Under the conditions described above, the threshold of significance corresponded to a Mascot score of 35 or higher.
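The theoretical peptide list used to build the exclusion list can be illustrated with a short in silico digestion sketch. It applies the rule stated above (cleavage C-terminal to lysine or arginine unless the next residue is proline, with up to one missed cleavage) and reports singly protonated monoisotopic masses, as typically observed in MALDI. The function names are hypothetical, fixed or variable modifications (for example cysteine alkylation) are deliberately left out, and the test sequence is invented for illustration.

```python
from typing import List

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def tryptic_peptides(sequence: str, missed_cleavages: int = 1) -> List[str]:
    """Cleave after K or R unless followed by P; allow missed cleavages."""
    if not sequence:
        return []
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))
    peptides = []
    for i in range(len(sites) - 1):
        # j indexes the end site; each extra step is one missed cleavage.
        last = min(i + 1 + missed_cleavages, len(sites) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(sequence[sites[i]:sites[j]])
    return peptides


def mh_plus(peptide: str) -> float:
    """Monoisotopic m/z of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def exclusion_list(sequence: str, missed_cleavages: int = 1) -> List[float]:
    """Sorted [M+H]+ values of all theoretical tryptic peptides."""
    return sorted(mh_plus(p) for p in tryptic_peptides(sequence, missed_cleavages))


# Invented sequence: the R-P bond is not cleaved, the K-R and K-C bonds are.
peptides = tryptic_peptides("AKRPGKC", missed_cleavages=1)
```

In practice very short peptides fall below the acquired mass range, so a real exclusion list would normally be restricted to the m/z window of the MS scan.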

3 Results

3.1 Comparison of different workflows on pooled 2-DE gel spots

To demonstrate the ability of nrMS to identify proteins in mixtures without prior separation, protein mixtures were created from pooled 2-DE spots (see Section 2). The four identical sets of protein mixtures were used for the following four experiments: (i) Data-dependent MALDI-TOF-TOF on the 18 individual (nonpooled) 2-DE spots; (ii) nanoLC-ESI-MS/MS on the pooled samples; (iii) Data-dependent MALDI-TOF-TOF analysis on the pooled samples; and (iv) nrMS on the three pooled samples (60, 35, and 20 kDa).

(i) Data-dependent MALDI-TOF-TOF analysis of the eighteen individual spots resulted in the identification of one or two proteins per spot. Seven proteins were identified from the 60 kDa spot set, six proteins from the 35 kDa spot set, and seven proteins from the 20 kDa spot set (see Table 1, individual MALDI-TOF-TOF). (ii) NanoLC-ESI-MS/MS on the pooled samples increased the number of identified proteins in two out of three pooled sample sets. Seven proteins were identified from the 60 kDa spot set. From the 35 kDa spot set, seven proteins were also identified. Nine proteins were identified in the 20 kDa spot set (see Table 1, nanoLC-ESI-MS/MS). A total time of 75 min was necessary for each nanoLC-ESI-MS/MS analysis, including column washing and regenerating. (iii) Data-dependent MALDI-TOF-TOF analysis was performed on the ten highest peaks. This resulted in fewer identified proteins. Only three proteins were identified from the 60 kDa spot set. In the 35 kDa spot set four proteins were identified. In the 20 kDa spot set, four proteins were also identified (Table 1, MALDI-TOF-TOF on ten highest peaks). (iv) However, by applying the nrMS strategy to the three pooled samples, results similar to those of the individual spot analysis were obtained. Six proteins were identified from each of the three spot sets (Table 1, nrMS). It is important to note that here three "samples" were analyzed, whereas for individual spot analysis 18 samples were analyzed.

Table 1. Comparison of different workflows

| Spot | DB entry | Entry name | Individual MALDI-TOF/TOF | nanoLC-ESI-MS/MS | MALDI-TOF/TOF on 10 peaks | MALDI-MS/MS on 50 peaks | nrMS | Nr. of spectra for nrMS (per pooled set) |
|---|---|---|---|---|---|---|---|---|
| 1 | Q99TI6 | Trigger factor (TF) | 960 | 271 | 213 | 596 | 135 | 21 |
| 2 | Q99SL7 | 60 kDa chaperonin (protein Cpn60) | 1300 | 767 | – | – | 42 | |
| 3 | Q931U2 | Phosphoenolpyruvate-protein | 579 | 721 | – | – | 179 | |
| 3 | P45554 | Chaperone protein dnaK | 495 | 208 | 182 | 322 | – | |
| 4 | Q99R88 | ATP-dependent Clp proteinase chain clpL | 1120 | 1443 | – | 90 | 197 | |
| 5 | Q59821 | Dihydrolipoamide S-acetyltransferase component | 995 | 618 | 108 | 279 | 100 | |
| 6 | Q99UD4 | Transketolase | 961 | 817 | – | – | 134 | |
| 7 | Q99R31 | SA2399 protein (fructose-bisphosphate) | 908 | 618 | – | 401 | 217 | 11 |
| 8 | Q99SD3 | Fructose-bisphosphate aldolase | 1150 | 520 | 258 | 425 | 386 | |
| 9 | P72357 | D-Specific D-2-hydroxyacid dehydrogenase | 1060 | 619 | 139 | 202 | 109 | |
| 10 | Q99R05 | Carbamate kinase | 1160 | 590 | 99 | 265 | 164 | |
| 11 | Q99W9 | Cysteine synthase (O-acetylserine sulfhydrylase) | 1280 | 538 | 63 | 386 | 534 | |
| 12 | Q99UM | Succinyl-CoA synthetase | 852 | 600 | – | 161 | 313 | |
| – | Q99UU0 | Hypothetical protein SAV1170 | – | 64 | – | 326 | – | |
| 13 | Q9Z5C3 | Triosephosphate isomerase | 473 | 648 | 71 | 663 | 322 | 24 |
| 14 | Q53647 | Alkyl hydroperoxide reductase subunit C | 764 | 401 | 585 | 267 | 121 | |
| 15 | Q99SC1 | Purine nucleoside phosphorylase | 427 | 202 | 202 | – | 128 | |
| 16 | O33276 | Probable ribosome recycling factor | 538 | 370 | – | – | 66 | |
| 16 | Q99UV5 | HAM1 protein homolog | 92 | 256 | – | 305 | – | |
| 17 | Q9Z5W | Superoxide dismutase SodA | 679 | 412 | 152 | – | 35 | |
| 18 | Q99TF6 | Hypothetical metal-dependent hydrolase | 787 | 98 | – | – | 321 | |
| – | Q99S40 | Adenylate kinase | – | 223 | – | – | – | |
| – | Q99SC2 | Deoxyribose-phosphate aldolase 2 | – | 94 | – | – | – | |

Mascot score for the proteins identified from individual and pooled spots using the four different workflows. Spots 1–6 from the 60 kDa region, spots 7–12 from the 35 kDa region, and spots 13–18 from the 20 kDa region were pooled or analyzed individually. The individual spots were analyzed with MALDI-TOF-TOF. The pooled samples were analyzed using data-dependent nanoLC-ESI-MS/MS, data-dependent MALDI-TOF-TOF on the 10 and 50 highest peaks and the nrMS workflow. The number of fragmented precursors for the nrMS analysis is indicated in the column on the right.

The effect of increasing the number of MS/MS spectra in data-dependent MALDI-TOF-TOF was also evaluated. Data-dependent MALDI-TOF-TOF on the pooled samples was repeated with an increased number of precursors. For this, MS/MS data were acquired on the top 50 peaks, instead of just the top 10. For the 60 kDa spot set, only one additional protein was identified when the number of MS/MS spectra was increased from 10 to 50. In the second spot set (35 kDa), all six proteins were identified instead of only four. In the 20 kDa spot set, increasing the number of MS/MS spectra from 10 to 50 did not increase the number of identified proteins. These results are also summarized in Table 1. The number of acquired spectra for nrMS was less than 25 for each of the three samples (Table 1). With a laser frequency of 200 Hz and 1500 shots per spectrum, the instrument time was less than 3.5 min per sample, including the first MS analysis.
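The instrument time quoted here follows directly from the shot counts and the laser frequency. The short calculation below is a sketch that counts only laser firing time (1000 shots for the MS spectrum, as given in Section 2.6, and 1500 shots per MS/MS spectrum); the function name is hypothetical, and stage movement, data transfer and any other overhead are assumed to be negligible.

```python
def acquisition_time_s(n_msms: int,
                       shots_msms: int = 1500,
                       shots_ms: int = 1000,
                       laser_hz: float = 200.0) -> float:
    """Laser firing time in seconds for one MS spectrum plus n_msms MS/MS spectra."""
    return (shots_ms + n_msms * shots_msms) / laser_hz


# Upper bound from the text: fewer than 25 MS/MS spectra per sample.
upper_bound_s = acquisition_time_s(25)
upper_bound_min = upper_bound_s / 60.0
```

With 25 MS/MS spectra the firing time is (1000 + 25 × 1500) / 200 = 192.5 s, about 3.2 min, which is consistent with the stated limit of 3.5 min per sample.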

3.2 Comparison using SDS-PAGE gel bands

The potential of nrMS to resolve mixtures was shown with the analysis of pooled gel spots. The ability of nrMS to increase the number of identifications on "real life" samples was investigated by the analysis of SDS-PAGE gel bands. Extracted peptides from these bands were divided into two equal parts. The first part was analyzed by data-dependent MALDI-TOF-TOF, where the ten most intense ions were selected for MS/MS. The second part was analyzed using the nrMS workflow.

With data-dependent MALDI-TOF/TOF analysis, one pro-tein was identified from the first gel band. Eight MS/MSspectra matched to this protein and the two remainingMS/MS spectra did not allow any other identification.Similar results were obtained for the second gel bandwhere only one protein was identified (Table 2). The sec-ond part of the sample was then analyzed by nrMS. Using

Table 2. Identified protein from gel bands using data-dependent and nrMS workflow

Sample Data-dependent MALDI-TOF-TOF nrMS (10 precursors) nrMS (unlimited)

Protein entry and name/peptide sequence

Mascotscore

Protein entry and name/peptide sequence

Mascotscore

Protein entry and name/peptide sequence

Mascotscore

Band a Q931U0 Pyru. dehyd. E1. Q931U0 Pyru. dehyd. E1. Q931U0 Pyru. dehyd. E1.LGFYAPTAGQEASQLASQYAL 97 GLWNEDKENEVIER 72 GLWNEDKENEVIER 72GLWNEDKENEVIER 78 DVPQIIWHGLPLTEAFLFSR 39 DVPQIIWHGLPLTEAFLFSR 39DVPQIIWHGLPLTEAFLFSR 73 LGFYAPTAGQEASQLASQYALE 92 LGFYAPTAGQEASQLASQYALE 92LQAQFDAVK 52 – –EDYILPGYR 41 – –AVAGEGPTLIETMTYR 37 – –YGPHTMAGDDPTR 19 – –LGFYAPTAGQEASQLASQYAL 6 – –

– Q99V36 Hypothetical protein Q99V36 Hypothetical protein– DVSDKPLIPAR 33 DVSDKPLIPAR 33– TLYDYEKPPK 42 TLYDYEKPPK 42

– Q99VD1 Ornithine amino . . . Q99VD1 Ornithine amino . . .– GYGPLLDGFR 40 GYGPLLDGFR 40

GIEPNKAEIIAFNGNFHGR 11

Q99UR6 Carbamoyl . . .YLVLEDGSFYEGYR 31TLHDVLELHQIPGIAGVDTR 51

Q99TH5 Glyceraldehyde . . .VIAWYDNEWGYSNR 47

Q99R49 Hypothetical proteinALEHLGFGALELGGITPKPQPG 39

2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.de

924 A. Scherl et al. Proteomics 2004, 4, 917–927

Table 2. Continued

Sample Data-dependent MALDI-TOF-TOF nrMS (10 precursors) nrMS (unlimited)

Protein entry and name/peptide sequence

Mascotscore

Protein entry and name/peptide sequence

Mascotscore

Protein entry and name/peptide sequence

Mascotscore

Band b Q99SF5 ATP synthase Q99SF5 ATP synthase beta. Q99SF5 ATP synthase beta.ALEPSIVGQEHYEVAR 112 ALEPSIVGQEHYEVAR 65 ALEPSIVGQEHYEVAR 65DILDGKYDHIPEDAFR 66 VFNVLGETIDLKEEISDSVR 136 VFNVLGETIDLKEEISDSVR 136FTQAGSEVSALLGR 61 – –VFNVLGETIDLKEEISDSVR 42 – –VALSGLTMAEYFR 22 – –– Q99R30 Malate:quinone ox. Q99R30 Malate:quinone ox.– TLLFGPFANVGPK 76 TLLFGPFANVGPK 76– IDEGTDVNFGELTR 48 IDEGTDVNFGELTR 48

– Q99VD0 NAD-specific . . . Q99VD0 NAD-specific . . .– AISQFVGPNK 42 AISQFVGPNK 42– VVIQGFGNAGSFLAK 81 VVIQGFGNAGSFLAK 81

– Q99SF3 ATP synthase alpha.– HVLIVYDDLTK 57– EAYPGDVFYLHSR 82

– Q99W61 Translation . . .– LLDYAEAGDNIGALLR 61– ILELMEAVDTYIPTPER 16– DLLSEYDFPGDDVPVIAGSALK 62

Comparison of data-dependent MALDI-TOF-TOF and nrMS on two digested SDS-PAGE gels bands. Data-dependentacquisition was performed with the ten highest peaks chosen for MS/MS. Ten ions fragmented using nrMS shows alreadyincreased identifications. Unlimited acquisition (sample consumed or no more precursors) reveal significant improvementsin protein identification.

this approach, six different proteins were identified from the first gel band. Five proteins were unambiguously identified from the second gel band (Table 2). These results were obtained with unlimited sample acquisition (i.e., until the sample was consumed or no more peaks were present). Fragmentation of only ten ions using nrMS already shows a significant improvement in protein identification (Table 2).

3.3 nrMS on individual gel spots

Finally, the nrMS workflow was used on individual 2-DE spots to investigate the capacity of this approach to better characterize a single protein sample. An important procedure in this step is the use of the MASCOT (Version 1.8) database search tool. Following the final acquisition step, all MS/MS spectra were submitted to an error-tolerant search [16]. Here, the nonredundant workflow resulted in an increased sequence coverage of identified proteins. In this particular case, the strategy of only selecting peptides for MS/MS that did not match peptides in the in silico digested protein sequence allowed the identification of modified peptides. These modifications may be the result of post-translational modifications (PTMs) or artifacts generated during the protein extraction, digestion and analysis process. Using data-dependent acquisition on the ten highest peaks with MALDI-TOF/TOF, Hypothetical Protein SAV0968 (TrEMBL accession number Q99VC2) was identified. nrMS, on the other hand, enabled the identification of this protein and of several peptides that did not match the theoretical peptide list of this protein. Two peptides were found to be SAV0968 peptides that contained deamidated Asn and Gln residues. One peptide was a SAV0968 peptide that was sodiated on a Glu residue. Notably, one nontryptic peptide matched the sequence ADFAEGDFHPK (a.a. 32 to 42). This peptide is preceded by a Phe (a.a. 31), which suggests either that the protein is cleaved in vivo after residue 31 or that the peptide results from chymotryptic activity. These results are summarized in Fig. 3.
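The selection rule used in this workflow (fragment only the most intense ions that do not match the in silico digest of proteins already identified) can be illustrated with a short Python sketch. This is a minimal illustration and not the software used in this work: the function names, the 0.1 Da tolerance, the single allowed missed cleavage, and the toy sequence are all assumptions, and MALDI ions are treated as singly protonated [M+H]+ species.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # charge carrier for [M+H]+


def tryptic_digest(sequence, missed_cleavages=1):
    """Cleave after K or R unless followed by P; allow missed cleavages."""
    pieces, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])
    peptides = set()
    for i in range(len(pieces)):
        for j in range(i, min(i + missed_cleavages + 1, len(pieces))):
            peptides.add("".join(pieces[i:j + 1]))
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def select_precursors(peaks, exclusion, tol=0.1, n=3):
    """Return the n most intense m/z values not within tol of the exclusion list."""
    kept = []
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if all(abs(mz - ex) > tol for ex in exclusion):
            kept.append(mz)
        if len(kept) == n:
            break
    return kept


# Toy usage: the sequence below is made up for illustration only.
exclusion_list = [mh_plus(p) for p in tryptic_digest("MAGKTPEPRSIDEK")]
```

Peaks left over after this filter are exactly the ones that the identified protein does not explain, which is why modified or nontryptic peptides tend to surface in later rounds.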

4 Discussion

Repetitive and redundant MS data acquisition should be avoided for two important reasons: (i) to save time on data validation; and (ii) to increase the quality and amount of information resulting from an MS experiment. To unambiguously identify a protein, two or three good quality MS/MS spectra are usually enough. Ideally, the remaining sample should be used for further characterization, which can result in an increased number of protein identifications or further information on the post-translational state of a protein. However, in typical MS/MS workflows currently implemented, one generally gathers extensive data on a (potentially) already identified protein through the exhaustive generation of MS/MS spectra. For this type of work, microcapillary LC systems are directly coupled to ESI-MS/MS instruments [1]. In this workflow, the implementation of dynamic exclusion to limit a first step of redundant acquisition has greatly improved the ability to identify multiple proteins in mixtures [2], which is of particular importance for the comprehensive characterization of samples containing multiple proteins. nrMS would be desirable for LC-MS procedures as well, although this is currently a challenge. On-line LC/ESI-MS/MS cannot be paused after a run has started, and the time between elution of individual peptides is typically too short for a protein to be identified by interrogation of the spectra with current database search programs. However, it has recently been reported that rapid database search tools (1 s for SWISS-PROT/TrEMBL) coupled to on-line LC-ESI-MS/MS acquisition can be used to generate dynamic exclusion lists for data-dependent µLC/ESI-MS/MS [5]. With on-line capillary electrophoresis ESI-MS/MS [17], separation can be temporarily paused [18], although there is no means of instrument feedback currently available for this approach other than that mentioned above. Furthermore, on-line capillary electrophoresis ESI-MS/MS is much more difficult to perform in most cases and is not easily implemented with existing equipment.

Figure 3. Recovery of TrEMBL database entry Q99VC2 using the data-dependent and nonredundant workflows. (A) MS/MS recovery after fragmentation of the ten highest peaks. The recovered peptides are shown in bold. (B) MS/MS recovery and modification using the nrMS workflow. The recovery obtained after fragmentation of the three highest peaks is shown in bold. The peptide arising from atypical cleavage or protein processing is shown in bold italic. Deamidated peptides are underscored. A sodiated peptide (ditdqldyEgelgivig, sodiated on E) is in italic.

MALDI coupled to a tandem mass analyzer has an intrinsic advantageous feature for a nonredundant workflow. MALDI-MS/MS is a step-wise serial analysis process, where the analysis can be paused before acquisition and further analysis. Typically, the masses of all peptides are analyzed after the first MS step. This information is often sufficient for a protein identification by PMF. In cases where PMF alone does not result in unambiguous identification, MS/MS allows rapid confirmation or correction of a false identification. Importantly, the sample is ‘frozen in time’ with MALDI, which in practical terms means there is unlimited time for data analysis between two steps of data acquisition.
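Because the sample is static on the target, acquisition and analysis can simply alternate. The loop below is a schematic Python sketch of that alternation, under stated assumptions: `acquire_msms`, `search`, and `digest_masses` are hypothetical callables standing in for the instrument, the database search engine, and in silico digestion, and none of them corresponds to a real instrument or search-engine API. The toy database and peak list are invented for illustration.

```python
def nrms_loop(peaks, acquire_msms, search, digest_masses,
              tol=0.1, batch=3, max_rounds=10):
    """Alternate MS/MS acquisition and analysis until no unexplained peak remains.

    peaks: list of (m/z, intensity) tuples from the initial MS spectrum.
    Returns (identified proteins, sorted list of fragmented precursors).
    """
    exclusion = []      # masses explained by proteins identified so far
    fragmented = set()  # precursors already sent to MS/MS
    identified = []
    for _ in range(max_rounds):
        by_intensity = sorted(peaks, key=lambda p: p[1], reverse=True)
        candidates = [
            mz for mz, _inten in by_intensity
            if mz not in fragmented
            and not any(abs(mz - ex) <= tol for ex in exclusion)
        ][:batch]
        if not candidates:
            break  # every peak is either fragmented or excluded
        spectra = {mz: acquire_msms(mz) for mz in candidates}
        fragmented.update(candidates)
        for protein in search(spectra):
            if protein not in identified:
                identified.append(protein)
                exclusion.extend(digest_masses(protein))
    return identified, sorted(fragmented)


# Toy stand-ins (hypothetical): each "protein" is just a list of peptide masses.
TOY_DB = {"P1": [1000.0, 1100.0, 1200.0], "P2": [900.0, 950.0]}


def toy_search(spectra):
    """Report every toy protein that owns at least one fragmented precursor."""
    return [name for name, masses in TOY_DB.items()
            if any(mz in masses for mz in spectra)]


toy_peaks = [(1000.0, 10.0), (1100.0, 9.0), (1200.0, 8.0),
             (900.0, 5.0), (950.0, 4.0)]
proteins, used = nrms_loop(toy_peaks, acquire_msms=lambda mz: mz,
                           search=toy_search,
                           digest_masses=lambda name: TOY_DB[name],
                           batch=1)
```

In this toy run, only two of the five peaks need to be fragmented, because each identification removes the remaining peptides of that protein from consideration.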

In the work described here, protein mixtures were analyzed with the nrMS workflow and compared to data-dependent MALDI-TOF/TOF analysis and nanoLC-ESI-MS/MS. Analysis of reconstituted protein mixtures from pooled 2-DE gel spots revealed that this technique approaches, but does not yet match, the performance of data-dependent nanoLC-ESI-MS/MS: nrMS identified six of the seven proteins identified with nanoLC-ESI-MS/MS in two mixtures, and six of nine in a third mixture. It is important to note that a comparison of the nrMS workflow to a data-dependent MALDI-MS/MS acquisition procedure reveals that introducing a data analysis step between two data acquisition steps significantly increases the number of identified proteins in mixtures. In addition, this also limits the number of acquired spectra. Reducing the number of spectra acquired results in quicker overall search times and less data validation. Moreover, the data shows that increasing the number of acquired spectra in a data-dependent manner (without feedback) does not systematically increase the number of identified proteins, whereas nrMS does. This is very important for high-throughput analysis since the instrument time is minimized. Fewer than 25 spectra were acquired for each nrMS analysis. This results in a reduction of analysis time by more than a factor of two, and consequently less sample is consumed, which is important when the sample is to be used for further analysis, e.g., confirmation of an ambiguous identification or a PTM. Compared with nanoLC-ESI-MS/MS, nrMS is about 25 times quicker in instrument time: total instrument time for nanoLC-ESI-MS/MS is 75 min, whereas the nrMS run took about 3 min. The capacity of nrMS to analyze protein bands from SDS-PAGE gels was also tested. In this case, nrMS yielded a dramatic increase in the number of identified proteins in comparison to data-dependent acquisition (by MALDI-MS/MS). Data-dependent MALDI-MS/MS analysis identified only one protein per gel band. Analysis of the same digest with nrMS resulted in the identification of six and five proteins for the two gel bands, respectively (Table 2).

The use of nrMS was also tested on individual 2-DE gel spots. In this example, information on modified peptides was acquired which was not obtained with normal (data-dependent) acquisition. As shown in Fig. 3, an atypical cleavage site or the potential cleavage site of the signal peptide from Hypothetical Protein SAV0968 (TrEMBL entry Q99VC2) is found only with nrMS acquisition. This type of information may be of value as it could have biological implications. Modifications to sequences that are known from existing database information, or common modifications (such as the sodiated or deamidated forms found here), could be used in exclusion lists for future work. The ability of targeted nrMS to identify PTMs (e.g., phosphorylation, glycosylation, etc.) is being further investigated, and is anticipated to be advantageous here as well.
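As an illustration of how common modifications could be folded into an exclusion list, the Python sketch below expands one theoretical peptide mass into the masses of its plausible modified forms. It is a sketch under stated assumptions, not part of the reported workflow: the function name is hypothetical, sodiation is only considered for peptides containing Asp or Glu (a simplification, since the C-terminal carboxyl group can also carry sodium), and only a single sodium adduct is considered. The mass shifts are the standard monoisotopic values for deamidation of Asn/Gln (+0.98402 Da) and replacement of H by Na (+21.98194 Da).

```python
DEAMIDATION = 0.98402     # Asn -> Asp or Gln -> Glu, monoisotopic shift (Da)
SODIUM_ADDUCT = 21.98194  # Na replacing H, monoisotopic shift (Da)


def modified_masses(peptide, mh):
    """Map a label to the [M+H]+ mass of each plausible form of a peptide.

    peptide: one-letter amino acid sequence.
    mh: [M+H]+ mass of the unmodified peptide.
    """
    forms = {"unmodified": mh}
    n_sites = peptide.count("N") + peptide.count("Q")
    for k in range(1, n_sites + 1):
        # One entry per possible number of deamidated Asn/Gln residues.
        forms[f"deamidated x{k}"] = mh + k * DEAMIDATION
    if "D" in peptide or "E" in peptide:
        # Simplifying assumption: sodiation only on acidic side chains.
        forms["sodiated"] = mh + SODIUM_ADDUCT
    return forms
```

Adding every value returned by `modified_masses` to the exclusion list would prevent peaks from these known artifacts from consuming MS/MS acquisitions in later rounds.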

The nrMS workflow is currently being automated. In this data analysis, database search acceptance criteria are defined, in addition to enzyme selectivity for in silico digestion. As with any MS/MS technique, this workflow has limits within which it works best. Good-quality spectra, a sufficiently concentrated sample, and good peptide fragmentation are necessary. Furthermore, mixtures of only large proteins with one or two small proteins will influence the performance of nrMS. In this scenario, many digested peptides will be generated from the large proteins, yielding a long exclusion list and potentially preventing subsequent MS/MS on ions from small proteins by coincidental overlap. Even in this case it is likely that these small proteins would be identified by PMF, as MS/MS data is not always a prerequisite for protein identification. It remains to be seen where the exact limitations are for mixture analysis. For the experiments reported here such problems could not occur, as proteins were extracted from SDS-PAGE gels with a molecular weight between 15 and 100 kDa, resulting in a small but similar number of tryptic peptides from each protein. This workflow is likely useful for a sample complexity of up to eight proteins per MALDI target plate well. This is a typical maximum complexity encountered in 1-DE and 2-DE workflows, as single bands or gel spots. Coupled to a (multidimensional) LC setup, the nrMS workflow is also feasible, provided that sample overloading per well is prevented, which would otherwise result in ion suppression, physically or virtually (through exclusion lists). The merits and limitations of nrMS will become better defined in time. It will be of particular interest to see the advantages of intelligent data acquisition applied to, for example, fragmentation of only differentially expressed peptides following stable isotope labelling. Also of interest is specific screening for modifications on selected amino acids. For example, one could fragment only peaks corresponding to known peptides with an additional mass of 80 Da, and only those that contain S, T and/or Y residues, to confirm possible phosphorylation sites.
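The phosphorylation screen suggested above can be written as an inclusion rule rather than an exclusion rule. The Python sketch below is a minimal illustration with hypothetical names and an assumed 0.1 Da tolerance: it keeps only those observed peaks that lie one phosphate (HPO3, +79.96633 Da monoisotopic) above a known peptide containing Ser, Thr or Tyr. The masses in the usage lines are toy numbers, not real peptide masses.

```python
PHOSPHO = 79.96633  # HPO3, monoisotopic mass shift (Da)


def phospho_candidates(observed_mz, known_peptides, tol=0.1):
    """Return (m/z, sequence) pairs worth fragmenting to confirm phosphorylation.

    observed_mz: iterable of observed [M+H]+ values.
    known_peptides: dict mapping sequence -> unmodified [M+H]+ mass.
    """
    hits = []
    for seq, mh in known_peptides.items():
        if not any(aa in seq for aa in "STY"):
            continue  # no phosphorylatable residue, so skip this peptide
        for mz in observed_mz:
            if abs(mz - (mh + PHOSPHO)) <= tol:
                hits.append((mz, seq))
    return hits


# Toy usage with invented masses: only the Ser-containing peptide qualifies.
known = {"AGSK": 400.0, "AGVK": 500.0}
targets = phospho_candidates([479.97, 579.97, 400.0], known)
```

The residue check matters: the peak at 579.97 in the toy data sits 80 Da above a known peptide, but that peptide has no S, T or Y, so it is not selected.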

For the data analyzed in this study, substantial differences between data-dependent analysis and nrMS were found on typical in-gel digested samples. The nonredundant strategy significantly increased the amount of biological information in terms of the number of identified proteins and modified peptides. Furthermore, we have shown that a “result-driven” workflow should be applied in routine proteomic analysis and consequently be implemented in data-acquisition software. This seems feasible for MALDI tandem MS instruments and has already been (partially) implemented in one case [19]. The data here shows that nrMS using only MALDI-MS/MS is encouraging but not equal to the superior identification provided by LC-ESI-MS/MS. Thus, using nrMS to obtain maximum information on protein mixtures will require the coupling of LC to the procedure. This integration of an nrMS workflow in LC-MS approaches is challenging, but not impossible, as promising developments demonstrating LC deposition onto MALDI plates have been reported [20]. Therefore, it is possible that further multidimensional LC workflows may exchange the final ionization step from ESI to MALDI using LC deposition techniques. With dedicated software enabling automated nrMS and continued improvements in LC-MALDI integration, the timing may be right to shift from ESI to MALDI for extremely rapid MS protein identification employing result-dependent MS/MS acquisition.

This work was supported by the Swiss National Fund for Scientific Research (Grant 31-59095.99).

5 References

[1] Ducret, A., Van Oostveen, I., Eng, J. K., Yates III, J. R., Aebersold, R., Prot. Sci. 1998, 7, 706–719.

[2] Gatlin, C. L., Eng, J. K., Cross, S. T., Detter, J. C., Yates III, J. R., Anal. Chem. 2000, 72, 757–763.

[3] Link, A. J., Eng, J., Schieltz, D. M., Carmack, E. et al., Nat. Biotechnol. 1999, 17, 676–682.

[4] Florens, L., Washburn, M. P., Raine, J. D., Anthony, R. M. et al., Nature 2002, 419, 520–526.

[5] Wallace, Ritchie, M. A., Jones, C., Leicester, S., Langridge, J., ABRF 2003 Poster, J. Biomol. Techn. 2003, 14, 80.

[6] Medzihradszky, K. F., Campbell, J. M., Baldwin, M. A., Falick, A. M. et al., Anal. Chem. 2000, 72, 552–558.

[7] Yergey, A. L., Coorssen, J. R., Backlund, P. S., Jr., Blank, P. S. et al., J. Am. Soc. Mass Spectrom. 2002, 13, 784–791.

[8] Bienvenut, W. V., Deon, C., Pasquarello, C., Campbell, J. M. et al., Proteomics 2002, 2, 868–876.

[9] Huang, L., Baldwin, M. A., Maltby, D. A., Medzihradszky, K. F. et al., Mol. Cell. Proteomics 2002, 1, 434–450.

[10] Kuroda, M., Ohta, T., Uchiyama, I., Baba, T. et al., Lancet 2001, 357, 1225–1240.

[11] McDevitt, D., Francois, P., Vaudaux, P., Foster, T. J., Mol. Microbiol. 1995, 16, 895–907.

[12] Deshusses, J. M., Burgess, J. A., Scherl, A., Wenger, Y. et al., Proteomics 2003, 3, 1418–1424.

[13] Laemmli, U. K., Nature 1970, 227, 680–685.

[14] Scherl, A., Coute, Y., Deon, C., Calle, A. et al., Mol. Biol. Cell 2002, 13, 4100–4109.

[15] Perkins, D. N., Pappin, D. J., Creasy, D. M., Cottrell, J. S., Electrophoresis 1999, 20, 3551–3567.

[16] Creasy, D. M., Cottrell, J. S., Proteomics 2002, 2, 1426–1434.

[17] Smith, R. D., Fields, S. M., Loo, J. A., Barinaga, C. J. et al., Electrophoresis 1990, 11, 709–717.

[18] Figeys, D., Corthals, G. L., Gallis, B., Goodlett, D. R. et al., Anal. Chem. 1999, 71, 2279–2287.

[19] Graber, A., Juhasz, P. S., Khainovski, K. C., Patterson, P. H. et al., Proteomics 2004, 4, 474–489.

[20] Preisler, J., Foret, F., Karger, B. L., Anal. Chem. 1998, 70, 5278–5287.
