Journal of Proteomics 96 (2014) 353–359

Available online at www.sciencedirect.com

ScienceDirect

www.elsevier.com/locate/jprot

Review

Proteomics quality and standard: From a regulatory perspective

Qiang Gu¹, Li-Rong Yu⁎

Division of Systems Biology, National Center for Toxicological Research, Food and Drug Administration, USA

⁎ Corresponding author at: Biomarkers and Alternative Models Branch, National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, USA. Tel.: +1 870 543 7052; fax: +1 870 543 7686.
E-mail address: [email protected] (L.-R. Yu).
¹ Present affiliation: Division of Neurotoxicology, National Center for Toxicological Research, Food and Drug Administration, USA.

1874-3919/$ – see front matter. Published by Elsevier B.V. http://dx.doi.org/10.1016/j.jprot.2013.11.024

ARTICLE INFO

Article history:
Received 15 July 2012
Accepted 22 November 2013
Available online 4 December 2013

Keywords:
Proteomics
Mass spectrometry
Quality control
Standards
Biomarker qualification
Regulatory decision

ABSTRACT

Proteomics has emerged as a rapidly expanding field dealing with large-scale protein analyses. It is anticipated that proteomics data will be increasingly submitted to the U.S. Food and Drug Administration (FDA) for biomarker qualification or in conjunction with applications for the approval of drugs, medical devices, and other FDA-regulated consumer products. To date, however, no established guideline has been available regarding the generation, submission, and quality assessment of proteomics data that will be reviewed by regulatory agencies for decision making. Therefore, this commentary aims to provoke thought and debate towards developing a framework that can guide future proteomics data submission. The ultimate goal is to establish quality control standards for proteomics data generation and evaluation, and to prepare government agencies such as the FDA to meet future obligations in utilizing proteomics data to support regulatory decisions.

Published by Elsevier B.V.

Contents

Acknowledgments
References

Proteomics has emerged as a rapidly expanding field dealing with large-scale qualitative and quantitative protein analyses. The number of published research articles related to proteomics has increased steadily according to a search of the Thomson Reuters Web of Knowledge Science Citation Index (Fig. 1). The annual rate of increase was over 100% before 2002 and has averaged about 25% during the past decade. Although the relative rate of increase has recently become smaller because of the larger base of published papers, the absolute number of publications is still rising steadily. Proteomics promises insight into the overall picture of proteome-wide alterations in protein abundance, interactions, activities, post-translational modifications, and so forth under various physiological and pathological conditions. It is anticipated that a growing volume of proteomic data will be submitted to regulatory agencies for biomarker qualification and/or in conjunction with drug applications to support, at the molecular level, assessments of drug efficacy and toxicity. The increasing submission of proteomic data and biomarkers will pose great challenges to regulatory agencies in evaluating this type of data. It is not practical to validate a large number of proteins individually; therefore, it is of paramount importance that objective parameters and reliable procedures be available to reflect the overall quality of a given proteomics dataset. Hence, establishing quality control standards for proteomics data generation and evaluation will help regulatory agencies meet their obligations to utilize proteomics data in drug review and biomarker qualification processes [1,2].

[Figure 1: plot titled "Proteomics Articles". x-axis: Year (1996–2012); y-axis: Number of Publications (0–7000).]

Fig. 1 – Annually published research articles (excluding reviews, abstracts, editorials, or commentaries) containing the keyword "proteomics", "proteomic", or "proteome" based on a search in Thomson Reuters Web of Knowledge Science Citation Index.

Compared to genomics, proteomics is a much more complicated proposition, not only because of the huge number of protein species generated by variant splicing and post-translational modification, but also because of additional dynamic characteristics such as protein–protein interactions, interactions with non-protein molecules, proteolysis, subcellular localization, and a high dynamic range of concentrations (particularly in plasma, a highly accessible sample type for biomarker discovery). Therefore, proteomics could face more complex, challenging, and unique issues. To accurately evaluate proteomics data, simple and relevant questions have to be addressed before more specific questions relevant to a particular case are asked. For instance, how efficient is a protein extraction method? Will different methods for protein concentration measurement generate different results? These questions receive much less attention in the proteomics community than other issues such as the performance of liquid chromatography-mass spectrometry (LC-MS) platforms. Since proteins have to be extracted from membrane, cytoplasm, nucleus, and other organelles for proteomic analyses, diverse methods have been developed and utilized for protein extraction and purification. While in theory all proteins should be included in the analysis, this is often not the case in real experiments, where diverse extraction methods with different yield efficiencies are used even for the same type of tissue or cell sample. It is conceivable that different protein extraction methods yield variable amounts of protein for subsequent proteomic analyses. Thus, would it be appropriate to determine and state the protein yield (e.g., mg protein/g tissue) for a particular protein extraction method and a particular type of sample? In addition, different methods to determine protein concentrations are employed in practice. While it is not practical to endorse one method and abandon the others, would it be necessary to set up a kind of conversion standard among different methods for protein concentration measurement, so that inter-methodological results could be appropriately compared and assessed?

Major standardization efforts for certain proteomic approaches and for setting scientific publication guidelines have been made by proteomics communities. Several initiatives have been launched, such as the Proteomics Standards Initiative of the Human Proteome Organization (HUPO-PSI) [3] and the Clinical Proteomic Technology Assessment for Cancer by the National Cancer Institute [4]. A series of proposals has been recommended and published [5–17]. More recently, the ProteomeXchange (www.proteomexchange.org) consortium has been set up to provide a single point of submission of MS proteomics data to the main existing proteomics repositories such as PRIDE and PeptideAtlas. The primary goals of these initiatives are to facilitate the comparison of data from different laboratories and the evaluation of overall data quality. Although these frameworks recommend the minimal reporting information regarding experimental procedures, unfortunately they do not address all parameters that can have major influences on the accuracy of experimental data. To date, there is no established guideline or standard concerning the generation of proteomics data for regulatory evaluation. A recent report by the HUPO Test Sample Working Group revealed that of the 27 laboratories that examined the same sample, which consisted of 20 highly purified human proteins, only 7 reported all 20 proteins correctly [18]. Notably, this was only at the qualitative level, i.e., identification of the correct proteins rather than determination of the relative abundance of each protein in the sample. Many MS-based quantitative proteomic approaches were developed in the past decade for global measurement of proteome changes or targeted analysis of biomarkers of disease or drug response [19–23]. These technologies were further improved more recently to achieve ultra-high resolution, selectivity, sensitivity, and speed of peptide measurement [24–27], which resulted in higher throughput and nearly complete analysis of proteomes [28–31]. Although recent efforts were made to begin evaluating the intra- and inter-laboratory performance of targeted proteomics approaches [32,33], the performance of the majority of global quantitative proteomic approaches has not been evaluated thoroughly. What should be considered the "gold standard" for proteomics data evaluation? In our opinion, similar to what has already been proposed by others [34], a high-quality proteomics dataset should fulfill at least two criteria: 1) repeatability of data generated within the same laboratory and 2) consistency of data generated from different laboratories. However, without a further quantitative definition of consistency, questions will remain, such as: what is the minimum level of quality required to accept or reject proteomics data?
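One simple way to put a number on the first criterion, within-laboratory repeatability, is the coefficient of variation (CV) across technical replicates. The Python sketch below is illustrative only: the choice of CV as the metric and the function name `coefficient_of_variation` are assumptions made here, not a standard proposed in this commentary. The input values are the five integrated peak areas printed in Fig. 5 for repeated measurements of one peptide.

```python
from statistics import mean, stdev

# Integrated peak areas of the same peptide from five replicate LC-MS/MS
# runs, as printed in Fig. 5 (the order of replicates does not affect the CV).
PEAK_AREAS = [44065300, 46673308, 40213348, 63642432, 44916564]


def coefficient_of_variation(values):
    """Return the sample CV in percent (sample standard deviation / mean)."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return 100.0 * stdev(values) / mean(values)


cv_percent = coefficient_of_variation(PEAK_AREAS)
print(f"CV across {len(PEAK_AREAS)} replicates: {cv_percent:.1f}%")
```

For these values the CV comes out at roughly 19%, driven largely by one replicate whose area (63642432) lies well above the other four. Whether such a value would be acceptable is exactly the kind of threshold that a quantitative definition would have to fix.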

In order to begin addressing these questions, we reviewed research articles published in leading proteomics journals including Molecular & Cellular Proteomics, Journal of Proteome Research, Journal of Proteomics, Proteomics, Proteome Science, and Proteomics-Clinical Applications. Our objective was to use the papers published in one calendar year as an example to provide a snapshot of the diversity of subjects and methodologies in the proteomics field. We began our effort in 2009, when the full-year literature of 2008 became available to us. Although the numbers in such a snapshot might change from year to year, the diversity of proteome analysis techniques, protein concentration determination methods, and proteome study subjects will likely remain unchanged in the foreseeable future. The analysis found that 903 papers were published in those journals in that year. Excluding method/technology/software development papers, short/brief/rapid communications, and papers focused on sub-proteomes (e.g., membrane proteomics, mitochondrial proteomics, and nuclear proteomics), 521 received further in-depth review (Supplemental Table 1). Based on these publications, we were able to draw the following conclusions. First, diverse protein extraction methods have been used, not only for samples of different origins (e.g., tissues, cell lines, microorganisms, and plants) but also for the same type of sample. It is unknown how much the extraction efficiencies of these methods differ and to what degree the variety of methods affects the experimental outcomes. As an indicator of protein extraction efficiency, one could simply state that X g of protein was obtained from Y g of tissue or Z mL of body fluid. However, only a small number of papers (<4%) reported protein yield from the original samples. While the amount of protein extracted from a sample might not be a major concern for research laboratories that study differential protein levels in a relative sense, it is highly relevant from the regulatory perspective, especially for data acquisition following Good Laboratory Practice and when data from different sources need to be compared and reviewed by regulatory agencies. Next, different methods such as the Bradford, BCA, and Lowry assays are employed for protein concentration measurements. Although these are routine laboratory procedures, different assays may or may not lead to different conclusions depending on what specific protein targets are analyzed by the subsequent analytical procedure. Also, a large number of the papers (~40%) did not mention at all what method was used to determine protein concentrations, which makes data comparisons even more difficult. Finally, mass spectrometry (MS) is by far the most widely used tool for proteomic analyses, employed in ~95% of the studies. Our literature review may not provide the most comprehensive and up-to-date statistics, since the proteomics field has changed considerably since 2009, including the increased use of labeled quantitative MS analyses; it nevertheless demonstrates the diversity of the assessed parameters in the field. Since the trend may not change anytime soon, current efforts towards developing guidelines and standardizations in the proteomics field should focus on MS approaches.

[Figure 2: bar chart. x-axis: Protein Assay (Bradford, BCA, Lowry); y-axis: Protein Concentration (mg/ml), 0.0–0.4. Bar values as printed: 0.317, 0.366, and 0.249 (n = 3 each).]

Fig. 2 – Protein concentration of the same sample was determined using three different protein assay methods (BCA, Bradford, and Lowry). The same bovine serum albumin concentration series was used for generating the standard curve for each method. Error bars indicate standard error of the mean. There is a statistically significant difference among the protein assay methods (p < 0.0001, ANOVA, n = 3 replicates).

In addition to the above literature-based discussion of protein sample preparation and concentration measurement, we examined a typical procedure from sample extraction to online nanoflow liquid chromatography-tandem mass spectrometry (nanoflow LC-MS/MS) to illustrate additional issues that might previously have been thought trivial. This approach, without isotopic labeling, has been widely applied to the identification and quantification of proteins from a variety of biological matrices and is termed label-free quantitative proteome analysis [35,36]. There are also many label-based proteomic approaches relying on stable isotope labeling, including stable isotope labeling by amino acids in cell culture (SILAC) [37], isotope-coded affinity tags (ICAT) [38,39], isobaric multiplexing tagging reagents for relative and absolute protein quantitation (iTRAQ) [40], tandem mass tags (TMT) [41], enzyme-catalyzed ¹⁶O/¹⁸O labeling [42–44], etc. All these approaches have been used for the discovery and quantification of biomarker candidates. An emerging tool for verification of biomarker candidates is mass spectrometry-based selected reaction monitoring (SRM) or multiple reaction monitoring (MRM), in which specific biomarker candidates are targeted and detected [20,45]. Among the various mass spectrometry-based biomarker discovery and verification tools, some technologies may be more sophisticated than others. Technical evaluation and discussion of each technology are beyond the scope of this commentary. However, each approach involves protein sample preparation, LC-MS/MS, and data analysis. Here we focus on these common techniques and discuss associated issues that might be of concern in regulatory evaluation.

With respect to sample preparation, we examined three commonly used protein concentration assays, Bradford, BCA, and Lowry, which together account for about 93% of the protein concentration assays reported in the literature reviewed. We measured the protein concentration of the same mammalian proteome sample using each of the three assays and generated standard curves for each method using the same protein standards (see Supplemental Materials for more details). The results demonstrated that different assay methods could indeed generate significantly different outcomes (Fig. 2). Since most laboratories use the same assay for all sample measurements in a specific project, this might not be an issue within a project. However, when different laboratories use diverse assays to determine protein concentrations, it might become a critical issue. For example, one might assume that one microgram of peptides was injected onto the LC column while the actual amount could be significantly more or less than one microgram, depending on which protein assay method was used.
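The practical effect can be illustrated with the three mean concentrations printed in Fig. 2 (0.317, 0.366, and 0.249 mg/ml for the same sample). The sketch below is a minimal, hypothetical calculation, not a procedure from this study: it asks how much material would actually be loaded if a nominal 1 μg were dispensed on the basis of one assay reading while another reading were taken as the reference. The helper name `actual_load_ug` is an assumption, and the values are used without assigning them to specific assays.

```python
from itertools import permutations

# Mean concentrations (mg/ml) of one sample measured by three assays (Fig. 2).
READINGS_MG_PER_ML = [0.317, 0.366, 0.249]


def actual_load_ug(nominal_ug, conc_used, conc_reference):
    """Amount loaded according to the reference reading when the dispensed
    volume was chosen from a different reading: nominal * reference / used."""
    return nominal_ug * conc_reference / conc_used


# Every ordered pair (reading used for dispensing, reading taken as reference).
loads = [
    actual_load_ug(1.0, used, ref)
    for used, ref in permutations(READINGS_MG_PER_ML, 2)
]
print(f"nominal 1 ug corresponds to {min(loads):.2f}-{max(loads):.2f} ug")
```

With these readings, a nominal 1 μg load could correspond to roughly 0.68 to 1.47 μg, depending on which pair of readings is compared.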


To examine the quantitative consequences of varying the amount of protein injected, we performed a loading study. The rationale was that the LC-MS intensity of a peptide, defined as the area under the extracted ion chromatographic (XIC) peak of the peptide, should be proportional to the amount of peptide injected onto the LC column. We observed not only that some low-abundance peptides in the sample became undetectable when the injection amount was halved, but also that many peptides did not respond linearly in MS intensity to the sample amount. For example, some peptides showed a reduction in MS intensity of 20–30% instead of the expected 50% (Fig. 3). The inconsistency could be due to variable ionization efficiency of the same peptide at different amounts, co-elution of peptide species, slight changes in the solvent electrospray microenvironment in which the peptide is eluted, etc. The results demonstrated that careful control of sample loading amounts is critical when comparative proteomic analysis is performed, and that unintended differences in the amount of sample injected could indeed generate significantly different outcomes, especially when label-free quantitative proteomics approaches are used. The results from this study and others [18] suggest that unintended sample loading variations could explain some of the cross-laboratory inconsistencies.
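The proportionality check described above amounts to simple arithmetic on XIC peak areas. As a minimal sketch, the following Python snippet computes the percent reduction for the two peptides shown in Fig. 3, using the peak areas printed in the figure; the function name `percent_reduction` is an assumption made for illustration.

```python
# XIC peak areas printed in Fig. 3: (area at 1 ug load, area at 0.5 ug load).
PEAK_AREAS = {
    "DAVTYTEHAK (m/z 568.1)": (8.04e7, 4.55e7),
    "GESPVDYDGGR (m/z 576.5)": (5.30e7, 3.88e7),
}
EXPECTED_REDUCTION = 50.0  # percent, if intensity scaled linearly with load


def percent_reduction(area_full, area_half):
    """Percent drop in peak area when going from the full to the half load."""
    return 100.0 * (1.0 - area_half / area_full)


for peptide, (full, half) in PEAK_AREAS.items():
    observed = percent_reduction(full, half)
    print(f"{peptide}: {observed:.0f}% observed, "
          f"{EXPECTED_REDUCTION:.0f}% expected")
```

The first peptide drops by about 43%, close to the expected 50%, whereas the second drops by only about 27%, which matches the figures quoted in the Fig. 3 caption.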

[Figure 3: extracted ion chromatograms in four panels. x-axis: Time (min); y-axis: Relative Abundance (%).
A: m/z 568.1, loading 1 μg, NL 2.65E6, peak area 8.04E7.
B: m/z 568.1, loading 0.5 μg, NL 1.78E6, peak area 4.55E7.
C: m/z 576.5, loading 1 μg, NL 2.89E6, peak area 5.30E7.
D: m/z 576.5, loading 0.5 μg, NL 2.60E6, peak area 3.88E7.]

Fig. 3 – Examples of different peptide loading amounts (1 μg for A and C, and 0.5 μg for B and D, respectively) of a mouse lymphoma cell proteome sample affecting peptide intensities disproportionally. For method details see Supplemental Materials and Methods. A and B show that the peptide DAVTYTEHAK (m/z 568.1) had a reduction of intensity (integrated peak area) of ~43%, as expected when the amount of sample loaded was decreased by half. However, C and D show an example in which the peptide GESPVDYDGGR (m/z 576.5) had a reduction of intensity of ~27% instead of the expected 50%.

Shotgun proteomics is a bottom-up approach that has been increasingly utilized by the scientific community. This approach identifies proteins and characterizes their amino acid sequences and post-translational modifications by first digesting the proteins proteolytically and then separating the peptides in one or more dimensions of liquid chromatography coupled to mass spectrometry for online detection. Since conventional protein extraction buffers often contain protease inhibitors to prevent protein breakdown, it would be worthwhile to know whether adding protease inhibitors to the protein extraction buffer makes any difference in the LC-MS/MS analysis for peptide/protein identification. Based on the literature review mentioned above, 64 studies used the shotgun approach. Among them, 28 (44%) included protease inhibitors in the extraction buffer, while 36 (56%) did not. Thus, there is clearly no consensus in the scientific community with regard to using or omitting protease inhibitors in the shotgun approach. For this reason, a study was performed in which proteins were extracted from the same batch of cultured cells using protein extraction buffers with or without protease inhibitors, followed by trypsin digestion and subsequent 1D LC-MS/MS analyses. The results showed that protein extraction without protease inhibitors led to significantly larger numbers of identified peptides than extraction with protease inhibitors (Fig. 4). Similar results were obtained for the numbers of proteins identified (data not shown). This could be due to an inhibitory effect of residual protease inhibitors in the protein sample on trypsin activity during digestion, and to interference by residual protease inhibitors in the peptide sample during MS analyses. By examining the LC-MS spectra, we found that the non-small molecule protease inhibitors remained in the sample and significantly interfered with reversed-phase LC (RPLC) separation and MS/MS data acquisition for peptides. However, this effect could be minimized when a two-dimensional LC-MS/MS approach, such as strong cation exchange (SCX) LC coupled with RPLC-MS/MS, is employed. Therefore, whether or not to add protease inhibitors during sample preparation should be considered carefully along with the subsequent analytical approach.

[Figure 4: bar chart, Peptides (n = 11 pairs). x-axis: PI Plus, PI Minus; y-axis: Number of Peptides (axis range 1100–1600).]

Fig. 4 – Sample preparation using the protein extraction buffer with or without protease inhibitors (PI) led to different numbers of identified peptides. Proteins were extracted from mouse lymphoma cells (L5178Y/Tk+/–3.7.2C). Error bars indicate standard error of the mean. The number of identified peptides is significantly larger for the samples without protease inhibitors than for those with protease inhibitors (p < 0.002, paired t-test, n = 11 pairs of "PI Plus" versus "PI Minus").

With the evolution of mass spectrometers towards high mass precision instruments, label-free quantification of LC-MS data has become a very appealing approach for the quantitative analysis of biological samples, as it has been demonstrated that the intensities of extracted peptide signals generally scale linearly with their molecular concentrations across a dynamic range of 3 to 4 orders of magnitude [46]. Despite the potential issues mentioned above, we examined this label-free approach because of its promising future, and evaluated the consistency of experimental results by repeated mass spectrometric analyses of the same peptide mixture. Whereas variability in peptide LC retention time, sample loading, and column separation efficiency over time are potential issues discussed in the proteomic community [47], quantitation by calculating the integrated chromatographic peak area of a defined peptide with a specific mass-to-charge ratio (m/z) is also of concern. Previous reports have already identified software analysis issues [48,49], such as differences between the two most popular search algorithms, Mascot and Sequest. In terms of accurate quantification, measurement of peak areas is a standard. However, intrinsic variability in peak integration of MS data owing to the software could be a critical factor affecting quantitation accuracy (Fig. 5). While each software package may have its own issues, problems in peak integration could be common. Although proteomic scientists may be aware of this issue, the problem has not been fully resolved. While manual integration is possible, it is not practical for high-throughput analyses of hundreds or thousands of peptides. More sophisticated algorithms for MS peak integration should be developed for more accurate quantification.

[Figure 5: extracted ion chromatograms from five repeated measurements (Measure #1 to Measure #5) of the same peptide. x-axis: Time (min); y-axis: Relative Intensity (0–100). Integrated peak areas as printed: 44065300 (Measure #1), 46673308 (Measure #2), and 40213348, 63642432, and 44916564 for Measures #3 to #5 (the assignment of the last three values to individual panels is not recoverable from the extracted text).]

Fig. 5 – An example of software analysis variations in MS peak integration when the same peptide (from mitochondrial Stress-70 protein) was repeatedly measured. Automatic peak integration and area calculation were performed using the PepQuan module in Bioworks (Rev. 3.3.1 SP1). The extracted ion chromatograms of the peptide were generated from 5 LC-MS/MS analyses using a linear ion trap mass spectrometer LTQ-XL for the same trypsin-digested proteome sample from mouse lymphoma cells (L5178Y/Tk+/−3.7.2C). The circle indicates an additional peak that was included in the calculation of the integrated peak area value. Since these were technical replicates, these results demonstrate another variance that is caused by the post-analytical data processing even though Good Laboratory Practice has been followed and the reproducibility was kept at an optimal level during analytical experiments.

These results warrant further debate at the next level, i.e., how a regulatory agency should evaluate proteomics data. Identifying molecular changes as early biomarkers for drug safety assessment [50] should have distinct advantages because cellular and tissue damage is often not reversible, and because it is often already at a late stage when damage is observed by conventional methods. As high-quality data are fundamental to the generation of reliable biological knowledge, quality control at each step of the entire experimental and analytical process is essential to ensure that the detected differences reflect biological rather than methodological variation. From a uniquely regulatory perspective, unlike cross-laboratory data comparison in basic science research, where discrepancies could eventually be explained and resolved through additional experiments, regulatory agencies such as the FDA bear additional responsibilities and have to make a timely "Yes" or "No" decision that could affect public health. Would there be confidence in data generated using the same experimental procedures but from two independent laboratories? Even if this is practically doable, would two datasets be sufficient to draw an accurate conclusion? Since false-positive and false-negative rates depend on the pre-analytical and analytical variables, should these rates be identified in each dataset? If so, how should they be determined, and how reliable are the methods to determine such rates? While the methods and approaches for global and local false discovery rate (FDR) estimation at the peptide level are now well understood, there has been much less development in the area of protein-level analysis, and according to a recent evaluation the field essentially lacks a robust and statistically powerful model with a practical software implementation for the analysis of very large proteomic datasets [51]. It is expected that improvements in tools and methods for shotgun proteomic data analysis will further increase the quality of published proteomic data. A recent study that recommended 46 performance metrics for LC-MS analyses [47] is a good start and will definitely facilitate quality control of proteomics data. The recent introduction by the National Institute of Standards and Technology (NIST) of a yeast protein extract as a proteomics quality control material for evaluating the preanalytical and analytical variables of proteomics-based experimental workflows could be another helpful approach for improving accuracy, repeatability, and reproducibility [52]. In addition, analytical instruments and software will improve further over time. However, significant work still lies ahead towards establishing a regulatory standard for proteomic data submission and evaluation. The challenge will require a joint effort from the government, academia, and private sectors.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2013.11.024.

Acknowledgments

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [53] with the dataset identifier PXD000334. This work was supported in part by FDA Commissioner's Fellowship Program and by an appointment to the Research Participation Program at the National Center for Toxicological Research administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. Food and Drug Administration. We wish to thank Yuan Gao, Ricky Holland, Xiaoqing Guo, Nan Mei, Reagan Kelly, Joshua Xu, and Hong Fang for providing various levels of help and assistance throughout this study. The views expressed here are those of the authors and not necessarily of the U.S. Food and Drug Administration, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

References

[1] Goodsaid F, Papaluca M. Evolution of biomarker qualification at the health authorities. Nat Biotechnol 2010;28:441–3.

[2] Goodsaid FM, Mendrick DL. Translational medicine and the value of biomarker qualification. Sci Transl Med 2010;2 [47 ps4].

[3] Taylor CF, Paton NW, Lilley KS, Binz PA, Julian Jr RK, Jones AR, et al. The minimum information about a proteomics experiment (MIAPE). Nat Biotechnol 2007;25:887–93.

[4] Tao F. 1st NCI annual meeting on Clinical Proteomic Technologies for Cancer. Expert Rev Proteomics 2008;5:17–20.

[5] Taylor CF, Binz PA, Aebersold R, Affolter M, Barkovich R, Deutsch EW, et al. Guidelines for reporting the use of mass spectrometry in proteomics. Nat Biotechnol 2008;26:860–1.

[6] Binz PA, Barkovich R, Beavis RC, Creasy D, Horn DM, Julian Jr RK, et al. Guidelines for reporting the use of mass spectrometry informatics in proteomics. Nat Biotechnol 2008;26:862.

[7] Gibson F, Anderson L, Babnigg G, Baker M, Berth M, Binz PA, et al. Guidelines for reporting the use of gel electrophoresis in proteomics. Nat Biotechnol 2008;26:863–4.

[8] Domann PJ, Akashi S, Barbas C, Huang L, Lau W, Legido-Quigley C, et al. Minimum Information About a Proteomics E. Guidelines for reporting the use of capillary electrophoresis in proteomics. Nat Biotechnol 2010;28:654–5.

[9] Jones AR, Carroll K, Knight D, Maclellan K, Domann PJ, Legido-Quigley C, et al. Minimum Information About a Proteomics E. Guidelines for reporting the use of column chromatography in proteomics. Nat Biotechnol 2010;28:654.

[10] Hoogland C, O'Gorman M, Bogard P, Gibson F, Berth M, Cockell SJ, et al. Minimum Information About a Proteomics E. Guidelines for reporting the use of gel image informatics in proteomics. Nat Biotechnol 2010;28:655–6.

[11] Orchard S, Salwinski L, Kerrien S, Montecchi-Palazzi L, Oesterheld M, Stumpflen V, et al. The minimum information required for reporting a molecular interaction experiment (MIMIx). Nat Biotechnol 2007;25:894–8.

[12] Bourbeillon J, Orchard S, Benhar I, Borrebaeck C, de Daruvar A, Dubel S, et al. Minimum information about a protein affinity reagent (MIAPAR). Nat Biotechnol 2010;28:650–3.

[13] Montecchi-Palazzi L, Kerrien S, Reisinger F, Aranda B, Jones AR, Martens L, et al. The PSI semantic validator: a framework to check MIAPE compliance of proteomics data. Proteomics 2009;9:5112–9.

[14] Gloriam DE, Orchard S, Bertinetti D, Bjorling E, Bongcam-Rudloff E, Borrebaeck CA, et al. A community standard format for the representation of protein affinity reagents. Mol Cell Proteomics 2010;9:1–10.

[15] Gibson F, Hoogland C, Martinez-Bartolome S, Medina-Aunon JA, Albar JP, Babnigg G, et al. The gel electrophoresis markup language (GelML) from the Proteomics Standards Initiative. Proteomics 2010;10:3073–81.

[16] Tabb DL, Vega-Montoto L, Rudnick PA, Variyath AM, Ham AJ, Bunk DM, et al. Repeatability and reproducibility in proteomic identifications by liquid chromatography-tandem mass spectrometry. J Proteome Res 2010;9:761–76.

[17] Mischak H, Apweiler R, Banks RE, Conaway M, Coon J, Dominiczak A, et al. Clinical proteomics: a need to define the field and to begin to set adequate standards. Proteomics Clin Appl 2007;1:148–56.

[18] Bell AW, Deutsch EW, Au CE, Kearney RE, Beavis R, Sechi S, et al. A HUPO test sample study reveals common problems in mass spectrometry-based proteomics. Nat Methods 2009;6:423–30.

[19] Sabido E, Selevsek N, Aebersold R. Mass spectrometry-based proteomics for systems biology. Curr Opin Biotechnol 2012;23:591–7.

[20] Meng Z, Veenstra TD. Targeted mass spectrometry approaches for protein biomarker verification. J Proteomics 2011;74:2650–9.

[21] Gerszten RE, Carr SA, Sabatine M. Integration of proteomic-based tools for improved biomarkers of myocardial injury. Clin Chem 2010;56:194–201.

[22] Ning MM, Lopez M, Sarracino D, Cao J, Karchin M, McMullin D, et al. Pharmaco-proteomics opportunities for individualizing neurovascular treatment. Neurol Res 2013;35:448–56.

[23] Yu LR. Pharmacoproteomics and toxicoproteomics: the field of dreams. J Proteomics 2011;74:2549–53.

[24] Michalski A, Damoc E, Lange O, Denisov E, Nolting D, Muller M, et al. Ultra high resolution linear ion trap Orbitrap mass spectrometer (Orbitrap Elite) facilitates top down LC MS/MS and versatile peptide fragmentation modes. Mol Cell Proteomics 2012;11 [O111 013698].

[25] Altelaar AF, Heck AJ. Trends in ultrasensitive proteomics. Curr Opin Chem Biol 2012;16:206–13.

[26] Boja ES, Rodriguez H. Mass spectrometry-based targeted quantitative proteomics: achieving sensitive and reproducible detection of proteins. Proteomics 2012;12:1093–110.

[27] Shi T, Sun X, Gao Y, Fillmore TL, Schepmoes AA, Zhao R, et al. Targeted quantification of low ng/mL level proteins in human serum without immunoaffinity depletion. J Proteome Res 2013;12:3353–61.

[28] Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, et al. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012;11 [M111 013722].

[29] Hebert AS, Merrill AE, Stefely JA, Bailey DJ, Wenger CD, Westphall MS, et al. Amine-reactive neutron-encoded labels for highly plexed proteomic quantitation. Mol Cell Proteomics 2013;12:3360–9.

[30] Ding C, Jiang J, Wei J, Liu W, Zhang W, Liu M, et al. A fast workflow for identification and quantification of proteomes. Mol Cell Proteomics 2013;12:2370–80.

[31] Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, Westphall MS, et al. The one hour yeast proteome. Mol Cell Proteomics 2013;12 [M113.034769].

[32] Kuhn E, Whiteaker JR, Mani DR, Jackson AM, Zhao L, Pope ME, et al. Interlaboratory evaluation of automated, multiplexed peptide immunoaffinity enrichment coupled to multiple reaction monitoring mass spectrometry for quantifying proteins in plasma. Mol Cell Proteomics 2012;11 [M111 013854].

[33] Abbatiello SE, Mani DR, Schilling B, Maclean B, Zimmerman LJ, Feng X, et al. Design, implementation and multisite evaluation of a system suitability protocol for the quantitative assessment of instrument performance in liquid chromatography-multiple reaction monitoring-MS (LC-MRM-MS). Mol Cell Proteomics 2013;12:2623–39.

[34] Eisenacher M, Schnabel A, Stephan C. Quality meets quantity - quality control, data standards and repositories. Proteomics 2011;11:1031–6.

[35] Wiener MC, Sachs JR, Deyanova EG, Yates NA. Differential mass spectrometry: a label-free LC-MS method for finding significant differences in complex peptide and protein mixtures. Anal Chem 2004;76:6085–96.

[36] Wang M, You J, Bemis KG, Tegeler TJ, Brown DP. Label-free mass spectrometry-based protein quantification technologies in proteomic analysis. Brief Funct Genomic Proteomic 2008;7:329–39.

[37] Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, et al. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 2002;1:376–86.

[38] Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 1999;17:994–9.

[39] Yu LR, Conrads TP, Uo T, Issaq HJ, Morrison RS, Veenstra TD. Evaluation of the acid-cleavable isotope-coded affinity tag reagents: application to camptothecin-treated cortical neurons. J Proteome Res 2004;3:469–77.

[40] Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics 2004;3:1154–69.

[41] Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, Schmidt G, et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem 2003;75:1895–904.

[42] Schnolzer M, Jedrzejewski P, Lehmann WD. Protease-catalyzed incorporation of 18O into peptide fragments and its application for protein sequencing by electrospray and matrix-assisted laser desorption/ionization mass spectrometry. Electrophoresis 1996;17:945–53.

[43] Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C. Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal Chem 2001;73:2836–42.

[44] Gao Y, Gopee NV, Howard PC, Yu LR. Proteomic analysis of early response lymph node proteins in mice treated with titanium dioxide nanoparticles. J Proteomics 2011;74:2745–59.

[45] Anderson L, Hunter CL. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol Cell Proteomics 2006;5:573–88.

[46] Levin Y, Schwarz E, Wang L, Leweke FM, Bahn S. Label-free LC-MS/MS quantitative proteomics for large-scale biomarker discovery in complex samples. J Sep Sci 2007;30:2198–203.

[47] Rudnick PA, Clauser KR, Kilpatrick LE, Tchekhovskoi DV, Neta P, Blonder N, et al. Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analyses. Mol Cell Proteomics 2010;9:225–41.

[48] Callister SJ, Barry RC, Adkins JN, Johnson ET, Qian WJ, Webb-Robertson BJ, et al. Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics. J Proteome Res 2006;5:277–86.

[49] Balgley BM, Laudeman T, Yang L, Song T, Lee CS. Comparative evaluation of tandem MS search algorithms using a target-decoy search strategy. Mol Cell Proteomics 2007;6:1599–608.

[50] Gao Y, Holland RD, Yu LR. Quantitative proteomics for drug toxicity. Brief Funct Genomic Proteomic 2009;8:158–66.

[51] Nesvizhskii AI. A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. J Proteomics 2010;73:2092–123.

[52] Beasley-Green A, Bunk D, Rudnick P, Kilpatrick L, Phinney K. A proteomics performance standard to support measurement quality in proteomics. Proteomics 2012;12:923–31.

[53] Vizcaino JA, Cote RG, Csordas A, Dianes JA, Fabregat A, Foster JM, et al. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res 2013;41:D1063–9.