Use of high-throughput LC–MS/MS proteomics technologies in drug discovery


Drug Discovery Today: Technologies Vol. 3, No. 3 2006

Editors-in-Chief

Kelvin Lam – Pfizer, Inc., USA

Henk Timmerman – Vrije Universiteit, The Netherlands

Medicinal chemistry

Zhouxin Shen1, Ming-wei Wang2, Steven P. Briggs1,*

1 Section of Cell & Developmental Biology, Division of Biological Sciences, University of California, San Diego, Natural Science Building, Room 6320, 9500 Gilman Drive, La Jolla, CA 92093-0380, USA

2 The National Center for Drug Screening, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 189 Guo Shou Jing Road, Zhangjiang High-Tech Park, Shanghai 201203, PR China

Significant improvements have been made during the

past few years in technologies for high-throughput

proteomics based on mass spectrometry. As proteo-

mics technologies advance and become more widely

accessible, efforts to profile and quantify full proteomes

are underway to complement other genomics

approaches, such as transcript and metabolite profil-

ing. Of particular interest is the application of proteo-

mics to protein biomarker discovery, which is

increasingly being recognized as crucially important

for the study of disease processes, both from the diag-

nostic and prognostic points of view. This review will

discuss the advances and current limitations of full

proteome analysis for biomarker discovery, including

deep proteome profiling, quantitative proteomics, and

phosphoproteome profiling.

*Corresponding author: S.P. Briggs ([email protected])

1740-6749/$ © 2006 Elsevier Ltd. All rights reserved. DOI: 10.1016/j.ddtec.2006.09.007

Section Editors:

Li-he Zhang – School of Pharmaceutical Science, Peking University, Beijing, China

Kaixian Chen – Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China

Introduction

Significant improvements have been made during the past

few years in technologies for mass spectrometry-based high-

throughput proteomics. As proteomics technologies advance

and become more widely accessible, efforts to profile and

quantify full proteomes are underway to complement other

genomics approaches, such as transcript and metabolite pro-

filing. Of particular interest is the application of proteomics

to protein biomarker discovery, which is increasingly being

recognized as crucially important for the study of disease

processes, both from the diagnostic and prognostic points of

view.

Many new technologies, ranging from instrumentation,

sample preparation, and protein and peptide separation to

computer software and bioinformatics, have emerged in the

past few years. The three major areas in the proteomics field

are deep proteome profiling, quantitative proteomics, and

posttranslational modification. These three areas represent

the biggest challenges as well as opportunities in proteomics.

The recent advances in these three areas are reviewed here.

Deep proteome profiling

The rapid advances of mass spectrometry-based high-

throughput proteomics have enabled scientists to survey

large numbers of proteins in a systems biology mode [1,2].

The extreme complexity and dynamic range of biological

samples, especially the plasma proteome [3], pose a big

challenge for identifying low-abundance proteins.

Figure 1. Schematics of reverse phase (RP)-strong cation exchange (SCX)-reverse phase (RP) 3D liquid chromatography (3DLC) and SCX-RP 2D LC (2DLC)

separation. In 2DLC separation, all peptides are loaded onto the SCX column and a salt gradient is used to fractionate the peptides onto the RP column; a

high-resolution separation is achieved on the RP column. In 3DLC separation, an initial reverse phase separation (RP1) separates peptides based on

hydrophobicity, the subsequent SCX further separates the peptides using a salt gradient, and finally the second RP2 enables a high-resolution separation of

the SCX fractions.

‘Shotgun’ proteomics [4], a combination of multidimensional liquid

chromatography (LC), tandem mass spectrometry (MS/MS)

and data processing, has been proven to be an effective method

for deep proteome profiling of complex samples. In a typical

shotgun proteomics analysis, the protein mixture is digested

with a proteolytic enzyme (trypsin in most cases) or a set of

enzymes. The peptide mixture is separated by in-line chroma-

tography column segments composed of a strong cation

exchange (SCX) resin followed by a reverse phase (RP) resin

coupled with electrospray ionization and MS/MS analysis.

By combining shotgun proteomics with various sample fractio-

nation and separation technologies, thousands of proteins can

be identified in a single experiment. Using Saccharomyces

cerevisiae (yeast) as a model system with 6300 genes, 1484

proteins were identified using an online system [4] and 1504

proteins were identified using an offline [5] SCX-RPLC (2DLC)

method. An RP presegment of the LC column can be added in

front of the SCX segment to either further enhance the 2DLC

separation (Fig. 1) by fractionating the peptides using shallow

RP gradients or to function as a desalting segment in the online

configuration. Using a 3DLC–MS/MS approach, a total of 3019

proteins were identified from yeast [6].
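The digestion step at the start of a shotgun experiment can be illustrated with a minimal in-silico sketch. It assumes the commonly used trypsin rule (cleavage after K or R unless the next residue is P); the function name, the `missed_cleavages` parameter and the example sequence are hypothetical and are not taken from the studies cited above.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in-silico tryptic digest.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    """
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    # Join neighbouring fragments to model up to `missed_cleavages` missed sites.
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


print(tryptic_digest("MKWVTFRPLLKAGR"))  # ['MK', 'WVTFRPLLK', 'AGR']
```

Each resulting peptide is what the SCX and RP dimensions would separate before MS/MS; real search engines apply additional rules (modifications, length limits) that are omitted here.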

Shotgun proteomics has been successfully applied to human

and mouse samples. Four hundred and ninety proteins were

identified from human serum using offline SCX-RP 2DLC–MS/

MS [7]. Combining subcellular fractionation with an online

2DLC–MS/MS approach, 3274 proteins were identified with

high confidence from four major organellar compartments

[cytosol, membranes (microsomes), mitochondria, and nuclei]

of six organs (brain, heart, kidney, liver, lung, and placenta)

from mouse [8].

With advances in mass spectrometry instrumentation,

such as the linear two-dimensional ion trap [9], the sensitivity

and scan speed have been dramatically improved. By combining

the shotgun proteomics method with the new mass

spectrometers, thousands of proteins can be routinely iden-

tified from complex biological samples. The whole mouse

brain proteome was studied by combining global proteome

profiling with a cysteinyl-peptide enrichment method. Both

the global and the cysteinyl-enriched peptide samples were

analyzed by offline SCX-RP 2D LC–MS/MS. Seven thousand

seven hundred and ninety two nonredundant proteins

(approximately 34% of the predicted mouse proteome) were

identified in this study [10]. Protein profiling of human

embryonic kidney cells has been conducted (line HEK293;

unpublished data). HEK293 cells grown under normal con-

ditions were compared with cells that were treated with CoCl2,

which simulates some aspects of hypoxia. A total of 18,300

proteins were identified, and the protein quantities were

compared for the two conditions using a nonlabeled, spectral

counting method (see discussion in the Quantitative proteo-

mics section). The depth of these protein profiling studies is

approaching the depth of transcript-based profiling, which is

the most widely used method for systems biological studies.

Combining the proteomics data with microarray data will

reveal important information such as posttranscriptional

and posttranslational gene regulation.

Quantitative proteomics

Quantitative proteome analysis is complementary to tran-

scriptome methods to study steady-state gene expression and

perturbation-induced changes. In comparison with gene

expression analysis at the mRNA level, proteome analysis

provides more accurate information about biological systems

and pathways because the measurement directly focuses on

the proteins. Currently, two main approaches are used for

relative quantitation in shotgun proteomics: stable isotope

labeling methods and label-free methods.


Isotope labeling methods

In the stable isotope labeling methods, two samples are

labeled with chemically identical tags that only differ in

isotopic composition. The samples are pooled together after

labeling and are analyzed by LC–MS/MS. Because the labeled

peptide pairs have virtually identical physical and chemical

properties except the masses, they behave in the same way

during sample preparation, LC separation, and electrospray

ionization. The MS intensity ratios of the peptide pairs can be

used for relative quantitation. The three most widely used

isotope labeling methods are isotope coded affinity tags

(ICAT) [11], 16O/18O labeling [12] and stable isotope labeling

by amino acids in cell culture (SILAC) [13].
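As a minimal sketch of how peptide-pair intensity ratios might be rolled up into a relative protein abundance, the example below takes the median of the log2 heavy/light ratios. The use of the median, the function name and the intensity values are illustrative assumptions, not a method described in the cited references.

```python
from math import log2
from statistics import median


def protein_ratio(pairs):
    """Estimate a heavy/light protein ratio from peptide pairs.

    pairs: list of (light_intensity, heavy_intensity) tuples for one protein.
    Assumption: the median log2 ratio is used as a robust summary.
    """
    log_ratios = [log2(heavy / light)
                  for light, heavy in pairs
                  if light > 0 and heavy > 0]
    if not log_ratios:
        return None  # no usable peptide pair
    return 2 ** median(log_ratios)


# Hypothetical intensities for three peptide pairs of one protein.
pairs = [(1000.0, 2000.0), (500.0, 1000.0), (800.0, 3200.0)]
print(protein_ratio(pairs))  # 2.0
```

Working in log space keeps up- and down-regulation symmetric, and the median limits the influence of a single poorly integrated pair.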

In the ICAT approach, two samples are labeled with isotope-

coded (H/D or 12C/13C) tags that carry a thiol-reactive group (which

covalently links to cysteine residues) and a biotin moiety. The

samples are combined and enzymatically digested, and the

labeled peptides are selectively enriched via biotin–avidin

affinity chromatography. Because the ICAT-labeled peptide

fragments differ in mass by a known amount, they can be

separated and quantified via mass spectrometry. Because only

cysteine-containing peptides are labeled and enriched, any given complex protein

sample is simplified, thus allowing the analysis of an

increased dynamic range of peptides. The specificity of the

ICAT reagents for cysteine residues is also one of its draw-

backs. Because the ICAT reagents can only be used to analyze

proteins that contain a cysteine residue, many important

proteins, including those with posttranslational modifica-

tions, are missed by this technique.

In the 16O/18O method, two samples are digested by trypsin

in H₂¹⁶O or H₂¹⁸O. Two 16O or 18O atoms are incorporated

universally into the carboxyl termini of all tryptic peptides

during the proteolytic cleavage of all proteins. The two pep-

tide mixtures are pooled together and analyzed by mass

spectrometry. Relative signal intensities of paired peptides

(differing by 4 Da) determine the expression levels of their

precursor proteins in the two samples. The advantages of 16O/18O labeling are its high efficiency, simple protocol, and

inexpensive reagents. However, the 4 Da mass difference is

sometimes too small to be differentiated by ESI-ion trap mass

spectrometers, especially for the triply charged ions. Higher

resolution precursor ion scans such as zoom scans may be

useful to overcome this problem.
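The charge-state dependence of this problem follows from simple arithmetic: the 4 Da mass difference is divided by the charge when observed on the m/z axis. The short sketch below works this out, using standard monoisotopic masses for 16O and 18O; the function name is hypothetical.

```python
# Monoisotopic masses (Da) of the two oxygen isotopes.
MASS_16O = 15.994915
MASS_18O = 17.999160


def pair_spacing_mz(charge, n_oxygens=2):
    """m/z spacing between the 16O- and 18O-labeled forms of a peptide."""
    return n_oxygens * (MASS_18O - MASS_16O) / charge


for z in (1, 2, 3):
    print(f"charge {z}+: spacing = {pair_spacing_mz(z):.3f} m/z")
```

For a triply charged ion the pair is separated by only about 1.34 m/z, so the isotope envelopes of the light and heavy forms overlap, which is why a higher-resolution scan helps.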

The SILAC method incorporates isotopic labels into pro-

teins via metabolic labeling during cell culture growth. Cell

samples to be compared are grown separately in media con-

taining either a heavy or a light form of an essential amino

acid (i.e. one that cannot be synthesized by the cell). The

advantages of SILAC are that it has higher fidelity than ICAT

(incorporating with nearly 100% efficiency) and does not

require multiple chemical processing and purification steps,

thus ensuring that the samples to be compared have been

subjected to similar conditions throughout the experiment.

This approach, however, is not suitable for experimental

samples, such as primary human cells that cannot be grown

in vitro.

A common drawback shared by the approaches above is

that quantitation is based on parent ion intensities. In a deep

proteome profiling experiment, most of the instrument time

(>90%) is used for MS/MS to collect peptide fragment infor-

mation for protein identification. Less than 10% of the

instrument time is used to acquire parent ion data. Thus,

the instrument duty cycle is biased against ion intensity

measurements. Furthermore, the extreme sensitivity and

dynamic range of shotgun proteomics comes from the tan-

dem MS scans, in which only the intended ions are isolated

and fragmented. Very often in a deep protein profiling experi-

ment, a large fraction of the identified peptides have very low-

intensity MS signals, making the chromatogram peak picking

and integration very difficult and inaccurate, especially for

lower resolution mass spectrometers such as ion traps.

To overcome these drawbacks, a new class of isobaric

reagents, iTRAQ, was developed [14]. iTRAQ can be used

for multiplexed protein profiling of up to four different

samples. This approach labels samples with four independent

reagents of the same mass that, upon fragmentation in MS/

MS, give rise to four unique reporter ions (m/z = 114–117) that

are subsequently used to quantify the four different samples,

respectively. In the above-mentioned approach, relative

quantitation by iTRAQ clearly has advantages over 16O/18O

labeling (unpublished data). The iTRAQ ratios are more accu-

rate, and they are not abundance dependent, which makes

quantitation of low-abundance peptides more robust.
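Reporter-ion quantitation can be sketched as follows. The example assumes nominal reporter m/z values of 114 to 117 with a simple matching tolerance, and uses a hypothetical peak list; it is an illustration of the principle rather than the procedure used by any specific software.

```python
REPORTER_MZ = (114, 115, 116, 117)  # nominal reporter ion m/z values


def reporter_intensities(spectrum, tolerance=0.2):
    """Sum the intensity found near each reporter m/z in one MS/MS spectrum.

    spectrum: list of (m/z, intensity) peaks. The tolerance is an assumption.
    """
    totals = {mz: 0.0 for mz in REPORTER_MZ}
    for mz, intensity in spectrum:
        for reporter in REPORTER_MZ:
            if abs(mz - reporter) <= tolerance:
                totals[reporter] += intensity
    return totals


def ratios_to_reference(totals, reference=114):
    """Express each channel relative to the chosen reference channel."""
    ref = totals[reference]
    return {mz: (value / ref if ref > 0 else None)
            for mz, value in totals.items()}


# Hypothetical MS/MS peak list: four reporter ions plus one sequence ion.
spectrum = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 500.0),
            (117.11, 1000.0), (175.12, 300.0)]
print(ratios_to_reference(reporter_intensities(spectrum)))
```

Because all four channels are read from the same MS/MS scan, the ratios do not depend on integrating a weak parent ion chromatogram.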

Besides the relative quantitation of proteins, there is a

strong need for the determination of absolute quantities of

proteins in ultra-complex mixtures. As a promising approach,

stable isotope dilution in combination with shotgun pro-

teomics is emerging [15]. Similar to the isotope labeling

approaches, proteins of interest are mixed with synthetic

peptide standards of known concentration with an incorpo-

rated stable isotope (13C, 15N). These standards are identical

to the analyte peptides of interest but are distinguished by

mass difference. Stable isotope-labeled and unlabeled pep-

tides coelute during LC separation and absolute quantitation

is achieved by comparison of the peak area abundances of the

internal standard peptide with the corresponding native

counterpart by multiple reaction monitoring (MRM) via tan-

dem MS.
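The calculation behind stable isotope dilution is a single proportion: the native amount equals the native-to-standard peak area ratio multiplied by the spiked amount of standard. The sketch below assumes equal response for the labeled and unlabeled peptides; the function name, units and values are hypothetical.

```python
def absolute_amount(native_area, standard_area, standard_amount_fmol):
    """Amount of native peptide (fmol) from MRM peak areas.

    Assumes the labeled standard and the native peptide co-elute and
    ionize with equal efficiency, so the area ratio equals the molar ratio.
    """
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return native_area / standard_area * standard_amount_fmol


# Hypothetical peak areas with 50 fmol of labeled standard spiked in.
print(absolute_amount(native_area=3.0e6, standard_area=1.5e6,
                      standard_amount_fmol=50.0))  # 100.0
```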

Label-free quantitation

The label-free quantitation approach relies on peak intensity

measurements of peptides detected by mass spectrometry

[16] or on the number of MS/MS spectra per protein detected

[17]. An advantage of these two approaches is the reduction

in cost of experiments when compared with use of stable

isotopes. There are, however, several disadvantages of these



approaches when used in conjunction with complex peptide

mixtures. Changes in peptide chromatography conditions

and ionization efficiency may affect quantitation. This pro-

blem is compounded when using multidimensional chroma-

tography separations of complex mixtures because slight

variation in chromatography will lead to irreproducible pep-

tide separations and different ionization environments.
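A minimal sketch of relative quantitation by spectral counting is given below. Normalizing each run by its total count and adding a pseudocount of one are illustrative assumptions introduced here to handle run-to-run differences and proteins missing from one sample; they are not prescribed by the cited work, and the protein names and counts are hypothetical.

```python
def normalized_counts(counts):
    """Convert spectral counts to fractions of the run total."""
    total = sum(counts.values())
    return {protein: n / total for protein, n in counts.items()}


def spectral_count_ratios(control, treated, pseudocount=1):
    """Treated/control ratio per protein from MS/MS spectral counts."""
    proteins = set(control) | set(treated)
    c = normalized_counts({p: control.get(p, 0) + pseudocount for p in proteins})
    t = normalized_counts({p: treated.get(p, 0) + pseudocount for p in proteins})
    return {p: t[p] / c[p] for p in sorted(proteins)}


# Hypothetical spectral counts for two proteins in two conditions.
control = {"P1": 9, "P2": 19}
treated = {"P1": 19, "P2": 9}
print(spectral_count_ratios(control, treated))
```

With small counts, a single extra spectrum changes the ratio substantially, which is one reason the approach is regarded as semi-quantitative for low-abundance proteins.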

A recent project sponsored by the Association of Biomo-

lecular Resource Facilities compared different relative quan-

titation methods from 52 laboratories (ABRF Proteomics

Research Group, PRG 2006: Relative Protein Quantitation

Study Results, http://www.abrf.org/prg/). Two samples, both

containing the same eight proteins, were mixed at known

ratios ranging from 4:1 to 1:76. Gel electrophoresis-based,

isotope labeling, and label-free mass spectrometry-based

quantitation methods were included in the study. Overall,

MS outperformed electrophoretic techniques, and particu-

larly the label-free MS methods did remarkably well for

proteins with a large ratio difference.

Phosphoproteome profiling

Protein phosphorylation is widely recognized as a major

mechanism for regulating protein function. Phosphoryla-

tion may alter protein activity or subcellular localization,

cause proteins to be degraded, or cause changes in composi-

tion of protein complexes. Alterations in the regulation of

phosphorylation pathways can play a key role in oncogen-

esis by the activation of cell proliferation signaling or by

inhibition of apoptotic pathways. Upregulation of the cell

surface receptors that are required for activation of phos-

phorylation pathways is well documented for numerous

types of cancers. Different kinases may be involved in each

of these processes for a single protein, allowing a large

degree of combinatorial regulation at the posttranslational

level. Therefore, in addition to knowing if a protein is

phosphorylated, knowing which residue is phosphorylated

during a particular response is essential in understanding

the mechanism of regulation. Traditionally, identification

of phosphoproteins was performed one-by-one, character-

izing a single protein during a single response. With the

advances of modern mass spectrometry, high-throughput

analysis of protein phosphorylation in whole cells or tissues

becomes possible.

Phosphopeptides have intrinsic properties that act as

obstacles to their detection and identification. First, the

fragmentation patterns of phosphopeptides by collision-

induced dissociation (CID) are often dominated by a neutral

loss of phosphoric acid (H3PO4, 98 Da) from the phospho-

peptide, producing uninformative MS/MS spectra. Searching

such data against protein databases severely reduces the

possibility of unambiguously identifying the phosphopep-

tide and pinpointing the phosphorylation sites. Data-depen-

dent MS3 scans of the neutral loss of phosphoric acid have been


shown to be effective for improving the fragmentation of

phosphopeptides by CID [18].
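The trigger for such an MS3 scan can be sketched as a check for a dominant fragment at the precursor m/z minus 98 Da divided by the charge. The tolerance and the relative-intensity threshold below are illustrative assumptions, as are the function name and the peak list; instrument software uses its own criteria.

```python
H3PO4_MASS = 97.9769  # monoisotopic mass of phosphoric acid (Da)


def has_phospho_neutral_loss(precursor_mz, charge, peaks,
                             tolerance=0.5, min_rel_intensity=0.5):
    """Return True if the MS/MS scan shows a dominant H3PO4 neutral loss.

    peaks: list of (m/z, intensity). A peak counts as dominant when it
    reaches `min_rel_intensity` of the base peak (assumed threshold).
    """
    if not peaks:
        return False
    expected_mz = precursor_mz - H3PO4_MASS / charge
    base_peak = max(intensity for _, intensity in peaks)
    return any(abs(mz - expected_mz) <= tolerance
               and intensity >= min_rel_intensity * base_peak
               for mz, intensity in peaks)


# Hypothetical doubly charged precursor at m/z 700.0: loss expected near 651.0.
peaks = [(651.0, 1000.0), (400.2, 150.0), (523.3, 90.0)]
print(has_phospho_neutral_loss(700.0, 2, peaks))  # True
```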

Second, the phosphorylated form of a protein frequently

has low stoichiometry relative to its unphosphorylated coun-

terpart. The data-dependent MS/MS scan method is typically

set up to fragment peptide ions according to their intensities.

Considering the already low expression levels of most pro-

teins regulated by phosphorylation, it is obvious that analyz-

ing phosphopeptides without any enrichment can easily miss

these rare peptides. Affinity-based methods (antibody or

metal ion chromatography) are the most common ways to

enrich phosphopeptides.
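The effect of intensity-ranked precursor selection can be shown with a small sketch. It assumes a simple "top N" rule applied to one survey scan; the function name, the value of N and the peak list are hypothetical.

```python
def select_precursors(survey_scan, top_n=3):
    """Pick the top_n most intense ions of a survey scan for MS/MS.

    survey_scan: list of (m/z, intensity) tuples.
    """
    ranked = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:top_n]]


# Hypothetical survey scan; the last ion stands in for a
# low-stoichiometry phosphopeptide.
scan = [(450.2, 9.0e5), (512.8, 7.5e5), (633.3, 4.0e5), (701.4, 2.0e4)]
selected = select_precursors(scan)
print(selected)           # [450.2, 512.8, 633.3]
print(701.4 in selected)  # False: the weak ion is never fragmented
```

Enrichment removes most of the abundant unphosphorylated ions from the survey scan, so the phosphopeptide is more likely to rank among the selected precursors.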

Antibodies are routinely used to immunoprecipitate spe-

cific proteins. There are several commercially available anti-

bodies that bind to phosphorylated Tyr residues in a generic

fashion. These antibodies can be used to immunoprecipitate,

and therefore to enrich, tyrosine phosphorylated peptides

from complex mixtures of proteins such as cell lysates. Tyr

phosphorylation is generally thought to represent less than

1% of all cellular phosphorylation, making these sites even harder

to find. By using a phosphotyrosine-specific antibody

to immunoprecipitate phosphotyrosine peptides directly

from digested cellular protein extracts, together with standard shotgun

proteomics methods, 688 nonredundant phosphotyrosine-

containing peptides and 628 phosphotyrosine sites have been

identified from three distinct cell types [19].

The most common method to enrich phosphopeptides is

immobilized metal affinity chromatography (IMAC). Under

acidic binding conditions, the strong positive charge of the

transition metal, usually Fe3+ or Ga3+, selects the negatively

charged phosphate group from the mixture [20]. A potential

limitation of IMAC is that it may also bind peptides contain-

ing acidic residues. Methylation of acidic residues was found

to improve the specific binding of phosphopeptides [21] in a

study of 216 peptide sequences defining 383 sites of phos-

phorylation from yeast. TiO2 (titanium dioxide) was recently

found to be an effective alternative to IMAC for phosphopep-

tide enrichment [22].

Chromatographic separation can also be effective to enrich

phosphopeptides. The use of strong cation exchange (SCX)

columns under very specific pH conditions greatly enriches

phosphopeptides. This approach resulted in the identifica-

tion of 2002 phosphorylation sites from 967 phospho-

proteins using 8 mg of HeLa cell

nuclear protein [18]. The combination of IMAC and SCX

fractionation identified phosphorylation sites from more

than 500 yeast proteins [23].

Conclusion

Shotgun proteomics has proven to be the method of choice

for large-scale proteomics. For complex biological samples,

multidimensional LC separation is necessary to separate the

peptide mixtures to be detected by the mass spectrometer. Both

online and offline LC separations have been successfully

applied to various systems. The online approach provides

the benefits of less human error, higher sample recovery, and

better reproducibility (Table 1).

Table 1. Comparison summary table

| | Technology 1 | Technology 2 | Technology 3 | Technology 4 | Technology 5 |
|---|---|---|---|---|---|
| Name of specific type of technology | Protein profiling | Protein profiling | Quantitation | Quantitation | Quantitation |
| Names of specific technologies with associated companies and company websites | Online LC separation | Offline LC separation | Intensity-based isotope labeling: ICAT (b), SILAC (c), 16O/18O | iTRAQ (a) | Spectra count |
| Pros | Fully automated; minimum sample loss; better reproducibility | More flexible than online; can use more separation modes and solvents; easy to set up in most laboratories | Duplex; reproducible chromatographic separation | Multiplex (4); same chromatographic separation for all four samples; single parent ion and MS/MS scan for all four samples | No labeling required; minimum sample loss; sensitivity not compromised; easy; cheap |
| Cons | Limited separation modes; solvent has to be compatible with MS | Potential sample loss; poorer reproducibility | Less accurate for low-abundance proteins; more expensive than label-free methods | More expensive than label-free methods | Semi-quantitative; less accurate for low-abundance proteins; no multiplex |
| References | [4,6] | [5] | [11–13] | [14] | [17] |

(a) Isobaric tags for relative and absolute quantitation.

(b) Isotope-coded affinity tags.

(c) Stable isotope labeling by amino acids in cell culture.

Currently, two main approaches are used for relative quan-

titation in shotgun proteomics: stable isotope labeling meth-

ods and label-free methods (Table 1). In the label-free

approach, direct measurements of the peak intensity or the

number of MS/MS spectra are used for relative quantitation. It

is a simple but effective way to compare different samples.

The disadvantage is that it only works well for relatively

abundant proteins and typically many replicate analyses

are required to obtain reliable quantitation. Isotope labeling

approaches have the advantages of multiplexing and mini-

mum run-to-run variation. Similar to the label-free methods,

the MS intensity-based methods including ICAT, SILAC and 16O/18O labeling do not work well for low-abundance pro-

teins. iTRAQ technology overcomes this problem by measur-

ing the intensities of the reporter ions in the MS/MS for

relative quantitation. It also has the advantage of running

four samples in a single LC–MS/MS run.

Global posttranslational modification, especially phos-

phorylation, is another big challenge in the proteomics

field. Both antibody and metal affinity-based enrichment

methods have been developed to help detect phosphopep-

tides. Chromatographic enrichment was also found to be

useful. Owing to the nature of the phosphopeptide, it is

unlikely that any of the above methods alone can solve the

problem. A combination of affinity and chromatographic

enrichments as well as new advances in mass spectrometry

are needed to enable researchers to study the full phospho-

proteome.

References

1 Aebersold, R. and Goodlett, D.R. (2001) Mass spectrometry in proteomics.

Chem. Rev. 101, 269–295

2 Yates, J.R., 3rd (2004) Mass spectral analysis in proteomics. Annu. Rev.

Biophys. Biomol. Struct. 33, 297–316

3 Anderson, N.L. and Anderson, N.G. (2002) The human plasma proteome:

history, character, and diagnostic prospects. Mol. Cell Proteomics 1, 845–867

4 Washburn, M.P. et al. (2001) Large-scale analysis of the yeast proteome by

multidimensional protein identification technology. Nat. Biotechnol. 19,

242–247

5 Peng, J. et al. (2003) Evaluation of multidimensional chromatography

coupled with tandem mass spectrometry (LC/LC–MS/MS) for large-scale

protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50

6 Wei, J. et al. (2005) Global proteome discovery using an online three-

dimensional LC–MS/MS. J. Proteome Res. 4, 801–808

7 Adkins, J.N. et al. (2002) Toward a human blood serum proteome: analysis

by multidimensional separation coupled with mass spectrometry. Mol.

Cell Proteomics 1, 947–955

8 Kislinger, T. et al. (2006) Global survey of organ and organelle protein

expression in mouse: combined proteomic and transcriptomic profiling.

Cell 125, 173–186

9 Cha, B. et al. (2000) An interface with a linear quadrupole ion guide

for an electrospray-ion trap mass spectrometer system. Anal. Chem. 72,

5647–5654

10 Wang, H. et al. (2006) Characterization of the mouse brain proteome using

global proteomic analysis complemented with cysteinyl-peptide

enrichment. J. Proteome Res. 5, 361–369

11 Gygi, S.P. et al. (1999) Quantitative analysis of complex protein mixtures

using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999

12 Yao, X. et al. (2001) Proteolytic 18O labeling for comparative proteomics:

model studies with two serotypes of adenovirus. Anal. Chem. 73,

2836–2842

13 Ong, S.E. et al. (2002) Stable isotope labeling by amino acids in cell culture,

SILAC, as a simple and accurate approach to expression proteomics. Mol.

Cell Proteomics 1, 376–386



14 Ross, P.L. et al. (2004) Multiplexed protein quantitation in Saccharomyces

cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell

Proteomics 3, 1154–1169

15 Gerber, S.A. et al. (2003) Absolute quantification of proteins and

phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U S A

100, 6940–6945

16 Andersen, J.S. et al. (2003) Proteomic characterization of the human

centrosome by protein correlation profiling. Nature 426, 570–574

17 Liu, H. et al. (2004) A model for random sampling and estimation of relative

protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201

18 Beausoleil, S.A. et al. (2004) Large-scale characterization of HeLa cell

nuclear phosphoproteins. Proc. Natl. Acad. Sci. U S A 101, 12130–12135


19 Rush, J. et al. (2005) Immunoaffinity profiling of tyrosine phosphorylation

in cancer cells. Nat. Biotechnol. 23, 94–101

20 Posewitz, M.C. and Tempst, P. (1999) Immobilized gallium(III) affinity

chromatography of phosphopeptides. Anal. Chem. 71, 2883–2892

21 Ficarro, S.B. et al. (2002) Phosphoproteome analysis by mass spectrometry

and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20,

301–305

22 Kuroda, I. et al. (2004) Phosphopeptide-selective column-switching

RP-HPLC with a titania precolumn. Anal. Sci. 20, 1313–1319

23 Gruhler, A. et al. (2005) Quantitative phosphoproteomics applied

to the yeast pheromone signaling pathway. Mol. Cell Proteomics 4,

310–327
