Drug Discovery Today: Technologies Vol. 3, No. 3 2006
Editors-in-Chief
Kelvin Lam – Pfizer, Inc., USA
Henk Timmerman – Vrije Universiteit, The Netherlands
Medicinal chemistry
Use of high-throughput LC–MS/MS proteomics technologies in drug discovery
Zhouxin Shen1, Ming-wei Wang2, Steven P. Briggs1,*
1Section of Cell & Developmental Biology, Division of Biological Sciences, University of California, San Diego, Natural Science Building, Room 6320, 9500 Gilman Drive, La Jolla, CA 92093-0380, USA
2The National Center for Drug Screening, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 189 Guo Shou Jing Road, Zhangjiang High-Tech Park, Shanghai 201203, PR China
Significant improvements have been made during the
past few years in technologies for high-throughput
proteomics based on mass spectrometry. As proteo-
mics technologies advance and become more widely
accessible, efforts to profile and quantify full proteomes
are underway to complement other genomics
approaches, such as transcript and metabolite profil-
ing. Of particular interest is the application of proteo-
mics to protein biomarker discovery, which is
increasingly being recognized as crucially important
for the study of disease processes, both from the diag-
nostic and prognostic points of view. This review will
discuss the advances and current limitations of full
proteome analysis for biomarker discovery, including
deep proteome profiling, quantitative proteomics, and
phosphoproteome profiling.
*Corresponding author: S.P. Briggs ([email protected])
1740-6749/$ © 2006 Elsevier Ltd. All rights reserved. DOI: 10.1016/j.ddtec.2006.09.007
Section Editors:
Li-he Zhang – School of Pharmaceutical Science, Peking University, Beijing, China
Kaixian Chen – Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
Introduction
Significant improvements have been made during the past
few years in technologies for mass spectrometry-based high-
throughput proteomics. As proteomics technologies advance
and become more widely accessible, efforts to profile and
quantify full proteomes are underway to complement other
genomics approaches, such as transcript and metabolite pro-
filing. Of particular interest is the application of proteomics
to protein biomarker discovery, which is increasingly being
recognized as crucially important for the study of disease
processes, both from the diagnostic and prognostic points of
view.
Many new technologies have emerged in the past few years,
ranging from instrumentation, sample preparation, and protein
and peptide separation to computer software and bioinformatics.
The three major areas in the proteomics field are deep proteome
profiling, quantitative proteomics, and the analysis of
posttranslational modifications. These three areas represent
the biggest challenges as well as opportunities in proteomics.
The recent advances in these three areas are reviewed here.
Deep proteome profiling
Figure 1. Schematics of reverse phase (RP)-strong cation exchange (SCX)-reverse phase (RP) 3D liquid chromatography (3DLC) and SCX-RP 2D liquid chromatography (2DLC)
separation. In 2DLC separation, all peptides are loaded onto the SCX column and a salt gradient is used to fractionate the peptides onto the RP column; a
high-resolution separation is achieved on the RP column. In 3DLC separation, an initial reverse phase separation (RP1) separates peptides based on
hydrophobicity, the subsequent SCX further separates the peptides using a salt gradient, and finally the second RP2 enables a high-resolution separation of
the SCX fractions.

The rapid advances of mass spectrometry-based high-
throughput proteomics have enabled scientists to survey
large numbers of proteins in a systems biology mode [1,2].
The extreme complexity and dynamic range of biological
samples, especially the plasma proteome [3], pose a big
challenge for identifying low-abundance proteins. ‘Shotgun’
proteomics [4], a combination of multidimensional liquid
chromatography (LC), tandem mass spectrometry (MS/MS)
and data processing, has been proven to be an effective method
for deep proteome profiling of complex samples. In a typical
shotgun proteomics analysis, the protein mixture is digested
with a proteolytic enzyme (trypsin in most cases) or a set of
enzymes. The peptide mixture is separated by in-line chroma-
tography column segments composed of a strong cation
exchange (SCX) resin followed by a reverse phase (RP) resin
coupled with electrospray ionization and MS/MS analysis.
Combining shotgun proteomics with various sample fractio-
nation and separation technologies, thousands of proteins can
be identified in a single experiment. Using Saccharomyces
cerevisiae (yeast) as a model system with 6300 genes, 1484
proteins were identified using an online system [4] and 1504
proteins were identified using an offline [5] SCX-RPLC (2DLC)
method. An RP presegment of the LC column can be added in
front of the SCX segment to either further enhance the 2DLC
separation (Fig. 1) by fractionating the peptides using shallow
RP gradients or to function as a desalting segment in the online
configuration. Using a 3DLC–MS/MS approach, a total of 3019
proteins were identified from yeast [6].
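
The digestion step that starts a shotgun analysis can be illustrated in silico. The following is a minimal sketch, not the procedure used in the cited studies: it assumes the commonly used simplification that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name, the `missed_cleavages` parameter and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in-silico tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    # First cut the sequence into fully cleaved fragments.
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    # Then join neighbouring fragments to model missed cleavages.
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides


# Hypothetical sequence; the R at position 4 is followed by P and is not cleaved.
print(tryptic_digest("AAKRPGGRCCK"))
```

Allowing missed cleavages enlarges the peptide list, which is one reason database searches of complex samples grow quickly with each relaxed digestion rule.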
Shotgun proteomics has been successfully applied to human
and mouse samples. Four hundred and ninety proteins were
identified from human serum using offline SCX-RP 2DLC–MS/
MS [7]. Combining subcellular fractionation with an online
2DLC–MS/MS approach, 3274 proteins were identified with
high confidence from four major organellar compartments
[cytosol, membranes (microsomes), mitochondria, and nuclei]
of six organs (brain, heart, kidney, liver, lung, and placenta)
from mouse [8].
With advances in mass spectrometry instrumentation,
such as the linear two-dimensional ion trap [9], the sensitivity
and scan speed have been dramatically improved. By com-
bining the shotgun proteomics method with the new mass
spectrometers, thousands of proteins can be routinely iden-
tified from complex biological samples. The whole mouse
brain proteome was studied by combining global proteome
profiling with a cysteinyl-peptide enrichment method. Both
the global and the cysteinyl-enriched peptide samples were
analyzed by offline SCX-RP 2D LC–MS/MS. Seven thousand
seven hundred and ninety two nonredundant proteins
(approximately 34% of the predicted mouse proteome) were
identified in this study [10]. Protein profiling of human
embryonic kidney cells has been conducted (line HEK293;
unpublished data). HEK293 cells grown under normal conditions
were compared with cells that were treated with CoCl2,
which simulates some aspects of hypoxia. A total of 18,300
proteins were identified, and the protein quantities were
compared for the two conditions using a nonlabeled, spectral
counting method (see discussion in the Quantitative proteo-
mics section). The depth of these protein profiling studies is
approaching the depth of transcript-based profiling, which is
the most widely used method for systems biological studies.
Combining the proteomics data with microarray data will
reveal important information such as posttranscriptional
and posttranslational gene regulation.
Quantitative proteomics
Quantitative proteome analysis is complementary to tran-
scriptome methods to study steady-state gene expression and
perturbation-induced changes. In comparison with gene
expression analysis at the mRNA level, proteome analysis
provides more accurate information about biological systems
and pathways because the measurement directly focuses on
the proteins. Currently, two main approaches are used for
relative quantitation in shotgun proteomics: stable isotope
labeling methods and label-free methods.
Isotope labeling methods
In the stable isotope labeling methods, two samples are
labeled with chemically identical tags that only differ in
isotopic composition. The samples are pooled together after
labeling and are analyzed by LC–MS/MS. Because the labeled
peptide pairs have virtually identical physical and chemical
properties except the masses, they behave in the same way
during sample preparation, LC separation, and electrospray
ionization. The MS intensity ratios of the peptide pairs can be
used for relative quantitation. The three most widely used
isotope labeling methods are isotope coded affinity tags
(ICAT) [11], 16O/18O labeling [12] and stable isotope labeling
by amino acids in cell culture (SILAC) [13].
In the ICAT approach, two samples are labeled with isotope-coded
(H/D or 12C/13C) tags that contain a thiol-reactive group (which
covalently links to cysteine residues) and a biotin moiety. The
samples are combined and enzymatically digested, and the
labeled peptides are selectively enriched via biotin–avidin
affinity chromatography. Because the ICAT-labeled peptide
pairs differ in mass by a known amount, they can be
distinguished and quantified via mass spectrometry. By specifically
labeling only cysteine residues, any given complex protein
sample is simplified, thus allowing the analysis of an
increased dynamic range of peptides. The specificity of the
ICAT reagents for cysteine residues is also one of their drawbacks.
Because the ICAT reagents can only be used to analyze
proteins that contain a cysteine residue, many important
proteins, including those with posttranslational modifications,
are missed by this technique.
In the 16O/18O method, two samples are digested by trypsin
in H2(16O) or H2(18O). Two 16O or 18O atoms are incorporated
universally into the carboxyl termini of all tryptic peptides
during the proteolytic cleavage of all proteins. The two peptide
mixtures are pooled together and analyzed by mass
spectrometry. Relative signal intensities of paired peptides
(differing by 4 Da) determine the expression levels of their
precursor proteins in the two samples. The advantages of
16O/18O labeling are its high efficiency, simple protocol, and
inexpensive reagents. However, the 4 Da mass difference is
sometimes too small to be differentiated by ESI-ion trap mass
spectrometers, especially for the triply charged ions. Higher
resolution precursor ion scans such as zoom scans may be
useful to overcome this problem.
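
The charge-state problem follows from simple arithmetic: a mass spectrometer measures m/z, so a fixed mass difference shrinks by a factor of z on the m/z axis. The sketch below uses the nominal 4 Da shift quoted above and a nominal 1 Da spacing between adjacent isotope peaks; the function and constant names are illustrative only.

```python
def mz_spacing(mass_difference_da, charge):
    """Spacing on the m/z axis for a given mass difference and charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_difference_da / charge


LABEL_SHIFT_DA = 4.0  # nominal 16O/18O pair difference quoted in the text

for z in (1, 2, 3):
    pair_gap = mz_spacing(LABEL_SHIFT_DA, z)
    isotope_gap = mz_spacing(1.0, z)  # nominal spacing of adjacent isotope peaks
    print(f"z={z}: pair spacing {pair_gap:.2f} m/z, "
          f"isotope peak spacing {isotope_gap:.2f} m/z")
```

For a triply charged peptide the light and heavy forms sit only about 1.33 m/z units apart, with isotope peaks every 0.33 m/z units in between, which is why a higher-resolution scan helps.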
The SILAC method incorporates isotopic labels into pro-
teins via metabolic labeling during cell culture growth. Cell
samples to be compared are grown separately in media containing
either a heavy or a light form of an essential amino
acid (i.e. one that cannot be synthesized by the cell). The
advantages of SILAC are that it has higher fidelity than ICAT
(the label is incorporated with nearly 100% efficiency) and does not
require multiple chemical processing and purification steps,
thus ensuring that the samples to be compared have been
subjected to similar conditions throughout the experiment.
This approach, however, is not suitable for experimental
samples, such as primary human cells that cannot be grown
in vitro.
A common drawback shared by the approaches above is
that quantitation is based on parent ion intensities. In a deep
proteome profiling experiment, most of the instrument time
(>90%) is used for MS/MS to collect peptide fragment infor-
mation for protein identification. Less than 10% of the
instrument time is used to acquire parent ion data. Thus,
the instrument duty cycle is biased against ion intensity
measurements. Furthermore, the extreme sensitivity and
dynamic range of shotgun proteomics comes from the tan-
dem MS scans, in which only the intended ions are isolated
and fragmented. Very often in a deep protein profiling experi-
ment, a large fraction of the identified peptides have very low-
intensity MS signals, making the chromatogram peak picking
and integration very difficult and inaccurate, especially for
lower resolution mass spectrometers such as ion traps.
To overcome these drawbacks, a new class of isobaric
reagents, iTRAQ, was developed [14]. iTRAQ can be used
for multiplexed protein profiling of up to four different
samples. This approach labels samples with four independent
reagents of the same mass that, upon fragmentation in MS/
MS, give rise to four unique reporter ions (m/z = 114–117) that
are subsequently used to quantify the four different samples,
respectively. In the deep proteome profiling approach described
above, relative quantitation by iTRAQ clearly has advantages over
16O/18O labeling (unpublished data). The iTRAQ ratios are more
accurate, and they are not abundance dependent, which makes
quantitation of low-abundance peptides more robust.
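
Reporter-ion quantitation can be sketched as follows. This is an illustrative outline rather than the algorithm of any particular software package: it assumes an MS/MS spectrum given as a list of (m/z, intensity) pairs, uses the nominal reporter masses 114–117 from the text, and applies a hypothetical matching tolerance of 0.3 m/z. The choice of channel 114 as the reference is also an assumption.

```python
REPORTER_MZ = (114, 115, 116, 117)  # nominal reporter ion masses from the text


def reporter_intensities(peaks, tolerance=0.3):
    """Sum the intensity found within `tolerance` m/z of each reporter mass."""
    totals = {reporter: 0.0 for reporter in REPORTER_MZ}
    for mz, intensity in peaks:
        for reporter in REPORTER_MZ:
            if abs(mz - reporter) <= tolerance:
                totals[reporter] += intensity
    return totals


def reporter_ratios(peaks, reference=114, tolerance=0.3):
    """Express each reporter channel relative to the reference channel."""
    totals = reporter_intensities(peaks, tolerance)
    if totals[reference] == 0:
        raise ValueError("reference reporter ion not observed")
    return {reporter: totals[reporter] / totals[reference]
            for reporter in REPORTER_MZ}


# Hypothetical MS/MS peak list: four reporter ions plus one unrelated fragment.
spectrum = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 500.0),
            (117.12, 1000.0), (175.12, 300.0)]
print(reporter_ratios(spectrum))
```

Because all four channels are read from the same MS/MS scan, the ratio does not depend on integrating a chromatographic peak for each sample.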
Besides the relative quantitation of proteins, there is a
strong need for the determination of absolute quantities of
proteins in ultra-complex mixtures. Stable isotope dilution in
combination with shotgun proteomics is emerging as a promising
approach [15]. Similar to the isotope labeling
approaches, proteins of interest are mixed with synthetic
peptide standards of known concentration with an incorpo-
rated stable isotope (13C, 15N). These standards are identical
to the analyte peptides of interest but are distinguished by
mass difference. Stable isotope-labeled and unlabeled pep-
tides coelute during LC separation and absolute quantitation
is achieved by comparison of the peak area abundances of the
internal standard peptide with the corresponding native
counterpart by multiple reaction monitoring (MRM) via tan-
dem MS.
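
The calculation behind isotope dilution is a single proportion: the native peptide amount equals the spiked standard amount scaled by the ratio of the two peak areas. The sketch below assumes the labeled and unlabeled peptides respond identically, which is the premise stated above; the function name, the units (fmol) and the example numbers are hypothetical.

```python
def absolute_amount(native_peak_area, standard_peak_area, standard_amount_fmol):
    """Estimate the native peptide amount from a spiked isotope-labeled standard.

    Assumes equal response for the labeled and unlabeled peptide.
    """
    if standard_peak_area <= 0:
        raise ValueError("standard peak area must be positive")
    return standard_amount_fmol * native_peak_area / standard_peak_area


# Hypothetical MRM peak areas with 50 fmol of labeled standard spiked in.
amount = absolute_amount(native_peak_area=3.0e6,
                         standard_peak_area=1.5e6,
                         standard_amount_fmol=50.0)
print(f"estimated native peptide amount: {amount:.1f} fmol")
```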
Label-free quantitation
The label-free quantitation approach relies on peak intensity
measurements of peptides detected by mass spectrometry
[16] or on the number of MS/MS spectra per protein detected
[17]. An advantage of these two approaches is the reduction
in cost of experiments when compared with use of stable
isotopes. There are, however, several disadvantages of these
approaches when used in conjunction with complex peptide
mixtures. Changes in peptide chromatography conditions
and ionization efficiency may affect quantitation. This pro-
blem is compounded when using multidimensional chroma-
tography separations of complex mixtures because slight
variation in chromatography will lead to irreproducible pep-
tide separations and different ionization environments.
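
A spectral counting comparison can be sketched in a few lines. The version below is an illustration under stated assumptions, not the model of reference [17]: counts are normalized to the total number of spectra in each run, and a pseudocount of 1 is added so that proteins missing from one run still give a finite ratio. The protein names and counts are invented.

```python
import math


def spectral_count_log2_ratios(counts_a, counts_b, pseudocount=1.0):
    """Return log2(B/A) per protein from MS/MS spectral counts.

    Counts are divided by the run total to correct for differences in the
    number of spectra acquired; the pseudocount is an assumed smoothing term.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each run must contain at least one spectrum")
    ratios = {}
    for protein in sorted(set(counts_a) | set(counts_b)):
        frac_a = (counts_a.get(protein, 0) + pseudocount) / total_a
        frac_b = (counts_b.get(protein, 0) + pseudocount) / total_b
        ratios[protein] = math.log2(frac_b / frac_a)
    return ratios


# Hypothetical spectral counts for two conditions.
control = {"P1": 7, "P2": 20, "P3": 3}
treated = {"P1": 15, "P2": 10, "P3": 5}
for protein, value in spectral_count_log2_ratios(control, treated).items():
    print(f"{protein}: log2 ratio {value:+.2f}")
```

With small counts the pseudocount dominates the ratio, which reflects the limited accuracy of spectral counting for low-abundance proteins.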
A recent project sponsored by the Association of Biomo-
lecular Resource Facilities compared different relative quan-
titation methods from 52 laboratories (ABRF Proteomics
Research Group, PRG 2006: Relative Protein Quantitation
Study Results, http://www.abrf.org/prg/). The two study samples
contained the same eight proteins, present at known
ratios ranging from 4:1 to 1:76. Gel electrophoresis-based,
isotope labeling, and label-free mass spectrometry-based
quantitation methods were included in the study. Overall,
MS outperformed electrophoretic techniques, and the label-free
MS methods in particular did remarkably well for
proteins with a large ratio difference.
Phosphoproteome profiling
Protein phosphorylation is widely recognized as a major
mechanism for regulating protein function. Phosphoryla-
tion may alter protein activity or subcellular localization,
cause proteins to be degraded, or cause changes in composi-
tion of protein complexes. Alterations in the regulation of
phosphorylation pathways can play a key role in oncogen-
esis by the activation of cell proliferation signaling or by
inhibition of apoptotic pathways. Upregulation of the cell
surface receptors that are required for activation of phos-
phorylation pathways is well documented for numerous
types of cancers. Different kinases may be involved in each
of these processes for a single protein, allowing a large
degree of combinatorial regulation at the posttranslational
level. Therefore, in addition to knowing if a protein is
phosphorylated, knowing which residue is phosphorylated
during a particular response is essential in understanding
the mechanism of regulation. Traditionally, identification
of phosphoproteins was performed one-by-one, character-
izing a single protein during a single response. With the
advances of modern mass spectrometry, high-throughput
analysis of protein phosphorylation in whole cells or tissues
has become possible.
Phosphopeptides have intrinsic properties that act as
obstacles to their detection and identification. First, the
fragmentation patterns of phosphopeptides by collision-
induced dissociation (CID) are often dominated by a neutral
loss of phosphoric acid (H3PO4, 98 Da) from the phospho-
peptide, producing uninformative MS/MS spectra. Searching
such data against protein databases severely reduces the
possibility of unambiguously identifying the phosphopeptide
and pinpointing the phosphorylation sites. Data-dependent
MS3 scans triggered by the neutral loss of phosphoric acid have been
shown to be effective for improving the fragmentation of
phosphopeptides by CID [18].
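
Recognizing a neutral-loss spectrum is a matter of checking where the dominant fragment falls. Since the loss is neutral, the fragment keeps the precursor charge z and appears 98/z below the precursor m/z. The sketch below uses the nominal 98 Da value from the text; the tolerance and the rule that the base peak must be the neutral-loss ion are hypothetical choices and do not describe any instrument's actual trigger logic.

```python
NEUTRAL_LOSS_DA = 98.0  # nominal mass of H3PO4 as quoted in the text


def neutral_loss_mz(precursor_mz, charge):
    """Expected m/z of the precursor after losing neutral H3PO4."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - NEUTRAL_LOSS_DA / charge


def is_neutral_loss_dominated(peaks, precursor_mz, charge, tolerance=0.5):
    """True if the most intense fragment matches the expected neutral-loss m/z."""
    if not peaks:
        return False
    base_mz, _ = max(peaks, key=lambda peak: peak[1])
    return abs(base_mz - neutral_loss_mz(precursor_mz, charge)) <= tolerance


# Hypothetical MS/MS spectrum of a doubly charged precursor at m/z 700.0.
spectrum = [(651.0, 9000.0), (400.2, 800.0), (512.3, 600.0)]
print(neutral_loss_mz(700.0, 2))
print(is_neutral_loss_dominated(spectrum, precursor_mz=700.0, charge=2))
```

A spectrum flagged this way is a candidate for a further fragmentation step on the neutral-loss ion, as in the MS3 strategy described above.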
Second, the phosphorylated form of a protein frequently
has low stoichiometry relative to its unphosphorylated coun-
terpart. The data-dependent MS/MS scan method is typically
set up to fragment peptide ions according to their intensities.
Considering the already low expression levels of most pro-
teins regulated by phosphorylation, it is obvious that analyz-
ing phosphopeptides without any enrichment can easily miss
these rare peptides. Affinity-based methods (antibody or
metal ion chromatography) are the most common ways to
enrich phosphopeptides.
Antibodies are routinely used to immunoprecipitate spe-
cific proteins. There are several commercially available anti-
bodies that bind to phosphorylated Tyr residues in a generic
fashion. These antibodies can be used to immunoprecipitate,
and therefore to enrich, tyrosine phosphorylated peptides
from complex mixtures of proteins such as cell lysates. Tyr
phosphorylation is generally thought to represent less than
1% of all cellular phosphorylation, which makes phosphotyrosine
peptides even harder to find. By using a phosphotyrosine-specific
antibody to immunoprecipitate phosphotyrosine peptides directly
from digested cellular protein extracts, in combination with
standard shotgun proteomics methods, 688 nonredundant
phosphotyrosine-containing peptides and 628 phosphotyrosine sites
have been identified from three distinct cell types [19].
The most common method to enrich phosphopeptides is
immobilized metal affinity chromatography (IMAC). Under
acidic binding conditions, the strong positive charge of the
immobilized metal ion, usually Fe3+ or Ga3+, binds the negatively
charged phosphate groups of phosphopeptides in the mixture [20].
A potential limitation of IMAC is that it may also bind peptides
containing acidic residues. Methylation of acidic residues was found
to improve the specific binding of phosphopeptides in a
study that identified 216 peptide sequences defining 383 sites of
phosphorylation from yeast [21]. TiO2 (titanium dioxide) was recently
found to be an effective alternative to IMAC for phosphopep-
tide enrichment [22].
Chromatographic separation can also be effective to enrich
phosphopeptides. The use of strong cation exchange (SCX)
columns under very specific pH conditions greatly enriches
phosphopeptides. This approach resulted in the identification
of 2002 phosphorylation sites from 967 phosphoproteins
using 8 mg of HeLa cell
nuclear protein [18]. The combination of IMAC and SCX
fractionation identified phosphorylation sites from more
than 500 yeast proteins [23].
Conclusion
Shotgun proteomics has proven to be the method of choice
for large-scale proteomics. For complex biological samples,
multidimensional LC separation is necessary to separate the
peptide mixtures to be detected by the mass spectrometer. Both
online and offline LC separations have been successfully
applied to various systems. The online approach provides
the benefits of less human error, higher sample recovery, and
better reproducibility (Table 1).

Table 1. Comparison summary table

| | Technology 1 | Technology 2 | Technology 3 | Technology 4 | Technology 5 |
|---|---|---|---|---|---|
| Type of technology | Protein profiling | Protein profiling | Quantitation | Quantitation | Quantitation |
| Specific technology | Online LC separation | Offline LC separation | Intensity-based isotope labeling: ICAT(b), SILAC(c), 16O/18O | iTRAQ(a) | Spectra count |
| Pros | Fully automated; minimum sample loss; better reproducibility | More flexible than online; can use more separation modes and solvents; easy to set up in most laboratories | Duplex; reproducible chromatogram separation | Multiplex (4); same chromatogram separation for all four samples; single parent ion and MS/MS scan for all four samples | No labeling required; minimum sample loss; sensitivity not compromised; easy; cheap |
| Cons | Limited separation modes; solvent has to be compatible with MS | Potential sample loss; poorer reproducibility | Less accurate for low-abundance proteins; more expensive than label-free methods | More expensive than label-free methods | Semi-quantitative; less accurate for low-abundance proteins; no multiplex |
| References | [4,6] | [5] | [11–13] | [14] | [17] |

a Isobaric tags for relative and absolute quantitation.
b Isotope-coded affinity tags.
c Stable isotope labeling by amino acids in cell culture.

Currently, two main approaches are used for relative quantitation
in shotgun proteomics: stable isotope labeling methods
and label-free methods (Table 1). In the label-free
approach, direct measurements of the peak intensity or the
number of MS/MS spectra are used for relative quantitation. It
is a simple but effective way to compare different samples.
The disadvantage is that it only works well for relatively
abundant proteins, and typically many replicate analyses
are required to obtain reliable quantitation. Isotope labeling
approaches have the advantages of multiplexing and minimum
run-to-run variation. Similar to the label-free methods,
the MS intensity-based methods, including ICAT, SILAC and
16O/18O labeling, do not work well for low-abundance proteins.
iTRAQ technology overcomes this problem by measuring
the intensities of the reporter ions in the MS/MS scan for
relative quantitation. It also has the advantage of running
four samples in a single LC–MS/MS run.
Global analysis of posttranslational modification, especially
phosphorylation, is another big challenge in the proteomics
field. Both antibody and metal affinity-based enrichment
methods have been developed to help detect phosphopeptides.
Chromatographic enrichment was also found to be
useful. Owing to the nature of phosphopeptides, it is
unlikely that any of the above methods alone can solve the
problem. A combination of affinity and chromatographic
enrichments, as well as new advances in mass spectrometry,
is needed to enable researchers to study the full
phosphoproteome.
References
1 Aebersold, R. and Goodlett, D.R. (2001) Mass spectrometry in proteomics.
Chem. Rev. 101, 269–295
2 Yates, J.R., 3rd (2004) Mass spectral analysis in proteomics. Annu. Rev.
Biophys. Biomol. Struct. 33, 297–316
3 Anderson, N.L. and Anderson, N.G. (2002) The human plasma proteome:
history, character, and diagnostic prospects. Mol. Cell Proteomics 1, 845–867
4 Washburn, M.P. et al. (2001) Large-scale analysis of the yeast proteome by
multidimensional protein identification technology. Nat. Biotechnol. 19,
242–247
5 Peng, J. et al. (2003) Evaluation of multidimensional chromatography
coupled with tandem mass spectrometry (LC/LC–MS/MS) for large-scale
protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50
6 Wei, J. et al. (2005) Global proteome discovery using an online three-
dimensional LC–MS/MS. J. Proteome Res. 4, 801–808
7 Adkins, J.N. et al. (2002) Toward a human blood serum proteome: analysis
by multidimensional separation coupled with mass spectrometry. Mol.
Cell Proteomics 1, 947–955
8 Kislinger, T. et al. (2006) Global survey of organ and organelle protein
expression in mouse: combined proteomic and transcriptomic profiling.
Cell 125, 173–186
9 Cha, B. et al. (2000) An interface with a linear quadrupole ion guide
for an electrospray-ion trap mass spectrometer system. Anal. Chem. 72,
5647–5654
10 Wang, H. et al. (2006) Characterization of the mouse brain proteome using
global proteomic analysis complemented with cysteinyl-peptide
enrichment. J. Proteome Res. 5, 361–369
11 Gygi, S.P. et al. (1999) Quantitative analysis of complex protein mixtures
using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999
12 Yao, X. et al. (2001) Proteolytic 18O labeling for comparative proteomics:
model studies with two serotypes of adenovirus. Anal. Chem. 73,
2836–2842
13 Ong, S.E. et al. (2002) Stable isotope labeling by amino acids in cell culture,
SILAC, as a simple and accurate approach to expression proteomics. Mol.
Cell Proteomics 1, 376–386
14 Ross, P.L. et al. (2004) Multiplexed protein quantitation in Saccharomyces
cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell
Proteomics 3, 1154–1169
15 Gerber, S.A. et al. (2003) Absolute quantification of proteins and
phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U S A
100, 6940–6945
16 Andersen, J.S. et al. (2003) Proteomic characterization of the human
centrosome by protein correlation profiling. Nature 426, 570–574
17 Liu, H. et al. (2004) A model for random sampling and estimation of relative
protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201
18 Beausoleil, S.A. et al. (2004) Large-scale characterization of HeLa cell
nuclear phosphoproteins. Proc. Natl. Acad. Sci. U S A 101, 12130–12135
19 Rush, J. et al. (2005) Immunoaffinity profiling of tyrosine phosphorylation
in cancer cells. Nat. Biotechnol. 23, 94–101
20 Posewitz, M.C. and Tempst, P. (1999) Immobilized gallium(III) affinity
chromatography of phosphopeptides. Anal. Chem. 71, 2883–2892
21 Ficarro, S.B. et al. (2002) Phosphoproteome analysis by mass spectrometry
and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20,
301–305
22 Kuroda, I. et al. (2004) Phosphopeptide-selective column-switching
RP-HPLC with a titania precolumn. Anal. Sci. 20, 1313–1319
23 Gruhler, A. et al. (2005) Quantitative phosphoproteomics applied
to the yeast pheromone signaling pathway. Mol. Cell Proteomics 4,
310–327