RAPID COMMUNICATIONS IN MASS SPECTROMETRY
Rapid Commun. Mass Spectrom. 2008; 22: 3823–3834
Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/rcm.3781
ProteinQuant Suite: a bundle of automated software tools for label-free quantitative proteomics
Benjamin Mann1, Milan Madera1,2, Quanhu Sheng2,3, Haixu Tang2,3, Yehia Mechref1,2* and Milos V. Novotny1,2*
1Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
2National Center for Glycomics and Glycoproteomics, Department of Chemistry, Indiana University, Bloomington, IN 47405, USA
3School of Informatics, Indiana University, Bloomington, IN 47405, USA
Received 24 April 2008; Revised 14 August 2008; Accepted 17 September 2008
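*Correspondence to: Y. Mechref or M. V. Novotny, Department of Chemistry, Indiana University, 800 E Kirkwood Ave, Bloomington, IN 47405, USA.
E-mails: [email protected]; [email protected]
Contract/grant sponsor: National Institute of General Medical Sciences, US Department of Health and Human Services; contract/grant number: GM24349.
Contract/grant sponsor: NIH/NCRR – National Center for Glycomics and Glycoproteomics (NCGG); contract/grant number: RR018942.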
In simplifying the evaluation and quantification of high-throughput label-free quantitative proteo-
mic data, we introduce ProteinQuant Suite. It comprises three standalone complementary computer
utilities, namely ProtParser, ProteinQuant, and Turbo RAW2MGF. ProtParser is a filtering utility
designed to evaluate database search results. Filtering is performed according to different criteria that
are defined by the end-user. ProteinQuant then utilizes this parsed list of peptides and proteins in
conjunction with mzXML or mzData files generated from the raw files for quantification. This
quantification is based on the automatic detection and integration of chromatographic peaks
representative of the liquid chromatography/mass spectrometry (LC/MS) elution profiles of identified
peptides. Turbo RAW2MGF was developed to extend the applicability of ProteinQuant Suite to
data collected from different types of mass spectrometers. It directly processes raw data files
generated by Xcalibur, a ThermoElectron data acquisition software, and generates a MASCOT
generic file (MGF). This file format is needed since the protein identification results generated by
the database search employing this file format include information required for the precise identi-
fication and quantification of chromatographic peaks. The performance of ProteinQuant Suite was
initially validated using LC/MS/MS data generated for a mixture of standard proteins as well as standard
proteins spiked in a complex biological matrix such as blood serum. Automated quantification of the
collected data resulted in calibration curves with R² values higher than 0.95 with linearity spanning
over more than 2 orders of magnitude with peak quantification reproducibility better than 15%
(RSD). ProteinQuant Suite was also applied to confirm the binding preference of standard glyco-
proteins to Con A lectin using a sample consisting of both standard glycoproteins and proteins.
Copyright © 2008 John Wiley & Sons, Ltd.
The continuous development of qualitative and quantitative
proteomic approaches has been made possible through
the technological advancement in the areas of separation
science and mass spectrometry (MS). Such advances
stimulated the life science activities aiming at biomarker
discovery as well as at a better understanding of biological
systems. Quantitative proteomics can be successfully used in
characterizing alterations in protein abundance as a con-
sequence of disease state or treatment of a disease. This is
based on the assumption that such differences represent
differential protein expression originating from a pertur-
bation of the biological system as a consequence of such
conditions. Different MS-based approaches for performing
quantitative proteomics which offer distinct advantages and
disadvantages have been developed.1–4
The available methods can be classified into those based on
electrophoretic separation techniques such as one- and two-
dimensional polyacrylamide gel electrophoresis (1-DE or 2-
DE, respectively) and those based on chromatographic
separations.2 Due to the high complexity of most proteomes,
2-DE is very popular in comparative quantitative proteo-
mics.5–7 It is able to resolve thousands of proteins, allowing
visualization of changes between complex proteome
samples. In addition, only those spots that appear differen-
tially abundant need to be analyzed by MS, thus substantially
reducing the overall task. Nevertheless, 2-DE still suffers
from its limited sensitivity and dynamic range as well as its
long and tedious procedure.8,9
Quantitative proteomics has recently capitalized on major
advances in chromatographic media, columns and instru-
mentation. Chromatographically based quantitative proteo-
mic approaches largely depend on the liquid chromatog-
raphy/tandem mass spectrometry (LC/MS/MS) analyses of
proteome samples that have been subjected to proteolytic
digestion. Such analyses are achieved through comparing the
differences between LC/MS/MS runs of the proteolytic
digests of both control and experimentally perturbed
systems. Generally, quantitative proteomics involves the
analysis of samples that are either subjected to stable-isotope
labeling, such as isotope-coded affinity tag (ICAT), global
internal standard strategy (GIST) and isobaric tag for relative
and absolute quantification (iTRAQ),10–12 or analyzed with-
out any labeling step, an approach commonly referred to as
label-free quantitative proteomics.8,13,14
Regardless of the general approach used, quantitative
proteomics often deals with the analysis of large sets of
samples, generating a considerable number of data files.
Therefore, data evaluation required for either absolute or
relative quantification of all or a limited number of
components is extremely challenging and impractical to
perform manually. Due to the lack of comprehensive quantification
packages currently supplied by the mass spectrometry
vendors as part of their data acquisition software, a variety
of open source tools15 for both stable-isotope labeling16–19
and label-free19–24 experiments have been developed by
different groups. These programs are frequently developed
as cross-platform applications and are commonly distributed
as source code, which requires compilation prior
to use. Enabling compatibility with multiple operating
systems is undoubtedly useful but the successful deploy-
ment of such utilities usually requires broad computer
knowledge and extensive configuration. Although the
majority of currently available software uses very sophisti-
cated algorithms to resolve chromatographic peaks and can
handle raw data in a universal format, some still have a
limited capability of automatically processing multiple data
files,19 while others lack the association between the resolved
and integrated peaks and the list of identified peptides and
proteins.24 For example, mapQuant, a software tool capable of
large-scale protein quantification developed by Church and
coworkers,23 resolves chromatographic peaks through the
combination of 2D imaging, watershed segmentation and
isotopic deconvolution, but it still lacks the support for
unified mzXML or mzData file formats. Another tool,
msInspect, released by McIntosh and colleagues,19 utilizes
similar 2D imaging of LC/MS/MS runs in mzXML format
for the determination of eluting components, yet it does not
officially support label-free quantification approaches and
requires additional script writing to allow for automated
processing of multiple data files. Notable features of
mzMine, a tool facilitating label-free quantitative proteomics,
include its ability to run in batch mode and implement de-
noising, background subtraction and isotopic deconvolu-
tion.24 However, mzMine does not currently associate the
areas of evaluated peaks with identified peptides or proteins,
thus making it more suitable for high-throughput quanti-
tative profiling, where the identification of resolved features
is not necessary.
Recently, the aforementioned limitations associated with
quantitative proteomics appear to have been overcome via
trans-proteomic pipelines (TPPs) with standardized inputs and
outputs.25 These robust solutions, led by CPAS (Computational
Proteomics Analysis System), are not, however,
designed as standalone utilities for proteomic quantification;
they rather provide complete and unified data processing
with the capability of sharing the results among different
institutions. TPPs are highly configurable and work with
various quantification utilities, such as Xpress or ASAPRa-
tio;18 however, these software plug-ins thus far only support
approaches based on isotopic labeling. Because of their
robustness and universal architecture, TPPs require dedi-
cated servers or even computer clusters to handle systematic
centralized data processing; therefore, they may not be
deemed convenient or necessary for some laboratories.
In responding to some general needs of quantitative
proteomics with an emphasis on providing a utility that
would be very easy to use without a need for extensive
computer knowledge or configuration, we have developed
the ProteinQuant Suite software package. It facilitates the
evaluation of multiple data files generated from label-free
proteomic experiments and offers a simple, user-friendly
and standalone alternative to other currently available high-
throughput quantification tools. The utility of ProteinQuant
Suite is demonstrated for high-throughput comparative
quantification of proteolytic digests of standard proteins
analyzed separately or spiked in depleted human blood
serum as an example of a complex biological mixture.
Different features associated with the utility of ProteinQuant
Suite in quantitative proteomics are addressed, including
run-to-run reproducibility, the linearity of dynamic range,
and different normalization methods. The use of ProteinQuant
Suite was also demonstrated in studying glycoprotein
binding by lectin affinity chromatography.
EXPERIMENTAL
Reagents and standards
Lysozyme, bovine serum albumin (BSA), cytochrome C,
ovalbumin, alpha-lactalbumin, lactoglobulin B, histone,
alpha-casein, myoglobin, lactoferrin, immunoglobulin G,
glutathione S-transferase, hemoglobin, ribonuclease B,
fetuin, thyroglobulin, carbonic anhydrase II, and trypsin
(proteomics, sequencing grade) were purchased from Sigma-
Aldrich (St. Louis, MO, USA). Dithiothreitol (DTT) and
iodoacetamide (IAA) were acquired from Bio-Rad (Hercules,
CA, USA). HPLC grade reagents were purchased from EMD
(Darmstadt, Germany). The different buffers used here were
prepared in Millipore deionized water (Millipore, Billerica,
MA, USA). Standard protein mixtures were prepared from
stock solutions suspended in 50 mM ammonium bicarbon-
ate. Female human blood serum was acquired from
Innovative Research, Inc. (Southfield, MI, USA). Serum
was divided into 1-mL aliquots and frozen at −20°C in less
than 1 h upon its receipt from the vendor. This step was
necessary to avoid repeated freeze/thaw cycles.
Software development
All applications included in ProteinQuant Suite were written
in the C# programming language and compiled in Microsoft
Visual Studio 2005 Professional Edition. The developed
software is fully compatible with Windows-based operating
systems with ‘.NET’ framework v2.0. It features an easy
installation procedure and provides a graphic, user-friendly
interface. This is true for all utilities except for Turbo
RAW2MGF converter which requires a complete installation
of XCalibur mass spectrometer controlling software (Ther-
moElectron, San Jose, CA, USA) including the XDK
development kit.
Depletion of human blood serum with the multiple affinity removal system (MARS)
Blood serum was depleted on an Akta purifier (Amersham
Biosciences, NJ, USA) using the multiple affinity removal
system (MARS), a 4.6 × 100 mm affinity column (Agilent
Technologies, Santa Clara, CA, USA). Depletion was
performed as suggested by the manufacturer’s LC protocol.
A 30-mL aliquot from each depletion process was collected
and subsequently pooled with other depleted fractions and
reconcentrated to a total volume of 500 µL. Next, buffer
exchange was performed, replacing the depletion buffers
with 50 mM ammonium bicarbonate and preconcentrating
the mixture to ca. 0.5 mg/mL concentration. The total protein
concentration of the final mixture was determined by
Bradford protein assay (BioRad, Hercules, CA, USA).
Trypsin digestion
Protein samples were subjected to tryptic digestion according
to the following procedure. After thermal denaturation at
95°C for 10 min, samples were reduced through the
addition of DTT to a final concentration of 5 mM and
incubated at 60°C for 45 min. Alkylation was achieved by
adding IAA to a final concentration of 20 mM prior to
incubation at room temperature for 45 min in the dark. A
second aliquot of DTT was then added, increasing the final
concentration of DTT to ca. 10 mM. Samples were then
incubated at room temperature for 30 min to quench the
alkylation reaction. Next, trypsin was added (1:30 w/w) and
the solutions were incubated at 37°C for 18 h.26 The
enzymatic digestions were then quenched through the
addition of neat formic acid. The sample containing
17 standard proteins at equimolar concentrations was
prepared prior to enzymatic digestion, as were the depleted
human blood serum samples spiked with different ovalbumin
concentrations.
Lectin affinity chromatography
A mixture of standard proteins (BSA, cytochrome C,
myoglobin) and glycoproteins (fetuin, ovalbumin, ribonu-
clease B) was prepared in Con A binding buffer (10 mM
TRIS.HCl, pH 7.5, 500 mM NaCl, 1 mM MnCl2, 1 mM CaCl2,
0.08% NaN3) to a final concentration of 1mg/mL of each
protein. A 50-µL aliquot of Con A Sepharose was thoroughly
washed with 1 mL Con A binding buffer, mixed with 100 µL
of the protein mixture sample followed by the addition of
100 µL lectin binding buffer. After overnight incubation at
4°C, unbound proteins were washed from the media with
2 × 100 µL binding buffer and combined in an Eppendorf
tube. Bound glycoproteins were then displaced from the
lectin through three sequential washes each with a 150-µL
aliquot of the elution buffer (0.2 M α-D-methylmannoside,
0.2 M α-D-methylglucoside in the binding buffer) and
combined in a separate vial. Both bound and unbound
fractions were desalted using Microcon 10 kDa spin mem-
brane filters and dried.
The dried proteins were then denatured with 20 µL of 6 M
guanidine hydrochloride and incubated at room temperature
for 30 min. After the addition of 180 µL of 50 mM
ammonium bicarbonate buffer, the sample was reduced with
5 µL of 200 mM DTT for 30 min at 60°C, and alkylated with
20 µL of 200 mM IAA for 30 min at room temperature.
Finally, the mixture was digested with trypsin (2% w/w) for
18 h at 37°C and a 1-µL aliquot of the generated peptides was
subjected to LC/MS/MS analysis.
Nano-LC/MS/MS
A nano-LC/MS/MS system comprising an 1100 nano-LC
system (Agilent Technologies, Santa Clara, CA, USA)
interfaced to an XCT Ultra LC/MSD ion-trap mass spectrometer
(Bruker Daltonics, Billerica, MA, USA) and equipped with
a nano-electrospray ionization (ESI) source was used here.
Samples were desalted through on-line trapping using a
PepMap300 C18 cartridge (5 µm, 300 Å; Dionex, Sunnyvale,
CA, USA) prior to separating the peptides on a Zorbax 300SB
C18 nanocolumn (3.5 µm particles, 75 µm × 150 mm; Agilent
Technologies, Santa Clara, CA, USA). The separation was
performed at a flow rate of 250 nL/min using a linear
gradient from 3% to 55% acetonitrile containing 0.1% formic
acid over 45 min. The LC system was controlled by ChemStation
(Agilent Technologies, Santa Clara, CA, USA), while MS data
acquisition was performed using Esquire Control software
(Bruker Daltonics, Billerica, MA, USA). Capillary voltage
was kept at 1700 V, while the desolvation temperature was
maintained at 300°C. The ion charge control value (ICC) was
set to 200 000 with a maximum accumulation time of 200 ms.
MS/MS fragmentation of the five most intense precursor
ions in the spectra was performed automatically with an
exclusion window of 0.5 min.
Data processing and quantification using ProteinQuant Suite
The data acquired by the XCT Ultra mass spectrometer were
processed with Data Analysis software (Bruker Daltonics,
Billerica, MA, USA) and the generated peak lists were saved
as a MASCOT generic file (MGF). MGF files were then
submitted to MASCOT database searching and results were
parsed with ProtParser set to specific parsing criteria which
are defined by the end-users. In this study, only +2 and +3
charged peptides were subjected to MS/MS experiments,
since every tryptic peptide should carry at least two charges,
one at its N-terminus and the other at the lysine or arginine
residue at its C-terminus.27 Minimum MOWSE ion score
threshold was set to 30. Also, peptide mass threshold was set
to 600 Da to exclude possible low molecular weight
fragments or other possible non-peptide interferences.
Additionally, tryptic peptides with KK, RR, RK or KR motifs
were also not considered valid, as trypsin would most likely
cleave at least one of the bonds.28 The use of the so-called
‘decoy database’ has recently been shown to be a valuable
tool for evaluating the rate of false positive identifications of
peptides through database searching.29 Using this approach
in conjunction with the abovementioned filtering criteria, the
false positive identification rate for our different experiments
was estimated to be 4.9%.
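
The four filtering criteria described above can be sketched in Python as follows. This is an illustration only and not the ProtParser implementation (which was written in C#); the `PeptideHit` record, the function name, the treatment of the score threshold as inclusive, and the example masses and scores are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    sequence: str      # peptide sequence reported by the search engine
    charge: int        # precursor charge state
    mass: float        # peptide mass in Da
    ion_score: float   # MASCOT ion score


# Dibasic motifs at which trypsin would most likely cleave once more
DIBASIC_MOTIFS = ("KK", "RR", "RK", "KR")


def passes_filters(hit, min_score=30.0, min_mass=600.0, charges=(2, 3)):
    """Return True if a hit satisfies all four criteria described in the text."""
    if hit.charge not in charges:
        return False
    if hit.ion_score < min_score:      # assumption: threshold is inclusive
        return False
    if hit.mass < min_mass:
        return False
    return not any(motif in hit.sequence for motif in DIBASIC_MOTIFS)


# Hypothetical hits (masses and scores are illustrative, not measured values)
hits = [
    PeptideHit("LGEYGFQNALIVR", 2, 1478.8, 62.0),  # kept
    PeptideHit("DAFLGSFLYEYSR", 2, 1566.7, 21.0),  # ion score too low
    PeptideHit("ALKKGTR", 2, 758.5, 45.0),         # contains the KK motif
    PeptideHit("GPSVK", 1, 486.3, 35.0),           # charge and mass both fail
]
kept = [h.sequence for h in hits if passes_filters(h)]
```

Keeping each criterion as a separate early return makes it straightforward to expose the thresholds as end-user settings, as the text describes.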
All parsed files were finally combined into a master file,
which contained the list of all proteins and peptides
identified across all the processed LC/MS/MS analyses.
The generated master files, in conjunction with their
corresponding mzXML files created from raw data files by
Bruker’s CompassXport conversion tool, were finally sub-
mitted to ProteinQuant, which quantifies identified features
through several steps described below.
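
A minimal sketch of how parsed result lists might be merged into a master list is given below. It assumes each parsed run is a list of dictionaries keyed by peptide sequence and that the first occurrence of each sequence is retained; the field names and retention times are hypothetical and do not reflect the actual ProtParser file layout.

```python
def combine_parsed_files(parsed_runs):
    """Merge per-run peptide lists into one master list.

    parsed_runs: list of runs, each a list of dicts with a 'sequence' key.
    The entry from the first run that reports a sequence is kept.
    """
    master = {}
    for run in parsed_runs:
        for hit in run:
            # setdefault leaves an existing entry untouched
            master.setdefault(hit["sequence"], hit)
    return list(master.values())


# Hypothetical parsed results from two LC/MS/MS runs
run_a = [{"sequence": "LGEYGFQNALIVR", "mz": 740.71, "rt_min": 31.2}]
run_b = [
    {"sequence": "LGEYGFQNALIVR", "mz": 740.71, "rt_min": 31.4},
    {"sequence": "TVMENFVAFVDK", "mz": 700.34, "rt_min": 28.9},
]
master = combine_parsed_files([run_a, run_b])
```

The master list then contains every peptide identified in at least one run, so a precursor that was not selected for MS/MS in a particular run can still be looked up and integrated there.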
Construction of extracted ion chromatograms
First, ProteinQuant reads the m/z values of the identified
peptides saved in the ProtParser text file and subsequently
creates extracted ion chromatograms for each m/z value from
the mzXML or mzData MS data file. Since the data generated
by a mass spectrometer may sometimes be noisy, Protein-
Quant uses a Savitzky-Golay30 smoothing algorithm, thereby
facilitating peak integration. ProteinQuant also has a variable
m/z tolerance window that can be adjusted to reconstruct
chromatograms. This feature allows the use of this software
to integrate data generated by both high and low mass
resolution and accuracy mass spectrometers.
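
The construction of an extracted ion chromatogram with an adjustable m/z tolerance, followed by Savitzky-Golay smoothing, can be sketched as follows. The scan representation, the summing of intensities inside the window, and the choice of a 5-point quadratic filter are assumptions made for illustration; the window length and polynomial order used by ProteinQuant are not stated in the text.

```python
def extract_ion_chromatogram(scans, target_mz, tol=0.5):
    """Sum intensities within target_mz +/- tol for every MS scan.

    scans: list of (retention_time_min, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time_min, summed_intensity) points.
    """
    xic = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)
        xic.append((rt, total))
    return xic


# Classic 5-point quadratic Savitzky-Golay coefficients (normalised by 35)
SG5 = (-3, 12, 17, 12, -3)


def savitzky_golay_5(values):
    """Smooth a sequence; the two points at each end are left unchanged."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5, window)) / 35.0
    return smoothed
```

A wide tolerance (for example 0.5) suits ion-trap data, whereas a much narrower value would be chosen for high-resolution instruments; this mirrors the adjustable window described above.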
Baseline calculation
ProteinQuant offers several options for baseline evaluation,
which can be modified to accommodate the end-user
requirements. By default, baseline values are calculated as
an average intensity of the data points in the extracted ion
chromatogram through the first minute of data acquisition.
Alternatively, the user can define this time segment or assign
a fixed baseline value. These values are entered in the
configuration page of ProteinQuant.
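
The three baseline options described above (first minute by default, a user-defined time segment, or a fixed value) can be sketched as a single function. The function signature is hypothetical and only illustrates the logic.

```python
def baseline(xic, start_min=0.0, end_min=1.0, fixed_value=None):
    """Estimate the baseline of an extracted ion chromatogram.

    xic: list of (retention_time_min, intensity) points.
    A fixed value, if given, overrides the averaging; otherwise the mean
    intensity between start_min and end_min (inclusive) is returned.
    """
    if fixed_value is not None:
        return fixed_value
    segment = [inten for rt, inten in xic if start_min <= rt <= end_min]
    if not segment:
        raise ValueError("no data points in the baseline time segment")
    return sum(segment) / len(segment)
```

For example, with points at 0.0, 0.5 and 1.0 min of intensity 100, 120 and 110, the default baseline is their mean, 110.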
Peak apex and edge assignment
Every peptide hit in the ProtParser files is associated with a
retention time and intensity value. In addition, these files
contain only the top-intensity queries of multiple hits of
matching peptides with the same sequences. Therefore,
ProteinQuant initially utilizes the retention time of a peptide
listed in the parsed file as the apex of chromatographic peak
for that peptide. However, this retention time only reflects
the time of MS/MS acquisition of a particular precursor ion,
which does not often correspond to the real apex of the
chromatographic peak of an identified peptide. Due to mass
spectrometer duty cycle and depending on MS method
settings, the precursor ion may be selected for MS/MS before
or after the maximum of its eluting peak. Therefore,
ProteinQuant checks the intensity of the precursor ion in
the interval given by the peak width entry, which has a
default setting of 1 min, and, subsequently, within this peak
width it assigns the apex of the chromatographic peak
corresponding to the m/z value of the identified peptide.
Next, ProteinQuant allows the end-user to define the method
to be used for assigning the edges of the chromatographic
peak. Peaks can be defined by the edges calculated using full-
width at half maximum criteria,31 an arbitrary time window,
intensity threshold, or, by default, from a combination of an
intensity threshold that is constrained by a maximum time
window. After the definition of the apex and both edges, the
elution profiles of the identified peptides are then integrated
based on rectangular approximation. Peptide and protein
quantification results and the information of the identified
peptides are finally reported and saved in comma delimited
(CSV) file format.
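
The apex search, the default edge assignment (an intensity threshold constrained by a maximum time window), and the rectangular integration can be sketched as follows. Centring the search interval on the MS/MS retention time, splitting the maximum window evenly around the apex, and using left-hand rectangles are assumptions of this sketch; the text does not specify these details.

```python
def find_apex(xic, msms_rt, peak_width=1.0):
    """Index of the most intense point within msms_rt +/- peak_width / 2."""
    half = peak_width / 2.0
    candidates = [i for i, (rt, _) in enumerate(xic) if abs(rt - msms_rt) <= half]
    if not candidates:
        raise ValueError("no data points inside the apex search interval")
    return max(candidates, key=lambda i: xic[i][1])


def find_edges(xic, apex, threshold, max_window=2.0):
    """Walk outward from the apex while the intensity stays at or above the
    threshold and the point lies within max_window / 2 of the apex."""
    apex_rt = xic[apex][0]
    half = max_window / 2.0
    left = apex
    while (left > 0 and xic[left - 1][1] >= threshold
           and apex_rt - xic[left - 1][0] <= half):
        left -= 1
    right = apex
    while (right < len(xic) - 1 and xic[right + 1][1] >= threshold
           and xic[right + 1][0] - apex_rt <= half):
        right += 1
    return left, right


def integrate_rectangles(xic, left, right, baseline=0.0):
    """Rectangular approximation: height above baseline times the time step
    to the next point, summed from the left edge up to the right edge."""
    area = 0.0
    for i in range(left, right):
        dt = xic[i + 1][0] - xic[i][0]
        area += max(xic[i][1] - baseline, 0.0) * dt
    return area
```

Because the MS/MS event may fall before or after the true maximum, the apex is searched on both sides of the reported retention time before the edges are located.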
RESULTS AND DISCUSSION
ProteinQuant Suite figures of merit
Due to the increasing interest in the quantitative aspects of
proteomics studies accompanied by a frequent need to
analyze very complex protein or peptide mixtures, high-
throughput quantification usually requires software-assisted
data evaluation methodologies. Therefore, we have devel-
oped the ProteinQuant Suite software bundle, consisting of
three stand-alone complementary utilities, namely Turbo
RAW2MGF, ProtParser and ProteinQuant. The workflow of
ProteinQuant Suite is summarized in Fig. 1. The first step in
the workflow involves the processing of LC/MS/MS raw
data to generate a peak list which consists of precursor ion
m/z values and MS/MS fragments and their intensities. This
peak list is saved in MASCOT generic file (MGF) format,
which is necessary for quantification with ProteinQuant,
since MGF files contain retention times and intensities of
precursor ions required for quantification. MGF files are then
submitted to the database searching engine, MASCOT,
which outputs the result as a list containing identified
peptides and their corresponding proteins. The generated
results are saved as HTML files, which are easily processed
and filtered using parsing software such as ProtParser
described here. Although ProteinQuant Suite is only capable
of processing data searched using MASCOT, the potential of
using ProteinQuant with other search engines is currently
being investigated.
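
To illustrate why the MGF peak list carries the information needed for quantification, the sketch below reads the precursor m/z, precursor intensity, and retention time from MGF text. It relies on the standard `BEGIN IONS`/`END IONS` blocks and the `PEPMASS` and `RTINSECONDS` keys; the function name and the example spectrum are hypothetical, and the sketch is not part of ProteinQuant Suite.

```python
def parse_mgf_precursors(text):
    """Return one dict per spectrum with precursor m/z, intensity and RT."""
    entries, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN IONS":
            current = {}
        elif line == "END IONS":
            if current is not None:
                entries.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            if key == "PEPMASS":
                parts = value.split()
                current["mz"] = float(parts[0])
                # The intensity after the m/z value is optional in MGF
                current["intensity"] = float(parts[1]) if len(parts) > 1 else None
            elif key == "RTINSECONDS":
                current["rt_min"] = float(value) / 60.0
    return entries


# Hypothetical single-spectrum MGF content
example = """BEGIN IONS
TITLE=scan 1523
PEPMASS=740.71 1200000
CHARGE=2+
RTINSECONDS=1860
175.12 3400
288.20 5100
END IONS
"""
precursors = parse_mgf_precursors(example)
```

Fragment-ion lines contain no `=` sign and are skipped here, since only the precursor information is needed to seed the chromatographic peak search.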
In our ProteinQuant Suite, the same MS raw data files,
which were used to generate the list of proteins, will need to
be converted to mzXML32 or HUPO’s mzData33 file format
prior to quantification. Translating raw data files to these
universal file formats is usually performed separately using a
variety of open source utilities that are either part of data
acquisition software or can be downloaded from the
internet.34 Therefore, we opted not to include any third-
party converters into the Suite.
The performance of ProteinQuant in terms of peak
integration was evaluated through comparing its integrated
values to those generated using the vendor’s software and
referred to here as manual integration. This comparison was
performed using the data acquired from the LC/MS/MS
analysis of the tryptic digests of both a 10-ng aliquot
(150 fmol) of BSA and a 1-µg aliquot of depleted human
serum. The areas of 13 reliably identified peptides integrated
manually and by ProteinQuant Suite are listed in Table 1.
Accordingly, the peak areas reported by ProteinQuant are
closely comparable to those obtained through manual
integration. The differences between the two approaches
were less than 10% for all peptides, including those detected
at low signal-to-noise (S/N) ratios, suggesting an acceptable
performance of ProteinQuant’s peak-picking and peak-
integration algorithm.
Table 1. Integrated areas of different peptides derived from BSA tryptic digest and depleted human blood serum digest, calculated through manual integration and ProteinQuant integration

                                            Integration method
Peptide                   m/z     S/N(a)  Manual     ProteinQuant  Difference [%]

BSA peptides
GLVLIAFSQYLQQCPFDEHVK     831.66  232     3.47E+08   3.43E+08      1.3
TVMENFVAFVDK              700.34  614     9.03E+07   8.60E+07      5.1
LGEYGFQNALIVR             740.71  79      1.18E+08   1.13E+08      3.9
DAFLGSFLYEYSR             784.34  71      3.20E+08   2.91E+08      9.7

Depleted human blood serum peptides
SPVGVQPILNEHTFCAGMSK      724.8   79      1.10E+10   1.12E+10      2.14
ILLQGTPVAQMTEDAVDAERLK    800.3   38      4.72E+09   4.81E+09      1.84
DYVSQFEGSALGK             700.7   29      2.42E+09   2.50E+09      3.19
NFPSPVDAAFR               610.7   29      3.40E+09   3.37E+09      1.06
IASFSQNCDIYPGKDFVQPPTK    838.2   20      2.21E+09   2.19E+09      0.52
YFKPGMPFDLMVFVTNPDGSPAYR  917.5   12      2.03E+09   1.97E+09      2.98
FICPLTGLWPINTLK           887.4   10      1.86E+09   1.82E+09      2.24
GPSVFPLAPCSR              644.2   6       2.05E+08   1.96E+08      4.32
AFQPFFVELTMPYSVIRGEAFTLK  931.7   4       2.51E+08   2.46E+08      1.75

(a) S/N (as calculated by Data Analysis Software, Bruker Daltonics, Billerica, MA, USA) equals the chromatographic peak height of the extracted ion chromatogram of interest divided by five times the standard deviation, σ, of the 3rd derivative of the total ion chromatogram during the first 5 min of the LC/MS/MS experiment. σ is calculated using the equation $\sigma = \sqrt{\sum_{i=1}^{N} y''' / N}$, where y is the chromatographic height at each data point i and N is the total number of data points.
Figure 1. ProteinQuant Suite experimental workflow describing the automated
quantification aspects.
In addition to having a good agreement between the
results from a supervised and the automated integration of
constructed elution profiles for identified peptides, it is also
very important to demonstrate reproducible chromato-
graphic peak-picking and peak-integration. The extracted
ion chromatograms of the peptide LGEYGFQNALIVR
constructed from five injections of BSA tryptic digest were
evaluated both manually and with ProteinQuant (data not
shown). The shapes of the chromatographic peaks, as well as
their retention times, are very consistent, as suggested by an
automated integration relative standard deviation (RSD) of
less than 5%. This suggests that the contribution of ProteinQuant's
integration algorithm to the variation of the
evaluated peaks is negligible. This is supported by the fact
that the overall variations of an LC/MS/MS analysis can be
as high as 15%.35
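
For reference, the relative standard deviation used as the reproducibility measure can be computed as shown below. The peak areas are hypothetical values chosen for illustration and are not data from this study.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Percent RSD: 100 x sample standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak areas from five replicate injections
areas = [1.18e8, 1.15e8, 1.21e8, 1.16e8, 1.20e8]
rsd = relative_standard_deviation(areas)  # roughly 2.2%
```

An RSD computed this way for replicate integrations can be compared directly with the overall LC/MS/MS variation cited above.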
Quantitative proteomics employing ProteinQuant Suite
Label-free quantification of a mixture of standard proteins using normalization and a master peptide file
The addition of an internal standard in label-free quantitative
proteomics has repeatedly been suggested for its benefit as a
control element to measure the consistency of instrument
response. Several strategies, involving the addition of an
internal standard in quantitative proteomic experiments,
have been previously discussed.36,37 On the other hand, the
use of a sample-dependent, global normalization coefficient,
that does not require artificial spiking and utilizes all
components assumed to be present in a mixture at constant
concentration, has also been discussed as an alternative
approach to normalization of proteomic data.8 ProteinQuant
Suite allows the normalization of proteomic data using any of
these approaches. An entry box is included in ProteinQuant
to allow the user to define the normalization approach to be
employed, if any.
As a model choice for internal spiking with a standard, we
chose lysozyme. In 2004, Riter et al. reported that lysozyme
was a sound choice for the differential study of protein
expression in rat serum for many reasons, including its size
(14 kDa) and multiple tryptic peptides of similar mass-to-
charge ratios.38 Lysozyme was also attractive for a reversed-
phase separation as its peptides eluted over a large range of
the chosen gradient. This final characteristic was considered
to be highly desirable for complex mixtures. In cases where
analyte components might co-elute with certain standard
peptides, the broad range over which lysozyme peptides
elute could limit variation introduced in the standard
response by competitive ionization.39 Standard protein
normalization was quickly performed with ProteinQuant
by inputting the Swiss-Prot entry name (LYSC_CHICK) in
the normalization tab of the software configuration menu.
The global normalization strategy was also evaluated since
it has its own merits: fewer preparation steps, minimal
sample complexity, higher efficiency compared to spiking
with a standard, and normalization to a large set of signals,
thus reducing the influence of random variation. This
method was also easily evaluated as it has been implemented
in ProteinQuant. With regard to the results discussed herein,
lysozyme was spiked in the mixture of 17 standard proteins,
and the global strategy was employed in both the standard
mixture study and that where ovalbumin was spiked in the
depleted human serum sample.
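
The two normalization strategies can be contrasted with a short sketch. The text does not give the exact formulas used by ProteinQuant, so dividing by the summed peak area of the standard protein (internal standard) or of all quantified peptides (global) is an assumption here, as are the record layout and the example areas.

```python
def normalize_to_standard(peptides, standard_protein="LYSC_CHICK"):
    """Divide every area by the summed area of the internal-standard peptides."""
    reference = sum(p["area"] for p in peptides if p["protein"] == standard_protein)
    if reference <= 0:
        raise ValueError("internal standard not found or has zero area")
    return [dict(p, area=p["area"] / reference) for p in peptides]


def normalize_globally(peptides):
    """Divide every area by the summed area of all quantified peptides."""
    total = sum(p["area"] for p in peptides)
    if total <= 0:
        raise ValueError("total area must be positive")
    return [dict(p, area=p["area"] / total) for p in peptides]


# Hypothetical peptide areas from one LC/MS/MS run
run = [
    {"protein": "LYSC_CHICK", "area": 2.0e8},
    {"protein": "ALBU_BOVIN", "area": 1.0e8},
    {"protein": "ALBU_BOVIN", "area": 1.0e8},
]
by_standard = [p["area"] for p in normalize_to_standard(run)]
by_total = [p["area"] for p in normalize_globally(run)]
```

The global variant needs no spiked protein, but it presumes that most components are present at constant concentration across the samples being compared.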
Current applications of high-throughput proteomics seek
to reveal the significant changes in abundance of signal
proteins for particular diseases or perturbed conditions.3,40
Often, confident analysis of these important components
may be hampered by limited selection of critical MS
precursors in a given experiment as a result of the duty
cycle of the mass spectrometer. It is more likely that a peptide
will not be subjected to an MS/MS experiment if it is present
at a low concentration. Therefore, a sophisticated approach
to quantify components that are known to exist in a sample
even when they are not determined through MS/MS is
needed to prevent exclusion of possibly critical peptides.
This issue has been discussed, and it is illustrated here by
the example depicted in Fig. 2, in which the peptide,
LSFNPTQLEEQCHI from beta-lactoglobulin, was skipped
four times over the course of 20 LC/MS/MS experiments.
Clearly, the base peak chromatogram for the associated ion of
m/z 858.49 was present at the same retention time in the
experiments for which it was not picked for MS/MS
(Fig. 2(A)). Accordingly, it was vital to implement a strategy
for including these significant components for quantitative
purposes.
Previously, research teams have reported methods based
on ‘landmark’ alignment of MS peaks associated with
commonly identified peptides in which they calculated a
‘universal’ retention time for each component by comparison
to a designated template chromatogram.41,42 Through this
approach, chromatograms were aligned to the template by
extension or compression of the intervals between landmarks
after which cross-assignment of each identified peptide
(throughout the entire investigation) to a coinciding MS peak
allowed for quantitative analysis.20 While this approach
utilizes sophisticated alignment algorithms, it operates
under the assumption that the order in which components
elute is unchanging from experiment to experiment, which
may be difficult to conclude in a mixture of several thousand
peptides. Smith and coworkers have discussed another
method in which chromatograms are normalized to one
another to generate ‘normalized elution times’ (NET) for
each peptide that are then included in an ‘accurate mass and
time’ tag (AMT) database which is used to identify peptides
from standard proteomics experiments.43 This approach,
however, requires the use of high-mass-accuracy mass
spectrometers.
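The landmark-based alignment summarized above can be illustrated with a minimal sketch. It assumes that alignment amounts to piecewise-linear stretching or compression of the intervals between landmark retention times; the function name, argument names, and numerical values are hypothetical and are not taken from the cited methods.

```python
from bisect import bisect_right


def align_to_template(rt, run_landmarks, template_landmarks):
    """Map a retention time from one run onto the template time axis.

    Landmarks are retention times (min) of peptides identified in both the
    run and the template, given in the same, strictly increasing order.
    This encodes the assumption that elution order does not change.
    """
    if len(run_landmarks) != len(template_landmarks) or len(run_landmarks) < 2:
        raise ValueError("need at least two paired landmarks")
    # Find the landmark interval containing rt; times outside the landmark
    # range are extrapolated from the first or last interval.
    i = bisect_right(run_landmarks, rt) - 1
    i = max(0, min(i, len(run_landmarks) - 2))
    r0, r1 = run_landmarks[i], run_landmarks[i + 1]
    t0, t1 = template_landmarks[i], template_landmarks[i + 1]
    # Linear stretch or compression of the interval between two landmarks.
    return t0 + (rt - r0) * (t1 - t0) / (r1 - r0)


# Hypothetical landmark retention times (min) in one run and in the template.
run_landmarks = [10.0, 20.0, 40.0]
template_landmarks = [11.0, 22.0, 38.0]
aligned_early = align_to_template(15.0, run_landmarks, template_landmarks)
aligned_late = align_to_template(30.0, run_landmarks, template_landmarks)
```

The sketch fails by construction if two components swap elution order between runs, which is the limitation noted above for mixtures of several thousand peptides.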
Although peak alignment strategies are known to improve
the performance of quantification algorithms, we would like
to note that these methods require computer clusters to
facilitate data processing in a timely manner.41,42,44 ProteinQuant
has been designed as a portable tool that could be used
by proteomics laboratories that may not have access to
dedicated servers for data processing. That said,
ProteinQuant includes a retention time window option as a
means of accounting for chromatographic shifts. The
approach we developed is based on compiling a master file
containing all peptides identified with MASCOT over all
experiments in ProtParser and integrating the MS precursor
Figure 2. Extracted ion chromatograms (EIC) of LSFNPTQLEEQCHI. In 20 LC/MS/MS injections, this
peptide was not subjected to MS/MS four times. (A) The top chromatogram shows an EIC for an experiment in
which the MS precursor was picked for MS2; the four following chromatograms are cases in which the precursor
was not picked in spite of the apparently significant chromatographic peak. (B) Zoom spectrum for m/z
250–950 from a representative MS2 experiment that was used to identify the peptide by MASCOT; in all 16
cases where the peptide was picked for MS2, the ion score was >30.
peak associated with each component listed in this new file in
every individual LC/MS/MS analysis. Compiling of differ-
ent parsed files into a master file is attained using the
‘combine’ function of ProtParser. While there were slight
variations in the retention time of peptides from experiment
to experiment, these were accounted for with the Protein-
Quant peak apex assignment options, in which an appro-
priate scan window was designated for the specific
investigation. The master file was created by automated
scanning of the individual database searches from each
experiment and including the first occurrence of a peptide
and its associated retention time (the time at which the MS
precursor was subjected to MS/MS) from these individual
runs. After a peptide had been added to the master file,
further occurrences from subsequent database searches
were not included by ProtParser. In order to reassign the
retention time for each peptide to the apex of the MS peak,
ProteinQuant performed automatic peak assignment by
generating a base peak chromatogram for the appropriate
m/z value and then scanning in each direction from the
retention time associated with the MS precursor for the
maximum intensity value within the user-designated time
window. Without using a master file, the default window for
peak assignment had been ±1 min. Because a master file
included the first occurrence of each peptide in all 20 database
search files of the 20 LC/MS/MS analyses, the scan window
was increased to ensure proper assignment of the apex for all
LC/MS experiments throughout the 20-injection investi-
gation. Although the chromatographic retention time
variation was limited to 1–2 min for any peptide, it was
necessary to account for the different points at which a
peptide could be picked on its elution profile for MS/MS
experiments by the instrument. Therefore, the peak assignment
window was increased to ±1.5 min for quantification
with the master file. It has been discussed previously by
Higgs and coworkers41 that extension of the integration
window can mask the area of an individual peptide by
including partial peaks from co-eluting peptides in the
calculation, and it is important to note that our approach
does not expand the integration window, only the peak
assignment window. It should be noted that the compen-
sation for the chromatographic shift through the peak
assignment window is necessary only when the user chooses
to use the master file. In this case, the end-user would need to
experimentally define chromatographic shifts and sub-
sequently retention time variation. However, if only those
peptides that were identified in each run are quantified, then
it is not necessary to compensate for chromatographic shift
because each peak is associated with a retention time at
which its MS/MS spectrum is acquired.
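The two steps described above, compiling a master file from the first occurrence of each peptide and reassigning the retention time to the peak apex within a user-designated window, can be sketched as follows. This is an illustration under stated assumptions, not the ProtParser or ProteinQuant implementation: the function names and data layouts are hypothetical, and the retention times and intensities are invented for the example.

```python
def build_master_file(search_results):
    """Keep the first occurrence of each peptide across all runs.

    search_results: one list per run (in run order) of
    (peptide, mz, retention_time_min) tuples from the database search.
    """
    master = {}
    for run in search_results:
        for peptide, mz, rt in run:
            if peptide not in master:  # later occurrences are ignored
                master[peptide] = (mz, rt)
    return master


def assign_apex(chromatogram, rt, window_min=1.5):
    """Return the (time, intensity) point of maximum intensity within
    rt +/- window_min, or None if no point falls inside the window.

    chromatogram: list of (time_min, intensity) points for one m/z value.
    """
    candidates = [(t, i) for t, i in chromatogram if abs(t - rt) <= window_min]
    if not candidates:
        return None
    return max(candidates, key=lambda point: point[1])


# Hypothetical search results from two runs; retention times are invented.
runs = [
    [("LSFNPTQLEEQCHI", 858.49, 35.2)],
    [("LSFNPTQLEEQCHI", 858.49, 35.9), ("GGLEPINFQTAADQAR", 844.39, 20.0)],
]
master = build_master_file(runs)

# Hypothetical chromatogram points for m/z 858.49 in a run where the
# precursor was not picked for MS/MS.
chromatogram = [(34.0, 1e5), (34.8, 4e5), (35.6, 9e5), (36.4, 3e5), (37.5, 2e6)]
mz, rt = master["LSFNPTQLEEQCHI"]
apex = assign_apex(chromatogram, rt, window_min=1.5)
```

In this example the more intense point at 37.5 min lies outside the ±1.5 min window and is ignored, which shows why the window should be only as wide as the observed retention time variation requires.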
The method described above was tested using a standard
mixture containing 17 proteins, all present at equimolar
concentrations. The mixture was injected 20 times, and a
master peptide file was generated. Quantification results
based on individual experiments were compared to those
obtained with the master file. In Fig. 3(A), components were
not normalized, and in Fig. 3(B) components were globally
normalized to all 225 identified peptides. Normalization to
the lysozyme protein area in each sample was also
performed (data not shown), and while results were
improved over the non-normalized data, global normalization
showed significantly greater improvement. The global
approach was also more attractive because it did not require
artificial spiking of sample, thus eliminating additional
uncertainty.
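A minimal sketch of global normalization and its effect on the coefficient of variation is given below. It assumes that global normalization means dividing each peptide area by the summed area of all peptides quantified in the same run; the function names and the two-run data set are hypothetical.

```python
from statistics import mean, stdev


def normalize_globally(peptide_areas):
    """Divide each peptide area by the summed area of all peptides in the run."""
    total = sum(peptide_areas.values())
    return {pep: area / total for pep, area in peptide_areas.items()}


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Two hypothetical injections of the same sample; the second gave twice
# the overall signal (for example, because of injection variability).
run_1 = {"peptide_A": 100.0, "peptide_B": 300.0}
run_2 = {"peptide_A": 200.0, "peptide_B": 600.0}

raw_cv = coefficient_of_variation([run_1["peptide_A"], run_2["peptide_A"]])
norm_1 = normalize_globally(run_1)
norm_2 = normalize_globally(run_2)
normalized_cv = coefficient_of_variation([norm_1["peptide_A"], norm_2["peptide_A"]])
```

In this idealized case the run-to-run difference is purely a scaling of the total signal, so normalization removes it entirely; real data retain residual variation.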
Considering that less intense MS precursors were more
likely to be skipped for MS/MS, it is expected that the total
areas calculated for each protein would only increase by a
small amount when a master file was used. Reproducibility
improved significantly, lowering most coefficients of variation
from 20–30% to 5–15%, which was also to be expected
since a master file ensured that all peaks identified for a
protein throughout the investigation would be integrated in
each experiment. Based on these results, implementation of a
master file with ProteinQuant coupled with normalized
quantification appeared to be advantageous for label-free
analysis of proteins.
Figure 3. Bar graph comparing the quantification results with
a master file of peptides to quantification with only peptides
identified through MASCOT for each individual experiment; a
considerable increase in precision was observed using a
master file for both (A) not normalized and (B) normalized
results; normalization also contributed to additional improvement
in reproducibility (>15%).
Figure 4. Calibration curve of ovalbumin spiked in 1 mg
depleted human blood serum quantified with a master file:
(A) not normalized, (B) normalized, and (C) ovalbumin signal
was increased by a factor of 100 for visual clarity and then log
transformed to emphasize the continuation of the linear trend
into the low end of the dynamic range.
Label-free quantification of ovalbumin spiked in depleted human blood serum sample
To test the efficacy of the software for label-free quantification
of a complex biological mixture, namely depleted
human blood serum, known quantities of ovalbumin were
injected in 1-mg aliquots of depleted serum sample. Seven
experiments were conducted through which triplicate
injections were made of samples containing 250–10 000 fmol
ovalbumin. Relative protein abundance for the standard
protein was calculated with a master file as described above.
We chose a modified global normalization method, in part to
utilize the large number of serum proteins that were
assumed to be present at constant concentration, and also
to limit the number of sample preparation steps. Ovalbumin
peptide areas were normalized to the sum of all serum
peptides listed in the master file in an automated fashion by
configuring ProteinQuant to normalize to all peptides except
those from ovalbumin.
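The modified global normalization described above can be sketched as follows, assuming that each ovalbumin peptide area is divided by the summed area of every peptide not assigned to ovalbumin. The function name, the peptide-to-protein mapping, and the areas are hypothetical; this is not ProteinQuant's configuration interface.

```python
def normalize_to_background(areas, protein_of, spiked_protein):
    """Normalize peptides of the spiked protein to the summed area of all
    peptides that belong to other proteins."""
    background = sum(
        area for pep, area in areas.items() if protein_of[pep] != spiked_protein
    )
    if background == 0:
        raise ValueError("no background peptides to normalize against")
    return {
        pep: area / background
        for pep, area in areas.items()
        if protein_of[pep] == spiked_protein
    }


# Hypothetical areas: one ovalbumin peptide and two serum peptides.
areas = {
    "GGLEPINFQTAADQAR": 4.0e7,
    "serum_peptide_1": 6.0e9,
    "serum_peptide_2": 2.0e9,
}
protein_of = {
    "GGLEPINFQTAADQAR": "ovalbumin",
    "serum_peptide_1": "serum",
    "serum_peptide_2": "serum",
}
normalized = normalize_to_background(areas, protein_of, "ovalbumin")
```

Excluding the spiked protein from the denominator keeps the reference signal independent of the amount of standard added.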
While normalized data suggested a trend in the calculated
area as the amount of ovalbumin increased (Figs. 4(A)
and 4(B)), the contribution of the lower points to the overall
linearity (R²) was further elucidated by performing linear
regression on log transformed data. The contribution of the
lower abundant points to the overall linearity was augmen-
ted, thus clearly suggesting that this trend extended to the
lower end of the dynamic range (Fig. 4(C)). To further
illustrate this point, five ovalbumin peptides quantified in
the mixture at 250 fmol were manually inspected to verify
that the ProteinQuant algorithm was functioning properly
for peptides present near the limit of quantification (Table 2).
It should also be noted that these data provided empirical
evidence that the peak apex assignment method employed
Table 2. Integrated areas of five chicken ovalbumin peptides injected in triplicate at a concentration of 250 fmol in 1 mg depleted
human blood serum digest. The data empirically demonstrates that ProteinQuant can accurately quantify peptides for a protein
present near the limit of quantification, 187 fmol

                                                 Integration method
Peptide                       m/z      S/N(a)   Manual     ProteinQuant   Difference [%]
Ovalbumin peptides (250 fmol)
Injection 1
ISQAVHAAHAEINEAGR             592.02   7        5.53E+07   5.55E+07       0.32
GGLEPINFQTAADQAR              844.39   12       3.38E+07   3.91E+07       13.68
ELINSWVESQTNGIIR              930.69   10       3.16E+07   3.83E+07       17.40
NVLQPSSVDSQTAMVLVNAIVFKGLWEK  1025.51  10       4.40E+07   4.89E+07       9.99
LYAEERYPILPEYLQCVK            762.11   11       7.61E+07   7.46E+07       2.06
Injection 2
ISQAVHAAHAEINEAGR             592.02   10       3.65E+08   3.32E+08       9.83
GGLEPINFQTAADQAR              844.39   12       5.67E+08   5.7E+08        0.51
ELINSWVESQTNGIIR              930.69   7        2.74E+08   2.77E+08       1.06
NVLQPSSVDSQTAMVLVNAIVFKGLWEK  1025.51  7        1.02E+08   1.24E+08       17.85
LYAEERYPILPEYLQCVK            762.11   14       3.21E+08   3.48E+08       7.95
Injection 3
ISQAVHAAHAEINEAGR             592.02   10       4.06E+08   4.09E+08       0.69
GGLEPINFQTAADQAR              844.39   5        6.77E+07   8.22E+07       17.68
ELINSWVESQTNGIIR              930.69   5        9.27E+07   9.63E+07       3.74
NVLQPSSVDSQTAMVLVNAIVFKGLWEK  1025.51  4        5.55E+07   6.74E+07       17.73
LYAEERYPILPEYLQCVK            762.11   9        2.16E+08   2.12E+08       2.02
(a) Calculated as in Table 1.
by ProteinQuant was effective in the quantification of
peptides originating from a protein present in a complex
mixture. However, in the very rare situation where two
peptides with the same m/z value co-elute within the
user-designated retention time window, it is not possible to
distinguish between those peptides automatically. This
situation is less pronounced for a high-mass accuracy
instrument. The m/z delta option in ProteinQuant, in this
case, allows accurate peak-picking, thus making it even less
likely that such a situation would occur.
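The regression on log-transformed data mentioned above can be sketched with an ordinary least-squares fit. The spiked amounts and signals below are synthetic (a perfectly proportional response within the 250–10 000 fmol range) and are not the measured values; the factor of 100 mirrors the scaling described for Fig. 4(C), and the function name is hypothetical.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic calibration data: amounts in fmol, normalized signal proportional
# to amount (hypothetical response factor of 2e-6 per fmol).
amounts_fmol = [250, 500, 1000, 2500, 5000, 10000]
signals = [2e-6 * a for a in amounts_fmol]

# Scale the signal by 100, then log-transform both axes before fitting.
log_amounts = [math.log10(a) for a in amounts_fmol]
log_signals = [math.log10(100 * s) for s in signals]
slope, intercept, r_squared = linear_fit(log_amounts, log_signals)
```

On a linear scale the highest amounts dominate the sums of squares, whereas after log transformation the points are spaced more evenly, so the low-abundance points carry comparable weight in R². A slope near 1 on the log-log scale indicates a proportional response.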
The amounts at which standards were spiked in blood
serum were comparable to those of middle to high
abundance proteins endogenously present in human serum.
The limit of quantification for this calibration curve was
determined employing confidence bands according to
IUPAC standards.45 Through this method, the low limit of
quantification was determined to be ca. 180 fmol. We have
observed in this study that, for globally normalized data,
measurements below a threshold of 0.0005 become
inadequate for quantification. However, this value was
determined in this study, and it may not necessarily apply to
other label-free studies of different complexity and consti-
tution. Accordingly, ProteinQuant users are advised to
determine their specific limit of quantification.
The low relative deviation for the ovalbumin signals,
<15%, suggested that this method could offer enough
precision to observe an up- or down-regulation of approxi-
mately 30% or greater. This result suggested that Protein-
Quant Suite is viable for label-free quantification exper-
iments in a complex biological mixture. The primary concern
for a method such as this would be its value for analysis of
low abundance proteins in a biological fluid or tissue sample
in which challenges associated with signal masking originat-
ing from highly abundant proteins still persist. However,
we believe that the same theory could be applied to these
proteins, and that with advanced enrichment,46–48
depletion,49,50 and affinity purification techniques,51,52 it
will be possible to observe their fluctuations as a result of
biological perturbations.
Quantification of proteins and glycoproteins subjected to lectin (Con A) affinity chromatography
Comparative quantification of proteins and peptides has
elucidated the fact that changes in expression of certain
features identified in biological samples reflect the pro-
gression of various diseases. Furthermore, many of these
potential biomarkers feature characteristic post-translational
modifications, such as glycosylation or phosphorylation.
Particularly, glycoproteins have been receiving continuous
attention, as more than 50% of all proteins in mammalian
systems are commonly believed to be glycosylated53 and
some glycoproteins have already been implicated in
perturbations related to different types of cancer.54
We have employed Con A affinity chromatography to
enrich specific standard glycoproteins present in a mixture to
demonstrate the utility of ProteinQuant Suite for the relative
quantification of these glycoproteins. The resulting lectin-
bound and unbound fractions were adjusted to the same
volumes, subjected to proteolytic digestion, analyzed by LC/
MS/MS, and quantified with ProteinQuant Suite. In this
case, the choice of concanavalin A lectin was arbitrary, yet we
took advantage of its relatively broad specificity and ready
availability in a pure form. As described elsewhere,55 Con A
exhibits strong affinity toward carbohydrates with a high
content of mannose, glucose and galactose, and occasionally
interacts with the chitobiose core of N-glycans. The standard
protein mixture consisted of three proteins and three
glycoproteins, namely BSA, cytochrome C, and myoglobin
proteins and ribonuclease B, ovalbumin, and fetuin glyco-
proteins.
Table 3. Comparative quantification of proteins and glycoproteins
identified in unbound and bound fractions after Con A
lectin affinity chromatography enrichment and label-free
quantification using ProteinQuant Suite. Calculation of confidence
intervals (α = 0.05) was based on three replicates

                   Protein area (×10^10)               Ratio
Protein            Con A unbound     Con A bound       bound/unbound
Cytochrome C        8.38 ± 1.68       0.05 ± 0.02      0.01
BSA                27.70 ± 5.04       0.22 ± 0.19      0.01
Ribonuclease B*     1.08 ± 0.53       8.42 ± 1.31      7.80
Fetuin*             7.24 ± 0.25       0.37 ± 0.05      0.05
Myoglobin          18.50 ± 3.08       0.26 ± 0.26      0.01
Ovalbumin*         12.75 ± 3.12      10.67 ± 5.55      0.84
* Glycosylated proteins.
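The ratio column of Table 3 follows directly from the mean protein areas. As a worked check, the short sketch below recomputes bound/unbound ratios from the tabulated means (in units of 10^10); the variable names are illustrative only.

```python
# Mean protein areas (x10^10) from Table 3 as (Con A unbound, Con A bound).
areas = {
    "Cytochrome C": (8.38, 0.05),
    "BSA": (27.70, 0.22),
    "Ribonuclease B": (1.08, 8.42),
    "Fetuin": (7.24, 0.37),
    "Myoglobin": (18.50, 0.26),
    "Ovalbumin": (12.75, 10.67),
}

# Bound/unbound ratio for each protein, rounded to two decimals as in the table.
ratios = {
    protein: round(bound / unbound, 2)
    for protein, (unbound, bound) in areas.items()
}

# Proteins ranked by enrichment in the Con A bound fraction.
ranked = sorted(ratios, key=ratios.get, reverse=True)
```

Ribonuclease B and ovalbumin rank first and second, while the remaining proteins, including fetuin, all have ratios of 0.05 or less.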
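The results summarized in Table 3 show that ribonuclease B
and ovalbumin were recovered in the Con A bound fraction to a
far greater extent than the other proteins (bound/unbound
ratios of 7.80 and 0.84, respectively, compared with 0.01–0.05
for the remaining proteins). Such observations were
expected since both glycoproteins contain high-mannose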
glycan structures. On the other hand, lower binding
efficiency of fetuin, which is a well-studied glycoprotein
with characterized N- and O-glycans, is mainly due to the
presence of terminal sialic acids, which interfere with Con A
binding. It is also worth mentioning that a compiled master
file was used here for the automated quantification of all
proteins identified in both bound and unbound fractions.
Hence, ProteinQuant calculated peak areas for identified
proteins and glycoproteins as if they had the same likelihood
of being present in both fractions. This eliminated
the bias imposed by proteins that commonly generate
more peptides after enzymatic digestion, and the reported
area comparison then reflected the difference in the
abundance of proteins and glycoproteins identified in both
fractions.
Comparing the performance of ProteinQuant to two other software platforms
As mentioned above, several quantitative proteomic platforms
have been previously developed. Therefore, the
performance of ProteinQuant was compared to two other
software tools, LabKey CPAS and mzMine, that are readily
available and offer similar functionality. A quantitative
evaluation was performed using data collected from the
experiments in which depleted human blood serum was
spiked with different concentrations of ovalbumin (5.0, 2.0,
1.5, and 1.0 pmol/mL). The data output from each software
platform was used to calculate three protein signal ratios
using the mean protein area for each concentration.
Assuming no experimental measurement error related to
sample preparation or proteomic analyses, the theoretical
ratios are expected to be 2.5, 1.33, and 1.5 for 5.0:2.0, 2.0:1.5,
and 1.5:1.0, respectively. It can be seen in Table 4 that
ProteinQuant offers relatively higher accuracy and reproducibility,
as suggested by the values calculated using the three
platforms.
Table 4. Comparison of ProteinQuant performance to that of CPAS and mzMine. Data used in this comparison were based on
proteomic experiments in which ovalbumin was spiked in depleted human blood serum. The table also summarizes features
offered by the different platforms

                            ProteinQuant                  LabKey CPAS w/XPRESS(a)       mzMine(a)
Ovalbumin theoretical
concentration ratio
  2.50                      2.52 ± 0.01                   1.71 ± 0.27                   2.14 ± 0.07
  1.33                      1.48 ± 0.25                   1.65 ± 0.25                   1.48 ± 0.65
  1.50                      1.62 ± 0.26                   1.67 ± 0.25                   1.46 ± 1.16
Software features
  Normalization             total peptide signal or       none                          average peak height, average
                            internal standard(s)                                        squared peak height, maximum
                                                                                        peak height or total raw signal
  De-noising                Savitzky-Golay                none                          moving average or Savitzky-Golay
  Batch mode                yes                           yes                           yes
  Visualization             none                          elution profile               2D plot, 3D plot, spectrum plot
  Database searching        Mascot                        Mascot, X!tandem, Sequest     none
  compatibility
  Master file               based on database             none                          gap filler estimates area for
                            identifications                                             missing peaks
  Peak alignment            none(c)                       none                          aligned to master template with
                                                                                        m/z and ret. time thresholds
  Statistical analysis      none(b)                       none(b)                       Log ratio plot, PCA
  Data file format          mzXML, mzData                 mzXML                         mzXML, mzData, netCDF,
                                                                                        Thermo RAW(d)
  Computer requirements     2 GHz CPU or better, 1 GB     P4 or dual processor          2 GHz CPU or better, 2 GB RAM,
                            RAM, Microsoft .NET           machine, >1 GB RAM, 1 GB      JRE 5.0 or better, Java 3D
                            Framework 2.0 or higher       hard drive space free,
                                                          optional cluster config.
(a) Label-free quantification with the XPRESS algorithm was performed according to the suggestions of LabKey CPAS support; area or intensity-based label-free quantification is not officially supported by LabKey CPAS. Since mzMine does not support database connectivity, this step was performed manually.
(b) Outputs can be processed with third-party statistical software tools.
(c) Possible retention time shifts are compensated for through the peak apex assignment window.
(d) With Thermo XCalibur software installed.
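As a worked example of the comparison, the sketch below derives the theoretical ratios from the spiked concentrations and computes the relative error of the mean ratio reported for each platform in Table 4. The helper name and the use of percent error as the accuracy measure are choices made here for illustration; they are not described in the original evaluation.

```python
from statistics import mean

# Spiked ovalbumin concentrations, highest to lowest.
concentrations = [5.0, 2.0, 1.5, 1.0]

# Theoretical ratios between successive concentrations: 5.0:2.0, 2.0:1.5, 1.5:1.0.
theoretical = [a / b for a, b in zip(concentrations, concentrations[1:])]

# Mean ratios reported in Table 4 for each platform.
measured = {
    "ProteinQuant": [2.52, 1.48, 1.62],
    "LabKey CPAS w/XPRESS": [1.71, 1.65, 1.67],
    "mzMine": [2.14, 1.48, 1.46],
}


def percent_error(observed, expected):
    """Absolute deviation from the expected value, as a percentage."""
    return 100.0 * abs(observed - expected) / expected


errors = {
    platform: [percent_error(m, t) for m, t in zip(ratios, theoretical)]
    for platform, ratios in measured.items()
}
mean_errors = {platform: mean(values) for platform, values in errors.items()}
```

The mean ratios capture accuracy only; reproducibility is reflected in the ± values listed alongside each ratio in Table 4, which this sketch does not use.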
Some of the key features of each software package are also
summarized in Table 4. mzMine and other aforementioned
utilities allow chromatographic alignment, thus permitting
the quantification of all significant chromatographic features,
including those that are not identified through database
searching. CPAS offers a centralized data-processing work-
space, and a workflow that can be completely automated,
beginning with the input of an mzXML data file and ending
with the output of a list of identified peptides as well as their
associated peak areas. However, CPAS does not officially
support area or intensity-based label-free quantification and
only implements spectral counting. Other features are also
summarized in Table 4, including data normalization, data
de-noising, data visualization, multiple file processing,
database search engine compatibility, use of a master
file, peak alignment, statistical analysis and computer
requirements.
CONCLUSIONS
We have developed ProteinQuant Suite, a software bundle,
which allows fast and automated high-throughput label-free
quantification in proteomics. The suite collectively permits
automated and efficient quantitative evaluation of LC/MS/
MS proteomics data. Although it is currently limited to the
MASCOT search engine, plans to expand its compatibility to
other engines and the universal pepXML format are in
progress. This extension will help to reduce platform
dependence. ProteinQuant is a freeware application avail-
able for download.56
The performance of ProteinQuant has been verified for
label-free quantification with the data reported here for three
sample sets, including a mixture of standard proteins,
depleted human blood serum spiked with ovalbumin, and
Con A lectin-enriched fractions. A label-free approach does
not complicate sample preparation, and does not increase the
likelihood of sample loss prior to LC injection. By compiling a
master list of peptides for quantification via MS peak
integration, increased reproducibility was achieved, as
previously suggested by similar strategies,42 and limitations
resulting from the duty cycle of the instrument were
circumvented. It is important to consider the effects of
slight chromatographic variations for different experiments
with this approach, but software tools can account for these
and ensure that MS peaks are properly assigned. Application
of this approach to the quantification of a lectin-enriched
sample is encouraging for future studies in which significant
enrichment steps could lead to quantitative knowledge of
lower abundance proteins.
Acknowledgements
This work was primarily supported by Grant No. GM24349
from the National Institute of General Medical Sciences, US
Department of Health and Human Services. Further support
was provided by NIH/NCRR – National Center for Glyco-
mics and Glycoproteomics (NCGG), Grant No. RR018942.
REFERENCES
1. Lowe JB, Marth JD. Annu. Rev. Biochem. 2003; 72: 643.
2. Righetti PG, Campostrini N, Pascali J, Hamdan M, Astner H. Eur. J. Mass Spectrom. 2004; 10: 335.
3. Veenstra TD. J. Chromatogr. B 2007; 847: 3.
4. Hale JE, Gelfanova V, Ludwig JR, Knierman MD. Briefings Funct. Genom. Proteomics 2003; 2: 185.
5. Witzmann FA, Arnold RJ, Bai F, Hrncirova P, Kimpel MW, Mechref YS, McBride WJ, Novotny MV, Pedrick NM, Ringham HN, Simon JR. Proteomics 2005; 5: 2177.
6. Baek W-O, Haupt K, Colin C, Vijayalakshmi MA. Electrophoresis 1996; 17: 489.
7. Klouckova I, Hrncirova P, Mechref Y, Arnold RJ, Li T-K, McBride WJ, Novotny MV. Proteomics 2006; 6: 3060.
8. Wang G, Wu WW, Zeng W, Chou CL, Shen RF. J. Proteome Res. 2006; 5: 1214.
9. Gygi SP, Corthals GL, Zhang Y, Rochon Y, Aebersold R. Proc. Natl. Acad. Sci. 2000; 97: 9390.
10. Parker KC, Patterson D, Williamson B, Marchese J, Graber A, He F, Jacobson A, Juhasz P, Martin S. Mol. Cell. Proteomics 2004; 3: 625.
11. Julka S, Regnier F. J. Proteome Res. 2004; 3: 350.
12. Qiu R, Regnier FE. Anal. Chem. 2005; 77: 7225.
13. Wiener MC, Sachs JR, Deyanova EG, Yates NA. Anal. Chem. 2004; 76: 6085.
14. Higgs RE, Knierman MD, Gelfanova V, Butler JP, Hale JE. J. Proteome Res. 2005; 4: 1442.
15. Matthiesen R. Proteomics 2007; 7: 2815.
16. MacCoss MJ, Wu CC, Liu H, Sadygov R, Yates JR. Anal. Chem. 2003; 75: 6912.
17. Venable JD, Dong M-Q, Wohlschlegel J, Dillin A, Yates JR I. Nat. Methods 2004; 1: 39.
18. Li XJ, Zhang H, Ranish JA, Aebersold R. Anal. Chem. 2003; 75: 6648.
19. Bellew M, Coram M, Fitzgibbon M, Igra M, Randolph T, Wang P, May D, Eng J, Fang R, Lin C, Chen J, Goodlett D, Whiteaker J, Paulovich A, McIntosh M. Bioinformatics 2006; 22: 1902.
20. Andreev VP, Li L, Rejtar T, Li Q, Ferry JG, Karger BL. J. Proteome Res. 2006; 5: 2039.
21. Andreev VP, Li L, Cao L, Gu Y, Rejtar T, Wu S-L, Karger BL. J. Proteome Res. 2007; 6: 2186.
22. Ono M, Shitashige M, Honda K, Isobe T, Kuwabara H, Matsuzuki H, Hirohashi S, Yamada T. Mol. Cell. Proteomics 2006; 5: 1338.
23. Leptos KC, Sarracino DA, Jaffe JD, Krastins B, Church GM. Proteomics 2006; 6: 1770.
24. Katajamaa M, Miettinen J, Oresic M. Bioinformatics 2006; 22: 634.
25. Rauch A, Bellew M, Eng J, Fitzgibbon M, Holzman T, Hussey P, Igra M, Maclean B, Lin CW, Detter A, Fang R, Faca V, Gafken P, Zhang H, Whitaker J, States D, Hanash S, Paulovich A, McIntosh MW. J. Proteome Res. 2006; 5: 112.
26. Bondarenko PV, Chelius D, Shaler TA. Anal. Chem. 2002; 74: 4741.
27. Yocum AK, Yu K, Oe T, Blair IA. J. Proteome Res. 2005; 4: 1722.
28. Resing KA, Meyer-Arendt K, Mendoza AM, Aveline-Wolf LD, Jonscher KR, Pierce KG, Old WM, Cheung HT, Russell S, Wattawa JL, Goehle GR, Knight RD, Ahn NG. Anal. Chem. 2004; 76: 3556.
29. Wang L-h, Li D-Q, Fu Y, Wang H-P, Zhang J-F, Yuan Z-F, Sun R-X, Zeng R, He S-M, Gao W. Rapid Commun. Mass Spectrom. 2007; 21: 2985.
30. Savitzky A, Golay MJE. Anal. Chem. 1964; 36: 1627.
31. Wang GH, Wu WW, Pisitkun T, Hoffert JD, Knepper MA, Shen RF. Anal. Chem. 2006; 78: 5752.
32. Lin SM, Zhu L, Winter AQ, Sasinowski M, Kibbe WA. Exp. Rev. Proteomics 2005; 2: 839.
33. Sandra O, Chris T, Henning H, Weimin Z, Randall J, Rolf A. Exp. Rev. Proteomics 2004; 1: 79.
34. Available: http://sashimi.sourceforge.net/software_glossolalia.html
35. Massaroti P, Moraes LAB, Marchioretto MAM, Cassiano NM, Bernasconi G, Calafatti SA, Barros FAP, Meurer EC, Pedrazzoli J. Anal. Bioanal. Chem. 2005; 382: 1049.
36. Immler D, Greven S, Reinemer S. Proteomics 2006; 6: 2947.
37. Cutillas PR, Geering B, Waterfield MD, Vanhaesebroeck B. Mol. Cell. Proteomics 2005; 4: 1038.
38. Riter LS, Hodge BD, Gooding KM, Julian RK Jr. J. Proteome Res. 2005; 4: 153.
39. Tang K, Page JS, Smith RD. J. Am. Soc. Mass Spectrom. 2004; 15: 1416.
40. Hale EJ, Gelfanova V, Ludwig RJ, Knierman MD. Briefings Funct. Genom. Proteomics 2003; 2: 185.
41. Higgs RE, Knierman MD, Gelfanova V, Butler JP, Hale JE. J. Proteome Res. 2005; 4: 1442.
42. Andreev VP, Li L, Cao L, Gu Y, Rejtar T, Wu SL, Karger BL. J. Proteome Res. 2007; 6: 2186.
43. Qian W-J, Monroe ME, Liu T, Jacobs JM, Anderson GA, Shen Y, Moore RJ, Anderson DJ, Zhang R, Calvano SE, Lowry SF, Xiao W, Moldawer LL, Davis RW, Tompkins RG, Camp DG II, Smith RD. Mol. Cell. Proteomics 2005; 4: 700.
44. Finney GL, Blackler AR, Hoopmann MR, Canterbury JD, Wu CC, MacCoss MJ. Anal. Chem. 2008; 80: 961.
45. Mocak J, Bond AM, Mitchell S, Scollary G. Pure Appl. Chem. 1997; 69: 297.
46. Madera M, Mechref Y, Novotny MV. Anal. Chem. 2005; 77: 4081.
47. Madera M, Mechref Y, Klouckova I, Novotny MV. J. Proteome Res. 2006; 5: 2348.
48. Yang Z, Harris LE, Palmer-Toy DE, Hancock WS. Clin. Chem. 2006; 52: 1897.
49. Zolotarjova N, Martosella J, Nicol G, Bailey J, Boyes BE, Barrett WC. Proteomics 2005; 5: 3304.
50. Schuchard MD, Mehigh RJ, Kappel WK, Scott GB. Sigma-Aldrich Application Note.
51. Cartellieri S, Hamer O, Helmholtz H, Niemeyer B. Biotechn. Appl. Biochem. 2002; 35: 83.
52. Lee W-C, Lee KH. Anal. Biochem. 2004; 324: 1.
53. Dwek MV, Ross HA, Leathem AJC. Proteomics 2001; 1: 756.
54. Dwek MV, Alaiya AA. Br. J. Cancer 2003; 89: 305.
55. Cummings RD. Methods Enzymol. 1994; 230: 66.
56. Available: www.ncgg.indiana.edu