Functional genomics studied by proteomics

Bent Honore,1* Morten Østergaard,1 and Henrik Vorum2

Summary

The human genome contains about 30,000 genes, each creating several transcripts. Transcript structures and expression are studied by high-throughput transcriptomic techniques using microarrays. Generally, transcripts are not directly operating molecules, but are translated into functional proteins, post-translationally modified by proteolysis, glycosylation, phosphorylation, etc., sometimes with great functional impact. Proteins need to be analyzed by proteomic techniques, which are less suited for high-throughput. Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), separating thousands of proteins, has developed slowly over the past quarter of a century. This technique is now quite reproducible and suitable for differential proteomics, comparing normal and diseased cells/tissues and revealing differentially regulated proteins. 2D-PAGE is combined with protein-identification methods, currently mass spectrometry (MS), which has been significantly improved over the last decade. Other proteomic techniques studying protein–protein interactions are now either established or still being developed, such as peptide or protein arrays, phage display, and the yeast two-hybrid system. The strengths and weaknesses of these techniques are discussed. BioEssays 26:901–915, 2004. © 2004 Wiley Periodicals, Inc.

Introduction

Completion of the draft sequence

The human genome has now been completely sequenced(1)

two years after the draft sequence was published (2,3) and

thirteen years since the start of The Human Genome Project in

1990. The sequence can be obtained from GenBank at the

NCBI home page (http://www.ncbi.nih.gov/Genbank/) and

creates a firm foundation for the further analysis of gene

structures as well as for determining variations between

individuals (single nucleotide polymorphisms, SNP), muta-

tions, etc. The number of genes present in the genome has

now been estimated to be only about 30,000, although the

exact number remains to be determined as genes cannot be

located with certainty in the sequence by computer analysis

alone.(1,2) Although the number of genes is surprisingly low, it

should be remembered that they are transcribed to pre-mRNA

and that several mechanisms serve to increase the variability

of the expressed genes. The majority of genes are subjected to

alternative splicing after transcription with the result that

several gene products are produced from each gene.(4) Other

RNA modifications as well as post-translational modifications

of the proteins add complexity to the expression of the

genome. The known mechanisms that occur in the expression

of a gene through pre-mRNA, mRNA and protein are illustrated

in Fig. 1 with all the putative regulatory steps indicated.

Gene analyses

In principle, genetic information may be analyzed using three

different approaches by monitoring (1) the genome through

genomics (e.g., identification of mutations and SNPs), (2)

the transcriptome through transcriptomics (i.e. by monitoring

the structure and/or the levels present of the transcripts), or (3)

the proteome through proteomics (Fig. 1). The techniques

used within genomics and transcriptomics are based on

hybridization of nucleotide probes to nucleotide targets. Due

1Department of Medical Biochemistry, University of Aarhus, Aarhus C, Denmark.

2Department of Ophthalmology, Aarhus University Hospital, Aarhus C, Denmark.

Funding agencies: The Danish Medical Research Council, the Novo

Nordisk Foundation, the Danish Cancer Society and the John and

Birthe Meyer Foundation.

*Correspondence to: Bent Honore, Department of Medical Biochem-

istry, University of Aarhus, Ole Worms Alle, bldg. 170, DK-8000

Aarhus C, Denmark. E-mail: [email protected]

DOI 10.1002/bies.20075

Published online in Wiley InterScience (www.interscience.wiley.com).

Abbreviations: 1D, one-dimensional; 2D, two-dimensional; 3D, three-dimensional; CID, collision-induced dissociation; DIGE, difference in-gel electrophoresis; ESI, electrospray ionization; FT-ICR, Fourier transform ion cyclotron resonance; GST, glutathione S-transferase; ICAT, isotope-coded affinity tags; IEF, isoelectric focusing; IMAC, immobilized-metal affinity chromatography; IPG, immobilized pH gradient; IT, ion trap; LC, liquid chromatography; LCM, laser capture microdissection; MALDI, matrix-assisted laser desorption/ionization; MS, mass spectrometry; MS/MS, tandem mass spectrometry; MudPIT, multidimensional protein identification technology; NanoLC, nanorange liquid chromatography; NEPHGE, non-equilibrium pH gel electrophoresis; PAGE, polyacrylamide gel electrophoresis; Q, quadrupole; SELDI, surface-enhanced laser desorption/ionization; SNP, single nucleotide polymorphism; TOF, time-of-flight.

to the relatively simple chemical nature of nucleotides,

such techniques are ‘digital’ in nature, being suitable for

measuring the qualitative presence or absence of specific

nucleotides or the levels present of transcripts. They are, thus,

appropriate for high-throughput analyses using the DNA chip

technology.(5) The development of new techniques is continu-

ing in order to bring them closer to the functional levels of the

proteins. An example of such an approach is the current effort

to selectively measure transcripts that are actively being

translated by purifying and analyzing polysome-bound tran-

scripts.(6) High-throughput transcriptome analyses have been

performed for a number of years and many good results have

been obtained, although the microarray technique still has

some shortcomings such as a lack of high-level intralaboratory

reproducibility.(7) A full review of transcriptomic techniques is

beyond the scope of the present article; however, there are

many good reviews that discuss the strengths and weak-

nesses of such analyses (e.g. Refs. 8–9).

Although genes and transcripts are relatively easy to

analyze using high-throughput techniques, it is evident that

these techniques do not reveal the molecules that directly

function in the cell, namely the protein molecules (Fig. 1). The

proteomic techniques are supposed to address this defect by

directly focusing analyses on the protein molecules on a large

scale. However, proteomic techniques face huge challenges,

since the protein molecules possess greater individual

chemical variation than nucleic acids thereby making these

techniques less suitable for high-throughput analyses than

genomic and transcriptomic techniques. At present, no single

proteomic technique exists that fully serves this purpose.

Several techniques are currently being developed that, in

certain combinations, may approach a high-throughput level.

This review will focus on the proteomic techniques, their

strengths and weaknesses.

Proteomics

Proteomics can be defined as the discipline that details the

proteome ideally by analyzing the levels and structure of all

proteins present, including their post-translational modifica-

tions that take place in the lifetime of a cell or a tissue.(10) The

proteome is a dynamic entity, thereby possessing an enormous

complexity. In addition, proteins may be expressed at levels

that can vary from five orders of magnitude in yeast cells(11) to

about ten orders of magnitude in humans.(12) At present, two

techniques are absolutely central for proteomic analyses:

(1) two-dimensional polyacrylamide gel electrophoresis (2D-

PAGE), which can separate thousands of proteins in a few

steps, and (2) mass spectrometry (MS) for the identification of

proteins and their post-translational modifications.(13,14) Most

proteins function in a physiologic context by interacting with

other proteins. There are several established and emerging

techniques that can be used to identify interacting proteins

such as peptide/protein arrays, phage display and the yeast

two-hybrid technique. These will be discussed later under

the umbrella term “recent developments in other proteomic

technologies”.

Figure 1. Overview of the transfer of information

from the sequence in the genes to the functioning

proteins of the cell (the central dogma) with the

possible control mechanisms indicated. A gene

(DNA, red) is transcribed (step 1) to pre-mRNA

that may be edited (step 2) and then processed

(step 3) to one mRNA or by alternative splicing to

several forms of mRNAs (blue). The mRNAs

are transported (step 4) out of the nucleus to the

cytosol. In the cytosol, the mRNA may be

degraded (step 5), or translated (step 6) into

protein (green). Protein activity is controlled

(step 7). Proteins may be synthesized as inactive

forms that are later reversibly or irreversibly

activated or, alternatively, they may be synthe-

sized as active proteins that are later inactivated.

Proteins are the ultimate operating molecules

producing the physiologic effect (step 8) in virtually

every mechanism in the cell. Reprinted with

permission from Honore B & Østergaard M.

Transcriptomics and proteomics: integration?

Nature Encyclopedia of the Human Genome,

Vol. 5, 579–584 (2003) Nature Publishing Group.

Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE)

Classical isoelectric focusing (IEF) and non-equilibrium pH gel electrophoresis (NEPHGE) techniques with ampholytes

The 2D-PAGE technique, invented more than a quarter of a

century ago by O’Farrell and Klose,(15,16) separates proteins in

the first dimension according to the isoelectric point using a pH

gradient and in the second dimension according to the

molecular mass. It is now possible to separate up to 10,000

proteins(17) with high and unprecedented resolution. Many

variations of the technique have been presented. The early

work was based on the use of carrier ampholytes to establish

the pH gradient in the first dimension, isoelectric focusing (IEF)

for acidic and neutral proteins(15) and non-equilibrium pH gel

electrophoresis (NEPHGE) for basic proteins.(18) However,

the ampholyte technique has been difficult to implement

because it is labor intensive and it has been very difficult to

achieve reproducibility due to variations in different batches of

ampholytes used to create the pH gradient in the first

dimension. Only laboratories where a significant number of

gels are run with careful titration of the ampholytes have been

able to produce high-quality, reproducible gel images.(19) The

technique has developed slowly over the years, although

improvements have been made, especially by introducing the

immobilized pH gradient gel (IPG) system. The reader is

referred to the many good recent reviews available (e.g. Refs.

20–22). Only some of the major improvements will be

discussed below.

Immobilized pH gradient gel (IPG) system

Reproducibility has improved greatly with the introduction of

the IPG system in the first dimension, because the pH gradient

is permanently fixed within the polyacrylamide matrix from

the manufacturer. The technique has therefore become

available for the general scientific community.(21,23,24) The

ampholyte system and the IPG system may resolve a similar

number of proteins when performed within the same labora-

tory(25) and the IPG system has a higher loading capacity.

Even though the gel images of the two systems may look

similar, detailed studies show that each protein migrates

slightly differently and unpredictably in each of the two

systems(26) so that the identity of a protein cannot be deduced

solely from its position in the gel but must be identified directly

from the gel by other means (see later).

Although the introduction of the IPG system was a major

improvement, there are still some problems with respect to the

separation of the very basic proteins, which were traditionally

separated with the NEPHGE system.(18) The IPG system

gradients, with pH ranges 6–9, 6–11 or 7–10, did not

generally separate the proteins as well as the acidic gels.(13)

However, reproducible patterns have now been achieved by

Gorg et al. using a system with pH gradients from 4 and up to

12.(27) The currently commercially available strips are listed

in Table 1.

Narrow pH range gels (IPG system)

A problem with the 2D gel system is that high-abundance

proteins may co-migrate and overshadow low-abundance

proteins, making these difficult to detect. A solution to this was

the introduction of narrow-range pH gradients covering the

acidic side in steps of e.g. 1 pH unit.(28–31) The number of

proteins that can be resolved with such a system may increase

to more than 10,000 proteins(13) although the amount of work

involved in running several gels increases substantially. The

narrow-range strips commercially available at present are

listed in Table 1.

Sample complexity and preparation for 2D-PAGE

An absolutely crucial question for the quality and reproducibility

of 2D-PAGE analysis concerns sample complexity and

preparation. One special problem that has not been addressed

systematically is the analysis of tissue samples since they are

more complex to analyze than cell cultures due to the presence

of many different types of cells. Laser capture microdissection

(LCM)(32) isolating selected cells may aid in this respect. But

even when one type of cells is analyzed, high-abundance

proteins may still hinder analysis of low-abundance proteins

due to co-migration. In order to decrease the number of pro-

teins to be analyzed, samples can be prefractionated,(33–35)

although introduction of several manipulations increases the

chances that proteins may be degraded or artificially modified.

A shortcoming of 2D-PAGE is that it is very difficult to analyze

membrane proteins with the technique(36) and attempts to

resolve this issue are being conducted.(37–41)

Protein-detection methods

The proteins need to be visualized either by protein-staining

techniques performed after gels have been run or by protein-

labeling methods used prior to gel electrophoresis. The ideal

visualization technique would have a high detection sensitivity

and a broad linear dynamic range and be compatible with

methods for further identification and analyses of the

proteins.(42) An ideal reagent is not yet available; however,

fluorescent probes represent the best available option. Some

of the currently available reagents are listed in Table 1.

Protein-staining techniques. The classical staining with

Coomassie Brilliant Blue has been commercially available for

about 40 years. It is clearly inadequate for the detection of low-

abundance proteins with a detection limit of about 100 ng.(42)

Colloidal Coomassie Brilliant Blue has a lower detection limit

but also a low range of linearity. Silver nitrate possesses a

rather low detection limit of about 1–10 ng(43) and may thus

detect more of the low-abundance proteins. Refined silver-

staining techniques are compatible with subsequent MS

analysis.(44,45) The advantages of the silver-staining technique

are the simplicity of the handling and storage of the gels.

Silver-stained gels may be dried between cellophane sheets and

stored in a dry condition for several years, analyzed at any time

and, when convenient, proteins can be excised for identification

from the dry gels or after the gels are rehydrated at a still

later time. Drawbacks, however, are the relatively low dynamic

range of linearity (1–60 ng),(46) saturation of high-abundance

proteins and even a tendency for negative staining of strong

spots. Promising alternatives to silver staining are the

fluorescent SYPRO dyes.(47) SYPRO Ruby, SYPRO Orange

and SYPRO Red stainings are about as sensitive as silver

staining but are more linear(48) and therefore more suitable for

quantitative studies. However, gels stained with fluorescent

dyes need to be scanned in the wet condition shortly after the

gels are stained. In addition, the protein spots of the wet gels

need to be excised from the gel using equipment suited for the

purpose, e.g., a spot cutter equipped with a

fluorescence-detection system.

Protein labeling techniques. Instead of staining the gels, it

is possible to label proteins prior to running the gels. If

compatible with the samples (e.g. cultured cells) proteins can

be radioactively labeled. It is then possible to quantify protein

spots absolutely provided that the specific activity of a given

cell lysate is calculated and the radioactivity of a given spot

with a known number of methionine residues is measured.(49)

However, this approach is labor intensive as it requires

excision of the spots from the gel and counting in a scintillation

counter. In addition, it may not always be convenient to

radioactively label proteins if tissue samples are studied. In

such cases, covalent labeling can be performed with

fluorescent dyes, several of which are available.(42)

Monobromobimane tags cysteine residues but suffers from a lack of

linearity.(50) Other promising fluorescent dyes such as propyl-

Cy3 (Cy3) and methyl-Cy5 (Cy5) dyes are only slightly less

sensitive than silver staining. They bind to the free amine

groups of lysine residues and have a wide linear detection

range of about three or four orders of magnitude.(42) Thus,

they are excellent for measuring relative differences in

Table 1. Two-dimensional gel electrophoresis tools

Columns (left to right; blank cells are not shown in the rows below): Vendor | IPG gel strips, pH range(a) | Pre-cast 2nd dimension gels(a) | Coomassie Brilliant Blue/Colloidal Coomassie Blue(b) | Silver staining(c) | Fluorescent labeling/staining

Amersham Biosciences | 4–7; 6–9; 6–11; 7–11 NL(d); 3–10; 3–10 NL; 3–11 NL; 3.5–4.5; 3.0–5.6 NL; 4.0–5.0; 4.5–5.5; 5.0–6.0; 5.3–6.5; 5.5–6.7; 6.2–7.5 | Homogeneous 12.5%; Gradient 12–14% | Available | Available | CyDye™ DIGE Fluor labels (Cy2, Cy3 and Cy5); Deep Purple™ gel stain

Bio-Rad | 3–6; 4–7; 5–8; 7–10; 3–10; 3–10 NL; 3.9–5.1; 4.7–5.9; 5.5–6.7; 6.3–8.3 | Homogeneous 10%, 12%; Gradient 10–20%, 8–16% | Available | Available

Genetix | 3–6; 5–8; 7–10; 3–10; 3–10 NL; 3.9–4.9; 4.7–5.7; 5.5–6.5; 6.3–7.3; 7.2–8.2; 8.0–9.0; 8.8–9.8; 9.5–10.5 | Available | Available

Invitrogen | 4–7; 6–10; 3–10 NL; 4.5–5.5; 5.3–6.3; 6.1–7.1 | Gradient 4–12%, 4–20% | Available | Available

Servern Biotech | 3–6; 5–8; 7–10; 3–10; 3–10 NL; 3.9–4.9; 4.7–5.7; 5.5–6.5; 6.3–7.3; 7.0–8.0; 8.8–9.8; 9.5–10.5 | Available

Sigma-Aldrich | 3–5; 4–7; 5–8; 8–11; 6–11; 3–10 | Available | Available

Molecular Probes | SYPRO® Ruby gel stain; Pro-Q® Diamond Phosphoprotein gel stain; Pro-Q® Emerald 300 Glycoprotein gel stain; Pro-Q® Amber Transmembrane Protein gel stain

(a) Available in different sizes.

(b) Several reagents available. Only some vendors offer Colloidal Coomassie Blue stains.

(c) Several kits available. Some may not be compatible with MS technology.

(d) Non-linear.

concentrations between two samples using the 2D difference

in-gel electrophoresis system (2D-DIGE, see below).(51)

Gel-based differential expression proteomics

One important property of the gel-based techniques is the

possibility of quantitatively comparing the proteins of one

group of cells or tissue with the proteins of another group of

cells or tissue, be it normal versus transformed (e.g. cancer),

undifferentiated versus differentiated or non-stimulated cells

versus cells stimulated with a certain substance (e.g. a

cytokine or a drug). With such analyses, the amount of a

given protein in each gel is quantified by the relative amount of

the protein versus the total amount of protein detected in the

gel. Usually measurements are performed by scanning the gel

images in an appropriate densitometer scanner for stained

spots, e.g. Coomassie or silver stained, with a suitable device

for fluorochromes or with a phosphoimager for radiolabeled

samples. Each pixel in the gel is thus assigned an absorbance

value, fluorescence value or radioactivity value that ideally is

proportional to the concentration of protein present in the pixel.

Computer software suitable for analysis of such 2D images

includes Melanie, PDQuest, Bio Image, DeCyder and Image

Master 2D Elite image analysis software. These software

programs are able to assign protein spots in the gels, to

calculate the integrated absorbance or fluorescence and

thereby the relative concentration of protein in each spot

detected by the software. By summing the values of each pixel

contained in the spot, the volume may be calculated for a given

spot. The spot volume may then be divided by the sum of

volumes of all spots detected in the gel and the resulting

volume expressed relatively as a percentage spot volume of

the total gel volume. Although computer software is available

for this type of analysis, it generally requires a significant

amount of manual editing to reliably analyze gels. Each gel will

contain a number of artifacts that have to be dealt with in the

analysis and it is also necessary to critically evaluate how the

gels are aligned in order to obtain reliable results.
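The normalization described above, in which each spot volume is expressed as a percentage of the total spot volume in the gel, can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the pixel values are hypothetical, spot detection and background subtraction are assumed to have been done already, and the sketch does not reproduce the algorithm of any of the software packages named above.

```python
def spot_volume(pixels):
    """Integrated intensity of one spot: the sum of its pixel values."""
    return sum(pixels)


def percent_spot_volumes(spots):
    """Map each spot id to its percentage of the total detected spot volume."""
    volumes = {spot_id: spot_volume(px) for spot_id, px in spots.items()}
    total = sum(volumes.values())
    if total == 0:
        raise ValueError("no signal detected in any spot")
    return {spot_id: 100.0 * v / total for spot_id, v in volumes.items()}


# Hypothetical, already background-subtracted pixel values for three spots.
example_gel = {
    "spot_a": [10, 20, 30],   # volume 60
    "spot_b": [5, 5, 10],     # volume 20
    "spot_c": [40, 40, 40],   # volume 120
}

percent = percent_spot_volumes(example_gel)
for spot_id, value in percent.items():
    print(f"{spot_id}: {value:.1f}% of total spot volume")
```

Because every value is relative to the total signal in its own gel, the percentages of different gels can be compared even when the gels differ in overall loading or staining intensity, which is the reason for this normalization.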

By comparing two groups of proteins, it is possible to

determine those proteins that are differentially regulated, i.e.

upregulated or downregulated by a certain factor. It is up to the

researcher to define what level of differential regulation is

considered to be of biological significance. As the sensitivity

of these techniques improves, it will be possible to detect

more subtle changes that may or may not be of biological

importance. Due to the inherent limited reproducibility of the

2D-gel system at present, it is necessary to run several gels in

each group in order to pinpoint proteins that are significantly

differentially regulated.(52)
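As a rough illustration of why several gels per group are needed, the sketch below compares the percentage spot volumes of one protein across replicate gels of two groups. It is an assumption-laden example, not the procedure of Ref. 52: the replicate values, the two-fold cut-off and the cut-off on Welch's t statistic are all hypothetical choices that, as stated above, the researcher has to define.

```python
from math import sqrt
from statistics import mean, variance


def fold_change(control, treated):
    """Ratio of the mean percentage spot volumes (treated / control)."""
    return mean(treated) / mean(control)


def welch_t(control, treated):
    """Welch's t statistic; needs at least two replicate gels per group."""
    se = sqrt(variance(control) / len(control) + variance(treated) / len(treated))
    return (mean(treated) - mean(control)) / se


def is_candidate(control, treated, min_fold=2.0, min_abs_t=3.0):
    """Flag a spot that changes by at least min_fold in either direction
    and whose change is large relative to the gel-to-gel variation."""
    fc = fold_change(control, treated)
    regulated = fc >= min_fold or fc <= 1.0 / min_fold
    return regulated and abs(welch_t(control, treated)) >= min_abs_t


# Hypothetical percentage spot volumes of one protein in four gels per group.
control = [0.10, 0.12, 0.11, 0.09]
treated_up = [0.25, 0.22, 0.27, 0.26]
treated_same = [0.11, 0.10, 0.12, 0.10]

print(is_candidate(control, treated_up))
print(is_candidate(control, treated_same))
```

A full analysis would convert the t statistic into a p-value and correct for the large number of spots tested; both steps are deliberately left out of this sketch.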

Recently, a novel principle was introduced to analyze two

different cell populations by two-dimensional difference in-gel

electrophoresis (2D-DIGE).(51,53–57) The principle is shown in

Fig. 2A. The great advantage of this setup is that one sample is

labeled with one of the dyes, e.g. Cy3, the other sample with

the other dye, e.g. Cy5, and the samples are then mixed to be

run on the same 2D-gel thereby eliminating any inter-gel

variations. Identical proteins from each pool migrate to exactly

the same position. By using red and green dyes, the proteins

that appear in equal concentrations in the two samples

become yellow (Fig. 2B, arrows). Those upregulated in one

of the samples are red or reddish (Fig. 2B, black arrowheads)

and those upregulated in the other are green or greenish

(Fig. 2B, white arrowheads). The experimental variation of

Figure 2. Principle of the two-dimensional difference in-gel

electrophoresis system (2D-DIGE). A: Normal tissue is labeled

with Cy3 and pathological tissue (e.g. cancer) is labeled with

Cy5. The dyes label 1–2% of the proteins present. The labeled

solutions are mixed and analyzed on the same 2D gel. After gel

electrophoresis, the gels are scanned with a fluorescent

scanner able to detect either the Cy3- or the Cy5-staining

patterns. B: By superimposing the two images, it is possible to

visualize differentially expressed proteins. Proteins upregulated

in one sample may appear as red or reddish (Cy3, black

arrowheads), those upregulated in the other sample as green or

greenish (Cy5, white arrowheads) and those that are at the

same level as yellow (white arrows). Panel B is modified with

permission from Van den Bergh G et al. Fluorescent two-

dimensional difference gel electrophoresis and mass spectro-

metry identify age-related protein expression differences for

the primary visual cortex of kitten and adult cat. J Neurochem

2003;85:193–205, Copyright (2003) Blackwell Publishing.

the 2D-DIGE system can be further reduced by adding a

standard pool of proteins labeled with a third dye, for example

Cy2, to all samples analyzed. The results in each gel are

thereby measured relative to the standard thus reducing

the experimental variation substantially.(58)
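The effect of the internal standard can be illustrated with a small calculation. In this sketch, each sample spot volume is divided by the volume of the same spot in the co-run standard channel before gels are compared. The function name and all volumes are hypothetical, and the example assumes that the same pooled standard was loaded on both gels.

```python
def standardized_abundance(sample_volume, standard_volume):
    """Sample spot volume relative to the same spot in the internal standard
    (e.g. the Cy2 channel) run on the same gel."""
    if standard_volume <= 0:
        raise ValueError("internal standard volume must be positive")
    return sample_volume / standard_volume


# Hypothetical volumes of one protein spot on two separate gels.
# Gel 2 gives twice the overall signal of gel 1, as the standard shows.
gel_1 = {"sample": 300.0, "standard": 150.0}    # e.g. normal tissue
gel_2 = {"sample": 1200.0, "standard": 300.0}   # e.g. diseased tissue

raw_ratio = gel_2["sample"] / gel_1["sample"]
standardized_ratio = (
    standardized_abundance(gel_2["sample"], gel_2["standard"])
    / standardized_abundance(gel_1["sample"], gel_1["standard"])
)

print(f"raw ratio between gels: {raw_ratio:.1f}")
print(f"ratio relative to the internal standard: {standardized_ratio:.1f}")
```

In this made-up case the raw four-fold difference shrinks to two-fold once the gel-to-gel difference in overall signal is divided out, which is the kind of experimental variation the shared standard is meant to remove.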

The 2D-DIGE technique is fast and reproducible, but some

technical details need to be considered such as (1) critical

labeling conditions, (2) proteins without lysine residues are not

labeled, and (3) labeled proteins are 0.5 kDa higher in molecular

mass than the unlabeled proteins, resulting in different positions of

labeled and unlabeled proteins.(53) This latter issue is

important since the proteins are deliberately labeled at

subsaturating conditions so that only 1–2% of the proteins

are fluorescently labeled. It is the unlabeled protein that should

be used for later analysis. Therefore, it may be necessary to

stain the gel afterwards with different dyes when protein spots

are to be excised for protein identification with mass spectro-

metry. This can be done with silver nitrate(54) or SYPRO

Ruby.(55) Recently, efforts to overcome this problem have

been investigated by using a second set of Cy dyes con-

structed to label cysteine residues at saturating conditions(56)

instead of lysine residues. This offers some advantages with

higher sensitivity since the proteins are completely labeled

and, in addition, it avoids the need for post-quantification

staining of proteins with other substances like silver or SYPRO

Ruby.(56) However, the fact that the dyes react with cysteine

residues instead of lysine residues has the effect that the gel

images appear significantly different from the silver-stained

and fluorescently labeled image under nonsaturating condi-

tions.(56) An additional consideration concerning the applica-

tion of this technique is that 13% of eukaryotic proteins do not

contain cysteines(59) and, therefore, will not be labeled at this

residue although other residues may be labeled to a certain

extent as observed with myoglobin.(56)

Differentially regulated proteins may then be identified

with appropriate techniques for protein characterization and

identification. Previously, this was done mostly by Edman

sequencing, a technique that is still being developed.(60) At

present, however, the state-of-the-art method is mass

spectrometry (MS) (see below).

Two-dimensional gel databases

2D gel databases are databanks that make use of 2D-PAGE

as the core technology. These are typically constructed with a

hyperlinked 2D reference gel representing the sample being

studied, whether an organism, a cell line or a tissue. Detected

spots on the reference gel are annotated with information

about that particular protein, e.g., identity, molecular mass and

pI, quantity, cellular localization, response to treatment with

various effectors, etc. Thus, the main goal of these databases

is to catalog all proteins that can be resolved on a 2D gel from a

given sample and, additionally, to attach as much information

about the individual proteins as possible.

In an effort to link protein information with DNA sequence

information from the genome projects, a number of compre-

hensive 2D gel databases have been constructed over the last

couple of decades. The list of available databases includes

various tissues, cell types, fluids and cell lines. Several of

these databases have been constructed to support the study

of human diseases, see The WORLD-2DPAGE:

http://www.expasy.org/ch2d/2d-index.html. Following the completion of

The Human Genome Project, these databanks are expected

to be highly useful tools in annotating the human genome, and

pinpointing those genes related to disease.(61)

Concluding remarks on 2D-PAGE

Although the 2D-PAGE technique can be improved, at present

it is still superior to other techniques when it comes to high

resolution separation of several proteins,(62) especially when

quantitative data rather than qualitative data are needed.(22) In

addition, the coupling of 2D-PAGE with other methods

strengthens the technique, e.g. immunoprecipitation for identi-

fication of protein–protein interactions and Western blotting

for characterization of antibody specificity or identification of

splicing variants or post-translational modifications of pro-

teins. In addition, 2D-PAGE has a 100% sequence coverage

giving it the ability to monitor unknown post-translational

modifications that change the migration of proteins, mostly in

the pI direction. This is not the case with the MS method where

some sort of qualified estimate of the chemical nature of the

modification is necessary in order to make proper analyses

(see later). The 2D-PAGE technique, however, cannot stand

alone. It is especially strong when combined with MS for

protein identification.

Mass spectrometry (MS)

Basic principle

The mass spectrometry technique has developed strongly over

the past decade(63–65) becoming the method of choice for

protein identification. This has been possible due to the parallel

development of high-quality equipment and the accumulation of

an immense amount of information in DNA and protein

databases. Several combinations of instrumentation may be

used. Each has its advantages and limitations. Mass

spectrometers are composed of an ion source that brings the

molecules into ionized form in a gas phase, a mass analyzer that

measures the mass-to-charge ratio (m/z) of the molecules and a

detector that measures the number of ions at each m/z value.

Most commonly there are two ways of volatilizing and

ionizing proteins or peptides. One is matrix-assisted laser

desorption/ionization (MALDI) where the peptides/proteins

are co-crystallized with matrix molecules as dry samples on a

plate. A laser pulse brings the molecules into an ionized gas

phase (Fig. 3A). Usually this procedure gives singly charged

ions that are subsequently analyzed in the mass spectrometer

(MALDI–MS) (Fig. 3B). The other method is electrospray

ionization (ESI) used to analyze molecules in solution. ESI can

produce positive as well as negative ions. Usually the positive

ions are analyzed. Characteristically, ESI results in multiply

charged ions, thereby lowering the m/z value. As ESI works on

molecules in solution, it may easily be combined with liquid

chromatography (LC) techniques thereby applying an addi-

tional separation step prior to MS analysis (LC-MS).
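The statement that multiple charging lowers the m/z value can be made concrete with a short calculation. The sketch below assumes positive-ion mode in which every charge comes from one added proton; the function names and the 2000 Da example peptide are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton


def mz(neutral_mass, charge):
    """m/z of a molecule of the given neutral mass (Da) carrying
    `charge` extra protons, i.e. the [M + zH]z+ ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value, charge):
    """Invert mz(): recover the neutral mass from an observed m/z and charge."""
    return charge * (mz_value - PROTON_MASS)


peptide_mass = 2000.0  # Da, hypothetical peptide
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mz(peptide_mass, z):.4f}")
```

The same peptide that a MALDI source would typically show as a singly charged ion near m/z 2001 appears near m/z 1001 and 668 when it carries two or three charges, which brings large molecules into the working m/z range of the mass analyzer.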

The mass analyzer, a key feature of the instrument, uses

electric and/or magnetic fields for a mass-dependent handling

of the ions. Currently, the main types used in proteomic re-

search are time-of-flight (TOF),(66) quadrupole (Q), ion trap

(IT) and Fourier transform ion cyclotron resonance

(FT-ICR-MS)(67) analyzers. Each has its strengths and weaknesses

and can be used alone or in combination to improve per-

formance. In TOF instruments, the ions are accelerated to a

high kinetic energy and are then, due to differences in

velocities, separated in a flight tube before reaching the

detector where they are counted. The quadrupole selects ions

by time-varying electric fields between four rods that permit

Figure 3. Principles of mass spectrometry.A:MALDI–TOFmass spectrometer. The sample is co-crystallized with matrix molecules as

a dry sample on the plate. The peptides are brought to an ionized gas phase by a laser pulse.B: The ionized peptides are analyzed in the

time-of-flight (TOF) unit in the mass spectrometer giving a peptide mass fingerprint. C: If the sample is pure enough, the peptide mass

fingerprint can be used to search DNA and protein databases for identification.D: Tandemmass spectrometry (MS/MS) as obtained by a

Q-TOF mass spectrometer. The sample is ionized at atmospheric pressure by electrospray ionization (ES source). The ions enter the

vacuum system through the sampling cone and, in the quadrupole section, ions of a particular m/z are selected and fragmented in the

collision cell. E: In the collision cell, peptides are mainly fragmented at the peptide bonds producing b type (blue) and y type (red) ions.

The masses of the resulting peptide fragments are measured in the TOF unit. F: Example of a collision-induced spectrum with the amino

acid sequence given as detected from the N-terminal (b type ions) and from the C-terminal (y type ions). The mass fingerprint (C) is reprinted

with permission from Honore B. Genome- and proteome-based technologies: status and applications in the postgenomic era. Expert Rev

Mol Diagn 2001;1:265–274, Copyright (2001) Future Drugs Ltd.

ions of a specific m/z value to pass through. The IT instrument

captures the ions for a certain time interval before they are

subjected to MS analysis. IT instruments are robust and

sensitive but possess a relatively low mass accuracy. The FT-ICR-MS

instrument uses a strong magnetic field in high

vacuum to trap ions before fragmentation and detection.

The apparatus possesses a very high potential perfor-

mance although it is expensive and complex to operate and

therefore not routinely used. The set-ups currently used

comprise MALDI together with time-of-flight (MALDI–TOF),

and liquid chromatography (LC) in combination with an ion trap

or a hybrid consisting of a quadrupole and a time-of-flight unit,

Q-TOF.(68)
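
The mass dependence of the flight time in a TOF analyzer can be sketched numerically. The example below is a simplified model, assuming an ideal linear, field-free flight tube in which the energy zeU gained during acceleration is entirely kinetic; the 20 kV accelerating voltage and the 1 m tube length are illustrative assumptions, not instrument specifications.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulomb
DALTON = 1.66053906660e-27  # kilograms per dalton


def flight_time(mz, charge, voltage, tube_length):
    """Return the flight time in seconds through a field-free tube.

    mz is the m/z value, charge the number of charges z, voltage the
    accelerating potential in volts and tube_length the length in metres.
    """
    mass_kg = mz * charge * DALTON            # ion mass recovered from m/z and z
    kinetic_energy = charge * E_CHARGE * voltage
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return tube_length / velocity


# Assumed settings: 20 kV acceleration and a 1 m flight tube.
for mz in (1000.0, 2000.0, 4000.0):
    print(mz, flight_time(mz, 1, 20000.0, 1.0))
```

In this model the flight time scales with the square root of m/z, so an ion with a four-fold higher m/z takes twice as long to reach the detector.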

Matrix-assisted laser desorption/ionization–time-of-flight (MALDI–TOF)

MALDI–TOF has been popular for the identification of proteins

isolated with 1D- or 2D-PAGE. This technique’s strength

resides in the rapid identification of proteins. No purification

step, except for gel electrophoresis, is used prior to analysis.

This means that relatively pure protein samples are needed

for the analysis. Proteins excised from gels are subjected to

enzymatic digestion, mostly tryptic, and the resulting peptides

obtained are analyzed by MS, thereby producing a peptide

mass fingerprint (Fig. 3C) that can be used to search the DNA

and protein databases for deduced peptide fragments that

match those measured. The sample tested should be pure and

originate from a single protein. If more proteins are present or

the sample is contaminated with keratins due to human error,

interpretation of the results becomes difficult or impossible.

Keratin contamination may be minimized by careful handling

of gels and samples, using gloves and sterile hoods as much

as possible and excising spots with a spot cutter. However,

even if the proteins are separated by 2D-PAGE, it is known that

some spots may contain more than one protein.(26,54) In such

cases, MS can be combined with additional purification steps

such as liquid chromatography (LC).
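
The database search behind a peptide mass fingerprint can be sketched as an in silico digestion followed by mass matching. The code below is a minimal illustration rather than the algorithm of any particular search engine: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, ignores missed cleavages and modifications, and uses a hypothetical two-entry database with made-up sequences.

```python
# Monoisotopic residue masses in Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Return m/z of the singly protonated peptide, [M + H]+."""
    return sum(RESIDUE[a] for a in peptide) + WATER + PROTON


def count_matches(measured, sequence, tol=0.2):
    """Count measured masses that match a theoretical tryptic peptide."""
    theoretical = [mh_plus(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in measured)


def best_hit(measured, database, tol=0.2):
    """Return the database entry explaining the most measured masses."""
    return max(database, key=lambda name: count_matches(measured, database[name], tol))


# Hypothetical database and a fingerprint simulated from the first entry.
database = {"protA": "AAKGGRPLKTTW", "protB": "MSDEKLLVRGGF"}
fingerprint = [mh_plus(p) for p in tryptic_digest(database["protA"])]
print(best_hit(fingerprint, database))
```

A fingerprint derived from two co-migrating proteins or from keratin would add masses that match no single entry well, which is the practical reason why sample purity matters for this kind of search.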

Mass spectrometry combined with liquid chromatography and 2D gel separation

By using nanorange liquid chromatographic (NanoLC) separation

of 2D gel-separated proteins before MS, it is possible to

obtain very pure samples for the MS analysis. The LC step is

usually performed as a reversed-phase step, e.g. containing

PepMap C18 material using a mobile phase consisting of a

gradient of, for instance, a low-to-high concentration of

acetonitrile.(53,69,70) By using a short gradient over a few

minutes, high peptide concentrations are obtained in narrow

peaks giving identification of virtually every protein that is

detectable or barely detectable on 2D gels with the currently

used fluorescent stains.(70)

After chromatographic separation, the peptides are ana-

lyzed by the Q-TOF unit (Fig. 3D). In survey scan mode,

peptides eluting from the column and detected in the MS can

trigger a switch to MS/MS mode where the quadrupole selects

a peptide with a defined m/z, which is then fragmented in the

collision cell. The peptide is mostly fragmented at the peptide

bonds resulting in b-type ions representing the N-terminal

sequence and y-type ions representing the C-terminal

sequence (Fig. 3E). The peptides are subsequently analyzed

in the TOF unit thereby producing a fragmentation spectrum or

a collision-induced dissociation (CID) spectrum (Fig. 3F),

giving a de novo sequence of the peptide.
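
How a fragmentation spectrum leads to a sequence can be illustrated by computing the b- and y-ion series of a peptide and reading the residues back from the mass differences between neighbouring ions. This is a minimal sketch assuming singly charged fragments, unmodified residues and an ideal, complete ion series; the function names and the example peptide are hypothetical.

```python
# Monoisotopic residue masses in Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return the singly charged b- and y-ion m/z series of a peptide."""
    masses = [RESIDUE[a] for a in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]           # N-terminal
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]  # C-terminal
    return b_ions, y_ions


def read_ladder(ions, tol=0.01):
    """Translate differences between consecutive ions of one series into residues."""
    residues = []
    for lighter, heavier in zip(ions, ions[1:]):
        delta = heavier - lighter
        hits = [a for a, m in RESIDUE.items() if abs(m - delta) <= tol]
        residues.append("/".join(hits) if hits else "?")
    return residues


b_series, y_series = fragment_ions("GALK")  # hypothetical peptide
print(read_ladder(b_series))  # residues read from the N-terminal side
print(read_ladder(y_series))  # residues read from the C-terminal side
```

Leucine and isoleucine have identical masses, so the sketch reports them as an ambiguous pair; real spectra additionally contain gaps and noise that this idealized example does not model.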

Gel-independent mass spectrometry

MS combined with chromatography. Due to the problems

that may result from the use of gels prior to MS analysis, there

have been attempts to perform MS without 2D-gel separation.

One solution is to expand the chromatographic step so that the

reversed-phase separation is supplemented with, for exam-

ple, a strong cation exchange column (2D chromatography)

and an affinity column, which may contain avidin (3D chromatography).

Such a combination of several chromatographic

steps in conjunction with mass spectrometry analysis is termed

multidimensional protein identification technology (MudPIT)

and was introduced by Yates and co-workers.(71) By avoiding

the gel step, the method is capable of analyzing several

proteins otherwise missed in the gel step: low-abundance

proteins, membrane proteins, small proteins (<10 kDa), large

proteins (>180 kDa) and proteins with extreme pI values.(72) It

is thus a strong technique for qualitative identification of the

proteins in a sample. However, it is not useful for quantitative

aspects and, furthermore, it requires the smooth operation of a

nanospray for extended time periods without clogging.(22)

The principle behind other chromatographic procedures

is to select for the presence of specific modified peptides,

e.g. glycopeptides(73) or phosphopeptides by immobilized-

metal-affinity chromatography (IMAC).(74,75) Also, it is possi-

ble to enrich for the presence of specific amino acids, e.g.

cysteine residues(76) or a specific part of the protein, e.g.

N-terminal peptides.(77)

Surface-enhanced laser desorption–ionization (SELDI)

affinity technology. The SELDI technology was developed

about 10 years ago and advances have been described in

several reviews (see Ref. 78 and references therein). Funda-

mental to this technique is that the surface plays an active role

in the extraction, fractionation and purification of the proteins

followed by MS identification. The technique has among other

things been used for identifying a suppressing factor of HIV-1

replication and for early detection of ovarian cancer.

Isotope-coded affinity tags (ICAT) for quantification of peptides

In general, the intensity of a peptide ion signal measured by MS

does not accurately reflect the amount of peptide present. MS,

therefore, is intrinsically less quantitative than 2D-PAGE. To

improve the quantitative characteristics of MS, isotope

labeling techniques have been introduced. These techniques

imply that peptides of identical chemical nature, only differing

in mass because of differences in isotopic composition, are

expected to produce identical signals in a mass spectrometer.

With the isotope-coded affinity tag (ICAT) technique,(79)

samples are labeled with an alkylating group, iodoacetic acid,

which is covalently attached to reduced cysteine residues in

the protein. This is coupled to a polyether linker and a biotin

affinity tag. The linker may contain eight hydrogen atoms

(light version) or eight deuterium atoms (heavy version)

(Fig. 4A). One sample is labeled with the light version and

another sample with the heavy version. The samples are

combined and subjected to enzymatic digestion. The ICAT-

peptides are then enriched by avidin affinity chromatography

and subsequently analyzed by LC-MS/MS. Ideally each

cysteinyl peptide will appear as a pair of signals that differ by

the mass difference of the mass tag, 8 Da, when only one

cysteine is present in the peptide and the ratio between the

two signals will reflect the ratio between the proteins in

the samples from where the peptides are obtained (Fig. 4B).

By labeling cysteine residues, the 13% of the proteins that do not

contain this amino acid are missed.(59) Other isotopes have

been used, for example ¹⁶O or ¹⁸O incorporated from H₂¹⁶O or

H₂¹⁸O by proteolytic digestion(80) and ¹⁵N using metabolic

labeling.(81) The statistical veracity of this approach has yet to

be adequately addressed(22) and it should also be noted that

proteins may be part of larger families of related proteins

produced by processes such as alternative splicing, or formed

by cleavage post-translationally. More than 74% of eukaryotic

genes are expressed as splicing variants.(4) The variants will

share some exons while others will differ so that, after

enzymatic cleavage of the proteins, some of the peptides

analyzed may originate from different proteins thereby blurring

the quantification. Aebersold recently suggested an elegant

solution to this problem on a global scale.(82) For each protein,

protein isoform or specifically modified form of a protein, a

peptide sequence that uniquely identifies that polypeptide

should be selected, chemically synthesized and labeled with

tags of a heavy stable isotope. A given protein sample is then

labeled with a light stable isotope and precisely measured

amounts of the reference peptides are added to the sample

and analyzed by MS. In this way, it will be possible to measure

the amounts of given proteins present in the sample thereby

avoiding the de novo identification and quantification of

proteins in each sample.
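
The pairing of light and heavy ICAT signals can be sketched as a search for peaks separated by the mass of the tag difference. The example is a minimal illustration under stated assumptions: the shift is calculated from eight hydrogen-to-deuterium substitutions per labeled cysteine (about 8.05 Da, quoted as 8 Da in the text), the peak list is invented, and the charge state and number of cysteines are taken as known.

```python
# Mass shift per tag: eight hydrogens (1.007825 Da) replaced by deuterium (2.014102 Da).
TAG_SHIFT = 8 * (2.014102 - 1.007825)


def icat_pairs(peaks, charge=1, n_cys=1, tol=0.02):
    """Return (light m/z, heavy m/z, heavy/light intensity ratio) for each pair.

    peaks is a list of (m/z, intensity) tuples; charge and n_cys are
    assumed to be known for the peptides of interest.
    """
    shift = n_cys * TAG_SHIFT / charge
    pairs = []
    for light_mz, light_int in peaks:
        for heavy_mz, heavy_int in peaks:
            if light_int > 0 and abs(heavy_mz - light_mz - shift) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs


# Invented peak list: one labeled pair and one unrelated peak.
peaks = [(1000.50, 200.0), (1008.55, 400.0), (1200.00, 50.0)]
for light, heavy, ratio in icat_pairs(peaks):
    print(light, heavy, ratio)
```

The intensity ratio of a pair estimates the relative abundance of the parent protein in the two samples; peptides shared between splice variants would contribute to the same pair, which is the blurring effect described above.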

Recent developments in other proteomic technologies

The limitations of the current techniques within proteomics,

especially 2D-PAGE, have required the development of several

different technologies. Here we will only focus on recent

Figure 4. Quantitative mass spectrometry using ICAT

reagents. A: The ICAT reagent consists of an affinity tag (such

as biotin), a mass encoded linker (with either hydrogen, H, or

deuterium, D) and a protein reactive group with, e.g. sulfhydryl-specific

reactivity. The reactive groups of the proteins in the

sample (such as cysteine residues) are labeled separately with

either light (red) or heavy (blue) reagent. B: The two samples

are then mixed and digested with enzymes. The labeled

peptides are affinity purified, quantified and identified in the

mass spectrometer by LC-MS/MS. The spectra are reprinted

with permission from Patterson SD & Aebersold RH. Proteo-

mics: the first decade and beyond. Nat Genet 2003;33

suppl:311–323, Copyright (2003) Nature Publishing Group.

developments in three techniques: peptide/protein arrays,

phage display and the yeast two-hybrid technique. Finally,

we briefly describe the identification of novel kinase substrates

as an example of an area where several proteomic techniques

have been used.

Peptide and protein arrays

Concept of the array technique. Great success was

achieved in the late 1990s with the DNA microarrays within

transcriptomics, creating an enormous amount of data in a

short time. However, these transcriptomic data may be some

distance away from the functional level of the proteins and

this has led to the development of proteomic applications.(83)

The challenges that these peptide/protein arrays face are

significant as highlighted in recent reviews.(84–87) The

problems to be resolved include the development of immobi-

lization strategies and inert surfaces that are suitable for

immobilizing molecules without interference. The main chal-

lenge at present, however, is the availability of quality-tested

molecules for immobilization. Many arrays involve the use of

antibodies because of the availability of several commercial

antibodies. Examples of the principles of the techniques

are depicted in Fig. 5. Suitable antibodies can either be

produced conventionally as polyclonal or monoclonal or with

the phage display technique (see below). Irrespective of the

technique used, antibody specificity and cross reactivity have

to be characterized. Other array techniques make use of

recombinant proteins, either GST fusion proteins or proteins

synthesized with a hexahistidine tag that may be used for

immobilization to a Ni-NTA surface. In addition, the proteins to

be used, either purified or constructed as recombinant, will

have to be tested for proper folding, retained biological activity,

etc. Such analyses are not always straightforward. These

techniques are still in their infancy but their potential is

enormous. Here we will only briefly describe the basic principle

of some examples.

Antibody arrays for diagnostic applications. Antibody

arrays are mostly used for diagnostic applications where the

concentration of a specific protein in a solution of a mixture of

proteins can be determined. The antibodies are spotted on a

solid support (Fig. 5A). The sample is then applied, allowing

the antigens to react with the antibodies. The antigens are

detected with a second labeled antibody. Such tests have

worked for the detection of myeloma proteins using IgG

subclasses(88) and with antibodies against cytokines.(89) The

technology requires the availability of two high-quality anti-

bodies against each protein. The analysis of the release of

cytokines using this approach has been particularly success-

ful, because of the commercial availability of suitable anti-

bodies. Measuring intracellular proteins is much more

complicated as only 5% of over 100 commercial antibodies

are suitable for microarray-based analyses of cellular

lysates.(84) However, it is feasible as has recently been shown

by the description of an array measuring the abundance and

the modification state of intracellular signaling proteins, e.g.

phosphorylations.(90) Antibody arrays may also be used to

measure the levels of specific antigens present in a sample

avoiding the use of a specific second antibody (Fig. 5B). Such

techniques either require a label-free detection method as, for

example, MS or surface plasmon resonance or a chemical

labeling of all the proteins in a sample with a fluorescent dye.

The approach has been used to make a two-color labeling

technique where the antigen in a sample is labeled with Cy5.

The sample is then mixed with a reference sample labeled with

Cy3.(91) The mixture of proteins labeled with Cy5 and Cy3 will

thus compete for binding to the antibody chip and differences

in the concentrations of proteins between the sample and the

reference will be displayed as differences in the colours on the

chip. It has been used for analysis of colon carcinoma cells in

Figure 5. Protein arrays may be produced either by

immobilizing antibodies on chips (A,B); antigens (C), purified

or recombinant proteins (D) and even small molecules for

studying protein ligand interactions (E). A: The antibody

captures the antigen in the sample and is detected with a

fluorescently labeled second antibody. B: In a two-colour

labeling strategy the sample is labeled with one dye, e.g. Cy5

and then mixed with a reference sample that is labeled with

Cy3. C: Labeled antibodies are used to detect specific

immobilized proteins from a sample. D: By immobilizing

specific proteins on a chip, protein-protein interactions may

be demonstrated using labeled proteins. E: By immobilizing

small molecules (ligands) on the chip, it is possible to study

binding activities of macromolecules to ligands. The challenges

with the array technique are the production of high-quality

specific antibodies and functionally intact proteins at a high-

throughput level, a task that is not easily accomplished.

response to ionizing radiation using 146 different antibodies

against various proteins.(92) In a one-color approach, Knezevic

et al.(93) used 368 different antibodies to measure biotinylated

proteins with an enzyme linked colorimetric assay from normal

and cancerous epithelium in the oral cavity.
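
The read-out of the two-color approach can be sketched as a per-spot log ratio of the two fluorescence channels. This is a minimal illustration with invented intensities and hypothetical spot names; background subtraction and channel normalization, which real analyses usually require, are deliberately left out.

```python
import math


def log2_ratios(sample_cy5, reference_cy3):
    """Return log2(Cy5/Cy3) for every spot measured in both channels."""
    ratios = {}
    for spot, cy5 in sample_cy5.items():
        cy3 = reference_cy3.get(spot)
        if cy3 is None or cy3 <= 0 or cy5 <= 0:
            continue  # skip spots without a usable signal in both channels
        ratios[spot] = math.log2(cy5 / cy3)
    return ratios


# Invented intensities for three hypothetical antibody spots.
sample = {"antibody_1": 400.0, "antibody_2": 100.0, "antibody_3": 50.0}
reference = {"antibody_1": 100.0, "antibody_2": 100.0, "antibody_3": 200.0}
for spot, value in log2_ratios(sample, reference).items():
    print(spot, value)
```

A positive value indicates more antigen in the Cy5-labeled sample than in the Cy3-labeled reference, a negative value the opposite, and zero an equal amount.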

Antigen arrays for diagnostic applications. Instead of

immobilizing antibodies on the chip, the proteins in a sample

may be immobilized and specific proteins can be detected

with labeled specific antibodies (Fig. 5C). The technique has

been used to monitor normal and cancerous tissue from

prostate gland lysates.(94) The approach may further be used

to monitor the presence of antibodies in a sample by reacting

against immobilized antigens, e.g., for diagnosis of the pre-

sence of antibodies in autoimmune diseases(95) or IgE anti-

bodies against specific allergen proteins or other peptides and

small molecules.(96)

Arrays for basic research. The above-mentioned arrays are

mostly used for diagnostic purposes. Other types of arrays can

be used in basic research for functional studies and drug target

identification. By immobilizing specific proteins on the chip,

protein–protein interactions may be identified(97) (Fig. 5D) and

by immobilizing small molecules, binding activities of macro-

molecules to ligands can be studied(85) (Fig. 5E). The great

challenge is to produce high-quality macromolecules, in

particular, pure recombinant proteins in a suitable expression

system with proper folding for functional activity, either in

Escherichia coli, yeast, insects or human cells. Particularly,

in E. coli the proteins are produced without post-translational

modifications while the latter systems are capable of performing

such putatively essential modifications. Another challenge

the array technique faces is the high-throughput production of

antibodies with high-enough affinity for their antigens without

having cross-reactivity to other proteins. One promising way

may be to produce phage display antibodies (see below).

Phage display technique

Basic principle. Phage display makes use of filamentous

phages, bacteriophages that propagate in E. coli, to express a

wide variety of ligands (antibodies, peptides, etc.). These

ligands are expressed by the bacteriophage, and displayed on

the surface of the phage particle, where they can be selected

against any given target. Following selection, the phage

particle, and hence the ligand, can be propagated in E. coli.

This is possible due to a unique feature of the bacteriophage:

the direct coupling between the genotype and the phenotype.

The gene for a ligand or a repertoire of ligands is cloned into the

phage genome, directly upstream of one of the genes

encoding a phage coat protein (Fig. 6A). Upon transcription

of the phage genome, a fusion protein of the ligand and the

coat protein will be incorporated into new phage particles

produced within E. coli, and subsequently released from the

bacteria. The expression of the fusion protein results in phages

containing the modified genome expressing the ligand protein

on the surface of the particles (Fig. 6A). Although phage

display techniques promise to provide powerful proteomic

tools, they have certain shortcomings, being limited to the study

of small- to medium-sized proteins lacking eukaryotic post-

translational modifications. Thus, the peptides might be non-

functional.

Recombinant antibodies. During the last decade, recombi-

nant antibodies have been isolated from repertoires of

antibody-fragment-displayed bacteriophages, thereby by-

passing immunization and the hybridoma technology.(98)

Phage-displayed antibody repertoires are constructed from

V-gene repertoires, obtained from either non-immune or

Figure 6. Phage display technique for the production of

antibodies. A: The gene for a ligand or a repertoire of ligands

(gLig) is cloned into the phage genome, directly upstream of the

gene (gIII) encoding a phage coat protein. A fusion protein of

the ligand (pLig) and the coat protein (pIII) will be incorporated

into new phage particles that are produced within E. coli

and subsequently released from the bacteria. B: Phage

displayed antibody repertoires are constructed from V-gene

repertoires, which can be obtained from either non-immune or

immune sources. From non-immune repertoires, antibodies to

virtually any target can be isolated by selection and amplifica-

tion procedures that mimic the immune system. Antibody

fragments can be constructed in several ways. Most widely

used is the single chain Fv fragments (scFv), but also Fab

fragments displayed on phages are very potent antibodies.

immune sources. From non-immune repertoires, antibodies to

virtually any target are isolated by selection and amplification

procedures mimicking the immune system. Antibody fragments

can be constructed as single chain Fv fragments (scFv),

but also Fab fragments displayed on phages are very potent

antibodies (Fig. 6B).

In recent years, the utilization of recombinant antibodies in

proteomics has become highly feasible.(99) Several groups

have produced phage antibodies directed towards biomar-

kers, and novel antigens have been identified using differential

and subtractive selection methods.(100) In addition, cell-

surface-specific phage antibodies have been generated

towards a number of different cell types. Recently, phage

antibodies directed against intracellular antigens of cell lysates

have opened up the possibility of making differential protein

analysis.(101) In addition, a phage display screening method

has been developed for selecting peptides recognized by cir-

culating tumor-associated antibodies in prostate cancer.(102)

Thus, the phage display technique is very promising for

antibody production on a high-throughput level, although the antibodies

also need characterization with respect to specificity.

Protein–protein interactions by the yeast two-hybrid technique

Basic principle. Proteins often function in a physiological

context by interacting with other proteins. Introduced by Fields

and Song in 1989(103) the yeast two-hybrid technique has been

the method of choice for analyzing protein–protein interac-

tions for over a decade, but has also been used for studying

protein–RNA(104) and protein–DNA(105) interactions. The

method is based on the nature of eukaryotic gene expression.

The transcription of genes to mRNA is controlled by transcription

factors that have two important distinct domains: the DNA-binding

domain and the transcription activation domain. The

DNA-binding domain binds to the promoter region of the gene

specific for the transcription factor, while the transcription

activation domain recruits the rest of the elements needed for

transcribing the gene into mRNA. The presence of both

domains is essential for activation of gene transcription. In a

typical two-hybrid experiment, the DNA-binding domain is

fused to one of the proteins that is being studied, and the

transcription activation domain is fused to the other protein

(Fig. 7). If the two proteins interact, an active element

consisting of both the DNA-binding domain and the transcrip-

tion activation domain is formed, and gene transcription can

occur. The active element serves as a transcription factor for

a reporter gene, e.g. β-galactosidase, so that the activity of

the reporter enzyme is an indicator of interaction between the

two proteins being studied (Fig. 7).

A number of factors may lead to false results, e.g. non-

specific spontaneous activation of the reporter gene or direct

activation of the reporter gene by one of the proteins being

studied. To avoid these problems, experimental design must

be carried out carefully. In addition, since the interaction of

interest takes place in the nucleus, the physiological environ-

ment may cause misfolding of the proteins of interest or they

might miss post-translational modifications that are important

for the interaction.(106)

Yeast hybrid systems and proteomics. The yeast two-

hybrid method is used to find many novel protein-protein

interactions, and further development of the technique, the

yeast three-hybrid system, is used to study receptor-ligand

interactions.(107) In addition, modifications are used to find

elements inhibiting an interaction, the so-called ‘reverse

hybrid systems’.(108) These enhanced systems may be very

suitable for discovering new drugs, drug targets and ther-

apeutic agents.

After the completion of the human genome sequence,(1)

the next great scientific challenge of assigning functionality

to the proteins has led to much debate about which of the

proteomic techniques will be best suited for this overwhelming

task. Since yeast hybrid techniques are genetic systems

for studying protein function, these methods have been

Figure 7. Yeast two-hybrid technique for in vivo detection of

protein-protein interactions. The technique is based on the

interaction of a ‘bait’ protein with a ‘prey’ protein inside the

nucleus of a yeast cell. The bait protein consists of a target

protein (P1) fused to the DNA-binding domain (B) of a

transcription factor. The ‘prey’ protein consists of a binding

protein (P2) fused to the transcriptional activator domain (A)

of a transcription factor. By the interaction of the bait with the

prey, a functional transcription factor is created that turns on

transcription of the reporter gene. The method requires that the

protein–protein interaction can occur in the nucleus of the yeast

cell. A false positive reaction may occur if the prey protein in

itself is a functional transcription factor.

suggested as powerful tools for generating comprehensive

protein interaction maps. Although the yeast hybrid method is

not regarded as a high-throughput technique, as compared to

the previously mentioned techniques, several attempts have

been made to screen large libraries for factors that bind to

the protein or receptor of interest.(108) Being an in vivo system,

the yeast hybrid technique possesses the inherent advantage

of assigning reliable functionality to eukaryotic proteins as

compared with the in vitro protein array and phage display

techniques. Therefore, future adaptation to high-throughput

levels could give the yeast hybrid technique a central position

for forthcoming large-scale proteomic projects.

Technologies for identifying kinase substrates

Various proteomic techniques, including phage display and

the yeast two-hybrid system, have been used to identify kinase

substrates, as recently reviewed by Manning and Cantley.(109)

Since each technique possesses strengths and weaknesses,

additional techniques have been developed. An approach that

consists of generating a peptide library with random amino

acids oriented around a Ser, Thr, or Tyr being phosphorylated

by a specific kinase and purified on a ferric column was used to

identify preferred, tolerated and selected residues around the

phosphorylation site. Afterwards, these peptides can be used

to search databases for candidate kinase substrates and the

information used to raise phospho-motif-specific antibodies

for use in immunoblotting or immunoprecipitation.(109)
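
The database search for candidate substrates can be sketched as a scan of protein sequences for a consensus motif around the phosphorylated residue. The example is a minimal illustration and not the procedure used in the cited work: the motif R-x-R-x-x-S/T is used only as an example pattern, and the protein sequence and function name are hypothetical.

```python
import re


def find_motif_sites(sequence, motif=r"R.R..[ST]"):
    """Return (1-based position of the final S/T, matched segment) for each hit.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile(f"(?=({motif}))")
    sites = []
    for match in pattern.finditer(sequence):
        segment = match.group(1)
        position = match.start(1) + len(segment)  # last residue of the segment
        sites.append((position, segment))
    return sites


# Hypothetical protein sequence containing one candidate site.
print(find_motif_sites("MARPRTTSLK"))
```

A strict pattern of this kind ignores residues that are merely tolerated at a position; a scoring matrix derived from the peptide library data would be a more faithful, though more elaborate, representation.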

Concluding remarks

Two techniques, in particular, have dominated the proteomic

field in recent years, 2D-PAGE and MS. Although 2D-PAGE

possesses shortcomings, it has developed slowly and steadily

in the last quarter of a century. At present, two strong features

are recognised, unique protein resolution and suitability for

quantitative measurements. In addition, it does not require

knowledge of the chemical nature of a post-translational

modification, thereby possessing a 100% sequence coverage.

Major shortcomings include its relative inability to analyze

membrane proteins and a requirement to be coupled with

protein identification techniques (at present MS, which is the

state-of-the-art technique for protein identification and char-

acterization). MS alone is not yet able to compete with 2D-

PAGE for quantitative purposes, although investigations are

being conducted in that direction. Other proteomic techniques

that still need proof of concept before being able to seriously

compete with 2D-PAGE and MS are the protein arrays, phage

display and the yeast two-hybrid system. These techniques

are indeed very promising and, although facing difficulties,

they possess huge potential within the proteomic field.

Acknowledgments

We thank Professor Gregory E. Rice, University of Melbourne,

Australia for discussions and editorial assistance.

References

1. Collins FS, Green ED, Guttmacher AE, Guyer MS. 2003. A vision for the

future of genomics research. Nature 422:835–847.

2. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, et al. 2001.

Initial sequencing and analysis of the human genome. Nature 409:860–

921.

3. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, et al. 2001. The

Sequence of the Human Genome. Science 291:1304–1351.

4. Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, et al. 2003.

Genome-wide survey of human alternative pre-mRNA splicing with

exon junction microarrays. Science 302:2141–2144.

5. Honore B. 2001. Genome- and proteome-based technologies: status

and applications in the postgenomic era. Expert Rev Mol Diagn 1:265–

274.

6. Pradet-Balade B, Boulme F, Beug H, Mullner EW, Garcia-Sanz JA.

2001. Translation control: bridging the gap between genomics and

proteomics? Trends Biochem Sci 26:225–229.

7. Petricoin EF 3rd, Hackett JL, Lesko LJ, Puri RK, Gutman SI, et al. 2002.

Medical applications of microarray technologies: a regulatory science

perspective. Nat Genet 32 Suppl:474–479.

8. Carr KM, Bittner M, Trent JM. 2003. Gene-expression profiling in human

cutaneous melanoma. Oncogene 22:3076–3080.

9. Bertucci F, Viens P, Tagett R, Nguyen C, Houlgatte R, et al. 2003. DNA

arrays in clinical oncology: promises and challenges. Lab Invest 83:

305–316.

10. Honore B, Østergaard M. 2003. Transcriptomics and proteomics:

integration? In: Cooper DN, editor. Nature Encyclopedia of the Human

Genome. London: Nature Publishing Group. p 579–584.

11. Gygi SP, Rochon Y, Franza BR, Aebersold R. 1999. Correlation

between protein and mRNA abundance in yeast. Mol Cell Biol

19:1720–1730.

12. Patterson SD, Aebersold RH. 2003. Proteomics: the first decade and

beyond. Nat Genet 33 Suppl:311–323.

13. Fey SJ, Larsen PM. 2001. 2D or not 2D. Curr Opin Chem Biol 5:26–33.

14. Blackstock WP, Weir MP. 1999. Proteomics: quantitative and physical

mapping of cellular proteins. Trends Biotechnol 17:121–127.

15. O’Farrell PH. 1975. High-resolution two dimensional gel electrophoresis

of proteins. J Biol Chem 250:4007–4021.

16. Klose J. 1975. Protein mapping by combined isoelectric focusing and

electrophoresis of mouse tissues. A novel approach to testing for

induced point mutations in mammals. Humangenetik 26:231–243.

17. Klose J, Kobalz U. 1995. Two-dimensional electrophoresis of proteins:

an updated protocol and implications for a functional analysis of the

genome. Electrophoresis 16:1034–1059.

18. O’Farrell PZ, Goodman HM, O’Farrell PH. 1977. High resolution two-

dimensional electrophoresis of basic as well as acidic proteins. Cell 12:

1133–1141.

19. Celis JE, Rasmussen HH, Gromov P, Olsen E, Madsen P, et al. 1995.

The human keratinocyte two-dimensional gel protein database (update

1995): Mapping components of signal transduction pathways. Electro-

phoresis 16:2177–2240.

20. Link AJ. 1999. 2-D proteome analysis protocols. Methods in Molecular

Biology, Vol. 112. Totowa, New Jersey: Humana Press.

21. Gorg A, Obermaier C, Boguth G, Harder A, Scheibe B, et al. 2000. The

current state of two-dimensional electrophoresis with immobilized pH

gradients. Electrophoresis 21:1037–1053.

22. Rabilloud T. 2002. Two-dimensional gel electrophoresis in proteomics:

old, old fashioned, but it still climbs up the mountains. Proteomics 2:

3–10.

23. Hanash SM. 2000. Biomedical applications of two-dimensional

electrophoresis using immobilized pH gradients: current status.

Electrophoresis 21:1202–1209.

24. Blomberg A, Blomberg L, Norbeck J, Fey SJ, Larsen PM, et al. 1995.

Interlaboratory reproducibility of yeast protein patterns analyzed by

immobilized pH gradient two-dimensional gel electrophoresis.

Electrophoresis 16:1935–1945.

25. Bjellqvist B, Basse B, Olsen E, Celis JE. 1994. Reference points for

comparisons of two-dimensional maps of proteins from different human

cell types defined in a pH scale where isoelectric points correlate with

polypeptide compositions. Electrophoresis 15:529–539.

26. Nawrocki A, Larsen MR, Podtelejnikov AV, Jensen ON, Mann M, et al.

1998. Correlation of acidic and basic carrier ampholyte and

immobilized pH gradient two-dimensional gel electrophoresis patterns based

on mass spectrometric protein identification. Electrophoresis 19:1024–

1035.

27. Gorg A, Boguth G, Obermaier C, Weiss W. 1998. Two-dimensional

electrophoresis of proteins in an immobilized pH 4-12 gradient.

Electrophoresis 19:1516–1519.

28. Wildgruber R, Harder A, Obermaier C, Boguth G, Weiss W, et al. 2000.

Towards higher resolution: two-dimensional electrophoresis of

Saccharomyces cerevisiae proteins using overlapping narrow immobilized

pH gradients. Electrophoresis 21:2610–2616.

29. Righetti PG, Bossi A. 1997. Isoelectric focusing in immobilized pH

gradients: recent analytical and preparative developments. Anal

Biochem 247:1–10.

30. Corthals GL, Wasinger VC, Hochstrasser DF, Sanchez JC. 2000. The

dynamic range of protein expression: a challenge for proteomic

research. Electrophoresis 21:1104–1115.

31. Hoving S, Voshol H, van Oostrum J. 2000. Towards high performance

two-dimensional gel electrophoresis using ultrazoom gels.

Electrophoresis 21:2617–2621.

32. Emmert-Buck MR, Bonner RF, Smith PD, Chuaqui RF, Zhuang Z, et al.

1996. Laser capture microdissection. Science 274:998–1001.

33. Cordwell SJ, Nouwens AS, Verrills NM, Basseal DJ, Walsh BJ. 2000.

Subproteomics based upon protein cellular location and relative

solubilities in conjunction with composite two-dimensional

electrophoresis gels. Electrophoresis 21:1094–1103.

34. Molloy MP, Herbert BR, Walsh BJ, Tyler MI, Traini M, et al. 1998.

Extraction of membrane proteins by differential solubilization for

separation using two-dimensional gel electrophoresis. Electrophoresis

19:837–844.

35. Zuo X, Speicher DW. 2000. A method for global analysis of complex

proteomes using sample prefractionation by solution

isoelectrofocusing prior to two-dimensional electrophoresis. Anal Biochem 284:266–

278.

36. Garrels JI, McLaughlin CS, Warner JR, Futcher B, Latter GI, et al. 1997.

Proteome studies of Saccharomyces cerevisiae: identification and

characterization of abundant proteins. Electrophoresis 18:1347–1360.

37. Rabilloud T, Blisnick T, Heller M, Luche S, Aebersold R, et al. 1999.

Analysis of membrane proteins by two-dimensional electrophoresis:

comparison of the proteins extracted from normal or Plasmodium

falciparum-infected erythrocyte ghosts. Electrophoresis 20:3603–3610.

38. Santoni V, Kieffer S, Desclaux D, Masson F, Rabilloud T. 2000.

Membrane proteomics: use of additive main effects with multiplicative

interaction model to classify plasma membrane proteins according to

their solubility and electrophoretic properties. Electrophoresis 21:

3329–3344.

39. Wissing J, Heim S, Flohe L, Bilitewski U, Frank R. 2000. Enrichment of

hydrophobic proteins via Triton X-114 phase partitioning and

hydroxyapatite column chromatography for mass spectrometry.

Electrophoresis 21:2589–2593.

40. Rabilloud T, Adessi C, Giraudel A, Lunardi J. 1997. Improvement of

the solubilization of proteins in two-dimensional electrophoresis with

immobilized pH gradients. Electrophoresis 18:307–316.

41. Chevallet M, Santoni V, Poinas A, Rouquie D, Fuchs A, et al. 1998. New

zwitterionic detergents improve the analysis of membrane proteins by

two-dimensional electrophoresis. Electrophoresis 19:1901–1909.

42. Patton WF. 2000. A thousand points of light: the application of

fluorescence detection technologies to two-dimensional gel

electrophoresis and proteomics. Electrophoresis 21:1123–1144.

43. Switzer RC 3rd, Merril CR, Shifrin S. 1979. A highly sensitive silver stain

for detecting proteins and peptides in polyacrylamide gels. Anal

Biochem 98:231–237.

44. Shevchenko A, Wilm M, Vorm O, Mann M. 1996. Mass spectrometric

sequencing of proteins from silver-stained polyacrylamide gels. Anal

Chem 68:850–858.

45. Mørtz E, Krogh TN, Vorum H, Gorg A. 2001. Improved silver staining

protocols for high sensitivity protein identification using matrix-assisted

laser desorption/ionization-time of flight analysis. Proteomics 1:1359–

1363.

46. Syrovy I, Hodny Z. 1991. Staining and quantification of proteins

separated by polyacrylamide gel electrophoresis. J Chromatogr 569:

175–196.

47. Steinberg TH, Jones LJ, Haugland RP, Singer VL. 1996. SYPRO orange

and SYPRO red protein gel stains: one-step fluorescent staining of

denaturing gels for detection of nanogram levels of protein. Anal

Biochem 239:223–237.

48. Yan JX, Harry RA, Spibey C, Dunn MJ. 2000. Postelectrophoretic

staining of proteins separated by two-dimensional gel electrophoresis

using SYPRO dyes. Electrophoresis 21:3657–3665.

49. Gygi SP, Aebersold R. 1999. Absolute quantitation of 2-D protein spots.

Methods Mol Biol 112:417–421.

50. Berggren K, Chernokalskaya E, Steinberg TH, Kemper C, Lopez MF,

et al. 2000. Background-free, high sensitivity staining of proteins in one-

and two-dimensional sodium dodecyl sulfate-polyacrylamide gels

using a luminescent ruthenium complex. Electrophoresis 21:2509–2521.

51. Unlu M, Morgan ME, Minden JS. 1997. Difference gel electrophoresis:

a single gel method for detecting changes in protein extracts.

Electrophoresis 18:2071–2077.

52. Honore B, Vorum H, Pedersen AE, Buus S, Claesson MH. 2004.

Changes in protein expression in p53 deleted spontaneous thymic

lymphomas. Exp Cell Res 295:91–101.

53. Zhou G, Li H, DeCamp D, Chen S, Shu H, et al. 2002. 2D differential in-

gel electrophoresis for the identification of esophageal scans cell

cancer-specific protein markers. Mol Cell Proteomics 1:117–124.

54. Van den Bergh G, Clerens S, Cnops L, Vandesande F, Arckens L.

2003. Fluorescent two-dimensional difference gel electrophoresis and

mass spectrometry identify age-related protein expression differences

for the primary visual cortex of kitten and adult cat. J Neurochem

85:193–205.

55. Gharbi S, Gaffney P, Yang A, Zvelebil MJ, Cramer R, et al. 2002.

Evaluation of two-dimensional differential gel electrophoresis for

proteomic expression analysis of a model breast cancer cell system.

Mol Cell Proteomics 1:91–98.

56. Shaw J, Rowlinson R, Nickson J, Stone T, Sweet A, et al. 2003.

Evaluation of saturation labelling two-dimensional difference gel

electrophoresis fluorescent dyes. Proteomics 3:1181–1195.

57. Tonge R, Shaw J, Middleton B, Rowlinson R, Rayner S, et al. 2001.

Validation and development of fluorescence two-dimensional

differential gel electrophoresis proteomics technology. Proteomics 1:377–

396.

58. Alban A, David SO, Bjorkesten L, Andersson C, Sloge E, et al. 2003. A

novel experimental design for comparative two-dimensional gel

analysis: two-dimensional difference gel electrophoresis incorporating

a pooled internal standard. Proteomics 3:36–44.

59. Vuong GL, Weiss SM, Kammer W, Priemer M, Vingron M, et al. 2000.

Improved sensitivity proteomics by postharvest alkylation and

radioactive labelling of proteins. Electrophoresis 21:2594–2605.

60. Shively JE. 2000. The chemistry of protein sequence analysis. EXS 88:

99–117.

61. Gromov PS, Østergaard M, Gromova I, Celis JE. 2002. Human

proteomic databases: a powerful resource for functional genomics in

health and disease. Prog Biophys Mol Biol 80:3–22.

62. Celis JE, Gromov P. 1999. 2D protein electrophoresis: can it be

perfected? Curr Opin Biotechnol 10:16–21.

63. Mann M, Hendrickson RC, Pandey A. 2001. Analysis of proteins and

proteomes by mass spectrometry. Annu Rev Biochem 70:437–473.

64. Aebersold R, Mann M. 2003. Mass spectrometry-based proteomics.

Nature 422:198–207.

65. Lin D, Tabb DL, Yates JR. 2003. Large-scale protein identification using

mass spectrometry. Biochim Biophys Acta 1646:1–10.

66. Cotter RJ. 1989. Time-of-flight mass spectrometry: an increasing role in

the life sciences. Biomed Environ Mass Spectrom 18:513–532.

67. Marshall AG, Hendrickson CL, Jackson GS. 1998. Fourier transform ion

cyclotron resonance mass spectrometry: a primer. Mass Spectrom Rev

17:1–35.

68. Morris HR, Paxton T, Dell A, Langhorne J, Berg M, et al. 1996. High

sensitivity collisionally-activated decomposition tandem mass

spectrometry on a novel quadrupole/orthogonal-acceleration time-of-flight

mass spectrometer. Rapid Commun Mass Spectrom 10:889–896.

69. Cavdar Koc E, Blackburn K, Burkhart W, Spremulli LL. 1999.

Identification of a mammalian mitochondrial homolog of ribosomal protein S7.

Biochem Biophys Res Commun 266:141–146.

70. Edmondson RD, Vondriska TM, Biederman KJ, Zhang J, Jones RC,

et al. 2002. Protein kinase C epsilon signaling complexes include

metabolism- and transcription/translation-related proteins:

complimentary separation techniques with LC/MS/MS. Mol Cell Proteomics 1:421–

433.

71. Link AJ, Eng J, Schieltz DM, Carmack E, Mize GJ, et al. 1999. Direct

analysis of protein complexes using mass spectrometry.

Nat Biotechnol 17:676–682.

72. Washburn MP, Wolters D, Yates JR 3rd. 2001. Large-scale analysis of

the yeast proteome by multidimensional protein identification

technology. Nat Biotechnol 19:242–247.

73. Hayes BK, Greis KD, Hart GW. 1995. Specific isolation of O-linked

N-acetylglucosamine glycopeptides from complex mixtures. Anal

Biochem 228:115–122.

74. Ficarro SB, McCleland ML, Stukenberg PT, Burke DJ, Ross MM, et al.

2002. Phosphoproteome analysis by mass spectrometry and its

application to Saccharomyces cerevisiae. Nat Biotechnol 20:301–305.

75. Mann M, Ong SE, Grønborg M, Steen H, Jensen ON, et al. 2002.

Analysis of protein phosphorylation using mass spectrometry:

deciphering the phosphoproteome. Trends Biotechnol 20:261–268.

76. Spahr CS, Susin SA, Bures EJ, Robinson JH, Davis MT, et al. 2000.

Simplification of complex peptide mixtures for proteomic analysis:

reversible biotinylation of cysteinyl peptides. Electrophoresis 21:1635–

1650.

77. Gevaert K, Goethals M, Martens L, Van Damme J, Staes A, et al. 2003.

Exploring proteomes and analyzing protein processing by mass

spectrometric identification of sorted N-terminal peptides.

Nat Biotechnol 21:566–569.

78. Tang N, Tornatore P, Weinberger SR. 2004. Current developments in

SELDI affinity technology. Mass Spectrom Rev 23:34–44.

79. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, et al. 1999.

Quantitative analysis of complex protein mixtures using isotope-coded

affinity tags. Nat Biotechnol 17:994–999.

80. Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C. 2001. Proteolytic 18O labeling for comparative proteomics: model studies with two

serotypes of adenovirus. Anal Chem 73:2836–2842.

81. Oda Y, Huang K, Cross FR, Cowburn D, Chait BT. 1999. Accurate

quantitation of protein expression and site-specific phosphorylation.

Proc Natl Acad Sci USA 96:6591–6596.

82. Aebersold R. 2003. Constellations in a cellular universe. Nature 422:

115–116.

83. Aebersold R, Cravatt BF. 2002. Proteomics — advances, applications

and the challenges that remain. Trends Biotechnol 20:S1–S2.

84. MacBeath G. 2002. Protein microarrays and proteomics. Nat Genet 32

Suppl:526–532.

85. Zhu H, Snyder M. 2003. Protein chip technology. Curr Opin Chem Biol

7:55–63.

86. Lee YS, Mrksich M. 2002. Protein chips: from concept to practice.

Trends Biotechnol 20:S14–S18.

87. Elia G, Silacci M, Scheurer S, Scheuermann J, Neri D. 2002. Affinity-

capture reagents for protein arrays. Trends Biotechnol 20:S19–S22.

88. Silzel JW, Cercek B, Dodson C, Tsay T, Obremski RJ. 1998. Mass-

sensing, multianalyte microarray immunoassay with imaging detection.

Clin Chem 44:2036–2043.

89. Schweitzer B, Roberts S, Grimwade B, Shao W, Wang M, et al. 2002.

Multiplexed protein profiling on microarrays by rolling-circle

amplification. Nat Biotechnol 20:359–365.

90. Nielsen UB, Cardone MH, Sinskey AJ, MacBeath G, Sorger PK. 2003.

Profiling receptor tyrosine kinase activation by using Ab microarrays.

Proc Natl Acad Sci USA.

91. Haab BB, Dunham MJ, Brown PO. 2001. Protein microarrays for highly

parallel detection and quantitation of specific proteins and antibodies

in complex solutions. Genome Biol 2: RESEARCH0004.1–0004.13.

92. Sreekumar A, Nyati MK, Varambally S, Barrette TR, Ghosh D, et al.

2001. Profiling of cancer cells using protein microarrays: discovery of

novel radiation-regulated proteins. Cancer Res 61:7585–7593.

93. Knezevic V, Leethanakul C, Bichsel VE, Worth JM, Prabhu VV, et al.

2001. Proteomic profiling of the cancer microenvironment by antibody

arrays. Proteomics 1:1271–1278.

94. Paweletz CP, Charboneau L, Bichsel VE, Simone NL, Chen T, et al.

2001. Reverse phase protein microarrays which capture disease

progression show activation of pro-survival pathways at the cancer

invasion front. Oncogene 20:1981–1989.

95. Joos TO, Schrenk M, Hopfl P, Kroger K, Chowdhury U, et al. 2000.

A microarray enzyme-linked immunosorbent assay for autoimmune

diagnostics. Electrophoresis 21:2641–2650.

96. Hiller R, Laffer S, Harwanegg C, Huber M, Schmidt WM, et al. 2002.

Microarrayed allergen molecules: diagnostic gatekeepers for allergy

treatment. FASEB J 16:414–416.

97. Zhu H, Bilgin M, Bangham R, Hall D, Casamayor A, et al. 2001. Global

analysis of protein activities using proteome chips. Science 293:2101–

2105.

98. Winter G, Griffiths AD, Hawkins RE, Hoogenboom HR. 1994. Making

antibodies by phage display technology. Annu Rev Immunol 12:433–

455.

99. Holt LJ, Enever C, de Wildt RM, Tomlinson IM. 2000. The use of

recombinant antibodies in proteomics. Curr Opin Biotechnol 11:445–449.

100. Hoogenboom HR, de Bruine AP, Hufton SE, Hoet RM, Arends JW,

et al. 1998. Antibody phage display technology and its applications.

Immunotechnology 4:1–20.

101. de Wildt RM, Mundy CR, Gorick BD, Tomlinson IM. 2000. Antibody

arrays for high-throughput screening of antibody-antigen interactions.

Nat Biotechnol 18:989–994.

102. Mintz PJ, Kim J, Do KA, Wang X, Zinner RG, et al. 2003. Fingerprinting

the circulating repertoire of antibodies from cancer patients. Nat

Biotechnol 21:57–63.

103. Fields S, Song O. 1989. A novel genetic system to detect protein-

protein interactions. Nature 340:245–246.

104. Putz U, Skehel P, Kuhl D. 1996. A tri-hybrid system for the analysis and

detection of RNA–protein interactions. Nucleic Acids Res 24:4838–4840.

105. Alexander MK, Bourns BD, Zakian VA. 2001. One-hybrid systems for

detecting protein-DNA interactions. Methods Mol Biol 177:241–259.

106. Hengen PN. 1997. False positives from the yeast two-hybrid system.

Trends Biochem Sci 22:33–34.

107. Licitra EJ, Liu JO. 1996. A three-hybrid system for detecting small

ligand-protein receptor interactions. Proc Natl Acad Sci USA 93:

12817–12821.

108. Vidal M, Legrain P. 1999. Yeast forward and reverse ‘n’-hybrid systems.

Nucleic Acids Res 27:919–929.

109. Manning BD, Cantley LC. 2002. Hitting the target: emerging

technologies in the search for kinase substrates. Science’s STKE 2002:PE49.
