Drug discovery is a prolonged process that uses a variety of tools from diverse fields. To accelerate the process, a number of biotechnologies, including genomics, proteomics and a number of cellular and organismic methodologies, have been developed. Proteomics development faces interdisciplinary challenges, including both the traditional (biology and chemistry) and the emerging (high-throughput automation and bioinformatics). Emergent technologies include two-dimensional gel electrophoresis, mass spectrometry, protein arrays, isotope-encoding, two-hybrid systems, information technology and activity-based assays. These technologies, as part of the arsenal of proteomics techniques, are advancing the utility of proteomics in the drug-discovery process.
Addresses
ActivX Biosciences, Inc., 11025 North Torrey Pines Road, Suite 120, La Jolla, CA 92037, USA
*e-mail: firstname.lastname@example.org
Current Opinion in Chemical Biology 2002, 6:427–433
1367-5931/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
Published online 6 June 2002
Abbreviations
2DGE two-dimensional gel electrophoresis
ABP activity-based probe
ESI electrospray ionization
ICAT isotope-coded affinity tagging
MALDI matrix-assisted laser desorption ionization
MudPIT multidimensional protein identification technology
PCR polymerase chain reaction
Introduction
The drug-discovery process involves many phases, including target identification, lead identification, small-molecule optimization, and pre-clinical/clinical development. Efficiency in this process relies on timely knowledge of biological cause-and-effect in the course of disease and treatment, which ultimately rests on knowledge of protein function and regulation. One of the key steps, target identification, has been fostered through applied genomics, primarily because both high-throughput methods and tools that allow nucleic acid amplification have enabled large-scale profiling of expressed genes. However, analysis of the information produced by genomics, when measured against comparable information regarding protein expression, has led to the conclusion that message abundance fails to correlate with protein quantity. Further, post-translational processes such as protein modifications or protein degradation remain unaccounted for in genomic analysis. Because both cell function and its biochemical regulation depend on protein activity, and because the correlation between message level and protein activity is low, the measurement of expression has proven to be inadequate. Consequently, the development of drug-discovery technologies has begun to shift from genomics to proteomics. This shift has occurred not only in target discovery but also in many other areas of the process, including patient treatment and care.

This review focuses on the burgeoning field of proteomics as it applies to drug discovery, which relies upon the determination of cellular function and regulation through large-scale measurement of protein function and interaction.
Proteomics techniques
Proteomics, as a scientific field, is defined as the study of the protein products of the genome, and their interactions and functions. Similarly, the proteins expressed at a given time in a given environment constitute a proteome. From a technology viewpoint, traditional proteomics involves separation of proteins in a proteome, coupled to a means of identification. Until recently, the tools of choice were two-dimensional gel electrophoresis (2DGE) for separation, and mass spectrometry (MS) for protein identification. However, 2DGE is limited because it fails to detect proteins at the extremes of separation either by size or by isoelectric point, and because it is insufficiently sensitive for low-abundance proteins. From the perspective of drug discovery, 2DGE fails in two important ways. First, 2DGE is ineffective for the separation of membrane proteins, which represent nearly 50% of important drug targets. Second, low-abundance proteins are under-represented in a 2DGE analysis, yet often represent key sites of biological regulation. Specifically, it has been estimated that more than 50% of proteins in cells are of low abundance [10,11]. Therefore, although 2DGE is powerful, researchers wishing to apply proteomics to drug discovery must seek innovative ways to measure both protein abundance and activity.
Proteomics presents researchers with a formidable challenge for a number of reasons. First, protein levels vary widely with both cell type and environment. Second, unlike genomics, which benefits from the amplification of single genes using the polymerase chain reaction (PCR), protein science has no comparable amplification method. Third, proteomics is complicated by the fact that the absolute quantity of protein is of limited interest to drug discovery, because protein activities are highly regulated post-translationally. Therefore, proteins can be abundant, yet possess little activity. Finally, because proteins interact functionally in vivo, protein–protein and protein–small-molecule interactions need to be evaluated in processes of interest.
Proteomics in drug discovery
Jonathan Burbaum* and Gabriela M Tobal

For drug discovery, the ideal proteomics method would be one that is:
1. Sensitive enough to detect low-abundance proteins.
2. Able to detect activity in addition to abundance.
3. Able to detect protein–protein and protein–small-molecule interactions.
4. Easily implemented and performed quickly.
Research in proteomics seeks to satisfy all, or some, of these conditions by developing new methods to understand protein function and interactions in a biological context. Recent technological advances in this area include developments in separation and identification technologies (i.e. MS, protein-chip technologies, and phage display), bioinformatics, and technologies that detect protein interactions and activities (i.e. activity-based assays, and two-hybrid assays) (Table 1).

Table 1. Properties of various proteomics techniques.
Analytical technique: 2DGE; MudPIT; protein chips; 2-hybrid systems; ICAT; ABPs.
Polypeptide chain size; isoelectric point; surface affinity; protein interaction potential; abundance; active site peptide.
Identity established by: MS (2DGE, MudPIT, protein chips, ICAT, ABPs); DNA (2-hybrid systems).
Applications in drug discovery: target selection (all six techniques); protein expression profile; profiling for diagnostics; drug screening; drug screening and pan selectivity; detection of protein–protein and protein–drug interactions.
Pros: can detect 1000s of proteins at once; can detect potential protein interactions; circumvents the proteome coverage problems of 2DGE; is easily automated; results in cloned genes for all proteins interrogated; detects protein activity, rather than protein abundance.
Cons: cannot detect proteins that are very small, large, acidic or basic, poorly soluble and of low abundance; does not detect abundance, activity, or interactions; at a proteomic scale, will require the cloning of 100s of 1000s of proteins; false negatives and positives; does not detect post-translational modifications or interactions; probes needed for all protein families, hence proteomic-scale coverage difficult to ascertain; difficult to automate; does not detect interactions in physiologically relevant situations; limited to proteins localized in the nucleus.
Mass spectrometry
Advances in MS have allowed the rapid sequencing of proteins. In particular, techniques that enable the charging of large molecules such as proteins and peptides, as well as their transfer into a gaseous phase (e.g. electrospray ionization [ESI] and matrix-assisted laser desorption ionization [MALDI]), have allowed proteins to be analyzed by MS. ESI produces a fine spray of charged particles through a charged needle, whereas MALDI involves crystallizing the sample of interest within a matrix that can be vaporized quickly using a laser pulse. Two general methods are employed for protein identification using MS. The first, peptide-mass fingerprinting, compares the pattern of molecular weights of peptides generated from a proteolytic digestion to theoretical fingerprints derived from protein databases. The second method, tandem mass spectrometry (MSn), selects peptides of interest and uses a secondary fragmentation process to determine the peptide sequence, which is then identified using protein sequence databases. Further technology improvements, including ion-source miniaturization (for sensitivity) and detectors (for mass accuracy), have greatly expanded the method. The central importance of MS in proteomics is attributable to the large amounts of structural data that the method can create at great speed, and is demonstrated by the fact that most separation and detection methods rely on MS for protein identification.
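The logic of peptide-mass fingerprinting can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure of any particular instrument or search engine: the protease is assumed to be trypsin (cleavage after K or R, but not before P) with no missed cleavages, masses are monoisotopic and uncharged, the matching tolerance of 0.5 Da is arbitrary, and the function names and the two toy sequences are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added per free peptide (N- and C-termini)


def tryptic_digest(sequence):
    """Assumed rule: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint_score(observed_masses, sequence, tolerance=0.5):
    """Count observed masses lying within `tolerance` Da of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def identify(observed_masses, database, tolerance=0.5):
    """Rank database entries by number of matched masses, best first."""
    scores = {
        name: fingerprint_score(observed_masses, seq, tolerance)
        for name, seq in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database with invented sequences (hypothetical, for illustration only).
database = {
    "toy_protein_A": "MGSAKLLVEERPTGK",
    "toy_protein_B": "AAWCKDDEFRHHYK",
}
# Simulate a measured fingerprint by digesting toy_protein_B in silico.
observed = [peptide_mass(p) for p in tryptic_digest(database["toy_protein_B"])]
best_name, best_score = identify(observed, database)[0]
print(best_name, best_score)
```

Counting matched masses is the simplest possible score. Practical search tools typically weight matches statistically and allow for missed cleavages and modifications, which this sketch deliberately omits.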
Bioinformatics
As a method of analysis, MS is vital to proteomics researchers because it identifies their data, playing the same role as nucleotide sequencing in genomics. I