Proteomics in drug discovery


    Drug discovery is a prolonged process that uses a variety of tools from diverse fields. To accelerate the process, a number of biotechnologies, including genomics, proteomics and a number of cellular and organismic methodologies, have been developed. Proteomics development faces interdisciplinary challenges, including both the traditional (biology and chemistry) and the emerging (high-throughput automation and bioinformatics). Emergent technologies include two-dimensional gel electrophoresis, mass spectrometry, protein arrays, isotope-encoding, two-hybrid systems, information technology and activity-based assays. These technologies, as part of the arsenal of proteomics techniques, are advancing the utility of proteomics in the drug-discovery process.

    Addresses
    ActivX Biosciences, Inc., 11025 North Torrey Pines Road, Suite 120, La Jolla, CA 92037, USA
    *e-mail: jonb@activx.com

    Current Opinion in Chemical Biology 2002, 6:427–433

    1367-5931/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved.

    Published online 6 June 2002

    Abbreviations
    2DGE     two-dimensional gel electrophoresis
    ABP      activity-based probe
    ESI      electrospray ionization
    ICAT     isotope-coded affinity tagging
    MALDI    matrix-assisted laser desorption ionization
    MudPIT   multidimensional protein identification technology
    PCR      polymerase chain reaction

    Introduction

    The drug-discovery process involves many phases, including target identification, lead identification, small-molecule optimization, and pre-clinical/clinical development. Efficiency in this process relies on timely knowledge of biological cause-and-effect in the course of disease and treatment, which ultimately rests on knowledge of protein function and regulation. One of the key steps, target identification, has been fostered through applied genomics, primarily because both high-throughput methods and tools that allow nucleic acid amplification have enabled large-scale profiling of expressed genes [1]. However, analysis of the information produced by genomics, when measured against comparable information regarding protein expression, has led to the conclusion that message abundance fails to correlate with protein quantity [2]. Further, post-translational processes such as protein modifications or protein degradation remain unaccounted for in genomic analysis [3–5]. Because both cell function and its biochemical regulation depend on protein activity, and because the correlation between message level and protein activity is low, the measurement of expression has proven to be inadequate. Consequently, the development of drug-discovery technologies has begun to shift from genomics to proteomics. This shift has occurred not only in target discovery but also in many other areas of the process, including patient treatment and care [6]. This review focuses on the burgeoning field of proteomics as it applies to drug discovery, which relies upon the determination of cellular function and regulation through large-scale measurement of protein function and interaction.

    Proteomics techniques

    Proteomics, as a scientific field, is defined as the study of the protein products of the genome, and their interactions and functions. Similarly, the proteins expressed at a given time in a given environment constitute a proteome [7]. From a technology viewpoint, traditional proteomics involves separation of proteins in a proteome, coupled to a means of identification. Until recently, the tools of choice were two-dimensional gel electrophoresis (2DGE) for separation, and mass spectrometry (MS) for protein identification. However, 2DGE is limited because it fails to detect proteins at the extremes of separation either by size or by isoelectric point, and because it is insufficiently sensitive for low-abundance proteins [8]. From the perspective of drug discovery, 2DGE fails in two important ways. First, 2DGE is ineffective for the separation of membrane proteins, which represent nearly 50% of important drug targets [9]. Second, low-abundance proteins are under-represented in a 2DGE analysis, yet often represent key sites of biological regulation. Specifically, it has been estimated that more than 50% of proteins in cells are of low abundance [10,11]. Therefore, although 2DGE is powerful, researchers wishing to apply proteomics to drug discovery must seek innovative ways to measure both protein abundance and activity.

    Proteomics presents researchers with a formidable challenge for a number of reasons. First, protein levels vary widely with both cell type and environment [12]. Second, unlike genomics, which benefits from the amplification of single genes using the polymerase chain reaction (PCR), protein science has no comparable amplification method [13]. Third, proteomics is complicated by the fact that the absolute quantity of protein is of limited interest to drug discovery, because protein activities are highly regulated post-translationally [5]. Therefore, proteins can be abundant, yet possess little activity. Finally, because proteins interact functionally in vivo, protein–protein and protein–small-molecule interactions need to be evaluated in processes of interest [14].

    For drug discovery, the ideal proteomics method would be one that is:

    1. Sensitive enough to detect low-abundance proteins.

    Proteomics in drug discovery
    Jonathan Burbaum* and Gabriela M Tobal

    2. Able to detect activity in addition to abundance.

    3. Able to detect protein–protein and protein–small-molecule interactions.

    4. Easily implemented and performed quickly.

    Research in proteomics seeks to satisfy all, or some, of these conditions by developing new methods to understand protein function and interactions in a biological context. Recent technological advances in this area include developments in separation and identification technologies (i.e. MS, protein-chip technologies, and phage display), bioinformatics, and technologies that detect protein interactions and activities (i.e. activity-based assays, and two-hybrid assays) (Table 1).

    Table 1

    Properties of various proteomics techniques.

    2DGE
      Measurement: polypeptide chain size; isoelectric point; abundance
      Identity established by: MS
      Applications in drug discovery: target selection; protein expression profile; profiling for diagnostics
      Pros: can detect 1000s of proteins at once
      Cons: cannot detect proteins that are very small, large, acidic or basic, poorly soluble and of low abundance; difficult to automate

    MudPIT
      Measurement: polypeptide chain size
      Identity established by: MS
      Applications in drug discovery: target selection; profiling for diagnostics
      Pros: can detect 1000s of proteins at once; circumvents the proteome coverage problems of 2DGE; is easily automated
      Cons: does not detect abundance, activity, or interactions

    Protein chips
      Measurement: surface affinity
      Identity established by: MS
      Applications in drug discovery: target selection; drug screening; profiling for diagnostics
      Pros: can detect 1000s of proteins at once; circumvents the proteome coverage problems of 2DGE; is easily automated
      Cons: at a proteomic scale, will require the cloning of 100s of 1000s of proteins; does not detect interactions in physiologically relevant situations

    2-Hybrid systems
      Measurement: protein interaction potential
      Identity established by: DNA
      Applications in drug discovery: target selection; drug screening; detection of protein–protein and protein–drug interactions
      Pros: can detect potential protein interactions; is easily automated; results in cloned genes for all proteins interrogated
      Cons: false negatives and positives; limited to proteins localized in the nucleus

    ICAT
      Measurement: relative abundance
      Identity established by: MS
      Applications in drug discovery: target selection; profiling for diagnostics
      Pros: circumvents the proteome coverage problems of 2DGE; is easily automated
      Cons: does not detect post-translational modifications or interactions

    ABPs
      Measurement: catalytic activity; active site peptide sequence; polypeptide chain size
      Identity established by: MS
      Applications in drug discovery: target selection; drug screening and pan selectivity
      Pros: circumvents the proteome coverage problems of 2DGE; can detect 1000s of proteins at once; detects protein activity, rather than protein abundance
      Cons: probes needed for all protein families, hence proteomic-scale coverage difficult to ascertain

    Mass spectrometry

    Advances in MS have allowed the rapid sequencing of proteins [15]. In particular, techniques that enable the transfer and charging of large molecules such as proteins and peptides, as well as transfer into a gaseous phase (e.g. electrospray ionization [ESI] and matrix-assisted laser desorption ionization [MALDI]), have allowed proteins to be analyzed by MS [16]. ESI produces a fine spray of charged particles through a charged needle, whereas MALDI involves crystallizing the sample of interest within a matrix that can be vaporized quickly using a laser pulse [15]. Two general MS methods are employed for protein identification. The first, peptide-mass fingerprinting, compares the pattern of molecular weights of peptides generated from a proteolytic digestion to theoretical fingerprints derived from protein databases. The second method, tandem mass spectrometry (MSn), selects peptides of interest and uses a secondary fragmentation process to determine the peptide sequence, which is then identified using protein sequence databases. Further technology improvements, including ion-source miniaturization (for sensitivity) and detectors (for mass accuracy), have greatly expanded the method [17]. The central importance of MS in proteomics is attributable to the large amounts of structural data that the method can create at great speed, and is demonstrated by the fact that most separation and detection methods rely on MS for protein identification.
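    The logic of peptide-mass fingerprinting can be illustrated with a minimal sketch. It is not taken from the review and is not a production search engine: it assumes tryptic digestion (cleavage after K or R, except before P), neutral monoisotopic masses, no missed cleavages or modifications, and a simple count of matched masses as the score. The function names, the 0.5 Da tolerance and the two toy database entries are hypothetical and chosen only for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def fingerprint_score(observed_masses, sequence, tolerance=0.5):
    """Count observed masses that match any theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(mass - t) <= tolerance for t in theoretical)
        for mass in observed_masses
    )


def identify(observed_masses, database, tolerance=0.5):
    """Return the database entry whose theoretical fingerprint matches best."""
    return max(
        database,
        key=lambda name: fingerprint_score(observed_masses, database[name], tolerance),
    )


# Hypothetical toy database: the names and sequences are illustrative only.
toy_database = {
    "protein_a": "MKWVTFR",
    "protein_b": "GASPVTCLK",
}

# Simulate a measured fingerprint by digesting protein_a in silico.
observed = [peptide_mass(p) for p in tryptic_digest(toy_database["protein_a"])]
print(identify(observed, toy_database))
```

    Real search tools additionally handle missed cleavages, modifications and charge states, and they weight matches statistically rather than counting them, so this sketch shows only the comparison of an observed mass pattern with theoretical fingerprints.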

    Bioinformatics

    As a method of analysis, MS is vital to proteomics researchers because it identifies their data, playing the same role as nucleotide sequencing in genomics. I