Proteomics: drug target discovery on an industrial scale


  • Terence E. Ryan
    Scott D. Patterson*
    Celera Genomics Group,
    45 West Gude Drive,
    Rockville, MD 20850, USA.
    *e-mail: scott.patterson@celera.com

    http://www.trends.com S45

    Trends in Biotechnology Vol. 20 No. 12 (Suppl.), 2002 A TRENDS Guide to Proteomics | Review

    0167-7799/02/$ see front matter 2002 Elsevier Science Ltd. All rights reserved. PII: S0167-7799(02)02089-9

    Proteomics represents the systematic and broad application of technologies that have traditionally supported the field of protein biochemistry. In its most common application, proteomics is used to characterize differences in protein expression between biological specimens. Although proteomics technologies can be used to catalog protein differences after metabolic perturbation, their greatest therapeutic value lies in the comparison of cells from normal tissue with those representing a disease state (e.g. [1]). Such comparisons could enable the identification of disease-specific biomarkers that could be used for diagnostic or prognostic tests, or of target proteins that have the potential for drug intervention.

    Because natural protein expression in the same tissue varies between individuals (owing to inherent genetic, metabolic, diurnal, environmental and nutritional differences, among others), the disease specificity of an observed protein-expression differential needs to be rigorously demonstrated. This can be achieved by characterizing the frequency of a differential expression across a range of samples taken from many individuals with the disease, as well as by the relative absence of the differential expression in other normal tissues in the same individual. Demonstrating specificity for the disease state in this way requires an experimental design that can encompass large numbers of experimental samples and controls, and effective interassay comparisons between individual samples and among sample groups.

    The number of individual samples required to generate statistical confidence results from a complex mixture of biological and laboratory process considerations. For example, a single disease can be represented by various degrees of disease progression or characteristic phenotype. Patients with acute myeloid leukemia (AML) can generally be classified into one of seven French-American-British (FAB)-AML classification disease groups [2,3]. Examination of AML samples requires that they be grouped accordingly to produce meaningful results, or that a larger number of AML samples be examined to identify pan-disease patterns of differential protein expression. In addition to relevant disease subtypes, samples need to be grouped according to tumor staging, degree of metastasis, and known genetic lesions. The reproducibility (variability in replicate processes) and sensitivity of laboratory processes also contribute to the number of samples needed for examination; differentials at the limits of signal-to-noise identification will require a greater number of samples to achieve statistical significance. Statistical evaluation of the differentially expressed proteins will establish appropriate levels of confidence for each observation; for the processes outlined here, we have found that 20 samples per study point is usually sufficient. In general, however, the frequency with which a particular protein-expression differential is represented in a range of samples correlates not only with the degree of statistical significance, but also with the level of interest in that differential as representative of the disease process under study.

    The rigor required for these comparisons suggests that proteomics approaches need to become standardized within each laboratory and, in addition, that the laboratory should be able to process the number of samples required to provide statistical confidence in the results. These requirements make a factory approach to proteomic discovery essential: a facility where standard protocols are applied to large numbers of samples, with the product being information with a high statistical confidence.
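The link between signal-to-noise and sample number can be made concrete with a standard power calculation. The Python sketch below is an illustration only, not the procedure used by the authors: it assumes a two-group comparison of a normally distributed measurement, a two-sided test, and the usual normal-approximation formula. The function name `samples_per_group` and the default `alpha` and `power` values are hypothetical choices.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate samples needed per group to detect a mean difference.

    effect_size is the expected difference between disease and normal
    divided by the standard deviation of the measurement, i.e. a
    signal-to-noise ratio. Normal approximation, two-sided test.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Weaker differentials (smaller signal-to-noise) need more samples.
    for snr in (2.0, 1.0, 0.5):
        print(f"signal-to-noise {snr}: {samples_per_group(snr)} samples per group")
```

Under these assumptions the required number grows with the inverse square of the signal-to-noise ratio, which is why differentials close to the detection limit dominate the sample budget of a study.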

    The discovery of targets that are sufficiently robust to yield marketable therapeutics is an enormous challenge. Through the years, several approaches have been used with varying degrees of success. These include target-independent screening of tumor-derived cell lines (disease-dependent), reductionist approaches to identifying crucial elements of disease-affected pathways, disease-independent screening of homologs of previously drugged targets, disease-dependent global examination of gene transcript levels, and disease-dependent global examination of protein expression levels. These endeavors have been enabled by several major advancements in technology, most recently, the sequencing of the human genome. This review identifies the technical issues to be addressed for industrial-scale protein-based discovery in the identification of targets for therapeutic (or diagnostic) intervention. Such approaches aim to direct discovery in a way that increases the probability of robust target identification, and decreases the probability of failure owing to variable expression in this emerging field.


    Standardization and methodology

    The need for standard protocols that are reproducible in both their execution and data output heightens the importance of methodology in large-scale proteomics. The complexity of biological samples, together with the capabilities of current-generation mass spectrometers, requires the separation of proteins or peptides into discrete, analyzable entities. Traditionally, this high-resolution separation step has been achieved using 2D gel electrophoresis (Fig. 1). Comparisons between samples must therefore be made on the basis of separate 2D gel experiments; this requires an extraordinary level of care to ensure that protocols for gel preparation, sample preparation, sample loading, electrophoresis conditions, and protein spot staining and identification precisely match [4]. Chromatographic approaches are increasingly used for proteomic studies because they provide this precision, are relatively easy to automate, and have robust instrument software (Fig. 1).

    The complexity of protein mixtures from cellular lysates or fractions can undergo only limited reduction using ion exchange, molecular sieving, or affinity chromatography. However, mixtures of proteins from limited chromatographic fractionation can be proteolyzed as a group, and the resulting peptides separated by reverse-phase chromatography with online mass spectrometric detection [5–9]. This complex-mixture method of generating peptides for tandem mass spectrometric identification has been widely used in academia and industry because of its reproducibility and ease of automation [10]. It has gained further favor over gel-based methods because it can detect low-abundance peptides [11], and also gives a more complete representation of cellular proteins (particularly membrane proteins). This review discusses issues surrounding the large-scale application of complex-mixture proteomic analysis for drug target discovery: the first step in the drug discovery and development pipeline (Fig. 2). However, it should be noted that the platform described here can be applied not only to the initial stages of the pipeline, but also to all subsequent steps (except filing and marketing).
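As a rough illustration of the group proteolysis step described above, the following Python sketch digests protein sequences in silico and reports which peptides contain cysteine, the residue that ICAT-type reagents label. The text does not name the protease; trypsin-like cleavage (after K or R, but not before P) is an assumption made here, and the sequence and function names are hypothetical.

```python
import re

# Assumed trypsin-like rule: cleave after K or R unless the next residue is P.
CLEAVAGE_SITE = re.compile(r"(?<=[KR])(?!P)")


def digest(sequence):
    """Split one protein sequence into peptides at the assumed cleavage sites."""
    return [peptide for peptide in CLEAVAGE_SITE.split(sequence) if peptide]


def digest_mixture(proteins):
    """Digest a mixture of proteins as a group, keeping track of the parent.

    proteins maps a protein name to its amino acid sequence.
    Returns a list of (protein name, peptide) tuples.
    """
    return [(name, peptide)
            for name, sequence in proteins.items()
            for peptide in digest(sequence)]


def cysteine_peptides(peptides):
    """Keep only the peptides that contain at least one cysteine."""
    return [(name, peptide) for name, peptide in peptides if "C" in peptide]


if __name__ == "__main__":
    mixture = {"hypothetical_protein": "AAKCDERPGHKLLCR"}  # made-up sequence
    all_peptides = digest_mixture(mixture)
    print(all_peptides)
    print(cysteine_peptides(all_peptides))
```

Restricting the analysis to cysteine-containing peptides reduces the number of peptides per protein, which is one way such a workflow keeps a complex mixture tractable for online mass spectrometric detection.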

    Figure 1. The two most commonly used analytical approaches in proteomics

    Complex mixture analysis using 2D gel electrophoresis, liquid chromatography and isotope-coded affinity tag (ICAT) reagent are the current standards for analysis of protein expression levels on a broad scale.

    [Figure labels: Normal; Disease; Enrichment of cell type, subcellular organelle or protein class; Image analysis; Selected spot excision; Digestion and/or MALDI-MS; MALDI-MS; Identification; Stable isotope labeling (e.g. ICAT) of samples separately (combine); d0-ICAT; d8-ICAT; Digestion of proteins; Peptide capture (e.g. ICAT-peptides on avidin); LC-MS (quantitative analysis); LC-MS/MS; Identification.]
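The quantitative LC-MS step in the ICAT route relies on the fixed mass difference between the d0 and d8 forms of the reagent. The Python sketch below shows one simple way to pair light and heavy peaks and compute an abundance ratio. It is a minimal illustration under stated assumptions: the peak list, the m/z tolerance, the single-charge default, and the choice of which sample carries the heavy label are all hypothetical, and real data would need charge-state and isotope-envelope handling.

```python
from dataclasses import dataclass

# Eight deuterium atoms replace eight hydrogens in the heavy reagent,
# giving a shift of about 8.05 Da per labeled cysteine.
D8_D0_MASS_SHIFT = 8 * (2.014102 - 1.007825)


@dataclass
class Peak:
    mz: float
    intensity: float


def pair_icat_peaks(peaks, charge=1, labeled_cysteines=1, tolerance=0.02):
    """Find light/heavy peak pairs and return (light, heavy, heavy/light ratio).

    Peaks are paired when their m/z spacing matches the expected shift
    for the given charge and number of labeled cysteines.
    """
    spacing = labeled_cysteines * D8_D0_MASS_SHIFT / charge
    ordered = sorted(peaks, key=lambda peak: peak.mz)
    pairs = []
    for index, light in enumerate(ordered):
        if light.intensity <= 0:
            continue  # cannot form a ratio against a zero or negative signal
        for heavy in ordered[index + 1:]:
            delta = heavy.mz - light.mz
            if delta > spacing + tolerance:
                break  # later peaks are even further away
            if abs(delta - spacing) <= tolerance:
                pairs.append((light, heavy, heavy.intensity / light.intensity))
    return pairs


if __name__ == "__main__":
    # Hypothetical spectrum: one labeled pair plus an unrelated peak.
    spectrum = [Peak(1000.50, 1000.0), Peak(1008.55, 3000.0), Peak(1200.00, 500.0)]
    for light, heavy, ratio in pair_icat_peaks(spectrum):
        print(f"{light.mz:.2f} / {heavy.mz:.2f}: heavy-to-light ratio {ratio:.2f}")
```

If, for example, the disease sample carried the d8 label, a heavy-to-light ratio above 1 would indicate higher abundance of that peptide in the disease sample; the labeling assignment must be recorded per experiment for the ratio to be interpretable.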

    Figure 2. Workflow for large-scale proteomics approach for target discovery within a pharmaceutical setting

    Although the schematic implies proteomics is applied only in target discovery, the platform can also be used for all additional parts of the traditional drug discovery pipeline (the uppermost flow-chart) except the filing and subsequent sales and marketing components. Of note is the i

    [Figure labels: Target discovery; Target identification; Target validation; Lead ID optimization; Preclinical development; Clinical development; Filing, sales and marketing; Sample acquisition; Preparation for analysis; Fractionation; Separation; Sample processing; Quantitation; Identification; Data capture; Data analysis; Data management and analysis.]