

  • Mapping the Proteome of Barrel Medic (Medicago truncatula)1[w]

    Bonnie S. Watson, Victor S. Asirvatham, Liangjiang Wang, and Lloyd W. Sumner*

    Plant Biology Division, The Samuel Roberts Noble Foundation, P.O. Box 2180, Ardmore, Oklahoma 73402

    A survey of six organ-/tissue-specific proteomes of the model legume barrel medic (Medicago truncatula) was performed. Two-dimensional polyacrylamide gel electrophoresis reference maps of protein extracts from leaves, stems, roots, flowers, seed pods, and cell suspension cultures were obtained. Five hundred fifty-one proteins were excised and 304 proteins identified using peptide mass fingerprinting and matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Nanoscale high-performance liquid chromatography coupled with tandem quadrupole time-of-flight mass spectrometry was used to validate marginal matrix-assisted laser desorption ionization time-of-flight mass spectrometry protein identifications. This dataset represents one of the most comprehensive plant proteome projects to date and provides a basis for future proteome comparison of genetic mutants, biotically and abiotically challenged plants, and/or environmentally challenged plants. Technical details concerning peptide mass fingerprinting, database queries, and protein identification success rates in the absence of a sequenced genome are reported and discussed. A summary of the identified proteins and their putative functions is presented. The tissue-specific expression of proteins and the levels of identified proteins are compared with their related transcript abundance as quantified through EST counting. It is estimated that approximately 50% of the proteins appear to be correlated with their corresponding mRNA levels.

    Legumes are valuable agricultural and commercial crops that serve as important nutrient sources for both humans and animals. For example, alfalfa (Medicago sativa) is an important forage crop with over 24 million acres planted annually with an annual U.S. value approaching 6 billion dollars (U.S. Department of Agriculture-National Agricultural Statistics Service, 2002). Legumes are characterized by symbiotic relationships with both nitrogen-fixing bacteria and arbuscular mycorrhizal fungi (Barker et al., 1990). These host-symbiont interactions result in the ability to fix atmospheric nitrogen and effect mutualistic and defense-related biosynthetic pathways such as the isoflavones, which have been reported to possess antimicrobial, anticarcinogenic, and other health-promoting properties (Dixon, 1999). Other secondary metabolites in legumes such as the triterpenes have been associated with defense and are of particular interest as novel pharmaceuticals (Small, 1996; Haridas et al., 2001).

    The study of legume biology using many of the agriculturally important legumes such as soybean (Glycine max) and alfalfa is complicated by the large genome size and complex ploidy of these species. Fortunately, barrel medic (Medicago truncatula) has a smaller diploid genome that yields more manageable genetics. These traits, along with its autogamous nature, short generation time, and prolific seed production have made barrel medic a useful model legume (Barker et al., 1990; Cook et al., 1997; Cook, 1999; Bell et al., 2000; Trieu et al., 2000).

    The impressive achievements in genome and expressed sequence tag (EST) sequencing have yielded a wealth of information for many model organisms, including the plants Arabidopsis and barrel medic. Unfortunately, sequence information alone is insufficient to answer questions concerning gene function, developmental/regulatory biology, and the biochemical kinetics of life. To address these questions, more comprehensive approaches that include quantitative and qualitative analyses of gene expression products are necessary at the transcriptome, proteome, and metabolome levels. Transcriptome approaches using microarray and serial analysis of gene expression technologies are powerful tools; however, mRNA abundances may only represent putative function because there is still a questionable correlation between mRNA and protein levels (Futcher et al., 1999; Gygi et al., 1999). In contrast, proteomics provides a more direct assessment of biochemical processes by monitoring the actual proteins performing the enzymatic, regulatory, and structural functions encoded by the genome and transcriptome. Recent improvements in high-resolution two-dimensional PAGE (2-DE; Klose and Kobalz, 1995; Görg et al., 1999), increased content of protein and nucleotide databases, and increased capabilities for protein identification utilizing modern mass spectrometry methods such as matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOFMS; Pappin et al., 1993; Yates, 1998a, 1998b; Corthals et al., 2000) have made the large-scale profiling and identification of proteins a dynamic new area of research in plant biology.

    1 This work was supported by the Samuel Roberts Noble Foundation and by the National Science Foundation (Plant Genome Research Project no. 0109732).

    [w] The online version of this article contains Web-only data. The supplemental material is available at www.plantphysiol.org.

    * Corresponding author; e-mail [email protected]; fax 580–224–6692.

    Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.102.019034.

    Plant Physiology, March 2003, Vol. 131, pp. 1104–1123, www.plantphysiol.org © 2003 American Society of Plant Biologists

    Although there is a substantial amount of work in the literature on bacterial (Guerreiro et al., 1999; Morris and Djordevic, 2001), yeast (Futcher et al., 1999), and human proteomes (Anderson et al., 2001; Stensballe and Jensen, 2001), there is relatively less information on plant proteomes (van Wijk, 2001). Costa and coworkers have identified proteins from xylem and needles of maritime pine (Pinus pinaster; Costa et al., 1998, 1999), and Tsugita and coworkers have worked on the rice (Oryza sativa) proteome with some success (Tsugita et al., 1994). Both of these groups have relied heavily on Edman sequencing, which suffers due to the inability to sequence proteins blocked at the N terminus. More recently, researchers have reported on subcellular proteomes such as the chloroplast membrane (Peltier et al., 2000, 2002) whereas others have focused on single tissues including Arabidopsis seeds (Gallardo et al., 2001), Arabidopsis mitochondria (Kruft et al., 2001; Millar et al., 2001), maize (Zea mays) root tips (Chang et al., 2000), and barrel medic roots (Mathesius et al., 2001, 2002). To date, there has been no large-scale project to identify proteins from multiple tissues of the same plant species.

    The objective of the present work was to survey the organ-/tissue-specific proteomes of the model legume barrel medic, to provide an overview of the barrel medic proteome, and to serve as a basis for future proteome comparisons of genetic mutants, biotically, abiotically, and/or environmentally challenged plants. The survey was accomplished using 2-DE to produce reference maps of protein extracts from leaves, stems, roots, flowers, seed pods, and cell suspension cultures. MALDI-TOFMS peptide mass fingerprinting was used to identify 304 proteins. HPLC coupled with quadrupole time-of-flight tandem mass spectrometry (LC/MS/MS) was used to validate marginal MALDI-TOFMS protein identifications. The identified proteins are discussed and classified based on putative functions determined through similarity (Bevan et al., 1998). Database search results are quantified and strategies discussed. The expression levels quantified by 2-DE are compared with mRNA levels quantified by EST counting.

    RESULTS AND DISCUSSION

    2-DE Reference Maps and Protein Identifications of Barrel Medic Tissues

    2-DE reference maps were obtained for barrel medic leaves, stems, roots, flowers, seed pods, and cell suspension cultures and are provided in Figure 1. To qualitatively survey the proteins visualized by 2-DE, a total of 551 proteins (i.e. approximately 96 arbitrary protein spots per gel including positive molecular mass marker controls and negative gel blank controls) were excised from each of the organ-/tissue-specific Coomassie-stained 2-DE gels and analyzed by mass spectrometry. Typically, high-quality MALDI-TOFMS peptide mass maps were obtained, and representative spectra are provided in Figure 2. Of the 551 protein spots processed, 304 proteins were successfully identified and are listed in Table I.

    Figure 1. 2-DE proteome reference maps were obtained for A, leaf; B, stem; C, root; D, flowers; E, seed pods; and F, cell suspension cultures. Proteins that were identified in this study are marked with arrows and numbers. The numbers correlate with protein identifications listed in Table I. 2-DE was performed using 0.75 to 1.0 mg of protein, linear 11-cm IPG strips (pH 3–10), and a 12% (w/v) total acrylamide SDS second dimension. Gels were stained overnight with Coomassie Brilliant Blue R-250, destained the next day, and images recorded.

    Supplemental Table I (see www.plantphysiol.org) contains extensive data that document the analytical rigor of the protein identifications. These data include an assigned protein spot number (see Fig. 1), an arbitrary peptide mass fingerprint data quality (PMFQ) score of 1 to 5 (with 5 being best, see “Materials and Methods”) to allow assessment of data quality, the number of peptides matched, m/z accuracy and sd of peptides matched, percent protein coverage, theoretical molecular mass and pI, experimental molecular mass and pI, the database accession number of the best match and the databases that yielded concurrent identifications, LC/MS/MS data for select proteins, and the organism to which the matching protein was identified through similarity. For protein identifications determined using the SwissProt and National Center for Biotechnology Information (NCBI) databases, the organism reported in Supplemental Table I is that from which the protein or gene was directly sequenced. In the case of most ESTs, protein identifications were first made to barrel medic ESTs that were not annotated. These ESTs were annotated by comparison with The Institute for Genomic Research (TIGR) gene indices or through similarity to other organisms via BLAST. The organism yielding the highest similarity score is the organism reported for EST database identifications in Supplemental Table I. Protein function is also classified and recorded in Supplemental Table I. A minimum of four peptides is statistically necessary to qualify as a confident match (Pappin et al., 1993). Use of additional criteria such as those listed above is advised and increases the confidence in the protein identification. Most proteins identified in Table I had high confidence identifications; however, a small number (23) of the original proteins were identified using only four peptides that had poor m/z accuracies (i.e. above 30 ppm). These protein identifications were considered marginal and were further interrogated using LC/MS/MS. LC/MS/MS data were queried against the same three databases (NCBI, SwissProt, and dbESTothers) used to query MALDI-TOFMS data. The majority of identifications were found to be valid, but four MALDI-TOFMS proteins were revealed as misidentified. The correct LC/MS/MS identifications for these four are reported in Table I. Tandem data were also used to confirm a specific MALDI-TOFMS identified protein questioned by a reviewer in leaves (spot no. 51) that had a minimal four matching peptides and low sequence coverage. This identification was confirmed using LC/MS/MS. These results are provided in Figure 3 and include a search score from dbESTothers (12 peptides matched and Mascot score of 513), representative TOFMS data, and tandem TOF/MS/MS data. Nine proteins from the original list of 23 marginal identifications could not be validated by LC/MS/MS due to limited sample and were therefore omitted from Table I.
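
The screening rule described in this paragraph (a minimum of four matched peptides, with four-peptide matches above 30 ppm treated as marginal) can be illustrated with a short sketch. It assumes that the m/z accuracy of an identification is summarized as the mean absolute error of its matched peptides in ppm; the function names, the "insufficient" label, and the example error values are hypothetical and are not taken from the software used in the study.

```python
from typing import Sequence

MIN_PEPTIDES = 4   # minimum matched peptides for a confident match (Pappin et al., 1993)
POOR_PPM = 30.0    # accuracies above 30 ppm are described as poor in the text


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of one peptide peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def classify(errors_ppm: Sequence[float]) -> str:
    """Label an identification from the ppm errors of its matched peptides."""
    n = len(errors_ppm)
    if n < MIN_PEPTIDES:
        return "insufficient"  # hypothetical label: too few peptides to qualify
    # Assumption: accuracy is summarized as the mean absolute ppm error.
    mean_abs = sum(abs(e) for e in errors_ppm) / n
    if n == MIN_PEPTIDES and mean_abs > POOR_PPM:
        return "marginal"  # candidate for LC/MS/MS validation
    return "confident"


if __name__ == "__main__":
    print(round(ppm_error(1000.03, 1000.00), 1))
    print(classify([45.0, -38.0, 52.0, 41.0]))  # four peptides, poor accuracy
    print(classify([8.0] * 11))                 # many peptides, good accuracy
```

Under these assumptions, only the combination of the minimum peptide count and poor mass accuracy triggers follow-up, which mirrors how the 23 marginal identifications were singled out for LC/MS/MS.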

    Database Query Strategies and Success Rates

    In an attempt to maximize our protein identification success rate for barrel medic proteins, we have used protein (SwissProt), nucleotide (NCBI), and EST databases (dbESTothers, and barrel medic-only ESTs from NCBI) for queries of experimental peptide mass maps (Mann and Wilm, 1994; Pappin et al., 1993; Yates, 1998a, 1998b; Choudhary et al., 2001). The specific databases used to successfully identify each individual protein are reported in Table I, and a summary of the protein identification success rates is provided in Table II. In most cases, the resulting peptide mass maps were of high quality; however, this did not always translate to successful protein identification.
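
As a rough illustration of what querying a peptide mass map against a sequence database involves, the sketch below digests candidate sequences in silico with trypsin rules, computes singly protonated monoisotopic peptide masses, and counts how many observed peaks fall within a ppm tolerance. It is a simplified match-counting scheme, not the scoring used by the search engines in this study; the 30 ppm default, the function names, and the toy sequences are assumptions, and missed cleavages and modifications are ignored.

```python
import re
from typing import Dict, List, Sequence, Tuple

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def tryptic_peptides(sequence: str) -> List[str]:
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def mh_plus(peptide: str) -> float:
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(peaks: Sequence[float], sequence: str, tol_ppm: float = 30.0) -> int:
    """Number of observed peaks within tol_ppm of any theoretical peptide."""
    theo = [mh_plus(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(pk - t) / t * 1e6 <= tol_ppm for t in theo) for pk in peaks)


def rank_candidates(peaks: Sequence[float], database: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank database entries by the number of matched peaks (highest first)."""
    scored = [(name, count_matches(peaks, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_db = {"protA": "GASPVKTLNDR", "protB": "MHFYWKCEQR"}  # hypothetical entries
    peaks = [mh_plus(p) for p in tryptic_peptides(toy_db["protA"])]
    print(rank_candidates(peaks, toy_db))
```

Because the ranking depends entirely on which sequences are present in the database, a high-quality spectrum can still go unidentified when the matching sequence exists only as an EST, which is the situation the text describes.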

    The average protein identification success rate for all tissues using only the protein databases (SwissProt and NCBInr) was 25%, whereas the average protein identification success rate for all tissues using the EST database was 46% (see Table II). Interestingly, the average overlap in the number of proteins identified in both databases was only 15%; thus, searching both databases was complementary and not necessarily redundant. For example, the peptide maps provided in Figure 2 are of similar high quality; however, the spectrum in Figure 2B could not be identified successfully in the SwissProt or NCBI databases and could only be identified successfully through EST database queries. This complementary searching strategy yielded a final protein identification success rate of 55% for our representative protein set.
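
The relationship between the three reported averages can be checked with inclusion-exclusion: proteins found in either database type equal the sum of the two rates minus their overlap. The sketch below is a worked check only; the variable names are illustrative, and the roughly one-point gap between the computed union and the reported 55% is assumed to reflect rounding and per-tissue averaging.

```python
def union_rate(rate_a: float, rate_b: float, overlap: float) -> float:
    """Inclusion-exclusion: share identified by at least one of two searches."""
    return rate_a + rate_b - overlap


protein_db = 25.0  # % identified via SwissProt and NCBInr
est_db = 46.0      # % identified via the EST database
both = 15.0        # % identified in both

combined = union_rate(protein_db, est_db, both)
protein_only = protein_db - both  # found only in the protein databases
est_only = est_db - both          # found only in the EST database
overall = 304 / 551 * 100         # identified spots / processed spots

if __name__ == "__main__":
    print(f"combined from averages: {combined:.0f}%")
    print(f"protein-only: {protein_only:.0f}%, EST-only: {est_only:.0f}%")
    print(f"304 of 551 spots: {overall:.1f}%")
```

The EST-only share is about three times the protein-only share, which is why adding the EST search roughly doubles the success rate relative to the protein databases alone.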

    Strategies using multiple database queries have enhanced our ability to identify proteins even in the absence of a genomic sequence. Our overall success

    Figure 2. Representative peptide mass maps obtained using MALDI-TOFMS illustrating good data quality but differences in protein identification success dependent upon the database queried. Mass spectral peaks are labeled with monoisotopic mass-to-charge ratio (m/z) values used for database searching. A, Stromal 70-kD heat shock-related protein (HSP70, accession no. Q02028) was successfully identified in seed pods (pds#7) using the NCBI databases. B, Isoflavone reductase (accession no. BE325778) from seed pods (pds#39) was identifiable only through use of the EST databases.


  • Table I. Proteins identified in barrel medic tissues

    Table I contains a list of identified proteins from specific tissues of Medicago truncatula. The data are separated by tissue and include: an assigned protein spot no. (see Fig. 1), database accession no. of the best match, databases that yielded concurrent identifications, and the number of MALDI-TOFMS peptides matched. LC/MS/MS was performed on select proteins, and Mascot scores for these proteins are provided in parentheses. Not applicable (NA) denotes that no MALDI data were used in the identification. Significantly more detailed data supporting the protein identifications can be found in Supplemental Table I. Accession no. is GenBank no. Databases have the following notations: N, NCBI; S, SwissProt; E, dbESTothers; (E), MtESTonly. Species are noted as Hv, Hordeum vulgare; Sb, Sorghum bicolor; Le, Lycopersicon esculentum; and Mt, Medicago truncatula.

    Tissue | Spot # | Identification | Accession Number | Databases | # Peptides/LC/MS/MS

    lvs | 35 | er ATPase (CDC48-like protein)ᵃ | NP190891 | N | 11
    lvs | 39 | DNA mismatch repair proteinᵃ | O66652 | S | 8
    lvs | 47 | Rubisco | BE420942 | E/Hv, (E) | 7
    lvs | 52 | Rubisco | BE420942 | E/Hv | 5
    lvs | 51 | F23N19.10, TPR repeat protein | AW694998 | E/Mt, (E) | 4/(513)
    lvs | 63 | Cell division prt. FTSK homologᵃ | P45264 | S, N | 8
    lvs | 78 | Rubisco | BAA20039 | N, S, E/Hv | 11
    lvs | 82 | Rubisco | AAF97663 | N, S, E/Hv | 9
    lvs | 84 | Rubisco | AAF15326 | N, S, E/Hv | 9
    lvs | 92 | Rubisco | X69528 | N | NA/(389)
    lvs | 98 | Transcription factorᵃ | BF004459 | (E) | 5
    lvs | 105 | S-adenosyl-Met synthetase | BG581653 | E/Mt, N, S | 12
    lvs | 108 | S-adenosyl-Met synthetase | P50303 | S | 8
    lvs | 111 | Rubisco activase | AF251264 | N | NA/(372)
    lvs | 113ᵇ | ATP synthase beta chain | NP077960 | N | 8
    lvs | 113ᵇ | Rubisco activase | Q42450 | S, E/Sb | 6
    lvs | 124 | Rubisco activase | AAG61120 | N, S | 9
    lvs | 126 | Rubisco activase | AAG61120 | N, S | 8
    lvs | 123 | Aminomethyl transferase, mito. Precursorᵃ | BF521422 | E/Mt, (E) | 12
    lvs | 128 | Aminomethyl transferase, (T protein)ᵃ | P49364 | N, S, E/Mt | 11
    lvs | 136 | Fru biphosphate aldolase | BI309468 | (E), N, S, E/Mt | 9
    lvs | 138 | Spermine synthase | BE204391 | E/Mt | 5
    lvs | 139 | Putative Arabidopsis thaliana proteinᵃ | AW685607 | E/Mt | 5
    lvs | 141 | Ankyrin repeat protein | AL388433 | E/Mt | 5
    lvs | 144ᵇ | Glyceraldehyde-3-phosphate dehydrogenase | BF003409 | E/Mt, (E) | 10
    lvs | 144ᵇ | Possible tartrate dehydrogenase | P70792 | S, N | 6
    lvs | 149ᵇ | Glyceraldehyde-3-phosphate dehydrogenase | BG453922 | (E) | 7
    lvs | 149ᵇ | Tartrate dehydrogenase | P70792 | N, S | 6
    lvs | 155 | Leu2 (3-isopropyl malate dehydrogenase)ᵃ | P18120 | N, S | 7
    lvs | 158 | Malate dehydrogenase | T09286 | N, E/Mt, (E) | 14
    lvs | 196ᵇ | Ascorbate peroxidase | BG587041 | (E) | 5
    lvs | 187 | Oxygen evolving enhancer protein | P14226 | S, N, E/Mt, (E) | 7
    lvs | 191 | Oxygen evolving enhancer protein | P14226 | S, N, E/Mt, (E) | 9
    lvs | 188 | Remorinᵃ | BG588209 | (E) | 7
    lvs | 189 | Remorinᵃ | BG588209 | (E) | 8
    lvs | 196ᵇ | Rubisco | AAC35045 | N, S | 8
    lvs | 206ᵇ | Mitotic cyclin B1-1ᵃ | AAC24244 | N | 8
    lvs | 206ᵇ | ATP synthaseᵃ | BG582863 | (E) | 9
    lvs | 205 | Oxygen-evolving enhancer protein 1 | BG449793 | (E) | 5
    lvs | 219 | Cystathione-B-lyaseᵃ | P53780 | S | 6
    lvs | 222 | Chloro membrane-associated 30-kD protein/transit peptᵃ | AW776774 | E/Mt, (E) | 9
    lvs | 223 | RNA-binding protein | BF641320 | E/Mt, (E) | 8
    lvs | 238 | L-ascorbate peroxidase | P48534 | N, S, E/Mt, (E) | 6
    lvs | 241 | Ascorbate peroxidase | AAL15164 | N, S, (E) | 6
    lvs | 237 | Acid phosphatase | BG588612 | (E) | 8
    lvs | 239 | Acid phosphatase | BG588612 | (E) | 11
    lvs | 251 | Triose phosphate isomerase, cytosolic | BF642390 | E/Mt, (E) | 10
    lvs | 250 | ABC transporterᵃ | NP488322 | N | 7
    lvs | 258ᵇ | Pyrimidine-nucleoside phosphorylaseᵃ | P39N9 | S | 8
    lvs | 263 | Chaperonin 21 precursorᵃ | AW775755 | E/Mt | 6
    lvs | 265 | Chaperonin 21 precursorᵃ | AW776607 | E/Mt, (E) | 5
    lvs | 261 | Patatin-like proteinᵃ | AAF98369 | N | 5
    lvs | 258ᵇ | Transcription factor VSE-1ᵃ | CAA05898 | N | 6


    lvs | 270 | Oxygen-evolving enhancer protein | P16059 | S, E/Mt, (E) | 7
    lvs | 280 | Oxygen-evolving enhancer protein | BF521386 | E/Mt, S, (E) | 12
    lvs | 281 | Plastid specific ribosomal proteinᵃ | BE318731 | (E) | 4
    lvs | 287 | Gly-rich cell wall structural protein 2ᵃ | AL366848 | E/Mt, (E) | 8
    lvs | 284 | Oxygen-evolving enhancer protein | P16059 | S, E/Mt, (E) | 5
    lvs | 338 | Hypothetical proteinᵃ | NP180029 | N | 5
    lvs | 363 | Aspartate 1-decarboxylase precursorᵃ | P52999 | S | 4
    lvs | 388 | Rubisco small subunit | BF520627 | E/Mt, (E) | 9
    lvs | 387 | Rubisco small subunit | BF519126 | E/Mt | 5
    lvs | 397 | Plastocyanine precursor | AW776926 | E/M. t., (E) | 7
    lvs | 422 | Photosystem I iron-sulfur proteinᵃ | NP039445 | N, S | 8
    stm | 5 | Cell division (valosin-containing) protein | P54774 | N, S | 8
    stm | 7 | Heat shock protein 70 | 1909352A | N, S | 10
    stm | 10 | TPR repeat protein | AW694998 | (E), E/Mt | 7
    stm | 9 | Heat shock protein 70 | P37900 | N, S, E/Mt | 8
    stm | 17 | Rubisco | CAA93074 | N, S | 8
    stm | 16 | Rubisco | P28400 | N, S | 7
    stm | 18ᵇ | ATP synthaseᵃ | CAB85681 | N | 7
    stm | 18ᵇ | Rubisco | P30401 | S, N | 7
    stm | 19 | Rubisco | P04991 | S, N, E/Mt | 13
    stm | 20 | Rubisco | AAF97641 | N, S, E/Mt | 6
    stm | 22 | Tubulin alpha chain | Q43473 | N, S, E/Mt | 13
    stm | 23 | 26S proteasome AAA-ATPase subunitᵃ | BE325937 | E/Mt | 5
    stm | 24 | 26S proteasome (TAT binding)ᵃ | NP187204 | N, S, E/Mt | 7
    stm | 26 | SAM synthetase | P46611 | N, S, E/Mt | 7
    stm | 29 | Actin | Q96483 | N, S, E/Mt | 7
    stm | 36 | ATPase or P loop kinaseᵃ | NP347611 | N | 9
    stm | 37 | Fru 1,6 biphosphate aldolase | O65735 | N, S, E/Mt | 8
    stm | 39 | Adenosine kinaseᵃ | BF004017 | (E) | 10
    stm | 40 | Malate dehydrogenase | BI310064 | (E) | 6
    stm | 42 | Annexinᵃ | T09552 | N, E/Mt | 6
    stm | 43 | Fructokinase | AW584645 | E/Mt, (E) | 7
    stm | 44 | Ribose-phosphate pyrophosphokinaseᵃ | P47304 | S | 7
    stm | 45 | IFR-like oxidoreductase | BF644624 | E/Mt, (E) | 5
    stm | 48 | Atran bp1a (Ran-binding protein 1 domain)ᵃ | AW686211 | (E) | 7
    stm | 46 | Cinnamoyl-CoA reductaseᵃ | BF635045 | (E) | 7
    stm | 49 | G protein beta subunitᵃ | Q39836 | N, S, E/Mt, (E) | 4/(265)
    stm | 51 | Oxygen-evolving enhancer protein I | P14226 | S, N | 10
    stm | 52 | RNA-binding protein-like | NP196048 | N | 6
    stm | 57 | Ascorbate peroxidase | BG648814 | (E) | 4
    stm | 58 | Proteasome subunit alpha type 7 (20S) | Q9SXU1 | S, N, E/Mt, (E) | 5
    stm | 55 | SAM:trans-caffeoyl CoA 3-O methyl transf.ᵃ | T09399 | N, E/Mt | 10
    stm | 54 | RNA-binding protein-like | BF641320 | E/Mt | 5
    stm | 60 | Ascorbate peroxidase | BG648814 | (E), E/Mt | 5
    stm | 61 | Acid phosphatase | BG588612 | (E), E/Mt | 6
    stm | 62 | Triosphosphate isomerase | BF642390 | E/Mt, (E) | 10
    stm | 64 | Expressed proteinᵃ | AI774799 | E/Le | 5
    stm | 66 | Uridylate monophosphate kinase | AW981222 | E/Mt, (E) | 8
    stm | 70 | 23-kD O2-evolving pht. sys. II precursorᵃ | P16059 | N, S, E/Mt, (E) | 7
    stm | 72 | ATP synthase, delta chainᵃ | Q41000 | S | 4
    stm | 81 | vcCYP | AW775250 | E/Mt, (E), N | 8
    stm | 85 | 40S ribosomal protein S12ᵃ | AL375805 | E/Mt | 4
    stm | 86 | Gly-rich RNA binding protein | AL379229 | E/Mt, (E) | 4
    stm | 88 | 60S ribosomal protein | AW776748 | E/Mt, (E) | 4
    stm | 90 | Nucleoside diphosphate kinase I | P47922 | S | 5
    stm | 89 | Gly-rich RNA binding protein | AA660717 | E/Mt, (E), N | 9
    stm | 95 | Hypotheticalᵃ | NP174644 | N | 6
    rts | 2 | Heat shock 70 | Q02028 | N, S, E/Mt | 22
    rts | 5 | Phosphoglyceromutase | BG585916 | (E) | 5


    rts | 6 | Protein disulfide isomerase | BI309490 | E/Mt, (E), N, S | 10
    rts | 9 | Putative methyl binding domain | AL378817 | E/Mt | 4
    rts | 12 | ATPase beta subunit | CAA75477 | N, S, E/Mt, (E) | 13
    rts | 19 | Actin isoform B | T51183 | N, S, E/Mt, (E) | 13
    rts | 20 | Peroxidase precursor | AL369822 | (E), E/Mt | 8
    rts | 22 | Ankyrin repeat protein HBP1 | BI311773 | (E), E/Mt | 12
    rts | 28 | Glyceraldehyde-3-phosphate dehydrogenase | BG453922 | (E), S | 10
    rts | 32 | Cationic peroxidase precursor | BG584470 | (E) | 7
    rts | 33 | Isoflavone reductase | BG645198 | (E) | 13
    rts | 36 | Isoflavone reductase homolog | BI312226 | E/Mt, (E) | 5
    rts | 39 | Acidic glucanaseᵃ | BF650084 | (E), E/Mt, N | 4
    rts | 40 | Cytochrome c oxidase subunit 6b-1 | BI310278 | (E), E/Mt | 8
    rts | 41 | Gluco endo-1,3-beta-d-glucosidase | BE239884 | E/Mt | 4
    rts | 42 | Hydroxyacyl glutathione hydrolaseᵃ | BG584417 | (E) | 8
    rts | 43 | Chitinaseᵃ | CAA71402 | N, S, E/Mt, (E) | 9
    rts | 44 | Chitinaseᵃ | CAA71402 | N, S, E/Mt, (E) | 12
    rts | 45 | Chitinaseᵃ | CAA71402 | N, S, E/Mt, (E) | 10
    rts | 46 | Cys proteinase | BI269594 | (E), E/Mt | 4
    rts | 47 | Cys proteinase precursor | BG645760 | (E), E/Mt | 6
    rts | 48 | Ascorbate peroxidase | BG648814 | (E) | 4
    rts | 51 | Ascorbate peroxidase | P48534 | S, N, E/Mt, (E) | 4
    rts | 52 | Triose phosphate isomerase | BG584164 | (E), E/Mt | 11
    rts | 53 | In2-1 proteinᵃ | BF635446 | E/Mt, (E) | 7
    rts | 54 | Uridylate kinase (UDP kinase) | AW981222 | E/Mt, (E) | 4
    rts | 56 | Chalcone-flavone isomerase | AW559891 | (E), E/Mt, N, S | 6
    rts | 60 | Unknown proteinᵃ | AW686250 | (E), E/Mt | 9
    rts | 61 | Alpha fucosidaseᵃ | BE942130 | (E) | 6
    rts | 63 | Putative protein T25B15.70ᵃ | BF520168 | E/Mt, (E) | 8
    rts | 65 | Seed protein precursor | AL371551 | E/Mt, (E) | 4
    rts | 66 | Vc Cyp (peptidyl isomerase) | BE316900 | (E), E/Mt | 4
    rts | 69 | Profucosidaseᵃ | AW126318 | (E) | 5
    rts | 75 | Putative proteinᵃ | BF005271 | (E), E/Mt | 8
    rts | 77 | Glyceraldehyde-3-phosphate dehydrogenase | BF635050 | E/Mt, (E), N | 9
    rts | 79 | Cu/Zn superoxide dismutaseᵃ | AL387737 | E/Mt | 4
    rts | 80 | aba-Responsive protein ABR17 | BF648027 | E/Mt, (E) | 8
    rts | 81 | Unknown proteinᵃ | AL365549 | E/Mt, (E) | 6
    rts | 82 | Putative ripening-related proteinᵃ | BE943167 | E/Mt, (E) | 4
    rts | 92 | Thioredoxin | BE997543 | (E), E/Mt | 4
    flw | 3 | Valosin-containing cell division protein | P54774 | S, N | 16
    flw | 6 | NADH ubiquinone oxidoreductase | AW587332 | E/Mt, (E) | 6
    flw | 7 | Heat shock 70 | Q02028 | N, S | 10
    flw | 8 | Poly(A+)-binding proteinᵃ | BG584083 | (E) | 5
    flw | 11 | Phosphoglyceromutase | BG585916 | (E), E/Mt, N, S | 10
    flw | 13 | Putative methyl-binding domain | AL378817 | (E), E/Mt | 5
    flw | 14 | Calreticulin | AW773889 | E/Mt | 4/(348)
    flw | 17 | ATPase beta subunit | CAA75477 | N, S, E/Mt, (E) | 8
    flw | 18 | Enolase | CAB75428 | N, S, E/Mt, (E) | 7
    flw | 21 | S-adenosyl Met synthetase | AAL16064 | N, S, E/Mt, (E) | 8
    flw | 23 | Rubisco activase | AAK25798 | N, S, E/Mt, (E) | 9
    flw | 27 | Ankyrin repeat protein HBP1 | BI311773 | (E) | 9
    flw | 26 | Fru-1,6-biphosphate aldolase | O65735 | N, S, E/Mt, (E) | 7
    flw | 28 | Aspartate aminotransferase | P46643 | S, N, E/Mt | 4/(115)
    flw | 31 | 1-Aminocyclopro. carboxylic acid oxidaseᵃ | AY062251 | N, (E) | 6
    flw | 32 | Pyruvate dehydrogenase beta unitᵃ | BF645846 | (E), E/Mt | 6
    flw | 30 | Glyceraldehyde-3-phosphate dehydrogenase | P34922 | N, S, E/Mt, (E) | 8
    flw | 33 | Malate dehydrogenase | O48905 | S, N | 4
    flw | 34 | Malate dehydrogenase | O48905 | N, S, (E) | 8
    flw | 35 | Ripening-induced proteinᵃ | BI308422 | (E) | 8
    flw | 39 | Cytochrome c oxidase subunit 6b | BI310278 | (E), E/Mt | 5


    flw | 40 | Stromal ascorbate peroxidase | Z67113 | (E), E/Mt | 7
    flw | 50 | Acid phosphatase | BG588612 | (E), E/Mt | 7
    flw | 51 | Acid phosphatase | BF004054 | (E), E/Mt | 6
    flw | 53 | Triose phosphate isomerase | BG584164 | (E) | 11
    flw | 55 | Osmotin-like protein | BI270608 | (E) | 8
    flw | 56 | Chalcone isomerase | BI310352 | (E), E/Mt | 6
    flw | 57 | Ascorbate peroxidase | AAL15164 | N, S, E/Mt, (E) | 5
    flw | 60 | Oxygen-evolving enhancer protein 2 | BF636854 | E/Mt, (E) | 5
    flw | 71 | Peptidyl prolyl isomerase | BE999037 | (E), E/Mt | 5
    flw | 73 | Acid phosphatase | AW584917 | (E), E/Mt | 4/(131)
    flw | 74 | Peptidyl prolyl isomerase | BE997455 | (E), E/Mt | 6
    flw | 75 | Gly-rich RNA binding protein | BF637655 | E/Mt | 4
    flw | 77 | Peroxiredoxin (peroxidase) | AW585033 | (E), E/Mt | 12
    flw | 78 | Ubiquitin-like SMT3 protein | AL376595 | (E), E/Mt, S | 8
    flw | 79 | Ubiquitin-like SMT3 protein | P55852 | S, N, E/Mt mt | 7
    flw | 81 | 60S acidic ribosomal protein p3 | BF003585 | (E), E/Mt | 4/(123)
    flw | 82 | Gly cleavage system h precursorᵃ | BF518986 | (E), E/Mt | 5
    flw | 88 | Acid ribosomal protein P2a2 | AW329482 | (E) | 5
    flw | 91 | Immunophilin | AW574158 | (E), E/Mt | 6
    flw | 92 | Rubisco small chain | BI268542 | (E), E/Mt | 5
    flw | 94 | Profilin 1ᵃ | AL373653 | (E) | 9
    flw | 96 | NADH plastoquinone oxidoreductase 4ᵃ | BF631701 | (E) | 6
    pds | 5 | Convicilinᵃ | BI312063 | (E) | 5
    pds | 6 | Convicilinᵃ | BI310979 | E/Mt | 10
    pds | 7 | Heat shock 70 | Q02028 | N, S | 9
    pds | 8 | Legumin a2 precursorᵃ | BI312252 | E/Mt | 9
    pds | 9 | Protein disulfide isomerase | P29828 | N, S, (E), E/Mt | 7
    pds | 10 | Glycininᵃ | BI308459 | (E) | 7
    pds | 13 | Legumin a2 precursorᵃ | BI311943 | (E) | 9
    pds | 12ᵇ | Rubisco | AAK70985 | N, S | 8
    pds | 12ᵇ | NAp1p (plasma membrane intrinsic protein)ᵃ | AW774263 | E/Mt, (E) | 10
    pds | 14 | Vicilin 47kD precursorᵃ | BI310576 | (E) | 9
    pds | 15 | Provicilin precursorᵃ | BI312400 | (E) | 6
    pds | 16 | Vicilin 47kD precursorᵃ | BI311712 | (E) | 4
    pds | 19 | Vicilin 47kD precursorᵃ | BI311712 | (E) | 7
    pds | 22 | Glycininᵃ | BI311592 | (E) | 13
    pds | 20 | Glycininᵃ | BI311729 | (E) | 4
    pds | 21 | Glycininᵃ | BI308883 | (E) | 12
    pds | 23 | Glycininᵃ | BI308883 | (E) | 6
    pds | 24 | Legumin a2 precursorᵃ | BI309500 | (E) | 11
    pds | 28 | Legumin a2 precursorᵃ | BI311943 | (E) | 14
    pds | 27 | Legumin a2 precursorᵃ | BI311943 | (E) | 11
    pds | 31 | Fru 1,6-biphosphate aldolase | O65735 | N, S, (E), E/Mt | 7
    pds | 30 | Legumin a2 precursorᵃ | BI311943 | (E) | 11
    pds | 29 | Legumin a2 precursorᵃ | BI311943 | (E) | 8
    pds | 32 | Legumin a2ᵃ | BI307938 | (E) | 5
    pds | 34 | Cytosolic malate dehydrogenase | BG583001 | (E), N, S | 6
    pds | 38 | Malate dehydrogenase precursor | AW688679 | E/Mt, N | 7
    pds | 37 | Glycininᵃ | BI311164 | (E) | 10
    pds | 39 | IFR-like NADH-dependent oxidoreductase | BE325778 | E/Mt | 7
    pds | 41 | Peroxidase 2 | CAC38106 | N, E/Mt | 16
    pds | 43ᵇ | Rubisco | CAA62888 | N, S | 10
    pds | 43ᵇ | Enolase | BG362941 | E/Gm | 7
    pds | 44 | Vicilin 47-kD precursorᵃ | BI310576 | (E) | 9
    pds | 47 | Acid phosphatase | BF004054 | E/Mt | 5
    pds | 46 | Acid phosphatase | BF004054 | E/Mt | 7
    pds | 51 | Acid phosphatase | BG588612 | (E) | 9
    pds | 52 | Proteosome 20S subunit | BE922062 | E/St | 7
    pds | 53 | Ascorbate peroxidase | BG648703 | (E), E/Mt | 10


    Proteomics of Barrel Medic

    Plant Physiol. Vol. 131, 2003 1111


    Table I. Continued from previous page.

    Tissue | Spot # | Identification | Accession Number | Databases | # Peptides/LC/MS/MS
    pds | 54 | Osmotin like protein precursor | BG582096 | (E), E/Mt | 8
    pds | 55 | Putative GSH-dependent dehydroascorbate reductase^a | BF636747 | E/Mt, (E) | 6
    pds | 56 | Legumin a2 precursor^a | BI307938 | (E) | 7
    pds | 57 | Legumin a2 precursor^a | BI309895 | (E) | 6
    pds | 58 | Oxygen-evolving enhancer protein 2 | AW775879 | E/Mt, (E), S | 12
    pds | 59 | Legumin a2 precursor^a | BI309155 | (E) | 9
    pds | 62 | Legumin b (minor small)^a | BI311720 | (E) | 9
    pds | 65 | Legumin b (minor small)^a | BI311720 | (E) | 12
    pds | 66 | Legumin b (minor small)^a | BI311437 | (E) | 6
    pds | 68 | Legumin-related high-Mr polypeptide^a | BI310430 | (E) | 8
    pds | 70 | Hypothetical protein^a | AW685677 | E/Mt, (E) | 9
    pds | 71 | Vicilin 4-kD precursor^a | BI312335 | (E) | 10
    pds | 75 | LEA protein^a | BG454568 | (E) | 4
    pds | 73 | Legumin a2 precursor^a | BI309500 | (E) | 4
    pds | 72 | Eukaryotic initiation factor 5a^a | AL389124 | E/Mt | 4
    pds | 77 | VcCyP peptidylprolyl isomerase | BE997455 | E/Mt, (E) | 10
    pds | 78 | VcCyP peptidylprolyl isomerase | BE997455 | E/Mt, (E) | 13
    pds | 80 | Ubiquitin-like protein | AL376595 | E/Mt | 5
    pds | 82 | aba-Responsive protein abr 17 | BF648027 | (E) | 5
    pds | 84 | Gly-rich RNA-binding protein | BI309824 | (E), E/Mt | 11
    pds | 85 | Acidic ribosomal protein | AL383563 | E/Mt | 5
    pds | 91 | Legumin (minor small)^a | BI311437 | (E) | 7
    pds | 95 | Rubisco small subunit | BF519894 | E/Mt | 8
    pds | 94 | Plastocyanin precursor | BF005687 | E/Mt | 6
    cls | 2 | Cell division cycle prt 48 (valosin contain. prt) | P54774 | S | 10
    cls | 5 | Heat shock protein 70-kD (Bip A) | T06598 | E/Hv, N, S | 10
    cls | 6 | Luminal-binding protein | CAC14168 | N, S, E/Hv | 8
    cls | 7 | Psst 70 | Q02028 | N, S, E/Hv | 9
    cls | 8 | Putative-luminal binding protein | CAC14168 | N, S, (E) | 9
    cls | 10 | Leucyl aminopeptidase^a | S57811 | N, S | 4
    cls | 9 | 70-kD heat shock protein | Q01899 | N, S, E/Mt, (E) | 8
    cls | 20 | Catalase^a | P49315 | N, S, (E) | 6
    cls | 15 | Selenium-binding protein^a | CAC67501 | N, (E) | 12
    cls | 16 | Selenium-binding protein^a | CAC67501 | N, (E) | 7
    cls | 14^b | Calreticulin | Q40401 | N, E/Mt | 4/(135)
    cls | 14^b | Nucleosome assembly protein 1^a | S60893 | E/Mt | NA/(208)
    cls | 18 | ATP synthase beta subunit | CAA75478 | N, S, E/Mt, (E) | 6
    cls | 19 | Inosine-5′-monophosphate dehydrogenase^a | AAL18815 | N, E/Mt, (E) | 5
    cls | 21 | Hydroxymethyltransferase^a | AW980652 | E/Mt, (E) | 4
    cls | 22 | Enolase | CAB75428 | N, E/Mt | 6
    cls | 23 | SAM synthetase | AAG17666 | N, E/Mt, (E) | 5
    cls | 24^b | Glc-6-phosphate 1 dehydrogenase | Q42919 | S | 5
    cls | 24^b | SAM synthetase 2 | Q96552 | N, E/Mt, E/Gm | 5
    cls | 28 | Aspartate aminotransferase | P28011 | N, S, E/Mt | 17
    cls | 29 | Putative heat shock protein | AAK63929 | N | 5
    cls | 30 | 12-Oxophytodienoic acid 10,11-reductase^a | BG648922 | E/Mt, (E) | 6
    cls | 31 | 12-Oxophytodienoate reductase (OPR2)^a | AW776305 | E/Mt | 6
    cls | 27 | RAD23 (ubiquitin-like protein)^a | AW586882 | (E) | 4
    cls | 33 | Probable mannitol dehydrogenase | AW981164 | (E) | 6
    cls | 32 | Alcohol dehydrogenase^a | P12886 | S, (E) | 5
    cls | 35^b | Catalase^a | P45739 | S | 5
    cls | 35^b | Fru-1,6-biphosphate aldolase | P46257 | N, S, E/Mt | 5
    cls | 37 | 2-Nitropropane dioxygenase-like protein^a | BF518520 | E/Mt, (E) | 7
    cls | 39 | Fructokinase | AW584645 | E/Mt, N, S | 8
    cls | 46 | Beta-1,3-glucanase | BF650622 | E/Mt | 4/(378)
    cls | 49 | Stromal L-ascorbate peroxidase precursor | BE941206 | E/Mt, (E) | 5
    cls | 53 | Cytochrome b5 reductase^a | NP_568391 | E/Mt | NA/(406)
    cls | 60 | Glyceraldehyde-3-phosphate dehydrogenase | P54270 | N, S | 5
    cls | 62 | Rubisco, small subunit | PS6577 | S | 4


    rate of 55% is good when compared with other reports focused on organisms without sequenced genomes. For example, a recent publication concerning pea (Pisum sativum) chloroplast proteins reported a success rate of 15% using mass spectrometry and Edman sequencing (Peltier et al., 2000), whereas a barrel medic root proteome article reported a success rate of 37% (Mathesius et al., 2001). Our protein identification success rates are approaching those for organisms with sequenced genomes. For example, identification success rates of 54% using MS only (Kruft et al., 2001) and 69% (Millar et al., 2001) using MS, immunoblotting, and Edman sequencing were reported for Arabidopsis mitochondrial proteomes. Further, protein identification success rates in human proteome projects are approximately 60% (Stensballe and Jensen, 2001). We expect protein identification success rates to continually increase as the population of unique ESTs continues to increase, as full-length EST sequences are generated, and as genomic sequence of barrel medic becomes available (Comment, 2002).

    The average length of barrel medic ESTs used to successfully identify proteins in all organ/tissues was 597 ± 177 nucleotides (or 199 ± 59 amino acids). For proteins in the 30-kD range or less, this represents complete or almost complete sequence coverage by the EST; thus, our confidence in these identifications is very high. For larger proteins this only represents partial protein sequence; however, our data demonstrate that the current EST information is sufficient to allow confident identifications. Additional experimental data such as number of peptides matched, m/z accuracy, molecular mass, and pI provide additional confirmation of identification. It is logical that a strategy including both protein and nucleotide databases would yield greater protein identification rates as some mRNAs, such as mitochondrial and chloroplast-encoded mRNAs (i.e. Rubisco large subunit), do not contain poly(A+) tails (Sugiura and Takeda, 2000). These poly(A+) tails are used in the initial stages of affinity purification of mRNAs in the cDNA/EST library generation process (Sambrook et al., 1989). Messenger RNAs without poly(A+) tails pass through the affinity purification process and are unlikely to be sequenced. These proteins are poorly represented in the EST libraries but are present in many of the protein databases. Therefore, querying both provides greater identification success rates.
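The relationship between EST length and protein coverage described above can be sketched numerically. The following is only an illustration, not part of the original analysis: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a value from this report) and treats the whole EST as coding sequence, so the result is an upper bound on coverage. The function names are hypothetical.

```python
# Illustrative estimate only; not part of the original analysis.
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average mass of one residue


def est_length_in_residues(nucleotides: float) -> float:
    """Maximum number of amino acids an EST can encode (3 nucleotides per codon)."""
    return nucleotides / 3.0


def protein_length_from_mass(mass_kd: float) -> float:
    """Approximate residue count of a protein from its molecular mass in kD."""
    return mass_kd * 1000.0 / AVG_RESIDUE_MASS_DA


def est_coverage(nucleotides: float, mass_kd: float) -> float:
    """Upper bound on the fraction of a protein covered by one EST (capped at 1.0)."""
    return min(1.0, est_length_in_residues(nucleotides) / protein_length_from_mass(mass_kd))


mean_residues = est_length_in_residues(597)    # mean EST length reported in the text
spread_residues = est_length_in_residues(177)  # the ± value reported in the text

if __name__ == "__main__":
    print(mean_residues, spread_residues)
    for mass_kd in (10, 20, 30, 60):
        print(mass_kd, "kD:", round(est_coverage(597, mass_kd), 2))
```

Under these assumptions the bound falls as protein mass grows, which is the reason larger proteins are described as only partially represented by a single EST.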

    Protein Identifications and Functional Classifications

    Putative protein functional classifications were assigned based on similarity to better understand the biological processes encompassed by the proteins identified using a 2-DE proteomics approach. Summaries of protein functions observed in the barrel medic proteome are provided in Figure 4. Protein functions were assigned using the protein function database Pfam (http://www.sanger.ac.uk/Software/Pfam/; Bateman et al., 2002) or InterPro (http://www.ebi.ac.uk/interpro/; Apweiler et al., 2001). Protein function was categorized into 13 classes as previously described for Arabidopsis (Bevan et al., 1998). The “unclear” protein class included proteins that were successfully matched to putative proteins from such sources as the Arabidopsis genomic sequence but do not yet have a known function. Most proteins could be unambiguously classified; however, a small number of proteins were associated with multiple functions. Classifications for these proteins were based on their predominant function. Discussions concerning a portion of the proteins observed and their functional role are presented below in relation to the tissue in which they were observed.

    Table I. Continued from previous page.

    Tissue | Spot # | Identification | Accession Number | Databases | # Peptides/LC/MS/MS
    cls | 64 | Proteosome subunit α-type 5 (20S subunit) | Q9M4T8 | E/Mt, N, S | NA/(449)
    cls | 67 | NADH ubiquinone oxidoreductase | BG448277 | (E) | 4
    cls | 74 | vcCyp (peptidylprolyl isomerase) | AW775250 | E/Mt | 9
    cls | 78 | Peroxiredoxin TPx1 (thioredoxin peroxidase) | AW559683 | (E), E/Mt | 11
    cls | 81 | Peptidylprolyl isomerase (immunophilin) | BF635887 | E/Mt, (E) | 5
    cls | 83 | Disease resistance response protein^a | BE942549 | E/Mt, S | 9
    cls | 82^b | aba-Responsive protein | BF648027 | E/Mt, (E) | 10
    cls | 82^b | Leghemoglobin 2 (Pprg2)^a | P27993 | S | 4/(680)
    cls | 87 | Class 10 PR protein^a | Q43560 | N, S, E/Mt | 5
    cls | 85 | Cytochrome C-555^a | P00124 | S | 5
    cls | 88 | Nucleoside diphosphate kinase | P47922 | S | 7
    cls | 86 | Gly-rich RNA binding protein | AAF06329 | E/Mt, N | 9
    cls | 90 | Immunophilin | AL377066 | E/Mt | 4
    cls | 92 | Acidic ribosomal protein (60S) | AL378424 | E/Mt | 5
    cls | 96 | 10-kD chaperonin^a | AL377948 | E/Mt | 4/(121)

    ^a Putative unique protein identified only in one tissue.
    ^b Multiple proteins identified in this 2-DE spot.


    Figure 3. Representative LC/MS/MS data obtained on an ABI Qstar Pulsar for leaves (spot no. 51) confirming the identification of this protein as a TPR repeat protein (accession no. AW694998) as suggested by MALDI-TOFMS peptide mass fingerprinting. The data include: A, database search score and peptides successfully identified; B, example TOF/MS; and C, tandem TOF/MS/MS mass spectra for the peptide observed at m/z 677.62.


    Leaves

    Photosynthetic enzymes dominated the 2-DE profiles of leaf tissue. Approximately 40% of the leaf protein mass visualized with Coomassie staining can be attributed to a small number of enzymes including the large subunit of Rubisco (26.1%), Rubisco small subunit (2.8%), Rubisco activase (3.2%), and oxygen-evolving protein (6.4%). Most of these proteins appear as multiple spots, and the reported percentages are estimates including all identified spots. The relatively high concentrations of the abundant photosynthetic enzymes demonstrate the importance of these enzymes; however, the prominence of these proteins, specifically Rubisco, in specific regions of the gel generally contributes to lower quality 2-DE gels and prevents the observation of moderate or lower abundance proteins due to their relatively lower concentrations and the limited dynamic range of common 2-DE staining techniques including Coomassie. Other proteins involved in photosynthesis and carbon fixation were observed in leaf, including: PSI iron-sulfur protein, ATP synthase, glyceraldehyde 3-phosphate dehydrogenase, malate dehydrogenase, triose phosphate isomerase, tartrate dehydrogenase, and Fru biphosphate aldolase. Many of these photosynthetic enzymes were also observed at lower levels in other green tissues such as stems and immature seed pods.

    Figure 4. Summary of the distribution of tissue-specific identified protein classes as determined using the protein function database Pfam (http://www.sanger.ac.uk/Software/Pfam/) and classification schema previously reported for Arabidopsis (Bevan et al., 1998).

    Table II. Summary of protein identification success rates

    Success rates are reported as total no. of proteins identified and as a percentage of those identified relative to those processed in parentheses.

    Tissue | Protein Databases | EST Databases | Overlap | Total
    Leaves | 37 (44%) | 42 (50%) | 15 (18%) | 64/84 (76%)
    Stems | 28 (30%) | 33 (38%) | 15 (16%) | 46/94 (49%)
    Roots | 12 (13%) | 40 (43%) | 12 (13%) | 40/94 (43%)
    Flowers | 16 (17%) | 40 (43%) | 13 (14%) | 43/94 (46%)
    Pods | 9 (10%) | 59 (65%) | 7 (8%) | 61/91 (67%)
    Cells | 33 (35%) | 40 (40%) | 23 (24%) | 50/94 (53%)
    Total | 135/551 (25%) | 254/551 (46%) | 85/551 (15%) | 304/551 (55%)
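The Total column of Table II follows from the other three by simple set arithmetic: proteins found in either database type are those found in protein databases plus those found in EST databases, minus the overlap counted twice. A minimal sketch of that bookkeeping, using the counts transcribed from Table II (the variable and function names are hypothetical, chosen only for this illustration):

```python
# (protein_db_hits, est_db_hits, overlap, spots_processed), transcribed from Table II
TABLE_II = {
    "Leaves": (37, 42, 15, 84),
    "Stems": (28, 33, 15, 94),
    "Roots": (12, 40, 12, 94),
    "Flowers": (16, 40, 13, 94),
    "Pods": (9, 59, 7, 91),
    "Cells": (33, 40, 23, 94),
}


def total_identified(protein_hits: int, est_hits: int, overlap: int) -> int:
    """Inclusion-exclusion: proteins identified in at least one database type."""
    return protein_hits + est_hits - overlap


def success_rate(identified: int, processed: int) -> int:
    """Identification success rate as a whole-number percentage."""
    return round(100 * identified / processed)


totals = {
    tissue: total_identified(p, e, o) for tissue, (p, e, o, _n) in TABLE_II.items()
}
overall_identified = sum(totals.values())
overall_processed = sum(n for (_p, _e, _o, n) in TABLE_II.values())

if __name__ == "__main__":
    for tissue, (_p, _e, _o, n) in TABLE_II.items():
        print(tissue, totals[tissue], f"{success_rate(totals[tissue], n)}%")
    print("All tissues", overall_identified, overall_processed)
```

The per-tissue totals sum to the 304 identified proteins out of 551 processed spots reported in the last row.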


    Several signal transduction proteins were observed in leaves, including the multiple domain protein remorin. Remorin binds simple and complex galacturonides and its C-terminal region has functional similarities to viral intercellular communication proteins (Reymond et al., 1996). Other proteins involved in protein destination or transport included chaperonin 21 precursor, an ankyrin repeat protein, and an ATP-binding cassette transporter. Ankyrin repeat proteins have been associated with protein-protein interaction (Gorina and Pavletich, 1996), transcriptional regulation (Batchelor et al., 1998), and transcription inhibition (Jacobs and Harrison, 1998). ATP-binding cassette transporters are membrane-localized proteins that transport small hydrophilic molecules across membranes and include an ATP-binding domain (Higgins, 1992; Jasiñski et al., 2001). Interestingly, other membrane-localized proteins were identified and included a chloroplast membrane-associated 30-kD protein (Li et al., 1994) and ATP synthase. The identifications of membrane proteins are important because these proteins are generally underrepresented in 2-DE proteomic studies due to low solubility (Molloy et al., 1998). The observation of plant proteins in 2-DE relative to their general average hydropathicity score has been discussed recently (Millar et al., 2001). Additional proteins identified in leaf tissues included: two cell division proteins, filamentous temperature sensitive protein K homolog cell division protein, mitotic cyclin B1-1, DNA mismatch repair protein, RNA-binding protein, transcription factor, and a Gly-rich cell wall structural protein.

    Stems

    The 2-DE reference map of barrel medic stem proteins was of better quality than that of leaves, primarily due to a lower abundance of Rubisco. Many of the same photosynthetic and carbon metabolism enzymes reported above for leaf were also identified in stems. In addition, several members of the ATP complex associated with energy metabolism were observed. Proteins involved in protein destination and storage were also identified and included the 26S proteasome AAA-ATPase subunit and a 20S proteasome subunit alpha type 7 protein. The 26S proteasome is responsible for degradation of endogenous proteins.

    Proteins involved in secondary metabolism are of specific interest to our functional genomics project focused on natural products (National Science Foundation Plant Genome Research Project no. 0109732). Several secondary metabolic enzymes were identified in stems and included cinnamoyl-CoA reductase, which plays a role in lignin biosynthesis, and isoflavone reductase-like oxidoreductase, an enzyme involved in phytoalexin production. Stems also revealed several kinases including adenosine kinase, fructokinase, Rib-phosphate pyrophosphokinase, uridylate monophosphate kinase, and nucleoside diphosphate kinase 1. A number of RNA binding proteins thought to be important in transcription were also observed. Multiple ribosomal proteins, including 40S and 60S ribosomal proteins, which function in protein synthesis, were also identified.

    Roots

    The roots of legumes are of special interest because of their role in the characteristic symbiotic relationships formed with microorganisms. Although recent articles have been published on the proteomes of barrel medic nodulated root (Bestel-Corre et al., 2002) and uninoculated root (Mathesius et al., 2001), we have included roots as part of our survey for completeness and comparison. Approximately 24% of root proteins identified in this report were associated with plant disease/defense and included peroxidases, superoxide dismutases, ripening related protein, abscisic acid (ABA)-responsive protein, and chitinase. Peroxidases are generally involved in hydrogen peroxide detoxification and are induced by bacterial infection (Cook et al., 1995; Peng et al., 1996). Peroxidases also play a major role in lignin biosynthesis (Lewis and Yamamoto, 1990; Davin and Lewis, 1992). Several glucanases were also identified. These normally constitutively expressed proteins are induced in response to fungal and viral elicitation (Meins et al., 1992). Proteins involved in secondary metabolism of the flavonoid/isoflavonoid pathway made up another 8% of the identified root proteins. Similar to leaves, several membrane-localized proteins such as ATPase and cytochrome C oxidase were also observed in roots.

    Relative to other tissues, a larger percentage (i.e. 15%) of the barrel medic root proteins were identified as putative proteins or unannotated proteins. These proteins could be confidently linked to specific ESTs or predicted open reading frames whose functions are still unknown. The observation of unannotated proteins provides experimental evidence of putative/predicted proteins that offer exceptional opportunities in gene annotation (Mann and Pandey, 2001). Because roots appear to have the largest percentage of proteins of unknown function, it is possible that many of these proteins may be specific to legumes and may be involved in microbial interactions characteristic of legumes.

    The root 2-DE reference map and protein identifications reported here are consistent with the previous studies by Mathesius et al. (2001) in young, uninoculated barrel medic roots, and by Bestel-Corre et al. (2002) using roots inoculated with Glomus mosseae or Sinorhizobium meliloti. Similar to our results listed above, Mathesius and coworkers reported 5% of their identified root proteins to be associated with flavonoid metabolism and 18% with defense and stress response, yielding a total of 23% defense-related proteins. Further, the total overlap in identified root proteins between the current study and the detailed report by Mathesius and coworkers was over 50%. These included heat shock 70 protein, protein disulfide isomerase, glyceraldehyde-3-phosphate dehydrogenase, isoflavone reductase and chalcone isomerase, a glucosidase and a Cys proteinase, ascorbate peroxidase, alpha-fucosidase, and a ripening-related protein. Many of these proteins had very similar molecular mass and pI values in both studies. For example, cytochrome c oxidase was reported to have a gel molecular mass/pI of 37 kD/4.2 by Mathesius and coworkers, whereas it was observed at a molecular mass/pI of 36 kD/4.9 in the present study. Similarly, ripening related protein had an experimental molecular mass/pI of 16 kD/5.8 in this study and 18 kD/5.5 or 17 kD/6.2 (isoforms) in the Mathesius et al. work. Interestingly, some proteins varied slightly between studies. For example, VcCyp was observed at a gel molecular mass/pI of 22 kD/6.3 in the current study as opposed to molecular mass/pI 20 kD/8.8 in Mathesius and coworkers. These slight inconsistencies may represent real differences in posttranslational modifications of the proteins or may be the result of experimental variability.

    Proteins identified in all three investigations include a peroxidase precursor, cytochrome c oxidase subunit 6, VcCyp (cyclophilin), a superoxide dismutase, and ABA-responsive protein. Only the ABA-responsive protein and VcCyp were reported to be constitutively expressed by Bestel-Corre et al. (2002), whereas the other proteins common to all three investigations were identified by them as symbiosis-related proteins. Bestel-Corre et al. also identified and reported profucosidase as a symbiosis-related protein. This protein was identified in the current report using uninoculated roots.

    Interestingly, two proteins identified in this investigation were not found in either of the other two studies. Acidic glucanase was observed as a relatively abundant protein in the present report (rts#39), but due to its pI of 8.4 and the fact that Mathesius and coworkers’ first dimension immobilized pH gradient (IPG) pH range was 4 to 7, it was not present on their gels. We also identified three isoforms of chitinase, all with a pI above 7, that are missing in the Mathesius et al. work. Bestel-Corre et al. (2002) used a pH 3 to 10 first dimension IPG; thus, these proteins should be visible in their gels. Unfortunately, the total number of identified proteins in the Bestel-Corre report was limited, and these proteins were not identified by them.
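The reasoning about IPG ranges can be expressed as a simple range check. The sketch below is a deliberate simplification: it assumes a protein can appear on a gel only when its pI lies within the pH range of the first-dimension strip, and it ignores solubility, abundance, and focusing quality. The function and constant names are hypothetical.

```python
def within_ipg_range(pi: float, ph_low: float, ph_high: float) -> bool:
    """True when a protein's pI falls inside the pH range of the IPG strip."""
    return ph_low <= pi <= ph_high


# pH ranges as stated in the text for the two earlier root studies
MATHESIUS_IPG = (4.0, 7.0)
BESTEL_CORRE_IPG = (3.0, 10.0)

GLUCANASE_PI = 8.4  # pI reported for spot rts#39 in the present study

if __name__ == "__main__":
    print("pH 4 to 7:", within_ipg_range(GLUCANASE_PI, *MATHESIUS_IPG))
    print("pH 3 to 10:", within_ipg_range(GLUCANASE_PI, *BESTEL_CORRE_IPG))
```

A spot that passes this check may still go unreported, as with the glucanase and chitinase spots in the pH 3 to 10 study, when only a limited number of proteins are excised and identified.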

    Overall, these three reports (this report; Mathesius et al., 2001; Bestel-Corre et al., 2002) provide a wealth of information on the barrel medic root proteome. There are significant similarities between the reference maps that serve as landmarks and can be used for navigation through the root proteome. For example, ABA-responsive protein is one of the most abundant root proteins in each of these investigations. Its relative position can be used to locate PR10 (a highly abundant low-molecular mass protein reported by Mathesius and coworkers next to ABA-responsive protein, rts#80, that was not identified in the present study) in the present and other studies based on similarity. Unfortunately, absolute comparisons of the proteome reference maps are not always straightforward as demonstrated by the differences in molecular mass and pI values shown above for VcCyp.

    Flowers

    The proteome of flowers contained proteins from almost every functional category. The major portion (38%) of the identified proteins was associated with energy production including glycolysis, pyruvate metabolism, and the tricarboxylic acid (TCA) cycle. Another 21% of the identified proteins were involved with protein synthesis or protein destination. For example, peptidyl prolyl isomerase accelerates protein folding by catalyzing cis-trans isomerization in oligopeptides. Several proteins identified were related to disease/defense or involved in secondary metabolism, such as chalcone isomerase. These enzymes are commonly associated with flower pigmentation or UV protection and serve as important defense proteins in developing seeds. One of the proteins identified specifically in the flower proteome was profilin. Profilin normally binds to monomeric actin to prevent polymerization, although under certain conditions it can promote the polymerization of actin. It occurs in all organs, but is most abundant in mature pollen, making it more likely to be identified in flowers. Many proteins associated with oxidative responses were also identified in flowers. Low levels of a few photosynthetic enzymes were observed due to collection of green sepals with the flowers.

    Seed Pods

    The intact seed pod proteome was generated from tissue containing both seed and pod tissue. The proteins visualized and identified in the barrel medic seed pod proteome consisted primarily of globulins or seed storage proteins that serve as a nitrogen/nutritional source for developing plants. Several members of the superfamily of “cupins” were identified in barrel medic seed and included 7S and 11S globulins (Dunwell, 1998). The 11S globulins are non-glycosylated proteins and include glycinin and legumin (Hayashi et al., 1988; Duranti et al., 1995). The 7S proteins are a series of similar but progressively larger variations of the same subunit and include vicilin, convicilin, and legumin. It is also interesting to note that 85% of the proteins in this group have been matched to other legumes, suggesting a high level of sequence similarity in legume storage proteins. All of the barrel medic seed storage proteins were observed at multiple molecular masses and pIs. These may represent various stages of protein synthesis and degradation, posttranslational processing not observable at the genome or transcriptome level, or may be the products of multigene families. Similar variations in observed isoforms have been reported for Arabidopsis 12S seed storage proteins in mature and developing seeds (Gallardo et al., 2001).

    A significant number of disease-/defense-related proteins were observed in seed pods including peroxidases, osmotin, and ABA-responsive protein. These proteins help defend the plant in early stages of development. Other proteins associated with carbon metabolism, nutrient acquisition, and protein synthesis were also observed. These proteins supply necessary nutrients to the developing plant. Several photosynthetic proteins were observed and are attributed to the collection of immature green seed pods.

    Cell Suspension Cultures

    Cell suspension cultures were initiated from barrel medic root calli (Dixon, 1980) and their proteome surveyed. Cell culture proteins were extracted with a Tris buffer and, thus, consisted primarily of cytosolic proteins. Most of the identified proteins from cell cultures could be classified in four categories: energy (24%), protein destination and storage (24%), metabolism (22%), and disease/defense (18%). The defense proteins were primarily composed of pathogenesis-related proteins. The most abundant proteins identified were an ABA-responsive protein and a class 10 PR protein. Other disease/defense proteins identified included selenium-binding protein, catalase, and peroxiredoxin. Several of the metabolic enzymes identified in cells were not identified in any other tissue. One of these, 12-oxophytodienoate reductase, is associated with the conversion of 12-oxophytodienoic acid to jasmonic acid.

    In some instances, more than one protein was identified with high confidence in a single protein spot. For example, spot cls#82 contained peptides that could be associated with both ABA-responsive protein and leghemoglobin. Interestingly, leghemoglobin was identified as a root nodule-specific isoform (Gallusci et al., 1991). This protein is root specific and is induced during nodulation; however, it is generally not observed at appreciable levels in uninoculated roots. Thus, the observation of leghemoglobin is unique here, and this protein may be induced by the cell culturing process. Further, it may also suggest a “memory” effect or root-specific expression pattern observed in the cell cultures that were originally generated from root material (Dixon, 1980). Although many flavonoid-related proteins were observed in other tissues such as root and stem, none were identified in the limited set of proteins surveyed in unchallenged cell cultures.

    The proteome of suspension cell cultures is of special interest because the tissue is relatively homogeneous and, therefore, provides a good model tissue system for experiments directed toward integrated functional genomic studies of natural products (https://www.fastlane.nsf.gov/servlet/showaward?award=0109732). Future work will focus on generation of an extensive 2-DE proteome reference map of suspension cell cultures and the changes in the proteome after biotic and abiotic elicitation.

    Tissue-/Organ-Specific Expression of Proteins

    Many of the proteins identified were redundant as an average of 61% were identified in more than one tissue of barrel medic. The remaining 39% were identified in only one tissue and have the potential of being uniquely expressed in specific tissues/organs based on our limited dataset. The quantities of redundant and potentially unique proteins identified in each specific tissue are summarized in Figure 5. Many of the putative unique proteins are related to the primary function of the specific tissue. For example, photosynthetic enzymes such as PSI iron-sulfur protein and plastid-specific ribosomal proteins were only identified in leaves. Other proteins identified only in a specific tissue include the seed storage proteins glycinin, convicilin, and legumin in seed pods. Profilin, a known pollen allergen, was also identified only in flowers. These are limited examples illustrating the unique nature of the proteome, but we are hopeful that continued evaluation of the tissue- and organelle-specific proteomes of barrel medic will yield further insight into the specialized functionality of these tissues.

    Figure 5. Bar graph summarizing the number of redundant proteins identified in more than one tissue (A) and the number of putative tissue-specific proteins identified in a single tissue only (B). The graph is segregated by tissue. A total of 61% of the proteins were found to be redundant and 39% were found to be putatively tissue specific. Guarantee of specificity at this stage is difficult due to the limited size of the reported protein dataset relative to the total proteome.
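The redundant versus putatively tissue-specific split described above amounts to grouping identifications by protein and counting the distinct tissues in which each was seen. A minimal sketch follows; the handful of example records is illustrative only (a hypothetical subset, not the full Table I), and the function name is an assumption made for this example.

```python
from collections import defaultdict


def classify_by_tissue(identifications):
    """Split proteins into those seen in several tissues and those seen in one.

    `identifications` is an iterable of (tissue, protein_annotation) pairs.
    Several spots of the same protein within one tissue count as one tissue.
    """
    tissues_by_protein = defaultdict(set)
    for tissue, protein in identifications:
        tissues_by_protein[protein].add(tissue)
    redundant = {p for p, tissues in tissues_by_protein.items() if len(tissues) > 1}
    tissue_specific = {p for p, tissues in tissues_by_protein.items() if len(tissues) == 1}
    return redundant, tissue_specific


# Hypothetical, illustrative records only
example = [
    ("leaves", "Rubisco small subunit"),
    ("stems", "Rubisco small subunit"),
    ("leaves", "PSI iron-sulfur protein"),
    ("seed pods", "glycinin"),
    ("seed pods", "glycinin"),  # second spot, same tissue
    ("flowers", "profilin"),
    ("roots", "ABA-responsive protein"),
    ("cell cultures", "ABA-responsive protein"),
]

redundant, tissue_specific = classify_by_tissue(example)

if __name__ == "__main__":
    print("redundant:", sorted(redundant))
    print("putatively tissue specific:", sorted(tissue_specific))
```

As the Figure 5 caption cautions, a protein landing in the single-tissue set only means it was not identified elsewhere in a limited dataset, not that it is absent from other tissues.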

    Comparison of Barrel Medic Proteome and Transcriptome

    A better understanding of the relationship between mRNA and protein abundances is needed to elucidate the processes and regulation of transcription and translation. Several recent publications present conflicting views concerning the correlation of mRNA and protein levels. Gygi et al. (1999) suggested that there is a poor correlation between most yeast mRNAs and protein levels with the exception of only the most abundant proteins. In contrast, Futcher et al. (1999) reported a good correlation between yeast mRNA abundances, measured by both SAGE and microarray chips, and protein abundances.

    Given the large abundance of EST information for barrel medic (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi/), a simple comparison of identified protein levels with their corresponding mRNA levels was performed. Currently, over 145,000 EST sequences from approximately 20 different non-subtractive, non-normalized (J. White, TIGR, personal communication) cDNA libraries are available (Covitz et al., 1998; Cook, 1999; Bell et al., 2000; Gyorgyey et al., 2000). It is possible that a select few sequences from these libraries are being held back by the contributors, but these are few and specialized, and should have a minimal effect on the following comparisons. The cDNA libraries were used to estimate or “count” the relative expression level of a particular barrel medic transcript based on the repetitive occurrence of sequences from the same mRNA (Audic and Claverie, 1997; Ewing et al., 1999). The relative abundances of the top 200 ESTs for barrel medic leaves, stems, uninoculated roots, flowers, seed pods, and elicited cell cultures were quantified in this manner and are provided in Supplemental Table II (see www.plantphysiol.org). The relative abundances for the ESTs were generated using cDNA libraries originating from similar tissues; however, these tissues were from multiple and separate origins. Comparisons were based on functional annotation and not necessarily on specific protein or GenBank numbers, i.e. oxygen-evolving protein as opposed to P14226. Although this comparison is not of high analytical rigor, it does provide insight into correlation of protein and mRNA levels.
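The EST counting idea can be sketched in a few lines: each sequenced EST contributes one count to its annotation, the annotations are ranked by count, and identified proteins are then looked up among the top-ranked annotations. This is a simplified illustration under stated assumptions, not the procedure used to build Supplemental Table II: the toy library, the exact string matching of annotations, and all function names are hypothetical.

```python
from collections import Counter


def relative_abundance(est_annotations):
    """Count ESTs per annotation; return the counts and each annotation's library share."""
    counts = Counter(est_annotations)
    total = sum(counts.values())
    fractions = {name: n / total for name, n in counts.items()}
    return counts, fractions


def top_n(counts, n):
    """Annotations of the n most frequently sequenced transcripts."""
    return [name for name, _ in counts.most_common(n)]


def fraction_matched(identified_proteins, counts, n):
    """Fraction of identified proteins whose annotation is among the top-n ESTs."""
    top = set(top_n(counts, n))
    hits = sum(1 for protein in identified_proteins if protein in top)
    return hits / len(identified_proteins)


# Hypothetical toy library: one annotation string per sequenced EST
toy_library = (
    ["Rubisco small subunit"] * 5
    + ["oxygen-evolving protein"] * 3
    + ["lipoxygenase"] * 2
    + ["aquaporin"] * 1
)
toy_proteins = ["Rubisco small subunit", "Rubisco large subunit"]

counts, fractions = relative_abundance(toy_library)

if __name__ == "__main__":
    print(top_n(counts, 2))
    print(fraction_matched(toy_proteins, counts, 2))
```

Matching on annotation strings mirrors the statement that comparisons were based on functional annotation and not on specific accession numbers; a real comparison would need to reconcile differently worded annotations, which this sketch does not attempt.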

    Although the proteins were arbitrarily chosen across pI and molecular mass ranges, most represent relatively abundant proteins typical of 2-DE and CBB 250 staining. Based on the 2-DE protein quantification results presented here, 67% of the identified proteins were in the top 100 most abundant proteins visualized with Coomassie, whereas 97% of the proteins identified were in the top 200 most abundant proteins. Thus, identified proteins were compared with the top 200 most abundant tissue-specific ESTs in related cDNA libraries. The percentages of the identified proteins observed by 2-DE that were also observed in the top tissue-specific ESTs are summarized in Table III. This summary reveals that an average of 50% of the identified proteins were observed in the top 200 tissue-specific ESTs. An evaluation of the top 100 tissue-specific ESTs shows that 40% of proteins identified in 2-DE experiments were also observed in the 100 most abundant tissue-specific ESTs. These results suggest a moderate level of correlation between mRNA and protein. For example, leaf proteins such as the photosynthetic enzymes Rubisco small subunit and oxygen-evolving protein appear to be highly correlated with their respective mRNA levels.

    Table III. Summary of the correlated protein and EST libraries

    Ninety-seven percent of all identified proteins were quantified as being in the top 200 most abundant proteins observed in Coomassie-stained 2-DE gels. The occurrence of these identified proteins in the top 100 and 200 ESTs is reported. The no. of EST sequences used for EST counting is listed in parentheses under each tissue identifier.

    Tissue | No. of Proteins Matched in Top 100 ESTs/No. of Identified Proteins | No. of Proteins Matched in Top 200 ESTs/No. of Identified Proteins
    Leaves (7,831 ESTs) | 21/64 (33%) | 30/64 (47%)
    Stems (10,314 ESTs) | 12/46 (26%) | 16/46 (35%)
    Roots (6,593 ESTs) | 16/40 (40%) | 19/40 (48%)
    Flowers (3,404 ESTs) | 16/43 (37%) | 19/43 (44%)
    Pods (4,587 ESTs) | 45/61 (74%) | 48/61 (79%)
    Suspension cells (8,926 ESTs) | 12/50 (24%) | 19/50 (38%)
    Total (41,655 ESTs) | 122/304 (40%) | 151/304 (50%)

    Interestingly, some highly expressed proteins such as Rubisco large subunit were not observed in the EST libraries. As mentioned earlier, we believe that this is due to the chloroplast-encoded nature of certain mRNAs, such as Rubisco large subunit, which do not contain poly(A⁺) tails necessary for purification and cDNA library preparation (Sambrook et al., 1989).

    Highly abundant leaf ESTs not represented in the protein data to date included aquaporins, chlorophyll-binding proteins, and cytochrome B6. This apparent lack of correlation can be explained by the integral thylakoid membrane nature of these proteins. It is commonly accepted that integral membrane proteins are underrepresented in 2-DE due to poor solubilization. Lipoxygenase also appeared in the top 100 clones of five tissue-specific EST libraries; however, it was never identified in the protein dataset. Plants express both cytosolic and chloroplast isoforms of lipoxygenase, most of which have a molecular mass of approximately 100 kD. A possible explanation for the absence of this protein from the protein data could be the inherent discrimination against high-molecular mass proteins encountered during isoelectric focusing using IPG strips of fixed gel composition (Candiano et al., 2002).

    The lack of correlation between mRNA and protein could not always be explained. For example, identified stem proteins included acid phosphatase, actin, and osmotin; however, these proteins were absent or of very low abundance in the stem-specific EST library. Other proteins identified but not represented in the EST libraries included RNA-binding protein and ankyrin repeat protein in flowers and hydroxyacyl glutathione hydrolase in roots. Interestingly, elongation factor 1-alpha was observed as a highly expressed EST (top 50) in all tissues but was not observed in the protein set. The lack of correlation may be due to the relative turnover rates of both transcripts and proteins, or translational controls such as codon bias (Gygi et al., 1999), mRNA secondary structure (Wang and Wessler, 2001), or upstream open reading frame repression (Wang and Wessler, 1998).

    Based on the limited comparison above, we estimate a moderate 50% correlation between protein and mRNA levels. This value suggests a correlation that is higher than that reported by Gygi et al. (1999) but lower than that reported by Futcher et al. (1999). If the limitations imposed by the chloroplast-encoded proteins, poor representation of membrane proteins in 2-DE, and our limited protein dataset are taken into account, a higher correlation than that reported may be possible. Although a significant level of correlation is perceived, there are still many specific examples that show poor correlation.
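The 40% and 50% figures quoted above follow directly from the matched and identified protein counts in Table III. A minimal sketch of that arithmetic is given here; the counts are transcribed from Table III, whereas the dictionary layout and the function names `percent` and `summarize` are illustrative assumptions and not part of the original analysis.

```python
# Counts transcribed from Table III:
# tissue -> (identified proteins, matched in top 100 ESTs, matched in top 200 ESTs)
TABLE_III = {
    "Leaves": (64, 21, 30),
    "Stems": (46, 12, 16),
    "Roots": (40, 16, 19),
    "Flowers": (43, 16, 19),
    "Pods": (61, 45, 48),
    "Suspension cells": (50, 12, 19),
}


def percent(matched: int, identified: int) -> int:
    """Percentage of identified proteins that were matched, rounded half up."""
    return int(100 * matched / identified + 0.5)


def summarize(table: dict) -> dict:
    """Pool all tissues and report the totals plus the pooled percentages."""
    identified = sum(row[0] for row in table.values())
    top100 = sum(row[1] for row in table.values())
    top200 = sum(row[2] for row in table.values())
    return {
        "identified": identified,
        "top100": top100,
        "top200": top200,
        "top100_pct": percent(top100, identified),
        "top200_pct": percent(top200, identified),
    }


if __name__ == "__main__":
    for tissue, (identified, top100, top200) in TABLE_III.items():
        print(tissue, percent(top100, identified), percent(top200, identified))
    print(summarize(TABLE_III))
```

Because the totals are pooled over all 304 identified proteins, the pod value (79%) pulls the pooled top-200 figure above that of every other individual tissue.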

    CONCLUSIONS

    To date, we have identified over 300 proteins in specific tissues of barrel medic. Protein identifications using only protein databases were 25% successful even with good peptide mass fingerprints. Significant increases in protein identification success rates were achieved by using EST sequence databases. Using complementary protein, nucleotide, and EST sequence libraries, we were able to achieve a protein identification success rate of 55% for our representative protein dataset. We consider this a relatively high success rate in the absence of a genomic sequence and in comparison with other plant proteomic projects. Tentative consensus searches currently are being performed and confirm many of the proposed identifications in this study (Asirvatham et al., 2002b); however, this topic will be discussed in a separate publication.

    The 2-DE profiles of various barrel medic tissues provide reference maps for future proteomic comparisons of genetic mutants, biotically and abiotically challenged plants, and/or environmentally challenged plants. The identified proteins provide a survey of those proteins observable using current technology and also serve to define the limitations of the reported proteomics approach. For example, it will be difficult to study physiological processes other than photosynthesis and carbon metabolism in leaves using current proteomic technologies due to the very high levels of photosynthetic and carbon metabolism proteins in leaves. Further, the proteins identified serve as physiological markers of tissue-specific protein expression. Based on the limited dataset, 39% of all the identified proteins were only identified in a single tissue. These putative unique proteins provide valuable insight into the specialized physiological function of each of the tissues. For example, a comparison of roots and root-derived cell cultures can yield insights into the physiological phenomena associated with the dedifferentiation of root tissue during establishment of a suspension cell culture.

    A comparison between the levels of the identified proteins and mRNA levels quantified through EST counting was performed. It is estimated that on average 50% of the proteins appear to be correlated with their corresponding mRNA levels; conversely, 50% are not. Information on both transcript and protein levels can be utilized for targeting potential regulatory genes that are characterized by high transcript but low protein levels.

    The proteins identified in this study as unclear or putative represent unique opportunities to probe molecular function. Systematic perturbations and monitoring of these proteins would be expected to yield insight into function. These abundant but unclassified proteins have been linked to specific ESTs and, thus, establish the feasibility of experimentally monitoring both the protein and mRNA. The relatively high abundance of these proteins further stresses the biological but unknown importance of these proteins in barrel medic.

    This report provides a comprehensive overview of the barrel medic proteome and provides a good foundation for future comparative proteomic efforts associated with this important model plant. The importance of barrel medic is further emphasized by the recent recommendation from the National Academy of Sciences that the goals of the National Plant Genome Initiative for 2003 through 2008 should focus on a small number of key species including barrel medic (http://books.nap.edu/books/0309085292/html/index.html). This work serves as a major step in this direction for a key plant species. As we seek to better understand gene function and to study the holistic biology of systems, it is inevitable that we study the proteome.

    MATERIALS AND METHODS

    Plant Material and Protein Extraction

    Differentiated plant tissues were collected from barrel medic (Medicago truncatula cv Jemalong A17) grown in an environmentally controlled growth chamber and maintained under standard conditions (Asirvatham et al., 2002a). Eight-week-old plants were used for leaf and stem tissue. The top two apical unfolded trifoliates were sampled for leaf tissue, and stem tissue was restricted to the first two apical internodes. Flowers included all stages from buds until petal browning and all parts except the peduncles. Green seed pods were collected from a variety of developmental stages (from very young pods to those with maturing seeds) of 3-month-old plants. Roots were collected from seedlings grown in perlite 2 weeks after planting. Total protein from these tissues was extracted according to a reported method (Tsugita et al., 1994). In brief, tissues (0.4–1.0 g) were ground in liquid N2 and proteins precipitated at −20°C with 10% (w/v) TCA in acetone containing 0.07% (w/v) 2-mercaptoethanol for at least 45 min. The mixture was centrifuged at 35,000g at 4°C for 15 min, and the precipitates were washed with acetone containing 0.07% (w/v) 2-mercaptoethanol, 1 mM phenylmethylsulfonyl fluoride, and 2 mM EDTA. Pellets were dried by vacuum centrifugation and solubilized in 8 M urea, 4% (w/v) CHAPS, 20 mM DTT, 0.1% (v/v) Biolytes (pH 3–10; Bio-Rad Laboratories, Hercules, CA; Molloy et al., 1998).

    Cell cultures derived from barrel medic cv Jemalong A17 roots were grown in the dark in shaker flasks and suspended in Schenk and Hildebrandt (SH) medium with transfer to fresh medium every 2 weeks. Cells were harvested 4 d after transfer, washed once with fresh SH medium and once with SH:water (1:1 [v/v]), ground in liquid N2, and extracted with 40 mM Tris (pH 9.5), 50 mM MgCl2, 2% (w/v) polyvinylpolypyrrolidone, 1 mM phenylmethylsulfonyl fluoride, and 120 units mL⁻¹ endonuclease (catalogue no. E8263, Sigma, St. Louis) by sonication (Molloy et al., 1998). After centrifuging at 12,000g, 4°C, for 10 min, proteins in the supernatant were precipitated on ice with 12% (w/v) TCA, centrifuged, and washed with cold acetone. The pellet was air dried and resuspended in solubilization buffer.

    Protein Quantification and Electrophoresis

    Protein concentrations of all tissue extracts were quantified using the Bradford method (Bradford, 1976) and a commercial dye reagent (Bio-Rad) with bovine serum albumin as a standard. Eleven-centimeter immobilized pH gradient (IPG) strips (linear, pH 3–10) from Bio-Rad were rehydrated at 20°C with 0.75 to 1.0 mg of protein in 300 μL for 15 to 16 h. Focusing was carried out in a Bio-Rad Protean IEF Cell for a total of 35,000 volt hours. After focusing, strips were equilibrated with reduction and then with alkylation buffers, loaded onto a 12% (w/v) acrylamide gel, and run at 25 mA gel⁻¹ (Asirvatham et al., 2002a). Gels were stained overnight with Coomassie Brilliant Blue R-250 and destained the next day. Gel images were digitized with a Bio-Rad FluorS equipped with a 12-bit camera. Experimental molecular mass and pI were calculated from digitized 2-DE images using standard molecular mass marker proteins and the linear calibration option of Genomic Solutions HT Analyzer software (Genomic Solutions, Ann Arbor, MI).

    Digestions and MALDI-TOFMS

    Protein spots were excised from the gel, washed twice with water for 15 min, and destained with a 1:1 (v/v) solution of acetonitrile and 50 mM ammonium bicarbonate while changing solutions every 30 min until the blue color of Coomassie was removed. 2-DE gel spots were then dehydrated by washing twice with 100% acetonitrile and dried by vacuum centrifugation. Gel plugs were rehydrated with a solution of 10 ng μL⁻¹ bovine trypsin (Roche) in 25 mM ammonium bicarbonate and digested for 4 to 6 h at 37°C. The enzymatic digestions were stopped with the addition of 10% (v/v) formic acid, and the supernatant was saved. Gel plugs were extracted once with 25 μL acetonitrile:water (1:1 [v/v]) and once with 25 μL of 100% (w/v) acetonitrile. Supernatants were combined and taken to dryness. Peptides were resuspended in 2% (w/v) formic acid:acetonitrile (1:1 [w/v]), mixed 1:1 with matrix (10 mg mL⁻¹ α-cyano-4-hydroxycinnamic acid in the same solvent), and spotted for MALDI-TOFMS. Mass spectra were obtained with a PerSeptive Biosystems DE-STR at an instrument resolution exceeding 10,000 and internally mass calibrated by matching to at least one and often more autolytic trypsin peaks (906.5049, 1153.5741, 2163.0570, and 2273.1602). Database search results were reprocessed with a reiterative search algorithm (Intellical, XXXX, XX) at 20 ppm that recalibrates m/z based on the best hit. Intellical software is part of the ABI Proteomics Solutions 1 software. If the best match is a real match, the identification confidence score will increase after reiterative calibration. If the best match is a false positive, the score will generally decline. The process was especially useful when trypsin autolytic peaks were of low abundance or absent.

    Resultant peptide mass fingerprints were assigned an arbitrary quality score (PMFQ) to quantify the quality of the peptide fingerprint and are reported in Supplemental Table I. The PMFQ scores were assigned based on the relative number of analyte peptides observed and their relative intensities as compared with the most abundant trypsin autolytic peptide peaks (2,163 and 2,273). If no peptides were observed or if analyte peptides were less than 10% of the trypsin autolytic peaks, a PMFQ value of 0 was assigned. If fewer than five peptides with relative intensities less than the trypsin peaks were observed, then a PMFQ of 1 was assigned. If five or more analyte peptides with intensities approximately equal to the trypsin autolytic peaks were observed, then a PMFQ value of 3 was assigned. If significantly more peptides were observed with a relative intensity greater than the trypsin autolytic peaks (but trypsin peaks still ≥10% for internal m/z calibration), then a PMFQ value of 4 (approximately 10 peptides) or 5 (>10 peptides) was assigned. Both MALDI-TOFMS peptide fingerprints illustrated in Figure 2 have a PMFQ of 5.
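The PMFQ rules above can be sketched as a small scoring function. This is only an illustration under stated assumptions: the text defines the scores qualitatively and does not describe a PMFQ value of 2, so the numeric cutoffs used here (a relative intensity of 1.0 as the boundary for "approximately equal", and 5 to 10 peptides standing in for "approximately 10") are assumptions, as are the function name `pmfq_score` and its two inputs.

```python
def pmfq_score(n_peptides: int, rel_intensity: float) -> int:
    """Assign a peptide mass fingerprint quality (PMFQ) score.

    n_peptides    -- number of analyte peptides observed (hypothetical input)
    rel_intensity -- analyte peptide intensity divided by the intensity of the
                     most abundant trypsin autolytic peak (hypothetical input)
    """
    # PMFQ 0: no peptides, or analyte peptides below 10% of the trypsin peaks.
    if n_peptides == 0 or rel_intensity < 0.10:
        return 0
    # PMFQ 1: fewer than five peptides. The text pairs this with intensities
    # below the trypsin peaks; intensity is ignored here (assumption).
    if n_peptides < 5:
        return 1
    # PMFQ 3: five or more peptides at roughly the trypsin peak intensity.
    # Assumption: a ratio up to 1.0 is treated as "approximately equal".
    if rel_intensity <= 1.0:
        return 3
    # Intensities above the trypsin peaks: 5 for more than 10 peptides,
    # otherwise 4. Assumption: "approximately 10" is read as 5 to 10 peptides.
    return 5 if n_peptides > 10 else 4


if __name__ == "__main__":
    for n, ratio in [(0, 0.0), (3, 0.5), (6, 1.0), (10, 2.0), (15, 2.0)]:
        print(n, ratio, pmfq_score(n, ratio))
```

The sketch never returns 2, mirroring the description, and it does not check that the trypsin peaks remain intense enough for internal calibration.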

    Database Queries and Protein Identifications

    The peptide mass fingerprints were compared with sequences in: (a) NCBInr database (release January 1, 2002), (b) SwissProt database (release January 1, 2002), (c) dbESTothers (NCBI; release January 1, 2002), and/or (d) a subset of dbESTothers (NCBI) consisting of approximately 145,000 barrel medic EST sequences, dated November 15, 2001. The databases were queried using MS-Fit (http://prospector.ucsf.edu) in an automated mode using Proteomic Solutions 1 software from Applied Biosystems (Foster City, CA). Mass spectra were de-isotoped, baseline corrected, and threshold adjusted before database searching. Database searches were performed using a 100-ppm mass accuracy with a minimum requirement of four peptide matches from a submission list of typically 30 peptides. The maximum number of missed cleavages was set at one. The only user-defined modification specified was carbamidomethylation of Cys; however, the software default considered possible modifications of N-terminal Gln to pyro-Glu, oxidation of Met, and protein N terminus acetylation. When peptide mass fingerprints were matched to sequences in the EST databases, functional information was obtained by BLASTX (NCBI; http://www.ncbi.nlm.nih.gov/BLAST/) of the sequence or by reference of the clone identifier to the barrel medic gene index (MtGI; http://www.tigr.org/tdb/mtgi/). The theoretical molecular mass and pI of the identified protein were then calculated using GPMAW (Lighthouse data) and compared with the experimental molecular mass calculated from the digitized 2-DE images. Protein identifications were evaluated on the basis of multiple variables including the number of peptides matched, mass error (m/z accuracy), percent coverage of the matched protein with 10% of the full-length protein set as the minimum value, quality of the peptide maps, intensity of the matched peaks (18%–20% minimum), similarity of experimental and theoretical protein molecular masses and pIs, and species from which the sequence was matched. For EST matches, the percent coverage was calculated by dividing the number of matched amino acids by the total number of amino acids in the protein sequence returned from the BLASTX or MtGI searches.
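Two of the numerical criteria described above, the 100-ppm mass tolerance with a minimum of four peptide matches and the percent coverage with a 10% minimum, can be illustrated with a short sketch. This is not how MS-Fit is implemented; the function names and the example peak list are hypothetical, and the trypsin autolytic masses from the previous section are used only as stand-ins for a list of theoretical peptide masses.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error of an observed peak in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def count_matches(observed, theoretical, tol_ppm: float = 100.0) -> int:
    """Count observed masses lying within tol_ppm of any theoretical mass."""
    return sum(
        1
        for obs in observed
        if any(abs(ppm_error(obs, theo)) <= tol_ppm for theo in theoretical)
    )


def meets_minimum(n_matches: int, min_matches: int = 4) -> bool:
    """Apply the minimum requirement of four matched peptides."""
    return n_matches >= min_matches


def percent_coverage(matched_residues: int, protein_length: int) -> float:
    """Matched amino acids divided by total amino acids, as a percentage."""
    return 100.0 * matched_residues / protein_length


if __name__ == "__main__":
    # Stand-in theoretical masses (the trypsin autolytic peaks quoted above).
    theoretical = [906.5049, 1153.5741, 2163.0570, 2273.1602]
    # Hypothetical observed peak list for illustration only.
    observed = [906.51, 1153.60, 2163.10, 2273.50]
    n = count_matches(observed, theoretical)
    print(n, meets_minimum(n))
    print(percent_coverage(30, 300) >= 10.0)
```

The remaining criteria (quality of the peptide maps, peak intensity, and agreement of molecular mass and pI) are judgment-based in the text and are not modeled here.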

    LC/MS/MS

    Select digest mixtures were analyzed by nanoscale HPLC coupled with LC/MS/MS. Data were obtained using an ABI QSTAR Pulsar (Applied Biosystems) hybrid quadrupole time-of-flight mass spectrometer. The instrument m/z was calibrated with standards supplied by the manufacturer. Separated peptides were introduced into the mass spectrometer from an HPLC system equipped with an autosampler (LC Packings, San Francisco). Separations were achieved using an LC Packings nanoscale pepmap column (15 cm × 75 μm i.d., 3 μm, 100 Å, C18) and a linear binary gradient (solvent A was 1% [v/v] formic acid in 95%:5% [v/v] water:acetonitrile, whereas solvent B was 0.8% [v/v] formic acid in 5%:95% [v/v] water:acetonitrile). The linear gradient was 95% (w/v) A:5% (w/v) B (0 min) to 60% (w/v) A:40% (w/v) B over 33 min, then ramped to 5% (w/v) A:95% (w/v) B at 37 min and held at 5% (w/v) A:95% (w/v) B until 42 min, where it was returned to 95% (w/v) A:5% (w/v) B at 48 min and allowed to reequilibrate at 95% (w/v) A:5% (w/v) B until 60 min. Nanoscale-ESI was performed using a Protona interface and nanoelectrospray needles (silver-coated glass capillary, New Objective, Woburn, MA). Mass spectra datasets were searched against NCBInr, SwissProt, dbESTothers, and mtEST databases using Mascot (http://www.matrixscience.com). The search results were validated as described for the peptide mass fingerprint results.
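The binary gradient described above can be written as a set of breakpoints with linear interpolation between them. The sketch below assumes that each change in composition, including the return to starting conditions between 42 and 48 min, is a linear ramp; the names `GRADIENT`, `percent_b`, and `percent_a` are illustrative only.

```python
# (time in min, % solvent B) breakpoints read from the gradient description.
GRADIENT = [
    (0.0, 5.0),
    (33.0, 40.0),
    (37.0, 95.0),
    (42.0, 95.0),
    (48.0, 5.0),
    (60.0, 5.0),
]


def percent_b(t_min: float) -> float:
    """Percent solvent B at time t_min, by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0 to 60 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


def percent_a(t_min: float) -> float:
    """Percent solvent A, the remainder of the binary mixture."""
    return 100.0 - percent_b(t_min)


if __name__ == "__main__":
    for t in (0, 16.5, 33, 37, 40, 45, 60):
        print(t, percent_a(t), percent_b(t))
```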

    EST Counting and Protein Relative Abundance Estimates

    Barrel medic ESTs were extracted from dbEST (http://www.ncbi.nlm.nih.gov/, accessed November 4, 2001). ESTs were assembled into tentative consensus sequences by TIGR to generate the barrel medic gene index (MtGI, http://www.tigr.org/tdb/tgi.shtml). The MtGI release of September 7, 2001 was used to count the occurrence of barrel medic genes in six different EST datasets including leaf (one cDNA library of developing leaf, 7,831 ESTs), stem (one library of developing stem, 10,314 ESTs), root (three libraries of uninoculated root, 6,593 ESTs), flower (one library of developing flower, 3,404 ESTs), seed pod (one library of developing seed and one library of developing pod, 4,587 ESTs), and cell suspensions (one library of elicited cell suspensions, 8,926 ESTs). The barrel medic genes were then sorted in descending order by their EST counts for each dataset and used in the comparison with proteomic data.
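The counting and sorting step described above can be sketched as follows. This is a minimal illustration, assuming each EST in a tissue dataset has already been assigned a gene identifier; the function name `rank_genes`, the `TC`-style identifiers, and the tie-breaking by identifier are assumptions made for the example rather than details of the MtGI pipeline.

```python
from collections import Counter


def rank_genes(est_gene_ids, top_n: int = 200):
    """Rank genes by EST count within one tissue dataset.

    est_gene_ids -- one gene identifier per EST (hypothetical input format)
    Returns (gene, count, fraction of dataset) tuples in descending order of
    count; ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(est_gene_ids)
    total = len(est_gene_ids)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, n, n / total) for gene, n in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical toy dataset of six ESTs assigned to three genes.
    leaf_ests = ["TC1", "TC2", "TC1", "TC3", "TC1", "TC2"]
    for gene, n, fraction in rank_genes(leaf_ests, top_n=2):
        print(gene, n, round(fraction, 3))
```

Dividing by the dataset size gives a relative abundance that can be compared across libraries of different sizes, such as the 3,404 flower ESTs and the 10,314 stem ESTs.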

    Protein abundances were calculated using the normalized spot volume of each protein determined with HT Analyzer software (Genomic Solutions) as previously reported (Asirvatham et al., 2002a).

    ACKNOWLEDGMENTS

    We thank Dr. Richard Dixon for scientific discussion and editorial comments. We thank Drs. Zhentian Lei and Aaron Elmer for their assistance in performing LC/MS/MS analyses.

    Received December 11, 2002; returned for revision December 24, 2002; accepted January 3, 2003.

    LITERATURE CITED

    Anderson NG, Matheson A, Anderson NL (2001) Back to the future: the human protein index and the agenda for post-proteomic biology. Proteomics 1: 3–12

    Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, Bucher P, Cerutti L, Corpet F, Croning MDR et al. (2001) The InterPro database, an integrated documentation resource for protein families, domains, and functional sites. Nucleic Acids Res 29: 37–40

    Asirvatham VS, Watson BS, Sumner LW (2002a) Analytical and biological variances associated with proteomic studies of Medicago truncatula by 2-DE. Proteomics 2: 960–968

    Asirvatham VS, Watson BS, Wang L, Sumner LW (2002b) Protein identification success rates in proteomics studies of Medicago truncatula using peptide mass fingerprints to search protein, nucleotide and EST databases in a species without sequenced genomes. Proceedings of the 50th ASMS Conference on Mass Spectrometry and Allied Topics, Orlando, FL, June 2–6. American Society for Mass Spectrometry, Santa Fe, NM, pp xxx–xxx

    Audic S, Claverie J-M (1997) The significance of digital gene expression profiles. Genome Res 7: 986–995

    Barker DG, Bianchi S, Blondon F, Dattée Y, Duc G, Essad S, Flament P, Gallusci P, Génier G, Pierre G et al. (1990) Medicago truncatula, a model plant for studying the molecular genetics of the Rhizobium-legume symbiosis. Plant Mol Biol Rep 8: 40–49

    Batchelor AH, Piper DE, de la Brousse FC, McKnight SL, Wolberger C (1998) The structure of GABPalpha/beta: an ETS domain-ankyrin repeat heterodimer bound to DNA. Science 279: 1037–1041

    Bateman A, Birney E, Cerutti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL (2002) The Pfam protein families database. Nucleic Acids Res 30: 276–280

    Bell CA, Dixon RA, Farmer AD, Flores R, Inman J, Gonzales RA, Harrison MJ, Paiva NL, Scott AD, Weller JW et al. (2000) The Medicago genome initiative: a model legume database. Nucleic Acids Res 29: 1–4

    Bestel-Corre G, Dumas-Gaudot E, Poinsot V, Dieu M, Dierick J-F, van Tuinen D, Remacle J, Gianinazzi-Pearson V, Gianinazzi S (2002) Proteome analysis and identification of symbiosis-related proteins from Medicago truncatula Gaertn. by two-dimensional electrophoresis and mass spectrometry. Electrophoresis 23: 122–137

    Bevan M, Bancroft I, Bent E, Love K, Goodman H, Dean C, Bergkamp R, Dirkse W, Van Staveren M, Stiekema W et al. (1998) Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391: 485–488

    Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72: 248–254

    Candiano G, Musante L, Bruschi M, Ghiggeri GM, Herbert B, Antonucci F, Righetti PG (2002) Two-dimensional maps in soft immobilized pH gradient gels: a new approach to the proteome of the third millennium. Electrophoresis 23: 292–297

    Chang WWP, Huang L, Shen M, Webster C, Burlingame AL, Roberts JKM (2000) Patterns of protein synthesis and tolerance of anoxia in root tips of maize seedlings acclimated to a low-oxygen environment, and identification of proteins by mass spectrometry. Plant Physiol 122: 295–317

    Choudhary JS, Blackstock WP, Creasy DM, Cottrell JS (2001) Matching peptide mass spectra to EST and genomic DNA databases. Trends Biotechnol 19: S17–S22

    Comment (2002) World’s first complete legume genome sequencing project. Trends Plant Sci 7: 101

    Cook D, Dreyer D, Bonnet D, Howell M, Nony E, VandenBosch K (1995) Transient induction of a peroxidase gene in Medicago truncatula precedes infection by Rhizobium meliloti. Plant Cell 7: 43–55

    Cook DR (1999) Medicago truncatula: a model in the making! Curr Opin Plant Biol 2: 301–304

    Cook DR, VandenBosch K, de Bruijn FJ, Huguet T (1997) Model legumes get the nod. Plant Cell 9: 275–281

    Corthals G, Gygi S, Aebersold R, Patterson SD (2000) Identification of proteins by mass spectrometry. In T Rabilloud, ed, Proteome Research: Two-Dimensional Gel Electrophoresis and Identification Methods. Springer-Verlag, Berlin, pp 197–231

    Costa P, Bahrman N, Frigerio J-M, Kremer A, Plomion C (1998) Water-deficit-responsive proteins in maritime pine. Plant Mol Biol 38: 587–596

    Costa P, Pionneau C, Bauw G, Dubos C, Bahrman N, Kremer A, Frigerio J-M, Plomion C (1999) Separation and characterization of needle and xylem maritime pine proteins. Electrophoresis 20: 1098–1108

    Covitz PA, Smith LS, Long SR (1998) Expressed sequence tags from a root-hair-enriched Medicago truncatula cDNA library. Plant Physiol 117: 1325–1332

    Davin LB, Lewis NG (1992) Phenylpropanoid metabolism: biosynthesis of monolignols, lignans and neolignans, lignins and suberins. In HA Stafford, RK Ibrahim, eds, Recent Advances in Phytochemistry, Phenolic Metabolism in Plants. Plenum Press, New York, pp 325–376

    Dixon RA (1980) Plant tissue culture methods in the study of phytoalexin induction. In DS Ingram, JP Helgeson, eds, Tissue Culture Methods for Plant Pathologists. Blackwell Scientific Publications, Oxford, pp 185–186

    Dixon RA (1999) Isoflavonoids: biochemistry, molecular biology, and biological function. In D Barton, K Nakanishi, O Meth-Cohn, eds, Comprehensive Natural Product Chemistry. Elsevier, New York, pp 774–821

    Dunwell JM (1998) Cupins: a new superfamily of functionally diverse proteins that include germins and plant storage proteins. Biotechnol Genet Eng Rev 15: 1–32

    Duranti M, Horstmann C, Gilroy J, Croy RR (1995) The molecular basis for N-glycosylation in the 11S globulin (legumin) of lupin seed. J Protein Chem 14: 107–110

    Ewing RM, Kahla AB, Poirot O, Lopez F, Audic S, Claverie J-M (1999) Large-scale statistical analyses of rice ESTs reveal correlated patterns of gene expression. Genome Res 9: 950–959

    Futcher B, Latter GI, Monardo P, McLaughlin CS, Garrels JI (1999) A sampling of the yeast proteome. Mol Cell Biol 19: 7357–7368

    Gallardo K, Job C, Groot SPC, Puype M, Demol H, Vandekerckhove J, Job D (2001) Proteomic analysis of Arabidopsis seed germination and priming. Plant Physiol 126: 835–848

    Gallusci P, Dedieu A, Journet EP, Huguet T, Barker DG (1991) Synchronous expression of leghaemoglobin in Medicago truncatula during nitrogen-fixing root nodule development and response to exogenously supplied nitrate. Plant Mol Biol 17: 335–349

    Görg A, Obermaier C, Boguth G, Weiss W (1999) Recent developments in two-dimensional gel electrophoresis with immobilized pH gradients: wide pH gradients up to pH 12, longer separation distances and simplified procedures. Electrophoresis 20: 712–717

    Gorina S, Pavletich NP (1996) Structure of the p53 tumor suppressor bound to the ankyrin and SH3 domains of 53BP2. Science 274: 1001–1005

    Guerreiro N, Djordjevic MA, Rolfe BG (1999) Proteome analysis of the model microsymbiont Sinorhizobium meliloti: isolation and characterisation of novel proteins. Electrophoresis 20: 818–825

    Gygi SP, Rochon Y, Franza BR, Aebersold R (1999) Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19: 1720–1730

    Gyorgyey J, Vaubert D, Jimenez-Zurdo JI, Charon C, Troussard L, Kondorosi A, Kondorosi E (2000) Analysis of Medicago truncatula nodule expressed sequence tags. Mol Plant-Microbe Interact 13: 62–71

    Haridas V, Higuchi M, Jayatilake GS, Bailey D, Mujoo K, Blake ME, Arntzen CJ, Gutterman JU (2001) Avicins: triterpenoid saponins from Acacia victoriae (Bentham) induce apoptosis by mitochondrial perturbation. Proc Natl Acad Sci USA 98: 5821–5826

    Hayashi M, Mori H, Nishimura M, Akazawa T, Hara-Nishimura I (1988) Nucleotide sequence of cloned cDNA coding for pumpkin 11-S globulin beta subunit. Eur J Biochem 172: 627–632

    Higgins CF (1992) ABC transporters: from microorganisms to man. Annu Rev Cell Biol 8: 67–113

    Jacobs MD, Harrison SC (1998) Structure of an IkappaBalpha/NF-kappaB complex. Cell 95: 749–758

    Jasiński M, Stukkens Y, Degand H, Purnelle B, Marchand-Brynaert J, Boutry M (2001) A plant plasma membrane ATP binding cassette-type transporter is involved in antifungal terpenoid secretion. Plant Cell 13: 1095–1107

    Klose J, Kobalz U (1995) Two-dimensional electrophoresis of proteins: an updated protocol and implications for a functional analysis of the genome. Electrophoresis 16: 1034–1059

    Kruft V, Holger E, Jänsch L, Wolf W, Braun H-P (2001) Proteomic approach to identify novel mitochondrial proteins in Arabidopsis. Plant Physiol 127: 1694–1710

    Lewis NG, Yamamoto E (1990) Lignins: occurrence, biosynthesis and biodegradation. Annu Rev Plant Physiol Plant Mol Biol 41: 455–496

    Li HM, Kaneko Y, Keegstra K (1994) Molecular cloning of a chloroplastic protein associated with both the envelope and thylakoid membranes. Plant Mol Biol 25: 619–632

    Mann M, Pandey A (2001) Use of mass spectrometry-derived data to annotate nucleotide and protein sequence databases. Trends Biochem Sci 26: 54–60

    Mann M, Wilm M (1994) Error-tolerant identification of peptide sequence tags. Anal Chem 66: 4390–4399

    Mathesius U, Imin N, Chen H, Djordjevic MA, Weinman JJ, Natera SHA, Morris AC, Kerim T, Paul S, Menzel C et al. (2002) Evaluation of proteome reference maps for cross-species identification of proteins by peptide mass fingerprinting. Proteomics 2: 1288–1303

    Mathesius U, Keijzers G, Natera SHA, Weinman JJ, Djordjevic MA, Rolfe BG (2001) Establishment of a root proteome reference map for the model legume Medicago truncatula using the expressed sequence tag database for peptide mass