


The Plant Genome, July 2014, Vol. 7, No. 2

Original Research

Generation and Characterization of a Sugarbeet Transcriptome and Transcript-Based SSR Markers

Karen Klotz Fugate,* Diego Fajardo, Brandon Schlautman, Jocleita Peruzzo Ferrareze, Melvin D. Bolton, Larry G. Campbell, Eric Wiesman, and Juan Zalapa

Abstract

Sugarbeet is a major source of refined sucrose and is increasingly grown for biofuel production. Demand for higher productivity for this crop requires greater knowledge of sugarbeet physiology, pathology, and genetics, which can be advanced by the development of new genomic resources. Towards this end, a sugarbeet transcriptome of expressed genes from leaf and root tissues at varying stages of development and production, and after elicitation with jasmonic acid (JA) or salicylic acid (SA), was constructed and used to generate simple sequence repeat (SSR) markers. The transcriptome was generated via paired-end RNA sequencing and contains 82,404 unigenes. A total of 37,207 unigenes were annotated, of which 9480 were functionally classified using clusters of orthologous groups (COG) annotations, 17,191 were classified into biological process, molecular function, or cellular component using gene ontology (GO) terms, and 17,409 were assigned to 126 metabolic pathways using Kyoto Encyclopedia of Genes and Genomes (KEGG) identifiers. An SSR search of the transcriptome identified 7680 SSRs, including 6577 perfect SSRs, of which 3834 were located in unigenes with ungapped sequence. Primer pairs were designed for 288 SSR loci, and 72 of these primer pairs were tested for their ability to detect polymorphisms. Forty-three primer pairs detected single polymorphic loci and effectively distinguished diversity among eight B. vulgaris genotypes. The transcriptome and SSR markers provide additional, public domain genomic resources for an important crop plant and can be used to increase understanding of the functional elements of the sugarbeet genome, aid in discovery of novel genes, facilitate RNA-sequencing-based expression research, and provide new tools for sugarbeet genetic research and selective breeding.

Sugarbeet (Beta vulgaris L.) is an herbaceous dicotyledon and member of the Amaranthaceae family. Grown primarily for the production of refined sucrose, sugarbeet provides approximately 22% of the world’s sugar (Südzucker, 2013). It is also the source of two high-energy animal feeds (beet molasses and beet pulp), and is increasingly grown for biofuel production (Harland et al., 2006; Panella, 2010). Sugarbeet is grown in 42 countries on five continents, with Europe and North America producing more than 60% of the crop. Other economically important members of the species include table beet, chard, and fodder beet.

Published in The Plant Genome 7. doi: 10.3835/plantgenome2013.11.0038. © Crop Science Society of America, 5585 Guilford Rd., Madison, WI 53711 USA. An open-access publication.

All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

K.K. Fugate, M.D. Bolton, and L.G. Campbell, USDA-ARS, Northern Crop Science Lab., 1605 Albrecht Blvd. N., Fargo, ND 58102; D. Fajardo and B. Schlautman, Dep. of Horticulture, Univ. of Wisconsin, 1575 Linden Dr., Madison, WI 53705; J.P. Ferrareze, Dep. de Agronomia, Centro de Ciências Agroveterinárias, Univ. do Estado de Santa Catarina, 88520-000, Lages, SC, Brazil; E. Wiesman and J. Zalapa, USDA-ARS, Vegetable Crops Research Unit, 1575 Linden Dr., Madison, WI 53705. Received 22 Nov. 2013. *Corresponding author ([email protected]).

Abbreviations: BAC, bacterial artificial chromosome; COG, clusters of orthologous groups; EST, expressed sequence tag; GO, gene ontology; JA, jasmonic acid; KEGG, Kyoto Encyclopedia of Genes and Genomes; NCBI, National Center for Biotechnology Information; nr, nonredundant protein; PCoA, principal coordinates analysis; RPKM, reads per kilobase per million reads; SA, salicylic acid; SSR, simple sequence repeat; UPGMA, unweighted pair group method with arithmetic mean.

Published April 11, 2014

Sugarbeet production is challenged by an array of abiotic and biotic stresses that reduce biomass and sucrose content. Insufficient water, excessively hot or cold temperatures, and saline soils prevent the crop from reaching its full genetic potential and can reduce yield by as much as 50% (Boyer, 1982; Ober and Rajabi, 2010). Insects, including the sugarbeet root maggot (Tetanops myopaeformis von Röder) and root aphid (Pemphigus betae Doane), nematodes (Heterodera schachtii Schmidt), and fungal and viral diseases including Aphanomyces root rot (causal agent, Aphanomyces cochlioides Drechsl.), Cercospora leaf spot (causal agent, Cercospora beticola Sacc.), Fusarium yellows (causal agent, Fusarium oxysporum Schlect. f. sp. betae Snyd. & Hans.), Rhizoctonia root and crown rot (causal agent, Rhizoctonia solani Kühn), and rhizomania (causal agent, Beet necrotic yellow vein virus), significantly reduce root yield and sucrose content and can devastate individual fields (Harveson et al., 2009). Improvements in biomass production, sucrose content, nitrogen use efficiency, storage properties, and processing characteristics would also increase the efficiency of sugarbeet and sucrose production, reduce crop inputs, and minimize the environmental impacts of production. To address these challenges and goals, a better understanding of sugarbeet physiology, pathology, and genetics is essential.

Genomic resources are invaluable for advancing our understanding of plant biology and plant pathology and for improving crop germplasm. Genomic resources provide a wealth of information about genes and their expression and have been used extensively to provide insight into plant interactions with pathogens, insects, and the environment. They also have been widely used for the development of molecular markers that are used to assist selective breeding efforts. A variety of genomic resources are presently available for sugarbeet. DNA-based markers, including randomly amplified polymorphic DNAs (Barzen et al., 1995; Uphoff and Wricke, 1995), amplified fragment length polymorphisms (Schondelmaier et al., 1996; McGrath et al., 2007), single nucleotide polymorphisms (Schneider et al., 2007), and SSRs (Laurent et al., 2007; Smulders et al., 2010), have been developed and used to construct genetic maps of the nine chromosomes of sugarbeet. As of September 2013, 64,388 partial or complete sugarbeet nucleotide sequences, of which 30,313 are expressed sequence tags (ESTs), were available in the National Center for Biotechnology Information’s (NCBI) GenBank. In 2013, an unannotated draft of the sugarbeet genome was made available, with restricted use (Dohm et al., 2012). A sugarbeet transcriptome of expressed sequences has also been reported. While this transcriptome is a valuable resource, its usefulness is limited by the use of only apical shoot tissue for its construction (Mutasa-Gottgens et al., 2012).

Here, we describe the generation and characterization of a sugarbeet transcriptome that contains genes expressed in leaf and root tissue during early and late development, root tissue after postharvest storage, and root tissue after elicitation with JA or SA. RNA extracted from both leaf and root tissue at different stages of development and production was used to maximize the diversity of the transcripts in the generated database. Jasmonic acid- and SA-elicited tissues were also included to increase the expression of defense-related and stress-related transcripts since both JA and SA have central roles in the induction of plant defense responses to a wide variety of biotic and abiotic challenges (Vlot et al., 2009; Hayat et al., 2010; Ballaré, 2011; Wasternack and Hause, 2013). This transcriptome provides an additional genomic resource for sugarbeet that will assist in understanding the functional elements of the sugarbeet genome, aid in the discovery of novel genes, and facilitate RNA-sequencing-based expression studies. In addition, 7680 transcript-based SSR markers were identified using the information contained in the transcriptome, and primer sets were designed for 288 of these markers. From these 288 markers, 72 markers were validated and 43 polymorphic transcript-based SSR markers were used to assess diversity among eight B. vulgaris accessions to demonstrate their effectiveness. These markers add to the genomic and EST-based SSR markers currently available, and provide additional tools for assessing genetic diversity of Beta species and for molecular breeding. The transcriptome and SSR markers are made available with no restrictions on their use.

Materials and Methods

Plant Material

Sugarbeet ‘VDH66156’ was grown in a greenhouse as previously described for up to 16 wk (Fugate et al., 2013). Roots receiving a JA treatment, receiving an SA treatment, or placed into postharvest storage were washed to remove adhering potting media. Jasmonic acid and SA treatments were administered on the day of harvest by submerging harvested roots for 1 h at room temperature in aqueous solutions of 10 µM JA or 1 mM SA. Postharvest storage incubations were performed in a controlled environment chamber (Conviron, model MTR30, Winnipeg, MB, Canada) at 20°C and 90% relative humidity, in the dark. Tissue samples for RNA isolation were collected from young, newly developed leaf tissue with lamina lengths <5 cm, fully expanded leaf tissue, root tissue 5 wk after planting, root tissue 16 wk after planting, root tissue after 60 d postharvest storage, root tissue 2 d after JA treatment, and root tissue 2 d after SA treatment. Leaf tissue was collected from plants 6 to 10 wk after planting. Fully expanded leaf tissue was collected without regard to leaf order on the plant. All tissue samples were pooled samples collected from a minimum of four plants. Samples were flash frozen in liquid nitrogen, lyophilized, and stored at –80°C before use.

Eight B. vulgaris lines of diverse backgrounds were greenhouse grown for validating SSR loci. Accessions included wild B. vulgaris subsp. maritima annuals from France (PI 540605) and Greece (PI 546420), H-537 (a sugarbeet × fodder beet cross), F1010 (PI 535818), Y-322 (PI 583780), F1024 (PI 658654), FC221 (PI 651016), and L-19 (PI 590690) (Doney and Theurer, 1984; Germplasm Resources Information Network, 2013). Young leaf tissue from each accession was collected for DNA extraction. Tissue was flash frozen in liquid nitrogen immediately after collection and stored at –80°C before use.


RNA and DNA Isolation

Total RNA was extracted from 20 mg lyophilized tissue using an RNeasy Plant Mini Kit (Qiagen, Valencia, CA) with an on-column DNase digestion according to the manufacturer’s instructions. The RNA was quantified spectroscopically (NanoDrop ND-1000, Thermo Scientific, Wilmington, DE) and its quality was confirmed with an Agilent Technologies (Palo Alto, CA) 2100 Bioanalyzer. Genomic DNA was isolated using a DNeasy Plant Mini Kit (Qiagen) according to the manufacturer’s instructions. The DNA was quantified spectroscopically (NanoDrop ND-1000, Thermo Scientific), and DNA quality was assessed by the ratios of absorbances at 230, 260, and 280 nm and by gel electrophoresis.

RNA Sequencing

Equal quantities (5 µg) of RNA from each of the seven tissues were pooled for RNA sequencing. Oligo(dT) beads were used to isolate mRNA from total RNA. The mRNA was fragmented using heat and divalent cations, and the fragments were used for cDNA first-strand synthesis using random hexamer primers and reverse transcriptase. The RNA was degraded with RNase H and second-strand cDNA was synthesized with DNA Polymerase I. The cDNA fragments were purified using a QiaQuick PCR extraction kit (Qiagen), and ends were repaired with T4 DNA polymerase and Klenow DNA polymerase, 3′-end adenylated, and ligated to sequencing adapters. Adapter-ligated cDNAs were subjected to agarose gel electrophoresis, size selected, and amplified by PCR. Amplified products were paired-end sequenced by BGI Americas (Cambridge, MA) using an Illumina HiSeq 2000 (San Diego, CA) system. Sequence data were deposited in the NCBI Sequence Read Archive under accession number PRJNA219421.

Sequence Assembly, Annotation, and Classification

Raw sequence data were cleaned to remove reads with adapters, reads with >10% unknown nucleotides, and low quality reads. Transcriptome assembly from clean reads to contigs, scaffolds, and unigenes was performed using SOAPdenovo assembly software (Li et al., 2010). Unigenes were aligned with the protein databases nr (nonredundant protein), Swiss-Prot, KEGG, and COG using BLASTx (E-value < 10⁻⁵) to obtain sequence direction and gene, function, and pathway annotations. Alignments were made using the databases as they existed in October 2011. Where differences were noted between database alignments, a priority order of nr, Swiss-Prot, KEGG, and COG was followed. Unaligned unigenes were scanned using ESTScan ver. 2.1 to assist with the assignment of sequence direction (Iseli et al., 1999). Gene ontology annotation was obtained using nr annotations and the BLAST2GO program (Conesa et al., 2005). Functional classification of unigenes and determination of distribution of gene functions was performed using GO annotations and WEGO software (Ye et al., 2006).
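The unknown-nucleotide criterion used during read cleaning can be illustrated with a short sketch. This is not the cleaning pipeline used for the transcriptome; the function name `passes_n_filter` and the example reads are hypothetical, and the adapter and low-quality filters are omitted.

```python
def passes_n_filter(read: str, max_unknown_fraction: float = 0.10) -> bool:
    """Keep a read only if at most 10% of its bases are unknown (N).

    The 10% threshold comes from the text; treating exactly 10% as
    acceptable is an assumption of this sketch.
    """
    if not read:
        return False
    unknown = read.upper().count("N")
    return unknown / len(read) <= max_unknown_fraction


# Hypothetical 10-nt reads with 0, 2, and 1 unknown bases.
reads = ["ACGTACGTAC", "ACGTNNACGT", "ACGTNACGTA"]
clean = [r for r in reads if passes_n_filter(r)]
```

Only the read with two unknown bases (20%) is discarded here.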

SSR Loci Discovery and Primer Design

Detection of perfect SSRs was performed with SSRLocator (da Maia et al., 2008), identifying motifs with at least six dimer repeats, trimers with a minimum of four repeats, and at least three repeats for tetra-, penta-, and hexamers. Primer design was performed using Primer3 (Rozen and Skaletsky, 2000) and WebSat (Martins et al., 2009) to visually verify the location of the designed primer and avoid multiple amplification of sequence repeats in the same PCR product.
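The repeat-number thresholds above can be expressed as a small regular-expression search. The sketch below is an illustration only and does not reproduce SSRLocator; the function name `find_perfect_ssrs` and the test sequence are hypothetical, and skipping motifs that are themselves repeats of a shorter unit (for example AGAG or AAA) is an assumption of this sketch.

```python
import re

# Minimum repeat counts taken from the text: dimers >= 6, trimers >= 4,
# and tetra-, penta-, and hexamers >= 3.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Assumption: ignore motifs built from a shorter repeating unit,
            # so an AG repeat is not also reported as an AGAG repeat.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical sequence containing (AG)7 and (CTT)4.
example = "GGC" + "AG" * 7 + "TTC" + "CTT" * 4 + "GA"
ssrs = find_perfect_ssrs(example)
```

Positions are zero-based offsets into the input sequence.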

PCR and SSR Genotyping and Diversity Analysis

PCR and genotyping were conducted according to Zhu et al. (2012). Briefly, SSR forward primers were appended at the 5′ end with the M13 sequence (5′-CACGACGTTGTAAAACGAC-3′) to allow indirect labeling of reactions. Reverse primers were appended with the sequence GTTTCTT (PIG) at the 5′ end to promote nontemplated (A) addition and to facilitate subsequent genotyping. The M13 universal primer was labeled with a carboxyfluorescein fluorescent tag. SSR allele genotyping was conducted using a carboxy-X-rhodamine standard (GeneFlo-625 ROX; CHIMERx, Milwaukee, WI) in an ABI 3730 fluorescent sequencer (POP-6 and a 50-cm array; Applied Biosystems, Foster City, CA). Alleles were scored using GeneMarker Software v. 1.91 (SoftGenetics, State College, PA). Genetic diversity of the eight beet genotypes using SSR loci was evaluated using GenAlEx 6.4 (Peakall and Smouse, 2006). A principal coordinates analysis (PCoA) plot and an unweighted pair group method with arithmetic mean (UPGMA) tree were constructed based on genetic distances estimated between pairs of individuals as computed by GenAlEx 6.4 and MEGA 5.0 (Peakall and Smouse, 2006; Tamura et al., 2011).
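The primer tailing scheme described above amounts to prepending fixed sequences to each locus-specific primer. The following sketch shows that step; the tail sequences are those given in the text, whereas the function name `add_tails` and the two locus-specific primers are hypothetical placeholders.

```python
M13_TAIL = "CACGACGTTGTAAAACGAC"  # M13 sequence given in the text
PIG_TAIL = "GTTTCTT"              # PIG tail given in the text


def add_tails(forward: str, reverse: str) -> tuple[str, str]:
    """Prepend the M13 tail to the forward primer and the PIG tail to the
    reverse primer. All sequences are written 5' to 3'."""
    return M13_TAIL + forward.upper(), PIG_TAIL + reverse.upper()


# Hypothetical locus-specific primers, for illustration only.
fwd, rev = add_tails("ACCTGAAGGCTTCAACGA", "TGCATCCAAGGTTGTCAG")
```

Because the tails are added at the 5′ end, the 3′ ends that anneal to the template are unchanged.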

Results and Discussion

Transcriptome Generation and Assembly

Illumina paired-end sequencing is increasingly used to generate organismal or tissue-specific transcriptomes (Wang et al., 2010; Garg et al., 2011; Wei et al., 2011; Tao et al., 2012) and was used to generate a leaf and root transcriptome for sugarbeet. The sugarbeet transcriptome was generated from RNA isolated from leaf and root tissue at different stages of development, and root tissue after storage, elicitation with JA, or elicitation with SA. Illumina sequencing generated 80,295,612 clean reads comprising 7,226,605,080 nucleotides. Greater than 96% of sequenced bases had quality scores ≥ 20, indicating a sequencing error rate less than or equal to 1% for these bases. Within clean reads, only 0.01% of bases were unsequenced. The GC content of the generated sequence was 44.5%. This compares well with the 44.0 and 44.1% GC contents that were reported for coding regions of sequenced B. vulgaris bacterial artificial chromosome (BAC) clones isolated from two independently created BAC libraries (Dohm et al., 2009).
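Two of the figures in the preceding paragraph can be checked with simple arithmetic. The sketch below assumes the standard Phred definition, Q = -10 log10(P), to relate a quality score of 20 to a 1% error probability, and divides the reported nucleotide total by the reported read count to obtain the implied read length. The variable and function names are illustrative.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability,
    assuming the standard definition Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


# Totals reported in the text.
clean_reads = 80_295_612
total_nucleotides = 7_226_605_080

error_at_q20 = phred_to_error_probability(20)       # 0.01, i.e. 1%
mean_read_length = total_nucleotides / clean_reads  # nucleotides per read
```

With these totals the implied length is 90 nucleotides per clean read.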


Clean reads were assembled into 730,022 contigs containing 91,543,225 nucleotides using the SOAPdenovo software program (Li et al., 2010). Median and mean contig lengths were 92 and 125 nucleotides, respectively, with the length distribution of contigs shown in Fig. 1A. Contigs were then assembled into 165,266 scaffolds containing 44,317,682 nucleotides. Median and mean scaffold lengths were 313 and 268 nucleotides, respectively, with the length distribution of scaffolds shown in Fig. 1B. Greater than 80.6% of scaffolds contained no gaps. Paired-end reads were used to fill gaps where possible and generate unigenes that contained the smallest number of unassigned nucleotides and that could not be further extended at either the 5′ or 3′ end. A total of 82,404 unigenes containing 32,889,791 nucleotides were assembled. Unigenes ranged in size from 200 to 4497 nucleotides, with median and mean unigene lengths of 424 and 399 nucleotides, respectively. Overall, 65,962 unigenes (80.05%) were 200 to 500 nucleotides long, 13,066 unigenes (15.86%) were 500 to 1000 nucleotides long, 2326 unigenes (2.82%) were 1000 to 1500 nucleotides long, 712 unigenes (0.86%) were 1500 to 2000 nucleotides long, and 338 unigenes (0.41%) were >2000 nucleotides long (Fig. 1C). Greater than 83.4% of unigenes contained no gaps. An additional 6.0% of the unigenes contained gaps less than or equal to 5% of their length. Sugarbeet transcriptome unigene sequences were deposited in the figshare repository and are publicly available (Fugate, 2013).
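The mean lengths and size-class percentages reported above follow from the sequence counts and nucleotide totals. The sketch below recomputes them from the values given in the text; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# (number of sequences, total nucleotides) as reported in the text.
assemblies = {
    "contigs": (730_022, 91_543_225),
    "scaffolds": (165_266, 44_317_682),
    "unigenes": (82_404, 32_889_791),
}

# Mean length = total nucleotides / number of sequences, rounded to whole nt.
mean_lengths = {name: round(nt / n) for name, (n, nt) in assemblies.items()}

# Unigene counts per length class (in nucleotides), as reported in the text.
unigene_bins = {
    "200-500": 65_962,
    "500-1000": 13_066,
    "1000-1500": 2_326,
    "1500-2000": 712,
    ">2000": 338,
}
total_unigenes = sum(unigene_bins.values())
bin_percent = {k: round(100 * v / total_unigenes, 2)
               for k, v in unigene_bins.items()}
```

The five length classes sum to the 82,404 unigenes of the assembly, so the classes are exhaustive.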

Figure 1. Length distribution of assembled contigs, scaffolds, and unigenes. Number of (A) contigs, (B) scaffolds, and (C) unigenes as a function of size. Contigs, scaffolds, and unigenes were assembled from raw sequence data after reads with adapters, reads with >10% unknown nucleotides, and low quality reads were removed.

Unigene Annotation and Expression

A BLASTx sequence similarity search against entries contained in the NCBI nr, Swiss-Prot, KEGG, and COG protein databases, using a threshold E-value of 10⁻⁵, led to the annotation of 37,207 unigenes or 45.2% of the unigenes of the transcriptome. Unigenes varied widely in their expression within the transcriptome. To describe expression levels, the number of raw reads for each unigene was converted to reads per kilobase per million reads (RPKM) to eliminate the influence of gene length and sequencing discrepancies on the calculation of gene expression (Mortazavi et al., 2008). The expression level definitions of Verma et al. (2013) were applied for very low expression (RPKM < 3), low expression (RPKM 3–10), moderate expression (RPKM 10–50), high expression (RPKM 50–100), and very high expression (RPKM > 100). Sugarbeet unigenes had RPKM values of <1 to 2942, with 14,457 unigenes (17.5%) exhibiting very low expression, 25,450 unigenes (30.9%) exhibiting low expression, 28,445 unigenes (34.5%) moderately expressed, 9021 unigenes (10.9%) exhibiting high expression, and 5031 unigenes (6.1%) with very high expression. Unigene annotations and expression levels are available (Supplemental Table S1).
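The RPKM normalization and the expression classes can be sketched as follows. The formula is the usual reads per kilobase per million reads definition; the function names, the example read count, and the handling of values that fall exactly on a class boundary (3, 10, 50, or 100) are assumptions of this sketch, since the text does not specify them.

```python
def rpkm(read_count: int, length_nt: int, total_reads: int) -> float:
    """Reads per kilobase of sequence per million reads."""
    return read_count * 1e9 / (length_nt * total_reads)


def expression_class(value: float) -> str:
    """Assign an expression class using the RPKM thresholds cited in the text.

    Boundary handling is an assumption: 3, 10, and 50 are placed in the
    higher class, and exactly 100 is treated as high.
    """
    if value < 3:
        return "very low"
    if value < 10:
        return "low"
    if value < 50:
        return "moderate"
    if value <= 100:
        return "high"
    return "very high"


# Hypothetical unigene: 400 reads on a 500-nt unigene among 80 million reads.
example_rpkm = rpkm(400, 500, 80_000_000)
example_class = expression_class(example_rpkm)

# Class counts reported in the text; they sum to the 82,404 unigenes.
reported_counts = [14_457, 25_450, 28_445, 9_021, 5_031]
```

Dividing by length in kilobases and by reads in millions is what allows unigenes of different sizes to be compared on one scale.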

Unigene Functional Classification

A similarity search of unigenes to the COG database led to the assignment of 9480 unigenes to COGs, which were grouped into 24 functional classifications (Fig. 2). The greatest number of unigenes (2450; 25.6% of classified unigenes) were classified as “general function prediction only,” a classification that generally denotes biochemical activity (Tatusov et al., 2001). A large number of unigenes were also classified into the functional groups “replication, recombination, and repair” (1665, 17.6%) and “transcription” (1273, 13.4%). The predominance of unigene assignment to these three functional classifications is typical in plants (Wang et al., 2010; Wei et al., 2011; Liu et al., 2013). Of particular interest for sugarbeet were the 858 unigenes (9.1%) classified for their putative involvement in carbohydrate transport and metabolism since sucrose production and transport are of critical importance for the crop. Also of interest were the 316 unigenes (3.3%) classified for their potential contribution to defense mechanisms since JA- and SA-elicited root tissues were included as transcriptome source material to enhance generation of defense-related unigenes.

Figure 2. Distribution of unigenes based on Clusters of Orthologous Groups (COG) classification. Unigenes were aligned with the COG database, and 9480 unigenes were assigned COG annotations. The COG-annotated unigenes were grouped into 24 classes based on their putative function.

Using nr annotations and BLAST2GO software (Conesa et al., 2005), 17,191 unigenes were assigned GO terms and categorized using WEGO software (Ye et al., 2006) into the biological process, molecular function, or cellular component to which they putatively contribute (Fig. 3). Among biological processes, the greatest number of unigenes were functionally assigned to metabolic processes (7150 unigenes; 41.6% of GO-annotated unigenes) and cellular processes (6863 unigenes; 39.9%), two functional classes that include unigenes involved in metabolism at the organismal or cellular level, respectively. Other biological process classifications containing >10% of GO-annotated unigenes include the response to stimulus class (2225 unigenes, 12.9%) and localization class (1852 unigenes, 10.8%). The response to stimulus classification includes genes that function in the detection of and response to internal and external stimuli; the localization class includes genes participating in the transport, movement, sequestration, or retrieval of substances or cellular components. Also of interest were 85 unigenes (0.49%) assigned to immune system processes. Immune system processes include genes involved in defense responses as well as symbiotic and mutualistic processes. Among molecular functions, >10% of GO-annotated unigenes were putatively categorized as having catalytic activity (8264 unigenes; 48.1%) or binding (7764 unigenes; 45.2%) functions. Cellular component classifications assigned putative locations to unigene gene products. A large number of unigenes were unsurprisingly assigned to the cell (11,668 unigenes; 67.9%) or cell part (10,581 unigenes; 61.5%). A total of 8556 unigenes (49.8%) putatively produce gene products localized to organelles.

BLASTx alignment of unigenes to the KEGG protein database led to the annotation of 17,409 unigenes with KEGG identifiers and the assignment of these unigenes to 126 KEGG pathways (Table 1). Of the KEGG-anno-tated unigenes, 4020 unigenes (23.1%) were functionally assigned to metabolic pathways, 2131 unigenes (12.2%) were assigned to the biosynthesis of secondary metabo-lites, and 1253 unigenes (7.2%) were assigned roles in

Figure 3. Distribution of unigenes based on gene ontology (GO) classification. The GO terms were assigned to 17,191 unigenes. These unigenes were categorized by biological process, cellular location, or molecular process to which they putatively contribute.

Table 1. Distribution of unigenes assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The KEGG identifiers were assigned to 17,409 unigenes and identifiers were used to functionally categorize unigenes to 126 KEGG pathways. The number of unigenes assigned to a pathway and the number of unigenes assigned to a pathway expressed as the percentage of the total KEGG-annotated unigenes are provided.

Pathway

Unigenes in pathway

Number %

Metabolic pathways 4020 23.09Biosynthesis of secondary metabolites 2131 12.24Plant-pathogen interaction 1253 7.20Plant hormone signal transduction 996 5.72Spliceosome 647 3.72Ribosome biogenesis in eukaryotes 627 3.60RNA transport 613 3.52RNA degradation 578 3.32Starch and sucrose metabolism 450 2.58Endocytosis 416 2.39Phenylpropanoid biosynthesis 405 2.33Protein processing in endoplasmic reticulum 402 2.31Glycerophospholipid metabolism 366 2.10Ubiquitin mediated proteolysis 337 1.94Purine metabolism 326 1.87mRNA surveillance pathway 326 1.87

(cont’d)

fugate et al.: sugarbeet transcriptome and ssr markers 7 of 13

Pathway

Unigenes in pathway

Number %

Pyrimidine metabolism 293 1.68Stilbenoid, diarylheptanoid, and gingerol biosynthesis 266 1.53Ether lipid metabolism 249 1.43Limonene and pinene degradation 242 1.39Ribosome 239 1.37Amino sugar and nucleotide sugar metabolism 233 1.34Zeatin biosynthesis 226 1.30Glycolysis and gluconeogenesis 219 1.26ABC transporters 218 1.25Flavonoid biosynthesis 208 1.19Oxidative phosphorylation 207 1.19Peroxisome 204 1.17Phenylalanine metabolism 193 1.11Nucleotide excision repair 187 1.07Terpenoid backbone biosynthesis 177 1.02Phosphatidylinositol signaling system 177 1.02Pyruvate metabolism 177 1.02Cyanoamino acid metabolism 172 0.99Inositol phosphate metabolism 164 0.94Phagosome 160 0.92Aminoacyl-tRNA biosynthesis 158 0.91Homologous recombination 156 0.90a -Linolenic acid metabolism 151 0.87Cysteine and methionine metabolism 146 0.84Glutathione metabolism 141 0.81RNA polymerase 140 0.80Arginine and proline metabolism 138 0.79Galactose metabolism 135 0.78Tryptophan metabolism 134 0.77Glycerolipid metabolism 132 0.76DNA replication 129 0.74Circadian rhythm—plant 128 0.74Fatty acid metabolism 127 0.73Basal transcription factors 123 0.71Pentose and glucuronate interconversions 123 0.71Ascorbate and aldarate metabolism 120 0.69Porphyrin and chlorophyll metabolism 115 0.66Mismatch repair 110 0.63Valine, leucine and isoleucine degradation 110 0.63Carotenoid biosynthesis 108 0.62Propanoate metabolism 107 0.61Carbon fixation in photosynthetic organisms 103 0.59N-glycan biosynthesis 99 0.57Glycine, serine, and threonine metabolism 99 0.57Tyrosine metabolism 98 0.56Alanine, aspartate and glutamate metabolism 97 0.56Fructose and mannose metabolism 97 0.56Base excision repair 96 0.55Biosynthesis of unsaturated fatty acids 89 0.51Citrate cycle (tricarboxylic acid cycle) 88 0.51Pentose phosphate pathway 87 0.50Valine, leucine and isoleucine biosynthesis 86 0.49Photosynthesis 85 0.49b -Alanine metabolism 85 0.49Flavone and flavonol biosynthesis 83 0.48

Pathway

Unigenes in pathway

Table 1. Continued.

Pathway  Number  %
Regulation of autophagy  83  0.48
Lysine degradation  81  0.47
Ubiquinone and other terpenoid-quinone biosynthesis  81  0.47
Diterpenoid biosynthesis  81  0.47
Protein export  79  0.45
Sphingolipid metabolism  79  0.45
Phenylalanine, tyrosine and tryptophan biosynthesis  78  0.45
Proteasome  76  0.44
Other glycan degradation  72  0.41
Fatty acid biosynthesis  67  0.38
Glucosinolate biosynthesis  66  0.38
SNARE interactions in vesicular transport  66  0.38
Nitrogen metabolism  63  0.36
Benzoxazinoid biosynthesis  63  0.36
Butanoate metabolism  61  0.35
Histidine metabolism  60  0.34
Linoleic acid metabolism  55  0.32
Steroid biosynthesis  55  0.32
Natural killer cell mediated cytotoxicity  49  0.28
Glycosylphosphatidylinositol(GPI)-anchor biosynthesis  48  0.28
Isoquinoline alkaloid biosynthesis  48  0.28
Lysine biosynthesis  48  0.28
Glyoxylate and dicarboxylate metabolism  47  0.27
Selenocompound metabolism  46  0.26
Folate biosynthesis  46  0.26
Glycosaminoglycan degradation  45  0.26
Pantothenate and CoA biosynthesis  44  0.25
Sulfur metabolism  44  0.25
Tropane, piperidine and pyridine alkaloid biosynthesis  41  0.24
Other types of O-glycan biosynthesis  33  0.19
One carbon pool by folate  29  0.17
Riboflavin metabolism  28  0.16
Nicotinate and nicotinamide metabolism  27  0.16
Sulfur relay system  27  0.16
Non-homologous end-joining  25  0.14
Circadian rhythm - mammal  24  0.14
Thiamine metabolism  24  0.14
Glycosphingolipid biosynthesis - Ganglio series  23  0.13
Arachidonic acid metabolism  22  0.13
Vitamin B6 metabolism  22  0.13
Taurine and hypotaurine metabolism  21  0.12
Glycosphingolipid biosynthesis - Globo series  19  0.11
Sesquiterpenoid biosynthesis  18  0.10
Brassinosteroid biosynthesis  18  0.10
Synthesis and degradation of ketone bodies  16  0.09
Indole alkaloid biosynthesis  15  0.09
C5-branched dibasic acid metabolism  11  0.06
Caffeine metabolism  11  0.06
Biotin metabolism  9  0.05
Photosynthesis - antenna proteins  8  0.05
Monoterpenoid biosynthesis  8  0.05
Anthocyanin biosynthesis  7  0.04
Fatty acid elongation in mitochondria  7  0.04
Lipoic acid metabolism  4  0.02
Betalain biosynthesis  4  0.02
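The % column of Table 1 appears consistent with each pathway's unigene count divided by the 17,409 unigenes that were assigned to KEGG pathways. The short sketch below reproduces that arithmetic for a few rows. The function name `pathway_percent` is an assumption made for illustration, and the denominator is taken from the text rather than from the authors' analysis scripts.

```python
# Number of unigenes assigned to KEGG pathways, as stated in the text.
KEGG_ASSIGNED_UNIGENES = 17409


def pathway_percent(count, total=KEGG_ASSIGNED_UNIGENES):
    """Return the share of KEGG-assigned unigenes in a pathway, in percent.

    Assumption: Table 1 percentages are count / total * 100, rounded to
    two decimal places.
    """
    return round(100.0 * count / total, 2)


# A few (pathway, count) pairs copied from Table 1.
rows = [
    ("Regulation of autophagy", 83),
    ("Nitrogen metabolism", 63),
    ("Betalain biosynthesis", 4),
]

for name, count in rows:
    print(f"{name}: {pathway_percent(count):.2f}%")
```

For these three rows the computed values match the table entries (0.48, 0.36, and 0.02).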


plant-pathogen interactions. Because sucrose synthesis and metabolism are of critical importance for sugarbeet, the large number of unigenes putatively involved in primary carbon metabolism is of interest. Four hundred fifty unigenes (2.9%) were assigned functions in starch and sucrose metabolism, 219 unigenes (1.3%) were assigned to glycolysis and gluconeogenesis, 103 unigenes (0.6%) were assigned functions in carbon fixation, 88 unigenes (0.5%) were assigned to the tricarboxylic acid cycle, 85 unigenes (0.5%) putatively produce proteins involved in photosynthesis, and 8 unigenes (0.05%) encode photosynthetic antenna proteins. Differences in unigene assignments were evident between GO and KEGG classifications. When unigenes were categorized by GO annotations, 3 (0.02%), 13 (0.08%), and 4 (0.02%) unigenes were assigned roles in nitrogen utilization, pigmentation, and rhythmic processes, respectively (Fig. 3). When unigenes were assigned to pathways using KEGG identifiers (Table 1), 63 unigenes (0.4%) were assigned to nitrogen metabolism, 119 unigenes (0.7%) were assigned roles in carotenoid, anthocyanin, or betalain biosynthesis, and 128 unigenes (0.7%) were assigned roles in plant circadian rhythm.

SSR Identification and Characterization

A screen of the 82,404 sugarbeet unigene sequences using SSRLocator (da Maia et al., 2008) yielded 7680 putative SSRs in 6546 unigenes, including 6577 perfect SSRs (Table 2). Of the 6546 unigenes containing putative SSRs, 3589 contained ungapped sequence close to the repeated motif, yielding 3834 perfect SSRs for downstream applications. Among these, the most frequent SSR motifs were trinucleotides, followed by dinucleotides and tetranucleotides (Fig. 4). Dinucleotides, trinucleotides, and tetranucleotides accounted for 22, 67, and 5% of the total repeat motifs, respectively. The most frequent dinucleotide, trinucleotide, and tetranucleotide motifs were AG, CTT, and AAAT, which accounted for 8, 5, and 0.003%, respectively, of the total motifs. The unigenes containing the 3834 SSR sequences were screened for suitable flanking sites for PCR primers using WebSat (Martins et al., 2009), and primers were designed using Primer3 (Rozen and Skaletsky, 2000). A total of 288 unigenes contained suitable flanking sites and high-quality SSR motifs for primer design, with 97% of these containing dinucleotide (18%) or trinucleotide (79%) motifs (Supplemental Table S2).
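To make the idea of a perfect SSR search concrete, the following is a minimal sketch that scans a sequence for uninterrupted tandem repeats using a regular expression. It is not SSRLocator's algorithm. The function name `find_perfect_ssrs`, the example sequence, and the minimum repeat thresholds are assumptions; the thresholds are simply the lowest repeat counts that appear for each motif type in Table 2.

```python
import re

# Minimum number of tandem repeats per motif length (assumption inferred
# from Table 2: dinucleotides start at 6 repeats, trinucleotides at 4,
# tetra- to hexanucleotides at 3). Not SSRLocator's documented defaults.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) tuples."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # Capture a candidate motif, then require it to follow itself
        # at least (min_rep - 1) more times with no interruption.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, such as
            # "AA" or "CTTCTT", so each locus is reported once.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical sequence: a (CTT)14 repeat with short flanks.
example = "GGC" + "CTT" * 14 + "ACGTAC"
print(find_perfect_ssrs(example))
```

A production tool would also handle ambiguous bases, compound and imperfect repeats, and flanking-sequence quality, none of which this sketch attempts.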

For validation purposes, 72 SSR primer-pairs were used to amplify DNA from eight diverse B. vulgaris genotypes (Table 3). Sixty-four primer-pairs (89%) produced amplification products in the eight genotypes, while 13 primer-pairs (18%) showed monomorphic allelic patterns and seven primer-pairs (10%) produced allelic patterns not consistent with single-locus segregation in a diploid species. The remaining 43 primer-pairs (60%) produced a maximum of two fragments of the expected size per individual (Table 4) and were subsequently considered single polymorphic loci. The sequences of the 43 polymorphic SSR loci were submitted to GenBank, and accession numbers were assigned to them (Supplemental Table S3). Of these 43 SSR loci, 20 had a hit in the NCBI nr database (Table 5). The 43 polymorphic SSR primer-pairs were used to describe the genetic diversity of the eight sampled B. vulgaris accessions (Table 3) and detected 199 alleles in eight individual plants (Table 4). The number of different alleles for each primer-pair ranged from two to eight, with an average of 4.65 alleles per locus. The effective number of alleles ranged from 1.49 to 6.40, with an average of 3.09 per locus. Shannon’s Information Index for each primer-pair ranged from 0.56 to 1.91, with an average of 1.23. Based on data from the 43 SSR loci, principal coordinates analysis (PCoA) and a UPGMA tree clearly separated all of the Beta genotypes in the study (Fig. 5).
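The per-locus statistics reported in Table 4 can be illustrated with a short sketch. It assumes the standard frequency-based definitions: effective number of alleles Ne = 1/Σp², Shannon's Information Index I = −Σ p ln p, and expected heterozygosity He = 1 − Σp², where p is the frequency of each allele. The values in Table 4 are consistent with He = 1 − 1/Ne, but the function name `locus_diversity` and the genotypes below are hypothetical and are not data from this study.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Compute per-locus diversity statistics for diploid genotypes.

    genotypes: list of (allele, allele) tuples, or None for a plant
    that produced no scorable product at this locus.
    """
    scored = [g for g in genotypes if g is not None]
    alleles = [a for g in scored for a in g]
    freqs = [n / len(alleles) for n in Counter(alleles).values()]
    sum_p2 = sum(p * p for p in freqs)
    return {
        "N": len(scored),                            # plants scored
        "Na": len(freqs),                            # number of alleles
        "Ne": 1.0 / sum_p2,                          # effective alleles
        "I": -sum(p * math.log(p) for p in freqs),   # Shannon's index
        "Ho": sum(a != b for a, b in scored) / len(scored),
        "He": 1.0 - sum_p2,
    }


# Hypothetical fragment sizes (bp) for eight plants at one locus.
toy_locus = [
    (189, 189), (189, 195), (195, 195), (201, 201),
    (201, 207), (207, 207), (189, 195), (201, 207),
]
print(locus_diversity(toy_locus))
```

In this toy locus the four alleles are equally frequent, so Ne equals Na (4), He is 0.75, and I equals ln 4; four of the eight plants are heterozygous, giving Ho = 0.5.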

Table 2. Distribution of microsatellite repeat motifs in 82,404 sugarbeet transcriptome unigenes. The SSR loci were identified using the SSRLocator software program and grouped based on the number of nucleotides that were repeated (motif type) and the number of times this motif type was repeated. Parentheses contain the percentage of SSRs within a motif type and repeat number that were located in unigenes with ungapped sequences.

                Motif type
No. of repeats  Di-            Tri-           Tetra-         Penta-         Hexa-          Total
3               –              –              1397 (3.07%)   402 (4.97%)    484 (5.99%)    2283
4               –              2566 (69.4%)   128 (82.8%)    71 (73.2%)     160 (79.3%)    2925
5               –              721 (64.4%)    25 (80%)       21 (71.4%)     13 (61.5%)     780
6               641 (70.8%)    260 (62.3%)    12 (83.3%)     4 (50%)        1 (100%)       918
7               251 (68.5%)    123 (61.7%)    5 (100%)       0              0              379
8               112 (64.2%)    55 (78.1%)     1 (100%)       0              0              168
9               55 (72.7%)     22 (63.6%)     0              0              1 (100%)       78
>10             129 (81.3%)    19 (36.8%)     0              0              1 (100%)       149
Total           1188           3766           1568           498            660            7680

The 6577 perfect SSRs identified in the sugarbeet transcriptome and the 288 SSRs for which primers were designed (Supplemental Table S2) provide additional molecular tools for sugarbeet genetic research and selective breeding. Simple sequence repeats have proven to be effective markers for assessing genetic diversity, examining population genetics, conducting comparative genomics studies, and performing linkage mapping and association analysis, since SSR markers are codominant, hypervariable, abundant, relatively evenly distributed across genomes, and transferable across related species (Varshney et al., 2005; Zalapa et al., 2012). As transcript-based SSRs, the markers identified and characterized in this research are especially useful since they are more evolutionarily conserved and more likely to be associated with functional genes and genotype than SSRs developed from the genome (Zalapa et al., 2012). The SSR markers reported in this study add to the four genomic SSR markers developed by Mörchen et al. (1996), 57 genomic, restricted-use SSRs developed by Rae et al. (2000), 11 genome-based SSR markers developed from sea beet (B. vulgaris ssp. maritima; Cureton et al., 2002; Viard et al., 2002), eight genomic SSR markers developed by Richards et al. (2004), 23 newly reported SSR markers used for mapping purposes by McGrath et al. (2007), 25 genomic SSR markers developed by Smulders et al. (2010), 13 EST-based SSR markers reported in Simko et al. (2012), and the 242 genomic SSRs and 773 EST-based SSRs reported by Laurent et al. (2007), of which 41 genomic SSR marker primer pairs were published.

Figure 4. Frequency of dinucleotide to hexanucleotide repeat motifs in 3834 ungapped sugarbeet transcriptome SSR loci.

Table 3. Beta vulgaris accessions used to validate 72 transcriptome microsatellite loci. Accessions are identified by the common name assigned to a line by the breeder responsible for its development, or for collected germplasm, the plant introduction (PI) number assigned by the National Plant Germplasm System (NPGS). The PI numbers are listed for accessions submitted to NPGS. Entries are listed in approximate order of decreasing within-entry diversity.

Accession  Description  PI number
PI 540605  wild annual from France  540605
PI 546420  wild annual from Greece  546420
H-537  sugarbeet × fodder beet cross developed for biomass  –
F1010  high sugar population selected from Beta collection  535818
Y-322  selection from a wild × cultivated (L53) cross  583780
F1024  root maggot and Cercospora-resistant line  658654
FC221  Colorado line with multiple disease resistance  651016
L-19  inbred line from Utah  590690

Table 4. Primer sequence and diversity characteristics of 43 polymorphic sugarbeet transcriptome microsatellite markers. Markers were tested in eight diverse accessions that are described in Table 3. Per locus diversity characteristics include: the number of accessions (N), range of the product size observed (Range), number of alleles (Na), effective number of alleles (Ne), Shannon’s Information Index (I), observed heterozygosities (Ho), and expected heterozygosities (He).

SSR locus  Primers  Motif  Expected size  N  Range  Na  Ne  I  Ho  He
Unigene24552  F:AACATCTCACTCATCCTTCTTC R:ATGATAGCAAACGACTAGCAG  (CTT)14  195  8  189–223  8  5.82  1.91  0.75  0.83
Unigene27906  F:GAGCAGCAAACATGATAAGA R:GAAAACAGTGAGTATGGGTCTA  (CTA)11  267  8  273–291  4  3.05  1.21  0.50  0.67
Unigene15863  F:CATGTGTGTTGGAATGGAT R:GCTAATTTCTGTTACCACCTTC  (ATT)11  217  7  231–244  4  2.80  1.17  0.29  0.64
Unigene26753  F:GAGAAACAAATTCACCCATC R:GTAGTGGAAGTAAAAGCACCA  (CAA)10  288  8  302–328  6  3.05  1.39  0.50  0.67
Unigene7492  F:GCTTTCTTCTCATTAGGAACAC R:CACGTATTGTTGCCATATCTC  (AAT)10  249  8  252–273  5  2.67  1.24  0.63  0.63
Unigene22373  F:AAAGGAAACTACCCCTACACTT R:AAGAGAGAAAGAAGACGATGAG  (CCA)4  180  8  177–198  4  1.49  0.69  0.25  0.33
Unigene11422  F:TCTCTTTCTCTCTCCTCCATTA R:GTCTTCGTCCTCTGTCATATTT  (CAT)8  178  6  183–217  5  4.80  1.59  0.17  0.79
Unigene16898  F:AGAACTTAGATTGTGACCTGCT R:GATGGGAAGAGAGAGATTAGTG  (CAA)8  281  8  289–328  6  3.56  1.49  0.50  0.72
Unigene2305  F:TACTAAAACCCTACGAACTCCA R:TACACCTGTGATTGTCAGAAGA  (TCA)7  164  8  186–216  5  4.57  1.57  0.75  0.78
Unigene29209  F:TTATGACGACTTCTGTTAGCTG R:ACATTTGCTAGGATACATAGGC  (TTG)7  181  8  204–213  3  1.86  0.78  0.13  0.46
Unigene72402  F:TTAAGTACCAACTTCCAACAGC R:GCTGGCTAACGACATAAATTC  (TGT)7  243  6  266–304  4  3.00  1.24  0.00  0.67
Unigene17623B  F:ATTAGACCTCAATCTTCCAGC R:AATAATGGCAATCTACCAGC  (CAA)13  164  8  254–272  6  4.27  1.60  0.63  0.77
Unigene18963  F:CACTACCCCTTGTTTATCTTCA R:GGAAAATCTTGCTTCATTCC  (TGA)7  271  8  300–321  5  3.88  1.46  0.13  0.74
Unigene4157  F:CTCCTCTTTTCTTCTCTCTTCA R:GATCTTCTTGTTTGTCGTCTTC  (TAA)7  165  8  189–201  3  1.68  0.74  0.00  0.41
Unigene14805  F:ACATCTCAACTCTCAACAATCC R:TCACAAGGAGAAACCCTTC  (TCA)7  270  8  289–301  6  3.56  1.49  0.63  0.72
Unigene26319  F:CAGAATACACTTGGTGAGATGA R:TACTATGTTGTTGCTGCTGTG  (TGG)7  227  8  241–257  4  3.12  1.22  0.38  0.68
Unigene27833  F:GAGTCATCAACACCAAACTACA R:ATTAGCCAAGAAAATCACCC  (ATA)7  203  8  210–240  7  6.40  1.89  0.63  0.84
Unigene77067  F:CTTTAGTGTAGCGTTAGAGCG R:TAACAGCAGGACTGGAGAAG  (GTG)7  272  8  284–300  4  2.72  1.14  0.25  0.63
Unigene78820  F:AGGCTATCATCACTAGAACCAT R:GTAGTTTCTCGGGTCAATACAT  (TCA)7  260  8  283–286  2  1.60  0.56  0.25  0.38
Unigene9269  F:CTAAGGCAGAGTACACATTCG R:AGTTTGAAGAGGGTGAAGAAG  (TAT)7  279  8  300–321  5  3.88  1.46  0.13  0.74
Unigene22354  F:AGCTTTGTAACTTGTAAGGGG R:GAAAAGAGAAACGAGGGTTAG  (CTT)7  212  8  229–238  5  4.57  1.57  0.25  0.78
Unigene57236  F:TTGGAGAGAGAAAAGAGAGAAG R:ATCCCTTGACAGTAGAACTCC  (CAT)7  138  8  163–172  4  1.71  0.82  0.50  0.41
Unigene62524  F:GAGATTCATTCACCTTGCAC R:GGGAGATGCTTAGTTTTGTTAG  (CAA)7  198  8  218–232  5  4.41  1.54  0.63  0.77
Unigene2170  F:TTTCTGTCTCCTCTAAATCAGC R:GTACTCTCCATCTCCATGCTT  (ACA)6  129  8  144–155  4  2.84  1.18  0.38  0.65
Unigene25611  F:AGTTGGGGAAACAGAGAAA R:CACCATACTGAAGAACCTAAGA  (AG)23  249  6  273–335  4  3.00  1.24  0.00  0.67
Unigene11965  F:TTGAGTATTTTCGTCGGC R:CATCTACATCAGTTTTCCCTTC  (GA)18  144  8  144–168  7  4.57  1.72  0.63  0.78
Unigene26133  F:TGTATGAGAGAGATGGGATGTA R:CTGTTGTGACGGTAATTTTG  (TC)15  217  8  224–248  5  2.98  1.30  0.50  0.66
Unigene11506  F:GTAAACTCGGGTAATAACAAGC R:TATAAGTCATGCAGTCGAGCTA  (TC)13  160  7  180–201  5  2.58  1.22  0.43  0.61
Unigene15915  F:TTAGGTCTCTACAACTGATCCC R:TAGGGTCATAGGCAGTAAGATT  (CA)12  294  8  314–361  6  3.20  1.44  0.50  0.69
Unigene15403  F:ACTACGCTCTCGCTCTCTC R:ATGAGGTTGAGTTTGGTTTG  (TC)12  127  8  137–155  5  3.12  1.35  0.50  0.68
Unigene75234  F:ACTTGACCATCATAAAGGGTAG R:TTCTGGAAATTAGCGTGAGT  (GA)11  283  8  301–308  5  3.66  1.41  0.38  0.73
Unigene11240  F:AGTCTCTCAAGAACAACAACAG R:GTGGGGTAGTTAATCAGTTACA  (GA)11  239  8  256–264  4  2.61  1.16  0.38  0.62
Unigene27374  F:ATTTTAGGTGAATGGTGGTG R:GCTATAAGGCAAAAGGATGAC  (AT)11  147  7  162–172  4  1.85  0.90  0.29  0.46
Unigene23788  F:TGACTACCATCCATATTCTAGG R:GCAAATGAGAGGGACAATA  (TC)11  198  7  231–255  4  2.18  1.03  0.14  0.54
Unigene17923  F:AACCTTACTCCCTCTGATTTCT R:GGAGATACAACTTACAAGAGCC  (AC)10  211  8  197–237  6  3.28  1.47  0.38  0.70
Unigene15907  F:GGTCAACAATGGGATCTATGT R:CTACCTTCAACACTGCAAAATC  (GA)10  167  8  183–210  6  2.72  1.33  0.50  0.63
Unigene9505  F:ACCTACCTCTCCAAAACTTCA R:GTGTAATGCTTCTTTTCTGAGC  (CT)10  121  8  141–145  2  1.75  0.62  0.38  0.43
Unigene4192  F:GGAGTGTTGATGAGAAAGATGT R:CACGTACTGAAGATTGAAAAGC  (GA)10  207  8  232–238  4  1.94  0.92  0.50  0.48
Unigene9667  F:GTGACTGGTGAGAGGATAAAA R:TACACAGAACACATACCTTTCC  (GA)10  141  8  158–173  5  2.00  1.04  0.38  0.50
Unigene53607  F:CAGTATAGTTAGGAGTGATGCG R:AAATGAGGATTGACTTGGTG  (TC)10  183  8  201–216  4  3.12  1.23  0.75  0.68
Unigene10114  F:GCAAGTGGAGAAATATGTGAGT R:TATGTGGTAGTGGTACTGCTTG  (TG)9  284  8  310–320  3  1.68  0.74  0.00  0.41
Unigene48657  F:TAACTAAGGTTGGTGGAACA R:CTCTCATTTCTCCCTATCTCTC  (AG)9  157  8  176–182  3  1.68  0.74  0.00  0.41
Unigene14118  F:AAGTCTAACACCAGAATCCAGA R:AACCAGAGAGAATATGAGGATG  (AACA)8  158  8  165–180  3  2.84  1.07  0.13  0.65

Table 5. Annotations of 20 polymorphic sugarbeet transcriptome microsatellite markers tested in eight diverse accessions. Annotations were obtained by BLASTx alignment of SSR markers with the National Center for Biotechnology Information (NCBI) nonredundant protein (nr) database with E-value < 10^-5. Normalized score (Score) and expected value (E-value) of alignment are provided.

SSR locus  nr-ID  Annotation  Score  E-value
Unigene24552  gi|87241257|gb|ABD33115.1|  SAM (and some other nucleotide) binding motif; methyltransferase small; tetratricopeptide-like helical (Medicago truncatula)  64.7  3E-10
Unigene26753  gi|149944305|gb|ABR46195.1|  At1g36940 (Arabidopsis thaliana)  72.4  5E-12
Unigene7492  gi|1087017|gb|AAB35284.1|  arabinogalactan-protein (Nicotiana alata)  67.0  3E-10
Unigene22373  gi|224053953|ref|XP_002298055.1|  predicted protein (Populus trichocarpa) > gi|222845313|gb|EEE82860.1| predicted protein (Populus trichocarpa)  91.7  3E-18
Unigene16898  gi|15239614|ref|NP_197396.1|  pentatricopeptide (PPR) repeat-containing protein (Arabidopsis thaliana) > gi|223635758|sp|Q8GYM2.2| PP393_ARATH RecName: Full = Pentatricopeptide repeat-containing protein At5g18950  81.6  4E-15
Unigene2305  gi|296085240|emb|CBI28735.3|  unnamed protein product (Vitis vinifera)  90.5  5E-18
Unigene29209  gi|90200735|gb|ABD92785.1|  sister of ramosa 3 (Chasmanthium latifolium)  159  4E-38
Unigene17623B  gi|2346972|dbj|BAA21920.1|  ZPT2-11 (Petunia × hybrida)  169  6E-49
Unigene18963  gi|42563995|ref|NP_187692.2|  zinc finger (C3HC4-type RING finger) family protein (Arabidopsis thaliana) > gi|145651778|gb|ABP88114.1| At3g10810 (Arabidopsis thaliana)  62.8  1E-9
Unigene14805  gi|124365574|gb|ABN09808.1|  Protein phosphatase inhibitor (Medicago truncatula)  96.7  7E-20
Unigene26319  gi|87162867|gb|ABD28662.1|  cAMP response element binding (CREB) protein (Medicago truncatula)  186  1E-82
Unigene27833  gi|15230209|ref|NP_188511.1|  protein kinase family protein (Arabidopsis thaliana)  51.6  5E-6
Unigene77067  gi|15242850|ref|NP_200579.1|  heat shock protein-related (Arabidopsis thaliana) > gi|9759268|dbj|BAB09589.1| 101 kDa heat shock protein; HSP101-like protein (Arabidopsis thaliana)  285  3E-76
Unigene9269  gi|31323443|gb|AAP47023.1|AF375964_1  bell-like homeodomain protein 3 (Solanum lycopersicum)  256  2E-67
Unigene22354  gi|34365593|gb|AAQ65108.1|  At1g70350 (Arabidopsis thaliana)  76.3  2E-13
Unigene57236  gi|115448117|ref|NP_001047838.1|  Os02g0700300 (Oryza sativa Japonica Group) > gi|113537369|dbj|BAF09752.1| Os02g0700300 (Oryza sativa Japonica Group)  77.8  4E-14
Unigene62524  gi|12003402|gb|AAG43558.1|AF211540_1  Avr9/Cf-9 rapidly elicited protein 75 (Nicotiana tabacum)  55.5  2E-7
Unigene15907  gi|224132436|ref|XP_002328269.1|  glycosyltransferase, family GT8 (Populus trichocarpa) > gi|222837784|gb|EEE76149.1| glycosyltransferase, family GT8 (Populus trichocarpa)  348  4E-95
Unigene53607  gi|225459499|ref|XP_002284416.1|  PREDICTED: hypothetical protein (Vitis vinifera)  52.4  2E-6
Unigene14118  gi|22331889|ref|NP_191650.2|  catalytic and/or methyltransferase (Arabidopsis thaliana)  121  2E-27

Conclusions

RNA sequencing was used to generate a sugarbeet transcriptome that contained genes expressed in immature and mature leaves, early and late season roots, roots after postharvest storage, and roots after elicitation with JA or SA. The transcriptome contains 82,404 unigenes, of which 37,207 (45.2%) were annotated. Of the annotated unigenes, 9480 (25.5%) were functionally classified using clusters of orthologous groups (COG) annotations, 17,191 (46.2%) were classified into the biological process, molecular function, or cellular component to which they putatively contribute using GO terms, and 17,409 (46.8%) were assigned to 126 metabolic or functional pathways using Kyoto Encyclopedia of Genes and Genomes (KEGG) identifiers. An SSR search of the reference transcriptome identified 7680 SSRs, including 6577 perfect SSRs, of which 3834 were located in unigenes with ungapped sequence. Primer-pairs were designed for 288 selected SSR loci, and 72 of these primer-pairs were tested for their ability to detect polymorphisms; 43 of the tested primer-pairs (60%) detected single polymorphic loci and were effective in distinguishing diversity among a set of eight diverse B. vulgaris genotypes. The transcriptome and SSR markers described here provide additional public domain genomic resources for an important crop plant and will further our understanding of the functional elements of the sugarbeet genome, aid in the discovery of novel genes, facilitate RNA-sequencing-based expression research, and provide new tools for sugarbeet genetic research and selective breeding.

Supplemental Information Available

Supplemental material is included with this manuscript.

Supplemental Table S1. Annotation and expression of sugarbeet unigenes.

Supplemental Table S2. Primer sequences, motif, expected product size, and annealing temperature of 288 sugarbeet transcriptome SSR loci.

Supplemental Table S3. The NCBI GenBank accession numbers assigned to the 43 polymorphic sugarbeet transcriptome microsatellite markers.

Supporting Data Available

The unigene sequences comprising the reference transcriptome are available in the figshare repository (Fugate, 2013).

Acknowledgments

The authors thank John Eide for technical assistance, CNPq of Brazil for providing JPF’s scholarship, and the Beet Sugar Development Foundation for partial financial support of this research. Mention of trade names or commercial products is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. USDA is an equal opportunity provider and employer.

References

Ballaré, C.L. 2011. Jasmonate-induced defenses: A tale of intelligence, collaborators and rascals. Trends Plant Sci. 16:249–257. doi:10.1016/j.tplants.2010.12.001

Barzen, E., W. Mechelke, E. Ritter, E. Schulte-Kappert, and F. Salamini. 1995. An extended map of the sugar beet genome containing RFLP and RAPD loci. Theor. Appl. Genet. 90:189–193. doi:10.1007/BF00222201

Boyer, J.S. 1982. Plant productivity and environment. Science 218:443–448. doi:10.1126/science.218.4571.443

Conesa, A., S. Götz, J.M. García-Gómez, J. Terol, M. Talón, and M. Robles. 2005. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21:3674–3676. doi:10.1093/bioinformatics/bti610

Cureton, A.N., M.J. Burns, B.V. Ford-Lloyd, and H.J. Newbury. 2002. Development of simple sequence repeat (SSR) markers for the assessment of gene flow between sea beet (Beta vulgaris ssp. maritima) populations. Mol. Ecol. Notes 2:402–403. doi:10.1046/j.1471-8286.2002.00253.x

da Maia, L.C., D.A. Palmieri, V.Q. de Sonza, M.M. Kopp, F.I. de Carvalho, and A.C. de Oliveira. 2008. SSR Locator: Tool for simple sequence repeat discovery integrated with primer design and PCR simulation. Int. J. Plant Genomics 2008:412696. doi:10.1155/2008/412696

Dohm, J.C., C. Lange, D. Holtgräwe, T.R. Sörensen, D. Borchardt, B. Schulz, H. Lehrach, B. Weisshaar, and H. Himmelbauer. 2012. Palaeohexaploid ancestry for Caryophyllales inferred from extensive gene-based physical and genetic mapping of the sugar beet genome (Beta vulgaris). Plant J. 70:528–540. doi:10.1111/j.1365-313X.2011.04898.x

Dohm, J.C., C. Lange, R. Reinhardt, and H. Himmelbauer. 2009. Haplotype divergence in Beta vulgaris and microsynteny with sequenced plant genomes. Plant J. 57:14–26. doi:10.1111/j.1365-313X.2008.03665.x

Doney, D.L., and J.C. Theurer. 1984. Potential of breeding for ethanol fuel in sugarbeet. Crop Sci. 24:255–257. doi:10.2135/cropsci1984.0011183X002400020011x

Fugate, K. 2013. Sugarbeet transcriptome unigenes. figshare. Available at http://figshare.com/articles/Sugarbeet_transcriptome_unigenes/843615 (accessed 20 Nov. 2013, verified 13 Mar. 2014).

Fugate, K.K., J.P. Ferrareze, M.D. Bolton, E.L. Deckard, L.G. Campbell, and F.L. Finger. 2013. Postharvest salicylic acid treatment reduces storage rots in water-stressed but not unstressed sugarbeet roots. Postharvest Biol. Technol. 85:162–166. doi:10.1016/j.postharvbio.2013.06.005

Garg, R., R.K. Patel, A.K. Tyagi, and M. Jain. 2011. De novo assembly of chickpea transcriptome using short reads for gene discovery and marker identification. DNA Res. 18:53–63. doi:10.1093/dnares/dsq028

Figure 5. Genetic relationships among sugarbeet accessions based on SSR marker analysis. (A) Principal coordinates analysis and (B) unweighted pair group method with arithmetic mean (UPGMA) tree were used to visualize the genetic relationships among eight sugarbeet accessions as determined by SSR analysis using 43 polymorphic transcriptome microsatellite loci.


Germplasm Resources Information Network (GRIN). 2013. Available at www.ars-grin.gov/npgs/ (accessed 18 Nov. 2013, verified 13 Mar. 2014).

Harland, J.I., C.K. Jones, and C. Hufford. 2006. Co-products. In: A.P. Draycott, editor, Sugar beet. Blackwell Publ., Oxford, UK. p. 443–463.

Harveson, R.M., L.E. Hanson, and G.L. Hein. 2009. Compendium of beet diseases and pests. 2nd ed. APS Press, St. Paul, MN.

Hayat, Q., S. Hayat, M. Irfan, and A. Ahmad. 2010. Effect of exogenous salicylic acid under changing environment: A review. Environ. Exp. Bot. 68:14–25. doi:10.1016/j.envexpbot.2009.08.005

Iseli, C., C.V. Jongeneel, and P. Bucher. 1999. ESTScan: A program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc. Int. Conf. Intel. Syst. Mol. Biol. 1999:138–148.

Laurent, V., P. Devaux, T. Thiel, F. Viard, S. Mielordt, P. Touzet, and M.C. Quillet. 2007. Comparative effectiveness of sugar beet microsatellite markers isolated from genomic libraries and GenBank ESTs to map the sugar beet genome. Theor. Appl. Genet. 115:793–805. doi:10.1007/s00122-007-0609-y

Li, R., H. Zhu, J. Ruan, W. Qian, X. Fang, Z. Shi, Y. Li, S. Li, G. Shan, K. Kristiansen, S. Li, H. Yang, J. Wang, and J. Wang. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20:265–272. doi:10.1101/gr.097261.109

Liu, T., S. Zhu, Q. Tang, P. Chen, Y. Yu, and S. Tang. 2013. De novo assembly and characterization of transcriptome using Illumina paired-end sequencing and identification of CesA gene in ramie (Boehmeria nivea L. Gaud). BMC Genomics 14:125. doi:10.1186/1471-2164-14-125

Martins, W.S., D.C. Lucas, K.F. Neves, and D.J. Bertioli. 2009. WebSat - A web software for microsatellite marker development. Bioinformation 3:282–283. doi:10.6026/97320630003282

McGrath, J.M., D. Trebbi, A. Fenwick, L. Panella, B. Schulz, V. Laurent, S. Barnes, and S. Murray. 2007. An open-source first-generation molecular genetic map from a sugarbeet × table beet cross and its extension to physical mapping. Plant Gen. 1:S27–S44.

Mörchen, M., J. Cuguen, G. Michaelis, C. Hänni, and P. Saumitou-Laprade. 1996. Abundance and length polymorphism of microsatellite repeats in Beta vulgaris L. Theor. Appl. Genet. 92:326–333. doi:10.1007/BF00223675

Mortazavi, A., B.A. Williams, K. McCue, L. Schaeffer, and B. Wold. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5:621–628. doi:10.1038/nmeth.1226

Mutasa-Gottgens, E., A. Joshi, H. Holmes, P. Hedden, and B. Gottgens. 2012. A new RNASeq-based reference transcriptome for sugar beet and its application in transcriptome-scale analysis of vernalization and gibberellin responses. BMC Genomics 13:99. doi:10.1186/1471-2164-13-99

Ober, E., and A. Rajabi. 2010. Abiotic stress in sugar beet. Sugar Technol. 12:294–298. doi:10.1007/s12355-010-0035-3

Panella, L. 2010. Sugar beet as an energy crop. Sugar Technol. 12:288–293. doi:10.1007/s12355-010-0041-5

Peakall, R., and P.E. Smouse. 2006. GENALEX 6: Genetic analysis in Excel. Population genetic software for teaching and research. Mol. Ecol. Notes 6:288–295. doi:10.1111/j.1471-8286.2005.01155.x

Rae, S.J., C. Aldam, I. Dominguez, M. Hoebrechts, S.R. Barnes, and K.J. Edwards. 2000. Development and incorporation of microsatellite markers into the linkage map of sugar beet (Beta vulgaris spp.). Theor. Appl. Genet. 100:1240–1248. doi:10.1007/s001220051430

Richards, C.M., M. Brownson, S.E. Mitchell, S. Kresovich, and L. Panella. 2004. Polymorphic microsatellite markers for inferring diversity in wild and domesticated sugar beet (Beta vulgaris). Mol. Ecol. Notes 4:243–245. doi:10.1111/j.1471-8286.2004.00630.x

Rozen, S., and H.J. Skaletsky. 2000. Primer3 on the WWW for general users and for biologist programmers. In: S. Misener and S. Krawetz, editors, Bioinformatics methods and protocols. Humana Press, Totowa, NJ. p. 365–386.

Schneider, K., D. Kulosa, T. Soerensen, S. Möhring, M. Heine, G. Durstewitz, A. Polley, E. Weber, Jamsari, J. Lein, U. Hohmann, E. Tahiro, B. Weisshaar, B. Schulz, G. Koch, C. Jung, and M. Ganal. 2007. Analysis of DNA polymorphisms in sugar beet (Beta vulgaris L.) and development of an SNP-based map of expressed genes. Theor. Appl. Genet. 115:601–615. doi:10.1007/s00122-007-0591-4

Schondelmaier, J., G. Steinrücken, and C. Jung. 1996. Integration of AFLP markers into a linkage map of sugar beet (Beta vulgaris L.). Plant Breed. 115:231–237. doi:10.1111/j.1439-0523.1996.tb00909.x

Simko, I., I. Eujayl, and T.J.L. van Hintum. 2012. Empirical evaluation of DArT, SNP, and SSR marker-systems for genotyping, clustering, and assigning sugar beet hybrid varieties into populations. Plant Sci. 184:54–62. doi:10.1016/j.plantsci.2011.12.009

Smulders, M., G. Esselink, I. Everaert, J. De Riek, and B. Vosman. 2010. Characterisation of sugar beet (Beta vulgaris L. ssp. vulgaris) varieties using microsatellite markers. BMC Genet. 11:41. doi:10.1186/1471-2156-11-41

Südzucker. 2013. Sugar statistics. Available at http://www.suedzucker.de/en/Zucker/Zahlen-zum-Zucker/Welt/ (accessed 18 Nov. 2013, verified 13 Mar. 2014).

Tamura, K., D. Peterson, N. Peterson, G. Stecher, M. Nei, and S. Kumar. 2011. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28:2731–2739. doi:10.1093/molbev/msr121

Tao, X., Y.-H. Gu, H.-Y. Wang, W. Zheng, X. Li, C.-W. Zhao, and Y.-Z. Zhang. 2012. Digital gene expression analysis based on integrated de novo transcriptome assembly of sweet potato [Ipomoea batatas (L.) Lam.]. PLoS ONE 7:e36234. doi:10.1371/journal.pone.0036234

Tatusov, R.L., D.A. Natale, I.V. Garkavtsev, T.A. Tatusova, U.T. Shankavaram, B.S. Rao, B. Kiryutin, M.Y. Galperin, N.D. Fedorova, and E.V. Koonin. 2001. The COG database: New developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 29:22–28. doi:10.1093/nar/29.1.22

Uphoff, H., and G. Wricke. 1995. A genetic map of sugar beet (Beta vulgaris) based on RAPD markers. Plant Breed. 114:355–357. doi:10.1111/j.1439-0523.1995.tb01249.x

Varshney, R.K., A. Graner, and M.E. Sorrells. 2005. Genic microsatellite markers in plants: Features and applications. Trends Biotechnol. 23:48–55. doi:10.1016/j.tibtech.2004.11.005

Verma, P., N. Shah, and S. Bhatia. 2013. Development of an expressed gene catalogue and molecular markers from the de novo assembly of short sequence reads of the lentil (Lens culinaris Medik.) transcriptome. Plant Biotechnol. J. 11:894–905. doi:10.1111/pbi.12082

Viard, F., J. Bernard, and B. Desplanque. 2002. Crop-weed interactions in the Beta vulgaris complex at a local scale: Allelic diversity and gene flow within sugar beet fields. Theor. Appl. Genet. 104:688–697. doi:10.1007/s001220100737

Vlot, A.C., D.M.A. Dempsey, and D.F. Klessig. 2009. Salicylic acid, a multifaceted hormone to combat disease. Annu. Rev. Phytopathol. 47:177–206. doi:10.1146/annurev.phyto.050908.135202

Wang, Z., B. Fang, J. Chen, X. Zhang, Z. Luo, L. Huang, X. Chen, and Y. Li. 2010. De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweetpotato (Ipomoea batatas). BMC Genomics 11:726. doi:10.1186/1471-2164-11-726

Wasternack, C., and B. Hause. 2013. Jasmonates: Biosynthesis, perception, signal transduction and action in plant stress response, growth and development. Ann. Bot. 111:1021–1058. doi:10.1093/aob/mct067

Wei, W., X. Qi, L. Wang, Y. Zhang, W. Hua, D. Li, H. Lv, and X. Zhang. 2011. Characterization of the sesame (Sesamum indicum L.) global transcriptome using Illumina paired-end sequencing and development of EST-SSR markers. BMC Genomics 12:451. doi:10.1186/1471-2164-12-451

Ye, J., L. Fang, H. Zheng, Y. Zhang, J. Chen, Z. Zhang, J. Wang, S. Li, R. Li, L. Bolund, and J. Wang. 2006. WEGO: A web tool for plotting GO annotations. Nucleic Acids Res. 34:W293–W297. doi:10.1093/nar/gkl031

Zalapa, J.E., H. Cuevas, H. Zhu, S. Steffan, D. Senalik, E. Zeldin, B. McCown, R. Harbut, and P. Simon. 2012. Using next-generation sequencing approaches to isolate simple sequence repeat (SSR) loci in the plant sciences. Am. J. Bot. 99:193–208. doi:10.3732/ajb.1100394

Zhu, H., D. Senalik, B.H. McCown, E.L. Zeldin, J. Speers, J. Hyman, N. Bassil, K. Hummer, P.W. Simon, and J.E. Zalapa. 2012. Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.). Theor. Appl. Genet. 124:87–96. doi:10.1007/s00122-011-1689-2