8
Author's personal copy Evolutionary history of plant microRNAs Richard S. Taylor 1 , James E. Tarver 1, 2 , Simon J. Hiscock 3 , and Philip C.J. Donoghue 1 1 School of Earth Sciences, University of Bristol, Wills Memorial Building, Queen’s Road, Bristol BS8 1RJ, UK 2 Genome Evolution Laboratory, Department of Biology, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland 3 School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG, UK microRNAs (miRNAs) are short noncoding regulatory genes that perform important roles in plant develop- ment and physiology. With the increasing power of next generation sequencing technologies and the develop- ment of bioinformatic tools, there has been a dramatic increase in the number of studies surveying the miR- NAomes of plant species, which has led to an explosion in the number of described miRNAs. Unfortunately, very many of these new discoveries have been incompletely annotated and thus fail to discriminate genuine miRNAs from small interfering RNAs (siRNAs), fragments of longer RNAs, and random sequence. We review the published repertoire of plant miRNAs, discriminating those that have been correctly annotated. We use these data to explore prevailing hypotheses on the tempo and mode of miRNA evolution within the plant kingdom. Plant miRNAs are small genes with big potential but equally big problems miRNAs are short (21 nt) noncoding RNAs involved in post-transcriptional gene regulation through both degra- dative and nondegradative mechanisms [1,2]. In plants, miRNAs have been demonstrated to have an influential role in development [3] as well as tolerance and response to extrinsic stresses [4], including drought [5], temperature [6,7], salinity [8], oxidative stress [9], and exposure to UV radiation [10]. Hence, miRNAs are obvious targets for bioengineering to improve crop yield and food security, mitigating the impact of global climate change [11–13]. However, although there has been considerable research surveying the miRNA repertoire of individual plant spe- cies, and crop species in particular, a synthetic under- standing of their systematic distribution has been compromised by specious annotation of putative miRNA loci resulting in the effective corruption of databases such as miRBase [14,15] (http://mirbase.org). Our aim in this review is to evaluate published plant miRNAs to determine whether they meet the criteria required of annotation. We propose a phylogenetic framework for organising these data which predicts the miRNA repertoire of evolutionary lineages, including those that have yet to be studied. In this regard, we evaluate conventional perceptions of the tempo and mode of plant miRNA evolution and their role within plant organismal evolution, before considering this within the perspective of miRNA evolution more generally. How to identify a plant miRNA The criteria required to identify novel plant miRNAs are unambiguous. Initial surveys of the systematic distribu- tion and diversity of miRNAs quickly led to a consensus on the best practice criteria required for correctly identifying novel miRNAs based on the distinctive nature of miRNA biogenesis [16–18]. In this regard, the role of the DCL1 enzyme, a member of the Dicer family that cleaves double- strand RNAs, is influential [19,20]. DCL1 is responsible for the cleavage of the hairpin structures, known as miRNA primary sequences (pri-miRNA), into short functional miRNA strands. For DCL1 to process a double-stranded RNA, there must be a high level of complementarity between the opposing arms of the hairpin [21]. DCL1- mediated cleavage is extremely precise, recognising the 21 nt miRNA sequence with high fidelity [22]. Conse- quently, the pri-miRNA strands are cleaved with great precision 5 0 of the mature miRNA sequence and 3 0 of the miRNA* [23]. Cleavage on the 5 0 and 3 0 arms occurs with a positional offset of two nucleotides, giving rise to a char- acteristic overhang of two nucleotides on the 5 0 arm [23]. Thus, identification of bona fide miRNAs requires the following five criteria (Figure 1). (i) The miRNA sequence must have a high degree of complementarity to the oppos- ing arm. The necessary degree of complementarity of the functional part of the precursor is unclear but at a mini- mum there should be in excess of 15 nucleotides bonding with the opposing arm. (ii) The processed miRNA strands must show evidence of precise 5 0 cleavage. The vast major- ity of the processed reads from each arm of the precursor should have their 5 0 ends within a nucleotide of each other at the miRNA site. (iii) There is little or no heterogeneity in the sequences matching to the miRNA precursor. Evidence of such ‘smearing’, even if there is a greater accumulation of the purported miRNA strand, strongly suggests that a locus is an siRNA and not a miRNA. (iv) There must be evidence for the expression of the miRNA*. For validation of a miRNA family, this evidence can be obtained from any organism, and from any paralogue within the miRNA Review 1360-1385/$ see front matter ß 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tplants.2013.11.008 Corresponding author: Donoghue, P.C.J. ([email protected]). Keywords: microRNA; plant; annotation; classification; evolution; miRBase. Trends in Plant Science, March 2014, Vol. 19, No. 3 175

Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

Evolutionary history of plantmicroRNAsRichard S. Taylor1, James E. Tarver1,2, Simon J. Hiscock3, and Philip C.J. Donoghue1

1 School of Earth Sciences, University of Bristol, Wills Memorial Building, Queen’s Road, Bristol BS8 1RJ, UK2 Genome Evolution Laboratory, Department of Biology, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland3 School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG, UK

microRNAs (miRNAs) are short noncoding regulatorygenes that perform important roles in plant develop-ment and physiology. With the increasing power of nextgeneration sequencing technologies and the develop-ment of bioinformatic tools, there has been a dramaticincrease in the number of studies surveying the miR-NAomes of plant species, which has led to an explosionin the number of described miRNAs. Unfortunately, verymany of these new discoveries have been incompletelyannotated and thus fail to discriminate genuine miRNAsfrom small interfering RNAs (siRNAs), fragments oflonger RNAs, and random sequence. We review thepublished repertoire of plant miRNAs, discriminatingthose that have been correctly annotated. We use thesedata to explore prevailing hypotheses on the tempo andmode of miRNA evolution within the plant kingdom.

Plant miRNAs are small genes with big potential butequally big problemsmiRNAs are short (�21 nt) noncoding RNAs involved inpost-transcriptional gene regulation through both degra-dative and nondegradative mechanisms [1,2]. In plants,miRNAs have been demonstrated to have an influentialrole in development [3] as well as tolerance and response toextrinsic stresses [4], including drought [5], temperature[6,7], salinity [8], oxidative stress [9], and exposure to UVradiation [10]. Hence, miRNAs are obvious targets forbioengineering to improve crop yield and food security,mitigating the impact of global climate change [11–13].However, although there has been considerable researchsurveying the miRNA repertoire of individual plant spe-cies, and crop species in particular, a synthetic under-standing of their systematic distribution has beencompromised by specious annotation of putative miRNAloci resulting in the effective corruption of databases suchas miRBase [14,15] (http://mirbase.org).

Our aim in this review is to evaluate publishedplant miRNAs to determine whether they meet thecriteria required of annotation. We propose a phylogenetic

framework for organising these data which predicts themiRNA repertoire of evolutionary lineages, including thosethat have yet to be studied. In this regard, we evaluateconventional perceptions of the tempo and mode of plantmiRNA evolution and their role within plant organismalevolution, before considering this within the perspective ofmiRNA evolution more generally.

How to identify a plant miRNAThe criteria required to identify novel plant miRNAs areunambiguous. Initial surveys of the systematic distribu-tion and diversity of miRNAs quickly led to a consensus onthe best practice criteria required for correctly identifyingnovel miRNAs based on the distinctive nature of miRNAbiogenesis [16–18]. In this regard, the role of the DCL1enzyme, a member of the Dicer family that cleaves double-strand RNAs, is influential [19,20]. DCL1 is responsible forthe cleavage of the hairpin structures, known as miRNAprimary sequences (pri-miRNA), into short functionalmiRNA strands. For DCL1 to process a double-strandedRNA, there must be a high level of complementaritybetween the opposing arms of the hairpin [21]. DCL1-mediated cleavage is extremely precise, recognising the�21 nt miRNA sequence with high fidelity [22]. Conse-quently, the pri-miRNA strands are cleaved with greatprecision 50 of the mature miRNA sequence and 30 of themiRNA* [23]. Cleavage on the 50 and 30 arms occurs with apositional offset of two nucleotides, giving rise to a char-acteristic overhang of two nucleotides on the 50 arm [23].Thus, identification of bona fide miRNAs requires thefollowing five criteria (Figure 1). (i) The miRNA sequencemust have a high degree of complementarity to the oppos-ing arm. The necessary degree of complementarity of thefunctional part of the precursor is unclear but at a mini-mum there should be in excess of 15 nucleotides bondingwith the opposing arm. (ii) The processed miRNA strandsmust show evidence of precise 50 cleavage. The vast major-ity of the processed reads from each arm of the precursorshould have their 50 ends within a nucleotide of each otherat the miRNA site. (iii) There is little or no heterogeneity inthe sequences matching to the miRNA precursor. Evidenceof such ‘smearing’, even if there is a greater accumulationof the purported miRNA strand, strongly suggests that alocus is an siRNA and not a miRNA. (iv) There must beevidence for the expression of the miRNA*. For validationof a miRNA family, this evidence can be obtained from anyorganism, and from any paralogue within the miRNA

Review

1360-1385/$ – see front matter

� 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tplants.2013.11.008

Corresponding author: Donoghue, P.C.J. ([email protected]).Keywords: microRNA; plant; annotation; classification; evolution; miRBase.

Trends in Plant Science, March 2014, Vol. 19, No. 3 175

Page 2: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

family. (v) There should be a two nucleotide overhang of the50 sequence from the 30 sequence. Small variations infolding, or errors in folding prediction, may lead to a minoramount of heterogeneity.

Since these criteria were established [17,18], sequen-cing technology has improved dramatically, facilitatingcheaper sequencing at orders of magnitude greater depth.Many of these original criteria were effectively idealistic

TRENDS in Plant Science

Small RNA read dat amiRNA candidate 1 = AUACACCGAGG ACUU UAUGAA AUG miRNA candidate 2 = CGACAGA AGAGAGUG AGC ACmiRNA candidate 3 = AGU UACGACAUGCUUU UAUUAGUG

miRNA candid ate 1

miRNA candid ate 2

miRNA candid ate 3

Do the reads have consistent 5′ processing and presence of a star sequence?

miRNA candid ate 2

Are the two products complementary when folded?

miRNA candidate 2

BLAST search mature product against miRBase

miRNA candid ate 1

miRNA candid ate 2

Publish

Do reads map to genome and form a hairpin?

Figure 1. miRNA identification protocol. An outline of the correct protocol for identifying and annotating miRNAs from deep sequencing libraries, to ensure accuracy and

fulfilment of the established annotation criteria [16–18]. Adapted from [41].

Review Trends in Plant Science March 2014, Vol. 19, No. 3

176

Page 3: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

because it was not practically feasible to demonstratephenomena such as cleavage precision and expression ofthe miRNA*. As a result, this led to the spurious identi-fication of, for instance, siRNAs as miRNAs [24]. However,given the depth of read afforded by contemporary sequen-cing technology, these features can now be demonstratedtrivially, hence they have become required elements formiRNA annotation [16,17,25].

An unfortunate consequence of this increase in theburden required for miRNA annotation is that publicdatabases such as miRBase include putative miRNAs forwhich evidence is lacking for many of these annotationcriteria [14,15]. Resequencing has demonstrated thatmany previously annotated miRNAs are siRNAs or frag-ments of long noncoding RNAs including tRNAs, mRNAs,and rRNAs [26]. Despite the increased economy and scaleof deep sequencing technology, many contemporary sur-veys of plant miRNAs continue to regard many of theannotation criteria as optional, adding more incompletelyjustified and, therefore, potentially spurious miRNAs topublic databases [15].

Separating the wheat from the chaffWe set out to assess the validity of annotated plant miR-NAs. We evaluated the miRNA repertoires of all plantspecies available on miRBase v20 and, where possible,we also considered the sequencing data from the originalpublications when it had not been submitted to miRBase(see Table S4 in the supplementary material online).Additionally, we searched MirNest [27] (http://lemur.amu.edu.pl/share/php/mirnest/), an online repository for com-putationally predicted miRNAs. We considered theserecords in terms of the annotation criteria outlined above,paying particular attention to deep sequencing read data(where available) in attempting validation.

Our analysis considered putative miRNAs within theirfamilies, namely, the grouping of loci into families based onsimilarity in the sequence of the mature miRNA, based onthe assumption that paralogues within a family are des-cended through duplication from a single orthologue. Thisassumption reduced the burden of evidence required formiRNA validation because miRNA* expression need onlybe demonstrated in one paralogue of that family across allspecies.

Putative miRNAs exhibiting complementarity pairingof 15 or fewer nucleotides between the opposing armshairpin were deemed to have insufficiently robust struc-ture to be validated as a miRNA, unless there was strongevidence from deep sequencing for precise cleavage. Evi-dence of heterogeneous 50 processing, or an offset of greaterthan three nucleotides or less than one nucleotide betweenthe 50 and 30 cleaved ends of the hairpin sequence, were alsoreasons for rejection of many putative miRNAs.

Finally, we conducted a BLAST search of the validatedmiRNAs against the genome sequence data that wereavailable for each of the taxa. This allowed us to detectvalidated miRNAs in species where they had not beenreported previously, establish the systematic distributionof miRNA families among extant plant species, and, ulti-mately, infer their evolutionary history within the plantkingdom.

ResultsWe examined 6172 miRNA genes annotated in miRBasev20 and found that 1993 (32.3%) lacked sufficient evidenceto justify their annotation as genuine miRNAs (see TableS1 in the supplementary material online). Of these, themajority (61.1%) were rejected because they lacked evi-dence of miRNA* sequence expression for any paraloguewithin the respective family (112 families lacked evidenceof miRNA* expression in their miRBase entry but it wasavailable in their primary publication). A further 9.5%were rejected due to insufficient complementarity betweenopposing arms of the pri-miRNA and 29.4% of miRNA lociwere rejected because of evidence of imprecise 50 proces-sing. At the family level, we rejected 1351 of the 1802 plantmiRNA families on miRBase (Table 1; see Table S2 in thesupplementary material online). The vast majority (95.8%)of the miRNA families that we reject have been describedin just a single taxon. In some taxa, almost all such orphanmiRNA families failed to fulfil the criteria required forannotation (e.g., 96.3% of orphan miRNAs in Glycine maxand 89.2% in Medicago truncatula), invariably because ofan absence of miRNA* expression data. MirNest contained22 additional miRNAs that fulfilled the annotation cri-teria. BLAST searches of publicly available genomesequence data revealed 109 previously unidentifiedmiRNA families in 20 taxa (see Table S3 in the supple-mentary material online). We provide a library of validplant miRNAs in FASTA format at <http://palaeo.gly.bris.ac.uk/donoghue/>, which we hope will serve as a bench-mark for future surveys of plant miRNAomes.

The rejection of almost one-third of annotated plantmiRNA loci in miRBase, and three-quarters of the miRNAfamilies, confirms previous assertions [14,15] that thedatabase has been corrupted by spurious data. As pessi-mistic as this appears, our results may underestimate thenumber of spuriously identified miRNAs in miRBasebecause some entries are supported by extremely limitedevidence. However, many of the miRNAs that we reject forlack of evidence of miRNA* sequence expression may yetbe validated by resequencing. However, they cannot beconsidered valid miRNAs based on the evidence currentlyavailable.

A role for ‘Tree-Thinking’ in the study of plant miRNAsmiRNA families are defined on evolutionary principles andit has been generally accepted that there is some degree ofconservation in the evolution of plant miRNAs [28–30].Nevertheless, there have been few attempts to organise thedistribution of miRNAs among plant species within anevolutionary framework [29,31,32]. Rather, the presenceor absence of miRNA families among plant species hasbeen conveyed through increasingly unwieldy tablesthat fail to exploit the predictive power of tree-basedapproaches to organising phylogenetic data [33]. BecausemiRNAs are inherited through vertical descent, it is pos-sible to infer the miRNAome of common ancestors usingphylogenetic approaches and, therefore, predict the mini-mal miRNA repertoire of hitherto unsampled lineages(Figure 2).

Although it has been argued that few miRNAs aredeeply conserved and that the majority of miRNA families

Review Trends in Plant Science March 2014, Vol. 19, No. 3

177

Page 4: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

have evolved recently [29,34,35], a phylogenetic perspec-tive demonstrates precisely the reverse. Many miRNAfamilies have been inherited from the ancestral embryo-phyte (Figure 2, node a), and another large suite of miRNAfamilies encountered among angiosperms evolved in theancestral spermatophyte (Figure 2, node d). Few miRNAfamilies appear to have evolved within the lineage leadingto angiosperms after it separated from gymnosperms(Figure 2, node f). Instead many new miRNA familiesevolved among the angiosperms, within componentlineages after the divergence of the extant flowering plantsfrom their last common ancestor.

Based on the current knowledge of miRNA family dis-tribution and lineage divergence dates [36], it is alsopossible to establish evolutionary rates for the innovationof miRNA families (Figure 3). The rate of accumulation offamilies is high in the embryophyte stem lineage (Figure 2,node a; ‘bryophytes’ and vascular plants), after it separatedfrom its nearest green algal relatives, and in the sperma-tophyte stem lineage (Figure 2, node d; gymnosperms plus

angiosperms) after it separated from the pteridophytelineage (ferns). Rates of miRNA family innovation appearapproximately equivalent elsewhere within plant phylo-geny with the exception of the angiosperms where there isa marked increase in the rate of innovation after thedivergence of the monocot and eudicot evolutionarylineages (Figure 2, nodes g and h, respectively). However,due to the disparity in depth of coverage between thedifferent flowering plants species, it is impossible to under-take a proper comparison of the number of families pre-sent. For example, the most highly studied genus isArabidopsis, and it is unlikely to be a coincidence thatthe highest number of miRNA families has been recordedfrom this genus [37].

Figure 2 clarifies the effect that disparities in taxonomicsampling have on our understanding of the evolutionof miRNAs. Entire divisions have either no coverage(Hepatophyta, Anthocerophyta, Psilophyta, Sphenophyta,Cycadophyta, Gingkophyta, and Gnetophyta) or only asingle taxon has been surveyed (Bryophyta, Lycophyta,

Table 1. An outline of the miRNA families listed on miRBase v20 that were rejected due to lacking sufficient evidence for confidentannotationa

Embryophyta

Embr

yoph

yta

Tracheophyta

Trac

heop

hyta

Euphyllophyta

Euph

yllo

phyt

a

Spermatophyta

Sper

mat

ophy

ta Angiospermae

Poales

Angi

ospe

rmae

156, 159, 160, 166, 167, 171, 390, 408, 477, 530, 535

396

168, 169, 172

162, 164, 393, 394, 395, 397, 398, 399, 482, 2950

827

437, 444, 528, 2275

1432, 1878, 5566

403, 828, 2111, 3627, 3630, 4376, 4414

Taxon microRNA family present

Reason for rejec�on Number of families rejected (% rejected)

No reported star

Available read data do not support mature/ star

Heterogeneous processing

Improper mature/ star offset

Improper precursor structure

Reported products do not map to genome

No experimental support for annota�on

Total invalid

Total valid

779 (43.2%)

65 (3.6%)

423 (23.5%)

60 (3.3%)

89 (4.9%)

2 (0.1%)

97(5.4%)

1351 (75.0%)

451 (25.0%)

Monocotyledones

Eudicotyledones

aIn addition, the complement of miRNA families present in each of the major plant groups is listed, as inferred from the distribution in sampled taxa.

Review Trends in Plant Science March 2014, Vol. 19, No. 3

178

Page 5: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

Ceratopteris

651

951

0 61

661

7 61

1 71

0 93

804

6 93

169

261

793

893

2 84

461

393

4 93

9 93

035

827

477

828

1 112

0363

673 4

003 5

403 5

620 6

6016

1 116

9 19 1

3 04

7 263

414 4

3 26 3

4 26 3

5 263

8 26 3

9 263

1 363

236 3

336 3

4363

536 3

636 3

7263

0363

159 3

2593

3593

4 593

851

161

361

3 71

1 93

0 04

204

7 74

035

1 77

377

4 77

187

428

438

7 38

048

2 48

548

648

748

258

658

758

858

958

1 224

414 4

3343

534 3

634 3

9343

3443

6 443

8443

9 443

577

677

777

628

038

9 48

468

768

9392

3393

4144

9 051

7051

9555

0 151

514 4

1735

3 04

774

6 802

8802

2952

9 16 2

926 2

7 263

3 125

042 5

5555

6555

7555

8 555

065 5

1655

3 655

0 217

1 217

3 217

421 7

5217

6 217

7217

8 217

304

284

728

9051

726 3

437444

5282275

48223

4187

8 16 6

5 5

530

4655

075 5

0231

5 24 1

6 481

568 1

6681

478 1

090 2

9793

289 3

7415

251 5

751 5

261 5

665 5

261

840 5

0505

100 2

2002

3 002

6002

7002

9002

010 2

1102

2102

3102

4 102

0202

1 202

2202

320 2

4 20 2

520 2

620 2

llemiR-1llem

2-Ri

llemi

3-R

ll emi

4-R

l lemiR-5

llemiR-6

llemiR-7l lem

iR-8llem

iR-9llem

iR-10

llemiR-11

llemiR-123 1

- Ri

mell

774

335

435

535

735

835

8 98

2 09

4 09

3201

4 20 1

5201

720 1

820 1

9 201

2 301

3301

4301

5 30 1

6301

8 30 1

930 1

0401

140 1

240 1

3 401

4401

740 1

8401

9401

0 50 1

1501

250 1

3501

4 50 1

850 1

9 501

0 60 1

1601

2601

3601

4 601

5 60 1

7601

860 1

1701

270 1

3701

470 1

8701

9701

1121

212 1

4121

5 12 1

612 1

7121

9121

122 1

222 1

322 1

8 702

970 2

1802

2802

4802

609

019

9411

7511

261 1

861

271

Physcomitrella

Key:

Selaginella

Picea

Larix

Liriodendron

Chlamydomonas

Cyanophora

5 11 1 76

109

3

0 801

1 80 1

2801

3 801

480 1

5801

6801

880 1

9 801

0 901

190 1

2901

390 1

490 1

5 901

690 1

790 1

8 901

990 1

0011

1011

2 01 1

3011

401 1

5011

6 011

701 1

8 011

9011

0111

111 1

2111 77

4

Musa

Oryza

Tri�cum

Zea

Brachypodium

Hordeum

Sorghum

Pinus

Aquilegia

059 2

Cynara

6116

8116

411 6

Nico�ana

Solanum

Vi�s

6 734

Citrus

059 2

Theobroma

A. thaliana

A. lyrata

Medicago

Malus

Cucumis

Glycine

Populus

Ricinus

0592

3 462

472 5

4 443

4565

2105 3 5

8

8205

0 592

6 734

0592

5 93

0592

0592

0363

728

536

0592

0 35

EudicotyledonsM

onocotyledons

Tracheophyta

Embryophyta

Gym

nosperms

119

151 1

251 1

551 1

6 701

5 701

044

678 1

341 5

9415

5 51 5

6 515

8 515

7 085

1526

5967

3 01 6

7 206

4206

520 6

910 6

020 6

4416

5416

6 416

7 41 6

841 6

9416

3 516

4 516

551 6

9516

7 441

8 441

0541

1 246

4246

7246

0346

2 346

434 6

5 44 6

6446

9546

3 646

7646

8646

4 746

4 144

2125

6967

7 967

8967

9967

1077

635

328

749

059

159

6201

1151

3 96 3

796 3

9 07 3

5225

5 225

a

b

c

d

f

h

e

g

Angiosperms

3 22 6

361 5

2715

3 715

581 5

1 025

9 077

417 7

3277

4 277

7277

8277

1 377

237 7

4 377

837 7

2 47 7

3 577

4577

0677

367 7

2877

8 051

2253

7834

4994

7 665

7 881

3 424

9 112

7 305

9912

1325

672 5

2 646

322 4

4224

5224

6 224

0324

5 324

7 324

3324

7424

0 524

2424

1 68

868

5151

168 1

3681

578 1

3682

1 78 2

3 782

0 882

TRENDS in Plant Science

Latest �me of evolu�on of miRNA familymiRNA family lost in all descendents

Figure 2. The acquisition of validated miRNA families in taxa across 31 taxa representing a broad coverage of the plant kingdom. The observed distribution of families in

these taxa is utilised to infer which node each family is likely to have evolved at. Node labels: (a) Embryophyta, (b) Tracheophyta, (c) Euphyllophyta, (d) Spermatophyta, (e)

Gymnospermae, (f) Angiospermae, (g) Monocotyledons, and (h) Eudicotyledons. The established plant phylogeny is used (Angiosperm Phylogeny Website. Version 12,

July 2012; http://www.mobot.org/MOBOT/research/APweb/) and the majority of data are drawn from miRBase v20, with additional miRNA families that fulfil the annotation

criteria included for Musa [46], Triticum [10], Larix [47], Liriodendron [31], and Ceratopteris [31].

Review Trends in Plant Science March 2014, Vol. 19, No. 3

179

Page 6: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

and Pteridophyta). In the latter case, this creates theillusion that few miRNA families are conserved in theselineages, whereas the reality is that these miRNAs appearspecies-specific simply because they are the only speciesstudied so far within their fundamental evolutionary line-age. Inevitably, denser taxonomic sampling will revealthat the majority of these miRNAs have evolved muchearlier than the origin of the species.

Not so young miRNAsIt has been argued that, because the majority of plantmiRNAs appear to be species specific, they must haveevolved recently [29,34,35,38,39], that most of these miR-NAs are likely to lack any function, and that they will soonbe lost through neutral selection [35]. Our analysis revealstwo fundamental errors in these arguments. First,although it may be true that the species in which a novelmiRNA family is identified may itself be young, if thatspecies is the sole representative of an otherwiseunsampled evolutionary lineage, then we have no con-straint on the antiquity of the miRNA family except thatit evolved after that species lineage separated from thenearest living species or lineage whose miRNAome hasbeen assayed.

This is both most obvious and extreme in Physcomitrellapatens, which is the only representative of the moss evolu-tionary lineage to have its miRNA repertoire studied. P.patens has been described to encode 67 miRNA familiesthat are not encountered in any other plant species studiedso far. However, it does not follow that these miRNAfamilies are exclusive to the species P. patens. Rather,they are the net sum of the miRNA families that evolvedin the lineage leading to P. patens after it separated fromthe last common ancestor that it shared with tracheo-phytes (vascular plants, including the lycophytes, pterido-phytes, gymnosperms, and angiosperms; Figure 3, node a).

Through this episode lasting half a billion years, some ofthe miRNA families will have evolved very early and somemuch more recently, interpretations that can be reconciledby assaying the miRNAome of members of the moss lineagethat are related by degree to P. patens.

Similarly, even within the comparatively intensivelysampled angiosperms, those species that have been inves-tigated for their miRNAs are all relatively distantlyrelated. They are representative of their evolutionarylineages, not merely of their species. Indeed, we only haveapproximate knowledge of species-specific miRNAs in Ara-bidopsis where both Arabidopsis thaliana and Arabidopsislyrata have been investigated [37]. These two speciesappear to have 14 and 20 new miRNAs, respectively(Figure 2). However, our survey revealed that spuriouslyidentified, incompletely annotated, and/or lowly expressednovel miRNAs are most common in individual species. It islikely that resequencing efforts will significantly diminishthe inventory of truly species-specific miRNAs.

Nothing unique about model systemsIt has often been noted that miRNA families are distrib-uted very unevenly among evolutionary lineages, withspecies of Arabidopsis, Oryza, and Glycine havingevolved a disproportionately large number of miRNAfamilies. However, given the phylogenetic perspectivethat we provide, the distribution of miRNA familiesappears to be approximately proportional to the anti-quity of the evolutionary lineages assayed. For instance,the miRNAomes of the moss P. patens and the eudicot A.thaliana have both been well characterised. In the halfbillion years of independent evolutionary history sincethey last shared a common ancestor, these distantlyrelated evolutionary lineages have evolved comparablenumbers of novel miRNA families (67 and 66, respec-tively; Figure 2).

TRENDS in Plant Science

Num

ber o

f miR

NA

fam

ilies

pre

sent

100200300400500600Time before present (Ma)

0

20

40

60

80

100

Physcomitrella

Selaginella

Arabidopsis

a bc

ffdPinaceae

TracheophytaSpermatophyta

Angiospermae

e gh

Figure 3. The rate of acquisition of miRNA families over geological time. Divergence dates are consistent with both molecular data and the fossil record [36,48]. Node

labels: (a) Embryophyta, (b) Tracheophyta, (c) Euphyllophyta, (d) Spermatophyta, (e) Gymnospermae, (f) Angiospermae, (g) Monocotyledons, and (h) Eudicotyledons. Each

line produced at the point of the monocot (brown) and eudicot (green) divergence represents an angiosperm species, and illustrates both the disparity in coverage between

the angiosperm and rest of the plant kingdom, and the differences in coverage within the angiosperms.

Review Trends in Plant Science March 2014, Vol. 19, No. 3

180

Page 7: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

Less well-characterised taxa, such as Nicotiana andSelaginella, appear to have fewer miRNAs but this isalmost certainly an artefact of incomplete sampling. Wherecomparisons have been made within angiosperms to high-light the considerable variability of miRNA evolution atthe individual gene level, for example, between rice (Oryzasativa) and papaya (Carica papaya), it has been noted thatrice has gained 126 genes while losing only one, whereaspapaya has gained only eight genes and lost 25 [35].However, rice and papaya differ dramatically in the extentto which their miRNAs have been studied; a Web of Sciencesearch (search string Topic=([rice j papaya]) AND Topic=(microRNA OR miRNA) yields 602 versus ten publishedstudies, respectively. Bioinformatic approaches could over-come this if the level of gene sequencing was comparablefor both taxa. However, whereas rice has enjoyed effec-tively complete genome sequencing and annotation, thepapaya genome has been the subject of only very low (3�)sequence coverage [40]. Previous studies in animals haveshown that the ability to identify miRNAs bioinformati-cally is correlated to the level of genome sequencing, withlow coverage genomes missing up to 32% of the expectedmiRNA repertoire, whereas completely sequenced gen-omes were missing <3% [41]. Consequently, for compar-isons to be made between the miRNA repertoires of anytwo species, care should be taken to ensure that samplingbias does not affect interpretations of the available data.

The evolution of plant miRNAsA key feature of miRNA evolution is that once evolved,families are rarely lost and, as such, this high level ofconservation between taxa was exploited as an ancillarycriterion for miRNA annotation [17,18]. However, not allmiRNAs are equally conserved and it has been argued thatmore ancient miRNA families have a higher level of con-servation than younger families [32]. Our phylogeneticperspective on miRNA evolution reveals that of the 36observed secondary losses of miRNA families, 21 of thesewere in families that evolved prior to the origin of thespermatophytes (Figure 2), suggesting that older familiesmay not be more highly conserved.

It is clear that new miRNA families are not integratedinto gene regulatory networks at a continuous rate overplant evolutionary history but, rather, in distinct epi-sodes of miRNAome expansion [e.g., in the ancestralembryophyte (Figure 2, node a) and ancestral spermato-phyte (Figure 2, node d)], with long intervening periods ofstasis. It is tempting to suggest that these bursts ofmiRNA innovation are associated causally with majorepisodes of plant phenotypic evolution, such as in theadaptation of plants to life on land. Indeed, expansion inthe number of miRNA families within the ancestralangiosperm lineage (Figure 2, node f) has been implicatedin the origin of the flowering plants [35], although ouranalysis shows that this apparent coincidence is an arte-fact of the paucity of miRNA studies in pteridophytes andgymnosperms. Evidently, not all of the major events inplant evolution can be explained by the origin of newmiRNA families, although because miRNAs are never-theless known to play an essential role in the develop-ment of the flower, [42,43] co-option of existing miRNA

families to perform novel functions may also play a crucialrole in the evolution of novel phenotypes.

The relation between miRNA and phenotypic evolutionin plants contrasts with this same relation in animalevolutionary history where the innovation of miRNAfamilies is a constant process [44]. This has been evidencedby the observation that novel miRNA families characterisealmost every major animal clade surveyed to date [41].Animal evolution also exhibits a positive relation betweenthe sum of miRNA families and proxies for phenotypiccomplexity [45]. This relation does not appear to hold inplants, not least since, as we have shown, the moss P.patens and eudicot A. lyrata have acquired a comparablerepertoire of miRNAs.

Concluding remarks and future perspectivesOur review of the plant miRNA database has shown thatvery many fail to meet the criteria required for miRNAannotation. However, many of the miRNA families that wefailed to validate were rejected due to absence of evidence,rather than evidence that they failed to meet validationcriteria. In particular, the identification of miRNA* expres-sion was the major reason for the rejection of families.Conversely, many families were not rejected but have readcounts so low that it is difficult to assess the precision of 50

processing with any confidence. This paucity of evidence isparticularly acute in important crop species includingmaize (Zea mays) and soybean (Glycine max) (Figure 2).It is important, therefore, that the validity of these candi-date miRNAs is more fully evaluated through a systematicprogramme of miRNAome resequencing.

The phylogenetic framework within which we organisethe database of valid plant miRNAs provides not only apowerful perspective within which to trace the evolution ofmiRNA families but also to predict the miRNAome ofunstudied plant species. Although angiosperms representthe majority of plant biodiversity, to gain a more completeunderstanding of plant miRNA evolution it is essentialthat there is a rebalancing of sampling to more fullyrepresent the systematic breadth of the plant kingdom.Targeted sequencing of hitherto neglected lineages thatrepresent the majority of plant evolutionary history istherefore urgently needed. This will afford a better per-spective on the role that miRNAs have played in develop-mental evolution within the plant kingdom.

Finally, given their high levels of conservation andinterspersed locations across the genome, miRNAs mayserve as a simple proxy for tracing major events in theevolution of plant genomes. For instance, paralogy withinmiRNA families affords a ready insight into the phyloge-netic timing and sequence of whole genome duplicationevents in plant evolutionary history, as well as revealingthe roles that miRNA evolution play in responding togenome obesity following whole genome duplicationevents. This will require the establishment of a coherentsystematic classification scheme for paralogy groupswithin miRNA families, to replace current schemes thatserve more to obscure than to reveal homology.

Research into the evolution of the plant miRNAomeholds great potential for bioengineering as well as unco-vering the role of these regulatory factors in organismal

Review Trends in Plant Science March 2014, Vol. 19, No. 3

181

Page 8: Evolutionary history of plant microRNAspalaeo.gly.bris.ac.uk/donoghue/PDFs/2014/Taylor_et_al_2014.pdf · Author's personal copy family. 0 (v)Thereshouldbeatwonucleotideoverhangofthe

Author's personal copy

evolution. However, these prospects can only be achieved ifgreater effort is expended by researchers and referees inensuring that the published record is not further corruptedby the description of spurious miRNAs.

AcknowledgementsWe would like to thank Kevin Peterson (Dartmouth College) for advice,support, and encouragement throughout the course of this project. Thisresearch was funded through an Engineering and Physical SciencesResearch Council (EPSRC) doctoral training grant (DTG) through theBristol Centre for Complexity Science (R.S.T.), an Irish Research Council(IRCSET) postdoctoral fellowship (J.E.T.), a Black Swan Fund from theUniversity of Bristol, Faculty of Science (P.C.J.D., S.J.H.), a LeverhulmeTrust Research Fellowship (P.C.J.D.), the Natural EnvironmentalResearch Council (P.C.J.D. and S.J.H.), the Royal Society (P.C.J.D.),and the Wolfson Foundation (P.C.J.D.).

Appendix A. Supplementary dataSupplementary data associated with this article can befound, in the online version, at doi:10.1016/j.tplants.2013.11.008.

References1 Voinnet, O. (2009) Origin, biogenesis, and activity of plant microRNAs.

Cell 136, 669–6872 Carthew, R.W. and Sontheimer, E.J. (2009) Origins and mechanisms of

miRNAs and siRNAs. Cell 136, 642–6553 Rubio-Somoza, I. and Weigel, D. (2011) MicroRNA networks and

developmental plasticity in plants. Trends Plant Sci. 16, 258–2644 Kruszka, K. et al. (2012) Role of microRNAs and other sRNAs of plants

in their changing environments. J. Plant Physiol. 169, 1664–16725 Zeng, C.Y. et al. (2010) Conservation and divergence of microRNAs and

their functions in Euphorbiaceous plants. Nucleic Acids Res. 38, 981–9956 Barakat, A. et al. (2012) Genome wide identification of chilling

responsive microRNAs in Prunus persica. BMC Genomics 13, 4817 Chinnusamy, V. et al. (2007) Cold stress regulation of gene expression

in plants. Trends Plant Sci. 12, 444–4518 Frazier, T.P. et al. (2011) Salt and drought stresses induce the aberrant

expression of microRNA genes in tobacco. Mol. Biotechnol. 49, 159–1659 Sunkar, R. et al. (2006) Post-transcriptional induction of two Cu/Zn

superoxide dismutase genes in Arabidopsis is mediated bydownregulation of miR398 and important for oxidative stresstolerance. Plant Cell 18, 2051–2065

10 Wei, B. et al. (2009) Novel microRNAs uncovered by deep sequencing ofsmall RNA transcriptomes in bread wheat (Triticum aestivum L.) andBrachypodium distachyon (L.) Beauv. Funct. Integr. Genomics 9, 499–511

11 Ivashuta, S. et al. (2011) Regulation of gene expression in plantsthrough miRNA inactivation. PLoS ONE 6, e21330

12 Zhou, M. and Luo, H. (2013) MicroRNA-mediated gene regulation:potential applications for plant genetic engineering. Plant Mol. Biol.83, 59–75

13 Li, J.F. et al. (2013) Comprehensive protein-based artificial microRNAscreens for effective gene silencing in plants. Plant Cell 25, 1507–1522

14 Jones-Rhoades, M.W. (2012) Conservation and divergence in plantmicroRNAs. Plant Mol. Biol. 80, 3–16

15 Meng, Y. et al. (2012) Are all the miRBase-registered microRNAs true?A structure- and expression-based re-examination in plants. RNA Biol.9, 249–253

16 Kozomara, A. and Griffiths-Jones, S. (2011) miRBase: integratingmicroRNA annotation and deep-sequencing data. Nucleic Acids Res.39, D152–D157

17 Meyers, B.C. et al. (2008) Criteria for annotation of plant microRNAs.Plant Cell 20, 3186–3190

18 Ambros, V. et al. (2003) A uniform system for microRNA annotation.RNA 9, 277–279

19 Reinhart, B.J. et al. (2002) MicroRNAs in plants. Genes Dev. 16, 1616–1626

20 Park, W. et al. (2002) CARPEL FACTORY, a Dicer homolog, andHEN1, a novel protein, act in microRNA metabolism in Arabidopsisthaliana. Curr. Biol. 12, 1484–1495

21 Axtell, M.J. et al. (2011) Vive la difference: biogenesis and evolution ofmicroRNAs in plants and animals. Genome Biol. 12, 221

22 Bologna, N.G. et al. (2013) Processing of plant microRNA precursors.Brief. Funct. Genomics 12, 37–45

23 Kurihara, Y. and Watanabe, Y. (2004) Arabidopsis micro-RNAbiogenesis through Dicer-like 1 protein functions. Proc. Natl. Acad.Sci. U.S.A. 101, 12753–12758

24 Berezikov, E. (2011) Evolution of microRNA diversity and regulation inanimals. Nat. Rev. Genet. 12, 846–860

25 Tarver, J.E. et al. (2012) Do miRNAs have a deep evolutionary history?Bioessays 34, 857–866

26 Chiang, H.R. et al. (2010) Mammalian microRNAs: experimentalevaluation of novel and previously annotated genes. Genes Dev. 24,992–1009

27 Szczesniak, M.W. et al. (2012) miRNEST database: an integrativeapproach in microRNA search and annotation. Nucleic Acids Res.40, D198–D204

28 Zhang, B. et al. (2006) Conservation and divergence of plant microRNAgenes. Plant J. 46, 243–259

29 Cuperus, J.T. et al. (2011) Evolution and functional diversification ofmiRNA genes. Plant Cell 23, 431–442

30 Axtell, M.J. et al. (2007) Common functions for diverse small RNAs ofland plants. Plant Cell 19, 1750–1769

31 Axtell, M.J. and Bartel, D.P. (2005) Antiquity of microRNAs and theirtargets in land plants. Plant Cell 17, 1658–1673

32 Axtell, M.J. and Bowman, J.L. (2008) Evolution of plant microRNAsand their targets. Trends Plant Sci. 13, 343–349

33 Pagel, M. (1999) Inferring the historical patterns of biologicalevolution. Nature 401, 877–884

34 Axtell, M.J. (2013) Classification and comparison of small RNAs fromplants. Annu. Rev. Plant Biol. 64, 137–159

35 Nozawa, M. et al. (2012) Origins and evolution of microRNA genes inplant species. Genome Biol. Evol. 4, 230–239

36 Clarke, J.T. et al. (2011) Establishing a time-scale for plant evolution.New Phytol. 192, 266–301

37 Fahlgren, N. et al. (2010) MicroRNA gene evolution in Arabidopsislyrata and Arabidopsis thaliana. Plant Cell 22, 1074–1089

38 Naqvi, A.R. et al. (2012) Biogenesis, functions and fate of plantmicroRNAs. J. Cell. Physiol. 227, 3163–3168

39 Fahlgren, N. et al. (2007) High-throughput sequencing of ArabidopsismicroRNAs: evidence for frequent birth and death of miRNA genes.PLoS ONE 2, e219

40 Ming, R. et al. (2008) The draft genome of the transgenic tropical fruittree papaya (Carica papaya Linnaeus). Nature 452, 991–996

41 Tarver, J.E. et al. (2013) miRNAs: small genes with big potential inmetazoan phylogenetics. Mol. Biol. Evol. 30, 2369–2382

42 Luo, Y. et al. (2013) Evolutionary conservation of microRNA regulatoryprograms in plant flower development. Dev. Biol. 380, 133–144

43 Srikanth, A. and Schmid, M. (2011) Regulation of flowering time: allroads lead to Rome. Cell. Mol. Life Sci. 68, 2013–2037

44 Wheeler, B.M. et al. (2009) The deep evolution of metazoan microRNAs.Evol. Dev. 11, 50–68

45 Sempere, L.F. et al. (2006) The phylogenetic distribution of metazoanmicroRNAs: insights into evolutionary complexity and constraint. J.Exp. Zool. B: Mol. Dev. Evol. 306B, 575–588

46 D’Hont, A. et al. (2012) The banana (Musa acuminata) genome and theevolution of monocotyledonous plants. Nature 488, 213–217

47 Zhang, J. et al. (2012) Genome-wide identification of microRNAs inlarch and stage-specific modulation of 11 conserved microRNAs andtheir targets during somatic embryogenesis. Planta 236, 647–657

48 Smith, S.A. et al. (2010) An uncorrelated relaxed-clock analysissuggests an earlier origin for flowering plants. Proc. Natl. Acad. Sci.U.S.A. 107, 5897–5902

Review Trends in Plant Science March 2014, Vol. 19, No. 3

182