Upload
others
View
6
Download
0
Embed Size (px)
Citation preview
Author's personal copy
Evolutionary history of plantmicroRNAsRichard S. Taylor1, James E. Tarver1,2, Simon J. Hiscock3, and Philip C.J. Donoghue1
1 School of Earth Sciences, University of Bristol, Wills Memorial Building, Queen’s Road, Bristol BS8 1RJ, UK2 Genome Evolution Laboratory, Department of Biology, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland3 School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG, UK
microRNAs (miRNAs) are short noncoding regulatorygenes that perform important roles in plant develop-ment and physiology. With the increasing power of nextgeneration sequencing technologies and the develop-ment of bioinformatic tools, there has been a dramaticincrease in the number of studies surveying the miR-NAomes of plant species, which has led to an explosionin the number of described miRNAs. Unfortunately, verymany of these new discoveries have been incompletelyannotated and thus fail to discriminate genuine miRNAsfrom small interfering RNAs (siRNAs), fragments oflonger RNAs, and random sequence. We review thepublished repertoire of plant miRNAs, discriminatingthose that have been correctly annotated. We use thesedata to explore prevailing hypotheses on the tempo andmode of miRNA evolution within the plant kingdom.
Plant miRNAs are small genes with big potential butequally big problemsmiRNAs are short (�21 nt) noncoding RNAs involved inpost-transcriptional gene regulation through both degra-dative and nondegradative mechanisms [1,2]. In plants,miRNAs have been demonstrated to have an influentialrole in development [3] as well as tolerance and response toextrinsic stresses [4], including drought [5], temperature[6,7], salinity [8], oxidative stress [9], and exposure to UVradiation [10]. Hence, miRNAs are obvious targets forbioengineering to improve crop yield and food security,mitigating the impact of global climate change [11–13].However, although there has been considerable researchsurveying the miRNA repertoire of individual plant spe-cies, and crop species in particular, a synthetic under-standing of their systematic distribution has beencompromised by specious annotation of putative miRNAloci resulting in the effective corruption of databases suchas miRBase [14,15] (http://mirbase.org).
Our aim in this review is to evaluate publishedplant miRNAs to determine whether they meet thecriteria required of annotation. We propose a phylogenetic
framework for organising these data which predicts themiRNA repertoire of evolutionary lineages, including thosethat have yet to be studied. In this regard, we evaluateconventional perceptions of the tempo and mode of plantmiRNA evolution and their role within plant organismalevolution, before considering this within the perspective ofmiRNA evolution more generally.
How to identify a plant miRNAThe criteria required to identify novel plant miRNAs areunambiguous. Initial surveys of the systematic distribu-tion and diversity of miRNAs quickly led to a consensus onthe best practice criteria required for correctly identifyingnovel miRNAs based on the distinctive nature of miRNAbiogenesis [16–18]. In this regard, the role of the DCL1enzyme, a member of the Dicer family that cleaves double-strand RNAs, is influential [19,20]. DCL1 is responsible forthe cleavage of the hairpin structures, known as miRNAprimary sequences (pri-miRNA), into short functionalmiRNA strands. For DCL1 to process a double-strandedRNA, there must be a high level of complementaritybetween the opposing arms of the hairpin [21]. DCL1-mediated cleavage is extremely precise, recognising the�21 nt miRNA sequence with high fidelity [22]. Conse-quently, the pri-miRNA strands are cleaved with greatprecision 50 of the mature miRNA sequence and 30 of themiRNA* [23]. Cleavage on the 50 and 30 arms occurs with apositional offset of two nucleotides, giving rise to a char-acteristic overhang of two nucleotides on the 50 arm [23].Thus, identification of bona fide miRNAs requires thefollowing five criteria (Figure 1). (i) The miRNA sequencemust have a high degree of complementarity to the oppos-ing arm. The necessary degree of complementarity of thefunctional part of the precursor is unclear but at a mini-mum there should be in excess of 15 nucleotides bondingwith the opposing arm. (ii) The processed miRNA strandsmust show evidence of precise 50 cleavage. The vast major-ity of the processed reads from each arm of the precursorshould have their 50 ends within a nucleotide of each otherat the miRNA site. (iii) There is little or no heterogeneity inthe sequences matching to the miRNA precursor. Evidenceof such ‘smearing’, even if there is a greater accumulationof the purported miRNA strand, strongly suggests that alocus is an siRNA and not a miRNA. (iv) There must beevidence for the expression of the miRNA*. For validationof a miRNA family, this evidence can be obtained from anyorganism, and from any paralogue within the miRNA
Review
1360-1385/$ – see front matter
� 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tplants.2013.11.008
Corresponding author: Donoghue, P.C.J. ([email protected]).Keywords: microRNA; plant; annotation; classification; evolution; miRBase.
Trends in Plant Science, March 2014, Vol. 19, No. 3 175
Author's personal copy
family. (v) There should be a two nucleotide overhang of the50 sequence from the 30 sequence. Small variations infolding, or errors in folding prediction, may lead to a minoramount of heterogeneity.
Since these criteria were established [17,18], sequen-cing technology has improved dramatically, facilitatingcheaper sequencing at orders of magnitude greater depth.Many of these original criteria were effectively idealistic
TRENDS in Plant Science
Small RNA read dat amiRNA candidate 1 = AUACACCGAGG ACUU UAUGAA AUG miRNA candidate 2 = CGACAGA AGAGAGUG AGC ACmiRNA candidate 3 = AGU UACGACAUGCUUU UAUUAGUG
miRNA candid ate 1
miRNA candid ate 2
miRNA candid ate 3
Do the reads have consistent 5′ processing and presence of a star sequence?
miRNA candid ate 2
Are the two products complementary when folded?
miRNA candidate 2
BLAST search mature product against miRBase
miRNA candid ate 1
miRNA candid ate 2
Publish
Do reads map to genome and form a hairpin?
Figure 1. miRNA identification protocol. An outline of the correct protocol for identifying and annotating miRNAs from deep sequencing libraries, to ensure accuracy and
fulfilment of the established annotation criteria [16–18]. Adapted from [41].
Review Trends in Plant Science March 2014, Vol. 19, No. 3
176
Author's personal copy
because it was not practically feasible to demonstratephenomena such as cleavage precision and expression ofthe miRNA*. As a result, this led to the spurious identi-fication of, for instance, siRNAs as miRNAs [24]. However,given the depth of read afforded by contemporary sequen-cing technology, these features can now be demonstratedtrivially, hence they have become required elements formiRNA annotation [16,17,25].
An unfortunate consequence of this increase in theburden required for miRNA annotation is that publicdatabases such as miRBase include putative miRNAs forwhich evidence is lacking for many of these annotationcriteria [14,15]. Resequencing has demonstrated thatmany previously annotated miRNAs are siRNAs or frag-ments of long noncoding RNAs including tRNAs, mRNAs,and rRNAs [26]. Despite the increased economy and scaleof deep sequencing technology, many contemporary sur-veys of plant miRNAs continue to regard many of theannotation criteria as optional, adding more incompletelyjustified and, therefore, potentially spurious miRNAs topublic databases [15].
Separating the wheat from the chaffWe set out to assess the validity of annotated plant miR-NAs. We evaluated the miRNA repertoires of all plantspecies available on miRBase v20 and, where possible,we also considered the sequencing data from the originalpublications when it had not been submitted to miRBase(see Table S4 in the supplementary material online).Additionally, we searched MirNest [27] (http://lemur.amu.edu.pl/share/php/mirnest/), an online repository for com-putationally predicted miRNAs. We considered theserecords in terms of the annotation criteria outlined above,paying particular attention to deep sequencing read data(where available) in attempting validation.
Our analysis considered putative miRNAs within theirfamilies, namely, the grouping of loci into families based onsimilarity in the sequence of the mature miRNA, based onthe assumption that paralogues within a family are des-cended through duplication from a single orthologue. Thisassumption reduced the burden of evidence required formiRNA validation because miRNA* expression need onlybe demonstrated in one paralogue of that family across allspecies.
Putative miRNAs exhibiting complementarity pairingof 15 or fewer nucleotides between the opposing armshairpin were deemed to have insufficiently robust struc-ture to be validated as a miRNA, unless there was strongevidence from deep sequencing for precise cleavage. Evi-dence of heterogeneous 50 processing, or an offset of greaterthan three nucleotides or less than one nucleotide betweenthe 50 and 30 cleaved ends of the hairpin sequence, were alsoreasons for rejection of many putative miRNAs.
Finally, we conducted a BLAST search of the validatedmiRNAs against the genome sequence data that wereavailable for each of the taxa. This allowed us to detectvalidated miRNAs in species where they had not beenreported previously, establish the systematic distributionof miRNA families among extant plant species, and, ulti-mately, infer their evolutionary history within the plantkingdom.
ResultsWe examined 6172 miRNA genes annotated in miRBasev20 and found that 1993 (32.3%) lacked sufficient evidenceto justify their annotation as genuine miRNAs (see TableS1 in the supplementary material online). Of these, themajority (61.1%) were rejected because they lacked evi-dence of miRNA* sequence expression for any paraloguewithin the respective family (112 families lacked evidenceof miRNA* expression in their miRBase entry but it wasavailable in their primary publication). A further 9.5%were rejected due to insufficient complementarity betweenopposing arms of the pri-miRNA and 29.4% of miRNA lociwere rejected because of evidence of imprecise 50 proces-sing. At the family level, we rejected 1351 of the 1802 plantmiRNA families on miRBase (Table 1; see Table S2 in thesupplementary material online). The vast majority (95.8%)of the miRNA families that we reject have been describedin just a single taxon. In some taxa, almost all such orphanmiRNA families failed to fulfil the criteria required forannotation (e.g., 96.3% of orphan miRNAs in Glycine maxand 89.2% in Medicago truncatula), invariably because ofan absence of miRNA* expression data. MirNest contained22 additional miRNAs that fulfilled the annotation cri-teria. BLAST searches of publicly available genomesequence data revealed 109 previously unidentifiedmiRNA families in 20 taxa (see Table S3 in the supple-mentary material online). We provide a library of validplant miRNAs in FASTA format at <http://palaeo.gly.bris.ac.uk/donoghue/>, which we hope will serve as a bench-mark for future surveys of plant miRNAomes.
The rejection of almost one-third of annotated plantmiRNA loci in miRBase, and three-quarters of the miRNAfamilies, confirms previous assertions [14,15] that thedatabase has been corrupted by spurious data. As pessi-mistic as this appears, our results may underestimate thenumber of spuriously identified miRNAs in miRBasebecause some entries are supported by extremely limitedevidence. However, many of the miRNAs that we reject forlack of evidence of miRNA* sequence expression may yetbe validated by resequencing. However, they cannot beconsidered valid miRNAs based on the evidence currentlyavailable.
A role for ‘Tree-Thinking’ in the study of plant miRNAsmiRNA families are defined on evolutionary principles andit has been generally accepted that there is some degree ofconservation in the evolution of plant miRNAs [28–30].Nevertheless, there have been few attempts to organise thedistribution of miRNAs among plant species within anevolutionary framework [29,31,32]. Rather, the presenceor absence of miRNA families among plant species hasbeen conveyed through increasingly unwieldy tablesthat fail to exploit the predictive power of tree-basedapproaches to organising phylogenetic data [33]. BecausemiRNAs are inherited through vertical descent, it is pos-sible to infer the miRNAome of common ancestors usingphylogenetic approaches and, therefore, predict the mini-mal miRNA repertoire of hitherto unsampled lineages(Figure 2).
Although it has been argued that few miRNAs aredeeply conserved and that the majority of miRNA families
Review Trends in Plant Science March 2014, Vol. 19, No. 3
177
Author's personal copy
have evolved recently [29,34,35], a phylogenetic perspec-tive demonstrates precisely the reverse. Many miRNAfamilies have been inherited from the ancestral embryo-phyte (Figure 2, node a), and another large suite of miRNAfamilies encountered among angiosperms evolved in theancestral spermatophyte (Figure 2, node d). Few miRNAfamilies appear to have evolved within the lineage leadingto angiosperms after it separated from gymnosperms(Figure 2, node f). Instead many new miRNA familiesevolved among the angiosperms, within componentlineages after the divergence of the extant flowering plantsfrom their last common ancestor.
Based on the current knowledge of miRNA family dis-tribution and lineage divergence dates [36], it is alsopossible to establish evolutionary rates for the innovationof miRNA families (Figure 3). The rate of accumulation offamilies is high in the embryophyte stem lineage (Figure 2,node a; ‘bryophytes’ and vascular plants), after it separatedfrom its nearest green algal relatives, and in the sperma-tophyte stem lineage (Figure 2, node d; gymnosperms plus
angiosperms) after it separated from the pteridophytelineage (ferns). Rates of miRNA family innovation appearapproximately equivalent elsewhere within plant phylo-geny with the exception of the angiosperms where there isa marked increase in the rate of innovation after thedivergence of the monocot and eudicot evolutionarylineages (Figure 2, nodes g and h, respectively). However,due to the disparity in depth of coverage between thedifferent flowering plants species, it is impossible to under-take a proper comparison of the number of families pre-sent. For example, the most highly studied genus isArabidopsis, and it is unlikely to be a coincidence thatthe highest number of miRNA families has been recordedfrom this genus [37].
Figure 2 clarifies the effect that disparities in taxonomicsampling have on our understanding of the evolutionof miRNAs. Entire divisions have either no coverage(Hepatophyta, Anthocerophyta, Psilophyta, Sphenophyta,Cycadophyta, Gingkophyta, and Gnetophyta) or only asingle taxon has been surveyed (Bryophyta, Lycophyta,
Table 1. An outline of the miRNA families listed on miRBase v20 that were rejected due to lacking sufficient evidence for confidentannotationa
Embryophyta
Embr
yoph
yta
Tracheophyta
Trac
heop
hyta
Euphyllophyta
Euph
yllo
phyt
a
Spermatophyta
Sper
mat
ophy
ta Angiospermae
Poales
Angi
ospe
rmae
156, 159, 160, 166, 167, 171, 390, 408, 477, 530, 535
396
168, 169, 172
162, 164, 393, 394, 395, 397, 398, 399, 482, 2950
827
437, 444, 528, 2275
1432, 1878, 5566
403, 828, 2111, 3627, 3630, 4376, 4414
Taxon microRNA family present
Reason for rejec�on Number of families rejected (% rejected)
No reported star
Available read data do not support mature/ star
Heterogeneous processing
Improper mature/ star offset
Improper precursor structure
Reported products do not map to genome
No experimental support for annota�on
Total invalid
Total valid
779 (43.2%)
65 (3.6%)
423 (23.5%)
60 (3.3%)
89 (4.9%)
2 (0.1%)
97(5.4%)
1351 (75.0%)
451 (25.0%)
Monocotyledones
Eudicotyledones
aIn addition, the complement of miRNA families present in each of the major plant groups is listed, as inferred from the distribution in sampled taxa.
Review Trends in Plant Science March 2014, Vol. 19, No. 3
178
Author's personal copy
Ceratopteris
651
951
0 61
661
7 61
1 71
0 93
804
6 93
169
261
793
893
2 84
461
393
4 93
9 93
035
827
477
828
1 112
0363
673 4
003 5
403 5
620 6
6016
1 116
9 19 1
3 04
7 263
414 4
3 26 3
4 26 3
5 263
8 26 3
9 263
1 363
236 3
336 3
4363
536 3
636 3
7263
0363
159 3
2593
3593
4 593
851
161
361
3 71
1 93
0 04
204
7 74
035
1 77
377
4 77
187
428
438
7 38
048
2 48
548
648
748
258
658
758
858
958
1 224
414 4
3343
534 3
634 3
9343
3443
6 443
8443
9 443
577
677
777
628
038
9 48
468
768
9392
3393
4144
9 051
7051
9555
0 151
514 4
1735
3 04
774
6 802
8802
2952
9 16 2
926 2
7 263
3 125
042 5
5555
6555
7555
8 555
065 5
1655
3 655
0 217
1 217
3 217
421 7
5217
6 217
7217
8 217
304
284
728
9051
726 3
437444
5282275
48223
4187
8 16 6
5 5
530
4655
075 5
0231
5 24 1
6 481
568 1
6681
478 1
090 2
9793
289 3
7415
251 5
751 5
261 5
665 5
261
840 5
0505
100 2
2002
3 002
6002
7002
9002
010 2
1102
2102
3102
4 102
0202
1 202
2202
320 2
4 20 2
520 2
620 2
llemiR-1llem
2-Ri
llemi
3-R
ll emi
4-R
l lemiR-5
llemiR-6
llemiR-7l lem
iR-8llem
iR-9llem
iR-10
llemiR-11
llemiR-123 1
- Ri
mell
774
335
435
535
735
835
8 98
2 09
4 09
3201
4 20 1
5201
720 1
820 1
9 201
2 301
3301
4301
5 30 1
6301
8 30 1
930 1
0401
140 1
240 1
3 401
4401
740 1
8401
9401
0 50 1
1501
250 1
3501
4 50 1
850 1
9 501
0 60 1
1601
2601
3601
4 601
5 60 1
7601
860 1
1701
270 1
3701
470 1
8701
9701
1121
212 1
4121
5 12 1
612 1
7121
9121
122 1
222 1
322 1
8 702
970 2
1802
2802
4802
609
019
9411
7511
261 1
861
271
Physcomitrella
Key:
Selaginella
Picea
Larix
Liriodendron
Chlamydomonas
Cyanophora
5 11 1 76
109
3
0 801
1 80 1
2801
3 801
480 1
5801
6801
880 1
9 801
0 901
190 1
2901
390 1
490 1
5 901
690 1
790 1
8 901
990 1
0011
1011
2 01 1
3011
401 1
5011
6 011
701 1
8 011
9011
0111
111 1
2111 77
4
Musa
Oryza
Tri�cum
Zea
Brachypodium
Hordeum
Sorghum
Pinus
Aquilegia
059 2
Cynara
6116
8116
411 6
Nico�ana
Solanum
Vi�s
6 734
Citrus
059 2
Theobroma
A. thaliana
A. lyrata
Medicago
Malus
Cucumis
Glycine
Populus
Ricinus
0592
3 462
472 5
4 443
4565
2105 3 5
8
8205
0 592
6 734
0592
5 93
0592
0592
0363
728
536
0592
0 35
EudicotyledonsM
onocotyledons
Tracheophyta
Embryophyta
Gym
nosperms
119
151 1
251 1
551 1
6 701
5 701
044
678 1
341 5
9415
5 51 5
6 515
8 515
7 085
1526
5967
3 01 6
7 206
4206
520 6
910 6
020 6
4416
5416
6 416
7 41 6
841 6
9416
3 516
4 516
551 6
9516
7 441
8 441
0541
1 246
4246
7246
0346
2 346
434 6
5 44 6
6446
9546
3 646
7646
8646
4 746
4 144
2125
6967
7 967
8967
9967
1077
635
328
749
059
159
6201
1151
3 96 3
796 3
9 07 3
5225
5 225
a
b
c
d
f
h
e
g
Angiosperms
3 22 6
361 5
2715
3 715
581 5
1 025
9 077
417 7
3277
4 277
7277
8277
1 377
237 7
4 377
837 7
2 47 7
3 577
4577
0677
367 7
2877
8 051
2253
7834
4994
7 665
7 881
3 424
9 112
7 305
9912
1325
672 5
2 646
322 4
4224
5224
6 224
0324
5 324
7 324
3324
7424
0 524
2424
1 68
868
5151
168 1
3681
578 1
3682
1 78 2
3 782
0 882
TRENDS in Plant Science
Latest �me of evolu�on of miRNA familymiRNA family lost in all descendents
Figure 2. The acquisition of validated miRNA families in taxa across 31 taxa representing a broad coverage of the plant kingdom. The observed distribution of families in
these taxa is utilised to infer which node each family is likely to have evolved at. Node labels: (a) Embryophyta, (b) Tracheophyta, (c) Euphyllophyta, (d) Spermatophyta, (e)
Gymnospermae, (f) Angiospermae, (g) Monocotyledons, and (h) Eudicotyledons. The established plant phylogeny is used (Angiosperm Phylogeny Website. Version 12,
July 2012; http://www.mobot.org/MOBOT/research/APweb/) and the majority of data are drawn from miRBase v20, with additional miRNA families that fulfil the annotation
criteria included for Musa [46], Triticum [10], Larix [47], Liriodendron [31], and Ceratopteris [31].
Review Trends in Plant Science March 2014, Vol. 19, No. 3
179
Author's personal copy
and Pteridophyta). In the latter case, this creates theillusion that few miRNA families are conserved in theselineages, whereas the reality is that these miRNAs appearspecies-specific simply because they are the only speciesstudied so far within their fundamental evolutionary line-age. Inevitably, denser taxonomic sampling will revealthat the majority of these miRNAs have evolved muchearlier than the origin of the species.
Not so young miRNAsIt has been argued that, because the majority of plantmiRNAs appear to be species specific, they must haveevolved recently [29,34,35,38,39], that most of these miR-NAs are likely to lack any function, and that they will soonbe lost through neutral selection [35]. Our analysis revealstwo fundamental errors in these arguments. First,although it may be true that the species in which a novelmiRNA family is identified may itself be young, if thatspecies is the sole representative of an otherwiseunsampled evolutionary lineage, then we have no con-straint on the antiquity of the miRNA family except thatit evolved after that species lineage separated from thenearest living species or lineage whose miRNAome hasbeen assayed.
This is both most obvious and extreme in Physcomitrellapatens, which is the only representative of the moss evolu-tionary lineage to have its miRNA repertoire studied. P.patens has been described to encode 67 miRNA familiesthat are not encountered in any other plant species studiedso far. However, it does not follow that these miRNAfamilies are exclusive to the species P. patens. Rather,they are the net sum of the miRNA families that evolvedin the lineage leading to P. patens after it separated fromthe last common ancestor that it shared with tracheo-phytes (vascular plants, including the lycophytes, pterido-phytes, gymnosperms, and angiosperms; Figure 3, node a).
Through this episode lasting half a billion years, some ofthe miRNA families will have evolved very early and somemuch more recently, interpretations that can be reconciledby assaying the miRNAome of members of the moss lineagethat are related by degree to P. patens.
Similarly, even within the comparatively intensivelysampled angiosperms, those species that have been inves-tigated for their miRNAs are all relatively distantlyrelated. They are representative of their evolutionarylineages, not merely of their species. Indeed, we only haveapproximate knowledge of species-specific miRNAs in Ara-bidopsis where both Arabidopsis thaliana and Arabidopsislyrata have been investigated [37]. These two speciesappear to have 14 and 20 new miRNAs, respectively(Figure 2). However, our survey revealed that spuriouslyidentified, incompletely annotated, and/or lowly expressednovel miRNAs are most common in individual species. It islikely that resequencing efforts will significantly diminishthe inventory of truly species-specific miRNAs.
Nothing unique about model systemsIt has often been noted that miRNA families are distrib-uted very unevenly among evolutionary lineages, withspecies of Arabidopsis, Oryza, and Glycine havingevolved a disproportionately large number of miRNAfamilies. However, given the phylogenetic perspectivethat we provide, the distribution of miRNA familiesappears to be approximately proportional to the anti-quity of the evolutionary lineages assayed. For instance,the miRNAomes of the moss P. patens and the eudicot A.thaliana have both been well characterised. In the halfbillion years of independent evolutionary history sincethey last shared a common ancestor, these distantlyrelated evolutionary lineages have evolved comparablenumbers of novel miRNA families (67 and 66, respec-tively; Figure 2).
TRENDS in Plant Science
Num
ber o
f miR
NA
fam
ilies
pre
sent
100200300400500600Time before present (Ma)
0
20
40
60
80
100
Physcomitrella
Selaginella
Arabidopsis
a bc
ffdPinaceae
TracheophytaSpermatophyta
Angiospermae
e gh
Figure 3. The rate of acquisition of miRNA families over geological time. Divergence dates are consistent with both molecular data and the fossil record [36,48]. Node
labels: (a) Embryophyta, (b) Tracheophyta, (c) Euphyllophyta, (d) Spermatophyta, (e) Gymnospermae, (f) Angiospermae, (g) Monocotyledons, and (h) Eudicotyledons. Each
line produced at the point of the monocot (brown) and eudicot (green) divergence represents an angiosperm species, and illustrates both the disparity in coverage between
the angiosperm and rest of the plant kingdom, and the differences in coverage within the angiosperms.
Review Trends in Plant Science March 2014, Vol. 19, No. 3
180
Author's personal copy
Less well-characterised taxa, such as Nicotiana andSelaginella, appear to have fewer miRNAs but this isalmost certainly an artefact of incomplete sampling. Wherecomparisons have been made within angiosperms to high-light the considerable variability of miRNA evolution atthe individual gene level, for example, between rice (Oryzasativa) and papaya (Carica papaya), it has been noted thatrice has gained 126 genes while losing only one, whereaspapaya has gained only eight genes and lost 25 [35].However, rice and papaya differ dramatically in the extentto which their miRNAs have been studied; a Web of Sciencesearch (search string Topic=([rice j papaya]) AND Topic=(microRNA OR miRNA) yields 602 versus ten publishedstudies, respectively. Bioinformatic approaches could over-come this if the level of gene sequencing was comparablefor both taxa. However, whereas rice has enjoyed effec-tively complete genome sequencing and annotation, thepapaya genome has been the subject of only very low (3�)sequence coverage [40]. Previous studies in animals haveshown that the ability to identify miRNAs bioinformati-cally is correlated to the level of genome sequencing, withlow coverage genomes missing up to 32% of the expectedmiRNA repertoire, whereas completely sequenced gen-omes were missing <3% [41]. Consequently, for compar-isons to be made between the miRNA repertoires of anytwo species, care should be taken to ensure that samplingbias does not affect interpretations of the available data.
The evolution of plant miRNAsA key feature of miRNA evolution is that once evolved,families are rarely lost and, as such, this high level ofconservation between taxa was exploited as an ancillarycriterion for miRNA annotation [17,18]. However, not allmiRNAs are equally conserved and it has been argued thatmore ancient miRNA families have a higher level of con-servation than younger families [32]. Our phylogeneticperspective on miRNA evolution reveals that of the 36observed secondary losses of miRNA families, 21 of thesewere in families that evolved prior to the origin of thespermatophytes (Figure 2), suggesting that older familiesmay not be more highly conserved.
It is clear that new miRNA families are not integratedinto gene regulatory networks at a continuous rate overplant evolutionary history but, rather, in distinct epi-sodes of miRNAome expansion [e.g., in the ancestralembryophyte (Figure 2, node a) and ancestral spermato-phyte (Figure 2, node d)], with long intervening periods ofstasis. It is tempting to suggest that these bursts ofmiRNA innovation are associated causally with majorepisodes of plant phenotypic evolution, such as in theadaptation of plants to life on land. Indeed, expansion inthe number of miRNA families within the ancestralangiosperm lineage (Figure 2, node f) has been implicatedin the origin of the flowering plants [35], although ouranalysis shows that this apparent coincidence is an arte-fact of the paucity of miRNA studies in pteridophytes andgymnosperms. Evidently, not all of the major events inplant evolution can be explained by the origin of newmiRNA families, although because miRNAs are never-theless known to play an essential role in the develop-ment of the flower, [42,43] co-option of existing miRNA
families to perform novel functions may also play a crucialrole in the evolution of novel phenotypes.
The relation between miRNA and phenotypic evolutionin plants contrasts with this same relation in animalevolutionary history where the innovation of miRNAfamilies is a constant process [44]. This has been evidencedby the observation that novel miRNA families characterisealmost every major animal clade surveyed to date [41].Animal evolution also exhibits a positive relation betweenthe sum of miRNA families and proxies for phenotypiccomplexity [45]. This relation does not appear to hold inplants, not least since, as we have shown, the moss P.patens and eudicot A. lyrata have acquired a comparablerepertoire of miRNAs.
Concluding remarks and future perspectivesOur review of the plant miRNA database has shown thatvery many fail to meet the criteria required for miRNAannotation. However, many of the miRNA families that wefailed to validate were rejected due to absence of evidence,rather than evidence that they failed to meet validationcriteria. In particular, the identification of miRNA* expres-sion was the major reason for the rejection of families.Conversely, many families were not rejected but have readcounts so low that it is difficult to assess the precision of 50
processing with any confidence. This paucity of evidence isparticularly acute in important crop species includingmaize (Zea mays) and soybean (Glycine max) (Figure 2).It is important, therefore, that the validity of these candi-date miRNAs is more fully evaluated through a systematicprogramme of miRNAome resequencing.
The phylogenetic framework within which we organisethe database of valid plant miRNAs provides not only apowerful perspective within which to trace the evolution ofmiRNA families but also to predict the miRNAome ofunstudied plant species. Although angiosperms representthe majority of plant biodiversity, to gain a more completeunderstanding of plant miRNA evolution it is essentialthat there is a rebalancing of sampling to more fullyrepresent the systematic breadth of the plant kingdom.Targeted sequencing of hitherto neglected lineages thatrepresent the majority of plant evolutionary history istherefore urgently needed. This will afford a better per-spective on the role that miRNAs have played in develop-mental evolution within the plant kingdom.
Finally, given their high levels of conservation andinterspersed locations across the genome, miRNAs mayserve as a simple proxy for tracing major events in theevolution of plant genomes. For instance, paralogy withinmiRNA families affords a ready insight into the phyloge-netic timing and sequence of whole genome duplicationevents in plant evolutionary history, as well as revealingthe roles that miRNA evolution play in responding togenome obesity following whole genome duplicationevents. This will require the establishment of a coherentsystematic classification scheme for paralogy groupswithin miRNA families, to replace current schemes thatserve more to obscure than to reveal homology.
Research into the evolution of the plant miRNAomeholds great potential for bioengineering as well as unco-vering the role of these regulatory factors in organismal
Review Trends in Plant Science March 2014, Vol. 19, No. 3
181
Author's personal copy
evolution. However, these prospects can only be achieved ifgreater effort is expended by researchers and referees inensuring that the published record is not further corruptedby the description of spurious miRNAs.
AcknowledgementsWe would like to thank Kevin Peterson (Dartmouth College) for advice,support, and encouragement throughout the course of this project. Thisresearch was funded through an Engineering and Physical SciencesResearch Council (EPSRC) doctoral training grant (DTG) through theBristol Centre for Complexity Science (R.S.T.), an Irish Research Council(IRCSET) postdoctoral fellowship (J.E.T.), a Black Swan Fund from theUniversity of Bristol, Faculty of Science (P.C.J.D., S.J.H.), a LeverhulmeTrust Research Fellowship (P.C.J.D.), the Natural EnvironmentalResearch Council (P.C.J.D. and S.J.H.), the Royal Society (P.C.J.D.),and the Wolfson Foundation (P.C.J.D.).
Appendix A. Supplementary dataSupplementary data associated with this article can befound, in the online version, at doi:10.1016/j.tplants.2013.11.008.
References1 Voinnet, O. (2009) Origin, biogenesis, and activity of plant microRNAs.
Cell 136, 669–6872 Carthew, R.W. and Sontheimer, E.J. (2009) Origins and mechanisms of
miRNAs and siRNAs. Cell 136, 642–6553 Rubio-Somoza, I. and Weigel, D. (2011) MicroRNA networks and
developmental plasticity in plants. Trends Plant Sci. 16, 258–2644 Kruszka, K. et al. (2012) Role of microRNAs and other sRNAs of plants
in their changing environments. J. Plant Physiol. 169, 1664–16725 Zeng, C.Y. et al. (2010) Conservation and divergence of microRNAs and
their functions in Euphorbiaceous plants. Nucleic Acids Res. 38, 981–9956 Barakat, A. et al. (2012) Genome wide identification of chilling
responsive microRNAs in Prunus persica. BMC Genomics 13, 4817 Chinnusamy, V. et al. (2007) Cold stress regulation of gene expression
in plants. Trends Plant Sci. 12, 444–4518 Frazier, T.P. et al. (2011) Salt and drought stresses induce the aberrant
expression of microRNA genes in tobacco. Mol. Biotechnol. 49, 159–1659 Sunkar, R. et al. (2006) Post-transcriptional induction of two Cu/Zn
superoxide dismutase genes in Arabidopsis is mediated bydownregulation of miR398 and important for oxidative stresstolerance. Plant Cell 18, 2051–2065
10 Wei, B. et al. (2009) Novel microRNAs uncovered by deep sequencing ofsmall RNA transcriptomes in bread wheat (Triticum aestivum L.) andBrachypodium distachyon (L.) Beauv. Funct. Integr. Genomics 9, 499–511
11 Ivashuta, S. et al. (2011) Regulation of gene expression in plantsthrough miRNA inactivation. PLoS ONE 6, e21330
12 Zhou, M. and Luo, H. (2013) MicroRNA-mediated gene regulation:potential applications for plant genetic engineering. Plant Mol. Biol.83, 59–75
13 Li, J.F. et al. (2013) Comprehensive protein-based artificial microRNAscreens for effective gene silencing in plants. Plant Cell 25, 1507–1522
14 Jones-Rhoades, M.W. (2012) Conservation and divergence in plantmicroRNAs. Plant Mol. Biol. 80, 3–16
15 Meng, Y. et al. (2012) Are all the miRBase-registered microRNAs true?A structure- and expression-based re-examination in plants. RNA Biol.9, 249–253
16 Kozomara, A. and Griffiths-Jones, S. (2011) miRBase: integratingmicroRNA annotation and deep-sequencing data. Nucleic Acids Res.39, D152–D157
17 Meyers, B.C. et al. (2008) Criteria for annotation of plant microRNAs.Plant Cell 20, 3186–3190
18 Ambros, V. et al. (2003) A uniform system for microRNA annotation.RNA 9, 277–279
19 Reinhart, B.J. et al. (2002) MicroRNAs in plants. Genes Dev. 16, 1616–1626
20 Park, W. et al. (2002) CARPEL FACTORY, a Dicer homolog, andHEN1, a novel protein, act in microRNA metabolism in Arabidopsisthaliana. Curr. Biol. 12, 1484–1495
21 Axtell, M.J. et al. (2011) Vive la difference: biogenesis and evolution ofmicroRNAs in plants and animals. Genome Biol. 12, 221
22 Bologna, N.G. et al. (2013) Processing of plant microRNA precursors.Brief. Funct. Genomics 12, 37–45
23 Kurihara, Y. and Watanabe, Y. (2004) Arabidopsis micro-RNAbiogenesis through Dicer-like 1 protein functions. Proc. Natl. Acad.Sci. U.S.A. 101, 12753–12758
24 Berezikov, E. (2011) Evolution of microRNA diversity and regulation inanimals. Nat. Rev. Genet. 12, 846–860
25 Tarver, J.E. et al. (2012) Do miRNAs have a deep evolutionary history?Bioessays 34, 857–866
26 Chiang, H.R. et al. (2010) Mammalian microRNAs: experimentalevaluation of novel and previously annotated genes. Genes Dev. 24,992–1009
27 Szczesniak, M.W. et al. (2012) miRNEST database: an integrativeapproach in microRNA search and annotation. Nucleic Acids Res.40, D198–D204
28 Zhang, B. et al. (2006) Conservation and divergence of plant microRNAgenes. Plant J. 46, 243–259
29 Cuperus, J.T. et al. (2011) Evolution and functional diversification ofmiRNA genes. Plant Cell 23, 431–442
30 Axtell, M.J. et al. (2007) Common functions for diverse small RNAs ofland plants. Plant Cell 19, 1750–1769
31 Axtell, M.J. and Bartel, D.P. (2005) Antiquity of microRNAs and theirtargets in land plants. Plant Cell 17, 1658–1673
32 Axtell, M.J. and Bowman, J.L. (2008) Evolution of plant microRNAsand their targets. Trends Plant Sci. 13, 343–349
33 Pagel, M. (1999) Inferring the historical patterns of biologicalevolution. Nature 401, 877–884
34 Axtell, M.J. (2013) Classification and comparison of small RNAs fromplants. Annu. Rev. Plant Biol. 64, 137–159
35 Nozawa, M. et al. (2012) Origins and evolution of microRNA genes inplant species. Genome Biol. Evol. 4, 230–239
36 Clarke, J.T. et al. (2011) Establishing a time-scale for plant evolution.New Phytol. 192, 266–301
37 Fahlgren, N. et al. (2010) MicroRNA gene evolution in Arabidopsislyrata and Arabidopsis thaliana. Plant Cell 22, 1074–1089
38 Naqvi, A.R. et al. (2012) Biogenesis, functions and fate of plantmicroRNAs. J. Cell. Physiol. 227, 3163–3168
39 Fahlgren, N. et al. (2007) High-throughput sequencing of ArabidopsismicroRNAs: evidence for frequent birth and death of miRNA genes.PLoS ONE 2, e219
40 Ming, R. et al. (2008) The draft genome of the transgenic tropical fruittree papaya (Carica papaya Linnaeus). Nature 452, 991–996
41 Tarver, J.E. et al. (2013) miRNAs: small genes with big potential inmetazoan phylogenetics. Mol. Biol. Evol. 30, 2369–2382
42 Luo, Y. et al. (2013) Evolutionary conservation of microRNA regulatoryprograms in plant flower development. Dev. Biol. 380, 133–144
43 Srikanth, A. and Schmid, M. (2011) Regulation of flowering time: allroads lead to Rome. Cell. Mol. Life Sci. 68, 2013–2037
44 Wheeler, B.M. et al. (2009) The deep evolution of metazoan microRNAs.Evol. Dev. 11, 50–68
45 Sempere, L.F. et al. (2006) The phylogenetic distribution of metazoanmicroRNAs: insights into evolutionary complexity and constraint. J.Exp. Zool. B: Mol. Dev. Evol. 306B, 575–588
46 D’Hont, A. et al. (2012) The banana (Musa acuminata) genome and theevolution of monocotyledonous plants. Nature 488, 213–217
47 Zhang, J. et al. (2012) Genome-wide identification of microRNAs inlarch and stage-specific modulation of 11 conserved microRNAs andtheir targets during somatic embryogenesis. Planta 236, 647–657
48 Smith, S.A. et al. (2010) An uncorrelated relaxed-clock analysissuggests an earlier origin for flowering plants. Proc. Natl. Acad. Sci.U.S.A. 107, 5897–5902
Review Trends in Plant Science March 2014, Vol. 19, No. 3
182