Downloaded from https://academic.oup.com/mbe/article-abstract/15/7/837/1074872 by guest on 17 February 2018


Mol. Biol. Evol. 15(7):837–853. 1998. © 1998 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Structural, Genomic, and Phylogenetic Analysis of Lian, a Novel Family of Non-LTR Retrotransposons in the Yellow Fever Mosquito, Aedes aegypti

Zhijian Tu, Jun Isoe, and Julia A. Guzova
Department of Entomology and Center for Insect Science, University of Arizona

A retrotransposon named Lian-Aa1 was discovered in an intron of an AaHR3–1 gene of the yellow fever mosquito, Aedes aegypti. This retrotransposon contained a long open reading frame with 1,219 amino acids that included endonuclease, reverse transcriptase, and RNase H domains. It was shown that in the Rock strain of Ae. aegypti, there were up to 1,380 copies of Lian elements, equivalent to 0.8% of the entire genome. Five additional copies of Lian elements were isolated, mapped by restriction digestion, and partially sequenced. The 5′ and 3′ ends of the Lian family were determined by comparing the terminal sequences of the six copies and were subsequently confirmed by the identification of putative target duplications flanking Lian-Aa1 and Lian-Aa2. The Lian family is likely a novel family of non-long-terminal-repeat (non-LTR) retrotransposons that terminate in a repeat of (CTGATAC)2. On average, the six copies of Lian elements showed only 0.6% sequence divergence at the nucleotide level in both a 735-bp region at the 5′ end and a 1,124-bp coding region. Genomic Southern blots also revealed a very high degree of similarity among hundreds of Lian elements, suggesting very recent activity of Lian. Furthermore, all six analyzed Lian elements were closely associated with one or more different families of repetitive elements. It is possible that these associations could reflect the complex relationship between Lian elements and the rest of the Ae. aegypti genome. Phylogenetic analyses based on the reverse transcriptase domains of 36 non-LTR retrotransposons including Lian-Aa1 identified five major subgroups that were supported by bootstrap replications. In contrast to the majority of non-LTR retrotransposons, Lian-Aa1 has an RNase H domain that is similar to a few other non-LTR retrotransposons and some retroviruses, which is consistent with the previously proposed independent assortment of different domains during the evolution of retroelements.

Abbreviations: LTR, long terminal repeat; ORF, open reading frame; SINE, short interspersed nuclear element.

Key words: retrotransposon, repeats, genome, reverse transcriptase, evolution, Aedes aegypti.

Address for correspondence and reprints: Dr. Zhijian Tu, Department of Entomology, University of Arizona, Tucson, Arizona 85721. E-mail: [email protected].

Introduction

Transposable elements are integral components of eukaryotic genomes (Berg and Howe 1989; Sherratt 1995). Because of their ability to proliferate by replicative transposition, they can make up a large fraction of the genome. For example, more than 36% of the human genome and more than 50% of the maize genome are comprised of transposable elements (Smit 1996; Voytas 1996). The importance of transposable elements in organismal evolution is currently debated (Doolittle and Sapienza 1980; Orgel and Crick 1980; Charlesworth and Langley 1989; McDonald 1993, 1995; Brookfield 1995; Wessler, Bureau, and White 1995; Britten 1996; Kidwell and Lisch 1997). Transposable elements may be viewed as genetic entities evolving in the ecological community of the host genome (Brookfield 1995). Their spread, in the course of evolution, depends not only on their ability to amplify, but also on complex interactions between different families of elements and between the elements and the host. The activities of transposable elements are often under various levels of regulation to avoid or reduce deleterious effects on the host (Hartl et al. 1997). On the other hand, some transposable elements have been shown to be recruited to perform specific roles in the biology of their hosts. For example, in Drosophila melanogaster, telomeres were shown to be maintained by non-long-terminal-repeat (non-LTR) retrotransposons (e.g., Mason and Biessmann 1995; Pardue et al. 1996). It has been proposed that eukaryotic telomerases could have arisen from non-LTR retrotransposons (Eickbush 1997). It has also been proposed that some transposable elements may have been involved in the evolution of gene regulatory sequences (McDonald 1993, 1995; Wessler, Bureau, and White 1995; Britten 1996; Kidwell and Lisch 1997).

Transposable elements can be categorized by the mechanisms of their transposition (Finnegan 1992; Robertson and Lampe 1995). Class II elements such as P, hobo, and mariner transpose directly from DNA to DNA. Class I elements transpose via an RNA intermediate. RNA transcripts of the original elements are reverse transcribed to cDNA molecules. The cDNAs are then inserted elsewhere in the genome. Class I elements can be further categorized into three groups: LTR retrotransposons, non-LTR retrotransposons, and short interspersed nuclear elements (SINEs). LTR retrotransposons contain terminal repeats that are usually 200–500 bp in length. These LTRs are important in the initiation and termination of transcription (Arkhipova, Lyubomirskaya, and Ilyin 1995). Non-LTR retrotransposons utilize internal promoters and poly-A signals for their transcription. Therefore, most of them have a poly-A tract at the 3′ end (Hutchison et al. 1989; Eickbush 1992; Levin 1997). Both LTR and non-LTR retrotransposons code for reverse transcriptase, which is essential for retrotransposition. In contrast to LTR retrotransposons, which employ transposition mechanisms similar to those employed by retroviruses (Arkhipova, Lyubomirskaya, and Ilyin 1995), the mechanism of transposition of the non-LTR retrotransposons is only beginning to be understood. Conserved endonuclease domains have been identified in several non-LTR retrotransposons, and the endonuclease domain of human LINE1 has been shown to be essential in transposition, most likely in the cleavage of target DNA (Feng et al. 1996; Levin 1997). This suggests that target-primed reverse transcription, which was first described in the R2 element of Bombyx mori (Luan et al. 1993), is likely to be common for non-LTR retrotransposons (Feng et al. 1996; Levin 1997). The phylogenetic relationship of the retroelements including LTR and non-LTR retrotransposons, retrons, some group II introns, telomerases, retroviruses, and reverse transcriptase-containing DNA viruses has been studied using the shared reverse transcriptase domains (Xiong and Eickbush 1988a, 1990; McClure 1991, 1992, 1993; Eickbush 1994, 1997; Nakamura et al. 1997). In these analyses, non-LTR retrotransposons usually form a monophyletic branch, and the LTR retrotransposons are closely related to retroviruses.

In the yellow fever mosquito, Aedes aegypti, a few class I elements have been reported, including two non-LTR retrotransposons named Juan-A (Mouches, Bensaadi, and Salvado 1992) and JAM (Warren, Hughes, and Crampton, GenBank Z86117) and a possibly incomplete LTR retrotransposon named Zebedee (Warren, Hughes, and Crampton 1997). Three miniature inverted-repeat transposable elements named Wukong, Wujin, and Wuneng, which are likely to be class II DNA elements, have also been discovered (Tu 1997). Several families of transposable elements have been identified in other species of mosquitoes, including Anopheles gambiae, Anopheles albimanus, and Culex pipiens (Besansky 1990; Besansky et al. 1992, 1996; Agarwal et al. 1993; Robertson 1993; Besansky, Bedell, and Mukabayire 1994; Ke et al. 1996; Romans, Bhattacharyya, and Colavita 1998). The analysis of endogenous transposable elements in mosquitoes could have potentially important applications in a genetic approach to control diseases such as malaria, yellow fever, and dengue fever by transforming the mosquito vectors. For example, an active element can potentially be used as a transformation tool. If an element is no longer active, transposable-element-mediated polymorphisms could be used as markers for genetic mapping and population studies (Mukabayire and Besansky 1996). Studies on endogenous mosquito transposable elements can also enhance our understanding of the genetic makeup and organization of mosquito genomes, which is critical for any sustainable genetic control strategy.

Here we report the discovery of a novel family of non-LTR retrotransposons named Lian in Ae. aegypti. A structural, genomic, and phylogenetic analysis of the Lian elements is also presented.

Materials and Methods

Mosquitoes

Mosquitoes used in this study were from the Rock strain of Ae. aegypti.

Genomic Library Screening

The λ Dash II genomic library prepared from the Rock strain of Ae. aegypti was a kind gift of Dr. A. A. James of the Department of Molecular Biology and Biochemistry of the University of California at Irvine. The library was custom made by Stratagene Cloning Systems (La Jolla, Calif.). The library was screened using a digoxigenin-labeled ssDNA probe made from an EcoRI fragment of Lian-Aa1 by asymmetric PCR, in which only one primer was used. Instead of using the regular mixture of dNTPs, 5 µl of digoxigenin-dUTP labeling mixture (1 mM dATP, 1 mM dCTP, 0.65 mM dGTP, 1 mM dTTP, and 0.35 mM digoxigenin-dUTP) was used in a 100-µl reaction. The conditions for PCR were the same as those in Tu and Hagedorn (1997). MagnaGraph Nylon membranes (Micron Separation Inc., Westborough, Mass.) were used to lift the plaques. The prehybridization solution was 5× SSC with 2% non-fat milk, 0.1% N-lauroylsarcosine, and 0.02% SDS. Approximately 20 ng probe per ml of prehybridization solution was used for hybridization. Hybridization was carried out at 55°C. Prehybridization, hybridization, and washings were all performed in a Gene Roller™ from Savant Instruments, Inc. (Holbrook, N.Y.). The washing stringencies were calculated according to Meinkoth and Wahl (1984), allowing less than 18% mismatches in all screenings.
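The stringency calculation mentioned above can be illustrated numerically. The Python fragment below is a minimal sketch, assuming the commonly quoted form of the Meinkoth and Wahl (1984) relation for DNA:DNA hybrids and the rule of thumb that the melting temperature falls by roughly 1°C for each 1% of mismatch. The function names and all input values are hypothetical; the probe length, GC content, and wash conditions of this study are not stated here and are not reproduced.

```python
import math


def tm_dna_hybrid(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Estimated Tm (deg C) of a perfectly matched DNA:DNA hybrid.

    Commonly quoted form of the Meinkoth and Wahl (1984) relation:
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def tolerated_mismatch_percent(tm_c, wash_temp_c, deg_per_percent=1.0):
    """Approximate % mismatch that still hybridizes at wash_temp_c.

    Assumes Tm drops by deg_per_percent deg C per 1% mismatch
    (a rule of thumb, taken here as 1.0; an assumption, not a measured value).
    """
    return max(0.0, (tm_c - wash_temp_c) / deg_per_percent)


# Hypothetical inputs, chosen only to exercise the functions.
example_tm = tm_dna_hybrid(na_molar=1.0, gc_percent=50.0, length_bp=500)
example_mismatch = tolerated_mismatch_percent(example_tm, wash_temp_c=85.0)
print(f"Tm = {example_tm:.1f} C, tolerated mismatch = {example_mismatch:.1f}%")
```

In practice, salt concentration and wash temperature would be varied in such a calculation until the tolerated mismatch falls below the chosen limit, such as the 18% used in these screenings.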

Phage DNA Purification, Subcloning, and DNA Sequencing

Phage DNAs were purified according to Sambrook, Fritsch, and Maniatis (1989). Positive fragments in the inserts that contained the Lian elements were identified by phage DNA Southern blots. Restriction digestion and DNA separation conditions were as described by Lin et al. (1993). DNA blotting was carried out using a VacuGene™ XL vacuum blotting system (Pharmacia Biotech AB, Uppsala, Sweden). The probe and hybridization conditions were the same as those in the screening. Positive fragments from the phage DNA Southern blots were subcloned into the pBluescript SK(−) plasmid from Stratagene Cloning Systems (La Jolla, Calif.). Sequencing was done by the Sequencing Facility of the University of Arizona with synthetic primers, using an automatic sequencer (Model 373) from Applied Biosystems Intl. (Foster City, Calif.). Sequences were determined from both strands.

Genomic Southern Blot

Genomic DNA was isolated from decapitated adult Ae. aegypti according to Lin et al. (1993). The amount of genomic DNA was estimated by comparison with the bands of a λ HindIII marker (GIBCO BRL, Gaithersburg, Md.). Blotting, prehybridization, and hybridization procedures were the same as described for the phage DNA Southern blot except that α-32P-dCTP was used as the label. Labeling was done using a Radprime kit from GIBCO BRL.

Estimation of Copy Numbers

Two independent methods were used to estimate the copy numbers of Lian elements in the genome of Ae. aegypti. First, the copy number was calculated based on the ratio of positive plaques to the total number of plaques screened, taking into account the known size of the haploid genome of Ae. aegypti Rock strain (800 Mb; Rao and Rai 1987) and the 16-kb average insert size of the genomic library. Second, a quantitative genomic Southern blot, shown in figure 4C, was used to estimate copy number. A serial dilution of known amounts of plasmid DNA containing an EcoRI fragment of Lian-Aa1 was used as the standard. A standard curve was obtained by plotting the amounts of the EcoRI fragment against the respective amounts of radioactivity detected by a Betascope (Betascope, Waltham, Mass.). The amount of the EcoRI fragment in a known amount of genomic DNA, which gave positive signals to the Lian probe, was then estimated by comparing the amount of radioactivity in the genomic DNA sample to the standard curve. Copy number was then calculated based on the known genome size and the proportion of the positive EcoRI fragment in the genomic DNA.
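The two estimates described above reduce to simple proportions. The Python sketch below is one way to write them down. Only the 800-Mb genome size and the 16-kb insert size come from the text; the plaque counts, standard-curve readings, genomic DNA amount, and EcoRI fragment length are made-up placeholders. The library estimate also assumes that each positive plaque carries a single Lian copy, and the standards are assumed to be expressed as nanograms of the EcoRI fragment itself.

```python
GENOME_BP = 800e6   # haploid genome size given in the text
INSERT_BP = 16e3    # average library insert size given in the text


def copies_from_library(positive_plaques, total_plaques,
                        genome_bp=GENOME_BP, insert_bp=INSERT_BP):
    """Library estimate: positive fraction times inserts per genome."""
    inserts_per_genome = genome_bp / insert_bp
    return positive_plaques / total_plaques * inserts_per_genome


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_southern(std_ng, std_counts, sample_counts, genomic_ng,
                         fragment_bp, genome_bp=GENOME_BP):
    """Southern estimate: read fragment mass off the standard curve,
    convert to a mass fraction, then to copies per haploid genome."""
    slope, intercept = fit_line(std_ng, std_counts)
    fragment_ng = (sample_counts - intercept) / slope
    mass_fraction = fragment_ng / genomic_ng
    return mass_fraction * genome_bp / fragment_bp


# Placeholder numbers for illustration only (not data from this study).
print(copies_from_library(positive_plaques=10, total_plaques=1000))
print(copies_from_southern(std_ng=[1, 2, 4], std_counts=[100, 200, 400],
                           sample_counts=300, genomic_ng=3000,
                           fragment_bp=1000))
```

A linear standard curve is assumed here for simplicity; a real curve may need a different fit if the detector response is not linear over the dilution range.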

Sequence Analysis and Phylogenetic Inference

Searches for matches of either nucleotide or amino acid sequences in the database (nonredundant GenBank + EMBL + DDBJ + PDB) were done using FASTA of GCG (Genetics Computer Group, Madison, Wis., version 9.0, 1996) and BLAST (Altschul et al. 1990). Pairwise comparisons were done by Gap and Bestfit of GCG. A Q′ analysis (Gribskov and Burgess 1986; McClure 1992; Hagedorn, Maddison, and Tu 1998) was used to estimate the significance of the similarities of a pairwise comparison. Briefly, the significance of the similarities of two sequences was tested by comparing the quality of the comparison, as defined by Bestfit and Gap of GCG, to the average quality of 100 randomized comparisons. The randomized comparisons were obtained by randomizing one of the two sequences in the analysis. The adjusted quality of each comparison was calculated using the following formula: Q′ (adjusted quality) = (Q [quality of the comparison] − A [average quality of 100 randomized comparisons])/SD (standard deviation of the 100 randomized comparisons). A Q′ of 3.0 or higher is regarded as an indication that two sequences may be significantly related. Multiple sequences were aligned by Pileup, which is a progressive, pairwise method from GCG. Specific parameters such as gap weight and gap length weight are described in the figure legends of each alignment. Manual adjustment was sometimes needed as described. Consensus of the multiple-sequence alignment was obtained using Pretty of GCG. Maximum-parsimony trees were constructed using PAUP* 4d59 (Swofford 1997). Specific parameters used in the parsimony analyses are described in the figure legends for each tree. Five hundred bootstrap resamplings were used to assess the confidence in the grouping (Felsenstein and Kishino 1993).
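The Q′ statistic defined above is a z-score of the observed alignment quality against a shuffled background. The Python sketch below illustrates the idea under stated assumptions: the alignment quality is a plain Needleman-Wunsch global score with arbitrary match, mismatch, and gap values, used as a stand-in for the Gap and Bestfit quality of GCG, and "randomizing" is taken to mean shuffling the residues of the second sequence. The function names, scoring values, and test sequence are hypothetical.

```python
import random
import statistics


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def q_prime(a, b, n_random=100, seed=0, score=global_alignment_score):
    """Q' = (Q - A) / SD, where A and SD are the mean and standard deviation
    of the scores obtained after shuffling sequence b n_random times."""
    rng = random.Random(seed)          # fixed seed keeps the sketch repeatable
    observed = score(a, b)
    randomized = []
    for _ in range(n_random):
        shuffled = list(b)
        rng.shuffle(shuffled)          # same composition, random order
        randomized.append(score(a, "".join(shuffled)))
    mean = statistics.mean(randomized)
    sd = statistics.stdev(randomized)  # zero SD would need special handling
    return (observed - mean) / sd


# Hypothetical peptide compared with itself: Q' should be far above 3.0.
peptide = "MKTLLVAGGSRWEDNQHPFYCIKTLMAGSV"
print(q_prime(peptide, peptide) > 3.0)
```

Shuffling preserves the amino acid composition of the randomized sequence, so the background reflects chance similarity between sequences of the same composition and length rather than between arbitrary sequences.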

List of Sequences

Sequences used in this study included three retroviruses, MomLV (Moloney murine leukemia virus; Shinnick, Lerner, and Sutcliffe 1981; Swiss-Prot P03355), EIAV (Equine infectious anemia virus; Chiu et al. 1985; M16575), and HIV1 (Becerra et al. 1990; PIR S11523), and RNase H proteins in Caenorhabditis elegans (Wilson et al. 1994; U41994), Saccharomyces cerevisiae (Itaya et al. 1991; Swiss-Prot Q04740), Thermus aquaticus (Itaya and Kondo 1991; X60507), Escherichia coli (Kanaya and Crouch 1983; Swiss-Prot P00647), and Mycobacterium smegmatis (Mizrahi et al. 1993; U20115). The accession numbers are from GenBank/EMBL unless otherwise noted. In addition, 38 non-LTR retrotransposon sequences were used as listed in table 1.

Results and Discussion

Discovery of Lian-Aa1, a Novel Non-LTR Retrotransposon

The first Lian retrotransposon, Lian-Aa1, was discovered in a 15-kb genomic clone that contained an open reading frame (ORF) of the Ae. aegypti AaHR3–1 gene, which is a homolog of the D. melanogaster steroid hormone receptor DHR3 (Koelle, Segraves, and Hogness 1992). The sequence of the entire AaHR3–1 genomic clone was determined and deposited in GenBank (U87543). The designation of Lian-Aa1 as a non-LTR retrotransposon was based on its sequence similarities with other non-LTR retrotransposons in a database search using BLAST (Altschul et al. 1990). The entire nucleotide sequence of Lian-Aa1, which contains a 3.6-kb ORF encoding a 1,219-amino-acid protein, is shown in figure 1. Lian-Aa1 is located 3′ of the exon for the DNA-binding domain of AaHR3–1, which is 83% identical at the nucleotide level (P = 1e−58 as calculated by BLAST) to the DNA-binding domain of DHR3. The positions of the putative 3′ and 5′ intron-splicing sites flanking the exon for the DNA-binding domain of AaHR3–1 are the same as those of DHR3. Therefore, Lian-Aa1 is likely to be located in an intron 3′ to the exon for the DNA-binding domain. However, exons 3′ to the DNA-binding domain were not found in the genomic clone, indicating that they may be beyond the 3′ end of the cloned fragment.

Sequence Analysis of Lian-Aa1

As shown in figure 1, Lian-Aa1 consists of a 1,219-amino-acid-long ORF flanked by a 545-bp 5′ untranslated region and a 255-bp 3′ untranslated region. The first methionine is the 31st residue in the ORF. Three putative domains have been identified in this single ORF, as described below. However, unlike most non-LTR retrotransposons, no gag domain was found in Lian-Aa1. Pairwise comparisons using Gap and Bestfit of GCG showed that a region at the N-terminus of Lian-Aa1 was similar to the newly discovered endonuclease domains of several non-LTR retrotransposons including I-Dt and R1-Bm (Feng et al. 1996). Q′ analyses were conducted to assess the significance of the similarities found in the pairwise comparisons. For example, the putative endonuclease domain of Lian-Aa1 showed Q′ values of 18.8 and 28.4 to endonuclease domains of I-Dt and R1-Bm, respectively. These values are much greater than 3.0, which suggests that these domains may be significantly related. As shown in figure 2A, multiple-sequence alignment of the endonuclease domains of Lian-Aa1, I-Dt, and R1-Bm revealed seven blocks of highly similar sequences. Sequences in the first block of Lian-Aa1 are before the first methionine of the ORF. It is not yet clear whether the endonuclease domain of an active Lian element contains the first block of amino acids.

Table 1
List of Non-LTR Retrotransposons Included in Sequence Comparisons and Phylogenetic Analyses

Transposon Organism                        Reference                                Accession Number
Amy-Bm     Bombyx mori                     Foster et al. (unpublished)              U07847
BS-Dm      Drosophila melanogaster         Udomkit et al. (1995)                    S55544(a)
CgT1       Colletotrichum gloeosporioides  He et al. (1996)                         L76169
Cin4-Zm    Zea mays                        Schwarz-Sommer et al. (1987)             Y00086
CR1-Ps     Platemys spixii                 Kajikawa, Ohshima, and Okada (1997)      AB005891(b)
CR1-Gg     Gallus gallus                   Burch, Davis, and Haas (1993)            L22152
Cre1       Crithidia fasciculata           Gabriel et al. (1990)                    M33009
Cre2       Cr. fasciculata                 Teng, Wang, and Gabriel (1995)           S58380(a)
Doc-Dm     D. melanogaster                 O’Hare et al. (1991)                     X17551
F-Dm       D. melanogaster                 Di Nocera and Casari (1987)              M17214
Hyp1-Cte   Chironomus tentans              Blinov et al. (1997)                     L79944
Hyp2-Cth   Ch. thummi                      Blinov et al. (1993)                     S31175(a)
I-Dm       D. melanogaster                 Fawcett et al. (1986)                    M14954
I-Dt       Drosophila teissieri            Abad et al. (1989)                       B36186(a)
Ingi-Tb    Trypanosoma brucei              Murphy et al. (1987)                     S28721(a)
Jockey-Dm  D. melanogaster                 Priimagi, Mizrokhi, and Ilyin (1988)     P21328(c)
Juan-Aa    Aedes aegypti                   Mouches, Bensaadi, and Salvado (1992)    M95171
Juan-Cp    Culex pipiens                   Agarwal et al. (1993)                    B56679(a)
Lian-Aa1   Ae. aegypti                     This paper                               U87543
LINE1-Bg   Biomphalaria glabrata           Knight et al. (1992)                     X60372
LINE1-Hs   Homo sapiens                    Hattori et al. (1986)                    P08547(c)
LINE1-Mm   Mus musculus                    Martin (1995)                            U15647
Q-Ag       Anopheles gambiae               Besansky, Bedell, and Mukabayire (1994)  U03849
R1-Bm      B. mori                         Xiong and Eickbush (1988b)               M19755
R1-Dm      D. melanogaster                 Jakubczak, Xiong, and Eickbush (1990)    X51968
R2-Bm      B. mori                         Burke, Calalang, and Eickbush (1987)     M16558
R2-Dm      D. melanogaster                 Jakubczak, Xiong, and Eickbush (1990)    X51967
RT-Ce      Caenorhabditis elegans          Wilson et al. (1994)                     U46668
RT1-Ag     An. gambiae                     Besansky et al. (1992)                   M93690
Sart1-Bm   B. mori                         Takahashi, Okazaki, and Fujiwara (1997)  D85594
SLACS      T. brucei                       Aksoy et al. (1990)                      X17078
T1-Ag      An. gambiae                     Besansky (1990)                          M93689
Ta11-1-At  Arabidopsis thaliana            Wright et al. (1996)                     L47193
Tad1-Nc    Neurospora crassa               Cambareri, Helber, and Kinsey (1994)     L25662
Tart-Dm    D. melanogaster                 Levis et al. (1993)                      U02279
Tras1-Bm   B. mori                         Okazaki, Ishikawa, and Fujiwara (1995)   D38414
Trim-Dmi   Drosophila miranda              Steinemann and Steinemann (1991)         X59239
Tx1-Xl     Xenopus laevis                  Garrett, Knutzon, and Carroll (1989)     M26915

NOTE: Thirty-eight non-LTR retrotransposon sequences are listed. The reverse transcriptase domains of I-Dt and Trim-Dmi were not included in the phylogenetic analyses shown in figure 6. However, the RNase H domains of I-Dt and Trim-Dmi were used in the comparison of RNase H sequences shown in table 2. The accession numbers are from GenBank/EMBL unless otherwise noted.
(a) PIR accession number.
(b) DDBJ accession number.
(c) Swiss-Prot accession number.

The reverse transcriptase domain of Lian-Aa1 is located in the middle of the ORF (fig. 1). It is more similar to the reverse transcriptase domains of non-LTR retrotransposons than to other types of retroelements, as indicated by a BLAST search. Q′ analyses further confirmed the relationship between the reverse transcriptase domain of Lian-Aa1 and reverse transcriptase domains of other non-LTR retrotransposons. For example, reverse transcriptase of Lian-Aa1 showed Q′ values of 60.0 and 25.6 in comparison to Sart1-Bm and LINE1-Hs. The Q′ values of the comparisons between the reverse transcriptase of Lian-Aa1 and the reverse transcriptase domains of three very divergent members of non-LTR retrotransposons, Cre1, Cre2, and SLACS, were 4.9, 4.6, and 7.0, respectively. All are higher than 3.0. However, the Q′ values of the comparisons of the reverse transcriptase domains between Lian-Aa1 and two LTR retrotransposons, copia and gypsy of D. melanogaster, are much lower, −1.0 and 0.1, respectively. A multiple-sequence alignment of the reverse transcriptase domains of Lian-Aa1 and 35 other non-LTR retrotransposons was constructed using Pileup of GCG and manual adjustment. Seven conserved blocks similar to those identified by Xiong and Eickbush (1990) were found, part of which are shown in figure 2B. Based on the sequence analyses described above, Lian-Aa1 is likely a novel non-LTR retrotransposon.
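As a rough consistency check, the element dimensions given in this section can be combined with the copy number and genome size reported elsewhere in the paper. The Python sketch below assumes that a full-length element is simply the 5′ untranslated region, the 1,219-codon ORF plus one stop codon, and the 3′ untranslated region, and that every one of the 1,380 copies is full length; both are simplifying assumptions rather than statements from the paper, and the function names are hypothetical.

```python
UTR5_BP = 545        # 5' untranslated region, from the text
ORF_CODONS = 1219    # amino acids encoded by the ORF, from the text
UTR3_BP = 255        # 3' untranslated region, from the text
GENOME_BP = 800e6    # haploid genome size, from Materials and Methods
MAX_COPIES = 1380    # upper copy-number estimate, from the abstract


def element_length_bp(utr5=UTR5_BP, codons=ORF_CODONS, utr3=UTR3_BP,
                      include_stop=True):
    """Length of one element as UTR + coding region (+ stop codon) + UTR."""
    coding_bp = 3 * (codons + (1 if include_stop else 0))
    return utr5 + coding_bp + utr3


def genome_fraction(copies, element_bp, genome_bp=GENOME_BP):
    """Fraction of the genome occupied if every copy is full length."""
    return copies * element_bp / genome_bp


length = element_length_bp()
fraction = genome_fraction(MAX_COPIES, length)
print(f"element length = {length} bp, genome fraction = {fraction:.2%}")
```

Under these assumptions the element is about 4.5 kb and 1,380 copies occupy a little under 0.8% of an 800-Mb genome, in line with the figure quoted in the abstract; truncated copies would lower the fraction.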

FIG. 1. Nucleotide and deduced amino acid sequence of a novel retrotransposon, Lian-Aa1, in Aedes aegypti. The single continuous open reading frame (ORF) contains 1,219 amino acid residues. The first methionine, which is at position 31 of the ORF, is underlined. Three putative domains for endonuclease, reverse transcriptase, and RNase H are marked. The determinations of these domains are based on comparisons with other similar sequences shown in figure 2. Note that the similarity to the endonuclease domains starts before the first methionine in Lian-Aa1, which is discussed in the Results.

In addition to the endonuclease and the reverse transcriptase domains, an RNase H domain was identified in the C-terminal region of Lian-Aa1 (fig. 1) by comparison with the previously identified RNase H domain of a non-LTR retrotransposon I-Dm (McClure 1992), using Gap of GCG (gap weight = 8, gap length weight = 1). These two domains showed 43.0% overall similarity and this similarity was shown to be significant (P = 8.2e−6) in a BLAST search. In addition, putative RNase H domains were identified in five other non-LTR retrotransposons, including CgT1, Ingi-Tb, I-Dt, Tras1-Bm, and Trim-Dmi, during a series of BLAST searches in which the query sequences included representatives


Table 2. Q′ Values of the Pairwise Comparisons Between the Putative RNase H Domains of Seven Non-LTR Retrotransposons and Known RNase H Sequences

                         Lian-Aa1  I-Dm      I-Dt      Trim-Dmi  Tras1-Bm  CgT1      Ingi-Tb
Lian-Aa1
I-Dm                     21.2
I-Dt                     16.2      59.5
Trim-Dmi                 36.1      22.8      23.4
Tras1-Bm                 22.3      22.2      22.1      23.9
CgT1                     17.3      16.4      12.0      18.0      10.1
Ingi-Tb                  6.2       11.8      10.8      9.4       13.7      10.4
MomLV                    7.5       15.2      14.2      12.8      13.5      4.6       6.5
EIAV                     13.1      13.0      19.3      12.4      9.9       9.2       8.5
HIV1                     12.4      9.0       8.1       14.5      7.3       9.1       4.6
C. elegans RNase H       17.0      11.0      7.6       16.9      9.1       9.0       10.5
S. cerevisiae RNase H    9.7       7.1       6.4       11.0      14.2      9.7       3.9
E. coli RNase H          12.7      10.9      9.3       9.5       11.5      7.1       11.4
M. smegmatis RNase H     9.2       6.0       5.3       4.5       9.8       14.3      6.0
T. aquaticus RNase H     5.1       1.0       1.6       3.6       6.7       2.4       12.9

NOTE. Q′ values are used to estimate the significance of the similarities of a pairwise comparison (Gribskov and Burgess 1986; Hagedorn, Maddison, and Tu 1998). A Q′ of 3.0 or higher is regarded as an indication that two sequences may be significantly related. The first seven sequences in the first column are putative RNase H domains of non-LTR retrotransposons. Sequences 8–10 are RNase H domains of retroviruses. Sequences 11–15 are RNase H proteins from different organisms. References and accession numbers of these sequences are listed either in Materials and Methods or in table 1.
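The threshold described in the note can be applied mechanically to the values in table 2. The following is a minimal Python sketch, not part of the original analysis: it groups the Q′ values transcribed from table 2 by comparison class and reports the range and the number of comparisons below the 3.0 cutoff. The grouping into three lists and the function name `summarize` are illustrative assumptions.

```python
# Q' values transcribed from table 2, grouped by comparison class.
# The list names and the helper below are illustrative, not from the paper.
Q_THRESHOLD = 3.0  # cutoff stated in the table note

# Seven non-LTR RNase H domains compared with each other (21 pairs)
non_ltr_pairs = [
    21.2, 16.2, 36.1, 22.3, 59.5, 22.8, 22.2, 23.4, 22.1, 23.9,
    17.3, 16.4, 12.0, 18.0, 10.1,
    6.2, 11.8, 10.8, 9.4, 13.7, 10.4,
]

# Non-LTR domains versus the three retroviral RNase H domains (21 pairs)
retrovirus_pairs = [
    7.5, 13.1, 12.4, 15.2, 13.0, 9.0, 14.2, 19.3, 8.1, 12.8, 12.4,
    14.5, 13.5, 9.9, 7.3, 4.6, 9.2, 9.1, 6.5, 8.5, 4.6,
]

# Non-LTR domains versus RNase H proteins of five organisms (35 pairs)
cellular_pairs = [
    17.0, 9.7, 12.7, 9.2, 5.1, 11.0, 7.1, 10.9, 6.0, 1.0,
    7.6, 6.4, 9.3, 5.3, 1.6, 16.9, 11.0, 9.5, 4.5, 3.6,
    9.1, 14.2, 11.5, 9.8, 6.7, 9.0, 9.7, 7.1, 14.3, 2.4,
    10.5, 3.9, 11.4, 6.0, 12.9,
]


def summarize(values, threshold=Q_THRESHOLD):
    """Return count, range, and number of values below the threshold."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "below_threshold": sum(v < threshold for v in values),
    }


if __name__ == "__main__":
    for label, values in [
        ("non-LTR vs non-LTR", non_ltr_pairs),
        ("non-LTR vs retrovirus", retrovirus_pairs),
        ("non-LTR vs RNase H protein", cellular_pairs),
    ]:
        print(label, summarize(values))
```

Under this grouping, the summaries can be checked against the ranges quoted in the Results for each class of comparison.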

FIG. 2. Multiple-sequence alignments showing the similarities of Lian-Aa1 in Aedes aegypti (marked by *) to other related sequences. Three putative domains, endonuclease, reverse transcriptase, and RNase H, were aligned separately. Only blocks of highly similar residues are shown. The Roman numerals above the alignments indicate these blocks. Abbreviations, references, and accession numbers of the sequences used in these three analyses are listed either in Materials and Methods or in table 1. Only complete domains were used. Many elements analyzed here do not have all three domains. A, Multiple-sequence alignment of the endonuclease domains of Lian-Aa1, I-Dt, and R1-Bm. The alignment was done by Pileup (gap weight = 8, gap length weight = 1) of GCG, which was then adjusted manually using Lineup of GCG focusing on the seven conserved blocks identified by Feng et al. (1996). The consensus was then created by Pretty (plurality = 2, threshold = 1) of GCG. In cases where similar residues were aligned but none represent the majority among the three sequences, the residue of one of the sequences was chosen arbitrarily as the consensus. B, Multiple-sequence alignment of the reverse transcriptase domains of 36 non-LTR retrotransposons. One amino acid residue was changed in Lian-Aa1 according to the consensus sequence of six Lian elements shown in figure 4E. A two-step alignment method was necessary to incorporate the three protozoan non-LTR retrotransposons (Cre1, Cre2, and SLACS), which have been shown to be the most divergent among non-LTR retrotransposons (e.g., Eickbush 1992). First, the three divergent non-LTR retrotransposons and the other 33 non-LTR retrotransposons were aligned separately by Pileup (gap weight = 12, gap length weight = 1). Minor adjustment was necessary at the N-termini of three of the sequences in the alignment of the 33 elements. The two alignments were then put together by aligning the seven conserved blocks identified by Xiong and Eickbush (1990). Finally, the amino acid residues between the seven blocks were aligned under the guidance of Profilegap of GCG, which compares one of the three sequences with the quantitative profile of the aligned 33 elements in the corresponding regions. The consensus was then created by Pretty (plurality = 18, threshold = 2) of GCG. The seven regions shown here are parts of the seven conserved blocks identified by Xiong and Eickbush (1990). C, Multiple-sequence alignment of the RNase H domains of seven non-LTR retrotransposons and three retroviruses and RNase H proteins in five different organisms. The RNase H sequences were aligned using Pileup of GCG (gap weight = 8, gap length weight = 1). The consensus was then created by Pretty (plurality = 8, threshold = 2) of GCG. The three amino acid residues identified in the active site of E. coli RNase H (Kanaya et al. 1990) are indicated by arrows.

of RNase H proteins and known RNase H domains in non-LTR retrotransposons, LTR retrotransposons, and retroviruses. The RNase H domains of CgT1 and Ingi-Tb have previously been described by McClure (1992) and He et al. (1996). The RNase H domains of I-Dt, Tras1-Bm, and Trim-Dmi were found to be very similar to those of Lian-Aa1 (P values range from 1.3e-3 to 1.2e-31) and I-Dm (P values range from 4.3e-7 to 8.8e-66). However, the above BLAST searches did not reveal RNase H domains in most non-LTR retrotransposons other than the above seven elements. Moreover, these BLAST searches showed that RNase H domains of the seven non-LTR retrotransposons were similar to the RNase H domains of some retroviruses and the RNase H proteins in several different organisms. For example, the RNase H domain of I-Dm was shown to be significantly similar to the RNase H domain of MomLV (P = 1.8e-3) and the RNase H protein in S. cerevisiae (P = 6.2e-4). The RNase H domain of Lian-Aa1 was shown to be significantly similar to the RNase H domain of the EIAV retrovirus (P = 1.9e-3). Q′ analyses were used to further investigate the relationship between the RNase H domains of these non-LTR retrotransposons and other known RNase H sequences. As shown in table 2, pairwise comparisons between RNase H domains of the seven non-LTR retrotransposons give Q′ values from 6.2 to 59.5, the majority of which are above 10. Pairwise comparisons between RNase H domains of the non-LTR retrotransposons and those of the three retroviruses shown in table 2 give somewhat lower Q′ values, ranging from 4.6 to 19.3. In all but three cases, the pairwise comparisons between RNase H domains of the non-LTR retrotransposons and the RNase H proteins in five different organisms also show Q′ values above 3.0, with the highest being 17.0. These analyses are consistent with the BLAST search results. As shown in figure 2C, a multiple-sequence alignment of the 15 RNase H sequences included in table 2 reveals three blocks of highly similar sequences. Furthermore, as shown in figure 2C, the three amino acid residues identified in the active site of E. coli RNase H protein (Kanaya et al. 1990) are invariant in all 15 sequences. Therefore, the above BLAST searches, Q′ analyses, and the multiple-sequence alignment all strongly indicate that seven non-LTR retrotransposons, including Lian-Aa1, contain RNase H domains similar to the RNase H proteins found in several different organisms and the RNase H sequences of some retroviruses.

FIG. 3. Identification of the putative (A) 5′ and (B) 3′ termini of the Lian elements in Aedes aegypti. Shown here are parts of the terminal sequences determined in six different copies of Lian elements. Both Lian-Aa3 and Lian-Aa4 show 5′ truncations of different lengths. The sequences of the 3′ ends of Lian-Aa4 and Lian-Aa5 are not available, as shown in figure 5. The two tandem horizontal arrows indicate the (CTGATAC)2 repeat. Putative 5′ and 3′ termini of the family of Lian elements are marked by vertical arrows. The exact 3′ termini of Lian-Aa1 and Lian-Aa2 may be 2 and 1 bp away from the arrow, respectively, as discussed in Results. Putative target duplications flanking Lian-Aa2 are underlined. An asterisk indicates that Lian-Aa1 is flanked by putative target duplications which are too long to be shown. The duplications flanking both Lian-Aa1 and Lian-Aa2 are imperfect, as discussed in Results.

FIG. 4. Analysis of six Lian elements and genomic Southern blots showing that Lian elements are highly reiterated and highly similar to each other. A, Consensus map of six different copies of Lian elements. Enzymes used are BglII (Bg), BstXI (Bs), EcoRI (E), HindIII (H), XbaI (Xb), and XhoI (Xh). Numbers in brackets represent the frequency of occurrence in the six different Lian elements. Thick solid lines indicate three regions that were sequenced for all six elements. The probes used in genomic Southern blots shown in B1 and B2 are also indicated. B1 and B2, Genomic Southern blots probed, respectively, with probes B1 and B2 as described in A. Symbols for enzymes are as in A. C, A quantitative genomic Southern blot showing the estimation of copy number of Lian. The blot was probed with probe B2. Lanes 1–6 are pBlueScript SK(-) plasmid DNA that contains a 920-bp EcoRI fragment of the Lian element. The EcoRI fragment covers the entire B2 probe. The inserts were released by digestion with EcoRI. Plasmid concentrations were 8,000, 1,600, 320, 64, 12.8, and 2.56 pg, respectively. Lane 7 represents 216 ng of Ae. aegypti genomic DNA digested with EcoRI. D, Sequence comparison of a 735-bp region at the 5′ end of the Lian elements. Shown here are only those positions that have sequence variations. Each number represents the position of the base below, using Lian-Aa1 as the reference. The consensus is based on simple majority rule. In cases where there is no majority base, a base that occurs no less than any other base is chosen arbitrarily. The arrow separates the bases before and after the ORF. Dots indicate sequences that are identical to the consensus. Lowercase letters indicate sequence variation. Dashed lines indicate 5′ truncations. The letter "S" below the sequences indicates that a base variation in the coding region is synonymous at the amino acid level. E, Sequence comparison of a 1,124-bp (positions 2255–3375 in Lian-Aa1) coding region of the Lian elements. Shown here are only those positions that have sequence variations. Numbering and determination of consensus are as described in D. The arrow separates bases that belong to the reverse transcriptase domain and bases that are between the reverse transcriptase and RNase H domains. The three dashed lines indicate three consecutive gaps in Lian-Aa1, which resulted in a deletion of one amino acid without a frameshift. A plus sign represents a change to a similar amino acid residue as defined by FASTA of GCG. A slash represents a change to a dissimilar residue. An asterisk represents a change to a stop codon. Meanings of other symbols are the same as in D.

The Lian Family May Have an Unusual 3′ Terminus

Five additional genomic clones that contain different copies of Lian elements were isolated and partially sequenced. As shown in figure 3A, Lian-Aa3 and Lian-Aa4 show 5′ truncations of different lengths. The similarities of Lian-Aa4 and Lian-Aa5 to the other Lian elements stopped approximately 530 and 660 bp, respectively, before the 3′ termini of the complete elements (figs. 3B and 5). This could be caused either by recombination events or by insertions of other repetitive elements into the 3′ regions of these two copies that shifted the 3′ ends of the Lian elements beyond the analyzed genomic clones. In fact, a putative miniature inverted-repeat transposable element named Wujin-like (Tu 1997) was found 3′ of the incomplete Lian-Aa4, as discussed later (fig. 5). As shown in figure 3, the putative 5′ and 3′ termini of the complete Lian elements were determined by locating the positions at which identities of the terminal sequences among the six Lians stopped. A tandem repeat of (CTGATAC)2 was found where sequence identities of the four Lian elements stopped near the 3′ end, although the tandem repeats in Lian-Aa1 and Lian-Aa2 were incomplete. The ends of these tandem repeats, complete or incomplete, are likely to be the 3′ termini of these Lian elements. Moreover, imperfect putative target duplications were shown to flank Lian-Aa1 and Lian-Aa2, respectively. The putative target duplications flanking Lian-Aa1 are 305 and 300 bp, and they are 97.7% identical (not shown). The duplications flanking Lian-Aa2 are 27 and 29 bp, and they are identical except that the 3′ copy has an extra TA sequence 5 bp from the end (fig. 3). Most non-LTR retrotransposons have target duplications of variable lengths ranging from 4 to 49 bp (Eickbush 1992). The putative target duplications flanking Lian-Aa1 are rather long compared to most non-LTR retrotransposons, although there are examples of long target duplications (e.g., Youngman, van Luenen, and Plasterk 1996). Sequence comparisons of the six Lian elements, as well as the presence of putative target duplications, suggest that Lian-Aa1, Lian-Aa2, and Lian-Aa6 are likely complete copies.

FIG. 5. Organization of the six characterized Lian elements in Aedes aegypti relative to other repetitive elements. The figure is not drawn to scale. The imperfect target duplications (DR) flanking Lian-Aa1 and Lian-Aa2 are indicated by thin arrows. Both black boxes and thick arrows indicate retrotransposons. Orientation of the arrow represents the orientation of the retrotransposon. A striped box indicates a miniature inverted-repeat transposable element. A dotted box indicates a yet unclassified repetitive element. A number above the line represents the approximate distance between a repetitive element and the Lian element. A question mark indicates that the position of a repetitive sequence relative to the Lian element is not determined. An asterisk represents the position of the end of the cloned fragment. The two copies of Unknown-Ds are very close to Lian-Aa6, less than 14 and 70 bp, respectively. All the repetitive elements shown here surrounding the six Lian elements were discovered based on sequence analysis, except for Wukong-Aa10, which was discovered by a dot-blot analysis (data not shown).

Most of the known non-LTR retrotransposons in which transposition is not sequence-specific have either a poly-A tract or a simple repeat of (TAA)n or (TGAAA)n as the 3′ terminal sequence (Eickbush 1992; Drew and Brindley 1997). The exceptions are SR1 of the human blood fluke Schistosoma mansoni and CR1 of the chicken, Gallus gallus, which have imperfect repeats of (AACCATTTG)2 and (NATTCTRT)1–4, respectively (Drew and Brindley 1997; Haas et al. 1997). The family of Lian elements is similar to CR1 and SR1, as it is likely to terminate in a repeat of (CTGATAC)2. It has been suggested that the tandem repeat may be the target for the priming event in reverse transcription of CR1 (Burch, Davis, and Haas 1993). It is not yet clear how these atypical termination sequences are derived.
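Locating such a terminal repeat in a sequence can be illustrated with a short Python sketch. This is not the procedure used in the study, where termini were determined by comparing the terminal sequences of the six copies; it only shows one way to scan for tandem copies of the CTGATAC unit and for a trailing incomplete copy. The function names and the toy sequence are hypothetical.

```python
import re

UNIT = "CTGATAC"  # repeat unit reported for the Lian 3' terminus


def find_tandem_repeats(seq, unit=UNIT, min_copies=2):
    """Return (start, end, copies) for each run of >= min_copies tandem units."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


def trailing_partial(seq, end, unit=UNIT):
    """Length of an incomplete unit copy that starts at position `end`."""
    n = 0
    while (
        n < len(unit) - 1
        and end + n < len(seq)
        and seq[end + n].upper() == unit[n]
    ):
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any Lian element
    toy = "AAGGTT" + UNIT * 2 + "CTGA" + "GGG"
    for start, end, copies in find_tandem_repeats(toy):
        print(start, end, copies, trailing_partial(toy, end))
```

For elements whose repeat is incomplete, as reported for Lian-Aa1 and Lian-Aa2, calling `find_tandem_repeats` with `min_copies=1` and then inspecting `trailing_partial` is one possible way to relax the search.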

The Lian Family of Retrotransposons Is Highly Reiterated in the Genome

The copy number of the Lian retrotransposons in the Ae. aegypti (Rock strain) genome was first estimated by screening a genomic library using a probe in the coding region of Lian-Aa1. There were approximately 1,380 copies of Lian elements per haploid genome, calculated based on the known size of the haploid genome of Ae. aegypti Rock strain (800 Mb; Rao and Rai 1987) and the 16-kb average insert size of the genomic libraries. Because different copies of Lian elements are highly conserved, as shown in qualitative genomic Southern blots (fig. 4B), a quantitative genomic Southern blot was also used to estimate the copy number of the Lian elements (fig. 4C). The average number estimated by the Southern method was 460 copies per haploid genome. This method gives a minimum estimate because any copy that has a point mutation at the enzyme sites used in the Southern blot cannot be counted. Based on these estimates, there are more than 460, and likely about 1,380, copies of Lian in the Ae. aegypti genome. Therefore, Lian could constitute up to 0.8% of the entire genome.
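The arithmetic behind these two estimates can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the authors' calculation: the library estimate assumes each positive clone carries a single Lian copy, the Southern estimate assumes equal mass per base pair for plasmid and genomic DNA, the vector length of about 2,958 bp and the matched standard of 480 pg are hypothetical inputs, and all function names are invented for the example.

```python
GENOME_BP = 800e6  # haploid genome size of the Rock strain (from the text)
INSERT_BP = 16e3   # average insert size of the genomic library (from the text)


def copies_from_library(positive_fraction, genome_bp=GENOME_BP,
                        insert_bp=INSERT_BP):
    """Copies per haploid genome, assuming one copy per positive clone."""
    clones_per_genome = genome_bp / insert_bp
    return positive_fraction * clones_per_genome


def copies_from_southern(matched_plasmid_pg, plasmid_bp, genomic_pg,
                         genome_bp=GENOME_BP):
    """Copies per haploid genome from the plasmid standard matching the band.

    Mass divided by length is proportional to the number of molecules,
    so the ratio below is fragment copies per genome equivalent.
    """
    plasmid_molecules = matched_plasmid_pg / plasmid_bp
    genome_equivalents = genomic_pg / genome_bp
    return plasmid_molecules / genome_equivalents


def implied_element_bp(copies, genome_fraction, genome_bp=GENOME_BP):
    """Mean element length implied by a copy number and a genome fraction."""
    return genome_fraction * genome_bp / copies


if __name__ == "__main__":
    # 1,380 copies correspond to 2.76% of clones being positive
    print(copies_from_library(0.0276))
    # Hypothetical: genomic band (216 ng) matching a 480-pg plasmid standard;
    # plasmid length = assumed 2,958-bp vector + 920-bp EcoRI fragment
    print(copies_from_southern(480, 2958 + 920, 216e3))
    # Mean element length implied by 1,380 copies making up 0.8% of 800 Mb
    print(implied_element_bp(1380, 0.008))
```

Read this way, the figure of 0.8% implies a mean element length of roughly 4.6 kb, and a Southern estimate near 460 copies would correspond to a genomic signal falling between the 320-pg and 1,600-pg standards listed in figure 4C.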

Two other non-LTR retrotransposons, Juan-A (Mouches, Bensaadi, and Salvado 1992) and JAM1 (Hughes, Warren, and Crampton 1996; Warren, Hughes, and Crampton 1997), are also abundant in Ae. aegypti, with >200 copies and 1,000 copies per haploid genome, respectively. In addition, seven other non-LTR retrotransposons have been found in a preliminary study, some of which are shown in figure 5. Most, if not all, of the seven retrotransposons seem to be different. Only 20% of the genome of the Bangkok strain of Ae. aegypti, which is similar in size to the genome of the Rock strain, consists of moderately repetitive DNA (Warren and Crampton 1991). Therefore, it is likely that Lian and other non-LTR retrotransposons are the predominant class of the long interspersed repetitive elements in the Ae. aegypti genome. It is interesting to note that LINE elements are also the predominant long interspersed repetitive elements in mammals (Smit 1996).

High Sequence Similarity Suggests Recent Transposition of Lian

As mentioned before, a total of six genomic clones that contained Lian elements were analyzed, including the original Lian-Aa1. The consensus map of these six cloned Lians (fig. 4A) suggests high degrees of similarity. Except for the differences at the 3′ end of Lian-Aa4 and Lian-Aa5, which were discussed earlier, each of the differences in restriction sites was caused by a single point mutation, as determined by subsequent partial sequence analysis. A 735-bp region at the 5′ end of the six Lians, which includes 545 bp before the ORF and 190 bp at the beginning of the ORF, was sequenced. Lian-Aa3 and Lian-Aa4 showed 5′ truncations at 76 and 53 bp from the putative terminus of the complete Lian elements, respectively. Other than the two truncations, there are only a very small number (0.6%) of base substitutions, as shown in figure 4D. Moreover, the substitutions at the beginning of the ORF are all synonymous. As shown in figure 4E, a 1,124-bp coding region was sequenced in the six Lian elements. Again, only 0.6% average substitutions were seen at the nucleotide level. Most of the nucleotide changes are synonymous or conservative (as defined by FASTA of GCG). A 150-bp 3′ untranslated region showed no base changes except the lack of sequence similarities at the termini of Lian-Aa4 and Lian-Aa5 (data not shown). Furthermore, genomic Southern blots shown in figure 4B1 and B2 clearly indicate that the hundreds of Lian elements are highly similar, because only one major band was seen in all five of the enzyme digestions that had two cuts within the Lian elements. However, in the case of XhoI, which only produces a single cut most of the time, a long continuous smear is seen in the Southern blot, which is consistent with the high copy number of the Lian elements in the genome. Therefore, both the analysis of multiple copies of the Lian elements and the genomic Southern blots suggest that the multiple copies of Lian elements are highly similar, indicating very recent transposition activity. In this regard, it is interesting to note that the Juan families have been shown to be recently active in both Aedes and Culex mosquitoes (Bensaadi-Merchermek, Salvado, and Mouches 1994).
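The majority-rule consensus described for figure 4D and E, and a per-copy divergence measured against it, can be sketched in Python as below. This is a minimal illustration only: the paper does not state exactly how the 0.6% average was computed, so measuring each copy against the consensus is an assumption, and the toy alignment and function names are hypothetical.

```python
from collections import Counter


def majority_consensus(seqs):
    """Majority-rule consensus of equal-length aligned sequences.

    Gaps ('-') are ignored when counting. Ties are resolved arbitrarily,
    here by whichever base Counter encountered first.
    """
    consensus = []
    for column in zip(*seqs):
        counts = Counter(base for base in column if base != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)


def percent_divergence(seq, consensus):
    """Percentage of compared (non-gap) positions that differ from consensus."""
    compared = [(a, b) for a, b in zip(seq, consensus)
                if a != "-" and b != "-"]
    diffs = sum(a != b for a, b in compared)
    return 100.0 * diffs / len(compared)


# Hypothetical toy alignment of six copies; '-' marks a 5' truncation
toy_copies = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAA",
    "-CGTTCGTAC",
]

if __name__ == "__main__":
    cons = majority_consensus(toy_copies)
    per_copy = [percent_divergence(s, cons) for s in toy_copies]
    print(cons, per_copy, sum(per_copy) / len(per_copy))
```

If the reported 0.6% is read as a per-copy figure, it corresponds to about four or five substitutions in the 735-bp region and about seven in the 1,124-bp region.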

Association of Lian Retrotransposons with Other Repetitive Elements

As shown in figure 5, all of the six characterized Lians are associated with one or more different repetitive elements. Lian-Aa1 was near a miniature inverted-repeat transposable element, Wukong-Aa1 (Tu 1997). Lian-Aa2 was associated with another Wukong element, Wukong-Aa10. Lian-Aa3 was inserted into RT-E, a putative retrotransposon similar to RT1 of An. gambiae (Besansky et al. 1992), as identified during a BLAST search (P = 2.1e-14). Two different miniature inverted-repeat transposable elements, Wuneng-Aa1 and Wujin-like (Tu 1997), were near Lian-Aa4. A fragment highly similar (87% identity in 125 amino acid residues compared) to JAM1 of Ae. aegypti (Warren, Hughes, and Crampton, GenBank Z86117) was found in the clone containing Lian-Aa5, although a few frameshifts had to be made in the translation to achieve high identity. Another retrotransposon, named RT-A, similar to Q of An. gambiae (Besansky, Bedell, and Mukabayire 1994), was found in the clone containing Lian-Aa6 during a BLAST search (P = 9.2e-31). Two unclassified repetitive elements, named Unknown-D1 and Unknown-D2, were also associated with Lian-Aa6. As shown in figure 5, Unknown-D1 is less than 14 bp from the 5′ end of Lian-Aa6, and Unknown-D2 is less than 70 bp from the 3′ end of Lian-Aa6. However, the exact distances between the Unknown-Ds and Lian-Aa6 have not been determined, as the termini of the two Unknown-Ds are not clear. The designation of the Unknown-Ds as putative repetitive elements was based on the identification of a third copy of this element in a vitellogenin gene, VgA1 of Ae. aegypti (Romans et al. 1995). BLAST searches showed that these three copies, which are approximately 330 bp long, are significantly similar, with P values all less than 4.0e-22. In addition, a FASTA search identified a 68-bp fragment in a genomic sequence (unpublished data) that showed 78% identity (P = 1.7e-3) to Unknown-D2, indicating an incomplete fourth copy of Unknown-D. Furthermore, preliminary analyses identified other sequences that have the characteristics of different types of repetitive elements in regions flanking the six copies of Lian elements (data not shown). More importantly, there could be more repetitive elements in these clones yet to be discovered, as not all the regions flanking the six copies of Lian elements have been sequenced and analyzed.

Evolutionary Implications

Given that only 20% of the Ae. aegypti genome is made up of moderately repetitive sequences (Warren and Crampton 1991), the frequent association between Lian and other repetitive elements described above suggests a nonrandom distribution of Lian elements in the genome. However, more copies of randomly chosen Lian elements need to be analyzed to further address this question. It is possible that some of the analyzed Lian elements may be either in the heterochromatic regions or in the euchromatic regions having local concentrations of repetitive sequences. Two types of I elements are present in D. melanogaster: the defective type, which is confined in the heterochromatin, and the active type, which can transpose to both heterochromatin and euchromatin. The heterochromatic I elements are often flanked by other families of repetitive elements, and they showed only 94% sequence identity with each other and with the active elements (Vaury, Bucheton, and Pelisson 1989; Busseau et al. 1994). However, the six analyzed copies of Lian elements do not seem to be different types based on their high sequence identity.

Some retrotransposons have been shown to be concentrated in the centromeric and telomeric regions of chromosomes (Pimpinelli et al. 1995; Mukabayire and Besansky 1996). Nonrandom distribution of retrotransposons has also been shown on chromosomal arms (Mukabayire and Besansky 1996). Regions near tRNA genes or preexisting LTR sequences have been shown to be hot spots for Ty1 transposition in yeast (Ji et al. 1993; Voytas 1996). A recent report showed a large number of retrotransposons inserted within one another in a 280-kb intergenic region in the maize genome (SanMiguel et al. 1996). Our preliminary analyses also showed concentrations of a number of repetitive elements in the noncoding regions of a few genes in Ae. aegypti (unpublished data). The distribution patterns of transposable elements are likely the result of complex interactions among different families of elements and/or between elements and the host genome. Several mechanisms could account for the nonrandom distribution and association of different families of repetitive elements in the genome. A particular chromosomal location where multiple elements are concentrated could be a safe haven for the insertion of repetitive elements. It is also possible that the hot spots for the insertion of repetitive elements are simply chromosomal regions that are more accessible (Craigie 1992). The association between some transposable elements could have also resulted from competition between these elements. For example, the insertion of a transposable element into other competing families of elements is likely to inactivate these elements. Further analysis of the genomic distribution of the highly reiterated Lian family and its association with other repetitive elements may facilitate the understanding of the genomic organization of Ae. aegypti as well as the genomic ecology between transposable elements and this mosquito host.

Phylogenetic Analysis of Reverse Transcriptase Domains of Lian and Other Non-LTR Retrotransposons

The reverse transcriptase domain is the most conserved region in the retroelements and is commonly used to infer the phylogenetic relationships of the retroelements. As shown in the sequence analyses of the reverse transcriptase domains described above, the Lian family is clearly a member of the non-LTR retrotransposons. Non-LTR retrotransposons are a diverse group of elements with a wide range of reverse transcriptase sequences. Most non-LTR retrotransposons form a monophyletic group in the retroelement superfamily (e.g., Xiong and Eickbush 1988a, 1990; Eickbush 1992, 1994; McClure 1993). A few elements from protozoa such as Cre1 and SLACS have been shown to be the most divergent members of the non-LTR group (Eickbush 1992). They were shown to be either basal branches within non-LTR elements (Eickbush 1992) or a sister group of a clade that contains all other non-LTR retrotransposons, group II introns, and msDNA-associated reverse transcriptase sequences (Xiong and Eickbush 1990).

Here, we describe a systematic attempt to study the phylogenetic relationships between members of the non-LTR retrotransposons including Lian elements using shared reverse transcriptase domains. To our knowledge, this is the first reported case in which bootstrap analysis was used to assess the confidence levels of groupings among a large number of non-LTR retrotransposons. The phylogenetic relationship of the reverse transcriptase domains of 36 non-LTR retrotransposons including Lian-Aa1 was analyzed using PAUP* 4d59 (Swofford 1997). The trees shown in figure 6A were constructed using 33 non-LTR retrotransposons which do not include the three protozoan elements Cre1, Cre2, and SLACS; the trees shown in figure 6B do include these three elements, which allows inference of the possible root of these non-LTR retrotransposons. However, as shown below, these three sequences are sufficiently divergent that the rooting obtained by this method is not certain. Moreover, inclusion of these divergent sequences may affect inferences about the relationships among other non-LTR retrotransposons. Therefore, the exclusion of the three divergent elements, as shown in figure 6A, allows an independent assessment of the branching patterns of the rest of the non-LTR retrotransposons.

Shown in figure 6A1 is the single most-parsimo-nious tree obtained for the 33 non-LTR retrotransposons.figure 6A2 is the bootstrap consensus tree of these samesequences, obtained using the maximum-parsimonymethod. The two trees are in good agreement, althougha few of the branches are not resolved in the bootstrapconsensus. There are four major groups (I to IV) of non-LTR retrotransposons that are supported by the bootstrapanalysis. Lian-Aa1 is closely related to group II, whichincludes R1-Dm, R1-Bm, RT1-Ag, Tras1-Bm, and Sart1-Bm, as indicated by the most-parsimonious tree in figure6A1. However, this relationship was not resolved in thebootstrap analysis. Shown in figure 6B1 is the strict con-sensus of the nine most-parsimonious trees of the 36non-LTR retrotransposons. Figure 6B2 is the bootstrapconsensus tree of these same sequences, obtained usingmaximum-parsimony method. The two trees in figure6B1 and B2 are in agreement in the major groupings,although a few of the branches are not resolved in thebootstrap consensus. More importantly, the rooting bythe three divergent protozoan elements (group V) is dif-ferent in these two trees. The most-parsimonious trees(fig. 6B1) placed group V at the node connecting groupII and the rest of the non-LTR retrotransposons, whilethe bootstrap tree (fig. 6B2) placed group V at the nodeconnecting group III and the rest of the elements. Theposition of Lian-Aa1 relative to group II is also differentin these two trees. While Lian-Aa1 was included ingroup II in the bootstrap tree, this relationship was notresolved in the most-parsimonious trees. The majorgroups that are supported by bootstrap analysis in figure6A2 and B2 are similar. The internal branching patternsof groups I and IV are the same in figure 6A2 and B2,while the branching patterns of groups II and III aresomewhat different. 
In summary, the analyses describedabove support five relatively well defined major groupswithin the 36 analyzed non-LTR retrotransposons, al-though there are other sequences whose groupings arenot yet clear. Groups III–V are well supported by boot-strap analyses, while the support for groups I and II isrelatively weak. Lian-Aa1 is possibly among, or closelyrelated to, the group II sequences, although not all anal-yses support this relationship. Moreover, the fact that thewell-supported group III includes elements from plants,insects, and vertebrates indicates either very early ori-gins of this group of non-LTR retrotransposons or pos-sible horizontal-transfer events. Although group V islikely at the basal branches of the phylogenetic tree ofthe non-LTR retrotransposons, how it is connected tothe rest of the elements is not certain. This is a common

Downloaded from https://academic.oup.com/mbe/article-abstract/15/7/837/1074872by gueston 17 February 2018

Page 13: Structural, Genomic, and Phylogenetic Analysis of Lian, a Novel

Mosquito Lian Retrotransposon 849

FIG. 6.—Phylogenetic analyses of the reverse transcriptase domains of non-LTR retrotransposons including Lian-Aa1 (marked by *), using parsimony methods. All analyses were conducted using PAUP* 4d59 (Swofford 1997). All trees were unrooted. The relative branch length was calculated by PAUP* 4d59. The Roman numeral at the base of a branch node indicates a major grouping of elements. The Arabic numeral at the base of a node is the bootstrap value, which represents the percentage of times out of 500 bootstrap resamplings that branches were grouped together at a particular node. A, Phylogenetic trees constructed using 33 non-LTR retrotransposons. The alignment used here was obtained using Pileup of GCG (gap weight = 12, gap length weight = 1). Minor adjustment was necessary, as discussed in figure 2C. The entire alignment is deposited in the EMBL database (accession number DS32420). A1, The single most-parsimonious tree of the 33 non-LTR retrotransposons, as found during 100 heuristic searches. Each search begins from a starting tree acquired with a random-addition sequence and TBR branch swapping. The most-parsimonious tree was found in 25 of the 100 replicates. All characters are of equal weight and unordered. A2, A bootstrap consensus tree of the most-parsimonious trees of the 33 non-LTR retrotransposons from 500 replicates. All characters are of equal weight and unordered. For each replicate, the most-parsimonious tree was sought with 10 heuristic searches using random addition and TBR branch swapping. B, Phylogenetic trees constructed using 36 non-LTR retrotransposons, which include the three most divergent elements, Cre1, Cre2, and SLACS. The sequences were aligned using a two-step method discussed in figure 2C. The entire alignment is deposited in the EMBL database (accession number DS32419). The methods and indications for the symbols are the same as in A1 and A2. An arrow indicates the possible position of the root of the tree of the non-LTR retrotransposons. B1, The strict consensus of the nine most-parsimonious trees of the 36 non-LTR retrotransposons, as found during 100 heuristic searches. These most-parsimonious trees were found in 46 of the 100 replicates. B2, A bootstrap consensus tree of the most-parsimonious trees of the 36 non-LTR retrotransposons, from 500 replicates. For each replicate, the most-parsimonious tree was sought with 15 heuristic searches.
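The bootstrap values and the strict consensus described in the legend can be illustrated with a minimal Python sketch. This is not the PAUP* implementation: the function names, the toy taxon labels, and the representation of a tree as a set of taxon groups are all assumptions made for illustration, and the parsimony search that would turn each resampled alignment into a tree is left out.

```python
import random


def resample_columns(alignment, rng):
    """Build one bootstrap pseudo-replicate by drawing alignment columns
    with replacement. `alignment` maps taxon name -> aligned sequence."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


def bootstrap_support(replicate_trees, group):
    """Percentage of replicate trees in which `group` appears as a grouping.
    Each tree is represented (hypothetically) as a set of frozensets."""
    group = frozenset(group)
    hits = sum(1 for tree in replicate_trees if group in tree)
    return 100.0 * hits / len(replicate_trees)


def strict_consensus(trees):
    """Keep only the groupings that occur in every input tree."""
    trees = [set(t) for t in trees]
    if not trees:
        return set()
    return set.intersection(*trees)


# Toy data: four replicate "trees" over hypothetical taxa t1..t4.
toy_trees = [
    {frozenset({"t1", "t2"}), frozenset({"t3", "t4"})},
    {frozenset({"t1", "t2"})},
    {frozenset({"t1", "t3"})},
    {frozenset({"t1", "t2"})},
]
support = bootstrap_support(toy_trees, {"t1", "t2"})  # 3 of 4 replicates
shared = strict_consensus(toy_trees[:2])
```

In this toy case the grouping of t1 with t2 is recovered in three of four replicates, so its support is 75%. The legend's values are the same kind of percentage taken over 500 replicates.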


problem when inferring trees with divergent sequences. It is clear that the resolution of the current analysis is relatively low, especially regarding the relationship between the major groups and the root of the non-LTR retrotransposons. More sequences of different types of non-LTR elements are necessary to better resolve these phylogenetic relationships.

Independent Assortment of Reverse Transcriptase and RNase H Domains

As discussed above, all non-LTR retrotransposons analyzed so far have very similar reverse transcriptase domains, most of which form a monophyletic group in the retroelement superfamily. In addition to the reverse transcriptase domains, we have shown that seven non-LTR retrotransposons including Lian-Aa1 contain RNase H domains similar to known RNase H sequences, while the majority do not. The identification of RNase H domains has also been reported for two other non-LTR retrotransposons, LINE1-Hs and Cin4-Zm (McClure 1993). However, our BLAST searches and Q9 analyses failed to confirm this (data not shown). Nevertheless, because only a small number of non-LTR retrotransposons contain RNase H domains that are similar to the RNase H proteins in several different organisms and the RNase H domains of some retroviruses (table 2 and fig. 2C), it is likely that independent assortment of reverse transcriptase and RNase H domains occurred during the evolution of non-LTR retrotransposons, as suggested by McClure (1992). This independent assortment may have been achieved in two ways, namely, recombination of domains between different types of retroelements or independent acquisition/loss of the RNase H domains. Phylogenetic analyses based on the alignment of the 15 RNase H domains shown in figure 2C provided no resolutions that were supported by bootstrap replications except for the closely related sequences, such as the three bacterial RNase H proteins and the RNase H domains of the I elements in the two Drosophila species (data not shown). Accumulation of more sequences of different domains from a wide range of retroelements and improvement of methods for determining the relationships among distantly related sequences may be necessary to better understand the domain evolution of retroelements.

Non-LTR retrotransposons are a diverse and widely distributed group of transposable elements. They often make up a significant fraction of eukaryotic genomes. Continued discovery and analysis of different families of non-LTR retrotransposons will facilitate the understanding of the evolution of this group of transposable elements as well as the organization and evolution of the genomes of their hosts.

Acknowledgments

Z.T. thanks H. H. Hagedorn for excellent advice, constant interest, and invaluable support, which made this work possible. Z.T. also thanks M. G. Kidwell for helpful guidance and generous encouragement. We thank H. H. Hagedorn, M. G. Kidwell, D. R. Maddison, P. M. O’Grady, J. B. Clark, Y. Park, and D. Lisch for critical comments on the manuscript. We are grateful to A. A. James for the gift of a genomic library of Ae. aegypti and to D. L. Swofford for providing the test version of PAUP* 4d59. We thank P. Romans for sharing unpublished information. We also thank the Sequencing Facility of the University of Arizona for their service. This work was supported by NIH grant HD24869 to H. H. Hagedorn and A. M. Fallon, NIH grant AI42121 to Z.T., and by a MacArthur Foundation grant to the Center for Insect Science of the University of Arizona.

LITERATURE CITED

ABAD, P., C. VAURY, A. PELISSON, M. C. CHABOISSIER, I. BUSSEAU, and A. BUCHETON. 1989. A long interspersed repetitive element–the I factor of Drosophila teissieri—is able to transpose in different Drosophila species. Proc. Natl. Acad. Sci. USA 86:8887–8891.

AGARWAL, M. L., N. BENSAADI, J. C. SALVADO, K. CAMPBELL, and C. MOUCHES. 1993. Characterization and genetic organization of full-length copies of a LINE retroposon family dispersed in the genome of Culex pipiens mosquitoes. Insect Biochem. Mol. Biol. 23:621–629.

AKSOY, S., S. WILLIAMS, S. CHANG, and F. F. RICHARDS. 1990. SLACS retrotransposon from Trypanosoma brucei gambiense is similar to mammalian LINEs. Nucleic Acids Res. 18:785–792.

ALTSCHUL, S. F., W. GISH, W. MILLER, E. W. MYERS, and D. J. LIPMAN. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.

ARKHIPOVA, I. R., N. V. LYUBOMIRSKAYA, and Y. V. ILYIN. 1995. Drosophila retrotransposons. Springer-Verlag, New York.

BECERRA, S. P., G. M. CLORE, A. M. GRONENBORN, A. R. KARLSTROM, S. J. STAHL, S. H. WILSON, and P. T. WINGFIELD. 1990. Purification and characterization of the RNase H domain of HIV-1 reverse transcriptase expressed in recombinant Escherichia coli. FEBS Lett. 270:76–80.

BENSAADI-MERCHERMEK, N., J. C. SALVADO, and C. MOUCHES. 1994. Mosquito transposable elements. Genetica 93:139–148.

BERG, D. E., and M. M. HOWE. 1989. Mobile DNA. American Society for Microbiology, Washington, D.C.

BESANSKY, N. J. 1990. A retrotransposable element from the mosquito Anopheles gambiae. Mol. Cell. Biol. 10:863–871.

BESANSKY, N. J., J. A. BEDELL, and O. MUKABAYIRE. 1994. Q: a new retrotransposon from the mosquito Anopheles gambiae. Insect Mol. Biol. 3:49–56.

BESANSKY, N. J., O. MUKABAYIRE, J. A. BEDELL, and H. LUSZ. 1996. Pegasus, a small terminal inverted repeat transposable element found in the white gene of Anopheles gambiae. Genetica 98:119–129.

BESANSKY, N. J., S. M. PASKEWITZ, D. M. HAMM, and F. H. COLLINS. 1992. Distinct families of site-specific retrotransposons occupy identical positions in the rRNA genes of Anopheles gambiae. Mol. Cell. Biol. 12:5102–5110.

BLINOV, A. G., Y. V. SOBANOV, S. S. BOGACHEV, A. P. DONCHENKO, and M. A. FILIPPOVA. 1993. The Chironomus thummi genome contains a non-LTR retrotransposon. Mol. Gen. Genet. 237:412–420.

BLINOV, A. G., Y. V. SOBANOV, S. V. SCHERBIK, and K. G. AIMANOVA. 1997. The Chironomus (Camptochironomus) tentans genome contains two non-LTR retrotransposons. Genome 40:143–150.

BROOKFIELD, J. F. Y. 1995. Transposable element as selfish DNA. Pp. 130–153 in D. J. SHERRATT, ed. Mobile genetic elements. Oxford University Press, Oxford.

BRITTEN, R. J. 1996. DNA sequence insertion and evolutionary variation in gene regulation. Proc. Natl. Acad. Sci. USA 93:9374–9377.

BURCH, J. B. E., D. L. DAVIS, and N. B. HAAS. 1993. Chicken repeat 1 elements contain a pol-like open reading frame and belong to the non-long terminal repeat class of retrotransposons. Proc. Natl. Acad. Sci. USA 90:8199–8203.

BURKE, W. D., C. C. CALALANG, and T. H. EICKBUSH. 1987. The site-specific ribosomal insertion element type II of Bombyx mori (R2Bm) contains the coding sequence for a reverse transcriptase-like enzyme. Mol. Cell. Biol. 7:2221–2230.

BUSSEAU, I., M.-C. CHABOISSIER, A. PELISSON, and A. BUCHETON. 1994. I factors in Drosophila melanogaster: transposition under control. Genetica 93:101–116.

CAMBARERI, E. B., J. HELBER, and J. A. KINSEY. 1994. Tad1, an active LINE-like element of Neurospora crassa. Mol. Gen. Genet. 242:658–665.

CHARLESWORTH, B., and C. H. LANGLEY. 1989. The population genetics of Drosophila transposable elements. Annu. Rev. Genet. 23:251–287.

CHIU, I. M., A. YANIV, J. E. DAHLBERG, A. GAZIT, S. F. SKUNTZ, S. R. TRONICK, and S. A. AARONSON. 1985. Nucleotide sequence evidence for relationship of AIDS retrovirus to lentiviruses. Nature 317:366–368.

CRAIGIE, R. 1992. Hotspots and warm spots: integration specificity of retroelements. Trends Genet. 8:187–190.

DI NOCERA, P. P., and G. CASARI. 1987. Related polypeptides are encoded by Drosophila F elements, I factors, and mammalian L1 sequences. Proc. Natl. Acad. Sci. USA 84:5843–5847.

DOOLITTLE, W. F., and C. SAPIENZA. 1980. Selfish genes, the phenotype paradigm and genome evolution. Nature 284:601–603.

DREW, A. C., and P. J. BRINDLEY. 1997. A retrotransposon of the non-long terminal repeat class from the human blood fluke Schistosoma mansoni. Similarities to the chicken-repeat-1-like elements of vertebrates. Mol. Biol. Evol. 14:602–610.

EICKBUSH, T. H. 1992. Transposing without ends: the non-LTR retrotransposable elements. New Biol. 4:430–440.

EICKBUSH, T. H. 1994. Origin and evolutionary relationships of retroelements. Pp. 121–157 in S. S. MORSE, ed. The evolutionary biology of viruses. Raven Press, New York.

EICKBUSH, T. H. 1997. Telomerase and retrotransposons: which came first? Science 277:911–912.

FAWCETT, D. H., C. K. LISTER, E. KELLETT, and D. J. FINNEGAN. 1986. Transposable elements controlling I-R hybrid dysgenesis in D. melanogaster are similar to mammalian LINEs. Cell 47:1007–1015.

FELSENSTEIN, J., and H. KISHINO. 1993. Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull. Syst. Biol. 42:193–200.

FENG, Q., J. V. MORAN, H. H. KAZAZIAN JR., and J. D. BOEKE. 1996. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell 87:905–916.

FINNEGAN, D. J. 1992. Transposable elements. Curr. Opin. Genet. Dev. 2:861–867.

GABRIEL, A., T. J. YEN, D. C. SCHWARTZ, C. L. SMITH, J. D. BOEKE, B. SOLLNER-WEBB, and D. W. CLEVELAND. 1990. A rapidly rearranging retrotransposon within the miniexon gene locus of Crithidia fasciculata. Mol. Cell. Biol. 10:615–624.

GARRETT, J. E., D. S. KNUTZON, and D. CARROLL. 1989. Composite transposable elements in the Xenopus laevis genome. Mol. Cell. Biol. 9:3018–3027.

GRIBSKOV, M., and R. BURGESS. 1986. Sigma factors from E. coli, B. subtilis, phage SP01, and phage T4 are homologous proteins. Nucleic Acids Res. 14:6745–6763.

HAAS, N. B., J. M. GRABOWSKI, A. B. SIVITZ, and J. B. E. BURCH. 1997. Chicken repeat 1 (CR1) elements, which define an ancient family of vertebrate non-LTR retrotransposons, contain two closely spaced open reading frames. Gene 197:305–309.

HAGEDORN, H. H., D. R. MADDISON, and Z. TU. 1998. Evolution of vitellogenins, cyclorrhaphan yolk proteins, and other related molecules. Adv. Insect Physiol. 27:235–284.

HARTL, D. L., E. R. LOZOVSKAYA, D. I. NURMINSKY, and A. R. LOHE. 1997. What restricts the activity of mariner-like transposable elements? Trends Genet. 13:197–201.

HATTORI, M., S. KUHARA, O. TAKENAKA, and Y. SAKAKI. 1986. L1 family of repetitive DNA sequences in primates may be derived from a sequence encoding a reverse transcriptase-related protein. Nature 321:625–628.

HE, C., J. P. NOURSE, S. KELEMU, J. A. G. IRWIN, and J. M. MANNERS. 1996. CgT1: a non-LTR retrotransposon with restricted distribution in the fungal phytopathogen Colletotrichum gloeosporioides. Mol. Gen. Genet. 252:320–331.

HUGHES, M. A., A. M. WARREN, and J. M. CRAMPTON. 1996. JAM1: a novel LINE transposable element in the genome of the medically important mosquito, Aedes aegypti. P. 276 in Proceedings of the XXth International Congress of Entomology, Florence, Italy.

HUTCHISON, C. A., S. C. HARDIES, D. D. LOEB, W. R. SHEHEE, and M. H. EDGELL. 1989. LINEs and related retroposons: long interspersed repeated sequences in the eucaryotic genome. Pp. 593–617 in D. E. BERG and M. M. HOWE, eds. Mobile DNA. American Society for Microbiology, Washington, D.C.

ITAYA, M., and K. KONDO. 1991. Molecular cloning of a ribonuclease H (RNase HI) gene from an extreme thermophile Thermus thermophilus HB8: a thermostable RNase H can functionally replace the Escherichia coli enzyme in vivo. Nucleic Acids Res. 19:4443–4449.

ITAYA, M., D. MCKELVIN, S. K. CHATTERJIE, and R. J. CROUCH. 1991. Selective cloning of genes encoding RNase H from Salmonella typhimurium, Saccharomyces cerevisiae and Escherichia coli rnh mutant. Mol. Gen. Genet. 227:438–445.

JAKUBCZAK, J. L., Y. XIONG, and T. H. EICKBUSH. 1990. Type I (R1) and type II (R2) ribosomal DNA insertions of Drosophila melanogaster are retrotransposable elements closely related to those of Bombyx mori. J. Mol. Biol. 212:37–52.

JI, H., D. P. MOORE, M. A. BLOMBERG, L. T. BRAITERMAN, D. F. VOYTAS, G. NATSOULIS, and J. D. BOEKE. 1993. Hotspots for unselected Ty1 transposition events on yeast chromosome III are near tRNA genes and LTR sequences. Cell 73:1007–1018.

KAJIKAWA, M., K. OHSHIMA, and N. OKADA. 1997. Determination of the entire sequence of turtle CR1: the first open reading frame of the turtle CR1 element encodes a protein with a novel zinc finger motif. Mol. Biol. Evol. 14:1206–1217.

KANAYA, S., and R. J. CROUCH. 1983. DNA sequence of the gene coding for Escherichia coli ribonuclease H. J. Biol. Chem. 258:1276–1281.

KANAYA, S., A. KOHARA, Y. MIURA, A. SEKIGUCHI, S. IWAI, H. INOUE, E. OHTSUKA, and M. IKEHARA. 1990. Identification of the amino acid residues involved in an active site of Escherichia coli ribonuclease H by site-directed mutagenesis. J. Biol. Chem. 265:4615–4621.

KE, Z., G. L. GROSSMAN, A. J. CORNEL, and F. H. COLLINS. 1996. Quetzal: a transposon of the Tc1 family in the mosquito Anopheles albimanus. Genetica 98:141–147.

KIDWELL, M. G., and D. LISCH. 1997. Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. USA 94:7704–7711.

KNIGHT, M., A. MILLER, N. RAGHAVAN, C. RICHARDS, and F. LEWIS. 1992. Identification of a repetitive element in the snail Biomphalaria glabrata: relationship to the reverse transcriptase-encoding sequence in LINE-1 transposons. Gene 118:181–187.

KOELLE, M. R., W. A. SEGRAVES, and D. S. HOGNESS. 1992. DHR3: a Drosophila steroid receptor homolog. Proc. Natl. Acad. Sci. USA 89:6167–6171.

LEVIN, H. L. 1997. It’s prime time for reverse transcriptase. Cell 88:5–8.

LEVIS, R. W., R. GANESAN, K. HOUTCHENS, L. A. TOLAR, and F. M. SHEEN. 1993. Transposons in place of telomeric repeats at a Drosophila telomere. Cell 75:1083–1093.

LIN, Y., M. T. HAMBLIN, M. J. EDWARDS, C. BARILLAS-MURY, M. R. KANOST, D. G. KNIPPLE, M. F. WOLFNER, and H. H. HAGEDORN. 1993. Structure, expression, and hormonal control of genes from the mosquito, Aedes aegypti, which encode proteins similar to the vitelline membrane proteins of Drosophila melanogaster. Dev. Biol. 155:558–568.

LUAN, D. D., M. H. KORMAN, J. L. JAKUBCZAK, and T. H. EICKBUSH. 1993. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72:595–605.

MCCLURE, M. A. 1991. Evolution of retroposons by acquisition or deletion of retrovirus-like genes. Mol. Biol. Evol. 8:835–856.

MCCLURE, M. A. 1992. Sequence analysis of eukaryotic retroid proteins. Math. Comput. Modelling 16:121–136.

MCCLURE, M. A. 1993. Evolutionary history of reverse transcriptase. Pp. 425–444 in A. M. SKALKA and S. GOFF, eds. Reverse transcriptase. Cold Spring Harbor Laboratory Press, Plainview, N.Y.

MCDONALD, J. F. 1993. Evolution and consequences of transposable elements. Curr. Opin. Genet. Dev. 3:855–864.

MCDONALD, J. F. 1995. Transposable elements: possible catalysts of organismic evolution. Trends Ecol. Evol. 10:123–126.

MARTIN, S. L. 1995. Characterization of a LINE-1 cDNA that originated from RNA present in ribonucleoprotein particles: implications for the structure of an active mouse LINE-1. Gene 153:261–266.

MASON, J. M., and H. BIESSMANN. 1995. The unusual telomeres of Drosophila. Trends Genet. 11:58–62.

MEINKOTH, J., and G. WAHL. 1984. Hybridization of nucleic acids immobilized on solid supports. Anal. Biochem. 138:267–284.

MIZRAHI, V., P. HUBERTS, S. S. DAWES, and L. R. DUDDING. 1993. A PCR method for the sequence analysis of the gyrA, polA and rnhA gene segments from mycobacteria. Gene 136:287–290.

MOUCHES, C., N. BENSAADI, and J. C. SALVADO. 1992. Characterization of a LINE retroposon dispersed in the genome of three non sibling Aedes mosquito species. Gene 120:183–190.

MUKABAYIRE, O., and N. J. BESANSKY. 1996. Distribution of T1, Q, Pegasus and mariner transposable elements on the polytene chromosomes of PEST, a standard strain of Anopheles gambiae. Chromosoma 104:585–595.

MURPHY, N. B., A. PAYS, P. TEBABI, H. COQUELET, M. GUYAUX, M. STEINERT, and E. PAYS. 1987. Trypanosoma brucei repeated element with unusual structural and transcriptional properties. J. Mol. Biol. 195:855–871.

NAKAMURA, T. M., G. B. MORIN, K. B. CHAPMAN, S. L. WEINRICH, W. H. ANDREWS, J. LINGNER, C. B. HARLEY, and T. R. CECH. 1997. Telomerase catalytic subunit homologs from fission yeast and human. Science 277:955–959.

O’HARE, K., M. R. ALLEY, T. E. CULLINGFORD, A. DRIVER, and M. J. SANDERSON. 1991. DNA sequence of the Doc retroposon in the white-one mutant of Drosophila melanogaster and of secondary insertions in the phenotypically altered derivatives white-honey and white-eosin. Mol. Gen. Genet. 225:17–24.

OKAZAKI, S., H. ISHIKAWA, and H. FUJIWARA. 1995. Structural analysis of Tras1, a novel family of telomeric repeat-associated retrotransposons in the silkworm, Bombyx mori. Mol. Cell. Biol. 15:4545–4552.

ORGEL, L. E., and F. H. CRICK. 1980. Selfish DNA: the ultimate parasite. Nature 284:604–607.

PARDUE, M. L., O. N. DANILEVSKAYA, K. LOWENHAUPT, F. SLOT, and K. L. TRAVERSE. 1996. Drosophila telomeres: new views on chromosome evolution. Trends Genet. 12:48–52.

PIMPINELLI, S., M. BERLOCO, L. FANTI, P. DIMITRI, S. BONACCORSI, E. MARCHETTI, and R. CAIZZI. 1995. Transposable elements are stable structural components of Drosophila melanogaster heterochromatin. Proc. Natl. Acad. Sci. USA 92:3804–3808.

PRIIMAGI, A. F., L. J. MIZROKHI, and Y. V. ILYIN. 1988. The Drosophila mobile element jockey belongs to LINEs and contains coding sequences homologous to some retroviral proteins. Gene 70:253–262.

RAO, P. S., and K. S. RAI. 1987. Inter and intraspecific variation in nuclear DNA content in Aedes mosquitoes. Heredity 59:253–258.

ROBERTSON, H. M. 1993. The mariner transposable element is widespread in insects. Nature 362:241–245.

ROBERTSON, H. M., and D. J. LAMPE. 1995. Distribution of transposable elements in arthropods. Annu. Rev. Entomol. 40:333–357.

ROMANS, P., R. K. BHATTACHARYYA, and A. C. COLAVITA. 1998. Ikirara, a novel transposon family from the malaria vector mosquito Anopheles gambiae. Insect Mol. Biol. 7:1–10.

ROMANS, P., Z. TU, Z. KE, and H. H. HAGEDORN. 1995. Analysis of a vitellogenin gene of the mosquito, Aedes aegypti and comparisons to vitellogenins from other organisms. Insect Biochem. Mol. Biol. 25:939–958.

SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning: A laboratory manual, 2nd edition. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.

SANMIGUEL, P., A. TIKHONOV, Y.-K. JIN et al. (11 co-authors). 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science 274:765–768.

SCHWARZ-SOMMER, Z., L. LECLERCQ, X. GOEBEL, and H. SAEDLER. 1987. Cin4, an insert altering the structure of the A1 gene of Zea mays, exhibits properties of nonviral retrotransposons. EMBO J. 6:3873–3880.

SHERRATT, D. J. 1995. Mobile genetic elements. Oxford University Press, Oxford.

SHINNICK, T. M., R. A. LERNER, and J. G. SUTCLIFFE. 1981. Nucleotide sequence of Moloney murine leukaemia virus. Nature 293:543–548.

SMIT, A. F. A. 1996. The origin of interspersed repeats in the human genome. Curr. Opin. Genet. Dev. 6:743–748.


STEINEMANN, M., and S. STEINEMANN. 1991. Preferential Y chromosomal location of TRIM, a novel transposable element of Drosophila miranda, obscura group. Chromosoma 101:169–179.

SWOFFORD, D. L. 1997. PAUP* version 4.0d59. (A test version, used with permission of D. L. Swofford; completed version 4.0 to be distributed by Sinauer, Sunderland, Mass.)

TAKAHASHI, H., S. OKAZAKI, and H. FUJIWARA. 1997. A new family of site-specific retrotransposons, SART1, is inserted into telomeric repeats of the silkworm, Bombyx mori. Nucleic Acids Res. 25:1578–1584.

TENG, S. C., S. X. WANG, and A. GABRIEL. 1995. A new non-LTR retrotransposon provides evidence for multiple distinct site-specific elements in Crithidia fasciculata miniexon arrays. Nucleic Acids Res. 23:2929–2936.

TU, Z. 1997. Three novel families of miniature inverted-repeat transposable elements are associated with genes of the yellow fever mosquito, Aedes aegypti. Proc. Natl. Acad. Sci. USA 94:7475–7480.

TU, Z., and H. H. HAGEDORN. 1997. Biochemical, molecular, and phylogenetic analysis of pyruvate carboxylase in the yellow fever mosquito, Aedes aegypti. Insect Biochem. Mol. Biol. 27:133–147.

UDOMKIT, A., S. FORBES, G. DALGLEISH, and D. J. FINNEGAN. 1995. BS, a novel LINE-like element in Drosophila melanogaster. Nucleic Acids Res. 23:1354–1358.

VAURY, C., A. BUCHETON, and A. PELISSON. 1989. The β heterochromatic sequences flanking the I elements are themselves defective transposable elements. Chromosoma 98:215–224.

VOYTAS, D. F. 1996. Retroelements in genome organization. Science 274:737–738.

WARREN, A. M., and J. M. CRAMPTON. 1991. The Aedes aegypti genome: complexity and organization. Genet. Res. 58:225–232.

WARREN, A. M., M. A. HUGHES, and J. M. CRAMPTON. 1997. Zebedee: a novel copia-Ty1 transposable element in the genome of the medically important mosquito, Aedes aegypti. Mol. Gen. Genet. 254:505–513.

WESSLER, S. R., T. E. BUREAU, and S. E. WHITE. 1995. LTR-retrotransposons and MITEs: important players in the evolution of plant genomes. Curr. Opin. Genet. Dev. 5:814–821.

WILSON, R., R. AINSCOUGH, K. ANDERSON et al. (53 co-authors). 1994. 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature 368:32–38.

WRIGHT, D. A., N. KE, J. SMALLE, B. HAUGE, H. M. GOODMAN, and D. F. VOYTAS. 1996. Multiple non-LTR retrotransposons in the genome of Arabidopsis thaliana. Genetics 142:569–578.

XIONG, Y., and T. H. EICKBUSH. 1988a. Similarities of reverse transcriptase-like sequences of viruses, transposable elements, and mitochondrial introns. Mol. Biol. Evol. 5:675–690.

XIONG, Y., and T. H. EICKBUSH. 1988b. The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non-long-terminal-repeat retrotransposons. Mol. Cell. Biol. 8:114–123.

XIONG, Y., and T. H. EICKBUSH. 1990. Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J. 9:3353–3362.

YOUNGMAN, S., H. G. VAN LUENEN, and R. H. PLASTERK. 1996. Rte-1, a retrotransposon-like element in Caenorhabditis elegans. FEBS Lett. 380:1–7.

PIERRE CAPY, reviewing editor

Accepted March 13, 1998
