
Copyright © 1991 by the Genetics Society of America

Putative Origin and Function of the Intergenic Region Between COI and COII of Apis mellifera L. Mitochondrial DNA

Jean-Marie Cornuet,* Lionel Garnery* and Michel Solignac†

*Laboratoire de Neurobiologie Comparée des Invertébrés, INRA-CNRS (URA 1190), 91440 Bures-sur-Yvette, France, and †Laboratoire de Biologie et Génétique Evolutives, CNRS, 91190 Gif-sur-Yvette, France

Manuscript received October 10, 1990
Accepted for publication January 24, 1991

ABSTRACT

The mitochondrial genome of honeybees is characterized by the presence of a long intergenic sequence located between the COI and COII genes. In addition, the length of this sequence varies between and within subspecies. Four length categories (200, 250, 450 and 650 bp) have been found in 63 sampled colonies. Analysis of the sequence of the largest type reveals the existence of two units: P (54 bp, 100% A + T) and Q (196 bp, 93.4% A + T). The lengths encountered in the sample are explained by the following combinations: Q, PQ, PQQ and PQQQ. According to similarities in primary and secondary structures, the sequence Q has been divided into three parts: Q1 (similar to the 3' end of the COI gene), Q2 (similar to the neighboring tRNA-Leu gene) and Q3 (highly similar to the P sequence). These relationships led us to hypothesize that these sequences, which do not have any counterpart in Drosophila yakuba mitochondrial DNA (mtDNA), arose by tandem duplication. The usual location of length variation in mtDNA control regions prompted us to examine the hypothesis that this COI-COII intergenic region might contain an origin of replication. High A + T content, stability profile, and putative hairpin and cloverleaf secondary structures are all in favor of this hypothesis.

In most animal species studied so far, the size of mitochondrial DNA (mtDNA) is remarkably homogeneous and is generally between 16 and 20 kb. This is due first to its high conservation in gene content, but also to the absence or shortness of intergenic sequences and to the lack of introns (ATTARDI 1985; BROWN 1985). Most of the large-scale size variation lies in a single region of the molecule, the control region, which contains most of the regulatory sequences. As a rule, this variation is related to the existence of repeated sequences, the different number of units of repetition being responsible for the length variability observed within several species and between related species [Cnemidophorus sp., DENSMORE, WRIGHT and BROWN (1985); Drosophila sp., SOLIGNAC, MONNEROT and MOUNOLOU (1986); Triturus cristatus, WALLIS (1987); Gryllus sp., RAND and HARRISON (1989); Acipenser transmontanus, BUROKER et al. (1990); Oryctolagus cuniculus, MIGNOTTE et al. (1990)]. The occurrence of repeated sequences also accounts for the unusual length of the mitochondrial genome in some animal species: 26 kb for the nematode Romanomermis culicivorax (POWERS, PLATZER and HYMAN 1986), 42 kb for the scallop Placopecten magellanicus (SNYDER et al. 1987; LA ROCHE et al. 1990), and 36 kb for several species of the bark weevil Pissodes (BOYCE, ZWICK and AQUADRO 1989). Other parts of the genome (i.e., intergenic sequences) generally have less length variation.

Genetics 128: 393-403 (June, 1991)

The length of the mitochondrial genome of the honeybee (Apis mellifera) is between 16.5 and 17 kb (SMITH and BROWN 1988). This range is explained by length variability in at least two regions (SMITH and BROWN 1990): one of them, as usual, is the control region and the other, where size differences are larger (ca. 450 bp between the longest and shortest types), is the COI-COII junction. This region has recently been sequenced by CROZIER, CROZIER and MACKINLAY (1989). They found, between the tRNA-Leu and the COII genes, an unassigned sequence of 194 bp where only five nucleotides are present in the Drosophila yakuba mtDNA (CLARY and WOLSTENHOLME 1985).

The sequence obtained by CROZIER, CROZIER and MACKINLAY (1989) belongs to the shortest type found in Apis mellifera. We have sequenced a domain encompassing this unassigned sequence in a genome of the largest size class in A. mellifera. In addition, the corresponding domain has been sequenced in three other Apis species, and in two related species: Bombus lucorum (Apidae) and Xylocopa violacea (Anthophoridae). Analyses of base composition, stability profile and secondary structure lead us to propose an explanation for the length variation and hypotheses about the evolutionary origin and the function of this unassigned sequence.

MATERIALS AND METHODS

mtDNA preparation: Mitochondria were extracted from honeybee workers. Young adults were ground in 15% sucrose and a crude pellet of mitochondria was obtained by differential centrifugation. Mitochondria were then purified on a 42.5/30/20% sucrose gradient (SOLIGNAC and MONNEROT 1986). Proteins were extracted with one phenol and one phenol/chloroform treatment. Nucleic acids were ethanol precipitated; the pellet was rinsed with 70% ethanol and dissolved in sterile water.

Determination of size categories: The size variation in the COI-COII region of honeybee mtDNA is most apparent on BclI restriction profiles (SMITH and BROWN 1990). mtDNAs from 63 colonies from 8 different subspecies were digested with BclI according to the manufacturer's recommendations. The fragments were end-labeled with [α-32P]dNTP and separated on a 0.8% agarose gel, and their positions were revealed by autoradiography of the gel.

Cloning and sequencing: Four length categories are visible for a particular BclI fragment. A colony of A. mellifera from Colmar (France) belonging to the subspecies mellifera was chosen because its corresponding BclI fragment was of the largest type (ca. 1600 bp).

A mix of BclI fragments was cloned in the compatible BamHI site of a pUC8 plasmid using Escherichia coli strain JM101. The clones containing the right fragment were identified through sizing and restriction mapping. After amplification, the cloned DNA was digested with EcoRI and HindIII and subcloned in M13mp18 and M13mp19. Sequencing was done by the dideoxy chain-termination method of SANGER, NICKLEN and COULSON (1977), using the M13 sequencing primer and three internal oligonucleotide primers. The sequences of the primers are the following:

E1: 5'-GGAGTAAATCTAACTTTC-3'

E2: 5'-GGCAGAATAAGTGCATTG-3'

H3: 5'-ATATGAATCATGTGG-3'.

Polymerase chain reaction: In addition to the sequence obtained as described above, the aforementioned primers allowed us to obtain partial mtDNA sequences from colonies of various subspecies of A. mellifera and of three other species of the genus Apis: A. cerana, A. dorsata and A. florea (one individual per species). Amplification was done according to KOCHER et al. (1989) with slight modifications of the thermal cycles (45 s at 92°, 45 s at 41°, 2 min 30 s at 62°). Single-stranded DNA was obtained either by asymmetric priming polymerase chain reaction (PCR) (GYLLENSTEN and ERLICH 1988) or by strand-specific digestion with λ exonuclease (HIGUCHI and OCHMAN 1989).

Computer analyses: Stability profiles of duplex DNA were obtained through the B.I.S.A.N.C.E. DNA analysis package (DESSEN et al. 1990). The program follows the method published by GABARRO-ARPA and MICHEL (1982). Unwinding of a given DNA domain is obtained when the sum of the free energies of the base pairs that it contains matches a certain value (W), which depends on environmental conditions such as temperature and salinity (MICHEL, GABARRO-ARPA and DUJON 1982). In the computations, the environmental parameter has been set to W = 3 (MIGNOTTE et al. 1987).
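The stability computation itself is not spelled out here, so the following Python sketch does not reproduce the method of GABARRO-ARPA and MICHEL (1982) or the B.I.S.A.N.C.E. package. It swaps in a much cruder stand-in, a sliding-window A + T fraction, only to illustrate why a short G + C-containing motif stands out against an A + T-rich background. The function name, the window size and the toy sequence are all assumptions made for illustration.

```python
from typing import List


def at_fraction_profile(seq: str, window: int = 10) -> List[float]:
    """Return the A + T fraction of every window of the given size.

    This is a crude proxy for duplex instability (more A + T, less stable),
    NOT the free-energy method cited in the text.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    # 1 for A or T, 0 for anything else.
    flags = [1 if base in "AT" else 0 for base in seq]
    count = sum(flags[:window])
    profile = [count / window]
    # Slide the window one base at a time, updating the running count.
    for i in range(window, len(seq)):
        count += flags[i] - flags[i - window]
        profile.append(count / window)
    return profile


if __name__ == "__main__":
    # Hypothetical toy sequence: a CCCAC motif embedded in pure A/T.
    toy = "AAAAA" + "CCCAC" + "TTTTT"
    for start, value in enumerate(at_fraction_profile(toy, window=5)):
        print(start, toy[start:start + 5], value)
```

In the toy run, the window that exactly covers CCCAC has the lowest A + T fraction, which is the qualitative behaviour expected of a local stability peak.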

RESULTS

Variable number of tandem repeats in the COI-COII intergenic region: Four length categories (Figure 1) were found for the variable-size BclI fragment in the 63 assayed honeybee colonies. We classified them as very short ("VS": ca. 1150 bp), short ("S": ca. 1200 bp), medium ("M": ca. 1400 bp) and large ("L": ca. 1600 bp).

FIGURE 1.-Restriction patterns of mtDNA from seven A. mellifera colonies digested with BclI. Letters L, M, S, VS correspond to the various possible lengths of a specific restriction fragment (see text). Right lane: HindIII/lambda phage DNA digest used as molecular weight standard.

Figure 2 gives the sequence of the cloned fragment (L type). Its actual size is 1582 bp (instead of the 1600 bp estimated from the BclI restriction pattern). It shows that between the tRNA-Leu and the COII genes, a 54-bp sequence (P) is followed by three exactly and tandemly repeated sequences (Q) of 196 nucleotides. The sequence published by CROZIER, CROZIER and MACKINLAY (1989) contains a corresponding BclI fragment of 1134 bp with an intergenic region of 194 bp which is highly similar to sequence Q (two substitutions, one addition and three deletions) but devoid of sequence P. The interpretation of the length polymorphism in this region is then straightforward: the VS BclI fragment corresponds to an intergenic sequence formed of one sequence Q, and the S, M and L fragments contain one sequence P followed by one, two or three sequence(s) Q, respectively. The genomes without sequence P belong to the P- type and the other ones to the P+ type. Among the 63 colonies, we found 19 P- (i.e., combination Q) and 44 P+ mtDNA types (eight PQ, 25 PQQ and 11 PQQQ combinations).
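The correspondence between repeat combinations and fragment sizes can be checked with simple arithmetic. The Python sketch below uses only figures quoted in the text (P = 54 bp, Q = 196 bp, and the 1582-bp L fragment carrying PQQQ). The names are illustrative, and the calculation assumes that every copy of Q is exactly 196 bp, which ignores small insertions/deletions.

```python
# Unit lengths quoted in the text (bp).
UNIT_LENGTH = {"P": 54, "Q": 196}

# Sequenced BclI fragment of the L type, which carries PQQQ.
L_FRAGMENT = 1582

# DNA shared by all fragments (end of COI, tRNA-Leu, start of COII),
# obtained by subtracting the PQQQ intergenic region from the L fragment.
FLANK = L_FRAGMENT - (UNIT_LENGTH["P"] + 3 * UNIT_LENGTH["Q"])


def intergenic_length(combo: str) -> int:
    """Length of an intergenic region written as a string of P and Q units."""
    return sum(UNIT_LENGTH[unit] for unit in combo)


def fragment_length(combo: str) -> int:
    """Predicted BclI fragment length for a given combination."""
    return FLANK + intergenic_length(combo)


if __name__ == "__main__":
    for name, combo in [("VS", "Q"), ("S", "PQ"), ("M", "PQQ"), ("L", "PQQQ")]:
        print(name, combo, intergenic_length(combo), fragment_length(combo))
```

Under these assumptions the intergenic regions are 196, 250, 446 and 642 bp and the predicted fragments are 1136, 1190, 1386 and 1582 bp, in line with the gel estimates of ca. 1150, 1200, 1400 and 1600 bp. With the 194-bp variant of Q, the same flanking length gives 1134 bp, the size of the published fragment.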

The distribution of these categories in A. mellifera is in agreement with the mtDNA types defined by their restriction map: two related groups of subspecies (west Mediterranean races and African races) possess the P sequence, whereas it is absent from a third group (ligustica, carnica and caucasica).

FIGURE 2.-Nucleotide sequence of the cloned fragment of honeybee mtDNA (sequence listing not recoverable from the source): 1-555: 3' end of the COI gene and its translation; 551-620: tRNA-Leu (UUR) gene; 621-674: intergenic sequence: motif P; 675-1261: intergenic sequence: 3 direct repeats Q (Q1+Q2+Q3); 1262-1582: 5' end of the COII gene and its translation. The underlined sequences are homologous motifs that are suggested to have facilitated tandem duplications or deletions in the past evolution of this sequence (three underline styles distinguish scenario 1, step 2; scenario 2, step 2; and scenario 2, step 3).

In one colony classified in category M (combination PQQ) through its BclI restriction pattern, the presence of an additional fragment, whose size corresponds to the combination PQ, has been detected after amplification by PCR. This heteroplasmy, revealed by PCR, has been confirmed by reexamination of the original restriction profiles. Because fewer than 12 of the 63 colonies have been subjected to PCR, other such occurrences of heteroplasmy may have remained undetected.

Similarity of base content and stability profile with the control region of Drosophila: The A + T content of this honeybee intergenic region is very high. Sequence P is 100% A + T and sequence Q is 93.4% A + T. In the four combinations (Q, PQ, PQQ and PQQQ), this percentage varies between 93.4% and 94.8%. In comparison, the A + T-rich region of Drosophila has an A + T content of 92.8% (CLARY and WOLSTENHOLME 1985).

FIGURE 3.-Stability profiles of the cloned fragment of honeybee mtDNA and the A + T-rich regions of D. yakuba and D. virilis. The base lines under each profile indicate the nucleotide sequence with the functional units (tRNAs are indicated by the usual three-letter code) and the length units (1 tick every 100 bp). Above the profile is the presumed extent of the four domains a, b, c and d (see text). The ordinate is the value of stability (parameter p). The environmental parameter (W) has been fixed at 3 for the three species.
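The 93.4% to 94.8% range quoted for the four combinations follows from the composition of the two units. A minimal Python sketch of that check is given below. It assumes that Q contains 183 A + T bases out of 196, a count inferred here by rounding the quoted 93.4% rather than read from the sequence, and the function name is illustrative.

```python
# P is 54 bp and 100% A + T.
P_LEN, P_AT = 54, 54

# Q is 196 bp and 93.4% A + T; the base count is inferred by rounding.
Q_LEN = 196
Q_AT = round(0.934 * Q_LEN)  # assumed to be 183 A + T bases


def at_percent(combo: str) -> float:
    """A + T percentage of a combination such as 'Q', 'PQ', 'PQQ' or 'PQQQ'."""
    n_p = combo.count("P")
    n_q = combo.count("Q")
    at_bases = n_p * P_AT + n_q * Q_AT
    total = n_p * P_LEN + n_q * Q_LEN
    return 100.0 * at_bases / total


if __name__ == "__main__":
    for combo in ("Q", "PQ", "PQQ", "PQQQ"):
        print(combo, round(at_percent(combo), 1))
```

With these inputs the lowest value is obtained for Q alone and the highest for PQ, since adding further copies of Q dilutes the contribution of the all-A + T sequence P.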

Figure 3 shows stability profiles deduced from the sequence of the A. mellifera BclI mtDNA fragment and from the A + T-rich region of two distantly related Drosophila species (D. yakuba and D. virilis; CLARY and WOLSTENHOLME 1985, 1987). Four domains have been recognized by MIGNOTTE et al. (1987) in five vertebrate species (rat, mouse, human, ox and Xenopus). Figure 3 shows that both Drosophila species exhibit a similar stability profile in their A + T-rich region. Most characteristic is the peak of stability at the limit between domains a and b. In vertebrates, a similar peak exists: it corresponds to conserved sequence block-2 (CSB-2), whereas, in Drosophila, it is produced by a GGGG motif in both species. In A. mellifera, a similar pattern shows up: the motif CCCAC (or CCCCAC: see nucleotides 5-10 of sequence Q1 in Figure 6) produces a peak protruding from a low stability area. Two other domains, c (moderately stable) and d (unstable), similar to those of MIGNOTTE et al.'s study, can also be identified in the Drosophila A + T-rich region. In A. mellifera, corresponding domains would encompass the tRNA-Leu gene and 200 bp of the 3' end of the COI gene.

Relationship between the intergenic sequence and adjacent sequences: Southern transfers of restricted nuclear and mitochondrial DNA have been incubated at low stringency in the presence of the labeled BclI fragment DNA used as a probe. The probe hybridized only to the mtDNA fragments encompassing the COI-COII region, thus indicating the absence of similarity with other mitochondrial or nuclear sequences. Also, computer analysis shows no significant sequence similarity between the honeybee intergenic sequence and the total mtDNA sequence of D. yakuba.

FIGURE 4 (alignments not recoverable from the source; panel labels only): COI (3' end) aligned with Q1; tRNA-Leu aligned with Q2; P aligned with Q3.

Looking for sequence similarities within the region, we first found 100% identity between the 29 nucleotides of the 3' ends of sequences P and Q. Actually, the 3'-most third of sequence Q is nearly identical to P (see Figure 4). Further upstream of these two sequences, we still find a high similarity between the central part of sequence Q and the tRNA-Leu gene and between the 5' end of sequence Q and the 3' end of the COI gene. Accordingly, sequence Q has been divided into three parts: Q1, Q2 and Q3 (similar to the 3' end of COI, the tRNA-Leu gene and sequence P, respectively).

This similarity is further revealed by the existence of very similar structures between Q2 and the tRNA-Leu gene and between P and Q3 (Figure 5). Q2 and the tRNA-Leu gene have the same nucleotide sequence for the aminoacyl arm, the anticodon stem and the TψCG stem (16 nucleotide pairs out of 17 are identical). However, the DHU arm and the three loops have different sequences. The DHU loop of Q2 contains more nucleotides than does a usual tRNA gene.

Sequences P and Q3 can be folded into hairpins with one internal loop where the major differences between both sequences lie (on the 5' side). A part of the Q1 domain can be folded into a hairpin with a 9-bp stem. Although with a different sequence, a hairpin can be found in the 3' end of the COI gene (see Figure 9).

Nucleotide variation in sequence Q: The comparison between our nucleotide sequence and the one obtained by CROZIER, CROZIER and MACKINLAY (1989) shows that the level of nucleotide substitution is about the same for genic regions (10 substitutions out of 867 bp) as for the Q sequence (2 substitutions out of 193 bp). More substitution sites have been located by determining the sequence of Q for a few subspecies of Apis mellifera (Figure 6), but the number of nucleotide differences between any pair of subspecies never exceeds four. By comparison with the nucleotide divergence of total mtDNA estimated through restriction mapping (around 2% between races) (SMITH and BROWN 1988), it cannot be concluded that the nucleotide sequence of Q has evolved at a rate different from that of total mtDNA.
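The proportions behind this comparison are easy to make explicit. The short Python sketch below converts the counts quoted in the text into percentages; the function name is illustrative, and taking 196 bp as the length of Q for the "never exceeds four" bound is an assumption based on the unit length given earlier.

```python
def percent_different(differences: int, sites: int) -> float:
    """Percentage of sites that differ between two sequences."""
    return 100.0 * differences / sites


# Counts quoted in the text for the comparison with the published sequence.
genic = percent_different(10, 867)   # genic regions
q_seq = percent_different(2, 193)    # sequence Q

# Upper bound between subspecies: at most four differences over Q,
# assuming Q is 196 bp long.
q_max = percent_different(4, 196)

if __name__ == "__main__":
    print(round(genic, 2), round(q_seq, 2), round(q_max, 2))
```

The two comparisons with the published sequence both come out at slightly over 1%, and the maximum observed between subspecies is about 2%, the same order as the restriction-map estimate for total mtDNA.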

The substitutions and the insertions/deletions are all located in unpaired sites of the putative secondary structure. There is one exception: a substitution in a helix has been "compensated" by a deletion (both noted by asterisks in Figure 6). Actually, the structure is slightly changed since the stem of the helix contains one more nucleotide pair. Such observations strongly support the putative secondary structure of sequence Q.

There is only one substitution found in sequence Q2, located at the site corresponding to the first position of what would correspond to the anticodon. In addition to the abnormal DHU arm, which probably prevents the transcript of Q2 from being a functional tRNA, this substitution is another indication of its nonfunctionality. Also, the various insertions/deletions do not allow the sequence Q to code for a polypeptide.

Intergenic region in related species: In the other three analyzed Apis species, an intergenic region exists but with a comparatively reduced length: 24 bp in A. dorsata, 32 bp in A. florea and 89 bp in A. cerana (the closest relative of A. mellifera). These regions are also A + T rich: a single C is present in A. florea and in A. cerana. Intergenic sequences of A. dorsata and A. florea can be folded into hairpins with stems strictly homologous to the basic part of the stem of sequences P/Q3 (Figure 7). The intergenic sequence of A. cerana can be folded into a tRNA-like structure with a large DHU loop (17 nucleotides) and a long TψCG stem. Eleven and eight nucleotides, at the 5' and 3' ends respectively, are identical with sequences P/Q3. Excluding insertions/deletions, the overall similarities between the intergenic region of A. cerana and Q2 and Q3 are 77.2 and 74.3%, respectively. In Bombus lucorum and Xylocopa violacea, there is not a single nucleotide between the tRNA-Leu and COII genes.

FIGURE 5.-Putative secondary structures inferred from sequences P and Q. The secondary structure of the tRNA-Leu gene is given for comparison.

DISCUSSION

In honeybee mtDNA a large, unassigned, noncoding region composed of an optional P sequence (54 bp long) and a varying number (one to three) of Q sequences of ca. 195 bp lies between the tRNA-Leu and COII genes. In D. yakuba, the closest relative of honeybees for which the nucleotide sequence is available, the order of genes is identical in this region, i.e., COI, tRNA-Leu, COII. But, between the tRNA-Leu and COII genes, there are only five nucleotides. What might be the potential origin and function of this intergenic sequence?

Local origin of the sequences P and Q: The sequence similarities described above (Figure 4) are in favor of the hypothesis that these sequences arose by tandem duplication. Different scenarios are possible depending on whether the first event led to a P- or a P+ type. We propose an example for each case, the first one leading first to a P+ type (combination PQ) and then to a P- type (combination Q), and the second one leading first to a P- type (combination Q) and then to a P+ type (combination PQ). In both cases, we assume that in the ancestor sequence, between the tRNA-Leu gene and the COII gene, there was a short sequence similar to P or Q3, which will be referred to here as the ancestral intergenic sequence (AIS).

Scenario 1 (Figure 8a): In a first step, combination PQ appeared by tandem duplication of a segment encompassing the 3' end of COI, the tRNA-Leu gene and the AIS. In a second step, sequence P was deleted.

Scenario 2 (Figure 8b): Three steps are necessary. In step 1, only the <3' end of COI + tRNA-Leu gene> segment was tandemly duplicated, leading to combination Q; in step 2, sequence Q was tandemly duplicated; and in step 3, the <Q1 + Q2> section of the 5' repeat of Q was deleted.
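The two scenarios can be traced step by step with a purely symbolic model. In the Python sketch below the region is a list of named segments and each event is a list operation; the segment labels, the helper functions and the rule used to read off P and Q are illustrative assumptions, and no real sequence data are involved. A free-standing AIS is read as P, and a copied <3' COI + tRNA-Leu> pair followed by an AIS is read as Q (Q1 + Q2 + Q3).

```python
from typing import List, Tuple

# Ancestral arrangement: 3' end of COI, tRNA-Leu gene, AIS, 5' end of COII.
ANCESTOR = ["COI3", "tRNA", "AIS", "COII5"]


def duplicate(region: List[str], start: int, end: int) -> List[str]:
    """Tandemly duplicate region[start:end], placing the copy right after it."""
    return region[:end] + region[start:end] + region[end:]


def delete(region: List[str], start: int, end: int) -> List[str]:
    """Delete region[start:end]."""
    return region[:start] + region[end:]


def combination(region: List[str]) -> str:
    """Read the intergenic part (after the genuine COI3 and tRNA) as P/Q units."""
    intergenic = region[2:-1]
    units = ""
    i = 0
    while i < len(intergenic):
        if intergenic[i:i + 3] == ["COI3", "tRNA", "AIS"]:
            units += "Q"   # Q1 + Q2 + Q3
            i += 3
        elif intergenic[i] == "AIS":
            units += "P"   # free-standing copy of the AIS
            i += 1
        else:
            raise ValueError("segment order does not fit the P/Q model")
    return units


def scenario_1() -> Tuple[List[str], List[str]]:
    step1 = duplicate(ANCESTOR, 0, 3)   # duplicate COI3 + tRNA + AIS -> PQ
    step2 = delete(step1, 2, 3)         # delete P -> Q
    return step1, step2


def scenario_2() -> Tuple[List[str], List[str], List[str]]:
    step1 = duplicate(ANCESTOR, 0, 2)   # duplicate COI3 + tRNA only -> Q
    step2 = duplicate(step1, 2, 5)      # duplicate Q -> QQ
    step3 = delete(step2, 2, 4)         # delete Q1 + Q2 of the 5' repeat -> PQ
    return step1, step2, step3


if __name__ == "__main__":
    print([combination(s) for s in scenario_1()])
    print([combination(s) for s in scenario_2()])
```

Tracing the lists shows that scenario 1 passes through PQ and ends at Q, whereas scenario 2 passes through Q and QQ and ends at PQ, so both combinations are reachable from the same ancestor by either route.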

Two different mechanisms can be invoked for the origin of these mutational events. The first one is an erroneous jump of the replication enzymatic complex promoted by favorable spatial structures. Figure 9 illustrates this possible phenomenon which could ex- plain the first step of both scenarios. All other events can be explained by a second mechanism, slipped-

Honeybee mtDNA 399

FIGURE 6.-Nucleotide sequence of Q (Q1 + Q2 + Q3) in subspecies of A. mellifera (mellifera 1, carnica 1, carnica 2, caucasica 1, caucasica 2, capensis 1 and ligustica 1). The two asterisks indicate one substitution site and one addition/deletion site with conservation of the secondary structure of the putative hairpin (see Figure 4 and text).

FIGURE 7.-Putative secondary structures of intergenic sequences of A. dorsata, A. florea and A. cerana. Sequences P and Q3 of A. mellifera are shown for comparison. Boxes indicate identical homologous sequences.

strand mispairing (SSM) (see LEVINSON and GUTMAN 1987). Indeed, SSM is favored by several factors, all of which are present here: low stability of duplex DNA, secondary structures that can bring the sequences to be mispaired closer together, and perfect repeats at the boundaries of the sequences to be duplicated or deleted (see Figure 2).
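The role of perfect repeats at sequence boundaries can be illustrated with a small search for direct repeats. This is a minimal sketch, not the procedure used in this study: the function name `direct_repeats` and the toy sequence are hypothetical, and only exact repeats of a fixed length are reported.

```python
def direct_repeats(seq: str, k: int) -> dict[str, list[int]]:
    """Map each k-mer occurring more than once to its 0-based start positions."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    # Keep only k-mers seen at two or more positions (direct repeats).
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy sequence (hypothetical, not taken from the honeybee data).
toy = "ATTTACGATTTA"
repeats = direct_repeats(toy, 4)  # {'ATTT': [0, 7], 'TTTA': [1, 8]}
```

In an A + T-rich region short repeats are expected by chance, so a realistic search would use a long word size, such as the 34-bp repeat mentioned in the text.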

Since the P and Q3 sequences are very similar, SSM is also a good candidate for producing the variability in the number of repeats of sequence Q in type P+. However, although the above events (appearance of the first sequences P and Q) need to occur only once, the variable number of sequences Q in P+ type genomes implies some recurrent phenomenon. This can be explained by the large direct repeat of 34 bp located at the P-Q and Q-COI boundaries (underlined sequences in Figure 2).
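The relation between the number of Q copies and the observed length categories can be checked with simple arithmetic. The sketch below uses the unit lengths given in the abstract (P = 54 bp, Q = 196 bp); the names `region_length` and `COMBINATIONS` are illustrative only.

```python
# Unit lengths reported in the abstract of this paper.
P_LEN = 54   # bp, sequence P
Q_LEN = 196  # bp, sequence Q (Q1 + Q2 + Q3)


def region_length(has_p: bool, n_q: int) -> int:
    """Length in bp of a region made of an optional P and n_q tandem copies of Q."""
    if n_q < 0:
        raise ValueError("n_q must be non-negative")
    return (P_LEN if has_p else 0) + n_q * Q_LEN


# The four combinations named in the abstract: (P present?, number of Q copies).
COMBINATIONS = {"Q": (False, 1), "PQ": (True, 1), "PQQ": (True, 2), "PQQQ": (True, 3)}
LENGTHS = {name: region_length(*spec) for name, spec in COMBINATIONS.items()}
# LENGTHS == {'Q': 196, 'PQ': 250, 'PQQ': 446, 'PQQQ': 642}
```

These values agree with the length categories of about 200, 250, 450 and 650 bp quoted in the abstract.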

FIGURE 8.-Putative origin of sequences P and Q. AIS is the ancestor sequence for P and Q3. 3'-COI corresponds to the 3' end of the COI gene and 5'-COII to the 5' end of the COII gene. A, scenario 1: from the initial state (3'-COI, tRNALeu, AIS, 5'-COII), a duplication (step 1) followed by a deletion (step 2) yields 3'-COI, tRNALeu, P, Q1, Q2, Q3, 5'-COII. B, scenario 2: three steps from the same initial state, ending with a deletion (step 3).

To discuss the respective likelihood of the two scenarios, one can consider at least three elements: the likelihood of each step, the number of steps and the information provided by the comparison of the levels of similarity between related sequences. The first two criteria give an advantage to the first scenario, since it implies the same kind of events but one step fewer than the second scenario. Also relevant is the level of nucleotide divergence between the <3' end of COI + tRNALeu gene> and <Q1 + Q2> segments and between P and Q3. The higher similarity between P and Q3 could be taken as an argument in favor of the second scenario, in which they diverge only after the second step. However, this can alternatively be explained by heavier constraints related to their putative function (see below).

The intergenic sequences of other Apis species provide some insight about the origin of the AIS and its evolution along the various Apis lineages. The rather high similarity (primary and secondary structures) between the intergenic sequence of A. cerana and Q2 in A. mellifera, the latter most probably deriving from the adjacent tRNALeu gene, prompts us to hypothesize that the AIS originated from a previous duplication of the same gene. In A. florea and A. dorsata, this duplicated tRNA would have evolved toward a simplified sequence retaining only its stem in the form of a hairpin-like structure, whereas in the cerana-mellifera branch, this sequence would have conserved and elongated its cloverleaf structure. In A. mellifera only, the region would have subsequently undergone additional duplication events.

It is generally accepted that polycistronic mtDNA transcripts are processed by RNase P, which excises the tRNAs, leaving mRNAs which will be polyadenylated and translated (CLAYTON 1984; TZAGOLOFF and MYERS 1986). All these events, involving tRNA-like structures, should not have interfered with the correct processing of the local mRNA (if any) encompassing this region.

An origin of replication in sequences P and Q? Animal mtDNAs contain one or two regions which do not code for protein or RNA genes. In vertebrates, there are two such regions: a large one, the so-called "D-loop" region, and a small one, located between two tRNAs, two thirds of the mtDNA molecule away. In Drosophila, there is only one, called the "A + T-rich region." Its length varies between 1 and 4 kb between species and to a lesser extent within species. In sea urchins, this region is very short [121 bp in Strongylocentrotus purpuratus, JACOBS et al. (1988), and 135 bp in Paracentrotus lividus, CANTATORE et al. (1988)]. In all instances, these regions have been inferred [sea urchins, CANTATORE et al. (1988)] or demonstrated [mouse, CHANG, HAUSWIRTH and CLAYTON (1985); Drosophila, GODDARD and WOLSTENHOLME (1978, 1980), CLARY and WOLSTENHOLME (1987)] to contain at least one origin of replication.

FIGURE 9.-Secondary structure of the putative ancestor sequence. It encompasses the 3' end of the COI gene, the tRNALeu gene, sequence AIS and the 5' end of the COII gene. The arrows indicate the putative jumps of the polymerase inferred for the first step of scenario 1 (black arrow) or 2 (white arrow). This secondary structure is based on the current sequence of COI, tRNALeu, Q3 and COII. The choice of Q3 instead of P for the AIS is justified only by the observation that the tRNAs at the 5' end of polypeptide genes are more constrained than those located at their 3' end because of their role in the processing of mRNA (THOMAS, MAA and WILSON 1989), a role that can be logically extended to the AIS sequence.

Another feature of the intergenic region of honeybees is its high level of length polymorphism and its capability to produce heteroplasmy. HARRISON (1989) gave a list of species exhibiting mtDNA length variation and/or heteroplasmy: in most of them, the variation lies in the control region. Additional examples have recently been described: rabbit (ENNAFAA et al. 1987) and Alosa sapidissima (BENTZEN, LEGGETT and BROWN 1988).

It is then logical to examine the hypothesis that an origin of replication is included in the intergenic COI-COII region of honeybees.

In the vertebrate species for which the nucleotide sequence of the control region is known, no extended sequence homologies are found. Between two Drosophila species, D. yakuba and D. virilis, over 1 kb of the A + T-rich region, only about the first half is similar enough to allow alignment (CLARY and WOLSTENHOLME 1987; MONNEROT, SOLIGNAC and WOLSTENHOLME 1990). Between two different genera of sea urchin (JACOBS et al. 1988; CANTATORE et al. 1988), only short motifs (7-10 bp) are conserved. It is then clear that no primary structures (nucleotide sequences) are conserved whenever unrelated groups are compared. In spite of this remark and only for the record, it should be mentioned that the motif AATTTCCCCAC, which is one of the two variants found at the 5' limit of sequence Q (Figure 6), is found in a 47-bp repeated sequence at the 3' end of the D-loop of Xenopus (MIGNOTTE et al. 1987).

Since the primary structure is highly variable, authors have looked for other, more conserved, properties. MIGNOTTE et al. (1987) have shown that the D-loops of five vertebrate species are characterized by a rather conserved DNA stability profile. Interestingly, the A + T-rich regions of two Drosophila species (Figure 3) provide patterns which are similar to each other, in spite of a low sequence similarity. In the COI-COII region of honeybee, a similar pattern is visible in intergenic and upstream sequences. A priori, such participation of coding sequences in a control region-like stability profile cannot be excluded if there is no negative interference with the functionality of the gene product. It is worth noticing that the last 11 codons of the COI gene of Apis do not exist in Drosophila and that most of the last 60 codons differ between the two species. This means that functional constraints on the corresponding part of the COI subunit are not strong, allowing other kinds of constraints to operate. In our view, the narrow peak of stability due to a GC-rich motif embedded in an A + T-rich area is the most characteristic signal found in this study. MIGNOTTE et al. (1987) note that, as has been found in E. coli, abrupt changes of stability might have a function in replication, like the switchover from RNA to DNA synthesis.

Figure 3 shows another aspect that may be considered trivial: A + T-rich regions are characterized by the comparatively low stability of their duplex DNA. This is consistent with the necessity for DNA to unwind for initiating replication (and/or transcription). The free energy required for this transition is consequently reduced (KOWALSKI, NATALE and EDDY 1988). First, the intergenic region in honeybee has an overall lower stability than the adjacent coding sequences. Second, the elongation by tandem duplication of sequence Q is a way to increase the length of the unstable DNA bubble (MICHEL 1982).
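The idea of a local stability signal, such as a GC-rich motif standing out within an A + T-rich area, can be approximated with a sliding-window base composition profile. The sketch below is only a crude proxy and is not the stability computation behind Figure 3; the function names, the window size and the toy sequence are assumptions made for illustration.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a non-empty sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def at_profile(seq: str, window: int, step: int = 1) -> list[float]:
    """A + T fraction in successive windows of fixed size along the sequence."""
    if window <= 0 or window > len(seq) or step <= 0:
        raise ValueError("window must fit in the sequence and step must be positive")
    return [at_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Toy sequence (hypothetical): a GC-rich motif embedded in an A-rich stretch.
toy = "AAAAGGGGAAAA"
profile = at_profile(toy, window=4)
# The minimum of the profile marks the window centered on the GC-rich motif.
most_stable_window = profile.index(min(profile))
```

A low A + T fraction corresponds here to a more stable window, so the GC-rich motif appears as a narrow dip in the profile, the counterpart of the narrow stability peak described in the text.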

Another feature of origins of replication is the occurrence of secondary structures. SACCONE, ATTIMONELLI and SBISA (1985) and BROWN et al. (1986) have noticed that in all known vertebrates, there is a cloverleaf structure coupled to a hairpin in the vicinity of the 5' end of the D-loop. Also, at the origin of replication of the light strand of vertebrates, single-strand DNA can be folded in a very conserved hairpin structure. On both sides of this sequence are the cloverleaf structures of two tRNAs. Along the same lines, the unassigned sequence in the two known sea urchins, which is located among a cluster of tRNAs, can be folded in a hairpin structure. It seems then that the combination of a hairpin and a cloverleaf structure could be a property of origins of replication in animal mtDNA. Sequence Q, which contains a (small) hairpin (in Q1), a cloverleaf (Q2) and a half hairpin/half cloverleaf structure (Q3), possesses all the necessary structures to be a good candidate for that function. If we assume that the region functions as an origin of replication, this would be consistent with the fact that the contiguous sequence, Q2, has kept its necessary tRNA-like structure and hence that its stems and arms have undergone far fewer changes than its loops.
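Candidate hairpins of the kind discussed above can be screened for by looking for short inverted repeats separated by a loop. The following is a minimal sketch under simplifying assumptions (perfect Watson-Crick stems of fixed length, no bulges, no energy model); it is not the folding method used for the figures, and `find_hairpins` and its default parameters are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 4, min_loop: int = 3,
                  max_loop: int = 8) -> list[tuple[int, int, int]]:
    """Return (start, stem_length, loop_length) for each perfect stem-loop candidate."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        # The 3' arm must be the reverse complement of the 5' arm.
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, stem, loop))
    return hits


# Toy sequence (hypothetical): GGGC pairs with GCCC around an AAAA loop.
hits = find_hairpins("GGGCAAAAGCCC")  # [(0, 4, 4)]
```

In a sequence that is more than 90% A + T, short A-T stems arise easily by chance, so hits from such a screen are only candidates and would need longer stems or a thermodynamic model to be convincing.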

An additional argument for our hypothesis comes from the computations of JUKES and BHUSHAN (1986), who found a close linear relationship (r = 0.99) between the GC content at silent sites of genes and the GC content in control regions in mtDNA of various animals. Our data, extended with those of CROZIER, CROZIER and MACKINLAY (1989), indicate that, in honeybees, the GC content is 6.0% for silent sites and between 5.2% and 6.6% for the intergenic sequence (depending on its composition). These data come very close to those of D. yakuba (respectively, 5.6 and 7.2%) and are thus in close agreement with the established relationship.
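The two quantities compared above can be sketched as follows. This is an illustration only: third codon positions are used here as a rough stand-in for silent sites, which is an assumption and not the definition used by JUKES and BHUSHAN (1986), and the function names and toy coding sequence are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def third_position_gc(cds: str) -> float:
    """GC percentage at third codon positions of an in-frame coding sequence."""
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    # Positions 2, 5, 8, ... (0-based) are the third bases of successive codons.
    return gc_percent(cds[2::3])


# Toy coding sequence (hypothetical): codons ATG GCC AAA TTC.
toy_cds = "ATGGCCAAATTC"
silent_proxy = third_position_gc(toy_cds)  # third bases are G, C, A, C
```

With real data, the value for the coding genes would be set against `gc_percent` of the intergenic sequence, which is the comparison reported in the text.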

In conclusion, the length variation in the COI-COII region in A. mellifera has received a simple explanation, and all the sequences involved have been putatively traced back to their origin with the help of other Apis species. This intergenic region is large and has a very high A + T content, a stability profile, a secondary structure and a base composition that are all characteristic of control regions. These numerous arguments concur to provide a high likelihood to our hypothesis that this region contains an origin of replication. The situation is different in other Apis species and does not allow us to hypothesize when this putative function could have arisen.

We wish to express our thanks to NICOLE DENNEBOUY for her kind help in the cloning part of the experiments. We also thank FRANÇOISE MIGNOTTE for useful discussion and MONIQUE MONNEROT, FRANÇOIS MICHEL and JEAN-CLAUDE MOUNOLOU for critically reading earlier versions of the manuscript.

LITERATURE CITED

ATTARDI, G., 1985 Animal mitochondrial DNA: an extreme example of genetic economy. Int. Rev. Cytol. 93: 93-145.

BENTZEN, P., W. C. LEGGETT and G. G. BROWN, 1988 Length and restriction site heteroplasmy in the mitochondrial DNA of American shad (Alosa sapidissima). Genetics 118: 509-518.

BOYCE, T. M., M. E. ZWICK and C. AQUADRO, 1989 Mitochondrial DNA in the bark weevils: size, structure and heteroplasmy. Genetics 123: 825-836.

BROWN, W. M., 1985 The mitochondrial genome of animals, pp. 95-130 in Molecular Evolutionary Genetics, edited by R. J. MACINTYRE. Plenum Press, New York.

BROWN, G. G., G. GADALETA, G. PEPE, C. SACCONE and E. SBISA, 1986 Structural conservation and variation in the D-loop-containing region of vertebrate mitochondrial DNA. J. Mol. Biol. 192: 503-511.

BUROKER, N. E., J. R. BROWN, T. A. GILBERT, P. J. O’HARA, A. T. BECKENBACH, W. K. THOMAS and M. J. SMITH, 1990 Length heteroplasmy of sturgeon mitochondrial DNA: an illegitimate elongation model. Genetics 124: 157-163.

CANTATORE, P., M. ROBERTI, G. RAINALDI, C. SACCONE and M. N. GADALETA, 1988 Clustering of tRNA genes in Paracentrotus lividus mitochondrial DNA. Curr. Genet. 13: 91-96.

CHANG, D. D., W. W. HAUSWIRTH and D. A. CLAYTON, 1985 Replication priming and transcription initiate from precisely the same site in mouse mitochondrial DNA. EMBO J. 4: 1559-1567.

CLARY, D. O., and D. R. WOLSTENHOLME, 1985 The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organisation, and genetic code. J. Mol. Evol. 22: 252-271.

CLARY, D. O., and D. R. WOLSTENHOLME, 1987 Drosophila mitochondrial DNA: conserved sequence in the A + T-rich region and supporting evidence for a secondary structure model of the small ribosomal RNA. J. Mol. Evol. 25: 116-125.

CLAYTON, D. A., 1984 Transcription of the mammalian mitochondrial genome. Annu. Rev. Biochem. 53: 573-594.

CROZIER, R. H., Y. C. CROZIER and A. G. MACKINLAY, 1989 The CO-I and CO-II region of honeybee mitochondrial DNA: evidence for variation in insect mitochondrial evolutionary rates. Mol. Biol. Evol. 6: 399-411.

DENSMORE, L. D., J. W. WRIGHT and W. M. BROWN, 1985 Length variation and heteroplasmy are frequent in mitochondrial DNA from parthenogenetic and bisexual lizards (genus Cnemidophorus). Genetics 110: 689-707.

DESSEN, P., C. FONDRAT, C. VALENCIEN and C. MUGNIER, 1990 B.I.S.A.N.C.E.: a French service for access to biomolecular sequence data bases. CABIOS 6: 355-356.

ENNAFAA, H., M. MONNEROT, A. EL GAAIED and J. C. MOUNOLOU, 1987 Rabbit mitochondrial DNA: preliminary comparison between some domestic and wild animals. Genet. Sel. Evol. 19: 279-288.

GABARRO-ARPA, J., and F. MICHEL, 1982 The hierarchical approach to the DNA stability problem. I. Patterns in non-equilibrium denaturation and renaturation. Biochimie 64: 99-112.

GODDARD, J. M., and D. R. WOLSTENHOLME, 1978 Origin and direction of replication in mitochondrial DNA molecules from Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 75: 3886-3890.

GODDARD, J. M., and D. R. WOLSTENHOLME, 1980 Origin and direction of replication in mitochondrial DNA molecules from the genus Drosophila. Nucleic Acids Res. 8: 741-757.

GYLLENSTEN, U. B., and H. A. ERLICH, 1988 Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQA locus. Proc. Natl. Acad. Sci. USA 85: 7652-7656.

HARRISON, R. G., 1989 Animal mitochondrial DNA as a genetic marker in population and evolutionary biology. Trends Ecol. Evol. 4: 6-11.

HIGUCHI, R. G., and H. OCHMAN, 1989 Production of single-stranded DNA templates by exonuclease digestion following the polymerase chain reaction. Nucleic Acids Res. 17: 5865.

JACOBS, H. T., D. J. ELLIOTT, V. B. MATH and A. FARQUHARSON, 1988 Nucleotide sequence and gene organisation of sea urchin mitochondrial DNA. J. Mol. Biol. 202: 185-217.

JUKES, T. H., and V. BHUSHAN, 1986 Silent nucleotide substitutions and G + C content of some mitochondrial and bacterial genes. J. Mol. Evol. 24: 39-44.

KOCHER, T. D., W. K. THOMAS, A. MEYER, S. V. EDWARDS, S. PAABO, F. X. VILLABLANCA and A. C. WILSON, 1989 Dynamics of mitochondrial DNA evolution in animals: amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86: 6196-6200.

KOWALSKI, D., D. A. NATALE and M. J. EDDY, 1988 Stable DNA unwinding, not “breathing,” accounts for single-strand-specific nuclease hypersensitivity of specific A + T-rich sequences. Proc. Natl. Acad. Sci. USA 85: 9464-9468.

LA ROCHE, J., M. SNYDER, D. I. COOK, K. FULLER and E. ZOUROS, 1990 Molecular characterization of a repeat element causing large-scale size variation in the mitochondrial DNA of the sea scallop Placopecten magellanicus. Mol. Biol. Evol. 7: 45-64.

LEVINSON, G., and G. A. GUTMAN, 1987 Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4: 203-221.

MICHEL, F., 1982 Etudes expérimentales et théoriques des transitions thermiques de l'ADN et leur application à l'étude génétique de l'ADN mitochondrial de levure. Thèse d'Etat, Paris VI, 447 pp.

MICHEL, F., J. GABARRO-ARPA and B. DUJON, 1982 The hierarchical approach to DNA stability problem. II. Some applications and speculations with yeast mitochondrial DNA as an example. Biochimie 64: 113-126.

MIGNOTTE, B., D. DUNON-BLUTEAU, C. REIS and J. C. MOUNOLOU, 1987 Sequence deduced physical properties in the D-loop region common to five vertebrate mitochondrial DNAs. J. Theor. Biol. 124: 57-69.

MIGNOTTE, F., M. GUERIDE, A. M. CHAMPAGNE and J. C. MOUNOLOU, 1990 Direct repeats in the non-coding region of rabbit mitochondrial DNA: involvement in the generation of intra and inter-individual heterogeneity. Eur. J. Biochem. 194: 561-571.

MONNEROT, M., M. SOLIGNAC and D. R. WOLSTENHOLME, 1990 Discrepancy in divergence of the mitochondrial and nuclear genomes of Drosophila teissieri and Drosophila yakuba. J. Mol. Evol. 30: 500-508.

POWERS, T. O., E. G. PLATZER and B. C. HYMAN, 1986 Large mitochondrial genome and mitochondrial DNA size polymorphism in the mosquito parasite, Romanomermis culicivorax. Curr. Genet. 11: 71-77.

RAND, D. M., and R. G. HARRISON, 1989 Molecular population genetics of mtDNA size variation in crickets. Genetics 121: 551-569.

SACCONE, C., M. ATTIMONELLI and E. SBISA, 1985 Primary and higher order structural analysis of animal mitochondrial DNA, pp. 37-47 in Achievements and Perspectives of Mitochondrial Research, Vol 2: Biogenesis, edited by E. QUAGLIARIELLO et al. Elsevier, New York.

SANGER, F., S. NICKLEN and A. R. COULSON, 1977 DNA sequencing with chain terminator inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

SMITH, D. R., and W. M. BROWN, 1988 Polymorphisms in mitochondrial DNA of European and Africanized honeybees (Apis mellifera). Experientia 44: 257-260.

SMITH, D. R., and W. M. BROWN, 1990 Restriction endonuclease cleavage site and length polymorphisms in mitochondrial DNA of Apis mellifera mellifera and A. m. carnica (Hymenoptera Apidae). Ann. Entomol. Soc. Am. 83: 81-88.

SNYDER, M., R. FRASER, J. LAROCHE, E. GARTNER-KEPKAY and E. ZOUROS, 1987 Atypical mitochondrial DNA from deep-sea scallop Placopecten magellanicus. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

SOLIGNAC, M., and M. MONNEROT, 1986 Race formation, speciation, and introgression within Drosophila simulans, D. mauritiana, and D. sechellia inferred from mitochondrial DNA analysis. Evolution 40: 531-539.

SOLIGNAC, M., M. MONNEROT and J. C. MOUNOLOU, 1986 Concerted evolution of sequence repeats in Drosophila mitochondrial DNA. J. Mol. Evol. 24: 53-60.

THOMAS, W. K., J. MAA and A. C. WILSON, 1989 Shifting constraints on tRNA genes during mitochondrial DNA evolution in animals. New Biol. 1: 93-100.

TZAGOLOFF, A., and A. M. MYERS, 1986 Genetics of mitochondrial biogenesis. Annu. Rev. Biochem. 55: 249-285.

WALLIS, G. P., 1987 Mitochondrial DNA insertion polymorphism and germ line heteroplasmy in the Triturus cristatus complex. Heredity 58: 229-238.

Communicating editor: A. G. CLARK