A plant mitochondrial gene encodes a protein involved in cytochrome c biogenesis

Mol Gen Genet (1993) 239:49-57

© Springer-Verlag 1993

Wolfgang Schuster, Bruno Combettes*, Karin Flieger**, Axel Brennicke

Institut für Genbiologische Forschung, Ihnestrasse 63, W-1000 Berlin 33, Germany

Received: 26 October 1992 / Accepted: 17 December 1992

Abstract. Analysis of a transcribed region in the mitochondrial genome of Oenothera revealed an open reading frame (ORF) of 577 codons (orf577) that is also conserved in carrot, here encoding a protein of 579 amino acids (orf579). RNA editing alters the mRNA sequence of orf577 in Oenothera with 46 C to U transitions, many of which improve sequence similarity with the homologous Marchantia gene orf509. The deduced polypeptides show significant similarity with the ccl1-encoded protein involved in cytochrome c biogenesis in the photosynthetic bacterium Rhodobacter capsulatus. A highly conserved domain is also found in plastid ORFs, suggesting that these bacterial, chloroplast and mitochondrial genes encode polypeptides with analogous functions in assembly and maturation of cytochromes c.

Key words: Plant mitochondria - Cytochrome c assembly - Chloroplasts - ccl1

Introduction

Plant mitochondrial genomes range in size from about 200 kb up to 2500 kb, suggesting that they contain a larger number of genes than are encoded in mammalian or fungal mitochondrial (mt) DNAs (Ward et al. 1981; Leaver and Gray 1982). Although much of the genome in higher plants is taken up by integrated chloroplast and nuclear sequences, there is still ample space to encode numerous functional genes (Schuster and Brennicke 1987).

Communicated by R. Hagemann

* Present address: IBMP, 12 rue du General Zimmer, F-67084 Strasbourg, France
** Present address: Botanisches Institut, Universität München, Menzinger Strasse 67, W-8000 München 19, Germany

Correspondence to: W. Schuster

To identify new genes and functionally expressed regions in plant mitochondrial genomes we have investigated transcribed sequences with respect to potentially relevant coding regions. In our survey of the mitochondrial information content in the flowering plant Oenothera berteriana we have already identified several genes by analysing in detail transcribed regions of this organellar genome (Hiesel et al. 1987; Knoop et al. 1991; Wissinger et al. 1991). The presence of conserved homologues of such transcribed sequences, and open reading frames contained therein, in the mitochondrial genomes of other plants gives additional support to the identification of coding regions. As a further indicator of functional protein coding genes in plant mitochondria, RNA editing, which is mostly observed in open reading frames (Wissinger et al. 1992), can be invoked.

All these criteria are met by a new open reading frame (ORF) of 577 codons. This ORF is transcribed and frequently edited in Oenothera mitochondria and is conserved between higher (Oenothera and carrot) and lower plants (Marchantia). A domain near the carboxy-terminus of the deduced protein is also found in an open reading frame conserved in higher plant and algal chloroplasts. The significant similarity of these ORFs with bacterial genes essential for correct assembly of cytochrome c suggests a conserved analogous function for these genes in the two plant organelles.

Materials and methods

Isolation of mitochondria. Mitochondria of O. berteriana and Daucus carota were isolated from tissue cultures by differential centrifugation followed by Percoll gradient centrifugation as described previously (Schuster et al. 1989).

Cloning of mtDNA and cDNA. Purified mitochondria were digested with DNase I and, after several washing steps, lysed in 0.5% SDS, followed by CsCl/ethidium bromide gradient centrifugation. This highly purified mtDNA was cloned into pBR322 and pUC vectors by standard techniques (Sambrook et al. 1989). Transcribed mitochondrial sequences were identified by hybridization with random hexamer-primed cDNA reverse-transcribed from Oenothera mitochondrial RNA. The carrot clones were identified by hybridization with the Oenothera clone Pst2 (Fig. 1). Construction of the Oenothera mitochondrial cDNA library has been described in detail (Wissinger et al. 1991).

[Fig. 1 image: restriction site maps of the Oenothera (top) and Daucus (bottom) loci, with the positions of genomic clones including H1, H2, Pst2 and DcH 2/12]

Fig. 1. The open reading frame orf577 is conserved in the mitochondrial genomes of Oenothera and carrot. Two new transcribed regions (hatched bars) were identified in the mitochondrial genome of Oenothera (top part). The larger of these was also isolated from the mitochondrial genome of carrot and shows most of the restriction sites conserved (hatched box). A sequence rearrangement outside of this region results in two copies of the conserved reading frame in the carrot mitochondrial genome (symbolized by the branch). Genomic clones for Oenothera and carrot are shown above and below the respective restriction site maps. Restriction recognition sites are indicated for BamHI (B), ClaI (C), EcoRI (E), HindIII (H), PstI (P) and SalI (S)

Sequence analysis. Fragments of interest were subcloned into pUC and Bluescript (Stratagene) vectors for sequencing. Sequences were determined by a combination of the chain termination and chemical modification procedures (Sambrook et al. 1989). A dideoxy double-stranded sequencing procedure (Pharmacia) with T7 polymerase was used in sequence determination. Both strands of the presented genomic sequences were analyzed.

Southern hybridization. Total mtDNA was digested with restriction enzymes, fractionated by electrophoresis in agarose gels and transferred to Biodyne A membranes (Pall). DNA hybridization was carried out in standard buffers (Sambrook et al. 1989), and washes were performed at 60 °C in 0.1× SSC and 0.1% SDS. Filters were autoradiographed at -80 °C using X-ray film.

Northern hybridization. Total mtRNA was fractionated in 1.5% formaldehyde/agarose gels and transferred to nylon membrane using standard procedures (Sambrook et al. 1989). Hybridization and washes were carried out under standard conditions (Sambrook et al. 1989).

Polymerase chain reaction (PCR) amplification. Random hexamer-primed, reverse-transcribed mtRNA of Oenothera was used as template for the PCR reactions. PCR amplifications were done with the primers: A, 5'-GGATCCTGCCACCTTTCTTGTGG-3'; B, 5'-TCTCAGGGACATGGTCTAATCAT-3'; C, 5'-GGTACCTTTCTTTCCAT-3'; and D, 5'-GCTCTGAATAACCAGCCCAGCAA-3'. The reaction mixture contained 50 mM KCl, 1.5 mM MgCl2, 10 mM TRIS-HCl pH 8.3, 500 ng of each primer, 50 µmol of each dNTP, 10 ng cDNA and 5 units of Taq polymerase (Boehringer). PCR was performed on a Biomed cycler under the following conditions: cycle 1, 1 min at 94 °C; cycle 2, 1 min at 45 °C; and cycle 3, 3 min at 72 °C. All cycles were repeated 30 times and an extension of 10 min at 72 °C was added at the end. After the PCR reaction, an aliquot of the product was analyzed in an agarose gel. To generate blunt ends, 5 units T4 DNA polymerase was added for 10 min at 25 °C. The resulting cDNA fragments were cloned into the EcoRV site of Bluescript (Stratagene).
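The orientation of a primer relative to the genomic sequence can be checked by searching for the primer and for its reverse complement. The following is a minimal sketch, not part of the original methods: the helper names are hypothetical, and the short template is the stretch that follows the orf577 stop codon as transcribed from the Fig. 4 sequence, assuming that stretch is free of extraction errors.

```python
# Sketch: locate a primer on either strand of a template (hypothetical helpers).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_site(primer: str, template: str):
    """Return ('+', index) for a sense match, ('-', index) for an antisense
    match (index refers to the template's top strand), or None if absent."""
    primer, template = primer.upper(), template.upper()
    pos = template.find(primer)
    if pos != -1:
        return ("+", pos)
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return ("-", pos)
    return None


# Primer D as listed in the PCR paragraph.
PRIMER_D = "GCTCTGAATAACCAGCCCAGCAA"

# Assumption: stop codon (TGA) of orf577 plus the following 28 nucleotides,
# copied from the Oenothera line of Fig. 4.
DOWNSTREAM_FLANK = "TGAACAACTTGCTGGGCTGGTTATTCAGAGC"

if __name__ == "__main__":
    print(reverse_complement(PRIMER_D))
    print(primer_site(PRIMER_D, DOWNSTREAM_FLANK))
```

Under these assumptions primer D anneals as an antisense primer a few nucleotides downstream of the stop codon, which is consistent with its use as the downstream primer in the A/D and B/D pairs.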

Results

A transcribed region of the Oenothera mitochondrial genome is conserved in carrot

Hybridization with directly labelled total mitochondrial RNA and with first-strand, labelled cDNA identified a transcribed region of the Oenothera mitochondrial genome that was isolated in overlapping PstI, HindIII and BamHI clones (Fig. 1). Closer investigation of this genomic region showed two distinct transcribed loci separated by about 1.5 kb.

Hybridization of one of these Oenothera sequences to total carrot mitochondrial DNA revealed a homologous region in the latter plant, which was isolated for further investigation from a HindIII clone library of total carrot mitochondrial DNA. Many of the restriction sites in this region are common to the two plants (Fig. 1), indicating a high degree of conservation, a characteristic of coding regions in plant mitochondrial genomes. The carrot mitochondrial genome contains two copies of this region, resulting from a genomic recombination event just outside the conserved region (Fig. 1). In the mitochondrial genome of Oenothera this sequence, including orf577 and flanking sequences, is present in only one genomic environment, as indicated by analysis of clone libraries and by Southern blotting (Fig. 2).

[Fig. 2 image: Southern blot; length markers at 21.0, 9.4, 6.7, 5.1, 4.0, 3.5, 2.3, 2.0, 1.6 and 1.3 kb; hybridizing fragments at 5.8 kb and 2.6 kb]

The region conserved between Oenothera and carrot was investigated in Northern blot experiments with Oenothera mitochondrial RNA (Fig. 3). One prominent transcript of about 2 kb is detected with the orf577 coding region, represented by HindIII clone H1 (Fig. 1). The upstream region, covered by clone H2, is transcribed into a major mRNA of about 2.6 kb and other transcripts. The molecular basis of the different transcript patterns is currently unclear.

Fig. 2. The mitochondrial genome of Oenothera contains orf577 in a single genomic environment. Southern hybridization with HindIII clone H1 reveals only a single arrangement of this region in restricted total mitochondrial DNA from Oenothera. Sizes of length markers are shown on the left in kb and the estimated sizes of the hybridizing mitochondrial fragments are indicated on the right

[Fig. 3 image: Northern blot lanes probed with clones H1 and H2; RNA size standards at 9.5, 7.5, 4.4, 2.4 and 1.4 kb]

Fig. 3. Transcription analysis of orf577 and adjacent sequences. Transcripts of the region conserved between Oenothera and carrot were investigated by hybridization of clones H1 and H2 (Fig. 1). Both clones show different patterns of transcripts, with a single transcript species detected with the region encoding orf577. Part of the transcript pattern observed with H2 may be due to the sequence similarity with the 26S rRNA present upstream of orf577. RNA size standards are shown in kb on the left

Sequence analysis

Sequence analysis revealed four open reading frames (ORFs) larger than 45 amino acids, three smaller ones (orf66, orf55 and orf46) and one major open reading frame (orf577) in the 4.5 kb region investigated (Fig. 4). The largest uninterrupted ORF, coding for a polypeptide of 577 amino acids, is conserved in the carrot mitochondrial genome as orf579. Two additional triplets are inserted into the reading frame in carrot and extend the protein (Fig. 4). The three small ORFs did not show significant similarity with any other genes in the databases. A short region of homology to the 26S ribosomal RNA is found between orf66 and orf55 (Fig. 4). This homology contains nucleotides 427-470 of the Oenothera 26S rRNA, which are located in a non-conserved region of the molecule (Manna and Brennicke 1985).

Further investigations concentrated on orf577 (orf579) conserved between the two higher plants.
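A scan for open reading frames above a size threshold can be sketched as follows. This is an illustration only, not the procedure used in the study: it assumes that an ORF runs from an ATG to the next in-frame stop codon of the standard genetic code, it searches the forward strand only, and the function name and the toy sequence in the usage note are hypothetical.

```python
# Sketch: find ATG-to-stop ORFs in the three forward reading frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed here)


def find_orfs(seq: str, min_codons: int = 46):
    """Return sorted (start, end, n_codons) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG, end is exclusive and includes
    the stop codon, and n_codons counts sense codons only (stop excluded).
    The default of 46 corresponds to "larger than 45 amino acids".
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCA" * 5 + "TAA" + "G"  # hypothetical test sequence
    print(find_orfs(toy, min_codons=3))
```

With this counting convention an ORF of 577 sense codons spans 1731 nucleotides before its stop codon, which matches the length quoted for orf577 in the RNA editing section.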

RNA editing

The extent of RNA editing of orf577 in Oenothera was analyzed in independent cDNA clones isolated from a cDNA library of total mitochondrial transcripts as well as in cDNA clones obtained by specific amplification of this region from mitochondrial RNA (Fig. 5). The open reading frame was investigated in at least five PCR-derived clones and in part in the eight cDNA clones isolated from the library. All clones were identical in their sequences, apart from the C to U alterations ascribed to RNA editing. None of the clones isolated shows the "complete" RNA editing pattern with all observed sites altered. Some of the editing sites are found in only single clones, for example one at position 3496 (Fig. 4), which introduces a stop codon. Such rare editing events have been interpreted as RNA editing "mistakes" or as potential factors involved in regulation of the translatable mRNA pool (Schuster and Brennicke 1991). Amino acid comparisons suggest two further editings at amino acids 28 and 49 (Fig. 6) that have not been observed in any of the cDNA clones and may thus be similarly rare events.

[Fig. 4 image: annotated nucleotide sequence of the orf577 region of Oenothera with the translations of ORF66, ORF55, ORF46 and ORF577 and the divergent carrot nucleotides aligned underneath; the sequence is not reproduced here, see the accession numbers in the legend]

Fig. 4. Nucleotide sequence of the orf577 loci in Oenothera and carrot. The reading frame of orf577 is translated above the Oenothera genomic sequence (upper nucleotide line) with upper case letters. The nucleotide sequence of the carrot gene as obtained from clone DcH 2/12 (Fig. 1) is shown underneath with only the divergent nucleotides given. All observed RNA editing events are shown for Oenothera in open brackets with the corresponding amino acid alterations indicated. The nucleotides edited are shown in lower case letters. Three open reading frames (ORFs) greater than 45 amino acids are indicated upstream of the orf577 coding region with the deduced polypeptide sequences shown in lower case letters. The sequence data are available in the EMBL data bank under accession numbers X69554 (D. carota) and X69555 (O. berteriana)

RNA editing of this ORF in carrot is expected to be similar in extent to Oenothera, as estimated from the sequence comparison of the entire genomic coding region (Fig. 4). Of the edited triplets observed in Oenothera, 47 are conserved in carrot, while 5 codons encode genomically the amino acid specified by the corresponding edited codons in Oenothera. In carrot, 9 additional codons are expected to be edited, where C to U changes would create triplets encoding the same amino acid as in Oenothera.

All RNA editing events observed in Oenothera are C to U transitions, as for RNA editing in other plant mitochondrial genes. The frequency of RNA editing in this ORF, with 46 editing sites in 1731 nucleotides, is similar to that in other identified reading frames and thus supports the deduction that orf577 encodes a functionally important protein product.
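The effect of a single C to U edit on the encoded amino acid, and the editing density quoted above, can be worked through in a few lines. This is a minimal sketch under stated assumptions: the standard genetic code is used, codons are written in DNA notation (T for U, as in cDNA), and the function names are hypothetical.

```python
# Sketch: consequence of C-to-U editing for a codon, plus editing density.
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def edit_codon(codon: str, position: int):
    """Apply one C-to-U edit at `position` (0, 1 or 2) and return the
    amino acids encoded before and after editing."""
    codon = codon.upper().replace("U", "T")
    if codon[position] != "C":
        raise ValueError("only a C can be edited to U in this sketch")
    edited = codon[:position] + "T" + codon[position + 1:]
    return CODON_TABLE[codon], CODON_TABLE[edited]


def editing_density(n_sites: int, n_codons: int) -> float:
    """Editing sites per nucleotide of coding sequence (stop codon excluded)."""
    return n_sites / (3 * n_codons)


if __name__ == "__main__":
    print(edit_codon("CGG", 0))      # arginine codon edited at the first base
    print(edit_codon("CGA", 0))      # an edit that creates a stop codon
    print(editing_density(46, 577))  # 46 sites in 577 codons = 1731 nt
```

The first call reproduces the CGG-arginine to UGG-tryptophan change mentioned in the legend of Fig. 6. The second shows how an edit can in principle create a stop codon; it is a generic example and not a statement about the codon at position 3496. The density works out to roughly one editing site per 38 nucleotides.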

Oenothera orf577 and Marchantia orf509 are homologous

Comparison of the Oenothera orf577 with the complete sequence of the Marchantia mitochondrial genome (Oda et al. 1992) reveals one open reading frame, orf509, with significant similarity. Overall, 63% of all amino acids are identical between the Marchantia orf509 and the edited Oenothera orf577. Similarity including conservative amino acid exchanges amounts to 77% between the two deduced protein sequences.
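Identity and similarity percentages of this kind are computed from a pairwise alignment. The sketch below shows one common way to do so; it is not the authors' calculation. The grouping of conservative exchanges, the choice to ignore gapped columns in the denominator, the function name and the toy sequences are all assumptions made for illustration.

```python
# Sketch: percent identity and similarity from two aligned protein sequences.
# Assumed grouping of conservative exchanges (the paper does not state one).
CONSERVATIVE_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(aln_a: str, aln_b: str, groups=CONSERVATIVE_GROUPS):
    """Return (percent identity, percent similarity) over ungapped columns.

    Both inputs must be rows of the same alignment ('-' marks a gap).
    Similarity counts identical residues plus conservative exchanges.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {aa: idx for idx, group in enumerate(groups) for aa in group}
    compared = identical = similar = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are skipped (an assumption)
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif x in group_of and group_of.get(y) == group_of[x]:
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from Fig. 6.
    print(identity_and_similarity("MSIVEL-F", "MSLVDLAF"))
```

Different denominators (alignment length, length of the shorter protein) and different exchange groups give somewhat different percentages, so values from such a sketch are comparable with published figures only when the same conventions are used.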

Several domains of higher similarity between Oenoth- era orf577 and Marchantia orf509 can be defined in the amino acid alignment (Fig. 6). Spacing between three well conserved regions is expanded by additional se- quences in the higher plant protein. The carrot genomic

54

PH H H B B B H P P H B I I I

I l l I I I l l I I I

[ 0RF577 e l ;,~1~ ' : : : : : : ¸ : : : : : : : : : : : : : : ~ : ~ : : : : : : : : : : : : : : : : I : : I : : : : : : : : : : : : : : : : : : : ; : ; : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : ' ' - ~ ` ~ ' .

==================================================================== ======================================================================== ::::::::::: ......

Cl 02 C3

05kb

PCR

CA C5

G6 C7 Ca

Fig. 5. Strategy used to identify RNA editing sites in orf577. RNA editing sites (vertical arrows) are distributed throughout the entire open reading frame of orf577. None of the eight independent cDNA clones nor the five cDNA sequences derived by the polymerase chain reaction (PCR) contained all of the observed editing sites. Position and extent of the independent cDNA clones (C1-C8)

cDNA clones

isolated from the random primed cDNA library are shown by the horizontal lines. The different primer pairs (horizontal arrows) used for PCR amplification were primers A/D for P1, A/C for P2, B/D for P3 and B/C for P4. Restriction recognition sites are indicated for BamHI (B), HindIII (H) and PstI (P)

Oenothera orf577 (gen.) P P P Oenothera orf577 (ed.) ~ ...... ~_~S~F~.F~KE~P~FG ............. ~1~1-~W 36 Marchantia orf509 MPNALTKTIPAVRQKNLFLLPI~NVMSP~F~V~SI~-~P~LR~VS ................ LY~F 61 Rhodobacter cell ~Z~ ...... ~A~I~-~VI~L~GAQKGWSGWMAVATP~Q~G 48

S R P R P A H S S P

TM~F~I~Y~ ~.~DI~.~'4~*JF~N~I_~I~,~.'I~I : , ~ ~ ~ L ~ - - C ~ ~ ................... 140 I/q~/~TYAFVT ~.I~LK ~I~YE~HT DK~ L~T~V~I -',~=~VL~F ~ ................. ~/~A ................... i12

R A R S H CYEQKT LAVPQLYTPFVLRT LVDSE LCSRRNRT FDGPAL FYAPLYPERK~A~L~VVRGEGKRTYLL LH LAR~DK~S~-~R~DGA~ 2 3 2

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: P S P R S P S R S

I ~ I S~KS ~ ~ M ~ :r*'~&'*:~'6,1..~=~.lI [m~e ~ ~ Q QKS~NM~F F F P F LVKP KK ....... ~R~PEMAG 267 L - - V~/~ TL'~L~L E EAPF NG - RDM~L~7.~FhI~I~F LY L~G L S~S F A ~ .......... ~]~ ........... i ......... 202

S R FT LKFKDVGAKCYPAL LLRSN~LMLLR~RF~AF[]LWTGALVDTG~~NG~KD~ATS PLS~AG~SD~D~EP Ili~]~||.,~St~m.&'~., 431

P H R R R R S L

~ I~ll~e,~'~¢~l~l':%'4: I m mel~[elelgl~jlal~J ,1 ~v4 m~ F~%~ ~ I m~kvJ w:~ W~e t ; k~va m m ~ .... ~ ~ I ~FL*4 =~.'leJ m Wa~'i*j | ~.'I W.'. U D i q ~ I u 459 ~:~ge~'1~i~F:%'gy]~g~[~F~~L~A~L L~A~VVE KREA~K~I ~A~M~FS L I ~ L ~ I SK~'J|~'~:~E~I L F I LA FF~ 323

GI~MImF ............ SQ~A~VRRTYQKEMg~RST~HLRHSARAQ~RP~KN* 577 SM~FLFF ............ F~SWTKLVGALSVPSSNQ ...... DP[]SN~VN~SRNTLI~HSYQFNRLAKLMEGTEGHDKVIVYKAS~KPK* 509 GGALT~YAARASEMQAKGLFS~E~ALVMNN~LL;~A~VF-TG~WPLIAEmF~DRKLSVC~PFFEKAFTPFMVGLALLLPLGSMMPWK~AsLG 420

Fig. 6. The Oenothera mitochondrial orf577 is homologous to the bacterial cell gene. Alignment of the translated Oenothera orf577 edited (ed.) eDNA sequence with the translated Marchantia mito- chondrial orf509 and the Rhodobaeter cell (Beckman et al. 1992) genes shows extensive conservation of amino adds throughout the entire polypeptide. The higher plant protein contains several inser- tions that are absent from the bacterial and moss sequences. The latter two have little conserved carboxy-terminal extensions, only

part of which is given for Rhodobacter. The amino acids altered by RN A editing are indicated for the genomic (gen.) derived sequence of Oenothera. Particularly striking are the many CGG-arginine to UGG-tryptophan transitions in the highly conserved region be- tween amino acids 417 and 527 that appear to be essential for a functional protein. Amino acids conserved between at least two species are highlighted

55

Oenothera mt Marchantia mt Rhodobacter Paramecium mt C~anidium cp Marchantia cp Rice cp Tobacco cp

R P H R R R R $ $ i l t $

- YKKFF LW T GG S ~ / ~ V ~ N ~ F ~ S ~ Q ~ I I~LFYFAV~LFLL~KRP

-~S LGFP L~T I~S~AV~- N~-/~SY~N~K~TWAL I T~L I FA I YLPaTRMI

- ~S LGFT~T I ~ S ~ V ~ - N~- A ~ S Y ~ I ~ I ~ T W A ~ I T~I VFAI YL~[~RTN

Fig. 7. A conserved domain in ccH, 0ff'577 and plastid ORFs. The region between amino acids 416 and 473 of the Oenothera orf577/ ccH shows significant similarity with a mitochondrial (rot) ORP from Paramecium (orf238; Pritchard et al. 1990) and plastid (cp) ORFs from higher (orf321 in rice; Hiratsuka et al. 1989, and orf313 in tobacco; Shinozaki et al. 1986) and lower plants (off321 in M~rchant&; Ohyama et al. 1986) as well as the red alga CyanMium

473 405 265 185 266 280 278 269

(orf307; Valentin et al. 1992). The genomic-encoded amino acids altered by RNA editing in Oenothera are given above the aligned amino acid sequence deduced from the cDNA sequences. Position of the last amino acid in the entire protein sequence is given on the right. Amino acids conserved between the Oenothera sequence and at least one of the other proteins aligned are highlighted

sequence also contains these inserted sequences, which may thus be a feature common to higher plants. Both carrot and Oenothera open reading frames begin at the second in-frame ATG codon of the Marchantia orf509 (Oda et al. 1992), suggesting that this initiation codon, 22 codons downstream of the first ATG, also functions as a translational start site in the moss.
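Locating candidate start sites such as a second in-frame ATG amounts to scanning an open reading frame codon by codon. The following is a minimal sketch under stated assumptions: the function name `in_frame_atg_codons` and the toy sequence are hypothetical, and the sequence is not the Marchantia orf509.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_atg_codons(orf_dna):
    """Return 0-based codon indices of in-frame ATGs.

    Scans frame 0 from the first base and stops at the first stop codon.
    """
    hits = []
    for i in range(0, len(orf_dna) - 2, 3):
        codon = orf_dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            hits.append(i // 3)
    return hits


# Toy ORF: ATG GCA TGC GCT ATG AAA TAA
# The ATG spanning codons 2 and 3 ("GCA TGC") is out of frame and is ignored.
toy_orf = "ATG" + "GCATGC" + "GCT" + "ATG" + "AAA" + "TAA"

starts = in_frame_atg_codons(toy_orf)
print(starts)                  # [0, 4]
print(starts[1] - starts[0])   # 4 codons between the first and second ATG
```

Applied to a real sequence, the difference between the first two indices would give the distance between the first and second in-frame ATG, the quantity reported above as 22 codons for orf509.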

Homology with a chloroplast reading frame

The domain in the carboxy-terminal region (amino acids 417-473) best conserved between higher and lower plants shows significant similarity with a plastid open reading frame in land plants (orf313 in tobacco; Shinozaki et al. 1986; and orf320 in Marchantia; Oda et al. 1992) and also in a red alga (orf321; Valentin et al. 1992; Fig. 7). The homology suggests that this domain of both mitochondrial- and plastid-encoded proteins has a common function. The evolutionary derivation of the conserved region in both organelles can only be identified when further species are analyzed as potential links.

Homology with the ccl1 gene of Rhodobacter capsulatus

Homology searches in current databases, using the protein sequence deduced from the edited orf577 sequences, revealed a striking similarity with the ccl1 gene of R. capsulatus (Beckman et al. 1992; Figs. 6 and 7). Similarity between the deduced protein sequences amounts to 66%, with 40% of the amino acids being identical between the bacterial and the higher plant sequences.
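Percent identity and percent similarity of this kind are derived from a pairwise alignment. The sketch below shows one common way to compute them. The source does not state the scoring scheme, so the grouping of similar amino acids, the exclusion of gapped columns from the denominator, the function name `identity_and_similarity` and the toy alignment are all assumptions.

```python
# Assumed groups of chemically similar residues (illustrative only).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Hypothetical aligned fragments, not taken from orf577 or ccl1.
seq_a = "MKLW-TRAGS"
seq_b = "MRIWQTD-GT"

identity, similarity = identity_and_similarity(seq_a, seq_b)
print(identity, similarity)  # 50.0 87.5
```

Different substitution groupings or gap conventions change the resulting values, so figures from different studies are comparable only when the scheme is the same.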

The domains conserved between the higher plants Oenothera and carrot and the moss Marchantia are also found in the bacterial gene. The overall alignment (Fig. 6) positions the second ATG codon in Marchantia with the start codon of the bacterial sequence, supporting the suggestion that this codon is the translational start in the moss. The additional sequences in the higher plants relative to Marchantia are also absent from the ccl1 gene, supporting the argument that these sequences are insertions specific for the higher plant lineage. Both Marchantia and Rhodobacter have longer carboxy-terminal extensions that are absent in higher plants. It is as yet unclear whether the different non-conserved domains serve similar complementing functions in the different organisms.

The region with the highest similarity between the mitochondrial ORFs and the bacterial ccl1 gene is the domain that is also conserved in chloroplasts. This domain also seems to be present in the Paramecium mitochondrial gene (orf238; Pritchard et al. 1990; Fig. 7) and provides further support for the suggestion of similar functions for the plastid and mitochondrial genes, most probably analogous to that of the bacterial ccl1 gene in cytochrome c biogenesis (Beckman et al. 1992).

Discussion

The approach to localizing new genes in the plant mitochondrial genome used here, which involves investigating transcribed regions, has resulted in the identification of a new gene in Oenothera and carrot. The high degree of conservation in the two higher plants and the moss Marchantia emphasizes the functional importance of this gene. RNA editing observed in the Oenothera orf577 transcript and deduced for the carrot mRNA further increases the similarity of the encoded polypeptides. Furthermore, the high degree of conservation allows the identification of this gene.

The conserved plant mitochondrial gene described in this investigation can be identified, by virtue of its similarity with the respective bacterial sequences, as a homologue of the ccl1 gene. This gene has been shown to encode an essential factor for assembly of functional cytochromes c (Beckman et al. 1992).

The c-type cytochromes function as electron transfer proteins and have a covalently attached heme group, while other cytochromes have more loosely associated hemes. The mitochondrial, membrane-bound cytochrome c1 and the cytochrome c are located in the intermembrane space. These proteins are encoded by nuclear genes, at least in fungi and mammals, and cytochrome c1 has been shown to be transported to its final location in a two-step reaction (Nicholson and Neupert 1989; Nicholson et al. 1989). First, apocytochrome c1 is imported into the mitochondrial matrix through contact sites and the first target sequence is removed. The protein is then translocated through the inner membrane into the intermembrane space, where the second presequence is processed. Both cytochromes require further maturation by the addition of the heme group in the intermembrane space. Several genes involved in transport and assembly of the cytochromes have already been identified in fungi (Tzagoloff and Dieckman 1990), but recent results from mutational and molecular analyses in bacteria suggest that additional factors may also be involved in these reactions in eukaryotes (Beckman et al. 1992; Ramseier et al. 1991).

Mutational analyses in the symbiotic bacterium Bradyrhizobium japonicum (Ramseier et al. 1991) and in the photosynthetic bacterium R. capsulatus (Beckman et al. 1992) identified gene clusters essential for assembly of cytochromes c. One locus was identified in both species that encodes at least four genes involved in cytochrome c assembly, termed helA, helB, helC and orf52 in Rhodobacter (Beckman et al. 1992) and cycV, cycW, orf263 and cycX, respectively, in Bradyrhizobium (Ramseier et al. 1991). The second cluster has so far only been described in Rhodobacter and contains at least two genes, ccl1 and ccl2 (Beckman et al. 1992).

The deduced amino acid sequences for ccl1 and ccl2 possess typical signal sequences for translocation into the periplasm. The ccl1-encoded protein was shown by gene fusion analysis to be indeed located in the periplasm. The most likely function of ccl1 has been deduced by taking into consideration the previously noted similarity of ccl1 with chloroplast ORFs and an ORF in the Paramecium mitochondrial genome (Pritchard et al. 1990). A function as a ligase responsible for ligation of heme to cytochromes c has been considered unlikely, since ccl1 shows no similarity with the identified cytochrome-c-heme lyase (Beckman et al. 1992). The ccl1 class proteins are therefore proposed to act as specific heme chaperones. They may play a role in guidance of apocytochromes and heme groups for the covalent linkage introduced by the cytochrome-c-heme lyase.

The high degree of conservation of these genes in a wide range of species and organelles, including bacteria, chloroplasts and mitochondria from protozoa as well as plants, suggests a specific, indispensable function that had to be maintained through evolution. Intriguingly, the ccl1 gene is not present in all bacteria, e.g. Escherichia coli, which does not require covalent cytochrome c-heme binding (Beckman et al. 1992). Rhodobacter and Bradyrhizobium, where ccl1 and/or related genes are found, belong to the group of photosynthetic and symbiotic bacteria which are considered to be phylogenetically related to the ancestral mitochondrial endosymbiont (Gray 1989).

The other genes of the bacterial hel/cyc and ccl clusters may likewise be conserved in their function in assembly of cytochromes c in organelles. Similarity analyses in the cluster of open reading frames containing the ccl1 gene in the Marchantia mitochondrial genome show extensive similarities of orf277 with the helB/cycW genes and between orf228 and the helC/orf222 genes in Rhodobacter and Bradyrhizobium, respectively (not shown). These genes may also be conserved in the mitochondria of higher plants, although not arranged in a similar cluster, e.g. the analogue of helB/cycW is indeed encoded in the mitochondrial genome of Oenothera (W. Schuster, in preparation).

Acknowledgements. We thank Astrid Weber and Waltraut Jekabsons for excellent technical assistance. This work was supported by grants from the Deutsche Forschungsgemeinschaft and the Bundesministerium für Forschung und Technologie.

References

Beckman DL, Trawick DR, Kranz RG (1992) Bacterial cytochromes c biogenesis. Genes Dev 6:268-283

Gray MW (1989) The evolutionary origins of organelles. Trends Genet 5:294-299

Hiesel R, Schobel W, Schuster W, Brennicke A (1987) Cytochrome oxidase subunit I and III genes in Oenothera mitochondria are transcribed from identical promoter sequences. EMBO J 6:29-34

Hiratsuka J, Shimada H, Whittier R, Ishibashi T, Sakamoto M, Mori M, Kondo C, Honji Y, Sun C-R, Meng B-Y, Li Y-Q, Kanno A, Nishizawa Y, Hirai A, Shinizaki K, Sugiura S (1989) The complete sequence of the rice (Oryza sativa) chloroplast genome: intramolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol Gen Genet 217:185-194

Knoop V, Schuster W, Wissinger B, Brennicke A (1991) Trans-splicing integrates an exon of 22 nucleotides into the nad5 mRNA in higher plant mitochondria. EMBO J 10:3483-3493

Leaver CJ, Gray MW (1982) Mitochondrial genome organization and expression in higher plants. Annu Rev Plant Physiol 33:373-402

Manna E, Brennicke A (1985) Primary and secondary structure of 26S ribosomal RNA of Oenothera mitochondria. Curr Genet 9:505-515

Nicholson DW, Neupert W (1989) Import of cytochrome c into mitochondria: reduction of heme, mediated by NADH and flavin nucleotides, is obligatory for its covalent linkage to apocytochrome c. Proc Natl Acad Sci USA 86:4340-4344

Nicholson DW, Stuart RA, Neupert W (1989) Biogenesis of cytochrome c1. J Biol Chem 264:10156-10168

Oda K, Yamato K, Ohta E, Nakamura Y, Takemura M, Nozato N, Akashi K, Kanegae T, Ogura Y, Kohchi T, Ohyama K (1992) Gene organization deduced from the complete sequence of liverwort Marchantia polymorpha mitochondrial DNA. J Mol Biol 223:1-7

Ohyama K, Fukuzawa H, Kochi T, Shira H, Sano T, Umesono K, Shiki Y, Takeuchi M, Chang Z, Aota S, Inokuchi H, Ozeki H (1986) Chloroplast genome organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322:572-574

Pritchard AE, Seilhamer JJ, Mahalingam CL, Sable CL, Venuti SE, Cummings DJ (1990) Nucleotide sequence of the mitochondrial genome of Paramecium. Nucleic Acids Res 18:173-180

Ramseier TM, Winteler HV, Hennecke H (1991) Discovery and sequence analysis of bacterial genes involved in the biogenesis of c-type cytochromes. J Biol Chem 266:7793-7803

Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York

Schuster W, Brennicke A (1987) Plastid, nuclear and reverse transcriptase sequences in the mitochondrial genome of Oenothera: is genetic information transferred between organelles via RNA? EMBO J 6:2857-2863

Schuster W, Brennicke A (1991) RNA editing makes mistakes in plant mitochondria: editing loses sense in transcripts of a rps19 pseudogene and in creating stop codons in coxI and rps3 mRNAs of Oenothera. Nucleic Acids Res 19:6923-6928

Schuster W, Hiesel R, Wissinger B, Schobel W, Brennicke A (1989) Isolation and analysis of plant mitochondria and their genomes. In: Shaw CH (ed) Plant molecular biology. IRL Press, Oxford Washington, DC, pp 79-102

Shinozaki K, Ohme M, Tanaka M, Wakasugi T, Hayashida N, Matsubayashi T, Zaita N, Chunwongse J, Obokata J, Yamaguchi-Shinozaki K, Ohto C, Torazawa K, Meng BY, Sugita M, Deno H, Kamogashira T, Yamada K, Kusuda J, Takaiwa F, Kato A, Tohdoh N, Shimada H, Sugiura M (1986) The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J 5:2043-2049

Tzagoloff A, Dieckman CL (1990) PET genes of Saccharomyces cerevisiae. Microbiol Rev 54:211-225

Valentin K, Maid U, Emrich A, Zetsche K (1992) Organization and expression of a phycobiliprotein gene cluster from the unicellular alga Cyanidium caldarium. Plant Mol Biol 20:267-276

Ward BL, Anderson RS, Bendich AJ (1981) The mitochondrial genome is large and variable in a family of plants (Cucurbitaceae). Cell 25:793-803

Wissinger B, Schuster W, Brennicke A (1991) Trans splicing in Oenothera mitochondria: nad1 mRNAs are edited in exon and trans-splicing group II intron sequences. Cell 65:473-482

Wissinger B, Brennicke A, Schuster W (1992) Regenerating good sense: RNA editing and trans-splicing in plant mitochondria. Trends Genet 8:322-328