Mol Gen Genet (1990) 221:102-112

© Springer-Verlag 1990

Transcription of tomato ribosomal DNA and the organization of the intergenic spacer

Keith L. Perry and Peter Palukaitis Department of Plant Pathology, Cornell University, Ithaca, NY 14853-5908, USA

Summary. The organization of the intergenic spacer of a 9.04 kb tomato ribosomal RNA gene (rDNA) was determined. The 3258 bp spacer contains two major repeat elements enclosing a region which includes 351 bp of an 81.8% A-T rich sequence. A block of nine 53 bp repeats begins 388 bp downstream from the 3' end of the 25S rRNA. The A-T rich domain is followed by a block of six 141 bp repeats terminating 818 bp upstream from the 5' end of the 18S rRNA. Major pre-rRNAs of 7.6 and 6.5 kb were observed by Northern hybridization analysis. The 5' termini of these RNAs were identified through combined S1 nuclease and primer extension analyses. The 7.6 kb RNA is likely to be the primary transcript; its 5' terminus lies within a sequence motif, TATA(R)TA(N)GGG, conserved at the termini of transcripts mapped in three other plant species. The 6.5 kb RNA is interpreted as a 5' end processed transcript derived from the 7.6 kb RNA. Comparative analysis of transcribed sequences revealed a 25 bp domain of the intergenic spacer which is relatively conserved among five plant species. The conservation of spacer sequences in plants is in contrast to the extensive sequence divergence of the intergenic spacer in other non-plant systems and suggests a conserved function directed by these sequences.

Key words: rDNA - RNA polymerase I

Introduction

Ribosomal DNA (rDNA) shares the same basic organization in most eukaryotic systems (reviewed in Long and Dawid 1980). In ribosomal genes, the nucleotide sequences and ordered arrangement of the encoded 18S, 5.8S, and 25S rRNAs are conserved. These RNAs are all contained within a single pre-rRNA transcript which is processed to form the mature rRNAs. The cluster of these sequences is separated from the next gene by an intergenic spacer located 3' to the 25S and 5' to the 18S rRNA domains. The term "intergenic spacer", as used to describe plant rRNA genes (Rogers and Bendich 1987b), includes sequences found both upstream (5') and downstream (3') of the promoter. In contrast to the highly conserved nature of the structural RNA domains, the sequences of the intergenic spacer can vary considerably among taxa and are species specific (Dvořák and Appels 1982; Delcasso-Tremousaygue et al. 1988). In spite of this variability, the functional roles of the intergenic spacers are conserved.

Offprint requests to: P. Palukaitis

The intergenic spacer is involved in a number of basic cellular functions. Sequences of the intergenic spacer control the precise initiation of rDNA transcription (Sollner-Webb and Tower 1986), the termination of transcription (Grummt et al. 1985; Labhart and Reeder 1986), and the level of rDNA expression (Moss 1983; Busby and Reeder 1983; Reeder 1984). The regulation of rDNA transcription is especially critical with regard to ribosome biogenesis, which requires the coordinate action of all three major RNA polymerases. In mouse, RNA sequences encoded by the intergenic spacer are recognized by proteins involved in the processing of the primary transcript. The sequences immediately 3' of the processing site are necessary and sufficient for the processing reaction; sequences 5' of this site have no effect and are rapidly degraded (Craig et al. 1987). Sequences in the intergenic spacer also appear to function as chromosomal origins of DNA replication (Botchan and Dayton 1982; Van't Hof et al. 1987a, b).

Ribosomal RNA copy numbers vary considerably in plants, as do the lengths of rDNA repeats within individuals (Ingle et al. 1975; Walbot and Cullis 1985; Rogers and Bendich 1987a, b). These length polymorphisms arise from differences in the numbers of repetitive elements within the intergenic spacer (Appels and Dvořák 1982; Yakura et al. 1984; Saghai-Maroof et al. 1984; Rogers and Bendich 1987a, b; Gerstner et al. 1988; Jorgensen et al. 1987). Shifts in the populations of spacer length classes can be observed within a small number of generations (Saghai-Maroof et al. 1984; Rogers and Bendich 1987a). The entire intergenic spacer has been sequenced from maize (McMullen et al. 1986; Toloczyki and Feix 1986), rye (Appels et al. 1986), wheat (Lassner et al. 1987; Barker et al. 1988), radish (Delcasso-Tremousaygue et al. 1988), and a major portion of that from mung bean (Gerstner et al. 1988). In maize and radish, the 5' termini of rDNA transcripts have been precisely identified (McMullen et al. 1986; Toloczyki and Feix 1986; Delcasso-Tremousaygue et al. 1988), and this has led to the recognition of a set of sequences regarded as plant RNA polymerase I transcription initiation sites (Toloczyki and Feix 1986; Gerstner et al. 1988).

Our work has been aimed at characterizing the promoter for RNA polymerase I in tomato. We have cloned a complete rRNA gene, sequenced the entire intergenic spacer, and identified the 5' termini of rDNA transcripts from this region. This work, together with the recent work of Kiss et al. (1988, 1989a, b), provides a composite sequence for an entire plant rRNA gene. To our knowledge, this is only the second report for a dicotyledonous plant which provides the entire sequence of the intergenic spacer combined with an analysis of transcription. This information has facilitated the recognition of sequence elements from the intergenic spacer which are conserved among diverse species of plants.

Materials and methods

DNA isolation, cloning and sequencing. For nucleic acid manipulations and cloning, procedures were as described in Maniatis et al. (1982), unless otherwise stated. Genomic DNA was isolated from Lycopersicon esculentum cv. Rutgers according to the methods of Bendich et al. (1979). DNA was digested with the restriction enzyme BglII, followed by phenol/chloroform extraction and fractionation on a 10%-40% sucrose gradient in a swinging bucket SW41 rotor at 34000 rpm for 18.5 h at 4°C. Agarose gel electrophoresis of the fractionated DNA revealed a 9 kb band. This fragment was cloned into the BamHI site of pUC18 (Yanisch-Perron et al. 1985) and transformed into Escherichia coli strain HB101. Ampicillin resistant colonies were screened by colony hybridization using a random primed cDNA probe complementary to tomato 18S rRNA and prepared by the method of Taylor et al. (1976) as described by Gould and Symons (1977). One clone, pKU83, contained a full length 9.04 kb tomato rRNA gene and was used for all further work. DNAs were subcloned into pUC119 and pUC120 (Vieira and Messing 1987) or into bacteriophage M13 mp18 and M13 mp19 (Norrander et al. 1983). Details of the subcloned constructs are described elsewhere (Perry 1989). Sequencing of both DNA strands from the intergenic spacer was performed according to the dideoxynucleotide chain termination procedure (Sanger et al. 1977, 1980) except that 7-deaza-dGTP was substituted for dGTP (Mizusawa et al. 1986). Plasmid DNA templates were prepared for sequencing as described by Mierendorf and Pfeffer (1987). To prepare the end-labeled primer for sequencing, the primer (Fig. 5, probe j) was treated with polynucleotide kinase in the presence of an equimolar amount of [γ-³²P]ATP. The sequencing reactions were performed with 5 × 10⁵ cpm of primer in the presence of a fivefold greater concentration of dideoxynucleotides relative to the standard conditions. The products were separated on a 10% polyacrylamide-8 M urea sequencing gel.

RNA isolation and Northern hybridization analysis. RNA was isolated from young, not fully expanded leaves by the guanidinium thiocyanate method of Chirgwin et al. (1979) as modified by Sures and Crippa (1984). RNAs were stored as ethanol precipitates at -70°C. For Northern hybridization analysis, 5 µg of RNA was denatured with glyoxal and electrophoresed on a 1.0% agarose gel (McMaster and Carmichael 1977) until the bromphenol dye had migrated 9.5 cm. DNA and RNA ladders (Bethesda Research Laboratories) were used as molecular weight standards. Nucleic acids were capillary-transferred onto GeneScreen Plus nylon membrane (Du Pont Company, NEN products). Filters were boiled 5 min in 20 mM TRIS-HCl, pH 8.0, to reverse the glyoxylation, and dried. Probes were radiolabeled by the 'oligolabeling' procedure of Feinberg and Vogelstein (1983). Filters were prehybridized and hybridized at 60°C for 16 or 42 h each in 1% sodium dodecyl sulfate (SDS), 50 mM TRIS-HCl pH 7.5, 5× Denhardt's solution [0.1% (w/v) Ficoll, 0.1% (w/v) polyvinylpyrrolidone, 0.1% (w/v) bovine serum albumin], 2.5 mM EDTA, 5× SSC (1× = 0.15 M NaCl, 0.015 M sodium citrate), and 100 µg/ml salmon sperm DNA. Filters were given four 15 min washes at 65°C in 0.2% SDS with decreasing concentrations of SSC (2×, 1×, 0.5×, and 0.5×).

S1 nuclease and primer extension analysis. The DNA probes were 5' end-labeled with polynucleotide kinase using a 0.2× molar equivalent of [γ-³²P]ATP. Standard enzyme and physical manipulations are as described in Maniatis et al. (1982). The DNAs were either synthetic oligonucleotides (prepared by the Cornell oligonucleotide synthesis facilities) or restriction enzyme fragments purified from low melting point agarose by making the 70°C melted agarose 0.5 M with respect to NaCl, extracting twice with 37°C phenol and precipitating twice with ethanol. The probes diagrammed in Fig. 5 and described in the text were subcloned DNAs digested at one restriction enzyme site in the intergenic spacer (EcoRI, XbaI, BamHI, AvaII, SalI, and SalI for probes a-f respectively) and at a second site in the vector. A 134 nucleotide BamHI/HaeIII fragment (Fig. 5, probe h) was separated on and eluted from an 8% polyacrylamide-urea gel (Bolivar and Backman 1979). For the annealing reactions, 1-2 × 10⁵ cpm of probe was co-precipitated with RNA and the nucleic acids suspended thoroughly in 16 µl of deionized formamide. The solution was brought to 400 mM NaCl, 40 mM PIPES [piperazine-1,4-bis(2-ethanesulfonic acid)], pH 6.4, and 1 mM EDTA in a total volume of 20 µl (Berk and Sharp 1977). After covering with a layer of mineral oil, the sample was denatured at 85°C for 15 min and then incubated at 60°C for 16 h. RNAs used for controls were either yeast tRNA or total RNA from the fungus Cochliobolus heterostrophus (generously provided by S. Van Wert).

For the S1 nuclease assays, hybridizations were terminated as described in Berk and Sharp (1977) and the nucleic acids digested at 37°C for 30 min with 400 units of S1 nuclease (Boehringer Mannheim). The phenol-extracted, ethanol-precipitated products were suspended in 5 µl of a buffer containing formamide (Calzone et al. 1987) and 1 µl was loaded onto a 6% or 8% polyacrylamide sequencing gel.

For the primer extension reactions, the hybridized samples were diluted tenfold with water, ethanol-precipitated, suspended in reaction buffer, and processed as described by Kingston (1987). The primers were extended with 10-30 units of AMV reverse transcriptase (Life Sciences). The reaction products were suspended in 5 µl of buffer and 1 µl loaded as described above. Primer extension reactions were also carried out without any prior annealing step. The RNA and primer were denatured for 5 min at 85°C. The components of the reaction mix were then added just prior to the addition of the enzyme.

Computer analysis. The analysis of sequencing data was made on an IBM Personal Computer AT equipped with the Microgenic Sequence Analysis Program (Beckman). The dot matrix analysis and sequence comparisons were done with the University of Wisconsin GCG (UWGCG) package (Devereux et al. 1984) run on a VAX/VMS computer.
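The idea behind a dot matrix self-comparison can be illustrated with a short sketch. This is not the UWGCG implementation: it is a simplified identity count in which a dot is recorded wherever at least 11 of 14 aligned nucleotides match, the setting reported in the Fig. 3 legend. The function name `dot_matrix` and the toy repeat unit are hypothetical and chosen only for illustration.

```python
def dot_matrix(seq_a, seq_b, window=14, stringency=11):
    """Return 0-based (i, j) window starts where at least `stringency`
    of `window` aligned positions are identical (simplified scoring)."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            matches = sum(1 for x, y in zip(win_a, win_b) if x == y)
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Hypothetical 16 nt unit repeated three times in tandem (toy data,
# not a sequence taken from the tomato spacer).
unit = "TGGCATGCCATCATCG"
toy = unit * 3
hits = set(dot_matrix(toy, toy))

# Off-diagonal offsets at which the first window finds a match; tandem
# repeats show up at multiples of the unit length.
offsets = sorted({j - i for i, j in hits if i == 0})
print(offsets)
```

In such a plot the spacing between parallel diagonals reflects the repeat length and the number of diagonals reflects the repeat count, which is how the Fig. 3 legend reads the tomato spacer.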


Results

Organization and sequence of the intergenic spacer

The organization of a tomato rRNA gene within a tandem array is illustrated in Fig. 1A. To study the intergenic spacer from this gene, a 9.04 kb rDNA fragment was cloned following digestion of tomato DNA with the restriction enzyme BglII. This enzyme cuts at a unique site within the 25S rRNA domain (unpublished results), and thus the clone represents a full length rRNA gene. Two size classes of repetitive elements (Fig. 1B; 53 and 141 bp repeats) were observed when the gene was cut with the restriction enzyme SphI. Subsequent restriction enzyme and DNA sequence analysis of the intergenic spacer revealed that the smaller 53 bp sequence (one of the elements is 50 bp) is repeated nine times beginning 388 nucleotides downstream from the 3' end of the 25S rRNA domain. The larger 141 bp sequence (the repeat size varies from 136 to 141 bp) is repeated six times beginning near the center of the 3258 bp intergenic spacer. The organization of these two sets of repeats is illustrated in Fig. 1B, and the sequences are boxed in Fig. 2.
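As a rough consistency check, the repeat counts and sizes given above can be turned into approximate block extents. The sketch below uses only figures stated in the text (nine 53 bp repeats, one of them 50 bp; six repeats of 136-141 bp; the 388 and 818 nucleotide flanking distances from the Summary). The variable names are hypothetical, and the exact boundary nucleotides are not asserted here, so the coordinates should be read as approximate.

```python
SPACER_LENGTH = 3258  # nucleotides in the intergenic spacer

# Block of nine 53 bp repeats, one of which is 50 bp.
block53_length = 8 * 53 + 50
block53_approx_end = 388 + block53_length  # begins 388 nt past the 25S 3' end

# Block of six larger repeats, each 136 to 141 bp, ending 818 nt
# upstream of the 18S 5' end.
block141_end = SPACER_LENGTH - 818
block141_length_bounds = (6 * 136, 6 * 141)
block141_start_bounds = (
    block141_end - block141_length_bounds[1] + 1,
    block141_end - block141_length_bounds[0] + 1,
)

print(block53_length, block53_approx_end)
print(block141_end, block141_length_bounds, block141_start_bounds)
```

These values agree with the approximate positions given in the Fig. 3 legend (about 390-860 and 1600-2440).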

A dot matrix analysis of the intergenic spacer sequence (Fig. 3) was used to illustrate the repetitive sequences within the intergenic spacer, and to aid in defining the boundaries of the repeats (Fig. 2). The matrix reveals an alignment of dots which corresponds to an 8 bp subrepeat (TGGCATGC) common to both the 53 bp and 141 bp repeats (see Fig. 3 legend). Also evident is a dense cluster of apparent repeats close to the diagonal, from nucleotides 1100-1450. This distinctive feature of the intergenic spacer is a block of 351 nucleotides (also shown in Figs. 1 and 2) that is 81.8% A+T compared with 41.9% in the remainder of the intergenic spacer. The overall G+C content of the intergenic spacer is 53.8%. No striking similarities in sequence or secondary structure were observed between either of the repeats and the region surrounding the putative transcription initiation site.
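The composition figures quoted above are mutually consistent, as a length-weighted average shows. The following is a minimal sketch using only the numbers in the text; the helper names `at_percent` and `combined_gc_percent` are hypothetical.

```python
def at_percent(seq):
    """Percentage of A and T residues in a DNA string."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


def combined_gc_percent(segments):
    """Overall G+C percentage from (length, A+T percent) pairs."""
    total = sum(length for length, _ in segments)
    at_residues = sum(length * pct / 100.0 for length, pct in segments)
    return 100.0 * (1.0 - at_residues / total)


SPACER_LENGTH = 3258
AT_RICH_LENGTH = 351

segments = [
    (AT_RICH_LENGTH, 81.8),                  # A-T rich block
    (SPACER_LENGTH - AT_RICH_LENGTH, 41.9),  # remainder of the spacer
]
overall_gc = combined_gc_percent(segments)
print(round(overall_gc, 1))  # 53.8
```

The weighted average reproduces the 53.8% overall G+C content reported for the spacer.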

In addition to the 3258 nucleotides of the intergenic spacer, the nucleotide sequence was determined for the adjoining 74 nucleotides at the 3' end of the 25S and the 162 nucleotides at the 5' end of the 18S rRNA domains. These 18S and 25S rDNA sequences (data not shown) are identical to those published by Kiss et al. (1989a, b).

Transcription of the rDNA

In order to evaluate the primary transcripts of tomato rDNA, transcription products were first visualized by Northern hybridization analysis (Fig. 4). The probe, a ³²P-labeled XbaI-BamHI fragment (nucleotides 2533-2853) upstream of the 18S coding region, was considered likely to be included in a primary transcript originating in the intergenic spacer. The two largest transcripts (Fig. 4, bands a and b) were estimated to be 6.5 kb and 7.6 kb by comparison with commercial molecular weight ladders. [Glyoxylated DNAs are suitable standards for the sizing of glyoxylated RNA (McMaster and Carmichael 1977).] These standards gave a linear plot and, using the same gel as in Fig. 4A, the size of the tobacco mosaic virus genome was estimated to be 6.6 kb and the 25S tomato rRNA to be 3.5 kb (data not shown). These values closely approximate the values known from sequence data, 6.40 kb (Goelet et al. 1982) and 3.38 kb (Kiss et al. 1989a), respectively. Ribosomal DNA specific probes further upstream in the intergenic spacer, such as the SalI-SacI fragment (nucleotides 2283-2536), hybridized to the larger 7.6 kb band, but not to the smaller 6.5 kb RNA (data not shown). We interpret these bands to be a primary transcript and its 5' end-processed product, as illustrated in the lower part of Fig. 1A. Two additional RNAs (bands c and d) hybridized to the probe in Fig. 4. Based on its position, we believe band c arose from an annealing of the probe to the 25S rRNA; other probes representing regions of the intergenic spacer both upstream (5') and downstream (3') of the probe used here did not give rise to this signal. Band d may be a processing product cleaved downstream from the 3' end of the 18S rRNA and which still retained a portion of the transcribed intergenic spacer.
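Sizing against a ladder of this kind is commonly done by fitting log10(size) against migration distance. The sketch below assumes that this semi-logarithmic relation is the "linear plot" referred to above, which the text does not state explicitly. The marker sizes are four of those shown beside the gel in Fig. 4; the migration distances are hypothetical values generated from an invented calibration line (intercept 1.2, slope -0.15 per cm), not measurements from the paper.

```python
import math


def fit_semilog(sizes_kb, distances_cm):
    """Least-squares fit of log10(size) = intercept + slope * distance."""
    logs = [math.log10(size) for size in sizes_kb]
    n = len(logs)
    mean_x = sum(distances_cm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_cm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_cm, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_kb(distance_cm, slope, intercept):
    """Interpolate a fragment size from its migration distance."""
    return 10 ** (intercept + slope * distance_cm)


# Marker sizes (kb) as diagrammed in Fig. 4.
MARKERS_KB = [9.5, 7.5, 4.4, 2.4]

# HYPOTHETICAL distances, synthesized from an invented calibration line.
TRUE_INTERCEPT, TRUE_SLOPE = 1.2, -0.15
distances = [(math.log10(s) - TRUE_INTERCEPT) / TRUE_SLOPE for s in MARKERS_KB]

slope, intercept = fit_semilog(MARKERS_KB, distances)

# A band running where a 7.6 kb RNA would run under this invented line.
band_distance = (math.log10(7.6) - TRUE_INTERCEPT) / TRUE_SLOPE
print(round(estimate_size_kb(band_distance, slope, intercept), 2))
```

With real gel measurements the fit would not be exact, and the residuals of the markers give a sense of the sizing error; the TMV and 25S rRNA controls described above serve that purpose in the paper.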

In order to determine the precise 5' termini of transcripts of the intergenic spacer, we employed a combination of S1 nuclease protection and primer extension analyses. Sets of probes that had identical 5' termini were chosen and the products of these assays were electrophoresed in parallel. Thus, any reaction product originating from a probe that had annealed to an rDNA transcript should appear in both assays and, being of identical composition, should migrate to the same position in a polyacrylamide gel. Figure 5 illustrates this strategy and summarizes the results from the combined S1 nuclease and primer extension experiments. Only two transcripts were observed and these corresponded to those detected in the Northern hybridization analysis. The positions of the 5' termini of these two transcripts were mapped and are depicted in Fig. 5. The data for the precise mapping of these termini are presented in Figs. 6-8.

Fig. 1A and B. Structure and transcription of a tomato rRNA gene. A A single rDNA unit, encoding the structural 18S, 5.8S, and 25S rRNAs, is shown embedded in a tandem array. The locations of the presumptive initiation site (large arrow) and the primary processing site (small arrow) are indicated. The relative size and position of the primary transcript and its 5' end-processed product are shown below. The precise 3' ends of these transcripts have not been determined, as indicated by the dashed arrow. B The intergenic spacer is shown with its flanking 25S and 18S rRNA domains. The boxed regions illustrate the locations and relative sizes of the 53 and 141 bp repeats. The presumptive initiation site and primary processing site are indicated as above. An A-T rich domain is located just upstream of the presumptive initiation site

1 CCCCCCCACACTCCCCCTCCCCCAAAATCAAATCCAATCATTTCTAACT~TTCAAATGTG 60
61 AGGTTCGCGTGCTGCCTGCATCCTTCGAAGAGGAAAAAATAACTAAGTG~TGAAATATAA 120
121 GTTTCAAAAGTAACACGGCAAGTGAAGTTCACTAGTCTGCCGCTAAGTGTTGAGCTATGC 180
181 GTTCTGAGCCCCATTGCGAGTTTTTCGTGAAGTTGAGTTCA~TATCAA~CCTAATGACA 240
241 TGTTAAGGGACTAATGACATGTCACTGTAAGAGGTTTTCGGGATGTCGG~TGCGATTATT 300
301 AAAGCCAAGTTAGATGTCAAGGGGCAAATGGGTCTGCGTACGCAGCACG~CCGCGGCCAG 360
361 GCGGCATCTGCAAAGGCCTGCGCCGA~GGGCGTGGACTGCAAAATACGCCTTTGGGCAG 420
421 CACACACGGTCGAACGACG~GGGC~TGGCATGTCATCATCGCCTTTGGGCAGCACACAC 480
481 GTTCGAGCGACGq~GGGCGTGGCATGCCATCATCGCCTTTGGGCAGTACAAACGGTCGAA 540
541 CGACGT~GGGCGTGGCATGCCACCATCGCCTTTGGGCAG~ACACACGGTCGAGCGACG~C 600
601 GGGCGTGGCATGCCATCATCGCCI'fTGGGCACACACGGTCGAACGGCG~TGGCGTGGCA 660
661 TGCCATCTTCGCCTTTTTGCAGCACACACGGTCG.~CGACG~GGGCGTGGCATGCCATC 720
721 TTCGCCCTTTGACAGCATAGACGGTCGGCCGTCG~CGGGCGTGGCATGCCATCTTCGCCC 780
781 TTTGACAGCATAGACGGTCGGCCGTCG~GGGCGTGGCATGCCATCATAGCCCTTGGAC~' 840
841 GCAC,~CGGTCGGCCGTCG~GGACGTGCCTGCACAC~CGGTCGGCCGTGGCCTGCCC 900
901 GCATCGGTCGTGGCTTGCGCAACATTCATCGAGTTCCAAACAAAACATGCGGATGTTCAT 960
961 GGCGTACATAAATCAAAGGATTTTGAAACAACCTCCATGCATAACAAACATATTCATCTA 1020
1021 CTTTCCATTATCTATTCTCAAACGTTTCCGCCTAACGTGGCTCTTTCGCATCATTTTCGT 1080
1081 TACTTTTACGGTTCGTACGATATTGAAACATCTTTTGTTTGTGCAAATATGCATCTTATC 1140
1141 ATTAATTTGACATGTTGAGAAGTGTTTTCGAGCATTTCCATATTTTTCCGACTTTTAATC 1200
1201 ATTATTTTATAATTTATTTTTACGCTTTTTTAATTTTTACGTCTCTTTTTAAAAATTAAA 1260
1261 ATTTATTAAATTTTATATTTTAAGGTTCACATATTTATTTGTGAATTTTCGGA~TTGATT 1320
1321 TCATATTTTTTCGATATTTTCCCTATTTTTTATTAATTTATTACTAATTTTTTGGAATTT 1380
1381 TTGAAAAAAATAAAAATCAAAAAAAATTGTTGAAAAATATTTTTTTATACATATTAAAGT 1440
1441 CAATTATGAAGGCTGATGTGTGTTTGTACCTTAGACCGCGCATATTTGGGTTGTACATTT 1500
1501 TCATTATGATTCTCTGGAAAATCCATGTCTACTCCTGTCACATGGGCAAAACTTTTTTAA 1560
1561 GCATATATAAGGGGGGTAGAGGTGTTGGAGG CAGACTGA0 3CGCAGGCAGGCAGACGGCA 1620
1621 TAGGCGTCCCGTGGGCTTAGCAGGCGTGCTGCGTGGGCGCTTGATGGCATGCATGGCTTG 1680
1681 TCCGTGCTACGCCGTTGGGCGTTTAC~CACGTCGGCGACGTCGACGGGTCGTt~ 1740
1741 GCAACGGCAGGCGGACGCCGAGGGCGTCCTGTGGGCTTAGTAGGCGTGCTGCGTGGGCGC 1800
1801 TTGATGGCATGCATGGCTCGTCCGTGCTACGTCGTTGGGCGTCTACAAAAACATGCTAGC 1860
1861 GACGTTTGCGGGGC~GCGAAGGCAGGCGGACGTCGAGGGCGTCCTGTGGGCTT~ 1920
1921 GTAGGCGTGCTGCGTGGGCGCTTGACGGCATGCATGGCTCGTCCGTGCTACGCCGTTGGG 1980
1981 CGTTTAC~./~CACGCCCGCGACGTCTGTAGGGCG~TTGAC43CGGTGGCAGGCGGACG~ 2040
2041 CATGGGCGTCCTGTGGGCTTAGTAGGTGTGCTGCGTGGGCGCTTGACGGCATGCATGGCT 2100
2101 CGTCCGTGCTACGCCGTTGGGCGTC,AACAAAAACATGCCAGCGACGTCTGCGGGGC~C~ 2160
2161 GAGGCGAAAGCAGGCGGACGTCAAGGGCGTCCTGTGGGCTTAGTAGGCGTGCTGCGTGGG 2220
2221 CGCTTGATGGCATGCATGGCTCGTCCGTGCTACGCCGTTGGGCGCTTGCAAAAACATGTC 2280
2281 GACGACGTCTGCGGGGCG; ~CGAGGCGTTACAAGGCGGATGCCATGGGCGTCCTGTGGGC 2340
2341 TTAGTAGGCGTGCTGCGTGGGAGCTTGATGGCATGCATGGCTCGTCCGTGCTACGCCGTT 2400
2401 GGGCGCTTAC~a~CATGCCAGCGACGTCTGCGGGGCG~CGTGCGCCGCCGAGGG,~.C 2460
2461 TTCTCAAGATCGGTTTTATTATTGCGTTTGGTGTGGAAACGGCAGTGCTTTCGGGCGAGT 2520
2521 GGCGAGTTCTAGAGCTCCTGTTACGGCTAACTCTAGGCGTCGCACGCACGGGGCACGTAA 2580
2581 GGCCATGTACGGCCAGACGCTATGATGGACCGGGCGTGGGCGGTTCCCCTGTGTGAACCT 2640
2641 TGGTCTTCCTCCAACAATCTTTGCAGTGATTAAATTCTCAACTCCCTTGGGCGGCGCGC~ 2700
2701 ACGGCGGGTGTAGCATTGGCCTTGCAAAGAAGGCATCGGCGTCGTCGCACGACATCTAAT 2760
2761 GTCGGGCGGCGGGGTGGATGTCGGGCGTGCATTTCCGGAGCTATTCACGTACGGCGCATG 2820
2821 AGTGGTATTGGGCATGTGTGGTTAGGTTGGATCCCTGCT~CGAGCAGCGACGTCCTAACT 2880
2881 CGCATGCCAACTCGGTGACGGATGAAGCGCAATCTAGGC~GGTCGGACGTCGGAACTTCC 2940
2941 TGTGCTGCATACCTACTGCCTAGGCATTGTGCACGTGCA~ACGGTCGCCTTTCGCCCCTC 3000
3001 GCATCCCATGCGCGGGGTGAACCCAAAAGACGCTCTCGC~TCCCACGCCTTCCCTCGCTT 3060
3061 CGTCGTGCGATGGCGTGGTCCGTGAGCGGCGCCTCGAAT~CTCGGATACGGTAGACGCAG 3120
3121 TGGGCATGGGGCCTTCACCGGCTTCTATCTGCCCAAAAC~AATGCTCCTTGCGAATGACT 3180
3181 GCCGCGCTCGCCTTGGACCCGACCGTGCCCGAAAGGGCGCGCCGGGCTCATGCGGCGCGC 3240
3241 GGCGTCGTTGAGGAATGC 3258

Fig. 2. Sequence of an intergenic spacer of a tomato rRNA gene. The sequence begins at nucleotide 1, the first residue beyond the 3' terminus of the 25S rRNA sequence. Nucleotide 3258 is the last residue of the intergenic spacer and immediately precedes the 5' terminus of the 18S rRNA domain. The presumptive initiation site at nucleotide 1569 and the primary processing site at nucleotide 2691 are marked with arrows. The 53 and 141 bp repeats are boxed

Fig. 3. Dot matrix analysis of the DNA sequence of the intergenic spacer of a tomato rRNA gene. A self comparison of the intergenic spacer sequence is made using the UWGCG 'compare' program set at a stringency of 11 out of 14 identical nucleotides. The output is obtained using the 'dotplot' program and printed with an Apple laserwriter. The two blocks of the 53 and 141 bp repeats are indicated by sets of diagonal lines between positions 390-860 and 1600-2440, respectively, on either axis. The spacing and number of dots running parallel to the diagonal are indicative of the size and number of repeats, respectively. Dots are observed at the coordinates where sequences of the blocks of 53 bp repeats on one axis intersect the sequences of the blocks of 141 bp repeats on the second axis. These dots are indicative of a subrepeat common to both the larger repeats. The cluster of dots between coordinates 1100-1450 corresponds to an A-T rich sequence

[Fig. 4 gel image; markers at 9.5, 7.5, 5.1, 4.4, 4.1, 3.0 and 2.4 kb; labeled positions: origin, 25S, 18S, bands a-d]

Fig. 4A and B. Northern hybridization analysis of tomato rDNA transcription. Two independent tomato RNA preparations (5 µg each) were denatured with glyoxal and electrophoresed on a 1.0% agarose gel. The ethidium-bromide stained gel is shown in A. The migration and size (kb) of molecular weight markers are diagrammed to the left of the gel. RNAs were transferred to a 'Gene Screen Plus' membrane, and hybridized with a nick-translated probe. The ³²P-labeled probe is a 321 nucleotide DNA fragment from the intergenic spacer (nucleotides 2533-2853), located downstream of the 141 bp repeats. An autoradiograph of the washed membrane is shown on the right. The locations of the origin and stained 18S and 25S rRNA bands are indicated. Arrows (a) and (b) indicate the positions of the largest rDNA specific bands. The nature of bands (c) and (d) is discussed in the text

Fig. 5. S1 nuclease protection and primer extension analysis. The intergenic spacer of the tomato rDNA is diagrammed above. The locations of the presumptive transcription initiation site (large arrow) and the primary processing site (small arrow) are indicated directly beneath the intergenic spacer. The relative positions of the radiolabeled probes and products are shown below. For the S1 nuclease protection analysis (a-f), the protected fragments (thicker lines) and the nuclease sensitive regions (thinner lines) of the rDNA probes are indicated. The absence of a thinner line indicates that the fragment was fully protected. The annealing of the probe in (c) to the primary transcript led to a fully protected fragment. The same probe annealed to the processed transcript (c') yielded a partially protected fragment. For the primer extension analysis (g-j), the arrows indicate the lengths of the products extended from their respective primers (bold blocks). The absence of an arrow indicates that no extension product was observed. All molecules are diagrammed to scale except the primers. The precise locations of the probe and product termini are as follows (see numbering in Fig. 2): a, 2854-3100; b, 2283-2532; c, 2283-2853; c', 2283-2853 (probe) and 2691-2853 (product); d, 2283-2610; e, 1729-2282; f, 507-1728 (probe) and 1569-1728 (smallest of three products); g, 2837-2853 (probe) and 2691-2853 (product); h, 2720-2853 (probe) and 2691-2853 (product); i, 2591-2610; j, 1582-1601 (probe) and 1569-1601 (product)

The probes illustrated in Fig. 5 represent 2594 of 3258 nucleotides in the intergenic spacer. All the probes used in the S1 nuclease assays were designed to contain heterologous DNA of vector origin at their 3' ends. This allows gel bands corresponding to rDNA sequences which are fully protected to be differentiated from the more slowly migrating, intact probes that may have escaped digestion by the S1 nuclease.
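The coverage figure can be checked by merging the probe intervals listed in the Fig. 5 legend. This is a minimal sketch: the function name `merge_intervals` is hypothetical, and the coordinates for primers i and j are read here as 2591-2610 and 1582-1601, which is an assumption because those two entries are partly illegible in the legend.

```python
def merge_intervals(intervals):
    """Merge inclusive 1-based intervals that overlap or abut."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


# Probe coordinates from the Fig. 5 legend (i and j are assumed readings).
PROBES = {
    "a": (2854, 3100),
    "b": (2283, 2532),
    "c": (2283, 2853),
    "d": (2283, 2610),
    "e": (1729, 2282),
    "f": (507, 1728),
    "g": (2837, 2853),
    "h": (2720, 2853),
    "i": (2591, 2610),
    "j": (1582, 1601),
}

merged = merge_intervals(PROBES.values())
covered = sum(end - start + 1 for start, end in merged)
print(merged, covered)  # [(507, 3100)] 2594
```

The probes form one contiguous stretch from nucleotide 507 to 3100, which accounts for the 2594 nucleotides cited; the first 506 and last 158 nucleotides of the spacer are not covered by any probe.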

The identification of the primary processing site was made using the probes illustrated as c, g, and h in Fig. 5. The results of the combined S1 nuclease and primer extension experiments are presented in Fig. 6. A product common to both assays was apparent (band e, lanes 3 and 4). This 163 nucleotide band could be aligned with the G residue (nucleotide 2691) in the DNA sequencing ladder, thus indicating that there was an rDNA transcript with a 5' terminus corresponding to this nucleotide in the intergenic spacer. Also evident was a second product (band b, lane 3), a 571 nucleotide S1 nuclease product which corresponded to the fully protected rDNA sequences of the probe. We interpret the two different S1 nuclease products to result from the annealing of the probe to two different rDNA transcripts, one transcript which initiated upstream of nucleotide 2283 (the 5' terminus of the probe) and spanned the whole length of the probe, and a second transcript with a 5' terminus located within the sequences of the DNA probe. This latter RNA is referred to as the 5' end-processed transcript. Additional S1 nuclease products were sometimes observed with the probe described above and with probe d in Fig. 5, but no corresponding products were seen among the primer extension reaction products. No primer extension products (lane 4) could be visualized which were larger than the fully protected rDNA fragment seen in lane 3 (band b), but as described below, primers complementary to sequences further upstream (5') could be extended. The primer extension bands c and d in Fig. 6 are artifactual: one was a contaminant of that particular gel-purified primer preparation, and the other was not observed when a different primer from the same region was used (data not shown).
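The relation between the length of a 5' end-labeled product and the mapped terminus is simple coordinate arithmetic. The sketch below is illustrative only: the function name `five_prime_terminus` is hypothetical, and the coordinates are taken from the Fig. 5 legend (probe c and primer h both end at nucleotide 2853; the primer j product spans 1569-1601, i.e. 33 nucleotides).

```python
def five_prime_terminus(labeled_end: int, product_length: int) -> int:
    """Map a product length to the spacer coordinate of the RNA 5' terminus.

    labeled_end is the spacer coordinate (Fig. 2 numbering) matched by the
    5' end-labeled nucleotide of the probe or primer. The product runs from
    there toward lower coordinates, so its far end marks the RNA 5' terminus.
    """
    return labeled_end - product_length + 1


# 163 nucleotide product shared by probe c (S1 nuclease) and primer h (extension)
print(five_prime_terminus(2853, 163))  # 2691

# 33 nucleotide extension product of primer j
print(five_prime_terminus(1601, 33))  # 1569
```

Both values agree with the product coordinates listed in the Fig. 5 legend.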


Fig. 6. S1 nuclease protection and primer extension analysis at the major processing site in the intergenic spacer. S1 nuclease assays (lanes 1-3) were performed using a 690 nucleotide 5' end-labeled probe representing 576 nucleotides of rDNA and 114 nucleotides of vector DNA (Fig. 5, probe c). The probe was annealed for 16 h at 60 °C to 10 µg of tomato RNA (lanes 1 and 3) or fungal RNA as a control (lane 2). Samples were electrophoresed on an 8% polyacrylamide-8 M urea sequencing gel. The treatments were either no S1 nuclease (lane 1) or 400 units of S1 nuclease (lanes 2 and 3). In lane 4 the products of a primer extension reaction are shown; the primer used has the same 5' terminus as the probe in the S1 nuclease reactions. This 134 nucleotide 5' end-labeled primer (Fig. 5, probe h) was annealed to 10 µg of tomato RNA and extended with AMV reverse transcriptase. Lanes 5-8 are the T, G, C, and A lanes respectively of DNA sequencing reactions using a phosphorylated primer with the same 5' terminus as the probes used in lanes 1-4. The photo on the right is an enlargement of the same autoradiograph with the nucleotide sequence indicated. The asterisk marks the 5' terminal nucleotide at the processing site. Arrows denote the following bands: (a, lane 1) the 690 nucleotide probe (both strands of the DNA fragment are labeled and migrate as a doublet); (b, lane 3) the fully protected, 576 nucleotide, rDNA specific product; (c and d, lane 4) artifactual bands not consistently observed (see text); and (e, lanes 3 and 4) the 163 nucleotide primer extension product and S1 nuclease protected fragment. The origin of the gel and the unextended 134 nucleotide primer are not shown

In order to identify the 5' termini of transcripts originating further upstream in the intergenic spacer, two additional probes (Fig. 5, e and f) were employed in S1 nuclease protection assays. Probe e remained fully protected (data not shown) while the use of probe f gave rise to a cluster of products (Fig. 7, arrow, lanes 3-8). Increasing the amount of S1 nuclease from 40 units (Fig. 7, lanes 4 and 7) to 400 units (Fig. 7, lanes 3 and 6) reduced the size of the products by approximately one nucleotide, but did not lead to degradation of the DNA-RNA duplex. A similar pattern of multiple bands was observed in the S1 nuclease analysis of the Xenopus 40S rDNA transcript (Sollner-Webb and Reeder 1979). The lower band of the cluster shown in Fig. 7 (most easily visible in lanes 6 and 8 of the right panel) could be aligned with an A residue (nucleotide 1569) in the DNA sequencing ladder, and thus corresponds to the 5' terminus of the protected fragment. The RNA whose 5' terminus was established in this S1 nuclease assay was the only other rDNA specific transcript observed in our experiments. For reasons described below, this 5' terminus is referred to as the presumptive initiation site and the transcript as the primary transcript.

The annealing reactions for the treatments shown in Fig. 7, lanes 5 and 8 were modified by lowering the temperature of hybridization to facilitate the annealing of regions of low G+C content. The lowered temperature reproducibly gave rise to a number of smaller products in addition to those described above. The exact origin of these bands is not known, but they may have arisen from a probe annealed to the transcript of an rDNA unit which differed slightly in sequence from the cloned rDNA.

Fig. 7. S1 nuclease protection assay in the region of the presumptive transcription initiation site. The 2.65 kb 5' end-labeled probe used in all lanes represents 1.42 kb of vector DNA and 1.23 kb of rDNA (Fig. 5, probe f). The probe was annealed to 10 µg of tomato RNA for all treatments except the tRNA control (lane 2). The treatments were: lane 1, no S1 nuclease; lanes 2, 3, 5, 6, and 8, 400 units of S1 nuclease; lanes 4 and 7, 40 units of S1 nuclease. The treatments in lanes 6-8 were a repetition of those in lanes 3-5 using an independent preparation of RNA. The RNAs and probe were annealed for 16 h at 60 °C. For treatments in lanes 5 and 8, the annealing reactions were extended for an additional 3 h at 45 °C. Lanes 9-12 are the T, G, C, and A lanes respectively of DNA sequencing reactions using a 5' end-labeled primer with the same 5' terminus as the probe used in lanes 1-8. The products were electrophoresed on an 8% polyacrylamide sequencing gel. An arrow indicates the position of S1 nuclease-resistant products. The photographs on the right are an enlargement of part of the autoradiograph shown on the left. In order to visualize the position of the S1 nuclease products relative to the DNA sequencing ladder, a shorter exposure of the autoradiograph of the DNA sequencing ladder was placed alongside the photograph of the S1 nuclease products (right panel). This was facilitated by register marks from the T lane in the longer exposure. The nucleotide sequence is indicated alongside the autoradiograph and the asterisk denotes the nucleotide which aligns with the lower band common to lanes 6 and 8

To confirm the results of the S1 nuclease assay, a primer complementary to the sequence downstream of the primary initiation site was synthesized (Fig. 5, probe j). Following annealing of this primer to tomato RNA, extension with reverse transcriptase gave rise to a 33 nucleotide product (Fig. 8, lanes 2 and 3) which co-migrated with the A residue of a DNA sequencing ladder (Fig. 8, lanes 4-7). The 5' end-labeled primer used in the primer extension reaction was also used to generate the DNA sequencing ladder, thus assuring the identity of the co-migrating primer extension and DNA sequencing reaction products. The 5' terminus of the primary transcript identified by primer extension analysis, nucleotide 1569, is in agreement with the results of the S1 nuclease protection assay.

Intergenic spacer sequences conserved among plants

A comparison of the putative promoter domain of tomato with that of radish revealed the presence of conserved sequence elements. rDNA transcriptional control elements have been defined in a number of eukaryotic systems using RNA polymerase I assays. The rDNA promoter spans a region from about -160 to +5 relative to the transcription initiation site at +1 (Sollner-Webb and Tower 1986). Therefore, we compared a 200 nucleotide stretch of the tomato and radish rDNAs from -180 to +20. The greatest homology lies at the putative initiation site, where 10 of 11 nucleotides are identical (Fig. 9A (boxed), 9B). The relatedness of these putative promoter domains is also suggested by the series of 4 to 6 nucleotide identities (underlined in Fig. 9A) arranged in a sequential fashion with only two gaps. The same type of comparison made between tomato and maize, a more distantly related plant, did not reveal a similar pattern of colinear identities, although, as described below, sequence motifs surrounding the putative initiation sites were observed.

Fig. 8. Identification of the 5' terminal nucleotide of the primary rDNA transcript. Lanes 1-3 contain the products of primer extension reactions. A 20 nucleotide, 32P 5' end-labeled primer (Fig. 5, probe j) was combined with 10 µg of tomato RNA and the reactions carried out either in the presence of 10 units of AMV reverse transcriptase (lanes 2 and 3), or without the addition of enzyme (lane 1). In lane 3, the reaction contained a starting equivalent of 2 x 10^5 cpm of radiolabeled primer, an amount five fold greater than that used for the reactions in lanes 1 and 2. Arrows mark the positions of the 33 nucleotide extension product (a) and the unextended primer at the bottom of the gel (b). Lanes 4-7 are the A, C, G, and T lanes, respectively, of DNA sequencing reactions in which the same end-labeled primer used in the extension reactions was annealed to a cloned rDNA template. The nucleotide sequence (TATATAAGGG) is indicated alongside the autoradiograph and the asterisk denotes the nucleotide which aligns with the primer extension product

The 5' terminus of the primary transcript in tomato resembles sequences mapped in maize (Toloczyki and Feix 1986; McMullen et al. 1986), radish (Delcasso-Tremousaygue et al. 1988) and wheat (unpublished, cited in Lassner et al. 1987; Barker et al. 1988), and observed within the intergenic spacer of rye (Appels et al. 1986), mung bean (Gerstner et al. 1988), and cucumber (cited in Gerstner et al. 1988) (Fig. 9B). These putative RNA polymerase I transcription initiation sites of plants could best be described by the consensus TATA(G)TA(N)GGG. The spacing of nucleotides is not entirely identical, with single non-conserved insertions or deletions (indicated by parentheses).
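The consensus can be tested against the seven sequences of Fig. 9B with a regular expression. This is a sketch under one assumption: each parenthesized base is read as an optional single-nucleotide insertion, so that TATA(G)TA(N)GGG becomes `TATAG?TA[ACGT]?GGG`. The names `FIG_9B` and `CONSENSUS` are introduced here for illustration, and the sequences are copied from Fig. 9B with alignment gaps and flanking dots removed.

```python
import re

# Sequences around +1 as listed in Fig. 9B (gaps and flanking dots removed).
FIG_9B = {
    "tomato": "TATATAAGGGGG",
    "radish": "TATATAGGGGGT",
    "cucumber": "TATATAGGGGGG",
    "mung bean": "TATATATGGGGG",
    "maize": "CAGGTATAGTAGGGGGTAGGG",
    "wheat": "CGGGTATAGTAGGGAGGAGGG",
    "rye": "CGGGTATAGTAGGGAGGAGGG",
}

# TATA(G)TA(N)GGG, with each parenthesized base treated as optional (assumption).
CONSENSUS = re.compile(r"TATAG?TA[ACGT]?GGG")

for species, seq in FIG_9B.items():
    hit = CONSENSUS.search(seq)
    print(f"{species:10s} {hit.group(0) if hit else 'no match'}")
```

With this reading all seven sequences contain the motif. The optional G is used only by the three cereal sequences (maize, wheat, rye), which matches the TATA position of -6 to -3 in those species and -5 to -2 in the others.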

[Fig. 9A: alignment of the tomato (upper) and radish (lower) sequences, nucleotides -180 to +20]

B                      +1
tomato     ....TATA-T  A  AGGGGG....
radish     ....TATA-T  A  GGGGGT....
cucumber   ....TATA-T  A  GGGGGG....
mung bean  ....TATA-T  A  TGGGGG....
maize      CAGGTATAGT  A  GGGGGTAGGG
wheat      CGGGTATAGT  A  GGGAGGAGGG
rye        CGGGTATAGT  A  GGGAGGAGGG

Fig. 9A and B. Conserved plant sequences at the putative transcription initiation site. A The sequence of the putative transcription initiation site in tomato (upper) is aligned with the corresponding sequence from radish (lower) (Delcasso-Tremousaygue et al. 1988). A line between the two sequences indicates conservation of nucleotides. Groups of four or more identical nucleotides are underlined and the homology at the putative transcription initiation site is boxed. The sequences represent nucleotides -180 to +20 relative to the first nucleotide of the transcript at +1. Position -180 in this figure corresponds to nucleotide 1389 in the intergenic spacer of tomato. B Sequences surrounding the A residue (+1) at the putative transcription initiation site are shown. The initiation sites have been mapped by primer extension and/or S1 nuclease assays in tomato (this work), radish (Delcasso-Tremousaygue et al. 1988), maize (Toloczyki and Feix 1986; McMullen et al. 1986), and wheat (unpublished, cited in Lassner et al. 1987 and Barker et al. 1988). The sequences from cucumber (cited in Gerstner et al. 1988), mung bean (Gerstner et al. 1988), and rye (Appels et al. 1986) are aligned by homology

As found in all eukaryotic RNA polymerase I promoters reported, the first nucleotide of the plant transcripts is a purine (almost always an A) and this is preceded by a T residue (Gerstner et al. 1988; Sollner-Webb and Tower 1986). The TATA motif at position -5(-6) to -2(-3) (Fig. 9B) was also found in the human (Financsek et al. 1982), rat (Rothblum et al. 1982), and Acanthamoeba (Kownin et al. 1985) promoters. The string of three or more G residues beginning at position +2 or +3 is common to all the plant systems and also can be found at the Xenopus transcription initiation site (Sollner-Webb et al. 1983).

[Fig. 10A: alignment of the tomato (upper) and radish (lower) sequences, tomato nucleotides 2691-3258]

B
tomato   --ACGTACGGCGCATGAGTGGTA-TTGG
radish   --ACGTTCGGTGCATGAGTGGTAATTGG
maize    GGCTACGTGGCGCATGAGTTGTC-TTGG
wheat    GGCTACTAAGCGCATGAGTAGCT-TTGG
rye      GGCTACTAAGCGCATGAGCAGCT-TTGG

Fig. 10A and B. Conserved plant sequences downstream from the primary processing site. A The tomato sequence from the primary processing site at nucleotide 2691 to the 3' end of the intergenic spacer is aligned with the corresponding region from radish (lower). A line between the two sequences indicates conservation of nucleotides. The regions of greatest homology are boxed. B Conserved sequences from plant intergenic spacers corresponding to the first boxed domain in A. The underlined nucleotides are conserved in all the species. The sequences are from: Delcasso-Tremousaygue et al. (1988) (radish); Toloczyki and Feix (1986) and McMullen et al. (1986) (maize); Barker et al. (1988) (wheat); and Appels et al. (1986) (rye)


Sequences within the transcribed spacer 3' to the processing site at nucleotide 2691 are also conserved between tomato and radish (Fig. 10A, B). The most extensive homology is a stretch of 23 out of 25 nucleotides with one gap (Fig. 10B). These sequences are also relatively conserved in wheat, rye and maize, with seven identical nucleotides embedded in a stretch of up to 27 similar nucleotides (underlined in Fig. 10B). Sequences from the intergenic spacer are rarely conserved in such widely divergent species.
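The "23 out of 25 nucleotides with one gap" figure can be reproduced by counting identities over the first two rows of Fig. 10B. The sketch below is illustrative: the function name `pairwise_identity` is hypothetical, and the assignment of the first two rows to tomato and radish follows the label order of the figure, which is an assumption about the extracted layout.

```python
def pairwise_identity(row_a: str, row_b: str, gap: str = "-"):
    """Return (identical, compared, gaps) for two rows of an alignment.

    Columns where both rows hold a gap are ignored. Columns where only one
    row holds a gap are counted as gaps and are not compared.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identical = compared = gaps = 0
    for a, b in zip(row_a, row_b):
        if a == gap and b == gap:
            continue
        if a == gap or b == gap:
            gaps += 1
            continue
        compared += 1
        identical += int(a == b)
    return identical, compared, gaps


# Rows copied from Fig. 10B (tomato/radish assignment assumed from label order).
TOMATO = "--ACGTACGGCGCATGAGTGGTA-TTGG"
RADISH = "--ACGTTCGGTGCATGAGTGGTAATTGG"

print(pairwise_identity(TOMATO, RADISH))  # (23, 25, 1)
```

The two mismatches are an A/T and a C/T difference near the 5' end of the domain; the single gap is the extra A in the radish row.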

Additional sequences from tomato are found in common with other plant rDNAs and are located immediately 3' to the 25S and 5' to the 18S rRNA domains. A feature common to many eukaryotic rDNAs is the presence of a pyrimidine-rich stretch following the 3' terminus of the 25S rRNA domain. Of 10 nucleotides adjacent to the 25S rRNA domain 9 are C residues in tomato (nucleotides 1-10), radish (Delcasso-Tremousaygue et al. 1988), maize (Toloczyki and Feix 1986; McMullen et al. 1986), wheat (Lassner et al. 1987; Barker et al. 1988), and rye (Appels et al. 1986). These C residues are part of a 20 nucleotide stretch of pyrimidine-rich sequence. These same species exhibit an additional conservation of sequence in the intergenic spacer immediately upstream of the 18S rRNA domain. The trinucleotide TGC adjacent to the 5' end of the 18S rRNA domain is common to all five species, and the sequences are similar for up to 10 nucleotides.

Discussion

Based on the conserved patterns of organization and transcription of eukaryotic rDNA, we believe the 5' termini of the pre-rRNAs identified correspond to the initiation site and the processing site for transcripts originating in the intergenic spacer. RNAs with these termini are diagrammed in the lower part of Fig. 1A below the rDNA template. Sequence data allow an accurate prediction of the sizes of these transcripts. The data in Fig. 2 show that the primary transcript and the 5' end-processed transcript have 1690 and 568 nucleotides, respectively, of transcribed spacer at their 5' ends. The entire tomato rDNA outside of the intergenic spacer has been sequenced by Kiss et al. (1988, 1989a, b) and consists of 5778 nucleotides. Transcripts with 3' ends near the 3' ends of the 25S rRNA would thus contain approximately 7.47 and 6.35 kb if they originated at the mapped 5' termini. These sizes are in good agreement with those estimated for the transcripts detected here by Northern hybridization analysis. As indicated in Fig. 1A, the precise 3' ends of the transcripts have not been determined.
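The size prediction follows directly from the mapped termini and the two sequence lengths quoted above. The sketch below is illustrative and assumes, for the purpose of the arithmetic only, that each transcript ends exactly at the 3' end of the 25S rRNA domain; the constant and function names are introduced here and are not from the original work.

```python
SPACER_LENGTH = 3258      # intergenic spacer, nucleotides
NON_SPACER_LENGTH = 5778  # rDNA outside the spacer (Kiss et al. 1988, 1989a, b)


def predicted_transcript_length(terminus: int) -> int:
    """Length of a transcript starting at `terminus` in the spacer.

    Assumes the 3' end coincides with the 3' end of the 25S rRNA domain.
    """
    transcribed_spacer = SPACER_LENGTH - terminus + 1
    return transcribed_spacer + NON_SPACER_LENGTH


for name, terminus in [("primary", 1569), ("5' end-processed", 2691)]:
    print(name, predicted_transcript_length(terminus))
# primary 7468
# 5' end-processed 6346
```

These values correspond to the 7.47 and 6.35 kb quoted in the text, and the two lengths together (3258 + 5778 = 9036 nucleotides) account for the 9.04 kb repeat unit.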

The identification of the mapped 5' transcript terminus as a putative rDNA transcription initiation site in tomato is based on three criteria. First and most compelling, this site and other putative transcription initiation sites in plants resemble confirmed initiation sites in animal systems (Toloczyki and Feix 1986; Delcasso-Tremousaygue et al. 1988; Gerstner et al. 1988). Second, the putative initiation site is characterized by a conserved sequence motif present in the seven plant species for which sequences are available. And lastly, in both tomato and maize, the putative transcription initiation site corresponds to the 5' end of the transcript which maps furthest upstream (5') relative to the 18S rRNA domain (Toloczyki and Feix 1986; McMullen et al. 1986). In the case of radish, the last argument is complicated by the detection of one or two transcripts originating further upstream (5') in the intergenic spacer (Delcasso-Tremousaygue et al. 1988). It was suggested that, in radish, transcription may initiate at more than one site. The results from our work on tomato support the contention that transcription initiates at the TATA(G)TA(N)GGG motif, but a transcription assay is required to substantiate this claim.

Our finding of two major 5' termini in transcripts of the intergenic spacer is similar to results obtained in maize (Toloczyki and Feix 1986; McMullen et al. 1986). The downstream terminus is assumed to result from a processing event, as is observed in mouse (Miller and Sollner-Webb 1981) and is suggested to occur in both maize and radish (Toloczyki and Feix 1986; McMullen et al. 1986; Delcasso-Tremousaygue et al. 1988). In mouse, the 250 nucleotides 3' to the processing site are necessary and sufficient for the specificity of the processing event (Craig et al. 1987). A comparison of the tomato sequences 3' to the processing site with the corresponding regions of maize (Toloczyki and Feix 1986; McMullen et al. 1986) and radish (Delcasso-Tremousaygue et al. 1988) reveals the presence of a conserved 25 nucleotide stretch (Fig. 10A, B). If processing of RNA from the transcribed spacer of plants occurs in a fashion analogous to that observed in mouse (Craig et al. 1987), then this conserved region may be involved in processing. These sequences appear to be under selection, which contrasts with other regions of the intergenic spacer of plants which have rapidly diverged. This conservation has been observed in all the published plant sequences examined except mung bean (Gerstner et al. 1988).

Transcription of tomato rDNA appears to originate at a single site and proceed through more than half of the intergenic spacer, including 840 bp of the 141 bp repeats. It is unknown to what extent the spacer DNA 3' to the 25S domain is transcribed. Work with the rDNA of Xenopus (Labhart and Reeder 1986; De Winter and Moss 1986) and Drosophila (Tautz and Dover 1986) suggests that a majority of the intergenic spacer is transcribed, and in Xenopus, transcription terminates at a site immediately upstream of the initiation site. We did not see any evidence for transcription further upstream of nucleotide 1569, but our probes did not cover the region immediately 3' to the 25S rRNA domain.

Although rDNA repeats analogous to the 53 bp and 141 bp repeats of tomato are ubiquitous in plants, their function remains unknown. The most extensive of these repeats are usually located upstream (5') of the putative transcription initiation site; the large array of 141 bp repeats downstream (3') of this site is unusual. Furthermore, the putative transcription initiation site is located 600-1000 bp further upstream (5') of the 18S rRNA domain than observed in other plants (Rogers and Bendich 1987b). By analogy to Xenopus where 60/81 bp repeats have been shown to function as enhancers (Moss 1983; Busby and Reeder 1983; Labhart and Reeder 1984), plant rDNA repeat elements have been hypothesized to play a role in the regulation of transcription (Appels et al. 1986; Flavell et al. 1986; McMullen et al. 1986; Toloczyki and Feix 1986; Rogers and Bendich 1987b). A fundamental distinction to be made between these systems is that the Xenopus repeats contain a 42 bp imperfect copy of a part of the rDNA promoter (Reeder 1984). No such remarkable similarities have been observed between plant repeat elements and sequences upstream (5') of the putative transcription initiation sites. Clearly, a greater understanding of the significance of these rDNA repeats in plants will follow the development of a functional assay for RNA polymerase I mediated gene expression.

Acknowledgements. This work was supported by a grant from the McKnight Foundation.

Note added in proof

After this paper was submitted a paper containing the "nucleotide sequence of the intergenic spacer (IGS) of the tomato ribosomal DNA" was published by Schmidt-Puchta W, Günther I, Sänger HL in Plant Mol Biol 13:251-253 (1989). These authors also observed 53 bp and 141 bp repeats, separated by an AT-rich region.

References

Appels R, Dvořák J (1982) The wheat ribosomal DNA spacer region: its structure and variation in populations and among species. Theor Appl Genet 63:337-348

Appels R, Moran L, Gustafson J (1986) The structure of DNA from the rye (Secale cereale) NOR R1 locus and its behaviour in wheat backgrounds. Can J Genet Cytol 28:673-685

Barker RF, Harberd NP, Jarvis MG, Flavell RB (1988) Structure and evolution of the intergenic region in a ribosomal DNA repeat unit of wheat. J Mol Biol 201:1-17

Bendich A, Anderson S, Ward BL (1979) Plant DNA: long, pure and simple. In: Leaver CJ (ed) Genome organization and expression in plants. Plenum Press, New York, pp 31-33

Berk AJ, Sharp PA (1977) Sizing and mapping of early adenovirus mRNAs by gel electrophoresis of S1 endonuclease-digested hybrids. Cell 12:721-732

Bolivar F, Backman K (1979) Plasmids of Escherichia coli as cloning vectors. Methods Enzymol 68:245-267

Botchan PM, Dayton AI (1982) A specific replication origin in the chromosomal rDNA of Lytechinus variegatus. Nature 299:453-456

Busby SJ, Reeder RH (1983) Spacer sequences regulate transcription of ribosomal gene plasmids injected into Xenopus embryos. Cell 34:989-996

Calzone FJ, Britten RJ, Davidson EH (1987) Mapping of gene transcripts by nuclease protection assays and cDNA primer extension. Methods Enzymol 152:611-632

Chirgwin JM, Przybyla AE, MacDonald RJ, Rutter WJ (1979) Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18:5294-5299

Craig N, Kass S, Sollner-Webb B (1987) Nucleotide sequence determining the first cleavage site in the processing of mouse precursor rRNA. Proc Natl Acad Sci USA 84:629-633

Delcasso-Tremousaygue D, Grellet F, Panabieres F, Ananiev ED, Delseny M (1988) Structural and transcriptional characterization of the external spacer of ribosomal RNA nuclear gene from a higher plant. Eur J Biochem 172:767-776

Devereux J, Haeberli P, Smithies O (1984) A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res 12:387-395

De Winter RFJ, Moss T (1986) Spacer promoters are essential for efficient enhancement of X. laevis ribosomal transcription. Cell 44:313-318

Dvořák J, Appels R (1982) Chromosome and nucleotide sequence differentiation in genomes of polyploid Triticum species. Theor Appl Genet 63:349-360

Feinberg AP, Vogelstein B (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal Biochem 132:6-13

Financsek I, Mizumoto K, Mishima Y, Muramatsu M (1982) Human ribosomal RNA gene: Nucleotide sequence of the transcription initiation region and comparison of three mammalian genes. Proc Natl Acad Sci USA 79:3092-3096

Flavell RB, O'Dell M, Thompson WF, Vincentz M, Sardana R, Barker RF (1986) The differential expression of ribosomal RNA genes. Phil Trans R Soc Lond B 314:385-397

Gerstner J, Schiebel K, Waldburg G, Hemleben V (1988) Complex organization of the length heterogeneous 5' external spacer of mung bean (Vigna radiata) ribosomal DNA. Genome 30:723-733

Goelet P, Lomonossoff GP, Butler PJG, Akam ME, Gait MJ, Karn J (1982) Nucleotide sequence of tobacco mosaic virus RNA. Proc Natl Acad Sci USA 79:5818-5822

Gould AR, Symons RH (1977) Determination of the sequence homology between the four RNA species of cucumber mosaic virus by hybridization analysis with complementary DNA. Nucleic Acids Res 11:3787-3802

Grummt I, Maier U, Öhrlein A, Hassouna N, Bachellerie J (1985) Transcription of mouse rDNA terminates downstream of the 3' end of 28S RNA and involves interaction of factors with repeated sequences in the 3' spacer. Cell 43:801-810

Ingle J, Timmis JN, Sinclair J (1975) The relationship between satellite DNA, ribosomal RNA gene redundancy, and genome size in plants. Plant Physiol 55:496-501

Jorgensen RA, Cuellar RE, Thompson WF, Kavanagh TA (1987) Structure and variation in ribosomal RNA genes of pea. Plant Mol Biol 8:3-12

Kingston RE (1987) Primer extension. In: Ausubel FM (ed) Current protocols in molecular biology. J Wiley & Sons, NY, pp 481-483

Kiss T, Kis M, Abel S, Solymosy F (1988) Nucleotide sequence of the 17S-25S spacer region from tomato rDNA. Nucleic Acids Res 16:7179

Kiss T, Kis M, Solymosy F (1989a) Nucleotide sequence of a 25S rRNA gene from tomato. Nucleic Acids Res 17:796

Kiss T, Szkukalek A, Solymosy F (1989b) Nucleotide sequence of a 17S (18S) rRNA gene from tomato. Nucleic Acids Res 17:2127

Kownin P, Iida CT, Brown-Shimer S, Paule MR (1985) The ribosomal RNA promoter of Acanthamoeba castellanii determined by transcription in cell-free system. Nucleic Acids Res 13:6237-6247

Labhart P, Reeder RH (1984) Enhancer-like properties of the 60/81 bp elements in the ribosomal gene spacer of Xenopus laevis. Cell 37:285-289

Labhart P, Reeder RH (1986) Characterization of three sites of RNA 3' end formation in the Xenopus ribosomal gene spacer. Cell 45:431-443

Lassner M, Dvořák J (1986) Preferential homogenization between adjacent and alternate subrepeats in wheat rDNA. Nucleic Acids Res 14:5499-5512

Lassner M, Anderson O, Dvořák J (1987) Hypervariation associated with a 12-nucleotide direct repeat and inferences on intergenomic homogenization of ribosomal RNA gene spacers based on the DNA sequence of a clone from the wheat Nor-D3 locus. Genome 29:770-781

Long EO, Dawid IB (1980) Repeated genes in eukaryotes. Annu Rev Biochem 49:727-764

Maniatis T, Fritsch E, Sambrook J (1982) Molecular cloning. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York

McMaster GK, Carmichael GG (1977) Analysis of single- and double-stranded nucleic acids on polyacrylamide and agarose gels by using glyoxal and acridine orange. Proc Natl Acad Sci USA 74:4835-4838

McMullen MD, Hunter B, Phillips RL, Rubenstein I (1986) The structure of the maize ribosomal DNA spacer region. Nucleic Acids Res 14:4953-4968

Mierendorf RC, Pfeffer D (1987) Direct sequencing of denatured plasmid DNA. Methods Enzymol 152:556-562

Miller KG, Sollner-Webb B (1981) Transcription of mouse rRNA genes by RNA polymerase I: in vitro and in vivo initiation and processing sites. Cell 27:165-174

Mizusawa S, Nishimura S, Seela F (1986) Improvement of the dideoxy chain termination method of DNA sequencing by use of deoxy-7-deazaguanosine triphosphate in place of dGTP. Nucleic Acids Res 14:1319-1324


Moss T (1983) A transcriptional function for the repetitive ribosomal spacer in Xenopus laevis. Nature 302:223-228

Norrander J, Kempe T, Messing J (1983) Construction of improved M13 vectors using oligonucleotide-directed mutagenesis. Gene 26:101-106

Perry KL (1989) Transcription of tomato ribosomal DNA and the organization of the intergenic spacer. Ph D thesis, Cornell University, Ithaca, New York

Reeder RH (1984) Enhancers and ribosomal gene spacers. Cell 38:349-351

Rogers SO, Bendich AJ (1987a) Heritability and variability in ribosomal RNA genes of Vicia faba. Genetics 117:285-295

Rogers SO, Bendich AJ (1987b) Ribosomal RNA genes in plants: variability in copy number and in the intergenic spacer. Plant Mol Biol 9:509-520

Rothblum L, Reddy R, Cassidy B (1982) Transcription initiation site of rat ribosomal DNA. Nucleic Acids Res 10:7345-7362

Saghai-Maroof MA, Soliman KM, Jorgensen RA, Allard RW (1984) Ribosomal DNA spacer-length polymorphisms in barley: Mendelian inheritance, chromosomal location, and population dynamics. Proc Natl Acad Sci USA 81:8014-8018

Sanger F, Coulson AR, Barrell BG, Smith AJH, Roe BA (1980) Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing. J Mol Biol 143:161-178

Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467

Sollner-Webb B, Reeder RH (1979) The nucleotide sequence of the initiation and termination sites for ribosomal RNA transcription in X. laevis. Cell 18:485-499

Sollner-Webb B, Wilkinson JA, Roan J, Reeder RH (1983) Nested control regions promote Xenopus ribosomal RNA synthesis by RNA polymerase I. Cell 35:199-206

Sollner-Webb B, Tower J (1986) Transcription of cloned eukaryotic ribosomal RNA genes. Annu Rev Biochem 55:801-830

Sures I, Crippa M (1984) Xenopsin: the neurotensin-like octapeptide from Xenopus skin at the carboxyl terminus of its precursor. Proc Natl Acad Sci USA 81:380-384

Tautz D, Dover GA (1986) Transcription of the tandem array of ribosomal DNA in Drosophila melanogaster does not terminate at any fixed point. EMBO J 5:1267-1273

Taylor JM, Illmensee R, Summers J (1976) Efficient transcription of RNA into DNA by avian sarcoma virus polymerase. Biochim Biophys Acta 442:324-330

Toloczyki C, Feix G (1986) Occurrence of 9 homologous repeat units in the external spacer region of a nuclear maize rRNA gene unit. Nucleic Acids Res 14:4969-4986

Van't Hof J, Hernandez P, Bjerknes CA, Lamm SS (1987a) Location of the replication origin in the 9-kb repeat size class of rDNA in pea (Pisum sativum). Plant Mol Biol 9:87-95

Van't Hof J, Lamm SS, Bjerknes CA (1987b) Detection of replication initiation by a replicon family in DNA of synchronized pea (Pisum sativum) root cells using benzoylated naphthoylated DEAE-cellulose chromatography. Plant Mol Biol 9:77-86

Vieira J, Messing J (1987) Production of single-stranded plasmid DNA. Methods Enzymol 153:3-11

Walbot V, Cullis CA (1985) Rapid genome changes in higher plants. Annu Rev Plant Physiol 36:367-396

Yakura K, Kato A, Tanifuji S (1984) Length heterogeneity of the large spacer of Vicia faba rDNA is due to the differing number of a 325 bp repetitive elements. Mol Gen Genet 193:400-405

Yanisch-Perron C, Vieira J, Messing J (1985) Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33:103-119

Communicated by E. Meyerowitz

Received August 10, 1989