5
Proc. Nati. Acad. Sci. USA Vol. 80, pp. 5271-5275, September 1983 Biochemistry Sequence coding for the alphavirus nonstructural proteins is interrupted by an opal termination codon (Sindbis virus/nucleotide sequence/Middelburg virus) ELLEN G. STRAUSS, CHARLES M. RICE, AND JAMES H. STRAUSS Division of Biology, California Institute of Technology, Pasadena, California 91125 Communicated by Ray D. Owen, May 31, 1983 ABSTRACT We have obtained the nucleotide sequence of the genomic RNAs of two alphaviruses, Sindbis virus and Middelburg virus, over an extensive region encoding the nonstructural (repli- case) proteins. In both viruses in an equivalent position an opal (UGA) termination codon punctuates a long otherwise open read- ing frame. The nonstructural proteins are translated as polypro- tein precursors that are processed by posttranslational cleavage into four polypeptide chains; the sequence data presented here indicate that the COOH-terminal polypeptide, ns72, may be pro- duced by read-through of this opal codon. The high degree of amino acid homology between the ns72 polypeptides of the two viruses, in contrast to the lack of conserved sequence upstream from the read-through site, suggests that ns72 plays an important role in viral replication, possibly modulating the action of other replicase components. The alphavirus genus of the family Togaviridae contains about 20 members, many of which are important human or veterinary pathogens. These viruses contain an infectious -RNA of about 11,700 nucleotides, which sediments at 49 S (reviewed in ref. 1). In the virus this RNA is surrounded by a single species of basic protein, the capsid protein, to form a nucleocapsid with cubic symmetry. As the final event in maturation this capsid buds through the plasma membrane of the infected cell and acquires an envelope containing lipids of host cell origin and two virus-encoded transmembrane glycoproteins, termed E2 and El. The alphaviruses produce two messenger RNAs after infec- tion (1). The first, 49S RNA, appears to be identical to the RNA in the virion and is translated as polyproteins that are processed to form the nonstructural proteins (2) required to replicate the virus RNA and to transcribe a subgenomic 26S RNA. This 26S RNA is the major messenger found on host cell polysomes and is translated into the structural proteins found in virions. The sequences of the 26 RNAs of three alphaviruses, Semliki Forest virus (3, 4), Sindbis virus (5), and Ross River virus (6), have been determined recently and the details of the encoded proteins established. We have previously reported that the translation of the nonstructural polyproteins from alphavirus 49S RNA ter- minates with multiple in-phase stop codons positioned about two-thirds of the distance from the 5' end (7). In this com- munication we report that translation of the nonstructural pro- teins is modulated by the presence of an opal codon within the RNA. MATERIALS AND METHODS Preparation of RNA. Growth of Sindbis virus or Middel- burg virus and preparation of 49S RNA from infected cells or from purified virions have been described (8). Sequence of Sindbis RNA. Sindbis 49S RNA was transcribed into DNA by using reverse transcriptase and random priming as described (9). The single-stranded (ss) cDNA was cut with restriction enzymes Hae III or Taq I and the resulting frag- ments were subjected to sequence analysis by the chemical methods of Maxam and Gilbert (10), as described (9). All but one of the Hae III fragments longer than 30 nucleotides and about 30 Taq I fragments that contained non-26S sequence were subjected to sequence analysis. At this stage, together with the sequence of the 26S RNA (5), about 90% of the 49S sequence had been obtained. These sequences were aligned by restric- tion enzyme analysis of double-stranded (ds) cDNA made to 49S RNA by using infrequently cutting enzymes such as EcoRI (6). At this point fragments encompassing the entire 49S se- quence could be aligned but the sequence still possessed gaps and ambiguities due to compression artifacts. Using this infor- mation to devise a cloning strategy, we obtained the entire ge- nome as four nonoverlapping clones in a plasmid derivative of pBR322 (unpublished data). Cloned DNA was used to se- quence selected regions of the genome by the chemical method. A detailed description of the methods used and the entire se- quence of the Sindbis 49S RNA will be presented elsewhere. Sequence Analysis of Middelburg RNA. A large number of Hae III fragments from ss cDNA made to Middelburg 49S RNA were subjected to sequence analysis as previously described (9). These sequences were translated in all three reading frames and the deduced amino acid sequences were compared to those of Sindbis virus by using computer homology routines. In this way most of the sequences of the Middelburg fragments ob- tained could be positioned relative to the Sindbis 49S RNA se- quence and a partial restriction map was generated by com- puter. Long ds cDNA was cut with infrequently cutting enzymes and the resulting fragments were subjected to sequence anal- ysis to confirm the alignment and obtain additional sequence. Only two EcoRI sites were found to be present in Middelburg RNA, bracketing the region containing the stop codon. ds Mid- delburg cDNA was cut with EcoRI and a 2.5-kilobase fragment was cloned into the EcoRI site of a plasmid derived originally from pBR322. This cloned fragment was subjected to sequence analysis in its entirety. A sequence analysis schematic for both viruses is shown in Fig. 1. RESULTS AND DISCUSSION During sequence analysis of Sindbis 49S RNA we discovered the presence of an opal stop codon within the region encoding the nonstructural proteins. A simple map of the Sindbis ge- nome (Fig. 2) illustrates the location of this opal codon in re- lation to the regions encoding the nonstructural proteins as well as the structural polypeptides of the virion. To demonstrate the generality of an open reading frame for the nonstructural pro- Abbreviations: ss, single-stranded; ds, double-stranded. 5271 The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertise- ment" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Downloaded by guest on November 29, 2020

Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

Proc. Nati. Acad. Sci. USAVol. 80, pp. 5271-5275, September 1983Biochemistry

Sequence coding for the alphavirus nonstructural proteins isinterrupted by an opal termination codon

(Sindbis virus/nucleotide sequence/Middelburg virus)

ELLEN G. STRAUSS, CHARLES M. RICE, AND JAMES H. STRAUSSDivision of Biology, California Institute of Technology, Pasadena, California 91125

Communicated by Ray D. Owen, May 31, 1983

ABSTRACT We have obtained the nucleotide sequence of thegenomic RNAs of two alphaviruses, Sindbis virus and Middelburgvirus, over an extensive region encoding the nonstructural (repli-case) proteins. In both viruses in an equivalent position an opal(UGA) termination codon punctuates a long otherwise open read-ing frame. The nonstructural proteins are translated as polypro-tein precursors that are processed by posttranslational cleavageinto four polypeptide chains; the sequence data presented hereindicate that the COOH-terminal polypeptide, ns72, may be pro-duced by read-through of this opal codon. The high degree of aminoacid homology between the ns72 polypeptides of the two viruses,in contrast to the lack of conserved sequence upstream from theread-through site, suggests that ns72 plays an important role inviral replication, possibly modulating the action of other replicasecomponents.

The alphavirus genus of the family Togaviridae contains about20 members, many of which are important human or veterinarypathogens. These viruses contain an infectious -RNA of about11,700 nucleotides, which sediments at 49 S (reviewed in ref.1). In the virus this RNA is surrounded by a single species ofbasic protein, the capsid protein, to form a nucleocapsid withcubic symmetry. As the final event in maturation this capsidbuds through the plasma membrane of the infected cell andacquires an envelope containing lipids of host cell origin andtwo virus-encoded transmembrane glycoproteins, termed E2and El.The alphaviruses produce two messenger RNAs after infec-

tion (1). The first, 49S RNA, appears to be identical to the RNAin the virion and is translated as polyproteins that are processedto form the nonstructural proteins (2) required to replicate thevirus RNA and to transcribe a subgenomic 26S RNA. This 26SRNA is the major messenger found on host cell polysomes andis translated into the structural proteins found in virions. Thesequences of the 26 RNAs of three alphaviruses, Semliki Forestvirus (3, 4), Sindbis virus (5), and Ross River virus (6), have beendetermined recently and the details of the encoded proteinsestablished. We have previously reported that the translationof the nonstructural polyproteins from alphavirus 49S RNA ter-minates with multiple in-phase stop codons positioned abouttwo-thirds of the distance from the 5' end (7). In this com-munication we report that translation of the nonstructural pro-teins is modulated by the presence of an opal codon within theRNA.

MATERIALS AND METHODSPreparation of RNA. Growth of Sindbis virus or Middel-

burg virus and preparation of 49S RNA from infected cells orfrom purified virions have been described (8).

Sequence of Sindbis RNA. Sindbis 49S RNA was transcribedinto DNA by using reverse transcriptase and random primingas described (9). The single-stranded (ss) cDNA was cut withrestriction enzymes Hae III or Taq I and the resulting frag-ments were subjected to sequence analysis by the chemicalmethods of Maxam and Gilbert (10), as described (9). All butone of the Hae III fragments longer than 30 nucleotides andabout 30 Taq I fragments that contained non-26S sequence weresubjected to sequence analysis. At this stage, together with thesequence of the 26S RNA (5), about 90% of the 49S sequencehad been obtained. These sequences were aligned by restric-tion enzyme analysis of double-stranded (ds) cDNA made to49S RNA by using infrequently cutting enzymes such as EcoRI(6). At this point fragments encompassing the entire 49S se-quence could be aligned but the sequence still possessed gapsand ambiguities due to compression artifacts. Using this infor-mation to devise a cloning strategy, we obtained the entire ge-nome as four nonoverlapping clones in a plasmid derivative ofpBR322 (unpublished data). Cloned DNA was used to se-quence selected regions of the genome by the chemical method.A detailed description of the methods used and the entire se-quence of the Sindbis 49S RNA will be presented elsewhere.

Sequence Analysis of Middelburg RNA. A large number ofHae III fragments from ss cDNA made to Middelburg 49S RNAwere subjected to sequence analysis as previously described(9). These sequences were translated in all three reading framesand the deduced amino acid sequences were compared to thoseof Sindbis virus by using computer homology routines. In thisway most of the sequences of the Middelburg fragments ob-tained could be positioned relative to the Sindbis 49S RNA se-quence and a partial restriction map was generated by com-puter. Long ds cDNA was cut with infrequently cutting enzymesand the resulting fragments were subjected to sequence anal-ysis to confirm the alignment and obtain additional sequence.Only two EcoRI sites were found to be present in MiddelburgRNA, bracketing the region containing the stop codon. ds Mid-delburg cDNA was cut with EcoRI and a 2.5-kilobase fragmentwas cloned into the EcoRI site of a plasmid derived originallyfrom pBR322. This cloned fragment was subjected to sequenceanalysis in its entirety. A sequence analysis schematic for bothviruses is shown in Fig. 1.

RESULTS AND DISCUSSIONDuring sequence analysis of Sindbis 49S RNA we discoveredthe presence of an opal stop codon within the region encodingthe nonstructural proteins. A simple map of the Sindbis ge-nome (Fig. 2) illustrates the location of this opal codon in re-lation to the regions encoding the nonstructural proteins as wellas the structural polypeptides of the virion. To demonstrate thegenerality of an open reading frame for the nonstructural pro-

Abbreviations: ss, single-stranded; ds, double-stranded.

5271

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertise-ment" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Dow

nloa

ded

by g

uest

on

Nov

embe

r 29

, 202

0

Page 2: Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

5272 Biochemistry: Strauss et al.

SINnt4344-4-

*

MIDEcoRl

c-

nt7601J-- RNA

SS cDNA

= CLONED DNA

ZI- RNA

= SS cDNA

DS cDNA

CLONED DNA

EcoRi

*

2000 3000 4000

FIG. 1. Sequence analysis schematic for Sindbis virus (SIN) RNA and Middelburg virus (MID) RNA. The Sindbis sequence is numbered fromthe 5' terminus of 49S RNA. The two EcoRI sites define the cloned segment of the Middelburg genome. The regions sequenced as ss cDNA, as dscDNA, and as cloned DNA, respectively, are noted for each virus. The open boxes are the regions of sequence presented in Fig. 3; the asterisk marksthe opal codon. nt, Nucleotide.

teins punctuated with a termination codon as an integral mech-anism of alphavirus replication, we obtained the nucleotide se-quence of a second alphavirus, Middelburg virus, in the regionshown by a solid box in Fig. 2. The translated sequences forboth Sindbis and Middelburg viruses in this region are shownin Fig. 3. In the case of Sindbis virus, nucleotides are num-bered from the 5' terminus of the RNA and amino acids fromthe NH2 terminus of the nonstructural polyprotein precursor.The sequence of Middelburg is not complete and its sequenceis aligned by homology with the corresponding Sindbis se-quence. Both viruses have an opal codon at the same positionin the sequence (boxed in Fig. 3). For Sindbis virus this stopcodon interrupts an otherwise open reading frame for 2,513amino acids and is found at position 1,897 in this sequence (un-published data). For both viruses the sequence through the stopcodon was obtained with both ss cDNA as well as cloned DNA(Fig. 1). As noted earlier, the sequence of ss cDNA gives themajority nucleotide at any position for the population of reversetranscriptase copies produced from the population of RNA mol-ecules and thus cloning artifacts, reverse transcriptase errors,or minor components in the RNA population do not influencethe results. Furthermore, in the case of Sindbis virus both strandsof the cloned cDNA were subjected to sequence analysis in bothdirections (i.e., both 5' to 3' and 3' to 5') to rule out sequenceanalysis artifacts. The existence of the identical stop codon inthe same position in two different alphaviruses makes it clearthat this opal codon is a fundamental landmark in the alphavirusgenome.

Downstream of the terminator, there is a long open readingframe encoding highly conserved proteins in the two viruses(Fig. 3 and also see below), indicating that this sequence mustbe translated in vivo to produce a protein important for virusreplication. Direct evidence for the existence of such a poly-peptide, called ns72, comes from experiments with an antibody

produced in rabbits to the synthetic polypeptide Arg-Gly-Glu-Ile-Lys-His-Leu-Tyr-Gly-Gly-Pro-Lys, which is the deducedamino acid sequence at the COOH terminus of this polypeptidefor Sindbis virus (overlined in Fig. 3). This antipeptide anti-body precipitates ns72 as well as several larger polypeptides,presumably precursors, from Sindbis-infected cells (S. Lopezand J. Bell, personal communication). Polypeptide ns72 couldbe produced by several mechanisms, including translation froma spliced message, internal initiation, or read-through synthe-sis. The presence of a minor spliced message seems unlikely inview of the fact that spliced mRNAs have not been detected foranimal RNA viruses whose messages are produced solely in thecytoplasm and no evidence for such a RNA has been found foralphaviruses (12). Similarly, no indication of internal initiationhas been seen for alphaviruses either in vivo (2) or in vitro (13).Moreover, the first methionine residue downstream of the opalcodon is located at residue 1,957 and would result in a poly-peptide only 556 amino acids long. Because the open readingframes upstream and downstream of the termination signal arein phase, the most likely mechanism for producing ns72 that fitsall of the existing data is read-through synthesis to produce afull-length polyprotein precursor of 2,513 amino acids, fol-lowed by posttranslational proteolytic cleavage. Indeed, minoramounts of such a precursor have been detected during in vitrotranslation of 49S RNA by Collins et aL (13), who suggested onthe basis of the synthesis kinetics that this polypeptide mightbe produced by read-through of a stop codon. Moreover, thelargest polypeptide precipitated from infected cells by the an-tibody to the COOH-terminal peptide is sufficiently large to bethe 2,513 amino acid protein encoded by the entire nonstruc-tural region (S. Lopez and J. Bell, personal communication).

Thus, we believe that translation of Sindbis virus 49S RNAproduces two polyproteins, which contain the nonstructuralproteins, both of which are initiated at the same location near

NONSTRUCTURAL PROTEINS*

a ns60l nsi9 ns76

VIRION PROTEINSE3 F K 3'

CAP POlY (A)

FIG. 2. Map of the Sindbis virus genome. Boxes indicate translated regions. The solid box corresponds to the open box in Fig. 1 and to the se-quences shown in Fig. 3; the asterisk marks the opal codon. The crosshatched box is the region translated from the 26S subgenomic RNA into thestructural proteins.

0 1000

5'

Proc. Natl. Acad. Sci. USA 80 (1983)

Dow

nloa

ded

by g

uest

on

Nov

embe

r 29

, 202

0

Page 3: Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

Biochemistry: Strauss et al. Proc. Natl. Acad. Sci. USA 80 (1983) 5273

1429 L K L L O N A Y H A V A D L V N E H N I K S V A I P L L S T G I Y A A G K D R L E V S L N C L T T ASINN 4344 UUGAAAUUGCUACAAAACGCCUACCAUGCAGUGGCAGACUUAGUAAAUGAACAUAACAUCAAGUCUGUCGCCAUUCCACUGCUAUCUACAGGCAUUUACGCAGCCGGAAAAGACCGCCUUGAAGUAUCACUUAACUGCUUGACAACCGCG0 A 0 L A A V Y R A V A S L A 0 E T V R T M A I P L L S T G T F A G G K 0 R V L O S L N H L F T AM I D GAUGCCGACCUAGCAGCUGUGUAUAGAGCCGUGGCGUCCUUGGCUGACGAG ACAGUCCGCACAAUGGCCAUACCACUCCUGUCAACGGGGACGUUCGCUGSGGGAAAGGACCGCGUGUUGCAGUCGUUGAACCACCUAUUUACGGCC

S1479) L D R T O A D V T I Y C L O K K W K E R I D A A L O L K E S V T E L K I E D M E I D D E L V W I H PSIN 4494CUAGACAGAACUGACGCGGACGUAACCAUCUAUUGCCUGGAUAAGAAGUGGAAGGAAAGAAUCGACGCGGCACUCCAACUUAAGGAGUCUGUAACAGAGCUGAAGGAUGAAGAUAUGGAGAUCGACGAUGAGUUAGU4UGGAUCCAUCCAL 0 T T 0 V 0 V T I Y C R 0 K S W E K K I D E A I 0 M R T A T E L L D 0 0 T T V M K E L T R V H PCUGGACACCACGGACGUCGAUGUGACGAUAUACUGCCGGGAUAASUC uGSSAA AAGAAAAUCCAAGAGGCCAUUGAUAUGAGGACGGCAACCGAACUGCUAGAUGACSACACAACGGUUAUG AA AGACUAACCAGGGUGCAUCCU

S 1529) D SC L K G A K G F S T T K G K L Y S Y F E G T K F H D A A K D M A E I K V L F P N OD E S N ED

LS IN W4441 GACASGUUCGCUUGAAGGGAAGAAAGGGAUUCAGUACUAC AAAAGGAAAAUUGUAUUCGUACUUCGAAGGCACCA AAUUCCAUC AAGCAGCAAAAGAC AUGGCGGAGAUAAAGGUCCUGUUCCCUAAUGACCAGGAAAGUAAUGAACAACUGMID D S C L V G R S G F S T V D G R L H S Y L E G T R F H D T A V D V A E R P T L W P R A E E A N E D IMID GAUAGCUGCCUAGUGGGGCGCAGUGGAUUCAGCACGGUGGACGGACGGCUGCAUUCAUACCUUGA AGGAACUAGGUUCCACCAGACUGCUGUCGACGUGGC AGAAAGGCCAACUUUGUGGCC AAGGAGAGAGGAAGCGAACGAGCAGAUA

S 1579 C A IL G E T M E A I R E K C P VS H N P S S S P P K T L P C L C M Y A M T P E R V H R L R S N NSIN 14794) UGUGCCUACAUAUUGGGUGAGACCAUGGAAGCAAUCCGCGAAAAGUGCCCGGUCGACCAUAACCCGUCGUCUAGCCCGCCCAAAACGUUGCCGUGCCUUUGCAUGUAUGCCAUGACGCCAGAAAGGGUCCACAGACUUAGAAGCAAUAACMID T H Y V L G E S M E A I R T K C P V 0 0 T 0 S S A P P C T V P C L C R Y A M T P E R V H R L R A A

DMID ACACACUACGUCCUUGGCGAAUCCAUGGAGGCCAUAAGAACCAAAUGCCCGGUGGAUGAUACCGACUCGUC AGC ACC ACC AUGC ACCGUCCCGUGCCUUUGCCGCUACGCC AUGACCCCCGAGCGCGUACAUAGAUUGCGCGCCGCGCAG

N1629) V K E V T V C S S T P L P K H K I K N V D K V I C T K V V L F N P H T P A F V P A R K Y I E V P E DS IN 14944) GUCAAAGAAGUUACAGUAUGCUCCUCCACCCCCCUUCCUAAGCACAAAAUUAAGAAUGUUCAGA AGGUUCAGUGC ACGAAAGUAGUCCUGUUUAAUCCGC AC ACUCCCGC AUUCGUUCCCGCCCGUAAGUACAUAGAAGUGCCAGAACAGMID V K D F T V C S S F P L P K Y K I P G V D R V A C S A V M L F N H 0 V P A L V S P A K Y R E P S I SMID GUGAAGCAGUUCACAGUCUGCUCCUCGUUCCCGCUGCCAAAGUACAAGAUACCAGGCGUGC AGAGAGUGGCGUGUUCGGCUGUAAUGUUGUUUAAUC ACGACGUUCC AGCGCUGGUA AGCCCUCGCAAGUAC AGGGAACCGAGCAUUAGC

IN(1679)P T A P P A D A E E A P E V V A T P S P S T A O N T S L O V T D I S L O M O O S S E G S L F S S F SIN5094) CCUACCGCUCCUCCUGCAC AGGCCGAGGAGGCCCCCGAAGUUGUAGCGACACCGUCACC AUCUACAGCUGAUAAC ACCUCGCUUGAUGUC ACAGACAUCUCACUGGAUAUGGAUGAC AGUAGCGAAGGCUC ACUUUUUUCGAGCUUUAGCMID S E S S S S G L S V F 0 L 0 I G S 0 S E Y E P M E P V D P E P L I 0 L A V V E E T A P V R L E R V AMu AGCGAGUCGUCAUCAUCUGGACUGUCUGUGUUCGACCUGGACAUAGGCUCUGAUUCAGAGUACGAACCA AUGGAACCCGUGC AGCCCGAACCGCUGAUUGACUUGGCAGUCGUAGAGGAGACGGCCCCCGUCAGACUGGAACGAGUGGCC

SIN 1729) G S 0 N S I T S M D S W S S G P S S L E I V D R R D V V V A D V H A V D E P A P I P P P R L K K M Ak244) GGAUCGGACAACUCUAUUACUAGUAUGGACAGUUGGUCGUCAGGACCUAGUUCACUAGAGAUAGUAGACCGAAGGC AGGUGGUGGUGGCUGACGUUCAUGCCGUCC AAGAGCCUGCCCCUAUUCCACCGCCAAGGCUAAAGAAGAUGGCCMID P V A A P R R A R A T P F T L E D 9 V V A P V P A P R T M P V R P P R R K K A A T

CCUGUGGCUGCACCUCGCAGAGCCCGCGCGACACCCUUUACUUUGGAG CAGCGGGUUGUAGCACCAGUUCCUGCGCCGCGUACGAUGCCAGUCAGACCCCCUCGCCGGAAGAAAGCGGCGACC

SIN (1779) R L A A A R K E P T P P A S N S S E S L H L S F G G V S M S L G S I FS

G E T A RD

A A VD

P L A T5394 CGCUGGCAGCGGCAAGAAAAGAGCCCACUCCACCGGCAAGCAAUAGCUCUGAGUCCCUCCACCUCUCUUUGGUGGGGUAUCCAUGUCCCUCGGAUC AUUUUCGACGGAGAGACGGCCCGCCAGGC AGCGGUACAACCCCUGGCAACA

MID R T P E R I S F G O L O A E C M A I I N O O L T F G O F G A G E Frlsv AGAACACCUG^AAGGAUCUCGUUCGGCGAUUUAGAUGCCGAGUGCAUGGCCAUC AUAAAUGACGACCUGACUUUCGGGGACUUCGGCGCGGGCGAGUUC

IISIN (1829) G P T 0 V P M S F S F S G E I E L S A R V T E S E P V L F

(5544) GGCCCCACGGAUGUGCCUAUGUCUUUCGGAUCGUUUUCCGACGGAGAGAUUGAUGAGCUGAGCCGCAGAGUAACUGAGUCCGAACCCGUCCUGUUMID

SIN (1879) S F P L R K 0 R R R R S R R T E T G V G G Y I F S T 0 T(5694) UCUUUUCCACUACGCAAGCAGAGACGUAGACGCAGGAGCAGGAGGACUGAAUA UGA UAACCGGGGUAGGUGGGUACAUAUUUUCGACGGACAC

MID E R L T S A R A G A Y I F S S O TGAACGUUUAACGUCAGC A UAGACCGGGCGGGGGCCUACAUAUUCUCAUCGGAUAC

SIN (1929) T L E R N V L E R I H A P V L 0 T S K E E 0 L K L Y 0 M P T(5644) ACCUUGGAGCGCAAUGUCCUGGAAAGAAUUCAUGCCCCGGUGCUCGACACGUCGAAAGAGGAACAACUCAAACUCAGGUACCAGAUGAUGCCCAC(

MID v A E V H E E R V F A P K C 0 K E K E L L L L 0 M 0 M A P T(1979)11 E 9 L L SS.L.P. LI. N S A 1 0 0 PE.C IKIAAIAAIAAPAUUKUUUUUA.AAAU

P.AAUUUUALLLAAAAULLAALAAAULULUASLAAULLAAAAUGGAGAACSAUGAAGCAGAG

SIN ~1979) T T E R L L S G L R L r N S A T O O P E C Y K I T Y P K P L Y S5994) ACCACUGAGCGACUACUGUCAGGACUACGACUGUAUAACUCUGCCACAGAUCAGCCAGAAUGCUAUAAGAUCACCUAUCCGAAACCAUUGUACUC

MID v I 0 L L G G A K L F V T P T T 0 C R Y V T H K H P K P M Y SMID..l....rA

G S F E P G E V N S I I S S R S A V

G P G H L O K K S V L N L T E PSGGCCCUGGGCACUUGCAAAAGAAGUCCGUUCUGCAGAACCAGCUUACAGAACCGG P G H L O OM S v R O T R L A D C

E A N K S A Y UIDR K V E N K A ICGAAGCCAACAAAAGUAGGUACCAGUCUCGUAAAGUAGAAAAUCAGAAAGCCAUE A N K S R Y S R K V E N M K A E

S S V P A N YIS P O F A V A V C NCAGUAGCGUACCGGCGAACUACUCCGAUCCACAGUUCGCUGUAGCUGUCUGUAAT S V A F Y L S S A K T A V A A C N

1C

SIN (2029) N Y L H E N Y(6144) AACUAUCUGCAUGAGAACUA

MMn E F _L S N Y

A 1Y 0 T0 Tfl A lY L SM V DIUCCGACASUASCAUCUUAUCAGiUUiCUGACGAGUACGAUGCUUACUUGGAUAUGGUAGACGGD T v T c v1 n I T n1 C I . v .. -a a, -

T V A C L O T A T F CUP A K L R S Y P K K HIGACAGUCGCCUGCCUGGAUACUGCAACCUUCUGCCCCGCUAAGCUUAGAAGUUACCCGAAAAAACAS E S C L A A F C P S K L R S F P K K H

YGAGUAUAGAGCCCCGAAUAU

YAGCUACCAUCGAGCCGAAAU

R S V S A M N I L N V L I A A T K R N C N V T M R E L PWT L O S A T F N V EUCCGCAGUGCGGUUCCAUCAGCGAUGCAGAACACGCUAC AAAAUGUGCUCAUUGCCGCA ACUAAAGAUUGCAACGUCACGCAGAUGCGUGAACUGCCAACACUGGACUC AGCGACAUUCAUGUCGAAR S A V P S P F O N T L O N V L A A A T K R N C N V T O M R E L P T L D S A V F N V E

UAAGAAGUGCCGUUCCGUCUCCUUUCC AkAAUACACUGCAGAACGUGUUGGCCGCUGCCACGAAACGCAACUGCAACGUGACGCAGAUGCGGGAGCUGC CGA^CCUUGGAUUCAGCGGUGUUUAAUGUCGAG

R C D E Y W E E F A R K P I R I r r E F v T A Y v A R L K G P K A A A L F A K T Y N L VUGCUUUCGA^AAUAUGCAUGUAAUGACGAGUAUUGGGAGGAGUUCGCUCGGAAGCCAAUUAGGAUUACCACUGAGUUUGUC ACCGCAUAUGUAGCUAGACUGAAAGGCCCUAAGGCCGCCGC ACUAUUUGCAA^AGACGUAUAAUUUGGUCY C N Y W D E F A o K P I R L T T E N I T S Y V T R L K G P K A A A L F A K T Y L KUGCUUUAAGAAGUACGCAUGCAAUAACGACUACUGGGACGAAUUUGCUCA^AAGCCCAUCAGGU)UGACGACGGAAAACAUCACAUCCUACGUGACUAGACUGAAGGGCCCGAAGGCAGCCGCUCUAUUCGC A^AAAACCUACGACWUGAAG

P L O E v D R F V M O M K R v K v T P G T K H T E E R P K v O v I A A E P L A T A Y L C G ICCAUUGCAAGAAGUGCCUAUGGAUAGAUUCGUCAUGGACAUGA^AAGAGACGUGAAAGUUACACCAGGCACGAAACAC ACAGAAG^AAGACCG^AAGUACAAGUGAUACAA^GCCGCAGAACCCCUGGCGACUGCUUACWUAUGCGGGAUUP L O E v M O R F V V D M K DVK T P G T K H T E E R P K V O V I A A E P L A T A Y L C G ICCGUUGCAAGAGGUACCGAUGGACCGCUUCGUCGUGGAUAUGAAGAGAGACGUGAAGGUGACACCCGGC ACCAAGCAuACAGAAGAAAGGCC^AAGGUGCAA^GUCAUACAGGCAGCCGAACCCCUAGCCACCGCCUACCUGUGCGGUAUA

R E L v R R L T A V L L P N I H T L F D M S A E D F D A I I A E H F K 0 G O P v L E T D I A S F 0CACCGGGAAUUAGUGCGUAGGCUUACGGCCGUCUUGCUUCCAAACAUUCAC ACGCUUUUUGACAUGUCGGCGGAGGAUuuGAUGCAA^UCAUAGCAGAACACUUCA^AGCAA^GGCGACCCGGUACUGGAGACGGAUAUCGCAUCAUUCGACR E L v R R L N A v L L P N V H T L F o M S A E D F D A I I S E H F R P G IA v L E T D I A S FCAUAGAGAACUAGUUAGAAGGUUAAAUGCAGUUCUGCUGCCGAACGUCCACACCCUGUUUGACAUGUCGGCGGAAGACUUUGAUGCCAUAAUACCGAGCACUUCCGCCCUGGGGACGCUGUACUAGAAA CAGACAUCGCAUCGUUCGAC

ODD A M A L T G L M I L E O L G v D O P L L D L I E C A F G E I S S T H L P T G T R F K F G AA^AAGCCAAGACGACGCUAUGGCGUUAACCGGUCUGAUGAUCUUGGAGGACCUGGGUGUGGAUCAACCACUACUCGACUUGAUCGAGUGCGCCUUUGGAGAAAUAUCAvUCCACCCAUCUACCUACGGGUACUCGUUUUAAAUUCGGGGCGO D D S L A Y T G L M L L E D L G v D O P L L E L I E A S F G E I T S T H L P T G T R F K F G AAAGAGCCAAGACGACUCGCUGGCGUACACGGGGCUGAUGUUGCUAGAGGACCUCGGCGUCGACCAGCCUUUGCUCGAGUUGAUCGAAGCGUCGUUCGGGGAA^AUAACAAGCACGCAUUUACCAACGGGAACUAGGUUCAAGUUCGGGGCC

M M K G M F L L F N L N I A S L E E L K S C A F I G N I I H GAUGAUGKALUCCGGAAUGUUCCUCACCUUUUUGUCAACLCLGUUUSGAAUGUCGUUAUCGCCAGC AGLGUACuEGEIGSGCGGCUUACACGUCCAGFUGUGCAGCGUUC2UUGGCG0CG VAKAUCAUGG5GUAGUAUCUGACM M K S G M F L T L F v N r M L N M T I A S A v L E E R L T N S K C A A F I G D D N I V H G v K S DAUGAUGAAAUCUGGCAUGUUCCUGACGCUCUUCGUGAACACGAUGCUCAACAUGACUAUAGCUAGCAGAGUGUUAGAAGAACGGCUGACCAA^UUCU^AAUGUGCCGCCUUUAUCGGCGAUGAUAACAUUGUGC AUGGAGUGAA AUCUGAU

SIN (2379 K E M A E R C A T W L N M E v K I I A V I G E R P P F C G G F I L O O S v T S T A C V A P L(71t94) AAAG^AAUGGCUGAGAGGUGCGCCACCUGGCUCAAC AUGGAGGUUAAGAUCAUCGACGCAGUC AUCGGUGAGAGACCACCUUACUUCUGCGGCGGAUUUAUCUUGCAA^GAUUCGGUUACUUCCACAGCGUGCCGCGUGGCGGAUCCCCUGMID R E v v R Y G G V D 0 v T G T R v A LAAACUGCUGGCUGAGAGAUGCGCCGCAUGGAUGAACAUGGAAGUGAAGAUCAUCGAUGCAGUCAUGUGCGAGCGCCCCCCUUACUUCUGCGGAGGGUUAUCGUGUUUGACCAAGUUACAGGCACCUGUUGC AGAGUGGCAGACCCGCUG

SIN 24291 K R L F K L G K P L P A E 0 E RA L L E K A W F R

V I G L A R E V

(7344 AAAAGGCUGUUUUAGUUGGGUAAACCGCUCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCUCUGCUAGAUGAAACAAGGCGUGGUUUGAGUAGGUAUAACAGGCACUUUAGCAGUGGCCGUGACGACCCGGUAUGAGGUAR G A

O0 R R R A 0 R R v G 0 A M S E vAAGAGACUCUUUAAGCUCGGAAAACCGCUGCCUGCUGAAGACAAAC AGGACGAGGACCGCAGAAGGGCGUUGGCCGACGAGGC ACA^ACGGUGGAACCGCGUAGGUAUCCAAGCAGAUUUGGAGGCCGCAAUGAAC AGCCGUUACGAGGUC

SIN(2479) 0 N IUUT P v L L A L R T F A O S K R A FD A I R G E I K H L G G P K7494 GACAAAUA ACACCUGUCCUACUGGCAUUGAGAACUUUUGCCCAGAGCAAAAGAGCAUUCCAAGCCAUCAGAGGGGAAAUAAAGCAUCUCUACGGUGGUCCUAAA UAGMID E G I N V I T A L T T L S R N Y H N F R H L R G P v I 0 L G G P KM1DGAGGGGAUCCGAAACGUCAUCACGGCGUUAACCACGCUGUC ACGGAAUUAUCACAAUUUCCGAC AUCUAAGAGGACCCGUUUUuGACCUCUACGGCGGUCCUAA UGA

FIG. 3. Translated nucleotide sequence of Sindbis virus (SIN) and Middelburg virus (MID) RNAs in the region containing the opal stop codon.For Sindbis virus nucleotides are numbered from the 5' terminus of 49S RNA and amino acids are numbered from the NH2 terminus (11) of thenonstructural polyprotein precursor. The Middelburg sequence is aligned by homology with the Sindbis sequence and the two underlined Hae IIIsites (G-G-C-C) were not directly sequenced; the positioning of the Middelburg sequence between nucleotides 5,394 and 5,729 of Sindbis is arbitrarybecause of the lack of homology and the large deletion in the Middelburg sequence relative to the Sindbis sequence. In other regions gaps have beenintroduced as necessary to maintain the homologous alignment. The opal stop codons are boxed; the wavy overline indicates the peptide synthesizedto produce antibodies. The single-letter amino acid code is used in the translation of the RNA and the amino acid is positioned above the first nu-cleotide of the corresponding codon. A = Ala, C = Cys, D = Asp, E = Glu, F = Phe, G = Gly, H = His, I = Ile, K = Lys, L = Leu, M = Met, N= Asn, P = Pro, Q = Gln, R = Arg, S = Ser, T = Thr, V = Val, W = Trp, and Y = Tyr.

SIN 209)MID

SIN 29

MID

SIN )

MID

SIN 129JMID

SIN 12179)MID

SIN 2229MID

ku

3-AS

'Ar.AArr AAr AAArarrtrIA F.rrAA r r.AA Ar.1M. Ar. A Ar A .o;.AArnr.r-A

ccSGUUAUCGACAGACUGUUGGGCGGAGCGAAACUGUUCGUGACACCCACAACCGACUGCCGAUACGUGACACACAAGCACCCAAAGCCGAUGUACUCGACUAGUGUAGCAUUCUACCUAAGUUCGGCUAAGACUGCAGUGGCGGCAUGCAAU

GAAUUCUUAAGUAGGAACUACCCAACUGUGACAUCCUAUCAGAUCACGGAUGAGUACGACGCCUACtUGbACAUGGUG6ACdGUUCC.AGAGCUGUEUA.ACAGA'GCA6CC'UUU6GU'CCGOCCiAA6UA6GCiGC'UUU.CGiAG.AG.ACD-AALI 11 aIr IvA~lA.Ar-r-AAI. lAIII A.l I- AIrvv ....1aF1 ., ~~.11 -o,-.o V, TP T V

rain T S Y n I T D E Y D A v L D M v D G

Dow

nloa

ded

by g

uest

on

Nov

embe

r 29

, 202

0

Page 4: Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

5274 Biochemistry: Strauss et al.

1429 SINDBIS VIRUS. 25131.

.; ...........................................................................................................\ ..

A.

A\

enD

cc

>~-j

LLCDCD0

N

FIG. 4. Dot matrix comparing the proteins encoded by Sindbis RNAand Middelburg RNA around the opal termination codon. The proteinsequences are compared nine amino acids at a time and a dot is placedin the matrix whenever five match. The position of the opal termina-tion codon is indicated by the arrow.

the 5' end. One precursor is 1,896 amino acids long, terminatesat the opal codon, and is cleaved to produce ns6O, ns89, andns76. The second precursor is produced in relatively smalleramounts by read-through of the opal codon, is 2,513 amino acidslong, terminates at the multiple in-phase termination codons inthe junction region, and is cleaved to produce a fourth protein,ns72, presumably in addition to the three products mentionedabove (Fig. 2). ns6O, ns89, and ns76 have been observed in bothlysates of infected cells (reviewed in ref. 2) and as the productsof in vitro translation of 49S RNA (13) but the small amountsof ns72 produced have only been visualized by immunopre'cipitation.

The protein-encoding regions on either side of the opal ter-mination codon have several interesting features, which are il-lustrated in Fig. 4. Fig. 4 shows a dot matrix in which the aminoacid sequences of Sindbis virus throughout the region con-

tained'in Fig. 3 are compared with those of Middelburg virusnine residues at a time. A dot is placed in the matrix wheneverfive residues in the string of nine match. The diagonal linesshow regions of protein homology between the two viruses; off-diagonal marks are due to random matches., It is obvious thatthe two viruses are fairly closely related and Fig; 4--also illus-trates the principle used for alignment of Middelburg sequencedata with that of Sindbis virus. Downstream from the opal ter-mination codon (arrow in Fig. 4) the two sequences are highlyhomologous and upstream from this stop codon, between aminoacids 1,429 and 1,676 in the Sindbis sequence, the two se-quences also demonstrate considerable homology. However,between amino acids 1,676 and 1,896 in the Sindbis sequencethere is virtually no detectable homology. Furthermore, as theshift in the diagonal makes clear, there is a discontinuity in thealignment and the Sindbis sequence is longer in this region by88 amino acids. Thus, the COOH-terminal sequence of thepolyprotein preceding the opal termination codon is not ho-mologous in these two alphaviruses and differs in length so thatthe third nonstructural protein of Sindbis virus, ns76 (Figs. 2and 3), is larger than the corresponding Middelburg protein.The function of this region of protein is unknown but the vari-ation in size and sequence might imply that it is not vital for thevirus replication cycle.

In contrast, the extreme COOH-terminal nonstructural pro-tein, ns72, is highly conserved between Middelburg and Sind-bis (Figs. 4 and 5). Beginning just before the opal codon andcontinuing to the series of in-phase stop codons in the junctionregion (7), which begin with an amber codon corresponding tonucleotides 2-4 of 26S RNA, the two sequences are identicalin length, exhibit perfect alignment with no gaps or insertions,and show an overall homology of 73%. The start of ns72 is un-clear but estimates of its molecular weight based on migrationin acrylamide gels.and the fact that there is no conservation inlength or sequence upstream from the opal codon makes it likelythat ns72 begins very near this stop codon, a likely start beingthe glycine six residues downstream (unpublished observa-tions). If we assume they begin at the stop codon, then the twons72s have the same amino acid at 448 out of 616 positions. Inaddition, 33 out of the 168. substitutions that occur are con-servative. There is one stretch of 98 amino acids (amino acids2,161-2,259 in the Sindbis sequence) in which there are onlyfive differences (Fig. 5). The COOH-terrminal regions show less

SIN (1429)LKLLONAYHAVAOLVNEHNIKSVAIPLLSTGIYAAGKORLEVSLNCLTTALDRTOAOVTIYCLDKKWKERIDAALOLKESVTELKDEDMEI DELVWIHPMID DAD.AAV.R ... .S.AD. TVRTM. TF.G... ...... .... ..T. V . R. .S.E KKIOEAIDMRTA ... LL.D.TTVMK. ......

SIN (1529) DSCLKGRKGFSTTKGKLYSYFEGTKFHOAAKDNAE IKVLFPNOOESNEOLCAYILGETMEAIREKCPVDHNPSSSPPKTLPCLCMYAMTPERVHRLRSNNMID ... V. ..... VO.R.H-. ... .... LT.V.V ..RPT..RRE..... V.........ITH.V:. ...T.DTD. .A. ........R AAO

IISIN (1629) VKEVTVCSSTPLPKHK IKNVOKVOCTKVVLFNPHTPAFVPARKYIEVPEOPTAPPAOAEEAPE VVATPSPSTAONTSLDVTOISLDMDDSSEGSLFSSFSMID ..OFf. Y PG. .R.......... HDV. ..SP... RR.PSISSESSSSGLSVFDLDIGSD.EYEPMEPVOPEPL.O.AVVEETAPVRLERVA

IISIN (1729) GSDNSI TSMDSWSSGPSSLEIVDRROVVVADVHAVOEPAPIPPPRLKKMARLAAARKEPTPPASNSSESLHLSFGGVSMSLGSIFOGETAROAAVOPLATMID PVAAPRRARATPFTLE .R ... PP.P.PRTMPVR. .R.K.AAT.TPERISFGDLD.ECMAIINDDLTF.DFGAGEF

SIN (1629) GPTDVPMSFGSFSDGEIDELSRRVTESEPVLFGSFEPGEVNSI ISSRSAVSF'PLRKORRRRRSRRTE- YXLTGVGGYIFSTDTGPGHLOKKSVLONOLTEPMID E.L.SAU.ORA.A. S... OR. .R.TR.AOC

SIN (1929) TLERNVLERIHAPVLDTSKEEOLKL-RYOMMPTEANKSRYOSRKVENOKAI TTERLLSGLRLYNSATDGPECYKITYPKPLYSSSVPANYSDPOFAVAVCNMID VA,.DVHE..VF..KC. KE..RL.L.QM..A.......M..EVID...G.A.K.FVTP.TDCRYVTHKH...M..T..AFYL.SAKT... A..

IIISIN (2029) NYLHENYPTVASYOI TDEYDAYLDMVDG1VACLO'ATFCPAKLRSYPKKHEYRAPNIRSAVPSAMONTLONVLIAATKRNCNVTOMRELPTLDSATFNVEMID EF.SR...S.T.... ..... .R.A.. S... F... S.HRAE. PF A.. ...

I ~ .SIN (2129) CFRKYACNOEYWEEFARKPIRI TTEFVT-AYVARLKGPKAAALFAKTYNLVPLOEVPMORFVMOMKROVKVTPGTKHTEERPKVOVIOAAEPLATAYLCGIMID .K.. ND. .OD . S... .L.. NI.S. . . . K. V

I.IISIN (2229) HRELVRRLTAVLLPNIHTLfDSAEDFDOAI IAEHFKOGDPVLETOIASFOKSOODAMALTGLMILEDLGVDOPLLDLIECAFGEISSTHLPTGTRFKFGAMID N. V.S... RP. .A SL.Y .... .L .E... AS ...E....S.T.II I~ II

SIN (2329)0MKSGMFLTLfVNTVLNVVIASRVLEERLKTSRCAAFIGOONIIHGVVSDKEMAERCATWLNME VKII AVIGERPPYFCGGF ILODSVTSTACRVADPLMIDN.... M T........ TN.K.......... ...... LL .....A.M.MC.VF.O. G.C.

SIN (2429) KRLFKLGKPLPADDEODEORRRALLDETKAWFRVGIT.GTLAVAVTTRYEVONITPVLLALRTFAOSKRAFOAIRGEIKHLYGGPKMID .... ....A.K........AAR.N .... .OAD.EA.MNS .. EG.RN.IT ..T.LSFNYHN.RHL. .PVID.

FIG. 5. Direct comparison of the amino acid sequences encoded by Sindbis RNA and Middelburg RNA around the opal termination codon. Thesequence of Sindbis is given in the first line. Amino acids are numbered starting with the NH2 terminus ofthe polyprotein precursor. The Middelburgsequence is shown below; a dot means the amino acid is the same as in the Sindbis sequence. Gaps have been introduced in the Middelburg sequencefor the purpose of alignment. The single-letter code is used as in Fig. 3; the boxed X is the position of the opal codon.

Proc. Nad Acad. Sci. USA 80 (1983)

Dow

nloa

ded

by g

uest

on

Nov

embe

r 29

, 202

0

Page 5: Sequence coding forthe is interrupted - PNAS · in the sequence (boxed in Fig. 3). For Sindbis virus this stop codon interrupts an otherwise open reading frame for 2,513 aminoacids

Proc. Natl. Acad. Sci. USA 80 (1983) 5275

homology (Fig. 5). We have postulated that components of thealphavirus replicase recognize specific nucleotide sequences' ofabout 20 nucleotides in length in the RNA as promoter sites (7,11, 14) and it is tempting to speculate that the highly conservedcore of ns72 has such a function. In any event, the high degreeof homology between these two viruses indicates an importantfunction for ns72 in the virus replication cycle. As has been pre-

viously noted (7), the alphaviruses have diverged sufficientlysuch that the codon used for any particular amino acid has beenrandomized, even in highly homologous segments. In ns72 dif-ferent codons are used between Sindbis and Middelburg for thesame amino acid in 57% of the cases in which conservation oc-

curs (Fig. 3). The slight difference from complete randomiza-tion (66% difference) probably arises because some codons are

infrequently used by alphaviruses (5, 6). Thus, conservation ofamino acid sequence is highly significant and the translationframe can be deduced by comparison of conserved amino acidsequences; because of the divergence-in codon usage the high-est degree of protein sequence homology is observed only whenthe proper translation frames are being compared.

Although termination codons have been located within thetranslated regions of a number of viruses, at the current timethe alphaviruses appear unique among animal viruses in usingread-through to modulate the synthesis of their proteins. Anamber codon punctuates the genome of the murine retrovi-ruses between the gag and pol genes (15) and it was originallybelieved that read-through was the mechanism for producingrelatively minor amounts of the polymerase products (16, 17).However, recent results with an avian retrovirus, Rous sarcoma

virus, have revealed that in this case the downstream, open

reading frame for pol is out of phase with the gag gene (18) andthus the polymerase moieties may be produced from a minorspecies of messenger produced by splicing in which the ambercodon could have been deleted. Amber codons are also presentin the genomes of a variety of- RNA plant viruses [includingtobamoviruses (19) and tymoviruses (20)] and t.read-throughproducts have been obtained by translation of these and otherplant virus RNAs in in vitro protein synthesis systems (20-24).Whether these experiments accurately reflect the conditions ofreplication during normal infection is unclear at this time.The mechanism by which read-through of an opal codon might

occur during alphavirus infection is unknown. Specific sup-

pression of the opal codon is unlikely because in the case ofSindbis virus the translation of the structural proteins is ter-minated with an opal codon (5). The-simplest mechanism pre-

dicts that read-through is a probablistic event of relatively lowfrequency, perhaps due to wobble in the first anticodon posi-tion to insert a tryptophan. If so, accumulation of ns72 wouldbe relatively slow in the infected cell and functionally ns72 wouldappear as. a late protein. An attractive hypothesis for the func--tion of such a polypeptide is as a controlling or modifying factorfor the polymerase, perhaps to effect the shutoff of minus-strandtemplate synthesis that occurs at 3-4 hr postinfection (25, 26).

Thus, alphaviruses, with a small and relatively simple ge-nome, are capable of regulating the synthesis of their virus-en-coded products at three different levels by two different mech-anisms. Production of a subgenomic (26S) RNA as the predom-

inant message in infected cells insures that most of the virus-specific protein synthesis is devoted to production of the struc-tural proteins. Translation of the genomic RNA up to the opalcodon produces three nonstructural polypeptides in lesseramounts and read-through may produce a fourth polymerasecomponent in very small quantities for specific regulation ofreplication.

The computer programs used in this work, including production offigures, were written by Tim Hunkapiller and we are grateful to himfor access to these programs and instructions in their use. The com-puter work was performed on the computer facility of Lee Hood andwe are grateful for the time on these facilities. Edith Lenches gave ex--pert technical assistance in the maintenance and preparation of virusstocks and cell lines. This work was supported by Grants AI 10793 andGM 06965 from the National Institutes of Health and by Grant PCM*8022830 from the National Science Foundation.

.1. Strauss, J. H. & Strauss, E. G. (1977) in The Molecular Biologyof Animal Viruses, ed. Nayak, D. P. (Dekker, New York), pp. 111-166.

2. Schlesinger, M. J. & Kaarsinen, L. (1980) in The Togaviruses, ed.Schlesinger, R. W. (Academic, New York), pp. 371-392.

3. Garoff, H., Frischauf, A.-M., Simons, K., Lehrach, H. & Delius,H. (1980) Proc. Natl Acad. Sci. USA 77, 6376-6380.

4. Garoff, H., Frischauf, A.-M., Simons, K., Lehrach, H. & Delius,H. (1980) Nature (London) 288, 236-241.

5. Rice, C. M. & Strauss, J. H. (1981) Proc.Natl Acad. Sci. USA 78,.2062-2066.

6. Dalgarno, L., Rice, C. M. & Strauss, J. H. (1983) Virology, in press.7. Ou, J.-H., Rice, C. M., Dalgarno, L., Strauss, E. G. & Strauss,

J. H. (1982) Proc. Natl Acad. Sci. USA 79, 5235-5239.8. Ou; J.-H., Strauss, E. G. & Strauss, J. H. (1981) Virology 109,

281-289.9. Rice,.C. M. & Strauss, J. H. (1981)J. Mal. Biol. 150, 315-340.

10. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.

11. Ou, J.-H., Strauss, E. G. & Strauss, J. H. (1983)J. Mol. Biol., inpress.

12. Strauss, E. .G. & Strauss, J. H. (1983) Curr. Top. Microbiol. Im-munol 105, 1-97.

13. Collins, P. L., Fuller, F. J., Marcus, P. I., Hightower, L. E. &Ball, L. A. (1982) Virology 118, 363-379.

14. Ou, J.-H., Trent, D. G. & Strauss, J. H. (1982)J. Mol. Biol. 156,719-730.

15. Shinnick, T. M., Lerner, R. A. & Sutcliffe, J. G. (1981) Nature(London) 293, 543-548.

16. Philipson, L., Anderson, P., Olshevsky, U., Weinberg, R., Bal-timore, D. & Gesteland, D. (1978) Cell 13, 189-199.

17. Oppermann, H., Bishop, J. M., Varmus, H. E. & Levintow, L.(1977) Cell 12, 993-1005.

18. Schwartz, D., Tizard, R. & Gilbert, W. (1983) Cell 32, 853-869.19. Goelet, P., Lomonossoff, G. P., Butler, P. J. G., Akam, M. E.,

Gait, M. J. & Karn, J. (1982) Proc. Natl. Acad. Sci. USA 79, 5818-5822.

20. Morch, M.-D., Drugeon, G. & Benicourt, C. (1982) Virology 119,193-198.

21. Pelham, H. R. B. (1978) Nature (London) 272, 469-471.22. Mang, K.-G., Ghosh, A. & Kaesberg, P. (1982) Virology 116, 264-

274.23. Dougherty, W. G. & Kaesberg, P. (1981) Virology 115, 45-56.24. Bisaro, D. M. & Siegel, A. (1982) Virology 118, 411-418.25. Sawicki, D. L. & Sawicki, S. G. (1980)J. Virol. 34, 108-118.26. Sawicki, D. L., Sawicki, S. G., Keranen, S. & Kaariinen, L. (1981)

J. Virol. 39, 348-358.

Biochemistry: -Strauss et al.

Dow

nloa

ded

by g

uest

on

Nov

embe

r 29

, 202

0