
Encyclopedia of Life Sciences || Messenger RNA in Eukaryotes


Messenger RNA in Eukaryotes

Jan-Peter Kreivi, University of Uppsala, Uppsala, Sweden

Svend Petersen-Mahrt, University of Uppsala, Uppsala, Sweden

Göran Akusjärvi, University of Uppsala, Uppsala, Sweden

Posttranscriptional regulation of gene expression represents an important level at which eukaryotes can expand the coding capacity of their genomes. The concept that one gene makes one protein does not apply to higher eukaryotes. Thus, a eukaryotic cell can use alternative RNA splicing and alternative polyadenylation to produce hundreds or even several thousands of protein isoforms from a single gene.

Introduction

Expression of the genetic information has been summarized in the central dogma, which postulates that information in a cell flows from deoxyribonucleic acid (DNA) through a ribonucleic acid (RNA) intermediate to protein. Eukaryotic cells contain three RNA polymerases, designated RNA polymerase I, II and III, which are responsible for synthesis of specific classes of RNA molecules within the cell. RNA polymerase I is responsible for synthesis of ribosomal RNA (rRNA), RNA polymerase II is responsible for synthesis of protein-coding messenger RNA (mRNA), and RNA polymerase III takes care of transfer RNA (tRNA) and 5S rRNA synthesis. Here we will primarily discuss the structure, organization and maturation of the RNA polymerase II-generated transcripts, the mRNAs that encode proteins. mRNAs in higher eukaryotes mature after extensive processing of the primary transcript of the gene. These posttranscriptional processes are important regulatory pathways used by the cell to modulate gene expression in response to different stimuli.

General Sequence Features of a Eukaryotic mRNA

With few exceptions, eukaryotic mRNAs are composed of three common elements: a capped 5’ end, a single protein-encoding reading frame, and a poly(A) tail at the 3’ end. In the vast majority of cases translation starts with an AUG codon and the mRNA encodes a single protein species, although examples of alternative start codons and polycistronic mRNAs have been described. In addition, a number of mRNAs contain regulatory sequences in their 5’ and 3’ untranslated regions (UTRs) that control RNA stability, translation and polyadenylation efficiency.
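The layout described above can be illustrated with a minimal Python sketch. It assumes an idealized transcript in which the first AUG opens the reading frame and the frame runs to the first in-frame stop codon. The function name `split_mrna` and the toy sequence are hypothetical, the 5’ cap is a chemical modification and is not represented in the sequence string, and real mRNAs can deviate from this scheme (alternative start codons, polycistronic messages).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Split an mRNA into 5' UTR, coding region, 3' UTR and poly(A) tail.

    Simplifying assumption: translation starts at the first AUG and
    runs in frame to the first stop codon.
    """
    # Peel off the run of A residues at the 3' end (the poly(A) tail).
    body = mrna.rstrip("A")
    tail = mrna[len(body):]
    start = body.find("AUG")
    if start == -1:
        raise ValueError("no AUG start codon found")
    # Walk codon by codon until an in-frame stop codon is reached.
    for i in range(start, len(body) - 2, 3):
        if body[i:i + 3] in STOP_CODONS:
            end = i + 3
            return body[:start], body[start:end], body[end:], tail
    raise ValueError("no in-frame stop codon found")

# Toy transcript: 5' UTR + reading frame + 3' UTR + 20-residue poly(A) tail.
toy = "GGCACC" + "AUGGCUUGGUAA" + "CUGAAUAAAGCU" + "A" * 20
utr5, cds, utr3, tail = split_mrna(toy)
print(utr5, cds, utr3, len(tail))  # GGCACC AUGGCUUGGUAA CUGAAUAAAGCU 20
```

Stripping trailing A residues is a shortcut that only works when the 3’ UTR itself does not end in A; a real analysis would use an annotated cleavage site instead.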

As will be discussed in more detail below, it is now clear that regulation of gene expression at the level of posttranscriptional processing is very common in higher eukaryotes. Almost every aspect of mRNA processing and stability has been shown to be a target for regulation. Here we will discuss some of the common features of eukaryotic mRNAs and some basics of the mechanisms that convert a pre-mRNA to an mRNA.

Maturation of Eukaryotic mRNA

With the exception of mitochondria and chloroplasts, the genetic information in eukaryotic cells is stored as stable DNA molecules in the cell nucleus. The transmission of genetic information from DNA to RNA is therefore almost exclusively a nuclear process.

mRNA capping

All RNA polymerase II transcripts are capped at their 5’ end. Capping of the pre-mRNA occurs co-transcriptionally, almost immediately after RNA synthesis is initiated, and even the shortest RNA chains are capped (Figure 1). The cap consists of an inverted methylated guanosine residue that becomes linked to the first nucleotide in the transcript via a 5’–5’ triphosphate bond. The cap serves multiple functions in RNA metabolism. For example, it is required for efficient splicing of the pre-mRNA, nuclear to cytoplasmic export of the mRNA, efficient translation initiation on the mRNA and resistance against 5’–3’ exonuclease degradation.

The cap structure is recognized by at least two different cap-binding complexes (CBC): a nuclear and a cytoplasmic CBC. The nuclear CBC is important for efficient splicing of the cap-proximal intron and it has been found to be associated with mRNA, accompanying the transcript into the cytoplasm, where it is dissociated. The cytoplasmic CBC is the translation initiation factor eIF-4E, which helps to recruit the 40S ribosomal subunit to the mRNA and hence stimulates initiation of translation.

Article Contents

Introductory article

. Introduction
. General Sequence Features of a Eukaryotic mRNA
. Maturation of Eukaryotic mRNA
. mRNA as a Ribonucleoprotein Particle
. Intronless and Nonpolyadenylated mRNA Transcripts
. Synthesis and Processing of mRNA in Viral Systems
. Trans-splicing
. Non-nuclear pre-mRNA Processing in Eukaryotes
. Conclusions

ENCYCLOPEDIA OF LIFE SCIENCES © 2001, John Wiley & Sons, Ltd. www.els.net

Nuclear pre-mRNA splicing

Almost all protein-coding genes in higher eukaryotes are discontinuous, with the coding sequences (exons) interrupted by stretches of noncoding sequences (introns). The introns are present at the DNA level, and are removed from the pre-mRNA by a process called pre-mRNA splicing. The number of introns in a eukaryotic pre-mRNA varies considerably. For example, the interferon α, the histone and c-jun genes have no introns, whereas the gene for dystrophin has as many as 70 introns. Also the size of introns can vary from less than 100 nucleotides to several millions of nucleotides. In contrast, internal exons in a pre-mRNA are of a relatively short length, and rarely exceed 350 nucleotides. In fact, the exons in a pre-mRNA are used as the unit for splice site recognition: the so-called exon definition model (Figure 2). Thus, splicing factors binding to the 3’ splice site upstream of an exon, and factors binding to the 5’ splice site downstream of the exon, interact and define the exonic structure of the pre-mRNA during spliceosome assembly. Subsequently, splice sites are paired across the intron (Figure 2). In both recognition steps the conserved SR family of splicing factors functions as bridging factors.

How are the exon–intron borders defined and recognized by the splicing machinery? The ends of the introns are, in part, defined by RNA–RNA base pairing between the pre-mRNA and a number of small abundant nuclear ribonucleoprotein particles, the so-called U snRNPs (small nuclear ribonucleoprotein particles). The U snRNPs are composed of both small RNA molecules and proteins and they are named based on the specific RNA they carry. U1, U2, U5 and U4/U6 snRNPs are essential for splicing of most pre-mRNAs in higher eukaryotes. The conserved sequences at the 5’ and 3’ ends of the intron are surprisingly short considering the precision by which very large introns are excised during splicing. In most cases the intron starts with a GT dinucleotide and ends with an AG dinucleotide. The 5’ splice site is recognized by U1 snRNP through a short RNA–RNA base pairing between U1 snRNA and the pre-mRNA. The 3’ splice site is similarly defined by base pairing between U2 snRNA and the so-called branch site, which is located approximately 30 nucleotides upstream of the 3’ splice site. U2 snRNP and U1 snRNP binding to the 3’ splice site and 5’ splice site across an exon defines the exonic structure in the pre-mRNA (Figure 2). Subsequently, the U4/U6–U5 triple snRNP, together with the other essential splicing factors, assembles the pre-mRNA into a large ribonucleoprotein particle, the spliceosome, which catalyses the cleavage and ligation reactions necessary to mature the final mRNA (Figure 3).

Recently it was discovered that a small fraction of nuclear pre-mRNA introns are not spliced by the conventional spliceosome. These introns do not contain the usual GT–AG dinucleotide pairs at the intron boundaries. They were originally called ATAC introns, based on the presence of the conserved dinucleotides AT and AC found at the 5’ and 3’ ends of these introns. A minor class of spliceosomes splices this novel class of introns. Here U11, U12 and U4atac/U6atac snRNPs replace U1, U2 and U4/U6 snRNPs, respectively. The two types of spliceosomes, U2-type and U12-type, have been found to be able to splice both types of introns, i.e. some GT–AG introns are spliced by the U12-type spliceosome and some AT–AC introns are spliced by the U2-type. Although direct evidence is lacking, data suggest that AT–AC introns are older than the much more abundant GT–AG introns.

Figure 1 Transcription and processing of a simplified typical protein-coding gene in higher eukaryotes. The RNA polymerase II-transcribed pre-mRNA is capped at the 5’ end (circle). The intron sequences (wavy lines) between the protein-coding exon sequences (boxes) are spliced out and the 3’ end becomes polyadenylated. Note that exon sequences are not necessarily entirely protein coding.

Figure 2 The exon definition model. Exons in a pre-mRNA are recognized as units by U2 snRNP (U2) binding to the 3’ splice site and U1 snRNP (U1) binding to the downstream 5’ splice site. Subsequently adjacent exons are defined across the intron. In both recognition steps the conserved SR family of splicing factors functions as bridging proteins.
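The boundary rules described in this section can be sketched in a few lines of Python. The example below labels an intron by its terminal dinucleotides and joins the flanking exons. The function names, the coordinates and the toy sequences are hypothetical, and the sketch works on the DNA alphabet because the text quotes the boundaries as GT and AG. As noted above, the terminal dinucleotides alone do not determine whether the U2-type or the U12-type spliceosome removes the intron, so the label is only a description of the intron ends.

```python
def classify_intron(intron_dna):
    """Label an intron by its terminal dinucleotides (DNA alphabet)."""
    seq = intron_dna.upper()
    ends = (seq[:2], seq[-2:])
    if ends == ("GT", "AG"):
        return "GT-AG"
    if ends == ("AT", "AC"):
        return "AT-AC"
    return "other"

def splice(pre_mrna_dna, introns):
    """Remove introns given as (start, end) half-open intervals and join the exons."""
    exons, pos = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna_dna[pos:start])  # exon preceding this intron
        pos = end                              # resume after the intron
    exons.append(pre_mrna_dna[pos:])           # last exon
    return "".join(exons)

# Toy gene: exon 1 + one GT...AG intron (17 nt) + exon 2.
gene = "ATGGCC" + "GTAAGTCTAACTTTCAG" + "GATTGA"
print(classify_intron(gene[6:23]))  # GT-AG
print(splice(gene, [(6, 23)]))      # ATGGCCGATTGA
```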

Alternative splicing as a mechanism to generate protein diversity

By combining different 5’ and 3’ splice sites in a pre-mRNA, several alternatively spliced cytoplasmic mRNAs can be generated from a single nuclear gene (Figure 4). This, of course, means that multiple proteins with different primary amino acid sequences and biological activities can be produced by alternative RNA splicing. The impact of alternative splicing on the coding capacity of eukaryotic genes is impressive. For example, the Drosophila DSCAM gene, which encodes an axon guidance receptor, can produce more than 38 000 DSCAM protein isoforms by alternative splicing of the DSCAM pre-mRNA. Introns subjected to alternative RNA splicing typically contain suboptimal splicing signals, and hence require splicing enhancer elements for activity. Splicing of such introns is often subjected to a temporal, developmental or tissue-specific regulation. SR proteins are a family of essential splicing factors that also have an important role in regulating alternative RNA splicing, by binding to exonic splicing enhancer elements. SR proteins are highly phosphorylated and their activity as splicing enhancer proteins is regulated through reversible phosphorylation.

Today it is accepted that alternative RNA splicing is an important mechanism in the regulation of gene expression during growth and development in eukaryotic cells. A large number of eukaryotic genes have been shown to give rise to alternatively spliced mRNAs; examples include growth factors, growth factor receptors, intracellular messengers, transcription factors, oncogenes and muscle proteins. The number is increasing every month and it has been estimated that as many as 35% of human pre-mRNAs are alternatively spliced.

Figure 3 Pre-mRNA splicing is catalysed in large ribonucleoprotein complexes termed spliceosomes. The figure illustrates the sequential addition of U snRNPs into the spliceosome. In a first step U1 and U2 snRNPs, together with other non-snRNP splicing factors, recognize the exon–intron borders in the pre-mRNA, thereby forming a pre-spliceosome complex. The mature spliceosome is formed after addition of the triple U4/U6/U5 snRNP. Boxes represent exons and the intron is shown as a thin line.

Figure 4 Different alternative splicing pathways. Illustrated are the different ways pre-mRNAs can be alternatively spliced: (a) by combining alternative 5’ splice sites with a common 3’ splice site, (b) by combining alternative 3’ splice sites with a common 5’ splice site, (c) exon inclusion or skipping, (d) usage of mutually exclusive splice sites. Exon sequences are shown as large boxes and intron sequences as wavy lines.
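The DSCAM figure quoted above follows from simple multiplication when exons are chosen in a mutually exclusive fashion (Figure 4d): one alternative is selected at each variable position, so the numbers of alternatives multiply. The short Python sketch below shows the arithmetic. The cluster sizes are not given in this article; they are taken here as an assumption from commonly cited descriptions of the Drosophila gene (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17).

```python
from math import prod

# Assumed cluster sizes (not stated in the text): number of mutually
# exclusive alternatives at each variable exon position of DSCAM.
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

def isoform_count(clusters):
    """One alternative is chosen per cluster, so the counts multiply."""
    return prod(clusters.values())

print(isoform_count(DSCAM_CLUSTERS))  # 38016
```

Under these assumed cluster sizes the product is 38 016, consistent with the "more than 38 000" isoforms mentioned in the text.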

Polyadenylation of mRNA

Formation of the mRNA 3’ end starts after the RNA polymerase transcribes past the site that specifies the 3’ end of the mature mRNA. Sequences in the pre-mRNA are then recognized as targets for an endonucleolytic cleavage, followed by a nontemplated addition of 100–200 A residues to the 3’ end, thus forming the poly(A) tail. Two sequence elements, a highly conserved hexanucleotide sequence (AAUAAA) located 10–30 nucleotides upstream of the cleavage site and a second GU-rich sequence motif 20–40 bases downstream of the cleavage site, serve as recognition signals for recruitment of the cleavage/polyadenylation factors.

Many eukaryotic genes contain multiple potential poly(A) sites. Therefore, the precise usage of one or another poly(A) site can be used to regulate gene expression; so-called alternative poly(A) site usage. For example, if usage of the first poly(A) signal in a pre-mRNA is suppressed, a second poly(A) signal located further downstream in the transcription unit can be used. This may lead to inclusion of a novel exon in the mRNA. This is the case, for example, in the production of secreted or membrane-bound forms of immunoglobulin M. Alternatively, a new translational reading frame, encoding another protein, may be spliced from such a pre-mRNA. It is currently not clear whether RNA splicing dictates alternative poly(A) site usage or if selection of a poly(A) site results in alternative splicing of the pre-mRNA.

The poly(A) tail has been suggested to influence virtually every aspect of mRNA processing and metabolism. For instance it has been proposed to confer mRNA stability, increase translation efficiency and to be important, if not vital, for nucleocytoplasmic transport of the mRNA.
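The two recognition signals described in this section lend themselves to a simple sequence scan. The Python sketch below lists each AAUAAA hexanucleotide together with the window in which cleavage would be expected, and checks whether the region downstream of a chosen cleavage site is GU-rich. Several details are assumptions made for illustration only: the function names, measuring the 10–30 nucleotide distance from the 3’ end of the hexanucleotide, and the 70% G/U threshold used to call a stretch "GU-rich".

```python
import re

def poly_a_candidates(rna):
    """Yield (hexamer_start, earliest_cleavage, latest_cleavage) tuples."""
    # Lookahead so that overlapping AAUAAA matches are all reported.
    for m in re.finditer(r"(?=AAUAAA)", rna):
        hex_end = m.start() + 6
        # Cleavage is expected 10-30 nt downstream of the hexanucleotide.
        yield m.start(), hex_end + 10, hex_end + 30

def gu_rich(rna, cleavage, min_fraction=0.7):
    """Check the stretch 20-40 nt downstream of a cleavage site for G/U content."""
    window = rna[cleavage + 20:cleavage + 40]
    if not window:
        return False
    return sum(base in "GU" for base in window) / len(window) >= min_fraction

# Toy pre-mRNA 3' region: spacer, AAUAAA, spacer, then a GU-rich element.
toy = "C" * 10 + "AAUAAA" + "C" * 40 + "GUGUUGUGUUGUGUUGUGUU"
print(list(poly_a_candidates(toy)))  # [(10, 26, 46)]
print(gu_rich(toy, 36))              # True
```

A gene with several poly(A) sites would simply return several candidates; which one is used in the cell depends on regulation that a sequence scan of this kind cannot capture.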

mRNA as a Ribonucleoprotein Particle

In higher eukaryotes mRNA never occurs as naked RNA in the cell. Instead pre-mRNAs associate with proteins concomitantly with transcription, to form large RNA–protein complexes (Figure 5). These particles are usually referred to as heterogeneous nuclear ribonucleoprotein particles (hnRNPs) or pre-mRNPs.

The major protein components of hnRNP particles are the hnRNP proteins, which is the collective term for proteins that bind to pre-mRNA but are not stable components of other classes of RNP, such as snRNP. In humans there are about 20 major hnRNP proteins, termed A1–U. Other proteins that bind to pre-mRNA include snRNPs, the nuclear cap-binding complex, splicing factors and poly(A)-binding proteins. The proteins in hnRNP condense the RNA in a manner resembling the DNA–protein complexes of nucleosomes. Some proteins in the hnRNP particles facilitate and regulate processing of the pre-mRNA and enable the mRNA to be exported from nucleus to cytoplasm. Other functions of the hnRNP proteins may be to protect the RNA against nuclease degradation.

Once the mRNA is completely processed it is destined for nucleocytoplasmic export. The mRNP particle is transported to the nuclear pores, which are the channels through which all macromolecular traffic between the nucleus and cytoplasm takes place. The mRNP docks at the pores and becomes at least partially unfolded (Figure 5). Some hnRNP proteins and splicing factors dissociate from the mRNP at this stage, whereas others remain associated with the transcript through its transport to the cytoplasm. mRNP transport through the pore has been shown to be initiated with the 5’ end of the RNA. At the cytoplasmic side the mRNA associates with an at least partly different set of proteins, including the ribosome.

Intronless and Nonpolyadenylated mRNA Transcripts

It is believed that most mRNAs need to mature via pre-mRNA splicing and polyadenylation in order to be efficiently exported to the cytoplasm. In agreement with this hypothesis, removal of all introns from a normally intron-containing gene results in decreased polyadenylation efficiency and poor nucleocytoplasmic transport of the intronless transcript. However, few protein-coding genes in higher eukaryotes are intronless and even fewer do not contain polyadenylation signals. Intronless genes include heat shock genes, interferon α, c-jun, histones and a number of viral genes. Studies of both cellular and viral genes have shown that mRNAs encoded by intronless genes are exported via a pathway separate from that used by mRNAs that mature via splicing. For example, the intronless Herpes simplex virus (HSV) thymidine kinase (TK) mRNA has been found to contain RNA elements that promote efficient polyadenylation and nucleocytoplasmic transport in the absence of introns. Removal of these RNA elements from the TK pre-mRNA results in poor polyadenylation and severely reduced nucleocytoplasmic export of the mRNA. It has been shown that specific cellular hnRNP proteins bind to the TK mRNA and these may therefore promote polyadenylation and export of this mRNA. It is thus possible that intronless HSV mRNAs have hijacked a cellular mRNA transport pathway normally used by intronless cellular genes.

Intronless genes are rare, but even less frequent are nonpolyadenylated cellular mRNAs. The classical examples are the histone mRNAs that are neither spliced nor polyadenylated. Histone genes are transcribed from continuous gene segments and a U7 snRNP complex trims the 3’ ends of the pre-mRNAs. Histone genes are mainly transcribed during the S phase, when expression of most cellular genes is suppressed. As will be discussed in more detail below, some cytoplasmic replicating viruses express mRNAs that are nonpolyadenylated.

Figure 5 mRNP assembly and nucleocytoplasmic transport. The figure illustrates how pre-mRNAs and mRNAs in eukaryotes immediately become associated with proteins upon transcription, forming mRNP, which is then partly dissociated upon transport through the nuclear pore into the cytoplasm.

Synthesis and Processing of mRNA in Viral Systems

Here we will describe a few examples of how animal viruses use the host cell biosynthetic machinery to enhance virus multiplication. A typical virus encodes a few regulatory proteins that modify the host cell biosynthetic machinery to achieve a selective production of viral gene products. Thus, studies of viral regulation of gene expression have taught us much about the general mechanisms that control gene expression, and the lessons learnt from these studies can be applied to the normal cell.

Virus regulation of pre-mRNA splicing

All DNA viruses, except members of the poxvirus family, replicate in the host cell nucleus. Viruses that replicate in the nucleus are dependent on the host cell mRNA processing and export machinery in order to achieve a complete gene expression. Some of these viruses manipulate the nuclear pre-mRNA processing machinery in order to ensure that alternative splicing and polyadenylation of their pre-mRNAs take place. Perhaps the best-studied case is adenovirus, which is a nuclear replicating virus. Its lytic life cycle is, by convention, divided into an early and a late phase, which are separated by the onset of viral DNA replication. During the lytic life cycle adenovirus uses a series of posttranscriptional regulatory events, such as alternative pre-mRNA splicing, alternative polyadenylation and selective nucleocytoplasmic mRNA export, to enhance viral gene expression. The alternative pre-mRNA processing events (polyadenylation and splicing) are necessary in order for the virus to express the full range of viral genes. This is particularly obvious for the pre-mRNA transcribed from the major late transcription unit (MLTU). This pre-mRNA encompasses approximately 80% of the viral genome and it is alternatively spliced and polyadenylated to generate around 20 mRNA species. These mRNAs are grouped into five families depending on which alternative polyadenylation site is used to generate the mRNA (Figure 6). Within each mRNA family multiple alternatively spliced mRNAs are produced. The production of this complex set of MLTU mRNAs is subjected to a tight temporal regulation at the level of alternative RNA splicing. Several virus-encoded proteins that redirect the host cell processing machinery to ensure efficient virus mRNA production have been identified. One of them, the viral E4-ORF4 protein, has been shown to enhance virus-specific pre-mRNA splicing by functionally disabling the cellular SR protein family of splicing factors.

Virus-induced inhibition of RNA splicing

Herpes simplex virus (HSV) has a relatively large double-stranded DNA genome and replicates and assembles new virus particles in the host cell nucleus. HSV genes expressed during a lytic infection are by convention categorized as immediate early, early and late (α, β and γ), depending on the timing of their expression. Perhaps the most remarkable feature of the HSV genome is that despite the fact that the virus mRNAs are synthesized in the host cell nucleus, only four out of approximately 80 HSV genes contain introns. Also, with one exception, the intronless genes are expressed late after infection. Since host cell genes contain introns, HSV enhances viral gene expression by inactivating RNA splicing. For this purpose HSV encodes an α gene termed ICP27, which shuts down cellular gene expression by inhibiting RNA splicing. Other nuclear replicating viruses, such as adenovirus and influenza virus, have also been shown to shut off host cell gene expression by inhibiting RNA splicing.

Figure 6 The major late transcription unit (MLTU) from adenovirus. The figure shows how the large pre-mRNA from MLTU is alternatively polyadenylated to generate five different mRNA families, L1–L5. Within each of these mRNA families alternative 3’ splice sites are used to generate multiple mRNA species. In the inset are the two major mRNAs from the L1 region. A proximal 3’ splice site in L1 is used at all stages after infection, whereas a distal 3’ splice site is only activated at late infection phase.

Virus-regulated mRNA export

Many nuclear replicating viruses target the nucleocytoplasmic mRNA export pathway to achieve a selective accumulation of viral mRNAs in the cytoplasm. The best-studied virus is Human immunodeficiency virus (HIV), which is a nuclear replicating retrovirus with a single-stranded RNA molecule as genome. During its lytic life cycle HIV needs to produce and export a complex set of spliced mRNA, partially unspliced pre-mRNA and unprocessed genome RNA. The export of unprocessed RNA is essential since this RNA represents the HIV genome that is incorporated into new virus particles in the cytoplasm. Normally pre-mRNAs are retained in the nucleus until completely spliced, but HIV overcomes this export restriction by a combination of inherently weak splicing signals and expression of the Rev protein. Rev is an RNA-binding protein that binds to a sequence element, termed the Rev responsive element (RRE), which is located in the intron of the HIV pre-mRNA. HIV RNAs bound to Rev protein are released from the normal processing pathway and exported in their unspliced form to the cytoplasm.

Other viruses like adenovirus have also been shown to manipulate the nucleocytoplasmic transport machinery. At late stages of lytic adenovirus infection two early viral proteins, E1B 55 kDa and E4 orf6, mediate a selective export of viral mRNAs. These two proteins form a complex that functions as a gatekeeper, allowing export of viral mRNAs whereas transport of cellular transcripts is blocked, thus leaving the translation machinery exclusively to the virus.

Cytoplasmic replicating viruses regulate translation

Gene expression of viruses with an exclusive cytoplasmic life cycle differs fundamentally from that of their host in many aspects. For instance, they carry their own RNA polymerase, they do not require nucleocytoplasmic transport of their mRNAs since they replicate in the cytoplasm, and their pre-mRNAs do not contain introns. Some cytoplasmic replicating viruses have additional unique features. For example, reovirus encodes nonpolyadenylated mRNAs. Picornavirus mRNA is unique in that the genomic RNA, which functions as the mRNA, does not contain a cap. Instead, picornavirus mRNA contains a viral protein covalently bound to the 5’ end. In picornavirus-infected cells a viral protease degrades the cytoplasmic CBC, which results in total shutdown of cap-dependent protein synthesis and, as a result, inhibition of host cell gene expression. To recruit ribosomes to the picornavirus mRNA, the RNA contains a highly structured sequence element, called the internal ribosomal entry site (IRES). Translation of IRES-containing mRNAs mechanistically resembles translation initiation in bacteria in that the IRESs function as landing sites for ribosomes, which allows for cap-independent internal initiation of translation. A variety of viruses use this mechanism to initiate translation, but very few cellular mRNAs contain such an IRES. In eukaryotes, ribosomes typically bind to the cap structure and recognize the cap-proximal AUG through a scanning mechanism.

Trans-splicing

In lower eukaryotes, like trypanosomes, all mRNAs mature via a trans-splicing mechanism. A short spliced-leader RNA (SL RNA), transcribed from a separate gene, is added to each mRNA as a 5’ noncoding leader sequence (Figure 7a). The SL RNA trans-splicing reaction proceeds via a two-step mechanism resembling that of cis-splicing. Indeed, the U2, U4 and U6 snRNP particles have been shown to participate in the trans-splicing reaction. It is thought that the SL RNA functionally substitutes for the U1 snRNP in the trans-splicing reaction. U5 snRNP is probably not present, but RNA–protein complexes with similar functions have been identified. The SL RNA, which is related to U1 snRNA both in size and structure, facilitates translation of the mRNA and provides the mRNA with a tri-methylated cap nucleotide at the 5’ end.

Trans-splicing has also been identified in higher eukaryotes, such as Caenorhabditis elegans. Here some polycistronic pre-mRNAs are processed using alternative polyadenylation sites together with cis- and trans-splicing. The SL1 RNA is trans-spliced only to the proximal exon, whereas another spliced leader RNA, SL2, is used to splice to the distal exons (Figure 7b). It has been estimated that as many as 25% of all genes in C. elegans exist in such operon-like structures and in some instances genes that are expressed from the same polycistronic pre-mRNA do perform related functions. No natural examples of trans-splicing in mammalian genes have, so far, been reported. However, in experimental model systems trans-splicing also works in human cells. Thus, there are no reasons to presume that future studies will not reveal mammalian genes that are subjected to trans-splicing.


Non-nuclear pre-mRNA Processing in Eukaryotes

Mitochondria and chloroplasts contain their own complete genetic systems. Although nuclear genes encode the majority of organelle proteins, some proteins are encoded by organelle DNA, transcribed by organelle RNA polymerase and translated into proteins by organelle ribosomes. In mitochondria a large number of different mechanisms, some of which are not even found for nuclear genes, are used to ensure proper gene expression.

Mitochondrial-encoded mRNAs are uncapped but contain a polyadenylated 3′ end. The poly(A) tail is added posttranscriptionally by a mitochondrial poly(A) polymerase. Mitochondrial protein-coding genes in lower eukaryotes contain intron sequences that often belong to one of two self-splicing intron families; the catalytic activities for splicing are confined within the RNA molecules. These introns are highly structured stretches of RNA that fold into a three-dimensional shape that facilitates the splicing reactions. Some of these pre-mRNA transcripts are polycistronic, but unlike in the nematode nuclear trans-splicing reaction, these transcripts are cleaved by an endonuclease, which generates multiple monocistronic messages.

Posttranscriptional regulation in mitochondria of higher plants is even more complex. Here some of the most fundamental ideas of genetics are being challenged. Through a series of different mechanisms, the final mRNA of a plant mitochondrial protein may bear very little resemblance to the genomic sequence. Aside from the above-mentioned mitochondrial mRNA processing, plants possess the capacity to edit individual RNA bases after transcription, as well as to trans-splice exons from different transcripts to each other. The RNA-editing machinery can remove, add or change nucleotides within the mRNA, which will then code for different amino acids in the final product. RNA editing is not unique to higher plants, but they use it most prominently. Different organisms show slight variations in their mechanisms for RNA editing. In some protozoa posttranscriptional insertion of uridine nucleotides into the pre-mRNA is known to occur, as well as deletion of other RNA sequences and/or nucleotides. Insertions of cytidine nucleotides into pre-mRNA are found in Physarum mitochondria. Cytidine to uridine substitution has also been found in higher plants and in mammalian nuclear pre-mRNA.
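How a single edited base changes the encoded product can be shown with a minimal Python sketch of cytidine to uridine substitution. The sequence is made up, the codon table lists only the codons this example needs, and the function names are illustrative assumptions. Here the edit turns a glutamine codon into a stop codon, so the edited message encodes a shorter product than the genomic sequence would predict.

```python
# Toy illustration of C-to-U RNA editing changing the encoded product.
# Only the codons needed for this example are listed; the sequence is made up.
CODON_TABLE = {"AUG": "Met", "CAA": "Gln", "GGU": "Gly", "UAA": "Stop"}


def edit_c_to_u(rna, position):
    """Return rna with the cytidine at the 0-based position replaced by uridine."""
    if rna[position] != "C":
        raise ValueError("editing site is not a cytidine")
    return rna[:position] + "U" + rna[position + 1:]


def translate(rna):
    """Translate codon by codon, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


unedited = "AUGCAAGGU"
edited = edit_c_to_u(unedited, 3)  # CAA (Gln) becomes UAA (stop)
print(translate(unedited))  # ['Met', 'Gln', 'Gly']
print(translate(edited))    # ['Met']
```

Insertion or deletion editing could be sketched the same way, but it would shift the reading frame rather than alter a single codon.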

Conclusions

In eukaryotes mRNAs mature after extensive processing of a primary transcript, the so-called pre-mRNA. In the vast majority of cases, these processes result in capped 5′ ends and polyadenylated 3′ ends. In addition, most pre-mRNAs contain intervening sequences (introns) in the body of the polynucleotide, which are removed via a splicing mechanism. All three processes take place in the cell nucleus, co-transcriptionally, and prior to nucleocytoplasmic transport of the mRNA. Recent research has established that pre-mRNA processing and mRNA transport, stability and translatability are important and common targets for the regulation of gene expression in higher eukaryotes. Also, viruses that infect metazoan cells often target mRNA-processing events in order to maximize production of viral proteins and, in some cases, to shut down host cell gene expression. Viruses have for this reason turned out to be valuable tools in pre-mRNA processing and transport studies. By dissecting how viruses manipulate a particular mRNA-processing event (e.g. splicing), a better understanding of the process itself can be obtained.

Figure 7 cis- and trans-splicing of mono- and polycistronic pre-mRNAs. (a) In trypanosomes a common RNA species, the SL-1 RNA (green circle with line), is trans-spliced to all pre-mRNAs. The figure illustrates how SL-1 RNA is transcribed from a separate gene and trans-spliced to two different pre-mRNAs (red and green boxes with introns as brown lines). (b) In Caenorhabditis elegans a combination of trans- and cis-splicing is used. A single polycistronic pre-mRNA is trans-spliced to two different RNA species, SL-1 (green circle with line) and SL-2 (blue circle with line).


