Genomic structure and expression of the human fau gene: Encoding the ribosomal protein S30 fused to a ubiquitin-like protein

Vol. 187, No. 2, 1992

September 16, 1992

BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS

Pages 927-933

GENOMIC STRUCTURE AND EXPRESSION OF THE HUMAN FAU GENE: ENCODING THE RIBOSOMAL PROTEIN S30 FUSED TO A UBIQUITIN-LIKE PROTEIN†

Koen Kas, Luc Michiels, and Jozef Merregaert*

University of Antwerp, Dept. Biochemistry, Lab. of Molecular Biotechnology, Universiteitsplein 1, B-2610 Wilrijk, Belgium

Received July 30, 1992

SUMMARY: The fau gene is the cellular homolog of the fox sequence in the Finkel-Biskis-Reilly Murine Sarcoma Virus (FBR-MuSV). This virus acquired the fau sequence in its reversed transcriptional orientation. Human and mouse fau cDNAs were identified and both encode a new protein of 133 AA. We show that fau (for FBR-MuSV associated ubiquitously expressed gene) is expressed in all tissues tested as a messenger of about 600 nucleotides, and we report the genomic structure of the human fau gene. The gene consists of five exons and four introns, and the 5' untranslated region displays features characteristic of a housekeeping gene. Fau encodes the ribosomal protein S30 fused to a Ubiquitin-like protein. © 1992 Academic Press, Inc.

One of the main topics in molecular oncology is the search for the genomic origin and function of sequences acquired by transforming retroviruses (viral oncogenes). One of these viruses is the Finkel-Biskis-Reilly Murine Sarcoma Virus (FBR-MuSV), which induces bone tumours in susceptible mice (1). This retrovirus has acquired part of the mouse c-fos gene and an unrelated sequence called fox (2,3). The genomic origin of this sequence is the fau gene (for FBR-MuSV associated ubiquitously expressed gene). FBR-MuSV acquired this fau sequence in its reversed transcriptional orientation (4). This antisense fau sequence, which probably interferes with the normal constitutive expression of fau, increases the transforming capacity of FBR-MuSV in vitro twofold (4). Human and mouse fau cDNAs were characterized and encode a new protein of 133 AA. This Fau protein shows strong similarity to the fusion protein Ubiquitin-CEP. The first 74 AA of Fau are 56.8% homologous to Ubiquitin, while the remaining part shares some physical resemblance to the Carboxyl Extension Proteins (CEPs). Ubiquitin-CEP is processed in the cytoplasm, after which the Carboxyl Extension Protein migrates to the nucleus, where it combines with ribosomal RNA to form ribosomal subunits. One of the two known CEPs was identified as ribosomal protein S27a (5). The fate of the multifunctional protein Ubiquitin in the nucleus is still unknown. Two possibilities are binding to histone H2A or H2B, with a role in transcription regulation (6), or the degradation of nuclear oncoproteins (7).

†Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. X65921.

*To whom correspondence should be addressed.

0006-291X/92 $4.00

Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

In this study we present the genomic structure and expression of the human fau gene. The gene consists of 5 exons and 4 introns and encodes the ribosomal protein S30 fused to a Ubiquitin-like protein. The structure of the 5' non-coding region suggests that fau is a housekeeping gene, which is supported by its ubiquitous expression.

MATERIALS AND METHODS

Cosmid library and isolation of fau containing clones

The cosmid library (CML-DNA partially digested with Sau 3A and ligated in the Bam HI site of pJB8) is described in (8). Radioactive probes were prepared by 32P-labeling of the human fau cDNA fragment by nick translation, yielding probes with a specific activity of 10⁸ cpm/µg of DNA. Filters containing recombinant clones were screened according to the procedure of Grunstein and Wallis (9). They were prehybridized for 3 h at 65°C in 5xSSPE (20xSSPE = 3.6 M NaCl, 0.2 M NaH2PO4, 20 mM Na2EDTA, adjusted to pH 7.7), 5xDenhardt's solution (100xDenhardt's = 2% ficoll, 2% polyvinylpyrrolidone, 2% BSA), 0.2% SDS and 100 µg/ml competitor DNA (herring sperm DNA, sonicated and denatured in advance), and then hybridized with 32P-labeled cDNA insert (10⁸ cpm/µg) at 65°C for 20 h. Labeling was performed using a Nick Translation kit (Boehringer Mannheim) according to the method of Rigby et al. (10). The filters were exposed to Fuji X-ray film with an intensifying screen at -70°C for 1 day. Plasmid DNA of the positive clones was prepared according to Birnboim and Doly (11).
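The working concentrations implied by the stock recipes above follow from a simple dilution. The sketch below is illustrative only: the stock compositions are those given in the text, whereas the function names and the 100 ml final volume are assumptions introduced for the example.

```python
# Stock compositions as stated in the Methods.
SSPE_20X = {"NaCl (M)": 3.6, "NaH2PO4 (M)": 0.2, "Na2EDTA (mM)": 20.0}
DENHARDT_100X = {"ficoll (%)": 2.0, "polyvinylpyrrolidone (%)": 2.0, "BSA (%)": 2.0}


def dilute(stock, stock_x, working_x):
    """Scale every component of an N-fold stock down to the working strength."""
    factor = working_x / stock_x
    return {name: round(conc * factor, 6) for name, conc in stock.items()}


def stock_volume_ml(final_volume_ml, stock_x, working_x):
    """Volume of stock needed to reach the working strength in the final volume."""
    return final_volume_ml * working_x / stock_x


sspe_5x = dilute(SSPE_20X, 20, 5)
denhardt_5x = dilute(DENHARDT_100X, 100, 5)

print("5xSSPE:", sspe_5x)
print("5xDenhardt's:", denhardt_5x)

# Hypothetical 100 ml of prehybridization mix (volume chosen for illustration).
print("20xSSPE needed (ml):", stock_volume_ml(100, 20, 5))
print("100xDenhardt's needed (ml):", stock_volume_ml(100, 100, 5))
```

Under these figures, 5xSSPE corresponds to 0.9 M NaCl, 0.05 M NaH2PO4 and 5 mM Na2EDTA, and 5xDenhardt's to 0.1% of each component.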

Restriction Enzyme analysis, Southern Blotting and Northern Blotting

DNA digested with restriction endonucleases was subjected to electrophoresis on 1% agarose gels containing ethidium bromide in TAE buffer (40 mM Tris, 5 mM NaAc, 2 mM EDTA, adjusted to pH 7.6). After electrophoresis, gels were alkali-denatured and neutralized. For Southern blotting (12), nylon membranes (Hybond N+, Amersham) were used. Hybridization with 32P-labeled DNA probes was carried out under the same conditions as the cosmid library screening. Northern blot hybridization was carried out at 42°C in a hybridization mixture containing 50% formamide.

DNA Sequencing

A DNA fragment originating from the genomic DNA insert of the cosmid clone (15.1) was subcloned in the pGEM4-Z vector (Promega). DNA fragments from this construct were subsequently cloned in pGEM4-Z. The vector was linearized, and ligation was carried out for 2 h at 15°C (for sticky ends) or 20 h at room temperature (for blunt ends). The ligated DNA was used to transform E. coli SURE cells (Stratagene) by electroporation. Sequence was obtained using the dideoxy chain termination method (13) with Sequenase II (USB) as described by the manufacturer.

RESULTS AND DISCUSSION

Molecular cloning of a human fau gene

To obtain human fau genomic clones, 32P-labeled human fau cDNA was used to screen 3 haploid genome equivalents of a human CML genomic cosmid library. After two rounds of screening, 11 positive clones remained. Restriction enzyme analysis and successive hybridization with the human fau cDNA probe showed that all these clones contained the same fau "locus". We chose a fau-specific 6.3 kbp Hind III fragment (from cosmid clone 15.1) for subcloning in pGEM4-Z. Digestion of total human genomic DNA with Hind III gives rise to 4 fau-specific fragments, one of which is also 6.3 kbp in length. For sequencing purposes, a series of overlapping subclones was generated, all cloned in pGEM4-Z.
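As a rough guide to what screening 3 haploid genome equivalents means, the standard Clarke-Carbon estimate gives the probability that a given single-copy sequence is represented among N random clones. The sketch below is illustrative only: the haploid genome size (3 x 10⁹ bp) and the average cosmid insert size (40 kbp) are assumptions not stated in the text, and the function names are hypothetical.

```python
import math

GENOME_BP = 3.0e9    # assumed haploid human genome size (not given in the text)
INSERT_BP = 40_000   # assumed average cosmid insert size (not given in the text)


def clones_for_coverage(genome_bp, insert_bp, genome_equivalents):
    """Number of clones whose inserts add up to the requested genome equivalents."""
    return round(genome_equivalents * genome_bp / insert_bp)


def probability_represented(genome_bp, insert_bp, n_clones):
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N, with f = insert / genome."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


n = clones_for_coverage(GENOME_BP, INSERT_BP, 3)
p = probability_represented(GENOME_BP, INSERT_BP, n)
print(f"clones screened: {n}")
print(f"probability that a single-copy locus is represented: {p:.3f}")
```

For inserts that are small relative to the genome, the result depends almost only on the number of genome equivalents: 3 equivalents give P close to 1 - e⁻³, about 0.95.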

Sequence analysis of the human fau gene

The characterized fau gene consists of 5 exons separated by 4 introns (Fig. 1). There is one base-pair difference from the (placental) cDNA sequence, at the third position of a codon (position 1157: GTC instead of GTT), which does not affect the protein sequence. All of the intron donor and acceptor sites conform to the consensus rules. The exon sizes are 97, 83, 145, 56 and 177 bp, respectively; the intron sizes are 269, 94, 461 and 174 bp, respectively. The first exon (starting at the putative transcription initiation site) consists entirely of untranslated leader sequence. A single ATG initiation codon is present in exon 2. The human fau cDNA begins at position +49 with respect to the transcription initiation site. Using the promoter analysis software of PC Gene (Intelligenetics, USA), the putative CAP-site was located and the C residue within it was defined as position +1. A nucleotide sequence of approximately 0.4 kbp of the 5' flanking region of the fau gene was determined. This putative promoter region reveals a degenerate TATA-box (TACAAAT) (-32 to -26), a GC-box (SP-1 binding site) (-60 to -54) and a CAP-site. Two potential regulatory sequences were identified: the binding sites (C/A)GGAA for E74 at -179 to -175 and at -326 to -322. E74 is a nuclear factor which belongs to the ets gene family (14). Together with the ubiquitous expression of fau in different tissues, this promoter structure is a strong argument for fau being a housekeeping gene (15).
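The exon and intron sizes quoted above fix the coordinates of every exon boundary once exon 1 is taken to begin at position +1. The following sketch derives those coordinates; the sizes are from the text, while the function names are hypothetical and introduced only for illustration.

```python
# Sizes in bp as reported in the text, in 5' to 3' order.
EXON_SIZES = [97, 83, 145, 56, 177]
INTRON_SIZES = [269, 94, 461, 174]


def gene_layout(exons, introns, start=1):
    """Return (kind, number, first, last) tuples, 1-based and inclusive."""
    layout = []
    pos = start
    for i, exon in enumerate(exons):
        layout.append(("exon", i + 1, pos, pos + exon - 1))
        pos += exon
        if i < len(introns):
            layout.append(("intron", i + 1, pos, pos + introns[i] - 1))
            pos += introns[i]
    return layout


def feature_at(layout, position):
    """Name the exon or intron that contains a given position."""
    for kind, number, first, last in layout:
        if first <= position <= last:
            return kind, number
    return None


layout = gene_layout(EXON_SIZES, INTRON_SIZES)
for kind, number, first, last in layout:
    print(f"{kind} {number}: +{first}..+{last}")

print("spliced transcript length (bp):", sum(EXON_SIZES))
print("position 1157 lies in:", feature_at(layout, 1157))
```

With these sizes, intron 3 begins at position 689, position 1157 falls in exon 4, and the five exons add up to 558 bp, which is compatible with a polyadenylated messenger of about 600 nucleotides.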



[Figure 1: annotated nucleotide sequence of the human fau gene, including the 5' flanking region and exons 1 to 5, with the deduced amino acid sequence (133 residues) shown below the codons.]

Figure 1. Genomic structure of the human fau gene. The exon sequences in the gene are underlined and the amino acid sequence of the human Fau protein is given in the single-letter code below the codons. The amino acid residues are numbered in the right margin. The start codon and stop codon are marked in bold. The polyadenylation signal (AATAAA) is double underlined. Positive numbering starts at the putative transcription initiation site; negative numbers designate the 5' flanking promoter region. The asterisk marks the stop codon which, by alternative splicing, gives rise to a 75 AA Ubiquitin-like protein.

The name fau (FBR-MuSV associated ubiquitously expressed gene) was partially attributed to this observation.

Expression of the fau gene

The expression of the fau gene in eight different human tissues (pancreas, kidney, skeletal muscle, liver, lung, placenta, brain, heart) was analyzed by Northern blot hybridization using the 32P-labeled human fau cDNA probe. The transcript appears as a band of approximately 600 nucleotides in all these tissues (Fig. 2).

Figure 2. Expression pattern of fau in different human tissues. A Northern blot (Clontech) containing 2 µg of poly(A)+ RNA per lane from (lanes 1 to 8) pancreas, kidney, skeletal muscle, liver, lung, placenta, brain and heart, respectively, was hybridized with 100 ng of 32P-labeled cDNA insert. One distinct band of about 600 nucleotides is detected in all these tissues.

The structure of the protein encoded by the fau gene

The protein encoded by the fau gene shows strong similarity to the Ubiquitin-CEP fusion proteins. These proteins are processed in the cytoplasm, after which the Carboxyl Extension Protein (52 AA or 76-80 AA) migrates to the nucleus, where it combines with ribosomal RNA to form ribosomal subunits in the nucleolus. It is noteworthy that the cleavage site in these fusion proteins, after the Gly doublet (position 76), seems to be conserved in the Fau protein (position 74, also after a Gly doublet). If, by means of alternative splicing, the donor site at position 689 is ignored, a stop codon 3 bp downstream of position 73-74 gives rise to a 75 AA protein ending in Gly-Gly-Glu (Fig. 1). However, a messenger encoding such a protein has not been observed so far. Remarkably, most poly-Ubiquitin precursor proteins in higher eukaryotic species also contain an extra amino acid residue at the end of their last Ubiquitin repeat (Gly-Gly-X), the extra residue being different in poly-Ubiquitins from different species (16). More striking is the structural analogy of the fau gene to the Ubiquitin-CEP52 gene (17). Ubiquitin-CEP52 also consists of 5 exons separated by 4 introns, and its exon 2 also contains 8 bp of the 5' untranslated region. In addition, a consensus TATA-box is lacking in the promoter region. The CEP part shows no sequence homology to the corresponding Fau part (while CEP76-80 is ribosomal protein S27a (5), the exact nature of CEP52 is not yet known). Recently, the cDNA encoding the rat S30 ribosomal protein was sequenced (18). S30, part of the small ribosomal subunit, is fused to a Ubiquitin-like molecule and is in this configuration identical to Fau (Fig. 3). Thus, the fau gene encodes a Ubiquitin-like-S30 fusion protein. S30 shows no homology to any other known ribosomal protein.
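The position of the reading frame relative to the exon boundaries can be checked from the reported sizes. The sketch below is a consistency check rather than part of the original analysis. It assumes, as the comparison with Ubiquitin-CEP52 implies, that exon 2 of fau carries 8 bp of 5' untranslated sequence ahead of the ATG; the function names are hypothetical.

```python
# Sizes in bp as reported in the text.
EXON_SIZES = [97, 83, 145, 56, 177]
INTRON_SIZES = [269, 94, 461, 174]
UTR5_IN_EXON2 = 8      # assumption: 5' UTR bases in exon 2, as stated for CEP52
PROTEIN_LENGTH = 133   # residues, as reported


def exonic_positions(exons, introns):
    """Genomic coordinate (+1 = transcription start) of every exonic base."""
    positions, pos = [], 1
    for i, size in enumerate(exons):
        positions.extend(range(pos, pos + size))
        pos += size
        if i < len(introns):
            pos += introns[i]
    return positions


MRNA = exonic_positions(EXON_SIZES, INTRON_SIZES)
CDS_OFFSET = EXON_SIZES[0] + UTR5_IN_EXON2  # bases that precede the ATG


def codon_positions(codon_number):
    """Genomic coordinates of the three bases of a codon (1 = ATG)."""
    start = CDS_OFFSET + 3 * (codon_number - 1)
    return MRNA[start:start + 3]


# 3' untranslated bases left in exon 5 after 133 codons plus the stop codon.
utr3 = len(MRNA) - CDS_OFFSET - 3 * (PROTEIN_LENGTH + 1)

print("codon 1 (ATG):", codon_positions(1))
print("codon 74:", codon_positions(74))
print("codon 76:", codon_positions(76))
print("3' UTR in exon 5 (bp):", utr3)
```

Under this assumption, codon 74 is split by intron 3 (its first base is the last base of exon 3, at position 688), which agrees with the donor site at position 689 lying at position 73-74 of the protein, and position 1157 is the third base of codon 76.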

Most mammalian ribosomal protein genes are present in multiple copies, but just one of them seems to be a functional, intron-containing gene. The other copies are retropseudogenes without introns. The same is true for Ubiquitin-CEP52. There does not seem to be a need for more than one functional ribosomal protein gene in order to satisfy the cell's requirements for ribosomal protein mRNA synthesis (19). In analogy with the multiplicity of ribosomal protein genes, 3 more human fau loci were identified. However, instead of being retropseudogenes, these additional loci are intron-containing genes (L. Michiels, personal communication). Up to now, however, only one fau-specific transcript, of approximately 600 nucleotides, has been identified in all tissues tested.

H-FAU  - MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAGAPLED - 50
         |||||||||||| ||||||||||||||||||||||||||||||||.||||
RRFUSP - MQLFVRAQELHTLEVTGQETVAQIKAHVASLEGIAPEDQVVLLAGSPLED - 50
H-UBI  - MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLED - 52

H-FAU  - EATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKK - 100
         ||||||||||||||||||||||||||||||||||||||||||||||||||
RRFUSP - EATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKK - 100
H-UBI  - GRTLSDYNIQKESTLHLVLRLRGG - 76

H-FAU  - KKTGRAKRRMQYNRRFVNVVPTFGKKKGPNANS - 133
         |||||||||||||||||||||||||||||||||
RRFUSP - KKTGRAKRRMQYNRRFVNVVPTFGKKKGPNANS - 133

Figure 3. Alignment of Fau, S30 fusion protein and Ubiquitin. The amino acid sequence of the human Fau protein (H-FAU) is compared to the rat S30 fusion protein (RRFUSP) (18) and to human Ubiquitin (H-UBI) (17). Identical amino acids are indicated by | whereas . indicates similar residues (A,S,T; D,E; N,Q; R,K; I,L,M,V; F,Y,W).
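The match-line convention of the Figure 3 legend (a bar for identical residues, a dot for residues in the same similarity group) can be expressed in a few lines. This is a minimal sketch and not the software used for the original alignment. The similarity groups are those listed in the legend, the two test fragments are taken from the H-FAU and RRFUSP rows, and the function names are hypothetical.

```python
# Similarity groups as listed in the Figure 3 legend.
SIMILAR_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]


def match_symbol(a, b):
    """'|' for identical residues, '.' for similar ones, ' ' otherwise."""
    if a == b and a != "-":
        return "|"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "."
    return " "


def match_line(seq1, seq2):
    """Build the symbol line for two already aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return "".join(match_symbol(a, b) for a, b in zip(seq1, seq2))


def percent_identity(seq1, seq2):
    """Percentage of aligned columns that carry identical residues."""
    line = match_line(seq1, seq2)
    return 100.0 * line.count("|") / len(line)


# Fragments around the two human/rat differences visible in Figure 3.
print(repr(match_line("ELHTFEVT", "ELHTLEVT")))   # F versus L: no symbol
print(repr(match_line("LAGAPLED", "LAGSPLED")))   # A versus S: similar
print(percent_identity("LAGAPLED", "LAGSPLED"))
```

By the same arithmetic, the 56.8% figure quoted for the first 74 residues of Fau against Ubiquitin would correspond to 42 matching positions out of 74.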

ACKNOWLEDGMENTS

We thank Fons van Hasselt for excellent technical assistance, Dr. Gunter Weber for help with the Northern blot hybridization and Eitan Friedman for helpful suggestions on the manuscript. This work was financially supported by the Belgian "Nationaal Fonds voor Geneeskundig Wetenschappelijk Onderzoek" and by the "Kankerfonds" from the "Algemene Spaar en Lijfrente Kas". K. Kas is an Aspirant of the NFWO.

REFERENCES

1. Finkel, M.P., Reilly, C.A. and Biskis, B.O. (1975) Front. Radiat. Ther. Oncol., 10, 28-39

2. Van Beveren, C., Enami, S., Curran, T. and Verma, I.M. (1984) Virology, 135, 229-24

3. Michiels, L., Maisin, J.R., Pedersen, F.S. and Merregaert, J. (1984) Int. J. Cancer, 33, 511-517

4. Michiels, L., Van der Rauwelaert, E., Van Hasselt, F. and Merregaert, J. (submitted)

5. Redman, K.L. and Rechsteiner, M. (1989) Nature, 338, 438-440

6. Davie, J.R. and Murphy, L.C. (1990) Biochemistry, 29, 4752-4757

7. Ciechanover, A., DiGiuseppe, J.A., Bercovich, B., Orian, A., Richter, J.D., Schwartz, A.L. and Brodeur, G.M. (1991) Proc. Natl. Acad. Sci. U.S.A., 88, 139-143

8. Verbeek, J.S., Roebroek, A.J.M., van den Ouweland, A.M.W., Bloemers, H.P.J. and Van De Ven, W.J.M. (1985) Mol. Cell. Biol., 5, 422-426

9. Grunstein, M. and Wallis, J. (1979) Methods in Enzymology, 68, 379-389

10. Rigby, P.W., Dieckmann, M., Rhodes, C. and Berg, P. (1977) J. Mol. Biol., 113, 237-251

11. Birnboim, H.C. and Doly, J. (1979) Nucleic Acids Res., 7, 1513-1523

12. Southern, E. (1975) J. Mol. Biol., 98, 503-517

13. Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. U.S.A., 74, 5463-5467

14. Gutman, A. and Wasylyk, B. (1991) Trends in Gen., 7, 49-54

15. Dynan, W.S. (1986) Trends in Gen., 2, 196-197

16. Finley, D. and Varshavsky, A. (1985) Trends in Biochem. Sci., 10, 343-347

17. Baker, R.T. and Board, P.G. (1991) Nucleic Acids Res., 19, 1035-1040

18. Olvera, J. (unpublished). EMBL accession number X62671

19. Monk, R.J., Meyuhas, O. and Perry, R.P. (1981) Cell, 24, 4301-4306
