JOURNAL OF VIROLOGY, Apr. 1992, p. 2170-2179, Vol. 66, No. 4
0022-538X/92/042170-10$02.00/0
Copyright © 1992, American Society for Microbiology

Novel Human Endogenous Sequences Related to Human Immunodeficiency Virus Type 1

M. S. HORWITZ,1,2 M. T. BOYCE-JACINO,1 AND A. J. FARAS1,2*

Department of Microbiology and Institute of Human Genetics, University of Minnesota, Minneapolis, Minnesota 55455

Received 27 January 1991/Accepted 18 December 1991

* Corresponding author.

Endogenous retrovirus-related sequences exist within the normal genomic DNA of all eukaryotes, and these endogenous sequences have been shown to be important to the nature and biology of related exogenous retroviruses and may also play a role in cellular functions. To date, no endogenous sequences related to human immunodeficiency virus type 1 (HIV-1) have been reported. Herein we report, for the first time, the presence of nucleotide sequences related to HIV-1 in human, chimpanzee, and rhesus monkey DNAs from normal uninfected individuals. We also present the isolation and characterization of two of these endogenous HIV-1-related sequences, EHS-1 and EHS-2. With use of low-stringency Southern blot hybridization, complex banding patterns were detected in human DNA with 5' and 3' HIV-1-derived probes. When an HIV-1 env region probe was used, we detected a less complex, conserved banding pattern in human DNA as well as a related but distinct banding pattern in chimpanzee and rhesus monkey DNAs. EHS-1 and -2 were cloned from normal human genomic DNA libraries by using the env region probe. Clone EHS-1 shows sequence similarity with the domain of the envelope cellular protease cleavage site of HIV-1, while EHS-2 has sequence similarity to the overlapping reading frame for Rev and gp41. Stringent hybridization of EHS-1 back to primate genomic DNA indicates two distinct EHS-1 loci in normal human DNA, an identical band pattern in chimpanzee DNA, and a single locus in rhesus monkey DNA. Likewise, EHS-2 is present as a single highly conserved locus in all three species. An oligonucleotide derived from EHS-2 across a region of near identity to HIV-1 detects a complex banding pattern in all primates tested similar to that seen with the 3' HIV-1 probe. These data suggest that most of the HIV-1-related sequences identified in primate DNA share a common core of nucleic acid sequence found in both EHS-2 and rev and that some of these HIV-1-related sequences have additional larger regions of sequence similarity to HIV-1.

The genomes of eukaryotic organisms contain a wide variety of endogenous retroviruses and retroviruslike elements which are transmitted as heritable Mendelian elements yet exhibit structural and sequence similarities to infectious exogenous retroviruses (48). Endogenous retrovirus-related sequences can encode gene products that compete in trans for retrovirus function, can offer sites for recombination, and can embody a pool of related cellular genetic material from which the exogenous virus can evolve. Generally, endogenous retroviruses detected within a given species are most closely related to exogenous retroviruses which infect that host. Human endogenous retrovirus sequences have been identified by several approaches, including low-stringency hybridization to nonhuman exogenous and endogenous retroviruses (2, 4, 30, 34, 36), hybridization to the 3' terminus of tRNAs (16, 18), analysis of flanking regions of cellular genes (23, 24), and polymerase chain reaction (PCR) amplification of conserved retrovirus domains with use of mixed oligonucleotide primers (1, 40). Although the function of these endogenous retroviruslike sequences is unknown, their potential contribution to exogenous viral pathogenesis is suggested by several previously described endogenous-exogenous recombinations which produce viruses with new tropisms and pathology. For example, Harvey and Kirsten murine leukemia viruses resulted from recombination of exogenous murine leukemia virus with the endogenous VL30 elements (5, 10). Feline leukemia viruses also acquire altered tropism and pathogenesis from recombination with their endogenous counterparts (35). The acquisition of cellular oncogenes by retroviruses demonstrates that a cellular sequence merely needs to provide a desirable characteristic to bring about a selectable recombination event.

Because of the recent identification of several classes of human endogenous retroviruses and our interest in obtaining a better understanding of the evolution of human immunodeficiency virus (HIV), experiments were performed to detect the presence of HIV-1-related sequences in normal human DNA. To date, no endogenous sequences related to HIV-1 have been described. Such species, if detected, could represent either a related endogenous retrovirus or a cellular genetic sequence with a functional domain similar to one or more of those found in HIV-1. We have previously used reduced-stringency hybridization conditions to allow us to detect novel endogenous retroviruses exhibiting as little as 65% sequence similarity to the retroviral probes employed (8, 9). We report herein the initial detection and characterization of endogenous HIV-1-related sequences (EHS) in normal human DNA by using both low-stringency Southern blot hybridization and low-stringency hybridization screening of human genomic DNA lambda libraries.

MATERIALS AND METHODS

Cloned viral DNAs. Plasmid pN1GD4 is a proviral clone of HIV-1 containing a minor deletion within gag and was provided by D. Volsky (47). Subgenomic probes of HIV-1 were generated by deleting the EcoRI fragments of pN1GD4 to generate the 5' probe and alternatively the SalI fragment


to generate the 3' probe. The envelope-containing BglII probe was generated by digesting the target DNA with the appropriate enzyme and cloning into pUC13.

Preparation of DNA probes. Plasmid DNA was digested with an appropriate restriction enzyme and separated by gel electrophoresis. Gel-isolated DNA restriction fragments were radioactively labeled by incorporation of [32P]dCTP by the random-priming method (11) to a specific activity of at least 5 x 10 cpm/µg.

A synthetic oligonucleotide primer was generated to EHS-2 by the University of Minnesota Microchemical Facility. The 31-mer (AGAAATGGGTGGAGAGAGAGACAGAGACAGA) was end labeled (29) with [32P]dATP and T4 polynucleotide kinase to a specific activity of at least 5 x 10^8 cpm/µg.
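Basic properties of the 31-mer quoted above (length, base composition, and the complementary strand it would anneal to) can be checked with a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical and are not part of the published methods.

```python
# Illustrative sketch; all names here are hypothetical, not from the paper.
OLIGO_EHS2 = "AGAAATGGGTGGAGAGAGAGACAGAGACAGA"  # 31-mer quoted in the text

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print("length:", len(OLIGO_EHS2))
    print("GC fraction:", round(gc_fraction(OLIGO_EHS2), 3))
    print("reverse complement:", reverse_complement(OLIGO_EHS2))
```

The oligonucleotide is purine rich (13 G and 14 A out of 31 bases), which is consistent with its origin in the arginine-rich coding region discussed in Results.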

Preparation of eukaryotic genomic DNA. High-molecular-weight eukaryotic genomic DNA was extracted from human placental tissue, human peripheral blood lymphocytes (PBLs) extracted from blood samples, chimpanzee PBLs extracted from EDTA-treated blood samples provided by William Hobson at the Primate Research Institute at Holloman Air Force Base in New Mexico, and rhesus monkey spleen tissue provided by Dan Houser at the Wisconsin Regional Primate Research Center in Madison, Wis. Homogenized tissue or PBLs were resuspended in 2 volumes of buffer (0.2 M Tris-HCl [pH 8.2], 0.1 M EDTA, 0.5% sodium dodecyl sulfate [SDS], 500 µg of pronase per ml) and incubated at 37°C for 24 h. DNA was extracted several times with an equal volume of buffer-saturated phenol-chloroform. Sodium acetate to a final concentration of 0.2 M and 2 volumes of ethanol were added, and the genomic DNA was spooled out of solution. DNAs were RNase treated and subjected to another round of extraction and precipitation.

Nucleic acid hybridization. DNAs were digested with an appropriate restriction enzyme, fractionated by agarose gel electrophoresis, and transferred to Bio-Rad Zetaprobe nylon membrane in 20x SSC (3 M NaCl, 0.3 M sodium citrate) by a modification of the Southern procedure (43). Membranes with bound DNA were washed in 0.1x SSC-0.5% SDS prior to hybridization. The following standard prehybridization solution was used for all Southern blot hybridizations using a random-primed probe: 30 or 50% formamide (30% formamide for low stringency, 50% for high stringency), 1 M NaCl, 20 mM Tris-HCl (pH 7.4), 0.1% SDS, and 150 U of heparin (500 µg/ml) per ml (42). For hybridization, a fresh batch of prehybridization solution was prepared with the addition of 8% dextran sulfate and 5 x 10^6 cpm of 32P-labeled probe per ml and incubated for 12 to 15 h at 42°C. Blots were washed three times for 15 min each at room temperature in a solution of 2x SSC, 0.1% SDS, and 0.1% sodium pyrophosphate. High-stringency blots were washed an additional two times for 1 h at 50°C in 0.5x SSC-0.1% SDS-0.1% sodium pyrophosphate. Low-stringency blots were washed an additional two times for 1 h each at 50°C in 3x SSC-0.1% SDS-0.1% sodium pyrophosphate.

Blots probed with an oligonucleotide were prehybridized in a solution containing 1 M NaCl, 50 mM Tris-HCl (pH 7.5), 10% dextran sulfate, 1% SDS, and 100 µg of denatured sheared salmon sperm DNA per ml for 1 h at 55°C. After incubation, 2 x 10^6 cpm of 32P-end-labeled oligomer per ml was added, and hybridization was allowed to continue for 12 h at 55°C. Oligonucleotide-hybridized blots were washed initially like the DNA-probed blots and additionally for 1 h at 42°C in 3x SSC-0.1% SDS-0.1% sodium pyrophosphate. After washing, all blots were sealed in bags to prevent drying and exposed to X-ray film.

Prior to reprobing, blots were stripped of hybridizing probe by incubation in 0.4 N NaOH for 30 min at 42°C and then in 0.1x SSC-0.1% SDS-0.2 M Tris-HCl (pH 7.4) for 30 min at 42°C. Blots were exposed overnight to X-ray film to show that probe was completely removed.
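Because the wash buffers above are all expressed as multiples of SSC, it can help to convert them to molar salt concentrations when comparing stringency. The sketch below assumes only the 20x SSC definition given in the text (3 M NaCl, 0.3 M sodium citrate) and simple linear dilution; the function name and the step labels are hypothetical.

```python
from typing import Tuple

# 20x SSC as defined in the text: 3 M NaCl, 0.3 M sodium citrate.
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold: float) -> Tuple[float, float]:
    """Return (NaCl M, sodium citrate M) for an SSC buffer of the given fold."""
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


# SSC strengths named in the hybridization protocol; labels are illustrative.
STEPS = {
    "transfer": 20.0,
    "initial wash": 2.0,
    "low-stringency wash": 3.0,
    "high-stringency wash": 0.5,
    "post-strip rinse": 0.1,
}

if __name__ == "__main__":
    for label, fold in STEPS.items():
        nacl, citrate = ssc_molarity(fold)
        print(f"{label}: {fold}x SSC = {nacl:.4f} M NaCl, {citrate:.5f} M sodium citrate")
```

On this arithmetic the low-stringency wash (3x SSC, 0.45 M NaCl) contains six times the salt of the high-stringency wash (0.5x SSC, 0.075 M NaCl) at the same temperature, which is what allows partially mismatched hybrids to remain bound.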

Library screening and cloning of genomic DNA. Two human genomic libraries were screened for clones by using a low-stringency modification of the Benton-Davis plaque lift hybridization procedure (29). Lifts were hybridized like the membranes described above. A Charon 4 recombinant human genomic fetal liver DNA library was provided by T. Maniatis (20). Additionally, human placental DNA was completely digested with BamHI and fractionated on a salt gradient. Fractions were tested for hybridization, and an appropriate 15-kb fraction was half site filled in with dATP, dGTP, and Klenow fragment. This target DNA was ligated to lambda Gem 11 arms digested with XhoI and half site filled in with dCTP and dTTP (Promega Biotec Inc.). The ligation mixture was packaged into in vitro packaging extracts (29). Plate lysates were used to prepare lambda DNA for analysis and subcloning (29).

Subclones of recombinant lambda clones were constructed in pUC13, using appropriate restriction enzymes and standard ligation conditions. Subclones for sequencing were generated by using a double-stranded nested deletion kit purchased from Pharmacia.

DNA sequencing. Cesium chloride gradient-purified plasmid DNAs were sequenced by the dideoxy method of Sanger et al. (38) adapted for double-stranded templates and T7 DNA polymerase (Sequenase; United States Biochemical Corp.). Sequence data were analyzed by using a SUN computer equipped with Intelligenetics software. Programs used included FastDB, IFIND, GenAlign, ALIGN, GEL, SEQ, and PEP.

Nucleotide sequence accession numbers. The sequence data for EHS-1 and EHS-2 have been entered in the GenBank data base under accession numbers M85292 and M86246, respectively.

RESULTS

HIV-1-related sequence in primate DNA. Three probes (5', 3', and BglII) representing subgenomic regions of HIV-1 were subcloned from a cloned HIV-1 isolate, N1GD4, for use in hybridization analysis (Fig. 1). The 4.1-kb 5' probe extends from the first SacI site to the second EcoRI site and includes gag, pol, and vif. The 3.8-kb 3' probe extends from the SalI site to the SacI site in the long terminal repeat. This probe includes all of env as well as tat, rev, and nef. The 1.3-kb BglII probe encompasses the domains encoding all of gp41, the 3' end of gp120, and most of Nef. These probes were labeled with 32P and hybridized to a Southern blot of human genomic placental DNA under high- and low-stringency conditions as described in Materials and Methods. As expected, no hybridization was observed under conditions of high-stringency hybridization (blot not shown); however, several bands that hybridized to either the 5' or 3' HIV probe under low-stringency hybridization conditions were detected (Fig. 1A). The 3' HIV probe was capable of detecting stronger, more distinct bands than was the 5' HIV probe. Four major bands (9.5, 6.0, 4.4, and 3.2 kb) were detected in PstI-digested genomic DNA when probed with the 3' probe. To better define the portion of HIV-1 responsible for the major hybridization seen with the 3' probe, the BglII probe was hybridized under high- and low-stringency conditions to human, chimpanzee, and rhesus monkey genomic DNAs

[Fig. 1 image: Southern blots under low stringency, size markers in kb. Panel A: 5' and 3' probes hybridized to human DNA digested with BamHI, HindIII, or PstI. Panel B: BglII probe hybridized to human, chimpanzee, and rhesus DNAs digested with EcoRI, BamHI, or PstI.]

FIG. 1. Southern blot hybridization of HIV-1 probes to normal human genomic DNA. Probes used in this analysis are diagrammed at the top. The HIV-1 clone, N1GD4 (47), which contains a deletion in gag, was subcloned into three probes, 5', 3', and BglII. The 5' probe is a 4.1-kb SacI-EcoRI fragment, the 3' probe is a 3.8-kb SalI-SacI fragment, and the BglII probe is a 1.3-kb fragment. (A) Southern blot hybridization of the pN1GD4 5' and 3' probes to normal human DNA with various restriction enzymes under low-stringency conditions. Genomic DNA was extracted from normal human placental tissue. Twenty micrograms was digested with the indicated restriction endonuclease, electrophoresed, and Southern blot transferred to a Zetaprobe membrane (Bio-Rad). The blot was hybridized under low-stringency conditions with a 32P-labeled 5' or 3' probe. Blots were washed and exposed to Kodak XAR film for 3 days. Radioactively labeled molecular weight markers were run alongside the genomic DNA, and their sizes are indicated. (B) Southern blot hybridization of the pN1GD4 BglII probe to normal human, chimpanzee, and rhesus monkey DNAs under low-stringency conditions. Genomic DNA was extracted from normal human placenta, chimpanzee lymphocytes, and rhesus spleen tissue. The blot was prepared as for panel A except it was hybridized under low-stringency conditions with a 32P-labeled BglII env probe.

(Fig. 1B). No bands were detected under high-stringency conditions in human, chimpanzee, or rhesus monkey DNA with the BglII probe (blot not shown). The same four intense bands seen in the PstI-digested human DNA with the 3' HIV probe were detected by the BglII probe. In addition, bands were clearly detectable in both chimpanzee and rhesus monkey DNAs under low-stringency hybridization with the BglII probe, and these bands are distinct from those detected in human DNA as well as from each other. The diversity of bands detected in all three primates clearly indicates the presence of a complex family of HIV-1-related sequences.

Molecular cloning and mapping of HIV-1-related sequences: EHS-1 and EHS-2. To more fully characterize these HIV-1-hybridizing bands, recombinant lambda clones were isolated by screening two human genomic DNA libraries under low-stringency hybridization to the HIV-1 BglII probe. Two positively hybridizing clones, designated λEHS-1 (from a human fetal liver DNA library) and λEHS-2 (from a human placental DNA library), were digested with various restriction enzymes and analyzed by Southern blot hybridization at low stringency to the HIV BglII probe (data not shown). A 5.0-kb EcoRI fragment of λEHS-1 and a 3.7-kb SacI fragment of λEHS-2 were found to hybridize to the 1.3-kb HIV-1 BglII probe. No additional hybridization was observed to λEHS-1 or -2 when analyzed with probes containing HIV-1 gag, pol, and long terminal repeat nucleotide sequences under conditions of low stringency (data not shown). The 5-kb EcoRI fragment of λEHS-1 which hybridized to the BglII probe was subcloned and mapped (Fig. 2A). Hybridization to the BglII probe was mapped to a 250-bp BamHI-PstI fragment within the 5-kb clone. Similarly, the 3.7-kb


fragment of λEHS-2 was subcloned and mapped (Fig. 2B). All of the hybridization to the HIV-1 probe was mapped to a 500-bp SacI-BglII fragment within the 3.7-kb clone.

Nucleotide sequence analysis of EHS-1 and EHS-2. In an effort to determine the precise nature of the HIV-1-related sequences in the 5.0-kb EcoRI fragment of EHS-1 and the 3.7-kb SacI fragment of EHS-2, these subclones and their flanking nucleotide regions were sequenced. Analysis of EHS-1 revealed a region of about 100 bp with 70 to 80% similarity to HIV-1 env within the hybridizing BamHI-PstI fragment. A portion of this alignment is shown in Fig. 3A. This homology overlaps a specific functional domain of HIV-1 env, the cellular protease cleavage site. The area of EHS-1 exhibiting hybridization and sequence similarity contains an open reading frame (ORF) of 192 bp (64 amino acids [aa]), and much of the cellular protease cleavage site is intact. The orientation of the ORF based upon the nucleotide sequence and in relation to the restriction map is shown with an arrow in the restriction map in Fig. 2A. Additionally, a vertical arrow indicates the placement of the putative protease cleavage site relative to the ORF. The cleavage site is present at the 3' end of the ORF, and a potential splice site exists just prior to termination of the putative exon and 3' of

[Fig. 2 image: restriction maps. (A) λEHS-1 (Charon 4 vector) and pEHS-1 (pUC13), with the 64-aa ORF and the cleavage site marked. (B) λEHS-2 (λGem 11 vector) and pEHS-2 (pUC13), with the Rev-like (111 aa) and gp41-like (32 aa) ORFs marked.]

FIG. 2. Restriction endonuclease maps of the HIV-related EHS-1 and -2 clones. DNA sequence alignments of EHS-1 (A) and EHS-2 (B) to HIV-1 are shown. DNA alignments were performed by using Intelligenetics programs and GenBank sequences. The sequence of HIV-1 used was from BH10 and was obtained from GenBank (accession number M15654), and the numbers used are directly from that entry. The HIV-1 BH10 sequence is most closely related to that of the N1GD4 clone used as a probe in the Southern analysis (47). EHS-1 and -2 nucleotide numbers are presented as they will be entered in GenBank as well. Similar bases are aligned with dashes between the two lines, dissimilar bases have no dashes, and gaps are presented by spaces. (A) Restriction endonuclease maps of λEHS-1 and pEHS-1. λEHS-1 (Charon 4) has a 16.2-kb insert; pEHS-1 (pUC13) has a 5.0-kb insert. The B/P fragment indicated by the hatched box is 250 bp and represents the region hybridizing to the BglII probe. An ORF encoding a 64-aa peptide (open box) is present within the B/P fragment. Its orientation (5' to 3', from right to left) is indicated with a horizontal arrow. A vertical arrow designates the placement of the putative cellular tryptic cleavage site within this ORF, which is at the 3' end of this ORF prior to termination. Restriction enzymes used: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII; K, KpnI; P, PstI; S, SalI. (E), an EcoRI site present in the lambda clone, but it is an artifact of the cloning strategy of the human genomic DNA into Charon 4 to create the library and it is not present at this position within the human genome. (B) Restriction endonuclease maps of λEHS-2 and pEHS-2. λEHS-2 (lambda Gem 11) has a 15-kb insert; pEHS-2 (pUC13) has a 3.7-kb insert. The area of hybridization is represented by a 500-bp BglII-SacI fragment and designated by a hatched box. Three overlapping ORFs are present within this region (open boxes), and the amino acid length is indicated. A horizontal arrow indicates the orientation of the ORFs (5' to 3', from left to right). Restriction enzymes used: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII; Hc, HincII; K, KpnI; Nd, NdeI; P, PstI; S, SacI; SI, SalI; Xb, XbaI.

(A)
EHS-1 (coordinate ticks 360, 380, 400, 420)
TACAAGGC--AGAGAAGAGTGGT--AGCTGAGACTAAAAAG-GCAGT-----CAGGAGCTAAAT--CTTGG
HIV-1 BH10 (coordinate ticks 7080, 7100, 7120, 7140)
ACCAAGGCAAAGAGAAGAGTGGTGCAG-AGAGA--AAAAAGAGCAGTGGGAATAGGAGCTTTGTTCCTTGG

(B)
EHS-2 (coordinate ticks 1330, 1350, 1370, 1390)
CAT- -CCAGAAATGGGTGGAGAGAGAGACAGAGACAGA-------GA---GGGAGAGAGATCC--AGCAAGA
HIV-1 BH10 (coordinate ticks 7780, 7800, 7820, 7840)
AAT  AGAAGAAGAAGGTGGAGAGAGAGACAGAGACAGATCCATTCGATTAGTGA-ACGGATCCTTAGC-ACT

FIG. 3. Nucleotide sequence alignment of EHS-1 and -2 to HIV-1.
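The degree of identity across the EHS-2 core region in panel B can be recomputed directly from the aligned rows. The sketch below is illustrative only: it assumes that the 31 EHS-2 bases corresponding to the oligonucleotide probe sit over the HIV-1 bases in the same alignment columns of the figure as extracted here, and the function name is hypothetical.

```python
# Ungapped 31-base window taken from the Fig. 3B rows (assumed column alignment).
EHS2_REGION = "AGAAATGGGTGGAGAGAGAGACAGAGACAGA"
HIV1_REGION = "AGAAGAAGGTGGAGAGAGAGACAGAGACAGA"


def count_identities(a: str, b: str):
    """Return (matching columns, total columns) for two equal-length aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    # A column counts as a match only if both rows carry the same base (not a gap).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches, len(a)


if __name__ == "__main__":
    matches, total = count_identities(EHS2_REGION, HIV1_REGION)
    print(f"{matches} of {total} bases identical ({100.0 * matches / total:.1f}%)")
```

Under that assumption the count comes out to 28 of 31 bases, with the three mismatches clustered at positions 5 to 7 of the window, in agreement with the figure reported in the text for this region.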

the cleavage site. Although the stretch of nucleic acidsequences with similarity to HIV-1 is short, GenBank database (release 62) searches using Intelligenetics softwarerevealed no sequence more closely related to this region ofEHS-1 than HIV-1. Overall, greater than 2.3 kb of DNAfrom EHS-1 has been sequenced across this region and hasrevealed no further obvious retrovirus structures. Addition-ally, upstream of the ORF previously described are struc-tures typically found in many introns of active genes (i.e.,Alu and Li elements). Amino acid comparison of the puta-tive EHS-1 gene product with HIV-1 at the cleavage sitereveals 50% identity across the domain shown in Fig. 4A.Retrovirus envelope tryptic cleavage sites are generallyrecognized to have a strong core of basic amino acid residuesprior to the site of cleavage followed by a short stretch ofhydrophobic residues. This structure is retained quite well inEHS-1 and will be discussed in more detail below.Sequence analysis of EHS-2 revealed a stretch of 28 of 31

bp of nucleic acid identity to HIV-1 across the overlappingreading frames of Rev and gp4l (Fig. 3B). This region ofhomology spans the first arginine-rich functional domain ofthe second exon of rev, a domain which has been shown todirect nucleolar localization of the Rev protein as well as berequired for the binding of Rev to a stretch of HIV-1-encoded RNA termed the Rev-responsive element (RRE)(25, 46). To date, no known functional domain has beenassigned to this overlapping region of gp4l, despite thehighly conserved nature of this region of envelope amongisolates of HIV-1 (33, 47). This conservation may have moreto do with maintaining the overlapping reading frame thanwith the function of gp4l. Additionally, a downstream regionof 50 bp of EHS-2 has about 60% nucleic acid similarity to adownstream region of rev and env (gp41). EHS-2 is open inall three frames across the stretch of 31 bp (Fig. 2B). Theorientation of the three frames is from left to right in Fig. 2B.The Rev-related ORF is the largest of the three and couldencode a gene product of 111 aa, while the gp41-related ORFcould encode a gene product of 33 aa. Amino acid compar-isons (Fig. 4) of the putative Rev-related gene product andRev of HIV-1 reveal 10 of 15 aa (67%) across the firstarginine-rich functional domain and 10 of 23 aa (45%) acrossthe second leucine-rich functional domain. The putativegp4l-related product has 11 of 20 aa (55%) compared withHIV-1 gp4l. Structural analysis of the putative Rev-relatedgene product shows a similar length of the Rev-relatedpeptide to HIV-1 Rev (111 to 116 aa; Fig. 4B). The EHS-2Rev-related gene product has no initiation methionine; how-ever, one may be spliced onto this product, as is the casewith HIV-1 Rev. 
Additionally, the distance between the twoputative functional domains is different between Rev andEHS-2, with EHS-2 Rev having only about 5 aa separatingthe two putative domains whereas HIV-1 Rev has about 15aa separating the arginine-rich domain from the leucine-rich

domain. Hydropathicity plots constructed by the method ofKyte and Doolittle for the putative gene products and thoseof HIV-1 show that the related products may have similarstructures (data not shown). Similar to EHS-1, GenBankdata base searches using Intelligenetics computer programsrevealed no sequences with more relatedness to EHS-2 thanHIV-1. As was the case with EHS-1, greater than 2.2 kb ofDNA has been sequenced across this region and has re-vealed no further obvious retrovirus-related structures orsequences. Downstream of the three ORFs is an Li elementtypical of others found within the introns of active genes.Genomic structure and conservation of EHS-1 and EHS-2.

The 250-bp fragment of EHS-1, designated B/P, and the500-bp fragment of EHS-2, designated B/S, which containthe areas of strongest sequence similarity to HIV-1 env forboth clones, were used as probes to map EHS-1 and -2-related sequences in genomic DNA (Fig. 5 and 6). Southernblot analyses of human, chimpanzee, and rhesus monkeygenomic DNAs hybridized under high stringency to bothprobes independently are shown in Fig. 5 and 6. Withinhuman DNA, two EHS-1 loci with distinct restriction mapswere observed, only one of which corresponds to the EHS-1lambda clone map. Flanking probe analysis has allowed theconstruction of a putative map for the second loci and showsthat the two loci share only two of five restriction enzymesover a 5-kb span (data not shown). Interestingly, the restric-tion fragments observed in human DNA with the B/P probeare identical in size to those detected in chimpanzee DNA.Chimpanzee DNA has both loci and the same restrictionfragment length polymorphism pattern for the three enzymesshown. Rhesus monkey DNA has only a single visible locusdifferent in molecular weight from those of the other twoprimate DNAs. This result implies that both EHS-1 loci arewell conserved at the level of restriction fragment lengthpolymorphism analysis from chimpanzees to humans. It alsosuggests that a gene duplication event may have taken placeafter speciation of apes from monkeys and that the two lociin chimpanzees and humans have been conserved for over5.5 million years (41). Alternatively, EHS-2 is present as asingle locus in all three primate species under high-strin-gency Southern blot analysis. The restriction map in humanDNA is the same as that predicted by the recombinantlambda clone but is not conserved across the differentspecies as is EHS-1. The high degree of conservation of bothEHS-1 loci from chimpanzees to humans and the conserva-tion of both EHS-1 and -2 from rhesus macaques to humansmay imply some function for these sequences. 
No additional bands are detectable under low-stringency hybridization with both probes. Comparison of the bands detected by both probes at high stringency (Fig. 5 and 6) with those detected by the BglII HIV-1 probe under low stringency (Fig. 1B) indicates that both EHS-1 loci (6.7 and 5.9 kb, PstI) and EHS-2 (6.0 kb, PstI) are present within the pattern observed with the HIV-1-probed blot. However, both EHS-1 and -2 are only minor bands in the highly complex family of HIV-1-related sequences.

The high degree of sequence identity over 31 bp between EHS-2 and HIV-1 prompted an investigation into the nature of this sequence. A 31-bp oligonucleotide representing EHS-2 was synthesized and hybridized as a probe to a Southern blot of human, chimpanzee, and rhesus monkey genomic DNAs (Fig. 7). A complex banding pattern is present within the three primate species similar to the pattern seen in Fig. 1 when the 3' HIV-1 probe is used under low-stringency hybridization to the same blots. The pattern is similar in the number of different bands and their respective molecular sizes (9.5, 6.0, 4.4, and 3.2 kb, PstI) yet different in the intensity of the bands. The nature of the oligonucleotide hybridization does not allow for much base pair mismatch, so copy number will drive the intensity of hybridization more than will the extent of similarity between the probe and the target. Under low-stringency hybridization, both copy number and sequence similarity dictate the extent of hybridization signal. Comparisons of identical Southern blots probed with either the 3' HIV-1 probe under low stringency (Fig. 1) or the oligonucleotide (Fig. 7) indicate that most if not all of the HIV-1-related sequences contain this oligonucleotide sequence. It is evident that some of these hybridizing sequences have an increased intensity due to more than copy number when probed with HIV-1; this finding suggests that additional flanking sequences are similar to HIV-1 and are responsible for this increased intensity.

[FIG. 4 graphic. Panel A: alignments labeled "EHS-1: gp120/gp41 junction", "EHS-2: Rev domain A", "Rev domain B", and "gp41". The EHS-1 alignment reads EHS-1 (residues 49 to 62) TRQRRVVAETKKAV against HIV-1 (residues 499 to 513) KAKRRVVQREKRAV. Panel B: domain diagrams of HIV-1 Rev and the EHS-2 Rev-like product, with similarity values of 50% and 45% shown above the domain boxes.]

FIG. 4. Amino acid alignments of putative EHS-1 and -2 peptides to HIV-1. (A) Amino acid alignments of EHS-1 and EHS-2 to HIV-1. The peptide sequences of the ORFs were predicted and compared with the HIV-1 sequence by using Intelligenetics programs SEQ and PEP. Amino acids are presented in single-letter code. The HIV-1 sequence is from the BH10 isolate (GenBank accession number M15654) (33), and the numbers represent the residue for either the Rev or gp160 protein. Similar residues are aligned with dashes between the two lines, dissimilar residues have no dashes, and gaps are represented by spaces. (B) Structural comparison of the putative EHS-2 Rev-like product with HIV-1 Rev. The diagrams represent either HIV-1 Rev or EHS-2 Rev-like protein. The two previously described functional domains (27, 45, 46) are presented with either a hatched box (arginine rich) or filled box (leucine rich). The amino acid residue numbers correspond to the alignment in panel A. The percentage of similar residues is presented above the domain boxes and is calculated from the alignments across the domains relative to the two peptides as presented in panel A.

FIG. 5. Southern blot hybridization of the B/P (EHS-1) probe to normal human, chimpanzee, and rhesus monkey DNAs under high-stringency conditions. Genomic DNA was extracted, digested, electrophoresed (10 µg per lane), and transferred to a Zetaprobe membrane (Bio-Rad). The blot was hybridized under high-stringency conditions with the 32P-labeled B/P probe. Blots were washed and exposed for 3 days. Radioactively labeled molecular weight markers were run alongside the genomic DNA, and their sizes are indicated.

DISCUSSION

Both isolated endogenous HIV-1-related sequences, EHS-1 and EHS-2, represent sequences with homology to specific functional domains of HIV-1. Although neither appears to be an obvious endogenous retrovirus itself, they are members of a family of highly conserved sequences related to HIV-1 present within primate genomes, some of which may encode endogenous retroviruses. EHS-1 and -2 most probably represent exons of some uncharacterized genes, and their relationship to HIV-1 may be better placed within the study of the evolution of functional domains. In previous studies, PCR analysis using primers flanking a conserved domain of reverse transcriptase has amplified a number of short sequences (<120 bp) in human DNA

[Blot images for FIG. 5 and 6. Lanes: human, chimpanzee, and rhesus DNAs, each digested with EcoRI, BamHI, or PstI. Size markers (kb): 23.1, 9.3, 6.7, 4.3, 2.3, 2.0, and 0.5.]

FIG. 6. Southern blot hybridization of B/S (EHS-2) probe to normal human, chimpanzee, and rhesus monkey DNAs under high-stringency conditions. Genomic DNA was extracted, digested, electrophoresed (10 µg per lane), and transferred to a Zetaprobe membrane (Bio-Rad). The blot was hybridized under high-stringency conditions with the 32P-labeled B/S probe. Blots were washed and exposed for 3 days. Radioactively labeled molecular weight markers were run alongside the genomic DNA, and their sizes are indicated.

representing known and unknown endogenous retrovirus and nonretrovirus elements that may encode reverse transcriptase (40). However, in those experiments many of the PCR-generated reverse transcriptase-related domains were unable to hybridize to each other, while both EHS-1 and -2 can be detected by HIV-1. In a similar manner, PCR analysis of the cellular oncogene ras has identified a family of related sequences (7). Analysis of four of these PCR-generated ras-related sequences revealed at least two families of genes with less than 50% nucleic acid identity to each other. The identifications of these genes with similar functional domains yet limited sequence identity are excellent examples of the complex evolution of functional domains. In a recent report, Dorit et al. (6) attempted to determine the number of different exons required to generate the current level of protein diversity by identifying statistically significant sequence similarities between different exons. Exons were considered homologous if their similarities ranged from 46% for exons 20 to 29 aa long to as little as 20% for exons of 100 residues. Particular functional domains like DNA-binding and metal-binding motifs were found to reappear in different contexts on different exon combinations. EHS-1 and -2 clearly have a higher level of homology with portions of HIV-1 than do any of the combinations described by Dorit et al. and, most importantly, their homology is distributed in a nonrandom fashion across specific important functional motifs of the virus. Although it is unclear whether divergent or convergent evolution of these and other domains has occurred, it is interesting that similar nucleic acid sequences

FIG. 7. Southern blot hybridization of the EHS-2 oligonucleotide probe to normal human, chimpanzee, and rhesus monkey DNAs under stringent oligonucleotide conditions. Genomic DNA was extracted, digested, electrophoresed (10 µg per lane), and transferred to a Zetaprobe membrane (Bio-Rad). The blot was hybridized under stringent oligonucleotide conditions with the 32P-end-labeled 31-mer. Blots were washed and exposed for 3 days. Radioactively labeled molecular weight markers were run alongside the genomic DNA, and their sizes are indicated.

can exist between different genes, and it is this similarity that has helped to identify and define their relationships to each other.

The similarity between EHS-1 and HIV-1 is across the proteolytic processing domain of the envelope gene product. The putative gene product of EHS-1 is similar to that of HIV-1 in the structure, size, spacing, and basic residue content across this region. EHS-1 varies most obviously at the carboxy-terminal end of the putative cleavage site, where it encodes a lysine residue rather than the arginine found in a number of retroviruses, including HIV-1 (Fig. 8, site 2). Recent mutational analysis with HIV-1 has demonstrated that introducing a lysine in place of the terminal arginine does not block processing of gp160; however, replacing the arginine with a threonine does block cleavage (13). Closer analysis of this region of gp160 has revealed the presence of more than one tryptic cleavage site (3, 31); just three residues upstream of site 2 is a set of basic residues that fit the consensus for this site (Fig. 8, site 1). HIV-1 is not alone in this structure; EHS-1 as well as a number of mammalian retroviruses have what appear to be two peaks of basic residues at this envelope junction (Fig. 8). Both cores of basic residues hold fairly well to the generally described R-X-K-R consensus and vary in their spacing from two to seven residues. Mutational analysis of HIV-1 in this region has shown that nonconservative alterations within either basic peak are sufficient to block cleavage; however, single conservative substitutions may not be sufficient (3, 13, 31). The presence of both peaks appears to be important for


[FIG. 8 graphic. Aligned envelope cleavage-site regions of FELV, EHS-1, HIV-1, visna virus, CAEV, HIV-2, human spumavirus (HU SPUMA), and SIVAGM, with plus signs above basic residues. The two basic regions are separated by 2 to 7 aa. Consensus shown at the bottom: RXXR, 2 to 7 aa, RXXR, followed by a hydrophobic domain.]

FIG. 8. Comparison of the envelope tryptic cleavage sites of a number of retroviruses and those of EHS-1. Amino acids are presented in single-letter code. Amino acid sequences were deduced from the nucleic acid sequences of these retroviruses which were retrieved from GenBank and subjected to analysis with the SEQ and PEP programs (Intelligenetics). The following virus sequences and GenBank accession numbers were used: HIV-1, M15654; feline leukemia virus (FELV), M18248; visna virus, M10608; caprine arthritis encephalitis virus (CAEV), M33677; HIV-2, M15390; human spumavirus (HU SPUMA), X05591; and simian immunodeficiency virus SIVAGM, M19499. Amino acids are aligned relative to a putative site of cleavage, and a slash represents that cleavage site. Underlined and capitalized residues form highly basic regions which fit to the generalized consensus (13, 31) sequence for retroviral tryptic cleavage sites, which is presented at the bottom. Positive signs above the residues correspond to basic amino acids. Spaces between the two cleavage sites are present to show the difference in spacing between these two basic regions among the eight sequences, which is from 2 to 7 aa.
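The idea of scanning a peptide for clusters of basic residues that resemble the cleavage-site consensus can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the paper (which relied on the Intelligenetics SEQ and PEP programs). The two peptide segments are transcribed from the EHS-1/HIV-1 alignment of Fig. 4A, and the relaxed pattern used here (basic, any residue, basic, basic) is an assumption chosen for the example; the function name `find_basic_sites` is hypothetical.

```python
import re

# Peptide segments transcribed from the EHS-1 / HIV-1 alignment (Fig. 4A).
PEPTIDES = {
    "EHS-1": "TRQRRVVAETKKAV",
    "HIV-1": "KAKRRVVQREKRAV",
}

# Assumed, relaxed motif: basic residue, any residue, basic, basic.
# K and R are treated as basic. The zero-width lookahead allows
# overlapping candidate sites to be reported.
BASIC_SITE = re.compile(r"(?=([KR].[KR][KR]))")


def find_basic_sites(peptide):
    """Return (0-based offset, matched residues) for each candidate site."""
    return [(m.start(), m.group(1)) for m in BASIC_SITE.finditer(peptide)]


if __name__ == "__main__":
    for name, peptide in PEPTIDES.items():
        print(name, find_basic_sites(peptide))
```

A stricter or looser pattern would report different sites, so the output is best read as a demonstration of the scan itself rather than as a prediction of which sites are cleaved.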

this subset of retroviruses. Many enveloped viruses use a cellular tryptic protease for processing of their glycoproteins (e.g., paramyxoviruses and orthomyxoviruses), but these appear to have but one tryptic domain similar to a number of retroviruses (i.e., human T-cell leukemia virus type 1 [HTLV-1], bovine leukemia virus, mouse mammary tumor virus, and equine infectious anemia virus). A requirement for different cellular proteases may be responsible for the number of domains present, or merely overall protein structure may require the presence of more basic residues. The generalized structure of this region consists of a cleavage site (basic residues) followed by a varying stretch of hydrophobic residues which provide the amino terminus of the freshly cleaved product followed by a fusogenic domain which is thought to be now activated by its close proximity to the newly created amino terminus. This method of proteolytic processing not only allows the cell or virus to produce three gene products from one coding region but allows delayed delivery of a possibly toxic functional motif. In retroviruses, proteolytic processing of envelope is rarely 100% efficient, and with HIV-1 is typically 10 to 20% efficient (3, 31). Despite this low efficiency, cleavage is required for the production of infectious virions and syncytium formation in HIV-1 (3, 31). Interestingly, EHS-1 has the identical pair of hydrophobic residues following the putative cleavage site as does HIV-1 (Fig. 8, site 2). Mutational analysis of either of these residues in HIV-1 has been shown to result in significantly reduced syncytium formation (14). The structure, content, and perhaps spacing of this region may well determine the specific protease used and the efficiency of this cleavage, thus determining the eventual infectivity of the virus and inherently regulating the virus life cycle. EHS-1 appears to encode a domain similar to the proteolytic processing domain of a number of retrovirus envelope proteins and likely a few host proteins as well. Little is known regarding this protease in terms of its cellular substrate; however, it must act on cellular gene products, and the putative EHS-1 gene product may provide a suitable substrate. It is quite conceivable that a cellular exon could encode the junction domain of a putatively processed protein, and this exon could well be linked to an exon containing a fusogenic domain or another domain that would need to be hidden until exportation for the safety of the cell or the integrity of the gene product. EHS-1 could be such an exon, as its 64-aa ORF has appropriate splice sites within its frame. By learning more about the cellular protease and its interaction with both cellular proteins and HIV-1, it is possible that a target for therapy or prevention of HIV-1 infection may be found.

EHS-2 may encode a gene product similar in function to HIV-1 Rev. The HIV-1 Rev has been shown to be involved in the regulation of spliced and unspliced mRNA species during the HIV replication cycle (12, 25, 26). Rev's action appears to be specific for unspliced HIV-1 transcripts and has been shown to bind a 234-nucleotide RNA target sequence (RRE) located within envelope (27, 28). Of the two apparent functional domains on the 116-aa Rev gene product, the first is rich in arginine residues and appears to be responsible for both nucleolar localization and RNA binding (27, 46). This sequence shows identity to an RNA recognition motif recently proposed for a number of arginine-rich RNA-binding domains of both eukaryotic and prokaryotic proteins by Lazinski et al. (21). The second leucine-rich domain displays a dominant negative phenotype when mutated and is hypothesized to be required for binding of a cellular factor (27, 45). Rev appears to contain two discrete binding domains, which is quite analogous to the structures proposed for many sequence-specific DNA-binding transcription factors (32). EHS-2's putative gene product also contains two discrete domains with similarity to Rev. Its differences from Rev may be due to its ability to bind a different specific sequence or different cellular factor and/or not be located in nucleoli. Various lentivirus Rev- and Rex-like products contain putative RNA-binding motifs, and these are presented for comparison to EHS-2 in Fig. 9. EHS-2's putative RNA-binding motif is similar in size (13 aa), arginine content, and alignment to the consensus proposed by Lazinski et al. Interestingly, despite the obvious sequence and structural differences between HIV-1 Rev and HTLV-1 Rex, they appear to recognize each other's RNA target site (19, 22), and Rex can functionally complement the HIV-1 RRE in the absence of Rev (22). Sequence identity is not an absolute requirement for recognition of specific target sequences in this system, and EHS-2's gene product not only may act in a similar fashion but may act on the same target as does Rev itself. Additionally, the EHS-2-derived oligonucleotide appears to contain a core element of nucleic acid sequence which detects a large family of genes with a


[FIG. 9 graphic, right-hand column. Best alignment to consensus, one value per sequence in the row order of the figure: 7/11, 6/11, 6/11, 7/11, 8/11, 7/11, 7/11.]

FIG. 9. Amino acid comparison of the arginine-rich domains of various retrovirus Rev and Rex-like products and the EHS-2 Rev-like product. Amino acids are presented in single-letter code. Amino acid sequences were deduced from the nucleic acid sequences of these retroviruses, which were retrieved from GenBank and subjected to analysis using the SEQ and PEP programs (Intelligenetics). The following virus sequences, GenBank accession numbers, and publications were used to determine the sequences of the 13-aa arginine-rich regions: HIV-1, M15654 (27, 46); equine infectious anemia virus (EIAV), M18386 (44); visna virus, M10608 (39); HIV-2, M15390 (15, 37); HTLV-1, D00294 (22); and simian immunodeficiency virus SIVAGM, M19499 (17, 37). Indicated on the right are the number of arginine residues (which are capitalized in the 13-aa stretch on the left) present within the 13-aa stretch. The best alignment to the consensus is based on a consensus 11-aa sequence generated by Lazinski et al. (21) for RNA-binding motifs. The consensus is BOBRBJRRZZB, where B is a basic residue, O is a nonbasic polar residue, Z is a charged residue, and J is an acidic residue.

similar functional motif that appears to encode either an RNA-binding domain or nucleolar localization signal. Included in this gene family could be both lentivirus-related endogenous retroviruses which may contain a Rev-like gene and genes encoding cellular sequence-specific RNA-binding factors. Further study of this family may reveal a complex level of cellular regulation and trafficking of spliced and unspliced mRNAs.
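The arginine content of the 13-aa stretches compared in Fig. 9 is simple to tally, and a short sketch makes the comparison reproducible. The sequences below are transcribed from Fig. 9 and upper-cased; SIVAGM is omitted from this illustration. The helper names `arginine_count` and `arginine_fraction` are hypothetical and are not part of any program cited in the paper.

```python
# 13-aa arginine-rich stretches transcribed from Fig. 9 (upper-cased).
# SIVAGM is left out of this illustrative data set.
ARG_RICH = {
    "EHS-2": "RNGWRERQRQRGR",
    "Visna": "RGWYKWLRNLRAR",
    "EIAV": "RRDRWIRGQILQA",
    "HIV-1": "RRNRRRRWRERQR",
    "HTLV-1": "KTRRRPRRSQRKR",
    "HIV-2": "RRNRRRRWKQRWR",
}


def arginine_count(stretch):
    """Count arginine (R) residues in a peptide given in single-letter code."""
    return stretch.upper().count("R")


def arginine_fraction(stretch):
    """Fraction of residues in the stretch that are arginine."""
    return arginine_count(stretch) / len(stretch)


if __name__ == "__main__":
    for name, stretch in ARG_RICH.items():
        print(f"{name:7s} {stretch} R={arginine_count(stretch):d} "
              f"({arginine_fraction(stretch):.2f})")
```

Counting only arginine keeps the sketch aligned with the figure's right-hand column; scoring against the BOBRBJRRZZB consensus would additionally require explicit residue-class definitions, which are not attempted here.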

Besides structural evidence, transcriptional data regarding EHS-1 and -2 are required to support the hypothesis that these sequences are exons. Transcriptional data would also better define the nature, structure, and function of these sequences. At present, we are attempting to show that either one or both of these sequences are transcribed by screening various tissue and cell line RNAs by Northern (RNA) blot analysis. However, it is quite possible that they could both be transcribed in only specific tissues or at specific times, making this analysis quite difficult.

In conclusion, we have demonstrated the presence of a complex family of HIV-1-related sequences in three species of primates, using low-stringency Southern blot hybridization and plaque screening. One member of this family, EHS-1, comprises two distinct loci in humans and chimpanzees and one related locus in rhesus monkeys. Sequence analysis of EHS-1 identified a region within an ORF similar to the proteolytic processing site within the envelope glycoprotein of HIV-1. Another member, EHS-2, is a single locus present from rhesus monkeys to humans with sequence similarity to the overlapping reading frame of Rev and gp41. The Rev-related putative gene product is similar in size, structure, and amino acid composition across both functional domains of HIV-1 Rev. Additionally, an oligonucleotide probe from the Rev-related sequence identifies almost all of the HIV-1-related sequences observed under low stringency and suggests that some members of this family contain not only the oligonucleotide sequence but also additional adjacent sequence related to HIV-1. Further analysis of members of this family will help determine whether such endogenous sequences contributed to the evolution of HIV-1 via recombination events or whether these elements, either directly or through protein products, influence HIV pathogenesis.
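As a concrete illustration of what a pairwise similarity figure means, the following Python sketch computes percent identity for two ungapped, equal-length peptides. It is a simplification and an assumption of this example: it counts only identical residues, whereas the percentages in Fig. 4B refer to similar residues as scored by the Intelligenetics programs. The sequences are the EHS-1 and HIV-1 segments transcribed from Fig. 4A, and the function names are hypothetical.

```python
def match_line(a, b):
    """Return a string with '|' under identical positions and ' ' elsewhere."""
    if len(a) != len(b):
        raise ValueError("peptides must be the same length (no gaps handled)")
    return "".join("|" if x == y else " " for x, y in zip(a, b))


def percent_identity(a, b):
    """Percentage of aligned positions carrying the same residue."""
    line = match_line(a, b)
    return 100.0 * line.count("|") / len(line)


if __name__ == "__main__":
    ehs1 = "TRQRRVVAETKKAV"  # EHS-1 segment, transcribed from Fig. 4A
    hiv1 = "KAKRRVVQREKRAV"  # HIV-1 gp160 segment, transcribed from Fig. 4A
    print(ehs1)
    print(match_line(ehs1, hiv1))
    print(hiv1)
    print(f"{percent_identity(ehs1, hiv1):.1f}% identity")
```

Because conservative substitutions (for example, lysine for arginine) are not credited, this identity value would be expected to be no higher than a similarity-based score for the same alignment.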

REFERENCES

1. Bangham, C., S. Daenke, R. Phillips, J. Cruickshank, and J. Bell. 1988. Enzymatic amplification of exogenous and endogenous retroviral sequences from DNA of patients with tropical spastic paraparesis. EMBO J. 7:4179-4184.

2. Bonner, T. I., C. O'Connell, and M. Cohen. 1982. Cloned endogenous retroviral sequences from human DNA. Proc. Natl. Acad. Sci. USA 79:4709-4713.

3. Bosch, V., and M. Pawlita. 1990. Mutational analysis of the human immunodeficiency virus type 1 env gene product proteolytic cleavage site. J. Virol. 64:2337-2344.

4. Callahan, R., W. Drohan, S. Tronick, and J. Schlom. 1982. Detection and cloning of human DNA sequence related to the mouse mammary tumor virus genome. Proc. Natl. Acad. Sci. USA 79:5503-5507.

5. Chien, Y., M. Lai, T. Y. Shih, I. M. Verma, E. M. Scolnick, P. Roy-Burman, and N. Davidson. 1979. Heteroduplex analysis of the sequence relationships between the genomes of Kirsten and Harvey sarcoma viruses, their respective parental murine leukemia viruses and the rat endogenous 30s RNA. J. Virol. 31:752-760.

6. Dorit, R. L., L. Schoenbach, and W. Gilbert. 1990. How big is the universe of exons? Science 250:1377-1381.

7. Drivas, G. T., A. Shih, E. Coutavas, M. G. Rush, and P. D'Eustachio. 1990. Characterization of four novel ras-like genes expressed in a human teratocarcinoma cell line. Mol. Cell. Biol. 10:1793-1798.

8. Dunwiddie, C., and A. J. Faras. 1985. Presence of retrovirus reverse transcriptase-related gene sequences in avian cells lacking endogenous avian leukosis viruses. Proc. Natl. Acad. Sci. USA 82:5097-5101.

9. Dunwiddie, C., R. Resnick, M. Boyce-Jacino, J. Alegre, and A. Faras. 1986. Molecular cloning and characterization of gag-, pol-, and env-related gene sequences in the ev- chicken. J. Virol. 59:669-675.

10. Ellis, R. W., D. DeFeo, J. M. Maryak, H. A. Young, T. Y. Shih, E. H. Chang, D. R. Lowy, and E. M. Scolnick. 1980. Dual evolutionary origin for the rat genetic sequences of Harvey murine sarcoma virus. J. Virol. 36:408-420.

11. Feinberg, A., and B. Vogelstein. 1984. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Addendum. Anal. Biochem. 137:266-267.

12. Feinberg, M. B., R. F. Jarrett, A. Aldovini, R. C. Gallo, and F. Wong-Staal. 1986. HTLV-III expression and production involve complex regulation at the levels of splicing and translation of viral RNA. Cell 46:807-817.

13. Freed, E., D. Myers, and R. Risser. 1989. Mutational analysis of the cleavage sequence of the human immunodeficiency virus type 1 envelope glycoprotein precursor gp160. J. Virol. 63:4670-4675.

14. Freed, E., D. Myers, and R. Risser. 1990. Characterization of the fusion domain of the human immunodeficiency virus type 1 envelope glycoprotein gp41. Proc. Natl. Acad. Sci. USA 87:4650-4654.

15. Guyader, M., M. Emerman, P. Sonigo, F. Clavel, L. Montagnier, and M. Alizon. 1987. Genome organization and transactivation of the human immunodeficiency virus type 2. Nature (London) 22:662-669.

16. Harada, F., N. Tsukada, and N. Kato. 1987. Isolation of three kinds of human endogenous retrovirus-like sequences using tRNA(Pro) as a probe. Nucleic Acids Res. 15:9153-9162.

17. Hirsch, V., N. Riedel, and J. I. Mullins. 1987. The genome organization of STLV-3 is similar to that of the AIDS virus except for a truncated transmembrane protein. Cell 49:307-319.

[FIG. 9 graphic, left-hand columns. 13-aa arginine-rich stretch and number of arginine residues for each sequence:
EHS-2   RngwReRqRqRgR   6
Visna   RgwykwlRnlRaR   4
EIAV    RRdRwiRgqilqa   4
HIV-1   RRnRRRRwReRqR   9
HTLV-1  ktRRRpRRsqRkR   7
HIV-2   RRnRRRRwkqRwR   8
SIVAGM  RRqRRRRwRVRVq   9]

18. Kroger, B., and I. Horak. 1987. Isolation of novel human retrovirus-related sequences by hybridization to synthetic oligonucleotides complementary to the tRNA(Pro) primer-binding site. J. Virol. 61:2071-2175.

19. Kubota, S., H. Siomi, T. Satoh, S. Endo, M. Maki, and M. Hatanaka. 1989. Functional similarity of HIV-1 rev and HTLV-1 rex proteins: identification of a new nucleolar-targeting signal in rev protein. Biochem. Biophys. Res. Commun. 162:963-970.

20. Lawn, R. M., E. F. Fritsch, R. C. Parker, G. Blake, and T. Maniatis. 1978. The isolation and characterization of linked δ and β globin genes from a cloned library of human DNA. Cell 15:1157-1574.

21. Lazinski, D., E. Grzadzielska, and A. Das. 1989. Sequence-specific recognition of RNA hairpins by bacteriophage antiterminators requires a conserved arginine-rich motif. Cell 59:207-218.

22. Lewis, N., J. Williams, D. Rekosh, and M. Hammaskjold. 1990. Identification of a cis-acting element in human immunodeficiency virus type 2 (HIV-2) that is responsive to the HIV-1 rev and human T-cell leukemia virus types I and II rex proteins. J. Virol. 64:1690-1697.

23. Maeda, N. 1985. Nucleotide sequence of the haptoglobin and haptoglobin-related gene pair. The haptoglobin-related gene contains a retrovirus-like element. J. Biol. Chem. 260:6698-6709.

24. Mager, D., and P. Henthorn. 1984. Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. USA 81:7510-7514.

25. Malim, M. H., S. Bohnlein, J. Hauber, and B. R. Cullen. 1989. Functional dissection of the HIV-1 Rev trans-activator-derivation of a trans-dominant repressor of Rev function. Cell 58:205-214.

26. Malim, M. H., J. Hauber, R. Fenrick, and B. R. Cullen. 1988. Immunodeficiency virus rev trans-activator modulates the expression of the viral regulatory genes. Nature (London) 335:181-183.

27. Malim, M. H., J. Hauber, S.-Y. Le, J. V. Maizel, and B. R. Cullen. 1989. The HIV-1 rev trans-activator acts through a structured target sequence to activate nuclear export of unspliced viral mRNA. Nature (London) 338:254-257.

28. Malim, M. H., L. S. Tiley, D. F. McCarn, J. R. Rusche, J. Hauber, and B. R. Cullen. 1990. HIV-1 structural gene expression requires binding of the Rev trans-activator to its RNA target sequence. Cell 60:675-683.

29. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

30. Martin, M. A., T. Bryan, S. Rasheed, and A. S. Khan. 1981. Identification and cloning of endogenous retroviral sequences from human DNA. Proc. Natl. Acad. Sci. USA 78:4892-4896.

31. McCune, J. M., L. B. Rabin, M. B. Feinberg, M. Lieberman, J. C. Kosek, G. R. Reyes, and I. L. Weissman. 1988. Endoproteolytic cleavage of gp160 is required for the activation of human immunodeficiency virus. Cell 53:55-67.

32. Mitchell, P. J., and R. Tjian. 1989. Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245:371-378.

33. Modrow, S., B. Hahn, G. Shaw, R. Gallo, F. Wong-Staal, and H. Wolf. 1987. Computer-assisted analysis of envelope protein sequences of seven human immunodeficiency virus isolates: prediction of antigenic epitopes in conserved and variable regions. J. Virol. 61:570-578.

34. O'Connell, C., S. O'Brien, W. Nash, and M. Cohen. 1984. ERV3, a full-length human endogenous provirus: chromosomal localization and evolutionary relationships. Virology 138:225-235.

35. Overbaugh, J., N. Riedel, E. Hoover, and J. Mullins. 1988. Transduction of endogenous envelope genes by feline leukaemia virus in vitro. Nature (London) 332:731-734.

36. Perl, A., J. Rosenblatt, I. Chen, J. DiVincenzo, R. Bever, B. Poiesz, and G. Abraham. 1989. Detection and cloning of new HTLV-related endogenous sequences in man. Nucleic Acids Res. 17:6841-6854.

37. Sakai, H., R. Shibata, T. Miura, M. Hayami, K. Ogawa, T. Kiyomasu, A. Ishimoto, and A. Adachi. 1990. Complementation of the rev gene mutation among human and simian lentiviruses. J. Virol. 64:2202-2207.

38. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

39. Sargan, D. R., and I. D. Bennet. 1989. A transcriptional map of visna virus: definition of the second intron structure suggests a rev-like gene product. J. Gen. Virol. 70:1995-2006.

40. Shih, A., R. Misra, and M. Rush. 1989. Detection of multiple, novel reverse transcriptase coding sequences in human nucleic acids: relation to primate retroviruses. J. Virol. 63:64-75.

41. Sibley, C., and J. Ahlquist. 1987. DNA hybridization evidence of hominoid phylogeny: results from an expanded data set. J. Mol. Evol. 26:99-121.

42. Singh, L., and K. Jones. 1984. The use of heparin as a simple cost-effective means of controlling background in nucleic acid hybridization procedures. Nucleic Acids Res. 12:5627-5638.

43. Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.

44. Stephens, R. M., D. Derse, and N. Rice. 1990. Cloning and characterization of cDNAs encoding equine infectious anemia virus Tat and putative Rev proteins. J. Virol. 64:3716-3725.

45. Venkatesh, L. K., and G. Chinnadurai. 1990. Mutants in a conserved region near the carboxy-terminus of HIV-1 Rev identify functionally important residues and exhibit a dominant negative phenotype. Virology 178:327-330.

46. Venkatesh, L. K., S. Mohammed, and G. Chinnadurai. 1990. Functional domains of the HIV-1 rev gene required for trans-regulation and subcellular localization. Virology 176:39-47.

47. Volsky, D., K. Sakai, M. Stevenson, and S. Dewhurst. 1986. Retroviral etiology of the acquired immune deficiency syndrome (AIDS). AIDS Res. 2:S35-S48.

48. Weiss, R., N. Teich, H. Varmus, and J. Coffin. 1985. RNA tumor viruses. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
