
Proc. Natl. Acad. Sci. USA
Vol. 80, pp. 7293-7297, December 1983
Genetics

Sequences essential for transposition at the termini of IS50
(transposon Tn5/in vitro mutagenesis/protein-DNA interaction/phage λ/homologous recombination)

CHIHIRO SASAKAWA*†, GEORGES F. CARLE*, AND DOUGLAS E. BERG*‡
Departments of *Microbiology and Immunology and of ‡Genetics, Washington University School of Medicine, St. Louis, MO 63110

Communicated by M. M. Green, August 24, 1983

ABSTRACT The DNA sequences found repeated in opposite orientation at the ends of insertion (IS) elements are thought to contain sites at which transposase proteins act during transposition. Many elements have repeats of at least 15 base pairs (bp). Those of IS50 are quite short, however: just 8 of the first 9 bp. Functional tests had indicated that one end of IS50 is more effective in transposition than the other end and suggested that at least one of the recognition sites of IS50 extends beyond the common 8/9 bp. To determine the lengths of recognition sites of IS50 we mutagenized IS50 in vitro and tested the transposition proficiency of the resulting mutants. Our results show that the recognition sites at each end of IS50 are about 19 bp long. These findings suggest models for the evolution of IS elements from simpler immobile gene complexes.

Transposition of the simple IS (insertion) elements and other mobile DNA segments of bacteria is mediated by element-specific proteins termed transposases that are believed to act by binding to sequences at both ends of their cognate elements (for reviews, see refs. 1-4). Although most IS elements contain their own transposase genes, certain naturally occurring IS elements use transposases encoded elsewhere in the genome. A pair of recognition sites probably constitutes the only irreplaceable component of a functional IS element.

Typically, the base sequence at one end of an IS element is an imperfect inverted repeat of the sequence at the other end, for example 15/16 base pairs (bp) (IS5), 18/23 bp (IS1), and 36/37 bp (γδ). Specific sequences of such lengths are unlikely to occur by chance alone in the bacterial genome and hence are generally equated with transposase recognition sites. Imperfect matches suggest that transposition does not involve direct base pairing between IS element ends; partial symmetry could make both ends appear equivalent to the transposase protein. The short terminal repeats within IS elements should not be confused with the reverse duplications of entire IS elements found in many transposons, e.g., the reverse duplications of IS50 in the kanamycin-resistance transposon Tn5 (1). The short inverted repeats in IS elements are also distinct from the direct duplications of target sequences at insertion sites, which are formed anew each time an element transposes and which, once formed, play no role in subsequent transposition.

The ends of IS50, named O (outside) and I (inside) corresponding to their positions in Tn5, seem unusual when compared with those of other elements; they match closely at only eight of nine bp (Fig. 1). Functional tests have shown that one O plus one I end or a pair of O ends mediate transposition at similar frequencies, whereas pairs of I ends are less active by factors of 1/1,000 to 1/100. Thus, it seemed likely that transposase recognition sites of IS50 extend beyond the common 8/9 bp and that the two ends of IS50 are recognized differently by its transposase (5, 11).

[Fig. 1 diagram: restriction map of IS50 with sites for Hga I (15), BstNI (50), Hpa I (188), BstNI (408), Hga I (1027), Pvu II (1424), Bgl II (1516), and Bcl I (1521); the transposase gene and its UGA stop codon; and the terminal sequences shown below.]

O end:  CTGACTCTTATACACAAGTA....
        GACTGAGAATATGTGTTCAT....

I end:  ....AAGATCTGATCAAGAGACAG
        ....TTCTAGACTAGTTCTCTGTC

FIG. 1. IS50. The 1,534-bp-long insertion sequence IS50 is present as a reverse duplication in the kanamycin resistance transposon Tn5 (5) and is unrelated to any of the IS elements indigenous to the genome of E. coli K-12 (6). It encodes transposase and also an inhibitor of transposition, both from a promoter ≈50 bp from its O end. Transposase and inhibitor, 461 and 421 amino acids long, respectively, are made from two different in-phase translation initiation sites. Both proteins are terminated by the UGA codon 12-14 bp from the I end of IS50 (7-10). At the ends of IS50 is a short hyphenated inverted repeat of 8 of the first 9 bp (5), emphasized here by the horizontal lines. In addition to these 8 bp, there are four other positions in the first 19 bp at which both ends are matched in sequence.
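The I-end sequence in Fig. 1 is printed on the same strand as the O end, so it has to be reverse-complemented before the two ends can be compared "reading inward". The following is a minimal sketch of that bookkeeping, not part of the original methods; the function and variable names are illustrative, and the sequences are transcribed from Fig. 1.

```python
# Terminal sequences transcribed from Fig. 1 (top strand, left to right).
O_END_TOP = "CTGACTCTTATACACAAGTA"  # begins at the O end of IS50
I_END_TOP = "AAGATCTGATCAAGAGACAG"  # finishes at the I end of IS50

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def count_matches(a: str, b: str, n: int) -> int:
    """Count identical bases among the first n positions of a and b."""
    return sum(x == y for x, y in zip(a[:n], b[:n]))


# Read the I end inward from the terminus, as the O end is read.
i_end_inward = reverse_complement(I_END_TOP)
print(i_end_inward)                               # CTGTCTCTTGATCAGATCTT
print(count_matches(O_END_TOP, i_end_inward, 9))  # 8
```

The count reproduces the "8 of the first 9 bp" repeat described in the legend to Fig. 1.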

The current studies were undertaken to determine the lengths of the recognition sites of IS50. Our analysis of IS50 derivatives generated by in vitro mutagenesis indicates that the critical recognition sites at each end are about 19 bp long.

Abbreviations: IS, insertion; bp, base pair(s); Ampʳ, ampicillin resistant; Tetʳ, tetracycline resistant.
† Present address: Institute of Medical Science, University of Tokyo, Tokyo, 108 Japan.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Standard molecular, microbial, and genetic techniques (12) were used. The sources of reagents were as follows: BamHI (C-G-G-A-T-C-C-G and C-G-G-G-A-T-C-C-C-G) and Sal I (G-G-T-C-G-A-C-C) linkers, restriction endonucleases, T4 DNA ligase, and DNA polymerase I large fragment (Klenow), New England BioLabs; exonuclease III, nuclease S1, and T4 DNA polymerase, Bethesda Research Laboratories; mung bean nuclease, P-L Biochemicals. The restriction endonucleases and ligase were used according to the suppliers' instructions. Other enzymes were used as follows: (i) exonuclease III, 5 µg of DNA, 50 units of enzyme in 25 µl of 60 mM NaCl/10 mM Tris·HCl, pH 7.4/10 mM MgCl2/10 mM 2-mercaptoethanol containing bovine serum albumin at 100 µg/ml, at 22°C; (ii) S1 and mung bean nucleases, 100 and 1 unit(s), respectively, per µg of DNA in 150 mM NaCl/50 mM NaOAc, pH 4.6/6 mM ZnSO4 for 15 min at 37°C; (iii) exonuclease (3'→5') of T4 DNA polymerase, 5 µg of DNA, 5 units of enzyme in 25 µl of 67 mM Tris·HCl, pH 8.8/6.7 mM MgCl2/6.7 mM EDTA/16.6 mM (NH4)2SO4/100 µM deoxyribonucleoside triphosphates for 2 min at 37°C; (iv) DNA polymerase I (Klenow fragment), 5 µg of linear DNA, 5 units of enzyme in 25 µl of 50 mM Tris·HCl, pH 7.5/10 mM MgSO4/100 mM NaCl/5 mM dithiothreitol/100 µM deoxyribonucleotide triphosphates for 1 hr at 16°C. Where the repair reaction was purposely incomplete (e.g., mutants 15 and 16 in Table 2, below), protruding single-stranded DNA was removed with S1 nuclease. Phosphorylated BamHI linkers were ligated to linearized DNA fragments containing flush ends that had been dephosphorylated with calf intestine alkaline phosphatase (Boehringer) [5 µg of DNA, 3 units of enzyme in 25 µl of 10 mM Tris·HCl (pH 8)/1 mM EDTA for 45 min at 60°C]. DNA sequences were determined using the chemical method of Maxam and Gilbert (13) on DNA fragments labeled at their 3' ends with DNA polymerase I (Klenow) and [α-32P]dNTPs (Amersham).

All bacterial strains are derivatives of Escherichia coli K-12. MC1061 (F⁻ Δara-leu hsr) from M. Casadaban via H. Huang was used as the routine host for plasmid DNAs and for the generation of recombinants between plasmids. GM119, which is dcm⁻ dam⁻ (defective in DNA methylation), was used in the preparation of plasmids for cleavage with Bcl I, an enzyme that cleaves methylated DNAs poorly. DB1873 (F⁻ recA1, lysogenic for λ redΔ15 Dam15 b515 b519 int am29 imm21 cIts Sam7, a phage whose deletions permit it to accommodate insertions of up to 13 kb; 14, 15) was transformed with plasmids carrying IS50 elements to be tested for the ability to transpose. DB1891 (15) was used to select λ ampicillin-resistant (Ampʳ) and λ tetracycline-resistant (Tetʳ) transducing phage generated by transposition. It is recA⁻ and contains an hfl⁻ mutation to enhance the recovery of transductants.

The Ampʳ plasmid in Fig. 2, pBRG700, was generated by the transposition of IS50R from a λ::Tn5 phage (as in ref. 5) to plasmid pBR333, a derivative of pBR322 generated by K. Bachman that we find is missing nucleotides 71 and 76 through 2352 on the map (16) of pBR322. IS50R is inserted in plasmid pBRG700 at position 39 (a major site of Tn5 insertion; ref. 17) and oriented such that the I end of IS50 is closest to the Cla I site (position 23) in pBR333. pBRG701 was derived from a pBR322::Tn5 plasmid (15) after cleavage with BamHI (in the central region of Tn5) and EcoRI (in pBR322). It contains tet, rep, and IS50R but has lost kan, IS50L, and amp.

RESULTS

To determine the lengths of the recognition sites of IS50, IS50 DNA was mutagenized in vitro and changes in transposability were monitored. Many of our mutations also altered the transposase gene or its promoter. Since the IS50-encoded transposase operates efficiently only in cis (18, 19), transposition was assayed using donor DNA molecules that carried both wild-type IS50 and the mutated element.

Our approach is outlined in Fig. 2. The IS50 element in the Ampʳ plasmid pBRG700 was mutated, and the altered plasmid was transformed into recA⁺ E. coli harboring an IS50-containing Tetʳ plasmid (pBRG701). Homologous recombination between the two IS50 elements generated heterodimer plasmids, which were selected by the co-transformation of their Ampʳ and Tetʳ traits, characterized by restriction endonuclease (Hpa I) digestion, and transformed into the recA⁻ λ red⁻ lysogen DB1873 for use in transposition assays. The heterodimers contain two complementary composite transposons, each with direct repeats of IS50 elements. As shown in Fig. 2, the O and I ends of the mutated IS50 element (marked by a filled box and an X, respectively) become the ends of the composite Tetʳ transposon, whereas the ends of the wild-type IS50 element become the ends of the Ampʳ transposon. The effects of alterations in a recognition sequence made in pBRG700 are manifested in the ratio of λ Tetʳ to λ Ampʳ transducing phage.

[Fig. 2 diagram: transposition to phage λ from the heterodimer plasmid, showing the two composite segments (one carrying tet and rep, the other carrying amp and rep), each bounded by O and I ends of IS50.]

FIG. 2. Assays of mutated IS50 elements. Mutations (■, if at the O end; ×, if at the I end) were introduced into the IS50 element of the Ampʳ plasmid pBRG700. These pBRG700 derivatives were fused with the Tetʳ plasmid pBRG701, which carries a wild-type IS50 element, using homologous recombination in recA⁺ E. coli. The heterodimer (Ampʳ Tetʳ) plasmids were selected by transformation of recA⁻ E. coli DB1891 and characterized by using Hpa I, which only cleaves these plasmids in their IS50 components. The heterodimers were transformed into the recA⁻ λ red⁻ lysogen DB1873, phage development was induced from subclones of the transformants, and the ratios of λ Tetʳ phage (formed using the mutated O or I ends) to λ Ampʳ phage were determined by transduction of recA⁻ strain DB1891. Because IS50 does not generate cointegrates when it transposes (20), essentially all λ Tetʳ phage were Ampˢ and all λ Ampʳ phage were Tetˢ. When these λ Tetʳ and λ Ampʳ phage lysogenize E. coli, they replicate as multicopy plasmids driven by the pBR322 replication system. Restriction endonuclease analyses of hybrid phage equivalent to those diagrammed here have been presented (refs. 5, 11, 14, 15). Decreases in the λ Tetʳ/λ Ampʳ ratio imply that the mutation in the O or the I end of the Ampʳ plasmid affects sequences necessary for transposition.

As shown in Table 1, if neither IS50 element was mutant,transposition of the Ampr and Tetr segments occurred at aboutthe same frequency; if the transposase promoter near the 0end of IS50 was mutant, the Tetr/Ampr ratio decreased slightly;if sequences encoding the carboxyl terminus of transposase near

Table 1. Incomplete cis-complementation oftransposase deficiencies

Frequency (x 107)* A Tetr/A AmprMutation Location A Tetr Anpr ratioWild type None 7.1 7.5 0.981 51 bp 5.1 9.0 0.72

(O end)12 111 bp 2.6 1.9 1.63

(I end)

Mutations were generated in the 1850 component of pBRG700 andassayed for transposition as described in Fig. 2. Mutation 1 is in thetransposae promoter and was generated by partial digestion ofpBRG700with BstN1 (C-C l A-G-G), copying the unpaired 5' adenosine withDNApolymerase and ligation in the presence ofdecanucleotideBamHIlinkers (C-G-G-G-A-T-C-C-C-G). Mutation 12 is within the transposasestructural gene, was generatedbyPvu II cleavage 110 bpfrom the I end(C-A-G * C-T-G) and ligation in the presence of the BamHI linkers.* Each estimate is based on the transducing titer in four independentlysates.

7294 Genetics: Sasakawa et al.

Dow

nloa

ded

by g

uest

on

June

24,

202

0

Page 3: Sequences essential for transposition at · ofenzymein25Azl of50mMTris-HCI, pH7.5/10mMMgSO4/ 100 mM NaCl/5 mM dithiothreitol/100 uM deoxyribonucleo- tide triphosphates for 1 hr at

Proc. Natl. Acad. Sci. USA 80 (1983) 7295

Table 2. Transposition activity of mutated IS50 elementsTransposition

Mutant DNA sequence WT bp, no. activityOutside end

1 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-A-G-T-A-G-C-G-T-C-C-T-G... . 50 1.02 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-A-G-T-A-G-C-G-c-g-g-g-a-t-c-c-c-g 23 1.03 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-A-G-T-A-G-C-G-g-g-a-t-c-c-c-g 23 1.14 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-A-G-T-c-g-g-g-a-t-c-c-c-g 19 0.865 C-T-GA-C-T-C-T-T-A-T-A-C-A-C-A-A-'-GT-c-g-c-c-g 19* 0.896 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-A-c-g-g-a-t-c-c-g 17 0.427 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-A-c-g-g-g-a-t-c-c-c-g 16 0.0138 C-T-G-A-C-T-C-T-T-A-T-A-CA-C-c-g-g-g-a-t-c-c-c-g 15 0.0069 C-T-G-A-C-T-C-T-T-A-T-A-C-A-C-g-g-a-t-c-c-g 15 0.00610 C-T-G-A-C-T-C-T-T-A-T-A-C-g-g-a-t-c-c-g 13 0.00611 C-T-G-c-g-g-g-a-t-c-c-c-g 3 <0.001

Inside end

12 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-T-T-G-A-T-C-C-C-C-T-G-C.... 111 1.013 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-T-T-G-A-T-C-C-g-g-g-a-t-c-c-c-g 25 1.214 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-T-T-c-g-g-g-a-t-c-c-c-g 20 1.015 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-T-c-g-g-g-a-t-c-c-c-g 19 1.116 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-c-g-g-g-a-t-c-c-c-g 18 0.1717 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-C-g-a-t-c-t-t-g-a-t-c 18 0.2018 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-T-g-g-t-c-g-a-c-c 17 0.04919 C-T-G-T-C-T-C-T-T-G-A-T-C-A-G-A-c-g-g-g-a-t-c-c-c-g 16 0.06120 C-T-G-T-C-T-C-T-T-G-A-T-C-A ....T-T-G-A-T-C-C-C-T.... 14 0.03121 C-T-GT-C-T-C-T-T-GA-T-C-c---a-t-c- a-A-GA-T-C.... 13 <0.00322 C-T-G-T-C-T-C-T-T-G-A-T-C .. A. .T-T-G-A-T-C-C-C-C-T.... 13 <0.00323 C-T-G-T-C-T-C-T-T-c-g-g-a-t-c-c-g 9 <0.00324 C-T-G-T-C-T-C-c-g-g-a-t-c-c-g 7 <0.003

The sequences extending from left to right are those of the 5'-*3' strand starting at the end of the element and proceedinginward. Upper case letters indicate the wild-type sequence, lower case letters indicate the new (mutant) DNA sequence, andthe symbol A represents a deletion. A series ofdots indicates the continuation ofIS50 wild-type sequences. Mutant 5 is markedwith an * to call attention to the C-G insertion (superscripted at position 18). The linker insertion and4bp duplication in mutant21 is also superscripted. Transposition activity is defined as the ratio of A Tete/A Ampr phages obtained in the tests outlinedin Fig. 2 and normalized to mutant 1 (for O-end mutants 1-11) and to mutant 12 (for I-end mutants 12-24) (Table 1). Sequencesinternal to the site ofthe mutation are not shown because they are not necessary for transposition when transposase is suppliedby another element. This conclusion is based on (i) the complemented transposition of a Tn5 element in which sequences in-terior to the Hpa I sites (position 188 from the 0 ends) were replaced by trp operon DNA (unpublished data); (ii) mutant 5,which is deleted for sequences between 19 and 191; and (iii) the mobility of mini-IS50, which retains just 19 bp of each endof IS50. 
Mutations were generated in the pBR333: :IS50 (Ampr) plasmid pBRG700 as follows: mutant 1, see Table 1; mutants2 and 3, cleavage of mutant 1 with Sma I, exonuclease m nuclease digestion, S1 nuclease digestion, ligation with a BamHIlinker; mutant 4, BamHI cleavage of mutant 3, limited digestion using the 3'-*5' exonuclease activity ofT4 DNA polymerasein the presence of dATP, Si nuclease digestion, ligation with a BamHI linker; mutant 5, partial cleavage by Hga I (limitedto one cut by the presence of ethidium bromide), cleavage with Hpa I, repair synthesis usingDNA polymerase I, ligation; mu-tant 6, BamHI cleavage of mutant 4, limited digestion and ligation as with mutant 4; mutant 7, as for mutants 2 and 3 above,but starting with a BamHI site about 30 bp from the 0 end; mutant 8, partial cleavage by Hga I and repair synthesis usingDNA polymerase I, ligation with aBamHI linker; mutants 9 and 10, cleavage ofmutant 4 withBamHl, limited digestion usingthe 3'-*5' exonuclease ofT4 DNA polymerase in the presence of dATP, S1 nuclease digestion, ligation with aBamHI linker;mutant 11, Hga I/Hpa I cleavage as for mutant 5 above, S1 nuclease digestion, ligation with aBamHI linker; mutant 12, seeTable 1; mutants 13-15, Pvu II cleavage, exonuclease HI digestion, mung bean nuclease digestion, ligation; mutant 16, BglII cleavage, repair synthesis using the Klenow fragment of DNA polymerase I, ligation with BamHI linkers; mutant 17, BglII cleavage, repair with DNA polymerase I, ligation; mutants 18 and 19, Bgl II cleavage, repair synthesis, but with omissionof dCTP (for mutant 18) or of dTTP and dCTP (for mutant 19), S1 nuclease digestion, ligation with Sal I linkers (for mutant18) or BamHI linkers (for mutant 16); mutant 20, Bgl II cleavage, S1 nuclease digestion, ligation; mutant 21, Bcl I cleavage,repair synthesis using DNA polymerase I, ligation with a BamHI linker; mutant 22, Bgl H/Bcl I cleavage, ligation; mutant23, Bcl I cleavage, S1 nuclease digestion, ligation; mutant 24, 
Bcl I cleavage, S1 nuclease digestion, ligation. The sequencesof the mutations in IS50 elements were determined as follows: The I end of IS50 in pBRG700 is at pBR333 position 39, closeto the Cla I site (position 24). To analyze mutations at the I end, the pBRG700 derivatives were cleaved with Cla I, labeledwith DNA polymerase and [a-32P]dGTP, and cleaved again with EcoRI, and the large labeled fragment was isolated and ana-lyzed. There are no restriction sites in pBR333 that are as convenient for end labeling near the 0 end of IS50. Hence, mutationsmade at the 0 endbyBamHI linker insertion were brought close to the I end and thereby to the convenient Cla I site by deletionof the central part of IS50 using Bgl II and BamHI digestion and ligation of the complementary 5' G-A-T-C extensions. Thesequences of mutations at the0 end made without aBamHI linker were determined after labeling termini formed by digestionwith Sau3A (position 311 in IS50, as well as many other sites in pBRG700), cleavage of the labeled DNA with Hpa II, andisolation of the labeled "510-bp fragment that contains the junction between the 0 end of IS50 and the pBR333 sequences.

the I end were mutant, the Tetr/Ampr ratio increased. Theseeffects are included as corrections in the calculations of trans-position activity in Table 2.
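The correction described here can be sketched as a simple normalization: the raw λ Tetʳ/λ Ampʳ ratio of a mutant is divided by the ratio of its reference mutant from Table 1 (0.72 for mutant 1 at the O end, 1.63 for mutant 12 at the I end). The paper does not spell out the arithmetic beyond this definition, so the function below is an assumed reading of it, and the titers in the usage line are hypothetical.

```python
# Reference Tet(r)/Amp(r) ratios from Table 1:
# mutant 1 (O end, promoter) and mutant 12 (I end, structural gene).
REFERENCE_RATIO = {"O": 0.72, "I": 1.63}


def transposition_activity(tet_titer: float, amp_titer: float, end: str) -> float:
    """Normalize a raw Tet(r)/Amp(r) ratio to the reference mutant for that end.

    Assumed reading of the Table 2 legend: activity = raw ratio / reference ratio.
    """
    raw_ratio = tet_titer / amp_titer
    return raw_ratio / REFERENCE_RATIO[end]


# Hypothetical titers (not from the paper): an O-end mutant whose raw
# ratio is half that of mutant 1 scores an activity of about 0.5.
print(transposition_activity(0.36, 1.0, "O"))
```

With this definition the reference mutants themselves score 1.0, which agrees with the activities listed for mutants 1 and 12 in Table 2.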

To test whether the recognition sites extend past the common 8/9 bp, we mutagenized pBRG700 in vitro using convenient restriction endonuclease cleavage sites. For example, Hga I recognizes the asymmetric sequence 5' G-C-G-T-C 3', which is present at positions 21-25 at the O end of IS50 (and also at four other sites in pBRG700). It makes staggered cuts in the complementary DNA strands 5 and 10 base pairs to the left of its recognition site and leaves a 5' extension that is converted to a flush end when copied by DNA polymerase I. An IS50 mutant with 15 bp of the O-end sequence intact was made by digestion of pBRG700 DNA with Hga I using conditions that maximize the yield of singly cleaved (full-length) linear DNA, treatment with DNA polymerase I and DNA ligase, and transformation into competent cells. A plasmid carrying the desired mutation was identified by restriction endonuclease digestion and DNA sequence analysis. The mutation leaving just 15 bp of O-end sequence intact was found to severely impair transposition (Table 2, mutant 8), and thus the O-end recognition site is more than 15 bp long.

A mutation that contains 19 bp of O-end sequence interrupted by a C-G insertion (mutant 5) was generated by Hpa I digestion of the preparation previously treated with Hga I and DNA polymerase I. Hpa I cuts pBRG700 only within IS50, makes flush ends, and at this site leaves the sequence 5' A-A-C-G-T ... (7). The mutation generated by intramolecular ligation following Hpa I cleavage exhibited normal transposition.

To test whether I-end recognition also requires sequences beyond the common 8/9 bp, we cleaved pBRG700 with Bcl I, removed the 5' G-A-T-C extension with nuclease S1, and ligated the linear DNA in the presence of octanucleotide BamHI linkers. The resultant mutated I end retained just 9 bp of normal sequence and did not function in transposition (mutant 23).

To identify the recognition sites of IS50 more clearly, we used additional mutagenesis strategies summarized in the legend to Table 2. Our results establish that normal transposition activity requires 18 or 19 bp of the O-end and 19 bp of the I-end sequence. An independent study (21), which appeared after submission of this report, concluded that between 19 and 23 bp of wild-type sequence are needed at the O end for normal transposition. The recognition sites of IS50 were further defined by recombining mutants 4 and 15 (Table 2) at their BamHI linkers, thereby generating a mini-IS50 element retaining only 19 bp of the O-end and 19 bp of the I-end sequence. A DNA segment encoding Tetʳ was then inserted into the single BamHI site. This mini-IS50-Tet transposed to phage λ at frequencies of 1-10 × 10⁻⁷ when complemented in cis by a transposase gene placed elsewhere in the plasmid (unpublished data). Because mini-IS50 contains only the 19 bp from the two ends of the element, no additional internal sequences are required for specific recognition.
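The reasoning from Table 2 to a site length can be sketched as a scan from the longest retained end sequence downward, stopping at the first mutant whose activity falls below a proficiency cutoff. The 0.8 cutoff is an assumption made for this illustration, not a criterion stated in the paper, and upper bounds such as "<0.001" are entered as the bound itself.

```python
# (wild-type bp retained, transposition activity) pairs from Table 2.
# Values reported as upper bounds ("<0.001", "<0.003") are entered as the bound.
O_END_MUTANTS = [(50, 1.0), (23, 1.0), (23, 1.1), (19, 0.86), (19, 0.89),
                 (17, 0.42), (16, 0.013), (15, 0.006), (15, 0.006),
                 (13, 0.006), (3, 0.001)]
I_END_MUTANTS = [(111, 1.0), (25, 1.2), (20, 1.0), (19, 1.1), (18, 0.17),
                 (18, 0.20), (17, 0.049), (16, 0.061), (14, 0.031),
                 (13, 0.003), (13, 0.003), (9, 0.003), (7, 0.003)]


def shortest_proficient_length(mutants, threshold=0.8):
    """Shortest retained length such that it and every longer mutant is proficient.

    `threshold` is an assumed cutoff for "normal" activity, chosen for illustration.
    """
    shortest = None
    for length, activity in sorted(mutants, reverse=True):
        if activity < threshold:
            break
        shortest = length
    return shortest


print(shortest_proficient_length(O_END_MUTANTS))  # 19
print(shortest_proficient_length(I_END_MUTANTS))  # 19
```

Because no O-end mutant retaining exactly 18 bp was tested, this scan cannot distinguish 18 from 19 bp at the O end, which is consistent with the "18 or 19" stated in the text.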

Because only 12 of the 19 bp at the O and I ends of IS50 are matched in sequence, it is evident that transposase possesses considerable flexibility in sequence recognition. This conclusion is underscored by the partial transposition activity of mutations leaving 16 or 17 bp of O-end or 14-18 bp of I-end sequences intact (mutants 6, 7, and 16-20) and by the apparent ability to ignore the C-G insertion at position 18 in mutant 5.
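The 12-of-19 figure can be checked position by position against the wild-type sequences in Table 2 (mutants 1 and 12, both read inward from the terminus). This is a small illustrative check; the helper name is hypothetical.

```python
# First 19 bp of each end, read 5'->3' inward from the terminus (Table 2, mutants 1 and 12).
O_END = "CTGACTCTTATACACAAGT"
I_END = "CTGTCTCTTGATCAGATCT"


def matching_positions(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which the two sequences carry the same base."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x == y]


positions = matching_positions(O_END, I_END)
print(positions)       # [1, 2, 3, 5, 6, 7, 8, 9, 13, 14, 16, 19]
print(len(positions))  # 12
```

Eight of the matches fall within the first 9 bp and the remaining four at positions 13, 14, 16, and 19, in agreement with the legend to Fig. 1.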

DISCUSSION

Probably the only sequences in IS elements directly required for transposition are those that serve as transposase recognition sites. The characteristics of the interactions between transposase proteins and their recognition sites are likely to determine the overall frequency and specificity of transposition, as well as the ability of two related and linked IS elements to move in unison as a transposon. According to models in which transposition occurs by a rolling circle replicative mechanism (22, 23), transposase-target site interactions govern whether donor and target DNAs remain linked as cointegrates or re-form as separate DNA molecules. According to our alternative models, these interactions govern whether an element moves by a conservative (nonreplicative) process or is copied during transposition (1, 20, 24).

Theoretical considerations and functional tests had indicated that the recognition sites of IS50 probably extend past the common 8/9 bp into the region of dissimilar DNA sequence and that they were thus of unusual interest (5, 11). To determine the actual length of these recognition sites, we generated mutations near each end of IS50 (O and I) and examined their effects on transposition. Our estimate that the O-end recognition site is 18 or 19 bp long is based, most critically, on the findings that mutants 6 and 7, which retain only 17 and 16 bp of O-end sequence, are partially defective in transposition whereas mutant 4, which retains 19 bp of the O-end sequence, is transposition proficient. Similarly, the estimate that the I site is 19 bp long depends on the partial transposition deficiency of mutants 16 and 17, which retain 18 bp of the I-end sequence, in contrast to the transposition proficiency of mutant 15, which retains 19 bp. The mobility of mini-IS50-Tet when complemented for its defect in transposase synthesis eliminates complicated models in which transposition is dependent on one or more additional sites buried within the element.

The poor match of the sequences at the two ends of IS50 (only 12 of 19, in contrast to the match of at least 15 bp in most other elements) suggests that its transposase is intriguingly flexible in DNA sequence recognition. Independent evidence for flexibility comes from the leakiness of the transposition deficiencies caused by mutations leaving 16 or 17 bp of the O-end sequence or 14-18 bp of the I-end sequence intact and also from the wild-type phenotype of mutant 5, which has a C-G insertion at position 18 (Table 2).

From the studies that showed that two IS50 O ends function 100- to 1,000-fold more effectively than two I ends but that an O plus an I end function with an efficiency similar to that of two O ends, we have proposed that transposase acts in two steps: (i) Initial binding, efficient only at the O-end sequence, generates an activated transposase-O-end complex. (ii) The activated transposase migrates along the DNA molecule, binding later and with less discrimination to either an O- or an I-end sequence (5, 11). The changes in the λ Tetr/λ Ampr ratio seen in transposition from heterodimers with one IS50 component defective in transposase synthesis (Table 1) suggest that the first step in transposase binding is most efficient at a recognition site closely linked to its site of synthesis. This localization could be achieved by the formation of active DNA binding domains in the transposase protein prior to completion of its coupled transcription and translation (18).

Why are the recognition sites of IS50 so dissimilar? For an answer we must consider the possible evolutionary origins of IS elements. We propose that many pathways of IS element formation began with chance fusions between genes that encoded proteins much like present day repressors, helicases, and topoisomerases, which bind, migrate on, and nick and reseal DNA molecules, and that some transposase recognition sites may have evolved from operators to which ancestral repressor proteins had bound. Neither these genes nor the primordial recognition site need have been part of a preexisting mobile element. The recognition sites at both ends of an IS element might be related by a reverse duplication of the first binding site (see ref. 25). However, our data also evoke models such as that in Fig. 3, in which the O and I ends do not share a common ancestry. An incipient IS element would be created by the linking of a transposase (tnp) gene and a single recognition site "O" as indicated above. The primitive transposase activated by binding this "O" site would migrate on the DNA molecule, occasionally encounter other sequences for which it possessed fortuitous affinity, and cause transposition.

FIG. 3. Model to explain the evolution of an IS element from a simple immobile gene complex. The stages shown are labeled incipient, primitive, and evolved. The boxes represent actual or potential transposase binding sites that differ in DNA sequence. "O," 'O,' and O represent successive stages in the evolution of the O end binding site, and "tnp," 'tnp,' and tnp represent successive forms of the transposase gene, each encoding a protein of somewhat different specificity (coadapted to its recognition sites). The right fork represents a spontaneous deletion within the IS element, and the left fork represents a shortening of the IS element by transposition using an alternative fortuitous recognition site near to or overlapping the end of the tnp gene.

Natural selection would favor mutant elements that transposed more efficiently. Mutations would accumulate in the chance binding sites, in the tnp gene, or, following changes in transposase specificity, in the "O" sequence itself. Because chance binding sites would be rare, the earliest IS elements may have been quite long. Shorter derivatives would then have arisen by spontaneous internal deletions drawing the distant end closer to the tnp gene.

Truncated IS elements would also form due to the recognition by the transposase of other sequences, often potentiated by changes in the specificity of the protein. Once a shortened IS element had moved from its earlier location, the recapture of its previous recognition site would not occur. Rather, the selection for more efficient transposition would speed the evolution of the new recognition site, of the tnp gene, and probably of the O-recognition site as well. The current form of IS50, with the dissimilarity between its two ends and the overlap between essential coding and recognition sequences, may thus represent the endpoint of transpositional shortenings, reflecting a continuous pressure for compactness inherent in the nature of the transposition process itself.
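As a rough illustration of the arithmetic behind two points in the Discussion, the following Python sketch (not part of the original analysis) counts matching positions between the O and I ends and brackets the recognition-site length from the retained lengths of the deletion mutants. The 19-bp end sequences are transcribed from Fig. 1, read inward from each terminus, and should be checked against the figure. The function names, and the simplifying rule that any partial transposition defect implies a truncated site, are assumptions made for this sketch.

```python
# Illustrative sketch only; names and the decision rule are assumptions.

# First 19 bp of each IS50 end, read 5'->3' inward from the terminus.
# Assumption: transcribed from Fig. 1; verify against the figure.
O_END = "CTGACTCTTATACACAAGT"
I_END = "CTGTCTCTTGATCAGATCT"


def count_matches(a, b, length):
    """Count positions 1..length at which the two sequences agree."""
    return sum(x == y for x, y in zip(a[:length], b[:length]))


def bracket_site_length(mutants):
    """Bracket the recognition-site length from deletion mutants.

    mutants: list of (bp of end sequence retained, transposition proficient?).
    A defective mutant retaining n bp implies the site is longer than n;
    a proficient mutant retaining n bp implies the site is no longer than n.
    Returns (lower bound, upper bound) in bp.
    """
    defective = [n for n, proficient in mutants if not proficient]
    proficient = [n for n, ok in mutants if ok]
    return max(defective) + 1, min(proficient)


# Retained lengths and phenotypes as stated in the Discussion.
O_MUTANTS = [(19, True), (17, False), (16, False)]  # mutants 4, 6, 7
I_MUTANTS = [(19, True), (18, False), (18, False)]  # mutants 15, 16, 17

print(count_matches(O_END, I_END, 9))   # matches within the first 9 bp
print(count_matches(O_END, I_END, 19))  # matches within the first 19 bp
print(bracket_site_length(O_MUTANTS))   # bounds for the O-end site
print(bracket_site_length(I_MUTANTS))   # bounds for the I-end site
```

Under these assumptions the sketch gives 8 of 9 and 12 of 19 matching positions, an O-end site of 18 or 19 bp, and an I-end site of 19 bp, in line with the estimates given in the text. It ignores the leakiness of the mutant phenotypes, which the Discussion treats as evidence for flexible recognition.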

We are grateful to Dr. H. Huang for stimulating discussions; to S. Uknes and M. Schmandt for superb technical assistance; and to Drs. C. M. Berg, D. Hard, M. Howe, D. Schlessinger, and F. Stahl for critical readings of the manuscript. This work was supported by Public Health Service International Research Fellowship 1 F05 TW02949 (to C.S.), a scholarship from Rotary International (to G.F.C.) and U.S. Public Health Service Research Grants AI 14267 and AI 18980 (to D.B.).

1. Berg, D. E. & Berg, C. M. (1983) Biotechnology , 427-435.
2. Iida, S., Meyer, J. & Arber, W. (1983) in Mobile Genetic Elements, ed. Shapiro, J. A. (Academic, New York), pp. 159-221.
3. Kleckner, N. (1981) Annu. Rev. Genetics 15, 341-404.
4. Campbell, A., Berg, D. E., Botstein, D., Lederberg, E. M., Novick, R. P., Starlinger, P. & Szybalski, W. (1979) Gene 5, 197-206.
5. Berg, D. E., Johnsrud, L., McDivitt, L., Ramabhadran, R. & Hirschel, B. J. (1982) Proc. Natl. Acad. Sci. USA 79, 2632-2635.
6. Berg, D. E. & Drummond, M. H. (1978) J. Bacteriol. 136, 419-422.
7. Auerswald, E., Ludwig, G. & Schaller, H. (1980) Cold Spring Harbor Symp. Quant. Biol. 45, 107-113.
8. Rothstein, S. J. & Reznikoff, W. S. (1981) Cell 23, 191-199.
9. Johnson, R. C., Yin, J. C. P. & Reznikoff, W. S. (1982) Cell 30, 873-882.
10. Isberg, R. R., Lazaar, A. L. & Syvanen, M. M. (1982) Cell 30, 883-892.
11. Sasakawa, C. & Berg, D. E. (1982) J. Mol. Biol. 159, 257-271.
12. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).
13. Maxam, A. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
14. Sasakawa, C., Lowe, J. B., McDivitt, L. & Berg, D. E. (1982) Proc. Natl. Acad. Sci. USA 79, 7450-7454.
15. Hirschel, B. J. & Berg, D. E. (1982) J. Mol. Biol. 155, 105-120.
16. Sutcliffe, G. (1978) Cold Spring Harbor Symp. Quant. Biol. 43, 77-97.
17. Berg, D. E., Schmandt, M. A. & Lowe, J. B. (1983) Genetics, in press.
18. Berg, D. E., Lowe, J. B., Sasakawa, C. & McDivitt, L. (1982) in Fourteenth Stadler Genetics Symposium, ed. Redei, G. (Univ. Missouri Press, Columbia, MO), pp. 5-29.
19. Isberg, R. R. & Syvanen, M. (1981) J. Mol. Biol. 150, 15-32.
20. Berg, D. E. (1983) Proc. Natl. Acad. Sci. USA 80, 792-796.
21. Johnson, R. C. & Reznikoff, W. S. (1983) Nature (London) 304, 280-282.
22. Harshey, R. & Bukhari, A. I. (1981) Proc. Natl. Acad. Sci. USA 78, 1090-1094.
23. Galas, D. J. & Chandler, M. (1981) Proc. Natl. Acad. Sci. USA 78, 4858-4862.
24. Berg, D. E. (1977) in DNA Insertion Elements, Plasmids and Episomes, eds. Bukhari, A. I., Shapiro, J. A. & Adhya, S. L. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 205-212.
25. Chow, L. T., Davidson, N. & Berg, D. E. (1974) J. Mol. Biol. 86, 69-89.
