Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 3111-3115, April 1991
Microbiology

Site-specific integration of mycobacteriophage L5: Integration-proficient vectors for Mycobacterium smegmatis, Mycobacterium tuberculosis, and bacille Calmette-Guérin

(pathogenesis/Mycobacterium leprae/site-specific recombination/vaccines)

MONG HONG LEE*, LISA PASCOPELLA†, WILLIAM R. JACOBS, JR.†, AND GRAHAM F. HATFULL*

*Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260; and †Howard Hughes Medical Institute, Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, NY 10461

Communicated by Allan M. Campbell, January 14, 1991

ABSTRACT Mycobacteriophage L5, a temperate phage of mycobacteria, integrates site-specifically into the Mycobacterium smegmatis chromosome. We have identified the int gene and attP site of L5, characterized the chromosomal attachment site (attB), and constructed plasmid vectors that efficiently transform M. smegmatis through stable site-specific integration of the plasmid into the bacterial genome. These integration-proficient plasmids also efficiently transform slow-growing mycobacteria such as the pathogen Mycobacterium tuberculosis and the vaccine strain bacille Calmette-Guérin (BCG). The ability to easily generate stable recombinants in these slow-growing mycobacteria without the requirement for continual selection is of particular importance for the construction of recombinant BCG vaccines and for the isolation and characterization of mycobacterial pathogenic determinants in animal model systems. Integration vectors of this type should be of general use in a number of additional bacterial systems where temperate phages have been identified.

Mycobacteria cause several important diseases of humans, including tuberculosis, leprosy, and the opportunistic infections of AIDS patients. At the same time, bacille Calmette-Guérin (BCG), an avirulent derivative of the Mycobacterium tuberculosis complex, is one of the most widely used human vaccines and has been proposed as a host for the construction of recombinant live vaccines expressing novel protective epitopes (1). In spite of the role of mycobacteria as important human pathogens, the difficulties associated with growing the pathogenic species have discouraged detailed genetic analysis of these organisms (2). For example, Mycobacterium leprae (the etiologic agent of leprosy) has yet to be successfully cultivated in the laboratory and must be grown in armadillos or the footpads of mice. M. tuberculosis can be grown in defined medium but grows extremely slowly, a characteristic of many of the pathogenic mycobacteria. As a consequence, the genetic basis of mycobacterial pathogenesis is poorly understood. Recently, efficient transformation methods and "phasmid" and plasmid cloning vehicles have been developed with which these questions can be addressed (1-5). However, the study of pathogenic determinants in animal systems using recombinant mycobacteria requires that the introduced DNA molecules be stably maintained in the mycobacteria during propagation in animals.

The requirement for genetically stable mycobacteria recombinants is particularly important for the construction of recombinant BCG vaccines, since BCG is administered as live bacteria, and it is important that the novel protective epitopes are not lost subsequent to vaccination. For the construction of live recombinant Salmonella vaccines, this problem has been addressed by genetic selection of the plasmid such that plasmid maintenance is essential for cell viability (6, 7); alternatively the antigen genes can be integrated into the Salmonella genome by homologous recombination (8, 9) or by random integration of transposons (10).

Although systems for homologous recombination and transpositional recombination have been described for the fast-growing Mycobacterium smegmatis (11, 12), no successful application has yet been demonstrated for the slow-growing mycobacteria such as BCG or M. tuberculosis. In this paper, we describe the site-specific integration system of temperate mycobacteriophage L5 and the construction of integration-proficient vectors for integration of plasmid sequences into mycobacterial genomes. We demonstrate that integration is efficient not only in M. smegmatis but also in M. tuberculosis and BCG. Furthermore, the recombinants produced are genetically stable.

MATERIALS AND METHODS

Plasmids, Bacteriophages, and Bacterial Strains. Mycobacteriophage L5 (3) is a close relative of phage L1 (3, 13), and both yield identical patterns with >20 different restriction enzymes. L1 differs from L5 in that it does not form plaques at 42°C (W.R.J. and L.P., unpublished data). The kanamycin-resistance gene from Tn903 was kindly provided by K. Derbyshire and N. Grindley (Yale University) as a 1-kilobase (kb) HindIII cassette. Plasmid pUC119 has been described (14). M. smegmatis mc²6 and the high-efficiency variant mc²155 are described elsewhere (1, 3), and mc²6(L5) and mc²155(L5) are lysogens of L5 in mc²6 and mc²155, respectively. BCG Pasteur was from the Statens Seruminstitut (Copenhagen) and M. tuberculosis H37Ra was kindly provided by W. Jones (Centers for Disease Control, Atlanta).

Isolation of attB, attL, and attR DNA. A λ phage recombinant containing attR DNA was isolated from a library of mc²6(L5) DNA in λ EMBL3 (14) by screening with radiolabeled attP DNA, and a 1.1-kb Sal I attR fragment was subcloned into pUC119. Radiolabeled attR DNA (the 1.1-kb Sal I fragment) was used to isolate a cosmid containing attB DNA from a cosmid library of mc²6 (a nonlysogen), and a 1.7-kb Sal I attB fragment was subcloned into pUC119. A λ recombinant clone containing attL DNA was isolated from a mc²6(L5)/EMBL3 library by hybridization with radiolabeled attB DNA, and a 3.2-kb BamHI attL fragment was subcloned into pUC119. We were initially unable to isolate attL from the mc²6(L5) library by using radiolabeled attP DNA, probably because of the relatively small size (0.2 kb) of the L5 segment in the 3.2-kb BamHI attL fragment (see Fig. 3A). This fragment is significantly larger than that described in ref. 3 (1.7 kb) for the following reason. The 1.7-kb BamHI fragment was identified as an attachment junction since it was present (as seen by hybridization with phage DNA) in a lysogen but not in phage particles (3). However, restriction mapping and DNA sequence information indicated that this 1.7-kb BamHI fragment originated from ligated cos termini of the phage genome (G.F.H., unpublished data). The 3.2-kb BamHI fragment containing attL was not observed by hybridization of DNA from a lysogen, presumably because of the small size of the phage segment and because of comigration of restriction fragments of a similar size.

DNA Sequences. The DNA sequence of a 2-kb Sal I segment of the L5 genome was extracted from a data base of the entire L5 DNA sequence (G.F.H., unpublished data). This data base was compiled from assembly of DNA sequences of randomly selected subclones derived from sonication of L5 DNA and sequenced by dideoxy methods (15). The DNA sequence of this 2-kb segment was determined on both strands. The DNA sequences of attL, attR, and attB sites were determined by dideoxy sequencing of subclones derived either by sonication or by restriction enzyme cleavage of the appropriate recombinant plasmids. Compilation and analysis of sequences were as described (16).

Abbreviation: BCG, bacille Calmette-Guérin.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Construction of Plasmids pMH5, pMH92, and pMH94. pMH5 is a pUC119 derivative containing a 4.9-kb insert of the phage L5 genome and was constructed in two stages. In the first step, a 6.7-kb BamHI fragment derived from L5 DNA was inserted into the BamHI site of pUC119, and the 3.7-kb segment of L5 DNA between the SnaBI site and the vector EcoRI site was deleted, thus leaving a unique BamHI site. In the second step, a 1.9-kb BamHI fragment isolated from L5 was inserted into this unique BamHI site in the appropriate orientation so that the resulting plasmid contained a contiguous 4.9-kb segment of L5 DNA. A 1-kb HindIII fragment containing the kanamycin-resistance gene from Tn903 was then inserted into the unique HindIII site of this plasmid to produce pMH5 (Fig. 1A).

pMH92 and pMH94 were constructed by inserting a 2.0-kb Sal I fragment (which includes attP and int; see Fig. 1) from pMH5 into a pUC119 derivative carrying the same kanamycin-resistance cassette as in pMH5. The L5 DNA in pMH94 is oriented the same as in pMH5 relative to the vector sequences, and in the opposite orientation in pMH92 (see Fig. 1A).

FIG. 1. Organization of the attP-int region of mycobacteriophage L5. (A) The top line shows the BamHI restriction map of L5 as deduced from the DNA sequence of the L5 genome. The location of the 6.7-kb fragment containing the attP site (3) is shown, with a more detailed restriction map below. The position and orientation of the int gene are represented as the black arrow to the right of attP. The black bar through attP represents the 43-base-pair (bp) core sequence within attP. At the bottom portion are shown the segments of the L5 genome present in plasmids pMH5 (4.9 kb) and pMH92 and pMH94 (which contain the same 2-kb insert but in opposite orientations relative to the vector). The backbone of these vectors is pUC119 with a HindIII kanamycin-resistance cassette from Tn903 in the HindIII site. (B) DNA sequence of the 2-kb Sal I fragment of L5 present in pMH92 and pMH94 that contains all of the functions necessary for efficient transformation of mycobacteria. The 43-bp core sequence present in both attP and attB is boxed. The predicted amino acid sequence of Int is shown above the nucleotide sequence, with translation initiation occurring either at the underlined ATG codon (to produce a 332-amino acid protein) or at the upstream GTG codon (344-amino acid protein). The sequence underlined to the left of the core may function as a transcriptional terminator to stop transcription from bacterial sequences into the prophage. The underlined region immediately downstream of int can fold into structures resembling a transcriptional terminator or a putative RNase III site and thus resembles the sib site of phage λ (17). (C) Identification of the L5 Int function from amino acid similarity to the family of λ Int-related proteins (18-22). Two of the more highly conserved regions (domains 1 and 2; ref. 18) of the proteins are shown, the second of which includes the histidine, arginine, and tyrosine residues (with stars) that are absolutely conserved among all members of the family. L5 Int is clearly a member of this family.

Fig. 1C, L5 Int rows: domain 1, RIAAYILAWTSLRFGELIELRRKDIVDD; domain 2, HDLRAVGATFAAQAG-ATTKELMARLGHTT-PRMAMKYQM.
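As a hedged illustration of how the predicted Int sequence in Fig. 1B relates to the nucleotide sequence beneath it, the sketch below translates the stretch of int that encodes conserved domain 1 (Fig. 1C) using the standard genetic code. The codons are transcribed from the Fig. 1B listing; the function name `translate` and the way the codon table is built are illustrative choices, not part of the original methods.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (a common compact encoding).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Codons of the int segment encoding domain 1, as read from Fig. 1B.
DOMAIN1_DNA = (
    "CGG ATC GCG GCA TAC ATC CTG GCG TGG ACG AGC CTC CGG TTC "
    "GGA GAG CTG ATC GAG CTT CGC CGC AAG GAC ATC GTG GAC GAC"
).replace(" ", "")

print(translate(DOMAIN1_DNA))
```

Applied to this segment, the sketch reproduces the L5 row of domain 1 in Fig. 1C. The two candidate start codons named in the caption differ by 344 - 332 = 12 codons, that is, 36 bp of additional upstream coding sequence for the GTG start.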

Isolation of Chromosomal DNA. M. smegmatis cells were harvested from 10-ml cultures by centrifugation, suspended in 0.5 ml of 50 mM Tris, pH 8.0/50 mM EDTA/50 mM NaCl, and mixed with glass beads (0.2 mm; Sigma) in a vortex mixer. Phenol and chloroform (50 µl of each) were added and the sample was mixed again with a vortex mixer. The aqueous phase was recovered following brief centrifugation and the DNA was precipitated with 0.6 volume of 2-propanol, collected by centrifugation, washed, and resuspended in 30 µl of 10 mM Tris, pH 8.0/1 mM EDTA containing RNase A (10 µg/ml). Chromosomal DNA was isolated from BCG and M. tuberculosis H37Ra as described (23).

Tests for Plasmid Stability. Two independent transformants derived from each plasmid were inoculated into tryptic soy broth (3) containing antibiotics and grown to saturation. Cultures were diluted 1:10,000 into antibiotic-free medium and allowed to grow back to saturation (~13 generations). Dilution and growth of this culture were repeated (for a total of about 26 generations) and then the cells were plated for single colonies on solid medium. The proportion of antibiotic-resistant colonies was determined by replica plating onto antibiotic-containing and antibiotic-free media. The proportion of antibiotic-resistant colonies in the culture prior to the first subculture was also determined and found to be >98% for all experiments.
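The generation counts quoted above follow from the dilution factor: regrowth to saturation after a 1:10,000 dilution takes about log2(10,000) doublings. A minimal sketch of that arithmetic, assuming simple binary fission and that the culture returns to the same saturation density each round (the function name is illustrative):

```python
import math


def generations_after_dilution(dilution_factor: float, rounds: int = 1) -> float:
    """Doublings needed to regrow to the starting density after each dilution."""
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    return rounds * math.log2(dilution_factor)


one_round = generations_after_dilution(10_000)      # about 13.3
two_rounds = generations_after_dilution(10_000, 2)  # about 26.6
print(f"{one_round:.1f} generations per round, {two_rounds:.1f} in total")
```

The computed values, roughly 13.3 and 26.6, agree with the approximate figures of 13 and 26 generations given in the text.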


Other Methods. DNA manipulations, random-primer DNA labeling, agarose gel electrophoresis, transfer of DNA and libraries to nitrocellulose, and hybridization were as described (14). Electroporation of M. smegmatis, BCG, and M. tuberculosis was as described (3).

RESULTS

Organization of the attP-int Region of L5. Mycobacteriophage L5 is a temperate phage that infects M. smegmatis and some strains of BCG (L.P., S. Snapper, and W.R.J., unpublished data). It contains a linear double-stranded DNA genome of ~52 kb with cohesive ends (M. Oyaski and G.F.H., unpublished data) and forms stable lysogens through site-specific integration of the phage genome into the M. smegmatis chromosome (3). Recently, the DNA sequence of the entire L5 genome has been determined (G.F.H., unpublished data) and a putative integrase gene of L5 has been identified through similarity of the predicted amino acid sequence with known integrase sequences (Fig. 1 B and C). A summary of the attP-int organization in L5 is shown in Fig. 1A.

The L5 attachment site (attP) lies immediately to the 5' side of the int gene (Figs. 1A and 2A). This was determined by cloning of the attachment junctions (attL and attR) from an L5 lysogen [mc²6(L5)] and isolation of the M. smegmatis attachment site (attB) by using attR as a probe. Alignment of the attB and phage sequences revealed a 43-bp core region that is identical in the two sequences (Fig. 2A). The sequences of attL and attR show that site-specific recombination occurs within this core sequence (Fig. 2A). This 43-bp core sequence defines the location of attP, and the functional attP site (which is probably larger than the core as in other phage integration systems; ref. 17) lies within the 600-bp BamHI-Sal I fragment (see Fig. 1; data not shown). Preliminary data indicate that the site for translation initiation of int lies to the right of the BamHI site (Fig. 1B) but that sequences upstream of this BamHI site are necessary for int expression (G.F.H., unpublished data).

Two other notable features of the attP-int region of the L5 genome are regions of potential RNA secondary structure downstream of int and upstream of the core (Fig. 1B). The sequence at the 3' end of the int gene could potentially fold into two alternative RNA structures, one resembling a transcriptional terminator, the other a possible RNase III site; the similarity of these structures to the sib site of phage λ (25) and its position relative to the int gene suggest that this sequence may play a role in the regulation of int expression. The sequence upstream of the core has the appearance of a transcription terminator (a 13-bp stem-loop followed by UUUUUCUUU); it is situated such that it would terminate transcription entering the prophage genome from bacterial sequences when integrated as a lysogen.

FIG. 2. The bacterial attachment site attB. (A) Alignment of DNA sequences of attP, attB, attR, and attL, showing the 43-bp core sequence (within shaded box) common to all four sites. (B) DNA sequence of the attB site of BCG, which contains a region similar to the 43-bp core region of M. smegmatis (shaded) but with a single base difference (underlined). (C) Organization of the M. smegmatis genome near the attB site. Analysis of the DNA sequence around the attB core region revealed two tRNA genes transcribed in opposite orientations. The 43-bp core sequence overlaps the 3' end of a tRNAGly gene so that the gene is not disrupted when L5 integrates. The intergenic region between the 5' ends of the tRNAPro gene and the tRNAGly gene presumably contains the transcriptional promoters for these genes. Note that transcription from the tRNAGly gene will encounter the putative transcriptional terminator to the left of the core shown in Fig. 1B. (D) Conservation of L5 attB region in mycobacteria. Products from BamHI digests of chromosomal DNAs (as indicated) were separated by electrophoresis and analyzed by filter hybridization using a radiolabeled 1.7-kb Sal I attB fragment from M. smegmatis. Note that all of the species contain a strongly hybridizing analogue of this part of the genome, and several species have additional fragments that hybridize less strongly. These additional sites may represent closely related tRNA genes located elsewhere in the genome; variation in tRNA organization among mycobacteria has been reported (24). Also note that the patterns observed in BCG, M. bovis, and M. tuberculosis are very similar.

Bacterial Attachment Sites of M. smegmatis and BCG. Phage L5 integrates into a well-conserved part of the mycobacterial genome, since an M. smegmatis restriction fragment containing attB hybridizes strongly to DNA from several other species of mycobacteria, including the leprosy bacillus and BCG (Fig. 2D). Some species, such as Mycobacterium bovis and BCG, contain additional fragments that hybridize less strongly. DNA sequence analysis of the region containing the M. smegmatis attB site shows the presence of two putative tRNA genes, one coding for tRNAGly and the other for tRNAPro. The 43-bp attB core overlaps the tRNAGly gene (Fig. 2C), although integration within attB would not alter the tRNA sequence. While it is common to find phage attachment sites associated with tRNA genes, the significance of this association is not clear (26, 27).

Preliminary analysis of the homologous segment of the BCG genome indicates that both tRNA genes are present within a similar organization. The 43-bp core is present although there is a single base difference between the BCG sequence and the M. smegmatis and L5 sequences (Fig. 2B). This single difference, which lies within the variable loop of the tRNAGly, does not appear to have a significant effect on the efficiency of integration as determined by transformation frequencies (see below).

Integration-Proficient Vectors for M. smegmatis, BCG, and M. tuberculosis. Integration-proficient plasmids were constructed by cloning segments of the attP-int region into pUC119 vectors carrying the kanamycin-resistance gene from Tn903 (Fig. 1). The pUC119 vectors themselves are not competent to transform M. smegmatis (3). In contrast, plasmids such as pMH5 and pMH94 (carrying the L5 attP-int region) transform M. smegmatis mc²155 efficiently, yielding ≈10⁵ kanamycin-resistant transformants per µg of DNA, a number similar to that obtained with plasmids such as pYUB12 (or any derivatives of plasmid pAL5000; ref. 28), which replicate extrachromosomally in mycobacteria (Table 1). Inactivation of the int gene by truncation of the 3' end of the gene (without interruption of attP function) destroys the ability to transform mycobacteria. Plasmids carrying the 2-kb Sal I fragment (such as pMH94, Fig. 1) integrate as efficiently as those with a larger segment (pMH5), and the frequency is not influenced by the orientation of the insert (compare pMH92 with pMH94). These plasmids also efficiently transform M. smegmatis mc²6, a strain that does not support replication of extrachromosomal plasmids such as pYUB12 (5). Hybridization studies confirm that the plasmid sequences are integrated site-specifically at attB (Fig. 3A), and this has been observed for every integrant analyzed (at least 20).

Table 1. Transformation with integration-proficient vectors

                              Transformants, no./µg of DNA
Strain                        pMH5      pMH94       pYUB12
M. smegmatis mc²155           >10⁵      >10⁵        >10⁵
M. smegmatis mc²6             >10⁴      >10⁴        1-10
M. smegmatis mc²155(L5)       >10³      >10²        >10⁵
BCG Pasteur                   1-10      5 x 10³     >10⁵
M. tuberculosis H37Ra         ND        5 x 10²     5 x 10⁴*

Mycobacterial strains were transformed with pMH5 and pMH94 DNAs (0.05-1 µg) as described (3) and kanamycin-resistant transformants were selected. Note that mc²6 does not support transformation of episomal plasmids such as pYUB12, whereas the high-efficiency transformation strain mc²155 does (5). mc²155(L5) is an L5 lysogen of mc²155. ND, not done.
*pYUB18 (a derivative of pYUB12) was used.

pMH94 also efficiently transforms both BCG and M. tuberculosis H37Ra (Table 1), and hybridization studies show that integration is site-specific (Fig. 3B, lanes 5-8; data not shown). As expected, pMH94 has integrated into the region of the BCG chromosome that strongly hybridizes to the M. smegmatis attB region (see Fig. 2D). Thus the transformation and integration of pMH94 in these slow-growing mycobacteria are similar to those processes in the fast-growing M. smegmatis. pMH5, however, does not efficiently transform BCG, and analysis of the transformants obtained suggests that the plasmid has integrated site-specifically but has also sustained large deletions (Fig. 3B, lanes 2-4). The basis for this difference in pMH5 transformation between M. smegmatis and BCG is not clear.

FIG. 3. Site-specific integration of plasmids into the attB site of M. smegmatis and BCG. (A) Chromosomal DNAs isolated from M. smegmatis mc²6(L5) (an L5 lysogen, lane 1), mc²155 (lane 2), and four independent transformants of pMH92 in mc²155 were digested with BamHI and analyzed by filter hybridization using the 1.7-kb Sal I attB fragment from M. smegmatis. pMH92 has integrated into attB in each of the four transformants. A similar pattern of integration is seen with plasmids pMH5 and pMH94. The sizes of the BamHI fragments are 6.0 kb (attB), 9.3 kb (L5 attR), 3.4 kb (L5 attL), 7.3 kb (pMH92 attR), and 3.4 kb (pMH92 attL). (B) Chromosomal DNAs from BCG Pasteur (lane 1), three independent BCG transformants of pMH5 (lanes 2-4), and four independent pMH94 transformants (lanes 5-8) were digested with BamHI and analyzed by filter hybridization using a 1.9-kb BamHI attB fragment from BCG. Integration has occurred within the BCG attB site in each of the transformants shown. The larger junction fragments in the pMH5 transformants are different in size in each of the three transformants (2.5-3.5 kb), considerably smaller than the predicted size of 7.4 kb, indicating that they have each suffered large deletions within the plasmids. The sizes of the other fragments are 1.9 kb (BCG attB), 1.5 kb (pMH5 and pMH94 attR), and 1.0 kb (pMH94 attL).

pMH94 is stably maintained when integrated into the M.

smegmatis chromosome. This was demonstrated by growthof the transformants in nonselective medium followed bydetermination of the proportion of descendants that had lostthe plasmid-encoded kanamycin-resistance gene. After -30generations of nonselective growth, no loss of pMH94 wasobserved, whereas -35% of descendants had lost an extra-chromosomal plasmid (pYUB12). pMH5 was found to be lessstable than pMH94 (with about 20% loss), suggesting thatpMH5 (although not pMH94) may contain an xis gene (pre-liminary data are consistent with an xis function locatedbetween the Nhe I site and attP; see Fig. 1), which could alsoplay a role in the low efficiency of pMH5 transformation ofBCG. Preliminary evidence suggests, as expected, thatpMH94 and its derivatives are stable in both BCG and M.tuberculosis (L.P. and W.R.J., unpublished data).
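The stability figures reported here can be put on a per-generation basis with a short calculation. The following is a minimal sketch, assuming (this model is not part of the original analysis) that a plasmid is lost with a constant, independent probability in each generation and that plasmid-free and plasmid-bearing cells grow at the same rate. The function name `per_generation_loss` is hypothetical.

```python
def per_generation_loss(fraction_lost: float, generations: float) -> float:
    """Estimate the per-generation loss probability p.

    Assumed model (not from the paper): the fraction of cells retaining
    the marker after n generations is (1 - p) ** n.
    """
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    if generations <= 0:
        raise ValueError("generations must be positive")
    retained = 1.0 - fraction_lost
    return 1.0 - retained ** (1.0 / generations)


# Approximate losses reported in the text after about 30 generations.
observed = {"pYUB12": 0.35, "pMH5": 0.20, "pMH94": 0.0}
for plasmid, lost in observed.items():
    rate = per_generation_loss(lost, 30)
    print(f"{plasmid}: {rate:.4f} lost per generation")
```

Under these assumptions the extrachromosomal plasmid pYUB12 is lost at roughly 1.4% per generation and pMH5 at roughly 0.7%, whereas pMH94 shows no measurable loss.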

DISCUSSION

Site-specific integration of mycobacteriophage L5 shares many features with the site-specific integration systems of other phages. The core region of attP is somewhat larger than that of λ (43 bp compared to 15 bp), although neither the core region nor the surrounding area is significantly richer in AT base pairs than the rest of the phage genome (≈63% G·C). Since the core region overlaps a tRNA-Gly gene, much of the identity between attP and attB may be relevant not to the recombination event per se, but for maintaining the integrity of the tRNA gene once L5 has integrated. The presence of strongly hybridizing segments in other mycobacterial species and the strong conservation of the tRNA genes between BCG and M. smegmatis encourage us to think that functional attB sites are widespread among mycobacteria.

The site-specific integration vectors described here possess several features of value for genetic manipulation of mycobacteria. They transform mycobacteria with high efficiency to produce stable recombinants that contain a well-defined single-copy insert of the plasmid. That this integration event is not restricted to M. smegmatis but also functions in BCG and M. tuberculosis makes it ideal for the construction of recombinant BCG vaccines and offers a significant advantage over other methods such as homologous recombination that are likely to be considerably less efficient. Site-specific integration thus provides a simple and efficient mechanism for avoiding the loss of foreign genes from mycobacteria in situations where it is not possible to directly select for maintenance of these genes. This provides an alternative approach to the methods used for construction of recombinant Salmonella vaccines (6-10). The use of a phage site-specific recombination system for the generation of genetically stable recombinants should be applicable to a variety of bacterial systems where temperate phages have been identified. Phage integration systems have notable advantages over other integration methods (such as by transposition or general recombination) in that phage integration not only is very efficient but requires factors for excision that are not required for integration. Similar approaches have been described for Streptomyces (29).

Since the integration vectors can be used to efficiently insert large segments of DNA (L5 itself is 52 kb) into mycobacteria (i.e., by constructing cosmid libraries in integration-proficient vectors), they are well suited for the isolation of mycobacterial virulence determinants through the construction and integration of genomic libraries of virulent mycobacteria into avirulent strains. Since pMH94 also transforms M. tuberculosis H37Ra, it should be possible to construct stable integrated genomic libraries of M. tuberculosis H37Rv (a virulent strain) DNA in M. tuberculosis H37Ra (an avirulent strain) and hence to isolate pathogenic recombinants in animal model systems. The high efficiency of transformation with integration-proficient vectors is crucial for the construction of such libraries in order to obtain a representative number of recombinants, which may not be possible with general or transpositional recombination.
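The need for a representative number of recombinants can be made concrete with the standard Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size and P is the desired probability that any given sequence is present in the library. The sketch below is illustrative only: the genome size (4,400 kb) and cosmid insert size (40 kb) are assumed round figures, not values taken from this work, and `clones_needed` is a hypothetical helper name.

```python
import math


def clones_needed(genome_kb: float, insert_kb: float, probability: float) -> int:
    """Clarke-Carbon estimate of the library size N.

    N = ln(1 - P) / ln(1 - f), with f = insert size / genome size.
    """
    f = insert_kb / genome_kb
    if not 0.0 < f < 1.0:
        raise ValueError("insert must be smaller than the genome")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


# Assumed, illustrative sizes (not taken from the paper).
GENOME_KB = 4400.0
INSERT_KB = 40.0
for p in (0.95, 0.99, 0.999):
    n = clones_needed(GENOME_KB, INSERT_KB, p)
    print(f"P = {p}: about {n} independent integrants")
```

With these assumed sizes, roughly 500 independent integrants would be needed for 99% coverage, which is of the same order as the yield per µg of DNA reported for pMH94 in M. tuberculosis H37Ra (Table 1).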

We thank Lisa Petersen and Rupa Doshi for excellent technical assistance, Jose Ravano for contributing to the isolation of the attR site, and Barry Bloom and Nigel Grindley for their enthusiastic support for this work. We thank Ken Stover for many useful discussions and Nigel Grindley and Craig Peebles for critical reading of the manuscript. This research was supported by the Chemotherapy of Leprosy (THELEP) and Immunology of Leprosy (IMMLEP) components of the United Nations Development Program/World Bank/World Health Organization Special Programme for Research and Training in Tropical Diseases, by National Institutes of Health Grant AI26170 to W.R.J., and by grants from the WHO/UNDP Programme for Vaccine Development and National Institutes of Health Grant AI28927 to G.F.H.

1. Jacobs, W. R., Jr., Tuckman, M. & Bloom, B. R. (1987) Nature (London) 327, 532-535.

2. Jacobs, W. R., Jr., Snapper, S. B. & Bloom, B. R. (1988) in Molecular Biology of Infectious Diseases, ed. Schwarz, M. (Elsevier, New York), pp. 207-212.

3. Snapper, S. B., Lugosi, L., Jekkel, A., Melton, R. E., Kieser, T., Bloom, B. R. & Jacobs, W. R., Jr. (1988) Proc. Natl. Acad. Sci. USA 85, 6987-6991.

4. Ranes, M. G., Rauzier, J., LaGranderie, M., Gheorghiu, M. & Gicquel, B. (1990) J. Bacteriol. 172, 2793-2797.

5. Snapper, S., Melton, R., Kieser, T. & Jacobs, W. R., Jr. (1990) Mol. Microbiol. 4, 1911-1919.

6. Curtiss, R., Goldschmidt, R. M., Fletchall, N. B. & Kelly, S. M. (1988) Vaccine 6, 155-160.

7. Curtiss, R., Kelly, S. M., Gulig, P. A. & Nakayama, K. (1989) Adv. Exp. Med. Biol. 251, 33-47.

8. Hone, D., Attridge, S., Van den Bosch, L. & Hackett, J. (1988) Microb. Pathog. 5, 407-418.

9. Strugnell, R. A., Maskell, D., Fairweather, N., Pickard, D., Cockayne, A., Penn, C. & Dougan, G. (1990) Gene 88, 57-63.

10. Flynn, J. L., Weiss, W. R., Norris, K. A., Seifert, H. S., Kumar, S. & So, M. (1990) Mol. Microbiol. 4, 2111-2118.

11. Husson, R. A., James, B. E. & Young, R. A. (1990) J. Bacteriol. 172, 519-524.

12. Martin, C., Timm, J., Rauzier, J., Gomez-Lus, R., Davies, J. & Gicquel, B. (1990) Nature (London) 345, 739-742.

13. Doke, S. (1960) J. Kumamoto Med. Soc. 34, 1360-1373.

14. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY).

15. Biggin, M., Gibson, T. J. & Hong, G. F. (1983) Proc. Natl. Acad. Sci. USA 80, 3963-3965.

16. Staden, R. (1982) Nucleic Acids Res. 10, 4731-4751.

17. Weisberg, R. A. & Landy, A. (1983) in Lambda II, eds. Hendrix, R. W., Roberts, J. W., Stahl, F. W. & Weisberg, R. A. (Cold Spring Harbor Lab., Cold Spring Harbor, NY), pp. 211-250.

18. Poyart-Salmeron, C., Trieu-Cuot, P., Carlier, C. & Courvalin, P. (1989) EMBO J. 8, 2425-2433.

19. Yagil, E., Dolev, S., Oberto, J., Kislev, N., Ramaiah, N. & Weisberg, R. (1989) J. Mol. Biol. 207, 695-717.

20. Ye, Z.-H. & Lee, C. Y. (1989) J. Bacteriol. 171, 4146-4153.

21. Goodman, S. D. & Scocca, J. J. (1989) J. Bacteriol. 171, 4232-4240.

22. Ye, Z.-H., Buranen, S. L. & Lee, C. Y. (1990) J. Bacteriol. 172, 2568-2575.

23. Grosskinsky, C. M., Jacobs, W. R., Jr., Clark-Curtiss, J. E. & Bloom, B. R. (1989) Infect. Immun. 57, 1535-1541.

24. Bhargava, S., Tyagi, A. K. & Tyagi, J. S. (1990) J. Bacteriol. 172, 2930-2934.

25. Echols, H. & Guarneros, G. (1983) in Lambda II, eds. Hendrix, R. W., Roberts, J. W., Stahl, F. W. & Weisberg, R. A. (Cold Spring Harbor Lab., Cold Spring Harbor, NY), pp. 75-92.

26. Landy, A. (1989) Annu. Rev. Biochem. 58, 913-949.

27. Reiter, W.-D., Palm, P. & Yeats, S. (1989) Nucleic Acids Res. 17, 1907-1914.

28. Rauzier, J., Moniz-Pereira, J. & Gicquel, B. (1988) Gene 71, 315-321.

29. Omer, C. A., Stein, D. & Cohen, S. N. (1988) J. Bacteriol. 170, 2174-2184.
