
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Nov. 2006, p. 7098–7110 Vol. 72, No. 11
0099-2240/06/$08.00+0 doi:10.1128/AEM.00731-06
Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Multilocus Sequence Typing System for the Endosymbiont Wolbachia pipientis

Laura Baldo,1* Julie C. Dunning Hotopp,2 Keith A. Jolley,3 Seth R. Bordenstein,4 Sarah A. Biber,4 Rhitoban Ray Choudhury,5 Cheryl Hayashi,1 Martin C. J. Maiden,3 Hervé Tettelin,2 and John H. Werren5

Department of Biology, University of California, Riverside, California1; The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland2; The Peter Medawar Building for Pathogen Research, University of Oxford, Oxford, United Kingdom3; Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, The Marine Biological Laboratory, Woods Hole, Massachusetts4; and Department of Biology, University of Rochester, Rochester, New York5

Received 29 March 2006/Accepted 13 August 2006

The eubacterial genus Wolbachia comprises one of the most abundant groups of obligate intracellular bacteria, and it has a host range that spans the phyla Arthropoda and Nematoda. Here we developed a multilocus sequence typing (MLST) scheme as a universal genotyping tool for Wolbachia. Internal fragments of five ubiquitous genes (gatB, coxA, hcpA, fbpA, and ftsZ) were chosen, and primers that amplified across the major Wolbachia supergroups found in arthropods, as well as other divergent lineages, were designed. A supplemental typing system using the hypervariable regions of the Wolbachia surface protein (WSP) was also developed. Thirty-seven strains belonging to supergroups A, B, D, and F obtained from singly infected hosts were characterized by using MLST and WSP. The number of alleles per MLST locus ranged from 25 to 31, and the average levels of genetic diversity among alleles were 6.5% to 9.2%. A total of 35 unique allelic profiles were found. The results confirmed that there is a high level of recombination in chromosomal genes. MLST was shown to be effective for detecting diversity among strains within a single host species, as well as for identifying closely related strains found in different arthropod hosts. Identical or similar allelic profiles were obtained for strains harbored by different insect species and causing distinct reproductive phenotypes. Strains with similar WSP sequences can have very different MLST allelic profiles and vice versa, indicating the importance of the MLST approach for strain identification. The MLST system provides a universal and unambiguous tool for strain typing, population genetics, and molecular evolutionary studies. The central database for storing and organizing Wolbachia bacterial and host information can be accessed at http://pubmlst.org/wolbachia/.

In the past 10 years there has been increasing interest in the maternally inherited Wolbachia endosymbionts because of their remarkably widespread distribution and significant impact on the ecology, evolution, and reproductive biology of their host species (46, 54). Approximately 20 to 75% of all insect species harbor Wolbachia (20, 55), as do many arachnids and terrestrial crustaceans (7, 11, 12, 40). Individual insects can be infected with multiple Wolbachia strains (24, 55, 58), and geographically distinct populations of the same species can harbor different strains (29, 39). Outside the phylum Arthropoda, high infection levels have also been detected in the vast majority of pathogenic filarial nematodes (4). Overall, the extraordinary infection frequency among insects alone places members of the genus Wolbachia among the most widespread intracellular bacteria described thus far (55, 56). Wolbachia strains are typically vertically transmitted within a species through the cytoplasm of eggs (maternal inheritance). However, the wide range of infected hosts cannot be explained by vertical transmission alone, and the ability of these bacteria to spread to new host species via horizontal transfer accounts for the pandemic distribution, although the mechanisms and patterns of interspecies transfer are not well understood (59).

The importance of Wolbachia strains ranges from their effects on the reproductive biology, ecology, and evolution of their hosts to their potential use in biological control of pest insects and biomedical applications (35, 54, 61). Wolbachia strains can extensively manipulate host reproduction by means such as inducing parthenogenesis, feminizing genetic males, killing male embryos, or causing cytoplasmic incompatibility between gametes (16, 45, 46). In filarial nematodes, the host dependence on Wolbachia for propagation has generated great interest in the medical field for targeting these bacteria in the treatment of filariasis (10, 38).

Wolbachia strains were first found in the mosquito species Culex pipiens and designated Wolbachia pipientis (15). These bacteria were subsequently found to be widely distributed in arthropods. However, additional species have not been named, pending improved understanding of the genetic variation of this genus (6, 26). At present, supergroup designations are routinely used to describe the major phylogenetic subdivisions of this bacterial group. So far, eight supergroups have been designated (supergroups A to H) primarily on the basis of data from the 16S rRNA, ftsZ, and wsp (Wolbachia surface protein [WSP]) genes (31, 59, 62). Most of the supergroups are found in arthropods (supergroups A, B, E, F, G, and H), and the majority of insect Wolbachia strains belong to supergroups A and B. Fully annotated genomes of the following two Wolbachia strains are available: wMel, a supergroup A strain from the arthropod host Drosophila melanogaster (60), and wBm, a supergroup D strain from the nematode host Brugia malayi (9).

* Corresponding author. Mailing address: Department of Biology, University of California, 900 University Avenue, Riverside, CA 92521. Phone: (951) 827-3841. Fax: (951) 827-4286. E-mail: [email protected].

Published ahead of print on 25 August 2006.

Despite increasing studies of Wolbachia and a growing list of infected arthropod species, researchers still lack a standard and reliable system for typing and quantifying strain diversity. A strain typing system was previously developed using the Wolbachia surface protein gene, wsp (62). However, evidence of extensive recombination (2, 57) and strong diversifying selection (1, 21) at this gene make it unsuitable for reliable strain characterization if it is used alone. Furthermore, recent detection of recombination both within and between Wolbachia genes (3) suggests that a single-locus approach to strain characterization may be misleading. A multigene typing approach has recently been used to type Wolbachia strains (32). This study focused on strains found in Drosophila hosts and demonstrated the utility of multigene typing for Wolbachia strains.

Here we developed a standard multilocus sequence typing (MLST) system for Wolbachia, which works broadly with Wolbachia strains found in diverse singly infected host species. This system includes a web-accessible central database and toolkit for storage, organization, and analysis of data. The data are fully portable between laboratories and allow extensive comparative analyses (22). This accurate strain typing system should provide the genetic framework for tracing the movement of Wolbachia globally and within insect communities and for associating Wolbachia strains with geographic regions, host features (e.g., ecology and phylogeny), and phenotypic effects on hosts.

Originally used for global epidemiology and surveillance of recombinant pathogenic bacteria (27), the MLST approach defines a strain as a sequence type (ST) (equivalent to a haplotype) on the basis of a combination of alleles (i.e., allelic profile) at a sample of housekeeping genes. Similarity relationships among strains are described using the nucleotide sequence identity of alleles at each locus. The use of combinations of alleles as molecular markers to genotype strains means that recombination is not an impediment to strain characterization, and allelic information can be used to readily detect recombination between genes and rates of recombination within a population (17). Phylogenetic and other clustering algorithms can be applied to individual or concatenated gene data sets for further analyses of strain relationships, although caution in interpretation of phylogenetic relationships is necessary due to recombination issues.

The MLST scheme developed for Wolbachia indexes variation at five conserved genes (ftsZ, gatB, coxA, hcpA, and fbpA). The PCR primers used robustly amplify loci of strains belonging to supergroups A and B and potentially amplify loci of strains belonging to other supergroups as well. Specific primers for amplification of supergroup A and B strains coinfecting the same host individual are also described here. A profile database and an isolate database were generated to store and distribute the sequences and the strain and host information, respectively (22; http://pubmlst.org/wolbachia/). New strains and profiles can be submitted to the curator for inclusion in the databases, which thus represent a focal point for future storage and management of all Wolbachia strain data.

Given the current extensive use of wsp sequences for characterizing Wolbachia strains (25, 30, 36, 43, 53) and the suspected role of the Wolbachia surface protein in the host-symbiont interactions (1, 62), use of this gene as an additional optional strain marker is proposed. A WSP typing system was developed based on the amino acid sequences encoded by the four hypervariable regions (HVRs) of this gene (2), and a separate database for storing both nucleotide sequences of the wsp gene and amino acid sequences of single HVRs was generated. WSP typing is analogous to antigen protein typing used for pathogenic bacteria (33) and can complement the MLST information.

Thirty-seven strains representing supergroups A, B, D, and F from diverse singly infected host species were typed by using the MLST and WSP systems. MLST unambiguously identified, differentiated, and grouped genetically similar strains through comparisons of their allelic profiles. The MLST data revealed considerable diversity within the genus Wolbachia and also identified horizontal movement of strains between hosts and shifts in Wolbachia phenotypic effects on hosts (e.g., cytoplasmic incompatibility and parthenogenesis induction).

MATERIALS AND METHODS

Choice of loci. The choice of the genes used for the MLST system was facilitated by an ongoing project to characterize genetic diversity in Wolbachia. A total of 46 unique loci were amplified from a subset of more than 50 Wolbachia strains belonging to supergroups A, B, C, D, E, F, and H and sequenced. DNA was extracted from either a whole individual, the abdomen, or the gonads, depending on the size of the specimen, using a DNeasy tissue kit (QIAGEN, Germantown, MD).

Amplicons were generated using standard PCR conditions with HotStarTaq (QIAGEN) according to the manufacturer's recommendations and with each 64-fold degenerate primer at a concentration of 0.5 μM. The forward and reverse primers had 5′ tags of M13 forward and reverse primer sequences, respectively. These tags served as anchors for the degenerate primers during amplification and were subsequently used for sequencing. Amplification reactions were initiated with 15 min of incubation at 95°C, followed by 50 cycles of 95°C for 15 s, 55°C for 30 s, and 72°C for 1 min. The amplification reaction mixtures (8 μl) were treated with 0.5 U shrimp alkaline phosphatase and 1.0 U exonuclease I (Amersham) in the buffer supplied. Sequencing reactions were performed at the J. Craig Venter Joint Technology Center, Rockville, MD (an affiliate of The Institute for Genomic Research [TIGR], Rockville, MD) with the M13 forward and reverse primers. Assembly was done with the TIGR assembler (47), and sequences were manually curated with Cloe (a multiple-sequence editor heavily integrated into the TIGR database). Alignments were initially generated using CLUSTALX (51) and then manually edited in MacClade 4.08 (http://macclade.org) and Bioedit v. 7.0.4.1 (14).

From the initial collection of 46 loci, five conserved genes were chosen for MLST on the basis of the following features: (i) presence throughout the sequenced Rickettsiales order; (ii) a single copy in the wMel genome (60); (iii) wide distribution in the wMel genome (Fig. 1); and (iv) evidence of strong stabilizing selection within the genus Wolbachia (average ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site [Ka/Ks], ≪1). These criteria conform to the standard locus requirements for an MLST system (27, 52). The five genes chosen for the MLST scheme are gatB, coxA, hcpA, ftsZ, and fbpA (Table 1). Only one of these genes, ftsZ, was used in previous studies (59, 32). Briefly, the gatB gene encodes aspartyl/glutamyl-tRNA amidotransferase subunit B that synthesizes charged tRNAs in organisms lacking certain tRNA synthetases; coxA encodes a catalytic subunit of cytochrome oxidase of the respiratory chain; hcpA encodes a functionally uncharacterized protein whose length and amino acid sequence are highly conserved in many bacteria, which is designated NCBI COG0217 (49, 50); ftsZ encodes a protein involved in bacterial cell division; and fbpA encodes fructose-bisphosphate aldolase that in Wolbachia is likely to be involved in gluconeogenesis (8, 9). The surface protein WSP was included in this study as an additional marker for strain typing.

MLST and wsp primer design and amplification. For each gene, primers were designed manually using sequence alignments that were generated from the initial sample of more than 50 strains and, when possible, included outgroup sequences of Ehrlichia ruminantium Welgevonden, Ehrlichia chaffeensis Arkansas, Anaplasma marginale St. Maries, Anaplasma phagocytophilum HZ, and Neorickettsia sennetsu Miyayama. Primers (Table 1) were designed to be general across all arthropod Wolbachia supergroups examined but not to amplify any of the outgroups mentioned above. For wsp, primers were designed to amplify most of the gene (546 bp) (Table 1), including the four HVRs.
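The primers listed in Table 1 contain IUPAC ambiguity codes (for example R, Y, K, and B), so each written primer stands for a small pool of oligonucleotides. As a minimal sketch of how the size of that pool (the degeneracy) can be worked out, the Python snippet below multiplies the number of bases each code represents. The function name and dictionary are illustrative and are not part of the published protocol.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer encodes."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


# Primer sequences copied from Table 1 (gatB forward and reverse).
gatB_F1 = "GAKTTAAAYCGYGCAGGBGTT"
gatB_R1 = "TGGYAAYTCRGGYAAAGATGA"

print(primer_degeneracy(gatB_F1))  # K, Y, Y, B -> 2 * 2 * 2 * 3
print(primer_degeneracy(gatB_R1))  # Y, Y, R, Y -> 2 * 2 * 2 * 2
```

A lower degeneracy generally means a higher effective concentration of each individual oligonucleotide in the reaction, which is one reason to keep ambiguity codes to the positions that actually vary among supergroups.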

PCR amplification was performed by using 20-μl mixtures containing each deoxynucleoside triphosphate at a concentration of 0.2 mM, 1.5 mM MgCl2, each primer at a concentration of 1 μM, 0.5 U Taq DNA polymerase, 1× PCR buffer, and 1 μl DNA. Reactions were initiated by incubation at 94°C for 2 min, which was followed by 36 cycles of 94°C for 30 s, the optimal annealing temperature (see below) for 45 s, and 72°C for 1.5 min, a final elongation step at 70°C for 10 min, and a final hold at 4°C. The optimal PCR annealing temperature was 53°C for hcpA, 54°C for both gatB and ftsZ, 55°C for coxA, and 59°C for fbpA and wsp. PCR products were purified using Montage PCR centrifugal filter devices (Millipore) and were bidirectionally sequenced. An internal fragment of each PCR product was specifically selected for MLST (Table 1). Ambiguous bases and single-point polymorphisms were assessed by performing a second amplification and resequencing. All MLST and wsp sequences are presented in the coding reading frame.

Given the great genetic diversity of Wolbachia strains, alternative primer amplification strategies may be required if the standard primers (Table 1) do not work. For instance, use of tags for the MLST primers can improve their performance. The 64-fold degenerate primers used initially as part of a larger study are also provided for four of the five genes at the MLST website. For PCR amplification these primers can be used singly or in combination with the MLST primers using a nested strategy. Alternative ftsZ primers, ftsZuniF and ftsZuniR, which were designed previously (26), can also be used. Additional primers to help with amplification of more divergent strains and supergroup A and B strains occurring in doubly infected individuals have also been designed. All the primers described above and the PCR protocols are posted at the MLST website (http://pubmlst.org/wolbachia/). As experience with MLST sequencing of divergent strains accumulates, the information will be provided at the website.

Wolbachia strains. MLST and wsp primers were tested with 42 host species harboring single infections with strains belonging either to supergroup A, B, D, E, or F or to an unassigned supergroup (Table 2). Single-infection status was confirmed during sequencing. Wolbachia infection of the termite species Incisitermes snyderi was not reported previously, and the single-infection status of this relationship was assessed using supergroup A- and B-specific wsp primers (61). In addition, primer performance was evaluated for new species with previously unreported infections, including several species of the spider genus Agelenopsis, the hymenopteran Neivamyrmex opacithorax, the embiopteran Haploembia solieri, and unidentified species of the bristletail family Meinertellidae (Archaeognatha). In the latter case, Wolbachia 16S rRNA gene general primers were used as a control (56). In order to confirm the robustness and consistency of PCR results obtained in different laboratories, all primers were independently tested with a variable sample of strains by multiple laboratories. The supergroup A- and B-specific primers were tested with insects known to harbor double infections with supergroup A and B strains, including Nasonia longicornis (IV7), Nasonia giraulti (RV2), Nasonia vitripennis (R511), and Aedes albopictus; with insects singly infected with supergroup B strains, including N. vitripennis (4.9) and Protocalliphora sialia (NY); and with insects singly infected with supergroup A strains, including N. giraulti (16.2), N. longicornis (2.1), N. vitripennis (12.1), Drosophila ananassae (Hawaii), Drosophila simulans (wRi), D. simulans (wAu), and Drosophila melanogaster (wMel). Specific amplification of the group-specific sequences in doubly infected individuals (Nasonia species) was assessed by comparison with sequences obtained from experimentally separated cultures of the two strains. The primers were also tested with the following three N. longicornis strains carrying multiple supergroup A (A1A2) and/or B (B1B2) strains: N. longicornis strain PE 18bx5 (infection status, A1A2B1B2), strain PE 18bx4 (A1A2B1), and strain BTPE 18bx5 (A1A2B2). Electropherograms were inspected for the occurrence of double peaks to determine whether the use of the primers could reveal multiple infections by members of a single supergroup.

FIG. 1. Map of the wMel chromosome, showing the locations of the five MLST loci and wsp (inner circle). The outer and middle circles show the open reading frames on the plus and minus chromosomal strands, respectively.

TABLE 1. MLST and wsp loci and primer features

| Cluster category(a) | Locus code (wMel) | Gene | Product | Primer designation | Primer sequence (5′–3′) | Gene length (bp)(b) | Amplified nucleotide range (bp)(b) | MLST fragment size (bp) |
|---|---|---|---|---|---|---|---|---|
| α-Proteobacteria | WD_0146 | gatB | Glutamyl-tRNA(Gln) amidotransferase, subunit B | gatB_F1 | GAKTTAAAYCGYGCAGGBGTT | 1,425 | 421–891 | 369 |
| | | | | gatB_R1 | TGGYAAYTCRGGYAAAGATGA | | | |
| Escherichia coli | WD_0301 | coxA | Cytochrome c oxidase, subunit I | coxA_F1 | TTGGRGCRATYAACTTTATAG | 1,551 | 491–977 | 402 |
| | | | | coxA_R1 | CTAAAGACTTTKACRCCAGT | | | |
| Escherichia coli | WD_0484 | hcpA | Conserved hypothetical protein | hcpA_F1 | GAAATARCAGTTGCTGCAAA | 741 | 91–605 | 444 |
| | | | | hcpA_R1 | GAAAGTYRAGCAAGYTCTG | | | |
| Escherichia coli | WD_0723 | ftsZ | Cell division protein | ftsZ_F1 | ATYATGGARCATATAAARGATAG | 1,197 | 274–798 | 435 |
| | | | | ftsZ_R1 | TCRAGYAATGGATTRGATAT | | | |
| Rickettsiales | WD_1238 | fbpA | Fructose-bisphosphate aldolase | fbpA_F1 | GCTGCTCCRCTTGGYWTGAT | 900 | 241–749 | 429 |
| | | | | fbpA_R1 | CCRCCAGARAAAAYYACTATTC | | | |
| Wolbachia | WD_1063 | wsp | Outer surface protein | wsp_F1 | GTCCAATARSTGATGARGAAAC | 714 | 85–688 | 546 |
| | | | | wsp_R1 | CYGCACCAAYAGYRCTRTAAA | | | |

(a) Clustering categories indicating the level of protein conservation.
(b) With respect to the wMel genome.

MLST. Following MLST convention (27, 52), identical nucleotide sequences at a given locus for different strains were assigned the same arbitrary allele number (i.e., each allele has a unique identifier). Each strain was then characterized by the combination of the five MLST allelic numbers, representing its allelic profile. Each unique allelic profile was assigned an ST. Ultimately, an ST characterizes a strain. We defined an ST complex as a group of STs sharing a minimum of three alleles with a central ST. The complex is designated by the central ST, which is identified as the group founder based on BURST analyses implemented in START2 (23).
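The bookkeeping behind allele numbers, allelic profiles, STs, and ST complexes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the database implementation: the sequences are made-up three-base placeholders, numbers are assigned in order of first appearance, and the central ST is simply given as input rather than inferred with BURST.

```python
LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def assign_profiles(strains):
    """Map each strain to its allelic profile (one allele number per locus).

    Identical sequences at a locus receive the same arbitrary number.
    """
    allele_ids = {locus: {} for locus in LOCI}
    profiles = {}
    for strain, seqs in strains.items():
        profile = []
        for locus in LOCI:
            table = allele_ids[locus]
            seq = seqs[locus].upper()
            if seq not in table:
                table[seq] = len(table) + 1  # next unused allele number
            profile.append(table[seq])
        profiles[strain] = tuple(profile)
    return profiles


def assign_sts(profiles):
    """Give every unique allelic profile its own sequence type (ST)."""
    st_of_profile = {}
    sts = {}
    for strain, profile in profiles.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        sts[strain] = st_of_profile[profile]
    return sts


def shared_alleles(profile_a, profile_b):
    """Count loci at which two profiles carry the same allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def in_st_complex(profile, central_profile, minimum_shared=3):
    """True if a profile shares at least three alleles with the central ST."""
    return shared_alleles(profile, central_profile) >= minimum_shared


# Hypothetical toy data: s1 and s2 are identical at all five loci.
toy_strains = {
    "s1": {"gatB": "AAA", "coxA": "CCC", "hcpA": "GGG", "ftsZ": "TTT", "fbpA": "ACG"},
    "s2": {"gatB": "AAA", "coxA": "CCC", "hcpA": "GGG", "ftsZ": "TTT", "fbpA": "ACG"},
    "s3": {"gatB": "AAT", "coxA": "CCA", "hcpA": "GGG", "ftsZ": "TTT", "fbpA": "ACG"},
    "s4": {"gatB": "AAG", "coxA": "CCG", "hcpA": "GGA", "ftsZ": "TTT", "fbpA": "ACG"},
}
profiles = assign_profiles(toy_strains)
sts = assign_sts(profiles)
print(profiles)
print(sts)
```

In this toy set, s3 shares three alleles with s1 and would fall in the same ST complex if s1 were the central ST, whereas s4 shares only two and would not. Because strains are compared by allele identity rather than by the number of nucleotide differences, a single recombination event that replaces one allele changes only one position of the profile.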

When the complete set of the five MLST gene sequences cannot be obtained for a strain, single gene alleles and partial MLST allelic profiles can be submitted to the database. Partial data provide useful allele diversity information, allowing the profile database to grow. A strain cannot be assigned to an ST until full MLST data are provided, but it can receive an identification number and all specific strain and host information can be submitted to the isolate database.

WSP typing. In addition to the ST, each strain was characterized based on the amino acid motifs of the four HVRs of its WSP sequence. A relatively conserved set of amino acid motifs are present in each of the four HVRs, and there is shuffling of HVR motifs among strains (2). The use of WSP is analogous to the use of antigens for serotyping pathogenic bacteria (33) and is meant to be a complement to MLST, not a substitute for MLST. Specifically, each WSP amino acid sequence (amino acids 52 to 222 with respect to the wMel sequences) was partitioned into four consecutive sections whose breakpoints fell within conserved regions between the hypervariable regions (2), as follows: HVR1 (amino acids 52 to 84), HVR2 (amino acids 85 to 134), HVR3 (amino acids 135 to 185), and HVR4 (amino acids 186 to 222). Each section encompassed one of the four HVR motifs and a portion of the two conserved flanking regions. For simplicity here we refer to each section as an HVR. Each unique HVR haplotype was given a number. Each WSP sequence was thus defined as a combination of four numbers, its WSP profile, corresponding to the four HVR haplotypes.
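The partitioning described above can be illustrated with a short Python sketch. It assumes, for simplicity, that each WSP amino acid sequence has already been placed in wMel coordinates with no gaps, so that string position 1 corresponds to wMel amino acid 1; real sequences contain indels and would need to be aligned first. The function names and toy sequences are hypothetical.

```python
# HVR boundaries in wMel amino acid coordinates (1-based, inclusive).
HVR_RANGES = {
    "HVR1": (52, 84),
    "HVR2": (85, 134),
    "HVR3": (135, 185),
    "HVR4": (186, 222),
}


def split_hvrs(wsp_protein):
    """Cut a WSP sequence (wMel numbering, ungapped) into its four HVR sections."""
    return {
        name: wsp_protein[start - 1:end]  # convert 1-based inclusive to a slice
        for name, (start, end) in HVR_RANGES.items()
    }


def wsp_profiles(proteins):
    """Number unique HVR haplotypes and return a four-number profile per strain."""
    haplotype_ids = {name: {} for name in HVR_RANGES}
    profiles = {}
    for strain, protein in proteins.items():
        sections = split_hvrs(protein)
        profile = []
        for name in HVR_RANGES:
            table = haplotype_ids[name]
            haplotype = sections[name]
            if haplotype not in table:
                table[haplotype] = len(table) + 1
            profile.append(table[haplotype])
        profiles[strain] = tuple(profile)
    return profiles


# Toy sequences: strain_b differs from strain_a only at position 100 (inside HVR2).
strain_a = "A" * 222
strain_b = "A" * 99 + "G" + "A" * 122
toy_wsp = wsp_profiles({"strain_a": strain_a, "strain_b": strain_b})
print(toy_wsp)
```

Because each HVR is numbered independently, two strains that exchanged a single HVR by recombination differ at one position of the WSP profile, which mirrors how the MLST allelic profile treats whole genes.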

Features of the Wolbachia databases. The following three separate databases have been generated for storing and managing all Wolbachia data: the profile, isolate, and WSP databases. The profile database contains the MLST allele sequences, allelic profiles, and STs. The isolate database stores information about the strains and host strain biological information (e.g., taxonomy, natural history, collection date, and locality). Each strain is designated by an identification number and a code consisting of the first letter of the host genus name, followed by the first three letters of the host species name, underscore, a letter indicating the supergroup to which it belongs, underscore, and any further specific information about the strain needed for discrimination. For example, Dsim_A_wRi indicates the Wolbachia strain from D. simulans belonging to supergroup A found in Riverside County (CA) currently referred to as wRi. The isolate database allows storage of accession numbers for sequences other than the sequences for the MLST genes that are available for a specific strain. The WSP database stores both nucleotide and amino acid sequences for wsp alleles and HVRs. All of the databases are reciprocally linked and can be accessed at the Wolbachia MLST website (http://pubmlst.org/wolbachia/).
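The strain code convention can be expressed as a small helper. The Python sketch below is illustrative only; the function name is hypothetical, and dropping the final underscore when no extra detail is supplied is an assumption, since the text does not say how that case is written.

```python
def isolate_code(host_genus, host_species, supergroup, detail=""):
    """Build a strain code: genus initial + first three species letters,
    underscore, supergroup letter, underscore, optional discriminating detail.
    """
    code = f"{host_genus[0].upper()}{host_species[:3].lower()}_{supergroup}"
    # Assumption: omit the trailing underscore when there is no extra detail.
    return f"{code}_{detail}" if detail else code


print(isolate_code("Drosophila", "simulans", "A", "wRi"))
```

For the example given in the text, the helper reproduces the code Dsim_A_wRi.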

TABLE 2. Wolbachia strains analyzed

Host species or subspecies (strain[s])  Order             Generic name    Supergroup^a  Host phenotype^b
Acromis sparsa                          Coleoptera        Beetle          A
Aedes albopictus                        Diptera           Mosquito        A             CI
Camponotus pennsylvanicus               Hymenoptera       Ant             A
Drosophila bifasciata                   Diptera           Fruit fly       A             MK
Drosophila innubila                     Diptera           Fruit fly       A             MK
Drosophila melanogaster                 Diptera           Fruit fly       A             CI
Drosophila neotestacea                  Diptera           Fruit fly       A
Drosophila orientacea                   Diptera           Fruit fly       A             CI
Drosophila recens                       Diptera           Fruit fly       A             CI
Drosophila simulans (wRi)               Diptera           Fruit fly       A             CI
Drosophila simulans (wAu)               Diptera           Fruit fly       A             CI
Ephestia kuehniella                     Lepidoptera       Flour moth      A
Incisitermes snyderi                    Isoptera          Termite         A
Muscidifurax uniraptor                  Hymenoptera       Wasp            A             PI
Nasonia giraulti (16.2)                 Hymenoptera       Wasp            A             CI
Nasonia longicornis (2.1)               Hymenoptera       Wasp            A             CI
Nasonia vitripennis (12.1)              Hymenoptera       Wasp            A             CI
Solenopsis invicta                      Hymenoptera       Ant             A
Acraea encedon                          Lepidoptera       Butterfly       B             MK
Acraea eponina                          Lepidoptera       Butterfly       B
Armadillidium vulgare                   Isopoda           Isopod          B             FI
Chelymorpha alternans                   Coleoptera        Beetle          B             CI
Culex pipiens pipiens                   Diptera           Mosquito        B             CI
Culex pipiens quinquefasciatus          Diptera           Mosquito        B             CI
Drosophila simulans (wMa)               Diptera           Fruit fly       B             CI
Drosophila simulans (wNo)               Diptera           Fruit fly       B             CI
Encarsia formosa                        Hymenoptera       Wasp            B             PI
Ephestia kuehniella                     Lepidoptera       Flour moth      B
Gryllus firmus                          Orthoptera        Cricket         B             CI
Nasonia vitripennis (4.9)               Hymenoptera       Wasp            B             CI
Ostrinia scapulalis                     Lepidoptera       Moth            B             MK
Protocalliphora sialia (B1, 00-189)     Diptera           Blow fly        B
Teleogryllus taiwanemma                 Orthoptera        Cricket         B             CI
Tribolium confusum                      Coleoptera        Beetle          B             CI
Trichogramma deion (TX)                 Hymenoptera       Wasp            B             PI
Brugia malayi                           Spirurida         Nematode        D             M
Folsomia candida                        Collembola        Springtail      E
Coptotermes acinaciformis               Isoptera          Termite         F
Cimex lectularius                       Heteroptera       Bed bug         F
Zootermopsis angusticollis              Isoptera          Termite         H
Cordylochernes scorpioides              Pseudoscorpiones  Pseudoscorpion                MK
Ctenocephalides felis                   Siphonaptera      Dog flea

^a All hosts were infected with a single strain.
^b MK, male killing; PI, parthenogenesis induction; CI, cytoplasmic incompatibility; FI, feminization induction; M, mutualism. The host phenotypes are based on previously published information.

Analyses of genetic variation and recombination. For each locus, the number of variable sites (VI), G+C content, level of nucleotide diversity per site (Pi), and Ka/Ks were calculated using DnaSP v. 4.0 (41). Significant differences in genetic divergence for loci for groups of strains having one identical allele were determined using χ² analysis.
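As an illustration of two of the per-locus statistics named above, the following minimal sketch counts variable sites and computes the mean pairwise nucleotide diversity (Pi) for a gap-free alignment. It is not the DnaSP implementation; the function names and the toy sequences are hypothetical, and gaps and indels are ignored for simplicity.

```python
from itertools import combinations


def variable_sites(alignment):
    """Count alignment columns that hold more than one distinct character."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def nucleotide_diversity(alignment):
    """Mean proportion of differing sites over all pairs of sequences (Pi)."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length for s1, s2 in pairs
    )
    return total / len(pairs)


# Hypothetical toy alleles of equal length (not real Wolbachia sequences).
toy = ["ATGCATGC", "ATGCATGA", "ATGTATGA"]
print(variable_sites(toy), round(nucleotide_diversity(toy), 4))
```

Dividing the number of variable sites by the alignment length gives the percentage of VI reported per locus.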

Recombination analyses were performed using MaxChi (28) with single and combined data sets from each gene alignment. The parameters were set as follows: sequences were considered linear, the highest acceptable P value cutoff was 0.01, a Bonferroni correction was applied, consensus daughter sequences were found, gaps were included, different window sizes of variable sites were tested (10, 20, and 30 VI), and 1,000 permutations were performed. All recombination events detected by the program were visually inspected in the alignment, and only clearly recombinant strains were listed.
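The idea behind a maximum chi-square scan can be sketched as follows. This is a simplified illustration, not the MaxChi program: it assumes that the variable sites for one pair of sequences have already been coded as 1 (pair agrees) or 0 (pair differs), it uses a hypothetical `half_window` parameter for the number of variable sites on each side of a candidate breakpoint, and it omits the Bonferroni correction.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def max_chi2(matches, half_window):
    """Slide a window over the sites; return (largest chi-square, breakpoint index)."""
    best, best_pos = 0.0, None
    for k in range(half_window, len(matches) - half_window + 1):
        left = matches[k - half_window:k]
        right = matches[k:k + half_window]
        stat = chi2_2x2(sum(left), len(left) - sum(left),
                        sum(right), len(right) - sum(right))
        if stat > best:
            best, best_pos = stat, k
    return best, best_pos


def permutation_p(matches, half_window, n_perm=1000, seed=0):
    """Permutation P value: shuffle site order and compare maximum chi-square values."""
    rng = random.Random(seed)
    observed, _ = max_chi2(matches, half_window)
    shuffled = list(matches)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_chi2(shuffled, half_window)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical pair: identical over the first 15 variable sites, different afterwards.
matches = [1] * 15 + [0] * 15
print(max_chi2(matches, 5))
```

A sharp peak in the statistic marks a position where the pattern of agreement between two sequences changes abruptly, which is the signature that was then checked by eye in the alignment.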

Tree reconstruction. Strict maximum likelihood (ML) inference was used to reconstruct phylogenies based on single and concatenated MLST genes (PAUP v. 4.01). The appropriate model of evolution was estimated with Modeltest v. 3.06, and the best likelihood score was evaluated with the Akaike information criterion (37). The models selected were GTR+I+G for gatB, wsp, and the concatenated MLST gene data set and GTR+G for coxA, hcpA, ftsZ, and fbpA. Bayesian maximum likelihood inference and maximum parsimony were used to reconstruct phylogenies based on the concatenated MLST data set and wsp data. Bayesian analyses were performed using MrBayes, v. 3.1.1 (18), with 10^6 generations and a sample frequency of 100. The first 5,000 trees were discarded as burn-in, and 50% majority rule was applied to the remaining set of trees. Maximum parsimony bootstrap analyses were performed with PAUP v. 4.01 (48) using 1,000 replicates and 100 random-addition replicates per bootstrap replicate, and a 50% majority rule bootstrap was applied.
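Model choice by the Akaike information criterion (AIC = 2k - 2 lnL, where k is the number of free parameters) can be illustrated with a short sketch. The log-likelihood values below are hypothetical placeholders, not results from this study; the parameter counts assume 8 free substitution parameters for GTR, plus one each for the gamma shape (G) and the proportion of invariable sites (I), with branch lengths excluded.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (lnL, free parameters) for two candidate models of one locus.
candidates = {
    "GTR+G": (-1330.0, 9),
    "GTR+I+G": (-1329.5, 10),
}

best = min(candidates, key=lambda model: aic(*candidates[model]))
print(best)
```

In this toy case the extra parameter of GTR+I+G improves the likelihood too little to pay for itself, so the simpler model is preferred.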

A neighbor-joining tree was reconstructed from the 37 MLST allelic profiles using the START2 program (23). An unweighted pair group method with arithmetic mean dendrogram was reconstructed from the matrix of pairwise similarities for the 36 WSP profiles (START2 program). START2 and a suite of publicly available bioinformatics software for MLST data analysis can be found at the MLST home page (http://pubmlst.org/).
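Profile-based trees start from a matrix of pairwise distances between allelic profiles. A minimal sketch of such a matrix is shown below, assuming the distance is simply the proportion of loci with different allele numbers; whether START2 uses exactly this measure is an assumption, and the strain names and profiles are hypothetical.

```python
def profile_distance(p, q):
    """Proportion of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q)) / len(p)


def distance_matrix(profiles):
    """Pairwise profile distances keyed by (name, name)."""
    names = list(profiles)
    return {
        (x, y): profile_distance(profiles[x], profiles[y])
        for x in names
        for y in names
    }


# Hypothetical profiles in the order gatB, coxA, hcpA, ftsZ, fbpA.
profiles = {
    "strain1": (1, 1, 1, 1, 1),
    "strain2": (1, 1, 1, 2, 2),
    "strain3": (3, 4, 5, 6, 7),
}
matrix = distance_matrix(profiles)
print(matrix[("strain1", "strain2")])
```

Because only identity versus nonidentity of alleles is counted, a single recombination event that replaces many nucleotides changes the distance by no more than one locus.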

Analysis of congruence between trees. We used the Shimodaira-Hasegawa test (42) to evaluate the significance of topological incongruence among the ML trees based on the five MLST genes and the concatenated data set. The −lnL scores and significance values for nucleotide substitution were optimized separately for all topologies considered. Differences in −lnL scores between the best ML topology and an alternative topology were calculated for each data set. The significance of these differences was measured using a bootstrap approach with RELL and full optimization for 1,000 replicates (PAUP v. 4.01).
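The RELL (resampling estimated log-likelihoods) step can be sketched as below. This shows only the resampling of per-site log-likelihoods for two fixed topologies, not the full Shimodaira-Hasegawa procedure with centering, and not the PAUP implementation; the function name and the input values are hypothetical.

```python
import random


def rell_deltas(site_lnl_best, site_lnl_alt, n_rep=1000, seed=0):
    """Bootstrap the lnL difference between two topologies by resampling sites."""
    rng = random.Random(seed)
    n_sites = len(site_lnl_best)
    deltas = []
    for _ in range(n_rep):
        # Draw site indices with replacement; likelihoods are not re-optimized.
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        deltas.append(sum(site_lnl_best[i] - site_lnl_alt[i] for i in sample))
    return deltas


# Hypothetical per-site log-likelihoods for a 10-site alignment.
deltas = rell_deltas([-2] * 10, [-3] * 10, n_rep=50)
print(len(deltas), deltas[0])
```

The spread of the resampled differences is what allows the observed difference in −lnL between the best and an alternative topology to be assigned a significance level.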

Nucleotide sequence accession numbers. All MLST and wsp sequences generated in this study have been deposited in the GenBank database under accession numbers DQ842268 to DQ842486.

RESULTS

Primer performance. The MLST primers worked for 34 of 35 supergroup A and B strains tested (Table 2, first 35 strains). In only one case, Trichogramma deion (TX), did the ftsZ primers not amplify. This strain sequence was generated using the MLST primers with tags (http://pubmlst.org/wolbachia/). The wsp primers amplified the gene from all the supergroup A and B strains tested except the strain in T. deion. All sequences generated using MLST primers perfectly matched the corresponding sequences independently generated using the initial 64-fold degenerate primers. MLST and wsp primers consistently generated amplification products in all the laboratories where they were tested, and bidirectional sequencing of PCR products gave unambiguous results.

The primers were sufficiently degenerate to also amplify divergent strains not belonging to supergroups A and B, such as Wolbachia strains from the nematode B. malayi (supergroup D) and the bed bug Cimex lectularius (supergroup F). For other divergent strains the primers worked well for only some of the loci. All loci except ftsZ were amplified from Wolbachia strains in the termite Zootermopsis angusticollis (supergroup H) and the flea Ctenocephalides felis. All loci except coxA were amplified from Wolbachia strains in the pseudoscorpion Cordylochernes scorpioides and the collembolan Folsomia candida (supergroup E). Finally, hcpA, fbpA, and ftsZ were amplified from Wolbachia strains in the termite Coptotermes acinaciformis (supergroup F). The wsp primers worked with all the strains mentioned above except the strain in F. candida. All species whose infection status was previously unknown gave results consistent with 16S rRNA gene results. Only for the embiopteran H. solieri were the results discordant; this insect tested positive for infection with a Wolbachia-specific 16S rRNA gene, but there was no amplification with MLST and wsp primers.

When distinct supergroup A and B strains coinfected the same individuals, the MLST supergroup A- and B-specific primers successfully amplified the target supergroup A and B type sequences from all the doubly and singly infected species examined. Group-specific MLST primers also amplified both supergroup A and B strain products from the three N. longicornis strains carrying multiple supergroup A and B strains. Analysis of the electropherograms showed unambiguous double peaks for the coxA, gatB, and wsp genes, revealing the multiple infections for each group. Based on the electropherogram pattern of polymorphic sites, specific restriction enzymes were used to digest the PCR products, and the results confirmed the presence of multiple sequence products (data not shown). This illustrates how the MLST gene set can be used to detect multiple infections in individual hosts.

Allelic variation at MLST loci. A total of 37 strains were characterized by MLST, including 35 strains belonging to supergroup A or B plus strains from B. malayi (supergroup D) and C. lectularius (supergroup F) (Table 2). In the 37 strains analyzed, there were 25 ftsZ alleles, 26 gatB alleles, 27 coxA alleles, 28 fbpA alleles, and 31 hcpA alleles (Table 3). Comparative analysis of the genetic divergence across loci indicated that ftsZ was the least polymorphic locus, with only 24.8% of the sites showing variation (VI), the lowest number of alleles (25), and the lowest level of nucleotide diversity per site (Pi, 0.065). The most polymorphic locus was hcpA, with 36.5% VI and the highest number of alleles (31). The greatest nucleotide diversity per site was found in fbpA (Pi, 0.092). All genes showed an average Ka/Ks of <<1, indicating that the genes were subject to stabilizing selection, conforming to the general requirements for MLST loci. Comparison of the genetic polymorphism between supergroups A and B at the five loci (Table 3) led to three main observations: (i) the percentage of VI for all five loci was always higher in supergroup B than in supergroup A; (ii) the number of alleles was generally higher for supergroup B (except for gatB); and (iii) the Ka/Ks was consistently <<1 for both supergroups.

Recombination at MLST loci. Four of the five genes chosen for the MLST scheme (gatB, coxA, hcpA, and fbpA) exhibited evidence of intragenic recombination (P < 0.001, as determined by MaxChi) (Table 3). The lack of evidence for recombination in ftsZ was consistent with previous findings (3). The hcpA, coxA, and fbpA alleles had recombinants within supergroup A or B. The gatB and fbpA alleles had recombinants between supergroups A and B. These results indicated that the two supergroups exchanged DNA frequently enough to be detected with this small data set, although most genes showed consistent association with one of the two supergroups (see below).


As a likely result of recombination both within and outside the gene boundaries (intra- and intergenic recombination, respectively), some strains were extremely similar at one locus and highly divergent at another. This disparity in divergence was visualized by plotting the genetic distances for loci for pairs of strains that were identical at one or two loci (Fig. 2). The levels of genetic divergence at nonidentical loci estimated for pairs of strains that exhibited identity at gatB were remarkably high in some cases (Fig. 2). For instance, for Ostrinia scapulalis and N. vitripennis (B) there was about 11% divergence at fbpA. Similar values were found for O. scapulalis and Acraea encedon, for Teleogryllus taiwanemma and N. vitripennis (B), for T. taiwanemma and A. encedon, for N. vitripennis (B) and Chelymorpha alternans, and for N. vitripennis (B) and A. encedon. These values are significantly greater than the average level of divergence for this locus (2.7%) based on the pairs of strains with identity at another locus (P < 0.05). Conflicting relationships among strains could also be visualized (Fig. 2); N. vitripennis (B) and C. alternans were very similar based on their coxA alleles (0.2% divergence) but were very divergent based on their fbpA alleles (11.4%). Conversely, N. vitripennis (B) and A. encedon were very similar based on their fbpA alleles (0.2%) but were very divergent based on their coxA alleles (8.2%). These inversions of relationships indicate that there was recombination among divergent strains.
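The comparison described above, in which strain pairs that are identical at one locus are examined for divergence at the remaining loci, can be sketched as follows. This is an illustrative outline only; the function names, strain labels, and sequences are hypothetical, and the uncorrected p-distance is assumed as the divergence measure.

```python
from itertools import combinations


def p_distance(s1, s2):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def divergence_given_identity(strains, anchor_locus):
    """For strain pairs identical at anchor_locus, report distances at other loci."""
    result = {}
    for (n1, loci1), (n2, loci2) in combinations(strains.items(), 2):
        if loci1[anchor_locus] == loci2[anchor_locus]:
            result[(n1, n2)] = {
                locus: p_distance(loci1[locus], loci2[locus])
                for locus in loci1
                if locus != anchor_locus
            }
    return result


# Hypothetical toy strains with two loci each.
strains = {
    "s1": {"gatB": "AAAA", "fbpA": "ACGTACGTAC"},
    "s2": {"gatB": "AAAA", "fbpA": "ACGTACGTAT"},
    "s3": {"gatB": "AAAT", "fbpA": "ACGTACGTAC"},
}
print(divergence_given_identity(strains, "gatB"))
```

A pair that is identical at the anchor locus yet far apart at another locus is a candidate for the kind of recombination-driven discordance plotted in Fig. 2.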

Phylogenetic inference of relationships. As expected for a data set affected by recombination, phylogenetic analyses for single and concatenated MLST loci revealed significant incongruence among loci (Table 4). Conflicts in tree topologies were inferred for all pairwise comparisons of the five genes. Indeed, there appeared to be no single phylogeny for a strain but multiple phylogenies for different fragments of the genome. As a result, the phylogenetic tree based on the concatenated sequences (Fig. 3) revealed strains that are closely related but also created phylogenetic artifacts that resulted from recombination between more divergent strains. For instance, members of the five clusters highlighted in the tree (Fig. 3) likely descended from a recent common ancestor, as reflected by strong grouping in the concatenated tree (both posterior probability and bootstrap values were >95%) and consistent phylogenetic clustering based on single-gene phylogenies (data not shown). However, the concatenated tree also revealed several clusters of strains whose relationships differ in single-locus phylogenies. For example, based on the concatenated data set, the Encarsia formosa and P. sialia (B1) strains formed a highly supported cluster, with both bootstrap and posterior probability values of ~100%. However, based on coxA phylogeny, the E. formosa strain clustered with the Acraea eponina strain (data not shown) and was separated from the P. sialia (B1) strain by a remarkable genetic distance (Fig. 2). Consistent with the effects of recombination, these two strains also did not cluster based on the wsp phylogeny (Fig. 4). Hence, the concatenated gene approach alone can be useful for identifying very closely related strains and for resolving major supergroups. However, it cannot be used to interpret more distant phylogenetic relationships within a supergroup, even when clades are highly supported, because of artifacts resulting from recombination among genes. Combined analyses using allelic profiles, single-gene phylogenies, and concatenated gene phylogenies are advisable for any inference of strain relationships.

FIG. 2. Pairwise divergence at MLST loci. Thirty-three strain pairs identical at one or two loci were used to show the variation in similarity across loci. The most striking pattern was found for strains identical at gatB (vertical bar); some of the strains showed remarkable divergence in the fbpA and coxA alleles, likely associated with recombination events. The arrows indicate examples of contradictory relationships inferred from the results for the five genes involving N. vitripennis (B), C. alternans, and A. encedon strains (values are circled; see text for explanation).

TABLE 3. Genetic features and occurrence of recombination at the five MLST loci

Locus  Data set^a  No. of alleles  % of VI^b  Nucleotide diversity per site  G+C content (%)  Ka/Ks  Recombination (MaxChi, P < 0.01)^c
gatB   ABDF        26              28.7       0.078                          37.2             0.093  Yes (G. firmus)^d
gatB   A           13              9.2        0.027                          37.8             0.151  No
gatB   B           11              11.1       0.022                          36.6             0.135  No
coxA   ABDF        27              28.4       0.083                          39.3             0.078  No
coxA   A           11              4.5        0.015                          39.9             0.099  No
coxA   B           14              17.4       0.053                          38.7             0.072  Yes (P. sialia [B1], T. confusum, E. formosa)
hcpA   ABDF        31              36.5       0.081                          37.0             0.140  No
hcpA   A           13              12.6       0.029                          37.7             0.090  Yes (A. albopictus, A. sparsa)
hcpA   B           16              15.6       0.028                          36.3             0.346  No
fbpA   ABDF        28              30.5       0.092                          38.6             0.084  Yes (A. vulgare)
fbpA   A           12              9.0        0.034                          38.7             0.085  No
fbpA   B           15              20.1       0.063                          38.7             0.089  Yes (C. alternans)
ftsZ   ABDF        25              24.8       0.065                          41.5             0.047  No
ftsZ   A           10              6.4        0.012                          42.6             0.028  No
ftsZ   B           14              10.3       0.020                          40.4             0.065  No

^a Data sets A and B included 18 supergroup A strains and 17 supergroup B strains, respectively (Table 2).
^b The percentage of variable sites was estimated by treating indels in each data set as variable sites.
^c Recombination inferences for data set ABDF refer to events that occurred between supergroups.
^d The organisms in parentheses are hosts of putative recombinant strains.

FIG. 3. Bayesian likelihood inference phylogeny based on the concatenated data set for the five MLST loci (37 strains, 2,079 bp). Groups of strains sharing at least three MLST alleles are highlighted. Arrows and asterisks indicate two examples of strain pairs whose predicted relationships are highly discordant with the wsp phylogeny (Fig. 4). Posterior probability (left) and parsimony bootstrap (right) values are indicated at major nodes if they were supported by both clustering algorithms.

TABLE 4. Results of a Shimodaira-Hasegawa test of alternative tree topologies for the Wolbachia MLST genes

Likelihood scores for the following data sets^a:

Topology      gatB         coxA         hcpA         fbpA         ftsZ         Concatenated
gatB          1,171.76     1,529.14***  1,723.57***  1,994.10***  1,257.23***  8,266.81***
coxA          1,260.60***  1,328.74     1,751.42***  1,990.72***  1,310.56***  8,161.79***
hcpA          1,273.48***  1,532.01***  1,584.14     1,869.74***  1,280.41***  8,135.25***
fbpA          1,331.80***  1,552.98***  1,733.80***  1,595.66     1,318.52***  8,090.05***
ftsZ          1,300.73***  1,562.61***  1,845.66***  2,069.53***  1,208.86     8,561.70***
Concatenated  1,222.25**   1,445.83***  1,649.30*    1,653.84     1,238.10     7,767.07

^a *, P < 0.05; **, P < 0.01; ***, P < 0.001. The lowest (best) likelihood score for each data set is the entry on the diagonal.

MLST inference. Based on the neighbor-joining tree constructed from the STs (Fig. 5), the majority (33/35) of the STs were unique to a strain, as were most of the alleles at a single locus. Nine of the 35 STs (Fig. 5) did not share any alleles with any other ST. This allelic diversity was probably due in part to the great genetic diversity of Wolbachia and in part to the sample of strains selected, which spanned a wide range of host species. No alleles were shared by STs of supergroup A and B strains. The following two pairs of strains among the pairs analyzed exhibited identity at all loci: the pair consisting of the Culex pipiens pipiens and C. pipiens quinquefasciatus strains (ST-9) and the pair consisting of the N. vitripennis (A) and Muscidifurax uniraptor strains (ST-23). Culex strains are identical at other chromosomal genes (13, 44), while the identity for the N. vitripennis (A) and M. uniraptor wasp strains was not reported previously.
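The relationship between allelic profiles and sequence types can be made concrete with a small sketch: every distinct combination of allele numbers receives its own ST, so strains with identical profiles share an ST. The numbering below is by order of first appearance and is purely illustrative; in practice ST numbers are assigned through the central database. The strain labels and profiles are hypothetical.

```python
def assign_sts(profiles):
    """Number each distinct allelic profile in order of first appearance."""
    st_of_profile = {}
    st_of_strain = {}
    for strain, profile in profiles.items():
        key = tuple(profile)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_strain[strain] = st_of_profile[key]
    return st_of_strain


# Hypothetical profiles: strains "a" and "b" are identical at all five loci.
profiles = {
    "a": (1, 2, 3, 4, 5),
    "b": (1, 2, 3, 4, 5),
    "c": (1, 2, 3, 4, 6),
}
sts = assign_sts(profiles)
print(sts)
```

Counting how many strains map to each ST shows directly which STs are unique to a strain and which are shared.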

By convention, strains that share at least three MLST alleles with a central ST are grouped in a complex (ST complex). Additional sequencing of 28 genes showed that members of an ST complex based on our five MLST genes were also consistently closely related based on the larger gene set, indicating the reliability of the MLST system for identifying closely related strains.
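The grouping rule stated above (at least three shared alleles with a central ST) can be sketched in a few lines. The central ST is taken as given here; identifying it, for example by BURST analysis, is outside this sketch. The ST labels and profiles are hypothetical.

```python
def shared_alleles(p, q):
    """Number of loci at which two allelic profiles carry the same allele."""
    return sum(a == b for a, b in zip(p, q))


def st_complex(central, profiles, min_shared=3):
    """STs sharing at least min_shared alleles with the central ST (itself included)."""
    reference = profiles[central]
    return sorted(
        st for st, profile in profiles.items()
        if shared_alleles(reference, profile) >= min_shared
    )


# Hypothetical five-locus profiles.
profiles = {
    "ST-1": (1, 1, 1, 1, 1),
    "ST-2": (1, 1, 1, 2, 2),
    "ST-3": (1, 1, 3, 3, 3),
    "ST-4": (9, 9, 9, 9, 9),
}
print(st_complex("ST-1", profiles))
```

With the default threshold, ST-3 is excluded because it shares only two alleles with the central ST.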

A total of three ST complexes were identified based on the criteria used (three shared alleles with a central ST) (Fig. 5). The ST-13 complex included supergroup A strains present in Drosophila species and the parasitoid wasp N. longicornis (A) (three shared alleles). The five strains differed by only five nucleotide polymorphisms in the concatenated alignment of the five MLST genes (2,079 bases). The central ST (ST-13) was identified as the Drosophila recens ST based on BURST analysis. The other two complexes included only two strains each and were temporarily designated ST-a and ST-b, until additional STs that allow identification of the central ST become available. ST-a included two supergroup B strains, the D. simulans wNo strain and the D. simulans wMa strain (three shared alleles, three polymorphic sites), and ST-b included two supergroup A strains, the Drosophila orientacea and Drosophila neotestacea strains (four shared alleles, one polymorphic site).

FIG. 4. Bayesian likelihood inference phylogeny based on wsp sequences of the 36 strains analyzed by MLST (without the T. deion strain). ST complexes and strains with identical STs identified by MLST are highlighted. Arrows and asterisks indicate two examples of strain pairs whose predicted relationships are highly discordant with the MLST phylogeny (Fig. 3). The supergroup placement of the A. sparsa and S. invicta strains based on MLST phylogeny (in parentheses) (Fig. 3) is not supported by the wsp-based phylogeny. Posterior probability (left) and parsimony bootstrap (right) values are shown at major nodes if they were supported by both clustering algorithms.

WSP characterization. The HVRs of the WSP protein were employed as an additional, optional marker to assess strain diversity. A new method for representing WSP sequence relationships is proposed here, which can be used in combination with standard distance-based methods. This method takes into account the extensive intragenic shuffling of the four HVRs (2) and can be used to readily detect allele shuffling based on unique identifiers for each HVR. Since the majority of amino acid changes among WSP sequences occur in the four HVRs, these motifs can be used as signatures for discrimination of the different WSP protein types (see Materials and Methods).

The sample of strains characterized by MLST was also typed by WSP, with the exception of the T. deion strain, for which amplification of wsp failed. Figure 6 shows the WSP profile, a combination of the four HVR amino acid haplotypes. For 36 strains, 33 unique WSP profiles were identified. Five groups of strains shared at least three HVR haplotypes (Fig. 6), and three of them had identical WSP profiles. Nine of the 33 WSP profiles (Fig. 6) did not share any haplotype with any other WSP profile. The overall numbers of unique haplotypes per HVR were similar, ranging from 24 to 27. However, the number of nonunique haplotypes within an HVR (i.e., haplotypes that occurred in multiple WSP profiles) varied for the four HVRs; there were only three haplotypes for HVR1 (haplotypes 1, 10, and 17), compared to seven haplotypes for HVR2, six haplotypes for HVR3, and nine haplotypes for HVR4. Motif shuffling was apparent in the data set. Indeed, several strains exhibited identity at individual HVRs alternatively with different sets of strains. For example, N. giraulti (A) shared haplotype 1 in HVR1 with wMel, N. longicornis (A), D. simulans wAu, D. recens, Drosophila innubila (A), D. orientacea, D. neotestacea, and A. albopictus. However, in HVR2 it had a unique haplotype sequence (haplotype 23), while in HVR3 it had the same haplotype sequence (haplotype 15) as D. simulans wRi; finally, in HVR4 it had the same haplotype sequence (haplotype 25) as Camponotus pennsylvanicus.
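The WSP profile described above, one haplotype identifier per HVR, can be sketched as follows. The numbering by order of first appearance, the function names, and the short peptide strings are hypothetical illustrations; they are not the haplotype numbers or sequences stored in the database.

```python
def wsp_profiles(hvr_sequences):
    """Map each strain's four HVR peptides to per-HVR haplotype numbers."""
    seen = [dict() for _ in range(4)]  # one lookup table per HVR
    profiles = {}
    for strain, hvrs in hvr_sequences.items():
        profile = []
        for i, peptide in enumerate(hvrs):
            # A peptide not seen before in this HVR receives the next free number.
            profile.append(seen[i].setdefault(peptide, len(seen[i]) + 1))
        profiles[strain] = tuple(profile)
    return profiles


def shared_hvrs(p, q):
    """HVR numbers (1 to 4) at which two WSP profiles carry the same haplotype."""
    return [i + 1 for i, (a, b) in enumerate(zip(p, q)) if a == b]


# Hypothetical peptide fragments for HVR1 to HVR4 of three strains.
hvr_sequences = {
    "x": ("DGK", "NYS", "TLE", "QAV"),
    "y": ("DGK", "SFT", "TLE", "KPG"),
    "z": ("EHR", "NYS", "MIW", "QAV"),
}
profiles = wsp_profiles(hvr_sequences)
print(profiles, shared_hvrs(profiles["x"], profiles["y"]))
```

In this toy example strain x shares HVR1 and HVR3 with y but HVR2 and HVR4 with z, which is the alternating pattern of identity that suggests motif shuffling.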

WSP inference and comparison to MLST. Analogous to the ST-based neighbor-joining tree (Fig. 5), the relationships among WSP profiles could be schematically represented in a tree (Fig. 6). We noted that relationships inferred by MLST and WSP profiles were not necessarily concordant but were complementary. In some cases strains that were closely related based on their MLST profiles also had similar WSP profiles. Indeed, the two C. pipiens strains had identical ST (ST-9) (Fig. 5) and WSP (Fig. 6) profiles. The M. uniraptor and N. vitripennis (A) strains, which had the same ST (ST-23), shared three of four HVR haplotypes and differed by only two amino acids in HVR3. Two of the three ST complexes (Fig. 5) were also very closely related based on their WSP profiles (Fig. 6). Specifically, D. simulans wMa and D. simulans wNo (ST-a complex) and the D. neotestacea and D. orientacea strains (ST-b complex) had identical WSP profiles. Strains belonging to the ST-13 complex exhibited identity in HVR1 and HVR2, and wMel and the N. longicornis (A) strain had three identical HVR haplotypes. However, there was a high level of divergence in HVR4 among the five strains (11 of 37 amino acids were polymorphic), suggesting the occurrence of recombination events in this region. The opposite pattern was also observed: strains with similar WSP sequences could be divergent at MLST genes. For example, no MLST alleles were shared by either the D. simulans wMa or D. simulans wNo strain (ST-15 and ST-16) and the Culex sp. strains (ST-9) (Fig. 5). The mean levels of amino acid divergence between alleles of the two groups were 2% for gatB, 4% for hcpA, 5% for fbpA, and 6% for coxA. There were no differences between ftsZ alleles. Based on their WSP profiles, however, the two groups exhibited high levels of similarity, with two of the four HVRs identical (Fig. 6). Divergence between the two groups in the two nonidentical HVRs accounted for only one amino acid change each. Such a high level of similarity at a quickly evolving gene is in contrast to the higher levels of divergence in some housekeeping genes; this evidence indicates that horizontal transfer and acquisition of wsp sequences occur through recombination.

FIG. 5. Neighbor-joining tree based on the MLST allelic profiles (right columns) of 37 Wolbachia strains. ST complexes and strains with identical STs are highlighted (STs shared by strains are indicated by arrowheads). Only closely related strains predicted by MLST were also predicted by the concatenated MLST phylogeny (Fig. 3), WSP profiles (Fig. 6), and wsp phylogeny (Fig. 4).

As found with the MLST profiles, no identical HVRs were observed for the supergroup A and B strains, although some HVRs were closely related. Nine of 18 supergroup A strains had the same haplotype sequence in HVR1 (haplotype 1). Two strains, from the tortoise beetle Acromis sparsa and the ant Solenopsis invicta, had four unique HVR haplotype sequences. Both strains belonged to supergroup A, as supported by phylogenetic analyses based on MLST data (both the posterior probability and bootstrap values were 100) (Fig. 3) and MLST profiles (both strains exhibited identity at least at one allele with another supergroup A strain) (Fig. 5). However, based on the wsp gene phylogeny (Fig. 4), the strain from A. sparsa was excluded from both supergroup A and supergroup B and formed a divergent lineage, while the strain from S. invicta clustered significantly in supergroup B (both the posterior probability and bootstrap value were >90). Based on MaxChi analyses, the A. sparsa wsp sequence was the product of recombination between the supergroup A wsp-type sequence of Ephestia kuehniella (A) and the supergroup B-type sequence of E. formosa (P < 0.001). In the case of the Wolbachia strain from S. invicta, lateral transfer of the whole wsp gene from a supergroup B-type donor strain and recombination may explain the contradictory clustering inferred from the MLST and WSP data sets (Fig. 3 and 4).

Overall, there was strong incongruence between MLST- and WSP-based relationships for both subgroup and supergroup strain assignment, indicating the need for a multilocus approach.

DISCUSSION

Wolbachia strains are among the most common obligate intracellular bacteria (56), and by virtue of the diverse effects that they have on their hosts (46, 54) and their potential health and agricultural implications (35, 61), they represent an extremely important group. The MLST system described here allows universal characterization of Wolbachia strains that induce a variety of phenotypes in a diverse range of hosts. It provides a standard method for reliable strain typing, a unified nomenclature for Wolbachia, and a centralized and curated database for data storage and management. Accurate strain typing is an extremely powerful tool for investigating the biology and evolution of Wolbachia. Because a standard set of genes is indexed for typing Wolbachia strains and host information is stored in common fields in the isolate database, it is possible to relate any newly submitted strain to an expanding data set. With the isolate database, host and strain information are partitioned into multiple, searchable fields. Thus, each field can be used to sort the data, and the frequency of a single field or a combination of fields can be calculated and graphically displayed (for instance, host geographical location versus specific allele frequencies). As the database grows, informative patterns can emerge and key questions can be addressed, such as (i) what is the global pattern of diversity of Wolbachia strains; (ii) are there particular taxa that have an unusually high number of infected species; (iii) how do Wolbachia strains move between host taxa, and what genetic changes occur when they shift hosts; (iv) how do the strains move locally and globally within arthropod communities (e.g., are there strains or alleles that are found in particular geographical regions, while other strains or alleles have a global distribution); and (v) do particular closely related strains show shifts of phenotypes.

FIG. 6. Unweighted pair group method with arithmetic mean dendrogram based on the WSP profiles (right columns) of 36 Wolbachia strains, corresponding to the same sample typed by MLST (without the T. deion strain). Groups of strains sharing at least three HVR haplotypes are highlighted. In some cases groups of strains identified by MLST (Fig. 5, highlighted strains) were also closely related based on their WSP profiles. In strains belonging to the ST-13 complex there was a clear recombination signature in HVR4, which resulted in remarkable genetic divergence in this region among the five strains.

MLST is also an important source of sequence data for extensive comparative genetics, providing a platform for investigating molecular evolutionary processes in these intracellular bacteria.

Results reported in this study reveal that (i) the MLST primers work reliably for the two major supergroups of Wolbachia found in arthropods (supergroups A and B), as well as with other groups (however, additional flanking primers are provided at the MLST website); (ii) there is considerable allelic diversity in genes and among Wolbachia strains, with recombination occurring both within and between the MLST genes; (iii) MLST can identify and discriminate closely related strains using chromosomal genes; (iv) closely related strains detected by MLST (strains that share a minimum of three alleles in their MLST profiles) can show remarkable divergence at WSP as a result of recombination at the wsp gene; (v) strains with very similar WSP can have highly divergent MLST profiles; and (vi) strains closely related based on MLST can produce very different host effects.

Discrimination and grouping of closely related strains. One of the challenges in the study of Wolbachia is the difficulty of inferring similarities among strains. Because of horizontal genetic exchange, Wolbachia genes do not evolve in a bifurcating, tree-like fashion. Thus, groups of truly derived strains cannot be recovered with traditional cladistic methods. Previous studies have inferred phylogenetic relatedness among strains based on similarity or identity at one or two genes, for instance, ftsZ (59) or wsp (43, 62). The data presented here demonstrate that because of recombination, identity at single genes is not sufficient for reliable strain typing and that the high level of support for a clade in a concatenated phylogeny could be biased by the unequal divergence of loci, masking single-locus conflicts. Because MLST does not describe strain relationships in terms of the level of homology among sequences (i.e., phylogenetically) but describes them in terms of identity versus nonidentity among alleles (i.e., similarity), it avoids false inferences due to recombination. The ST tree in Fig. 5 provides a schematic representation of relationships among allelic profiles and can be combined with phylogenetic and other clustering algorithm analyses based on the sequences in individual or concatenated MLST data sets.

Based on the data set analyzed, only STs sharing three of five identical MLST loci (designated an ST complex) show consistent similarity at all the remaining loci. Strains identical at three alleles are likely to have a recent common ancestor and thus represent a phylogenetic cluster of recently diverged strains. Preliminary analyses of an additional 28 chromosomal genes for the members of three complexes (unpublished data) confirmed their consistent relatedness, indicating that MLST profiles can be used to predict similarity at other loci. Furthermore, no recombination was found between members of the same ST complex at MLST genes. In contrast, recombination at MLST genes was detected between strains identical at one or two MLST alleles. This indicates that grouping strains based on at least three shared alleles at MLST loci can form the basis of a reliable system for assessing actual similarity at chromosomal genes among strains. However, as more allelic profiles become available, it should be possible to test this observation more extensively.

Wolbachia researchers have used the gene wsp extensively to determine relationships among strains. However, using our data set, we found several instances in which strains closely related based on MLST data are very divergent based on WSP data and strains that have similar WSP differ substantially based on MLST data. For instance, members of the ST-13 complex showed extensive divergence in HVR4 of WSP, as a result of recombination. However, when phylogenetic methods were applied to this recombinant data set, a recombination signature in HVR4 was masked by a high level of similarity in the first three HVRs and the five strains apparently formed a strong cluster (posterior probability, 100%; bootstrap support, 100%) (Fig. 4). As an example of the opposite pattern, highly similar WSP sequences were shared by strains found in N. vitripennis (B) and E. kuehniella (B) (1% amino acid divergence), whose FbpA and HcpA protein sequences differ substantially (8 and 5% divergence, respectively). Such findings further support the conclusion that wsp alone is an unreliable predictor of either strain similarity or divergence.
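Divergence values of the kind quoted above can be illustrated with a minimal sketch that computes the percentage of differing positions between two aligned sequences. This passage does not state how the percentages were calculated, so the uncorrected proportion of differences, the skipping of gapped positions, and the function name are all assumptions made for illustration.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Uncorrected percent divergence between two aligned sequences.

    Assumption: positions where either sequence has a gap are ignored,
    and no substitution model or correction is applied.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for residue_a, residue_b in zip(seq_a.upper(), seq_b.upper()):
        if residue_a == gap or residue_b == gap:
            continue  # skip alignment gaps
        compared += 1
        if residue_a != residue_b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared
```

Applying such a function separately to WSP and to each MLST protein for the same pair of strains would expose the kind of discordance described above, in which one locus is nearly identical while others differ by several percent.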

Overall, the results show that the MLST system described here provides an excellent method for typing Wolbachia strains from diverse hosts and also for discriminating among strains in the same host species (e.g., D. simulans strains). It should be noted that identity in MLST genes does not mean that two strains are identical throughout their genomes. Even within a host species, where an ancestral strain can be derived from a relatively recent ancestor, variation within the genome can accumulate, particularly in quickly evolving or highly recombinant genes, such as mobile elements and/or ankyrin repeat domains. This has been shown for strains found in C. pipiens and Drosophila species (19, 44). Discrimination of substrains within the same ST requires the use of additional specific polymorphic markers (39, 44). The MLST database can accommodate this additional information through links that can be accessed through stored accession numbers.

Detection and typing of multiple coinfecting strains. Double Wolbachia infections are common in insects (24, 55), and only in some circumstances can they be separated experimentally (34). MLST primers for typing supergroup A and B Wolbachia strains from individuals that are doubly infected have been designed to address this issue. However, PCR-based systems like MLST and WSP typing are constrained for examination of double infections as the different alleles cannot always be reliably assigned to specific strains coinfecting the same individual. A preliminary analysis using a small sample of multiply and singly infected species gave positive results. Primer performance, however, should be verified on a larger scale. Use of the group-specific primers for MLST requires workers to assume that there have not been reciprocal gene exchanges between the strains. Therefore, MLST for typing hosts that are doubly infected with supergroup A and B Wolbachia strains should not be used unless reliable supergroup A- and B-specific alleles can be identified for each of the MLST genes. Allelic profiles for doubly infected individuals are given the status “candidate” STs in the MLST database until the strains can be experimentally separated (e.g., by partial antibiotic curing or segregation by other means).

In some cases host species are coinfected with very closely related strains that belong to the same supergroup. These strains can occur in double or multiple infections and are coded A1A2, B1B2, etc. Because of the high levels of genetic similarity among the strains, these multiple infections are normally hard to detect, and when they are detected, the specific amplification and correct assignment of sequences to one of the strains are often difficult. Such multiple infections may be more common than has been assumed, and providing an estimate of their global frequency in host species represents an important first step. Based on a preliminary screening of a small sample of N. longicornis strains doubly infected with supergroup A and/or B strains using MLST group-specific primers, the multiple peaks in the electropherograms of some of the MLST gene products can be read as a signature of the potential presence of multiple strain variants. However, additional analyses are necessary to verify the actual existence of multiple strains.

Relating MLST to Wolbachia biology. Even with the limited number of strains analyzed here, several patterns regarding the transmission of Wolbachia are apparent. Based on both MLST and WSP data, different host organisms were found to harbor very similar or identical Wolbachia strains; these organisms include the wasp species N. vitripennis (A) and M. uniraptor; the mosquito subspecies C. pipiens pipiens and C. pipiens quinquefasciatus; the fly species D. melanogaster, D. simulans wAu, D. recens, and D. innubila, as well as the wasp species N. longicornis; and the fly species D. neotestacea and D. orientacea. The strain similarities may have been due to recent horizontal transmission of Wolbachia across host species or, in the case of the closely related species D. neotestacea and D. orientacea, to codivergence and coevolution of Wolbachia strains and their hosts. Because genetically similar strains are often found in similar insect hosts, ecological interactions among hosts may have mediated such horizontal transfers. In particular, it appears that horizontal transmission of Wolbachia has occurred quite recently and frequently within the genus Drosophila.

Many of the reproductive phenotypes of the strains analyzed in this study remain unknown, and the possibility that there is a correlation between STs and/or WSP profiles and the strain phenotype cannot be eliminated yet. Based on the data set analyzed, genetically similar strains found in different host species were often associated with distinct host reproductive manipulations (Table 2). For example, strains found in N. vitripennis (A) and M. uniraptor, which have identical ST and very similar WSP profiles, are responsible for inducing cytoplasmic incompatibility (5) and parthenogenesis (45), respectively, in their hosts. Similarly, based on both MLST and WSP data the D. simulans wAu and wasp N. longicornis (A) strains are very closely related but are responsible for different patterns of cytoplasmic incompatibility (5, 29, 58). Based on these observations, it is now relevant to investigate the genetic changes in these closely related strains that may be involved in the phenotypic shift. In addition, as future studies clarify the specific role of wsp in host-parasite interactions, information on amino acid motifs in HVRs may prove to be useful. As more allelic profiles become available for strains that are found in hosts belonging to different taxa, occur in different geographic regions, and are subjected to different phenotypes, the potential of the MLST database for revealing ecological, evolutionary, and functional patterns should increase.

ACKNOWLEDGMENTS

We thank the following individuals for providing arthropods for this study: D. Bouchon, K. Christianson, K. Dyer, M. Hoy, S. Dobson, R. Harrison, G. Hurst, J. Jaenike, F. Jiggins, N. Lo, J. Marshall, J. Rasgon, T. Sasaki, D. Shoemaker, R. Stouthamer, M. Wade, D. Windsor, D. Zeh, J. Zeh, N. Ayoub, and M. Collin. We thank R. Nene, C. Westbrook, S. Sinkins, and R. Edwards for assistance with this project.

This work was supported by funds from U.S. National Science Foundation grant EF-0328363 to J.H.W. Development of the database software was funded by the Wellcome Trust and the European Union.

REFERENCES

1. Baldo, L., J. D. Bartos, J. H. Werren, C. Bazzocchi, M. Casiraghi, and S. Panelli. 2002. Different rates of nucleotide substitutions in Wolbachia endosymbionts of arthropods and nematodes: arms race or host shifts? Parasitologia 44:179–187.

2. Baldo, L., N. Lo, and J. H. Werren. 2005. Mosaic nature of wsp (Wolbachia surface protein). J. Bacteriol. 187:5406–5418.

3. Baldo, L., S. Bordenstein, J. J. Wernegreen, and J. H. Werren. 2006. Widespread recombination throughout Wolbachia genomes. Mol. Biol. Evol. 23:437–449.

4. Bandi, C., C. G. Anderson, C. Genchi, and M. L. Blaxter. 1998. Phylogeny of Wolbachia in filarial nematodes. Proc. R. Soc. Lond. B 265:2407–2413.

5. Bordenstein, S. R., and J. H. Werren. 1998. Effects of A and B Wolbachia and host genotype on interspecies cytoplasmic incompatibility in Nasonia. Genetics 148:1833–1844.

6. Casiraghi, M., S. R. Bordenstein, L. Baldo, N. Lo, T. Beninati, J. J. Wernegreen, J. H. Werren, and C. Bandi. 2005. Phylogeny of Wolbachia pipientis based on gltA, groEL and ftsZ gene sequences: clustering of arthropod and nematode symbionts in the F supergroup, and evidence for further diversity in the Wolbachia tree. Microbiology 151:4015–4022.

7. Cordaux, R., A. Michel-Salzat, and D. Bouchon. 2001. Wolbachia infection in crustaceans: novel hosts and potential routes for horizontal transmission. J. Evol. Biol. 14:237–243.

8. Dunning Hotopp, J. C., M. Lin, R. Madupu, J. Crabtree, S. V. Angiuoli, J. Eisen, R. Seshadri, Q. Ren, M. Wu, T. R. Utterback, S. Smith, M. Lewis, H. Khouri, C. Zhang, H. Niu, Q. Lin, N. Ohashi, N. Zhi, W. Nelson, L. M. Brinkac, R. J. Dodson, M. J. Rosovitz, J. Sundaram, S. C. Daugherty, T. Davidsen, A. S. Durkin, M. Gwinn, D. H. Haft, J. D. Selengut, S. A. Sullivan, N. Zafar, L. Zhou, F. Benahmed, H. Forberger, R. Halpin, S. Mulligan, J. Robinson, O. White, Y. Rikihisa, and H. Tettelin. 2006. Comparative genomics of emerging human ehrlichiosis agents. PLoS Genet. 2:e21.

9. Foster, J., M. Ganatra, I. Kamal, J. Ware, K. Makarova, N. Ivanova, A. Bhattacharyya, V. Kapatral, S. Kumar, J. Posfai, T. Vincze, J. Ingram, L. Moran, A. Lapidus, M. Omelchenko, N. Kyrpides, E. Ghedin, S. Wang, E. Goltsman, V. Joukov, O. Ostrovskaya, K. Tsukerman, M. Mazur, D. Comb, E. Koonin, and B. Slatko. 2005. The Wolbachia genome of Brugia malayi: endosymbiont evolution within a human pathogenic nematode. PLoS Biol. 3:e121.

10. Gilbert, J., C. K. Nfon, B. L. Makepeace, L. M. Njongmeta, I. M. Hastings, K. M. Pfarr, A. Renz, V. N. Tanya, and A. J. Trees. 2005. Antibiotic chemotherapy of onchocerciasis: in a bovine model, killing of adult parasites requires a sustained depletion of endosymbiotic bacteria (Wolbachia species). J. Infect. Dis. 192:1483–1493.

11. Goodacre, S. L., O. Y. Martin, C. F. Thomas, and G. M. Hewitt. 2006. Wolbachia and other endosymbiont infections in spiders. Mol. Ecol. 15:517–527.

12. Gotoh, T., H. Noda, and X. Y. Hong. 2003. Wolbachia distribution and cytoplasmic incompatibility based on a survey of 42 spider mite species (Acari: Tetranychidae) in Japan. Heredity 91:208–216.

13. Guillemaud, T., N. Pasteur, and F. Rousset. 1997. Contrasting levels of variability between cytoplasmic genomes and incompatibility types in the mosquito Culex pipiens. Proc. R. Soc. Lond. B 264:245–251.

14. Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41:95–98.

15. Hertig, M., and S. B. Wolbach. 1924. Studies on rickettsia-like microorganisms in insects. J. Med. Res. 44:329–374.

16. Hoffmann, A. A., and M. Turelli. 1997. Cytoplasmic incompatibility in insects, pp. 42–80. In S. L. O’Neill, J. H. Werren, and A. A. Hoffmann (ed.), Influential passengers. Oxford University Press, New York, N.Y.

17. Holmes, E. C., R. Urwin, and M. C. Maiden. 1999. The influence of recombination on the population structure and evolution of the human pathogen Neisseria meningitidis. Mol. Biol. Evol. 16:741–749.

18. Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755.

19. Iturbe-Ormaetxe, I., G. R. Burke, M. Riegler, and S. L. O’Neill. 2005. Distribution, expression, and motif variability of ankyrin domain genes in Wolbachia pipientis. J. Bacteriol. 187:5136–5145.

20. Jeyaprakash, A., and M. A. Hoy. 2000. Long PCR improves Wolbachia DNA amplification: wsp sequences found in 76% of sixty-three arthropod species. Insect Mol. Biol. 9:393–405.

21. Jiggins, F. M., G. D. Hurst, and Z. Yang. 2002. Host-symbiont conflicts: positive selection on an outer membrane protein of parasitic but not mutualistic Rickettsiaceae. Mol. Biol. Evol. 19:1341–1349.

22. Jolley, K. A., M. S. Chan, and M. C. Maiden. 2004. mlstdbNet – distributed multi-locus sequence typing (MLST) databases. BMC Bioinformatics 5:86.

23. Jolley, K. A., E. J. Feil, M. S. Chan, and M. C. Maiden. 2001. Sequence type analysis and recombinational tests (START). Bioinformatics 17:1230–1231.

24. Kikuchi, Y., and T. Fukatsu. 2003. Diversity of Wolbachia endosymbionts in heteropteran bugs. Appl. Environ. Microbiol. 69:6082–6090.

25. Kyei-Poku, G. K., D. D. Colwell, P. Coghlin, B. Benkel, and K. D. Floate. 2005. On the ubiquity and phylogeny of Wolbachia in lice. Mol. Ecol. 14:285–294.

26. Lo, N., M. Casiraghi, E. Salati, C. Bazzocchi, and C. Bandi. 2002. How many Wolbachia supergroups exist? Mol. Biol. Evol. 19:341–346.

27. Maiden, M. C., J. A. Bygraves, E. Feil, G. Morelli, J. E. Russell, R. Urwin, Q. Zhang, J. Zhou, K. Zurth, D. A. Caugant, I. M. Feavers, M. Achtman, and B. G. Spratt. 1998. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95:3140–3145.

28. Martin, D., C. Williamson, and D. Posada. 2005. RDP2: recombination detection and analysis from sequence alignments. Bioinformatics 21:260–262.

29. Mercot, H., and S. Charlat. 2004. Wolbachia infections in Drosophila melanogaster and D. simulans: polymorphism and levels of cytoplasmic incompatibility. Genetica 120:51–59.

30. Nirgianaki, A., G. K. Banks, D. R. Frohlich, Z. Veneti, H. R. Braig, T. A. Miller, I. D. Bedford, P. G. Markham, C. Savakis, and K. Bourtzis. 2003. Wolbachia infections of the whitefly Bemisia tabaci. Curr. Microbiol. 47:93–101.

31. O’Neill, S. L., R. Giordano, A. M. Colbert, T. L. Karr, and H. M. Robertson. 1992. 16S rRNA phylogenetic analysis of the bacterial endosymbionts associated with cytoplasmic incompatibility in insects. Proc. Natl. Acad. Sci. USA 89:2699–2702.

32. Paraskevopoulos, C., S. R. Bordenstein, J. J. Wernegreen, J. H. Werren, and K. Bourtzis. Towards a Wolbachia multilocus sequence typing system: discrimination of Wolbachia strains present in Drosophila species. Curr. Microbiol., in press.

33. Perez-Losada, M., R. P. Viscidi, J. C. Demma, J. Zenilman, and K. A. Crandall. 2005. Population genetics of Neisseria gonorrhoeae in a high-prevalence community using a hypervariable outer membrane porB and 13 slowly evolving housekeeping genes. Mol. Biol. Evol. 22:1887–1902.

34. Perrot-Minnot, M. J., L. R. Guo, and J. H. Werren. 1996. Single and double infections with Wolbachia in the parasitic wasp Nasonia vitripennis: effects on compatibility. Genetics 143:961–972.

35. Pfarr, K. M., and A. M. Hoerauf. 2005. The annotated genome of Wolbachia from the filarial nematode Brugia malayi: what it means for progress in antifilarial medicine. PLoS Med. 2:e110.

36. Pintureau, B., S. Chaudier, F. Lassabliere, H. Charles, and S. Grenier. 2000. Addition of wsp sequences to the Wolbachia phylogenetic tree and stability of the classification. J. Mol. Evol. 51:374–377.

37. Posada, D., and T. R. Buckley. 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53:793–808.

38. Rao, R. 2005. Endosymbiotic Wolbachia of parasitic filarial nematodes as drug targets. Indian J. Med. Res. 122:199–204.

39. Riegler, M., M. Sidhu, W. J. Miller, and S. L. O’Neill. 2005. Evidence for a global Wolbachia replacement in Drosophila melanogaster. Curr. Biol. 15:1428–1433.

40. Rowley, S. M., R. J. Raven, and E. A. McGraw. 2004. Wolbachia pipientis in Australian spiders. Curr. Microbiol. 49:208–214.

41. Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer, and R. Rozas. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497.

42. Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.

43. Shoemaker, D. D., C. A. Machado, D. Molbo, J. H. Werren, D. M. Windsor, and E. A. Herre. 2002. The distribution of Wolbachia in fig wasps: correlations with host phylogeny, ecology and population structure. Proc. R. Soc. Lond. B 269:2257–2267.

44. Sinkins, S. P., T. Walker, A. R. Lynd, A. R. Steven, B. L. Makepeace, H. C. Godfray, and J. Parkhill. 2005. Wolbachia variability and host effects on crossing type in Culex mosquitoes. Nature 436:257–260.

45. Stouthamer, R., J. A. J. Breeuwer, R. F. Luck, and J. H. Werren. 1993. Molecular identification of microorganisms associated with parthenogenesis. Nature 361:66–68.

46. Stouthamer, R., J. A. J. Breeuwer, and G. D. Hurst. 1999. Wolbachia pipientis: microbial manipulator of arthropod reproduction. Annu. Rev. Microbiol. 53:71–102.

47. Sutton, G., O. White, M. Adams, and A. Kerlavage. 1995. TIGR Assembler: a new tool for assembling large shotgun sequencing projects. Genome Sci. Technol. 1:9–19.

48. Swofford, D. L. 2000. PAUP*: phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland, Mass.

49. Tatusov, R. L., E. V. Koonin, and D. J. Lipman. 1997. A genomic perspective on protein families. Science 278:631–637.

50. Tatusov, R. L., N. D. Fedorova, J. D. Jackson, A. R. Jacobs, B. Kiryutin, E. V. Koonin, D. M. Krylov, R. Mazumder, S. L. Mekhedov, A. N. Nikolskaya, B. S. Rao, S. Smirnov, A. V. Sverdlov, S. Vasudevan, Y. I. Wolf, J. J. Yin, and D. A. Natale. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41.

51. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX Windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 24:4876–4882.

52. Urwin, R., and M. C. Maiden. 2003. Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 11:479–487.

53. van Meer, M. M. M., J. Witteveldt, and R. Stouthamer. 1999. Phylogeny of the arthropod endosymbiont Wolbachia based on wsp gene. Insect Mol. Biol. 8:399–408.

54. Werren, J. H. 1997. Biology of Wolbachia. Annu. Rev. Entomol. 42:587–609.

55. Werren, J. H., and D. M. Windsor. 2000. Wolbachia infection frequencies in insects: evidence of a global equilibrium? Proc. R. Soc. Lond. B 267:1277–1285.

56. Werren, J. H., D. M. Windsor, and L. R. Guo. 1995. Distribution of Wolbachia among neotropical arthropods. Proc. R. Soc. Lond. B 262:197–204.

57. Werren, J. H., and J. D. Bartos. 2001. Recombination in Wolbachia. Curr. Biol. 11:431–435.

58. Werren, J. H., and J. Jaenike. 1995. Wolbachia and cytoplasmic incompatibility in mycophagous Drosophila and their relatives. Heredity 75:320–326.

59. Werren, J. H., W. Zhang, and L. R. Guo. 1995. Evolution and phylogeny of Wolbachia: reproductive parasites of arthropods. Proc. R. Soc. Lond. B 261:55–63.

60. Wu, M., L. V. Sun, J. Vamathevan, M. Riegler, R. Deboy, J. C. Brownlie, E. A. McGraw, W. Martin, C. Esser, N. Ahmadinejad, C. Wiegand, R. Madupu, M. J. Beanan, L. M. Brinkac, S. C. Daugherty, A. S. Durkin, J. F. Kolonay, W. C. Nelson, Y. Mohamoud, P. Lee, K. Berry, M. B. Young, T. Utterback, J. Weidman, W. C. Nierman, I. T. Paulsen, K. E. Nelson, H. Tettelin, S. L. O’Neill, and J. A. Eisen. 2004. Phylogenomics of the reproductive parasite Wolbachia pipientis wMel: a streamlined genome overrun by mobile genetic elements. PLoS Biol. 2:E69.

61. Zabalou, S., M. Riegler, M. Theodorakopoulou, C. Stauffer, C. Savakis, and K. Bourtzis. 2004. Wolbachia-induced cytoplasmic incompatibility as a means for insect pest population control. Proc. Natl. Acad. Sci. USA 101:15042–15045.

62. Zhou, W., F. Rousset, and S. L. O’Neill. 1998. Phylogeny and PCR-based classification of Wolbachia strains using wsp gene sequences. Proc. R. Soc. Lond. B 265:509–515.
