
The draft genome sequence of Arsenophonus nasoniae, son-killer bacterium of Nasonia vitripennis, reveals genes associated with virulence and symbiosis


T. E. Wilkes*, A. C. Darby*, J.-H. Choi†, J. K. Colbourne†, J. H. Werren‡ and G. D. D. Hurst*

*School of Biological Sciences, University of Liverpool, Liverpool, UK; †The Centre for Genomics and Bioinformatics, Indiana University, Bloomington, IN, USA; and ‡Department of Biology, University of Rochester, Rochester, NY, USA

Abstract

Four percent of female Nasonia vitripennis carry the son-killer bacterium Arsenophonus nasoniae, a microbe with notably different biology from other inherited parasites and symbionts. In this paper, we examine a draft genome sequence of the bacterium for open reading frames (ORFs), structures and pathways involved in interactions with its insect host. The genome data suggest that A. nasoniae carries multiple type III secretion systems, and an array of toxin and virulence genes found in Photorhabdus, Yersinia and other gammaproteobacteria. Of particular note are ORFs similar to those known to affect host innate immune functioning in other bacteria, and four ORFs related to pro-apoptotic exotoxins. The genome sequences for both A. nasoniae and its Nasonia host are useful tools for examining functional genomic interactions of microbial survival in hostile immune environments, and mechanisms of passage through gut epithelia, in a whole organism context.

Keywords: symbiont, Wolbachia, male-killing, Photorhabdus, Arsenophonus, genome, toxins.

Introduction

Insects engage in many different interactions with enteric bacteria (bacteria from the gamma division of proteobacteria). The spectrum varies from strongly parasitic (causing death of the host) through more commensal interactions, to ones where the presence of the bacterium is beneficial or even essential (Dale & Moran, 2006). Pathogenic interactions range from passive to extremely aggressive infections. For example, Erwinia carotovora and Pseudomonas entomophila are gut-invasive pathogens, causing a loss of gut epithelial integrity that allows their dissemination throughout the insect (Vallet-Gely et al., 2008). Photorhabdus luminescens represents a very aggressive infection. A ‘partner’ of nematodes, it switches to a pathogenic lifestyle on regurgitation inside a lepidopteran haemocoel (ffrench-Constant et al., 2003). This bacterium carries a diverse arsenal used to subjugate and kill its secondary host, comprising both toxins and systems for disabling host immune responses (Waterfield et al., 2004).

In contrast to these pathogens, other members of the gammaproteobacteria are symbionts of insects. These are persistent infections, and are commonly divided into two categories. Primary symbionts (such as Buchnera, Wigglesworthia, Blochmannia), required for host function by virtue of various anabolic roles, are typically integrated into both host anatomy (through a bacteriome) and physiology and commonly have a long evolutionary history with their host. Secondary symbionts, such as Hamiltonella and Sodalis, are dispensable, generally less integrated into anatomy, and sometimes exist within the host haemolymph (Moran et al., 2005). Here, they must either avoid eliciting or be refractory to any clearing immune response of the host. Secondary symbionts often provide an ecologically-contingent benefit, such as natural enemy resistance (Haine, 2008).

The gammaproteobacterium Arsenophonus nasoniae infects the wasp Nasonia vitripennis. First described phenotypically from its ‘son-killer’ behaviour, it represented a maternally inherited trait present in around 4% of female N. vitripennis wasps that were typified by the production of female biased secondary sex ratios (associated with the death of 80% of sons) (Skinner, 1985). Whilst male-killing bacteria have been found to be relatively common in insects (Hurst et al., 2003), the N. vitripennis son-killer has many unusual features. First, it can be infectiously transmitted by sharing a pupal host. Second, it was isolated into cell-free culture (Werren et al., 1986), and the causal agent therefore officially named and characterized (Gherna et al., 1991). Third, the bacterium has an unusual relationship with its host, in that it combines aspects of pathogenic and symbiotic lifestyles. It invades through the gut wall and establishes a persistent, ubiquitous infection (Huger et al., 1985) where it survives in a hostile immune environment and is maternally inherited. However, rather than passing through the cytoplasm of the egg as other male-killers do, A. nasoniae is injected into the fly pupal host at oviposition, and is then ingested by early instar wasp larvae where it reinvades through the wasp larval gut (Huger et al., 1985; Werren et al., 1986). Therefore, it also routinely and alternately infects two kinds of hosts: parasitic wasps and fly pupae. Typically, the fly puparium has been injected with wasp venoms that alter its physiology (Rivers & Denlinger, 1994), but the host remains alive for several days, a time frame when the bacterium appears to replicate within the fly and is transmitted to feeding wasp larvae. In many ways, the only aspect of its symbiosis truly shared with other ‘male-killers’ is the phenotype: it kills male hosts. Male killing is achieved by blocking the formation of maternally-derived centrosomes in the unfertilized haploid (male) embryos of this haplodiploid insect (Ferree et al., 2008).

Correspondence: G. Hurst, School of Biological Sciences, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK. Tel.: +44 15 1795 4520; e-mail: [email protected]

Insect Molecular Biology (2010), 19 (Suppl. 1), 59–73 doi: 10.1111/j.1365-2583.2009.00963.x

© 2010 The Authors. Journal compilation © 2010 The Royal Entomological Society

The combination of pathogenic and symbiotic features makes A. nasoniae an interesting organism. It can serve both as an insect model for gut invasion and as a model for the biology of persistent infection. In this paper, we report results from our analysis of a draft genome sequence of this organism (Darby et al., 2010), with special reference to candidate loci and genetic components of the genome that may be important for the microbe’s interaction with the insect host, particularly in terms of how it invades, survives in a hostile immune environment and how it kills males. We first recapitulate the basic properties of the genome as described in Darby et al. (2010) and then examine what the genome can tell us about the ‘machinery’ used in interaction with the host, in terms of secretion systems. We then detail open reading frames (ORFs) that have protein sequence similarity to genes coding for effector molecules, as identified in other systems.

Results

The A. nasoniae draft genome analysed here was obtained through pyrosequencing of a standard fragment and paired-end single-stranded template DNA library using the GS DNA Library Preparation Kits (Roche Applied Sciences) that were then amplified by emPCR and sequenced on a GS-FLX (454 Life Sciences). The 454 reads were assembled with Newbler (v1.1.03.24) using default assembly parameters. The draft A. nasoniae genome sequence thus assembled comprised 143 sequence scaffolds (median size 7 Kbp, maximum 212 Kbp), in 665 contigs (median size 4 Kbp, maximum 43 Kbp), with 261 sequencing gaps in the scaffolds, and resulted in a total draft genome assembly size of 3,567,128 bp, with a GC content of 37.7%. These scaffolds are a mixture of bacterial chromosomes (c. 3.2 Mbp) and extrachromosomal DNA (~100 Kbp from plasmids and ~200 Kbp from phage). The draft genome assembly has been deposited in EMBL (accession numbers FN545141-FN545284). Full details of the sequence and its properties can be found in Darby et al. (2009).
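As an illustration of how summary figures of this kind (number of contigs, median and maximum size, total length, GC content) can be derived from an assembly, the following minimal Python sketch computes them for a toy set of sequences. It is not the procedure used for the A. nasoniae draft; the function names and the three toy contigs are hypothetical, and GC content is assumed here to be calculated over unambiguous bases only.

```python
from statistics import median


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def assembly_summary(contigs: dict[str, str]) -> dict[str, float]:
    """Basic size and composition statistics for a set of contigs."""
    lengths = [len(s) for s in contigs.values()]
    joined = "".join(contigs.values())
    return {
        "n_contigs": len(lengths),
        "total_bp": sum(lengths),
        "median_bp": median(lengths),
        "max_bp": max(lengths),
        "gc_percent": round(100 * gc_content(joined), 1),
    }


# Hypothetical toy contigs, for illustration only (not real A. nasoniae data).
toy = {"contig1": "ATGCATAT", "contig2": "GGCCAATTAT", "contig3": "ATAT"}
summary = assembly_summary(toy)
print(summary)
```

In practice the sequences would be read from the deposited FASTA records rather than typed in, and ambiguous bases (N) in sequencing gaps would need a stated handling rule.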

From the 3332 predicted ORFs, we defined those that are most likely to be associated with interaction with the insect host. Most of our report is based on inferring the identification and putative function of annotated ORFs with significant sequence similarities (e < 1 × 10⁻¹⁰) to genes of functionally characterized microbial genomes using BLASTp search algorithms (Altschul et al., 1990), and through the detection of conserved domains as identified from Interpro search. Homology is, where stated, inferred either from gene synteny, or from phylogenetic reconstruction (trees are not shown). The ORF accessions are either noted on figures or tables, or can be found in supplementary material Table S1.
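The similarity screen described above can be sketched in a few lines of Python. The example below assumes that BLASTp results are available as 12-column tab-separated text (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); that layout, the `Hit` type, the function names and the toy records are assumptions made for illustration and are not taken from the original analysis. Only hits below the 1 × 10⁻¹⁰ e-value cut-off are retained, and the highest-scoring one is kept per query ORF.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def parse_hits(lines):
    """Parse 12-column tab-separated BLAST records, skipping comments and blanks."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[10]), float(f[11])))
    return hits


def best_significant_hits(hits, max_evalue=1e-10):
    """Keep, for each query, the highest bit-score hit with e-value below the cut-off."""
    best = {}
    for h in hits:
        if h.evalue >= max_evalue:
            continue
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


# Hypothetical records, for illustration only.
toy_lines = [
    "orfA\ttoxinX\t62.0\t290\t100\t3\t1\t290\t5\t294\t1e-53\t211",
    "orfA\ttoxinY\t40.0\t200\t110\t5\t1\t200\t1\t200\t1e-12\t70.5",
    "orfB\ttoxinZ\t30.0\t80\t50\t2\t1\t80\t1\t80\t0.002\t35.0",
]
best = best_significant_hits(parse_hits(toy_lines))
for query, hit in best.items():
    print(query, hit.subject, hit.evalue, hit.bitscore)
```

A sequence-similarity filter of this kind only nominates candidates; as in the text, domain searches, synteny or phylogeny are still needed before homology is inferred.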

Microbial systems of interaction with the host can be subdivided into the apparatus associated with delivery of proteins and small molecules (secretion systems), the acquisition of molecules from the host environment (transporter systems), and the secreted bioactive proteins and molecules themselves. Below, we identify ORFs within the A. nasoniae genome likely to function in these three roles.

Secretion machinery

There are two types of bacterial machinery for secretion: ones that possess a needle and secrete into eukaryotic cells (e.g. Type III secretion systems), and others that translocate proteins, ions and small molecules across the bacterial membranes into the environment (ABC transporters, Sec-dependent translocation, and Sec-independent translocation).

Type Three Secretion systems

Arsenophonus nasoniae has two complete Type III secretion systems (TTSSs). The first is most closely akin to the TTSS of Yersinia sp. and the second to the Inv/Spa-like apparatus of Salmonella. The genome also contains TTSS fragment regions, numbered 1–3, with sequence similarity to Shewanella, Yersinia and Salmonella, respectively (Figs 1–3).

All three virulent Yersinia species contain a TTSS encoded on their large virulence plasmid, with a core, highly conserved block of about 20 Kbp. The conserved region contains 31 genes in 8 transcriptional units (lcrGVH-yopBD | yopN | lcrDR | yscN-U | virG | virF | yscA-L | lcrQ) (Hueck, 1998). Organisationally, this region of the A. nasoniae TTSS operon is very similar to that of Yersinia in terms of both gene content and order (Fig. 1). Of all the genes previously shown to be essential for the functioning of the TTSS, only YscE has no discernible homologue in the A. nasoniae operon. The ORF at the position in the genome where YscE would be expected has 29% amino acid identity to the type III export protein PscE of Pseudomonas aeruginosa, suggesting a functional equivalent. Interpro search indicates almost all the genes of the Yersinia-like TTSS of A. nasoniae contain complete motifs relevant to their function. There are four exceptions, detailed in supplementary material Table S2. The final ambiguity in terms of function is the presence of a VirF ORF. BLASTp similarity to this ORF is spread across contiguous A. nasoniae ORFs, with the second ORF containing an intact AraC domain. Whether this frameshift is actual or a homopolymer sequence artefact awaits resolution. Despite these queries, we conclude that A. nasoniae is highly likely to carry a functional TTSS akin to that found in Yersinia or Pseudomonas.

In addition to the structural elements of the Yersinia TTSS, A. nasoniae has a cluster of regulatory genes at the end of the operon and on the opposite strand. These ORFs have sequence similarity to components of the low calcium response (Lcr) genes of Yersinia that promote effector secretion under low calcium conditions (such as those found in the presence of a host cell). Also present is a PopB-like (Pseudomonas)/YopB-like (Yersinia) ORF, a pore-forming translocation protein that functions in regulation of secretion but has also been shown to act directly against host cells. Downstream of this region are two ORFs with similarity to ExsC and ExsE of Pseudomonas. Together with the ExsD and ExsA/VirF-like ORFs, we find a complete control mechanism for TTSS gene expression in A. nasoniae of the kind described for Ps. aeruginosa (Rietsch et al., 2005).

Figure 1. The Yersinia-like type III secretion system (TTSS) operon of Arsenophonus nasoniae. Regulation of the operon is most likely exerted by a system akin to that of Pseudomonas via the ExsD, VirF/ExsA, ExsC and ExsE homologues. LcrD, R and V are homologues of the Yersinia low calcium response genes, which activate secretion in the presence of a host cell. YopB and YopD, along with their chaperone SycD, are type III secreted pore-forming proteins in Yersinia. ‘Sct’ is a unified TTSS gene nomenclature (Hueck, 1998). Gene order and function in this operon is compared with its closest homologue: the TTSS of Yersinia enterocolitica. The light orange bars between loci indicate areas of sequence similarity and gene order conservation. Also shown is the A. nasoniae TTSS fragment region #2. Red stars indicate open reading frames pseudogenized by frame shift. Y. enterocolitica gene order taken from Toh et al. (2006).

Figure 2. The Salmonella Inv/Spa-like type III secretion system (TTSS) of Arsenophonus nasoniae. The Salmonella Inv, Spa and Prg genes, which encode the secretion machinery, all have identifiable A. nasoniae counterparts (with the exception of InvH), as does the regulator HilA. OrgA (Salmonella) and HrpE (Burkholderia) are oxygen-regulated TTSS components. A. nasoniae open reading frames similar to the effector SipC and its chaperone SipA have been identified based on synteny and size similarities. ACP is an acyl carrier protein. Gene order and function in this operon is compared with its closest homologues: the SPI-1 TTSS of Salmonella enterica serovar Typhi strain CT18 and the SSR-2 TTSS of Sodalis glossinidius strain ‘morsitans’. The light orange bars between loci indicate areas of sequence similarity and gene order conservation. Also shown is the A. nasoniae TTSS fragment region #3, which is lacking several genes of the needle complex. S. enterica gene order taken from Toh et al. (2006).

The second TTSS operon of A. nasoniae is homologous to the Inv/Spa apparatus of Salmonella sp. pathogenicity island 1 (SPI-1), responsible for the invasion of intestinal epithelia (Hueck, 1998). Homology is inferred both from conservation of gene content and synteny to the TTSS of both Salmonella sp. and Shigella sp. (Galan, 1996) (Fig. 2). The core packet of genes forming the Inv/Spa system is present in A. nasoniae, with the exception of an InvH homologue, a part of an outer membrane translocation complex (Crago and Koronakis, 1998), which is apparently absent. Effectors in this operon include ORFs with sequence similarity to the proapoptotic molecule SipB and its chaperone SipD. Whilst BLASTp indicates the effector SipA and its cognate chaperone SipC are absent, there are two predicted ORFs either side of the SipD-like gene that we believe are likely to be an effector/chaperone pair based on their size and synteny.
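Conservation of gene order, as used above to support homology, can be given a rough quantitative form. The sketch below is one simple possibility, not the method used in this study: it treats each operon as an ordered list of gene labels and finds the longest set of genes occurring in the same relative order in both (a longest common subsequence). The gene labels and the function name are hypothetical placeholders.

```python
def longest_common_gene_order(ref, query):
    """Longest list of genes found in the same relative order in both operons."""
    n, m = len(ref), len(query)
    # dp[i][j] holds the length of the longest common subsequence
    # of ref[i:] and query[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if ref[i] == query[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover one optimal gene list.
    shared, i, j = [], 0, 0
    while i < n and j < m:
        if ref[i] == query[j]:
            shared.append(ref[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical gene orders, for illustration only.
reference_operon = ["geneA", "geneB", "geneC", "geneD", "geneE"]
query_operon = ["geneA", "geneC", "geneX", "geneD", "geneE"]
shared = longest_common_gene_order(reference_operon, query_operon)
conserved_fraction = len(shared) / len(reference_operon)
print(shared, conserved_fraction)
```

This measure ignores strand, inversions and intergenic distance, so it is best read as a crude screen that complements, rather than replaces, inspection of the aligned operons.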

Other secretion systems

Arsenophonus nasoniae, like other gammaproteobacteria, possesses both Sec and TAT (twin arginine translocase) systems for translocating proteins carrying the cognate signal sequence through the inner membrane. Proteins secreted via this system may either remain in the periplasmic space, autotransport through the outer membrane, or move through the outer membrane via a type IV pilus (Natale et al., 2008), which the genome sequence suggests is also present. In addition to these systems, A. nasoniae possesses ORFs revealed by BLASTp to be related to a wide variety of ABC transporters that translocate small molecules and larger peptides/proteins across both cell membranes. Those likely to be important in virulence are described below (details of others in Darby et al., 2009).

Microbes that live inside live hosts are commonly limited by iron availability, and iron acquisition systems are thus regarded as ‘virulence determinants’ (Payne & Finkelstein, 1978). Arsenophonus nasoniae possesses ORFs with reciprocal BLAST match, confirmed by phylogeny (see supplementary material Fig. S3), to two systems for the translocation of chelated iron, one based on an Fe³⁺ ABC transporter system and the other on the TonB-dependent FepABCDG translocation system of Escherichia coli, with which it is syntenous. It is unclear whether A. nasoniae itself manufactures siderophores, although there is a fragment of an ORF with polyketide synthase domains discussed below that may be involved with siderophore production. Arsenophonus nasoniae also possesses an operon showing both sequence similarity and synteny to the E. coli ferrous-iron transport system FeoABC, which transports free (unchelated) ferrous iron.

Effector molecules and pathogenicity islands

The genome of A. nasoniae carries ORFs with sequence similarity to a variety of effector molecules. In addition, it contains three islands of effector molecules apparently undergoing pseudogenization. We first review the variety of ORFs akin to type III secreted effectors, describe a potentially novel symbiosis/pathogenicity island, and then outline ORFs encoding either potential effectors, or the synthesis of small molecule effectors. We finally describe three toxin islands that are apparently undergoing pseudogenization.

Type III effector-like ORFs. Twelve A. nasoniae ORFs with BLASTp sequence similarity to known TTSS effectors were observed, corresponding to ten different effector molecules (Table 1). As is typical of TTSS effectors, only a minority (three) of these are found within the TTSS operon itself (see above), with the others being dispersed throughout the genome. The ORFs to which these show sequence similarity variously alter host cell signalling pathways, in particular, altering host innate immune responses.

Figure 3. Gene order and function of the Arsenophonus nasoniae type III secretion system (TTSS) fragment #1 compared with its closest homologue: the TTSS of Shewanella baltica OS155. The light orange bars between loci indicate areas of sequence similarity and gene order conservation. Notable is the loss of the AraC-like transcription regulator. In A. nasoniae this region is highly pseudogenized, containing many stop codons. S. baltica gene order taken from Toh et al. (2006).

In addition to those ORFs outlined in Table 1, we identified candidate type III effectors within the TTSS defined by synteny rather than sequence similarity (detailed above). We would also note two other ORFs that deserve investigation as candidate TTSS effectors by virtue of being both proximal to one of the likely effector ORFs in Table 1, and having pentapeptide repeat motifs (supplementary material Fig. S1). Pentapeptide repeat motifs are a soft indicator of involvement in virulence, being found in TTSS effectors such as PipB2 (Knodler & Steele-Mortimer, 2005), but also in proteins with no virulence association. In the case of the YopH/pentapeptide repeat ORF pair, a third ORF is found with N terminal similarity to the effector SopA. This ORF is clearly not a SopA homologue, but nevertheless warrants investigation.

Table 1. Arsenophonus nasoniae open reading frames with sequence similarity to known type III effectors

| A. nas ORF/s (length) | Similar ORF (bit score, e-value) | Function of similar ORF | Notes on Arsenophonus nasoniae ORF | References |
| --- | --- | --- | --- | --- |
| 36660; l = 297 | YopJ Yersinia; S = 211, e-53 | Alters ubiquitination status, disrupting MAP kinase and NF-κB signalling, and therefore affecting innate immune signalling and cytokine production. Antiapoptotic. | Of similar length and 62% sequence similarity. Cysteine protease catalytic core and catalytic triad identified in YopJ is intact. | Orth et al. (2000) |
| 23010; l = 726 | YopH Yersinia/SptP Salmonella; S = 132, e-29 | Tyrosine phosphatase activity provides resistance to phagocytosis. | Longer than YopH and carries two tyrosine kinase elements, with precise match at known active site. | Black & Bliska (1997); Galán (2001) |
| 10280; l = 1221 | Effector in Shewanella; S = 441, e-121 | Not known. Effector hypothesis is raised by position in TTSS and chaperone binding site. | Within an incomplete TTSS. Contiguous to ORF with similarity to cognate chaperone. | Black & Bliska (1997); Galán (2001) |
| 35130, 02810, 26090; l = 553 | SopA Salmonella; S = 240, e-61 | HECT-3 like ubiquitin ligase enzyme, alters ubiquitination status of proteins. | Similarity over C terminal of protein (N terminal often different in members of this family). Intact HECT-3 domain and necessary cysteine residue for catalytic activity at site 753. | Zhang et al. (2006) |
| 35620; l = 556 | SopB Salmonella/IpgB Escherichia coli; S = 504, e-142 | Inositol phosphate phosphatase enzyme. | 66% sequence similarity over entire ORF. Active domain recognised. Resides next to cognate chaperone, SigE/IpgD. | Norris et al. (1998) |
| 23150; l = 224 | PipA Salmonella; S = 158, e-37 | Type III secreted protein of unknown function, save important in virulence. | Sequence similarity over entire ORF. | Tenor et al. (2004) |
| 23940; l = 197 | OspG Shigella; S = 98, e-19 | Alters phosphorylation of IκB, interfering with activation of NF-κB pathway and innate immune signalling. | Protein kinase domain intact, including lysine residue required for kinase activity. | Kim et al. (2005) |
| 24950; l = 266 | ExoY Pseudomonas; S = 154, e-36 | Adenylate cyclase activity, creates 100 fold increase in intracellular cAMP and thus interferes with signalling pathways. | Sequence similarity at N terminal, including anthrax toxin domain. C terminal incomplete by virtue of sequencing gap means full functional assessment not possible. | Yahr et al. (1998) |
| 14250; l = 376 | IpaD Shigella; S = 100, e-19 | Surface antigen, required for bacterial entry into epithelial cells and introduction of late effectors. Activity depends on C terminal of protein. | Within TTSS. Found proximal to cognate chaperone. 55% sequence similarity at C terminal. | Picking et al. (2005) |
| 14270; l = 666 | SipB Salmonella; S = 123, e-26 | Important for cell entry and translocation of late effectors. Pro-apoptotic through binding to caspase-1. | Within TTSS. 49% sequence similarity at C terminal. Partial invasin domain found. | Hersh et al. (1999) |

The length in amino acids (l =) is given with the A. nasoniae open reading frame (ORF) accession, and the bit score (S =) and e-value with the ‘Similar ORF’ information. Where more than one A. nasoniae ORF exists, length and BLASTp data are given for most similar ORF only. TTSS, type III secretion system.


Leucine Rich Repeat/mcf Symbiosis/Pathogenicity island. Arsenophonus nasoniae possesses a likely pathogenicity/symbiosis island/s containing ORFs with multiple leucine rich repeats (LRRs) alongside two ORFs, the first with sequence similarity to an Aeromonas cell wall enterotoxin encoded by the gene ast (Table 2) and the second a cell wall hydrolase (Fig. 4). All proteins containing LRR domains bind ligands (that may themselves be proteins, but can be RNA or polysaccharide), and many have been observed to be important in host/bacteria interactions, both in terms of bacterial pathogenicity and host immunity (Kobe & Kajava, 2001). One such bacterial gene is the poorly understood Yersinia effector YopM, a cytotoxin essential for Yersinia pathogenicity (Hines et al., 2001). Other LRR domains function in bacterial internalization in the presence of eukaryotic cells.

Within this region, three ORFs are particularly notable for sharing a common, but previously unrecognized, chimeric structure. They combine N terminal LRR elements coupled to a C terminal with similarity to part of the Photorhabdus gene mcf (makes caterpillars floppy) (Daborn et al., 2002) (Fig. 5). The mcf-like part of these ORFs corresponds to a section of the Photorhabdus gene with sequence similarity to the toxin B and RTX domains found in the C terminal region of mcf. The pro-apoptotic BH3 domain found in the N terminal portion of Photorhabdus mcf was not detected.

Other ORFs with potential for toxin function/production. Table 2 additionally provides details of other ORFs where BLASTp search finds significant similarity to toxin genes. Most notable are four ORFs, dispersed throughout the genome, with sequence similarities to the gene Aip56 (apoptosis inducing protein 56) carried on a plasmid of Photobacterium damselae ssp. piscicida. Functional studies demonstrated the protein product of Aip56 is a pro-apoptotic exotoxin, the protein being sufficient to kill neutrophils, and being necessary for virulence of the bacterium (immunization of fish against Aip56 resulted in failure of infection to kill) (Do Vale et al., 2007). BLASTp indicates these ORFs are members of a small family present in pathogenic gammaproteobacteria. They are similar to type III effector C protein in three species, and to an uncharacterized secreted protein in a pathogenic E. coli. A final BLAST return recovered similarity of the C terminal of the A. nasoniae ORF to an element in APSE-2, a phage present in the aphid secondary symbiont Hamiltonella defensa. A. nasoniae is the only genome to date to possess more than one copy of this family. However, the expansion of the family is not recent, and amino acid identity between members within A. nasoniae is no more than 75% in any case. Phylogenetic analysis cannot confidently resolve whether the A. nasoniae copies are monophyletic within A. nasoniae. Therefore, it is unclear if this finding represents a single origin or a serial transfer of the element.

Table 2. Intact open reading frames with sequence similarity to toxin elements

| A. nas ORF/s (length) | Similar ORF (bit score, e-value) | Function | Notes on Arsenophonus nasoniae ORF | References |
| --- | --- | --- | --- | --- |
| 10220, 23450, 07720, 33080; l = 486 | Aip56, Photobacterium damselae ssp. piscicida; S = 325, e-87 | Pro-apoptotic exotoxin | Of similar length and 45–54% sequence similarity across Aip56. 3 ORFs are in phage regions, one is proximal to a TTSS. | Do Vale et al. (2007) |
| 36850; l = 1051 | cnf1: Cytotoxic necrotizing factor 1, Escherichia coli; S = 107, e-21 | C terminus is translocated into cells, and causes illegitimate activation of Rho GTPase activity, altering signalling. | Sequence similarity at N terminus, with intact cell receptor domain, and pair of membrane spanning helices. C terminus does not possess cnf1 catalytic domain that causes toxicity, and has no matches in NCBI nr. | Boquet (2001) |
| 35480; l = 151 | Colicin V, Photorhabdus; S = 256, e-67 | Bacteriocidal | Intact colicin V production motif. | Cascales et al. (2007) |
| 22290; l = 431 | Colicin 1b; S = 225, e-57 | Bacteriocidal | Pore forming domain complete. Lies proximal to colicin 1b immunity factor-like ORF. | Cascales et al. (2007) |
| 31950; l = 533 | Serralysin; S = 328, e-88 | Insecticidal | Complete serralysin domain of insecticidal hemolysin of Serratia. | Tao et al. (2007) |
| 28400; l = 651 | ast, Aeromonas hydrophila; S = 761, e = 0 | Enterotoxin | Carries signal peptide. 72% aa identity over entire Aeromonas ORF. Part of LRR/mcf island. | Sha et al. (2002) |

The length in amino acids (l =) is given with the A. nasoniae open reading frame (ORF) number, and the bit score (S =) and e-value with the matched ORF. Where more than one A. nasoniae ORF is listed, length and BLASTp data are given for the most similar ORF only. TTSS, type III secretion system.

The other notable putative ‘toxin’ found by sequence similarity searches (e < 1 × 10^-21) has sequence similarity to the cnf1 gene (cytotoxic necrotizing factor 1). The cnf1 locus possesses three functional domains. The N terminal carries a cell receptor binding domain, important in adhesion to eukaryotic cells, and a pair of hydrophobic membrane spanning helices. These are conjectured to be important in initiating transfer of the bioactive C terminal domain into the cell, which then illegitimately activates host Rho GTPases (Boquet, 2001). The cnf1-like ORF of A. nasoniae carries the domains for both adhesion and transfer of the C terminal, but the C terminal is distinct (no returns on BLASTp search). This ORF deserves investigation, as it has the machinery for translocation of the C terminal into eukaryotic cells, making the C terminal potentially bioactive, but in unknown ways.

Additionally, the genome encodes 38 putative haemolysins and alkaline metalloproteases that are potentially transported by two ABC family transporters with weak sequence similarity to RTX/haemolysin transport proteins. One of the haemolysin genes has sequence similarity to serralysin, an insecticidal toxin of Serratia sp. (see Table 2).

Small molecule synthesis and secretion. Bacteria can also secrete an array of small bioactive organic molecules. These molecules may aid survival in a hostile host environment, or have direct activity against the host. Within the first category are molecules such as siderophores, which function in iron scavenging, an important determinant of fitness in the iron-poor settings that typify the host environment. In the latter category are a range of secreted polyketides such as the antibiotic tetracycline, and molecules that show toxicity to eukaryotes. For instance, the Pseudomonas symbiont of Paederus beetles is known to synthesize and secrete the polyketide pederin, which makes the host repellent to predators (Kellner, 2002).

Figure 4. The organization of the two LRR-MCF regions of Arsenophonus nasoniae. The first, larger region contains two LRR-MCF open reading frames (ORFs) and three YopM-like genes containing 12, 11 and 30 leucine rich repeat regions, respectively. The star in LRR-MCF ORF 1 indicates the location of a stop codon that may be an artefact of 454 sequencing. Upstream is a cell wall hydrolase-like gene (best BLASTp matches to genes found in various Salmonella isolates) and an ORF with similarity to the ast enterotoxin of Aeromonas (detailed in Table 2). The second, smaller region contains a single LRR-MCF ORF followed by a gene with weak homology to MCF (e < 1 × 10^-4). Between this and the transposase are, from left to right, a tagatose bisphosphate aldolase and two phosphotransferase system (PTS) ORFs. Immediately upstream is a scaffold start.

Figure 5. A comparison of the three Leucine Rich Repeat (LRR) – Makes Caterpillars Floppy (MCF) open reading frames (ORFs) of Arsenophonus nasoniae. Green arrows indicate areas containing LRR domains, the red arrows indicate areas of homology to the Repeats in Toxin (RTX) toxin gene of Vibrio sp. and the orange bars show regions of homology to the MCF gene of Photorhabdus luminescens. The black region in LRR-MCF ORF 2 indicates a sequencing gap of known size (75 amino acids).

Siderophores and secreted toxin molecules often share biosynthetic pathways and can be synthesized in condensation reactions similar to those of the fatty acid synthesis pathways. There is evidence within the genome of A. nasoniae for genes that both form and export polyketide molecules. Three ORFs are found within the genome that may function in polyketide synthesis (Supplementary material Table S3): a putative polyketide cyclase, an ORF incomplete in the current assembly that nevertheless possesses a variety of domains commonly found in polyketide synthase (PKS) enzymes, and a final ORF (proximal to the PKS-like ORF) with homology to 4′-phosphopantetheinyl transferase, that is likely to activate ketosynthase (KS) enzymes, via post-translational modification of the KS acyl carrier protein domain (Lambalot et al., 1996).

Polyketide efflux is suggested by the presence of ORFs with reciprocal BLAST match (confirmed by phylogeny, see supplementary material Fig. S3) to macrolide/antimicrobial peptide transporter systems. Macrolide transport is associated with an ABC system comprising two proteins, the ABC transporter protein itself, and MacA, a periplasmic membrane fusion protein that connects inner and outer membrane components of the transporter (Tikhonova et al., 2007). Arsenophonus nasoniae has these components coupled within the genome. Both appear intact with the expected functional motifs and share similar length and significant sequence similarity to functionally characterized genes in Yersinia (e < 1 × 10^-108).
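The reciprocal BLAST criterion used here can be illustrated with a minimal Python sketch. It assumes that the forward and reverse searches have already been reduced to bit scores per query; the ORF and protein names and the scores are hypothetical placeholders and are not taken from the A. nasoniae analysis.

```python
from typing import Dict, Optional

# Hits are stored as {query: {subject: bit_score}}. This layout is an
# assumption made for illustration only.
HitTable = Dict[str, Dict[str, float]]


def best_hit(hits: HitTable, query: str) -> Optional[str]:
    """Return the subject with the highest bit score for `query`, or None."""
    scored = hits.get(query, {})
    if not scored:
        return None
    return max(scored, key=scored.get)


def is_reciprocal_best_hit(forward: HitTable, reverse: HitTable, query: str) -> bool:
    """True if the best hit of `query` has `query` as its own best hit."""
    subject = best_hit(forward, query)
    if subject is None:
        return False
    return best_hit(reverse, subject) == query


# Hypothetical scores: one ORF that matches reciprocally, one that does not.
forward = {
    "orf_macA_like": {"ref_MacA": 410.0, "ref_AcrA": 120.0},
    "orf_other": {"ref_MacA": 95.0},
}
reverse = {
    "ref_MacA": {"orf_macA_like": 405.0, "orf_other": 90.0},
    "ref_AcrA": {"orf_other": 200.0, "orf_macA_like": 110.0},
}

print(is_reciprocal_best_hit(forward, reverse, "orf_macA_like"))
print(is_reciprocal_best_hit(forward, reverse, "orf_other"))
```

A reciprocal match of this kind is only a first filter; in the analysis described here it was additionally confirmed by phylogeny.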

Islands undergoing pseudogenization

The elements of a genome undergoing pseudogenization can provide insight into the historical biology of a bacterial species. Whilst homopolymer artefacts associated with sequencing make individual frameshift errors difficult to interpret, the following islands presented multiple frameshift mutations and stop codons and can thus be ascribed pseudogene status with more certainty.

RTX Island. The RTX (repeats in toxin) proteins are a family of exotoxins known from studies of mammalian pathogens to possess haemolytic, leucotoxic and leucocyte-stimulating activities (Lally et al., 1999). The A. nasoniae genome encodes ORFs with similarity to an RTX ABC transporter and has a 26 kb region with sequence similarity to RTX proteins (ARN_13180 to ARN_13310). The transporter structure is complete and shows both strong sequence similarity and synteny with RTX transporters found in Photorhabdus sp. (confirmed by phylogeny, see supplementary material Fig. S3). The region with similarity to genes encoding RTX proteins itself is probably pseudogenized, with the equivalent of three ‘functional ORFs’ disrupted by two stop codons and five frameshifts to make 10 partial ORFs. Comparable genes in Yersinia sp. show similar fragmentation and may have a common ancestry from larger complete RTX proteins seen in Providencia sp.
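The count of partial ORFs in the RTX region follows from simple bookkeeping, sketched below in Python. The sketch assumes that every stop codon or frameshift falls inside an ancestral ORF and splits exactly one fragment into two; the function name is hypothetical.

```python
def expected_partial_orfs(ancestral_orfs: int, stop_codons: int, frameshifts: int) -> int:
    """Number of ORF fragments expected after the listed disruptions.

    Assumption: each disruption splits one existing fragment into two,
    so every disruption adds exactly one fragment to the total.
    """
    if min(ancestral_orfs, stop_codons, frameshifts) < 0:
        raise ValueError("counts must be non-negative")
    return ancestral_orfs + stop_codons + frameshifts


# RTX region: three ancestral ORFs, two stop codons, five frameshifts.
print(expected_partial_orfs(3, 2, 5))  # 10
```

Under that assumption, three ancestral ORFs with seven disruptions give the 10 partial ORFs reported for this region.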

Clostridium/Rickettsiella/Xenorhabdus Island. Arsenophonus nasoniae possesses a 10 kbp island of ORFs with sequence similarity to toxins found in Clostridium or Xenorhabdus (ARN_08100 to ARN_08150). Whilst none of the ORFs have outstanding similarity to any of the toxin elements (e > 1 × 10^-25), this series of ORFs does suggest a potential pathogenicity/symbiosis island that is undergoing pseudogenization. The island is predicted to contain six ORFs, but BLASTp sequence identity searches indicate that ancestrally it may have contained fewer. The first three ORFs show two alignments to Clostridium ORF X2-like genes from Erwinia (e < 1 × 10^-46) separated by a stop codon and a frameshift. Clostridium ORF X2 has a role in botulinum toxin production (Dineen et al., 2004). The final two ORFs are similar to nematicidal protein Plu2242 of Xenorhabdus/Photorhabdus (e < 1 × 10^-90), but the single Xenorhabdus/Photorhabdus ORF is disrupted by at least one stop codon in A. nasoniae.

Toxin complex. Insecticidal toxin complex (Tc) proteins were first isolated in P. luminescens (Bowen & Ensign, 1998) and have since been described in a variety of entomopathogens (Serratia entomophila, Xenorhabdus nematophilus, Pseudomonas entomophila) and several Yersinia sp. (Fuchs et al., 2008 and refs. therein). Three components are needed for full toxicity, a TcdA-like (TcaA, TcaB or TcdA), a TcdB-like (TcaC or TcdB) and a TccC-like component. Experiments expressing Tc genes in E. coli showed that TcdA1 on its own exhibits low levels of toxicity towards insect gut epithelium, with full toxicity achieved only with the addition of both a TcdB and a TccC-like gene product (Waterfield et al., 2005; Munch et al., 2008).

Arsenophonus nasoniae has a number of ORFs with BLASTp sequence similarity to Tc genes in P. luminescens (e < 1 × 10^-58). A. nasoniae has ORFs with protein sequence alignments to TcdA4, TcdB1 (on one island) and TccC (elsewhere in the genome). The TcdA4 BLAST returns are spread over three separate predicted ORFs, with one frameshift event in the middle of the ‘gene’ and multiple stop codons in the end section. Likewise, a frameshift divides the region with sequence alignment to TcdB1 over two ORFs, but the region does include the SpvB domain seen in TcdB1 of P. luminescens (Waterfield et al., 2001). The A. nasoniae TccC-like region is interrupted by an IS100 transposase and multiple stop codons. Although further analysis of these regions is required, from the available sequence data it appears that A. nasoniae contains the pseudogenized remnants of an ancestral Tc insecticidal toxin locus.
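Disruptions of this kind can be located with a simple scan for in-frame stop codons. The following Python sketch is an illustration only and not the procedure used in this study; it assumes a plain nucleotide string read on the forward strand, and the example sequence is made up.

```python
from typing import List

# TAA, TAG and TGA are the stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(dna: str, frame: int = 0) -> List[int]:
    """Return 0-based start positions of stop codons in the given frame."""
    seq = dna.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def premature_stops(dna: str, frame: int = 0) -> List[int]:
    """Stop codons in frame, excluding one at the final complete codon."""
    n_codons = (len(dna) - frame) // 3
    last_codon_start = frame + 3 * (n_codons - 1)
    return [i for i in in_frame_stops(dna, frame) if i != last_codon_start]


# Made-up sequence: ATG AAA TAG CCC TGA
example = "ATGAAATAGCCCTGA"
print(in_frame_stops(example))   # [6, 12]
print(premature_stops(example))  # [6]
```

A single premature stop found this way may still be a sequencing artefact, which is why regions with several independent disruptions are treated as pseudogenes with more confidence.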

Secreted proteins defined by signal sequence presence

A search for signal peptide motifs reveals 310 intact ORFs predicted to carry a signal sequence for sec-dependent secretion under either NN or HMM algorithms (supplementary material Table S4). As expected, many of these ORFs have best BLASTp matches to membrane bound parts of transport (type III secretion, ABC, type IV pilus) or motile (flagellar) machinery. Other ORFs have best BLASTp matches to extracellular solute binding proteins, to lipoproteins, to proteins expected to be involved in peptidoglycan formation and breakdown, and to outer-membrane proteins of undefined function, commonly described as ‘antigens’.

Of biological interest are a variety of ORFs with best BLASTp returns to genes that function in adhesion to eukaryotic cells (described in a later section), a group of five ORFs with chitin-binding domains, a superoxide dismutase, eight ORFs with peptidase/carboxypeptidase domains, and a homologue of ecotin, a serine protease inhibitor. There were also ORFs with Sel1 and ankyrin domains, both of which are involved in protein–protein interactions, potentially with eukaryotic cells. The ankyrin repeat containing protein carried four repeat units, and best BLASTp hits were to ORFs in Providencia and Proteus, the closest relatives of Arsenophonus. The Sel1-containing ORFs carried four and five repeat units. Sel1 repeats belong to the tetratricopeptide repeat family, and many prokaryotic genes involving Sel1 repeats mediate interaction with eukaryotic cells (Mittl & Schneider-Brachert, 2007). Both ORFs have best BLASTp hits to ORFs in Providencia.

We investigated in greater detail the likelihood that A. nasoniae possesses a functional homologue of ecotin. ORF ARN_08870 encodes a protein of similar length to that found in E. coli, and the four domains known to be present in ecotin (primary substrate binding, secondary substrate binding, inhibition and dimerization) are all present. All have perfect matches at the required amino acid motifs, with the exception of the dimerization motif (which matches at 10 of 11 amino acid residues). It is thus likely that this secreted protein possesses SERPIN properties.

Outer membrane proteins determine many aspects of interaction with the host. OmpA has recently been demonstrated to be a modulator of interaction with the host immune system (Weiss et al., 2008) as well as functioning as an adhesin and invasin, a modulator of biofilm formation and a bacteriophage receptor (Smith et al., 2007). Weiss et al. (2008) demonstrated that the sequence of the four external loops of this protein (especially the first, L1) dictated whether the bacterium induced an immune response and persisted in the insect haemocoel, with insect endosymbionts having a distinct L1 from vertebrate pathogens. ORF ARN_16100 is a homologue of OmpA (as determined by phylogenetic analysis) and carries a signal peptide. The A. nasoniae OmpA-like ORF contains an insertion in the L1 region not seen in pathogenic E. coli, Salmonella typhimurium or Shigella flexneri, but characteristic of the symbiont set. The A. nasoniae insertion is, however, different from those seen in Sodalis glossinidius, P. luminescens, SOPE and a secondary endosymbiont of Craterina melbae, and more similar to that of H. defensa (Supplementary material Fig. S2). The fact that A. nasoniae contains a novel motif in a functionally important region of a gene shown to be so crucial in the switch to symbiosis makes it an important candidate for future investigation.

Adhesion to eukaryotic cells and invasin

Bacterial IgG repeats are commonly important in cell adhesion, and ORFs with these repeats are found frequently in the A. nasoniae genome. One such ORF has best sequence match to the invasin locus of Yersinia, and contains at least 12 IgG repeat domains. Two other ORF areas contain multiple IgG repeats with best sequence matches to Yersinia and, in Yersinia pestis, to invasin. The first of these matches is to a region containing three ORFs with multiple IgG repeats (4, 4 and 1, respectively). The final ORF with similarity to invasin is ARN_09430. Two of these five ORFs are identified as carrying signal peptides (ARN_19810 and ARN_09430). It should be noted that genes in this class are also secreted through the type III secretion system or phage.

The A. nasoniae genome carries two ORFs with strong sequence similarity to the attachment invasion locus (ail) protein of Yersinia. Yersinia ail encodes a 179 amino acid, 17 kDa outer membrane protein that is important in cell adhesion and entry. One of the two A. nasoniae ORFs shows similarity to ail along its length, with Yersinia ail as the best BLASTp match (e < 1 × 10^-36); the second has Photorhabdus and Yersinia ail best matches (e < 1 × 10^-44). It is very likely that these ORFs both represent secreted outer membrane proteins in Arsenophonus, as both carry signal sequences.

Arsenophonus nasoniae also carries six ORFs containing at least partial haemagglutination activity domains (HADs), of which four are predicted to contain signal peptides (supplementary material Table S5). One of these, ARN_32080, contains a complete HAD, is predicted to carry a signal peptide and is only one codon shorter than its Proteus equivalent. Whilst A. nasoniae has several HAD-containing ORFs, it clearly has no homologues of the filamentous haemagglutinins of, e.g. Bordetella sp., these being much larger ORFs (Domenighini et al., 1990).

A. nasoniae also contains ORFs with sequence similarity to an agglutinin, a tia-like ORF (e < 1 × 10^-42), which in enterotoxigenic E. coli allows their adherence to and invasion of human colon- and ileum-derived epithelial cells (Mammarappallil & Elsinghorst, 2000). The A. nasoniae ORF is fragmented by contig gaps and contains a stop codon between the region containing the signal sequence and the rest of the ORF. Further work is needed to determine its functionality.

Finally, four fimbrial adhesion operons are detected, which are most likely assembled via the chaperone-usher pathway (Soto & Hultgren, 1999). Each operon contains at least one fimbrial subunit protein, chaperone, outer membrane usher protein and adhesin. All four operons contain a fimbrial protein with a complete FimA domain, a chaperone gene with either a FimC domain or both the N- and C-terminus chaperone domains and an outer membrane usher gene with complete FimD or usher superfamily domain. As expected, the adhesin genes show weak sequence preservation. Fimbrial adhesins are composed of an N-terminus receptor-binding domain fused to a C-terminus pilin domain; any homology between A. nasoniae adhesins and those of other bacteria is confined to this C-terminus region. The lack of sequence conservation at the N-terminal region may be indicative of host specialization.

Discussion

Galan & Bliska (1996) described the relationship between bacteria and host as a complex cross talk. Examining the gene content of A. nasoniae provides detailed information on one side of this conversation and allows insight into the potential interactions occurring between bacteria and insect. Furthermore, it allows us to marry gene content to lifestyle, to speculate which interactions are occurring during which stages of the symbiosis. In this respect, it is important to recognize that there are three main stages in the life cycle of A. nasoniae: bacterial invasion of the wasp larva following per oral exposure; spread within the wasp host, including survival in the face of the host immune response; and finally injection into the fly pupa on oviposition, replication in the fly host, and re-infection of the feeding wasp larvae.

During the invasion phase, genetic mechanisms are required for A. nasoniae to survive digestive enzymes, adhere to the Nasonia gut epithelia, and then pass through it into the host haemocoel. Membrane-bound secreted peptidases and the SERPIN ecotin are candidates that may improve survival in the gut environment, removing or inhibiting gut enzymes that would otherwise digest the bacterium. As far as adhesion is concerned, fimbrial-like, intimin/invasin-like, and haemagglutinin-like ORFs may be of importance in the initial binding to gut epithelia prior to internalization, as in the mammalian system. It is likely that some of the effectors secreted by the type III secretion systems are involved in entry and passage into and through the gut epithelia; these are known to be important in gut epithelia transit for Salmonella and Yersinia in mammal hosts (Hueck, 1998). The Tc complex, important in gut entry in other bacteria, appears to be undergoing pseudogenization in A. nasoniae, and is thus a less likely candidate in this system.

Arsenophonus nasoniae then establishes intercellular infections in all major tissues of the wasp host (Huger et al., 1985). Their spread could involve the use of adhesins (both cell-surface and pilin-mediated), flagellar motility (as is seen in the movement of the related bacterium, Riesia (Perotti et al., 2007)) and chitin-binding apparatus. As the bacteria exist extracellularly in the host, they are likely to be exposed to the full force of the immune system. A variety of ORFs are akin to TTSS effectors known to interfere with cell signalling and innate immunity, including YopJ-, OspG- and SopA-like ORFs. One can hypothesize that the ORFs related to apoptosis-inducing protein and CNF1 genes, as well as TTSS secreted effectors (YopH/SptP-like), may be important in preventing death following phagocytosis. The ecotin-like ORF may also be important in this context, ecotin being a SERPIN known to affect digestion by neutrophils.

Arsenophonus nasoniae first attracted attention because of its son-killer phenotype, and this paper is the first to describe potential virulence components of a male-killing bacterium. However, the biochemical mechanism by which A. nasoniae induces its killing of male offspring remains elusive. Ferree et al. (2008) have shown that death of male eggs results from a lack of maternal centrosome production (male offspring ensue from unfertilized eggs and receive their centrosomes maternally, rather than from the sperm). It is also known that the factor moves across eukaryotic cell membranes, implying it is either a peptide or small molecule, or that it is injected through the type III system. If the polyketide synthase system is functional, A. nasoniae may be able to synthesize small molecules, and this would present a tempting case study. The alternative hypothesis is that the agent is an effector placed into the eggs through the secretion system, or a secreted molecule that possesses the ability to translocate across eukaryotic membranes.


Some of the likely effector molecules are also rather elusive in terms of their function. Particularly enigmatic is the island of ORFs of the LRR/mcf family. These proteins carry a pattern recognition motif fused to one or two toxin elements (RTX/toxin B). Another interesting locus is the ORF with sequence similarity to cnf1. This gene possesses an N terminal that appears functional with respect to delivery of the C terminal into host cells. However, the C terminal is without matches in BLASTp database search, and thus the function of the protein remains unclear.

Our investigation of the virulence components of the A. nasoniae genome has led to one broad conclusion: that A. nasoniae possesses ORFs related to virulence genes from a wide variety of gammaproteobacteria. Although the A. nasoniae core genome shows closest similarity to Photorhabdus and Proteus (Darby et al., 2010), its virulence genome shows much more equal representation from a range of different gammaproteobacteria. It is common to find effector molecules with greatest similarity to those in Yersinia, Salmonella, Vibrio, Pseudomonas or Serratia. Predictably, many of the antimicrobial genes in Photorhabdus are absent (A. nasoniae does not have to defend an insect corpse from invasion by other bacteria, as it maintains a live host).

The genome can be compared with other ‘inherited parasites’. Unlike Wolbachia, there is a limited array of ORFs with ankyrin or TPR domains that apparently function in interaction with eukaryotic proteins. Rather, there are a range of ORFs associated with interactions with eukaryotic cell surfaces and epithelia (invasin, intimin, adhesin homologues). There are also ORFs likely to be involved in interactions with the host antimicrobial immune system, which are not present in other reproductive parasites. These differences reflect the distinct lifestyles of intracellular and intercellular symbionts, and the difference between transovarially transmitted symbionts and transovum transmitted ones such as A. nasoniae that must cross the gut epithelium.

The genome also provides hints as to the history of A. nasoniae. Pseudogenized areas with homology to insecticidal regions from a wide range of different bacteria are observed, such as the two relic pathogenicity islands and Tc region. This pattern of pseudogenization suggests recent descent from a bacterium with more general insecticidal properties, possibly in the form of a microbe that demonstrated straightforward pathogenesis, as in Photorhabdus. Alternatively, A. nasoniae may historically have been a secondary symbiont, using the products from currently pseudogenized regions in interactions with natural enemies of the host, as speculated for the RTX and leukotoxin-like genes in the beneficial secondary symbiont Hamiltonella (Moran et al., 2005). Overall, A. nasoniae’s virulence factor complement is similar to Hamiltonella’s. Both carry two TTSSs and a diverse assortment of effectors, multiple, possibly pseudogenized, RTX toxins and at least one CNF1-like gene (Degnan et al., 2009). With respect to this latter hypothesis, we would note that other Arsenophonus strains are likely to be secondary symbionts of insects, being vertically transmitted and showing no obvious pathology or reproductive manipulation phenotypes. However, the widespread distribution of this genus (Hypsa & Dale, 1997; Dale et al., 2004; Thao & Baumann, 2004; Werren, 2004; Duron et al., 2008) indicates significant levels of horizontal transmission and perhaps the existence of pathogenic Arsenophonus types in nature. Little is known about the phenotypes of Arsenophonus species found in other hosts, and comparative genomics is likely to be informative.

Besides the direct functional investigation of the candidate genes outlined above, comparative genomics may provide a useful tool in determining which aspects of A. nasoniae biology are likely to be involved in travel through gut epithelia and male-killing. In particular, sequencing of Arsenophonus strains that are apparently purely vertically transmitted, and act as secondary symbiont strains, may serve to develop hypotheses as to which elements of the genome are involved with invasion and male-killing. It may also serve to illuminate the elements of microbial genome evolution that occur directly following a shift in lifestyle towards obligate symbiosis.

The availability of both the host genome (Werren et al., 2010) and that of A. nasoniae provides new avenues for functional investigation of both host invasion and mechanisms of male-killing. For example, expression studies of both the host and bacterium could indicate candidate mechanisms for inhibition of maternal centrosomes. Proteomic studies of host and bacteria during oogenesis can also provide information on candidate molecules, and the availability of host and bacterial sequences greatly assists in proteomic approaches.

Experimental procedures

The A. nasoniae genome was sequenced and assembled, and putative ORFs were detailed, as described by Darby et al. (2010). All predicted ORFs were subjected to BLASTp (Altschul et al., 1990) search against the NCBI nr database and InterPro domain searches on the standalone InterProScan tool (data release 19, 29th January 2009) running the blastprodom, coils, gene3d, hmmpanther, hmmpir, hmmpfam, hmmsmart, hmmtigr, fprintscan, patternscan, profilescan, superfamily, seg, signalp, and tmhmm programs (Quevillon et al., 2005). A virulence factor database (Chen et al., 2005) overlay was also prepared using local BLAST. Based on BLASTp, domain and virulence association information, a ‘virulence set’ of ORFs with potential roles in the interaction between bacterium and insect was selected at significance levels set below 1 × 10^-10. To further assess functionality, ORFs of this subset were compared with their most similar matches in terms of size, presence of active domains or motifs, and location or synteny. ABC transport systems were analysed by reciprocal BLAST and phylogenetic analysis. Initially this was undertaken en masse, comparing all twenty-one ORFs annotated as ABC-like with their BLAST hits from OrthoMCL (http://www.orthomcl.org/) with e-value < 1 × 10^-100, aligned using MUSCLE (Edgar, 2004) with well aligned blocks extracted using Gblocks (Talavera & Castresana, 2007), and assembled using PhyML (Guindon & Gascuel, 2003) (see supplementary material Fig. S3). All ABC ORFs referred to in the text were verified by creating phylogenies using the best BLASTp hit to the first eight species in the NCBI nr database, with alignment and neighbour-joining tree assembly in ClustalX v2.0.11 (http://www.clustal.org/).
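The e-value cut-off used to define the ‘virulence set’ can be sketched as a filter over tabular BLAST output. This is a minimal Python illustration and not the pipeline used in the study: it assumes 12 tab-separated columns with the e-value in column 11 and the bit score in column 12, and all ORF names, subject names and scores are hypothetical.

```python
import csv
import io
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def parse_tabular(text: str) -> List[Hit]:
    """Parse tab-separated hits (assumed layout: e-value and bit score last)."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hits.append(Hit(row[0], row[1], float(row[10]), float(row[11])))
    return hits


def select_virulence_set(hits: List[Hit], max_evalue: float = 1e-10) -> Dict[str, Hit]:
    """Keep, per query ORF, the highest-scoring hit with e-value below the cut-off."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue >= max_evalue:
            continue
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def make_row(query: str, subject: str, evalue: str, bitscore: str) -> str:
    """Build one hypothetical 12-column line; the middle columns are filler."""
    filler = ["50.0", "100", "0", "0", "1", "100", "1", "100"]
    return "\t".join([query, subject, *filler, evalue, bitscore])


sample = "\n".join([
    make_row("orf_1", "subject_a", "1e-21", "107"),
    make_row("orf_1", "subject_b", "1e-12", "70.5"),
    make_row("orf_2", "subject_c", "0.003", "35.0"),
])
selected = select_virulence_set(parse_tabular(sample))
print(sorted(selected))             # ['orf_1']
print(selected["orf_1"].subject)    # subject_a
```

In practice the retained ORFs would then be checked against domain and virulence factor database information, as described above, before inclusion.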

We also estimated the component of the genome that is secreted into periplasmic space through the Sec/Tat systems, using the programme SignalP (Bendtsen et al., 2004). Presence of a signal peptide was estimated using neural networks (NN) and hidden Markov models (HMM) trained on Gram-negative bacteria. A signal peptide was considered present under the NN models if all criteria were fulfilled for presence of a signal peptide. A signal peptide was considered present under HMM output if the protein was identified as ‘secreted’ (S).
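The two decision rules can be combined as in the following Python sketch. The data layout is an assumption made for illustration: each ORF is given a list of Y/N flags for the NN criteria and a single HMM call, with ‘S’ denoting a predicted signal peptide; the ORF names and values are hypothetical and do not reproduce actual SignalP output.

```python
from typing import Dict, Iterable, List, Set, Tuple


def nn_positive(nn_flags: Iterable[str]) -> bool:
    """NN rule: every criterion must be flagged 'Y' (and at least one must exist)."""
    flags = list(nn_flags)
    return bool(flags) and all(flag == "Y" for flag in flags)


def hmm_positive(hmm_call: str) -> bool:
    """HMM rule: the protein must be called 'S' (secreted)."""
    return hmm_call == "S"


def secreted_orfs(predictions: Dict[str, Tuple[List[str], str]]) -> Set[str]:
    """ORFs with a signal peptide under either the NN or the HMM rule."""
    return {
        orf
        for orf, (nn_flags, hmm_call) in predictions.items()
        if nn_positive(nn_flags) or hmm_positive(hmm_call)
    }


# Hypothetical predictions: (NN flags, HMM call); 'Q' stands for any non-'S' call.
predictions = {
    "orf_1": (["Y", "Y", "Y", "Y", "Y"], "S"),  # both methods agree
    "orf_2": (["Y", "Y", "N", "Y", "Y"], "S"),  # HMM only
    "orf_3": (["Y", "Y", "Y", "Y", "Y"], "Q"),  # NN only
    "orf_4": (["N", "N", "N", "N", "N"], "Q"),  # neither
}
print(sorted(secreted_orfs(predictions)))  # ['orf_1', 'orf_2', 'orf_3']
```

Taking the union of the two methods matches the ‘either NN or HMM’ criterion used for the count of secreted ORFs; requiring both would give a stricter set.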

Acknowledgements

Sequencing support for this project is provided by The Centre for Genomics and Bioinformatics (CGB) at Indiana University, which is supported in part by the METACyt Initiative of Indiana University, funded in part through a major grant from the Lilly Endowment, Inc. Computer support was provided by the University Information Technology Services (UITS) and by the CGB computing group. We thank the group leaders Phillip Steinbachs (CGB) and Craig Stewart and Richard Repasky (UITS). Additional support is provided by the Indiana Center for Insect Genomics project funded through the Indiana 21st Century Research and Technology Fund. We thank the CGB genome sequencing team, including Jade Buchanan-Carter and Zachary Smith. TW was in part funded by the MRC; GH by a grant from the NERC. Amanda Avery and Jorge Azpurua are thanked for isolation and culturing of bacterial strains. Work by JW was supported by the US National Science Foundation (EF-0328363).

References

Altschul, S., Gish, W., Miller, W., Myers, E. and Lipman, D. (1990)Basic local alignment search tool. J Mol Biol 215: 403–410.

Bendtsen, J., Nielsen, H., von Heijne, G. and Brunak, S. (2004)Improved prediction of signal peptides: SignalP 3.0. J Mol Biol340: 783–795.

Black, D. and Bliska, J. (1997) Identification of p130Cas as asubstrate of Yersinia YopH (Yop51), a bacterial proteintyrosine phosphatase that translocates into mammalian cellsand targets focal adhesions. EMBO J 16: 2730–2744.

Boquet, P. (2001) The cytotoxic necrotizing factor 1 (CNF1) fromEscherichia coli. Toxicon 39: 1673–1680.

Bowen, D. and Ensign, J. (1998) Purification and characterizationof a high-molecular-weight insecticidal protein complex pro-duced by the entomopathogenic bacterium Photorhabdusluminescens. Appl Environ Microbiol 64: 3029–3035.

Cascales, E., Buchanan, S., Duche, D., Kleanthous, C., Lloubes,R. and Postle, K. (2007) Colicin Biology. Microbiol Mol BiolRev 71: 158–229.

Chen, L., Yang, J., Yu, J., Yao, Z., Sun, L. and Shen, Y. (2005)VFDB: a reference database for bacterial virulence factors.Nucleic Acids Res 33: D325–D328.

Crago, A. and Koronakis, V. (1998) Salmonella InvG forms aring-like multimer that requires the InvH lipoprotein for outermembrane localization. Mol Microbiol 30: 47–56.

Daborn, P., Waterfield, N., Silva, C., Au, C., Sharma, S. andFfrench-Constant, R. (2002) A single Photorhabdus gene,makes caterpillars floppy (mcf), allows Escherichia coli topersist within and kill insects. Proc Natl Acad Sci USA 99:10742–10747.

Dale, C. and Moran, N. (2006) Molecular interactions betweenbacterial symbionts and their hosts. Cell 126: 453–465.

Dale, C., Beeton, M., Harbison, C., Jones, T. and Pontes, M. (2004)Isolation, pure culture, and characterization of ‘CandidatusArsenophonus arthropodicus,’ an intracellular secondaryendosymbiont from the hippoboscid louse fly Pseudolynchiacanariensis. Appl Environ Microbiol 72: 2997–3004.

Darby, A.C., Choi, J-H., Wilkes, T.E., Hughes, M.A., Werren, J.H.,Hurst, G.D.D. et al. (2010) Characteristics of the Arsenopho-nus nasoniae genome: son-killer bacterium of the waspNasonia. Insect Mol Biol 19 (Suppl. 1): 75–89.

Degnan, P., Yu, Y., Sisneros, N., Wing, R. and Moran, N. (2009)Hamiltonella defensa, genome evolution of protective bacte-rial endosymbiont from pathogenic ancestors. Proc Natl AcadSci USA 106: 9063–9068.

Dineen, S., Bradshaw, M., Karasek, C. and Johnson, E. (2004)Nucleotide sequence and transcriptional analysis of the typeA2 neurotoxin gene cluster in Clostridium botulinum. FEMSMicrobiol Lett 235: 9–16.

Domenighini, M., Relman, D., Capiau, C., Falkow, S., Prugnola,A. and Scarlato, V. (1990) Genetic characterization of Borde-tella pertussis filamentous haemagglutinin: a protein pro-cessed from an unusually large precursor. Mol Microbiol 4:787–800.

Do Vale, A., Costa-Ramos, C., Silva, A., Silva, D., Gärtner, F. andSantos, N. (2007) Systemic macrophage and neutrophildestruction by secondary necrosis induced by a bacterialexotoxin in a Gram-negative septicaemia. Cell Microbiol 9:988–1003.

Duron, O., Bouchon, D., Boutin, S., Bellamy, L., Zhou, L., Engelstädter, J. et al. (2008) The diversity of reproductive parasites among arthropods: Wolbachia do not walk alone. BMC Biol 6: 27–39.

Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797.

Ferree, P., Avery, A., Azpurua, J., Wilkes, T. and Werren, J. (2008) A bacterium targets maternally inherited centrosomes to kill males in Nasonia. Curr Biol 18: 1409–1414.

ffrench-Constant, R., Waterfield, N., Daborn, P., Joyce, S., Bennett, H. and Au, C. (2003) Photorhabdus: towards a functional genomic analysis of a symbiont and pathogen. FEMS Microbiol Lett 26: 433–456.

Fuchs, T., Bresolin, G., Marcinowski, L., Schachtner, J. and Scherer, S. (2008) Insecticidal genes of Yersinia spp.: taxonomical distribution, contribution to toxicity towards Manduca sexta and Galleria mellonella, and evolution. BMC Microbiol 8: 214.

Galan, J. (1996) Molecular genetic bases of Salmonella entry into host cells. Mol Microbiol 20: 263–271.

Galán, J. (2001) Salmonella interactions with host cells: type III secretion at work. Annu Rev Cell Dev Biol 17: 53–86.

Galan, J. and Bliska, J. (1996) Cross-talk between bacterial pathogens and their host cells. Annu Rev Cell Dev Biol 12: 221–255.

Gherna, R.L., Werren, J.H., Wisburg, W., Cote, R., Woese, C.R., Mandelco, L. et al. (1991) Arsenophonus nasoniae gen. nov., sp. nov., the causative agent of the son-killer trait in the parasitic wasp Nasonia vitripennis. Int J Syst Bacteriol 41: 563–565.

Guindon, S. and Gascuel, O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52: 696–704.

Haine, E. (2008) Symbiont-mediated protection. Proc Biol Sci 275 (1633): 353–361.

Hersh, D., Monack, D., Smith, M., Ghori, N., Falkow, S. and Zychlinsky, A. (1999) The Salmonella invasin SipB induces macrophage apoptosis by binding to caspase-1. Proc Natl Acad Sci USA 96: 2396–2401.

Hines, J., Skrzypek, E., Kajava, A. and Straley, S. (2001) Structure-function analysis of Yersinia pestis YopM’s interaction with alpha-thrombin to rule on its significance in systemic plague and to model YopM’s mechanism of binding host proteins. Microb Pathog 30: 193–209.

Hueck, C. (1998) Type III protein secretion systems in bacterial pathogens of animals and plants. Microbiol Mol Biol Rev 62: 379–433.

Huger, A., Skinner, S. and Werren, J. (1985) Bacterial infections associated with the son-killer trait in the parasitoid wasp Nasonia (= Mormoniella) vitripennis (Hymenoptera: Pteromalidae). J Invertebr Pathol 46: 272–280.

Hurst, G.D.D., Jiggins, F.M. and Majerus, M.E.N. (2003) Inherited microorganisms that selectively kill male hosts: the hidden players of insect evolution? In Insect Symbiosis (Bourtzis, K. and Miller, T.A., eds), pp. 177–198. CRC Press, Boca Raton, FL.

Hypsa, V. and Dale, C. (1997) In vitro culture and phylogenetic analysis of ‘Candidatus Arsenophonus triatominarum,’ an intracellular bacterium from the triatomine bug, Triatoma infestans. Int J Syst Bacteriol 47: 1140–1144.

Kellner, R. (2002) Molecular identification of an endosymbiotic bacterium associated with pederin biosynthesis in Paederus sabaeus (Coleoptera: Staphylinidae). Insect Biochem Mol Biol 32: 389–395.

Kim, D., Lenzen, G., Page, A.L., Legrain, P., Sansonetti, P. and Parsot, C. (2005) The Shigella flexneri effector OspG interferes with innate immune responses by targeting ubiquitin-conjugating enzymes. Proc Natl Acad Sci USA 102: 14046–14051.

Knodler, L. and Steele-Mortimer, O. (2005) The Salmonella effector PipB2 affects late endosome/lysosome distribution to mediate Sif extension. Mol Biol Cell 16: 4108–4123.

Kobe, B. and Kajava, A. (2001) The leucine-rich repeat as a protein recognition motif. Curr Opin Struct Biol 11: 725–732.

Lally, E., Hill, R., Kieba, L. and Korostoff, J. (1999) The interaction between RTX toxins and target cells. Trends Microbiol 7: 356–361.

Lambalot, R., Gehring, A., Flugel, R., Zuber, P., LaCelle, M. and Marahiel, M. (1996) A new enzyme superfamily – the phosphopantetheinyl transferases. Chem Biol 3: 923–936.

Mammarappallil, J. and Elsinghorst, E. (2000) Epithelial cell adherence mediated by the enterotoxigenic Escherichia coli tia protein. Infect Immun 68: 6595–6601.

Marchler-Bauer, A., Anderson, J.B., Derbyshire, M.K., DeWeese-Scott, C., Gonzales, N.R., Gwadz, M. et al. (2007) CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 35: D237–D240.

Mittl, P. and Schneider-Brachert, W. (2007) Sel1-like repeat proteins in signal transduction. Cell Signal 19: 20–31.

Moran, N., Degnan, P., Santos, S., Dunbar, H. and Ochman, H. (2005) The players in a mutualistic symbiosis: insects, bacteria, viruses, and virulence genes. Proc Natl Acad Sci USA 102: 16919–16926.

Munch, A., Stingl, L., Jung, K. and Heermann, R. (2008) Photorhabdus luminescens genes induced upon insect infection. BMC Genomics 9: 229.

Natale, P., Brüser, T. and Driessen, A. (2008) Sec- and Tat-mediated protein secretion across the bacterial cytoplasmic membrane – distinct translocases and mechanisms. Biochim Biophys Acta 1778: 1735–1756.

Norris, F., Wilson, M., Wallis, T., Galyov, E. and Majerus, P. (1998) SopB, a protein required for virulence of Salmonella dublin, is an inositol phosphate phosphatase. Proc Natl Acad Sci USA 95: 14057–14059.

Orth, K., Xu, Z., Mudgett, M., Bao, Z., Palmer, L. and Bliska, J. (2000) Disruption of signaling by Yersinia effector YopJ, a ubiquitin-like protein protease. Science 290: 1594–1597.

Payne, S. and Finkelstein, R. (1978) The critical role of iron in host-bacterial interactions. J Clin Invest 61: 1428–1440.

Perotti, M., Allen, J., Reed, D. and Braig, H. (2007) Host-symbiont interactions of the primary endosymbiont of human head and body lice. FASEB J 21: 1058–1066.

Picking, W., Nishioka, H., Hearn, P., Baxter, M., Harrington, A. and Blocker, A. (2005) IpaD of Shigella flexneri is independently required for regulation of Ipa protein secretion and efficient insertion of IpaB and IpaC into host membranes. Infect Immun 73: 1432–1440.

Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R. et al. (2005) InterProScan: protein domains identifier. Nucleic Acids Res 33: W116–W120.

Rietsch, A., Vallet-Gely, I., Dove, S. and Mekalanos, J. (2005) ExsE, a secreted regulator of type III secretion genes in Pseudomonas aeruginosa. Proc Natl Acad Sci USA 102: 8006–8011.

Rivers, D.B. and Denlinger, D.L. (1994) Developmental fate of the flesh fly, Sarcophaga bullata, envenomated by the pupal ectoparasitoid, Nasonia vitripennis. J Insect Physiol 40: 121–127.

Sha, J., Kozlova, E.V. and Chopra, A.K. (2002) Role of various enterotoxins in Aeromonas hydrophila-induced gastroenteritis: generation of enterotoxin gene-deficient mutants and evaluation of their enterotoxic activity. Infect Immun 70: 1924–1935.

Skinner, S. (1985) Son-killer: a third extrachromosomal factor affecting the sex ratio in the parasitoid wasp, Nasonia (= Mormoniella) vitripennis. Genetics 109: 745–759.

Smith, S.G.J., Mahon, V., Lambert, M.A. and Fagan, R.P. (2007) A molecular Swiss army knife: OmpA structure, function and expression. FEMS Microbiol Lett 273: 1–11.

Soto, G. and Hultgren, S. (1999) Bacterial adhesins: common themes and variations in architecture and assembly. J Bacteriol 181: 1059–1071.

Talavera, G. and Castresana, J. (2007) Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol 56: 564–577.

Tao, K., Yu, X., Liu, Y., Shi, G., Liu, S. and Hou, T. (2007) Cloning, expression, and purification of insecticidal protein Pr596 from locust pathogen Serratia marcescens HR-3. Curr Microbiol 55: 228–233.

Tenor, J., McCormick, B., Ausubel, F. and Aballay, A. (2004) Caenorhabditis elegans-based screen identifies Salmonella virulence factors required for conserved host-pathogen interactions. Curr Biol 14: 1018–1024.

Thao, M. and Baumann, P. (2004) Evidence for multiple acquisition of Arsenophonus by whitefly species (Sternorrhyncha: Aleyrodidae). Curr Microbiol 48: 140–144.

Tikhonova, E., Devroy, V., Lau, S. and Zgurskaya, H. (2007) Reconstitution of the Escherichia coli macrolide transporter: the periplasmic membrane fusion protein MacA stimulates the ATPase activity of MacB. Mol Microbiol 63: 895–910.

Toh, H., Weiss, B.L., Perkin, S.A.H. et al. (2006) Massive genome erosion and functional adaptations provide insights into the symbiotic lifestyle of Sodalis glossinidius in the tsetse host. Genome Res 16 (2): 149–156.

Vallet-Gely, I., Lemaitre, B. and Boccard, F. (2008) Bacterial strategies to overcome insect defences. Nat Rev Microbiol 6: 302–313.

Waterfield, N., Bowen, D., Fetherston, J., Perry, R. and ffrench-Constant, R. (2001) The tc genes of Photorhabdus: a growing family. Trends Microbiol 9: 185–191.

Waterfield, N., Daborn, P. and ffrench-Constant, R. (2004) Insect pathogenicity islands in the insect pathogenic bacterium Photorhabdus. Physiol Entomol 29: 240–250.

Waterfield, N., Hares, M., Yang, G., Dowling, A. and ffrench-Constant, R. (2005) Potentiation and cellular phenotypes of the insecticidal Toxin complexes of Photorhabdus bacteria. Cell Microbiol 7: 373–382.

Weiss, B.L., Wu, Y., Schwank, J.J., Tolwinski, N.S. and Aksoy, S. (2008) An insect symbiosis is influenced by bacterium-specific polymorphisms in outer-membrane protein A. Proc Natl Acad Sci USA 105: 15088–15093.

Werren, J.H. (2004) Arsenophonus. Bergey’s Manual of Systematic Bacteriology, Vol. 2 (Garrity, G.M., ed.), Springer-Verlag, New York.

Werren, J.H., Skinner, S. and Huger, A. (1986) Male-killing bacteria in a parasitic wasp. Science 231: 990–992.

Werren, J.H., Richards, S., Desjardins, C.A., Niehuis, O., Gadau, J., Colbourne, J.K. et al. (2010) Functional and evolutionary insights from the genomes of three parasitoid Nasonia species. Science, doi:10.1126/science.1178028.

Yahr, T., Vallis, A., Hancock, M., Barbieri, J. and Frank, D. (1998) ExoY, an adenylate cyclase secreted by the Pseudomonas aeruginosa type III system. Proc Natl Acad Sci USA 95: 13899–13904.

Zhang, Y., Higashide, W., McCormick, B., Chen, J. and Zhou, D. (2006) The inflammation-associated Salmonella SopA is a HECT-like E3 ubiquitin ligase. Mol Microbiol 62: 786–793.

Supporting Information

Additional Supporting Information may be found in the online version of this article under the DOI reference: DOI 10.1111/j.1365-2583.2009.00963.x

Figure S1. Two Arsenophonus nasoniae pentapeptide repeat-containing open reading frames and their virulence effector-like neighbours. Green arrows depict pentapeptide repeat regions, each containing 8 copies of a repeat that can be approximately described as A(D/N)LXX, where X can be any amino acid (Marchler-Bauer et al., 2007).

Figure S2. Multiple sequence alignment by ClustalX v2.0.11 of the N terminus of the Arsenophonus nasoniae OmpA-like open reading frame (ARN_16100) with OmpA from SOPE (Sitophilus oryzae principal endosymbiont, GenBank accession no. EU426969), Sodalis (BAE74305), CMS (C. melbae symbiont, EU684475), Hamiltonella defensa (EU682308), Photorhabdus luminescens (NP929054), Yersinia pestis (NP670036), Escherichia coli 536 (UPEC, CP000247), Salmonella typhimurium (X02006) and Shigella flexneri (AF234271). External loop structures are underlined and labelled L1-4. Based on Fig. 3 of Weiss et al. (2008).

Figure S3. Phylogenetic analysis of Arsenophonus nasoniae ABC-like ORFs. Phylogeny A shows all twenty-one open reading frames (ORFs) annotated as ABC-like, with BLAST hits from OrthoMCL (http://www.orthomcl.org/) of e-value < e-100, aligned using MUSCLE (Edgar, 2004) with well aligned blocks extracted using Gblocks (Talavera & Castresana, 2007), and assembled using PhyML (Guindon & Gascuel, 2003). Phylogenies B to L show individual analyses of ABC-like ORFs mentioned in the text. Phylogenies B to E show A. nasoniae ORFs ARN_01800-01830, iron ABC transporter-like ORFs. Phylogenies F to I show A. nasoniae ORFs ARN_00840-00860 and ARN_19400, RTX ABC transporter-like ORFs. Phylogenies J to L show A. nasoniae ORFs ARN_29850-29870, feoABC-like ORFs. Phylogenies were created using the best BLASTp hit to the first eight species in the NCBI nr database; alignments and neighbour-joining trees were assembled using ClustalX v2.0.11 (http://www.clustal.org/).

Table S1. Functions of open reading frames (ORFs) mentioned in the text and corresponding ORF reference numbers within the Arsenophonus nasoniae draft genome.

Table S2. Four open reading frames in the Yersinia-like Type III secretion system of Arsenophonus nasoniae without complete motifs relevant to their function. Lengths are given in amino acids (AA).

Table S3. Open reading frames in the Arsenophonus nasoniae genome with potential contribution to Polyketide synthesis.

Table S4. A list of Arsenophonus nasoniae open reading frames predicted to contain a Signal Peptide, their location within the A. nasoniae draft genome and putative annotation, as ascertained from reciprocal BLAST.

Table S5. Haemagglutination activity domain (HAD) containing open reading frames (ORFs) in the Arsenophonus nasoniae genome. ORF lengths are given in amino acids (aa). Domains were identified using a Conserved Domain Database (CDD) search (Marchler-Bauer et al., 2007). Presence of a signal peptide was estimated using neural networks (NN) and hidden Markov models (HMM) trained on Gram-negative bacteria.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
