Freshwater bacterial lifestyles inferred from comparative genomics

Joshua A. Livermore,1 Scott J. Emrich,1,2 John Tan3,4 and Stuart E. Jones1,4*

1 Notre Dame Environmental Change Initiative, University of Notre Dame, 100 Galvin Life Sciences, Notre Dame, IN 46556, USA.
2 Department of Computer Science and Engineering, University of Notre Dame, 384 Fitzpatrick Hall, Notre Dame, IN 46556, USA.
3 Eck Institute for Global Health, University of Notre Dame, 107 Galvin Life Sciences, Notre Dame, IN 46556, USA.
4 Department of Biological Sciences, University of Notre Dame, 264 Galvin Life Sciences, Notre Dame, IN 46556, USA.

Received 18 January, 2013; revised 28 April, 2013; accepted 19 June, 2013. *For correspondence. E-mail [email protected]; Tel. (574) 631 5703; Fax (574) 631 7413.

Environmental Microbiology (2014) 16(3), 746–758 doi:10.1111/1462-2920.12199
© 2013 Society for Applied Microbiology and John Wiley & Sons Ltd

Summary

While micro-organisms actively mediate and participate in freshwater ecosystem services, we know little about freshwater microbial genetic diversity. Genome sequences are available for many bacteria from the human microbiome and the ocean (over 800 and 200, respectively), but only two freshwater genomes are currently available: the streamlined genomes of Polynucleobacter necessarius ssp. asymbioticus and the Actinobacterium AcI-B1. Here, we sequenced and analysed draft genomes of eight phylogenetically diverse freshwater bacteria exhibiting a range of lifestyle characteristics. Comparative genomics of these bacteria reveals putative freshwater bacterial lifestyles based on differences in predicted growth rate, capability to respond to environmental stimuli and diversity of useable carbon substrates. Our conceptual model based on these genomic characteristics provides a foundation on which further ecophysiological and genomic studies can be built. In addition, these genomes greatly expand the diversity of existing genomic context for future studies on the ecology and genetics of freshwater bacteria.

Introduction

Human survival and prosperity require freshwater for basic functions including drinking, agriculture, recreation and sanitation, but freshwater availability is threatened by overuse and pollution. While micro-organisms underpin the majority of these ecosystem services, we know little about freshwater microbial genetic diversity (Reid et al., 2005). Genome sequences of over 200 marine bacteria (Yooseph et al., 2010) have enabled great advances in our understanding of the functional capacity and ecology of marine bacteria, including the roles they play in nutrient recycling and carbon fixation among others (DeLong and Karl, 2005). Additionally, the human microbiome project has revealed much about genetic diversity of human symbionts, generating greater than 800 reference bacterial genomes with a goal of 3000 (Methé et al., 2012). In stark contrast, the only lake bacteria with published genome sequences are Polynucleobacter necessarius ssp. asymbioticus (Pnec; Newton et al., 2011; Hahn et al., 2012) and the Actinobacterium AcI-B1 (Garcia et al., 2012).

Polynucleobacter has been cultured from virtually every type of freshwater habitat (Hahn, 2003) with abundance ranging from the detection limit to 67% of bacterioplankton cells, and they, on average, comprise approximately 20% of all cells in any given freshwater body (Jezberová et al., 2010). Recent work indicates that one strain of this broader taxonomic group is indeed passive (no known genomic mechanisms for response to the environment) and highly substrate specialized with a small streamlined genome (2.1 Mb) and no obvious genetic markers for motility or cell-cell signalling (Hahn et al., 2012). Similar to Pnec, the Actinobacteria tribe AcI-B1 is also highly abundant in lakes (up to 70% of all cells; Warnecke et al., 2005) with evidence of ubiquity by diversification through niche partitioning (Newton et al., 2007). Additionally, recent genome analysis of a single amplified genome from tribe AcI-B1 suggests a passive specialized lifestyle with a streamlined genome (Garcia et al., 2012).

Based on these two genome sequences, one might assume all lake bacteria possess streamlined genomes and tend to have a passive and planktonic lifestyle. However, spatial and temporal surveys (Newton et al., 2006; Eiler and Bertilsson, 2007) as well as ecological theory would suggest that a single lifestyle strategy for all
freshwater bacteria is highly unlikely. For example, Bacteroidetes taxa can numerically dominate lakes following an influx of dissolved organic carbon, many Gammaproteobacteria use freshwater bodies as a transmission vector and freshwater Verrucomicrobia are globally pervasive, although scarce in any given water body. We also know that much of the bacterial diversity in lakes lies dormant, roused by an influx of resources or shifts in other environmental features (Jones and Lennon, 2010).

To expand our understanding of freshwater bacterial lifestyles and their genetic underpinnings, we sequenced the genomes of eight lake bacterial isolates. To maximize capture of novel freshwater genetic diversity, we selected a phylogenetically diverse set of previously isolated strains. We contrasted our eight draft genome sequences with the Pnec genome and the single-cell amplified genome of AcI-B1. In addition, we compared these 10 freshwater bacterial genomes with phylogenetically similar bacterial genome sequences from soil and marine environments. We conclude that the distribution of genomic characteristics among these freshwater isolates suggests a trade-off between carbon substrate diversity and growth rate: isolates with augmented substrate capabilities exhibited slow predicted growth rates, whereas fast-growing isolates had reduced substrate utilization capacity. Both of these strategies contrasted with the passive, slow and specialist habits of the previously sequenced Pnec and AcI-B1 genomes.

Results

Taxonomic identities of freshwater bacterial genomes

We selected for sequencing a phylogenetically diverse set of eight freshwater bacteria, taxonomically identified using 16S rRNA gene maximum likelihood phylogenies with a freshwater-specific reference alignment. Two genomes represented each of the Alphaproteobacteria, Betaproteobacteria and Gammaproteobacteria, while the remaining two genomes were from the Actinobacteria and the Flavobacteria class of Bacteroidetes. A summary of inferred taxonomy and a 16S rRNA gene tree outline the phylogenetic context for each bacterium (Table 1 and Fig. S1). Four of the eight genomes are members of previously defined freshwater tribes, while the Flavobacterium and a Gammaproteobacterium were classifiable only to the clade level. Of the two Betaproteobacteria, one was classifiable to lineage and the other to class (i.e. no more information than ‘Betaproteobacteria’).

Cross-ecosystem comparison of genome content

We compared subsystem category distributions [assigned by Rapid Annotation using Subsystem Technology (RAST)] of freshwater bacteria genomes with close phylogenetic relatives from marine and soil environments. Principal components analysis of the subsystem categories of the full set of 30 genomes indicated separation of soil and marine environments (freshwater overlapping) in the first dimension (28% of variance) and taxonomy in the second (14% of variance), illustrated as plots of 95% confidence interval ellipses based on environment (Fig. 1A) and class (Fig. 1B). The RAST subsystem category that contributed most to PC 1 was Protein metabolism (subsystem and subsystem category names are noted with italics), with marine and freshwater bacteria having a higher rank (lower numbers indicate higher rank) in this category than the soil bacteria (average ranks 4.3 ± 2.0, 4.4 ± 2.2 and 5.3 ± 1.1, respectively; Fig. 1C). DNA metabolism, Cell division and cell cycle and Nucleosides and nucleotides were

Table 1. Phylogenetic classifications of newly sequenced freshwater bacterial genomes.

| Freshwater isolate | Class | Most specific classification level | Classification | Predicted characteristics |
| LLX17 | Actinobacteria | Tribe | acTH2 Myco | Closely affiliated phylogenetically with Mycobacterium. Known clones are from Lake Taihu in China. |
| L41A | Alphaproteobacteria | Tribe | alfII Brev | Associated with promotion or inhibition of Cyanobacteria. |
| LLX12A | Alphaproteobacteria | Tribe | alfIV alfIV-B M-L-85 | Predicted capacity to utilize humic substances or other difficult compounds based on isolation from humic lakes using phenol enrichment. |
| WG36 | Gammaproteobacteria | Clade | gamII gamII-A | Potentially abundant in lakes, but also associated with terrestrial animals in databases. |
| L18 | Gammaproteobacteria | Tribe | gamIV gamIV-A Pseudo A1 | Potentially abundant in lakes, but also associated with terrestrial animals in databases. |
| WG21 | Flavobacteria | Clade | bacII bacII-A | Flavobacteria-like lineage enriched during periods of high dissolved organic carbon. |
| FWI2 | Betaproteobacteria | Lineage | betII (Pnec) | Cosmopolitan freshwater group. |
| L13 | Betaproteobacteria | Subphylum | Betaproteobacteria | Best BLAST hits are to organisms in genus Vogesella, which are commonly isolated from freshwater bodies. |

Classifications determined by position of 16S rRNA gene sequences in a RAxML maximum likelihood phylogeny with query sequences aligned to a hand-curated reference freshwater 16S rRNA gene alignment using the Silva incremental aligner (Sina). Where available, predicted characteristics are from summaries in Newton et al. (2011).


also strongly correlated to the first principal component. However, there were no compositional differences between the three environments for subsystems in these categories, indicating that within our genome set, marine bacteria allocate a greater overall proportion of genomic space to Protein metabolism, DNA metabolism, Cell division and cell cycle and Nucleosides and nucleotides, but not necessarily to any specific process or processes therein. Correlated to PC 1 in the opposite direction were Sulfur metabolism, Regulation and cell signalling and Membrane transport, indicating greater importance of these categories in soil bacteria as a whole and also some freshwater bacteria. The greatest subsystem category contributor to the second principal component is Cell wall and capsule, with Actinobacteria and Alphaproteobacteria allocating proportionally less than other classes (average ranks: Actinobacteria, 12.1 ± 2.4; Alphaproteobacteria, 11.3 ± 2.1; Betaproteobacteria, 8.9 ± 3.3; Flavobacteria, 7.0 ± 0.8; Gammaproteobacteria, 7.3 ± 1.3) but again without evident compositional differences within the

Fig. 1. Summary of protein subsystem categories from RAST annotations of 10 freshwater bacterial genomes with phylogenetic pairings of both soil and marine bacteria.
A. The first two axes of a principal component analysis (PCA) on subsystem category proportions with ellipses on 95% confidence intervals of environment classifications (Freshwater, Soil, Marine; proportion of variance: PCA 1, 0.28; PCA 2, 0.14).
B. The same PCA with ellipses on class taxonomic level (Gammaproteobacteria, Betaproteobacteria, Flavobacteria, Alphaproteobacteria, Actinobacteria).
C. Rank of subsystem categories in each genome, arranged by a 16S rRNA gene PhyML phylogeny (GTR model). Solid dots on a dendrogram branch indicate greater than 90% approximate likelihood ratio test support, open dots indicate greater than 70% support, and remaining values are shown. The most abundant categories were assigned value ‘1’ and the least abundant ‘25’. Eigenvalues of the first two principal component axes are also shown.


category. RNA metabolism was also correlated with PC 2, and Fatty acids, lipids and isoprenoids was negatively correlated.
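The rank convention used for Fig. 1C (the most abundant subsystem category in a genome receives rank 1) can be illustrated with a minimal Python sketch. The genome names and ORF counts below are hypothetical placeholders rather than values from this study, and the tie rule (tied categories share the better rank, as in "tied for 1st") is an assumption.

```python
from statistics import mean, stdev


def rank_categories(orf_counts):
    """Rank categories by ORF count: 1 = most abundant.

    Tied categories share the best (lowest) rank, e.g. 1, 1, 3.
    """
    ordered = sorted(orf_counts.values(), reverse=True)
    return {category: ordered.index(count) + 1
            for category, count in orf_counts.items()}


# Hypothetical ORF counts per subsystem category (illustration only).
genomes = {
    "genome_a": {"Amino acids and derivatives": 300, "Carbohydrates": 280,
                 "Protein metabolism": 200, "Respiration": 90},
    "genome_b": {"Amino acids and derivatives": 250, "Carbohydrates": 310,
                 "Protein metabolism": 250, "Respiration": 120},
    "genome_c": {"Amino acids and derivatives": 150, "Carbohydrates": 140,
                 "Protein metabolism": 160, "Respiration": 80},
}

# Rank of one category in every genome, then the "average rank ± SD" summary.
protein_ranks = [rank_categories(counts)["Protein metabolism"]
                 for counts in genomes.values()]
summary = (mean(protein_ranks), stdev(protein_ranks))
print(protein_ranks, summary)
```

Averaging such per-genome ranks within an environment or class gives summaries of the form "4.3 ± 2.0"; the reading of ± as a standard deviation is likewise an assumption of this sketch.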

Subsystem assignments of freshwater bacterial open reading frames

Most freshwater genomes indicate heavy investment in subsystem category Amino acids and derivatives, followed by Carbohydrates; Cofactors, vitamins, prosthetic groups and pigments; and Protein metabolism (Fig. 1C). All of the freshwater bacteria except Pnec (and 29 of 30 total taxa examined) have Amino acids and derivatives or Carbohydrates as the highest-ranked category, and five of the freshwater genomes have both as the top 2 (24 of 30 total taxa examined, Fig. 1C). Polynucleobacter necessarius ssp. asymbioticus has lower investment in Amino acids and derivatives (ranked 3rd) and Carbohydrates (ranked 5th) with proportionally higher investment in Protein metabolism and Cofactors, vitamins, prosthetic groups and pigments (tied for 1st).

Closely examining AcI-B1 and Pnec proportional subsystem category investment, it is clear that both invest outside the range exhibited by the eight new genomes (Fig. 2). While the freshwater isolates we sequenced uniformly allocate about 14 ± 1% of genome space to category Amino acids and derivatives, Pnec and AcI-B1 allocate 11% and 20%, respectively. In contrast, Pnec invests relatively heavily in Cofactors, vitamins, prosthetic groups and pigments (11%), while AcI-B1 has low outlying investment in this category (6%; average for all freshwater genomes is 9 ± 1.7%). Both AcI-B1 and Pnec have disproportionately high allocation in Protein metabolism (12% and 11%, respectively, versus 7 ± 2.4% average) and in Respiration (5.4% each versus 4 ± 0.8% average in the other eight genomes). Polynucleobacter necessarius ssp. asymbioticus and AcI-B1 tended to have disproportionately high or low allocation to the majority of subsystems.
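The interval construction named in the Fig. 2 caption (95% confidence interval, t distribution, d.f. = 7) can be sketched as follows. The eight isolate proportions are hypothetical values chosen only to resemble the "about 14%" allocation reported above. The Pnec and AcI-B1 values (11% and 20%) are taken from the text, and 2.365 is the standard two-sided 95% critical value of Student's t for 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF7 = 2.365  # two-sided 95% critical value of Student's t, d.f. = 7


def t_interval(values, t_crit=T_CRIT_DF7):
    """95% confidence interval for the mean of `values` (n = 8 gives d.f. = 7)."""
    centre = mean(values)
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


def outside(value, interval):
    """True if `value` lies outside the closed interval (low, high)."""
    low, high = interval
    return value < low or value > high


# Hypothetical proportions of classified ORFs in one category for the
# eight new isolates (illustration only, not the study's data).
isolates = [0.13, 0.14, 0.15, 0.14, 0.13, 0.15, 0.14, 0.14]
interval = t_interval(isolates)

# Values reported in the text for Amino acids and derivatives.
for name, proportion in {"Pnec": 0.11, "AcI-B1": 0.20}.items():
    print(name, proportion, "outside interval:", outside(proportion, interval))
```

Because this is an interval for the mean of the eight isolates, it is narrower than the spread of the individual genomes, which is one reason a single genome can easily fall outside it.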

Regarding carbohydrate pathways, genome content for carbohydrate usage was highly variable among the genomes, although several metabolic systems are conserved throughout [e.g. Methylglyoxal metabolism, Pyruvate metabolism, Glycolysis and gluconeogenesis and TCA cycle (tricarboxylic acid cycle)]. Also interesting are the substrate markers present in all but one or two genomes. FIGfams (the finest unit in the RAST hierarchy, specifying individual homologous proteins) associated with the Serine-glyoxylate cycle are present in all but two of the genomes (24 or more FIGfams), the exceptions being FWI2 and L41A. Another example is Glycerate metabolism, absent only in WG21. The isolates fell into three levels of potential for metabolism of aromatic compounds, with FWI2 and L18 having large numbers of FIGfams (145 and 172, respectively), WG36 and LLX12A having an intermediate number (46 and 64), and the remainder having fewer than 20 FIGfams in the category.

Carbohydrate and Metabolism of aromatic compounds subsystem categories likely serve as important indicators

Fig. 2. Estimates of 95% confidence intervals of subsystem category proportions of eight freshwater isolate genomes (t distribution, d.f. = 7), with Pnec proportions represented by a triangle and AcI-B1 proportions indicated with a square. Horizontal axis: fraction of classified ORFs in subsystem (0.00 to 0.15).


of lifestyle for heterotrophic bacteria, and an ordered presence-absence plot of those subsystems indicates two different strategies (Fig. 3). All the bacteria share a core set of substrates and processes, each of which is present in at least 7 of the 10 genomes. Bacteria with small genomes (< 4 Mb; L13, LLX17, L41A and Pnec) have very few additional substrates and processes supplementing their core (5 ± 1 additional subsystems). In contrast, the large genomes (> 5 Mb; FWI2, WG21, LLX12A, L18 and WG36) each carry a large and unique substrate complement (24 ± 7 additional subsystems).
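The core versus accessory split described above amounts to counting, for each subsystem, how many genomes carry it, and then counting accessory subsystems per genome. A minimal sketch of that bookkeeping follows. The genome labels g1 to g4, the presence calls and the threshold of 3 of 4 genomes are all hypothetical, and only the subsystem names are borrowed from Fig. 3.

```python
def split_core_accessory(presence, min_core):
    """Split subsystems into core (found in >= min_core genomes) and accessory.

    `presence` maps each subsystem name to the set of genomes carrying it.
    """
    core = {name for name, carriers in presence.items()
            if len(carriers) >= min_core}
    accessory = set(presence) - core
    return core, accessory


def accessory_count(presence, accessory, genome):
    """Number of accessory subsystems present in one genome."""
    return sum(1 for name in accessory if genome in presence[name])


# Hypothetical presence calls for four made-up genomes (illustration only).
presence = {
    "TCA Cycle": {"g1", "g2", "g3", "g4"},
    "Glycolysis and gluconeogenesis": {"g1", "g2", "g3"},
    "Xylose utilization": {"g4"},
    "Toluene degradation": {"g3", "g4"},
}

core, accessory = split_core_accessory(presence, min_core=3)
for genome in ("g1", "g2", "g3", "g4"):
    print(genome, accessory_count(presence, accessory, genome))
```

With the 10 genomes of this study, the "at least 7 of the 10" criterion in the text would correspond to `min_core=7`. The threshold is kept as a parameter because the Fig. 3 caption words the cutoff slightly differently.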

Fig. 3. Ordered presence-absence map of carbohydrate and aromatics metabolism subsystems of 10 freshwater bacteria. A black box indicates the subsystem is present, and a white box indicates absence. Genomes and subsystems were ordered using Hclust in R (Ward clustering, dendrograms shown). Subsystems form two distinct clusters, with those in greater than 7 of 10 genomes labelled ‘core’ and those in less than 4 of 10 genomes labelled ‘accessory’. Genomes, in plotted order: AcI-B1, LLX17, L13, Pnec, L41A, FWI2, WG21, LLX12A, L18 and WG36, grouped by genome size (< 4 Mb and > 5 Mb).

Examination of subsystem categories for use of other elements (S, N, P and Fe) reveals further variation in lifestyle characteristics (Table S2). In the Sulfur metabolism category, 7 of 10 genomes have open reading frames (ORFs) classified as inorganic sulfur-assimilation genes (excluding Pnec, AcI-B1 and LLX12A). In contrast, the only two genomes with classified ORFs for Sulfur oxidation were Pnec (18 FIGfams) and FWI2 (9 FIGfams). In the Nitrogen metabolism category, all but two genomes had ORFs for both Nitrate/nitrite utilization and Ammonia assimilation, the exceptions being L41A and AcI-B1, which lack FIGfams for use of nitrate/nitrite. Additionally, genomes for FWI2 and L13 had apparent ORFs for denitrification (13 and 8 FIGfams, respectively). For Phosphate metabolism, genome WG21 lacked High affinity phosphate transporter FIGfams, which were found in the nine other genomes. FWI2, L18 and WG36 had greater than nine FIGfams for Alkylphosphonate utilization or Phosphonate metabolism, while L41A, LLX17 and Pnec each had none and the remaining three genomes had four or fewer FIGfams. Iron acquisition and metabolism FIGfams were concentrated most heavily in FWI2, L18, WG36 and L13 (38, 70, 36 and 27 FIGfams, respectively), with the remaining genomes harbouring eight or fewer. WG36 had FIGfams for Aerobactin and Enterobactin siderophores, while L18 had Enterobactin only. It is important to note that while assertions of presence and absence are tenuous with draft genomes, a survey of 31 marker genes (Wu and Scott, 2012) yields full recovery in six of the eight genomes and greater than 93% in the other two (30 in WG21 and 29 in L13). Therefore, it is reasonable to make inferences based on gene presence and absence in these genomes.
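The marker-gene recovery figures quoted above follow from simple arithmetic on the counts given in the text (31 marker genes; 30 recovered in WG21 and 29 in L13). A short sketch, in which the function and variable names are illustrative only:

```python
MARKER_TOTAL = 31  # marker genes surveyed, as stated in the text


def recovery_percent(found, total=MARKER_TOTAL):
    """Percentage of marker genes recovered in a draft genome."""
    return 100.0 * found / total


# Counts reported for the two genomes without full recovery.
recovered = {"WG21": 30, "L13": 29}
percentages = {name: round(recovery_percent(count), 1)
               for name, count in recovered.items()}
print(percentages)
```

Here 30 of 31 corresponds to about 96.8% and 29 of 31 to about 93.5%.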

While the distributions of subsystem categories among the isolates indicate a range of aquatic lifestyles, we can also examine specific investment in well-established lifestyle markers (Fig. 4). Polynucleobacter necessarius ssp. asymbioticus and AcI-B1 are outliers, with relatively low investment in all of signal transduction, extracytoplasmic factors (ECFs) and motility. Additionally, of all 10 freshwater genomes, five have greater than 31 ORFs associated with motility (LLX12A, L41A, L18, L13 and FWI2), while the remainder have fewer than 17 (two in AcI-B1 and one in Pnec). Polynucleobacter necessarius ssp. asymbioticus also possessed the second longest predicted minimum generation time of all the genomes, exceeded only by FWI2 (which clusters with Pnec phylogenetically). The predicted fastest growing bacteria were L18 and WG36, which are both Gammaproteobacteria.

Discussion

We sequenced eight phylogenetically diverse isolates from freshwater lakes to contribute context and examine bacteria spanning a wide spectrum of ecophysiological lifestyles. Polynucleobacter necessarius ssp. asymbioticus and AcI-B1 epitomize the passive, oligotrophic specialist lifestyle (slow growth, small genome, non-motile and low signal transduction; Garcia et al., 2012; Hahn et al., 2012), while our newly sequenced genomes represent a spectrum of predicted lifestyle characteristics.

Comparison of freshwater genomes with those from soil and marine environments

Comparison of subsystem category distributions suggests that the average lifestyle of a freshwater bacterium is somewhere between that of a soil and a marine bacterium. Specifically, freshwater genomes significantly overlap with the relatively distinct soil and marine genomes in the first two axes of a principal component analysis based on subsystem category distributions. In some sense, this observation is intuitive, given that soil and marine environments represent two extremes of a hydrological continuum, while freshwater lakes are one intermediate step between them, as run-off from the land accumulates in lakes before draining to oceans. Taxonomic discrimination at the class level

Fig. 4. Genome characteristics of 10 freshwater bacterial genomes including number of ORFs, GC (proportion of cytosine and guanine) content, percent of ORFs associated with membrane transport, number of ORFs identified as ECFs, percent of ORFs associated with signal transduction, number of ORFs associated with motility and predicted minimum generation time. The two previously published cosmopolitan and genomically streamlined taxa (Pnec and AcI-B1) are indicated with open bars.

Diverse freshwater bacteria genomes 751

© 2013 Society for Applied Microbiology and John Wiley & Sons Ltd, Environmental Microbiology, 16, 746–758

Page 7: Freshwater bacterial lifestyles inferred from comparative ...sjones20/ewExternalFiles/... · a hand-curated reference freshwater 16S rRNA gene alignment using the Silva incremental

in the second principal component indicates that phylog-eny is somewhat predictive of microbial lifestyle, with prop-erties conserved across the three environments. However,recall that to minimize the effect of phylogenetic variation,we deliberately chose closely related genomes for cross-ecosystem comparison. Therefore, this analysis does notnecessarily imply that environmental variation plays agreater role in determining gene content than phylogeny.Indeed, expected variation in a randomly selected set ofgenomes would likely be dominated by phylogenetic differ-ences irrespective of environmental origin.

Correspondence between previously predicted characteristics and genome content

Having determined the position of each of these eight genomes within a freshwater taxonomy (Newton et al., 2011), we can compare previously hypothesized characteristics of the group, based on occurrence patterns in temporal and spatial surveys, with the genomic characters observed here.

LLX17 is an actinobacterium belonging to lineage acTH2 tribe Myco, representatives of which have been found in Lake Taihu, a subtropical lake in China (Wu et al., 2007). Although LLX17 has a small genome, it has a relatively large proportion of signal transduction-associated genes and ECFs, in agreement with observations of biofilm formation in freshwater bodies (Rickard et al., 2004).

Phylogenetically, LLX12A groups with tribe AlfIV-BM-L-85, members of which have been isolated under phenol enrichment, suggesting a lifestyle based on degrading humic substances or other organic compounds (Hutalle-Schmelzer et al., 2010). This is supported by subsystem assignments of ORFs in LLX12A, which indicate that many FIGfams unique to LLX12A are categorized as involved with aromatic compound degradation (Table S1). The other Alphaproteobacterium, L41A, is in clade alfII Brev (Brevundimonas in Linnaean taxonomy), members of which are associated with cyanobacterial blooms (Berg et al., 2008). This agrees with the observation that most of the carbohydrate utilization subsystems in L41A involve small carbon compounds that might be associated with algae.

Representing the Flavobacteria, isolate WG21 is in lineage BacII clade A, members of which are suspected to be organic scavengers and degraders of polymers like cellulose and chitin (Kirchman, 2002) and sometimes dominate lakes following a period of high productivity (Newton et al., 2011). Concordant with predictions based on phylogeny, substrate utilization characteristics of WG21 indicate Chitin and N-acetylglucosamine utilization subsystems as well as Cellulosomes and a large range of other carbohydrate and aromatic compound utilization subsystems (Fig. 3 and Table S1).

The other genome with predicted Chitin and N-acetylglucosamine function is L13, which is a Betaproteobacterium. Our phylogeny was unable to resolve L13 any better than ‘Betaproteobacteria’, but a Basic Local Alignment Search Tool (BLAST) search of the L13 16S rRNA gene against the National Center for Biotechnology Information (NCBI) 16S database indicates that it is closely related to species in the freshwater Vogesella genus, which are known to utilize chitin (Jørgensen et al., 2010). The other Betaproteobacterium is FWI2, clustering with BetII, which is the cosmopolitan Pnec clade (named after Polynucleobacter). In agreement with this relationship, FWI2 shares several metabolic capabilities with Pnec not found in any of the other genomes examined, including FIGfams for Sulfur oxidation, Cyanate hydrolysis and a Siderophore assembly kit.

Of the two Gammaproteobacteria, WG36 clusters in GamII-A (unclear if A1 or A2), while L18 is in lineage GamIV clade A tribe Pseudo A1 (Pseudomonas). While found in lakes, both of these organisms are also at times associated with a terrestrial environment (Newton et al., 2011). Both of these genomes (and none of the others) have Carbon storage regulator FIGfams that are associated with infection and suggest a host component in their life cycle (Sze and Li, 2011). However, they also share the unique functionality (among the genomes analysed) to utilize the plant-associated compounds xylose and arabinose, which also indicates capacity to metabolize carbon compounds from leaf litter that may accumulate in lakes.

Genome characteristic distributions

Carbon metabolism. Based on substrate utilization pathway distributions (Fig. 3 and Table S1), there appear to be three general categories of organisms in this study. Both AcI-B1 and Pnec are highly reduced and have an extremely limited substrate range, while the next set (L41A, LLX17 and L13) are also reduced but have ancillary metabolic capabilities. All five of these organisms have small genomes (< 4 Mb). Members of the final group (WG21, WG36, LLX12A, FWI2 and L18) have larger genomes (> 5 Mb) with a core carbon utilization set augmented by a unique set of substrate utilization FIGfams, often including larger carbon compounds like polymers and aromatics. Predictably, a number of pathways are highly conserved among the 10 freshwater genomes, such as the TCA cycle, pyruvate metabolism and subsystems for utilization of small simple compounds such as lactate and glycerol (near the top of Fig. 3). However, all 10 of the genomes we consider also have genes for utilization of salicylate, quinate and gentisate, which are compounds derived from terrestrial plant matter. Conserved terrestrial carbon pathways suggest the importance of carbon availability in freshwater systems where leaf litter and other decaying terrestrial plant matter is a consistently available carbon source.

Environmental response. In general, the smaller genomes tend to be associated with a lower proportion of ORFs associated with signal transduction (R² = 0.5, P < 0.03), and three of the five appear to be largely non-motile (Fig. 4). In contrast, the augmented organisms have genomes greater than 5 Mb, and three of five appear to be motile. The number of membrane-transport genes detected correlates with genome size (R² = 0.62, P < 0.007), but this may simply be a genome size effect, as genome size is not related to the proportion of genomic investment in membrane transport (R² = 0.0081, P > 0.8). This may indicate that reduction in number of membrane transporters is proportional to total reduction in genome size. In other words, membrane transport proteins do not appear to be selectively retained or lost during genome streamlining. In contrast, signal transduction proteins disappear at a disproportionately high rate compared with the genome as a whole (i.e. larger genomes have a greater proportion of ORFs in signal transduction, as described earlier).
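The contrast between gene counts and gene proportions drawn above can be illustrated with a minimal Python sketch. The genome sizes are those quoted in the text for eight of the genomes; the ORF totals and transporter counts are hypothetical placeholders (not values from Fig. 4 or Table S1), chosen only so that transporters make up a roughly constant share of each genome. The helper name `r_squared` is likewise an assumption for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variance, so no linear relationship to measure
    return (sxy * sxy) / (sxx * syy)


# Genome sizes (Mb) quoted in the text for eight of the genomes.
genome_mb = [1.2, 2.0, 3.0, 3.1, 3.8, 5.2, 6.0, 7.2]

# HYPOTHETICAL values for illustration only: total ORFs per genome and
# transporter ORFs making up roughly 8% of each genome.
total_orfs = [1140, 1900, 2850, 2945, 3610, 4940, 5700, 6840]
transporter_orfs = [95, 148, 235, 229, 296, 388, 462, 541]

transporter_share = [t / n for t, n in zip(transporter_orfs, total_orfs)]

r2_count = r_squared(genome_mb, transporter_orfs)  # counts scale with size
r2_share = r_squared(genome_mb, transporter_share)  # shares stay nearly flat

print(f"count vs genome size: R^2 = {r2_count:.2f}")
print(f"share vs genome size: R^2 = {r2_share:.2f}")
```

With these placeholder numbers the count tracks genome size closely while the share does not, which is the pattern interpreted above as proportional, rather than selective, loss of transporters.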

While we do not fully understand the process of genome ‘streamlining’, observations of distinct subsystem category distributions in AcI-B1 and Pnec indicate that the process may not be uniform. For example, both AcI-B1 and Pnec have disproportionately high investment in subsystem category Protein metabolism, suggesting that reduction in this category may always pose a penalty to fitness. In contrast, in category Amino acids and derivatives, Pnec has heavily reduced investment and AcI-B1 has disproportionately high investment. Additionally, in category Carbohydrates, Pnec has heavily reduced investment, while AcI-B1 maintains investment comparable with other taxa. Such discrepancies in how these two genomes appear to have streamlined may indicate fundamental underlying differences in their lifestyles or perhaps simply different random paths taken in discarding superfluous genes. It may also be that Pnec simply has not fully streamlined to the extent of organisms like AcI-B1 or the marine Pelagibacter ubique (Pnec genome size 2 Mb compared with 1.2 Mb in AcI-B1 and 1.3 Mb in Pelagibacter ubique), which have the smallest genomes of any known free-living bacteria.

Genome-informed lifestyles of freshwater bacteria

An advantage of analysing the genomes of a diverse set of freshwater bacteria is the opportunity to generate hypotheses on the spectrum of lifestyles suggested by their genome composition. Based on the genomes presented here, we conceptualize freshwater organisms as utilizing a number of discrete lifestyle strategies based on a three-axis continuum with the first axis defined by predicted growth rate (estimated from codon usage bias, see Methods), the second axis by the number of identified carbon substrate utilization genes, which is correlated with utilization of larger carbon compounds/aromatics and genome size, and the third axis defined by the proportion of a genome coding for motility and signal transduction, as surrogate measures for environmental responsiveness (Fig. 5). Based on these semimetric axes, there appear to be four general lifestyle strategies that we call ‘passive and streamlined’, ‘slow, augmented responders’, ‘fast, reduced responders’ and ‘vagabonds’.
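As a rough illustration of how the three axes could be combined, the following Python sketch assigns a genome to one of the four strategies using simple cut-offs. This is not a procedure used in the study: the cut-off values, the `GenomeTraits` record and the `classify` function are all hypothetical, and the example trait values are invented rather than taken from Fig. 5.

```python
from dataclasses import dataclass


@dataclass
class GenomeTraits:
    name: str
    min_generation_h: float   # predicted minimum generation time (hours)
    carbon_genes: int         # carbon substrate utilization genes
    response_fraction: float  # share of ORFs for motility + signal transduction


def classify(g, fast_h=2.0, many_carbon=150, responsive_cutoff=0.03):
    """Map the three axes onto the four named strategies.

    All three cut-offs are HYPOTHETICAL defaults chosen for illustration.
    """
    fast = g.min_generation_h <= fast_h
    augmented = g.carbon_genes >= many_carbon
    responsive = g.response_fraction >= responsive_cutoff

    if not responsive and not augmented and not fast:
        return "passive and streamlined"
    if responsive and augmented and fast:
        return "vagabond"
    if responsive and augmented and not fast:
        return "slow, augmented responder"
    if responsive and not augmented and fast:
        return "fast, reduced responder"
    return "unclassified"  # combinations the four strategies do not cover


# Invented trait values, one per strategy, to exercise each branch.
examples = [
    GenomeTraits("example_passive", 10.0, 40, 0.01),
    GenomeTraits("example_slow_augmented", 8.0, 300, 0.07),
    GenomeTraits("example_fast_reduced", 1.5, 90, 0.06),
    GenomeTraits("example_vagabond", 0.5, 300, 0.08),
]
labels = {g.name: classify(g) for g in examples}
for name, label in labels.items():
    print(name, "->", label)
```

Hard thresholds are used here only for brevity; the axes are described as a continuum, so any real assignment would need to treat borderline genomes with care.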

The streamlined passive group includes the cosmopolitan Pnec and AcI-B1. These might also be called ‘oligotrophs’ and appear near the origin of the graph with slow predicted growth rate (validated in Pnec culture; Hahn, 2003), very small genomes (2.0 and 1.2 Mb), few genes predicted for utilization of carbon substrates, little to no genomic evidence of motility and few genes with predicted signal transduction function. We also know that Pnec and AcI-B1 have very small cell volumes (Hahn, 2003; Hahn et al., 2003). Organisms conforming to such a lifestyle strategy might be thought of as survivalists persisting by streamlining their genomes (Lynch, 2006) and eliminating extraneous cell surface receptors to minimize phage attack, concomitant with likely loss of transporter function or signal transduction capacity (Bohannan and Lenski, 2000). Reduction of cell size and overall activity is also known to reduce predation risk from ciliate and flagellate grazers (Cole, 1999; Pernthaler et al., 2001; Pernthaler, 2005). In sum, organisms in this category would passively persist in the water column, growing slowly by gleaning a single or few carbon substrates while maintaining characteristics to minimize predation.

The remaining groups of organisms might be characterized as copiotrophic. These organisms generally use a greater variety of substrates and respond to environmental changes. The first of these groups we call the slow, augmented responders. Members of this group include FWI2, LLX12A and WG21, which all have larger genomes (7.2, 6.0 and 5.2 Mb, respectively) and higher diversity of carbon substrate utilization pathways but low growth rates. Additionally, all three have characteristics indicating active response to their environment, including genes for motility and signal transduction. The second group, which we call the fast, reduced responders, includes the organisms L13, L41A and LLX17. These organisms have intermediate genome sizes (∼3.8, 3.1 and 3.0 Mb, respectively) and substantially fewer genes predicted for carbon substrate utilization but higher predicted growth rates. They also appear to sense and respond to their environment, as evidenced by motility and signal transduction markers. Finally, each of these three (L13, L41A and LLX17) has below average investment in the Membrane transport subsystem category. The distinction between the slower growing augmented group and the faster growing reduced group suggests a trade-off between substrate flexibility and maximum growth rate among these more environmentally responsive organisms. It appears the fast, reduced responders prefer a small set of substrates and either delay growth until their preferred substrate is available (Jones and Lennon, 2010) or actively seek specific resources chemotactically. In exchange for this reduction, genomic evidence suggests that they grow faster. In contrast, the slow, augmented responders have the benefit of being able to use a large and diverse array of substrates but suffer a lower growth rate, either as a consequence of their larger genomes or some other burden related to an excess of metabolic capacity. While genome size and predicted growth rate are not correlated at the coarse scale (Mira et al., 2001), such a relationship may hold when examining only free-living copiotrophs as observed here (R = 0.87, P = 0.02 for these six genomes). It is also important to recognize the distinction between genomically predicted growth rates, which are a reflection of evolutionary history of growth habit, and measured growth rates, which describe propensity to grow in a defined liquid medium. These two measures are not always perfectly correlated (Fig. S2), and one or the other may prove a more reliable indicator of lifestyle.

Our distinction between responders and passive streamlined organisms can serve as an explanation for differing adaptations to temporal variation in resource availability but is easily extended to spatial variation. Recent reviews on bacterial adaptation to spatial nutrient heterogeneity in aquatic environments (Stocker, 2012; Grossart, 2010) distinguish between organisms that chemotactically respond to the nutrient gradients and hot spots intrinsic to all aquatic ecosystems and those that persist passively on the constant background concentrations of resources. These two lifestyles correspond directly to what we referred to as responders and passive streamlined, respectively.

The final group observed among available freshwater genomes we call ‘vagabonds’, represented by WG36 and L18 (both Gammaproteobacteria). A review on freshwater bacterial taxonomy (Newton et al., 2011) uses the term vagabond for those bacteria that appear in lakes but also may have a wider distribution in the biosphere. In other words, while these organisms are probably active freshwater inhabitants (Nercessian et al., 2005), a free-living lifestyle in freshwater may be only a portion of their life cycle. The two representatives we identified here both possess extremely high predicted growth rates relative to the other genomes, show indications of motility and appear to use a large number of complex carbon substrates, paradoxical to the paradigm described earlier (Fig. 5). This would make sense if these types of organisms use freshwater bodies primarily as a maintenance and transmission vector but propagate primarily in a host or different freshwater-associated environment.

Fig. 5. Conceptual representation of freshwater bacteria-lifestyle strategies based upon genome characteristics of 10 freshwater representatives. The horizontal axis is based on predicted growth rate [based on codon usage bias (Vieira-Silva and Rocha, 2010)], while the vertical axis is based on the number of carbon compound-utilization genes in each genome. The vertical axis also generally corresponds to increase in the genes coding for proteins involved in the assimilation of large carbon substrates (polymers, aromatics, etc). Dot size (third axis) corresponds to proportion of an organism’s genome coding for motility or signal transduction with larger dots indicating a greater proportion thereof.

In examining the variation in lifestyle characteristics, it is worth noting the influence of isolation method used to obtain a genome from an organism of a given lifestyle type. The AcI-B1 genome was obtained from the amplified genome of a single cell (Garcia et al., 2012), while Pnec was isolated using a filtration-acclimatization method (Hahn et al., 2003). In contrast, the organisms presented here, with the exception of FWI2, were isolated and selected based on their propensity to grow on culture plates with relatively rich medium [Wright’s Chu #10 (WC) medium amended with thiamine, biotin and inorganic phosphate]. We preferentially selected for more copiotrophic organisms by using traditional culturing techniques as opposed to the more laborious filtration-acclimatization procedures or culture-independent approaches, such as single-cell genome amplification. Passive streamlined specialists like AcI-B1 and Pnec were selected as freshwater models because they are cosmopolitan and frequently have high cell abundances. However, our genomic comparisons suggest this abundance may be a product of diminished growth and activity (non-motile, long predicted minimum generation time, few substrate propensities). In contrast, the population dynamics of rarer members of the freshwater community, selected for by traditional culturing techniques (Shade et al., 2012), are likely characterized by punctuated periods of high growth and activity. These distinct lifestyle strategies (consistent, slow growth versus short-lived periods of rapid growth) may result in equivalent contributions to important ecosystem processes, such as nutrient recycling and carbon processing.

In summary, by examining a diverse group of freshwater genomes, we are beginning to identify contrasting genomic characteristics that are indicative of ecologically relevant lifestyle strategies. This framework will be further strengthened by field surveys and experimentation validating the lifestyle inferences gained from genome characteristics. As evidence mounts, a paradigm such as the one presented here can guide analysis and classification of other organisms based on their genome sequences and inform interpretation of environmental shotgun metagenomic surveys. Finally, the availability of genomes from a variety of native lake organisms will be beneficial for contextual analysis of novel genes and gene pathways identified in freshwater environments.

Methods

Eight freshwater bacteria previously isolated from Little Long Lake, Wintergreen Lake (MI, USA) or Pleasant Lake (IN, USA) were selected for genome sequencing based on taxonomic classification inferred from partial 16S rRNA gene sequences. Seven of the eight were isolated from Little Long and Wintergreen (five and two isolates, respectively) using a modified WC minimal medium (Stemberger, 1981; with 4.2 mM of glycine, 5.3 mM of acetate, 5.2 mM of dextrose and 3.5 mM of succinate). Additional isolation information for these seven is available in Bird (2012). The eighth isolate originates from Pleasant Lake and was isolated using the filtration-acclimatization technique previously used during the isolation of Pnec, except that humic acids were added to the inorganic basal medium instead of nutrient broth, soytone and yeast extract (Hahn, 2003). Bacteria for sequencing were grown in R2A medium in 200 ml glass serum bottles on a shaker table. Once cultures reached early stationary phase, we reduced cell density to an OD600 (optical density measured at 600 nm) of 1.2 and isolated DNA with a cetyltrimethyl ammonium bromide and phenol chloroform total nucleic acid extraction followed by an RNAse treatment. DNA was quantified using a Qubit fluorometer (Invitrogen, Life Technologies, Grand Island, New York, USA).

DNA samples were prepared for sequencing adhering to the Roche 454 titanium chemistry protocol for rapid library preparation. Approximately 500 ng of genomic DNA was nebulized, end-repaired and ligated to 454 adaptors (containing barcode sequences for multiplexing) to generate the sequencing library. Sequencing libraries were attached to DNA capture beads, titrated and amplified through emulsion polymerase chain reaction (emPCR). The beads from the emPCR reaction were enriched, loaded onto a 454 sequencing plate and sequenced on a 454 GS-FLX sequencer (454 Life Sciences, Branford, CT, USA).

We also sequenced 3 kb paired-end libraries for two isolate genomes (L18, LLX17). Approximately 5 μg of genomic DNA were fragmented on a Covaris S220 high-intensity acoustic transducer (Covaris, Inc., Woburn, MA, USA), end-polished, ligated to a circularization adaptor, size-selected for 3 kb fragments, then circularized. After circularization, the protocol follows the general library preparation and sequencing steps as discussed earlier.

Demultiplexing of FLX titanium sequence data for each genome was performed using standard 454 software [standard flowgram format (SFF) file] and barcodes. Next, sequence data present in the resulting individual SFF files were assembled using 454 Newbler version 2.5.3 (454 Life Sciences) with multiple processors and default parameters. Newbler was used to estimate the empirical size of any mate-pair libraries present in SFF data during the assembly process and subsequent scaffolding. All analysis was performed on contigs, or on scaffolds where available. Assembly statistics are in Table S1, and draft genome assemblies can be accessed from the NCBI BioProject repository (Accs. PRJNA174161, PRJNA174163, PRJNA174212 and PRJNA174214-PRJNA174218).
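The idea behind barcode demultiplexing can be sketched in a few lines of Python. The actual step was carried out on SFF flowgram files with the vendor software; the sketch below instead works on plain sequence strings, uses exact prefix matching only, and the barcode sequences and the `demultiplex` function are hypothetical.

```python
def demultiplex(reads, barcodes):
    """Assign each read to a sample by exact match of its leading barcode.

    reads:    iterable of sequence strings
    barcodes: dict mapping barcode sequence -> sample name
    Returns a dict of sample name -> reads with the barcode trimmed off.
    """
    bins = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched; keep the read untrimmed for inspection.
            bins["unassigned"].append(read)
    return bins


# HYPOTHETICAL barcodes and reads, for illustration only.
example_barcodes = {"AACCGG": "L18", "TTGGCC": "LLX17"}
example_reads = ["AACCGGATGC", "TTGGCCGGGA", "CCCCCCAAAA", "AACCGGTTTT"]

example_bins = demultiplex(example_reads, example_barcodes)
for sample, seqs in example_bins.items():
    print(sample, seqs)
```

A production demultiplexer would normally tolerate sequencing errors in the barcode; exact matching is used here only to keep the sketch short.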

The complete nucleotide genome sequence of Polynucleobacter necessarius ssp. asymbioticus QLW-P1DMWA-1 (Meincke et al., 2012), hereafter referred to as ‘Pnec’, was obtained from GenBank (Acc. NC_009379.1). The nucleotide contig sequences of the single-amplified genome Actinobacteria AcI-B1 SCGC AAA027-L06 (hereafter, AcI-B1) (Garcia et al., 2012) were kindly provided by Katherine McMahon of the University of Wisconsin-Madison. ORF finding and translation for Pnec, AcI-B1 and all of the sequences collected in this study were conducted using Glimmer3 (Delcher et al., 2007) and a custom Python script.
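The custom script itself is not reproduced here. As a hedged illustration of what the translation step involves, the following Python sketch translates an ORF given 1-based inclusive coordinates and a strand. It assumes the standard bacterial codon assignments, assumes that ATG, GTG and TTG start codons are all read as methionine, and does not parse Glimmer3 output; the function and variable names are hypothetical.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: all initiate with Met


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate_orf(contig, start, end, strand="+"):
    """Translate contig[start..end] (1-based, inclusive, start <= end)."""
    dna = contig[start - 1:end].upper()
    if strand == "-":
        dna = reverse_complement(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")  # X for codons with ambiguous bases
        if i == 0 and codon in START_CODONS:
            aa = "M"
        if aa == "*":
            break  # stop codon ends the protein
        protein.append(aa)
    return "".join(protein)


# Toy contig: a GTG-initiated ORF occupies positions 3-14 on the forward strand.
toy_contig = "CCGTGAAACGCTAAGG"
forward_protein = translate_orf(toy_contig, 3, 14, "+")
# The same ORF seen from the opposite strand of the reverse-complemented contig.
reverse_protein = translate_orf(reverse_complement(toy_contig), 3, 14, "-")
print(forward_protein, reverse_protein)
```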

To compare genomes of freshwater isolates with isolates from other environments while controlling for phylogenetic differences, we collected the phylogenetically closest available genome from both soil and marine environments. Phylogenetic pairings were identified with a RAxML (Stamatakis et al., 2008) maximum likelihood phylogeny of full-length 16S rRNA gene alignment from publicly available genomes. We selected the closest soil and marine genome to each freshwater isolate.
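The pairing step can be pictured as a nearest-neighbour lookup. The Python sketch below assumes that pairwise distances between genomes (for example, branch-length distances read off the 16S rRNA tree) are already available as a table; how closeness was actually measured on the phylogeny is not specified in the text, and the genome names, distances and the `closest_by_environment` function are hypothetical.

```python
def closest_by_environment(isolate, distances, environment_of):
    """Return the nearest soil and nearest marine genome to one isolate.

    distances:      dict mapping (genome_a, genome_b) -> distance
    environment_of: dict mapping genome name -> "soil" or "marine"
    """
    best = {}
    for (a, b), d in distances.items():
        if isolate not in (a, b):
            continue
        other = b if a == isolate else a
        env = environment_of.get(other)
        if env not in ("soil", "marine"):
            continue
        # Keep the smallest distance seen so far for this environment.
        if env not in best or d < best[env][1]:
            best[env] = (other, d)
    return {env: name for env, (name, _) in best.items()}


# HYPOTHETICAL distances between one freshwater isolate and reference genomes.
example_env = {"soil_A": "soil", "soil_B": "soil",
               "marine_A": "marine", "marine_B": "marine"}
example_dist = {
    ("L13", "soil_A"): 0.12,
    ("L13", "soil_B"): 0.05,
    ("marine_A", "L13"): 0.09,
    ("L13", "marine_B"): 0.20,
    ("soil_A", "marine_A"): 0.30,  # pair not involving the isolate, ignored
}

example_pairing = closest_by_environment("L13", example_dist, example_env)
print(example_pairing)
```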

To classify the freshwater genomes in an existing freshwater bacteria phylogenetic classification system (Newton et al., 2011), we identified 16S rRNA genes from the genome using RNAMMER (Lagesen et al., 2007). Resultant sequences (a single gene was identified for each genome) were aligned to a full-length 16S rRNA gene freshwater bacteria reference alignment (Newton et al., 2011) using the Silva incremental aligner (Sina) (Pruesse et al., 2012). Phylogenetic classifications were defined by positions of isolates in phylum-specific RAxML maximum likelihood phylogenies.

To holistically compare freshwater genomes with each other and with soil/marine genomes, we used the RAST pipeline [settings: ORFs were identified by Glimmer3, FIGfam release 45, all optional steps disabled (Aziz et al., 2008)], in which genes are assigned to orthologous groups called ‘FIGfams’, which are in turn assigned functional categories called ‘subsystems’. We computed principal components on proportional subsystem category distributions using Vegan (Oksanen et al., 2007) in R (R Development Core Team, 2010) to identify clustering relationships between genomes and reveal the categories with greatest contribution to inter-genomic variation. We further examined Carbohydrate metabolism and Metabolism of aromatic compounds subsystem categories using Vegan in R to visualize a non-metric multidimensional scaling ordination with Raup–Crick dissimilarity, which is a presence-absence-based measure (Raup and Crick, 1979).
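To make the principal component step concrete, the sketch below converts per-genome subsystem counts to proportions and extracts the first principal component by power iteration on the covariance matrix. The analysis in this study was run with Vegan in R; this pure-Python version is only an illustration of the idea, does not reproduce Vegan's scaling conventions, and uses invented counts and hypothetical function names.

```python
import math


def to_proportions(counts):
    """Convert per-genome subsystem counts to row-wise proportions."""
    return [[c / sum(row) for c in row] for row in counts]


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for the first PC via power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the centered columns (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-constant start
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break  # no variance left to explain
        vec = [v / norm for v in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# HYPOTHETICAL ORF counts in three subsystem categories for four genomes;
# the first two and last two genomes are built to resemble each other.
example_counts = [
    [80, 10, 10],
    [78, 12, 10],
    [20, 70, 10],
    [22, 68, 10],
]
example_props = to_proportions(example_counts)
loadings, scores = first_principal_component(example_props)
print("loadings:", [round(v, 3) for v in loadings])
print("scores:  ", [round(s, 3) for s in scores])
```

In this toy case the first two genomes fall on one side of the first axis and the last two on the other, and the category with identical proportions in every genome contributes nothing to the loadings.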

In addition to annotation in RAST, we identified known protein domains from all translated ORFs of the freshwater isolates using the Pfam-A database (Bateman et al., 2004). Annotation was implemented with the ‘pfam_scan.pl’ perl script from the Pfam website (http://pfam.sanger.ac.uk). Annotations permitted identification of marker domains for signal transduction (Ulrich and Zhulin, 2010) and motility (Lingner et al., 2010). Supplementary to signal transduction domain identification, a previously published hidden Markov model (Staron et al., 2009) identified likely ECFs, which are additional markers of signal transduction. Probable membrane transporters were identified from amino acid sequences using TransportTP (Li et al., 2009) with Escherichia coli serotype O157:H7 strain EDL933 as the reference organism and an E-value threshold of 0.1. Finally, minimum generation time of each isolate was estimated based on a previously developed model utilizing codon usage bias as a predictor (Vieira-Silva and Rocha, 2010).

References

Aziz, R., Bartels, D., Best, A., DeJongh, M., Disz, T., Edwards, R., et al. (2008) The RAST server: rapid annotations using subsystems technology. BMC Genomics 9: 75.

Bateman, A., Coin, L., Durbin, R., Finn, R., Hollich, V., Griffiths-Jones, S., et al. (2004) The Pfam protein families database. Nucleic Acids Res 32: D138–D141.

Berg, K., Lyra, C., Sivonen, K., Paulin, L., Suomalainen, S., Tuomi, P., and Rapala, J. (2008) High diversity of cultivable heterotrophic bacteria in association with cyanobacterial water blooms. ISME J 3: 314–325.

Bird, K. (2012) Generalist and Specialist Strategies of Phosphorus Acquisition by Aquatic Bacteria. Ann Arbor, MI, USA: UMI Dissertations Publishing.

Bohannan, B., and Lenski, R. (2000) Linking genetic change to community evolution: insights from studies of bacteria and bacteriophage. Ecol Lett 3: 362–377.

Cole, J. (1999) Aquatic microbiology for ecosystem scientists: new and recycled paradigms in ecological microbiology. Ecosystems 2: 215–225.

Delcher, A., Bratke, K., Powers, E., and Salzberg, S. (2007) Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23: 673–679.

DeLong, E., and Karl, D. (2005) Genomic perspectives in microbial oceanography. Nature 437: 336–342.

Eiler, A., and Bertilsson, S. (2007) Flavobacteria blooms in four eutrophic lakes: linking population dynamics of freshwater bacterioplankton to resource availability. Appl Environ Microbiol 73: 3511–3518.

Garcia, S., McMahon, K., Martinez-Garcia, M., Srivastava, A., Sczyrba, A., Stepanauskas, R., et al. (2012) Metabolic potential of a single cell belonging to one of the most abundant lineages in freshwater bacterioplankton. ISME J 7: 137–147. e-pub ahead of print 19 July 2012; doi:10.1038/ismej.2012.86


Grossart, H.P. (2010) Ecological consequences ofbacterioplankton lifestyles: changes in concepts areneeded. Environ Microbiol Rep 2: 706–714.

Hahn, M. (2003) Isolation of strains belonging to the cosmo-politan Polynucleobacter necessarius cluster from freshwater habitats located in three climatic zones. Appl EnvironMicrobiol 69: 5248–5254.

Hahn, M., Lünsdorf, H., Wu, Q., Schauer, M., Höfle, M.,Boenigk, J., and Stadler, P. (2003) Isolation of novelultramicrobacteria classified as Actinobacteria from fivefreshwater habitats in Europe and Asia. Appl EnvironMicrobiol 69: 1442–1451.

Hahn, M., Scheuerl, T., Jezberová, J., Koll, U., Jezbera, J.,Šimek, K., et al. (2012) The passive yet successful way ofplanktonic life: genomic and experimental analysis of theecology of a free-living Polynucleobacter population. PLoSONE 7: e32772.

Hutalle-Schmelzer, K., Zwirnmann, E., Krüger, A., andGrossart, H. (2010) Enrichment and cultivation of pelagicbacteria from a humic lake using phenol and humic matteradditions. FEMS Microbiol Ecol 72: 58–73.

Jezberová, J., Jezbera, J., Brandt, U., Lindström, E.,Langenheder, S., and Hahn, M. (2010) Ubiquity ofPolynucleobacter necessarius ssp. asymbioticus in lenticfreshwater habitats of a heterogenous 2000 km2 area.Environ Microbiol 12: 658–669.

Jones, S., and Lennon, J. (2010) Dormancy contributes to themaintenance of microbial diversity. Proc Natl Acad Sci U SA 107: 5881–5886.

Jørgensen, N., Brandt, K., Nybroe, O., and Hansen, M.(2010) Vogesella mureinivorans sp. nov., a peptidoglycan-degrading bacterium from lake water. Int J Syst EvolMicrobiol 60: 2467–2472.

Kirchman, D. (2002) The ecology of cytophaga–flavobacteriain aquatic environments. FEMS Microbiol Ecol 39: 91–100.

Lagesen, K., Hallin, P., Rødland, E., Stærfeldt, H., Rognes,T., and Ussery, D. (2007) Rnammer: consistent and rapidannotation of ribosomal RNA genes. Nucleic Acids Res 35:3100–3108.

Li, H., Benedito, V., Udvardi, M., and Zhao, P. (2009)TransportTP: a two-phase classification approach formembrane transporter prediction and characterization.BMC bioinformatics 10: 418.

Lingner, T., Mühlhausen, S., Gabaldón, T., Notredame, C.,and Meinicke, P. (2010) Predicting phenotypic traits ofprokaryotes from protein domain frequencies. BMCbioinformatics 11: 481.

Lynch, M. (2006) Streamlining and simplification of microbialgenome architecture. Annu Rev Microbiol 60: 327–349.

Meincke, L., Copeland, A., Lapidus, A., Lucas, S., Berry, K.,Glavina Del Rio, T., et al. (2012) Complete genomesequence of polynucleobacter necessarius subsp.asymbioticus type strain (qlw-p1dmwa-1 t). Stand GenomicSci 6: 74–83.

Methé, B., Nelson, K., Mihai, P., Creasy, H., Giglio, M.,Huttenhower, C., et al. (2012) A framework for humanmicrobiome research. Nature 486: 215–221.

Mira, A., Ochman, H., and Moran, N. (2001) Deletional biasand the evolution of bacterial genomes. Trends Genet 17:589–596.

Nercessian, O., Noyes, E., Kalyuzhnaya, M., Lidstrom, M.,and Chistoserdova, L. (2005) Bacterial populations activein metabolism of C1 compounds in the sediment of LakeWashington, a freshwater lake. Appl Environ Microbiol 71:6885–6899.

Newton, R., Kent, A., Triplett, E., and McMahon, K. (2006)Microbial community dynamics in a humic lake: differentialpersistence of common freshwater phylotypes. EnvironMicrobiol 8: 956–970.

Newton, R., Jones, S., Helmus, M., and McMahon, K. (2007)Phylogenetic ecology of the freshwater Actinobacteria acIlineage. Appl Environ Microbiol 73: 7169–7176.

Newton, R., Jones, S., Eiler, A., McMahon, K., andBertilsson, S. (2011) A guide to the natural history of fresh-water lake bacteria. Microbiol Mol Biol Rev 75: 14–49.

Oksanen, J., Kindt, R., Legendre, P., O’Hara, B., Stevens, M.,Oksanen, M., and Suggests, M. (2007) The veganpackage. Community ecology package [WWW document].URL http://www.R-project.org.

Pernthaler, J. (2005) Predation on prokaryotes in the watercolumn and its ecological implications. Nat Rev Microbiol3: 537–546.

Pernthaler, J., Posch, T., Simek, K., Vrba, J., Pernthaler, A.,Glöckner, F., et al. (2001) Predator-specific enrichment ofActinobacteria from a cosmopolitan freshwater clade inmixed continuous culture. Appl Environ Microbiol 67:2145–2155.

Pruesse, E., Peplies, J., and Glöckner, F. (2012) Sina: accu-rate high throughput multiple sequence alignment of ribo-somal RNA genes. Bioinformatics 28: 1823–1829.

R Development Core Team (2010) R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. (01/19).

Raup, D., and Crick, R. (1979) Measurement of faunal similarity in paleontology. J Paleontol 53: 1213–1227.

Reid, W., Mooney, H., Cropper, A., Capistrano, D., Carpenter, S., Chopra, K., et al. (2005) Ecosystems and Human Well-being: Biodiversity Synthesis. Washington, DC, USA: Island Press.

Rickard, A., McBain, A., Stead, A., and Gilbert, P. (2004) Shear rate moderates community diversity in freshwater biofilms. Appl Environ Microbiol 70: 7426–7435.

Shade, A., Hogan, C., Klimowicz, A., Linske, M., McManus, P., and Handelsman, J. (2012) Culturing captures members of the soil rare biosphere. Environ Microbiol 14: 2247–2252.

Stamatakis, A., Hoover, P., and Rougemont, J. (2008) A rapid bootstrap algorithm for the RAxML web servers. Syst Biol 57: 758–771.

Staron, A., Sofia, H., Dietrich, S., Ulrich, L., Liesegang, H., and Mascher, T. (2009) The third pillar of bacterial signal transduction: classification of the extracytoplasmic function (ECF) σ factor protein family. Mol Microbiol 74: 557–581.

Stemberger, R. (1981) A general approach to the culture of planktonic rotifers. Can J Fish Aquat Sci 38: 721–724.

Stocker, R. (2012) Marine microbes see a sea of gradients. Science 338: 628–633.

Sze, C., and Li, C. (2011) Inactivation of bb0184, which encodes carbon storage regulator A, represses the infectivity of Borrelia burgdorferi. Infect Immun 79: 1270–1279.

Ulrich, L., and Zhulin, I. (2010) The MiST2 database: a comprehensive genomics resource on microbial signal transduction. Nucleic Acids Res 38 (Suppl. 1): D401–D407.

Vieira-Silva, S., and Rocha, E. (2010) The systemic imprint of growth and its uses in ecological (meta)genomics. PLoS Genet 6: e1000808.

Warnecke, F., Sommaruga, R., Sekar, R., Hofer, J., and Pernthaler, J. (2005) Abundances, identity, and growth state of Actinobacteria in mountain lakes of different UV transparency. Appl Environ Microbiol 71: 5551–5559.

Wu, M., and Scott, A. (2012) Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2. Bioinformatics 29: 3781–3792.

Wu, Q., Zwart, G., Wu, J., Kamst-van Agterveld, M., Liu, S., and Hahn, M. (2007) Submersed macrophytes play a key role in structuring bacterioplankton community composition in the large, shallow, subtropical Taihu Lake, China. Environ Microbiol 9: 2765–2774.

Yooseph, S., Nealson, K., Rusch, D., McCrow, J., Dupont, C., Kim, M., et al. (2010) Genomic and functional adaptation in surface ocean planktonic prokaryotes. Nature 468: 60–66.

Supporting information

Additional Supporting Information may be found in the online version of this article at the publisher’s web-site:

Fig. S1. Maximum likelihood phylogeny (PhyML, GTR model, SPR moves) of 16S rRNA genes from eight newly sequenced freshwater bacterial genomes (closed circles), two previously sequenced freshwater bacterial genomes (open circles), and closely related sequences for context. Dots on branches indicate greater than 80% support (aLRT). Key: blue = marine, red = host-associated, green = freshwater, brown = soil.

Fig. S2. Correspondence between genome-predicted growth rate and observed growth rate is noisy, especially for Gammaproteobacteria (L18 and WG36). The line marks one-to-one correspondence.

Table S1. Genome assembly statistics and isolate origins for eight freshwater bacterial genomes.

Table S2. Nutrient and carbohydrate FIGfams identified in the genomes of eight freshwater bacteria isolates and two previously sequenced genomes (Pnec and AcI-B1).

© 2013 Society for Applied Microbiology and John Wiley & Sons Ltd, Environmental Microbiology, 16, 746–758