Describing polar marine microbial communities by their metabolic structure
Can we bridge the gap between community structure and ecosystem function?
Jeff S. Bowman* and Hugh W. Ducklow
Lamont-Doherty Earth Observatory, Columbia University
Introduction and Motivation
Ecologists typically describe microbial communities by the diversity of a taxonomic marker gene, such as the 16S rRNA gene. Although these data are well suited to evaluating differences between communities and to correlating community structure with environmental parameters (e.g. chlorophyll concentration, temperature, salinity), they are less well suited to describing the metabolic capabilities (i.e. ecosystem function) of the community. Metagenomics and other techniques can bridge the gap between microbial community structure and ecosystem function, but these techniques are costly, data intensive, and low throughput.
Our goal was to develop a high-throughput method for inferring community metabolism from community taxonomy. By evaluating metabolic structure in place of community structure we capture key inter-sample relationships and their impact on microbial ecosystem function. Our method produces pathway genome databases (PGDBs) that describe the metabolic pathways likely to be present in the sample. These PGDBs are amenable to flux-based metabolic modeling. Future work will focus on predicting the flow of elements and energy through these pathways, providing a way to model the impact of changing community structure on biogeochemical cycles.
Here we apply our method to a seasonally variable, depth-stratified microbial community from the West Antarctic Peninsula, a region undergoing unprecedented environmental change.
Key Points
• Microbial communities can be described by their metabolic structure.
• Metabolic structure provides information on potential microbial ecosystem functions.
• Representing a microbial community by metabolic structure may provide a way to model the flow of elements and energy through the community.
[Fig. 4 graphic. Left: map of sample locations off the WAP (axes: Longitude, Latitude; quadrants NW, NE, SW, SE). A, B: dendrograms of samples (y-axis: Height) with clusters labeled Deep, Winter surface, and Surface. C: scatter plot of distance by edge abundance (y-axis) against distance by pathway abundance (x-axis).]
[Fig. 5 graphic. A, B: PCA plots (axes: PC1, PC2) with samples grouped as Deep, Surface, and Winter surface. C, D: heatmaps with samples as columns.]
Pathways shown as heatmap rows: phenylalanine degradation II (anaerobic); phenylacetate degradation II (anaerobic); alginate degradation; maltose degradation; spheroidene and spheroidenone biosynthesis; thiamin salvage III; formate oxidation to CO2; salicylate degradation I; chlorosalicylate degradation; methylsalicylate degradation; guanylyl molybdenum cofactor biosynthesis; proline degradation; phenylacetate degradation I (aerobic); lysine biosynthesis I; triclosan resistance.
Taxa shown as heatmap rows: Robiginitalea biformata HTCC2501; Lactobacillus sanfranciscensis TMW 1 1304; Actinosynnema mirum DSM 43827; Alteromonadales spp.; Arthrobacter aurescens TC1; Thermodesulfovibrio yellowstonii DSM 11347; Bartonella bacilliformis KC583; Colwellia psychrerythraea 34H; Nitrosopumilus maritimus SCM1; Bartonella quintana Toulouse; Thalassobaculum spp.; Ruegeria pomeroyi DSS 3; Saccharophagus degradans 2 40; Halothiobacillus spp.; Parvibaculum lavamentivorans DS 1; Capnocytophaga/Cellulophaga spp.; Hippea maritima DSM 10411; Tetragenococcus halophilus; Melissococcus plutonius spp.
[Fig. 1 graphic. Flowchart with three sections (Database Construction, Sample Analysis, Confidence Score) and the following steps:]
• 16S sequence library, the bigger the better!
• Obtain all completed genomes
• Build 16S rRNA reference tree
• Find consensus genome for each tree node
• Place reads on reference tree
• Extract pathways for each placement
• Generate confidence score for sample
• Predict metabolic pathways
• Calculate confidence for each node
• Evaluate genomic plasticity for terminal nodes
• Evaluate relative core genome size
Fig. 1. Methods. Our metabolic inference pipeline uses a phylogenetic placement program (pplacer) [1] to place query reads on a reference tree of 16S rRNA genes from all completed genomes. We determine a consensus genome for each point of placement on the tree and identify the metabolic pathways represented in these genomes. Separately, we determine a confidence score for each point of placement on the reference tree from a novel indicator of genomic stability.
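The step from read placements to a per-sample pathway profile can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the dictionary layout, and the choice to weight each pathway by the number of reads placed are all assumptions, and the toy data are hypothetical.

```python
from collections import Counter


def pathway_abundance(placements, node_pathways):
    """Sum pathway counts over all read placements in one sample.

    placements: dict mapping a tree node id to the number of reads placed there.
    node_pathways: dict mapping a tree node id to the set of pathway names
        predicted for that node's consensus genome.
    Both structures are assumed for illustration only.
    """
    abundance = Counter()
    for node, n_reads in placements.items():
        # Nodes with no predicted pathways contribute nothing.
        for pathway in node_pathways.get(node, ()):
            abundance[pathway] += n_reads
    return abundance


# Hypothetical toy data, not taken from the poster.
node_pathways = {
    "node_a": {"alginate degradation", "proline degradation"},
    "node_b": {"proline degradation"},
}
placements = {"node_a": 3, "node_b": 5}
result = pathway_abundance(placements, node_pathways)
```

Running this on every sample yields a sample-by-pathway table that can be analyzed in the same way as a sample-by-taxon table.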
Fig. 4. Sample locations within the Palmer LTER off the WAP (left) and inter-sample similarity (right). The location of Palmer Station is given by the star. Summer surface and deep samples along with winter surface samples were analyzed [2]. A) Hierarchical clustering of samples by metabolic structure. B) Hierarchical clustering of samples by taxonomic structure. Note duplicate samples in both A and B. C) Distances between samples are in good agreement between the two methods (R² = 0.65).
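The comparison in panel C can be sketched as below: compute pairwise distances between samples from each abundance table, then measure how well the two sets of distances agree. The poster does not state which distance metric was used, so Bray-Curtis dissimilarity is an assumption here, and the toy tables are hypothetical.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def pairwise(table):
    """Distances for every pair of samples, in a fixed (sorted) order."""
    names = sorted(table)
    return [bray_curtis(table[a], table[b]) for a, b in combinations(names, 2)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical toy tables: sample name -> abundance vector.
pathways = {"s1": [10, 0, 5], "s2": [8, 2, 5], "s3": [0, 9, 1]}
edges = {"s1": [4, 1, 0, 2], "s2": [3, 2, 0, 2], "s3": [0, 1, 6, 0]}
r2 = r_squared(pairwise(pathways), pairwise(edges))
```

Because pairwise distances are not independent observations, a permutation-based test such as a Mantel test would be the usual choice if a significance value were needed.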
Fig. 5. What taxa and metabolic pathways account for the most variance? Having determined that the relationship between samples can be accurately represented by metabolic structure, we can begin to ask ecologically relevant questions. A question frequently asked of community structure data is which taxa account for the most variability. We can ask the same question of metabolic structure: which metabolisms account for the most variability? A) PCA of taxonomic structure. B) PCA of metabolic structure. C) Heatmap of high-variance taxa. D) Heatmap of high-variance metabolisms. These metabolisms represent ecosystem functions that may be differentially provided by the microbial communities.
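One simple way to pick the rows for heatmaps like those in panels C and D is to rank features by their variance across samples. The sketch below assumes that approach; the poster does not say how the high-variance taxa and pathways were selected, and the function name and abundance values are hypothetical.

```python
from statistics import pvariance


def top_variance_features(table, k):
    """Return the k feature names with the highest variance across samples.

    table: dict mapping a feature name (taxon or pathway) to a list of
        abundances, one per sample. This layout is assumed for illustration.
    """
    ranked = sorted(table, key=lambda name: pvariance(table[name]), reverse=True)
    return ranked[:k]


# Hypothetical abundances across four samples.
table = {
    "alginate degradation": [0, 0, 9, 10],
    "proline degradation": [5, 5, 5, 5],
    "maltose degradation": [1, 2, 1, 2],
}
top = top_variance_features(table, 2)
```

Abundances are usually normalized (for example to relative abundance) before ranking, so that deeply sequenced samples do not dominate the variance.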
1. Matsen, F, R Kodner, E Armbrust. 2010. pplacer: Linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics, 11:538.
2. Luria, C, H Ducklow, L Amaral-Zettler. 2014. Marine bacterial, archaeal and eukaryotic diversity and community structure on the continental shelf of the western Antarctic Peninsula. Aquatic Microbial Ecology, 73(2):107-121.
www.polarmicrobes.org
[Fig. 3 graphic. Scatter plot of relative plasticity (y-axis, 0.0 to 1.0) against terminal node (x-axis, 0 to 2500), with outliers labeled I through IX.]
[Fig. 2 graphic. Diagram of a clade with two terminal nodes and an internal node, showing the core genome and the accessory genome.]
Fig. 2. Confidence score. Placements can be made to terminal and internal nodes. To determine the confidence (c) of a metabolic inference for a given placement we consider the core genome size (S_core), the mean genome size of the clade (S_clade), and the mean index of plasticity for the clade (rd; Fig. 3).
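Two of the inputs named in this caption, the core genome size and the mean genome size of the clade, can be illustrated with gene sets. The sketch below is an assumption-laden example: it represents each genome as a set of gene identifiers, treats the core genome as the genes shared by every genome in the clade, and reports their ratio. The poster does not give the formula for c, so no confidence value is computed here.

```python
def core_and_accessory(clade_genomes):
    """Split a clade's genes into core (shared by all genomes) and accessory.

    clade_genomes: dict mapping a genome name to a set of gene identifiers.
    This representation is assumed for illustration only.
    """
    gene_sets = list(clade_genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan - core


def relative_core_size(clade_genomes):
    """Core genome size divided by the mean genome size of the clade."""
    core, _ = core_and_accessory(clade_genomes)
    mean_size = sum(len(g) for g in clade_genomes.values()) / len(clade_genomes)
    return len(core) / mean_size


# Hypothetical clade of three genomes.
clade = {
    "g1": {"a", "b", "c", "d"},
    "g2": {"a", "b", "e"},
    "g3": {"a", "b", "c"},
}
core, accessory = core_and_accessory(clade)
ratio = relative_core_size(clade)
```

A ratio near 1 means the genomes in the clade share most of their genes, which is the situation in which a consensus genome is most likely to reflect the metabolism of an unsequenced relative.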
2015 GRS and GRC, Polar Marine Science
Fig. 3. Genomic plasticity of genomes in our database. A major impediment to accurate metabolic inference is the genetic diversity that can exist within even a narrow taxonomic clade. We developed a confidence metric for our inferred metabolisms that is based on the degree of genomic plasticity inherent to each genome. The x-axis gives the position of each genome on our reference tree; the y-axis gives the degree of plasticity. Unusually plastic genomes are indicated by Roman numerals. I) Nanoarchaeum equitans II) the Mycobacteria III) a butyrate-producing bacterium within the Clostridium IV) Candidatus Hodgkinia cicadicola V) the Mycoplasma VI) Sulcia muelleri VII) Portiera aleyrodidarum VIII) Buchnera aphidicola IX) the Oxalobacteraceae.