MICROBIOMICS: Current and future tools of the trade

Ingeborg Klymiuk
Core Facility Molecular Biology
ZMF - Center for Medical Research
Medical University Graz


MICROBIOMICS

DEFINITION OF OMIC TECHNOLOGIES

OMIC technologies provide a holistic view of the molecules that make up a cell, tissue, sample or organism.

OMICS: universal detection of genes (genomics), mRNA (transcriptomics), proteins (proteomics), lipids (lipidomics) and metabolites (metabolomics) in a specific biological sample.

Non-targeted, non-biased approaches for biomarker discovery (bias anywhere from study design to analysis can still impact results).

Overview of OMIC approaches: ProteOMICS, LipidOMICS, MetabolOMICS, TranscriptOMICS, MetagenOMICS and CulturOMICS (selective growth, antibiotic assay, enzymatic assay); targeted amplicon methods (16S amplicon NGS, LEA-Seq) and PhyloChips.

Conducting a microbiome study

The approaches to study the human-associated microbial communities are increasing

DNA- or RNA-based analysis

community surveys: (descriptive) identification of microorganisms (OTUs - operational taxonomic units); differences in OTU composition between sample groups

identification of genes: functional identification of genetic potential, gene richness

detection of rare OTUs, minor species: depth requirements

transcriptional activity and functionality

various constituents of a microbial community, such as eukaryotes, viruses and various groups of bacteria

live/dead discrimination: propidium monoazide (PMA)

duration, costs and sample volume of analysis

Goodrich et al. Cell 158, July 17, 2014

biological question

Study subjects and controls

Sampling

Sample storage

Nucleic acid extraction (DNA, RNA)

PCR, library preparation

sequencing

pipeline-specific analysis

diversity analysis

classification, clustering, modeling

OMICS data analysis

Deposit data, share…

Conducting a microbiome study

16S amplicon-based approach

Who is there?

descriptive view of microbiome diversity

assesses the general composition of the microbiota

economical and therefore scales to large projects

bacterial, archaeal, fungal diversity

complex bacterial communities

Metagenomics

What can they do?

portrays functional potential of microbiome

gene content

bacteria, archaea, fungi, viruses

human/host background

Metatranscriptomics

What are they doing?

describes active gene expression

elucidates the active members

bacteria, archaea, fungi, viruses

human/host background


(16S) targeted amplicon

uses one or a few marker genes to reveal the composition and diversity of the microbiota

the 16S rRNA gene is highly conserved between different species of Bacteria and Archaea; for fungi, the internal transcribed spacer (ITS) region of the rRNA is used (Bellemain et al. 10, Bokulich et al. 13)

beside conserved primer binding regions, hypervariable regions provide species-specific signature sequences

primer choice, variable region and experimental setup (PCR amplification, cycle number and conditions, depth of analysis, platform used for sequencing, …) all affect the results

large databases of reference sequences and taxonomies are available (such as Greengenes - DeSantis et al. 06, SILVA - Quast et al. 13 and the Ribosomal Database Project - Cole et al. 09), but there is a risk of misclassification

(16s) targeted amplicon workflow

DNA Isolation

amplicon preparation

indexing, purification and pooling

sequencing

(16s) targeted amplicon workflow DATA ANALYSIS

Galaxy: mothur, qiime

(optional) combine two sets of reads

Quality filtering and trimming

Pick OTUs with uclust (similarity 0.97)

Taxonomy assignment

Representative sequence alignment

PCR bias, Chimera removal (chimera.uchime)

Phylogenetic classification

Alpha diversity (Chao, Shannon, evenness, richness)

Beta diversity (UniFrac, Bray-Curtis, Euclidean, Pearson)

Diversity analysis, sample and group comparison of microbial communities (multivariate data analysis)

Statistics and visualization (multivariate data analysis)

Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. (2013): Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology. 79(17):5112-20.

Goodrich et al. Cell 158, July 17, 2014
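The alpha and beta diversity measures named in the workflow can be illustrated on a toy OTU table. The following is a minimal sketch in plain Python, not the mothur or QIIME implementation; the function names and the two example samples are assumptions made for illustration, and Chao1 is shown in its bias-corrected form.

```python
import math

def richness(counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def evenness(counts):
    """Pielou's evenness J' = H' / ln(S); set to 0 here when S <= 1."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same OTU order."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical read counts per OTU for two samples (same OTU order).
sample_a = [10, 5, 1, 1, 0, 2]
sample_b = [0, 8, 3, 0, 4, 2]

print("richness A:", richness(sample_a))
print("Shannon A:", round(shannon(sample_a), 3))
print("evenness A:", round(evenness(sample_a), 3))
print("Chao1 A:", chao1(sample_a))
print("Bray-Curtis A vs B:", round(bray_curtis(sample_a, sample_b), 3))
```

Phylogeny-aware measures such as UniFrac additionally need a tree of the representative sequences and are therefore left out of this sketch.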

Innovations for high-throughput amplicon sequencing

PCR and sequencing introduce sequence errors and sampling bias, leading to poor estimates of microbial diversity

the amplification of non-target DNA may result in an inefficient representation of the microbial pattern

low-diversity samples are problematic on the Illumina system

1) increase amplicon diversity by spiking sequencing runs with sheared genomic DNA (PhiX174)

2) increase diversity with heterogeneity spacers (frameshift nucleotides)

3) low-error amplicon sequencing: LEA-Seq …

Lundberg et al. Nature Methods 2013; Faith et al. Science 2013; Fadrosh et al. Microbiome 2014

PhyloChip G3

PhyloChip G3 is a microbial community assessment tool that can simultaneously track high-abundance and low-abundance bacterial and archaeal taxa; currently available through Second Genome Inc. (San Francisco, CA)

Microarray-based technology with high chip-to-chip reproducibility

16S full-length amplicon

Bacteria: 27F 5'-agagtttgatcctggctcag-3', 1492R 5'-ggttaccttgttacgactt-3'

Archaea: 4Fa 5'-tccggttgatcctgcccg-3', 1492R 5'-ggttaccttgttacgactt-3'

PhyloChip G3

PhyloChip G3: 25-mer oligos, 1,100,000 probes and analysis for 59,959 OTUs

overcomes the depth bias: no saturation effect

detection of rare OTUs; down to subspecies-level resolution

mainly used for environmental samples (spacecraft, clean rooms, hot spring systems, …)

human high-diversity samples and gut samples where high resolution and sensitivity are desired: 'tackling the minority'

used in various publications characterizing the diseased human intestinal tract, gastric samples, …

Moissl-Eichinger et al 2014

MetagenOMICS

the ‘unbiased’ direct sequencing of the microbiome: the genomes of all microorganisms in a sample

10^14 microorganisms inhabit the human gut

ensemble of the genomes of human-associated microorganisms

provides much richer data on the functional potential present in microbial community genomes

sacrifices resolution

extends the analysis to organisms other than bacteria

MetaHIT: defined a gene catalogue of 3.3 million non-redundant gut microbial genes by metagenomics

defining 100 times more genes than encoded by the human host genome

DNA Isolation

library preparation

indexing, purification and pooling

sequencing

MetagenOMICS - workflow

Reference based: MIRA, Newbler

De novo: SOAP, Velvet

MG-RAST, a metagenome annotation system

MetagenOMICS - workflow

sequencing

assembly

Annotation, gene prediction

Binning

Statistical analysis

Data storage

Data sharing

Metadata

16s phylogeny

QC

draft genome

MetagenOMICS

integrates microbial membership with biomolecular potential and activity in the human intestine

eliminates the danger of PCR/primer bias (missed phyla)

genomes from the same species (by 16S) can have large genomic differences outside the 16S region and can carry different sets of gene clusters

includes fungal (<1% of stool) and viral (1×10^-5% to 2% of stool) sequences; immense biological function (reviewed in Morgan et al. Gastroenterology 2014)

Lepage et al. Gut 2013; Dutilh et al. Nature 2014

MetatranscriptOMICS

only a subset of the genes present is expressed

rare species might be highly transcriptionally active

characterizes the complete collection of transcribed sequences in a microbial community

how communities respond to changes in their environment

analysis of the active fraction of the community

Is a gene active?

Is a gene more highly expressed than another gene in the same sample/treatment?

Is a gene differentially expressed in response to experimental conditions?

Does gene expression change over time?
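The questions above about relative and differential expression can be made concrete with a small calculation. This is a minimal sketch, assuming raw read counts per gene are already available for one control and one treated sample; the gene names, the counts and the pseudocount of 1 are hypothetical. It normalises to counts per million (CPM) and reports a log2 fold change. It is not a statistical test, which would need replicates and a dedicated differential-expression package.

```python
import math

def cpm(counts):
    """Scale raw read counts per gene to counts per million mapped reads."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on CPM values.

    The pseudocount avoids division by zero for genes without reads.
    """
    c, t = cpm(control), cpm(treated)
    return {
        gene: math.log2((t.get(gene, 0.0) + pseudocount) / (c[gene] + pseudocount))
        for gene in c
    }

# Hypothetical read counts for three genes in two conditions.
control = {"geneA": 100, "geneB": 300, "geneC": 600}
treated = {"geneA": 400, "geneB": 300, "geneC": 300}

lfc = log2_fold_change(control, treated)
for gene, value in lfc.items():
    print(f"{gene}: log2 fold change = {value:+.2f}")
```

Within one sample, comparing the CPM values of two genes addresses the "more highly expressed than another gene" question, although gene length would also have to be taken into account for a fair comparison.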

MetatranscriptOMICS-workflow

RNA is less stable

rRNA, tRNA depletion (stool: 98%) (MicrobExpress kit; LifeTech)

mammalian (host cell) RNA selectively removed; high relevance for e.g. biopsy samples

a higher amount of starting material or an amplification step is necessary

mod. from Warnecke et al 2009

Total RNA extraction

Microbial Community

mRNA enrichment

cDNA synthesis

optional: amplification

Library preparation for high-throughput sequencing

Metatranscriptome

Franzosa et al 2014 PNAS

Sequencing applications, arranged in the original figure by read length and read numbers: de novo sequencing, amplicon sequencing, transcriptomics (e.g. tag sequencing, RNA-Seq), ChIP sequencing, MeDIP sequencing, haplotyping, microbiome studies/metagenomics, transcriptome-wide full-length cDNA sequencing, whole-genome re-sequencing, exome sequencing, ncRNA transcriptomics.

Roche 454 GS FLX: 1.3 million reads of 700 b

Illumina HiSeq 2000/2500: ~2 billion reads of 2x125 b

Pacific Biosciences: 50k reads of 5/8.5 kb-20 kb

Illumina MiSeq: ~25 million reads of 2x300 bp

Illumina NextSeq 500: ~400 million reads of 2x150 bp

Which Technology/System to choose?

Illumina: sequencing by synthesis

Output scalable: MiSeq, HiSeq, NextSeq

High multiplexing capacity

Read length increasing (300 bp MiSeq, 150 bp HiSeq)

Low error rates

Cost-efficient system (about 1400 € for a MiSeq run with 384 samples)
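The cost figure can be turned into per-sample numbers with simple arithmetic. This is a rough sketch using only the values quoted on these slides (about 1400 € per MiSeq run, 384 multiplexed samples, ~25 million reads of 2x300 bp). It assumes that reads are distributed evenly across samples and that the quoted price covers the run only; the function names are illustrative.

```python
def per_sample(run_cost_eur, reads_per_run, n_samples):
    """Return (cost per sample in EUR, average reads per sample) for one run."""
    return run_cost_eur / n_samples, reads_per_run / n_samples

def run_yield_gb(reads_per_run, read_length_bp, paired=True):
    """Approximate sequence yield of a run in gigabases."""
    bases_per_read = read_length_bp * (2 if paired else 1)
    return reads_per_run * bases_per_read / 1e9

cost, reads = per_sample(1400, 25_000_000, 384)
print(f"{cost:.2f} EUR per sample, ~{reads:,.0f} reads per sample")
print(f"~{run_yield_gb(25_000_000, 300):.0f} Gb per MiSeq run")
```

With these inputs the run cost works out to about 3.65 € and roughly 65,000 reads per sample, which is why 16S amplicon surveys scale to large projects.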

Pacific Biosciences: SMRT Cell

www.pacificbiosciences.com

small genomes, bacteria, archaea, metagenomics

no multiplexing capacity

one SMRT cell: 200-400 Mb output

Nanopores – The future in DNA sequencing

Single-molecule sequencing

Direct RNA molecule sequencing

High error rate (13%-15%)
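To put the quoted error rate in perspective, it can be converted to the Phred quality scale commonly used for sequencing reads, Q = -10 log10(p). This is a small sketch under the simplifying assumption that errors are independent and uniform along the read; the 1 kb read length is an arbitrary example value.

```python
import math

def phred(error_rate):
    """Phred quality score Q = -10 * log10(p) for a per-base error probability p."""
    return -10 * math.log10(error_rate)

def expected_errors(error_rate, read_length):
    """Expected number of erroneous bases in a read, assuming independent errors."""
    return error_rate * read_length

for p in (0.13, 0.15):
    print(f"p = {p:.2f}: Q = {phred(p):.1f}, "
          f"~{expected_errors(p, 1000):.0f} errors per 1 kb read")
```

An error rate of 13%-15% corresponds to a quality of roughly Q8 to Q9, well below the Q30 (0.1% error) often used as a benchmark for short-read platforms.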


ProteOMICS

mass spectrometry techniques offer methods to directly analyze small molecules

determine lipids, metabolites and proteins

MALDI-TOF: routine application for bacterial classification (DSMZ)

recording of mass spectra of large biomolecules (mainly ribosomal proteins)

mass spectrometric fingerprints: identification of bacteria, yeasts and fungi by comparison with reference databases

http://www.mayomedicallaboratories.com/articles/communique/2013/01-maldi-tof-mass-spectrometry/index.html

LipidOMICS / MetabolOMICS

Metabolomics and metabolome profiling for disease biomarkers

Understand rare taxa and taxa with genomic variations

Rare taxa can have important metabolic activities

Metabolomics provides a picture of the actual metabolism rather than of the metabolic potential

short-chain fatty acids (SCFAs) derived from microbial metabolism in the gut play a central role in host homeostasis

Ursell et al. Gastroenterology 2014

Mass spectrometry: total ion chromatogram (relative abundance vs. time in min; RT 2.68-18.86; full MS scan, m/z 80-400), semi-targeted methods. Labelled fatty acids: 16:0, 18:0, 18:1n9, 18:1n11, 18:2, 20:4, 22:6, with 15:0 as internal standard.


CulturOMICS

it is commonly accepted that c. 80% of the bacterial species found by molecular tools, e.g. in the human gut, are uncultured or even unculturable (Turnbaugh et al 2007)

the German national academy Leopoldina in Berlin has recently recommended increasing the effort in taxonomic research

taxonomy is important for medicine, food technology and agriculture, for an optimal understanding and application of microorganisms

pure cultures are mandatory for taxonomic assessment

CulturOMICS is an approach allowing extensive assessment of the microbial composition by high-throughput culture

complements the metagenomic analysis and overcomes the ‘depth bias’ and DNA isolation/PCR bias

CulturOMICS

define high-throughput culture conditions (212 different conditions)

high-throughput automated colony screening

identification by MALDI-TOF

comparison of culturomics taxa with those found by 16S amplicon sequencing

31 new bacterial species and genera (http://www.ebi.ac.uk/embl/Submission/index.html)

culturomics approach yielded 340 bacterial species, seven phyla, 117 genera

pyrosequencing identified 282 species, six phyla, 91 genera

51 phylotypes overlap between the methods

definition of most efficient culture conditions

characterization by functionality and enzyme activity possible

minority populations can have a substantial effect on the ecology of the gut microbiota and on human health

later use of identified species as probiotics
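The small overlap between the two methods can be quantified from the three numbers quoted above. This is a minimal sketch, assuming the 51 shared phylotypes are counted at the same (species) level as the 340 culturomics and 282 pyrosequencing species; the function name is illustrative.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarise two species lists given their sizes and the size of their overlap."""
    union = n_a + n_b - n_shared
    return {
        "only_a": n_a - n_shared,
        "only_b": n_b - n_shared,
        "union": union,
        "jaccard": n_shared / union,
    }

# a = culturomics (340 species), b = pyrosequencing (282 species), 51 shared.
summary = overlap_summary(340, 282, 51)
print(summary)
print(f"share of culturomics species also found by sequencing: {51 / 340:.0%}")
```

Under that assumption the two approaches together cover 571 species, of which fewer than a tenth are found by both, which supports the point that culture and sequencing complement rather than replace each other.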

Challenge & impact of OMICs

Owyang & Wu 2014

Data sources: Proteomics, Genomics (Hyb, Amplicon, Metagenome), Transcriptome, Lipidomics, Metabolomics

Bioinformatics: Integrative Bioinformatics Analysis & Methods turn Omics data into a candidate list

• Functional analysis
• Candidate prioritization
• Biomarker identification
• Drug target discovery

Reiss et al. 2011, Host Cell & Microbe


MICROBIOMICS

THANK YOU!

Ingeborg Klymiuk
Core Facility Molecular Biology
ZMF - Center for Medical Research
Medical University Graz

Illumina Sequencing Systems

http://systems.illumina.com/systems/sequencing.html

16S pros and cons

Advantage: Present in all bacteria and archaea (91).
Disadvantage: Present in multiple copy numbers through most organisms (91), which may lead to overestimation of the abundance of some organisms.

Advantage: Contains highly conserved regions suitable for universal primer design (37).
Disadvantage: A small number of organisms do not display as much conservation through these regions, leading to primer bias (37).

Advantage: Contains regions of high variability suitable as unique identifiers (42).
Disadvantage: Regions of variability are occasionally insufficient to provide species-level resolution, and may be biased toward certain species (39,42).

Advantage: Numerous well-curated databases allowing sequence comparison and taxonomic assignment of organisms (45).
Disadvantage: Many databases contain sequences with errors (45).

Advantage: Well-studied primer pairs available, which are capable of amplifying most organisms with high specificity for bacteria (38).
Disadvantage: May lack specificity for certain bacterial groups and result in inaccurate estimations of community composition (38).
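The copy-number disadvantage can be illustrated with a toy correction: dividing each taxon's 16S read count by its number of 16S gene copies before computing relative abundance. This is a minimal sketch; the taxon names, read counts and copy numbers are hypothetical, and in practice copy numbers are unknown for many taxa and must be estimated.

```python
def relative_abundance(counts):
    """Convert counts per taxon to fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def correct_for_copy_number(read_counts, copy_numbers):
    """Divide 16S read counts by per-taxon gene copy number, then renormalise."""
    genome_equivalents = {
        taxon: n / copy_numbers[taxon] for taxon, n in read_counts.items()
    }
    return relative_abundance(genome_equivalents)

# Hypothetical example: taxon_A carries 7 copies of the 16S gene, taxon_B one.
reads = {"taxon_A": 700, "taxon_B": 100}
copies = {"taxon_A": 7, "taxon_B": 1}

raw = relative_abundance(reads)
corrected = correct_for_copy_number(reads, copies)
print("uncorrected:", raw)
print("corrected:  ", corrected)
```

In this example the uncorrected profile suggests that taxon_A makes up 87.5% of the community, whereas after correction both taxa are equally abundant.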