
A decade into Next Generation Sequencing on marine non-model organisms: current state and developments


Alexander Jueterbock

2017-03-01

@AJueterbock Next Generation Sequencing 2017-03-01 1 / 43

NGS history Next generation sequencing

History of sequencing

1953 Watson and Crick: Double helix structure

1977 First generation Sanger sequencing

1983 Mullis: PCR

1997 Next Generation Sequencing

2003 Human genome - after 1 decade

454 Pyrosequencing

2006 Illumina NGS

Numerous NGS technologies


Next generation sequencing - low costs and high throughput

from www.sciencenews.org


NGS platforms

Next Generation

First Generation


Typical NGS library preparation workflow

Van Dijk et al., 2014, Trends in Genetics


NGS platforms

SOLiD

454 Roche Pyrosequencing

Ion Torrent

Illumina (market leader)


Genomics

Omics overview

Genomics: DNA

Epigenomics: methylation, histone modification, non-coding RNA

Transcriptomics: mRNA

Metagenomics: genomes of microorganisms

Axis labels: stability, heritability; effect on phenotype; complexity and flexibility; response to environment

adjusted from Kellermayer, 2010, American Journal of Medical Genetics, Part A


NGS omics applications

Genomics
Seq-methods: WGS, reduced representation
Targets: gDNA, targeted DNA, cpDNA, mtDNA
Applications: assembly, markers, variations

Epigenomics
Seq-methods: Bis-seq, MethylRAD, MeDIP, ChIP-seq
Targets: gDNA, ncRNA, histone modification
Applications: methylation, expression

Transcriptomics
Seq-methods: RNA-seq
Targets: mRNA, smallRNA
Applications: assembly, expression, variations, characterization

Metagenomics
Seq-methods: WGS, amplicon
Targets: gDNA, 16S rRNA
Applications: function, variation, phylogenetics

Axis labels: stability, heritability; complexity and flexibility

adjusted from Kellermayer, 2010, American Journal of Medical Genetics, Part A


Genomics Reference genome


Genome sizes

Phaeodactylum tricornutum (Diatom)

Ectocarpus siliculosus (Brown alga)

Eurytemora affinis (Copepod)

Patiria miniata (Bat star)

Crassostrea gigas (Pacific oyster)

Salmo salar (Atlantic salmon)

Orcinus orca

Zostera marina (Seagrass); Olsen et al., 2016, Nature

Axis: genome size, Mb to Gb

Source: wikipedia


Assembly based on Illumina fragment and mate pair libraries

Genome (203 Mb, N50: 79,958)

Reads

Contigs (12,588)

Mate-pair

Scaffold (2,200)
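The N50 quoted for the assembly can be illustrated with a minimal sketch. It assumes the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the toy lengths are hypothetical and are not taken from the Zostera assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are sorted from longest to shortest and summed until the
    running total reaches half of the overall length; the length of the
    sequence that crosses this threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

A higher N50 indicates that more of the assembly sits in long sequences, which is why mate-pair scaffolding is expected to raise it relative to the contig stage.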


Zostera - adaptation to the marine environment

Abundance of genes and transposable elements, TEs (63%), in the 10 largest scaffolds

TEs associated with gained genes

Olsen et al., 2016, Nature


Genomics RADseq


RAD sequencing

Figure labels: RAD, cut site, mutated cut site, PCR duplicate

Steps: Restriction/Shearing, PCR, Sequencing, Analysis (inflated homozygosity)

Baird et al., 2008, PLoS ONE; Schweyen et al., 2014, Biological Bulletin
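Why PCR duplicates matter for genotype calls can be shown with a small sketch. It assumes, in the spirit of the degenerate base region (DBR) idea of Schweyen et al., that reads from the same locus carrying the same DBR are copies of one original fragment. The read tuples, function names and values below are hypothetical, and this is not the published pipeline.

```python
from collections import Counter


def remove_pcr_duplicates(reads):
    """Keep one read per (locus, DBR, allele) combination.

    Each read is a tuple (locus, dbr, allele). Reads sharing all three
    values are treated as PCR copies of a single template molecule.
    """
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def allele_counts(reads, locus):
    """Count the alleles observed at one locus."""
    return Counter(allele for loc, _, allele in reads if loc == locus)


# Hypothetical heterozygous individual (alleles A and G at locus1).
# One A-carrying fragment was over-amplified during PCR.
reads = [
    ("locus1", "ACGT", "A"),
    ("locus1", "ACGT", "A"),  # PCR copy
    ("locus1", "ACGT", "A"),  # PCR copy
    ("locus1", "ACGT", "A"),  # PCR copy
    ("locus1", "TTGA", "G"),
    ("locus1", "GGCA", "A"),
]

print(allele_counts(reads, "locus1"))                         # raw counts
print(allele_counts(remove_pcr_duplicates(reads), "locus1"))  # deduplicated
```

In the raw counts the G allele is heavily outnumbered and could be dismissed as a sequencing error, so the individual would be called homozygous; after duplicate removal the allele balance is closer to what a heterozygote should show.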


Genomics RADseq for population genomics

ddRAD case studies

Case 1: Population genomics (natural samples from the North Atlantic)
ddRAD pool-sequencing (16 ind.)
Alignment to genome, variant calling (total 1,800 SNPs, 340 common)
Estimate genetic differentiation (talk of Marvin Choquet at 12:15)

Case 2: Fisheries-induced selection (3-year size selection on guppy)
ddRAD pool-sequencing (20 ind.)
Alignment to genome, variant calling (total 51,338 SNPs)
Test for putatively adaptive SNPs (talk of Irina Smolina at 12:00)
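Genetic differentiation between two pools is often summarised per SNP as FST. The sketch below assumes a biallelic SNP, two equally weighted populations, and the simple heterozygosity-based definition FST = (HT - HS) / HT. The estimators used in the case studies are not stated on the slide, and real pool-seq analyses additionally correct for pool size and sequencing depth; the function name and frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """FST for one biallelic SNP from allele frequencies in two populations.

    p1, p2: frequency of the same allele in population 1 and 2.
    HS is the mean expected heterozygosity within populations,
    HT the expected heterozygosity of the pooled population.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # The SNP is fixed for the same allele in both populations
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies estimated from read counts in two pools
print(fst_biallelic(0.5, 0.5))  # identical frequencies
print(fst_biallelic(0.2, 0.8))  # strongly diverged frequencies
```

SNPs whose FST lies far above the genome-wide background are the usual candidates when testing for putatively adaptive differentiation.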


Epigenomics


Epigenetic variation adds a level of variation to the genome

Allis et al., 2015


Epigenomics DNA methylation

What is the methylome?

The set of DNA methylation modifications in an organism's genome

Zakhari, 2013, Alcohol research : current reviews


Eco-evolutionary importance of DNA-methylation

Speciation

Heritable phenotypic variation

Adaptation independent of genotype


Bisulfite conversion allows detection of methylcytosine

www.atdbio.com
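The logic of bisulfite-based detection can be sketched in a few lines. Bisulfite treatment converts unmethylated cytosine to uracil, which is read as T after PCR, while methylated cytosine is protected and still reads as C. The sketch assumes complete conversion, error-free reads and a single strand; the sequence, positions and function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the read obtained after bisulfite treatment and PCR.

    Unmethylated C is read as T; methylated C (0-based positions listed
    in `methylated_positions`) remains C. Other bases are unchanged.
    """
    converted = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted):
    """Return positions where a reference C still reads as C."""
    return {
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, converted))
        if ref_base == "C" and read_base == "C"
    }


reference = "ACGTCCGA"   # hypothetical reference sequence
methylated = {1, 5}      # hypothetical methylated cytosines
read = bisulfite_convert(reference, methylated)
print(read)
print(call_methylation(reference, read))
```

Comparing the converted read with the untreated reference is what makes the method work: a C that survives conversion is inferred to be methylated, a C that became T is inferred to be unmethylated.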


MethylRAD

Wang et al., 2015, Open Biology


Epigenomics The methylome of seagrass

Epigenetic variation in seagrass clones

Epigenetic variation in a clonal meadow on the Åland Islands?


Methylation response to heat stress in seagrass

Newly grown shoots


Transcriptomics


Transcriptomics RNAseq and temperature adaptation

RNAseq to identify transcriptomic adaptation in seagrass

Sampling sites

Summer sea surface temperatures

Jueterbock et al., 2016, Molecular Ecology


RNAseq libraries of heat-stressed samples

Jueterbock et al., 2016, Molecular Ecology


Differential expression analysis

Gene 1 Gene 2

Population 1 or control

Population 2 or stress

5,000 out of 13,000 uniquely mapped genes were heat-responsive
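The comparison behind a differential expression analysis can be sketched as a log2 fold change between library-size-normalised read counts. This is a toy calculation with hypothetical counts and function names; it has no replicates and no statistical test, whereas real analyses rely on dedicated packages that model count dispersion across replicates. The slides do not state which tool was used.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts per gene to counts per million mapped reads."""
    total = sum(counts.values())
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(control, stress, pseudocount=1.0):
    """log2(stress / control) per gene on CPM-normalised counts.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    cpm_control = counts_per_million(control)
    cpm_stress = counts_per_million(stress)
    return {
        gene: math.log2(
            (cpm_stress[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in control
    }


# Hypothetical read counts for three genes in two libraries
control = {"gene1": 100, "gene2": 100, "gene3": 800}
stress = {"gene1": 400, "gene2": 100, "gene3": 500}

for gene, lfc in log2_fold_change(control, stress).items():
    print(gene, round(lfc, 2))
```

Positive values indicate higher expression under stress (or in population 2), negative values lower expression, and values near zero no change.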


Separating neutral from adaptive differentiation

Gene 1 Gene 2

Population 1

Population 2

SNP

140,000 SNPs in total

Neutral differentiation

Mediterranean


Putatively adaptive differentiation

21 genes were likely involved in parallel adaptation to warm temperatures

Jueterbock et al., 2016, Molecular Ecology


Metagenomics


Metagenomics Metagenomics history

Metagenomics timeline

Escobar-Zepeda et al., 2015, Frontiers in Genetics


16S rRNA metaprofiling vs WGS metagenomics

www.gatc-biotech.com


Metagenomics The seagrass microbiome

Local variation in seagrass microbiome

Chloe Marechal, poster 192, Session 016, 03/03/2017, 11:00 - 12:00


Bottlenecks and perspectives Huge data

Big data

genalice.com


Bioinformatics data

File size: several Gb

Number of lines: >1,000,000
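Files of this size are usually processed as a stream rather than loaded into memory. As a minimal sketch, the function below counts the records in a read file line by line. It assumes the common FASTQ layout of exactly four lines per read, optionally gzip-compressed; the slide does not name a file format, and the function name and file path are hypothetical.

```python
import gzip


def count_fastq_records(path):
    """Count reads in a FASTQ file without loading it into memory.

    Assumes four lines per record (header, sequence, '+', qualities).
    Files ending in '.gz' are decompressed on the fly.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    n_lines = 0
    with opener(path, "rt") as handle:
        for _ in handle:  # iterating yields one line at a time
            n_lines += 1
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


# Usage (hypothetical path):
# print(count_fastq_records("sample_R1.fastq.gz"))
```

Memory use stays constant regardless of file size because only one line is held at a time; the same pattern extends to filtering or trimming reads on the fly.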


Bioinformatics data analysis

Computational infrastructure needed

Pipelines often have to be re-established for each non-model species

Open source software increases reproducibility as compared with commercial software

Knowledge of both biology and informatics

One month analysis contracts are often too short

Lacking standards for metagenomics data analysis


Bottlenecks and perspectives Perspectives

Perspectives

Microsatellites largely replaced by SNPs for population genetics

Third Generation Sequencers open up opportunities for genomics, epigenomics, and metagenomics

CRISPR genome editing, a breakthrough also for non-model organisms?


Bottlenecks and perspectives Third generation sequencing

Third generation sequencing platforms

NGS 3rd Generation

First Generation

single molecule sequencing

no PCR bias

characterization of DNA modifications


Pacific Biosciences

high cost

input DNA > 10 µg

raw error 10-15%

circular consensus sequence

error correction to 0.01%
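Why repeated passes over the same circular molecule reduce the error rate can be illustrated with a toy probability model. It assumes that errors in each pass are independent and that a consensus base is wrong only when more than half of the passes are wrong at that position. This is a simplification for illustration, not the vendor's consensus algorithm; the function name is hypothetical, and the numbers it produces are not meant to reproduce the figures on the slide.

```python
from math import comb


def majority_error(per_pass_error, passes):
    """Probability that a majority of independent passes is wrong at a base.

    Binomial tail: P(X > passes / 2) with X ~ Binomial(passes, per_pass_error).
    Use an odd number of passes to avoid ties.
    """
    needed = passes // 2 + 1
    return sum(
        comb(passes, k)
        * per_pass_error ** k
        * (1 - per_pass_error) ** (passes - k)
        for k in range(needed, passes + 1)
    )


# Raw per-pass error of 15% (upper end of the range quoted above)
for n_passes in (1, 5, 15, 25):
    print(n_passes, majority_error(0.15, n_passes))
```

Under these assumptions the consensus error falls steeply with the number of passes, which is the intuition behind trading read length for accuracy in circular consensus sequencing.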


Oxford Nanopore

up to 200 kbp read length

5-30% raw error

size of USB stick


References

Allis, CD, ML Caparros, T Jenuwein, and D Reinberg (2015). Epigenetics. P. 984.

Baird, NA, PD Etter, TS Atwood, MC Currey, AL Shiver, ZA Lewis, et al. (2008). “Rapid SNP discovery and genetic mapping using sequenced RAD markers”. In: PLoS ONE 3.10.

Escobar-Zepeda, A, AVP De León, and A Sanchez-Flores (2015). “The road to metagenomics: From microbiology to DNA sequencing technologies and bioinformatics”. In: Frontiers in Genetics 6.DEC, pp. 1–15.

Haas, BJ and MC Zody (2010). “Advancing RNA-Seq analysis”. In: Nature Biotechnology 28.5, pp. 421–423.

Jueterbock, A, SU Franssen, N Bergmann, J Gu, JA Coyer, TBH Reusch, et al. (2016). “Phylogeographic differentiation versus transcriptomic adaptation to warm temperatures in Zostera marina, a globally important seagrass”. In: Molecular Ecology 25.21, pp. 5396–5411.

Kellermayer, R (2010). “"Omics" as the filtering gateway between environment and phenotype: The inflammatory bowel diseases example”. In: American Journal of Medical Genetics, Part A 152 A.12, pp. 3022–3025.

Olsen, JL, P Rouzé, B Verhelst, YC Lin, T Bayer, J Collen, et al. (2016). “The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea”. In: Nature 530.7590, pp. 331–335.

Schweyen, H, A Rozenberg, and F Leese (2014). “Detection and removal of PCR duplicates in population genomic ddRAD studies by addition of a degenerate base region (DBR) in sequencing adapters”. In: Biological Bulletin 227.2, pp. 146–160.

Van Dijk, EL, H Auger, Y Jaszczyszyn, and C Thermes (2014). “Ten years of next-generation sequencing technology”. In: Trends in Genetics 30.9, pp. 418–26.

Wang, S, J Lv, L Zhang, J Dou, Y Sun, X Li, et al. (2015). “MethylRAD: a simple and scalable method for genome-wide DNA methylation profiling using methylation-dependent restriction enzymes”. In: Open Biology 5.11, p. 150130.

Zakhari, S (2013). “Alcohol metabolism and epigenetics changes.” In: Alcohol research : current reviews 35.1, pp. 6–16.