
NGS Applications II (UEB-UAT Bioinformatics Course - Session 2.1.3 - VHIR, Barcelona)


DESCRIPTION

Course: Bioinformatics for Biomedical Research (2014). Session 2.1.3: Next Generation Sequencing. Technologies and Applications. Part III: NGS Applications II. Statistics and Bioinformatics Unit (UEB) & High Technology Unit (UAT), Vall d'Hebron Research Institute (www.vhir.org), Barcelona.


Page 1: NGS Applications II (UEB-UAT Bioinformatics Course - Session 2.1.3 - VHIR, Barcelona)

Vall d’Hebron Institut de Recerca (VHIR)

Rosa Prieto
Head of the High Tech Unit

[email protected]

15/05/2014

Health Research Institute accredited by the Instituto de Salud Carlos III (ISCIII)

NEXT GENERATION SEQUENCING TECHNOLOGIES AND APPLICATIONS

COURSE ON BIOINFORMATICS FOR BIOMEDICAL RESEARCH

Page 2

Index

1. INTRODUCTION TO NGS
2. NGS TECHNOLOGY OVERVIEW
3. NGS APPLICATIONS OVERVIEW
4. WHAT IS NEXT IN SEQUENCING TECHNOLOGIES?

Page 3

NGS applications

- Amplicon sequencing
- Targeted DNA resequencing
- Exome sequencing
- Whole genome sequencing
- Metagenomics
- RNA sequencing
- Targeted RNA resequencing
- Epigenomics
- Sequencing of free DNA/RNA (plasma/serum)

Page 4

Metagenomics

Metagenomics is the study of a collection of genetic material (genomes) from a mixed community of organisms. The term usually refers to the study of microbial communities.

What can we study?

• The biosphere contains between 10^30 and 10^31 microbial genomes, at least 2–3 orders of magnitude more than the number of plant and animal cells combined.
• Microbes associated with the human body outnumber human cells by at least a factor of ten.
• The vast majority cannot be cultured.

Page 5

Types of metagenomics studies using NGS

- Population screening and diversity
- Genome assembly
- Gene prediction and annotation
- Functional genomics
- Ecology
- Taxonomy

(16S rRNA)

The 16S rRNA gene comprises highly conserved regions interspersed with more variable regions, allowing PCR primers to be designed that are complementary to universally conserved regions flanking variable regions.
Wu et al. BMC Microbiol. 2010; 10: 206.

Unidirectional sequencing
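The primer logic for 16S amplicons (primers on conserved regions, variable region in between) can be illustrated with a minimal sketch. The template and both primers below are short made-up sequences, not real 16S primers, and the function names are hypothetical; the only real convention used is the IUPAC code for degenerate bases.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Translate a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement of a sequence written with IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extract_variable_region(template, forward, reverse):
    """Return the sequence between the two primer sites, or None.

    The forward primer is searched as written; the reverse primer binds
    the opposite strand, so its reverse complement is searched downstream.
    """
    fwd = re.search(primer_to_regex(forward), template)
    if fwd is None:
        return None
    downstream = template[fwd.end():]
    rev = re.search(primer_to_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    return downstream[:rev.start()]


# Toy template: flank + conserved site + variable region + conserved site + flank.
template = "CCTA" + "ACGTGG" + "ATATCCGA" + "GGATTC" + "TTAG"
region = extract_variable_region(template, forward="ACGYGG", reverse="GAATCS")
print(region)
```

Because the primers match only the conserved flanks, the same pair would recover a different variable region from a different organism, which is what makes the amplicon usable for taxonomy.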

Page 6

Types of metagenomics studies using NGS

Sampling and pyrosequencing methods for characterizing bacterial communities in the human gut using 16S sequence tags.

Wu et al. BMC Microbiol. 2010; 10: 206.

This is a study of methods for surveying bacterial communities in human feces using 454/Roche pyrosequencing of 16S rRNA gene tags.

Comparison of different methods of sample storage (no effect), DNA extraction and purification (great effect), set of primers for amplification of several variable regions (effect) and GS FLX vs. GS FLX Titanium sequencing (no effect).

Composition of the gut microbiome in the ten subjects studied.

We did find that the choice of 16S rRNA gene region used for analysis had a noticeable effect, with the V6-V9 region representing an outlier. The V6-V9 primers consistently showed the lowest percentage of taxonomic assignments at the genus level. We note that our choice of V6-V9 primer and sequencing direction did not cover the V6 regions efficiently.

Page 7

NIH Human Microbiome Project

“our other genome”

Page 8

MetaHIT Project

• To establish associations between the genes of the human intestinal microbiota and our health and disease.
• Focused on two disorders of increasing importance in Europe: Inflammatory Bowel Disease (IBD) and obesity.

Intestinal microbiota deep-sequencing for patient stratification:
• rich microbiota
• poor microbiota (obesity, metabolic disturbance, weight increase)

The obese individuals among the lower bacterial richness group also gain more weight over time. Only a few bacterial species are sufficient to distinguish between individuals with high and low bacterial richness, and even between lean and obese participants. Our classifications based on variation in the gut microbiome identify subsets of individuals in the general white adult population who may be at increased risk of progressing to adiposity-associated co-morbidities.
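The rich/poor stratification can be pictured with a small sketch. The sample names, read counts and threshold below are invented for illustration, and richness is simplified here to the number of taxa observed; the actual project worked from deep-sequencing gene counts with its own cut-offs, which are not reproduced here.

```python
def richness(abundances, min_count=1):
    """Number of taxa observed with at least min_count reads."""
    return sum(1 for count in abundances.values() if count >= min_count)


def stratify(samples, threshold):
    """Label each sample 'rich' or 'poor' by comparing richness to a threshold."""
    return {
        name: "rich" if richness(abundances) >= threshold else "poor"
        for name, abundances in samples.items()
    }


# Hypothetical read counts per taxon for two subjects.
samples = {
    "subject_A": {"sp1": 120, "sp2": 40, "sp3": 7, "sp4": 1},
    "subject_B": {"sp1": 300, "sp2": 0, "sp3": 2, "sp4": 0},
}
labels = stratify(samples, threshold=3)
print(labels)
```

In practice the threshold would have to come from the distribution of richness across the whole cohort rather than being fixed by hand.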

Page 9

The first Genomics technique: microarrays

PRE-GENOMICS ERA: one gene at a time

GENOMICS ERA: many genes at the same time

Description of two-colour arrays

Page 10

What is a microarray?

SOLID SURFACE

PROBES

SAMPLE (TARGET)

Fluorescence scanning

Image analysis

Raw data

Page 11

RNAseq vs microarrays for transcriptome analysis

• Much more sensitive than microarrays
• Higher dynamic range
• Real count of sequences vs. fluorescence intensities
• All RNA species can be sequenced (microarray probes are more focused on coding genes)
• Available for all kinds of organisms
• Protocols optimized for very low input
• Cost is falling rapidly

500 pg total RNA, 100 pg total RNA (Illumina), 10 pg (ultralow Illumina), 500 pg (Roche)

Wang et al., Nat. Rev. Genetics 10 (2009)
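Because RNAseq yields real sequence counts, samples sequenced to different depths have to be put on a common scale before comparison. A minimal sketch of one common scaling, counts per million (CPM), follows; CPM is not named in the slides, and the gene names and counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Hypothetical raw counts for one small library (10,000 reads in total).
sample = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(sample)
print(cpm)
```

CPM corrects only for library size; it does not account for gene length or composition effects.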

Page 12

RNAseq library construction

Very high dynamic range (10^5 to 10^7)

Page 13

Total RNAseq

More than 95% of the transcripts will be ribosomal.

Nat. Rev. Genetics 2009
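The cost of sequencing total RNA without rRNA removal can be worked out with simple arithmetic. The sketch below assumes that ribosomal reads are simply wasted and that the remaining reads are all informative; the target of 10 million informative reads is a hypothetical figure.

```python
def total_reads_needed(informative_reads, rrna_percent):
    """Reads to sequence so that the non-ribosomal share reaches the target.

    Integer arithmetic with ceiling division avoids floating-point rounding.
    """
    if not 0 <= rrna_percent < 100:
        raise ValueError("rrna_percent must be in [0, 100)")
    informative_percent = 100 - rrna_percent
    return -((-informative_reads * 100) // informative_percent)


target = 10_000_000  # hypothetical number of non-ribosomal reads wanted
with_rrna = total_reads_needed(target, rrna_percent=95)
without_rrna = total_reads_needed(target, rrna_percent=0)
print(with_rrna, without_rrna, with_rrna // without_rrna)
```

With 95% ribosomal reads only one read in twenty is informative, so about twenty times more sequencing is needed for the same coverage of the remaining transcripts.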

Page 14

Other kinds of RNA libraries

• Poly A+ selection for mRNAseq: first-strand synthesis is done on oligo-dT attached to magnetic beads.

PROs: very effective at removing ribosomal species. Less sequencing is required for the same coverage compared to total RNA.

CONs: RNA quality is an issue (degraded RNA makes it difficult to sequence the 5’ end). Many RNA species are lost (non-coding, miRNA…).

• Standard library construction does not preserve directionality (but protocols are available to generate libraries that do preserve strandedness). Stranded libraries may be particularly useful for finding unannotated genes and ncRNAs and for de novo sequencing.

• Small RNAseq requires specific isolation and RNA library construction protocols.

• FFPE or very poor quality samples can also be sequenced using specific kits and protocols that do not rely on polyA tails.

• Illumina and Ion Torrent sell specific kits for all these kinds of RNA libraries.
• Targeted RNA custom panels also exist.

Page 15

Third generation sequencing: PacBio RSII

•AMPLIFICATION OF SAMPLE IS NOT REQUIRED (LOW INPUT, AVOIDS BIAS, MORE UNIFORM COVERAGE, ANALYSIS OF HETEROGENEOUS SAMPLES)

•SMRT Technology (Single Molecule Real Time): a highly processive DNA polymerase plus fluorescently labeled, phospholinked nucleotides, recorded in real time → direct observation of nucleotide incorporation

•Long reads (6-10 kb), with a small number of reads up to 18 kb

•Single reads show a very high error rate (15%, compared to 0.1-1% for other platforms), but the errors are stochastic, so accuracy is improved by circular consensus sequencing (high-quality consensus sequence)

•Quick delivery of results (runs last from 30 min to 3 hr)

•No problem with GC-rich regions. The modification status of the template nucleotides (5-mC, 5-hmC) can be detected
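Why stochastic errors can be averaged out by circular consensus can be seen with a simplified calculation. The sketch below assumes independent errors at each pass and a plain majority vote, which is an illustrative simplification and not the consensus algorithm actually used by the platform; the function name is hypothetical.

```python
from math import comb


def majority_error_bound(per_pass_error, passes):
    """Upper bound on the per-base error of a majority-vote consensus.

    It is the probability that more than half of the passes are wrong at a
    given position, assuming independent errors (a simplification).
    """
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use an odd number of passes to avoid ties")
    wrong_needed = passes // 2 + 1
    return sum(
        comb(passes, k)
        * per_pass_error ** k
        * (1 - per_pass_error) ** (passes - k)
        for k in range(wrong_needed, passes + 1)
    )


# 15% single-pass error, as quoted on the slide.
for n in (1, 5, 9, 15):
    print(n, majority_error_bound(0.15, n))
```

The bound shrinks quickly as passes are added, which is only possible because the errors are random; a systematic error would be repeated in every pass and survive the vote.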

http://smrt.med.cornell.edu/Strategies.html

2016: end of 454 commercialization and support by Roche

https://ncifrederick.cancer.gov/atp/cms/wp-content/uploads/2011/10/pacbio_technology_backgrounder.pdf

Page 16

Oxford Nanopore Technologies

https://www.nanoporetech.com/technology/the-minion-device-a-miniaturised-sensing-system/the-minion-device-a-miniaturised-sensing-system

Third generation sequencing: nanopore technology

https://www.nanoporetech.com/technology/introduction-to-nanopore-sensing/introduction-to-nanopore-sensing

GridION

Expected to be released in late Nov. 2014

Page 17

1000$ genome for everybody

??

• 18 Tb/run, 2x150 bp read length
• Human sequencing only
• Bioinformatics/interpretation not included

In:
- Macrogen (Seoul)
- Broad Institute in Cambridge (Massachusetts)
- Garvan Institute (Sydney)

Human genomes at 30x coverage

2012

2014
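The run output and the 30x coverage figure can be combined into a rough throughput estimate. The sketch below assumes a human genome of about 3.2 Gb, a figure not given in the slides, and ignores read duplication, quality filtering and unmappable reads, so it is an upper estimate.

```python
def genomes_per_run(run_output_bases, genome_size_bases, coverage):
    """Whole genomes that fit in one run at the requested mean coverage."""
    bases_per_genome = genome_size_bases * coverage
    return run_output_bases // bases_per_genome


RUN_OUTPUT = 18 * 10**12      # 18 Tb per run, from the slide
GENOME_SIZE = 3_200_000_000   # assumption: ~3.2 Gb human genome
COVERAGE = 30                 # 30x, from the slide

n_genomes = genomes_per_run(RUN_OUTPUT, GENOME_SIZE, COVERAGE)
print(n_genomes)
```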

Page 18

1000$ genome for everybody

Page 19

And now… what?

-Sequencing capacity has increased dramatically, so obtaining terabases (Tb) of sequence is no longer an issue.

-Issues to deal with:

Data management

Clinical information

Page 20

VHIR’s HIGH TECHNOLOGY UNIT (UAT)

We offer a set of high-tech services that support teaching and research activities in the biomedical field:

• Genomics
• Metabolomics
• Cytomics
• Microscopy
• Statistics and Bioinformatics Unit

Unitat d’Alta Tecnologia (UAT)
VHIR, Mediterrània Building, ground floor

[email protected]