Next-generation sequencing technologies

Next-generation sequencing technologies · 2017. 7. 14. · paired-end sequencing, producing sequence data from both ends of each library fragment. Read pairs can be obtained by one

Next-generation sequencing technologies

NGS applications

Illumina sequencing workflow

Overview

NGS

Short-read NGS

Long-read NGS

Sequencing by ligation

Sequencing by synthesis

Single-molecule approach

Synthetic approach

Illumina

General principles of short-read NGS

Construct a library of fragments

Generate clonal template populations

Massively parallel DNA sequencing reactions

Analyze data

Library preparation

• Prepares sample nucleic acids for sequencing

• Generates double-stranded DNA flanked by Illumina adapters

• Generates the same general template structure, but variables include:
    - Insert size
    - Adapter type
    - Index for multiplexing

Library preparation: Overview

Starting material: purified genomic DNA

1. Fragment DNA (yields fragments < 800 bp)

2. Repair ends (yields blunt-end fragments with 5'-phosphorylated ends)

3. Add an "A" to the 3' ends

4. Ligate paired-end adapters

5. Size-select on gel (yields 300-600 bp fragments)

6. PCR (yields amplified DNA with adapters)

7. QC library (yields the genomic DNA library)

Library preparation: Fragmentation

The size of the target DNA fragments in the final library is a key parameter for NGS library construction.

Optimal library size is affected by:

1. The process of cluster generation: short products amplify more efficiently than longer products, and longer library inserts generate larger, more diffuse clusters than short inserts.

2. The sequencing application: for example, 2×100 paired-end (PE) reads suit exome sequencing because more than 80% of human exons are shorter than 200 bp.

Three approaches are available to fragment nucleic acids:

1. Physical: acoustic shearing and sonication; the main method for genomic DNA.

2. Enzymatic: non-specific endonuclease cocktails or transposase tagmentation; produces more artifactual indels than the physical method but reduces sample handling and preparation time.

3. Chemical: heat plus a divalent metal cation; reserved for mRNA.

Library preparation: Repair Ends

Library preparation: A-tailing

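• To facilitate ligation to the sequencing adapter

• To prevent self-ligation between blunt-ended template molecules (concatemers) or between adapters (adapter dimers)

[Figure: A-tailed fragments (A, 5' phosphate) ligate to T-tailed adapters; fragment-to-fragment and adapter-to-adapter ligations are marked ×]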

Library preparation: Adapter ligation

Library preparation: Y-shaped adapters

[Figure: Y-shaped adapters compared with non-Y-shaped adapters]

Library preparation: Size-select on Gel

300bp area excised

600bp area excised

Library preparation: PCR

• Selectively enrich DNA fragments with adapters on both ends

• Amplify the amount of DNA in the library

Library preparation: QC Library

QC by Agilent Bioanalyzer: confirms fragment size and visualizes unwanted products

[Bioanalyzer trace: lower marker at 15 bp, upper marker at 1500 bp]

General principles of short-read NGS

Construct a library of fragments

Generate clonal template populations

Massively parallel DNA sequencing reactions

Analyze data

Cluster amplification: Flow cells

• Adapter-ligated library fragments hybridize to complementary oligonucleotides on the surface of a flow cell. Each attached library fragment acts as a seed and is amplified to generate a clonal cluster containing thousands of identical fragments.

• Ideally, clusters are of similar size and spaced well apart to achieve accurate resolution during imaging. In reality, DNA clusters are randomly distributed across the flow cell. If the sample is overloaded, many clusters lie close to their neighbors, which makes it difficult to distinguish individual clusters from each other and reduces the amount of information generated during the run.

Cluster amplification: Patterned flow cells

• Patterned flow cell technology provides even cluster spacing and uniform feature size to deliver extremely high cluster densities.

• Clusters can only form in the nanowells, allowing accurate resolution of clusters during imaging.

Cluster amplification

Cluster amplification: Hybridization and extension

Cluster amplification: Denaturation

Cluster amplification: Anchor the template to the surface

Cluster amplification: Bridge amplification

Cluster amplification: Denaturation

Cluster amplification: Bridge amplification

Cluster amplification: P5 Linearization

Cluster amplification: Blocking

Cluster amplification: Read1 sequencing

General principles of short-read NGS

Construct a library of fragments

Generate clonal template populations

Massively parallel DNA sequencing reactions

Analyze data

Sequencing by synthesis

Single read, paired-end and read lengths

• Program the system to sequence a specific number of bases (1-600 bases)

• Sequence the strands from both directions to achieve a total of e.g. 600 bases (2×300 bases)
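
The relationship between read length and insert size in a paired-end run can be sketched as follows. This is a minimal illustration, assuming both reads start at opposite ends of the insert and read inward; the function name `paired_end_layout` and the example insert sizes are hypothetical and do not come from the slides.

```python
def paired_end_layout(read_length, insert_size):
    """Describe how a 2 x read_length paired-end run covers one insert.

    Assumption: read 1 and read 2 start at opposite ends of the insert
    and extend toward each other.
    """
    # A read cannot be longer than the insert itself.
    r = min(read_length, insert_size)
    overlap = max(0, 2 * r - insert_size)  # bases sequenced by both reads
    gap = max(0, insert_size - 2 * r)      # middle bases left unsequenced
    covered = insert_size - gap            # distinct insert bases sequenced
    return {
        "total_read_bases": 2 * r,
        "overlap": overlap,
        "gap": gap,
        "covered": covered,
    }


# 2x300 run on a hypothetical 500 bp insert: the reads overlap in the middle.
print(paired_end_layout(300, 500))
# 2x100 run on a hypothetical 350 bp insert: the middle is not sequenced.
print(paired_end_layout(100, 350))
```

A 2×300 run yields 600 read bases per fragment, but the number of distinct insert bases observed depends on whether the insert is shorter or longer than the combined read length.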

Paired-end sequencing

Longer read lengths improve 1) the overall length of contiguous sequence that can be assembled, and 2) the certainty of short read alignments.

Several next-generation sequencers have offered increases in read length over time. Another improvement has resulted from paired-end sequencing, producing sequence data from both ends of each library fragment. Read pairs can be obtained by one of two mechanisms: 1) paired ends or 2) mate pairs.

Paired-end sequencing: P7 linearization

Paired-end sequencing

(a) Paired-end
    • Fragment length: < 1000 bp
    • Advantage: higher alignment accuracy than a single-end read of the same length

(b) Mate-pair
    • Fragment length: > 1000 bp
    • Advantage: provides a scaffold for de novo sequencing through long-range order and orientation

Illumina: Summary

https://www.youtube.com/watch?v=fCd6B5HRaZ8

Illumina platforms: Benchtop sequencers

https://www.illumina.com/systems/sequencing-platforms.html

Illumina platforms: Production-scale sequencers

https://www.illumina.com/systems/sequencing-platforms.html

Choosing a library type

• Single-read library
    • Unidirectional sequencing
    • Compatible only with single-read flow cells
    • Applications: ChIP-seq, mRNA-seq for quantification, low-coverage resequencing

Choosing a library type

• Paired-end library
    • Unidirectional or bidirectional sequencing
    • Compatible with both single-read and paired-end flow cells
    • Applications: the most common library type; de novo assembly, structural variant detection, high-coverage resequencing

Choosing a library type

• Indexed libraries
    • Unidirectional or bidirectional sequencing
    • Allow multiple libraries per lane
    • Single-indexed libraries: add up to 48 unique 6-base index 1 (i7) sequences to generate up to 48 uniquely tagged libraries.
    • Dual-indexed libraries: add up to 24 unique 8-base index 1 (i7) sequences and up to 16 unique 8-base index 2 (i5) sequences to generate up to 384 uniquely tagged libraries.
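
The 384 figure for dual indexing is the number of distinct (i7, i5) pairings, 24 × 16. A minimal sketch of that arithmetic follows; the function name and the labels such as `i7_01` are hypothetical placeholders, not real Illumina index names or sequences.

```python
from itertools import product


def dual_index_combinations(n_i7, n_i5):
    """Return every (i7, i5) label pair for a dual-indexed design.

    Labels are placeholders for illustration only.
    """
    i7_labels = [f"i7_{k:02d}" for k in range(1, n_i7 + 1)]
    i5_labels = [f"i5_{k:02d}" for k in range(1, n_i5 + 1)]
    # Each library is tagged with one i7 and one i5 index.
    return list(product(i7_labels, i5_labels))


combos = dual_index_combinations(24, 16)
print(len(combos))  # 384
print(combos[0])    # ('i7_01', 'i5_01')
```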

Single-indexed sequencing

The single-indexed sequencing workflow applies to all Illumina sequencing platforms.

Dual-indexed sequencing on a paired-end flow cell

Dual-indexed sequencing includes 2 index reads.

Dual-indexed adapters

Dual-indexed sequencing: Workflow A

7 dark-cycles

Dual-indexed sequencing: Workflow B

Reads and coverage

• The number of reads covering a specific region is called its “depth” or “coverage”
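
As a rough illustration of this definition, depth can be computed by counting how many aligned reads overlap each position of a region. The sketch below assumes reads are given as 0-based, half-open `(start, end)` intervals on a single reference; the function names and the toy read coordinates are hypothetical.

```python
def per_base_depth(reads, region_start, region_end):
    """Count reads overlapping each position of [region_start, region_end).

    Assumption: each read is a 0-based, half-open (start, end) interval
    on the same reference sequence.
    """
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip the read to the region before counting.
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth


def mean_depth(reads, region_start, region_end):
    """Average per-base depth across the region."""
    depth = per_base_depth(reads, region_start, region_end)
    return sum(depth) / len(depth) if depth else 0.0


reads = [(0, 5), (3, 8), (4, 10)]    # toy alignments
print(per_base_depth(reads, 0, 10))  # [1, 1, 1, 2, 3, 2, 2, 2, 1, 1]
print(mean_depth(reads, 0, 10))      # 1.6
```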