CAG EMEAI | Agilent Restricted | Page 1 Life Sciences & Diagnostics Group | Agilent Technologies | Page 1 S1

Back to the Basics:

Methyl-Seq 101

Presented By: Alex Siebold, Ph.D., Field Applications Scientist

Agilent Technologies, Life Sciences & Diagnostics Group

October 9, 2013

Back to the Basics: Agilent’s Five Part 101 eSeminar Series

Event: NGS Data Analysis 101
Date & Time: Thu, Oct 10, 1 pm ET
Speaker: Jean Jasinski, PhD, Field Application Scientist
Topics:
• Analysis Workflows, File Formats, and Data Filtering
• DNA-Seq vs. RNA-Seq Considerations
• Integrating Disparate Data Sets to Create a More Complete Story

Event: NGS Panels 101
Date & Time: Fri, Oct 11, 1 pm ET
Speaker: Adam Hauge, University of Minnesota
Topics:
• Panel Design Process
• Quality at the Bench: Tips, Tricks, and Lessons Learned
• Considerations for Future Panels

Topics for Today’s Presentation

1 Epigenetics & DNA Methylation
2 Technology Behind SureSelectXT
3 SureSelectXT Human Methyl-seq
4 Comparing DNA Methylation Methods
5 GeneSpring & Additional TE Solutions
6 Summary & Upcoming 101 eSeminars

Not approved for use in diagnostic procedures

Defining Epigenetics & Epigenetic Protein Functions

• Epigenetics: Studies changes in gene expression caused by mechanisms that do not affect the underlying DNA sequence

- ex: DNA Methylation and Covalent Modification of Histone Tails

DNMT1 (DNA Methyltransferase)
MLL (Histone Methyltransferase)
Methyl CpG Binding Proteins (MBDs)
CBX7 (Chromodomain)
BRD4 (Bromodomain)
DNA Demethylation
HDACs (Histone deacetylases)
UTX (Histone demethylase)


DNA Methylation: An Epigenetic Modification

• Found in animals, plants, bacteria, fungi, etc…

• Initially thought to be a static modification, but is dynamic

• Important in Development and Cellular Differentiation

• Promotes Gene Silencing

• Aberrant DNA methylation contributes to a host of diseases, including cancer

CpG Dinucleotides & Their Genomic Locations

CpG islands

• High frequency of CpG dinucleotides

– > 500bp & GC content >55% & observed/expected CpG ratio > 0.65

• In or near about 40% of promoters of mammalian genes

Promoters

• 75% of transcriptional start sites have CpG-rich regions

• 88% of active promoters are associated with CpG-rich sequences
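The three island thresholds on this slide can be checked directly on a sequence. The sketch below is only an illustration, not the method used by any particular annotation pipeline: it assumes the commonly used observed/expected definition, (number of CpGs x sequence length) / (number of Cs x number of Gs), and the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed definition: observed CpGs relative to the number expected
    # from the base composition of the sequence.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """Apply the three thresholds quoted on the slide (all strict inequalities)."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n > min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice these criteria are applied in sliding windows across a genome; the sketch only evaluates a single candidate region.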


Differentially Methylated Regions (DMRs)

• CpG islands

– 4-8% tissue-specific differentially methylated regions (T-DMRs)

• CpG island shores

– ~2kb away from islands, 76% of T-DMRs in shores

• CpG island shelves

– ~4 kb away from islands

Irizarry RA et al. Nature Genetics 2009

HS3ST4: heparan sulfate D-glucosaminyl 3-O-sulfotransferase 4
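The island, shore, and shelf distances above can be turned into a simple positional classifier. This is a minimal sketch under stated assumptions: shores are taken as everything within 2 kb of an island edge and shelves as the next 2 kb out to 4 kb, coordinates are inclusive, and the function name and the "other" label are hypothetical.

```python
def classify_cpg_context(pos, island_start, island_end,
                         shore_bp=2000, shelf_bp=4000):
    """Label a genomic position relative to one annotated CpG island."""
    if island_start <= pos <= island_end:
        return "island"
    # Distance to the nearest island edge
    if pos < island_start:
        dist = island_start - pos
    else:
        dist = pos - island_end
    if dist <= shore_bp:
        return "shore"
    if dist <= shelf_bp:
        return "shelf"
    return "other"
```

For example, with an island at 1000-2000, position 3500 lies 1500 bp from the island edge and would be labeled a shore.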

The Study of Epigenetics & How it Relates to the Clinic

• DNA Methylation has a rapidly growing role in cancers and other diseases

• Pharma & Biotech are actively seeking new small molecules that inhibit certain classes of epigenetic enzymes

FDA Approved Epigenetic Therapies

Vidaza & Dacogen: Inhibit DNA Methyltransferases

Vorinostat & Romidepsin: Inhibit Histone Deacetylases

Impact of Adding Methylation Assays to Your Current Research

• Better understand the link between environment and genetics

• Obtain the complete picture of a gene and its impact on the pathogenesis of cancer or other disease state


The Power Behind the Performance

Agilent’s Core Competency: Oligo Library Synthesis (OLS)

HP Inkjet Printer Technology + Proprietary Chemistry = Long High-Quality Oligos (SurePrint Technology)

Agilent’s SureSelect Platform Utilizes 120bp Oligos

Target Enrichment: It’s just like fishing…

Why perform target enrichment?

1. Sequence only your desired regions of interest (exons, gene panels, intergenic regions, etc.)

2. Sequence more samples per lane/run (i.e., multiplex)

3. Save time and money

4. Smaller datasets = Faster time to results

5. Identify variants in samples with increased reliability and accuracy: More reads in regions of interest = Higher depth of coverage

SureSelect: In-Solution Target Enrichment (SureSelect DNA)

Nat Biotechnol. 2009 Feb;27(2):182-9.

Target Enrichment Maximizes Your Sequencing Efficiency

Genome Size x Desired Depth of Coverage = Required Seq Depth/Sample

Human Genome: 3Gb x 30 = 90Gb
Illumina HiSeq 2000: 37Gb/lane
~3 lanes per sample! $$$$$

Target Size x Desired Depth of Coverage = Required Seq Depth/Sample

Target = 50Mb x 100 = 5Gb
Target = 5Mb x 100 = 500Mb
Target = 500Kb x 100 = 50Mb
Target = 50Kb x 100 = 5Mb

Develop designs/panels for any sequencing capacity: High Throughput or Desktop
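The arithmetic on these two slides can be reproduced with a few lines of code. The sketch below is illustrative only: it uses the 37 Gb/lane figure quoted for the HiSeq 2000, assumes every sequenced base lands on target (real runs lose some yield to off-target and duplicate reads), and all function and constant names are hypothetical.

```python
HISEQ_2000_LANE_BP = 37_000_000_000  # 37 Gb per lane, as quoted on the slide


def required_bases(region_size_bp, depth):
    """Region size x desired depth of coverage = required sequencing per sample."""
    return region_size_bp * depth


def lanes_per_sample(region_size_bp, depth, lane_capacity_bp=HISEQ_2000_LANE_BP):
    """Whole lanes needed for one sample (ceiling division on integers)."""
    need = required_bases(region_size_bp, depth)
    return -(-need // lane_capacity_bp)


def samples_per_lane(region_size_bp, depth, lane_capacity_bp=HISEQ_2000_LANE_BP):
    """How many samples fit in one lane at the requested depth."""
    return lane_capacity_bp // required_bases(region_size_bp, depth)


# Whole human genome at 30x versus a 50 Mb target at 100x
whole_genome_lanes = lanes_per_sample(3_000_000_000, 30)
exome_scale_samples = samples_per_lane(50_000_000, 100)
```

Under these idealized assumptions the whole genome needs three lanes per sample, while seven samples with a 50 Mb target at 100x would fit in a single lane.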


SureSelectXT Human Methyl-Seq

Discovery Tool
• Not methylation-state dependent
• No prior knowledge needed

Comprehensive design
• Not limited to CpG Islands
• Comprehensive Content
• CpG Islands, Promoters and DMRs

DESIGN CONTENT - 84 Mb Design, 3.7M CpGs
• CpG islands
• Cancer, Tissue-specific DMRs
• GENCODE promoters
• DMRs or regulatory features in:
  - CpG Islands, shores and shelves ±4kb
  - DNase I hypersensitive sites
  - RefSeq Genes
  - Ensembl Regulatory Features


SureSelectXT Human Methyl-Seq Bait design


SureSelectXT Methyl-Seq Workflow

1. DNA
2. Shearing, End Repair & ‘A’ addition
3. mAdapter ligation
4. SureSelect Hyb (24hr)
5. Bisulfite treatment
6. Library Quant and PCR
7. Sequence

• Bisulfite treatment is performed after hybridization to maximize sample complexity

• No PCR before Bisulfite treatment to preserve the Methylation state
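The reason PCR must come after bisulfite treatment can be seen in a small in-silico model. Bisulfite converts unmethylated cytosines to uracil (read as T after amplification), while methylated cytosines are protected. The sketch below is a simplified single-strand illustration, assuming complete conversion; the function name and the use of 0-based positions are hypothetical choices.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated C is reported as T; methylated C (0-based positions given
    in methylated_positions) stays C. Assumes 100% conversion efficiency.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Native DNA: the first CpG (position 1) is methylated, the second is not
native = bisulfite_convert("ACGTCG", {1})

# PCR copies carry no methyl groups, so amplifying first erases the signal
after_pcr = bisulfite_convert("ACGTCG", set())
```

Here `native` keeps the C at the methylated site, whereas `after_pcr` shows every C converted, which is why methylation state would be lost if amplification preceded bisulfite treatment.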


Proof of Concept: HCT116 vs Methyltransferase DKO

Highly sensitive and accurate methylation detection after SureSelect target enrichment demonstrated DNA methylation differences between HCT116 human colon cancer cells and their methyltransferase double-knockout derivative (DNMT1-/- and DNMT3b-/-).

Proof of Concept: Methylation of Metalloproteinase inhibitor 3

Proof of Concept: Identifying Tissue Specific DMRs

SureSelectXT Methyl-Seq Capture Performance

Percentage reads in targeted regions: 82.0%

Percentage reads in regions +/- 100bp: 93.6%

Percent of genome targeted: 2.7%

Enrichment in targeted regions: 30.07

Uniformity (3/4 mean with upper tail): 91.4%

Number of bases in targeted regions: 84,367,621

Percentage of targeted bases covered by...

...at least 1 read: 98.7%

...at least 10 reads: 91.4%

...at least 20 reads: 78.9%

10Gb Sequencing per Sample
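Two of these metrics can be approximated from the others. The sketch below is a rough check, not Agilent's calculation: it assumes a genome size of 3.1 Gb (an assumption, not stated on the slide), ignores duplicate removal and read filtering, and uses hypothetical function names. The slide's 30.07 presumably comes from unrounded inputs, so the result is expected to be close rather than identical.

```python
def fold_enrichment(on_target_fraction, target_bp, genome_bp):
    """Fraction of reads on target divided by the fraction of the genome targeted."""
    return on_target_fraction / (target_bp / genome_bp)


def mean_on_target_depth(sequenced_bp, on_target_fraction, target_bp):
    """Approximate average coverage over the targeted bases."""
    return sequenced_bp * on_target_fraction / target_bp


TARGET_BP = 84_367_621        # number of bases in targeted regions (from the slide)
GENOME_BP = 3_100_000_000     # assumed human genome size

enrichment = fold_enrichment(0.820, TARGET_BP, GENOME_BP)
depth = mean_on_target_depth(10_000_000_000, 0.820, TARGET_BP)
```

With 10 Gb of sequencing per sample, this estimate puts the mean on-target depth a little under 100x, which is in line with most targeted bases being covered by at least 10 reads.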


Comparative Publication Of Methyl Assays:

Nature Biotech. (28), pp: 1026–1028: (2010)


Whole Genome Bisulfite Sequencing (WGBS)

• Whole Genome Coverage

• But…costly and time consuming

• Requires extensive bioinformatics

• Limited scalability per run

MeDIP-Seq & Reduced Representation Bisulfite Sequencing (RRBS)

Limitations

• Difficult to target specific regions (i.e., DMRs in shelf and shore regions)
• Biased towards methylated regions, repeat sequences & CpG-rich sequences
• Can miss under-methylated regions
• Difficult to design since knowledge of methylation state for the target region is needed

Microarray-Based Method: Infinium 450K Array

• Improved cost and throughput vs. whole-genome bisulfite sequencing

• Not whole-genome (~480K Features)

• Does not cover shores and shelves known to house important DMRs

• CpG Rich Regions are hard to resolve at single base resolution

• What are you missing?

SureSelectXT Methyl-Seq

What if you could:
• Reduce cost/sample
• Interrogate regions more efficiently than WGBS
• Confidently identify both hyper- and hypomethylated regions in a single experiment
• Have increased scalability
• Achieve high depth of coverage to confidently make methyl calls
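To make the link between depth and confidence concrete, the sketch below computes a per-CpG methylation level from bisulfite read counts (C reads = methylated, T reads = unmethylated) and a Wilson score interval around it. This is one common way to express uncertainty and is an assumption here, not a method described in the presentation; the function names are hypothetical.

```python
import math


def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one CpG."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this CpG")
    return c_reads / total


def wilson_interval(c_reads, t_reads, z=1.96):
    """Approximate 95% Wilson score interval for the methylation level."""
    n = c_reads + t_reads
    p = methylation_level(c_reads, t_reads)
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Same 80% methylation estimate at 10x and at 100x coverage
low_10x, high_10x = wilson_interval(8, 2)
low_100x, high_100x = wilson_interval(80, 20)
```

The interval at 100x is several times narrower than at 10x, which is the practical meaning of "high depth of coverage to confidently make methyl calls".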

Methyl-Seq Comparison with published WGBS data

Cell line: IMR90 (female lung fibroblast)

SureSelect Methyl-seq vs. Whole-Genome Bisulfite Sequencing

Concordance between WGBS data & SureSelect Methyl-Seq

R = 0.927

Excellent concordance with whole genome bisulfite sequencing data.
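A concordance value like this is typically obtained by correlating per-CpG methylation levels measured by the two methods at shared sites. The sketch below assumes the reported R is a Pearson correlation coefficient (the slide does not say) and uses made-up values purely for illustration; the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of methylation levels."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical methylation levels at five CpGs covered by both methods
wgbs_levels = [0.05, 0.20, 0.55, 0.80, 0.95]
capture_levels = [0.10, 0.15, 0.60, 0.75, 0.90]
r = pearson_r(wgbs_levels, capture_levels)
```

Only CpGs with adequate coverage in both data sets would normally be included, since low-depth sites add noise to the comparison.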


Excellent Reproducibility

Excellent reproducibility for IMR90 Replicate 1 vs. 2

SureSelectXT – Complete Workflow Solution

SureSelectXT Human Methyl-Seq: Library Prep, Target Enrich., Indexes


Integrated Biology Applications in GeneSpring

Including Methyl-Seq Analysis!


• Each Kit has a unique ELID identifying number

• All kit design files can be easily retrieved from Agilent SureDesign

• Access SureDesign to create your own custom captures for FREE!

Additional Target Enrichment Solutions: SureSelectXT (SureSelect DNA & RNA)

Targets | Kits | Species | Capture
Exomes | 50Mb (V3), V4, V4+UTR, V5, V5+UTR (NEW!) | Human, Mouse, Zebrafish, bovine, dog | DNA-Seq
DNA Methylation | Human Methyl-Seq (84Mb), Mouse Methyl-Seq (100Mb) | Human, Mouse (EA) | DNA-Seq
Kinome | DNA or RNA, ~3 Mb | Human | DNA or RNA-Seq
Custom Regions | >0.2 Mb to 34 Mb | Any | DNA or RNA-Seq
Focused Panels | >0.2 Mb to 34 Mb | Any | DNA or RNA-Seq


SureSelectXT Human Methyl-Seq Summary

• Covers more individual relevant CpGs compared to methylation microarrays
• Reveals methylated regions undetected by RRBS and MeDIP
• Increases throughput, reduces costs compared to WGBS and allows better usage of sequence capacity
• Produces higher quality data than WGBS, deeper reads, faster analysis
• Identification of DMRs in relevant genes can lead to discovery of useful biomarkers

A Beginner’s Glossary for Methyl-seq:

Walk the walk, talk the talk

1. Epigenetics – The study of changes in gene expression that are caused by mechanisms that do not affect the underlying DNA sequence. Examples include covalent modification of histone tails and the methylation of DNA.

2. Epigenetic Writers – Individual enzymes or protein complexes that facilitate the establishment of covalent modifications to DNA or histones. Examples include DNA methyltransferase and histone methyltransferase.

3. Epigenetic Readers – Proteins that identify specific epigenetic marks and either directly bind to or recruit proteins to bind to them in order to modulate gene expression. Examples include methyl CpG binding proteins or members of the Polycomb and Trithorax group proteins.

4. Epigenetic Erasers – Proteins that can remove covalent modifications to DNA and histones.

5. CpGs – Sites in the genome where a cytosine is immediately followed by a guanine along the linear DNA sequence. The “p” in CpG stands for the phosphate linking the two nucleotides, which means the cytosine occurs 5’ of the guanine on the same strand. This nomenclature prevents confusion with cytosine-guanine Watson-Crick base pairs between strands, which are not what is meant by a site of DNA methylation.

6. CpG Islands – Regions of the genome, typically >500bp, that contain a high density of CpG dinucleotide sequences.

7. CpG Island Shores – Term describing the regions of differentially methylated CpG dinucleotides that occur approximately 2 kb away from annotated CpG islands.

8. CpG Island Shelves – Similar to CpG shores, but found even further from annotated CpG islands in the genome, approximately 4 kb away.

9. DMRs – Differentially Methylated Regions of the genome.

Agilent Technologies Knows NGS 101 & More…

Offering Complete Solutions for NGS Workflows

The Gold Standard for Sample QC

2100 Bioanalyzer Instrument & Kits

2200 TapeStation Instrument & Kits

NGS Analysis Software

GeneSpring NGS

SureCall

Validation Technologies

qPCR- Mx system & Brilliant reagents

Microarrays- CGH, CGH+SNP,

Gene Expression & miRNA

The Leader in NGS Target Enrichment

SureDesign

SureSelect

HaloPlex

Bravo Automation


Contact Us

800.227.9770

[email protected]

www.agilent.com/genomics