18
Detection and inference of differentially methylated regions from bisulfite sequencing Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017

Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Detection and inference of differentially methylated regions from bisulfite

sequencing

Keegan Korthauer

@keegankorthauer

Bioconductor Meeting 2017

Page 2: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Differential methylation

Differential methylation commonly studied in –  Cancer –  Developmental stages / Aging –  Tissue types

Control

Disease

Unmethylated CpG

Methylated CpG

* * * *

Differentially Methylated Region (DMR)

Page 3: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Whole Genome Bisulfite Sequencing

Treat DNA with bisulfite before sequencing –  Methylated C -> unaffected –  Unmethylated C -> appear as T

GAGCGATGGATAGCG CGAGTGATGGATAGC GTTACGAGCGACGG TGGTTACGAGCGATG

...TCTCGGTTACGAGCGACGGATAGCG...

Bisulfite sequencing reads Reference genome

Page 4: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Whole Genome Bisulfite Sequencing

Treat DNA with bisulfite before sequencing –  Methylated C -> unaffected –  Unmethylated C -> appear as T

GAGCGATGGATAGCG CGAGTGATGGATAGC GTTACGAGCGACGG TGGTTACGAGCGATG

...TCTCGGTTACGAGCGACGGATAGCG...

Bisulfite sequencing reads Reference genome

3/3 1

3/4 0.75

1/4 0.25

0/1 0

Methylation Counts and Proportions

Page 5: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Methods for DMR detection

Challenges –  Have to handle small sample sizes ! –  Accommodate known sources of variability –  Detect region boundaries

Two main strategies –  Detect significant CpGs, then group together neighbors to

form regions (DSS, BSmooth) –  Targeted regions / sliding windows specified in advance

(BiSeq, MOABS)

Page 6: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Genomicloca+on

Methyla+o

nDiffe

rence

0

Danger of grouping significant CpGs

Page 7: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Genomicloca+on

Methyla+o

nDiffe

rence

0

TruePosi+veFalsePosi+veTrueNega+ve

Danger of grouping significant CpGs False Discovery Rate = # False Discoveries (FDR) Total # Discoveries = 2/8 = 0.25

Page 8: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Genomicloca+on

Methyla+o

nDiffe

rence

0

Danger of grouping significant CpGs

TruePosi+veFalsePosi+veTrueNega+ve

False Discovery Rate = # False Discoveries (FDR) Total # Discoveries = 1/2 = 0.5

Page 9: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Met

hyla

tion

0.2

0.5

0.8

Left VentricleSmall Intestine

chr5: 75,654,594 − 75,655,232 (width = 639), FDR: 0.0054, Stat: 23.4

Island Shore Shelf Open SeaCpGs

Exons

DMRs come in all shapes and sizes M

ethy

latio

n

0.2

0.5

0.8

Left VentricleSmall Intestine

chr16: 54,288,213 − 54,292,817 (width = 4,605), FDR: 0.0054, Stat: −23.678

Island Shore Shelf Open SeaCpGs

IRX3Exons

-> Detect region boundaries from the data

Page 10: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Our approach: dmrseq 1.  Detect regions by scanning the genome for candidate

regions

2.  Region-level summary scores compared to a null to evaluate significance

Diffe

rence

0Diffe

rence

0

? ?

Page 11: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Our approach: dmrseq 1.  Detect regions by scanning the genome for candidate

regions

2.  Region-level summary scores compared to a null to evaluate significance

Diffe

rence

0Diffe

rence

0

? ?

Page 12: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Our approach: dmrseq 1.  Detect regions by scanning the genome for candidate

regions

2.  Region-level summary scores compared to a null to evaluate significance

Diffe

rence

0Diffe

rence

0

? ?

Page 13: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Region level Summary Scores

–  Biological variability

among samples

–  Correlation among nearby CpGs

–  Higher variability in CpGs with lower coverage

Generalized Least Squares (GLS) regression

–  Nested autoregressive correlated error structure

–  Coverage-weighted variance

arcsin(2π ijr −1) = αlr1[i=1]l∑ +βrX j +εijr

Var(εijr ) =

Summary statistic that is approximately exchangeable across the genome so we can

generate a pooled null

Page 14: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

dmrseq accurately controls FDR

0.0

0.1

0.2

0.3

0.4

0.0 0.1 0.2 0.3 0.4Specified FDR level

Obse

rved

FDR

leve

l

(A) dmrseq

0.0

0.1

0.2

0.3

0.4

0.0 0.1 0.2 0.3 0.4Specified FDR level

(B) metilene

SimulationD2

D3

Page 15: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

dmrseq has high sensitivity

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00FDR

Powe

r

(A) Simulation D2

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00FDR

(B) Simulation D3

MethodBSmooth

dmrseq

DSS

metilene

Page 16: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

DMRs enriched for associations with differential expression

(A) Roadmap Tissue Comparisons

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Right Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Right Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Left Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Left Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Sigmoid Colon vs Small Intestine

(B) Murine Leukemia Models

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

AML vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

ALL vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

AML vs ALL

Method BSmooth dmrseq DSS metilene

(A) Roadmap Tissue Comparisons

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Right Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Right Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Left Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Left Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Sigmoid Colon vs Small Intestine

(B) Murine Leukemia Models

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

AML vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

ALL vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

AML vs ALL

Method BSmooth dmrseq DSS metilene

(A) Roadmap Tissue Comparisons

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Right Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Right Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Left Ventricle vs Small Intestine

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

Left Ventricle vs Sigmoid Colon

1

2

3

4

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

Sigmoid Colon vs Small Intestine

(B) Murine Leukemia Models

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

log2

odd

s

AML vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

ALL vs Control

0

1

2

3

0.01 0.10 0.25 0.50 0.75 1.00FDR threshold (square root scaled)

AML vs ALL

Method BSmooth dmrseq DSS metilene

Page 17: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

dmrseq Summary

-  dmrseq identifies and prioritizes DMRs from bisulfite sequencing experiments

-  Computes region summary statistics that account for known sources of variability across the genome

-  Achieves accurate False Discovery Rate control by generating a null distribution that pools information across the genome

Page 18: Detection and inference of differentially methylated regions from … · 2017-08-08 · Keegan Korthauer @keegankorthauer Bioconductor Meeting 2017 . Differential methylation Differential

Rafa Lab Rafael Irizarry Chinmay Shukla

Collaborators Sutirtha Chakraborty Yuval Benjamini

Learn More Slides goo.gl/MwQz5f dmrseq available on GitHub https://github.com/kdkorthauer/dmrseq

Acknowledgements

Contact [email protected] @keegsdur