Epigenomics: A Practical Guide Benjamin Rodriguez, PhD Wei Li Lab, Baylor College of Medicine Molecular Biology Refresher Course with Bioinformatics August 30th 2013


Epigenomics: A Practical Guide

Benjamin Rodriguez, PhD
Wei Li Lab, Baylor College of Medicine

Molecular Biology Refresher Course with Bioinformatics
August 30th 2013


Software, Sites, Materials

Course Materials:
http://dldcc-web.brc.bcm.edu/lilab/benji/MBRB_2013/index.html
Most up-to-date slides; I will upload them for all three of my lectures

Browsers:
http://genome.ucsc.edu/
http://epigenomegateway.wustl.edu/

Web-based analysis:
http://bejerano.stanford.edu/great/public/html/
http://david.abcc.ncifcrf.gov


Outline

• DNA methylation
• Histone modifications
• DNase hypersensitivity
• Aberrant methylation in cancer
• Epigenetic inheritance in development and disease


DNA is Packaged in Chromatin

[Figure labels: DNA, histone, nucleosome, chromatin]

Chromatin consists of nucleosomes, DNA wrapped around histone proteins

• Chromatin organizes genes to be accessible for transcription, replication, and repair


CpG Islands and Promoters

• Although C & G constitute 42% of the human genome, less than 1% of dinucleotides are CpG
– Less than ¼ of the expected frequency of 0.04

• A ‘CpG island’ is a run of “CpG-rich” sequence
– min 200 bp in length
– GC content > 50%
– Observed : Expected ratio > 0.6
– This definition is not precise

• Many CpG islands occur within promoters
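
The arithmetic and criteria on this slide can be sketched in a few lines of Python. This is a minimal sketch, not a published CpG island caller: the function names `cpg_stats` and `looks_like_cpg_island` are made up for illustration, the observed : expected ratio is assumed to be (number of CpG × N) / (number of C × number of G), and C and G are assumed to be equally frequent when computing the genome-wide expectation.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0  # expected CpG count if C and G were independent
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds listed on the slide (assumed cutoffs)."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


# Genome-wide expectation: C + G = 42%, assumed split evenly between C and G.
p_c = p_g = 0.42 / 2
expected_cpg_freq = p_c * p_g  # about 0.044, the "0.04" quoted on the slide
```

Real island finders scan the genome with a sliding window and merge adjacent hits; the sketch only scores one candidate sequence.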


DNA Methylation

• Methylation at CpG islands often represses nearby gene expression

• Many highly expressed genes have CpG methylation on their exons

• Some genes are imprinted, so maternal and paternal copies have different DNA methylation

• In embryonic stem cells, there is also CHG methylation

• Recently, another type of DNA modification, hydroxymethylation (hmC), was found


Epigenetic Mechanisms: DNA Methylation

[Figure: a gene (regions 1–4) and its CpG island in the normal state, with unmethylated (CG) and methylated (mCG) sites. C: cytosine; mC: methylcytosine]


DNA Methylation and Gene Silencing

[Figure: the gene (regions 1–4) and its CpG island in normal vs. cancer cells; in cancer more CpG sites are methylated (mCG) and the gene is marked with an X. C: cytosine; mC: methylcytosine]


DNA Methylation and Regulation

• Cytosine methylation blocks DNA-binding proteins’ access to regulatory sites and creates binding sites for repressive proteins

• Methylation often follows decrease in site use

From Thurman et al Nature 2012


Methylation and Expression

R² = 0.7817, P < 0.0005

Some genes (e.g. HOXB13 in breast cancer) show strong correlation of promoter methylation with expression

From Rodriguez et al Carcinogenesis 2008


Methylation, Retroviruses and Repeats

• Bacteria use DNA methylation to limit invasive DNA from viruses

• A large fraction of the human genome consists of carcasses of retro-viruses and transposons

• Almost all DNA repeats are heavily methylated

• If they lose methylation they are more likely to be expressed


DNA Methylation and Development

• Almost all DNA is de-methylated in the embryo
• Increasing methylation at various times during fetal development restricts functionality
– This is why cloning is difficult
• Wave of methylation in adolescence
• Gradual de-methylation in old age


DNA Methylation and Inheritance

• Most DNA is de-methylated during gametogenesis and embryogenesis
• Methylation persists in some DNA regions
• Humans and mice show epigenetic inheritance apparently mediated by DNA methylation


Agouti Mice and DNA Methylation


Genomic Imprinting

Epigenetic mechanism of transcriptional regulation

Expression of a subset of mammalian genes is restricted to one parental allele

[Figure: paternal and maternal alleles]


Genomic Imprinting

Parental chromosomes are differentially marked by DNA methylation

[Figure: paternal and maternal alleles]


Genomic Imprinting

Imprinting is regulated by cis-acting elements (Imprinting Control Regions) and non-coding RNAs

Imprinting Control Regions act over long distances and control the imprinting of multiple genes

We will examine a recent study of the IGF2 DMR in individuals exposed to famine in utero

[Figure: paternal and maternal alleles]


Epigenetic Mechanisms: Post-Translational Modification to Histones

[Figure: histone acetylation (Ac) and histone methylation (Me)]

• Epigenetic modifications of histones include histone acetylation and methylation


Histone Modifications

• Different modifications at different locations by different enzymes

• Potential temporal and spatial specificity


Histone Modifications

• Gene body mark: H3K36me3, H3K79me3
• Active promoter (TSS) mark: H3K4me3
• Active enhancer (TF binding) mark: H3K4me1, H3K27ac
• Both enhancers and promoters: H3K4me2, H3/H4ac, H2AZ
• Repressive promoter mark: H3K27me3
• Repressive mark for DNA methylation: H3K9me3
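
When annotating ChIP-seq tracks it can help to keep the list above as a lookup table. The sketch below simply encodes the slide's own mark-to-feature assignments; the names `FEATURE_TO_MARKS` and `features_for_mark` are made up for illustration, and the table is a teaching shorthand rather than a complete catalogue of histone marks.

```python
# Feature -> marks, copied from the slide (a simplification of the biology).
FEATURE_TO_MARKS = {
    "gene body": ["H3K36me3", "H3K79me3"],
    "active promoter (TSS)": ["H3K4me3"],
    "active enhancer (TF binding)": ["H3K4me1", "H3K27ac"],
    "enhancers and promoters": ["H3K4me2", "H3/H4ac", "H2AZ"],
    "repressive promoter": ["H3K27me3"],
    "repressive, associated with DNA methylation": ["H3K9me3"],
}


def features_for_mark(mark):
    """Return the feature labels the slide associates with a histone mark."""
    return [feature for feature, marks in FEATURE_TO_MARKS.items() if mark in marks]
```

A mark that is not on the slide returns an empty list rather than a guess.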


Genes, regulatory DNA, and epigenetic features

Graphic from NIH RoadMap Epigenomics Site


Genes, regulatory DNA, and epigenetic features

DNaseI

- promoters
- enhancers
- silencers
- insulators
- etc.


DNase Hypersensitive (HS) Mapping

• DNase randomly cuts genome (more often in open chromatin region)

• Select short fragments (two nearby cuts) to sequence

• Map to active promoters and enhancers


~100,000 – 250,000 DHSs per cell type (0.5-1.5% of genome)

genome.ucsc.edu www.epigenomebrowser.org

DNaseI Hypersensitive site (DHS)

Promoters

Enhancers

DNaseI hypersensitive sites mark regulatory DNA


Epigenetic Modifications to Histones and DNA Can Cooperate to Silence Gene Expression

• Coordinated activities of chromatin modifying enzymes lead to condensation of chromatin and inhibition of gene expression

[Figure: HMT, HDAC, and DNMT acting on chromatin; acetyl (Ac) and methyl (Me) marks; gene expression]


• Regulation of genes involved in differentiation, cell cycle, and cell survival

EPIGENETICS

Normal epigenetic mechanisms

Roles in Normal Development and Cancer

Differentiated cells

Progenitor cell


• Regulation of genes involved in differentiation, cell cycle, and cell survival

• Through epigenetic silencing of certain genes, affected cells may acquire new phenotypes which promote tumorigenesis

EPIGENETICS

Malignant progenitor cell Tumor

Normal epigenetic mechanisms

Deregulated epigenetic mechanisms

Roles in Normal Development and Cancer

Differentiated cells

Progenitor cell


HOXB13 hypermethylation in breast cancer cells

From Rodriguez et al Carcinogenesis 2008

Strong inverse association between promoter CpG island hypermethylation and HOXB13 gene expression

R² = 0.7817, P < 0.0005


HOXB13 hypermethylation in breast cancer cells

From Rodriguez et al Carcinogenesis 2008

Bisulfite sequencing

(Sanger, clone-based, very laborious)


Inhibition of DNA methyltransferase activity restores expression of HOXB13

From Rodriguez et al Carcinogenesis 2008


Paired Tumor and Adjacent Normal Tissues
Patient ER Status

HOXB13 hypermethylation strongly associates with patient ER status

(OR=3.75, 95% CI 1.41-9.96; P = 0.008)

From Rodriguez et al Carcinogenesis 2008
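
As a reading aid, the reported odds ratio, confidence interval, and P value on this slide can be checked against each other. The sketch below assumes a Wald-type interval on the log odds ratio (an assumption; the slide does not say how the interval was computed) and uses only the three numbers quoted above. The function name `wald_p_from_or_ci` is made up for illustration.

```python
from math import erfc, log, sqrt


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back out the standard error, z statistic, and two-sided P value
    from an odds ratio and its 95% CI, assuming a Wald interval on log(OR)."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return se, z, p


se, z, p = wald_p_from_or_ci(3.75, 1.41, 9.96)
```

Under that assumption the recovered P value is close to the reported 0.008, so the three summary numbers are mutually consistent.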


HOXB13 hypermethylation associates with poor disease free survival in ERα-positive patients

From Rodriguez et al Carcinogenesis 2008


• Epidemiologic studies suggest adult disease risk is associated with adverse environmental conditions early in development

• Involvement of epigenetic dysregulation has been hypothesized

• Can early-life environmental conditions cause epigenetic changes in humans that persist throughout life? Is there a role for clinical intervention?

1. Periconceptional exposure to famine
2. Offspring born before vs. after maternal gastrointestinal bypass surgery

Epigenetic inheritance and human development


Persistent epigenetic differences associated with prenatal exposure to famine in humans

Individuals who were prenatally exposed to famine during the Dutch Hunger Winter in 1944–45 had, 6 decades later, less DNA methylation of the imprinted IGF2 gene compared with their unexposed, same-sex siblings

Association was specific for periconceptional exposure, reinforcing that very early mammalian development is a crucial period for establishing and maintaining epigenetic marks

Heijmans et al PNAS 2008


Insulin-like growth factor II (IGF2)

• One of the best-characterized epigenetically regulated loci
• Key factor in human growth and development
• Maternally imprinted
• Imprinting is maintained through the IGF2 differentially methylated region (DMR)
• Hypomethylation of DMR leads to bi-allelic expression of IGF2
• IGF2 DMR methylation is a normally distributed quantitative trait largely determined by genetic factors
• Methylation mark is stable up to middle age

• If affected by environmental conditions early in human development, altered methylation may be detected many years later


Difference in IGF2 DMR methylation between individuals prenatally exposed to famine and their same-sex sibling

Fig A displays the difference in IGF2 DMR methylation within sibships according to the estimated conception date of the famine-exposed individual

IGF2 DMR methylation was lowest in the famine-exposed individual among 72% (43/60) of sibships; this lower methylation was observed in conceptions across the famine period.
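
To get a feel for how unusual 43 of 60 sibships is, one can run a simple two-sided sign test against a 50:50 null. This is an illustrative sketch only and not necessarily the test used in the study; the function name `sign_test_two_sided` is made up, and the only inputs are the counts quoted above.

```python
from math import comb


def sign_test_two_sided(successes, n):
    """Exact two-sided sign test: probability of a split at least this
    lopsided if each sibship were equally likely to go either way."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


fraction_lower = 43 / 60               # about 0.72, as stated on the slide
p_value = sign_test_two_sided(43, 60)  # small: a 43/17 split is unlikely by chance
```

The sign test ignores the size of each within-pair difference, so it is a conservative sanity check rather than a replacement for a paired analysis.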


IGF2 DMR methylation among individuals periconceptionally exposed to famine and their unexposed, same-sex siblings


• 62 individuals were exposed to famine late in gestation for at least 10 weeks; they were born in or shortly after the famine

• No difference in IGF2 DMR methylation between the exposed individuals and their unexposed siblings

IGF2 DMR methylation among individuals exposed to famine late in gestation and their unexposed, same-sex siblings


Timing of famine exposure during gestation and IGF2 DMR methylation

• Periconceptional and late exposure groups, and 122 controls
• Periconceptional exposure associated with lower methylation
• Statistically significant association between timing and exposure


Differential methylation in offspring born before versus after maternal gastrointestinal bypass surgery

• Obesity during pregnancy affects fetal programming of adult disease

• Children born after maternal surgery (AMS) are less obese and exhibit improved cardiometabolic risk profiles carried into adulthood

• Analyze the impact of maternal weight loss surgery on methylation levels in offspring born before (BMS) and after (AMS) maternal surgery

• Statistically significant correlations between gene methylation levels and gene expression and plasma markers of insulin resistance

• Effective treatment of a maternal phenotype is durably detectable in the methylome and transcriptome of subsequent offspring

Guenard et al. PNAS 2013


Offspring born before vs. after maternal gastrointestinal bypass surgery

BMS offspring
• Higher weight, height, and waist and hip girth (P < 0.05)

AMS offspring
• Lower body fat % (P = 0.07)
• Improved fasting insulin levels (P = 0.03)
• Improved homeostatic model of insulin resistance (HOMA-IR) index (P = 0.03)
• Lower blood pressure (P < 0.05)


• 14,466 CpG sites (2.9% of sites analyzed) exhibited significant differences
• These corresponded to 5,698 unique genes

• Significant biological functions related to autoimmune disease, pancreas disorders, diabetes mellitus, and disorders of glucose metabolism

Differential methylation analysis of offspring born before vs. after maternal gastrointestinal bypass surgery


Any questions?

On to the Laboratory!


Laboratory Exercises

• We will work with the significant differentially methylated CpG sites published as supporting data from the Guenard study

• The MGBS_study.xlsx and AMS.probes.bed files are available from the class web site

• We will perform our own mapping and significance testing of the CpG sites (in relation to genes) using GREAT

• We will analyze the published gene list and our custom gene list in DAVID

• Finally, we will analyze last week’s gene list (from the MLL-AF9 fusion protein study) in DAVID


Do hyper- and hypo-methylated sites in AMS offspring have different distributions?

Open MGBS_study.xlsx and examine the “DMC list” worksheet

The study used a poorly described algorithm, DiffScore, to assess statistical differences and to rank CpG sites

It also implemented a loose threshold for the change cutoff

In Excel, we can easily compute summary stats

The average and standard deviation of the Delta beta values are quite similar
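
For those who prefer a script to a spreadsheet, the same summary can be sketched in Python. The delta beta values below are made-up placeholders, not values from MGBS_study.xlsx, and the sign convention (positive = hypermethylated in AMS offspring) is an assumption to confirm against the worksheet. The function name `summarize_by_direction` is hypothetical.

```python
from statistics import mean, stdev


def summarize_by_direction(delta_betas):
    """Split delta beta values by sign and report n, mean, and SD for each group."""
    groups = {
        "hyper": [d for d in delta_betas if d > 0],  # assumed: gain of methylation
        "hypo": [d for d in delta_betas if d < 0],   # assumed: loss of methylation
    }
    return {
        name: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
        for name, values in groups.items()
        if len(values) >= 2  # stdev needs at least two values
    }


# Hypothetical placeholder values; paste the real "DMC list" column here.
toy_delta_betas = [0.12, 0.08, 0.10, -0.11, -0.09, -0.13]
summary = summarize_by_direction(toy_delta_betas)
```

Comparing the absolute mean and the SD between the two groups answers the question posed on this slide.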


• Choose human GRCh37 on species assembly
• Test regions: upload bed file AMS.probes.bed
• Set Background regions to whole genome
• Choose submit

Mapping significant CpG sites to genes with GREAT
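
GREAT takes its test regions in BED format, which uses 0-based, half-open coordinates. If you ever need to build such a file from 1-based CpG positions, a sketch like the following would do it. The exact layout of AMS.probes.bed is not shown in these slides, so the four-column layout, the function name `cpg_to_bed_line`, and the probe ID in the example are all assumptions for illustration.

```python
def cpg_to_bed_line(chrom, pos_1based, name):
    """Format a single 1-based CpG position as a tab-separated BED line."""
    start = pos_1based - 1  # BED start is 0-based
    end = pos_1based        # BED end is exclusive, so this covers exactly one base
    return f"{chrom}\t{start}\t{end}\t{name}"


# Hypothetical probe at chr1:1000 (1-based).
example_line = cpg_to_bed_line("chr1", 1000, "cg00000001")
```

Getting the off-by-one convention wrong shifts every site by a base, which rarely changes gene assignments but does break exact overlaps with other tracks.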


Compare gene lists from publication to those obtained by GREAT

http://jura.wi.mit.edu/cgi-bin/bioc/tools/compare.cgi
Choose compare 2 lists, paste lists of genes, press submit

• GREAT recovers 170 of 198 genes from the publication (AMS and GREAT)
• GREAT identifies 170 additional genes (because by default it searches a wider space of genomic distances)
• The missing 28 genes may result from gene name synonyms
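
The same comparison can be done offline with set operations. This is a minimal sketch, not what the web tool does internally: the function name `compare_gene_lists` is made up, symbols are upper-cased only for matching, and the gene symbols in the example are placeholders rather than the study's list.

```python
def compare_gene_lists(published, recovered):
    """Return the shared and list-specific gene symbols, ignoring case and whitespace."""
    a = {g.strip().upper() for g in published if g.strip()}
    b = {g.strip().upper() for g in recovered if g.strip()}
    return {
        "shared": sorted(a & b),
        "only_published": sorted(a - b),
        "only_recovered": sorted(b - a),
    }


# Placeholder symbols for illustration only.
result = compare_gene_lists(["IGF2", "HOXB13", "INS"], ["igf2", "HOXB13 ", "TP53"])
```

Synonyms are not resolved here, so a gene listed under two different names will still show up as missing, which is the effect described in the last bullet above.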


Mapping significant CpG sites to genes with GREAT

On “Region-Gene Association Graphs”:
We see that 3/4 of the CpG sites are assigned two genes
Orientation and distance to TSS: upstream is pretty flat, but there is a spike in predictions when the distance is > 5 kb from the TSS
Could that be the reason we don’t see any significance test results?
Let us find out


Open the “Association rule settings” dialog box
Change downstream to 5kb and distal to 5kb
Resubmit the job

Modifying the genomic region search range in GREAT
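
To see why shrinking the search range drops genes, it helps to picture a fixed window around each TSS. The sketch below is a deliberately simplified stand-in and is not GREAT's actual association rule: it just keeps genes whose TSS lies within a strand-aware window of a CpG site. The function name `genes_near_site`, the 5 kb defaults, and the gene names and coordinates in the example are all hypothetical.

```python
def genes_near_site(site_pos, genes, upstream=5000, downstream=5000):
    """Return (gene, signed offset) for genes whose TSS window contains the site.

    genes: list of (name, tss_position, strand) on the same chromosome.
    Offset is negative upstream of the TSS and positive downstream,
    measured in the direction of transcription.
    """
    hits = []
    for name, tss, strand in genes:
        offset = (site_pos - tss) if strand == "+" else (tss - site_pos)
        if -upstream <= offset <= downstream:
            hits.append((name, offset))
    return hits


# Hypothetical genes on one chromosome.
toy_genes = [("GENE_A", 10000, "+"), ("GENE_B", 20000, "-"), ("GENE_C", 50000, "+")]
near_12k = genes_near_site(12000, toy_genes)
near_23k = genes_near_site(23000, toy_genes)
far_site = genes_near_site(35000, toy_genes)
```

A site more than 5 kb from every TSS gets no gene at all under this rule, which is one way a smaller window shrinks the gene list.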


What happened with a smaller genomic search interval?

We only returned 64 genes! Crap.

But we did finally return a single significant test result
InterPro (protein sequence analysis and classification)


Functional enrichment analyses with DAVID

• With GREAT, we were able to identify the majority of genes published in the original study

• We do not have sufficient information to repeat the study’s original analyses

• We can use DAVID to analyze the study gene list and our gene list from GREAT

Open http://david.abcc.ncifcrf.gov and choose “Start Analysis”


Open http://david.abcc.ncifcrf.gov and choose “Start Analysis”

• “Upload Gene List” dialog box
• Copy and paste the list from MGBS_study.xlsx worksheet “Study Genes”
• On “Select Identifier”, choose “Official Gene Symbol” and choose “Gene List” on “List Type”
• Then Submit List

Functional enrichment analyses with DAVID


For species, highlight Homo sapiens and click “Select Species”

Rename the list

Functional enrichment analyses with DAVID

Choose “Functional Annotation Tool”


Functional enrichment analyses with DAVID

Each Annotation Category on the left can be expanded to reveal a number of optional databases to query
This allows for powerful customization
For this exercise, we will accept the default options

Choose “Functional Annotation Chart”


Functional enrichment analyses with DAVID

Shown above are the first three results, the only ones to pass multiple-testing correction
They reference the same group of genes

Functional Annotation Chart fields are: category, term, related term (RT), genes, count, percentage, p-value (univariate modified Fisher’s), and Benjamini p-value (correction for multiple testing)

Terms with arrows can be sorted
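
To make the "Benjamini" column less of a black box, here is a sketch of the standard Benjamini-Hochberg step-up adjustment. It assumes the textbook procedure; DAVID's exact implementation may differ in detail. The function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest p to largest
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four annotation terms.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The more terms are tested, the larger the adjustment, which is why a term can pass with a short gene list and fail after more genes (and more candidate terms) are added.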


Functional enrichment analyses with DAVID

Clicking on the link for term “Pleckstrin homology” opens the corresponding entry at InterPro

Proteins containing this domain can bind to and interact with membrane bound proteins, potentially mediating various signal transduction pathways in the cell


Let’s now perform the analysis with our list of differentially methylated genes obtained via GREAT
• “Upload Gene List” dialog box
• Copy and paste the list from MGBS_study.xlsx worksheet “Great Analysis”
• On “Select Identifier”, choose “Official Gene Symbol” and choose “Gene List” on “List Type”
• Submit List, choose “Homo sapiens”
• Select “Functional Annotation Chart”

Functional enrichment analyses with DAVID

Note: Entrez Gene IDs are a preferred way to search for gene functions
They can account for the fact that a gene may go by several different names


Functional enrichment analyses with DAVID

We see the same first three results as before, but now they do not pass multiple-testing correction
Why? One explanation: we introduced “noisy” genes with GREAT

Why did we not see any significant biological functions related to autoimmune disease, pancreas disorders, diabetes mellitus, or disorders of glucose metabolism?
1. Study authors gave us a small piece of the data they likely used
2. Methodological issues
3. Commercial IPA is very different from publicly curated databases and search tools


Functional enrichment analyses with DAVID

Finally, let’s analyze the list of genes from last week’s MLL-AF9 fusion gene study.
The file “MLL-AF9_promoters.bed” is available from the course website
• “Upload Gene List” dialog box
• Open the bed file in Excel, copy and paste the fifth column into DAVID
• On “Select Identifier”, choose “Entrez Gene ID” and choose “Gene List” on “List Type”
• Submit List, choose “Mus musculus”
• Select “Functional Annotation Chart”

Note: Entrez Gene IDs are a preferred way to search for gene functions
They can account for the fact that a gene may go by several different names
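
If Excel is not handy, the fifth column can be pulled out with a few lines of Python. This sketch assumes a tab-separated file in which column five holds the Entrez Gene ID, as the slide describes; the function name `fifth_column` and the two example rows are made up and do not come from MLL-AF9_promoters.bed.

```python
def fifth_column(bed_text):
    """Return the unique values of the fifth tab-separated column, in file order."""
    ids = []
    for line in bed_text.splitlines():
        # Skip blank lines and optional BED header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) >= 5:
            ids.append(fields[4])
    return list(dict.fromkeys(ids))  # drop duplicates while keeping order


# Hypothetical rows for illustration.
toy_bed = "chr1\t100\t200\tp1\t12345\nchr2\t300\t400\tp2\t67890\nchr2\t500\t600\tp3\t67890\n"
gene_ids = fifth_column(toy_bed)
```

To use it on the real file, read the file's text with `open(path).read()` and paste the returned IDs, one per line, into DAVID.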


Functional enrichment analyses with DAVID

Jackpot! We have dozens of highly enriched terms for the genes bound by oncogenic MLL-AF9 in mouse leukemia stem cells

Enriched functions include transcription regulation and cell cycle
More than 40% of targets are phosphoproteins


Laboratory Summary

• The Guenard study was not very fruitful, so to speak
• I have some issues with their methodology
• Limited sharing of published data is poor practice
• DNA methylation data is difficult to interpret

• GREAT and DAVID are powerful tools for functional enrichment analyses of genome-wide studies

• With the right tools and a little patience, you can make novel discoveries and draw meaningful biological interpretation from genomics datasets