Detecting genome-wide directional effects of transcription factor
binding on polygenic disease risk
Yakir Reshef
Harvard/MIT MD/PhD Program
Harvard University Computer Science
October 18, 2017
GWAS + genomics → biology
[Liu et al 2015 Nat Genet]
“inflammation”
Crohn’s GWAS
genes expressed in immune cells
[Pasaniuc & Price 2016 Nat Rev Genet; Maurano et al. 2012 Science; Pickrell 2014 AJHG; Gusev et al. 2014 AJHG; Farh et al. 2015 Nature; Finucane et al. 2015 Nat Genet; …]
...
Signed annotations: stronger inference
“inflammation”
Crohn’s GWAS
genes expressed in immune cells
...
[Degner et al. 2012 Nature; Lee et al. 2015 Nat Genet; Zhou et al. 2015 Nat Meth; Tehranchi et al. 2016 Cell; Tewhey et al. 2016 Cell; Kelley et al. 2016 Genome Res]
Signed annotations for transcription factors
“binding of IRF1”
Crohn’s GWAS
IRF1 → Crohn’s: causality
IRF1 → “Inflammation” → Crohn’s: mechanism
“Genome-wide, alleles increasing IRF1 binding tend to increase Crohn’s risk”
Outline
• Description of method
• Validation in simulations
• Proof of concept: analysis of molecular traits
• Analysis of 46 diseases and complex traits
Description of method
A thought experiment
What if an oracle gave us:
• the true causal effect of SNPs on disease
• the true causal effect of SNPs on TF binding
[Figure: effects on disease vs. effects on TF binding]
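The thought experiment can be made concrete with a small simulation. This is a minimal sketch, assuming the oracle returns one true effect per SNP on TF binding and on disease, and that a directional effect shows up as a nonzero correlation between the two. The function names, the Gaussian effect sizes, and the parameter `mu` are illustrative assumptions, not taken from the talk.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_oracle(n_snps=5000, mu=0.3, seed=0):
    """Hypothetical oracle output (assumed model, for illustration only).

    v    : true effect of each SNP on TF binding
    beta : true effect of each SNP on disease, equal to mu * v plus
           effects unrelated to the TF
    """
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n_snps)]
    beta = [mu * vi + rng.gauss(0, 1) for vi in v]
    return v, beta

# Directional effect present: the two effect vectors are correlated.
v, beta = simulate_oracle(mu=0.3)
r_directional = pearson(v, beta)

# No directional effect: the correlation is close to zero.
v0, beta0 = simulate_oracle(mu=0.0, seed=1)
r_null = pearson(v0, beta0)

print(f"correlation with directional effect:    {r_directional:.3f}")
print(f"correlation without directional effect: {r_null:.3f}")
```

With an oracle, a single correlation would answer the question. In practice neither vector is observed directly, which is what the next slides address.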
What do we have in practice?
(Signed) marginal GWAS summary statistics
(Signed) binding predictions from DNA sequence
LD matrix from reference panel
Method: signed LD profile regression
(Signed) marginal GWAS summary statistics
(Signed) binding predictions from DNA sequence
LD matrix from reference panel
[Details: p-values, generalized least-squares, minor allele effects]
Under model:
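One way to see how the three inputs fit together is a toy regression. This is a minimal sketch, assuming a model in which true disease effects are `beta = mu * v + noise` (with `v` the signed binding prediction) and marginal GWAS effects are `alpha_hat = R @ beta + noise` (with `R` the LD matrix), so that `alpha_hat` is regressed on the signed LD profile `R @ v`. That model form, the block-diagonal LD matrix, the independent sampling noise, and all names are assumptions for illustration. Ordinary least squares stands in for the generalized least squares mentioned on the slide, and the p-value comes from flipping the sign of `v` in whole LD blocks.

```python
import random

def ld_matrix(n_snps, block, rho):
    """Toy block-diagonal LD: correlation rho**|i-j| within a block, 0 across blocks."""
    return [[rho ** abs(i - j) if i // block == j // block else 0.0
             for j in range(n_snps)] for i in range(n_snps)]

def matvec(R, x):
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]

def block_sums(x, y, block):
    """Per-block sums of x[i] * y[i]."""
    sums = [0.0] * ((len(x) + block - 1) // block)
    for i, (a, b) in enumerate(zip(x, y)):
        sums[i // block] += a * b
    return sums

def sldp_test(alpha_hat, R, v, block, n_null=2000, seed=0):
    """Slope of alpha_hat on the signed LD profile R v, with a sign-flip p-value."""
    Rv = matvec(R, v)                      # signed LD profile
    q = block_sums(Rv, alpha_hat, block)   # each block's share of the numerator
    denom = sum(x * x for x in Rv)
    slope = sum(q) / denom
    # Flipping the sign of v in a whole block flips that block's share of the
    # numerator and leaves the denominator unchanged. This shortcut relies on
    # the toy LD matrix having no LD across blocks.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_null):
        null_slope = sum((1.0 if rng.random() < 0.5 else -1.0) * qb for qb in q) / denom
        if abs(null_slope) >= abs(slope):
            hits += 1
    return slope, (hits + 1) / (n_null + 1)

def simulate(n_snps=600, block=20, rho=0.6, mu=0.5, seed=0):
    """Simulate summary statistics under the assumed model."""
    rng = random.Random(seed)
    R = ld_matrix(n_snps, block, rho)
    v = [rng.gauss(0, 1) for _ in range(n_snps)]
    beta = [mu * vi + rng.gauss(0, 1) for vi in v]
    # Simplification: sampling noise is drawn independently per SNP.
    alpha_hat = [x + rng.gauss(0, 0.5) for x in matvec(R, beta)]
    return alpha_hat, R, v

alpha_hat, R, v = simulate()
slope, p_value = sldp_test(alpha_hat, R, v, block=20)
print(f"estimated mu = {slope:.2f}, sign-flip p = {p_value:.4f}")
```

In this toy setting the slope estimates `mu` because the expected value of `alpha_hat` given `v` is `mu * R @ v`.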
Validation in simulations
SLDP is well-calibrated
[Reshef et al. 2017 BioRxiv]
no enrichment
SLDP is robust to unsigned enrichment
[Reshef et al. 2017 BioRxiv]
confounding by unsigned enrichment
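The two null scenarios above can be imitated in a small simulation. This is a minimal sketch, not the simulation framework of the preprint. It assumes that "unsigned enrichment" means SNPs with large predicted binding effects have larger disease effects whose sign is unrelated to the predicted direction. LD is left out to keep the sketch short, and all names and parameter values are illustrative. The check is that a block sign-flip test rejects at roughly the nominal rate in both scenarios.

```python
import random

def sign_flip_pvalue(q, n_null, rng):
    """Empirical two-sided p-value for sum(q) under random block sign flips.

    The regression denominator does not change under sign flips, so the
    numerator alone is used as the test statistic.
    """
    observed = abs(sum(q))
    hits = sum(
        abs(sum((1.0 if rng.random() < 0.5 else -1.0) * qb for qb in q)) >= observed
        for _ in range(n_null)
    )
    return (hits + 1) / (n_null + 1)

def null_replicate(rng, n_blocks=20, block=10, unsigned_enrichment=True):
    """Per-block sums of v * alpha_hat for one trait with no directional effect."""
    q = []
    for _ in range(n_blocks):
        total = 0.0
        for _ in range(block):
            v = rng.gauss(0, 1)                       # signed annotation value
            scale = abs(v) if unsigned_enrichment else 1.0
            beta = scale * rng.gauss(0, 1)            # sign independent of v
            alpha_hat = beta + rng.gauss(0, 0.5)      # sampling noise
            total += v * alpha_hat
        q.append(total)
    return q

def false_positive_rate(n_reps=200, n_null=199, level=0.05, seed=0, **kwargs):
    """Fraction of null replicates with p-value at or below the given level."""
    rng = random.Random(seed)
    rejections = sum(
        sign_flip_pvalue(null_replicate(rng, **kwargs), n_null, rng) <= level
        for _ in range(n_reps)
    )
    return rejections / n_reps

fpr_no_enrichment = false_positive_rate(unsigned_enrichment=False)
fpr_unsigned = false_positive_rate(unsigned_enrichment=True, seed=1)
print(f"false positive rate, no enrichment:       {fpr_no_enrichment:.3f}")
print(f"false positive rate, unsigned enrichment: {fpr_unsigned:.3f}")
```

Both rates should sit near the 5% level, because the sign of each block's contribution is symmetric around zero whenever the disease effects carry no directional relationship to the annotation.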
Proof of concept: analysis of molecular traits
382 TF binding annotations analyzed
ENCODE ChIP-seq + Basset CNN model
Transcription factor    Cell line
CTCF                    A549
...                     ...
IRF1                    GM12878
[ENCODE Project; Kelley et al. 2016 Genome Res]
Trait: gene expression, across genes
Seeking:
TFs that affect expression in a consistent direction across genes
Strategy:
Meta-analysis across genes
GENE1
GENE2
GENE3
GENE4
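The slide does not say how the per-gene results are combined. As a minimal sketch, assuming each gene yields an effect estimate with a standard error, a fixed-effect inverse-variance meta-analysis is one common choice. The function name and the four per-gene values below are hypothetical.

```python
import math

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns the pooled estimate, its standard error, the z-score,
    and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p

# Hypothetical per-gene estimates of a TF's directional effect on expression
# (GENE1..GENE4), each individually modest but consistent in sign.
gene_estimates = [0.10, 0.05, 0.08, 0.12]
gene_std_errors = [0.05, 0.05, 0.05, 0.05]

pooled, pooled_se, z, p = fixed_effect_meta(gene_estimates, gene_std_errors)
print(f"pooled = {pooled:.4f}, se = {pooled_se:.4f}, z = {z:.2f}, p = {p:.2g}")
```

Pooling rewards a consistent direction: effects of mixed sign would cancel in the weighted average, while effects of the same sign add up.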
SLDP reproducibly identifies activating TFs
[Reshef et al. 2017 BioRxiv; Hansen et al. 1994 Mol Chem Bio; Kimura et al. 1994 Science]
Known activator (UniProt)
Other
SLDP links TFs to epigenetic marks
[Reshef et al. 2017 BioRxiv; Ogryzko et al. 1996 Cell; Laiosa et al. 2006 Ann Rev Immun]
Known activator (UniProt)
Other
Analysis of 46 diseases and complex traits
46 diseases and complex traits
UKB + public (sumstats), avg N=289k, ~1M SNPs
Phenotype          Sample size
Height             N≈450k
Rheu. arthritis    N≈36k
...                ...
Lupus              N≈14k
Schizophrenia      N≈70k
[Loh et al. BioRxiv; BOLT-LMM UK Biobank summary statistics are publicly available]
SLDP identifies 77 TF-trait annotations
[Reshef et al. 2017 BioRxiv]
…that form 12 independent signals
Total results: 77
Indep. signals: 12
Significant results at per-trait FDR < 5%, grouped into approx. independent signals.
[Reshef et al. 2017 BioRxiv]
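A per-trait FDR threshold can be applied as in the sketch below. The slide does not name the FDR procedure, so the Benjamini-Hochberg step-up procedure is an assumption here. The trait names, annotation names, and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank  # largest rank whose p-value is under its threshold
    return sorted(order[:cutoff])

def per_trait_discoveries(results, fdr=0.05):
    """Apply the FDR threshold separately within each trait.

    results: {trait: {annotation: p_value}}
    returns: {trait: [annotations significant at the given FDR]}
    """
    discoveries = {}
    for trait, pvals in results.items():
        names = list(pvals)
        keep = benjamini_hochberg([pvals[name] for name in names], fdr)
        discoveries[trait] = [names[i] for i in keep]
    return discoveries

# Hypothetical p-values for four annotations tested against two traits.
results = {
    "trait_A": {"TF1_cellX": 0.001, "TF2_cellX": 0.02, "TF3_cellY": 0.04, "TF4_cellY": 0.5},
    "trait_B": {"TF1_cellX": 0.2, "TF2_cellX": 0.3},
}
discoveries = per_trait_discoveries(results)
print(discoveries)
```

Correcting within each trait keeps a trait with many strong signals from changing the threshold applied to the others.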
BCL11A → EDU (+)
[Figure: genome-wide GWAS signal vs. signed LD profile for BCL11A; EDU Manhattan plot]
Rare LOFs in BCL11A → intellectual disability
[DDD 2015 Nature; Okbay et al. 2016 Nature; Bazak et al. 2016 JCI; Dias et al. 2016 AJHG]
CTCF → Lupus (-)
[Figure: genome-wide GWAS signal vs. signed LD profile for CTCF; Lupus Manhattan plot]
CTCF slows myeloid differentiation
Fine-mapped SLE SNPs modify CTCF binding
ExAC: pLI(CTCF) = 1.00 (> 99.9% of genes)
IRF1 → Crohn’s (+)
[Figure: genome-wide GWAS signal vs. signed LD profile for IRF1; Crohn’s Manhattan plot with IRF1 eQTL z-score]
[Jostins et al. 2012 Nature; Wright et al. 2014 Nat Genet]
Conclusions
• Signed annotations enable strong inference about disease mechanism
• Signed LD profile regression links signed annotations to GWAS
• Evidence for genome-wide directional effects of TFs on molecular and complex traits
Acknowledgements
Hilary Finucane
David Kelley
Alexander Gusev
Dylan Kotliar
Jacob Ulirsch
Farhad Hormozdiari
Pier-Francesco Palamara
Luca Pinello
Nick Patterson
Ryan Adams
Alkes Price
Luke O’Connor
Bryce van de Geijn
Po-Ru Loh
Shari Grossman
Gaurav Bhatia
Steven Gazal
CGTA, HMS research computing, C de Boer, L Dicker, J Engreitz, N Friedman, X Liu, M Mitzenmacher, J Perry, D Reshef, S Reilly, S Raychaudhuri, A Schoech, P Sabeti, R Tewhey, P Turley
Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk Reshef et al. 2017 bioRxiv:204685
Thank you