@monarchinit @ontowonka “Not everyone can become a great artist, but a great artist can come from anywhere” Anton Ego, Ratatouille, 2007, Disney/Pixar. Envisioning a world where everyone helps solve disease. Melissa Haendel, SWAT4LS 2015, Cambridge, England

Envisioning a world where everyone helps solve disease


Page 1: Envisioning a world where everyone helps solve disease

@monarchinit @ontowonka

“Not everyone can become a great artist, but a great artist can come from anywhere”
Anton Ego, Ratatouille, 2007, Disney/Pixar

Envisioning a world where everyone helps solve disease

Melissa Haendel
SWAT4LS 2015
Cambridge, England

Page 2: Envisioning a world where everyone helps solve disease

Faith-based research

“I believe that my work on some obscure cell type in some obscure organism will matter to mankind one day”

Well, it can, and it does.

Page 3: Envisioning a world where everyone helps solve disease
Page 4: Envisioning a world where everyone helps solve disease

Four things it takes to solve an undiagnosed disease

1. Deep phenotyping the human organism

2. Crossing the language barrier

3. A lot of data from a lot of places

4. Very many people (who have faith)

Page 5: Envisioning a world where everyone helps solve disease

1. DEEP PHENOTYPING THE HUMAN ORGANISM

Page 6: Envisioning a world where everyone helps solve disease

[Figure labels: Patient; Genome/Exome; Genomic data; Filter; Diagnosis, treatment. Example reads: ATCTTAGCACGTTAC, ATCTTAGCACGTGAC, ATCTTATCACGTTAC, ATCTTAGCACGTTAC]

Page 7: Envisioning a world where everyone helps solve disease

What do all those variations do?

We only know the phenotypic consequences of mutation of <20% of the human coding genome

Page 8: Envisioning a world where everyone helps solve disease

[Figure labels: Patient; Genome/Exome; Genomic data; Phenotype; Environment; Gene-Phenotype Data; Filter; Diagnosis, treatment]

Page 9: Envisioning a world where everyone helps solve disease

We have a common language for sequence data (…ATCTTAGCACGTTAC…), but not so much for phenotypes

Page 10: Envisioning a world where everyone helps solve disease

CC2.0 European Southern Observatory https://www.flickr.com/photos/esoastronomy/6923443595

Page 11: Envisioning a world where everyone helps solve disease

Can we help machines understand phenotypes?

Human phenotype: “Palmoplantar hyperkeratosis”

“I have absolutely no idea what that means ???”

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

Page 12: Envisioning a world where everyone helps solve disease

A disease is a collection of phenotypes

Patient: Flat back of head; Hypotonia

Disease X: Abnormal skull morphology; Decreased muscle mass

Differential diagnosis with similar but non-matching phenotypes is difficult

Page 13: Envisioning a world where everyone helps solve disease

Do we *really* need yet another clinical vocabulary?

Winnenburg and Bodenreider, ISMB PhenoDay, 2014

UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10-C, ICD9-CM, ICD-10, OMIM, MedlinePlus

Existing clinical vocabularies don’t adequately cover phenotype descriptions

Page 14: Envisioning a world where everyone helps solve disease

Disease-phenotype associations using an ontology

Page 15: Envisioning a world where everyone helps solve disease

Once OMIM is rendered computable, are we done yet?

Free text -> HPO enables phenotype semantic similarity matching
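A minimal sketch of why ontology terms enable similarity matching, using the Patient and Disease X phenotypes from the earlier slide. The toy is-a hierarchy below is a hypothetical simplification (it is not the real HPO graph), and Jaccard overlap of ancestor sets is used as a simple stand-in for whatever similarity measure a production tool applies.

```python
# Toy is-a hierarchy (child -> parents). Hypothetical and heavily simplified;
# the real HPO has many more terms and intermediate classes.
PARENTS = {
    "Flat back of head": ["Abnormal skull morphology"],
    "Abnormal skull morphology": ["Phenotypic abnormality"],
    "Hypotonia": ["Abnormal muscle physiology"],
    "Decreased muscle mass": ["Abnormal muscle morphology"],
    "Abnormal muscle physiology": ["Abnormality of the musculature"],
    "Abnormal muscle morphology": ["Abnormality of the musculature"],
    "Abnormality of the musculature": ["Phenotypic abnormality"],
    "Phenotypic abnormality": [],
}


def ancestors(term):
    """Return the term itself plus every term reachable via is-a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def closure(profile):
    """Union of the ancestor sets of every term in a phenotype profile."""
    result = set()
    for term in profile:
        result |= ancestors(term)
    return result


def jaccard_similarity(profile_a, profile_b):
    """Shared ancestors divided by all ancestors of the two profiles."""
    a, b = closure(profile_a), closure(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


patient = {"Flat back of head", "Hypotonia"}
disease_x = {"Abnormal skull morphology", "Decreased muscle mass"}

exact_overlap = patient & disease_x          # no term matches exactly
similarity = jaccard_similarity(patient, disease_x)
print(exact_overlap, similarity)
```

String comparison finds nothing in common between the two profiles, while the ancestor-based comparison gives a non-zero score because the terms share more general parent classes.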

Page 16: Envisioning a world where everyone helps solve disease

Mendelian disease integration

Merges sources together using:
equivalence and subclass axioms derived from xrefs
string matching
manual efforts to fill gaps based on phenotypes and anatomical axioms

Parkinson’s disease subtypes

Different colors = different disease sources

https://github.com/monarch-initiative/monarch-disease-ontology

Page 17: Envisioning a world where everyone helps solve disease

Why we need all the organisms

Model data can provide up to 80% phenotypic coverage of the human coding genome

Page 18: Envisioning a world where everyone helps solve disease

We learn different things from different organisms

Page 19: Envisioning a world where everyone helps solve disease

2. CROSSING THE LANGUAGE BARRIER

Page 20: Envisioning a world where everyone helps solve disease

Ulcerated paws

Palmoplantar hyperkeratosis

Thick hand skin

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

http://www.guinealynx.info/pododermatitis.html

Page 21: Envisioning a world where everyone helps solve disease

Challenge: Each database uses its own vocabulary/ontology

MP, HP

MGI, HPOA

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

http://www.guinealynx.info/pododermatitis.html

Page 22: Envisioning a world where everyone helps solve disease

Challenge: Each database uses its own vocabulary/ontology

ZFA, MP, DPO, WPO, HP, OMIA, VT, FYPO, APO, SNOMED

………

WB, PB, FB, OMIA, MGI, RGD, ZFIN, SGD, HPOA, IMPC, OMIM, ICD, QTLdb, EHR

Image credits:

"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG

http://www.guinealynx.info/pododermatitis.html

Page 23: Envisioning a world where everyone helps solve disease

Decomposition of complex concepts allows interoperability

Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2

“Palmoplantar hyperkeratosis” (human phenotype) =

increased (PATO)

keratinization (GO)

stratum corneum layer of skin (Uberon)

autopod (Uberon)

Species-neutral ontologies, homologous concepts
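The decomposition idea can be sketched as follows: once a phenotype is broken into a quality, a process, and species-neutral anatomy, phenotypes recorded in different species can be compared part by part. The `NEUTRAL_ANATOMY` mapping and the animal phenotype are hypothetical illustrations; in practice these relationships come from ontology axioms rather than a lookup table.

```python
from dataclasses import dataclass

# Simplified, hypothetical mapping from species-specific anatomy labels to a
# species-neutral (Uberon-style) concept. Real mappings are ontology axioms.
NEUTRAL_ANATOMY = {
    "hand": "autopod",
    "paw": "autopod",
}


@dataclass(frozen=True)
class PhenotypeParts:
    quality: str        # PATO-style quality, e.g. "increased"
    process: str        # GO-style process, e.g. "keratinization"
    anatomy: frozenset  # species-neutral anatomy terms


def decompose(quality, process, anatomy_labels):
    """Build a decomposed phenotype, translating anatomy to neutral terms."""
    neutral = frozenset(NEUTRAL_ANATOMY.get(label, label) for label in anatomy_labels)
    return PhenotypeParts(quality, process, neutral)


def shared_parts(a, b):
    """Return the components two decomposed phenotypes have in common."""
    shared = set()
    if a.quality == b.quality:
        shared.add(a.quality)
    if a.process == b.process:
        shared.add(a.process)
    shared |= a.anatomy & b.anatomy
    return shared


# Human phenotype from the slide, written with a species-specific label.
human = decompose("increased", "keratinization",
                  ["stratum corneum layer of skin", "hand"])
# Hypothetical animal phenotype, invented for illustration only.
animal = decompose("increased", "keratinization",
                   ["stratum corneum layer of skin", "paw"])

common = shared_parts(human, animal)
print(sorted(common))
```

The labels "hand" and "paw" never match as strings, but both resolve to the same neutral concept, so the two phenotypes share all of their components.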

Page 24: Envisioning a world where everyone helps solve disease

Cross-species ontology integration

Page 25: Envisioning a world where everyone helps solve disease

3. A LOT OF DATA FROM A LOT OF PLACES

Page 26: Envisioning a world where everyone helps solve disease

Putting it Together: Data + Ontologies

[Diagram labels: Input: diverse G2P/D source data, source ontologies (.ttl files); Pipeline: Owl Loader; Output: Graph Views, Monarch App (Faceted Browsing, Phenotype Matching)]

https://github.com/SciGraph/SciGraph

Page 27: Envisioning a world where everyone helps solve disease

Data Integrated in SciGraph

>25 sources

>100 species

51M triples

4M curated associations

2.2M G-P / G-D associations

Page 28: Envisioning a world where everyone helps solve disease

Genotype-phenotype integration

[Pie chart legend: One source; Two sources; 3 or more. Segments labeled 9% and 91%]

91% of our 2.2 million G2P associations required integrating 2 or more data sources (this number does not even include orthology (Panther))

Page 29: Envisioning a world where everyone helps solve disease

Ontology-based phenotype matching

www.owlsim.org

Page 30: Envisioning a world where everyone helps solve disease

Combining genotype and phenotype data for variant prioritization

Whole exome

Remove off-target and common variants

Variant score from allele freq and pathogenicity

Phenotype score from phenotypic similarity

PHIVE score to give final candidates

Mendelian filters

https://www.sanger.ac.uk/resources/software/exomiser/
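The steps on this slide can be sketched as a small ranking function. This is an illustration under stated assumptions, not Exomiser's implementation: the `Variant` fields, the frequency cutoff, the variant score formula, and the simple averaging used to combine scores are all hypothetical, the gene names are placeholders, and the Mendelian filter step is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    allele_freq: float    # population allele frequency, 0..1
    pathogenicity: float  # predicted pathogenicity, 0..1
    on_target: bool       # False for off-target calls


def variant_score(variant):
    """Hypothetical score: rarer and more damaging variants rank higher."""
    return variant.pathogenicity * (1.0 - variant.allele_freq)


def prioritize(variants, phenotype_scores, max_allele_freq=0.01):
    """Filter variants, then rank genes by a combined score.

    phenotype_scores maps gene -> phenotypic similarity (0..1) between the
    patient and phenotypes known for that gene, e.g. from model organisms.
    """
    # Remove off-target and common variants.
    kept = [v for v in variants if v.on_target and v.allele_freq <= max_allele_freq]
    ranked = []
    for v in kept:
        pheno = phenotype_scores.get(v.gene, 0.0)
        # Assumption: plain average of the two scores.
        combined = (variant_score(v) + pheno) / 2.0
        ranked.append((combined, v.gene))
    ranked.sort(reverse=True)
    return ranked


variants = [
    Variant("GENE_A", 0.0, 0.90, True),
    Variant("GENE_B", 0.0, 0.95, True),
    Variant("GENE_C", 0.2, 0.90, True),   # common, filtered out
    Variant("GENE_D", 0.0, 0.99, False),  # off-target, filtered out
]
phenotype_scores = {"GENE_A": 0.8, "GENE_B": 0.1}

ranked_genes = [gene for _, gene in prioritize(variants, phenotype_scores)]
print(ranked_genes)
```

GENE_B has the more damaging variant on its own, but GENE_A ranks first once phenotypic similarity is taken into account, which is the point of combining the two kinds of evidence.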

Page 31: Envisioning a world where everyone helps solve disease

York platelet syndrome and STIM1

Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474

Grosse J, J Clin Invest 2007 117: 3540-50

UDP_2542: Impaired platelet aggregation (HP:0003540), Thrombocytopenia (HP:0001873)

Stim1Sax/Sax: Abnormal platelet activation (MP:0006298), Thrombocytopenia (MP:0003179)

http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html

Page 32: Envisioning a world where everyone helps solve disease

4. VERY MANY PEOPLE (WHO HAVE FAITH)

Page 33: Envisioning a world where everyone helps solve disease

Who helped solve the STIM1 UDP_2542 case?

Page 34: Envisioning a world where everyone helps solve disease

Credit extends beyond the publication

Johannes creates stim1 mouse

Melissa annotates patient UDP_2542 with HPO

Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants

Tom writes publication pmid:25577287 about the STIM1 diagnosis

Tom explicitly credits Will as an author but not Melissa.

Page 35: Envisioning a world where everyone helps solve disease

Credit is connected

Credit to Will is asserted, but credit to Melissa can be inferred
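A minimal sketch of how inferred credit could be computed, using the people and artifacts from the previous slide. The dictionary layout is a hypothetical encoding, and the link from Melissa's HPO annotation to Will's analysis is an assumption read off the slide's narrative.

```python
# Who created each artifact (from the example on the previous slide).
CREATED_BY = {
    "stim1 mouse": "Johannes",
    "HPO annotation of UDP_2542": "Melissa",
    "prioritized variants dataset": "Will",
    "pmid:25577287": "Tom",
}

# Which earlier artifacts each artifact was derived from.
# Assumption: the analysis used the HPO annotation as well as the mouse.
DERIVED_FROM = {
    "prioritized variants dataset": ["HPO annotation of UDP_2542", "stim1 mouse"],
    "pmid:25577287": ["prioritized variants dataset"],
}

# Credit stated explicitly on the publication (its authors).
ASSERTED_CREDIT = {"pmid:25577287": {"Tom", "Will"}}


def contributors(artifact):
    """Collect creators of the artifact and of everything it derives from."""
    found = set()
    seen = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in CREATED_BY:
            found.add(CREATED_BY[current])
        stack.extend(DERIVED_FROM.get(current, []))
    return found


paper = "pmid:25577287"
asserted = ASSERTED_CREDIT[paper]
inferred = contributors(paper) - asserted
print(sorted(inferred))  # ['Johannes', 'Melissa']
```

Walking the derivation links backwards from the publication surfaces contributors who are not listed as authors.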

Page 36: Envisioning a world where everyone helps solve disease
Page 37: Envisioning a world where everyone helps solve disease
Page 38: Envisioning a world where everyone helps solve disease
Page 39: Envisioning a world where everyone helps solve disease
Page 40: Envisioning a world where everyone helps solve disease

Who is in the graph?

Melissa Haendel, Peter Robinson, Chris Mungall, Sebastian Kohler, Cindy Smith, Nicole Vasilevsky, Sandra Dolken

Johannes Grosse, Attila Braun, David Varga-Szabo, Niklas Beyersdorf, Boris Schneider, Lutz Zeitlmann, Petra Hanke, Patricia Schropp, Silke Mühlstedt, Carolin Zorn, Michael Huber, Carolin Schmittwolf, Wolfgang Jagla, Philipp Yu, Thomas Kerkau, Harald Schulze, Michael Nehls, Bernhard Nieswandt

Thomas Markello, Dong Chen, Justin Y. Kwan, Iren Horkayne-Szakaly, Alan Morrison, Olga Simakova, Irina Maric, Jay Lozier, Andrew R. Cullinane, Tatjana Kilo, Lynn Meister, Kourosh Pakzad, Sanjay Chainani, Roxanne Fischer, Camilo Toro, James G. White, David Adams, Cornelius Boerkoel, William A. Gahl, Cynthia J. Tifft, Meral Gunay-Aygun

Melissa Haendel, David Adams, David Draper, Bailey Gallinger, Joie Davis, Nicole Vasilevsky, Heather Trang, Rena Godfrey, Gretchen Golas, Catherine Groden, Michele Nehrebecky, Ariane Soldatos, Elise Valkanas, Colleen Wahl, Lynne Wolfe

Elizabeth Lee, Amanda Links, Will Bone, Murat Sincan, Damian Smedley, Jules Jacobson, Nicole Washington, Elise Flynn, Sebastian Kohler, Orion Buske, Marta Girdea, Michael Brudno, Jeremy Band

Hans Goeble, Karen Balbach, Nadine Pfeifer, Sandra Werner, Christian Linden

Roles: Clinical/care, Pathology, Ontologist, CS/informatics, Curator, Basic research

Page 41: Envisioning a world where everyone helps solve disease

Tracking Evidence and Provenance of G2P Associations

Evidence is a collection of information that is used to support a scientific claim or association

Provenance is a history of the processes that led to the claim being made and of the entities that participated in those processes

Value of Evidence and Provenance Metadata:
context to evaluate credibility/confidence
support for filtering and analysis of data
detailed history for attribution

Page 42: Envisioning a world where everyone helps solve disease

Evidence and Provenance for a Variant-Phenotype Association

Page 43: Envisioning a world where everyone helps solve disease

Who is missing?

http://haluzz.deviantart.com/art/Waldo-at-the-hipster-party-273602450

Page 44: Envisioning a world where everyone helps solve disease

What about patients? Can they help too?

HP:0000252
Pref Label: Microcephaly
Synonyms: Decreased Head Circumference; Reduced Head Circumference; Small head circumference
Suggested Synonyms: Small Head; Little Head; Small Skull; Little Skull; Small Cranium…

Small head

Microcephaly

https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
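A rough sketch of how lay synonyms could let patient wording resolve to an ontology term. The record layout and function names are hypothetical; the labels are the ones visible on the slide, and the suggested synonym list is truncated there, so only the visible entries are included.

```python
# Term record built from the slide. The field names are a hypothetical layout.
MICROCEPHALY = {
    "id": "HP:0000252",
    "label": "Microcephaly",
    "synonyms": [
        "Decreased Head Circumference",
        "Reduced Head Circumference",
        "Small head circumference",
    ],
    # Suggested (lay) synonyms; the slide's list continues beyond these.
    "suggested_synonyms": [
        "Small Head",
        "Little Head",
        "Small Skull",
        "Little Skull",
        "Small Cranium",
    ],
}


def normalize(phrase):
    """Collapse whitespace and ignore case so minor variations still match."""
    return " ".join(phrase.split()).casefold()


def build_index(terms):
    """Map every label and synonym to its term identifier."""
    index = {}
    for term in terms:
        names = [term["label"], *term["synonyms"], *term["suggested_synonyms"]]
        for name in names:
            index[normalize(name)] = term["id"]
    return index


def lookup(index, phrase):
    """Return the term id for a phrase, or None if nothing matches."""
    return index.get(normalize(phrase))


index = build_index([MICROCEPHALY])
print(lookup(index, "small  head"))   # HP:0000252
print(lookup(index, "Microcephaly"))  # HP:0000252
print(lookup(index, "big head"))      # None
```

With the lay synonyms indexed, a patient's "small head" and a clinician's "Microcephaly" land on the same identifier.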

Page 45: Envisioning a world where everyone helps solve disease

Job opening: https://goo.gl/MlcnR5

Focusing on building ontologies and semantic web technologies to represent research, attribution, provenance, and scholarly communication

@ontowonka

Page 46: Envisioning a world where everyone helps solve disease

Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143; BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)

PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser

www.monarchinitiative.org/page/team