Wimmics seminar--drug interaction knowledge base, micropublication, open annotation --2014-10-17

Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base

Jodi Schneider, Paolo Ciccarese, Tim Clark and Richard D. Boyce

WIMMICS Seminar Talk
Inria, Sophia Antipolis, France
17 October 2014

Goal of this project

Construct & maintain a knowledge base linking to evidence (i.e. data, methods, materials), where:

• Each ASSERTION in the knowledge base has a SUPPORT GRAPH of claims and evidence
• Each SUPPORT GRAPH element (claims, data, methods, materials) is dynamically linked to specific QUOTED ELEMENTS in source documents on the Web
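The goal above can be pictured as a small data structure. This is only a sketch under stated assumptions: the class names (`Quote`, `SupportElement`, `Assertion`), the function `collect_quotes`, the example.org URL, and the placeholder labels are all hypothetical and are not taken from the DIKB, MP, or OA.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Quote:
    """A quoted element in a source document on the Web (hypothetical shape)."""
    source_url: str
    exact_text: str


@dataclass
class SupportElement:
    """One node of a support graph: a claim, data, method, or material."""
    kind: str  # "claim", "data", "method", or "material"
    label: str
    quote: Optional[Quote] = None
    supported_by: List["SupportElement"] = field(default_factory=list)


@dataclass
class Assertion:
    """A knowledge-base assertion together with the roots of its support graph."""
    text: str
    support: List[SupportElement] = field(default_factory=list)


def collect_quotes(assertion: Assertion) -> List[Quote]:
    """Walk the support graph depth-first and return every linked quote."""
    quotes = []
    stack = list(reversed(assertion.support))
    while stack:
        element = stack.pop()
        if element.quote is not None:
            quotes.append(element.quote)
        stack.extend(reversed(element.supported_by))
    return quotes


# Placeholder content: the URL and the data/method labels are invented for illustration.
data = SupportElement("data", "placeholder data item",
                      Quote("https://example.org/paper-1", "placeholder quoted sentence"))
method = SupportElement("method", "placeholder method description")
claim = SupportElement("claim", "escitalopram does not inhibit CYP2D6",
                       Quote("https://example.org/paper-1",
                             "escitalopram does not inhibit CYP2D6"),
                       supported_by=[data, method])
assertion = Assertion("escitalopram does not inhibit CYP2D6", [claim])
quotes = collect_quotes(assertion)
```

Each element carries its own optional quote, so a viewer could in principle jump from any claim, data item, method, or material to the passage it came from.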

Why? It's time-consuming to find the state of the art in a field!

• What do we know about field F? assertion X?
• What evidence supports assertion X?
• What assumptions are used in research supporting assertion X?

Application domain: medication safety

• Potential drug-drug interactions
  – 2+ drugs, where interaction is known to be possible
• Adverse drug event
  – Harm caused by medication
  – Huge public health issue: in the U.S. each year, > 1.5 million preventable adverse drug events
• Post-market safety issues

Drug information sources

• Evidence is selected & assessed by editorial boards
  – MICROMEDEX, FirstDataBank, Q-DIPS
• E.g. MICROMEDEX:
  – "In-house team of 90+ clinically-trained editorial staff" (physicians, clinical pharmacists, nurses, medical librarians)
  – "Content is reviewed for clinical accuracy and relevance."
  – "Critical content areas may undergo an additional review by members of our Editorial Board."
• Potential problems
  – A time-consuming (i.e. expensive), collaborative process
  – Maintaining internal and external consistency is non-trivial

Build on 3 things

• Drug Interaction Knowledge Base [Boyce2007, Boyce2009]
• Open Annotation Data Model [W3C2013]
• Micropublications Ontology [Clark2014]

Drug Interaction Knowledge Base (DIKB)

– Hand-constructed knowledge base
– Safety issues when 2 drugs are taken together
– Focus is on EVIDENCE

[Boyce2007, Boyce2009]

Drug Interaction Knowledge Base (DIKB) - Boyce 2007-2009

DIKB supports queries about assertions & evidence:

• Get all assertions that are supported by a U.S. FDA regulatory guidance statement

• Are the evidence-use assumptions concordant, unique, and non-ambiguous?

• Which assertions are supported/refuted by just one type of evidence?
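Queries of this kind can be illustrated over a flat list of evidence records. The sketch below is an assumption-laden toy, not the DIKB's actual query mechanism: the record layout, the assertion identifiers, the relation labels, and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (assertion id, evidence type, relation to the assertion)
evidence_items = [
    ("assertion-A", "clinical trial", "supports"),
    ("assertion-A", "in vitro experiment", "supports"),
    ("assertion-B", "FDA guidance statement", "supports"),
    ("assertion-C", "in vitro experiment", "refutes"),
    ("assertion-C", "in vitro experiment", "supports"),
]


def supported_by_type(items, evidence_type):
    """Assertions with at least one supporting item of the given evidence type."""
    return sorted({a for a, t, rel in items
                   if t == evidence_type and rel == "supports"})


def single_evidence_type_assertions(items):
    """Assertions whose evidence (for or against) is all of one type."""
    types_by_assertion = defaultdict(set)
    for assertion, evidence_type, _relation in items:
        types_by_assertion[assertion].add(evidence_type)
    return sorted(a for a, types in types_by_assertion.items()
                  if len(types) == 1)


fda_supported = supported_by_type(evidence_items, "FDA guidance statement")
single_type = single_evidence_type_assertions(evidence_items)
```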

Evidence Entry Interface (2008)

Limitations of DIKB v1.2

• Minimal argumentation model
  – swanco:citesAsSupportingEvidence
  – swanco:citesAsRefutingEvidence
• Cannot recover the source text
  – Document-level citation
  – Quote & section citation preferable
• Level of detail
  – Want more detail on data, methods, materials

Open Annotation Data Model

http://www.openannotation.org/spec/core/
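As a rough illustration of what an Open Annotation looks like, the sketch below builds one as a JSON-LD-style Python dictionary. The `oa:` terms used here (Annotation, hasBody, hasTarget, SpecificResource, hasSource, hasSelector, TextQuoteSelector, exact, prefix, suffix) come from the OA vocabulary; the function name, the example.org URIs, and the choice to point the body at a claim URI are assumptions made for the example.

```python
import json

OA_CONTEXT = {"oa": "http://www.w3.org/ns/oa#"}


def make_quote_annotation(annotation_id, claim_uri, source_url,
                          exact, prefix="", suffix=""):
    """Build an annotation whose target is a quoted span of a source document."""
    return {
        "@context": OA_CONTEXT,
        "@id": annotation_id,
        "@type": "oa:Annotation",
        # Body: the thing said about the target (here, a hypothetical claim URI).
        "oa:hasBody": {"@id": claim_uri},
        # Target: a specific part of the source, picked out by a text quote.
        "oa:hasTarget": {
            "@type": "oa:SpecificResource",
            "oa:hasSource": {"@id": source_url},
            "oa:hasSelector": {
                "@type": "oa:TextQuoteSelector",
                "oa:exact": exact,
                "oa:prefix": prefix,
                "oa:suffix": suffix,
            },
        },
    }


annotation = make_quote_annotation(
    "https://example.org/annotations/1",   # hypothetical
    "https://example.org/claims/1",        # hypothetical
    "https://example.org/paper-1",         # hypothetical
    exact="escitalopram does not inhibit CYP2D6",
)
serialized = json.dumps(annotation, indent=2)
```

The selector stores the quoted text itself rather than character offsets alone, which is what lets a quote be found again if the source page changes slightly.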

Micropublications Ontology (MP)

Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

http://purl.org/mp

Quotes integrated (MP using OA)

http://purl.org/mp

Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

Enhancing the DIKB with MP and OA

• Represent the overall argument of the paper
  – Support & challenge relationships
  – Data, methods, materials
• Semantic tagging, so drugs & proteins can be queried using knowledge from other sources
• Make quotes actionable (show in source context)
• Handle new competency questions of 4 types:
  1. Finding assertions and evidence
  2. Enabling updates
  3. Assessing the evidence
  4. Statistics for analytics/KB maintenance

Model overall argument of the paper

Statements

Methods

Semantic tagging using DRON, Protein Ontology, DIKB

Quote stored in OA, with link to source
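To make a stored quote actionable, a viewer has to find it again in the source text. The following is a minimal sketch of one way to do that with a prefix/exact/suffix triple; it is an assumption about how resolution could work, not a description of Domeo or any OA implementation, and the sample document text is invented.

```python
def locate_quote(document_text, exact, prefix="", suffix=""):
    """Return (start, end) offsets of the quoted text, or None if not found.

    The prefix and suffix disambiguate quotes that occur more than once.
    """
    needle = prefix + exact + suffix
    position = document_text.find(needle)
    if position == -1:
        return None
    start = position + len(prefix)
    return (start, start + len(exact))


# Invented source sentence wrapping the claim quoted elsewhere in this talk.
text = "In this study, escitalopram does not inhibit CYP2D6 in vitro."
span = locate_quote(text, "escitalopram does not inhibit CYP2D6",
                    prefix="In this study, ")
missing = locate_quote(text, "a quote that is not present")
```

An exact-match lookup like this fails as soon as the source wording changes; a real resolver would likely need fuzzy matching.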

New competency questions to answer

1. Finding assertions and evidence
• List all assertions that are not supported by evidence
  – By data, by methods, by materials
• What is the in vitro evidence for assertion X? the in vivo evidence?
  – With provenance: Give me back the original data tables

2. Enabling updates
• List all evidence that has been flagged as rejected from entry into the knowledge base
  – By data, by methods, by materials
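The first two question types can be sketched over in-memory records. Everything here is hypothetical: the record fields, the status values "accepted" and "rejected", the assertion identifiers, and the function names are assumptions for illustration, not the project's schema.

```python
SUPPORT_KINDS = ("data", "method", "material")

# Hypothetical evidence records attached to assertions
evidence = [
    {"assertion": "assertion-A", "kind": "data", "status": "accepted"},
    {"assertion": "assertion-A", "kind": "method", "status": "accepted"},
    {"assertion": "assertion-B", "kind": "data", "status": "rejected"},
]
assertions = ["assertion-A", "assertion-B", "assertion-C"]


def missing_support(assertions, evidence):
    """Map each assertion to the support kinds with no accepted evidence."""
    result = {}
    for assertion in assertions:
        present = {e["kind"] for e in evidence
                   if e["assertion"] == assertion and e["status"] == "accepted"}
        missing = [kind for kind in SUPPORT_KINDS if kind not in present]
        if missing:
            result[assertion] = missing
    return result


def rejected_evidence(evidence, kind=None):
    """List evidence flagged as rejected, optionally restricted to one kind."""
    return [e for e in evidence
            if e["status"] == "rejected" and (kind is None or e["kind"] == kind)]


gaps = missing_support(assertions, evidence)
rejected_data = rejected_evidence(evidence, kind="data")
```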

New competency questions to answer

3. Assessing the evidence
• Which single evidence items act as support or rebuttal for multiple assertions of type X?
• Which research group conducted the study used for evidence item X?
• What are the assumptions required for use of this evidence item to support/refute assertion X?

4. Statistics for analytics/KB maintenance
• Number of evidence items for and against each assertion type
• Show the distribution of the levels of evidence for various assertion types
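The statistics questions amount to grouped counts. A minimal sketch, assuming evidence links are available as (assertion type, relation) pairs; the assertion type names and relation labels below are made up for the example.

```python
from collections import Counter

# Hypothetical (assertion type, relation) pairs, one per evidence item
links = [
    ("inhibits", "supports"),
    ("inhibits", "supports"),
    ("inhibits", "challenges"),
    ("substrate-of", "supports"),
]


def tally(links):
    """Count evidence items for and against each assertion type."""
    counts = {}
    for assertion_type, relation in links:
        counts.setdefault(assertion_type, Counter())[relation] += 1
    return counts


counts = tally(links)
for assertion_type, by_relation in sorted(counts.items()):
    print(assertion_type, by_relation["supports"], by_relation["challenges"])
```

The same grouping, keyed on a level-of-evidence field instead of the relation, would give the distribution asked for in the second bullet.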

Modeling challenges

• To date, MP has not been used to represent both unstructured text claims ("escitalopram does not inhibit CYP2D6") and the logical sentences written in some formalism

• Efficient querying will be needed, even when the evidence base scales. We are using an iterative design-and-test approach.

To support distributed community annotation/curation

• Enabling social environment
  – Lots of human time spent curating knowledge bases, e.g. editorial boards review evidence
• Enabling technical environment
  – MP and OA take account of provenance
  – OA is being increasingly adopted (several annotation systems and W3C standards track)

Future work: Support distributed community annotation/curation

• Create a pipeline for extracting potential drug-drug interaction (PDDI) mentions from scientific & clinical literature
• Ensure distributed curation is usable for domain experts
  – For adding annotations: Existing MP plugin for Domeo
  – For viewing annotations: Want them highlighted in a web-based interface BUT resolving annotations requires a method for pointing to paywalled/subscription PDF & HTML

Context

– Part of “Addressing gaps in clinically useful evidence on drug-drug interactions”, 4-year U.S. National Library of Medicine project

– Focusing on PDDI assertions for a number of commonly prescribed drugs (anticoagulants, statins, psychotropics)

Acknowledgements

• Funding
  – ERCIM Alain Bensoussan Fellowship Programme under FP7/2007-2013, grant agreement 246016
  – National Library of Medicine (1R01LM011838-01)
• Thanks to the Evidence Panel of Addressing PDDI Evidence Gaps: Carol Collins, Lisa Hines, John R Horn, and Phil Empey

• Thanks to programmer Yifan Ning

Web interface for data entry (1)

Open Annotation Data Model
