DESCRIPTION
A session delivered at SemTech NYC in Oct 2012. It contains an update to a session earlier in the year at SemTech SF in Jun 2012. Abstract: As access to a richer set of knowledge and research continues to be critical to the healthcare community, the users of healthcare and life science solutions are demanding the same level of discoverability, integration, and innovation from their professional tools that they enjoy in their personal applications. Through the Smart Content initiative, Elsevier seeks to semantically enrich its diverse offerings of health sciences content, both to improve the performance of existing online resources and to enable the creation of the next generation of digital products. In this session, Alan Yagoda will discuss Elsevier’s efforts in developing Smart Content capabilities to power a new portfolio of strategic product offerings. The journey into smarter search and discovery resulted in a new infrastructure with a rich set of semantic capabilities, including the development of a standardized medical taxonomy called EMMeT (Elsevier’s Merged Medical Taxonomy), indexing and content enrichment, and linked data services.
Elsevier Health Sciences
SemTechBiz 2012 Conference October 17, 2012
Alan Yagoda VP, Business Technology [email protected] @alanyagoda
Smart Content Drives Smart Applications The Future Of Using Knowledge In Healthcare
Elsevier is the largest Science, Technical and Medical Publisher in the world. In the area of Health Sciences, Elsevier publishes leading brands including The Lancet, Braunwald’s Heart Disease, Gray’s Anatomy, and the Netter Atlases among others. In addition, Elsevier produces leading online clinical support tools and products including:
• Clinical Key
• MD Consult
• Procedures Consult
• Mosby’s Nursing Consult
• CPMRC Nursing Care Plans
• Gold Standard Drug Database
• MEDai Analytics for Managed Care Plans
Copyright © 2012 Elsevier, Inc. | All Rights Reserved
About Elsevier
The Challenge
The Challenge: Getting doctors the right information to make the best decisions and provide the best clinical care
• Trusted: Authoritative medical and surgical content from Elsevier.
• Comprehensive: Integrated Medline and 3rd party content.
• Speed To Answer: Fast discoverability of the most relevant answers and more intuitive searching.
Introducing Smart Content
Copyright 2011 Outsell Gilbane Services, Inc.
http://www.outsellinc.com http://gilbane.com/xml/2009/11/what-is-smart-content.html#ixzz0hnuRhaBc
Taxonomy-Powered Content = Smart Content
Content today with structured XML
Content with applied taxonomy
Smart Content At Elsevier
Entities, concepts and relationships
Smart Content Applications
Better understanding through analysis and visualization
• Question & Answer
• Actionable Content & Alerts
• Tag clouds
• Heatmaps
• Animations
Better discovery through semantic search & navigation
• Faceted search & browse
• Ontology-driven navigation
• Task-specific results
• Personalized/localized results
• Link to evidence-based content
New knowledge through aggregation and synthesis
• Topic pages
• Social network maps
• Geolocation maps
• Data integration and mashups
• Text mining
• Inference and Reasoning
Images
Text
Tables
Elsevier Content
Elsevier knowledge organization systems
Linked data from partners and the Web
Partner Content
Concept Mapping
Making Smart Content Work in the Clinical Setting
250K+ Core Clinical Concepts
1M+ Hierarchical Relationships
1M+ Ontological Relationships
1M+ Synonyms
• Vast amounts of content made easily discoverable
• Specialty-specific navigation
• Dynamic clinical summary creation
• Meaningful related content recommendations
Patient Ed, Drug Info, Procedural Videos
Clinical Summaries
EMMeT
Elsevier Custom
UMLS
Books
Journals Guidelines Clinical Trials
Elsevier Merged Medical Taxonomy (EMMeT)
Introducing EMMeT (Elsevier Merged Medical Taxonomy)
Medical Name: Malignant Neoplasm of the Breast
Consumer Friendly Name: Breast Cancer
Synonyms: Malignant Tumor of Breast; Malignant Breast Neoplasm; Breast Ca
Codes: ICD9 – 174.9; MeSH – D001943; SNOMED-CT – 190121004
Semantic Type/Group: Neoplastic Process/Disease
Parent Terms
• Breast Disorders
• Cancer of the Thorax
• Mammary Neoplasms
• More…
Children Terms
• Breast Sarcoma
• Familial Breast Cancer
• Malignant lymphoma of the Breast
• Malignant Neoplasm of the breast outer quadrant
• More…
Semantic Relationships
• Symptoms: Breast Lump, Nipple Retraction, …
• Diagnostic Procedures: Mammography, Breast Biopsy, …
• Treatment Procedures: Chemotherapy, Mastectomy, …
• Medications: Tamoxifen, Doxorubicin, …
• Risk Factors: Family History, Genetics, Predisposition, …
• Prevention: Screening, Preemptive Mastectomy, …
• Complications: Metastatic Cancer, …
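To make the shape of the concept record above concrete, here is a minimal Python sketch of how one such entry might be held in memory. The `Concept` class, its field names, and the `related` helper are assumptions made for illustration; the slides do not describe EMMeT's actual schema or storage format. Only the values (names, codes, related terms) are taken from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical in-memory shape for one taxonomy concept (not EMMeT's real schema)."""

    medical_name: str
    consumer_name: str
    synonyms: list[str] = field(default_factory=list)
    codes: dict[str, str] = field(default_factory=dict)
    semantic_type: str = ""
    parents: list[str] = field(default_factory=list)
    children: list[str] = field(default_factory=list)
    relationships: dict[str, list[str]] = field(default_factory=dict)

    def labels(self) -> set[str]:
        """Every lower-cased name a search box could match against."""
        return {s.lower() for s in [self.medical_name, self.consumer_name, *self.synonyms]}


def related(concept: Concept, relation: str) -> list[str]:
    """Return the terms linked by one semantic relationship, or an empty list."""
    return concept.relationships.get(relation, [])


# Values copied from the slide; lists are truncated there, so they are partial here too.
breast_cancer = Concept(
    medical_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    semantic_type="Neoplastic Process",
    parents=["Breast Disorders", "Cancer of the Thorax", "Mammary Neoplasms"],
    children=["Breast Sarcoma", "Familial Breast Cancer"],
    relationships={
        "symptoms": ["Breast Lump", "Nipple Retraction"],
        "medications": ["Tamoxifen", "Doxorubicin"],
    },
)

print(sorted(breast_cancer.labels()))
print(related(breast_cancer, "medications"))
```

Keeping synonyms and codes on the same record is what lets a query for "Breast Ca" or for a MeSH identifier resolve to the same concept.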
Automated Indexing: Weighted Tags for Better Search
Paragraph-level SMART Content tags uncover highly-relevant information not necessarily evident from the title or abstract alone.
Article-level SMART Content tags help confirm relevance and provide a topical overview about a piece of content.
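The two tag levels described above can be sketched roughly as follows. This is an illustrative toy, not the indexing engine from the talk: the vocabulary `VOCAB`, the functions `tag_paragraph` and `tag_article`, and the choice to weight an article-level tag by its share of all concept mentions are assumptions. The slides do not say how weights are actually computed.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary: concept -> surface forms. Not real EMMeT data.
VOCAB = {
    "Breast Cancer": ["breast cancer", "malignant neoplasm of the breast", "breast ca"],
    "Tamoxifen": ["tamoxifen"],
    "Mammography": ["mammography", "mammogram"],
}


def tag_paragraph(text):
    """Paragraph-level tags: count whole-word mentions of each concept."""
    lowered = text.lower()
    counts = Counter()
    for concept, forms in VOCAB.items():
        for form in forms:
            pattern = r"\b" + re.escape(form) + r"\b"
            counts[concept] += len(re.findall(pattern, lowered))
    return {concept: n for concept, n in counts.items() if n > 0}


def tag_article(paragraphs):
    """Article-level tags: each concept's share of all mentions in the article."""
    per_paragraph = [tag_paragraph(p) for p in paragraphs]
    totals = Counter()
    for tags in per_paragraph:
        totals.update(tags)
    grand_total = sum(totals.values())
    weights = {c: n / grand_total for c, n in totals.items()} if grand_total else {}
    return per_paragraph, weights


paragraphs = [
    "Screening relies on mammography.",
    "Tamoxifen is used in breast cancer; tamoxifen dosing varies.",
]
per_paragraph, article_weights = tag_article(paragraphs)
print(per_paragraph)
print(article_weights)
```

The paragraph-level result shows where a concept is discussed even if the title never mentions it, while the article-level weights give the topical overview.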
EMMeT Powered Auto-Suggest
Linked Data Repository
Rivastigmine, a cholinesterase inhibitor, has been used to treat delirium in elderly patients with stroke. 1 A biologically plausible premise—that impaired cholinergic transmission might either cause or worsen delirium—led to a randomised, placebo-controlled, double-blind trial by Maarten van Eijk and colleagues 2 in The Lancet in which they added rivastigmine or placebo to usual treatment of patients in intensive care. The trial was halted at 104 patients by the drug safety and monitoring board (DSMB) because of increased mortality (12/54 in the rivastigmine group, 4/50 in the placebo group; p=0·07) and a worse outcome. The rivastigmine group …
Linked Data Repository (LDR): Warehouse for Smart Content Enhancements
Title: Delirium treatment: An unmet challenge
Annotation labels: Drug, Clinical finding, Disease
• Service platform that provides a rich semantic layer that enables search and discovery of metadata.
• Transforms content into knowledge data to allow exploration of extracted knowledge, content analysis, and visualization.
• Enhances extracted knowledge of Elsevier assets by interlinking data with related sources of medical and scientific content and data.
• Optimized for high-volume read-write for use by end-user products.
• Provides service layer APIs for ease of integration.
ATC: N06DA03 Drug: Rivastigmine
med:diseases Delirium med:drugs Rivastigmine
Elsevier
Trial: NCT00623103 Intervention: Rivastigmine Condition: Delirium
LinkedCT Trial: NCT00623103 Serious Adverse events: Atrial fibrillation
owl:sameAs
owl:sameAs
foaf:page
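The interlinking shown in the diagram above can be illustrated with a small Python sketch that follows `owl:sameAs` links to connect a tagged drug to a clinical trial and its adverse events. Everything structural here is an assumption: the `els:`, `atc:`, and `linkedct:` identifiers, the predicate names, and which nodes the two `owl:sameAs` links join are invented for the example, since the extracted slide text does not preserve the arrows. Only the literal values (Rivastigmine, Delirium, N06DA03, NCT00623103, Atrial fibrillation) come from the slide.

```python
SAME_AS = "owl:sameAs"

# Hypothetical identifiers and link placement; literal values are from the slide.
TRIPLES = [
    ("els:doc1", "med:drugs", "els:Rivastigmine"),
    ("els:doc1", "med:diseases", "els:Delirium"),
    ("els:Rivastigmine", SAME_AS, "atc:N06DA03"),
    ("els:Rivastigmine", SAME_AS, "linkedct:intervention/Rivastigmine"),
    ("linkedct:trial/NCT00623103", "linkedct:intervention", "linkedct:intervention/Rivastigmine"),
    ("linkedct:trial/NCT00623103", "linkedct:condition", "Delirium"),
    ("linkedct:trial/NCT00623103", "linkedct:seriousAdverseEvent", "Atrial fibrillation"),
]


def same_as_groups(triples):
    """Union-find over owl:sameAs links; returns a node -> representative function."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for subject, predicate, obj in triples:
        if predicate == SAME_AS:
            parent[find(subject)] = find(obj)
    return find


def trials_for_drug(triples, drug):
    """Trials whose intervention is the given drug or any sameAs-equivalent of it."""
    find = same_as_groups(triples)
    target = find(drug)
    return sorted(
        s for s, p, o in triples
        if p == "linkedct:intervention" and find(o) == target
    )


def adverse_events(triples, trial):
    """Serious adverse events recorded against one trial."""
    return [o for s, p, o in triples if s == trial and p == "linkedct:seriousAdverseEvent"]


trials = trials_for_drug(TRIPLES, "els:Rivastigmine")
events = adverse_events(TRIPLES, trials[0])
print(trials, events)
```

The point of the equivalence step is that the article only needs to be tagged once with the publisher's own concept; the external trial data becomes reachable through the shared identity.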
Represent Enhancements and Vocabularies In RDF Satellites
LDR
Example RDF Statements
• Tags from a taxonomy for a given document
• Document sections relevant to a given concept
• Document sections providing answers to a given question
• Genes mentioned in a given document
• Documents supporting or disputing conclusions of a given document
• Concepts in the areas of expertise for a given author
Creation of Satellite Standards
• Linked data compliant RDF representing metadata objects
• Leverage common namespaces from dct, pav, rdf, skos
• Taxonomies in SKOS to enhance portability in the linked data world
• Subject tagging against a vocabulary representing extracted knowledge
• Concept URIs that can be equated to URIs in linked data
• Support RDF/XML and Turtle
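As a rough illustration of the satellite conventions listed above, the following Python sketch emits Turtle for one vocabulary entry in SKOS and one document tagged against it with `dct:subject`. The `ex:` namespace, the identifiers `BreastCancer`, `BreastDisorders`, and `doc1`, and all helper function names are hypothetical; the `dct` and `skos` namespace URIs are the standard ones. The choice of `dct:subject` for tagging is an assumption, since the slides name the namespaces but not the exact properties used.

```python
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",  # hypothetical namespace for this sketch
}


def prefix_block():
    """Turtle @prefix declarations for the namespaces used below."""
    return "\n".join(f"@prefix {name}: <{uri}> ." for name, uri in PREFIXES.items())


def literal(text, lang="en"):
    """Quote a string as a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def describe(subject, pairs):
    """Serialize one subject with its predicate-object pairs."""
    body = " ;\n".join(f"    {predicate} {obj}" for predicate, obj in pairs)
    return f"{subject}\n{body} ."


def concept_satellite(concept_id, pref_label, alt_labels, broader):
    """Vocabulary satellite: one SKOS concept with labels and parent links."""
    pairs = [("a", "skos:Concept"), ("skos:prefLabel", literal(pref_label))]
    pairs += [("skos:altLabel", literal(label)) for label in alt_labels]
    pairs += [("skos:broader", f"ex:{parent}") for parent in broader]
    return describe(f"ex:{concept_id}", pairs)


def tag_satellite(doc_id, concept_ids):
    """Annotation satellite: subject tags linking a document to concepts."""
    return describe(f"ex:{doc_id}", [("dct:subject", f"ex:{c}") for c in concept_ids])


turtle = "\n\n".join([
    prefix_block(),
    concept_satellite(
        "BreastCancer",
        "Malignant Neoplasm of the Breast",
        ["Breast Cancer", "Breast Ca"],
        ["BreastDisorders"],
    ),
    tag_satellite("doc1", ["BreastCancer"]),
])
print(turtle)
```

Keeping the vocabulary and the annotations in separate satellites means either can be regenerated without touching the other, as long as the concept URIs stay stable.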
LDR Semantic Infrastructure
[Architecture diagram. Labeled components:]
• Discovery Services (Semantic Knowledgebase)
• Data Space Services
• Linked Data Pipeline Services (Hadoop)
• JSON Transform
• N-Quads Extract
• Reasoning
• Interlinking
• RDF Loader
• Ontology Svcs
• Analytics
• …
• Annotation Satellites
• Linked Data Loader
• MongoDB NoSQL
• Access & Entitlements
• Asset Satellites
• Vocab Satellites
• 3rd Party Data
• SOLR/SIREn
• Admin & Monitoring
• Analytics
• Atom Feed
• Discovery Svc API (REST)
• Ontology Service
• SPARQL
• Alerts
• Virtuoso Triplestore
• AWS Cloud Management
Smart Content Indexing Pipeline
[Pipeline diagram. Labeled components:]
• Elsevier Content
• 3rd Party Content
• Instit. Content
• Tagging and Indexing Services (Concepts, Chapters, Articles, Guidelines, etc.)
• RDF Generation
• EMMeT Semantic Network
• Vocabulary SKOS Generation
• Product-specific Smart Content Search Index
• Linked Data
• Amazon S3
• Vocab & Annotation RDF Satellites
• Linked Data & 3rd Party Data
Smart Content In Action
Speed to Answer: Most relevant preview
Trend Analysis Of Special Health Topics (Mashups)
Improved Editorial Workflow: Smart Collection Tool
Comprehensive Drug Reference
• Moving world-class content online to Point of Care.
• Extracted knowledge is linked for further enrichment.
• Information is condensed, immediate and actionable.
Integrated Drug Discovery Research
• Recommended research relevant to a patient profile
• Infobutton integration
• Alerts on FDA Announcements.
Linking Patient Data To Evidence-Based Research
Key Learnings
What We Learned On Our Journey To Adoption
Organizational Readiness
• Focus on business value - POCs are invaluable.
• Requires different skillsets.
• Break from traditional IT and follow consumer Internet businesses.
Technical
• Cloud to combine big data and semantics at scale.
• Product and data integration just got easier.
• Need access points besides SPARQL.
• Won’t get it right the first time - Fail fast and cheap.
Data
• Pick standards that are practical and ease adoption.
• Be paranoid about freshness and quality of linked data.
• Optimize for performance – more triples isn’t always better.
Thank you. Alan Yagoda [email protected] @alanyagoda