“Pushing Back” Standards and Standard Organizations in a Semantic Web Enabled World Kerstin Forsberg Informatics Scientist AstraZeneca Mölndal, Sweden Image: Flickr bitpuddle (Twitter @ eric_d_hancock )

Pushing back, standards and standard organizations in a Semantic Web enabled world

DESCRIPTION

Keynote at SWAT4LS (Semantic Web Applications and Tools for Life Science) 2013

Page 1: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing Back” Standards and Standard Organizations in a Semantic Web Enabled World

Kerstin Forsberg
Informatics Scientist
AstraZeneca
Mölndal, Sweden

Image: Flickr bitpuddle (Twitter @eric_d_hancock)

Page 2: Pushing back, standards and standard organizations in a Semantic Web enabled world

AZIT | R&D Information

Purpose

Encourage standard organisations to “Use Standards for Standards”

Agenda

• Standards for Data and Semantics

• Examples of Standard Organizations now looking into using Semantic Web

• Provenance/Justification for Mappings

2 Kerstin Forsberg | SWAT4LS, Dec 10th 2013

Page 3: Pushing back, standards and standard organizations in a Semantic Web enabled world

Kerstin Forsberg (@kerfors)

• “Volvo Web Wave Project” 1995-1997: W3C conferences 1996 & 1999, Dublin Core, RDF

• “Extensible use of RDF in a business context”, paper presented at the W3C WWW9 conference, 2000, Amsterdam

• “Advancing translational research with the Semantic Web”, joint W3C HCLS paper in BMC Bioinformatics, 2007

• “Linked data, an opportunity to mitigate complexity in pharmaceutical research and development”: summary of experiences from LarKC and W3C HCLS, 2011, together with my colleague Bosse Andersson

“Information architect, semantic web and linked data enthusiast caring about clinical trial data.”

Page 4: Pushing back, standards and standard organizations in a Semantic Web enabled world

About AstraZeneca

• Alongside our own R&D, we partner with others, combining skills and resources to broaden the potential for successful innovation.

• We believe that only by working together with others who have a part to play in improving healthcare can real progress be made.

• We work closely with others in the healthcare community, including physicians and those who pay for healthcare, to understand their challenges and how we can combine skills and resources to achieve a common goal: improved health.

Page 5: Pushing back, standards and standard organizations in a Semantic Web enabled world

AstraZeneca’s view on “Semantics”

Enabling the hyperconnected enterprise

“We need to build a linked data architecture enabling us to ask questions and solve business problems across a heterogeneous information landscape extending beyond the traditional boundaries of the enterprise.”

semanticsconnectsusall

Page 6: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data and Semantics

Different types of standards

• Entity-based Ontologies

• Concept-based Terminologies/Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

• Information Models

Page 7: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data and Semantics

Examples

• Entity-based Ontologies

• Concept-based Terminologies/Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

• Information Models

Page 8: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing back” – Use standards for standards

1. NCI (National Cancer Institute) Thesaurus

• Entity-based Ontologies

• Concept-based Terminologies/Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

• Information Models

Page 9: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing back” – Use standards for standards

AZ Vocabulary Management team shared this with NCI EVS

• The NCI Thesaurus is an extensive medical vocabulary published by the US National Institutes of Health: http://ncit.nci.nih.gov/

• It is made available in several downloadable formats: http://evs.nci.nih.gov/ftp1/NCI_Thesaurus

• In order for us to use the thesaurus in our system, we need to convert it to RDF, following the SKOS standard: http://www.w3.org/2004/02/skos/

Jim Morris, Informatics Scientist
AstraZeneca R&D Wilmington, USA
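
The conversion step mentioned above could look roughly like the sketch below. It is not the AZ team's actual converter: the input rows, the `X100`-style codes, and the `http://example.org/ncit/` namespace are all made up for illustration, and the real NCI Thesaurus downloads have their own layouts. Only the SKOS terms (`skos:Concept`, `skos:prefLabel`, `skos:notation`, `skos:broader`) come from the standard.

```python
# Hypothetical input rows: (code, preferred label, parent code or None).
# These values are placeholders, not real NCI Thesaurus content.
ROWS = [
    ("X100", "Example disorder", None),
    ("X101", "Example cardiac disorder", "X100"),
]

BASE = "http://example.org/ncit/"  # placeholder namespace, not an NCI URI
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(rows):
    """Yield one N-Triples line per SKOS statement derived from the rows."""
    for code, label, parent in rows:
        subject = f"<{BASE}{code}>"
        yield f"{subject} <{RDF_TYPE}> <{SKOS}Concept> ."
        yield f'{subject} <{SKOS}prefLabel> "{escape_literal(label)}"@en .'
        yield f'{subject} <{SKOS}notation> "{code}" .'
        if parent is not None:
            # The parent in the source hierarchy becomes the broader concept.
            yield f"{subject} <{SKOS}broader> <{BASE}{parent}> ."


ntriples = list(rows_to_ntriples(ROWS))
print("\n".join(ntriples))
```

If the standards organisation published SKOS directly, each consumer would not need to maintain a converter like this one.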

Page 10: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing back” – Use standards for standards

2. MedDRA (Medical Dictionary for Regulatory Activities)

• Entity-based Ontologies

• Concept-based Terminologies/Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

• Information Models

Page 11: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing back” – Use standards for standards

AZ Vocabulary Management team shared this with MedDRA MSSO

Courtland Yockey, Informatics Scientist
AstraZeneca R&D Wilmington, USA

A very simple SKOS rendering of MedDRA:

• term: skos:Concept

• hierarchy level: skos:ConceptScheme

• SMQ: skos:Collection

Approach should be augmented with VoID representation of MedDRA versions and term properties distinguishing active from inactive terms.

skos:Collection is likely not sufficient to support SMQ versioning or the context of terms within an SMQ (e.g. weight).
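
A minimal sketch of the rendering described on this slide, with triples held as plain Python tuples. The identifiers (`T1`, `SMQ_A`, the `ex:` prefix) are hypothetical stand-ins, not real MedDRA codes, and using `skos:inScheme` to tie a term to its hierarchy level is my assumption about how the three-line mapping would be wired together.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "s p o")

# Hypothetical identifiers; real MedDRA codes and SMQ names are not reproduced.
terms = {"T1": "PT", "T2": "PT", "T3": "LLT"}  # term -> hierarchy level
smqs = {"SMQ_A": ["T1", "T2"]}                 # SMQ -> member terms


def render_meddra(terms, smqs, prefix="ex:"):
    """Render terms, hierarchy levels and SMQs as SKOS-style triples."""
    triples = []
    # Each hierarchy level becomes a skos:ConceptScheme.
    for level in sorted(set(terms.values())):
        triples.append(Triple(f"{prefix}{level}", "rdf:type", "skos:ConceptScheme"))
    # Each term becomes a skos:Concept placed in the scheme of its level.
    for term, level in sorted(terms.items()):
        triples.append(Triple(f"{prefix}{term}", "rdf:type", "skos:Concept"))
        triples.append(Triple(f"{prefix}{term}", "skos:inScheme", f"{prefix}{level}"))
    # Each SMQ becomes a skos:Collection listing its member terms.
    for smq, members in sorted(smqs.items()):
        triples.append(Triple(f"{prefix}{smq}", "rdf:type", "skos:Collection"))
        for member in members:
            triples.append(Triple(f"{prefix}{smq}", "skos:member", f"{prefix}{member}"))
    return triples


triples = render_meddra(terms, smqs)
for t in triples:
    print(t.s, t.p, t.o, ".")
```

The `skos:member` statements show the limitation noted above: a plain membership triple has no slot for a per-term weight or for the SMQ version it belongs to.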

Page 12: Pushing back, standards and standard organizations in a Semantic Web enabled world

“Pushing back” – Use standards for standards

3. CDISC (Clinical Data Interchange Standards Consortium)

• Entity-based Ontologies

• Concept-based Terminologies / Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

• Information Models

Page 13: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Clinical Trial Data standardized “containers”

Trial Summary level

Patient level

Submission standards SDTM “designed so [FDA] reviewers with no tools other than perhaps the SAS Viewer would be able to open a dataset and browse it easily”.

Page 14: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Documentation of standardized “containers”

Human-readable documentation in 200+ page PDFs, Excel files (and some in XML).

Page 15: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Data in standardized “containers”

CDISC SDTM Implementation Guideline (IG)

Humans can connect data to data standards.

Page 16: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Documentation of Standard fragments

1. CDISC SDTM Model

2. CDISC SDTM Implementation Guideline (IG)

3. CDISC SDTM Controlled Terminology

Humans can connect data to data standards and connect the different standard fragments to each other.

Page 17: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Linked Clinical Data Standards

• CDISC2RDF started as a cross-pharma pre-competitive project with AstraZeneca, Roche, W3C et al. to showcase Semantic Web standards and Linked Data principles.

• Became part of the Semantic Technology project, an FDA/PhUSE working group for Emerging Technologies, with 30+ representatives from FDA, CDISC, pharma companies, CROs and software vendors.

• First phase: Representing existing “container” standards (SDTM, CDASH, SEND, ADaM) in RDF.

Page 18: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Linked Clinical Data Standards

Human-readable documentation in PDFs, Excel files (and some in XML)

Machine-processable linked data structured as RDF triples (160,000+)

Serializations of RDF triples in Turtle and XML …

https://github.com/phuse-org/rdf.cdisc.org

Page 19: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Linked Clinical Data Standards

Human-readable documentation in PDFs, Excel files (and some in XML)

Import files: Annotated Excel files from CDISC with classes and properties from the schemas, ready to transform to RDF triples using an off-the-shelf tool (TopQuadrant Composer)

Meta Model Schema (mms): Based on the core ISO 11179 model (metadata for data elements and a few CDISC-specific classes and properties)

Machine-processable linked data structured as RDF triples (160,000+)

https://github.com/phuse-org/rdf.cdisc.org

Serializations of RDF triples in Turtle and XML …

Page 20: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Annotating existing standards

Import files: Annotated Excel files from CDISC with classes and properties from the schemas, ready to transform to RDF triples using an off-the-shelf tool (TopQuadrant Composer)

Meta Model Schema (mms): Based on the core ISO 11179 model (metadata for data elements and a few CDISC-specific classes and properties)

This turned out to be a good way to help people knowledgeable in CDISC but new to RDF schemas understand the process of “triplification”.
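
To make “triplification” concrete, here is a small sketch of the idea, not the project's actual import pipeline (which used an off-the-shelf tool). The sheet layout is an assumption: the second row annotates the first column with a class and every other column with a property. The `ex:` prefix and the property name `mms:dataElementLabel` are illustrative.

```python
import csv
import io

# Made-up sheet: row 1 = human headers, row 2 = annotation row naming the
# class (first column) and one property per remaining column.
SHEET = """\
Variable Name,Variable Label
mms:DataElement,mms:dataElementLabel
AETERM,Reported Term for the Adverse Event
AESEV,Severity/Intensity
"""


def triplify(sheet_text, base="ex:"):
    """Turn each data row into one rdf:type triple plus one triple per column."""
    rows = list(csv.reader(io.StringIO(sheet_text)))
    annotations, data = rows[1], rows[2:]
    subject_class, properties = annotations[0], annotations[1:]
    triples = []
    for row in data:
        subject = f"{base}{row[0]}"
        triples.append((subject, "rdf:type", subject_class))
        for prop, value in zip(properties, row[1:]):
            triples.append((subject, prop, value))
    return triples


triples = triplify(SHEET)
for s, p, o in triples:
    print(s, p, repr(o))
```

Reading the annotation row next to the familiar column headers is what lets a CDISC expert see which cell becomes which triple.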

Page 21: Pushing back, standards and standard organizations in a Semantic Web enabled world

CDISC and NCIT

Value sets are an issue

• Concept-based Terminologies / Code systems

• Code lists/Value sets/Term sets

• Data exchange (Tabulated data)

mms:PermissibleValue

mms:ValueDomain

mms:Dataset

mms:DataElement

mms:DataCollectionForm

Page 22: Pushing back, standards and standard organizations in a Semantic Web enabled world

Standards for Data Exchange

Cross standard review and mappings

Data Elements [SDTM, ADaM, CDASH] “haveSame” Value Domain (CT)
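
One way to picture the cross-standard review is a grouping of data elements by the value domain they point to. The sketch below assumes a flat list of (standard, data element, value domain) rows; the rows and the `CT:` identifiers are illustrative only and are not taken from the CDISC files.

```python
from collections import defaultdict

# Illustrative rows: (standard, data element, value domain).
ELEMENTS = [
    ("SDTM", "AESEV", "CT:Severity"),
    ("CDASH", "AESEV", "CT:Severity"),
    ("ADaM", "ASEV", "CT:Severity"),
    ("SDTM", "SEX", "CT:Sex"),
]


def shared_value_domains(elements):
    """Return value domains that are used by more than one standard."""
    by_domain = defaultdict(set)
    for standard, element, domain in elements:
        by_domain[domain].add((standard, element))
    return {
        domain: sorted(members)
        for domain, members in by_domain.items()
        if len({standard for standard, _ in members}) > 1
    }


shared = shared_value_domains(ELEMENTS)
for domain, members in shared.items():
    print(domain, "is shared by", members)
```

With the standards available as linked data, the same question becomes a query over the published triples rather than a manual comparison of documents.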

Page 23: Pushing back, standards and standard organizations in a Semantic Web enabled world

Provenance/Justification for Mappings

Example from EU project SALUS for Post Market Safety Studies

The example shows the hierarchy of cardiac disorders in both the MedDRA and SNOMED-CT concept schemes, expressed using the skos:broader property. Mappings between similar concepts in both concept schemes are stated using the skos:exactMatch property.

From: SALUS Harmonized Ontology for Post Market Safety Studies

Page 24: Pushing back, standards and standard organizations in a Semantic Web enabled world

Provenance/Justification for Mappings

Example from EU project SALUS for Post Market Safety Studies

The example shows the hierarchy of cardiac disorders in both the MedDRA and SNOMED-CT concept schemes, expressed using the skos:broader property. Mappings between similar concepts in both concept schemes are stated using the skos:exactMatch property.

From: SALUS Harmonized Ontology for Post Market Safety Studies

MedDRA:10028596 skos:exactMatch SNOMEDCT:22298006

Page 25: Pushing back, standards and standard organizations in a Semantic Web enabled world

Provenance/Justification for Mappings

Alternative: Mappings as LinkSets

The Dataset Descriptions for the Open Pharmacological Space is a specification for the metadata used to describe datasets, and the LinkSets that relate them.
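
As a rough illustration of a mapping set described as a LinkSet, the sketch below emits a Turtle description using plain VoID and Dublin Core terms. This is an assumption on my part: the Open Pharmacological Space specification may require more or different properties. Every value passed in (names, creator, date, triple count) is hypothetical.

```python
def linkset_turtle(name, source, target, predicate, triple_count, creator, created):
    """Build a Turtle description of a linkset between two datasets."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "@prefix ex: <http://example.org/> .",
        "",
        f"ex:{name} a void:Linkset ;",
        f"    void:subjectsTarget ex:{source} ;",
        f"    void:objectsTarget ex:{target} ;",
        f"    void:linkPredicate {predicate} ;",
        f"    void:triples {triple_count} ;",
        # Provenance of the mapping set as a whole (who made it, and when).
        f"    dcterms:creator ex:{creator} ;",
        f'    dcterms:created "{created}"^^xsd:date .',
    ]
    return "\n".join(lines)


ttl = linkset_turtle(
    name="meddra-snomedct-links",   # hypothetical linkset name
    source="meddra",
    target="snomedct",
    predicate="skos:exactMatch",
    triple_count=1,                 # placeholder count
    creator="example-mapping-team", # placeholder creator
    created="2013-12-10",           # placeholder date
)
print(ttl)
```

The point of the alternative is that provenance attaches to the set of links rather than being lost inside individual skos:exactMatch statements.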

Page 26: Pushing back, standards and standard organizations in a Semantic Web enabled world

Provenance/Justification for Mappings

Alternative: Mappings as Nanopublications

MedDRA:10028596 skos:exactMatch SNOMEDCT:22298006
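
A sketch of how the single mapping above might be wrapped as a nanopublication, modelled here as quads (subject, predicate, object, graph) in plain Python. The graph layout follows the usual head / assertion / provenance / publication info pattern; the identifier `ex:np1`, the attributed agent, and the date are hypothetical, and the choice of `prov:wasAttributedTo` and `dcterms:created` is an assumption, not something stated on the slide.

```python
def mapping_nanopub(np_id, assertion, attributed_to, created):
    """Wrap one mapping triple in head, assertion, provenance and pubinfo graphs."""
    head, a_graph, prov, info = (
        f"{np_id}#{part}" for part in ("Head", "assertion", "provenance", "pubinfo")
    )
    return [
        # Head graph: ties the three content graphs together.
        (np_id, "rdf:type", "np:Nanopublication", head),
        (np_id, "np:hasAssertion", a_graph, head),
        (np_id, "np:hasProvenance", prov, head),
        (np_id, "np:hasPublicationInfo", info, head),
        # Assertion graph: the mapping itself.
        (*assertion, a_graph),
        # Provenance graph: who stands behind the assertion.
        (a_graph, "prov:wasAttributedTo", attributed_to, prov),
        # Publication info graph: metadata about the nanopublication.
        (np_id, "dcterms:created", created, info),
    ]


quads = mapping_nanopub(
    np_id="ex:np1",                          # hypothetical identifier
    assertion=("MedDRA:10028596", "skos:exactMatch", "SNOMEDCT:22298006"),
    attributed_to="ex:example-mapping-team", # placeholder agent
    created="2013-12-10",                    # placeholder date
)
for quad in quads:
    print(*quad)
```

Compared with a LinkSet, this puts the justification on each individual mapping, so one disputed link can be traced or retracted without touching the rest.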

Page 27: Pushing back, standards and standard organizations in a Semantic Web enabled world

Summary

Encourage standard organisations to “Use Standards for Standards”

for sustainability and trustability.

Think if …

semanticsconnectsusall

Page 28: Pushing back, standards and standard organizations in a Semantic Web enabled world

Acknowledgements

AZ’s Semantic Web Community of Practice members: Tom Plasterer (lead), Jim Morris, Courtland Yockey, Sorana Popa, Rob Hernandez, Mike Westaway, Rajan Desai, Simon Rakov, Dana Crowley, Ian Dix, Johan Törnqvist

Collaborators and Advisors:

• Charlie Mead – IO Informatics

• Dean Allemang – Working Ontologist

• Frederik Malfait – IMOS consulting / Roche

• Phil Ashworth – TopQuadrant

Thank you! [email protected]