DESCRIPTION
With the growth of the Semantic Web as a medium for creating, consuming, mashing up and republishing data, our ability to trace any statement back to its origin is becoming ever more important. Several approaches have now been proposed to associate statements with provenance, with multiple applications in data publication, attribution and argumentation. Here, we describe the ovopub, a modular model for data publication that enables encapsulation, aggregation, integrity checking, and selective-source query answering. We describe the ovopub RDF specification, key design patterns, and their application to publishing and referring to data in the life sciences.
Paper: http://arxiv.org/abs/1305.6800
Presented at Bio-Ontologies 2013: https://sites.google.com/site/bioontologies/home
Dumontier::Bio-ontologies 2013:Ovopubs 1
Alison Callahan and Michel Dumontier, Carleton University
Ovopubs: Modular data publication with minimal provenance
Data publication
• Emerging interest in publishing data on the web
• Microdata formats (RDFa, schema.org) and formal knowledge representation languages (RDF/OWL)
• Efforts to capture credit/provenance of assertions
– PROV-O, OAG
– nanopublications (data/statements - Groth, Kuhn)
– microattributions (gene variation - Patrinos et al)
– micropublications (discourse - Clark et al)
Nanopublication
• A nanopublication claims to be the “smallest, unambiguous unit of thought”.
• A nanopublication is an RDF graph that links to two/three graphs:
– A graph containing one or more assertions
– A graph containing the provenance for the assertion(s)
– A graph providing information about the nanopublication
Diagram labels: assertion, provenance, publication
Problems: indirection between an assertion and its provenance; what if no provenance is provided?; a nanopub graph cannot fully contain other graphs; reasoning and ease of querying across nested graphs.
An ovopub is an object that contains and links to data and the ovopub’s provenance
Diagram labels: data, provenance
An assertion ovopub contains one or more connected statements
This type of ovopub is suited to capturing knowledge in the form of statements
An ovopub also links itself to its content
rdfs:member <uri>
This explicit reification enables transitive closures over graph structures
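A minimal sketch of how rdfs:member links could support a transitive closure, assuming triples are held as plain Python tuples rather than in an RDF store. The ex: identifiers and the members_closure helper are hypothetical illustrations and are not taken from the ovopub specification.

```python
RDFS_MEMBER = "rdfs:member"

# Hypothetical triples: a collection ovopub contains an assertion ovopub,
# which in turn links to a (stand-in) statement it contains.
triples = {
    ("ex:collection1", RDFS_MEMBER, "ex:ovopub1"),
    ("ex:ovopub1", RDFS_MEMBER, "ex:statement1"),
    ("ex:ovopub1", "dc:creator", "ex:alice"),
}

def members_closure(triples, root):
    """Return everything reachable from root by following rdfs:member links."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p == RDFS_MEMBER and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

print(sorted(members_closure(triples, "ex:collection1")))
# ['ex:ovopub1', 'ex:statement1']
```

Because every containment step is an explicit rdfs:member triple, nested content can be reached by repeatedly following the same property, with no special handling of nested graphs.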
An ovopub contains and links to its own provenance
• dc:creator <uri> (creator)
• dc:created xsd:dateTime (timestamp)
• dc:license <uri> (license)
• rdf:type sio:assertion-ovopub or sio:collection-ovopub (ovopub type)
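One way to check that an ovopub carries this minimal provenance, sketched under the assumption that triples are plain Python tuples. The property names follow the slide; the ex: identifiers, the timestamp value, and both helper functions are hypothetical.

```python
# Property and type names follow the slide; everything under ex: is made up.
REQUIRED_PROPERTIES = ("dc:creator", "dc:created", "dc:license", "rdf:type")
OVOPUB_TYPES = {"sio:assertion-ovopub", "sio:collection-ovopub"}

triples = {
    ("ex:ovopub1", "rdf:type", "sio:assertion-ovopub"),
    ("ex:ovopub1", "dc:creator", "ex:alice"),
    ("ex:ovopub1", "dc:created", "2013-07-20T09:00:00Z"),
    ("ex:ovopub1", "dc:license", "ex:some-license"),
    ("ex:ovopub2", "rdf:type", "sio:collection-ovopub"),
    ("ex:ovopub2", "dc:creator", "ex:bob"),
}

def missing_provenance(triples, ovopub):
    """List the required properties that the given ovopub does not state."""
    present = {p for s, p, o in triples if s == ovopub}
    return [prop for prop in REQUIRED_PROPERTIES if prop not in present]

def ovopub_types(triples, ovopub):
    """Return the recognised ovopub types asserted for the given ovopub."""
    return {o for s, p, o in triples
            if s == ovopub and p == "rdf:type" and o in OVOPUB_TYPES}

print(missing_provenance(triples, "ex:ovopub1"))  # []
print(missing_provenance(triples, "ex:ovopub2"))  # ['dc:created', 'dc:license']
```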
A collection ovopub contains one or more unconnected items
Item types:
- object
- assertion ovopub
- collection ovopub
This ovopub is good for:
- encapsulation and redistribution of selected content
- restriction of query execution / results
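A rough illustration of restricting a query to the members of one collection ovopub, again assuming triples are plain Python tuples. The ex: identifiers and the creators_within helper are hypothetical; a real deployment would more likely express this as a query over an RDF store.

```python
# Hypothetical data: only ex:ovopubA has been placed in the collection.
triples = {
    ("ex:collection1", "rdfs:member", "ex:ovopubA"),
    ("ex:ovopubA", "dc:creator", "ex:alice"),
    ("ex:ovopubB", "dc:creator", "ex:bob"),  # not a member of ex:collection1
}

def creators_within(triples, collection):
    """Return creators of the ovopubs that are direct members of the collection."""
    members = {o for s, p, o in triples
               if s == collection and p == "rdfs:member"}
    return sorted(o for s, p, o in triples
                  if s in members and p == "dc:creator")

print(creators_within(triples, "ex:collection1"))  # ['ex:alice']
```

The collection acts as a filter: ex:ovopubB exists in the data but is ignored because it was not selected into the collection.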
iRefIndex: Ovopub Case Study for Datasets, Records, Assertions
Future work
• Actively develop the nanopublication as a community standard for provenance-based data publication
– Assess the value of directly linking assertion & provenance graphs
– Generate (revised) nanopublications in Bio2RDF
• Promote nanopublication-based design patterns for:
– direct/indirect data/discourse assertions
– aggregation semantics
• Use of nanopublications for scientific research
– Evidence gathering (HyQue)
Michel Dumontier
Publications: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier