Dumontier::Bio-ontologies 2013:Ovopubs 1
Alison Callahan and Michel Dumontier
Carleton University
Ovopubs: Modular data publication with minimal provenance
Data publication
• Emerging interest in publishing data on the web
• microdata formats (RDFa, schema.org) and formal knowledge representation languages (RDF/OWL)
• Efforts to capture credit/provenance of assertions
  – PROV-O, OAG
  – nanopublications (data/statements - Groth, Kuth)
  – microattributions (gene variation - Patrinos et al)
  – micropublications (discourse - Clark et al)
Nanopublication
• A nanopublication claims to be the “smallest, unambiguous unit of thought”.
• A nanopublication is an RDF graph that links to two/three graphs:
  – A graph containing one or more assertions
  – A graph containing the provenance for the assertion(s)
  – A graph providing information about the nanopublication
[Figure labels: assertion, provenance, publication]
Problems:
• indirection between an assertion and its provenance
• what if no provenance is provided?
• the nanopub graph cannot fully contain other graphs
• reasoning and ease of querying across nested graphs
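The indirection named above can be made concrete with a minimal sketch. It assumes graphs are modelled as plain Python sets of (subject, predicate, object) tuples; every `ex:` identifier and the `provenance_for` helper are hypothetical, and the `np:` predicate names are assumed for illustration rather than taken from the slides.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical nanopublication: a head graph that points at three other graphs.
graphs: Dict[str, Set[Triple]] = {
    "ex:np1#head": {
        ("ex:np1", "np:hasAssertion", "ex:np1#assertion"),
        ("ex:np1", "np:hasProvenance", "ex:np1#provenance"),
        ("ex:np1", "np:hasPublicationInfo", "ex:np1#pubinfo"),
    },
    "ex:np1#assertion": {("ex:geneA", "ex:interactsWith", "ex:geneB")},
    "ex:np1#provenance": {
        ("ex:np1#assertion", "prov:wasDerivedFrom", "ex:dataset1"),
    },
    "ex:np1#pubinfo": {("ex:np1", "dc:creator", "ex:alice")},
}


def provenance_for(
    statement: Triple, graphs: Dict[str, Set[Triple]]
) -> Optional[Set[Triple]]:
    """Return the provenance triples for a statement, or None if unreachable."""
    # Hop 1: find the named graph that holds the statement.
    for graph_name, triples in graphs.items():
        if statement not in triples:
            continue
        # Hop 2: find a nanopublication that declares this graph as its assertion.
        for head in graphs.values():
            for nanopub, pred, obj in head:
                if pred != "np:hasAssertion" or obj != graph_name:
                    continue
                # Hop 3: follow the same nanopublication to its provenance graph.
                for s, p, prov_graph in head:
                    if s == nanopub and p == "np:hasProvenance":
                        return graphs.get(prov_graph, set())
    return None


statement = ("ex:geneA", "ex:interactsWith", "ex:geneB")
print(provenance_for(statement, graphs))
```

Three lookups are needed before any provenance is reached, and a statement that is not wrapped in a nanopublication yields nothing at all.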
An ovopub is an object that contains and links to data and the ovopub’s provenance.
[Figure labels: data, provenance]
An assertion ovopub contains one or more connected statements.
This ovopub is good for capturing knowledge in the form of statements.
An ovopub also links itself to its content
rdfs:member <uri>
This explicit reification enables transitive closures over graph structures
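A minimal sketch of what a transitive closure over rdfs:member links could look like, assuming triples are held as plain Python tuples. All `ex:` identifiers and the `all_members` helper are hypothetical and only illustrate the idea.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: a collection holding two ovopubs, each holding one statement.
triples: List[Triple] = [
    ("ex:collection1", "rdfs:member", "ex:ovopub1"),
    ("ex:collection1", "rdfs:member", "ex:ovopub2"),
    ("ex:ovopub1", "rdfs:member", "ex:statement1"),
    ("ex:ovopub2", "rdfs:member", "ex:statement2"),
    ("ex:ovopub1", "dc:creator", "ex:alice"),  # not a membership link, ignored
]


def all_members(root: str, triples: List[Triple]) -> Set[str]:
    """Collect everything reachable from root by following rdfs:member links."""
    found: Set[str] = set()
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for subject, predicate, obj in triples:
            # The 'found' check also guards against membership cycles.
            if subject == node and predicate == "rdfs:member" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


print(sorted(all_members("ex:collection1", triples)))
```

Because membership is stated as ordinary triples, the same traversal works at any nesting depth without special handling for graph boundaries.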
An ovopub contains and links to its own provenance
• dc:creator <uri> (creator)
• dc:created xsd:dateTime (timestamp)
• dc:license <uri> (license)
• rdf:type sio:assertion-ovopub or sio:collection-ovopub (ovopub type)
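As an illustration, the four properties listed above could be checked with a small validator. This is a sketch under assumptions: treating all four as required is one reading of "minimal provenance", and the `ex:` identifiers, the timestamp value, and the `missing_provenance` helper are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Assumption: all four properties from the slide are treated as required.
REQUIRED = ("dc:creator", "dc:created", "dc:license", "rdf:type")
OVOPUB_TYPES = {"sio:assertion-ovopub", "sio:collection-ovopub"}

# Hypothetical ovopub carrying its own provenance as direct properties.
complete: List[Triple] = [
    ("ex:ovopub1", "rdf:type", "sio:assertion-ovopub"),
    ("ex:ovopub1", "dc:creator", "ex:alice"),
    ("ex:ovopub1", "dc:created", '"2013-01-01T00:00:00Z"^^xsd:dateTime'),
    ("ex:ovopub1", "dc:license", "ex:some-license"),
    ("ex:ovopub1", "rdfs:member", "ex:statement1"),
]


def missing_provenance(ovopub: str, triples: List[Triple]) -> List[str]:
    """List the required provenance properties the ovopub lacks."""
    present = {p for s, p, o in triples if s == ovopub}
    missing = [p for p in REQUIRED if p not in present]
    # An rdf:type is only accepted if it names one of the two ovopub types.
    types = {o for s, p, o in triples if s == ovopub and p == "rdf:type"}
    if "rdf:type" not in missing and not (types & OVOPUB_TYPES):
        missing.append("rdf:type")
    return missing


print(missing_provenance("ex:ovopub1", complete))
```

Since the provenance hangs directly off the ovopub, the check is a single pass over its own properties, with no second graph to locate.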
A collection ovopub contains one or more unconnected items
Item types:
– object
– assertion ovopub
– collection ovopub
This ovopub is good for
– encapsulation and redistribution of selected content
– restriction of query execution / results
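One way to picture the restriction of query results is to evaluate a triple pattern only over the items that a collection lists. The sketch below assumes an in-memory model; the `ex:` identifiers, the dictionaries, and the `query_collection` helper are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]  # None is a wildcard

# Hypothetical assertion ovopubs, each holding its statements.
assertion_ovopubs: Dict[str, Set[Triple]] = {
    "ex:ovopub1": {("ex:geneA", "ex:interactsWith", "ex:geneB")},
    "ex:ovopub2": {("ex:geneA", "ex:interactsWith", "ex:geneC")},
    "ex:ovopub3": {("ex:geneD", "ex:interactsWith", "ex:geneE")},
}

# Hypothetical collection ovopub selecting two of the three items.
collections: Dict[str, Set[str]] = {
    "ex:collection1": {"ex:ovopub1", "ex:ovopub3"},
}


def matches(triple: Triple, pattern: Pattern) -> bool:
    """True if every non-wildcard position of the pattern equals the triple's."""
    return all(want is None or want == got for got, want in zip(triple, pattern))


def query_collection(collection: str, pattern: Pattern) -> Set[Triple]:
    """Evaluate the pattern only over items that belong to the collection."""
    results: Set[Triple] = set()
    for item in collections.get(collection, set()):
        for triple in assertion_ovopubs.get(item, set()):
            if matches(triple, pattern):
                results.add(triple)
    return results


print(query_collection("ex:collection1", ("ex:geneA", None, None)))
```

The statement in ex:ovopub2 also matches the pattern but is left out, because that ovopub is not an item of the collection.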
iRefIndex: Ovopub Case Study for Datasets, Records, Assertions
Future work
• Actively develop the nanopublication as a community standard for provenance-based data publication
  – Assess the value of directly linking assertion & provenance graphs
  – Generate (revised) nanopublications in Bio2RDF
• Promote nanopublication-based design patterns for:
  – direct/indirect data/discourse assertions
  – aggregation semantics
• Use of nanopublications for scientific research
  – Evidence gathering (HyQue)
Michel [email protected]
Publications: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier