Transcript
Page 1: Ovopub: Modular data publication with minimal provenance

Dumontier::Bio-ontologies 2013:Ovopubs 1

Alison Callahan and Michel Dumontier, Carleton University

Ovopubs: Modular data publication with minimal provenance

Page 2: Ovopub: Modular data publication with minimal provenance

Data publication

• Emerging interest in publishing data on the web
• Microdata formats (RDFa, schema.org) and formal knowledge representation languages (RDF/OWL)
• Efforts to capture credit/provenance of assertions
– PROV-O, OAG
– nanopublications (data/statements - Groth, Kuth)
– microattributions (gene variation - Patrinos et al)
– micropublications (discourse - Clark et al)

Page 3: Ovopub: Modular data publication with minimal provenance

Nanopublication
• A nanopublication claims to be the “smallest, unambiguous unit of thought”.
• A nanopublication is an RDF graph that links to two or three graphs:
– A graph containing one or more assertions
– A graph containing the provenance for the assertion(s)
– A graph providing information about the nanopublication

[Figure labels: assertions; assertion, provenance, publication]

Problems: indirection between an assertion and its provenance; what if no provenance is provided? The nanopub graph cannot fully contain other graphs; reasoning and ease of querying across nested graphs.
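
The indirection problem can be illustrated with a minimal sketch in plain Python. It assumes a simplified model in which each named graph is a set of triples. All `ex:` URIs and the `np:hasAssertion` / `np:hasProvenance` / `np:hasPublicationInfo` predicate names are placeholders chosen for illustration, not taken from the slides.

```python
# Hypothetical nanopublication: a head graph pointing at three other graphs.
# Every URI and predicate name here is an illustrative assumption.
NANOPUB = {
    "ex:np1#head": {
        ("ex:np1", "np:hasAssertion", "ex:np1#assertion"),
        ("ex:np1", "np:hasProvenance", "ex:np1#provenance"),
        ("ex:np1", "np:hasPublicationInfo", "ex:np1#pubinfo"),
    },
    "ex:np1#assertion": {("ex:geneA", "ex:interactsWith", "ex:geneB")},
    "ex:np1#provenance": {
        ("ex:np1#assertion", "prov:wasDerivedFrom", "ex:dataset1"),
    },
    "ex:np1#pubinfo": {("ex:np1", "dc:creator", "ex:alice")},
}


def provenance_for(triple, graphs):
    """Return the provenance triples for an assertion triple, or an empty set."""
    # Hop 1: find the graph that holds the triple.
    for graph_name, triples in graphs.items():
        if triple not in triples:
            continue
        # Hop 2: find the nanopublication that declares this graph as its assertion.
        for head in graphs.values():
            for np_uri, pred, obj in head:
                if pred == "np:hasAssertion" and obj == graph_name:
                    # Hop 3: follow the same nanopublication to its provenance graph.
                    for s, p, prov_graph in head:
                        if s == np_uri and p == "np:hasProvenance":
                            return graphs.get(prov_graph, set())
    return set()
```

Getting from a single statement to its provenance takes three hops across separate graphs, and the lookup silently returns nothing when no provenance graph exists.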

Page 4: Ovopub: Modular data publication with minimal provenance

an Ovopub is an object that contains and links to data and the ovopub’s provenance

data

provenance

Page 5: Ovopub: Modular data publication with minimal provenance

an assertion ovopub contains one or more connected statements

This ovopub is good for capturing knowledge in the form of statements

Page 6: Ovopub: Modular data publication with minimal provenance

An ovopub also links itself to its content

rdfs:member <uri>

This explicit reification enables transitive closures over graph structures
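
A minimal sketch of what a transitive closure over `rdfs:member` links could look like, using plain Python tuples as triples. The `ex:` URIs are hypothetical, and `ex:statement1` stands in for whatever URI identifies an ovopub's content.

```python
from collections import deque

# Hypothetical data: each ovopub points at its content with rdfs:member.
TRIPLES = {
    ("ex:collection1", "rdfs:member", "ex:ovopub1"),
    ("ex:collection1", "rdfs:member", "ex:ovopub2"),
    ("ex:ovopub1", "rdfs:member", "ex:statement1"),
    ("ex:ovopub2", "rdfs:member", "ex:statement2"),
    ("ex:ovopub3", "rdfs:member", "ex:statement3"),
}


def members_closure(root, triples):
    """Return everything reachable from root by following rdfs:member links."""
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            # Follow only membership links, and visit each member once.
            if s == node and p == "rdfs:member" and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen
```

Because membership is stated as ordinary triples, the closure is a plain graph traversal and needs no special handling of nested graphs. Here `members_closure("ex:collection1", TRIPLES)` reaches both ovopubs and both of their statements, but not `ex:ovopub3` or its content.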

Page 7: Ovopub: Modular data publication with minimal provenance

An ovopub contains and links to its own provenance

• dc:creator <uri> (creator)
• dc:created xsd:dateTime (timestamp)
• dc:license <uri> (license)
• rdf:type sio:assertion-ovopub or sio:collection-ovopub (ovopub type)
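
The four provenance properties can be sketched as triples attached directly to the ovopub's own URI. This is an illustration in plain Python, not the authors' implementation. The function names and the string encoding of the timestamp literal are assumptions, and `xsd:dateTime` is the standard XML Schema spelling of the datatype.

```python
from datetime import datetime, timezone

# The two ovopub types named on the slide.
OVOPUB_TYPES = {"sio:assertion-ovopub", "sio:collection-ovopub"}
REQUIRED = ("rdf:type", "dc:creator", "dc:created", "dc:license")


def provenance_triples(ovopub_uri, creator_uri, license_uri, ovopub_type, created=None):
    """Build the minimal provenance triples for one ovopub (hypothetical helper)."""
    if ovopub_type not in OVOPUB_TYPES:
        raise ValueError(f"unknown ovopub type: {ovopub_type}")
    if created is None:
        created = datetime.now(timezone.utc)
    # Encode the timestamp as a typed literal in Turtle-like notation.
    timestamp = f'"{created.isoformat()}"^^xsd:dateTime'
    return [
        (ovopub_uri, "rdf:type", ovopub_type),
        (ovopub_uri, "dc:creator", creator_uri),
        (ovopub_uri, "dc:created", timestamp),
        (ovopub_uri, "dc:license", license_uri),
    ]


def has_minimal_provenance(ovopub_uri, triples):
    """Check that the ovopub carries all four provenance properties."""
    present = {p for s, p, _ in triples if s == ovopub_uri}
    return all(pred in present for pred in REQUIRED)
```

Since the subject of every triple is the ovopub itself, a consumer can check for provenance without following a link to a second graph.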

Page 8: Ovopub: Modular data publication with minimal provenance

a collection ovopub contains one or more unconnected items

Item types:
- object
- assertion ovopub
- collection ovopub

This ovopub is good for
- encapsulation and redistribution of selected content
- restriction of query execution / results
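
One way to picture restricting a query to a collection ovopub, as a minimal sketch in plain Python. It assumes the collection lists its items with `rdfs:member`. The `ex:` URIs, the `ex:label` predicate and the `query_within` helper are hypothetical.

```python
# Hypothetical data: a collection with two items, plus one item outside it.
TRIPLES = {
    ("ex:collection1", "rdf:type", "sio:collection-ovopub"),
    ("ex:collection1", "rdfs:member", "ex:protein1"),
    ("ex:collection1", "rdfs:member", "ex:protein2"),
    ("ex:protein1", "ex:label", "P1"),
    ("ex:protein2", "ex:label", "P2"),
    ("ex:protein3", "ex:label", "P3"),
}


def query_within(collection_uri, triples, predicate):
    """Return triples with the given predicate whose subject is a direct member."""
    # Step 1: collect the items the collection explicitly lists.
    members = {
        o for s, p, o in triples
        if s == collection_uri and p == "rdfs:member"
    }
    # Step 2: keep only matching triples about those items.
    return {t for t in triples if t[0] in members and t[1] == predicate}
```

Here `query_within("ex:collection1", TRIPLES, "ex:label")` returns the labels of `ex:protein1` and `ex:protein2` only. `ex:protein3` is excluded because the collection does not list it.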

Page 9: Ovopub: Modular data publication with minimal provenance

iRefIndex: Ovopub Case Study for Datasets, Records, Assertions

Page 10: Ovopub: Modular data publication with minimal provenance

Future work
• Actively develop the nanopublication as a community standard for provenance-based data publication
– Assess the value of directly linking assertion & provenance graphs
– Generate (revised) nanopublications in Bio2RDF
• Promote nanopublication-based design patterns for:
– direct/indirect data/discourse assertions
– aggregation semantics
• Use of nanopublications for scientific research
– Evidence gathering (HyQue)

Page 11: Ovopub: Modular data publication with minimal provenance

Michel Dumontier [email protected]

Publications: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier

