Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk

Page 1: Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk


Page 2

Outline

• Background about data.gov.uk
• The use cases
  – XML serialization
  – Data transformation on the fly
  – Complex and nested processes

Page 3

data.gov.uk

• Linking UK government data
• Aims:
  – Provide a set of best practices for government agencies
  – Provide the minimum set of tooling and specification to facilitate the publication of data
  – Encourage “responsible” data publishing

Page 4

XML -> RDF

[Diagram: an XSLT Processor with an input and an output, the RDF File. Other boxes: XSLT Parameter Binding, XSLT Stylesheet, XSLT Template. Annotation: who, when, which version, how.]
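The "who, when, which version, how" questions on this slide can be sketched as a handful of OPM-style assertions about one transformation run. The following is only an illustration under stated assumptions: the `ProvenanceGraph` class, the `record_xslt_run` helper, the file names, the agent and the version string are all hypothetical, and only the edge names (`used`, `wasGeneratedBy`, `wasDerivedFrom`, `wasControlledBy`) come from the OPM core vocabulary.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceGraph:
    """A tiny in-memory store of (subject, edge, object) assertions (hypothetical helper)."""
    edges: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)

    def add_edge(self, subject, edge, obj):
        self.edges.append((subject, edge, obj))

    def annotate(self, node, **values):
        self.annotations.setdefault(node, {}).update(values)

    def objects(self, subject, edge):
        return [o for s, e, o in self.edges if s == subject and e == edge]


def record_xslt_run(graph, run_id, xml_input, stylesheet, rdf_output,
                    agent, processor_version, started_at):
    """Assert what one XSLT run used, produced, and who controlled it."""
    # "How": the process used the input document and the stylesheet.
    graph.add_edge(run_id, "used", xml_input)
    graph.add_edge(run_id, "used", stylesheet)
    # The RDF file was generated by the run and derived from the XML input.
    graph.add_edge(rdf_output, "wasGeneratedBy", run_id)
    graph.add_edge(rdf_output, "wasDerivedFrom", xml_input)
    # "Who": the agent controlling the process.
    graph.add_edge(run_id, "wasControlledBy", agent)
    # "When" and "which version" are kept as plain annotations on the process.
    graph.annotate(run_id,
                   startedAt=started_at.isoformat(),
                   processorVersion=processor_version)


# All identifiers below are made up for illustration.
graph = ProvenanceGraph()
record_xslt_run(
    graph,
    run_id="xslt-run-1",
    xml_input="input.xml",
    stylesheet="to-rdf.xsl",
    rdf_output="output.rdf",
    agent="publisher@example.org",
    processor_version="hypothetical-1.0",
    started_at=datetime(2010, 1, 1, tzinfo=timezone.utc),
)
```

Calling `graph.objects("output.rdf", "wasGeneratedBy")` then answers "how was this RDF file produced", and the annotations on the run answer "when" and "which version".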

Page 5

[Diagram: an XSLT Processor with an input and an output, the RDF File. Other boxes: XSLT Parameter Binding, XSLT Stylesheet, XSLT Template. Annotations: "Downloaded from; Unzipped from, etc." and "Made accessible", alongside who, when, which version, how.]
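The "downloaded from" and "unzipped from" annotations suggest a chain of derivations leading back from the published RDF file to its original source. A minimal sketch of walking such a chain follows; the `lineage` function, the dictionary layout, and every file name and URL in it are assumptions made for illustration, not part of the tutorial.

```python
def lineage(derived_from, artifact):
    """Follow derivation links back from `artifact` to its earliest known source.

    `derived_from` maps an artifact to a (source, how) pair, where `how`
    is a free-text label such as "unzipped from".
    """
    steps = []
    seen = {artifact}
    while artifact in derived_from:
        source, how = derived_from[artifact]
        if source in seen:
            # Guard against malformed records that loop back on themselves.
            raise ValueError(f"cycle in derivation chain at {source!r}")
        steps.append((artifact, how, source))
        seen.add(source)
        artifact = source
    return steps


# Hypothetical records: each artifact points at what it was derived from.
derived_from = {
    "output.rdf": ("input.xml", "transformed from"),
    "input.xml": ("archive.zip", "unzipped from"),
    "archive.zip": ("http://example.org/archive.zip", "downloaded from"),
}

steps = lineage(derived_from, "output.rdf")
for artifact, how, source in steps:
    print(f"{artifact} was {how} {source}")
```

Keeping the "how" label on each link is a design choice for this sketch; a fuller model would represent each download or unzip step as its own process with its own who, when and version.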

Page 6

On-the-fly Transformation

[Diagram: a data transformation wrapper and the URL http://mytransportatio.db/j10. Annotation: who, when, which version, how.]
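When data is transformed on the fly, there is no stored output file to annotate afterwards, so one option is for the wrapper itself to emit a provenance record with every response. The sketch below assumes a Python decorator as the wrapper; `with_provenance`, the record fields, the example transformation and the example URI are all hypothetical.

```python
import functools
from datetime import datetime, timezone


def with_provenance(agent, version):
    """Wrap a transformation so each call also returns a provenance record."""
    def decorator(transform):
        @functools.wraps(transform)
        def wrapper(source_uri, data):
            result = transform(data)
            record = {
                "source": source_uri,                  # what was transformed
                "process": transform.__name__,         # how
                "agent": agent,                        # who
                "version": version,                    # which version
                "generatedAt": datetime.now(timezone.utc).isoformat(),  # when
            }
            return result, record
        return wrapper
    return decorator


@with_provenance(agent="wrapper-service", version="0.1")
def to_upper(data):
    # Stand-in for a real transformation such as a database-to-RDF mapping.
    return data.upper()


result, record = to_upper("http://example.org/data/j10", "junction 10")
```

Each call produces a fresh record, so two requests for the same URI at different times yield distinguishable provenance.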

Page 7

Complex Data Creation Pipeline

[Diagram: GATE Pipeline, GateXMLRegressionTransformation, GateXMLRdfaTransformation, RdfaRdfXmlTransformation.]

Courtesy of Paul Appleby from TSO (Data Enrichment Service)

Page 8

Complex Data Creation Pipeline

[Diagram: GATE Pipeline, GateXMLRegressionTransformation, GateXMLRdfaTransformation, RdfaRdfXmlTransformation.]

Components shown:

• Document Reset PR
• ANNIE English Tokeniser
• ANNIE English Splitter
• ANNIE POS Tagger
• Data.gov.uk Morphological Analyzer
• Data.gov.uk Flexible Roof Gazetteer
• Data.gov.uk Generic Gazetteer
• GATE Noun Phrase Chunker
• Data.gov.uk Generic Transducer
• TSO Coreference

Courtesy of Paul Appleby from TSO (Data Enrichment Service)

Page 9

[Diagram: provenance recorded at two levels.]

• Level 1: Provenance of execution at higher level
• Level 0: Provenance of execution at detailed level

Nodes: Artifacts; Services used by executions; A data collection.

Edges: wasGeneratedBy, wasDerivedFrom, wasTriggeredBy, hasParentProcess, iterationOfProcess, followed, accessedService.
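One way to read the two levels is that detailed (Level 0) processes point at their higher-level (Level 1) process through hasParentProcess, so detailed provenance can be rolled up into a coarser view. The sketch below rests on that reading, which is an assumption: the `roll_up` function, the artifact names, and the choice of which components sit under the GATE Pipeline are hypothetical.

```python
def roll_up(was_generated_by, has_parent_process):
    """Rewrite artifact -> detailed process links as artifact -> top-level process links."""
    def top_level(process):
        seen = set()
        # Climb parent links until a process with no recorded parent is reached.
        while process in has_parent_process:
            if process in seen:
                raise ValueError(f"cycle in parent links at {process!r}")
            seen.add(process)
            process = has_parent_process[process]
        return process

    return {artifact: top_level(process)
            for artifact, process in was_generated_by.items()}


# Level 0 processes and their assumed Level 1 parent.
has_parent_process = {
    "ANNIE English Tokeniser": "GATE Pipeline",
    "ANNIE POS Tagger": "GATE Pipeline",
}

# Hypothetical artifacts and the processes that generated them.
was_generated_by = {
    "tokens": "ANNIE English Tokeniser",
    "pos-tags": "ANNIE POS Tagger",
    "document.rdfa": "GateXMLRdfaTransformation",
}

level_1_view = roll_up(was_generated_by, has_parent_process)
```

In the rolled-up view both intermediate artifacts are attributed to the GATE Pipeline, while `document.rdfa` stays attributed to its own process because that process has no recorded parent.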

Page 10

Non-digital Data Objects

• Organizations
  – Organizational structure changes over time
  – Origin organization, resulting organization
• Boundary
• Legislation

An organization ontology: http://www.epimorphics.com/public/vocabulary/org.html

Page 11

The Challenges

• Data of different representations, physical forms, and granularity
• No tooling support
• Provenance across different types of systems
  – Identification
  – Different terminologies

Page 12

The Gaps

• A vocabulary able to describe the provenance of all types of data, from different systems
• A vocabulary that still provides enough terms to describe provenance accurately

Page 13

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License

(http://creativecommons.org/licenses/by-sa/3.0/)