A Semantic Web Approach for the Third Provenance Challenge

Tetherless World Constellation @ Rensselaer Polytechnic Institute

James Michaelis, Li Ding, Rui Huang, Zhenning Shangguan, Deborah L. McGuinness

07/04/22


Page 1: A Semantic Web Approach for the Third Provenance Challenge

A Semantic Web Approach for the Third Provenance Challenge

Tetherless World Constellation @ Rensselaer Polytechnic Institute

James Michaelis, Li Ding, Rui Huang, Zhenning Shangguan, Deborah L. McGuinness

04/21/23 1

Page 2: A Semantic Web Approach for the Third Provenance Challenge

Introduction

• Our approach to the Third Provenance Challenge (called TetherlessPC3) is designed to leverage Semantic Web technologies

• It supports two capabilities that are useful for answering the provided queries:

  • Declarative inference – SPARQL + OWL syntax

  • Augmenting provenance data derived from the workflow execution with supplementary information – SPARQL


Page 3: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

1. Provenance Generator

2. Query Front-End

3. Import/Export Component

Page 4: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram. Labels recovered from the diagram, grouped by component:]

1. Provenance Generator: Run TW’s workflow code; Trace (OWL); PC3OPM (OWL)

2. Query Front-End: Query (English); Translation (English-SPARQL); Query (SPARQL); Run Query (Pellet/Jena); Results (Text)

3. Import/Export Component: Run other team’s workflow code; Trace (OPM’); Normalization (OPM’-OPM); Trace (OPM); Translation (OPM-PC3OPM); Translation (PC3OPM-PML); Trace (PML)

Page 5: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram, repeated from Page 4]

The Provenance Generator produces provenance traces in Web Ontology Language (OWL) format using Jena, a Java-based Semantic Web framework

These traces are structured according to the PC3OPM ontology at http://www.cs.rpi.edu/~michaj6/Provenance/PC3OPM.owl

PC3OPM is designed to be compatible with the OPM Specification v1.01

Page 6: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram, repeated from Page 4]

To obtain the provenance, a workflow execution service is used

This service is designed to run a modified version of the workflow emulation code provided by Yogesh Simmhan (Microsoft Research)

The modified version contains injected code (in the section that executes the high-level workflow) to record provenance information based on PC3OPM

Page 7: A Semantic Web Approach for the Third Provenance Challenge

Three properties of PC3OPM

• Provide direct mappings to OPM concepts

  • Example: PC3OPM:Artifact maps to the OPM concept “Artifact”

• Reification of OPM relations

  • Example: For the relationship (Process1, WasTriggeredBy, Process2), declare an instance of the class PC3OPM:WasTriggeredBy.

• Extend the definitions in OPM through new concepts

  • Domain dependent: some terminology specific to the Third Provenance Challenge workflow

    • Example: CSVFileEntry (subclass of OPM Artifact)

  • Domain independent: terminology from the Proof Markup Language (PML)

    • We added a new concept “Function” based on pmlp:inferenceRule, where an OPM process is an execution of a “Function”
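The reification pattern described on this slide can be sketched in a few lines of Python. This is only an illustration and not the authors' Jena code: triples are plain tuples rather than a real RDF store, and the `wtbSource`/`wtbTarget` property names and the `PC3:wtb_0` identifier are hypothetical, modelled on the `wgbSource`/`wgbTarget` names that appear in the deck's SPARQL queries.

```python
# Illustrative sketch only: triples are plain (subject, predicate, object) tuples.
# The wtbSource/wtbTarget property names are assumptions, not taken from PC3OPM.

def reify_was_triggered_by(edge_id, effect_process, cause_process):
    """Return the triples that describe one reified WasTriggeredBy edge."""
    return [
        (edge_id, "rdf:type", "PC3OPM:WasTriggeredBy"),
        (edge_id, "PC3OPM:wtbSource", effect_process),  # the triggered process
        (edge_id, "PC3OPM:wtbTarget", cause_process),   # the triggering process
    ]

# (Process1, WasTriggeredBy, Process2) becomes an edge instance with two links.
triples = reify_was_triggered_by("PC3:wtb_0", "PC3:Process1", "PC3:Process2")
for triple in triples:
    print(triple)
```

Because the edge is itself a resource, further statements can be attached to it, which a single direct triple between the two processes would not allow.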


Page 8: A Semantic Web Approach for the Third Provenance Challenge

Proof Markup Language (PML)

WHAT IS IT?

• A provenance interlingua designed for representing and sharing explanations generated by various intelligent systems

• Originally designed to explain the activity of theorem proof generators

• Part of the Inference Web framework (which includes tools for browsing and validating PML)

THREE PARTS

• Justification: Provides structure for describing how a conclusion was derived

• Provenance: Metadata on information referenced in Justification

• Trust: Metadata on trust for information referenced in Justification

Page 9: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram, repeated from Page 4]

What we have done

1. Review the given English-based queries and form the corresponding SPARQL queries

2. Update the PC3OPM ontology to assist with (1) and regenerate the provenance trace

3. Run the queries and get back results

Page 10: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram, repeated from Page 4]

Technologies used

• SPARQL – an RDF query language

• Pellet – an open source OWL reasoner

Page 11: A Semantic Web Approach for the Third Provenance Challenge

Query Answering Example

• Provenance Challenge core question 3:

  • “Which operation executions were strictly necessary for the Image table to contain a particular (non-computed) value?”

• Our interpretation:

  • Find the process X which generated the Image table

  • Look for the processes XT that (directly or indirectly) triggered X

  • Return X and XT as query results

• Handling this query:

  • Rather than write a recursive program, we use OWL-based transitive properties in the answer
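To make the interpretation concrete, the following Python sketch computes by hand what a transitive property lets a reasoner infer. It is a procedural stand-in for illustration only, not the approach used in TetherlessPC3, which relies on OWL inference instead. The process names and the `triggered_by` dictionary are invented for the example.

```python
from collections import deque

def all_triggers(process, triggered_by):
    """Return every process that directly or indirectly triggered `process`.

    `triggered_by` maps a process to the processes that directly triggered it.
    """
    seen = set()
    queue = deque([process])
    while queue:
        current = queue.popleft()
        for cause in triggered_by.get(current, ()):
            if cause not in seen:
                seen.add(cause)
                queue.append(cause)
    return seen

# Hypothetical direct wasTriggeredBy edges (process names are invented).
triggered_by = {
    "ProcessC": ["ProcessB"],
    "ProcessB": ["ProcessA"],
}

x = "ProcessC"                      # process X that generated the table
xt = all_triggers(x, triggered_by)  # processes XT that triggered X
print(sorted({x} | xt))
```

Declaring the property transitive in the ontology moves this traversal into the reasoner, so the query itself stays a flat pattern match.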


Page 12: A Semantic Web Approach for the Third Provenance Challenge

Enhancing Provenance Trace

• To keep our provenance trace simple and concise, we do not include transitive properties in it, since most of the queries do not need them

• When they are needed, we create the additional RDF data through a SPARQL CONSTRUCT query


Page 13: A Semantic Web Approach for the Third Provenance Challenge

SPARQL SELECT Query

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX PC3: <http://www.cs.rpi.edu/~michaj6/provenance/OurTrace.owl#>
PREFIX PC3OPM: <http://www.cs.rpi.edu/~michaj6/provenance/PC3OPM.owl#>

SELECT ?fxn1 ?fxn2
FROM <http://www.cs.rpi.edu/~michaj6/provenance/PC3OPM.owl#>
FROM <http://www.cs.rpi.edu/~michaj6/provenance/OurTrace.owl#>
FROM <http://onto.rpi.edu/sw4j/sparql?queryURL=http://tw.rpi.edu/proj/portal.wiki/images/3/36/MakeMoreTriples.sparql>
WHERE {
  # ?wgb is a reified wasGeneratedBy edge: its source is the generated
  # artifact and its target is the process that generated it.
  ?wgb PC3OPM:wgbSource PC3:provVarDbEntryP2ImageMeta_0 .
  ?wgb PC3OPM:wgbTarget ?fxn1 .
  # Optionally, also return any process that triggered ?fxn1.
  OPTIONAL { ?fxn1 PC3OPM:opWasTriggeredBy ?fxn2 . }
}


Page 14: A Semantic Web Approach for the Third Provenance Challenge

SPARQL CONSTRUCT Query

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX PC3: <http://www.cs.rpi.edu/~michaj6/provenance/PC3.owl#>
PREFIX PC3OPM: <http://www.cs.rpi.edu/~michaj6/provenance/PC3OPM.owl#>

# Emit a direct opWasTriggeredBy link between two processes.
CONSTRUCT { ?FXN PC3OPM:opWasTriggeredBy ?FXN2 }
FROM <http://www.cs.rpi.edu/~michaj6/provenance/PC3.owl>
FROM <http://www.cs.rpi.edu/~michaj6/provenance/PC3OPM.owl>
WHERE {
  # ?FXN used the artifact ?VAR ...
  ?USD PC3OPM:usdSource ?FXN .
  ?USD PC3OPM:usdTarget ?VAR .
  # ... and ?VAR was generated by ?FXN2.
  ?WGB PC3OPM:wgbSource ?VAR .
  ?WGB PC3OPM:wgbTarget ?FXN2
}
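Read as a join, the CONSTRUCT query derives a direct `opWasTriggeredBy` link whenever one process used an artifact that another process generated. The Python sketch below mirrors that join over plain tuples. It is an illustration under assumed data (the process and artifact names are invented), not a replacement for running the query in a SPARQL engine.

```python
def derive_triggered_by(used_edges, generated_by_edges):
    """Mirror the CONSTRUCT pattern over plain tuples.

    used_edges:         (process, artifact) pairs, i.e. usdSource / usdTarget
    generated_by_edges: (artifact, process) pairs, i.e. wgbSource / wgbTarget
    Returns a set of (process, triggering_process) pairs.
    """
    # Index the generating processes by artifact so the join is a lookup.
    generator_of = {}
    for artifact, process in generated_by_edges:
        generator_of.setdefault(artifact, set()).add(process)

    derived = set()
    for process, artifact in used_edges:
        for cause in generator_of.get(artifact, ()):
            derived.add((process, cause))
    return derived

# Invented example data: P2 used A1 (generated by P1); P3 used A2 (generated by P2).
used = [("P2", "A1"), ("P3", "A2")]
generated_by = [("A1", "P1"), ("A2", "P2")]

derived = derive_triggered_by(used, generated_by)
print(sorted(derived))
```

The derived pairs are only the direct links; chaining them into indirect ones is the separate step that the transitive property handles.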


Page 15: A Semantic Web Approach for the Third Provenance Challenge

TetherlessPC3 Approach

[Figure: TetherlessPC3 architecture diagram, repeated from Page 4]

Can import: OPM graphs

Can export: OPM graphs, PML proofs

The import/export protocols for OPM are handled through the OPM API

Likewise, the import/export protocols for PML are handled through a PML API developed by our lab.

Page 16: A Semantic Web Approach for the Third Provenance Challenge

Discussion: Importing From Other Teams

• Some OPM graphs generated by other teams could not be parsed by the OPM API, so normalization was needed

• Our SPARQL queries (used on our own provenance trace) needed only slight modification to handle imported provenance (changing the URIs of artifacts)

• For many teams, some information loss was observed when dumping provenance traces to OPM

• Control flow traces were not captured by some teams


Page 17: A Semantic Web Approach for the Third Provenance Challenge

Comparing with Other Teams: Answering Core Query 3

[Figure panels: Blue Team, Our Team, Green Team]

Page 18: A Semantic Web Approach for the Third Provenance Challenge

Conclusions

• Semantic Web technologies are useful for handling provenance data

  • Provenance generation – RDF/OWL helps clarify the domain-specific concepts/entities in provenance metadata

  • Provenance query – supported by SPARQL + OWL inference

    • We can capture control flow and data flow

    • Using transitive inference rules, we don’t need to write a program to implement a recursive query

  • Provenance integration – an RDF/OWL syntax for OPM (with references to domain terminology) will help avoid information loss issues when exporting OPM data


Page 19: A Semantic Web Approach for the Third Provenance Challenge

References

• OWL: http://www.w3.org/TR/owl-features/

• SPARQL: http://www.w3.org/TR/rdf-sparql-query/

• Pellet: http://clarkparsia.com/pellet/

• Jena: http://jena.sourceforge.net/

• PML API: http://inference-web.org/wiki/Tools_%26_Demos

• OPM API: http://openprovenance.org/java/maven-snapshots/org/openprovenance/


Page 20: A Semantic Web Approach for the Third Provenance Challenge

BACK


Page 21: A Semantic Web Approach for the Third Provenance Challenge

PC3 OPM Ontology
