Formalising a protocol for recording provenance in Grids
Paul Groth – pg03r@ecs.soton.ac.uk
University of Southampton
All Hands Meeting 2004
Or… how to show your work in a Grid.
Contents
1. What is Provenance and why you should care.
2. The Grid and Provenance
3. An Architectural Vision
4. PReP
5. Let’s get formal (yawn….)
6. What’s next.
7. Conclusion
A Definition
Main Entry: prov·e·nance
Pronunciation: 'präv-n&n(t)s, 'prä-v&-"nän(t)s
Function: noun
Etymology: French, from provenir to come forth, originate, from Latin provenire, from pro- forth + venire to come
Date: 1785
1 : ORIGIN, SOURCE
2 : the history of ownership of a valued object or work of art or literature
Documentation of Process i.e. showing your work
The importance of provenance
Process is IMPORTANT
Art, Wine, Drug Discovery, Financial Auditing, Aerospace, …
The Grid
The Grid problem is defined as coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations [FKT01].
Effort is required to allow users to place their trust in the data produced by such virtual organisations
… and the Provenance Problem
Given a set of services in an open grid environment that decide to form a virtual organisation with the aim of producing a given result,
how can we determine the process that generated the result, especially after the virtual organisation has been disbanded?
Provenance Problem cont.
Provenance recording should be part of the infrastructure, so that users can elect to enable it when they execute their complex tasks over the Grid or in Web Services environments.
Currently, the Web Services protocol stack and the Open Grid Services Architecture do not provide any support for recording provenance.
Methods are generally ad hoc and do not interoperate.
Execution Provenance
2 Types
• Provenance about an interaction
• Provenance about an actor
Provenance is not a one way street
No standard way to record execution provenance.
An Architecture
An Architecture with Provenance Support
PReP- Provenance Recording Protocol
[Diagram: a client, a service and a Provenance Service. Arrows labelled: invocation, result, negotiate, and "record invocation and result" (twice, one per actor).]
Why record 2 views?
[Diagram: three client-service interactions (invocation, result). In each, two "invocation and result record" arrows lead to a Provenance Service.]
Provenance services may be shared or different
Linking Records
[Diagram: the same three client-service interactions, each with two "invocation and result record" arrows to a Provenance Service. Legend: Provenance Record, Record Link.]
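One way to picture the linking of records is as references that can cross provenance services. The following is a minimal sketch only: the slides do not fix a record layout, so the class names, the `caused_by` field and the `trace` helper are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical reference type: (provenance service name, record id).
RecordRef = Tuple[str, str]


@dataclass
class ProvenanceRecord:
    record_id: str
    invocation: str
    result: str
    # Assumed form of a "record link": a reference to the record of an
    # earlier interaction, possibly held by a different provenance service.
    caused_by: Optional[RecordRef] = None


@dataclass
class ProvenanceService:
    name: str
    records: Dict[str, ProvenanceRecord] = field(default_factory=dict)

    def store(self, record: ProvenanceRecord) -> None:
        self.records[record.record_id] = record


def trace(services: Dict[str, ProvenanceService], start: RecordRef) -> List[ProvenanceRecord]:
    """Follow record links back from `start`, across provenance services."""
    path: List[ProvenanceRecord] = []
    seen = set()
    ref: Optional[RecordRef] = start
    while ref is not None and ref not in seen:  # `seen` guards against cycles
        seen.add(ref)
        service_name, record_id = ref
        record = services[service_name].records[record_id]
        path.append(record)
        ref = record.caused_by
    return path
```

With links of this kind, the process behind a result can be reconstructed by starting at the final record and walking the links, whether the provenance services are shared or different.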
PReP in detail
Model PReP using asynchronous message passing.
• Maps well to any implementation
• Helpful for scalability
Four Phase Protocol
• Negotiation
• Invocation
• Provenance Recording
• Termination
PReP’s messages
• Propose
• Reply
• Invoke
• Result
• Record Negotiation
• Record Invocation
• Record Result
• Submission Finished
• Additional Provenance

• Record Negotiation Ack
• Record Invocation Ack
• Record Result Ack
• Submission Finished Ack
• Additional Provenance Ack
PReP’s messages (cont.)
Additional Provenance: used for connecting provenance records and for recording provenance about actors.
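The message vocabulary above can be written down as a small enumeration. This is a sketch for orientation only: the message names come from the slides, but the Python identifiers, the `ack_for` helper and, in particular, the grouping of messages into the four phases are assumptions, since the slides do not state which message belongs to which phase.

```python
from enum import Enum
from typing import Optional


class Message(Enum):
    PROPOSE = "Propose"
    REPLY = "Reply"
    INVOKE = "Invoke"
    RESULT = "Result"
    RECORD_NEGOTIATION = "Record Negotiation"
    RECORD_INVOCATION = "Record Invocation"
    RECORD_RESULT = "Record Result"
    SUBMISSION_FINISHED = "Submission Finished"
    ADDITIONAL_PROVENANCE = "Additional Provenance"
    RECORD_NEGOTIATION_ACK = "Record Negotiation Ack"
    RECORD_INVOCATION_ACK = "Record Invocation Ack"
    RECORD_RESULT_ACK = "Record Result Ack"
    SUBMISSION_FINISHED_ACK = "Submission Finished Ack"
    ADDITIONAL_PROVENANCE_ACK = "Additional Provenance Ack"


# Pairing of each acknowledged message with its Ack, as listed on the slide.
ACK_FOR = {
    Message.RECORD_NEGOTIATION: Message.RECORD_NEGOTIATION_ACK,
    Message.RECORD_INVOCATION: Message.RECORD_INVOCATION_ACK,
    Message.RECORD_RESULT: Message.RECORD_RESULT_ACK,
    Message.SUBMISSION_FINISHED: Message.SUBMISSION_FINISHED_ACK,
    Message.ADDITIONAL_PROVENANCE: Message.ADDITIONAL_PROVENANCE_ACK,
}

# ASSUMPTION: this phase grouping is a plausible reading, not taken from the slides.
PHASE_OF = {
    Message.PROPOSE: "Negotiation",
    Message.REPLY: "Negotiation",
    Message.INVOKE: "Invocation",
    Message.RESULT: "Invocation",
    Message.RECORD_NEGOTIATION: "Provenance Recording",
    Message.RECORD_INVOCATION: "Provenance Recording",
    Message.RECORD_RESULT: "Provenance Recording",
    Message.ADDITIONAL_PROVENANCE: "Provenance Recording",
    Message.SUBMISSION_FINISHED: "Termination",
}


def ack_for(message: Message) -> Optional[Message]:
    """Return the Ack paired with `message`, or None if it has no Ack."""
    return ACK_FOR.get(message)
```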
Provenance Service – An abstract state machine
Formalise the protocol by formalising the individual entities in the protocol
Know exactly how the Provenance Service responds to receipt of messages
Use to show a liveness property
• Something good will eventually happen
Rules of the ProvenanceService’s ASM
Client and Service
State transition diagram
• Cannot formalise internals, only the response to PReP
• Show Termination Property
VRML Demo
Sketch showing Liveness
Goal: Submission Finished Ack sent
Assume:
• Client & Service are live
• Communication channels work
• Personally, do not like this assumption
• Finite number of additional prov msgs
Show termination of Client & Service using graph
ASM rules guarantee that the provenance service fills up. Notify rule fires. Ack sent.
Q.E.D.
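The shape of this argument can be illustrated with a toy simulation. It is not the ASM from the talk and it proves nothing. It only shows that, under the stated assumptions (every message is eventually delivered, finitely many additional provenance messages), the final Ack is sent whatever the delivery order. The class, the message encoding and the idea that Submission Finished carries the number of records to expect are all assumptions made for this sketch.

```python
import random


class ToyProvenanceService:
    """Toy stand-in for the provenance service; not the ASM from the talk."""

    def __init__(self):
        self.stored = {}       # actor -> set of record ids received so far
        self.expected = {}     # actor -> record count announced by Submission Finished
        self.notified = set()  # actors already sent a Submission Finished Ack
        self.outbox = []       # messages sent back to the actors

    def receive(self, actor, kind, payload):
        if kind == "record":
            self.stored.setdefault(actor, set()).add(payload)
            self.outbox.append((actor, "record ack", payload))
        elif kind == "submission finished":
            # ASSUMPTION: the message announces how many records were submitted.
            self.expected[actor] = payload
        self._notify(actor)

    def _notify(self, actor):
        # "Notify" fires once the store holds everything the actor announced.
        if actor in self.notified or actor not in self.expected:
            return
        if len(self.stored.get(actor, ())) >= self.expected[actor]:
            self.notified.add(actor)
            self.outbox.append((actor, "submission finished ack", None))


def run(seed, n_additional=3):
    """Deliver all messages of one interaction in a random order."""
    rng = random.Random(seed)
    service = ToyProvenanceService()
    messages = []
    for actor in ("client", "service"):
        ids = ["negotiation", "invocation", "result"]
        ids += [f"additional-{i}" for i in range(n_additional)]
        messages += [(actor, "record", record_id) for record_id in ids]
        messages.append((actor, "submission finished", len(ids)))
    rng.shuffle(messages)  # asynchronous channels: any order, nothing lost
    for message in messages:
        service.receive(*message)
    return service
```

Calling `run` with different seeds varies the delivery order. If the number of additional provenance messages were unbounded, the announced count would never be reached, which is why the finiteness assumption matters.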
What’s next? Security
Support some “classical” properties of distributed algorithms.
• Using mutual authentication, an invoked service can ensure that it submits data to a specific provenance server and, vice versa, a provenance server can ensure that it receives data from a given service.
• With non-repudiation, we can retain evidence of the fact that a service has committed to executing a particular invocation and has produced a given result.
• We anticipate that cryptographic techniques will be useful to ensure such properties.
But wait there’s more…
Err… What if you have a lot of data?
• Look at scalability
A real (prototype) provenance service
• We have one in the lab
• Now let other people use it
And along comes trust
Conclusion
• Provenance is important
• Execution provenance is the first layer
• Provenance recording must be part of the infrastructure.
• Standards.
• Start from specification, not implementation.
• PReP is a first start.
• …and it’s cool.
Acknowledgements
Luc Moreau, Michael Luck, Victor Tan, Simon Miles
Visit http://www.pasoa.org
E-mail me: pg03r@ecs.soton.ac.uk
The End