2013 © Copyright lies with the respective authors and their institutions.
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science
STREP FP7-ICT-2007-6 270192
Objective ICT-2009.4.1 b) – “Advanced preservation scenarios”
D1.4v2 Reference Wf4Ever Implementation – Phase II
Deliverable Co-ordinator: Raul Palma
Deliverable Co-ordinating Institution: Poznan Supercomputing and Networking
Center (PSNC)
This document presents the Wf4Ever Toolkit, the reference implementation of the Wf4Ever architecture, which
comprises a set of services providing functionalities exposed via RESTful APIs, alongside client applications
providing access to these functionalities.
Document Identifier: Wf4ever/2011/D1.4/v2
Date due: 31/07/2013
Class Deliverable: Wf4ever 270192
Submission date: 31/07/2013
Project start date: December 1, 2010
Version: V1.0
Project duration: 3 years
State: Final
Distribution: Public
Page 2 of 34 Wf4Ever STREP FP7-ICT-2007-6 270192
Wf4Ever Consortium
This document is a part of the Wf4Ever research project funded by the IST Programme of the Commission of
the European Communities by the grant number FP7-ICT-2007-6 270192. The following partners are
involved in the project:
Intelligent Software Components S.A.
Edificio Testa
Avda. del Partenón 16-18, 1º, 7ª
Campo de las Naciones, 28042 Madrid
Spain
Contact person: Dr. Jose Manuel Gómez-Pérez
E-mail address: [email protected]
University of Manchester
Department of Computer Science,
University of Manchester, Oxford Road
Manchester, M13 9PL
United Kingdom
Contact person: Professor Carole Goble
E-mail address: [email protected]
Universidad Politécnica de Madrid
Departamento de Inteligencia Artificial
Facultad de Informática, UPM
28660 Boadilla del Monte, Madrid
Spain
Contact person: Dr. Oscar Corcho
E-mail address: [email protected]
University of Oxford
Department of Zoology
University of Oxford
South Parks Road, Oxford OX1 3PS
United Kingdom
Contact person: Dr. Jun Zhao / Professor David De
Roure
E-mail address: {[email protected],
Poznań Supercomputing and Networking Center
Network Services Department
Poznań Supercomputing and Networking Center
Z. Noskowskiego 12/14, 61-704 Poznan
Poland
Contact person: Dr. Raúl Palma de León
E-mail address: [email protected]
Instituto de Astrofísica de Andalucía
Dpto. Astronomía Extragaláctica
Instituto Astrofísica Andalucía
Glorieta de la Astronomía s/n 18008 Granada,
Spain
Contact person: Dr. Lourdes Verdes-Montenegro
E-mail address: [email protected]
Leiden University Medical Centre
Department of Human Genetics
Leiden University Medical Centre
Albinusdreef 2, 2333 ZA Leiden
The Netherlands
Contact person: Dr. Marco Roos
E-mail address: [email protected]
D1.3v1: Wf4Ever Architecture – Phase I Page 3 of 34
Work Package Participants
The following partners have taken an active part in the work leading to the elaboration of this document,
even if they might not have directly contributed to the writing of this document or its parts:
• PSNC
• OXF
• UNIMAN
• UPM
• ISOCO
Change Log
Version Date Amended by Changes
0.1 21-06-2013 Raul Palma Document Outline
0.2 21-06-2013 Raul Palma Introduction
0.3 24-06-2013 Raul Palma Conclusions
0.4 8-07-2013 Raul Palma Section 3
0.5 10-07-2013 Raul Palma Section 2 based on input of users
0.6 15-07-2013 Raul Palma Revision
0.7 17-07-2013 Raul Palma Send to QA
0.8 28-07-2013 Raul Palma Update based on QA
1.0 31-07-2013 Raul Palma Final Version
Executive Summary
This document describes the Wf4Ever Toolkit, the reference implementation of the Wf4Ever architecture, which
consists of a set of services providing functionalities for the realization of a scientific workflow preservation
infrastructure, alongside client applications providing access to these functionalities. We present these
components in the context of the functional categorization introduced by the Wf4Ever architecture (see
D1.3v2), followed by a description of short use case scenarios, which motivate the developed components
and illustrate how they can interact with each other to support users in the management and preservation of
scientific workflows. Moreover, for each of these components this document provides an information card
including: a brief description highlighting the key functionalities provided, the APIs implemented, the services
used, the source code repository location, the deployment location of a live instance, and a reference to the
corresponding M32 technical WP deliverable where the component implementation and usage are described
in more detail. Hence, this document serves as an index for other deliverables. Complete documentation of
the APIs implemented by the services is provided in the Architecture deliverable (D1.3v2), and the user
guides for client applications are provided in the Sandbox deliverable (D1.2v3).
Note that this is a self-contained document that supersedes D1.4v1. Therefore, some of the content from
D1.4v1 has been reused and updated when necessary, in particular for Section 1, Section 3, and Section 5.
The main novel contributions of this document are the use case scenarios (Section 2), which have been
aligned with the user deliverables using input from our users, and the updated information cards (Section 3).
Table of contents
Wf4Ever Consortium
Executive Summary
Table of contents
List of Figures
1 Introduction
1.1 Relations with other deliverables
2 Use case scenarios
2.1 Scenario: Discover Research Objects
2.2 Scenario: Inspect Research Objects
2.3 Scenario: Research Object creation and management
2.4 Scenario: Annotations
2.5 Scenario: Quality
2.6 Scenario: Workflow run evidence
2.7 Scenario: Recommendations
2.8 Scenario: Evolution & Preservation
2.9 Scenario: Manage local Research Object
2.10 Scenario: Manage Research Object using different tools
3 Toolkit components
3.1 Storage and Lifecycle services
3.1.1 Research Object Digital Library – RODL
3.1.2 RO-enabled myExperiment (repository)
3.1.3 Workflow Runner
3.2 Data Management & Analysis Services
3.2.1 Checklist Evaluation Service
3.2.2 Quality Evaluation Service
3.2.3 Recommender Service
3.2.4 Workflow Abstraction Service
3.2.5 Workflow-Research Object Transformation (WF-RO) Service
3.3 Access & Usage Clients
3.3.1 RO-enabled myExperiment (web interface)
3.3.2 RO Portal
3.3.3 RO Manager Tool
3.3.4 Collaboration Spheres
4 Conclusions
References
List of Figures
Figure 1. Wf4Ever Toolkit
Figure 2 Faceted search in RO Portal
Figure 3 RO inspection in RO-enabled myExperiment
Figure 4 RO inspection in RO Portal
Figure 5 Importing data from myExperiment
Figure 6 Managing Annotations in RO Portal
Figure 7 Checklist result page
Figure 8 Research object quality visualization page
Figure 9 Recommendations in RO-enabled myExperiment
Figure 10 Research Object evolution visualization in RO Portal
Figure 11 Display annotations of a local resource with RO Manager
Figure 12 Deployment diagram with partial view of interactions between Wf4Ever components, described in more detail in the subsection associated to arrow colour
1 Introduction
One of the main tangible outcomes of the Wf4Ever project is a technological infrastructure for the preservation
and efficient retrieval and reuse of scientific workflows in a range of disciplines. In order to produce this
outcome, we defined a software architecture for a scientific workflow preservation infrastructure, building
upon the Research Object model developed in the project [1], and realized this architecture in a reference
implementation called the Wf4Ever Toolkit. The architecture describes interfaces to functionalities, which
have been grouped into categories representing areas addressed by a preservation infrastructure (see
deliverable D1.3v2 [2]), and the toolkit implements these functionalities through a set of services and clients.
For this process, we adopted an agile development approach with early prototyping, in which the toolkit
co-evolved alongside the architecture definition.
The Wf4Ever Toolkit comprises a set of services implementing storage, lifecycle, data management and
analysis functionalities, which are exposed via RESTful APIs, alongside client applications providing access
and usage functionalities. Figure 1 depicts the toolkit components in the context of the functional
categorization prescribed by the Wf4Ever architecture. Note that although the set of components in the
toolkit is final, the implementation of the individual components is still being improved and extended, and
will continue to evolve during the whole lifetime of the project.
Figure 1. Wf4Ever Toolkit
The toolkit components are deployed in the Wf4Ever Sandbox, which can be used through its live instance
(http://sandbox.wf4ever-project.org/) or downloaded and deployed locally along with the toolkit
components (http://sandbox.wf4ever-project.org/images.html). The only exception is RO-enabled
myExperiment, which uses its own Sandbox (http://alpha.myexperiment.org/). For more information about
the Wf4Ever Sandbox, including user guides of the client applications, please refer to D1.2v2 [3] and D1.2v3 [4].
1.1 Relations with other deliverables
This document is the final reference implementation document, and supersedes the previous version
D1.4v1. This document includes a technical overview of the toolkit components, and serves as an index to
other M32 deliverables from technical work packages (D2.2v2 [6], D3.2v2 [7] and D4.2v2 [8]), which provide
a detailed description of each of these components, including implementation aspects and interaction
diagrams. Complete documentation of the implemented APIs is provided in the architecture deliverable
(D1.3v2 [2]) and user guides for the client applications are provided in the Sandbox deliverables (D1.2v2 [3]
and D1.2v3 [4]).
The remainder of this document is organized as follows. In Section 2, we describe a set of short use case
scenarios, aligned with the user documents, which motivate and illustrate the toolkit components and how
users can interact with them. Next, in Section 3, we introduce the toolkit components, highlighting the
changes since D1.4v1, and present for each of them an information card including: a brief description, the
API implemented (for services), the interrelations with other components, the location of its source code
and live instance, and a link to the corresponding M32 deliverable where this component is described in
more detail, as discussed above. Finally, we conclude in Section 4.
2 Use case scenarios
To illustrate the usage of the Wf4Ever Toolkit and the interactions between its components, we describe
the ten most representative short use case scenarios, selected in collaboration with our users. Although some
of these scenarios may depend on each other, in general they can be performed in any order. Note that user
guides for the client applications are presented in D1.2v2 [3] and D1.2v3 [4].
2.1 Scenario: Discover Research Objects
Users start by searching for research objects stored in the RO Digital Library from RO Portal or RO-enabled
myExperiment using one or more keywords. Results are displayed with useful information about each research
object. For instance, such information may include fields providing a glimpse of what the research object is
about (e.g., title, author and tags), its complexity (e.g., number of resources or annotations) and its
usefulness or quality (e.g., status, date and number of comments). Results can then be restricted or filtered
through a faceted search interface. For instance, in RO Portal users can restrict the results based on their
author, status, creation date, number of annotations, number of resources or any combination of these (see
Figure 2). Moreover, results can be sorted by any of these properties via a dropdown list (shown at the top of
the results list in the figure) with the fields and their order (ascending/descending).
Figure 2 Faceted search in RO Portal
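The faceted restriction and sorting described in this scenario can be sketched in a few lines. This is only an illustrative in-memory model, not the RO Portal implementation: the `ROSummary` class, its field names and the sample records are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ROSummary:
    """Hypothetical search result record; field names are assumptions."""
    title: str
    author: str
    status: str
    created: date
    resources: int
    annotations: int


def facet_filter(results, **facets):
    """Keep only the results whose fields equal every requested facet value."""
    return [r for r in results
            if all(getattr(r, name) == value for name, value in facets.items())]


def sort_results(results, field, descending=False):
    """Sort results by one field, ascending unless descending=True."""
    return sorted(results, key=lambda r: getattr(r, field), reverse=descending)


results = [
    ROSummary("Galaxy pairs", "alice", "live", date(2013, 3, 1), 12, 40),
    ROSummary("Gene ranking", "bob", "snapshot", date(2013, 5, 9), 7, 15),
    ROSummary("Galaxy bars", "alice", "snapshot", date(2013, 6, 2), 9, 22),
]

# Combine a facet restriction with a descending sort on creation date.
by_alice = facet_filter(results, author="alice")
newest_first = sort_results(by_alice, "created", descending=True)
print([r.title for r in newest_first])
```

Passing several keyword arguments to `facet_filter` combines facets, which mirrors the "any combination of these" behaviour described above.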
2.2 Scenario: Inspect Research Objects
Once a research object is selected in RO-enabled myExperiment or RO Portal, users can inspect its content
(see Figure 3 and Figure 4). The information displayed includes an overview of the research object with its
description, status, complexity, quality and a graphical representation (if available) that may help in
understanding the research object. Additionally, users are able to navigate the structure of the research
object, inspect its aggregated resources including their annotations, and view the relations among them. In
RO Portal, users are also able to visualize the evolution of the research object, including events affecting the
research object, such as changes in its content and in its quality. Users are also able to download the
whole research object or any of its aggregated resources, for instance, to reuse it or run it locally (e.g.,
a workflow).
Figure 3 RO inspection in RO-enabled myExperiment
2.3 Scenario: Research Object creation and management
Users are able to create a new research object in different ways. For instance, both in RO-enabled
myExperiment and RO Portal, they can start by creating an empty research object, optionally with a
predefined folder structure, and adding individual resources one by one (e.g., workflows, scripts, example
input data and others). Alternatively, in RO Portal users are able to create a research object from a ZIP file
with a set of folders and files, or by importing data from myExperiment. For this task, users select content
from myExperiment such as workflows and packs (see Figure 5), which are transformed into research
objects. The content itself is not modified and the available metadata, such as title and description, are
copied as semantic annotations. For workflows, the Workflow Transformation Service is used, which
generates additional annotations related to the workflow structure, sub-resources, and history. In any case,
while working with the research object users can establish the relationships between the aggregated
resources (e.g., the example input data to the workflow).
Figure 4 RO inspection in RO Portal
Similarly, at any time users can modify the structure of a research object as well as its content. For instance,
they can edit the folder structure of the research object adding folders and files, moving the files among
folders, or deleting resources. They can also modify the research object properties (e.g., title, description) or
its aggregated resources, e.g., uploading a new version or changing their description.
Figure 5 Importing data from myExperiment
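As an illustration of the ZIP-based creation path in this scenario, the sketch below derives the folder structure and the list of files to aggregate from an archive. It is a minimal sketch using only the Python standard library; the function name `manifest_from_zip` and the sample file names are hypothetical, and the way RO Portal actually processes ZIP files is not described in this document.

```python
import io
import zipfile


def manifest_from_zip(zip_bytes):
    """Return (folders, files) found in a ZIP archive, as sorted lists of paths."""
    folders, files = set(), set()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                # Explicit directory entry.
                folders.add(name.rstrip("/"))
                continue
            files.add(name)
            # Register every parent folder of the file as well.
            parts = name.split("/")[:-1]
            for depth in range(1, len(parts) + 1):
                folders.add("/".join(parts[:depth]))
    return sorted(folders), sorted(files)


# Build a small archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("workflows/align.t2flow", "<workflow/>")
    archive.writestr("data/input/sample.csv", "a,b\n1,2\n")

folders, files = manifest_from_zip(buffer.getvalue())
print(folders)
print(files)
```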
2.4 Scenario: Annotations
The user can create, edit and remove annotations on the research object and its aggregated resources from
RO Portal or RO-enabled myExperiment. Many of these annotations serve to associate components with
a resource type definition (hypothesis, workflow, script, input, result, document, conclusion, etc.), and also to
describe relationships between them (such as a resource being an input for a workflow), based on
predefined vocabularies and models compliant with the Wf4Ever RO model. Other annotations are related to
authoring and crediting, including recording the authors and contributors, as well as relating a research object
or its resources to other existing resources that they reuse or derive from, and when possible automatically
crediting the authors of these resources.
It is also important to be able to include free-text descriptions, for instance, of the configuration settings
needed for the execution environment at the resource level, or of the order of every step performed in the
procedural protocol of the whole execution of the research object. Very often, users undertake these steps
taking into account personal decisions that should be registered. Other descriptions may relate to scientific
information very specific to the experiment, such as scientific considerations on the provenance of the
initial input data for the RO (e.g., how did I get or prepare them?) or the mathematical formulae or equations
used in the model, process or script to execute. These annotations may also be complemented with
bibliography used or produced by the experiment. RO Portal provides a basic view of the resource
annotations for predefined annotations and an advanced view for managing any annotation (see Figure 6).
Figure 6 Managing Annotations in RO Portal
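The typing, relationship and crediting annotations described in this scenario can be pictured as simple (subject, predicate, object) statements. The sketch below is an assumption-laden illustration: the `AnnotationStore` class and the `ex:` predicate names are hypothetical placeholders and are not terms from the Wf4Ever RO model or its vocabularies.

```python
from collections import defaultdict


class AnnotationStore:
    """Stores annotations as (subject, predicate, object) statements."""

    def __init__(self):
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].add((predicate, obj))

    def remove(self, subject, predicate, obj):
        self._by_subject[subject].discard((predicate, obj))

    def values(self, subject, predicate):
        """Sorted objects of all statements with this subject and predicate."""
        return sorted(o for p, o in self._by_subject[subject] if p == predicate)

    def credited_authors(self, subject):
        """Authors of the subject plus authors of everything it derives from."""
        authors, pending, seen = set(), [subject], set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            authors.update(self.values(current, "ex:author"))
            pending.extend(self.values(current, "ex:derivedFrom"))
        return sorted(authors)


store = AnnotationStore()
# Resource typing and a relationship between aggregated resources.
store.add("workflow.t2flow", "ex:type", "ex:Workflow")
store.add("input.csv", "ex:type", "ex:Input")
store.add("input.csv", "ex:isInputFor", "workflow.t2flow")
# Authoring and derivation, used to credit authors of reused resources.
store.add("workflow.t2flow", "ex:author", "alice")
store.add("workflow.t2flow", "ex:derivedFrom", "old-workflow.t2flow")
store.add("old-workflow.t2flow", "ex:author", "bob")

print(store.credited_authors("workflow.t2flow"))
```

Following the derivation links transitively is one possible reading of "automatically crediting the authors of these resources"; the actual mechanism may differ.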
2.5 Scenario: Quality
Users can check the quality of a research object by running the Checklist Evaluation Service, providing as
input a research object stored in the RO Digital Library and a Minim model [5] that formally describes the
requirements to be satisfied by the research object to be suitable for some purpose (e.g., completeness).
This service can be run from RO-enabled myExperiment (Figure 7 shows the result page returned after
selecting the Checklist link on the right panel in Figure 3) or RO Portal, which automatically displays a short
summary of the results in the research object Overview page (see Figure 4) and more detailed information in
the research object Quality page. The Checklist Evaluation Service has been developed using some of the
same code base as RO Manager, and the same evaluation can be performed from the command line using
RO Manager (see Section 2.9, Scenario: Manage local Research Object).
Figure 7 Checklist result page
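To make the idea of a checklist evaluation concrete, the following sketch evaluates a list of requirements against a research object and reports which ones are satisfied. It is a simplified stand-in and not the Checklist Evaluation Service or the Minim model itself: the dictionary layout of the research object, the MUST/SHOULD levels and the helper names are assumptions made for the example.

```python
def evaluate_checklist(research_object, requirements):
    """Evaluate each (level, label, test) requirement against a research object.

    Returns (passed, report); passed is False if any MUST requirement fails.
    """
    report = []
    for level, label, test in requirements:
        report.append((level, label, bool(test(research_object))))
    passed = all(ok for level, _, ok in report if level == "MUST")
    return passed, report


def has_resource_of_type(resource_type):
    """Build a test that checks for at least one resource of the given type."""
    return lambda ro: any(r["type"] == resource_type for r in ro["resources"])


# Hypothetical checklist for a "completeness" purpose.
requirements = [
    ("MUST", "aggregates a workflow", has_resource_of_type("workflow")),
    ("MUST", "aggregates example input data", has_resource_of_type("input")),
    ("SHOULD", "states a hypothesis", has_resource_of_type("hypothesis")),
]

ro = {"resources": [{"name": "align.t2flow", "type": "workflow"},
                    {"name": "sample.csv", "type": "input"}]}

passed, report = evaluate_checklist(ro, requirements)
for level, label, ok in report:
    print(level, label, "satisfied" if ok else "not satisfied")
```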
Additionally, users can check the normalized completeness, stability and reliability of the research object
using the Quality Evaluation Service. This service executes a series of checklist evaluations (over time) on
research objects stored in the RO Digital Library and calculates the values of these quality dimensions, i.e.,
the extent to which the research object satisfies the checklist requirements, its ability to preserve its
properties through time, and its ability to converge towards a scenario free of decay (complete and stable
through time); see D4.2v2 [8]. The service can return a web page where the user can explore how the
research object quality has evolved over time in terms of these quality scores. This page can be opened from
RO-enabled myExperiment (Analytics and Quality link) or RO Portal (Quality page); see Figure 8.
Figure 8 Research object quality visualization page
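The three quality dimensions can be illustrated with a small numeric sketch. The formulas below are assumptions chosen for illustration only (completeness as the fraction of satisfied requirements, stability as one minus the standard deviation of past completeness scores, reliability as their product); the definitions actually used by the Quality Evaluation Service are given in D4.2v2 [8] and may differ.

```python
from statistics import pstdev


def completeness(satisfied, total):
    """Fraction of checklist requirements that are satisfied."""
    return satisfied / total if total else 0.0


def stability(history):
    """One minus the population standard deviation of past completeness scores."""
    return 1.0 - pstdev(history) if history else 0.0


def reliability(history):
    """Latest completeness weighted by how stable the scores have been."""
    return history[-1] * stability(history) if history else 0.0


# Four checklist evaluations over time, each with 4 requirements.
history = [completeness(satisfied, 4) for satisfied in (4, 4, 2, 4)]
print(history)
print(round(stability(history), 4))
print(round(reliability(history), 4))
```

In this example the temporary drop in the third evaluation lowers stability, so reliability stays below 1 even though the latest evaluation is fully complete.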
2.6 Scenario: Workflow run evidence
When a research object aggregates a workflow for which no evidence has been provided that it can be run,
the research object quality checklist shown in RO Portal or RO-enabled myExperiment warns the user, for
instance, with a cross indicator on the run evidence requirement. Ongoing work includes enabling users
to select this indicator from the portal, giving them the possibility either to upload run data or to run the
workflow. In the first case, the user would provide a workflow run including, for instance, real (example) input
data, real (example) output data, and information about the workflow engine used (e.g., Taverna version) and
how long it took to run. In the second case, the user would execute the workflow from the portal using the
Workflow Runner Service. The run provenance would be subsequently exposed by the service and then
aggregated by the research object. As a result, the research object quality checklist would improve.
2.7 Scenario: Recommendations
Users can get recommendations of other users, research objects and individual resources (e.g., workflows,
datasets) using the Recommender Service, based on their profile (e.g., research objects they have created,
keywords previously proposed), or on a research object description (e.g., title, description, tags). This service
can be executed from RO-enabled myExperiment and RO Portal (Figure 9 shows an example result page
of recommendations in RO-enabled myExperiment). The service is also used by the Collaboration Spheres
client application, available through RO-enabled myExperiment, which provides an alternative view for
browsing recommendations.
Figure 9 Recommendations in RO-enabled myExperiment
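A keyword-based recommendation of the kind outlined in this scenario can be approximated by ranking candidates by keyword overlap. This sketch uses Jaccard similarity purely as an illustrative technique; it is not the algorithm of the Recommender Service, and the function names and sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend(profile_keywords, candidates, top_n=3):
    """Rank candidate research objects by keyword overlap with a user profile."""
    scored = [(jaccard(profile_keywords, tags), title)
              for title, tags in candidates.items()]
    # Drop candidates with no overlap; break score ties alphabetically.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [title for _, title in ranked[:top_n]]


candidates = {
    "Galaxy pairs": ["astronomy", "galaxies", "kinematics"],
    "Gene ranking": ["genetics", "ranking"],
    "Galaxy bars": ["astronomy", "galaxies"],
}

print(recommend(["astronomy", "galaxies"], candidates))
```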
2.8 Scenario: Evolution & Preservation
Users may at any moment create a snapshot of a research object, capturing the state of the research object
at the given point in time, and create a textual annotation describing why they decided to do so and what
the state of the research object represents. Such snapshots may be useful to release the current version of
the research outcome of an experiment, submit it to be peer reviewed or to be published, share it with
supervisors or collaborators, or for acknowledgement and citation purposes.
Users can create snapshots from RO Portal, where they can also consult the evolution history of the
research object as depicted in Figure 10. The user may decide to recover one snapshot of the research
object, with all its contents and annotations as they were at that stage.
Figure 10 Research Object evolution visualization in RO Portal
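The snapshot and recovery behaviour described in this scenario can be modelled with deep copies of the live state. The class and attribute names below are hypothetical and the sketch ignores persistence; it only shows the idea that a snapshot freezes contents and annotations together with the reason for taking it.

```python
import copy
from datetime import datetime, timezone


class LiveResearchObject:
    """Hypothetical in-memory research object with snapshot support."""

    def __init__(self):
        self.resources = {}
        self.annotations = []
        self.snapshots = []

    def snapshot(self, reason):
        """Freeze the current state and return the index of the new snapshot."""
        frozen = {
            "taken_at": datetime.now(timezone.utc),
            "reason": reason,  # textual annotation explaining the snapshot
            "resources": copy.deepcopy(self.resources),
            "annotations": copy.deepcopy(self.annotations),
        }
        self.snapshots.append(frozen)
        return len(self.snapshots) - 1

    def recover(self, index):
        """Restore contents and annotations as they were in a snapshot."""
        frozen = self.snapshots[index]
        self.resources = copy.deepcopy(frozen["resources"])
        self.annotations = copy.deepcopy(frozen["annotations"])


ro = LiveResearchObject()
ro.resources["workflow.t2flow"] = "version 1"
first = ro.snapshot("submitted for peer review")

ro.resources["workflow.t2flow"] = "version 2"  # live object keeps evolving
ro.recover(first)
print(ro.resources["workflow.t2flow"])
```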
2.9 Scenario: Manage local Research Object
Using the RO Manager command line tool, a user is able to create and manage a research object on a local
drive, aggregate their files as resources and make annotations. Figure 11 shows how to display the
annotations of a resource with RO Manager. The research object may also be synchronized and pushed to
the RO Digital Library for sharing and publication.
Figure 11 Display annotations of a local resource with RO Manager
2.10 Scenario: Manage Research Object using different tools
Users are able to use different tools for managing a research object. For instance, the user may create a
research object from RO-enabled myExperiment, including aggregating resources and annotations, and then
use RO Portal for managing the evolution of the research object, e.g., create snapshots and visualize the
history. These applications use RO Digital Library for storing the research objects. Moreover, the user may
download the research object, use RO Manager for making additional annotations or aggregating additional
resources, and then push it back to RO Digital Library.
3 Toolkit components
The toolkit consists of 12 components: 3 storage and lifecycle services (Research Object Digital Library -
RODL, myExperiment and Workflow Runner), 5 data management and analysis services (Checklist
Evaluation, Quality Evaluation, Recommender, Workflow Abstraction and Workflow Transformation) and 4
access and usage clients (RO-enabled myExperiment, RO Portal, RO Manager Tool and Collaboration
Spheres). Most of these components were presented in D1.4v1. However, as noted in the introduction, even
though the set of toolkit components is final, their implementation is still being improved and extended.
Hence, in this section we summarize their current state and introduce the Workflow
Abstraction service and the Quality Evaluation service. The latter subsumes (and replaces) the former
Stability Evaluation service. Additionally, since RO-enabled myExperiment has been recently extended to
implement the Research Object – RO API¹ (partially), it has been included as one of the storage services
(replacing the current myExperiment). We have also removed the ROBox client application from this
document, as it was deprecated and is no longer maintained. The diagram in Figure 12 depicts the toolkit
components along with a partial view of their interactions, described in more detail in the subsection
associated to the arrow colour.
The toolkit components have been deployed in the Wf4Ever Sandbox, available online through its live instance
(http://sandbox.wf4ever-project.org/) and for download (http://sandbox.wf4ever-project.org/images.html);
see D1.2v2 [3] and D1.2v3 [4]. The only exception is RO-enabled myExperiment, which uses its own Sandbox
(http://alpha.myexperiment.org/ and http://alpha2.myexperiment.org/). Individual deployment instructions
may also be available at the source code repository of each component (included in their information card).
In the following, we provide an information card for each component including: a brief description, the API
implemented (for services), the interactions with other components, the location of source code and live
instance, and a link to the corresponding M32 technical WP deliverable where the component
implementation and usage is described in more detail. To illustrate the interactions with other components,
we include small diagrams showing the related components with arrows depicting the flow of data, in the
context of the functional categorization prescribed by Wf4Ever Architecture, providing partial views of Figure
1.
Detailed documentation of the APIs is available in the final architecture deliverable D1.3v2 [2], and also at
the following links:
• Research Object – RO API: http://www.wf4ever-project.org/wiki/display/docs/RO+API+6
• Research Object Evolution – RO EVO API: http://www.wf4ever-
project.org/wiki/display/docs/RO+evolution+API
• Checklist Evaluation API: http://wf4ever-project.org/wiki/display/docs/RO+checklist+evaluation+API
• Reliability Evaluation API2: http://www.wf4ever-
project.org/wiki/display/docs/Reliability+Evaluation+API
1 Research Object – RO API was formerly known as Research Object Storage & Retrieval – ROSR API 2 The Reliability Evaluation API subsumes (and replaces) the former Stability Evaluation API
D1.3v1: Wf4Ever Architecture – Phase I Page 21 of 34
2013 © Copyright lies with the respective authors and their institutions.
Figure 12 Deployment diagram with partial view of interactions between Wf4Ever components, described in more detail in the subsection associated to arrow colour
• Recommendation API: http://wf4ever-project.org/wiki/display/docs/Recommender+Service#RecommenderService-Interface
• Workflow Abstraction API: http://wf4ever-project.org/wiki/display/docs/Workflow+Indexing+API
• Workflow Runner API: http://wf4ever-project.org/wiki/display/docs/Workflow+Runner+API
• Workflow-RO Transformation API: http://wf4ever-project.org/wiki/display/docs/Wf-RO+transformation+service+API
• User Management API: http://wf4ever-project.org/wiki/display/docs/User+Management+2
3.1 Storage and Lifecycle services
3.1.1 Research Object Digital Library – RODL
Service Name RO Digital Library - RODL
Description Software system which collects, manages and preserves aggregations of scientific
methods (e.g., workflows) and related artifacts along with their annotations,
organized as research objects. The key features are:
• Create, Edit & Delete ROs
• Retrieve ROs in different formats
• Creation of different RO types (Live, Snapshot, Archived)
• Retrieve RO evolution provenance
• Add & Remove aggregated resources and annotations
• Create & Delete users
• Search & Index ROs
• Query metadata
• Retrieve metadata in different formats (e.g., RDF/XML, TTL)
RODL has a modular structure that comprises access components (built on top of dLibra
digital library services - http://dlibra.psnc.pl/ and Jena TDB store -
http://jena.apache.org/), long-term preservation components (built on top of dArceo -
http://dlab.psnc.pl/darceo/) and a controller that manages the flow of data.
URI http://sandbox.wf4ever-project.org/rodl/
Source Code Repository https://github.com/wf4ever/rodl
APIs Implemented
• Research Object - RO
• Research Object Evolution – RO EVO
• Notification
• User Management
• SPARQL Endpoint3
Additionally, RODL exposes Solr schema REST API4
Services Used
RODL is built on top of dLibra services, Jena TDB store and dArceo system. It is a
core service of Wf4Ever and therefore does not use or rely on other Wf4Ever
services.
Additional Information D2.2v2 [6]
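The "retrieve metadata in different formats" feature can be illustrated with a minimal sketch. It assumes that RODL chooses the serialization through standard HTTP content negotiation (an `Accept` header), which this document does not confirm; the helper name `metadata_request` is hypothetical, and the RO URI is the sample one used elsewhere in this section. The request is only built, not sent.

```python
from urllib.request import Request

# Standard media types for the two serializations named in the feature list.
RDF_MEDIA_TYPES = {
    "RDF/XML": "application/rdf+xml",
    "TTL": "text/turtle",
}


def metadata_request(resource_uri: str, fmt: str) -> Request:
    """Build (but do not send) a GET request asking for metadata in `fmt`.

    Assumption: the server honours the Accept header; this is a hypothetical
    helper, not part of any Wf4Ever client library.
    """
    return Request(resource_uri, headers={"Accept": RDF_MEDIA_TYPES[fmt]})


req = metadata_request(
    "http://sandbox.wf4ever-project.org/rodl/ROs/myExpRO_1167/", "TTL"
)
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval; the RO API documentation linked above remains the authority on which media types are supported.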
3 http://www.w3.org/TR/sparql11-service-description/
4 http://wiki.apache.org/solr/SchemaRESTAPI
3.1.2 RO-enabled myExperiment (repository)
Service Name RO-enabled myExperiment (repository)
Description Repository of scientific workflows, other digital objects and research objects
(called packs). The key features in the context of Wf4Ever include:
• Create, Edit & Delete packs/ROs
• Add & Remove Workflows and other resources
• Search & Index Workflows, packs and other resources
• Versioning of resources, such as workflows
• Sorting and pagination
• Permissions management
URI http://alpha.myexperiment.org/ and http://alpha2.myexperiment.org/
Source Code Repository http://rubyforge.org/projects/myexperiment/
APIs Implemented
• myExperiment5
• Research Object – RO (partially)
• SPARQL Endpoint
Services Used
RO-enabled myExperiment sends a copy of the research objects (packs) to RO Digital
Library.
Additional Information D2.2v2 [6]
3.1.3 Workflow Runner
Service Name Workflow Runner
Description Workflow execution service, key for workflow decay analysis. The key features are:
• Remote execution of workflows
• Expose workflow runs as ROs following a subset of RO API, aggregating inputs,
outputs, console logs, provenance and annotations including mappings to RO
vocabularies for workflow description (wfdesc) and execution provenance
(wfprov) [1].
Workflow Runner is built on top of the Taverna Server -
http://www.taverna.org.uk/documentation/taverna-2-x/server/
5 http://wiki.myexperiment.org/index.php/Developer:API
URI http://sandbox.wf4ever-project.org/runner/
Source Code Repository https://github.com/wf4ever/workflow-runner
APIs Implemented
• Workflow Runner
Services Used
Workflow Runner does not use or rely on any other Wf4Ever service; however, it can execute
workflows stored in any remote repository, including RO-enabled myExperiment
repository and RO Digital Library.
Additional Information http://wf4ever-project.org/wiki/display/docs/Workflow+Runner+API
3.2 Data Management & Analysis Services
3.2.1 Checklist Evaluation Service
Service Name Checklist Evaluation Service
Description Service for testing completeness, executability, repeatability and other desired
features of a research object. In particular, it evaluates a research object against a
minimum information model and for a particular purpose (e.g., completeness).
URI http://sandbox.wf4ever-project.org/roevaluate/
Source Code Repository https://github.com/wf4ever/ro-manager
APIs Implemented
• Checklist
Services Used
The checklist service does not use or rely on any other Wf4Ever service; however, it can
evaluate research objects available on the local file system or in any repository
implementing RO API, such as RO Digital Library and RO-enabled myExperiment.
Additional Information D4.2v2 [8]
Notes The checklist service also comprises a traffic-light service component, which returns
HTML or JSON data for facilitating the generation of a simple display of the checklist
results. This service implements the traffic-light API6, which is very similar in
structure to the checklist evaluation API. Since the service provides HTML/JSON
interfaces, a user guide is provided in D1.2v3 [4].
3.2.2 Quality Evaluation Service
Service Name Quality Evaluation Service
Description This service tests the quality of an RO, a measure of how healthy it is based on the
ability of the research object to achieve its original purpose after its resources have
changed. In particular, the service monitors the RO over time,
capturing the completeness, stability and reliability
scores of the RO at different moments of its evolution. These scores represent, respectively, the
extent to which the RO satisfies the checklist requirements, its ability to preserve
its properties through time and its ability to converge towards a scenario free of
decay (complete and stable through time).
URI http://sandbox.wf4ever-project.org/decayMonitoring/rest/getReliability
http://sandbox.wf4ever-project.org/decayMonitoring/rest/notifications
Source Code Repository https://github.com/wf4ever/quality-service
APIs Implemented
• Reliability
• Notification
6 http://www.wf4ever-project.org/wiki/display/docs/Checklist+traffic+light+API
Services Used
The quality service uses the checklist service to check the status of the research
object at different moments in time, and consequently it can analyze research objects
available on the local file system or in any repository implementing RO API, such as RO
Digital Library and RO-enabled myExperiment. Additionally, this service generates
notifications about changes in the research object quality that are collected by RO
Digital Library and merged with other notifications about the research object.
Additional Information D4.2v2 [8]
Notes The quality service includes a decay-monitoring component, a web application where
users can explore how the quality scores of their research object evolve over
time. The decay-monitoring component is available at
http://sandbox.wf4ever-project.org/decayMonitoring/visual.html and takes a parameter “id” whose value is the
URI of the research object to be analyzed, e.g.,
http://sandbox.wf4ever-project.org/decayMonitoring/visual.html?id=http://sandbox.wf4ever-project.org/rodl/ROs/myExpRO_1167/.
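As a minimal sketch of how such a link could be assembled programmatically, the snippet below builds the decay-monitoring URL from an RO URI. The function name `decay_monitoring_link` is hypothetical; only the base address and the `id` parameter come from the Notes above.

```python
from urllib.parse import urlencode

# Base address of the decay-monitoring web application, as given in the Notes.
DECAY_MONITORING_URL = (
    "http://sandbox.wf4ever-project.org/decayMonitoring/visual.html"
)


def decay_monitoring_link(ro_uri: str) -> str:
    """Return the decay-monitoring URL for the research object at `ro_uri`.

    safe=":/" leaves the colon and slashes of the RO URI unescaped, so the
    result has the same shape as the example in the text; other reserved
    characters are still percent-encoded.
    """
    return DECAY_MONITORING_URL + "?" + urlencode({"id": ro_uri}, safe=":/")


link = decay_monitoring_link(
    "http://sandbox.wf4ever-project.org/rodl/ROs/myExpRO_1167/"
)
print(link)
```

For the sample research object this reproduces the example URL given in the Notes.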
3.2.3 Recommender Service
Service Name Recommender Service
Description Service that provides recommendations of users, Research Objects and their aggregated
resources (e.g., scientific workflows and datasets), using three different algorithms:
keyword content-based, collaborative filtering and social networks. The recommendations
are calculated for a particular user, sorted by the strength of the recommendation, and
optionally filtered to include only particular types of resources, e.g., users,
workflows, files and packs.
URI http://sandbox.wf4ever-project.org/epnoiServer/rest/recommender/
Source Code Repository https://github.com/wf4ever/epnoi
APIs Implemented
• Recommendation
Services Used
The recommender service uses the former myExperiment (http://www.myexperiment.org/) as its
source of information, because it requires a considerable amount of available resources to
create recommendations, but it can also use any repository implementing the RO API as a
source, such as RO Digital Library and RO-enabled myExperiment.
Additional Information D3.2v2 [7]
3.2.4 Workflow Abstraction Service
Service Name Workflow Abstraction Service
Description This service enables scientists to search for workflows by functionality,
properties, or other conceptualizations, making them easier to find. For this
task, it provides functionalities for indexing, searching and recommending workflow
processes. In particular, the two main operations are:
• Search: returns URIs of workflows that contain the sequence of executed
processes that has been introduced as input parameter.
• Recommend: returns the most frequently used processes after the sequence of
executed processes that has been introduced as input parameter. The output
returns the id of the process, its probability of usage and its frequency.
URI Search: http://sandbox.wf4ever-project.org/wfabstraction/rest/search
Recommend: http://sandbox.wf4ever-project.org/wfabstraction/rest/recommend
Source Code Repository https://github.com/wf4ever/WfAbsServices
APIs Implemented
• Workflow Abstraction
Services Used
The workflow abstraction service uses a Provenance Corpus to build its index.
The corpus is accessed through a SPARQL endpoint that exposes provenance using the
wfprov vocabulary [1]. At the time of writing, the service is in the process of
connecting to the SPARQL endpoints of Wf4Ever services exposing provenance information,
e.g., RO Digital Library and RO-enabled myExperiment.
Additional Information D2.2v2 [6]
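The Search and Recommend operations described for the Workflow Abstraction Service can be illustrated with a toy, in-memory sketch. This is not the service's implementation (which indexes a provenance corpus behind a SPARQL endpoint); the corpus, the workflow URIs, the process names and the function names below are all hypothetical, and "probability" is assumed here to mean relative frequency among the observed followers.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Sequence


class Recommendation(NamedTuple):
    process_id: str
    probability: float  # share of matches followed by this process
    frequency: int      # number of times this process followed the sequence


def contains(run: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if `sequence` occurs as a contiguous sub-sequence of `run`."""
    n = len(sequence)
    return any(
        list(run[i:i + n]) == list(sequence) for i in range(len(run) - n + 1)
    )


def search(runs: Dict[str, Sequence[str]], sequence: Sequence[str]) -> List[str]:
    """Return the URIs of workflows whose executed processes contain `sequence`."""
    return [uri for uri, run in runs.items() if contains(run, sequence)]


def recommend(
    runs: Dict[str, Sequence[str]], sequence: Sequence[str]
) -> List[Recommendation]:
    """Rank the processes that most often come right after `sequence`."""
    n = len(sequence)
    followers: Counter = Counter()
    for run in runs.values():
        for i in range(len(run) - n):
            if list(run[i:i + n]) == list(sequence):
                followers[run[i + n]] += 1
    total = sum(followers.values())
    return [
        Recommendation(process, count / total, count)
        for process, count in followers.most_common()
    ]


# Hypothetical corpus: workflow URI -> sequence of executed processes.
corpus = {
    "http://example.org/workflows/1": ["fetch", "align", "plot"],
    "http://example.org/workflows/2": ["fetch", "align", "filter"],
    "http://example.org/workflows/3": ["fetch", "align", "plot", "export"],
}

print(search(corpus, ["align", "plot"]))
for rec in recommend(corpus, ["fetch", "align"]):
    print(rec.process_id, round(rec.probability, 2), rec.frequency)
```

In this corpus, "plot" follows the sequence "fetch, align" twice and "filter" once, so "plot" is ranked first.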
3.2.5 Workflow-Research Object Transformation (WF-RO) Service
Service Name Workflow-Research Object Transformation (WF-RO) Service
Description Service for converting workflows into research objects. The key features are:
• Transform workflows into Research Objects, creating a new one or updating an
existing one.
• Generate a workflow description using the wfdesc vocabulary [1]
• Generate the workflow history using the roevo vocabulary [7]
URI http://sandbox.wf4ever-project.org/wf-ro/
Source Code Repository https://github.com/wf4ever/wf-ro
APIs Implemented
• Workflow-RO Transformation
Services Used
The WF-RO service takes as input a workflow (e.g., in former myExperiment, RO Digital
Library or RO-enabled myExperiment) and stores the resulting research object in a
repository implementing RO API, such as RO Digital Library or RO-enabled myExperiment.
Additional Information http://wf4ever-project.org/wiki/display/docs/Wf-RO+transformation+service
3.3 Access & Usage Clients
3.3.1 RO-enabled myExperiment (web interface)
Client Name RO-enabled myExperiment (web interface)
Description A collaborative environment where scientists can publish their workflows and in silico
experiments (organized as research objects), share them with groups and find those of
others.
URI http://alpha.myexperiment.org/ and http://alpha2.myexperiment.org/
Source Code Repository http://rubyforge.org/projects/myexperiment/
Interface Web User Interface (WUI), a type of GUI.
Services Used
RO-enabled myExperiment uses or is in the process of using all Wf4Ever services,
except for the lifecycle services of RO Digital Library. At the moment of writing,
Workflow Abstraction and Workflow Runner are not being used yet. RO-enabled
myExperiment stores research objects internally and in RO Digital Library. It also
communicates with other clients, e.g., RO Portal and Collaboration Spheres.
Additional Information D2.2v2 [6]
3.3.2 RO Portal
Client Name RO Portal
Description Web client application that enables access and use of research objects and aggregated
resources. The key features are:
• Explore RODL
• Manage & visualize RO structure and annotations
• Search ROs
• Query RODL
• Import & Transform resources from myExperiment
• Visualize recommendations
URI http://sandbox.wf4ever-project.org/portal/
Source Code Repository https://github.com/wf4ever/portal
Interface Web User Interface (WUI), a type of GUI.
Services Used
The RO portal uses most of the Wf4Ever services, except for the workflow-specific
ones, i.e., Workflow Abstraction and Workflow Runner. The RO Portal has been developed
alongside the RO Digital Library.
Additional Information D2.2v2 [6] and D1.2v3 [4] (user guide)
3.3.3 RO Manager Tool
Client Name RO Manager Tool
Description A command line tool that enables access and use of research objects and aggregated
resources stored on the local file system, and the synchronization with remote
repositories, e.g., RODL. The key features are:
• Create ROs in the local file system
• Add references to aggregated resources in the manifest
• Add annotations
• Read & Write ROs to RODL
URI
Source Code Repository https://github.com/wf4ever/ro-manager
Interface Command Line Interface (CLI)
Services Used
RO Manager Tool mainly uses resources on the local file system. However, it can also
communicate with a repository implementing RO API (push, pull). The checklist
evaluation service shares much of its code base with RO Manager, and is packaged with RO
Manager.
Additional Information D2.2v2 [6] and D1.2v2 [3] (user guide)
Documentation: http://wf4ever.github.com/ro-manager/doc/RO-manager.html.
3.3.4 Collaboration Spheres
Client Name Collaboration Spheres
Description A graphical user interface intended to improve the sharing and reuse of ROs based on the
exploitation of semantic descriptions, relations and similarities between ROs and users
in order to support advanced search mechanisms. The search activity has a very strong
social analysis aspect and is based on collaborative filtering and versatile keyword
content-based recommendations. It implements a visual metaphor based on spheres, which
uses concentric circles, where the similarity is represented using the distance from the
center of the sphere to the place where the object is shown. The key features are:
• Creation of a four-layer representation for social and recommendation data,
where circles and squares identify items (ROs and users respectively). The
different layers are presented by using different colors.
• Provision of information related to the selected item from the four-layer
representation (e. g. workflow representation).
• Provision of a graph representation for the relations between users and ROs.
• Provision of ROs and users ordered by their importance for the current active
item.
URI http://sandbox.wf4ever-project.org/CollaborationSpheres/circles.html
Source Code Repository https://github.com/wf4ever/Collaboration-spheres
Interface Graphical User Interface (GUI)
Services Used
The collaboration spheres client uses the recommender service and consequently it uses the
former myExperiment as a source of information, but it can also use any repository
implementing the RO API, such as RO Digital Library and RO-enabled myExperiment.
Additional Information D3.2v2 [7] and D1.2v2 [3] (user guide)
Documentation: http://www.wf4ever-project.org/wiki/display/docs/Collaboration+Spheres
Notes It uses a myExperiment user id as a parameter, e.g.,
http://sandbox.wf4ever-project.org/CollaborationSpheres/circles.html?id=http://www.myexperiment.org/users/18
4 Conclusions
In this document we have presented the Wf4Ever Toolkit, starting with a motivation of the developed
components. In particular, we have introduced the 12 components comprising the toolkit: 8 services and 4
client applications, including a brief description of their main features, the set of APIs implemented (for
services), the interactions with other components, the location of source code and live instance, and the
reference to the corresponding M32 deliverable where their internal structure and
interaction flow are described in detail. Note that in some cases, there is a one-to-one mapping between APIs and
services (e.g., checklist service); there are other cases where a service implements several APIs (e.g.,
Research Object Digital Library); and there could be cases where an API is implemented by several services
(e.g., RO API). That shows the importance of defining open APIs that can be implemented by any service
within or outside Wf4Ever.
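The relationship between services and APIs summarized above can be made concrete with a small sketch. The mapping below transcribes the "APIs Implemented" rows of the service information cards in Section 3 (the RO API entry for RO-enabled myExperiment is marked there as partial); the variable and function names are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List

# Transcribed from the "APIs Implemented" rows of the service cards.
APIS_BY_SERVICE: Dict[str, List[str]] = {
    "RO Digital Library": [
        "Research Object - RO",
        "Research Object Evolution - RO EVO",
        "Notification",
        "User Management",
        "SPARQL Endpoint",
    ],
    "RO-enabled myExperiment": [
        "myExperiment",
        "Research Object - RO",  # partially implemented, per the card
        "SPARQL Endpoint",
    ],
    "Workflow Runner": ["Workflow Runner"],
    "Checklist Evaluation Service": ["Checklist"],
    "Quality Evaluation Service": ["Reliability", "Notification"],
    "Recommender Service": ["Recommendation"],
    "Workflow Abstraction Service": ["Workflow Abstraction"],
    "WF-RO Service": ["Workflow-RO Transformation"],
}


def services_by_api(apis_by_service: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the mapping: for each API, list the services implementing it."""
    index = defaultdict(list)
    for service, apis in apis_by_service.items():
        for api in apis:
            index[api].append(service)
    return dict(index)


index = services_by_api(APIS_BY_SERVICE)

# APIs implemented by more than one service.
shared = {api: services for api, services in index.items() if len(services) > 1}
for api, services in shared.items():
    print(api, "->", ", ".join(services))
```

Inverting the mapping shows all three cases: the Checklist API maps to a single service, RO Digital Library implements several APIs, and the RO, SPARQL Endpoint and Notification APIs are each implemented by two services.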
The Wf4Ever Toolkit was developed alongside the architecture using a co-evolutionary approach, working
closely with the use case partners and drawing on the models defined by the technical Work Packages. We
decided to follow an agile development approach with early prototyping, which allowed us to test our models
and other outcomes as early as possible during the project lifetime, receive timely feedback from our users
and identify and resolve any technological issues. For instance, users were able to interact and assess early
implementations of Research Objects since our first prototype in month 6 of the project lifetime (see D1.2v1).
The development of the toolkit during the second phase concentrated on enhancing the support for
workflow preservation and conservation (e.g., Research Object Digital Library was extended
with a preservation component, as described in D2.2v2), the realization of RO-enabled myExperiment (also
described in D2.2v2), testing and improving components and providing missing functionalities (e.g.,
notifications). Note that although the set of components comprising the toolkit is final, they are still under
development and may change during the remaining lifetime of the project in order to improve their
implementation or add some additional features. For instance, the realization of RO-enabled myExperiment
is still under intensive development, the RO portal is being redesigned and the RO digital library access
control features are still under development. Hence, this document shows their state at month 32 of the
project.
This document is intended to serve as a guide to the toolkit with links to additional documentation rather than
as an exhaustive description of each component. In particular, it serves as an index to other M32
deliverables from the technical work packages (D2.2v2, D3.2v2 and D4.2v2), where each of the toolkit components is
described in detail.
References
[1] Wf4Ever Research Object Model. http://wf4ever.github.io/ro/
[2] D1.3v2 Wf4Ever Architecture – Phase II. OXF. March 2013.
[3] D1.2v2 Wf4Ever Sandbox – Phase II. PSNC. July 2012.
[4] D1.2v3 Wf4Ever Sandbox – Phase III. PSNC. July 2013.
[5] Minimum Information Model Vocabulary Specification. http://sierra-nevada.cs.man.ac.uk/mim/ns
[6] D2.2v2 Design, implementation and deployment of workflow lifecycle management components – Phase
II. UNIMAN. July 2013.
[7] D3.2v2 Design, implementation and deployment of Workflow Evolution, Sharing and Collaboration
components – Phase II. UPM. July 2013.
[8] D4.2v2 Design, implementation and deployment of Workflow Integrity and Authenticity Maintenance
components – Phase II. ISOCO. July 2013.