Expressing Workflows Using Grid Enabled Computer Algebra Systems Palaiseau, France, January 19-21, 2009 Alexandru CÂRSTEA Georgiana MACARIU Marc FRÎNCU

Introduction

Existing Computer Algebra Systems (CAS) and packages: general purpose, special purpose, packages

Types of possible interactions: CAS to CAS interaction, CAS to Web/Grid service interaction, external components to CAS interaction

The best solution to expose CAS functionality is through Web/Grid services.

Advantages of using Web services: cross-platform support, a standard mechanism for advertising the interface, a standard mechanism to describe data types, compatibility with firewall policies

Advantages introduced by WSRF Grid services (in addition to those of Web services): standard mechanisms to describe resources, standard technologies to access functionality, built-in security features

Usage Sketch

Compound computation requiring functionality from several CASs

The client might sometimes require a specific CAS to solve a specific part of the problem

Memory/computational power details may be required

Calls are asynchronous, as they may be issued from a portable device (laptop, PDA, etc.)

Provenance may be required later

[Figure: example workflow graph whose nodes are (CAS, call) pairs: (GAP, call1), (KANT, call2), (MuPAD, call3), (Maple, call4), (GAP, call5), (GAP, call6), (GAP, call7)]

Standard Interface - CAS Servers

The standard interface of CAS Servers offers:

an operation that receives the computation call; calls are formulated as OpenMath objects

configurable callback functionality, enabled by specifying the callback address

informational services

management services

Management capabilities:

Decide where to advertise the exposed functionality

Choose the functionality that must be available

Enable provenance

[Figure: two diagrams. Labels in the first: CLIENT, WEB SERVICE, CAS WRAPPER, SERVER, CAS. Labels in the second: CLIENT, WEB/GRID SERVICE, SCSCP, SERVER, CAS.]

Models for CAS Integration

By implementing SCSCP

Communication between the Web service wrapper and the CAS is achieved through TCP/IP calls

The semantic meaning of formulae is provided by OpenMath, through common OpenMath Content Dictionaries (CDs)

By plain string messages exchanged using various technologies (where available)

Communication between the Web service wrapper and the CAS is achieved through files, data pipes or TCP/IP

No support for semantic meaning: the messages are meaningful only in the context of the targeted CAS

Note. Both types of messages are sent to the Web service wrapper as strings encapsulated in SOAP messages.

Example of Formulated Calls

Method Call

<OMOBJ><OMATTR><OMATP><OMS cd="scscp1" name="call_ID" /><OMSTR>alexk_9055</OMSTR></OMATP><OMA><OMS cd="scscp1" name="procedure_call"/><OMA><OMS cd="SCSCP_transient_1" name="WS_factorial" /><OMI>3</OMI></OMA></OMA></OMATTR></OMOBJ>

SCSCP Call

<OMOBJ><OMATTR><OMATP><OMS cd="scscp1" name="call_ID" /><OMSTR>alexk_9055</OMSTR></OMATP><OMA><OMS cd="scscp1" name="procedure_call"/><OMA><OMS cd="SCSCP_transient_1" name="WS_factorial" /><OMI>3</OMI></OMA></OMA></OMATTR></OMOBJ>
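The call above can also be assembled programmatically. The following is a minimal sketch in Python using only the standard library; the function name build_procedure_call and its parameters are illustrative assumptions and are not part of SCSCP or of the system described in these slides.

```python
import xml.etree.ElementTree as ET


def build_procedure_call(call_id: str, procedure: str, argument: int,
                         cd: str = "SCSCP_transient_1") -> str:
    """Return an SCSCP procedure call as an OpenMath XML string.

    Hypothetical helper: it only reproduces the element layout shown on the
    slide and supports a single integer argument.
    """
    omobj = ET.Element("OMOBJ")
    omattr = ET.SubElement(omobj, "OMATTR")

    # Attribution pair: the call_ID symbol followed by its string value.
    omatp = ET.SubElement(omattr, "OMATP")
    ET.SubElement(omatp, "OMS", {"cd": "scscp1", "name": "call_ID"})
    ET.SubElement(omatp, "OMSTR").text = call_id

    # Outer application: scscp1.procedure_call applied to the actual call.
    outer = ET.SubElement(omattr, "OMA")
    ET.SubElement(outer, "OMS", {"cd": "scscp1", "name": "procedure_call"})

    # Inner application: the exposed procedure applied to its argument.
    inner = ET.SubElement(outer, "OMA")
    ET.SubElement(inner, "OMS", {"cd": cd, "name": procedure})
    ET.SubElement(inner, "OMI").text = str(argument)

    return ET.tostring(omobj, encoding="unicode")


message = build_procedure_call("alexk_9055", "WS_factorial", 3)
```

The resulting string is what would be placed, as plain text, inside the SOAP message sent to the Web service wrapper.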

Composition Functionality (I)

Allows composing the functionality of CASs installed on different machines and of heterogeneous types

Offers support for compound computations

Collaboration demands tools that orchestrate the computation steps used in scientific discovery

Support for reproducibility by storing meta information about the execution of the workflow

Easy to use

Start/Stop/Resume + steering

Inspect status and values obtained on the fly

Composition Functionality (II)

Web service interaction/composition patterns:

sequence pattern

parallel split pattern

multiple instances without synchronization

conditional patterns: exclusive choice pattern, multi-choice pattern, deferred choice pattern

conversational patterns: request/reply pattern, one-way invocation
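To make some of the patterns listed above concrete, here is a minimal Python sketch of the sequence, parallel split, exclusive choice and multi-choice patterns. Each step is modelled as a plain Python callable standing in for a service invocation; the function names are illustrative assumptions and do not correspond to an API of the system.

```python
from concurrent.futures import ThreadPoolExecutor


def sequence(value, *steps):
    """Sequence pattern: each step consumes the result of the previous one."""
    for step in steps:
        value = step(value)
    return value


def parallel_split(value, *branches):
    """Parallel split: all branches start from the same input concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(branch, value) for branch in branches]
        # Results are collected in branch order, not completion order.
        return [future.result() for future in futures]


def exclusive_choice(value, cases, default):
    """Exclusive choice: run only the first branch whose condition holds."""
    for condition, branch in cases:
        if condition(value):
            return branch(value)
    return default(value)


def multi_choice(value, cases):
    """Multi-choice: run every branch whose condition holds."""
    return [branch(value) for condition, branch in cases if condition(value)]


seq_result = sequence(3, lambda x: x + 1, lambda x: x * 2)
par_result = parallel_split(4, lambda x: x + 1, lambda x: x * 2)
xor_result = exclusive_choice(
    7,
    [(lambda x: x < 5, lambda x: "small"), (lambda x: x < 10, lambda x: "medium")],
    lambda x: "large",
)
multi_result = multi_choice(
    7,
    [(lambda x: x > 5, lambda x: "gt5"),
     (lambda x: x % 2 == 1, lambda x: "odd"),
     (lambda x: x > 10, lambda x: "gt10")],
)
```

In a real deployment the callables would be remote, asynchronous invocations, so the thread pool here only imitates the control flow.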

General Overview of the Architecture

CAGS: Computer Algebra to Grid Services

AGSSO: Architecture for Grid Symbolic Services Orchestration

[Figure: architecture diagram. Components shown: several clients (each a CAS with a CAGS/AGSSO Client); the AGSSO platform (Client Manager, Main Registry, Process Manager); several CAS Servers (each with a Local Registry and a CAS); Grid Service; Web Service.]

Behind the Scene

SA Platform:

manages the scheduling of the tasks that are ready to be scheduled

notifies the Client Manager (Workflow Manager) that the call can be started

A computational element may hide a single machine or a cluster hierarchy (SymGridPar)

[Figure: message flow between the Client Manager, the SA Platform and the Main Registry]

1. The task is ready to be scheduled

2. Notify the SA Platform

3. After a suitable server is found, fill in the details

4. Notify the Client Manager that the task was assigned to a server
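The four steps above can be sketched as a small Python model. All class names, method names and server addresses below are hypothetical, and the policy of choosing the first matching server is a placeholder assumption; the slides do not describe how the SA Platform actually selects a server.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Task:
    task_id: str
    cas: str                      # CAS required by the call, e.g. "GAP"
    server: Optional[str] = None  # filled in once a server is chosen


class MainRegistry:
    """Hypothetical registry mapping server addresses to the CASs they expose."""

    def __init__(self, servers: Dict[str, Set[str]]) -> None:
        self.servers = servers

    def find_servers(self, cas: str) -> List[str]:
        return sorted(url for url, systems in self.servers.items() if cas in systems)


class ClientManager:
    def __init__(self) -> None:
        self.assigned: List[Tuple[str, Optional[str]]] = []

    def task_assigned(self, task: Task) -> None:
        # Step 4: the call can now be started on task.server.
        self.assigned.append((task.task_id, task.server))


class SAPlatform:
    def __init__(self, registry: MainRegistry, client_manager: ClientManager) -> None:
        self.registry = registry
        self.client_manager = client_manager

    def notify_ready(self, task: Task) -> None:
        # Steps 1-2: a task that is ready to be scheduled is reported here.
        candidates = self.registry.find_servers(task.cas)
        if not candidates:
            raise LookupError(f"no server exposes {task.cas}")
        # Step 3: fill in the details (placeholder policy: first candidate).
        task.server = candidates[0]
        self.client_manager.task_assigned(task)


registry = MainRegistry({
    "http://node-b.example/cas": {"GAP", "KANT"},
    "http://node-a.example/cas": {"GAP"},
})
manager = ClientManager()
platform = SAPlatform(registry, manager)
task = Task("call1", "GAP")
platform.notify_ready(task)
```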

Workflow Description at Client Side

Subset of the BPEL language

Sequence: <sequence>…</sequence>

Parallel: <parallel>…</parallel>

Multi-choice: <multichoice>…</multichoice>

If-Else: <if>…<else>…</if>

Foreach: <foreach>…</foreach>

While: <while>…</while>

Variable declaration: <newvariable>

Invoke: <invoke invokeID="…">…</invoke>

Higher level constructs implemented directly through libraries
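As an illustration of how a client-side library could emit such a description, the sketch below builds a workflow document from nested Python calls. The helper names are hypothetical, and for brevity the call body is stored as a plain string, whereas the XML descriptions in these slides embed a full OpenMath object inside <call>.

```python
import xml.etree.ElementTree as ET


def _container(tag, *children):
    """Create a structural element (sequence, parallel, ...) with its children."""
    element = ET.Element(tag)
    element.extend(children)
    return element


def sequence(*children):
    return _container("sequence", *children)


def parallel(*children):
    return _container("parallel", *children)


def invoke(invoke_id, cas, call_text):
    """Create an <invoke> element; call_text is a simplified placeholder."""
    element = ET.Element("invoke", {"invokeID": invoke_id})
    ET.SubElement(element, "casid").text = cas
    ET.SubElement(element, "call").text = call_text
    return element


def workflow(body):
    """Wrap the body in a <workflow> root and serialize it to a string."""
    root = ET.Element("workflow", {"xmlns": "http://ieat.ro"})
    root.append(body)
    return ET.tostring(root, encoding="unicode")


description = workflow(
    sequence(
        parallel(
            invoke("invoke_0", "KANT", "Bernoulli(1000)"),
            invoke("invoke_1", "KANT", "Bernoulli(2000)"),
        ),
        invoke("invoke_2", "GAP", "gcd($invoke_0,$invoke_1)"),
    )
)
```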

Example - A Rhomb Workflow Based on a GAP Package

startWorkflow();
  startSequence();
    startParallel();
      v1:=invoke("KANT",Bernoulli(1000));
      v2:=invoke("KANT",Bernoulli(2000));
    endParallel();
    invoke("GAP",gcd(v1,v2));
  endSequence();
endWorkflow();

[Figure: rhomb-shaped workflow graph. START, then invoke("KANT",Bernoulli(1000)) and invoke("KANT",Bernoulli(2000)) in parallel, then invoke("GAP",gcd(v1,v2)).]
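The control flow of the rhomb workflow (two parallel branches joined by a final call) can be imitated locally in Python. This is only a sketch: the invoke function and the LOCAL_CAS table are hypothetical stand-ins for remote CAS services, and factorial replaces the Bernoulli computation so that the example stays self-contained and the final gcd is taken over integers.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote CAS services: plain Python functions.
LOCAL_CAS = {
    "KANT": {"factorial": math.factorial},
    "GAP": {"gcd": math.gcd},
}


def invoke(cas, procedure, *args):
    """Pretend to call a procedure exposed by the named CAS."""
    return LOCAL_CAS[cas][procedure](*args)


def rhomb_workflow(a, b):
    # Parallel part: both branches are submitted before either result is read.
    with ThreadPoolExecutor(max_workers=2) as pool:
        branch1 = pool.submit(invoke, "KANT", "factorial", a)
        branch2 = pool.submit(invoke, "KANT", "factorial", b)
        v1, v2 = branch1.result(), branch2.result()
    # Join: the last call depends on the results of both branches.
    return invoke("GAP", "gcd", v1, v2)


result = rhomb_workflow(10, 12)
```

Because 10! divides 12!, the result here equals 10!.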

Symbolic Computation Problem – Ring Workflow

Workflow arising from the orbit enumeration algorithm:

a job server, sending procedure calls to the appropriate image service

an image service for computing the image of a point (there may be more than one, each sending procedure calls to the appropriate orbit service)

an orbit service for storing the orbit (it may use hash tables; there may be more than one, each maintaining part of the table and sending procedure calls, if necessary, to the job server)
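The three roles above can be illustrated with a single-process Python sketch of orbit enumeration for permutations acting on points 0..n-1. The names are illustrative assumptions; in the distributed setting each role would be a separate service, and the orbit table could be split across several orbit services.

```python
from collections import deque
from typing import Deque, Sequence, Set, Tuple

Permutation = Tuple[int, ...]  # perm[i] is the image of point i


def image_service(point: int, generator: Permutation) -> int:
    """Image service: compute the image of a point under one generator."""
    return generator[point]


class OrbitService:
    """Orbit service: stores the orbit in a hash-based set."""

    def __init__(self) -> None:
        self.seen: Set[int] = set()

    def store(self, point: int) -> bool:
        """Store a point; return True only if it was not known before."""
        if point in self.seen:
            return False
        self.seen.add(point)
        return True


def enumerate_orbit(seed: int, generators: Sequence[Permutation]) -> Set[int]:
    """Job server loop: hand out points until no new images appear."""
    orbit = OrbitService()
    jobs: Deque[int] = deque()
    if orbit.store(seed):
        jobs.append(seed)
    while jobs:
        point = jobs.popleft()
        for generator in generators:
            image = image_service(point, generator)
            # Only genuinely new points are sent back to the job queue.
            if orbit.store(image):
                jobs.append(image)
    return orbit.seen


# Generators on the points 0..5: the 3-cycle (0 1 2) and the transposition (3 4).
cycle_012 = (1, 2, 0, 3, 4, 5)
swap_34 = (0, 1, 2, 4, 3, 5)
orbit_of_0 = enumerate_orbit(0, [cycle_012, swap_34])
```

The flow of messages (job server to image service, image service to orbit service, orbit service back to the job server) is what gives the workflow its ring shape.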

Example – Arbitrary Number of Cycles Based on a GAP Package

LoadPackage("SWIP");
SWIP_startWorkflow();
SWIP_declareVariable(n,"0");
SWIP_startWhile("$n<10");
  SWIP_startSequence();
    aVar1:=SWIP_invoke("GAP", "Int($n+1)", "$n");
    SWIP_startMultiChoice();
      SWIP_startChoiceBranch("$n<10");
        SWIP_invoke("GAP", "Int($n+1)", "$n");
      SWIP_endChoiceBranch();
    SWIP_endMultiChoice();
  SWIP_endSequence();
SWIP_endWhile();
SWIP_endWorkflow();

[Figure: workflow graph with node types CHANGE VALUE, MULTIPLE CHOICE, WHILE and INVOKE EXTERNAL SERVICE]

Installation Requirements and Issues

CAS Server: Globus 4.2.0; PostgreSQL 8.0+; run the script that creates and populates the database; deploy the .gar file to Globus; start the container

AGSSO Platform: ActiveBPEL 4.1 (workflow engine); PostgreSQL 8.0+; Tomcat; run the script that creates and populates the database; configure the service; deploy the .war archive

Client: install the Java package; set up a property file

Cancel/Pause/Resume

It is supported with limitations:

The user cannot always make a successful call

No support for check-pointing; the task is simply restarted

Computation steering is only partially available

Better support at CAS level may/should provide:

Threaded server that is able to handle interrupts

Check-pointing and resume

An XML Description of a Sequence with Data Dependency

<workflow xmlns="http://ieat.ro">
  <sequence>
    <invoke invokeID="invoke_0">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9055</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>3</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
    <invoke invokeID="invoke_1">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9056</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMSTR>$invoke_0</OMSTR>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
  </sequence>
</workflow>
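The data dependency in the sequence above is expressed by the `$invoke_0` reference in the second call. The Python sketch below shows one way such references could be resolved while walking the sequence. It is an assumption about the mechanism, not the engine's actual implementation: invoke_xml, run_sequence and the LOCAL_PROCEDURES table are hypothetical, and WS_factorial is evaluated locally instead of being sent to a GAP server.

```python
import math
import xml.etree.ElementTree as ET

NS = {"w": "http://ieat.ro"}

# Hypothetical local stand-in for the procedures exposed by the CAS server.
LOCAL_PROCEDURES = {"WS_factorial": math.factorial}


def invoke_xml(invoke_id, call_id, argument_xml):
    """Build one <invoke> element with the layout used on the slide."""
    return (
        f'<invoke invokeID="{invoke_id}"><casid>GAP</casid><call><OMOBJ><OMATTR>'
        f'<OMATP><OMS cd="scscp1" name="call_ID" /><OMSTR>{call_id}</OMSTR></OMATP>'
        '<OMA><OMS cd="scscp1" name="procedure_call" />'
        '<OMA><OMS cd="SCSCP_transient_1" name="WS_factorial" />'
        f'{argument_xml}</OMA></OMA></OMATTR></OMOBJ></call></invoke>'
    )


WORKFLOW = (
    '<workflow xmlns="http://ieat.ro"><sequence>'
    + invoke_xml("invoke_0", "ieat_9055", "<OMI>3</OMI>")
    + invoke_xml("invoke_1", "ieat_9056", "<OMSTR>$invoke_0</OMSTR>")
    + "</sequence></workflow>"
)


def run_sequence(xml_text):
    """Execute the invokes in document order, resolving $invokeID references."""
    root = ET.fromstring(xml_text)
    results = {}
    for invoke in root.findall("w:sequence/w:invoke", NS):
        # The inner OMA holds the procedure symbol followed by its argument.
        call = invoke.find("w:call/w:OMOBJ/w:OMATTR/w:OMA/w:OMA", NS)
        symbol, argument = list(call)
        text = argument.text
        if text.startswith("$"):
            value = results[text[1:]]  # result of an earlier invoke
        else:
            value = int(text)
        procedure = LOCAL_PROCEDURES[symbol.get("name")]
        results[invoke.get("invokeID")] = procedure(value)
    return results


results = run_sequence(WORKFLOW)
```

Note that the default namespace declared on <workflow> also applies to the nested OpenMath elements, which is why every path step carries the namespace prefix.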

An XML Description of a Parallel Execution

<workflow xmlns="http://ieat.ro">
  <parallel>
    <invoke invokeID="invoke_0">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9055</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>3</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
    <invoke invokeID="invoke_1">
      <casid>GAP</casid>
      <call>
        <OMOBJ>
          <OMATTR>
            <OMATP>
              <OMS cd="scscp1" name="call_ID" />
              <OMSTR>ieat_9056</OMSTR>
            </OMATP>
            <OMA>
              <OMS cd="scscp1" name="procedure_call" />
              <OMA>
                <OMS cd="SCSCP_transient_1" name="WS_factorial" />
                <OMI>6</OMI>
              </OMA>
            </OMA>
          </OMATTR>
        </OMOBJ>
      </call>
    </invoke>
  </parallel>
</workflow>

CAS Server Setup

In order to expose and advertise the functionality of a CAS server, the administrator must:

Describe the computational capabilities of the computational node

Add to the Local Registry the names and details of any methods/OM symbols that are going to be exposed

Specify which CAS (GAP, Maple, etc.) supports the functionality

Add to the Local Registry details about the Main Registries that the current Local Registry will advertise in

For every method/symbol that should be exposed, choose the Main Registries to advertise it in
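The information the administrator has to provide can be pictured as a small data model. The Python sketch below is purely illustrative: the class and field names, the example addresses and the node description keys are assumptions, and the actual Local Registry is backed by a database whose schema is not shown in these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ExposedSymbol:
    cd: str                     # OpenMath Content Dictionary name
    name: str                   # OM symbol / method name
    cas: str                    # CAS that supports it, e.g. "GAP"
    main_registries: List[str]  # where this symbol is advertised


@dataclass
class LocalRegistry:
    node_description: Dict[str, str]  # capabilities of the computational node
    main_registries: Set[str] = field(default_factory=set)
    symbols: List[ExposedSymbol] = field(default_factory=list)

    def add_main_registry(self, url: str) -> None:
        self.main_registries.add(url)

    def expose(self, cd: str, name: str, cas: str, advertise_in: List[str]) -> None:
        # A symbol can only be advertised in registries known to this node.
        unknown = set(advertise_in) - self.main_registries
        if unknown:
            raise ValueError(f"unknown Main Registries: {sorted(unknown)}")
        self.symbols.append(ExposedSymbol(cd, name, cas, list(advertise_in)))

    def advertisements(self) -> Dict[str, List[str]]:
        """Group the exposed symbols by the Main Registry they are sent to."""
        result: Dict[str, List[str]] = {url: [] for url in sorted(self.main_registries)}
        for symbol in self.symbols:
            for url in symbol.main_registries:
                result[url].append(f"{symbol.cas}:{symbol.cd}.{symbol.name}")
        return result


local = LocalRegistry({"cpu_cores": "8", "memory": "16GB"})
local.add_main_registry("http://registry-1.example")
local.add_main_registry("http://registry-2.example")
local.expose("SCSCP_transient_1", "WS_factorial", "GAP",
             ["http://registry-1.example", "http://registry-2.example"])
local.expose("SCSCP_transient_1", "WS_gcd", "GAP", ["http://registry-1.example"])
adverts = local.advertisements()
```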

Adding a Computational Node

Registering a CAS to a Machine

Adding an OM Symbol

Step 1: Adding the OpenMath CD

Step 2: Adding the OM Symbol to the CD

Conclusions

Composition of symbolic Grid services is within reach

Some features may require extra support from the CAS

A general solution is needed in order to make sure that interoperability is not just a word in the dictionary

Web/Grid Services

OpenMath representation of semantic data

SCSCP representation of communication