
A Workflow Engine with Multi-Level Parallelism Supports

Qifeng Huang and Yan Huang

School of Computer Science, Cardiff University

2005.9


Agenda

• Background

• SWFL Workflow Architecture

• SWFL Description Language

• SWFL Workflow Engine

• Multi-level Parallelism in SWFL


Background: Service and Service Composition

• A service encapsulates resources and makes them available over the network via standard interfaces and protocols

• Web/grid services are emerging as important paradigms for distributed computing

• Service composition/workflow: complex applications can be created by composing simple services


Background: GSiB

• Current efforts such as BPEL mainly focus on business process

• Demand for scientific workflows is increasing as parallel computing, especially grid computing, expands

• GSiB aims to provide a general workflow framework for both business and scientific applications, especially the latter

• The convergence of grid services and web services makes this feasible


GSiB Workflow Architecture

[Figure: GSiB workflow architecture with three components: VSCE, Service Workflow Language, SWFL Workflow Engine]

• SWFL: an XML-based, graph-oriented service workflow description language

• Engine: Distributed enactment environment with multi-level parallelism support

• VSCE: Visual Service Composition Environment


SWFL: Basic Elements

Types*

FlowModel(name, isParallel, …)

Message* (name, part* …)

Variables* (name, type)

Activity* Definition of all involved activities (normal/native services, assign, if, switch, for, while, do while and catchEnd activities)

FlowModel* (name, isParallel …)

ControlLink* (Source/Port, Target/Port)

DataLink* (Source/Part, Target/Part)


SWFL: Graph-Oriented

• In GSiB, a workflow application can be described either as a validated XML (SWFL) document or as a directed graph

• A node (an activity in SWFL) can be a standard service operation, a compound structure, or an on-machine program

• An edge (data/control link in SWFL) describes the data and control dependencies among involved activities
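
The graph view described above can be sketched as a small data model. This is a minimal illustration, not the GSiB implementation: the class names, field names, and the "ELSE" port are hypothetical and only loosely mirror the SWFL element names (Activity, ControlLink, DataLink, FlowModel).

```python
from dataclasses import dataclass, field

# All names here are hypothetical; they loosely mirror SWFL element names.

@dataclass(frozen=True)
class Activity:
    name: str
    kind: str = "normal"  # e.g. "normal", "native", "if", "for", "while"

@dataclass(frozen=True)
class ControlLink:
    source: str
    target: str
    port: str = ""  # e.g. the branch of a conditional activity

@dataclass(frozen=True)
class DataLink:
    source: str
    target: str

@dataclass
class FlowModel:
    name: str
    is_parallel: bool = False
    activities: dict = field(default_factory=dict)
    control_links: list = field(default_factory=list)
    data_links: list = field(default_factory=list)

    def add_activity(self, activity):
        self.activities[activity.name] = activity

    def predecessors(self, name):
        """Names of activities that `name` depends on via data or control links."""
        links = self.control_links + self.data_links
        return sorted({link.source for link in links if link.target == name})

# Build a small flow: ActivityA feeds an IF node that branches to B or C.
flow = FlowModel(name="sample")
for act in (Activity("ActivityA"), Activity("ifControl", "if"),
            Activity("ActivityB"), Activity("ActivityC")):
    flow.add_activity(act)
flow.data_links.append(DataLink("ActivityA", "ifControl"))
flow.control_links.append(ControlLink("ifControl", "ActivityB", port="IF"))
flow.control_links.append(ControlLink("ifControl", "ActivityC", port="ELSE"))
```

Keeping control links and data links as separate edge lists matches the split between control and data dependencies, while `predecessors` merges them when only "what must finish first" matters.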


SWFL: An Example

[Figure: workflow graph with nodes Data Source, Activity A, IF(a/b), Activity B, Activity C, and Data Sink; one branch of the IF node is labeled a>b]

……
<swfl:flow name="sample" requireParallel="false">
  <wsdl:input message="flowInput"/>
  <wsdl:output message="flowOutput"/>
  <swfl:activity>
    <swfl:if name="ifControl">…</swfl:if>
  </swfl:activity>
  <swfl:activity>
    <swfl:normal name="ActivityA">
      <swfl:performedBy>… </swfl:performedBy>
    </swfl:normal>
  </swfl:activity>
  ……
  <swfl:controlLink>
    <swfl:source name="ifControl" port="IF"/>
    <swfl:target name="task2"/>
  </swfl:controlLink>
  ……
  <swfl:dataLink target="ifControl">
    <swfl:source name="ActivityA">
      <swfl:map>…</swfl:map>
    </swfl:source>
  </swfl:dataLink>
  ……
</swfl:flow>
……
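
A rough sketch of how a document like this could be turned into graph edges, in the spirit of an XML-to-graph step. It uses Python's standard `xml.etree.ElementTree`. The namespace URI is a placeholder (the slide does not show the real one), the elided parts of the example are simply left out, and `xml_to_graph` is a hypothetical helper, not the GSiB XML2Graph tool.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the real SWFL namespace is not shown on the slide.
NS = {"swfl": "urn:example:swfl"}

DOC = """
<swfl:flow xmlns:swfl="urn:example:swfl" name="sample" requireParallel="false">
  <swfl:activity><swfl:if name="ifControl"/></swfl:activity>
  <swfl:activity><swfl:normal name="ActivityA"/></swfl:activity>
  <swfl:controlLink>
    <swfl:source name="ifControl" port="IF"/>
    <swfl:target name="task2"/>
  </swfl:controlLink>
  <swfl:dataLink target="ifControl">
    <swfl:source name="ActivityA"/>
  </swfl:dataLink>
</swfl:flow>
"""

def xml_to_graph(text):
    """Return (activity names, control edges, data edges) from an SWFL-like flow."""
    root = ET.fromstring(text.strip())
    # Each <swfl:activity> wraps one typed element (if, normal, ...) carrying the name.
    activities = [child.get("name")
                  for act in root.findall("swfl:activity", NS)
                  for child in act]
    # Control edge: (source activity, target activity).
    control = [(link.find("swfl:source", NS).get("name"),
                link.find("swfl:target", NS).get("name"))
               for link in root.findall("swfl:controlLink", NS)]
    # Data edge: the target is an attribute of <swfl:dataLink>, sources are children.
    data = [(src.get("name"), link.get("target"))
            for link in root.findall("swfl:dataLink", NS)
            for src in link.findall("swfl:source", NS)]
    return activities, control, data

activities, control_edges, data_edges = xml_to_graph(DOC)
```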


SWFL vs. BPEL

• Both can be used to build workflows which involve peer-to-peer interactions between web services

• BPEL is mainly for business processes while SWFL is mainly for scientific areas

• BPEL uses a script-oriented approach, while SWFL follows a graph-oriented approach


SWFL: Why Graph-Oriented?

• Easy to use, especially with the user-friendly VSCE: similar to flow charts and UML models

• Flexible and dynamic service scheduling and execution
  – Completely decided by the engine
  – Makes full use of dynamic runtime features: different strategies can be used for a flow
  – Straightforward support for multi-level parallelism


VSCE: Make Complicated Things Easy

[Screenshot: VSCE with its Workflow Drawing Pane]


VSCE: What is more…

• A user-friendly, integrated visual tool for building, executing and controlling workflows
  – End users do not need to know much about workflow

• Design (draw) a flow by drag-and-drop

• Configure and initiate the execution

• Retrieve results and track runtime status


A Grid Architecture Based on Workflow Engines (1)

[Figure: a Workflow Engine with its Job Processors and the Services they invoke; the inputs are labeled SWFL and BPEL. BPEL fragment shown on the slide:]

<invoke name="registerAuctionResults"
        partnerLink="auctionRegistrationService"
        portType="as:auctionRegistrationPT"
        operation="process"
        inputVariable="auctionData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</invoke>
<receive name="receiveAuctionRegistrationInformation"
         partnerLink="auctionRegistrationService"
         portType="as:auctionRegistrationAnswerPT"
         operation="answer"
         variable="auctionAnswerData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</receive>


A Grid Architecture Based on Workflow Engines (2)

[Figure: three Workflow Engines, each with its own Job Processors, together with the Services they invoke; the inputs are labeled SWFL and BPEL, with the same BPEL invoke/receive fragment as in part (1)]


GSiB Workflow Processing

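[Figure: processing pipeline with the labels SWFL/MPFL Document, XML2Graph, Graph2XML, Graph2Java, Java Programs, Enactment Environment, and Execution Result; the diagram also carries the numbered markers 1, 2, 3]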


GSiB Instance: Graph Objects

• XML2Graph and Graph2Java tools

• Graph Objects
  – Two kinds: data graphs and control graphs
  – Straightforward format for VSCE
  – Scheduling strategy is decided at runtime


Engine: Architecture

[Figure: engine architecture; labels: Gateway, Job Processor, Storage, Scheduler, UDDI, VSCE, Engine]


Engine: Components

• Gateway: a web service that provides the entry point for submitting jobs and for retrieving results and runtime status; three job formats are accepted

• Job Processor: computing resources composed of a pool of worker threads

• Scheduler: provides a dynamic service execution strategy at runtime

• Storage: provides storage space and an API for objects, results and status information


Engine: Multi-Level Parallelisms

• Service-level

• Flow-Level

• Message-Passing

• By contrast, parallelism in BPEL must be described explicitly in the script


Service-Level Parallelism

• An activity is ready when all its input data are ready and all activities on which it has control dependencies are complete

• Several activities may be ready at the same time; these can be executed in parallel

• Greedy algorithm: execute an activity as soon as it is ready; may waste storage and computing resources; not always optimal

• Question: how to schedule services?
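
The readiness rule and the greedy strategy above can be sketched as follows. This is a simplified illustration under stated assumptions, not the SWFL scheduler: data and control dependencies are merged into one `deps` mapping, activities run in waves (a strictly greedy engine would start each activity the moment it becomes ready rather than waiting for the wave to finish), and the example graph and its edges are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_greedy(deps, run, max_workers=4):
    """Run every ready activity in parallel, wave by wave.

    deps maps each activity name to the names it depends on (data and
    control dependencies merged). Returns the waves in execution order.
    """
    done, waves = set(), []
    remaining = set(deps)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # An activity is ready when all of its dependencies are complete.
            ready = sorted(a for a in remaining if set(deps[a]) <= done)
            if not ready:
                raise ValueError("cyclic or unsatisfiable dependencies")
            list(pool.map(run, ready))  # execute the whole wave in parallel
            done.update(ready)
            remaining.difference_update(ready)
            waves.append(ready)
    return waves

# Hypothetical six-activity flow: A fans out to B and C, which lead to D and E,
# and F joins the two branches.
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"], "E": ["C"], "F": ["D", "E"]}
executed = []
waves = run_greedy(deps, executed.append)
```

The waves make the trade-off visible: B and C (and later D and E) run side by side, but every ready activity is launched regardless of whether its result is needed soon, which is where the wasted storage and computing resources can come from.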


Flow-Level Parallelism: An Example

[Figure: a workflow graph with activities A, B, C, D, E, F is partitioned into two sub-flows, run as Process 1 and Process 2]


Flow-Level Parallelism (2)

• Decentralized orchestration of services: divide a workflow into several sub-flows, to be run by several job processors in parallel

• Two kinds: independent connected graphs; partitioning of a single connected graph

• Gains from parallelism: quick response; high throughput; scalability

• Additional complexities: flow partition; coordination of distributed execution
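
The first kind of flow-level parallelism, independent connected graphs, amounts to finding the weakly connected components of the workflow graph. The sketch below shows only that case; partitioning a single connected graph needs a cut strategy plus coordination between the parts and is not attempted here. The function name and the example flow are hypothetical.

```python
def independent_subflows(activities, links):
    """Group activities into weakly connected components.

    links is a list of (source, target) pairs covering both data and
    control links; edge direction is ignored when grouping.
    """
    neighbours = {a: set() for a in activities}
    for src, dst in links:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen, subflows = set(), []
    for start in activities:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        subflows.append(sorted(component))
    return subflows

# Two unrelated chains in one flow: each can go to a different job processor.
subflows = independent_subflows(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("B", "C"), ("D", "E")],
)
```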


Message-Passing Parallelism: Background and MPFL

• Parallelism in SWFL is suitable for applications with forms of parallelism that can be displayed in a workflow graph

• Most scientific applications exhibit more sophisticated forms of parallelism, such as message passing, which is commonplace in that domain

• MPFL: extends the SWFL flow model to support applications with message-passing


Message-Passing Parallelism: An Example

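[Figure: MPFL model and the resulting workflow application. The model, drawn over activities A, B, C, D, defines Flow Model 1 for process 0 and Flow Model 2 for processes with rank larger than 0. In the application, Process 0 is drawn with A, B, C, D, and Process 1, Process 2, Process 3, up to Process N-1 are each drawn with B and C.]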
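
The rank-based choice of flow model can be sketched in a few lines. The activity lists are one reading of the figure above (process 0 runs A, B, C, D; every other rank runs B and C) and should be treated as an assumption, as should all the names; no actual message passing is modelled.

```python
# Assumed contents of the two flow models, read off the figure.
FLOW_MODEL_1 = ["A", "B", "C", "D"]  # for process 0
FLOW_MODEL_2 = ["B", "C"]            # for processes with rank larger than 0

def flow_model_for(rank):
    """Pick the flow model a process should run, based on its rank."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return FLOW_MODEL_1 if rank == 0 else FLOW_MODEL_2

# Plan for a hypothetical run with N = 4 processes.
plan = {rank: flow_model_for(rank) for rank in range(4)}
```

This mirrors the usual MPI idiom of branching on the process rank, except that the branch selects a whole flow model instead of a code path.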


Message-Passing Parallelism

• Multi-layer heterogeneous communication domains are supported

• An instance usually runs on a cluster, where it can achieve the same kind of parallelism as a standard MPI program

• Engine: built as an extension on top of the SWFL engine; still a work in progress

[Figure: Workflow Engine, Job Processors, and Services, with a Cluster labeled MPFL]


Conclusions

• The workflow framework in GSiB is grid-oriented and suitable for both business and scientific applications composed of web/grid services

• The graph-based SWFL provides considerable flexibility for both end users and engine implementation

• VSCE provides a visual tool for building and executing workflow applications

• The SWFL engine provides an automatic and self-organizing enactment environment for processing workflow applications

• Better performance is achieved with the support of multi-level parallelism in the SWFL engine