A Novel Approach to Workflow Management in Grid Environments

Frank Berretz*, Sascha Skorupa*, Volker Sander*, Adam Belloum**

15/04/2010

* FH Aachen - University of Applied Science, GER
** University of Amsterdam, NL


© FH AACHEN UNIVERSITY OF APPLIED SCIENCES 15. April 2010 | 2

Outline

Taxonomy of Grid Workflow Systems

> Common Architecture

Push-based Job Distribution

> Requirements

> UNICORE Workflow System

> Consequences

Pull-based Approach

> Benefits & Challenges

> General Architecture

Prototype Implementation

> UNICORE Grid Middleware

> jBPM Workflow Engine


Taxonomy of Grid Workflow Systems: Common Architecture

[Figure: common architecture of Grid workflow systems. Build time: Grid users, workflow design & definition, Grid workflow application modeling & definition tools, Grid workflow specification. Run time: workflow execution & control by the Grid workflow enactment service (workflow scheduling, data movement, fault management); interaction with Grid resources through the Grid middleware; interaction with Grid information services (resource info service, application info service). An arrow labelled "workflow change" is also shown. Source: Jia Yu and Rajkumar Buyya]


Push-based Job Distribution: Requirements

Push-based job distribution requires:

> Efficient resource discovery and selection processes

> Detailed knowledge of available resources

> Well-defined interfaces of resources

> Up-to-date and confidential information systems

> Adaptation of VO schedulers and local schedulers

> Proper Access Control Lists from resource providers


Push-based Job Distribution: UNICORE Workflow System

1. The workflow engine splits the workflow into a sequence of WAs and sends them to a service orchestrator.

2. The service orchestrator filters appropriate resources by querying an information service.

3. The information service queries all available Grid sites.

4. Requests from multiple VOs are evaluated.

5. The response names a concrete resource endpoint.

6. The service orchestrator forwards the JSDL to the known interface.

7. The site performs authorization by mapping the CA to a local account.

8. The XNJS sends the job through the TSI to a physical computing resource.

9.-11. Callback chain

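[Figure: UNICORE push-based workflow architecture. Components shown: Workflow Management System, Workflow Engine, Service Orchestrator, Information Service (CIS / GLUE 2.0), OGSA Interface, UNICORE Atomic Services, XNJS, XUUDB, Target System Interface, Computing Resource. WA and JSDL documents are passed between components; arrows are numbered 1 to 11 to match the steps.]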
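The numbered push sequence can be read as the following minimal Python sketch. It is an illustration only, not the UNICORE API: the names `Site`, `InformationService` and `push_job`, the site data, and the selection rule (pick the candidate with the most free cores) are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical Grid site; not a UNICORE class."""
    name: str
    free_cores: int
    # Stand-in for a user database: certificate subject -> local account.
    accounts: dict = field(default_factory=dict)
    executed: list = field(default_factory=list)

    def submit(self, subject: str, job: dict) -> str:
        # Step 7: authorize by mapping the caller's identity to a local account.
        account = self.accounts.get(subject)
        if account is None:
            raise PermissionError(f"{subject} has no account at {self.name}")
        # Step 8: hand the job to the local computing resource.
        self.executed.append((account, job["name"]))
        return f"{job['name']} done at {self.name}"


class InformationService:
    """Hypothetical central index that inspects every site on each query (step 3)."""

    def __init__(self, sites):
        self.sites = sites
        self.queries = 0

    def find(self, min_cores: int):
        self.queries += 1
        return [s for s in self.sites if s.free_cores >= min_cores]


def push_job(info: InformationService, subject: str, job: dict) -> str:
    # Steps 2-5: ask the information service for candidates and pick a
    # concrete endpoint (assumed rule: the site with the most free cores).
    candidates = info.find(job["cores"])
    if not candidates:
        raise RuntimeError("no matching resource")
    target = max(candidates, key=lambda s: s.free_cores)
    # Step 6: forward the job description to the chosen site.
    return target.submit(subject, job)


sites = [
    Site("site-a", free_cores=4, accounts={"CN=alice": "grid01"}),
    Site("site-b", free_cores=16, accounts={"CN=alice": "u4711"}),
]
info = InformationService(sites)
result = push_job(info, "CN=alice", {"name": "wa-1", "cores": 8})
print(result)  # wa-1 done at site-b
```

In this sketch every job triggers one query to the central index, and the orchestrator must know each site's interface and account mapping before it can submit anything.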


Push-based Job Distribution: Consequences

Scalability: Information systems and schedulers may become bottlenecks with respect to the number of:

> Users and resources

> Parallel branches inside a workflow (parameter studies)

Cross-Grid scheduling:

> Side effects caused by resources in multiple Grids

> Limitation of resource candidates to resources of a particular Grid (Open community approach?)

Heterogeneity: Grid workflow systems typically deal with computational resources

> Cumbersome integration of special resources like human interaction (emerging standards are not integrated)

> Complex decision processes by humans might influence further workflow steps (e.g. qualitative assessment criteria)


Alternative Idea: Pull-based Job Distribution Strategy

The workflow system sends tasks to an intermediary repository

Resources act autonomously and adapt to the repository

Any kind of resource can actively query the repository

Resources apply for defined roles to receive tasks according to their capabilities

Resources have to authenticate against the task repository

[Figure: the common architecture in two variants. (1) WFMS pushes jobs to Grid resources: Grid workflow application modeling & definition tools, Grid workflow specification, Grid workflow enactment service (workflow scheduling, data movement, fault management), Grid middleware, Grid resources. (2) Grid resources pull jobs from the WFMS: Grid workflow enactment service (data movement, fault management), task repository with community task management, task clients, Grid resources and human resources. Both variants show a "workflow change" arrow.]
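The pull strategy described on this slide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: `TaskRepository`, its `put` and `pull` methods, the token-based authentication and the role names are hypothetical and chosen only to illustrate role-based claiming by autonomous resources.

```python
import threading
from collections import deque


class TaskRepository:
    """Hypothetical in-memory task repository; names are illustrative only."""

    def __init__(self, credentials):
        # token -> set of roles the authenticated resource may apply for
        self._credentials = credentials
        self._queues = {}  # role -> deque of pending tasks
        self._lock = threading.Lock()

    def put(self, role, task):
        """Called by the workflow system: deposit a task for a role."""
        with self._lock:
            self._queues.setdefault(role, deque()).append(task)

    def pull(self, token, role):
        """Called by a resource: authenticate, then claim the next task."""
        roles = self._credentials.get(token)
        if roles is None:
            raise PermissionError("unknown resource")
        if role not in roles:
            raise PermissionError(f"resource may not act as {role!r}")
        with self._lock:
            queue = self._queues.get(role)
            # popleft() under the lock hands each task out exactly once.
            return queue.popleft() if queue else None


repo = TaskRepository({
    "cluster-token": {"compute"},
    "expert-token": {"review"},
})
repo.put("compute", "simulate-run-1")
repo.put("review", "assess-result-1")

first = repo.pull("cluster-token", "compute")
second = repo.pull("cluster-token", "compute")  # queue is empty now
human = repo.pull("expert-token", "review")
print(first, second, human)  # simulate-run-1 None assess-result-1
```

The workflow system never selects a resource here. A compute cluster and a human expert use the same `pull` call and differ only in the role they are allowed to claim.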


Architectural Concept for the Pull Model

[Figure: architectural concept for the pull model. Components: Workflow Repository (abstract workflows), Workflow Engine (concrete workflow), Task Repository (stored tasks), Gateway, Rich Client (WF-Deployer), Rich Client (WF-Executor), and several Actors. Labelled interactions: deploy workflow, lookup workflows, execute workflow, instantiate, deploy task, notify, lookup tasks, claim tasks, execute tasks, finished tasks, finished task, finished workflow.]


Benefits & Challenges of Pull-based Approach

Benefits

> Scheduler and brokering components are now optional

> Simplified integration of special resources like humans, telescopes or medical devices

> Reduced administrative VO management overhead at resource sites

> Support actors across organizational boundaries (community approaches)

Challenges

> Bottleneck problem should not be shifted to the task repository

> Submitted jobs run the risk of starvation (SLAs!)

> Appropriate security and provenance frameworks needed
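The starvation risk listed under Challenges arises because no component is obliged to claim a task. One possible mitigation, shown here only as a sketch, is to let the repository report tasks that have waited longer than an agreed limit. The `PendingTask` record, the `max_wait` field standing in for an SLA term, and the `starving` helper are assumptions made for this example and are not part of the presented system.

```python
from dataclasses import dataclass


@dataclass
class PendingTask:
    """Hypothetical record of a task waiting in the repository."""
    name: str
    submitted_at: float  # seconds on any monotonic clock
    max_wait: float      # longest wait the assumed SLA tolerates, in seconds


def starving(tasks, now):
    """Return the names of tasks that nobody claimed within max_wait."""
    return [t.name for t in tasks if now - t.submitted_at > t.max_wait]


pending = [
    PendingTask("render-frame-7", submitted_at=0.0, max_wait=60.0),
    PendingTask("assess-result-1", submitted_at=30.0, max_wait=600.0),
    PendingTask("simulate-run-2", submitted_at=50.0, max_wait=20.0),
]

overdue = starving(pending, now=100.0)
print(overdue)  # ['render-frame-7', 'simulate-run-2']
```

What happens to an overdue task (escalation, re-advertising under another role, or pushing it to a known resource) is a policy decision that this sketch leaves open.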


Integrating the Pull Approach into an Existing Grid Middleware

Extending UNICORE Grid middleware

Use existing XML Tuple Space as repository

Integrate jBPM Workflow Engine as client for the space

1. User starts workflow

2. Engine writes job to space

3. Resource takes job from space

4. Resource executes job locally

5. Resource finishes job

6. Engine receives notification by the space and resumes workflow

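[Figure: prototype architecture. Components shown: Grid User(s) with UNICORE Rich Client, Workflow Management System with jBPM Engine, XML Space(s), UNICORE Hosting Env., Resource Provider(s) with Job Taker. A WA is passed from the client side to the workflow management system; arrows are numbered to match the steps.]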
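The six prototype steps map onto the classic tuple-space operations write, take and notify. The sketch below is a toy Python model of that interaction, not the UNICORE XML space or the jBPM API: the `Space` class, the use of dicts as entries, partial dicts as templates, and all field names are assumptions made for illustration.

```python
class Space:
    """Hypothetical tuple space: entries are dicts, templates are partial dicts."""

    def __init__(self):
        self._entries = []
        self._listeners = []  # (template, callback) pairs

    @staticmethod
    def _matches(entry, template):
        # An entry matches when it agrees with every field of the template.
        return all(entry.get(k) == v for k, v in template.items())

    def notify(self, template, callback):
        """Register a callback for future entries matching the template."""
        self._listeners.append((template, callback))

    def write(self, entry):
        self._entries.append(entry)
        for template, callback in self._listeners:
            if self._matches(entry, template):
                callback(entry)

    def take(self, template):
        """Remove and return the first matching entry, or None."""
        for i, entry in enumerate(self._entries):
            if self._matches(entry, template):
                return self._entries.pop(i)
        return None


space = Space()
log = []

# Step 6 (registered up front): the engine resumes when a result appears.
space.notify(
    {"type": "result", "workflow": "wf-1"},
    lambda e: log.append(f"engine resumes {e['workflow']} after {e['job']}"),
)

# Steps 1-2: the user starts the workflow and the engine writes a job.
space.write({"type": "job", "workflow": "wf-1", "job": "j-1", "app": "blast"})

# Step 3: a job taker removes a job it is able to run.
job = space.take({"type": "job", "app": "blast"})

# Steps 4-5: the resource runs the job locally and writes the result.
space.write({"type": "result", "workflow": job["workflow"], "job": job["job"]})

print(log)  # ['engine resumes wf-1 after j-1']
```

Because `take` removes the entry it returns, a second job taker asking for the same job gets `None`, which is what keeps a job from running twice in this model.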


Summary and Outlook

Conclusions:

> Pull-based job distribution strategies are currently missing in Grid systems

> They could serve as an alternative model for certain application scenarios (heterogeneous resources, high-throughput computing, …)


Further steps:

> The XML space should be replaced by a scalable task repository

> Besides the simple Job Takers, more complex client systems should be implemented to integrate special resources (humans) into Grids

> Hybrid push/pull distribution strategies as an option

> Performance and scalability analysis

This work should result in a refined architecture that addresses the challenges of pull-based approaches

FH Aachen University of Applied Sciences, Campus Jülich

Ginsterweg 1, 52428 Jülich
T +49. 241. 6009 [email protected], [email protected]
www.fh-aachen.de
