Partitioning and Scheduling Workflows across Multiple Sites with Storage Constraints
Authors: Weiwei Chen, Ewa Deelman
9th International Conference on Parallel Processing and Applied Mathematics



Outline
Introduction
Related work
System design
Experiments and Evaluations
Conclusions



Introduction
In recent years, scientific workflows have been widely applied in astronomy, seismology, genomics, and other fields.
This paper addresses the problem of scheduling large workflows onto multiple execution sites with storage constraints.
We model workflows as Directed Acyclic Graphs (DAGs), where nodes represent computation and directed edges represent data-flow dependencies between nodes.
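The DAG model above can be made concrete with a small sketch. This is an illustrative structure (not the paper's implementation): jobs as nodes, data-flow edges as adjacency lists, plus a topological ordering that also verifies acyclicity.

```python
from collections import defaultdict, deque

class Workflow:
    """Minimal workflow DAG: jobs as nodes, data-flow edges as dependencies."""
    def __init__(self):
        self.children = defaultdict(list)   # job -> downstream jobs
        self.parents = defaultdict(list)    # job -> upstream jobs
        self.jobs = set()

    def add_dependency(self, parent, child):
        self.jobs.update((parent, child))
        self.children[parent].append(child)
        self.parents[child].append(parent)

    def topological_order(self):
        """Return jobs in dependency order; raise if the graph has a cycle."""
        indegree = {j: len(self.parents[j]) for j in self.jobs}
        ready = deque(j for j in self.jobs if indegree[j] == 0)
        order = []
        while ready:
            job = ready.popleft()
            order.append(job)
            for child in self.children[job]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        if len(order) != len(self.jobs):
            raise ValueError("workflow is not a DAG")
        return order
```

A valid topological order is exactly what partitioning and scheduling steps later rely on.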


Problems
Control or data dependencies between jobs.
The mapping of jobs in the workflow onto resources that are often distributed in the wide area.
Data-intensive workflows that require significant amounts of storage:
◦ the entire CyberShake earthquake science workflow has 16,000 sub-workflows; each sub-workflow has more than 24,000 individual jobs and requires 58 GB of data.


Outline
Introduction
Related work
System design
Experiments and Evaluations
Conclusions


Related work
Heuristic scheduling:
◦ HEFT, Min-Min, Max-Min, MCT
These algorithms do not take storage constraints into consideration, and they need to check and schedule every job individually.
Workflow partitioning can be classified as a network cut problem, where a sub-workflow is viewed as a sub-graph.


Outline
Introduction
Related work
System design
Experiments and Evaluations
Conclusions


System design
The site catalog provides information about the available resources.


Why partition?
Partitioning reduces the complexity of the workflow mapping.
◦ For example, the entire CyberShake workflow has more than 3.8×10^8 tasks, which is too large for workflow management tools. In contrast, each sub-workflow has 24,000 tasks, which is acceptable for workflow management tools.


Cross dependency
The major challenge in partitioning workflows is to avoid cross dependency: a chain of dependencies between sub-workflows that forms a cycle in the graph, resulting in a deadlock loop.
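One way to make the cross-dependency condition concrete: contract each sub-workflow into a single super-node and test whether the contracted graph contains a cycle. This is a hypothetical sketch, not the paper's algorithm; `edges` and `assignment` are assumed input shapes.

```python
def has_cross_dependency(edges, assignment):
    """edges: (parent, child) job pairs; assignment: job -> sub-workflow id.

    Contract each sub-workflow into one node; a cross dependency exists
    iff the contracted graph of sub-workflows contains a cycle.
    """
    super_edges = {(assignment[p], assignment[c])
                   for p, c in edges if assignment[p] != assignment[c]}
    succ = {}
    for u, v in super_edges:
        succ.setdefault(u, set()).add(v)

    def reachable(start, target):
        # Depth-first search over contracted edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(succ.get(node, ()))
        return False

    # A cycle exists iff some contracted edge u -> v has a path back v -> u.
    return any(reachable(v, u) for u, v in super_edges)
```

A partition that passes this check can execute its sub-workflows in a valid order without deadlock.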


Partitioner
Jobs that have parent-child relationships usually share a lot of data, since they have data dependencies.
Three heuristics are proposed to first partition the workflow into sub-workflows.


Partitioner
Our heuristic only checks three particular types of nodes:
◦ fan-out: where the output of a job is input to many children
◦ fan-in: where the output of several jobs is aggregated by a child
◦ pipeline: one parent, one child
Our algorithm reduces the time complexity of the check operations by a factor of n, where n is the average depth of the fan-in–fan-out structure.
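The three node types can be recognized purely from a job's in- and out-degree. A minimal illustrative check; the `parents`/`children` mappings are assumed helpers, not the paper's data structures.

```python
def classify(node, parents, children):
    """Classify a job by degree: fan-out, fan-in, pipeline, or other."""
    if len(children.get(node, [])) > 1:
        return "fan-out"      # output feeds many children
    if len(parents.get(node, [])) > 1:
        return "fan-in"       # aggregates the output of several jobs
    if len(parents.get(node, [])) == 1 and len(children.get(node, [])) == 1:
        return "pipeline"     # exactly one parent and one child
    return "other"            # e.g. source or sink jobs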


Partitioner
Aggressive search:
◦ checks whether it is possible to add the whole fan structure into the sub-workflow.
Less-aggressive search:
◦ performed on the job's parents; it includes all of the job's predecessors until the search reaches a fan-out job.


Partitioner
Conservative search:
◦ includes all of the job's predecessors until the search reaches a fan-in job or a fan-out job.
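The less-aggressive and conservative searches can be sketched as one backward walk over predecessors with different stopping rules. This is an assumption-laden illustration (not the paper's code; whether the boundary job itself is included is glossed over here).

```python
def backward_search(job, parents, children, conservative=False):
    """Gather predecessors of `job`, stopping at fan-out jobs; the
    conservative variant also stops at fan-in jobs."""
    gathered, stack = set(), [job]
    while stack:
        j = stack.pop()
        if j in gathered:
            continue
        gathered.add(j)
        for p in parents.get(j, []):
            if len(children.get(p, [])) > 1:
                continue  # stop at fan-out jobs
            if conservative and len(parents.get(p, [])) > 1:
                continue  # conservative search: also stop at fan-in jobs
            stack.append(p)
    return gathered
```

On a chain a→c, b→c, c→d, d→e, the conservative search from e stops below the fan-in job c, while the less-aggressive search continues past it.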


Heuristic Ⅰ
We assume that the size of each input file and output file is known.


Heuristic Ⅱ
Adds a job to a sub-workflow only if all of its unscheduled children can be added to that sub-workflow without causing cross dependencies or exceeding the storage constraint.


Heuristic Ⅲ
Adds a job to a sub-workflow only if:
1. for a job with multiple children, each child has already been scheduled;
2. after adding this job to the sub-workflow, the data size does not exceed the storage constraint.
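Heuristic Ⅲ's two conditions translate into a small admission test. A hedged sketch, where `data_size` is an assumed job-to-size mapping and sharing of files between jobs is not deduplicated.

```python
def heuristic3_can_add(job, subwf_jobs, children, scheduled, data_size, capacity):
    """Admit `job` into a sub-workflow only if (1) all of its children are
    already scheduled and (2) the sub-workflow's data stays within capacity."""
    if any(c not in scheduled for c in children.get(job, [])):
        return False                                   # condition 1 fails
    used = sum(data_size[j] for j in subwf_jobs) + data_size[job]
    return used <= capacity                            # condition 2
```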


Estimator
Critical Path: the longest path through the sub-workflow, weighted by the runtime of each job.
Average CPU Time: the cumulative CPU time of all jobs divided by the number of available resources.
HEFT estimator: uses the calculated earliest finish time of the last sink job as the makespan of the sub-workflow.
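The first two estimators are simple to state in code. Illustrative versions, assuming per-job runtimes and a topological ordering of the sub-workflow are available (names are ours, not the paper's).

```python
def critical_path(runtimes, children, order):
    """Runtime-weighted longest path; `order` is a topological ordering."""
    finish = {}
    for job in reversed(order):  # process sink jobs first
        tail = max((finish[c] for c in children.get(job, [])), default=0.0)
        finish[job] = runtimes[job] + tail
    return max(finish.values())

def average_cpu_time(runtimes, n_resources):
    """Cumulative CPU time of all jobs divided by the available resources."""
    return sum(runtimes.values()) / n_resources
```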


Scheduler
Re-ordering: the partitioning step has already guaranteed that a valid mapping exists.
Scheduling algorithms: HEFT and Min-min, with two differences compared to their original versions:
◦ First, the data transfer cost within a sub-workflow is ignored, since we use a shared file system in our experiments.
◦ Second, the data constraints must be satisfied for each sub-workflow.
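The scheduler's site-selection step under storage constraints might look like the following sketch (not the paper's HEFT/Min-min implementation); `estimate` is an assumed per-site finish-time estimator such as those on the previous slide.

```python
def pick_site(subwf_size, estimate, sites):
    """Choose the site with the smallest estimated finish time among those
    whose free storage can hold the sub-workflow's data.

    sites: site -> (free_storage, current_load); estimate(site) -> finish time.
    """
    feasible = [s for s, (free, _) in sites.items() if free >= subwf_size]
    if not feasible:
        raise ValueError("no site satisfies the storage constraint")
    return min(feasible, key=estimate)
```

Filtering by storage first, then minimizing estimated finish time, mirrors the two rules above: constraints are hard, makespan is optimized within them.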


Outline
Introduction
Related work
System design
Experiments and Evaluations
Conclusions


Experiments and Evaluations
Eucalyptus [14]: infrastructure software that provides on-demand access to Virtual Machine (VM) resources.
The submit host, which performs workflow planning and sends jobs to the execution sites, is a Linux 2.6 machine equipped with 8 GB RAM and an Intel 2.66 GHz quad-core CPU.

14. Eucalyptus Systems. http://www.eucalyptus.com/


Experiments and Evaluations
We use Condor [6] pools as execution sites.
HTCondor is a specialized workload management system for compute-intensive jobs. Like other full-featured batch systems, HTCondor provides a job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management.

6. M. Litzkow, M. Livny, et al., Condor—A Hunter of Idle Workstations. In Proceedings of the 8th International Conference on Distributed Computing Systems, New York, June 1988.


Experiments and Evaluations
Performance Metrics:
◦ Satisfying the Storage Constraints
◦ Improving the Runtime Performance


Experiments and Evaluations
Workflows Used:
◦ Montage: an astronomy application used to construct large image mosaics of the sky.
◦ CyberShake: a seismology application that calculates Probabilistic Seismic Hazard curves for several geographic sites in the Southern California area.
◦ Epigenomics: a bioinformatics application that maps short DNA segments collected with high-throughput gene sequencing machines to a reference genome.


Experiments and Evaluations
These workflows were chosen because they represent a wide range of application domains and a variety of resource requirements:
◦ Montage: I/O intensive
◦ CyberShake: memory intensive
◦ Epigenomics: CPU intensive


Experiments and Evaluations
Storage constraint: 30 GB.
The default workflow has no storage constraint.



Experiments and Evaluations
Performance with Different Storage Constraints


Experiments and Evaluations
CyberShake


Experiments and Evaluations
Montage


Experiments and Evaluations
Epigenomics



Experiments and Evaluations
The performance with the three workflows shows that this approach is able to satisfy the storage constraints and reduce the makespan significantly, especially for Epigenomics, which has fewer fan-in (synchronization) jobs.
For the workflows we used, scheduling them onto two or three execution sites works best, due to a tradeoff between increased data transfer and increased parallelism.


Experiments and Evaluations
◦ The Average CPU Time estimator does not take the dependencies into consideration.
◦ The Critical Path estimator does not consider resource availability.


Outline
Introduction
Related work
System design
Experiments and Evaluations
Conclusions


Conclusions
Three heuristics are proposed and compared, showing the close relationship between cross dependency and runtime improvement.
The performance on three real-world workflows shows that this approach is able to satisfy storage constraints and improve the overall runtime by up to 48% over the default whole-workflow scheduling.


Thank you for listening.