Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva, http://pegasus.isi.edu
Algorithms and Scheduling Techniques to Manage Resilience
and Power Consumption in Distributed Systems
July 6-10, 2015, Dagstuhl, Germany
University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
Introduction

• A successful application (or workflow) execution depends largely on how tasks are planned and executed
• Scheduling and provisioning methods assume that accurate estimates of task requirements are available
• In practice, accurate estimates are hard to compute in production systems
Scheduling and Resource Provisioning Algorithms
Task Characteristics: Runtime, Disk Space, Memory Consumption
Overview of the Resource Provisioning Loop

• Workload Characterization: models built from the workload
• Resource Allocation: compute, storage, network, etc.
• Execution Monitoring: runtime, I/O, memory, energy, etc.
• Workload Archive: execution traces as distributions, time-series, etc.
• Workload Estimation: dV/dt, Panorama, anomaly detection, etc.
• Scheduling: reconfiguration, corrective actions, etc.
Characterization of a HTC workload: The Compact Muon Solenoid (CMS) Experiment
Rafael Ferreira da Silva, Mats Rynge, Gideon Juve, Igor Sfiligoi, Ewa Deelman, James Letts, Frank Würthwein and Miron Livny, Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC, Procedia Computer Science (International Conference on Computational Science, ICCS 2015), vol. 51, pp. 39-48, 2015.
Workload Characteristics

Characteristics of the CMS workload for a period of one month (August 2014):

General workload
  Total number of jobs: 1,435,280
  Total number of users: 392
  Total number of execution sites: 75
  Total number of execution nodes: 15,484

Job statistics
  Completed jobs: 792,603
  Preempted jobs: 257,230
  Jobs with non-zero exit code: 385,447
  Average job runtime (in seconds): 9,444.6
  Standard deviation of job runtime (in seconds): 14,988.8
  Average disk usage (in MB): 55.3
  Standard deviation of disk usage (in MB): 219.1
  Average memory usage (in MB): 217.1
  Standard deviation of memory usage (in MB): 659.6
Workload Execution Profiling

• The workload shows behavior similar to the workload analysis conducted in [Sfiligoi 2013]
• The magnitude of job runtimes varies among users and among tasks

[Figures: job runtimes by user, sorted by per-user mean job runtime; job runtimes by task, sorted by per-task mean job runtime]
Workload Characterization

• Correlation statistics over the job properties: user, site, host, factory, exitCode, jobStatus, queueTime, startTime, completionTime, duration, remoteWallClockTime, numJobStarts, numRequestedCpus, remoteSysCpu, remoteUserCpu, diskUsage, diskRequested, diskProvisioned, inputsSize, memoryProvisioned, imageSize, memoryRequested, command, executableSize, blTaskID
• Weak correlations suggest that none of the properties can be directly used to predict future workload behavior
• Two variables are strongly correlated when their correlation ellipse narrows toward a line; only trivial correlations were found
Workload Characterization (2)

• Correlation measures are sensitive to the data distribution
• Probability density functions: the PDFs of job runtime, disk usage, and memory usage do not fit any of the most common families of densities (e.g., Normal or Gamma)
• Our approach: a statistical recursive partitioning method that combines properties from the workload to build regression trees

[Figures: probability density functions of job runtime (sec), disk usage (MB), and memory usage (MB), log scale]
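The recursive-partitioning idea can be sketched in a few lines. The sketch below is an illustration only: it uses variance reduction as the split criterion (the actual work uses conditional inference trees with statistical significance tests), and the job records are made up.

```python
import statistics

def variance(ys):
    """Population variance; zero for singleton groups."""
    return statistics.pvariance(ys) if len(ys) > 1 else 0.0

def best_split(rows, target, features):
    """Find the (gain, feature, threshold) that most reduces target variance."""
    base = variance([r[target] for r in rows])
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * variance(left) + len(right) * variance(right)) / len(rows)
            gain = base - child
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, target, features, min_size=4):
    """Recursively partition jobs into leaves with homogeneous target values."""
    if len(rows) <= min_size:
        return {"leaf": [r[target] for r in rows]}
    split = best_split(rows, target, features)
    if split is None or split[0] <= 0:
        return {"leaf": [r[target] for r in rows]}
    _, f, t = split
    return {
        "feature": f,
        "threshold": t,
        "le": build_tree([r for r in rows if r[f] <= t], target, features, min_size),
        "gt": build_tree([r for r in rows if r[f] > t], target, features, min_size),
    }

# Hypothetical jobs whose runtime is driven mostly by executableSize
jobs = [{"executableSize": s, "inputsSize": i, "runtime": 100 * s + i}
        for s in (10, 20, 30, 40) for i in (1, 2, 3)]
tree = build_tree(jobs, "runtime", ["executableSize", "inputsSize"])
```

Because runtime in this toy data is dominated by executableSize, the root split lands on that property, mirroring how the real trees pick up executableSize and inputsSize as splitting variables.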
Regression Trees

• The recursive algorithm looks for PDFs that fit a family of densities
• In this work, we consider the Normal and Gamma distributions
• Goodness of fit is measured with the Kolmogorov-Smirnov (K-S) test
The PDF for the tree node (in blue) fits a Gamma distribution (in grey) with the following parameters: shape parameter = 12, rate parameter = 5x10^-4, mean = 27,414.8, p-value = 0.17
[Regression tree for job runtime: internal nodes split on executableSize (p < 0.001; p = 0.004), inputsSize (p < 0.001), and numJobStarts (p = 0.02; p = 0.002); leaf nodes 3 (n = 522), 4 (n = 19), 7 (n = 2,161), 8 (n = 152), 10 (n = 26,411), and 11 (n = 665), with runtimes up to 80,000 sec]
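The per-node fitting step can be illustrated with SciPy, assuming `scipy` is available. The runtimes below are synthetic, drawn with the shape and rate of the example fit; everything else about the snippet is an illustration, not the talk's actual code.

```python
import numpy as np
from scipy import stats

# Synthetic runtimes for one tree node, drawn with the shape and rate
# of the example fit (the data itself is made up)
rng = np.random.default_rng(42)
runtimes = rng.gamma(shape=12.0, scale=1.0 / 5e-4, size=1000)  # mean ~ 24,000 s

# Fit a Gamma distribution (location pinned at 0) and test the fit with K-S
shape, loc, scale = stats.gamma.fit(runtimes, floc=0)
ks = stats.kstest(runtimes, "gamma", args=(shape, loc, scale))
```

A small K-S statistic (and a p-value above the chosen significance level) means the node's empirical PDF is acceptably described by the fitted Gamma, so the node can be used to generate estimates.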
Job Estimation Process

• Based on the regression trees: we build one regression tree per user
• Estimates are generated according to the fitted distribution (Normal, Gamma, or Uniform)
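Given a tree leaf and its fitted distribution, drawing an estimate is one line per family. The leaf specs below are hypothetical stand-ins for what a per-user tree might store; only the three families named above are handled.

```python
import random

# Hypothetical leaves of per-user regression trees; each leaf records
# the distribution family that passed the goodness-of-fit test
LEAVES = {
    "node3": {"dist": "gamma", "shape": 12.0, "rate": 5e-4},
    "node7": {"dist": "normal", "mean": 9500.0, "std": 1200.0},
    "node10": {"dist": "uniform", "low": 100.0, "high": 400.0},
}

def estimate(leaf, rng=random):
    """Draw one runtime/disk/memory estimate from a leaf's fitted distribution."""
    if leaf["dist"] == "gamma":
        # random.gammavariate takes (alpha, beta) with beta = 1 / rate
        return rng.gammavariate(leaf["shape"], 1.0 / leaf["rate"])
    if leaf["dist"] == "normal":
        return max(0.0, rng.gauss(leaf["mean"], leaf["std"]))
    return rng.uniform(leaf["low"], leaf["high"])

random.seed(7)
runtime_estimate = estimate(LEAVES["node3"])
```

Sampling from the fitted distribution, rather than returning the leaf mean, preserves the spread of the historical data in the generated estimates.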
Experimental Results

• Average estimation accuracy over the workload dataset, with the training set defined as a portion of the entire dataset
• The median accuracy increases as more data is used for the training set

[Figures: estimation accuracy vs. training-set size for job runtime, disk usage, and memory usage]
Experimental Results (2)

• Accuracy above 60%; fits mostly Normal distributions
• Number of rules per distribution
  • Runtime: better fits Gamma distributions
  • Disk: better fits Normal distributions
  • Memory: better fits Normal distributions
• Specialization
Prediction of Future Workloads

• Experiment conditions: used the workload from August 2014 to predict job requirements for October 2014
• Experiment results: median estimation accuracy
  • Runtime: 82% (50% 1st quartile, 94% 3rd quartile)
  • Disk and memory consumption: over 98%

Characteristics of the CMS workload for a period of one month (October 2014):

General workload
  Total number of jobs: 1,638,803
  Total number of users: 408

Job statistics
  Completed jobs: 810,567
Summary

• Contributions
  • Workload characterization of over 3 million jobs
  • Use of a statistical recursive partitioning algorithm and conditional inference trees to identify patterns
  • An estimation process to predict job characteristics
• Experimental results
  • Adequate estimates can be attained for job runtime
  • Nearly optimal estimates are obtained for disk and memory consumption
• Remarks
  • The data collection process should be refined to gather finer-grained information
  • Applications should provide mechanisms to distinguish custom user codes from the standard executable
Online Task Resource Consumption Prediction for Scientific Workflows

Rafael Ferreira da Silva, Gideon Juve, Ewa Deelman, Tristan Glatard, Frédéric Desprez, Douglas Thain, Benjamín Tovar and Miron Livny, Toward Fine-Grained Online Task Characteristics Estimation in Scientific Workflows, 8th Workshop on Workflows in Support of Large-Scale Science (WORKS), 2013.

Rafael Ferreira da Silva, Gideon Juve, Mats Rynge, Ewa Deelman and Miron Livny, Online Task Resource Consumption Prediction for Scientific Workflows, Parallel Processing Letters, to appear, 2015.
Scientific Workflows

• A workflow is a Directed Acyclic Graph (DAG)
• Nodes denote tasks; edges denote task dependencies
[Example workflows. Small (20-node) Montage workflow: mProjectPP, mDiffFit, mConcatFit, mBgModel, mBackground, mImgtbl, mAdd, mShrink, mJPEG. Epigenomics workflow: fastQSplit, filterContams, sol2sanger, fastq2bfq, map, mapMerge, maqIndex, pileup. SoyKB workflow: aligment_to_reference, sort_sam, dedup, add_replace, realing_target_creator, indel_realing, haplotype_caller, genotype_gvcfs, combine_variants, select_variants_indel, filtering_indel, select_variants_snp, filtering_snp, merge_gvcfs.]
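A DAG like the ones above is naturally expressed as a mapping from each task to its predecessors, from which Python's `graphlib` can produce a valid execution order. The edge set below is a simplified, illustrative wiring of the Montage task names from the slide, not the real workflow structure.

```python
from graphlib import TopologicalSorter

# Toy DAG: each task maps to the set of tasks it depends on
deps = {
    "mProjectPP": set(),
    "mDiffFit": {"mProjectPP"},
    "mConcatFit": {"mDiffFit"},
    "mBgModel": {"mConcatFit"},
    "mBackground": {"mBgModel", "mProjectPP"},
    "mImgtbl": {"mBackground"},
    "mAdd": {"mImgtbl"},
    "mShrink": {"mAdd"},
    "mJPEG": {"mShrink"},
}

# Any valid schedule must respect every dependency edge
order = list(TopologicalSorter(deps).static_order())
```

A workflow management system schedules tasks in such an order, releasing a task only once all of its predecessors have completed.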
Workflow Execution Profiling

• Workflows are executed with the Pegasus WMS and profiled with the Kickstart profiling tool
  • Monitors and records fine-grained data, e.g., process I/O, runtime, memory usage, CPU utilization
  • Uses LD_PRELOAD and ptrace (fork/exit, system calls)
• Workflow datasets
  • Montage: 10 datasets (2, 4, and 8 degrees)
  • Epigenomics: 6 datasets
  • SoyKB, Periodogram, Rosetta: 3 datasets
Execution Profile: SoyKB Workflow

• For some tasks, estimation could be based on mean values
• For others, estimation based on the average may lead to significant estimation errors
Workflow Characterization

• Characterize tasks based on their estimation capability
  • Runtime, I/O write, and memory peak are estimated from I/O read
• Use correlation statistics to identify statistical relationships between parameters
  • High correlation values yield accurate estimations
  • Parameters are considered correlated if ρ > 0.8
  • Some parameters exhibit nearly constant values
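The ρ > 0.8 rule can be checked with a plain Pearson coefficient. The profile numbers below are invented, and the through-origin ratio estimator stands in for whatever regression model is actually fitted once a correlation is found.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical task profile: runtime roughly linear in I/O read size (MB)
io_read = [10, 20, 40, 80, 160, 320]
runtime = [12, 25, 39, 85, 150, 330]

rho = pearson(io_read, runtime)
estimator = None
if rho > 0.8:  # the slide's threshold for a usable relationship
    slope = sum(runtime) / sum(io_read)  # ratio estimator through the origin
    estimator = lambda x: slope * x
```

When ρ falls below the threshold, no estimator is built from that pair of parameters, which is what motivates the clustering step on the next slide.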
Improving Correlations

• Density-based clustering identifies groups (high-density areas) in parameters where no global correlation is found
• Within the clusters, correlation values are higher, or values are nearly constant
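A minimal stand-in for the clustering step: the talk uses density-based (DBSCAN-style) clustering, while the one-dimensional gap rule below is a simplification with invented sizes, kept only to show how splitting a parameter into dense groups can expose per-group structure.

```python
def gap_clusters(values, eps):
    """Group sorted values; a gap larger than eps starts a new cluster."""
    ordered = sorted(values)
    clusters, current = [], [ordered[0]]
    for v in ordered[1:]:
        if v - current[-1] > eps:   # low-density gap: close the cluster
            clusters.append(current)
            current = [v]
        else:
            current.append(v)
    clusters.append(current)
    return clusters

# Hypothetical I/O-read sizes forming two dense groups
sizes = [1.0, 1.1, 1.3, 1.2, 50.0, 51.5, 50.8]
groups = gap_clusters(sizes, eps=5.0)
```

Correlation statistics can then be recomputed within each group, where the relationship between parameters is often much stronger than in the pooled data.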
Task Estimation Process

• Based on regression trees, built offline from analyses of historical data

[Figure: rules for I/O write estimation of the Periodogram workflow]
Online Estimation Process

• Based on the MAPE-K loop
• Task executions are constantly monitored
• Estimated values are updated, and a new prediction is made

The loop: an offline estimation seeds task submission; execution is monitored; on task completion, an analysis checks whether the estimation was correct; if not, a new estimation is produced and the workflow is replanned.
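The loop above can be sketched as a simple monitor-analyze-update cycle. Names and the tolerance threshold are illustrative, and the running-mean update is a stand-in for the real re-estimation and replanning step.

```python
def online_estimates(measurements, initial, tolerance=0.2):
    """Yield the estimate used for each task; re-estimate after large misses."""
    estimate, history = initial, []
    for measured in measurements:                      # Monitor: task completion
        yield estimate                                 # estimate used for this task
        history.append(measured)
        error = abs(measured - estimate) / measured    # Analyze: was it correct?
        if error > tolerance:                          # Plan + Execute: update
            estimate = sum(history) / len(history)

# Hypothetical runtimes that shift mid-workflow
runs = [100, 105, 300, 310, 305]
used = list(online_estimates(runs, initial=100))
```

The offline estimate stays in use while it is accurate; once the measured runtimes drift, the analysis step detects the miss and the following tasks are planned with an updated value.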
Results: Average Estimation Errors (Montage)

• Online process: avg. runtime error 13%, avg. I/O write error 8%, avg. memory error 11%
• Offline process: avg. runtime error 49%, avg. I/O write error 55%, avg. memory error 57%
• Poor output data estimations lead to a chain of estimation errors in scientific workflows
Summary

• Contributions
  • Profiling and characterization of 5 real workflows
  • An online estimation process to predict fine-grained task needs
• Future work
  • Analysis of the impact of re-planning a workflow when using an online estimation strategy
  • Sensitivity analysis of application parameters
  • Workflow profiling from time-series measurements
  • Workflow profiling based on energy consumption
Task Resource Consumption Prediction for Scientific Applications and Workflows
Thank you.
http://pegasus.isi.edu