Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva, http://pegasus.isi.edu
Algorithms and Scheduling Techniques to Manage Resilience
and Power Consumption in Distributed Systems
July 6-10, 2015, Dagstuhl, Germany
University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA
Introduction

• A successful application (or workflow) execution depends largely on how tasks are planned and executed
• Scheduling and provisioning methods assume that accurate estimates of task requirements are available
• In practice, accurate estimates are hard to compute in production systems
Scheduling and Resource Provisioning Algorithms
Task Characteristics: Runtime, Disk Space, Memory Consumption
Overview of the Resource Provisioning Loop

• Workload Characterization: models built from the workload
• Resource Allocation: compute, storage, network, etc.
• Execution Monitoring: runtime, I/O, memory, energy, etc.
• Workload Archive: execution traces as distributions, time-series, etc.
• Workload Estimation: dV/dt, Panorama, anomaly detection, etc.
• Scheduling: reconfiguration, corrective actions, etc.
Characterization of a HTC workload: The Compact Muon Solenoid (CMS) Experiment
Rafael Ferreira da Silva, Mats Rynge, Gideon Juve, Igor Sfiligoi, Ewa Deelman, James Letts, Frank Würthwein and Miron Livny, Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC, Procedia Computer Science (International Conference on Computational Science, ICCS 2015), vol. 51, pp. 39-48, 2015.
Workload Characteristics

Characteristics of the CMS workload for a period of one month (August 2014):

General workload
  Total number of jobs: 1,435,280
  Total number of users: 392
  Total number of execution sites: 75
  Total number of execution nodes: 15,484

Job statistics
  Completed jobs: 792,603
  Preempted jobs: 257,230
  Jobs with non-zero exit code: 385,447
  Average job runtime (in seconds): 9,444.6
  Standard deviation of job runtime (in seconds): 14,988.8
  Average disk usage (in MB): 55.3
  Standard deviation of disk usage (in MB): 219.1
  Average memory usage (in MB): 217.1
  Standard deviation of memory usage (in MB): 659.6
Workload Execution Profiling

• The workload shows behavior similar to the workload analysis conducted in [Sfiligoi 2013]
• The magnitude of job runtimes varies among users and among tasks

[Figures: job runtimes by user, sorted by per-user mean job runtime; job runtimes by task, sorted by per-task mean job runtime]
Workload Characterization

• Correlation statistics over the job properties: user, site, host, factory, exitCode, jobStatus, queueTime, startTime, completionTime, duration, remoteWallClockTime, numJobStarts, numRequestedCpus, remoteSysCpu, remoteUserCpu, diskUsage, diskRequested, diskProvisioned, inputsSize, memoryProvisioned, imageSize, memoryRequested, command, executableSize, blTaskID
• Weak correlations suggest that none of the properties can be directly used to predict future workload behavior
• Two variables are strongly correlated when their correlation ellipse narrows toward a line; only trivial correlations were found
Workload Characterization (2)

• Correlation measures are sensitive to the data distribution
• Probability density functions: the PDFs of job runtime, disk usage, and memory usage do not fit any of the most common families of densities (e.g., Normal or Gamma)
• Our approach: a statistical recursive partitioning method that combines properties from the workload to build regression trees

[Figures: probability density functions of job runtime (sec), disk usage (MB), and memory usage (MB), log scale]
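The recursive-partitioning idea can be sketched in a few lines. The sketch below is an illustration only: it uses variance reduction as the split criterion (the actual work uses conditional inference trees with statistical significance tests), and the job records are made up.

```python
import statistics

def variance(ys):
    """Population variance; zero for singleton groups."""
    return statistics.pvariance(ys) if len(ys) > 1 else 0.0

def best_split(rows, target, features):
    """Find the (gain, feature, threshold) that most reduces target variance."""
    base = variance([r[target] for r in rows])
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if not left or not right:
                continue
            child = (len(left) * variance(left) + len(right) * variance(right)) / len(rows)
            gain = base - child
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, target, features, min_size=4):
    """Recursively partition jobs into leaves with homogeneous target values."""
    if len(rows) <= min_size:
        return {"leaf": [r[target] for r in rows]}
    split = best_split(rows, target, features)
    if split is None or split[0] <= 0:
        return {"leaf": [r[target] for r in rows]}
    _, f, t = split
    return {
        "feature": f,
        "threshold": t,
        "le": build_tree([r for r in rows if r[f] <= t], target, features, min_size),
        "gt": build_tree([r for r in rows if r[f] > t], target, features, min_size),
    }

# Hypothetical jobs whose runtime is driven mostly by executableSize
jobs = [{"executableSize": s, "inputsSize": i, "runtime": 100 * s + i}
        for s in (10, 20, 30, 40) for i in (1, 2, 3)]
tree = build_tree(jobs, "runtime", ["executableSize", "inputsSize"])
```

Because runtime in this toy data is dominated by executableSize, the root split lands on that property, mirroring how the real trees pick up executableSize and inputsSize as splitting variables.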
Regression Trees

• The recursive algorithm looks for PDFs that fit a family of densities
• In this work, we consider the Normal and Gamma distributions
• Goodness of fit is measured with the Kolmogorov-Smirnov (K-S) test
The PDF for the tree node (in blue) fits a Gamma distribution (in grey) with the following parameters: shape parameter = 12, rate parameter = 5x10^-4, mean = 27,414.8, p-value = 0.17
[Regression tree for job runtime: internal nodes split on executableSize (p < 0.001; p = 0.004), inputsSize (p < 0.001), and numJobStarts (p = 0.02; p = 0.002); leaf nodes 3 (n = 522), 4 (n = 19), 7 (n = 2,161), 8 (n = 152), 10 (n = 26,411), and 11 (n = 665), with runtimes up to 80,000 sec]
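The per-node fitting step can be illustrated with SciPy, assuming `scipy` is available. The runtimes below are synthetic, drawn with the shape and rate of the example fit; everything else about the snippet is an illustration, not the talk's actual code.

```python
import numpy as np
from scipy import stats

# Synthetic runtimes for one tree node, drawn with the shape and rate
# of the example fit (the data itself is made up)
rng = np.random.default_rng(42)
runtimes = rng.gamma(shape=12.0, scale=1.0 / 5e-4, size=1000)  # mean ~ 24,000 s

# Fit a Gamma distribution (location pinned at 0) and test the fit with K-S
shape, loc, scale = stats.gamma.fit(runtimes, floc=0)
ks = stats.kstest(runtimes, "gamma", args=(shape, loc, scale))
```

A small K-S statistic (and a p-value above the chosen significance level) means the node's empirical PDF is acceptably described by the fitted Gamma, so the node can be used to generate estimates.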
Job Estimation Process

• Based on the regression trees: we build one regression tree per user
• Estimates are generated according to the fitted distribution (Normal, Gamma, or Uniform)
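Given a tree leaf and its fitted distribution, drawing an estimate is one line per family. The leaf specs below are hypothetical stand-ins for what a per-user tree might store; only the three families named above are handled.

```python
import random

# Hypothetical leaves of per-user regression trees; each leaf records
# the distribution family that passed the goodness-of-fit test
LEAVES = {
    "node3": {"dist": "gamma", "shape": 12.0, "rate": 5e-4},
    "node7": {"dist": "normal", "mean": 9500.0, "std": 1200.0},
    "node10": {"dist": "uniform", "low": 100.0, "high": 400.0},
}

def estimate(leaf, rng=random):
    """Draw one runtime/disk/memory estimate from a leaf's fitted distribution."""
    if leaf["dist"] == "gamma":
        # random.gammavariate takes (alpha, beta) with beta = 1 / rate
        return rng.gammavariate(leaf["shape"], 1.0 / leaf["rate"])
    if leaf["dist"] == "normal":
        return max(0.0, rng.gauss(leaf["mean"], leaf["std"]))
    return rng.uniform(leaf["low"], leaf["high"])

random.seed(7)
runtime_estimate = estimate(LEAVES["node3"])
```

Sampling from the fitted distribution, rather than returning the leaf mean, preserves the spread of the historical data in the generated estimates.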
Experimental Results

• Average estimation accuracy over the workload dataset, with the training set defined as a portion of the entire dataset
• The median accuracy increases as more data is used for the training set

[Figures: estimation accuracy vs. training-set size for job runtime, disk usage, and memory usage]
Experimental Results (2)

• Accuracy above 60%; fits mostly Normal distributions
• Number of rules per distribution
  • Runtime: better fits Gamma distributions
  • Disk: better fits Normal distributions
  • Memory: better fits Normal distributions
• Specialization
Prediction of Future Workloads

• Experiment conditions: used the workload from August 2014 to predict job requirements for October 2014
• Experiment results: median estimation accuracy
  • Runtime: 82% (50% 1st quartile, 94% 3rd quartile)
  • Disk and memory consumption: over 98%

Characteristics of the CMS workload for a period of one month (October 2014):

General workload
  Total number of jobs: 1,638,803
  Total number of users: 408

Job statistics
  Completed jobs: 810,567
Summary

• Contributions
  • Workload characterization of over 3 million jobs
  • Use of a statistical recursive partitioning algorithm and conditional inference trees to identify patterns
  • An estimation process to predict job characteristics
• Experimental results
  • Adequate estimates can be attained for job runtime
  • Nearly optimal estimates are obtained for disk and memory consumption
• Remarks
  • The data collection process should be refined to gather finer-grained information
  • Applications should provide mechanisms to distinguish custom user codes from the standard executable
Online Task Resource Consumption Prediction for Scientific Workflows

Rafael Ferreira da Silva, Gideon Juve, Ewa Deelman, Tristan Glatard, Frédéric Desprez, Douglas Thain, Benjamín Tovar and Miron Livny, Toward Fine-Grained Online Task Characteristics Estimation in Scientific Workflows, 8th Workshop on Workflows in Support of Large-Scale Science (WORKS), 2013.

Rafael Ferreira da Silva, Gideon Juve, Mats Rynge, Ewa Deelman and Miron Livny, Online Task Resource Consumption Prediction for Scientific Workflows, Parallel Processing Letters, to appear, 2015.
Scientific Workflows

• A workflow is a Directed Acyclic Graph (DAG)
• Nodes denote tasks; edges denote task dependencies
[Example workflows. Small (20-node) Montage workflow: mProjectPP, mDiffFit, mConcatFit, mBgModel, mBackground, mImgtbl, mAdd, mShrink, mJPEG. Epigenomics workflow: fastQSplit, filterContams, sol2sanger, fastq2bfq, map, mapMerge, maqIndex, pileup. SoyKB workflow: aligment_to_reference, sort_sam, dedup, add_replace, realing_target_creator, indel_realing, haplotype_caller, genotype_gvcfs, combine_variants, select_variants_indel, filtering_indel, select_variants_snp, filtering_snp, merge_gvcfs.]
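A DAG like the ones above is naturally expressed as a mapping from each task to its predecessors, from which Python's `graphlib` can produce a valid execution order. The edge set below is a simplified, illustrative wiring of the Montage task names from the slide, not the real workflow structure.

```python
from graphlib import TopologicalSorter

# Toy DAG: each task maps to the set of tasks it depends on
deps = {
    "mProjectPP": set(),
    "mDiffFit": {"mProjectPP"},
    "mConcatFit": {"mDiffFit"},
    "mBgModel": {"mConcatFit"},
    "mBackground": {"mBgModel", "mProjectPP"},
    "mImgtbl": {"mBackground"},
    "mAdd": {"mImgtbl"},
    "mShrink": {"mAdd"},
    "mJPEG": {"mShrink"},
}

# Any valid schedule must respect every dependency edge
order = list(TopologicalSorter(deps).static_order())
```

A workflow management system schedules tasks in such an order, releasing a task only once all of its predecessors have completed.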
Workflow Execution Profiling

• Workflows are executed with the Pegasus WMS and profiled with the Kickstart profiling tool
  • Monitors and records fine-grained data, e.g., process I/O, runtime, memory usage, CPU utilization
  • Uses LD_PRELOAD and ptrace (fork/exit, system calls)
• Workflow datasets
  • Montage: 10 datasets (2, 4, and 8 degrees)
  • Epigenomics: 6 datasets
  • SoyKB, Periodogram, Rosetta: 3 datasets
Execution Profile: SoyKB Workflow

• For some tasks, estimation could be based on mean values
• For others, estimation based on the average may lead to significant estimation errors
Workflow Characterization

• Characterize tasks based on their estimation capability
  • Runtime, I/O write, and memory peak are estimated from I/O read
• Use correlation statistics to identify statistical relationships between parameters
  • High correlation values yield accurate estimations
  • Parameters are considered correlated if ρ > 0.8
  • Some parameters exhibit nearly constant values
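The ρ > 0.8 rule can be checked with a plain Pearson coefficient. The profile numbers below are invented, and the through-origin ratio estimator stands in for whatever regression model is actually fitted once a correlation is found.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical task profile: runtime roughly linear in I/O read size (MB)
io_read = [10, 20, 40, 80, 160, 320]
runtime = [12, 25, 39, 85, 150, 330]

rho = pearson(io_read, runtime)
estimator = None
if rho > 0.8:  # the slide's threshold for a usable relationship
    slope = sum(runtime) / sum(io_read)  # ratio estimator through the origin
    estimator = lambda x: slope * x
```

When ρ falls below the threshold, no estimator is built from that pair of parameters, which is what motivates the clustering step on the next slide.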
Improving Correlations

• Density-based clustering identifies groups (high-density areas) in parameters where no global correlation is found
• Within the clusters, correlation values are higher, or values are nearly constant
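A minimal stand-in for the clustering step: the talk uses density-based (DBSCAN-style) clustering, while the one-dimensional gap rule below is a simplification with invented sizes, kept only to show how splitting a parameter into dense groups can expose per-group structure.

```python
def gap_clusters(values, eps):
    """Group sorted values; a gap larger than eps starts a new cluster."""
    ordered = sorted(values)
    clusters, current = [], [ordered[0]]
    for v in ordered[1:]:
        if v - current[-1] > eps:   # low-density gap: close the cluster
            clusters.append(current)
            current = [v]
        else:
            current.append(v)
    clusters.append(current)
    return clusters

# Hypothetical I/O-read sizes forming two dense groups
sizes = [1.0, 1.1, 1.3, 1.2, 50.0, 51.5, 50.8]
groups = gap_clusters(sizes, eps=5.0)
```

Correlation statistics can then be recomputed within each group, where the relationship between parameters is often much stronger than in the pooled data.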
Task Estimation Process

• Based on regression trees, built offline from analyses of historical data

[Figure: rules for I/O write estimation of the Periodogram workflow]
Online Estimation Process

• Based on the MAPE-K loop
• Task executions are constantly monitored
• Estimated values are updated, and a new prediction is made

The loop: an offline estimation seeds task submission; execution is monitored; on task completion, an analysis checks whether the estimation was correct; if not, a new estimation is produced and the workflow is replanned.
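The loop above can be sketched as a simple monitor-analyze-update cycle. Names and the tolerance threshold are illustrative, and the running-mean update is a stand-in for the real re-estimation and replanning step.

```python
def online_estimates(measurements, initial, tolerance=0.2):
    """Yield the estimate used for each task; re-estimate after large misses."""
    estimate, history = initial, []
    for measured in measurements:                      # Monitor: task completion
        yield estimate                                 # estimate used for this task
        history.append(measured)
        error = abs(measured - estimate) / measured    # Analyze: was it correct?
        if error > tolerance:                          # Plan + Execute: update
            estimate = sum(history) / len(history)

# Hypothetical runtimes that shift mid-workflow
runs = [100, 105, 300, 310, 305]
used = list(online_estimates(runs, initial=100))
```

The offline estimate stays in use while it is accurate; once the measured runtimes drift, the analysis step detects the miss and the following tasks are planned with an updated value.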
Results: Average Estimation Errors (Montage)

• Online process: avg. runtime error 13%, avg. I/O write error 8%, avg. memory error 11%
• Offline process: avg. runtime error 49%, avg. I/O write error 55%, avg. memory error 57%
• Poor output data estimations lead to a chain of estimation errors in scientific workflows
Summary

• Contributions
  • Profiling and characterization of 5 real workflows
  • An online estimation process to predict fine-grained task needs
• Future work
  • Analysis of the impact of re-planning a workflow when using an online estimation strategy
  • Sensitivity analysis of application parameters
  • Workflow profiling from time-series measurements
  • Workflow profiling based on energy consumption
Task Resource Consumption Prediction for Scientific Applications and Workflows
Thank you.
http://pegasus.isi.edu