Realistic CPU Workloads Through Host Load Trace Playback
http://www.cs.cmu.edu/~pdinda/LoadTraces
Peter A. Dinda
David R. O’Hallaron
Carnegie Mellon University
2
Talk in a Nutshell
• Workloads to evaluate distributed systems
  • Prediction-based systems
  • Shared benchmarks
• Reconstruct time-varying CPU contention behavior from traces of Unix load average
  • Real (non-parametric)
  • Reproducible
  • Comparable
• Artifacts (http://www.cs.cmu.edu/~pdinda/LoadTraces)
  • Playload tool
  • Collection of host load traces
3
Outline
• Evaluation of distributed systems
  • Adaptive applications
  • Prediction systems
• Host load traces
  • Synthetic versus trace-based workloads
• Host load trace playback
• Evaluation
• Conclusion
4
Evaluating Distributed Systems
• Workloads critical in evaluation
• Shared benchmarks desperately needed
  • SPEC?
• Synthetic versus trace-based workloads
  • Parametric versus non-parametric
• Ideally:
  • Real workloads
  • Reproducible
  • Comparable
  • Sharable
5
Evaluating Prediction Systems
• Prediction systems model workload
  • RPS, NWS, adaptive applications, etc.
• Synthetic workloads assume a model
  • Model may be wrong or incomplete
  • Easy to use in simulation or in a testbed
• Trace-based workloads assume no model
  • However, traces may not be representative
  • Harder to use, especially in a testbed
• How do we use traces in a testbed?
  • Host load traces
6
Host Load Traces
• Periodically sampled Unix load average
  • Exponentially averaged run queue length
• Measure of contention for CPU
• Complex statistical properties [SciProg99]
  • High variability, strong autocorrelation function, self-similar, epochal behavior, …
  • Difficult to synthesize
7
Host Load and Running Time
[Plot: execution time (seconds) versus measured load; 42,000 points, coefficient of correlation = 0.998]
$$\frac{t_{exec}}{1 + \frac{1}{t_{exec}} \int_{t_{now}}^{t_{now}+t_{exec}} z(t)\,dt} = t_{nom}$$
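The relation on this slide ties running time to load. Read here as t_exec / (1 + average load over the execution) = t_nom, it can be solved numerically for t_exec. The sketch below is an illustration, not the authors' method. It assumes a trace sampled at a fixed interval and held constant between samples, and it uses a simple fixed-point iteration that is not guaranteed to converge for every trace. The names `load_integral` and `predict_exec_time` are hypothetical.

```python
def load_integral(trace, t, dt=1.0):
    """Integral of a piecewise-constant load trace from 0 to t seconds."""
    if t > len(trace) * dt:
        raise ValueError("trace is shorter than the requested interval")
    full = int(t // dt)                 # number of whole samples covered
    total = sum(trace[:full]) * dt
    if full < len(trace):
        total += trace[full] * (t - full * dt)   # partial last sample
    return total


def predict_exec_time(trace, t_nom, dt=1.0, tol=1e-9, max_iter=1000):
    """Solve t_exec = t_nom * (1 + average load over [0, t_exec]).

    Assumption: fixed-point iteration starting from t_nom (must be > 0).
    """
    t = t_nom
    for _ in range(max_iter):
        avg_load = load_integral(trace, t, dt) / t
        t_next = t_nom * (1.0 + avg_load)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    raise RuntimeError("fixed-point iteration did not converge")
```

For a constant load of 1.0, a task with a nominal time of 10 seconds is predicted to take 20 seconds, which matches the intuition that it shares the CPU equally with one other runnable process.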
8
Available Host Load Traces
• DEC Unix 5 second exponential average
  • Full bandwidth captured (1 Hz sample rate)
  • Long durations
  • Available! http://www.cs.cmu.edu/~pdinda/LoadTraces

Trace         Machines                                                                     Duration
August 1997   13 production cluster, 8 research cluster, 2 compute servers, 15 desktops   ~ one week (over one million samples)
March 1998    13 production cluster, 8 research cluster, 2 compute servers, 11 desktops   ~ one week (over one million samples)
9
Host Load Measurement
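[Diagram: in the kernel, the ready queue is sampled (Sample1: unknown sample rate, estimated f=2 Hz) and passed through h (exponential average, tau=5 s); in user space, the result is sampled again (Sample2: f=1 Hz) and written to the trace file]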
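The user-level half of this measurement path, periodic sampling of the load average into a trace, can be sketched as follows. This is not the tool used to collect the traces. It assumes Python's `os.getloadavg()`, whose first value is a 1-minute average on most Unix systems rather than the 5-second DEC Unix average described here. The function name `record_trace` and its parameters are hypothetical.

```python
import os
import time


def record_trace(n_samples, interval=1.0, read_load=None, sleep=time.sleep):
    """Sample the load average n_samples times, `interval` seconds apart.

    read_load and sleep are injectable so the loop can be exercised
    without waiting or without a Unix load average being available.
    """
    if read_load is None:
        # First element is the shortest-horizon average the OS exposes.
        read_load = lambda: os.getloadavg()[0]
    trace = []
    for _ in range(n_samples):
        trace.append(read_load())
        sleep(interval)
    return trace
```

Calling `record_trace(3600)` would collect about one hour of samples at 1 Hz on a host that supports `os.getloadavg()`.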
10
Host Load Trace Playback
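[Diagram: the trace file supplies the target load, h^-1 converts it to the applied load, and the load generator applies it; the host's own measurement path (Sample1, h, Sample2) yields the measured load; the error is the difference between the measured load and the target load]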
11
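What are h and h^-1?
h (on the recorded host):
$$z_{i+1} = e^{-\Delta/\tau_{record}}\, z_i + \left(1 - e^{-\Delta/\tau_{record}}\right) x_i$$
• x_i: run queue length
• z_i: trace file
• \tau_{record}: time constant for recorded host
h^-1:
$$x_i = \frac{z_{i+1} - e^{-\Delta/\tau_{record}}\, z_i}{1 - e^{-\Delta/\tau_{record}}}$$
• x_i: applied load (recovered run queue length)
h (on the playback host):
$$z_{i+1} = e^{-\Delta/\tau_{playback}}\, z_i + \left(1 - e^{-\Delta/\tau_{playback}}\right) x_i$$
• z_i: measured load
• \tau_{playback}: time constant for playback host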
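The pair h and h^-1 on this slide can be sketched as a first-order exponential average and its algebraic inverse. This is a minimal illustration, assuming a fixed sample interval `delta` and an initial load average `z0`; the names `h`, `h_inv`, `delta`, and `z0` are chosen here and do not come from the playload tool.

```python
import math


def h(x, tau, delta=1.0, z0=0.0):
    """Exponentially average run queue lengths x into load averages z.

    Returns len(x) + 1 values, starting with the initial value z0.
    """
    a = math.exp(-delta / tau)
    z = [z0]
    for xi in x:
        z.append(a * z[-1] + (1.0 - a) * xi)
    return z


def h_inv(z, tau, delta=1.0):
    """Recover the run queue lengths x from consecutive load averages z."""
    a = math.exp(-delta / tau)
    return [(z[i + 1] - a * z[i]) / (1.0 - a) for i in range(len(z) - 1)]
```

Applying `h_inv` with the recording host's time constant to a trace produced by `h` returns the original run queue lengths, which is the sense in which h^-1 recovers the load to apply.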
12
Load Generator
[Diagram: the master takes a request such as “1.5 load for 1 second” and issues per-worker requests (“1.0 load for 1 second”, ..., “0.5 load for 1 second”, “0.0 load for 1 second”) to the worker processes]
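One way a master could divide a load level among workers is to fill workers to 1.0 one at a time and give the remainder to the next. The slide does not spell out the splitting rule, so this greedy scheme is an assumption, and `split_load` is a hypothetical name.

```python
def split_load(level, n_workers):
    """Split a target load level into per-worker loads, each in [0, 1].

    Assumption: workers are filled greedily, so 1.5 over three workers
    becomes 1.0, 0.5, 0.0.
    """
    if level < 0 or level > n_workers:
        raise ValueError("level must be between 0 and n_workers")
    shares = []
    remaining = level
    for _ in range(n_workers):
        share = min(1.0, remaining)
        shares.append(share)
        remaining -= share
    return shares
```

With this rule, the largest load that can be played back is bounded by the number of worker processes.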
13
Load Generator
“p load for w seconds”
// Split w into n cycles
while (!done) {
  if (uniformrand(1.0) < p)
    compute for w/n seconds;
  else
    sleep for w/n seconds;
}
(simplified)
• Time-based playback: done = “w seconds have elapsed”
• Work-based playback: done = “w*p CPU seconds have been used”
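The simplified worker loop on this slide can be turned into a small runnable sketch. It is not the playload implementation. It assumes a busy-wait for the compute phase, `time.process_time()` as the measure of CPU seconds used, and a single worker with 0 <= p <= 1; the names `busy_wait` and `play_load` are hypothetical.

```python
import random
import time


def busy_wait(seconds):
    """Spin on the CPU until `seconds` of wall-clock time have passed."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


def play_load(p, w, n=100, work_based=False, rng=random):
    """Apply "p load for w seconds" in slices of w/n seconds.

    Returns (wall_seconds, cpu_seconds) actually spent.
    """
    slice_len = w / n
    wall_start = time.perf_counter()
    cpu_start = time.process_time()

    def done():
        if work_based:
            # Work-based: stop once w*p CPU seconds have been consumed.
            return time.process_time() - cpu_start >= w * p
        # Time-based: stop once w wall-clock seconds have elapsed.
        return time.perf_counter() - wall_start >= w

    while not done():
        if rng.random() < p:
            busy_wait(slice_len)      # compute for w/n seconds
        else:
            time.sleep(slice_len)     # sleep for w/n seconds

    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)
```

Under external load the two modes diverge: the time-based loop always ends after about w seconds, while the work-based loop keeps running until it has received its w*p CPU seconds, so it stretches in time.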
14
Time-based Playback
[Plot: target load and measured load versus time (0 to 250 seconds), with an external continuous 1.0 load]
External load amplitude-modulates the applied load
15
Work-based Playback
External load amplitude- and frequency-modulates the applied load
[Plot: target load and measured load versus time (0 to 250 seconds), with an external continuous 1.0 load]
16
Evaluation
• Traces described earlier
  • Example uses a one-hour 1997 axp0 trace
• Characterization of signals and errors
  • Summary stats
  • Distributions
  • Autocorrelation
• Multiple platforms
  • Digital Unix
  • Solaris
  • Linux
  • FreeBSD
17
Evaluation Summary Stats

Environment          Target Mean   Target Std   Measured Mean   Measured Std
Alpha/DUX 4.0        1.065         0.465        1.047           0.442
Sparc/Solaris 2.5    1.047         0.376        1.112           0.356
PII/FreeBSD 2.2      1.047         0.376        1.123           0.361
PII/RH Linux 5.2     1.047         0.376        1.131           0.360

Environment          Error Mean    Error Std
Alpha/DUX 4.0        -0.018        0.127
Sparc/Solaris 2.5    0.076         0.061
PII/FreeBSD 2.2      0.076         0.124
PII/RH Linux 5.2     0.084         0.164
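Summary statistics of this kind, along with the error autocorrelation used in the evaluation, could be computed from a target trace and a measured trace roughly as follows. This sketch assumes the error is measured minus target and uses the population standard deviation; neither convention is stated on the slides, and the function names are hypothetical.

```python
import statistics


def error_signal(target, measured):
    """Per-sample error, assumed here to be measured minus target."""
    return [m - t for t, m in zip(target, measured)]


def summary(signal):
    """Mean and (population) standard deviation of a signal."""
    return statistics.mean(signal), statistics.pstdev(signal)


def autocorrelation(signal, lag):
    """Sample autocorrelation of a signal at the given non-negative lag."""
    mean = statistics.mean(signal)
    centered = [s - mean for s in signal]
    denom = sum(c * c for c in centered)
    num = sum(centered[i] * centered[i + lag]
              for i in range(len(signal) - lag))
    return num / denom
```

An error signal with a mean near zero and autocorrelation near zero at all non-zero lags would suggest that playback tracks the trace without systematic bias.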
18
Evaluation on Alpha/DUX
[Figure panels: Target, Measured, Error Histogram, Error ACF]
19
Evaluation on Sparc/Solaris
[Figure panels: Target, Measured, Error Histogram, Error ACF]
20
Evaluation on P2/FreeBSD
[Figure panels: Target, Measured, Error Histogram, Error ACF]
21
Evaluation on P2/Linux
[Figure panels: Target, Measured, Error Histogram, Error ACF]
22
Conclusion
• CPU workloads from traces of Unix load average
  • Reproduce contention behavior (ignoring priorities)
  • Real, non-parametric workloads
  • Reproducible
• Artifacts (http://www.cs.cmu.edu/~pdinda/LoadTraces)
  • Playload tool
  • Collection of host load traces
• Future
  • Benchmarks
  • Priorities, memory, disk, etc.
  • In-kernel?
23
Feedback?
Use error signal to better track the load trace
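[Diagram: playback path from the trace file (z) through h^-1 to the load generator (x level) and the load measure, with the error signal passed through a second h^-1 and summed (+/-) back into the generator input]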
24
The Problem With Feedback
Feedback would try to make SUM of applied load and external load in system track the load trace
[Diagram: the same feedback loop with an external load present; the load measure sees the effect of the combined applied load and external load]
25
Making Feedback Work
[Diagram: the feedback loop extended with a signal separation stage that uses load source models to split the effect of the combined load into the estimated effect of the applied load and the estimated effect of the external load]
26
Why Host Load Traces for Evaluating Distributed Systems?
• Real
• Comparable and reproducible
  • Analogous to a SPEC benchmark
  • Usable in simulation and experimentation
• Non-parametric and non-synthetic
  • Especially important for prediction systems