Yi Qiao Jason Skicewicz Peter A. Dinda Prescience Laboratory Department of Computer Science

1

An Empirical Study of the Multiscale Predictability of Network Traffic

Yi Qiao, Jason Skicewicz, Peter A. Dinda

Prescience Laboratory

Department of Computer Science

Northwestern University

Evanston, IL 60201

2

Talk in a Nutshell

In-depth trace-based study of predictability of link bandwidth at different resolutions
  – Binning and wavelet approximations

• Generalizations very difficult to make

• Aggregation often helps

• Predictability does not monotonically increase with decreasing resolution

• Predictability largely independent of approximation mechanism (binning or wavelets)

• Simple models sufficient

3

Outline

• Motivation and Related Work – MTTA

• Traces

• Binning Approximations and Wavelet Approximations

• Results

• Conclusions

4

Background

• Why study predictability of network traffic?
  – Adaptive applications
  – Congestion control
  – Admission control
  – Network management

• Eventual goal
  – Providing application-level network traffic queries to adaptive applications
    • Fine-grain app, e.g., immersive audio
    • Coarse-grain app, e.g., scientific app on grids

5

Message Transfer Time Advisor

Target API

(conf_lower, conf_upper, conf_expected) =
    MTTA::PredictTransferTime(src_ip_address,
                              dest_ip_address,
                              message_size,
                              transport_protocol,
                              conf_level);

• Our contributions here
  – Predicting aggregate background traffic
  – Dealing with a wide range of time resolutions

[Diagram: the application sends a query to the MTTA and receives an answer]

Application Query: Time for transferring a 10 MB message, confidence level = 0.95?

Query Answer: Expected transfer time is 50 seconds, confidence interval is [45.9, 54.1] seconds
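To make the shape of such a query concrete, here is a minimal Python sketch. It is not the MTTA implementation: the function name, the reduced parameter list (no addresses or transport protocol), and the assumption that predicted bandwidth is roughly normal with a known mean and standard deviation are all hypothetical, chosen only to show how a bandwidth prediction could be turned into the `(conf_lower, conf_upper, conf_expected)` tuple.

```python
from statistics import NormalDist


def predict_transfer_time(message_size, bandwidth_mean, bandwidth_std, conf_level):
    """Return (conf_lower, conf_upper, conf_expected) in seconds.

    Hypothetical sketch: assumes the predicted available bandwidth (bytes/s)
    is approximately normal with the given mean and standard deviation.
    """
    if not 0.0 < conf_level < 1.0:
        raise ValueError("conf_level must be strictly between 0 and 1")
    # Two-sided normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    bw_low = bandwidth_mean - z * bandwidth_std
    bw_high = bandwidth_mean + z * bandwidth_std
    if bw_low <= 0.0:
        raise ValueError("bandwidth interval reaches zero; no finite upper bound")
    conf_expected = message_size / bandwidth_mean
    conf_lower = message_size / bw_high   # fastest plausible transfer
    conf_upper = message_size / bw_low    # slowest plausible transfer
    return conf_lower, conf_upper, conf_expected


# Illustrative numbers only: a 10 MB message over a link predicted at
# 200 kB/s with an assumed 8 kB/s standard deviation.
lower, upper, expected = predict_transfer_time(10e6, 200e3, 8e3, 0.95)
```

The interval on transfer time is asymmetric here because it is derived from an interval on bandwidth, so it will not reproduce the symmetric interval quoted in the query answer.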

6

Our Approach

[Diagram: Network → Sensor → High-Resolution Bandwidth Signal → Predictor → High-Resolution Prediction and Low-Resolution Prediction → MTTA (Resolution Selection); the App sends an Application Query to the MTTA and receives a Query Answer]

7

Multiresolution Views of Resource Signals

• Two different approaches
  – Binning
    • Commonly used by existing network measurement tools
  – Wavelets
    • N-level streaming wavelet transform yielding detail signals and approximation signals
    • Wavelet domain enables many useful analyses

8

Questions For This Study

• What is the nature of predictability of network resource signals?

• How does predictability depend on resolution?

• What predictive models should be used?

• What are the implications for the MTTA?

9

Tools And Data

• RPS: Resource Prediction System Toolkit for Distributed Systems (publicly available from us)

• Tsunami: Wavelet Toolkit for Distributed Systems (publicly available from us)

• NLANR Trace Archive (publicly accessible)

• Internet Traffic Archive (publicly accessible)

10

Relevant Previous Work

• Groschwitz, et al.: ARIMA models to predict long-term NSFNET traffic growth

• Basu, et al.: Modeling of FDDI, Ethernet LAN, and NSFNET entry/exit point traffic

• Leland, et al.: Self-similarity of Ethernet traffic

• Wolski, et al.: Network Weather Service

• Sang and Li: Multi-step prediction of network traffic using ARMA and MMPP
  – Both aggregation and smoothing increase predictability
  – Our finding: predictability often does not increase monotonically with smoothing

12

Trace Classification and Analysis

[Diagram labels: large number and high variety of traces; time-series; histogram; PSD; ACF; classification scheme; conclusions]

Repeated the analysis for a wide range of resolutions

Y. Qiao and P. Dinda, Network Traffic Analysis, Classification, and Prediction, Technical Report NWU-CS-02-11, Department of Computer Science, Northwestern University, January 2003.

13

Traces

Name       Number of Raw Traces   Classes   Studied   Duration    Range of Resolutions
NLANR      180                    12        39        90s         1, 2, 4, …, 1024 ms
AUCKLAND   34                     8         34        1d          0.125, 0.25, …, 1024 s
BC         4                      N/A       4         1h, 1d      7.8125 ms to 16 s
Totals     218                    N/A       77        90s to 1d   1 ms to 1024 s

15

Binning Approximations

• Methodology
  – Commonly used by existing network measurement tools
  – Averages over N non-overlapping, power-of-two bins

[Figure: a trace binned at 1 s, 8 s, 128 s, and 1024 s, showing increasing bin sizes]
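A minimal Python sketch of the binning approximation, assuming NumPy and a uniformly sampled bandwidth signal. The function names and sample values are illustrative and are not taken from RPS.

```python
import numpy as np


def bin_signal(signal, bin_size):
    """Average a 1-D signal over consecutive non-overlapping bins."""
    signal = np.asarray(signal, dtype=float)
    n_bins = len(signal) // bin_size          # an incomplete trailing bin is dropped
    trimmed = signal[: n_bins * bin_size]
    return trimmed.reshape(n_bins, bin_size).mean(axis=1)


def multiresolution_views(signal, max_level):
    """Map each bin size 1, 2, 4, ..., 2**max_level to its binned signal."""
    return {2 ** k: bin_signal(signal, 2 ** k) for k in range(max_level + 1)}


# Hypothetical bandwidth samples, one per base interval.
bandwidth = [10, 12, 8, 14, 20, 22, 18, 16]
views = multiresolution_views(bandwidth, 3)
```

Each doubling of the bin size halves the number of samples, so the coarsest views of a short trace contain very few points to fit or evaluate a predictor on.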

16

Wavelet Approximations

• Parameterized by a wavelet basis function
  – Equivalent to the binning approach when using the Haar wavelet

• Methodology
  – N-level streaming wavelet transform
  – The D8 wavelet was used for our study

[Figure: approximation signals at Level 0, Level 1, and Level 2, showing increasing approximation level]
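The Haar equivalence can be checked numerically. The sketch below is a plain NumPy batch transform, not the streaming transform in Tsunami, and it uses the Haar wavelet only because that is the case the slide equates with binning (the study itself used D8). With the orthonormal Haar filter, the level-j approximation coefficients are bin sums over 2^j samples divided by 2^(j/2), so rescaling by 2^(-j/2) gives the bin averages.

```python
import numpy as np


def haar_approximation(signal, levels):
    """Approximation coefficients after `levels` orthonormal Haar steps."""
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx = approx[: len(approx) // 2 * 2]            # drop an odd trailing sample
        approx = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
    return approx


def bin_average(signal, bin_size):
    """Averages over non-overlapping bins of `bin_size` samples."""
    signal = np.asarray(signal, dtype=float)
    n_bins = len(signal) // bin_size
    return signal[: n_bins * bin_size].reshape(n_bins, bin_size).mean(axis=1)


x = np.array([10, 12, 8, 14, 20, 22, 18, 16], dtype=float)
level = 2
approx = haar_approximation(x, level)
rescaled = approx / 2 ** (level / 2)      # undo the orthonormal scaling
binned = bin_average(x, 2 ** level)
```

For longer wavelets such as D8 the approximation is a smoother weighted average over overlapping windows, so it no longer coincides with binning.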

17

Binning Prediction Methodology

[Diagram: binning component feeding a prediction component]

18

Wavelet Prediction Methodology

[Diagram: wavelet component feeding a prediction component]

20

One-step Ahead Predictions

[Figure: timelines at high resolution and at low resolution, each marking "now" and a one-step-ahead prediction]

Lower Resolution => Longer Interval Into Future

21

Predictability Ratio

• Predictability ratio = variance of error signal over variance of resource signal = $\sigma_e^2 / \sigma^2$
  – Fraction of the “surprise” in the signal left after prediction

• The smaller the ratio, the better the predictability

Example:

Resource signal = [1 4 10 9], $\sigma^2 = 18$

Prediction = [2 3 9 10]

Error signal = [1 -1 -1 1], $\sigma_e^2 = 1.33$

Predictability ratio = 1.33/18 ≈ 0.074
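The worked example can be reproduced in a few lines of Python. This sketch assumes NumPy and the sample variance (`ddof=1`), which is the convention that yields the slide's values of 18 and 1.33; the function name is illustrative.

```python
import numpy as np


def predictability_ratio(signal, prediction):
    """Variance of the prediction error divided by variance of the signal."""
    signal = np.asarray(signal, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    error = prediction - signal
    # ddof=1 (sample variance) is an assumption that matches the slide's numbers.
    return np.var(error, ddof=1) / np.var(signal, ddof=1)


resource = [1, 4, 10, 9]
predicted = [2, 3, 9, 10]
ratio = predictability_ratio(resource, predicted)
```

A ratio near 0 means the predictor removes almost all of the signal's variance; a ratio of 1 or more means the predictor is no better than guessing the signal's mean.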

22

Wide Range of Prediction Models

• Simple models
  – MEAN: long-term mean of signal
  – LAST: last observed value as prediction
  – BM(32): average over a history window of optimal size

• Box-Jenkins models
  – AR(8), AR(32): pure autoregressive
  – MA(8): pure moving average
  – ARMA(4,4): autoregressive moving average
  – ARIMA(4,1,4), ARIMA(4,2,4): integrated ARMA

• Long-range dependence model
  – ARFIMA(4,-1,4): “fractionally integrated” ARMA

• Nonlinear model
  – MANAGED AR(32): TAR variant
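A rough Python sketch of how a few of these models can be compared by one-step-ahead predictability ratio. It is not the RPS implementation. The assumptions are: NumPy only, a synthetic AR(1) test signal, a first-half fit and second-half evaluation split, a least-squares AR fit, and one reading of BM(p) as the window size up to p with the lowest one-step error on the fitting data.

```python
import numpy as np


def predict_last(history):
    return history[-1]


def predict_mean(history):
    return float(np.mean(history))


def predict_window_mean(history, window):
    return float(np.mean(history[-window:]))


def one_step_errors(signal, predictor, start):
    """Errors (prediction - actual) when signal[t] is predicted from signal[:t]."""
    return np.array([predictor(signal[:t]) - signal[t]
                     for t in range(start, len(signal))])


def best_window(train, max_window):
    """Window size in 1..max_window with the smallest one-step MSE on `train`."""
    def mse(w):
        errs = one_step_errors(train, lambda h: predict_window_mean(h, w), max_window)
        return float(np.mean(errs ** 2))
    return min(range(1, max_window + 1), key=mse)


def fit_ar(train, order):
    """Least-squares AR(order) fit on the mean-removed training data."""
    mu = train.mean()
    z = train - mu
    rows = np.array([z[t - order:t][::-1] for t in range(order, len(z))])
    coeffs, *_ = np.linalg.lstsq(rows, z[order:], rcond=None)
    return mu, coeffs


def predict_ar(history, mu, coeffs):
    recent = history[-len(coeffs):][::-1] - mu   # most recent value first
    return mu + float(recent @ coeffs)


# Synthetic AR(1) signal standing in for a binned bandwidth trace.
rng = np.random.default_rng(0)
n = 2000
signal = np.zeros(n)
for t in range(1, n):
    signal[t] = 0.9 * signal[t - 1] + rng.normal()

split = n // 2                       # fit on the first half, evaluate on the second
train, test = signal[:split], signal[split:]
w = best_window(train, 8)
mu, coeffs = fit_ar(train, 8)

predictors = {
    "MEAN": predict_mean,
    "LAST": predict_last,
    "BM(8)": lambda h: predict_window_mean(h, w),
    "AR(8)": lambda h: predict_ar(h, mu, coeffs),
}
ratios = {name: np.var(one_step_errors(signal, p, split), ddof=1) / np.var(test, ddof=1)
          for name, p in predictors.items()}
```

On a strongly autocorrelated signal like this one, LAST and AR(8) should land well below MEAN, which illustrates why very simple models can be competitive.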

23

Binning Study on NLANR Traces

– Generally unpredictable
– Predictability worse at coarser granularities

[Figure legend: LAST, BM(32), models with AR component; log scale]

24

Binning Study On BC Traces

– Weak predictability
– Predictability not always monotonically increasing with smoothing

[Figure legend: LAST, MA(8), models with AR component]

25

Results for AUCKLAND Traces

• General predictability of traces

• How predictability changes with different resolutions

• Relative performance of different predictive models

3 different behaviors for binning study, and 4 different behaviors for wavelet study

26

AUCKLAND Behavior 1 - Binning

– 14 of 34 traces
– Predictability converges to a high level with increasing bin size
– Consistent with conclusions from earlier papers

[Figure legend: MA(8), LAST, BM(8), models with AR component]

27

AUCKLAND Behavior 1 - Wavelet

– 7 of the 34 traces
– Generally shows a monotonic relationship with approximation level, except for outliers
– Relatively uncommon behavior

[Figure legend: LAST, MA(8), models with AR component]

28


AUCKLAND Behavior 2 - Binning

– 15 of 34 traces
– Presence of a sweet spot: an optimal bin size that maximizes predictability
– Contradicts earlier work

[Figure legend: MA(8), LAST, BM(8), models with AR component; the sweet spot is marked at maximum predictability]
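One way to look for a sweet spot is to sweep the bin size and keep the one with the smallest predictability ratio. The Python sketch below assumes NumPy, uses the LAST predictor for simplicity, and runs on a synthetic signal (a slow sinusoid plus white noise) rather than an AUCKLAND trace. All function names are illustrative.

```python
import numpy as np


def bin_signal(signal, bin_size):
    """Averages over non-overlapping bins of `bin_size` samples."""
    signal = np.asarray(signal, dtype=float)
    n_bins = len(signal) // bin_size
    return signal[: n_bins * bin_size].reshape(n_bins, bin_size).mean(axis=1)


def last_value_ratio(signal):
    """Predictability ratio of the LAST predictor on the second half of `signal`."""
    half = len(signal) // 2
    test = signal[half:]
    prediction = signal[half - 1:-1]      # previous sample predicts the next one
    return np.var(prediction - test, ddof=1) / np.var(test, ddof=1)


def sweep_bin_sizes(signal, max_level):
    """Predictability ratio at bin sizes 1, 2, 4, ..., 2**max_level."""
    return {2 ** k: last_value_ratio(bin_signal(signal, 2 ** k))
            for k in range(max_level + 1)}


def sweet_spot(ratios):
    """Bin size with the smallest ratio, i.e. the highest predictability."""
    return min(ratios, key=ratios.get)


# Synthetic stand-in for a trace: slow oscillation buried in white noise.
rng = np.random.default_rng(1)
t = np.arange(4096)
signal = np.sin(2 * np.pi * t / 512) + rng.normal(size=t.size)

ratios = sweep_bin_sizes(signal, 8)
best = sweet_spot(ratios)
```

The synthetic signal is built so that small bins are dominated by noise and very large bins approach the oscillation period, so the minimum is expected at an intermediate bin size. Real traces need not behave this way.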

29

AUCKLAND Behavior 2- Wavelet

– 13 of the 34 AUCKLAND traces
– A sweet spot at a particular scale
– Contradicts earlier work

[Figure legend: MA(8), LAST, models with AR component; the sweet spot is marked at maximum predictability]

30


– 11 of the 34 traces
– Non-monotonic relationship between scale and predictability
– Predictability weaker than behavior 1 and 2

[Figure legend: LAST, BM(8), MA(8), models with AR component]

31

AUCKLAND Behavior 3 - Wavelet

– Uncommon, 5 of 34 traces
– Multiple peaks and valleys at different approximations
– Predictability not as strong as the earlier two classes

[Figure legend: LAST, MA(8), models with AR component]

32

AUCKLAND Behavior 4 - Wavelet

– 3 of the 34 traces
– Predictability ratio plateaus, then the signal becomes more predictable at the coarsest resolutions
– Behavior did not occur in binning study

[Figure legend: LAST, MA(8), models with AR component]

33

Conclusions

In-depth trace-based study of predictability of link bandwidth at different resolutions
  – Binning and wavelet approximations

• Generalizations very difficult to make

• Aggregation often helps

• Predictability does not monotonically increase with decreasing resolution

• Predictability largely independent of approximation mechanism (binning or wavelets)

• Simple models sufficient

34

Implications for Message Transfer Time Advisor (MTTA)

• Online multiscale prediction system to support MTTA is feasible
  – Likely to be more accurate for WAN traffic

• Often a natural time scale for prediction
  – Adaptation likely best here

• Prediction system must itself adapt to changing network behavior

35

Current and Future Work

• Wide-area TCP throughput characterization and prediction
  – D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Characterizing and Predicting TCP Throughput on the Wide Area Network, Technical Report NWU-CS-04-34, Department of Computer Science, Northwestern University, April 2004.

• Wide-area parallel TCP throughput modeling and prediction
  – D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Modeling and Taming Parallel TCP on the Wide Area Network, Technical Report NWU-CS-04-35, May 2004.

• Tsunami Wavelet Toolkit
  – J. Skicewicz and P. Dinda, Tsunami: A Wavelet Toolkit for Distributed Systems, Technical Report NWU-CS-03-16, Department of Computer Science, Northwestern University, November 2003.

36

For More Information

• Prescience Lab
  – http://plab.cs.northwestern.edu

• Tsunami and RPS available for download
  – http://rps.cs.northwestern.edu

• Contact
  – [email protected]

37

AUCKLAND Behavior 1-Binning

– 14 of 34 traces
– Predictability converges to a high level with increasing bin size
– Consistent with conclusions from earlier papers

38

AUCKLAND Behavior 1-Wavelet

– 7 of the 34 traces
– Generally shows a monotonic relationship with approximation level, except for outliers
– Relatively uncommon behavior

39

AUCKLAND Behavior 2-Binning

– 15 of 34 traces
– Presence of a sweet spot: an optimal bin size that maximizes predictability
– Contradicts the conclusions of earlier work

40


AUCKLAND Behavior 2-Wavelet

– 13 of the 34 AUCKLAND traces
– A sweet spot at a particular approximation scale for maximum predictability
– Contradicts earlier work

41

AUCKLAND Behavior 3-Binning

– Uncommon, 5 of 34 traces
– Multiple peaks and valleys at different bin sizes
– Predictability not as strong as the earlier two classes

42


AUCKLAND Behavior 3-Wavelet

– 11 of the 34 traces
– Non-monotonic relationship between the approximation scale and the predictability
– Predictability weaker than class 1

43


AUCKLAND Behavior 4-Wavelet

– 3 of the 34 traces
– The predictability ratio plateaus, then the signal becomes more predictable at the coarsest resolutions
– A behavior that did not occur in the binning study
