A Non-intrusive, Wavelet-based Approach To Detecting Network Performance Problems

Polly Huang, ETH Zurich
Anja Feldmann, U. Saarbruecken
Walter Willinger, AT&T Labs-Research

Road Map

Motivation and rationale
Mechanism details
Conclusion and outlook

Performance Problem

[Figure: protocol stacks (Web, TCP, Network, Link/Physical) at the end hosts, with Google.com as the example server and the Internet in between. A performance problem may be caused by congestion, routing, the server, a proxy, or something else.]

Current State

Active probing
  Ex: traceroute, ping
  Disturbing: injects unnecessary traffic
  Biasing: distorts the metrics of interest
  ‘Heisenberg’ effects
Passive measurements
  Ex: Cisco NetFlow, IP Accounting, other packet-level measurements
  Give much information
  Do not infer problems inside the network

What Would Be Cool

Passive
Trigger alerts in real time
For problems due to
  Server load
  Congestion
  Routing error
Common symptoms
  Delay and drop

TCP’s Closed-loop Control

Delays/drops are reflected in RTT/RTO estimations
  RTT: round-trip time
  RTO: retransmission timeout
Quality of network path
  Values of RTT/RTO estimations
  Amounts of RTT/RTO samples
Can be measured passively

Detailed Estimation

Methodology
  A hash table of all data packets observed
  One RTT sample per data-ack pair
  One RTO sample per data-data pair
Slow
  Cost ~ #packets per observation period
  Especially with high data rate connections (the likely troublemakers)
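The hash-table bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not WIND's code: the `Packet` record, its fields, and the exact-match handling of acknowledgements are all simplifying assumptions (real TCP acks are cumulative), and skipping RTT samples for retransmitted segments is an added assumption in the spirit of Karn's rule.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a libpcap structure)."""
    time: float    # capture timestamp in seconds
    flow: str      # connection id of the data direction (acks carry the same id)
    is_ack: bool   # True for an acknowledgement, False for a data packet
    seq_end: int   # data: sequence number after the last byte; ack: ack number


def estimate_rtt_rto(packets: List[Packet]) -> Tuple[List[float], List[float]]:
    """Return (RTT samples, RTO samples) from a time-ordered packet list."""
    pending: Dict[Tuple[str, int], float] = {}   # (flow, seq_end) -> last send time
    retransmitted: Set[Tuple[str, int]] = set()  # segments seen more than once
    rtt: List[float] = []
    rto: List[float] = []
    for p in packets:
        key = (p.flow, p.seq_end)
        if p.is_ack:
            sent = pending.pop(key, None)
            # One RTT sample per data-ack pair; ambiguous (retransmitted)
            # segments are skipped, which is an assumption of this sketch.
            if sent is not None and key not in retransmitted:
                rtt.append(p.time - sent)
            retransmitted.discard(key)
        elif key in pending:
            # Same data seen again: one RTO sample per data-data pair.
            rto.append(p.time - pending[key])
            pending[key] = p.time
            retransmitted.add(key)
        else:
            pending[key] = p.time
    return rtt, rto


trace = [
    Packet(0.00, "f", False, 100),
    Packet(0.05, "f", True, 100),
    Packet(0.10, "f", False, 200),
    Packet(1.10, "f", False, 200),   # retransmission after a timeout
    Packet(1.15, "f", True, 200),
]
rtt_samples, rto_samples = estimate_rtt_rto(trace)
```

Every packet touches the hash table, which is why the cost of this approach grows with the number of packets per observation period.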

Objectives

Passive measurement
  Non-intrusive
Infer quality of network paths
  Detecting network performance problems
Efficiently (so it can be done in real time)
  Wavelet-based technique

Road Map

Motivation and rationale
Mechanism details
Conclusion and outlook

Wavelet-based Technique

Theoretical ground
  Wavelet transform
  Energy plots (or scaling plots)
  Interpreting energy plots
WIND, the problem detection tool
  Features & examples
  Detection methodology
  Validation effort

Theoretical Ground

FFT
  Frequency decomposition
  $f_j$, Fourier coefficient
  Amount of the signal in frequency $j$
WT: wavelet transform
  Frequency (scale) and time decomposition
  $d_{j,k}$, wavelet coefficient
  Amount of the signal in frequency $j$, time $k$

Wavelet Example

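[Figure: wavelet decomposition of a step signal; the wavelet takes the values 0, -1 and 1.]

Signal: 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1

Scale j   s_j                d_j
1         0 0 0 0 2 2 2 2    0 0 0 0 0 0 0 0
2         0 0 4 4            0 0 0 0
3         0 8                0 0
4         8                  8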
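The numbers in this example are consistent with an unnormalized Haar pyramid, where each $s_j$ is a pairwise sum of the previous level and each $d_j$ a pairwise difference. The sketch below reproduces them under that reading. The function name `haar_pyramid` is hypothetical, and the sign convention (second element minus first) is an assumption chosen to match the value shown at scale 4.

```python
from typing import List, Tuple


def haar_pyramid(signal: List[float]) -> Tuple[List[List[float]], List[List[float]]]:
    """Unnormalized Haar pyramid: pairwise sums (s_j) and differences (d_j)."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    sums: List[List[float]] = []
    details: List[List[float]] = []
    current = list(signal)
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        # Assumed sign convention: second element minus first.
        details.append([b - a for a, b in pairs])
        current = [a + b for a, b in pairs]
        sums.append(current)
    return sums, details


step = [0] * 8 + [1] * 8
sums, details = haar_pyramid(step)
for j, (s, d) in enumerate(zip(sums, details), start=1):
    print(f"scale {j}: s={s} d={d}")
```

The only nonzero detail coefficient appears at scale 4, the one scale whose window straddles the step.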

Self-similarity

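Energy function
  $E_j = \frac{1}{N_j} \sum_k d_{j,k}^2$, where $N_j$ is the number of coefficients at scale $j$
Self-similar process
  $E_j = 2^{j(2H-1)} C$, where $H$ is the Hurst parameter (the magic!)
  $\log_2 E_j = (2H-1)\, j + \log_2 C$
  Linear relationship between $\log_2 E_j$ and $j$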
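To make the linear relationship concrete, the following sketch computes $E_j$ from a list of coefficients and recovers $H$ from the slope of $\log_2 E_j$ against $j$ with an ordinary least-squares fit. The function names are hypothetical, and the fit assumes energies from an L2-normalized wavelet transform and at least two scales with nonzero energy.

```python
import math
from typing import Sequence


def energy(details_at_scale: Sequence[float]) -> float:
    """E_j: mean of the squared wavelet coefficients at one scale."""
    return sum(d * d for d in details_at_scale) / len(details_at_scale)


def hurst_from_energies(energies: Sequence[float]) -> float:
    """Fit log2(E_j) = (2H - 1) j + log2(C); energies[0] is scale j = 1."""
    js = list(range(1, len(energies) + 1))
    ys = [math.log2(e) for e in energies]
    mean_j = sum(js) / len(js)
    mean_y = sum(ys) / len(ys)
    slope = (sum((j - mean_j) * (y - mean_y) for j, y in zip(js, ys))
             / sum((j - mean_j) ** 2 for j in js))
    return (slope + 1) / 2


# Synthetic energies that follow the self-similar law exactly (H = 0.8, C = 3).
synthetic = [3 * 2 ** (j * (2 * 0.8 - 1)) for j in range(1, 9)]
estimated_h = hurst_from_energies(synthetic)
```

On measured traffic the points only approximately fall on a line, so the fit would normally be restricted to the range of scales where the plot looks linear.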

Self-similar Traffic

Effect of Periodicity

self-similar

Internet Traffic

Adding Periodicity

Packets arrive periodically, 1 pkt/2^3 msec
Coefficients cancel out at scale 4

Signal: 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

Scale j   s_j                d_j
1         1 0 0 0 1 0 0 0    1 0 0 0 1 0 0 0
2         1 0 1 0            1 0 1 0
3         1 1                1 1
4         2                  0
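The cancellation can be checked numerically. The sketch below is a minimal illustration with a hypothetical helper name: it runs an unnormalized Haar pyramid (pairwise sums and differences, as the numbers in the example suggest) over the periodic signal and prints the energy at each scale.

```python
from typing import List


def haar_energies(signal: List[float]) -> List[float]:
    """Energy (mean squared detail coefficient) at scales 1, 2, ..."""
    energies: List[float] = []
    current = list(signal)
    while len(current) > 1:
        evens, odds = current[0::2], current[1::2]
        details = [a - b for a, b in zip(evens, odds)]
        energies.append(sum(d * d for d in details) / len(details))
        current = [a + b for a, b in zip(evens, odds)]
    return energies


period = 8  # one packet every 2^3 time units
periodic = [1 if t % period == 0 else 0 for t in range(16)]
print(haar_energies(periodic))  # [0.25, 0.5, 1.0, 0.0]
```

The energy grows up to scale 3 and then drops to zero at scale 4, where each window spans two full periods and the halves cancel. On a log-scale energy plot this shows up as an abrupt knee.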

Simulation Traffic: Single RTT

Simulation Traffic: Congestion

Interpreting Energy Functions

Abrupt knees at
  RTT time scale
  RTO time scale
Knee shifts
  RTT/RTO time changes
Low energy level (after normalization)
  Congestion
  Low traffic volume

WIND - The Detection Tool

Wavelet-based Inference for Network Detection

Based on libpcap and tcpdump
On-line mode (efficient)
  Per packet: compute $d_{j,k}$
  Per observation period: output $E_j$
  On a subnet basis
Off-line mode
  Detailed RTT/RTO estimation
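One way the on-line mode could stay cheap is to keep only one pending value per scale and update the running energy sums as data arrives. The sketch below illustrates that idea and is not WIND's implementation: the class name is hypothetical, it is fed packet counts per completed time bin rather than individual packets, and it uses an unnormalized Haar wavelet.

```python
from typing import List, Optional


class StreamingHaarEnergy:
    """Running Haar energies with O(number of scales) memory."""

    def __init__(self, scales: int) -> None:
        self.pending: List[Optional[float]] = [None] * scales  # unpaired sums
        self.sq_sum: List[float] = [0.0] * scales   # sum of d_{j,k}^2 per scale
        self.count: List[int] = [0] * scales        # N_j per scale

    def push(self, value: float) -> None:
        """Feed the packet count of one finished time bin."""
        for j in range(len(self.pending)):
            left = self.pending[j]
            if left is None:
                self.pending[j] = value   # wait for the second half of the pair
                return
            self.pending[j] = None
            d = left - value              # detail coefficient at this scale
            self.sq_sum[j] += d * d
            self.count[j] += 1
            value = left + value          # pass the sum up to the next scale

    def energies(self) -> List[Optional[float]]:
        """E_j per scale (index 0 is scale 1); None if no coefficient yet."""
        return [s / c if c else None for s, c in zip(self.sq_sum, self.count)]


monitor = StreamingHaarEnergy(scales=4)
for t in range(16):
    monitor.push(1 if t % 8 == 0 else 0)
period_energies = monitor.energies()
```

At the end of an observation period the caller would read `energies()` and start a fresh instance, keeping one instance per subnet.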

Real Traffic: By Subnets

Real Traffic: By Periods

Real Traffic: By Periods

Detecting Methodology

Reference function
  Smoothed average
Difference
  Area below the reference function
  Weighted sum by scale
Flagged interesting
  Top 10% deviations
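The three steps above can be sketched as follows. The slides do not specify the smoothing, the weights, or how ties are handled, so everything concrete here is an assumption: the reference is a plain per-scale mean over all periods, the weights are supplied by the caller, and `flag_periods` is a hypothetical name.

```python
import math
from typing import List, Sequence


def flag_periods(log_energy: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 top_fraction: float = 0.10) -> List[int]:
    """Return indices of the periods that fall furthest below the reference.

    log_energy[p][j] is log2(E_j) for observation period p at scale index j.
    """
    n_periods = len(log_energy)
    n_scales = len(weights)
    # Reference function: per-scale average over all periods (assumed smoothing).
    reference = [sum(period[j] for period in log_energy) / n_periods
                 for j in range(n_scales)]
    # Deviation: weighted area by which a period lies below the reference.
    scores = [sum(w * max(0.0, ref - e)
                  for w, ref, e in zip(weights, reference, period))
              for period in log_energy]
    n_flag = max(1, math.ceil(top_fraction * n_periods))
    ranked = sorted(range(n_periods), key=lambda p: scores[p], reverse=True)
    return sorted(ranked[:n_flag])


# Ten periods with identical energy plots except period 4, which dips at scale 2.
plots = [[5.0, 6.0, 7.0] for _ in range(10)]
plots[4] = [5.0, 2.0, 7.0]
flagged = flag_periods(plots, weights=[1.0, 2.0, 3.0])
```

Only the area below the reference counts toward the score, so a period with unusually high energy is not flagged by this sketch.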

Pick Out Interesting Ones: 26, 30, 31

Validation By

WIND off-line mode
  Detailed RTT/RTO estimations
  Volume
Similar heuristics (area difference)
  CCDF of RTT/RTO
  Ratio of RTO/RTT
  Volume
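For reference, an empirical CCDF (complementary cumulative distribution function) of RTT or RTO samples can be computed as in the sketch below. The function name and the sample values are illustrative only.

```python
from typing import List, Sequence, Tuple


def ccdf(samples: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (x, P[X > x]) for each distinct sample value, in ascending order."""
    ordered = sorted(samples)
    n = len(ordered)
    points: List[Tuple[float, float]] = []
    for i, x in enumerate(ordered):
        # Emit one point per distinct value, at its last occurrence.
        if i + 1 == n or ordered[i + 1] != x:
            points.append((x, (n - i - 1) / n))
    return points


rtt_seconds = [0.1, 0.2, 0.2, 0.4]   # made-up samples for illustration
curve = ccdf(rtt_seconds)
```

A period whose curve sits well away from the reference curve has an unusual share of long RTTs or RTOs, which is what the area-difference heuristic is meant to pick up.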

Validate periods 26, 30, 31

CCDF of RTO: picks out periods 23, 26, 31

CCDF of RTT: picks out periods 29, 30, 31

80-90% are validated as interesting

Road Map

Motivation and rationale
Mechanism details
Conclusion and outlook

Summary

Detect problems using energy plots
  If self-similar, clean linear relationship
  If periodic, knees appear
  If problems, knee shifts or low energy level
WIND: the online/offline analysis tool
  Passive
  Efficient

Outlook

Full-fledged diagnosing tool
  More sophisticated heuristics
  Use of traceroute data
Illustrative examples
  Using the tool (beta release)
  Using the methodology

Questions?

http://www.tik.ee.ethz.ch/~huang