Congestion Responsiveness of Internet Traffic
(a fresh look at an old problem)
Ravi Prasad & Constantine Dovrolis
Networking and Telecommunications Group, College of Computing, Georgia Tech
TCP and Internet stability
- Stable network: the offered load stays below the capacity (ρ < 1)
  - Otherwise, persistent packet losses
  - Congestion collapse: fully utilized links, but almost zero per-flow goodput
- Conventional wisdom #1: the Internet manages to be stable due to TCP congestion control
  - TCP: more than 90% of Internet traffic
  - TCP reduces the offered load (send window) upon signs of congestion
  - Negative-feedback loop, stabilizing the queueing system
- Conventional wisdom #2: stability can be maintained without admission control or resource reservations
TCP-centric congestion control
- If all flows use TCP, or TCP-friendly congestion control, then the Internet will be stable
  - TCP congestion control -> no congestion collapse
  - "Promoting the use of end-to-end congestion control in the Internet", Floyd & Fall, ToN'99
  - "Congestion control principles", Floyd, RFC 2914, 2000
- Key modeling unit: persistent flows (they last forever!)
  - "Rate control in communication networks: shadow prices, proportional fairness and stability", Kelly et al., JORS'98
  - "Congestion control for high performance, stability, and fairness in general networks", Paganini et al., ToN'05
  - Number of active flows does not change with time
  - Infinitely long flows can be effectively controlled
- But flows are generated by users/applications, not by the transport layer!
  - Examples: user clicks on a web page, p2p movie download, machine-generated periodic FS synchronization
- Session: set of finite (i.e., non-persistent) flows, generated by a single user action
- Key issue: the session arrival process
- Does the session arrival rate decrease during congestion?
[Figure: request/response exchange between a sender and a receiver, across the application, transport, and network layers]
Two fundamental flow arrival models
- Closed-loop model
  - Fixed number of users (1, 2, 3, ..., N); each user can generate one session at a time
  - New session arrival: depends on completion of the previous session
  - E.g., ingress traffic in a campus network (student downloads)
- Open-loop model
  - Sessions arrive in the network independently of congestion
  - Theoretically, an infinite population of users
  - E.g., egress traffic at a popular Web server
- Very different models in terms of congestion responsiveness & stability
Related work
- Open-loop traffic model
  - "Statistical bandwidth sharing: a study of congestion at flow level", Fredj et al., Sigcomm'01
  - "Stability and performance analysis of networks supporting services", Veciana et al., ToN'01
- Closed-loop traffic model
  - "A new method for the analysis of feedback-based protocols with applications to engineering web traffic over the Internet", Heyman et al., Sigmetrics'99
  - "Dimensioning bandwidth for elastic traffic in high-speed data networks", Berger & Kogan, ToN'00
- Main open issues:
  1. What do the previous two models imply for the congestion responsiveness of aggregate Internet traffic?
  2. Which of the previous two models is closer to real Internet traffic?
Our contributions
- Introduce two new metrics for the congestion responsiveness of aggregate Internet traffic
  - Elasticity and instability coefficient
- Examine the congestion responsiveness of several traffic models, including open-loop, closed-loop, and mixed traffic
  - Open-loop TCP traffic is less congestion responsive than even UDP traffic!
  - Closed-loop traffic is more congestion responsive than persistent flows
- Design an experimental methodology to measure the Closed-loop Traffic Ratio (CTR)
- Measure CTR in several Internet packet traces
  - 70-90% of Internet traffic appears to be closed-loop
- Several implications for networking research & practice
Outline
- Congestion responsiveness metrics: elasticity, instability coefficient
- Results for an ideal Processor Sharing (PS) server: closed-loop and open-loop flow arrival models
- Congestion responsiveness of four traffic models: persistent TCP flows, UDP constant-rate streams, open-loop TCP flows, closed-loop TCP flows
- Congestion responsiveness of real network traffic: methodology and measurements
- Summary and implications
Elasticity metric
- Quantifies the extent to which a traffic aggregate backs off upon a congestion event
- U and U': average throughput of the aggregate traffic before and during the stimulus, respectively
- Defined as the fractional change in throughput
- Depends on the cause of the congestion event
- Canonical congestion event: a persistent TCP transfer (stimulus) that is not limited by the receiver's window
    f = (U - U') / U

- f = 1: completely responsive
- f = 0: completely unresponsive
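The definition f = (U - U')/U can be checked with a tiny helper (a sketch; the function and variable names are ours, not from the paper):

```python
def elasticity(u_before, u_during):
    """Fractional reduction of aggregate throughput during the stimulus:
    f = (U - U') / U."""
    return (u_before - u_during) / u_before

# Completely responsive traffic vacates all of its bandwidth:
print(elasticity(10.0, 0.0))   # 1.0
# Completely unresponsive traffic keeps its rate:
print(elasticity(10.0, 10.0))  # 0.0
```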
[Figure: cross-traffic and stimulus throughput over time, illustrating positive elasticity]
- Negative elasticity: when the cross-traffic increases its rate upon congestion
[Figure: cross-traffic and stimulus throughput over time, illustrating negative elasticity]
Instability Coefficient
- Quantifies whether (and how fast) a traffic aggregate can lead to congestion collapse upon congestion at time t
- Defined as Δ(t) = dN(t)/dt, where N(t) is the number of active sessions at time t
- Δ ≤ 0: fixed or decreasing number of active sessions; stable network
- Δ > 0: increasing number of active sessions; has the potential to cause congestion collapse (the larger Δ, the faster the move towards congestion collapse)
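In a trace, the coefficient dN(t)/dt can be approximated by the slope of the active-session count over a measurement window. A minimal sketch, with names of our own choosing:

```python
def instability_coefficient(times, n_active):
    """Least-squares slope of N(t): > 0 means sessions accumulate,
    signalling a potential move towards congestion collapse."""
    n = len(times)
    t_mean = sum(times) / n
    x_mean = sum(n_active) / n
    num = sum((t - t_mean) * (x - x_mean) for t, x in zip(times, n_active))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den

# Stable: the number of active sessions stays flat
print(instability_coefficient([0, 1, 2, 3], [5, 5, 5, 5]))   # 0.0
# Unstable: two extra sessions pile up every second
print(instability_coefficient([0, 1, 2, 3], [4, 6, 8, 10]))  # 2.0
```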
Instability Coefficient: simulation of a stable network (Δ = 0)
- Open-loop model: session arrival rate 200/sec
Instability Coefficient: simulation of an unstable network (Δ > 0)
- Open-loop model: session arrival rate 400/sec
Outline
- Congestion responsiveness metrics: elasticity, instability coefficient
- Results for an ideal Processor Sharing (PS) server: closed-loop and open-loop flow arrival models
- Congestion responsiveness of four traffic models: persistent TCP flows, UDP constant-rate streams, open-loop TCP flows, closed-loop TCP flows
- Congestion responsiveness of real network traffic: methodology and measurements
- Summary and implications
Closed-loop model – PS server
- N users: cycles of transfer and idle periods
- S: average session size; T_T: average transfer duration; T_I: average idle (think) time
- T_T increases during congestion
- N_a: number of active sessions
- Elasticity f = 1/(N_a + 1)
- Instability coefficient cannot be positive indefinitely (N_a < N)
    R_offered = N·S / (E[T_T] + T_I)

    E[T_T] = N·S/C - T_I,       when ρ ≥ 1
    E[N_a] = N - C·T_I/S,       when ρ ≥ 1
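Under saturation (ρ ≥ 1, so the link of capacity C delivers N·S bytes per user cycle of length T_T + T_I), the closed-loop PS relations E[T_T] = N·S/C - T_I and E[N_a] = N - C·T_I/S can be evaluated directly. A sketch with hypothetical numbers (function name is ours):

```python
def closed_loop_ps(n_users, s, c, t_idle):
    """Saturated closed-loop PS model: mean transfer duration,
    mean number of active sessions, and elasticity."""
    t_transfer = n_users * s / c - t_idle   # E[T_T] = N*S/C - T_I
    n_active = n_users - c * t_idle / s     # E[N_a] = N - C*T_I/S
    f = 1.0 / (n_active + 1.0)              # elasticity f = 1/(N_a + 1)
    return t_transfer, n_active, f

# 100 users, 1 MB sessions, a 10 MB/s link, 1 s think time:
t_t, n_a, f = closed_loop_ps(100, 1e6, 10e6, 1.0)
print(t_t, n_a)  # 9.0 90.0
```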
Open-loop model – PS server
- Poisson session arrivals
- S: average session size; λ: session arrival rate
- Offered load ρ = λS/C; stable only if ρ < 1
- Expected throughput for a new transfer: C(1-ρ), the available bandwidth
- Elasticity f = 0
- Instability coefficient Δ > 0 if ρ > 1
    ρ = λ·S / C
    R_offered = λ·S

    E[T_T] = S / (C·(1 - ρ)),       when ρ < 1
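The open-loop PS relations ρ = λS/C and E[T_T] = S/(C(1-ρ)) can be sketched the same way; with ρ ≥ 1 there is no finite mean transfer duration, since sessions keep accumulating (function name and numbers are ours):

```python
def open_loop_ps(lam, s, c):
    """Open-loop PS model: offered load and expected transfer
    duration (finite only in the stable regime rho < 1)."""
    rho = lam * s / c                   # offered load
    if rho >= 1.0:
        return rho, float('inf')        # unstable: sessions accumulate
    return rho, s / (c * (1.0 - rho))   # E[T_T] = S / (C(1 - rho))

# 5 sessions/sec of 1 MB on a 10 MB/s link:
print(open_loop_ps(5.0, 1e6, 10e6))   # (0.5, 0.2)
print(open_loop_ps(15.0, 1e6, 10e6))  # (1.5, inf)
```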
Mixed traffic
- Internet traffic: a mix of open-loop and closed-loop traffic
- Mixed traffic can be characterized by the Closed-loop Traffic Ratio (CTR)
- f_mix = CTR · f_closed
- Δ_mix > 0 when ρ_open > 1 (not when ρ_open + ρ_closed > 1)
    CTR = (traffic load from the closed-loop model) / (total traffic load)
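The two mixed-traffic relations can be sketched in a few lines (names ours): elasticity scales with the CTR, while stability hinges on the open-loop load alone, because the closed-loop share self-regulates.

```python
def mixed_elasticity(ctr, f_closed):
    """f_mix = CTR * f_closed: only the closed-loop share backs off."""
    return ctr * f_closed

def mixed_stable(rho_open, rho_closed):
    """Delta_mix > 0 only when the open-loop load alone exceeds
    capacity, not when the combined load does."""
    return rho_open <= 1.0

print(mixed_elasticity(0.8, 0.5))  # 0.4
print(mixed_stable(0.7, 0.9))      # True: rho_open < 1, even though the sum > 1
print(mixed_stable(1.2, 0.0))      # False
```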
Outline
- Congestion responsiveness metrics: elasticity, instability coefficient
- Results for an ideal Processor Sharing (PS) server: closed-loop and open-loop flow arrival models
- Congestion responsiveness of four traffic models: persistent TCP flows, UDP constant-rate streams, open-loop TCP flows, closed-loop TCP flows
- Congestion responsiveness of real network traffic: methodology and measurements
- Summary and implications
Persistent TCP transfers
- N homogeneous transfers
- Stimulus increases the RTT and loss rate from (T, p) to (T', p')
- UMass model to estimate the average TCP throughput M
- Number of transfers remains constant, i.e., Δ = 0
    f = 1 - (N·M(T', p')) / (N·M(T, p)),       where M(T, p) ≈ 1 / (T·√(2bp/3))
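With the simplified square-root throughput model M(T, p) ≈ 1/(T·√(2bp/3)) (a common simplification of the UMass model; the paper may use the full formula), the elasticity of N homogeneous flows reduces to 1 - (T/T')·√(p/p'), since N cancels. A sketch:

```python
from math import sqrt

def tcp_throughput(rtt, p, b=2):
    """Simplified square-root TCP throughput model (pkts/sec);
    b = packets acknowledged per ACK."""
    return 1.0 / (rtt * sqrt(2.0 * b * p / 3.0))

def persistent_elasticity(rtt, p, rtt2, p2):
    """f = 1 - N*M(T', p') / (N*M(T, p)); N cancels out."""
    return 1.0 - tcp_throughput(rtt2, p2) / tcp_throughput(rtt, p)

# RTT doubles and the loss rate quadruples during the stimulus:
print(persistent_elasticity(0.1, 0.01, 0.2, 0.04))  # ≈ 0.75
```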
Constant-rate UDP transfers
- Fixed number of constant-rate flows
- UDP flows do not react to congestion, and they do not retransmit lost packets
- Throughput after the stimulus: U' = (1-p)·U
- Elasticity f = p > 0
  - Truly congestion-responsive traffic should have a larger elasticity than the loss rate
- Instability coefficient is zero
  - Number of flows does not change during congestion
  - Cannot cause congestion collapse
Open-loop TCP transfers
- Poisson stream of TCP flows
  - Size uniformly distributed between 16-20 pkts
  - Arrival rate chosen to vary the offered load
- Ideally, f = 0 when ρ < 1
- But negative elasticity is possible with TCP: redundant retransmissions increase the offered load after the stimulus
- Δ is positive when ρ > 1
  - Possible congestion collapse
- Open-loop traffic is the network's worst enemy
Closed-loop TCP transfers
- When the loss rate ≈ 0 (i.e., small number of sessions):
  - Stimulus increases the RTT from T to T'
  - Transfer latency increases from kT to kT'
- With a small number of active sessions: elasticity is about constant
- With a large number of active sessions: elasticity > 1/(N_a + 1)
- Closed-loop TCP traffic: more elastic than persistent flows
    f = k·(T' - T) / (k·T' + T_I)
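The closed-loop elasticity f = k(T' - T)/(kT' + T_I) follows because, with negligible loss, a transfer of k rounds stretches from kT to kT' while the think time T_I is unaffected. A sketch with hypothetical numbers (names ours):

```python
def closed_loop_tcp_elasticity(k, rtt, rtt2, t_idle):
    """f = k(T' - T) / (kT' + T_I): fractional throughput reduction when
    the RTT grows from T to T' and loss is negligible."""
    return k * (rtt2 - rtt) / (k * rtt2 + t_idle)

# A 10-round transfer, RTT doubling from 100 ms to 200 ms, 1 s think time:
print(closed_loop_tcp_elasticity(10, 0.1, 0.2, 1.0))  # ≈ 0.333
```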
Summary

Traffic class     Elasticity                                Stability
Persistent TCP    elastic, f = 1/(N+1)                      stable
                  (N homogeneous flows)
UDP const-rate    inelastic, f = p (p: loss rate)           stable
Open-loop TCP     inelastic, f ≤ 0                          unstable if ρ > 1
Closed-loop TCP   elastic, f > 1/(N_a+1)                    stable
Outline
- Congestion responsiveness metrics: elasticity, instability coefficient
- Results for an ideal Processor Sharing (PS) server: closed-loop and open-loop flow arrival models
- Congestion responsiveness of four traffic models: persistent TCP flows, UDP constant-rate streams, open-loop TCP flows, closed-loop TCP flows
- Congestion responsiveness of real network traffic: methodology and measurements
- Summary and implications
What to measure?
- Direct elasticity measurements require packet traces at the bottleneck during the stimulus
  - We have access to only a couple of such links
- Direct measurements of the instability coefficient require packet traces during congestion events
  - We have access to only a couple of congested links
- Alternative: measure CTR (Closed-loop Traffic Ratio)
  - Indirect metric for congestion responsiveness
  - High CTR (close to one): mostly closed-loop traffic
  - Low CTR (close to zero): mostly open-loop traffic
CTR estimation (overview)
- Start with a packet trace from an Internet link
  - Per packet: arrival time, src/dst address & ports, size
  - Focus only on TCP traffic: HTTP and well-known ports
- Identify users:
  - Downloads: a user is associated with a unique DST address
  - Uploads: a user is associated with a unique SRC address
  - Multi-user hosts and NATs are a problem (see paper for details)
- For each user, identify sessions:
  - Session: one or more connections ("jobs") associated with the same user action
  - E.g., Web page download: multiple HTTP connections
- Classify sessions as open-loop or closed-loop:
  - Successive sessions from the same user: closed-loop
  - Session from a new user, or session arriving from a known user after a long idle period: open-loop
From Connections to Jobs to Sessions
- An HTTP 1.1 connection can stay alive across multiple sessions
- Job: segment of a TCP connection that belongs to a single session
- Intra-job packet interarrivals: TCP- and network-dependent (short)
- Inter-job packet interarrivals: caused by user actions (long)
- Classify interarrivals based on a Silence Threshold (STH)
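The STH rule amounts to a single pass over a connection's packet timestamps, cutting wherever a gap exceeds the threshold. A sketch (function name and the example threshold are ours):

```python
def split_into_jobs(arrivals, sth):
    """Cut a connection's packet arrival times into jobs wherever the
    interarrival gap exceeds the Silence Threshold (STH)."""
    jobs = [[arrivals[0]]]
    for prev, t in zip(arrivals, arrivals[1:]):
        if t - prev > sth:
            jobs.append([])        # inter-job gap: start a new job
        jobs[-1].append(t)
    return jobs

# Two packet bursts separated by a 5-second user pause (STH = 1 s):
jobs = split_into_jobs([0.0, 0.05, 0.11, 5.2, 5.26], sth=1.0)
print(len(jobs))                # 2
print([len(j) for j in jobs])   # [3, 2]
```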
[Figure: packet trace excerpt (timestamp, src/dst address, ports, size), annotated with short intra-job gaps (tens of ms) versus a long inter-job gap]
Silence Threshold (STH) estimation
[Figure: distribution of packet interarrivals, separating intra-job from inter-job gaps]
- Group jobs from the same user into sessions
- Intuition: jobs from the same session will have short interarrivals (machine-generated)
- Minimum Session Interarrival (MSI) threshold
  - MSI aims to distinguish machine-generated from user-initiated events
  - MSI = 1-5 seconds
[Figure: same trace excerpt; jobs with interarrivals below MSI are grouped into session 1, while gaps above MSI separate sessions 2 and 3]
Classify sessions as open/closed-loop
- The first session from a user is always open-loop
- A session from a returning user is also open-loop if it starts more than MTT seconds after completion of the last session
- MTT: Maximum Think Time
- Typically, MTT would be several minutes
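The MTT rule can be sketched as a scan over one user's time-ordered sessions (function name and the example timestamps are ours):

```python
def classify_sessions(sessions, mtt):
    """sessions: time-ordered (start, end) pairs for one user.
    A session is open-loop if it is the user's first, or if it starts
    more than MTT seconds after the previous session completed."""
    labels, last_end = [], None
    for start, end in sessions:
        if last_end is None or start - last_end > mtt:
            labels.append('open-loop')
        else:
            labels.append('closed-loop')
        last_end = end
    return labels

# MTT = 15 min: the second session starts 10 s after the first ends,
# the third arrives about an hour later.
print(classify_sessions([(0, 30), (40, 80), (3700, 3730)], mtt=900))
# ['open-loop', 'closed-loop', 'open-loop']
```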
[Figure: same trace excerpt; sessions 1 and 2 are labeled open-loop (preceding gap > MTT), session 3 is labeled closed-loop (gap < MTT)]
Robustness to MSI & MTT thresholds
- Examined the CTR variation in the following ranges:
  - MSI: 0.1 sec - 2 sec
  - MTT: 10 min - 25 min
- CTR variation < 0.05
- Linear regression:
  - ΔCTR/ΔMSI = -0.0044/sec
  - ΔCTR/ΔMTT = 0.0037/min
- We use: MSI = 1 sec, MTT = 15 min
Sample CTR measurements

Link location          Year  Direction  Duration  TCP      HTTP Download    Well-known ports
                                                  GB(%)    Bytes(%)  CTR    Bytes(%)  CTR
Georgia Tech           05    In         2 hr      129(97)  44.7      0.90   18.8      0.60
                             Out        2 hr      208(99)  37.3      0.63   10.6      0.70
Los Nettos             04    Core       1 hr       59(95)  36.2      0.93   29.3      0.83
UNC, Chapel Hill       03    In         1 hr       41(87)  22.9      0.95    3.6      0.69
                             Out        1 hr      153(97)  19.0      0.76   16.8      0.91
Abilene, Indianapolis  02    Core       1 hr      172(96)   8.0      0.78   33.9      0.91
                             Core       1 hr      178(85)  11.5      0.82   35.8      0.89
Univ. of Auckland, NZ  01    In         6 hr      0.6(95)  42.4      0.92   30.6      0.24
                             Out        6 hr      1.4(98)  70.4      0.79    7.6      0.72
Outline
- Congestion responsiveness metrics: elasticity, instability coefficient
- Results for an ideal Processor Sharing (PS) server: closed-loop and open-loop flow arrival models
- Congestion responsiveness of four traffic models: persistent TCP flows, UDP constant-rate streams, open-loop TCP flows, closed-loop TCP flows
- Congestion responsiveness of real network traffic: methodology and measurements
- Summary and implications
Summary
- Persistent transfers have very different congestion responsiveness than finite-size transfers
  - Focus on open-loop and closed-loop flow arrivals
- TCP or TCP-like protocols are not sufficient to avoid congestion collapse
- Negative feedback at the session/application layer holds the key to network stability
- Measurements show high CTR values for most Internet links we examined
  - Possibly why the Internet is mostly stable
Is AQM an effective controller?
- Most Active Queue Management (AQM) models assume persistent TCP flows
  - Provides a congestion signal to flows
  - Stabilizes buffer occupancy
  - Controls link utilization
- However, AQM is an ineffective controller in the presence of open-loop TCP traffic
  - The flow arrival process does not react to AQM drops
  - Congestion collapse is still possible with AQM
Is admission control necessary?
- Admission control is an effective way to control the offered load with open-loop traffic
  - Avoids flow aborts and reattempts
  - See proposals by J. Roberts and others
- However, admission control is not required with closed-loop traffic
  - Closed-loop traffic is self-regulating, as long as the maximum possible number of active sessions does not exceed a certain threshold
What about TCP-friendliness?
- "TCP friendliness" has been proposed for all non-TCP traffic as a way to avoid congestion collapse
- However, just like TCP itself, open-loop TCP-friendly sessions can still cause congestion collapse
- TCP friendliness is more important for fairness reasons (sharing bandwidth almost equally with TCP)
Traffic models for simulations & analysis
- Time to drop the persistent-flows assumption!
  - It is not realistic
  - It has very different congestion responsiveness than real Internet traffic
- More realistic aggregate traffic models: a mix of both open-loop and closed-loop finite-size sessions
- We need more CTR measurements to characterize the mix
- We need mathematical models for closed-loop traffic behavior, considering user behavior under congestion
Session/application congestion control
- Several existing applications generate sessions independently of network congestion (bad!)
  - Example 1: NNTP servers transfer news periodically
  - Example 2: CDN servers exchange content as needed or periodically
- Client-side control mechanism: do not start a new session before the current session completes
- Server-side control mechanism: use admission control when the number of active sessions exceeds a threshold