Congestion models for bursty TCP traffic


Damon Wischik + Mark Handley

University College London

The Royal Society

DARPA grant W911NF05-1-0254

History of TCP (Transmission Control Protocol)

• 1974: First draft of TCP/IP [“A Protocol for Packet Network Intercommunication”, Vint Cerf and Robert Kahn]

• 1983: ARPANET switches on TCP/IP

• 1986: Congestion collapse

• 1988: Congestion control for TCP

[“Congestion avoidance and control”, Van Jacobson]

“A Brief History of the Internet”, the Internet Society

TCP algorithm

// Ack-processing logic of the simulated TCP sender (NewReno-style), as shown on the slide.
if (seqno > _last_acked) {                        // this ack acknowledges new data
    if (!_in_fast_recovery) {                     // normal case: advance the window
        _last_acked = seqno;
        _dupacks = 0;
        inflate_window();                         // slow-start / congestion-avoidance growth
        send_packets(now);
        _last_sent_time = now;
        return;
    }
    if (seqno < _recover) {                       // partial ack while in fast recovery
        uint32_t new_data = seqno - _last_acked;
        _last_acked = seqno;
        if (new_data < _cwnd) _cwnd -= new_data; else _cwnd = 0;
        _cwnd += _mss;                            // deflate by data acked, inflate by one segment
        retransmit_packet(now);                   // retransmit the next missing segment
        send_packets(now);
        return;
    }
    // Full ack: everything outstanding when recovery began is now acked,
    // so deflate the window and leave fast recovery.
    uint32_t flightsize = _highest_sent - seqno;
    _cwnd = min(_ssthresh, flightsize + _mss);
    _last_acked = seqno;
    _dupacks = 0;
    _in_fast_recovery = false;
    send_packets(now);
    return;
}
// Duplicate ack (no new data acknowledged).
if (_in_fast_recovery) {
    _cwnd += _mss;                                // inflate the window for each further dup ack
    send_packets(now);
    return;
}
_dupacks++;
if (_dupacks != 3) {                              // act only on the third duplicate ack
    send_packets(now);
    return;
}
// Third duplicate ack: fast retransmit and enter fast recovery.
_ssthresh = max(_cwnd / 2, (uint32_t)(2 * _mss));
retransmit_packet(now);
_cwnd = _ssthresh + 3 * _mss;
_in_fast_recovery = true;
_recover = _highest_sent;
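As a concrete trace of the fast-retransmit arithmetic in the last few lines above, the following minimal sketch plugs in assumed values (the segment size and window are illustrative, not from the slides) and shows the window adjustment made on the third duplicate ack:

#include <algorithm>
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical sender state just before the third duplicate ack arrives.
    uint32_t mss  = 1460;        // assumed maximum segment size, in bytes
    uint32_t cwnd = 20 * mss;    // assumed congestion window: 20 segments

    // The same arithmetic as the slide's fast-retransmit branch.
    uint32_t ssthresh = std::max(cwnd / 2, (uint32_t)(2 * mss)); // halve the window
    cwnd = ssthresh + 3 * mss;   // add the three segments known to have left the network

    std::printf("ssthresh = %u segments, cwnd = %u segments\n",
                ssthresh / mss, cwnd / mss);   // prints: ssthresh = 10, cwnd = 13
    return 0;
}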

[Figure: traffic rate (0-100 kB/sec) of a TCP flow versus time (0-8 sec)]

Motivation

• We want higher throughput for TCP flows

• This requires faster routers and lower packet drop probabilities
– High-throughput TCP flows with large round trip times are especially sensitive to drops, since it takes them a long time to recover

[Figure: throughput of a TCP flow on a 100 Mb/s link over 60 seconds; each packet drop is followed by a long recovery]

• Such a network is hard to build
– Buffering becomes an ever-harder challenge as router speeds increase: DRAM access speeds double every 10 years, which cannot keep up with ever-faster linecards
– Larger buffers aren’t even very good at reducing drop probability!

• Objectives
– Understand better the nature of congestion at core routers
– Redesign TCP based on this understanding
– Rethink the buffer size for core routers

Three Modes of Congestion

• Theory predicts three qualitatively different modes of congestion, depending on buffer size
– TCP uses feedback control to adjust its rate
– The feedback loop is mediated by queues at routers
– By changing buffer size, we change the nature of the traffic and the mode of congestion

• A major difference between the modes is synchronization (illustrated by the sketch after this slide)

[Diagram: individual flow rates summing to the aggregate traffic rate, in two cases: all flows get drops at the same time (synchronization) versus drops evenly spread across flows (desynchronization)]
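The contrast between the two cases can be made concrete with a small numerical sketch. This is purely illustrative and not from the slides: the sawtooth shape, the flow count and the rates are assumptions chosen only to show how staggering the loss epochs smooths the aggregate.

// Sums N idealized sawtooth (AIMD-like) flows, once with aligned loss epochs
// (synchronized) and once with evenly staggered ones (desynchronized), and
// reports how much the aggregate rate varies in each case.
#include <algorithm>
#include <cmath>
#include <cstdio>

// Rate of one sawtooth flow at time t (period 1), with a phase offset:
// the rate ramps from 0.5 up to 1.0 and then halves.
double sawtooth(double t, double phase) {
    double x = std::fmod(t + phase, 1.0);
    return 0.5 + 0.5 * x;
}

int main() {
    const int N = 100;        // number of flows (assumed)
    const int steps = 1000;   // time samples over one sawtooth period

    for (int synced = 1; synced >= 0; --synced) {
        double lo = 1e9, hi = 0.0;
        for (int s = 0; s < steps; ++s) {
            double t = (double)s / steps;
            double aggregate = 0.0;
            for (int i = 0; i < N; ++i) {
                double phase = synced ? 0.0 : (double)i / N;  // aligned vs staggered drops
                aggregate += sawtooth(t, phase);
            }
            lo = std::min(lo, aggregate);
            hi = std::max(hi, aggregate);
        }
        std::printf("%s flows: aggregate rate varies between %.1f and %.1f\n",
                    synced ? "synchronized" : "desynchronized", lo, hi);
    }
    return 0;
}

With 100 flows the synchronized aggregate swings between roughly 50 and 100, while the desynchronized aggregate stays within about half a unit of 75, i.e. nearly constant.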

Mode I: small buffers

• e.g. a buffer of 25 packets

• System is stable

• TCP flows are desynchronized

• Queue size oscillations are very rapid

• Steady losses

• Primal fluid model

[Figure: drop probability (0-30%), queue size (5-25 packets) and utilization (0-100%) over 0-5 sec for a small buffer]

Mode II: intermediate buffers

• e.g. the McKeown √N rule

• System is unstable

• TCP flows are synchronized

• Queue size flips suddenly from empty to full, or full to empty

• Queue-based AQM cannot work

• Buffer is small enough that RTT is approximately constant

• Primal fluid model

[Figure: drop probability (0-30%), queue size (around 600-620 packets) and utilization (0-100%) over 0-5 sec for an intermediate buffer]

Mode III: large buffers

• e.g. the bandwidth-delay-product rule of thumb

• System is unstable (although it can be stabilized by e.g. RED)

• TCP flows are synchronized

• Queue size varies fluidly

• RTT varies

• Primal-dual fluid model

[Figure: drop probability (0-30%), queue size (around 43,730-43,750 packets) and utilization (0-100%) over 0-5 sec for a large buffer]
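The three mode slides above each name a different buffer-sizing rule. As a minimal sketch of how far apart these rules sit, the following computes all three for one assumed set of link parameters (the 10 Gb/s rate, 100 ms RTT, flow count and packet size are illustrative assumptions, not figures from the talk; the √N rule is taken in its usual form C·RTT/√N):

#include <cmath>
#include <cstdio>

int main() {
    // Assumed example parameters for a core link (not from the slides).
    double link_rate_bps = 10e9;      // 10 Gb/s line rate
    double rtt_sec       = 0.1;       // 100 ms round trip time
    double n_flows       = 10000;     // long-lived TCP flows sharing the link
    double pkt_bits      = 1500 * 8;  // packet size in bits

    double bdp_pkts   = link_rate_bps * rtt_sec / pkt_bits;  // rule of thumb: C * RTT
    double sqrtn_pkts = bdp_pkts / std::sqrt(n_flows);        // intermediate: C * RTT / sqrt(N)
    double small_pkts = 30;                                   // the small-buffer proposal

    std::printf("large (bandwidth-delay product): %.0f packets\n", bdp_pkts);
    std::printf("intermediate (C*RTT/sqrt(N)):    %.0f packets\n", sqrtn_pkts);
    std::printf("small (fixed proposal):          %.0f packets\n", small_pkts);
    return 0;
}

With these assumed numbers the three rules give roughly 83,000, 830 and 30 packets, differing by factors of about 100 and 30, which is why the choice of mode matters so much for router memory.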

Conclusion

• We therefore proposed
– A buffer of only 30 packets is sufficient for a core router, regardless of the line rate
– Random queue size fluctuations are only ever of the order of 30 packets; larger buffers just lead to persistent queues and synchronization
– A buffer of 30 packets gives >95% utilization, and keeps the system stable

• Other researchers ran simulations with buffers this small, and found very poor performance

Problem: TCP burstiness

• Slow access links serve to pace out TCP packets

• Fast access links allow a TCP flow to send its entire window back-to-back (a back-of-the-envelope sketch follows this slide)

• We had only simulated slow access links, and our theory only covered paced TCP traffic. Other researchers simulated faster access links.

[Figure: number of packets sent (0-25) versus time (0-5 s), for slow access links and for fast access links]
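A back-of-the-envelope sketch of the pacing effect described above (all numbers are illustrative assumptions, not measurements from the talk): the serialization time of one packet on the access link sets the minimum spacing between a flow's packets, so a slow access link spreads a window across much of the RTT while a fast one emits it as a near back-to-back burst.

#include <cstdio>

int main() {
    // Assumed illustrative values (not from the slides).
    double pkt_bits = 1500 * 8;   // packet size in bits
    double window   = 25;         // congestion window, in packets
    double rtt_ms   = 100;        // round trip time

    double access_rate_bps[] = {4e6, 1e9};                 // slow vs fast access link
    const char* label[]      = {"slow (4 Mb/s)", "fast (1 Gb/s)"};

    for (int i = 0; i < 2; ++i) {
        // Time to serialize one packet = minimum gap between this flow's packets.
        double gap_ms   = pkt_bits / access_rate_bps[i] * 1000;
        double burst_ms = window * gap_ms;   // time to emit the whole window
        std::printf("%s: packet gap %.2f ms, window of %.0f packets sent over %.1f ms "
                    "(RTT %.0f ms)\n",
                    label[i], gap_ms, window, burst_ms, rtt_ms);
    }
    return 0;
}

With these numbers the slow access link spreads the window over 75 ms of a 100 ms RTT, while the fast one delivers the same window to the core in well under a millisecond, essentially back-to-back.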

TCP burstiness

• Slow access links serve to pace out TCP packets

• Fast access links allow a TCP flow to send its entire window back-to-back

• Queueing theory suggests that queueing behaviour is governed by the buffer size B when TCP traffic is paced, but that it is governed by B/W for very bursty TCP traffic, where W is the mean window size

• For bursty traffic, the buffer should be up to 15 times bigger than we proposed (see the sketch after this slide)

• The aggregate of paced TCP traffic looks Poisson over short timescales. This drives our original model of the three modes of congestion, and is supported by theory and measurements [Bell Labs 2002, CAIDA 2004]

• We predict that the aggregate of very bursty TCP traffic should look like a batch Poisson process

[Figure: per-window packet transmission patterns for slow versus fast access links, with the corresponding drop probability, queue size and utilization traces; with fast access links a buffer of B = 300 packets behaves as a small buffer rather than an intermediate one]
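The B versus B/W point can be seen in one line of arithmetic. In the sketch below the buffer size and mean window are assumptions for illustration; the factor-of-W scaling is the reason the earlier 30-packet proposal would need to grow substantially for bursty traffic (with a mean window around 15 packets, this is the factor of 15 quoted above).

#include <cstdio>

int main() {
    // Assumed illustrative values (not measurements from the talk).
    double B = 300;   // router buffer, in packets
    double W = 15;    // mean TCP window, in packets

    // Paced traffic: queueing behaviour is governed by the buffer size B itself.
    std::printf("paced traffic:  effective buffer depth = %.0f packets\n", B);

    // Very bursty traffic: whole windows arrive back-to-back, so the buffer
    // is effectively only B/W window-sized batches deep.
    std::printf("bursty traffic: effective buffer depth = %.0f windows (B/W)\n", B / W);

    // Keeping the same effective depth therefore requires a buffer roughly
    // W times larger than for paced traffic.
    std::printf("required scaling for bursty traffic: about %.0fx\n", W);
    return 0;
}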

Limitations/concerns

• Surely bottlenecks are at the access network, not the core network?
– Unwise to rely on this!
– The small-buffer theory seems to work for as few as 200 flows

• We need more measurement of short-timescale Internet traffic statistics

• Limited validation of predictions about buffer size [McKeown et al. at Stanford, Level3, Internet2]

• Proper validation needs
– a goodly amount of traffic
– a full measurement kit
– the ability to control buffer size

Conclusion

• There are three qualitatively different modes of congestion
– Buffer size determines which mode is in operation, for paced traffic
– Buffer size divided by mean window size determines the mode, for very bursty traffic
– We have a collection of rules of thumb for quantifying whether a buffer is small, intermediate or big, and for quantifying how burstiness depends on access speeds

• These modes of congestion have several consequences
– UTILIZATION: very small buffers can cut maximum utilization by 20%; synchronization in intermediate buffers can cut it by 4%
– SYNCHRONIZED LOSSES: intermediate and large buffers lead to synchronized losses, which are detrimental to real-time traffic
– QUEUEING DELAY: large buffers lead to queueing delay; and while end systems can recover from loss, they can never recover lost time
