66
Advanced Computer Advanced Computer Networking Networking Internet Congestion Control 1

Advanced Computer Networking Internet Congestion Control 1

Embed Size (px)

Citation preview

Page 1: Advanced Computer Networking Internet Congestion Control 1

Advanced Computer Advanced Computer NetworkingNetworking

Internet Congestion Control

11

Page 2: Advanced Computer Networking Internet Congestion Control 1

22

Principles of Congestion ControlPrinciples of Congestion Control

Congestion:• informally: “too many sources sending too

much data too fast for network to handle”• manifestations:

– lost packets (buffer overflow at routers)– long delays (queuing in router buffers)

• a highly important problem!

H1

H2

R1 H3

A1(t)

10Mb/sD(t)

1.5Mb/sA2(t)

100Mb/s behnam shafagatybehnam shafagaty

Page 3: Advanced Computer Networking Internet Congestion Control 1

33

Causes/costs of congestion: scenario Causes/costs of congestion: scenario

11

• two senders, two receivers• one router, • infinite buffers • no retransmission

behnam shafagatybehnam shafagaty

Page 4: Advanced Computer Networking Internet Congestion Control 1

44

Causes/costs of congestion: scenario Causes/costs of congestion: scenario

11

• Throughput increases with load• Maximum total load C (Each session C/2)• Large delays when congested

– The load is stochastic

behnam shafagatybehnam shafagaty

Page 5: Advanced Computer Networking Internet Congestion Control 1

55

Causes/costs of congestion: scenario Causes/costs of congestion: scenario

22 one router, finite buffers sender retransmission of lost packet

behnam shafagatybehnam shafagaty

Page 6: Advanced Computer Networking Internet Congestion Control 1

66

Causes/costs of congestion: scenario Causes/costs of congestion: scenario

22 • always: (goodput)

– Like to maximize goodput!

• “perfect” retransmission:

– retransmit only when loss:

• Actual retransmission of delayed (not lost) packet

makes larger (than perfect case) for same .

in

out

=

out

in

out

>

in

behnam shafagatybehnam shafagaty

Page 7: Advanced Computer Networking Internet Congestion Control 1

77

Causes/costs of congestion: scenario Causes/costs of congestion: scenario

22

“costs” of congestion: more work (retrans) for given “goodput” unneeded retransmissions: link carries (and

delivers) multiple copies of pkt

inin '

out

out

’in

out

’in

behnam shafagatybehnam shafagaty

Page 8: Advanced Computer Networking Internet Congestion Control 1

Packet delay and throughput as functions of load

88 behnam shafagatybehnam shafagaty

Page 9: Advanced Computer Networking Internet Congestion Control 1

Congestion ControlCongestion Control

• Congestion control involves two tasks:-Detect congestion-Limit sending rate

99 behnam shafagatybehnam shafagaty

Page 10: Advanced Computer Networking Internet Congestion Control 1

TCP & AQMTCP & AQM

xi(t)

pl(t)

TCP: Reno Vegas

AQM: DropTail RED REM,PI,AVQ

Example congestion measure pl(t)

– Loss (Reno)– Queuing delay (Vegas)

Example congestion measure pl(t)

– Loss (Reno)– Queuing delay (Vegas)

1010 behnam shafagatybehnam shafagaty

Page 11: Advanced Computer Networking Internet Congestion Control 1

TCP Congestion ControlTCP Congestion Control

•End-End control (no network assistance)

•Assumes long delays (packet loss) is due to congestion

1111 behnam shafagatybehnam shafagaty

Page 12: Advanced Computer Networking Internet Congestion Control 1

Congestion Control IICongestion Control II

• TCP uses slow start and Additive Increase/multiplicative decrease (AIMD) to deal with congestion

• Van Jacobson 1988 outlined these ideas

• slow-start roughly: whenever starting traffic or recovering from congestion, start cwnd at the size of a single segment and increase it (up to a point) as ACKs show up

1212 behnam shafagatybehnam shafagaty

Page 13: Advanced Computer Networking Internet Congestion Control 1

1313

AIMDAIMD(Additive Increase / Multiplicative (Additive Increase / Multiplicative

Decrease)Decrease)• CongestionWindow (cwnd) is a variable held

by the TCP source for each connection.

• cwnd is set based on the perceived level of congestion. The Host receives implicit (packet drop) or explicit (packet mark) indications of internal congestion.

MaxWindow :: min (CongestionWindow, AdvertisedWindow)

EffectiveWindow = MaxWindow – (LastByteSent -LastByteAcked)

behnam shafagatybehnam shafagaty

Page 14: Advanced Computer Networking Internet Congestion Control 1

1414

Additive IncreaseAdditive Increase

• Additive Increase is a reaction to perceived available capacity.

• Linear Increase basic idea:: For each “cwnd’s worth” of packets sent, increase cwnd by 1 packet.

• In practice, cwnd is incremented fractionally for each arriving ACK.

increment = (MSS /cwnd)

cwnd = cwnd + increment

behnam shafagatybehnam shafagaty

Page 15: Advanced Computer Networking Internet Congestion Control 1

1515

Additive IncreaseAdditive Increase

Source Destination

Add one packet

each RTT

behnam shafagatybehnam shafagaty

Page 16: Advanced Computer Networking Internet Congestion Control 1

1616

Multiplicative DecreaseMultiplicative Decrease

The key assumption is that a dropped packet and the resultant timeout are due to congestion at a router or a switch.

Multiplicate Decrease:: TCP reacts to a timeout by halving cwnd.

cwnd is not allowed below the size of a single packet.

behnam shafagatybehnam shafagaty

Page 17: Advanced Computer Networking Internet Congestion Control 1

1717

AIMDAIMD: Some Notes: Some Notes

• It has been shown that AIMD is a necessary condition for TCP congestion control to be stable.

• Because the simple CC mechanism involves timeouts that cause retransmissions, it is important that hosts have an accurate timeout mechanism.

• Timeouts set as a function of average RTT and standard deviation of RTT.

behnam shafagatybehnam shafagaty

Page 18: Advanced Computer Networking Internet Congestion Control 1

1818

Typical TCP Congestion window Evolution Typical TCP Congestion window Evolution

behnam shafagatybehnam shafagaty

Page 19: Advanced Computer Networking Internet Congestion Control 1

1919

AIMD: AIMD: Two users, One linkTwo users, One link

BW limit

Fairness

Rate of User 1

Rat

e of

Use

r 2

behnam shafagatybehnam shafagaty

Page 20: Advanced Computer Networking Internet Congestion Control 1

2020

Slow StartSlow Start

Linear additive increase takes too long to ramp up a new TCP connection from cold start.

Beginning with TCP Tahoe, the slow start mechanism was added to provide an initial exponential increase in the size of cwnd.

behnam shafagatybehnam shafagaty

Page 21: Advanced Computer Networking Internet Congestion Control 1

2121

SloSloww Start Start1- The source starts with cwnd = 1.2- Every time an ACK arrives, cwnd is

incremented.cwnd is effectively doubled per RTT “epoch”.Two slow start situations:

At the very beginning of a connection {cold start}.

When the connection goes dead waiting for a timeout to occur (i.e, the advertized window goes to zero!)

behnam shafagatybehnam shafagaty

Page 22: Advanced Computer Networking Internet Congestion Control 1

2222

Slow StartSlow Start

Source Destination

Slow StartAdd one packet

per ACK

behnam shafagatybehnam shafagaty

Page 23: Advanced Computer Networking Internet Congestion Control 1

2323

Fast RetransmitFast Retransmit

Basic Idea:: use duplicate ACKs to signal lost packet.

Fast RetransmitUpon receipt of three duplicate ACKs, the TCP Sender

retransmits the lost packet.

behnam shafagatybehnam shafagaty

Page 24: Advanced Computer Networking Internet Congestion Control 1

2424

Fast RetransmitFast Retransmit

• Generally, fast retransmit eliminates about half timeouts.

• This yields roughly a 20% improvement in throughput.

• Note – fast retransmit does not eliminate all the timeouts due to small window sizes at the source.

behnam shafagatybehnam shafagaty

Page 25: Advanced Computer Networking Internet Congestion Control 1

2525

Fast RetransmitFast Retransmit

Packet 1

Packet 2

Packet 3

Packet 4

Packet 5

Packet 6

Retransmitpacket 3

ACK 1

ACK 2

ACK 2

ACK 2

ACK 6

ACK 2

Sender Receiver

Fast Retransmit

Based on three

duplicate ACKs

behnam shafagatybehnam shafagaty

Page 26: Advanced Computer Networking Internet Congestion Control 1

2626

TCP Congestion Window TCP Congestion Window TraceTrace

0

10

20

30

40

50

60

70

0 10 20 30 40 50 60

Time

Co

ng

esti

on

Win

do

w

threshold

congestionwindowtimeouts

slow start period

additive increase

fast retransmission

behnam shafagatybehnam shafagaty

Page 27: Advanced Computer Networking Internet Congestion Control 1

2727

Fast RecoveryFast Recovery• Fast recovery was added with TCP Reno.

Fast Recovery•In congestion avoidance mode, if duplicate acks are received, reduce cwnd to half.•If n successive duplicate acks are received, we know that receiver got n segments after lost segment: Advance cwnd by that number.

behnam shafagatybehnam shafagaty

Page 28: Advanced Computer Networking Internet Congestion Control 1

2828

Adaptive RetransmissionsAdaptive Retransmissions

RTT:: Round Trip Time between a pair of hosts on the Internet.

• How to set the TimeOut value?– The timeout value is set as a function

of the expected RTT.– Consequences of a bad choice?

behnam shafagatybehnam shafagaty

Page 29: Advanced Computer Networking Internet Congestion Control 1

2929

Original AlgorithmOriginal Algorithm

• Keep a running average of RTT and compute TimeOut as a function of this RTT.– Send packet and keep timestamp ts .

– When ACK arrives, record timestamp ta .

SampleRTT = ta - ts

behnam shafagatybehnam shafagaty

Page 30: Advanced Computer Networking Internet Congestion Control 1

3030

Original AlgorithmOriginal Algorithm

Compute a weighted average:

EstimatedRTT = EstimatedRTT = αα x x EstimatedRTT + EstimatedRTT + ( (1- 1- αα) x SampleRTT) x SampleRTT

Original TCP spec: αα in range (0.8,0.9) in range (0.8,0.9)

TimeOut = 2 x TimeOut = 2 x EstimatedRTTEstimatedRTT

behnam shafagatybehnam shafagaty

Page 31: Advanced Computer Networking Internet Congestion Control 1

3131

Karn/Partidge AlgorithmKarn/Partidge Algorithm

An obvious flaw in the original algorithm:

Whenever there is a retransmission it is impossible to know whether to associate the ACK with the original packet or the retransmitted packet.

behnam shafagatybehnam shafagaty

Page 32: Advanced Computer Networking Internet Congestion Control 1

3232

Associating the ACK?Associating the ACK?

Sender Receiver

Original transmission

ACK

Retransmission

Sender Receiver

Original transmission

ACK

Retransmission

(a) (b)

behnam shafagatybehnam shafagaty

Page 33: Advanced Computer Networking Internet Congestion Control 1

3333

Karn/Partidge AlgorithmKarn/Partidge Algorithm

1. Do not measure SampleRTTSampleRTT when sending packet more than once.

2. For each retransmission, set TimeOutTimeOut to double the last TimeOutTimeOut.{ Note – this is a form of exponential backoff based on the believe that the lost packet is due to congestion.}

behnam shafagatybehnam shafagaty

Page 34: Advanced Computer Networking Internet Congestion Control 1

3434

Jaconson/Karels AlgorithmJaconson/Karels AlgorithmThe problem with the original algorithm is that it did

not take into account the variance of SampleRTT.

Difference = SampleRTT – EstimatedRTTDifference = SampleRTT – EstimatedRTTEstimatedRTT = EstimatedRTT +EstimatedRTT = EstimatedRTT +

((δδ x Difference)x Difference)Deviation =Deviation = δδ (|Difference| - Deviation)(|Difference| - Deviation)

where δδ is a fraction between 0 and 1.

behnam shafagatybehnam shafagaty

Page 35: Advanced Computer Networking Internet Congestion Control 1

3535

Jaconson/Karels AlgorithmJaconson/Karels Algorithm

TCP computes timeout using both the mean and variance of RTT

TimeOut =TimeOut = µµ x EstimatedRTT x EstimatedRTT ++ ΦΦ x Deviationx Deviation

where based on experience µ = 1µ = 1 and ΦΦ = 4 = 4.

behnam shafagatybehnam shafagaty

Page 36: Advanced Computer Networking Internet Congestion Control 1

AlgorithmsAlgorithms

3636 behnam shafagatybehnam shafagaty

Page 37: Advanced Computer Networking Internet Congestion Control 1

Early TCP Early TCP

• Pre-1988• Go-back-N ARQ

– Detects loss from timeout– Retransmits from lost packet onward

• Receiver window flow control– Prevent overflows at receive buffer

• Flow control: self-clocking

3737 behnam shafagatybehnam shafagaty

Page 38: Advanced Computer Networking Internet Congestion Control 1

Why Flow Control?Why Flow Control?

• October 1986, Internet had its first congestion collapse

• Link LBL to UC Berkeley – 400 yards, 3 hops, 32 Kbps– throughput dropped to 40 bps– factor of ~1000 drop!

• 1988, Van Jacobson proposed TCP flow control

3838 behnam shafagatybehnam shafagaty

Page 39: Advanced Computer Networking Internet Congestion Control 1

Effect of CongestionEffect of Congestion• Packet loss• Retransmission• Reduced throughput• Congestion collapse due to

– Unnecessarily retransmitted packets– Undelivered or unusable packets

• Congestion may continue after the overload!throughput

load3939 behnam shafagatybehnam shafagaty

Page 40: Advanced Computer Networking Internet Congestion Control 1

Window Flow ControlWindow Flow Control

• ~ W packets per RTT• Lost packet detected by missing ACK

RTT

time

time

Source

Destination

1 2 W

1 2 W

1 2 W

data ACKs

1 2 W

4040 behnam shafagatybehnam shafagaty

Page 41: Advanced Computer Networking Internet Congestion Control 1

Window flow controlWindow flow control

• Limit the number of packets in the network to window W

• Source rate = bps

• If W too small then rate « capacityIf W too big then rate > capacity

=> congestion

• Adapt W to network (and conditions)W = BW x RTT

RTT

MSSW

4141 behnam shafagatybehnam shafagaty

Page 42: Advanced Computer Networking Internet Congestion Control 1

TCP Window Flow ControlsTCP Window Flow Controls

• Receiver flow control– Avoid overloading receiver– Set by receiver– awnd: receiver (advertised) window

• Network flow control– Avoid overloading network– Set by sender– Infer available network capacity– cwnd: congestion window

• Set W = min (cwnd, awnd)4343 behnam shafagatybehnam shafagaty

Page 43: Advanced Computer Networking Internet Congestion Control 1

Receiver Flow ControlReceiver Flow Control

• Receiver advertises awnd with each ACK

• Window awnd– closed when data is received and ack’d– opened when data is read

• Size of awnd can be the performance limit (e.g. on a LAN)– sensible default ~16kB

4444 behnam shafagatybehnam shafagaty

Page 44: Advanced Computer Networking Internet Congestion Control 1

Network Flow ControlNetwork Flow Control

• Source calculates cwnd from indication of network congestion

• Congestion indications– Losses – Delay– Marks

• Algorithms to calculate cwnd– Tahoe, Reno, Vegas, RED, REM …

4545 behnam shafagatybehnam shafagaty

Page 45: Advanced Computer Networking Internet Congestion Control 1

TCP Congestion ControlsTCP Congestion Controls• Tahoe (Jacobson 1988)

– Slow Start– Congestion Avoidance– Fast Retransmit

• Reno (Jacobson 1990)– Fast Recovery

• Vegas (Brakmo & Peterson 1994)– New Congestion Avoidance

• RED (Floyd & Jacobson 1993)– Probabilistic marking

• REM (Athuraliya & Low 2000)– Clear buffer, match rate4646 behnam shafagatybehnam shafagaty

Page 46: Advanced Computer Networking Internet Congestion Control 1

VariantsVariants

• Tahoe & Reno– NewReno– SACK– Rate-halving– Mod.s for high performance

• AQM– RED, ARED, FRED, SRED– BLUE, SFB– REM, PI, AVQ4747 behnam shafagatybehnam shafagaty

Page 47: Advanced Computer Networking Internet Congestion Control 1

TCP Tahoe TCP Tahoe (Jacobson 1988)(Jacobson 1988)

SStime

window

CA

SS: Slow Start

CA: Congestion Avoidance

4848 behnam shafagatybehnam shafagaty

Page 48: Advanced Computer Networking Internet Congestion Control 1

Slow StartSlow Start

data packet

ACK

receiversender

1 RTT

cwnd1

2

34

5678

cwnd cwnd + 1 (for each ACK) 5050 behnam shafagatybehnam shafagaty

Page 49: Advanced Computer Networking Internet Congestion Control 1

Congestion AvoidanceCongestion Avoidance

cwnd1

2

3

1 RTT

4

data packet

ACK

cwnd cwnd + 1 (for each cwnd ACKS)

receiversender

5252 behnam shafagatybehnam shafagaty

Page 50: Advanced Computer Networking Internet Congestion Control 1

Packet LossPacket Loss

• Assumption: loss indicates congestion

• Packet loss detected by– Retransmission TimeOuts (RTO timer)– Duplicate ACKs (at least 3)

1 2 3 4 5 6

1 2 3

Packets

Acknowledgements

3 3

7

35353 behnam shafagatybehnam shafagaty

Page 51: Advanced Computer Networking Internet Congestion Control 1

Fast RetransmitFast Retransmit

• Wait for a timeout is quite long• Immediately retransmits after 3

dupACKs without waiting for timeout• Adjusts ssthresh

flightsize = min(awnd, cwnd)ssthresh max(flightsize/2, 2)

• Enter Slow Start (cwnd = 1)

5454 behnam shafagatybehnam shafagaty

Page 52: Advanced Computer Networking Internet Congestion Control 1

Summary: TahoeSummary: Tahoe• Basic ideas

– Gently probe network for spare capacity– Drastically reduce rate on congestion– Windowing: self-clocking– Other functions: round trip time estimation,

error recoveryfor every ACK { if (W < ssthresh) then W++ (SS) else W += 1/W (CA)}for every loss {

ssthresh = W/2 W = 1 }

5656 behnam shafagatybehnam shafagaty

Page 53: Advanced Computer Networking Internet Congestion Control 1

Fast recoveryFast recovery• Motivation: prevent `pipe’ from emptying after fast

retransmit• Idea: each dupACK represents a packet having left

the pipe (successfully received)• Enter FR/FR after 3 dupACKs

– Set ssthresh max(flightsize/2, 2)– Retransmit lost packet– Set cwnd ssthresh + ndup (window inflation)– Wait till W=min(awnd, cwnd) is large enough; transmit

new packet(s)– On non-dup ACK (1 RTT later), set cwnd ssthresh

(window deflation)

• Enter CA

5959 behnam shafagatybehnam shafagaty

Page 54: Advanced Computer Networking Internet Congestion Control 1

9

94

0 0

Example: FR/FRExample: FR/FR

• Fast retransmit– Retransmit on 3 dupACKs

• Fast recovery– Inflate window while repairing loss to fill pipe

timeS

timeR

1 2 3 4 5 6 87

8

cwnd 8ssthresh

1

74

0 0 0

Exit FR/FR

44

411

00

10 11

6060 behnam shafagatybehnam shafagaty

Page 55: Advanced Computer Networking Internet Congestion Control 1

Summary: RenoSummary: Reno

• Basic ideas– Fast recovery avoids slow start– dupACKs: fast retransmit + fast recovery– Timeout: fast retransmit + slow start

slow start retransmit

congestion avoidance FR/FR

dupACKs

timeout

6161 behnam shafagatybehnam shafagaty

Page 56: Advanced Computer Networking Internet Congestion Control 1

NewReno: MotivationNewReno: Motivation

• On 3 dupACKs, receiver has packets 2, 4, 6, 8, cwnd=8, retransmits pkt 1, enter FR/FR

• Next dupACK increment cwnd to 9• After a RTT, ACK arrives for pkts 1 & 2, exit FR/FR,

cwnd=5, 8 unack’ed pkts• No more ACK, sender must wait for timeout

1 2time

S

timeD

3 4 5 6 87 1

8

FR/FR

09

0 0 0 0 0

9

8 unack’d pkts

2

5

3

timeout

6262 behnam shafagatybehnam shafagaty

Page 57: Advanced Computer Networking Internet Congestion Control 1

NewRenoNewReno Fall & Floyd ‘96, (RFC 2583)Fall & Floyd ‘96, (RFC 2583)

• Motivation: multiple losses within a window– Partial ACK acknowledges some but not all

packets outstanding at start of FR– Partial ACK takes Reno out of FR, deflates window– Sender may have to wait for timeout before

proceeding

• Idea: partial ACK indicates lost packets– Stays in FR/FR and retransmits immediately– Retransmits 1 lost packet per RTT until all lost

packets from that window are retransmitted– Eliminates timeout

6363 behnam shafagatybehnam shafagaty

Page 58: Advanced Computer Networking Internet Congestion Control 1

SACK SACK Mathis, Mahdavi, Floyd, Romanow ’96 (RFC 2018, RFC Mathis, Mahdavi, Floyd, Romanow ’96 (RFC 2018, RFC

2883)2883)

• Motivation: Reno & NewReno retransmit at most 1 lost packet per RTT– Pipe can be emptied during FR/FR with multiple

losses

• Idea: SACK provides better estimate of packets in pipe– SACK TCP option describes received packets– On 3 dupACKs: retransmits, halves window, enters

FR– Updates pipe = packets in pipe

• Increment when lost or new packets sent• Decrement when dupACK received

– Transmits a (lost or new) packet when pipe < cwnd– Exit FR when all packets outstanding when FR was

entered are acknowledged6464 behnam shafagatybehnam shafagaty

Page 59: Advanced Computer Networking Internet Congestion Control 1

TCP Vegas TCP Vegas (Brakmo & Peterson 1994)(Brakmo & Peterson 1994)

• Reno with a new congestion avoidance algorithm• Converges (provided buffer is large) !

SStime

window

CA

6565 behnam shafagatybehnam shafagaty

Page 60: Advanced Computer Networking Internet Congestion Control 1

for every RTT

{

if W/RTTmin – W/RTT < then W ++

if W/RTTmin – W/RTT > then W --

}

for every loss

W := W/2

Congestion avoidanceCongestion avoidance

• Each source estimates number of its own packets in pipe from RTT

• Adjusts window to maintain estimate between d and d

6666 behnam shafagatybehnam shafagaty

Page 61: Advanced Computer Networking Internet Congestion Control 1

ImplicationsImplications• Congestion measure = end-to-end

queueing delay• At equilibrium

– Zero loss– Stable window at full utilization– Approximately weighted proportional

fairness– Nonzero queue, larger for more sources

• Convergence to equilibrium– Converges if sufficient network buffer– Oscillates like Reno otherwise6767 behnam shafagatybehnam shafagaty

Page 62: Advanced Computer Networking Internet Congestion Control 1

Wireless TCPWireless TCP

• Reno uses loss as congestion measure

• In wireless, significant losses due to– Fading– Interference– Handover– Not buffer overflow (congestion)

• Halving window too drastic– Small throughput, low utilization

6868 behnam shafagatybehnam shafagaty

Page 63: Advanced Computer Networking Internet Congestion Control 1

Proposed solutionsProposed solutions

• Ideas– Hide from source noncongestion losses– Inform source of noncongestion losses

• Approaches– Link layer error control– Split TCP– Snoop agent– SACK+ELN (Explicit Loss Notification)

6969 behnam shafagatybehnam shafagaty

Page 64: Advanced Computer Networking Internet Congestion Control 1

Third approachThird approach

• Problem– Reno uses loss as congestion measure– Two types of losses

• Congestion loss: retransmit + reduce window• Noncongestion loss: retransmit

– Previous approaches• Hide noncongestion losses• Indicate noncongestion losses

– Our approach• Eliminates congestion losses (buffer overflows)

7070 behnam shafagatybehnam shafagaty

Page 65: Advanced Computer Networking Internet Congestion Control 1

Third approachThird approach

• Idea– REM clears buffer– Only noncongestion losses – Retransmits lost packets without reducing window

RouterREM capable

HostDo not use loss as congestion measure

Vegas

REM

7171 behnam shafagatybehnam shafagaty

Page 66: Advanced Computer Networking Internet Congestion Control 1

PerformancePerformance• Goodput

7272 behnam shafagatybehnam shafagaty