92
Univ. of Tehran Computer Network 1 Computer Networks Computer Networks (Graduate level) University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani Lecture 8: Congestion Control

Computer Networks (Graduate level)

  • Upload
    gilead

  • View
    43

  • Download
    0

Embed Size (px)

DESCRIPTION

Computer Networks (Graduate level). Lecture 8: Congestion Control. University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani. Congestion Control. Congestion control basics TCP congestion control Assigned reading [JK88] Congestion Avoidance and Control - PowerPoint PPT Presentation

Citation preview

Page 1: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 1

Computer Computer NetworksNetworks

(Graduate level)

University of TehranDept. of EE and Computer Engineering

By:Dr. Nasser Yazdani

Lecture 8: Congestion Control

Page 2: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 2

Congestion Control Congestion control basics TCP congestion control Assigned reading

[JK88] Congestion Avoidance and Control

[CJ89] Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

Page 3: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 3

Overview Congestion sources and collapse

Congestion control basics

TCP congestion control

TCP interactions

Page 4: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 4

Why End-to-End Protocols?

Underlying best-effort network Drop/ reorder messages delivers duplicate copies of a given message limits messages to some finite size delivers messages after an arbitrarily long delay multiple application processes on each host Different speed of sender and receiver (Flow control) Congestion in the network (Congestion controls)

Initially, there was no end to end protocol. Now: UDP: A simple end to end protocol TCP: Reliable Transport protocol

Page 5: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 5

Reliable Transport (TCP) Communication abstraction:

Connection oriented, Point to point Reliable

Error Detection and correction Ordered Byte-stream

Application writes bytes TCP sends segments Application reads bytes

Full duplex, two way connection Flow and congestion controlled

Protocol implemented entirely at the ends Fate sharing

Page 6: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 6

Difference From Link Layers

Logical link vs. physical link Must establish connection Variable RTT

May vary within a connection Reordering packets

How long can packets live max segment lifetime Can’t expect endpoints to exactly match link

Buffer space availability Packets in transmission, delay X bandwidth

Transmission rate Don’t directly know media/network transmission rate

(Congestion)

Try to adapt to the situation.

Page 7: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 7

Congestion

Different sources compete for resources inside network where thery are unaware of current state of resource and each other

In general it is resource allocation problem. manifestations:

lost packets (buffer overflow at routers) long delays (queuing in router buffers)

10 Mbps

100 Mbps

1.5 Mbps

Page 8: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 8

Causes/costs of congestion: scenario 1

two senders, two receivers

one router, infinite buffers

no retransmission large delays

when congested maximum

achievable throughput

Page 9: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 9

Causes/costs of congestion: scenario 2

one router, finite buffers sender retransmission of lost

packet

Page 10: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 10

Causes/costs of congestion: scenario 2

always: (goodput)

“perfect” retransmission only when loss:

retransmission of delayed (not lost) packet makes

larger (than perfect case) for same

in

out

=

in

out

>

in

“costs” of congestion: more work (retrans) for given “goodput” unneeded retransmissions: link carries multiple

copies of pkt

Page 11: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 11

Causes/costs of congestion: scenario 3

four senders multihop paths timeout/retransmit

inQ: what happens as

and increase ?

in

Page 12: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 12

Causes/costs of congestion: scenario 3

Another “cost” of congestion: when packet dropped, any “upstream

transmission capacity used for that packet was wasted!

Page 13: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 13

Approaches towards congestion control

End-end congestion control:

no explicit feedback from network

congestion inferred from end-system observed loss, delay

approach taken by TCP

Network-assisted congestion control:

routers provide feedback to end systems single bit indicating

congestion (SNA, DECbit, TCP/IP ECN, ATM)

explicit rate sender should send at

Two broad approaches towards congestion control:

Page 14: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 14

Congestion Collapse Increase in network load results in decrease

of useful work done and crash the network ability to deliver data

Possible causes Spurious retransmissions of packets still in flight

Classical congestion collapse How can this happen with packet conservation? Solution: better timers and TCP congestion control

Undelivered packets Packets consume resources and are dropped elsewhere

in network Solution: congestion control for ALL traffic

Page 15: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 15

Other Congestion Collapse Causes

Fragments Mismatch of transmission and retransmission units Solutions

Make network drop all fragments of a packet (early packet discard in ATM)

Do path MTU discovery Control traffic

Large percentage of traffic is for control Headers, routing messages, DNS, etc.

Stale or unwanted packets Packets that are delayed on long queues “Push” data that is never used

Page 16: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 16

Where to Prevent Collapse?

Can end hosts prevent problem? Yes, but must trust end hosts to do right thing E.g., sending host must adjust amount of data

it puts in the network based on detected congestion

Can routers prevent collapse? No, not all forms of collapse Doesn’t mean they can’t help Sending accurate congestion signals Isolating well-behaved from ill-behaved

sources

Page 17: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 17

Congestion Control and Avoidance

A mechanism which: Uses network resources efficiently Preserves fair network resource allocation Prevents or avoids collapse

Congestion collapse is not just a theory Has been frequently observed in many

networks It is a top 10 problem.

Page 18: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 18

Congestion Collapse and Efficiency

knee – point after which throughput increases slowly delay increases quickly

cliff – point after which throughput decreases

quickly to zero (congestion collapse)

delay goes to infinity Congestion avoidance

stay at knee Congestion control

stay left of (but usually close to) cliff

Note (in an M/M/1 queue) delay = 1/(1 – utilization)

Load

Load

Th

rou

ghp

ut

De

lay

knee cliff

over utilization

under utilization

saturation

congestion collapse

Page 19: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 19

Goals Operate near the knee point Remain in equilibrium How to maintain equilibrium?

Don’t put a packet into network until another packet leaves. How do you do it?

Use ACK: send a new packet only after you receive and ACK. Why?

Maintain number of packets in network “constant”

Page 20: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 20

How Do You Do It? Detect when network

approaches/reaches knee point Stay there

Questions How do you get there? What if you overshoot (i.e., go over knee

point) ? Possible solution:

Increase window size until you notice congestion

Decrease window size if network congested

Page 21: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 21

Overview Congestion sources and collapse

Congestion control basics

TCP congestion control

TCP interactions

Page 22: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 22

Control System Model [CJ89]

Simple, yet powerful model Explicit binary signal of congestion

Why explicit (TCP uses implicit)? Implicit allocation of bandwidth

User 1

User 2

User n

x1

x2

xn

xi>Xgoal

y

Page 23: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 23

Objectives Simple router behavior Distributedness Efficiency: Xknee = xi(t) Fairness: (xi)2/n(xi

2) Power: (throughput/delay) Convergence: control system must

be stable, responsiveness.

Page 24: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 24

Power Power (ratio of throughput to

delay)

Optimalload Load

Th

rou

ghp

ut/d

elay

Page 25: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 25

Fair Allocation Maxmin fairness

Flows which share the same bottleneck get the same amount of bandwidth

Assumes no knowledge of priorities

Fairness = 1 - distance from fairness line

User 1: x1U

ser

2: x

2

2 user example

2 gettingtoo much

1 getting too much

fairnessline

2

2

i

i

xn

xxF

Page 26: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 26

Basic Control Model Let’s assume window-based control Reduce window when congestion is

perceived How is congestion signaled?

Either mark or drop packets When is a router congested?

Drop tail queues – when queue is full Average queue length – at some threshold

Increase window otherwise Probe for available bandwidth – how?

Page 27: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 27

Linear Control Many different possibilities for reaction

to congestion and probing Examine simple linear controls Window(t + 1) = a + b Window(t) Different ai/bi for increase and ad/bd for

decrease Supports various reaction to signals

Increase/decrease additively Increased/decrease multiplicatively Which of the four combinations is optimal?

Page 28: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 28

Possible Choices

Multiplicative increase, additive decrease aI=0, bI>1, aD<0, bD=1

Additive increase, additive decrease aI>0, bI=1, aD<0, bD=1

Multiplicative increase, multiplicative decrease aI=0, bI>1, aD=0, 0<bD<1

Additive increase, multiplicative decrease aI>0, bI=1, aD=0, 0<bD<1

Which one?

decreasetxba

increasetxbatx

iDD

iIIi )(

)()1(

Page 29: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 29

Phase plots What are desirable properties? What if flows are not equal?

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s

Allocation x2

Optimal point

Overload

Underutilization

Page 30: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 30

Phase plots Simple way to visualize behavior of

competing connections over time

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s

Allocation x2

Page 31: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 31

Additive Increase/Decrease

T0

T1

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s

Allocation x2

Both X1 and X2 increase/decrease by the same amount over time Additive increase improves efficiency and

additive decrease reduces efficiency

Page 32: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 32

Muliplicative Increase/Decrease

Both X1 and X2 increase by the same factor over time Extension from origin – constant

fairness

T0

T1

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s

Allocation x2

Page 33: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 33

Convergence to Efficiency

xH

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s

Allocation x2

Page 34: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 34

Convergence to Fairness

xH

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s Allocation

x2

xH’

Page 35: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 35

Convergence to Efficiency & Fairness

xH

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s Allocation

x2

xH’

Page 36: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 36

Increase

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s Allocation

x2

xL

Page 37: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 37

Constraints Distributed efficiency

I.e., Window(t+1) > Window(t) during increase

ai > 0 & bi > 1 Similarly, ad < 0 & bd < 1

Must never decrease fairness a & b’s must be > 0 ai/bi > 0 and ad/bd 0

Full constraints ad = 0, 0 bd < 1, ai > 0 and bi = 1

Page 38: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 38

What is the Right Choice? Constraints limit us to AIMD

Can have multiplicative term in increase

AIMD moves towards optimal point

x0

x1

x2

Efficiency Line

Fairness Line

User 1’s Allocation x1

User 2’s Allocation

x2

Page 39: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 39

Overview Congestion sources and collapse

Congestion control basics

TCP congestion control

TCP interactions

Page 40: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 40

TCP Congestion Control Motivated by ARPANET congestion collapse Underlying design principle: packet

conservation At equilibrium, inject packet into network only

when one is removed Basis for stability of physical systems

Why was this not working? Connection doesn’t reach equilibrium Spurious retransmissions Resource limitations prevent equilibrium

Page 41: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 41

TCP Congestion Control - Solutions

Reaching equilibrium Slow start

Eliminates spurious retransmissions Accurate RTO estimation Fast retransmit

Adapting to resource availability Congestion avoidance

Page 42: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 42

TCP Congestion Control Basics Keep a congestion window, cwnd

Denotes how much network is able to absorb

Sender’s maximum window: Min (advertised window, cwnd)

Sender’s actual window: Max window - unacknowledged segments

If we have large actual window, should we send data in one shot? No, use acks to clock sending new data

Page 43: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 43

Self-clocking

PrPb

Ar

Ab

ReceiverSender

As

Page 44: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 44

Slow Start How do we get this clocking behavior

to start? Initialize cwnd = 1 Upon receipt of every ack, cwnd = cwnd

+ 1 Implications

Window actually increases to W in RTT * log2(W)

Can overshoot window and cause packet loss

Page 45: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 45

Slow Start Example

1

One RTT

One pkt time

0R

2

1R

3

4

2R

567

83R

91011

1213

1415

1

2 3

4 5 6 7

Page 46: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 46

Slow Start Sequence Plot

Time

Sequence No

.

.

.

Page 47: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 47

Congestion Avoidance Loss implies congestion – why?

Not necessarily true on all link types If loss occurs when cwnd = W

Network can handle 0.5W ~ W segments Set cwnd to 0.5W (multiplicative

decrease) Upon receiving ACK

Increase cwnd by 1/cwnd Results in additive increase

Page 48: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 48

Return to Slow Start If packet is lost we lose our self

clocking as well Need to implement slow-start and

congestion avoidance together When timeout occurs set ssthresh to

0.5w If cwnd < ssthresh, use slow start Else use congestion avoidance

Page 49: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 49

Overall TCP Behavior

Time

Window

Page 50: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 50

Congestion Window

Time

CongestionWindow

Slow start with each time out

Time out Time out

Slow Start

•Time out is just wasting resource and time•How to prevent it: Do not wait! Fast retransmission

Page 51: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 51

Fast Retransmit Resend a segment

after 3 duplicate ACKs A duplicate ACK means

that an out-of sequence segment was received

Notes: duplicate ACKs due to

packet reordering why reordering?

window may be too small to get duplicate ACKs

Then what? Slow start

ACK 2

segment 1cwnd = 1

cwnd = 2 segment 2segment 3

ACK 4cwnd = 4 segment 4

segment 5segment 6segment 7

ACK 3

3 duplicateACKs

ACK 4

ACK 4

ACK 4

Page 52: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 52

Fast Recovery A duplicate ack notifies sender that a packet

has departed network When < cwnd packets are outstanding

Allow new packets out with each new duplicate acknowledgement

Behavior Sender is idle for some time – waiting for ½ cwnd

worth of dupacks Transmits at original rate after wait

Ack clocking rate is same as before loss At the end: No Slow start: W=W/2 and got to

AIMD

Page 53: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 53

Fast Retransmit

Time

Sequence NoDuplicate Acks

RetransmissionX

Page 54: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 54

Fast Recovery

Time

Sequence NoSent for each dupack after

W/2 dupacks arrive

Page 55: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 55

Multiple Losses

Time

Sequence NoDuplicate Acks

RetransmissionX

X

XX

Now what?

Page 56: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 56

Time

Sequence NoX

X

XX

Tahoe

Slow start again

Page 57: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 57

TCP Reno (1990) All mechanisms in Tahoe Addition of fast-recovery

Opening up congestion window after fast retransmit Delayed acks Header prediction

Implementation designed to improve performance Has common case code inlined

With multiple losses, Reno typically timeouts because it does not see duplicate acknowlegements

Page 58: Computer Networks (Graduate level)

58

TCP Reno

Fast retransmit: retransmit a segment after 3 DUP Acks

Fast recovery: reduce cwnd to half instead of to one

Time

cwnd

Slow Start

CongestionAvoidance

Timeout

Fast Recovery

Fast recovery

Page 59: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 59

Reno

Time

Sequence NoX

X

XX

Now what? - timeout

Page 60: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 60

NewReno The ack that arrives after

retransmission (partial ack) should indicate that a second loss occurred

When does NewReno timeout? When there are fewer than three

dupacks for first loss When partial ack is lost

How fast does it recover losses? One per RTT

Page 61: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 61

NewReno

Time

Sequence NoX

X

XX

Now what? – partial ackrecovery

Page 62: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 62

SACK Basic problem is that cumulative

acks only provide little information Ack for just the packet received

What if acks are lost? carry cumulative also

Not used Bitmask of packets received

Selective acknowledgement (SACK)

How to deal with reordering

Page 63: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 63

SACK

Time

Sequence NoX

X

XX

Now what? – sendretransmissions as soonas detected

Page 64: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 64

Performance Issues Timeout >> fast rexmit

Need 3 dupacks/sacks Not great for small transfers

Don’t have 3 packets outstanding What are real loss patterns like?

Right edge recovery Allow packets to be sent on arrival of

first and second duplicate ack Helps recovery for small windows

How to deal with reordering?

Page 65: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 65

NewReno Changes

Send a new packet out for each pair of dupacks Adapt more gradually to new window

Will not halve congestion window again until recovery is completed Identifies congestion events vs. congestion

signals Initial estimation for ssthresh

Page 66: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 66

Rate Halving Recovery

Time

Sequence NoSent after every

other dupack

Page 67: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 67

Delayed Ack Impact TCP congestion control triggered

by acks If receive half as many acks window

grows half as fast Slow start with window = 1

Will trigger delayed ack timer First exchange will take at least

200ms Start with > 1 initial window

Bug in BSD, now a “feature”/standard

Page 68: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 68

TCP Congestion Control end-end control (no network assistance) transmission rate limited by congestion window size, Congwin,

over segments:

w segments, each with MSS bytes sent in one RTT:

throughput = w * MSS

RTT Bytes/sec

Congwin

Page 69: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 69

Fast Retransmit and Recovery in Reno

Upon reception of 3 dupes, thresh= thresh/2. Set the congwin = thresh + 3

This accounts for the 3 packets that have left the network. Increment congwin for each dupe subsequently

received. Transmit a new segment if we are allowed When a new ack* finally arrives, set

congwin=thresh and we are in congestion avoidance again with a “deflated window”.

*”new ack” means an ack for any data not yet acked, but inside congwin.

Page 70: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 70

Fast Retransmit and Recovery in NewReno

Upon reception of 3 dupes, thresh= thresh/2. Set the congwin = thresh + 3

This accounts for the 3 packets that have left the network. Increment congwin for each dup subsequently

received. Transmit a new segment if we are allowed When a new ack* finally arrives, set congwin=thresh

and we are in congestion avoidance again.* But only do this if the ack received is for the highest

seq# sent, avoiding a stall while recovering.

Page 71: Computer Networks (Graduate level)

71

TCP & Routers

How Routers can help Congestion control Indeed, Congestion control and queue

management are the same problem “Resource Allocation”.

RED XCP Read

Chapter 6 of the book, also look at [FJ93] Random Early Detection Gateways for

Congestion Avoidance

Page 72: Computer Networks (Graduate level)

72

Queuing Disciplines Each router must implement some

queuing discipline Queuing allocates both bandwidth

and buffer space: Bandwidth: which packet to serve

(transmit) next Buffer space: which packet to drop next

(when required) Queuing also affects latency

Page 73: Computer Networks (Graduate level)

73

Packet Drop Dimensions

AggregationPer-connection state Single class

Drop positionHead Tail

Random location

Class-based queuing

Early drop Overflow drop

Page 74: Computer Networks (Graduate level)

74

Typical Internet Queuing FIFO + drop-tail

Simplest choice Used widely in the Internet

FIFO (first-in-first-out) Implies single class of traffic

Drop-tail Arriving packets get dropped when queue is

full regardless of flow or importance Important distinction:

FIFO: scheduling discipline Drop-tail: drop policy

Page 75: Computer Networks (Graduate level)

75

Active Queue Management

Design active router queue management to aid congestion control

Why? Routers can distinguish between

propagation and persistent queuing delays

Routers can decide on transient congestion, based on workload

Page 76: Computer Networks (Graduate level)

76

Active Queue Designs Modify both router and hosts

DECbit: congestion bit in packet header Modify router, hosts use TCP

Fair queuing Per-connection buffer allocation

RED (Random Early Detection) Drop packet or set bit in packet header as

soon as congestion is starting

Page 77: Computer Networks (Graduate level)

77

Random Early Detection (RED)

Detect incipient congestion, allow bursts Keep power (throughput/delay) high

Keep average queue size low Assume hosts respond to lost packets

Avoid window synchronization Randomly mark packets

Avoid bias against bursty traffic Some protection against ill-behaved

users

Page 78: Computer Networks (Graduate level)

78

RED Algorithm Maintain running average of queue

length If avgq < minth do nothing

Low queuing, send packets through If avgq > maxth, drop packet

Protection from misbehaving sources Else mark packet in a manner

proportional to queue length Notify sources of incipient congestion

Page 79: Computer Networks (Graduate level)

79

RED OperationMin threshMax thresh

Average Queue Length

minth maxth

maxP

1.0

Avg queue length

P(drop)

Page 80: Computer Networks (Graduate level)

80

RED Algorithm Maintain running average of queue

length Byte mode vs. packet mode – why?

For each packet arrival Calculate average queue size (avg) If minth ≤ avgq < maxth

Calculate probability Pa

With probability Pa

Mark the arriving packet

Else if maxth ≤ avg Mark the arriving packet

Page 81: Computer Networks (Graduate level)

81

Queue Estimation Standard EWMA: avgq = (1-wq) avgq + wqqlen

Special fix for idle periods – why? Upper bound on wq depends on minth

Want to ignore transient congestion Can calculate the queue average if a burst arrives

Set wq such that certain burst size does not exceed minth

Lower bound on wq to detect congestion relatively quickly

Typical wq = 0.002

Page 82: Computer Networks (Graduate level)

82

Thresholds minth determined by the utilization

requirement Tradeoff between queuing delay and utilization

Relationship between maxth and minth

Want to ensure that feedback has enough time to make difference in load

Depends on average queue increase in one RTT

Paper suggest ratio of two Current rule of thumb is factor of three

Page 83: Computer Networks (Graduate level)

84

Packet Marking maxp is reflective of typical loss rates Paper uses 0.02

0.1 is more realistic value If network needs marking of 20-30%

then need to buy a better link! Gentle variant of RED (recommended)

Vary drop rate from maxp to 1 as the avgq varies from maxth to 2* maxth

More robust to setting of maxth and maxp

Page 84: Computer Networks (Graduate level)

85

Extending RED for Flow Isolation

Problem: what to do with non-cooperative flows?

Fair queuing achieves isolation using per-flow state – expensive at backbone routers How can we isolate unresponsive flows

without per-flow state? RED penalty box

Monitor history for packet drops, identify flows that use disproportionate bandwidth

Isolate and punish those flows

Page 85: Computer Networks (Graduate level)

86

FRED Fair Random Early Drop (Sigcomm, 1997) Maintain per flow state only for active

flows (ones having packets in the buffer) minq and maxq min and max number of

buffers a flow is allowed occupy avgcq = average buffers per flow Strike count of number of times flow has

exceeded maxq

Page 86: Computer Networks (Graduate level)

87

Feedback

Round Trip Time

Congestion Window

Congestion Header

Feedback

Round Trip Time

Congestion Window

How does XCP Work?

Feedback = + 0.1 packet

Page 87: Computer Networks (Graduate level)

88

Feedback = + 0.1 packet

Round Trip Time

Congestion Window

Feedback = - 0.3 packet

How does XCP Work?

Page 88: Computer Networks (Graduate level)

89

Congestion Window = Congestion Window + Feedback

Routers compute feedback without any per-flow state Routers compute feedback without any per-flow state

How does XCP Work?

XCP extends ECN and CSFQ

Page 89: Computer Networks (Graduate level)

90

How Does an XCP Router Compute the Feedback?

Congestion Controller

Fairness ControllerGoal: Divides between

flows to converge to fairness

Looks at a flow’s state in Congestion Header

Algorithm:If > 0 Divide equally

between flowsIf < 0 Divide between flows proportionally to their

current rates

MIMD AIMD

Goal: Matches input traffic to link capacity & drains the

queueLooks at aggregate traffic

& queue

Algorithm:Aggregate traffic changes by

~ Spare Bandwidth

~ - Queue SizeSo, = davg Spare -

Queue

Congestion Controller

Fairness Controller

Page 90: Computer Networks (Graduate level)

91

= davg Spare - Queue

224

0 2 and

Theorem: System converges to optimal

utilization (i.e., stable) for any link bandwidth, delay,

number of sources if:

(Proof based on Nyquist Criterion)

Getting the devil out of the details …

Congestion Controller Fairness Controller

No Parameter Tuning

No Parameter Tuning

Algorithm:If > 0 Divide equally between

flowsIf < 0 Divide between flows

proportionally to their current rates

Need to estimate number of flows N

Tinpkts pktpkt RTTCwndTN

)/(1

RTTpkt : Round Trip Time in header

Cwndpkt : Congestion Window in header

T: Counting Interval

No Per-Flow StateNo Per-Flow State

Page 91: Computer Networks (Graduate level)

92

Lessons TCP alternatives

TCP being used in new/unexpected ways Key changes needed

Routers FIFO, drop-tail interacts poorly with TCP Various schemes to desynchronize flows and

control loss rate Fair-queuing

Clean resource allocation to flows Complex packet classification and scheduling

Core-stateless FQ & XCP Coarse-grain fairness Carrying packet state can reduce complexity

Page 92: Computer Networks (Graduate level)

Univ. of Tehran Computer Network 93

Next Lecture: TCP behavior and New Versions

High speed TCPs Assigned reading

[BP95] TCP Vegas: End to End Congestion Avoidance on a Global Internet

[FHPW00] Equation-Based Congestion Control for Unicast Applications