Operating Systems and Networks
Network Lecture 10: Congestion Control
Adrian Perrig, Network Security Group, ETH Zürich
Where we are in the Course
• More fun in the Transport Layer!
  – The mystery of congestion control
  – Depends on the Network layer too
[Figure: layer stack — Application, Transport, Network, Link, Physical]
Topic
• Understanding congestion, a “traffic jam” in the network
  – Later we will learn how to control it

Nature of Congestion
• Routers/switches have internal buffering for contention
[Figure: input buffers, switching fabric, output buffers]
Nature of Congestion (2)
• Simplified view of per-port output queues
  – Typically FIFO (First In First Out), discard when full
[Figure: router with a FIFO queue of queued packets per output port]
Nature of Congestion (3)
• Queues help by absorbing bursts when input > output rate
• But if input > output rate persistently, the queue will overflow
  – This is congestion
• Congestion is a function of the traffic patterns
  – Can occur even if every link has the same capacity
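A toy queue model illustrates the distinction; the tick-based arrival and service numbers below are invented for illustration:

```python
def simulate_queue(arrivals, service_rate, buffer_size):
    """Per-tick queue occupancy at an output port; drops when the buffer fills."""
    queue, drops = 0, 0
    occupancy = []
    for pkts in arrivals:
        queue += pkts
        if queue > buffer_size:               # buffer overflow: congestion loss
            drops += queue - buffer_size
            queue = buffer_size
        queue = max(0, queue - service_rate)  # drain at the output link rate
        occupancy.append(queue)
    return occupancy, drops

# A short burst is absorbed by the buffer...
_, d1 = simulate_queue([5, 0, 0, 0, 0], service_rate=1, buffer_size=10)
# ...but persistent input > output rate overflows it
_, d2 = simulate_queue([3] * 10, service_rate=1, buffer_size=10)
# d1 == 0, d2 > 0
```

The burst and the sustained load have the same buffer and link, only the traffic pattern differs, which is exactly the point of the slide.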
Effects of Congestion
• What happens to performance as we increase the load?
Effects of Congestion (3)
• As offered load rises, congestion occurs as queues begin to fill:
  – Delay and loss rise sharply with more load
  – Throughput falls below load (due to loss)
  – Goodput may fall below throughput (due to spurious retransmissions)
• None of the above is good!
  – Want to operate the network just before the onset of congestion
Bandwidth Allocation
• An important task for the network is to allocate its capacity to senders
  – A good allocation is efficient and fair
• Efficient means most capacity is used but there is no congestion
• Fair means every sender gets a reasonable share of the network
Bandwidth Allocation (2)
• Key observation:
  – In an effective solution, Transport and Network layers must work together
• Network layer witnesses congestion
  – Only it can provide direct feedback
• Transport layer causes congestion
  – Only it can reduce offered load
Bandwidth Allocation (3)
• Why is it hard? (Just split equally!)
  – Number of senders and their offered load is constantly changing
  – Senders may lack capacity in different parts of the network
  – Network is distributed; no single party has an overall picture of its state
Bandwidth Allocation (4)
• Solution context:
  – Senders adapt concurrently based on their own view of the network
  – Design this adaptation so the network usage as a whole is efficient and fair
  – Adaptation is continuous since offered loads continue to change over time
Topics
• Nature of congestion
• Fair allocations
• AIMD control law
• TCP congestion control history
• ACK clocking
• TCP slow-start
• TCP fast retransmit/recovery
• Congestion avoidance (ECN)
Fairness of Bandwidth Allocation (§6.3.1)
• What’s a “fair” bandwidth allocation?
  – The max-min fair allocation
Recall
• We want a good bandwidth allocation to be fair and efficient
  – Now we learn what fair means
• Caveat: in practice, efficiency is more important than fairness
Efficiency vs. Fairness
• Cannot always have both!
  – Example network with traffic A→B, B→C and A→C
  – How much traffic can we carry?
[Figure: A—B—C, each link capacity 1]
Efficiency vs. Fairness (2)
• If we care about fairness:
  – Give equal bandwidth to each flow
  – A→B: ½ unit, B→C: ½, and A→C: ½
  – Total traffic carried is 1½ units
Efficiency vs. Fairness (3)
• If we care about efficiency:
  – Maximize total traffic in the network
  – A→B: 1 unit, B→C: 1, and A→C: 0
  – Total traffic rises to 2 units!
The Slippery Notion of Fairness
• Why is “equal per flow” fair anyway?
  – A→C uses more network resources (two links) than A→B or B→C
  – Host A sends two flows, B sends one
• Not productive to seek exact fairness
  – More important to avoid starvation
  – “Equal per flow” is good enough
Generalizing “Equal per Flow”
• Bottleneck for a flow of traffic is the link that limits its bandwidth
  – Where congestion occurs for the flow
  – For A→C, link A–B is the bottleneck
[Figure: A—B—C, link A–B capacity 1, link B–C capacity 10]
Generalizing “Equal per Flow” (2)
• Flows may have different bottlenecks
  – For A→C, link A–B is the bottleneck
  – For B→C, link B–C is the bottleneck
  – Can no longer divide links equally…
Max-Min Fairness
• Intuitively, flows bottlenecked on a link get an equal share of that link
• Max-min fair allocation is one in which:
  – Increasing the rate of one flow would require decreasing the rate of another flow with a smaller (or equal) rate
  – This “maximizes the minimum” flow
Max-Min Fairness (2)
• To find it given a network, imagine “pouring water into the network”:
  1. Start with all flows at rate 0
  2. Increase the flows until there is a new bottleneck in the network
  3. Hold fixed the rate of the flows that are bottlenecked
  4. Go to step 2 for any remaining flows
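The water-pouring procedure can be sketched directly. This is an illustrative sketch: the full router topology of the upcoming example is not recoverable from the text, so the per-flow link sets below are an assumption chosen to be consistent with the stated answer (A = 2/3; B, C, D = 1/3).

```python
def max_min_fair(flows, capacity):
    """Water-filling: raise all active flows equally; freeze flows on any
    link that saturates; repeat until every flow is bottlenecked."""
    rate = {f: 0.0 for f in flows}
    active = set(flows)
    while active:
        # Smallest equal increment that saturates some link carrying an active flow
        inc = min(
            (cap - sum(rate[f] for f in flows if link in flows[f]))
            / sum(1 for f in active if link in flows[f])
            for link, cap in capacity.items()
            if any(link in flows[f] for f in active)
        )
        for f in active:
            rate[f] += inc
        for link, cap in capacity.items():        # freeze flows on full links
            used = sum(rate[f] for f in flows if link in flows[f])
            if used >= cap - 1e-9:
                active -= {f for f in active if link in flows[f]}
    return rate

# Hypothetical topology matching the example's answer: A and B share link
# R2-R3; B, C, and D share link R4-R5; all capacities 1.
flows = {"A": {"R2R3"}, "B": {"R2R3", "R4R5"}, "C": {"R4R5"}, "D": {"R4R5"}}
caps = {"R2R3": 1.0, "R4R5": 1.0}
max_min_fair(flows, caps)   # → A = 2/3, B = C = D = 1/3
```

The first round raises all four flows to 1/3 and fills R4–R5 (freezing B, C, D); the second round raises A alone to 2/3, filling R2–R3.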
Max-Min Example
• Example: network with 4 flows, links of equal bandwidth
  – What is the max-min fair allocation?
Max-Min Example (2)
• When rate = 1/3, flows B, C, and D bottleneck link R4–R5
  – Fix B, C, and D; continue to increase A
Max-Min Example (3)
• When rate = 2/3, flow A bottlenecks link R2–R3. Done.
Max-Min Example (4)
• End with A = 2/3; B, C, D = 1/3; and links R2–R3, R4–R5 full
  – Other links have extra capacity that can’t be used
Adapting over Time
• Allocation changes as flows start and stop

Adapting over Time (2)
[Figure: flow rates over time — Flow 1 slows when Flow 2 starts; Flow 1 speeds up when Flow 2 stops; Flow 3’s limit is elsewhere]
Recall
• Want to allocate capacity to senders
  – Network layer provides feedback
  – Transport layer adjusts offered load
  – A good allocation is efficient and fair
• How should we perform the allocation?
  – Several different possibilities…
Bandwidth Allocation Models
• Open loop versus closed loop
  – Open: reserve bandwidth before use
  – Closed: use feedback to adjust rates
• Host versus network support
  – Who sets/enforces allocations?
• Window versus rate based
  – How is allocation expressed?

TCP is closed-loop, host-driven, and window-based
Bandwidth Allocation Models (2)
• We’ll look at closed-loop, host-driven, and window-based allocation
• Network layer returns feedback on the current allocation to senders
  – At least tells if there is congestion
• Transport layer adjusts the sender’s behavior via its window in response
  – How senders adapt is a control law
Additive Increase Multiplicative Decrease (AIMD) (§6.3.2)
• Bandwidth allocation models
  – Additive Increase Multiplicative Decrease (AIMD) control law
[Figure: AIMD sawtooth]
Additive Increase Multiplicative Decrease
• AIMD is a control law hosts can use to reach a good allocation
  – Hosts additively increase rate while the network is not congested
  – Hosts multiplicatively decrease rate when congestion occurs
  – Used by TCP ☺
• Let’s explore the AIMD game…
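A minimal simulation of the game, assuming a single unit-capacity bottleneck and illustrative step parameters (`add`, `mult` are made-up values, not TCP constants):

```python
def aimd_round(x, y, capacity=1.0, add=0.01, mult=0.5):
    """One round of the AIMD game for two hosts sharing one bottleneck."""
    if x + y > capacity:               # router's binary feedback: congested
        return x * mult, y * mult      # multiplicative decrease
    return x + add, y + add            # additive increase

x, y = 0.9, 0.05                       # very unfair starting allocation
for _ in range(10_000):
    x, y = aimd_round(x, y)

# The rates end up near the fair line (x ≈ y) and oscillate
# just under the efficient line (x + y ≈ 1)
```

Additive increase moves the allocation parallel to the fair line (the gap between hosts stays constant), while each multiplicative decrease halves that gap, which is why the trajectory converges regardless of the starting point.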
35
AIMD Game
• Hosts 1 and 2 share a bottleneck
  – But do not talk to each other directly
• Router provides binary feedback
  – Tells hosts if the network is congested
[Figure: Hosts 1 and 2 each with a rate-1 link into a router; a rate-1 bottleneck link leads to the rest of the network]
AIMD Game (2)
• Each point is a possible allocation
[Figure: Host 1 rate vs. Host 2 rate, each axis 0 to 1; the “Fair” and “Efficient” lines cross at the optimal allocation; the region beyond “Efficient” is congested]
AIMD Game (3)
• AI and MD move the allocation
[Figure: same plot; Fair line y = x, Efficient line x + y = 1; additive increase moves the point at 45°, multiplicative decrease moves it toward the origin]
AIMD Game (4)
• Play the game!
[Figure: trajectory from a starting point under AI and MD moves]
AIMD Game (5)
• Always converges to the good allocation!
[Figure: trajectory converging to the fair, efficient point]
AIMD Sawtooth
• Produces a “sawtooth” pattern over time for the rate of each host
  – This is the TCP sawtooth (later)
[Figure: host rate over time — additive increase ramps, multiplicative decrease drops]
AIMD Properties
• Converges to an allocation that is efficient and fair when hosts run it
  – Holds for more general topologies
• Other increase/decrease control laws do not! (Try MIAD, MIMD, AIAD)
• Requires only binary feedback from the network
Feedback Signals
• Several possible signals, with different pros/cons
  – We’ll look at classic TCP, which uses packet loss as a signal

Signal            | Example Protocol                            | Pros/Cons
Packet loss       | TCP NewReno, Cubic TCP (Linux)              | + Hard to get wrong / − Hear about congestion late
Packet delay      | Compound TCP (Windows)                      | + Hear about congestion early / − Need to infer congestion
Router indication | TCPs with Explicit Congestion Notification  | + Hear about congestion early / − Require router support
History of TCP Congestion Control (§6.5.10)
• The story of TCP congestion control
  – Collapse, control, and diversification
Congestion Collapse in the 1980s
• Early TCP used a fixed-size sliding window (e.g., 8 packets)
  – Initially fine for reliability
• But something strange happened as the ARPANET grew
  – Links stayed busy but transfer rates fell by orders of magnitude!
Congestion Collapse (2)
• Queues became full, retransmissions clogged the network, and goodput fell
[Figure: goodput vs. offered load, collapsing at high load — congestion collapse]
Van Jacobson (1950–)
• Widely credited with saving the Internet from congestion collapse in the late ’80s
  – Introduced congestion control principles
  – Practical solutions (TCP Tahoe/Reno)
• Much other pioneering work:
  – Tools like traceroute, tcpdump, pathchar
  – IP header compression, multicast tools

Source: Wikipedia (public domain)
TCP Tahoe/Reno
• Avoid congestion collapse without changing routers (or even receivers)
• Idea is to fix timeouts and introduce a congestion window (cwnd) over the sliding window to limit queues/loss
• TCP Tahoe/Reno implements AIMD by adapting cwnd using packet loss as the network feedback signal
TCP Tahoe/Reno (2)
• TCP behaviors we will study:
  – ACK clocking
  – Adaptive timeout (mean and variance)
  – Slow-start
  – Fast Retransmission
  – Fast Recovery
• Together, they implement AIMD
TCP Timeline
[Timeline, 1970–1990: Origins of “TCP” (Cerf & Kahn, ’74); 3-way handshake (Tomlinson, ’75); TCP and IP (RFC 791/793, ’81); TCP/IP “flag day” (BSD Unix 4.2, ’83); congestion collapse observed, ’86; TCP Tahoe (Jacobson, ’88); TCP Reno (Jacobson, ’90). Pre-history, then congestion control…]
TCP Timeline (2)
[Timeline, 1990–2010: TCP Reno (Jacobson, ’90); TCP Vegas (Brakmo, ’93, delay-based); ECN (Floyd, ’94, router support); TCP NewReno (Hoe, ’95); TCP with SACK (Floyd, ’96); TCP BIC (Linux, ’04); FAST TCP (Low et al., ’04, delay-based); TCP CUBIC (Linux, ’06); Compound TCP (Windows, ’07); TCP LEDBAT (IETF, ’08, background). Classic congestion control, then diversification…]
TCP ACK Clocking (§6.5.10)
• The self-clocking behavior of sliding windows, and how it is used by TCP
  – The “ACK clock”
Sliding Window ACK Clock
• Each in-order ACK advances the sliding window and lets a new segment enter the network
  – ACKs “clock” data segments
[Figure: ACKs 1–10 returning as data segments 11–20 are sent]
Benefit of ACK Clocking
• Consider what happens when a sender injects a burst of segments into the network
[Figure: fast link → slow (bottleneck) link → fast link; a queue builds at the bottleneck]
Benefit of ACK Clocking (2)
• Segments are buffered and spread out on the slow link
[Figure: segments “spread out” across the slow (bottleneck) link]
Benefit of ACK Clocking (3)
• ACKs maintain the spread back to the original sender
[Figure: ACKs maintain the spread on the return path past the slow link]
Benefit of ACK Clocking (4)
• Sender clocks new segments with the spread
  – Now sending at the bottleneck link rate without queuing!
[Figure: segments stay spread; the queue no longer builds]
Benefit of ACK Clocking (5)
• Helps the network run with low levels of loss and delay!
• The network has smoothed out the burst of data segments
• The ACK clock transfers this smooth timing back to the sender
• Subsequent data segments are not sent in bursts, so they do not queue up in the network
TCP Uses ACK Clocking
• TCP uses a sliding window because of the value of ACK clocking
• The sliding window controls how many segments are inside the network
  – Called the congestion window, or cwnd
  – Rate is roughly cwnd/RTT
• TCP only sends small bursts of segments to let the network keep the traffic smooth
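The rate ≈ cwnd/RTT rule is easy to sanity-check with arithmetic; the window size and RTT below are illustrative:

```python
cwnd_bytes = 64 * 1024            # 64 KB congestion window (illustrative)
rtt_s = 0.1                       # 100 ms round-trip time (illustrative)

# One window's worth of data is delivered per RTT:
rate_bps = cwnd_bytes * 8 / rtt_s
print(rate_bps / 1e6)             # ≈ 5.24 Mbit/s
```

Doubling either cwnd or halving the RTT doubles the achievable rate, which is why TCP must work across a very large range of both.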
TCP Slow Start (§6.5.10)
• How TCP implements AIMD, part 1
  – “Slow start” is a component of the AI portion of AIMD
Considerations
• We want TCP to follow an AIMD control law for a good allocation
• Sender uses a congestion window or cwnd to set its rate (≈ cwnd/RTT)
• Sender uses packet loss as the network congestion signal
• Need TCP to work across a very large range of rates and RTTs
TCP Startup Problem
• We want to quickly get near the right rate, cwndIDEAL, but it varies greatly
  – A fixed sliding window doesn’t adapt and is rough on the network (loss!)
  – AI with small bursts adapts cwnd gently to the network, but might take a long time to become efficient
Slow-Start Solution
• Start by doubling cwnd every RTT
  – Exponential growth (1, 2, 4, 8, 16, …)
  – Start slow, quickly reach large values
[Figure: window (cwnd) over time — slow-start rises much faster than AI or a fixed window]
Slow-Start Solution (2)
• Eventually packet loss will occur when the network is congested
  – A loss timeout tells us cwnd is too large
  – Next time, switch to AI beforehand
  – Slowly adapt cwnd near the right value
• In terms of cwnd:
  – Expect loss for cwndC ≈ 2BD + queue (where BD is the bandwidth-delay product)
  – Use ssthresh = cwndC/2 to switch to AI after observing loss
Slow-Start Solution (3)
• Combined behavior, after the first time
  – Most time spent near the right value
[Figure: window over time — slow-start up to ssthresh, then an AI phase approaching cwndIDEAL, compared with fixed and pure-AI windows]
Slow-Start (Doubling) Timeline
• Increment cwnd by 1 segment size for each ACK

Additive Increase Timeline
• Increment cwnd by 1 segment size every cwnd ACKs (or 1 RTT)
TCP Tahoe (Implementation)
• Initial slow-start (doubling) phase
  – Start with cwnd = 1 (or small value)
  – cwnd += 1 segment size per ACK
• Later Additive Increase phase
  – cwnd += 1/cwnd segments per ACK
  – Roughly adds 1 segment size per RTT
• Switching threshold (initially infinity)
  – Switch to AI when cwnd > ssthresh
  – Set ssthresh = cwnd/2 after loss
  – Begin with slow-start after timeout
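The three rules above can be condensed into a sketch (a simplification: cwnd is counted in segments here, while real implementations count bytes):

```python
class TahoeCwnd:
    """Sketch of TCP Tahoe's congestion window adaptation (cwnd in segments)."""

    def __init__(self):
        self.cwnd = 1.0
        self.ssthresh = float("inf")      # switching threshold, initially infinite

    def on_ack(self):
        if self.cwnd < self.ssthresh:     # slow start: +1 per ACK => doubles per RTT
            self.cwnd += 1.0
        else:                             # additive increase: +1/cwnd per ACK => ~+1 per RTT
            self.cwnd += 1.0 / self.cwnd

    def on_timeout(self):
        self.ssthresh = self.cwnd / 2     # remember half the window at loss
        self.cwnd = 1.0                   # begin again with slow start

tahoe = TahoeCwnd()
for _ in range(7):
    tahoe.on_ack()        # slow start: cwnd grows 1 -> 8
tahoe.on_timeout()        # ssthresh = 4, cwnd back to 1
```

After the timeout, slow start runs only until cwnd reaches ssthresh (4 here), then the gentler additive increase takes over.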
Timeout Misfortunes
• Why do a slow-start after timeout?
  – Instead of an MD of cwnd (for AIMD)
• Timeouts are sufficiently long that the ACK clock will have run down
  – Slow-start ramps up the ACK clock
• We need to detect loss before a timeout to get to full AIMD
  – Done in TCP Reno
TCP Fast Retransmit / Fast Recovery (§6.5.10)
• How TCP implements AIMD, part 2
  – “Fast retransmit” and “fast recovery” are the MD portion of AIMD
[Figure: AIMD sawtooth]
Recall
• We want TCP to follow an AIMD control law for a good allocation
• Sender uses a congestion window or cwnd to set its rate (≈ cwnd/RTT)
• Sender uses slow-start to ramp up the ACK clock, followed by Additive Increase
• But after a timeout, the sender slow-starts again with cwnd = 1 (as it has no ACK clock)
Inferring Loss from ACKs
• TCP uses a cumulative ACK
  – Carries highest in-order seq. number
  – Normally a steady advance
• Duplicate ACKs give us hints about what data hasn’t arrived
  – Tell us some new data did arrive, but it was not the next segment
  – Thus the next segment may be lost
Fast Retransmit
• Treat three duplicate ACKs as a loss
  – Retransmit the next expected segment
  – Some repetition allows for reordering, but still detects loss quickly
[Figure: ACK stream 1, 2, 3, 4, 5, 5, 5, 5, 5, 5]
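The triple-duplicate-ACK rule can be sketched with a hypothetical helper that scans a stream of cumulative ACK numbers (the helper and its name are illustrative, not part of any real TCP stack):

```python
def detect_fast_retransmit(acks, threshold=3):
    """Return the ACK numbers at which fast retransmit triggers, i.e. where
    `threshold` duplicates of the same cumulative ACK have been seen."""
    triggers = []
    last_ack, dups = None, 0
    for a in acks:
        if a == last_ack:
            dups += 1
            if dups == threshold:        # third duplicate: resend the segment
                triggers.append(a)       # following the ACKed data
        else:
            last_ack, dups = a, 0
    return triggers

# ACK stream like the next slide's: steady advance, then duplicates at 13
detect_fast_retransmit([10, 11, 12, 13, 13, 13, 13, 13])   # → [13]
```

Requiring three duplicates (not one) is the tolerance for reordering mentioned above: a single out-of-order segment produces one or two duplicate ACKs, not three.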
Fast Retransmit (2)
[Figure: sender–receiver timeline. ACKs 10–13 arrive normally; Data 14 was lost earlier, but 15 to 20 arrived, so duplicate ACK 13s come back. On the third duplicate ACK, the sender retransmits 14. The retransmission fills in the hole at 14, and the ACK jumps (to 20) after the loss is repaired.]
Fast Retransmit (3)
• It can repair a single segment loss quickly, typically before a timeout
• However, we have quiet time at the sender/receiver while waiting for the ACK to jump
• And we still need to MD cwnd…
Inferring Non-Loss from ACKs
• Duplicate ACKs also give us hints about what data has arrived
  – Each new duplicate ACK means that some new segment has arrived
  – It will be the segments after the loss
  – Thus advancing the sliding window will not increase the number of segments stored in the network
Fast Recovery
• First fast retransmit, and MD cwnd
• Then pretend further duplicate ACKs are the expected ACKs
  – Lets new segments be sent for ACKs
  – Reconcile views when the ACK jumps
[Figure: ACK stream 1, 2, 3, 4, 5, 5, 5, 5, 5, 5]
Fast Recovery (2)
[Figure: sender–receiver timeline. Data 14 was lost earlier, but 15 to 20 arrived, so duplicate ACK 13s come back. On the third duplicate ACK, the sender retransmits 14 and sets ssthresh and cwnd to cwnd/2. Further duplicate ACKs advance the window, so the sender may send new segments (21, 22) before the ACK jumps. The retransmission fills in the hole at 14; when ACK 20 arrives, the sender exits fast recovery.]
Fast Recovery (3)
• With fast retransmit, it repairs a single segment loss quickly and keeps the ACK clock running
• This allows us to realize AIMD
  – No timeouts or slow-start after loss, just continue with a smaller cwnd
• TCP Reno combines slow-start, fast retransmit and fast recovery
  – Multiplicative Decrease is ½
[Figure: TCP Reno sawtooth — MD of ½ with no slow-start, ACK clock kept running]
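Reno’s two reactions to loss can be contrasted in a sketch (cwnd in segments for clarity; real implementations count bytes and do more bookkeeping during recovery):

```python
class RenoCwnd:
    """Sketch of how Reno reacts to its two loss signals (cwnd in segments)."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.ssthresh = float("inf")

    def on_triple_dup_ack(self):
        # Fast retransmit + fast recovery: MD of 1/2. The ACK clock keeps
        # running, so there is no need for slow start.
        self.ssthresh = self.cwnd / 2
        self.cwnd = self.ssthresh

    def on_timeout(self):
        # The ACK clock has run down: fall back to slow start, as in Tahoe.
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1.0

reno = RenoCwnd(cwnd=16)
reno.on_triple_dup_ack()   # cwnd: 16 -> 8, continue in additive increase
reno.on_timeout()          # cwnd: 8 -> 1, slow start again
```

This is the distinction the slides draw: duplicate ACKs mean data is still flowing, so Reno only halves the window; a timeout means the pipe has drained, so it must rebuild the ACK clock from cwnd = 1.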
TCP Reno, NewReno, and SACK
• Reno can repair one loss per RTT
  – Multiple losses cause a timeout
• NewReno further refines the ACK heuristics
  – Repairs multiple losses without timeout
• SACK is a better idea
  – Receiver sends ACK ranges so the sender can retransmit without guesswork
Explicit Congestion Notification (§5.3.4, §6.5.10)
• How routers can help hosts to avoid congestion
  – Explicit Congestion Notification
Congestion Avoidance vs. Control
• Classic TCP drives the network into congestion and then recovers
  – Needs to see loss to slow down
• Would be better to use the network but avoid congestion altogether!
  – Reduces loss and delay
• But how can we do this?
Feedback Signals
• Delay and router signals can let us avoid congestion

Signal            | Example Protocol                            | Pros/Cons
Packet loss       | Classic TCP, Cubic TCP (Linux)              | + Hard to get wrong / − Hear about congestion late
Packet delay      | Compound TCP (Windows)                      | + Hear about congestion early / − Need to infer congestion
Router indication | TCPs with Explicit Congestion Notification  | + Hear about congestion early / − Require router support
ECN (Explicit Congestion Notification)
• Router detects the onset of congestion via its queue
  – When congested, it marks affected packets (IP header)
ECN (2)
• Marked packets arrive at the receiver; treated as loss
  – TCP receiver reliably informs the TCP sender of the congestion
ECN (3)
• Advantages
  – Routers deliver a clear signal to hosts
  – Congestion is detected early, with no loss
  – No extra packets need to be sent
• Disadvantage
  – Routers and both sender and receiver must be upgraded
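The router/receiver halves of the mechanism can be sketched with dict-based stand-ins for headers. This is illustrative only: real ECN uses two IP-header bits (ECT/CE) plus the TCP ECE and CWR flags, and the marking threshold below is invented.

```python
MARK_THRESHOLD = 5   # queue length at which the router marks (illustrative)

def router_forward(queue_length, ip_header):
    """Router side: when the queue builds, mark the packet instead of dropping it."""
    if queue_length > MARK_THRESHOLD:
        ip_header["CE"] = True        # Congestion Experienced mark in the IP header
    return ip_header

def receiver_ack(ip_header, tcp_ack):
    """Receiver side: echo the mark so the sender reacts as if a loss occurred."""
    if ip_header.get("CE"):
        tcp_ack["ECE"] = True         # ECN-Echo flag carried back in the ACK
    return tcp_ack
```

Because the signal rides on packets that are delivered anyway, the sender hears about congestion before any queue overflows, which is the “early, no loss” advantage listed above.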
Example 1
Assume a TCP sender without fast retransmit, but with slow start and additive increase. Also assume:
• Segments n, n+1, n+2, …, n+10 transmitted at times 0, 1, 2, …, 10 ms
• Transmission time per segment = 1 ms
• RTT (2 × propagation + transmission + ACK processing + ACK transmission) = 10 ms
• Segment n is lost (only)
• In-order segments and ACKs
• Retransmission timer for segment n is 60 ms, starting at the end of its transmission
• cwnd = ssthresh = 64 at time 0
• offeredWindow = 70
Example 2: Infer Events that Occurred