FAST TCP
Steven Low
CS/EE, netlab.CALTECH.edu
Oct 2003
FAST Protocols for Ultrascale Networks
netlab.caltech.edu/FAST
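Internet: distributed feedback control system
TCP: adapts sending rate to congestion
AQM: feeds back congestion information
[Block diagram: TCP sources (x, q) and AQM links (y, p), coupled by forward routing Rf(s) and backward routing Rb’(s)]
TCP: [source rate law for $\dot{x}_i$ involving $\tan^{-1}$, $T_i(t)$, $x_i(t)$, $q_i(t)$; equation not recoverable]
AQM: $\dot{p}_l = \frac{1}{c_l}\,(y_l(t) - c_l)$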
Theory
Calren2/Abilene
Chicago
Amsterdam
CERN
Geneva
SURFNet
StarLight
WAN in Lab
Caltech research & production networks
Multi-Gbps, 50-200 ms delay
Experiment
Students Choe (Postech/CIT) Hu (Williams) J. Wang (CDS) Z.Wang (UCLA) Wei (CS)
Industry Doraiswami (Cisco) Yip (Cisco)
Faculty Doyle (CDS,EE,BE) Low (CS,EE) Newman (Physics) Paganini (UCLA)
Staff/Postdoc Bunn (CACR) Jin (CS) Ravot (Physics) Singh (CACR)
Partners CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco
People
155 Mb/s, 10 Gb/s
[State diagram: slow start, equilibrium, FAST recovery, FAST retransmit, timeout]
Implementation
netlab.caltech.edu
Outline
Motivation
Network model
FAST TCP
Equilibrium
Stability
Experiments
TCP/IP
[Protocol stack: Applications (WWW, Email, Napster, FTP, …); TCP/AQM; IP; Transmission (Ethernet, ATM, POS, WDM, …)]
High Energy Physics
Large global collaborations
2000 physicists from 150 institutions in >30 countries
300-400 physicists in US from >30 universities & labs
SLAC has 500 TB data by 4/2002, world’s largest database
Typical file transfer ~1 TB
At 622 Mbps: ~4 hrs
At 2.5 Gbps: ~1 hr
At 10 Gbps: ~15 min
Gigantic elephants!
LHC (Large Hadron Collider) at CERN, to open 2007
Generate data at PB (10^15 B)/sec
Filtered in realtime by a factor of 10^6 to 10^7
Data stored at CERN at 100 MB/sec
Many PB of data per year
To rise to Exabytes (10^18 B) in a decade
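The transfer times quoted for a ~1 TB file can be checked with a short calculation. This is a minimal sketch that assumes a decimal terabyte (10^12 bytes) and a fully utilized link with no protocol overhead; the function name is illustrative.

```python
def transfer_time_seconds(size_bytes, rate_bps):
    """Ideal transfer time at full link utilization, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bps


TB = 10**12  # assumption: decimal terabyte

for label, rate in [("622 Mbps", 622e6), ("2.5 Gbps", 2.5e9), ("10 Gbps", 10e9)]:
    hours = transfer_time_seconds(TB, rate) / 3600
    print(f"{label}: {hours:.2f} h")
```

The results come out at roughly 3.6 h, 0.9 h and 13 min, in line with the rounded figures on the slide.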
HEP high speed network
… that must change
HEP Network (DataTAG)
[Network map. Sites and networks shown: Geneva, New York, STAR-TAP, STARLIGHT, SURFnet (NL), SuperJANET4 (UK), GARR-B (It), Renater (Fr), GEANT, ABILENE, ESNET, CALREN]
2.5 Gbps Wavelength Triangle 2002
10 Gbps Triangle in 2003
Newman (Caltech)
Performance at large windows
ns-2 simulation: capacity = 155 Mbps, 622 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps; 100 ms round trip latency; 100 flows. J. Wang (Caltech, June 02)
DataTAG Network: CERN (Geneva) – StarLight (Chicago) – SLAC/Level3 (Sunnyvale); capacity = 1 Gbps; 180 ms round trip latency; 1 flow. C. Jin, D. Wei, S. Ravot, etc (Caltech, Nov 02)
Average utilization at 1 Gbps: Linux TCP (txq=100) 19%; Linux TCP (txq=10000) 27%; FAST (txq=100) 95%
Congestion Control
~ W packets per RTT
Lost packet detected by missing ACK
Congestion signal: delay and loss
[Timing diagram: in each RTT the source sends data packets 1, 2, …, W and the destination returns ACKs]
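Since a source sends about W packets per RTT, the window needed to keep a path full is the bandwidth-delay product divided by the packet size. The sketch below works that out for the link speeds mentioned earlier in the deck; the 1500-byte packet size and the function name are assumptions made for illustration.

```python
def window_packets(capacity_bps, rtt_s, packet_bytes=1500):
    """Window (in packets) that fills the path: W = capacity * RTT / packet size."""
    return capacity_bps * rtt_s / (packet_bytes * 8)


# 100 ms round trip, 1500-byte packets (assumed values)
for label, capacity in [("155 Mbps", 155e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{label}: W = {window_packets(capacity, 0.1):.0f} packets")
```

At 10 Gbps and 100 ms this is on the order of 83,000 packets in flight, which is why behavior at large windows matters.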
Congestion control
Source rate: x_i(t)
Congestion measure: p_l(t)
Example congestion measures p_l(t):
Loss (Reno)
Queueing delay (Vegas)
TCP/AQM
Congestion control is a distributed asynchronous algorithm to share bandwidth
It has two components
TCP: adapts sending rate (window) to congestion
AQM: adjusts & feeds back congestion information
They form a distributed feedback control system
Equilibrium & stability depend on both TCP and AQM
And on delay, capacity, routing, #connections
TCP, x_i(t): Reno, Vegas
AQM, p_l(t): DropTail, RED, REM/PI, AVQ
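Network model
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled through the network by forward routing Rf(s) (x to y) and backward routing Rb’(s) (p to q)]
$R_{f,li}(s) = e^{-\tau^f_{li} s}$ if source i uses link l
$R_{b,li}(s) = e^{-\tau^b_{li} s}$ if source i uses link l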
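Vegas model
for every RTT
{ if W/RTTmin – W/RTT < α then W++
  if W/RTTmin – W/RTT > α then W-- }
(W/RTTmin – W/RTT: queue size)
F_i: $\dot{x}_i = \frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) < \alpha_i d_i$; $\dot{x}_i = -\frac{1}{T_i^2(t)}$ if $x_i(t)\,q_i(t) > \alpha_i d_i$; else $\dot{x}_i = 0$
G_l: $\dot{p}_l = \frac{1}{c_l}\,(y_l(t) - c_l)$
p_l: link queueing delay
q_i: E2E queueing delay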
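The per-RTT Vegas rule can be tried out on a toy single-bottleneck path. This is a minimal sketch, not the Vegas implementation: the threshold is written `alpha` (in packets per second), the queue model `RTT = max(RTTmin, W / capacity)` is an assumption, and all numbers are hypothetical.

```python
def vegas_step(w, rtt_min, rtt, alpha):
    """One per-RTT update: compare expected and actual rate against alpha."""
    diff = w / rtt_min - w / rtt
    if diff < alpha:
        return w + 1
    if diff > alpha:
        return w - 1
    return w


def simulate(capacity, rtt_min, alpha, rounds=500):
    """Single flow on one bottleneck (assumed model): the RTT grows once
    the window exceeds the bandwidth-delay product."""
    w = 1
    for _ in range(rounds):
        rtt = max(rtt_min, w / capacity)
        w = vegas_step(w, rtt_min, rtt, alpha)
    return w


# capacity 1000 pkts/s, RTTmin 0.1 s, alpha 20 pkts/s (hypothetical values)
print(simulate(1000.0, 0.1, 20.0))
```

Under this model the window settles near RTTmin * (capacity + alpha), that is, the flow fills the pipe and keeps about alpha * RTTmin packets queued.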
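Vegas model
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
$F_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$
$G_l = \frac{y_l(t)}{c_l} - 1$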
Methodology
Protocol (Reno, Vegas, RED, REM/PI…)
$x(t+1) = F(p(t), x(t))$
$p(t+1) = G(p(t), x(t))$
Equilibrium
Performance: throughput, loss, delay
Fairness
Utility
Dynamics
Local stability
Cost of stabilization
Model
Network: links l of capacities c_l
Sources s
L(s) - links used by source s
U_s(x_s) - utility if source rate = x_s
[Figure: two links of capacities c_1 and c_2 carrying sources x_1, x_2, x_3]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
Summary: duality model
Flow control problem (Kelly, Maulloo, Tan 98):
$\max_{x \ge 0} \sum_s U_s(x_s)$ subject to $Rx \le c$
TCP/AQM: maximize utility with different utility functions
Primal-dual algorithm:
$x(t+1) = F(R^T p(t), x(t))$ (Reno, Vegas)
$p(t+1) = G(p(t), Rx(t))$ (DropTail, RED, REM)
Result (L 00): (x*, p*) primal-dual optimal iff $y_l^* \le c_l$, with equality if $p_l^* > 0$
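Example utility functions
Reno-1: $U_i(x_i) = \frac{\sqrt{3/2}}{T_i}\,\tan^{-1}\!\left(\sqrt{2/3}\;x_i T_i\right)$
Reno-2: $U_i(x_i) = \frac{1}{T_i}\,\log\frac{x_i T_i}{2 x_i T_i + 3}$
Vegas: $U_i(x_i) = \alpha_i d_i \log x_i$
General: $U_i(x_i) = \log x_i$ if $\alpha = 1$; $U_i(x_i) = (1-\alpha)^{-1} x_i^{1-\alpha}$ otherwise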
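Game interpretation
Source s: $\max_{x_s \ge 0}\; U_s(x_s) - x_s \sum_l R_{ls}\,p_l$
$x_s(t+1) = U_s'^{-1}\!\left(\sum_l R_{ls}\,p_l(t)\right)$
Link l: $\max_{p_l \ge 0}\; p_l \left(\sum_s R_{ls}\,x_s - c_l\right)$
$p_l(t+1) = \left[\,p_l(t) + \gamma \left(\sum_s R_{ls}\,x_s(t) - c_l\right)\right]^+$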
Synchronous convergence
Theorem (L & Lapsley 99)
Provided R has full row rank & U_s strictly concave:
Gradient projection algorithm of dual problem
Converges to optimal primal-dual solutions if the step size γ satisfies $0 < \gamma < 2/(\bar{\alpha}\,\bar{L}\,\bar{S})$
Limit point: unique Pareto optimal Nash equilibrium
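The gradient projection algorithm on the dual problem can be illustrated on the two-link, three-source network from the Model slide. This is a minimal sketch under stated assumptions: log utilities (so each source sets its rate to the inverse of its path price), unit capacities, a step size `gamma` chosen by hand, and a hypothetical rate cap `x_max` to keep rates finite when a path price is zero.

```python
def dual_gradient(R, c, gamma=0.1, steps=5000, x_max=10.0):
    """Dual gradient projection with U_s(x) = log x (assumed utility).

    R[l][s] = 1 if source s uses link l; c[l] is the capacity of link l.
    """
    n_links, n_sources = len(R), len(R[0])
    p = [1.0] * n_links
    x = [0.0] * n_sources
    for _ in range(steps):
        # Source update: x_s = U_s'^{-1}(path price) = 1 / path price
        for s in range(n_sources):
            q = sum(R[l][s] * p[l] for l in range(n_links))
            x[s] = min(1.0 / q, x_max) if q > 0 else x_max
        # Link update: move the price along (load - capacity), project onto p >= 0
        for l in range(n_links):
            y = sum(R[l][s] * x[s] for s in range(n_sources))
            p[l] = max(p[l] + gamma * (y - c[l]), 0.0)
    return x, p


# Source 1 uses both links, source 2 uses link 1, source 3 uses link 2
R = [[1, 1, 0],
     [1, 0, 1]]
x, p = dual_gradient(R, c=[1.0, 1.0])
print([round(v, 4) for v in x], [round(v, 4) for v in p])
```

For this instance the optimum can be found by hand (x = (1/3, 2/3, 2/3) with both prices equal to 1.5), which gives a check on the iteration.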
Asynchronous convergence
Sources and links update & compute at different times, with different frequencies, using delayed info
Theorem (L & Lapsley 99)
Converges in asynchronous environment with a smaller step size
Equilibrium of Vegas
Network
Link queueing delays: p_l
Queue length: c_l p_l
Sources
Throughput: x_i
E2E queueing delay: q_i
Packets buffered: $x_i q_i = \alpha_i d_i$
Utility function: $U_i(x) = \alpha_i d_i \log x$
Proportional fairness
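For a single shared link, the equilibrium relations above can be solved in closed form: every source sees the same queueing delay q, each keeps α_i d_i packets buffered, and the rates sum to the capacity. The sketch below assumes exactly that single-link setting; the function name and the numbers are hypothetical.

```python
def vegas_equilibrium(capacity, alpha_d):
    """Single-link Vegas equilibrium (assumed setting).

    alpha_d[i] = alpha_i * d_i, the number of packets source i keeps buffered.
    Returns (rates, queueing delay, queue length).
    """
    backlog = sum(alpha_d)          # queue length = c * q = sum of buffered packets
    q = backlog / capacity          # common queueing delay
    rates = [a / q for a in alpha_d]  # from x_i * q_i = alpha_i * d_i
    return rates, q, backlog


# capacity 6 pkts/ms; sources buffering 20 and 40 packets (hypothetical values)
rates, q, backlog = vegas_equilibrium(6.0, [20.0, 40.0])
print(rates, q, backlog)
```

The rates are proportional to α_i d_i, which is the weighted proportional fairness implied by the log utility.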
Validation (L. Wang, Princeton)
Source rates (pkts/ms)
#   src1         src2         src3         src4         src5
1   5.98 (6)
2   2.05 (2)     3.92 (4)
3   0.96 (0.94)  1.46 (1.49)  3.54 (3.57)
4   0.51 (0.50)  0.72 (0.73)  1.34 (1.35)  3.38 (3.39)
5   0.29 (0.29)  0.40 (0.40)  0.68 (0.67)  1.30 (1.30)  3.28 (3.34)

#   queue (pkts)  baseRTT (ms)
1   19.8 (20)     10.18 (10.18)
2   59.0 (60)     13.36 (13.51)
3   127.3 (127)   20.17 (20.28)
4   237.5 (238)   31.50 (31.50)
5   416.3 (416)   49.86 (49.80)
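Stability: Reno/RED
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
Theorem (Low et al, Infocom’02)
Reno/RED is locally stable if
$\frac{\rho\,c^3 \tau^3}{2 N^3}\,(c\tau + N) \le \frac{\pi (1-\beta)^2}{\sqrt{4\beta^2 + \pi^2 (1-\beta)^2}}$
TCP: small τ, small c, large N
RED: small ρ, large delay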
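Stability: scalable control
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
TCP: $x_i(t) = \bar{x}_i\, e^{-\frac{\alpha_i}{m_i \tau_i}\, q_i(t)}$
AQM: $\dot{p}_l(t) = \frac{1}{c_l}\,(y_l(t) - c_l)$
Theorem (Paganini, Doyle, L, CDC’01)
Provided R is full rank, feedback loop is locally stable for arbitrary delay, capacity, load and topology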
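Stability: Stabilized Vegas
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
TCP: [source rate law for $\dot{x}_i$ involving $\tan^{-1}$, $T_i(t)$, $x_i(t)$, $q_i(t)$; equation not recoverable]
AQM: $\dot{p}_l(t) = \frac{1}{c_l}\,(y_l(t) - c_l)$
Theorem (Choe & L, Infocom’03)
Provided R is full rank, feedback loop is locally stable if [condition on $x_i$, $T_i$ and a; not recoverable]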
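Stability: Stabilized Vegas
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
TCP: $\dot{x}_i = \frac{1}{T_i^2(t)}\,\mathrm{sgn}\!\left(1 - \frac{x_i(t)\,q_i(t)}{\alpha_i d_i}\right)$
AQM: $\dot{p}_l(t) = \frac{1}{c_l}\,(y_l(t) - c_l)$
Theorem (Choe & L, Infocom’03)
Provided R is full rank, feedback loop is locally stable if [condition on $x_i$, $T_i$ and a; not recoverable]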
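Stability: FAST
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by Rf(s) (x to y) and Rb’(s) (p to q)]
TCP: [source rate law for $\dot{x}_i$ involving $\tan^{-1}$, $T_i(t)$, $x_i(t)$, $q_i(t)$; equation not recoverable]
AQM: $\dot{p}_l(t) = \frac{1}{c_l}\,(y_l(t) - c_l)$
Application
Stabilized TCP with current routers
Queueing delay as congestion measure has right scaling
Incremental deployment with ECN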
Window control algorithm
Theorem (Jin, Wei, L ‘03)
In absence of delay:
Mapping from w(t) to w(t+1) is contraction
Global exponential convergence
Full utilization after finite time
Utility function: $\alpha_i \log x_i$ (proportional fairness)
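The window update itself does not survive in this transcript. As a hedged illustration, the sketch below uses the periodic update published for FAST TCP, w ← min{2w, (1−γ)w + γ(baseRTT/RTT · w + α)}, and pairs it with an assumed zero-feedback-delay queue model on one bottleneck. The queue model, the parameter values and the function names are assumptions for this example only.

```python
def fast_update(w, base_rtt, rtt, alpha, gamma=0.5):
    """One window update: move toward (baseRTT/RTT) * w + alpha, at most doubling."""
    target = (base_rtt / rtt) * w + alpha
    return min(2 * w, (1 - gamma) * w + gamma * target)


def simulate(capacity, base_rtt, alpha, w0, steps=500):
    """Flows with equal baseRTT sharing one link (assumed model, no feedback delay)."""
    w = list(w0)
    q = 0.0
    for _ in range(steps):
        # Packets beyond the bandwidth-delay product sit in the queue and add delay
        q = max(sum(w) / capacity - base_rtt, 0.0)
        rtt = base_rtt + q
        w = [fast_update(wi, base_rtt, rtt, alpha) for wi in w]
    return w, q


# capacity 10 pkts/ms, baseRTT 100 ms, alpha 200 pkts, unequal starting windows
windows, delay = simulate(10.0, 100.0, 200.0, w0=[10.0, 300.0])
print([round(v, 3) for v in windows], round(delay, 3))
```

In this toy model each flow ends up with α packets in the queue, so both windows converge to capacity × baseRTT / N + α regardless of where they start, consistent with the contraction and fairness claims on the slide.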
Network
(Sylvain Ravot, Caltech/CERN)
FAST BMPS
[Bar chart: FAST compared with the Internet2 Land Speed Record, by #flows (1, 2, 7, 9, 10), on the Geneva-Sunnyvale and Baltimore-Sunnyvale paths]
FAST
Standard MTU
Throughput averaged over > 1hr
Aggregate throughput
#flows     average utilization   duration
1 flow     95%                   1hr
2 flows    92%                   1hr
7 flows    90%                   6hr
9 flows    90%                   1.1hr
10 flows   88%                   6hr
FAST
Standard MTU
Utilization averaged over > 1hr
Aggregate throughput
Average utilization
      Linux TCP (txq=100)   Linux TCP (txq=10000)   FAST
1G    19%                   27%                     95%
2G    16%                   48%                     92%
FAST
Standard MTU
Utilization averaged over 1hr
SCinet Caltech-SLAC experiments
netlab.caltech.edu/FAST
SC2002 Baltimore, Nov 2002
Acknowledgments
Prototype: C. Jin, D. Wei
Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)
Experiment/facilities
Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh
CERN: O. Martin, P. Moroni
Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip
DataTAG: E. Martelli, J. P. Martin-Flatin
Internet2: G. Almes, S. Corbato
Level(3): P. Fernes, R. Struble
SCinet: G. Goddard, J. Patton
SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams
StarLight: T. deFanti, L. Winkler
Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF
Dynamic sharing: 3 flows
[Throughput plots: FAST, Linux]
Dynamic sharing on Dummynet
capacity = 800 Mbps
delay = 120 ms
3 flows
iperf throughput
Linux 2.4.x (HSTCP: UCL)
Dynamic sharing: 3 flows
[Throughput plots: FAST, Linux, HSTCP, STCP]
Steady throughput
[Plots of throughput, loss and queue over 30 min: FAST, Linux, STCP, HSTCP]
Dynamic sharing on Dummynet
capacity = 800 Mbps
delay = 120 ms
14 flows
iperf throughput
Linux 2.4.x (HSTCP: UCL)
[Plots of throughput, loss and queue over 30 min: FAST, Linux, STCP, HSTCP]
Room for mice!
Network model
[Block diagram: TCP sources F_1, …, F_N and AQM links G_1, …, G_L, coupled by R (x to y) and R^T (p to q)]
$x(t+1) = F(R^T p(t), x(t))$ (Reno, Vegas)
$p(t+1) = G(p(t), Rx(t))$ (DT, RED, …)
IP routing: $R_{li} = 1$ if source i uses link l
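Motivation
Can TCP/IP maximize utility?
Primal: $\max_{R}\,\max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$
Dual: $\min_{p \ge 0} \sum_i \max_{x_i \ge 0}\left(U_i(x_i) - x_i \min_{R_i} \sum_l R_{li}\,p_l\right) + \sum_l c_l\,p_l$
Shortest path routing! (the inner $\min_{R_i} \sum_l R_{li}\,p_l$)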
TCP-AQM/IP
Theorem (Wang, et al 03)
Primal problem is NP-hard
Proof
Reduce integer partition to primal problem
Given: integers {c_1, …, c_n}
Find: set A s.t. $\sum_{i \in A} c_i = \sum_{i \notin A} c_i$
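The integer partition problem used in the reduction can be stated concretely in a few lines. The sketch below is a standard subset-sum dynamic program, not the reduction from the proof; it assumes positive integers and the function name is illustrative. It runs in time proportional to n times the sum of the integers, which is pseudo-polynomial and so does not contradict NP-hardness.

```python
def find_partition(values):
    """Return indices A with sum(values[i] for i in A) equal to half the total,
    or None if no such set exists. Assumes positive integers."""
    total = sum(values)
    if total % 2:
        return None
    target = total // 2
    # reachable[s] = one list of indices whose values sum to s
    reachable = {0: []}
    for i, v in enumerate(values):
        # Iterate over a snapshot so each value is used at most once
        for s, idx in list(reachable.items()):
            t = s + v
            if t <= target and t not in reachable:
                reachable[t] = idx + [i]
    return reachable.get(target)


c = [3, 1, 1, 2, 2, 1]
A = find_partition(c)
print(A, None if A is None else sum(c[i] for i in A))
```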
TCP-AQM/IP
Theorem (Wang, et al 03)
Primal problem is NP-hard
Achievable utility of TCP/IP?
Stability?
Duality gap?
Conclusion: Inevitable tradeoff between achievable utility and routing stability
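Ring network
[Figure: ring network with a destination node; routing r]
Single destination
Instant convergence of TCP/IP
Shortest path routing
Link cost = pl(t) + dl (price + static)
[Iteration diagram: routing by IP, r(0), r(1), …, r(t), r(t+1), …, alternating with TCP/AQM prices pl(0), pl(1), …]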
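Ring network
[Figure: ring network with a destination node; routing r]
[Iteration diagram: routing by IP, r(0), r(1), …, r(t), r(t+1), …, alternating with TCP/AQM prices pl(0), pl(1), …]
Stability: r ?
Utility: V ?
r*: optimal routing
V*: max utility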
Ring network
[Figure: ring network with a destination node; routing r]
Theorem (Infocom 2003)
“No” duality gap Unstable if = 0
starting from any r(0), subsequent r(t) oscillates between 0 and 1
link cost = pl(t) + dl
Stability: r ?
Utility: V ?
Ring network
[Figure: ring network with a destination node; routing r]
link cost = pl(t) + dl
$r \to r^*$, $V \to V^*$
Theorem (Infocom 2003)
Solve primal problem asymptoticallyas
Stability: r ?
Utility: V ?
Ring network
[Figure: ring network with a destination node; routing r]
link cost = pl(t) + dl
Theorem (Infocom 2003)
large: globally unstable
small: globally stable
medium: depends on r(0)
Stability: r ?
Utility: V ?
General network
Conclusion: Inevitable tradeoff between achievable utility and routing stability
[Plot: achievable utility; random graph, 20 nodes, 200 links]
FAST TCP: motivation, architecture, algorithms, performance. Submitted for publication, July 1, 2003
-release: August 2003
Inquiry: [email protected]
FAST Project Review, Caltech, Oct 27-28, 2003
netlab.caltech.edu/FAST