Outline
Overview of FAST TCP
Implementation Details
SC2002 Experiment Results
FAST Evaluation and WAN-in-Lab
FAST vs. Linux TCP
Distance = 10,037 km; Delay = 180 ms; MTU = 1500 B; Duration = 3600 s
Linux TCP experiments: Jan 28-29, 2003; FAST experiments: Nov 19, 2002

Protocol                   Flows   Throughput (Mbps)   Bmps (Peta)   Transfer (GB)
FAST                       1       925                 9.28          387
FAST                       2       1,797               18.03         753
Linux TCP (txqlen=100)     1       185                 1.86          78
Linux TCP (txqlen=100)     2       317                 3.18          133
Linux TCP (txqlen=10000)   1       266                 2.67          111
Linux TCP (txqlen=10000)   2       931                 9.35          390
Aggregate Throughput
[Bar chart: average utilization over 1 hr with standard MTU, on 1 Gbps (one flow) and 2 Gbps (two flows) paths. Linux TCP txq=100: 19% and 16%; Linux TCP txq=10000: 27% and 48%; FAST: 92% and 95%.]
Summary of Changes
RTT estimation: fine-grain timer
Fast convergence to equilibrium
Delay monitoring in equilibrium
Pacing: reducing burstiness
FAST TCP Flow Chart
[Flow chart: states Slow Start, Fast Convergence, Equilibrium, and Loss Recovery, with Normal/Recovery and Time-out transitions.]
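As an illustration only, the states above can be sketched as a small state machine in C; the transition conditions here are plausible assumptions, not the actual FAST TCP logic:

```c
#include <stdbool.h>

/* States taken from the flow chart; the transition rules are assumed. */
enum fast_state { SLOW_START, FAST_CONVERGENCE, EQUILIBRIUM, LOSS_RECOVERY };

/* Hypothetical per-ACK event summary used to drive the transitions. */
struct ack_info {
    bool loss;              /* loss detected (dupacks/SACK)           */
    bool timeout;           /* retransmission timer fired             */
    bool near_equilibrium;  /* queueing delay close to its target     */
    bool recovered;         /* all lost data has been retransmitted   */
};

static enum fast_state next_state(enum fast_state s, const struct ack_info *a)
{
    if (a->timeout)
        return SLOW_START;          /* time-out path in the chart */
    if (a->loss)
        return LOSS_RECOVERY;

    switch (s) {
    case SLOW_START:
        return a->near_equilibrium ? FAST_CONVERGENCE : SLOW_START;
    case FAST_CONVERGENCE:
        return a->near_equilibrium ? EQUILIBRIUM : FAST_CONVERGENCE;
    case EQUILIBRIUM:
        return EQUILIBRIUM;
    case LOSS_RECOVERY:
        /* normal recovery returns to fast convergence */
        return a->recovered ? FAST_CONVERGENCE : LOSS_RECOVERY;
    }
    return s;
}
```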
RTT Estimation
Measure queueing delay.
Kernel timestamps with microsecond resolution.
Use SACK to increase the number of RTT samples during recovery.
Exponential averaging of RTT samples to increase robustness.
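A minimal sketch of this estimator (field names and the averaging gain are assumptions, not the actual kernel code):

```c
#include <stdint.h>

struct fast_rtt {
    uint32_t base_rtt_us;   /* minimum RTT seen so far (propagation delay) */
    uint32_t avg_rtt_us;    /* exponentially averaged RTT                  */
};

/* Called for every RTT sample; with SACK, loss recovery still produces
 * samples, which keeps the average fresh during recovery. */
static void fast_rtt_sample(struct fast_rtt *r, uint32_t rtt_us)
{
    if (r->base_rtt_us == 0 || rtt_us < r->base_rtt_us)
        r->base_rtt_us = rtt_us;

    if (r->avg_rtt_us == 0) {
        r->avg_rtt_us = rtt_us;
    } else {
        /* EWMA with gain 1/8: avg += (sample - avg) / 8 */
        int64_t diff = (int64_t)rtt_us - (int64_t)r->avg_rtt_us;
        r->avg_rtt_us = (uint32_t)((int64_t)r->avg_rtt_us + diff / 8);
    }
}

/* Queueing delay = averaged RTT minus propagation (base) RTT. */
static uint32_t fast_queueing_delay(const struct fast_rtt *r)
{
    return r->avg_rtt_us > r->base_rtt_us
               ? r->avg_rtt_us - r->base_rtt_us : 0;
}
```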
Fast Convergence
Rapidly increase or decrease cwnd toward equilibrium.
Monitor the per-ack queueing delay to avoid overshoot.
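A rough sketch of this idea; the target window, threshold, and names below are assumptions for illustration, not the prototype's actual rule:

```c
#include <stdint.h>

/* Step quickly toward a target window derived from the delay measurements,
 * halving the remaining gap each RTT, and stop the aggressive steps once the
 * per-ACK queueing delay nears its target so the equilibrium update can
 * take over. */
static uint32_t fast_converge_cwnd(uint32_t cwnd, uint32_t target_cwnd,
                                   uint32_t queue_delay_us,
                                   uint32_t target_delay_us)
{
    /* Overshoot guard based on the per-ACK queueing delay. */
    if (queue_delay_us * 4 >= target_delay_us * 3)
        return cwnd;

    if (cwnd < target_cwnd)
        cwnd += (target_cwnd - cwnd) / 2;   /* halve the gap upward   */
    else
        cwnd -= (cwnd - target_cwnd) / 2;   /* halve the gap downward */
    return cwnd;
}
```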
Equilibrium
Vegas-like cwnd adjustment on a large time-scale (per RTT).
Small step-size to maintain stability in equilibrium.
Per-ack delay monitoring to enable timely detection of changes in equilibrium.
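For concreteness, the window update later published for FAST TCP has a Vegas-like per-RTT form, w <- min{2w, (1 - gamma)w + gamma((baseRTT/RTT)w + alpha)}; a sketch, with illustrative constants that may differ from this prototype:

```c
/* Vegas-like window update, applied once per RTT.  gamma (step size) and
 * alpha (target backlog in packets) are placeholders, not the prototype's
 * actual values. */
static double fast_equilibrium_cwnd(double w, double base_rtt, double rtt)
{
    const double gamma = 0.5;    /* small step size keeps equilibrium stable */
    const double alpha = 200.0;  /* desired number of packets queued in path */

    double w_new = (1.0 - gamma) * w + gamma * ((base_rtt / rtt) * w + alpha);

    /* never more than double the window in one step */
    return (w_new < 2.0 * w) ? w_new : 2.0 * w;
}
```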
Pacing
What do we pace? The increment to cwnd.
Time-driven vs. event-driven: a trade-off between complexity and performance.
Timer resolution is important.
Time-Based Pacing
cwnd increments are scheduled at fixed intervals.
[Timeline: data and ACK packets, with cwnd increments applied at fixed intervals between them.]
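A minimal sketch of time-based pacing of the cwnd increment, assuming a periodic timer callback; all names and the interval computation are illustrative:

```c
#include <stdint.h>

/* Spread this RTT's cwnd increase over evenly spaced timer ticks instead
 * of applying it in one burst. */
struct paced_increment {
    uint32_t remaining;     /* increments still to apply this RTT */
    uint32_t interval_us;   /* fixed spacing between increments   */
};

static void pacing_start(struct paced_increment *p,
                         uint32_t total_increment, uint32_t rtt_us)
{
    p->remaining = total_increment;
    /* one tick per increment, spread evenly across the RTT */
    p->interval_us = total_increment ? rtt_us / total_increment : rtt_us;
}

/* Timer callback fired every interval_us microseconds. */
static void pacing_timer_fired(struct paced_increment *p, uint32_t *cwnd)
{
    if (p->remaining > 0) {
        (*cwnd)++;          /* apply one unit of the scheduled increase */
        p->remaining--;
    }
    /* caller rearms the timer while p->remaining > 0; the timer resolution
     * limits how smoothly the increase can be spread */
}
```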
Event-Based Pacing
Detect a sufficiently large gap between consecutive bursts and delay the cwnd increment until the end of each such burst.
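A sketch of the event-driven alternative, assuming per-ACK arrival timestamps are available; the gap threshold and names are illustrative:

```c
#include <stdint.h>

/* Event-based pacing: instead of a timer, watch ACK inter-arrival times.
 * A gap much larger than the recent ACK spacing marks the end of a burst,
 * and the deferred cwnd increments are applied there. */
struct burst_detector {
    uint32_t last_ack_us;   /* arrival time of the previous ACK       */
    uint32_t avg_gap_us;    /* smoothed ACK inter-arrival time         */
    uint32_t deferred;      /* cwnd increments waiting for burst end   */
};

static void on_ack(struct burst_detector *b, uint32_t now_us,
                   uint32_t pending_increment, uint32_t *cwnd)
{
    uint32_t gap = now_us - b->last_ack_us;
    b->last_ack_us = now_us;
    b->deferred += pending_increment;

    /* smooth the typical ACK spacing (EWMA with gain 1/8) */
    b->avg_gap_us += ((int32_t)gap - (int32_t)b->avg_gap_us) / 8;

    /* a gap several times the typical spacing => the burst has ended */
    if (gap > 4 * b->avg_gap_us && b->deferred > 0) {
        *cwnd += b->deferred;   /* apply the deferred increase now */
        b->deferred = 0;
    }
}
```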
SCinet Caltech-SLAC experiments
netlab.caltech.edu/FAST
SC2002 Baltimore, Nov 2002
Experiment
[Network map: Sunnyvale-Chicago (3000 km), Chicago-Baltimore (1000 km), and Chicago-Geneva (7000 km) paths used in the experiments.]
C. Jin, D. Wei, S. Low, FAST Team and Partners
Theory
Internet as a distributed feedback system: TCP sources (rates x) and AQM links (prices p), coupled through forward routing Rf(s) and backward routing Rb'(s). [Block diagram]
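For context, the diagram follows the standard TCP/AQM duality model used in the FAST team's theory work; a sketch of that formulation (details may differ from the slide): links see the aggregate rate y = Rf(s) x and update their prices p via AQM, sources see the aggregate price q = Rb'(s) p fed back to them and adjust their rates x via TCP; Rf(s) and Rb'(s) are the forward and backward routing (delay) matrices that close the feedback loop.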
Highlights
FAST TCP:
  Standard MTU
  Peak window = 14,255 pkts
  Throughput averaged over > 1 hr:
    925 Mbps single flow/GE card: 9.28 petabit-meter/sec, 1.89 times the LSR
    8.6 Gbps with 10 flows: 34.0 petabit-meter/sec, 6.32 times the LSR
    21 TB in 6 hours with 10 flows
Implementation:
  Sender-side modification
  Delay-based
[Bar chart: bandwidth-distance product vs. #flows (1, 2, 7, 9, 10) for FAST on the Baltimore-Sunnyvale and Geneva-Sunnyvale paths, compared with the Internet2 LSR; see the FAST BMPS slide below.]
FAST BMPS
[Bar chart: bandwidth-distance product (petabit-meter/sec) achieved by FAST with 1, 2, 7, 9, and 10 flows over the Baltimore-Sunnyvale and Geneva-Sunnyvale paths, compared with the Internet2 Land Speed Record.]
FAST, standard MTU, throughput averaged over > 1 hr.
Aggregate Throughput
FAST, standard MTU; utilization averaged over > 1 hr.

                      1 flow   2 flows   7 flows   9 flows   10 flows
Average utilization   95%      92%       90%       90%       88%
Averaging period      1 hr     1 hr      6 hr      1.1 hr    6 hr
Caltech-SLAC Entry
[Throughput trace annotations: rapid recovery after a possible hardware glitch; power glitch and reboot; 100-200 Mbps of ACK traffic.]
SCinet Caltech-SLAC experiments
netlab.caltech.edu/FAST
SC2002 Baltimore, Nov 2002
Acknowledgments
Prototype: C. Jin, D. Wei
Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)
Experiment/facilities:
  Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh
  CERN: O. Martin, P. Moroni
  Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip
  DataTAG: E. Martelli, J. P. Martin-Flatin
  Internet2: G. Almes, S. Corbato
  Level(3): P. Fernes, R. Struble
  SCinet: G. Goddard, J. Patton
  SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams
  StarLight: T. deFanti, L. Winkler
  TeraGrid: L. Winkler
Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF
Evaluating FAST
End-to-end monitoring doesn't tell the whole story.
Existing network emulation (dummynet) is not always enough.
We can optimize better if we can look inside and understand the real network.
Dummynet Issues
Not running on a real-time OS -- imprecise timing.
Lack of priority scheduling of dummynet events.
Bandwidth fluctuates significantly with workload.
Much work needed to customize dummynet for protocol testing.
10 GbE Experiment
Long-distance testing of Intel 10GbE cards.
Sylvain Ravot (Caltech) achieved 2.3 Gbps using a single stream with jumbo frames and stock Linux TCP.
Tested HSTCP, Scalable TCP, FAST, and stock TCP under Linux.
1500 B MTU: 1.3 Gbps Sunnyvale -> Chicago; 9000 B MTU: 2.3 Gbps Sunnyvale -> Geneva.
TCP Loss Mystery
Frequent packet loss with 1500-byte MTU. None with larger MTUs.
Packet loss even when cwnd is capped at 300 - 500 packets.
Routers have a large queue size of 4,000 packets.
Packets captured at both sender and receiver using tcpdump.
How Can WAN-in-Lab Help?
We will know exactly where packets are lost.
We will also know the sequence of events (packet arrivals) that leads to loss.
We can then either fix the problem in the network, if there is one, or improve the protocol.