Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
How Hard Can It Be? ���Designing and Implementing a
Deployable Multipath TCP Cos$n Raiciu
University Politehnica of Bucharest
Joint work with: Christoph Paasch, Sebas1en Barre, Alan Ford, Fabien Duchene, Michio Honda, Olivier Bonaventure, Mark Handley
Thanks to
Networks are becoming mul$path
Mobile devices have mul1ple wireless connec1ons
Networks are becoming mul$path
Datacenters have redundant topologies
Networks are becoming mul$path
Datacenters have redundant topologies
Networks are becoming mul$path
Servers are mul1-‐homed
Client
How do we use these networks?
TCP.
Used by most applica$ons,
offers byte-‐oriented reliable delivery,
adjusts load to network condi$ons
TCP is single path
A TCP connec$on Uses a single-‐path in the network regardless of network topology
Is 1ed to the source and des1na1on addresses of the endpoints
Mismatch between network and transport
creates problems
Collisions in datacenters
[Fares et al -‐ A Scalable, Commodity Data Center Network Architecture -‐ Sigcomm 2008]
How hard can it be?
Designing and
Implementing a
Deployable Multipath TCP
Deployable Multipath TCP
How hard can it be?
Designing
Implementing
Deployable Multipath TCP
How hard can it be?
Designing
Implementing
Goal: A Deployable Multipath TCP
We want to evolve TCP to be able to use mul2ple paths in the network.
Mul$path TCP must meet the following goals:
GOAL 1: Support unmodified applica.ons
GOAL 2: Work over today’s networks
GOAL 3: Work whenever TCP would work
Our Linux kernel Mul.path TCP implementa.on supports legacy apps
and works well over:
deployed 3G and Wifi networks,
exis1ng datacenters and
the Internet at large.
Deployable Multipath TCP
How hard can it be?
Designing
Implementing
It can be preTy hard.
It can be preTy hard.
Mark Handley suggested we start working on designing MPTCP in spring 2007
It can be preTy hard.
Mark Handley suggested we start working on designing MPTCP in spring 2007
Five years later, here we are –
we finally nailed this!
It can be preTy hard.
Mark Handley suggested we start working on designing MPTCP in spring 2007
Five years later, here we are –
we finally nailed this!
Why was it this difficult?
Internet Architecture is a living thing.
Protocol Layering
IP IP IP IP
TCP TCP
HTTP HTTP
Ethernet Modem Ethernet ATM ATM Modem
Web Server
Internet Router
Internet Router
Web Client
• Link layers (eg Ethernet) are local to a particular link • Routers look at IP headers to decide how to route a packet. • TCP provides reliability via retransmission, flow control, etc. • Application using OS’s TCP API to do its job.
Fiction!
Middleboxes
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Destination IP
Source IP
TTL Header Checksum Protocol
Identification Fragment Offset Flags
IHL Version TOS Total Length ECN
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Destination IP
Source IP
TTL Header Checksum Protocol
Identification Fragment Offset Flags
IHL Version TOS Total Length ECN
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Destination IP
Source IP
TTL Header Checksum Protocol
Identification Fragment Offset Flags
IHL Version TOS Total Length ECN
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Destination IP
Source IP
TTL Header Checksum Protocol
Identification Fragment Offset Flags
IHL Version TOS Total Length ECN
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Destination IP
Source IP
TTL Header Checksum Protocol
Identification Fragment Offset Flags
IHL Version TOS Total Length ECN
Deployable Multipath TCP
How hard can it be?
Designing
Implementing
MPTCP Connec$on Management
MPTCP Connec$on Management
SYN
MP_CAPABLE
MPTCP Connec$on Management
SYN
MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
MPTCP Connec$on Management
SYN/ACK MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
ENABLED
MPTCP Connec$on Management
SYN/ACK MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
ENABLED
Enable MPTCP if SYN/ACK has MP_CAPABLE
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
ENABLED ENABLED
ACK
MPTCP Connec$on Management
Subflow 1 Subflow 1
MPTCP Connec$on Management
SYN JOIN
Subflow 1 Subflow 1
MPTCP Connec$on Management
SYN/ACK
JOIN
Subflow 1 Subflow 1
MPTCP Connec$on Management
ACK
Subflow 1 Subflow 1
MPTCP Connec$on Management
Subflow 1 Subflow 1
Subflow 2 Subflow 2
That was easy!
Almost too easy…
MPTCP Connec$on Management
SYN/ACK MP_CAPABLE Y
Enable MPTCP if SYN/ACK has MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
ENABLED
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
SYN/ACK MP_CAPABLE
ENABLED
6% of access networks remove unknown op$ons (14% on port 80)
[Honda et al. – Is It S$ll Possible to Extend TCP? – IMC 2011 ]
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
SYN/ACK
ENABLED DISABLED
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
Enable MPTCP if SYN has MP_CAPABLE
and ACK has DATA_ACK
SYN/ACK
DISABLED
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
ACK
DISABLED
Enable MPTCP if SYN has MP_CAPABLE
and ACK has DATA_ACK
MPTCP Connec$on Management
Enable MPTCP if SYN/ACK has MP_CAPABLE
ACK
DISABLED DISABLED
Enable MPTCP if SYN has MP_CAPABLE
and ACK has DATA_ACK
To achieve GOAL 3: When MPTCP opera$on is not possible, fallback to TCP.
Nego$a$on used to be between two endpoints
In today’s Internet, nego$a$on is: between two endpoints and an unknown number of intermediaries
New protocol nego2a2on has to take this into account or it will fail
Lesson
Sending Data
TCP Opera$on
1
TCP Opera$on
1 2
TCP Opera$on
1 2 3
TCP Opera$on
2 3 4 1
TCP Opera$on
3 4 2
ACK 1
TCP Opera$on
4 3
ACK 2 ACK 1
TCP Opera$on
4
ACK 3 ACK 2 ACK 1
TCP Opera$on
ACK 4 ACK 3 ACK 2
ACK 1
Strawman Design
1
Strawman Design
1
2
Strawman Design
3
2
1
Strawman Design
2
3
4
1
Strawman Design
2 4
3
ACK 1
Strawman Design
4
3
ACK 1
ACK 2
Strawman Design
4
ACK 3
ACK 2
ACK 1
A third of access networks will “correct” or drop ACKs of unseen data
Strawman Design
ACK 3
ACK 4
ACK 1
Strawman Design
Ok, so what does work?
• We need a sequence space for each subflow – This will drive loss detec$on and retransmissions
• We need a data sequence number – This will put segments in order at the receiver
• We need a data ACK for flow control – Receive window is rela$ve to Data ACK
MPTCP Data Transmission
SUBFLOW: 100 DATA:1
MPTCP Data Transmission
SUBFLOW: 100 DATA:1
SUBFLOW: 200 DATA:2
MPTCP Data Transmission
SUBFLOW: 101 DATA:3
SUBFLOW: 200 DATA:2
SUBFLOW: 100 DATA:1
MPTCP Data Transmission
SUBFLOW: 101 DATA:3
SUBFLOW: 200 DATA:2
SUBFLOW: 100 DATA:1
MPTCP Data Transmission
SUBFLOW: 101 DATA:3
SUBFLOW: 200 DATA:2
SUBFLOW: 100 DATA:1
SUBFLOW: 102 DATA:2
TCP Packet Header
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
MPTCP Packet Header
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Subflow
Subflow
Subflow Subflow
MPTCP Packet Header
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Subflow
Subflow
Subflow Subflow
Connection relative to Data ACK
MPTCP Packet Header
Source Port Destination Port
Sequence Number
Acknowledgment Number
Receive Window Header Length Reserved Code bits
Checksum
Options
Urgent Pointer
Data
Bit 0 Bit 15 Bit 16 Bit 31
20 Bytes
0 - 40 Bytes
Subflow
Subflow
Subflow Subflow
Data sequence number ? Data ACK ?
Data sequence number ? Data ACK ?
Connection relative to
Sending Data ACKs in the payload sucks
Sending Data ACKs in the payload sucks leads to deadlocks
Client Server
Read Request 2, Sends Response 2
Read Request 1, Sends Response 1
Read Request 2, Sends Response 2
Read Request 1, Starts sending response 1 (won’t read request 2 $ll finished)
Client Server
Client blocked by server receive window
Server blocked on client receive window (receive window will only open when Data Ack received)
Client wants to send Data Ack, but blocked by receive window
Deadlock
Data wai$ng in receive buffer
Even though the client app has read the data, Data Ack s$ll cannot be sent
Design space for feasible solu1ons is quite narrow
There are not too many things that could have
been done differently
Read paper for: • Flow control • Dealing with content-‐ changing middleboxes
• Dealing with TSO/LRO • Connec$on teardown
• Fast receive code • Middlebox tests • Evalua$on
Deployable Multipath TCP
How hard can it be?
Designing
Implementing
MPTCP over WiFi/3G
8Mbps, 20ms
2Mbps, 150ms
TCP over WiFi/3G
MPTCP over WiFi/3G
MPTCP over WiFi/3G
Mul1path TCP increases throughput
MPTCP over WiFi/3G
What happened here?
MPTCP over WiFi/3G
1 Receive Window
MPTCP over WiFi/3G
1 Receive Window
4 3 2
MPTCP over WiFi/3G
1
3 2 4
Receive Window
MPTCP over WiFi/3G
1
ACK
3 2 4
Receive Window
MPTCP over WiFi/3G
1
Wifi path blocked by the Receive Window
3 2 4
Receive Window
MPTCP over WiFi/3G
1
1
3 2 4
Receive Window
REINJECT SEGMENT ON WIFI
MPTCP over WiFi/3G
1 REINJECT SEGMENT ON WIFI
HALVE CONGESTION WINDOW
OF 3G SUBFLOW 3 2 4
Receive Window 1
MPTCP over WiFi/3G
1 3 2 4
Receive Window 1
MPTCP over WiFi/3G
Receive Window 1
MPTCP over WiFi/3G
Demo
Conclusions
• Designing a Mul$path TCP isn’t difficult. • Designing a deployable Mul$path TCP is much harder. – Need to understand the evolving and undocumented Internet architecture.
– Need defensive mechanisms to fall back to TCP behaviour when all else fails.
• Most extensions to TCP now face the same hurdles.
Conclusions (2)
• Designing a performant MPTCP needs care. – Especially need careful management of buffering to avoid unwanted interac$ons between subflows.
MPTCP allows standard applica2ons to reap the benefits of mul2path networks
• It is deployable today • Try out the code – hDp://mptcp.info.ucl.ac.be/