olivier-bonaventure
More details on the TCP protocol including some security issues with TCP and introduction of congestion control
Week 7: UDP and TCP
SCTP and Internet Congestion control
Agenda
• TCP
• Connection establishment
• Reliable data transfer
• Connection release
• SCTP
• Congestion control
TCP segment
• Header : 20 bytes, drawn as rows of 32 bits
  • Source port, Destination port
  • Sequence number
  • Acknowledgement number
  • THL, Reserved, Flags, Window
  • Checksum, Urgent pointer
• Optional header extension
• Payload
• THL : segment header length
• Checksum : computed over the entire segment and part of the IP header
• Flags : used to indicate the function of a segment
  • SYN : used during establishment
  • FIN : used during connection release
  • RST : used in case of problems
  • ACK : if true, means that the Acknowledgement number inside the segment is valid
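The 20-byte header can be decoded with a few lines of Python. This is a minimal sketch, not a full TCP parser: the names `parse_tcp_header` and `build_syn` are hypothetical, only the four flags listed on the slide are decoded, and the checksum is left at zero instead of being computed.

```python
import struct
from typing import NamedTuple

# Flag bits in the 16-bit word that also holds THL and Reserved.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "ACK": 0x10}
HEADER_FORMAT = "!HHIIHHHH"  # network byte order, 20 bytes in total


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int  # in bytes: THL counts 32-bit words
    flags: frozenset
    window: int
    checksum: int
    urgent: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP segment."""
    (src, dst, seq, ack, thl_flags,
     window, checksum, urgent) = struct.unpack(HEADER_FORMAT, segment[:20])
    thl = thl_flags >> 12  # upper 4 bits
    flags = frozenset(n for n, bit in FLAG_BITS.items() if thl_flags & bit)
    return TcpHeader(src, dst, seq, ack, thl * 4, flags,
                     window, checksum, urgent)


def build_syn(src_port: int, dst_port: int, seq: int) -> bytes:
    """Build a bare SYN header (illustration only: checksum is not computed)."""
    thl_flags = (5 << 12) | FLAG_BITS["SYN"]  # 5 words = 20 bytes, no options
    return struct.pack(HEADER_FORMAT, src_port, dst_port, seq, 0,
                       thl_flags, 65535, 0, 0)


header = parse_tcp_header(build_syn(1234, 80, 1000))
print(header)
```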
Three-way handshake
• A : CONNECT.req, chooses Initial sequence number (x), sends SYN(seq=x)
• B : CONNECT.ind, CONNECT.resp, chooses Initial sequence number (y), sends SYN+ACK(ack=x+1,seq=y)
• A : CONNECT.conf, sends ACK(seq=x+1, ack=y+1), Connection established
• B : receives the ACK, Connection established
• The sequence numbers of all segments A->B will start at x+1
• The sequence numbers of all segments B->A will start at y+1
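The sequence and acknowledgement numbers of the three segments can be derived from x and y alone. The sketch below is only an illustration: `three_way_handshake` is a hypothetical helper, and it assumes 32-bit sequence numbers that wrap around.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged for initial sequence numbers x and y."""
    syn = {"flags": "SYN", "seq": x}
    syn_ack = {"flags": "SYN+ACK", "seq": y, "ack": (x + 1) % SEQ_MOD}
    ack = {"flags": "ACK", "seq": (x + 1) % SEQ_MOD, "ack": (y + 1) % SEQ_MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```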
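TCP FSM
• Notation : ?X means that segment X is received, !X means that segment X is sent
• Init -> SYN Sent : !SYN
• Init -> SYN RCVD : ?SYN / !SYN+ACK
• SYN Sent -> Established : ?SYN+ACK / !ACK
• SYN Sent -> SYN RCVD : ?SYN / !SYN+ACK
• SYN RCVD -> Established : ?ACK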
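The establishment state machine can be written as a lookup table. This is a sketch under assumptions: the transition set is the usual one for TCP establishment, the names `TRANSITIONS` and `run` are hypothetical, and in the simultaneous-open trace the acknowledgement carried by the peer's SYN+ACK is modelled as a plain ?ACK event.

```python
# (state, event) -> (next state, segment sent or None)
# "?X" is the reception of segment X, "!X" is a local decision to send X.
TRANSITIONS = {
    ("Init", "!SYN"): ("SYN Sent", "SYN"),
    ("Init", "?SYN"): ("SYN RCVD", "SYN+ACK"),
    ("SYN Sent", "?SYN+ACK"): ("Established", "ACK"),
    ("SYN Sent", "?SYN"): ("SYN RCVD", "SYN+ACK"),
    ("SYN RCVD", "?ACK"): ("Established", None),
}


def run(events, state="Init"):
    """Apply events in order; raises KeyError on an invalid transition."""
    sent = []
    for event in events:
        state, segment = TRANSITIONS[(state, event)]
        if segment is not None:
            sent.append(segment)
    return state, sent


print(run(["!SYN", "?SYN+ACK"]))      # active opener
print(run(["?SYN", "?ACK"]))          # passive opener
print(run(["!SYN", "?SYN", "?ACK"]))  # simultaneous open
```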
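Simultaneous open
• Both entities issue CONNECT.req at the same time
  • one sends SYN(seq=x), the other sends SYN(seq=y)
• Each entity answers the SYN that it receives
  • SYN+ACK(seq=x, ack=y+1)
  • SYN+ACK(seq=y, ack=x+1)
• Both entities issue CONNECT.conf : Connection established on both sides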
Negotiating options
• A : CONNECT.req, chooses Initial sequence number (x), Option proposed, sends SYN(seq=x),Option
• B : CONNECT.ind, CONNECT.resp, chooses Initial sequence number (y), Option accepted, sends SYN+ACK(ack=x+1,seq=y) Option
• A : CONNECT.conf, Option accepted, sends ACK(seq=x+1, ack=y+1), Connection established
• B : receives the ACK, Connection established
• The sequence numbers of all segments A->B will start at x+1
• The sequence numbers of all segments B->A will start at y+1
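The negotiation amounts to an intersection: an option is used on the connection only if it is proposed in the SYN and accepted in the SYN+ACK. A minimal sketch, where `negotiate_options` and the option names are illustrative assumptions:

```python
def negotiate_options(proposed_in_syn: set, supported_by_peer: set) -> set:
    """Options echoed in the SYN+ACK: those proposed and supported by the peer."""
    return proposed_in_syn & supported_by_peer


accepted = negotiate_options({"SACK", "Timestamps"}, {"SACK", "Window Scale"})
print(accepted)
```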
TCP options
• MSS
• Selective acknowledgements
• Timestamps
• Window Scale
• Multipath TCP
• ...
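Reliable data transfer
• Sender transmits "abcdef" as (seq=123,"abcd") and (seq=127,"ef")
• (seq=123,"abcd") is lost
• (seq=127,"ef") arrives out of sequence : “ef” placed in buffer, receiver replies (ack=123)
• Retransmission timer expires : Retransmission of all unacked segments
  • (seq=123,"abcd") : receiver delivers "abcdef" and replies (ack=129)
  • (seq=127,"ef") : unnecessary retransmission, receiver replies (ack=129) again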
Retransmission timer
• How to compute it ?
• round-trip-time may change frequently during the lifetime of a TCP connection
Retransmission timer
• Algorithm
  • timer = mean(rtt) + 4*std_dev(rtt)
  • est_mean(rtt) = (1-α)*est_mean(rtt) + α*rtt_measured
  • est_std_dev = (1-β)*est_std_dev + β*|rtt_measured - est_mean(rtt)|
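The estimator can be sketched as a small class. Several choices below are assumptions that the slide does not state: the weights 1/8 and 1/4 are the values commonly used for these two smoothing gains, the first measurement initialises the mean to the sample and the deviation to half of it, and the deviation is updated before the mean.

```python
class RtoEstimator:
    """Smoothed rtt estimator: timer = est_mean + 4 * est_std_dev."""

    def __init__(self, alpha: float = 1 / 8, beta: float = 1 / 4):
        self.alpha = alpha  # weight of a new sample in the mean (assumed)
        self.beta = beta    # weight of a new sample in the deviation (assumed)
        self.est_mean = None
        self.est_std_dev = None

    def update(self, rtt_measured: float) -> float:
        """Feed one rtt measurement and return the new timer value."""
        if self.est_mean is None:
            # First sample: assumed initialisation.
            self.est_mean = rtt_measured
            self.est_std_dev = rtt_measured / 2
        else:
            # The deviation uses the mean from before this sample.
            self.est_std_dev = ((1 - self.beta) * self.est_std_dev
                                + self.beta * abs(rtt_measured - self.est_mean))
            self.est_mean = ((1 - self.alpha) * self.est_mean
                             + self.alpha * rtt_measured)
        return self.est_mean + 4 * self.est_std_dev


estimator = RtoEstimator()
for sample in (0.100, 0.100, 0.300):
    print(round(estimator.update(sample), 4))
```

A single large sample raises the timer quickly because the deviation term is multiplied by four, while a stable rtt lets the timer shrink towards the mean.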
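RTT measurements
• Problem : an ack received after a retransmission is ambiguous
  • (seq=120,"xyz") is acknowledged by (ack=123) : measured rtt
  • (seq=123,"abcd") is sent, the timer expires and (seq=123,"abcd") is retransmitted
  • (ack=127) arrives : measured from the first transmission or from the retransmission, which is the good one ?
• Solution (Karn/Partridge)
  • Do not measure rtt of retransmitted segments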
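With Timestamp option
• Each segment carries the sender's timestamp (TS) and echoes the last timestamp received from the peer (TS echo)
• (seq=120,TS=1, TS echo=7, "xyz") is acknowledged by (ack=123, TS=12, TS echo=1) : measured rtt
• (seq=123,TS=3, TS echo=12, "abcd") is sent, the timer expires and (seq=123,TS=5, TS echo=12, "abcd") is retransmitted
• (ack=127, TS=17, TS echo=3) arrives : TS echo=3 identifies the first transmission, so the measured rtt is unambiguous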
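Both rules for taking an rtt sample fit in one function. This is a sketch with a hypothetical name (`rtt_sample`) and clock values expressed in the same ticks as the timestamps; the arrival times used in the example are invented for illustration.

```python
from typing import Optional


def rtt_sample(now: int, first_sent_at: int, retransmitted: bool,
               ts_echo: Optional[int] = None) -> Optional[int]:
    """Return an rtt measurement, or None if the ack must be ignored."""
    if ts_echo is not None:
        # Timestamp option: the echo identifies the exact transmission.
        return now - ts_echo
    if retransmitted:
        # Karn/Partridge: the ack is ambiguous, do not measure.
        return None
    return now - first_sent_at


# Segment first sent at tick 3, retransmitted at tick 5, ack arrives at tick 9.
print(rtt_sample(now=9, first_sent_at=3, retransmitted=True))
print(rtt_sample(now=9, first_sent_at=3, retransmitted=True, ts_echo=3))
```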
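Fast retransmit
• Sender transmits "abcdefghij" as (seq=123,"abcd"), (seq=127,"ef"), (seq=129,"gh") and (seq=131,"ij")
• (seq=123,"abcd") is lost
• (seq=127,"ef"), (seq=129,"gh") and (seq=131,"ij") arrive : Out of sequence, in buffer
  • each of them triggers a duplicate (ack=123)
• After the third duplicate (ack=123), sender retransmits (seq=123,"abcd") without waiting for the retransmission timer
• Receiver replies (ack=133)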
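On the sender side, fast retransmit only needs a counter of duplicate acks. The sketch below uses a hypothetical `DupAckDetector` class and a threshold of three duplicates; the ack sequence in the example is the one from the figure.

```python
class DupAckDetector:
    """Signals a fast retransmit when enough duplicate acks are seen."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.last_ack = None
        self.duplicates = 0

    def on_ack(self, ack: int) -> bool:
        """Return True exactly when the segment at `ack` should be retransmitted."""
        if ack == self.last_ack:
            self.duplicates += 1
            return self.duplicates == self.threshold
        # New ack value: restart counting.
        self.last_ack = ack
        self.duplicates = 0
        return False


detector = DupAckDetector()
print([detector.on_ack(a) for a in (123, 123, 123, 123, 133)])
```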
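Selective Acks
• Sender transmits "abcdefghij", receiver has already sent (ack=123)
  • (seq=123,"abcd") : Lost
  • (seq=127,"ef") : receiver replies (ack=123,sack:127-128)
  • (seq=129,"gh") : receiver replies (ack=123, sack:127-130)
  • (seq=131,"ij") : receiver replies (ack=123, sack:127-132)
• only 123-126 must be retransmitted
  • (seq=123,"abcd") is retransmitted, receiver replies (ack=133)
• Receiver reports SACK blocks
• Negotiated during establishment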
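From the cumulative ack and the SACK blocks, the sender can compute which bytes are still missing. This is a sketch with a hypothetical `ranges_to_retransmit` helper; it follows the slide's notation where a block such as 127-132 includes both ends, whereas the actual SACK option encodes the right edge as the sequence number following the last byte.

```python
def ranges_to_retransmit(ack: int, sack_blocks: list, next_seq: int) -> list:
    """Return inclusive (first, last) byte ranges not yet received.

    ack         -- cumulative acknowledgement (next byte expected)
    sack_blocks -- inclusive (first, last) ranges held by the receiver
    next_seq    -- sequence number of the next new byte the sender would send
    """
    missing = []
    cursor = ack
    for first, last in sorted(sack_blocks):
        if first > cursor:
            missing.append((cursor, first - 1))
        cursor = max(cursor, last + 1)
    if cursor < next_seq:
        missing.append((cursor, next_seq - 1))
    return missing


print(ranges_to_retransmit(123, [(127, 132)], 133))
```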
Delayed acks
• Sending an ack per segment is costly
• Tradeoff
  • In sequence data segment
    • no ack waiting, delay by up to 50msec
    • one ack waiting, send immediately
  • Out-of-sequence data segment
    • send ack immediately
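The receiver's decision can be sketched as a pure function. The names `ack_decision` and `MAX_ACK_DELAY` are hypothetical, and the timer that eventually sends a delayed ack is left out.

```python
MAX_ACK_DELAY = 0.050  # seconds, the bound given on the slide


def ack_decision(in_sequence: bool, ack_waiting: bool) -> str:
    """Return "send" to ack immediately or "delay" to wait up to MAX_ACK_DELAY."""
    if not in_sequence:
        return "send"   # out-of-sequence data: ack immediately
    if ack_waiting:
        return "send"   # a delayed ack is already pending: ack both segments now
    return "delay"      # first in-sequence segment: wait for more data


for case in [(True, False), (True, True), (False, False)]:
    print(case, ack_decision(*case))
```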
When to send data ?
• When should a segment be sent ?
• After each write system call
• When there is a full segment of data
Nagle algorithm
• A new data segment can be sent if one of these conditions holds
  • This is a full segment (MSS bytes)
  • There are no unacknowledged bytes
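The test applied by the Nagle algorithm before sending can be written in a few lines. A minimal sketch with a hypothetical function name; the MSS value in the example is only an illustration.

```python
def nagle_can_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Return True if a new data segment may be sent now."""
    if pending_bytes <= 0:
        return False  # nothing to send
    full_segment = pending_bytes >= mss
    nothing_in_flight = unacked_bytes == 0
    return full_segment or nothing_in_flight


print(nagle_can_send(pending_bytes=10, mss=1460, unacked_bytes=0))
print(nagle_can_send(pending_bytes=10, mss=1460, unacked_bytes=500))
print(nagle_can_send(pending_bytes=1460, mss=1460, unacked_bytes=500))
```

Small writes are therefore grouped while earlier data is still unacknowledged, at the cost of some added delay.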
Observed IP packets
http://www.caida.org/research/traffic-analysis/pkt_size_distribution/graphs.xml
Flow control
• Sender state : Last_ack=122, swin=100, rwin=4. To transmit : abcdefghijklm
• Sender sends (seq=122,"abcd") : Last_ack=122, swin=96, rwin=0
• Sender receives (ack=126,rwin=0) : Last_ack=126, swin=100, rwin=0
• Sender receives (ack=126,rwin=2) : Last_ack=126, swin=100, rwin=2
• Sender sends (seq=126,"ef") : Last_ack=126, swin=98, rwin=0
• Sender receives (ack=128,rwin=20) : Last_ack=128, swin=100, rwin=20
• Sender sends (seq=128,"ghijklm") : Last_ack=128, swin=93, rwin=13
• Sender receives (ack=135,rwin=20) : Last_ack=135, swin=100, rwin=20
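The sender state in this trace can be reproduced with a small simulation. It is a sketch under assumptions: `WindowedSender` is a hypothetical class, swin counts the free space in the sending window, and rwin is the advertised window minus the bytes still in flight.

```python
class WindowedSender:
    """Tracks Last_ack, swin and rwin as in the flow control trace."""

    def __init__(self, last_ack: int, swin: int, rwin: int, data: str):
        self.last_ack = last_ack
        self.next_seq = last_ack
        self.swin = swin
        self.rwin = rwin
        self.data = data  # bytes still to transmit

    def send(self):
        """Send as much as both windows allow; return (seq, payload) or None."""
        size = min(len(self.data), self.swin, self.rwin)
        if size == 0:
            return None
        payload, self.data = self.data[:size], self.data[size:]
        segment = (self.next_seq, payload)
        self.next_seq += size
        self.swin -= size
        self.rwin -= size
        return segment

    def on_ack(self, ack: int, rwin: int) -> None:
        """Process an acknowledgement carrying a window advertisement."""
        self.swin += ack - self.last_ack         # acked bytes leave the window
        self.last_ack = ack
        self.rwin = rwin - (self.next_seq - ack)  # minus bytes in flight


sender = WindowedSender(last_ack=122, swin=100, rwin=4, data="abcdefghijklm")
print(sender.send(), sender.swin, sender.rwin)
sender.on_ack(126, 0)
print(sender.send())
sender.on_ack(126, 2)
print(sender.send(), sender.swin, sender.rwin)
```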
TCP flow control
• Performance function of window size
• Throughput ~= window/rtt
• TCP window : 16 bits field
• RFC1323 Window scale extension
Window      rtt = 1 msec   rtt = 10 msec   rtt = 100 msec
8 Kbytes    65.5 Mbps      6.5 Mbps        0.66 Mbps
64 Kbytes   524.3 Mbps     52.4 Mbps       5.2 Mbps
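The values in the table follow from throughput ~= window/rtt. The sketch below recomputes them, assuming that 1 Kbyte is 1024 bytes and 1 Mbps is 10^6 bits per second; `max_throughput_mbps` is a hypothetical helper.

```python
def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round-trip-time."""
    return window_bytes * 8 / rtt_seconds / 1e6


for window_kbytes in (8, 64):
    row = [max_throughput_mbps(window_kbytes * 1024, rtt)
           for rtt in (0.001, 0.010, 0.100)]
    print(window_kbytes, "Kbytes:", [round(value, 2) for value in row])
```

With a 16 bits window field, the largest window without the Window scale extension is just under 64 Kbytes, which is why the second row is the practical ceiling for an unscaled connection.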
Connection release
• A : Last sent data : x-1, DISCONNECT.req (A-B), sends FIN(seq=x)
• B : DISCONNECT.ind(A-B), incoming connection closed, sends ACK(ack=x+1)
• A : DISCONNECT.conf(A-B), outgoing connection closed
• B : Last sent data : y-1, DISCONNECT.req(B-A), sends FIN(seq=y)
• A : DISCONNECT.ind(B-A), incoming connection closed, sends ACK(ack=y+1)
• B : DISCONNECT.conf(B-A), outgoing connection closed, State can be removed
• A : Time WAIT
  • Maintain state for this connection during twice MSL to be able to retransmit ACK if a segment is received from the other entity
  • then State can be removed
Abrupt release
• A : Last sent data : x, DISCONNECT.req (abrupt), sends RST(seq=x)
  • Connection closed, State can be removed
• B : receives the RST, DISCONNECT.ind(abrupt)
  • Connection closed, State can be removed
• Data segments can be lost during such an abrupt release
• No entity needs to wait in TIME_WAIT state after such a release
  • anyway, any segment received when there is no state causes the transmission of a RST segment
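TCP connection release
• Entity that sends the first FIN
  • Established -> FIN Wait1 : !FIN
  • SYN RCVD -> FIN Wait1 : !FIN
  • FIN Wait1 -> FIN Wait2 : ?ACK
  • FIN Wait1 -> Closing : ?FIN/!ACK
  • FIN Wait2 -> TIME Wait : ?FIN/!ACK
  • Closing -> TIME Wait : ?ACK
  • TIME Wait -> Closed : Timeout[2MSL]
• Entity that receives the first FIN
  • Established -> CLOSE Wait : ?FIN/!ACK
  • CLOSE Wait -> LAST-ACK : !FIN
  • LAST-ACK -> Closed : ?ACK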
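The release state machine can be encoded as a table and walked for the two usual paths. This is a sketch: the pairing of labels and states is assumed to follow the standard TCP state diagram, and `RELEASE` and `walk` are hypothetical names.

```python
# (state, label) -> next state, using the labels of the figure
RELEASE = {
    ("Established", "!FIN"): "FIN Wait1",
    ("SYN RCVD", "!FIN"): "FIN Wait1",
    ("FIN Wait1", "?ACK"): "FIN Wait2",
    ("FIN Wait1", "?FIN/!ACK"): "Closing",
    ("FIN Wait2", "?FIN/!ACK"): "TIME Wait",
    ("Closing", "?ACK"): "TIME Wait",
    ("TIME Wait", "Timeout[2MSL]"): "Closed",
    ("Established", "?FIN/!ACK"): "CLOSE Wait",
    ("CLOSE Wait", "!FIN"): "LAST-ACK",
    ("LAST-ACK", "?ACK"): "Closed",
}


def walk(labels, state="Established"):
    """Return the list of states visited; raises KeyError on an invalid label."""
    visited = [state]
    for label in labels:
        state = RELEASE[(state, label)]
        visited.append(state)
    return visited


# Entity that closes first, then entity that closes second.
print(walk(["!FIN", "?ACK", "?FIN/!ACK", "Timeout[2MSL]"]))
print(walk(["?FIN/!ACK", "!FIN", "?ACK"]))
```

Only the entity that sends the first FIN goes through TIME Wait; the other one reaches Closed as soon as its own FIN is acknowledged.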
TCP limitations
• Service
• Only supports bytestream service
• Extensibility
• Limited space for options
• Security
• Various issues like Denial of Service attacks
TCP establishment
SYN(Src=C,seq=x)
CONNECT.ind
SYN+ACK(Dest=C,ack=x+1,seq=y)
ACK(Src=A,seq=x)
CONNECT.req
DoS attack
• Attacker sends 1000s of SYNs
  • SYN(Src=A,seq=x) : server issues CONNECT.ind and replies SYN+ACK(Dest=A,ack=x+1,seq=y)
  • SYN(Src=B,seq=x) : server issues CONNECT.ind and replies SYN+ACK(Dest=B,ack=x+1,seq=z)
TCP Security
• 20th century security
  • Server trusts Alice but not Bob
  • Server accepts all TCP connections from Alice's IP address without asking a password
  • Server always asks a password from Bob's IP address
TCP Security
• Can Bob create a fake TCP connection by spoofing Alice's IP when she is away ?
• Normal establishment
  • CONNECT.req : SYN(seq=x)
  • CONNECT.ind, CONNECT.resp : SYN+ACK(ack=x+1,seq=y)
  • CONNECT.conf : ACK(seq=x+1, ack=y+1)
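TCP Security
• Bob's view of the transfer
  • Bob sends SYN(Src=A,seq=x)
  • Server replies SYN+ACK(Dst=A,ack=x+1,seq=y) to Alice's address : Bob does not receive it
  • Bob must guess y to send a valid ACK(seq=x+1, ack=y+1)
  • Bob can then send Data(Src=A,seq=x+1)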
SYN Cookies
• Stateless passive opener
• Client : CONNECT.req, sends SYN(seq=x)
• Server : replies SYN+ACK(ack=x+1,seq=y) with y=Hash(IPClient,PortClient,Secret)
  • No state created
• Client : CONNECT.conf, sends ACK(seq=x+1, ack=y+1)
• Server : Verify that ack=1+Hash(IPClient,PortClient,Secret)
  • State is created, CONNECT.ind
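The cookie computation and its verification can be sketched with a keyed hash. This is a simplification of the idea on the slide, not what deployed implementations do: they also encode information such as a time counter and the MSS in the cookie. The function names, the choice of HMAC-SHA256 and the example address are assumptions.

```python
import hashlib
import hmac

SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit


def syn_cookie(ip_client: str, port_client: int, secret: bytes) -> int:
    """y = Hash(IPClient, PortClient, Secret), truncated to 32 bits."""
    message = f"{ip_client}:{port_client}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack: int, ip_client: str, port_client: int,
                 secret: bytes) -> bool:
    """Check ack = 1 + Hash(...) without any stored per-connection state."""
    expected = (syn_cookie(ip_client, port_client, secret) + 1) % SEQ_MOD
    return ack == expected


secret = b"server-secret"  # hypothetical value, kept private by the server
y = syn_cookie("192.0.2.1", 40000, secret)
print(ack_is_valid((y + 1) % SEQ_MOD, "192.0.2.1", 40000, secret))
```

Because y is recomputed from the incoming ACK, a flood of SYNs with spoofed source addresses does not consume memory on the server.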
SCTP
• Segment format
SCTP connection establishment
TCP Congestion Control
• Congestion detection
  • Packet loss
  • Explicit Congestion Notification
• Congestion control
  • Additive Increase Multiplicative Decrease
Additive Increase
• No congestion ?
  • All acks move window
• Additive increase
  • Increment cwnd by one MSS every rtt
• Figure : Cwnd as a function of Time
Faster increase
• How to speed up the growth of the congestion window at connection startup ?
• Slow-start
  • Double cwnd every rtt
• Figure : Cwnd as a function of Time. Slow-start : exponential increase of cwnd, up to Max window
Multiplicative decrease
• How to detect congestion ?
• Three duplicate acks
  • mild congestion for TCP
  • cwnd/2 and restart additive increase
• Expiration of retransmission timer
  • severe congestion
  • Reset cwnd at 1 MSS
  • Perform slow-start until half previous cwnd and then continue with congestion avoidance
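Mild congestion
• Figure : Cwnd as a function of Time, with Fast retransmit events
  • Slow-start : exponential increase of cwnd up to Threshold
  • Congestion avoidance : linear increase of cwnd
  • At each Fast retransmit, a new Threshold is set
Severe congestion
• Figure : Cwnd as a function of Time, with Timer expiration events
  • Slow-start : exponential increase of cwnd up to Threshold
  • Congestion avoidance : linear increase of cwnd
  • At each Timer expiration, a new Threshold is set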
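The evolution of cwnd can be simulated one rtt at a time. This is a sketch under assumptions: cwnd and the threshold are counted in MSS, slow-start is capped at the threshold instead of overshooting it, and the names `on_rtt` and `simulate` as well as the initial threshold of 16 are illustrative.

```python
def on_rtt(cwnd: int, threshold: int, event: str = "ok"):
    """Return (cwnd, threshold) after one round-trip-time.

    event is "ok" (no congestion), "dupacks" (three duplicate acks)
    or "timeout" (expiration of the retransmission timer).
    """
    if event == "dupacks":
        # Mild congestion: halve cwnd and continue with additive increase.
        cwnd = max(cwnd // 2, 1)
        return cwnd, cwnd
    if event == "timeout":
        # Severe congestion: restart slow-start, threshold is half the old cwnd.
        return 1, max(cwnd // 2, 1)
    if cwnd < threshold:
        return min(cwnd * 2, threshold), threshold  # slow-start
    return cwnd + 1, threshold                      # congestion avoidance


def simulate(events, cwnd: int = 1, threshold: int = 16):
    """Return the value of cwnd at the start and after each event."""
    trace = [cwnd]
    for event in events:
        cwnd, threshold = on_rtt(cwnd, threshold, event)
        trace.append(cwnd)
    return trace


events = ["ok"] * 5 + ["dupacks", "ok", "timeout", "ok", "ok", "ok"]
print(simulate(events))
```

Plotting the trace gives the sawtooth of the figures: exponential growth up to the threshold, linear growth after it, a halving on duplicate acks and a restart from 1 MSS on a timer expiration.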