TCP enhancements
Hojun Lee [email protected]
06/14/22

Page 1: TCP enhancements

04/22/23 1

TCP enhancements

Hojun Lee [email protected]

Page 2: TCP enhancements

Many variants of TCP

Tahoe TCP: follows a basic go-back-n model using slow start, congestion avoidance, and the Fast Retransmit algorithm. With Fast Retransmit, after receiving a small number of duplicate ACKs for the same segment, the sender infers that the packet has been lost and retransmits it without waiting for the retransmission timer to expire.

Reno TCP
– Modifies the Tahoe TCP Fast Retransmit algorithm to include Fast Recovery; this prevents the pipe from going empty after Fast Retransmit, thereby avoiding the need to slow start after a single packet loss
– Recovers 1 lost segment every 3 RTTs

New Reno
– Uses partial acknowledgements to improve loss recovery
– Recovers 1 lost segment every RTT

SACK TCP
– Uses the SACK option field to improve loss recovery
– Recovers up to 3 segments per RTT

Other schemes exist (e.g., Vegas)

Page 3: TCP enhancements

Outline

LFN (long fat network)
– Needs some TCP options such as window scale, timestamp, and PAWS

Methods to recover from multiple packet losses in a window
– SACK
– TCP with “partial acknowledgements”

Effects of increasing window size

Page 4: TCP enhancements

Long fat pipes problems

The TCP window size is a 16-bit field in the TCP header, limiting the window to 65,535 bytes
– Can be solved with the “window scale option”

Packet loss in an LFN can reduce throughput drastically (possible solutions for multiple packet losses within a window?)
– SACK (selective acknowledgements)
– New Reno (uses partial acknowledgements)

Better RTT measurements are required for operating on an LFN
– Timestamp option

The network may be so fast that sequence number wrap occurs in less than the MSL (maximum segment lifetime)
– PAWS algorithm (Protection Against Wrapped Sequence Numbers)

Page 5: TCP enhancements

Window scale option

Increases the definition of the TCP window from 16 to 32 bits
The 1-byte shift count is between 0 and 14. The maximum value of 14 gives a window of 1,073,725,440 bytes (65535 × 2^14)
Appears in a SYN segment

Format

kind = 3 (1 byte) | len = 3 (1 byte) | shift count (1 byte)
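The arithmetic behind the maximum window can be checked with a short sketch. The function and constant names are illustrative assumptions, not part of the slides; the only facts used are the 16-bit window field and the 0 to 14 shift-count range stated above.

```python
RAW_WINDOW_MAX = 0xFFFF  # largest value of the 16-bit window field (65,535)
MAX_SHIFT = 14           # largest shift count allowed by the option


def effective_window(raw_window: int, shift_count: int) -> int:
    """Return the window in bytes after applying the window scale shift."""
    if not 0 <= raw_window <= RAW_WINDOW_MAX:
        raise ValueError("raw window must fit in 16 bits")
    if not 0 <= shift_count <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    # Scaling is a left shift, i.e. multiplication by 2**shift_count.
    return raw_window << shift_count


if __name__ == "__main__":
    print(effective_window(RAW_WINDOW_MAX, 0))
    print(effective_window(RAW_WINDOW_MAX, MAX_SHIFT))
```

With a shift count of 0 the window is unchanged; with 14 it reaches the 1,073,725,440-byte figure quoted on the slide.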

Page 6: TCP enhancements

Timestamp option

Lets the sender place a timestamp value in every segment
The receiver echoes this value in the ACK, allowing the sender to calculate an RTT for each received ACK
Negotiated in the SYN segments
Larger window sizes require better RTT calculation
Does not require any form of clock synchronization between the two hosts

Format

kind = 8 (1 byte) | len = 10 (1 byte) | timestamp value (4 bytes) | timestamp echo reply (4 bytes)
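A minimal sketch of why no clock synchronization is needed: the sender only ever compares its own clock with its own echoed value. The helper names and the use of abstract "ticks" are assumptions for illustration; the field sizes follow the format above.

```python
import struct

MASK32 = 0xFFFFFFFF  # timestamps are 4-byte unsigned values


def pack_timestamp_option(ts_value: int, ts_echo_reply: int) -> bytes:
    """Encode kind=8, len=10, then the two 4-byte timestamps (network order)."""
    return struct.pack("!BBII", 8, 10, ts_value & MASK32, ts_echo_reply & MASK32)


def rtt_sample(now_ticks: int, ts_echo_reply: int) -> int:
    """RTT in sender clock ticks, tolerant of 32-bit timestamp wrap-around."""
    return (now_ticks - ts_echo_reply) & MASK32


if __name__ == "__main__":
    option = pack_timestamp_option(1000, 0)
    print(len(option))             # total option length in bytes
    print(rtt_sample(1050, 1000))  # echoed value 1000, ACK seen at tick 1050
```

Because both operands of the subtraction come from the sender's clock, the receiver's clock never enters the calculation.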

Page 7: TCP enhancements

PAWS

Largest receiver window = 2^30 bytes = 1 GB
A “lost” segment may reappear before the MSL expires, and the sequence numbers may have wrapped around by then.
The receiver treats the timestamp as an extension of the sequence number and discards out-of-sequence segments based on both the sequence number and the timestamp.
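The timestamp half of that check can be sketched as follows. This is a simplified illustration, not a full PAWS implementation: it ignores the normal sequence-number window test and the rules for updating the stored timestamp, and the names `ts_recent` and `paws_accept` are assumptions.

```python
MASK32 = 0xFFFFFFFF
HALF_RANGE = 0x7FFFFFFF  # half of the 32-bit timestamp space


def ts_older(a: int, b: int) -> bool:
    """True if timestamp a is older than b under 32-bit modular comparison."""
    return ((a - b) & MASK32) > HALF_RANGE


def paws_accept(segment_ts: int, ts_recent: int) -> bool:
    """Reject a segment whose timestamp is older than the most recent one seen."""
    return not ts_older(segment_ts, ts_recent)


if __name__ == "__main__":
    print(paws_accept(100, 90))         # newer timestamp: accepted
    print(paws_accept(80, 90))          # stale duplicate: rejected
    print(paws_accept(5, 0xFFFFFFF0))   # timestamp wrapped, still newer: accepted
```

An old segment that reappears after the sequence space has wrapped still carries its old timestamp, which is what lets the receiver discard it.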

Page 8: TCP enhancements

Useful terms

LW (Loss Window): size of the congestion window after a TCP sender detects loss using its retransmission timer
RW (Restart Window): size of the congestion window after a TCP restarts transmission after an idle period
Flight Size: the amount of data that has been sent but not yet acknowledged

Page 9: TCP enhancements

Two methods to detect segment loss

TO (Timeout)
TD (Triple Duplicate ACK)

Page 10: TCP enhancements

Detect a loss by TO

Set cwnd = 1 segment
Set ssthresh = max(FlightSize/2, 2*MSS)
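The timeout response can be written out as a small function. This is a sketch under the assumption that cwnd and ssthresh are tracked in bytes, so "cwnd = 1" becomes one MSS; the function name is illustrative.

```python
def on_retransmission_timeout(flight_size: int, mss: int) -> tuple[int, int]:
    """Return (cwnd, ssthresh) in bytes after a loss detected by timeout."""
    # Halve the outstanding data, but never go below two segments.
    ssthresh = max(flight_size // 2, 2 * mss)
    # Restart from a single segment and slow start up to ssthresh.
    cwnd = mss
    return cwnd, ssthresh


if __name__ == "__main__":
    print(on_retransmission_timeout(flight_size=10_000, mss=1_000))
    print(on_retransmission_timeout(flight_size=1_500, mss=1_000))
```

The second call shows the 2*MSS floor taking effect when little data was in flight.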

Page 11: TCP enhancements

Detect a loss by TD (Fast Retransmit and Fast Recovery procedure)

After receiving 3 duplicate ACKs, TCP retransmits what appears to be the missing segment without waiting for the retransmission timer to expire
Good for a single loss within a window but not good for multiple losses

[Figure: segments 10 to 13 are sent and segment 10 is lost (X); the receiver returns duplicate "ack 10"s; the sender detects TD and retransmits 10; the receiver then replies with "ack 14"]

Page 12: TCP enhancements

Fast retransmit and fast recovery algorithms

1. When the third duplicate ACK is received, set ssthresh = max(FlightSize/2, 2*MSS)
2. Retransmit the lost segment and then set cwnd = ssthresh + 3*MSS (inflating the window)
   The reason is that three duplicate ACKs imply that three segments got through: under the TCP rules, a receiver must generate an ACK whenever a new packet arrives.
3. For each additional duplicate ACK received, increase cwnd by 1 MSS.
4. Transmit a segment, if allowed by min(cwnd, receiver’s AW (advertised window))
5. When the next ACK arrives that acknowledges new data, set cwnd = ssthresh (the value set in step 1) (deflating the window)
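The five steps can be traced with a small state holder. This is a sketch rather than a complete sender: it tracks only the window variables, counts retransmissions instead of sending packets, and keeps everything in bytes. The class and attribute names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    mss: int
    cwnd: int
    ssthresh: int
    flight_size: int
    dup_acks: int = 0
    in_fast_recovery: bool = False
    retransmissions: int = 0

    def on_dup_ack(self) -> None:
        self.dup_acks += 1
        if self.in_fast_recovery:
            # Step 3: each further duplicate ACK inflates the window by one MSS.
            self.cwnd += self.mss
        elif self.dup_acks == 3:
            # Step 1: halve the outstanding data, with a floor of two segments.
            self.ssthresh = max(self.flight_size // 2, 2 * self.mss)
            # Step 2: retransmit the missing segment and inflate the window.
            self.retransmissions += 1
            self.cwnd = self.ssthresh + 3 * self.mss
            self.in_fast_recovery = True

    def can_send(self, advertised_window: int) -> bool:
        # Step 4: a new segment must fit within min(cwnd, advertised window).
        return self.flight_size + self.mss <= min(self.cwnd, advertised_window)

    def on_new_ack(self) -> None:
        if self.in_fast_recovery:
            # Step 5: deflate the window back to the value set in step 1.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        self.dup_acks = 0


if __name__ == "__main__":
    s = RenoSender(mss=1000, cwnd=8000, ssthresh=64000, flight_size=8000)
    for _ in range(5):
        s.on_dup_ack()
    print(s.cwnd, s.ssthresh, s.can_send(advertised_window=65535))
    s.on_new_ack()
    print(s.cwnd, s.in_fast_recovery)
```

In this trace the window only allows a new segment once the fifth duplicate ACK has inflated cwnd past the amount of data still counted as in flight.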

Page 13: TCP enhancements

Problem occurs if multiple packet losses happen within a window

Two possible answers
– TCP with SACK
– TCP with partial acknowledgement (New Reno TCP)

[Figure: timeline of a window with two lost segments (X). Duplicate "ack 10"s trigger the first TD and the retransmission of segment 10; the resulting partial "ack 12" and the duplicate "ack 12"s that follow lead to a 2nd TD]

Page 14: TCP enhancements

TCP with SACK (Selective Acknowledgement)

Based on RFC 2018 (TCP Selective Acknowledgement Options), standards track
Good for when multiple packets are lost from one window of data
Gives the sender a view of which segments are queued at the receiver and which are in flight
Uses two TCP options
– “SACK permitted” (may be sent in a SYN segment)
– SACK option itself (may be sent over an established connection)

Page 15: TCP enhancements

Format of two SACK options

Sack-permitted option
– Two-byte option

Kind = 4 | Length = 2

Sack option itself (Kind = 5, Length = variable)

Kind = 5 | Length
Left edge of 1st block
Right edge of 1st block
...
Left edge of nth block
Right edge of nth block

Page 16: TCP enhancements

SACK option examples

Assume the left edge is 5000 and the transmitter sends a burst of 8 segments, each containing 500 data bytes

Example (1): The first four segments are received but the last 4 are dropped
- The data receiver will return a normal TCP ACK segment acknowledging sequence number 7000, with no SACK option

[Figure: transmitter sends segments 1 to 8; segments 5 to 8 are lost (x); receiver returns ACK 7000]

Page 17: TCP enhancements

SACK option examples con’t [1]

Example (2): The first segment is lost but the remaining 7 segments are received.
- The receiver will return a TCP ACK segment that acknowledges sequence number 5000 and contains a SACK option specifying one block of queued data
- LE = Left Edge
- RE = Right Edge

[Figure: transmitter sends segments 1 to 8; segment 1 is lost (x). The receiver returns one ACK per arriving segment:]
ACK 5000; LE 5500; RE 6000
ACK 5000; LE 5500; RE 6500
ACK 5000; LE 5500; RE 7000
ACK 5000; LE 5500; RE 7500
ACK 5000; LE 5500; RE 8000
ACK 5000; LE 5500; RE 8500
ACK 5000; LE 5500; RE 9000

Page 18: TCP enhancements

SACK option examples con’t [2]

Example (3): The 2nd, 4th, 6th and 8th (last) segments are dropped.

[Figure: transmitter sends segments 1 to 8; segments 2, 4, 6 and 8 are lost (x); the receiver returns ACKs (a) to (d), one for each segment that arrives]

      ACK    First block            Second block           Third block
             Left Edge  Right Edge  Left Edge  Right Edge  Left Edge  Right Edge
(a)   5500   SACK not used
(b)   5500   6000       6500
(c)   5500   7000       7500        6000       6500
(d)   5500   8000       8500        7000       7500        6000       6500

Page 19: TCP enhancements

SACK option examples con’t [3]

Suppose at this point (continuing from the previous example), the 4th packet is received out of order.
– Receiver replies with the following SACK:

ACK    First block            Second block
       Left edge  Right edge  Left edge  Right edge
5500   6000       7500        8000       8500

Suppose that the 2nd segment is then received.
– Receiver replies with the following SACK:

ACK    First block
       Left edge  Right edge
7500   8000       8500
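The SACK examples above can be reproduced with a toy receiver that tracks which byte ranges are queued. This is a sketch, not the behaviour of any particular TCP stack: the class name, the half-open `(left, right)` block representation, and the limit of three reported blocks (matching the three columns in the tables) are assumptions.

```python
class SackReceiver:
    """Toy receiver that reports (ACK number, SACK blocks) per arriving segment."""

    def __init__(self, next_expected: int, max_blocks: int = 3) -> None:
        self.rcv_nxt = next_expected   # cumulative ACK point
        self.blocks: list[tuple[int, int]] = []  # queued data, newest block first
        self.max_blocks = max_blocks

    def receive(self, left: int, right: int) -> tuple[int, list[tuple[int, int]]]:
        new_left, new_right = left, right
        untouched = []
        for block_left, block_right in self.blocks:
            if block_left <= new_right and new_left <= block_right:
                # Overlapping or adjacent: merge into the block being built.
                new_left = min(new_left, block_left)
                new_right = max(new_right, block_right)
            else:
                untouched.append((block_left, block_right))
        if new_left <= self.rcv_nxt:
            # The hole at the left edge is filled: advance the cumulative ACK.
            self.rcv_nxt = max(self.rcv_nxt, new_right)
            self.blocks = untouched
        else:
            # The most recently changed block is reported first.
            self.blocks = [(new_left, new_right)] + untouched
        return self.rcv_nxt, self.blocks[: self.max_blocks]


if __name__ == "__main__":
    rx = SackReceiver(next_expected=5000)
    # Example (3): segments 1, 3, 5 and 7 arrive (500 bytes each).
    for start in (5000, 6000, 7000, 8000):
        print(rx.receive(start, start + 500))
    print(rx.receive(6500, 7000))  # the 4th segment arrives out of order
    print(rx.receive(5500, 6000))  # the 2nd segment arrives
```

Feeding the arrivals of Example (3) and its continuation yields the same ACK numbers and block edges as the tables, including the merge into a single 6000 to 7500 block once the 4th segment arrives.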

Page 20: TCP enhancements

New Reno algorithm

Based on RFC 2582 (The NewReno Modification to TCP’s Fast Recovery Algorithm), experimental
Without SACK, little information is available to the TCP sender for making retransmission decisions during Fast Recovery
Uses “partial acknowledgements” (ACKs which cover new data, but not all the data outstanding when loss was detected)
Recovers 1 lost segment every RTT

Page 21: TCP enhancements

New Reno algorithm con’t [1]

Refer to slide 12
Variable “recover”: used to record the highest sequence number transmitted
Add the variable “recover” in step 1.
Step 5: when an ACK arrives that acknowledges new data, this ACK could be the acknowledgement elicited by the retransmission from step 2, or elicited by a later transmission

Page 22: TCP enhancements

New Reno algorithm con’t [2]

Two possibilities

1. If this ACK acknowledges all of the data up to and including “recover”, then the ACK acknowledges all the intermediate segments sent between the original transmission of the lost segment and the receipt of the third duplicate ACK
   Set either cwnd = min(ssthresh, FlightSize + MSS) or cwnd = ssthresh (set in step 1), where FlightSize is the amount of data outstanding when Fast Recovery is exited
2. If this ACK does not acknowledge all of the data up to and including “recover”, then this is a partial ACK
   Set cwnd = (the previous cwnd deflated by the amount of new data acknowledged) + MSS
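The two cases can be sketched as one function. Several things here are assumptions made for illustration: all quantities are in bytes, "recover" holds the highest sequence number sent so an ACK covers it when the ACK number is greater, sequence-number wrap is ignored, and the first of the two cwnd choices is used on a full ACK.

```python
def newreno_on_new_ack(
    ack_no: int,
    snd_una: int,
    recover: int,
    cwnd: int,
    ssthresh: int,
    flight_size: int,
    mss: int,
) -> tuple[int, bool]:
    """Return (new cwnd, still in Fast Recovery) for an ACK of new data."""
    if ack_no > recover:
        # Case 1: full ACK, everything outstanding at loss detection is covered.
        return min(ssthresh, flight_size + mss), False
    # Case 2: partial ACK, deflate by the newly acknowledged data, add one MSS.
    newly_acked = ack_no - snd_una
    return cwnd - newly_acked + mss, True


if __name__ == "__main__":
    # Partial ACK: 2000 new bytes acknowledged, recovery continues.
    print(newreno_on_new_ack(12000, 10000, 20000, 9000, 5000, 8000, 1000))
    # Full ACK: recovery ends and the window is deflated.
    print(newreno_on_new_ack(20001, 12000, 20000, 8000, 5000, 6000, 1000))
```

Staying in Fast Recovery on a partial ACK is what lets the sender repair one lost segment per RTT without waiting for a timeout.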

Page 23: TCP enhancements

New Reno algorithm con’t [3]

Possible variants to the simple response to partial acknowledgements
– How many packets to retransmit after each partial ACK?
– When to reset the retransmit timer after a partial ACK?
  • Reset the retransmit timer only after the first partial ACK (Impatient variant of NewReno)
  • Reset the retransmit timer after each partial ACK (Slow-but-steady variant of NewReno)
– How to avoid multiple Fast Retransmits caused by the retransmission of packets already received by the receiver?

Page 24: TCP enhancements

New Reno algorithm con’t [4]

Avoiding multiple fast retransmits
Reason: the TCP data sender is unable to distinguish between a duplicate ACK that results from a lost or delayed data packet, and a duplicate ACK that results from the sender’s retransmission of a data packet that had already been received at the receiver
Needs a new variable called “send_high” = the highest sequence number transmitted so far, recorded after each retransmit timeout

Page 25: TCP enhancements

New Reno algorithm con’t [5]

Example (a)
Assumption: the sender is not in the Fast Recovery procedure at this point
When the third duplicate ACK is received and the sender is not already in the Fast Recovery procedure, check whether those duplicate ACKs cover more than “send_high” or not. If they do:
• Set ssthresh = max(FlightSize/2, 2*MSS)
• Set the highest sequence number transmitted in the variable called “recover”
• Go to step 2

[Figure: send_high = 12; segments 10 to 16 appear on the timeline with one loss (X), and the receiver returns duplicate "ack 13"s. Labels: "1st Scenario", "2nd Scenario", "Wait for RTO"]

Page 26: TCP enhancements

New Reno algorithm con’t [6]

Example (b): when the duplicate ACKs don’t cover “send_high”, do nothing.
– Do not enter the fast retransmit and fast recovery procedure
– Do not change the ssthresh value
– Do not go to step 2 to retransmit the lost segment
– Do not execute step 3 upon receiving subsequent duplicate ACKs
– After a retransmit timeout, record the highest sequence number in “send_high” and exit the Fast Recovery procedure if applicable

[Figure: send_high = 12; segments 13, 14 and 15 are sent; the receiver returns three "ack 11"s]
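The "send_high" test in examples (a) and (b) reduces to one comparison. The sketch below uses the packet numbers from the figures and assumes that "cover more than send_high" means the duplicate ACK number is greater than send_high; the function name is illustrative.

```python
def should_enter_fast_retransmit(
    dup_ack_no: int,
    dup_ack_count: int,
    send_high: int,
    in_fast_recovery: bool,
) -> bool:
    """Decide whether the third duplicate ACK should trigger Fast Retransmit."""
    if in_fast_recovery or dup_ack_count != 3:
        return False
    # Duplicate ACKs at or below send_high may stem from segments that were
    # retransmitted after a timeout, so they are ignored.
    return dup_ack_no > send_high


if __name__ == "__main__":
    # Example (a): "ack 13" with send_high = 12.
    print(should_enter_fast_retransmit(13, 3, 12, in_fast_recovery=False))
    # Example (b): "ack 11" with send_high = 12.
    print(should_enter_fast_retransmit(11, 3, 12, in_fast_recovery=False))
```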

Page 27: TCP enhancements

Increase IW (Initial Window) size

Limited to 2 segments (RFC 2581)
– Standards track RFC (not experimental)

Upper bound for IW is given more precisely:
– IW = min(4*MSS, max(2*MSS, 4380 bytes))

4 segments (RFC 2416), informational
– A simple experiment with only 3 buffers leading into a 9600 baud modem at the receiver
– No significant degradation of performance even when the IW size is 4
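Evaluating the upper-bound formula for a few segment sizes shows how the number of initial segments depends on the MSS. The function name and the sample MSS values are illustrative choices.

```python
def initial_window_upper_bound(mss: int) -> int:
    """IW upper bound in bytes: min(4*MSS, max(2*MSS, 4380))."""
    return min(4 * mss, max(2 * mss, 4380))


if __name__ == "__main__":
    for mss in (536, 1460, 2190, 4000):
        iw = initial_window_upper_bound(mss)
        print(f"MSS={mss:5d}  IW={iw:5d} bytes  ({iw // mss} full segments)")
```

Small segments get the full four-segment window, a 1460-byte MSS gets three segments, and large segments fall back to two.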

Page 28: TCP enhancements

Advantages/disadvantages of larger IW

Advantages
– With an IW of at least two segments, the receiver will generate an ACK after the second data segment arrives (eliminates the wait for the delayed-ACK timeout, ~200 msec)
– For small file sizes, delay can be improved from 3 RTTs down to 1 (email and web page transfers of less than 4 Kbytes take 1 RTT)

Disadvantages
– A burst of 4 segments (a small burst) may be more than a router can handle
– Slightly increases the packet drop rate

Page 29: TCP enhancements

Re-starting after idle connections

A known problem with the TCP congestion control algorithm:
– A potentially inappropriate burst of traffic can be transmitted after TCP has been idle for a relatively long period of time (a line-rate burst occurs when the source has been idle, but can also be due to ACK losses)
– Idle time: more than one retransmission timeout (RTO)
– When TCP does not receive a segment during the idle time, set RW (Restart Window) = IW
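The restart rule can be sketched as a single check made before transmission resumes. It follows the slide's definition RW = IW; the function name, the use of seconds for the timers, and the byte values in the usage lines are assumptions.

```python
def cwnd_after_idle(cwnd: int, idle_time: float, rto: float, iw: int) -> int:
    """Return the congestion window to use when sending resumes."""
    if idle_time > rto:
        # Idle for more than one RTO: fall back to the restart window (RW = IW).
        return iw
    return cwnd


if __name__ == "__main__":
    print(cwnd_after_idle(cwnd=64_000, idle_time=5.0, rto=1.0, iw=4_380))
    print(cwnd_after_idle(cwnd=64_000, idle_time=0.2, rto=1.0, iw=4_380))
```

Shrinking the window after a long pause prevents the sender from releasing a full window as one line-rate burst.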

Page 30: TCP enhancements

Summary

Studied TCP options for LFN, many variants of TCP such as SACK TCP and New Reno, and the effect of increasing IW
ECN (Explicit Congestion Notification) with RED (another TCP enhancement)