Jiachen Chen WINLAB, Dept. of Computer Science Rutgers University Transport Layer ECE544: Communication Networks-II, Spring 2018 Includes teaching materials from L. Peterson, Sumathi Gopal and Sumit Rangwala, D. Raychaudhuri, Mike Freedman, Tam Vu

Transport Layer - WINLAB · 2018. 3. 30. · OSI Protocol Stack: Key Abstractions 2 Problem: Network Layer (IP) provides only best-effort communication services Best-effort local



OSI Protocol Stack: Key Abstractions

Problem: the network layer (IP) provides only best-effort communication services

Layers and example protocols, top to bottom:

Application: file share, virtual terminal, …
Presentation: HTML, XML, JSON, CSS, …
Session: RPC, SCP, NFS, …
Transport: TCP, UDP, …
Network: IP
Data link: 802.2, MPLS, …
Physical: ISDN, USB, DSL, DOCSIS, …

Key abstractions, bottom to top: best-effort local packet delivery (link), best-effort global packet delivery (network), reliable streams (transport), messages (applications)

Application Requirements vs. Network Layer Limitations

Guarantee message delivery. Limitation: the network may drop messages.

Deliver messages in the same order they are sent. Limitation: the network may reorder messages and delay them for a long time.

Deliver at most one copy of each message. Limitation: the network may duplicate messages.

Support arbitrarily large messages. Limitation: the network may limit message size.

Support synchronization between sender and receiver

Allow the receiver to apply flow control to the sender

Support multiple application processes on each host. Limitation: the network only supports communication between hosts.

Many more

IP Protocol Stack: Key Abstractions

Transport layer:

Provides applications with good abstractions

Works without support or feedback from the network

Is the lowest layer in the network stack that is an end-to-end protocol

Layers and their abstractions, top to bottom:

Application: messages
Transport: reliable streams
Network: best-effort global packet delivery
Link: best-effort local packet delivery

Transport Protocols

Logical communication between processes

Sender divides a message into segments

Receiver reassembles segments into the message

Transport services

(De)multiplexing packets

Detecting corrupted data

Optionally: reliable delivery, flow control, …

Two Basic Transport Features

Demultiplexing: port numbers

[Figure: a client host sends a service request for 128.2.194.242:80 (i.e., the Web server). On server host 128.2.194.242, the OS delivers it to the Web server (port 80) rather than to the Echo server (port 7).]

Error detection: checksums

[Figure: a checksum over the IP payload is used to detect corruption.]

Most Popular Transport Protocols

User Datagram Protocol (UDP)

Supports multiple application processes on each host

Optionally checks messages for correctness with a checksum

Transmission Control Protocol (TCP)

Ensures reliable delivery of packets between source and destination processes

Ensures in-order delivery of packets to the destination process

Other services

Datagram Congestion Control Protocol (DCCP)

Message-oriented protocol

Reliable connection setup and teardown, ECN, congestion control, feature negotiation

User Datagram Protocol (UDP)

Service: support for multiple processes on each host to communicate

Issue: IP only provides communication between hosts (IP addresses)

Solution: add port numbers and associate each process with a port number

4-tuple unique connection identifier: [SrcPort, SrcIPAddr, DestPort, DestIPAddr]

Lightweight communication between processes

Send and receive messages

Avoid overhead of ordered, reliable delivery

No connection setup delay, no in-kernel connection state

Used by popular apps

Query/response for DNS

Real-time data in VoIP

UDP header layout (bit positions 0 to 31):

0                  16                 31
+------------------+------------------+
|     SrcPort      |     DestPort     |
+------------------+------------------+
|      Length      |     Checksum     |
+------------------+------------------+
|               Payload               |
+-------------------------------------+
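The header layout above can be sketched with Python's `struct` module. This is a minimal illustration, not a full UDP implementation: the function names and the port numbers are made up for the example, and the checksum field is simply left at 0.

```python
import struct

# Four 16-bit fields in network byte order: SrcPort, DestPort, Length, Checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with an 8-byte UDP header."""
    length = UDP_HEADER.size + len(payload)  # Length covers header plus payload
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram):
    """Split a datagram back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    payload = datagram[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


# Hypothetical ports, chosen only for illustration.
datagram = build_udp_datagram(5353, 53, b"query")
print(parse_udp_datagram(datagram))
```

The receiving host uses the destination port from the parsed header to pick the process that should get the payload.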

User Datagram Protocol (UDP): Error Detection

Service: ensure message correctness

Issue: packet corruption in transit

Solution: use a checksum

Covers the UDP header, the payload, and a pseudo header

Pseudo header: protocol number, source IP address, destination IP address, and UDP length
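As a sketch of how such a checksum could be computed, the code below uses the 16-bit ones' complement Internet checksum over the pseudo header followed by the UDP header and payload. It assumes IPv4 addresses and protocol number 17 for UDP; the function names are illustrative.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum over pseudo header + UDP header + payload (IPv4 assumed)."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, len(udp_segment))  # zero, protocol, UDP length
    )
    return internet_checksum(pseudo_header + udp_segment)
```

A receiver can run the same computation over a segment whose checksum field is filled in: an uncorrupted segment yields 0.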

Advantages of UDP

Fine-grain control

UDP sends as soon as the application writes

No connection set-up delay

UDP sends without establishing a connection

No connection state

No buffers, parameters, sequence #s, etc.

Small header overhead

UDP header is only eight bytes long

Transmitting a stream of bytes?

Stream-of-bytes service

Sends and receives a stream of bytes

Reliable, in-order delivery

Corruption: checksums

Detect loss/reordering: sequence numbers

Reliable delivery: acknowledgments and retransmissions

Connection oriented

Explicit set-up and tear-down of TCP connection

Flow control

Prevent overflow of the receiver’s buffer space

Congestion control

Adapt to network congestion for the greater good

Transmission Control Protocol (TCP)

First proposed by Vinton Cerf and Robert Kahn, 1974

TCP/IP enabled computers of all sizes, from different vendors and running different OSs, to communicate with each other

Carries about 80% of all traffic on the Internet

Reliable, in-order delivery, connection-oriented, byte-stream service

Starting and Ending a Connection: TCP Handshakes

Establishing a TCP Connection

Three-way handshake to establish connection

Host A sends a SYN (open) to host B

Host B returns a SYN acknowledgment (SYN ACK)

Host A sends an ACK to acknowledge the SYN ACK

Each host tells its Initial Sequence Number (ISN) to the other host.

TCP Header

0                  16                 31
+------------------+------------------+
|   Source port    | Destination port |
+------------------+------------------+
|           Sequence number           |
+-------------------------------------+
|           Acknowledgment            |
+-------+---+------+------------------+
| HdrLen| 0 |Flags | Advertised window|
+-------+---+------+------------------+
|     Checksum     |  Urgent pointer  |
+------------------+------------------+
|         Options (variable)          |
+-------------------------------------+
|                Data                 |
+-------------------------------------+

Flags: SYN, FIN, RST, PSH, URG, ACK
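The fixed 20-byte part of the header can be packed and inspected with `struct`. This is a sketch under stated assumptions: no options are included, the checksum is left at 0, and the function names and port values are illustrative. The flag bit values are the standard TCP ones.

```python
import struct

# Bit values of the six flags inside the 16-bit HdrLen/reserved/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

# Ports (2 x 16 bits), sequence and acknowledgment numbers (2 x 32 bits),
# HdrLen/flags word, advertised window, checksum, urgent pointer (4 x 16 bits).
TCP_HEADER = struct.Struct("!HHIIHHHH")


def build_tcp_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urgent=0):
    """Pack a 20-byte TCP header without options."""
    header_words = TCP_HEADER.size // 4  # HdrLen is counted in 32-bit words
    flag_bits = sum(FLAG_BITS[name] for name in set(flags))
    return TCP_HEADER.pack(
        src_port, dst_port, seq, ack, (header_words << 12) | flag_bits, window, checksum, urgent
    )


def parse_header_length_and_flags(header):
    """Return (header length in bytes, set of flag names that are set)."""
    (word,) = struct.unpack_from("!H", header, 12)
    header_bytes = (word >> 12) * 4
    flags = {name for name, bit in FLAG_BITS.items() if word & bit}
    return header_bytes, flags


syn = build_tcp_header(40000, 80, seq=1000, ack=0, flags=["SYN"], window=65535)
print(len(syn), parse_header_length_and_flags(syn))
```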

Step 1: A’s Initial SYN Packet

Source port: A’s port; destination port: B’s port

Sequence number: A’s Initial Sequence Number

Acknowledgment: not used yet (ACK flag not set)

Header length: 20; reserved bits: 0; advertised window

Checksum, urgent pointer, options (variable)

Flags: SYN set (out of SYN, FIN, RST, PSH, URG, ACK)

A tells B it wants to open a connection…

Step 2: B’s SYN-ACK Packet

Source port: B’s port; destination port: A’s port

Sequence number: B’s Initial Sequence Number

Acknowledgment: A’s ISN plus 1

Header length: 20; reserved bits: 0; advertised window

Checksum, urgent pointer, options (variable)

Flags: SYN and ACK set (out of SYN, FIN, RST, PSH, URG, ACK)

B tells A it accepts, and is ready to hear the next byte…

… upon receiving this packet, A can start sending data

Step 3: A’s ACK of the SYN-ACK

Source port: A’s port; destination port: B’s port

Sequence number

Acknowledgment: B’s ISN plus 1

Header length: 20; reserved bits: 0; advertised window

Checksum, urgent pointer, options (variable)

Flags: ACK set (out of SYN, FIN, RST, PSH, URG, ACK)

A tells B it is okay to start sending

… upon receiving this packet, B can start sending data
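The sequence and acknowledgment numbers exchanged in the three steps can be traced with a few lines of code. This is only a sketch: the tuple layout and the ISN values are made up for illustration, and it relies on the fact that a SYN consumes one sequence number, so A's ACK carries A's ISN plus 1 as its own sequence number.

```python
MODULUS = 2**32  # sequence numbers are 32-bit values and wrap around


def three_way_handshake(isn_a, isn_b):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("A", {"SYN"}, isn_a, None)  # step 1: no acknowledgment yet
    syn_ack = ("B", {"SYN", "ACK"}, isn_b, (isn_a + 1) % MODULUS)  # step 2
    ack = ("A", {"ACK"}, (isn_a + 1) % MODULUS, (isn_b + 1) % MODULUS)  # step 3
    return [syn, syn_ack, ack]


# Hypothetical ISNs, chosen only to make the arithmetic easy to follow.
for segment in three_way_handshake(isn_a=1000, isn_b=5000):
    print(segment)
```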

SYN Loss and Web Downloads

Upon sending SYN, sender sets a timer

If SYN lost, timer expires before SYN-ACK received

Sender retransmits SYN

How should the TCP sender set the timer?

No idea how far away the receiver is

Some TCPs use default of 3 or 6 seconds

Implications for web download

User gets impatient and hits reload

… User aborts connection, initiates new socket

Essentially, forces a fast send of a new SYN!

Tearing Down the Connection

Closing (each end of) the connection

Finish (FIN) to close and receive remaining bytes

The other host sends a FIN ACK to acknowledge

Reset (RST) to close and not receive remaining bytes

[Figure: timeline of the segments exchanged between hosts A and B]

Sending/Receiving the FIN Packet

Sending a FIN: close()

Process is done sending data via socket

Process invokes “close()”

Once TCP has sent all the outstanding bytes…

… then TCP sends a FIN

Receiving a FIN: EOF

Process is reading data from socket

Eventually, read call returns an EOF

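The application's view of this exchange can be observed on a loopback connection: after the sender calls `close()`, the reader eventually gets an empty result from `recv()`, which is how Python's socket API reports EOF. The sketch below assumes a loopback interface is available; the function name and the message are illustrative.

```python
import socket


def send_then_close(message: bytes) -> bytes:
    """Send a message over a loopback TCP connection, close, and read until EOF."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    receiver.settimeout(5)

    sender.sendall(message)
    sender.close()  # TCP sends outstanding bytes, then a FIN

    chunks = []
    while True:
        chunk = receiver.recv(4096)
        if not chunk:  # empty bytes object: the peer's FIN has arrived (EOF)
            break
        chunks.append(chunk)

    receiver.close()
    listener.close()
    return b"".join(chunks)


print(send_then_close(b"last bytes before FIN"))
```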

Data transmission

TCP: Byte-stream

Service: byte-stream

Application reads or writes a stream of bytes to the transport

Issue: IP is packet-oriented

Solution: TCP maintains a local buffer

Chop the stream into packets and transmit (sender)

Coalesce data from packets to form a stream (receiver)

TCP “Stream of Bytes” Service

[Figure: the stream of bytes as seen at Host A and at Host B]

…Emulated Using TCP “Segments”

[Figure: Host A sends the stream to Host B as TCP data segments]

Segment sent when:

1. Segment full (Max Segment Size),

2. Not full, but times out, or

3. “Pushed” by application

TCP Segment

IP packet

No bigger than Maximum Transmission Unit (MTU)

E.g., up to 1500 bytes on an Ethernet link

TCP packet

IP packet with a TCP header and data inside

TCP header is typically 20 bytes long

TCP segment

No more than Maximum Segment Size (MSS) bytes

E.g., up to 1460 consecutive bytes from the stream

[Figure: an IP packet is an IP header plus IP data; the IP data holds the TCP header plus the TCP data (segment)]
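The 1460-byte example follows from subtracting the two headers from the MTU. The sketch below works that out and chops a byte stream into segments; it assumes 20-byte IP and TCP headers without options, and the function names and starting sequence number are illustrative.

```python
def max_segment_size(mtu, ip_header=20, tcp_header=20):
    """Largest TCP payload that fits in one IP packet of the given MTU."""
    return mtu - ip_header - tcp_header


def segment_stream(data, mss, first_seq=0):
    """Yield (sequence number, chunk) pairs of at most mss bytes each."""
    for offset in range(0, len(data), mss):
        # A segment's sequence number is the number of its first byte.
        yield (first_seq + offset) % 2**32, data[offset:offset + mss]


mss = max_segment_size(1500)  # Ethernet MTU from the example above
for seq, chunk in segment_stream(bytes(3000), mss, first_seq=501):
    print(seq, len(chunk))
```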

Sequence Number

[Figure: Host A sends TCP data segments to Host B; the stream starts at the ISN (initial sequence number)]

Sequence number = number of the 1st byte in the segment

Initial Sequence Number (ISN)

Sequence number for the very first byte

Why not just use a de facto ISN of 0?

Practical issue: reuse of port numbers

Port numbers must (eventually) get used again

… and an old packet may still be in flight

… and associated with the new connection

So, TCP must change the ISN over time

Set from a 32-bit clock that ticks every 4 microseconds

… which wraps around about every 4.77 hours (2^32 ticks × 4 µs ≈ 17,180 s; often quoted as 4.55 hours)
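The wrap-around period follows directly from the counter width and the tick length, so it is easy to check. The sketch below also shows how an ISN could be read off such a clock; the function name and the use of integer microseconds are illustrative choices, not a description of any particular TCP implementation.

```python
TICK_MICROSECONDS = 4      # clock period from the text above
COUNTER_VALUES = 2**32     # a 32-bit counter has this many distinct values


def isn_from_clock(elapsed_microseconds):
    """ISN read from a 32-bit counter that advances once per tick."""
    return (elapsed_microseconds // TICK_MICROSECONDS) % COUNTER_VALUES


wrap_seconds = COUNTER_VALUES * TICK_MICROSECONDS / 1_000_000
wrap_hours = wrap_seconds / 3600
print(f"{wrap_seconds:.0f} s = {wrap_hours:.2f} hours")
```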

Reliable Delivery on a Lossy Channel With Bit Errors

Challenges of Reliable Data Transfer

Over a perfectly reliable channel: Done

Over a channel with bit errors

Receiver detects errors and requests retransmission

Over a lossy channel with bit errors

Some data missing, others corrupted

Receiver cannot easily detect loss

Over a channel that may reorder packets

Receiver cannot easily distinguish loss vs. out-of-order

An Analogy

Alice and Bob are talking

What if Alice couldn’t understand Bob?

Alice asks Bob to repeat what he said

What if Bob hasn’t heard Alice for a while?

Is Alice just being quiet? Has she lost reception?

How long should Bob just keep on talking?

Maybe Alice should periodically say “uh huh”

… or Bob should ask “Can you hear me now?”

Take-Aways from the Example

Acknowledgments from receiver

Positive: “okay” or “uh huh” or “ACK”

Negative: “please repeat that” or “NACK”

Retransmission by the sender

After not receiving an “ACK”

After receiving a “NACK”

Timeout by the sender (“stop and wait”)

Don’t wait forever without some acknowledgment

TCP Support for Reliable Delivery

Detect bit errors: checksum

Used to detect corrupted data at the receiver

…leading the receiver to drop the packet

Detect missing data: sequence number

Used to detect a gap in the stream of bytes

... and for putting the data back in order

Recover from lost data: retransmission

Sender retransmits lost or corrupted data

Two main ways to detect lost packets

TCP Acknowledgments

[Figure: Host A sends TCP data segments to Host B, starting at the ISN (initial sequence number); Host B returns ACKs]

Sequence number = number of the 1st byte in the segment

ACK sequence number = next expected byte

Automatic Repeat reQuest (ARQ)

ACK and timeouts

Receiver sends ACK when it receives packet

Sender waits for ACK and times out

Simplest ARQ protocol: stop and wait

Send a packet, stop and wait until ACK arrives

[Figure: timeline between sender and receiver, with the sender's timeout interval marked]
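Stop and wait can be simulated deterministically by naming the transmission attempts that fail. In this sketch a "lost attempt" stands for either a lost packet or a lost ACK, since the sender cannot tell them apart; the function name and log strings are illustrative.

```python
def stop_and_wait(num_packets, lost_attempts):
    """Simulate stop-and-wait ARQ.

    lost_attempts is a set of 0-based attempt indices (counted across all
    transmissions) for which the packet or its ACK is lost.
    """
    log = []
    attempt = 0
    for seq in range(num_packets):
        while True:  # keep sending this packet until it is acknowledged
            delivered = attempt not in lost_attempts
            attempt += 1
            if delivered:
                log.append((seq, "ACK received"))
                break
            log.append((seq, "timeout, retransmit"))
    return log


# The second transmission overall (attempt 1) is lost, so packet 1 is sent twice.
for entry in stop_and_wait(num_packets=3, lost_attempts={1}):
    print(entry)
```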

Quick TCP Math

Initial Seq No = 501. Sender sends 4500 bytes, all successfully acknowledged. Next sequence number to send is:

(A) 4501 (B) 5000 (C) 5001 (D) 5002

Next 1000-byte TCP segment received. Receiver acknowledges with ACK number:

(A) 5001 (B) 6000 (C) 6001
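Both questions reduce to adding a byte count to a sequence number. The check below assumes the first data byte carries sequence number 501, so the 4500 bytes occupy numbers 501 through 5000; the function name is illustrative.

```python
def next_sequence_number(first_seq, bytes_delivered):
    """Sequence number of the byte that follows bytes_delivered bytes."""
    return (first_seq + bytes_delivered) % 2**32


after_first_burst = next_sequence_number(501, 4500)
ack_after_segment = next_sequence_number(after_first_burst, 1000)
print(after_first_burst, ack_after_segment)
```

Under that assumption the next sequence number to send is 5001, and the ACK for the following 1000-byte segment carries 6001, the next expected byte.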

Flow Control: TCP Sliding Window

Sliding Window: Motivation

Stop-and-wait is inefficient

Only one TCP segment is “in flight” at a time

Consider: 1.5 Mbps link with 50 ms round-trip-time (RTT)

Assume segment size of 1 KB (8 Kbits)

8 Kbits/segment at 50 msec/segment = 160 Kbps

That’s 11% of the capacity of the 1.5 Mbps link
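The numbers in this example can be reproduced directly. The sketch below also computes how many segments would have to be in flight to keep the link busy, which is the bandwidth-delay product divided by the segment size; that last step is an added illustration, and it ignores transmission time just as the example above does.

```python
def stop_and_wait_throughput_bps(segment_bits, rtt_ms):
    """One segment per round trip."""
    return segment_bits * 1000 / rtt_ms


def utilization(throughput_bps, link_bps):
    return throughput_bps / link_bps


def segments_to_fill_pipe(link_bps, rtt_ms, segment_bits):
    """Bandwidth-delay product expressed in segments."""
    return link_bps * rtt_ms / 1000 / segment_bits


throughput = stop_and_wait_throughput_bps(segment_bits=8000, rtt_ms=50)
print(throughput)                                    # bits per second
print(round(100 * utilization(throughput, 1_500_000)), "%")
print(segments_to_fill_pipe(1_500_000, 50, 8000), "segments in flight")
```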

Sliding Window

Allow a larger amount of data “in flight”

Allow sender to get ahead of the receiver

… though not too far ahead

[Figure: the sending process writes into the sender's TCP buffer (pointers: last byte ACKed, last byte sent, last byte written); the receiving process reads from the receiver's TCP buffer (pointers: last byte read, next byte expected, last byte received)]

Receiver Buffering

Receive window size

Amount that can be sent without acknowledgment

Receiver must be able to store this amount of data

Receiver tells the sender the window

Tells the sender the amount of free space left

[Figure: the byte stream at the sender, divided into data ACK’d, outstanding un-ACK’d data, data OK to send, and data not OK to send yet; the window size covers the outstanding un-ACK’d data plus the data OK to send]

TCP: Flow Control

Flow control: “Prevent sender from overrunning the capacity (buffer) of the receiver”

Solution: use adaptive receiver window size

Every packet carries ACK and AdvertisedWindow

Sender-side pointers: (J) LastByteAcked, (K) LastByteSent, (I) LastByteWritten

Receiver-side pointers: (A) LastByteRead, (B) NextByteExpected, (C) LastByteRcvd

Goal is to keep LastByteRcvd (C) - LastByteRead (A) <= MaxRcvBuffer

AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)

LastByteSent (K) - LastByteAcked (J) <= AdvertisedWindow

EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)

LastByteWritten - LastByteAcked <= MaxSendBuffer
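The window formulas translate almost literally into code. In this sketch the function names and the sample numbers are illustrative, and clamping the effective window at zero is an added assumption so that the sender never computes a negative amount to send.

```python
def advertised_window(max_rcv_buffer, next_byte_expected, last_byte_read):
    """Free space the receiver reports back to the sender."""
    return max_rcv_buffer - ((next_byte_expected - 1) - last_byte_read)


def effective_window(advertised, last_byte_sent, last_byte_acked):
    """How many more bytes the sender may put in flight right now."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised - in_flight)  # clamp at zero (assumption)


# Receiver: 4096-byte buffer, bytes up to 1000 arrived in order, 500 already read.
window = advertised_window(max_rcv_buffer=4096, next_byte_expected=1001, last_byte_read=500)
# Sender: bytes up to 3000 sent, bytes up to 1000 acknowledged.
print(window, effective_window(window, last_byte_sent=3000, last_byte_acked=1000))
```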

Optimizing Retransmissions

Reasons for Retransmission

[Figure: sender/receiver timelines with the timeout interval marked, covering three cases: ACK lost (the retransmission produces a duplicate packet), packet lost, and early timeout (the retransmissions produce duplicate packets)]

How Long Should Sender Wait?

Sender sets a timeout to wait for an ACK

Too short: wasted retransmissions

Too long: excessive delays when packet lost

TCP sets timeout as a function of the RTT

Expect ACK to arrive after a “round-trip time”

… plus a fudge factor to account for queuing

But, how does the sender know the RTT?

Running average of delay to receive an ACK

TCP Timeout

Issue: RTT in a wide area network varies substantially

Solution: adaptive timeout

Original algorithm:

EstimatedRTT = α × EstimatedRTT + (1 - α) × SampleRTT, where 0 ≤ α ≤ 1

Timeout = β × EstimatedRTT (β = 2)

Problems

Does not distinguish whether the ACK is for the original transmission or a retransmission

A constant β is not good: it assumes constant variance
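The original estimator is an exponentially weighted moving average. In the sketch below the default value of α is an assumption (the text above only gives β = 2), and the sample RTTs are invented to show how the estimate trails a sudden change.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, alpha=0.875):
    """Blend the old estimate with a new sample; alpha weights the history."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt


def timeout(estimated_rtt, beta=2.0):
    return beta * estimated_rtt


estimate = 100.0  # milliseconds, illustrative starting point
for sample in [100.0, 100.0, 300.0, 300.0, 300.0]:  # RTT jumps from 100 to 300 ms
    estimate = update_estimated_rtt(estimate, sample)
    print(f"sample={sample:.0f} estimate={estimate:.1f} timeout={timeout(estimate):.1f}")
```

A larger α makes the estimate smoother but slower to follow real changes in the RTT.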

TCP Timeout

Karn/Partridge Algorithm

Whenever TCP retransmits a segment, it stops taking samples of the RTT

Only measure SampleRTT for segments that have been sent only once

Each time TCP retransmits, set the next timeout to be twice the last timeout

Relieves congestion

Jacobson/Karels Algorithm: adaptive variance (uses mean deviation)

Difference = SampleRTT - EstimatedRTT

EstimatedRTT = EstimatedRTT + (δ × Difference) (same as in the original, with δ = 1 - α)

Deviation = Deviation + δ × (|Difference| - Deviation)

Timeout = μ × EstimatedRTT + φ × Deviation

(default: set μ = 1 and φ = 4), where 0 ≤ δ ≤ 1
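The two algorithms combine naturally into one small estimator. This is a sketch, not a description of any real TCP stack: the class and method names are invented, the default δ = 0.125 is an assumption (only μ = 1 and φ = 4 come from the text above), and resetting the backoff on a fresh sample is a design choice made for the example.

```python
class RttEstimator:
    """Jacobson/Karels timeout with Karn/Partridge sampling and backoff."""

    def __init__(self, first_sample, delta=0.125, mu=1.0, phi=4.0):
        self.estimated_rtt = first_sample
        self.deviation = 0.0
        self.delta, self.mu, self.phi = delta, mu, phi
        self.backoff = 1  # doubled on every retransmission

    def on_sample(self, sample_rtt, retransmitted=False):
        if retransmitted:
            return  # Karn/Partridge: ignore samples from retransmitted segments
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.delta * difference
        self.deviation += self.delta * (abs(difference) - self.deviation)
        self.backoff = 1  # a clean sample ends the backoff (assumption)

    def on_retransmit(self):
        self.backoff *= 2  # next timeout is twice the last one

    def timeout(self):
        base = self.mu * self.estimated_rtt + self.phi * self.deviation
        return base * self.backoff


estimator = RttEstimator(first_sample=100.0)
estimator.on_sample(180.0)
print(estimator.estimated_rtt, estimator.deviation, estimator.timeout())
```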

TCP Deadlock

The receiver advertises a window size of 0, so the sender stops sending data

The window size update from the receiver is lost

To solve it: the sender starts the persist timer when AdvertisedWindow = 0

When the persist timer expires, the sender sends a small packet to probe the receiver's window

Triggering Transmission

When to transmit a segment:

Small segments are subject to large overhead

Reach max segment size (MSS): the size of the largest segment TCP can send without causing the local IP to fragment

MSS = local MTU - (IP header + TCP header)

The sending process explicitly asks TCP to transmit (“push”)

Congestion

When the network cannot support the sender’s rate

Queues at the network elements overflow

[Figure: Source1, Source2, and Source3 send through a shared network toward Dest1 and Dest2]

Even with flow control, packets might not reach the destination

Congestion Control vs. Flow Control

Congestion control

Mechanism to prevent sender from overrunning the capacity of the network

When the network is the bottleneck

Flow control

Mechanism to prevent sender from overrunning the capacity of the receiver

When the receiver is the bottleneck

Congestion Control: Design Approach

Maintain another window at the sender called CongestionWindow (cwnd)

CongestionWindow is the max number of packets allowed in the network

That is, it bounds the number of unACKed packets at the sender

Key: how to calculate the congestion window (cwnd)

Various approaches possible

TCP estimates it based on observed packet losses

Assumes packet loss as indication of congestion

Since we don’t know whether the network or the receiver is the bottleneck:

MaxWindow = MIN(CongestionWindow, AdvertisedWindow)

EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)

TCP Congestion Control

TCP sends packets into the network without reservation

Tries to use as much of the network resources (bandwidth, buffers) as it can

As congestion occurs, scales back

Strategy:

Conservatively increase the packet sending rate (cwnd) if there is no congestion

Quickly reduce the sending rate (cwnd) when congestion is detected (packet loss)

Congestion Avoidance: (AIMD)

If no congestion in the network (increase conservatively)

Increase the congestion window additively every RTT

Every RTT: w = w + 1 (w = cwnd in segments)

Every ACK reception: w = w + 1/w (w = cwnd in segments)

Every ACK reception: cwnd = cwnd + MSS*(MSS/cwnd) (cwnd in bytes)

If congestion in the network (decrease aggressively)

Decrease the congestion window multiplicatively, immediately

cwnd = cwnd/2 (cwnd in bytes)

How is congestion detected?

Estimated (more later)
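The per-ACK rule adds up to roughly one MSS per round trip, because about cwnd/MSS ACKs arrive in each RTT. The sketch below applies the byte-based update rules from above; the function names and the window sizes are illustrative.

```python
def on_ack(cwnd, mss):
    """Additive increase: each ACK adds a fraction of one MSS."""
    return cwnd + mss * mss / cwnd


def on_loss(cwnd):
    """Multiplicative decrease: halve the window immediately."""
    return cwnd / 2


mss = 1000
cwnd = 10 * mss
for _ in range(10):  # one RTT: roughly one ACK per segment in the window
    cwnd = on_ack(cwnd, mss)
print(f"after one RTT of ACKs: {cwnd:.0f} bytes")
print(f"after a loss: {on_loss(cwnd):.0f} bytes")
```

The growth per RTT comes out slightly under one MSS because the window is already growing while the ACKs arrive.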

Congestion Avoidance: (AIMD)

TCP’s sawtooth pattern

[Figure: congestion window size vs. time, showing the sawtooth pattern and the startup time]

Issues with additive increase

Takes too long to ramp up a connection from the beginning

The entire advertised window may be reopened when a lost packet is retransmitted and a single cumulative ACK is received by the sender

TCP “Slow Start”: To start quickly!

Maintain another variable, the slow start threshold (ssthresh)

Last known stable rate

If (cwnd > ssthresh) State = congestion avoidance

Else State = slow start

In slow start

Increase the congestion window exponentially every RTT

Every ACK reception: w = w + 1 (w = cwnd in segments)

Every ACK reception: cwnd = cwnd + MSS (cwnd in bytes)

Key: how is ssthresh calculated?
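Adding one MSS per ACK doubles the window every round trip, since the number of ACKs per RTT is itself proportional to the window. The sketch below follows the state rule above (congestion avoidance once cwnd exceeds ssthresh); starting from one MSS and counting one ACK per full segment are simplifying assumptions, and the names are illustrative.

```python
def on_ack(cwnd, ssthresh, mss):
    """Per-ACK window update for the two states."""
    if cwnd > ssthresh:
        return cwnd + mss * mss / cwnd  # congestion avoidance
    return cwnd + mss  # slow start


def windows_per_rtt(rtts, ssthresh, mss):
    """Window size at the end of each round trip, starting from one MSS."""
    cwnd = mss
    history = []
    for _ in range(rtts):
        acks_this_rtt = int(cwnd // mss)  # one ACK per full segment in flight
        for _ in range(acks_this_rtt):
            cwnd = on_ack(cwnd, ssthresh, mss)
        history.append(cwnd)
    return history


print(windows_per_rtt(rtts=6, ssthresh=8000, mss=1000))
```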

TCP: Congestion Detection and Retransmit

Loss of packet indicates congestion

Timer timeouts (no ACK)

Set according to Jacobson/Karels algorithm

On timer timeout: ssthresh = max(2*MSS, effwin/2); cwnd = MSS

Notice this will cause TCP to go into slow start

Issue: takes a long time to detect a packet loss

Affects throughput

Any other quicker way of detecting a packet loss?

Fast Retransmit

Observation: a series of duplicate ACKs might mean a packet loss

Solution

Every time the receiver receives an out-of-order packet, it sends a duplicate ACK

The sender retransmits the missing packet after it receives some number of duplicate ACKs (e.g., 3 duplicate ACKs)

Fast retransmit does not replace timeouts

Issue: reduces latency (early retransmit) but still incurs a loss in throughput (slow start after packet loss)

[Figure: the sender transmits PKT 1, PKT 2, PKT 3 (lost), PKT 4, PKT 5, PKT 6; the receiver returns ACK 1, ACK 2, ACK 2, ACK 2, ACK 2; the sender retransmits PKT 3 and then receives ACK 6]
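The duplicate-ACK rule amounts to a small counter at the sender. The sketch below uses the packet-numbered convention of the example above, where ACK n means packets up to n have arrived, so the packet to retransmit is n + 1; the function name and the threshold default are illustrative.

```python
def fast_retransmit_triggers(acks, threshold=3):
    """Return the packet numbers retransmitted in response to duplicate ACKs."""
    last_ack = None
    duplicates = 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:  # fire once, on the third duplicate
                retransmitted.append(ack + 1)
        else:
            last_ack = ack
            duplicates = 0
    return retransmitted


# ACK stream from the example: one ACK 2 plus three duplicates, then ACK 6.
print(fast_retransmit_triggers([1, 2, 2, 2, 2, 6]))
```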

Fast Recovery

Transmit a packet for every ACK received until the retransmitted packet is ACK’d

ssthresh = max(2*MSS, cwnd/2); cwnd = ssthresh + 3*MSS

On every ACK until the ACK of the retransmitted packet: cwnd = cwnd + MSS

On reception of the ACK of the retransmitted packet

Start congestion avoidance instead of slow start

cwnd = ssthresh
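The three phases of fast recovery can be written as three small window updates. This sketch keeps every quantity in bytes, which is an assumption about the intended units, and the function names and starting window are illustrative.

```python
def on_triple_duplicate_ack(cwnd, mss):
    """Enter fast recovery: halve the threshold, inflate the window by 3 segments."""
    ssthresh = max(2 * mss, cwnd // 2)
    return ssthresh, ssthresh + 3 * mss


def on_further_duplicate_ack(cwnd, mss):
    """Each extra duplicate ACK means one more packet has left the network."""
    return cwnd + mss


def on_recovery_ack(ssthresh):
    """The retransmitted packet is ACKed: deflate and continue in congestion avoidance."""
    return ssthresh


mss = 1000
ssthresh, cwnd = on_triple_duplicate_ack(cwnd=16 * mss, mss=mss)
cwnd = on_further_duplicate_ack(cwnd, mss)
print(ssthresh, cwnd, on_recovery_ack(ssthresh))
```

Compared with a timeout, the window drops to ssthresh rather than to one MSS, so the connection skips slow start.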

Homework

5.13 (3rd ed and 4th ed)

5.16

5.28

5.34

5.39

Due 4/5