Transport Layer. Dr. Nawaporn Wisitpongphan. Credit: Prof. Nick McKeown http://www.stanford.edu/~nickm. Outline. The Transport Layer The UDP Protocol The TCP Protocol TCP Characteristics TCP Connection setup TCP Segments TCP Sequence Numbers TCP Sliding Window - PowerPoint PPT Presentation
TRANSPORT LAYER
Dr. Nawaporn Wisitpongphan
Credit: Prof. Nick McKeown, http://www.stanford.edu/~nickm
OUTLINE
The Transport Layer
The UDP Protocol
The TCP Protocol
  TCP Characteristics
  TCP Connection setup
  TCP Segments
  TCP Sequence Numbers
  TCP Sliding Window
  Timeouts and Retransmission
  Congestion Control and Avoidance
REVIEW OF THE TRANSPORT LAYER
[Figure: Nick (Leland.Stanford.edu) and Dave (Athena.MIT.edu) communicate through the Application, Transport, Network, and Link layers of each host's O.S.; the data (D) travels with a header (H) at each layer.]
LAYERING: THE OSI MODEL
[Figure: the seven OSI layers (1 Physical, 2 Link, 3 Network, 4 Transport, 5 Session, 6 Presentation, 7 Application) at two end hosts, connected through two routers that implement only the Physical, Link, and Network layers. Labels: peer-layer communication, layer-to-layer communication.]
USER DATAGRAM PROTOCOL (UDP) CHARACTERISTICS
UDP is a connectionless datagram service.
  There is no connection establishment: packets may show up at any time.
UDP is unreliable:
  No acknowledgements to indicate delivery of data.
  The checksum covers the header and the data, but over IPv4 the sender may switch it off, so corruption can go undetected.
  No mechanism to detect missing or mis-sequenced packets.
  No mechanism for automatic retransmission.
  No mechanism for flow control, so the sender can over-run the receiver.
USER-DATAGRAM PROTOCOL (UDP)
[Figure: applications A1, A2 and B1, B2 sit above UDP, which sits above IP inside the OS.]
UDP uses port numbers to demultiplex packets.
Port     Description
123      Network Time Protocol (NTP)
67, 68   Dynamic Host Configuration Protocol (DHCP)
500      Internet Security Association and Key Management Protocol (ISAKMP)
520      Routing Information Protocol (RIP)
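USER-DATAGRAM PROTOCOL (UDP) PACKET FORMAT
[Figure: UDP packet layout: SRC port, DST port, length, checksum (16 bits each), followed by DATA.]
Why do we have UDP? It is used by applications that don’t need reliable delivery, or by applications that have their own special needs, such as streaming of real-time audio/video.
Note on the checksum: it is computed over a pseudo-header of IP fields, the UDP header, and the data. Over IPv4 the sender may leave it out by sending zero.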
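The packet format can be made concrete with a short sketch. The function names (`internet_checksum`, `build_udp`, `udp_checksum_ok`) and the addresses and ports in the usage lines are illustrative assumptions, not part of the slides; the checksum is the standard 16-bit one's-complement sum over the pseudo-header, the UDP header, and the data.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to whole 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pseudo_header(src_ip: bytes, dst_ip: bytes, udp_length: int) -> bytes:
    # 17 is the IP protocol number for UDP.
    return src_ip + dst_ip + struct.pack("!BBH", 0, 17, udp_length)

def build_udp(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    length = 8 + len(payload)                # 8-byte header plus data
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    csum = csum or 0xFFFF                    # zero on the wire means "no checksum"
    return struct.pack("!HHHH", src_port, dst_port, length, csum) + payload

def udp_checksum_ok(src_ip, dst_ip, datagram: bytes) -> bool:
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(datagram)) + datagram) == 0

# Illustrative values only.
SRC, DST = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
datagram = build_udp(SRC, DST, 5000, 123, b"hello")
src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
```

The receiver demultiplexes on `dst_port` and, when the checksum field is non-zero, accepts the datagram only if `udp_checksum_ok` holds.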
TCP CHARACTERISTICS
TCP is connection-oriented.
  3-way handshake used for connection setup.
TCP provides a stream-of-bytes service.
TCP is reliable:
  Acknowledgements indicate delivery of data.
  Checksums are used to detect corrupted data.
  Sequence numbers detect missing or mis-sequenced data.
  Corrupted data is retransmitted after a timeout.
  Mis-sequenced data is re-sequenced.
  (Window-based) flow control prevents over-run of the receiver.
TCP uses congestion control to share network capacity among users.
HTTP AND TCP
Port     Description
80       HTTP
23       Telnet
20/21    FTP (data/control)
25       Simple Mail Transfer Protocol (SMTP)
TCP IS CONNECTION-ORIENTED
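Connection Setup: 3-way handshake
  (Active) Client -> (Passive) Server: Syn
  (Passive) Server -> (Active) Client: Syn + Ack
  (Active) Client -> (Passive) Server: Ack
Connection Close/Teardown: 2 x 2-way handshake
  (Active) Client -> (Passive) Server: Fin
  (Passive) Server -> (Active) Client: (Data +) Ack
  (Passive) Server -> (Active) Client: Fin
  (Active) Client -> (Passive) Server: Ack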
THE TCP DIAGRAM
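Which path does the Active Client or Passive Server follow?
[Figure: TCP state-transition diagram, with the TCP CLIENT and TCP SERVER paths through the Syn, Syn + Ack, Ack exchange between the (Active) Client and the (Passive) Server.]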
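As a rough illustration of the two paths through the diagram, the sketch below walks a client and a server through connection setup only. The state names follow the TCP specification; the table layout, event names, and the `handshake` helper are assumptions made for this example, and teardown, timeouts, and simultaneous opens are left out.

```python
# (state, event) -> (next state, message to send), handshake transitions only.
CLIENT = {
    ("CLOSED", "active_open"): ("SYN_SENT", "Syn"),
    ("SYN_SENT", "Syn+Ack"): ("ESTABLISHED", "Ack"),
}
SERVER = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("LISTEN", "Syn"): ("SYN_RCVD", "Syn+Ack"),
    ("SYN_RCVD", "Ack"): ("ESTABLISHED", None),
}

def handshake():
    """Run the 3-way handshake and return final states plus the message trace."""
    trace = []
    client, server = "CLOSED", "CLOSED"
    server, _ = SERVER[(server, "passive_open")]      # server waits in LISTEN
    client, msg = CLIENT[(client, "active_open")]     # client sends Syn
    trace.append(("client", msg))
    server, msg = SERVER[(server, msg)]               # server answers Syn + Ack
    trace.append(("server", msg))
    client, msg = CLIENT[(client, msg)]               # client sends Ack
    trace.append(("client", msg))
    server, _ = SERVER[(server, msg)]                 # server is now established
    return client, server, trace

client_state, server_state, messages = handshake()
```

The active side passes through SYN_SENT, while the passive side passes through LISTEN and SYN_RCVD; both end in ESTABLISHED after three messages.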
TCP SUPPORTS A “STREAM OF BYTES” SERVICE
[Figure: Host A writes Byte 0, Byte 1, Byte 2, Byte 3, ..., Byte 80 into the connection, and the same bytes in the same order come out at Host B.]
TCP accepts data as a continuous stream from the application. There are no record markers automatically inserted by TCP.
Example: If the application on one end writes 10 bytes, followed by a write of 20 bytes, followed by a write of 50 bytes, the application at the other end of the connection cannot tell what size the individual writes were. The other end may, for instance, read the 80 bytes in four reads of 20 bytes each.
One end puts a stream of bytes into TCP, and the identical stream of bytes appears at the other end.
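The 10/20/50 example can be mimicked with an in-memory stand-in for the connection. `ByteStream` is a hypothetical class written for this illustration, not a real socket; it only models the property that write boundaries are not preserved.

```python
class ByteStream:
    """In-memory stand-in for a TCP connection: bytes in, same bytes out."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data              # no record marker is stored

    def read(self, max_bytes: int) -> bytes:
        chunk = bytes(self._buffer[:max_bytes])
        del self._buffer[:max_bytes]
        return chunk

payload = bytes(range(80))
stream = ByteStream()
stream.write(payload[0:10])               # write of 10 bytes
stream.write(payload[10:30])              # write of 20 bytes
stream.write(payload[30:80])              # write of 50 bytes

reads = [stream.read(20) for _ in range(4)]   # four reads of 20 bytes
read_sizes = [len(chunk) for chunk in reads]
received = b"".join(reads)
```

The reader sees four 20-byte chunks whose concatenation equals the original 80 bytes, with no trace of the 10/20/50 split. An application that needs message boundaries has to add its own framing.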
…WHICH IS EMULATED USING TCP “SEGMENTS”
[Figure: the byte stream from Host A (Byte 0 ... Byte 80) is carried to Host B inside TCP segments, each holding a block of TCP Data.]
Segment sent when:
1. Segment full (MSS bytes),
2. Not full, but times out, or
3. “Pushed” by application.
THE TCP SEGMENT FORMAT
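[Figure: an IP packet (IP Hdr + IP Data) carries the TCP segment (TCP Hdr + TCP Data) as its data.]
TCP header layout (32-bit rows, bits 0 to 31):
  Src port (16 bits) | Dst port (16 bits)
  Sequence # (32 bits)
  Ack Sequence # (32 bits)
  HLEN (4 bits) | RSVD (6 bits) | Flags: URG, ACK, PSH, RST, SYN, FIN | Window Size (16 bits)
  Checksum (16 bits) | Urgent Pointer (16 bits)
  (TCP Options)
  TCP Data
Checksum covers: TCP Header and Data + IP Addresses.
Src/dst port numbers and IP addresses uniquely identify a socket.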
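To make the field layout tangible, here is a minimal parsing sketch for the fixed 20-byte part of the header. `TcpHeader`, `parse_tcp_header`, and the sample values are assumptions for illustration; options and checksum verification are not handled.

```python
import struct
from typing import NamedTuple

# Flag bit positions within the low 6 bits of the 16-bit HLEN/RSVD/Flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_bytes: int
    flags: frozenset
    window: int
    checksum: int
    urgent_pointer: int

def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Parse the fixed 20-byte TCP header at the start of segment."""
    (src, dst, seq, ack, hlen_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_bytes = (hlen_flags >> 12) * 4     # HLEN counts 32-bit words
    flags = frozenset(name for name, bit in FLAG_BITS.items() if hlen_flags & bit)
    return TcpHeader(src, dst, seq, ack, header_bytes, flags,
                     window, checksum, urgent)

# Hand-built Syn + Ack header: HLEN = 5 words, flags SYN|ACK = 0x12.
sample = struct.pack("!HHIIHHHH", 80, 51000, 1000, 501, (5 << 12) | 0x12, 8192, 0, 0)
header = parse_tcp_header(sample)
```

Here `header.flags` holds SYN and ACK, and `header.header_bytes` is 20, meaning no options follow.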
SEQUENCE NUMBERS
[Figure: Host A and Host B exchange segments (TCP HDR + TCP Data); byte positions in the stream are counted from the ISN (initial sequence number).]
Sequence number = 1st byte of the segment's data.
Ack sequence number = next expected byte.
How does ISN get chosen?
INITIAL SEQUENCE NUMBERS
Connection Setup: 3-way handshake
  (Active) Client -> (Passive) Server: Syn + ISN_A
  (Passive) Server -> (Active) Client: Syn + Ack + ISN_B
  (Active) Client -> (Passive) Server: Ack
Sequence number = 32 bits.
What if a message has more than 2^32 bytes? The sequence number wraps around.
Solution: the Timestamp option. The sender places a timestamp in every segment, and the receiver copies that timestamp into the ACK it sends for the segment.
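A small sketch of wrap-around arithmetic shows why 32 bits can run out. The helper names are assumptions, and the half-range comparison in `seq_lt` is a common convention for modular sequence numbers rather than something stated on the slides.

```python
SEQ_SPACE = 2 ** 32

def seq_add(seq: int, nbytes: int) -> int:
    """Advance a sequence number, wrapping at 2^32."""
    return (seq + nbytes) % SEQ_SPACE

def seq_lt(a: int, b: int) -> bool:
    """True if a precedes b, assuming they are less than 2^31 apart."""
    return 0 < (b - a) % SEQ_SPACE < 2 ** 31

def wrap_time_seconds(rate_bits_per_second: float) -> float:
    """Time to send 2^32 bytes, i.e. to cycle through the whole space."""
    return SEQ_SPACE * 8 / rate_bits_per_second

near_end = SEQ_SPACE - 10
wrapped = seq_add(near_end, 20)          # crosses the wrap point
gigabit_wrap = wrap_time_seconds(1e9)    # a bit over half a minute at 1 Gb/s
```

At 1 Gb/s the space wraps in roughly 34 seconds, so an old segment delayed in the network could carry a sequence number that looks current. The timestamp carried in each segment lets the receiver tell the two apart.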
TCP SLIDING WINDOW
How much data can a TCP sender have outstanding in the network?
How much data should TCP retransmit when an error occurs? Just selectively repeat the missing data?
How does the TCP sender avoid over-running the receiver’s buffers?
TCP SLIDING WINDOW
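[Figure: the sender's byte stream divided into four regions: Data ACK'd; Outstanding un-ack'd data; Data OK to send; Data not OK to send yet. The Window Size spans the outstanding data plus the data OK to send.]
Window is meaningful to the sender.
Current window size is “advertised” by receiver (usually 4 KB to 8 KB at connection setup).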
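The four regions of the window can be tracked with two counters. This is a minimal sketch under simplifying assumptions: `SlidingWindowSender` is a hypothetical class, byte positions start at 0 rather than at an ISN, and retransmission is not modelled.

```python
class SlidingWindowSender:
    """Tracks how much data the sender may have outstanding."""

    def __init__(self, window_size: int):
        self.window_size = window_size   # last window advertised by the receiver
        self.last_acked = 0              # bytes below this are ACK'd
        self.next_to_send = 0            # [last_acked, next_to_send) is outstanding

    def outstanding(self) -> int:
        return self.next_to_send - self.last_acked

    def ok_to_send(self) -> int:
        return max(0, self.window_size - self.outstanding())

    def send(self, nbytes: int) -> int:
        """Send as much as the window allows; return the bytes actually sent."""
        allowed = min(nbytes, self.ok_to_send())
        self.next_to_send += allowed
        return allowed

    def on_ack(self, ack_number: int, advertised_window: int) -> None:
        if self.last_acked < ack_number <= self.next_to_send:
            self.last_acked = ack_number          # window slides forward
        self.window_size = advertised_window

sender = SlidingWindowSender(window_size=4096)
first_burst = sender.send(6000)          # only a window's worth goes out
blocked = sender.ok_to_send()            # nothing more until an ACK arrives
sender.on_ack(ack_number=1024, advertised_window=4096)
reopened = sender.ok_to_send()
```

Because the receiver never advertises more than it can buffer, the sender cannot over-run it.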
TCP SLIDING WINDOW
[Figure: Host A sends a window of data to Host B and waits for the ACK. Case (1): RTT > Window size. Case (2): RTT = Window size. Question posed: what should the Window Size be?]
TCP: RETRANSMISSION AND TIMEOUTS
[Figure: Host A sends Data1 and Data2 to Host B and receives ACKs after a round-trip time (RTT). The Retransmission TimeOut (RTO) is the estimated RTT plus a guard band.]
TCP uses an adaptive retransmission timeout value, because the RTT changes frequently:
  Congestion
  Changes in routing
TCP: RETRANSMISSION AND TIMEOUTS
Picking the RTO is important:
  Pick a value that’s too big, and TCP will wait too long to retransmit a packet.
  Pick a value that’s too small, and TCP will retransmit packets unnecessarily.
The original algorithm for picking RTO:
1. $\text{EstimatedRTT}_k = \alpha \, \text{EstimatedRTT}_{k-1} + (1 - \alpha) \, \text{SampleRTT}$
2. $\text{RTO} = 2 \times \text{EstimatedRTT}$
Constants determined empirically.
Characteristics of the original algorithm:
  Variance is assumed to be fixed.
  But in practice, variance increases as congestion increases.
TCP: RETRANSMISSION AND TIMEOUTS
There will be some (unknown) distribution of RTTs. We are trying to estimate an RTO to minimize the probability of a false timeout.
[Figure: probability versus RTT, with the mean and variance of the distribution marked.]
[Figure: average queueing delay versus load (amount of traffic arriving at the router); variance grows rapidly with load.]
Router queues grow when there is more traffic, until they become unstable. As load grows, variance of delay grows rapidly.
TCP: RETRANSMISSION AND TIMEOUTS
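Newer algorithm includes an estimate of the variance in RTT:
  $\text{Difference} = \text{SampleRTT} - \text{EstimatedRTT}_{k-1}$
  $\text{EstimatedRTT}_k = \text{EstimatedRTT}_{k-1} + \delta \times \text{Difference}$ (same as before)
  $\text{Deviation} = \text{Deviation} + \delta \times (|\text{Difference}| - \text{Deviation})$
  $\text{RTO} = \mu \times \text{EstimatedRTT} + \phi \times \text{Deviation}$, with $\mu = 1$ and $\phi = 4$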
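Both timeout rules can be compared on the same samples. In this sketch the smoothing constants (`alpha=0.875`, `delta=0.125`) are assumed values chosen for illustration, since the slides do not give them; the multipliers 1 and 4 are the ones shown above, and the function and class names are hypothetical.

```python
def original_rto(estimated_rtt: float, sample_rtt: float, alpha: float = 0.875):
    """Original rule: smooth the RTT, then RTO = 2 * EstimatedRTT."""
    estimated_rtt = alpha * estimated_rtt + (1 - alpha) * sample_rtt
    return estimated_rtt, 2 * estimated_rtt

class RttEstimator:
    """Newer rule: track the mean deviation as well as the mean."""

    def __init__(self, first_sample: float, delta: float = 0.125,
                 mu: float = 1.0, phi: float = 4.0):
        self.estimated_rtt = first_sample
        self.deviation = 0.0
        self.delta, self.mu, self.phi = delta, mu, phi

    def update(self, sample_rtt: float) -> float:
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.delta * difference
        self.deviation += self.delta * (abs(difference) - self.deviation)
        return self.rto()

    def rto(self) -> float:
        return self.mu * self.estimated_rtt + self.phi * self.deviation

# RTTs in milliseconds: steady at 100, then one sample of 180.
estimator = RttEstimator(first_sample=100.0)
steady_rto = estimator.update(100.0)     # no variation, so RTO tracks the mean
jittery_rto = estimator.update(180.0)    # deviation term widens the guard band
old_estimate, old_rto = original_rto(100.0, 180.0)
```

With steady samples the newer rule keeps the RTO close to the RTT, where the original rule would sit at twice the RTT regardless; once samples start to vary, the deviation term adds headroom.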
TCP: RETRANSMISSION AND TIMEOUTS
KARN’S ALGORITHM
[Figure: two Host A / Host B timelines in which a retransmission leads to a wrong RTT sample.]
Problem: How can we estimate RTT when packets are retransmitted?
Solution: On retransmission, don’t update estimated RTT (and double RTO).
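The rule is short enough to sketch directly. `KarnTimer` is a hypothetical helper; it only records which samples would be passed on to an RTT estimator and how the RTO backs off.

```python
class KarnTimer:
    """Applies Karn's rule to RTT sampling and timeout backoff."""

    def __init__(self, rto: float):
        self.rto = rto
        self.accepted_samples = []

    def on_timeout(self) -> None:
        self.rto *= 2                    # back off: double the RTO

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return                       # ambiguous: which transmission was ACKed?
        self.accepted_samples.append(sample_rtt)

timer = KarnTimer(rto=1.0)
timer.on_timeout()                                   # packet lost, retransmit
timer.on_ack(sample_rtt=0.3, was_retransmitted=True) # ignored
timer.on_ack(sample_rtt=0.4, was_retransmitted=False)
```

Only the 0.4 sample reaches the estimator, because the ACK for a retransmitted packet could belong to either copy.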
CONGESTION CONTROL: MAIN POINTS
Congestion is inevitable.
Congestion happens at different scales, from two individual packets colliding to too many users.
TCP senders can detect congestion and reduce their sending rate by reducing the window size.
TCP modifies the rate according to “Additive Increase, Multiplicative Decrease (AIMD)”.
To probe and find the initial rate, TCP uses a restart mechanism called “slow start”.
Routers slow down TCP senders by buffering packets and thus increasing delay.
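CONGESTION
[Figure: hosts H1 and H2 send through router R1 to H3. H1's link carries A1(t) at 10 Mb/s, H2's link carries A2(t) at 100 Mb/s, and R1's outgoing link carries D(t) at 1.5 Mb/s, with X(t) marked at the router. A plot shows cumulative bytes versus t for A1(t), A2(t), D(t), and X(t).]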
TIME SCALES OF CONGESTION
Time scale             Cause
7:00, 8:00, 9:00       Too many users using a link during a peak hour
1s, 2s, 3s             TCP flows filling up all available bandwidth
100µs, 200µs, 300µs    Two packets colliding at a router
DEALING WITH CONGESTION
EXAMPLE: TWO FLOWS ARRIVING AT A ROUTER
[Figure: flows A1(t) and A2(t) arrive at router R1; what should it do?]
Strategy:
  Drop one of the flows.
  Buffer one flow until the other has departed, then send it.
  Re-schedule one of the two flows for a later time.
  Ask both flows to reduce their rates.
CONGESTION IS UNAVOIDABLE
ARGUABLY IT’S GOOD!
We use packet switching because it makes efficient use of the links. Therefore, buffers in the routers are frequently occupied.
If buffers are always empty, delay is low, but our usage of the network is low.
If buffers are always occupied, delay is high, but we are using the network more efficiently.
So how much congestion is too much?
LOAD, DELAY AND POWER
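Typical behavior of queueing systems with random arrivals:
[Figure: average packet delay versus load. Delay rises toward an asymptote as load grows; burstiness tends to move the asymptote to the left.]
A simple metric of how well the network is performing:
  $\text{Power} = \frac{\text{Load}}{\text{Delay}}$
[Figure: power versus load, peaking at the “optimal load”.]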
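To see where the power curve peaks, one needs a concrete delay curve. The sketch below assumes an M/M/1 queue, where delay = 1 / (service rate - load); that model is an assumption chosen for illustration and is not claimed by the slides, which only describe the general shape.

```python
def mm1_delay(load: float, service_rate: float = 1.0) -> float:
    """Average delay in an M/M/1 queue (assumed model), for load < service_rate."""
    if not 0 <= load < service_rate:
        raise ValueError("load must be in [0, service_rate)")
    return 1.0 / (service_rate - load)

def power(load: float, service_rate: float = 1.0) -> float:
    """Power = Load / Delay."""
    return load / mm1_delay(load, service_rate)

loads = [i / 100 for i in range(100)]    # 0.00, 0.01, ..., 0.99
optimal_load = max(loads, key=power)
peak_power = power(optimal_load)
```

Under this assumed model the power peaks at half the service rate: pushing the load higher buys throughput at a disproportionate cost in delay.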
OPTIONS FOR CONGESTION CONTROL
1. Implemented by host versus network
2. Reservation-based versus feedback-based
3. Window-based versus rate-based
TCP CONGESTION CONTROL
TCP implements host-based, feedback-based, window-based congestion control.
TCP sources attempt to determine how much capacity is available.
TCP sends packets, then reacts to observable events (loss).
TCP CONGESTION CONTROL
TCP sources change the sending rate by modifying the window size:
  Window = min{Advertised window, Congestion window}
  The receiver sets the advertised window; the transmitter maintains the congestion window (“cwnd”).
In other words, send at the rate of the slowest component: network or receiver.
“cwnd” follows additive increase/multiplicative decrease:
  On receipt of Acks for a full window: cwnd += 1
  On packet loss (timeout): cwnd *= 0.5
ADDITIVE INCREASE/ MULTIPLICATIVE DECREASE
[Figure: Src and Dest timelines exchanging data (D) and acknowledgement (A) packets.]
Additive Increase: Every time the source successfully sends a cwnd’s worth of packets (each pkt sent out during the last RTT has been ACKed), add the equivalent of 1 pkt to the cwnd.
  Increment = MSS × (MSS / CWND), for CWND ≥ MSS
  CWND += Increment
LEADS TO THE TCP “SAWTOOTH”
[Figure: window versus t. The window climbs linearly and is halved at each timeout, producing a sawtooth.]
Could take a long time to get started!
Multiplicative Decrease: For each timeout, the source sets CWND to half of its previous value.
When CWND is large, all the dropped packets will be retransmitted, so congestion gets worse. TCP needs to get out of this state quickly.
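The sawtooth falls out of the two rules when they are applied in a loop. In this sketch the increment is applied once per ACK, so a full window of ACKs adds roughly one MSS; the MSS value, the starting window, and the fixed loss schedule are all assumptions made for illustration.

```python
MSS = 1000  # bytes; illustrative value

def on_ack(cwnd: float) -> float:
    """Additive increase, applied per ACK: about one MSS per RTT in total."""
    return cwnd + MSS * (MSS / cwnd)

def on_timeout(cwnd: float) -> float:
    """Multiplicative decrease: halve the window, but keep at least one MSS."""
    return max(MSS, cwnd / 2)

def simulate(rtts: int, loss_every: int, cwnd: float = 4 * MSS) -> list:
    """Return cwnd at the end of each RTT, with a timeout every loss_every RTTs."""
    trace = []
    for rtt in range(1, rtts + 1):
        for _ in range(int(cwnd // MSS)):    # one ACK per packet in the window
            cwnd = on_ack(cwnd)
        if rtt % loss_every == 0:
            cwnd = on_timeout(cwnd)
        trace.append(cwnd)
    return trace

sawtooth = simulate(rtts=10, loss_every=5)
```

Plotting `sawtooth` against the RTT index gives the slow linear climb and sudden halving described above.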
“SLOW START”
Designed to find the fair-share rate quickly at startup. How does it work?
1. Increase cwnd by one packet for each ACK received, which doubles cwnd every RTT (exponential growth), until it reaches SSThreshold.
2. If cwnd < SSThreshold {Do Slow Start}, else {Do Congestion Avoidance}.
3. Initial SSThreshold = large value. After a packet is lost, SSThreshold = cwnd/2.
4. Congestion Avoidance: increase cwnd linearly.
[Figure: Src and Dest exchange data (D) and acknowledgements (A); the number of packets sent per round trip grows 1, 2, 4, 8.]
SLOW START
Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole advertised window’s worth of data.
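The switch from exponential to linear growth can be traced per RTT. This sketch counts cwnd in packets; the function names are hypothetical, and restarting from one packet after a loss is an assumption, since the slides only say that SSThreshold becomes cwnd/2.

```python
def next_cwnd(cwnd: int, ssthresh: int) -> int:
    """Window after one more RTT, in packets."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)   # slow start: +1 per ACK doubles per RTT
    return cwnd + 1                      # congestion avoidance: linear growth

def on_loss(cwnd: int):
    """Return (new cwnd, new ssthresh) after a lost packet."""
    return 1, max(cwnd // 2, 2)          # restart from 1 packet (assumption)

def growth(rtts: int, ssthresh: int, cwnd: int = 1) -> list:
    """cwnd at the start and after each of rtts round trips."""
    trace = [cwnd]
    for _ in range(rtts):
        cwnd = next_cwnd(cwnd, ssthresh)
        trace.append(cwnd)
    return trace

trace = growth(rtts=6, ssthresh=16)      # 1, 2, 4, 8, 16, then 17, 18
restart_cwnd, restart_ssthresh = on_loss(trace[-1])
```

Doubling reaches the threshold in four round trips, where purely additive growth from one packet would need fifteen.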
FAST RETRANSMIT AND FAST RECOVERY?
Homework!!
TCP SENDING RATE
What is the sending rate of TCP?
  The acknowledgement for a sent packet is received after one RTT.
  The amount of data sent until the ACK is received is the current window size W.
  Therefore the sending rate is R = W/RTT.
Is the TCP sending rate sawtooth-shaped as well?
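A quick calculation with R = W/RTT shows how strongly the window limits throughput. The window, RTT, and link rate below are illustrative values, and the helper names are assumptions.

```python
def sending_rate_bps(window_bytes: float, rtt_seconds: float) -> float:
    """R = W / RTT, expressed in bits per second."""
    return window_bytes * 8 / rtt_seconds

def window_for_rate_bytes(rate_bps: float, rtt_seconds: float) -> float:
    """Window needed to sustain rate_bps: W = R * RTT."""
    return rate_bps * rtt_seconds / 8

# An 8 KB window over a 100 ms round trip.
rate = sending_rate_bps(window_bytes=8192, rtt_seconds=0.1)
# Window needed to fill a 1.5 Mb/s link at the same RTT.
needed_window = window_for_rate_bytes(rate_bps=1.5e6, rtt_seconds=0.1)
```

An 8 KB window over a 100 ms path yields about 655 kb/s, less than half of a 1.5 Mb/s link, which would need a window of about 18,750 bytes.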
TCP AND BUFFERS
For TCP with a single flow over a network link with enough buffers, RTT and W are proportional to each other.
  Therefore the sending rate R = W/RTT is constant (and not a sawtooth).
But experiments and theory suggest that with many flows:
  $R \propto \frac{1}{RTT \sqrt{p}}$
  where p is the drop probability.
TCP rate can be controlled in two ways:
1. Buffering packets and increasing the RTT
2. Dropping packets to decrease TCP’s window size
CONGESTION CONTROL IN THE INTERNET
Maximum window sizes of most TCP implementations by default are very small:
  Windows XP: 12 packets
  Linux/Mac: 40 packets
Often the buffer of a link is larger than the maximum window size of TCP:
  A typical DSL line has 200 packets worth of buffer.
  For a TCP session, the maximum number of packets outstanding is 40.
  The buffer can never fill up.
  The router will never drop a packet.
CONGESTION AVOIDANCE
TCP reacts to congestion after it takes place. The data rate changes rapidly and the system is barely stable (or is even unstable).
Can we predict when congestion is about to happen and avoid it? E.g. by detecting the knee of the curve.
[Figure: average packet delay versus load.]
CONGESTION AVOIDANCE SCHEMES
Router-based Congestion Avoidance:
  DECbit:
    Routers explicitly notify sources about congestion.
  Random Early Detection (RED):
    Routers implicitly notify sources by dropping packets.
    RED drops packets at random, and as a function of the level of congestion.
Host-based Congestion Avoidance:
  Source monitors changes in RTT to detect onset of congestion.
DECBIT
Each packet has a “Congestion Notification” bit called the DECbit in its header.
If any router on the path is congested, it sets the DECbit.
  Set if average queue length >= 1 packet, averaged since the start of the previous busy cycle.
To notify the source, the destination copies DECbit into ACK packets.
Source adjusts rate to avoid congestion.
  Counts fraction of DECbits set in each window.
  If <50% set, increase rate additively.
  If >=50% set, decrease rate multiplicatively.
[Figure: queue length at router versus time, with the averaging period marked.]
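The source's decision rule is a one-line threshold test. In this sketch the step sizes (add 1 packet, multiply by 0.875) are assumed values used for illustration, since the slides only say "additively" and "multiplicatively"; `decbit_adjust` is a hypothetical name.

```python
def decbit_adjust(window: float, decbits: list,
                  increase: float = 1.0, decrease: float = 0.875) -> float:
    """Return the new window given the DECbits echoed in the last window of ACKs."""
    if not decbits:
        return window                        # no feedback yet, leave unchanged
    fraction_set = sum(decbits) / len(decbits)
    if fraction_set < 0.5:
        return window + increase             # additive increase
    return window * decrease                 # multiplicative decrease

lightly_loaded = decbit_adjust(8, [0, 0, 1, 0])   # 25% of bits set
congested = decbit_adjust(8, [1, 1, 0, 0])        # 50% of bits set
```

Unlike loss-based TCP, the source here reacts before any packet is dropped, because the routers signal congestion explicitly.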
RANDOM EARLY DETECTION (RED)
RED is based on DECbit, and was designed to work well with TCP.
RED implicitly notifies the sender by dropping packets.
Drop probability is increased as the average queue length increases.
A (geometric) moving average of the queue length is used so as to detect long-term congestion, yet allow short-term bursts to arrive:
  $\text{AvgLen}_{n+1} = (1 - \alpha) \times \text{AvgLen}_n + \alpha \times \text{Length}_n$
  i.e. $\text{AvgLen}_{n+1} = \sum_{i=1}^{n} \text{Length}_i \, \alpha \, (1 - \alpha)^{n-i}$
RED DROP PROBABILITIES
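[Figure: arrivals A(t) enter the router queue and departures D(t) leave it. Plot of drop probability versus AvgLen: the probability reaches maxP at maxTh and then jumps to 1; minTh and maxTh are marked on the AvgLen axis.]
If $\text{minTh} < \text{AvgLen} < \text{maxTh}$:
  $\hat{p} = \text{maxP} \times \frac{\text{AvgLen} - \text{minTh}}{\text{maxTh} - \text{minTh}}$
  $\Pr(\text{Drop Packet}) = \frac{\hat{p}}{1 - \text{count} \times \hat{p}}$
count counts how long we've been in $\text{minTh} < \text{AvgLen} < \text{maxTh}$ since we last dropped a packet, i.e. drops are spaced out in time, reducing the likelihood of re-entering slow-start.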
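The averaging and drop rules combine into a small per-packet routine. This is a sketch under stated assumptions: the `Red` class, its parameter names, and the injectable `rng` are hypothetical, the thresholds are illustrative, and a real router would also handle idle periods and measure the queue in bytes or packets consistently.

```python
import random

class Red:
    """Random Early Detection decision for each arriving packet."""

    def __init__(self, min_th: float, max_th: float, max_p: float,
                 weight: float, rng=random.random):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight             # averaging weight
        self.rng = rng                   # injectable for repeatable tests
        self.avg_len = 0.0
        self.count = 0                   # packets since the last drop, in range

    def drop_probability(self) -> float:
        if self.avg_len <= self.min_th:
            return 0.0
        if self.avg_len >= self.max_th:
            return 1.0
        p_hat = self.max_p * (self.avg_len - self.min_th) / (self.max_th - self.min_th)
        denominator = 1 - self.count * p_hat
        return 1.0 if denominator <= 0 else min(1.0, p_hat / denominator)

    def on_arrival(self, queue_length: float) -> bool:
        """Update the average and return True if the packet should be dropped."""
        self.avg_len = (1 - self.weight) * self.avg_len + self.weight * queue_length
        if self.rng() < self.drop_probability():
            self.count = 0
            return True
        in_range = self.min_th < self.avg_len < self.max_th
        self.count = self.count + 1 if in_range else 0
        return False

red = Red(min_th=5, max_th=15, max_p=0.1, weight=1.0, rng=lambda: 0.99)
dropped = red.on_arrival(queue_length=10)    # avg 10, p = 0.05, rng says keep
p_after_one = red.drop_probability()         # count is now 1, so p rises
```

Because `count` grows with every packet that is let through, the drop probability climbs until a drop happens, which spaces drops out instead of clustering them.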
PROPERTIES OF RED
Drops packets before the queue is full, in the hope of reducing the rates of some flows.
Drops packets for each flow roughly in proportion to its rate.
Drops are spaced out in time.
Because it uses average queue length, RED is tolerant of bursts.
Random drops hopefully desynchronize TCP sources.
SYNCHRONIZATION OF SOURCES
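[Figure: flows of sources A, B, C, and D plotted against time (axis marked RTT and N RTT).]
SYNCHRONIZATION OF SOURCES
[Figure: aggregate flow f(RTT) of sources A, B, C, and D, with the average (Avg) marked.]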
DESYNCHRONIZED SOURCES
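[Figure: flows of sources A, B, C, and D plotted against time (axis marked RTT and N RTT).]
DESYNCHRONIZED SOURCES
[Figure: aggregate flow of sources A, B, C, and D over N RTT, with the average (Avg) marked.]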