Transport Layer
ECE544: Communication Networks-II, Spring 2013
Tam Vu, WINLAB, Dept. of Computer Science, Rutgers University
Includes teaching materials from L. Peterson, Sumathi Gopal and Sumit Rangwala, D. Raychaudhuri, Mike Freedman
IP Protocol Stack: Key Abstractions
Problem: the network layer (IP) provides only best-effort communication services.
Layers and the abstraction each provides:
  Application: applications
  Transport: reliable streams, messages
  Network: best-effort global packet delivery
  Link: best-effort local packet delivery
Application requirements vs. IP layer limitations
Guarantee message delivery
  Network may drop messages.
Deliver messages in the same order they are sent
  Messages may be reordered in the network and may incur long delays.
Deliver at most one copy of each message
  Messages may be duplicated in the network.
Support arbitrarily large messages
  Network may limit message size.
Support synchronization between sender and receiver
Allow the receiver to apply flow control to the sender
Support multiple application processes on each host
  Network only supports communication between hosts.
Many more
IP Protocol Stack: Key Abstractions
Transport layer
  Provides applications with good abstractions
  Works without support or feedback from the network
  Is the lowest layer in the network stack that is an end-to-end protocol
Transport Protocols
Logical communication between processes
  Sender divides a message into segments
  Receiver reassembles segments into the message
Transport services
  (De)multiplexing packets
  Detecting corrupted data
  Optionally: reliable delivery, flow control, …
Two Basic Transport Features
Demultiplexing: port numbers
Error detection: checksums
[Figure: a client sends a service request for 128.2.194.242:80 (i.e., the Web server) to server host 128.2.194.242, whose OS hands it to the Web server (port 80) rather than the Echo server (port 7); the checksum over the IP payload is used to detect corruption.]
Most Popular Transport Protocols
User Datagram Protocol (UDP)
  Supports multiple application processes on each host
  Optionally checks messages for correctness with a checksum
Transmission Control Protocol (TCP)
  Ensures reliable delivery of packets between source and destination processes
  Ensures in-order delivery of packets to the destination process
  Other services
Real-time Transport Protocol (RTP)
  Serves real-time multimedia applications
  Moves decision making to the applications
  Runs over UDP
User Datagram Protocol (UDP)
Service: support for multiple processes on each host to communicate
  Issue: IP only provides communication between hosts (IP addresses)
  Solution: add a port number and associate a process with a port number
  4-tuple unique connection identifier: [SrcPort, SrcIPAddr, DestPort, DestIPAddr]
Lightweight communication between processes
  Send and receive messages
  Avoid the overhead of ordered, reliable delivery
  No connection setup delay, no in-kernel connection state
Used by popular apps
  Query/response for DNS
  Real-time data in VoIP
UDP header layout (32 bits per row):
  SrcPort (bits 0-15) | DestPort (bits 16-31)
  Length (bits 0-15) | Checksum (bits 16-31)
  Payload
User Datagram Protocol (UDP): Error Detection
Service: ensure message correctness
  Issue: packet corruption in transit
  Solution: use a checksum computed over the UDP header, the payload, and a pseudo header
  Pseudo header: protocol number, source IP address, destination IP address, and UDP length
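The checksum over pseudo header, UDP header, and payload can be sketched as follows. This is a minimal illustration that assumes IPv4 addresses and the standard 16-bit one's-complement Internet checksum; the function names `internet_checksum` and `udp_checksum` are made up for this example and do not come from the slides.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def udp_checksum(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 payload: bytes) -> int:
    """Checksum over pseudo header + UDP header + payload (IPv4 assumed)."""
    length = 8 + len(payload)  # the UDP header is 8 bytes
    # Pseudo header: source IP, destination IP, zero byte, protocol 17, UDP length
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))
    # The checksum field is zero while the checksum is being computed
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo + header + payload)
    return csum or 0xFFFF  # 0 means "no checksum" in UDP, so 0xFFFF is sent instead
```

A receiver can run the same sum over the pseudo header, the header with the received checksum in place, and the payload; the result is 0 when nothing was corrupted.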
Transmitting a stream of bytes?
Stream-of-bytes service
  Sends and receives a stream of bytes
Reliable, in-order delivery
  Corruption: checksums
  Detect loss/reordering: sequence numbers
  Reliable delivery: acknowledgments and retransmissions
Connection oriented
  Explicit set-up and tear-down of TCP connection
Flow control
  Prevent overflow of the receiver’s buffer space
Congestion control
  Adapt to network congestion for the greater good
Transmission Control Protocol (TCP)
First proposed by Vinton Cerf and Robert Kahn, 1974
TCP/IP enabled computers of all sizes, from different vendors and running different OSs, to communicate with each other.
Carries 80% of all traffic on the Internet
Reliable, in-order delivery, connection-oriented, byte-stream service
Starting and Ending a Connection: TCP Handshakes
Establishing a TCP Connection
Three-way handshake to establish connection
  Host A sends a SYN (open) to host B
  Host B returns a SYN acknowledgment (SYN ACK)
  Host A sends an ACK to acknowledge the SYN ACK
Each host tells its Initial Sequence Number (ISN) to the other host.
[Figure: time diagram between A and B showing SYN, SYN ACK, ACK, followed by data in both directions.]
TCP Header
  Source port | Destination port
  Sequence number
  Acknowledgment
  HdrLen | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
  Data
Flags: SYN, FIN, RST, PSH, URG, ACK
Step 1: A’s Initial SYN Packet
  A’s port | B’s port
  A’s Initial Sequence Number
  Acknowledgment
  HdrLen = 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: SYN set (of SYN, FIN, RST, PSH, URG, ACK)
A tells B it wants to open a connection…
Step 2: B’s SYN-ACK Packet
  B’s port | A’s port
  B’s Initial Sequence Number
  Acknowledgment = A’s ISN plus 1
  HdrLen = 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: SYN and ACK set (of SYN, FIN, RST, PSH, URG, ACK)
B tells A it accepts, and is ready to hear the next byte…
… upon receiving this packet, A can start sending data
Step 3: A’s ACK of the SYN-ACK
  A’s port | B’s port
  Sequence number
  Acknowledgment = B’s ISN plus 1
  HdrLen = 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: ACK set (of SYN, FIN, RST, PSH, URG, ACK)
A tells B it is okay to start sending
… upon receiving this packet, B can start sending data
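The three steps above can be traced with a small model of the sequence and acknowledgment fields. This is only a sketch: the `Segment` class and `three_way_handshake` function are hypothetical names, and the third segment's sequence number (A's ISN plus 1) follows standard TCP behavior rather than anything stated on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    flags: frozenset
    seq: int
    ack: Optional[int] = None  # None when the ACK flag is not set


def three_way_handshake(isn_a: int, isn_b: int) -> list:
    """Return the SYN, SYN-ACK, and ACK segments exchanged by hosts A and B."""
    mod = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = Segment(frozenset({"SYN"}), seq=isn_a)
    # B acknowledges A's ISN plus 1 and announces its own ISN
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=isn_b,
                      ack=(syn.seq + 1) % mod)
    # A acknowledges B's ISN plus 1 (assumption: A's seq is now its ISN plus 1)
    ack = Segment(frozenset({"ACK"}), seq=syn_ack.ack,
                  ack=(syn_ack.seq + 1) % mod)
    return [syn, syn_ack, ack]
```

For example, `three_way_handshake(1000, 5000)` yields a SYN-ACK that acknowledges 1001 and a final ACK that acknowledges 5001.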
SYN Loss and Web Downloads
Upon sending SYN, sender sets a timer
  If SYN lost, timer expires before SYN-ACK received
  Sender retransmits SYN
How should the TCP sender set the timer?
  No idea how far away the receiver is
  Some TCPs use default of 3 or 6 seconds
Implications for web download
  User gets impatient and hits reload …
  User aborts connection, initiates new socket
  Essentially, forces a fast send of a new SYN!
Tearing Down the Connection
Closing (each end of) the connection
  Finish (FIN) to close and receive remaining bytes
  And other host sends a FIN ACK to acknowledge
  Reset (RST) to close and not receive remaining bytes
[Figure: time diagram between A and B showing SYN, SYN ACK, ACK, and Data, followed by the FIN and ACK exchange that closes the connection.]
Sending/Receiving the FIN Packet
Sending a FIN: close()
  Process is done sending data via socket
  Process invokes “close()”
  Once TCP has sent all the outstanding bytes…
  … then TCP sends a FIN
Receiving a FIN: EOF
  Process is reading data from socket
  Eventually, read call returns an EOF
Data transmission
TCP: Byte-stream
Service: byte-stream
  Application reads or writes a stream of bytes to the transport
Issue: IP is packet-oriented
Solution: TCP maintains a local buffer
  Chop the stream into packets and transmit (sender)
  Coalesce data from packets to form a stream (receiver)
TCP “Stream of Bytes” Service
[Figure: Host A writes bytes 0, 1, 2, 3, …, 80 and Host B reads the same bytes 0, 1, 2, 3, …, 80 in order.]
…Emulated Using TCP “Segments”
[Figure: the byte stream from Host A (bytes 0, 1, 2, 3, …, 80) is carried to Host B in TCP segments, each holding a chunk of TCP data.]
Segment sent when:
  1. Segment full (Max Segment Size),
  2. Not full, but times out, or
  3. “Pushed” by application
TCP Segment
IP packet
  No bigger than Maximum Transmission Unit (MTU)
  E.g., up to 1500 bytes on an Ethernet link
TCP packet
  IP packet with a TCP header and data inside
  TCP header is typically 20 bytes long
TCP segment
  No more than Maximum Segment Size (MSS) bytes
  E.g., up to 1460 consecutive bytes from the stream
[Figure: an IP packet consists of an IP header and IP data; the IP data holds the TCP header and the TCP data (segment).]
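The relationship between MTU, header sizes, and MSS can be sketched as below. The function `segment_stream` and its parameters are hypothetical; it assumes 20-byte IP and TCP headers with no options, which is how 1500 bytes of MTU leaves 1460 bytes of MSS.

```python
def segment_stream(data: bytes, first_seq: int, mtu: int = 1500,
                   ip_hdr: int = 20, tcp_hdr: int = 20) -> list:
    """Chop a byte stream into (sequence number, chunk) pairs of at most MSS bytes.

    first_seq is the sequence number of the first byte of the stream.
    """
    mss = mtu - ip_hdr - tcp_hdr  # 1500 - 20 - 20 = 1460 on Ethernet
    # Each segment's sequence number is the number of its first byte
    return [(first_seq + off, data[off:off + mss])
            for off in range(0, len(data), mss)]
```

With a 3000-byte stream starting at sequence number 501, this gives three segments of 1460, 1460, and 80 bytes with sequence numbers 501, 1961, and 3421.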
Sequence Number
[Figure: Host A sends TCP data to Host B; numbering starts from the ISN (initial sequence number), each segment’s sequence number is the number of its 1st byte, and byte 81 is marked.]
Reliable Delivery on a Lossy Channel With Bit Errors
Challenges of Reliable Data Transfer
Over a perfectly reliable channel: Done
Over a channel with bit errors
  Receiver detects errors and requests retransmission
Over a lossy channel with bit errors
  Some data missing, others corrupted
  Receiver cannot easily detect loss
Over a channel that may reorder packets
  Receiver cannot easily distinguish loss vs. out-of-order
An Analogy
Alice and Bob are talking
What if Bob couldn’t understand Alice?
  Bob asks Alice to repeat what she said
What if Bob hasn’t heard Alice for a while?
  Is Alice just being quiet? Has she lost reception?
  How long should Bob just keep on talking?
  Maybe Alice should periodically say “uh huh”
  … or Bob should ask “Can you hear me now?”
Take-Aways from the Example
Acknowledgments from receiver
  Positive: “okay” or “uh huh” or “ACK”
  Negative: “please repeat that” or “NACK”
Retransmission by the sender
  After not receiving an “ACK”
  After receiving a “NACK”
Timeout by the sender (“stop and wait”)
  Don’t wait forever without some acknowledgment
TCP Support for Reliable Delivery
Detect bit errors: checksum
  Used to detect corrupted data at the receiver
  … leading the receiver to drop the packet
Detect missing data: sequence number
  Used to detect a gap in the stream of bytes
  ... and for putting the data back in order
Recover from lost data: retransmission
  Sender retransmits lost or corrupted data
  Two main ways to detect lost packets
TCP Acknowledgments
[Figure: Host A sends TCP data to Host B, numbered from the ISN (initial sequence number).]
Sequence number = 1st byte
ACK sequence number = next expected byte
Automatic Repeat reQuest (ARQ)
ACK and timeouts
  Receiver sends ACK when it receives packet
  Sender waits for ACK and times out
Simplest ARQ protocol
  Stop and wait
  Send a packet, stop and wait until ACK arrives
[Figure: time diagram between sender and receiver showing a packet, its ACK, and the sender’s timeout interval.]
Quick TCP Math
• Initial Seq No = 501. Sender sends 4500 bytes, successfully acknowledged. Next sequence number to send is:
  (A) 4501 (B) 5000 (C) 5001 (D) 5002
• Next 1000-byte TCP segment received. Receiver acknowledges with ACK number:
  (A) 5001 (B) 6000 (C) 6001
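One way to work the two questions is shown below. It rests on an assumption the slide does not spell out: that the first data byte carries sequence number 501. If the SYN were counted as consuming 501, every value would shift up by one. The helper names are illustrative only.

```python
def next_seq(first_byte_seq: int, bytes_sent: int) -> int:
    """Sequence number of the next byte to send after bytes_sent bytes."""
    return first_byte_seq + bytes_sent


def ack_for(segment_seq: int, segment_len: int) -> int:
    """ACK number = next expected byte after an in-order segment."""
    return segment_seq + segment_len


# Assumption: the first data byte is numbered 501
seq_after_4500 = next_seq(501, 4500)          # bytes 501..5000 were sent
ack_after_1000 = ack_for(seq_after_4500, 1000)  # bytes 5001..6000 received
```

Under that assumption the next sequence number is 5001 and the ACK for the following 1000-byte segment is 6001, which corresponds to option (C) in both questions.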
Flow Control: TCP Sliding Window
Sliding Window: Motivation
Stop-and-wait is inefficient
  Only one TCP segment is “in flight” at a time
Consider: 1.5 Mbps link with 50 ms round-trip-time (RTT)
  Assume segment size of 1 KB (8 Kbits)
  8 Kbits/segment at 50 msec/segment → 160 Kbps
  That’s 11% of the capacity of 1.5 Mbps link
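The arithmetic above can be reproduced in a few lines. Like the slide, this sketch treats 1 KB as 8000 bits and ignores the time needed to put the segment on the wire; the function names are illustrative.

```python
def stop_and_wait_throughput(segment_bits: float, rtt_s: float) -> float:
    """Bits per second when only one segment is in flight per RTT."""
    return segment_bits / rtt_s


def segments_to_fill_pipe(link_bps: float, rtt_s: float, segment_bits: float) -> float:
    """Bandwidth-delay product expressed in segments."""
    return link_bps * rtt_s / segment_bits


link_bps = 1.5e6      # 1.5 Mbps link
rtt_s = 0.050         # 50 ms round-trip time
segment_bits = 8_000  # 1 KB segment, taken as 8 Kbits

throughput = stop_and_wait_throughput(segment_bits, rtt_s)     # about 160 Kbps
utilization = throughput / link_bps                            # about 0.107
window = segments_to_fill_pipe(link_bps, rtt_s, segment_bits)  # about 9.4 segments
```

The last value suggests why a sliding window helps: roughly nine to ten segments would need to be in flight to keep this link busy.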
Sliding Window
Allow a larger amount of data “in flight”
  Allow sender to get ahead of the receiver
  … though not too far ahead
[Figure: the sending process writes into the sender’s TCP buffer, marked by last byte ACKed, last byte sent, and last byte written; the receiving process reads from the receiver’s TCP buffer, marked by last byte read, next byte expected, and last byte received.]
Receiver Buffering
Receive window size
  Amount that can be sent without acknowledgment
  Receiver must be able to store this amount of data
Receiver tells the sender the window
  Tells the sender the amount of free space left
[Figure: the byte stream divided into data ACK’d, outstanding un-ack’d data, data OK to send, and data not OK to send yet; the window size spans the outstanding data and the data OK to send.]
TCP: Flow Control
Flow control
  “Prevent sender from overrunning the capacity (buffer) of the receiver”
Solution: use adaptive receiver window size
  Goal is to keep LastByteRcvd (C) - LastByteRead (A) <= MaxRcvBuffer
  Every packet carries ACK and AdvertisedWindow
[Figure: the sending application writes into the sender’s TCP buffer, marked by LastByteAcked (J), LastByteSent (K), and LastByteWritten (I); the receiving application reads from the receiver’s TCP buffer, marked by LastByteRead (A), NextByteExpected (B), and LastByteRcvd (C).]
AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
LastByteSent (K) - LastByteAcked (J) <= AdvertisedWindow
EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
LastByteWritten - LastByteAcked <= MaxSendBuffer
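The two window formulas translate directly into code. In this sketch the function names are illustrative, and clamping the effective window at zero is an added safeguard that the slide does not mention.

```python
def advertised_window(max_rcv_buffer: int, next_byte_expected: int,
                      last_byte_read: int) -> int:
    """Free receive-buffer space the receiver reports to the sender."""
    return max_rcv_buffer - ((next_byte_expected - 1) - last_byte_read)


def effective_window(advertised: int, last_byte_sent: int,
                     last_byte_acked: int) -> int:
    """How many more bytes the sender may put in flight right now."""
    in_flight = last_byte_sent - last_byte_acked
    # Assumption: never report a negative window if the receiver shrank it
    return max(0, advertised - in_flight)
```

For instance, a 4096-byte buffer holding bytes 1 through 1000 unread advertises 3096 bytes; with 1000 bytes already in flight the sender may send 2096 more.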
Optimizing Retransmissions
Reasons for Retransmission
[Figure: three sender/receiver time diagrams. ACK lost: the timeout fires and the retransmission arrives as a duplicate packet. Packet lost: the timeout fires and the packet is retransmitted. Early timeout: the timeout fires before the ACK arrives, producing duplicate packets.]
How Long Should Sender Wait?
Sender sets a timeout to wait for an ACK
  Too short: wasted retransmissions
  Too long: excessive delays when packet lost
TCP sets timeout as a function of the RTT
  Expect ACK to arrive after a “round-trip time”
  … plus a fudge factor to account for queuing
But, how does the sender know the RTT?
  Running average of delay to receive an ACK
TCP Timeout
Issue: RTT in a wide area network varies substantially
Solution: adaptive timeout
Original algorithm:
  EstimatedRTT = α × EstimatedRTT + (1 - α) × SampleRTT
  Timeout = β × EstimatedRTT (β = 2)
Problems
  Does not distinguish whether the ACK is for the original transmission or a retransmission
  A constant β is not good: it assumes constant variance
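A small sketch of the original estimator follows. The slide fixes β = 2 but gives no value for α, so α = 0.875 here is an assumption, as is seeding the estimate with the first sample; `original_timeout` is a hypothetical name.

```python
def original_timeout(samples: list, alpha: float = 0.875, beta: float = 2.0) -> tuple:
    """Return (EstimatedRTT, Timeout) after folding in every RTT sample."""
    estimated = samples[0]  # assumption: start from the first sample
    for sample in samples[1:]:
        # Exponentially weighted moving average of the samples
        estimated = alpha * estimated + (1 - alpha) * sample
    return estimated, beta * estimated
```

With samples of 100 ms and then 200 ms, the estimate moves only to 112.5 ms and the timeout to 225 ms, which shows how slowly the average reacts to a sudden jump.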
TCP Timeout
Karn/Partridge Algorithm
  Whenever TCP retransmits a segment, it stops taking samples of the RTT
  Only measure SampleRTT for segments that have been sent only once
  Each time TCP retransmits, set the next timeout to be twice the last timeout
  Relieves congestion
Jacobson/Karels Algorithm: adaptive variance (uses mean deviation)
  Difference = SampleRTT - EstimatedRTT
  EstimatedRTT = EstimatedRTT + (δ × Difference) → same as in the original algorithm, with δ = 1 - α
  Deviation = Deviation + δ × (|Difference| - Deviation)
  Timeout = μ × EstimatedRTT + φ × Deviation (default: μ = 1 and φ = 4)
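Both algorithms can be combined in one small estimator, sketched below. The class name `RttEstimator` is hypothetical, δ = 0.125 and an initial deviation of 0 are assumptions (the slide only gives μ = 1 and φ = 4), and resetting the backoff after a clean sample is a design choice made for this example.

```python
class RttEstimator:
    """Jacobson/Karels timeout with Karn/Partridge sampling and backoff."""

    def __init__(self, first_sample: float, delta: float = 0.125,
                 mu: float = 1.0, phi: float = 4.0):
        self.estimated_rtt = first_sample
        self.deviation = 0.0  # assumption: no deviation known yet
        self.delta, self.mu, self.phi = delta, mu, phi
        self.backoff = 1

    def on_ack(self, sample_rtt: float, retransmitted: bool = False) -> None:
        if retransmitted:
            return  # Karn/Partridge: ignore samples from retransmitted segments
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.delta * difference
        self.deviation += self.delta * (abs(difference) - self.deviation)
        self.backoff = 1  # assumption: a clean sample ends the backoff

    def on_timeout(self) -> None:
        self.backoff *= 2  # Karn/Partridge: double the timeout on retransmit

    @property
    def timeout(self) -> float:
        return self.backoff * (self.mu * self.estimated_rtt
                               + self.phi * self.deviation)
```

Starting from 100 ms, a single 180 ms sample raises the estimate to 110 ms and the deviation to 10 ms, so the timeout becomes 150 ms; a timeout event then doubles it to 300 ms.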
TCP Deadlock
How the deadlock arises:
  The receiver advertises a window size of 0, so the sender stops sending data
  The window size update from the receiver is lost
To solve it:
  The sender starts the persist timer when AdvertisedWindow = 0
  When the persist timer expires, the sender sends a small packet
Triggering Transmission
When to transmit a segment:
  Small segments are subject to large overhead
  Reach the max segment size (MSS): the size of the largest segment TCP can send without causing the local IP to fragment
    MSS = local MTU - IP and TCP headers
  The sending process explicitly asks TCP to transmit (“push”)
Congestion
When the network cannot support the sender’s rate
  Queues at the network elements overflow
Even with flow control, packets might not reach the destination
[Figure: Source1, Source2, and Source3 send through shared network elements toward Dest1 and Dest2.]
Congestion Control vs. Flow Control
Congestion control
  Mechanism to prevent sender from overrunning the capacity of the network
  When network is the bottleneck
Flow control
  Mechanism to prevent sender from overrunning the capacity of the receiver
  When receiver is the bottleneck
Congestion Control: Design Approach
Maintain another window at the sender called CongestionWindow (cwnd)
  CongestionWindow is the max number of packets allowed in the network
  Number of unACKed packets at the sender
Key: how to calculate the congestion window (cwnd)
  Various approaches possible
  TCP estimates it based on observed packet losses
    Assumes packet loss is an indication of congestion
Since we don’t know whether the network or the receiver is the bottleneck
  MaxWindow = MIN(CongestionWindow, AdvertisedWindow)
  EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
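The combined limit can be written as a single function. `sendable_bytes` is a hypothetical name, all quantities are taken to be in bytes, and the clamp at zero is an addition for safety.

```python
def sendable_bytes(cwnd: int, advertised_window: int,
                   last_byte_sent: int, last_byte_acked: int) -> int:
    """Bytes the sender may still transmit under both window limits."""
    # Whichever of network and receiver is the bottleneck sets the limit
    max_window = min(cwnd, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    return max(0, max_window - in_flight)  # assumption: clamp at zero
```

With cwnd = 8000, an advertised window of 16000, and 4000 bytes in flight, the network is the bottleneck and 4000 more bytes may be sent.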
Congestion Avoidance (AIMD)
If no congestion in the network (increase conservatively)
  Increase the congestion window additively every RTT
    Every RTT: w = w + 1 (w = cwnd in segments)
    Every ACK reception: w = w + 1/w (w = cwnd in segments)
    Every ACK reception: cwnd = cwnd + MSS*(MSS/cwnd) (cwnd in bytes)
If congestion in the network (decrease aggressively)
  Decrease the congestion window multiplicatively, immediately
    cwnd = cwnd/2 (cwnd in bytes)
How is congestion detected?
  Estimated (more later)
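The byte-based update rules above can be checked with a short sketch. The function names are illustrative, and the example assumes one ACK per segment with no delayed ACKs.

```python
def on_ack_additive_increase(cwnd: float, mss: int) -> float:
    """Per-ACK increase that adds up to about one MSS per RTT."""
    return cwnd + mss * (mss / cwnd)


def on_congestion(cwnd: float) -> float:
    """Multiplicative decrease applied immediately on congestion."""
    return cwnd / 2


mss = 1460
cwnd = 10.0 * mss            # ten segments in flight
for _ in range(10):          # assumption: one ACK per segment in this RTT
    cwnd = on_ack_additive_increase(cwnd, mss)
halved = on_congestion(cwnd)
```

After the ten ACKs the window has grown by slightly less than one MSS, because each increment is computed against the already enlarged window.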
Congestion Avoidance (AIMD)
TCP’s sawtooth pattern
Issues with additive increase
  Takes too long to ramp up a connection from the beginning
  The entire advertised window may be reopened when a lost packet is retransmitted and a single cumulative ACK is received by the sender
[Figure: congestion window size versus time, showing the sawtooth pattern and the startup time.]
TCP “Slow Start”: To start quickly!
Maintain another variable, the slow start threshold (ssthresh)
  Last known stable rate
  If (cwnd > ssthresh)
    State = congestion avoidance
  Else
    State = slow start
In slow start
  Increase the congestion window exponentially every RTT
    Every ACK reception: w = w + 1 (w = cwnd in segments)
    Every ACK reception: cwnd = cwnd + MSS (cwnd in bytes)
Key: How is ssthresh calculated?
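The state test and the per-ACK rules can be simulated to see the exponential phase give way to linear growth. This is a sketch under simplifying assumptions: no losses, one ACK per full segment, and a starting window of one MSS; `on_ack` and `windows_per_rtt` are hypothetical names.

```python
def on_ack(cwnd: float, ssthresh: float, mss: int) -> float:
    """Apply the per-ACK rule for the current state."""
    if cwnd > ssthresh:
        return cwnd + mss * mss / cwnd  # congestion avoidance
    return cwnd + mss                   # slow start


def windows_per_rtt(mss: int, ssthresh: float, rtts: int) -> list:
    """cwnd in bytes at the start of each RTT, beginning at one MSS."""
    cwnd = float(mss)
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        acks = int(cwnd // mss)  # one ACK per full segment sent this RTT
        for _ in range(acks):
            cwnd = on_ack(cwnd, ssthresh, mss)
    return history
```

With an MSS of 1000 bytes and ssthresh of 8000 bytes, the window starts each RTT at 1000, 2000, 4000, and 8000 bytes, then grows by less than one MSS per RTT once it exceeds ssthresh.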
TCP: Congestion Detection and Retransmit
Loss of packet indicates congestion
Timer timeouts (no ACK)
  Set according to the Jacobson/Karels algorithm
  On timer timeout
    ssthresh = max(2*MSS, effwin/2); cwnd = MSS
    Notice this will cause TCP to go into slow start
Issue: takes a long time to detect a packet loss
  Affects throughput
Any other quicker way of detecting a packet loss?
Fast Retransmit
Observation: a series of duplicate ACKs might mean a packet loss
Solution
  Every time the receiver receives an out-of-order packet, it sends a duplicate ACK
  Sender retransmits the missing packet after it receives some number of duplicate ACKs (e.g. 3 duplicate ACKs)
Fast Retransmit does not replace timeouts
Issue: reduces latency (early retransmit) but still incurs loss in throughput (slow start after packet loss)
[Figure: the sender transmits PKT 1 through PKT 6 and PKT 3 is lost; the receiver returns ACK 1, ACK 2, then a duplicate ACK 2 for each of PKT 4, PKT 5, and PKT 6; the sender retransmits PKT 3 and receives ACK 6.]
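The duplicate-ACK trigger can be sketched as a simple counter. This example follows the figure's packet-numbered convention, where ACK n means packets up to n arrived in order, so the missing packet is n + 1; real TCP ACKs carry the next expected byte instead. `packets_to_retransmit` is a hypothetical name.

```python
def packets_to_retransmit(acks: list, dup_threshold: int = 3) -> list:
    """Return the packet numbers a sender would fast-retransmit."""
    retransmits = []
    last_ack, duplicates = None, 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == dup_threshold:
                # Figure's convention: ACK n repeated means packet n + 1 is missing
                retransmits.append(ack + 1)
        else:
            last_ack, duplicates = ack, 0
    return retransmits
```

Fed the ACK sequence from the figure, 1, 2, 2, 2, 2, 6, the function reports that packet 3 should be retransmitted when the third duplicate ACK 2 arrives.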
Fast Recovery
Transmit a packet for every ACK received until the retransmitted packet is ACK’d
  ssthresh = max(2*MSS, cwnd/2); cwnd = ssthresh + 3
On every ACK until the ACK of the retransmitted packet
  cwnd = cwnd + 1
On reception of the ACK of the retransmitted packet
  Start congestion avoidance instead of slow start
  cwnd = ssthresh
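The window updates during fast recovery can be sketched in bytes. The slide mixes units, so this example assumes that "+ 3" and "+ 1" mean three segments and one segment, i.e. 3*MSS and MSS bytes; the function names are illustrative.

```python
def enter_fast_recovery(cwnd: int, mss: int) -> tuple:
    """Return (ssthresh, cwnd) after the third duplicate ACK."""
    ssthresh = max(2 * mss, cwnd // 2)
    # Assumption: inflate by three segments, one per duplicate ACK seen
    return ssthresh, ssthresh + 3 * mss


def on_further_duplicate_ack(cwnd: int, mss: int) -> int:
    """Each further duplicate ACK lets one more segment out."""
    return cwnd + mss


def exit_fast_recovery(ssthresh: int) -> int:
    """On the ACK of the retransmitted packet, continue in congestion avoidance."""
    return ssthresh
```

Starting from a 16000-byte window with a 1000-byte MSS, ssthresh becomes 8000 bytes, the window is inflated to 11000 bytes during recovery, and it deflates to 8000 bytes once the retransmitted packet is acknowledged.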
Homework
5.13 (3rd ed and 4th ed), 5.16, 5.28, 5.34, 5.39
Due 4/5