Performance Interactions Between P-HTTP and TCP Implementations
John Heidemann, USC/Information Sciences Institute
May 19, 1997
Presentation: Baekcheol Jang
Nov 27, 2001
Contents
• Introduction
• The Performance Problem
  • Experimental Framework and Initial Performance
  • The Short-Initial-Segment Problem
  • The Odd/Short-Final-Segment Problem
  • The Slow-Start Re-Start Problem
  • Other Problems
• Conclusions
• Critics
Introduction (1/2)
• Early experiments suggested that P-HTTP performance was ten times slower than the corresponding HTTP transactions in a simple page-retrieval benchmark.
• This result is surprising, since P-HTTP is intended to improve performance by amortizing the cost of connection creation across multiple requests.
• We found several interactions between P-HTTP and TCP that explain the exceedingly poor P-HTTP performance.
• They result from interactions between application-level P-HTTP behavior and existing TCP algorithms.
Introduction (2/2)
• We resolved these interactions through application-level implementation changes, providing an HTTP implementation where P-HTTP is 40% faster than simple HTTP over an Ethernet.
• Three problems
  • Two tie P-HTTP performance to TCP delayed acknowledgments and add up to 200 ms to each P-HTTP transaction:
    • The Short-Initial-Segment Problem
    • The Odd/Short-Final-Segment Problem
  • One results in multiple slow-starts per TCP connection (congestion control) and can substantially reduce the performance benefits of P-HTTP:
    • The Slow-Start Re-Start Problem
The Performance Problems
• Experimental Framework and Initial Performance
• The Short-Initial-Segment Problem
• The Odd/Short-Final-Segment Problem
• The Slow-Start Re-Start Problem
• Other Problems
• Current Status
Experimental Framework and Initial Performance
• Two hosts directly connected by a 10 Mb/s Ethernet.
• Two Sun SPARC 20/71 computers running SunOS 4.1.3 with TCP modifications (multicast support, a 16 KB default TCP window size, and slow-start enabled for directly connected networks).
• HTTP server: Apache 1.1b4.
• The client made HTTP/1.0 requests.
• We ran a workload consisting of 100 web-page transactions.
• Each transaction retrieves three documents of 6651, 3883 and 1866 B over a single P-HTTP connection.
Experimental Framework and Initial Performance
• P-HTTP performance is about 14 times worse than simple HTTP performance over Ethernet.
• Pipelining requests across a P-HTTP connection is necessary to maximize performance, but our simple client does not pipeline requests.
The Short-Initial-Segment Problem
• Problem: an interaction between
  • Apache sending MIME headers as a separate segment (300 B), and
  • SunOS's implementation of TCP's slow-start and delayed-acknowledgment algorithms.
• Apache supports keep-alive connections, and it sends its headers as a separate segment.
• TCP's delayed-acknowledgment algorithm
  • An ACK should be delayed in the hope of piggybacking it on return traffic.
  • The Host Requirements RFC adds that at least every other full segment must be acknowledged.
• The client has not received two full segments, so it delays ACKing the data until the delayed-ACK timer expires (200 ms on BSD-derived TCPs, or up to 500 ms as permitted by the Host Requirements RFC), while slow-start keeps the server from sending more.
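The delayed-ACK rule described above can be sketched as a small model. This is an illustrative simplification, not SunOS code: the function name `ack_delay_ms`, the constants, and the assumption that there is no return traffic to piggyback on are all choices made for this example.

```python
MSS = 1460              # full-size segment on Ethernet, in bytes
DELAYED_ACK_MS = 200    # delayed-ACK timer on BSD-derived TCPs


def ack_delay_ms(segment_sizes, mss=MSS, timer_ms=DELAYED_ACK_MS):
    """Return how long the receiver holds the ACK for the last segment.

    Simplified model: an ACK is sent immediately once two full-size
    segments are waiting to be acknowledged; otherwise the ACK waits
    for the delayed-ACK timer (no return traffic is assumed).
    """
    unacked_full = 0
    delay = 0
    for size in segment_sizes:
        if size >= mss:
            unacked_full += 1
        if unacked_full == 2:
            unacked_full = 0
            delay = 0          # ACK goes out right away
        else:
            delay = timer_ms   # ACK is held until the timer fires
    return delay
```

In this model a lone 300 B header segment waits for the full timer, while two back-to-back full segments are acknowledged at once.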
The Short-Initial-Segment Problem
• Solution: ensure that the HTTP server does not send the HTTP headers in a partial segment (remove the application-level flush, which was there to work around a bug in a popular browser).
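The idea of not flushing the headers separately can be illustrated with a short Python sketch. Apache itself is written in C, so this is not the actual fix; `build_response` and `send_response` are hypothetical helpers, and the header set is only an example.

```python
import socket


def build_response(body: bytes, content_type: str = "text/html") -> bytes:
    """Join headers and body into one buffer, so a single send call hands
    TCP both together instead of flushing the headers on their own."""
    headers = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: Keep-Alive\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + body


def send_response(conn: socket.socket, body: bytes) -> None:
    # One sendall call: no application-level flush between headers and body.
    conn.sendall(build_response(body))
```

With headers and body in the same buffer, TCP is free to fill the first segment with body data rather than emitting a short header-only segment.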
The Odd/Short-Final-Segment Problem
• Problem: odd numbers of segments interact with the silly-window-syndrome (SWS) avoidance algorithm
  • when the Nagle algorithm is enabled, and
  • a response requires an odd number of full segments followed by a short final segment.
• Odd numbers of segments arise when Apache sends data over a TCP connection with a large MSS.
  • Apache writes data at the application layer in 4 KB chunks.
  • Each 4 KB write is split into segments of 1460 B (the Ethernet MSS), 1460 B and 1176 B.
• Under the Host Requirements RFC, every two full segments must be acknowledged.
• The client will delay acknowledgment of the third segment according to the TCP delayed-acknowledgment algorithm.
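The segment sizes quoted above follow from simple arithmetic on the write size and the MSS. A minimal sketch, assuming each application write is segmented on its own (the helper name `split_into_segments` is made up for this example):

```python
def split_into_segments(write_size: int, mss: int = 1460) -> list:
    """Split one application write into full-size segments plus a
    shorter final segment when the write is not a multiple of the MSS."""
    full, rest = divmod(write_size, mss)
    return [mss] * full + ([rest] if rest else [])


segments = split_into_segments(4096)   # one 4 KB Apache write
```

For a 4096 B write and a 1460 B MSS this gives two full segments and a 1176 B remainder.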
The Odd/Short-Final-Segment Problem
• Problem (cont.)
• Assume that the server has only a small amount of data left to send to complete the current response.
• Apache writes this data, but TCP refuses to send it because of sender-side SWS avoidance. The BSD TCP algorithm sends data only if one of the following holds:
  • A full-size segment can be sent.
  • We can send at least half of the client's advertised window.
  • We can send everything we have and either are not expecting an ACK or the Nagle algorithm is disabled.
• The server therefore waits for the client to ACK the outstanding segment before sending the rest of the response (200 ms).
• This problem occurs because Nagle's algorithm is intended for small-packet traffic.
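The three send conditions listed above can be written as a single predicate. This is a sketch of the rule as stated on the slide, not the BSD source; the function name `may_send` and its parameter names are assumptions for the example.

```python
def may_send(sendable: int, mss: int, peer_window: int,
             sends_everything: bool, ack_outstanding: bool,
             nagle_enabled: bool = True) -> bool:
    """Sender-side SWS avoidance: decide whether TCP sends now."""
    if sendable <= 0:
        return False
    if sendable >= mss:                    # a full-size segment can be sent
        return True
    if sendable >= peer_window // 2:       # at least half the peer's window
        return True
    # Everything queued fits, and either no ACK is pending or Nagle is off.
    if sends_everything and (not ack_outstanding or not nagle_enabled):
        return True
    return False
```

A short final piece of a response, queued while an earlier segment is still unacknowledged, fails all three tests when Nagle is enabled and passes the third when it is disabled.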
The Odd/Short-Final-Segment Problem
• Solution
• Disable Nagle's algorithm for P-HTTP connections, thus disabling the aspect of SWS avoidance that interferes with performance.
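On sockets-based systems, Nagle's algorithm is normally disabled per connection with the `TCP_NODELAY` option. A minimal Python sketch follows; the wrapper name `disable_nagle` is made up here, and the original server would do the equivalent in C.

```python
import socket


def disable_nagle(conn: socket.socket) -> None:
    """Turn off Nagle's algorithm on one TCP connection."""
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```

A server would typically call this on each accepted connection before writing the response.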
The Slow-Start Re-Start Problem
• The assumption of BSD TCP
  • If at any time all data sent has been acknowledged and nothing has been sent for one retransmission time-out period, then it reinitializes the congestion window to 1 segment, forcing a slow start.
  • The motivation: some applications, such as SMTP and NNTP, typically have a negotiation phase followed by a data-transfer phase.
• Result of reinitializing
  • P-HTTP connections will frequently slow-start "mid-stream".
  • In fact, since users nearly always spend more than the retransmission time-out browsing a given page, P-HTTP will nearly always slow-start when the user follows a link.
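The BSD restart rule described above fits in a few lines. This is a model of the rule as stated, with the congestion window kept in bytes; the function name `cwnd_after_idle` and the example values are assumptions.

```python
def cwnd_after_idle(cwnd: int, unacked_bytes: int,
                    idle_s: float, rto_s: float, mss: int = 1460) -> int:
    """Return the congestion window to use for the next send.

    If nothing is outstanding and the connection has been idle for at
    least one retransmission time-out, fall back to one segment.
    """
    if unacked_bytes == 0 and idle_s >= rto_s:
        return mss          # forces a new slow start
    return cwnd
```

A user who reads a page for 30 seconds before following a link exceeds any realistic time-out, so the next request starts again from one segment.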
The Slow-Start Re-Start Problem
• Solution
  • Avoiding the additional slow-starts would improve P-HTTP performance, but the sender would then emit a burst of up to a full window of packets.
  • Ensure that all TCP implementations reset the congestion window after an idle period.
  • Not easily available to the application.
  • Limits the performance advantage of persistent connections.
• Intermediate approach
  • Decay the congestion window over time rather than resetting it to one segment.
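One way to picture the intermediate approach is a window that shrinks gradually with idle time. The slides do not give a decay schedule, so the halving per retransmission time-out used below is purely an assumption for illustration, as are the function name `decayed_cwnd` and the one-segment floor.

```python
def decayed_cwnd(cwnd: int, idle_s: float, rto_s: float,
                 mss: int = 1460) -> int:
    """Shrink the congestion window with idle time instead of resetting it.

    Assumed schedule: halve the window once per full RTO of idle time,
    never going below one segment.
    """
    idle_periods = int(idle_s // rto_s)     # completed idle RTO periods
    return max(mss, cwnd >> idle_periods)   # halve once per period
```

Short pauses keep most of the window, and long pauses converge to the same one-segment start as the reset rule.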
Other Problems
• Many web servers employ standard I/O packages.
  • This results in several extra data copies for bulk data transfer.
  • Disk, file-system cache, input-stream stdio buffer, user buffer, network buffers, output stdio buffer, network device
  • Approach: memory-mapping the input file reduces data copies to three.
  • With memory-mapping all data copying can happen directly in the kernel.
• Socket buffers are too small to support a steady segment flow over wide-area connections.
  • TCP's sliding window is limited by the socket buffer size.
  • TCP socket buffers: 2-16 KB, 4 KB on average.
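The socket-buffer limit can be made concrete with the usual window-per-round-trip bound: a sender cannot have more than one window of data in flight per round trip. The sketch below applies that bound; the function name `max_throughput_bps` and the round-trip time in the example are assumptions, not measurements from the paper.

```python
def max_throughput_bps(socket_buffer_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when the sliding window is capped by
    the socket buffer: one window of data per round-trip time."""
    return socket_buffer_bytes * 8 / rtt_s


# A 4 KB buffer over an assumed 0.5 s wide-area round trip.
bound = max_throughput_bps(4096, 0.5)
```

On a directly connected Ethernet the round trip is short enough that a small buffer rarely matters, which is why the limit shows up mainly on wide-area paths.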
Conclusions
• Identified three performance problems that occur due to interactions between specific implementations of TCP and P-HTTP.
• The Short-Initial-Segment Problem and the Odd/Short-Final-Segment Problem
  • Validation.
• The Slow-Start Re-Start Problem
  • Only a proposal is suggested.
Critics
• Strong points
  • Solves P-HTTP's performance problem
  • Starts not from suspicion but from measurement
  • Finds the problems from real packet traces
  • Implementation
• Weak points
  • Too simple (paper, approach)
  • No background information
  • The third problem and the other problems are left as future work.