CIS 553: Networked Systems
HTTP Performance
April 6, 2020
Agenda
- Router-assisted congestion control
- HTTP
  - Protocol
  - Fetching web pages
- HTTP Optimizations
University of Pennsylvania
From HTTP to Web Pages
- Most Web pages have multiple objects
  - e.g., an HTML file and a bunch of embedded images
- How do you retrieve those objects (naively)?
  - One item at a time
- New TCP connection per (small) object!
Non-persistent connections
- Default in HTTP/1.0
  - 2 RTT for each object in the HTML file (one RTT for the TCP handshake, one for the request and response)
  - One more 2 RTT for the HTML file itself
- Inefficient due to repeated work
Persistent connections
- Maintain TCP connection across multiple requests
  - Including transfers subsequent to the current page
  - Client or server can tear down the connection
- Advantages
  - Avoid overhead of connection set-up and tear-down
  - Allow underlying layers (e.g., TCP) to learn about RTT and bandwidth characteristics
- Default in HTTP/1.1
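To make connection reuse concrete, here is a minimal sketch using only the Python standard library. It starts a throwaway local server, sends several requests over one `http.client.HTTPConnection`, and records the client's source port on the server side; the handler, the paths, and the helper name `fetch_over_one_connection` are illustrative assumptions, not part of the lecture.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_client_ports = []  # source port of each request, recorded by the server


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the stdlib server keep connections open

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"object " + self.path.encode()
        self.send_response(200)
        # Content-Length tells the client where this response ends,
        # so the same connection can carry the next one.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_over_one_connection(paths):
    """Fetch every path over a single TCP connection to a local test server."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)
            bodies.append(conn.getresponse().read())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies


bodies = fetch_over_one_connection(["/a", "/b", "/c"])
```

If the connection was reused, every entry in `seen_client_ports` is the same port, which means only one TCP handshake was paid for all three objects.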
Pipelined requests & responses
- Batch requests and responses to reduce the number of packets
- Multiple requests can be contained in one TCP segment
[Figure: the client sends Request 1, Request 2, and Request 3 back to back; the server then returns Transfer 1, Transfer 2, and Transfer 3.]
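As a rough illustration of batching, the sketch below concatenates several HTTP/1.1 GET requests into one byte string. The host `example.com`, the paths, and the helper name `build_pipelined_requests` are hypothetical; whether the batch actually fits in one TCP segment depends on its size and the path MTU.

```python
def build_pipelined_requests(host, paths):
    """Return one byte string holding a GET request for every path."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"  # blank line ends each request's headers
        )
    return "".join(requests).encode("ascii")


batch = build_pipelined_requests("example.com", ["/a.png", "/b.png", "/c.png"])
```

A client would hand `batch` to a single `socket.sendall` call on an open connection and then read the responses back in request order.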
Concurrent requests and responses
- Use multiple connections in parallel
- Does not necessarily maintain order of responses
[Figure: the client issues requests R1, R2, and R3 over parallel connections; the server returns transfers T1, T2, and T3, which may overlap.]
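The ordering point can be seen in a small simulation. This sketch replaces real network transfers with `time.sleep` and uses a thread pool as a stand-in for parallel connections; the delays and the names `SIMULATED_DELAY`, `fetch`, and `fetch_concurrently` are assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical per-object delays in seconds, standing in for network transfers.
SIMULATED_DELAY = {"R1": 0.15, "R2": 0.05, "R3": 0.10}


def fetch(name):
    time.sleep(SIMULATED_DELAY[name])
    return name


def fetch_concurrently(names, max_connections):
    """Issue all fetches in parallel and return them in completion order."""
    completion_order = []
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        futures = [pool.submit(fetch, name) for name in names]
        for future in as_completed(futures):
            completion_order.append(future.result())
    return completion_order


order = fetch_concurrently(["R1", "R2", "R3"], max_connections=3)
```

With these delays the shorter transfers usually finish first, so `order` typically differs from the request order, and the caller has to match responses to requests itself.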
Concurrent Connections (Part 2)
- Initial congestion window & slow start limit BW
  - Use more connections!
- Per-connection RWNDs are sometimes set low
  - Use more connections!
- Congestion control limits how much we can send
  - Use more connections!
- Adds complexity
  - Goes against original intent of TCP
Scorecard: Getting n small objects
- Time dominated by latency
- One-at-a-time: ~2n RTT
- m concurrent: ~2⌈n/m⌉ RTT
- Persistent: ~(n+1) RTT
- Pipelined: ~2 RTT
- Pipelined/Persistent: ~2 RTT first time, RTT later
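The scorecard can be checked with a few lines of arithmetic. This sketch simply encodes the approximations listed above; it reads the bracket in the m-concurrent row as a ceiling, which is an assumption, and the function name `small_object_rtts` is made up for the example.

```python
import math


def small_object_rtts(n, m):
    """Approximate RTT counts for fetching n small objects (latency-bound)."""
    return {
        "one_at_a_time": 2 * n,                # handshake + request per object
        "m_concurrent": 2 * math.ceil(n / m),  # rounds of m parallel fetches
        "persistent": n + 1,                   # one handshake, then one RTT each
        "pipelined": 2,                        # handshake + one batched request
        "pipelined_persistent_later": 1,       # connection already open
    }


scorecard = small_object_rtts(n=10, m=4)
```

For 10 objects and 4 parallel connections this gives 20, 6, 11, 2, and 1 RTTs respectively.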
Scorecard: Getting n large objects each of size F
- Time dominated by bandwidth
- One-at-a-time: ~nF/B
- m concurrent: ~⌈n/m⌉ F/B
  - Assuming the link is shared with a large population of users and each TCP connection gets the same bandwidth
- Pipelined and/or persistent: ~nF/B
  - The only thing that helps is getting more bandwidth
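A similar sketch covers the bandwidth-bound case. It assumes B is the bandwidth one TCP connection receives and that opening m connections yields m such shares, as the slide's assumption states; the function name and the sample numbers are hypothetical.

```python
import math


def large_object_seconds(n, m, size_bits, bandwidth_bps):
    """Approximate transfer times for n large objects of size F at rate B."""
    per_object = size_bits / bandwidth_bps  # F / B
    return {
        "one_at_a_time": n * per_object,
        "m_concurrent": math.ceil(n / m) * per_object,
        "pipelined_or_persistent": n * per_object,
    }


# 8 objects of 80 Mbit each over a 10 Mbit/s per-connection share.
times = large_object_seconds(n=8, m=4, size_bits=80e6, bandwidth_bps=10e6)
```

Here pipelining or persistence leaves the total at 64 seconds, while four parallel connections cut it to 16 seconds because they claim four bandwidth shares.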
Every Millisecond Counts
500ms delay causes 1.2% decrease in Bing revenue [Souders 2009]
400ms delay causes 0.74% decrease in Google searches [Brutlag 2009]
100ms delay causes 1% decrease in Amazon revenue [Linden 2013]
Many Web services companies spend considerable effort reducing Web response time.
Background - Critical Path
Critical Path: the longest chain of dependent browser tasks
Fetch Delay: Network Delay
Render Delay: Computational Delay
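One way to picture the critical path is as the longest path through a dependency graph of browser tasks. The sketch below computes it for a made-up page load; the task names, durations, and dependencies in `TASKS` are hypothetical and only illustrate the definition.

```python
from functools import lru_cache

# Hypothetical page-load tasks: name -> (duration in ms, tasks it depends on).
TASKS = {
    "fetch_html": (100, []),
    "parse_html": (20, ["fetch_html"]),
    "fetch_css": (80, ["parse_html"]),
    "fetch_js": (120, ["parse_html"]),
    "run_js": (50, ["fetch_js"]),
    "render": (30, ["fetch_css", "run_js"]),
}


def critical_path(tasks):
    """Return (total time, task chain) for the longest dependent chain."""

    @lru_cache(maxsize=None)
    def finish(name):
        duration, deps = tasks[name]
        best_time, best_path = 0, ()
        for dep in deps:  # a task starts after its slowest dependency finishes
            dep_time, dep_path = finish(dep)
            if dep_time > best_time:
                best_time, best_path = dep_time, dep_path
        return best_time + duration, best_path + (name,)

    return max((finish(name) for name in tasks), key=lambda result: result[0])


length_ms, path = critical_path(TASKS)
```

In this example the chain runs through the JavaScript fetch rather than the CSS fetch, so speeding up the CSS download alone would not shorten the page load.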
Background - Page Load Time (PLT)
Order matters!
How can we reduce PLT?
Agenda
- Router-assisted congestion control
- HTTP
  - Protocol
  - Fetching web pages
- HTTP Optimizations
  - CDNs
  - HTTP/2
Proxies
- Computer that acts as a broker between client and server
  - Speaks to server on client’s behalf
HTTP Proxies
- Accepts requests from multiple clients
- Takes each request and reissues it to the server
- Takes the response and forwards it to the client
[Figure: two clients send HTTP requests to a proxy server, which reissues them to the origin servers and relays the HTTP responses back.]
Types of Proxies
- Explicit
  - Requires configuring browser
- Implicit
  - Service provider deploys an “on path” proxy
  - … that intercepts and handles Web requests
- Also, forward and reverse
Forward Proxy
- Proxy on behalf of the client
- Usually under administrative control of client-side AS
[Figure: clients exchange HTTP requests and responses with a proxy server, which exchanges HTTP requests and responses with the origin.]
Reverse Proxy
- Cache on behalf of the server
- Generally implicit
[Figure: HTTP requests arrive at a proxy server placed in front of the origin servers, which returns the HTTP responses.]
Examples of Proxies
- Forward:
  - Privacy – hide your true IP
  - Firewalls – limit access to the Internet, secure the proxy
  - ISP caching – reduce traffic exiting the AS
- Reverse:
  - CDNs – reduce traffic exiting the AS
  - Load balancers
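The caching examples above can be sketched as a proxy that remembers responses. This is a deliberately simplified model: it ignores expiry, cache-control headers, and real sockets, and the names `CachingProxy`, `origin`, and `/logo.png` are invented for the illustration.

```python
class CachingProxy:
    """Answers repeat requests from memory instead of contacting the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin
        self._cache = {}
        self.origin_requests = 0  # how many requests actually left the proxy

    def get(self, url):
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch_from_origin(url)
        return self._cache[url]


def origin(url):
    return f"content of {url}"


proxy = CachingProxy(origin)
# Four clients behind the same proxy ask for the same object.
responses = [proxy.get("/logo.png") for _client in range(4)]
```

All four clients get the object, but only the first request reaches the origin, which is the traffic reduction the caching entries refer to.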
Before CDNs
- Sending content from the source to 4 users takes 4 x 3 = 12 “network hops” in the example
[Figure: the source sends a separate copy across the network to each client.]
After CDNs
- Sending content via replicas takes only 4 + 2 = 6 “network hops”
[Figure: the source sends one copy to a replica, which then serves each nearby client.]
Popularity of Content
- Zipf’s Law: few popular items, many unpopular ones; both matter
[Figure: Zipf popularity versus rank (the popularity of the kth item is proportional to 1/k). Source: Wikipedia. Inset: George Zipf (1902-1950).]
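To see why both the head and the tail matter, the sketch below estimates what fraction of requests a cache would serve if item k is requested with weight 1/k and the cache holds the most popular items. The function name `zipf_hit_rate` and the catalog sizes are assumptions chosen for illustration.

```python
def zipf_hit_rate(num_items, cache_size):
    """Fraction of requests that hit a cache holding the top cache_size items."""
    weights = [1 / k for k in range(1, num_items + 1)]  # popularity of rank k
    return sum(weights[:cache_size]) / sum(weights)


top_10_percent = zipf_hit_rate(num_items=1000, cache_size=100)
```

Under this model, caching the top 10% of a 1000-item catalog serves about 69% of requests, yet the remaining requests are spread over the long tail and still have to be fetched from somewhere.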
Content Distribution Networks (CDN)
- Caching and replication as a service
- Large-scale distributed storage infrastructure (usually) administered by one entity
  - e.g., Akamai has servers in 20,000+ locations
- Combination of caching and replication
  - Pull: direct result of clients’ requests (caching)
  - Push: expectation of high access rate (replication)
- Can do some processing to handle dynamic webpage content
CDNs (e.g., Akamai)
The Akamai EdgePlatform:
- 240,000+ servers
- 2200+ POPs
- 130 countries
- 1600+ networks
- 800+ cities
Resulting in traffic of:
- 30 terabits / sec
- 3+ trillion hits / day
Responsible for 15-30% of all web traffic
Server Selection Policies
- Live server
  - For availability
- Lowest load
  - To balance load across the servers
- Closest
  - Nearest geographically, or in round-trip time
- Best performance
  - Throughput, latency, …
- Cheapest bandwidth, electricity, …
[Figure: an origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Europe, and Asia.]
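A few of these policies can be expressed as a filter followed by a minimum. The sketch below is one possible arrangement, not how any particular CDN does it; the `Server` fields, the policy names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    alive: bool
    load: float    # fraction of capacity in use, 0.0 to 1.0
    rtt_ms: float  # round-trip time measured from the client


def select_server(servers, policy):
    """Pick a live server according to the named policy."""
    live = [s for s in servers if s.alive]  # "live server" is always applied
    if not live:
        raise LookupError("no live server")
    if policy == "lowest_load":
        return min(live, key=lambda s: s.load)
    if policy == "closest":
        return min(live, key=lambda s: s.rtt_ms)
    raise ValueError(f"unknown policy: {policy}")


SERVERS = [
    Server("s-america", alive=True, load=0.7, rtt_ms=40),
    Server("europe", alive=True, load=0.2, rtt_ms=110),
    Server("asia", alive=False, load=0.0, rtt_ms=10),
]

by_load = select_server(SERVERS, "lowest_load").name
by_distance = select_server(SERVERS, "closest").name
```

The two policies disagree here: the least loaded live server is not the nearest one, so a real system has to weigh the criteria against each other.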
Mechanism: HTTP Redirection
- Advantages
  - Fine-grain control
  - Selection based on client IP address
- Disadvantages
  - Extra round-trips for TCP connection to server
  - Overhead on the server
[Figure: the client sends GET and receives Redirect, then sends GET to the selected server and receives OK.]
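Selection by client IP address might look like the following sketch, which maps the client's address to a replica and returns a 302 response with a `Location` header. The prefixes, the replica URLs, and the function name `redirect_for` are hypothetical.

```python
import ipaddress

# Hypothetical mapping from client prefixes to replica URLs.
REPLICAS = [
    (ipaddress.ip_network("10.1.0.0/16"), "http://eu.cdn.example/"),
    (ipaddress.ip_network("10.2.0.0/16"), "http://asia.cdn.example/"),
]
DEFAULT_REPLICA = "http://origin.example/"


def redirect_for(client_ip, path):
    """Return (status, headers) redirecting the client to a chosen replica."""
    address = ipaddress.ip_address(client_ip)
    base = DEFAULT_REPLICA
    for network, replica in REPLICAS:
        if address in network:
            base = replica
            break
    return 302, {"Location": base + path.lstrip("/")}


status, headers = redirect_for("10.2.3.4", "/video.mp4")
```

The client then has to open a new TCP connection to the host in `Location`, which is where the extra round trips come from.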
Mechanism: Anycast Routing
- Advantages
  - No extra round trips
  - Route to nearby server
- Disadvantages
  - Does not consider network or server load
  - Different packets may go to different servers
  - Used only for simple request-response apps
[Figure: two server sites each announce the same prefix, 1.2.3.0/24.]
Mechanism: DNS
- Advantages
  - Avoid TCP set-up delay
  - DNS caching reduces overhead
  - Relatively fine control
- Disadvantages
  - Based on IP address of local DNS server
  - “Hidden load” effect
  - DNS TTL limits adaptation
[Figure: the client sends a DNS query through its local DNS server and is directed to one of two servers, 1.2.3.4 or 1.2.3.5.]
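The TTL limitation can be illustrated with a toy model of a caching local DNS server. Time is passed in explicitly so the example is deterministic; the class `CachingResolver`, the name `www.example.com`, and the 20-second TTL are assumptions, while the two addresses are the ones shown in the figure.

```python
class CachingResolver:
    """Local DNS server that caches answers until their TTL expires."""

    def __init__(self, authoritative_lookup):
        self._lookup = authoritative_lookup
        self._cache = {}  # name -> (address, expiry time)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]  # answer from cache, CDN is not consulted
        address, ttl = self._lookup(name)
        self._cache[name] = (address, now + ttl)
        return address


preferred = {"address": "1.2.3.4"}  # the CDN's current choice of replica


def cdn_authoritative(name):
    return preferred["address"], 20  # hypothetical TTL of 20 seconds


resolver = CachingResolver(cdn_authoritative)
first = resolver.resolve("www.example.com", now=0)
preferred["address"] = "1.2.3.5"  # the CDN changes its choice right away
stale = resolver.resolve("www.example.com", now=10)
fresh = resolver.resolve("www.example.com", now=20)
```

Until the cached entry expires, clients behind this resolver keep being sent to the old server, so the CDN can only steer traffic as fast as the TTL allows.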