1
An Integrated Approach to Improving Web Performance
Lili Qiu
Cornell University
2
Outline
Motivation & Open Issues
Solutions
  Study the workload of a busy Web server
  Optimize TCP performance for Web transfers
  Provision the content distribution networks
Summary & Other Work
3
Motivation
The Web is the dominant source of traffic in the Internet today
  Accounts for over 70% of wide-area traffic
Web performance is often unsatisfactory
  WWW – World Wide Wait
  Consequence: losing potential customers!
[Figure labels: Network congestion; Overloaded Web server]
4
Challenges in Providing Highly Efficient Web Services
Workload characterization
  The workload of busy Web sites is not well understood
Protocol inefficiency
  Mismatch between Web transfers and the TCP protocol
Infrastructure provisioning
  Current trend: Content Distribution Networks
  Problem: Where to place replicas?
[Figure labels: Workload Characterization; Protocol Inefficiency; Infrastructure Provisioning]
5
Our Solutions
Web Workload Characterization
  Study the workload of a busy Web server
Improve protocol efficiency
  Optimize TCP startup performance for Web transfers
Provision Web replication infrastructure
  Develop placement algorithms for content distribution networks (CDNs)
6
Part I Web Workload Characterization
The Content and Access Dynamics of a Busy Web Site: Findings and Implications. Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000. (Joint work with V. N. Padmanabhan)
7
Motivation
Solid understanding of Web workload is critical for designing robust and scalable systems
Missing piece in previous work: workload of busy Web servers
[Figure: clients, proxies, the Internet, replicas, and servers]
8
Overview
MSNBC server site
  A large news site consistently ranked among the busiest sites on the Web
  Server cluster with 40 nodes
  25 million accesses a day (HTML content alone)
  Period studied: Aug. – Oct. 99 & Dec. 17, 98 flash crowd
Server logs
  HTTP access logs
  Content Replication System (CRS) logs
  HTML content logs
Data analysis
  Content dynamics
  Access dynamics
9
Temporal Stability of File Popularity
Methodology
  Consider the traces from a pair of days
  Pick the top n popular documents from each day
  Compute the overlap
Results
  One day apart: significant overlap (80%)
  Two months apart: smaller overlap (20-80%)
  Ten months apart: very small overlap (mostly below 20%)
[Figure: extent of overlap (0 to 1) versus number of popular documents picked (1 to 100000); curves for 17DEC98 - 18OCT99, 01AUG99 - 18OCT99, and 17OCT99 - 18OCT99]
The set of popular documents remains stable for days
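
The overlap methodology above can be sketched in a few lines. This is only an illustration under stated assumptions: each day's trace is taken to be a plain list of requested URLs, and the names `top_n`, `overlap`, `day1`, and `day2` are hypothetical, not from the study.

```python
from collections import Counter


def top_n(requests, n):
    """Return the set of the n most requested documents in one day's trace."""
    return {doc for doc, _ in Counter(requests).most_common(n)}


def overlap(day_a, day_b, n):
    """Fraction of the top-n documents that the two days have in common."""
    return len(top_n(day_a, n) & top_n(day_b, n)) / n


# Hypothetical toy traces: one URL per request (not real MSNBC data).
day1 = ["/a"] * 4 + ["/b"] * 3 + ["/c"] * 2 + ["/d"]
day2 = ["/a"] * 4 + ["/b"] * 3 + ["/e"] * 2 + ["/d"]

print(overlap(day1, day2, 2))  # both days share the same top two documents
print(overlap(day1, day2, 3))  # the third most popular document differs
```

Dividing by n assumes each trace contains at least n distinct documents; a smaller trace would understate the overlap.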
10
Spatial Locality in Client Accesses
Domain membership is significant except when there is a “hot” event of global interest
[Figure: two panels, "Normal Day" and "Dec. 17, 1998", plotting the fraction of requests shared against domain ID; curves for Trace and Random]
11
Spatial Distribution of Client Accesses
Cluster clients using network-aware clustering [KW00]
IP addresses with the same address prefix belong to a cluster
Top 10, 100, 1000, 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests respectively
A small number of client clusters contribute most of the requests.
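
A rough sketch of how the share of requests from the top k client clusters could be computed. It assumes, as a simplification, that a cluster is a fixed-length /24 prefix; the clustering method of [KW00] is not reproduced here, and `cluster_id`, `top_k_share`, and the sample addresses are hypothetical.

```python
from collections import Counter
import ipaddress


def cluster_id(ip, prefix_len=24):
    """Map an IPv4 address to its enclosing prefix (fixed length is an assumption)."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def top_k_share(client_ips, k, prefix_len=24):
    """Fraction of requests that come from the k busiest clusters."""
    counts = Counter(cluster_id(ip, prefix_len) for ip in client_ips)
    top = sum(count for _, count in counts.most_common(k))
    return top / len(client_ips)


# Hypothetical request log: one client address per request.
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.200", "10.0.1.1", "192.168.5.9"]
print(top_k_share(ips, 1))  # share of the single busiest /24
```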
12
The Applicability of Zipf's Law to Web Requests
Web requests follow a Zipf-like distribution
  Request frequency ∝ 1/i^α, where i is a document’s ranking
The value of α is much larger in MSNBC traces
  1.4 – 1.8 in MSNBC traces
  Smaller than or close to 1 in the proxy traces
  Close to 1 in the small departmental server logs [ABC+96]
  Highest when there is a hot event
[Figure: bar chart of α (0 to 2) for MSNBC, proxies, and less popular servers]
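
One common way to estimate the exponent of a Zipf-like distribution is a least-squares fit of log(frequency) against log(rank). The sketch below is an assumption about how such a fit might be done, not the procedure used in the study; `fit_zipf_alpha` and the synthetic counts are illustrative only.

```python
import math


def fit_zipf_alpha(counts):
    """Estimate alpha as the negated slope of log(frequency) versus log(rank)."""
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic request counts that follow 1 / i**1.5 exactly (not real trace data).
counts = [1e6 / i ** 1.5 for i in range(1, 1001)]
print(round(fit_zipf_alpha(counts), 3))
```

On real traces the tail of the ranking is noisy, so the fitted value depends on how many ranks are included.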
13
Impact of larger α
Accesses in MSNBC traces are much more concentrated
90% of the accesses are accounted for by
  Top 2-4% files in MSNBC traces
  Top 36% files in proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
  Top 10% files in small departmental server logs reported in [AW96]
Popular news sites like MSNBC see much more concentrated accesses
  Reverse caching and replication can be very effective!
[Figure: percentage of requests versus percentage of documents (sorted by popularity); curves for 12/17/98 Server Traces, 08/01/99 Server Traces, and 10/06/99 Proxy Traces]
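
The concentration figures above (the top x% of files accounting for 90% of accesses) can be computed from per-file access counts roughly as follows. This is a minimal sketch; `files_needed` and the sample counts are hypothetical.

```python
def files_needed(counts, share=0.9):
    """Smallest fraction of files, most popular first, that covers `share` of accesses."""
    ordered = sorted(counts, reverse=True)
    target = share * sum(ordered)
    running = 0
    for i, count in enumerate(ordered, start=1):
        running += count
        if running >= target:
            return i / len(ordered)
    return 1.0


# Hypothetical per-file access counts.
skewed = [95, 2, 1, 1, 1]
uniform = [25, 25, 25, 25]
print(files_needed(skewed))   # one file out of five already covers 90%
print(files_needed(uniform))  # every file is needed
```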
14
Part II Transport Layer Optimization for the Web
Speeding Up Short Data Transfers: Theory, Architectural Support, and Simulation Results. Proceedings of NOSSDAV 2000. (Joint work with Yin Zhang and Srinivasan Keshav)
15
Motivation
Characteristics of Web data transfers
  Short & bursty [Mah97]
  Use TCP
Problem: Short data transfers interact poorly with TCP!
16
TCP/Reno Basics
Slow Start
  Exponential growth in congestion window
  Slow: log(n) round trips for n segments
Congestion Avoidance
  Linear probing of BW
Fast Retransmission
  Triggered by 3 duplicate ACKs
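
To make the "log(n) round trips for n segments" point concrete, here is a small sketch that counts round trips under an idealized slow start. It assumes the congestion window doubles every RTT, nothing is lost, and ACKs are not delayed; `slow_start_rounds` is a hypothetical helper, not part of any TCP implementation.

```python
def slow_start_rounds(n_segments, initial_cwnd=1):
    """Round trips needed to send n_segments if cwnd doubles every RTT (no losses)."""
    cwnd, sent, rounds = initial_cwnd, 0, 0
    while sent < n_segments:
        sent += cwnd   # one window's worth of segments per round trip
        cwnd *= 2      # idealized exponential growth
        rounds += 1
    return rounds


for n in (1, 10, 100):
    print(n, slow_start_rounds(n))
```

A 10-segment transfer needs 4 round trips in this model, which is why short Web transfers spend most of their lifetime in slow start.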
17
Related Work
P-HTTP [PM94]
  Reuses a single TCP connection for multiple Web transfers, but still pays slow start penalty
T/TCP [Bra94]
  Cache connection count, RTT
TCP Control Block Interdependence [Tou97]
  Cache cwnd, but large bursts cause losses
Rate Based Pacing [VH97]
4K Initial Window [AFP98]
Fast Start [PK98, Pad98]
  Need router support to ensure TCP friendliness
18
Our Approach
Directly enter Congestion Avoidance
Choose optimal initial congestion window
  A Geometry Problem: Fitting a block to the service rate curve to minimize completion time
19
Optimal Initial cwnd
Minimize completion time by having the transfer end at an epoch boundary.
20
Shift Optimization
Minimize initial cwnd while keeping the same integer number of RTTs
Before optimization: cwnd = 9
After optimization: cwnd = 5
Effect of Shift Optimization
22
TCP/SPAND
Estimate network state by sharing performance information
  SPAND: Shared PAssive Network Discovery [SSK97]
Directly enter Congestion Avoidance, starting with the optimal initial cwnd
Avoid large bursts by pacing
[Figure: Web servers connected to the Internet through a performance gateway]
23
Implementation Issues
Scope for sharing and aggregation
  24-bit heuristic
  Network-aware clustering [KW00]
Collecting performance information
  Performance reports, new TCP option, Windmill’s approach, …
Information aggregation
  Sliding window average
Retrieving estimation of network state
  Explicit query, active push, …
Pacing
  Leaky-bucket based pacing
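
As an illustration of pacing, the sketch below spreads one window of segments evenly over a round-trip time instead of sending them back to back, which is the effect a leaky bucket draining at cwnd/RTT would have. The drain rate and the function name `paced_send_times` are assumptions for this example, not details of TCP/SPAND.

```python
def paced_send_times(cwnd, rtt, start=0.0):
    """Departure times that release cwnd segments at a constant rate of cwnd/rtt."""
    interval = rtt / cwnd  # gap between consecutive segments
    return [start + i * interval for i in range(cwnd)]


# Four segments over a 1-second RTT leave 0.25 s apart rather than in one burst.
print(paced_send_times(4, 1.0))
```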
24
Opportunity for Sharing
MSNBC: 90% of requests arrive within 5 minutes of the most recent request from the same client network (using 24-bit heuristic)
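
A statistic of this kind could be measured from an access log roughly as follows. The sketch assumes the log is a time-sorted list of (timestamp in seconds, IPv4 address) pairs and treats the first three octets as the client network; `fraction_within` and the sample log are hypothetical.

```python
def fraction_within(requests, window=300.0):
    """Fraction of requests that follow another request from the same /24 within `window` seconds."""
    last_seen = {}
    hits = total = 0
    for ts, ip in requests:
        net = ip.rsplit(".", 1)[0]  # first 24 bits of a dotted-quad address
        if net in last_seen and ts - last_seen[net] <= window:
            hits += 1
        last_seen[net] = ts
        total += 1
    return hits / total if total else 0.0


# Hypothetical log entries: (seconds since start, client address).
log = [(0, "10.0.0.1"), (100, "10.0.0.7"), (500, "10.0.0.9"),
       (600, "10.0.1.1"), (650, "10.0.1.2")]
print(fraction_within(log))
```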
25
Cost for Sharing
MSNBC: 15,000-25,000 different client networks in a 5-minute interval during peak hours (using 24-bit heuristic)
26
Simulation Results
Methodology
  Download files in rounds
Performance Metric
  Average completion time
TCP flavors considered
  reno-ssr: Reno with slow start restart
  reno-nssr: Reno w/o slow start restart
  newreno-ssr: NewReno with slow start restart
  newreno-nssr: NewReno w/o slow start restart
27
Simulation Topologies
28
T1 Terrestrial WAN Link with Single Bottleneck
29
T1 Terrestrial WAN Link with Multiple Bottlenecks
30
TCP Friendliness
31
Summary
TCP/SPAND significantly reduces latency for short data transfers
  35-65% compared to reno-ssr / newreno-ssr
  20-50% compared to reno-nssr / newreno-nssr
  Even higher for fatter pipes
TCP/SPAND is TCP-friendly
TCP/SPAND is incrementally deployable
  Server-side modification only
  No modification at client-side
32
Part III Provision Content Distribution Networks (CDNs)
On the Placement of Web Server Replicas. To appear in INFOCOM'2001. (Joint work with V. N. Padmanabhan and G. M. Voelker)
33
Introduction to CDNs
Content providers want to offer better service to their clients at lower cost
Increasing deployment of content distribution networks (CDNs)
  Akamai, Digital Island, Exodus …
Idea: a network of servers
Features:
  Outsourcing infrastructure
  Improve performance by moving content closer to end users
  Flash crowd protection
[Figure: clients reach content providers through CDN servers]
34
Placement of CDN servers
Goal
  Minimize users’ latency or bandwidth usage
Minimum K-median problem
  Select K centers to minimize the sum of assignment costs
  Cost can be latency or bandwidth or other metric we want to optimize
  NP-hard problem
[Figure: clients reach content providers through CDN servers]
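
The K-median objective can be written down directly. The sketch below assumes every node is both a client and a candidate site, that `dist[c][r]` is the cost of serving client c from site r, and that `demand[c]` is client c's load; all names are hypothetical. The exhaustive search is only there to show the objective on a tiny instance and is exponential in general.

```python
from itertools import combinations


def placement_cost(dist, demand, replicas):
    """Sum over clients of demand times the distance to the nearest chosen replica."""
    return sum(
        demand[c] * min(dist[c][r] for r in replicas)
        for c in range(len(demand))
    )


def brute_force_placement(dist, demand, k):
    """Try every set of k sites; feasible only for very small inputs."""
    sites = range(len(dist))
    return min(combinations(sites, k),
               key=lambda chosen: placement_cost(dist, demand, chosen))


# Four nodes on a line with unit spacing and equal demand (toy data).
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
demand = [1, 1, 1, 1]
best = brute_force_placement(dist, demand, 2)
print(best, placement_cost(dist, demand, best))
```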
35
Placement Algorithms
Tree based algorithm [LGG+99]
  Assume the underlying topologies are trees, and model it as a dynamic programming problem
  O(N^3 M^2) for choosing M replicas among N potential places
Random
  Pick the best among several random assignments
Hot spot
  Place replicas near the clients that generate the largest load
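
A deliberately simplified reading of the hot spot heuristic is to rank candidate sites by the load generated at each site and take the top M. The sketch below makes that assumption explicit; it ignores load from nearby clients, and `hot_spot_placement` is a hypothetical name.

```python
def hot_spot_placement(demand, m):
    """Pick the m sites with the largest local load (a simplifying assumption)."""
    ranked = sorted(range(len(demand)), key=lambda site: demand[site], reverse=True)
    return ranked[:m]


# Hypothetical load per site.
print(hot_spot_placement([5, 1, 9, 3], 2))
```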
36
Placement Algorithms (Cont.)
Greedy algorithm
  Greedy(N, M) {
    for i = 1 .. M {
      for each remaining replica R {
        cost[R] = cost after placing an additional replica at R
      }
      select the replica with the lowest cost
    }
  }
Super Optimal algorithm
  Lagrangian relaxation + subgradient method
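
The greedy pseudocode above might look like this in runnable form. The sketch assumes a full distance matrix in which every node is both a client and a candidate site, and it breaks ties by lowest site index; these choices, and the names `placement_cost` and `greedy_placement`, are assumptions for illustration.

```python
def placement_cost(dist, demand, replicas):
    """Sum over clients of demand times the distance to the nearest chosen replica."""
    return sum(
        demand[c] * min(dist[c][r] for r in replicas)
        for c in range(len(demand))
    )


def greedy_placement(dist, demand, m):
    """Add one replica at a time, each time choosing the site that lowers total cost most."""
    chosen = []
    candidates = set(range(len(dist)))
    for _ in range(m):
        # Evaluate the cost of adding each remaining site; ties go to the lowest index.
        best = min(sorted(candidates),
                   key=lambda r: placement_cost(dist, demand, chosen + [r]))
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Five nodes on a line with unit spacing and equal demand (toy data).
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
demand = [1] * 5
sites = greedy_placement(dist, demand, 2)
print(sites, placement_cost(dist, demand, sites))
```

On this toy input greedy first takes the middle node and ends with cost 4, while placing replicas at nodes 1 and 3 would cost 3, which shows how an early choice can keep greedy slightly above the optimum.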
37
Simulation Methodology
Network topology
  Randomly generated topologies
    Using GT-ITM Internet topology generator
  Real Internet network topology
    AS level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
Web Workload
  Real server traces
    MSNBC, ClarkNet, NASA Kennedy Space Center
Performance Metric
  Relative performance: cost_practical / cost_super-optimal
38
Simulation Results in Random Graph Topologies
39
Simulation Results in Real Internet Topologies
40
Effects of Imperfect Knowledge about Input Data
Predict load using moving window average
(a) Perfect knowledge about topology
(b) Knowledge about topology accurate within a factor of 2
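
Load prediction with a moving window average can be sketched as below. The window length of 3 periods and the function name `predict_load` are assumptions for illustration; the slides do not state the window used.

```python
from collections import deque


def predict_load(history, window=3):
    """Predict each period's load as the mean of the previous `window` observations."""
    recent = deque(maxlen=window)  # oldest observation drops out automatically
    predictions = []
    for load in history:
        # No prediction is possible before the first observation.
        predictions.append(sum(recent) / len(recent) if recent else None)
        recent.append(load)
    return predictions


# Hypothetical per-period request counts for one client cluster.
print(predict_load([10, 20, 30, 40], window=2))
```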
41
Summary
First experimental study on placement of CDNs
Knowledge about client workload and topology is crucial for provisioning CDNs
The greedy algorithm performs the best
  Within a factor of 1.1 – 1.5 of super-optimal
The greedy algorithm is insensitive to noise
  Stays within a factor of 2 of the super-optimal when the salted error is a factor of 4
The hot spot algorithm performs nearly as well
  Within a factor of 1.6 – 2 of super-optimal
How to obtain inputs
  Moving window average for load prediction
  Using BGP router data to obtain topology information
42
Contributions
Workload characterization
  Study the workload of MSNBC web site
Protocol efficiency
  Optimize TCP startup performance for Web transfers
Infrastructure provisioning
  Develop placement algorithms for Content Distribution Networks
[Figure labels: Workload Characterization; Protocol Efficiency; Infrastructure Provisioning]
43
Other Work
Available at http://www.cs.cornell.edu/lqiu/papers/papers.html
Fast Firewall Implementations for Software and Hardware-based Routers. Submitted to ACM SIGMETRICS’2001.
Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms on the Internet. Proceedings of IEEE INFOCOM'2000, Tel-Aviv, Israel, March 2000.
On Individual and Aggregate TCP Performance. 7th International Conference on Network Protocols (ICNP'99), Toronto, Canada, October 1999.