1
Integrated Approach to Improving Web Performance
Lili Qiu
Cornell University
2
Outline
Motivation & Open Issues
Solutions:
Study Web workload, and properly provision the content distribution networks
Optimizing TCP performance for Web transfers
Fast packet classification
Summary
Other Work
3
Motivation
Web is the dominant traffic in the Internet today
Web performance is often unsatisfactory: WWW – World Wide Wait
Consequence: losing potential customers!
[Figure: network congestion and an overloaded Web server]
4
Why is the Web so slow?
Application layer: Web servers are overloaded …
Transport layer: Web transfers are short and bursty, and interact poorly with TCP
Network layer: routers are not fast enough; network congestion; route flaps and routing instabilities
… Inefficiency in any layer of the protocol stack can slow down the Web!
5
Our Solutions
Application layer: study Web workload; properly provision content distribution networks (CDNs)
Transport layer: optimize TCP startup performance for Web transfers
Network layer: speed up packet classification (useful for firewalls & diff-serv)
6
Part I: Application Layer Approach
Study the workload of busy Web servers
The Content and Access Dynamics of a Busy Web Site: Findings and Implications. Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000. (Joint work with V. N. Padmanabhan)
Properly provision content distribution networks
On the Placement of Web Server Replicas. Submitted to INFOCOM 2001. (Joint work with V. N. Padmanabhan and G. M. Voelker)
7
Introduction
Solid understanding of Web workload is critical for designing robust and scalable systems
The workload of popular Web servers is not well understood
Study the content and access dynamics of the MSNBC web site: a large news server, one of the busiest sites on the Web, with 25 million accesses a day (HTML content alone). Period studied: Aug. – Oct. 99 & Dec. 17, 98 (flash crowd)
Properly provision content distribution networks: where to place the edge servers in the CDNs
8
Temporal Stability of File Popularity
Methodology: consider the traces from a pair of days; pick the top n popular documents from each day; compute the overlap
Results:
One day apart: significant overlap (80%)
Two months apart: smaller overlap (20 – 80%)
Ten months apart: very small overlap (mostly below 20%)
[Figure: extent of overlap vs. number of popular documents picked (1 to 100,000), for the day pairs 17DEC98 – 18OCT99, 01AUG99 – 18OCT99, and 17OCT99 – 18OCT99]
The set of popular documents remains stable for days
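The overlap computation described above can be sketched as follows; the traces, document IDs, and function names are illustrative, not taken from the study:

```python
from collections import Counter

def top_n(requests, n):
    """Return the set of the n most-requested document IDs in one day's trace."""
    return {doc for doc, _ in Counter(requests).most_common(n)}

def overlap(day_a, day_b, n):
    """Fraction of day_a's top-n documents that also appear in day_b's top-n."""
    a, b = top_n(day_a, n), top_n(day_b, n)
    return len(a & b) / n

# Two toy single-day traces (lists of requested document IDs)
day1 = ["index", "news1", "news2", "sports", "weather", "news1", "index"]
day2 = ["index", "news3", "news2", "sports", "movies", "index", "index"]
print(overlap(day1, day2, 3))
```

For the toy traces above, two of day1's top-3 documents remain popular on day2, giving an overlap of 2/3.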
9
Spatial Locality in Client Accesses
[Figure, two panels: fraction of requests shared vs. domain ID, for a normal day and for Dec. 17, 1998 (trace vs. random)]
Domain membership is significant except when there is a “hot” event of global interest
10
Spatial Distribution of Client Accesses
Cluster clients using network-aware clustering [KW00]: IP addresses with the same address prefix belong to a cluster
The top 10, 100, 1000, and 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests respectively
A small number of client clusters contribute most of the requests.
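The clustering idea can be sketched with a fixed 24-bit prefix; this is a simplification of [KW00], which groups addresses by their BGP routing prefixes rather than a fixed length:

```python
from collections import Counter

def prefix24(ip):
    """Map an IPv4 address to its 24-bit prefix, e.g. '10.1.2.3' -> '10.1.2'."""
    return ".".join(ip.split(".")[:3])

def cluster_requests(client_ips):
    """Count requests per 24-bit client cluster."""
    return Counter(prefix24(ip) for ip in client_ips)

ips = ["10.1.2.3", "10.1.2.77", "10.1.2.9", "192.0.2.1"]
print(cluster_requests(ips).most_common())
# [('10.1.2', 3), ('192.0.2', 1)]
```

Sorting the resulting counts in decreasing order and taking a cumulative sum yields the concentration figures quoted above.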
11
The Applicability of Zipf's Law to Web Requests
Web requests follow a Zipf-like distribution: request frequency ∝ 1/i^α, where i is a document’s popularity ranking
The value of α is much larger in MSNBC traces:
1.4 – 1.8 in MSNBC traces
Smaller than or close to 1 in the proxy traces
Close to 1 in the small departmental server logs [ABC+96]
Highest when there is a hot event
[Figure: α for MSNBC, proxies, and less popular servers]
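A common way to estimate the Zipf exponent α from a trace is a least-squares fit of log(frequency) against log(rank); this sketch uses synthetic counts rather than the real traces:

```python
import math

def zipf_alpha(counts):
    """Estimate alpha from request counts sorted by decreasing popularity:
    fit log(count_i) ~ c - alpha * log(i) by least squares."""
    xs = [math.log(i) for i in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope  # alpha is the negated slope of the log-log fit

# Synthetic counts that follow rank^-1.5 (so the fit should recover ~1.5)
counts = [round(1e6 / i ** 1.5) for i in range(1, 101)]
print(round(zipf_alpha(counts), 2))
```

An α in the 1.4 – 1.8 range, as seen for MSNBC, means the log-log rank/frequency plot is much steeper than for proxy traces, i.e. requests are far more concentrated on the top documents.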
12
Impact of Larger α
Accesses in MSNBC traces are much more concentrated: 90% of the accesses are accounted for by
The top 2 – 4% of files in MSNBC traces
The top 36% of files in proxy traces (Microsoft proxies and the proxies studied in [BCF+99])
The top 10% of files in small departmental server logs reported in [AW96]
Popular news sites like MSNBC see much more concentrated accesses. Reverse caching and replication can be very effective!
[Figure: percentage of requests vs. percentage of documents (sorted by popularity), for 12/17/98 server traces, 08/01/99 server traces, and 10/06/99 proxy traces]
13
Introduction to Content Distribution Networks (CDNs)
Content providers want to offer better service to their clients at lower cost
Increasing deployment of content distribution networks (CDNs): Akamai, Digital Island, Exodus …
Idea: a network of servers
Features: outsourcing infrastructure; improved performance by moving content closer to end users; flash crowd protection
[Figure: a CDN of servers between content providers and clients]
14
Placement of CDN Servers
Goal: minimize users’ latency or bandwidth usage
Minimum K-median problem: select K centers to minimize the sum of assignment costs
Cost can be latency, bandwidth, or any other metric we want to optimize
NP-hard problem
[Figure: CDN servers placed between content providers and clients]
15
Placement Algorithms
Tree-based algorithm [LGG+99]: assume the underlying topologies are trees, and model placement as a dynamic programming problem; O(N³M²) for choosing M replicas among N potential places
Random: pick the best among several random assignments
Hot spot: place replicas near the clients that generate the largest load
16
Placement Algorithms (Cont.)
Greedy algorithm:
  Greedy(N, M) {
    for i = 1 .. M {
      for each remaining candidate site R {
        cost[R] = cost after placing an additional replica at R
      }
      select the site R with the lowest cost[R]
    }
  }
Super-optimal algorithm: Lagrangian relaxation + subgradient method
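The greedy loop above can be made concrete in a few lines of Python; the cost matrix `dist`, the per-client `load`, and the client/site sizes are all hypothetical inputs for illustration:

```python
def greedy_placement(dist, load, m):
    """Pick m replica sites greedily: at each step add the candidate site that
    minimizes total assignment cost, where every client is assigned to its
    cheapest already-placed replica. dist[c][s] = cost from client c to site s."""
    sites = list(range(len(dist[0])))
    chosen = []

    def total_cost(placed):
        return sum(load[c] * min(dist[c][s] for s in placed)
                   for c in range(len(dist)))

    for _ in range(m):
        best = min((s for s in sites if s not in chosen),
                   key=lambda s: total_cost(chosen + [s]))
        chosen.append(best)
    return chosen

# 3 clients x 3 candidate sites; clients 0 and 1 carry most of the load
dist = [[1, 5, 9],
        [6, 1, 7],
        [8, 8, 1]]
load = [10, 10, 1]
print(greedy_placement(dist, load, 2))  # [1, 0]
```

Each iteration re-evaluates every remaining site against the current placement, which matches the quadratic flavor of the pseudocode on the slide; the Lagrangian-relaxation bound is what the evaluation compares this output against.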
17
Simulation Methodology
Network topology:
Randomly generated topologies, using the GT-ITM Internet topology generator
Real Internet network topology: AS-level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
Web workload: real server traces from MSNBC, ClarkNet, and the NASA Kennedy Space Center
Performance metric: relative performance = cost_practical / cost_super-optimal
18
Simulation Results in Random Tree Topologies
19
Simulation Results in Random Graph Topologies
20
Simulation Results in Real Internet Topologies
21
Effects of Imperfect Knowledge about Input Data
Predict load using a moving window average
(a) Perfect knowledge about topology
(b) Knowledge about topology accurate only to within a factor of 2
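The moving-window load predictor can be sketched as follows, assuming per-cluster request counts observed once per interval (class and variable names are illustrative):

```python
from collections import deque

class MovingWindowPredictor:
    """Predict the next interval's load as the mean of the last w observations."""
    def __init__(self, w):
        self.window = deque(maxlen=w)  # old observations fall off automatically

    def observe(self, load):
        self.window.append(load)

    def predict(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

p = MovingWindowPredictor(w=3)
for load in [100, 120, 80, 110]:
    p.observe(load)
print(p.predict())  # mean of the last 3 observations: (120 + 80 + 110) / 3
```

A small window reacts quickly to shifts in demand; a large window smooths out noise, which is the trade-off the sensitivity experiments above probe.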
22
Conclusion
Characterized Web workload using MSNBC traces
Placement of CDN servers:
Knowledge about client workload and topology is crucial for provisioning CDNs
The greedy algorithm performs the best: within a factor of 1.1 – 1.5 of super-optimal
The greedy algorithm is insensitive to noise: it stays within a factor of 2 of the super-optimal when the salted error is a factor of 4
The hot spot algorithm performs nearly as well: within a factor of 1.6 – 2 of super-optimal
How to obtain inputs: a moving window average for load prediction; BGP routing data for topology information
23
Part II: Transport Layer Approach
Speeding Up Short Data Transfers: Theory, Architectural Support, and Simulation Results. Proceedings of NOSSDAV 2000. (Joint work with Yin Zhang and Srinivasan Keshav)
24
Motivation
Characteristics of Web data transfers: short & bursty [Mah97]; carried over TCP
Problem: short data transfers interact poorly with TCP!
25
TCP/Reno Basics
Slow start: exponential growth in the congestion window; slow: log(n) round trips for n segments
Congestion avoidance: linear probing of bandwidth
Fast retransmission: triggered by 3 duplicate ACKs
26
Related Work
P-HTTP [PM94]: reuses a single TCP connection for multiple Web transfers, but still pays the slow-start penalty
T/TCP [Bra94]: caches connection count and RTT
TCP Control Block Interdependence [Tou97]: caches cwnd, but large bursts cause losses
Rate-Based Pacing [VH97]
4K Initial Window [AFP98]
Fast Start [PK98, Pad98]: needs router support to ensure TCP friendliness
27
Our Approach
Directly enter congestion avoidance
Choose the optimal initial congestion window
A geometry problem: fitting a block to the service rate curve to minimize completion time
28
Optimal Initial cwnd
Minimize completion time by having the transfer end at an epoch boundary.
29
Shift Optimization
Minimize the initial cwnd while keeping the same integer number of RTTs
Before optimization: cwnd = 9
After optimization: cwnd = 5
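One way to make the shift optimization concrete: under congestion avoidance the window grows by one segment per RTT, so a transfer of S segments starting at window w finishes in the smallest k with w + (w+1) + … + (w+k-1) ≥ S, and the optimization lowers w as far as possible without increasing k. This linear-growth model is an assumption of the sketch; the slide's 9 → 5 example depends on its specific service-rate geometry, so the numbers below differ:

```python
def rtts_needed(segments, cwnd):
    """Smallest number of RTTs to send `segments` when the window starts at
    `cwnd` and grows by 1 segment each RTT (congestion avoidance)."""
    sent, k, w = 0, 0, cwnd
    while sent < segments:
        sent += w
        w += 1
        k += 1
    return k

def shift_optimize(segments, cwnd):
    """Minimize the initial cwnd while keeping the same number of RTTs."""
    k = rtts_needed(segments, cwnd)
    w = cwnd
    while w > 1 and rtts_needed(segments, w - 1) == k:
        w -= 1
    return w

# 25 segments starting at cwnd = 9 take 3 RTTs (9 + 10 + 11 >= 25);
# the window can be shifted down to 8 without adding a fourth RTT.
print(shift_optimize(segments=25, cwnd=9))
```

The payoff is a gentler initial burst at no cost in completion time, since completion time is quantized in whole RTTs.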
30
Effect of Shift Optimization
31
TCP/SPAND
Estimate network state by sharing performance information
SPAND: Shared PAssive Network Discovery [SSK97]
Directly enter congestion avoidance, starting with the optimal initial cwnd
Avoid large bursts by pacing
[Figure: Web servers and a performance server sharing information across the Internet]
32
Implementation Issues
Scope for sharing and aggregation: 24-bit heuristic; network-aware clustering [KW00]
Collecting performance information: performance reports, a new TCP option, Windmill’s approach, …
Information aggregation: sliding window average
Retrieving estimates of network state: explicit query, active push, …
Pacing: leaky-bucket-based pacing
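Leaky-bucket pacing can be sketched as computing a departure time for each segment at a fixed drain rate; this simulates the schedule with explicit timestamps, whereas a real implementation would use kernel timers:

```python
def pace_departures(arrival_times, rate):
    """Leaky bucket: segments drain at `rate` segments/sec no matter how
    bursty the arrivals are. Returns each segment's departure time."""
    interval = 1.0 / rate
    departures = []
    next_free = 0.0  # earliest time the bucket can release the next segment
    for t in arrival_times:
        depart = max(t, next_free)
        departures.append(depart)
        next_free = depart + interval
    return departures

# A burst of 4 segments arriving at once, paced out at 2 segments/sec
print(pace_departures([0.0, 0.0, 0.0, 0.0], rate=2))  # [0.0, 0.5, 1.0, 1.5]
```

This is what lets TCP/SPAND start with a large initial cwnd without dumping the whole window onto the network in one burst.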
33
Opportunity for Sharing
MSNBC: 90% of requests arrive within 5 minutes of the most recent request from the same client network (using the 24-bit heuristic)
34
Cost of Sharing
MSNBC: 15,000 – 25,000 distinct client networks in a 5-minute interval during peak hours (using the 24-bit heuristic)
35
Simulation Results
Methodology: download files in rounds
Performance metric: average completion time
TCP flavors considered:
reno-ssr: Reno with slow start restart
reno-nssr: Reno without slow start restart
newreno-ssr: NewReno with slow start restart
newreno-nssr: NewReno without slow start restart
36
Simulation Topologies
37
T1 Terrestrial WAN Link with Single Bottleneck
38
T1 Terrestrial WAN Link with Multiple Bottlenecks
39
T1 Terrestrial WAN Link with Multiple Bottlenecks and Heavy Congestion
40
TCP Friendliness (I): Against reno-ssr with a 50-ms Timer
41
TCP Friendliness (II): Against reno-ssr with a 200-ms Timer
42
Conclusions
TCP/SPAND significantly reduces latency for short data transfers:
35 – 65% compared to reno-ssr / newreno-ssr
20 – 50% compared to reno-nssr / newreno-nssr
Even higher for fatter pipes
TCP/SPAND is TCP-friendly
TCP/SPAND is incrementally deployable: server-side modification only; no modification at the client side
43
Part III: Network Layer Approach
Fast Packet Classification on Multiple Dimensions. Cornell CS Technical Report 2000-1805, July 2000. (Joint work with G. Varghese and S. Suri, in progress)
44
Motivation
Traditionally, routers forward packets based on the destination field only
Diff-serv and firewalls require layer-4 switching: forwarding packets based on multiple fields in the packet header, e.g. source IP address, destination IP address, source port, destination port, protocol, type of service (tos) …
The general packet classification problem has poor worst-case cost: given N arbitrary filters with k packet fields, either the worst-case search time is Ω((log N)^(k-1)) or the worst-case storage is O(N^k)
45
Problem Specification
Given a set of filters (or rules), where each filter specifies:
a class of packet headers based on K fields
an associated directive, which specifies how to forward a packet matching this filter
Goal: find the best matching filter for each incoming packet
A packet P matches a filter F if every field of P matches the corresponding field of F: exact match, prefix match, or range match. We assume prefix matching.
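Best-matching-filter lookup under prefix matching can be sketched as a naive linear scan over priority-ordered filters (the fast trie-based search comes later; the two-field filters and bit strings here are illustrative):

```python
def field_matches(prefix, value):
    """A prefix like '00' matches any field value whose bits start with it;
    the empty prefix '' plays the role of '*' and matches everything."""
    return value.startswith(prefix)

def best_match(filters, packet):
    """Return the directive of the first (highest-priority) filter whose
    every field-prefix matches the corresponding packet field."""
    for prefixes, directive in filters:
        if all(field_matches(p, v) for p, v in zip(prefixes, packet)):
            return directive
    return "default"

# Two-field filters (src prefix, dst prefix) -> directive, listed by priority
filters = [(("00", "10"), "deny"),
           (("0", ""), "permit")]
print(best_match(filters, ("0010", "1001")))  # first filter matches: deny
print(best_match(filters, ("0110", "0011")))  # only the second matches: permit
```

This scan costs O(NK) per packet, which is exactly why the trie-based and compressed structures in the following slides matter.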
46
Problem Specification (Cont.)
Example of a Cisco Access Control List (ACL):
1. access-list 100 deny udp 26.145.168.192 255.255.255.255 74.199.168.192 255.255.255.0 eq 2049
2. access-list 100 permit ip 74.199.191.192 255.255.0.0 74.199.168.192 255.255.0.0
3. access-list 100 permit tcp 250.197.149.202 255.0.0.0 74.199.20.76 255.0.0.0
Packet: tcp 250.19.34.34 74.23.5.12 matches filter 3
47
Backtracking Search
A trie is a binary branching tree, with each branch labeled 0 or 1
The prefix associated with a node is the concatenation of all the bits from the root to the node
[Figure: a trie storing the filters F1 = 00* and F2 = 10*]
48
Backtracking Search (Cont.)
Extend to multiple dimensions
Backtracking is a depth-first traversal of the tree that visits all the nodes satisfying the given constraints
Example: search for [00*, 0*, 0*]
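A simplified hierarchical-trie sketch of the backtracking search: each node of the field-1 trie may anchor a trie over the next field, and the search must branch into every such sub-trie it passes, which is where the backtracking cost comes from. The class and filter names are illustrative:

```python
class Trie:
    def __init__(self):
        self.children = {}    # '0'/'1' -> child Trie
        self.next_dim = None  # trie over the next field, if a prefix ends here
        self.filters = []     # filter ids whose final-field prefix ends here

def insert(root, prefixes, fid):
    """Insert a filter, given as one bit-string prefix per field."""
    node = root
    for bit in prefixes[0]:
        node = node.children.setdefault(bit, Trie())
    if len(prefixes) == 1:
        node.filters.append(fid)
    else:
        if node.next_dim is None:
            node.next_dim = Trie()
        insert(node.next_dim, prefixes[1:], fid)

def search(root, fields):
    """Depth-first backtracking search: walk the current field's bits, and at
    every node on the path recurse into its next-dimension trie if present."""
    matches, node = [], root
    for bit in fields[0] + "$":  # '$' sentinel: visit the final node too
        matches += node.filters
        if node.next_dim is not None:
            matches += search(node.next_dim, fields[1:])
        if bit == "$" or bit not in node.children:
            break
        node = node.children[bit]
    return matches

root = Trie()
insert(root, ("00", "0"), "F1")      # filter F1 = [00*, 0*]
insert(root, ("0", "01"), "F2")      # filter F2 = [0*, 01*]
print(search(root, ("001", "010")))  # both filters match this packet
```

Every sub-trie visited along the path multiplies the work, which is why compression and selective push (next slides) target exactly these repeated descents.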
49
Trie Compression Algorithm
If a path A→B satisfies the compressible property (all nodes on its left point to the same place L, and all nodes on its right point to the same place R), then we compress the entire branch into 3 edges:
A center edge with value (A→B) pointing to B
A left edge with value < (A→B) pointing to L
A right edge with value > (A→B) pointing to R
Advantages of compression: saves time & storage
50
Trading Storage for Time
Smoothly trade off storage for time via selective push:
Push down the filters with large backtracking time
Iterate until the worst-case backtracking time satisfies our requirement
[Figure: a spectrum from exponential time to exponential space]
51
Example of Selective Push
Goal: worst-case memory accesses < 12
The filter [0*, 0*, 0000*] has 12 memory accesses, so we push it down to reduce lookup time
Now the search cost of the filter [0*, 0*, 001*] becomes 12 memory accesses, so we push it down as well. Done!
52
Using Available Hardware
So far, we have focused on software techniques for packet classification
Performance can be further improved by taking advantage of limited hardware when it is available, by moving some filters (or rules) from software to hardware
Key issue: which filters to move from software to hardware?
Answer: to reduce lookup time, move the filters that incur the largest number of memory accesses in the software approach
53
Summary

Approach: Trie compression algorithm
Description: Effectively exploits redundancy in trie nodes
Performance gain: Reduces lookup time by a factor of 2 – 5; saves storage by a factor of 2.8 – 8.7

Approach: Selective push
Description: “Pushes down” the filters with large backtracking time
Performance gain: Reduces lookup time by 10 – 25% with only a marginal increase in storage

Approach: Moving filters from software to hardware
Description: Heuristics to move a small number of filters from software to hardware
Performance gain: Moving 10 – 20 rules to hardware cuts storage by 33% – 50%, or lookup time by 10% – 20%
54
Contributions
Application layer: studied the Web workload of busy Web servers; properly provisioned content distribution networks
Transport layer: optimized TCP startup performance for short Web transfers
Network layer: sped up packet classification
55
Other Work
Available at http://www.cs.cornell.edu/lqiu/papers/papers.html
Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms on the Internet. Proceedings of IEEE INFOCOM 2000, Tel-Aviv, Israel, March 2000.
On Individual and Aggregate TCP Performance. 7th International Conference on Network Protocols (ICNP'99), Toronto, Canada, October 1999.
Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment. July 2000. Submitted to INFOCOM 2001.
56
Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms
Internet telephony is subject to a variable loss rate and variable delay
Previous work has addressed the two problems separately: FEC for loss recovery; playout buffer adaptation for delay jitter compensation
57
Integrating Packet FEC into Adaptive Voice Playout Buffer Algorithms (Cont.)
Our work:
Demonstrate the interaction between the playout algorithm and FEC: the playout algorithm should depend on both FEC and on network loss conditions and network jitter
Propose several playout algorithms that provide this coupling
Demonstrate the effectiveness of the algorithms through simulations
58
On Individual and Aggregate TCP Performance
Motivation: TCP behavior under many competing TCP connections has not been sufficiently explored
Our work: use extensive simulations to investigate the individual and aggregate TCP performance of many concurrent connections
59
On Individual and Aggregate TCP Performance (Cont.)
Major findings:
When all connections have the same RTT:
Wc > 3*Conn: global synchronization
Conn < Wc < 3*Conn: local synchronization
Wc < Conn: some connections are shut off
Adding random processing time: synchronization and consistent discrimination become less pronounced
Derive a general characterization of overall throughput, goodput, and loss probability
Quantify the round-trip bias for connections with different RTTs
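The three regimes above can be expressed as a tiny classifier. This is a paraphrase of the findings, not code from the paper, and it assumes Wc is the bottleneck pipe capacity in packets and Conn the number of competing connections (the slide does not spell out the definitions):

```python
def sync_regime(wc, conn):
    """Classify the dynamics of `conn` same-RTT TCP connections sharing a
    bottleneck that can hold `wc` packets in flight."""
    if wc > 3 * conn:
        return "global synchronization"
    if wc > conn:
        return "local synchronization"
    return "some connections shut off"

print(sync_regime(wc=100, conn=10))  # global synchronization
print(sync_regime(wc=25, conn=10))   # local synchronization
print(sync_regime(wc=5, conn=10))    # some connections shut off
```

Intuitively, when the pipe holds fewer packets than there are connections, some connections cannot sustain even a one-packet window and are starved.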
60
Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment
Motivation: the IETF recommends widespread deployment of RED in routers, but most previous work studies RED in relatively homogeneous environments
Our work: investigate the interaction of RED with five types of heterogeneity
61
Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment (Cont.)
Major findings:
Mix of short and long TCP connections: short TCP connections get higher goodput with RED than with Drop Tail
Mix of TCP and UDP: bursty UDP tends to get a lower loss rate with RED than with Drop Tail
Mix of ECN and non-ECN-capable traffic: ECN-capable TCP connections get higher goodput than non-ECN-capable TCP connections
Effect of different RTTs: RED reduces the bias against long-RTT bulk transfers
Effect of two-way traffic: when the ACK path is congested, TCP gets higher goodput with RED than with Drop Tail
62
Effects of Imperfect Knowledge about Input Data
63
Effects of Imperfect Knowledge about Input Data (Cont.)
The effect of imperfect topology information:
Randomly remove from 0 up to 50% of the edges in the AS topology derived from the BGP routing tables
The greedy algorithm is insensitive to edge removal: it performs within a factor of 2.6 of optimal even when 50% of the edges are removed