
Web Cache Report

Ping-Herng Denny Lin
Fall 1998 CPSC558 Project

California State University, Fullerton

Abstract

Network congestion can be better managed when frequently accessed items are cached. Web caches are used to put web data closer to the user, speeding access to data and reducing wasted network bandwidth. There are several approaches to caching; this report presents an overview of these techniques and of the operation, use, problems, and web design issues that make web caching viable.

1. Introduction

The Internet is increasingly associated with “the Web”, because web browsers such as Netscape and Internet Explorer have become popular all-purpose interfaces to the Internet, used for web browsing, e-mail, and file transfers. This wide array of applications has led to an unprecedented increase in network traffic. So long as “the web” continues to make “hot” items available, there will be large crowds of browsing users trying to access the latest copy of, say, election results or the Starr report.

When multiple Internet users request the same item, it is intuitively apparent that network traffic and congestion can be reduced if the information were stored closer to users. If a user or a group of users accesses an item that has already been retrieved before, and the item has not changed, it is far more efficient to serve the old local copy than to make many costly attempts to retrieve it from a distant server. Enter web caching, an increasingly popular way to manage network congestion and reduce network bandwidth waste by storing popular items close to users.

2. How caches work

Memory caches have been used in computers since the 1960s, exploiting the principle of locality of reference: recently requested memory data have a high probability of being requested again in the future. Memory caches therefore store data recently requested by, and are located in close proximity to, the Central Processing Unit (CPU).

When the information requested by the CPU is in the cache (a cache hit), data traffic between the CPU and main memory is eliminated. The CPU only needs to access main memory when the requested information is not found in the cache (a cache miss). Because caches are finite temporary places to store data, rules such as the Least-Recently-Used (LRU) replacement policy are used to govern which cache item is evicted to make room for new items.
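To make the mechanism concrete, here is a minimal sketch of an LRU cache in Python (an illustration only, not taken from any of the systems surveyed in this report):

from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least-recently-used item when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]               # cache hit

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
        self.items[key] = value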

A web cache uses the same principles found in a memory cache to reduce network traffic between a web site (server) and a web browser (client). The caching entity may store newly requested items on a local hard drive and/or on a server hard drive located close to the user. It is often more efficient to have cache servers that can service requests in, say, the same country than to serve the request from a server located in a distant country. Future requests for the same item are served from the caching entity rather than from the distant server.

In contrast to memory caches, web caches do not cache all items, and there are very wide variations in the size of stored entries. Furthermore, web caches must be designed to store information updated at uneven or unpredictable intervals.


Items that are not usually cached include passwords, CGI scripts, and dynamically changing data. However, Cao, Zhang, and Beach [10] have proposed an active caching scheme in which cache applets handle dynamic data.

There are several issues a web cache designer has to consider: the cache architecture, the cache replacement policy, how cache consistency is maintained, and how client browsers use cached data. Depending on the cache architecture, cache replacement policy, and storage available on the cache, web cache hit rates ranging from 10% to 60% have been reported [1].

[Figure: bar chart of cache size (GB) and object hit rate (%) for seven ISPs using NetCache; the ISP axis labels are only partially legible in the source.]

Figure 1 Object hit rates of several web caches using NetCache. Hit rates are not directly related to cache size.

The size of disk storage required to deploy a web cache depends on traffic; Danzig suggests using a disk capable of storing one week of WAN traffic. Daily average link utilization is reported to be about a quarter of the link’s peak utilization; thus if the link’s peak utilization is 1 Mbit/s, the daily average link utilization is about 2 GB/day, or 14 GB/week. Depending on the expected bandwidth savings (for example, 25%), one could deploy a cache with a disk size of (100% - 25%) * 14 GB = 10.5 GB.

In general, Danzig suggests deploying caches that use two [4 GB] disks and 50 MB of RAM for each Mbit/s of WAN traffic [17].
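As a worked example of the arithmetic above (a sketch only; the figures are the example values from the text, and Python is used purely as a calculator):

# Rule-of-thumb cache sizing from the figures above.
peak_mbps = 1.0               # peak WAN link utilization, Mbit/s
avg_mbps = peak_mbps / 4      # daily average is about 1/4 of peak

bytes_per_day = avg_mbps * 1e6 / 8 * 86400    # ~2.7e9 bytes/day
gb_per_week = bytes_per_day * 7 / 1e9         # ~19 GB/week; the report
                                              # rounds to ~2 GB/day, 14 GB/week
savings = 0.25                                # expected bandwidth savings
disk_gb = (1 - savings) * 14                  # (100% - 25%) * 14 GB
print(disk_gb)                                # 10.5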


3. Cache architectures

There are several ways to configure a web cache. The first web caches were designed almost as an afterthought to deal with light traffic, and have been less scalable than recent designs. The necessity for scalability has led to several different solutions.

Web item caching can take place on both the client and server.

Client caches:

Popular desktop browsers such as Netscape Navigator and Internet Explorer cache recently accessed items on the local hard drive. When a user requests an item on the web that is stored in the cache, and the item has not changed (a cache hit), the browser simply provides the item stored on the local hard drive instead of incurring potentially costly network traffic.

Single Proxy Server caches:

Proxy servers were created to control Internet access from an intranet (a company’s internal network) and to prevent Internet users from accessing web or file servers inside the intranet [3].

Proxy caches service user requests by first checking whether the item the user needs is on the proxy server’s hard drive(s). If the proxy cache misses, it retrieves the information from the remote server, serves a copy to the user, and keeps (caches) a copy for itself. Future requests for the same item are served from the proxy cache server.
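In outline, the request flow looks like the following sketch (fetch_from_origin is a hypothetical helper standing in for the actual HTTP retrieval; the cache object could be the LRUCache from section 2):

def handle_request(cache, url, fetch_from_origin):
    """Serve a URL through a proxy cache (simplified sketch)."""
    item = cache.get(url)
    if item is not None:
        return item                   # hit: serve the stored copy
    item = fetch_from_origin(url)     # miss: retrieve from the remote server
    cache.put(url, item)              # keep a copy for future requests
    return item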

Single caching servers are not scalable; thus a mesh of cooperating servers, or a hierarchical caching approach, is more desirable.


Cooperative server caching:

Internet service providers like America Online (AOL) have been using cooperative proxy servers to store recently requested items in close proximity to their users.

Many proxy cache servers can be networked into a mesh to share cached data. Cooperative caches use the Internet Cache Protocol (ICP) [5, 14] to share information about each other’s contents, to balance loads between caches, and to provide resistance to cache failures.

When a cache misses, it sends an ICP query to its neighbor caches; the neighbors respond with an ICP reply indicating a hit or a miss.

Load balancing between caches is achieved by ignoring caches that take longer to respond to a UDP broadcast (round-trip-time delay); the cache that finds the item and responds to the UDP broadcast with the least delay sends its copy of the item to the user. Because client requests can be served either directly or through sibling caches, ICP is resilient to failures of sibling caches.

ICP can operate in either unicast or multicast mode. A unicast request establishes individual connections to poll servers; a multicast request polls the entire cooperative cache at once [7].
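The sibling-selection logic can be sketched as follows (a simplified simulation of the ICP exchange rather than the wire protocol; query_sibling is a hypothetical helper that returns a hit/miss reply and the observed round-trip time):

def resolve_via_icp(url, siblings, query_sibling):
    """Pick the sibling cache that answered HIT with the least delay."""
    best = None
    for sibling in siblings:
        reply, rtt = query_sibling(sibling, url)   # hit or miss, plus RTT
        if reply == "HIT" and (best is None or rtt < best[1]):
            best = (sibling, rtt)
    return best[0] if best else None   # None: go to a parent or the origin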

Hierarchical caching:

This approach connects cache servers into parent and child caches. The child (first-level) cache is polled first; if the item is not found, a UDP broadcast is sent to its sibling(s); if the sibling caches miss, higher-level caches are polled next; the parent (highest-level) cache is ultimately responsible for retrieving a fresh copy if its child caches miss.

[Figure: diagram of cache clients, a local cache, a sibling cache, a parent cache, and the Internet, showing hits resolved from the sibling, hits and misses resolved by the parent, and direct retrievals from the Internet.]

Figure 2 A two-level web cache hierarchy using ICP. The local cache can retrieve hits from sibling caches, hits and misses from parent caches, and some requests directly from origin servers.

Hierarchical caches are more efficient because they scale better and make better use of available network bandwidth.
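Putting these pieces together, the resolution order in a hierarchy like Figure 2 can be sketched as follows (hypothetical helper names; each cache object is assumed to offer get/put as in the earlier sketches):

def resolve(url, local, siblings, parent, fetch_from_origin):
    """Resolve a request through a two-level cache hierarchy (sketch)."""
    item = local.get(url)
    if item is not None:
        return item                        # local hit
    for sibling in siblings:               # siblings resolve hits only
        item = sibling.get(url)
        if item is not None:
            local.put(url, item)
            return item
    if parent is not None:                 # a parent resolves hits and misses
        item = resolve(url, parent, [], None, fetch_from_origin)
    else:
        item = fetch_from_origin(url)      # top level fetches from the origin
    local.put(url, item)
    return item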

Cache designers should avoid using more than 3 levels of caches in a caching mesh. Web caches should be located close to traffic flows, and where network bottlenecks occur. First-level web caches should be located close to one’s external Internet connection, and close to users; if there are several first-level web caches, they should be placed where networks come together. Upper-level web caches should be placed where networks join, on or close to a national Internet exchange point [4].

Partitioned caching:

The web contains documents of very different sizes, and a few large documents can displace many smaller documents stored in a finite-sized cache. When large documents replace many smaller documents in a single cache, reference locality is lost.

Murta, Almeida, and Meira [6] have proposed creating caches partitioned to store items of different sizes separately, thus preserving the reference locality of both large and small items.


Partitioned cache performance depends on the number of cache partitions, size of each partition, subrange limits of document sizes for each partition, and the replacement policy used in each partition.

Murta et al. proposed a configuration of three unequal partitions, in which 1/10 of total web cache storage was dedicated to small items (<= 2 KB), 2/10 to medium-sized items (> 2 KB and <= 6 KB), and 7/10 to large items (> 6 KB). LRU and Size were the only two replacement policies used in their simulations; the Size replacement policy rendered the best hit rate.
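The partitioning rule is simple to state in code (a sketch; the thresholds and fractions are those of Murta et al. [6], while total_bytes is an assumed configuration value):

# Partition fractions and size subranges from Murta et al. [6].
PARTITIONS = [
    ("small",  0.1, 2 * 1024),         # items <= 2 KB get 1/10 of storage
    ("medium", 0.2, 6 * 1024),         # 2 KB < items <= 6 KB get 2/10
    ("large",  0.7, float("inf")),     # items > 6 KB get 7/10
]

def partition_for(item_bytes, total_bytes):
    """Return the name and capacity of the partition an item falls in."""
    for name, fraction, upper_bound in PARTITIONS:
        if item_bytes <= upper_bound:
            return name, int(fraction * total_bytes)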

4. Cache replacement policies

The purpose of a cache replacement policy is to decide which items to evict when there is no space in the cache to store more items. Due to the wide variety of web item sizes, it is very difficult to find an optimal replacement algorithm. There are at least nine known replacement algorithms; the most popular replacement policy used by web caches is the Least-Recently-Used (LRU) policy.

Cao and Irani [12] have named some champion replacement policies, including LRU, Size, Hybrid, and Lowest-Relative-Value. In the same paper, Cao et al. propose a new algorithm, called Greedy Dual-Size (GD-Size), that combines locality, size, and latency/cost concerns.

Least-Recently-Used (LRU):

This algorithm evicts the item that was requested least recently. It has very little overhead per cached file and requires O(1) time per access. In the absence of cost and size concerns, LRU is an optimal on-line (calculated on-the-fly) algorithm.

Disadvantages: LRU considers neither size nor cost. Given a finite cache size, it can be better to keep many small least-recently-used items than a few large more-recently-used items; likewise, it can be better to keep many expensive least-recently-used items than a few inexpensive more-recently-used items.


This prompts the development of algorithms that combine locality, size, and cost considerations.

Size:

This algorithm evicts the largest item. A priority queue based on size is used to implement this policy, and handling a hit requires O(1) time while an eviction requires O(log k) time, where k is the number of cached documents.

The disadvantage of this algorithm is that it does not take into account the cost of retrieving an item. It is less desirable to evict a few expensive large items than to evict many inexpensive small items.

Hybrid:

This algorithm uses a function to compute the utility of retaining an item in the cache, and evicts the item with the smallest function value. The parameters are: cs, the time to connect to server s; bs, the bandwidth to server s; np, the number of times item p has been requested since it was brought into the cache; and zp, the size in bytes of item p. Wb and Wn are constants. The function used is:

((cs + Wb/bs) * np^Wn) / zp

Hybrid is implemented by using a priority queue, so it requires O(log k) time to find a replacement. Furthermore, it requires an array to track the average latency and bandwidth for every web server to estimate the downloading latency of a web item (page).
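Transcribed directly (a sketch; the constant values below are placeholders, since [12] tunes Wb and Wn empirically):

W_B = 8 * 1024    # placeholder constant; tuned empirically in [12]
W_N = 0.9         # placeholder constant

def hybrid_utility(c_s, b_s, n_p, z_p):
    """Utility of keeping item p from server s; evict the smallest value."""
    return (c_s + W_B / b_s) * (n_p ** W_N) / z_p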

Lowest-Relative-Value (LRV):

This algorithm takes into account the cost and size of an item to calculate the desirability of keeping it in the cache. The item that has the lowest computed desirability is evicted from the cache.


The functions used are:

V(i, t, s) = P1(s) * (1 - D(t)) * c/s,  if i = 1
V(i, t, s) = Pi * (1 - D(t)) * c/s,     otherwise

where D(t) = 0.035 * log(t + 1) + 0.45 * (1 - e^(-t / 2e6))

LRV requires O(1) storage per cached file; if the Cost is proportional to Size, replacements can be found in O(1) time; if Cost is arbitrary, then O(k) is needed to find a replacement. In practice, the cost of calculating D(t) was found to be very high because it uses log and exp.
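For reference, D(t) as written above is straightforward to transcribe, though each evaluation needs both a log and an exp (a sketch; t is interpreted here as the time since the item's last access, in seconds):

import math

def d(t):
    """LRV aging term D(t) = 0.035*log(t+1) + 0.45*(1 - e^(-t/2e6))."""
    return 0.035 * math.log(t + 1) + 0.45 * (1 - math.exp(-t / 2e6))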

Greedy Dual-Size (GD-Size):

The original Greedy Dual algorithm was proposed by Young [13], and further modified by Cao and Irani [12]. The algorithm associates with each cached page p a value H, derived from the page’s cost/size ratio:

1. L is initialized to 0.
2. If p is already in the cache, set H(p) = L + cost(p)/size(p).
3. If p is not in the cache:
   3.1 While there is not enough room for p:
       3.1.1 Set L = the minimum of H(q) over all cached items q.
       3.1.2 Evict the least valuable item q, i.e. one with H(q) = L.
   3.2 Bring p into the cache.
   3.3 Set H(p) = L + cost(p)/size(p).

GD-Size can be implemented in a priority queue of documents based on the H value. Handling a hit and eviction both require only O(log k) time.
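A runnable sketch of GD-Size along these lines (an illustration, not the authors' code; cost is whatever retrieval-cost measure the deployment chooses, and stale heap entries are skipped lazily):

import heapq

class GDSizeCache:
    """Greedy Dual-Size: evict the page with the lowest H = L + cost/size."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0
        self.h = {}       # url -> current H value
        self.size = {}    # url -> size in bytes
        self.heap = []    # (H, url) entries; some may be stale

    def access(self, url, cost, size_bytes):
        if url in self.h:                              # hit: refresh H
            self.h[url] = self.L + cost / size_bytes
            heapq.heappush(self.heap, (self.h[url], url))
            return
        if size_bytes > self.capacity:
            return                                     # too large to cache
        while self.used + size_bytes > self.capacity:  # make room for p
            h_val, victim = heapq.heappop(self.heap)
            if victim in self.h and self.h[victim] == h_val:
                self.L = h_val                         # L rises to evicted H
                self.used -= self.size[victim]
                del self.h[victim]
                del self.size[victim]
        self.h[url] = self.L + cost / size_bytes       # bring p into cache
        self.size[url] = size_bytes
        self.used += size_bytes
        heapq.heappush(self.heap, (self.h[url], url))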

5. Maintaining cache consistency

The purpose of maintaining cache consistency is to avoid serving an outdated (cached) copy of an item to users.

In an ideal cache, there are never inconsistencies between the cached copy and the server’s original copy. In the real world, items kept in a cache may not be consistent with the original copy, especially if the original copy changes faster than the refresh frequency of the cache.

There are several approaches to maintaining cache consistency: Time-To-Live (TTL), Polling-Every-Time, and Invalidation [9]. The most widely used approaches are Time-To-Live and Adaptive Time-To-Live. Browsers such as Netscape allow users to check the cached document against the server copy upon every request, which is in effect a Polling-Every-Time approach.

Cao and Liu [9] have classified cache consistency maintenance models as either weak or strong. A weak cache consistency model may serve a stale copy of a requested item, while a strong cache consistency model does not return a stale copy of a modified document after completion of a write.

Time-To-Live (TTL):

Every Uniform Resource Locator (URL) document has an estimate of how long the document is expected to remain unchanged. This information is used by the cache manager to decide whether its copy should be refreshed from the source. Clients can also send an “if-modified-since” request to the server. If the document has been modified since the time specified in the request, a status code of “200” and the new data are sent to the client; otherwise, the server sends a code “304” and no data.
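Such a validation request uses standard HTTP headers; for example (a sketch using Python's urllib, with a placeholder URL and date):

import urllib.request, urllib.error

req = urllib.request.Request(
    "http://example.com/page.html",   # placeholder URL
    headers={"If-Modified-Since": "Wed, 09 Dec 1998 00:00:00 GMT"},
)
try:
    with urllib.request.urlopen(req) as resp:   # 200: changed, new data sent
        body = resp.read()
except urllib.error.HTTPError as err:
    if err.code == 304:                         # 304: serve the cached copy
        body = None
    else:
        raise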

The disadvantage of this approach is the difficulty of assigning an accurate TTL to every document. The server can be burdened with too many “if-modified-since” requests if the TTL value is too small; the user can be served a stale document if the TTL value is too large.

Adaptive Time-To-Live:

This approach has the cache manager adjust the document’s Time-To-Live as a percentage of the document’s current “age”, which is the current time minus the last-modified time of the document [9]. Adaptive TTL can keep the probability of stale documents below 5%, and is known to be the best of the “weak consistency” protocols.
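The heuristic amounts to a single expression (a sketch; the fraction k = 0.1 is an assumed value, since the text above does not fix one):

import time

def adaptive_ttl(last_modified, k=0.1):
    """TTL as a fraction k of the document's current age, in seconds."""
    age = time.time() - last_modified
    return k * age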

Polling-Every-Time:

Every time a cache hit occurs, the cache sends an “if-modified-since” request to see if the document in the cache is up-to-date.

This approach guarantees strong cache consistency, and can be easily implemented through the existing HTTP protocol.

The disadvantage is that more network messages are generated (as much as 50% more [9]), and the user has to wait for the cache to validate every access to a document that has already been cached.

Invalidation:

The web server sends out a notification (invalidation message) to caches whenever an item has been modified; upon receipt of the invalidation message, the cache deletes the item but does not retrieve a new copy.

The advantage of invalidation is that strong cache consistency is achieved with overhead similar to Adaptive TTL, reduced network transactions, and improved cache utilization by deleting stale copies.

The disadvantage of this approach is that the existing HTTP protocol does not include invalidation messages. As of this writing, a protocol that supports invalidation, called the Hyper Text Caching Protocol (HTCP) [18], is being drafted and developed.

6. Web design issues

A web caching system should be totally transparent to its users. However, the possibility of serving stale documents could prompt web designers to encourage users to reload pages, which defeats the purpose of caching and may place an undue burden on the web server.


Dynamic server pages are frequently used to produce web content tailor-made to suit a user’s web browser type; these pages are usually not cached. Web pages that contain frequently changing information (stock market quotes, news sites, etc.), cookies, and Common Gateway Interface (CGI) output may never be cached [15].

Web caches depend on correct time information to determine whether a page has been modified since a particular date and whether it should be refreshed to maintain cache consistency [16]. If the server hosting web pages has an incorrect date or time setting, web caches may pre-expire “artificially old” documents (defeating the benefits of caching), or may never refresh an “artificially new” updated document (serving stale documents).

Due to security concerns, web caches and browsers do not cache web documents served from a secure server. Web designers should be aware that graphics contained in web documents from these sources are not cached [16].

The average size of documents on a web site should be kept as small as is feasible. It is better for caches to store many small URLs than a few large items. Some caches assume an average item size of 8 KB per URL [17]; exceeding this average wastes cache storage.

7. Example caches

The following is a list of some popular web cache server software, followed by countries and institutions using web caches.

CERN proxy/server (1996)

This program was the first public-domain web server and was widely used for setting up web sites. Later, when it received caching capabilities, many sites chose to use it as a simple local web cache.


The cache manager uses TTL and adaptive TTL to determine the staleness of an item. Since the cache capability is part of the web server, a new child process is created to handle each request, resulting in considerable load on the cache machine. While cache hierarchies can be configured, the server is not resilient to the failure of a parent cache. Cache filenames are derived from the URL, which results in slow reads from a hard drive busy retrieving cached items from a deep directory structure. For these reasons, the CERN server is not suitable for high-demand sites [1].

Netscape Proxy Server (1995)

This commercial program is a dedicated web cache and proxy server that can be arranged in a hierarchical organization. The cache is resilient to the failure of other cache servers, but is not capable of discovering other caches’ contents or of performing load-balancing tasks between caches.

As a proxy server, it is also capable of filtering Internet and intranet traffic on the basis of document types, browser signatures, and URL patterns [1].

Harvest (1995)

This caching software introduced ICP for sharing information and balancing load between cooperating caches, and is resilient to failures in other caches. The latest versions of Harvest run parallel I/O threads to handle peak loads.

The software can also be used as a proxy server to block access to undesirable sites by URL, keyword, or the Webtrack rating system [1].

Squid (1996)

The source code for Squid is freely available, and the project is developed by volunteers. Squid is based on early versions of Harvest and, like its predecessor, uses ICP to cooperate with other caches and eliminate duplication of cache contents.


Squid is frequently used in conjunction with the highly modular Apache web server.

NetCache (1997)

This proxy caching software resulted from work done on the Harvest project. NetCache uses a memory resident hash table to cache URLs, so determining a cache hit or miss does not require disk operations. Web items are stored in a Write-Anywhere-File-Layout (WAFL) file system, which is a write-optimized, log-structured, RAID-aware file system.

NetCache can run up to 8,000 client, server, and file state machines concurrently. To share workload and interchange information about caches, it can use either ICP or NetCache Clusters. NetCache Clusters eliminate ICP network round-trip delays by using persistent TCP connections between caches.

NetCache claims to be capable of serving 100-200 URLs/second (compared to 25-50 URLs/second on Squid) [17].

Countries and institutions using cache servers:

National caches have been set up to deal with congestion on slow international links. Norway’s Uninett and France’s CNRS use Squid caching software and achieve hit rates between 20% and 30%. The United Kingdom’s HENSA cache uses Netscape Caching Proxy to achieve a hit rate between 55% and 60%. Singapore Telecom’s SingNet uses Harvest 2.0, which has achieved a hit rate of 53% [1].

8. Conclusion

In order to keep up with exponentially increasing web traffic, scalable web caching solutions must be used to help manage and reduce network congestion.

http://www.lasierra.edu/~dlin/classes/cpsc558 14 12/9/98 Version 1.1

Page 15: CPSC558 Project Proposal - La Sierra Universityfaculty.lasierra.edu/~dlin/classes/cpsc558/project.doc · Web viewWeb Cache Report Ping-Herng Denny Lin Fall 1998 CPSC558 Project California

This report gave an overview of cache architectures, replacement policies, and consistency maintenance techniques in use.

Further research could be done in simulating or implementing a cache using the best of cache architectures, replacement policies, and consistency maintenance techniques mentioned here.

9. References

1. A. Cormack. “Web Caching”, September 1996. http://www.jisc.ac.uk/acn/caching.html

2. M. Kurcewicz, W. Sylwestrzak, A. Wierzbicki. “A Distributed WWW Cache”. http://wwwcache.ja.net/events/workshop/09/paper.html

3. A. Pullin. “Internet Proxies”. http://www.newi.ac.uk/pullina/proxy/

4. I. Melve. “Web caching architecture”, March 1997. http://www.uninett.no/prosjekt/desire/arneberg/altsammen.html

5. D. Wessels, K. Claffy. “Internet Cache Protocol (ICP), version 2”, RFC 2186. http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2186.txt

6. C. Duarte Murta, V. A. F. Almeida, W. Meira Jr. “Analyzing Performance of Partitioned Caches for the WWW”. http://wwwcache.ja.net/events/workshop/24/

7. G. Neisser, J. Heaton. “Co-operative Caching”. http://www.g-ming.net.uk/Documents/JENC8/

8. A. Chankhunthod, P. B. Danzig, C. Neerdaels, M. F. Schwartz, K. J. Worrell. “A Hierarchical Internet Object Cache”, November 1995. http://excalibur.usc.edu/cache-html/cache.html

9. P. Cao, C. Liu. “Maintaining Strong Cache Consistency in the World-Wide Web”, April 1998. http://www.cs.wisc.edu/~cao/papers/icache.ps

10. P. Cao, J. Zhang, K. Beach. “Active Cache: Caching Dynamic Contents on the Web”. http://www.cs.wisc.edu/~cao/papers/active-cache.html

11. L. Breslau, P. Cao, L. Fan, G. Phillips, S. Shenker. “Web Caching and Zipf-like Distributions: Evidence and Implications”. http://www.cs.wisc.edu/~cao/papers/zipf-like.ps.gz

12. P. Cao, S. Irani. “Cost-Aware WWW Proxy Caching Algorithms”. http://www.cs.wisc.edu/~cao/papers/gd-size.ps.Z

13. N. Young. “The k-server dual and loose competitiveness for paging”. Algorithmica, June 1994.

14. D. Wessels, K. Claffy. “Application of Internet Cache Protocol (ICP), version 2”, RFC 2187. http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2187.txt

15. M. A. Paraz. “Cache Breakers”. http://home.iphil.net/~map/cache/breakers.html

16. The Cache Now! campaign - details. http://vancouver-webpages.com/CacheNow/detail.html#vrml

17. P. Danzig. “NetCache Architecture and Deployment”. http://wwwcache.ja.net/events/workshop/01/NetCache-3_2.pdf

18. P. Vixie, D. Wessels. “Hyper Text Caching Protocol (HTCP/0.0)”, Internet draft. http://info.internet.isi.edu/in-drafts/files/draft-vixie-htcp-proto-03.txt