Hot Systems, 18.12.2000

Volkmar Uhlig, volkmar@ira.uka.de

On the scale and performance of cooperative Web proxy caching

Alec Wolman, Geoffrey M. Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, and Henry M. Levy

University of Washington

(SOSP ’99, Kiawah Island, SC)

Outline

- Concepts of cooperative web caches
- Cache simulation
- Request analysis
- UW + Microsoft
- Conclusion

Web Proxy Caches

[Figure: clients reach the Internet through a proxy cache; a request for http://l4ka.org/ is shown once as a miss and once as a hit.]
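The hit and miss paths of a proxy cache can be sketched in a few lines of Python. This is only an illustration, not how any particular proxy is implemented; the class `ProxyCache` and the callback `fetch_from_origin` are hypothetical names, and the origin server is faked.

```python
from typing import Callable, Dict, Tuple


class ProxyCache:
    """Toy proxy cache: answers from memory on a hit, fetches on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin-fetch callback
        self._store: Dict[str, bytes] = {}   # URL -> cached body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> Tuple[str, bytes]:
        if url in self._store:
            # Hit: serve the stored copy without contacting the origin.
            self.hits += 1
            return "HIT", self._store[url]
        # Miss: fetch from the origin and keep a copy for later requests.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return "MISS", body


def fake_origin(url: str) -> bytes:
    # Stand-in for a real HTTP fetch (assumption for the example).
    return ("<html>" + url + "</html>").encode()


cache = ProxyCache(fake_origin)
first = cache.get("http://l4ka.org/")    # first request goes to the origin
second = cache.get("http://l4ka.org/")   # second request is served locally
```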

Rationale for Caches

- Reduce download time
- Improve responsiveness
- Reduce internet bandwidth usage
  - Save money

Idea: Cooperative Caches

Overall hit rate?


Hierarchical Caching


Neighborhood Caches


Hash based Caching
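One way to picture hash-based caching: every client maps a URL to exactly one cache by hashing it, so each object lives in a single place. The sketch below is a minimal illustration under that assumption; the function `pick_cache` and the cache names are hypothetical, and the slide does not say which hash scheme is meant.

```python
import hashlib
from typing import List


def pick_cache(url: str, caches: List[str]) -> str:
    """Map a URL to one cache by hashing it (plain modulo hashing)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(caches)
    return caches[index]


caches = ["cache-a", "cache-b", "cache-c"]   # hypothetical cache names
chosen = pick_cache("http://l4ka.org/", caches)
```

Note that plain modulo hashing reassigns most URLs when the number of caches changes.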

Related Work – Proxies

- V. Almeida, A. Bestavros, M. Crovella, and A. de-Oliveira. Characterizing reference locality in the WWW. Technical Report 96-011, Boston University, June 1996.
- L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In Proc. of IEEE INFOCOM ’99, pages 126–134, March 1999.
- R. Caceres, F. Douglis, A. Feldmann, G. Glass, and M. Rabinovich. Web proxy caching: The devil is in the details. In Workshop on Internet Server Performance, pages 111–118, June 1998.
- P. Cao. Characterization of Web proxy traffic and Wisconsin proxy benchmark 2.0. http://www.cs.wisc.edu/~cao/w3c-webchar-position, Nov. 1998.
- M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: Evidence and possible causes. In Proc. of the ACM SIGMETRICS ’96 Conf., pages 160–169, May 1996.
- F. Douglis, A. Feldmann, B. Krishnamurthy, and J. Mogul. Rate of change and other metrics: a live study of the World Wide Web. In Proc. of the 1st USENIX Symp. on Internet Technologies and Systems, pages 147–158, Dec. 1997.
- B. Duska, D. Marwood, and M. J. Feeley. The measured access characteristics of World Wide Web client proxy caches. In Proc. of the 1st USENIX Symp. on Internet Technologies and Systems, pages 23–36, Dec. 1997.
- A. Feldmann, R. Caceres, F. Douglis, G. Glass, and M. Rabinovich. Performance of web proxy caching in heterogeneous bandwidth environments. In Proc. of IEEE INFOCOM ’99, March 1999.
- S. D. Gribble and E. A. Brewer. System design issues for Internet middleware services: Deductions from a large client trace. In Proc. of the 1st USENIX Symp. on Internet Technologies and Systems, pages 207–218, Dec. 1997.
- T. M. Kroeger, D. D. E. Long, and J. C. Mogul. Exploring the bounds of Web latency reduction from caching and prefetching. In Proc. of the 1st USENIX Symp. on Internet Technologies and Systems, pages 13–22, Dec. 1997.
- M. Rabinovich, J. Chase, and S. Gadde. Not all hits are created equal: Cooperative proxy caching over a wide area network. In Proc. of the 3rd Int. WWW Caching Workshop, June 1998.

Related Work – Locality

- V. Almeida, A. Bestavros, M. Crovella, and A. de-Oliveira. Characterizing reference locality in the WWW. Technical Report 96-011, Boston University, June 1996.
- L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. In Proc. of IEEE INFOCOM ’99, pages 126–134, March 1999.
- P. Cao and S. Irani. Cost-aware WWW proxy caching algorithms. In Proc. of the 1st USENIX Symp. on Internet Technologies and Systems, pages 193–206, Dec. 1997.
- C. R. Cunha, A. Bestavros, and M. E. Crovella. Characteristics of WWW client-based traces. Technical Report BU-CS-95-010, Boston University, July 1995.
- S. Glassman. A caching relay for the World Wide Web. In Proc. First Int. World Wide Web Conf., pages 60–76, May 1994.
- T. M. Kroeger, J. C. Mogul, and C. Maltzahn. Digital’s Web proxy traces. ftp://ftp.digital.com/pub/DEC/traces/proxy/webtraces.html, August 1996.

Scope of the paper

- What is the best performance one could achieve with “perfect” caching?
- For what range of client populations can cooperative caching work effectively?
- Does the way in which clients are assigned to caches matter?
- What cache hit rates are necessary to achieve worthwhile decreases in document access latency?

Cache Simulations – How?

- Collect traces (e.g. with a packet sniffer)
- Model cache behavior
- Play traces against the cache model
- Analyze the results
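The four steps above can be sketched as a tiny trace-driven simulation. This is an illustrative sketch, not the simulator used in the paper; the trace format (a plain list of URLs) and the function name `replay` are assumptions made for the example.

```python
from typing import Iterable, Set


def replay(trace: Iterable[str]) -> float:
    """Play a trace of requested URLs against a simple cache model.

    The model is the set of URLs seen so far: a request is a hit if its
    URL is already in the set. Returns the fraction of requests that hit.
    """
    cache: Set[str] = set()
    hits = 0
    total = 0
    for url in trace:
        total += 1
        if url in cache:
            hits += 1
        else:
            cache.add(url)
    return hits / total if total else 0.0


# Hypothetical five-request trace: /a is requested three times.
trace = ["/a", "/b", "/a", "/c", "/a"]
hit_rate = replay(trace)   # 2 hits out of 5 requests
```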

Cache Traces

977131631.070 11 1.2.3.52 TCP_MISS/200 1465 GET http://i30www.ira.uka.de/ - DIRECT/i30www.ira.uka.de text/html
977131631.369 13 1.2.3.52 TCP_MISS/200 3488 GET http://i30www.ira.uka.de/header.shtml - DIRECT/i30www.ira.uka.de text/html
977131631.379 30 1.2.3.52 TCP_MISS/200 11585 GET http://i30www.ira.uka.de/main.html - DIRECT/i30www.ira.uka.de text/html
977131631.663 67 1.2.3.52 TCP_REFRESH_HIT/200 1898 GET http://i30www.ira.uka.de/sysarch_header.css - DIRECT/i30www.ira.uka.de text/css
977131631.665 10 1.2.3.52 TCP_REFRESH_HIT/200 2119 GET http://i30www.ira.uka.de/sysarch3.css - DIRECT/i30www.ira.uka.de text/css
977131631.862 64 1.2.3.52 TCP_REFRESH_HIT/200 3215 GET http://i30www.ira.uka.de/images/bg_lgrey.jpg - DIRECT/i30www.ira.uka.de image/jpeg
977131631.867 31 1.2.3.52 TCP_REFRESH_HIT/200 11755 GET http://i30www.ira.uka.de/images/infblg.jpg - DIRECT/i30www.ira.uka.de image/jpeg
977131632.257 19 1.2.3.52 TCP_REFRESH_HIT/200 2569 GET http://i30www.ira.uka.de/images/sag.gif - DIRECT/i30www.ira.uka.de image/gif
977131632.393 45 1.2.3.52 TCP_REFRESH_HIT/200 3016 GET http://i30www.ira.uka.de/images/bg_white.jpg - DIRECT/i30www.ira.uka.de image/jpeg
977131637.860 542 1.2.3.52 TCP_CLIENT_REFRESH_MISS/200 445 GET http://www.aftenposten.no/grafikk/pixel-blank.gif - DIRECT/www.aftenposten.no image/gif
977131637.980 693 1.2.3.52 TCP_CLIENT_REFRESH_MISS/200 4271 GET http://www.aftenposten.no/grafikk/finn_samtlige.gif - DIRECT/www.aftenposten.no image/gif
977131638.146 309 1.2.3.52 TCP_CLIENT_REFRESH_MISS/200 2295 GET http://aftenposten.no/grafikk/aftenpostenhode1.gif - DIRECT/aftenposten.no image/gif
977133332.271 13 1.2.3.52 TCP_MEM_HIT/200 446 GET http://ad.no.doubleclick.net/ad/www.aftenposten.no/Innenriks;sz=468x60;ord= - NONE/- image/gif

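Fields of the first record: 977131631.070 (timestamp), 11 ms (elapsed time), 1.2.3.52 (client), TCP_MISS (cache result), 1465 (bytes), GET <URL> (request), DIRECT/i30www.ira.uka.de (source), text/html (content type)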
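A record like the ones on this slide can be split into its fields with a few lines of Python. The sketch assumes the ten whitespace-separated fields visible in the records above; `LogEntry` and `parse_line` are hypothetical names, and the field labels are an interpretation of the slide.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: float    # seconds since the Unix epoch
    elapsed: int        # elapsed time as logged
    client: str         # client address
    result: str         # cache result, e.g. TCP_MISS
    status: int         # HTTP status code
    size: int           # bytes delivered
    method: str         # HTTP method
    url: str
    content_type: str


def parse_line(line: str) -> LogEntry:
    # Layout assumed from the slide:
    # time elapsed client result/status bytes method URL ident peer/host type
    (ts, elapsed, client, result_status, size,
     method, url, _ident, _peer, ctype) = line.split()
    result, status = result_status.split("/")
    return LogEntry(float(ts), int(elapsed), client, result,
                    int(status), int(size), method, url, ctype)


entry = parse_line(
    "977131631.070 11 1.2.3.52 TCP_MISS/200 1465 GET "
    "http://i30www.ira.uka.de/ - DIRECT/i30www.ira.uka.de text/html"
)
```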

Simulation Methodology

- Infinite-sized caches
- No expiration for objects
- No compulsory misses (cold start)
- Ideal vs. practical cache (cacheability)
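The difference between the ideal and the practical cache can be illustrated with an infinite, never-expiring cache that either stores everything or only objects marked cacheable. This is a sketch under assumptions: the trace format with a per-request `cacheable` flag and the function `hit_rate` are hypothetical, and the handling of cold-start misses is left out.

```python
from typing import Iterable, Set, Tuple

Request = Tuple[str, bool]   # (URL, cacheable flag), hypothetical trace format


def hit_rate(trace: Iterable[Request], ideal: bool) -> float:
    """Infinite cache with no expiration.

    ideal=True : every object may be stored.
    ideal=False: only objects flagged cacheable may be stored.
    """
    cache: Set[str] = set()
    hits = total = 0
    for url, cacheable in trace:
        total += 1
        if url in cache:
            hits += 1
        elif ideal or cacheable:
            cache.add(url)
    return hits / total if total else 0.0


# /a is cacheable, /b is not; each is requested twice.
trace = [("/a", True), ("/b", False), ("/a", True), ("/b", False)]
ideal_rate = hit_rate(trace, ideal=True)        # both repeats hit
practical_rate = hit_rate(trace, ideal=False)   # only the repeat of /a hits
```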

Simulation of Cooperative Caching

- Optimistic simulation model:
  - Working set of all combined caches
  - No inter-proxy communication latency
- One HUGE cache server
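Treating all cooperating proxies as one huge cache can be compared against isolated proxies with a small sketch. The trace format (proxy name, URL) and both function names are assumptions for illustration; the proxy names are placeholders.

```python
from typing import Dict, Iterable, Set, Tuple

Request = Tuple[str, str]   # (proxy name, URL), hypothetical trace format


def isolated_hit_rate(trace: Iterable[Request]) -> float:
    """Each proxy keeps its own cache and never asks the others."""
    caches: Dict[str, Set[str]] = {}
    hits = total = 0
    for proxy, url in trace:
        total += 1
        seen = caches.setdefault(proxy, set())
        if url in seen:
            hits += 1
        else:
            seen.add(url)
    return hits / total if total else 0.0


def cooperative_hit_rate(trace: Iterable[Request]) -> float:
    """Optimistic model: all proxies share one cache at no extra cost."""
    shared: Set[str] = set()
    hits = total = 0
    for _proxy, url in trace:
        total += 1
        if url in shared:
            hits += 1
        else:
            shared.add(url)
    return hits / total if total else 0.0


trace = [("uw", "/a"), ("ms", "/a"), ("uw", "/b"), ("ms", "/b"), ("uw", "/a")]
isolated = isolated_hit_rate(trace)        # 1 hit out of 5
cooperative = cooperative_hit_rate(trace)  # 3 hits out of 5
```

Because the shared model ignores inter-proxy latency, its hit rate is an upper bound on what real cooperation among these proxies could achieve.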

Collect Traces

- Microsoft
- University of Washington

Both traces cover the same period of time.

University of Washington

- 82.8 million HTTP requests
- 18.4 million HTTP objects
- 677 GB total requested bytes
- 137 requests/second
- 22,984 clients
- 244,211 servers
- 7 days

Microsoft Corporation

- 107.7 million HTTP requests
- 15.3 million HTTP objects
- total requested bytes not available
- 199 requests/second
- 60,233 clients
- 306,586 servers
- 6 days 6 hours
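The request rates quoted for the two traces follow from the request counts and trace durations, as this short check shows. All numbers come from the two trace slides; the variable names are just for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# University of Washington: 82.8 million requests in 7 days.
uw_rate = 82.8e6 / (7 * SECONDS_PER_DAY)

# Microsoft: 107.7 million requests in 6 days 6 hours.
ms_rate = 107.7e6 / (6 * SECONDS_PER_DAY + 6 * 60 * 60)

print(round(uw_rate), round(ms_rate))   # mean requests per second
```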

Experiment Analysis

- Hit rate (object, byte)
- Request latency
- Bandwidth
- Locality
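Object and byte hit rates can differ noticeably, because a hit on a small object counts as much as a hit on a large one in the first metric but not in the second. The sketch below shows both on a made-up trace; the trace format (URL, size in bytes) and the function `hit_rates` are assumptions.

```python
from typing import Iterable, Set, Tuple


def hit_rates(trace: Iterable[Tuple[str, int]]) -> Tuple[float, float]:
    """Return (object hit rate, byte hit rate) for an infinite cache."""
    cache: Set[str] = set()
    hits = requests = hit_bytes = total_bytes = 0
    for url, size in trace:
        requests += 1
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
        else:
            cache.add(url)
    object_rate = hits / requests if requests else 0.0
    byte_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return object_rate, byte_rate


# Made-up trace: a small image requested three times, a large file once.
trace = [("/small.gif", 100), ("/big.zip", 900),
         ("/small.gif", 100), ("/small.gif", 100)]
object_rate, byte_rate = hit_rates(trace)
```

Here half of the requests hit, but only 200 of the 1200 requested bytes are served from the cache.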

Request Hit-Rate / # Clients

Growing a cache beyond about 2,500 clients does not increase the hit rate significantly!


Byte Hit-Rate / # Clients (UW)

Object Request Latency

More clients do not reduce object latency significantly.

Bandwidth / # Clients

There is no relation between the number of clients and bandwidth utilization!

Locality: Proxies and Organizations

- University of Washington
  - Museum of Art and Natural History
  - Music Department
  - Schools of Nursing and Dentistry
  - Scandinavian Languages
  - Computer Science
- comparable to cooperating businesses


Local and Global Proxy Hit rates

Randomly populated vs. UW organizations

Locality is minimal (about 4%)


Impact of larger populations

Large-scale Experiment

- University of Washington: 23K clients
- Microsoft: 60K clients

Cooperative Caching: Microsoft + UW

Further Aspects

- Analytic model of Web accesses
  - Popularity
  - Expiration of documents
  - Rate of change

Summary and Conclusions

- Cooperative caching is effective for small populations (< 2,500 clients)
- Such a population can be handled by a single server
- Locality is not significant
- Limitations are due to cacheability

Further research should focus on improving cacheability!