Web Caching Schemes 1
A Survey of Web Caching Schemes for the Internet
Jia Wang

Agenda
The World Wide Web
Problem and solution (caching)
Proxy servers
Advantages of web caching
Disadvantages of web caching
Elements of a WWW caching system
Desirable properties of a WWW caching system
Problems in designing caching systems for the WWW
Caching architecture

The World Wide Web
The WWW can be considered a large distributed information system.
It is growing exponentially in size.
In May 1999 it included 600 million static web pages.
Its size increases by 15% per month.
It is very popular.

Figure: Size of distinct static web pages at Jun-97, Nov-97, Mar-98 and May-99 (vertical axis from 0 to 600).

The World Wide Web
Usage is relatively inexpensive.
Accessing information is very fast.
Documents appeal to a wide range of interests.
But...

The World Wide Web
Network congestion
Server overloading

Problem
Internet backbone capacity increases by 60% per year.
Bandwidth is not growing fast enough.
Without a solution, the WWW will become too congested and its entire appeal will be lost.
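As a rough illustration of the gap, the two growth figures quoted in these slides can be compounded to a common yearly basis. The sketch below assumes that the 15% monthly growth is sustained for a full year and that page count is a usable stand-in for demand; the constant and function names are made up for the example.

```python
# Growth figures quoted in the slides.
WEB_MONTHLY_GROWTH = 0.15      # growth in the number of static web pages, per month
BACKBONE_YEARLY_GROWTH = 0.60  # growth in Internet backbone capacity, per year


def yearly_factor_from_monthly(monthly_rate: float) -> float:
    """Compound a monthly growth rate over 12 months."""
    return (1.0 + monthly_rate) ** 12


# Assumption: the monthly rate holds for a whole year.
web_factor = yearly_factor_from_monthly(WEB_MONTHLY_GROWTH)
backbone_factor = 1.0 + BACKBONE_YEARLY_GROWTH

print(f"Web size grows roughly {web_factor:.2f}x per year")
print(f"Backbone capacity grows {backbone_factor:.2f}x per year")
```

Under these assumptions the Web grows by more than a factor of five per year while backbone capacity grows by a factor of 1.6, which is the mismatch this slide points at.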

Solution
Caching: placing popular objects at locations close to the clients.

Proxy servers
HTTP servers run by companies for security reasons.
They are the bottleneck of the connection between the clients and the Internet.
They are shared by all clients inside the firewall.

Figure: Clients within the firewall reach several HTTP servers through a proxy on the firewall machine.

Proxy servers
Belonging to the same organization, clients share common interests.
They probably access the same set of documents.

Thus, on the proxy server, previously requested and cached documents are likely to result in future hits.

Proxy servers
Caching the most popular web pages on the proxy server can:
Save network bandwidth
Lower access latency for the client

Advantages of web caching
Reduces bandwidth consumption
Decreases network traffic
Lessens network congestion
Reduces access latency: frequently used documents are cached nearby, and lower traffic means shorter delays even for documents that are not cached.

Advantages of web caching (cont.)
Reduces the workload of the remote server.
Data can be accessed when the remote server is down (enhanced robustness).
Allows analysis of an organization's usage patterns.
Cooperation between caches increases efficiency.

Disadvantages of web caching
Data is not updated automatically.
A cache miss can increase latency (extra proxy processing).
Bottleneck effect: the number of clients per proxy must be limited.
A single proxy is a single point of failure.
Information providers cannot monitor the number of visits per site.

Elements of a WWW caching system
Documents can be cached at the clients, the proxies and the servers.
Figure: Groups of clients connect to proxy servers, which cooperate with each other and connect to the web servers.
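
Elements of a WWW caching system
Request flow:
The client sends a request to its proxy.
Does the proxy have the requested page? If yes, the page is served from the proxy.
If no: do the cooperating proxies have the web page? If yes, the page is fetched from one of them.
If no: the web page is retrieved from the server.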
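The request flow on this slide can be sketched as a small lookup routine. This is only an illustration: the `Proxy` class, its method names and the `fetch_from_origin` callback are hypothetical, and storing the page locally after a peer or server fetch is an assumption that the flowchart does not state.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Proxy:
    """A toy caching proxy that keeps pages in a dictionary keyed by URL."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}
        self.peers: List["Proxy"] = []  # cooperating proxies

    def lookup(self, url: str) -> Optional[str]:
        """Return the cached page, or None on a miss."""
        return self.cache.get(url)

    def handle_request(
        self, url: str, fetch_from_origin: Callable[[str], str]
    ) -> Tuple[str, str]:
        """Return (page, source) following the three steps of the flowchart."""
        # Step 1: does this proxy have the requested page?
        page = self.lookup(url)
        if page is not None:
            return page, "local"

        # Step 2: do the cooperating proxies have the page?
        for peer in self.peers:
            page = peer.lookup(url)
            if page is not None:
                self.cache[url] = page  # assumption: keep a local copy
                return page, f"peer:{peer.name}"

        # Step 3: find the page on the web server.
        page = fetch_from_origin(url)
        self.cache[url] = page  # assumption: keep a local copy
        return page, "origin"
```

A second request for the same URL is then answered from the local cache, which is the "future hit" the earlier proxy slides describe.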
Desirable properties of WWW caching system
fast access
robustness
transparency
scalability
efficiency
adaptivity
stability
load balance
ability to deal with heterogeneity
simplicity

Fast access
Reduce web access latency to a minimum, especially compared with servers that do not use caching techniques.

Robustness
Robustness = availability to the user
Eliminate single points of failure
In case of failure, fail gracefully
Easy to recover from failure

Transparency
Transparent to the user
The user should only notice:
Faster response
Higher availability

Scalability
Scale well with the increasing size and density of the network.
All protocols should be as lightweight as possible.

Efficiency
Impose minimal additional burden on the network (in control and data packets).
Do not adopt any scheme that leads to under-utilization of the network.

Adaptivity
Adapt to dynamic changes in user demand and in the network environment.
Achieve optimal performance.

Stability
Do not introduce instabilities into the network.

Load balancing
Distribute load evenly through the entire network.
No bottlenecks or hot spots.

Ability to deal with heterogeneity
Adapt to a range of network architectures (hardware and software).

Simplicity
The mechanism should be simple to deploy.
Simpler schemes are easier to implement and more likely to be accepted as international standards.

What problems do we face in designing caching systems for the WWW?

Problems in designing caching systems for the WWW
Caching system architecture: how cache proxies are organized (hierarchically, distributed or hybrid).

Problems in designing caching systems for the WWW
Proxy placement: where to place a cache proxy in order to optimize performance.

Problems in designing caching systems for the WWW
Caching contents: what can be cached in the caching system.

Problems in designing caching systems for the WWW
Proxy cooperation: how proxies cooperate with each other.

Problems in designing caching systems for the WWW
Data sharing: what kind of data or information can be shared among cooperating proxies.

Problems in designing caching systems for the WWW
Cache resolution/routing: how a proxy decides where to fetch a page requested by a client.

Problems in designing caching systems for the WWW
Prefetching: how a proxy decides what and when to prefetch from web servers or other proxies to reduce access latency.

Problems in designing caching systems for the WWW
Cache placement/replacement: how a proxy decides which pages to store in its cache and which pages to remove from it.
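To make the replacement question concrete, here is a minimal sketch of one common policy, least recently used (LRU). The slides do not prescribe a particular policy; LRU is used only as an example, and the `LRUCache` class and its method names are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Keeps at most `capacity` pages and evicts the least recently used one."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self._pages: "OrderedDict[str, str]" = OrderedDict()

    def get(self, url: str) -> Optional[str]:
        """Return the cached page and mark it as recently used, or None on a miss."""
        if url not in self._pages:
            return None
        self._pages.move_to_end(url)
        return self._pages[url]

    def put(self, url: str, page: str) -> None:
        """Store a page, evicting the least recently used page if the cache is full."""
        self._pages[url] = page
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop the front (oldest) entry
```

Other policies weigh document size, fetch cost or access frequency instead of recency; the structure of `get` and `put` stays the same and only the eviction choice changes.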

Problems in designing caching systems for the WWW
Cache coherency: how a proxy maintains data consistency.
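One simple way to bound staleness is to give every cached page a time to live (TTL) and treat expired entries as misses. The sketch below illustrates that idea only; it is not a mechanism described in these slides, and the `TTLCache` class, its parameters and the injectable `clock` are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Caches pages for a fixed number of seconds, then treats them as stale."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so that expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}  # url -> (page, expiry time)

    def put(self, url: str, page: str) -> None:
        """Store a page together with the time at which it expires."""
        self._entries[url] = (page, self.clock() + self.ttl)

    def get(self, url: str) -> Optional[str]:
        """Return a fresh page, or None if it is missing or has expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        page, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[url]  # stale: force a refetch from the server
            return None
        return page
```

A short TTL keeps copies closer to the server's version at the cost of more refetches, while a long TTL does the opposite.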

Problems in designing caching systems for the WWW
Control information distribution: how control information (e.g. URLs) is distributed among proxies.

Problems in designing caching systems for the WWW
Dynamic data caching: how to deal with data that is not cacheable.

Caching architecture
Hierarchical
Caches are placed at multiple levels of the network.
Figure: Levels of the hierarchy, from top to bottom: national, regional, institutional, bottom.

Hierarchical architecture
Bottom level: client/browser caches.
Figure: A request that misses at the bottom level ("web page not found") is passed up in turn to the institutional, regional and national caches.

Hierarchical architecture
After the web page is found:
Figure: The page is forwarded down from the national cache through the regional and institutional caches to the bottom level, and each cache on the way keeps a copy ("forward page, leave copy").
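The miss-then-copy behaviour of the hierarchy can be sketched with a chain of cache levels. This is a simplified model under stated assumptions: the `CacheLevel` class is hypothetical, the `origin` callback stands in for the web server, and every level is assumed to have room to keep a copy.

```python
from typing import Callable, Dict, Optional


class CacheLevel:
    """One level of the hierarchy (bottom, institutional, regional or national)."""

    def __init__(self, name: str, parent: Optional["CacheLevel"] = None) -> None:
        self.name = name
        self.parent = parent  # next level up, or None at the top of the hierarchy
        self.store: Dict[str, str] = {}

    def fetch(self, url: str, origin: Callable[[str], str]) -> str:
        """Return the page, asking the next level up on a miss."""
        if url in self.store:
            return self.store[url]
        if self.parent is not None:
            # "Web page not found": pass the request one level up.
            page = self.parent.fetch(url, origin)
        else:
            # The top level missed as well: contact the web server.
            page = origin(url)
        # "Forward page, leave copy": keep a copy on the way back down.
        self.store[url] = page
        return page


# Example wiring of the four levels named on the slides.
national = CacheLevel("national")
regional = CacheLevel("regional", parent=national)
institutional = CacheLevel("institutional", parent=regional)
bottom = CacheLevel("bottom", parent=institutional)
```

After one client fetches a page, a second client under the same institutional cache is served from that level without any request reaching the regional or national caches or the server.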

Hierarchical architecture
Advantages:
Bandwidth efficient, especially when cache servers are slow.
Allows popular web pages to diffuse efficiently towards the demand.

Hierarchical architecture
Disadvantages:
Cache servers need to be placed at key access points of the network, which requires coordination among caches.
Each level adds a delay.
High levels are bottlenecks.
Multiple copies of a document are kept at different cache levels.

Distributed architecture
Caches at the bottom level only.
No other intermediate caching levels.
Each cache server holds metadata about the data stored on other servers.
A hierarchy is used only for distributing information about the location of copies.
No copying of actual documents.

Distributed architecture
Advantages:
Traffic flows through low network levels, which are less congested.
No additional disk space is required at intermediate network levels.
Better load sharing.
More fault tolerant.

Distributed architecture
Disadvantages:
High connection times.
Higher bandwidth usage.
Administrative issues.

Distributed architecture
Examples
ICP (Internet Cache Protocol, Harvest group): retrieves data from neighboring caches and parent caches.
CARP (Cache Array Routing Protocol): the URL space is divided among an array of caches, and each cache stores only the documents whose URLs hash to it.
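The idea of dividing the URL space by hashing can be sketched as follows. This is a simplified stand-in and not the CARP specification: it scores each (proxy, URL) pair with SHA-256 and routes the URL to the highest-scoring proxy, whereas CARP defines its own hash function and per-member load factors. The function names and proxy names are hypothetical.

```python
import hashlib
from typing import Sequence


def score(proxy_name: str, url: str) -> int:
    """Deterministic score for a (proxy, URL) pair (SHA-256 is an assumption)."""
    digest = hashlib.sha256(f"{proxy_name}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(url: str, proxies: Sequence[str]) -> str:
    """Pick the one proxy in the array that is responsible for this URL."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return max(proxies, key=lambda name: score(name, url))


# Every client that knows the same array computes the same owner for a URL,
# so each document is stored on only one cache of the array.
array = ["cache-a", "cache-b", "cache-c"]
owner = route("http://example.com/index.html", array)
```

Because the owner is the highest-scoring member, removing a different proxy from the array leaves the owner of that URL unchanged, so only the URLs of the removed proxy have to move.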

Hybrid architecture
Caches may cooperate with other caches at the same level or at a higher level using distributed caching.
ICP is an example: the document is fetched from the parent or neighbor cache that has the lowest RTT.
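The selection rule on this slide can be sketched as a small function. It models only the choice among caches that hold the document, not the ICP message exchange; the function name, the RTT table and the set of caches reporting a hit are all hypothetical inputs.

```python
from typing import Dict, Optional, Set


def pick_cache(rtt_ms: Dict[str, float], hits: Set[str]) -> Optional[str]:
    """Return the parent or neighbor with the lowest RTT among those that have the page.

    rtt_ms maps each parent/neighbor cache name to its measured round-trip time.
    hits is the set of caches that answered that they hold the document.
    Returns None when no cache has it, in which case the caller would have to
    fall back to another source such as the web server.
    """
    candidates = [name for name in rtt_ms if name in hits]
    if not candidates:
        return None
    return min(candidates, key=lambda name: rtt_ms[name])


# Example: the neighbor is closer than the parent, so it is chosen.
rtts = {"parent": 40.0, "neighbor-1": 12.5, "neighbor-2": 18.0}
chosen = pick_cache(rtts, hits={"parent", "neighbor-1"})
```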

Performance of architectures
Hierarchical caching has shorter connection times than distributed caching.
Additional copies at intermediate levels reduce retrieval latency for small documents.
Distributed caching has shorter transmission times and higher bandwidth usage.
A “well configured” hybrid scheme can reduce both connection time and transmission time.