
Branch and Bound: A Paradigm of Elastic Network Caching

Feng Niu
University of Wisconsin-Madison

[email protected]

Neel Kamal Madabhushi
University of Wisconsin-Madison

[email protected]

May 19, 2010

Abstract

Content delivery is an increasingly dominant function of the Internet, and it continuously puts significant pressure on the networking infrastructure as well as on content providers. As a general mechanism to improve efficiency, caching has been used in several approaches; e.g., CDNs, caching proxies, and Redundancy Elimination (RE). In this paper, we propose a new caching paradigm called Branch and Bound Caching (BBC) that addresses the disadvantages of existing approaches.

Similar to RE, BBC caches co-locate with routers and are an integral part of the network. However, while RE caches anonymous packets and uses them to compress traffic, BBC caches named objects and uses them to satisfy requests. In BBC, the functionalities of forwarding and caching are loosely coupled using our “branch and bound” mechanism, so that cached content is utilized elastically without incurring additional overhead for cache misses. We report preliminary simulation results to demonstrate the feasibility of our proposal.

1 Introduction

Recent studies [4, 5] show that an increasingly dominant portion of Internet traffic belongs to the category of content delivery. To accommodate such rapidly growing demand, both ISPs and content providers have to continuously upgrade their infrastructure. As such, improving the efficiency of content delivery bears tremendous economic promise. Naturally enough, caching as a general mechanism has been used in most approaches. For example, some web proxies store HTTP responses and use them to satisfy repeated requests; content distribution networks (CDNs) replicate expensive-to-deliver content and push it towards the edge, closer to content consumers. More recently, Anand et al. [2] explore the idea of caching content chunks on routers and using them to dictionary-encode network traffic; we call this approach redundancy elimination (RE).

In this paper, we examine the pros and cons of each of the three caching paradigms. Based on our analysis, we propose a novel caching paradigm called Branch and Bound Caching (BBC) that addresses the problems with the existing approaches. In BBC, we follow the lead of RE [2] by co-locating caches with routers and making caching an integral functionality of the network. However, the caches in BBC are used in a different way than those in RE. In RE, a router caches all packets that pass by, regardless of their semantics. Given an incoming packet, an upstream router would substitute chunks in the packet that also appear in the cache with much shorter references (called “shims” in [2]), thereby realizing traffic compression. Assuming that the caches are synchronized, the downstream router would dereference the shims by replacing them with the corresponding chunks from its own cache. By contrast, BBC caches are used in the same way as with web proxies: named objects are stored and retrieved to satisfy matching requests. As we will see in Section 2, not using content names is the fundamental reason for several crucial drawbacks of RE.

A critical challenge of co-locating caches with routers is capacity mismatch: on the one hand, high cache effectiveness typically requires large-volume storage; on the other hand, current large-volume storage systems (usually consisting of DRAM and hard drives) can rarely handle the throughput of a high-end router (e.g., 10 Gb/s). We address this problem by introducing the branch-and-bound coupling (also abbreviated BBC) mechanism. Specifically, the caching component is neither an internal part of the router (as with RE) nor an external entity outside the router (as with proxies). Instead, the router and the cache are loosely coupled: the router monitors the state of the caching component and determines which tasks to assign to the cache; the cache fulfills the assigned tasks and continuously reports its state to the router. For example, the router lets the cache handle a request only if 1) the cache contains the requested object; and 2) the cache is not overloaded.

The remainder of this paper is organized as follows. Section 2 lays out the basic setting of content delivery. Section 3 examines and compares existing caching mechanisms, including CDNs, proxies, and RE. Section 4 presents the details of Branch and Bound Caching. Section 5 reports some preliminary simulation results. Section 6 summarizes our proposal and discusses future directions.

2 Content Delivery

We use the term “content” to refer to digital objects such as video, audio, images, and web pages. We assume that such objects are named and that the names are unique; i.e., at any moment one name maps to at most one object. Hence, content-hashing schemes cannot be used for naming, since hash collisions would violate uniqueness. A vanilla version of such a naming system is URLs, where uniqueness is guaranteed by partitioning the name space: part of the content name is the content provider’s name.

Figure 1: Server/Client Model of Content Delivery (clients send requests through the backbone; the server sends back responses)

When the context is clear, content providers are also called servers, and content consumers are called clients. We consider the simplistic server/client delivery model (Figure 1). A session of content delivery consists of two phases: first, the client sends a request for a certain object to the server; then, the server responds by transferring the object to the client. Both requests and responses contain the name of the object in question. Clearly, HTTP qualifies for this scenario.1

The bare server/client model as shown in Figure 1 is very inefficient because there is typically significant redundancy in the traffic of content delivery sessions. For example, some popular videos on YouTube can be requested millions of times within days, and each single session is realized with a full round trip between a client and the server; if the videos were cached at network nodes close to the clients, the total traffic could be reduced dramatically. Moreover, such redundancy is likely to grow rapidly, given that the Internet is becoming more and more social (e.g., the Slashdot effect).

There are several general mechanisms to address the issue of redundancy, e.g., caching and multicasting. Here we consider only caching. Regarding the question of what to cache, there are two types of cache: object-level caches and chunk-level caches. In an object-level cache, content is stored at the granularity of objects and indexed by object names; we therefore also call it a named cache. In a chunk-level cache, content is stored at the granularity of byte strings and the indexing methods do not rely on object names; we therefore also call it an anonymous cache. A named cache is typically used by matching incoming requests against the cache index; since this matching can be done very quickly, using a named cache is computationally light. By contrast, both manipulation and retrieval in an anonymous cache usually involve hashing over the data itself, so using an anonymous cache is computationally heavy.
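As an illustration, the following Python sketch contrasts the two lookup styles; the fixed-size chunking and the choice of hash function are our own simplifying assumptions, not part of any specific system:

    import hashlib

    # Named (object-level) cache: one index probe per request.
    named_cache = {}                          # object name -> object bytes

    def named_lookup(name):
        return named_cache.get(name)          # O(1) hash-table probe on the name

    # Anonymous (chunk-level) cache: the payload itself must be scanned
    # and fingerprinted before any match can be found.
    CHUNK = 64                                # assumed chunk size (bytes)
    anon_cache = {}                           # fingerprint -> chunk bytes

    def anon_lookup(payload):
        matches = []
        for i in range(0, len(payload) - CHUNK + 1, CHUNK):
            chunk = payload[i:i + CHUNK]
            fp = hashlib.sha1(chunk).digest() # fingerprint every chunk
            if fp in anon_cache:
                matches.append((i, fp))       # candidate for substitution
        return matches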

The object associated with a given name may evolve over time. Thus, object-level caching must be aware of cache staleness. For that purpose, servers usually indicate in their responses whether and for how long an object can be cached. Similarly, a client may indicate in its request whether a cached copy of the object is acceptable and, if so, how fresh the copy must be. These semantics are supported in HTTP.

1 Although there exist more sophisticated delivery paradigms (e.g., P2P), we do not consider them here, as the main topic of this paper is caching paradigms. We do note that our solution (namely BBC) can, more or less, be adapted to other delivery paradigms.


3 A Taxonomy of Caching

In this section, we describe and compare several existing caching paradigms, including CDNs, proxies, and RE. The purpose is to look into various aspects of caching and motivate our proposal, Branch and Bound Caching (BBC), which will be explained in detail in Section 4.

To put the subject into perspective, Figure 2 provides a taxonomy of the four caching paradigms. The top header indicates whether the approach focuses on the redundancy among contents or among requests; the left header indicates whether the cache is a relatively independent node in the network or an integral part of the networking infrastructure. To interpret: CDNs and RE are designed with a mindset of push-based communication, whereas proxies and BBC work with pull-based communication, which is naturally compatible with the server/client model. In the other dimension, CDNs and proxies are relatively independent entities in the network, whereas in RE and BBC the cache is an integral part of the network; in fact, the cache is co-located with routers in both RE and BBC.

                 content-centric    request-centric
    stand-alone  CDNs               Proxies
    in-network   RE                 BBC

Figure 2: A Taxonomy of Caching Paradigms. The top header indicates whether the approach focuses on the redundancy among contents or among requests; the left header indicates whether the cache is a relatively independent node in the network or an integral part of the networking infrastructure.

To compare the efficiency of different caching paradigms, we need to consider various performance aspects, including (among others) bandwidth cost, server load, and session latency. We now briefly describe the three existing caching paradigms and discuss their pros and cons.

3.1 Content Distribution Networks

As shown in Figure 3, the workflow of a basic CDN is as follows:

1. Numerous cache servers are deployed in a geographically diverse manner.

2. Content is replicated onto the cache servers.

3. The cache servers are used to serve requests from nearby clients.

Figure 3: An Illustration of CDNs

The benefits of CDNs are obvious: by redirecting clients’ requests to closer servers, we can reduce both the session latency and the load on the content providers’ servers. In addition, the shorter round trip also saves bandwidth for both the ISPs and the content providers. However, all those benefits come at a price: for one, the deployment of large-scale CDNs can be quite expensive; for another, content providers usually have to pay to use CDN services. Moreover, the content providers have to take pains to configure the CDN caches; it is in this sense that CDNs are push-based and content-centric. Another issue is that, because CDN caches are typically static, effort has to be put into optimal content distribution or dynamic distribution mechanisms; e.g., some commercial CDN providers have explored using P2P to achieve adaptive caching [1].

3.2 Redundancy Elimination

Redundancy Elimination (RE) [2] is the idea of using synchronized on-router caches to compress network traffic with dictionary coding. The cache is anonymous and is matched against the payload of a target packet; fingerprint hashing is used to speed up this matching process. The paper also proposes using traffic engineering to deploy and adjust the caches, but we do not discuss that here since it is orthogonal to the basic caching paradigm. Figure 4 illustrates how RE works. The proposal of eliminating traffic redundancy in a universal manner is a novel and bold one, and there are many interesting ideas in [2] that can be further explored. Nevertheless, we observe several disadvantages of RE as a caching paradigm in the context of content delivery.

Figure 4: An Illustration of RE (on-site caches shrink the packet size between routers)

The most prominent benefit of RE is that it saves bandwidth for ISPs, since the traffic is compressed. As such, one can expect session latencies to be reduced, since the queues in the routers are presumably shorter. However, every request is still served by the content provider, which means that RE does not contribute to reducing server load at all. In addition, since RE needs to scan the data itself, its computationally heavy nature may actually introduce additional latency or even lead to bottlenecks. A third issue is that, although RE caches are semantics-independent and can identify finer-grained matches (byte strings instead of objects), their effectiveness at identifying redundancy is severely limited by cache coverage: a FIFO cache of size 100 GB (DRAM, that is) on a 10 Gbit/s router can only capture packets from the last 80 seconds; as such, only duplicates that occur within this short interval will be detected by RE.
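To see where this 80-second window comes from: 10 Gbit/s is 1.25 GB/s, so a 100 GB FIFO cache holds roughly 100 / 1.25 = 80 seconds’ worth of traffic.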

Besides effectiveness, the feasibility of RE is also a big issue. For one, given the temporal and spatial heterogeneity of routers and the uncertainty of the traffic, FIFO alone is unlikely to robustly synchronize on-router caches and guarantee correctness. For another, RE requires a shared naming scheme (e.g., unique and global packet IDs) for the distributed caches; the “PktID” variable as described in [2] does not solve this problem.

3.3 Proxies

Figure 5: An Illustration of web proxies (a caching proxy sits between the clients and the backbone)

While CDNs are deployed by content providers, web proxies are usually initiated by content consumers or ISPs. As shown in Figure 5, web proxies are middleboxes deployed on the clients’ side. They regulate all traffic that passes through them, and often function as filters and/or caches. The caching functionality works as follows: upon receiving a request, the proxy checks whether the requested object is in its cache and decides whether to intercept or pass on the request accordingly; upon receiving a response, the proxy checks whether the response is cacheable and updates its cache accordingly. Since the cache is dynamic, caching proxies use cache-replacement algorithms such as Least Recently Used (LRU) to achieve a higher cache hit ratio.
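A minimal Python sketch of this intercept-or-pass-on logic with LRU replacement; the fetch_from_origin helper and the capacity constant are hypothetical stand-ins, not part of any particular proxy implementation:

    from collections import OrderedDict

    CACHE_CAPACITY = 1000                      # assumed number of objects

    cache = OrderedDict()                      # name -> object, kept in LRU order

    def handle_request(name, fetch_from_origin):
        """Intercept the request on a hit, pass it on otherwise."""
        if name in cache:
            cache.move_to_end(name)            # refresh LRU position
            return cache[name]                 # intercept: serve from the cache
        response, cacheable = fetch_from_origin(name)  # pass the request on
        if cacheable:
            cache[name] = response
            if len(cache) > CACHE_CAPACITY:    # LRU eviction
                cache.popitem(last=False)
        return response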

There is a special type of proxy, called an intercepting proxy or transparent proxy, that is commonly used by ISPs. An intercepting proxy consists of a router connected to a proxy server. Network flows go through the router, which decides whether to divert them to the proxy or let them through. The decision is usually based on TCP headers; for example, the router may divert all HTTP traffic to the proxy server. If we consider only the diverted traffic, the illustration in Figure 5 is still valid. (We do note that intercepting proxies were an inspiration for our BBC paradigm.)

Like CDNs, caching proxies also reduce bandwidth usage, session latency, and server load. Unlike CDNs, the benefits apply to all content providers, not only to those who pay. Nevertheless, proxies face several important drawbacks. First, being middleboxes makes proxies potential bottlenecks in the network: the throughput of the link where a proxy resides is bounded by the processing capacity of the proxy. Second, cache misses experience the pure overhead of an additional hop; when the network is in good shape, the clients may be better off without the proxy.

4 Branch and Bound Caching

After examining the pros and cons of the various caching paradigms, we envision a solution that inherits the benefits of object-level, request-centric approaches but is also elastic, in the sense that it neither introduces network bottlenecks nor incurs extra overhead for cache misses. It is illustrated in Figure 6, and we name it Branch and Bound Caching (BBC).

Figure 6: An Illustration of BBC (an on-site cache next to the router serves matching requests and absorbs cacheable responses)

4.1 Branch and Bound

BBC can be seen as a natural extension of intercepting proxies. The novel part is our Branch and Bound Coupling (also BBC) mechanism, which governs the relationship between the router and the cache.

Figure 7: Branch and Bound removes the bottleneck as well as undue overhead for cache misses (the figure contrasts the Proxy and BBC message flows between client, router, cache, and server for both cache hits and cache misses)

As shown in Figure 7, in BBC the router monitors the state of the cache (including object names and cache load) and makes routing decisions based on the packet as well as the cache state.

The “Branch” mechanism is as follows:

• Upon receiving a request packet, if the requested object is in the cache, then forward it to the cache; otherwise, pass it through.

• Upon receiving a response packet, if the object is cacheable, multicast the packet to both its indicated destination and the cache.

The “Bound” mechanism is as follows:

• Disable the cache when it’s overloaded.
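To make the coupling concrete, here is a minimal Python sketch of the two mechanisms as they might run on the router; the name_table, the cache_state fields, and the forwarding helpers are our own hypothetical names, since the paper specifies only the behavior:

    def handle_request(pkt, name_table, cache_state,
                       forward_to_cache, forward_upstream):
        """Branch: divert a request to the cache only on a known hit."""
        if pkt.accepts_cache and pkt.name in name_table \
                and not cache_state.retrieval_overloaded:
            forward_to_cache(pkt)          # the cache serves the request
        else:
            forward_upstream(pkt)          # normal routing, no extra hop

    def handle_response(pkt, cache_state, forward_downstream, copy_to_cache):
        """Branch: copy a cacheable response to the cache; Bound: skip if busy."""
        forward_downstream(pkt)            # the client always gets the packet
        if pkt.cacheable and not cache_state.store_overloaded:
            copy_to_cache(pkt)             # "multicast" the packet to the cache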

4.2 Packet Format

For completeness, we present a baseline packet format that can be used with BBC. We could have focused on HTTP only, but to demonstrate the generality of object-level caching, we opt for the conceptual exposition shown in Figure 8. We assume that the request and response packets are carried on top of TCP. Roughly speaking, the fields are a subset of those in HTTP headers; their meanings are as follows:

name: The name of the object.


Figure 8: Packet format

range: We support retrieval of part of an object. For requests, “range” is the portion of the object being requested. For responses, “range” is the byte range of the object carried by this packet.

total-size: The total size of the object. This field can be used by the cache server or the client to appropriately allocate buffer space.

cache-control: For requests, this can be as simple as a bit indicating whether cached content is acceptable, or it could be a timestamp indicating how stale the cached copy may be. For responses, this can be as simple as a bit indicating whether the content can be cached, or it could be a timestamp indicating for how long the object may stay in the cache without being refreshed.
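As a purely illustrative rendering of this format, the header could be modeled as follows; the concrete types and the is_request flag are our own assumptions, since the paper leaves the exact encoding open:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class BBCHeader:
        """Conceptual BBC packet header carried on top of TCP."""
        name: str                        # unique object name (e.g., a URL)
        range: Tuple[int, int]           # byte range requested or carried
        total_size: int                  # total object size, for buffer allocation
        cache_control: Optional[int]     # None = no caching; otherwise max staleness
                                         # (requests) or lifetime (responses), seconds
        is_request: bool                 # request vs. response packet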

4.3 Architecture

In this section, we describe the architecture of BBC (Figure 9). A BBC node consists of a router and a cache, which presumably reside at the same location. The two subsystems are loosely coupled via the cache’s three interfaces: GET, PUT, and Control. Although it is possible for one cache to connect to multiple routers and vice versa, for simplicity we consider only the one-to-one configuration.

On top of the existing routing functionality, the router also maintains two data structures:

name table: A hash table in DRAM that stores the names of the objects in the cache. False negatives are allowed; that is, if the DRAM cannot hold all the names due to space constraints, it is acceptable to store only a subset of them. This table is updated by the cache via the Control interface.

cache state: A set of volatile variables in SRAM that indicate the working load of the cache. We leave the exact specification open, but here are some suggestions: 1) a “keep-alive” flag to indicate the availability of the cache; 2) the amount of outstanding requests to be processed by the cache, in terms of total size or estimated latency; 3) free space in the input buffer and/or output buffer of the cache. These variables are actively updated by the cache via the Control interface.

Figure 9: BBC Architecture. The router hosts the name table, the cache state, a Request Handler, a Response Handler, and a Cache Regulator; the cache is a high-capacity, high-throughput storage system with a Retrieval Daemon, request and response queues, a Content Directory, and a Cache Manager, connected to the router via the GET, PUT, and Control interfaces.
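A minimal sketch of these two router-side data structures, following the suggestions above (the field names are ours):

    from dataclasses import dataclass, field

    @dataclass
    class CacheState:
        """Volatile load indicators, actively pushed by the cache."""
        keep_alive: bool = False           # is the cache available at all?
        outstanding_bytes: int = 0         # queued GET work, in bytes
        estimated_latency_ms: float = 0.0  # expected time to drain the GET queue
        free_buffer_bytes: int = 0         # room left in the PUT input buffer

    @dataclass
    class RouterCacheView:
        """What the router knows about its co-located cache."""
        name_table: set = field(default_factory=set)  # cached names (may be a subset)
        state: CacheState = field(default_factory=CacheState)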

The Branch and Bound mechanism is mainly carried out by the router. Upon receiving a request packet, the router (its Request Handler, to be precise) forwards it to the cache via the GET interface only if: 1) the requested object appears in the name table and the request accepts cached content; and 2) the request-processing component of the cache (i.e., the Retrieval Daemon in Figure 9) is not overloaded. The second condition is evaluated by looking at the cache state, and the exact interpretation of “overloaded” may be something like “the estimated latency is above 5 ms.” By the same token, upon receiving a response packet, the router first checks the packet header to see whether it is cacheable. If so, the router further checks the cache state to see whether the Cache Manager can absorb the object (e.g., by comparing the currently available buffer size in the cache against the object size). If so, the router multicasts the packet to both the designated destination and the cache.

The cache contains a (hopefully, but not necessarily) high-capacity, high-throughput storage system and interacts with the router via three interfaces: GET, PUT, and Control. The GET interface accepts requests from the router and sends out packets of the corresponding objects; i.e., for this part the cache behaves like a content server. The PUT interface accepts response packets and reassembles them into objects. Note that it never sends out acknowledgments, since the incoming packets are copies of packets directed to the clients. The Cache Manager is responsible for maintaining the cache content; the bare minimum is to execute cache-replacement policies such as LRU.

The driving force of the BBC mechanism is the Control interface, which continuously sends control messages to update the cache state on the router. Triggering events and corresponding messages include: 1) an object is about to be removed from the cache; 2) an object has just been added to the cache; 3) a cache state variable reaches a critical point or crosses some threshold (e.g., the available buffer size drops below 4 GB). Upon receiving those messages, the router updates the name table and cache state accordingly.
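A sketch of how the router might apply such control messages, reusing the RouterCacheView structure sketched above and assuming a simple tagged-dictionary message format of our own invention:

    def apply_control_message(view, msg):
        """Update the router's name table and cache state from one control message."""
        kind = msg["kind"]
        if kind == "object_removed":           # 1) object about to leave the cache
            view.name_table.discard(msg["name"])
        elif kind == "object_added":           # 2) object just stored in the cache
            view.name_table.add(msg["name"])
        elif kind == "state_update":           # 3) a load variable crossed a threshold
            view.state.keep_alive = msg.get("keep_alive", view.state.keep_alive)
            view.state.estimated_latency_ms = msg.get(
                "estimated_latency_ms", view.state.estimated_latency_ms)
            view.state.free_buffer_bytes = msg.get(
                "free_buffer_bytes", view.state.free_buffer_bytes)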

4.4 Discussion

Security: We do not discuss security issues in detail here, but there is a simple strategy to ensure security: if we bind the content name to the provider’s ID, then secure forwarding/delivery implies secure caching. For example, if the network architecture is AIP [3] instead of IP, then the security of content delivery and of caching are both guaranteed.

BBX: The Branch and Bound mechanism applied to content caching yields BBC. However, it is also possible to replace the cache with other types of subsystems, giving rise to variants such as a Branch and Bound Proxy (BBP).

5 Simulation Results

We analyze the Internet traffic from a university’s subnet to evaluate the redundancy in the user requests and the gains from deploying our caching framework.

5.1 Experimental Setup

The traffic dump consists of outgoing HTTP requests over a three-day period: April 23rd, 24th, and 26th of 2010. Out of these, the requests going to YouTube were filtered out and any spurious records eliminated. The resulting dataset has 144,191 requests for 61,446 unique videos. We first examine this dataset to estimate the redundancy and the benefits of deploying our BBC caching scheme in the network.

Figure 10: Redundancy in the YouTube trace (x-axis: number of repeated views, 1 to 20; y-axis: percentage of all videos, 0% to 70%)

We found considerable redundancy: about 46% of all videos were viewed more than once. As shown in Figure 10, among the repeated videos, 60% were repeated just once. The dataset exhibited a high degree of variation in view frequency, with one video being viewed around 4,000 times. The extent of redundancy in the requests warrants the deployment of a caching scheme.

5.2 Results

Figure 11: Cache effectiveness with various cache capacities (x-axis: cache size in GB, 5 to 1000; y-axis: percent; curves: hit ratio and bandwidth savings)


Requirement of Cache Capacity: We first measure the impact of cache capacity on two performance metrics: cache hit ratio and network bandwidth savings. Using the LRU replacement algorithm for the cache, we calculated the hit ratio and the number of bytes served from the cache. As shown in Figure 11, even for a cache size of 5 GB, the cache already achieves a 30% hit ratio and a 30% reduction in bandwidth usage. A cheap commodity PC would be able to provide such capacities.
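For reference, a trace-driven LRU simulation of this kind can be sketched as follows; the trace format, one (video_id, size_bytes) pair per request, is an assumption on our part:

    from collections import OrderedDict

    def simulate_lru(trace, capacity_bytes):
        """Replay a request trace through an LRU cache; return hit ratio and BW savings."""
        if not trace:
            return 0.0, 0.0
        cache = OrderedDict()                      # video_id -> size_bytes, LRU order
        used = 0
        hits = hit_bytes = total_bytes = 0
        for video_id, size in trace:
            total_bytes += size
            if video_id in cache:
                cache.move_to_end(video_id)        # refresh recency on a hit
                hits += 1
                hit_bytes += size
            else:
                cache[video_id] = size             # admit the object on a miss
                used += size
                while used > capacity_bytes:       # evict least recently used
                    _, evicted = cache.popitem(last=False)
                    used -= evicted
        return hits / len(trace), hit_bytes / total_bytes

    # Example: hit_ratio, bw_savings = simulate_lru(trace, 5 * 10**9)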

Figure 12: Required cache throughput over time

Requirement of Cache Throughput: We also measure the working load of the cache over time, assuming that each request is served at a speed of 2 Mbit/s, which is the video bit rate. Again we use the LRU replacement algorithm; the cache capacity is set to 100 GB. The results are plotted in Figure 12. A cheap storage system with a maximum throughput of 15 MB/s would be able to handle almost all the requests. For peak times when the ideal throughput exceeds 15 MB/s, the BBC router would temporarily disable the cache (the Bound mechanism) and forward the additional requests as usual.
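One plausible way to derive such a load curve and the Bound trigger, assuming each cache hit is a session served at the fixed 2 Mbit/s bit rate (the session start times and durations are hypothetical inputs, not fields of our trace):

    from collections import Counter

    BIT_RATE_BPS = 2_000_000                       # assumed 2 Mbit/s per served request
    BOUND_BPS = 15 * 8 * 1_000_000                 # 15 MB/s cache throughput ceiling

    def cache_demand(hit_times, durations):
        """Aggregate concurrent cache-served sessions into per-second demand (bits/s)."""
        demand = Counter()
        for start, dur in zip(hit_times, durations):
            for t in range(int(start), int(start + dur) + 1):
                demand[t] += BIT_RATE_BPS
        return demand

    def bound_disabled_seconds(demand):
        """Seconds during which the Bound mechanism would disable the cache."""
        return sorted(t for t, bps in demand.items() if bps > BOUND_BPS)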

6 Conclusion and Future Work

We have presented a novel caching paradigm for content delivery called Branch and Bound Caching. We demonstrated its advantages over existing approaches such as CDNs, web proxies, and Redundancy Elimination. The most prominent feature of our solution is its elasticity: the Branch mechanism ensures that the cache is put to full use; the Bound mechanism ensures that the cache introduces neither bottlenecks nor extra overhead.

For future work, we plan to evaluate the Branch and Bound mechanisms specifically. It is also interesting to see how the BBC paradigm can be generalized to other applications, e.g., Branch and Bound Proxies.

7 Acknowledgment

We are grateful to Holly Esquivel, Paras Doshi, and Jeremy Tanumihardjo for sharing the YouTube dataset with us. Many thanks to Ashok Anand and Aditya Akella for many fruitful discussions and for being tolerant of our constantly drifting project topics.

References

[1] Velocix. http://www.velocix.com/.

[2] A. Anand, A. Gupta, A. Akella, S. Seshan, and S. Shenker. Packet caches on routers: the implications of universal redundant traffic elimination. ACM SIGCOMM Computer Communication Review, 38(4):219–230, 2008.

[3] D. Andersen, H. Balakrishnan, N. Feamster, T. Koponen, D. Moon, and S. Shenker. Accountable internet protocol (AIP). In Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, pages 339–350. ACM, 2008.

[4] J. Erman, A. Gerber, M. Hajiaghayi, D. Pei, and O. Spatscheck. Network-aware forward caching. In Proceedings of the 18th International Conference on World Wide Web, pages 291–300. ACM, 2009.

[5] G. Maier, A. Feldmann, V. Paxson, and M. Allman. On dominant characteristics of residential broadband internet traffic. In Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, pages 90–102. ACM, 2009.
