13
Self-Organizing Algorithms for Cache Cooperation in Content Distribution Networks Sem Borst, Varun Gupta, and Anwar Walid The delivery of video content is expected to gain huge momentum, fueled by the popularity of user-generated clips, growth of video-on-demand (VoD) libraries, and widespread deployment of Internet Protocol television (IPTV) services with features such as CatchUp/PauseLive TV and network personal video recorder (NPVR) capabilities. The “time-shifted” nature of these personalized applications defies the broadcast paradigm underlying conventional TV networks and increases the overall bandwidth demands by orders of magnitude. Caching strategies provide an effective mechanism for mitigating these massive bandwidth requirements by storing copies of the most popular content closer to the network edge, rather than keeping it in a central site. The reduction in the traffic load lessens the required transport capacity and capital expense and alleviates performance bottlenecks. In this paper, we develop lightweight cooperative cache management algorithms aimed at maximizing the traffic volume served from cache and minimizing the bandwidth cost. As a canonical scenario, we focus on a cluster of distributed caches and show that under certain symmetry assumptions, the optimal content placement has a rather simple structure. Besides being interesting in its own right, the optimal structure offers valuable guidance for the design of low-complexity cache management and replacement algorithms. We establish that the proposed algorithms are guaranteed to operate within a constant factor from the globally optimal performance, with benign worst-case ratios, even in asymmetric scenarios. Numerical experiments reveal that typical performance is far better than the worst-case conditions indicate. © 2009 Alcatel-Lucent. attracts tens or even hundreds of millions of viewers a day, generating around 2000 terabytes (TB) of traffic. While the daily number of VoD users is unlikely to be that high, the size of a high-definition movie dwarfs that of a typical video clip, and just one user requesting Introduction The delivery of digital video content is anticipated to show tremendous growth over the next several years, driven by the huge popularity of user-generated video clips and the expansion of video-on-demand (VoD) libraries. It is estimated that YouTube* alone Bell Labs Technical Journal 14(3), 113–126 (2009) © 2009 Alcatel-Lucent. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com) • DOI: 10.1002/bltj.20391

Self-organizing algorithms for cache cooperation in content distribution networks

Embed Size (px)

Citation preview

Page 1: Self-organizing algorithms for cache cooperation in content distribution networks

◆ Self-Organizing Algorithms for CacheCooperation in Content Distribution NetworksSem Borst, Varun Gupta, and Anwar Walid

The delivery of video content is expected to gain huge momentum, fueled bythe popularity of user-generated clips, growth of video-on-demand (VoD)libraries, and widespread deployment of Internet Protocol television (IPTV)services with features such as CatchUp/PauseLive TV and network personalvideo recorder (NPVR) capabilities. The “time-shifted” nature of thesepersonalized applications defies the broadcast paradigm underlyingconventional TV networks and increases the overall bandwidth demands byorders of magnitude. Caching strategies provide an effective mechanism formitigating these massive bandwidth requirements by storing copies of themost popular content closer to the network edge, rather than keeping it in acentral site. The reduction in the traffic load lessens the required transportcapacity and capital expense and alleviates performance bottlenecks. In thispaper, we develop lightweight cooperative cache management algorithmsaimed at maximizing the traffic volume served from cache and minimizingthe bandwidth cost. As a canonical scenario, we focus on a cluster ofdistributed caches and show that under certain symmetry assumptions, theoptimal content placement has a rather simple structure. Besides beinginteresting in its own right, the optimal structure offers valuable guidancefor the design of low-complexity cache management and replacementalgorithms. We establish that the proposed algorithms are guaranteed tooperate within a constant factor from the globally optimal performance,with benign worst-case ratios, even in asymmetric scenarios. Numericalexperiments reveal that typical performance is far better than the worst-caseconditions indicate. © 2009 Alcatel-Lucent.

attracts tens or even hundreds of millions of viewers a

day, generating around 2000 terabytes (TB) of traffic.

While the daily number of VoD users is unlikely to be

that high, the size of a high-definition movie dwarfs

that of a typical video clip, and just one user requesting

IntroductionThe delivery of digital video content is anticipated

to show tremendous growth over the next several

years, driven by the huge popularity of user-generated

video clips and the expansion of video-on-demand

(VoD) libraries. It is estimated that YouTube* alone

Bell Labs Technical Journal 14(3), 113–126 (2009) © 2009 Alcatel-Lucent. Published by Wiley Periodicals, Inc.Published online in Wiley InterScience (www.interscience.wiley.com) • DOI: 10.1002/bltj.20391

Page 2: Self-organizing algorithms for cache cooperation in content distribution networks

114 Bell Labs Technical Journal DOI: 10.1002/bltj

one movie a month would approach the bandwidth

demands of 10 users watching 20 clips a month.

Even stronger growth is likely to be fueled by the

proliferation of Internet Protocol television (IPTV)

services with personalized features such as

CatchUp/PauseLive TV and network personal video

recorder (NPVR) capabilities. From a functional per-

spective, NPVR services are similar to the use of a digi-

tal video recorder (DVR) for recording TV programs,

the difference being that the content is physically

stored on (and played back from) a device in the ser-

vice provider’s network rather than in the user’s

home. Features such as CatchUp/PauseLive TV allow

users to watch and pause TV programs any time they

like, thus offering similar functionality to what digital

video disc (DVD) players provide.

The common characteristic of these “time-shifted”

TV services, as well as VoD libraries and sites like

YouTube, is that users can select from a huge collec-

tion of content material at any time they want. This is

a radical departure from conventional TV networks,

where users can only tune in to a limited number of

channels at any given time. As a result, there can be

as many different play-out sessions as there are active

viewers, which could be in the hundreds of thousands

in a major metropolitan area. This runs counter to

the broadcast paradigm embraced in conventional TV

networks and raises a need for unicast sessions,

increasing the overall bandwidth demands by orders

of magnitude.

Caching strategies provide an effective mecha-

nism for mitigating the massive bandwidth require-

ments associated with large-scale distribution of

personalized high-definition video content [17]. In

essence, caching strategies exploit storage capacity to

absorb traffic by storing copies of the most popular

content closer to the network edge rather than keeping

it in a central location. The reduction in the traffic

load translates into a smaller required transport capac-

ity and capital expense as well as fewer performance

bottlenecks, thus enabling better service quality at a

lower price [8], e.g., short and predictable download

times. The scope for cost savings and performance

benefits from caching increases as the cost of memory

continues to drop at a higher rate than that of trans-

mission gear.

The design of efficient caching strategies involves

a broad range of interrelated problems, such as accu-

rate prediction of demand, intelligent content place-

ment, and optimal dimensioning of caches. In the

context of content delivery networks, these problems

inherently entail strong distributed and spatial fea-

tures, since the caches are physically scattered across

the network and the user requests are generated in

geographically dispersed nodes. These facets add con-

tent look-up and request routing as major additional

problem dimensions and severely complicate the relia-

ble estimation of demand and efficient content place-

ment and eviction. In tracking popularity, for instance,

we face a fundamental trade-off between averaging

over larger user populations to reduce sampling error

and capturing possible variations across heterogeneous

user communities. In view of implementation consid-

erations, we further prefer low-complexity content

placement algorithms that operate in a mostly distrib-

uted fashion and yet implicitly coordinate their actions

so as to approach globally optimal performance.

Motivated by the issues above, we aim to devise

lightweight cooperative content placement algorithms

so as to maximize the traffic volume served from cache

and thus minimize the bandwidth cost. The bandwidth

cost need not be actual monetary expenses, but could

also represent some metric reflecting the congestion

levels on the various network links, like link weights

in Open Shortest Path First (OSPF) for example. We

assume that demand estimates are given, and that

suitable mechanisms for online content look-up and

request routing are available, e.g., the distributed

directory services in [2, 7, 13, 16].

Panel 1. Abbreviations, Acronyms, and Terms

CDN—Content distribution networksDVD—Digital video discDVR—Digital video recorderIPTV—Internet Protocol televisionLFU—Least frequently usedLRU—Least recently usedNPVR—Network personal video recorderOSPF—Open Shortest Path FirstVoD—Video-on-demand

Page 3: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 115

As a canonical scenario, we focus on a cluster of

distributed caches, either connected directly or via a

parent node, for which we show that under certain

symmetry assumptions, the optimal content place-

ment has a rather simple structure. Besides being

interesting in its own right, knowledge of the optimal

structure offers useful insight for the design of low-

complexity cooperative content placement and evic-

tion algorithms. We propose algorithms that are

guaranteed to operate within a constant factor from

the globally optimal performance while requiring only

limited knowledge of global popularities, with benign

worst-case ratios, even in asymmetric scenarios.

Numerical experiments for typical topologies and pop-

ularity distributions demonstrate that actual per-

formance is far better than the worst-case conditions

suggest.

The benefits of cooperative caching have been

investigated previously in the setting of distributed

file systems, starting with the taxonomy in [6], as well

as large-scale information systems and Web-oriented

content distribution networks (CDNs). Prior studies

in these contexts include simulation experiments

[5, 7, 12, 14], prototypes [18], and analytical results

and algorithms [1, 3, 4, 9–11, 15]. Most of the above-

mentioned work has focused on minimizing access

latency and has disregarded the bandwidth con-

sumption involved. In fact, the main rationale for

cooperative caching in distributed file systems is that

bandwidth is assumed to be abundant and hence can

be freely leveraged to improve response time per-

formance. In Web-oriented CDNs such as Akamai*,

optimizing the response time performance also

appears to be the central objective, while limiting the

bandwidth consumption only seems to be a secondary

aim. In contrast, for high-definition video objects with

sizes of a few gigabytes and hour-long durations, mini-

mizing the bandwidth usage is a far more relevant

objective than reducing the initial play-out delay by a

few hundred milliseconds.

Although the relevant performance metrics may

differ in various application scenarios, a strong concep-

tual similarity emerges when the notion of “access cost”

is adopted. This cost could represent either the addi-

tional latency incurred when fetching content from

remote caches or main memory or the bandwidth

consumed when retrieving content from a peer node or

video headend, depending on the scenario of interest.

Indeed, our mathematical formulation of the content

placement problem shows a strong resemblance to ear-

lier ones for which a wide range of approximation algo-

rithms have been proposed; see for instance [1, 3, 9–11,

15]. In contrast to [3, 4, 15], we focus on specific topolo-

gies, motivated by real system deployments, and prove

that the optimal solution of the relaxed integer pro-

gram has a simple structure. In addition, we use the

insight into the optimal structure to develop efficient

distributed content placement algorithms with far more

favorable performance ratios than the ones in [3, 4, 15].

The remainder of the paper is organized as fol-

lows. We present a detailed model description, give a

formal problem formulation, and show that under

certain symmetry assumptions the optimal content

placement has a rather simple structure. Next, we use

the knowledge of the optimal structure to develop

distributed content placement algorithms with tight

performance guarantees. We then present numerical

experiments to evaluate the performance of the pro-

posed content placement algorithms and offer our

conclusions.

Model Description, Problem Formulation, andStructure of Optimal Content Placement

We consider a cache cluster consisting of M “leaf”

nodes indexed 1, . . . , M, which are either directly

connected or connected indirectly via node 0 as a

common parent somewhere along the path to the

root node, as represented in Figure 1. The parent

node and leaf nodes are endowed with caches of sizes

B0, B1, . . . , BM, while the root node is endowed with

a cache of sufficient capacity to store all the content

items. We assume a static collection of N content items

of sizes s1, . . . , sn. Denote by din the demand for con-

tent item n in leaf node i � 1, . . . M, n � 1, . . . , N.

It is worth emphasizing that the cache cluster

need not be a standalone network but could in fact be

part of a larger hierarchical tree topology as illustrated

in Figure 2. For ease of operation, IPTV networks

tend to have a mostly hierarchical tree structure, but

they may also have some degree of logical connectiv-

ity among nodes within the same hierarchy level,

either directly or via a “U-turn” through a common

Page 4: Self-organizing algorithms for cache cooperation in content distribution networks

116 Bell Labs Technical Journal DOI: 10.1002/bltj

parent node. Cost analysis reveals that it rarely pays

off to install caches at more than one or two levels,

even if the network consists of four or five hierarchy

levels, as is commonly the case. Also, there are imple-

mentation issues involved in integrating caches with

other lower-layer devices at certain hierarchy levels.

Hence, the two-level cache cluster constitutes a fairly

canonical scenario that covers most cases of practical

interest. However, most of the structural properties

that we will obtain in fact extend to any number of

hierarchy levels.

Denote by c0 the unit cost incurred when trans-

ferring content from the root node to the parent node,

by ci the unit cost associated with transferring con-

tent from the parent node to leaf node i, and by cij

the unit cost incurred when transferring content from

leaf node j to leaf node i, i � j � 1, . . . M. We assume

Parent node

2 M1

Leaf nodes

Root node

Figure 1.Cache cluster with leaf nodes.

VHO

IO

DSLAM

CO

STB

CO—Central officeDSLAM—Digital subscriber line access multiplexerIO—Intermediate officeSTB—Set-top boxVHO—Video hub office

Figure 2.Cache cluster within a hierarchical tree topology.

Page 5: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 117

that cij � c0 � ci for all i � 1, . . . , M, i � j � 1, . . . , M;

i.e., it is cheaper to transfer content from a peer node

than from the root node.

From now on, we focus on the case where con-

tent can only be stored at the leaf nodes and not at the

parent node, i.e., B0 � 0, which we will refer to as

intra-level cache cooperation.

Let the 0–1 decision variable xin indicate whether

the n-th content item is stored in the cache of node i

or not, i � 1, . . . , M. The 0–1variable xijn indicates

whether requests for the n-th content item at node i

are served from the cache of node j or not, i � 1, . . .

M, j � –1, 1, . . . , M, with –1 representing the root

node. Note that the problem of minimizing the band-

width expenses is equivalent to the problem of maxi-

mizing the bandwidth savings. For convenience, we

will focus on the latter problem, which may be for-

mulated as:

with shorthand notation for .

The objective function represents the total bandwidth

savings achieved when requests for content item n at

node i are served from the cache of node j. The first

constraint states that the total size of the content items

stored in a particular node cannot exceed the cache

capacity of that node. The second constraint requires

that requests for a particular content item n can only

be served by node j if item n is stored in the cache of

node j. The third and final constraint precludes that

requests for a particular content item that is stored by

a node itself are also served by other nodes, in order

to prevent double counting of the bandwidth savings.

We now assume that the leaf nodes are symmet-

ric in terms of bandwidth cost, demand characteristics

ai�1

j�1� a

M

j�i�1aj� i

xin � aj� i

xijn � 1, i � 1, . . . , M, n � 1, . . . ,N

xijn � xjn, i � j � 1, . . . , M, n � 1, . . . ,N

s .t . aN

i�1

sn xin � Bi, i � 1, . . . , M

� aj� i

(c0 � ci � cij) xijn b(Pmax) max a

M

i�1aN

n�1

sn din a(c0 � ci)xin

and cache sizes, i.e., ci � c�, cij � c, din � dn and Bi � B

for all i � 1, . . . , M. The assumption of equal band-

width cost and cache sizes is a fairly natural one,

given the way IPTV networks tend to be configured.

The assumption of symmetric demands is more

restrictive, as the popularity of content items may

exhibit variation across different user communities.

In practice, however, the available demand estimates

may simply not have the required level of granular-

ity to reliably distinguish among all possible hetero-

geneous popularity characteristics. In the absence of

any fine-grained information, assuming homoge-

neous popularity characteristics is a reasonable

option. Furthermore, the optimal content placement

turns out to have a distinct and relatively simple

structure in case of symmetric demands. The knowl-

edge of the optimal structure offers valuable insight

for the design of low-complexity cache management

and replacement algorithms. As will be shown later,

these algorithms perform remarkably well in terms of

both worst-case guarantees and average metrics, even

when the actual demand characteristics are highly

asymmetric. This may be explained by the fact that

these distributed algorithms will cache locally popu-

lar items at the lowest levels of the network hierarchy

and thus automatically adapt to heterogeneous popu-

larity distributions. The flipside of the greedy,

myopic behavior is that it tends to reduce the scope

for gains at higher levels in case of asymmetric popu-

larities, but this only gives rise to a limited degree of

suboptimality.

Under the above symmetry assumptions, prob-

lem Pmax reduces to a so-called knapsack problem,

with a knapsack of size MB and 2N items of sizes

an � sn, aN � n � (M � 1)sn, n � 1, . . . , N. The value of

item n is c��� � M(c � c0)�(M � 1)c� while the value

of item N � n is (M � 1)c�. This may be interpreted as

follows. Note that storing item n in the first of the M

nodes yields a bandwidth savings of sndnc���, while stor-

ing it in each additional node yields a further band-

width savings of sndnc�.

The optimal solution of the knapsack problem has

a relatively simple structure, as stated in the next

lemma. For convenience, assume that the content

items are indexed in descending order of dn,

Page 6: Self-organizing algorithms for cache cooperation in content distribution networks

118 Bell Labs Technical Journal DOI: 10.1002/bltj

i.e., d1 � d2 � . . . � dn, and that the values of c���dn

and c�dn, n � 1, . . . , N, are all distinct.

Lemma 1. (Structure of Optimal Content Placement)

1. For some n*0, items 1, . . . n*

0 � 1 are fully repli-

cated at all leaf nodes, while item n*0 may be par-

tially replicated.

2. For some n*1 � n*

0, single copies of items n*0 �

1, . . . , n*1 � 1 are stored, while a partial copy of

item n*1 may be stored.

3. In fact, it cannot be the case that both item n*0 is

partially replicated and a partial copy of n*1 is

stored; i.e., there must be exactly one copy of item

n*0 or no copy of item n*

1.

The intuitive explanation for the above-described

structure follows from the interpretation as a knap-

sack problem. Specifically, in the optimal solution of a

continuous knapsack problem, one item at most may

be partially included, with all higher-value items fully

included, and all lower-value items excluded. We now

distinguish the two cases:

1. The partial item is an item n with size sn and value

c���, say item n*0; and

2. The partial item is an item n with size (M � 1)sn

and value (M � 1)c�, say n*1.

In case 1, items 1, . . . , n*0 � 1 are fully replicated,

while single copies of items n*0 � 1, . . . ,n*

1 � 1 are

stored, with n*1 � min {n : c�dn � c���dn0

.} In case 2,

items 1, . . . n*0 � 1 are fully replicated, with n*

0 � min

{n : c���dn � c�dn1.} while single copies of items n*

0, . . . ,

n*1 � 1 are stored.

Distributed Algorithms for Intra-Level CacheCooperation

In this section, we propose a simple distributed

algorithm, Local-Greedy, to achieve cooperative

caching for the case B0 � 0, i.e., no caching at the root.

For transparency, we assume unit-size content items,

i.e., sn � 1 for all n � 1, . . . , N, but the algorithms

and performance results easily extend to arbitrary

sizes. We will first consider the simple scenario of equal

cache sizes and symmetric demands and bandwidth

costs, where Local-Greedy achieves at least a fraction

3/4 of the maximum achievable bandwidth savings.

Next, we turn to the case of arbitrary cache sizes,

demand vectors, and inter-leaf bandwidth costs, where

the performance ratio of Local-Greedy only decreases

to 1/2 in the worst case. Furthermore, both these per-

formance guarantees are tight. While achieving the

globally optimal configuration would require each

node to communicate its local demand vector to every

other node, the Local-Greedy algorithm only requires

that each node have knowledge about the local and

aggregate demand for each item, as well as the total

number of nodes where each item is currently cached.

Because of space constraints, all proofs are skipped.

We first consider the symmetric case where Bi � B

and din � dn for all i � 1, . . . , M. Further, we assume

cij � c� i, j. Let d1 � d2 � . . . � dN. In the previous

section, we discussed the structure of the optimal

solution under the above assumptions. We now look

for local cache replacement algorithms that imple-

ment the globally optimal solution.

Algorithm: Local-GreedyAt the turn of node i, node i:

1. Associates utility Un � dnc��� with an item n if

item n is not cached anywhere in [M]\{i}.

2. Associates utility Vn � dnc� with an item n if

item n is cached somewhere in [M]\{i}.

3. Replaces its contents with B items of the

highest value.

It seems intuitively plausible that in the extremely

restrictive case of symmetric cache sizes, demands,

and bandwidth costs, the Local-Greedy algorithm

should be able to reach the globally optimal configu-

ration. However, this turns out not always to be the

case, and there exist demand vectors for which Local-

Greedy may not achieve more than a fraction 3/4 of

the maximum bandwidth savings.

Claim 1. The Local-Greedy algorithm does not always

reach the globally optimal configuration.

The next theorem shows that a performance ratio

of 3/4 is in fact the worst the Local-Greedy algorithm

can do.

Theorem 1. For equal cache sizes and symmetric

demands and bandwidth costs, the Local-Greedy algo-

rithm achieves at least a fraction 3/4 of the maximum

bandwidth savings.

We now turn to the case of arbitrary cache sizes,

demand vectors and bandwidth costs.

Page 7: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 119

Theorem 2. The Local-Greedy algorithm achieves at

least a fraction 1/2 of the maximum bandwidth sav-

ings, independent of the cache sizes, demand vectors,

and bandwidth costs.

The next proposition gives a matching lower

bound for the performance ratio of Local-Greedy. The

lower bound is in fact tight even when the cache sizes

and the bandwidth costs are symmetric. Therefore,

the loss in performance is essentially due to asym-

metric demand vectors.

Numerical ExperimentsIn this section, we present the numerical experi-

ments that we have conducted to evaluate the per-

formance of the proposed algorithms. Throughout,

we assume that there are M � 10 leaf nodes, each

endowed with a cache of size B. The parent node has

a cache of size B0, which may or may not be zero,

while the root node is endowed with a cache of suf-

ficient capacity to store all the content items. The

parameters c0, c, and c� represent the unit cost incurred

when transferring content from the root node to the

parent node, from the parent node to one of the leaf

nodes, and between any pair of leaf nodes, respec-

tively. In all the experiments we assume c0 � 2, c � 1,

and c� � 1.

The system offers a collection of N � 10,000 con-

tent items. For convenience, we assume the content

items to have a common size of S � 2 GB, although

this is not particularly essential for the results. For

compactness, denote by K B/S the number of con-

tent items that can be stored in each of the leaf nodes.

Each of the leaf nodes receives an average of one

request every 160 seconds; i.e., the aggregate request

rate per leaf node is v � 0.00625 sec�1. We assume

that the popularity of the various content items is gov-

erned by a Zipf-Mandelbrot distribution with shape

parameter and shift parameter q; i.e., the relative

popularity of the n-th most popular item is:

with normalization constant

H(a, q, N) � aN

n�1

1

(q � n)a

pn � H�1(a, q, N)1

(q � n)a, n � 1, . . . , N,

The request rate for the n-th most popular item at

each of the leaf nodes is dn � pnv, with � the total

request rate per leaf node defined above.

Below we present two sets of experimental

results. In the first set of results, we assess the gains

from cooperative caching and demonstrate that judi-

cious replication can yield significant bandwidth sav-

ings, even with small cache sizes. In the second set of

experiments, we examine the performance of the

Local-Greedy algorithm and show that it operates

close to globally optimal performance.

Gains from Cooperative CachingIn order to quantify the gains from cooperative

caching, we compare the minimum bandwidth cost as

characterized by the optimal solution of Pmax with the

bandwidth cost incurred in two other scenarios: 1) full

replication, where each leaf node stores the K most

popular content items, and 2) no replication, where

only a single copy of the MK most popular content

items is stored in one of the leaf nodes. The bandwidth

costs for these three scenarios are compared as a func-

tion of the shape parameter a of the Zipf-Mandelbrot

distribution with the shift parameter fixed at q � 10.

Note that without caching the bandwidth cost would

be MvS(c � c0) � 10 � 0.00625 � 2 � 3 � 0.375 Gb/s,

which amounts to 3 Gb/s.

Figure 3 shows the results for cache sizes

B � 200 GB, B � 1 TB, and B � 2 TB, respectively.

Figure 3d shows similar results for a cache size B � 1

TB, where, in addition, the parent node has a cache of

size B0 � 2 TB. As the results demonstrate, the band-

width costs markedly decrease with increasing values

of a, reflecting the well-known fact that caching effec-

tiveness improves as the popularity distribution gets

steeper. Even when B � 200 GB, which means that

the collective cache space can hold no more than 10

percent of the total content, caching reduces the

bandwidth costs to a fraction of what they would have

been in the absence of caching, as long as the value of

a is not too low.

It is worth observing that the best choice between

either full or zero replication is often not much worse

than the optimal content placement. However, for

low values of a, full replication is substantially worse

than the optimal content placement and may incur

Page 8: Self-organizing algorithms for cache cooperation in content distribution networks

120 Bell Labs Technical Journal DOI: 10.1002/bltj

up to three times higher bandwidth costs, depending

on the cache size, B. For higher values of , on the

other hand, zero replication is significantly worse than

the optimum and may incur 10-fold the bandwidth

costs, even though these are still moderate in absolute

terms. In conclusion, neither full nor zero replication

performs well across the entire range of values, and

it is crucial to adjust the degree of replication based on

the steepness of the popularity distribution. In case

of the Zipf-Mandelbrot distribution, one could in prin-

ciple calculate the optimal degree of replication as a

function of the shape parameter a. In practice, how-

ever, the value of a may not be so easy to obtain, or

the popularity statistics may not even conform to a

Zipf-Mandelbrot distribution altogether. Hence it

makes more sense to adjust the degree of replication

based on the actual observed demands for the indi-

vidual content items, rather than fitting a particular

0 1 2

Shape parameter alpha

0

2

4B

and

wid

th c

ost

(G

bp

s)Full replicationNo replicationOptimal configuration

0

2

4

Ban

dw

idth

co

st (

Gb

ps)

Full replicationNo replicationOptimal configuration

0

2

4

Ban

dw

idth

co

st (

Gb

ps)

Full replicationNo replicationOptimal configuration

0

2

4

Ban

dw

idth

co

st (

Gb

ps)

Full replicationNo replicationOptimal configuration

(a) B � 200 GB

0 1 2

Shape parameter alpha

(b) B � 1 TB

0 1 2

Shape parameter alpha

(c) B � 2 TB

0 1 2

Shape parameter alpha

(d) B � 1 TB with parent node B0 � 2 TB

Figure 3.Results for cache sizes.

Page 9: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 121

distribution. The algorithms described in the previous

section do just that, without requiring any knowledge

of a or even relying on the popularity statistics obey-

ing a Zipf-Mandelbrot distribution.

Performance of the Local-Greedy AlgorithmWe now proceed to examine the performance of

the Local-Greedy algorithm.

Throughout, we assume that each of the leaf

nodes has a cache of size B � 1 TB, and that the popu-

larity of the various content items is governed by a

Zipf-Mandelbrot distribution with shape parameter

a � 0.8 and shift parameter q � 10.

In order to study the dynamic evolution of the

content placement, we conduct a simulation experi-

ment where the various leaf nodes receive requests

over time, sampled from the above-described popular-

ity distribution. Whenever a particular node receives

a request for an item that it has not presently stored,

it decides whether to cache it and if so, which cur-

rently stored item to evict, as prescribed by the Local-

Greedy algorithm.

In order to monitor the performance and conver-

gence, we track the objective value over time, and

compare it with the value of the optimal content

placement. In the optimal content placement, items 1

through 165 are fully replicated, and single copies of

items 166 through 3515 are stored. For the initial con-

tent placement, we distinguish three different sce-

narios:

1. Full replication, where each leaf node initially

stores the K most popular content items,

2. No replication, where initially only a single copy

of the KM most popular content items is stored

in one of the leaf nodes, and

3. Random, where each leaf node stores K content

items selected uniformly at random, independ-

ently of all other leaf nodes.

Figure 4 shows the performance of the Local-

Greedy algorithm. Figure 4a illustrates its perfor-

mance as a ratio of the optimal content placement. Note

that the objective value achieved by the algorithm

gets progressively closer to the optimum as the system

receives more requests and replaces content items

over time. After only 3000 requests (out of a total

number 10,000 content items) the Local-Greedy algo-

rithm has come to within 1 percent of the optimal

content placement and stays there, performing

markedly better than the worst-case ratio of 3/4 might

suggest.

While the algorithm seems to “converge” for all

three initial states, the scenario with no replication

appears to be the most favorable one. This may be

explained by the fact that in the optimal content

placement only items 1 through 165 are fully repli-

cated. It takes relatively few requests to replicate these

items starting from no replication, whereas it takes

far more requests to replace all the duplicates of items

166 through 500 by single copies of items 501

through 3515 starting from full replication.

The results as shown in Figure 4a assume a static

content population and fixed popularities. In the next

experiment, for which the results are presented in

Figure 4b through Figure 4d, we explore how dynamic

content ingestion and aging affect the performance of

the Local-Greedy algorithm. These phenomena have a

strong impact on caching effectiveness and, in particu-

lar, degrade the performance of conventional algo-

rithms such as least recently used (LRU) and least

frequently used (LFU). As a point of reference, the over-

all popularity law is assumed to remain the same over

time, with the same shape parameter a and the same

shift parameter q, in the sense that the relative popu-

larity of the n-th most popular active item continues to

be pn. However, the relative popularity of a given item

decays over time, and we specifically assume that the

rank of each item is decremented by 1 after every R

requests received in the system. In other words, for

every request that the system receives, the ranks of

all the items are decremented by 1/R, which may be

interpreted as a measure for the rate of aging. In order

to keep the overall popularity law the same, the bot-

tom item with rank N is supposed to be removed while

a new item with rank 1 is ingested after every R

requests received by the system. Equivalently, each

individual item receives an average of R requests over

its active lifetime.

Figures 4b through Figure 4d show the results for

various aging factors. In order not to clutter the fig-

ures, we only display the results for a random initial

Page 10: Self-organizing algorithms for cache cooperation in content distribution networks

122 Bell Labs Technical Journal DOI: 10.1002/bltj

content placement, those for either full or no initial

replication being similar. The Local-Greedy algorithm

continues to “converge” but now only comes to

within a certain margin from the “optimal” content

placement. A comparison of Figure 4b, Figure 4c, and

Figure 4d reveals that the optimality margin increases

with the rate of aging. This may be explained from the

fact that the constant churn requires the cache content

to be continuously refreshed to maintain optimality,

whereas previously the cache content gradually settled

into a fixed state as the optimum was approached.

The Local-Greedy algorithm is constrained in terms

of the rate at which it can make the required updates

by the requests that it receives and suffers more as

the required rate of updates grows relative to the

number of requests, i.e., as the aging rate increases.

By the same token, the “optimal” content placement

is granted an unfair advantage here, as the additional

0 2500 5000

Number of requests received

0

0.5

1Pe

rfo

rman

ce r

atio Full replication

No replicationRandom

0 2500 5000

Number of requests received

0

0.5

1

Perf

orm

ance

rat

io Full replicationNo replicationRandom

0 2500 5000

Number of requests received

0

0.5

1

Perf

orm

ance

rat

io Full replicationNo replicationRandom

0 2500 5000

Number of requests received

0

0.5

1

Perf

orm

ance

rat

io Full replicationNo replicationRandom

(a) Static popularity (b) Slow aging, R � 200

(c) Moderate aging, R � 100 (d) Fast aging, R � 20

Figure 4.Performance of the Local-Greedy algorithm at optimal content placement and with examples showing the impactof dynamic content ingestion and aging.

Page 11: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 123

bandwidth that is required to maintain optimality is

not accounted for. In other words, it only provides an

upper bound, and the Local-Greedy algorithm may

therefore in fact be closer to the true optimum than

the numbers suggest.

So far, we have assumed that the relative popu-

larities of the various content items are known exactly.

In practice, popularities are unlikely to be known

with perfect accuracy and will need to be estimated.

In order to examine the impact of estimation errors,

we conducted an experiment where the popularities

are no longer assumed to be known, but estimated

based on the actual received requests using a geo-

metrically smoothed average with some time constant

T as proxy for the length of the measurement window.

Figure 5 plots the results for various values of the time

constant. As is to be expected, the optimum is

approached more closely as the accuracy of the estimates

0 2500 5000

Number of requests received

0

1

2O

bje

ctiv

e va

lue

Optimum with exact knowledgeOptimum for given estimatesLocal greedy with estimates

0 2500 5000

Number of requests received

0

1

2

Ob

ject

ive

valu

e

Optimum with exact knowledgeOptimum for given estimatesLocal greedy with estimates

0 2500 5000

Number of requests received

0

1

2

Ob

ject

ive

valu

e

Optimum with exact knowledgeOptimum for given estimatesLocal greedy with estimates

(b) T � 1,000(a) T � 100

(c) T � 10,000 (d) T � 100,000

0 2500 5000

Number of requests received

0

1

2

Ob

ject

ive

valu

e

Optimum with exact knowledgeOptimum for given estimatesLocal greedy with estimates

Figure 5.Results for various values of time constant T.

Page 12: Self-organizing algorithms for cache cooperation in content distribution networks

124 Bell Labs Technical Journal DOI: 10.1002/bltj

improves for larger time constants, although the ini-

tial convergence is possibly somewhat slower. For

comparison purposes, the figure also shows what the

optimum objective value is for the given popularity

estimates. This provides an indication of the best that

any algorithm in general, and Local-Greedy in par-

ticular, can be expected to do, given available esti-

mates, and suggests that for larger time constants the

performance gap is mostly due to estimation errors.

The issue of popularity estimation is particularly

pertinent in the case of time-varying popularities due

to content aging and dynamic ingestion, as consid-

ered before. The results for that scenario combine the

effects observed in the previous two cases.

ConclusionWe have developed lightweight cooperative cache

management algorithms, which achieve close to glob-

ally optimal performance, with favorable worst-case

ratios, even in asymmetric scenarios. Numerical

results demonstrate that the typical performance is

far better than the worst-case guarantees suggest. The

numerical results also illustrate the complex interac-

tion between distributed cache cooperation and popu-

larity estimation in the presence of dynamic content

ingestion and aging. Finding optimal content place-

ment algorithms and performance bounds for time-

varying popularity statistics remains as a challenging

issue for further research.

AcknowledgementsThe authors gratefully acknowledge helpful dis-

cussions with Volker Hilt, Won Suck Lee, Ivica Rimac

and Iraj Saniee.

*TrademarksAkamai is a registered trademark of Akamai Tech-

nologies, Inc.YouTube is a registered trademark of Google, Inc.

References[1] B. Awerbuch, Y. Bartal, and A. Fiat, “Distributed

Paging for General Networks,” J. Algorithms,28:1 (1998), 67–104.

[2] B. Awerbuch and D. Peleg, “Online Tracking ofMobile Users,” J. ACM, 42:5 (1995), 1021–1058.

[3] I. D. Baev and R. Rajaraman, “ApproximationAlgorithms for Data Placement in ArbitraryNetworks,” Proc. 12th Annual ACM-SIAM

Symposium on Discrete Algorithms (SODA ‘01)(Washington, DC, 2001), pp. 661–670.

[4] I. Baev, R. Rajaraman, and C. Swamy,“Approximation Algorithms for Data PlacementProblems,” SIAM J. Comput., 38:4 (2008),1411–1429.

[5] M. D. Dahlin, R. Y. Wang, T. E. Anderson, andD. A. Patterson, “Cooperative Caching: UsingRemote Client Memory to Improve File SystemPerformance,” Proc. 1st USENIX Conf. onOperating Syst. Design and Implementation(OSDI ‘94) (Monterey, CA, 1994), article 19.

[6] L. W. Dowdy and D. V. Foster, “ComparativeModels of the File Assignment Problem,” ACMComput. Surveys, 14:2 (1982), 287–313.

[7] L. Fan, P. Cao, J. Almeida, and A. Z. Broder,“Summary Cache: A Scalable Wide-Area WebCache Sharing Protocol,” ACM SIGCOMMComput. Commun. Rev., 28:4 (1998), 254–265.

[8] C. Huang, J. Li, and K. W. Ross, “Can InternetVideo-on-Demand Be Profitable?,” ACMSIGCOMM Comput. Commun. Rev., 37:4(2007), 133–144.

[9] J. Kangasharju, J. Roberts, and K. W. Ross,“Object Replication Strategies in ContentDistribution Networks,” Comput. Commun.,25:4 (2002), 376–383.

[10] M. R. Korupolu, C. G. Plaxton, and R. Rajaraman, “Placement Algorithms forHierarchical Cooperative Caching,” J. Algorithms, 38:1 (2001), 260–302.

[11] A. Leff, J. L. Wolf, and P. S. Yu, “ReplicationAlgorithms in a Remote Caching Architecture,”IEEE Trans. Parallel and Distrib. Syst., 4:11(1993), 1185–1204.

[12] J. Ni and D. H. K. Tsang, “Large-ScaleCooperative Caching and Application-LevelMulticast in Multimedia Content DeliveryNetworks,” IEEE Commun. Mag., 43:5 (2005),98–105.

[13] C. G. Plaxton, R. Rajaraman, and A. W. Richa,“Accessing Nearby Copies of Replicated Objectsin a Distributed Environment,” Theory Comput.Syst., 32:3 (1999), 241–280.

[14] P. Sarkar and J. H. Hartman, “Hint-BasedCooperative Caching,” ACM Trans. Comput.Syst., 18:4 (2000), 387–419.

[15] C. Swamy, “Algorithms for the Data PlacementProblem,” 2004.

[16] M. van Steen, F. J. Hauck, and A. S.Tanenbaum, “A Model for Worldwide Trackingof Distributed Objects,” Proc. Telecommun.

Page 13: Self-organizing algorithms for cache cooperation in content distribution networks

DOI: 10.1002/bltj Bell Labs Technical Journal 125

Inform. Networking Architecture Conf. (TINA‘96) (Heidelberg, Ger., 1996), pp. 203–212.

[17] M. Verhoeyen, D. De Vleeschauwer, and D. Robinson, “Content Storage Architectures forBoosted IPTV Service,” Bell Labs Tech. J., 13:3(2008), 29–43.

[18] G. M. Voelker, E. J. Anderson, T. Kimbrel, M. J.Feeley, J. S. Chase, A. R. Karlin, and H. M.Levy, “Implementing Cooperative Prefetchingand Caching in a Globally-Managed MemorySystem,” ACM SIGMETRICS Perform.Evaluation Rev., 26:1 (1998), 33–43.

(Manuscript approved May 2009)

SEM BORST is a member of technical staff in the Mathematics of Networks and SystemsResearch Department at Alcatel-Lucent BellLabs in Murray Hill, New Jersey. He receivedan M.Sc. degree in applied mathematicsfrom the University of Twente and a Ph.D.

degree from the University of Tilburg, both located inthe Netherlands. His main research interests are in thearea of performance evaluation and resourceallocation algorithms for communication networks. Dr. Borst has authored over 100 publications injournals, conference proceedings, and book chaptersand serves or has served as a member of severaleditorial boards and program committees. He holds 12patents and has six patent applications pending in thearea of networking and wireless communications.

VARUN GUPTA is a Ph.D. candidate in the Computer Science Department at Carnegie MellonUniversity in Pittsburgh, Pennsylvania. Hereceived his B.Tech. degree in computerscience and engineering from IndianInstitute of Technology, Delhi, and was

awarded the university’s prestigious President’s GoldMedal in 2004. His research interests are in all aspectsof design and analysis of algorithms, distributedsystems, and stochastic modeling. His current focus ison performance modeling and analysis of schedulingpolicies for multi-server systems from a queuing-theoretic perspective.

ANWAR WALID is a distinguished member of technical staff in the Mathematics of Networks andSystems Research Department at Alcatel-Lucent Bell Labs in Murray Hill, New Jersey.He received a B.S. degree in electricalengineering from Polytechnic Institute of

New York, Brooklyn, and a Ph.D. degree in electricalengineering from Columbia University in New YorkCity. He developed theory and algorithms for resourcemanagement and quality of service (QoS) support forAT&T and Lucent Technologies products. He is an IEEEFellow and holds patents on congestion control,scheduling, and routing in asynchronous transfer mode(ATM) and Internet Protocol (IP)/multiprotocol labelswitching (MPLS) networks. He received a Best PaperAward from ACM Sigmetrics on multimedia trafficmodeling. He has co-authored IETF RFCs in the MPLSand Traffic Engineering Working Groups and hasserved on the executive and technical programcommittees of IEEE and MPLS conferences. He is amember of Tau Beta Pi (National Engineering HonorSociety, and the International Federation forInformation Processing (IFIP) Working Group 7.3. ◆