Self-Organizing Algorithms for Cache Cooperation in Content Distribution Networks
Sem Borst, Varun Gupta, and Anwar Walid

The delivery of video content is expected to gain huge momentum, fueled by the popularity of user-generated clips, growth of video-on-demand (VoD) libraries, and widespread deployment of Internet Protocol television (IPTV) services with features such as CatchUp/PauseLive TV and network personal video recorder (NPVR) capabilities. The "time-shifted" nature of these personalized applications defies the broadcast paradigm underlying conventional TV networks and increases the overall bandwidth demands by orders of magnitude. Caching strategies provide an effective mechanism for mitigating these massive bandwidth requirements by storing copies of the most popular content closer to the network edge, rather than keeping it in a central site. The reduction in the traffic load lessens the required transport capacity and capital expense and alleviates performance bottlenecks. In this paper, we develop lightweight cooperative cache management algorithms aimed at maximizing the traffic volume served from cache and minimizing the bandwidth cost. As a canonical scenario, we focus on a cluster of distributed caches and show that under certain symmetry assumptions, the optimal content placement has a rather simple structure. Besides being interesting in its own right, the optimal structure offers valuable guidance for the design of low-complexity cache management and replacement algorithms. We establish that the proposed algorithms are guaranteed to operate within a constant factor from the globally optimal performance, with benign worst-case ratios, even in asymmetric scenarios. Numerical experiments reveal that typical performance is far better than the worst-case conditions indicate. © 2009 Alcatel-Lucent.
Bell Labs Technical Journal 14(3), 113–126 (2009) © 2009 Alcatel-Lucent. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com) • DOI: 10.1002/bltj.20391

Introduction
The delivery of digital video content is anticipated to show tremendous growth over the next several years, driven by the huge popularity of user-generated video clips and the expansion of video-on-demand (VoD) libraries. It is estimated that YouTube* alone attracts tens or even hundreds of millions of viewers a day, generating around 2000 terabytes (TB) of traffic. While the daily number of VoD users is unlikely to be that high, the size of a high-definition movie dwarfs that of a typical video clip, and just one user requesting one movie a month would approach the bandwidth demands of 10 users watching 20 clips a month.
Even stronger growth is likely to be fueled by the
proliferation of Internet Protocol television (IPTV)
services with personalized features such as
CatchUp/PauseLive TV and network personal video
recorder (NPVR) capabilities. From a functional per-
spective, NPVR services are similar to the use of a digi-
tal video recorder (DVR) for recording TV programs,
the difference being that the content is physically
stored on (and played back from) a device in the ser-
vice provider’s network rather than in the user’s
home. Features such as CatchUp/PauseLive TV allow
users to watch and pause TV programs any time they
like, thus offering similar functionality to what digital
video disc (DVD) players provide.
The common characteristic of these “time-shifted”
TV services, as well as VoD libraries and sites like
YouTube, is that users can select from a huge collec-
tion of content material at any time they want. This is
a radical departure from conventional TV networks,
where users can only tune in to a limited number of
channels at any given time. As a result, there can be
as many different play-out sessions as there are active
viewers, which could be in the hundreds of thousands
in a major metropolitan area. This runs counter to
the broadcast paradigm embraced in conventional TV
networks and raises a need for unicast sessions,
increasing the overall bandwidth demands by orders
of magnitude.
Caching strategies provide an effective mecha-
nism for mitigating the massive bandwidth require-
ments associated with large-scale distribution of
personalized high-definition video content [17]. In
essence, caching strategies exploit storage capacity to
absorb traffic by storing copies of the most popular
content closer to the network edge rather than keeping
it in a central location. The reduction in the traffic
load translates into a smaller required transport capac-
ity and capital expense as well as fewer performance
bottlenecks, thus enabling better service quality at a
lower price [8], e.g., short and predictable download
times. The scope for cost savings and performance
benefits from caching increases as the cost of memory
continues to drop at a higher rate than that of trans-
mission gear.
The design of efficient caching strategies involves
a broad range of interrelated problems, such as accu-
rate prediction of demand, intelligent content place-
ment, and optimal dimensioning of caches. In the
context of content delivery networks, these problems
inherently entail strong distributed and spatial fea-
tures, since the caches are physically scattered across
the network and the user requests are generated in
geographically dispersed nodes. These facets add con-
tent look-up and request routing as major additional
problem dimensions and severely complicate the relia-
ble estimation of demand and efficient content place-
ment and eviction. In tracking popularity, for instance,
we face a fundamental trade-off between averaging
over larger user populations to reduce sampling error
and capturing possible variations across heterogeneous
user communities. In view of implementation consid-
erations, we further prefer low-complexity content
placement algorithms that operate in a mostly distrib-
uted fashion and yet implicitly coordinate their actions
so as to approach globally optimal performance.
Motivated by the issues above, we aim to devise
lightweight cooperative content placement algorithms
so as to maximize the traffic volume served from cache
and thus minimize the bandwidth cost. The bandwidth
cost need not be actual monetary expenses, but could
also represent some metric reflecting the congestion
levels on the various network links, like link weights
in Open Shortest Path First (OSPF) for example. We
assume that demand estimates are given, and that
suitable mechanisms for online content look-up and
request routing are available, e.g., the distributed
directory services in [2, 7, 13, 16].
Panel 1. Abbreviations, Acronyms, and Terms
CDN—Content distribution network
DVD—Digital video disc
DVR—Digital video recorder
IPTV—Internet Protocol television
LFU—Least frequently used
LRU—Least recently used
NPVR—Network personal video recorder
OSPF—Open Shortest Path First
VoD—Video-on-demand
As a canonical scenario, we focus on a cluster of
distributed caches, either connected directly or via a
parent node, for which we show that under certain
symmetry assumptions, the optimal content place-
ment has a rather simple structure. Besides being
interesting in its own right, knowledge of the optimal
structure offers useful insight for the design of low-
complexity cooperative content placement and evic-
tion algorithms. We propose algorithms that are
guaranteed to operate within a constant factor from
the globally optimal performance while requiring only
limited knowledge of global popularities, with benign
worst-case ratios, even in asymmetric scenarios.
Numerical experiments for typical topologies and pop-
ularity distributions demonstrate that actual per-
formance is far better than the worst-case conditions
suggest.
The benefits of cooperative caching have been
investigated previously in the setting of distributed
file systems, starting with the taxonomy in [6], as well
as large-scale information systems and Web-oriented
content distribution networks (CDNs). Prior studies
in these contexts include simulation experiments
[5, 7, 12, 14], prototypes [18], and analytical results
and algorithms [1, 3, 4, 9–11, 15]. Most of the above-
mentioned work has focused on minimizing access
latency and has disregarded the bandwidth con-
sumption involved. In fact, the main rationale for
cooperative caching in distributed file systems is that
bandwidth is assumed to be abundant and hence can
be freely leveraged to improve response time per-
formance. In Web-oriented CDNs such as Akamai*,
optimizing the response time performance also
appears to be the central objective, while limiting the
bandwidth consumption only seems to be a secondary
aim. In contrast, for high-definition video objects with
sizes of a few gigabytes and hour-long durations, mini-
mizing the bandwidth usage is a far more relevant
objective than reducing the initial play-out delay by a
few hundred milliseconds.
Although the relevant performance metrics may
differ in various application scenarios, a strong concep-
tual similarity emerges when the notion of “access cost”
is adopted. This cost could represent either the addi-
tional latency incurred when fetching content from
remote caches or main memory or the bandwidth
consumed when retrieving content from a peer node or
video headend, depending on the scenario of interest.
Indeed, our mathematical formulation of the content
placement problem shows a strong resemblance to ear-
lier ones for which a wide range of approximation algo-
rithms have been proposed; see for instance [1, 3, 9–11,
15]. In contrast to [3, 4, 15], we focus on specific topolo-
gies, motivated by real system deployments, and prove
that the optimal solution of the relaxed integer pro-
gram has a simple structure. In addition, we use the
insight into the optimal structure to develop efficient
distributed content placement algorithms with far more
favorable performance ratios than the ones in [3, 4, 15].
The remainder of the paper is organized as fol-
lows. We present a detailed model description, give a
formal problem formulation, and show that under
certain symmetry assumptions the optimal content
placement has a rather simple structure. Next, we use
the knowledge of the optimal structure to develop
distributed content placement algorithms with tight
performance guarantees. We then present numerical
experiments to evaluate the performance of the pro-
posed content placement algorithms and offer our
conclusions.
Model Description, Problem Formulation, and Structure of Optimal Content Placement
We consider a cache cluster consisting of M "leaf" nodes indexed 1, . . . , M, which are either directly connected or connected indirectly via node 0 as a common parent somewhere along the path to the root node, as represented in Figure 1. The parent node and leaf nodes are endowed with caches of sizes B_0, B_1, . . . , B_M, while the root node is endowed with a cache of sufficient capacity to store all the content items. We assume a static collection of N content items of sizes s_1, . . . , s_N. Denote by d_in the demand for content item n in leaf node i, i = 1, . . . , M, n = 1, . . . , N.
It is worth emphasizing that the cache cluster
need not be a standalone network but could in fact be
part of a larger hierarchical tree topology as illustrated
in Figure 2. For ease of operation, IPTV networks
tend to have a mostly hierarchical tree structure, but
they may also have some degree of logical connectiv-
ity among nodes within the same hierarchy level,
either directly or via a “U-turn” through a common
parent node. Cost analysis reveals that it rarely pays
off to install caches at more than one or two levels,
even if the network consists of four or five hierarchy
levels, as is commonly the case. Also, there are imple-
mentation issues involved in integrating caches with
other lower-layer devices at certain hierarchy levels.
Hence, the two-level cache cluster constitutes a fairly
canonical scenario that covers most cases of practical
interest. However, most of the structural properties
that we will obtain in fact extend to any number of
hierarchy levels.
Denote by c_0 the unit cost incurred when transferring content from the root node to the parent node, by c_i the unit cost associated with transferring content from the parent node to leaf node i, and by c_ij the unit cost incurred when transferring content from leaf node j to leaf node i, i ≠ j = 1, . . . , M. We assume

Figure 1. Cache cluster with a root node, a parent node, and leaf nodes 1, 2, . . . , M.
Figure 2. Cache cluster within a hierarchical tree topology (VHO—video hub office; IO—intermediate office; CO—central office; DSLAM—digital subscriber line access multiplexer; STB—set-top box).
that c_ij < c_0 + c_i for all i = 1, . . . , M, i ≠ j = 1, . . . , M; i.e., it is cheaper to transfer content from a peer node than from the root node.
From now on, we focus on the case where content can only be stored at the leaf nodes and not at the parent node, i.e., B_0 = 0, which we will refer to as intra-level cache cooperation.

Let the 0–1 decision variable x_in indicate whether the n-th content item is stored in the cache of node i or not, i = 1, . . . , M. The 0–1 variable x_ijn indicates whether requests for the n-th content item at node i are served from the cache of node j or not, i = 1, . . . , M, j = −1, 1, . . . , M, with −1 representing the root node. Note that the problem of minimizing the bandwidth expenses is equivalent to the problem of maximizing the bandwidth savings. For convenience, we
will focus on the latter problem, which may be formulated as:

(P_max)  max Σ_{i=1}^{M} Σ_{n=1}^{N} s_n d_in [ (c_0 + c_i) x_in + Σ_{j≠i} (c_0 + c_i − c_ij) x_ijn ]

s.t.  Σ_{n=1}^{N} s_n x_in ≤ B_i,  i = 1, . . . , M

      x_ijn ≤ x_jn,  i ≠ j = 1, . . . , M,  n = 1, . . . , N

      x_in + Σ_{j≠i} x_ijn ≤ 1,  i = 1, . . . , M,  n = 1, . . . , N

with Σ_{j≠i} shorthand notation for Σ_{j=1}^{i−1} + Σ_{j=i+1}^{M}.

The objective function represents the total bandwidth savings achieved when requests for content item n at node i are served from the cache of node j. The first constraint states that the total size of the content items stored in a particular node cannot exceed the cache capacity of that node. The second constraint requires that requests for a particular content item n can only be served by node j if item n is stored in the cache of node j. The third and final constraint precludes that requests for a particular content item that is stored by a node itself are also served by other nodes, in order to prevent double counting of the bandwidth savings.

We now assume that the leaf nodes are symmetric in terms of bandwidth cost, demand characteristics, and cache sizes, i.e., c_i = c̄, c_ij = c, d_in = d_n, and B_i = B
for all i = 1, . . . , M. The assumption of equal band-
width cost and cache sizes is a fairly natural one,
given the way IPTV networks tend to be configured.
The assumption of symmetric demands is more
restrictive, as the popularity of content items may
exhibit variation across different user communities.
In practice, however, the available demand estimates
may simply not have the required level of granular-
ity to reliably distinguish among all possible hetero-
geneous popularity characteristics. In the absence of
any fine-grained information, assuming homoge-
neous popularity characteristics is a reasonable
option. Furthermore, the optimal content placement
turns out to have a distinct and relatively simple
structure in case of symmetric demands. The knowl-
edge of the optimal structure offers valuable insight
for the design of low-complexity cache management
and replacement algorithms. As will be shown later,
these algorithms perform remarkably well in terms of
both worst-case guarantees and average metrics, even
when the actual demand characteristics are highly
asymmetric. This may be explained by the fact that
these distributed algorithms will cache locally popu-
lar items at the lowest levels of the network hierarchy
and thus automatically adapt to heterogeneous popu-
larity distributions. The flipside of the greedy,
myopic behavior is that it tends to reduce the scope
for gains at higher levels in case of asymmetric popu-
larities, but this only gives rise to a limited degree of
suboptimality.
Under the above symmetry assumptions, problem P_max reduces to a so-called knapsack problem, with a knapsack of size MB and 2N items of sizes a_n = s_n, a_{N+n} = (M − 1)s_n, n = 1, . . . , N. The value of item n is d_n c′, where c′ := M(c̄ + c_0) − (M − 1)c, while the value of item N + n is (M − 1)c d_n. This may be interpreted as follows. Note that storing item n in the first of the M nodes yields a bandwidth savings of s_n d_n c′, while storing it in each additional node yields a further bandwidth savings of s_n d_n c.
The optimal solution of the knapsack problem has a relatively simple structure, as stated in the next lemma. For convenience, assume that the content items are indexed in descending order of d_n, i.e., d_1 > d_2 > . . . > d_N, and that the values of c′d_n and c d_n, n = 1, . . . , N, are all distinct.

Lemma 1. (Structure of Optimal Content Placement)
1. For some n*_0, items 1, . . . , n*_0 − 1 are fully replicated at all leaf nodes, while item n*_0 may be partially replicated.
2. For some n*_1 ≥ n*_0, single copies of items n*_0 + 1, . . . , n*_1 − 1 are stored, while a partial copy of item n*_1 may be stored.
3. In fact, it cannot be the case that both item n*_0 is partially replicated and a partial copy of item n*_1 is stored; i.e., there must be exactly one copy of item n*_0 or no copy of item n*_1.
The intuitive explanation for the above-described
structure follows from the interpretation as a knap-
sack problem. Specifically, in the optimal solution of a
continuous knapsack problem, one item at most may
be partially included, with all higher-value items fully
included, and all lower-value items excluded. We now
distinguish the two cases:
1. The partial item is an item n with size s_n and value d_n c′, say item n*_0; and
2. The partial item is an item n with size (M − 1)s_n and value (M − 1)c d_n, say item n*_1.
In case 1, items 1, . . . , n*_0 − 1 are fully replicated, while single copies of items n*_0 + 1, . . . , n*_1 − 1 are stored, with n*_1 = min{n : c d_n < c′ d_{n*_0}}. In case 2, items 1, . . . , n*_0 − 1 are fully replicated, with n*_0 = min{n : c′ d_n < c d_{n*_1}}, while single copies of items n*_0, . . . , n*_1 − 1 are stored.
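The knapsack reduction above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function name and signature are our own, and unit-size items (s_n = 1) are assumed.

```python
def optimal_placement(d, M, B, c0, cbar, c):
    """Greedy fractional-knapsack fill for the symmetric case with unit-size
    items: returns the number of cached copies of each item (fractional for
    at most one item), given the common per-node demand vector d, sorted in
    descending order."""
    cprime = M * (cbar + c0) - (M - 1) * c   # value multiplier of a first copy
    # Knapsack items: the first copy of item n (value d_n*c', weight 1) and
    # the bundle of its remaining M-1 copies (value (M-1)*c*d_n, weight M-1).
    entries = []
    for n, dn in enumerate(d):
        entries.append((cprime * dn, 1, n))           # first copy
        entries.append(((M - 1) * c * dn, M - 1, n))  # extra-copies bundle
    # Fill the knapsack of size M*B in decreasing order of value density.
    # Since c' > c, a first copy always outranks its own bundle of extras.
    entries.sort(key=lambda e: e[0] / e[1], reverse=True)
    copies = [0.0] * len(d)
    capacity = M * B
    for value, weight, n in entries:
        if capacity <= 0:
            break
        take = min(weight, capacity)   # may be a partial bundle
        copies[n] += take              # unit sizes: slots taken == copies
        capacity -= take
    return copies
```

Consistent with Lemma 1, the resulting vector consists of a prefix of fully replicated items, at most one partially replicated item, a run of single copies, and zeros.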
Distributed Algorithms for Intra-Level Cache Cooperation
In this section, we propose a simple distributed algorithm, Local-Greedy, to achieve cooperative caching for the case B_0 = 0, i.e., no caching at the parent node. For transparency, we assume unit-size content items, i.e., s_n = 1 for all n = 1, . . . , N, but the algorithms and performance results easily extend to arbitrary
sizes. We will first consider the simple scenario of equal
cache sizes and symmetric demands and bandwidth
costs, where Local-Greedy achieves at least a fraction
3/4 of the maximum achievable bandwidth savings.
Next, we turn to the case of arbitrary cache sizes,
demand vectors, and inter-leaf bandwidth costs, where
the performance ratio of Local-Greedy only decreases
to 1/2 in the worst case. Furthermore, both these per-
formance guarantees are tight. While achieving the
globally optimal configuration would require each
node to communicate its local demand vector to every
other node, the Local-Greedy algorithm only requires
that each node have knowledge about the local and
aggregate demand for each item, as well as the total
number of nodes where each item is currently cached.
Because of space constraints, all proofs are omitted.
We first consider the symmetric case where B_i = B and d_in = d_n for all i = 1, . . . , M. Further, we assume c_ij = c for all i, j. Let d_1 ≥ d_2 ≥ . . . ≥ d_N. In the previous
section, we discussed the structure of the optimal
solution under the above assumptions. We now look
for local cache replacement algorithms that imple-
ment the globally optimal solution.
Algorithm: Local-Greedy
At the turn of node i, node i:
1. Associates utility U_n = d_n c′ with an item n if item n is not cached anywhere in [M]\{i}.
2. Associates utility V_n = d_n c with an item n if item n is cached somewhere in [M]\{i}.
3. Replaces its contents with the B items of highest utility.
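As a concrete illustration, one turn of Local-Greedy might look as follows in Python. This is a hypothetical sketch with unit-size items; the data structures and names are our own, not from the paper.

```python
def local_greedy_turn(i, cached, d, B, c, cprime):
    """One turn of node i: rank all items by utility and keep the B best.
    `cached` maps each node to the set of items it currently holds, and
    `d` maps each item to its demand estimate."""
    # Items held by any node other than i, i.e., somewhere in [M]\{i}.
    elsewhere = set()
    for j, contents in cached.items():
        if j != i:
            elsewhere |= contents

    def utility(n):
        # U_n = d_n * c' if no other node caches n, V_n = d_n * c otherwise.
        return d[n] * (c if n in elsewhere else cprime)

    ranked = sorted(d, key=utility, reverse=True)
    cached[i] = set(ranked[:B])
```

Note that each node needs only the demand estimates and knowledge of which items are cached elsewhere, matching the limited-information requirement discussed above.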
It seems intuitively plausible that in the extremely
restrictive case of symmetric cache sizes, demands,
and bandwidth costs, the Local-Greedy algorithm
should be able to reach the globally optimal configu-
ration. However, this turns out not always to be the
case, and there exist demand vectors for which Local-
Greedy may not achieve more than a fraction 3/4 of
the maximum bandwidth savings.
Claim 1. The Local-Greedy algorithm does not always
reach the globally optimal configuration.
The next theorem shows that a performance ratio
of 3/4 is in fact the worst the Local-Greedy algorithm
can do.
Theorem 1. For equal cache sizes and symmetric
demands and bandwidth costs, the Local-Greedy algo-
rithm achieves at least a fraction 3/4 of the maximum
bandwidth savings.
We now turn to the case of arbitrary cache sizes,
demand vectors and bandwidth costs.
Theorem 2. The Local-Greedy algorithm achieves at
least a fraction 1/2 of the maximum bandwidth sav-
ings, independent of the cache sizes, demand vectors,
and bandwidth costs.
The next proposition gives a matching lower
bound for the performance ratio of Local-Greedy. The
lower bound is in fact tight even when the cache sizes
and the bandwidth costs are symmetric. Therefore,
the loss in performance is essentially due to asym-
metric demand vectors.
Numerical Experiments
In this section, we present the numerical experiments that we have conducted to evaluate the performance of the proposed algorithms. Throughout,
we assume that there are M = 10 leaf nodes, each endowed with a cache of size B. The parent node has a cache of size B_0, which may or may not be zero, while the root node is endowed with a cache of sufficient capacity to store all the content items. The parameters c_0, c̄, and c represent the unit cost incurred when transferring content from the root node to the parent node, from the parent node to one of the leaf nodes, and between any pair of leaf nodes, respectively. In all the experiments we assume c_0 = 2, c̄ = 1, and c = 1.
The system offers a collection of N = 10,000 content items. For convenience, we assume the content items to have a common size of S = 2 GB, although this is not particularly essential for the results. For compactness, denote by K = B/S the number of content items that can be stored in each of the leaf nodes. Each of the leaf nodes receives an average of one request every 160 seconds; i.e., the aggregate request rate per leaf node is ν = 0.00625 s⁻¹. We assume
that the popularity of the various content items is governed by a Zipf-Mandelbrot distribution with shape parameter α and shift parameter q; i.e., the relative popularity of the n-th most popular item is

p_n = H(α, q, N)^{−1} (q + n)^{−α},  n = 1, . . . , N,

with normalization constant

H(α, q, N) = Σ_{n=1}^{N} (q + n)^{−α}.

The request rate for the n-th most popular item at each of the leaf nodes is d_n = p_n ν, with ν the total request rate per leaf node defined above.
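The popularity model is easy to reproduce. The sketch below is our own helper, using the parameter names above, and computes the popularities p_n and per-node request rates d_n.

```python
def zipf_mandelbrot(N, alpha, q):
    """Relative popularities p_n proportional to (q + n)^(-alpha), n = 1..N,
    normalized by H(alpha, q, N) so that they sum to one."""
    weights = [(q + n) ** (-alpha) for n in range(1, N + 1)]
    H = sum(weights)
    return [w / H for w in weights]

# Parameters from the experiments: N = 10,000 items, one request per 160 s.
p = zipf_mandelbrot(N=10_000, alpha=0.8, q=10)
nu = 1 / 160                      # aggregate request rate per leaf node (1/s)
d = [nu * p_n for p_n in p]       # per-item request rate at each leaf node
```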
Below we present two sets of experimental
results. In the first set of results, we assess the gains
from cooperative caching and demonstrate that judi-
cious replication can yield significant bandwidth sav-
ings, even with small cache sizes. In the second set of
experiments, we examine the performance of the
Local-Greedy algorithm and show that it operates
close to globally optimal performance.
Gains from Cooperative Caching
In order to quantify the gains from cooperative
caching, we compare the minimum bandwidth cost as characterized by the optimal solution of P_max with the bandwidth cost incurred in two other scenarios: 1) full replication, where each leaf node stores the K most popular content items, and 2) no replication, where only a single copy of the MK most popular content items is stored in one of the leaf nodes. The bandwidth costs for these three scenarios are compared as a function of the shape parameter α of the Zipf-Mandelbrot distribution with the shift parameter fixed at q = 10. Note that without caching the bandwidth cost would be MνS(c̄ + c_0) = 10 × 0.00625 × 2 × 3 = 0.375 GB/s, which amounts to 3 Gb/s.
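As a quick sanity check on the baseline figure, using the parameters above (our own back-of-the-envelope script):

```python
# No-caching baseline: every request travels root -> parent -> leaf.
M = 10           # leaf nodes
nu = 0.00625     # requests per second per leaf node
S = 2            # item size in GB
c0, cbar = 2, 1  # unit costs root->parent and parent->leaf

cost_GB_per_s = M * nu * S * (c0 + cbar)
print(cost_GB_per_s)          # 0.375 GB/s
print(cost_GB_per_s * 8)      # 3.0 Gb/s at 8 bits per byte
```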
Figure 3 shows the results for cache sizes B = 200 GB, B = 1 TB, and B = 2 TB, respectively. Figure 3d shows similar results for a cache size B = 1 TB, where, in addition, the parent node has a cache of size B_0 = 2 TB. As the results demonstrate, the bandwidth costs decrease markedly with increasing values of α, reflecting the well-known fact that caching effectiveness improves as the popularity distribution gets steeper. Even when B = 200 GB, which means that the collective cache space can hold no more than 10 percent of the total content, caching reduces the bandwidth costs to a fraction of what they would have been in the absence of caching, as long as the value of α is not too low.
It is worth observing that the better of full and zero replication is often not much worse than the optimal content placement. However, for low values of α, full replication is substantially worse than the optimal content placement and may incur up to three times higher bandwidth costs, depending on the cache size B. For higher values of α, on the other hand, zero replication is significantly worse than the optimum and may incur 10-fold the bandwidth costs, even though these are still moderate in absolute terms. In conclusion, neither full nor zero replication performs well across the entire range of α values, and it is crucial to adjust the degree of replication based on the steepness of the popularity distribution. In the case of the Zipf-Mandelbrot distribution, one could in principle calculate the optimal degree of replication as a function of the shape parameter α. In practice, however, the value of α may not be so easy to obtain, or the popularity statistics may not conform to a Zipf-Mandelbrot distribution at all. Hence it makes more sense to adjust the degree of replication based on the actual observed demands for the individual content items, rather than fitting a particular
Figure 3. Bandwidth cost (Gb/s) versus shape parameter α, for full replication, no replication, and the optimal configuration: (a) B = 200 GB, (b) B = 1 TB, (c) B = 2 TB, (d) B = 1 TB with parent node B_0 = 2 TB.
distribution. The algorithms described in the previous section do just that, without requiring any knowledge of α or even relying on the popularity statistics obeying a Zipf-Mandelbrot distribution.
Performance of the Local-Greedy Algorithm
We now proceed to examine the performance of the Local-Greedy algorithm. Throughout, we assume that each of the leaf nodes has a cache of size B = 1 TB, and that the popularity of the various content items is governed by a Zipf-Mandelbrot distribution with shape parameter α = 0.8 and shift parameter q = 10.
In order to study the dynamic evolution of the
content placement, we conduct a simulation experi-
ment where the various leaf nodes receive requests
over time, sampled from the above-described popular-
ity distribution. Whenever a particular node receives
a request for an item that it has not presently stored,
it decides whether to cache it and if so, which cur-
rently stored item to evict, as prescribed by the Local-
Greedy algorithm.
In order to monitor the performance and conver-
gence, we track the objective value over time, and
compare it with the value of the optimal content
placement. In the optimal content placement, items 1
through 165 are fully replicated, and single copies of
items 166 through 3515 are stored. For the initial con-
tent placement, we distinguish three different sce-
narios:
1. Full replication, where each leaf node initially
stores the K most popular content items,
2. No replication, where initially only a single copy
of the KM most popular content items is stored
in one of the leaf nodes, and
3. Random, where each leaf node stores K content
items selected uniformly at random, independ-
ently of all other leaf nodes.
Figure 4 shows the performance of the Local-
Greedy algorithm. Figure 4a illustrates its perfor-
mance as a ratio of the optimal content placement. Note
that the objective value achieved by the algorithm
gets progressively closer to the optimum as the system
receives more requests and replaces content items
over time. After only 3000 requests (out of a total of 10,000 content items), the Local-Greedy algorithm has come to within 1 percent of the optimal
content placement and stays there, performing
markedly better than the worst-case ratio of 3/4 might
suggest.
While the algorithm seems to “converge” for all
three initial states, the scenario with no replication
appears to be the most favorable one. This may be
explained by the fact that in the optimal content
placement only items 1 through 165 are fully repli-
cated. It takes relatively few requests to replicate these
items starting from no replication, whereas it takes
far more requests to replace all the duplicates of items
166 through 500 by single copies of items 501
through 3515 starting from full replication.
The results as shown in Figure 4a assume a static
content population and fixed popularities. In the next
experiment, for which the results are presented in
Figure 4b through Figure 4d, we explore how dynamic
content ingestion and aging affect the performance of
the Local-Greedy algorithm. These phenomena have a
strong impact on caching effectiveness and, in particu-
lar, degrade the performance of conventional algo-
rithms such as least recently used (LRU) and least
frequently used (LFU). As a point of reference, the over-
all popularity law is assumed to remain the same over
time, with the same shape parameter α and the same
shift parameter q, in the sense that the relative popu-
larity of the n-th most popular active item continues to
be pn. However, the relative popularity of a given item
decays over time, and we specifically assume that the
rank of each item is decremented by 1 after every R
requests received in the system. In other words, for
every request that the system receives, the ranks of
all the items are decremented by 1/R, which may be
interpreted as a measure for the rate of aging. In order
to keep the overall popularity law the same, the bot-
tom item with rank N is supposed to be removed while
a new item with rank 1 is ingested after every R
requests received by the system. Equivalently, each
individual item receives an average of R requests over
its active lifetime.
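The churn model described above can be sketched as follows. This is an illustrative simulation of our own; the item identifiers and function name are hypothetical, not from the paper.

```python
import random

def simulate_churn(num_requests, N, R, p):
    """Requests are drawn from the rank-based popularity law p; after every
    R requests the bottom-ranked item is retired and a fresh item is
    ingested at rank 1, so each item sees about R requests in its lifetime."""
    ranks = list(range(N))    # ranks[r] = id of the item currently at rank r+1
    next_id = N               # id for the next freshly ingested item
    requested = []
    for t in range(num_requests):
        r = random.choices(range(N), weights=p)[0]
        requested.append(ranks[r])    # request for the item at rank r+1
        if (t + 1) % R == 0:          # churn step
            ranks.pop()                   # retire the item at rank N
            ranks.insert(0, next_id)      # ingest a new item at rank 1
            next_id += 1
    return ranks, requested
```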
Figure 4b through Figure 4d show the results for
various aging factors. In order not to clutter the fig-
ures, we only display the results for a random initial
content placement, those for either full or no initial
replication being similar. The Local-Greedy algorithm
continues to “converge” but now only comes to
within a certain margin from the “optimal” content
placement. A comparison of Figure 4b, Figure 4c, and
Figure 4d reveals that the optimality margin increases
with the rate of aging. This may be explained by the fact that constant churn requires the cache content to be continuously refreshed to maintain optimality, whereas previously the cache content gradually settled into a fixed state as the optimum was approached. Because the Local-Greedy algorithm can only update the cache in response to the requests it receives, it suffers more as the required rate of updates grows relative to the number of requests, i.e., as the aging rate increases.
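The key constraint is that the cache may change only when a request arrives. The following sketch illustrates one such request-driven greedy update; it is a simplified stand-in rather than the paper's actual Local-Greedy rule, and the popularity-comparison eviction criterion is an assumption made for illustration:

```python
def serve_request(cache, capacity, item, popularity):
    """Serve one request; the cache may change only in response to it.
    `cache` is a set of item ids, `popularity` maps item id -> estimate.
    Returns True on a cache hit."""
    if item in cache:
        return True
    # Miss: greedily admit the item if it beats the least popular
    # cached item; otherwise leave the cache untouched.
    if len(cache) < capacity:
        cache.add(item)
    else:
        victim = min(cache, key=lambda i: popularity.get(i, 0.0))
        if popularity.get(item, 0.0) > popularity.get(victim, 0.0):
            cache.discard(victim)
            cache.add(item)
    return False
```

Since the cache changes by at most one item per request, sustaining optimality under churn consumes part of the update budget of roughly R requests per ingested item, which is consistent with the degradation observed as R shrinks.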
By the same token, the “optimal” content placement is granted an unfair advantage here, as the additional bandwidth required to maintain optimality is not accounted for. In other words, it only provides an upper bound, and the Local-Greedy algorithm may therefore in fact be closer to the true optimum than the numbers suggest.

Figure 4. Performance of the Local-Greedy algorithm at optimal content placement and with examples showing the impact of dynamic content ingestion and aging: (a) static popularity; (b) slow aging, R = 200; (c) moderate aging, R = 100; (d) fast aging, R = 20. Each panel plots the performance ratio against the number of requests received, for full replication, no replication, and random initial placement.
So far, we have assumed that the relative popu-
larities of the various content items are known exactly.
In practice, popularities are unlikely to be known
with perfect accuracy and will need to be estimated.
In order to examine the impact of estimation errors,
we conducted an experiment where the popularities
are no longer assumed to be known, but estimated
based on the actual received requests using a geo-
metrically smoothed average with some time constant
T as proxy for the length of the measurement window.
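A geometrically smoothed average amounts to an exponentially weighted moving estimate of each item's request frequency. The per-request update below is one natural way to realize it; the class and its update rule are illustrative rather than taken from the paper (in practice the uniform decay would be applied lazily rather than on every request):

```python
from collections import defaultdict

class PopularityEstimator:
    """Exponentially smoothed per-item request frequencies.

    Each request for item i nudges the estimate for i toward 1 and all
    other estimates toward 0 with smoothing factor 1/T, so the time
    constant T acts as the effective length of the measurement window."""

    def __init__(self, T):
        self.alpha = 1.0 / T
        self.est = defaultdict(float)

    def record(self, item):
        # Decay all existing estimates, then credit the requested item,
        # yielding est[i] = (1 - alpha) * old + alpha * 1{i == item}.
        for i in self.est:
            self.est[i] *= (1.0 - self.alpha)
        self.est[item] = self.est.get(item, 0.0) + self.alpha

    def estimate(self, item):
        return self.est.get(item, 0.0)
```

A small T reacts quickly but is noisy; a large T averages out noise at the cost of lagging behind popularity shifts, matching the trade-off seen in the experiments.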
Figure 5 plots the results for various values of the time constant.

Figure 5. Results for various values of time constant T: (a) T = 100; (b) T = 1,000; (c) T = 10,000; (d) T = 100,000. Each panel plots the objective value against the number of requests received, for the optimum with exact knowledge, the optimum for the given estimates, and Local-Greedy with estimates.

As is to be expected, the optimum is approached more closely as the accuracy of the estimates improves for larger time constants, although the initial convergence is possibly somewhat slower. For
comparison purposes, the figure also shows what the
optimum objective value is for the given popularity
estimates. This provides an indication of the best that
any algorithm in general, and Local-Greedy in par-
ticular, can be expected to do, given available esti-
mates, and suggests that for larger time constants the
performance gap is mostly due to estimation errors.
The issue of popularity estimation is particularly
pertinent in the case of time-varying popularities due
to content aging and dynamic ingestion, as consid-
ered before. The results for that scenario combine the
effects observed in the previous two cases.
Conclusion

We have developed lightweight cooperative cache
management algorithms, which achieve close to glob-
ally optimal performance, with favorable worst-case
ratios, even in asymmetric scenarios. Numerical
results demonstrate that the typical performance is
far better than the worst-case guarantees suggest. The
numerical results also illustrate the complex interac-
tion between distributed cache cooperation and popu-
larity estimation in the presence of dynamic content
ingestion and aging. Finding optimal content place-
ment algorithms and performance bounds for time-
varying popularity statistics remains as a challenging
issue for further research.
Acknowledgements

The authors gratefully acknowledge helpful dis-
cussions with Volker Hilt, Won Suck Lee, Ivica Rimac
and Iraj Saniee.
*Trademarks
Akamai is a registered trademark of Akamai Technologies, Inc.
YouTube is a registered trademark of Google, Inc.
(Manuscript approved May 2009)
SEM BORST is a member of technical staff in the Mathematics of Networks and Systems Research Department at Alcatel-Lucent Bell Labs in Murray Hill, New Jersey. He received an M.Sc. degree in applied mathematics from the University of Twente and a Ph.D. degree from the University of Tilburg, both located in the Netherlands. His main research interests are in the area of performance evaluation and resource allocation algorithms for communication networks. Dr. Borst has authored over 100 publications in journals, conference proceedings, and book chapters and serves or has served as a member of several editorial boards and program committees. He holds 12 patents and has six patent applications pending in the area of networking and wireless communications.
VARUN GUPTA is a Ph.D. candidate in the Computer Science Department at Carnegie Mellon University in Pittsburgh, Pennsylvania. He received his B.Tech. degree in computer science and engineering from the Indian Institute of Technology, Delhi, and was awarded the university's prestigious President's Gold Medal in 2004. His research interests are in all aspects of design and analysis of algorithms, distributed systems, and stochastic modeling. His current focus is on performance modeling and analysis of scheduling policies for multi-server systems from a queuing-theoretic perspective.
ANWAR WALID is a distinguished member of technical staff in the Mathematics of Networks and Systems Research Department at Alcatel-Lucent Bell Labs in Murray Hill, New Jersey. He received a B.S. degree in electrical engineering from Polytechnic Institute of New York, Brooklyn, and a Ph.D. degree in electrical engineering from Columbia University in New York City. He developed theory and algorithms for resource management and quality of service (QoS) support for AT&T and Lucent Technologies products. He is an IEEE Fellow and holds patents on congestion control, scheduling, and routing in asynchronous transfer mode (ATM) and Internet Protocol (IP)/multiprotocol label switching (MPLS) networks. He received a Best Paper Award from ACM Sigmetrics on multimedia traffic modeling. He has co-authored IETF RFCs in the MPLS and Traffic Engineering Working Groups and has served on the executive and technical program committees of IEEE and MPLS conferences. He is a member of Tau Beta Pi (National Engineering Honor Society) and the International Federation for Information Processing (IFIP) Working Group 7.3. ◆