CR Ciullo TPDS - EURECOM · 2020. 11. 14. · Traditional (client-server) Content Delivery Networks (CDN) ... After characterizing Sd(t,k ), the evaluation of S(t) is easy, ... to

1

Peer-assisted VoD Systems:an Efficient Modeling Framework

Delia Ciullo, Valentina Martina, Michele Garetto, Emilio Leonardi, Giovanni Luca Torrisi

Abstract—We analyze a peer-assisted Video-on-Demand system in which users contribute their upload bandwidth to the redistribution

of a video that they are downloading or that they have cached locally. Our target is to characterize the additional bandwidth that servers

must supply to immediately satisfy all requests to watch a given video. We develop an approximate fluid model to compute the required

server bandwidth in the sequential delivery case, as well as in controlled non sequential swarms. Our approach is able to capture

several stochastic effects related to peer churn, upload bandwidth heterogeneity, non-stationary traffic conditions, which have not been

documented or analyzed before. At last, we provide important hints for the design of efficient peer-assisted VoD systems under server

capacity constraints.

Index Terms—Video-on-demand, peer-to-peer, performance evaluation.

✦

1 INTRODUCTION

The efficient distribution of video contents will be one of the

main challenges of the future Internet. According to Cisco

forecasts [2], the combination of all forms of video (live

streaming, video-on-demand and P2P file sharing) will be in

the range of 80 to 90 percent of global consumer Internet

traffic by 2017, posing a tremendous challenge to both content

providers and network operators. In particular, Video-On-

Demand traffic has been increasing tremendously during the

last few years. For example, it has been reported that, on a

normal weeknight, Netflix alone accounts for almost one third

of all Internet traffic entering North American homes.

Traditional (client-server) Content Delivery Networks

(CDN) help alleviate the traffic on the transport infrastructure

by “moving” contents close to the users, however they do not

solve the scalability problem of data centers and server farms,

whose resources (bandwidth/storage/processing) increase lin-

early with the user demand and the data volume.

Peer-assisted video distribution architectures, in which users

contribute their upload bandwidth to the system while viewing

the requested video, have been advocated as a viable alterna-

tive to traditional CDNs to reduce the server workload and

guarantee the scalability to large populations of users [3], [4].

Several peer-assisted systems have already been deployed,

such as PPLive, GridCast, PPStream, TVU, SopCast, Xunlei

Kankan [5].

Despite the wide popularity gained by existing applications,

several fundamental questions remain unanswered about the

design of video streaming systems and the potential benefits

of the peer-assisted approach. Indeed, the unpredictable nature

• D. Ciullo is with EURECOM, Sophia Antipolis, France. E-mail:[email protected]

• V. Martina and E. Leonardi are with Dipartimento di Elettron-ica, Politecnico di Torino, Torino, Italy. E-mail: {valentina.martina,emilio.leonardi}@polito.it

• M. Garetto is with Dipartimento di Informatica, Università di Torino,Torino, Italy. E-mail: [email protected]

• G. L. Torrisi is with Istituto per le Applicazioni del Calcolo, CNR, Roma,Italy. E-mail: [email protected]

A preliminary version of this paper appeared at IEEE INFOCOM Mini-Conference 2012 [1].

of users cooperation, that cannot always guarantee the strict

quality-of-service requirements of online video; the added

complexity on the control plane due to signalling and chunk

scheduling; and the need to provide incentive mechanisms to

the users, tend to discourage the content providers to adopt

peer-assisted solutions.

In our work, we focus on peer-assisted Video-on-Demand

(VoD) systems, for the on-line distribution of movies to a large

audience of users, such as Netflix. Users can browse a catalog

of available movies, and asynchronously issue requests to

watch a given content, which are ideally immediately satisfied

by the system, with the optional support for limited VCR

actions such as pause and jump backward/restart. Notice that

Video-on-Demand (VoD) systems are quite different from

live video streaming applications, in which users join the

distribution of a given TV channel at random points in time,

but peers connected to the same channel watch the content

almost synchronously.

In peer-assisted VoD systems, users interested to a specific

video can retrieve it from servers (CDN modality), from peers

downloading/watching it, and from users storing a copy of it

in their computer/Internet TV memory or in dedicated set-top-

boxes remotely controllable by the network operator [6]. The

content is typically divided into small chunks (typically one

or few video frames) which are retrieved from other peers (or

from servers) in a fully distributed (swarming) fashion based

on the exchange of chunk bitmaps. However, differently for

traditional file-sharing, chunk and peer selection strategies for

peer-assisted video distribution must account for the fact that

users watch while downloading. In particular, to avoid service

interruptions/degradations: i) a minimum average download

rate equal to the video playback rate must be guaranteed; ii)

an “almost in order” delivery of chunks is required.

Our main contribution is a stochastic fluid framework that

allows to approximately estimate the additional bandwidth that

servers must provide to satisfy all requests to watch a given

video. Our methodology can account for several stochastic

effects related to peer churn, upload bandwidth heterogeneity,

non-stationary traffic conditions, which have not been analyzed

before, providing a useful tool for the analysis and design of

2

VoD systems.

We consider both the simple basic sequential delivery

scheme, in which users download the video chunks (almost)

sequentially, as well as, schemes that tolerating higher play-

out delay permit to exploit non-sequential delivery to improve

the overall system performance.

The analytical approach described in this paper comple-

ments the analysis presented in [7], in which we obtain

rigorous bounds for the sequential delivery scheme (under

stationary traffic conditions) and asymptotic results as the

number of users increases. With respect to [7], we extend the

analysis to non-sequential delivery schemes and non-stationary

traffic conditions, with a different goal in mind, i.e., to provide

a performance evaluation tool that can be readily used for

system design and optimization.

We emphasize that in our work we do not consider issues

related to optimal replication strategies of heterogeneous con-

tents (in size and popularity) or optimal peer resource alloca-

tion (in terms of storage and upload bandwidth) in the presence

of multiple videos (e.g., universal streaming architectures).

This because we focus on the bandwidth requested from the

servers to distribute a given video, assuming that the peer

resources allocated to it (i.e., number of copies available in

the system and the amount of upload bandwidth devoted to

the considered video) are given. Our analysis can be combined

with optimal resource and replication strategies for universal

streaming architectures.

For an overview of related work, see Section 10 in [8].

2 MODEL

2.1 System assumptions

We model a fairly general peer-assisted VoD system. Users1

run applications that allow them to browse an online catalog

of videos. When a user selects a video, we assume that the

request is immediately satisfied and the selected video can be

watched uninterruptedly till the end (i.e., a continuity index

equal to 1 is guaranteed). This is possible only if the system

is able to steadily provide to each user a data flow greater than

or equal to the video playback rate.

Users contribute their upload bandwidth to the video dis-

tribution, thus they can retrieve part of the video (or even

the entire video) from other peers, saving servers resources.

The main goal of our analysis is to characterize the additional

bandwidth that servers must supply (in addition We provide a

fundamental tool to properly dimension the CDN infrastruc-

ture supporting the VoD system.

We focus on a given video of duration Tv seconds and sizeL bytes, which is played back by the users’ applications at ratedv = L/Tv bytes/s. Clearly, to guarantee continuous playbackeach user must at least receive video chunks sequentially at

rate dv . As a widely adopted strategy to mitigate bandwidthfluctuations, applications pre-fetch and buffer video chunks

before playback (notice that we consider VoD systems, hence

we assume, in contrast to live streaming systems, that all

chunks are immediately available, at least at the servers). In

1. In this paper we use the terms peer and user interchangeably.

our model, we assume that the system provides to each user

a fixed target download rate d ≥ dv (we assume that thedownload bandwidth on the access links of the users is large

enough that it does not constitute a bottleneck).

In general, the target download rate of a peer could be

adapted to the portion of video being downloaded, or even

depend on some peer’s characteristics (such as its upload

bandwidth). By imposing a constant target download rate

d ≥ dv at each user we simplify the analysis, while obtaininga conservative prediction with respect to the case in which

the target download rate is adapted over time and to the

peer characteristics. Notice that the target download rate dcan be chosen by the system. We will show that in some

cases, unexpectedly, the optimal value of d (i.e., the one thatminimizes the average bandwidth requested from the servers)

is actually larger than dv .The amount of upload bandwidth with which peers con-

tribute to the redistribution of the video that they are down-

loading may or may not be under the control of the system. In

our analysis, we assume that the upload bandwidth available

at a peer is a random variable with given distribution. This

way, we encompass both the realistic case of users with

heterogeneous Internet connections (i.e., ADSL, fiber, LAN)

and cross-traffic fluctuations, and the case in which the peer

upload bandwidth allocated to the given video is tuned by

the system (such as in universal streaming architectures).

More specifically, the amount of upload bandwidth with which

users contribute at a given time to the redistribution of the

considered video is modeled by a random variable U withcumulative distribution function FU (w), mean U and varianceσ2U . The random variables denoting the instantaneous uploadbandwidths of the users are assumed to be i.i.d. (identically

and independently distributed). See Section 11 in [8] for a

critical discussion on our system assumptions.

2.2 Peers dynamics

We need to incorporate in our analysis a model describing

how peers join the distribution of the considered video, and

when and how they leave the system, stopping to contribute

their upload bandwidth. To this aim, we adopt a very flexible

model that allows to consider a non-stationary video request

process, and general peer churn, which is an another crucial

feature of any realistic P2P system.

In particular, we assume that the arrival process of requests

for the considered video follows a time-varying Poisson pro-

cess of intensity λ(t). By so doing, we are able to captureeffects related to content popularity variation, while maintain-

ing analytical tractability [9], [10].Indeed, assuming that at a

given time the arrival process is Poisson is reasonable, since

users behave independently of each other, and their requests

are immediately satisfied. On the other hand, a video (e.g., a

typical movie) can be long enough that the rate at which it

is requested can vary significantly during the playing time,

due to daily traffic fluctuations, or rapidly-changing video

popularity. We account for this fact assuming that the video

request process is described by an non-homogeneous Poisson

process with time-varying intensity λ(t), which can possibly

3

be equal to zero before a given time t0, to model a newlyintroduced video inserted into the catalog at time t0.

The dynamics of peer participation in the distribution of

a given video must account for the fact that activity periods

of the users are highly heterogeneous, as observed in several

measurement studies [4]: some users stop watching the video

after a very short time since the beginning, because they realize

they are no longer interested in it; most users who decide to

watch the video shut down the computer/Internet-TV towards

the end of it; some of them keep the application running for

prolonged time after the end of the video; those running set-

top-boxes can be considered to be always active and serving

other peers (until they stop contributing to the distribution of

the considered video). We account for general user behavior

assuming that the activity period of a user (i.e., the interval

during which a user contributes its upload bandwidth, starting

from the instant at which the video has been requested) is

described by an arbitrary random variable T with finite meanT and complementary cumulative distribution function GT (x).The activity periods of the users are assumed to be i.i.d.

It follows from our assumptions that the number of active

users N(t) at time t is distributed as the number of customersin an M/G/∞ queue with time-varying arrival rate, hence itfollows a Poisson distribution with time-varying mean N(t)given by

N(t) =

∫ ∞

0

λ(t − x)GT (x) dx (1)

In our analysis we need to distinguish two classes of active

users: those who are still downloading the video, and those

who have completed the download (referred to as seeds in the

following). Let τd = L/d be the time needed to downloadthe whole video, and T d =

∫ τd0

GT (x)dx the average timespent by peers downloading the video. The number of down-

loading peers at time t, denoted by Nd(t), follows a Poissondistribution of mean Nd(t) given by

Nd(t) =

∫ τd

0

λ(t − x)GT (x) dx (2)

Then standard properties of Poisson processes allow to say that

the number of seeds at time t, denoted by Nseed(t), follows aPoisson distribution of mean N seed(t) = N(t) − Nd(t).

We define as instantaneous system load γ(t) the quantity

γ(t) =d · Nd(t)

U · N(t)(3)

which is the ratio between the average data rate requested

at time t by downloading peers and the average uploadrate provided by all active users at time t. Borrowing theterminology adopted in previous work [3], [11] we say that

at time t the system operates in deficit mode if γ(t) > 1, inbalanced mode if γ(t) = 1, and in surplus mode if γ(t) < 1.

We also introduce the per-user system load γp =d·T dU ·T

,

which is the ratio between the average amount of data that are

downloaded by a peer, and the average amount of data that a

peer is able to offer to other peers. Note that by construction

γp is equal to the (constant) instantaneous system load in thecase of a stationary user arrival process. In ergodic systems,

TABLE 1: Notation

Symbol Definition

L video size (bytes)dv video playback rate (bytes/s)Tv video playback duration (s), Tv = L/dvd target download rate (bytes/s)U average user upload bandwidth (bytes/s)T average user activity period (s)T d average time spent downloading the video (s)λ(t) arrival rate of requests for the video at time tN(t) average number of active users at time tNd(t) average number of downloading users at time tN seed(t) average number of seeds at time tSd(t) average bandwidth requested by downloaders at time tSseed(t) average bandwidth offered by seeds at time tS(t) average bandwidth requested from the servers at time t

γp can be regarded as the time average of γ(t).

2.3 Performance metrics

The main goal of our paper is to characterize the additional

bandwidth required from the servers (i.e., in addition to the

aggregate bandwidth provided by the peers) to guarantee an

optimal quality of service (i.e., continuity index equal to 1) to

the users.

Let S(t) be the random variable denoting the additionalbandwidth that the servers must supply at time t to satisfy allactive downloads of the considered video. We denote by S(t)and σ2S(t) the mean and variance of S(t), respectively.

Since in practice there are multiple videos to be served

concurrently by the system, statistical multiplexing arguments

suggest that a good design goal is to minimize the mean

value S(t) of the server bandwidth required by a single video.Therefore, this will be the main metric that we will look at in

our performance analysis. Table 1 summarizes the notation of

our model.

3 ANALYSIS

3.1 Sequential delivery

We start considering the simple case in which users download

the video chunks sequentially. This scheme is simple to

implement, as it does not require complex chunk/peer selection

mechanisms such as those needed in BitTorrent-like chunk

swarming schemes. More importantly, the sequential delivery

scheme is analytically tractable and provides an upper bound

to the server bandwidth requested by non-sequential schemes.

Let Sd(t) be the aggregate bandwidth requested by the

downloading users at time t, and Sseed(t) =∑Nseed(t)

i=1 Ui bethe aggregate upload rate offered by the seeds at time t. Thenthe bandwidth requested from the servers at time t is given by

S(t) = max{0, Sd(t) − Sseed(t)}. (4)

Focusing on Sd(t), we first condition this quantity on thenumber of downloading users k, defining

Sd(t, k) , (Sd(t) | Nd(t) = k)

4

After characterizing Sd(t, k), the evaluation of S(t) is easy,since the distribution of Nd(t) is known (a Poisson distributionof mean N(t)), while Sseed(t) is a compound Poisson randomvariable which does not depend on k [12].

To evaluate Sd(t, k) under sequential download, we startobserving that, if all peers download the video sequentially at

common rate d, a peer can only redistribute video pieces topeers arrived later on in time.

Proposition 1: Quantity Sd(t, k) satisfies the following re-cursive equation:

Sd(t, k) =

{

d k = 1d + max{0, Sd(t, k − 1) − Uk} k > 1

(5)

The proof of Proposition 1 is reported in Appendix A of [8].

Expression (5) provides the key to the analytical approxima-

tion developed in the next section.

Alternate formulations of quantity Sd(t, k) exist (see [3],[11], [7]). Here, we just mention that in [7] we find a

connection between the stochastic process described by (5)

and a random walk with increments d − U , which allows toobtain analytical upper bounds to the server bandwidth and

to characterize its asymptotic behavior for large number of

users.

3.2 Gaussian approximation

In the sequential delivery case, we can characterize the

distribution of the server bandwidth using a second-order

approximation [12], [13].

The idea is to approximate the distribution of quantity

Sd(t, k − 1) − Uk in (5) (for each k ≥ 2) by a normaldistribution matching the first two moments of this quantity.

We can then apply standard formulas of the truncated normal

distribution to derive the first two moments of Sd(t, k) as afunction of the first two moments of Sd(t, k−1). This providesa recursive technique to compute the first two moments of

Sd(t, k) for any k, starting from the exact values knownfor k = 1. Our gaussian approximation is motivated by thefact that significant excursions of Sd(t) (i.e., away from thelower limit d) result from the accumulation of several randomcontributions d − U , thus the central limit theorem can beinvoked to justify the convergence to a normal distribution.

A similar approximation is subsequently applied to take into

account the effect of the seeds, whose aggregate contribution

Sseed(t) can be well described by a gaussian distributionfor sufficiently large number of seeds. Notice that an exact

evaluation of the distributions (or just the first few moments)

of S(t) and Sd(t, k) is difficult, due to the presence of barriersat zero and d, respectively.

More in detail, let us start introducing some notation and

standard results related to the normal distribution. Let N (w)be the probability density function of the standard normal

distribution (having mean 0 and variance 1), and Q(w) itscomplementary cumulative distribution function.

Let y be a random variable distributed according to a normaldistribution N(µ, σ) of mean µ and standard deviation σ. Thenit can be proved that the first moment of the random variable

y′ = max{0, y} has the following expression:

E[y′] = σN(

−µ

σ

)

+ µQ(

−µ

σ

)

(6)

while the second moment is given by

E[y′2] = σµN(

−µ

σ

)

+ (σ2 + µ2)Q(

−µ

σ

)

. (7)

Let Sd(t, k) and σ2Sd

(t, k) be the mean and variance ofSd(t, k). Our recursive procedure to approximately computeSd(t, k) and σ

2Sd

(t, k) for all k starts from the initial known

values Sd(t, 1) = d and σ2Sd

(t, 1) = 0 (see (5)). For agiven k ≥ 2 we approximate Sd(t, k − 1) − Uk by a nor-mal random variable y of mean µ = Sd(t, k − 1) − U andvariance σ2 = σ2Sd(t, k − 1) + σ

2U . Defining the random vari-

able y′ , max{0, y} ≃ max{0, Sd(t, k − 1) − U}, from (5)we obtain:

Sd(t, k) ≃ d + E[y′] (8)

σ2Sd(t, k) ≃ E[y′2] − E[y′]2. (9)

Applying (6) and (7) we can compute the first and second

moment of variable y′ in (8,9). This provides the recursion tocompute Sd(t, k) and σ

2Sd

(t, k) for all k.To account for the effect of the seeds (if any), we apply

once more the normal approximation, as follows. Let S(t, k) =max(0, Sd(t, k)−Nseed(t)) be the server bandwidth necessaryat time t, assuming that there are k downloading users andNseed(t) seeds. Moreover, let S(t, k) and σ

2S(t, k) be the mean

and variance of S(t, k).We observe that Sseed(t) is a compound Poisson random

variable, whose moments can be computed exactly in close-

form. In particular, the mean of Sseed(t) is equal to N seed(t)U ,

whereas its variance is equal to N seed(t)(σ2U + U

2). We

approximate Sd(t, k) − Sseed(t) by a normal distribution y ofmean µ = S(t, k) − N seed(t)U and variance σ

2 = σ2S(t, k) +

N seed(t)(σ2U + U

2), and apply again (6) and (7) to compute

the first and second moment of y′ = max{0, y} ≃ S(t, k).Finally, the mean server bandwidth S(t) (and similarly its

variance) can be obtained deconditioning with respect to k:

S(t) =∑

k≥1

S(t, k)P(Nd(t) = k) (10)

We observe that each step of the iterative procedure requires

a constant number of operations. Hence the computational

complexity of the model solution is linear in the number

of steps (k), which equals the number of downloading usersin the systems. This number is theoretically unbounded (it

has a Poisson distribution), however we can reasonably limit

the iteration to a maximum number of steps kmax such thatP(Nd(t) > kmax) is negligible (i.e., smaller than a given smallconstant ǫ; in our results we set ǫ = 10−6). The computationalcomplexity is then Θ(kmax).

3.3 Extension to non-sequential delivery

So far we have restricted the analysis to the case in which users

receive the video chunks sequentially. Although conceptually

simple, this scheme is clearly sub-optimal when users down-

load data at a rate larger than the playback rate: recall that in

5

this case users can download in advance video chunks needed

in the future, and this prefetching does not necessarily have

to be done sequentially. Actually, by allowing out-of-sequence

delivery the system can better exploit the upload bandwidth

of the peers. Chunk-based, swarming approaches like those

commonly used in P2P bulk transfers (e.g., BitTorrent) can

be applied to VoD systems with the additional constraint

that individual chunks must be downloaded before specific

deadlines to avoid interrupting the video playback.

A common approach to combine the efficiency of P2P

swarming with the strict delay constraints of VoD is to allow

users to receive also out-of-sequence chunks of the video

within a limited “sliding window” of data starting from the

point currently played [5].

For simplicity, instead of considering an actual sliding

window, we divide the video into a fixed number W ofnon-overlapping segments of size LW , L/W . We denoteby TW , LW /d = τd/W the time needed to downloada segment. Users who are concurrently downloading the

same segment, besides helping users downloading previous

segments can help each other in a swarming fashion, i.e., we

assume that, within a segment, we can exploit also chunk-

based, out-of-sequence distribution. This model is able to

capture the behavior of realistic sliding window applications,

while keeping the analysis simple.

Below we show how the analysis developed in Section 3.1

can be adapted to study this scheme as well, permitting us

to assess the performance gain achievable by non-sequential

schemes.

Indeed, by aggregating all users belonging to the same

segment into a sort of ‘meta-peer’ we can essentially apply

the same analysis as before to a chain of W meta-peersdownloading the video segments sequentially. Let Nv(t),1 ≤ v ≤ W , be the random variables representing the numberof peers concurrently downloading segment v at time t. Noticethat this number is Poisson-distributed [12] with mean

Nv(t) =

∫ TW

0

λ(t − (v − 1)TW − x)GT (x + (v − 1)TW ) dx

Let Sd(t, v) be the bandwidth that the servers (or the seeds)must supply at time t to all users downloading segments ofindex smaller than or equal to v. Using the same reasoningas in the proof of Proposition 1, quantity Sd(t, v) can becomputed through the following recursive equation,

Sd(t, v) = max{

0, Sd(t, v − 1) −

Nv(t)∑

i=1

Ui +

Nv(t)−1∑

i=1

d}

+ d · INv(t)>0 1 ≤ v ≤ W (11)

with Sd(t, 0) = 0 and the convention that summations areequal to zero if Nv(t) = 0. Notice that (11) is analogous to(5), if we consider all peers within a segment v (if any) as asingle meta-peer having virtual upload bandwidth Ũ(t, v) =

d +∑Nv(t)

i=1 (Ui − d).

The first two moments of Sd(t) = Sd(t,W ) can be com-puted using a second-order approximation similar to the one

adopted for the case of sequential delivery in Section 3.2, i.e.,

by assuming that quantity Sd(t, v − 1) − Ũ (whose momentscan be computed exactly) has a normal distribution, and then

using (6) and (7) to compute the first and second moments of

the positive part of it [12], [13].The analysis is made slightly

more complicated by the fact that we need to consider also

the case in which there are no users downloading a segment

(Nv(t) = 0), which requires some care. Details are reportedin Appendix B of [8].

At last, the impact of seeds is taken into account in a way

analogous to the sequential case. Note that the computational

procedure for the non-sequential case is again linear in the

number of steps, hence it is Θ(W ).The extreme case W = 1 of just one segment (i.e., the

entire video) corresponds to a scheme in which chunks can

be downloaded in any order, and the system can exploit the

upload bandwidth of any peer irrespective of its arrival time.

In this case we have,

Sd(t, 1) = max{

d,

Nd(t)∑

i=1

(d − Ui)}

(12)

which plugged into (4) (in the place of Sd(t)) provides alower bound to the server bandwidth required by any chunk

distribution scheme. In the following, we will refer to the

server bandwidth obtained in this way as the completely non-

sequential case, or simply the lower bound.

In Sect 7 of [8], it is shown how our model can be extended

to analyze systems providing limited VCR functionalities as

well as, systems in which peers have limited storage capabil-

ities.

4 PERFORMANCE UNDER STATIONARY CONDI-TIONS

In this section we report a selection of the most interesting

results that we have obtained by our analysis under stationary

user arrival process. Since in this case all averages do not

depend on t, we will omit for simplicity the indication oftime. We normalize to 1 the video playback rate dv , whichthus serves as unit for all other bandwidth figures. We assume

that users stay in the system for a time at least equal to the

watching time, hence T ≥ Tv . Unless otherwise specified,we assume that users’ upload bandwidth U is exponentiallydistributed.

The results obtained by our analytical approximation in

Section 3.2 are compared against those obtained by an event-

driven simulator derived from P2PTVsim2. In particular, we

adapted P2PTVsim, originally developed for live-streaming,

to represent VoD systems. Our simulator permits considering

a general mesh-based pull system: the content is segmented

into small fixed-size chunks, corresponding to 100 ms of

video; peers independently retrieve individual chunks from

other peers and/or the servers adopting a simple trading mech-

anism that involves a periodic exchange of signaling messages

containing buffer maps between pairs of peers. To guarantee an

almost in-order delivery of chunks (as in the case of VoD) we

2. P2PTVsim is available at http://www.napa-wine.eu/cgi-bin/twiki/view/Public/P2PTVSim

6

implemented a sliding window mechanism: each peer playing

chunk c is allowed to retrieve only the chunks [c, c+H] whereH is the window parameter. A sequential delivery scheme isthen approximately achieved by choosing a small value of

H (compared to the entire video). In our simulations (if nototherwise specified) we have set H = 40, such that a windowcorresponds to only 4 seconds of video. Chunks are, in the

first instance, retrieved from other peers; a peer is allowed to

retrieve directly from servers the immediately following chunk

to play (i.e., chunk c+1 can be retrieved from the servers onlywhile playing chunk c). Servers are assumed to have unlimitedbandwidth.

We recognize that our simulator may appear rather simpli-

fied with respect to the behavior of real systems, as it shares

most of the system assumptions of the model (see Section

11 in [8]).We use it primarily to validate the accuracy of

the Gaussian approximation introduced in Section 3.2, that

enables us to assess the system performance at very low

computational cost (i.e., with negligible effort as compared

to detailed simulations or measurement campaigns).

We emphasize that the ultimate goal of our model and

analysis is not to produce accurate quantitative results about

the performance of a realistic system (that necessarily depends

on the specific system implementation), but to allow for an effi-

cient qualitative prediction of the impact of various parameters

and design choices on the resulting system performance.

0.1

1

10

100

1 10 100 1000

Av

erag

e se

rver

ban

dw

idth

Average number of users

approx - U = 0.9sim - U = 0.9

sim lower bound - U = 0.9approx - U = 1.0

sim - U = 1.0sim lower bound - U = 1.0

approx - U = 1.2sim - U = 1.2

sim lower bound - U = 1.2

Fig. 1: Comparison of average server bandwidth in the case

d = dv , as function of the average number of users N , fordifferent values of U , in the absence of seeds.

0.1

1

10

100

1 10 100 1000

Av

erag

e se

rver

ban

dw

idth

Average number of downloading users

approx - U = 0.9sim - U = 0.9

sim lower bound - U = 0.9approx - U = 1.0

sim - U = 1.0sim lower bound - U = 1.0

approx - U = 1.2sim - U = 1.2

sim lower bound - U = 1.2

Fig. 2: Comparison of average server bandwidth in the case

d = dv , as function of the number of downloading users Nd,for different values of U , in the presence of N seed = 0.1 ·Ndseeds.

4.1 Impact of the number of watching users and

seeds

Figure 1 reports, on a log-log scale, the average server

bandwidth S as function of the average number of users N , inthe case d = dv , T = Tv . We consider three different valuesof average upload bandwidth U = 0.9, 1.0, 1.2, correspondingto systems operating in deficit, balanced, and surplus mode,

respectively (here γ = 1/U ).Besides noticing the accuracy of the approximate analysis, it

is interesting to see that the average server bandwidth saturates

for U = 1.2 (surplus mode) to a value about 3.5 timeslarger than the corresponding lower bound, which tends to

dv = 1. As expected, the average server bandwidth divergesunder the deficit and balanced modes. Moreover, in the deficit

mode, the sequential system requires asymptotically the same

bandwidth as the completely non-sequential system (i.e., the

lower bound).

In Figure 2 we compare the results obtained in the same

system considered above, but assuming that users remain

active after the end of the watching time for an exponen-

tially distributed amount of time of mean equal to 10% of

the watching time, generating an average number of seeds

N seed = 0.1 · Nd. Now the systems with U = 1.0 andU = 1.2 operate in surplus mode, whereas the system withU = 0.9 operates very close to the balanced mode (hereγ = 1/(1.1 · U)). We observe that, in the presence of seeds,the average server bandwidth requested by systems operating

in surplus mode reaches a maximum, after which it goes

to zero as the number of users increases. Results such as

those reported in Figures 1 and 2 can be useful in system

dimensioning, as they allow to estimate, in the surplus mode,

the worst-case server bandwidth which is needed when the

number of downloading users Nd is not known.

4.2 Impact of the target download rate

Even when users tend to leave the system at the end of the

watching time, it is still possible to benefit from the positive

effect created by the seeds, who absorb part of the fluctuations

in the bandwidth requested by downloading peers, shielding

the servers. The trick to ‘artificially’ create some seeds is to

make the users download the video at rate d > dv , so that theybecome seeds for other peers before the end of the watching

time. Intuitively, however, d should not be set too large tooffset the gain achievable by the seeds. Figure 3 illustrates the

performance of this strategy in the case of N = 100 users,T = Tv , showing the average server bandwidth as functionof d. We observe that for all the considered values of U , theaverage server bandwidth achieves a minimum for a value of

d slightly larger than dv . The impact is particularly striking inthe surplus mode (U > 1), in which setting d > dv brings theserver bandwidth close to zero.

4.3 Impact of non-sequential delivery

To understand the performance gains achievable by adding

(in a limited way) BitTorrent-like chunk delivery schemes to

the video distribution, we consider two scenarios operating in

7

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1 1.2 1.4 1.6 1.8 2 2.2

download rate, d

simapprox

sim lower bound 0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

1 1.2 1.4 1.6 1.8 2 2.2

download rate, d

simapprox

sim lower bound 0

0.25

0.5

0.75

1

1.25

1.5

1.75

2

2.25

2.5

2.75

3

3.25

3.5

1 1.2 1.4 1.6 1.8 2 2.2

download rate, d

simapprox

sim lower bound

0

0.25

0.5

0.75

1

1.25

1.5

1.75

2

2.25

2.5

2.75

3

3.25

3.5

1 1.2 1.4 1.6 1.8 2 2.2

download rate, d

simapprox

sim lower bound

Aver

age

serv

erb

andw

idth

Ū = 1 Ū = 1.2 Ū = 1.4Ū = 0.9

Fig. 3: Average server bandwidth as function of the target download rate d, for different values of U , with N = 100, T = Tv .

0

1

2

3

4

5

6

0 10 20 30 40 50 60 70 80 90 100

Aver

age

serv

er b

andw

idth


sim - sequentialsim - W = 32sim - W = 16sim - W = 8sim - W = 4sim - W = 2

sim - lower bound 0

1

2

3

4

5

6

0 10 20 30 40 50 60 70 80 90 100


approx - sequentialapprox - W = 32approx - W = 16approx - W = 8approx - W = 4approx - W = 2approx - W = 1

Fig. 4: Average server bandwidth as function of the number

of users, with T = Tv , d = dv , γ = 1, for different values ofthe number of segments W . Comparison between simulation(left plot) and approximate analysis (right plot).

balanced mode (γ = 1), in the case of uniformly distributeduser upload bandwidth. In the first one (see Figure 4) we vary

the number of users, assuming T = Tv , d = dv . Recall thatthe case W = 1 in our analysis provides an approximation ofthe lower bound.

As we increase the number of segments, we obtain in-

termediate curves between the completely non-sequential

scheme and the pure sequential scheme, which corresponds to

W → ∞. Besides noticing the accuracy of the approximation,we observe that swarming schemes provide non-negligible

gains over pure sequential schemes only when the average

number of users per segment is not too small (say larger than

a few units). Recall, however, that larger segments (and thus

larger number of users in them) imply larger startup delays.

For example, the case W = 32 corresponds to a slidingwindow of about 4 minutes for a typical movie (2 hours long).

As we can see on Figure 4, the benefit of a non-sequential

scheme with W = 32 is negligible for the considered valuesof the number of users (below one hundred).

In the second scenario (see Figure 5) we fix the total number

of users N = 100, and increase the total activity time ofthe users T (the average upload bandwidth U is reducedaccordingly to keep γ = 1). We observe that as we increasethe activity time (and thus the number of seeds), the difference

in performance between swarming schemes and sequential

delivery becomes less significant.

2

2.5

3

3.5

4

4.5

5

5.5

6

1 1.1 1.2 1.3 1.4 1.5

Av

erag

e se

rver

ban

dw

idth

Average activity time / video duration

sim - sequentialsim - W = 64sim - W = 32sim - W = 16

sim - W = 8sim - W = 4sim - W = 2

sim - lower bound

2

2.5

3

3.5

4

4.5

5

5.5

6

1 1.1 1.2 1.3 1.4 1.5


approx - sequentialapprox - W = 64approx - W = 32approx - W = 16

approx - W = 8approx - W = 4approx - W = 2approx - W = 1

Fig. 5: Average server bandwidth as function of the ratio T/Tvbetween average activity time and video duration, keeping

fixed N = 100, d = dv , γ = 1, for different values of thenumber of segments W . Comparison between simulation (leftplot) and approximate analysis (right plot).

5 NON-STATIONARY SYSTEMS

In this section we show how our analytical framework can be

applied to study the performance of time-varying systems in

which the arrival rate of requests for a given content changes

significantly over time. In particular, we will see that the

behavior of a non-stationary system can dramatically differ

from the one of a stationary system in which the arrival rate of

requests is constant, due to a misalignment problem between

the temporal evolution of the number of downloaders and the

temporal evolution of the number of seeds.

5.1 The effect of daily traffic variations

We consider a reference scenario in which the arrival rate of

requests for a given video follows a daily pattern which is

modeled for simplicity by a sine function of period equal to

24 hours, between a minimum of λ = 0.1 and a maximum ofλ = 1, represented by the thick solid line in the top plot ofFig. 6.

We first analyze a software-based system in which users

contribute their upload bandwidth during the watching time

of the video, plus a random additional time in which the

application is kept running. We assume that the video duration

is Tv = 2 h, and the additional activity time after the end of thevideo is exponentially distributed with mean 1 h. We normalize

8

dv = d = 1 and assume the upload bandwidth of users to beexponentially distributed with mean U = 0.7. The per-userload is γp = 2/(0.7 · 3) ≈ 0.95. The top plot of Fig. 6 reportsthe temporal evolution of both the number of downloaders and

the number of seeds. Since peers become seeds only after the

end of the watching time, the dynamics of downloaders and

seeds are misaligned, with a temporal shift about Tv = 2 h.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

12 18 24 6 12 18 24 6 0

20

40

60

80

100

120

140

vid

eo r

equ

est

rate

, λ

(t)

nu

mb

er o

f d

ow

nlo

ader

s /

seed

sλdownloaders

seeds

0

1

2

3

4

5

6

7

8

9

10

12 18 24 6 12 18 24 6

aver

age

serv

er b

and

wid

th

time of day (hours)

non-stationarystationary (λ(t) = 1)

Fig. 6: Temporal evolution of video request rate (top plot,

left y axes), number of downloaders/seeds (top plot, right yaxes), and average server bandwidth S(t) (bottom plot), in thesoftware-based system.

The effect of this misalignment on the required average

server bandwidth is depicted on the bottom plot of Fig. 6,

by the solid line labeled ‘non-stationary’, which exhibits a

peak preceding the point at which the video request rate is

maximum. Fig. 6 reports also a curve labeled ‘stationary’,

representing the server bandwidth that would be necessary if

the content request rate were constant and equal to λ = 1 (themaximum request rate). We observe that the performance of

the non-stationary system is worse than that of the stationary

system, both in terms of peak server bandwidth and average

server load. This occurs even if the content request rate is

always larger in the stationary system.

If we increase the activity time after the end of the video,

while keeping the same per-user system load γp (either byreducing the upload bandwidth of the users, or equivalently

by increasing the download rate of the video, i.e., its resolu-

tion/quality), the negative effect of the misalignment problem

become worse. As an extreme case, we consider a P2P-VoD

system relying on set-top-boxes which are always active and

serving the last watched video. To mimic the behavior of set-

top-boxes with our model of peer dynamics, we assume that

the activity time after watching a movie is much longer than

before (in the order of a day), representing a set-top-box which

remains always on before the user downloads the next video.

In particular, we consider an additional activity time of 22 h,

which added to the watching time of a movie leads to T = 1day. To obtain the same per-user load of the software-based

system, we set U = 0.7 · 3/24.Fig. 7 reports analytical results for this scenario, analogous

to those in Figure 6. We have also reported on the bottom plot

of Fig. 7 a sample path obtained from simulation, to confirm

the analytical prediction. In this case the instantaneous system

load γ(t) is severely unbalanced across the day. During peak

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

12 18 24 6 12 18 24 6 0

200

400

600

800

1000

1200

vid

eo r

equ

est

rate

, λ

(t)

nu

mb

er o

f d

ow

nlo

ader

s /

seed

sλdownloaders

seeds

0

10

20

30

40

50

60

70

12 18 24 6 12 18 24 6

aver

age

serv

er b

andw

idth

time of day (hours)

non-stationarystationary (λ = 1)

sim trace

Fig. 7: Temporal evolution of video request rate (top plot, left

y axes) number of downloaders/seeds (top plot, right y axes),and average server bandwidth S(t) (bottom plot), in the set-top-box system.

hours, the bandwidth requested at servers grows very large,

while for the rest of the day it is negligible. The problem is

that the upload capacity of the seeds, which are very numerous

and almost stable along the day (see top plot of Figure 7) is

totally wasted for a large fraction of the day. Notice that we

are not saying that set-top-boxes are not useful: increasing the

activity time of peers (up to the point of having always-on user

devices) is very beneficial to the system performance, since

the per-user load γp is reduced. However, one must be carefulthat the instantaneous system load γ(t) can vary significantlyaround γp, and peak traffic demand cannot be absorbed wellby large populations of seeds (set-top-boxes) each devoting a

small amount of upload bandwidth to the video distribution.

Indeed, to minimize the bandwidth deficit at peak times, the

ratio U/d should not become too small.

5.2 The effect of newly introduced videos

We now consider a system in which the non-stationarity of

the video request process is due to new contents regularly

introduced in the catalogue, whose popularity changes over

time. We model such a system using a shot-noise process [14].

We assume that new videos are made available in the system at

(constant) rate β. The request rate of a given video i insertedat time ti follows an non-homogeneous Poisson process withtime-varying intensity

λi(t) = Λφ(t − ti)

where φ(t) is a shaping function, defined over t ≥ 0,modeling how the popularity of the video evolves over time.

We require φ(t) to be an integrable function, and, withoutloss of generality, we assume that

∫ ∞

0φ(t) dt = 1. By so

doing, Λ equals the average number of times that the videois requested during its lifetime. In general, both Λ and φ(t)could be random, i.e., each file, upon arrival, could be assigned

a popularity shape φ(t) extracted from a family of functionswith randomized parameters, and a value of Λ taken from agiven distribution possibly associated to the chosen shape φ(t).We denote by FΛ,φ the joint cdf of Λ and φ.

Our target is to evaluate the average overall server band-

width Z resulting from the distribution of all videos available

9

in the catalog. Despite the complexity of the system, our

approximate model can be exploited to quickly estimate Zunder several parameter settings, without running expensive

simulations. We briefly summarize the computational proce-

dure for clarity: using (1) and (2) we can analytically compute,

for any t > ti, the average number of peers downloadinga given video i, and the average number of users actingas seeds for it. Equation (10) provides the mean bandwidth

S(t, Λ, φ) requested from the servers at any time t > ti.Standard numerical integration techniques3 allow to compute

the average amount of data D(Λ, φ) =∫ ∞

tiS(t − ti,Λ, φ) dt

supplied by the servers for a file characterized by popularity

parameters {Λ, φ}, from which we can compute

Z = β

∫

Λ,φ

D(Λ, φ) dFΛ,φ (13)

Notice that the average number of users in the system is βΛT .We start considering a scenario in which both Λ and φ

are deterministic. In particular, we consider files with expo-

nentially decreasing popularity: φ(t) = µe−µt. Since Z istrivially linear in the arrival rate of new contents (13), we

arbitrarily set β = 1. Throughout our experiments we also fixthe per-user system load to γp = 0.5, as the effects that we aregoing to show occur also when users can upload a much larger

amount of data than what they request during their activity

period. We consider the pure sequential delivery scheme, and

normalize dv = d = 1. We further normalize T d = 1,and assume, whenever T > T d, that the additional activitytime after downloading the movie is exponentially distributed.

Under the above settings the average upload bandwidth of

the users, which is assumed to be exponentially distributed, is

directly related to T according to U = 2/T . At last, we defineχ = 1/(µT d), which can be thought as the average lifetimeof a video (in terms of popularity) normalized by its watching

duration. We now investigate the joint impact of the only free

parameters: Λ, χ, and T .

10

100

1000

10000

1 1.5 2 2.5 3 0.6

0.8

1

1.2

1.4

1.6

1.8

2

Av

erag

e up

load

ban

dw

idth

, U

Normalized activity period, T/Td

Λ = 1000000Λ = 100000

Λ = 10000Λ = 1000

U

1

10

100

1000

10000

1 1.5 2 2.5 3 0.6

0.8

1

1.2

1.4

1.6

1.8

2

Mea

n s

erv

er b

and

wid

th,

Z

Normalized activity period, T/Td

Λ = 10000Λ = 1000Λ = 100Λ = 10

U

Fig. 8: Average server bandwidth Z as function of normalizeduser activity period T/T d, for χ = 1 (left plot) and χ = 100(right plot), and different values of Λ. The associated averageupload bandwidth U = 2/T is reported on the right y axes.

Fig. 8 reports, for different Λ’s, the average server band-width Z as function of user activity period T , for χ = 1(left plot) and χ = 100 (right plot). We observe that theserver bandwidth Z remains low and mainly independent of

3. We adopted the simple trapezoid rule.

Λ (i.e., the system can scale up to arbitrarily large numberof users), for values of T slightly larger than 1 and smallerthan 2. Notice that when T < 2 the average user uploadbandwidth (right y axes) is larger than the video download

rate (U > 1). When this condition is not met (in general,when U ≤ dv), the system does not sustain itself, despitethe fact that users can potentially upload twice the amount

of traffic they download (γp = 0.5). This is again due to themisalignment problem between downloaders and seeds: when

T increases, more seeds becomes available, but too late withrespect to the time their upload capacity can be exploited. We

emphasize that this is in sharp contrast to what we have seen

under stationary conditions, where the effect of decreasing Ucan be completely compensated by increasing T , so that thesystem performance is not compromised (see Figure 5). We

further notice that this asymptotic behavior (for increasing Λ)occurs for any value of χ (i.e., popularity shape), and thatlarger values of χ (i.e., files whose popularity decays moreslowly) lead to larger values of Z in the regime where thesystem is able to scale.

1

10

100

1000

0.1 1 10 100 1000 10000

Mea

n s

erv

er b

and

wid

th,

Z

Normalized video lifetime, χ

T = 1T = 1.3T = 1.7

T = 2T = 3T = 6

T = 20

20

63

1.7

1.3

1

2

Fig. 9: Average server bandwidth Z as function of the nor-malized video lifetime χ, for different values of user activityperiod T , in the case of Λ = 1000.

The impact of the video popularity shape is better un-

derstood looking at Fig. 9, which reports, in the case of

fixed Λ = 1000, the server bandwidth Z as function of thenormalized video lifetime χ, for different values of T . First,note that Z is bounded above by βΛT ddv , since in the worstcase the system provides rate dv to each downloading user. Inour parameters setting, the above upper bound on Z coincidesnumerically with Λ = 1000. We observe that Z approachesthe upper bound for large values of χ (and any T ), for whichrequests for the same file are so diluted over time that it

becomes more and more unlikely to have peers concurrently

downloading the same file (and thus helping other peers).

When T increases, the upper bound is approached also forsmall values of χ, this time because downloads of the same fileare so synchronized that the upload bandwidth of users (which

decreases with T ) can be exploited only during a short intervalequal to T d after the file is inserted into the catalog, andthe resulting contribution of peers tends to become negligible.

Small values of χ can be the result of videos posted on webpages providing suggestions to the users which are frequently

updated: our results suggest that this practice can be harmful

as it can compromise an effective peer-assisted distribution

10

(when U ≤ dv). For T > 2, Z achieves a minimum for agiven value of χ. We observe that, for large values of χ (fileswith slowly decaying popularity), long user activity times are

actually beneficial to the system, although in this regime the

system is not able to scale to large number of users, as we

have seen.

30

40

50

60

70

80

90

100

120

150

200

1000 10000 100000

Mea

n s

erver

ban

dw

idth

, Z

Λ

α = 2α = 1

α = 0.5α = 0

0

5

10

15

20

25

30

1000 10000 100000

Λ

α = 2α = 1

α = 0.5α = 0

10

100

1000

1000 10000 100000

Λ

α = 2α = 1

α = 0.5α = 0

T = 1 T = 3T = 1.5

Fig. 10: Average server bandwidth Z as function of Λ, fordifferent values of the Zipf’s exponent α, and fixed χ = 10.The user activity period is either T = 1 (left plot), or T = 1.5(middle plot) or T = 3 (right plot).

At last, we consider a scenario in which Λ is a randomvariable, while maintaining the assumption that the popularity

decay exponent µ is the same for all files. More specifically,we assume that files belong to 10 different classes, whose

request rates follows a Zipf’s distribution of exponent α, i.e.,files of class i (1 ≤ i ≤ 10) are requested at rate Λi = ΛK/i

α,

where K is a normalizing constant such that∑10

i=1 Λi = Λ.Note that α = 0 corresponds to the previous scenario in whichall files have the same popularity profile. Results are shown in

Fig. 10, in which we report the mean server bandwidth Z asfunction of Λ, for different values of the Zipf’s exponent α,and fixed χ = 10. In the left plot we consider T = 1, i.e., usersabandon the system immediately after downloading the video,

i.e., no seeds are available. We notice that the server bandwidth

Z scales logarithmically with Λ, which can be explained by thelower limit d in (5). In the middle plot we consider T = 1.5,which belongs to the range in which the system performance

is mainly insensitive to Λ (see Fig. 8), and thus also to theZipf’s exponent α. In the right plot, we consider T = 3, forwhich the system is not able to scale to large number of users.

Actually, in this case Z scales linearly with Λ.In conclusion we can say that in non-stationary scenarios

the system performance critically depends on the relationship

between the average peer upload bandwidth and the download

rate: when U ≤ dv the bandwidth deficit cannot be effectivelycompensated by just increasing the seed availability (i.e, by

increasing T ). Smart prefetching policies can in principle re-duce the burden on the servers. However, prefetching policies

can not be easily implemented in non-stationary (e.g., flash-

crowd) scenarios where contents are not available in advance,

and their popularity cannot be easily predicted at the early

stages.

6 CONCLUSIONS

We have proposed a computationally-efficient methodology to

analytically estimate the server bandwidth requested in non-

stationary peer-assisted VoD systems. Our approach is highly

flexible, and can account for several important effects such as

peer upload bandwidth heterogeneity, churning, non-sequential

chunk delivery schemes.

We have shown that our analysis provides efficient, accurate

predictions of all observed phenomena, providing a useful tool

for the design of peer-assisted VoD systems.

REFERENCES[1] D. Ciullo, V. Martina, M. Garetto, E. Leonardi, and G. L. Torrisi,

“Performance Analysis of Non-stationary Peer-assisted VoD Systems,”in INFOCOM Mini-Conference, 2012.

[2] “Cisco Visual Networking Index: Forecast and Methodology, 2012–2017,” white paper published on Cisco website, 2012.

[3] C. Huang, J. Li, and K. W. Ross, “Can Internet Video-on-Demand BeProfitable?” in ACM SIGCOMM, 2007.

[4] Y. Huang, T. Z. J. Fu, D. ming Chiu, J. C. S. Lui, and C. Huang,“Challenges, Design and Analysis of a Large-scale P2P VoD System,”in ACM SIGCOMM, 2008.

[5] “PPLive, http://www.gridcast.cn/. GridCast, http://www.gridcast.cn/. PP-Stream, http://www.ppstream.com/. TVU, http://www.tvunetworks.com/.SopCast, http://www.sopcast.com/. Kankan, http://www.kankan.com/.”

[6] M. Cha, P. Rodriguez, S. Moon, and J. Crowcroft, “On next-generationtelco-managed P2P TV architectures,” in IPTPS, 2008.

[7] D. Ciullo, V. Martina, M. Garetto, E. Leonardi, and G. L. Torrisi,“Stochastic Analysis of Self-Sustainability in Peer-Assisted VoD Sys-tems,” in INFOCOM, 2012.

[8] ——, “Peer-assisted VoD Systems: an Efficient Modeling Framework,”in Supplemental material, 2013.

[9] H. Abrahamsson and M. Nordmark, “Program popularity and viewerbehaviour in a large tv-on-demand system,” in ACM IMC, 2012.

[10] M. Ahmed, S. Traverso, M. Garetto, P. Giaccone, E. Leonardi, andS. Niccolini, “Temporal locality in today’s content caching: Why itmatters and how to model it,” ACM Computer Communication Review,vol. 43, no. 5, 2013.

[11] W. Wu and J. Lui, “Exploring the Optimal Replication Strategy in P2P-VoD Systems: Characterization and Evaluation,” in INFOCOM, 2011.

[12] S. M. Ross, “Stochastic processes. 1996,” 2001.[13] W. Whitt, “Approximations for the gi/g/m queue,” Production and

Operations Management, vol. 2, no. 2, pp. 114–161, 1993.[14] D. Daley and D. Vere-Jones, An introduction to the theory of point

processes, Springer-Verlag, Ed., 1998.[15] R. Kumar, Y. Liu, and K. Ross, “Stochastic Fluid Theory for P2P

Streaming Systems,” in INFOCOM, 2007.[16] D. Wu, Y. Liu, and K. W. Ross, “Queuing Network Models for Multi-

Channel P2P Live Streaming Systems,” in INFOCOM, 2009.[17] X. Yang and G. de Veciana, “Service Capacity of Peer to Peer Net-

works,” in INFOCOM, 2004.[18] D. Qiu and R. Srikant, “Modeling and Performance Analysis of

BitTorrent-Like Peer-to-Peer Networks,” in ACM SIGCOMM, 2004.[19] N. Parvez, C. Williamson, A. Mahanti, and N. Carlsson, “Analysis of

BitTorrent-like Protocols for On-Demand Stored Media Streaming,” inACM SIGMETRICS, 2008.

[20] B. Fan, D. Andersen, M. Kaminsky, and K. Papagiannaki, “BalancingThroughput, Robustness, and In-Order Delivery in P2P VoD,” in ACMCoNEXT, 2010.

[21] B. Tan and L. Massoulie, “Optimal Content Placement for Peer-to-PeerVideo-on-Demand Systems,” in INFOCOM, 2011.

[22] W. Wu, J. Lui, and R. Ma, “On incentivizing upload capacity in P2P-VoD systems: Design, analysis and evaluation,” Computer Networks,vol. 57, no. 7, pp. 1674 – 1688, 2013.

[23] C. Zhao, J. Zhao, X. Lin, and C. Wu, “Capacity of P2P On-DemandStreaming with Simple, Robust and Decentralized Control,” in INFO-COM, 2013.

[24] Y. Zhou, T. Z. Fu, and D. M. Chiu, “Server-assisted adaptive videoreplication for P2P VoD,” Signal Processing: Image Communication,vol. 27, no. 5, pp. 484 – 495, 2012.

[25] Y. He, Z. Xiong, Y. Zhang, X. Tan, and Z. Li, “Modeling and analysisof multi-channel P2P VoD systems,” Journal of Network and ComputerApplications, vol. 35, no. 5, pp. 1568 – 1578, 2012.

[26] Z. W. Zhao, S. Samarth, and W. T. Ooi, “Modeling the effect of userinteractions on mesh-based P2P VoD streaming systems,” ACM Trans-actions on Multimedia Computing, Communications, and Applications,vol. 9, no. 2, 2013.

11

Delia Ciullo received the Master degree inTelecommunications Engineering and the Ph.D.degree in Electronics and Communications En-gineering, both from Politecnico di Torino in2007 and 2011, respectively. Between 2012 and2013 she was a post-doc fellow in the MAE-STRO team at INRIA Sophia Antipolis with anERCIM fellowship. She is currently a post-docresearcher at EURECOM Sophia Antipolis.

Valentina Martina received the Master degreein Mathematical modeling in Engineering andthe Ph.D. degree in Electronics and Communica-tion Engineering, both from Politecnico di Torinoin 2007 and 2011, respectively. In 2010, shehas been a visiting student at the TechnicolorParis Research Lab. She is currently a post-docresearcher at Politecnico di Torino.

Michele Garetto (M’04) received the Dr.Ing.degree in Telecommunication Engineering andthe Ph.D. degree in Electronic and Telecommu-nication Engineering, both from Politecnico diTorino, Italy, in 2000 and 2004, respectively. Heis currently assistant professor at the Universityof Torino, Italy.

Emilio Leonardi (M’99, SM’09) is an AssociateProfessor at the Dipartimento di Elettronica ofPolitecnico di Torino. He received a Dr.Ing de-gree in Electronics Engineering in 1991 anda Ph.D. in Telecommunications Engineering in1995 both from Politecnico di Torino. His re-search interests are in the field of performanceevaluation of wireless networks, P2P systems,packet switching.

Giovanni Luca Torrisi graduated in Mathe-matics in 1994 at the University of Rome ”LaSapienza” and obtained a Ph.D. in Mathematicsat the University of Milan. Since December 2001he has been a researcher at the CNR. Hisresearch interests are in the field of ProbabilityTheory and Applied Probability.

1

Supplemental material of paper:

Peer-assisted VoD Systems:

an Efficient Modeling Framework

Delia Ciullo, Valentina Martina, Michele Garetto, Emilio Leonardi, Giovanni Luca Torrisi

7 ANALYSIS EXTENSIONS

7.1 VCR functionalities

VoD systems can optionally provide to the users limited VCR functionalities. For example, Netflix allows watching users to

pause or restart viewing at will. In this section we show how our model can be extended to take into account the most common

VCR functionalities provided by VoD services, i.e., the possibility to place the video in pause mode, and the the possibility to

replay portions of the video which have already been downloaded. The above two operations offer to the users a significant

degree of flexibility, while being easily implementable in practice. Furthermore, they actually improve the effectiveness of

peer-assisted video distribution, as we will see.

Observe that a VoD system can react to users making pause/replay actions in different ways. We will consider the following

two extreme strategies: 1) the video download is interrupted as soon as the user enters the pause/replay state, and it is resumed

only when the user starts watching the left portion of the video; 2) the download is never interrupted, and thus proceeds up

to its natural end.

The first strategy makes sense while having in mind a short-term objective. Indeed, by interrupting the download of users in

pause/replay, the system achieves an instantaneous reduction of global user bandwidth request. The second strategy, instead,

which keeps on the video download, can be justified by having in mind a long-term objective. Indeed, users who complete

sooner the download (and are likely to watch the entire video) will act as seeds for an extended period of time.

In conclusion, the first strategy responds to the need of reducing the instantaneous load, whereas the second aims at increasing

the number of future seeds. While it is clear that pause/replay actions improve in general the system performance (since users

remain in the system for a longer time, contributing their upload bandwidth) it is not easy, however, to guess which strategy

performs better, since this depends on many system parameters, including the user behavior. An analytical model like ours can

actually be very useful to predict the system performance in different scenarios.

Both strategies described above can easily be incorporated in our model. In particular, the second strategy does not require

any substantial modification to the model which does not consider VCR functionalities. Indeed, users entering the pause/replay

state, but keeping on the download, have no impact on (1) and the approximation developed in Section 3.2. The only difference

is in the activity time T of users (and its distribution GT (t)), which will be prolonged by pause/replay events, increasing thenumber of seeds (and reducing the system load).

The iterative procedure described in Section 3.2, instead, must be slightly modified to analyse the first strategy, in which

video download is suspended while users are in pause/replay state. In this case, Proposition 1 is generalized to,

Proposition 2: Quantity Sd(t, k) satisfies the following recursive equation:

Sd(t, k) =

{

d I{1 ∈/ P} k = 1d I{k ∈/ P} + max{0, Sd(t, k − 1) − Uk} k > 1

where I{k ∈/ P} is the indicator function of the event {user k is not pausing/replaying}. Note that I{k ∈/ P} models the fact thatnodes which are currently in pause are not downloading the video (so their contribution to the increase of the bandwidth

request is null) while they are still contributing their upload bandwidth to the content distribution.

Then, denoting by pk = E[I{k ∈/ P}] the probability that the k-th downloading user is not pausing, we can easily extend theGaussian approximation procedure described in Section 3.2 by simply replacing (8) and (9) with:

Sd(t, k) ≃ pkd + E[y′]

σ2Sd(t, k) ≃ (pk)(1 − pk)d2 + E[y′2] − E[y′]2

under the initial conditions: Sd(t, 1) = dp1 and σ2Sd

(t, 1) = (p1)(1 − p1)d2.

2

7.2 Peers with limited memory capabilities

At last, we consider the situation in which the amount of memory available at a peer to store the video currently being watched

is not large enough to cache the entire video file. This lack of memory clearly penalizes the effectiveness of peer-assisted

video distribution, as users can redistribute only a limited portion of downloaded data.

To mitigate the impact of memory constraints, an effective solution that we propose and analyse here is to adopt a ‘striping’

approach: the original video stream is divided into M sub-streams, called stripes, whose size (and number) can be adaptedto the storage capacity of peers. Assuming, for simplicity, that all peers have the same memory constraint (the more general

case can be similarly handled), we will focus on the case in which all stripes have the same size, equal to the storage capacity

of peers. Peers downloading a video need to retrieve the M stripes in parallel. Then, each peer is requested to cache andredistribute a single, randomly selected stripe belonging to the requested video.

Notice that each stripe can be regarded as a content of the same temporal duration of the original video, with an associated

play-out rate equal to dv/M . Our model can be easily extended to analyze this case as well. Indeed (1) can be generalized tocompute the bandwidth requested from the servers by each individual stripe, as follows:

Proposition 3: Quantity Sd(t, k), representing now the bandwidth requested by downloaders of one particular stripe, satisfiesthe following recursive equation:

sd(t, k)=

{

dM k = 1dM +max{0, Sd(t, k − 1)−UkI{k ∈ S}} k > 1

where I{k ∈ S} is an indicating function specifying whether the k-th user is redistributing the current stripe. By constructionE[I{k ∈ S}] = 1/M .The Gaussian approximation procedure described in Section 3.2 can be easily adapted to analyze this case, using:

µ = Sd(t, k − 1) − U/M and variance σ2 = σ2Sd(t, k − 1) + (σ

2U + U

2)/M − U

2/M2. At last observe that, as consequence of

the fact that every peer redistributes only one randomly selected stripe, we need to modify the mean of Sseed(t) (the contribution

of seeds associated to the considered stripe) to U N seed(t)/M , whereas the variance of Sseed(t) becomes N seed(t)(σ2U +U

2)/M .

8 PERFORMANCE UNDER STATIONARY CONDITIONS

8.1 Impact of upload bandwidth heterogeneity

0

5

10

15

20

25

0 0.5 1 1.5 2

Aver

age

serv

er b

andw

idth

Variation coefficient

approxsim

sim lower bound

0

2

4

6

8

10

12

14

0 0.5 1 1.5 2

Aver

age

serv

er b

andw

idth

Variation coefficient

approxsim

sim lower bound

Ū = 0.9 Ū = 1.1

Fig. 11: Comparison of average server bandwidth with N = 100, d = dv , T = Tv , as function of the variation coefficient ofpeer upload bandwidth, for U = 0.9 (left plot) and for U = 1.1 (right plot).

Figure 11 compares the average server bandwidth S as function of the variation coefficient of the peer upload bandwidth(keeping fixed the mean), considering N = 100, d = dv , T = Tv . The average upload bandwidth is equal to either U = 0.9(deficit mode) or U = 1.1 (surplus mode). Simulation results are supplemented by 95%-level confidence intervals. The uploadbandwidth distribution used in the simulations depends on the variation coefficient: for values larger than one, we adopt a

second-order hyper-exponential distribution with balanced means; for values smaller than one, we employ an exponential

distribution added to a constant (this explains the small glitch at variation coefficient equal to 1).

We observe that the average server bandwidth increases significantly as the variability of upload bandwidth increases, while

our approximation tends to provide a conservative prediction.

8.2 Impact of VCR functionalities

In Figure 12 we evaluate the impact of users pausing the video playback or replaying portions of video already watched. As

explained in Sec. 7.1 these VCR functionalities can be easily modeled introducing the probability pk that the k-th downloading

3

user is watching fresh new content (i.e., the portion of video currently being donwloaded). For simplicity, we assume that pkis a given constant, equal for all k (more in general, we would need a sub-model to compute pk on the basis of additionalassumptions about the user behavior). The left plot of Figure 12 refers to the system that keeps on the download while the

user is in pause/replay state (system 1), whereas the right plot refers to the system in which the download is suspended in

pause/replay state (system 2).

Results in Figure 12 refer to the same scenario considered in Figure 5. We observe non-negligible differences between the

two systems only when users act as seeds for zero or small time after finishing watching the video4. In this case, system 1

performs better than system 2, due to the very beneficial effect induced by just a small number of seeds, which are artificially

created in system 1 by keeping on the download. We remark, however, that the gain achieved by system 1 over system 2 also

depends on the probability that users watch the entire video, actually consuming all downloaded data. Premature abandons

(not present in the scenario considered in Figure 12), if likely to occur, can make system 2 preferable to system 1.

0

1

2

3

4

5

6

0 0.1 0.2 0.3 0.4 0.5

Avera

ge s

erv

er

bandw

idth

Original seed time / video duration

approx - pk = 1.0approx - pk = 0.98approx - pk = 0.95approx - pk = 0.9

0

1

2

3

4

5

6

0 0.1 0.2 0.3 0.4 0.5

Original seed time / video duration

approx - pk = 1.0approx - pk = 0.98approx - pk = 0.95

approx - pk = 0.9

Fig. 12: Average server bandwidth as function of the (original) ratio between the seed time and the video duration, for different

values of probability pk. Comparison between system 1 (left) and system 2 (right).

8.3 Impact of limited memory at peers

At last, we investigate the impact of memory constraints at peers, as analysed in Section 7.2. We consider again the same

scenario as in Figures 5 and 12. Recall that 1/M represents the fraction of the entire video that can be cached at a peer.

2

4

6

8

10

12

14

16

18

20

1 1.1 1.2 1.3 1.4 1.5

Aver

age

serv

er b

andw

idth


sim - M = 10approx - M = 10sim - M = 5approx - M = 5sim - M = 2approx - M = 2sim - M = 1approx - M = 1

Fig. 13: Average server bandwidth as function of the ratio between average activity time and video duration, for different

values of the number of stripes M . Comparison between simulation and approximate analysis.

As expected, Figure 13 confirms that limited memory availability at peers negatively affects the system performance, in

rather strong way. The increase in the server bandwidth, however, is not linear with respect to M .

8.4 Summary of results under stationary conditions

By applying our performance evaluation methodology under various parameters setting, we have discovered several interesting

properties of P2P-VoD systems:

- in the surplus mode, the server bandwidth achieves a maximum as we increase the number of users, and then decreases to

4. The seed time is incremented in system 1 by pause/replay functionalities: the seed time reported in Figure 12 refers to the original seed time (withoutpause/replay effects), due to user behavior after the video playback is finished.

4

zero provided that T > T d (i.e., the average activity time is larger than the average download time) (Fig. 2);- under the pessimistic assumption that users leave the system after watching the video, the server bandwidth can be minimized

by a proper selection of the target download rate, under any system load (Fig. 3);

- the server bandwidth increases with the variation coefficient of the peer upload bandwidth distribution (Fig. 11);

- the gain achievable by non-sequential schemes over the simple sequential scheme depends critically on the size of the sliding

window and the number of downloading users, and vanishes as the number of seeds increases (Fig. 4 and 5);

- VCR-like functionalities allowing users to stop/rewind the video playback improve the system performance (Fig. 12);

- memory contraints can severely penalize the effectiveness of peer-assisted video distribution (Fig. 13).

9 SUMMARY OF RESULTS UNDER NON-STATIONARY CONDITIONS

We found out the following interesting properties:

- under non-stationary traffic conditions peer-assisted VoD systems are affected by a misalignment problem between

downloaders and seeds. The per-user load γp is not enough to characterize the system performance, which comes to criticallydepend on the amount of bandwidth contributed by peers while they download the video: when U > dv the bandwidthdemanded from the servers is essentially independent from the system size (number of watching users); if U < dv , instead,the bandwidth demanded from the server scales linearly with the system size, regardless of the availability of users as seeds

and the per-user system load γp (Fig. 8);- in the regime where the system does not scale with the number of users, the system performance is further negatively affected

by an increase in the content heterogeneity (in terms of popularity) (Fig. 10).

10 RELATED WORK

A stochastic fluid approach to analyze peer-assisted video distribution has been proposed in [15] in the context of live streaming,

in which (heterogeneous) peers download and playback content synchronously. They derive simple conditions for the existence

of a fluid distribution scheme that achieves universal streaming. in a system with a given available server rate, and two classes

of users (with high/low upload capacity). In [16] authors extend the analysis in [15] developing queueing network models of

multi-channel P2P live streaming systems, capturing peer churn, bandwidth heterogeneity, and Zipf-like channel popularity,

and showing the benefit of the View-Upload Decoupling strategy. Here we apply the stochastic fluid approach to VoD systems,

whose dynamics are quite different from live streaming, since users can watch the video asynchronously.

A mathematical formulation of the server bandwidth needed under sequential delivery appeared in [4], in which authors

resort to a Monte Carlo approach to get basic insights into the system behavior (like surplus and deficit modes). The sequential

delivery scheme has been considered also in [11], where authors explore by simulation the effectiveness of different replication

strategies to minimize the server load in the slightly surplus mode, as well as distributed replacement algorithms to achieve it.

Differently from [11], we provide an analytical approximation that can account for both sequential and non-sequential delivery

schemes under any system load, considering also the impact of seeds.

Stochastic fluid models for BitTorrent-like file-sharing system, accounting for the dynamics of downloaders and seeds, have

been proposed for both transient and steady-state regimes [17], [18], but they are not directly applicable to streaming systems.

In [19], authors adapt the fluid model in [18] to VoD systems, investigating the impact of different piece selection policies

(rarest-first and in-order) on download latency and startup delay, in the case of homogeneous peers. In contrast to [19], we

focus on the characterization of VoD systems with strict service guarantees and heterogeneous user upload bandwidths. In

[20], a per-chunk capacity model is developed to show the tradeoff that exists between system throughput, sequentiality of

downloaded data and robustness to heterogeneous network conditions. Optimal content placement strategies to maximize the

upload capacity of (homogeneous) set-top-boxes (and thus minimize the servers workload) in VoD systems have been recently

investigated in [21] under many-user asymptotic.

More recently, [22] characterizes the content providers’ uploading cost as a function of the peers’ contribution, and proposes

an incentive mechanism that rewards peers based on their dedicated upload bandwidth. However, they evaluate the performances

of their scheme only through simulation. Somehow complementary to our work is the study in [23], where control algorithms that

achieve the optimal streaming capacity as the number of peers increases, are proposed. [24] considers a distributed and adaptive

video replication strategy with some server feedback. Similarly to our paper, [24] deals with peer churn and non-stationary

popularity of movies, but results are validated through simulation only. [25] proposes two analytical models that capture several

aspects of peer behavior, such as participating in the system, sojourning in a channel, downloading and uploading the content,

wandering around channels and leaving the system. However the effects of upload bandwidth heterogeneity, non sequential

delivery and non-stationary traffic conditions are not taken into account. In [26] an analytical model to both qualitatively and

quantitatively study the effect on server cost of seeks and pauses on mesh-based P2P VoD streaming systems is developed.

To the best of our knowledge, we are the first to analytically investigate the performance of peer-assisted VoD systems under

non-stationary traffic conditions.

5

11 DISCUSSION ON MODEL ASSUMPTIONS

In this section we critically revisit the main assumptions of our model discussing their impact on the analysis of a VoD system.

The first strong assumption consists in assuming a fixed video playback rate. This assumption actually does not hold in

practice since most video encoding schemes produce variable bitrate streams. However, rapid bitrate fluctuations are usually

averaged out by the playout buffer of the decoder, so that assuming a fixed download rate d larger than or equal to theaverage playback rate can be an acceptable assumption, while being also a reasonable design choice to simplify the system.

Nevertheless, it would be possible to incorporate in the model a random download rate, representing fluctuations due to

variable video bit-rate and cross-traffic effects. Indeed, since different users in a VoD system are retrieving at time t differentand independent segments of the video content, we could well assume instantaneous play-back rates of users to be described

by i.i.d. random variables with known distribution. The effect of variable play-back rates could then be easily incorporated

in our modeling framework without changing its mathematical structure. Observe, indeed, that the structure of Sd(t, k) in (5)remains unchanged when d is replaced with dk, where dk is a random variable, while the gaussian approximation in 3.2 canbe easily extended to handle this case, given the first two moments of dk.

The second important assumption of our work consists in modeling the users’ upload bandwidth as i.i.d. random variables

Ui with assigned distribution. We do not consider this assumption particularly restrictive, since it permits to represent prettywell bandwidth heterogeneity of users’ access links, as well as random fluctuations in the available upload bandwidth due to

cross traffic and other forms of bandwidth restriction. Notice that such fluctuations could be also correlated over time at each

user, without affecting our analysis, which is essentially based on an instantaneous analysis of the system. The assumption that

upload bandwidths are uncorrelated among users is also reasonable, since in a VoD system users are geographically spread,

and thus they are likely to experience independent bandwidth fluctuations.

At last, in our work we have ignored implementation issues such as: i) the effects of protocol overheads and signalling

bandwidth (necessary to reconfigure the cooperation among users); ii) possible constraints on the number of peers from which

a user can simultaneously download data; iii) the effect of congestion inside the network. All these issues can potentially affect

the performance of a realistic system, but we have not incorporated them for the sake of simplicity and analytical tractability.

APPENDIX APROOF OF PROPOSITION 1

The case k = 1 is obvious. The recursive expression for k ≥ 2 can be easily explained if we look at the users in reverseorder with respect to the arrival time into the system, i.e., user k arrives before user k − 1. Suppose that we know the serverbandwidth Sd(t, k − 1) needed in the presence of k − 1 users. Then user k can reduce this rate by its upload bandwidth Uk,possibly bringing the server rate to zero. Instead, user k cannot be helped by any other peers, hence it requires fresh newcontent from the server at rate d.

APPENDIX BEXTENSION TO NON-SEQUENTIAL DELIVERY

More formally, let P0(t, v) = e−Nv(t) be the probability that there are no users downloading segment v at time t.

Let Sd(t, v) and S(2)

d (t, v) be the first and second moment of Sd(t, v), respectively. To start our iterative computation, we

set Sd(t, 0) = 0 and S(2)

d (t, 0) = 0. Now, suppose that we know Sd(t, v−1) and S(2)

d (t, v−1) for a given v ≥ 1. Conditioningon the event Nv(t) > 0, we approximate Sd(t, v − 1) − Ũ by a normal random variable y of mean

µ = Sd(t, v − 1) −Nv(t)

1 − P0(t, v)(U − d) − d

and variance

σ2 = S(2)

d (t, v − 1) − (Sd(t, v − 1))2 +

Nv(t)

1 − P0(t, v)σ2U +

[

Nv(t) + (Nv(t))2

1 − P0(t, v)−

(

Nv(t)

1 − P0(t, v)

)2]

(U − d)2

and apply (6) and (7) to compute the first moment E[y′] and second moment E[y′2] of y′ = max{0, y}. Then we obtain thefirst two moments of Sd(t, v) as:

Sd(t, v)= P0(t, v)Sd(t, v−1)+(1−P0(t, v))(d + E[y′])

S(2)

d (t, v)= P0(t, v)S(2)

d (t, v−1)+(1�

Documents

CR Ciullo TPDS - EURECOM · 2020. 11. 14. · Traditional (client-server) Content Delivery Networks (CDN) ... After characterizing Sd(t,k ), the evaluation of S(t) is easy, ... to