16
1 Peer-assisted VoD Systems: an Efficient Modeling Framework Delia Ciullo, Valentina Martina, Michele Garetto, Emilio Leonardi, Giovanni Luca Torrisi Abstract—We analyze a peer-assisted Video-on-Demand system in which users contribute their upload bandwidth to the redistribution of a video that they are downloading or that they have cached locally. Our target is to characterize the additional bandwidth that servers must supply to immediately satisfy all requests to watch a given video. We develop an approximate fluid model to compute the required server bandwidth in the sequential delivery case, as well as in controlled non sequential swarms. Our approach is able to capture several stochastic effects related to peer churn, upload bandwidth heterogeneity, non-stationary traffic conditions, which have not been documented or analyzed before. At last, we provide important hints for the design of efficient peer-assisted VoD systems under server capacity constraints. Index Terms—Video-on-demand, peer-to-peer, performance evaluation. 1 I NTRODUCTION The efficient distribution of video contents will be one of the main challenges of the future Internet. According to Cisco forecasts [2], the combination of all forms of video (live streaming, video-on-demand and P2P file sharing) will be in the range of 80 to 90 percent of global consumer Internet traffic by 2017, posing a tremendous challenge to both content providers and network operators. In particular, Video-On- Demand traffic has been increasing tremendously during the last few years. For example, it has been reported that, on a normal weeknight, Netflix alone accounts for almost one third of all Internet traffic entering North American homes. Traditional (client-server) Content Delivery Networks (CDN) help alleviate the traffic on the transport infrastructure by “moving” contents close to the users, however they do not solve the scalability problem of data centers and server farms, whose resources (bandwidth/storage/processing) increase lin- early with the user demand and the data volume. Peer-assisted video distribution architectures, in which users contribute their upload bandwidth to the system while viewing the requested video, have been advocated as a viable alterna- tive to traditional CDNs to reduce the server workload and guarantee the scalability to large populations of users [3], [4]. Several peer-assisted systems have already been deployed, such as PPLive, GridCast, PPStream, TVU, SopCast, Xunlei Kankan [5]. Despite the wide popularity gained by existing applications, several fundamental questions remain unanswered about the design of video streaming systems and the potential benefits of the peer-assisted approach. Indeed, the unpredictable nature D. Ciullo is with EURECOM, Sophia Antipolis, France. E-mail: [email protected] V. Martina and E. Leonardi are with Dipartimento di Elettron- ica, Politecnico di Torino, Torino, Italy. E-mail: {valentina.martina, emilio.leonardi}@polito.it M. Garetto is with Dipartimento di Informatica, Universit` a di Torino, Torino, Italy. E-mail: [email protected] G. L. Torrisi is with Istituto per le Applicazioni del Calcolo, CNR, Roma, Italy. E-mail: [email protected] A preliminary version of this paper appeared at IEEE INFOCOM Mini- Conference 2012 [1]. of users cooperation, that cannot always guarantee the strict quality-of-service requirements of online video; the added complexity on the control plane due to signalling and chunk scheduling; and the need to provide incentive mechanisms to the users, tend to discourage the content providers to adopt peer-assisted solutions. In our work, we focus on peer-assisted Video-on-Demand (VoD) systems, for the on-line distribution of movies to a large audience of users, such as Netflix. Users can browse a catalog of available movies, and asynchronously issue requests to watch a given content, which are ideally immediately satisfied by the system, with the optional support for limited VCR actions such as pause and jump backward/restart. Notice that Video-on-Demand (VoD) systems are quite different from live video streaming applications, in which users join the distribution of a given TV channel at random points in time, but peers connected to the same channel watch the content almost synchronously. In peer-assisted VoD systems, users interested to a specific video can retrieve it from servers (CDN modality), from peers downloading/watching it, and from users storing a copy of it in their computer/Internet TV memory or in dedicated set-top- boxes remotely controllable by the network operator [6]. The content is typically divided into small chunks (typically one or few video frames) which are retrieved from other peers (or from servers) in a fully distributed (swarming) fashion based on the exchange of chunk bitmaps. However, differently for traditional file-sharing, chunk and peer selection strategies for peer-assisted video distribution must account for the fact that users watch while downloading. In particular, to avoid service interruptions/degradations: i) a minimum average download rate equal to the video playback rate must be guaranteed; ii) an “almost in order” delivery of chunks is required. Our main contribution is a stochastic fluid framework that allows to approximately estimate the additional bandwidth that servers must provide to satisfy all requests to watch a given video. Our methodology can account for several stochastic effects related to peer churn, upload bandwidth heterogeneity, non-stationary traffic conditions, which have not been analyzed before, providing a useful tool for the analysis and design of

CR Ciullo TPDS - EURECOM · 2020. 11. 14. · Traditional (client-server) Content Delivery Networks (CDN) ... After characterizing Sd(t,k ), the evaluation of S(t) is easy, ... to

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • 1

    Peer-assisted VoD Systems:an Efficient Modeling Framework

    Delia Ciullo, Valentina Martina, Michele Garetto, Emilio Leonardi, Giovanni Luca Torrisi

    Abstract—We analyze a peer-assisted Video-on-Demand system in which users contribute their upload bandwidth to the redistribution

    of a video that they are downloading or that they have cached locally. Our target is to characterize the additional bandwidth that servers

    must supply to immediately satisfy all requests to watch a given video. We develop an approximate fluid model to compute the required

    server bandwidth in the sequential delivery case, as well as in controlled non sequential swarms. Our approach is able to capture

    several stochastic effects related to peer churn, upload bandwidth heterogeneity, non-stationary traffic conditions, which have not been

    documented or analyzed before. At last, we provide important hints for the design of efficient peer-assisted VoD systems under server

    capacity constraints.

    Index Terms—Video-on-demand, peer-to-peer, performance evaluation.

    1 INTRODUCTION

    The efficient distribution of video contents will be one of the

    main challenges of the future Internet. According to Cisco

    forecasts [2], the combination of all forms of video (live

    streaming, video-on-demand and P2P file sharing) will be in

    the range of 80 to 90 percent of global consumer Internet

    traffic by 2017, posing a tremendous challenge to both content

    providers and network operators. In particular, Video-On-

    Demand traffic has been increasing tremendously during the

    last few years. For example, it has been reported that, on a

    normal weeknight, Netflix alone accounts for almost one third

    of all Internet traffic entering North American homes.

    Traditional (client-server) Content Delivery Networks

    (CDN) help alleviate the traffic on the transport infrastructure

    by “moving” contents close to the users, however they do not

    solve the scalability problem of data centers and server farms,

    whose resources (bandwidth/storage/processing) increase lin-

    early with the user demand and the data volume.

    Peer-assisted video distribution architectures, in which users

    contribute their upload bandwidth to the system while viewing

    the requested video, have been advocated as a viable alterna-

    tive to traditional CDNs to reduce the server workload and

    guarantee the scalability to large populations of users [3], [4].

    Several peer-assisted systems have already been deployed,

    such as PPLive, GridCast, PPStream, TVU, SopCast, Xunlei

    Kankan [5].

    Despite the wide popularity gained by existing applications,

    several fundamental questions remain unanswered about the

    design of video streaming systems and the potential benefits

    of the peer-assisted approach. Indeed, the unpredictable nature

    • D. Ciullo is with EURECOM, Sophia Antipolis, France. E-mail:[email protected]

    • V. Martina and E. Leonardi are with Dipartimento di Elettron-ica, Politecnico di Torino, Torino, Italy. E-mail: {valentina.martina,emilio.leonardi}@polito.it

    • M. Garetto is with Dipartimento di Informatica, Università di Torino,Torino, Italy. E-mail: [email protected]

    • G. L. Torrisi is with Istituto per le Applicazioni del Calcolo, CNR, Roma,Italy. E-mail: [email protected]

    A preliminary version of this paper appeared at IEEE INFOCOM Mini-Conference 2012 [1].

    of users cooperation, that cannot always guarantee the strict

    quality-of-service requirements of online video; the added

    complexity on the control plane due to signalling and chunk

    scheduling; and the need to provide incentive mechanisms to

    the users, tend to discourage the content providers to adopt

    peer-assisted solutions.

    In our work, we focus on peer-assisted Video-on-Demand

    (VoD) systems, for the on-line distribution of movies to a large

    audience of users, such as Netflix. Users can browse a catalog

    of available movies, and asynchronously issue requests to

    watch a given content, which are ideally immediately satisfied

    by the system, with the optional support for limited VCR

    actions such as pause and jump backward/restart. Notice that

    Video-on-Demand (VoD) systems are quite different from

    live video streaming applications, in which users join the

    distribution of a given TV channel at random points in time,

    but peers connected to the same channel watch the content

    almost synchronously.

    In peer-assisted VoD systems, users interested to a specific

    video can retrieve it from servers (CDN modality), from peers

    downloading/watching it, and from users storing a copy of it

    in their computer/Internet TV memory or in dedicated set-top-

    boxes remotely controllable by the network operator [6]. The

    content is typically divided into small chunks (typically one

    or few video frames) which are retrieved from other peers (or

    from servers) in a fully distributed (swarming) fashion based

    on the exchange of chunk bitmaps. However, differently for

    traditional file-sharing, chunk and peer selection strategies for

    peer-assisted video distribution must account for the fact that

    users watch while downloading. In particular, to avoid service

    interruptions/degradations: i) a minimum average download

    rate equal to the video playback rate must be guaranteed; ii)

    an “almost in order” delivery of chunks is required.

    Our main contribution is a stochastic fluid framework that

    allows to approximately estimate the additional bandwidth that

    servers must provide to satisfy all requests to watch a given

    video. Our methodology can account for several stochastic

    effects related to peer churn, upload bandwidth heterogeneity,

    non-stationary traffic conditions, which have not been analyzed

    before, providing a useful tool for the analysis and design of

  • 2

    VoD systems.

    We consider both the simple basic sequential delivery

    scheme, in which users download the video chunks (almost)

    sequentially, as well as, schemes that tolerating higher play-

    out delay permit to exploit non-sequential delivery to improve

    the overall system performance.

    The analytical approach described in this paper comple-

    ments the analysis presented in [7], in which we obtain

    rigorous bounds for the sequential delivery scheme (under

    stationary traffic conditions) and asymptotic results as the

    number of users increases. With respect to [7], we extend the

    analysis to non-sequential delivery schemes and non-stationary

    traffic conditions, with a different goal in mind, i.e., to provide

    a performance evaluation tool that can be readily used for

    system design and optimization.

    We emphasize that in our work we do not consider issues

    related to optimal replication strategies of heterogeneous con-

    tents (in size and popularity) or optimal peer resource alloca-

    tion (in terms of storage and upload bandwidth) in the presence

    of multiple videos (e.g., universal streaming architectures).

    This because we focus on the bandwidth requested from the

    servers to distribute a given video, assuming that the peer

    resources allocated to it (i.e., number of copies available in

    the system and the amount of upload bandwidth devoted to

    the considered video) are given. Our analysis can be combined

    with optimal resource and replication strategies for universal

    streaming architectures.

    For an overview of related work, see Section 10 in [8].

    2 MODEL

    2.1 System assumptions

    We model a fairly general peer-assisted VoD system. Users1

    run applications that allow them to browse an online catalog

    of videos. When a user selects a video, we assume that the

    request is immediately satisfied and the selected video can be

    watched uninterruptedly till the end (i.e., a continuity index

    equal to 1 is guaranteed). This is possible only if the system

    is able to steadily provide to each user a data flow greater than

    or equal to the video playback rate.

    Users contribute their upload bandwidth to the video dis-

    tribution, thus they can retrieve part of the video (or even

    the entire video) from other peers, saving servers resources.

    The main goal of our analysis is to characterize the additional

    bandwidth that servers must supply (in addition We provide a

    fundamental tool to properly dimension the CDN infrastruc-

    ture supporting the VoD system.

    We focus on a given video of duration Tv seconds and sizeL bytes, which is played back by the users’ applications at ratedv = L/Tv bytes/s. Clearly, to guarantee continuous playbackeach user must at least receive video chunks sequentially at

    rate dv . As a widely adopted strategy to mitigate bandwidthfluctuations, applications pre-fetch and buffer video chunks

    before playback (notice that we consider VoD systems, hence

    we assume, in contrast to live streaming systems, that all

    chunks are immediately available, at least at the servers). In

    1. In this paper we use the terms peer and user interchangeably.

    our model, we assume that the system provides to each user

    a fixed target download rate d ≥ dv (we assume that thedownload bandwidth on the access links of the users is large

    enough that it does not constitute a bottleneck).

    In general, the target download rate of a peer could be

    adapted to the portion of video being downloaded, or even

    depend on some peer’s characteristics (such as its upload

    bandwidth). By imposing a constant target download rate

    d ≥ dv at each user we simplify the analysis, while obtaininga conservative prediction with respect to the case in which

    the target download rate is adapted over time and to the

    peer characteristics. Notice that the target download rate dcan be chosen by the system. We will show that in some

    cases, unexpectedly, the optimal value of d (i.e., the one thatminimizes the average bandwidth requested from the servers)

    is actually larger than dv .The amount of upload bandwidth with which peers con-

    tribute to the redistribution of the video that they are down-

    loading may or may not be under the control of the system. In

    our analysis, we assume that the upload bandwidth available

    at a peer is a random variable with given distribution. This

    way, we encompass both the realistic case of users with

    heterogeneous Internet connections (i.e., ADSL, fiber, LAN)

    and cross-traffic fluctuations, and the case in which the peer

    upload bandwidth allocated to the given video is tuned by

    the system (such as in universal streaming architectures).

    More specifically, the amount of upload bandwidth with which

    users contribute at a given time to the redistribution of the

    considered video is modeled by a random variable U withcumulative distribution function FU (w), mean U and varianceσ2U . The random variables denoting the instantaneous uploadbandwidths of the users are assumed to be i.i.d. (identically

    and independently distributed). See Section 11 in [8] for a

    critical discussion on our system assumptions.

    2.2 Peers dynamics

    We need to incorporate in our analysis a model describing

    how peers join the distribution of the considered video, and

    when and how they leave the system, stopping to contribute

    their upload bandwidth. To this aim, we adopt a very flexible

    model that allows to consider a non-stationary video request

    process, and general peer churn, which is an another crucial

    feature of any realistic P2P system.

    In particular, we assume that the arrival process of requests

    for the considered video follows a time-varying Poisson pro-

    cess of intensity λ(t). By so doing, we are able to captureeffects related to content popularity variation, while maintain-

    ing analytical tractability [9], [10].Indeed, assuming that at a

    given time the arrival process is Poisson is reasonable, since

    users behave independently of each other, and their requests

    are immediately satisfied. On the other hand, a video (e.g., a

    typical movie) can be long enough that the rate at which it

    is requested can vary significantly during the playing time,

    due to daily traffic fluctuations, or rapidly-changing video

    popularity. We account for this fact assuming that the video

    request process is described by an non-homogeneous Poisson

    process with time-varying intensity λ(t), which can possibly

  • 3

    be equal to zero before a given time t0, to model a newlyintroduced video inserted into the catalog at time t0.

    The dynamics of peer participation in the distribution of

    a given video must account for the fact that activity periods

    of the users are highly heterogeneous, as observed in several

    measurement studies [4]: some users stop watching the video

    after a very short time since the beginning, because they realize

    they are no longer interested in it; most users who decide to

    watch the video shut down the computer/Internet-TV towards

    the end of it; some of them keep the application running for

    prolonged time after the end of the video; those running set-

    top-boxes can be considered to be always active and serving

    other peers (until they stop contributing to the distribution of

    the considered video). We account for general user behavior

    assuming that the activity period of a user (i.e., the interval

    during which a user contributes its upload bandwidth, starting

    from the instant at which the video has been requested) is

    described by an arbitrary random variable T with finite meanT and complementary cumulative distribution function GT (x).The activity periods of the users are assumed to be i.i.d.

    It follows from our assumptions that the number of active

    users N(t) at time t is distributed as the number of customersin an M/G/∞ queue with time-varying arrival rate, hence itfollows a Poisson distribution with time-varying mean N(t)given by

    N(t) =

    ∫ ∞

    0

    λ(t − x)GT (x) dx (1)

    In our analysis we need to distinguish two classes of active

    users: those who are still downloading the video, and those

    who have completed the download (referred to as seeds in the

    following). Let τd = L/d be the time needed to downloadthe whole video, and T d =

    ∫ τd0

    GT (x)dx the average timespent by peers downloading the video. The number of down-

    loading peers at time t, denoted by Nd(t), follows a Poissondistribution of mean Nd(t) given by

    Nd(t) =

    ∫ τd

    0

    λ(t − x)GT (x) dx (2)

    Then standard properties of Poisson processes allow to say that

    the number of seeds at time t, denoted by Nseed(t), follows aPoisson distribution of mean N seed(t) = N(t) − Nd(t).

    We define as instantaneous system load γ(t) the quantity

    γ(t) =d · Nd(t)

    U · N(t)(3)

    which is the ratio between the average data rate requested

    at time t by downloading peers and the average uploadrate provided by all active users at time t. Borrowing theterminology adopted in previous work [3], [11] we say that

    at time t the system operates in deficit mode if γ(t) > 1, inbalanced mode if γ(t) = 1, and in surplus mode if γ(t) < 1.

    We also introduce the per-user system load γp =d·T dU ·T

    ,

    which is the ratio between the average amount of data that are

    downloaded by a peer, and the average amount of data that a

    peer is able to offer to other peers. Note that by construction

    γp is equal to the (constant) instantaneous system load in thecase of a stationary user arrival process. In ergodic systems,

    TABLE 1: Notation

    Symbol Definition

    L video size (bytes)dv video playback rate (bytes/s)Tv video playback duration (s), Tv = L/dvd target download rate (bytes/s)U average user upload bandwidth (bytes/s)T average user activity period (s)T d average time spent downloading the video (s)λ(t) arrival rate of requests for the video at time tN(t) average number of active users at time tNd(t) average number of downloading users at time tN seed(t) average number of seeds at time tSd(t) average bandwidth requested by downloaders at time tSseed(t) average bandwidth offered by seeds at time tS(t) average bandwidth requested from the servers at time t

    γp can be regarded as the time average of γ(t).

    2.3 Performance metrics

    The main goal of our paper is to characterize the additional

    bandwidth required from the servers (i.e., in addition to the

    aggregate bandwidth provided by the peers) to guarantee an

    optimal quality of service (i.e., continuity index equal to 1) to

    the users.

    Let S(t) be the random variable denoting the additionalbandwidth that the servers must supply at time t to satisfy allactive downloads of the considered video. We denote by S(t)and σ2S(t) the mean and variance of S(t), respectively.

    Since in practice there are multiple videos to be served

    concurrently by the system, statistical multiplexing arguments

    suggest that a good design goal is to minimize the mean

    value S(t) of the server bandwidth required by a single video.Therefore, this will be the main metric that we will look at in

    our performance analysis. Table 1 summarizes the notation of

    our model.

    3 ANALYSIS

    3.1 Sequential delivery

    We start considering the simple case in which users download

    the video chunks sequentially. This scheme is simple to

    implement, as it does not require complex chunk/peer selection

    mechanisms such as those needed in BitTorrent-like chunk

    swarming schemes. More importantly, the sequential delivery

    scheme is analytically tractable and provides an upper bound

    to the server bandwidth requested by non-sequential schemes.

    Let Sd(t) be the aggregate bandwidth requested by the

    downloading users at time t, and Sseed(t) =∑Nseed(t)

    i=1 Ui bethe aggregate upload rate offered by the seeds at time t. Thenthe bandwidth requested from the servers at time t is given by

    S(t) = max{0, Sd(t) − Sseed(t)}. (4)

    Focusing on Sd(t), we first condition this quantity on thenumber of downloading users k, defining

    Sd(t, k) , (Sd(t) | Nd(t) = k)

  • 4

    After characterizing Sd(t, k), the evaluation of S(t) is easy,since the distribution of Nd(t) is known (a Poisson distributionof mean N(t)), while Sseed(t) is a compound Poisson randomvariable which does not depend on k [12].

    To evaluate Sd(t, k) under sequential download, we startobserving that, if all peers download the video sequentially at

    common rate d, a peer can only redistribute video pieces topeers arrived later on in time.

    Proposition 1: Quantity Sd(t, k) satisfies the following re-cursive equation:

    Sd(t, k) =

    {

    d k = 1d + max{0, Sd(t, k − 1) − Uk} k > 1

    (5)

    The proof of Proposition 1 is reported in Appendix A of [8].

    Expression (5) provides the key to the analytical approxima-

    tion developed in the next section.

    Alternate formulations of quantity Sd(t, k) exist (see [3],[11], [7]). Here, we just mention that in [7] we find a

    connection between the stochastic process described by (5)

    and a random walk with increments d − U , which allows toobtain analytical upper bounds to the server bandwidth and

    to characterize its asymptotic behavior for large number of

    users.

    3.2 Gaussian approximation

    In the sequential delivery case, we can characterize the

    distribution of the server bandwidth using a second-order

    approximation [12], [13].

    The idea is to approximate the distribution of quantity

    Sd(t, k − 1) − Uk in (5) (for each k ≥ 2) by a normaldistribution matching the first two moments of this quantity.

    We can then apply standard formulas of the truncated normal

    distribution to derive the first two moments of Sd(t, k) as afunction of the first two moments of Sd(t, k−1). This providesa recursive technique to compute the first two moments of

    Sd(t, k) for any k, starting from the exact values knownfor k = 1. Our gaussian approximation is motivated by thefact that significant excursions of Sd(t) (i.e., away from thelower limit d) result from the accumulation of several randomcontributions d − U , thus the central limit theorem can beinvoked to justify the convergence to a normal distribution.

    A similar approximation is subsequently applied to take into

    account the effect of the seeds, whose aggregate contribution

    Sseed(t) can be well described by a gaussian distributionfor sufficiently large number of seeds. Notice that an exact

    evaluation of the distributions (or just the first few moments)

    of S(t) and Sd(t, k) is difficult, due to the presence of barriersat zero and d, respectively.

    More in detail, let us start introducing some notation and

    standard results related to the normal distribution. Let N (w)be the probability density function of the standard normal

    distribution (having mean 0 and variance 1), and Q(w) itscomplementary cumulative distribution function.

    Let y be a random variable distributed according to a normaldistribution N(µ, σ) of mean µ and standard deviation σ. Thenit can be proved that the first moment of the random variable

    y′ = max{0, y} has the following expression:

    E[y′] = σN(

    −µ

    σ

    )

    + µQ(

    −µ

    σ

    )

    (6)

    while the second moment is given by

    E[y′2] = σµN(

    −µ

    σ

    )

    + (σ2 + µ2)Q(

    −µ

    σ

    )

    . (7)

    Let Sd(t, k) and σ2Sd

    (t, k) be the mean and variance ofSd(t, k). Our recursive procedure to approximately computeSd(t, k) and σ

    2Sd

    (t, k) for all k starts from the initial known

    values Sd(t, 1) = d and σ2Sd

    (t, 1) = 0 (see (5)). For agiven k ≥ 2 we approximate Sd(t, k − 1) − Uk by a nor-mal random variable y of mean µ = Sd(t, k − 1) − U andvariance σ2 = σ2Sd(t, k − 1) + σ

    2U . Defining the random vari-

    able y′ , max{0, y} ≃ max{0, Sd(t, k − 1) − U}, from (5)we obtain:

    Sd(t, k) ≃ d + E[y′] (8)

    σ2Sd(t, k) ≃ E[y′2] − E[y′]2. (9)

    Applying (6) and (7) we can compute the first and second

    moment of variable y′ in (8,9). This provides the recursion tocompute Sd(t, k) and σ

    2Sd

    (t, k) for all k.To account for the effect of the seeds (if any), we apply

    once more the normal approximation, as follows. Let S(t, k) =max(0, Sd(t, k)−Nseed(t)) be the server bandwidth necessaryat time t, assuming that there are k downloading users andNseed(t) seeds. Moreover, let S(t, k) and σ

    2S(t, k) be the mean

    and variance of S(t, k).We observe that Sseed(t) is a compound Poisson random

    variable, whose moments can be computed exactly in close-

    form. In particular, the mean of Sseed(t) is equal to N seed(t)U ,

    whereas its variance is equal to N seed(t)(σ2U + U

    2). We

    approximate Sd(t, k) − Sseed(t) by a normal distribution y ofmean µ = S(t, k) − N seed(t)U and variance σ

    2 = σ2S(t, k) +

    N seed(t)(σ2U + U

    2), and apply again (6) and (7) to compute

    the first and second moment of y′ = max{0, y} ≃ S(t, k).Finally, the mean server bandwidth S(t) (and similarly its

    variance) can be obtained deconditioning with respect to k:

    S(t) =∑

    k≥1

    S(t, k)P(Nd(t) = k) (10)

    We observe that each step of the iterative procedure requires

    a constant number of operations. Hence the computational

    complexity of the model solution is linear in the number

    of steps (k), which equals the number of downloading usersin the systems. This number is theoretically unbounded (it

    has a Poisson distribution), however we can reasonably limit

    the iteration to a maximum number of steps kmax such thatP(Nd(t) > kmax) is negligible (i.e., smaller than a given smallconstant ǫ; in our results we set ǫ = 10−6). The computationalcomplexity is then Θ(kmax).

    3.3 Extension to non-sequential delivery

    So far we have restricted the analysis to the case in which users

    receive the video chunks sequentially. Although conceptually

    simple, this scheme is clearly sub-optimal when users down-

    load data at a rate larger than the playback rate: recall that in

  • 5

    this case users can download in advance video chunks needed

    in the future, and this prefetching does not necessarily have

    to be done sequentially. Actually, by allowing out-of-sequence

    delivery the system can better exploit the upload bandwidth

    of the peers. Chunk-based, swarming approaches like those

    commonly used in P2P bulk transfers (e.g., BitTorrent) can

    be applied to VoD systems with the additional constraint

    that individual chunks must be downloaded before specific

    deadlines to avoid interrupting the video playback.

    A common approach to combine the efficiency of P2P

    swarming with the strict delay constraints of VoD is to allow

    users to receive also out-of-sequence chunks of the video

    within a limited “sliding window” of data starting from the

    point currently played [5].

    For simplicity, instead of considering an actual sliding

    window, we divide the video into a fixed number W ofnon-overlapping segments of size LW , L/W . We denoteby TW , LW /d = τd/W the time needed to downloada segment. Users who are concurrently downloading the

    same segment, besides helping users downloading previous

    segments can help each other in a swarming fashion, i.e., we

    assume that, within a segment, we can exploit also chunk-

    based, out-of-sequence distribution. This model is able to

    capture the behavior of realistic sliding window applications,

    while keeping the analysis simple.

    Below we show how the analysis developed in Section 3.1

    can be adapted to study this scheme as well, permitting us

    to assess the performance gain achievable by non-sequential

    schemes.

    Indeed, by aggregating all users belonging to the same

    segment into a sort of ‘meta-peer’ we can essentially apply

    the same analysis as before to a chain of W meta-peersdownloading the video segments sequentially. Let Nv(t),1 ≤ v ≤ W , be the random variables representing the numberof peers concurrently downloading segment v at time t. Noticethat this number is Poisson-distributed [12] with mean

    Nv(t) =

    ∫ TW

    0

    λ(t − (v − 1)TW − x)GT (x + (v − 1)TW ) dx

    Let Sd(t, v) be the bandwidth that the servers (or the seeds)must supply at time t to all users downloading segments ofindex smaller than or equal to v. Using the same reasoningas in the proof of Proposition 1, quantity Sd(t, v) can becomputed through the following recursive equation,

    Sd(t, v) = max{

    0, Sd(t, v − 1) −

    Nv(t)∑

    i=1

    Ui +

    Nv(t)−1∑

    i=1

    d}

    + d · INv(t)>0 1 ≤ v ≤ W (11)

    with Sd(t, 0) = 0 and the convention that summations areequal to zero if Nv(t) = 0. Notice that (11) is analogous to(5), if we consider all peers within a segment v (if any) as asingle meta-peer having virtual upload bandwidth Ũ(t, v) =

    d +∑Nv(t)

    i=1 (Ui − d).

    The first two moments of Sd(t) = Sd(t,W ) can be com-puted using a second-order approximation similar to the one

    adopted for the case of sequential delivery in Section 3.2, i.e.,

    by assuming that quantity Sd(t, v − 1) − Ũ (whose momentscan be computed exactly) has a normal distribution, and then

    using (6) and (7) to compute the first and second moments of

    the positive part of it [12], [13].The analysis is made slightly

    more complicated by the fact that we need to consider also

    the case in which there are no users downloading a segment

    (Nv(t) = 0), which requires some care. Details are reportedin Appendix B of [8].

    At last, the impact of seeds is taken into account in a way

    analogous to the sequential case. Note that the computational

    procedure for the non-sequential case is again linear in the

    number of steps, hence it is Θ(W ).The extreme case W = 1 of just one segment (i.e., the

    entire video) corresponds to a scheme in which chunks can

    be downloaded in any order, and the system can exploit the

    upload bandwidth of any peer irrespective of its arrival time.

    In this case we have,

    Sd(t, 1) = max{

    d,

    Nd(t)∑

    i=1

    (d − Ui)}

    (12)

    which plugged into (4) (in the place of Sd(t)) provides alower bound to the server bandwidth required by any chunk

    distribution scheme. In the following, we will refer to the

    server bandwidth obtained in this way as the completely non-

    sequential case, or simply the lower bound.

    In Sect 7 of [8], it is shown how our model can be extended

    to analyze systems providing limited VCR functionalities as

    well as, systems in which peers have limited storage capabil-

    ities.

    4 PERFORMANCE UNDER STATIONARY CONDI-TIONS

    In this section we report a selection of the most interesting

    results that we have obtained by our analysis under stationary

    user arrival process. Since in this case all averages do not

    depend on t, we will omit for simplicity the indication oftime. We normalize to 1 the video playback rate dv , whichthus serves as unit for all other bandwidth figures. We assume

    that users stay in the system for a time at least equal to the

    watching time, hence T ≥ Tv . Unless otherwise specified,we assume that users’ upload bandwidth U is exponentiallydistributed.

    The results obtained by our analytical approximation in

    Section 3.2 are compared against those obtained by an event-

    driven simulator derived from P2PTVsim2. In particular, we

    adapted P2PTVsim, originally developed for live-streaming,

    to represent VoD systems. Our simulator permits considering

    a general mesh-based pull system: the content is segmented

    into small fixed-size chunks, corresponding to 100 ms of

    video; peers independently retrieve individual chunks from

    other peers and/or the servers adopting a simple trading mech-

    anism that involves a periodic exchange of signaling messages

    containing buffer maps between pairs of peers. To guarantee an

    almost in-order delivery of chunks (as in the case of VoD) we

    2. P2PTVsim is available at http://www.napa-wine.eu/cgi-bin/twiki/view/Public/P2PTVSim

  • 6

    implemented a sliding window mechanism: each peer playing

    chunk c is allowed to retrieve only the chunks [c, c+H] whereH is the window parameter. A sequential delivery scheme isthen approximately achieved by choosing a small value of

    H (compared to the entire video). In our simulations (if nototherwise specified) we have set H = 40, such that a windowcorresponds to only 4 seconds of video. Chunks are, in the

    first instance, retrieved from other peers; a peer is allowed to

    retrieve directly from servers the immediately following chunk

    to play (i.e., chunk c+1 can be retrieved from the servers onlywhile playing chunk c). Servers are assumed to have unlimitedbandwidth.

    We recognize that our simulator may appear rather simpli-

    fied with respect to the behavior of real systems, as it shares

    most of the system assumptions of the model (see Section

    11 in [8]).We use it primarily to validate the accuracy of

    the Gaussian approximation introduced in Section 3.2, that

    enables us to assess the system performance at very low

    computational cost (i.e., with negligible effort as compared

    to detailed simulations or measurement campaigns).

    We emphasize that the ultimate goal of our model and

    analysis is not to produce accurate quantitative results about

    the performance of a realistic system (that necessarily depends

    on the specific system implementation), but to allow for an effi-

    cient qualitative prediction of the impact of various parameters

    and design choices on the resulting system performance.

    0.1

    1

    10

    100

    1 10 100 1000

    Av

    erag

    e se

    rver

    ban

    dw

    idth

    Average number of users

    approx - U = 0.9sim - U = 0.9

    sim lower bound - U = 0.9approx - U = 1.0

    sim - U = 1.0sim lower bound - U = 1.0

    approx - U = 1.2sim - U = 1.2

    sim lower bound - U = 1.2

    Fig. 1: Comparison of average server bandwidth in the case

    d = dv , as function of the average number of users N , fordifferent values of U , in the absence of seeds.

    0.1

    1

    10

    100

    1 10 100 1000

    Av

    erag

    e se

    rver

    ban

    dw

    idth

    Average number of downloading users

    approx - U = 0.9sim - U = 0.9

    sim lower bound - U = 0.9approx - U = 1.0

    sim - U = 1.0sim lower bound - U = 1.0

    approx - U = 1.2sim - U = 1.2

    sim lower bound - U = 1.2

    Fig. 2: Comparison of average server bandwidth in the case

    d = dv , as function of the number of downloading users Nd,for different values of U , in the presence of N seed = 0.1 ·Ndseeds.

    4.1 Impact of the number of watching users and

    seeds

    Figure 1 reports, on a log-log scale, the average server

    bandwidth S as function of the average number of users N , inthe case d = dv , T = Tv . We consider three different valuesof average upload bandwidth U = 0.9, 1.0, 1.2, correspondingto systems operating in deficit, balanced, and surplus mode,

    respectively (here γ = 1/U ).Besides noticing the accuracy of the approximate analysis, it

    is interesting to see that the average server bandwidth saturates

    for U = 1.2 (surplus mode) to a value about 3.5 timeslarger than the corresponding lower bound, which tends to

    dv = 1. As expected, the average server bandwidth divergesunder the deficit and balanced modes. Moreover, in the deficit

    mode, the sequential system requires asymptotically the same

    bandwidth as the completely non-sequential system (i.e., the

    lower bound).

    In Figure 2 we compare the results obtained in the same

    system considered above, but assuming that users remain

    active after the end of the watching time for an exponen-

    tially distributed amount of time of mean equal to 10% of

    the watching time, generating an average number of seeds

    N seed = 0.1 · Nd. Now the systems with U = 1.0 andU = 1.2 operate in surplus mode, whereas the system withU = 0.9 operates very close to the balanced mode (hereγ = 1/(1.1 · U)). We observe that, in the presence of seeds,the average server bandwidth requested by systems operating

    in surplus mode reaches a maximum, after which it goes

    to zero as the number of users increases. Results such as

    those reported in Figures 1 and 2 can be useful in system

    dimensioning, as they allow to estimate, in the surplus mode,

    the worst-case server bandwidth which is needed when the

    number of downloading users Nd is not known.

    4.2 Impact of the target download rate

    Even when users tend to leave the system at the end of the

    watching time, it is still possible to benefit from the positive

    effect created by the seeds, who absorb part of the fluctuations

    in the bandwidth requested by downloading peers, shielding

    the servers. The trick to ‘artificially’ create some seeds is to

    make the users download the video at rate d > dv , so that theybecome seeds for other peers before the end of the watching

    time. Intuitively, however, d should not be set too large tooffset the gain achievable by the seeds. Figure 3 illustrates the

    performance of this strategy in the case of N = 100 users,T = Tv , showing the average server bandwidth as functionof d. We observe that for all the considered values of U , theaverage server bandwidth achieves a minimum for a value of

    d slightly larger than dv . The impact is particularly striking inthe surplus mode (U > 1), in which setting d > dv brings theserver bandwidth close to zero.

    4.3 Impact of non-sequential delivery

    To understand the performance gains achievable by adding

    (in a limited way) BitTorrent-like chunk delivery schemes to

    the video distribution, we consider two scenarios operating in

  • 7

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    11

    12

    13

    14

    15

    1 1.2 1.4 1.6 1.8 2 2.2

    download rate, d

    simapprox

    sim lower bound 0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    11

    12

    13

    14

    15

    1 1.2 1.4 1.6 1.8 2 2.2

    download rate, d

    simapprox

    sim lower bound 0

    0.25

    0.5

    0.75

    1

    1.25

    1.5

    1.75

    2

    2.25

    2.5

    2.75

    3

    3.25

    3.5

    1 1.2 1.4 1.6 1.8 2 2.2

    download rate, d

    simapprox

    sim lower bound

    0

    0.25

    0.5

    0.75

    1

    1.25

    1.5

    1.75

    2

    2.25

    2.5

    2.75

    3

    3.25

    3.5

    1 1.2 1.4 1.6 1.8 2 2.2

    download rate, d

    simapprox

    sim lower bound

    Aver

    age

    serv

    erb

    andw

    idth

    Ū = 1 Ū = 1.2 Ū = 1.4Ū = 0.9

    Fig. 3: Average server bandwidth as function of the target download rate d, for different values of U , with N = 100, T = Tv .

    0

    1

    2

    3

    4

    5

    6

    0 10 20 30 40 50 60 70 80 90 100

    Aver

    age

    serv

    er b

    andw

    idth

    Average number of users

    sim - sequentialsim - W = 32sim - W = 16sim - W = 8sim - W = 4sim - W = 2

    sim - lower bound 0

    1

    2

    3

    4

    5

    6

    0 10 20 30 40 50 60 70 80 90 100

    Average number of users

    approx - sequentialapprox - W = 32approx - W = 16approx - W = 8approx - W = 4approx - W = 2approx - W = 1

    Fig. 4: Average server bandwidth as function of the number

    of users, with T = Tv , d = dv , γ = 1, for different values ofthe number of segments W . Comparison between simulation(left plot) and approximate analysis (right plot).

    balanced mode (γ = 1), in the case of uniformly distributeduser upload bandwidth. In the first one (see Figure 4) we vary

    the number of users, assuming T = Tv , d = dv . Recall thatthe case W = 1 in our analysis provides an approximation ofthe lower bound.

    As we increase the number of segments, we obtain in-

    termediate curves between the completely non-sequential

    scheme and the pure sequential scheme, which corresponds to

    W → ∞. Besides noticing the accuracy of the approximation,we observe that swarming schemes provide non-negligible

    gains over pure sequential schemes only when the average

    number of users per segment is not too small (say larger than

    a few units). Recall, however, that larger segments (and thus

    larger number of users in them) imply larger startup delays.

    For example, the case W = 32 corresponds to a slidingwindow of about 4 minutes for a typical movie (2 hours long).

    As we can see on Figure 4, the benefit of a non-sequential

    scheme with W = 32 is negligible for the considered valuesof the number of users (below one hundred).

    In the second scenario (see Figure 5) we fix the total number

    of users N = 100, and increase the total activity time ofthe users T (the average upload bandwidth U is reducedaccordingly to keep γ = 1). We observe that as we increasethe activity time (and thus the number of seeds), the difference

    in performance between swarming schemes and sequential

    delivery becomes less significant.

    2

    2.5

    3

    3.5

    4

    4.5

    5

    5.5

    6

    1 1.1 1.2 1.3 1.4 1.5

    Av

    erag

    e se

    rver

    ban

    dw

    idth

    Average activity time / video duration

    sim - sequentialsim - W = 64sim - W = 32sim - W = 16

    sim - W = 8sim - W = 4sim - W = 2

    sim - lower bound

    2

    2.5

    3

    3.5

    4

    4.5

    5

    5.5

    6

    1 1.1 1.2 1.3 1.4 1.5

    Average activity time / video duration

    approx - sequentialapprox - W = 64approx - W = 32approx - W = 16

    approx - W = 8approx - W = 4approx - W = 2approx - W = 1

    Fig. 5: Average server bandwidth as function of the ratio T/Tvbetween average activity time and video duration, keeping

    fixed N = 100, d = dv , γ = 1, for different values of thenumber of segments W . Comparison between simulation (leftplot) and approximate analysis (right plot).

    5 NON-STATIONARY SYSTEMS

    In this section we show how our analytical framework can be

    applied to study the performance of time-varying systems in

    which the arrival rate of requests for a given content changes

    significantly over time. In particular, we will see that the

    behavior of a non-stationary system can dramatically differ

    from the one of a stationary system in which the arrival rate of

    requests is constant, due to a misalignment problem between

    the temporal evolution of the number of downloaders and the

    temporal evolution of the number of seeds.

    5.1 The effect of daily traffic variations

    We consider a reference scenario in which the arrival rate of

    requests for a given video follows a daily pattern which is

    modeled for simplicity by a sine function of period equal to

    24 hours, between a minimum of λ = 0.1 and a maximum ofλ = 1, represented by the thick solid line in the top plot ofFig. 6.

    We first analyze a software-based system in which users

    contribute their upload bandwidth during the watching time

    of the video, plus a random additional time in which the

    application is kept running. We assume that the video duration

    is Tv = 2 h, and the additional activity time after the end of thevideo is exponentially distributed with mean 1 h. We normalize

  • 8

    dv = d = 1 and assume the upload bandwidth of users to beexponentially distributed with mean U = 0.7. The per-userload is γp = 2/(0.7 · 3) ≈ 0.95. The top plot of Fig. 6 reportsthe temporal evolution of both the number of downloaders and

    the number of seeds. Since peers become seeds only after the

    end of the watching time, the dynamics of downloaders and

    seeds are misaligned, with a temporal shift about Tv = 2 h.

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    0.8

    0.9

    1

    12 18 24 6 12 18 24 6 0

    20

    40

    60

    80

    100

    120

    140

    vid

    eo r

    equ

    est

    rate

    , λ

    (t)

    nu

    mb

    er o

    f d

    ow

    nlo

    ader

    s /

    seed

    sλdownloaders

    seeds

    0

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    12 18 24 6 12 18 24 6

    aver

    age

    serv

    er b

    and

    wid

    th

    time of day (hours)

    non-stationarystationary (λ(t) = 1)

    Fig. 6: Temporal evolution of video request rate (top plot,

    left y axes), number of downloaders/seeds (top plot, right yaxes), and average server bandwidth S(t) (bottom plot), in thesoftware-based system.

    The effect of this misalignment on the required average

    server bandwidth is depicted on the bottom plot of Fig. 6,

    by the solid line labeled ‘non-stationary’, which exhibits a

    peak preceding the point at which the video request rate is

    maximum. Fig. 6 reports also a curve labeled ‘stationary’,

    representing the server bandwidth that would be necessary if

    the content request rate were constant and equal to λ = 1 (themaximum request rate). We observe that the performance of

    the non-stationary system is worse than that of the stationary

    system, both in terms of peak server bandwidth and average

    server load. This occurs even if the content request rate is

    always larger in the stationary system.

    If we increase the activity time after the end of the video,

    while keeping the same per-user system load γp (either byreducing the upload bandwidth of the users, or equivalently

    by increasing the download rate of the video, i.e., its resolu-

    tion/quality), the negative effect of the misalignment problem

    become worse. As an extreme case, we consider a P2P-VoD

    system relying on set-top-boxes which are always active and

    serving the last watched video. To mimic the behavior of set-

    top-boxes with our model of peer dynamics, we assume that

    the activity time after watching a movie is much longer than

    before (in the order of a day), representing a set-top-box which

    remains always on before the user downloads the next video.

    In particular, we consider an additional activity time of 22 h,

    which added to the watching time of a movie leads to T = 1day. To obtain the same per-user load of the software-based

    system, we set U = 0.7 · 3/24.Fig. 7 reports analytical results for this scenario, analogous

    to those in Figure 6. We have also reported on the bottom plot

    of Fig. 7 a sample path obtained from simulation, to confirm

    the analytical prediction. In this case the instantaneous system

    load γ(t) is severely unbalanced across the day. During peak

    0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    0.8

    0.9

    1

    12 18 24 6 12 18 24 6 0

    200

    400

    600

    800

    1000

    1200

    vid

    eo r

    equ

    est

    rate

    , λ

    (t)

    nu

    mb

    er o

    f d

    ow

    nlo

    ader

    s /

    seed

    sλdownloaders

    seeds

    0

    10

    20

    30

    40

    50

    60

    70

    12 18 24 6 12 18 24 6

    aver

    age

    serv

    er b

    andw

    idth

    time of day (hours)

    non-stationarystationary (λ = 1)

    sim trace

    Fig. 7: Temporal evolution of video request rate (top plot, left

    y axes) number of downloaders/seeds (top plot, right y axes),and average server bandwidth S(t) (bottom plot), in the set-top-box system.

    hours, the bandwidth requested at servers grows very large,

    while for the rest of the day it is negligible. The problem is

    that the upload capacity of the seeds, which are very numerous

    and almost stable along the day (see top plot of Figure 7) is

    totally wasted for a large fraction of the day. Notice that we

    are not saying that set-top-boxes are not useful: increasing the

    activity time of peers (up to the point of having always-on user

    devices) is very beneficial to the system performance, since

    the per-user load γp is reduced. However, one must be carefulthat the instantaneous system load γ(t) can vary significantlyaround γp, and peak traffic demand cannot be absorbed wellby large populations of seeds (set-top-boxes) each devoting a

    small amount of upload bandwidth to the video distribution.

    Indeed, to minimize the bandwidth deficit at peak times, the

    ratio U/d should not become too small.

    5.2 The effect of newly introduced videos

    We now consider a system in which the non-stationarity of

    the video request process is due to new contents regularly

    introduced in the catalogue, whose popularity changes over

    time. We model such a system using a shot-noise process [14].

    We assume that new videos are made available in the system at

    (constant) rate β. The request rate of a given video i insertedat time ti follows an non-homogeneous Poisson process withtime-varying intensity

    λi(t) = Λφ(t − ti)

    where φ(t) is a shaping function, defined over t ≥ 0,modeling how the popularity of the video evolves over time.

    We require φ(t) to be an integrable function, and, withoutloss of generality, we assume that

    ∫ ∞

    0φ(t) dt = 1. By so

    doing, Λ equals the average number of times that the videois requested during its lifetime. In general, both Λ and φ(t)could be random, i.e., each file, upon arrival, could be assigned

    a popularity shape φ(t) extracted from a family of functionswith randomized parameters, and a value of Λ taken from agiven distribution possibly associated to the chosen shape φ(t).We denote by FΛ,φ the joint cdf of Λ and φ.

    Our target is to evaluate the average overall server band-

    width Z resulting from the distribution of all videos available

  • 9

    in the catalog. Despite the complexity of the system, our

    approximate model can be exploited to quickly estimate Zunder several parameter settings, without running expensive

    simulations. We briefly summarize the computational proce-

    dure for clarity: using (1) and (2) we can analytically compute,

    for any t > ti, the average number of peers downloadinga given video i, and the average number of users actingas seeds for it. Equation (10) provides the mean bandwidth

    S(t, Λ, φ) requested from the servers at any time t > ti.Standard numerical integration techniques3 allow to compute

    the average amount of data D(Λ, φ) =∫ ∞

    tiS(t − ti,Λ, φ) dt

    supplied by the servers for a file characterized by popularity

    parameters {Λ, φ}, from which we can compute

    Z = β

    Λ,φ

    D(Λ, φ) dFΛ,φ (13)

    Notice that the average number of users in the system is βΛT .We start considering a scenario in which both Λ and φ

    are deterministic. In particular, we consider files with expo-

    nentially decreasing popularity: φ(t) = µe−µt. Since Z istrivially linear in the arrival rate of new contents (13), we

    arbitrarily set β = 1. Throughout our experiments we also fixthe per-user system load to γp = 0.5, as the effects that we aregoing to show occur also when users can upload a much larger

    amount of data than what they request during their activity

    period. We consider the pure sequential delivery scheme, and

    normalize dv = d = 1. We further normalize T d = 1,and assume, whenever T > T d, that the additional activitytime after downloading the movie is exponentially distributed.

    Under the above settings the average upload bandwidth of

    the users, which is assumed to be exponentially distributed, is

    directly related to T according to U = 2/T . At last, we defineχ = 1/(µT d), which can be thought as the average lifetimeof a video (in terms of popularity) normalized by its watching

    duration. We now investigate the joint impact of the only free

    parameters: Λ, χ, and T .

    10

    100

    1000

    10000

    1 1.5 2 2.5 3 0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    2

    Av

    erag

    e up

    load

    ban

    dw

    idth

    , U

    Normalized activity period, T/Td

    Λ = 1000000Λ = 100000

    Λ = 10000Λ = 1000

    U

    1

    10

    100

    1000

    10000

    1 1.5 2 2.5 3 0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    2

    Mea

    n s

    erv

    er b

    and

    wid

    th,

    Z

    Normalized activity period, T/Td

    Λ = 10000Λ = 1000Λ = 100Λ = 10

    U

    Fig. 8: Average server bandwidth Z as function of normalizeduser activity period T/T d, for χ = 1 (left plot) and χ = 100(right plot), and different values of Λ. The associated averageupload bandwidth U = 2/T is reported on the right y axes.

    Fig. 8 reports, for different Λ’s, the average server band-width Z as function of user activity period T , for χ = 1(left plot) and χ = 100 (right plot). We observe that theserver bandwidth Z remains low and mainly independent of

    3. We adopted the simple trapezoid rule.

    Λ (i.e., the system can scale up to arbitrarily large numberof users), for values of T slightly larger than 1 and smallerthan 2. Notice that when T < 2 the average user uploadbandwidth (right y axes) is larger than the video download

    rate (U > 1). When this condition is not met (in general,when U ≤ dv), the system does not sustain itself, despitethe fact that users can potentially upload twice the amount

    of traffic they download (γp = 0.5). This is again due to themisalignment problem between downloaders and seeds: when

    T increases, more seeds becomes available, but too late withrespect to the time their upload capacity can be exploited. We

    emphasize that this is in sharp contrast to what we have seen

    under stationary conditions, where the effect of decreasing Ucan be completely compensated by increasing T , so that thesystem performance is not compromised (see Figure 5). We

    further notice that this asymptotic behavior (for increasing Λ)occurs for any value of χ (i.e., popularity shape), and thatlarger values of χ (i.e., files whose popularity decays moreslowly) lead to larger values of Z in the regime where thesystem is able to scale.

    1

    10

    100

    1000

    0.1 1 10 100 1000 10000

    Mea

    n s

    erv

    er b

    and

    wid

    th,

    Z

    Normalized video lifetime, χ

    T = 1T = 1.3T = 1.7

    T = 2T = 3T = 6

    T = 20

    20

    63

    1.7

    1.3

    1

    2

    Fig. 9: Average server bandwidth Z as function of the nor-malized video lifetime χ, for different values of user activityperiod T , in the case of Λ = 1000.

    The impact of the video popularity shape is better un-

    derstood looking at Fig. 9, which reports, in the case of

    fixed Λ = 1000, the server bandwidth Z as function of thenormalized video lifetime χ, for different values of T . First,note that Z is bounded above by βΛT ddv , since in the worstcase the system provides rate dv to each downloading user. Inour parameters setting, the above upper bound on Z coincidesnumerically with Λ = 1000. We observe that Z approachesthe upper bound for large values of χ (and any T ), for whichrequests for the same file are so diluted over time that it

    becomes more and more unlikely to have peers concurrently

    downloading the same file (and thus helping other peers).

    When T increases, the upper bound is approached also forsmall values of χ, this time because downloads of the same fileare so synchronized that the upload bandwidth of users (which

    decreases with T ) can be exploited only during a short intervalequal to T d after the file is inserted into the catalog, andthe resulting contribution of peers tends to become negligible.

    Small values of χ can be the result of videos posted on webpages providing suggestions to the users which are frequently

    updated: our results suggest that this practice can be harmful

    as it can compromise an effective peer-assisted distribution

  • 10

    (when U ≤ dv). For T > 2, Z achieves a minimum for agiven value of χ. We observe that, for large values of χ (fileswith slowly decaying popularity), long user activity times are

    actually beneficial to the system, although in this regime the

    system is not able to scale to large number of users, as we

    have seen.

    30

    40

    50

    60

    70

    80

    90

    100

    120

    150

    200

    1000 10000 100000

    Mea

    n s

    erver

    ban

    dw

    idth

    , Z

    Λ

    α = 2α = 1

    α = 0.5α = 0

    0

    5

    10

    15

    20

    25

    30

    1000 10000 100000

    Λ

    α = 2α = 1

    α = 0.5α = 0

    10

    100

    1000

    1000 10000 100000

    Λ

    α = 2α = 1

    α = 0.5α = 0

    T = 1 T = 3T = 1.5

    Fig. 10: Average server bandwidth Z as function of Λ, fordifferent values of the Zipf’s exponent α, and fixed χ = 10.The user activity period is either T = 1 (left plot), or T = 1.5(middle plot) or T = 3 (right plot).

    At last, we consider a scenario in which Λ is a randomvariable, while maintaining the assumption that the popularity

    decay exponent µ is the same for all files. More specifically,we assume that files belong to 10 different classes, whose

    request rates follows a Zipf’s distribution of exponent α, i.e.,files of class i (1 ≤ i ≤ 10) are requested at rate Λi = ΛK/i

    α,

    where K is a normalizing constant such that∑10

    i=1 Λi = Λ.Note that α = 0 corresponds to the previous scenario in whichall files have the same popularity profile. Results are shown in

    Fig. 10, in which we report the mean server bandwidth Z asfunction of Λ, for different values of the Zipf’s exponent α,and fixed χ = 10. In the left plot we consider T = 1, i.e., usersabandon the system immediately after downloading the video,

    i.e., no seeds are available. We notice that the server bandwidth

    Z scales logarithmically with Λ, which can be explained by thelower limit d in (5). In the middle plot we consider T = 1.5,which belongs to the range in which the system performance

    is mainly insensitive to Λ (see Fig. 8), and thus also to theZipf’s exponent α. In the right plot, we consider T = 3, forwhich the system is not able to scale to large number of users.

    Actually, in this case Z scales linearly with Λ.In conclusion we can say that in non-stationary scenarios

    the system performance critically depends on the relationship

    between the average peer upload bandwidth and the download

    rate: when U ≤ dv the bandwidth deficit cannot be effectivelycompensated by just increasing the seed availability (i.e, by

    increasing T ). Smart prefetching policies can in principle re-duce the burden on the servers. However, prefetching policies

    can not be easily implemented in non-stationary (e.g., flash-

    crowd) scenarios where contents are not available in advance,

    and their popularity cannot be easily predicted at the early

    stages.

    6 CONCLUSIONS

    We have proposed a computationally-efficient methodology to

    analytically estimate the server bandwidth requested in non-

    stationary peer-assisted VoD systems. Our approach is highly

    flexible, and can account for several important effects such as

    peer upload bandwidth heterogeneity, churning, non-sequential

    chunk delivery schemes.

    We have shown that our analysis provides efficient, accurate

    predictions of all observed phenomena, providing a useful tool

    for the design of peer-assisted VoD systems.

    REFERENCES[1] D. Ciullo, V. Martina, M. Garetto, E. Leonardi, and G. L. Torrisi,

    “Performance Analysis of Non-stationary Peer-assisted VoD Systems,”in INFOCOM Mini-Conference, 2012.

    [2] “Cisco Visual Networking Index: Forecast and Methodology, 2012–2017,” white paper published on Cisco website, 2012.

    [3] C. Huang, J. Li, and K. W. Ross, “Can Internet Video-on-Demand BeProfitable?” in ACM SIGCOMM, 2007.

    [4] Y. Huang, T. Z. J. Fu, D. ming Chiu, J. C. S. Lui, and C. Huang,“Challenges, Design and Analysis of a Large-scale P2P VoD System,”in ACM SIGCOMM, 2008.

    [5] “PPLive, http://www.gridcast.cn/. GridCast, http://www.gridcast.cn/. PP-Stream, http://www.ppstream.com/. TVU, http://www.tvunetworks.com/.SopCast, http://www.sopcast.com/. Kankan, http://www.kankan.com/.”

    [6] M. Cha, P. Rodriguez, S. Moon, and J. Crowcroft, “On next-generationtelco-managed P2P TV architectures,” in IPTPS, 2008.

    [7] D. Ciullo, V. Martina, M. Garetto, E. Leonardi, and G. L. Torrisi,“Stochastic Analysis of Self-Sustainability in Peer-Assisted VoD Sys-tems,” in INFOCOM, 2012.

    [8] ——, “Peer-assisted VoD Systems: an Efficient Modeling Framework,”in Supplemental material, 2013.

    [9] H. Abrahamsson and M. Nordmark, “Program popularity and viewerbehaviour in a large tv-on-demand system,” in ACM IMC, 2012.

    [10] M. Ahmed, S. Traverso, M. Garetto, P. Giaccone, E. Leonardi, andS. Niccolini, “Temporal locality in today’s content caching: Why itmatters and how to model it,” ACM Computer Communication Review,vol. 43, no. 5, 2013.

    [11] W. Wu and J. Lui, “Exploring the Optimal Replication Strategy in P2P-VoD Systems: Characterization and Evaluation,” in INFOCOM, 2011.

    [12] S. M. Ross, “Stochastic processes. 1996,” 2001.[13] W. Whitt, “Approximations for the gi/g/m queue,” Production and

    Operations Management, vol. 2, no. 2, pp. 114–161, 1993.[14] D. Daley and D. Vere-Jones, An introduction to the theory of point

    processes, Springer-Verlag, Ed., 1998.[15] R. Kumar, Y. Liu, and K. Ross, “Stochastic Fluid Theory for P2P

    Streaming Systems,” in INFOCOM, 2007.[16] D. Wu, Y. Liu, and K. W. Ross, “Queuing Network Models for Multi-

    Channel P2P Live Streaming Systems,” in INFOCOM, 2009.[17] X. Yang and G. de Veciana, “Service Capacity of Peer to Peer Net-

    works,” in INFOCOM, 2004.[18] D. Qiu and R. Srikant, “Modeling and Performance Analysis of

    BitTorrent-Like Peer-to-Peer Networks,” in ACM SIGCOMM, 2004.[19] N. Parvez, C. Williamson, A. Mahanti, and N. Carlsson, “Analysis of

    BitTorrent-like Protocols for On-Demand Stored Media Streaming,” inACM SIGMETRICS, 2008.

    [20] B. Fan, D. Andersen, M. Kaminsky, and K. Papagiannaki, “BalancingThroughput, Robustness, and In-Order Delivery in P2P VoD,” in ACMCoNEXT, 2010.

    [21] B. Tan and L. Massoulie, “Optimal Content Placement for Peer-to-PeerVideo-on-Demand Systems,” in INFOCOM, 2011.

    [22] W. Wu, J. Lui, and R. Ma, “On incentivizing upload capacity in P2P-VoD systems: Design, analysis and evaluation,” Computer Networks,vol. 57, no. 7, pp. 1674 – 1688, 2013.

    [23] C. Zhao, J. Zhao, X. Lin, and C. Wu, “Capacity of P2P On-DemandStreaming with Simple, Robust and Decentralized Control,” in INFO-COM, 2013.

    [24] Y. Zhou, T. Z. Fu, and D. M. Chiu, “Server-assisted adaptive videoreplication for P2P VoD,” Signal Processing: Image Communication,vol. 27, no. 5, pp. 484 – 495, 2012.

    [25] Y. He, Z. Xiong, Y. Zhang, X. Tan, and Z. Li, “Modeling and analysisof multi-channel P2P VoD systems,” Journal of Network and ComputerApplications, vol. 35, no. 5, pp. 1568 – 1578, 2012.

    [26] Z. W. Zhao, S. Samarth, and W. T. Ooi, “Modeling the effect of userinteractions on mesh-based P2P VoD streaming systems,” ACM Trans-actions on Multimedia Computing, Communications, and Applications,vol. 9, no. 2, 2013.

  • 11

    Delia Ciullo received the Master degree inTelecommunications Engineering and the Ph.D.degree in Electronics and Communications En-gineering, both from Politecnico di Torino in2007 and 2011, respectively. Between 2012 and2013 she was a post-doc fellow in the MAE-STRO team at INRIA Sophia Antipolis with anERCIM fellowship. She is currently a post-docresearcher at EURECOM Sophia Antipolis.

    Valentina Martina received the Master degreein Mathematical modeling in Engineering andthe Ph.D. degree in Electronics and Communica-tion Engineering, both from Politecnico di Torinoin 2007 and 2011, respectively. In 2010, shehas been a visiting student at the TechnicolorParis Research Lab. She is currently a post-docresearcher at Politecnico di Torino.

    Michele Garetto (M’04) received the Dr.Ing.degree in Telecommunication Engineering andthe Ph.D. degree in Electronic and Telecommu-nication Engineering, both from Politecnico diTorino, Italy, in 2000 and 2004, respectively. Heis currently assistant professor at the Universityof Torino, Italy.

    Emilio Leonardi (M’99, SM’09) is an AssociateProfessor at the Dipartimento di Elettronica ofPolitecnico di Torino. He received a Dr.Ing de-gree in Electronics Engineering in 1991 anda Ph.D. in Telecommunications Engineering in1995 both from Politecnico di Torino. His re-search interests are in the field of performanceevaluation of wireless networks, P2P systems,packet switching.

    Giovanni Luca Torrisi graduated in Mathe-matics in 1994 at the University of Rome ”LaSapienza” and obtained a Ph.D. in Mathematicsat the University of Milan. Since December 2001he has been a researcher at the CNR. Hisresearch interests are in the field of ProbabilityTheory and Applied Probability.

  • 1

    Supplemental material of paper:

    Peer-assisted VoD Systems:

    an Efficient Modeling Framework

    Delia Ciullo, Valentina Martina, Michele Garetto, Emilio Leonardi, Giovanni Luca Torrisi

    7 ANALYSIS EXTENSIONS

    7.1 VCR functionalities

    VoD systems can optionally provide to the users limited VCR functionalities. For example, Netflix allows watching users to

    pause or restart viewing at will. In this section we show how our model can be extended to take into account the most common

    VCR functionalities provided by VoD services, i.e., the possibility to place the video in pause mode, and the the possibility to

    replay portions of the video which have already been downloaded. The above two operations offer to the users a significant

    degree of flexibility, while being easily implementable in practice. Furthermore, they actually improve the effectiveness of

    peer-assisted video distribution, as we will see.

    Observe that a VoD system can react to users making pause/replay actions in different ways. We will consider the following

    two extreme strategies: 1) the video download is interrupted as soon as the user enters the pause/replay state, and it is resumed

    only when the user starts watching the left portion of the video; 2) the download is never interrupted, and thus proceeds up

    to its natural end.

    The first strategy makes sense while having in mind a short-term objective. Indeed, by interrupting the download of users in

    pause/replay, the system achieves an instantaneous reduction of global user bandwidth request. The second strategy, instead,

    which keeps on the video download, can be justified by having in mind a long-term objective. Indeed, users who complete

    sooner the download (and are likely to watch the entire video) will act as seeds for an extended period of time.

    In conclusion, the first strategy responds to the need of reducing the instantaneous load, whereas the second aims at increasing

    the number of future seeds. While it is clear that pause/replay actions improve in general the system performance (since users

    remain in the system for a longer time, contributing their upload bandwidth) it is not easy, however, to guess which strategy

    performs better, since this depends on many system parameters, including the user behavior. An analytical model like ours can

    actually be very useful to predict the system performance in different scenarios.

    Both strategies described above can easily be incorporated in our model. In particular, the second strategy does not require

    any substantial modification to the model which does not consider VCR functionalities. Indeed, users entering the pause/replay

    state, but keeping on the download, have no impact on (1) and the approximation developed in Section 3.2. The only difference

    is in the activity time T of users (and its distribution GT (t)), which will be prolonged by pause/replay events, increasing thenumber of seeds (and reducing the system load).

    The iterative procedure described in Section 3.2, instead, must be slightly modified to analyse the first strategy, in which

    video download is suspended while users are in pause/replay state. In this case, Proposition 1 is generalized to,

    Proposition 2: Quantity Sd(t, k) satisfies the following recursive equation:

    Sd(t, k) =

    {

    d I{1 ∈/ P} k = 1d I{k ∈/ P} + max{0, Sd(t, k − 1) − Uk} k > 1

    where I{k ∈/ P} is the indicator function of the event {user k is not pausing/replaying}. Note that I{k ∈/ P} models the fact thatnodes which are currently in pause are not downloading the video (so their contribution to the increase of the bandwidth

    request is null) while they are still contributing their upload bandwidth to the content distribution.

    Then, denoting by pk = E[I{k ∈/ P}] the probability that the k-th downloading user is not pausing, we can easily extend theGaussian approximation procedure described in Section 3.2 by simply replacing (8) and (9) with:

    Sd(t, k) ≃ pkd + E[y′]

    σ2Sd(t, k) ≃ (pk)(1 − pk)d2 + E[y′2] − E[y′]2

    under the initial conditions: Sd(t, 1) = dp1 and σ2Sd

    (t, 1) = (p1)(1 − p1)d2.

  • 2

    7.2 Peers with limited memory capabilities

    At last, we consider the situation in which the amount of memory available at a peer to store the video currently being watched

    is not large enough to cache the entire video file. This lack of memory clearly penalizes the effectiveness of peer-assisted

    video distribution, as users can redistribute only a limited portion of downloaded data.

    To mitigate the impact of memory constraints, an effective solution that we propose and analyse here is to adopt a ‘striping’

    approach: the original video stream is divided into M sub-streams, called stripes, whose size (and number) can be adaptedto the storage capacity of peers. Assuming, for simplicity, that all peers have the same memory constraint (the more general

    case can be similarly handled), we will focus on the case in which all stripes have the same size, equal to the storage capacity

    of peers. Peers downloading a video need to retrieve the M stripes in parallel. Then, each peer is requested to cache andredistribute a single, randomly selected stripe belonging to the requested video.

    Notice that each stripe can be regarded as a content of the same temporal duration of the original video, with an associated

    play-out rate equal to dv/M . Our model can be easily extended to analyze this case as well. Indeed (1) can be generalized tocompute the bandwidth requested from the servers by each individual stripe, as follows:

    Proposition 3: Quantity Sd(t, k), representing now the bandwidth requested by downloaders of one particular stripe, satisfiesthe following recursive equation:

    sd(t, k)=

    {

    dM k = 1dM +max{0, Sd(t, k − 1)−UkI{k ∈ S}} k > 1

    where I{k ∈ S} is an indicating function specifying whether the k-th user is redistributing the current stripe. By constructionE[I{k ∈ S}] = 1/M .The Gaussian approximation procedure described in Section 3.2 can be easily adapted to analyze this case, using:

    µ = Sd(t, k − 1) − U/M and variance σ2 = σ2Sd(t, k − 1) + (σ

    2U + U

    2)/M − U

    2/M2. At last observe that, as consequence of

    the fact that every peer redistributes only one randomly selected stripe, we need to modify the mean of Sseed(t) (the contribution

    of seeds associated to the considered stripe) to U N seed(t)/M , whereas the variance of Sseed(t) becomes N seed(t)(σ2U +U

    2)/M .

    8 PERFORMANCE UNDER STATIONARY CONDITIONS

    8.1 Impact of upload bandwidth heterogeneity

    0

    5

    10

    15

    20

    25

    0 0.5 1 1.5 2

    Aver

    age

    serv

    er b

    andw

    idth

    Variation coefficient

    approxsim

    sim lower bound

    0

    2

    4

    6

    8

    10

    12

    14

    0 0.5 1 1.5 2

    Aver

    age

    serv

    er b

    andw

    idth

    Variation coefficient

    approxsim

    sim lower bound

    Ū = 0.9 Ū = 1.1

    Fig. 11: Comparison of average server bandwidth with N = 100, d = dv , T = Tv , as function of the variation coefficient ofpeer upload bandwidth, for U = 0.9 (left plot) and for U = 1.1 (right plot).

    Figure 11 compares the average server bandwidth S as function of the variation coefficient of the peer upload bandwidth(keeping fixed the mean), considering N = 100, d = dv , T = Tv . The average upload bandwidth is equal to either U = 0.9(deficit mode) or U = 1.1 (surplus mode). Simulation results are supplemented by 95%-level confidence intervals. The uploadbandwidth distribution used in the simulations depends on the variation coefficient: for values larger than one, we adopt a

    second-order hyper-exponential distribution with balanced means; for values smaller than one, we employ an exponential

    distribution added to a constant (this explains the small glitch at variation coefficient equal to 1).

    We observe that the average server bandwidth increases significantly as the variability of upload bandwidth increases, while

    our approximation tends to provide a conservative prediction.

    8.2 Impact of VCR functionalities

    In Figure 12 we evaluate the impact of users pausing the video playback or replaying portions of video already watched. As

    explained in Sec. 7.1 these VCR functionalities can be easily modeled introducing the probability pk that the k-th downloading

  • 3

    user is watching fresh new content (i.e., the portion of video currently being donwloaded). For simplicity, we assume that pkis a given constant, equal for all k (more in general, we would need a sub-model to compute pk on the basis of additionalassumptions about the user behavior). The left plot of Figure 12 refers to the system that keeps on the download while the

    user is in pause/replay state (system 1), whereas the right plot refers to the system in which the download is suspended in

    pause/replay state (system 2).

    Results in Figure 12 refer to the same scenario considered in Figure 5. We observe non-negligible differences between the

    two systems only when users act as seeds for zero or small time after finishing watching the video4. In this case, system 1

    performs better than system 2, due to the very beneficial effect induced by just a small number of seeds, which are artificially

    created in system 1 by keeping on the download. We remark, however, that the gain achieved by system 1 over system 2 also

    depends on the probability that users watch the entire video, actually consuming all downloaded data. Premature abandons

    (not present in the scenario considered in Figure 12), if likely to occur, can make system 2 preferable to system 1.

    0

    1

    2

    3

    4

    5

    6

    0 0.1 0.2 0.3 0.4 0.5

    Avera

    ge s

    erv

    er

    bandw

    idth

    Original seed time / video duration

    approx - pk = 1.0approx - pk = 0.98approx - pk = 0.95approx - pk = 0.9

    0

    1

    2

    3

    4

    5

    6

    0 0.1 0.2 0.3 0.4 0.5

    Original seed time / video duration

    approx - pk = 1.0approx - pk = 0.98approx - pk = 0.95

    approx - pk = 0.9

    Fig. 12: Average server bandwidth as function of the (original) ratio between the seed time and the video duration, for different

    values of probability pk. Comparison between system 1 (left) and system 2 (right).

    8.3 Impact of limited memory at peers

    At last, we investigate the impact of memory constraints at peers, as analysed in Section 7.2. We consider again the same

    scenario as in Figures 5 and 12. Recall that 1/M represents the fraction of the entire video that can be cached at a peer.

    2

    4

    6

    8

    10

    12

    14

    16

    18

    20

    1 1.1 1.2 1.3 1.4 1.5

    Aver

    age

    serv

    er b

    andw

    idth

    Average activity time / video duration

    sim - M = 10approx - M = 10sim - M = 5approx - M = 5sim - M = 2approx - M = 2sim - M = 1approx - M = 1

    Fig. 13: Average server bandwidth as function of the ratio between average activity time and video duration, for different

    values of the number of stripes M . Comparison between simulation and approximate analysis.

    As expected, Figure 13 confirms that limited memory availability at peers negatively affects the system performance, in

    rather strong way. The increase in the server bandwidth, however, is not linear with respect to M .

    8.4 Summary of results under stationary conditions

    By applying our performance evaluation methodology under various parameters setting, we have discovered several interesting

    properties of P2P-VoD systems:

    - in the surplus mode, the server bandwidth achieves a maximum as we increase the number of users, and then decreases to

    4. The seed time is incremented in system 1 by pause/replay functionalities: the seed time reported in Figure 12 refers to the original seed time (withoutpause/replay effects), due to user behavior after the video playback is finished.

  • 4

    zero provided that T > T d (i.e., the average activity time is larger than the average download time) (Fig. 2);- under the pessimistic assumption that users leave the system after watching the video, the server bandwidth can be minimized

    by a proper selection of the target download rate, under any system load (Fig. 3);

    - the server bandwidth increases with the variation coefficient of the peer upload bandwidth distribution (Fig. 11);

    - the gain achievable by non-sequential schemes over the simple sequential scheme depends critically on the size of the sliding

    window and the number of downloading users, and vanishes as the number of seeds increases (Fig. 4 and 5);

    - VCR-like functionalities allowing users to stop/rewind the video playback improve the system performance (Fig. 12);

    - memory contraints can severely penalize the effectiveness of peer-assisted video distribution (Fig. 13).

    9 SUMMARY OF RESULTS UNDER NON-STATIONARY CONDITIONS

    We found out the following interesting properties:

    - under non-stationary traffic conditions peer-assisted VoD systems are affected by a misalignment problem between

    downloaders and seeds. The per-user load γp is not enough to characterize the system performance, which comes to criticallydepend on the amount of bandwidth contributed by peers while they download the video: when U > dv the bandwidthdemanded from the servers is essentially independent from the system size (number of watching users); if U < dv , instead,the bandwidth demanded from the server scales linearly with the system size, regardless of the availability of users as seeds

    and the per-user system load γp (Fig. 8);- in the regime where the system does not scale with the number of users, the system performance is further negatively affected

    by an increase in the content heterogeneity (in terms of popularity) (Fig. 10).

    10 RELATED WORK

    A stochastic fluid approach to analyze peer-assisted video distribution has been proposed in [15] in the context of live streaming,

    in which (heterogeneous) peers download and playback content synchronously. They derive simple conditions for the existence

    of a fluid distribution scheme that achieves universal streaming. in a system with a given available server rate, and two classes

    of users (with high/low upload capacity). In [16] authors extend the analysis in [15] developing queueing network models of

    multi-channel P2P live streaming systems, capturing peer churn, bandwidth heterogeneity, and Zipf-like channel popularity,

    and showing the benefit of the View-Upload Decoupling strategy. Here we apply the stochastic fluid approach to VoD systems,

    whose dynamics are quite different from live streaming, since users can watch the video asynchronously.

    A mathematical formulation of the server bandwidth needed under sequential delivery appeared in [4], in which authors

    resort to a Monte Carlo approach to get basic insights into the system behavior (like surplus and deficit modes). The sequential

    delivery scheme has been considered also in [11], where authors explore by simulation the effectiveness of different replication

    strategies to minimize the server load in the slightly surplus mode, as well as distributed replacement algorithms to achieve it.

    Differently from [11], we provide an analytical approximation that can account for both sequential and non-sequential delivery

    schemes under any system load, considering also the impact of seeds.

    Stochastic fluid models for BitTorrent-like file-sharing system, accounting for the dynamics of downloaders and seeds, have

    been proposed for both transient and steady-state regimes [17], [18], but they are not directly applicable to streaming systems.

    In [19], authors adapt the fluid model in [18] to VoD systems, investigating the impact of different piece selection policies

    (rarest-first and in-order) on download latency and startup delay, in the case of homogeneous peers. In contrast to [19], we

    focus on the characterization of VoD systems with strict service guarantees and heterogeneous user upload bandwidths. In

    [20], a per-chunk capacity model is developed to show the tradeoff that exists between system throughput, sequentiality of

    downloaded data and robustness to heterogeneous network conditions. Optimal content placement strategies to maximize the

    upload capacity of (homogeneous) set-top-boxes (and thus minimize the servers workload) in VoD systems have been recently

    investigated in [21] under many-user asymptotic.

    More recently, [22] characterizes the content providers’ uploading cost as a function of the peers’ contribution, and proposes

    an incentive mechanism that rewards peers based on their dedicated upload bandwidth. However, they evaluate the performances

    of their scheme only through simulation. Somehow complementary to our work is the study in [23], where control algorithms that

    achieve the optimal streaming capacity as the number of peers increases, are proposed. [24] considers a distributed and adaptive

    video replication strategy with some server feedback. Similarly to our paper, [24] deals with peer churn and non-stationary

    popularity of movies, but results are validated through simulation only. [25] proposes two analytical models that capture several

    aspects of peer behavior, such as participating in the system, sojourning in a channel, downloading and uploading the content,

    wandering around channels and leaving the system. However the effects of upload bandwidth heterogeneity, non sequential

    delivery and non-stationary traffic conditions are not taken into account. In [26] an analytical model to both qualitatively and

    quantitatively study the effect on server cost of seeks and pauses on mesh-based P2P VoD streaming systems is developed.

    To the best of our knowledge, we are the first to analytically investigate the performance of peer-assisted VoD systems under

    non-stationary traffic conditions.

  • 5

    11 DISCUSSION ON MODEL ASSUMPTIONS

    In this section we critically revisit the main assumptions of our model discussing their impact on the analysis of a VoD system.

    The first strong assumption consists in assuming a fixed video playback rate. This assumption actually does not hold in

    practice since most video encoding schemes produce variable bitrate streams. However, rapid bitrate fluctuations are usually

    averaged out by the playout buffer of the decoder, so that assuming a fixed download rate d larger than or equal to theaverage playback rate can be an acceptable assumption, while being also a reasonable design choice to simplify the system.

    Nevertheless, it would be possible to incorporate in the model a random download rate, representing fluctuations due to

    variable video bit-rate and cross-traffic effects. Indeed, since different users in a VoD system are retrieving at time t differentand independent segments of the video content, we could well assume instantaneous play-back rates of users to be described

    by i.i.d. random variables with known distribution. The effect of variable play-back rates could then be easily incorporated

    in our modeling framework without changing its mathematical structure. Observe, indeed, that the structure of Sd(t, k) in (5)remains unchanged when d is replaced with dk, where dk is a random variable, while the gaussian approximation in 3.2 canbe easily extended to handle this case, given the first two moments of dk.

    The second important assumption of our work consists in modeling the users’ upload bandwidth as i.i.d. random variables

    Ui with assigned distribution. We do not consider this assumption particularly restrictive, since it permits to represent prettywell bandwidth heterogeneity of users’ access links, as well as random fluctuations in the available upload bandwidth due to

    cross traffic and other forms of bandwidth restriction. Notice that such fluctuations could be also correlated over time at each

    user, without affecting our analysis, which is essentially based on an instantaneous analysis of the system. The assumption that

    upload bandwidths are uncorrelated among users is also reasonable, since in a VoD system users are geographically spread,

    and thus they are likely to experience independent bandwidth fluctuations.

    At last, in our work we have ignored implementation issues such as: i) the effects of protocol overheads and signalling

    bandwidth (necessary to reconfigure the cooperation among users); ii) possible constraints on the number of peers from which

    a user can simultaneously download data; iii) the effect of congestion inside the network. All these issues can potentially affect

    the performance of a realistic system, but we have not incorporated them for the sake of simplicity and analytical tractability.

    APPENDIX APROOF OF PROPOSITION 1

    The case k = 1 is obvious. The recursive expression for k ≥ 2 can be easily explained if we look at the users in reverseorder with respect to the arrival time into the system, i.e., user k arrives before user k − 1. Suppose that we know the serverbandwidth Sd(t, k − 1) needed in the presence of k − 1 users. Then user k can reduce this rate by its upload bandwidth Uk,possibly bringing the server rate to zero. Instead, user k cannot be helped by any other peers, hence it requires fresh newcontent from the server at rate d.

    APPENDIX BEXTENSION TO NON-SEQUENTIAL DELIVERY

    More formally, let P0(t, v) = e−Nv(t) be the probability that there are no users downloading segment v at time t.

    Let Sd(t, v) and S(2)

    d (t, v) be the first and second moment of Sd(t, v), respectively. To start our iterative computation, we

    set Sd(t, 0) = 0 and S(2)

    d (t, 0) = 0. Now, suppose that we know Sd(t, v−1) and S(2)

    d (t, v−1) for a given v ≥ 1. Conditioningon the event Nv(t) > 0, we approximate Sd(t, v − 1) − Ũ by a normal random variable y of mean

    µ = Sd(t, v − 1) −Nv(t)

    1 − P0(t, v)(U − d) − d

    and variance

    σ2 = S(2)

    d (t, v − 1) − (Sd(t, v − 1))2 +

    Nv(t)

    1 − P0(t, v)σ2U +

    [

    Nv(t) + (Nv(t))2

    1 − P0(t, v)−

    (

    Nv(t)

    1 − P0(t, v)

    )2]

    (U − d)2

    and apply (6) and (7) to compute the first moment E[y′] and second moment E[y′2] of y′ = max{0, y}. Then we obtain thefirst two moments of Sd(t, v) as:

    Sd(t, v)= P0(t, v)Sd(t, v−1)+(1−P0(t, v))(d + E[y′])

    S(2)

    d (t, v)= P0(t, v)S(2)

    d (t, v−1)+(1�