
Adaptive Receiver-Driven Streaming from Multiple Senders

Nazanin Magharei, Reza Rejaie
Department of Computer and Information Science
University of Oregon
{nazanin,reza}@cs.uoregon.edu

ABSTRACT
This paper presents the design and evaluation of an adaptive streaming mechanism from multiple senders to a single receiver in Peer-to-Peer (P2P) networks, called P2P Adaptive Layered Streaming, or PALS. PALS is a receiver-driven mechanism. It enables a receiver peer to orchestrate quality-adaptive streaming of a single, layer-encoded video stream from multiple congestion controlled senders, and is able to support a spectrum of non-interactive streaming applications. The primary challenge in the design of a multi-source streaming mechanism is that the available bandwidth from each peer is not known a priori and could significantly change during a session. In PALS, the receiver periodically performs quality adaptation based on the aggregate bandwidth from all senders to determine (i) the overall quality (i.e., number of layers) that can be collectively delivered by all senders, and more importantly (ii) the specific subset of packets that should be delivered by each sender in order to gracefully cope with any sudden change in its bandwidth. Our detailed simulation-based evaluations illustrate that PALS can effectively cope with several dimensions of dynamics in the system, including bandwidth variations, peer participation, and partially available content at different peers. We also demonstrate the importance of coordination among senders and examine key design tradeoffs for the PALS mechanism.

Keywords
Quality adaptive streaming, Peer-to-peer networks, Congestion control, Layered encoding

1. INTRODUCTION
During recent years, Peer-to-Peer (P2P) overlays have become an increasingly popular approach for streaming multimedia content from a single source to many receivers without any special support from the network (e.g., IP multicast or a content distribution infrastructure) [1, 2]. A P2P streaming mechanism requires two key components: (i) an Overlay Construction mechanism that determines how participating peers connect together to collectively form an overlay, and (ii) a Content Delivery mechanism that manages how the content is streamed to each peer through the overlay. The goal is to maximize the delivered quality to individual peers while minimizing the overall network load associated with content delivery.

Previous studies have often adopted the idea of application level multicast, where participating peers form a tree structure and each peer simply “pushes” all (or a specific subset) of its received content (e.g., packets of a certain layer) to its child peers (e.g., [3, 4, 5]). This class of solutions has primarily focused on the design of the overlay construction mechanism to maintain an optimal tree structure in order to minimize network load. However, they often incorporate simple push-based content delivery with static content-to-parent mapping. This class of solutions is unable to maximize the delivered quality to individual peers since each peer only receives content from a single parent peer, which may not have (or may not be willing to allocate) sufficient outgoing bandwidth to stream content with the desired quality to its child peer. This problem is further aggravated by the following issues: (i) heterogeneity and asymmetry of access link bandwidth among participating peers [6], (ii) the dynamics of peer participation, and (iii) the competition among peers for the available bandwidth from parent peers. A promising approach to maximize the delivered quality to an individual peer is to allow a receiver peer to receive content from multiple parent peers. If parent peers are properly selected by the overlay construction mechanism (e.g., to avoid a shared bottleneck), they can provide higher aggregate bandwidth and thus stream higher quality content to the receiver. This multi-sender approach could also lead to better load balancing among parents and across the network.

Efficient streaming of content from multiple sender peers is challenging. We assume that all connections between peers perform TCP-friendly congestion control [7, 8] to ensure that P2P streaming applications can properly co-exist with other applications. This implies that the available bandwidth from a parent is not known a priori and could significantly change during a session. Given the variable nature of available bandwidth, the commonly used static approach of receiving separate layers of a layer encoded stream from each sender [3, 5] can easily lead to poor performance. More specifically, this static approach results in poor quality when the available bandwidth from a sender is lower than one layer's bandwidth, or becomes inefficient when a sender has significantly more than one layer's bandwidth but it is not used for delivery of other layers. To accommodate the variations in available bandwidth, a dynamic coordination mechanism among senders can be deployed in order to adaptively determine: (i) the maximum quality (i.e., number of layers) that can be delivered by all senders, and (ii) the proper distribution (or mapping) of the target quality among senders in order to fully utilize their available bandwidth while ensuring in-time delivery of individual packets.


This paper presents a receiver-driven coordination mechanism for quality adaptive streaming from multiple congestion controlled sender peers to a single receiver peer, called P2P Adaptive Layered Streaming, or PALS. Given a set of sender peers, the receiver passively monitors the available bandwidth from each sender. Then, it periodically determines the target quality (i.e., the number of layers) that can be streamed from all senders, identifies the required packets that should be delivered during the next period, and requests a subset of the required packets from each sender. This receiver-driven approach not only maximizes the delivered quality but also ensures its stability despite the variations in available bandwidth. More importantly, PALS achieves this goal with a relatively small amount of receiver buffering (e.g., tens of seconds' worth of playout). Therefore, PALS can accommodate a spectrum of non-interactive streaming applications ranging from playback to live (but non-interactive) sessions in P2P networks.

We note that PALS is a receiver-driven mechanism for streaming (i.e., content delivery) from multiple senders. Therefore, it must be deployed in conjunction with an overlay construction mechanism (e.g., [9]) that provides information about potential parents (i.e., senders) to each peer. Clearly, the behavior of the overlay construction mechanism affects the overall performance of content delivery across the entire group. In this paper, our goal is to study the streaming delivery of content from multiple senders to a single receiver. Therefore, we assume that a list of potential senders is provided to a receiver peer. Further details of the overlay construction mechanism and the global efficiency of content delivery are outside the scope of this paper and will not be discussed. While we motivated the PALS mechanism in the context of P2P streaming, it should be viewed as a generic coordination mechanism for streaming from multiple congestion controlled senders across the Internet.

The rest of this paper is organized as follows: We sketch an overview of the design space for a multi-sender streaming mechanism in Section 2. Section 3 provides an overview of the PALS mechanism and describes its key components. We present our simulation-based evaluations in Section 4. In Section 5, we present the related work. Finally, Section 6 concludes the paper and presents our future plans.

2. EXPLORING THE DESIGN SPACE
Before describing the PALS mechanism, we explore the design space for a multi-sender streaming mechanism to clarify design issues and justify our design choices. Note that the streaming content should have a layered structure in order to accommodate bandwidth heterogeneity among peers by enabling each peer to receive a proper number of layers. To design a multi-sender streaming mechanism, there must be coordination among senders to address the following two key issues:

• Aggregate Quality: How is the aggregate deliverable quality from all senders determined? i.e., how many layers can be delivered by all senders?

• Content-to-Sender Mapping: How is the aggregate deliverable quality mapped among senders? i.e., which layer (or part of a layer) should be delivered by each sender?

If the available bandwidth from each sender is known, the aggregate quality can be easily determined as the number of complete layers that can be streamed through the aggregate bandwidth. The number of layers mapped to each sender is proportional to its contribution to the aggregate bandwidth. This approach to layer-to-sender mapping raises the issue of how the residual bandwidth from each sender should be used. If the residual bandwidth from each sender is not used for delivery of a partial layer, the residual bandwidth across different senders cannot be accumulated and thus the aggregate delivered quality cannot be maximized. For example, if two senders provide 2.6 and 3.7 layers' worth of bandwidth, they can stream 6 layers only when their residual bandwidth (0.6 and 0.7 of a layer) is utilized for delivery of a partial layer. Assigning a partial layer to a sender requires proper division of a layer across multiple senders (i.e., determining which particular packets of a layer are delivered by each sender).

Aside from the above design choices for a coordination mechanism, a multi-sender streaming mechanism may perform coordination in an adaptive or static fashion, and should determine where the machinery of coordination is placed, as follows:
(i) Adaptive vs. Static Coordination: The coordination mechanism can be invoked just once at the startup phase to determine the aggregate quality and its mapping among senders. Such a static approach is feasible when the available bandwidth from senders is known and stable. However, if senders are congestion controlled, the available bandwidth from each sender is not known a priori and can significantly change during a session. In such a dynamic environment, the coordination mechanism should be invoked periodically in order to adaptively determine the aggregate quality and its mapping among senders. In the adaptive coordination approach, a proper adaptation period should be selected in order to achieve responsiveness to bandwidth variations while ensuring stability of delivered quality.
(ii) Placement of the Coordination Machinery: The final issue is to determine where the coordination machinery should be implemented. The coordination mechanism can be implemented either in a distributed fashion by all senders or in a central fashion by a single sender or the receiver peer. In both approaches, senders should be informed about the status of the receiver, including its buffer state, playout time and lost packets. Distributed coordination requires active participation of all senders and close interactions among them. Such an approach is likely to be sensitive to pair-wise delays among senders and to sender dynamics, and results in a higher volume of coordination traffic. In the central approach, the receiver is in a unique position to implement the coordination mechanism since it is the only permanent member of the session (i.e., receiver-driven coordination). Furthermore, the receiver has complete knowledge about delivered (and thus lost) packets, available bandwidth from each sender, and aggregate delivered quality. While the receiver cannot predict the future available throughput from each sender, it should be able to leverage a degree of multiplexing among senders to its advantage. Another advantage of the receiver-driven approach is that it does not require significant processing overhead at the senders, which provides a better incentive for senders to participate. The coordination overhead for a receiver-driven approach should not be higher than the associated overhead for any conceivable distributed coordination among senders.

In summary, PALS adopts a receiver-driven coordination mechanism operating in an adaptive fashion in order to minimize coordination overhead, maximize delivered quality, and gracefully accommodate the dynamics of bandwidth variations.

3. PALS MECHANISM
PALS is a coordination mechanism for streaming a single content (e.g., a video stream) from multiple senders over the Internet.


sj        Sender j                        SRTTj     EWMA RTT from sj
Li        Layer i                         SRTTmax   Max. value among SRTTj
n         No. of active layers            ∆         Length of sliding window
N         Max. no. of layers              K         Estimated no. of incoming packets during ∆
Tewma     EWMA aggregate BW               C         Per-layer BW
T^j_ewma  EWMA BW from sj                 Nc        No. of packet assignment rounds
bufi      Buffered data for Li            tpr       Receiver's playout time
bwi       Allocated BW for Li             Delayr    Maximum delay between senders & receiver
BUF[i]    Target buffer for Li            BUFadd    Total buffered data before adding a layer
PktSize   Packet size                     τ         LookAhead interval
r         Degree of redundancy            p         Period of added redundancy

Table 1: Summary of Notations

[Figure 1: Internal Architecture of a PALS receiver — congestion controlled data from the senders arrives over the Internet, is demultiplexed into per-layer buffers (buf0..buf3) feeding the decoder and display, while the receiver's sliding window, quality adaptation and segment assignment modules use the aggregate bandwidth, buffer state and playout time to send requests (control information) back to the senders.]

An overlay construction mechanism provides information about a sufficient number of senders¹ to each peer in a demand-driven fashion. The receiver contacts a proper number of senders to serve as its parents. Each selected sender establishes a separate congestion controlled UDP connection to the receiver (i.e., using RAP [7] or TFRC [10]). We assume that senders are heterogeneous and scattered across the Internet. This implies that senders may have different available bandwidth and round-trip times (RTT) to the receiver, and any subset of them might reside behind a shared bottleneck. Because of the inherent dynamics of peer participation in P2P networks, a sender may leave at any point in time. To accommodate bandwidth heterogeneity among senders, we assume that video streams are layer encoded. However, the framework can be easily extended to accommodate multiple description encoding as well. For clarity of the discussion, we do not consider stream bandwidth variability in this paper and assume that all layers have the same constant bit rate (C). General information about the delivered stream (e.g., maximum number of layers (N), layer bandwidth) can be provided to the receiver along with the list of sender peers during the initial setup phase.

To simulate live but non-interactive streaming sessions, we assume that each sender can only provide a limited window of future packets to a receiver. Once a receiver selects a group of sender peers, it delays its playout time (tpr) with respect to the minimum playout time among the selected senders (tpi): tpr = MIN(tpi) − Delayr. This ensures that all senders can provide a window of Delayr seconds' worth of future packets relative to the receiver's playout time. In a nutshell, all participating peers in a session are viewing the same stream, but the playout time of each peer is delayed, where the delay

¹ Throughout this paper, we use the terms “sender”, “parent” and “sender peer” interchangeably.

is roughly proportional to its hop-count distance from the source through the overlay. Delayr is an important configuration parameter that is controlled by the receiver. As the receiver increases Delayr, its session becomes closer to the playback mode since it has more room to request future packets. Table 1 summarizes the notation that we use throughout this paper.

3.1 An Overview
The primary design goal of PALS is to effectively utilize the available bandwidth from each sender to maximize delivered quality while maintaining its stability despite independent variations in bandwidth from different senders. The basic idea in PALS is simple and intuitive. The receiver peer monitors the available bandwidth from its parents and periodically requests a list of packets from each parent. Each parent peer delivers the requested packets to the receiver in the given order through a congestion controlled connection. In a nutshell, the content of delivered packets is determined by the receiver, whereas the rate of packet delivery from each sender is controlled by a congestion control mechanism.

The machinery of the PALS protocol is mostly implemented at the receiver, as shown in Figure 1. The receiver passively monitors the Exponentially Weighted Moving Average (EWMA) bandwidth from each sender (T^j_ewma) and thus can determine the EWMA of the aggregate bandwidth from all senders (Tewma). The receiver deploys a Sliding Window (SW) scheme to properly identify the timestamps of required packets for each window as playout time progresses, and to ensure in-time delivery of requested packets, as we discuss in subsection 3.2. At the beginning of each window (∆), the receiver assumes that the current value of Tewma remains unchanged for one window and takes the following steps: First, it estimates the total number of incoming packets (K) during this window: K = (Tewma × ∆) / PktSize.

[Figure 2: Sliding window & packet ordering in PALS — (a) sliding window relative to the playout time, (b) relative timing of the playing, active and buffering windows and Delayr, (c) packet ordering inside a window.]

Second, the Quality Adaptation (QA) mechanism is invoked to determine (i) the number of active layers that can be played during this window (n), and (ii) an ordered list of required packets for the active layers that should be delivered during this window. The required packets are determined based on the estimated budget of incoming packets (K), the stream bandwidth (i.e., n × C) and the receiver's current buffer state (buf0, buf1, .., bufn). For example, if the receiver expects to receive 500 packets during a period (K = 500) where four layers are currently being played, the QA mechanism may allocate k0 = 200, k1 = 150, k2 = 100, k3 = 50 packets to layers L0 .. L3, respectively. By controlling the distribution of incoming packets among layers, the QA mechanism loosely controls the distribution of the aggregate bandwidth among active layers (i.e., bw0, bw1, .., bwn) during one window, which in turn determines the evolution of the receiver's buffer state. Further details on the quality adaptation mechanism are described in subsection 3.3.
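The budget estimation and its split across active layers can be illustrated with a short sketch. The proportional split below (skewed toward lower layers) is only an illustrative stand-in for the buffer-driven allocation of subsection 3.3; the function names, units and weights are our own assumptions rather than part of PALS.

```python
def estimate_packet_budget(t_ewma_bytes_per_sec, window_sec, pkt_size_bytes):
    """Estimate K, the expected number of incoming packets in one window,
    assuming the EWMA aggregate bandwidth stays constant: K = Tewma * delta / PktSize."""
    return int(t_ewma_bytes_per_sec * window_sec / pkt_size_bytes)

def split_budget_among_layers(k, n_active):
    """Illustrative split of the packet budget across active layers, giving lower
    layers a larger share (the real split follows the target buffer distribution)."""
    weights = [n_active - i for i in range(n_active)]   # e.g. [4, 3, 2, 1] for 4 layers
    total = sum(weights)
    alloc = [k * w // total for w in weights]
    alloc[0] += k - sum(alloc)                          # rounding leftovers go to the base layer
    return alloc

# Example matching the text: K = 500 packets, 4 active layers -> [200, 150, 100, 50]
print(split_budget_among_layers(500, 4))
```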

Third, given the ordered list of required packets for a period, the Packet Assignment (PA) mechanism divides the selected packets into disjoint subsets (possibly from different layers) and sends a separate request to each sender. Each request contains an ordered list of the packets assigned to the corresponding sender. The number of requested packets from each sender is proportional to its contribution to the aggregate bandwidth. Therefore, senders are likely to deliver all the requested packets during the corresponding window. Each sender only maintains a single list of pending packets for each receiver peer and simply delivers the requested packets in the given order at the rate that is determined by its congestion control mechanism. Ordering the list of requested packets from each sender allows the receiver to prioritize requested packets based on its own preferences (e.g., based on encoding-specific information). This in turn ensures graceful degradation in quality when the bandwidth of a sender suddenly drops, since the available bandwidth is utilized for delivery of the more important packets. The receiver can send its request to each sender through an out-of-band TCP connection or piggy-back it on the ACK packets to that sender. Further details of the packet assignment mechanism are presented in subsection 3.4.

3.2 Sliding Window
Once the receiver initiates media playback, PALS deploys a sliding window scheme in order to identify and properly prioritize the required packets in each window and to accommodate “in-time” delivery of requested packets. The sliding window scheme works as follows: the receiver maintains a window of time [tref, tref + ∆] called the active window. This window represents the range of timestamps of packets (from all active layers) that must be requested during one window. tref and ∆ denote the left edge and length of the window, respectively. As the playout time progresses, the window slides forward in a step-like fashion (tref ← tref + ∆) when the gap between the left edge of the window and the playout time reaches a minimum threshold δ (tref − tpr ≤ δ). Figure 2(a) shows the relative position of the window with respect to the playout time and clearly demonstrates the step-like sliding strategy. This sliding strategy allows the receiver to request and receive packets for a new window while it is playing the delivered packets of the previous window. Periodic sliding accommodates in-time delivery of requested packets since their timestamps are sufficiently (at least δ + ∆) ahead of the receiver's playout time.
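A minimal sketch of this step-like sliding rule, assuming times are tracked in seconds; class and method names are hypothetical.

```python
class SlidingWindow:
    """Step-like sliding window: the left edge jumps forward by delta whenever
    its gap to the playout time shrinks to the threshold delta_min."""

    def __init__(self, t_ref: float, delta: float, delta_min: float):
        self.t_ref = t_ref          # left edge of the active window
        self.delta = delta          # window length
        self.delta_min = delta_min  # minimum gap to the playout time before sliding

    def maybe_slide(self, playout_time: float) -> bool:
        """Slide forward (t_ref <- t_ref + delta) when t_ref - playout_time <= delta_min.
        Returns True when a new window opens, i.e., new requests should be issued."""
        if self.t_ref - playout_time <= self.delta_min:
            self.t_ref += self.delta
            return True
        return False

    def active_range(self):
        """Timestamp range [t_ref, t_ref + delta) covered by the current active window."""
        return (self.t_ref, self.t_ref + self.delta)
```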

Figure 2(b) depicts the status of the window right after a slide has occurred. The requested packets of the n active layers in a window can be divided into the following three groups based on their timestamps:

• Packets from the Previous Window (tref − ∆ ≤ ts < tref): these are packets of the n active layers that are missing from the previous window, also called the playing window, often due to packet loss in the network. Therefore, the ratio of these packets is usually very small. Since there is still sufficient time for in-time delivery of these packets, the receiver can perform explicit loss recovery by re-requesting them.

• Packets from the Active Window (tref ≤ ts < tref + ∆): All missing packets of the n active layers within this window must be requested at this point. Because of the receiver buffering, some of these packets are often delivered ahead of time. This implies that the required budget for requesting these packets is often less than one window's worth of packets.

• Packets from the Buffering Window (tref + ∆ ≤ ts < tref + ∆ + τ): Any excess packet budget can be leveraged to request packets from the buffering window. τ, the length of the buffering window, is called the LookAhead interval. These packets are determined and ordered by the QA mechanism in order to shape the receiver's buffer state, as we discuss in subsection 3.3. Note that the timestamp range of requested future packets (i.e., the size of the buffering window) is determined by the availability of future data, which is specified by Delayr, i.e., τ = Delayr − (2∆ + δ).


At the macro level, the above three groups of selected packets are ordered based on their corresponding windows, i.e., first packets of the playing window, then the active window, and finally the buffering window. Micro-level ordering of selected packets within the buffering window is determined by the QA mechanism, as we describe in subsection 3.3. Packets within the playing and active windows are ordered based on the following strategy: Given two packets a and b with timestamps tsa and tsb from layers La and Lb (Lb > La), packet b is ordered before a only if the following two conditions are satisfied: n/∆ > |La − Lb| / |tsa − tsb| and tsa > tsb, where n denotes the number of active layers. This approach simply orders packets in a diagonal pattern with slope n/∆ (as shown in Figure 2(c)).

This ordering strategy ensures that packets from lower layers or with lower timestamps are given higher priority. Therefore, lower-priority packets that may not be delivered will be located at the top-right corner of the window and have more time for delivery in the next window. As the above condition indicates, the slope of the ordering adaptively changes with n to strike a balance between the importance of lower layers and the shorter time available for delivery of packets with lower timestamps. Clearly, other ordering patterns can be adopted if layer number or timestamp has a higher priority.
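One way to realize this diagonal rule is with a linear sort key; a sketch follows, assuming packets are simple (layer, timestamp) pairs. The key ts + L·∆/n reproduces the pairwise condition above, but this is our reading of the rule rather than code from the paper.

```python
from typing import List, Tuple

def diagonal_order(packets: List[Tuple[int, float]], n_active: int, delta: float):
    """Sort (layer, timestamp) packets in a diagonal pattern with slope n/delta.

    With key = ts + layer * delta / n, a packet of a higher layer precedes a packet
    of a lower layer exactly when its timestamp is earlier by more than
    (Lb - La) * delta / n, which is the pairwise diagonal condition from the text.
    Lower layers win ties.
    """
    return sorted(packets, key=lambda p: (p[1] + p[0] * delta / n_active, p[0]))

# Example: 4 active layers, a 6-second window
pkts = [(layer, ts) for layer in range(4) for ts in (0.0, 1.5, 3.0, 4.5)]
print(diagonal_order(pkts, n_active=4, delta=6.0))
```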

The window size (∆) is a key parameter that determines the tradeoff between stability of delivered quality and responsiveness of the QA mechanism to variations in aggregate bandwidth. Decreasing the window size improves responsiveness at the cost of lower stability in delivered quality and higher control overhead (i.e., request messages). Note that the window size should be at least several times longer than the average RTT between the receiver and the different senders, since the RTT determines: (i) the minimum delay of the control loop from the receiver to each sender, and (ii) the timescale of variations in congestion controlled bandwidth from the sender.

Since a new request is sent during the delivery of packets from the last request, a PALS receiver may observe duplicate packets. More specifically, after the window slides forward, there is half an RTT's worth of packets in flight from each sender. These packets will arrive during the next window and might be requested again, which results in duplicates. We minimize the ratio of duplicate packets in PALS as follows: requested packets that are not delivered in one window are re-requested from the same sender in the next window. Each sender removes from a new request any packet that was delivered during the last RTT. As our simulation results show, the ratio of duplicate packets is extremely low in PALS.

Coping with Bandwidth Variations: Periodic window sliding is not sufficient to effectively cope with sudden changes in aggregate bandwidth during one window. More specifically, a sudden increase or decrease in the available bandwidth from a sender could result in low utilization of its bandwidth or late arrival of packets, respectively. In PALS, the sliding window mechanism is coupled with several secondary mechanisms to address this issue. The receiver employs three mechanisms to cope with a major drop in bandwidth as follows:

• (i) Overwriting Requests: Any new request from the receiver overwrites the outstanding list of packets that is being delivered by a sender. More specifically, when a sender receives a new list from the receiver, it starts delivery of packets from the new list and abandons any pending packets from the previous list. This mechanism “loosely” synchronizes slow senders with the receiver's playout time and prevents them from falling behind, i.e., delivering late packets.

• (ii) Packet Ordering: As we mentioned earlier, the requested packets from each sender are ordered based on their importance. Therefore, the effect of any drop in bandwidth is minimized since the available bandwidth is used for delivery of the more important packets.

• (iii) Event-driven Sliding: During each window, the receiver monitors the progress in the evolution of the buffer state once per RTT. In an extreme scenario where the aggregate available bandwidth is significantly lower than the estimated value and the available buffer state is not sufficient to maintain the current layers until the end of the window, the receiver drops a layer, slides the window forward, invokes the QA mechanism and sends a new request to each peer.

If the available bandwidth from a sender significantly increases during a period, the sender may deliver all the requested packets at a higher rate and become idle. This in turn reduces the bandwidth utilization of that sender. PALS incorporates two ideas to address this problem. First, the receiver requests a percentage of extra packets (beyond the estimated number of incoming packets based on EWMA bandwidth). The percentage of extra packets requested from a sender is determined based on the deviation of the sender's per-window bandwidth from its EWMA bandwidth. Similar to the packets in the buffering window, these extra packets are selected and ordered by the QA mechanism as well. Second, PALS also incorporates the idea of reverse flow control to keep all senders busy. The receiver keeps track of the packets delivered by each sender to determine whether a sender is likely to complete its assigned packets (including the extra packets) before the end of the current window. If such an event is detected, the receiver sends a small request to that particular sender half an RTT before the sender becomes idle to ensure utilization of the sender's bandwidth for the remaining part of the window.
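A rough sketch of these two idle-avoidance ideas, assuming the receiver samples per-sender delivery progress once per RTT; the function names, the simple deviation-based fraction and the pacing test are illustrative assumptions, not the exact PALS rules.

```python
def extra_packet_fraction(per_window_bw: float, ewma_bw: float) -> float:
    """Fraction of extra packets to request from a sender, grown with how far its
    recent per-window bandwidth deviates above its EWMA bandwidth (illustrative)."""
    if ewma_bw <= 0:
        return 0.0
    return max(0.0, (per_window_bw - ewma_bw) / ewma_bw)

def should_send_refill(delivered: int, assigned: int, time_left_in_window: float,
                       rate_pkts_per_sec: float, rtt: float) -> bool:
    """Reverse flow control check: True when the sender is predicted to exhaust its
    assigned list before the window ends and we are within half an RTT of that idle
    moment, so a small additional request should be issued now."""
    remaining = assigned - delivered
    time_to_idle = remaining / max(rate_pkts_per_sec, 1e-9)
    return time_to_idle < time_left_in_window and time_to_idle <= 0.5 * rtt
```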

3.3 Quality Adaptation
The QA mechanism has two degrees of control that adapt delivered quality on two different timescales:

• Coarse-grained Adaptation: Over long timescales, the QA mechanism can add or drop the top layer to adjust the number of playing layers in response to a long-term mismatch between aggregate bandwidth and stream bandwidth.

• Fine-grained Adaptation: Over short timescales (once per window), the QA mechanism controls the evolution of the receiver buffer state by adjusting the allocation of aggregate bandwidth among active layers, which is used to absorb short-term mismatches between stream bandwidth and aggregate bandwidth.

[Figure 3: Horizontal, Vertical and Diagonal buffer distributions across active layers over the buffering window (τ); the diagonal case forms the triangle ABC referenced in the text.]

The basic adaptation mechanism works as follows: when the aggregate bandwidth is higher than the stream bandwidth (n × C ≤ Tewma), called a filling window, the QA mechanism can utilize the excess bandwidth to request future packets and fill the receiver's buffers with a proper inter-layer distribution (buf0, buf1, .., bufn). Once the receiver's buffers are filled to a certain level (BUFadd) with the required inter-layer buffer distribution, the QA mechanism can increase the stream bandwidth by adding a new layer to avoid buffer overflow. In contrast, when the aggregate bandwidth is lower than the stream bandwidth (Tewma < n × C), called a draining window, the QA mechanism drains the receiver's buffers to compensate for the bandwidth deficit while maintaining a proper inter-layer distribution for the remaining buffered data. If the amount of buffered data or its distribution among active layers is inadequate to efficiently absorb the bandwidth deficit during a draining window, the QA mechanism drops the top layer.
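The coarse-grained part of this per-window decision can be sketched as follows, with buffers measured in bytes; the real PALS tests also depend on the inter-layer distribution described next, so this is a simplified illustration with hypothetical names.

```python
def coarse_grained_adapt(n: int, n_max: int, t_ewma: float, C: float,
                         window_sec: float, buffered_bytes: list, buf_add_bytes: float) -> int:
    """One add/drop decision per window (simplified).

    t_ewma         : EWMA aggregate bandwidth (bytes/sec)
    C              : per-layer bandwidth (bytes/sec)
    buffered_bytes : buffered data per active layer
    buf_add_bytes  : total buffering required before adding a layer (BUFadd)
    """
    stream_bw = n * C
    if t_ewma >= stream_bw:
        # Filling window: add a layer once enough data is buffered.
        if sum(buffered_bytes) >= buf_add_bytes and n < n_max:
            return n + 1
    else:
        # Draining window: drop the top layer if the buffers cannot absorb
        # the bandwidth deficit for the rest of the window (crude test).
        deficit_bytes = (stream_bw - t_ewma) * window_sec
        if sum(buffered_bytes) < deficit_bytes:
            return max(n - 1, 1)
    return n
```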

BUFadd and the inter-layer buffer distribution are key factors in coarse-grained and fine-grained adaptation, respectively. The larger the value of BUFadd, the longer it takes to add a new layer when excess bandwidth is available, and the less likely it is that the newly added layer will be dropped in the near future. Therefore, increasing BUFadd further decouples the delivered quality from the variations of aggregate bandwidth and improves the stability of delivered quality.

Since the stream has a layered structure, a key question is how the receiver buffer state should evolve as it is filled or drained. This is determined by two factors: (i) the target buffer distribution, and (ii) packet ordering. In essence, the target buffer distribution determines the overall distribution of BUFadd worth of buffered data across all active layers, whereas packet ordering controls the evolution of the buffer state as individual packets arrive. A given BUFadd value can be distributed across active layers in different ways, as shown in Figure 3. In general, more buffering should be allocated to lower layers because of their importance for decoding higher layers. The conservative, or horizontal, approach of allocating buffering only to lower layers improves buffering efficiency, since the buffered data is more likely to be usable, as well as long-term stability in delivered quality. In contrast, the aggressive, or vertical, approach of allocating buffering only to the lower timestamps of all active layers achieves a short-term improvement in quality, but it is less efficient² and does not achieve long-term stability. A skewed, or diagonal, buffer distribution can effectively leverage the tradeoff between long-term stability in quality and buffering efficiency. By changing the slope of such a distribution, one can give the proper weight to stability or efficiency.

Packet ordering determines how the target buffer distribution is filled and drained over short timescales. There are two criteria for ordering packets, namely layer ID and timestamp. As shown in Figure 4(a), using layer ID as the primary criterion results in the horizontal ordering, whereas ordering primarily based on packet timestamp leads to the vertical pattern. The snake-shape (or diagonal) pattern orders packets based on both timestamp and layer ID, where packet a has a higher priority than b if the following condition is met: slope > |La − Lb| / |tsa − tsb| (similar to the criterion in subsection 3.2), where slope determines the slope of the diagonal pattern and thus the

² In general, the higher the amount of buffering for the dropped layer, the lower the buffering efficiency. In other words, allocating that buffer to lower layers would have been more useful for QA, and thus more efficient [11].

required weight for layer ID versus timestamp. The effect of packet ordering on the actual buffer state is primarily visible when the delivered packets do not fill the entire target distribution. To illustrate this effect, Figure 4 depicts the buffer state for a diagonal buffer distribution under different packet ordering schemes.

Quality Adaptation in PALS: PALS incorporates the following ideas to leverage the above fundamental tradeoffs in a balanced fashion. The value of BUFadd and its distribution are adaptively determined as a function of the LookAhead (τ) and the number of active layers (n), as shown in Figure 3. Toward this end, BUFadd is equal to half of the total available data in the future window, BUFadd = 0.5 × (τ × n × C), i.e., the area of triangle ABC. Furthermore, BUFadd is distributed in a diagonal fashion across all active layers with the dynamic slope n/τ, along with the corresponding snake-shape ordering for requested packets. In this approach, as the quality of the stream (n) increases, the PALS behavior becomes more conservative in adding, and more resilient in dropping, a layer. Similarly, using a larger value of Delayr, which results in a larger LookAhead value, increases the value of BUFadd and results in a more conservative target buffer distribution, which is desirable since more future data is available, as shown in Figure 5.
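A small sketch of this computation, assuming bandwidths in bytes per second; the per-layer weights (n − i) are our own discretization of the diagonal distribution, since the text only specifies the total BUFadd and the slope n/τ.

```python
def target_buffer_distribution(n: int, C: float, tau: float):
    """BUFadd = 0.5 * tau * n * C (the area of triangle ABC in Figure 3),
    distributed diagonally so that lower layers hold proportionally more data.
    The linear weights below are one plausible discretization of that diagonal."""
    buf_add = 0.5 * tau * n * C
    weights = [n - i for i in range(n)]            # base layer gets the largest share
    total = sum(weights)
    targets = [buf_add * w / total for w in weights]
    return buf_add, targets

# Example: 4 active layers, C = 80 KBps, LookAhead tau = 20 s
buf_add, per_layer = target_buffer_distribution(4, 80_000, 20.0)
print(round(buf_add), [round(t) for t in per_layer])
```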

The diagonal buffer distribution in PALS implies that packets of lower layers are requested well ahead of their playout times. This strategy provides multiple opportunities for requesting packets of lower layers, which in turn leads to a higher degree of resiliency against packet loss for the lower layers that are more important. Furthermore, PALS deploys an implicit loss recovery mechanism by requesting any missing packets within the playing window for the last time, as we discussed earlier. In summary, multiple requesting opportunities coupled with implicit loss recovery in PALS provide a sufficient number of opportunities for delivery of lost packets and/or overwritten requests.

PALS requires a startup phase during which the receiver buffers a sufficient amount of data before it initiates stream playout. During the startup phase, only packets of the base layer (L0) are requested from senders. In PALS, the playout is initiated when two conditions are satisfied: (i) the receiver has two windows' worth of buffered data for the base layer, and (ii) the EWMA bandwidth from all senders is more than the consumption rate of a single layer (C).
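The startup test is small enough to state directly; a sketch follows, with the base-layer buffer expressed in seconds of playback (an assumption made for readability).

```python
def can_start_playout(base_layer_buffered_sec: float, delta: float,
                      aggregate_ewma_bw: float, C: float) -> bool:
    """Begin playout once (i) two windows' worth of the base layer is buffered and
    (ii) the EWMA aggregate bandwidth exceeds the consumption rate of one layer."""
    return base_layer_buffered_sec >= 2 * delta and aggregate_ewma_bw > C
```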

3.4 Packet Assignment
Once the QA mechanism has identified and ordered all the required packets for a window, it passes the ordered list of selected packets to the packet assignment mechanism. The packet assignment mechanism then maps the ordered list of packets among active senders such that two conditions are met: (i) the number of packets allocated to each peer is proportional to its contribution to the aggregate bandwidth, and (ii) packets delivered by different senders arrive in an order that is rather similar to the ordering of the aggregate list despite independent variations in bandwidth from different senders.

[Figure 5: Effect of LookAhead (τ) on the amount of future buffering.]


[Figure 4: Effect of packet ordering on the evolution of the buffer state over the buffering window (τ), for horizontal and diagonal ordering patterns.]

Maintaining the order of delivery for requested packets ensures that the evolution of the buffer state remains close to the plan determined by the QA mechanism. To achieve these goals, PALS adopts a weighted round-robin strategy for packet assignment among senders. Consider an example where three senders s0, s1 and s2 contribute 50%, 30% and 20% of the aggregate bandwidth, and all senders can provide all layers. The ordered list of all selected packets for a window is assigned to senders as follows: First, the PA mechanism divides the list of K packets into Nc equal-size chunks where each chunk (except possibly the last one) contains kc = K / Nc packets. Starting from the first chunk, the PA mechanism assigns kc × 0.5 packets to s0, then kc × 0.3 packets to s1, and then kc × 0.2 packets to s2. Nc is a configuration parameter that determines how much the order of delivered packets can diverge from the global ordering specified by the QA mechanism. This strategy attempts to proportionally distribute the less important packets at the end of the requests to the different senders.
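A minimal sketch of this weighted round-robin assignment, assuming every sender can provide every packet; rounding is handled crudely by letting the last sender absorb the remainder of each chunk, which is our simplification.

```python
def weighted_round_robin_assign(ordered_packets: list, shares: list, n_chunks: int):
    """Split the globally ordered packet list into n_chunks chunks and, within each
    chunk, hand consecutive runs of packets to senders in proportion to their share
    of the aggregate bandwidth (e.g., shares = [0.5, 0.3, 0.2])."""
    k = len(ordered_packets)
    chunk_size = max(1, -(-k // n_chunks))               # ceiling division
    assignment = [[] for _ in shares]
    for start in range(0, k, chunk_size):
        chunk = ordered_packets[start:start + chunk_size]
        pos = 0
        for sender, share in enumerate(shares):
            take = len(chunk) - pos if sender == len(shares) - 1 else round(len(chunk) * share)
            assignment[sender].extend(chunk[pos:pos + take])
            pos += take
    return assignment

# Example: 20 packets, three senders with 50/30/20% of the bandwidth, Nc = 5 chunks
print(weighted_round_robin_assign(list(range(20)), [0.5, 0.3, 0.2], 5))
```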

Partially Available Content: In practice, a sender may not receive (or may not cache) all the layers, or all the segments of the layers it receives. The information about the available packets at each sender can be provided to the receiver as the session progresses (for live sessions) or at the beginning of a playback session. Our basic packet assignment mechanism cannot accommodate partially available content among senders.

We devised the following two-pass packet assignment mechanism to accommodate this practical scenario. Given the EWMA bandwidth from each sender, we can determine their packet budgets as follows: ki = (T^i_ewma × ∆) / PktSize. In the first pass, the PA mechanism sequentially examines each packet from the ordered list and keeps track of the number of assigned packets to each sender (assignedi). If the packet is only available at one sender, it is assigned to that sender. Otherwise, it is assigned to the sender that has the minimum ratio of assigned packets to total packet budget (i.e., assignedi / ki) because it can provide an earlier delivery time and better maintain the original ordering. For example, if both senders s1 and s2 can provide packet x, have total packet budgets of 300 and 100, and their numbers of already assigned packets are 50 and 10, packet x is assigned to s1. At the end of the first pass, all packets are assigned to a sender, but the number of assigned packets to some senders might be larger than their packet budget (assignedi > ki), which implies that the number of assigned packets is less than the budget for some other senders.

In the second pass, we only examine those senders whose number of assigned packets is more than their packet budget, starting from the sender with the maximum surplus. For each sender, we examine those assigned packets that are available at other peers (i.e., we exclude packets that are only available at this sender) in the given order. If a packet can be provided by another sender with a packet deficit, it is assigned to that other sender. The second pass continues until no sender has any packet surplus or no more improvement can be achieved. Note that if the distribution of packets among senders is proportional to the distribution of bandwidth (T^i_ewma) among senders, then this algorithm will identify the proper packet-to-sender mapping that fully utilizes the available bandwidth from all senders. We examine the performance of this mechanism in Section 4.
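A compact sketch of the two-pass assignment described above; the data structures (dict of packet → holders, per-sender budget list) and the tie-breaking details are our own assumptions, and budgets are assumed to be positive.

```python
def two_pass_assign(ordered_packets: list, availability: dict, budgets: list):
    """Two-pass packet assignment for partially available content.

    ordered_packets : packet ids in the QA-specified order
    availability    : packet id -> set of sender indices that hold it
    budgets         : per-sender packet budgets k_i (assumed > 0)
    """
    assigned = [[] for _ in budgets]

    # Pass 1: give each packet to a holder with the lowest assigned/budget ratio.
    for pkt in ordered_packets:
        best = min(availability[pkt], key=lambda s: len(assigned[s]) / budgets[s])
        assigned[best].append(pkt)

    # Pass 2: move surplus packets from over-budget senders to holders with spare
    # budget, starting from the sender with the largest surplus.
    order = sorted(range(len(budgets)), key=lambda s: len(assigned[s]) - budgets[s], reverse=True)
    for s in order:
        for pkt in list(assigned[s]):
            if len(assigned[s]) <= budgets[s]:
                break
            spare = [t for t in availability[pkt] if t != s and len(assigned[t]) < budgets[t]]
            if spare:
                target = min(spare, key=lambda t: len(assigned[t]) / budgets[t])
                assigned[s].remove(pkt)
                assigned[target].append(pkt)
    return assigned
```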

4. PERFORMANCE EVALUATION
We use a packet-level simulator, namely ns2, to extensively evaluate the performance of the PALS mechanism under different dynamics in P2P systems and to explore several key tradeoffs in the design of receiver-driven coordination mechanisms. In our simulations, all sender peers employ the RAP [7] mechanism to perform TCP-friendly congestion control and are able to provide a limited window of future packets (Delayr seconds) for all layers of a requested stream to the receiver. To properly measure the congestion controlled bandwidth from a sender independent of PALS behavior, we have decoupled PALS from the underlying congestion control mechanism. Once a sender receives the first request, it starts sending requested packets at a rate that is determined by RAP. At each packet departure time, if there is an outstanding PALS packet, it is mapped to the outgoing packet. Otherwise, an empty packet is sent. This decoupling allows us to assess the ability of the PALS mechanism to utilize the available bandwidth from senders³. After a startup phase, the receiver emulates “streaming” playout of delivered packets while considering decoding dependencies among packets. For example, a lost or late packet of the second layer implies that the corresponding packets of higher layers cannot be decoded and are useless.

Figure 6 depicts the basic topology in our simulations with its default parameters. Each flow goes through a shared and an unshared link with cross traffic. By changing the volume of cross traffic on these links, we can control which one ultimately becomes the bottleneck in each simulation and generate the desired scenario with a shared or unshared bottleneck. We have also used a single TCP flow in the reverse direction to avoid any phase effect in simulations with a larger number of flows. The buffer size for each link is properly provisioned. Presented results are averaged over 50 runs with different random seeds. Unless otherwise stated, the following default parameters are used in our simulations: Nc = 5, C = 80 KBps, N = 10, M = 3, δ = 500 msec, PktSize = 1 KByte.

³ It is worth noting that despite this decoupling, any change in PALS parameters could result in a different pattern of requests from the receiver, which in turn affects the dynamics of ACK packets and short-term variations of congestion controlled bandwidth.


[Figure 6: Simulation Topology — sender peers S1..Si and the receiver R connected through shared and unshared bottleneck links (link parameters include 9.6 Mbps/30 msec, 20 Mbps/1 msec and 5 Mbps/1 msec; per-sender links have bandwidth bw_si and delay d_si), with 9 long-lived TCP flows as cross traffic and a single TCP flow in the reverse direction.]

We have extensively evaluated PALS and examined the effect of a wide range of parameters under a variety of scenarios, including shared and unshared bottlenecks among senders and different degrees of bandwidth heterogeneity. Due to limited space, we only present a representative subset of our results in this section. In particular, we focus on senders behind a shared bottleneck since bandwidth sharing among senders introduces further dynamics into the available bandwidth. Furthermore, we emphasize scenarios with a moderate degree of heterogeneity among senders for the following reason. When the degree of heterogeneity is too high, the behavior of one sender becomes dominant and the dynamics of multi-sender delivery are not visible. In contrast, assuming senders with homogeneous bandwidth is rather unrealistic. Further results can be found in the related technical report [12]. We explore the following issues in this section: (i) the importance of inter-sender coordination, (ii) key design tradeoffs in receiver-driven coordination, (iii) the ability of PALS to cope with dynamics of bandwidth variations and peer participation, and (iv) the sensitivity of PALS to different patterns of partially available content among senders.

4.1 Importance of Coordination
As we discussed earlier, to maximize delivered quality from multiple congestion controlled senders despite variations in bandwidth, it is important to dynamically coordinate the packets delivered by each sender. To illustrate the importance of inter-sender coordination, we compare the performance of PALS with the following two mechanisms that employ static content-to-sender mapping.

• Single Layer per Sender (SLS): In this approach, each sender si delivers packets of a designated layer (Li) at the rate that is determined by its congestion control mechanism. SLS represents the common “one layer per sender” approach that has been proposed in several previous studies (e.g., [13, 5]).

• Multiple Layer per Sender (MLS): This approach is more elaborate and assigns multiple layers to each sender [14]. Given an ordered list of senders based on their available bandwidth, MLS starts from the first sender (with maximum bandwidth) and sequentially assigns the maximum number of consecutive new layers (i.e., li = ⌊T^i_ewma / C⌋) to each sender; see the sketch after this list. For example, if the layer bandwidth is C = 80 KBps, the layer-to-sender mapping for three senders with 250, 175 and 100 KBps average bandwidth would be (L0, L1, L2), (L3, L4) and (L5), respectively.
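The MLS mapping can be written out in a few lines; the sketch below reproduces the 250/175/100 KBps example with C = 80 KBps, using hypothetical function and parameter names.

```python
def mls_layer_mapping(sender_bw_desc: list, C: float, n_max: int):
    """Static MLS mapping: walk the senders in decreasing-bandwidth order and give
    each one the next floor(T_ewma / C) consecutive layers, up to n_max layers."""
    mapping, next_layer = [], 0
    for bw in sender_bw_desc:
        n_layers = int(bw // C)
        layers = list(range(next_layer, min(next_layer + n_layers, n_max)))
        mapping.append(layers)
        next_layer += len(layers)
    return mapping

# Example from the text: senders with 250, 175 and 100 KBps, C = 80 KBps
print(mls_layer_mapping([250, 175, 100], 80, n_max=10))   # [[0, 1, 2], [3, 4], [5]]
```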

Both SLS and MLS incorporate the following miscellaneous mechanisms. Each sender delivers the packets of its assigned layers through a RAP connection, ordered by timestamp within a layer and by layer number across layers, i.e., vertical ordering. The receiver reports its playout time in ACK packets to enable senders to estimate the receiver's playout time. To ensure in-time delivery, each sender only transmits those packets whose timestamp is at least one RTT larger than the receiver's playout time. This implies that a sender skips a range of timestamps when its available bandwidth drops in order to remain loosely synchronized with the receiver's playout time. A sender can also utilize its excess bandwidth to send available future packets of its assigned layers (up to Delayr seconds) and increase the buffered data at the receiver. Both SLS and MLS incorporate an explicit loss recovery mechanism: dropped packets are detected by RAP at the sender side and retransmitted based on their priority within the packet ordering scheme if there is sufficient time for their in-time delivery. In summary, SLS and MLS represent two well designed multi-sender streaming mechanisms that leverage interactions between the receiver and each sender to accommodate timing and loss recovery. However, they do not use any coordination among senders and rely on a static layer-to-sender mapping.

We compare PALS with SLS and MLS in a scenario where a variable number of heterogeneous senders reside behind a shared bottleneck. The results for homogeneous senders are similar and can be found in the related technical report [12]. For a scenario with i senders, only senders s1 to si participate in delivery. We use the topology shown in Figure 6. However, we have changed the following parameters from their default values: the d_lsi values are set based on the equation d_lsi = d_ls1 + (i − 1) × 0.5 ms, where d_ls1 = 1 msec, to achieve heterogeneous bandwidth. The shared bottleneck has BW_bn = 32 Mbps bandwidth and 20 long-lived cross-traffic flows. Finally, the receiver access link was reduced to BW_r = 4.5 Mbps. With these parameters, as the number of senders increases, initially the shared bottleneck is the limiting link and then the receiver's access link becomes the bottleneck. Note that the aggregate bandwidth to the receiver is independent of the content delivery mechanism and only depends on the number of senders. Table 2 shows the available bandwidth from each sender in these scenarios in terms of layer bandwidth (T^j_ewma / C).

n    T^1_ewma  T^2_ewma  T^3_ewma  T^4_ewma  T^5_ewma  T^6_ewma
2    2.5       2.4
3    1.9       2.1       1.9
4    1.7       1.6       1.5       1.4
5    1.4       1.3       1.27      1.17      1.14
6    1.2       1.2       1.1       1.02      1.01      1.01

Table 2: Available bandwidth for each sender across different scenarios

Figure 7(a) depicts the average delivered quality by SLS, MLS and PALS for different numbers of senders, ranging from 2 to 6. We have also shown the maximum deliverable quality (i.e., the ratio of aggregate bandwidth to layer bandwidth) in each scenario as an upper bound for the average delivered quality. This figure shows that the average delivered quality by PALS is higher than that of the other two mechanisms and is indeed very close to the maximum deliverable quality. The small gap between the quality delivered by PALS and the maximum deliverable quality represents the residual aggregate bandwidth that is insufficient for adding another layer. The lower delivered quality of SLS and MLS is primarily due to the inability of these mechanisms to utilize the residual bandwidth from each sender.


[Figure 7: Effect of Inter-sender Coordination (PALS parameters: ∆ = 6 sec, Delayr = 40 sec) — (a) average delivered quality (layers), (b) frequency of layer drops, and (c) percentage of bandwidth utilization, each plotted against the number of senders for PALS, SLS and MLS, with the average bandwidth (in layers) shown as an upper bound in (a).]

For example, in the scenario with 4 senders, the residual bandwidth from all senders (shown in Table 2) is sufficient to deliver two more layers. However, without any coordination among senders, this residual bandwidth cannot be utilized.

Figure 7(b) shows the frequency of layer drops to quantify the stability of the quality delivered by these mechanisms in the same simulations presented in Figure 7(a). These variations occur because of layer drops (only in PALS) or undelivered packets. The variation of delivered quality by SLS and MLS is zero in those scenarios where all senders have plenty of residual bandwidth. In these scenarios, each sender can deliver packets of its designated layers ahead of time, which results in plenty of receiver buffering. Table 2 shows that all senders have plenty of residual bandwidth in the scenarios with 2, 4 and 5 senders. However, when the residual bandwidth of at least one sender is low (e.g., s2 when n = 3, or s3–s6 when n = 6), MLS and SLS are very sensitive to variations of the available bandwidth from these sender(s) because senders with low residual bandwidth cannot accumulate sufficient buffering and thus are forced to skip portions of their designated packets to ensure in-time delivery.

Figure 7(c) depicts the utilization of aggregate bandwidth from all senders in the same simulations. PALS is the only mechanism that fully utilizes the aggregate available bandwidth from all senders. In SLS and MLS, the aggregate bandwidth is not fully utilized due to the limited availability of future packets at each sender. In other words, each sender can only send a limited amount of future packets (depending on the value of Delay_r) and then becomes idle. The larger the residual bandwidth, the more future packets (i.e., larger Delay_r values) are required to fully utilize the available bandwidth.

As expected, none of the mechanisms experienced late packets in our simulations. Because of their static mapping, SLS and MLS do not deliver duplicate packets. However, we observed less than 0.05% duplicate packets in all PALS simulations. In summary, the behavior of SLS and MLS significantly depends on the amount of residual bandwidth at individual senders. When the residual bandwidth at some senders is low, they can efficiently utilize their available bandwidth but become too sensitive to bandwidth variations, which results in instability of delivered quality. In contrast, when all senders have plenty of residual bandwidth, they can buffer future packets and provide a stable quality. However, they exhibit poor bandwidth utilization, especially when the amount of future packets among senders is limited (i.e., Delay_r is not too large). In a nutshell, content delivery mechanisms that rely on static layer-to-sender mapping cannot maximize the delivered quality because of their inability to efficiently utilize the residual bandwidth from each sender.

4.2 Effect of PALS Parameters
We turn our attention to the effect of key configuration parameters, including window size and LookAhead, on the performance of the PALS mechanism. This also illustrates the underlying dynamics and key tradeoffs of the receiver-driven coordination mechanism. The window size (∆) determines the frequency of adaptation in the PALS mechanism. The LookAhead (τ) controls the amount of future buffering, which determines the amount of BUF_add and its inter-layer distribution, and thus affects the tradeoff between responsiveness and stability of delivered quality. To explore these issues, we consider a scenario where three heterogeneous senders are behind a shared bottleneck, using the topology shown in Figure 6 with default parameters. Table 3 summarizes the average bandwidth and RTT obtained by each sender in this simulation. PALS uses the diagonal shape for buffer distribution along with the diagonal pattern for packet ordering in these simulations.

              sender1      sender2      sender3
    Avg RTT   79 msec      110 msec     144 msec
    Avg BW    1.59 Mbps    1.16 Mbps    0.82 Mbps
    Dev. BW   0.47 Mbps    0.35 Mbps    0.24 Mbps

Table 3: Bandwidth and RTT for individual senders in subsection 4.2

Window and LookAhead: Figure 8 depicts different angles of PALS performance as a function of window size for different values of the LookAhead parameter. Figure 8(a) shows the average delivered quality and illustrates that it depends on neither ∆ nor τ. In other words, the average quality is only determined by the average bandwidth. This is mainly because the event-driven sliding of the window, coupled with the reverse flow, reduces the effect of window size on the average quality. Figure 8(b) shows the degree of stability by depicting the frequency of layer drop events that are triggered by the QA mechanism. This figure reveals that the frequency of changes is generally very low (often less than 1.5 drops every 100 seconds, or 0.015) and is reduced by increasing the LookAhead parameter. Furthermore, it does not depend on the window size when the LookAhead is not too low. This figure illustrates the key tradeoff between responsiveness to variations in bandwidth and stability in delivered quality. Using a large LookAhead parameter results in a large BUF_add, which provides stability at the cost of responsiveness.


[Figure 8: Effect of Window size and LookAhead on different aspects of PALS performance. Panels, each as a function of window size (sec) with one curve per LookAhead value of 6, 8, 10, 12, 18 and 24 sec: (a) Avg. Quality (layers); (b) Frequency of Drops; (c) Percentage of ratio of secondary drops; (d) Percentage of Unrecovered Loss; (e) Percentage of Duplicate Packets; (f) Percentage of BW Utilization of senders; (g) Percentage of Control Traffic; (h) Slope of buffer dist. among layers (buffer amount in sec vs. window size); (i) Buffer distribution among active layers (avg. buffer slope in sec/layer vs. LookAhead, one curve per ∆).]


In contrast, when the LookAhead is small, the layer add condition can easily be satisfied, which results in premature adding of a new layer that cannot be sustained and leads to instability in quality. We further explore the underlying causes of the observed layer drops by dividing them into two groups: (i) Primary drops, which occur due to insufficient aggregate buffering to absorb a drop in bandwidth at the receiver, and (ii) Secondary drops, which occur when the total buffered data is sufficient but its useful portion within the active window is inadequate to absorb the bandwidth deficit, i.e., the buffered data is distributed across multiple windows. The distinction between these drop events allows us to identify the effect of buffer distribution on variations of delivered quality. Figure 9(b) depicts the ratio of secondary drops and clearly shows that this ratio increases with the LookAhead parameter, but this increase is significantly smaller as the window size grows. The useful portion of buffered data depends on the shape of the buffer distribution and on the portion of buffered data that falls within the next window (i.e., |τ - ∆|). Therefore, increasing the window size or decreasing the LookAhead decreases the useful portion of the buffer, which leads to a smaller ratio of secondary drops.
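The distinction can be summarized by the following minimal sketch (our own paraphrase of the definitions above, not code from the PALS implementation), where all quantities are in playout seconds:

    def classify_drop(deficit, total_buffered, useful_buffered):
        # deficit         -- playout seconds the senders cannot deliver in the
        #                    next window (bandwidth shortfall)
        # total_buffered  -- total buffered data at the receiver (playout sec)
        # useful_buffered -- buffered data that falls within the active window
        if useful_buffered >= deficit:
            return "no drop"          # buffering absorbs the shortfall
        if total_buffered >= deficit:
            return "secondary drop"   # enough data overall, but badly distributed
        return "primary drop"         # aggregate buffering itself is too small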

Figure 8(d) shows the percentage of unrecovered packet losses. These are packets that were lost (once or multiple times) and did not get another transmission opportunity because of their relative priorities. Note that losses are weighted by their impact on delivered quality. In other words, when a packet is lost, all of its decoding-dependent packets from higher layers are also considered lost even though they were actually delivered. This figure illustrates that the percentage of unrecovered losses is generally small (less than 0.03%), but it increases as the window size grows or the LookAhead shrinks. This behavior is related to the average number of transmission opportunities for each packet, which is determined by the ratio of LookAhead to window size. In summary, Figures 9(b), 9(e) and 8(d) show two important tradeoffs for selecting the window size and LookAhead: increasing ∆ reduces the ratio of secondary drops but results in a higher ratio of unrecovered losses. Furthermore, increasing τ improves the stability of delivered quality at the cost of a higher ratio of secondary drops and a larger delay with respect to parent peers. In a nutshell, there is a sweet spot for the window size and LookAhead parameters that should be set according to the application's requirements.
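A hedged sketch of this weighting (our interpretation, assuming one packet per layer per timestamp) is:

    def weighted_unrecovered_losses(lost, n_active_layers):
        # lost -- set of (layer, timestamp) pairs that were lost and never
        #         recovered; a loss in layer l at timestamp t also invalidates
        #         the decoding-dependent packets of layers above l at that
        #         timestamp (one packet per layer per timestamp assumed).
        weighted = 0
        for t in {ts for (_, ts) in lost}:
            lowest = min(l for (l, ts) in lost if ts == t)
            weighted += n_active_layers - lowest    # that layer and all above it
        return weighted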

Efficiency and Overhead: Figures 8(e), 8(f) and 8(g) represent various dimensions of efficiency and overhead for the PALS mechanism. Figure 8(e) indicates that the percentage of duplicate packets is in general very small (less than 0.3%) and drops rapidly as the window size increases. This supports our explanation that duplicate packets can occur only while the window is sliding, since increasing the window size rapidly reduces the ratio of duplicates. Figure 8(f) shows that the utilization of aggregate bandwidth is always very close to 100% except for very small windows. We also examined the average percentage of overwritten packets (not shown here) across all senders and found that both the average and the per-sender percentage of overwritten packets remain around 25%, independent of the window size or LookAhead parameter. Since the deviation of average bandwidth from each sender is around 30% in these simulations, only extra packets from each sender are overwritten. While this appears to suggest that PALS over-estimates the ratio of requested extra packets from each sender in this scenario, any reduction in the ratio of extra packets could result in a lower bandwidth utilization. The overhead of generated control traffic (i.e., the requests sent by the receiver) in PALS decreases as the window size increases, but remains below 0.85% in all scenarios, as shown in Figure 8(g). In summary, these results show that PALS can effectively utilize the available bandwidth from senders and provide a stable quality with a low ratio of control overhead and a very small ratio of duplicate packets.

Buffer State: Finally, Figures 8(h) and 8(i) depict the average amount of buffered data across active layers, and the slope of the distribution of actual buffered data (using τ as the x-axis), respectively. These figures collectively illustrate how closely the actual buffer state follows the target buffer distribution despite the underlying dynamics in bandwidth variations, packet selection and packet assignment. Figure 8(h) shows that the average amount of buffered data across all layers increases linearly with both ∆ and τ. The direct effect of τ on the actual amount of future buffering (the gap between consecutive lines for the same value of ∆) supports our earlier explanation of why increasing τ reduces the frequency of layer drops (in Figure 8(b)). We note that when the window size is larger than the LookAhead, the active window is less than half-full right after the window slides. Therefore, the average per-layer buffering at the receiver should be around ∆ + 0.5 * τ. However, when the LookAhead is larger than the window size, the buffered data is gradually accumulated over several windows at a rate proportional to the excess bandwidth. Therefore, the per-layer buffering would be less than ∆ + 0.5 * τ. Figure 8(h) clearly demonstrates these two scenarios. Figure 8(i) demonstrates that increasing the LookAhead can effectively increase the slope of the actual buffer state. However, increasing ∆ results in a larger total buffering and slightly reduces the effect of LookAhead on the slope of the buffer state. In summary, our results indicate that the LookAhead parameter can effectively control the actual distribution of buffered data across layers despite various dynamics in the system.
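As a back-of-envelope check of the first regime, the sketch below (our own illustration; the estimate simply restates the reasoning above) evaluates the ∆ + 0.5 * τ rule:

    def expected_per_layer_buffering(delta, tau):
        # Valid only for the regime delta >= tau described above; for
        # delta < tau the actual value stays below this estimate because
        # buffering accumulates gradually over several windows.
        if delta < tau:
            raise ValueError("estimate applies only when window >= LookAhead")
        return delta + 0.5 * tau

    print(expected_per_layer_buffering(delta=12, tau=6))   # -> 15.0 seconds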

Buffer Distribution & Packet Ordering: Now, we take a closer look at the impact of packet ordering in the buffering window on PALS performance. Packet ordering primarily affects the fine-grain dynamics of buffer evolution that determine the receiver's buffer state and its efficiency. Figure 9 depicts both the ratio of secondary drops and the buffering efficiency in PALS for three different ordering schemes, namely vertical, horizontal and diagonal, with the adaptive diagonal buffer distribution in PALS as described in subsection 3.3. This figure clearly illustrates an interesting tradeoff between the ratio of secondary drops and buffering efficiency that can be leveraged by packet ordering. More specifically, the vertical ordering scheme has the minimum ratio of secondary drops but the lowest buffering efficiency. In contrast, the horizontal ordering scheme has the maximum ratio of secondary drops but the highest efficiency. The diagonal ordering strikes a balance by achieving a very good buffering efficiency while, with a sufficiently large window size, minimizing the ratio of secondary drops. This tradeoff is a direct effect of the portion of delivered packets that fall within the next window. Vertical and horizontal orderings place the maximum and minimum portion of buffered data in the next window, respectively. In summary, these results demonstrate that our choice of diagonal packet ordering can properly leverage this tradeoff.
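To make the three schemes concrete, the following sketch (our own rendering; the exact rules are specified in subsection 3.3 and may differ in detail) orders the (layer, slot) pairs of a buffering window:

    def order_window(n_layers, n_slots, scheme="diagonal"):
        # Order the (layer, slot) pairs of one buffering window.
        #   vertical:   all layers of the earliest slot first, then the next slot
        #   horizontal: all slots of the base layer first, then layer 1, ...
        #   diagonal:   anti-diagonal sweeps, a compromise between the two
        slots = [(l, t) for t in range(n_slots) for l in range(n_layers)]
        if scheme == "vertical":
            return sorted(slots, key=lambda p: (p[1], p[0]))
        if scheme == "horizontal":
            return sorted(slots, key=lambda p: (p[0], p[1]))
        if scheme == "diagonal":
            return sorted(slots, key=lambda p: (p[0] + p[1], p[0]))
        raise ValueError(scheme)

    # order_window(3, 3, "diagonal") ->
    # [(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), (1,2), (2,1), (2,2)]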

4.3 Dynamics of Bandwidth Variations
As we discussed earlier, by increasing the LookAhead parameter in PALS, the amount of receiver buffering grows and its QA mechanism becomes more conservative in adding layers and more resilient in dropping a layer. Therefore, in essence, PALS becomes less responsive as the LookAhead parameter increases. To examine this effect, we have evaluated PALS performance over a wide range of dynamics in available bandwidth by using different types of cross traffic, including long-lived TCP, HTTP and flash crowd.


[Figure 9: Effect of packet ordering on the ratio of secondary drops and buffering efficiency. Panels, each as a function of window size (sec) with one curve per LookAhead value of 6, 8, 10, 12, 18 and 24 sec: (a) Ratio of secondary drops, Vertical Ordering; (b) Ratio of secondary drops, Diagonal Ordering; (c) Ratio of secondary drops, Horizontal Ordering; (d) Buffering Efficiency, Vertical Ordering; (e) Buffering Efficiency, Diagonal Ordering; (f) Buffering Efficiency, Horizontal Ordering.]

Here, we describe a representative scenario where 3 senders stream content to a receiver through a shared bottleneck with flash-crowd cross traffic. We use the topology shown in Figure 6 with the default parameters and only introduce flash-crowd traffic on the shared bottleneck from t = 110 seconds to t = 140 seconds. The flash-crowd cross traffic is simulated with 300 short-lived TCP flows, where each flow starts at a random time between t = 110 sec and t = 140 sec and remains active for a random period between 0.8 and 1.2 seconds.
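Such a cross-traffic schedule can be generated with a few lines of Python (our own helper; the paper's simulations use ns-style TCP flows, which we only approximate here by start times and durations):

    import random

    def flash_crowd_schedule(n_flows=300, start_lo=110.0, start_hi=140.0,
                             dur_lo=0.8, dur_hi=1.2, seed=0):
        # Returns (start_time, duration) pairs, one per short-lived TCP flow,
        # that would be handed to the simulator as flash-crowd cross traffic.
        rng = random.Random(seed)
        return [(rng.uniform(start_lo, start_hi), rng.uniform(dur_lo, dur_hi))
                for _ in range(n_flows)]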

Figure 10 illustrates the behavior of PALS in this scenario as a function of time for two different LookAhead parameters, τ = 10 and 32 seconds. Figures 10(a) and 10(d) show the aggregate bandwidth and delivered quality for τ equal to 10 and 32 seconds, respectively. As we described earlier, when τ is smaller, PALS is more responsive to the variations of bandwidth, in particular when a sudden change in the aggregate bandwidth occurs. In contrast, using larger τ values leads to a significantly more stable behavior, such that PALS even manages to avoid any layer drop despite the major drop in available bandwidth. The observed negative spikes in these figures are due to unrecovered losses for the top layer. Such unrecovered losses occur when the loss rate suddenly increases and thus bandwidth drops, since the bandwidth allocated to loss recovery is limited.

To illustrate the role of buffering in PALS behavior in these two scenarios, we have shown the evolution of the buffer state (i.e., the amount of per-layer buffering) in terms of its playout time for these simulations in Figures 10(b) and 10(e). (We show the buffered data in terms of its playout time, which is the time it takes to play out the buffered data; this presents the timing aspect of the buffer that is not captured by its absolute value in bytes.) Comparison between these two figures clearly shows that using larger LookAhead values results in more total buffering and a more skewed distribution across layers (i.e., a larger gap between lines in Figure 10(e)), which in turn increases the receiver's ability to effectively absorb a major drop in bandwidth by draining the buffered data. As shown in Figure 10(e), right after the flash-crowd traffic starts, the buffered data for higher layers is drained at a faster rate to protect lower layers.

To show the effect of bandwidth variations on the dynamics of the packets requested by the QA mechanism, Figures 10(c) and 10(f) depict the timestamps of requested packets in each window along with the position of the buffering window for part of the above simulations. Groups of requested packets for active layers in each window are shown as parallel columns on top of the corresponding window. The relative gap between each packet and its corresponding window is proportional to its timestamp. Even though all packets are requested at the beginning of the window, they are separated in this graph for clarity (the left-most column represents requested packets for the base layer). Comparison of these figures shows that the relative difference, and thus the total range, of requested timestamps from different layers increases with the LookAhead parameter. Figure 10(f) clearly shows that once the available bandwidth drops, the gap between the timestamps of requested packets and the window decreases, indicating that the receiver's buffers are being drained. Packets with timestamps lower than the corresponding buffering window are located within the playing window and are requested by the explicit loss recovery mechanism (e.g., around t = 123 and 127 sec). Figure 10(f) also demonstrates that many packets have multiple transmission opportunities. This number can be easily determined by drawing a horizontal line from the first request for a packet and counting the number of future windows that fall below this line. Clearly, there are more opportunities for packets of lower layers to be requested because they are initially requested well ahead of their playout times.


[Figure 10: Behavior of PALS with flash crowd cross traffic, ∆ = 4 seconds. Panels: (a) Delivered Quality and aggregate BW, τ = 10 sec; (b) Buffer evolution, τ = 10 sec; (c) Window evolution and requested packets, τ = 10 sec; (d) Delivered Quality and aggregate BW, τ = 32 sec; (e) Buffer evolution, τ = 32 sec; (f) Window evolution and requested packets, τ = 32 sec.]

A packet might be requested multiple times due to packet loss or an overwritten request.
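The number of transmission opportunities follows directly from how far ahead of its playout time a packet is first requested; a hedged sketch of this count (our reading of the horizontal-line argument above) is:

    import math

    def transmission_opportunities(timestamp, playout_at_first_request, window):
        # A packet first requested when the playout point is at
        # playout_at_first_request stays requestable until its timestamp
        # reaches the playout point, giving roughly one opportunity per
        # window of length `window` (all values in playout seconds).
        lead = max(0.0, timestamp - playout_at_first_request)
        return 1 + math.floor(lead / window)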

4.4 Dynamics of Peer Participation
Departure or arrival of an active sender can change the aggregate bandwidth and thus the quality delivered to a receiver. In essence, this effect is very similar to the dynamics of bandwidth variations that we examined in subsection 4.3. To examine the effect of sender dynamics on PALS behavior, we use our default simulation topology with 4 senders and the following d_s(i) values: 1, 13, 16, 19 msec. All other parameters are set to their default values. Initially, three senders s1, s2 and s3 are active and obtain average bandwidths of 149 KBps, 138 KBps, and 133 KBps, respectively. We stop sender s2 at time t = 100 seconds, and start sender s4 at time t = 150 seconds to simulate slow replacement of a departed sender.

Figures 11(a) and 11(b) depict the aggregate and per-sender bandwidth as well as the delivered quality in this scenario for two different LookAhead parameters, τ = 10 and 32 seconds. Comparison between these figures reveals two interesting points. First, the arrival or departure of one of the three senders does not result in a 33% change in aggregate bandwidth, due to the shared nature of the bottleneck. More specifically, when a sender departs (or arrives), the other active senders gain (or lose) a portion of bandwidth in response to this change. Note that the bandwidth variations of senders with unshared bottlenecks are rather independent and do not exhibit this behavior. Second, a drop in the aggregate bandwidth due to the departure of a sender can be effectively absorbed by increasing the LookAhead parameter. This in turn provides more time for the receiver to replace the departed sender. For example, Figure 11(b) shows that setting τ to 32 seconds enables the receiver to sustain all layers for 50 seconds until the departed sender is replaced. In summary, PALS can effectively cope with the dynamics of sender participation when the LookAhead parameter is sufficiently large. This implies that the receiver does not need to quickly detect the departure of an active sender.
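A back-of-envelope estimate (ours, not a formula from the paper) shows why buffering buys this time: if the aggregate bandwidth falls short of the consumption rate by a fraction f, buffered data drains at roughly f playout-seconds per second, so the current quality can be sustained for about buffered_seconds / f:

    def sustain_time(buffered_playout_sec, bw_deficit_fraction):
        # buffered_playout_sec -- per-layer buffered data, in playout seconds
        # bw_deficit_fraction  -- shortfall as a fraction of the consumption rate
        if bw_deficit_fraction <= 0:
            return float("inf")       # no deficit, nothing drains
        return buffered_playout_sec / bw_deficit_fraction

    # e.g. roughly 15 playout-seconds of buffering and a 30% shortfall give
    # about 50 seconds, the same ballpark as the tau = 32 sec scenario above.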

4.5 Partially Available Content
So far, we have assumed that all senders can provide all packets requested by the receiver. However, in practice, each sender peer may not receive (or may not cache) all layers of the stream or all packets of the received layers. In this subsection, we examine the ability of the two-pass packet assignment mechanism in PALS to stream partially available content from multiple senders. In particular, partially available content limits the flexibility of the packet assignment mechanism to map the required content among senders. Such a scenario limits the ability of the PALS mechanism to fully utilize the available bandwidth from one (or more) senders, which in turn degrades the quality of the delivered stream. We define the average content bandwidth at a sender peer as the average bandwidth of its content across all timestamps. For example, if a sender has 3 layers for half of the timestamps and 2 layers for the other half, its average content bandwidth is 2.5*C. When the average content bandwidth at a sender is lower than its available bandwidth, the sender becomes a content bottleneck and its available bandwidth cannot be fully utilized. The following minimum requirements for available content among senders must be met to ensure that senders do not become content bottlenecks: (i) at least one copy of any packet required by the receiver must be available among senders, and (ii) the distribution of aggregate content among senders must be proportional to their contributions to the aggregate bandwidth.
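These definitions translate directly into a small check (our own helper with hypothetical inputs) of whether a sender's cached content is rich enough to use its available bandwidth:

    def average_content_bandwidth(layers_per_timestamp, layer_bw):
        # Mean number of cached layers across timestamps, times the layer
        # bandwidth C (the definition of average content bandwidth above).
        return layer_bw * sum(layers_per_timestamp) / len(layers_per_timestamp)

    def is_content_bottleneck(layers_per_timestamp, layer_bw, available_bw):
        # A sender is a content bottleneck when its average content bandwidth
        # is lower than its available bandwidth.
        return average_content_bandwidth(layers_per_timestamp, layer_bw) < available_bw

    # Example from the text: 3 layers for half the timestamps and 2 for the
    # other half -> average content bandwidth of 2.5 * C.
    print(average_content_bandwidth([3, 2], layer_bw=1.0))   # -> 2.5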

Once the minimum requirements are met, there are three related factors that could affect the performance of the PALS mechanism; we describe them below.


[Figure 11: Effect of sender dynamics on PALS behavior. Panels (bandwidth in Bytes/sec vs. time in seconds; curves: Delivered Quality, Total BW, Sender 1 BW, Sender 2 BW, Sender 3 BW): (a) Delivered Quality, aggregate and per-sender BW, τ = 10 sec; (b) Delivered Quality, aggregate and per-sender BW, τ = 32 sec.]

[Figure 12: Effect of period of redundancy on PALS performance. Panels (Average Quality (layers) vs. Segment Size (packets), one curve per value of p): (a) Effect of Frequency, Redundancy 2%, ∆ = 6, Delay_r = 20; (b) Effect of Frequency, Redundancy 4%, ∆ = 6, Delay_r = 20; (c) Effect of Frequency, Redundancy 20%, ∆ = 6, Delay_r = 20.]

(i) Granularity of Content: each sender may cache individual segments of any received layer, where a segment consists of s consecutive packets. Therefore, the segment size determines the granularity (and thus the pattern) of available content at each sender. Using a small segment size of one packet results in caching scattered packets of a stream. In contrast, if the segment size is equal to the stream length, each sender caches complete layers of the stream.

(ii) Degree of Redundancy: the availability of multiple copies of some packets among senders provides more flexibility for the packet assignment mechanism and improves its performance. For example, two senders might be able to deliver a particular packet. We define the degree of redundancy (r) as the ratio of the total number of extra packets across all senders to the total number of single copies across all senders. For example, if 30% of packets have 2 extra copies, the degree of redundancy is 60%.

(iii) Pattern of Redundancy: the pattern of added redundancy may also determine its impact on the performance of packet assignment among senders. We assume that redundancy is added with the granularity of one segment. To control the pattern of redundancy at each peer, we introduce the notion of the period of added redundancy (p), which determines how often redundant segments are added. For example, there are different ways to add 20% redundancy to a stream with 5 layers: adding one extra segment to each group of corresponding segments from different layers (i.e., p = 1 segment), adding 2 extra segments to every other group of segments (i.e., p = 2 segments), and so on. It is worth noting that for a given number n of active layers and degree of redundancy r, we can only define a period of redundancy p when two conditions are met: p*r*n is an integer value, and 1 ≤ p*r*n ≤ n*(M - 1), where M denotes the number of senders. In a nutshell, we can uniquely determine the pattern of distribution for n layers across senders with three parameters: segment size, degree of redundancy and period of redundancy. Toward this end, we start from a single copy of all layers; then, once every p consecutive segments, we add n*p*r randomly selected but unique copies of the corresponding segments from different layers, as shown in Figure 13.

[Figure 13: Effect of the period of added redundancy on its pattern (p segments wide, n layers tall).]

Finally, each group of segments from different layers is randomly mapped to the senders, where the number of segments per sender is proportional to its contribution to the aggregate bandwidth.
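Putting the three knobs together, the sketch below (our own reconstruction from the description above; the helper name and structure are ours) builds such a partially available content layout: a single copy of every segment, plus n*p*r randomly chosen extra copies once every p segment groups, finally mapped to senders in proportion to their bandwidth shares:

    import random

    def build_content_layout(n_layers, n_groups, p, r, sender_shares, seed=0):
        # n_layers      -- number of layers (n)
        # n_groups      -- number of segment groups in the stream
        # p             -- period of redundancy, in segment groups
        # r             -- degree of redundancy (extra copies / single copies)
        # sender_shares -- per-sender fraction of aggregate bandwidth (sums to 1)
        # Returns {sender_index: set of (layer, group) segments it caches}.
        rng = random.Random(seed)
        # one copy of every segment ...
        copies = [(l, g) for g in range(n_groups) for l in range(n_layers)]
        # ... plus n*p*r extra copies once every p consecutive segment groups
        extras_per_block = round(n_layers * p * r)
        for block in range(0, n_groups, p):
            slots = [(l, g) for g in range(block, min(block + p, n_groups))
                     for l in range(n_layers)]
            copies += rng.sample(slots, min(extras_per_block, len(slots)))
        # randomly map copies to senders, proportionally to bandwidth shares
        rng.shuffle(copies)
        layout = {i: set() for i in range(len(sender_shares))}
        owners = rng.choices(range(len(sender_shares)),
                             weights=sender_shares, k=len(copies))
        for seg, owner in zip(copies, owners):
            layout[owner].add(seg)
        return layout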

To study the sensitivity of PALS performance to different patterns of partially available content, we consider a scenario with three senders behind a shared bottleneck using the topology in Figure 6, but we change the following parameters: bw_ls(i) = bw_s(i) = [10, 15.3, 20.6] Mbps, d_s(i) = [4, 25, 36] msec, bw_r = 4 Mbps.


The available bandwidth and RTT for the three senders are summarized in Table 4.

              sender1      sender2      sender3
    Avg RTT   98 msec      140 msec     160 msec
    Avg BW    1.5 Mbps     0.92 Mbps    0.85 Mbps
    Dev BW    0.2 Mbps     0.11 Mbps    0.1 Mbps

Table 4: Avg. BW and RTT of each sender in subsection 4.5

We have examined the performance of PALS over a wide range of patterns of partially available content. Due to limited space, we only present a subset of results that represents the main findings; further results can be found in the related technical report. The number of layers (n) is 5 in all of our simulations. To illustrate the effect of the redundancy period (p) on the performance of the PALS mechanism, Figure 12(a) depicts the average delivered quality as a function of segment size for different periods of redundancy, where r = 2%, Delay_r = 20 seconds, and ∆ = 6 seconds. Note that the x-axis has a logarithmic scale. This figure shows that the delivered quality is not very sensitive to the value of p. Figures 12(b) and 12(c) show the same simulation with a higher degree of redundancy, namely r = 4% and r = 20%. The selected values of p across these figures are different because of the dependency of p on the value of r. These figures collectively demonstrate that the period of redundancy does not have a significant effect on PALS. The reason for this behavior is that the performance is already sufficiently high when the segment size is small, so there is not much improvement to be gained by adding redundancy. There is some room for performance improvement when the segment size is large. However, in these scenarios, redundant packets, regardless of the degree of redundancy, are significantly far apart (by s*p packets) and thus cannot provide flexibility to packet assignment across many windows.

Figure 14(a) presents the sensitivity of PALS performance to different window sizes, where p = 10 segments, Delay_r = 40 sec, and r = 2%. This figure shows two interesting points. First, as the segment size increases, the performance degrades. This effect is significantly more visible for smaller window sizes. To explain this effect, we note that as the segment size increases, the distribution of average content bandwidth among senders not only becomes coarser but also diverges further from the distribution of available bandwidth among senders. Furthermore, the pattern of content distribution has less flexibility within each window, since each sender can only provide packets of the same specific layers. These two effects collectively limit the ability of the packet assignment mechanism to fully utilize the available bandwidth and result in a content bottleneck at some senders. Second, a small increase in the window size can significantly improve the performance, especially in scenarios with large segment sizes. The main reason for this improvement is the increase in the number of packets that must be assigned in each window, which in turn provides more flexibility for the packet assignment mechanism.

To quantify the effect of redundancy on performance, we repeat the simulations of Figure 14(a) with the exact same parameters but increase the degree of redundancy from 2% to 20%, as shown in Figure 14(b). Comparison between Figures 14(b) and 14(a) indicates that this significant increase in redundancy only moderately improves the performance, and its impact is smaller than that of the window size.

Finally, Figure 14(c) shows the effect of Delay_r on PALS performance when ∆ = 1 second and r = 1%. This figure illustrates that increasing Delay_r degrades PALS performance, especially for large segment sizes. This is a direct effect of Delay_r on the distribution of requested packets across different layers. More specifically, when Delay_r increases, the distribution of requested packets across active layers becomes more skewed (i.e., more packets are requested from lower layers). Therefore, a coarser mapping of content among senders due to a larger segment size is unable to effectively utilize the available bandwidth from senders. In summary, our results illustrate that the segment size is the primary factor that limits the flexibility of packet assignment. However, a minor increase in the window size can compensate for the effect of segment size.

5. RELATED WORK
There have been a few previous studies on various aspects of multi-sender streaming to a single client. Apostolopoulos et al. [15] proposed a mechanism for streaming multiple description (MD) encoded content from multiple servers. They presented a distortion model for MD encoding and used this model to study the effect of server and content placement on the delivered quality of both MD and single description (SD) encodings. Multi-sender streaming in P2P networks was cast as a resource management problem by Cui et al. [16]. Using global knowledge about participating peers, they formulated the problem as an optimization problem to maximize the delivered quality to each peer while minimizing server and network load. Neither of these previous studies considered congestion controlled connections among peers, and thus they did not address the dynamics of bandwidth variations over the Internet. Nguyen et al. [17] presented a multi-sender streaming mechanism from congestion controlled senders. In their proposed solution, the receiver periodically reports the throughput and delay of all senders back to them using control packets. Then, the senders run a distributed algorithm to determine which sender is responsible for sending each segment of the stream. They assumed that senders do not reside behind a shared bottleneck and that their aggregate bandwidth is sufficient for delivery of the full quality stream. Compared to their approach, PALS is receiver-driven and does not make any assumption about the aggregate bandwidth or the location of participating peers.

There has been a wealth of research on both distributed [5, 18, 19] and central [3] approaches to constructing overlays for P2P streaming. The PALS mechanism is complementary to these approaches and can be integrated with them. Both PALS and RLM [20] are receiver-driven mechanisms and perform layered quality adaptation. However, there is a fundamental difference between them. In RLM, congestion control and quality adaptation are tightly coupled. More specifically, the receiver performs coarse-grained congestion control by adjusting the number of delivered layers to regulate the overall bandwidth of the incoming stream, which implicitly leads to adaptation of the delivered quality. Given a set of sender peers, the PALS receiver does not have any control over the senders' transmission rates; it only coordinates how the available bandwidth from each sender is used. To our knowledge, PALS is the first mechanism for quality adaptive streaming from multiple, congestion controlled senders with arbitrary distribution across the Internet.

6. CONCLUSIONS AND FUTURE WORK
In this paper, we presented PALS, a receiver-driven coordination mechanism for quality adaptive streaming of layered encoded video from multiple congestion controlled senders. We described the mechanism, its key components and their design space, as well as various design tradeoffs.


[Figure 14: Effect of degree of redundancy and Delay_r on PALS performance. Panels (Average Quality (layers) vs. Segment Size (packets)): (a) Effect of Window Size, Redundancy 2%, Delay_r = 40; (b) Effect of Window Size, Redundancy 20%, Delay_r = 40; (c) Effect of Delay_r on Avg. Quality, ∆ = 1 second, r = 1%.]

We have conducted a simulation-based evaluation and showed that PALS can gracefully cope with different dynamics in the system, including bandwidth variations, peer participation, and partially available content.

We plan to expand this work in two directions. On the design side, we plan to explore several issues, including the performance of PALS over smoother TCP-friendly congestion control mechanisms such as TFRC, adding support for VBR layered encoded streams, and performing per-sender QA based on the available bandwidth from individual senders rather than the aggregate bandwidth. On the evaluation side, we are currently prototyping the PALS protocol and will conduct detailed experiments over PlanetLab in the near future.

7. ACKNOWLEDGMENTS
We would like to thank Vikash Agarwal for his work on the first version of the PALS simulation code.

8. REFERENCES
[1] J. Jannotti, D. K. Gifford, K. L. Johnson, M. F. Kaashoek, and J. W. O'Toole, Jr., "Overcast: Reliable multicasting with an overlay network," in OSDI, pp. 197–212, Oct. 2000.
[2] Y. Chu, S. G. Rao, S. Seshan, and H. Zhang, "Enabling conferencing applications on the internet using an overlay multicast architecture," in ACM SIGCOMM, 2001.
[3] V. N. Padmanabhan, H. J. Wang, P. A. Chou, and K. Sripanidkulchai, "Distributing streaming media content using cooperative networking," in Workshop on Network and Operating System Support for Digital Audio and Video, Miami Beach, FL, 2002.
[4] D. Xu, M. Hefeeda, S. Hambrusch, and B. Bhargava, "On peer-to-peer media streaming," in ICDCS, July 2002.
[5] M. Castro, P. Druschel, A.-M. Kermarrec, A. Nandi, A. Rowstron, and A. Singh, "SplitStream: High-bandwidth content distribution in a cooperative environment," in ACM SOSP, Feb. 2003.
[6] S. Saroiu, P. K. Gummadi, and S. D. Gribble, "A measurement study of peer-to-peer file sharing systems," in SPIE Multimedia Computing and Networking, Jan. 2002.
[7] R. Rejaie, M. Handley, and D. Estrin, "RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the internet," in IEEE INFOCOM, New York, NY, Mar. 1999.
[8] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-based congestion control for unicast applications," in ACM SIGCOMM, 2000.
[9] R. Rejaie and S. Stafford, "A framework for architecting peer-to-peer receiver-driven overlays," in NOSSDAV, 2004.
[10] J. Padhye, J. Kurose, D. Towsley, and R. Koodli, "A model-based TCP-friendly rate control protocol," in Workshop on Network and Operating System Support for Digital Audio and Video, June 1999.
[11] R. Rejaie, M. Handley, and D. Estrin, "Quality adaptation for congestion controlled playback video over the internet," in ACM SIGCOMM, Cambridge, MA, Sept. 1999.
[12] N. Magharei and R. Rejaie, "Adaptive Receiver-Driven Streaming from Multiple Senders," Technical Report CIS-TR-05-06, University of Oregon, August 2005.
[13] V. N. Padmanabhan, H. J. Wang, P. A. Chou, and K. Sripanidkulchai, "Distributing streaming media content using cooperative networking," in Proceedings of the 12th International Workshop on Network and Operating Systems Support for Digital Audio and Video, pp. 177–186, ACM Press, 2002.
[14] Y. Cui and K. Nahrstedt, "Layered peer-to-peer streaming," in ACM Multimedia, 2003.
[15] J. Apostolopoulos, T. Wong, W.-T. Tan, and S. Wee, "On multiple description streaming with content delivery networks," in IEEE INFOCOM, 2002.
[16] Y. Cui and K. Nahrstedt, "Layered peer-to-peer streaming," in Proceedings of the 13th International Workshop on Network and Operating Systems Support for Digital Audio and Video, pp. 162–171, ACM Press, 2003.
[17] T. Nguyen and A. Zakhor, "Distributed video streaming over the internet," in SPIE Multimedia Computing and Networking, Jan. 2002.
[18] M. Hefeeda, A. Habib, B. Botev, D. Xu, and B. Bhargava, "PROMISE: Peer-to-peer media streaming using CollectCast," in Proceedings of the 11th ACM International Conference on Multimedia, pp. 45–54, ACM Press, 2003.
[19] D. A. Tran, K. A. Hua, and T. Do, "ZIGZAG: An efficient peer-to-peer scheme for media streaming," in IEEE INFOCOM, 2003.
[20] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven layered multicast," in ACM SIGCOMM, 1996.