Flash Crowd Handling in P2P Live Video Streaming Systems
by
Anurag Dwivedi (10327132)
Department of Electrical Engineering
Indian Institute of Technology, Kanpur
May 2015
Flash Crowd Handling in P2P Live Video Streaming Systems
A Thesis submitted
in Partial Fulfillment of the Requirements
for the Degree of
Bachelor-Master of Technology (Dual Degree)
by
Anurag Dwivedi (10327132)
to the
Department of Electrical Engineering
Indian Institute of Technology, Kanpur
May 2015
Acknowledgment
I would like to express my deepest gratitude to my Thesis Supervisor, Dr. Yatindra Nath
Singh, for his guidance and constant support throughout my thesis work. I have been
fortunate to have him as my supervisor who gave me the freedom to explore my own area
of interest and at the same time provided me with valuable inputs from time to time.
I would also like to thank my mentor Mr. Sateesh Kumar Awasthi for his constant
support and guidance. He not only guided me with his expertise in the field but also has
been patient and kind in helping me with the problems I faced. The discussions and
brainstorming sessions with him have helped my thesis work to a great extent. I would also like
to express my gratitude to Mr. Ashutosh Singh for his valuable inputs throughout the
period.
I would also like to thank the faculty members of Electrical Engineering department
for imparting invaluable knowledge and inculcating an academic spirit in me.
I would like to thank the Computer Center of IIT Kanpur for providing me with High
Performance Computing Facilities pivotal for my work.
Lastly, but with no less gratitude, I would like to thank my family, especially my
parents, and my friends for their constant support and encouragement.
Anurag Dwivedi
Abstract
Peer-to-peer systems have greatly enhanced the live streaming experience by creating
efficient and highly scalable streaming overlays where the bandwidth capabilities of all
peers can be utilized. However, the realization of such systems has been challenged by the
phenomenon of flash crowd: the arrival of hundreds of thousands of peers in a very short
span of time. Such situations typically arise at the beginning of live streaming events
such as a football match or a live lecture. Experiments have shown that the system can
scale only up to a limit during a flash crowd. This limit is set both by the available surplus
bandwidth and by the intense competition among the peers for scarce initial resources.
Various population control measures have been suggested for both mesh-based and
tree-based live streaming systems. The main focus area of this thesis is on tree-based
systems. Such systems have seen some centralized solutions. In this thesis, a distributed
algorithm with minimal central control is presented which organizes the newly arrived
peers into hierarchical positions to reduce competition among them. This hierarchical
rank is then used to construct different sub-stream trees. The video stream is divided
into sub-streams and each sub-stream is pushed over a separate sub-stream tree. Only
the peers at the top of the hierarchy will directly access the scarce initial resources and
in turn forward the stream to those below them in the hierarchy. Thus, by utilizing the
resources provided by the newly arrived peers in handling flash crowd, better system scale
can be achieved.
Contents
1 Introduction 1
1.1 What is a P2P Network? . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Live Streaming in P2P Network . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Tree-Based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Mesh-Based . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Flash Crowd in P2P Network . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Thesis Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Background 9
2.1 Understanding Flash Crowd . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 A Batch Join Process For Single Tree-Based Systems . . . . . . . . . . . 11
2.3 SplitStream - High-Bandwidth Multicast in Cooperative Environment . . 12
2.4 A Novel Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Multi-Tree Based Solution to Handle Flash Crowd 14
3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1.1 DHT Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1.2 Chord . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2.1 Maintenance of Feed-Forwarder List . . . . . . . . . . . . . . . . . 15
3.2.2 Node Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 NumLevel - Quantifying Flash Crowd . . . . . . . . . . . . . . . . . . . . 16
3.3.1 Calculation of NumLevel . . . . . . . . . . . . . . . . . . . . . . . 16
3.3.2 Updating NumLevel . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.3 NumLevel Root Node . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Construction of Streaming Trees . . . . . . . . . . . . . . . . . . . . . . . 18
3.4.1 Necessary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4.2 Registration as a Feed-Forwarder . . . . . . . . . . . . . . . . . . 20
3.4.3 Connecting to Fertile Tree . . . . . . . . . . . . . . . . . . . . . . 20
3.4.4 Connecting to Sterile Trees . . . . . . . . . . . . . . . . . . . . . 24
3.5 Global Feed-Forwarder List . . . . . . . . . . . . . . . . . . . . . . . . . 25
4 Design Implementation in PeerSim Simulator 28
4.1 PeerSim Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.2 Main Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.1.3 PeerSim Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1.4 Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 Chord Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.1 Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.2 Routing Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.2.3 Join Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.3 Chord Implementation in PeerSim . . . . . . . . . . . . . . . . . . . . . . 37
4.3.1 Variables and Data Structures . . . . . . . . . . . . . . . . . . . . 39
4.3.2 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.3 Flash Crowd Handling in Chord — Why and How? . . . . . . . . 43
4.3.4 Flash Crowd Handling in Chord — Experiments and Results . . 45
4.4 Implementation of Proposed Method . . . . . . . . . . . . . . . . . . . . 49
4.4.1 Variables and Data Structures . . . . . . . . . . . . . . . . . . . . 49
4.4.2 Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5 Experiments and Results 54
5.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2 Configuring Number of Sub-Streams . . . . . . . . . . . . . . . . . . . . 56
5.3 Effect of Node Arrival Rate . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.4 Effect of Parameters - Us and k . . . . . . . . . . . . . . . . . . . . . . . 61
5.5 Effect of Underlying Chord Overlay . . . . . . . . . . . . . . . . . . . . 62
6 Conclusion and Future Work 65
6.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
List of Figures
1.1 Peer organization in single tree-based live streaming system . . . . . . . . 5
1.2 Peer organization in multiple tree-based live streaming system . . . . . . 5
1.3 Peer organization in mesh-based live streaming system . . . . . . . . . . 6
3.1 Flow chart for selecting fertile tree . . . . . . . . . . . . . . . . . . . . . 21
3.2 Algorithm to select stream and level for registration in fertile tree . . . . 22
3.3 Flow chart for registering in fertile tree . . . . . . . . . . . . . . . . . . . 23
3.4 Flow chart for connection to sub-stream trees . . . . . . . . . . . . . . . 27
4.1 Example configuration file (1/2) for PeerSim simulator . . . . . . . . . . 32
4.2 Example configuration file (2/2) for PeerSim simulator . . . . . . . . . . 33
4.3 Pseudocode for lookup reproduced from original Chord paper [10] . . . . 37
4.4 Pseudocode for stabilization from original Chord paper [10] . . . . . . . . 38
4.5 Delivery rate when 10000 nodes join the system at the rate of 1000 nodes/sec. 46
4.6 Convergence time for different overlay sizes . . . . . . . . . . . . . . . . 47
4.7 Average hop count for routing messages during flash crowd . . . . . . . 48
4.8 Convergence time for different node arrival rates . . . . . . . . . . . . . 48
5.1 Time taken for 90% of the nodes to get all the sub-streams (stabilization
time) for different values of t . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2 Average and maximum value of average latency at each node for different
values of t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3 Average and maximum buffer size required at each node . . . . . . . . . 59
5.4 Number of sub-streams received by nodes at different times . . . . . . . . 60
5.5 System stabilization time for different node arrival rates . . . . . . . . . 61
5.6 Stabilization time of the system for different values of Us . . . . . . . . . 62
5.7 Stabilization time of the system for different values of k . . . . . . . . . . 63
5.8 System scale for different values of wait . . . . . . . . . . . . . . . . . . . 64
List of Tables
4.1 Definition of variables for node n in Chord network . . . . . . . . . . . . 36
4.2 Variables and Data Structures used in Chord Implementation . . . . . . 39
5.1 Default values of various parameters used in simulation. . . . . . . . . . . 55
Chapter 1
Introduction
Client-server architecture has been the modus operandi of the internet for a long time. In
such systems, there is a clear distinction between the role of client and server nodes. The
client node is the requester of the resource (which may be a webpage, music file, video
stream or even a service) and the server node is responsible for processing these requests
and replying with the desired resources. All the processing is done at the server. This
architecture worked well as long as the capabilities of client and server nodes were of
different orders of magnitude. But improvements in the processing capabilities of local
machines and in internet bandwidth reduced this difference significantly. The client
nodes, equipped with powerful processors and faster internet connections, became capable
of evolving from passive recipients into active participants in the system. Traditional
client-server architecture was unable to tap into this potential.
connected to the internet increased, the client-server architecture suffered from scalability
and single point of failure issues. Hence, peer-to-peer architecture was developed as an
alternative.
1.1 What is a P2P Network?
A peer-to-peer (P2P) network is in stark contrast to traditional client-server based net-
works where functions of client and server nodes are static and differ from each other.
In a P2P network, all the nodes (peers) are at equal level and can play the role of both
client as well as server. The work load of the network is distributed among the peers with
each peer making a portion of its resources, such as processing power, disk storage or
network bandwidth, directly available to other peers [1]. The peers are able to organize
and collaborate with each other without the need of a central authority.
The P2P architecture was popularized by the development of Napster in 1999. It
allowed users to share MP3 files with other users. A central index of the shared
files was maintained by the Napster server. A user would search for the required file in the
index and then directly download it from the providing peer. Since then, P2P systems have
evolved both in structure as well as in application. Currently, they are used for a variety
of purposes such as file sharing (BitTorrent [2], Gnutella [3]), multimedia (Skype [4],
CoolStreaming [5]), distributed computing (SETI@home [6]), storage services (FreeNet
[7]) and even digital cryptocurrency (Bitcoin [8]).
1.1.1 Advantages
P2P networks have numerous advantages over networks based on server-client model.
Some of them are:
• Efficient: P2P networks can be used to build highly efficient systems as the re-
sources of all the peers are available for use.
• Scalable: New peers joining the network bring along additional processing and
storage capabilities. Thus, the total capacity of the system grows along with the
increase in load on the system. By maintaining a fine balance between the two,
highly scalable systems can be designed.
• Reliable: In the absence of a central server, P2P networks do not suffer from single
point of failure. Even in case of failure of few peers, reliability can be maintained
by replicating the data in the network and storing it on separate peers.
• Inexpensive: As P2P networks do not require deployment of additional servers,
they are very inexpensive.
1.1.2 Disadvantages
Apart from the advantages covered above, P2P networks also suffer from a few disadvan-
tages. Some of them are stated below:
• Difficult administration: In the absence of a centralized authority, P2P
networks become difficult to administer. If there is a lack of cooperation among the
peers, system performance can be greatly affected.
• Insecure: P2P networks are highly susceptible to attacks by malicious users. They
may upload malicious contents, drop routing requests or collude together to disrupt
the services of the system.
• Free riding: This is one of the most important challenges in P2P networks. A peer
may consume resources without sharing its own, resulting in the bulk of the work
being done by a small percentage of peers.
• Network Churn: In P2P networks, apart from serving its own interest, a peer
is also responsible for providing services to other peers. The ungraceful exit (without
notifying other peers) of a peer from the system can not only disrupt other peers'
service but can also cause irreversible data loss if not handled properly.
1.1.3 Types
The existing P2P architectures can be divided into two broad categories.
1. Unstructured: The peers are not organized in any particular structure and form
random connections with each other. Due to lack of structure, query resolution
can be done only through flooding or random walks [9]. Even then, there is no
guarantee that the query will be successfully resolved.
2. Structured: The resources are distributed in the network according to an algo-
rithm. The algorithm guarantees that a query originating anywhere in the network
will end in a definite success or failure in a bounded number of hops. Some examples
are Chord [10], Pastry [11] and Tapestry [12].
1.2 Live Streaming in P2P Network
In traditional client-server systems, each client sets up a separate and direct connection
with the server. This results in a bottleneck at the server if the number of clients is above a
limit. Content Delivery Networks (CDN) are used to alleviate the load on source server
and the task of providing streams is done by the content delivery servers. The source
node pushes the stream to these content delivery servers from where it is streamed to the
requesting clients. Such systems reduce the load on the source node but still cannot support
a very high number of clients.
The use of P2P network has greatly enhanced live streaming experience by creating
efficient and highly scalable streaming overlays. In such systems, the requesting clients
also act as content providers by forwarding parts of stream that they posses to other
requesting peers. As every peer contribute to the aggregate system bandwidth, P2P live
streaming systems can support a very large number of peers. Measurement studies on
existing systems have shown that the system performance can be maintained at a high
level once a sufficient system scale has been achieved [13] [14].
The existing solutions to peer-to-peer live video streaming can be divided into 2
categories — tree-based and mesh-based.
1.2.1 Tree-Based
The peers are organized into a tree-shaped overlay with static parent-child relationship
among them. In single-tree approaches as shown in Figure 1.1, a single tree rooted at the
source node is constructed and the stream is delivered through a push mechanism where
each node forwards the data to all of its child nodes. This approach is not resilient to
system churn as removal of one node from the network leads to disruption of services of
all of its descendant nodes. Moreover, the outgoing bandwidth of leaf nodes is not used
resulting in poor resource utilization. Examples include Nice [15] and Zigzag [16].
In multi-tree approaches, the stream is divided into multiple sub-streams using an
appropriate data encoding technique such as Multiple Description Coding (MDC) [17], and each
sub-stream is pushed over a separate tree. Peers choose the number of trees to connect
to depending on their download bandwidth. To ensure equitable workload distribution
Figure 1.1: Peer organization in single tree-based live streaming system
and to minimize the effect of churn, a peer is placed as an internal node in only one tree
and as a leaf node in all other trees. If such a node leaves the network, at most one
sub-stream is affected. Figure 1.2 gives an example of multi-tree construction with two
sub-streams. Examples include SplitStream [18], Bullet [19] and CoopNet [20].
Figure 1.2: Peer organization in multiple tree-based live streaming system
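As an illustration of how a stream maps onto sub-streams, the round-robin packet assignment below is a minimal sketch; the class and method names are illustrative, and real systems use coding schemes such as MDC rather than plain packet interleaving:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: distribute packet sequence numbers over t sub-streams in
// round-robin order, so sub-stream i carries packets i, i+t, i+2t, ...
class SubStreamSplitter {
    static List<List<Integer>> split(int numPackets, int numSubStreams) {
        List<List<Integer>> subStreams = new ArrayList<>();
        for (int i = 0; i < numSubStreams; i++) {
            subStreams.add(new ArrayList<>());
        }
        for (int seq = 0; seq < numPackets; seq++) {
            // Packet with sequence number seq goes to sub-stream seq mod t.
            subStreams.get(seq % numSubStreams).add(seq);
        }
        return subStreams;
    }
}
```

Under such an assignment, a peer that temporarily loses one sub-stream tree still receives every other packet from the remaining trees, which is why playback degrades rather than stops when a single tree is disrupted.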
1.2.2 Mesh-Based
Such systems form a random connected overlay of peers and utilize swarming content
delivery to exchange packets. Every node, on joining the system, gets a list of random nodes
from the bootstrapping server. It then periodically reports newly available packets
to its child peers and at the same time requests new packets from its parent peers. An
example of a mesh-based system is given in Figure 1.3. The bi-directional arrows represent
that the parent-child relationship among the peers is dynamic and is determined by data
availability. Mesh-based systems are more resilient to system churn. Examples include
CoolStreaming [5] and Prime [21].
Figure 1.3: Peer organization in mesh-based live streaming system
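The buffer-map comparison that drives this swarming exchange can be sketched as follows; this is a simplified illustration with hypothetical names, and actual protocols additionally weigh playback deadlines and packet rarity when scheduling requests:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch of swarming content delivery: a peer compares its own buffer map
// with a neighbor's advertised map and requests the packets it is missing.
class BufferMapExchange {
    static List<Integer> packetsToRequest(Set<Integer> have,
                                          Set<Integer> neighborHas) {
        List<Integer> request = new ArrayList<>();
        for (int seq : neighborHas) {
            if (!have.contains(seq)) {
                request.add(seq); // missing locally, available at neighbor
            }
        }
        request.sort(null); // request packets in sequence order, oldest first
        return request;
    }
}
```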
1.3 Flash Crowd in P2P Network
The phenomenon of the arrival of hundreds of thousands of peers in a very short span of
time is called a flash crowd. Such situations typically arise at the beginning of live streaming
events. The newly arrived peers compete for the limited system resources and drastically
reduce the performance of the system. The problem of flash crowd is more challenging
in live streaming systems due to the stringent time constraints associated with the resources.
A significant number of newly arrived peers may leave the system if they are unable to
meet these stringent time constraints. This adds to the system churn and makes flash
crowd handling more difficult.
Flash crowds have traditionally been handled by deploying additional resources:
servers in CoolStreaming+ [22] or content delivery networks in SkyNet [23]. However, this
method is not cost effective. Moreover, it has been observed that the system performance
can be maintained at a high level once a sufficient system scale has been achieved. Hence,
the additional resources are only necessary for the initial period and useless afterwards.
In recent years, a more rigorous analysis of the problem [24] has helped broaden the
understanding of flash crowd dynamics. Having upload bandwidth alone is not sufficient
to accommodate the flash crowd as the nodes take time to locate the available resources.
Moreover, due to intense competition among the nodes, this available bandwidth is also
not fully utilized. Based on these, various population control measures have been sug-
gested for both mesh-based [25] [26] and tree-based systems [27] [28]. We discuss these
approaches in chapter 2.
1.4 Thesis Overview
Flash crowd in P2P live streaming systems is a challenging phenomenon. As discussed
earlier, the problem of flash crowd has seen considerable interest in recent years and
various population control measures have been suggested. In this thesis, the main focus
is on live streaming in tree-based P2P systems. Some proposals [27] [28] have been
suggested to handle flash crowd in tree-based live streaming systems, but they adopt a
centralized algorithm. Such solutions suffer from scalability and fault tolerance issues.
As part of this thesis, a distributed algorithm that can organize the newly arrived peers
into multiple sub-stream trees with minimal central control is proposed. The proposed
methodology has two vital aspects: (1) a distributed algorithm constructs the sub-stream
trees, freeing the bootstrapping server of excessive load, and (2) the design does not need
any special flash crowd detection mechanism; even in the absence of a flash crowd, it
keeps working without necessitating any change. Thus, a distributed, scalable and simple
solution to handle flash crowds in tree-based P2P live streaming systems is proposed in
this thesis.
PeerSim [29] [30], a Java-based simulation engine, is used to run the simulations. Chord
[10] is used to provide the overlay over which the distributed tree construction algorithm
is run. The second contribution of this thesis is the implementation of the Chord protocol
on PeerSim. A general-purpose implementation of Chord is provided that can be used to
test any design based on Chord. The various data structures and messages designed are
discussed and documented. Finally, the distributed tree construction algorithm is also
implemented on PeerSim and experimental results are documented.
1.5 Thesis Organization
Chapter 2 discusses the existing research work done on handling flash crowd. The various
population control measures for both mesh-based and tree-based systems are discussed.
The novelty of this thesis work with respect to existing solutions is also discussed.
Chapter 3 covers the preliminaries and the method proposed to handle flash crowd in
tree-based P2P live streaming systems.
Chapter 4 discusses the PeerSim simulator in general and Chord implementation in
particular. The additions made to the simulator as a part of this thesis is highlighted
and presented. Moreover, the software design architecture of the method proposed in
Chapter 3 is also presented.
Chapter 5 examines the framework to evaluate the method and presents various ex-
perimental results to examine the benefits and shortcomings of the proposed method.
Finally, Chapter 6 concludes the report and discusses the scope for future work.
Chapter 2
Background
2.1 Understanding Flash Crowd
Chen et al. [31] understand flash crowd as a sudden increase in the peer arrival rate. Zhang
et al. [26] expand this understanding and define flash crowd in terms of shock level.
The shock level of a flash crowd is defined as the ratio of the peer arrival rates during and
before the flash crowd. Similarly, the capacity of the system is defined as the shock level
of the highest flash crowd that the system can survive [26]. A different representation
of flash crowd is used in [25], where Liu et al. model a flash crowd as the abrupt arrival
of a large number of peers. Since the latter representation can be considered a special case
of the former, the representation in [31] is used in this thesis.
In recent years, a more rigorous analysis [24] has helped broaden the understanding
of flash crowd dynamics. Liu et al. [24] [25] examined the fundamental characteristics of
flash crowd and proposed a time-scale relationship in mesh-based live streaming systems.
They showed that with the stringent time constraints of live streaming systems, the
network can scale only up to a limit during a flash crowd. This is because the number of
peers that can be satisfied in the first attempt is bounded by the available surplus bandwidth
of the system. But the key insight obtained was that having available surplus bandwidth
alone is not sufficient for the system to scale, as peers take time to locate the available
resources. Following the terminology of [31], peers that are already connected and able to
forward streams are called stable peers, and newly arrived peers are called start-up peers.
In mesh-based systems that rely on gossip-based protocols to exchange information,
the partial or incomplete knowledge of the system is an extra deterrent to achieving the
desired system scale. A start-up peer randomly selects k peers from among the stable
peers and asks them for resources. A stable peer can receive requests from many new
start-up peers, and hence is able to satisfy only a subset of the received requests. The
available bandwidth gets distributed among the start-up peers. As a result, only a small
fraction of peers complete their start-up process and become stable peers themselves.
Hence, collaboration instead of competition among the peers is required to achieve a better
system scale.
Liu et al. also examined the effects of various critical factors such as initial system
scale, flash crowd scale, number of partners and per peer upload capacity on the system’s
capability to handle flash crowd. Based on all these, they designed a simple population
control framework.
They showed that as the peer arrival rate increases, the growth rate of the system
scale increases up to a maximum and then starts to deteriorate. Hence, for every
system there exists a critical peer arrival rate at which the growth rate of the system
is maximum. The framework exploits this property by admitting all peers if the peer
arrival rate is less than the optimal rate, and delaying a fraction of arriving peers if the
peer arrival rate is higher than the optimal rate. Using this strategy, it was possible to
trade initial peer start-up delay for a better system scale.
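The admission rule of this framework can be sketched as follows; this is a minimal illustration in which the method names and the per-interval granularity are assumptions, and the framework in [25] additionally decides which specific peers to delay:

```java
// Sketch of the population control idea discussed above: when the observed
// arrival rate exceeds the critical rate at which system growth is maximal,
// only the critical-rate share of arrivals is admitted and the rest are
// delayed to a later round.
class AdmissionControl {
    // Number of peers (out of 'arrivals' in this interval) to admit, given
    // the critical arrival rate in peers per interval.
    static int admitted(int arrivals, int criticalRate) {
        return Math.min(arrivals, criticalRate);
    }

    // Number of peers whose entry is deferred, trading start-up delay for
    // a better system scale.
    static int delayed(int arrivals, int criticalRate) {
        return arrivals - admitted(arrivals, criticalRate);
    }
}
```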
Zhang et al. [26] used a fluid-based model to examine flash crowd dynamics and
estimated the strength of a flash crowd in terms of its shock level. They showed that with
proper population control measures, the waiting time of peers can be made to increase
only logarithmically with the shock level of the flash crowd.
2.2 A Batch Join Process for Single Tree-Based Systems
Tree-based live streaming systems have a pre-constructed topology where the peers
retrieve the video content from pre-determined parents. The challenge to system scale
caused by gossip-based protocols, as in the case of mesh-based systems, is therefore absent.
As a result, flash crowd handling in tree-based systems has adopted a slightly different
approach. Chung et al. [27] identified the peer join process as a system bottleneck during
flash crowd. The small pool of stable peers becomes overloaded by the surge of join
requests during a flash crowd. Hence, they advocated altering the serial join process and
instead joining the peers in batches. This is done by constructing a tree entirely out of
start-up peers and then connecting it to the existing system. Wu et al. [28] take this
scheme further and propose to construct not one but many trees, based on the available
surplus bandwidth of the system. The underlying principle in these approaches is to isolate
the start-up peers and arrange them in a topology before connecting them to the existing
network.
This process involves four steps: (1) the detection of the flash crowd and the estimation
of the number of trees to be constructed based on the available surplus bandwidth of the
system; (2) the merit calculation for each peer based on its forwarding capability and
waiting time, both weighted by appropriate factors; (3) the positioning of each peer into
different layers of the trees based on its merit rank; and (4) connecting each peer to a peer
in the layer above it and joining the root node of each tree to a stable peer.
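Steps (2) and (3) can be sketched as follows; the merit weights and the assumption that every interior node has a fixed fanout are illustrative, not the exact formulation of [27] [28]:

```java
// Sketch of merit-based layering for a batch of start-up peers: each peer
// gets a merit score (a weighted sum of forwarding capability and waiting
// time), peers are sorted by merit, and the peer ranked r is placed in the
// layer of a complete tree where every interior node has 'fanout' children.
class MeritLayering {
    // Weighted merit; wCapacity and wWait are illustrative weighting factors.
    static double merit(double uploadCapacity, double waitTime,
                        double wCapacity, double wWait) {
        return wCapacity * uploadCapacity + wWait * waitTime;
    }

    // Layer index for the peer at position 'rank' (0 = highest merit):
    // layer 0 holds 1 peer, layer 1 holds 'fanout' peers, layer 2 holds
    // fanout^2 peers, and so on.
    static int layerOf(int rank, int fanout) {
        int layer = 0;
        long layerStart = 0, layerSize = 1;
        while (rank >= layerStart + layerSize) {
            layerStart += layerSize;
            layerSize *= fanout;
            layer++;
        }
        return layer;
    }
}
```

Placing high-merit peers in upper layers puts the peers with the most forwarding capacity closest to the root, so the batch tree itself has enough bandwidth to feed its lower layers.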
An essential role is played by the tracking server as it is responsible for performing
all of the above steps. This burdens the tracking server and makes it a potential single
point of failure.
2.3 SplitStream - High-Bandwidth Multicast in Cooperative Environment
Various systems have been designed and implemented that use multiple trees to distribute
a live stream. One such system, namely SplitStream [18], is discussed here. The next section
discusses the novelty of the method proposed in this thesis in the context of flash crowd
handling in multiple tree-based systems.
As seen earlier, single tree-based systems suffer from resource under-utilization and
fault tolerance issues. SplitStream tackles this problem by constructing a forest of
interior-node-disjoint multicast trees that distributes the forwarding capacity among the
participating peers [18]. Every node is an interior node in only one tree and a leaf node
in all other trees. Also, depending upon its forwarding capacity, a node can select
the number of sub-stream trees to join.
SplitStream relies on a structured peer-to-peer overlay to construct and maintain
trees. Pastry [11] and Scribe [32] are used to provide this structure. Pastry is a DHT
based, scalable, self organizing peer-to-peer network similar to Chord [10] and Tapestry
[12]. Scribe [32] is a application level group communication system built upon Pastry.
The stream is divided into stripes and each stripe is distributed over a separate Scribe
multicast trees. A Scribe multicast tree associated with a stripe is formed by the union
of all the Pastry routes from each group member to the stripeId’s root. The stripeIds
are choosen such that it divides the network into almost equal groups. Each node will be
an interior node for a stripe whose stripeId has the longest prefix match with the node’s
nodeId.
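The interior-node rule can be sketched as a longest-common-prefix comparison; identifiers are shown here as plain digit strings for illustration, whereas Pastry actually uses identifiers made of base-2^b digits:

```java
// Sketch of SplitStream's interior-node rule: a node serves as interior
// node for the stripe whose stripeId shares the longest prefix with its
// own nodeId (identifiers represented here as digit strings).
class StripeSelection {
    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    // Returns the index of the stripe this node is an interior node for.
    static int interiorStripe(String nodeId, String[] stripeIds) {
        int best = 0, bestLen = -1;
        for (int s = 0; s < stripeIds.length; s++) {
            int len = commonPrefixLength(nodeId, stripeIds[s]);
            if (len > bestLen) { bestLen = len; best = s; }
        }
        return best;
    }
}
```

Because nodeIds are roughly uniform over the identifier space, this rule partitions the nodes into near-equal groups of interior nodes, one group per stripe.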
Each node adopts a prospective child irrespective of its outdegree limit. If the
outdegree limit is exceeded, it orphans a child belonging to a stripe whose stripeId does
not share a common prefix with the local node's nodeId. (SplitStream does not use
strict checks during tree construction, hence a node may be an intermediate node for more
than one stripe.) If no such child is found, it orphans the child with the shortest prefix
match with the local node's nodeId. The orphaned child then tries to connect to its former
siblings under the same criteria. If unsuccessful, it is pushed down the sub-stream tree.
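The orphaning rule above can be sketched as follows, representing each child only by the stripeId it subscribes to; this is an illustrative simplification of SplitStream's actual data structures:

```java
import java.util.List;

// Sketch of SplitStream's orphaning rule when a node exceeds its outdegree:
// drop a child of a stripe whose stripeId shares no prefix with the local
// nodeId; if none exists, drop the child with the shortest prefix match.
class OrphanSelection {
    static int prefixLen(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    // childStripes.get(i) is the stripeId of the i-th child; returns the
    // index of the child to orphan.
    static int childToOrphan(String nodeId, List<String> childStripes) {
        int worst = 0, worstLen = Integer.MAX_VALUE;
        for (int i = 0; i < childStripes.size(); i++) {
            int len = prefixLen(nodeId, childStripes.get(i));
            if (len == 0) return i;                 // no shared prefix: orphan it
            if (len < worstLen) { worstLen = len; worst = i; }
        }
        return worst;                               // shortest prefix match
    }
}
```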
If the node is pushed down the tree without being able to locate a new parent, it
sends an anycast message to a special Scribe group called the spare capacity group. This
group contains all nodes that have fewer children than their forwarding capacity. The
anycast message is delivered to a node in the group that is close to the orphaned node
in the physical network, and a depth-first search (DFS) is started at this node to find a
parent for the orphaned node.
2.4 A Novel Scheme
Population control measures in mesh-based systems delay the entry of a fraction of peers
to achieve a better system scale. These peers remain idle for the delay duration and
could be utilized instead. On the other hand, in tree-based systems a flash crowd is handled
by arranging the peers in streaming trees and then connecting them to the existing system.
The approaches suggested are centralized and require a lot of work to be done by the
bootstrapping server.
Moreover, the existing literature on flash crowd handling in tree-based live streaming
systems focuses on constructing single-stream trees from the start-up peers. Wu et al. [28]
use the terminology "multiple tree" to denote n disjoint trees that are constructed from
the start-up peers, each rooted at a different node of the existing network. The terminology
"multiple single-stream trees" would be better suited, as the advantages present in
traditional "multiple tree" systems are not present in this case. The bandwidth utilization
of the system remains poor as only the intermediate nodes are responsible for forwarding
the stream. Moreover, the departure of a node affects the availability of the stream at all of its
dependent nodes. Multiple tree-based solutions have been proposed for live streaming, but
they are not designed to handle flash crowds. Hence, to the best of my knowledge, this
thesis presents a novel scheme to handle flash crowds in the context of multiple tree-based
live streaming systems.
Chapter 3
Multi-Tree Based Solution to
Handle Flash Crowd
3.1 Preliminaries
3.1.1 DHT Overview
A Distributed Hash Table (DHT) based protocol is used to provide the overlay over
which the streaming trees can be constructed. In DHT-based systems, each resource is
identified by a <key,value> pair. The set of all possible keys forms the keyspace. For
example, if each key is an m-bit long sequence, the keyspace comprises all 2^m possible
combinations. The keyspace is partitioned among the nodes of the network. Each node
is responsible for maintaining the resources whose keys lie in its region of the keyspace.
Finally, a routing algorithm describes how queries are routed in the network to locate
the resources. Different implementations give rise to different systems. Chord [10] is the
preferred choice for this thesis, but other DHT-based protocols such as Tapestry [12] or
Pastry [11] can also be used.
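As an illustration of the keyspace partitioning described above, the following Java sketch hashes a key into an m-bit identifier space and assigns it to a node under the Chord-style successor rule (a key is owned by the first node-id at or after it on the identifier circle). The class name, the choice of SHA-1, and m = 16 are illustrative, not part of the proposed system.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.util.TreeSet;

// Illustrative sketch of DHT keyspace partitioning with a Chord-style
// successor rule (not the thesis implementation).
public class KeySpace {
    static final int M = 16;                              // m-bit identifier space
    static final BigInteger SIZE = BigInteger.ONE.shiftLeft(M);

    // Hash an arbitrary key into the m-bit keyspace.
    static BigInteger hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-1")
                    .digest(key.getBytes(java.nio.charset.StandardCharsets.UTF_8));
            return new BigInteger(1, d).mod(SIZE);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Successor rule: the owner of id is the smallest node-id >= id,
    // wrapping around to the smallest node-id overall.
    static BigInteger owner(BigInteger id, TreeSet<BigInteger> nodeIds) {
        BigInteger s = nodeIds.ceiling(id);
        return (s != null) ? s : nodeIds.first();
    }
}
```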
3.1.2 Chord
Chord is a scalable peer-to-peer lookup protocol designed to construct an overlay network
based on a DHT. As described earlier, each node and each resource in the system is
CHAPTER 3. MULTI-TREE BASED SOLUTION TO HANDLE FLASH CROWD 15
identified by an m-bit long node-id and resource-id (key) respectively, with each node
maintaining a subset of the resources. By constructing and maintaining efficient routing
structures, a resource can be located in a maximum of O(log₂ N) hops, where N is the
number of nodes in the Chord overlay. The mechanism for mapping resources to nodes
and routing queries over the network is discussed in Section 4.2. On bootstrapping,
each node contacts the bootstrapping server and receives a list of existing nodes. It then
connects itself into the Chord overlay and starts to populate its routing tables by exchanging
information with other nodes. Chord supports multiple joins and can bootstrap large-scale
DHT networks [33].
3.2 System Design
The system consists of homogeneous nodes, having bandwidth B, connected in the Chord
overlay. The video stream is split into t sub-streams and each sub-stream is distributed
over a separate tree. The value of t is a configuration parameter and is fixed at the start
of the streaming process. Nodes are arranged in the sub-stream trees under the principle
that a node is an intermediate node in only one tree and a leaf node in all other trees. The
terminology of [20] is adopted: a node is called a fertile node in the tree in
which it is an intermediate node and can forward the stream, and a sterile node in all other
trees. The trees are correspondingly known as the fertile and sterile trees of the node.
3.2.1 Maintenance of Feed-Forwarder List
To reduce competition among the nodes, start-up nodes arrange themselves in different
levels of the tree. A separate list of feed-forwarders is maintained in the network for
each level of each sub-stream tree. Every list is identified by a unique key of the form
streamX levelY, where X and Y are numerical values representing the sub-stream (tree)
number and the level number respectively. The key is hashed to generate an m-bit long
resource-id. The node that is responsible for maintaining this resource (the feed-forwarder
list) can be found by routing on the Chord overlay with the resource-id as the destination.
This node has the dual responsibilities of registering peers in the list on receiving registration
requests and replying with the latest feed-forwarder list on receiving query requests.
It should be noted that this node n is no different from any other node in the network. It
simply happens that the resource-id obtained on hashing streamX levelY, i.e., #(streamX levelY),
lies in n's region of the keyspace, and the Chord protocol routes all messages destined to #(streamX levelY)
to n. If a new node joins the network and is a better candidate, the feed-forwarder
list maintained at n is transferred to the new node and all subsequent requests for
#(streamX levelY) are routed to the new node.
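The dual role of the list-maintaining node can be sketched as follows. The class is an illustrative simplification (a single list, peers represented by address strings), not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the node-side state behind one feed-forwarder list,
// keyed by "streamX levelY". The node that Chord maps #(streamX levelY) to
// registers peers and answers queries with the current snapshot of the list.
public class FeedForwarderList {
    private final String key;                 // e.g. "stream1 level2"
    private final List<String> forwarders = new ArrayList<>();

    public FeedForwarderList(String key) { this.key = key; }

    // Handle a registration request from a peer.
    public void register(String peerAddress) {
        if (!forwarders.contains(peerAddress)) forwarders.add(peerAddress);
    }

    // Handle a deregistration (e.g. when a node's k fertile slots fill up).
    public void deregister(String peerAddress) { forwarders.remove(peerAddress); }

    // Handle a query request: reply with the latest list.
    public List<String> query() { return new ArrayList<>(forwarders); }
}
```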
3.2.2 Node Rank
A start-up peer first requests the bootstrapping server for a list of existing peers and then
connects itself in the Chord overlay. The bootstrapping server also returns a numerical
value known as nodeRank. It denotes the position of the node among the other start-up
peers and is incremented by one after each node's arrival. This quantity will be used to
arrange the peers in different levels of the sub-stream trees.
3.3 NumLevel - Quantifying Flash Crowd
A quantity called numLevel is used to quantify the intensity of the flash crowd. It denotes the
number of levels that need to be maintained to accommodate the start-up peers. The
value of numLevel depends on the size of the flash crowd M, the feed-forwarding capacity
of an individual peer k, and the initial surplus per sub-stream feed-forwarding capacity of
the system Us, i.e., the number of peers that can be supported by the initial system in each
sub-stream tree.
3.3.1 Calculation of NumLevel
Since all sub-stream trees are similar in structure, consider one particular tree X. Let
N(i) denote the maximum number of peers that can be accommodated in level i of the
tree. Then, if the initial surplus per sub-stream feed-forwarding capacity of the system
is Us, we have N(1) = Us. Now, the number of peers in the second level would be
N(2) = Us k, in the third level N(3) = Us k^2, and so on. This gives

    N(i) = Us k^(i−1)    (3.1)
Now, if M peers join the system, we have the following relationship:
N(1) + . . . + N(numLevel − 1) < M ≤ N(1) + N(2) + . . . + N(numLevel)
Substituting the value of N(i) and adding the terms, we get:
    Us (k^(numLevel−1) − 1) / (k − 1) < M ≤ Us (k^numLevel − 1) / (k − 1)

Simplifying further, we have:

    numLevel − 1 < log_k( M(k − 1)/Us + 1 ) ≤ numLevel

Thus, the value of numLevel can be given by

    numLevel = ⌈ log_k( M(k − 1)/Us + 1 ) ⌉    (3.2)

where ⌈·⌉ denotes the ceiling function.
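Equation 3.2 can be implemented directly. To avoid floating-point log/ceiling edge cases, the sketch below accumulates the per-tree level capacities N(i) = Us k^(i−1) until they cover M, which is equivalent to the closed form; the class name is illustrative.

```java
// Iterative implementation of Equation 3.2: numLevel is the smallest L with
// N(1) + ... + N(L) >= M, where N(i) = Us * k^(i-1) per sub-stream tree.
public class NumLevel {
    static int compute(long M, int k, long Us) {
        long capacity = 0;   // N(1) + ... + N(level)
        long n = Us;         // N(1) = Us
        int level = 0;
        while (capacity < M) {
            capacity += n;
            n *= k;          // next level holds k times as many peers
            level++;
        }
        return level;
    }
}
```

For example, with k = 2 and Us = 1 the level capacities are 1, 2, 4, 8, ..., so M = 10 peers need 4 levels, matching ⌈log₂(10·1 + 1)⌉ = 4.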
3.3.2 Updating NumLevel
The value of numLevel is calculated by the bootstrapping server. k and Us are known
to the bootstrapping server at the start of the streaming process. Also, since every node
contacts the bootstrapping server before joining the system to get the list of existing peers,
M is also known. numLevel also has an expiration time associated with it, which is
propagated in the network along with its value. Whenever the value of numLevel changes
or the previous value is about to expire, the bootstrapping server schedules updates (either
new values or refreshes of the old value with extended expiry time) to be propagated in the system.
3.3.3 NumLevel Root Node
The value of numLevel is maintained by a node known as levelRoot. This node can be
reached by routing the query to #(number of levels). It is responsible for responding to
all queries related to the latest value of numLevel. It is also responsible for propagating
updates when scheduled by the bootstrapping server.
On receiving a request for numLevel, an intermediate node stores the identity of
the requester node and forwards the query towards the levelRoot node. The paths traced
by search queries form a tree rooted at the levelRoot node. Whenever a new update
is scheduled by the bootstrapping server, the updated value is propagated from
the levelRoot node down the reverse paths taken by the search queries. Controlled update
propagation is used in this regard [34].
To prevent the levelRoot node from being flooded with requests, query coalescing and
caching of numLevel at local nodes are used. If an intermediate node has an active
request for numLevel, it does not forward any other request unless the active request
is satisfied or times out. On reception of the reply, it is forwarded to all the nodes
awaiting it. Thus, the actual number of requests reaching the levelRoot node is
greatly reduced. Moreover, every node caches the reply it receives. If an intermediate
node receives a request for numLevel, it first tries to reply from its cached value. Only if the
cached value is invalid or expired is the request routed towards the levelRoot
node.
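The caching and coalescing behaviour can be sketched from a single node's point of view as follows. This is an illustrative simplification (timestamps are passed in explicitly and the actual routing step is omitted), not the proposed implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two load-reduction measures at one intermediate node:
// caching numLevel with an expiry time, and coalescing concurrent requests
// so that at most one query is in flight towards the levelRoot at a time.
public class NumLevelCache {
    private int cachedValue = -1;
    private long expiresAt = 0;                    // ms timestamp
    private boolean requestInFlight = false;
    private final List<String> waiters = new ArrayList<>();

    // Returns the cached value, or -1 if the caller must wait for a reply.
    public int lookup(String requester, long now) {
        if (cachedValue >= 0 && now < expiresAt) return cachedValue;  // fresh cache
        waiters.add(requester);                    // coalesce: queue the requester
        if (!requestInFlight) {
            requestInFlight = true;                // only the first miss forwards
            // forwardTowardsLevelRoot();  (routing omitted in this sketch)
        }
        return -1;
    }

    // Called when the reply arrives; returns the waiters to notify.
    public List<String> onReply(int value, long now, long ttlMs) {
        cachedValue = value;
        expiresAt = now + ttlMs;
        requestInFlight = false;
        List<String> toNotify = new ArrayList<>(waiters);
        waiters.clear();
        return toNotify;
    }
}
```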
3.4 Construction of Streaming Trees
3.4.1 Necessary Conditions
The newly arrived nodes need to be arranged in t streaming trees in such a way that each
is a fertile node in only one tree and a sterile node in all other trees. The
information available at each node is its nodeRank, the number of trees t and the number of levels
numLevel. To achieve this arrangement, two conditions are necessary.
Condition 3.4.1 The feed-forwarding capacity k of each node must not be less than the
number of streaming trees t.
If the stream is divided into t sub-streams and each node is a fertile node in only one
sub-stream tree, we have M/t fertile nodes per sub-stream tree. These can distribute the
feed to (M/t)·k nodes. Hence, in order for all nodes to receive all sub-streams, (M/t)·k ≥ M, or
k ≥ t.
As the system consists of only homogeneous nodes, this condition can easily be fulfilled.
Let the total available uplink bandwidth of an individual node be B. Now,
if the rate of the original stream is R, each sub-stream will consume a bandwidth of R/t.
A node will therefore be able to forward a sub-stream to a maximum of k = B/(R/t) = Bt/R
child nodes. By making the stream rate R equal to the uplink bandwidth B of an individual node, we
get k = t and Condition 3.4.1 is satisfied. It should be noted that this condition is necessary but not
sufficient to guarantee the construction of streaming trees.
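The fan-out computation above can be written directly. The sketch below assumes homogeneous nodes as in the text; the class name is illustrative.

```java
// Fan-out implied by the bandwidth argument: with uplink bandwidth B and a
// stream of rate R split into t sub-streams of rate R/t each, a node can
// forward a sub-stream to k = floor(B / (R/t)) children. Setting R = B
// gives k = t exactly, satisfying Condition 3.4.1.
public class FanOut {
    static int k(double B, double R, int t) {
        return (int) Math.floor(B / (R / t));
    }
}
```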
Condition 3.4.2 If the total number of nodes in the network is M and the number of
streaming trees is t, then (1) the number of fertile nodes in each streaming tree must not
be less than ⌊M/t⌋ and (2) the initial per sub-stream surplus feed-forwarding capacity of
the system Us must not be less than t.
Part (1) of this condition is a natural extension of Condition 3.4.1, which warrants that
the consumption capacity of each sub-stream must not exceed its production capacity.
As for Part (2), consider the following two cases.
CASE 1: M is a multiple of t
In this case, each tree will have exactly M/t fertile nodes. The maximum number of
nodes that can be connected in each tree would be (M/t)·k = M, as we have t = k from
Condition 3.4.1. Hence, no initial per sub-stream surplus feed-forwarding capacity is
required apart from the source node; Us = 1 would suffice in this case.
CASE 2: M is not a multiple of t
In this case, some sub-stream trees will have ⌈M/t⌉ fertile nodes whereas others will have
⌊M/t⌋ fertile nodes. The maximum number of nodes that can be connected in the former
case would be ⌈M/t⌉·k ≥ M, whereas the latter trees can only support ⌊M/t⌋·k ≤ M
nodes. The remaining r = M − ⌊M/t⌋·k nodes will have to be supported by the initial
per sub-stream surplus feed-forwarding capacity of the system. Since r < k, Us = k would
be sufficient to cover this deficit. This condition can easily be satisfied as the source
server often has bandwidth orders of magnitude greater than that of ordinary nodes.
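The case analysis can be condensed into two small helper functions. This is an illustrative sketch assuming k = t, as arranged by Condition 3.4.1; the names are my own.

```java
// Helpers for Condition 3.4.2, following the two cases in the text. When t
// divides M, no surplus beyond the source is needed (Us = 1 suffices);
// otherwise the r = M - floor(M/t)*k leftover nodes (r < k, since k = t)
// are absorbed by an initial surplus of Us = k.
public class SurplusCheck {
    // Leftover nodes that the trees with floor(M/t) fertile nodes cannot carry.
    static long residual(long M, int t, int k) {
        return M - (M / t) * (long) k;   // integer division computes floor(M/t)
    }

    // Minimum initial per sub-stream surplus suggested by the case analysis.
    static int requiredSurplus(long M, int t, int k) {
        return (M % t == 0) ? 1 : k;
    }
}
```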
3.4.2 Registration as a Feed-Forwarder
Every node will register itself as a feed-forwarder only in its fertile tree. But before
that, it requests the value of numLevel by routing a request message towards #(number of levels).
The request may be answered by a cached value at an intermediate node
or by the levelRoot node itself. The stream and level selection for the fertile tree is then
done according to the algorithm given in Figure 3.2. nodeRank denotes the position of the
node among the start-up peers and is provided to the node by the bootstrapping server. The
flowchart of the process is given in Figure 3.1.
After selecting the fertile tree s and level l, each node routes a registration request to
#(stream[s] level[l]) to register itself as a feed-forwarder. The flowchart of the process is
given in Figure 3.3.
3.4.3 Connecting to Fertile Tree
After successfully registering itself as a feed-forwarder in level l of sub-stream tree s, a
node requests the list of feed-forwarders registered at the level above it in the same tree by
routing the request to #(stream[s] level[l-1]). After receiving the feed-forwarder list, the
node (childNode) selects a node (parentNode) randomly from the list and sends a connection
request to it. If the connection is successful, the process is completed. Otherwise,
the childNode selects a different parentNode from the list and tries again. The flowchart
of the process is given in Figure 3.4. As seen in Figure 3.4, the node waits for timeOut
seconds before requesting the feed-forwarder list again. This gives newly arriving
nodes time to register themselves in the list before it is requested again; otherwise an
empty or stale feed-forwarder list would be returned.
When a childNode sends a connection request to a parentNode in its fertile tree, the
following cases may arise.
CASE 1: parentNode has less than k children
Figure 3.1: Flow chart for selecting fertile tree
In this case, the parentNode accepts childNode as its child and the process is completed
for childNode. If the number of fertile children of parentNode becomes k, it sends a
deregistration message to #(stream[s] level[l-1]) to remove itself from the feed-forwarder
list.
CASE 2: parentNode has k children and at least one of them is a sterile
child
The node orphans the sterile child and accepts childNode as its new child. To orphan
the child, parentNode sends a disconnect message to the orphaned child along with the
childNode’s address. The orphaned child then tries to connect to childNode as its new
// N(i) = number of nodes accommodated in level i of a single sub-stream tree
for (i = 1 to numLevel) {
    N(i) = Us * power(k, i-1);
}

// C(i) = number of nodes accommodated up to level i in all trees
// C(i) = (N(1) + N(2) + ... + N(i)) * t
C(0) = 0;
for (i = 1 to numLevel) {
    C(i) = C(i-1) + N(i) * t;
}

level = 1;

while (true) {
    if ((C(level-1) < nodeRank) && (nodeRank <= C(level))) {
        stream = (nodeRank - C(level-1)) % k;
        break;
    }
    level = level + 1;
}
Figure 3.2: Algorithm to select stream and level for registration in fertile tree
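A runnable Java version of the algorithm in Figure 3.2 is given below for reference; the class and method names are illustrative. It finds the level whose cumulative capacity C(level) first covers nodeRank and picks the sub-stream tree from the rank's offset within that level.

```java
// Runnable version of the stream/level selection algorithm of Figure 3.2.
// Assumes 1 <= nodeRank <= C(numLevel), i.e. the crowd fits in numLevel levels.
public class FertileTreeSelection {
    // Returns {stream, level} for the node with the given rank.
    static int[] select(long nodeRank, int numLevel, int t, int k, long Us) {
        long[] N = new long[numLevel + 1];   // per-tree capacity of level i
        long[] C = new long[numLevel + 1];   // cumulative capacity over all t trees
        for (int i = 1; i <= numLevel; i++) {
            N[i] = Us * (long) Math.pow(k, i - 1);
            C[i] = C[i - 1] + N[i] * t;
        }
        int level = 1;
        while (!(C[level - 1] < nodeRank && nodeRank <= C[level])) {
            level++;
        }
        int stream = (int) ((nodeRank - C[level - 1]) % k);
        return new int[]{stream, level};
    }
}
```

For example, with t = k = 2, Us = 1 and numLevel = 3, the cumulative capacities are C(1) = 2, C(2) = 6, C(3) = 14, so nodeRank 3 falls in level 2.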
Figure 3.3: Flow chart for registering in fertile tree
parent, effectively moving down one level in the tree. Again, three cases may arise, which are
discussed in the next segment. After accepting a fertile child, if the number of fertile children
of parentNode becomes k, it sends a deregistration message to #(stream[s] level[l-1]) to
remove itself from the feed-forwarder list.
CASE 3: parentNode has k children and none of them is a sterile child
In this case, the connection is not successful and childNode picks up another node from
the feed-forwarder list.
The number of nodes registered in level i is N(i) = Us k^(i−1), as given by Equation 3.1.
Now, the nodes registered in level i can support a maximum of N(i)·k = Us k^i children,
which is also equal to the number of nodes registered in level i + 1. Hence, if fertile
children are given priority over sterile children, each node is guaranteed to get the stream
in the tree in which it is registered (in which it is a fertile node).
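The three cases above, seen from the parent's side, can be sketched as follows. The representation (children tracked only by a fertile/sterile flag) is an illustrative simplification of the proposed behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Parent-side sketch of the three fertile-connection cases: a fertile request
// is accepted while slots remain (CASE 1), preempts a sterile child when the
// parent is full (CASE 2), and is denied only when all k children are
// fertile (CASE 3).
public class FertileParent {
    final int k;
    final List<Boolean> childIsFertile = new ArrayList<>();  // one flag per child

    FertileParent(int k) { this.k = k; }

    // Returns true if the fertile childNode was accepted.
    boolean onFertileRequest() {
        if (childIsFertile.size() < k) {             // CASE 1: free slot
            childIsFertile.add(true);
            return true;
        }
        int sterile = childIsFertile.indexOf(false);
        if (sterile >= 0) {                          // CASE 2: preempt a sterile child
            childIsFertile.set(sterile, true);       // orphaned child retries elsewhere
            return true;
        }
        return false;                                // CASE 3: all children fertile
    }

    // All k slots used by fertile children: deregister from the list.
    boolean shouldDeregister() {
        return childIsFertile.size() == k && !childIsFertile.contains(false);
    }
}
```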
3.4.4 Connecting to Sterile Trees
As a node registers and connects to its fertile tree, it also tries to connect to the remaining
sterile trees. As seen in Figure 3.3, this process is delayed by sterileDelay seconds. This
is done to give newly arriving nodes time to register themselves in the feed-forwarder
lists of their respective fertile trees before the lists are requested; otherwise an empty or
stale feed-forwarder list would be returned.
If the number of levels in the system is numLevel, the nodes registered in levels 1
to numLevel − 3 would already have their feed-forwarding capacity utilized by fertile
children. Hence, the sterile nodes try to connect to nodes registered in the lower levels.
The selection of the level for a sterile tree is done according to the following rule.
If a node is registered at a level higher than numLevel − 1 in its fertile tree
o Then, it chooses l = numLevel − 2.
o Otherwise, it chooses l = numLevel − 1.
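The rule reduces to a one-line function; the sketch below uses illustrative names.

```java
// Sterile-level selection rule: nodes registered in the last level (above
// numLevel - 1) probe one level higher than the rest, so that requests
// spread over the two lowest levels.
public class SterileLevel {
    static int choose(int fertileLevel, int numLevel) {
        return (fertileLevel > numLevel - 1) ? numLevel - 2 : numLevel - 1;
    }
}
```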
Now, a node requests the list of feed-forwarders registered at level l of sterile tree s
by routing the request to #(stream[s] level[l]). This is done for all the sterile trees of
the node. After receiving the feed-forwarder list, the node (childNode) selects a node
(parentNode) randomly from the list to connect to. If the connection is successful, the
process is completed. Otherwise, the childNode selects a different parentNode from the
list and tries again. The flowchart of the process is given in Figure 3.4.
When a childNode sends a connection request to a parentNode in its sterile tree, the
following cases may arise.
CASE 1: parentNode has less than k children
In this case, the parentNode accepts childNode as its sterile child and the process is
completed for childNode for that particular sterile tree s.
CASE 2: parentNode has k children and at least one of them is a fertile
child
In this case, the connection is denied, but parentNode also sends the addresses of its
fertile children along with the connection-denial message. childNode then tries to
connect to parentNode's children, effectively moving down one level in the tree. This
process continues and terminates either in a successful connection or with childNode being
pushed down to the last level. In the latter case, childNode requests the global
feed-forwarder list of that particular sterile tree s.
CASE 3: parentNode has k children and none of them is a fertile child
In this case, the connection is not successful and childNode picks up another node from
the feed-forwarder list.
3.5 Global Feed-Forwarder List
Apart from maintaining a feed-forwarder list for every level of every sub-stream tree, a global
feed-forwarder list is also maintained for each sub-stream tree. It contains all those nodes
that have fewer than k children. After registering itself in the feed-forwarder list, a node
waits for globalTimeOut seconds before joining the global feed-forwarder list. It will do
so only if its feed-forwarding capacity has not been fully utilized by then. The global list
serves two main purposes.
Firstly, when the flash crowd intensity is very high, the Chord overlay takes some
time to stabilize before it can route messages correctly. More about this will be discussed
in the next chapter. The registration of some nodes may be delayed in this case. Such
nodes are capable of accepting child nodes and hence notify their presence by registering
themselves in the global feed-forwarder list.
Secondly, as seen in Case 2 of Section 3.4.4, while connecting to a sterile tree, a
node can be pushed down to the last level. In this case, the node queries the global
feed-forwarder list of the sterile tree in which it was pushed down and connects to the
returned nodes.
Figure 3.4: Flow chart for connection to sub-stream trees
Chapter 4
Design Implementation in PeerSim
Simulator
4.1 PeerSim Simulator
4.1.1 Introduction
PeerSim [29] [30] is a Java-based peer-to-peer simulator. Its ability to support large-scale
simulations makes it a suitable candidate for capturing the behavior of a P2P network under
flash crowd. In contrast to other network simulators that provide detailed modeling of
the lower layers, an aspect of limited interest in P2P research, PeerSim achieves
scalability by implementing a simple topology at the network layer. While OMNeT++
[35] and P2PSim [36], other simulators used in research, can support a maximum of 1,000
and 3,000 nodes respectively, PeerSim can scale to over 1,000,000 nodes. PlanetSim [37] is
another simulator, providing scalability up to 100,000 nodes.
Another feature that makes PeerSim suitable is its modular design. The simulator
is structured as various independent parts that can be plugged in and out as per
the needs of the simulation. For example, one module is responsible for maintaining
raw links among the nodes, another for exchanging messages between the nodes,
and yet another for implementing higher-layer overlay architectures. These are
modeled in the form of protocols, with each protocol providing a specific functionality to
the node. Each node maintains one or more such protocols. More about protocols is
CHAPTER 4. DESIGN IMPLEMENTATION IN PEERSIM SIMULATOR 29
discussed in Section 4.1.2.
The simulation is carried out by the PeerSim simulation engine. It is designed to work in
two different modes: cycle-based and event-driven. The cycle-based engine is relatively
simpler to understand and use, but it has certain shortcomings: it does not model the
transfer of messages and lacks concurrency. The event-driven engine, on the other hand, is more
complex and less documented, but provides a more realistic simulation. Hence, the event-driven
simulation engine is used. More about the two is discussed in Section 4.1.3.
4.1.2 Main Interfaces
PeerSim has three main interface classes that play different roles in the simulation engine.
They are — node, protocol and control.
o Node: The node interface is the abstraction of the real-world processing unit in the
network. In other words, it represents a peer in the peer-to-peer network. Nodes
themselves maintain very little state and do not perform any function; they are
simply aggregations of protocols. The messages exchanged among the nodes also
run through the protocol interface class.
o Protocol: The protocol interface class is often the most customized part of PeerSim
as it defines the main functionality of the system. Each protocol class consists of a
set of state variables, a set of messages that it supports and a set of rules defining
a node's behaviour on reception of these messages. These messages are the main
component that drives the simulation. The protocol interface class is the main
producer of messages and the only consumer of them.
Every node contains a stack of protocols, each implementing a specific functionality.
This can range from low-level functions, such as modeling propagation delay in the link
layer or maintaining the list of neighbor nodes, to high-level functions such as routing
messages in a DHT overlay. These protocols are built in a modular way on top of
one another and can be modified or replaced without affecting other protocols.
o Control: The control interface can be seen as an onlooker of the network. Control
components are either run at the start of the simulation, to initialize certain aspects
of the network, or at regular intervals. In the former case the control class is known
as an initializer class. When run at regular intervals, controls are often used to gather
and print desired statistics or to provide input to the simulation in the form of
messages. They can also be used to end the simulation midway.
4.1.3 PeerSim Engines
Cycle-Based Engine
The cycle-based simulation engine is very simplistic in nature. In each round, the control
of the simulation is cycled sequentially through all the nodes of the system. When the
control is with a specific node, its protocol interface class can perform computations,
change its state variables or execute remote procedure calls on other nodes. The message
transfer mechanism is absent in the cycle-based engine, and a node communicates directly
with other nodes through RPC calls.
At the start of the simulation, all control objects used to initialize the system are run.
Then the cycle-based engine calls all the other objects (protocols and controls) for every
node once in each cycle, until the given number of cycles is completed. Any component
can also decide to end the simulation at any time. Although all components are executed
in every cycle, they can be configured to run only in certain cycles. It is also possible
to control their order within each cycle. The cycle-based model is easy to understand and
use but does not model the real world accurately. The event-driven engine is therefore used for our
purpose.
Event-Driven Engine
As the name suggests, the event-driven engine functions by executing events one at a
time. These events can be of two types — (1) control objects that are scheduled to
run at fixed intervals to gather and print statistics, or (2) messages that are exchanged
among the nodes (more precisely, among the nodes' protocol classes) or scheduled by a control
object to provide external stimulus to the system. The concept of time provides a sense of
continuity and realism in the simulation. Each event is scheduled to execute at a particular
time. In case two events are scheduled at the same time, they are executed one after
the other.
At the start of the simulation, all the control objects used to initialize the system
are run. Then the events in the event queue are executed one by one. Whenever an
event completes, time jumps to the scheduled time of the next event and it is run. A
node's protocol class receives the messages destined to it, performs computations, alters
state variables and may add more messages (events) to the simulator. A node's protocol
can also send a message to itself to schedule events in the future. The simulation engine
stops when the event queue becomes empty, when the simulation time limit is exceeded, or when
any component ends the simulation.
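The event loop described above can be illustrated by the following toy engine. It is not PeerSim's actual implementation, only a minimal sketch of the same scheduling discipline; all names are my own.

```java
import java.util.PriorityQueue;

// Toy event-driven loop: events are ordered by scheduled time, simulated
// time jumps to each event as it is executed, and handlers may enqueue
// further events, including messages to themselves.
public class TinyEventEngine {
    static class Event implements Comparable<Event> {
        final long time;
        final Runnable action;
        Event(long time, Runnable action) { this.time = time; this.action = action; }
        public int compareTo(Event o) { return Long.compare(time, o.time); }
    }

    final PriorityQueue<Event> queue = new PriorityQueue<>();
    long now = 0;   // current simulated time

    void schedule(long time, Runnable action) { queue.add(new Event(time, action)); }

    // Run until the queue empties or the time limit is exceeded.
    void run(long endTime) {
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            if (e.time > endTime) break;
            now = e.time;          // time jumps to the next scheduled event
            e.action.run();        // may schedule more events
        }
    }
}
```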
4.1.4 Configuration File
The configuration of the system is given by an ASCII text file that is read by the
PeerSim simulator at run time. It contains all the options, ranging from simulation
and network parameters to the various protocol and control components used. Each component
has a name associated with it. In the case of protocols, this name is also mapped to a
numeric index called the protocol ID. The protocol ID is used to access a particular protocol
on a node. The syntax for declaring a component is as follows:
<protocol|init|control>.string_id [full_path_]classname
Here, string_id is the name of the component and full_path_classname is the location
of the Java class that describes the functionality of the component. Each component
can have various parameters associated with it, which are also initialized in the configuration
file. The required syntax is:
<protocol|init|control>.string_id.parameter_name [value]
An example configuration file for the event-driven engine is shown in Fig 4.1 and Fig 4.2.
It performs the task of creating a Chord overlay and routing messages over the overlay
at fixed intervals. At another fixed interval, the average number of hops taken by the
messages to reach their destinations is observed and printed. The unreliability and delay
of the transport layer are also modeled. The various components are as follows.
##### Global Simulation Parameters #####
random.seed 1234567890
simulation.experiments 1
simulation.endtime 100000
simulation.logtime 1000
network.size 1000
##### Protocols #####
protocol.urt UniformRandomTransport
protocol.urt.mindelay 45
protocol.urt.maxdelay 55
protocol.tr UnreliableTransport
protocol.tr.transport urt
protocol.tr.drop 0
protocol.ovr ChordProtocol
protocol.ovr.transport tr
Figure 4.1: Example configuration file (1/2) for PeerSim simulator
Global Simulation Parameters
They define the global parameters related to the simulation, such as the random seed, network
size, number of experiments to run, etc. The simulation runs until simulation.endtime,
unless the event queue becomes empty earlier or the simulation is stopped by a
component.
Protocols
Each node has a stack of protocols that defines its behavior. At the lowest layer is the
transport protocol (urt), which delivers messages from one node to another with a
delay picked uniformly from [mindelay, maxdelay]. Packet loss is modeled by
another protocol (tr). The packet loss probability is set by the drop parameter associated
with the tr protocol. The modular nature of the simulator is evident from this example:
the packet loss probability and the propagation delay are modeled by two separate protocols
and can be configured independently. They are linked to each other by the transport
parameter of the tr protocol.
##### Initialization #####
init.nw CreateNetwork
init.nw.protocol ovr
init.nw.idLength 128
init.nw.succListSize 20
##### Control #####
control.generator TrafficGenerator
control.generator.protocol my
control.generator.step 1000
control.generator.from 1000
control.generator.until 50000
control.observer MessageHopObserver
control.observer.protocol my
control.observer.step 5000
##### Network Churn #####
control.churn DynamicNetwork
control.churn.add 10
control.churn.add -5
control.churn.maxsize 2000
control.churn.step 1000
control.churn.init.0 ChordInitializer
control.churn.init.0.protocol my
Figure 4.2: Example configuration file (2/2) for PeerSim simulator
The overlay of the network is modeled by another protocol (ovr). It is initialized
at the start of the simulation, and new nodes can be added to the network during the
simulation. The Chord protocol runs on top of all other protocols and needs the tr
protocol to exchange messages between the nodes.
Initialization
init elements are also control components. The only difference is that they are run only
at the beginning of the simulation. Here, they are used to connect the nodes in the Chord
overlay. Various configuration parameters related to the Chord protocol, such as idLength
and succListSize, are provided.
Control
They serve two purposes. Firstly, they can be used to give inputs to the system by
scheduling events. In the given example, the control component generator is run at fixed
intervals to generate routing messages between random nodes. It provides the network
with traffic as an input to observe the behavior of the system.
The second task of control components is to run at fixed intervals to collect and print
statistics. In this example, the desired statistic is the average number of hops taken
by a message to reach its destination. The fixed interval between two runs is configured
by the step parameter. Control objects can also be scheduled to run only within specific
time intervals by using the from and until parameters.
Network Churn
The churn in the network can also be modeled by control components. The add parameter
denotes the number of nodes to be added to the system on every run of the control
component. If it is negative, the same number of randomly selected nodes are removed
from the system. If both are provided, the nodes are first removed then added.
The size of the network can be kept within bounds by using the maxsize and minsize
parameters. The PeerSim simulator engine only adds the nodes to the network array; the
task of adding the nodes to the overlay and making them visible to other nodes is done
by the init parameter. Also, PeerSim has no mechanism to report which nodes were
removed; the failure detection and fault tolerance mechanisms need to be built into the
overlay protocol itself.
4.2 Chord Introduction
Chord is a scalable protocol designed to provide time bound lookup in a dynamic peer-to-
peer network. The overlay network constructed by Chord provides a variety of advantages,
such as load balancing of resources among peers, decentralization, scalability, availability
of resources in the face of new node joins or node failures, and flexibility in the types of
resources maintained in the system.
Every node in the Chord overlay is identified by an m-bit long, unique identifier known
as node-id. It is generated by consistent hashing of any unique information related to
the node, say its MAC address. Every object maintained in the network is identified by
a unique key. This key is also hashed to provide an m-bit long resource-id associated
with that object. Hence, the node-id and resource-id spaces coincide with each other.
This id-space is circular and is known as the identifier circle. The identifier length m must
be chosen such that the probability of two nodes or objects having the same identifier is
negligible.
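The id generation step can be sketched in a few lines. This is an illustration, not the thesis implementation: it hashes any unique string with SHA-1 and truncates the digest to m bits so the result lands on the 2^m identifier circle.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of node-id generation by consistent hashing: hash a unique
// string (e.g. a MAC address) and reduce it modulo 2^m.
public class ChordId {
    public static BigInteger idFor(String unique, int m) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(unique.getBytes(StandardCharsets.UTF_8));
            // Interpret the digest as a non-negative integer, truncate to m bits.
            return new BigInteger(1, digest).mod(BigInteger.valueOf(2).pow(m));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 unavailable", e);
        }
    }
}
```

The same function maps object keys to resource-ids, which is why the two id spaces coincide.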
The keyspace or identifier circle is partitioned among the nodes, and each node
is responsible for maintaining the keys that lie in its region of the keyspace. In this
regard, key k is assigned to the first node whose identifier is equal to or comes after k
on the identifier circle. Such a node is known as the successor of key k and is denoted
by successor(k). By constructing and maintaining efficient routing data structures (discussed
in Section 4.2.1), a resource can be located in O(log2(N)) hops, where N is the
number of nodes in the Chord overlay.
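The successor(k) assignment can be sketched with a sorted set of node ids; this standalone illustration (not the thesis code) shows the defining property, including the wrap-around past the largest id:

```java
import java.math.BigInteger;
import java.util.TreeSet;

// Sketch of key assignment on the identifier circle: successor(k) is the
// first node id equal to or after k, wrapping to the smallest id when k
// lies past the largest node id. Assumes at least one node is present.
public class KeyAssignment {
    private final TreeSet<BigInteger> nodeIds = new TreeSet<>();

    public void addNode(BigInteger id) { nodeIds.add(id); }

    public BigInteger successor(BigInteger k) {
        BigInteger s = nodeIds.ceiling(k);          // first id >= k
        return (s != null) ? s : nodeIds.first();   // wrap around the circle
    }
}
```

A node with id equal to k is its own successor, which is why a joining node is handed exactly the keys in the gap between its predecessor and itself.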
4.2.1 Data Structures
Each node needs to maintain only a single piece of information to ensure the correctness
of the lookup service, namely the information about its successor, i.e., the node just after it
on the identifier circle. Queries for a given key can be forwarded around the circle through
the successor nodes till they reach the desired destination.
Such a lookup is not scalable, as it would take O(N) hops to locate the successor of
a key. Hence, each node maintains a routing table with m entries known as the finger table.
The entries of the finger table are filled such that the distance between successive entries
varies exponentially. For example, the ith entry in the finger table of the node with id n is
the first node that succeeds n by at least 2^(i-1) on the identifier circle, i.e., n.finger[i] =
successor(n + 2^(i-1)).
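The finger targets themselves are plain modular arithmetic; a short sketch (illustrative names) makes the exponential spacing concrete:

```java
import java.math.BigInteger;

// Sketch of the finger-table targets for node n: the i-th finger points at
// successor((n + 2^(i-1)) mod 2^m), so successive targets are exponentially
// farther away on the identifier circle. Here 1 <= i <= m.
public class FingerTargets {
    public static BigInteger target(BigInteger n, int i, int m) {
        BigInteger mod = BigInteger.valueOf(2).pow(m);
        return n.add(BigInteger.valueOf(2).pow(i - 1)).mod(mod);
    }
}
```

For m = 4 and n = 12, the fourth finger target is (12 + 8) mod 16 = 4, i.e., the target wraps past zero, which is why all interval arithmetic in Chord must be circular.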
Maintenance of the successor ensures correctness, and maintenance of the finger table
ensures scalable lookup. If the successor node goes down, the correctness of the protocol
would be compromised. Hence, to ensure fault tolerance, every node maintains a successor
list of size r containing the node's first r successors. If the successor of a node fails, the
next active node on the successor list is made the new successor. The various variables
and data structures associated with the Chord protocol are reproduced from the original
Chord paper [10] and shown in Table 4.1.

Notation          Definition
finger[k]         first node on the identifier circle that succeeds (n + 2^(k-1)) mod 2^m, 1 ≤ k ≤ m
successor         the next node on the identifier circle; finger[1].node
predecessor       the previous node on the identifier circle
successorList[i]  the ith successor of the node

Table 4.1: Definition of variables for node n in Chord network
4.2.2 Routing Process
Chord provides just one operation: given a key, it maps the key to a node in O(log2(N))
hops. This is done by running the find_successor() routine on node n with the key
as a parameter. If the node is able to find the key's successor in its successor list, the
successor is returned. Otherwise, the request is passed to the node in n's finger table
whose nodeId most closely precedes the key. In each step, the remaining search space on
the identifier circle is cut in half. Hence, the lookup cost stays within logarithmic bounds.
The lookup protocol from the original Chord paper [10] is reproduced in Figure 4.3.
4.2.3 Join Process
Every node, on bootstrapping, selects a random node and initiates the join process. The
join process is finished after the successor of the new node is found. The arrival of a new
node needs to be reflected in the existing nodes' successor lists and finger tables. This is
done by running the stabilize() and fix_fingers() routines.
The stabilize() routine is run by every node periodically to learn about newly joined nodes.
A node n asks its successor for its successor’s predecessor. If a new node has arrived in
the network, the returned value may not be n. In this case, node n updates its successor
to the new node if the returned node is nearer to it on the identifier circle than its
original successor. It also sends a notification to its (new or original) successor informing
it of its existence. The successor then makes n its predecessor if its predecessor is null
or if node n is a better candidate than the previous predecessor.

// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    n' = closest_preceding_node(id);
    return n'.find_successor(id);

// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] ∈ (n, id))
      return finger[i];
  return n;

Figure 4.3: Pseudocode for lookup reproduced from the original Chord paper [10]
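The interval tests used throughout the Chord pseudocode, such as id ∈ (n, successor], must handle wrap-around past zero on the identifier circle. The following standalone sketch (illustrative, not the thesis code) shows both the half-open and the open variant:

```java
import java.math.BigInteger;

// Sketch of the wrap-around interval tests used by the Chord pseudocode:
// on the identifier circle, an interval (a, b) may wrap past zero when
// a >= b, in which case membership means "after a OR before b".
public class CircularInterval {
    /** id ∈ (a, b] on the identifier circle, as in find_successor. */
    public static boolean inHalfOpen(BigInteger id, BigInteger a, BigInteger b) {
        if (a.compareTo(b) < 0)                                    // no wrap
            return id.compareTo(a) > 0 && id.compareTo(b) <= 0;
        return id.compareTo(a) > 0 || id.compareTo(b) <= 0;        // wrapped
    }

    /** id ∈ (a, b) on the identifier circle, as in closest_preceding_node. */
    public static boolean inOpen(BigInteger id, BigInteger a, BigInteger b) {
        if (a.compareTo(b) < 0)
            return id.compareTo(a) > 0 && id.compareTo(b) < 0;
        return id.compareTo(a) > 0 || id.compareTo(b) < 0;
    }
}
```

Note that when a equals b, the wrapped branch makes the interval cover (almost) the whole circle, which is the conventional behaviour for a ring containing a single node.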
Similarly, the fix_fingers() routine is also run periodically by every node. It updates
the finger table entries of the node, reflecting the arrival of new nodes in the network.
Finally, the check_predecessor() routine is run periodically to remove failed predecessors.
The pseudo-code for the various routines is reproduced from the original Chord paper [10]
in Figure 4.4.
4.3 Chord Implementation in PeerSim
PeerSim has been used to carry out simulations of peer-to-peer networks, but there are
very few examples in which a structured overlay network such as Chord is used. A rough
skeletal outline of a Chord implementation with the very basic features can be found on
the PeerSim project page [38], but it is insufficient for the purpose of this thesis. The join
process is implemented in a static manner assuming global knowledge of the network’s re-
sources and direct access to any node’s routing tables. When a peer joins the network, its
successor list and finger table are populated correctly in an instant. Hence, a more realistic
implementation of Chord was required for simulating the proposed method, one that takes
into consideration the incorrectness of the overlay and the partial knowledge of the nodes
in the initial stages of the network, as well as the delay incurred in exchanging information
between the nodes. In developing this alternate implementation, some basic structures
are reused from the original implementation and changes are made wherever deemed
necessary.

// create a new Chord ring
n.create()
  predecessor = nil;
  successor = n;

// join a Chord ring containing node n'
n.join(n')
  predecessor = nil;
  successor = n'.find_successor(n);

// called periodically. verifies n's immediate
// successor and tells the successor about n
n.stabilize()
  x = successor.predecessor;
  if (x ∈ (n, successor))
    successor = x;
  successor.notify(n);

// n' thinks it might be our predecessor
n.notify(n')
  if (predecessor is nil or n' ∈ (predecessor, n))
    predecessor = n';

// called periodically. refreshes finger table entries.
// next stores the index of the next finger to fix
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = 1;
  finger[next] = find_successor(n + 2^(next-1));

// called periodically. checks whether predecessor has failed
n.check_predecessor()
  if (predecessor has failed)
    predecessor = nil;

Figure 4.4: Pseudocode for stabilization from the original Chord paper [10]

Name             Type        Description
Network          -           global class provided by PeerSim
Node             -           data-type of a peer in the network
chordId          BigInteger  used to represent the id of nodes
m                int         bit-length of chordId
succLSize        int         length of the successor list
successorList[]  Node array  successor list for a node
fingerTable[]    Node array  finger table for a node
predecessor      Node        predecessor node
next             int         index of the finger table entry to fix
pid              int         id of the overlay protocol, i.e., the Chord protocol
tid              int         id of the transport layer protocol
stabInterval     long        time difference between two stabilize() calls
ffInterval       long        time difference between two fix_fingers() calls
retryTime        long        time duration after which a node tries to join the network again

Table 4.2: Variables and Data Structures used in Chord Implementation
4.3.1 Variables and Data Structures
The variables and data structures used have been listed in Table 4.2. The usage of some
of them is evident from the description, whereas others are described in the next sections.
4.3.2 Classes
The task of performing functions on a node and providing utilities is done by various Java
classes. These can be either control or protocol components. The relevant classes for
the Chord implementation are:
CreateNetwork Class
It is a control component that is run once at the beginning of the simulation to construct
the Chord overlay from the available nodes. It takes three main parameters from the
configuration file — the size of the finger table m, the size of the successor list succLSize,
and Chord's protocol id. The class's constructor receives the parameters from the
configuration file through the PeerSim Configuration class.
Every time a control component is invoked, its execute() function is run. It returns
a boolean value: true to end the simulation, false otherwise. In this case, the execute()
function initializes the variables and data structures and populates the finger tables and
successor lists of all the nodes. It is reasonable to assume that the nodes present at the
start of the simulation have correct and updated routing data structures.
ChordInitializer Class
It is run when any new node joins the network. The initialize() function is called with
the new node as a parameter. The basic task of this class is to initialize data structures
and populate finger table and successor list. As the node has limited knowledge of the
system in initial stages, these data structures will take some time to be filled correctly.
Hence, the implementation of the ChordInitializer class is changed completely to make it
more realistic. Now, every node on joining the network selects an already stable node and
sends a FindSuccessorMessage to it to initiate the join process.
ChordMessage Class
It is only an interface provided to create other message classes. All the message classes
follow the same structure. It has data members, constructor and get functions. Nodes in
the network communicate with each other through messages. In this case, messages are
sent from one node's ChordProtocol class to another node's ChordProtocol class. How
the protocol processes these messages is defined in the ChordProtocol Java file. These
messages can be sent through the underlying transport layer protocol or added directly
to the simulator engine to run at a specific time in the future. The various message
classes are as follows:
o FindSuccessorMessage: Sent by a newly joined node to find its successor. It is
routed through the network and a ReturnSuccessorMessage is returned on success;
otherwise, a FailMessage is returned.
o ReturnSuccessorMessage: Sent to the requester node with its possible successor
and the possible successor's current predecessor node. On receiving the message, the
node updates its successor and predecessor nodes. It also schedules a StabilizeMessage
and a FixFingerMessage to run immediately.
o FailMessage: Returned to the requester node when a FindSuccessorMessage results
in failure. The requester node retries after retryTime.
o StabilizeMessage: Scheduled by a node to run at fixed intervals to learn about
new nodes in the system. When received by a node, it starts the stabilize process as
described in Section 4.2.3 by sending a GetSPMessage to its successor. The node
also schedules the next StabilizeMessage to run after stabInterval by adding it to
the event simulator engine.
o GetSPMessage: Sent by a node to its successor to inquire about the successor's
current predecessor.
o ReturnSPMessage: Reply to the GetSPMessage. If new nodes have been added to
the system, the node returned in the reply can be different from the requester node;
in other words, the predecessor of node n's successor may not be node n. In this
case, the requester node updates its successor and sends a NotifyPredecessorMessage
to its new successor.
o NotifyPredecessorMessage: Sent to the successor node to request it to make the
sender node its predecessor if it is a better candidate.
o NotifySuccessorMessage: Sent to the predecessor node to request it to make the
sender node its successor if it is a better candidate.
o FixFingerMessage: Scheduled by a node to run at fixed intervals to update
its finger table entries. When received by a node, it starts the process by sending
a FindSuccessorFMessage. The node also schedules the next FixFingerMessage
to run after ffInterval by adding it to the event simulator engine.
o FindSuccessorFMessage: Sent by a node to find suitable candidates for its finger
table. It is routed through the network and a ReturnSuccessorFMessage is returned
on success.
o ReturnSuccessorFMessage: Sent as a reply to the FindSuccessorFMessage.
All the above messages are used either to join a new node in the network or for book-
keeping purposes. These are the bare essential messages that are needed to keep the
Chord network intact. Apart from these, other messages can be defined as per the needs
of the application running over the Chord overlay.
ChordProtocol Class
As mentioned earlier, a node in PeerSim is only a wrapper to contain the protocols. All
the functions of the node are performed by the protocol interface class. In this case, the
ChordProtocol class defines the complete functioning of the Chord protocol. Most of the
variables and data structures listed in Table 4.1 are members of this class. Whenever
a ChordMessage is received by the node, the processEvent() routine of this class is
invoked. Depending upon the type of message, the state of the node is changed and reply
messages are sent. The various messages have already been described in the previous
segment.
Generator Class
These are control components that can be scheduled to run at fixed intervals. They are
mainly used to provide input to the simulation. For example, a Generator class can be
used to generate traffic in the network. On each run, every node would route a message to
a random node. The behavior of the system can then be observed in terms of the number
of messages successfully delivered or the average hop count of a message.
Observer Class
These are also control components that can be scheduled to run at fixed intervals during
the simulation. Their main task is to observe, calculate, and print various quantities
related to the simulation. For example, an Observer class can be used to iterate
over each node and gather statistics such as the percentage of successful messages and
the average hop count.
The Chord implementation follows the modular approach of PeerSim. Various classes
are used to perform different functions. There can be as many Observer and Generator
classes as the application needs.
4.3.3 Flash Crowd Handling in Chord — Why and How?
A Chord overlay can be used to support a variety of services. If the service experiences
a flash crowd, the Chord overlay should also be able to scale appropriately. For example,
consider a live streaming system supported over a Chord overlay. At the beginning
of the streaming session, only a few nodes are present in the system. If a large number of
nodes arrive simultaneously, the quality of service will be degraded. Population control
mechanisms can be adopted to handle the flash crowd, but they require an overlay network
that can be used for communication between the nodes. Hence, the Chord overlay network
should not become a bottleneck in this case and should be able to scale as per the needs
of the service.
Related Work
The problem of bootstrapping large scale DHT network from scratch has been studied
using different approaches. Angluin et al. [39] use a series of distributed algorithms to
build a sorted linked list of all the identifiers. This list is then used to construct the
Chord overlay. In Montresor et al. [40], each node starts with a handful of pointers known
as its view and keeps exchanging views with its neighbors. The overlay topology converges
to the Chord topology as each node tries to obtain nodes that are better candidates for
its Chord topology.
Both these approaches do away with the standard DHT algorithm in favor of distributed
algorithms of their own. Moreover, they also assume that the nodes have some information
about the initial state of the system even before joining the DHT network. They work
their way up to a structured DHT overlay from a loosely connected unstructured network.
But such a situation may not always be applicable.
Lee et al. [33] challenge the above assumptions and show that the original join procedure
is able to handle a large number of nodes joining simultaneously. The join procedure
can be used as is for Kademlia [41], but for Chord it requires slight tweaking.
Aggressive Join
In the original Chord specifications, the join process is finished as soon as a node finds
its successor. The task of fixing the rest of the pointers is left to the stabilize() call. The
pseudo-code for original join process can be found in Figure 4.4.
Aggressive join completes this process by fixing the other pointers as well. Aggressive join
was first suggested by Baumgart et al. [42]. When a node s receives a find_successor()
request from node n, it immediately sets its predecessor to n if n is a better match
than its original predecessor p. Then it sends p's information in the reply message to
n so that n may add it as its new predecessor. Finally, it also sends a message to its old
predecessor p notifying it that n may be a better match for its successor.
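The three pointer updates above can be modeled with a small standalone sketch. This is an in-memory illustration with hypothetical names, not the thesis implementation; in the real protocol each assignment corresponds to a message exchange:

```java
import java.math.BigInteger;

// Standalone model of the aggressive join step: when successor s answers
// n's find_successor request, it immediately adopts n as its predecessor,
// tells n about the old predecessor p, and tells p about n.
class Node {
    final BigInteger id;
    Node predecessor, successor;
    Node(BigInteger id) { this.id = id; }

    /** id ∈ (a, b) on the identifier circle (wrap-around aware). */
    static boolean between(BigInteger x, BigInteger a, BigInteger b) {
        if (a.compareTo(b) < 0) return x.compareTo(a) > 0 && x.compareTo(b) < 0;
        return x.compareTo(a) > 0 || x.compareTo(b) < 0;
    }

    /** Called on the successor s when node n joins; returns s's old predecessor. */
    Node aggressiveJoin(Node n) {
        Node oldPred = this.predecessor;
        if (oldPred == null || between(n.id, oldPred.id, this.id)) {
            this.predecessor = n;       // s adopts n immediately
            n.successor = this;         // reply tells n its successor is s
            n.predecessor = oldPred;    // ... and its predecessor is old p
            if (oldPred != null) oldPred.successor = n; // notify old p about n
        }
        return oldPred;
    }
}
```

After this single exchange, all three nodes have consistent successor and predecessor pointers, which is exactly what the original join deferred to later stabilize() rounds.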
Modified Fix Finger Routine
Another change in the original Chord specifications is necessary to achieve faster
convergence. The fix_fingers() routine is modified so that the farther entries are filled
first. The pseudo-code for the original process can be found in Figure 4.4.
This change is necessary for the following reason. Consider a newly constructed
network where nodes are still in the process of updating their finger tables. If the lower
finger table entries are updated first, the node has information only about nodes in its
near neighborhood. If a request is received to route a message halfway across the
identifier circle, it can be done only in small incremental hops, as the higher finger table
entries are not yet filled. Hence, updating the farther finger table entries first tackles
this problem.
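The modification amounts to reversing the order in which the finger index advances: instead of counting 1 up to m, the index counts m down to 1 so the farthest fingers are refreshed first. A minimal sketch (illustrative names):

```java
// Sketch of the modified fix_fingers scheduling: the original routine walks
// the finger indices 1..m upward, while the modification walks m..1 downward
// so the farthest (highest) fingers are refreshed first.
public class FingerOrder {
    /** Next finger index to fix, counting down and wrapping 1 -> m. */
    public static int nextDescending(int current, int m) {
        return (current > 1) ? current - 1 : m;
    }
}
```

With this order, a handful of fix_fingers() rounds already gives a node long-range pointers, so early lookups can cross half the identifier circle in one hop instead of crawling successor by successor.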
4.3.4 Flash Crowd Handling in Chord — Experiments and Results
Simulation Setup
To test the performance of Chord under these two modifications, we run the simulation
with the same conditions as in [33]. Every node uses aggressive join to connect to the
network. If successful, it immediately runs the stabilize() and modified fix_fingers()
routines. After this, the stabilize() and fix_fingers() routines are run every 2 and 10
seconds respectively. If the join process is unsuccessful, the node waits for 2 seconds
before trying again.
After successfully joining the system, each node starts to send test messages to
random nodes in the network. The time interval between successive test messages is
normally distributed with a mean of 1 second and standard deviation of 0.1 second,
truncated to non-negative values. The total number of successfully received messages is
calculated at a fixed interval of 5 seconds. Delivery rate is defined as the ratio of messages
received successfully to total messages sent in the last interval. Convergence time of the
system is defined as the time when a delivery rate of 95% is achieved. This is the same
performance metric used in [33].
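The metric itself is straightforward to compute from per-interval counters; the following sketch (illustrative names, interval-indexed inputs) shows the definition used above:

```java
// Sketch of the evaluation metric: delivery rate per interval is
// received/sent for that interval, and convergence time is the first
// interval (1-based) where the rate reaches the threshold (here 95%).
public class ConvergenceMetric {
    /** First interval index where delivery rate >= threshold, or -1 if never. */
    public static int convergenceTime(int[] sent, int[] received, double threshold) {
        for (int i = 0; i < sent.length; i++) {
            if (sent[i] > 0 && (double) received[i] / sent[i] >= threshold)
                return i + 1;
        }
        return -1;
    }
}
```

Note that received[i] may exceed sent[i] in a given interval when messages sent in an earlier interval arrive late, so a robust implementation should tolerate rates above 100%.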
The link layer delay is modeled by a uniform random variable between 45 and 55
milliseconds. The processing delay at each node is not modeled; though it is not zero,
it is assumed to be negligible as today's computers have high processing capabilities. The
simulator also does not model routing delays, as modeling these would prevent large-scale
network simulations. Each node can directly send a message to any other node in the
network if its IP address is known.
Convergence Time
Fig. 4.5 plots the delivery rate as a function of time. The results of 10 simulation runs
are shown. The nodes join the system uniformly at a rate of 1000 nodes/sec for 10
seconds. We can see that the convergence time is between 15 and 20 seconds. This result
tallies with the result obtained in [33], thus validating the implementation of Chord in
PeerSim. The delivery rate can exceed 100% because, due to jitter and routing delay in
the network, messages originating in a previous interval are received in later intervals.
Figure 4.5: Delivery rate when 10000 nodes join the system at the rate of 1000 nodes/sec.
The variation in convergence time with the number of nodes in the system is shown
in Fig. 4.6. The nodes join the system uniformly at the same rate of 1000 nodes/sec
but for different time periods. The average convergence time over 10 simulation runs is
plotted for different Chord overlay sizes on a log scale. We can see that the convergence
time increases only slightly on doubling the overlay size. For the case of 128000 nodes,
the system converges even before all the nodes have joined. These results are
consistent with those obtained in [33] and establish the validity of the implementation.
Average Hop Count
Figure 4.7 shows the variation in the average number of hops taken by a message before
reaching its destination. The nodes join the system uniformly at a rate of 1000 nodes/sec
for 10 seconds. The results of 10 simulation runs are shown. The Chord overlay becomes
stable around 20 seconds, as shown in Figure 4.5, but it takes 100 more seconds before the
lookup becomes O(log2(N)). This follows from the fact that the fix_fingers() routine
updates only one entry every 10 seconds, so it takes time before all entries are correct.
A takeaway from this result is that even in the initial stages, when almost all of the
finger table entries are null or invalid, the routing protocol performs far better than
O(N). This further suggests that the fix_fingers() interval of 10 seconds is sufficient.
Effect of Node Arrival Rate
Figure 4.8 plots the convergence time of the overlay for different arrival rates. The average
convergence time over 10 simulation runs is shown. The nodes keep arriving for 10
seconds. It shows that the Chord protocol is able to scale well with the increase in arrival
rate. Even after increasing the arrival rate 8 times, from 1000 nodes/sec to 8000 nodes/sec,
the average convergence time increases by only 3 seconds, and in 9 out of 10 simulations
the convergence time is 20 seconds.
Figure 4.7: Average hop count for routing messages during flash crowd
Figure 4.8: Convergence time for different node arrival rates
4.4 Implementation of Proposed Method
The method proposed in Chapter 3 was implemented from scratch in the PeerSim
simulator to run the simulations. The design uses the Chord overlay implementation
documented in Section 4.3 to route and deliver messages to their destinations.
4.4.1 Variables and Data Structures
The important configuration and design parameters along with brief descriptions are
given below:
o numLevel: Denotes the number of levels that need to be maintained in the network.
It is calculated by the bootstrapping server. Whenever the value of numLevel
changes or the previous value is about to expire, the bootstrapping server schedules
updates to be propagated in the system.
o expirationTime: The time after which the value of numLevel becomes stale.
o timeOut: Time-out counter associated with the various steps involved in the process,
namely getting numLevel, registering as a feed-forwarder, and fetching the feed-forwarder
list. If the value or a confirmation is not received within timeOut seconds,
the node starts the process again.
o sterileDelay: After a node registers itself in the feed-forwarder list of its fertile
tree, it waits for sterileDelay seconds before requesting the feed-forwarder list of
other sterile trees.
o globalTimeOut: After completing the registration process, a node tries to connect
to its sterile trees. If the node is unable to connect to sub-stream tree s
even after globalTimeOut seconds, it requests the global feed-forwarder list of
sub-stream tree s.
o streamReq: A boolean variable that denotes the status of the node's connection
requests. It is true if the node has successfully connected to all sub-stream trees and
false otherwise.
The important data structures used, along with brief descriptions, are given below:
o regList: It contains the list of nodes that have requested the value of numLevel from
node n. On receiving a reply or an updated value of numLevel, node n forwards
the message to all the active entries of regList.
o indexRegList: It contains the list of nodes that have requested a feed-forwarder
list from node n. On receiving the feed-forwarder list for level l of sub-stream tree
s, node n forwards the message to all the entries of indexRegList that have a pending
request for that level and sub-stream tree.
o localIndexList: It is used to store a local copy of the feed-forwarder lists.
o connectionList: It contains the list of child nodes of a node. Both fertile and
sterile children are stored in the list.
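The registration lists above share one behaviour: the first request for a value is forwarded, later requests arriving while one is pending are only recorded, and the eventual reply fans out to every waiter. The following standalone sketch (hypothetical names, not the thesis classes) captures that coalescing pattern:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the coalescing behaviour behind regList / indexRegList:
// only the first request for a key is forwarded upstream; later requesters
// are queued, and the reply is fanned out to all of them at once.
public class CoalescingRegistry<K, R> {
    private final Map<K, List<R>> pending = new HashMap<>();

    /** Records a waiter; returns true iff the caller should forward the request. */
    public boolean register(K key, R requester) {
        List<R> waiters = pending.computeIfAbsent(key, k -> new ArrayList<>());
        waiters.add(requester);
        return waiters.size() == 1;     // only the first request is forwarded
    }

    /** Clears the pending entry and returns everyone awaiting the reply. */
    public List<R> complete(K key) {
        List<R> waiters = pending.remove(key);
        return (waiters != null) ? waiters : Collections.emptyList();
    }
}
```

Coalescing keeps the nodes holding popular values (such as numLevel or a busy feed-forwarder list) from being flooded with duplicate queries during a flash crowd.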
4.4.2 Messages
The messages used in the implementation serve two purposes. Firstly, they are used for
communication between the nodes; in this case, they are sent to the destination node
via the transport layer protocol. The other purpose of messages is to schedule events
in the future; in this case, a node adds a message directly to the event simulator to be
received by itself in the future.
The various messages along with brief descriptions are given below.
o StreamMessage(): When a node completes the Chord join process, it schedules a
StreamMessage() to be received immediately. When received, it starts the streaming
process by sending a GetLevelMessage() to #(number of levels).
o GetLevelMessage(): It is used to request the latest value of numLevel. If an
intermediate node has an unexpired entry in its cache, the query ends there and a
ReturnLevelMessage(numLevel, expirationTime) is sent. Else, the message is
forwarded towards #(number of levels). In either case, the intermediate node adds
the requester node to its regList. Query coalescing is also used: if a request for
numLevel is already underway, then no new request is issued until the previous
request is either fulfilled or timed out.
o ReturnLevelMessage(numLevel, expirationTime): It is used to return the value
of numLevel along with its expirationTime. When received by a node, the message
is forwarded to all the entries in its regList awaiting the reply. Also, if the node's
streamReq flag is true, the following steps are taken:
- Level l in fertile tree s is selected according to the algorithm given in Figure 3.2.
- RegisterMessage(s, l) is sent to #(stream[s] level[l]).
- ConnectionScheduleMessage(st) is scheduled to run after sterileDelay seconds
for every sterile tree st.
- GlobalRequestMessage(st) is scheduled to run after globalTimeOut seconds
for every sterile tree st.
o RegisterMessage(s, l): Sent by a node to register itself in the feed-forwarder list
of level l of sub-stream tree s. The message is routed to the node located at
#(stream[s] level[l]). RegisterAMessage(s, l) is returned on successful registration;
otherwise, RegisterDMessage(s, l) is returned.
o RegisterAMessage(s, l): Registration in level l of fertile tree s is completed on
receiving this message. The node then queries the feed-forwarder list of the level above
by sending a GetIndexMessage(s, l − 1) to #(stream[s] level[l-1]). The node also
schedules a GlobalRegisterMessage(s) to run after globalTimeOut seconds.
o ConnectionScheduleMessage(s): If s is a sterile tree, the node selects a level l as
described in Section 3.4.4. It then sends GetIndexMessage(s, l) to #(stream[s] level[l]).
o GetIndexMessage(s, l): Used to request the feed-forwarder list of level l of sub-stream
tree s. The message is routed to the node at #(stream[s] level[l]). The
intermediate nodes implement query coalescing: if a request for the feed-forwarder
list of the same level and tree is already underway, then no new request is issued
until the previous request is either fulfilled or timed out. The node simply adds the
requester node to its indexRegList to notify it of the reply when received.
o ReturnIndexMessage(s, l): It is used to return the list of feed-forwarders registered
at level l of sub-stream tree s. When received by a node, the node stores the
feed-forwarder list in localIndexList(s) and forwards the message to all the entries in
its indexRegList that are awaiting the reply. Also, if the node's streamReq flag is
true, the following steps are taken:
- If the list is empty, the node schedules a ConnectionScheduleMessage(s) to
run after timeOut seconds to start the process again.
- Else, the node removes a random node from the list and sends ConnectMessage(s, l)
to it.
o ConnectMessage(s, l): Sent by a node to its potential parent node. ConnectAMessage(s, l)
is sent if the parent node can accommodate the requester node. Otherwise, the parent
node can either send ConnectDMessage(s, l) to refuse the connection or TransferMessage(s, l+
1, childNode) to transfer the connection to its child node. The requester node can
then try to connect to childNode.
After accepting the connection, if the number of fertile children in its connectionList
becomes k, the parent node sends a RemoveIndexMessage(s, l) to #(stream[s] level[l]).
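The parent-side decision can be sketched roughly as below; the class and method names are illustrative, and the choice of which child receives a transferred connection is simplified to picking any known child:

```java
import java.util.*;

// Sketch of a parent's decision on ConnectMessage(s, l): accept while
// capacity remains; otherwise transfer the requester to a child, or deny.
class ParentPolicy {
    enum Reply { ACCEPT, TRANSFER, DENY }

    private final int k;                                  // feed-forwarding capacity
    private final List<String> connectionList = new ArrayList<>();
    private final List<String> childCandidates = new ArrayList<>();

    ParentPolicy(int k) { this.k = k; }

    void addChildCandidate(String child) { childCandidates.add(child); }

    Reply onConnectMessage(String requester) {
        if (connectionList.size() < k) {
            connectionList.add(requester);
            return Reply.ACCEPT;          // reply with ConnectAMessage(s, l)
        }
        if (!childCandidates.isEmpty()) {
            return Reply.TRANSFER;        // TransferMessage(s, l+1, childNode)
        }
        return Reply.DENY;                // ConnectDMessage(s, l)
    }

    /** Once full, the node should send RemoveIndexMessage(s, l). */
    boolean isFull() { return connectionList.size() >= k; }
}
```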
o ConnectDMessage(s, l): Sent by a parent node if it is unable to accept the node
as its child. On receiving the message, the node removes another entry from the
localIndexList(s) and sends ConnectMessage(s, l) to it. If the localIndexList(s)
becomes empty, the node schedules a ConnectionScheduleMessage(s) to run after
timeOut seconds to start the process again.
o TransferMessage(s, l, rNode): After receiving the message, the node sends a
ConnectMessage(s, l) to rNode.
o RemoveIndexMessage(s, l): Sent by a node to remove itself from the feed-forwarder
list of level l of sub-stream tree s. The node is removed from the list unless it is
the only entry, in which case it is retained.
o ConnectAMessage(s, l): Sent by a parent node to notify the child node that the
connection was successful. The streaming process for sub-stream s ends for the
child node.
o FeedMessage(s, n): Used to simulate the feed distribution. FeedMessages
belonging to different sub-streams, each carrying a sequence number, are pushed every
second by the source node. On receiving the message, a node forwards it to all the
nodes in its connectionList.
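A minimal sketch of this push-based relay, with illustrative names rather than the simulator's actual classes:

```java
import java.util.*;

// Illustrative push-based relay: the source pushes FeedMessage(s, n) once per
// second per sub-stream, and every node forwards each received message to all
// children in its connectionList.
class FeedRelay {
    private final List<FeedRelay> connectionList = new ArrayList<>();
    private final Map<Integer, Integer> lastSeq = new HashMap<>(); // sub-stream -> seq

    void addChild(FeedRelay child) { connectionList.add(child); }

    /** Receive FeedMessage(s, n) and push it down the sub-stream tree. */
    void onFeedMessage(int subStream, int seq) {
        lastSeq.put(subStream, seq);
        for (FeedRelay child : connectionList) child.onFeedMessage(subStream, seq);
    }

    /** Last sequence number received on a sub-stream, or -1 if none yet. */
    int lastReceived(int subStream) { return lastSeq.getOrDefault(subStream, -1); }
}
```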
o GlobalRegisterMessage(s): When received by a node, the node registers itself in the
global feed-forwarder list by sending a RegisterMessage(s, −1) to #(stream[s] level[-1]).
It registers itself only if it has remaining feed-forwarding capacity when
the message is received.
o GlobalRequestMessage(s): Received timeOut seconds after the node starts
the registration process. If the node has not been able to connect to the sterile tree s
by the time this message is received, it sends a GetIndexMessage(s, −1) to
#(stream[s] level[-1]) to request the global feed-forwarder list of sub-stream tree
s.
o UpdateScheduleMessage(): Sent by the bootstrapping server to #(number of levels)
with the updated values of numLevel and expirationTime. The levelRoot node then
propagates the update throughout the network.
o UpdateMessage(): Used to propagate the latest values of numLevel and expirationTime.
On receiving this message, a node updates its cache entry and forwards the message
to all the entries stored in its regList.
Chapter 5
Experiments and Results
5.1 Experimental Setup
Chord Overlay Creation Process
Every node starts by contacting the bootstrapping server to get a list of active nodes.
It then selects a random node from the list and starts the join process by sending a
FindSuccessorMessage to it. After receiving a ReturnSuccessorMessage, the node joins
the Chord overlay by updating its successor and predecessor pointers. The node also
schedules StabaliseMessage and FixFingerMessage immediately; afterwards they are
scheduled every stabInterval and ffInterval seconds respectively. If unsuccessful, the node
waits for retryTime before starting the join process again. The values of stabInterval,
ffInterval and retryTime are taken to be 2, 10 and 5 seconds respectively for all the
simulations unless stated otherwise. This completes the Chord overlay construction and
maintenance part. The process is explained in detail in Section 4.2.3.
Streaming Process
After connecting to the Chord overlay, a node queries for the value of numLevel by
sending a GetLevelMessage to #(number of levels). After receiving the value of numLevel,
it selects a fertile tree s and a level l in that fertile tree, and registers itself as a feed-
forwarder by sending a RegisterMessage to #(stream[s] level[l]). After successful
registration, it requests the feed-forwarder list of the level above it in the fertile tree by sending
GetIndexMessage to #(stream[s] level[l-1]). After receiving the feed-forwarder list, it
tries to connect to the nodes in it. Simultaneously, the node also requests the feed-forwarder
lists of the sterile trees and tries to connect to them; it does so after waiting for
sterileDelay seconds. The value of sterileDelay is taken to be 2 seconds for all the
simulations unless stated otherwise. The various steps involved in the process, namely
getting numLevel, registering as a feed-forwarder and fetching the feed-forwarder list, have
a time-out counter associated with them. If the value or a confirmation is not received
within timeOut seconds, the node starts the process again. The value of timeOut is taken
to be 3 seconds for all the simulations unless stated otherwise. The process is explained
in detail in Section 3.4.

Notation       Default Value   Definition
t              8               number of sub-stream trees
k              8               feed-forwarding capacity of an individual peer
Us             8               initial per sub-stream surplus feed-forwarding capacity of the system
stabInterval   2 secs          time difference between two stabilize() calls
ffInterval     10 secs         time difference between two fix fingers() calls
retryTime      5 secs          time duration after which a node tries to join the Chord overlay again
timeOut        3 secs          time-out duration for various steps, such as getting numLevel, registering as a feed-forwarder and fetching the feed-forwarder list
sterileDelay   2 secs          time duration after which a node tries to connect to a sterile tree

Table 5.1: Default values of various parameters used in simulation.
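The time-out bookkeeping used by these steps can be sketched as follows; the step names and the tracker class are hypothetical:

```java
import java.util.*;

// Hypothetical bookkeeping for the per-step time-out counters: each pending
// step records its start time, and a periodic check reports the steps whose
// confirmation has not arrived within timeOut seconds, so they can restart.
class TimeoutTracker {
    private final double timeOut;                     // seconds (default 3)
    private final Map<String, Double> pending = new HashMap<>();

    TimeoutTracker(double timeOut) { this.timeOut = timeOut; }

    void start(String step, double now)  { pending.put(step, now); }
    void confirm(String step)            { pending.remove(step); }

    /** Steps that must be restarted at simulated time 'now'. */
    List<String> expired(double now) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : pending.entrySet())
            if (now - e.getValue() >= timeOut) out.add(e.getKey());
        return out;
    }
}
```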
To simulate the feed distribution, a feed message is pushed on each sub-stream tree
every second. The messages are numbered in increasing order and messages belonging
to the same stream, pushed on different sub-stream trees, are given the same sequence
number. At another fixed interval of 1 second, the number of feed messages of different
sub-streams received at the nodes is observed. The initial per sub-stream surplus feed-
forwarding capacity of the system Us is taken to be 8 for all the simulations unless stated
otherwise. A flash crowd is modeled as a high rate of node arrivals. It is assumed that
the nodes do not leave the system; hence, network churn is not modeled.
The various system parameters and their default values are given in Table 5.1.
Network Parameters
The link-layer delay is modeled by a uniform random variable between 45 and 55 milliseconds.
The processing delay at each node is not modeled; though it is not zero, it
is assumed to be negligible as today's computers have high processing capabilities. The
simulator also doesn't model routing delays, as modeling these would prevent large-scale
network simulations. Each node can directly send a message to any other node in the
network if its IP address is known.
5.2 Configuring Number of Sub-Streams
The number of sub-stream trees t is a configuration parameter and needs to be set before
the simulation starts. Having more sub-stream trees has both advantages and
disadvantages. A higher number of sub-streams makes the system more robust, as the departure
of any node will affect the streaming quality by a smaller fraction. Moreover, the feed-
forwarding capacity of a node k is dependent on the number of sub-stream trees (they
are taken to be equal from Condition 3.4.1). Thus, having more sub-streams will result in
the construction of shorter trees, which in turn will reduce the average latency of the overall
system.

On the other hand, a higher number of sub-streams makes the system more complex and
increases message complexity, as more feed-forwarder lists need to be maintained
in the system. Since the main objective is to provide the stream to the maximum number of
peers in the minimum possible time during a flash crowd, the performance of the system is
compared for different values of t on the following parameters:
System Scale
A total of 10000 nodes join the system at a rate of 1000 nodes/sec. Let the time
taken for more than 90%, i.e., 9000, of the peers to receive all the sub-streams be denoted as
the stabilization time. The stabilization time of the system is plotted for different
numbers of sub-stream trees in Figure 5.1. It should be noted that 90% is taken as
the benchmark instead of 100% because the latter also depends upon random stray events
and would not capture the real effect of changing the number of sub-stream trees. The
observation is averaged over 10 simulations.
Figure 5.1: Time taken for 90% of the nodes to get all the sub-streams (stabilization time) for different values of t
As seen from the figure, varying the number of sub-stream trees has very little effect
on the time in which a node gets all the sub-streams.
Average Latency
A total of 10000 nodes join the system at a rate of 1000 nodes/sec. The source node
starts pushing the feed messages on the sub-stream trees at the rate of one message per
second. The time delay after which a message is received by a node is observed;
this depends upon the position of the node in the sub-stream tree. At a node, the
delays incurred in receiving feed messages from all sub-stream trees are averaged to get the
average latency at the node. To observe the performance of the system, the maximum and
average value of the average latency over all nodes is plotted against the number of sub-stream
trees in Figure 5.2. The observation is averaged over 10 simulations.
As evident from the figure, the average latency does decrease on increasing the number
of sub-stream trees, because of the decrease in the number of levels. But with the addition
of every new sub-stream tree, the marginal improvement keeps shrinking.

Figure 5.2: Average and maximum value of the average latency at each node for different values of t
Minimum Size of Buffer Required
Systems having multiple sub-streams suffer from the problem of differential delay. A
node is placed in different levels in different sub-stream trees, and hence the sub-stream
feeds arrive with different time delays. They have to be buffered at the node until all the
sub-stream feeds of a particular stream are received. The minimum buffer size required
must therefore be at least as large as the maximum time difference between two sub-stream feeds.
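The buffer requirement can be illustrated with a small computation; the per-sub-stream arrival delays below are made-up values, not measured ones:

```java
// Minimum buffer needed at a node = spread between the earliest and latest
// arrival of the sub-stream feeds carrying the same sequence number.
class BufferSize {
    /** perSubStreamDelays: arrival delay (seconds) of each sub-stream's feed. */
    static double minBuffer(double[] perSubStreamDelays) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double d : perSubStreamDelays) {
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return max - min;
    }
}
```

For example, if a node receives the feeds of three sub-streams with delays of 0.3 s, 1.2 s and 0.5 s, it must buffer at least 0.9 s of content before the full stream can be assembled.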
A total of 10000 nodes join the system at a rate of 1000 nodes/sec. The source
node starts pushing the feed messages on the sub-stream trees at the rate of one message per
second. The maximum time delay between two sub-stream feed messages is observed at
each node; this is the minimum buffer size required at the node. The maximum
and average value of the required buffer size is plotted against the number of sub-stream trees
in Figure 5.3. The observation is averaged over 10 simulations.
Figure 5.3: Average and maximum buffer size required at each node
As seen from the figure, the average value doesn't vary much. But the maximum
buffer size is greater for a smaller number of sub-streams. This happens because, with the creation
of more levels, a node may be placed in the top level in one sub-stream tree and the
bottom-most level in another.
Number of Sub-Streams Received by a Node
The advantage of having more sub-streams can be seen from Figure 5.4, which
shows the number of sub-streams received by peers at different times. It can be seen that
it takes 40 seconds for all the peers to have all 8 sub-streams. But in about half that time, i.e.,
20 seconds, most of the peers have more than 4 sub-streams. Thus, though the system
takes time to scale up to full quality, the nodes at least have an opportunity to stream
video at a lower quality in the meanwhile.

From the above results, the number of sub-stream trees is taken to be 8
for all the simulations unless stated otherwise.
Figure 5.4: Number of sub-streams received by nodes at different times
5.3 Effect of Node Arrival Rate
The simulation is run for different values of the node arrival rate, keeping all the other
parameters fixed according to Table 5.1. The stabilization time of the system is plotted for
different values of the node arrival rate in Figure 5.5. For each observation, the nodes arrive
at a uniform rate r for a period of 10 seconds. The observed value is averaged over 20
simulations.
The rate of arrival of nodes is plotted on a logarithmic scale. We can see from the
figure that the system stabilization time varies slowly with the rate. Even on doubling the
rate from 2000 nodes/sec to 4000 nodes/sec, the stabilization time increases by only 2.5
seconds. This increase can be attributed to two factors: (1) the stabilization time of the
Chord overlay itself increases with the arrival rate, as seen in Figure 4.8; (2) more
nodes lead to the creation of more levels. The number of nodes registered in level i is
N(i) = Us·k^(i-1), as given by Equation 3.1. Hence, if more levels are created, the number
of nodes in the lowermost level also increases. This increases the competition among the
peers belonging to these levels, leading to an increase in stabilization time. It can be
reduced by creating sub-levels within levels.

Figure 5.5: System stabilization time for different node arrival rates
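The growth implied by Equation 3.1 can be checked with a few lines of code; with the default Us = 8 and k = 8, five levels already hold more than 10000 nodes, which is why the number of levels grows only logarithmically with system size:

```java
// Level populations implied by Equation 3.1, N(i) = Us * k^(i-1).
class LevelCount {
    /** Nodes registered in level i (1-indexed), per Equation 3.1. */
    static long nodesInLevel(long us, long k, int i) {
        long n = us;
        for (int j = 1; j < i; j++) n *= k;
        return n;
    }

    /** Smallest number of levels whose cumulative population covers n nodes. */
    static int levelsNeeded(long us, long k, long n) {
        int levels = 0;
        long total = 0;
        while (total < n) {
            levels++;
            total += nodesInLevel(us, k, levels);
        }
        return levels;
    }
}
```

With Us = 8 and k = 8, the levels hold 8, 64, 512, 4096, 32768, … nodes, so 10000 arriving nodes need only 5 levels; the lowermost level always dominates the population, which is what drives the competition described above.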
5.4 Effect of Parameters - Us and k
Effect of Initial Per Sub-Stream Surplus Feed-Forwarding Capacity Us
The simulation is run for different values of Us, keeping the other parameters constant as given in
Table 5.1. A total of 10000 nodes join the system at a rate of 1000 nodes/sec, and the
stabilization time of the system is plotted for different values of Us in Figure 5.6. The
observation is averaged over 20 simulations.

As seen from the figure, the stabilization time is not sensitive to the initial capacity
of the system. When Us is increased, more peers are placed in the first level. This
decreases the average latency of the system, as fewer levels are constructed.
But the overall process still takes the same time.
Figure 5.6: Stabilization time of the system for different values of Us
Effect of Feed-Forwarding Capacity of an Individual Peer k
The simulation is run for different values of k, keeping the other parameters constant as given in
Table 5.1. A total of 10000 nodes join the system at a rate of 1000 nodes/sec, and the
stabilization time of the system is plotted for different values of k in Figure 5.7. The
observation is averaged over 20 simulations.

As seen from the figure, increasing k beyond the minimum required value of t can
significantly improve the performance of the system.

Figure 5.7: Stabilization time of the system for different values of k

5.5 Effect of Underlying Chord Overlay

The correctness of the Chord overlay is needed to route the messages to their destinations.
When a large number of peers join the system at the same time, the Chord overlay itself
takes some time before it routes messages correctly, as seen in Section 4.3.4. To observe the
effect of the correctness of the Chord overlay on the performance of the proposed method, a node
waits for wait seconds to let the Chord overlay stabilize. Only after that does it start the
streaming process by sending a GetLevelMessage. Keeping all the other parameters as given
in Table 5.1, the stabilization time of the system is observed for different values of wait
when a total of 10000 nodes join the system at a rate of 1000 nodes/sec. The required
observation is plotted in Figure 5.8.

As we can see from the figure, even when wait = 0 the system is able to scale as fast
as in the other two cases; the slope of the plots is almost the same for all three cases.
Figure 5.8: System scale for different values of wait
Chapter 6
Conclusion and Future Work
6.1 Conclusion
The proposed method consists of two major steps: (1) arrange the newly arrived peers
in different levels of different sub-stream trees; (2) connect the peers with each other to
distribute the stream. The number of levels that are constructed depends upon the node
arrival rate. Hence, the design is able to scale to a very large number of nodes. From the
results, we can see that on increasing the node arrival rate from 2000 nodes/sec to 4000
nodes/sec, the stabilization time of the system increases by only 2.5 seconds. The slight
increase in stabilization time can be attributed to two factors: (1) with a higher rate,
the underlying Chord overlay used for communication takes more time to stabilize; (2)
the number of peers in a level increases exponentially, so if the node arrival rate
increases, more levels are created. As a result, the number of peers in the lowermost
level is larger, which increases competition among the peers belonging to the same
level. This can be reduced by creating sub-levels within levels.
The desired result was achieved by tapping into the potential of start-up peers to help
the system scale. Existing population-control frameworks for P2P live streaming systems
see start-up peers as a hindrance to system scale, as they compete for the scarce available
resources; hence, to achieve the optimum growth rate, a fraction of the start-up peers is
delayed. This thesis proposed a different methodology, which reduced competition among
the start-up peers and instead asked them to collaborate with each other to achieve
better system scale. Thus, collaboration instead of competition among the start-up peers
can be a key to handling flash crowds.
6.2 Future Work
This thesis has provided an insight into a different approach to handle flash crowds in
P2P live streaming systems. Being an initial work, certain assumptions were made about
the system, such as homogeneity of nodes and absence of network churn. A future course of
action would be to remove these assumptions and develop a system that is able to
perform in any scenario.
Bibliography
[1] R. Schollmeier. “A definition of peer-to-peer networking for the classification of
peer-to-peer architectures and applications”. In: Peer-to-Peer Computing, 2001.
Proceedings. First International Conference on. 2001, pp. 101–102. doi: 10.1109/
P2P.2001.990434.
[2] BitTorrent. url: http://www.bittorrent.com/.
[3] Gnutella. url: http://en.wikipedia.org/wiki/Gnutella.
[4] Skype — Free Calls to friends and family. url: http://skype.com.
[5] Xinyan Zhang et al. “CoolStreaming/DONet: a data-driven overlay network for
peer-to-peer live media streaming”. In: INFOCOM 2005. 24th Annual Joint Con-
ference of the IEEE Computer and Communications Societies. Proceedings IEEE.
Vol. 3. IEEE. 2005, pp. 2102–2111.
[6] David P Anderson et al. “SETI@ home: an experiment in public-resource comput-
ing”. In: Communications of the ACM 45.11 (2002), pp. 56–61.
[7] Ian Clarke et al. “Freenet: A distributed anonymous information storage and re-
trieval system”. In: Designing Privacy Enhancing Technologies. Springer. 2001,
pp. 46–66.
[8] Satoshi Nakamoto. “Bitcoin: A peer-to-peer electronic cash system”. In: Consulted
1.2012 (2008), p. 28.
[9] Christos Gkantsidis, Milena Mihail, and Amin Saberi. “Random walks in peer-to-
peer networks”. In: INFOCOM 2004. Twenty-third Annual Joint Conference of the
IEEE Computer and Communications Societies. Vol. 1. IEEE. 2004.
[10] Ion Stoica et al. “Chord: A scalable peer-to-peer lookup service for internet ap-
plications”. In: ACM SIGCOMM Computer Communication Review 31.4 (2001),
pp. 149–160.
[11] Antony Rowstron and Peter Druschel. “Pastry: Scalable, decentralized object lo-
cation, and routing for large-scale peer-to-peer systems”. In: Middleware 2001.
Springer. 2001, pp. 329–350.
[12] Ben Y Zhao et al. “Tapestry: A resilient global-scale overlay for service deploy-
ment”. In: Selected Areas in Communications, IEEE Journal on 22.1 (2004), pp. 41–
53.
[13] Xiaojun Hei et al. “A measurement study of a large-scale P2P IPTV system”. In:
Multimedia, IEEE Transactions on 9.8 (2007), pp. 1672–1687.
[14] Bo Li et al. “Inside the new coolstreaming: Principles, measurements and per-
formance implications”. In: INFOCOM 2008. The 27th Conference on Computer
Communications. IEEE. IEEE. 2008.
[15] Suman Banerjee, Bobby Bhattacharjee, and Christopher Kommareddy. Scalable
application layer multicast. Vol. 32. 4. ACM, 2002.
[16] Duc A Tran, Kien A Hua, and Tai Do. “Zigzag: An efficient peer-to-peer scheme for
media streaming”. In: INFOCOM 2003. Twenty-Second Annual Joint Conference
of the IEEE Computer and Communications. IEEE Societies. Vol. 2. IEEE. 2003,
pp. 1283–1292.
[17] Vivek K Goyal. “Multiple description coding: Compression meets the network”. In:
Signal Processing Magazine, IEEE 18.5 (2001), pp. 74–93.
[18] Miguel Castro et al. “SplitStream: high-bandwidth multicast in cooperative envi-
ronments”. In: ACM SIGOPS Operating Systems Review. Vol. 37. 5. ACM. 2003,
pp. 298–313.
[19] Dejan Kostic et al. “Bullet: High bandwidth data dissemination using an over-
lay mesh”. In: ACM SIGOPS Operating Systems Review. Vol. 37. 5. ACM. 2003,
pp. 282–297.
[20] Venkata N Padmanabhan, Helen J Wang, and Philip A Chou. “Resilient peer-to-
peer streaming”. In: Network Protocols, 2003. Proceedings. 11th IEEE International
Conference on. IEEE. 2003, pp. 16–27.
[21] Nazanin Magharei and Reza Rejaie. “Prime: Peer-to-peer receiver-driven mesh-
based streaming”. In: IEEE/ACM Transactions on Networking (TON) 17.4 (2009),
pp. 1052–1065.
[22] Bo Li et al. “An empirical study of the coolstreaming+ system”. In: Selected Areas
in Communications, IEEE Journal on 25.9 (2007), pp. 1627–1639.
[23] Hao Yin et al. “Design and deployment of a hybrid CDN-P2P system for live video
streaming: experiences with LiveSky”. In: Proceedings of the 17th ACM interna-
tional conference on Multimedia. ACM. 2009, pp. 25–34.
[24] Fangming Liu et al. “How P2P streaming systems scale over time under a flash
crowd?” In: IPTPS. Vol. 1. 2009, p. 2.
[25] Fangming Liu et al. “Flash crowd in P2P live streaming systems: Fundamental
characteristics and design implications”. In: Parallel and Distributed Systems, IEEE
Transactions on 23.7 (2012), pp. 1227–1239.
[26] Yishuai Chen, Baoxian Zhang, and Changjia Chen. “Modeling and performance
analysis of P2P live streaming systems under flash crowds”. In: Communications
(ICC), 2011 IEEE International Conference on. IEEE. 2011, pp. 1–5.
[27] Tein Yaw Chung and Odin Lin. “A batch join scheme for flash crowd reduction
in iptv systems”. In: Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th
International Conference on. IEEE. 2011, pp. 823–828.
[28] Haibo Wu et al. “Multiple-tree topology construction scheme for P2P live stream-
ing systems under flash crowds”. In: Wireless Communications and Networking
Conference (WCNC), 2013 IEEE. IEEE. 2013, pp. 4576–4581.
[29] Alberto Montresor and Mark Jelasity. “PeerSim: A Scalable P2P Simulator”. In:
Proc. of the 9th Int. Conference on Peer-to-Peer (P2P’09). Seattle, WA, Sept. 2009,
pp. 99–100.
[30] Alberto Montresor and Mark Jelasity. PeerSim: A Scalable P2P Simulator. 1999.
url: http://peersim.sourceforge.net.
[31] Zhijia Chen et al. “How scalable could P2P live media streaming system be with the
stringent time constraint?” In: Communications, 2009. ICC’09. IEEE International
Conference on. IEEE. 2009, pp. 1–5.
[32] Miguel Castro et al. “SCRIBE: A large-scale and decentralized application-level
multicast infrastructure”. In: Selected Areas in Communications, IEEE Journal on
20.8 (2002), pp. 1489–1499.
[33] Jae Woo Lee et al. “0 to 10k in 20 seconds: Bootstrapping Large-scale DHT net-
works”. In: Communications (ICC), 2011 IEEE International Conference on. IEEE.
2011, pp. 1–6.
[34] Mema Roussopoulos and Mary Baker. “CUP: Controlled Update Propagation in
Peer-to-Peer Networks.” In: USENIX Annual Technical Conference, General Track.
2003, pp. 167–180.
[35] Andras Varga et al. “The OMNeT++ discrete event simulation system”. In: Pro-
ceedings of the European simulation multiconference (ESM2001). Vol. 9. S 185. sn.
2001, p. 65.
[36] Thomer Gil et al. p2psim, a simulator for peer-to-peer protocols. 2003.
[37] Pedro García et al. Planetsim: A new overlay network simulation framework. Springer,
2005.
[38] Andrea Presa. An implementation of Chord Protocol for Peersim. url: http://
peersim.sourceforge.net/#extras.
[39] Dana Angluin et al. “Fast construction of overlay networks”. In: Proceedings of the
seventeenth annual ACM symposium on Parallelism in algorithms and architectures.
ACM. 2005, pp. 145–154.
[40] Alberto Montresor, Mark Jelasity, and Ozalp Babaoglu. “Chord on demand”. In:
Peer-to-Peer Computing, 2005. P2P 2005. Fifth IEEE International Conference on.
IEEE, pp. 87–94.
[41] Petar Maymounkov and David Mazieres. “Kademlia: A peer-to-peer information
system based on the xor metric”. In: Peer-to-Peer Systems. Springer, 2002, pp. 53–
65.
[42] Ingmar Baumgart, Bernhard Heep, and Stephan Krause. “OverSim: A flexible over-
lay network simulation framework”. In: IEEE Global Internet Symposium, 2007.
IEEE. 2007, pp. 79–84.