Hybrid Network Coding Peer-to-Peer Content Distribution
Dinh Nguyen and Hidenori Nakazato
Abstract: Network coding has been applied successfully in peer-to-peer systems to shorten the distribution time. Pieces of data, i.e. blocks, are combined, i.e. encoded, by the sending peers before being forwarded to other peers. Even though requiring all peers to encode might achieve the shortest distribution time, it is not necessarily optimal in terms of computational resource consumption. A short finish time can, in many cases, be achieved with just a subset of carefully chosen peers. Peer-to-peer systems, in addition, tend to be heterogeneous: some peers, such as hand-held devices, may not have the capacity required to encode. We therefore envision a P2P system where some peers encode to improve distribution time and other peers, due to limited computational capacity or due to some system-wide optimization, do not encode. Such a system gives rise to a design problem which arises in neither pure non-coding nor full network coding-enabled P2P systems. We identify the problem and propose our solutions to address it. Simulation evaluation confirms the robust performance of our proposed hybrid network coding peer-to-peer content distribution.
Index Terms
content distribution, network coding, peer-to-peer
1 INTRODUCTION
NETWORK coding [1][13], which allows content to be coded at intermediate nodes while being forwarded in the network, has been shown to achieve significantly shorter distribution time in peer-to-peer (P2P) content distribution [4][16]. It is, however, too expensive and in many cases impossible to require encoding at every peer. Recent work has demonstrated that encoding is only needed at a subset of carefully chosen peers, and in some particular instances, only at the source, to achieve comparable performance to network coding [11][12][15]. Many other studies have focused on minimizing the number of required network coders to achieve optimal multicast throughput [8][21][22]. P2P networks in reality usually consist of heterogeneous peers with quite different capabilities. More powerful peers can be ready for network coding-enabled operations, yet such jobs are beyond the capacity of resource-limited peers like hand-held and mobile devices. A successful network coding solution to optimize P2P network performance, therefore, cannot impose encoding at every network node.
Interested in using network coding to shorten distribution time in P2P networks, we envision a P2P system where encoding is applied at some peers while other peers, due to resource limitation or due to optimization reasons, might not code. The system, which we call a hybrid network coding P2P system, gives rise to a design problem which has not arisen before. In a pure BitTorrent P2P system [3], the source and all peers exchange pieces, i.e. blocks, of the file using rarest-block selection to quickly disseminate the file into the system. A peer chooses the rarest blocks in the neighborhood to download first. In full network coding-enabled P2P, all peers code. Before downloading from a neighbor, a peer communicates with the neighbor to determine if it can provide new data. When some peers encode and others do not, there are mixtures of coded and non-coded blocks in the neighborhood for each peer to choose from. The question is how we can design a protocol and a block-selection algorithm to handle such a mixture of coded and non-coded blocks and at the same time preserve the efficiency and simplicity of the BitTorrent protocol.
In this paper, we design our hybrid network coding P2P system. Our contributions are as follows.
1) We devise information exchange protocols which enable hybrid network coding systems to work seamlessly. Our design, backward-compatible with BitTorrent, requires only the addition of one field in the meta-exchange messages.
2) We propose a block-selection algorithm for a partly network coding-enabled system to operate efficiently. Our block-selection algorithm, an extension of BitTorrent's rarest-first selection, is derived from extensive observations of the way network coded data benefit content distribution.
Our design and algorithm noticeably improve system performance in terms of distribution time compared with current network coding P2P systems.
In the remaining parts, we review related work in section 2. The system model is given in section 3. We describe the protocol for peers to communicate in section 4. Section 5 focuses on the block-selection problem and our proposed algorithm. Section 6 presents performance evaluation results and finally, we conclude the paper in section 7.
The authors are with Waseda University, Tokyo 169-0051, Japan. This work has been extended from a previous paper [14].
JOURNAL OF COMPUTING, VOLUME 5, ISSUE 4, APRIL 2013, ISSN (Online) 2151-9617
https://sites.google.com/site/journalofcomputing
WWW.JOURNALOFCOMPUTING.ORG
2 RELATED WORK
BitTorrent [3], a popular P2P file sharing system with parallel downloads to accelerate download speed, divides the file into equal-size blocks, i.e. chunks or pieces, which peers send and receive in parallel, utilizing both available upload and download bandwidth. Each newly joining peer connects to a set of random existing peers, thereby constructing a mesh overlay network with a random topology. Furthermore, rarest blocks are chosen first by receiving peers to quickly disseminate the whole file into the system. To encourage peers to contribute upload bandwidth to the system, a peer uploads to, i.e. unchokes, a certain number of neighboring peers at a time: those that provide it with the best download rates. Rarest-first block selection and unchoking are shown to be the reasons underlying BitTorrent's excellent performance [5].
Network coding [1][2][13], which allows intermediate nodes to encode, has been applied to BitTorrent in order to shorten distribution time [4][16]. Whenever there is an opportunity to transmit, a peer combines all blocks it has to make a new coded block and sends it to the requesting peer.
For full-scale network coding P2P where all peers encode, [4][6] propose a mechanism by which, before downloading from a neighbor, a peer checks if the neighbor can provide it with meaningful blocks, i.e. blocks which are linearly independent from its own set of blocks. We call that a try-and-download approach which, compared to BitTorrent, requires a major update in the way peers exchange metadata:
1. a peer sends a request message to its neighbor,
2. the neighbor replies either with a newly generated encoding vector or with its decoding matrix¹,
3. the requesting peer downloads a newly coded block from the neighbor if the encoding vector or the neighbor's decoding matrix is independent from its own decoding matrix.
Try-and-download is synchronous in the sense that a peer has to be in sync with its neighbors by continuously checking if they can provide it with new data. Moreover, a receiving peer cannot know in advance exactly which and how many blocks it is going to receive from each neighbor to make a better choice. Such knowledge would help the receiving peer decide which blocks are most valuable to it.
In full-scale network coding systems where peers are more or less homogeneous in terms of computational resources to encode, try-and-download is feasible, yet with a protocol overhead. Within a hybrid network coding P2P system, however, not all peers can necessarily code. In such a scenario, requiring a resource-limited peer to frequently compare its own decoding matrix with the decoding matrices of its neighbors is beyond its capacity. We need a simple, yet effective, mechanism that every peer, encoding-enabled or not, can run.
To facilitate hybrid network coding P2P systems where encoding and non-encoding nodes mix together, we depart from the try-and-download approach and introduce an extension to BitTorrent metadata exchange. We furthermore propose a block selection algorithm to improve distribution time. Our proposed solution is backward-compatible with BitTorrent and requires virtually no more protocol overhead than pure BitTorrent, yet the performance improvement is noticeable compared with original network coding P2P systems.
¹ We explain encoding vector and decoding matrix in section 3.2.
3 SYSTEM MODEL
3.1 Hybrid Network Coding Peer-to-Peer System
We consider P2P content distribution from a source to many peers in which each peer maintains overlay connections to some random peers, i.e. its neighbors, to exchange data.
A file exists at the source and is distributed to all peers which, at the beginning, do not have any part of the file. The file is originally divided into K equal blocks, the same as in [3], which are transferred in the system in parallel. To accelerate throughput, some peers in the system encode while other peers do not. Since coded data exist in the system, however, all peers which have received coded data are required to decode to recover the original data.
As in BitTorrent systems [3][4], block exchange complies with two rules: (1) rarest-block-first selection at the receiver's side: receivers choose the rarest blocks in their neighborhood to download, and (2) an incentive scheme at the sender's side: senders send blocks to their neighbors reciprocally.
3.2 Random Linear Network Coding
Encoders in our system use random linear network coding (RLNC) [2][6] to create new coded blocks from the blocks they have received. In RLNC, an encoding vector of K coefficients is attached to each coded block to specify how that coded block is generated from the K original blocks. Suppose we have a coded block $C_0$ with encoding vector $(c_{01}, c_{02}, \dots, c_{0K})$, and K original blocks $B_1, B_2, \dots, B_K$. That means $C_0 = c_{01}B_1 + c_{02}B_2 + \dots + c_{0K}B_K$. The coefficients are drawn from, and the multiplication and addition operations take place in, a Galois field, e.g. $GF(2^8)$.
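The encoding step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the reduction polynomial 0x11B, the function names `gf_mul` and `encode`, and the toy block contents are all assumptions, since the paper only names GF(2^8) as an example field.

```python
import random

# Assumption: GF(2^8) reduced by x^8 + x^4 + x^3 + x + 1 (0x11B).
# The paper does not name a specific polynomial.
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements (shift-and-XOR, reduced by 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def encode(blocks: list[bytes], coeffs: list[int]) -> bytes:
    """Return C = sum_i coeffs[i] * blocks[i]; addition in GF(2^8) is XOR."""
    coded = bytearray(len(blocks[0]))
    for block, c in zip(blocks, coeffs):
        for pos, byte in enumerate(block):
            coded[pos] ^= gf_mul(c, byte)
    return bytes(coded)

# Hypothetical toy data: K = 3 original blocks of 4 bytes each.
originals = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
vector = [random.randrange(256) for _ in originals]  # encoding vector
coded_block = encode(originals, vector)
```

The encoding vector `vector` is what would travel with `coded_block`, so that a receiver knows how the block was formed.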
Now suppose a peer A, having received 2 blocks $C_1$ and $C_2$, wants to make a new coded block to send to a neighboring peer. Peer A will pick two random coefficients $a_1$ and $a_2$ and generate a new coded block C:
$C = a_1 C_1 + a_2 C_2 = (a_1 c_{11} + a_2 c_{21})B_1 + (a_1 c_{12} + a_2 c_{22})B_2 + \dots + (a_1 c_{1K} + a_2 c_{2K})B_K$.
The coded block C together with the K coefficients above is sent to the requesting peer.
At the receiving peer, all encoding vectors are stored in a decoding matrix with the corresponding coded data blocks. After a peer collects K independent coded blocks, i.e. the K associated encoding vectors form a full-rank matrix, it can decode to get the K original blocks by solving the set of K linear equations.
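The full-rank condition can be checked with Gaussian elimination over the field. The following Python sketch is an illustration under assumptions: it again takes GF(2^8) with reduction polynomial 0x11B, and the helper names `gf_inv`, `rank` and `can_decode` are hypothetical rather than taken from the paper.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements (assumed reduction polynomial 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Inverse of a nonzero element: a^254, since a^255 = 1 in GF(2^8)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def rank(rows: list[list[int]]) -> int:
    """Rank of a matrix of encoding vectors over GF(2^8)."""
    m = [list(r) for r in rows]
    found = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[found], m[pivot] = m[pivot], m[found]
        inv = gf_inv(m[found][col])
        m[found] = [gf_mul(inv, x) for x in m[found]]      # normalize pivot row
        for r in range(len(m)):
            if r != found and m[r][col]:
                factor = m[r][col]
                # subtraction equals XOR in characteristic 2
                m[r] = [x ^ gf_mul(factor, p) for x, p in zip(m[r], m[found])]
        found += 1
    return found

def can_decode(encoding_vectors: list[list[int]], k: int) -> bool:
    """True once the stored encoding vectors form a full-rank (rank K) matrix."""
    return rank(encoding_vectors) == k
```

The same `rank` routine also answers whether a newly announced encoding vector is independent of a peer's decoding matrix: the vector is useful exactly when appending it increases the rank.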
3.3 Network Coder Assignment
Since only some peers in the system encode, the questions are which peers will become network coders and who is responsible for assigning them.
In our view, peers at key locations of the network can selectively be assigned as network coders. We discuss this problem in detail in [15] where we propose to place network coders at nodes with high centrality [17][18] values. This approach, however, requires a centralized server to compute and assign coders. Practically, in P2P systems such as BitTorrent [3][4][19], we can allow trackers to do that task since the trackers know which peers currently join the torrents.
Network coders can also be assigned in a distributed manner without any centralized server by using, for example, degree information². Given a threshold, peers with degrees higher than the given value will become encoders.
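A degree-threshold rule of this kind is simple enough to state as code. The sketch below is only an illustration: the adjacency-dictionary representation, the function name `assign_encoders`, and the sample overlay are assumptions, and each peer could equally apply the test locally to its own neighbor count.

```python
def assign_encoders(neighbors: dict[str, set[str]], threshold: int) -> set[str]:
    """Peers whose overlay degree is strictly higher than the threshold encode."""
    return {peer for peer, nbrs in neighbors.items() if len(nbrs) > threshold}

# Hypothetical overlay: node A has degree 4, all other nodes have degree 1 or 2.
overlay = {
    "A": {"E", "F", "G", "H"},
    "E": {"A", "T"},
    "F": {"A", "T"},
    "G": {"A"},
    "H": {"A"},
    "T": {"E", "F"},
}
encoders = assign_encoders(overlay, threshold=3)
```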
In scenarios where computational resources are limited, we can approximately predict the amount of required resources, based on which a peer can determine, by itself, to become an encoder if it meets the resource requirements. Such an encoder assignment does not need a centralized server either.
4 INFORMATION EXCHANGE PROTOCOL
In pure BitTorrent without network coding, there are two phases to distribute blocks. These two phases interlace and take place asynchronously.
• Notification phase: after downloading a block, the downloading peer notifies its neighbors about the block it has just downloaded.
• Selection phase: whenever bandwidth is available for downloading, a peer, based on the information it has about which blocks are available in the neighborhood, chooses one block to download using a block selection algorithm. The download can then proceed if the downloading peer is currently unchoked by the neighbor who has the chosen block and the neighbor has enough bandwidth to sustain such a download. If that fails, the peer can repeat this process to choose another block. This phase stops when the peer runs out of bandwidth or has no more blocks to choose from.
In the following subsections, we concentrate on the protocols used to communicate between peers and the format of the exchanged metadata. We discuss the block selection algorithm in detail in the next section. The BitTorrent unchoking algorithm, which handles fairness and free-rider issues, is a topic in itself and is not discussed in this paper. We instead assume peers in our system are altruistic and willing to contribute their bandwidth.
4.1 Block Format
To identify data blocks, each block is associated with one unique block-id. However, one extension is needed to support network coding. Unlike non-coding systems in which the assignment is done by the source where all the blocks originate, in network coding P2P systems, that assignment is done where the block is created or originated: both at the source and at all the encoders. To assist our block selection algorithm, the block-id is generated in increasing order: a new block-id generated by a particular encoder is greater than all previous block-ids generated by that encoder.
With network coding, an encoding vector is attached to each coded block as described in the previous section. We propose an additional encoder-id field (Fig. 1) which stores the identification of the encoder who generated the coded block. Encoder-id will be used in our selection algorithm later on.
² Degree-based routing has been proposed in [20]. In this paper, however, we do not propose any network coder placement but devote ourselves to designing the protocol and data selection algorithm for the hybrid network coding system.
For each block, the metadata exchanged between neighbors in a notification message thus consists of three fields: block-id, its encoder-id, and its encoding vector (Fig. 1(a)). The data block consists of block-id, encoder-id, and the payload data (Fig. 1(b)). If the notification or data block is a non-coded one, its encoding vector and encoder-id can be omitted.
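One way to picture the two formats is as small record types. The Python sketch below is a hypothetical rendering: the class names, field types, and the per-encoder counter are assumptions made for illustration, while the field sets and the increasing block-id rule follow the description above.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

@dataclass(frozen=True)
class Notification:
    """Metadata announced to a neighbor (cf. Fig. 1(a))."""
    block_id: int
    encoder_id: Optional[str] = None          # omitted for non-coded blocks
    encoding_vector: Optional[tuple] = None   # omitted for non-coded blocks

@dataclass(frozen=True)
class DataBlock:
    """Block carrying the payload (cf. Fig. 1(b))."""
    block_id: int
    payload: bytes
    encoder_id: Optional[str] = None          # omitted for non-coded blocks

class Encoder:
    """Hypothetical helper that issues increasing block-ids for one encoder."""
    def __init__(self, encoder_id: str):
        self.encoder_id = encoder_id
        self._next_id = count(1)   # each new block-id exceeds all earlier ones

    def notify(self, encoding_vector) -> Notification:
        return Notification(next(self._next_id), self.encoder_id,
                            tuple(encoding_vector))

encoder_a = Encoder("A")
first = encoder_a.notify([7, 13])
second = encoder_a.notify([21, 2])
plain = Notification(block_id=5)   # a non-coded block: both extra fields omitted
```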
Having defined the block formats, we next present details of two communication protocols, either of which can be used in the hybrid network coding system.
• Pre-code protocol: encoding vectors of coded blocks are generated in the notification phase when encoders notify their neighbors about newly coded blocks.
• Post-code protocol: the encoding vector for a given coded block is generated in the selection phase, just before the block is downloaded.
We discuss the pros and cons of those two protocols subsequently.
Fig. 1. Notification and data block formats with the newly proposed encoder-id field.
Fig. 2. The pre-code protocol that peers use to communicate. There are two asynchronous phases: notification phase and selection phase. This protocol is an extension of BitTorrent: the notification messages and data blocks have an additional encoder-id. Encoding vectors are also attached to the notification messages as described in section 3.2.
4.2 Pre-code Protocol
Without the assumption that every peer can code, we propose a simple adaptation to the BitTorrent metadata exchange mechanism. To facilitate coding, in our system, if a peer is an encoder, for each newly downloaded block, the peer notifies each of its neighbors with the metadata of one newly encoded block. The newly encoded block differs from one neighbor to another. We note that to save computational resources, only the metadata, i.e. encoder-id, block-id, and the newly generated encoding vector of the encoded block (Fig. 1(a)), are sent to the neighbors in a notification message. Only when a neighbor decides to choose the notified coded block is the actual data of that block encoded. For an ordinary non-encoding peer, the metadata exchange is the same as in BitTorrent: the peer notifies its neighbors of the block it has just received. The communication protocol is illustrated in Fig. 2. Since the system is a hybrid network coding system, notifications (message 1) and data blocks (message 3) transferred between peers can be either encoded or original ones.
One might argue for using try-and-download here, but that would make the operation more complicated because each peer would have to implement two protocols: one for encoding-enabled neighbors, one for ordinary neighbors. With our approach, all a peer has to do is to choose from the candidate blocks one particular block to download based on the metadata it received in the notification phase, which is the same as what happens in a pure BitTorrent system.
When a peer receives notification of a newly encoded block from a neighbor, i.e. message 1 in Fig. 2, the peer stores that block in a candidate list if the block is independent from all blocks it has downloaded. Otherwise, it ignores the notification. Unlike encoding-enabled peers, non-encoding peers do not encode but forward what they have received: a mixture of coded and non-coded blocks. As in BitTorrent, when receiving notification from a non-encoding neighbor, a peer will update the count of that block, i.e. at how many neighbors the block exists.
When a peer can download, it sends a block request to the corresponding neighbor (message 2), and if the request is accepted, the neighbor will upload the data block to the requesting peer (message 3).
Coding generates a large number of coded blocks, usually larger than the number of original blocks, of which many blocks are redundant. As a peer continuously downloads new blocks, some blocks in its candidate list might become dependent on what it has downloaded. Each peer is therefore required to check and discard candidate blocks which are dependent on what has been downloaded.
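The receiver-side bookkeeping of the pre-code protocol (store independent candidates, prune those that become dependent) might look as follows. This is a deliberately simplified sketch: to keep it short, encoding vectors are modeled over GF(2) as integer bitmasks rather than over GF(2^8), block counts for non-coded blocks are not modeled, and the class and method names are hypothetical.

```python
def reduce(vec: int, basis: list[int]) -> int:
    """Reduce a GF(2) vector (bitmask) against the basis; 0 means dependent."""
    for b in basis:
        vec = min(vec, vec ^ b)   # clears b's leading bit if it is set in vec
    return vec

class PreCodeReceiver:
    """Simplified receiver state: GF(2) stands in for the paper's Galois field."""
    def __init__(self):
        self.basis = []        # reduced encoding vectors of downloaded blocks
        self.candidates = {}   # (encoder_id, block_id) -> encoding vector

    def on_notification(self, encoder_id, block_id, vector):
        # Message 1: keep the block only if it is independent of what we hold.
        if reduce(vector, self.basis):
            self.candidates[(encoder_id, block_id)] = vector

    def on_downloaded(self, encoder_id, block_id):
        # Message 3 arrived: extend the decoding basis, then discard every
        # candidate that has become dependent on the downloaded blocks.
        vector = self.candidates.pop((encoder_id, block_id))
        residual = reduce(vector, self.basis)
        if residual:
            self.basis.append(residual)
        self.candidates = {key: v for key, v in self.candidates.items()
                           if reduce(v, self.basis)}

peer = PreCodeReceiver()
peer.on_notification("A", 1, 0b011)
peer.on_notification("A", 2, 0b110)
peer.on_notification("G", 1, 0b101)   # equals 0b011 XOR 0b110
peer.on_downloaded("A", 1)
after_first = dict(peer.candidates)
peer.on_downloaded("A", 2)            # now the block from G is redundant
```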
4.3 Post-code Protocol
As we mentioned before, the notification phase and the selection phase are asynchronous. That is, after peer A notifies an encoded block in message 1, some amount of time passes before peer B requests that encoded block in message 2. The elapsed time can be arbitrarily long if, for example, peer B decides to download several blocks from other neighbors before choosing the encoded block from peer A. In the meantime, peer A might receive some new blocks. Using the pre-code protocol, that new information is not included in the encoded block since the way the block is generated, i.e. its encoding vector, was fixed at the notification time.
Encoders combine the blocks they currently have to make new coded blocks. If we can delay the act of encoding until just before the coded blocks are downloaded, we can provide the receiving peers with the most up-to-date information. Based on the above observation, we propose an alternative protocol, namely the post-code protocol, which is illustrated in Fig. 3.
Fig. 3. Using the post-code protocol, the encoding vector is generated not in the notification phase but just before the requested block is sent to the receiving peer.
Fig. 4. Nodes E and F receive notifications from nodes A, G, and H about the candidate blocks the two nodes can download, of which B1 and B2 are non-coded blocks from nodes G and H; A1-A4 are newly encoded blocks from encoder A.
The differences between the post-code protocol and the pre-code protocol are as follows.
• The encoding vector is not included in the notification message, i.e. message 1 in Fig. 3. Only encoder-id and block-id are notified to the neighbor (peer B) each time peer A downloads a new block. As stated before, encoder-id is the ID of peer A and block-id is an increasing number generated by peer A.
• The encoder (peer A) actually generates the encoding vector and sends it to the receiving peer in the selection phase (message 3) just before the actual coded data (message 5). The receiving peer (peer B) needs to check if that encoding vector is independent from its own decoding matrix before requesting the encoded block (message 4).
Fig. 5. With original rarest-first selection, there is a probability of 1/8 that nodes E and F choose the same block B1 or B2. The result is that node T can only download one new block while its bandwidth allows two blocks.
Fig. 6. If coded blocks from encoder A are preferred, nodes E and F can always download independent blocks. As a result, node T can utilize all its bandwidth to download 2 new independent blocks.
Fig. 7. Encoder A, having 2 blocks B1 and B2, notifies node E and node F with newly encoded blocks A1-A4.
Fig. 8. Encoder A, after downloading 2 new blocks B3 and B4, notifies node E and node F with blocks A5-A8 encoded from all 4 blocks B1-B4.
Fig. 9. With original rarest-first selection, there is a probability of 1/9 that node E chooses block A3 and node F chooses block A4. The result is that node T can only download 2 independent blocks A1 and A2 in 2 units of time. Blocks A3 and A4 in node E and node F are not useful to node T because they are dependent on A1 and A2.
Fig. 10. If the newest blocks are preferred, the 4 blocks downloaded by node E and node F are independent, which means node T can download in total 4 independent blocks in 2 units of time.
The post-code protocol has the advantage of producing fresher coded blocks, which are expected to accelerate content distribution. The limitation, however, is that it requires more protocol overhead: in total 5 messages for each downloaded block compared with 3 messages in the case of the pre-code protocol.
5 BLOCK SELECTION PROBLEM
In this section, we describe in detail the block selection problem associated with hybrid network coding systems and propose our solution for it. The proposed block selection, which can be used with either of the two protocols in the last section, completes our proposal for an efficient, high-performance P2P content distribution with network coding. We begin by describing the duplication problem in such a system using the original rarest-first block selection.
5.1 Duplication Problem with Current Rarest-first Block Selection
The block selection algorithm used by BitTorrent is rarest-first, by which peers choose the rarest block in the neighborhood to download first. If there are several rarest blocks, a random one is selected from those rarest blocks³.
Rarest-first selection is not enough for two reasons.
1) Encoders combine more information in the neighborhood. When there is limited available bandwidth, for example when a bottleneck exists, non-coded blocks and coded blocks cannot be given the same attention. Coded blocks from the encoders should be preferred because they contain, in a sense, more information and can accelerate content distribution through the bottleneck.
2) Coded blocks are not equally important. Even though each coded block is always unique, i.e. rare, in the sense that almost always no two coded blocks are identical, the level of importance of each coded block is different. Coded blocks are created progressively from all the blocks an encoder has downloaded. In the beginning, as there are only a few blocks to encode, the coded blocks created then contain only the information from those few blocks. The more blocks an encoder has, the more data are combined to create new coded blocks. Because of that, only at the source or when an encoder has downloaded the full file are the coded blocks equally important. In other cases, the most recently coded blocks likely contain more information.
To make it clear, we illustrate the problem in the two following examples.
³ In the beginning of the distribution session when peers have no blocks to exchange with others, BitTorrent uses random block selection by which peers choose a random block in the neighborhood to download. Nevertheless, after a peer has acquired some blocks, it switches to rarest block selection.
Example 1 (Fig. 4-6) illustrates a partial overlay topology with 6 nodes: A, G, H, E, F, T, of which A is the only encoder. Nodes A, G, H each have two blocks B1 and B2. Encoder A has notified nodes E and F of 4 blocks A1-A4, each node with two newly coded blocks. Nodes G and H have notified E and F of blocks B1 and B2. The count of each block in the neighborhood is given in the tables (Fig. 4). Suppose that, due to bottlenecks, E and F can each download only one new block. If E and F select blocks using the original rarest-first algorithm, there is a 1/8 chance that both will download the same block B1 or B2, which results in node T being able to download only one new block while its available bandwidth allows two (Fig. 5). In Fig. 6, if E and F prefer coded blocks from encoder A over other blocks, T can always download two new blocks.
Example 2 (Fig. 7-10) considers a partial overlay topology in which an encoder A is delivering coded blocks to non-coding nodes E, F, and T. At the beginning, A has two blocks B1 and B2, and notifies E and F of 4 newly encoded blocks A1-A4, two blocks for each node (Fig. 7). Node E and node F can then each download one block, e.g. A1 and A2, due to the bandwidth limit. In the meantime, A downloads two more blocks, B3 and B4, and sends new notifications about blocks A5-A8 to E and F (Fig. 8). If E and F select blocks using rarest-first, there is a 1/9 chance that E chooses A3 and F chooses A4, which results in peer T being able to obtain only 2 independent blocks in 2 units of time (Fig. 9). In contrast, T can download 4 new blocks if E and F prefer new encoded blocks over old ones (Fig. 10).
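The two probabilities quoted in the examples can be reproduced by enumerating the equally likely choices. The candidate sets in this Python sketch are assumptions reconstructed from the text, since the count tables of the figures are not reproduced here: every candidate is taken to be equally rare, and the assignment of A1-A8 to nodes E and F is one plausible reading.

```python
from fractions import Fraction
from itertools import product

def probability(choices_e, choices_f, bad):
    """Probability of a bad joint outcome when E and F each pick uniformly."""
    outcomes = list(product(choices_e, choices_f))
    hits = sum(1 for e, f in outcomes if bad(e, f))
    return Fraction(hits, len(outcomes))

# Example 1 (assumed candidates): E sees A1, A2, B1, B2; F sees A3, A4, B1, B2.
# Bad outcome: both pick the same non-coded block.
p_duplicate = probability(["A1", "A2", "B1", "B2"],
                          ["A3", "A4", "B1", "B2"],
                          lambda e, f: e == f)          # Fraction(1, 8)

# Example 2 (assumed candidates): E already holds A1 and sees A3, A5, A6;
# F already holds A2 and sees A4, A7, A8. Bad outcome: both pick the old block.
p_stale = probability(["A3", "A5", "A6"],
                      ["A4", "A7", "A8"],
                      lambda e, f: (e, f) == ("A3", "A4"))  # Fraction(1, 9)
```

In both cases the bad outcome comes purely from the random tie-break among equally rare blocks, which is the step the proposed selection algorithm replaces.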
The problem therefore is: given a mixture of coded and non-coded blocks in the neighborhood, which blocks should a peer choose to download?
5.2 Proposed Block Selection Algorithm
Our proposed algorithm is given in Algorithm 1. It works seamlessly in all types of networks: pure non-coding, full-scale network coding, and hybrid network coding.
ALGORITHM 1
PROPOSED BLOCK SELECTION ALGORITHM
Given a set of coded and non-coded blocks with their corresponding number of occurrences
1. Sort blocks in descending order of their rareness (i.e. ascending order of their occurrence).
2. Choose the rarest block for download. If there are several blocks with the same rareness, choose a block in the following order:
a. a block encoded by one neighbor, i.e. a coded block which has the encoder-id of the neighbor; if there are several coded blocks from a neighbor, prefer the block with the largest block-id (most recent one)
b. a block at random (coded or non-coded)
We extend rarest-first selection to give preference to coded blocks from immediate neighbors over other ones (Algorithm 1, step 2a). Also, from the same encoding neighbor, newer coded blocks are preferred over older ones. In doing so, we allow valuable newly encoded blocks in the neighborhood to be quickly disseminated while preserving the power of rarest-first in distributing new information. The improvement from our algorithm is generally significant. Without it, newly coded blocks, which carry virtually more information, can be held up arbitrarily in the network because neighboring peers may choose not to download them.
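Algorithm 1 translates fairly directly into code. The Python sketch below is one possible reading, not the simulator's implementation: the `Candidate` record, the `neighbor_ids` parameter, and the random choice between several encoding neighbors (a case Algorithm 1 leaves open) are assumptions introduced for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Candidate:
    block_id: int
    count: int                        # number of neighbors holding the block
    encoder_id: Optional[str] = None  # None for a non-coded block

def select_block(candidates, neighbor_ids, rng=random):
    """Rarest-first with the tie-break of Algorithm 1 (steps 2a and 2b)."""
    if not candidates:
        return None
    # Steps 1-2: restrict attention to the rarest blocks (lowest count).
    rarest_count = min(c.count for c in candidates)
    rarest = [c for c in candidates if c.count == rarest_count]
    # Step 2a: prefer blocks encoded by an immediate neighbor, newest first.
    newest = {}
    for c in rarest:
        if c.encoder_id in neighbor_ids:
            best = newest.get(c.encoder_id)
            if best is None or c.block_id > best.block_id:
                newest[c.encoder_id] = c
    if newest:
        # Assumption: with several encoding neighbors, pick one at random.
        return rng.choice(list(newest.values()))
    # Step 2b: otherwise a random block among the rarest ones.
    return rng.choice(rarest)

# Node E in Example 2 (assumed state): old block A3 and newer blocks A5, A6.
example2 = [Candidate(3, 1, "A"), Candidate(5, 1, "A"), Candidate(6, 1, "A")]
chosen = select_block(example2, neighbor_ids={"A"})
```

With no coded candidates the function degenerates to plain rarest-first with random tie-breaking, which matches the claim that the algorithm also works in pure non-coding networks.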
6 PERFORMANCE EVALUATION
We implemented a C++ simulator of the hybrid network coding P2P content distribution system. We evaluate the block-selection algorithm proposed in section 5 using either the pre-code or the post-code protocol of section 4 and compare the performance with a baseline network coding BitTorrent system. The baseline system uses BitTorrent's original rarest-first block selection and the pre-code protocol.
A file is distributed from the source to all participating peers, among which a preset number of peers are allowed to encode. The file is divided into smaller fixed-size parts, i.e. blocks. The source and all peers exchange blocks until all peers acquire enough blocks to reconstruct the original file; then the simulation finishes.
The simulations are round-based. At the beginning of each round, each peer chooses blocks to download according to its available bandwidth, rarest-block-first selection, and the incentive scheme. The chosen blocks are downloaded by the peer at the end of the round and then the system moves to the next round. After a peer has collected enough blocks, it stops downloading but stays in the system to serve other peers. Each overlay link capacity is measured in blocks per round, i.e. how many blocks can be transferred through the link in a round. We disregard the overhead of sending encoding coefficients associated with random linear coding.
To capture the essence of the system, we assume a static scenario, i.e. there is no change in the physical topology and the overlay topology during a content distribution session. The insight obtained from this static case is critically important for future work which investigates the dynamic scenario.
We implemented a mutual exchange incentive scheme in the simulations: when there is contention for uploading, a sending peer preferably uploads to the neighbors from whom it is also downloading. After such peers are exhausted, other neighbors are chosen for upload. This kind of incentive scheme has previously been used in [4].
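The mutual exchange rule amounts to a two-tier ordering of upload requests. A minimal Python sketch follows, assuming that ties within each tier are served in request order (the text does not specify this) and using the hypothetical name `choose_uploads`.

```python
def choose_uploads(requesters, downloading_from, capacity):
    """Serve reciprocal neighbors first, then the others, up to link capacity."""
    reciprocal = [p for p in requesters if p in downloading_from]
    others = [p for p in requesters if p not in downloading_from]
    return (reciprocal + others)[:capacity]

# Hypothetical round: three neighbors request uploads, the sender can serve two
# blocks this round, and it is currently downloading from T only.
served = choose_uploads(["E", "F", "T"], downloading_from={"T"}, capacity=2)
```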
For a given network topology we run simulations 100
times and collect the average finish time of all peers.
6.1 Clustered Topologies
We first evaluate performance in a simple topology of two clusters (Fig. 11). A middle node i sits between the source and the clusters to simulate a situation where blocks are coming progressively to node i. Within a cluster, peers are arranged in k-regular random topologies where k is from 3 to 6. Each cluster has 1000 nodes with 1 block per round bandwidth between neighbors within a cluster. Source bandwidth to node i is 8 blocks per round and from node i to each cluster is 4 blocks per round. The two clusters are connected by a link with a capacity of 1 block/round. The source delivers a 200-block file to all peers.
Fig. 12 compares the finish time of our system using the proposed block selection algorithm (with either pre-code or post-code protocol) and the finish time of the baseline system in three cases: no coding, coding at node i, and network coding. As we expect, the no coding finish time is the same for both systems. The finish time improvement of the proposed system becomes evident when node i codes (around 5%) and when all nodes code (around 10%). In this topology, the finish time is the same for both pre-code and post-code protocols.
6.2 Small-world Network Topologies
P2P networks have been shown to exhibit small-world properties [10]. We use the Watts and Strogatz small-world network model [7] to generate more complex topologies with 5000 peers. By varying the rewiring probability p of the small-world network we can change the network topology. We set the capacity of all links to 1 block/round and the degree to k=6, and vary the rewiring probability p across simulations.
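For readers unfamiliar with the Watts and Strogatz model [7], the construction can be sketched in plain Python as follows. This is an illustrative implementation written for this explanation, not the generator used for the reported results; the function name `watts_strogatz` and the `seed` parameter are assumptions.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return an undirected small-world graph as {node: set(neighbors)}.

    Start from a ring lattice in which each node links to its k nearest
    neighbors (k even), then rewire each lattice edge with probability p.
    """
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)

    for j in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if w not in adj[v] or rng.random() >= p:
                continue  # keep this lattice edge
            # New endpoints must avoid self-loops and duplicate edges.
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if not candidates:
                continue
            new_w = rng.choice(candidates)
            adj[v].discard(w)
            adj[w].discard(v)
            adj[v].add(new_w)
            adj[new_w].add(v)
    return adj
```

With the parameters of this section, a topology would be generated by `watts_strogatz(5000, 6, 0.02)`. Rewiring replaces one edge with another, so the number of links stays at n*k/2 for every p. Libraries such as NetworkX provide an equivalent generator (`watts_strogatz_graph`).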
Fig. 11. A 2-cluster topology with a middle node i.
Fig. 12. Finish time of the proposed system compared with baseline system in a clustered topology.

We simulate two real-life scenarios.
1) Optimization scenario: encoders are placed at selected peers to minimize distribution time. We use two placement methods:
• betweenness centrality placement [15]: nodes with high betweenness centrality values [9] are chosen as encoders. Betweenness centrality measures the degree to which a node lies on the shortest paths between other nodes. Coding at high betweenness centrality peers can improve content distribution to more downstream peers (see footnote 4).
• degree-based placement: encoders are placed at nodes with high degrees first.
2) Resource-constraint scenario: nodes with higher capacity can encode; nodes with limited resources cannot. Among the peers, we pick some at random, treat them as resource-rich, and assign them as encoders.
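The three ways of choosing encoders described above (betweenness centrality, degree, and random) can be sketched as follows. Betweenness is computed here with Brandes' algorithm for unweighted graphs, which is a standard technique and an assumption on our part; the paper does not state how the values [9] were computed. The names `betweenness` and `place_encoders` are hypothetical.

```python
import random
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted graph {node: neighbors}.

    On an undirected graph every pair is counted in both directions, so the
    scores are twice the usual undirected values; the ranking is unchanged.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def place_encoders(adj, count, method, seed=None):
    """Return `count` encoder nodes chosen by the given placement method."""
    nodes = sorted(adj)
    if method == "random":
        return random.Random(seed).sample(nodes, count)
    if method == "degree":
        key = {v: len(adj[v]) for v in nodes}
    elif method == "betweenness":
        key = betweenness(adj)
    else:
        raise ValueError(f"unknown method: {method}")
    # Highest score first; ties broken by node id for reproducibility.
    return sorted(nodes, key=lambda v: (-key[v], v))[:count]
```

On a five-node path 0-1-2-3-4, the middle node 2 lies on the most shortest paths and is the first encoder picked by the betweenness method.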
We increase the number of encoders from 0 (no coding) to 5000 (network coding) and compare the performance of the proposed system with the baseline system in the two scenarios above. The results are given in Fig. 13, Fig. 14, and Fig. 15 with rewiring probability p=0.02.
When betweenness centrality is used to optimize encoder placement, the proposed block selection together with the pre-code protocol shortens distribution time by about 15% compared to the baseline system with only 250 encoders (Fig. 13). With more encoders, the improvement is higher and reaches more than 25% when all nodes encode. The finish time of no coding, i.e. when the number of encoders is zero, is the same regardless of which block selection algorithm and protocol are used.
With 2000 or fewer encoders chosen at random, i.e. a set of random peers is allowed to encode, there is not much finish time improvement with either rarest-first or the proposed block selection (Fig. 15). This is because a few randomly chosen encoders, without proper placement, are not effective in improving distribution time. When a large number of encoders, e.g. 3000 or 4000, are randomly deployed, the proposed block selection with the pre-code protocol can improve distribution time by around 15% compared to the baseline system.
The finish time using degree-based placement lies between those of the other two placements. As before, the proposed system achieves a noticeable finish time improvement compared to the baseline system using rarest-first selection (Fig. 14).

4 We discuss the strategy to place encoders in another work [15].
We next change the rewiring probability p from 0.02 to
0.05 to evaluate our system in a wide range of topologies.
Using 250 encoders among the total of 5000 peers, the
finish time improvement compared with non-coding Bit-
Torrent (with no encoder) is presented in Fig. 16, Fig. 17,
and Fig. 18.
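The text does not spell out how the finish time improvement is computed. One natural reading, stated here as an assumption, is the relative reduction in average finish time with respect to the non-coding case, where the symbols T_nc (average finish time of non-coding BitTorrent) and T_sys (average finish time of the system under test) are introduced only for this illustration:

```latex
\[
  \text{improvement} \;=\; \frac{T_{\mathrm{nc}} - T_{\mathrm{sys}}}{T_{\mathrm{nc}}} \times 100\%
\]
```

With purely hypothetical values T_nc = 400 rounds and T_sys = 340 rounds, this definition would give an improvement of 15%.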
Our proposed system (proposed selection + pre-code and proposed selection + post-code) always achieves improved performance compared with the baseline system. The improvement, however, is more visible in topologies with low rewiring probabilities (p=0.02). High rewiring probabilities generate almost random topologies in which the effect of coding, in general, is not so noticeable.
The performance of the post-code protocol (proposed selection + post-code) is better than that of the pre-code protocol (proposed selection + pre-code) in low-rewiring topologies with p=0.02 (Fig. 13-15). The reason is that with the post-code protocol, encoders can combine more updated information to send to the receivers, as we have discussed before. In topologies with higher rewiring probabilities (p ≥ 0.1) (Fig. 16, 17, and 18), since new information can travel through more rewired links, the post-code protocol is no longer more effective than the pre-code protocol. We note that in the simulations, we have not taken into account the overhead of the protocols used to communicate between peers.

Fig. 13. Finish time of the proposed system when choosing nodes with highest betweenness centrality as encoders compared with finish time of baseline system.
Fig. 14. Finish time using the proposed system compared with a baseline system in case encoders are placed at high-degree peers.
Fig. 15. The performance of the proposed system compared with baseline system in resource-constraint scenario when only (random) high-capacity peers are allowed to encode.
7 CONCLUSION
We have proposed information exchange protocols and their associated block-selection algorithm to improve the performance of a hybrid network coding P2P system in which encoding-enabled and non-encoding peers coexist. Our design is simple and backward compatible with BitTorrent, yet efficient in the way it handles blocks of data, coded and non-coded alike.
We proposed two protocols. The first one, the pre-code protocol, is an extension of BitTorrent with the addition of an encoder-id field in the exchanged messages to identify the peer that generated the blocks. The second one, the post-code protocol, by postponing the encoding process, can combine and deliver more updated information to the receivers and achieve a shorter finish time. The post-code protocol is more effective in severely bottlenecked topologies. The trade-off, however, is higher protocol overhead.
Our block-selection algorithm is derived from an observation on the benefit of network coding in eliminating data duplication. Using our proposed algorithm, peers can effectively choose blocks to download, which results in a considerable improvement in distribution time.
We believe our proposed solution, which promotes network coding as a method to shorten distribution time even if encoding is not fully enabled at every peer, will be of great use in heterogeneous P2P systems and/or when there is a need to minimize resource consumption.
For future work, we plan to evaluate the proposed design and block selection algorithm in a dynamic setting. We are also interested in implementing the proposal in a real system, especially to evaluate the actual trade-off and effectiveness of the post-code protocol.
In this paper, we have not addressed incentive issues, i.e. how to motivate peers to encode, which is another interesting problem we leave for future work.
ACKNOWLEDGMENT
This work was supported by JSPS KAKENHI Grant
Number (24500098).
REFERENCES
[1] R. Ahlswede, N. Cai, S. R. Li, and R. W. Yeung, "Network Information Flow", IEEE Transactions on Information Theory, July 2000.
[2] T. Ho, R. Koetter, M. Medard, D. Karger, and M. Effros, "The Benefits of Coding over Routing in a Randomized Setting", ISIT, Japan, 2003.
[3] B. Cohen, "Incentives Build Robustness in BitTorrent", P2P Economics Workshop, 2003.
[4] C. Gkantsidis and P. R. Rodriguez, "Network Coding for Large Scale Content Distribution", IEEE INFOCOM, March 2005.
[5] A. Legout, G. Urvoy-Keller, and P. Michiardi, "Rarest first and choke algorithms are enough", ACM SIGCOMM IMC 2006.
[6] P. Chou, Y. Wu, and K. Jain, "Practical network coding", in Proc. Annual Allerton Conference 2003.
[7] D. J. Watts and S. H. Strogatz, "Collective dynamics of 'small-world' networks", Nature 393 (6684): 440-442, June 1998.
[8] M. Langberg, A. Sprintson, and J. Bruck, "The encoding complexity of network coding", IEEE Trans. on Information Theory, pp. 2386-2397, June 2006.
[9] L. C. Freeman, "A set of measures of centrality based on betweenness", Sociometry, vol. 40, no. 1, pp. 35-41, 1977.
[10] N. Leibowitz, M. Ripeanu, and A. Wierzbicki, "Deconstructing the Kazaa network", in Proc. WIAPP 2003.
Fig. 16. Finish time improvement of the proposed system and baseline system using 250 encoders placed at high betweenness centrality peers compared with non-coding BitTorrent.
Fig. 17. Finish time improvement of the proposed system and baseline system using 250 encoders placed at high-degree peers compared with non-coding BitTorrent.
Fig. 18. Finish time improvement of the proposed system and baseline system using 250 encoders placed at random peers compared with non-coding BitTorrent.
[11] D. Nguyen and H. Nakazato, "Peer-to-Peer Content Distribution in Clustered Topologies with Source Coding", IEEE GLOBECOM 2011, Dec. 2011.
[12] N. Cleju, N. Thomos, and P. Frossard, "Network Coding Node Placement for Delay Minimization in Streaming Overlays", IEEE ICC 2010.
[13] S. Li, R. Yeung, and N. Cai, "Linear network coding", IEEE Transactions on Information Theory, 2003.
[14] D. Nguyen and H. Nakazato, "Rarest-first and Coding Are Not Enough", IEEE GLOBECOM 2012, Dec. 2012.
[15] D. Nguyen and H. Nakazato, "Network Coder Placement for Peer-to-Peer Content Distribution", IEICE Tech. Report CS2012-74, Nov. 2012.
[16] C. Gkantsidis, J. Miller, and P. Rodriguez, "Comprehensive view of a live network coding P2P system", ACM SIGCOMM IMC '06.
[17] L. C. Freeman, "A set of measures of centrality based on betweenness", Sociometry, vol. 40, no. 1, pp. 35-41, 1977.
[18] L. C. Freeman, S. P. Borgatti, and D. R. White, "Centrality in valued graphs: a measure of betweenness based on network flow", Social Networks 13, pp. 141-154, 1991.
[19] A. Legout, G. Urvoy-Keller, and P. Michiardi, "Understanding BitTorrent: An Experimental Perspective", Tech. Report INRIA-00000156, version 3, Nov. 2005.
[20] C. Yin, B. Wang, W. Wang, T. Zhou, and H. Yang, "Efficient routing on scale-free networks based on local information", Physics Letters A, vol. 351, issues 4-5, pp. 220-224, 6 March 2006.
[21] M. Kim, M. Medard, V. Aggarwal, U.-M. O'Reilly, W. Kim, C. W. Ahn, and M. Effros, "Evolutionary Approaches to Minimizing Network Coding Resources", Proc. IEEE INFOCOM 2007, May 2007.
[22] K. Bhattad, N. Ratnakar, R. Koetter, and K. R. Narayanan, "Minimal network coding for multicast", Proc. IEEE ISIT 2005, Sep. 2005.
Dinh Nguyen received his Bachelor of Electronics and Telecomm. degree from Hanoi University of Technology, Vietnam, in 1999. From 1999 he was with NetNam ISP Corporation. He received his MSc in 2006 and is currently a Ph.D. candidate at the Graduate School of Global Information and Telecommunications Studies, Waseda University, Tokyo, Japan. His research interests include peer-to-peer systems, network coding, and content distribution systems.

Hidenori Nakazato received his B. Engineering degree in electronics and telecommunications from Waseda University in 1982 and his MS and Ph.D. degrees in computer science from University of Illinois in 1989 and 1993, respectively. He was with Oki Electric from 1982 to 2000. Since 2000, he has been a faculty member of the Graduate School of Global Information and Telecommunications Studies, Waseda University. He served as the editor of IEICE Transactions on Communications from 1999 to 2002 and in other positions in the executive committee of IEICE Communication Society from 1997 to 2004, and from 2008 to now. He also served as an executive committee member of IEEE Region 10 and is serving as a member of several IEEE Member and Geographic Activity committees. His research interests include performance issues in distributed systems and networks. He is a member of ACM, IEEE, and IPSJ.