14
Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks Dongyu Qiu and R. Srikant Coordinated Science Lab and Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign {dqiu,rsrikant}@uiuc.edu Abstract In this paper, we develop simple models to study the performance of BitTorrent, a second generation peer- to-peer (P2P) application. We first present a simple fluid model and study the scalability, performance and efficiency of such a file-sharing mechanism. We then consider the built-in incentive mechanism of Bit- Torrent and study its effect on network performance. We also provide numerical results based on both sim- ulations and real traces obtained from the Internet. 1 Introduction Peer-to-Peer (P2P) applications have become im- mensely popular in the Internet. Traffic measure- ments shows that P2P traffic is starting to dominate the bandwidth in certain segments of the Internet [2]. Among P2P applications, file sharing is perhaps the most popular application. Compared to traditional client/sever file sharing (such as FTP, WWW), P2P file sharing has one big advantage, namely, scalabil- ity. The performance of traditional file sharing appli- cations deteriorates rapidly as the number of clients increases, while in a well-designed P2P file sharing system, more peers generally means better perfor- mance. There are many P2P file sharing programs, such as Kazza, Gnuttella, eDonkey/overnet, BitTor- rent, to name a few. In this paper, we develop simple models to understand and study the behavior of Bit- Torrent [8] which is proving to be one of the more popular P2P applications today. For a BitTorrent network (or a general P2P file sharing network), several issues have to be addressed in order to understand the behavior of the system. Peer Evolution: In P2P file sharing, the num- ber of peers in the system is an important fac- tor in determining network performance. There- fore, it is useful to study how the number of peers evolves as a function of the request ar- rival rate, the peer departure rate, the upload- ing/downloading bandwidth of each peer, etc. Scalability: To realize the advantages of P2P file sharing, it is important for the network perfor- mance to not deteriorate, and preferably to actu- ally improve, as the size of the network increases. Network performance can be measured by the average file downloading time and the size of the network can be characterized by the number of peers, the arrival rate of peers, etc. File Sharing Efficiency: It is common for peers in a P2P network to have different up- loading/downloading bandwidths. Further, in BitTorrent-like systems, a file may be broken into smaller pieces and the pieces may be dis- tributed at random among the peers in the net- work. To efficiently download the file, it is im- portant to design the file-sharing protocol such that each peer is matched with others who have the pieces of the file that it needs and further, to ensure that the downloading bandwidth of each peer is fully utilized. Incentives to prevent free-riding: Free-riding is a major cause for concern in P2P networks. Free- riders are peers who try to download from oth- ers while not contributing to the network, i.e., by not uploading to others. Thus, most P2P networks try to build in some incentives to de- 1

Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

Modeling and Performance Analysis of BitTorrent-Like

Peer-to-Peer Networks

Dongyu Qiu and R. Srikant

Coordinated Science Lab

and

Department of Electrical and Computer Engineering

University of Illinois at Urbana-Champaign

{dqiu,rsrikant}@uiuc.edu

Abstract

In this paper, we develop simple models to study theperformance of BitTorrent, a second generation peer-to-peer (P2P) application. We first present a simplefluid model and study the scalability, performanceand efficiency of such a file-sharing mechanism. Wethen consider the built-in incentive mechanism of Bit-Torrent and study its effect on network performance.We also provide numerical results based on both sim-ulations and real traces obtained from the Internet.

1 Introduction

Peer-to-Peer (P2P) applications have become im-mensely popular in the Internet. Traffic measure-ments shows that P2P traffic is starting to dominatethe bandwidth in certain segments of the Internet [2].Among P2P applications, file sharing is perhaps themost popular application. Compared to traditionalclient/sever file sharing (such as FTP, WWW), P2Pfile sharing has one big advantage, namely, scalabil-ity. The performance of traditional file sharing appli-cations deteriorates rapidly as the number of clientsincreases, while in a well-designed P2P file sharingsystem, more peers generally means better perfor-mance. There are many P2P file sharing programs,such as Kazza, Gnuttella, eDonkey/overnet, BitTor-rent, to name a few. In this paper, we develop simplemodels to understand and study the behavior of Bit-Torrent [8] which is proving to be one of the morepopular P2P applications today.

For a BitTorrent network (or a general P2P filesharing network), several issues have to be addressedin order to understand the behavior of the system.

• Peer Evolution: In P2P file sharing, the num-ber of peers in the system is an important fac-tor in determining network performance. There-fore, it is useful to study how the number ofpeers evolves as a function of the request ar-rival rate, the peer departure rate, the upload-ing/downloading bandwidth of each peer, etc.

• Scalability: To realize the advantages of P2P filesharing, it is important for the network perfor-mance to not deteriorate, and preferably to actu-ally improve, as the size of the network increases.Network performance can be measured by theaverage file downloading time and the size of thenetwork can be characterized by the number ofpeers, the arrival rate of peers, etc.

• File Sharing Efficiency: It is common forpeers in a P2P network to have different up-loading/downloading bandwidths. Further, inBitTorrent-like systems, a file may be brokeninto smaller pieces and the pieces may be dis-tributed at random among the peers in the net-work. To efficiently download the file, it is im-portant to design the file-sharing protocol suchthat each peer is matched with others who havethe pieces of the file that it needs and further, toensure that the downloading bandwidth of eachpeer is fully utilized.

• Incentives to prevent free-riding: Free-riding is amajor cause for concern in P2P networks. Free-riders are peers who try to download from oth-ers while not contributing to the network, i.e.,by not uploading to others. Thus, most P2Pnetworks try to build in some incentives to de-

1

Page 2: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

ter peers from free-riding. Once the incentivemechanism is introduced into the network, eachpeer may try to maximize its own net benefitwithin the constraints of the incentive mecha-nism. Thus, it is important to study the effectof such behavior on the network performance.

1.1 Relationship to prior work

The basic idea of P2P network is to have peers par-ticipate in an application level overlay network andoperate as both servers and clients. Since the ser-vice burden is distributed to all participating peers,the system is expected to scale well even when thenetwork is very large. Besides file sharing, P2P over-lays have also been deployed in distributed directoryservice [18, 21], web cache [15], storage [9], and gridcomputation [1] etc.

While early work on P2P systems has mainlyfocused on system design and traffic measurement[19, 20, 17], some recent research has emphasized per-formance analysis. In [13], a closed queueing systemis used to model a general P2P file sharing systemand basic insights on the stationary performance areprovided. In [6, 7], a stochastic fluid model is usedto study the performance of P2P web cache (SQUIR-REL) and cache clusters. A part of our work is mo-tivated by the models in [11, 24], where a branch-ing process is used to study the service capacityof BitTorrent-like P2P file sharing in the transientregime and a simple Markovian model is presentedto study the steady-state properties. Our work dif-fers from [11, 24] in the following respects:

• Instead of studying the Markov chain numeri-cally, we develop a simple deterministic modelwhich allows us to obtain simple expressionsfor the average file-transfer time, thus provid-ing insight into the performance of the P2P net-work. We also incorporate realistic scenarios inour fluid model such as the abandonment of filetransfers by peers and download bandwidth con-straints.

• Then, we develop a simple stochastic fluid modelwhich characterizes the variability of the numberof peer around the equilibrium values predictedby the deterministic fluid model.

• We also develop a simple model to study the ef-ficiency of downloading from other peers and ar-gue that the file-sharing protocol in BiTorrent isvery efficient.

• Finally, we consider the mechanisms built intoBitTorrent to avoid free-riding and study the im-pact of these mechanisms on the users’ behaviorsand network performance.

2 A brief description of BitTor-

rent

BitTorrent is a P2P application whose goal is to facil-itate fast downloads of popular files. Here we providea brief description of how BitTorrent operates whena single file is downloaded by many users. Typicallythe number of simultaneous downloaders for popularfiles could be of the order of a few hundreds whilethe total number of downloaders during the lifetimeof a file could be of the order of several tens or some-times even hundreds of thousands. The basic idea inBitTorrent is to divide a single large file (typically afew 100 MBytes long) into pieces of size 256 KB each.The set of peers attempting to download the file do soby connecting to several other peers simultaneouslyand download different pieces of the file from differentpeers.

To facilitate this process, BitTorrent uses a cen-tralized software called the tracker. In a BitTor-rent network, a peer that wants to download a filefirst connects to the tracker of the file. The trackerthen returns a random list of peers that have thefile. The downloader then establishes a connection tothese other peers and finds out what pieces reside ineach of the other peers. A downloader then requestspieces which it does not have from all the peers towhich it is connected. But each peer is allowed toupload only to a fixed number (default is four) ata given time. Uploading is called unchoking in Bit-Torrent. Which peers to unchoke is determined bythe current downloading rate from these peers, i.e.,each peer uploads to the four peers that provide itwith the best downloading rate even though it mayhave received requests from more than four down-loaders. This mechanism is intended to deter free-riding. Since a peer is only uploading four other peersat any time, it is possible that a peer, say Peer A, maynot be uploading to a peer, say Peer B, which couldprovide a higher downloading rate than any of thepeers to which Peer A is currently uploading. There-fore, to allow each peer to explore the downloadingrates of other peers, BitTorrent uses a process calledoptimistic unchoking. Under optimistic unchoking,each peer randomly selects a fifth peer from which

2

Page 3: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

it has received a downloading request and uploadsto this peer. Thus, including optimist unchoking, apeer may be uploading to five other peers at any time.Optimistic unchoking is attempted once every 30 sec-onds and to allow optimistic unchoking while keep-ing the maximum number of uploads equal to five, anupload to the peer with the least downloading rate isdropped.

BitTorrent distinguishes between two types ofpeers, namely downloaders and seeds. Downloadersare peers who only have a part (or none) of the filewhile seeds are peers who have all the pieces of the filebut stay in the system to allow other peers to down-load from them. Thus, seeds only perform uploadingwhile downloaders download pieces that they do nothave and upload pieces that they have. Ideally, onewould like an incentive mechanism to encourage seedsto stay in the system. However, BitTorrent currentlydoes not have such a feature. We simply analyze theperformance of BitTorrent as is.

In practice, a BitTorrent network is a very com-plicated system. There may be hundreds of peersin the system. Each peer may have different partsof the file. Each peer may also have different up-loading/downloading bandwidth. Further, each peeronly has partial information of the whole networkand can only make decisions based on local informa-tion. In addition, BitTorrent has a protocol (calledthe rarest-first policy) to ensure a uniform distribu-tion of pieces among the peers and protocols (call theendgame mode) to prevent users who have all but afew of the pieces from waiting too long to finish theirdownload. As with any good modelling exercise, wetradeoff between the simplicity of the model and itsability to capture all facets of the protocol. Thus,we will first use a simple fluid model to study thescalability and the stability of the system. We willthen assume that each peer has the global informa-tion and study the incentive mechanism of BitTor-rent. We will finally briefly study the effect of theoptimistic unchoking on free-riding.

3 A Simple Fluid Model

Our model for file-sharing is influenced by the modelin [11]. However, while [11] only uses the model todevelop a Markov chain which is then studied nu-merically, we use the key modelling idea in [11] todevelop a simple deterministic fluid model which isamenable to analysis and provides insights into the

system performance.In our model, we use the following quantities to

capture a BitTorrent peer-to-peer network [8] thatserves a given file (without loss of generality, we as-sume that the file size is 1):

x(t) number of downloaders (also known as leechers)in the system at time t.

y(t) number of seeds in the system at time t.

λ the arrival rate of new requests.

µ the uploading bandwidth of a given peer. We as-sume that all peers have the same uploadingbandwidth.

c the downloading bandwidth of a given peer. Weassume that all peers have the same downloadingbandwidth and c ≥ µ.

θ the rate at which downloaders abort the download.

γ the rate at which seeds leave the system.

η indicates the effectiveness of the file sharing, whichwe will describe shortly. η takes values in [0, 1].

In a BitTorrent-like P2P network, a downloadercan upload data to other peers even though it mayonly have parts of a file. The parameter η is used toindicate the effectiveness of this file sharing. If thereis no constraint on downloading bandwidth, the to-tal uploading rate of the system can be expressed asµ(ηx(t) + y(t)). If η = 0, then the downloaders donot upload data to each other and only downloadfrom seeds. When the downloading bandwidth con-straint is considered, the total uploading rate will bemin{cx(t), µ(ηx(t) + y(t))}.

Next, we comment on the parameters θ and γ. Adownloader may not stay in the system till it com-pletely downloads the file. Occasionally, a down-loader may leave the network before the download-ing is complete if he/she feels that the download istaking too long. We use the parameter 1/θ to de-note the average amount of time that a downloaderwill stay in the system before deciding to abort thedownload. Equivalently, θ is the rate at which down-loaders abort their download and leave the system. Ina fluid model, the rate of departures of downloaderswill be given be

min{cx(t), µ(ηx(t) + y(t))} + θx(t).

3

Page 4: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

While the departures that occur due to the factthat the file download has been completed will be-come seeds instantaneously, the remaining download-ers will permanently leave the system. The parameterγ is the rate at which seeds depart from the network.Clearly, γ will have an effect on system performance:the lower the γ, the lower the download times sincethis means that there will more seeds in the system.This parameter γ can be influenced by providing in-centives for users to stay in the system after theyhave downloaded the file, i.e., after they have becomeseeds. However, BitTorrent currently does not havesuch incentives and therefore, we simply consider γto be a fixed constant.

Now, we are ready to describe the evolution of xand y based on the above model. A deterministicfluid model for the evolution of the number of peers(downloaders and seeds) is given by

dx

dt= λ− θx(t) − min{cx(t), µ(ηx(t) + y(t))},

dy

dt= min{cx(t), µ(ηx(t) + y(t))} − γy(t), (1)

along with the obvious constraint that x(t) and y(t)should be non-negative. A key contribution of [11]was to describe the efficiency of data transfer fromother downloaders using the parameter η. Our fluidmodel provides a simple description of the systemthat was described by a Markov chain in [11]. In ad-dition, we have incorporated other realistic scenariossuch as departures of downloaders due to impatiencewith the downloading process (described by θ) anddownloading bandwidth constraint c. In a later sub-section, we will also present a simple stochastic fluidmodel that characterizes the variability around thefluid model. We now study the steady-state perfor-mance of the P2P system using the above fluid model.

3.1 Steady-State Performance

To study the system in steady-state, we let

dx(t)

dt=

dy(t)

dt= 0

in (1) and obtain

0 = λ− θx− min{cx, µ(ηx+ y)},0 = min{cx, µ(ηx+ y)} − γy(t), (2)

where x and y are the equilibrium values of x(t) andy(t) respectively.

We first assume η > 0. Further, suppose thatthe downloading speed is the constraint, i.e., cx ≤µ(ηx+ y). Equation (2) then becomes a simple linearequation. Solving the equation, we have

x =λ

c(1 + θc)

y =λ

γ(1 + θc). (3)

Now, the assumption that cx ≤ µ(ηx+y) is equivalentto

1

c≥ 1

η(1

µ− 1

γ).

Instead, if we assume that the uploading band-width is the constraint, i.e., cx ≥ µ(ηx+ y), we get

x =λ

ν(1 + θν)

y =λ

γ(1 + θν), (4)

where 1ν

= 1η( 1

µ− 1

γ). From cx ≥ µ(ηx+ y), we have

1

c≤ 1

ν=

1

η(1

µ− 1

γ).

Define 1β

= max{ 1c, 1

η( 1

µ− 1

γ)}, then (3) and (4)

can be combined to yield

x =λ

β(1 + θβ)

y =λ

γ(1 + θβ). (5)

To calculate the average downloading time for apeer in steady state, we use Little’s law [4] as follows:

x = (λ− θx)T,

where T is the average downloading time and λ− θxis the average rate at which downloads are completed.Using (5), it is now easy to see that

T =1

β= max{1

c,1

η(1

µ− 1

γ)}. (6)

Equation (6) provides several insights into the be-havior of BitTorrent:

• The average downloading time T is not related toλ, the request arrival rate. Hence, the BitTorrentP2P system scales very well.

4

Page 5: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

• When η increases, T decreases. This is becausethe peers share the file more efficiently.

• When γ increases, T increases because a larger γmeans that there are fewer seeds in the system.

• Initially, when c increases, T decreases. How-ever, once c is large enough ( 1

c≤ 1

η( 1

µ− 1

γ)),

increasing c further will not decrease T, becausethe downloading bandwidth is no longer the bot-tleneck. A similar observation can be made re-garding the uploading bandwidth µ.

• While the average numbers of downloaders andseeds (x and y) are functions of θ, the down-loader leaving rate, the average downloadingtime T is independent of θ. The intuition behindthis is as follows: when a downloader aborts itsfile transfer, it reduces the number of download-ers competing for resources in the system. Onthe other hand, it also reduces the number ofpeers that are uploading the file in the system.These two effects precisely cancel each other outin the fluid model leaving the average downloadtime independent of θ.

• It is often true that the downloading bandwidthc of a peer would be much higher than its upload-ing bandwidth µ. Common examples of such anasymmetry are DSL and cable modem connec-tions. For performance analysis purposes, it maybe tempting to set c = ∞ as in [11, 24]. How-ever, the expression for T in (6) shows that theaverage download time is not always constrainedby the uploading bandwidth of the peers. Infact, if the seed leaving rate γ is smaller than µ,then the downloading bandwidth c determinesthe network performance even though c may bemuch larger than µ.

We briefly comment on the case η = 0, whichmeans that downloaders do not upload data to eachother and only download from seeds. If γ < µ, theprevious analysis as in the case of η > 0 still holdsand T = 1/c. On the other hand, if γ > µ, from (1),we can see

dy(t)

dt≤ (µ− γ)y(t).

This tells us that y(t) decreases at least exponentially.So if γ > µ, the number of seeds will exponentiallydecrease to zero and the system dies. Recall thatwhen η > 0, the system reaches a steady state nomatter what γ is. So, it is very important for the

downloaders to upload data to each other. Even if thefile sharing is not very efficient (a small η), it can playan important role in keeping the system alive. From(6), we also see that η is important to the networkperformance. In the next subsection, we will derivean expression for η and argue that η is very close to1 in BitTorrent.

3.2 Effectiveness of File Sharing

In this section, we present a simple model to calculatethe value of η, which indicates the effectiveness of thefile sharing. For a given downloader i, we assume thatit is connected to k = min{x−1,K} other download-ers, where x is the number of downloaders in the sys-tem andK is the maximum number of downloaders towhich a peer can connect. We also assume that eachdownloader has the information about which piecesthe connected peers have. Hence if peer i has piecesthat are of interest to at least one peer that is con-nected to it, then peer i will upload data. We thenhave

η = 1 − P

{

downloader i has no piece thatthe connected peers need

}

.

We assume that the piece distributions between dif-ferent peers are independent and identical. Then

η = 1 − P

{

downloader j needs nopiece from downloader i

}k

,

where j is a downloader connected to i.For each downloader, we assume that the number

of pieces it has is uniformly distributed in {0, · · · , N−1}, where N is the number of pieces of the servedfile. Let ni denote the random variable describingthe number of pieces at downloader i. We assumethat given ni, these pieces are chosen randomly fromthe set of all pieces of the file. This is a reasonable as-sumption because BitTorrent takes a rarest first pieceselection policy when downloading. Under these as-sumptions, we have

P

{

downloader j needs nopiece from downloader i

}

= P{j has all pieces of downloader i}

=

N−1∑

nj=1

nj∑

ni=0

1

N2P{j has all pieces of i|ni, nj}

=

N−1∑

nj=1

nj∑

ni=0

1

N2

(

N−ni

nj−ni

)

(

Nnj

)

5

Page 6: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

=

N−1∑

nj=1

1

N2

(

N+1nj

)

(

Nnj

)

=

N−1∑

nj=1

N + 1

N2(N + 1 − nj)

=N + 1

N2

N−1∑

nj=1

1

N + 1 − nj

=N + 1

N2

N∑

m=2

1

m≈ logN

N

and

η ≈ 1 −(

logN

N

)k

. (7)

Now, we will interpret the expression for realis-tic file sizes. In BitTorrent, each piece is typically256KB. Thus, for a file that is a few hundreds ofmegabytes in size, N is of the order of several hun-dreds. Hence, even if k = 1, η is very close to one. ForBitTorrent, k is actually larger, since the maximumnumber of connections K is typically 40. This tellsus that BitTorrent is very efficient in sharing files.When k increases, η also increases but very slowlyand the network performance increases slowly. Notethat, since k depends on the number of other peersin the system, it may be related to the arrival rateλ. Hence, when λ increases, the network performanceincreases but very slowly. Thus, our observation inthe previous subsection that the network performanceis essentially independent of λ still holds. This alsomatches the observations of real BitTorrent networkspresented in [11, 24]. Note that when k = 0, thedownloader is not connected to any other download-ers and hence η = 0.

3.3 Local Stability

When deriving the steady-state quantities x, y andT, we implicitly assumed that the system is stableand will reach its equilibrium. In this section, westudy the stability of the fluid model (1) around theequilibrium {x, y}.

When 1c< 1

η( 1

µ− 1

γ), the uploading bandwidth is

the constraint and around a small neighborhood of{x, y}, we have

dx(t)

dt= λ− θx(t) − µ(ηx(t) + y(t))

dy(t)

dt= µ(ηx(t) + y(t)) − γy(t).

Let

A1 =

[

−(µη + θ) −µµη −(γ − µ)

]

. (8)

Then the eigenvalues of A1 determine the stability ofthe equilibrium {x, y}. Let ψ be an eigenvalue of A1.The eigenvalues of A are the solutions of

ψ2 + (µη + θ + γ − µ)ψ + µηγ + θ(γ − µ) = 0. (9)

Since 1c< 1

η( 1

µ− 1

γ), we have γ > µ. When η > 0,

both µη + θ + γ − µ and µηγ + θ(γ − µ) are greaterthan zero. So the eigenvalues have strictly negativereal parts and the system is stable.

Similarly, when 1c> 1

η( 1

µ− 1

γ), the download-

ing bandwidth is the constraint and around a smallneighborhood of {x, y}, we have

dx(t)

dt= λ− θx(t) − cx(t)

dy(t)

dt= cx(t) − γy(t).

Let

A2 =

[

−(θ + c) 0c −γ

]

. (10)

Then the eigenvalues of A2 satisfy

ψ2 + (θ + γ + c)ψ + (θ + c)γ = 0. (11)

Again, since both θ + γ + c and (θ + c)γ are greaterthan zero, we see that the eigenvalues have strictlynegative real parts and the system is stable.

The case where 1c

= 1η( 1

µ− 1

γ) is a little more tricky

since the dynamics are determined by the matrix A1

or the matrix A2, depending upon the direction inwhich the system is perturbed. Thus, a linear anal-ysis will not suffice to even determine local stability.To avoid lengthy arguments, we do not consider thisspecial case here.

Even in the cases where 1c6= 1

η( 1

µ− 1

γ), the global

stability of the fluid model (1) may be hard to analyzebecause of the fact that the dynamics of the systemchanges depending upon whether cx > µ(ηx + y) ornot. Such systems are called switched linear systems;we refer the reader to the survey in [16] for the sta-bility issues associated with such models.

3.4 Characterizing Variability

When the request arrival rate is large (which alsomeans a large number of downloaders and seeds), thefluid model is a good approximation of the real sys-tem. However, it is important to understand how the

6

Page 7: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

number of seeds and downloaders vary around thenumbers predicted by the deterministic model. Inthis subsection, we present a simple characterizationof the variance of x and y around x and y using aGaussian approximation.

To calculate the variance around the fluid limit,we make a few assumptions regarding the peer ar-rival process, the downloader and seed departure pro-cesses. We assume that peers arrive according to aPoisson process, and we assume that each downloaderindependently aborts its download after a certainamount of time which is exponentially distributedwith mean 1/θ. Each seed stays in the system for arandom time which is exponentially distributed withmean 1/γ. Finally, to obtain a Markovian descriptionof the system, we assume that the probability thatsome downloader becomes a seed in a small intervalδ is given by min(cx, µ(ηx+ y))δ. These assumptionscan be easily relaxed to allow more general distri-butions for all the random variables involved by us-ing phase-type distributions as in [14, 22, 10]. Underthese assumptions, the number of downloaders andseeds at any time t can be described by

x(t) +√λx(t), y(t) +

√λy(t),

respectively, where X = (x, y)T is the solution to thefollowing stochastic fluid differential equation whosesolution is known as the Ornstein-Uhlenbeck process:

dX(t) = AX(t)dt+ BdW(t). (12)

In (12), the components of W are independent stan-dard Wiener processes (Brownian motions), with theentries of A and B being determined by whether thedownloading or the uploading bandwidth is the bot-tleneck. Specifically, A = A1 given by (8) if 1

c<

1η( 1

µ− 1

γ) and A = A2 given by (10) if 1

c> 1

η( 1

µ− 1

γ).

In both cases, we have

B =

[

1 −√ρ −

(1 − ρ) 0

0 0√

(1 − ρ) −√

(1 − ρ)

]

(13)

where ρ := θθ+β

. We do not consider the more com-

plicated case 1c

= 1η( 1

µ− 1

γ) which is unlikely to occur

in practice.From (12), it is easy to compute the steady-state

covariance of X, i.e, Σ = limt→∞E(X(t)XT (t)). Thisis given by the so-called Lyapunov equation [3]

AΣ + ΣAT + BBT = 0. (14)

The steady-state variance of x is then given by (1, 1)element of Σ and the steady-state variance of y is

given by the (2, 2) element of Σ. The above resultessentially states that, in steady-state, the numberof seeds and downloaders is distributed as Gaussianrandom variables whose variances are determined byΣ.

The formal proof required to establish (12) isbeyond the scope of this paper. We will simplystate here that it involves showing that the originalstochastic process converges to the deterministic andstochastic differential equation limits when the arrivalrate goes to ∞. This can be established using weak-convergence theorems such as the ones in [5, 12, 23].

4 Incentive Mechanism

In this section, we discuss the algorithm in BitTorrentwhich is intended to discourage free-riding. We firstdescribe the algorithm and then study the optimalselfish behavior of the users under this algorithm.

4.1 Peer Selection Algorithm

There is a built-in incentive mechanism in BitTorrentto encourage users to upload. The basic idea is thateach peer uploads to nu peers from which it has thehighest downloading rates (the default value of nu is4). But since a peer only has partial information ofthe whole network (i.e., it doesn’t have the uploadrate information of all peers), optimistic unchoking[8] is used to explore the network. In this section,our objective is to understand how the built-in in-centive mechanism affects the network performance.Hence, we ignore the details of optimistic unchokingand assume that each peer has the global informationof uploading rates. We also assume that there areno downloading bandwidth constraints, all peers arefully connected and have demands from each other.

Under the above assumptions, we can simplify thepeer selection algorithm of BitTorrent as follows. Wefirst sort the peers according to their uploading band-width (it could be the physical uploading bandwidthor the uploading bandwidth that has been set man-ually by the user) such that the first peer has thehighest uploading bandwidth. If two or more peershave the same uploading bandwidth, they are ran-domly ordered. The peer selection process proceedsin steps with peer i choosing peers to upload at stepi. In the real BitTorrent, the peer selection does notproceed in steps like this. However, after we describethe selection algorithm, it would be clear that the

7

Page 8: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

step-by-step selection process does not change the se-lection of the peers significantly. Let N be the totalnumber of peers and let µi be the uploading band-width of peer i. Then at step i, peer i selects peersto upload according to the following rules.

1. If peer i is selected by peer j (j < i), then iselects j. For any peer k (k ≥ i), let ni

k be thenumber of peers that have selected peer k priorto step i.

2. If nii < nu and nu − ni

i ≤ N − i, peer i selectsnu − ni

i peers from the set {k|k > i} using thefollowing set of rules to prioritize a peer, say k1,over another peer k2 :

(a) If µk1 > µk2, select k1.

(b) If µk1 = µk2 and nik1 < ni

k2, select k1.

(c) If µk1 = µk2, nik1 = ni

k2, and k1 < k2, selectk1.

3. If nii < nu and nu − ni

i > N − i, peer i selectsall peers in {k|k > i} and also randomly selects(nu − ni

i) − (N − i) peers from the peers that ihas not selected yet.

These rules are easy to understand. Rule 1 statesthat if the downloading rate from peer j to peer i isgreater than or equal to the uploading rate of i, peeri should upload to peer j to try to keep the down-loading rate high. We will show in Lemma 1 thatni

i ≤ nu. So rule 1 will not violate the requirementthat the number of uploads cannot exceed nu. Rule2(a) simply gives priority to peers with higher up-loading rates. Rule 2(b) tries to treat peers with thesame uploading rate as fairly as possible and rule 2(c)is simply a tie-break rule. Rule 3 takes care of the lastseveral peers and makes sure that all peers have nu

uploads. The following lemma is a simple propertyof the peer selection algorithm.

Lemma 1 With the peer selection algorithm, whenpeer i selects uploading peers, we have ni

i ≤ nu andfor any k2 > k1 ≥ i, ni

k2 ≤ nik1 ≤ nu.

Proof: First, when i = 1, nii = 0 ≤ nu and ni

k2 =ni

k1 = 0 ≤ nu, the lemma is true. Now, we assumethat the lemma is true for peer i and prove that it isalso true for peer i + 1 and hence by induction, it istrue for all i.

If the lemma is true for i, we will have nii+1 ≤ ni

i ≤nu and ni

k2 ≤ nik1 ≤ nu for any k2 > k1 ≥ i + 1.

Now, if nii = nu, then peer i already has nu uploads

and it will not select any peer from {k|k > i}. Hence,for any k > i, ni

k = ni+1k and the lemma is true for

i + 1. If nii < nu, then ni

i+1 < nu. So, no matterwhether peer i selects peer i + 1 or not, we alwayshave ni+1

i+1 ≤ nu. To show the second part of thelemma, if ni

k1 > nik2, after peer i makes the selection,

we always have ni+1k1 ≥ ni+1

k2 . If nik1 = ni

k2, accordingto rule 2, we also have ni+1

k1 ≥ ni+1k2 . Hence the lemma

is true for i+ 1.Now let Di be the set of peers that select peer i.

We exclude peers that randomly select i by using rule3 here for two reasons. First, each peer i has aboutequal chance to be selected and hence on average,the effect of the random selection can be equivalentlyseen as each peer getting a constant download ratedr. Secondly, if the number of peers is large, dr willbe very small and can be ignored. The aggregatedownloading rate of peer i then is

di =1

nu

k∈Di

µk.

Note that if two peers have the same uploading band-width, they may get different downloading rates.Generally, if µi = µi+1 = · · · = µj are peerswith the same uploading bandwidth, we will havedi ≥ di+1 ≥ · · · ≥ dj . So, for a given peer i, thedownloading rate not only depends on the uploadingbandwidth µi, but also depends on how the peer isordered with regards to other peers with the sameuploading bandwidth. To eliminate the ambiguity,when there are two or more peers with the same up-loading bandwidth µ, we define the downloading rateof these peers to be

d(µ) =1

j − i+ 1

j∑

k=i

dk, (15)

where i (resp. j) is the first (resp. last) peer withuploading bandwidth µ. Moreover, we have the fol-lowing lemma when nu ≥ 2.

Lemma 2 Suppose that peers i, i+ 1, · · · , j have thesame uploading bandwidth µ, where i (resp. j) is thefirst (resp. last) peer with uploading bandwidth µ. Ifj + i− 1 > nu ≥ 2, then for any k > j, we have

1. di ≥ di+1 ≥ · · · ≥ dj ≥ dk,

2. di > dk,

3. d(µ) > dk.

8

Page 9: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

Proof: First, from the peer selection rules, it iseasy to see that for any two peers k1 < k2, dk1 ≥ dk2.So condition 1 is obviously true. Now, to prove condi-tion 2, we only need to prove di > dj+1. When peer iselects peers, if ni

i = nu (i.e., peer i has already beenselected by nu peers that have uploading bandwidthgreater than µ), then di > µ. If ni

i < nu, peer i willselect nu − ni

i peers from i + 1, · · · , j. So, we alwayshave di ≥ µ.

Now, when peer m (i ≤ m ≤ j − nu) selects peers,if nm

m+1 ≥ 1, obviously we will have nm+1m+1 ≥ 1. If

nmm+1 = 0, since peers m and m + 1 have the same

uploading bandwidth, from peer selection rule 2(b),we have nm

m ≤ 1 < nu and m will select peer m + 1.Hence nm+1

m+1 = 1. In both case, we have nm+1m+1 ≥ 1.

When m + 1 selects peers, since nm+1m+1 ≥ 1, nu −

nm+1m+1 ≤ nu − 1. From m+2 to j, we have more thannu − 1 peers with uploading bandwidth µ. So, m+ 1will not select peer j+1 and peer j+1 can at most beselected by nu − 1 peers with uploading bandwidthµ. So

dj+1 ≤ 1

nu

((nu − 1)µ+ µj+2) < µ ≤ di.

Since dj+1 ≥ dk for any k ≥ j + 1, we have di > dk.From the condition 2 and the definition of d(µ)

(15), it is easy to see that d(µ) > dk and we aredone.

Now, we have defined the peer selection rules. Wewill next study how these rules affect the peer’s choiceof µi, the uploading bandwidth.

4.2 Peer Strategy

The objective of the incentive mechanism is to en-courage users to contribute. In BitTorrent, the up-loading bandwidth can be chosen by each user up to amaximum of the physical uploading bandwidth. Thepurpose of the rest of this section is to study howthe incentive mechanism will affect the peer strat-egy, i.e, how the users set their bandwidth. Let pi

be the physical uploading bandwidth of peer i andlet {µ−i} be the set of uploading bandwidth chosenby the peers except µi. Let di(µi, µ−i) be the aggre-gate downloading rate of peer i when the uploadingbandwidth of peer i is µi. When {µ−i} is given, it isobvious that di is a non-decreasing function of µi. Sowhen µi = pi, peer i gets the maximum downloadingrate. But setting µi = pi is not necessarily the beststrategy for peer i. For each peer i, di is the gain andµi is the cost. A peer wants to maximize the gain,

but at the same time, it also wants to minimize thecost. Here, we assume that maximizing the gain haspriority over minimizing the cost. Intuitively, we maywant peer i to choose µi such that

µi = min{µi|di(µi, µ−i) = di(pi, µ−i)}. (16)

But unfortunately, the minimum of the set{µi|di(µi, µ−i) = di(pi, µ−i)} may not exist (e.g., forthe set (4, 6]). If we take this into account, the beststrategy for peer i will be

µi = min {inf{µi|di(µi, µ−i) = di(pi, µ−i)} + ε, pi} ,(17)

where ε > 0 is a small number. The parameter ε canbe interpreted as the small difference between tworates that a peer can differentiate. Note that even ifthe minimum of {µi|di(µi, µ−i) = di(pi, µ−i)} exists,it is still better to add a small number ε. Because ifthe uploading bandwidth of two peers are very close,we may not be able to detect the difference betweenthem. Hence, adding a small positive number canhelp differentiate peer i from other competing peers.

Given the peer selection algorithm (game rules),we can now study the system as a non-cooperativegame. A Nash equilibrium for our problem is a set ofuploading rates {µi} such that

µi = min {inf{µi|di(µi, µ−i) = di(pi, µ−i)} + ε, pi} .

Let’s consider a small BitTorrent network with 6peers. The number of uploads nu = 4 for all peers.We will show that if the peers have different physi-cal uploading bandwidth and the minimum upload-ing bandwidth min{pi} > 2ε, there is no Nash equi-librium point for the system. In this simple exam-ple, we can see that if the uploading bandwidth µi ofpeer i is less than those of all other peers, then peeri will get zero downloading rate because the otherfive peers will upload to each other and not to peeri. On the other hand, once µi is greater than theuploading bandwidth of at least one peer, peer i willget the same downloading rate even if µi < pi. Sothe strategy for peer i (17) in this example turns outto setting µi such that it is the fifth highest upload-ing bandwidth. Now, assume that there is a Nashequilibrium point {µi} and we sort the peers by theiruploading bandwidth such that µ1 is the highest up-loading bandwidth. Then we have µ5 > µ6. Other-wise, if they are equal, since the two peers have dif-ferent physical uploading bandwidth, there is at leastone peer with µi < pi and this peer can increase its

9

Page 10: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

uploading bandwidth to increase its download rate.Now, if µ5 > µ6, we know that peer 6 gets a zerodownloading rate. Since {µi} is a Nash equilibrium,given {µ−6}, the maximum downloading rate thatpeer 6 can get is also zero. Hence, from (17), wehave µ6 = ε. Now, if µ6 = ε, from ( 17), we haveµ5 = 2ε < min{pi}. If µ5 < min{pi}, peer 6 canincrease its uploading bandwidth such that µ6 > µ5,which contradicts the fact that µ5 is the fifth high-est uploading bandwidth. Hence, there is no Nashequilibrium point for the system. While there maybe no Nash equilibrium point for a general networksetting, when the network consists of groups of peerswhere members of each group have the same upload-ing and downloading bandwidths, there does exist aNash equilibrium point as we will show in the nextsubsection.

4.3 Nash Equilibrium Point

We consider a network with a finite number of groupsof peers. In group j, all peers have the same physicaluploading bandwidth pj . Note that this is in fact agood model for the current Internet users, who haveonly a finite number of network access methods (dial-up, dsl, cable modem, etc). Let gj be the set of peersin group j and ||gj || be the number of peers in groupj. Without loss of generality, we also assume p1 >p2 > · · ·.

Proposition 1 If nu ≥ 2 and the number of peers ina group ||gj || > nu + 1 for all groups, there exists aNash equilibrium point for the system, in which µi =pj if peer i ∈ gj. Moreover, with any initial setting of{µ0

i }, the system converges to the Nash equilibriumpoint {µi}.

Proof: We first prove that {µi} is a Nash equi-librium point. To prove this, we only need to provethat for any peer i, if µi < µi, then di(µi, µ−i) <di(µi, µ−i). Without loss of generality, we assumethat i ∈ gj . Since ||gj || > nu + 1, if we set µi <µi = pj , there will be still at least nu + 1 peers withuploading bandwidth pj . From Lemma 2, it is easyto see that di(µi, µ−i) < di(µi, µ−i).

To prove convergence, we first consider the firstgroup g1. Let vm be the (nu + 1)th highest upload-ing bandwidth after m rounds of iterations. Thenv0 is the (nu + 1)th highest uploading bandwidth ofthe initial set {µ0

i }. If after m rounds, vm + ε ≤ p1,then in the m + 1 round, any peer i ∈ g1 will in-crease its uploading bandwidth to µi ≥ vm + ε to

maximize its downloading rate. Since ||g1|| > nu +1,after the m + 1 round, we will have vm+1 ≥ vm + ε.The increase in vm will continue until vm = p1 andthe peers cannot increase their uploading bandwidthanymore. In this case, any peer i ∈ g1 will have theuploading bandwidth µi = p1. Once peers in thefirst group reach their maximum limit, they will notchange their uploading bandwidth anymore. We cannow use a similar argument to prove that peers inthe second group will also reach the Nash equilib-rium point. Continuing in a similar fashion, we canestablish that the whole system converges to the Nashequilibrium point.

5 Optimistic Unchoking

In Section 4, we assume that each peer knows the up-loading bandwidths of all other peers. In reality, eachpeer only has the rate information about peers fromwhich it is downloading. Hence optimistic unchok-ing is used to explore the network and obtain infor-mation about other peers. In this section, we brieflystudy the effect of optimistic unchoking on free-riders.Specifically, while in Section 4.3, we showed that ra-tional users would set their uploading rate to be equalto the maximum possible limit, here we will showthat the maximum downloading rate that an irra-tional user who chooses to free-ride is limited to 1

nu+1of the normal downloading rate that they can get ifthey behave rationally.

5.1 Free-Riding

Free-riding means that a peer does not contributeanything to the system, while it attempts to obtainservice (or downloading) from other peers. If peershave global information, the free-riding problem canbe solved by not uploading to peers with zero up-loading bandwidth. In reality, peers use optimisticunchoking to explore the network and this gives anopportunity to free-ride. To illustrate it, let’s con-sider a simple example.

We consider a network with a group of peers (g1)that have the same uploading bandwidth µ. Thenumber of peers in the group is N . We assume thateach peer has nu uploads and one optimistic unchok-ing upload. Now, a new peer j with zero uploadingbandwidth joins the network. Each peer i ∈ g1 willrandomly choose a peer that it is not currently up-loading to as the target of its optimistic unchoking.So, for peer i, on average, 1

N−nuof the time, it will

10

Page 11: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

optimistically upload to peer j. Since there is a totalof N peers in g1, the total average downloading rateof peer j will be

N1

N − nu

µ

nu + 1≈ µ

nu + 1,

when N is large.In this example, we see that because of optimistic

unchoking, peer j contribute nothing to the system,but it still get an average downloading rate of µ

nu+1 .In current BitTorrent, nu = 4 and thus a free-ridergets 20% of the possible maximum downloading rate.It would seem that nu can be increased to reduce theamount that a free-rider can get. However, choosingnu to be large means that multiple TCP connectionshave to share the same bandwidth and thus may leadto more time-outs and result in poor performance.The choice of an optimal nu or other methods to al-leviate the free-riding problem is a subject for furtherstudy.

6 Experimental Results

We performed a series of experiments to validate thefluid model described in Section 3. In the first twoexperiments, we compare a simulated BitTorrent-likenetwork and the fluid model. In the last experiment,we actually introduced a seed into the BitTorrent net-work, studied the evolution of the seeds/downloaders,and compared it to our fluid model results. Due tocopyright reasons, we obviously could not introducea very popular file into the network. However, as wewill show in our experimental results, even for a filewhich had a total of less than 100 completed down-loads, the match between the fluid model and theobserved data is quite close.

6.1 Experiment 1

In Figs 1 and 2, we compare the simple determin-istic fluid model that we derived with the resultsfrom a discrete-event simulation of a BitTorrent-likenetwork. In the discrete-event simulation, we usethe Markov model described in Section 3.4. Wechose the following parameters for this simulation:µ = 0.00125, c = 0.002, θ = γ = 0.001. When thenumber of downloaders is 1, we set η = 0, otherwise,we set η = 1. This is in keeping with our observa-tion regarding the efficiency of the download as de-scribed in Section 3.2. Initially, there is one seed and

0 1000 2000 3000 4000 50000

100

200

300

400

500

600

700

800

time (min)

Nor

mal

ized

num

ber o

f see

ds

λ=0.04 λ=0.4 λ=4 λ=40 simple fluid model

Figure 1: Experiment 1 : The evolution of the numberof seeds as a function of time

0 1000 2000 3000 4000 50000

50

100

150

200

250

300

350

400

450

500

time (min)

Nor

mal

ized

num

ber o

f dow

nloa

ders

λ=0.04 λ=0.4 λ=4 λ=40 simple fluid model

Figure 2: Experiment 1 : The evolution of the numberof downloaders as a function of time

no downloader. We also keep the number of seedsno less than one during the entire simulation. Wechange the arrival rate λ from 0.04 to 40 and plotnumber of seeds/downloaders normalized by the ar-

rival rate, i.e., y(t)λ

and x(t)λ, from both simulations

and the fluid model. In this experiment, since γ < µ,we know that downloading bandwidth is the bottle-neck. From the figures, we see that the simple fluidmodel is a good approximation of the system when λis large, but the match is quite good even for smallλ. The figures also indicate that the number of down-loaders increases linearly with the arrival rate λ. ByLittle’s law, this implies that the average downloadtime is constant, independent of the peer arrival rate,which shows that the system scales very well. In other

11

Page 12: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

words, even very popular files can be downloaded atthe same speed as less popular files.

6.2 Experiment 2

0 1000 2000 3000 4000 50000

50

100

150

200

250

time (min)

Nor

mal

ized

num

ber o

f see

ds

λ=0.04 λ=0.4 λ=4 λ=40 simple fluid model

Figure 3: Experiment 2 : The evolution of the numberof seeds

0 1000 2000 3000 4000 50000

50

100

150

200

250

300

350

400

450

500

time (min)

Nor

mal

ized

num

ber o

f dow

nloa

ders

λ=0.04 λ=0.4 λ=4 λ=40 simple fluid model

Figure 4: Experiment 2 : The evolution of the numberof downloaders

In Figs. 3 and 4, we have the same setting as thefirst experiment, except that now we set γ = 0.005.With the change of γ, the uploading bandwidth nowbecomes the bottleneck. In this setting, we have thesimilar result as before. Again, we see that the simplefluid model is accurate when λ is large, but performswell even for smaller λ. We also plot the histogram of

−40 −20 0 20 40 600

200

400

600

800

1000

1200

1400

1600

y*

His

togr

am o

f y*

λ=0.04 λ=0.4 λ=4 λ=40

Figure 5: Experiment 2 : Histogram of the variationof the number of seeds around the fluid model

−60 −40 −20 0 20 40 600

200

400

600

800

1000

1200

x*

His

togr

am o

f x*

λ=0.04 λ=0.4 λ=4 λ=40

Figure 6: Experiment 2 : Histogram of the variationof the number of downloaders around the fluid model

x and y in Figs. 5 and 6,

x(t) =xsim(t) − x(t)√

λ

and

y(t) =ysim(t) − y(t)√

λ,

where xsim(t) and ysim(t) are the number of down-loaders and seeds respectively in the actual simula-tion and x(t) and y(t) are the number of download-ers and seeds in deterministic fluid model. From thetheory presented in Section 3.4, we expect the his-tograms to look roughly Gaussian and this fact isborne out by the figures for sufficiently large λ. We

12

Page 13: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

can see that the variance of x and y do not changemuch when λ changes from 0.04 to 40.

6.3 Experiment 3

In this experiment, we introduced a file into theBitTorrent network and collected the log files ofthe BitTorrent tracker for a time period of aroundthree days. When a peer joins/leaves the systemor completes the download, it reports the event tothe tracker. In addition, peers regularly report in-formation such as the total amount of data up-loaded/downloaded so far, the number of bytes thatstill need to be downloaded, etc. The tracker keepsall the information in the log files. Hence, we can an-alyze the tracker log files and retrieve useful informa-tion. The parameters λ, θ, and γ can be measured bycounting the peer arrival, the downloader departure,and the seed departure respectively. However, fromthe tracker log files, we cannot determine whether theuploading bandwidth or the downloading bandwidthis the bottleneck. So we assume the uploading band-width is the bottleneck and estimate µ by dividingthe measured total uploading rate by the number ofpeers (i.e., we assume that η = 1). The size of the filethat was introduced was around 530MB. The aver-age uploading bandwidth was estimated to be 90kb/s.We use 1 min as the time unit to calculate arrivalrates, departure rates, etc. The normalized upload-ing bandwidth (normalized by the file size in bytes)was estimated µ = 0.0013. The downloader leavingrate was estimated to be θ = 0.001. An interestingfeature that we observed in the real BitTorrent is thatλ and γ are in fact time-varying. We attribute this tothe fact that when a new file is introduced into thesystem, the first few seeds stay in the system longenough to ensure that there is a sufficient populationof peers to sustain the system. If the initial seedsdepart too quickly, the system will simply die, i.e.,there will be no one to download from.

From the tracker logs, we estimate that, for t ≤800min, λ = 0.06 and γ = 0.001. When t ≥1300min, λ = 0.03 and γ = 0.0044. In between,the arrival rate increases roughly linearly. In ourfluid model simulation, for time between 800min and1300min, we let λ and γ change linearly. We also setthe downloading bandwidth c = 1 for the fluid modelsimulation (note that the actual value of c will notaffect the fluid model results if it is above a certainthreshold).

The simulation results are shown in Figs 7 and 8.

0 1000 2000 3000 4000 50000

2

4

6

8

10

12

14

16

18

20

time (min)

num

ber o

f see

ds

fluid model real trace

Figure 7: Experiment 3 : Evolution of the number ofseeds

0 1000 2000 3000 4000 50000

5

10

15

20

25

time (min)

num

ber o

f dow

nloa

ders

fluid model real trace

Figure 8: Experiment 3 : Evolution of the number ofdownloaders

The real trace is measured from the tracker log fileand the fluid model is calculated by using the abovemeasured parameters. For the fluid model, we alsonumerically calculate the standard deviation from thesteady state network parameters by using (14) andplot the error bar for 95% confidence intervals. FromFig. 7, we see that the fluid model captures the evo-lution of the number of seeds well. In Fig. 8, theoscillation of the number of downloaders is more sig-nificant. This is because that the file is not very pop-ular and the arrival rate λ is small. Hence, our modelis only an approximation of the real network. But de-spite this, we can see that the oscillation is within thelevel suggested by the 95% confidence interval.

13

Page 14: Modeling and Performance Analysis of BitTorrent-Like Peer ...harchol/WORMS04/people/srikant/srikant.pdf · Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks

7 Conclusions

In this paper, we first presented a simple fluid modelfor BitTorrent-like networks and studied the steady-state network performance. Specifically, we obtainedexpressions for the average number of seeds, the av-erage number of downloaders, and the average down-loading time as functions of the peer arrival rate,downloader leaving rate, seed leaving rate, upload-ing bandwidth, etc, which explicitly give us insighton how the network performance is affected by dif-ferent parameters. We also characterized the vari-ability of the system by applying limit theorems tothe stochastic model when the arrival rate is large.We then abstracted the built-in incentive mechanismof BitTorrent and studied its effect on network per-formance. Under certain conditions, we proved thata Nash equilibrium exists, under which each peerchooses its physical uploading bandwidth to be equalto the actual uploading bandwidth. We also brieflydiscussed the effect of optimistic unchoking on free-riding. Our experimental results show that the simplefluid model can capture the behavior of the systemeven when the arrival rate is small.

References

[1] Entropia. http://www.entropia.com.

[2] Top applications (bytes) for subin-terface: SD-NAP traffic, 2002.www.caida.org/analysis/workload/byapplication/sdnap.

[3] L. Arnold. Stochastic Differential Equations: Theoryand Applications. John Wiley, New York, NY, 1974.

[4] D. Bertsekas and R. Gallager. Data Networks. Pren-tice Hall, Englewood Cliffs, NJ, 1987.

[5] P. Billingsley. Convergence of Probability Measures.Wiley, 1968.

[6] F. Clevenot and P. Nain. A Simple Fluid Modelfor the Analysis of SQUIRREL. Technical report,INRIA RR-4911, 2003.

[7] F. Clevenot, P. Nain, and K. Ross. Stochastic FluidModels for Cache Clusters. Technical report, INRIARR-4815, 2003.

[8] B. Cohen. Incentives build robustness in bittorrent,May 2003. http://bitconjurer.org/BitTorrent/bittorrentecon.pdf.

[9] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris,and I. Stoica. Wide-area cooperative storage withCFS. In Proceedings of the 18th ACM Symposium onOperating Systems Principles (SOSP ’01), ChateauLake Louise, Banff, Canada, October 2001.

[10] A. Das and R. Srikant. Diffusion approximationsfor a single node accessed by congestion-controlledsources. IEEE Transactions on Automatic Control,45(10):1783–1799, October 1998.

[11] G. de Veciana and X. Yang. Fairness, incentivesand performance in peer-to-peer networks. In theForty-first Annual Allerton Conference on Commu-nication, Control and Computing, Monticello, IL,Oct. 2003.

[12] S. N. Ethier and T. G. Kurtz. Markov Processes:Characterization and Convergence. Wiley, 1994.

[13] Z. Ge, D. R. Figueiredo, S. Jaiswal, J. Kurose, andD. Towsley. Modeling peer-peer file sharing systems.In Proc. of Infocom, 2003.

[14] P. W. Glynn. On the Markov property of theGI/G/∞ Gaussian limit. Advances in Applied Prob-ability, 14:191–194, 1982.

[15] S. Iyer, A. Rowstron, and P. Druschel. Squirrel: Adecentralized peer-to-peer web cache. In Proceed-ings of ACM Symposium on Principles of DistributedComputing (PODC ’02), Monterey, California, 2002.

[16] D. Liberzon and A. Morse. Basic problems in sta-bility and design of switched systems. IEEE ControlSystems Magazine, pages 59–70, 1999.

[17] T. S. Eugene Ng, Y.-H. Chu, S. G. Rao, K. Sri-panidkulchai, and Hui Zhang. Measurement-BasedOptimization Techniques for Bandwidth-DemandingPeer-To-Peer Systems. In INFOCOM, 2003.

[18] S. Ratnasamy, P. Francis, M. Handley, R. Karp, andS. Shenker. A scalable content addressable network.In Proceedings of ACM SIGCOMM 2001, 2001.

[19] M. Ripeanu. Peer-to-peer architecture case study:Gnutella network, 2001.

[20] M. Ripeanu, I. Foster, and A. Iamnitchi. Mappingthe gnutella network: Properties of large-scale peer-to-peer systems and implications for system design.IEEE Internet Computing Journal, 6(1), 2002.

[21] I. Stoica, R. Morris, D. Karger, M. Kaashoek, andH. Balakrishman. Chord: A scalable peer-to-peerlookup protocol for internet applications. In Pro-ceedings of ACM SIGCOMM 2001, 2001.

[22] W. Whitt. On the heavy-traffic limit theorems forGI/G/∞ queues. Advances in Applied Probability,14:171–190, 1982.

[23] W. Whitt. Stochastic Process Limits. Springer, 2002.

[24] X. Yang and G. de Veciana. Service Capacity of Peerto Peer Networks. In INFOCOM, 2004.

14