9
Bandwidth allocation in peer-to-peer file sharing networks Albert Creus-Mir a , Ramon Casadesus-Masanell b , Andres Hervas-Drane c, * a Universitat Polite `cnica de Catalunya, EUETIB, Comte d’Urgell 187, Barcelona 08036, Spain b Harvard Business School, Morgan Hall 233, Soldiers Field Road, Boston, MA 02163, USA c Harvard Business School, Baker Library B20H, Soldiers Field Road, Boston, MA 02163, USA Available online 14 August 2007 Abstract We present a model of bandwidth allocation in a stylized peer-to-peer file sharing network with s peers (sharers) who share files and download from each other and f peers (freeriders) who download from sharers but do not contribute files. Assuming that upload band- width is scarcer than download bandwidth and efficient allocation, we compute the expected bandwidth obtained by each peer. We show that (i) while the exact formula is complex, s/(s + f) is a good approximation and (ii) sharers (freeriders) obtain bandwidth larger (smal- ler) than s/(s + f). The paper constitutes a first step towards a general analytical foundation for scarce resource allocation in peer-to-peer file sharing networks. Ó 2007 Elsevier B.V. All rights reserved. Keywords: Peer-to-peer; Network formation; Resource allocation; Congestion 1. Introduction Consider a peer-to-peer (p2p) file sharing network where content offered for download is all of similar value to downloaders. Peers in the p2p network may act as sharers (by contributing files for others to download) or as freerid- ers (by downloading from others but not making files avail- able for download). In a world where upload bandwidth is scarce, a natural question arises: how is total upload band- width allocated among sharers and freeriders? In this paper we present a stylized model of a p2p file sharing network to address this question. Our model of p2p assumes that: (1) Each sharer provides one unit of upload bandwidth; (2) All peers have at least one unit of download band- width capacity; (3) Every peer connects to one sharer only; (4) A sharer may not connect to herself; (5) Bandwidth obtained from a sharer is allocated equa- bly amongst all peers connected to her. We refer to a set of links connecting peers to sharers as an allocation.A stable allocation is one where no peer can be made strictly better off by connecting to a different sharer. We assume all stable allocations arise in the net- work with equal probability. For example, consider a p2p network with three peers only: two sharers and one freerider. In this case, there are two stable allocations (see Fig. 1). Notice that in this simple example, sharer 1 must download from sharer 2 (and vice versa). Given this, the freerider (peer 3 in the figure) is indifferent between downloading from sharer 1 or from sharer 2. In both cases, the freerider obtains download bandwidth equal to 1/2. Depending on the sharer to which the freerider connects to, sharers may end up with down- load bandwidth of 1 or 1/2. Under the assumption of equi- probability of stable allocations, sharers wind up with expected bandwidth equal to 3/4 while the freerider gets 1/2 only. In this paper we consider p2p networks with an arbi- trary number of peers and derive an exact formula for the expected bandwidth obtained by sharers and freeriders, 0140-3664/$ - see front matter Ó 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2007.08.001 * Corresponding author. Tel.: +1 617 496 5725; fax: +1 617 495 0699. E-mail addresses: [email protected] (A. Creus-Mir), casadesus@ gmail.com (R. Casadesus-Masanell), [email protected] (A. Hervas- Drane). www.elsevier.com/locate/comcom Available online at www.sciencedirect.com Computer Communications 31 (2008) 257–265

Bandwidth allocation in peer-to-peer file sharing networks

Embed Size (px)

Citation preview

Available online at www.sciencedirect.com

www.elsevier.com/locate/comcom

Computer Communications 31 (2008) 257–265

Bandwidth allocation in peer-to-peer file sharing networks

Albert Creus-Mir a, Ramon Casadesus-Masanell b, Andres Hervas-Drane c,*

a Universitat Politecnica de Catalunya, EUETIB, Comte d’Urgell 187, Barcelona 08036, Spainb Harvard Business School, Morgan Hall 233, Soldiers Field Road, Boston, MA 02163, USA

c Harvard Business School, Baker Library B20H, Soldiers Field Road, Boston, MA 02163, USA

Available online 14 August 2007

Abstract

We present a model of bandwidth allocation in a stylized peer-to-peer file sharing network with s peers (sharers) who share files anddownload from each other and f peers (freeriders) who download from sharers but do not contribute files. Assuming that upload band-width is scarcer than download bandwidth and efficient allocation, we compute the expected bandwidth obtained by each peer. We showthat (i) while the exact formula is complex, s/(s + f) is a good approximation and (ii) sharers (freeriders) obtain bandwidth larger (smal-ler) than s/(s + f). The paper constitutes a first step towards a general analytical foundation for scarce resource allocation in peer-to-peerfile sharing networks.� 2007 Elsevier B.V. All rights reserved.

Keywords: Peer-to-peer; Network formation; Resource allocation; Congestion

1. Introduction

Consider a peer-to-peer (p2p) file sharing network wherecontent offered for download is all of similar value todownloaders. Peers in the p2p network may act as sharers

(by contributing files for others to download) or as freerid-

ers (by downloading from others but not making files avail-able for download). In a world where upload bandwidth isscarce, a natural question arises: how is total upload band-width allocated among sharers and freeriders?

In this paper we present a stylized model of a p2p filesharing network to address this question. Our model ofp2p assumes that:

(1) Each sharer provides one unit of upload bandwidth;(2) All peers have at least one unit of download band-

width capacity;(3) Every peer connects to one sharer only;(4) A sharer may not connect to herself;

0140-3664/$ - see front matter � 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.comcom.2007.08.001

* Corresponding author. Tel.: +1 617 496 5725; fax: +1 617 495 0699.E-mail addresses: [email protected] (A. Creus-Mir), casadesus@

gmail.com (R. Casadesus-Masanell), [email protected] (A. Hervas-Drane).

(5) Bandwidth obtained from a sharer is allocated equa-bly amongst all peers connected to her.

We refer to a set of links connecting peers to sharers asan allocation. A stable allocation is one where no peer canbe made strictly better off by connecting to a differentsharer. We assume all stable allocations arise in the net-work with equal probability.

For example, consider a p2p network with three peersonly: two sharers and one freerider. In this case, there aretwo stable allocations (see Fig. 1). Notice that in this simpleexample, sharer 1 must download from sharer 2 (and viceversa). Given this, the freerider (peer 3 in the figure) isindifferent between downloading from sharer 1 or fromsharer 2. In both cases, the freerider obtains downloadbandwidth equal to 1/2. Depending on the sharer to whichthe freerider connects to, sharers may end up with down-load bandwidth of 1 or 1/2. Under the assumption of equi-probability of stable allocations, sharers wind up withexpected bandwidth equal to 3/4 while the freerider gets1/2 only.

In this paper we consider p2p networks with an arbi-trary number of peers and derive an exact formula forthe expected bandwidth obtained by sharers and freeriders,

Fig. 1. Stable allocations with two sharers and one freerider. The arroworiginating from a given peer indicates the sharer from whom that peerobtains files.

258 A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265

where expectations are taken across all stable allocations.We show that, just as in the example above, the expectedbandwidth obtained by sharers is always larger than thatavailable to freeriders. Sharers can be allocated to fewersources as they face the constraint of not connecting tothemselves.

We also show that #sharers/#peers is a good approxi-mation to the expected bandwidth obtained by both shar-ers and freeriders. Sharers (freeriders) always obtainexpected bandwidth larger than (smaller than) #sharers/#peers. And as the size of the network grows, the differencebetween expected bandwidth and #sharers/#peers quicklydecreases. Already in a network of size 10, the expectedbandwidth obtained by sharers and freeriders differs from#sharers/#peers by, at most, 10�4. And when network sizeis 100, the difference is always less than 10�6. Because theapproximation has an exceedingly simple form, it providesa foundation for applied theoretical work on p2p. Theexact formula, on the other hand, is discouragingly com-plex and it is unusable for all practical purposes.

While the present paper is devoted to the study of band-width allocation in p2p file sharing networks, we believeour approach is quite general and the results can beextended to other scarce rival resources in p2p networks.The remainder of the paper is organized as follows. Section2 relates our contribution to the literature. Section 3 statesthe problem formally. In Section 4 we introduce two reduc-tions and proceed to count all stable allocations. We pres-ent the formula in Section 5 and discuss some properties.Section 6 provides graphical representations. In the finalsection, we present intuition and discuss limitations.

2. Literature

Several empirical studies have examined the topology ofp2p file sharing networks, employing different techniquesto gather information. Network crawlers are deployed inAsvanund et al. [1], Chu, Labonte and Levine [2], Ripeanuand Foster [3], and in Saroiu, Gummadi and Gribble [4].While in Gummadi et al. [5] and Sen and Wang [6], p2ptraffic is recovered from a trace performed at the physicalnetwork’s backbone. Together, both approaches havehelped characterize the main properties of these networkswith respect to peer availability and activity patterns, datatraffic paths, and bandwidth bottlenecks.

A parallel literature has focused on constructing theoret-ical models of p2p file sharing. Contributions have emergedfrom Computer Science (Buragohain, Agrawal and Suri[7], Feldman et al. [8] and Golle, Leyton-Brown and Miro-

nov [9]) and Economics (Antoniadis, Courcoubetis andMason [10], Cunningham, Alexander and Adilov [11] andKrishnan et al. [12]). This line of work explores the incen-tives of peers to contribute resources to p2p networks. Sev-eral papers present mechanisms to induce higher resourcecontribution levels.

This theoretical literature, however, has so far failed toincorporate insights from the structure of the network.With just a few exceptions (see Jian and MacKie-Mason[13] and Feldman et al. [14], for instance), the structureof p2p networks is generally not modeled, and, as a result,the resource allocation rules considered lack a foundation.In our view, p2p research can benefit from explicitly con-sidering the properties emerging from the underlying net-work structure. One main difficulty in pursuing thisapproach, however, is the general lack of analytical tracta-bility of network games. To this end, the approximationresult derived in our analysis can be readily applied in the-oretical work.

Casadesus-Masanell and Hervas-Drane [15], for exam-ple, use the approximation to construct a model of a p2pfile sharing network with endogenous sharing that com-petes against a for-profit firm that offers content on a cli-ent–server architecture at positive prices. Building on thefoundation for bandwidth allocation presented here andincorporating the costs associated with file sharing, theauthors show that such a model can explain important styl-ized facts identified in the empirical literature.

3. Setup

Let N, S, and F, be the set of all peers, the set of sharers(peers that contribute bandwidth), and the set of freeriders(peers that do not contribute) respectively. Let n, s, and f,be the cardinalities of these sets. Every peer is either asharer or a freerider. That is, F = NnS and f = n � s. Recallthat a stable allocation is one where no peer can be madestrictly better off by reallocating her to another sharer.Obviously, given N and S there are multiple stable alloca-tions. And whenever the number of peers is not divisible bythe number of sharers, the bandwidth obtained from thenetwork will differ across peers.

There are two alternative, equivalent ways to representthe p2p networks that we study. First, we can use a net-work representation such as that of Fig. 1. Nodes are peersand arrows indicate the sharer from whom a given peerobtains files. Second, we can use urns and balls. Balls arepeers (sharers and freeriders) and there are as many urnsas sharers. Balls in an urn are all those peers downloadingfrom a given sharer. Fig. 2 presents both representationsfor a given allocation in a network with n = 12 and s = 5.In what follows, we use the ‘urns and balls’ representationas it is more directly related to the mathematical develop-ment below.

Let b := s � (n mod s), and a := n div s. In every stableallocation we must have a peers allocated to b sharersand the rest (a + 1) allocated to the remaining s � b shar-

Fig. 2. Two alternative representations of an allocation.

A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265 259

ers. Therefore, n = ba + (s � b)(a + 1). The following fig-ure (Fig. 3) illustrates these zones.

To compute the expected bandwidth obtained by free-riders and sharers, we begin by computing the bandwidthobtained by each peer in every stable allocation. Let x bea peer, and let G(x) be the bandwidth obtained by the peerin any given allocation. G(x) is a random variable that cantake two values only, 1/(a + 1) and 1/a. For any given sta-ble allocation, we define the set B of ‘fortunate’ peersas B = {x|G(x) = 1/a} and the set of ‘unfortunate’ peersas S � B = {x|G(x) = 1/(a + 1)}. Notice that the specificpeers in B and S � B depend on the particular allocationunder consideration.

The expected bandwidth obtained across all peers, bothsharers and freeriders, is E := E(G(x)) = s/n. Obviously,1/(a + 1) < s/n 6 1/a. We may also consider the conditionalexpectations ES := E(G(x)|x 2 S) and EF := E(G(x)|x 2 F)(these are the expected bandwidths obtained by sharersand freeriders, respectively). If sharers were to connect tothemselves the calculation would be trivial. In this case,symmetry implies that ES = EF = E = s/n.

As sharers do not connect to themselves in peer-to-peernetworks, however, the conditional expectations ES and EF

turn out to be different. Nevertheless, the total bandwidthavailable in the network, s, must still be equal to the num-ber of sharers times ES plus the number of freeriders timesEF. That is, s/n Æ ES + (n � s)/n Æ EF = s/n. Therefore, if wecompute ES, we can immediately obtain EF.

Because every stable allocation is assumed to be equi-probable, to compute ES we need to count all of themand compute the average of G(x)x2S for each. Let H bethe total number of stable allocations. Let hi be the numberof these allocations with i sharers in B. Notice that for allthese allocations, the average of G(x)x2S is exactly thesame. Therefore, we can consider hi as the histogram of thisvalue. Then,

Fig. 3. Relevant zones.

ES ¼Xs

i¼0

1

aisþ 1

aþ 1

s� is

� �hi

H

or

ES ¼aþ

Psi¼0

is

hiH

aðaþ 1Þ :

4. Count

In this section we compute hi. That is, we orderly enu-merate and count all the stable allocations that have i shar-ers in B. Due to the magnitude of this task, we proceed byreducing and decomposing the problem.

4.1. First reduction

Given n and s, consider the set of all stable allocations.We can divide this set into subsets depending on whichsharers support B (having only a peers allocated to them).These subsets are disjoint, and they add up to the entire setof stable allocations, so they are classes. Specifically, there

aresb

� �classes.

Each of these classes has the same number of alloca-tions. To see this, notice that there are bijections betweenthe classes, obtained by changing the names of the ele-ments. Consider class r and let hr

i be the number of alloca-tions with i sharers in B for class r. Notice that for all r, thevalue of hr

i (for every i) is the same. Therefore, by studying

one single allocation we can obtain the total hi ¼sb

� �hr

i .

Thus, to simplify the analysis, we will study the class inwhich the sharers supporting B are the first ones. The nextfigure (Fig. 4) illustrates the approach.

4.2. Second reduction

Given n and s, consider the set of all stable allocationswhere B is supported by the first sharers. We now computehow many ways we can assign the n peers between the twozones B and S � B.

4.2.1. Counting reduction 2

In general, there aren

ba

� �possibilities. There are three

exceptions:

Fig. 4. The first reduction is to rename sharers to get a clearer view.

Fig. 6. Freeriders are plotted in white, easy sharers in grey andcomplicated sharers in black.

260 A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265

(1) When b = s there is 1 possibility.

(2) When b = 1 there aren� 1

ba

� �possibilities.

(3) When b = s � 1 there aren� 1

ba� 1

� �possibilities.

These three cases will always be exceptions to the gen-eral formula we derive. They are illustrated in the followingfigure (Fig. 5). The last two cases are special because theyrequire that a given sharer be assigned to ‘the other’ zone.

In the general case, 1 < b < s � 1, there are

n� sba� i

� �si

� �possibilities that have exactly i sharers in

B. Using known formulae, adding up we obtain

n

ba

� �¼

Xminfs;bag

i¼maxf0;sþba�ng

n� s

ba� i

� �s

i

� �;

where the limits come from observing the problem andensuring the combinatorial numbers are well defined.

As illustrated in the figure above (Fig. 6), j sharers outof i may be ‘complicated’. A complicated sharer is onethat supports B and happens to be in B. We have totake this into account in order to not assign a compli-cated sharer to herself. An allocation that has a sharerassigned to herself is called a ‘coincidence.’ There are

n� sba� i

� �s� bi� j

� �bj

� �possibilities that contain exactly

j ‘complicated’ sharers. Again using known formulae,we add up and obtain

n

ba

� �¼

Xminfs;bag

i¼maxf0;sþba�ng

n� s

ba� i

� � Xminfb;ig

i¼maxf0;bþi�sg

s� b

i� j

� �b

j

� �:

4.2.2. Relationship with the original problem

For every possibility of the reduced problem, we nowcount how many possibilities exist in the original problem.We proceed by assigning the peers in B to the b sharers thatsupport them.

If there were no ‘complicated’ sharers, this would be asimple multinomial problem with formula ðabÞ!

ða!Þb. However,because there are j complicated sharers, we perform thesame count but need to subtract all those possibilities withcoincidences. For example, if j = 1,

Fig. 5. The three

ðabÞ!ða!Þb

� ðab� 1Þ!ða� 1Þ!ða!Þb�1

¼ ðabÞ!ða!Þb

� aabðabÞ!ða!Þb

¼ ðabÞ!ða!Þb

1� aab

� �:

If j = 2, we subtract twice the cases with one coincidence,but add once the cases with two coincidences,

ðabÞ!ða!Þb

� 2ðab� 1Þ!

ða� 1Þ1!ða!Þb�1þ 1

ðab� 2Þ!ða� 1Þ!2ða!Þb�2

¼ ðabÞ!ða!Þb

1� 2a1

ðabÞ þ 1a2

ðabÞðab� 1Þ

� �

¼ ðabÞ!ða!Þb

2

0

� �a0ðab� 0Þ!ðabÞ! � ðabÞ!

ða!Þb2

1

� �a1ðab� 1Þ!ðabÞ!

þ ðabÞ!ða!Þb

2

2

� �a2ðab� 2Þ!ðabÞ! :

In this process, we are applying the principle of inclusion-exclusion. The problem is similar to a derangement. Byanalogy, we define the Multinomial Derangement number,

MDðb; aÞ ¼ ðabÞ!ða!Þb

Xb

k¼0

b

k

� �ð�1Þkak ðab� kÞ!

ðabÞ! ;

and the Generalized Multinomial Derangement number,

GMDðb; a; jÞ ¼ ðabÞ!ða!Þb

Xj

k¼0

j

k

� �ð�1Þkak ðab� kÞ!

ðabÞ! :

Notice that MD(b,a) = GMD(b,a,b), and MD(b, 1)= Derangements(b) = !b.

As we have to take into account both zones, B andS � B, for each possibility of the reduced problem there

exceptions.

A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265 261

are GMD(b,a, j)GMD(s � b,a + 1,(s � b) � (i � j)) possi-bilities of the original problem.

5. Formula

Consider the original problem described in Section 3.Let n be the number of peers, s the number of sharers,a = n div s and b = s � (n mod s).

(0) For b = s, we have that ES = 1/a = s/n. Otherwise, ifH :¼

Pihi;

ES ¼aþ

Ps

i¼0

is

hiH

aðaþ 1Þ :

(1) For b = 1, i 2 [max{0, s + ba � n}, min{s � 1,ba}].(2) For b = s � 1, i 2 [max{1, s + ba � n}, min{s,ba}].(3) Otherwise, 1 < b < s � 1, i 2 [max{0, s + ba � n},

min{s,ba}].

Where

hi ¼n� s

ba� i

� � Xminfb;ig

j¼maxf0;bþi�sg

s� b

i� j

� �b

j

� �GMDðb; a; jÞ�

GMDðs� b; aþ 1; ðs� bÞ � ði� jÞÞ;

and

GMDðb; a; jÞ ¼ ðabÞ!ða!Þb

Xj

k¼0

jkð Þð�1Þkak ðab� kÞ!ðabÞ! :

20 40 60 80 100s0.0

0.2

0.4

0.6

0.8

1.0Es

0

Plot 1. ES P s/n.

5.1. Properties

If sharers can connect to themselves we obtain ES = s/n.To see this, notice that in this case the GMD() factors dis-appear and

Xi

is

hi

Pi

is

n� s

ba� i

� �s

i

� �

n

ba

� �

¼

Pi

ðn� 1Þ � ðs� 1Þðba� 1Þ � ði� 1Þ

� � ðs� 1Þði� 1Þ

� �

n

ba

� �

¼

n� 1

ba� 1

� �

n

ba

� � ¼ abn:

Therefore, the difference is given by the GMD() factors.Also note that (ab)!/(a!)b is not relevant within GMD().

Because this expression does not depend on i or j, it alsoappears in H :¼

Pihi. Therefore it cancels out.

Two efficient ways to compute GMD() recursively are asfollows. Let

dk :¼j

k

� �ð�1Þkak ðab� kÞ!

ðabÞ!

¼ ðj� 0Þ . . . ðj� k þ 1Þ12 . . . k

� �fð�1Þ . . . ð�1Þgfaa . . . ag

� 11 . . . 1

ðba� 0Þ . . . ðba� k þ 1Þ

� �;

fk :¼ ð�1Þðj� k þ 1ÞðaÞðkÞðba� k þ 1Þ ; f0 :¼ 1;

dk :¼ fkdk�1; d0 :¼ 1;

gk :¼ fj�kð1þ gk�1Þ; g0 :¼ fj:

Then,

GMDðb; a; jÞ ¼ ðabÞ!ða!Þb

Xj

k¼0

dk ¼ðabÞ!ða!Þb

gj:

6. Plots

In this section we present a few plots to illustrate theproperties of ES, the expected bandwidth obtained by shar-ers. We initially fix the number of peers to n = 100, increasethe number of sharers from s = 2 to 100, and plot ES. Theplot reveals that ES P s/n (the diagonal line). ES is a curvewith several peaks, always above s/n. Of course, ES > s/nimplies that EF < s/n. This follows from the fact thats/n Æ ES + (n � s)/n Æ EF = s/n. The difference between ES

and s/n is small. In the following plot (Plot 1) we have aug-mented 10,000 times the difference to make it visible.

We now plot the absolute difference (Plot 2), ade =ES � s/n. We should note that, in order to generate a statis-tical plot of ade by the Montecarlo method, one needs to beespecially careful not to end up with a biased generator. Afull 3D plot is included in the Appendix A. Although a plotof the relative difference in percentage terms between ES

and s/n could be more informative, such a curve has peaksthat vary wildly in size and cannot be drawn well.

As we increase n, the differences between ES and s/nmaintains a similar pattern, only the magnitudes differ.The next plot (Plot 3) shows ade for different values of n

(n = 60, 120, and 240), normalized in order for the largestpeak to reach 1. We have also stretched the horizontal axis.

0 20 40 60 80 100 120 140 160 180 200 220 240s

0.0

0.2

0.4

0.6

0.8

1.0ade%

60×4120×2240×1

Plot 3. The ade curve has almost the same shape everywhere.

1e-08

1e-07

1e-06

1e-05

1e-04log ade

10 100 1000log s

Plot 5. The ade curve’s last peak decreases at an exponential rate of �2 asnew peers arrive.

0 10 20 30 40 50 60 70 80 90 100s

0

1e-06

2e-06

3e-06

4e-06

5e-06

6e-06

7e-06ade

Plot 2. A reference plot of ade.

262 A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265

For example, the plot for n = 60 has its horizontal axisstretched 4 times. And the plot for n = 120 has its horizon-tal axis stretched twice. The similarity between the threeplots is remarkable.

As n changes, the most pronounced differences in theshape of ade occur at the values of s where ade approacheszero (n/2, n/3, n/4, . . .). If n is divisible by s, then ade isequal to zero. Otherwise, ade > 0. The following plot (Plot4) shows ade for n = 60 and n = 61. Sixty is divisible by 2,3, 4, 5, 6, 10, 12, 15, 20, and 30. At all these points ade is

0.0e+002.0e-064.0e-066.0e-068.0e-061.0e-051.2e-051.4e-051.6e-051.8e-052.0e-05

ade

0 10 20 30 40 50 60s

6061

Plot 4. The biggest differences are on the zeros.

zero. Sixty-one, however, is prime. In this case ade neverreaches zero.

The fact that ade is similar for all n allows us to tabulateade for a given n and then extrapolate its value for other n.In this way, we can obtain approximate values of ade with-out costly calculations.

It is of interest to evaluate how ade varies as we increasen; or how the bandwidth difference between sharers andfreeriders evolves as the size of the network increases. Wenext plot (Plot 5) the values of ade at the last peak as weincrease both n and s, thus maintaining the proportion ofsharers.

The plot shows that ade decreases at a quadratic rate.This suggests a quick method to approximate the valueof ade given s and n: look up s100 = s/n100 in the secondplot, and apply the formula ade = ade100(100/n)2.

It is also interesting to see how ade evolves for a given s

as n increases. In the following plot (Plot 6), we set s = 20and let n vary from 20 to 120.

The next plot (Plot 7) shows how the values at the peaksof the ade curve decrease for this case. The decrease rate isalso exponential, but even larger with a �3 exponent.

These results suggest that ade quickly converges to 0 as n

increases.

0e+00

1e-05

2e-05

3e-05

4e-05

5e-05

6e-05

7e-05ade

0 10 20 30 40 50 60 70 80 90 100 110 120n

Plot 6. The ade curve when only freeriders arrive.

1e-131e-121e-111e-101e-091e-081e-071e-061e-051e-041e-031e-021e-01

log ade

1 10 100 1000 10000log a

Plot 7. The peaks of the ade curve decrease at an exponential rate of �3 asnew freeriders arrive.

A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265 263

7. Discussion

7.1. Connectivity constraints

As we have just shown, the difference in expectations ES

and EF is due to the constraint that a sharer cannot connectto herself. Furthermore, not only ES „ EF, but ES P EF.The constraint ‘helps’ sharers; they end up better off (inexpected terms). Counterintuitively, peers with moreoptions available (freeriders) are worse off than those withless options (sharers).

To further illustrate the effect consider the followingmodified model where each sharer can only be connectedto herself. Notice that the constraint is now stronger thanbefore; sharers have only one feasible link and connectionsto other sharers are no longer available. The analysis isimmediate,

ES ¼1

sb

1

aþ ðs� bÞ 1

aþ 1

� �¼ 1

ssaþ b

aðaþ 1Þ :

The following plot (Plot 8) illustrates the behavior of ade inthe modified model

As shown by the graph, ade > 0. And in fact, the differ-ence between ES and s/n is now even larger. This exampleshows that the more constrained peers are, the better off

0.00

0.01

0.020.03

0.040.050.060.07

0.08

0.090.10

ade

0 17 33 50 67 83 100s

Plot 8. The ade curve when sharers can only connect to themselves.

they end up. As a result, if peers could commit not to beallocated to a given sharer, they would do so.

7.2. Number of links

Our model assumes that peers can connect to one shareronly. While in real networks peers hold multiple links, thenumber of simultaneous connections is generally limited.In this sense, our single-link assumption is not that unreal-istic. A more general model of p2p file sharing would allowfor multiple-link formation. We have considered general-izations of the model in this direction, but have not derivedthe more general formula due to the complexity of theanalysis.

To provide some insight on the effect of multiple links weanalyze the following simple case. Consider a p2p networkwith s = n � 1 sharers (there is only one freerider) and wherepeers hold k = s � 1 simultaneous links. By construction,every sharer is connected to all other sharers in all stable allo-cations. Hence the upload bandwidth contributed by anygiven sharer is always accessed by the remaining k sharers.This implies that the freerider obtains the same bandwidthacross all her links in all stable allocations,

EF ¼ k1

k þ 1:

We can calculate the bandwidth obtained by sharers bytaking into account that s/n Æ ES + (n � s)/n Æ EF = s/n andn = s + 1:

ES ¼1þ k þ k2

ð1þ kÞ2;

which implies that in this particular example ES > EF. Thissuggests that the asymmetry identified above is likely tohold more generally to the case of multiple links.

7.3. Equiprobability

We have assumed throughout that stable allocations areequiprobable. We next motivate this assumption in thecontext of a model where peers decide with whom to con-nect to. The nature of p2p applications suggests one-sidedlink formation, where peers can decide which sharer toconnect to without the consent of the sharer.1 In this set-ting, a simultaneous one-shot game yields a set of equilibriathat coincides by definition with our set of stable alloca-tions. Clearly, this is the set of allocations that are of inter-est for the analysis.2

1 See Jackson [16] for a survey of models of networks in Economics anda discussion of different models of link formation.

2 We should note that in our model no tradeoff exists between stabilityand efficiency. An allocation is Pareto efficient if and only if the uploadbandwidth provided by all sharers is being utilized in the network. That is,if the whole bandwidth provision in the network is enjoyed by peers.Clearly, if an allocation does not satisfy this condition, a peer reassignedto a free sharer can be made better off without worsening the remainingpeers. Hence stable allocations are a subset of Pareto allocations.

Plot 9. A three dimensional plot of ade. The plane defined by the ade and s

axis corresponds to the ade curve depicted in Plot 2.

264 A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265

A one-shot game, however, provides no insight on therelative probability of each outcome. To construct a prob-ability distribution over this set we need to consider asequential game, where peers decide orderly with whomto connect to. To model such a game consider a random-ized connecting order with myopic peers. That is, peerstake the current allocation as given when choosing theirconnection; no forward induction takes place (which isunfeasible given the size and complexity of p2p networks).In the model described, however, the probability distribu-tion over the set of stable allocations depends on the finedetails of the connection process.

Consider the following example. There are six peers,three sharers (1, 2, and 3) and three freeriders (4, 5, and6). The random ordering of peers is 3, 2, 4, 5, 6, and 1.In the connecting sequence 3 connects to 2; 2 connects to3; 4 connects to 1; 5 connects to 2; and 6 connects to 3.At this point, 1 can only connect to 2 or 3. But neither con-stitutes a stable allocation (Fig. 7).

It turns out that constraining sharers not to connect tothemselves while demanding equiprobability is somewhatequivalent to allocating them first, or at least, not last. Thisprovides further intuition as to why the more constrainedthey are, the better off they end up.

And as shown by the example, further assumptions arerequired concerning the mechanism by which peers mayupdate their links if unstable allocations arise. Due to thedependence of the solution on the fine details of the mod-eling choice, we have assumed equiprobability. Differentmechanisms may favor either sharers or freeriders com-pared to the equiprobability benchmark.

7.4. Applicability and limitations

We end with a few observations regarding the limita-tions of the model. An important characteristic of real net-works we have not considered in the analysis is peerheterogeneity. In real networks, upload and downloadbandwidth capacities differ between peers. Heterogeneityin upload capacities, for example, would imply that certainsharers are able to provide higher utility to the peers thatconnect to them. Our model is also static, it does not con-sider how stable allocations are reached nor the evolutionof the network over time. In real networks, peers enterand leave.

Even with these limitations, we hope to have provided abenchmark on which to construct these (and other) exten-sions. We believe the qualitative results obtained from ourmodel are robust and should hold under more general spec-

Fig. 7. Sequential connecting orderings with myopic peers may yieldunstable allocations.

ifications. A clear understanding of bandwidth allocationin p2p file sharing networks is a necessary first step towardsproviding a theory of incentives to contribute resources top2p.3

Acknowledgement

We thank three anonymous referees and the Editorfor comments that helped improve the manuscript. Weare grateful for financial support from the HBS Divisionof Research, the NET Institute, Fundacion BBVA, andthe FPU program of the Spanish Ministry of Educationand Science.

Appendix A

A three dimensional plot (Plot 9) of ade. The planedefined by the ade and s axis corresponds to the ade curvedepicted in Plot 2.

References

[1] A. Asvanund, K. Clay, R. Krishnan, M.D. Smith, An empiricalanalysis of network externalities in peer-to-peer music-sharingnetworks, Information Systems Research 15 (2004) 155–174.

[2] J. Chu, K. Labonte, B.N. Levine, Availability and locality measure-ments of peer-to-peer file systems, in: Proceedings of ITCom:Scalability and Traffic Control in IP Networks, 2002.

[3] M. Ripeanu, I. Foster, Mapping the Gnutella network: macroscopicproperties of large-scale peer-to-peer systems, in: Proceedings of the1st International Workshop on Peer-to-Peer Systems, 2002.

[4] S. Saroiu, K. Gummadi, S.D. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proceedings of Multimedia Comput-ing and Networking, 2002.

[5] K.P. Gummadi, R.J. Dunn, S. Saroiu, S.D. Gribble, H.M. Levy, J.Zahorjan, Measurement, modeling, and analysis of a peer-to-peer file-

3 For a recent application of the s/n approximation see Casadesus-Masanell and Hervas-Drane [15].

A. Creus-Mir et al. / Computer Communications 31 (2008) 257–265 265

sharing workload, in: Proceedings of the 19th ACM Symposium onOperating Systems Principles, 2003.

[6] Subhabrata Sen, Jia Wang, Analyzing peer-to-peer traffic across largenetworks, ACM/IEEE Transactions on Networking 12 (2004)219–232.

[7] C. Buragohain, D. Agrawal, S. Suri, A game theoretic framework forincentives in p2p systems, in: Proceedings of the 3rd InternationalConference on Peer-to-Peer Computing, 2003.

[8] M. Feldman, C. Papadimitriou, J. Chuang, I. Stoica, Free-riding andwhitewashing in peer-to-peer systems, in: Proceedings of the ACMSIGCOMM 2004 workshop on Practice and Theory of Incentives andGame Theory in Networked Systems, 2004.

[9] P. Golle, K. Leyton-Brown, I. Mironov, Incentives for sharing inpeer-to-peer networks, Lecture Notes in Computer Science 2232(2001) 75–87.

[10] P. Antoniadis, C. Courcoubetis, R. Mason, Comparing economicincentives in peer-to-peer networks, Computer Networks 46 (2004)133–146.

[11] B.M. Cunningham, P.J. Alexander, N. Adilov, Peer-to-peer filesharing communities, Information Economics and Policy 16 (2004)197–221.

[12] R. Krishnan, M.D. Smith, Z. Tang, R. Telang, The virtual commons:understanding content provision in peer-to-peer file sharing networks,Carnegie Mellon Heinz School working paper, 2004, p. 19.

[13] L. Jian, J. MacKie-Mason, Why share in peer-to-peer networks?, in:Proceedings of the ACM EC 2006 workshop on the Economics ofNetworked Systems, 2006.

[14] M. Feldman, K. Lai, J. Chuang, I. Stoica, Quantifying disincentivesin peer-to-peer networks, in: Proceedings of the 1st Workshop onEconomics of Peer-to-Peer Systems, 2003.

[15] R. Casadesus-Masanell, A. Hervas-Drane, Peer-to-peer file sharingand the market for digital information goods, Harvard BusinessSchool, Mimeo, 2006.

[16] M.O. Jackson, A survey of models of network formation: stabilityand efficiency, California Institute of Technology, Mimeo, 2003.

Albert Creus-Mir holds a Bachelor’s degree inMathematics and a Master’s degree in Business andIndustry Applied Mathematics. Albert is currently aPh.D. Candidate in the field of Artificial Intelligenceat Universitat Politecnica de Catalunya, Barcelona.He has a long teaching experience, both as a HighSchool teacher and since 1997 as a University Lec-turer. His current research focuses on strategicinteractions in networks.

Ramon Casadesus-Masanell is an Associate Pro-

fessor at Harvard Business School, and holds aPh.D. in Managerial Economics and Strategy from the Kellogg Graduate School of Manage-ment, Northwestern University. He received hisBachelor’s degree in Economics from UniversitatAutonoma de Barcelona. Ramon specializes inmanagement strategy, managerial economics, andindustrial organization. His research studiescompetition between organizations with differentbusiness models.

Andres Hervas-Drane holds a Bachelor’s degree in

Business Administration and a Master’s degree inEconomic Analysis. A Ph.D. Candidate in Eco-nomics at Universitat Autonoma de Barcelona,Andres has performed research at London Busi-ness School as a Visiting Ph.D. student, and morerecently at Harvard Business School as a VisitingFellow at Harvard University. His work focuses ontelecommunications and the impact of technologyon business models and market structure.