An incentive mechanism for message relaying in unstructured peer-to-peer systems

Electronic Commerce Research and Applications 8 (2009) 315–326


Cuihong Li a,*, Bin Yu b, Katia Sycara c

a School of Business, University of Connecticut, Storrs, CT 06269, United States
b Quantum Leap Innovations, 3 Innovation Way, Suite 100, Newark, DE 19711, United States
c Robotics Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States

Article info

Article history:
Received 14 July 2008
Received in revised form 11 April 2009
Accepted 13 April 2009
Available online 22 April 2009

Keywords:
Peer-to-peer systems
Message relaying
Incentive mechanisms
Multi-level marketing models
Reinforcement learning

doi:10.1016/j.elerap.2009.04.007

* Corresponding author.

Abstract

Distributed message relaying is an important function of a peer-to-peer system to discover service providers. Existing search protocols in unstructured peer-to-peer systems create a huge burden on communications, cause long response times, or result in unreliable performance. Moreover, with self-interested peers, these systems are vulnerable to the free-riding problem. In this paper we present an incentive mechanism that not only mitigates the free-riding problem, but also achieves good system efficiency in message relaying for peer discovery. In this mechanism promised rewards are passed along the message propagation process. A peer is rewarded if a service provider is found via a relaying path that includes this peer. The mechanism allows peers to rationally trade off communication efficiency and reliability while maintaining information locality. We provide some analytic insights into the symmetric Nash equilibrium strategies of this game, and an approximate approach to calculate this equilibrium. Experiments show that this incentive mechanism brings a system utility generally higher than breadth-first search and random walks, based on both the estimated utility from our approximate equilibrium and the utility generated from learning in the incentive mechanism.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

Peer-to-peer systems have recently gained a lot of attention in the academic and industrial communities. A peer-to-peer (P2P) system is modeled as a self-organizing distributed system, where information is highly distributed and stored by individual peers. One important challenge for P2P systems is how to find peers that provide certain information or services in an efficient way, so that peers can exploit the distributed resources owned by other peers in the system (Milojicic et al. 2002). To ensure the scalability and robustness of the system, as well as to avoid some legal issues, a centralized database of the content of each peer usually does not exist in a P2P system (an exception is Napster). Instead, a distributed catalog of content is favored in which each peer only maintains a list of resources/services, and may contain information about the acquaintances and neighbors. In this distributive model peer discovery is realized via message relaying between peers so that a message for resource searching is propagated in the system with a “word-of-mouth” effect.

Traditionally there are two ways to perform distributive peer discovery in unstructured peer-to-peer systems (Milojicic et al. 2002, Yang and Garcia-Molina 2002): breadth-first search (BFS, used by Gnutella) and depth-first search (DFS, used by Freenet).


With BFS the messages are flooded in the system. Therefore, the consumption of bandwidth is enormous, although results can be found very quickly. With DFS, searches can be terminated once a result is found, and therefore use less bandwidth. But the response time could be very long and is exponential in the depth limit. Recently random walks (RW) have been considered as a method for P2P search that significantly reduces the number of messages compared to BFS (Lv et al. 2002). But the performance based on random walks is highly variable, and greatly depends on the network topology and the number of walkers (Tsoumakos and Roussopoulos 2003).

Besides system efficiency, another problem that requires attention in the protocol design is incentives. A P2P network is a highly decentralized system and each peer may represent a different self-interested entity. A peer may manipulate the local information to take advantage of other peers’ resources (Ng et al. 2003, Shneidman and Parkes 2003a, Dasgupta et al. 1979, Ma et al. 2004, Sun and Garcia-Molina 2004). For example, a peer may simply drop a message that is sent from other peers for relaying, for the purpose of saving communication bandwidth and energy. Therefore, a message relaying P2P system is vulnerable to the free-riding problem, i.e., a node relies on others’ efforts to relay its own messages, but does not bear the cost of relaying messages for other nodes. Free-riding can cause severe degradation of the system performance and prevent requesters from finding high quality providers efficiently (Adar and Huberman 2000, Shneidman and Parkes 2003a). It is important to design an incentive mechanism that motivates each peer to behave rationally, and results in good system efficiency.

In this paper, we present an incentive mechanism of message relaying for peer discovery that overcomes the flooding problem of BFS search while preserving the quick response property and good reliability. Although both peer discovery and distributed routing are related to message relaying, they are different problems (Chandan and Hogendorn 2001). In distributed routing the destination of a message is known and the routing paths are well defined (Papadimitriou 2001). For each action (dropping or forwarding the message to a node) of a peer, the consequence is well specified and the payoff is clearly expected. A peer only needs to decide whether or not to take the action, depending on the incentive provided, in ways of a market price, reciprocal rewards, or contract payment, etc. (see Section 2 for a review of the incentive mechanisms in P2P systems). But in peer discovery the destination of the message is unknown, and hence it is not clear who the message should be sent to, what action a peer is expected to take, or what the consequence and payoff of each action will be. This implies greater uncertainty and less control in message relaying for peer discovery than for distributed routing. Therefore, the incentive mechanisms for distributed routing, which require prior knowledge of routing paths, cannot be applied to our peer discovery problem.

Most existing studies of P2P systems are concerned with content sharing or information availability (e.g., Feldman et al. 2003, Golle et al. 2001, Figueiredo et al. 2005, Hughes et al. 2008, Bhattacharjee et al. 2005, Arora et al. 2005). Taking a different perspective, our work aims at improving the search capability, one of the major technical issues in content sharing P2P systems (Milojicic et al. 2002). While content sharing involves the behaviors of end users (Chen et al. 2008), distributed searching is a technique that, once implemented in the software, remains largely transparent to end users. Thus our incentive mechanism may be considered as a protocol of P2P distributed searching based on an economic mechanism. For this reason, the transaction cost and user behaviors are not a concern.

In a P2P system there may exist a small portion of altruistic nodes that contribute even without incentives. Due to the existence of such nodes, a content sharing system may be able to sustain itself even if all other nodes are free riders (although free-riding behaviors lead to degradation of the system performance). For example, in 2005 it was found that 85% of all Gnutella users were free riders (Hughes et al. 2005). However, we think that it is less likely for distributed searching to succeed if relying on a few altruistic nodes. In order for an information requestor to find a provider, the message has to reach a provider along a path from the requestor. If any node on the path drops the message, the relaying on that path cannot succeed. Thus free-riding may have more substantial effects on distributed searching than on content sharing.

In our mechanism, the source peer sends the query to some neighbors and promises some payment to each receiver if the resource provider is found via a transmission route that includes the receiver. Depending on the offer, each receiver decides the number of neighbors it relays the message to and also the promised payment to its immediate downstream peers. Each of the new receivers again makes similar decisions, until the maximum number of hops (time-to-live) is reached. One feature of this mechanism is that it does not price the relaying activities, but instead prices the relaying result, which influences the relaying activities. It tackles not only the incentive problem, but also communication efficiency and reliability in a P2P system.

The rest of the paper is organized as follows. In Section 3 we motivate and present the model of the message relaying mechanism in peer-to-peer systems. The equilibrium analysis and approximation are presented in Section 4. Simulation results are provided in Section 5. Section 7 concludes the paper with some directions for future research.

2. Prior work

Peer-to-peer (P2P) systems have received increasing attention for benefits such as improving scalability, eliminating the need for costly infrastructure, and enabling resource aggregation (Milojicic et al. 2002, Oram 2001). With all these benefits, P2P systems also create challenges in discovering information efficiently in the network. Some research work is dedicated to the design of efficient search techniques. Gnutella is a famous protocol designed to facilitate decentralized search and discovery of information in a network (Gnutella 2001). This protocol permits a node to post queries, forward a query to other nodes, and respond to a query. In this manner, through the shared use of resources provided by the relaying nodes, information can be located and shared between nodes in the network. However, a shortcoming in Gnutella is the “unintelligent” relaying of queries to other nodes in the network. A node upon receipt of a query relays a request (within time-to-live) to all of its neighbors. This results in wastage of shared resources such as bandwidth without necessarily generating more value to the requestor.

Some other research work on search techniques in unstructured P2P systems aims at reducing the number of nodes that receive and process each query with little sacrifice of the quality of results. The different approaches include: adaptively deepening the search based on the responses (Yang and Garcia-Molina 2002), selectively querying neighbors based on their quality or reputation (Yang and Garcia-Molina 2002), building local indices that allow nodes to process queries on behalf of nodes in a local range (Yang and Garcia-Molina 2002, Adamic et al. 2001), maintaining “hints” as to the possible information location by learning from the history (Crespo and Garcia-Molina 2002, NeuroGrid www.neurogrid.net), and random walks (Lv et al. 2002). Although these techniques improve the efficiency of searching P2P networks, they are based on the assumption that the nodes are cooperative and can be programmed to follow these protocols.

P2P systems are often composed of nodes governed by self-interested parties, each acting to better its own outcome. The rational behavior of nodes creates free-riding situations in peer-to-peer settings. In these situations, nodes consume resources (bandwidth, computation, energy, files, etc.) of others but do not contribute at the same level as their consumption. Several papers have helped to advance the understanding of disincentives to cooperation in P2P systems. Feldman et al. (2003) quantify disincentives in file sharing P2P networks. Christin and Chuang (2005) propose a cost-based model to assess the resources that each overlay node has to contribute for being part of the overlay, which allows one to gauge potential disincentives for nodes to collaborate. Shneidman and Parkes (2003b) discuss the notions of rationality and self-interest in P2P systems. In Shneidman et al. (2003) they advocate mechanism design for P2P systems in which peers are expected to be rational and self-interested and may deviate from a suggested protocol. They also discuss some open problems in mechanism design for peer-to-peer systems.

In the following we focus on the review of the incentive mechanisms that aim at improving the efficiency of a P2P system considering the rational behavior of nodes. Generally, the incentive for a peer to cooperate is induced by some economic mechanism. The economic mechanisms that have appeared in P2P incentive design include micropayment, reciprocity, taxation and contracts. We give a brief review of the research work based on each of these mechanisms.

In a micropayment system, a peer is rewarded with virtual currency or credit for each action by the system or the peer who benefits from the action. In a P2P system, the action can be, e.g., forwarding a message, uploading a file, answering a query, etc. Based on game theoretic analysis, Golle et al. (2001) and Figueiredo et al. (2005) evaluate the effectiveness of different micropayment mechanisms to motivate file sharing in P2P systems. Buttyán and Hubaux (2001) and Zhong et al. (2003) apply micropayment mechanisms to stimulate packet forwarding actions in a P2P network. Rogers et al. (2005) present a distributed mechanism for self-organized routing in an energy-constrained sensor network. In their mechanism, a sensor receives a payment from the server each time it transmits data to the center, for itself or as a mediator for another sensor. The payment scheme is designed to motivate locally selfish strategies that possess desirable global properties. Most existing micropayment schemes require a trusted centralized broker (server), which is responsible for distributing and cashing credits. Yang and Garcia-Molina (2003), Micali and Rivest (2002) and Glassman et al. (1995) propose several micropayment mechanisms that reduce the load of the broker. In the above-mentioned mechanisms, the reward that a peer receives for each action is independent of the resources, the work load, or the value of the service. In other words, the payment is based on a static and uniform price. In our mechanism, the payment a peer promises or receives depends on the value of information and its position (hop number) in the propagation process. Therefore, it can be regarded as a “dynamic pricing” mechanism that allows the price to change with the micro-situation of the “market”.

Generally, a micropayment system requires that the action of a peer can be observed or verified by the centralized broker or other peers. This assumption cannot be satisfied in a P2P network where such a broker is expensive to implement or the action of a peer is hidden from other peers. Considering query relaying in P2P systems, Yang et al. (2005) propose a mechanism in which peers buy and sell the right to respond to each query. In this mechanism, selling the query to a peer implements an activity of relaying the message to that peer. It thus does not require the visibility of a peer’s actions. However, such a mechanism requires a transaction for each relaying activity. It also requires that the final price of the information is given and known to all peers. In reality, the value of information is private to the requestor, and how much he is willing to pay for the content is part of the negotiation after the content is discovered. Compared to Yang et al. (2005), our incentive mechanism allows transactions to happen only when the required information is found, which greatly reduces the communication load. In addition, it does not require the price of information to be pre-determined.

Another way to motivate cooperation is based on reciprocity. In such mechanisms, a peer discriminates between the peers that have behaved cooperatively and non-cooperatively in the past, and only gives the cooperative peers preferential treatment. Specifically, such a mechanism is called direct reciprocity if each peer keeps a private record of the performance of other peers. It is called indirect reciprocity (or reputation) if peers share their knowledge about other peers. Direct reciprocity does not scale to large population sizes or high turnover, because it is only useful when two players have repeated interactions. Indirect reciprocity scales better but creates implementation overhead for maintaining reputation information, and is subject to collusion. The “tit-for-tat” mechanism proposed by Cohen (2003) for a P2P file sharing system can be regarded as a direct reciprocity mechanism. In this mechanism, peers reciprocate uploading to peers which upload to them. Lai et al. (2003) evaluate the two mechanisms by simulation, and suggest that the incentive techniques should adapt to the behavior of strangers. Further, Feldman et al. (2004) present a family of incentive techniques based on a Reciprocative decision function to provide different tradeoffs. Habib and Chuang (2006) propose an incentive mechanism for P2P media streaming in which peers are selected based on rank-order tournaments. This can also be viewed as a reciprocity mechanism in which peers are rewarded for good performance in the past. Generally, reciprocity mechanisms are vulnerable to the “whitewashing” (change of identity) problem. They require that the effort of a peer can be measured or observed by other peers, which may not be the case in a P2P network. For example, it is hard for a peer to identify whether a neighbor has forwarded a message or how many nodes it forwards to. Our mechanism overcomes these shortcomings by having peers decide transmission efforts based on expected rewards but not on the past behavior of other peers.

In some situations the resources that peers possess in a network are not even. Some peers may be resource-rich while others are resource-poor. The resource-poor peers are limited in their capability to contribute. This makes it difficult for these peers to participate in a reciprocity-based system, in which the consumption has to be comparable to the contribution. In order to improve the social welfare when the distribution of resources is uneven, taxation can be used to motivate resource-rich peers to contribute more to the system, and to subsidize the resource-poor peers. Hua Chu et al. (2004) propose a taxation mechanism for P2P streaming broadcast in which peers contribute and receive bandwidth according to a tax schedule. Taxation is appropriate when consumption and contribution happen at the same time (such as in the streaming context). Otherwise, it will be difficult to enforce taxation on peers if consumption and contribution are separated. Message relaying for information discovery is in the latter case. A peer does not necessarily request information when it relays a message, and vice versa.

A contract is an enforceable agreement that specifies the obligations along with the rewards. A contract is different from the micropayment agreement in that the reward may not be directly associated with an action, and also the reward may be different across peers. Feldman et al. (2005) focus on stimulating the actions of intermediate nodes to forward messages in a multi-hop network. The endpoint can observe the random result, i.e., whether or not the end-to-end transmission was successful, but not the actions of the intermediate nodes. Using a principal-agent model, they propose a design of contracts, contingent on the result, that overcomes the hidden-action problem. They assume only two actions of a node, drop or forward the message, and the route, which is a linear path, is given with all nodes on it known. However, in message relaying for information search, message relaying does not follow a pre-determined, linear path. In addition, the actions of a peer are richer: a peer can relay the message to different neighbors in different numbers. Our incentive mechanism is able to operate in these situations. It is also a contract-based mechanism: a peer promises the downstream node a reward if information is found through the latter.

3. Message relaying mechanism

In this section we first present in Section 3.1 the motives of our incentive mechanism, and then describe the mechanism in Section 3.2.

3.1. Motivation

The design of our incentive mechanism is motivated by the following three considerations:

3.1.1. Communication efficiency

A system with high communication efficiency is characterized by high marginal values of message relaying. A significant part of the inefficiency in message propagation is caused by the overlapping or saturation issue. By overlapping we mean that a peer may forward the message to some peers that have already received the message from other peers, and these actions only waste communication resources. Since the number of times that the message is transmitted in the system increases exponentially with respect to each peer’s transmission effort (the number of neighbors to forward the message to), the probability of overlapping will soon get close to 1 if each peer makes significant relaying efforts; in other words, the system becomes saturated very quickly. To reduce the communication inefficiency caused by overlapping, a peer should explicitly consider the overlapping probability, and be able to adjust the transmission effort with the progress of propagation, or the saturation status of the network.
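The saturation effect described above can be illustrated with a rough mean-field sketch. It is not the model analyzed in this paper: it assumes, purely for illustration, that every newly informed peer forwards the message to `effort` targets drawn uniformly at random from the whole population, so that a single transmission hits an existing knower with probability (knowers / n_peers). The function name and parameters are hypothetical.

```python
from typing import List, Tuple


def expected_coverage(n_peers: int, effort: int, max_hops: int) -> List[Tuple[int, float, float]]:
    """Return one (hop, expected knowers after the hop, overlap probability) row per hop.

    Illustrative assumption: targets are chosen uniformly at random among all
    n_peers nodes, independently for every transmission.
    """
    knowers = 1.0       # the requestor is initially the only knower
    new_knowers = 1.0   # peers informed in the previous hop (they relay next)
    rows = []
    for hop in range(1, max_hops + 1):
        # Chance that one transmission in this hop reaches a peer that already knows.
        overlap = knowers / n_peers
        messages = new_knowers * effort
        ignorants = n_peers - knowers
        # An ignorant stays ignorant only if every message of this hop misses it.
        newly_informed = ignorants * (1.0 - (1.0 - 1.0 / n_peers) ** messages)
        knowers += newly_informed
        new_knowers = newly_informed
        rows.append((hop, knowers, overlap))
    return rows


for hop, knowers, overlap in expected_coverage(n_peers=1000, effort=4, max_hops=8):
    print(f"hop {hop}: expected knowers {knowers:7.1f}, overlap probability {overlap:.3f}")
```

Under these assumptions the overlap probability rises with every hop, which is why a fixed transmission effort becomes increasingly wasteful as propagation proceeds.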

3.1.2. Reliability

Communication efficiency and reliability are two conflicting goals. The intensity of message relaying is positively correlated with the reliability of peer discovery. Therefore, an efficient message relaying scheme should trade off communication efficiency and reliability. A peer should decide the optimal relaying effort by considering both the cost and the expected payoff of finding a service provider. The expected payoff of finding a provider not only depends on the value of finding the service to the requestor, but also on the reliability of finding a provider, which increases with the coverage of the peers that are exposed to the query.

3.1.3. Information locality

Pricing the scarce resource and charging for the usage of the resource via a micropayment system is a common approach to provide incentive compatibility (Golle et al. 2001, Yang and Garcia-Molina 2003; see footnote 1). Such a mechanism is not applicable to a P2P discovery system. In message relaying, a micropayment mechanism would require the requestor to “buy” relaying actions of other peers. This means that the source peer must be able to identify all the intermediate peers and their transmission efforts, which is not feasible in a decentralized P2P system. On the other hand, it is not easy for either the requestor or the mechanism designer to decide the right price to charge for each relaying action, as the local environment of a peer, such as the number of neighbors, is not known by the mechanism designer or by a third party. Revelation of such local information is called non-private value revelation in Shneidman et al. (2003). One way to avoid this revelation problem in mechanism design is to ask a peer to price “items” provided by immediate downstream nodes based only on its own local information. In our mechanism the immediate downstream nodes and their responses, and the input incentive, are all local information of a peer. The item that is priced is the search result.

3.2. Incentive mechanism for message relaying

A P2P search process based on our incentive mechanism can be decomposed into two phases: the message relaying phase and the rewarding phase. In the following we describe these two phases separately.

3.2.1. The message relaying phase

The message relaying phase is initiated by the requestor. The requestor sends the query message to some neighbors, along with a promised reward to each receiver. A node that receives this message relays the message further to other nodes, also with a promised reward. We call a peer that has received the message a knower. Otherwise, if a peer has not been exposed to the message, it is called an ignorant. The number of knowers increases while the number of ignorants decreases along with the propagation. The requestor is initially a knower.

Footnote 1: For example, a micropayment incentive mechanism in a file sharing system asks a peer to pay a certain price for each unit of resources it downloads from other peers (Golle et al. 2001).

The propagation builds a family tree between nodes that are covered in the process. If node i receives the message from node j, then node i is a downstream node of j, and node j is the upstream node of i. A node i is called a descendant of a node j if i is a downstream node of j or of one of j’s descendants. If a node receives the message from multiple senders, it can choose to be a descendant of the sender with the highest reward. The hop number of a node in the family tree is defined as its distance (the number of generations) away from the requestor. The requestor is the only node in hop 0. In the propagation process, the hop number is attached to the message, and is automatically increased by one each time it is relayed. Therefore, a peer knows its hop number during the process. The maximum hop number allowed in the process is defined by time-to-live (TTL). TTL is commonly used in message relaying protocols (Yang and Garcia-Molina 2002). The relaying process ends if the hop number reaches TTL.

A node will relay a message only once, although the message may be forwarded to multiple peers. This avoids repeated relaying and reduces repeated queries, and is used in the Gnutella Protocol v0.4. We assume that nodes in an earlier hop conduct relaying earlier than nodes in a later hop. Therefore, when a node receives a message, it assumes that all nodes in earlier hops have completed their relaying efforts of this message.

Fig. 1 shows an example of the construction of hops in a propagation process with TTL = 4. Each circle in this figure represents a node, indexed by the number inside the circle. Each arrow represents a transmission of the message between two nodes. The arrows in solid lines contribute to the identification of the hop number for the receivers. The arrows in dashed lines represent the situations where the message is sent to a knower, which does not change the hop that the receiver belongs to. The numbers attached to each arrow are the promised rewards from the upstream to downstream nodes.
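The relaying phase described above can be sketched as a hop-synchronous simulation. This is a minimal illustration under stated assumptions rather than the authors' implementation: the names `relay_phase` and `Strategy` are hypothetical, relay targets are drawn uniformly at random from a node's neighbor list (consistent with the homogeneity assumption made later), and a node that receives several offers in the same hop adopts the sender promising the highest reward.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical strategy signature: (hop number, input reward) -> (effort, output reward).
Strategy = Callable[[int, float], Tuple[int, float]]


def relay_phase(
    neighbors: Dict[int, List[int]],
    requestor: int,
    value: float,
    ttl: int,
    strategy: Strategy,
    rng: random.Random,
) -> Tuple[Dict[int, int], Dict[int, int], Dict[int, float]]:
    """Simulate relaying; return (hop number, upstream node, promised reward) maps."""
    hop = {requestor: 0}
    upstream: Dict[int, int] = {}
    reward_in = {requestor: value}  # the requestor's input is its own value of a provider
    frontier = [requestor]
    for h in range(ttl):            # nodes whose hop number equals ttl do not relay
        offers: Dict[int, Tuple[float, int]] = {}
        for sender in frontier:
            effort, reward_out = strategy(h, reward_in[sender])
            candidates = neighbors[sender]
            for target in rng.sample(candidates, min(effort, len(candidates))):
                if target in hop:
                    continue        # target is already a knower: an overlapping message
                best = offers.get(target)
                if best is None or reward_out > best[0]:
                    offers[target] = (reward_out, sender)  # keep the highest offer
        for target, (reward, sender) in offers.items():
            hop[target] = h + 1
            upstream[target] = sender
            reward_in[target] = reward
        frontier = list(offers)     # each node relays only once, in its own hop
    return hop, upstream, reward_in


# Usage on a hypothetical 5-node line network: every node relays to up to 2
# neighbors and keeps a margin of 1 on the promised reward.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
hops, parents, rewards = relay_phase(line, 0, 10.0, 3, lambda h, v: (2, v - 1.0), random.Random(0))
print(hops, parents, rewards)
```

With TTL = 3 the message stops at node 3, so node 4 remains an ignorant even though it is connected.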

3.2.2. The rewarding phase

A relaying path ends if the message reaches a provider or the hop number reaches TTL. If a provider is reached, then a rewarding process in the opposite direction of the relaying process is triggered. The provider first responds to the sender, who returns the identity/address of the provider to its own upstream node. Such information is reported backward along the relaying path all the way up to the requestor. Each node on the path, except the requestor, receives the promised reward from its own upstream node.

Fig. 1. Illustration of hops in message relaying.

In Fig. 1, the arrows in bold compose a path that reaches the provider, Node 9. The rewarding process traces this path backwards. As a result, Nodes 9, 7, 4 and 1 receive rewards of 5, 7, 8 and 10 from their upstream Nodes 7, 4, 1 and 0, respectively.
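The settlement along the bold path of Fig. 1 can be worked through in a few lines. The sketch below uses only the path (0, 1, 4, 7, 9) and the promised rewards (10, 8, 7, 5) stated above; the function name `settle` is hypothetical, and relaying costs are deliberately left out so that only the transfers are shown.

```python
from typing import Dict, List


def settle(path: List[int], promised: List[float]) -> Dict[int, float]:
    """Net transfer per node; promised[i] is the reward path[i] promised to path[i + 1]."""
    received = {node: promised[i] for i, node in enumerate(path[1:])}
    paid = {node: promised[i] for i, node in enumerate(path[:-1])}
    return {node: received.get(node, 0.0) - paid.get(node, 0.0) for node in path}


net = settle([0, 1, 4, 7, 9], [10, 8, 7, 5])
print(net)  # {0: -10.0, 1: 2, 4: 1, 7: 2, 9: 5.0}
```

Each intermediate node keeps the difference between the reward it receives and the reward it passes on (2, 1 and 2 for Nodes 1, 4 and 7), the provider keeps 5, and the requestor pays 10 in total, so the transfers sum to zero.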

3.2.3. Decision trade-off

In this incentive mechanism, the incentive for a node to relay a query message is the reward promised by its upstream node. A node can only receive the reward if a provider is discovered on a path extended by its relaying effort. The probability that a provider is found among a node’s descendants depends not only on the node’s own relaying effort (how many neighbors the node relays the message to), but also on its downstream nodes’ relaying efforts. Therefore, after a query message is received along with a promised reward, a node has to decide how many nodes to relay the message to, and what reward should be provided to its downstream nodes. A greater relaying effort enlarges the number of downstream nodes, and a larger promised reward induces a greater relaying effort of the downstream nodes, both leading to a higher probability of covering a provider in the descendants and hence receiving the reward from the upstream node. However, relaying the message to more peers incurs more costs as well due to the consumption of bandwidth, energy, and other resources in a relaying activity (Rogers et al. 2005). Promising a greater reward to a downstream node reduces the net profit that can be kept by the node itself if a provider is found in its descendants. Therefore, a node trades off the cost and return in the decision of the relaying effort and output incentive.

We assume nodes are homogeneous, i.e., each node, except the requestor, has the same ex ante probability to be a provider. In this situation, a node does not discriminate among its neighbors; the success rate (the number of providers discovered) only depends on how many neighbors a node relays the message to, but not on who the message is sent to. This assumption allows us to focus on the performance of the incentive mechanism without the complication of differing local environments of each node. The assumption may be violated in a social network in which a peer acquires information from its past experience about the expertise and functionality of acquaintances and neighbors (Yu and Singh 2003). In that case, a node may incorporate the knowledge of other peers by relaying the message to selected neighbors. But the assumption of homogeneity is appropriate in a simple network such as a sensor network. Lemma 1 shows that in a homogeneous network the expected number of providers discovered is proportional to the number of peers covered in the propagation process.

Lemma 1. In a homogeneous network of $N$ nodes, if the total number of providers is $M$, the expected number of providers covered among $L$ randomly selected nodes is $L \cdot M / N$.
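One way to see why Lemma 1 holds is through linearity of expectation. The following is a sketch under the assumption that the $L$ nodes are drawn uniformly at random from the $N$ non-requestor nodes; it is not necessarily the argument the authors had in mind. The indicator variables $X_j$ are introduced here only for illustration.

```latex
% X_j indicates whether the j-th selected node is a provider.
X_j =
\begin{cases}
1 & \text{if the $j$-th selected node is a provider,} \\
0 & \text{otherwise,}
\end{cases}
\qquad
\Pr[X_j = 1] = \frac{M}{N}.

% Linearity of expectation needs no independence between the X_j,
% so the result holds whether or not nodes are drawn with replacement.
\mathbb{E}\Bigl[\sum_{j=1}^{L} X_j\Bigr]
  = \sum_{j=1}^{L} \mathbb{E}[X_j]
  = L \cdot \frac{M}{N}.
```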

The incentive mechanism constitutes a game in which a peer’s utility depends on its own decision as well as other peers’ decisions. In Section 4 we analyze and approximately calculate the equilibrium strategy of this game.

4. Calculating the equilibrium strategy

In this section we analyze the equilibrium strategy of peers in the incentive mechanism for message relaying. After the notation is introduced, in Section 4.1 we formally define the game and strategy equilibrium, and obtain some analytic insights into a peer’s strategy. Then in Section 4.2 we proceed to provide an algorithm to calculate approximately the symmetric equilibrium.

Denote by $N + 1$ the (estimated) number of peers in the network. Let the requestor be node 0; the other nodes are indexed by $1, 2, \ldots, N$. Let $\mathcal{I} := \{0, 1, \ldots, N\}$ be the collection of all nodes. Denote by $D_i$ the degree of peer $i$, i.e., the number of neighbors of peer $i$. In the network there are $M \le N$ peers (providers) that can provide the information or service requested by the source node (requestor). The value of finding a provider to the requestor is $v_0$. Since the providers are substitutes, the requestor gains $v_0$ even if more than one provider is discovered. Denote by $v_i$, $i = 1, 2, \ldots, N$, the promised reward received by node $i$ from its upstream node. For a node the cost of relaying a message to a neighbor is $c$. The hop number of node $i$ in the propagation process is $h_i$. Note that for the requestor, $h_0 = 0$. Given the incentive $v_i$ and hop number $h_i$, a node $i$ needs to decide the transmission effort $k_i$, i.e., the number of peers to relay the message to, and the output incentive $u_i$, i.e., the incentive to be provided to each downstream node.

4.1. Equilibrium analysis

In order to analyze the equilibrium, we shall define the strategy and utility of a peer, which constitute the P2P incentive message relaying game and a strategy equilibrium.

4.1.1. Strategy and utility

Given the degree $D_i$ of a node $i$, the collection of possible transmission efforts of node $i$ is $\mathcal{K}_i = \{0, 1, \ldots, D_i\}$. Let $\mathcal{H} = \{0, 1, \ldots, H\}$ be the set of possible hop numbers along the propagation, where $H$ is the maximum hop number (TTL).

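Definition 1. The (pure) strategy $s_i$ of a node $i$, $i \in I$, is a map from the hop number $h_i \in \mathcal{H}$ and input incentive $v_i \ge 0$ to the transmission effort $k_i \in \mathcal{K}_i$ and output incentive $u_i \ge 0$: $(k_i, u_i) = s_i(h_i, v_i) : \mathcal{H} \times \mathbb{R}_+ \to \mathcal{K}_i \times \mathbb{R}_+$.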

Let $S_i = \{s_i : \mathcal{H} \times \mathbb{R}_+ \to \mathcal{K}_i \times \mathbb{R}_+\}$ be the strategy space of node $i$, and $s = (s_0, s_1, \ldots, s_N)$ be a strategy profile of all nodes, where $s_i \in S_i$. The strategy profile of all nodes except $i$ is denoted by $s_{-i} := (s_0, \ldots, s_{i-1}, s_{i+1}, \ldots, s_N)$.

The number of providers covered by descendants of node $i$, denoted by $\tilde{s}_i$, is a random variable; the distribution of $\tilde{s}_i$ depends on the strategies of all nodes. Note that if there is only one provider in the network, $M = 1$, then $\tilde{s}_i$ is an indicator variable whose expectation is the probability that the provider is covered by node $i$’s descendants. Given the strategy profile $s$, let $s_i(s) := E[\tilde{s}_i \mid s]$ be the expected value of $\tilde{s}_i$.

If a node $i$ receives an input incentive $v_i$ for each provider found among its descendants, and the node passes on an output incentive $u_i$ to its immediate downstream nodes, then the node receives $(v_i - u_i)\, E[\tilde{s}_i \mid s]$ in expectation. On the other hand, if the node relays the message to $k_i$ nodes, the cost of relaying is $c \cdot k_i$. Therefore, given the strategy profile $s$ of all nodes, the expected utility $U_i(s)$ of node $i$, $i = 1, 2, \ldots, N$, can be characterized by:

$$U_i(s) = (v_i - u_i)\, E[\tilde{s}_i \mid s] - c \cdot k_i. \qquad (1)$$

From Eq. (1) we can see that for a node that is not the requestor, the utility is a linear function of the number of providers found through its relaying. This is because a node receives the reward from its upstream node for each provider that is discovered. However, this is not the case for the requestor, who obtains the same information from all providers and hence benefits from only a single provider. The requestor obtains a value $v_0$ for any $\tilde{s}_0 \ge 1$, but pays $u_0$ for each provider discovered. Therefore, the expected utility of the requestor is:

$$U_0(s) = v_0 \Pr(\tilde{s}_0 \ge 1 \mid s) - u_0\, E[\tilde{s}_0 \mid s] - c \cdot k_0. \qquad (2)$$
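The two utility expressions can be written as small functions. This is a minimal sketch, not the authors' code: the function and parameter names are hypothetical, and the expected coverage and the probability of finding at least one provider are taken as given inputs rather than derived from a strategy profile.

```python
def relay_utility(v_in: float, u_out: float,
                  expected_providers: float, c: float, k: int) -> float:
    """Eq. (1): expected utility of a non-requestor node.

    v_in               -- input incentive promised by the upstream node
    u_out              -- output incentive promised to downstream nodes
    expected_providers -- expected providers covered by the node's descendants
    c, k               -- unit relaying cost and transmission effort
    """
    return (v_in - u_out) * expected_providers - c * k


def requestor_utility(v0: float, u0: float, prob_at_least_one: float,
                      expected_providers: float, c: float, k0: int) -> float:
    """Eq. (2): the requestor gains v0 once, but pays u0 per provider found."""
    return v0 * prob_at_least_one - u0 * expected_providers - c * k0
```

The asymmetry between the two functions reflects the point made above: an intermediate node is paid per provider found, whereas the requestor's benefit saturates after the first provider.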

Note that the number of providers covered by the descendants of node $i$, $\tilde{s}_i$, depends not only on node $i$’s transmission effort $k_i$, but also on the transmission efforts of other nodes. The dependence on other nodes’ efforts arises for two reasons. First, a node competes with other nodes in the same hop in searching for providers; the greater the transmission efforts of other nodes, the lower the chance that a node covers a service provider among its descendants. Second, the total number of providers reached, directly or indirectly, by a node is higher if the node has a larger population of descendants. But the total number of descendants of a node depends on the transmission efforts of its descendants, which in turn are affected by the input incentives passed on from their upstream nodes.

4.1.2. Game and equilibrium

Having described the strategies and utility functions of nodes, we are now ready to define the P2P incentive message relaying game, and the strategy equilibrium in the game.

Definition 2. The P2P incentive message relaying game is defined by the following elements:

Players: the players of the game are the peers in the network, $I = \{0, 1, 2, \ldots, N\}$.
Strategies: the strategy space of each peer $i$ is $S_i$, with the pure strategy of a peer $i$ defined as $(k_i, u_i) = s_i(h_i, v_i) : \mathcal{H} \times \mathbb{R}_+ \to \mathcal{K}_i \times \mathbb{R}_+$, $i = 1, \ldots, N$, and $(k_0, u_0) = s_0(0, v(\cdot))$.
Payoffs: the utility function of peer $i$, given a strategy profile $s$, is $U_i(s)$ as defined in Eqs. (1) and (2). The objective of each node is to maximize its utility.

A pure strategy Nash equilibrium is a strategy profile such that for each node, given the strategies of the others, its strategy maximizes its expected utility.

Definition 3. A pure strategy Nash equilibrium (NE) is a profile $s$ of all nodes’ strategies such that for all nodes $i$,

$$U_i(s_i \mid s_{-i}) \ge U_i(s_i' \mid s_{-i}) \quad \text{for all } s_i' \in S_i.$$

Proposition 4. In a pure strategy Nash equilibrium, for a node $i$, $i = 1, 2, \ldots, N$,

(i) When $v_i$ decreases, $k_i$ or $u_i$ decreases, and it is not possible that both increase.

(ii) The expected number of descendants increases with $v_i$.

Proposition 4 says that in the equilibrium strategy, a node that receives a lower input incentive will decrease the output incentive or transmission effort. Note that a lower output incentive further reduces the output incentive or transmission effort of the downstream nodes, leading to lower future transmission efforts. As a result, a node develops a smaller population of descendants when facing a lower input incentive. As the input incentive decays by hops along a relaying path, this means that with this incentive mechanism the flooding problem and pyramid effect are automatically avoided through peers’ individual self-interested actions.

4.2. Approximation of utility in a symmetric network

Calculating the equilibrium of the incentive message relaying game is difficult. This is because message relaying is a stochastic process and as a result the outcome of the family tree is random: it is uncertain which nodes will receive the message from a peer and, among those receivers, which will become downstream nodes of the sender (as a node may receive the message from several peers). To count all the possible outcomes of the family tree and measure their probabilities is not possible for an agent with limited computational resources or bounded rationality (Gigerenzer and Selten 2002). In this section, we develop an algorithm based on symmetric networks to approximately calculate the population of descendants of a node.

We call a P2P network symmetric if all nodes are homogeneous with the same degree (the number of neighbors) and equal probability of being a provider. While nodes in a realistic unstructured network may have different degrees, a symmetric network can be considered as an approximation using the average degree of nodes as the actual degree of each node. It is a simplification that overlooks the detailed structures of the network and uses information on the aggregated (average) level, a model often used for bounded rationality (Simon 1982). In a symmetric game, since all nodes are homogeneous, we can drop the node index; two nodes have equal expected utility if their strategies are the same. Therefore, we can restrict attention to symmetric strategy profiles in which all nodes have the same strategy, with the output incentive and transmission effort depending only on the hop number and input incentive, but not on the node index. Thus the simplification using a symmetric network greatly reduces the dimensions of the strategy space. Denote a symmetric strategy by $(k, u) = s_h(v)$, where $h$ represents the hop number, $v$ the input incentive, $k$ the transmission effort, and $u$ the output incentive.

Given a strategy profile, we approximately calculate the descendant population of a node following the certainty equivalence estimate (Bertsekas 2005). In this approximation, the expected number of immediate downstream nodes developed by each node is treated as certainty; this allows estimating the expected number of descendants of a node without enumerating outcomes of the family tree. Such an approximate solution follows the certainty equivalence theory used in decisions with bounded rationality (Handa 1977, Schneeweiss 1974). In the following, we first describe how to estimate the expected number of descendants of a node, given the transmission efforts of each hop. We then show how to estimate the expected number of providers covered among the descendants of a node, given a strategy profile of all hops. After that, we present how to search for a symmetric NE.

In order to estimate the number of descendants of a node, we need to know the expansion of the family tree by each hop. Proposition 5 characterizes such expected expansion by the relaying actions of one hop, given the situation of the system before the relaying actions of the hop.

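Proposition 5. If there are $n_h$ knowers in the system up to (and including) hop $h$, the number of nodes in hop $h$ is $m_h$, and each node in hop $h$ relays the message to $k_h$ peers, then

(1) the expected number of ignorants $\bar{d}_h$ reached by each node in hop $h$ is:

$$\bar{d}_h = \begin{cases} 0 & \text{if } k_h = 0 \text{ or } m_h = 0, \\ \dfrac{N - n_h}{m_h}\left[1 - \left(1 - \dfrac{k_h}{N - 1}\right)^{m_h}\right] & \text{else;} \end{cases}$$

(2) the expected number of nodes in hop $h + 1$ is:

$$\bar{m}_{h+1} = m_h \bar{d}_h = (N - n_{h-1} - m_h)\left[1 - \left(1 - \frac{k_h}{N - 1}\right)^{m_h}\right]; \qquad (3)$$

(3) the expected number of nodes up to hop $h + 1$ is:

$$\bar{n}_{h+1} = n_h + m_h \bar{d}_h. \qquad (4)$$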

Based on Proposition 5, given the number of knowers $n_h$ up to hop $h$ and the number of nodes $m_h$ in hop $h$, we can calculate the expected number of knowers $\bar{n}_{h+1}$ after the relaying actions of hop $h$, and the expected number of nodes $\bar{m}_{h+1}$ in the next hop $h + 1$, with the relaying effort known. With the certainty equivalence estimate, we take the expectations $\bar{m}_{h+1}$ and $\bar{n}_{h+1}$ as certainty: $m_{h+1} = \bar{m}_{h+1}$, $n_{h+1} = \bar{n}_{h+1}$. This allows us to apply Eqs. (3) and (4) recursively starting from hop 0, and to obtain an estimate of the expected numbers of nodes in all hops. Initially, the only knower is the requestor: $n_0 = m_0 = 1$.
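The one-hop expansion of Proposition 5 can be sanity-checked by simulation. The sketch below is illustrative and rests on assumptions not spelled out in the text: the population is taken to be $N$ nodes labelled $0, \ldots, N-1$, the first $n_h$ of which are knowers, and each hop-$h$ sender picks $k_h$ distinct targets uniformly among the other $N - 1$ nodes, independently of the other senders. All function names are hypothetical.

```python
import random


def expected_new_nodes(N: int, n_h: int, m_h: int, k_h: int) -> float:
    """Closed form for the expected size of hop h+1, i.e. m_h * d_bar_h."""
    if k_h == 0 or m_h == 0:
        return 0.0
    return (N - n_h) * (1.0 - (1.0 - k_h / (N - 1)) ** m_h)


def simulate_new_nodes(N: int, n_h: int, m_h: int, k_h: int,
                       trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the number of distinct ignorants reached."""
    rng = random.Random(seed)
    # Assumed labelling: nodes 0..n_h-1 are knowers; the last m_h of them
    # are the hop-h senders; nodes n_h..N-1 are ignorants.
    senders = range(n_h - m_h, n_h)
    others = {s: [j for j in range(N) if j != s] for s in senders}
    total = 0
    for _ in range(trials):
        reached = set()
        for s in senders:
            for target in rng.sample(others[s], k_h):
                if target >= n_h:  # only ignorants join hop h+1
                    reached.add(target)
        total += len(reached)
    return total / trials
```

Under this sampling model a given ignorant is missed by one sender with probability $1 - k_h/(N-1)$, which is where the bracketed term in the closed form comes from; calling both functions with, say, `(50, 4, 3, 3)` lets the two estimates be compared.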


In a symmetric system, the family tree is developed evenly, with nodes in the same hop having the same number of descendants in expectation. Therefore, given the expected numbers of nodes in all hops, the expected number of descendants of a node in hop $h$ can be estimated by:

$$L_h \approx \frac{1}{\bar{m}_h} \sum_{l=h+1}^{H} \bar{m}_l,$$

where $\sum_{l=h+1}^{H} \bar{m}_l$ is the total expected number of descendants of nodes in hop $h$.

Based on Lemma 1, given $L_h$, the expected number of providers covered among the descendants of a node in hop $h$ is then equal to $\bar{s}_h = L_h \cdot M/N$, where $M/N$ is the probability of a node being a provider.

Given $\bar{s}_h$, the expected utility $U_h$ of a node in hop $h$, $h \ge 1$, can be approximated by:

$$U_h \approx (v_h - u_h)\, \bar{s}_h - c \cdot k_h. \qquad (5)$$

For the requestor, taking $\bar{s}_0$ as certainty, the expected utility $U_0$ can be approximated by:

$$U_0 \approx v_0 \min(\bar{s}_0, 1) - u_0 \cdot \bar{s}_0 - c \cdot k_0. \qquad (6)$$

Given a symmetric strategy profile of all hops, the process of predicting (approximately) the expected utility of a node in each hop, based on certainty equivalence, is summarized in Algorithm 1.

Algorithm 1. Approximate the utility given a symmetric strategy profile

Step 0: $m_0 = 1$, $n_0 = 1$, $h = 0$.
Step 1: Calculate $\bar{m}_{h+1}$ and $\bar{n}_{h+1}$ following Eqs. (3) and (4).
Step 2: Let $m_{h+1} = \bar{m}_{h+1}$ and $n_{h+1} = \bar{n}_{h+1}$. Let $h = h + 1$. If $h < H + 1$, go to Step 1; otherwise go to Step 3.
Step 3: $L_h \approx \frac{1}{\bar{m}_h} \sum_{l=h+1}^{H} \bar{m}_l$, $\bar{s}_h \approx L_h \cdot M/N$, for all hops $h$.
Step 4: Calculate $U_h$ for all hops $h$ following Eq. (6) ($h = 0$) or Eq. (5) ($h \ge 1$).
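The steps of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading of the algorithm rather than the authors' implementation: the function name and the list-based encoding of a symmetric strategy profile (`v[h]`, `u[h]`, `k[h]` for hop $h$) are assumptions, and the sketch assumes $k_h < N - 1$ so that the power in Eq. (3) stays real.

```python
def approximate_utilities(N, M, H, c, v, u, k):
    """Certainty-equivalence estimate of per-hop utilities.

    v, u, k are lists of length H + 1 giving, for each hop h, the input
    incentive, output incentive and transmission effort (hypothetical encoding).
    Returns (m, n, utilities), where m[h] and n[h] are the estimated number of
    nodes in hop h and of knowers up to hop h.
    """
    m = [1.0]  # Step 0: the requestor is the only node in hop 0 ...
    n = [1.0]  # ... and the only knower
    for h in range(H + 1):  # Steps 1-2, repeated while h < H + 1
        if k[h] == 0 or m[h] == 0:
            d_bar = 0.0
        else:
            d_bar = (N - n[h]) / m[h] * (1.0 - (1.0 - k[h] / (N - 1)) ** m[h])
        m.append(m[h] * d_bar)         # Eq. (3), taken as certainty
        n.append(n[h] + m[h] * d_bar)  # Eq. (4), taken as certainty

    utilities = []
    for h in range(H + 1):
        # Step 3: descendants per hop-h node, then providers covered (Lemma 1)
        L_h = sum(m[h + 1:H + 1]) / m[h] if m[h] > 0 else 0.0
        s_h = L_h * M / N
        # Step 4: Eq. (6) for the requestor, Eq. (5) otherwise
        if h == 0:
            U = v[0] * min(s_h, 1.0) - u[0] * s_h - c * k[0]
        else:
            U = (v[h] - u[h]) * s_h - c * k[h]
        utilities.append(U)
    return m, n, utilities
```

As a usage illustration with assumed inputs, `approximate_utilities(N=50, M=1, H=2, c=0.15, v=[10, 7, 2], u=[7, 2, 0], k=[3, 3, 0])` gives an estimated three nodes in hop 1 and roughly eight in hop 2, with a requestor utility of about 0.21 and a hop-1 utility of about -0.19.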

5. Simulation

In this section we provide experimental results benchmarking the system performance of the distributed search based on our incentive mechanism against breadth-first search (BFS) and random walks, two commonly used schemes for searching (Yang and Garcia-Molina 2002, Lv et al. 2002). We do not consider depth-first search (DFS). This is because, without considering the response time, DFS undoubtedly outperforms the others in terms of the system utility: in DFS a message is only relayed further, sequentially between peers, when no result has been found. But this efficiency of DFS comes at the cost of a long response time. The response time of BFS, random walks and our incentive mechanism is bounded by $O(\mathrm{TTL})$ because the relaying activities of nodes in the same hop are conducted simultaneously. But the response time of DFS is in the order of the total number of relaying activities, since they are conducted sequentially.

We first present in detail the search algorithms in the three schemes (Section 5.1), then describe the setup of the experiments (Section 5.2), followed by the experimental results in the rest of this section.

5.1. Search algorithms in experiments

We consider three classes of search algorithms in the experiments: distributed search with the incentive mechanism, BFS and random walks. There are two strategies for distributed search. One is based on the approximate symmetric Nash equilibrium calculated following the approach in Section 4.2, denoted as approximate distributed search. The other is the strategy learned in the message relaying processes using reinforcement learning, denoted as distributed search. Reinforcement learning is also a heuristic often considered in decisions with bounded rationality (Gigerenzer and Selten 2002).

In reinforcement learning each peer has a strategy table, for example, $H \times V \times K \times U = 4 \times 10 \times 4 \times 10$, where the hops of the message $H$ can be $\{0, 1, \ldots, 4\}$, the input $V$ and output $U$ incentives are bounded by 10 on a discrete space $\{0, 1, \ldots, 10\}$, and the transmission effort $K$ is bounded by 4. The initial incentive $v_0$ is 10 for each query. Peers are randomly chosen as service providers. A peer in the system queries other peers through message relaying.

Each peer uses a standard $\epsilon$-greedy algorithm to explore the discretized strategy space, where $\epsilon = 0.1$ (Sutton and Barto 1998). Specifically, the estimated value of state $a$ after $t$ plays is denoted by $Q_t(a)$, and $r_{k_a}$ is the reward from choosing state $a$ for the $k_a$-th time, where state $a$ is a combination of $V$, $H$, $U$, $K$. A peer updates the estimated value of a state, if the state is chosen, based on the current reward and the previous estimated value. If a state gives a higher reward in this period, the estimated value of the state will accordingly be higher. The update of the estimated value is based on the following rule:

$$Q_{k_a+1}(a) = Q_{k_a}(a) + \alpha\,\bigl(r_{k_a+1} - Q_{k_a}(a)\bigr), \qquad (7)$$

where the step size $\alpha$ is a constant, $0 < \alpha \le 1$. Based on this rule the recent rewards are weighted more heavily than long-past ones. This is necessary in a non-stationary environment in which the mean reward of a state changes over time. The recent environment is more similar to the environment today, and therefore gives more information, than the environment of the distant past. With a constant step size $\alpha$, the estimates never completely converge but continue to vary in response to the most recently received rewards (Sutton and Barto 1998). This is actually desirable in the non-stationary environment faced by each peer in peer-to-peer systems.
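The learning rule can be sketched as a small class. This is a minimal illustration, not the authors' implementation: the class and method names are hypothetical, the candidate actions stand in for the (output incentive, transmission effort) choices available in a given situation, and the default step size of 0.1 is an assumption, since the text only requires a constant step size in $(0, 1]$.

```python
import random


class EpsilonGreedyLearner:
    """Epsilon-greedy action selection with a constant step-size update."""

    def __init__(self, actions, epsilon=0.1, alpha=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon  # exploration probability
        self.alpha = alpha      # constant step size (assumed value)
        self.q = {a: 0.0 for a in self.actions}  # estimated values, start at 0
        self.rng = random.Random(seed)

    def choose(self):
        """Explore with probability epsilon, otherwise pick the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[a])

    def update(self, action, reward):
        """Constant step-size rule: Q <- Q + alpha * (reward - Q)."""
        self.q[action] += self.alpha * (reward - self.q[action])
```

Because the step size is constant, each update moves the estimate a fixed fraction of the way toward the latest reward, so older rewards are discounted geometrically, which is the recency weighting described above.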

We find that the approximately optimal strategies of most peers converge to the following:

• Given the initial incentive $v_0 = 10$ and hop number $h = 0$, the peer will ask 3 neighbors with output incentive 7.

• Given the input incentive $v_1 = 7$ and hop number $h = 1$, the peer will ask 3 neighbors with output incentive 2.

Breadth-first search (BFS) is a simple searching strategy in peer-to-peer systems. Here we consider a variant of BFS in which the requesting peer randomly chooses a ratio of its neighbors to forward the query to. The ratio is also called the branching factor and is six in this paper. In random walks, the requesting peer sends out $k$ query messages (or walkers) to randomly chosen neighbors ($k = 5$, 10, or 15 in this paper if not specified). Each walker follows its own path, having nodes forward it to a randomly chosen neighbor at each step. The random walks are restricted to the same number of hops used in the other mechanisms.
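The random-walk baseline described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the graph is given as a hypothetical adjacency mapping `neighbors`, each walker chooses its next node uniformly at random, and coverage is counted as the set of distinct peers other than the requestor that receive a walker.

```python
import random


def random_walk_coverage(neighbors, requestor, walkers, ttl, seed=None):
    """Return the set of peers reached by `walkers` independent random walks,
    each limited to `ttl` hops, starting from `requestor`."""
    rng = random.Random(seed)
    covered = set()
    for _ in range(walkers):
        current = requestor
        for _ in range(ttl):
            if not neighbors[current]:
                break  # dead end: the walker cannot be forwarded further
            current = rng.choice(neighbors[current])
            if current != requestor:
                covered.add(current)  # revisits do not add coverage
    return covered
```

Since `covered` is a set, repeated visits to the same node do not increase coverage even though each forwarding step still costs a message, which is one way to see why the message count of random walks can exceed their coverage.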

5.2. Experiments setup

The topology of the system is initialized as a directed random graph. We use a random graph with $N$ nodes and approximately 10 out-edges per node (to its neighbors) as a starting point for the experiment. Other parameters are defined as follows: (1) the total number of nodes $N$ is in the range from 50 to 250; (2) the total number of service providers in the network is $M$, where $M = 1$ denotes that there is one service provider in the system; (3) the initial incentive $v_0$ changes from 10 to 30; (4) the maximal hop number of a message is $H = 2$; in other words, each query is propagated for three hops; and (5) the branching factor $D$ for BFS and distributed search is 6.

Fig. 3. Comparing the coverage in experiments for a single provider ($M = 1$, $N = 50$, $c = 0.15$). [Plot of coverage against value of information for BFS, random walks (5 walkers), random walks (10 walkers), distributed search, and approximate distributed search.]

We are interested in (1) the total utility of the system; (2) the total coverage, i.e., the number of peers exposed to the message, at the end of the propagation, which measures the reliability. We show the average system utilities and coverage for all search algorithms over 10 iterations. In each iteration, each peer in the system is allowed to query other peers once. For distributed search using reinforcement learning, we compute its average system utility and coverage after 100 learning iterations.

We first consider only one service provider for a given unit cost $c$. Then we examine the results for multiple service providers and the effects of different unit costs, network sizes and network topologies.

5.3. Single service provider

Fig. 2 shows that, for the linear cost function $0.15k$, the incentive mechanism generally achieves a higher system utility than BFS and random walks. When $v_0$, the value of finding the service, is low, BFS walkers could bring a negative utility.

With the increase of $v_0$, the utility advantage of the incentive mechanism over BFS decreases. The experimental results show that distributed search is close to BFS when $v_0$ is equal to 30. This is because BFS generates very high coverage, which is close to optimal when the query value is high. In other words, given the low communication costs, distributed search sometimes cannot find the service provider, while BFS can find it owing to its message flooding. One hypothesis is that if we impose a different cost function, e.g., $0.5k$, distributed search will win. We examine the effects of cost functions on system utilities in Section 5.5.

Fig. 2 also shows that the utilities for strategies of propagating messages beyond 2 hops are negative when the input incentive is low. This is due to communication costs and saturation of the network. The strategies of peers show that the transmission effort using our model is more efficient than flooding (breadth-first search), since a peer in our incentive mechanism only queries a subset of its neighbors. The system utility is close to the prediction of the system performance of relaying mechanisms for the linear cost function cases.

The total number of peers that receive the query message during the propagation process based on each mechanism, or the coverage, is recorded in Fig. 3.

We can find that BFS always covers more peers than other mechanisms, which implies a higher reliability (but its utility may be lower, as shown in Fig. 2). With the given policy parameters, the coverage generated by BFS and random walks is independent of $v_0$, the value of finding the service. The coverage in the incentive and approximate search mechanisms increases with $v_0$, as a peer adapts its searching strategies in the message relaying processes. However, in random walks, walkers may visit the same nodes multiple times, which results in lower coverage compared with other search algorithms such as BFS and distributed search.

Fig. 2. Comparing the system utility in experiments for a single provider ($M = 1$, $N = 50$, $c = 0.15$).

5.4. Multiple service providers

We have assumed in the above experiment that there is only one service provider. Next we study the model with multiple service providers.

Fig. 4 shows the system utility when a total of three service providers are available in the system. A peer will get the reward if it finds any of the service providers. We can see that the utility of random walks is very close to that of distributed search when the incentive $v_0$ is low, but distributed search becomes better when $v_0$ is high. This indicates that random walks can have a good chance to cover at least one service provider when multiple ones are available in the system. Also, when the incentive $v_0$ is high, BFS gains more utility than distributed search. The reason is that BFS can guarantee to find at least one service provider; this high reliability leads to a high utility if the communication cost is low. We will study the impact of communication costs in Section 5.5.

Fig. 5 shows the coverage of different searching mechanisms. We can find that, just like the case with a single provider, BFS always covers more peers than the other mechanisms. When there are multiple providers in the system, the coverage in distributed search is close to that of random walks with 10 walkers. However, random walks still get lower system utility even though they might have similar coverage to distributed search. This indicates that in random walks peers may send more messages than necessary to find a service provider.

5.5. Effect of communication cost

From the experiments above we notice that BFS may gain more utility than distributed search for a communication cost $c = 0.15$. In this section we study the impact of the unit communication cost on system utilities by changing $c$, where $c = 0.1, 0.2, 0.3, 0.4, 0.5$. For simplicity, we only consider the case of one service provider.

Fig. 4. Comparing the system utility in experiments for three service providers ($M = 3$, $N = 50$, $c = 0.15$).

Fig. 5. Comparing the coverage in experiments for multiple providers ($M = 3$, $N = 50$, $c = 0.15$).

Fig. 6. Comparing the system utility in experiments for different unit costs ($M = 1$, $N = 50$, $c = 0.1, \ldots, 0.5$, $v_0 = 30$). [Plot of utility against value of unit cost for BFS (single provider), distributed search (single provider), BFS (multiple providers), and distributed search (multiple providers).]



Fig. 6 shows the system utility of BFS and distributed search for the general linear cost function. We can find that the utility of BFS drops rapidly when the unit cost increases. Distributed search is not as sensitive to the cost. The reason is that in distributed search, a peer can dynamically adjust its transmission effort to balance its cost and the probability of success.

5.6. Effect of network size

We study the effects of network size on different search schemes. We have the number of nodes in the P2P network varying from 50 to 1500. We only consider a fixed value of $v_0$, where $v_0 = 30$. In order to make the results comparable, we assume that every fifty nodes have three providers and each message is allowed to be propagated for three hops for all different network sizes.

Fig. 7 shows the results for BFS, random walks (with 5, 10 and 15 walkers) and distributed search. We find that distributed search is generally better than BFS and random walks. It is also interesting to note that the performance of distributed search and random walks is relatively insensitive to the network size,² while BFS has its utility decreasing rapidly with the network size up to a point (about 300), and then staying relatively stable for larger sizes. These results may be explained as follows. Note that the number of providers is proportional to the network size and the degree of each node does not change with the network size. Thus in distributed search and random walks the relaying efforts and coverage do not change significantly with the network size. In BFS, when the network size increases the chance of having overlapped message relaying becomes lower, which causes a greater amount of relaying activities. For example, the coverage of BFS is about 80 in a network of size 250, which is much higher than that for a smaller network with 50 nodes. This explains why the utility of BFS decreases with the network size. But this effect becomes negligible when the network gets large. In this case, the overlapping probability becomes so low that it is hardly affected when the network size increases further.

² The variation of the performance of distributed search is largely due to the fact that the learning algorithm may converge to different points.

5.7. Effect of network topologies

Fig. 7. Comparing the system utility in experiments for multiple providers ($M = 3 \cdot N/50$, $N = 50, 100, \ldots, 250$, $v_0 = 30$, $c = 0.15$). [Plot of utility against network size.]

Fig. 8. Comparing the system utility in experiments of small-world networks for a single provider ($M = 1$, $N = 50$, $c = 0.15$).

In the previous subsections we have shown the effectiveness of our incentive mechanism in a random network. In this subsection we study our incentive mechanism in a small-world network, which presents features of social networks. Following Watts et al. (1998), we begin with a ring, but unlike them, we allow for edges to be directed. We use a regular ring with 50 nodes and 10 out-edges per node (to its neighbors) as a starting point for the experiment, and then we rewire each edge at random with probability $p = 0.2$. Note that simply rewiring the network does not change the number of edges in it.
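One possible reading of this construction is sketched below. The details are assumptions, since the text does not specify them: each node initially points to its five nearest neighbors on either side of the ring, and rewiring replaces the target of an edge with a uniformly chosen node that is neither the source nor an existing target. Under this particular convention every out-degree stays at 10 and only in-degrees vary; rewiring the source endpoint instead would change out-degrees. The function name is hypothetical.

```python
import random


def small_world_digraph(n=50, out_degree=10, p=0.2, seed=None):
    """Directed ring lattice with each edge rewired with probability p.

    Assumes out_degree is even and smaller than n - 1.
    Returns a dict mapping each node to its list of out-neighbors.
    """
    rng = random.Random(seed)
    half = out_degree // 2
    # Regular ring: `half` nearest neighbors clockwise and counter-clockwise.
    out = {
        i: [(i + d) % n for d in range(1, half + 1)]
           + [(i - d) % n for d in range(1, half + 1)]
        for i in range(n)
    }
    for i in range(n):
        for idx in range(len(out[i])):
            if rng.random() < p:
                # New target: not the node itself, not an existing out-neighbor.
                candidates = [j for j in range(n)
                              if j != i and j not in out[i]]
                if candidates:
                    out[i][idx] = rng.choice(candidates)
    return out
```

Because each rewiring replaces one target with another, the total number of edges is unchanged, in line with the remark above.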

Fig. 8 shows that, for the linear cost function $0.15k$, the incentive mechanism generally achieves a higher system utility than BFS and random walks in a small-world network. We also find that, for a single provider, our incentive mechanism performs much better in a small-world network than in a random network, as shown in Fig. 2. This is partially due to the property of small-world networks in which some peers have a significantly higher out-degree (i.e., number of neighbors) than other agents (Yu et al. 2003). In our incentive mechanism, a peer can quickly identify these peers with higher out-degree, which leads to shorter paths to the service provider.

6. Discussion

Our mechanism can be implemented with virtual currency that a peer can borrow from the system (such as a virtual bank). Thus a node does not need to have a “deep” pocket to request message relaying service from other peers. In addition, since a node acts optimally to maximize its expected utility, considering the benefit and cost, in the long run a node achieves a positive balance. Thus the system is sustainable.

In our incentive mechanism, a node receives the promised reward only if a provider is discovered along the path containing this node; in other words, the payment is made contingent on the result but not on the action. Under this mechanism, no payment will be made if defaults occur. This eliminates free-riding as a cause of defaults. Defaults may occur due to random factors that lead to the failure of a message relaying action. In the paper we assume message relaying is reliable; in other words, if a node relays the message to another node, this message will always successfully reach the receiver. This assumption can be easily relaxed with the introduction of a relaying failure probability (the probability of the message not reaching the receiver) to account for the random factors causing a relaying failure. Then the descendant coverage and expected utility of a node should be modified accordingly with such a probability.

Some people may find our incentive mechanism similar to the multi-level marketing (MLM) model (Carroll 2009, Druff, www.vandruff.com/mlm.html). In MLM a distributor is paid for developing her downstream distributors, and takes credit from all the profit made by all distributors that are branched directly or indirectly from her. Although MLM inspires the mechanism in this work, these two mechanisms are different. MLM is notorious for causing the pyramid effect. The incentive mechanism proposed in this paper keeps the advantage of distributive propagation in MLM, but avoids the pyramid effect. A peer is informed of the stage of the propagation by the current hop number that is carried in the message. A peer will estimate the current system state and foresee its behavior in the future propagation based on its position in the “family tree”. But the pyramid effect exists in MLM because people are unaware of their positions in the network and the market status, and are instead usually misled by the getting-rich stories of a few big distributors that exaggerate the potential opportunities of making money.

In our incentive mechanism the intermediate nodes compete with each other by choosing their prices and relaying efforts. The competition between nodes is essentially different from the classic Bertrand and Cournot competitions (Tirole 1988). In the latter two games firms selling a homogeneous good compete by choosing their prices (in a Bertrand competition) or quantities (in a Cournot competition). In these two games firms move simultaneously; the profit of a firm is fully determined given the actions taken by all firms at the same time. But in our message relaying game the intermediate nodes do not all act at the same time: the nodes in different hops move sequentially, although those in the same hop are considered to act simultaneously. Due to the existence of sequential moves, the action taken by a node influences the actions of its descendants. If a node offers a higher price, its downstream node will pass on a higher price to the descendants or relay the message to reach more descendants, and this influence passes further down the generations. Thus when deciding the price and efforts, a node needs to take into consideration not only the simultaneous actions of other nodes in the same hop, but also the subsequent actions of the nodes in later hops. The difference in the game structure makes the solution of our game essentially different from the ones for Bertrand or Cournot competitions.

Generally, repeated competition may create a chance of collusion, reducing the effect of competition. For example, in the Bertrand game firms may collude by all charging higher prices, while in the Cournot game they may do so by all producing lower quantities (which results in a higher price). In our game, nodes may collude by all reducing the prices they offer to the downstream nodes. This improves the margin of a node, since a lower price is paid to a downstream node if message relaying is successful. However, this benefit of collusion on price is offset by its influence on the later relaying process: with lower prices being offered, the downstream nodes will reduce their relaying efforts or pass lower prices to their descendants, which reduces the chance of covering a provider among the descendants of a node. We also think it is unlikely for nodes to collude by all reducing their relaying efforts. Unlike in a Cournot game, where reducing the quantity improves the price, in our message relaying game reducing the efforts does not influence the price, because the price is given by the upstream node before the actions of a node are taken. Thus we think collusion may be less of a concern in our message relaying game than in Bertrand or Cournot competitions.

7. Conclusion and future work

In this paper we present an incentive mechanism for message relaying in peer-to-peer discovery. In this problem the common micro-payment protocol based on the relaying actions is not feasible for an anonymous message relaying process. By pricing the search result but not the relaying action, our mechanism provides appropriate incentives for distributed message relaying that induce efficient tradeoffs between communication efficiency and reliability, while satisfying information locality.

We focus on the cost and benefit of message relaying for information discovery in a simulated testbed without considering network dynamics such as node arrival and departure. In future work we plan to study our incentive mechanism within peer-to-peer social networks like Myspace and examine its effectiveness in the continuous process of node arrival and departure. We believe that the approximate strategy equilibrium developed in this paper helps one to understand the influence of some important policy parameters, such as time-to-live and dynamics of the network, on the system performance. The approximate equilibrium strategy can also be used as a reference for a peer’s decision, in combination with other techniques such as learning or informed search. Joining a network may be costly for a node due to setup costs such as fees to install hardware and software. In such a case, participation is a long-term decision that needs to take into account the setup cost along with the costs and benefits after joining the network. The participation of nodes and the network formation would be another interesting topic for future research.

For a node that requests a certain service, the process leading to the requestor receiving the service can be divided into two stages. In the first stage, a query is propagated in the system searching for a provider. In the second stage, after a provider is found, the requestor negotiates with the provider on the contract/price for the service. In this paper we focus on the first stage: how to conduct an efficient searching process in the network. The negotiation between the requestor and provider in the second stage may be considered in future work.

Acknowledgement

This research was supported by the AFOSR under Grant No. F49620-01-1-0542 and by AFRL/MNK Grant No. F08630-03-1-0005.

Appendix

Before proving Proposition 4, we need to show Lemmas 2 and 3. Let $L_i(k_i, u_i, h_i)$ and $s_i(k_i, u_i, h_i)$ denote the expected number of descendants and the expected number of providers covered by the descendants of node $i$, when its transmission effort is $k_i$, output utility is $u_i$, and hop number is $h_i$, given other peers' strategies. Define $\Delta L_i(k_i, u_i, h_i) \doteq L_i(k_i + 1, u_i, h_i) - L_i(k_i, u_i, h_i)$ (respectively, $\Delta s_i(k_i, u_i, h_i) \doteq s_i(k_i + 1, u_i, h_i) - s_i(k_i, u_i, h_i)$) as the marginal increase of the expected number of descendants of node $i$ (respectively, of providers covered by these descendants) with respect to the transmission effort of $i$.

Lemma 2. $\Delta L_i(k_i, u_i, h_i) > 0$ decreases with $h_i$ and $k_i$.

Proof. Define $\Delta \hat{L}_i(k_i, u_i, h_i - 1) \doteq \Delta L_i(k_i, u_i, h_i - 1 \mid \mathrm{TTL} = H - 1)$ as the marginal increase of the expected number of node $i$'s descendants, excluding those in hop $H$, when $i$ is in hop $h_i - 1$. $\Delta L_i(k_i, u_i, h_i)$ is nonnegative for any $h_i$ and $k_i$. $\Delta L_i(k_i, u_i, h_i)$ and $\Delta \hat{L}_i(k_i, u_i, h_i - 1)$ involve the same number of hops, but the hops for $\Delta \hat{L}_i(k_i, u_i, h_i - 1)$ are one hop earlier than those for $\Delta L_i(k_i, u_i, h_i)$. We have $\Delta L_i(k_i, u_i, h_i) \le \Delta \hat{L}_i(k_i, u_i, h_i - 1)$ because the probability of a message reaching a knower is smaller earlier in the propagation process (when the hop number is smaller). Then $\Delta L_i(k_i, u_i, h_i - 1) \ge \Delta \hat{L}_i(k_i, u_i, h_i - 1) \ge \Delta L_i(k_i, u_i, h_i)$ for any $h_i$ and $k_i$. Therefore, $\Delta L_i(k_i, u_i, h_i)$ decreases with $h_i$.

The higher $k_i$, the greater the number of downstream nodes of $i$, and the lower the probability that node $i$ reaches an ignorant not among these downstream nodes by sending the message to one more neighbor. Also, the higher $k_i$, the smaller the number of descendants of each downstream node. Therefore, $L_i(k_i, u_i, h_i)$ increases with $k_i$, but at a decreasing rate. $\square$

Lemma 3. $\Delta s_i(k_i, u_i, h_i) > 0$ decreases with $h_i$ and $k_i$.

Proof. Since the expected number of providers covered in node $i$'s descendants is proportional to the number of descendants, the conclusion follows from Lemma 2. $\square$

Proof of Proposition 4. The expected utility of node $i$, given strategy $(k_i, u_i)$, is:

\[ U_i(k_i, u_i) = (v_i - u_i)\, s_i(k_i, u_i, h_i) - c k_i. \]

Denote by $k_i^*(v_i, h_i)$ and $u_i^*(v_i, h_i)$ the best response strategy.

$h_i = H - 1$:

Since there is no further propagation done by node $i$'s downstream nodes, the output incentive is zero, $u_i^* = 0$. Based on Lemma 3, $U_i(k_i, 0)$ is a submodular function of $k_i$, and $k_i^*$ satisfies $\frac{\partial}{\partial k_i} U_i(k_i^*, 0) = 0$. Therefore,

\[ \frac{d k_i^*}{d v_i} = - \frac{\frac{\partial}{\partial k_i} s_i(k_i, 0, H - 1)}{v_i \frac{\partial^2}{\partial k_i^2} s_i(k_i, 0, H - 1)}. \]
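The step from the first-order condition to the derivative above can be spelled out as follows. This is a sketch that treats $k_i$ as a continuous variable and $s_i$ as twice differentiable in $k_i$, as the proof implicitly does; $c$ is the unit relaying cost from the utility function.

```latex
% First-order condition at u_i = 0 and h_i = H - 1:
\frac{\partial}{\partial k_i} U_i(k_i^*, 0)
  = v_i \frac{\partial}{\partial k_i} s_i(k_i^*, 0, H-1) - c = 0 .

% Differentiate both sides with respect to v_i (k_i^* depends on v_i):
\frac{\partial}{\partial k_i} s_i(k_i^*, 0, H-1)
  + v_i \frac{\partial^2}{\partial k_i^2} s_i(k_i^*, 0, H-1)\,
    \frac{d k_i^*}{d v_i} = 0 .

% Solving for d k_i^* / d v_i gives the stated expression.
```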

Based on Lemma 3, $\frac{\partial}{\partial k_i} s_i(k_i, 0, H - 1) \ge 0$ and $\frac{\partial^2}{\partial k_i^2} s_i(k_i, 0, H - 1) \le 0$. Therefore, $\frac{d k_i^*}{d v_i} > 0$ and it decreases with $v_i$, i.e., $k_i^*$ is an increasing submodular function of $v_i$. Given that $L_i(k_i, u_i, h_i)$ is an increasing concave function of $k_i$, $L_i(k_i^*, 0, H - 1)$ is an increasing concave function of $v_i$.
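As a numerical illustration of the last-hop case, the sketch below computes the best-response effort for several input incentives. The function `coverage`, its saturating exponential form, and all parameter values are assumptions chosen only because they are increasing and concave in the effort, as Lemma 3 requires; they are not taken from the paper.

```python
import math


def coverage(k, s_max=1.0, rate=0.5):
    """Hypothetical s_i(k, 0, H-1): increasing and concave in the effort k."""
    return s_max * (1.0 - math.exp(-rate * k))


def utility(k, v, c):
    """U_i(k, 0) = v * s_i(k, 0, H-1) - c * k, with output incentive u_i = 0."""
    return v * coverage(k) - c * k


def best_effort(v, c, k_max=50):
    """Best-response effort found by exhaustive search over integer efforts."""
    return max(range(k_max + 1), key=lambda k: utility(k, v, c))


# Assumed unit relaying cost and a range of input incentives.
incentives = [0.5, 1.0, 2.0, 4.0, 8.0]
efforts = [best_effort(v, c=0.1) for v in incentives]
for v, k in zip(incentives, efforts):
    print(f"v = {v:4.1f}  ->  k* = {k}")
```

Under these assumptions the chosen effort rises with the input incentive, while each doubling of the incentive adds fewer extra relays, which is the increasing, diminishing-returns pattern the argument describes.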

Now assume that for $h_i \ge h + 1$, $L_i(k_i^*, u_i^*, h_i)$ is an increasing concave function of $v_i$. We now prove that this conclusion also holds for $h_i = h$:

Note that the number of descendants of node $i$ is equal to the number of downstream nodes of $i$ plus the sum of the numbers of their descendants, and that the downstream nodes' input incentive is $i$'s output incentive:

\[ L_i(k_i, u_i, h_i) = E\left[ \sum_{j \text{ is a downstream node of } i} \left( 1 + L_j(k_j^*, u_j^*, h + 1 \mid v_j = u_i^*) \right) \right]. \]

Given that $L_j(k_j^*, u_j^*, h + 1)$ is an increasing concave function of $v_j$, $L_i(k_i, u_i, h)$ and hence $s_i(k_i, u_i, h)$ are increasing concave functions of $u_i$. Therefore, $U_i(k_i, u_i, h)$ is a concave function of $u_i$, and $u_i^*$ satisfies:

\[ (v_i - u_i^*) \frac{\partial}{\partial u_i} s_i(k_i^*, u_i^*, h) - s_i(k_i^*, u_i^*, h) = 0. \]

Based on the Implicit Function Theorem, we then have

\[ \frac{d}{d v_i} s_i(k_i^*, u_i^*, h) = - \frac{\frac{\partial}{\partial u_i} s_i(k_i^*, u_i^*, h)}{(v_i - u_i^*) \frac{\partial^2}{\partial u_i^2} s_i(k_i^*, u_i^*, h) - 1}. \]

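Given $\frac{\partial}{\partial u_i} s_i(k_i, u_i^*, h) > 0$ and $\frac{\partial^2}{\partial u_i^2} s_i(k_i, u_i^*, h) < 0$, $\frac{d}{d v_i} s_i(k_i, u_i^*, h)$ is positive and decreases with $v_i$. Since $L_i = s_i N / M$, $L_i(k_i, u_i, h)$ is an increasing concave function of $v_i$.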

Given that $L_i(k_i^*, u_i^*, h)$ increases with $v_i$, either $k_i^*$ or $u_i^*$ should increase with $v_i$. $\square$

Proof of Proposition 5. Imagine the nodes in hop $h$ forward the message sequentially. Denote by $\hat{k}_h^j$ the expected number of new nodes reached by the $j$th mover of hop $h$. For the $j$th mover in hop $h$, the probability of a relaying action reaching an ignorant is $(N - n_h - \sum_{i<j} \hat{k}_h^i)/N$, in which the numerator is the total number of ignorants in the system before that node takes the move. Therefore, the expected number of ignorants reached by the node with $k_h$ relaying actions (relaying the message to $k_h$ peers) is $k_h (N - n_h - \sum_{i<j} \hat{k}_h^i)/N$. This leads to the following results for $\hat{k}_h^j$, $j = 1, \ldots, m_h$, with $l_h \doteq k_h (N - n_h)/N$ and $p_h \doteq (N - k_h)/N$:

\[
\begin{aligned}
\hat{k}_h^1 &= k_h \frac{N - n_h}{N} = l_h, \\
\hat{k}_h^2 &= k_h \frac{N - n_h - \hat{k}_h^1}{N} = l_h \frac{N - k_h}{N} = l_h p_h, \\
\hat{k}_h^3 &= k_h \frac{N - n_h - \hat{k}_h^1 - \hat{k}_h^2}{N} = k_h N^{-1} \left( N - n_h - l_h - l_h p_h \right) = l_h p_h^2, \\
&\;\;\vdots \\
\hat{k}_h^j &= l_h p_h^{j-1}.
\end{aligned}
\]

Then

\[
\hat{k}_h = \frac{1}{m_h} \sum_{j=1}^{m_h} \hat{k}_h^j = \frac{l_h}{m_h} \sum_{j=1}^{m_h} p_h^{j-1} = \frac{l_h \left( 1 - p_h^{m_h} \right)}{(1 - p_h)\, m_h} = \frac{N - n_h}{m_h} \left[ 1 - \left( 1 - \frac{k_h}{N} \right)^{m_h} \right]. \qquad \square
\]
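The recursion and the closed form in the proof of Proposition 5 can be cross-checked numerically. The sketch below is only a sanity check under assumed values: the function names and the numbers chosen for $N$, $n_h$, $m_h$ and $k_h$ are hypothetical and not taken from the paper's experiments.

```python
import math


def sequential_new_nodes(k_h, n_h, m_h, N):
    """Expected new nodes reached by each of the m_h movers of hop h,
    computed directly from the sequential recursion in the proof."""
    reached = []
    for _ in range(m_h):
        # Ignorants remaining before this mover relays the message.
        ignorants = N - n_h - sum(reached)
        reached.append(k_h * ignorants / N)
    return reached


def closed_form_average(k_h, n_h, m_h, N):
    """Closed-form average number of new nodes per mover of hop h."""
    return (N - n_h) / m_h * (1.0 - (1.0 - k_h / N) ** m_h)


# Assumed illustrative values (not from the paper).
N, n_h, m_h, k_h = 1000, 50, 20, 3
per_mover = sequential_new_nodes(k_h, n_h, m_h, N)
l_h = k_h * (N - n_h) / N
p_h = (N - k_h) / N
average = sum(per_mover) / m_h
print(average, closed_form_average(k_h, n_h, m_h, N))
```

The per-mover values follow the geometric pattern $l_h p_h^{j-1}$, and their average agrees with the closed form up to floating-point error.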

References

Adamic, L., Lukose, R., Puniyani, A., and Huberman, B. Search in power-lawnetworks. Physical Review E, 64, 2001, 046135.

Adar, E., and Huberman, B. Free riding on Gnutella. First Monday, 5, 10, 2000.Arora, G., Hanneghan, M., and Merabti, M. P2p commercial digital content exchange.

Electronic Commerce Research and Applications, 4, 3, 2005, 250–263.Bertsekas, D. P. Dynamic Programming and Optimal Control. Athena Scientific, 2005.Bhattacharjee, S., Gopal, R., Lertwachara, K., and Marsden, J. R. Using p2p sharing

activity to improve business decision making: proof of concept for estimatingproduct life-cycle. Electronic Commerce Research and Applications, 4, 1, 2005, 14–20.

Buttyán, L., and Hubaux, J. -P. Stimulating cooperation in self-organizing mobile adhoc networks, Technical report no. DSC/2001/046, Swiss Federal Institute ofTechnology, 2001.

Carroll, R. T. Multi-level marketing (a.k.a. network marketing & referral marketing),February 23, 2009, <http://skepdic.com/mlm.html>.

Chandan, S., and Hogendorn, C. The bucket brigade: Pricing and networkexternalities in peer-to-peer communications networks. In Telecommuni-cations Policy Research Conference, Alexandria, VA, USA, October 27–29, 2001.

Chen, Y.-C., Shang, R.-A., and Lin, A.-K. The intention to download music files in ap2p environment: consumption value, fashion, and ethical decisionperspectives. Electronic Commerce Research and Applications, 7, 4, 2008, 411–422.

Christin, N., and Chuang, J. A cost-based analysis of overlay routing geometries. InProceedings of the IEEE Conference on Computer Communications, Miami, FL, USA,March 13–17, 2005, pp. 2566–2577.

Cohen, B. Incentives build robustness in BitTorrent. First Workshop on Economics ofPeer-to-Peer Systems, Berkeley, CA, USA, June 5–6, 2003.

Crespo, A., and Garcia-Molina, H. Routing indices for peer-to-peer systems. InProceedings of the 28th Conference on Distributed Computing Systems, Vienna,Austria, July 2–5, 2002, pp. 23–32.

Dasgupta, P., Hammond, P., and Maskin, E. The implementation of social choicerules: some general results on incentive compatibility. Review of EconomicStudies, 46, 2, 1979, 185–216.

Druff, D. V. What’s wrong with multi-level marketing? <www.vandruff.com/mlm.html>.

Feldman, M., Lai, K., Chuang, J., and Stoica, I. Quantify disincentives in peer-to-peernetworks. Workshop on Economics of Peer-to-peer Systems, Berkeley,California, June 2003.

Feldman, M., Lai, K., Stoica, I., and Chuang, J. Robust incentive techniques for peer-to-peer networks. In Proceedings of the Fivth ACM Conference on ElectronicCommerce, New York, NY, USA, May 17–20, 2004, pp. 102–111.

Feldman, M., Chuang, J., Stoica, I., and Shenker, S. Hidden-action in multi-hoprouting. In Proceedings of the Sixth ACM Conference on Electronic Commerce,Vancouver, BC, Canada, June 5–8, 2005, pp. 117–126.

Figueiredo, D., Shapiro, J., and Towsley, D. Incentives to promote availability in peer-to-peer anonymity systems. In Proceedings of the 13th IEEE InternationalConference on Network Protocols, Boston, MA, USA, November 6–9, 2005, pp.110–121.

Gigerenzer, G., and Selten, R. Bounded Rationality: The Adaptive Toolbox. The MITPress, 2002.

Glassman, S., Manasse, M., Abadi, M., Gauthier, P., and Sobalvarro, P. The Millicentprotocol for inexpensive electronic commerce. In Proceedings of the 4thInternational World Wide Web Conference, Boston, MA, USA, December 11–14,1995.

Gnutella. The Gnutella Protocol v0.4, 2001.Golle, P., Leyton-Brown, K., Mironov, I., and Lillibridge, M. Incentives for sharing in

peer-to-peer networks. In Proceedings of the Second International Workshop onElectronic Commerce, Heidelberg, Germany, November 16–17, 2001, pp. 75–87.

Habib, A., and Chuang, J. Service differentiated peer selection: an incentivemechanism for peer-to-peer media streaming. IEEE Transactions onMultimedia, 8, 3, 2006, 610–621.

Handa, J. Risk, probabilities, and a new theory of cardinal utility. Journal of PoliticalEconomy, 85, 1, 1977, 97–122.

Hua Chu, Y., Chuang, J., and Zhang, H. A case for taxation in peer-to-peer streamingbroadcast. ACM SIGCOMM Workshop on Practice and Theory of Incentives andGame Theory in Networked Systems, Portland, Oregon, USA, August 30–September 3, 2004, pp. 205–212.

Hughes, D., Coulson, G., and Walkerdine, J. Freeriding on Gnutella revisited: the belltolls? IEEE Distributed Systems Online, 6, 6, 2005, 1541–4922.

Hughes, J., Lang, K. R., and Vragov, R. An analytical framework for evaluating peer-to-peer business models. Electronic Commerce Research and Applications, 7, 1,2008, 105–118.

Milogicic, D. S., Kalogeraki, V., Lukose, R., et al. Peer-to-peer computing. Technicalreport 2002-57RI, Hewlett-Packard Company, 2002.

Lai, K., Feldman, M., Stoica, I., and Chuang, J. Incentives for cooperation in peer-to-peer networks. First Workshop on Economics of Peer-to-Peer Systems, Berkeley,CA, USA, June 5–6, 2003.

Lv, Q., Cao, P., Cohen, E., Li, K., and Shenker, S. Search and replication in unstructuredpeer-to-peer networks. In Proceedings of the 16th International Conference onSupercomputing, New York, USA, 2002.

Ma, R. T. B., Lee, S. C. M., Lui, J. C. S., and Yau, D. K. Y. An incentive mechanism for P2Pnetworks. In Proceedings of the 24th International Conference on DistributedComputing Systems, Tokyo, Japan, 2004.

Micali, S., and Rivest, R. Micropayments revisited. In Proceedings of theCryptographer’s Track at the RSA Conference on Topics in Cryptology, San Jose,CA, USA, February 18–22, 2002, pp. 149–163.

NeuroGrid, <http://www.neurogrid.net>.Ng, C., Parkes, D., and Seltzer, M. Strategy proof computing: Systems infrastructures

for self-interested parties. The First Workshop on Economics of Peer-to-PeerSystems, Berkeley, CA, USA, 2003.

Oram, A. Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O’reillyPublishing, 2001.

Papadimitriou, C. H. Algorithms, games, and the internet. In Proceedings of the 33rdAnnual ACM Symposium on Theory of Computing, Hersonissos, Greece, July 6–8,2001, pp. 749–753.

Rogers, A., David, E., and Jennings, N. Self-organized routing for wirelessmicrosensor networks. IEEE Transactions on Systems, Man, and Cybernetics –Part A: Systems and Humans, 35, 3, 2005.

Schneeweiss, H.. Probability and Utility—Dual Concepts in Decision Theory.Information, Inference and Decision. D. Reidel, Dordrecht, Holland, 1974. pp.113–144.

Shneidman, J., and Parkes, D. Rationality and self-interest in peer-to-peer networks.The Second International Workshop on Peer-to-Peer Systems, Berkeley, CA,USA, 2003a.

Shneidman, J., and Parkes, D. Rationality and self-interest in peer-to-peer networks.Second International Workshop on Peer-to-Peer Systems, Berkeley, CA, USA,February 20–21, 2003b.

Shneidman, J., Parkes, D., and Seltzer, M. Overcoming rational manipulation indistributed mechanism implementations. Technical report TR-12-03, HarvardUniversity, 2003.

Simon, H. A. Models of Bounded Rationality. The MIT Press, 1982.Sun, Q., and Garcia-Molina, H. Slic: a selfish link-based incentive mechanism for

unstructured peer-to-peer networks. In Proceedings of the 24th InternationalConference on Distributed Computing Systems, Tokyo, Japan, 2004.

Sutton, R. S., and Barto, A. G. Reinforcement Learning: An Introduction. MIT Press,1998.

Tirole, J. The Theory of Industrial Organization. MIT Press, 1988.Tsoumakos, D., and Roussopoulos, N. A comparison of peer-to-peer search methods.

International Workshop on the Web and Databases, San Diego, CA, USA, 2003.Watts, D. J., and Strogatz, S. H. Collective dynamics of ‘small-world’ networks.

Nature, 393, 1998, 440–442.Yang, B., and Garcia-Molina, H. Improving search in peer-to-peer systems. In

Proceedings of the 22nd International Conference on Distributed ComputingSystems, Vienna, Austria, 2002.

Yang, B., and Garcia-Molina, H. PPay: Micropayments for peer-to-peer systems. InProceedings of the 10th ACM Conference on Computer and CommunicationsSecurity, Washington DC, USA, October 27–31, 2003, pp. 300–310 ().

Yang, B., Condie, T., Kamvar, S., and Garcia-Molina, H. Non-cooperation incompetitive P2P networks. In Proceedings of the 25th IEEE InternationalConference on Distributed Computing Systems, Columbus, Ohio, USA, June 6–10,2005, pp. 91–100.

Yu, B., and Singh, M.P. Searching social networks. In Proceedings of SecondInternational Joint Conference on Autonomous Agents and Multiagent Systems,Melbourne, Australia, July 14–18, 2003, pp. 65–72.

Yu, B., Venkatraman, M., and Singh, M. P. An adaptive social network for informationaccess: theoretical and experimental results. Applied Artificial Intelligence, 17, 1,2003, 21–38.

Zhong, S., Chen, J., and Yang, Y. Sprite: a simple, cheat-proof, credit-based system formobile ad-hoc netowrks. In Proceedings of the IEEE Conference on ComputerCommunications, San Francisco, CA, USA, March 30–April 3, 2003, pp. 1987–1997.
