
Journal of Network and Computer Applications 34 (2011) 39–51


1084-8045/$ - see front matter & 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jnca.2010.09.007

* Corresponding author. Tel.: +57 3005645704.
E-mail address: [email protected] (M. Mejia).

journal homepage: www.elsevier.com/locate/jnca

A game theoretic trust model for on-line distributed evolution ofcooperation in MANETs

Marcela Mejía a,*, Néstor Peña a, Jose L. Muñoz b, Oscar Esparza b, Marco A. Alzate c

a Universidad de los Andes, Colombia
b Universitat Politècnica de Catalunya, Spain
c Universidad Distrital, Colombia

Article info

Article history:

Received 11 August 2009

Received in revised form 5 September 2010

Accepted 28 September 2010

Keywords:

MANET

Trust models

Game theory

Evolutionary algorithm


Abstract

Cooperation among nodes is fundamental for the operation of mobile ad hoc networks (MANETs). In such networks, there may be selfish nodes that use resources from other nodes to send their packets but do not offer their own resources to forward packets for other nodes. Thus, a cooperation enforcement mechanism is necessary. Trust models have been proposed as mechanisms to incentivize cooperation in MANETs, and some of them are based on game theory concepts. Among game theoretic trust models, those that make nodes' strategies evolve genetically have shown promising results for cooperation improvement. However, current approaches propose a highly centralized genetic evolution, which renders them unfeasible for practical purposes in MANETs. In this article, we propose a trust model based on a non-cooperative game that uses a bacterial-like algorithm to let the nodes quickly learn the appropriate cooperation behavior. Our model is completely distributed, achieves optimal cooperation values in a small fraction of the time required by centralized algorithms, and adapts effectively to environmental changes.

& 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Mobile Ad Hoc NETworks (MANETs) are infrastructureless networks formed by wireless mobile devices with limited resources. Source/destination pairs that are not within transmission range of each other must use intermediate nodes as relays (Perkins, 2001). Cooperation among nodes is fundamental for the operation of MANETs, since nodes that contribute their own limited resources, such as battery, memory and processing capacity, should be able to use the resources contributed by other nodes. However, in this environment there may be free-riders or selfish nodes, i.e., users that want to maximize their own welfare by using resources from the network to send their own packets without forwarding packets on behalf of others (Wrona and Mähönen, 2004). Thus, it is important to encourage nodes to participate in essential network functions such as packet routing and forwarding, because the higher the cooperation, the better the network performance. In this sense, several trust models have been proposed as mechanisms to incentivize node participation within the network (Mejia et al., 2009b). A trust model is a conceptual abstraction used to build mechanisms for assigning, updating and using trust levels among the entities of a distributed system. In other words, the trust model allows the establishment of trust relationships among different entities, with some degree of credibility, for a given action (Marti and Garcia-Molina, 2006).

Among the proposals in the literature, trust models based on game theory are interesting because they properly model the dilemma that a node faces: to cooperate and gain trust, or not to cooperate and save battery. To face this dilemma, several pure and mixed strategies have been studied in game theory, especially different forms of tit for tat (Osborne, 2004). However, in a MANET, game conditions can change over time, which is why we focus on trust models that use genetic algorithms to evolve the strategies. These models are promising in MANETs because they can dynamically adapt to the current network conditions.

In this article, we propose a trust model based on (1) a game, (2) a trust evaluation mechanism, (3) a way of coding the cooperation strategy, and (4) a genetic algorithm that allows a node to evolve its cooperation strategy. The proposed evolution algorithm¹ introduces a low overhead because it is completely distributed and easy to compute. Indeed, it only requires local exchanges among neighbors. Our algorithm works much like plasmid migration in bacterial colonies (Dale and Park, 2004; Marshall and Roadknight, 2000) combined with a parallel cellular

¹ A preliminary version of the algorithm was presented in a conference paper (Mejia et al., 2009a).


genetic algorithm (Alba and Dorronsoro, 2008; Alba and Troya, 1999; Cantupaz, 1998; Nowostawski and Poli, 1999). Through this procedure, individual nodes select the strategies that locally maximize their payoff in terms of both packet delivery and resource saving. The general idea is that, within the network, there are some nodes that are willing to cooperate if deemed worthy, called normal nodes. There are also some nodes that expect to use the resources of normal nodes without contributing to the network, called selfish nodes. The network environment is characterized by the fraction of selfish nodes within the total population of nodes in the network. For a given environment, we measure cooperation as the fraction of packets originated by normal nodes that effectively reach their destinations. We measure resource savings in terms of the detection and isolation of selfish nodes, since serving selfish nodes constitutes a waste of resources. Although these are global variables, the local payoff maximization is such that the whole network increases cooperation and, consequently, throughput, without wasting scarce energy resources on selfish nodes. Indeed, our trust model efficiently achieves both high adaptability to environment changes and quick convergence to almost optimal cooperation and energy saving values, as we show by simulation.

It is important to mention that we are concerned only with selfish nodes, not malicious nodes. Selfish nodes are free-riders that want to consume other nodes' resources without supporting the network with their own resources, but they are not interested in undermining network security or causing any damage.

The rest of the article is organized as follows. Section 2 briefly reviews models for cooperation and genetic algorithms to put our work in context. Section 3 describes our game theoretic trust model. Section 4 presents the simulation scenarios and their parameters. Section 5 shows the performance evaluation of our trust model and compares it with the previously published results of a centralized evolution model. Section 6 concludes the article.

2. Background

In this section we briefly review cooperation models and genetic algorithms, to put in context our algorithm, which uses a non-cooperative game trust model and a hybrid cellular/bacterial evolution mechanism.

2.1. Models for cooperation

Models for cooperation enforcement in MANETs can be broadly divided into two categories according to the techniques they use to enforce cooperation (Marias et al., 2006): credit-based models and trust models. The former category is based on economic incentives, whereas the latter is based on building reputation to enforce cooperation.

In credit-based models, network tasks are treated as services that can be valued and charged. These models incorporate a form of virtual currency to regulate the dealings among nodes. The most widely cited proposal of this type was introduced by Buttyan and Hubaux (2000). These authors introduce a currency called nuglets. The exchange of nuglets relies on a tamper-resistant security subsystem being present in every node. Another credit-based proposal, called Sprite (Yale and Zhong, 2002), makes use of a public key infrastructure to deal with the problem of selfishness. Nodes upload receipts to a Credit Clearance Service (CCS), a central authority which is available when the nodes are connected to the Internet. Express (Janzadeh et al., 2009) is a work based on Sprite. Express tries to minimize the cost of digital signatures by using hash chains. Express also uses an external trust entity called a Reliable Clearance Center (RCC). In general, the main drawback of credit-based models is that they require the existence of either tamper-resistant hardware or a virtual bank, heavily restricting their usability in MANETs. A hybrid scheme called OCEAN (Bansal and Baker, 2003) uses both reputation, to detect and punish selfish behavior, and a micro-payment component, to encourage cooperation. Credit is earned for each immediate neighbor and cannot be used to send packets on a different route. A reputation scheme optimized for video streaming, inspired by OCEAN, has been presented in Munoz et al. (2010). Another model, called ad hoc-VCG (Anderegg and Eidenbenz, 2003), is also a credit-based model, which introduces a second-price sealed-bid type of auction. In this respect, a pricing question arises concerning the amount of payment a node should ask for forwarding packets. Intermediate nodes declare their respective prices honestly. Honest behavior is assured by the VCG mechanism, since ad hoc-VCG is robust when only one cheating node exists. However, it might fail in the presence of collusions of nodes trying to maximize their payments. An additional issue is the excessive overhead, because ad hoc-VCG requires complete knowledge of the network topology during the route discovery phase.

On the other hand, models in which trust is the basis for cooperation are envisioned as the most promising solutions for MANETs because these models do not have the restrictions of credit-based models. Trust models can frustrate the intentions of selfish nodes by coping with observable misbehaviors. If a node does not behave cooperatively, the affected nodes may, reciprocally, deny cooperation. Generally speaking, in a trust model, an entity called the Subject S commends the execution of an action a to another entity called the Agent A, in which case we say that T{S:A, a} is the trust level that S has in A with respect to the execution of action a (Sun et al., 2006). This trust level varies as the entities interact with each other; i.e., if the Agent A responds satisfactorily to the Subject S, S can increase the trust level T{S:A, a}. On the other hand, if the Subject S is disappointed by the Agent A, the corresponding trust level can be decreased by some amount. In this sense, a trust model helps the subject of a distributed system to select the most reliable agent among several agents offering a service (Marti and Garcia-Molina, 2006). To make this selection, the trust model should provide the mechanisms needed for each entity to measure, assign, update and use trust values. Several trust models have been proposed in the literature for improving the performance of MANETs. In Mejia et al. (2009b), there is a classification of trust-based systems based on the theoretical mechanisms used for trust scoring and trust ranking. Following this classification, we can further divide the trust-based proposals into approaches based on social networks, information theory, graph theory and game theory.

In proposals based on social networks, nodes build their view of trust or reputation not only by taking into account their own observations but also by considering the recommendations from others. One of the first examples of a trust system based on social networks is CONFIDANT (Buchegger and Le Boudec, 2002). CONFIDANT works as an extension of a reactive source-routing protocol for MANETs. Nodes monitor the next node on the route, either by listening to the transmission of the next node or by observing routing protocol behavior. Any misbehaving action generates ALARMs. Each node maintains the ALARMs received from friend nodes and the ALARMs produced by the node itself. In the improved CONFIDANT (Buchegger and Boudec, 2004), the authors provided a modified Bayesian approach for reputation representation, updates, and view integration. When updating the reputation according to recommendations, only information that is compatible with the current reputation rating is accepted. This approach is objective and robust, but it still leaves an opportunity for elaborate attackers to launch false accusation attacks (Li and Wu, 2010). In Michiardi and Molva (2002), the authors


propose CORE. CORE relies on observations and recommendations, which are combined by a specialized function. The CORE scheme is immune to some attacks because no negative ratings are spread, and, thus, it is impossible for a node to maliciously decrease another node's reputation. However, two or more nodes may collude (i.e., send positive rating messages) in order to increase their reputation. To prevent such phenomena, CORE implicitly provides some protection, since subjective reputation has more impact (i.e., weight) than indirect reputation. Finally, SORI (Bansal and Baker, 2003) is another trust model that also uses reputation spreading. In addition to the previous ones, there are some trust models that are based on social networks and that also use cluster-heads. The cluster-head is a node that is elected to play a special role regarding the management of recommendations. An example can be found in Safa et al. (2010). In this proposal, the protocol organizes the network into one-hop disjoint clusters and then elects the most qualified and trustworthy nodes to play the role of cluster-heads. The proposed mechanism continuously ensures the trustworthiness of cluster-heads by replacing them as soon as they become malicious. As a concluding remark for trust models based on social networks, we would like to note that the calculation and measurement of trust in unsupervised ad hoc networks involves a very complex aspect: rating the honesty of the recommendations provided by other nodes. Although there are efforts, such as Luo et al. (2009), Li and Wu (2010) and Zouridaki et al. (2009), that try to alleviate this problem, it is still a hard problem for systems that use recommendations. Furthermore, social trust models that also use clusters add another problem: they require a dealer, which must be involved in the working of other nodes, and this is hard to achieve in practical ad hoc networks.

Regarding the proposals based on information theory, one of these is Sun et al. (2006), in which the authors propose a trust model to obtain a quantitative measurement of trust and its propagation through the MANET. However, the proposal is theoretical and does not include an implementation specification. Sherwood et al. (2006) describe a trust inference algorithm in terms of a directed and weighted trust graph, T, whose vertices correspond to the users in the system and in which an edge from vertex i to vertex j represents the trust that node i has in node j. However, covering the whole graph is still a computationally complex problem.

Finally, there are several proposals that use game theory. These proposals can be further divided into cooperative and non-cooperative games (Osborne, 2004). In cooperative games, users form coalitions so that a group of players can adopt a certain strategy to obtain a higher gain than the one they may obtain by making decisions individually. In cooperative games, the nodes need to communicate with each other and discuss the strategies before they play the game. Baras and Jiang (2005) and Saad et al. (2009) are representative proposals of cooperative games for MANETs. Nevertheless, this type of game has the disadvantage of generating network overhead due to the complex processes of coalition management. On the other hand, non-cooperative games are especially suitable in scenarios in which players might have conflicting interests, and each of them wants to maximize her own profit by taking individual decisions (Felegyhazi et al., 2006). Our proposal and many others that can be found in the literature (Seredynski and Bouvry, 2009; Seredynski et al., 2007; Milan et al., 2006; Komathy and Narayanasamy, 2008; Wang et al., 2010; Ji et al., 2010; Jaramillo and Srikant, 2010) are based on a variation of the classical non-cooperative game of the iterated prisoner's dilemma, in which nodes can use different strategies. A further discussion of these proposals and a comparison with ours is provided in Section 5.3. It is worth mentioning that, in general, all the mentioned models improve cooperation results, but they still have some difficulties to be applied in MANETs.

In this article, we focus on trust models based on game theory, since they capture very well each node's dilemma about deciding whether to cooperate and obtain the trust of peer nodes, or not to cooperate and save scarce energy resources. In particular, we focus on non-cooperative games, since nodes rely only on private histories and thus the costly coalition overhead and possible conflicting interests can be avoided. More specifically, trust models that use genetic algorithms for the evolution of strategies show promising results in MANETs because they can dynamically adapt the behavior of nodes to the current conditions of the network. In particular, we take as reference the centralized model presented in Seredynski et al. (2007). This model is interesting because its evolution algorithm achieves promising results regarding cooperation and energy saving. However, we are still concerned about the highly centralized nature and the slow convergence of its evolution algorithm. Optimal strategies are obtained by using a centralized entity that runs a conventional genetic algorithm in an off-line way after a large number of interactions between nodes. For these reasons, in this article we propose a non-cooperative game theoretic trust model in which strategies evolve on-line according to a distributed genetic algorithm, without requiring too much data exchange among nodes. Our model exploits the distributed nature of a MANET by using local genetic information exchange. Our algorithm works much like plasmid migration in bacterial colonies (Dale and Park, 2004; Marshall and Roadknight, 2000) combined with a parallel cellular genetic algorithm (Alba and Dorronsoro, 2008; Alba and Troya, 1999; Cantupaz, 1998; Nowostawski and Poli, 1999). Through this procedure, individual nodes select the strategies that locally maximize their payoff in terms of both packet delivery and resource saving. A brief summary of genetic algorithms is given in the rest of this section to better understand the proposal.

2.2. Genetic algorithms

In many complex optimization problems, an exhaustive search of the solution space is unfeasible. Genetic algorithms (GAs) are heuristic approaches based on the genetic hereditary processes of biological organisms. A canonical GA works with a population of individuals, where each individual represents a possible solution to a given problem. A fitness score is assigned to each individual according to how well it fits as a solution to the problem. The higher the fitness of an individual, the higher its chance of being selected for reproduction. This reproduction is done by crossing over two individuals of the population, generating new individuals as offspring that contain the best characteristics of the previous generation. As generations come and go, these best characteristics are spread throughout the entire population. By favoring mating between the fittest individuals, it is possible to explore the most promising regions of the solution space (Holland, 1975; Whitley, 1993).

Generally, a GA performs four steps to obtain a new population: initialization, evaluation, selection and reproduction. The process is recursively repeated from the second step for a given number of generations or until the solution converges (Holland, 1975).

I. Initialization: In a GA, a genetic code, or chromosome, represents a solution to the problem. The first population of chromosomes is typically randomly generated.

II. Evaluation: The mechanism used to measure an individual's fitness as a solution to the given problem is called the fitness function.

III. Selection: The selection algorithm picks individuals out of the current population to participate in the next generation. The most common approach is roulette-wheel selection, where a selection probability is assigned to each individual chromosome, proportionally to its fitness.

IV. Reproduction: This is the process by which two parent chromosomes are recombined, normally through the genetic operators of crossover and mutation. Crossover is a sexual reproduction between two parents, where some portions of the parents' chromosomes are swapped to form a child. Then, the mutation operator alters each child gene with a small probability, in order to ensure that no point in the search space has a zero probability of being examined.
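The four steps above can be sketched as a minimal canonical GA loop. This is a generic illustration in Python, not the algorithm proposed in this article; the bit-string encoding, population size and mutation rate are arbitrary choices:

```python
import random

def canonical_ga(fitness, n_bits=16, pop_size=20, generations=50, p_mut=0.01):
    # I. Initialization: a random population of bit-string chromosomes.
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # II. Evaluation: score every individual with the fitness function.
        scores = [fitness(ind) for ind in pop]
        total = sum(scores) or 1  # guard against an all-zero population

        def pick():
            # III. Selection: roulette wheel, probability proportional to fitness.
            r, acc = random.uniform(0, total), 0.0
            for ind, s in zip(pop, scores):
                acc += s
                if acc >= r:
                    return ind
            return pop[-1]

        # IV. Reproduction: one-point crossover followed by per-gene mutation.
        new_pop = []
        while len(new_pop) < pop_size:
            a, b = pick(), pick()
            cut = random.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            new_pop.append([g ^ 1 if random.random() < p_mut else g for g in child])
        pop = new_pop
    return max(pop, key=fitness)

# Toy usage: maximize the number of ones in a 16-bit chromosome.
best = canonical_ga(sum)
```

With the toy fitness above (count of ones), the loop quickly drives the population toward the all-ones chromosome.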

2.2.1. Parallel genetic algorithms

Parallel genetic algorithms (PGAs) are not only an extension of the canonical GA, but also a different, efficient way to search the space of solutions of a given problem (Nowostawski and Poli, 1999). In a PGA, the population is divided into subpopulations and an independent GA is performed on each of these subpopulations. The local selection and reproduction rules allow the species to evolve locally, and diversity is enhanced by migration, i.e., by transferring genetic information among subpopulations. The processes of genetic evolution and migration are repeated until the solution converges.

The many parallel genetic algorithm models that can be found in the literature can be classified as island or cellular depending on their degree of parallelism (Alba and Troya, 1999). In an island PGA (iPGA), the whole population is divided into a few separate subpopulations, or islands, with many individuals in each, and an independent genetic algorithm is performed in each island. After a certain number of generations, there is a migration process by which the fittest individuals migrate among islands in order to obtain genetic diversity (Nowostawski and Poli, 1999). In a cellular PGA (cPGA), the population is divided into a large number of subpopulations with a few individuals in each, and the genetic information exchange among subpopulations is performed by overlapping subpopulations (Alba and Troya, 1999). The population in a cPGA has a spatial structure that limits the interactions among individuals to just some small neighborhoods. However, through neighborhood overlapping, locally optimal solutions can spread across the entire population (Nowostawski and Poli, 1999).

2.2.2. Plasmid migration and bacterial genetic algorithms

There are several genetic algorithms based on observed bacterial behavior. Different authors present them as microbial genetic algorithms (Harvey, 1996), bacterial algorithms (Cabrita et al., 2003), pseudobacterial genetic algorithms (Nawa et al., 1999), lateral gene transfer (Ochman et al., 2000) and plasmid migration (Marshall and Roadknight, 2000), among others. In general, these algorithms avoid sexual reproduction, so they are not classified as parallel genetic algorithms, although they are extremely distributed. Here we describe the two algorithms on which we based part of our evolution process.

Plasmid migration (PM) is based on the behavior of some bacteria (Marshall and Roadknight, 2000). Plasmids are self-replicating extrachromosomal DNA molecules that are not essential for the survival of the bacterium but encode a wide variety of genetic strains that permit better survival in adverse environments. Plasmids can be transferred among bacteria within the same generation: healthy individuals deposit plasmids in the medium, so that less healthy individuals can take these plasmids from the medium. This ability gives bacteria great adaptability to sudden environmental changes (Dale and Park, 2004). This analogy can be used in genetic algorithms to spread high-quality strands from fit individuals to the rest of the population through the gene transfer operation.
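As an illustration only (not the algorithm proposed later in this article), plasmid migration can be sketched as a genetic operator over bit-string individuals; the median threshold, segment length and medium size below are our own choices:

```python
import random

def plasmid_migration(pop, fitness, seg_len=4, medium_size=8):
    """One round of plasmid migration over a population of bit lists."""
    scores = sorted(fitness(ind) for ind in pop)
    threshold = scores[len(scores) // 2]  # median fitness as health cutoff
    medium = []
    for ind in pop:  # healthy individuals deposit plasmids in the medium
        if fitness(ind) >= threshold and len(medium) < medium_size:
            start = random.randrange(len(ind) - seg_len + 1)
            medium.append((start, ind[start:start + seg_len]))
    for ind in pop:  # less healthy individuals take plasmids from the medium
        if medium and fitness(ind) < threshold:
            start, seg = random.choice(medium)
            ind[start:start + seg_len] = seg
    return pop
```

Note that the operator is intragenerational: no offspring are produced, only segments of genetic material move from fit to unfit individuals.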

On the other hand, in a bacterial algorithm, a chromosome is divided into p parts, and each individual produces m−1 clones of itself. A randomly chosen ith part of the m−1 clones is mutated and the best fitted part is replicated in the m individuals. After this mutation–evaluation–selection–replacement process is repeated for all the p parts, the fittest individual goes to the next population and the other m−1 individuals die (Nawa et al., 1999; Cabrita et al., 2003).
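The mutation–evaluation–selection–replacement cycle just described can be sketched as follows. This is a generic illustration under our own parameter choices (p, m and the per-gene flip probability), not the exact algorithm of the cited works:

```python
import random

def bacterial_mutation(chrom, fitness, p=4, m=4):
    """One bacterial pass: mutate each part in m-1 clones, keep the best part."""
    size = len(chrom) // p
    pop = [list(chrom) for _ in range(m)]  # the original plus m-1 clones
    parts = list(range(p))
    random.shuffle(parts)                  # visit the p parts in random order
    for part in parts:
        lo, hi = part * size, (part + 1) * size
        for clone in pop[1:]:              # mutate the current part in the clones
            for i in range(lo, hi):
                if random.random() < 0.5:
                    clone[i] ^= 1
        best_part = max(pop, key=fitness)[lo:hi]
        for ind in pop:                    # replicate the best part in all m
            ind[lo:hi] = best_part
    return max(pop, key=fitness)           # fittest survives, the rest die
```

Because the unmutated original always remains a candidate, the returned individual is never worse than the input chromosome.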

Both plasmid migration and bacterial algorithms are greedy algorithms that, at each step, make apparently good decisions without regard for future consequences and, as such, can lead only to locally optimal solutions. In exchange, these solutions can be obtained very quickly, enhancing adaptability at the cost of optimality (Weiss, 1998). However, in many cases, in a well designed plasmid migration algorithm, the mobility of the individuals allows good plasmids to spread all over the population, so that better solutions can be obtained through a more exhaustive search of the solution space. In this article, we propose an enhanced cPGA algorithm that includes some greedy bacterial heuristics to achieve fast convergence, optimality and adaptability.

3. A model for on-line distributed evolution of cooperation

In this section we describe our proposal, in which both the trust evaluation mechanism and the strategy evolution algorithm are designed to enhance convergence speed and adaptability. Furthermore, both trust evaluation and strategy evolution are carried out in a distributed way among the nodes of the network. In our trust model, the interactions among nodes are based on the iterated prisoner's dilemma under the random pairing game (Ishibuchi and Namikawa, 2005). Each intermediate node uses a strategy that defines whether it should retransmit or discard a packet that comes from a certain source node. The strategy depends on two aspects: the past behavior of the network when the intermediate node acted as a source, and the trust level that the intermediate node has in the source node. The model is comprised of: (a) a trust evaluation mechanism; (b) a game-based network model; (c) a strategy; and (d) a local genetic algorithm based on plasmid migration to evolve the strategy in a highly distributed way.

We take as reference the centralized model of Seredynski et al. (2007). Essentially, the similarities between the centralized model and ours are in the game model and in the strategy encoding. These similarities are kept to make the performance comparison easier (see the evaluation results in Section 5). However, both the trust evaluation mechanism (for better adaptability) and the genetic evolution (for easy distributed computation) are totally different. On the other hand, the most closely related work is the BNS proposal (Komathy and Narayanasamy, 2008), which also uses a non-cooperative game model with distributed evolution. In BNS, a node decides the strategy to follow by changing to another player's strategy if that player seems to be doing better. Despite the distributed nature of this evolution strategy, the payoff structure strictly encourages cooperating strategies without taking into account the energy savings of discarding strategies. A more detailed comparison with BNS and with other proposals that also use non-cooperative game models is provided in Section 5.3.

3.1. Trust evaluation mechanism

Each node maintains a trust table based on the observed behavior of its neighbors. For example, if node B is observing node A, which is within its transmission range, it can know the number of packets, n, that have been sent to A to be forwarded, and the number of packets, nA, that A has actually forwarded. So B can compute the forwarding rate of A, as shown in

    fr(B,A;n) = nA / n    (1)

Table 2. Tables of payoff.

  Source node payoffs
    Transmission status    Payoff
    Successful             5
    Failed                 0

  Intermediate node payoffs
                 Trust level of the source node
                 T=3    T=2    T=1    T=0
    Cooperate    3      2      1      0.5
    Discard      0.5    1      2      3

Table 3. Strategy coding, example strategy 0001 0011 0101 0111.

  Source trust level        0 0 0 0   1 1 1 1   2 2 2 2   3 3 3 3
  Transmission status -2    F F S S   F F S S   F F S S   F F S S
  Transmission status -1    F S F S   F S F S   F S F S   F S F S
  Current decision          D D D C   D D C C   D C D C   D C C C

M. Mejia et al. / Journal of Network and Computer Applications 34 (2011) 39-51

We can use a simple cumulative average to compute the forwarding rate. For instance, B can update the forwarding rate of A by observing whether the nth packet has been forwarded by A or not. This is done by applying

    fr(B,A;n) = (1/n) Σ_{i=1}^{n} d_i = [(n-1) fr(B,A;n-1) + d_n] / n    (2)

where d_i ∈ {0,1} is the ith observed decision.

Since the strategies are modified continuously by the nodes to adapt to environmental changes, it would be unfair to keep a long record of observed decisions if they were taken under a previous strategy, different from the current one. Accordingly, we decided to take into account only the most recent m observed decisions, so we compute the forwarding rate as the moving average of the decision sequence, i.e., the fraction of retransmitted packets among the previous m packets received for forwarding.

The value of the memory depth, m, obeys a trade-off: we would like m to be large enough to obtain a fair evaluation of the forwarding rate, but we would also like m to be small enough to ensure that the forwarding rate actually corresponds to the current strategies (the tuning of m is discussed in Section 4). Computing the forwarding rate with a finite memory of depth m requires both the previous forwarding rate and the last m decisions as state variables, as shown in

    fr(B,A;n) = (1/n) Σ_{i=1}^{n} d_i = [(n-1) fr(B,A;n-1) + d_n] / n,                    n ≤ m
    fr(B,A;n) = (1/m) Σ_{i=0}^{m-1} d_{n-i} = [m fr(B,A;n-1) + d_n - d_{n-m}] / m,        n > m
    (3)

With the current rate fr(B,A;n), B can determine the trust level it should have in A, T{B:A;n}, as shown in Table 1. For comparison purposes, we keep the same ranges on the fraction of forwarded packets and the same corresponding trust values as in the centralized model.

Finally, it is worth mentioning that using a moving average that only takes into account the previous m observed decisions may seem a subtle detail, but we introduce it because it is critical for the on-line adaptability of the strategies, as we will show in Section 5.
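The finite-memory estimator of Eq. (3) together with the trust mapping of Table 1 can be sketched as follows. This is an illustrative Python sketch (the paper's simulator is written in Java); the class and method names are ours, not the authors'. With the paper's choice m = 3, the window yields exactly the four rates 0, 1/3, 2/3 and 1.

```python
from collections import deque

class ForwardingObserver:
    """Node B's view of a neighbor A: moving average over the last m decisions."""

    def __init__(self, m=3):
        self.window = deque(maxlen=m)  # last m decisions d_i in {0, 1}

    def record(self, forwarded):
        """Record whether A forwarded (True) or discarded (False) a packet."""
        self.window.append(1 if forwarded else 0)

    def forwarding_rate(self):
        # Eq. (3): fraction of forwarded packets among the last min(n, m) observed
        if not self.window:
            return None  # no observations yet
        return sum(self.window) / len(self.window)

    def trust_level(self):
        """Map the rate to a trust level using the ranges of Table 1."""
        fr = self.forwarding_rate()
        if fr is None:
            return 1          # unknown nodes start at trust level 1 (Section 3.4)
        if fr >= 0.9:
            return 3
        if fr >= 0.6:
            return 2
        if fr >= 0.3:
            return 1
        return 0
```

For example, after observing three forwarded packets the rate is 1 (trust 3); each subsequent discard drops the window average to 2/3, then 1/3, then 0, walking the trust level down through 2, 1 and 0.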

3.2. Game-based network model

Each intermediate node that receives a packet should decide whether to forward it or to discard it, according to its strategy. Each game starts with the transmission of a new packet from a source node and ends either when the packet is delivered to its destination, or when an intermediate node decides to discard the packet. Once the game has finished, each participant receives a payoff according to the decision it took and its trust level in the source node. Since there are two types of nodes, source nodes and intermediate nodes, two types of payoff tables are maintained, as shown in Table 2 (Seredynski et al., 2007). These payoffs have been directly taken from the centralized model for comparison purposes and also because they have two good properties: a successful transmission is the most rewarding event, and there is symmetry between the discarding and forwarding payoffs with respect to the trust value, indicating that saving energy is as important for each node as obtaining the trust of its neighbors.

Table 1. Relation between delivery rate and trust level.

  fr(B,A;n)    T{B:A;n}
  0.9-1        3
  0.6-0.9      2
  0.3-0.6      1
  0-0.3        0

The forwarding or discarding decision of an intermediate node is observed by all the nodes preceding it in the path. In addition, a node that receives a packet can consider that all its preceding nodes have cooperated.
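The payoff structure of Table 2 can be captured as a small lookup. This is a sketch in Python for illustration; the `payoff` helper and its keyword arguments are our names. Note the symmetry mentioned above: the cooperate payoff at trust T equals the discard payoff at trust 3 - T.

```python
# Table 2 payoffs (Seredynski et al., 2007) as simple lookups.
SOURCE_PAYOFF = {"success": 5, "failed": 0}

# Intermediate-node payoff indexed by (decision, trust level in the source).
INTERMEDIATE_PAYOFF = {
    ("cooperate", 3): 3, ("cooperate", 2): 2, ("cooperate", 1): 1, ("cooperate", 0): 0.5,
    ("discard", 3): 0.5, ("discard", 2): 1, ("discard", 1): 2, ("discard", 0): 3,
}

def payoff(role, **kw):
    """Payoff a node receives when a game ends (illustrative helper)."""
    if role == "source":
        return SOURCE_PAYOFF[kw["status"]]
    return INTERMEDIATE_PAYOFF[(kw["decision"], kw["trust"])]
```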

3.3. The strategy

We use a strategy coding similar to the one used in the centralized model. The strategy that a node follows when it is acting as an intermediate node is encoded by a string of bits, in which each bit represents the decision of discarding (D) (bit = 0) or cooperating (C) (bit = 1). The strategy depends on the following parameters:

- The trust level that the node has in the source node.
- The transmission status of the two previous games that the node has played as source, which can be success (S) or failure (F).

The resulting strategy has 16 bits, as shown in the example strategy of Table 3. For instance, according to the strategy of Table 3, an intermediate node will forward a packet if it has a trust level of 1 in the packet's source and if its two previous packet transmissions as source were successful (8th bit of the strategy, from left to right).
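The table lookup implied by this encoding can be sketched as follows (Python for illustration; `decide` is our name). With F = 0 and S = 1, the left-to-right bit position in Table 3 is 4 * trust + 2 * status_2 + status_1, so the example above (trust 1, statuses S, S) indexes bit 7 counting from zero, i.e., the 8th bit.

```python
# Example strategy 0001 0011 0101 0111 of Table 3, as a 16-character bit string.
EXAMPLE_STRATEGY = "0001001101010111"

def decide(strategy, trust, status_2, status_1):
    """Return 'C' (forward) or 'D' (discard) for a game context.

    trust: trust level in the source (0-3);
    status_2, status_1: outcome of the node's two previous games as source,
    two games ago and one game ago (1 = success, 0 = failure).
    """
    k = 4 * trust + 2 * status_2 + status_1  # 0-based position, left to right
    return "C" if strategy[k] == "1" else "D"
```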

The strategy is evolved by means of a genetic algorithm. One possibility is to randomly choose an initial strategy and then start evolving it. However, the convergence speed to optimal cooperation can be improved if some bits of the initial strategy are set to particular values. In addition, nodes cooperate if they have not yet played two times as source (we discuss these issues in Section 3.4).

3.4. Distributed bacterial-like evolution algorithm

A genetic algorithm is used to maximize the fitness or mean payoff of each node. The design of the evolution algorithm took us several iteration steps, in which we probed different design options to balance the different issues of the learning process. As a result, next we present our evolution algorithm and discuss its design. Furthermore, in Section 5, it is shown that our proposal clearly outperforms the centralized algorithm in terms of cooperation, convergence speed, energy saving and adaptability.

In our distributed bacterial-like algorithm, we evolve the strategies on-line during the life of the network. A successive sequence of games takes place during this network life. A game is a successful or failed packet transmission. In a game, a source node selects the most trusted h-hop route among r possible routes and sends its packet through it. The game is successful if the transmitted packet is received at the destination. When a node has played R times as the packet source, we say that it has completed a Plasmid Migration Period (PMP). At this moment, an evolution step must take place, i.e., the node exchanges genetic information with its neighbors and evolves its strategy.

We combine the plasmid migration concept with a classical cellular genetic algorithm by allowing each node to receive the genetic information from all its one-hop neighbors in order to start a reproductive mechanism to construct a new strategy. For reproduction we use the classical one-point crossover and mutation processes. The bacterial plasmid migration is introduced by allowing each node to keep its best previous strategy so that, if during the current plasmid migration period the new strategy did not increase the fitness, the old strategy can be restored. More specifically, our model works as shown in Algorithm 1.

Algorithm 1. Bacterial-like evolution pseudo-code.

// Node i goes on from the 0th PMP
Inputs: At each PMP, the set of i's current neighbors, K.
Outputs: The sequence of strategies si(j) and fitnesses fi(j) at the jth PMP, for j = 0, 1, 2, ...

 1  begin
 2      si(0) ← GenerateStrategy();               // Generate the initial strategy
 3      j ← 0;                                    // Time index, expressed in PMPs
 4      while true do
 5          fi(j) ← GetFitness(si(j));            // Evaluate this strategy (play some games)
 6          if (j > 0) && (fi(j) < fi(j-1)) then  // The previous strategy was better
 7              si(j) ← si(j-1);                  // Restore the previous strategy
 8              fi(j) ← fi(j-1);                  // and the previous fitness
 9          end
10          K ← GetNeighborhood(i);
11          p1 ← RouletteWheel(K ∪ {i});          // Get first parent
12          p2 ← RouletteWheel(K ∪ {i});          // Get second parent
13          while p1 == p2 do
14              p2 ← RouletteWheel(K ∪ {i});      // We need two different parents
15          end
16          if fi(j) ≥ Mean(fp1(j), fp2(j)) then
17              si(j+1) ← si(j);                  // Changing the strategy is not worthwhile
18          else
19              si(j+1) ← Crossover(sp1(j), sp2(j));
20              si(j+1) ← Mutation(si(j+1));
21          end
22          j++;
23      end
24  end

At time 0, node i starts with an initial random strategy, si(0), whose fitness, fi(0), is evaluated during the first PMP, i.e., during the transmission of its own first R packets (5th line of Algorithm 1). Like plasmid genes, node i keeps a record of the best proven strategy so far, i.e., if at the jth PMP the current strategy si(j) is worse than the previous one, si(j-1), node i restores its previous strategy and fitness (7th and 8th lines of Algorithm 1). Node i exchanges its strategy si(j) and its corresponding fitness fi(j) with its one-hop neighbors, K, and selects a pair of potential parents, p1 and p2, one of which can be node i itself (lines 10th to 15th of Algorithm 1). This selection is performed through a roulette wheel process. In a roulette wheel, a selection probability is assigned to each individual strategy as p_k = f_k / Σ_{n∈K} f_n, k ∈ K, and each parent is randomly selected according to this distribution. If the mean fitness of the selected potential parents is worse than the fitness of the current strategy (16th and 17th lines of Algorithm 1), node i keeps its current strategy for the next PMP. Otherwise, the selected parent strategies are combined to construct a child strategy si(j+1). In this last case, the function performs a one-point crossover where half of the genes are taken from one parent and the other half are taken from the other parent. Finally, the child strategy goes through a mutation process, where each bit is flipped with a given probability. The resulting strategy is the one that will be used in the next PMP. This process is repeated during the lifetime of the network.
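The fitness-proportional (roulette wheel) selection of lines 11-15 of Algorithm 1 can be sketched as follows; this is an illustrative Python version with our own function names, assuming strictly positive fitnesses and at least two candidates.

```python
import random

def roulette_wheel(fitness, rng=random):
    """fitness: dict mapping node id -> fitness (> 0). Returns one node id,
    drawn with probability p_k = f_k / sum of all fitnesses."""
    total = sum(fitness.values())
    r = rng.uniform(0, total)
    acc = 0.0
    for node, f in fitness.items():
        acc += f
        if r <= acc:
            return node
    return node  # guard against floating-point round-off

def pick_parents(fitness, rng=random):
    """Two distinct parents, as required by the inner while loop of Algorithm 1.
    Assumes fitness has at least two entries."""
    p1 = roulette_wheel(fitness, rng)
    p2 = roulette_wheel(fitness, rng)
    while p1 == p2:
        p2 = roulette_wheel(fitness, rng)
    return p1, p2
```

For instance, with fitnesses {a: 1, b: 3}, node b should be drawn about three quarters of the time.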

Regarding the initial random strategy, corresponding to the function GenerateStrategy() in the 2nd line of Algorithm 1, we make some a priori decisions about some bits in order to speed up the convergence of the algorithm, as shown in Algorithm 2. In particular, we fix six bits of that initial strategy: an intermediate node cooperates when the source node has the highest trust value and the network has delivered at least one of its previous two packets (i.e., according to Table 3, bits 14, 15 and 16 are set to Cooperate; 7th line of Algorithm 2). Similarly, an intermediate node discards a packet when the source node has the lowest trust value and the network has failed to deliver at least one of its two previous packets (i.e., bits 1, 2 and 3 are set to Discard; 8th line of Algorithm 2). We have validated these assumptions by simulation, since this pattern was found in the final strategies after convergence under very different simulation scenarios. Finally, the ten remaining bits of the initial strategy are randomly set to 0 or 1, each with probability 1/2 (lines 3-6 of Algorithm 2).

Algorithm 2. Pseudo-code for randomly generating a new strategy.

 1  GenerateStrategy {
 2      s ← null;                            // strategy to be returned
 3      for (bit = 4; bit < 14; bit++) do
 4          r ← Random();                    // 0 or 1, each with probability 1/2
 5          SetStrategyBit(s, bit, r);       // Set the given bit of s randomly
 6      end
 7      s ← OR(s, 0000000000000111);         // conditions for initial collaboration
 8      s ← AND(s, 0001111111111111);        // conditions for initial discarding
 9      return s;
10  }
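On a 16-bit integer, the two masks of lines 7 and 8 of Algorithm 2 do all the work: the OR forces the three rightmost bits (bits 14-16 of Table 3) to Cooperate, and the AND forces the three leftmost bits (bits 1-3) to Discard, while the other ten bits stay random. A sketch, assuming Python integers as bit strings (our names, not the paper's code):

```python
import random

COOPERATE_MASK = 0b0000000000000111  # bits 14-16 of Table 3 forced to Cooperate
DISCARD_MASK   = 0b0001111111111111  # bits 1-3 of Table 3 forced to Discard

def generate_strategy(rng=random):
    """Random 16-bit strategy with the six fixed bits of Algorithm 2."""
    bits = rng.getrandbits(16)            # 16 random bits
    return (bits | COOPERATE_MASK) & DISCARD_MASK
```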

An important issue to address is how a node should behave at the beginning of the network operation, when it has tried to send fewer than two packets as source node, so no information is available about the network behavior. In this case, we decided that the node should try to immediately start building a good reputation among its neighbors, as suggested in Axelrod and Dion (1988). Therefore, when the node has transmitted fewer than two packets, the decision made as intermediate node is always to cooperate, regardless of the trust level it has in the current source node. The last issue we have to consider is how to manage the reputation of unknown nodes. In this case, the initial trust level assigned is 1.

The steps followed to design our algorithm are briefly summarized next. First, we tried a greedy plasmid migration algorithm in which each node exchanges its strategy with the first neighbor it finds and, if the neighbor's fitness is higher, replaces its own strategy with the acquired one. By simulation, we found that this simple plasmid migration algorithm quickly improved cooperation, but the solution achieved was not very close to optimal cooperation. Next, we tried a chromosomal cellular algorithm, in which a standard genetic algorithm was run among neighbors. The genetic information was exchanged among one-hop neighbors, and each node performed a roulette-wheel selection, a crossover and a mutation process with the received strategies. This approach yielded strategies that reached higher cooperation ratios than greedy plasmid migration. However, there were two problems: firstly, convergence to the maximum cooperation was slow, and secondly, cooperation sometimes suddenly decayed because nodes were quite disposed to change their strategies (i.e., they did not keep good strategies for long enough).

As a conclusion of the previous discussion, we devised a hybrid algorithm that mixes the previous ones. Our algorithm uses chromosomal crossover and mutation, and also allows for plasmid attributes such as the restoration of previous strategies or the possibility for a child to refuse the genetic material of its parents. This design of the evolution algorithm allowed us to find a good trade-off between the bias error (where the strategies are not close enough to optimal) and the variance error (where the strategies vary even to the point of forgetting good solutions). This process was particularly challenging because, in our problem, we have a double criterion: we need both a good cooperation level among normal nodes (to maximize the throughput) and an effective isolation of selfish nodes (to minimize energy consumption). Indeed, many design options led to truly optimal cooperation values, although the selfish nodes were not completely isolated; other design options we tested were very good at saving energy from selfish nodes, but at the cost of a lower cooperation among normal nodes. Algorithm 1 is the result of such a design procedure, where we weighted those criteria with the additional requirements of fast convergence and good adaptability. Furthermore, the extended exploration of the solution space is due to both mobility and neighborhood overlap, which ensures the convergence to almost optimal cooperation values.

As a result, the foremost characteristics of our evolution algorithm are its distributed implementation and its convergence speed. These features make our algorithm readily implementable in a MANET, since we exploit the natural structure of the network in two ways:

- The evolution is carried out in a distributed way through local neighborhoods.
- The genetic information is quickly disseminated all over the network after a few PMPs, since these neighborhoods highly overlap through mobility.

4. Simulation

One of the main goals of the work presented in this article is to devise a distributed algorithm to evolve the cooperation strategy of the nodes. The requirements for this algorithm are fast convergence to maximum cooperation and high adaptation to changing cooperation conditions. To test and develop our design, we could have used one of several powerful simulators, such as NS2 (2009), OPNET (2009) or QUALNET (2009), among others. All of them are known for having appropriate libraries for wireless ad hoc networks. However, we decided to develop a custom-made simulator in Java with a simplified network model (mobility, routing, etc.) because we wanted to measure, in an isolated way, the cooperation achieved by our proposal with respect to the maximum theoretical cooperation. Also, a controlled design of the network allows us to observe and analyze the effects of our design choices in isolation from the interactions of the physical, multi-access, routing and transport protocols. In particular, in our simulator, the mobility model and routing protocol are represented by the random selection of paths and neighborhoods. Finally, it is worth mentioning that not only our algorithm but also the centralized one has been implemented, for comparison purposes. The simulator and its main parameters are described next.

4.1. Simulator

In our simulation platform, the network is composed of a population of mobile nodes divided into two groups: the normal nodes, which are willing to cooperate if deemed worthy, and the selfish nodes, which are free-riders that discard every packet in transit. The environment, which is characterized by the fraction of selfish nodes among the whole population, can be programmed to change dynamically during the simulation by the conversion of some nodes from normal to selfish and from selfish to normal. The network simulation progresses through the execution of games, or packet transmissions, between randomly selected pairs of source/destination nodes. For each source/destination pair, several randomly selected paths are established. In a single game, once a path has been chosen, the source node transmits its packet and each intermediate node in the path decides to forward or discard the packet according to its own strategy. As mentioned in Section 3.2, the forwarding or discarding decision of an intermediate node is observed by all the nodes in the path up to the destination (if the packet arrived there) or up to the discarding node, if the transmission failed. The packet transmission (or game) succeeds if the packet arrives at the destination, and it fails if some intermediate node discards the packet. After a specified number of games, the genetic information exchange takes place among neighbor nodes to run an evolution step. This way, we can observe the evolution of strategies according to the programmed change of environment conditions.
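The game loop described above can be sketched as follows; this is an illustrative Python fragment with our own names (`play_game`, `decide`, `ctx`), not the paper's Java simulator. The packet is delivered only if every intermediate node forwards it, and the decisions made up to the drop point are the ones the preceding nodes get to observe.

```python
def play_game(path, decide, ctx):
    """Play one game over `path`, a list of intermediate node ids.

    decide(node, ctx) stands for any per-node strategy function returning
    'C' (forward) or 'D' (discard). Returns (success, observed decisions).
    """
    decisions = []
    for node in path:
        d = decide(node, ctx)
        decisions.append((node, d))
        if d == "D":
            return False, decisions   # packet dropped; game failed
    return True, decisions            # packet delivered to the destination
```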

The mobility is considered through two mechanisms: a random selection of a set of neighbors for each node at each plasmid migration period, and a random selection of the path at each single game. In the first mechanism, at the moment of a plasmid migration period, a set of six neighbors is randomly selected for each node, and the evolution process goes on within these neighborhoods. The neighborhood size is chosen in order to model a typical cellular hexagonal geometry with nodes in the vertices and the centers of each hexagon. In the second mechanism, when a packet is to be transmitted between a source and a destination, a path length is randomly chosen according to a probability mass function Ph. Given the path length h, the number of routes, r, is randomly chosen according to a conditional probability mass function Pr|h. For performance evaluation and comparison purposes, these two distributions are taken from Seredynski et al. (2007) (see Table 4). Among the discovered routes, the source node chooses the most trusted path, which is the one with the highest product of the forwarding rates of the participating intermediate nodes, according to the observations made by the source node.
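The route-selection rule just described (pick the route maximizing the product of observed forwarding rates) can be sketched in a few lines; names are ours, for illustration only.

```python
from math import prod

def most_trusted_route(routes, forwarding_rate):
    """routes: list of routes, each a list of intermediate node ids;
    forwarding_rate: dict node id -> observed forwarding rate in [0, 1].
    Returns the route with the highest product of its nodes' rates."""
    return max(routes, key=lambda route: prod(forwarding_rate[n] for n in route))
```

For example, with rates a = 0.9, b = 0.5, c = 0.8, d = 0.9, the route [c, d] (product 0.72) is preferred over [a, b] (product 0.45).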

For the evolution algorithm, the crossover point is established at the 8th bit, which marks the boundary between the strategies for trusted and untrusted source nodes. If the crossover takes place, each bit of the child strategy mutates with probability 0.001. All the simulation results are presented as the average of 60 independent replicas of the given experiment.
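The reproduction step with these parameters can be sketched as follows, treating strategies as 16-character bit strings (an illustrative Python sketch; function names are ours).

```python
import random

CROSS_POINT = 8        # boundary between trusted- and untrusted-source halves
MUTATION_RATE = 0.001  # per-bit flip probability

def crossover(parent1, parent2):
    """One-point crossover at the 8th bit."""
    return parent1[:CROSS_POINT] + parent2[CROSS_POINT:]

def mutate(strategy, rng=random, rate=MUTATION_RATE):
    """Flip each bit independently with the given probability."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b
        for b in strategy
    )
```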

Finally, our algorithm requires appropriate values for the number of observed decisions that a node will keep in memory, m, and for the number of games that a node will play as source in each plasmid migration period, R. In the next section, we use some simulation experiments in order to set these parameters.

4.2. Tuning the parameters of our algorithm

Our proposal includes the capacity of on-line adaptation. To this end, it is important that the trust values among nodes reflect, as much as possible, the behavior of their current strategies. Therefore, taking into account all the past events (having an infinite memory) is not appropriate for our algorithm, so we use a finite memory of depth m. This memory allows us to store the previous m decisions as state variables for updating the fraction of observed forwarded packets. Regarding this memory, notice that there is a trade-off: we would like a long memory to better evaluate the trust placed in each node, but we would also like a short memory in order not to judge the nodes based on outdated behaviors. After some pilot tests, we concluded that using the last three observed decisions leads to optimal cooperation ratios. Furthermore, the value m = 3 gives four possible forwarding rates (0, 1/3, 2/3 and 1), with which we can select the four corresponding trust values of Table 1.

Table 4. Probability distribution of path hops (Ph) and number of discovered paths given the length (Pr|h).

  Number of hops (h)   2     3     4     5     6     7     8
  Ph                   0.4   0.3   0.1   0.05  0.05  0.05  0.05
  P1|h                 0.5   0.5   0.6   0.6   0.6   0.8   0.8
  P2|h                 0.3   0.3   0.25  0.25  0.25  0.15  0.15
  P3|h                 0.2   0.2   0.15  0.15  0.15  0.05  0.05

Fig. 1. Evolution of cooperation for different numbers of selfish nodes (panels: 0%, 20%, 50% and 60%) and different numbers of packets per node in a plasmid migration period (curves: 10, 25, 100 and 300 rounds); each panel plots cooperation against the average number of packets sent per node.

The second important parameter is the number of times a node plays as source in a plasmid migration period, R. Again, a large value of R would allow a better evaluation of the fitness of each strategy, but at the cost of an increased convergence time and a reduced adaptability. So, for the purpose of deciding an appropriate value of R, we computed the cooperation evolution for R = 10, 25, 100 and 300 packets, under different numbers of selfish nodes within the population of 100 nodes. Recall that the cooperation is measured as the fraction of packets originated in normal nodes that effectively reach their destinations.

This cooperation is shown in Fig. 1, where the cooperation among normal nodes is plotted against the average number of packets transmitted per node, on a logarithmic scale. Given a transmission rate, this average number of packets transmitted per node becomes a measure of time, so we are plotting both the convergence speed and the maximum achieved cooperation value. In all cases, the cooperation converges to the same steady values, but the speed of convergence varies both with R and with the fraction of selfish nodes. With no selfish nodes (Fig. 1a), a value of R as small as 10 produces fast convergence times (less than 200 packets per node) to the optimal cooperation value of 1, while this time increases to 250 packets, 600 packets and 4000 packets for R = 25, 100 and 300, respectively. With 20% of selfish nodes (Fig. 1b), R = 10 seems too small to obtain a good evaluation of the strategies: compared with greater values of R, we get both a longer convergence time and a reduced maximum achieved cooperation. In this case, a value of R of 25 packets per node seems to be the best choice but, with greater fractions of selfish nodes, 100 packets per node in a plasmid migration period not only shows better convergence times but also slightly higher achieved cooperation values (Fig. 1c and d). In the four environment cases, the convergence times were unacceptable when R = 300 packets. So, according to the previous analysis, and taking into account that small values of R also generate greater transmission overhead due to frequent plasmid migration processes, we decided to use a PMP of R = 100 packets per node.

4.3. Centralized algorithm implementation

As we previously mentioned, we have also implemented the centralized evolution algorithm presented in Seredynski et al. (2007). For comparison purposes, and to better understand both proposals, in this section we introduce a more detailed description of this approach.

In the centralized model, nodes start with a randomly generated strategy. Then, series of tournaments are played to calculate the payoff that nodes will receive for their actions. The fitness of each player's strategy is evaluated as the average payoff per event. According to this fitness, a centralized entity, which knows all the current strategies and fitnesses, selects 2NN strategies through a roulette wheel mechanism, where NN is the number of participating normal nodes, i.e., nodes that are willing to cooperate if they consider it advantageous. Applying a standard one-point crossover and a standard uniform bit-flip mutation over these 2NN strategies, the centralized entity generates a set of NN new strategies in order to distribute them among the normal nodes, and the whole process is repeated during a certain number of generations. The performance results of the centralized approach show that the strategies evolve according to different (but static) environments, where an environment is characterized by a given fraction of selfish nodes, i.e., nodes that always decide to discard every packet in transit. Finally, to compare both approaches, we need to consider the same network traffic. Unfortunately, the centralized model is presented in terms of rounds and tournaments instead of transmitted packets, so we need to calculate the average number of packets per node per generation.

More specifically, a tournament in the centralized model is played among 50 nodes, randomly selected from a total population of 100 nodes. Each tournament is composed of 300 rounds. A round is composed of 50 (successful or failed) packet transmissions or games, one per participating node. Each of the nodes of the population must participate as source in at least two tournaments per generation. Since the probability that a given node is selected k times in n tournaments is q(k,n) = C(n,k) 2^(-n), the probability that a given node is selected more than once in n tournaments is 1 - q(0,n) - q(1,n) = 1 - (n+1) 2^(-n), n > 1. We are interested in the probability F(n) that, by the nth tournament, the node that has participated in the fewest tournaments has already been selected more than once. This is equal to the probability that every node of the population has been selected to participate in two or more of the n tournaments, i.e., F(n) = [1 - (n+1) 2^(-n)]^100, assuming independence among nodes. Correspondingly, the probability of requiring exactly n tournaments in order for all the nodes to participate in at least two tournaments is P(n) = F(n) - F(n-1), n > 1. According to this distribution, the mean number of tournaments is 11.56.² In conclusion, in the centralized model, there are 11.56 tournaments/generation × 300 rounds/tournament × 50 packets/round ÷ 100 nodes ≈ 1734 packets per generation per node, on average.
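This expectation can be checked numerically from F(n) alone, using E[n] = Σ_{n≥0} Pr[N > n] = Σ_{n≥0} (1 - F(n)). The short Python sketch below (our names, for illustration) reproduces both the mean of 11.56 tournaments and the resulting 1734 packets per generation per node.

```python
def F(n, population=100):
    """Probability that all nodes have played as source at least twice
    within the first n tournaments: [1 - (n+1) 2^-n]^population."""
    return max(0.0, 1.0 - (n + 1) * 2.0 ** -n) ** population

def mean_tournaments(population=100, n_max=200):
    # E[n] = sum over n >= 0 of Pr[N > n] = sum of (1 - F(n)); the tail
    # beyond n_max is negligible here.
    return sum(1.0 - F(n, population) for n in range(n_max))

# rounds/tournament x packets/round, averaged over the 100 nodes
packets_per_node = mean_tournaments() * 300 * 50 / 100
```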

4.4. Maximum cooperation

As we previously mentioned, one of the main reasons that led us to develop a custom-made simulator with a simplified network model (mobility, routing, etc.) is to be able to compute the maximum theoretical cooperation, Cmax. Next, we develop a theoretical expression for the maximum cooperation achievable under an optimal strategy with perfectly computed trust values. For this purpose, we assume that the selfish nodes are completely identified and that, with this information, the normal nodes cooperate among themselves and discard the packets of selfish nodes. This is the ideal condition that any trust model would like to achieve, where packets will be forwarded if they find at least one path composed exclusively of non-selfish nodes. In what follows, we will use the following notation:

² This result, which we obtain theoretically, has also been verified during the simulations.

- N is the total number of nodes in the network.
- NN is the total number of normal nodes among the N nodes of the population.
- Ph is the probability that a path has h hops (see Table 4).
- Pr|h is the probability of finding r routes given that the path length is h hops (see Table 4).
- B(h) is the event that there are no selfish nodes among h randomly selected nodes.
- A is the event that a packet finds at least one route made out exclusively of normal nodes.

According to the discussion above, a packet will reach the destination through an h-hop path if at least the first h nodes of the path are normal nodes (the destination can be a selfish node). Since the nodes of the path are randomly selected, the probability of finding a consecutive sequence of h normal nodes is given by

    Pr[B(h)] = ∏_{i=0}^{h-1} (NN - i) / (N - i)    (4)

Correspondingly, the probability of having at least one selfish node in an h-hop path is 1 - Pr[B(h)]. So, the probability that each one of r routes of h hops has at least one selfish node is (1 - Pr[B(h)])^r and, consequently, the probability that at least one of the r routes is composed exclusively of normal nodes is 1 - (1 - Pr[B(h)])^r. Since the probability of finding r routes of h hops is Ph · Pr|h, the probability that a packet finds at least one path composed exclusively of non-selfish nodes is given by

    Cmax = Pr[A] = Σ_h Σ_r Ph Pr|h (1 - (1 - Pr[B(h)])^r)    (5)

As we said before, this is the ideal fraction of packets originated in normal nodes that reach their destination, i.e., this is the maximum cooperation, Cmax.
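Eqs. (4) and (5) can be evaluated directly from the distributions of Table 4; the sketch below (Python for illustration, with our own names) does so. As a sanity check, with no selfish nodes Pr[B(h)] = 1 for every h, so Cmax collapses to Σ_h Ph Σ_r Pr|h = 1, and Cmax decreases as the fraction of selfish nodes grows.

```python
from math import prod

# Path-length and route-count distributions of Table 4.
P_H = {2: 0.4, 3: 0.3, 4: 0.1, 5: 0.05, 6: 0.05, 7: 0.05, 8: 0.05}
P_R_GIVEN_H = {
    2: {1: 0.5, 2: 0.3, 3: 0.2}, 3: {1: 0.5, 2: 0.3, 3: 0.2},
    4: {1: 0.6, 2: 0.25, 3: 0.15}, 5: {1: 0.6, 2: 0.25, 3: 0.15},
    6: {1: 0.6, 2: 0.25, 3: 0.15}, 7: {1: 0.8, 2: 0.15, 3: 0.05},
    8: {1: 0.8, 2: 0.15, 3: 0.05},
}

def p_all_normal(h, n_normal, n_total):
    # Eq. (4): probability that h randomly chosen relays are all normal
    return prod((n_normal - i) / (n_total - i) for i in range(h))

def c_max(n_normal, n_total=100):
    # Eq. (5): maximum achievable cooperation under perfect trust
    return sum(
        P_H[h] * p_r * (1.0 - (1.0 - p_all_normal(h, n_normal, n_total)) ** r)
        for h in P_H
        for r, p_r in P_R_GIVEN_H[h].items()
    )
```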

5. Performance evaluation

Fig. 2 compares the maximum cooperation given in Eq. (5) with the maximum cooperation values obtained with the centralized and distributed algorithms, as a function of the fraction of selfish nodes in a population of 100 nodes. Recall that we measure the cooperation as the fraction of packets originated by normal nodes that effectively reach their destinations. Fig. 2 shows not only that the values achieved by our distributed bacterial-like algorithm are higher than the values obtained by the centralized model, but also that our algorithm attains cooperation values very close to the optimal ones, especially under a low fraction of selfish nodes.

Fig. 3 shows the fraction of delivered packets for both normal and selfish nodes as a function of the plasmid migration periods. The figures are obtained under different percentages of selfish nodes, using the selected PMP length of R = 100 packets. It can be observed that our evolved strategies not only achieve a high cooperation among normal nodes, but also effectively isolate the selfish nodes by reducing their delivered packet ratio to a negligible value. These results imply the achievement of a high throughput, without a waste of energy in forwarding packets of selfish nodes (i.e., nodes save battery).

5.1. Comparison: centralized vs. distributed

Fig. 4 shows the cooperation among normal nodes for our model and for the centralized one as a function of the average number of packets transmitted per node and the percentage of selfish nodes. Again, a PMP corresponds to R = 100 packets sent per node. On the other hand, a generation in the centralized algorithm corresponds to 1734 packets sent per node, as previously discussed in Section 4.3. The first observation is that our distributed algorithm converges much faster than the centralized algorithm: while the former requires between 500 and 2000 packets (from 5 to 20 PMPs), depending on the environment, the latter requires hundreds of thousands of packets to converge, corresponding to hundreds of generations. This is an expected result, since the distributed nature of our algorithm allows a more frequent execution of the evolution process.

[Fig. 2 plot: cooperation (0–1) vs. fraction of selfish nodes (0–1); curves: theoretical optimum, centralized algorithm, distributed algorithm.]

Fig. 2. Maximum achievable cooperation values.

[Fig. 3, left panel: evolution of cooperation with normal nodes; cooperation (0–1) vs. plasmid migration period (0–60); curves for 0%, 20%, 50% and 60% selfish nodes.]

The second observation is that the final cooperation values achieved by our distributed algorithm are also significantly higher than those of the centralized one. This is not surprising if we consider that each neighborhood can evolve to different solutions, so that the overlapped and mobile neighborhoods allow a more complete exploration and exploitation of the search space.

5.2. Dynamic environmental changes

Finally, we verify whether or not our distributed algorithm adapts properly to dynamic environmental changes. To do so, we change the number of selfish nodes on the fly during a single simulation by making some normal nodes become selfish and vice versa.

Fig. 5 shows the fraction of normal nodes and the cooperation values obtained by normal and selfish nodes as a function of time (in PMPs). In the first part of the figure, every 50 PMPs a node switches its behavior from selfish to normal. After 2000 PMPs, nodes start becoming selfish again at the same rate (one node changes its behavior every 50 PMPs). This happens until 4000 PMPs. Then, sudden changes take place at random times.

[Fig. 3, right panel: evolution of cooperation with selfish nodes; cooperation (0–0.1) vs. plasmid migration period (0–60); curves for 20%, 50% and 60% selfish nodes.]

Fig. 3. Evolution of cooperation under four different environments.

[Fig. 4 plot: cooperation among normal nodes (0–1) vs. number of packets transmitted per node (10^2 to 10^5, log scale); one generation of the centralized algorithm corresponds to 1734 packets per node.]

Fig. 4. Comparison of centralized and distributed models.


[Fig. 5 plot: fraction of normal nodes, cooperation with normal nodes, and cooperation with selfish nodes vs. plasmid migration period (0–12000).]

Fig. 5. Adaptation of the cooperation to environmental changes.

[Fig. 6 plot: % normal nodes, normal cooperation, and selfish cooperation vs. migration period (0–100).]

Fig. 6. Evolution of cooperation under "step" environment transitions.


In the figure, it can be observed that the cooperation evolved in such a way that it adapted to each new environment in just a few PMPs, keeping the network at optimal cooperation values most of the time. Notice also that nodes that became selfish were not able to keep their reputation. This means that our distributed genetic algorithm is very effective in detecting and isolating selfish nodes and, as a consequence, selfish nodes do not cause a relevant waste of energy of cooperative nodes within the network.

In order to analyze in more detail how the adaptation takes place, we plot in Fig. 6 a more detailed simulation, in which the environment alternates between 60% and 100% normal nodes every 20 PMPs. At the beginning, with the initial random strategies, both normal and selfish nodes get very low cooperation from the network but, as the network runs, there is a slow adaptation towards optimal cooperation among normal nodes and an almost total isolation of selfish nodes (this takes around 12 PMPs). When the selfish nodes switch their behavior to the normal one, nodes quickly learn the new cooperation conditions and quickly start cooperating. In this last case, convergence takes only 5 PMPs (compared with the 12 PMPs of the previous case). Finally, notice that when a normal node becomes selfish again, the network takes approximately 3 PMPs to detect and isolate it.

(3) Remember that Seredynski et al. (2007) has been used in the article as the reference model for comparison purposes.

5.3. Comparison with related work

In this section, we compare our proposal with other related works for cooperation enforcement in MANETs which are also based on non-cooperative games. In this respect, in Seredynski and Bouvry (2009), the authors present and analyze a centralized system similar to Seredynski et al. (2007) (see footnote 3). In this new work, the authors keep most of the system structure of Seredynski et al. (2007) but change the encoding of the strategies of intermediate nodes. In particular, in the new proposal, the cooperation behavior during three consecutive time frames of different lengths is included in the codification of the strategy. It is worth mentioning that this new work only studies static scenarios (environmental changes are not considered) and that the authors assume a high interaction between nodes, increasing the number of rounds with respect to their previous work. In addition, the analysis presented in Seredynski and Bouvry (2009) does not focus on measuring the level of cooperation achieved, but on analyzing which are the most popular strategies after several generations. The main conclusion of their work is that the popular strategies can be seen as variations of tit-for-tat, but that none of them proved evolutionarily stable in their system. In this sense, another variation of tit-for-tat called generous tit-for-tat (GTFT) has also been proposed for use in MANETs (Milan et al., 2006). This strategy partly alleviates the impact of environmental changes by assuming that the nodes may be generous enough to contribute more to the network than they benefit from it. However, as pointed out in Ji et al. (2010), it is difficult to extend GTFT to a multi-player game scenario like end-to-end transmissions in multi-hop multi-node MANETs. In our proposal, the overall behavior of the network is optimized by considering the previous forwarding actions of the source node and the network within the strategy code. Then, these strategies evolve genetically to achieve an optimal behavior.

On the other hand, a closely related work is presented in Komathy and Narayanasamy (2008). This work uses non-cooperative game theory with a distributed evolution based on BNS (Best Neighbor Strategy) (Komathy and Narayanasamy, 2007), in which a node decides the strategy to follow by changing to another player's strategy if deemed worthy. The BNS proposal distributes the evolution process among the nodes, but the payoff structure strictly encourages cooperating strategies without taking into account the energy savings of discarding strategies. Despite the distributed nature of this evolution strategy, our proposal differs from Komathy and Narayanasamy (2008) in three aspects. First, in our proposal the payoffs not only reward cooperation but also reward the energy savings of discarding strategies. This payoff structure better models the natural dilemma a node faces, and also adds significance to the emerging cooperation, because discarding strategies can also yield a good payoff. Second, in our proposal each node's strategy depends on the trust it has in the source of the packet to be forwarded, which, in turn, depends on the observed behavior history, not only on the outcomes of the random pairing games. And third, an important strategy criterion for a node is how useful the network has been in relaying its own packets to the intended destinations, which better models the motivation for cooperation.
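The first difference — a payoff that also rewards discarding — can be illustrated with a toy per-round payoff function. The constants and names below are hypothetical, chosen only to show the shape of such a payoff, not the paper's actual values:

```python
# Hypothetical constants (illustrative only, not the paper's actual payoffs).
DELIVERY_REWARD = 1.0  # the node's own packet reached its destination
ENERGY_SAVING = 0.4    # battery saved by discarding instead of forwarding

def round_payoff(action, own_packet_delivered):
    """Toy per-round payoff: forwarding earns nothing by itself (it costs
    energy), discarding saves energy, and the node is mainly rewarded by
    how useful the network was in delivering its own traffic."""
    payoff = ENERGY_SAVING if action == "discard" else 0.0
    if own_packet_delivered:
        payoff += DELIVERY_REWARD
    return payoff
```

Because discarding also pays under this structure, the emerging cooperation is meaningful: a forwarding strategy dominates only when the delivery reward that cooperation unlocks outweighs the energy it saves by discarding.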

In Wang et al. (2010), the authors propose a non-cooperative game based on a repeated forwarding game. They also define a strategy space with five possible strategies: defection, cooperation, tit-for-tat, anti-tit-for-tat and random. They divide the time into a series of discrete slots, and they propose a system with a single memory position in which the strategy can be changed at each slot according to the fitness of the previous slot. They also propose a punishment mechanism, in which cooperating nodes change their forwarding strategy to a more restrictive one when they notice that their neighbors are doing the same thing. They argue that this kind of punishment mechanism can make other nodes around the selfish node decrease their forwarding strategies and in this way achieve a global punishment. Compared with our results, they only show results for a network with a single selfish node that suddenly reduces its forwarding behavior, and they do not consider the impact of selfishness on the performance of cooperative nodes.

In Ji et al. (2010) the authors present an interesting non-cooperative game theoretic framework. The main contribution of this work is that it considers noisy and imperfect observation. To tackle this problem, the authors propose to use a belief system based on Bayes' rule to assign and update a belief (trust) in other nodes. In particular, this belief measure is related to the probability that a certain node will follow a certain cooperation strategy. The authors develop a model for the interaction of two nodes and then try to generalize this model to a multi-node scenario. As the authors state, a direct design of their system for the multi-node case is difficult. For this reason, the two-player scenario is used as a baseline and a strategy called BMPF (Belief-based Multi-hop Packet Forwarding) is used to try to improve the end-to-end performance. To properly apply the BMPF strategy, the sender needs an updated belief value for each node on the possible route, which means that a lot of interaction between nodes is needed to take advantage of the BMPF strategy. On the other hand, their belief value accumulates all the previous history. Although they use a discount factor, the accumulation of a long history makes quick adaptability difficult. In this respect, notice that a device can switch to a selfish behavior because, for example, the node is running out of battery. When there are changes of this type, it is very important to consider the convergence time required for adaptation to the new conditions. The mentioned work presents no results in this direction; its results are presented as averages. It is true that their model can partially alleviate changes of behavior by treating them as a kind of imperfect observation, but their convergence speed is going to be low in comparison with ours, since they keep long records of interaction history. In our proposal, we have explicitly analyzed this type of environmental change and we have shown that in our system the nodes quickly learn the appropriate cooperation behavior (see our results in Section 5.2).
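To see why a long accumulated history slows adaptation, consider a minimal Bayes-rule belief update with noisy observations. The likelihood values and the symmetric observation noise below are our assumptions for illustration, not Ji et al.'s actual formulation:

```python
def update_belief(prior, saw_forwarding, p_fwd_if_coop=0.9, p_fwd_if_selfish=0.1):
    """One Bayes-rule step for the belief that a node is cooperative,
    given a single (noisy) forwarding observation."""
    if saw_forwarding:
        num = prior * p_fwd_if_coop
        den = num + (1.0 - prior) * p_fwd_if_selfish
    else:
        num = prior * (1.0 - p_fwd_if_coop)
        den = num + (1.0 - prior) * (1.0 - p_fwd_if_selfish)
    return num / den

def steps_to_distrust(history_len):
    """Defections needed to push the belief below 0.5 after the node
    cooperated for history_len rounds and then turned selfish."""
    belief = 0.5
    for _ in range(history_len):          # accumulate cooperative history
        belief = update_belief(belief, True)
    steps = 0
    while belief > 0.5 and steps < 1000:  # now only defections are observed
        belief = update_belief(belief, False)
        steps += 1
    return steps
```

With these numbers, a node with a 12-round cooperative record takes roughly 12 contradicting observations to be distrusted, while a 3-round record flips in about 3: the longer the accumulated history, the slower the reaction to a behavior switch.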

Finally, in Jaramillo and Srikant (2010), the authors present a system called Darwin which is also based on a non-cooperative game. The strategy proposed in Darwin considers a retaliation situation after a node has been falsely perceived as selfish. The system assumes that nodes share their perceived dropping probability with each other. This assumption is made in order to facilitate the theoretical analysis by isolating a pair of nodes, but in a real implementation a mechanism is required to guarantee that, even if a node lies, the reputation scheme still works. The authors state that the study of this problem in the context of wireless networks is not solved, and that they just propose a simple mechanism which relies on other cooperative nodes telling the actual perceived dropping probability of a node to minimize the impact of liars. On the contrary, in our proposal, nodes do not share trust values. Nodes just exchange their strategies and their associated fitness. The strategies do not contain specific data about particular nodes, and a strategy is reused only if it provides a good fitness, which can be directly measured by each node. Regarding performance results, the authors of Darwin do not show adaptability measures, but they mention that their system needs to interact for a long time in order to estimate the discarding probability of neighbor nodes. Simulations of Darwin show that the nodes that implement it effectively obtain more cooperation than those that do not implement it. The authors define non-Darwin nodes as those that apply random discarding with some probability. Although their results are not directly comparable with ours, it is interesting to notice that, unlike in our system, even when the non-Darwin nodes discard with probability one (which is the definition of selfish nodes in our work), they still receive considerable cooperation from the network (around 70%).
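The exchange just described — strategies travel with their measured fitness and survive only while they keep paying off — can be sketched as follows. The tuple layout and the greedy adoption rule are illustrative assumptions, not the paper's exact plasmid-migration operator:

```python
def plasmid_migration(own, neighbor):
    """One migration step: each argument is a (strategy, fitness) pair,
    where fitness is the payoff the strategy's owner measured for itself.
    The node adopts the neighbor's strategy only if it has proven fitter;
    no trust values or per-node data are exchanged."""
    own_strategy, own_fitness = own
    neighbor_strategy, neighbor_fitness = neighbor
    if neighbor_fitness > own_fitness:
        return neighbor_strategy, neighbor_fitness
    return own_strategy, own_fitness

# A low-fitness node adopts its neighbor's better-performing strategy:
adopted, _ = plasmid_migration(("discard-all", 0.2), ("trust-based-forwarding", 0.8))
```

Because only (strategy, fitness) pairs cross node boundaries, a lying node can at worst advertise a bad strategy, which each adopter will discard as soon as its own measured fitness fails to confirm the advertised one.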

6. Conclusions

In this article we have shown that it is possible to use distributed algorithms for the genetic evolution of strategies in a game theoretic trust model for MANETs. We have proposed a trust model in which genetic information is exchanged among neighbor nodes, much like plasmid migration in bacterial colonies. This way, we introduce low communication overhead while disseminating good strategies throughout the network by means of neighborhood overlapping and mobility. Our proposal does not need a central entity and does not require an unrealistically large number of node interactions to evaluate the fitness. Instead, each node adapts its strategy to the dynamic characteristics of the network, trying to maximize its payoff in terms of packet delivery and resource saving. As an emerging feature of this local optimization procedure, the whole network was able to maximize cooperation and save energy by offering the forwarding service only to those nodes that were willing to cooperate with the network, and by isolating those free-rider nodes that wanted to use the resources of the network without cooperating with it. Our distributed model not only achieved these objectives hundreds of times faster than the centralized model, but also obtained better cooperation values, very close to those predicted by a theoretical upper bound. Thanks to these features of fast convergence to optimal cooperation values and selfish node isolation, the algorithm also demonstrated a remarkable capacity to adapt to dynamic environmental changes.

References

Alba E, Dorronsoro B. Cellular genetic algorithms (operations research/computer science interfaces series). 1st ed. Springer; 2008.

Alba E, Troya JM. A survey of parallel distributed genetic algorithms. Complexity 1999;4(4):31–52.

Anderegg L, Eidenbenz S. Ad hoc-VCG: a truthful and cost-efficient routing protocol for mobile ad hoc networks with selfish agents. In: MobiCom '03: proceedings of the ninth annual international conference on mobile computing and networking. New York, NY, USA: ACM; 2003. p. 245–59.

Axelrod R, Dion D. The further evolution of cooperation. Science 1988;242(4884):1385–90.

Bansal S, Baker M. Observation-based cooperation enforcement in ad hoc networks. Technical Report, 2003.

Baras J, Jiang T. Cooperation, trust and games in wireless networks. Springerlink; 2005. p. 183–202 [chapter 4].

Buchegger S, Le Boudec J-Y. A robust reputation system for P2P and mobile ad-hoc networks. In: Proceedings of the second workshop on the economics of peer-to-peer systems, 2004.

Buchegger S, Le Boudec J-Y. Performance analysis of the CONFIDANT protocol. In: MobiHoc '02: proceedings of the third ACM international symposium on mobile ad hoc networking & computing. New York, NY, USA: ACM Press; 2002. p. 226–36.

Buttyan L, Hubaux J-P. Enforcing service availability in mobile ad-hoc WANs. In: MobiHoc '00: proceedings of the first ACM international symposium on mobile ad hoc networking & computing. Piscataway, NJ, USA: IEEE Press; 2000. p. 87–96.

Cabrita C, Botzheim J, Ruano A, Koczy L. Genetic programming and bacterial algorithm for neural networks and fuzzy systems design. In: IFAC international conference on intelligent control systems and signal processing (ICONS), 2003.

Cantu-Paz E. A survey of parallel genetic algorithms. Calculateurs Paralleles, Reseaux et Systemes Repartis 1998;10(2):141–71.

Dale J, Park S. Molecular genetics of bacteria. 4th ed. Wiley; 2004.

Felegyhazi M, Hubaux J-P, Buttyan L. Nash equilibria of packet forwarding strategies in wireless ad hoc networks. IEEE Transactions on Mobile Computing 2006;5(5):463–76.

Harvey I. The microbial genetic algorithm. URL: http://www.cogs.susx.ac.uk/users/inmanh/Microbial.pdf; 1996.

Holland JH. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. Ann Arbor: University of Michigan Press; 1975.


Ishibuchi H, Namikawa N. Evolution of cooperative behavior in the iterated prisoner's dilemma under random pairing in game playing. IEEE Congress on Evolutionary Computation 2005;3(September):2637–44.

Janzadeh H, Fayazbakhsh K, Dehghan M, Fallah MS. A secure credit-based cooperation stimulating mechanism for MANETs using hash chains. Future Generation Computer Systems 2009;25(8):926–34.

Jaramillo JJ, Srikant R. A game theory based reputation mechanism to incentivize cooperation in wireless ad hoc networks. Ad Hoc Networks 2010;8(4):416–29.

Ji Z, Yu W, Liu KR. A belief evaluation framework in autonomous MANETs under noisy and imperfect observation: vulnerability analysis and cooperation enforcement. IEEE Transactions on Mobile Computing 2010;9:1242–54.

Komathy K, Narayanasamy P. Best neighbor strategy to enforce cooperation among selfish nodes in wireless ad hoc network. Computer Communications 2007;30(18):3721–35.

Komathy K, Narayanasamy P. Trust-based evolutionary game model assisting AODV routing against selfishness. Journal of Network and Computer Applications 2008;31(4):446–71.

Li F, Wu J. Uncertainty modeling and reduction in MANETs. IEEE Transactions on Mobile Computing 2010;9(7):1035–48.

Luo J, Liu X, Fan M. A trust model based on fuzzy recommendation for mobile ad-hoc networks. Computer Networks 2009;53(14):2396–407.

Marias GF, Georgiadis P, Flitzanis D, Mandalas K. Cooperation enforcement schemes for MANETs: a survey: research articles. Wireless Communications and Mobile Computing 2006;6(3):319–32.

Marshall IW, Roadknight C. Adaptive management of an active service network. BT Technology Journal 2000;18(4):78–84.

Marti S, Garcia-Molina H. Taxonomy of trust: categorizing P2P reputation systems. Computer Networks 2006;50(4):472–84.

Mejia M, Alzate M, Pena N, Munoz JL, Esparza O. Distributed evolution of strategies in a game theoretic trust model for mobile ad hoc networks. In: VII Jornadas de Ingeniería Telemática—JITEL; 2009a.

Mejia M, Pena N, Munoz JL, Esparza O. A review of trust modeling in ad hoc networks. Internet Research 2009b;19(1):88–104.

Michiardi P, Molva R. CORE: a collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks. In: Proceedings of the IFIP TC6/TC11 sixth joint working conference on communications and multimedia security. Deventer, The Netherlands: Kluwer, B.V.; 2002. p. 107–21.

Milan F, Jaramillo JJ, Srikant R. Achieving cooperation in multihop wireless networks of selfish nodes. In: GameNets '06: proceeding from the 2006 workshop on game theory for communications and networks. New York, NY, USA: ACM; 2006.

Munoz JL, Esparza O, Aguilar M, Carrascal V, Forne J. RDSR-V: reliable dynamic source routing for video-streaming over mobile ad hoc networks. Computer Networks 2010;54(1):79–96.

Nawa NE, Furuhashi T, Hashiyama T, Uchikawa Y. A study on the discovery of relevant fuzzy rules using pseudo-bacterial genetic algorithm. IEEE Transactions on Industrial Electronics 1999;46(6):1080–9.

Nowostawski M, Poli R. Parallel genetic algorithm taxonomy. In: Third international conference on knowledge-based intelligent information engineering systems, 1999. p. 88–92.

NS2. Network simulator. http://www.isi.edu/nsnam/ns/; 2009.

Ochman H, Lawrence JG, Groisman EA. Lateral gene transfer and the nature of bacterial innovation. Nature 2000;405(6784):299–304.

OPNET. Opnet Technologies Inc. http://www.opnet.com; 2009.

Osborne M. An introduction to game theory. Oxford University Press; 2004.

Perkins CE. Ad hoc networking. Addison-Wesley Professional; 2001.

QUALNET. Scalable Network Technologies. http://www.scalable-networks.com/products/developer.php; 2009.

Saad W, Han Z, Debbah M, Hjørungnes A. A distributed coalition formation framework for fair user cooperation in wireless networks. IEEE Transactions on Wireless Communications 2009;8(9):4580–93.

Safa H, Artail H, Tabet D. A cluster-based trust-aware routing protocol for mobile ad hoc networks. Wireless Networks 2010;16(4):969–84.

Seredynski M, Bouvry P. Evolutionary game theoretical analysis of reputation-based packet forwarding in civilian mobile ad hoc networks. In: IPDPS '09: proceedings of the 2009 IEEE international symposium on parallel & distributed processing. Washington, DC, USA: IEEE Computer Society; 2009. p. 1–8.

Seredynski M, Bouvry P, Klopotek M. Modelling the evolution of cooperative behavior in ad hoc networks using a game based model. IEEE Symposium on Computational Intelligence and Games (CIG) 2007:96–103.

Sherwood R, Lee S, Bhattacharjee B. Cooperative peer groups in NICE. Computer Networks 2006;50(4):523–44.

Sun YL, Yu W, Han Z, Liu K. Information theoretic framework of trust modeling and evaluation for ad hoc networks. IEEE Journal on Selected Areas in Communications 2006;24(2):305–17.

Wang K, Wu M, Ding C, Lu W. Game-based modeling of node cooperation in ad hoc networks. In: 19th annual wireless and optical communications conference (WOCC), 2010.

Weiss MA. Data structures and algorithm analysis in Java. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.; 1998.

Whitley D. A genetic algorithm tutorial. Technical Report, Colorado State University; 1993.

Wrona K, Mähönen P. Analytical model of cooperation in ad hoc networks. Telecommunication Systems 2004;27:347–69.

Yale SZ, Zhong S. Sprite: a simple, cheat-proof, credit-based system for mobile ad-hoc networks. In: Proceedings of IEEE INFOCOM, 2002. p. 1987–97.

Zouridaki C, Mark BL, Hejmo M, Thomas RK. E-Hermes: a robust cooperative trust establishment scheme for mobile ad hoc networks. Ad Hoc Networks 2009;7(6):1156–68.