Swarm intelligence routing approach in networked robots


S. Hoceini & A. Mellouk & A. Chibani & Y. Touati & B. Augustin

Received: 21 May 2011 / Accepted: 10 March 2012 / Published online: 20 June 2012 © Institut Mines-Télécom and Springer-Verlag 2012

Abstract Robot swarms combined with wireless communication have been a key driving force in recent years, and the field has expanded to wireless multihop networks, which include ad hoc radio networks, sensor networks, wireless mesh networks, etc. This paper proposes an approach that introduces a polynomial time approximation path navigation algorithm and constructs dynamic state-dependent navigation policies. The proposed algorithm uses an inductive approach based on a trial/error paradigm combined with swarm adaptive approaches to optimize two criteria simultaneously: cumulative path cost and end-to-end path delay. The approach samples, estimates, and builds a model of the pertinent aspects of the environment. It uses a model that combines a stochastic planned prenavigation for the exploration phase with a deterministic approach for the backward phase. To show the robustness and performance of the proposed approach, a simulation scenario is built by specifying the network topology of interest and the network traffic between robots. The approach is then compared with a traditional optimal path routing policy.

Keywords Networked robots . Robot swarm . Adaptive approaches . Irregular traffic . Quality of service based routing

1 Introduction

The evolution of computing and hardware technologies has made it possible to develop new protocols and communication tools for system interaction in different areas, such as the automotive and space industries, medicine, etc. Several initiatives in robotics have studied cooperation techniques and communication protocols (Fig. 1).

Several activities in mobile robotics have focused on collective and cooperative behavior in order to increase the robustness and feasibility of robotic tasks. In the networked robotics area, much research has been undertaken to study the group behaviors found in small insects or animals. These studies mainly concern how to control and coordinate a team of robots. In particular, real-time wireless communication can help dynamic resource management and self-organization for a team of cooperative robots. Multiple robots sharing the same mission naturally communicate with each other through wireless communication. In this respect, wireless communication is an excellent candidate for interrobot information exchange. Thus, robot swarms combined with wireless communication have been a key driving force in recent years, and the field has expanded to wireless multihop networks, which include ad hoc radio networks, sensor networks, wireless mesh networks, and mobile multihop relay systems. The robot swarm has emerged recently as a solution for multiple tasks executed in complex environments, significantly reducing the costs of individual robots and accessories such as wireless adaptors, global positioning system (GPS), etc. In the present work, we are interested in quality of service (QoS) to guarantee performance in terms of bandwidth and delay in networked robots, under node mobility and communication conditions for collaborative robot systems.

S. Hoceini (*) : A. Mellouk : A. Chibani : B. Augustin
LiSSi Laboratory & Department of Networks and Telecoms, IUT C/V, University of Paris-Est (UPEC), 122, rue Paul Armangot, 94400 Vitry-sur-Seine, France
e-mail: [email protected]

Y. Touati
LIASD Laboratory, Paris 8 University, 2, rue de la Liberté, 93526 Saint-Denis Cedex, France

Ann. Telecommun. (2012) 67:377–386
DOI 10.1007/s12243-012-0309-8

This paper is organized as follows: In Section 2, we discuss related projects in robot swarms. Section 3 briefly reviews state-of-the-art swarm-based routing algorithms and introduces quality of service. In Section 4, we describe our proposed method, which is based on a polynomial time approximation path navigation algorithm that constructs dynamic state-dependent navigation policies. Sections 5 and 6 present results and conclusions, respectively.

2 Robot swarm and related works

Swarm intelligence techniques are widely used in networked robotics tasks such as moving around obstacles, pushing a block, etc. Swarm robotics can be defined as the study of how a swarm of relatively simple physically embodied agents can be constructed to collectively accomplish tasks that are beyond the capabilities of a single one [1]. An example architecture of a proposed robot swarm communication network is illustrated in Fig. 2. Here, wireless mesh routers are deployed in the environment, and each of them has several antennas that allow it to operate over multiple channels to improve network capacity and coverage. All the robots are equipped with wireless adaptors, which allow them to communicate with mesh routers or other robots.

Unlike other studies on multirobot systems, swarm robotics emphasizes self-organization and emergence while keeping in mind the issues of scalability and robustness. These emphases promote the use of relatively simple robots, equipped with localized sensing abilities, scalable communication mechanisms, and the exploration of decentralized control strategies. In this case, problems are solved by implementing low-level behaviors such as avoiding and following, as in ant colonies [2]. For example, if pheromone traces are in the vicinity, then follow them; otherwise, generate a random movement. Unlike swarm intelligence techniques, in the case of robot soccer teams, communication operates directly between agents, particularly to indicate their individual locations. Among research laboratories working on multirobot cooperation problems, we find the LEGO Laboratory in Denmark. Its research concerns genetic techniques applied to robot soccer players, i.e., Khepera robots. In this context, the concept of co-evolution was developed. Unlike other learning techniques, co-evolution has an advantage in the definition of a global performance function. A major drawback, however, is that the search space increases, even though it offers total autonomy for the team organization. Similar work on coordination and cooperative robot learning has also been carried out by the Center for Robotics and Embedded Systems (CRES) at the University of Southern California [3, 4, 5]. The Oak Ridge National Laboratory has also contributed significant work on robotic cooperation [6, 7]. Its team is developing an algorithm for distributing a global performance measure to each robot, applied to collective robotic tasks such as cleaning, box storage, and passing a stick between robots. In the same direction, the Mobile Robot Lab at Georgia Tech in Atlanta develops reactive strategies based on learning methods for multi-agent cooperation in order to help robots search for trash objects, which they grasp and carry to wastebaskets [8, 9, 10]. Other interesting works using the principles of swarm intelligence have investigated several aspects of cooperative multi-agent systems, especially techniques applied to robot soccer players [11, 12]. Great interest is focused on multi-agent systems that are embedded in realistic real-world situations and have to deal successfully with motion noise, sensor uncertainty, and time constraints. Special emphasis is currently placed on real-time embedded systems such as multirobot platforms, actuator and sensor networks [13, 14], and intelligent vehicles [15]. Other related works [16, 17] concern the self-deployment of a mobile sensor network and the coordination of robot swarms, where the concept of Swarm-Bot is proposed: a group of autonomous mobile robots called S-Bots have a particular assembling capability that enables them to connect physically to each other. Additionally, some works address the coverage of a geographic area [18], while others focus on the exploration of an unknown environment [19].

Fig. 1 Cooperation between several robotics entities

The need to incorporate wireless and possibly ad hoc networks into the existing wired infrastructure makes the requirement for efficient network routing even more demanding. In fact, routing algorithms in modern networks must address numerous problems. Two of the usual performance metrics of a network are average throughput and delay. The interaction between routing and flow control affects how well these metrics are jointly optimized. In [20], it is noted that the balance of delay and throughput is determined by the flow-control scheme, and that good routing results in a more favorable delay-throughput curve. QoS guarantee is another important performance measure [21, 22]. Here, a user might require a guaranteed allocation of bandwidth, a maximum delay, or a minimum hop count. Current routing algorithms are not adequate to tackle the increasing complexity of such networks. Centralized algorithms have scalability problems, static algorithms have trouble keeping up to date with network changes, and other distributed and dynamic algorithms have oscillation and stability problems. In this respect, swarm intelligence routing can provide a promising alternative to these approaches.

3 Swarm intelligence routing and QoS

Swarm intelligence uses mobile software agents for network management. These agents are autonomous entities, both proactive and reactive, and have the capability to adapt, cooperate, and move intelligently from one location to another in the communication network [23]. In particular, sensor and actor networks rely on self-optimized nodes for executing specific tasks and on the routing layer for determining optimal routing paths for specific tasks [24]. As described above, QoS guarantee is another important performance measure in terms of delay and bandwidth. A class of routing algorithms has been developed in this respect.

Fig. 2 Robot swarm communication network architecture


These QoS routing algorithms are usually message based, i.e., they find a feasible path satisfying the QoS constraints based on an exchange of messages between the nodes [12]. They tend to temporarily overuse network resources until they find the appropriate path [20]. Yet another form of network control, which relies heavily on routing, is load balancing [25, 26].

As described in [27], a number of swarm-based routing algorithms have been studied, for example AntNet [28]. AntNet is an adaptive agent-based routing algorithm that has outperformed the best-known routing algorithms on several packet-switched communication networks. For telephone networks, there is also a successful application of swarm intelligence dubbed ant-based control [26]. In [25], the authors propose another interesting example using a variation of swarm routing based on Bellman's principle of dynamic programming. In [21], an algorithm dubbed agent-based routing system, whose main goal is to achieve high utilization of network resources, has been implemented. The authors propose an extension of the AntNet algorithm with QoS guarantees, imposing certain restrictions on bandwidth and hop count. Following AntNet's seminal work, a number of bio-inspired swarm algorithms have been proposed [29, 30], including applications to vehicular ad hoc networks [31].

In wireless mesh networks (WMNs), multichannel or multiradio features are enabled at mesh routers. However, WMNs suffer from significant channel interference and are thus unable to fully achieve their high capacity or to support sufficient QoS for multimedia applications. The design of an automatic network configuration protocol, in which the IP address of a robot swarm is dynamically changed according to the associated mesh router, should therefore be investigated. The QoS experienced by wireless end users is constrained by the number of gateway routers and available channels. It is therefore desirable to balance the load of the network traffic going through gateway routers. Furthermore, as users move, it might be more appropriate to route traffic to a closer gateway node or to a gateway node whose route has higher available bandwidth.

One of the main goals of state-dependent navigation in networked robot swarms is to find a path that satisfies the given constraints while simultaneously optimizing resource utilization. The integration of constraint parameters increases the complexity of current navigation algorithms. In fact, the problem of determining a path that satisfies two or more path constraints (for example, delay and cost) is known to be NP-complete [32]. One major difficulty is that no algorithm is known whose time to solve the multiconstrained optimal path problem exactly is upper-bounded by a polynomial function. Hence, much of the focus over the last few years has been on the development of pseudopolynomial time algorithms, heuristics, and approximation algorithms for multiconstrained navigation paths.

The purpose of this paper is to propose an inductive approach based on a trial/error paradigm combined with swarm adaptive approaches to optimize two criteria simultaneously: cumulative path cost and end-to-end path delay. The approach constructs dynamic state-dependent navigation policies. The originality of our approach lies in the fact that our system is able to take into account the dynamics of the environment without assuming any initial model of these dynamics. Our approach samples, estimates, and builds a model of the pertinent aspects of the environment. The algorithm uses a model that combines a stochastic planned prenavigation for the exploration phase with a deterministic approach for the backward phase.

4 Proposed models and algorithm formulation

Based on our earlier works tested on wired operator networks [33, 34], our objective in this paper is to adapt the previously developed system, called the K optimal Q-routing algorithm, to applications in robot environments. As we optimize two kinds of QoS constraints, static (cost) and dynamic (delay), we define a functional model based on two main stages, as illustrated in Fig. 3.

1. First stage (selecting candidate paths based on static criteria): The set of candidate paths, which is a subset of all the available paths between a source–destination pair of robots, is selected based on a cost function. The latter combines all static QoS criteria, such as link cost, hop count, error ratio, etc.

2. Second stage (load balancing traffic based on dynamic criteria): The traffic is split among a subset of the candidate paths produced by the first stage. The proportion assigned to each qualified path depends on the evaluation of dynamic criteria such as the residual bandwidth, the end-to-end delay, etc. This evaluation requires the definition of a real-time cost function based on the dynamic criteria considered. For this, we use a learning approach based on reinforcement learning methods.

4.1 Continuous optimization process

In our earlier work, we proposed a unified formalism for this state-dependent approach based on [35, 36]. This section summarizes this model.

The system is based on different modules, each one optimizing a local cost function. To achieve this, we propose a unified formalism to study the convergence of our approach. Continuous learning in our system involves changing the parameters of the network model using an adaptive update rule whose general formulation is:

$$w(n) = w(n-1) - \varepsilon(n)\, F\big(x(n), w(n-1)\big) \qquad (1)$$

where w(n) denotes the parameters of the whole system at time n, x(n) is the current traffic, and ε is the modification step. F is either the gradient of a cost function or a heuristic rule. x can be described by a probability density function p(x), and w denotes the parameters of the learning system. The concept behind our system is that each iteration uses an example drawn from the real flow instead of a finite training set. The average update is therefore a gradient descent algorithm that directly optimizes the expected risk. For a given state of the system, we can define a local cost function J(x,w) which measures how well our system behaves on x. The goal of learning is often the optimization of some functional of the parameters and of the concept to learn, which we will call the global cost function. It is usually the expectation of the local cost function over the space X of the concept to learn:

$$C(w) = \int_{x} J(x, w)\, p(x)\, dx = E_x\{J(x, w)\} \qquad (2)$$

Most often we do not know the explicit form of p(x), and therefore C(w) is unknown. Our knowledge of the random process comes from a series of observations {x_i} of the variable x. We are thus only able to measure the realizations J(x,w) for the observations {x_i}.

A necessary condition of optimality for the parameters of the system is:

$$\nabla C(w) = E_x\{\nabla_w J(x, w)\} = 0 \qquad (3)$$

where ∇ is the gradient operator. Since we do not know ∇C(w) but only the realizations ∇_w J(x,w), we cannot use classical optimization techniques to reach the optimum. One solution is to resort to an adaptive algorithm, the simplest being the following:

$$w(n) = w(n-1) - \gamma(n)\, \nabla_w J\big(x(n), w(n-1)\big) \qquad (4)$$

where γ is the gradient step. There is an obvious similarity between Eqs. (1) and (4), and when learning in the second stage aims at minimizing a cost function, there is a complete match. This formulation is particularly adequate for describing adaptive algorithms that simultaneously process an observation and learn to perform better. Such adaptive algorithms are very useful in tracking a phenomenon that evolves over time.
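The adaptive rule of Eq. (4) can be illustrated with a minimal sketch. The quadratic local cost J(x, w) = 0.5 (w − x)², the step schedule γ(n) = 1/n, and all function names below are assumptions chosen for illustration and are not taken from the paper. With these choices the global cost C(w) is minimized at the mean of x, and the online update reduces to a running average of the observations.

```python
import random

def local_cost_gradient(x, w):
    """Gradient in w of the illustrative local cost J(x, w) = 0.5 * (w - x)**2."""
    return w - x

def adaptive_update(w, x, step):
    """One step of Eq. (4): w(n) = w(n-1) - gamma(n) * grad_w J(x(n), w(n-1))."""
    return w - step * local_cost_gradient(x, w)

def track_mean(samples, w0=0.0):
    """Process observations one at a time with the assumed step gamma(n) = 1/n."""
    w = w0
    for n, x in enumerate(samples, start=1):
        w = adaptive_update(w, x, 1.0 / n)
    return w

# Observations are drawn from the "real flow" rather than a fixed training set.
rng = random.Random(0)
observations = [rng.gauss(5.0, 1.0) for _ in range(2000)]
estimate = track_mean(observations)
```

Each observation is used once and then discarded, which is what allows such a rule to follow a phenomenon that drifts over time when a non-vanishing step is used instead of 1/n.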

4.2 Cost function

Optimizing multiple criteria simultaneously remains a challenge because the complexity becomes very high. To address this problem, we use a lexicographic heuristic based on the importance of the metrics. This sequential filtering first selects a set of paths according to the static cost function. In the second step, we eliminate a subset of these paths based on a secondary metric, and so on.

Our proposed system makes use of two modules that act on the two stages of our functional model:

1. The objective of the first module is to select the K best candidate paths in terms of cumulative link cost (considered as a static criterion) from the source to the destination node (for simplicity, we consider here all link costs equal to 1, as in a hop count mechanism).

2. The second module considers the dynamic criteria: it selects the N best paths among the first K best paths (N<K) according to the estimate of the real-time end-to-end delay obtained from the reinforcement learning signal.
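The two-step lexicographic filtering can be sketched as follows. This is only an illustration under stated assumptions: paths are represented as tuples of node names, the static cost is the hop count, and the delay values are invented for the example, whereas in the paper they come from the reinforcement learning estimate. The function name `lexicographic_filter` is hypothetical.

```python
def lexicographic_filter(paths, static_cost, dynamic_cost, k, n):
    """Keep the k best paths by static cost, then the n best of those by dynamic cost."""
    if not 0 < n < k:
        raise ValueError("expected 0 < n < k")
    candidates = sorted(paths, key=static_cost)[:k]   # first module: static criterion
    return sorted(candidates, key=dynamic_cost)[:n]   # second module: dynamic criterion

# Hypothetical candidate paths from source "s" to destination "d".
paths = [
    ("s", "a", "d"),
    ("s", "b", "c", "d"),
    ("s", "e", "d"),
    ("s", "f", "g", "h", "d"),
]
# Hypothetical end-to-end delay estimates (arbitrary time units).
delays = {paths[0]: 12.0, paths[1]: 7.5, paths[2]: 9.0, paths[3]: 3.0}

def hop_count(path):
    return len(path) - 1   # all link costs equal to 1

selected = lexicographic_filter(paths, hop_count, delays.get, k=3, n=2)
# selected == [("s", "b", "c", "d"), ("s", "e", "d")]
```

Note that the four-hop path has the lowest delay but is never considered, because it is eliminated by the static criterion first. This is the price of the lexicographic ordering: it trades optimality over the full path set for a much smaller search.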

Fig. 3 Functional model scheme (diagram labels: flow packet, instantaneous demand, static criteria, network parameters, first stage, candidate short paths, dynamic criteria, real-time traffic information, second stage, subset of used paths)


To summarize, the important points proposed in our model are:

– Its modular architecture makes it adaptable to integrate static or dynamic criteria.

– The lexicographic filtering sequence approach can reduce its complexity.

– The cost functions are estimated by learning the parameters that define the environment, so as to take into account the dynamic nature of its evolution.

4.2.1 First module: constructing K optimal paths based on static criteria

First of all, instead of exploring the entire network environment, which requires large computational time and memory space, our approach reduces this environment to the K best loop-free paths in terms of cumulative path cost. To this end, each robot maintains a link state database describing the network topology map. We use a label setting algorithm based on the optimality principle and on a generalization of Dijkstra's algorithm.

To find these K best paths, a variant of Dijkstra's algorithm was proposed in [17, 20]. The space complexity is O(Kmn), where K is the number of paths, n is the number of robots, and m is the number of links. Using the Pareto principle of nondominated paths [37], we can reduce the search space without compromising the solution, and the time complexity can be kept at O(Kn log(Kn) + K²m).

When a network link changes its state (i.e., goes up or down, or its utilization increases or decreases), the network is flooded with a link state advertisement message. This message can be issued periodically or when the actual link state change exceeds a certain relative or absolute threshold. Obviously, there is a tradeoff between the frequency of state updates (the accuracy of the link state database) and the cost of performing those updates. In our approach, the link state information is updated whenever the actual link state changes. Once the link state database at each robot is updated, the robot computes the K optimal paths.
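The advertisement trigger described above can be sketched as a small decision function. The representation of the link state as a single number (for instance a utilization ratio), the function name, and the parameter names are assumptions made for illustration. When no threshold is given, any change triggers an update, which corresponds to the policy adopted in this approach.

```python
def should_advertise(last_advertised, current, relative=None, absolute=None):
    """Decide whether a link state change warrants flooding an advertisement.

    last_advertised and current are numeric link state values (assumed here to be
    a utilization ratio). With no threshold set, any change triggers an update.
    """
    change = abs(current - last_advertised)
    if relative is None and absolute is None:
        return change > 0
    if absolute is not None and change > absolute:
        return True
    if relative is not None and last_advertised != 0:
        return change / abs(last_advertised) > relative
    return False
```

For example, `should_advertise(0.50, 0.52, relative=0.10)` suppresses the update because the change is only 4 % of the last advertised value, whereas the threshold-free call `should_advertise(0.50, 0.52)` triggers one. Larger thresholds reduce flooding at the cost of a less accurate link state database.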

Let a DAG (N; A) denote a network with n robots and m links, where N = {1, ..., n} and A = {a_ij | i, j ∈ N}. The problem is to find the top K paths from source s to all the other robots. Let us define a label set X and a one-to-many projection h: N→X, meaning that each robot i ∈ N corresponds to a set of labels h(i), each element of which represents a path from s to i.
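To make the label sets h(i) concrete, the sketch below enumerates loop-free paths from the source in order of increasing cost and keeps the first K labels reaching each node. It is not the label setting algorithm used in the paper: it is a plain best-first enumeration without any pruning, so it can take exponential time and is only suitable for small topologies. The function name and the adjacency-dictionary representation are assumptions.

```python
import heapq

def k_best_loopless_paths(links, source, k):
    """Return, for every reachable node i, up to k cheapest loop-free paths from source.

    links maps each node to a dict {neighbor: link_cost}; costs must be non-negative.
    The result maps node i to its label set h(i), a list of (cost, path) pairs.
    """
    labels = {}
    heap = [(0, (source,))]               # candidate labels ordered by cumulative cost
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        kept = labels.setdefault(node, [])
        if len(kept) < k:                 # record only the k cheapest labels per node
            kept.append((cost, path))
        for neighbor, link_cost in links.get(node, {}).items():
            if neighbor not in path:      # reject paths containing a loop
                heapq.heappush(heap, (cost + link_cost, path + (neighbor,)))
    return labels

# Small illustrative topology with unit link costs (hop count).
links = {
    "s": {"a": 1, "b": 1},
    "a": {"d": 1, "b": 1},
    "b": {"d": 1, "a": 1},
    "d": {},
}
labels = k_best_loopless_paths(links, "s", k=3)
# labels["d"] holds two 2-hop paths and one 3-hop path.
```

Because partial paths are popped in nondecreasing cost order, the first K labels recorded at a node are its K cheapest loop-free paths. Every partial path is still extended, even at nodes that already hold K labels, since stopping there is not safe in general once loops are forbidden.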


4.2.2 Second module: optimizing the end-to-end delay with the Q-learning algorithm

After finding the K best paths based on path cost, the second step is to choose one path among these K candidates. For this, we use another criterion based on real-time end-to-end delays. The selected reinforcement signal corresponds to the estimated time needed to reach the destination.

In this approach, each robot x maintains in a Q table a collection of values Q(x, y, d) for every destination d and every neighbor y. This value reflects the delay to reach destination d via neighbor y. Robot x then chooses one path determined from the Q table and forwards the packet to the corresponding neighbor y. Just after receiving this packet, robot y provides x with an estimate of its best Q value to reach the destination. This new information is then incorporated into the Q values of robot x.

The reinforcement signal T employed in the Q-learning algorithm can be defined as the minimum of the sum of the estimated Q(x, y, d) sent by robot y, a neighbor of robot x, and the latency q_x in the waiting queue of robot x.

$$T = \min_{y \,\in\, \text{neighbors of } x} \{\, q_x + Q(x, y, d) \,\} \qquad (5)$$

where Q(x, y, d) denotes the time estimated by robot x for the packet p to reach its destination d through robot y. The packet is sent to robot y, which determines the optimal path on which to forward it.

Once the choice of the next robot is made, robot y puts the packet in its waiting queue and sends back the value T as a reinforcement signal to robot x. Robot x can then update its reinforcement function as:

$$\Delta Q(x, y, d) = \eta\,\big(\alpha + T - Q(x, y, d)\big) \qquad (6)$$

where α and η are the packet transmission time between x and y and the learning rate, respectively.

So, the new estimate Q′(x, y, d) can be written as follows:

$$Q'(x, y, d) = Q(x, y, d)\,(1 - \eta) + \eta\,(T + \alpha) \qquad (7)$$
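The update of Eqs. (5) to (7) can be sketched numerically as follows. The nested-dictionary layout of the Q table, the function names, and all numeric values are assumptions for illustration. The reinforcement signal is computed literally as written in Eq. (5).

```python
def reinforcement_signal(q_table, x, d, queue_latency):
    """Eq. (5): T = min over neighbors y of x of (q_x + Q(x, y, d))."""
    return min(queue_latency + q for q in q_table[x][d].values())

def delta_q(q_old, t, alpha, eta):
    """Eq. (6): delta Q = eta * (alpha + T - Q)."""
    return eta * (alpha + t - q_old)

def updated_estimate(q_old, t, alpha, eta):
    """Eq. (7): Q' = Q * (1 - eta) + eta * (T + alpha)."""
    return q_old * (1.0 - eta) + eta * (t + alpha)

# Hypothetical Q table of robot "x": q_table[x][d][y] = estimated delay to d via y.
q_table = {"x": {"d": {"y1": 10.0, "y2": 6.0}}}

t = reinforcement_signal(q_table, "x", "d", queue_latency=2.0)   # min(12.0, 8.0) = 8.0
q_old = q_table["x"]["d"]["y2"]
q_new = updated_estimate(q_old, t, alpha=1.0, eta=0.5)           # 6*0.5 + 0.5*9 = 7.5
q_table["x"]["d"]["y2"] = q_new
```

Equation (7) is simply Eq. (6) added to the old estimate, Q′ = Q + ΔQ, so `q_old + delta_q(q_old, t, 1.0, 0.5)` gives the same value of 7.5. The learning rate η controls how quickly old delay estimates are forgotten in favor of the latest feedback.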

5 Numerical results

A simulation platform for networked robots should include link and network traffic information. A simulation scenario is built by specifying the network topology of interest and the network traffic between robots. In this setting, our approach is compared with the traditional link state optimal path routing policy (LSPR).

The topology of the networked robots is specified by a collection of robots and a set of links that connect these robots. The network traffic is specified at the source robot by setting several parameters such as the start time, the stop time, the statistical distribution of packet interarrival times, the statistical distribution of packet sizes, and the destination node.

To ensure a meaningful validation of our algorithm's performance, we devised a realistic simulation environment in terms of network characteristics, communication protocols, and traffic patterns. With the proposed architecture, robot swarms help with critical tasks that are difficult for humans, such as continuous surveillance or coverage inspection. The scenario used in our simulation process concerns information collection through effective coordination. A team of robots equipped with GPS, video cameras, and sensors can capture images/video periodically, recognize sensitive objects, and report to administrators instantly. The topology of the network employed for simulations includes 36 autonomous mobile robots (Fig. 4). Each robot is covered by a subset A of its neighborhood in its coverage area, represented in the figure by a circle (the hop count represents the radius of the circle).

The traffic is sent/received by four end nodes (marked in the figure as node1, node2, node3, and node4). Each robot must relay the traffic to these four end nodes, which collect this information and transmit it to the fixed administrator's station over a wired link. We model traffic in terms of requests characterized by their source and destination. While we concern ourselves with the arrival and departure of flows, we do not model the data traffic of the flows. For simplicity, we also chose not to implement proper management of error, flow, and congestion control. In fact, each additional control component has a considerable impact on network performance, making it very difficult to evaluate and study the properties of each routing algorithm without taking into consideration the complex way it interacts with all the other control components. Therefore, we chose to test the behavior of our algorithm such that the routing component can be evaluated in isolation.

For our simulation results, we studied the performance of the algorithms under increasing traffic load, examining the evolution of the network status toward a saturation condition, and under temporary saturation conditions. For this topology, we study the performance of our QoS routing strategies according to the statistical distribution of interarrival times. For this, we generate the traffic according to a Poisson distribution.

The Poisson distribution is a discrete probability distribution which expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and are independent of the time since the last event. It is represented by random variables N that count the number of discrete occurrences (called "arrivals") that take place during a time interval of given length. The probability that there are exactly k occurrences (with k a nonnegative integer, k = 0, 1, 2, …) is:

$$P(k, \lambda) = \frac{e^{-\lambda}\, \lambda^{k}}{k!} \qquad (8)$$

where λ is a positive real number equal to the mean number of occurrences. The Poisson distribution is thus fully defined by its mean parameter λ.
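As a minimal sketch of how such traffic could be generated outside OPNET, the code below evaluates Eq. (8) and produces arrival instants of a Poisson process by drawing exponential interarrival times. The function names, the rate, and the time horizon are illustrative assumptions and do not reproduce the simulation settings of the paper.

```python
import math
import random

def poisson_pmf(k, lam):
    """Eq. (8): probability of exactly k arrivals when the mean number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_arrival_times(rate, horizon, rng):
    """Arrival instants on [0, horizon) of a Poisson process with the given rate.

    Interarrival gaps of a Poisson process are exponentially distributed
    with mean 1 / rate.
    """
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(1)
arrivals = poisson_arrival_times(rate=2.0, horizon=100.0, rng=rng)
# The number of arrivals in a window of length L is then Poisson with lam = rate * L.
```

With `rate=2.0` and a unit window, for instance, `poisson_pmf(0, 2.0)` gives the probability that no packet is generated during that window.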

The simulation results are summarized in Fig. 5. All the algorithms were implemented using the OPNET software, with the same data structures for all of them. OPNET is a modeling, scheduling, and simulation tool that allows the visualization of different types of physical network topologies. Its protocol specification language is based on a formal description of a finite state automaton.

The results in Fig. 5 are evaluated in terms of average packet end-to-end delivery time on the topology given in Fig. 4 (this value is internal to OPNET). Simulation time is represented on the x-axis. The objective of these experiments is to investigate the impact of our proposed state-dependent routing approach compared with the traditional link state approach.

As shown in Fig. 5, our proposed approach yields better results than the traditional LSPR approach. This is because, in our approach, robots take into account not only the hop count and/or the average delivery delay but also the dynamic delay criterion. They are thus able to adapt their decisions rapidly, in close concordance with changes in the network dynamics. The exploration of potentially good paths based on the two criteria also yields better performance. In this manner, our approach anticipates network congestion by continuously estimating the real delay on the way to the destination, including the waiting queue time. The proposed algorithm exhibits a 37 % performance improvement over LSPR. In fact, the mobile robot environment creates a highly dynamic network. State-dependent approaches are more suitable in these conditions and are also more adaptable to frequently changing networks.

Fig. 4 Simulation architecture of robot swarm communication network

6 Conclusion

In this paper, we have presented a routing approach for robot swarm networks that improves their performance in terms of information exchange and cooperation. The approach introduces a polynomial time approximation path navigation algorithm and constructs dynamic state-dependent navigation policies. It uses an inductive method based on a trial/error paradigm combined with swarm adaptive approaches to optimize two criteria simultaneously: cumulative path cost and end-to-end path delay. Its originality lies in the fact that our system is able to take into account the dynamics of the environment without assuming any initial model of these dynamics. It samples, estimates, and builds a model of the pertinent aspects of the environment. The approach implements a model that combines a stochastic planned prenavigation for the exploration phase with a deterministic approach for the backward phase. To show its robustness and performance, a simulation scenario was built by specifying the network topology of interest and the network traffic between robots, and the approach was compared with the classical link state optimal path routing policy.

Analysis of the obtained results shows its suitability and efficiency. These promising results encourage us to pursue investigations in this area for other kinds of routing algorithms, in order to improve QoS in terms of several criteria such as end-to-end delay and residual bandwidth.

References

1. Cao E, Spears WM (2005) Swarm robotics. Springer, Berlin

2. Dorigo M, Blum C (2005) Ant colony optimization theory: a survey. Theor Comput Sci 344(2–3):243–278

3. Tapus A, Mataric MJ (2007) Towards active learning for socially assistive robots. In: Poster paper in Neural Information Processing Systems (NIPS): workshop on robotics challenges for machine learning, Whistler, B.C., Canada

4. Koenig N, Mataric MJ (2006) Demonstration-based behavior and task learning. Working notes, AAAI Spring Symposium, Stanford, California

5. Mataric M (1997) Behavior-based control: examples from navigation, learning, and group behavior. Exp Theor Artif Intell 9:323–326

6. Parker LE, Claude Touzet C, Jung D (2000) Learning and adaptation in multi-robot teams. In: Proc. of 18th Symposium on Energy Engineering Sciences, pp 177–185

7. Parker LE (1999) A case study for life-long learning and adaptation in cooperative robot teams. In: Proc. of SPIE Sensor Fusion and Decentralized Control in Robotic Systems II, Vol. 3839, pp 92–101

8. Takamuku S, Arkin CA (2007) Multi-method learning and assimilation. Robot Auton Syst 55(8):618–627

Fig. 5 Obtained simulation results


9. Arkin RC, Lee B (2003) Adaptive multi-robot behavior via learning momentum. In: IEEE International Conference on Intelligent Robots and Systems, IROS 2003, Las Vegas, Nevada, Vol. 2, pp 2029–2036

10. Sgorbissa A, Arkin RC (2003) Local navigation strategies for a team of robots. Robotica 21:461–473

11. Pugh J, Martinoli A (2007) Parallel learning in heterogeneous multi-robot swarms. IEEE Congress on Evolutionary Computation, pp 3839–3846

12. Winfield AFT, Nembrini J (2006) Safety in numbers: fault tolerance in robot swarms. Int J Model Identif Control 1(1):30–37

13. Li W, Shen W (2011) Swarm behavior control of mobile multi-robots with wireless sensor networks. J Netw Comput Appl 34(4):1398–1407

14. Kim DM, Hwang Y, Kim S, Jin G (2011) Testbed results of an opportunistic routing for multi-robot wireless networks. Comput Comm 34(18):2174–2183

15. Kok JR, Vlassis N (2005) Using the max-plus algorithm for multi-agent decision making in coordination graphs. RoboCup 2005, Robot Soccer World Cup IX, Osaka, Japan

16. Mondada F, Pettinaro GC, Guignard A, Kwee I, Floreano D, Deneubourg JL, Nolfi S, Gambardella LM, Dorigo M (2004) SWARM-BOT: a new distributed robotic concept, autonomous robots. Swarm Robot 17(2–3):193–221

17. Poduri S, Sukhatme GS (2004) Constrained coverage for mobile sensor networks. In: IEEE Int. Conf. on Robotics and Automation, New Orleans, LA, USA

18. Rutishauser S, Correll N, Martinoli A (2009) Collaborative coverage using a swarm of networked miniature robots. Robot Auton Syst 57(5):517–525

19. Kovacs T, Pasztor A, Istenes Z (2011) A multi-robot exploration algorithm based on a static Bluetooth communication chain. Robot Auton Syst 59(7):530–542

20. Bertsekas D, Gallager R (1992) Data networks. Prentice-Hall, Upper Saddle River

21. Oida K, Sekido M (1999) An agent-based routing system for QoS guarantees. In: Proc. IEEE International Conference on Systems, Man, and Cybernetics, Oct. 12–15, pp 833–838

22. Chakrabarti S, Mishra A (2001) QoS issues in Ad Hoc wireless networks. IEEE Comm Mag 39:142–148

23. Bieszczad A, Pagurek B, White T (1998) Mobile agents for network management. IEEE Commun Surv, fourth quarter 1(1)

24. Dressler F, Akan O (2010) Bio-inspired networking: from theory to practice. IEEE Comm Mag 48(11):176–183

25. Heusse M, Snyers D, Guérin S, Kuntz P (1998) Adaptive agent-driven routing and load balancing in communication network. In: Proc. ANTS’98, 1st Int. Workshop on Ant Colony Optimization, Brussels, Belgium, October 15–16

26. Schoonderwoerd R, Holland OE, Bruten J, Rothkrantz L (1996) Ant-based load balancing in telecommunications networks. HP Labs Technical Report, HPL-96-76

27. Kassabalidis I, El-Sharkawi MA, Marks RJ, Arabshahi P, Gray AA (2001) Swarm intelligence for routing in communication networks. IEEE Globecom, San Antonio

28. Bonabeau E, Dorigo M, Théraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press, Oxford

29. Martins J, Correia S, Celestino J (2010) Ant-DYMO: a bio-inspired algorithm for MANETS. In: Proceedings of the 17th IEEE International Conference on Telecommunication (ICT), pp 748–754

30. Villalba, Canas D, Orozco A (2010) Bio-inspired routing protocol for mobile ad hoc networks. IET Communications 4(18):2187–2195

31. Correia S, Junior J, Cherkaoui O (2011) Mobility-aware ant colony optimization routing for vehicular ad hoc networks. In: Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC), pp 1125–1130

32. Kuipers F, Van Mieghem P (2005) Conditions that impact the complexity of QoS routing. IEEE/ACM Trans Network 13(4):717–730

33. Mellouk A, Hoceini S, Zeadally S (2009) Design and performance analysis of an inductive QoS routing algorithm. Comput Comm J 32(12):1371–1376

34. Mellouk A, Hoceini S, Amirat Y (2007) Adaptive quality of service based routing approaches: development of a neuro-dynamic state-dependent reinforcement learning algorithm. Int J Comm Syst 20(10):1113–1130

35. Vapnik VN, Bottou L (2004) “Stochastic learning”, advanced lectures on machine learning, LNAI 3176. Springer, Berlin

36. Tsypkin Y (1973) Foundations of the theory of learning systems. Academic, New York

37. Henig M (1983) Vector-valued dynamic programming. SIAM J Contr Optim 3:490–499
