An improved PSO-based approach with dynamic parameter tuning for cooperative multi-robot target searching in complex unknown environments


International Journal of Control. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tcon20

An improved PSO-based approach with dynamic parameter tuning for cooperative multi-robot target searching in complex unknown environments

Yifan Cai^a & Simon X. Yang^a

^a Advanced Robotics and Intelligent Systems (ARIS) Laboratory, School of Engineering, University of Guelph, Guelph, Ontario, Canada, N1G 2W1

Published online: 09 May 2013.

To cite this article: Yifan Cai & Simon X. Yang, International Journal of Control (2013): An improved PSO-based approach with dynamic parameter tuning for cooperative multi-robot target searching in complex unknown environments, International Journal of Control, DOI: 10.1080/00207179.2013.794920

To link to this article: http://dx.doi.org/10.1080/00207179.2013.794920

International Journal of Control, 2013
http://dx.doi.org/10.1080/00207179.2013.794920

An improved PSO-based approach with dynamic parameter tuning for cooperative multi-robot target searching in complex unknown environments

Yifan Cai and Simon X. Yang∗

Advanced Robotics and Intelligent Systems (ARIS) Laboratory, School of Engineering, University of Guelph, Guelph, Ontario, Canada, N1G 2W1

(Received 30 September 2012; final version received 16 March 2013)

Target searching in complex unknown environments is a challenging aspect of multi-robot cooperation. In this paper, an improved particle swarm optimisation (PSO) based approach is proposed for a team of mobile robots to cooperatively search for targets in complex unknown environments. The improved cooperation rules for a multi-robot system are applied in the potential field function, which acts as the fitness function of the PSO. The main improvements are the district-difference degree and dynamic parameter tuning. In the simulation studies, various complex situations are investigated and compared to the previous research results. The results demonstrate that the proposed approach can enable the multi-robot system to accomplish the target searching tasks in complex unknown environments.

Keywords: multi-robot cooperation; PSO; artificial potential field; target searching; complex environment

1. Introduction

Compared to single-robot work, multi-robot cooperation can significantly improve efficiency and provide better robustness and adaptability. Multi-robot cooperation for exploration has many applications in industry (Gong, Zhang, & Qi, 2012; Karimadini & Lin, 2011), such as searching for lost targets in large unknown environments (Barton, Hoelzle, Alleyne, & Johnson, 2011), and covering environments with tedious and hazardous materials (Gao & Wang, 2010). The target searching task is an optimisation problem in which the task should be completed at minimum cost. In multi-robot cooperation, the optimisation goal is reflected by the final cooperation time and the trajectory length of the robots.

For exploration tasks in unknown environments, most existing research focuses on partially unknown environments, i.e. the target locations are unknown but the obstacles are known to the robots (Ijaz & Manzoor, 2006; Kazerooni & Khorasani, 2009; Wang & Xie, 2012). In completely unknown environments, the multi-robot system knows nothing except the total number of targets, making the environment model difficult to build. Autonomous exploration is expected for effective map building of the detected areas. Another problem for some existing approaches is the lack of flexibility (Li, Yuan, & Wang, 2009; Sadowska et al., 2011). In other words, these methods are tailored to specific situations only. In addition, most of the existing research pays little attention to adaptive characteristics (Burgard, Moors, Stachniss, & Schneider, 2005). In real applications, some uncertainties may occur, e.g. some robots may unexpectedly lose power and run at a lower velocity than their partners (Cepeda-Gomez, Olgac, & Sierra, 2011; Choi, Yoo, Park, & Choi, 2010; Ghommam, Mehrjerdi, Saad, & Mnif, 2011). Such a change brings more challenges to the robustness of the scheme (Hou & Cheah, 2012; Tian, Yashiro, & Ohnishi, 2012). The proposed approach is expected to offer a strategy for handling robots that move at different velocities (Sachs, Valle, & Rajko, 2004).

∗Corresponding author. Email: [email protected].

The particle swarm optimisation (PSO) algorithm is a typical method for optimisation problems (Tang & Eberhard, 2011; Yin, Su, & Guo, 2008). In most situations, PSO can offer faster convergence than other methods, such as the genetic algorithm (GA) (Eberhart & Shi, 1998; Zhu, Liu, & Yang, 2011). Existing PSO research focuses on improving the parameter adjustment to obtain more reliable and accurate optimisation results (Tang, Li, Wang, Zhang, & Yin, 2010). For example, Hereford, Siebold, and Nichols (2007) distributed the PSO algorithm to a robotic swarm search in which robots determine their positions by triangulating from three cricket motes set up as beacons. The parameter tuning is crucial for the scheme. Pugh and Martinoli (2007) applied the PSO algorithm to model multi-robot search and quantified the effectiveness of each parameter. Robots are assumed to be able to identify the intensity of the target signal, which is inversely proportional to the square of the distance between robot and target. Doctor, Venayagamoorthy, and Gudise (2004) coordinated robots by an outer PSO, and used an inner PSO to optimise the parameters of the outer PSO. The fitness of each robot is defined as the strength of the signal from the target. In path planning, however, most existing research considers only known targets, and the fitness function is related only to the distance between the solution point and the target. In the presence of unknown targets or complex situations, such a simple evaluation cannot handle the tasks (Kiranyaz, Pulkkinen, & Gabbouj, 2008).

© 2013 Taylor & Francis

A novel potential field-based PSO algorithm was proposed for multi-robot cooperative target searching in different unknown environments (Cai & Yang, in press). That approach can handle searching tasks in various environments with different robots and targets. However, the experimental results indicated that the scheme cannot accomplish the task in complex environments, e.g. when some targets are surrounded by obstacles and the entrance area is limited. The repulsive potential in Cai and Yang (in press) sometimes prevents the robots from getting close to these targets. In this study, an improved approach is developed to resolve the problem. The flexibility and applicability of the proposed approach are tested in the simulation experiments.

The paper is organised as follows. The notations are described in Table 1. In Section 2, the improved PSO-based approach is given. The simulation experiments in various situations are presented in Section 3, including comparisons with the method in Cai and Yang (in press). Finally, the conclusion is given in Section 4.

2. The improved approach

In the PSO algorithm, the personal best pbest (pb) represents the position with the best fitness value found by an individual particle up to that moment in the process, while the global best gbest (gb) represents the position of the best particle in the entire swarm. For the classical PSO, a population of particles is randomly created initially. The algorithm evaluates particle fitness and finds both the personal and global best positions of the swarm (Xue, Tian, & Li, 2008). The solution is optimised by iterative improvements according to a given procedure (Parsopoulos & Vrahatis, 2010).

2.1. The procedure of the improved algorithm

The optimisation function for the swarm movement is defined as

$$\max\{f(X_{ji})\}, \tag{1}$$

where Xji ∈ A is the position of particle pj of robot Ri, j = 1, ..., n and i = 1, ..., m for m robots; and A is the workspace of the environment. The objective function (also called the fitness function) f(Xji) for particle pj is defined by

$$f(X_{ji}) = \begin{cases} \dfrac{1}{|X_{ji} - X_t|}, & \text{if the target positions are known}, \\[2mm] U_{ji}, & \text{if the target positions are unknown}, \end{cases} \tag{2}$$

where Xt is the target position; and Uji is the potential field value of the particle pj in unknown environments, which is defined in the proposed approach. In this study, it is assumed that initially all the target locations are completely unknown. The optimisation in PSO is achieved by the iterative updating procedure as follows:

Step 1: Randomly initialise the particles within the workspace such that there is no initial collision;

Step 2: Update the velocity and position by

$$v(k+1) = v(k) + c_1 r_1 \frac{p_b(k) - p(k)}{\Delta t} + c_2 r_2 \frac{g_b(k) - p(k)}{\Delta t}, \tag{3}$$

$$p(k+1) = p(k) + v(k+1)\,\Delta t, \tag{4}$$

where v(k) and p(k) denote the velocity and position of the particle, respectively; k is the iteration counter; r1 and r2 are random variables uniformly distributed within [0,1], which offer PSO the ability of stochastic searching; c1 and c2 are the weights, which help to balance the trade-off between exploration and exploitation; and Δt is the PSO iteration time interval;

Step 3: Update pb(k) and gb(k) by

pb(k) = p(k), if pb(k) < p(k), (5)

gb(k) = g(k), if gb(k) < g(k); (6)

Step 4: Check whether the maximum number of iterations or the minimum improvement criterion (1% during 10 iterations) is met; if not, repeat Steps 2–4 until any termination criterion is met. The process of particle position updating in the PSO algorithm is shown in Figure 1.
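Steps 1–4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `pso_maximise`, the default values, the zero initial velocities, the clipping of positions to the workspace and the small constant added to the known-target fitness of Equation (2) are all assumptions made for the example.

```python
import math
import random

def pso_maximise(fitness, bounds, n_particles=8, c1=2.0, c2=2.0, dt=1.0,
                 max_iter=100, min_improve=0.01, window=10, seed=0):
    """Maximise `fitness` over a box with the update rules of Eqs. (3)-(6)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random initial positions inside the workspace (zero velocity assumed).
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best_j = max(range(n_particles), key=lambda j: pbest_fit[j])
    gbest, gbest_fit = pbest[best_j][:], pbest_fit[best_j]
    history = [gbest_fit]
    for _ in range(max_iter):
        for j in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 2: velocity update, Eq. (3), then position update, Eq. (4).
                vel[j][d] += (c1 * r1 * (pbest[j][d] - pos[j][d]) / dt
                              + c2 * r2 * (gbest[d] - pos[j][d]) / dt)
                lo, hi = bounds[d]
                # Assumption: positions are clipped so particles stay in the workspace.
                pos[j][d] = min(max(pos[j][d] + vel[j][d] * dt, lo), hi)
            # Step 3: keep the better position as personal and global best.
            fit = fitness(pos[j])
            if fit > pbest_fit[j]:
                pbest[j], pbest_fit[j] = pos[j][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[j][:], fit
        history.append(gbest_fit)
        # Step 4: stop if the best fitness improved by less than 1% over 10 iterations.
        if len(history) > window:
            old = history[-window - 1]
            if abs(gbest_fit - old) <= min_improve * abs(old):
                break
    return gbest, gbest_fit

def known_target_fitness(p, target=(30.0, 20.0)):
    """First case of Eq. (2); the 1e-9 term only avoids division by zero."""
    return 1.0 / (math.hypot(p[0] - target[0], p[1] - target[1]) + 1e-9)

best, best_fit = pso_maximise(known_target_fitness, [(0.0, 50.0), (0.0, 50.0)])
```

The sketch uses the known-target fitness only because it is easy to evaluate; in the proposed approach the potential value Uji would take its place.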

In robotics, both exploration and exploitation serve to conduct searching tasks in the environment. However, exploration is a long-term process with risky and uncertain outcomes, while exploitation by contrast is a short-term process with immediate and relatively certain benefits (March, 1991) (Alstyne and Arbor, Tutorial: Exploration vs. exploitation, at: http://www.indigosim.com/tutorials/exploration/t0s1.htm). In other words, exploration focuses on building the global environment map, while exploitation concentrates the search task around a promising candidate solution in order to locate the result precisely (Lamberson, Exploration and exploitation, at: http://vserver.cscs.lsa.umich.edu/pjlamber/Complexity&%20Course_files/exploration_exploitation.pdf). In the PSO algorithm, the parameters c1 and c2 are the factors controlling the particle memory influence (short-term and precise) item and the swarm influence (long-term and uncertain) item, respectively. The inclusion of the random variables r1 and r2 provides the stochastic searching ability. By adjusting the values of c1 and c2, the PSO algorithm can balance the trade-off between exploration and exploitation, and thus improve the global result and convergence speed.

Table 1. Notations.

| Term | Meaning | Term | Meaning |
|---|---|---|---|
| A | The environment area | pb | The personal best |
| Aa | The explored area | PSO | Particle swarm optimisation |
| c1 and c2 | The weights for PSO updating equations | Δt | The PSO iteration time interval |
| D | The real-time average distance between robots | Ua | The attractive potential value |
| d | The nearest distance between the particle and detected obstacles | θ | The robot orientation |
| d1 | The influence scope of obstacles | Ud | The distance potential value |
| dgk | The distance from the particle goal point to the initial robot position | Uji | The total potential value of particle pj of robot Ri |
| dg(k + 1) | The distance from the particle goal point to the end robot position | Ur | The repellent potential value |
| dk(k + 1) | The distance between the robot initial and end positions | Us | The smoothness potential value |
| GA | Genetic algorithm | v | Velocity |
| gb | The global best | vmax | The maximum velocity threshold |
| HD | The dispersion degree | Xt | The target position |
| HH | The homodromous degree | XTl | The position of target Tl |
| i | The robot index | Xji | The position of particle pj of robot Ri |
| j | The particle index | α | A positive real number |
| k | The PSO iteration counter | η | A position gain coefficient |
| k1 | The counter in σ calculation | δ | A variable in HD function |
| ka1 and ka2 | The position gain coefficients | μ | A variable in HD function |
| Kmax | The total allowed number of iterations | σ | A variable in HD function |
| l | The target index | ω(k) | The inertial weight |
| lb | The local best | ωmax and ωmin | The upper and lower bounds of ω |
| m | The robot number | ωd and ωs | The weights in Uji |
| N | The target number | ωa and ωr | The weights in Ud |
| Na | The number of detected targets | ωamax and ωamin | The upper and lower bounds of ωa |
| Nu | The number of undetected targets | ωrD, ωrH and ωrDD | The density weights in Ur |
| n | The particle number | ΔωrD, ΔωrH and ΔωrDD | The increment of ωrD, ωrH and ωrDD |
| n1 | The number of possible robot moving directions | ωD1, ωD2 and ωD3 | The weights in HDD |
| n2 | The current iteration number | Δωr | The increment of ωr |

Figure 1. The position updating process of a particle in classical PSO.

In this study, an improved PSO-based algorithm is proposed for a team of mobile robots to cooperatively search for targets in complex unknown environments. An improved potential field function is the fitness function of PSO, which is used to evaluate the exploration priority of unknown areas. The flowchart of the improved PSO algorithm is shown in Figure 2. There are two termination criteria in the proposed algorithm: one is the maximum number of iterations (100 in this study), while the other is the minimum improvement criterion (the average improvement is less than 1% during the last 10 iterations). The complexity mainly depends on the particle number. The goal of the proposed approach is to find all potential targets in complex unknown areas, especially in environments where the approach in Cai and Yang (in press) cannot complete the task.

Figure 2. The flowchart of the improved PSO-based approach.

2.2. Particle population generation

In the PSO algorithm, a particle is a candidate solution, which represents one expected next position of a robot. Existing research indicates that the number of particles is highly related to the convergence of the solution (Parsopoulos & Vrahatis, 2010). Masehian and Sedighizadeh (2010) proposed a particle population generation method that depends on the sensing range of the robot, which is adopted in this study. Such a method helps to keep the number of particles between 10 and 50, which can significantly contribute to the convergence of the solution (Masehian & Sedighizadeh, 2010).

Table 2. Parameter value ranges for PSO and potential field function.

| PSO parameter | Value range | Potential field parameter | Value range |
|---|---|---|---|
| c1 | [1, 4] | ωd and ωs | [1, 2] |
| c2 | [1, 4] | ωa and ωr | [0.3, 1.5] |
| ω(k) | [0.4, 0.9] | ka1 and ka2 | [0.1, 1] |
| vmax | 50 m/s | ωrD, ωrH and ωrDD | [0.1, 2] |
| Kmax | 100 | η | [1, 1.4] |
| | | α | 2 |

In the design, a mobile robot has a limited number of sensors, and each sensed direction is defined as one particle that is a candidate solution (moving direction) for the robot. At the beginning of the task, a number of particles are generated according to the sensing directions with the maximal sensing distance. If there is an obstacle or an area boundary within the sensor detecting range, the point near the obstacle border or area boundary is defined as the initial location of a particle. An example of a generated eight-particle population for an eight-sensor robot is illustrated in Figure 3. Each robot is equipped with eight sensors with an angle of 45° between them: one in the front direction, one in the back direction and three on each side of the robot. For m robots, there are a total of 8m particles. In a real robot, ultrasonic sensors and infrared sensors are normally used. When lidars or stereo cameras are used for precise and long-distance detection of obstacles, the measurements at those eight sensor locations would be used as the sensor information of the robot.

Figure 3. An example of generated eight-particle population for a robot with eight sensors.
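The particle generation described above can be illustrated with a short sketch. It assumes a rectangular workspace with its origin at one corner and handles only the area boundary; obstacle detection is left out, and the function names `ray_limit` and `generate_particles` are hypothetical.

```python
import math

def ray_limit(origin, direction, size):
    """Distance a ray can travel along one axis before leaving [0, size]."""
    if direction > 0.0:
        return (size - origin) / direction
    if direction < 0.0:
        return origin / -direction
    return math.inf

def generate_particles(robot_xy, heading_deg, sensing_range, workspace, n_sensors=8):
    """Return one candidate position per sensor direction.

    Each particle is placed at the maximal sensing distance along its sensor
    direction, or on the area boundary if the boundary is closer.
    """
    x, y = robot_xy
    width, height = workspace
    particles = []
    for s in range(n_sensors):
        # Sensors are evenly spaced (45 degrees apart for eight sensors).
        angle = math.radians(heading_deg + s * 360.0 / n_sensors)
        dx, dy = math.cos(angle), math.sin(angle)
        reach = min(sensing_range, ray_limit(x, dx, width), ray_limit(y, dy, height))
        particles.append((x + reach * dx, y + reach * dy))
    return particles

# A robot in the middle of a 50 x 50 area with a sensing distance of three units.
centre_particles = generate_particles((25.0, 25.0), 0.0, 3.0, (50.0, 50.0))
# A robot one unit from the left boundary: the backward-facing particle is cut short.
edge_particles = generate_particles((1.0, 25.0), 0.0, 3.0, (50.0, 50.0))
```

For m robots, calling `generate_particles` once per robot yields the 8m particles mentioned in the text.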

2.3. Maximum velocity threshold and inertial coefficient

To avoid the swarm explosion (Tang et al., 2010), the parameter vmax is introduced (Parsopoulos & Vrahatis, 2010), defined as the maximum velocity threshold. It is applied to restrict the velocity updating process as

$$v(k) \le v_{\max}. \tag{7}$$

The velocity threshold can prevent particles from taking extremely large steps. In addition, the inability to control velocities can lead to oscillation around the best positions in the last phase of the optimisation procedure. To deal with it, a new parameter ω, called the inertial weight (Parsopoulos & Vrahatis, 2010), is introduced into Equations (3) and (4), resulting in a new PSO variant as

$$v(k+1) = \omega v(k) + c_1 r_1 \frac{p_b(k) - p(k)}{\Delta t} + c_2 r_2 \frac{g_b(k) - p(k)}{\Delta t}, \tag{8}$$

$$p(k+1) = p(k) + v(k+1)\,\Delta t. \tag{9}$$

Table 3. The performance comparison with regular shape obstacles.

| Approach | The improved method | The method in Cai and Yang (in press) |
|---|---|---|
| Experiment times | 10 | 10 |
| The average cooperation time T (sec) | 629 | 643 |
| The standard deviation of T | 6.81 | 12.52 |
| The average total path length d | 347 | 384 |
| The standard deviation of d | 6.40 | 14.14 |

The inertial weight is supposed to be selected so that the effect of v(k) fades during the execution of the algorithm. Thus, a value of ω that decreases with time is suggested. A common choice is to initialise ω to a value slightly greater than 1.0 to promote exploration in early optimisation stages (Parsopoulos & Vrahatis, 2010), and to decrease it linearly toward zero to eliminate oscillatory behaviours in later stages. Existing research usually takes a strictly positive lower bound on ω to prevent the previous velocity term from vanishing. In general, a linearly decreasing scheme for ω can be mathematically described as

$$\omega(k) = \omega_{\max} - (\omega_{\max} - \omega_{\min}) \frac{k}{K_{\max}}, \tag{10}$$

where ωmin and ωmax are the lower and upper bounds of ω, respectively; and Kmax is the total allowed number of iterations.
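As an illustration of Equations (7)–(10), the following sketch updates a single coordinate of one particle. The bounds 0.4 and 0.9, vmax = 50 and Kmax = 100 follow Table 2; the symmetric clamp to [−vmax, vmax] and the default c1 = c2 = 2 are assumptions of this example, since Equation (7) states only an upper bound.

```python
def inertia_weight(k, k_max, w_min=0.4, w_max=0.9):
    """Linearly decreasing inertial weight of Eq. (10)."""
    return w_max - (w_max - w_min) * k / k_max

def clamp_velocity(v, v_max):
    """Velocity threshold of Eq. (7), applied symmetrically (an assumption)."""
    return max(-v_max, min(v_max, v))

def update_particle(p, v, pb, gb, k, r1, r2,
                    k_max=100, c1=2.0, c2=2.0, dt=1.0, v_max=50.0):
    """One scalar step of the PSO variant in Eqs. (8) and (9)."""
    w = inertia_weight(k, k_max)
    v_new = w * v + c1 * r1 * (pb - p) / dt + c2 * r2 * (gb - p) / dt
    v_new = clamp_velocity(v_new, v_max)
    return p + v_new * dt, v_new

# With fixed r1 = r2 = 1 the raw velocity is 0.9 * 10 + 200 + 200 = 409,
# which the threshold reduces to 50.
p_next, v_next = update_particle(p=0.0, v=10.0, pb=100.0, gb=100.0, k=0, r1=1.0, r2=1.0)
```

In a full run, r1 and r2 would be drawn uniformly from [0, 1] at every update rather than fixed.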

2.4. The improved potential field-based fitness function

Compared to the approach in Cai and Yang (in press), the main improvement in the PSO fitness function is the application of the district-difference degree HDD. It helps the robots to explore the unknown environment evenly, especially when the percentage of detected targets is small.

In the improved scheme, if there are m robots, assume that the position of particle pj of robot Ri in the workspace is Xji = [xji, yji], j = 1, ..., 8 and i = 1, ..., m. The number of detected targets is Na, while Nu is the number of undetected targets. The variable A denotes the area of the environment, while Aa represents the already explored area. The total potential value Uji of particle pj can be described as

$$U_{ji} = \omega_d U_d + \omega_s U_s, \tag{11}$$

where Ud denotes the distance potential value while Us is the smoothness potential value; and ωd and ωs are weights. The item Ud is the potential value for path shortness, which is calculated by

$$U_d = \omega_a U_a - \omega_r U_r, \tag{12}$$

where Ua denotes the attractive potential value; Ur represents the repellent potential value; and ωa and ωr are weights in the distance potential item. In the proposed scheme, when one robot detects some targets, the multi-robot system will calculate the percentage of the uncovered targets and request the proper robots to accomplish the remaining tasks. Commonly, the percentage of the involved robots is kept larger than the percentage of the detected targets, but at least one robot is available for the remaining unknown targets. The attractive field function at position Xpj can be described as

$$U_a = \begin{cases} \dfrac{1}{2} k_{a1} \dfrac{N_u/(N_u + N_a)}{(A - A_a)/A}\, d_0, & \text{if } R_i \text{ continues exploring}, \\[3mm] \dfrac{1}{2} k_{a2} (X_{p_j} - X_{T_l})^2, & \text{if } R_i \text{ moves to targets}, \end{cases} \tag{13}$$

where ka1 and ka2 are position gain coefficients; XTl is the position of target Tl, l = 1, ..., N; d0 denotes the shortest distance from the particle position to the explored area; and the item (Xpj − XTl) is the relative distance between the particle and the target.

The corresponding repulsive field function is given as

$$U_r = \begin{cases} \omega_{rD} \dfrac{1}{H_D} + \omega_{rH} \dfrac{1}{H_H} + \omega_{rDD} \dfrac{1}{H_{DD}}, & \text{if } R_i \text{ continues exploring}, \\[3mm] 0, & \text{if } R_i \text{ moves to targets}, \end{cases} \tag{14}$$

where ωrD, ωrH and ωrDD are density weights; and the variables HD, HH and HDD are the dispersion degree, homodromous degree and district-difference degree, respectively (Jang, Sun, & Mizutani, 1996). The dispersion degree HD is applied to evaluate how close the robots are to each other. If there are m robots in an M × N area, the dispersion degree is calculated by a Gaussian function as

$$H_D = e^{-\frac{(\delta - \mu)^2}{2\sigma^2}}, \tag{15}$$

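where δ, μ and σ are calculated by

$$\delta = \frac{D}{\sqrt{M^2 + N^2}}, \tag{16}$$

$$\mu = \frac{1}{k} \sum_{k_1=1}^{k} \delta_{k_1}, \tag{17}$$

$$\sigma = \frac{1}{2}\left[\max(\delta_{k_1}) - \min(\delta_{k_1})\right], \tag{18}$$

$$D = \frac{\sum_{p=1}^{m} \sum_{q=p+1}^{m} D(p, q)}{C_m^2} = \frac{2}{m(m-1)} \sum_{p=1}^{m} \sum_{q=p+1}^{m} D(p, q), \tag{19}$$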

where D is the real-time average distance between the robots; k1 is the counter in the σ calculation, k1 = 1, ..., k; and D(p, q) is the distance between robots Rp and Rq. The homodromous degree HH is applied to help robots to keep relative directions when cooperatively exploring the unknown environments. If there are m robots and the robot directions are {θ1, θ2, ..., θn1}, where 0° ≤ θ < 360°, the homodromous degree is calculated by

$$H_H = \frac{\dfrac{2}{m(m-1)} \sum_{p=1}^{m} \sum_{q=p+1}^{m} \operatorname{abs}(\theta_p - \theta_q)}{n_1}, \tag{20}$$

where n1 is the number of possible robot moving directions. In this research, each possible direction area is regarded as a bounded area spanning a 45° angle, so there are eight possible direction areas in the simulations. The function abs() is the absolute value function.

The district-difference degree is used to judge whether all the robots stay in the same area. Especially when both Nu and Au are large, the district-difference degree is applied to provide a proper repulsive potential value to keep the robots from gathering. In an actual search task, the environment is usually divided into different parts based on the number of targets and search resources. If both the percentage of undetected targets (denoted as Nu/N) and the percentage of unexplored area (denoted as Au/A) are high, the district-difference degree helps the robots to explore separately, rather than gather too close to each other. In other words, the density of the robots in a small part of the environment is supposed to be low in this situation. For the calculation, the environment is divided into Nd parts A1, A2, ..., ANd, where Nd is a square number and Nd < N. The value of the district-difference degree can be obtained by

$$H_{DD} = \omega_{D1} \frac{A_u}{A} + \omega_{D2} \frac{N_u}{N} + \omega_{D3} \frac{\sum_{i=1}^{m} P(R_i, k_2)}{m}, \tag{21}$$

where ωD1, ωD2 and ωD3 are weights; and P(Ri, k2) is the function that judges whether the robot Ri is in the k2-th part of the environment, k2 = 1, ..., Nd, which can be obtained by

$$P(R_i, k_2) = \begin{cases} 1, & \text{if } R_i \text{ is in the part}, \\ 0, & \text{otherwise}. \end{cases} \tag{22}$$

Figure 4. Trajectory comparison with regular shape obstacles. Ri, i = 1, ..., 4, denotes robot i: (a) the improved approach; (b) the approach in Cai and Yang (in press).

Figure 5. Trajectory comparison with irregular shape obstacles: (a) the improved approach; (b) the approach in Cai and Yang (in press).
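The three degrees of Equations (15)–(22) can be evaluated as in the sketch below. It is an illustration under stated assumptions, not the authors' code: the environment is taken as a square split into a regular grid of districts, the weights in HDD default to 1 because the text gives no range for them, a zero spread σ is mapped to HD = 1, and all function names are hypothetical. At least two robots are required for the pairwise averages.

```python
import math
from itertools import combinations

def mean_pairwise(values, metric):
    """Average of metric(a, b) over all unordered pairs, as in Eq. (19)."""
    pairs = list(combinations(values, 2))
    return sum(metric(a, b) for a, b in pairs) / len(pairs)

def normalised_distance(positions, width, height):
    """delta of Eq. (16): average robot distance D over the area diagonal."""
    return mean_pairwise(positions, math.dist) / math.hypot(width, height)

def dispersion_degree(delta_history):
    """H_D of Eq. (15); delta_history holds delta for iterations 1..k, newest last."""
    delta = delta_history[-1]
    mu = sum(delta_history) / len(delta_history)             # Eq. (17)
    sigma = 0.5 * (max(delta_history) - min(delta_history))  # Eq. (18)
    if sigma == 0.0:
        return 1.0  # assumption: no spread means delta equals mu
    return math.exp(-((delta - mu) ** 2) / (2.0 * sigma ** 2))

def homodromous_degree(headings_deg, n1=8):
    """H_H of Eq. (20): mean absolute heading difference divided by n1."""
    return mean_pairwise(headings_deg, lambda a, b: abs(a - b)) / n1

def district_index(pos, size, grid):
    """0-based index of the district containing pos in a grid x grid split (assumed layout)."""
    cell = size / grid
    col = min(int(pos[0] // cell), grid - 1)
    row = min(int(pos[1] // cell), grid - 1)
    return row * grid + col

def district_difference_degree(robots, k2, unexplored_frac, undetected_frac,
                               size=50.0, grid=2, weights=(1.0, 1.0, 1.0)):
    """H_DD of Eq. (21) for district k2, with P(R_i, k2) from Eq. (22)."""
    in_part = sum(1 for r in robots if district_index(r, size, grid) == k2)
    w1, w2, w3 = weights
    return w1 * unexplored_frac + w2 * undetected_frac + w3 * in_part / len(robots)

robots = [(10.0, 10.0), (40.0, 10.0), (12.0, 5.0), (45.0, 45.0)]
h_dd = district_difference_degree(robots, k2=0, unexplored_frac=0.8, undetected_frac=0.75)
```

Here two of the four robots lie in district 0, so the last term of Equation (21) contributes 0.5 before weighting.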

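When there are obstacles in the environment, the available robots need to plan collision-free paths to complete the work. In this case, Equation (14) is updated as

$$U_r = \begin{cases} \omega_{rD} \dfrac{1}{H_D} + \omega_{rH} \dfrac{1}{H_H} + \omega_{rDD} \dfrac{1}{H_{DD}}, & \text{if } R_i \text{ continues exploring}, \\[3mm] 0, & \text{if } R_i \text{ moves to targets}, \\[1mm] \dfrac{1}{2} \eta \left(\dfrac{1}{d} - \dfrac{1}{d_1}\right) (X_{p_j} - X_{T_l})^{\alpha}, & \text{if } R_i \text{ detects obstacles}, \end{cases} \tag{23}$$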

where d is the nearest distance between the particle and the detected obstacles; d1 is the influence scope of obstacles; η is a position gain coefficient; and α is a positive real number. The relative distance between the particle and the target is added to the function, which ensures that the global minimum is only obtained at the target in the entire potential field.
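A small sketch of the first and third cases of Equation (23) is given below. The function names are hypothetical, returning zero when the obstacle lies outside the influence scope d1 is an assumption of this example, and the defaults η = 1 and α = 2 are taken from the ranges in Table 2.

```python
import math

def exploring_repulsion(h_d, h_h, h_dd, weights=(1.0, 1.0, 1.0)):
    """First case of Eq. (23): weighted sum of the reciprocal degrees."""
    w_rd, w_rh, w_rdd = weights
    return w_rd / h_d + w_rh / h_h + w_rdd / h_dd

def obstacle_repulsion(particle, target, d, d1, eta=1.0, alpha=2.0):
    """Third case of Eq. (23) for a particle at distance d > 0 from the nearest obstacle."""
    if d >= d1:
        return 0.0  # assumption: no repulsion outside the influence scope
    relative_distance = math.dist(particle, target)
    return 0.5 * eta * (1.0 / d - 1.0 / d1) * relative_distance ** alpha

# Particle 5 units from the target and 1 unit from an obstacle whose scope is 2:
# 0.5 * 1 * (1 - 0.5) * 25 = 6.25
u_obstacle = obstacle_repulsion((0.0, 0.0), (3.0, 4.0), d=1.0, d1=2.0)
u_explore = exploring_repulsion(0.5, 2.0, 4.0)
```

Because the obstacle term is multiplied by the particle-target distance, it vanishes at the target itself, which matches the remark about the global minimum above.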

As Equations (13) and (23) show, Ud = ωaUa − ωrUr, as defined in Equation (12), can be regarded as an evaluation of the path shortness for the multi-robot system. Meanwhile, for real mobile robot applications, robot paths should not have very sharp turns. Thus, path smoothness is introduced to offer an acceptable turning angle for the robots. An effective way is to calculate the angle between the lines connecting the particle goal point with the two robot positions during one iteration. Based on the Law of Cosines, the smoothness fitness function Us is defined as

$$U_s = \left[\arccos \frac{d_{gk}^2 + d_{g(k+1)}^2 - d_{k(k+1)}^2}{2 d_{gk} d_{g(k+1)}}\right]^{-1}, \tag{24}$$

where dgk and dg(k + 1) are the lengths of the two lines connecting the particle goal point with the robot positions during one iteration: one line connects to the initial robot position XRi(k) while the other connects to the end position XRi(k + 1); and dk(k + 1) is the distance between XRi(k) and XRi(k + 1).

Figure 6. Trajectory comparison at three time instances: (a) when the first pile is detected; (b) when the second pile is detected; (c) when the third pile is detected.

Figure 7. The final path in four different trials: (a) the first trial; (b) the second trial; (c) the fourth trial; (d) the ninth trial.
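The smoothness term can be computed directly from three points, as in this sketch. It applies the Law of Cosines with squared side lengths; the function name, the clamping of the cosine against rounding error and the infinite value returned for a zero angle are assumptions of the example. The goal point must differ from both robot positions.

```python
import math

def smoothness(goal, start, end):
    """U_s of Eq. (24): reciprocal of the angle at the goal point, in radians."""
    d_gk = math.dist(goal, start)    # goal point to initial robot position
    d_gk1 = math.dist(goal, end)     # goal point to end robot position
    d_kk1 = math.dist(start, end)    # initial to end robot position
    cos_angle = (d_gk ** 2 + d_gk1 ** 2 - d_kk1 ** 2) / (2.0 * d_gk * d_gk1)
    # Guard against rounding slightly outside [-1, 1] before calling acos.
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return math.inf if angle == 0.0 else 1.0 / angle

# The two lines meet at a right angle at the goal point, so U_s = 2 / pi.
u_s = smoothness(goal=(0.0, 0.0), start=(1.0, 0.0), end=(0.0, 1.0))
```

A smaller angle at the goal point gives a larger Us, so candidate positions that require gentler turns are favoured when the fitness is maximised.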

2.5. Dynamic parameter tuning

Although it is not difficult to use a potential field in path planning, problems such as local minima or oscillatory movements may appear during the simulation. Reasonable weights can avoid these problems and contribute to the cooperation. In addition, proper parameters for PSO are also essential to promote convergence. In order to achieve plausible results, acceptable and practical limits are specified for each parameter. The random variables r1 and r2 can be obtained by a random function (such as the function rand() in a C program). The weights c1 and c2 in PSO are suggested to be within [1, 4]. For the inertial weight ω(k), the desirable lower and upper bounds are 0.4 and 0.9, respectively. In the potential field function, the weights ωd and ωs are suggested to be within [1, 2], while ωa and ωr are preferred within [0.3, 1.5]. The position gain coefficients ka1 and ka2 in Ua are suggested to be within [0.1, 1]. In Equation (23), the position gain coefficient η is set within [1, 1.4] while the parameter α can be set to 2. All the value ranges are listed in Table 2.

Figure 8. Trajectory comparison with complex obstacles: (a) the improved approach; (b) the approach in Cai and Yang (in press).

The experimental results in Cai and Yang (in press) indicate that a fixed weight value for the potential field may encounter an oscillation problem when the targets are located within an area with irregular shape obstacles. In the previous design, a value of ωa that increases with the percentage of explored targets is defined as

$$\omega_a = \omega_{a\min} + (\omega_{a\max} - \omega_{a\min}) \frac{N_a}{N_a + N_u}, \tag{25}$$

where ωamax and ωamin are the upper and lower bounds of ωa, respectively. Such a linearly increasing weight for the attractive potential value is more feasible and applicable than a fixed one. However, the data from the simulation experiments show that the repulsive potential field value is essential for the fitness function and highly related to the exploration efficiency. In addition, a fixed weight value cannot utilise any feedback from the output, so adaptability is hard to achieve. In the improved scheme, the parameter tuning for the repulsive potential weight is highly related to the explored area percentage Aa/A, and the goal is to increase the percentage in a reasonable range. The improved weight tuning expression for the repulsive potential field Ur is defined as

$$\omega_r(k+1) = \Delta\omega_r(n_2) + \frac{\sum_{k=1}^{n_2} \omega_r(k)}{n_2}, \tag{26}$$

where n2 is the current iteration number; and the variable Δωr(k) is obtained by

$$\Delta\omega_r(k) = \begin{cases} 0.02\,\omega_r(k), & \text{if } \Delta(A_a/A) \text{ increases}, \\ -0.02\,\omega_r(k), & \text{otherwise}. \end{cases} \tag{27}$$

Table 4. The performance comparison in the environment with irregular shape obstacles.

| Approach | The improved method | The method in Cai and Yang (in press) |
|---|---|---|
| Experiment times | 10 | 10 |
| The average cooperation time T (sec) | 302 | 357 |
| The standard deviation of T | 4.69 | 14.61 |
| The average total path length d | 214 | 266 |
| The standard deviation of d | 8.27 | 14.36 |
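The weight schedules of Equations (25)–(27) can be sketched as follows. The example reads Equation (26) as the increment plus the average of the weight history; the function names, the list-based history and the default bounds (taken from the range of ωa in Table 2) are assumptions of this illustration.

```python
def attractive_weight(n_detected, n_undetected, w_min=0.3, w_max=1.5):
    """Linearly increasing attractive weight of Eq. (25)."""
    return w_min + (w_max - w_min) * n_detected / (n_detected + n_undetected)

def next_repulsive_weight(history, coverage_gain_increased, rate=0.02):
    """omega_r(k+1) from Eqs. (26) and (27).

    history holds omega_r(1), ..., omega_r(n2), newest last;
    coverage_gain_increased states whether Delta(A_a / A) increased.
    """
    n2 = len(history)
    step = rate * history[-1]
    delta = step if coverage_gain_increased else -step  # Eq. (27)
    return delta + sum(history) / n2                    # Eq. (26)

# One of four targets detected: 0.3 + 1.2 * 0.25 = 0.6
w_a = attractive_weight(n_detected=1, n_undetected=3)

# Feed the repulsive weight back over three iterations.
weights = [1.0]
for gain_increased in (True, True, False):
    weights.append(next_repulsive_weight(weights, gain_increased))
```

Calling the same function with `rate=0.01` gives the update used for the three density weights in the repulsive potential.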

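Table 5. The performance comparison with complex obstacles.

| Approach | The improved method | The method in Cai and Yang (in press) |
|---|---|---|
| Experiment times | 10 | Failed |
| The average cooperation time T (sec) | 473 | |
| The standard deviation of T | 21.36 | |
| The average total path length d | 288 | |
| The standard deviation of d | 13.47 | |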

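The above design of parameter tuning enables the weights to adjust their values based on the feedback of the performance. This improvement gives the scheme its adaptability. Furthermore, the three weights in the repulsive potential field are similarly tuned by

$$\omega_{rD}(k+1) = \Delta\omega_{rD}(n_2) + \frac{\sum_{k=1}^{n_2} \omega_{rD}(k)}{n_2}, \tag{28}$$

$$\omega_{rH}(k+1) = \Delta\omega_{rH}(n_2) + \frac{\sum_{k=1}^{n_2} \omega_{rH}(k)}{n_2}, \tag{29}$$

$$\omega_{rDD}(k+1) = \Delta\omega_{rDD}(n_2) + \frac{\sum_{k=1}^{n_2} \omega_{rDD}(k)}{n_2}, \tag{30}$$

where the variables ΔωrD(k), ΔωrH(k) and ΔωrDD(k) are calculated by

$$\Delta\omega_{rD}(k) = \begin{cases} 0.01\,\omega_{rD}(k), & \text{if } \Delta(A_a/A) \text{ increases}, \\ -0.01\,\omega_{rD}(k), & \text{otherwise}, \end{cases} \tag{31}$$

$$\Delta\omega_{rH}(k) = \begin{cases} 0.01\,\omega_{rH}(k), & \text{if } \Delta(A_a/A) \text{ increases}, \\ -0.01\,\omega_{rH}(k), & \text{otherwise}, \end{cases} \tag{32}$$

$$\Delta\omega_{rDD}(k) = \begin{cases} 0.01\,\omega_{rDD}(k), & \text{if } \Delta(A_a/A) \text{ increases}, \\ -0.01\,\omega_{rDD}(k), & \text{otherwise}. \end{cases} \tag{33}$$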

3. Simulation studies

Experiments are conducted in several situations with different obstacles. The environment space is 50 × 50 in all the cases. In the simulations, the robots know nothing about the environment except the total number of targets. The sensing distance of the robots is three units. It is assumed that during exploration, when a robot reaches a pile of clustered targets, it stops there, whereas it continues exploring after reaching scattered targets.

Figure 9. The final trajectories of the robots where R4 moves to the upper area at the beginning.

3.1. Cooperation with regular shape obstacles

Most of the experiments conducted in Cai and Yang (in press) are without obstacles. In real applications, obstacles may be detected during the process. In this case, the robots need to build a map of the environment and plan collision-free paths for future exploration. This scenario focuses on an environment with regular shape obstacles, as shown in Figure 4. There are four piles of clustered targets in the environment, and their locations are random. During exploration with the improved method (see Figure 4(a)), R3 detects the pile in the bottom left corner and then R4 finds the pile in the middle left part. The large pile in the top left part is detected by R2. By contrast, with the approach in Cai and Yang (in press), R3 moves to the left top part and detects the large pile first. R4 detects all the targets in the left bottom area and is the only available robot for that part, because R2 is requested to help R3. The data comparison is presented in Table 3. The improved method provides better work efficiency and reduces the energy consumption of the multi-robot system.

3.2. Cooperation with irregular shape obstacles

Compared to regular shape obstacles, irregular shape obstacles pose more challenges to the robots: a robot may oscillate near an obstacle without being able to reach the targets. In the scenario of Figure 5, the target locations are similar to those in Figure 4. During exploration, R1 first detects the pile in the top right corner and stops there. The next robot to detect targets is R3, and R4 later detects the pile in the left bottom corner. Finally, R2 detects the pile in the top left part. In this scenario, the computation time is the same as the cooperation time, since no individual robot is requested to help its partner.

The comparison between the proposed approach and the method in Cai and Yang (in press) is presented in Figure 5. As indicated in Figure 5(b), the approach in Cai and Yang (in press) is able to accomplish the task; however, the improved method increases the work efficiency. The computation time is reduced, and the final robot trajectory length with the improved approach is shorter.

The order of target detection differs between the two approaches. For the approach in Cai and Yang (in press), R3 is the second robot to detect a target, and the control centre then requests R2 to help it, based on the percentage of explored targets. R4 continues exploring for the last four targets. It first detects the pile in the left middle part and then continues to work, since it is the only robot available to continue the task. The order differs because the repulsive potential field in Cai and Yang (in press) is unable to command R4 to pass through the narrow corridor between the middle obstacle and the obstacle just below it. As a consequence, both the cooperation time and the total path length increase. Three time instances are presented in Figure 6, each being an instance at which one of the piles is reached.

The data comparison is listed in Table 4. For this scenario, the experimental data are collected from 10 trials. The final paths in the first, second, fourth and ninth trials are presented in Figure 7. The main trend is similar in all the trials, with small differences between the robot trajectories, especially the trajectory of R1. Those differences are caused by the random values of the parameters r1 and r2 in Equation (3).
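The influence of r1 and r2 on the trial-to-trial differences can be illustrated with a short Python sketch. It uses the textbook PSO velocity update, which may differ from the exact form of Equation (3); the coefficient values, the fixed best positions, and the names `pso_step` and `run_trial` are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Vec = Tuple[float, float]


def pso_step(x: Vec, v: Vec, pbest: Vec, gbest: Vec, rng: random.Random,
             w: float = 0.7, c1: float = 1.5, c2: float = 1.5) -> Tuple[Vec, Vec]:
    """One textbook PSO update; r1 and r2 are drawn uniformly from [0, 1)."""
    r1, r2 = rng.random(), rng.random()
    new_v = tuple(
        w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    )
    new_x = tuple(xi + vi for xi, vi in zip(x, new_v))
    return new_x, new_v


def run_trial(seed: int, steps: int = 5) -> List[Vec]:
    """Trajectory of a single robot for a given random seed."""
    rng = random.Random(seed)
    x, v = (0.0, 0.0), (0.0, 0.0)
    # Hypothetical personal and global best positions, held fixed here.
    pbest, gbest = (10.0, 5.0), (20.0, 20.0)
    path = [x]
    for _ in range(steps):
        x, v = pso_step(x, v, pbest, gbest, rng)
        path.append(x)
    return path


if __name__ == "__main__":
    for seed in (1, 2):
        print(seed, [(round(px, 2), round(py, 2)) for px, py in run_trial(seed)])
```

With identical starting conditions, only the seed changes between the two runs, so any difference in the printed paths comes from the random draws of r1 and r2.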

3.3. Cooperation with complex obstacles

In exploration, the environment complexity increases when some targets are surrounded by obstacles and the entrance is very limited. A typical scenario is presented in Figure 8, where two targets are surrounded by obstacles in the left bottom corner. In this situation, the approach in Cai and Yang (in press) cannot accomplish the task, because R4 oscillates owing to the repulsive potential value (see Figure 8(b)). By contrast, the improved approach in this study can handle the local ‘corridor’ problem. In the experiment, the final results may differ considerably depending on the robot's head-on direction when R4 detects the obstacle in the left bottom corner. The scenario in Figure 9 differs from that in Figure 8(a): R4 first moves towards the upper part and finally comes down to the left bottom part. Among the 10 random experiment trials, three produce results similar to Figure 9. The data from all 10 trials are listed in Table 5.

4. Conclusion

An improved PSO-based algorithm is developed for multi-robot cooperation by introducing the district-difference degree and dynamic weight tuning. In the proposed approach, the designed cooperation rules help the potential field to evaluate the unexplored areas, and the improved PSO algorithm commands the multi-robot system to accomplish the task in complex environments. In the simulation studies, scenarios with regular shape obstacles, irregular shape obstacles and complex obstacles are considered. The results demonstrate that the proposed approach is effective, applicable and flexible for multi-robot cooperation in target searching in complex unknown environments. However, in the dynamic parameter tuning, the coefficient values are obtained by trial and error during the simulation experiments. In future work, heuristic self-tuning coefficients are expected to be applied. In addition, the target searching task can be extended to the multi-robot foraging task, which includes both the searching and the transportation steps. Furthermore, other issues could be considered to broaden the scope of this work, such as vehicle kinematics and communication between the robots.

Acknowledgements

This work was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

References

Barton, K.L., Hoelzle, D.J., Alleyne, A.G., & Johnson, A.J.W. (2011). Cross-coupled iterative learning control of systems with dissimilar dynamics: Design and implementation. International Journal of Control, 84, 1223–1233.

Burgard, W., Moors, M., Stachniss, C., & Schneider, F. (2005). Coordinated multi-robot exploration. IEEE Transactions on Robotics, 21, 376–386.

Cai, Y., & Yang, S.X. (in press). A novel potential field-based PSO approach to multi-robot cooperation for target searching in completely unknown environments. International Journal of Robotics and Automation.

Cepeda-Gomez, R., Olgac, N., & Sierra, D.A. (2011). Application of sliding mode control to swarms under conflict. IET Control Theory & Applications, 5, 1167–1175.

Choi, K., Yoo, S.J., Park, J.B., & Choi, Y.H. (2010). Adaptive formation control in absence of leader’s velocity information. IET Control Theory & Applications, 4, 521–528.

Doctor, S., Venayagamoorthy, G.K., & Gudise, V.G. (2004). Optimal PSO for collective robotic search applications. In Proceedings of the 2004 Congress on Evolutionary Computation (pp. 1390–1395). Piscataway, NJ: IEEE.

Eberhart, R.C., & Shi, Y. (1998). Comparison between genetic algorithms and particle swarm optimization. In Proceedings of the 7th International Conference on Evolutionary Programming VII (pp. 611–616). Berlin: Springer-Verlag.

Gao, Y., & Wang, L. (2010). Asynchronous consensus of continuous-time multi-agent systems with intermittent measurements. International Journal of Control, 83, 552–562.

Ghommam, J., Mehrjerdi, H., Saad, M., & Mnif, F. (2011). Adaptive coordinated path following control of non-holonomic mobile robots with quantised communication. IET Control Theory & Applications, 5, 1990–2004.

Gong, D.W., Zhang, Y., & Qi, C.L. (2012). Localising odour source using multi-robot and anemotaxis-based particle swarm optimisation. IET Control Theory & Applications, 6, 1661–1670.

Hereford, J.M., Siebold, M., & Nichols, S. (2007). Using the particle swarm optimization algorithm for robotic search applications. In Proceedings of the 2007 IEEE Swarm Intelligence Symposium (pp. 53–59). Piscataway, NJ: IEEE.

Hou, S.P., & Cheah, C.C. (2012). Dynamic compound shape control of robot swarm. IET Control Theory & Applications, 6, 454–460.

Ijaz, K., & Manzoor, U. (2006). Using vision and coordination to find unknown target in fixed and random length obstacles. WSEAS Transactions on Computers, 5, 2400–2405.

Jang, J.R., Sun, C., & Mizutani, E. (1996). Neuro-fuzzy and soft computing: A computational approach to learning and machine intelligence. New York: Prentice Hall.

Karimadini, M., & Lin, H. (2011). Fault-tolerant cooperative tasking for multi-agent systems. International Journal of Control, 84, 2092–2107.

Kazerooni, E.S., & Khorasani, K. (2009). An optimal cooperation in a team of agents subject to partial information. International Journal of Control, 82, 571–583.

Kiranyaz, S., Pulkkinen, J., & Gabbouj, M. (2008). Multi-dimensional particle swarm optimization for dynamic environments. In Proceedings of the 2008 International Conference on Innovations in Information Technology (pp. 34–38). Piscataway, NJ: Institute of Electrical and Electronic Engineering Computer Society.

Li, T.J., Yuan, G.W., & Wang, F.J. (2009). Behavior control of multiple robots exploring unknown environment. In Proceedings of the 4th IEEE Conference on Industrial Electronics and Applications (pp. 1877–1882). Piscataway, NJ: IEEE.

March, J.G. (1991). Exploration and exploitation in organizational learning. Organizational Science, 2, 71–87.

Masehian, E., & Sedighizadeh, D. (2010). A multi-objective PSO-based algorithm for robot path planning. In Proceedings of the 2010 IEEE International Conference on Industrial Technology (pp. 465–470). Piscataway, NJ: IEEE.

Parsopoulos, K.E., & Vrahatis, M.N. (2010). Particle swarm optimization and intelligence: Advances and applications. Hershey, PA: Information Science Reference.

Pugh, J., & Martinoli, A. (2007). Inspiring and modeling multi-robot search with particle swarm optimization. In Proceedings of the 2007 IEEE Swarm Intelligence Symposium (pp. 332–339). Piscataway, NJ: IEEE.

Sachs, S., Valle, S.M.L., & Rajko, S. (2004). Visibility-based pursuit-evasion in an unknown planar environment. International Journal of Robotics Research, 23, 3–26.

Sadowska, A., Broek, T., Huijberts, H., Wouw, N., Kosti, D., & Nijmeijer, H. (2011). A virtual structure approach to formation control of unicycle mobile robots using mutual coupling. International Journal of Control, 84, 1886–1902.

Tang, Q., & Eberhard, P. (2011). Cooperative motion of swarm mobile robots based on particle swarm optimization and multi-body system dynamics. Mechanics Based Design of Structures and Machines, 39, 179–193.

Tang, Y., Li, Q., Wang, L., Zhang, C., & Yin, Y. (2010). An improved PSO for path planning of mobile robots and its parameters discussion. In Proceedings of the 2010 International Conference on Intelligent Control and Information Processing (pp. 34–38). Piscataway, NJ: IEEE.

Tian, D., Yashiro, D., & Ohnishi, K. (2012). Haptic transmission by weighting control under time-varying communication delay. IET Control Theory & Applications, 6, 420–429.

Wang, S., & Xie, D. (2012). Consensus of second-order multi-agent systems via sampled control: Undirected fixed topology case. IET Control Theory & Applications, 6, 893–899.

Xue, Y., Tian, G., & Li, G. (2008). Global path planning for mobile robot based on improved particle swarm optimization. Journal of Huazhong University of Science and Technology (Natural Science Edition), 36, 167–170.

Yin, Y., Su, L., & Guo, C. (2008). A policy of conflict negotiation based on fuzzy matter element particle swarm optimization in distributed collaborative creative design. Computer Aided Design, 40, 1009–1014.

Zhu, D., Liu, J., & Yang, S.X. (2011). Particle swarm optimization approach to thruster fault-tolerant control of unmanned underwater vehicles. International Journal of Robotics and Automation, 26, 282–287.
