9
©2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version of this article can be found at https://doi.org/10.1109/ICRA.2018.8460981 arXiv:1607.04358v3 [cs.MA] 14 Mar 2020

Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

©2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, includingreprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists,or reuse of any copyrighted component of this work in other works.The published version of this article can be found at https://doi.org/10.1109/ICRA.2018.8460981

arX

iv:1

607.

0435

8v3

[cs

.MA

] 1

4 M

ar 2

020

Page 2: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

Modelling Resource Contention in Multi-Robot Task AllocationProblems with Uncertain Timing

Andrew W. Palmer, Andrew J. Hill, and Steven J. Scheding1

Abstract— This paper proposes an analytical framework formodelling resource contention in multi-robot systems, where thetravel times and task durations are uncertain. It uses severalapproximation methods to quickly and accurately calculate theprobability distributions describing the times at which the tasksstart and finish. Specific contributions include exact and fastapproximation methods for calculating the probability of a setof independent normally distributed random events occurringin a given order, a method for calculating the most likely andn-th most likely orders of occurrence for a set of independentnormally distributed random events that have equal standarddeviations, and a method for approximating the conditionalprobability distributions of the events given a specific order ofthe events. The complete framework is shown to be faster than aMonte Carlo approach for the same accuracy in two multi-robottask allocation problems. In addition, the importance of incor-porating uncertainty is demonstrated through a comparisonwith a deterministic method. This is a general framework thatis agnostic to the optimisation method and objective functionused, and is applicable to a wide range of problems.

I. INTRODUCTION

Multi-Robot Task Allocation (MRTA) problems arise inmany scenarios, and involve assigning a set of robots to a setof tasks. Much of the existing solution methods assume de-terministic robot dynamics, and ignore the effects of resourcecontention. This paper explicitly models uncertainty in thetravel times and task durations of the robots, and the effectsthat result from multiple robots using mutually exclusiveresources. An analytical framework for calculating the costof a set of task assignments is developed, and it is shown tooutperform deterministic and Monte Carlo approaches.

This work is motivated by multi-robot scenarios wherethere is a shared resource that cannot be used by all ofthe robots at once, such as an intersection or a rechargingpoint, as well as situations where the robots cannot performtheir tasks in parallel and must wait for previous robots tofinish before commencing their task, such as construction andmaintenance tasks. The techniques developed in this paperfor modelling these effects are applicable to a wide range ofrobotics scenarios such as multi-robot path planning [1] andplanning for recharging robots [2], as well as other scenariossuch as machine shop scheduling [3]. While there is literatureon MRTA problems that incorporate either uncertainty [4],[5], [6], [7] or resource contention [8], [9], to the best ofthe authors’ knowledge this is the first work to combine

This work was supported by the Rio Tinto Centre for Mine Automa-tion and the Australian Centre for Field Robotics, University of Sydney,Australia.

1 The authors were with the Australian Centre for Field Robotics,University of Sydney, Australia, email: [email protected],[email protected], [email protected].

the two. The framework developed in this paper analyticallycalculates the probability distributions describing the times atwhich the tasks are started and completed. This frameworkis independent of the choice of optimisation method used,facilitates the use of any objective function, and can also beused in conjunction with chance constraints. In addition tothe framework, specific contributions of this paper include:• exact and fast approximation methods for calculating

the probability of a set of independent normally dis-tributed random events occurring in a given order;

• a method for calculating the most likely and n-th mostlikely orderings of independent normally distributedrandom events, when the standard deviations of theoccurrence times for each event are equal; and

• a normal approximation to the conditional probabilitydistribution describing a random event given a specificorder of events.

In the following sections, Section II presents an overviewof related literature, and Section III develops the analyticalframework. Section IV then evaluates the utility of the frame-work in two simulation examples, and Section V concludesthe paper with suggestions for future research.

II. RELATED LITERATURE AND BACKGROUND

MRTA problems have been extensively studied in theliterature—a recent review of the state-of-the-art solutionmethods for MRTA problems is presented in [10]. Theauthors note that solving MRTA problems with complex con-straints, including uncertainty and resource contention, is stillan open question. Uncertainty has been considered by severalpapers, but has been dealt with by each in different ways. Theinterval Hungarian algorithm was developed in [4] to dealwith problems that have uncertainty in the utility estimateof a given assignment. This method relies on knowing theProbability Density Function (PDF) describing the utility.The interval Hungarian algorithm can be applied to problemswith resource contention, using the framework developedin this paper to calculate the PDFs of the utility. Taskallocation in teams consisting of both robots and humanswas considered by [5], where humans have the option ofrejecting a task assignment. They developed a replanningalgorithm using a multi-agent Markov decision process thatincorporated the probability that a human will reject the task.Finally, sensitivity analysis approaches were used in [6], [7]to quantify when a task assignment should be recomputedin response to changes in the environment. The effect ofresource contention was not included in any of the abovepapers.

Page 3: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

Resource contention was considered in [8], where a teamof robots operating in an office building frequently encoun-tered areas where only one robot could operate at a time, suchas doorways and cluttered corridors. The authors introduceda decentralised method that used aggression signalling toresolve interactions during task execution. A method forcalculating the optimal task assignment in scenarios withresource contention was developed in [9]. The costs asso-ciated with the resource contention were modelled using apenalisation function, and they showed that the problem isNP-hard when the penalisation function is polynomial-timecomputable. Their approach used Murty’s ranking algorithmto find next best assignments when ignoring contention costs,and then evaluated these assignments with the contentioncosts included. The above approach could be used in con-junction with the framework developed in this paper forscenarios where uncertainty is also considered.

The proposed framework developed in this paper usesa number of approximations for performing operations onnormal distributions. Two existing approximations that areused in the framework are introduced in the followingsubsections.

A. Maximum of normally distributed random variables

This subsection presents an approximation for the meanand variance of the maximum of two independent normallydistributed variables, originally presented in [11]. ConsiderX ∼ N (µX , σ

2X) and Y ∼ N (µY , σ

2Y ). Let

α =√σ2X + σ2

Y , β =µX − µY

α. (1)

Using the following notation:

φ(x) =exp(−x2/2)√

(2π), (2)

Φ(x) =

∫ x

−∞φ(t)dt =

1

2

(1 + erf

(x√2

)), (3)

where erf(.) is the error function, defined as

erf(t) =2√π

t∫0

exp(−τ2

)dτ, (4)

max(X,Y ) is approximated by a normal distribution, Z ∼N (µZ , σ

2Z), where

µZ = µXΦ(β) + µY Φ(−β) + αφ(β), (5)

σ2Z = (µ2

X + σ2X)Φ(β) + (µ2

Y + σ2Y )Φ(−β)

+ (µX + µY )αφ(β)− µ2Z . (6)

For more than two variables, the author suggests recursivelyapplying the above approximation to pairs of variables.

B. Conditioning normally distributed random variables

This subsection summarises a method presented in [12] forcalculating the mean and variance of a normally distributedrandom variable, B, that is conditioned on other independentnormally distributed random variables, A and C, to satisfythe inequality A < B < C. This method can be used tocalculate (B|B < C) and (B|A < B) by using µA = −∞and µC = ∞ respectively. The mean and variance of theconditional probability distribution are denoted as µ̂B and σ̂2

B

respectively. First, the random variables are transformed suchthat B is described by a standard normal distribution. Thisyields transformations of A and C to D and E respectively:

D ∼ N (µD, σ2D), E ∼ N (µE , σ

2E), (7)

where:

µD =µA − µBσB

, σ2D =

σ2A

σ2B

, (8)

µE =µC − µBσB

, σ2E =

σ2C

σ2B

. (9)

Then, the mean of the conditional probability distribution isgiven by

µN = 2α

(1√

σ2D + 1

exp

(− µ2

D

2(σ2D + 1)

)

− 1√σ2E + 1

exp

(− µ2

E

2(σ2E + 1)

)), (10)

where

α =1

√2π

[erf

(µE√

2(σ2E+1)

)− erf

(µD√

2(σ2D+1)

)] , (11)

and the variance is given by

σ2N = α

[√

((1 + µ2

N

)(erf

(µE√

2(σ2E + 1)

)

− erf

(µD√

2(σ2D + 1)

)))

+2√

σ2D + 1

(µD

σ2D + 1

− 2µN

)exp

(− µ2

D

2(σ2D + 1)

)− 2√

σ2E + 1

(µE

σ2E + 1

− 2µN

)exp

(− µ2

E

2(σ2E + 1)

)].

(12)The new mean and standard deviation are then transformedback to the original reference frame to give µ̂B and σ̂2

B :

µ̂B = µNσB + µB , σ̂2B = σ2

Nσ2B . (13)

The approach used to derive the above method is reliant onthe condition that µA < µC , and the assumption that theprobability distributions describing A and C have limitedoverlap.

Page 4: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

III. FRAMEWORK

This section develops the analytical framework for mod-elling resource contention when uncertainty is considered.Two cases are examined—in the first, the order in which therobots must use the resource is specified, while in the second,the order is simply the order in which they arrive. The robotshave a time that they arrive at and begin queuing for theresource, T aX , and a duration for using the resource, DX ,that are independent normally distributed random variables,where X identifies the robot.

For the first case, consider a simple example of two robots,A and B, that both need to perform an action at the samelocation. Only one robot can perform their action at thelocation at a time, so the second robot may have to waitfor the first robot to complete its action before commencingits own action. If robot A must perform its action before B,then the time that robot A starts its action, T sA, is simplythe time that it arrives at the location, T aA. The time that itfinishes the action at, T fA, is given by:

T fA = T sA +DA, (14)

where the mean and variance of T fA are the sum of the meansand variances respectively of T sA and DA. Since robot B canonly commence its action after A has finished, the time thatrobot B starts its action, T sB is given by:

T sB = max(T fA, TaB), (15)

where the max is calculated using the approximation de-scribed in Section II-A. The time that robot B completes itsaction, T fB , is then given by:

T fB = T sB +DB . (16)

For 3 or more robots, the start and finish times of their actionsare calculated by iteratively applying Eqs. (15) and (16) withrespect to the previous robot to perform its task.

The second case, where the order in which the robots usethe resource is not specified but is instead determined bywhen the robot begins queuing for the resource, exhibits aFirst-In First-Out (FIFO) property. Consider a similar multi-robot scenario to the first case where two robots, A and B,travel to a location and perform a task. In contrast to thefirst case, the first robot to arrive at the location is the firstto perform its task. For two robots, there are two possibleorders in which the tasks can be performed—A followed byB, and B followed by A. To calculate the completion timefor each robot, both orders must be considered. Consider thecase where A arrives before B. The time that A starts its taskis first conditioned on the order in which the robots arrive:

(T sA|T aA < T aB) = (T aA|T aA < T aB). (17)

The time that A completes its task is given by:

(T fA|TaA < T aB) = (T sA|T aA < T aB) +DA. (18)

The time that B starts its task is then calculated as:

(T sB |T aA < T aB) = max(

(T fA|TaA < T aB), (T aB |T aA < T aB)

),

(19)

and the time that B completes its task is given by:

(T fB |TaA < T aB) = (T sB |T aA < T aB) +DB . (20)

The completion time for each robot, considering all ordersof arrival, is calculated by summing the probability weightedcompletion times for each order of arrival. For robot A:

T fA = P (T aA < T aB)(T fA|TaA < T aB)

+ P (T aB < T aA)(T fA|TaB < T aA), (21)

where P (T aX < T aY ) is the probability that X arrives beforeY . This process extends to n robots, with the downside thatthere are n! orders of arrival that have to be considered.

The following subsections present methods for calculatingthe arrival times conditioned on the order of arrival (SectionIII-A) and the probability of an order of arrival (Section III-B). A method for reducing the number of orders of arrivalthat are considered based on Mutry’s ranking algorithm isdeveloped in Section III-C.

A. Conditioning the arrival times on the order of arrival

A method for calculating the mean and variance of therandom variable T aB conditioned on the arrival order ABC,(T aB |T aA < T aB < T aC), was presented in Section II-B.

1) Extension to general conditions: For cases where theconditions listed in Section II-B are not met, the method is apoor approximation of the mean and variance. To apply thisapproach in such situations, it is proposed that T aA and T aCbe applied one at a time. Applying these conditions one at atime involves using the procedure from Eq. (7) to Eq. (13)with one of the distributions set to either −∞ or ∞, andthen using the result of that procedure as the distribution forT aB when applying the remaining condition.

The order in which the conditions are applied can impactthe resultant distribution. In order to determine the order inwhich the conditions should be applied, a decision tree waslearnt using the scikit-learn [13] module for Python. Thethree choices of method are:

1) apply T aA and T aC together;2) apply T aA followed by T aC ; and3) apply T aC followed by T aA.The decision tree to determine which method to use is

presented in Figure 1. The parameters γ and δ are an overlapmetric and shape metric respectively, as defined in [12]:

γ =µC − µAσC + σA

, δ =

∣∣∣∣log

(σAσC

)∣∣∣∣ . (22)

The decision tree was trained on over 200,000 differentcombinations of values for T aA and T aC , and a separate setof 10,000 combinations of randomly sampled values wasused for evaluation. Kullback-Leibler (KL) divergence wasused as a measure of error between the method selectedby the decision tree and the best method. The best methodwas selected in 84.8% of cases. For all cases, the aver-age and Root Mean Squared (RMS) KL divergences were3.1E−3 and 9.4E−2 respectively, and for cases where theincorrect decision was made they were 2.0E−2 and 2.4E−1

Page 5: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

𝛾 > 1.3

𝜇𝐷 > −𝜇𝐸

𝜎𝐷 > 𝜎𝐸 or δ < 0.316 𝜎𝐸 > 𝜎𝐷 or δ < 0.316

Method 1

Method 2 Method 3 Method 3 Method 2

True False

Fig. 1: Decision tree for determining the order in which toapply the conditions. Note that the means and standard devi-ations used here are calculated from the original distributionsusing Eq. (8) and (9). The first decision checks if the distribu-tions overlap enough to require the conditions to be appliedindividually, while the second and third decisions comparethe relative positions and shapes of the two distributions todetermine which should be applied first.

respectively. The majority of the error was accrued in afew cases where the distributions T aA and T aC have a verylow probability of satisfying T aA < T aC , resulting in a veryhigh KL divergence (> 1). Thus, even though this is a poorapproximation in a small number of cases, it will have anegligible impact on the result when it is multiplied by theprobability of that order occurring.

2) Iteratively applying conditions: So far, only the secondrobot to arrive in a group of three robots has been consid-ered. Analytically calculating the mean and variance for theconditional arrival time of the other robots is a challengingproblem. Instead, it is proposed that the conditions beapplied iteratively, thus enabling any number of robots tobe considered. For example, the conditional arrival time ofthe first robot, (T aA|T aA < T aB < T aC), can be calculated as:

(T aA|T aA < T aB < T aC) = (T aA|T aA < (T aB |T aB < T aC)). (23)

B. Calculating the probability of an order of arrival

First, consider two robots with arrival times T aA and T aB .The probability that A arrives before B is given by:

P (T aA < T aB) =

0∫−∞

pTaA−Ta

B(t)dt, (24)

where t is time, and pX(t) denotes the PDF of the randomvariable X . If T aA ∼ N (µA, σ

2A) and T aB ∼ N (µB , σ

2B),

then:

P (T aA < T aB) =1

2

(1 + erf

(µB − µA√2 (σ2

A + σ2B)

)). (25)

Extending to three robots, A, B, and C, the probabilitythat A arrives before B, and that B arrives before C, is:

P ((T aA < T aB) ∩ (T aB < T aC))

= P (T aA < T aB)× P (T aB < T aC |T aA < T aB). (26)

It is possible to formulate this as a multivariate normal dis-tribution through a linear transformation. Let X = T aA− T aBand Y = T aB − T aC , then:

MMM =

[XY

]= STSTST =

[1 −1 00 1 −1

]T aAT aBT aC

, (27)

µµµ =

µAµBµC

, ΣΣΣ =

σ2A 0 00 σ2

B 00 0 σ2

C

. (28)

The multivariate normal distribution, MMM , then has mean, µµµMMMand covariance, ΣΣΣMMM , calculated using a linear transformation:

µµµMMM = SSSµµµ, (29)

and:ΣΣΣMMM = SSSΣΣΣSSST . (30)

The probability P ((T aA < T aB)∩ (T aB < T aC)) is calculatedby evaluating the Cumulative Distribution Function (CDF)of MMM at X = 0 and Y = 0. Unfortunately, no analyticalsolution exists for the CDF of a multivariate normal distri-bution [14]. However, a numerical approximation approachbased on [15] is readily available as the mvnun function inthe stats.mvn module of the Scipy package for Python.

The mvnun function is computationally expensive, espe-cially for high-dimensional multivariate distributions. If thereare independent parts of the distribution (e.g., P ((T aA <T aB)∩ (T aB < T aC)∩ (T aD < T aE)∩ (T aE < T aF )) where ABCare independent from DEF ), then, in practice when usingmvnun, it is significantly faster to calculate the probabilitiesof each independent part separately and simply multiply theprobabilities together, than to compute the probability usingthe entire multivariate distribution.

1) Estimating the probability: Due to the large number ofpossible orders of arrival when considering multiple robots,calculating the probability of every order of arrival usingmvnun can result in long computation times. For example,with 8 robots, there are over 40,000 possible orders of arrival.In this case, mvnun takes 3ms to compute the probabilityof one arrival order, requiring a total of 120s to calculatethe probability of every order. It is therefore desirable tohave a fast method of estimating the probability to useeither in place of mvnun or to allow unlikely arrival ordersto be discarded before mvnun is called. Using conditionalprobabilities, the probability of a given arrival order of nrobots is:

P̄((T a1 < T a2 ) ∩ · · · ∩ (T an−1 < T an )

)=

i=n∏i=2

P((T ai−1|

(T aj−2 < T aj−1 ∀j ∈ {3, . . . , i}

))< T ai

).

(31)

Page 6: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

To estimate this probability, the conditional probability dis-tributions are approximated as normal distributions using theapproach presented in Section III-A, and the probability ofeach pair of distributions is then calculated using Eq. (25).

C. Finding the n-th most likely order of arrival

Another method of reducing the computational require-ments of the framework is to only consider likely orders ofarrival. Thus, it is desirable to be able to determine whatthe most likely orders of arrival are. In the general case,this requires an exhaustive search over all orders of arrival.This section considers the special case where the standarddeviations of the arrival time for each robot are equal. Inthis case, the n-th most likely order of arrival can be foundusing a similar approach to Murty’s ranking algorithm forefficiently ranking assignments by their cost [16].

Theorem 1: The most likely order of arrival is the orderin which the mean arrival times are ascending.

Proof: See Appendix I for full proof.Summary: The proof presented in Appendix I shows that

ordering any pair of neighbouring robots in a sequence ofarriving robots by their mean arrival times will result in ahigher probability order of arrival than the opposite ordering.Applying this to all pairs of robots leads to the conclusionthat the most likely order of arrival is obtained by sortingthe robots by their mean arrival times.

Corollary 1: The second most likely order of arrival canbe found by swapping one pair of neighbouring robots in themost likely order of arrival.

Corollary 2: The n-th most likely order of arrival can befound by swapping at most n−1 pairs of neighbouring robotswhen starting with the most likely order.

Corollary 3: The (n + 1)-th most likely order of arrivalcan be found by swapping one pair of neighbouring robotsin one of the n most likely orders.

Together, Corollaries 2 and 3 enable the application of anapproach similar to Murty’s ranking algorithm, detailed inAlgorithm 1. The algorithm takes as input the list of arrivaltime distributions and a probability threshold, and returns alist of orders of arrival that has a sum probability greaterthan the input threshold. It first finds the most likely orderof arrival on Line 1. While the sum probability is less thanthe threshold (Line 3), neighbouring orders of the most likelyorders of arrival found up to that point are considered. Thei-th most likely order is then found on Line 7. While thisapproach is only guaranteed to search through the orders ofarrival in the order of their likelihood for cases where thedistributions describing the arrival times have equal standarddeviations, it can be applied to cases where the standarddeviations are not equal with the loss of this guarantee.

IV. RESULTS

This section presents results for two MRTA scenarios. Inthe first scenario, the order in which the robots must performtheir tasks was specified, and in the second scenario, theorder that the tasks are performed in was determined bythe order in which the robots arrive at the task. In both

Algorithm 1: Calculate the most likely orders of arrivalInput : List of arrival time distributions, T ,

probability threshold, φOutput: List of orders of arrival, L

1 O ← T sorted by mean values2 L← list[O], i← 23 while

∑(p (Q) ∀Q ∈ L) < φ do

4 for Y ∈ unvisited neighbours of O do5 Append Y to L6 Sort L by probability7 O ← i-th entry in L8 i← i+ 1

cases, it was assumed that the robots have local collisionavoidance algorithms to prevent collisions between robots,and any deviations in their travel times due to collisionavoidance between the robots was assumed to be captured bythe uncertainty of their travel times. All methods were testedon the same hardware and were programmed in Python.

A. Scenario 1: Task order is specified

The first scenario consisted of a heterogeneous fleet of 4different types of robots performing assembly tasks at fourlocations. There were 4 robots of each type, giving a total of16 robots. As the tasks must be executed in a specific order,some robots may have to wait for others to complete theirtasks before commencing its own task. The problem was toallocate one robot of each type to each location. 100 randominstances were considered. In each instance, the means andstandard deviations of the distributions describing the uncer-tain travel times and uncertain task durations of each robotwere randomly selected. The optimisation objective was tominimise the expected cost of the construction, where costis incurred if the construction takes longer than a specifieddeadline, κ. The analytical framework presented in this paperenables the calculation of the random variable describing thetime at which the construction is completed, T . From this,the expected tardiness, π, is calculated as:

π = E(max(0, T − κ))

=µT − κ

2

(1 + erf

(µT − κσT√

2

))+

σT√2π

exp

(− (µT − κ)2

2σ2T

), (32)

where E(X) is the expected value of the X .Deterministic (D) and Monte Carlo (M) approaches were

used as benchmarks for the analytical method (A), with thenumber of samples used in the Monte Carlo method variedbetween 5 and 80. An exhaustive search over all possibleallocations was used for testing each method, and groundtruth costs were evaluated using a Monte Carlo method with100,000 samples. Figure 2 shows the performance of eachmethod versus its calculation time. For each random instance,the cost of the lowest cost assignment found by any methodwas subtracted from the cost for each method and then

Page 7: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

100 101 102 103

Calculation time (s)

10−4

10−3

10−2

10−1

100

Cost

abo

ve b

est k

nown

solu

tion

Cost above best known solution versus calculation time

ADM

Fig. 2: Results for Scenario 1 showing the average of thecost minus the best known cost for each scenario versusthe calculation time to find the optimal allocation for eachmethod. The error bars show a 95% confidence interval. TheMonte Carlo method used 5, 10, 20, 40, and 80 samples.

averaged across all instances. As can be seen, the analyticalapproach presented in this paper consistently resulted inthe lowest cost allocations. Not considering uncertainty, asdemonstrated by the D result, lead to additional costs thatwere several orders of magnitude higher than the proposedA approach, highlighting the importance of consideringuncertainty in these problems.

B. Scenario 2: Task order is unspecified

The second scenario consisted of 30 controlled robotscollecting packages from separate collection locations anddelivering them to their destinations. A set of uncontrolleddelivery robots were also operating in the environment,collecting packages from the same collection locations. Theresources under contention in this scenario are the packagecollection locations, as the controlled robots may be requiredto queue at the collection locations before collecting theirpackages. It was assumed that the actions of the uncontrolledrobots were known to the optimiser. The optimisation aimwas to generate an optimal allocation of robots to packagesthat minimised the expected tardiness cost incurred fordelivering packages after their deadline. In contrast to theprevious scenario, individual deadlines were considered foreach package. The number of robots controlled by otherentities that visit each collection location, n, was variedbetween 1 and 6. When n = 6, the entire scenario consistedof 30 controlled robots and 180 uncontrolled robots. 100random scenarios were tested for each value of n, withrandom arrival time distributions and task durations for eachcontrolled and uncontrolled robot, and random deadlines foreach package. The standard deviations of the arrival timedistributions for each robot were set to be equal to evaluatethe utility of the method presented in Section III-C forcalculating the set of most likely orders of arrival.

The analytical method was tested using both mvnun to

calculate the probability of an order of arrival (A), and theestimation presented in Section III-B.1 (AEst). Algorithm 1was also used in conjunction with A to limit the numberof orders considered, and values of 80%, 90%, 95%, 99%and 100% were used for the probability threshold. Theseapproaches were benchmarked against a deterministic ap-proach (D), and a Monte Carlo approach (M). The numberof samples used in M was varied between 10 and 10,000for the cases where n ≤ 5, and between 10 and 100,000 forthe case where n = 6. As only one controlled robot goesto each collection location, the cost incurred by one of thecontrolled robots does not depend on the assignments of theother controlled robots. Therefore, the cost associated witheach robot being assigned to each task can be calculatedprior to the optimisation and the Hungarian algorithm [17]can then be used to calculate the optimal allocation. Groundtruth costs were evaluated using a Monte Carlo method with100,000 samples when n ≤ 5, and 1,000,000 samples whenn = 6.

Figure 3 shows the results for each method versus thecalculation time for n ∈ {1, 3, 6}. As can be seen, theA method consistently achieved the lowest cost solutionswhen using a probability threshold of 100%. This came atthe expense of calculation time as the number of externalrobots was increased. Lowering the probability thresholdresulted in lower calculation times at the expense of solutioncost. This highlights the necessity of considering as manyof the orders of arrival as possible in order to calculatean accurate representation of the cost of an assignment.The AEst approach gave a reduction in calculation time ofa factor of approximately 5 over the A approach with aprobability threshold of 100%, with only a marginal increasein the solution cost. In the case where n = 6, the proposedapproaches produced similar results to the M approach using100,000 samples. However, the proposed approaches had thebenefit of higher consistency in this case. Similar to the previ-ous scenario, the D results were several orders of magnitudehigher than the approaches incorporating uncertainty.

The number of orders of arrival considered by the Amethod suffers from factorial growth as the number ofexternal robots is increased. These results suggest that, forcases where n > 6 and the robots arrive at similar times, theM approach will outperform the A and AEst approaches.However, if it is possible to separate the robots into multiplegroups where the probability that any of the robots inan earlier group will arrive after any of the robots in alater group is negligible, then the proposed approaches arepotentially more suitable. The computationally expensiveparts of the approaches were calculating the probability ofan order of arrival (81% of calculation time for A, 18% ofcalculation time for AEst) and conditioning the arrival timeson the order of arrival (14% of calculation time for A, 58%of calculation time for AEst).

V. CONCLUSION

This paper presented an analytical framework for mod-elling timing uncertainty and resource contention in multi-

Page 8: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

10−2 10−1 100 101 102 103

Calculation time (s)

10−5

10−4

10−3

10−2

10−1

100

101

Cost

abo

ve b

est k

nown

solu

tion

Cost above best known solution versus calculation timeAAEstDM

(a) 1 uncontrolled robot per location (n = 1)

10−2 10−1 100 101 102 103

Calculation time (s)

10−5

10−4

10−3

10−2

10−1

100

101

Cost

abo

ve b

est k

nown

solu

tion

Cost above best known solution versus calculation timeAAEstDM

(b) 3 uncontrolled robots per location (n = 3)

10−2 10−1 100 101 102 103

Calculation time (s)

10−5

10−4

10−3

10−2

10−1

100

101

Cost

abo

ve b

est k

nown

solu

tion

Cost above best known solution versus calculation timeAAEstDM

(c) 6 uncontrolled robots per location (n = 6)

Fig. 3: Results for Scenario 2 showing the average of thecost for each scenario minus the best known cost versus thecalculation time to find the optimal allocation. Values for thenumber of robots controlled by other entities at each location,n, of 1, 3, and 6, were used, as noted under each figure. Theerror bars show a 95% confidence interval. The A methodused thresholds on the sum of the probabilities of 80%, 90%,95%, 99%, and 100%. The MC method used 10, 100, 1,000,and 10,000 samples. In (c), 100,000 samples was also used.

robot scenarios. The framework was shown to significantlyoutperform deterministic approaches, and produce more ac-curate results than Monte Carlo methods with similar run-times. A key advantage of this approach over numericalmethods is that it produces an accurate probability distribu-tion of the result that can then be exploited in optimisationmethods through approaches such as chance constrainedprogramming. Certain aspects of the framework suffer fromfactorial computational complexity with the number ofrobots considered. Thus, the most promising avenues forfuture work focus on developing methods of segmentingthe problem so that the number of robots considered bythe framework at each resource is minimised. For example,splitting the robots into different arrival groups such that thearrival orders of each group can be considered separately.Other suggestions for future work include integrating thisframework with an optimisation method such as branch andbound that considers partial solutions to further reduce thenumber of orders of arrival considered, and more accuratelymodelling other effects resulting from interactions betweenrobot such as local collision avoidance.

APPENDIX IPROOF OF THEOREM 1

This section proves the claim of Theorem 1 that the mostlikely order of arrival in the case where the variances of thearrival times are equal is the order in which the mean arrivaltimes are ascending. This can be restated as, for any numberof independent normally distributed random variables withequal variances, the most likely sequence of the randomvariables resulting from sorting using their actual values isthe sequence in which the random variables are sorted bytheir mean values.

Consider two neighbouring random variables, X and Y ,in a sequence of random variables, where X and Y areindependent and normally distributed, X ∼ N (µX , σ

2) andY ∼ N (µY , σ

2), with µX < µY and equal variances. Thissection will show that, regardless of the random variablesbefore and after X and Y , the probability that X occursbefore Y is higher than the probability that Y occurs beforeX . Let a and b be constants representing the values takenby the random variables immediately before and after X andY . Then this can be stated mathematically as:

P (X < Y |a < X < b, a < Y < b)

> P (Y < X|a < X < b, a < Y < b). (33)

Lemma 1: pX|a<X<b(x) = pY |a<Y <b(x) for one andonly one value of x.

Proof: The probability distribution functionpX|a<X<b(x) is a normal distribution that is truncatedbetween a and b. The PDF of a truncated normaldistribution is given by:

pX|a<X<b(x) =1σφ(x−µX

σ

)Φ(b−µX

σ

)− Φ

(a−µX

σ

) , (34)

Page 9: Modelling resource contention in multi-robot task …1 Modelling resource contention in multi-robot task allocation problems with uncertain timing Andrew W. Palmer, Andrew J. Hill,

for a ≤ x ≤ b, and 0 otherwise, where φ(.) and Φ(.) aredefined in Eqs. (2) and (3) respectively. As the integrals ofpX|a<X<b(x) and pY |a<Y <b(x) are both 1, it follows thatthey must intersect at least once in the range a < x < b. Thefollowing proof shows that they intersect at most once:

pX|a<X<b(x) = pY |a<Y <b(x) (35)1σφ(x−µX

σ

)Φ(b−µX

σ

)− Φ

(a−µX

σ

) =1σφ(x−µY

σ

)Φ(b−µY

σ

)− Φ

(a−µY

σ

) (36)

φ(x−µX

σ

)φ(x−µY

σ

) =Φ(b−µX

σ

)− Φ

(a−µX

σ

)Φ(b−µY

σ

)− Φ

(a−µY

σ

) . (37)

Let:

c =Φ(b−µX

σ

)− Φ

(a−µX

σ

)Φ(b−µY

σ

)− Φ

(a−µY

σ

) . (38)

Substituting Eq. (2) into Eq. (37), taking the naturallogarithm, and rearranging for x gives:

x =2σ2 ln(c) + µ2

X − µ2Y

2(µX − µY ). (39)

Lemma 2: pX|a<X<b(a) > pY |a<Y <b(a) andpX|a<X<b(b) < pY |a<Y <b(b).

Proof: By Lemma 1, pX|a<X<b and pY |a<Y <b intersectonly once. As pX|a<X<b and pY |a<Y <b are probabilitydistributions, they must be positive and have an integral of1. Thus, it is sufficient to show that

pX|a<X<b(b)

pX|a<X<b(a)<pY |a<Y <b(b)

pY |a<Y <b(a), (40)

to prove the Lemma. Substituting in Eq. (34) gives:

φ(b−µX

σ

)φ(a−µX

σ

) < φ(b−µY

σ

)φ(a−µY

σ

) . (41)

Substituting in Eq. (2), taking the natural logarithm, andsimplifying gives:

µX < µY . (42)

Lemma 3: E(X|a < X < b) < E(Y |a < Y < b).Proof: Let λ satisfy pX|a<X<b(λ) = pY |a<Y <b(λ).

Then:

E(X|a < X < b) =

∫ b

a

xpX|a<X<b(x)dx (43)

=

∫ b−λ

a−λ(x+ λ)pX|a<X<b(x+ λ)dx

(44)

= λ+

∫ b−λ

a−λxpX|a<X<b(x+ λ)dx.

(45)

By Lemmas 1 and 2:

pX|a<X<b(x+ λ) > pY |a<Y <b(x+ λ) ∀x < 0, (46)

pX|a<X<b(x+ λ) < pY |a<Y <b(x+ λ) ∀x > 0. (47)

Therefore:∫ b−λ

a−λxpX|a<X<b(x+ λ)dx

<

∫ b−λ

a−λxpY |a<Y <b(x+ λ)dx. (48)

It is sufficient to show that E(X|a < X < b) < E(Y |a <Y < b) to prove that X|a < X < b is more likely tooccur before Y |a < Y < b than the opposite order. Thus,Lemma 3 proves Eq. (33). Applying Eq. (33) to all pairs ofrandom variables leads to the conclusion that the most likelysequence will be the sequence resulting from sorting therandom variables by their mean values, proving the theorem.

REFERENCES

[1] J. Alonso-Mora, T. Naegeli et al., “Collision avoidance for aerialvehicles in multi-agent scenarios,” Autonomous Robots, vol. 39, no. 1,pp. 101–121, 2015.

[2] A. W. Palmer, A. J. Hill, and S. J. Scheding, “Methods for StochasticCollection and Replenishment (SCAR) Optimisation for PersistentAutonomy,” Robotics and Autonomous Systems, vol. 87, pp. 51–65,2017.

[3] P. Kouvelis, R. L. Daniels, and G. Vairaktarakis, “Robust schedulingof a two-machine flow shop with uncertain processing times,” IIETransactions, vol. 32, no. 5, pp. 421–432, 2000.

[4] L. Liu and D. A. Shell, “Assessing optimal assignment under un-certainty: An interval-based algorithm,” The International Journal ofRobotics Research, vol. 30, no. 7, pp. 936–953, 2011.

[5] F. Wu, S. D. Ramchurn et al., “Agile Planning for Real-World DisasterResponse,” in Proceedings of the Twenty-Fourth International JointConference on Artificial Intelligence, 2015, pp. 132–138.

[6] C. Nam and D. A. Shell, “When to do your own thing: Analysisof cost uncertainties in multi-robot task allocation at run-time,” 2015IEEE International Conference on Robotics and Automation (ICRA),pp. 1249–1254, 2015.

[7] ——, “Analyzing Uncertainties in Cost: Mitigating Centralization inMulti-Robot Task Allocation,” Computer Science and EngineeringDepartment, Texas A&M University, Tech. Rep., 2015.

[8] R. T. Vaughan, K. St\oy et al., “Go ahead, make my day: Robotconflict resolution by aggressive competition,” in Proceedings of the6th International Conference on Simulation of Adaptive Behaviour(SAB), 2000, pp. 491–500.

[9] C. Nam and D. A. Shell, “Assignment Algorithms for Modeling Re-source Contention and Interference in Multi-Robot Task-Allocation,”IEEE Transactions on Automation Science and Engineering, vol. 12,no. 3, pp. 2158–2163, 2015.

[10] A. Khamis, A. Hussein, and A. Elmogy, “Multi-robot Task Allocation:A Review of the State-of-the-Art,” Cooperative Robots and SensorNetworks 2015, pp. 31–51, 2015.

[11] C. E. Clark, “The greatest of a finite set of random variables,”Operations Research, vol. 9, no. 2, pp. 145–162, 1961.

[12] A. W. Palmer, A. J. Hill, and S. J. Scheding, “Applying Gaussiandistributed constraints to Gaussian distributed variables,” InformationFusion, vol. 32, pp. 1–11, 2016.

[13] F. Pedregosa, G. Varoquaux et al., “Scikit-learn: Machine learning inPython,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.

[14] A. Genz and F. Bretz, Computation of multivariate normal and tprobabilities. Springer Science & Business Media, 2009.

[15] A. Genz, “Numerical Computation of Multivariate Normal Proba-bilities available,” Journal of computational and graphical statistics,vol. 1, no. 2, pp. 141–149, 1992.

[16] K. G. Murty, “Letter to the Editor–An Algorithm for Ranking allthe Assignments in Order of Increasing Cost,” Operations Research,vol. 16, no. 3, pp. 682–687, 1968.

[17] H. W. Kuhn, “The Hungarian method for the assignment problem,”Naval Research Logistics Quarterly, vol. 2, no. 1-2, pp. 83–97, 1955.