15
Real-time Optimization-based Planning in Dynamic Environments using GPUs Chonhyon Park and Jia Pan and Dinesh Manocha Abstract We present a novel algorithm to compute collision-free trajectories in dynamic environ- ments. Our approach is general and makes no assumption about the obstacles or their motion. We use a replanning framework that interleaves optimization-based planning with execution. Further- more, we describe a parallel formulation that exploits high number of cores on commodity graphics processors (GPUs) to compute a high-quality path in a given time interval. We derive bounds on how parallelization can improve the responsiveness of the planner and the quality of the trajectory. 1 Introduction Robots are increasingly used in dynamic or time-varying environments. These scenarios are com- posed of moving obstacles, and it is important to compute collision-free trajectories for navigation or task planning. Some of the applications include automated wheel chairs, manufacturing tasks with robots retrieving parts from moving conveyors, air and freeway traffic control, etc. The motion of the obstacles can be unpredictable and new obstacles may be introduced in the environment. As a result, we need to develop appropriate algorithms for planning and executing appropriate trajectories in such dynamic scenes. There is extensive work on motion planning. Some of the widely used techniques are based on sample-based planning, though they are mostly limited to static environments. There is recent work on extending sample-based planning techniques to dynamic scenes by incorporating the notion of time as an additional dimension in the configuration space [2, 12, 18]. However, the resulting algo- rithms may not generate smooth paths or handle dynamic constraints in real time.Other technique for dynamic environments are limited to local collision avoidance with the obstacles or make some assumptions about the motion of dynamic obstacles. In this paper, we address the problem of real-time motion planning in dynamic scenes. In order to deal with unpredictable environments, we use replanning algorithms that interleave planning with execution [14, 12, 17, 26]. In these cases, the robot may only compute partial or sub-optimal plans in the given time interval. In order to produce smooth paths and handle dynamic constraints, we combine replanning techniques with optimization-based planning [15, 32, 25]. We present a novel parallel optimization-based motion planning algorithm for dynamic scenes. Our planning algorithm optimize multiple trajectories in parallel to explore a broader subset of the configuration space and compute a high-quality path. The parallelization improves the optimality of the solution and makes it possible to compute a safe solution for the robot in a shorter time interval. We map our multiple trajectory optimization algorithm to many-core GPUs (graphics pro- Chonhyon Park and Jia Pan and Dinesh Manocha the Department of Computer Science, the University of North Carolina, Chapel Hill, e-mail: {chpark,panj,dm}@cs.unc.edu 1

Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in DynamicEnvironments using GPUs

Chonhyon Park and Jia Pan and Dinesh Manocha

Abstract We present a novel algorithm to compute collision-free trajectories in dynamic environ-ments. Our approach is general and makes no assumption about the obstacles or their motion. Weuse a replanning framework that interleaves optimization-based planning with execution. Further-more, we describe a parallel formulation that exploits high number of cores on commodity graphicsprocessors (GPUs) to compute a high-quality path in a given time interval. We derive bounds on howparallelization can improve the responsiveness of the planner and the quality of the trajectory.

1 Introduction

Robots are increasingly used in dynamic or time-varying environments. These scenarios are com-posed of moving obstacles, and it is important to compute collision-free trajectories for navigation ortask planning. Some of the applications include automated wheel chairs, manufacturing tasks withrobots retrieving parts from moving conveyors, air and freeway traffic control, etc. The motion ofthe obstacles can be unpredictable and new obstacles may be introduced in the environment. As aresult, we need to develop appropriate algorithms for planning and executing appropriate trajectoriesin such dynamic scenes.

There is extensive work on motion planning. Some of the widely used techniques are based onsample-based planning, though they are mostly limited to static environments. There is recent workon extending sample-based planning techniques to dynamic scenes by incorporating the notion oftime as an additional dimension in the configuration space [2, 12, 18]. However, the resulting algo-rithms may not generate smooth paths or handle dynamic constraints in real time.Other techniquefor dynamic environments are limited to local collision avoidance with the obstacles or make someassumptions about the motion of dynamic obstacles.

In this paper, we address the problem of real-time motion planning in dynamic scenes. In orderto deal with unpredictable environments, we use replanning algorithms that interleave planning withexecution [14, 12, 17, 26]. In these cases, the robot may only compute partial or sub-optimal plansin the given time interval. In order to produce smooth paths and handle dynamic constraints, wecombine replanning techniques with optimization-based planning [15, 32, 25].

We present a novel parallel optimization-based motion planning algorithm for dynamic scenes.Our planning algorithm optimize multiple trajectories in parallel to explore a broader subset of theconfiguration space and compute a high-quality path. The parallelization improves the optimalityof the solution and makes it possible to compute a safe solution for the robot in a shorter timeinterval. We map our multiple trajectory optimization algorithm to many-core GPUs (graphics pro-

Chonhyon Park and Jia Pan and Dinesh Manochathe Department of Computer Science, the University of North Carolina, Chapel Hill, e-mail:{chpark,panj,dm}@cs.unc.edu

1

Page 2: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

2 Chonhyon Park and Jia Pan and Dinesh Manocha

cessing units) and utilize their massively parallel capabilities to achieve 20-30X speedup over serialoptimization-based planner. Furthermore, we derive bounds on how parallelization improves theresponsiveness and the quality of the trajectory computed by our planner. We highlight the perfor-mance of our parallel replanning algorithm in the ROS simulation environment with a 7-DOF robotand dynamic obstacles.

The rest of the paper is organized as follows. In Section 2, we give a brief overview of priorwork on motion planning in dynamic environments and optimization-based planning. We presentan overview of optimization-based planning and execution framework in Section 3. In Section 4,we describe the parallel replanning algorithm and analyze its performance and responsiveness inSection 5. We highlight its performance in simulated dynamic environments in Section 6.

2 Related Work

In this section, we give a brief overview of prior work on motion planning in dynamic environments,optimization-based planning and parallel algorithms for motion planning.

2.1 Motion Planning in Dynamic Environments

Many approaches for motion planning in dynamic environment assume that the trajectories of mov-ing objects are known a priori. Some algorithms discretize the continuous trajectory and modeldynamic obstacles as static obstacles within a short horizon [21]. Other techniques compute anappropriate robot velocity that can avoid a collision with moving obstacles during a short timestep [10, 34]. The state space for planning in a dynamic environment is given as C ×T , i.e., the Carte-sian product of configuration space and time. Some RRT variants can handle continuous state spacedirectly [26], while other methods discretize the state space and use classic heuristic search [28, 27]or roadmap based algorithms [3].

Some planning algorithms for dynamic environments [3, 28] assume that the inertial constraints,such as acceleration and torque limit, are not significant for the robot. Such assumption imply thatthe robot can stop and accelerate instantaneously, which may not be feasible for physical robots.Moreover, these algorithms attempt to find a good solution for path planning before robot executionstarts. In many scenarios, the planning computation can be expensive. As a result, path planningbefore execution strategy can lead to long delays during robot’s movement and may cause collisionsfor robots operating in environments with fast dynamic obstacles. One solution to overcome theseproblems is based on real-time replanning, which interleaves planning with execution so that therobot may only compute partial or sub-optimal plans for execution to avoid collisions. Different al-gorithms can be used as the underlying planners in the real-time replanning framework, includingsample-based planners [14, 12, 26] or search-based methods [17, 22]. Most replanning algorithmsuse fixed time steps when interleaving between planning and execution [26]. Some recent work [12]computes the interleaving timing step in an adaptive manner to balance between safety, responsive-ness, and completeness of the overall system.

2.2 Optimization-based planning

Optimization techniques can be used to compute a robot trajectory that is optimal under some spe-cific metrics (e.g., smoothness or length) and also satisfies various constraints (e.g., collision-freeand dynamics constraints). Some algorithms assume that a collision-free trajectory is given andit can be refined using optimization techniques. One simple technique is the so-called ‘shortcut’

Page 3: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 3

heuristic, which picks pairs of configurations along a collision-free path and invokes a local plannerto replace the intervening sub-path with a shorter one [7]. Other examples include elastic bands orelastic strips planning, which models paths as mass-spring systems and uses gradient-based meth-ods to compute a minimum-energy path [6, 31]. Other optimization-based planning algorithms canrelax the assumptions about the initial path and may start with an in-collision path. Some recentapproaches, such as [32, 15, 9], directly encode the collision-free constraints using a global potentialfield and compute a collision-free trajectory for robot execution. These methods typically representvarious constraints(smoothness, torque, etc.) as soft constraints in terms of additional penalty termsto the objective function, and compute the final trajectory. In case the underlying robot has to satisfyhard constraints, e.g., dynamic constraints needed to maintain the balance for humanoid robots, thetrajectory computation problem is solved using constrained optimization [19, 20] and can be moreexpensive.

2.3 Parallel Algorithms

Due to the rapid advances in multi-core and many-core processors, designing efficient parallelplanning algorithms that can benefit from their computational capabilities is an important topic inrobotics. Many parallel algorithms have been proposed for motion planning by utilizing the proper-ties of configuration space [23]. Moreover, techniques based on distributed representation [2] can beeasily parallelized. In order to deal with very high dimensional or challenging scenarios, distributedsample-based techniques have also been proposed [1, 8, 30].

The rasterization capabilities of a GPU can be used for real-time motion planning of low DOFrobots [13] or to improve the sample generation in narrow passages [29]. Recently, the GPUs havebeen exploited to accelerate sampling-based motion planners in high dimensional spaces, includingsample-based planning [24], RRT algorithms [4], and search-based planning [16].

3 Overview

Our real-time replanning algorithm is based on optimization-based planning and uses parallel tech-niques to handle arbitrary dynamic environments. In this section, we first describe the underlyingframework for optimization-based planning and give an overview of our planning and executionframework.

3.1 Optimization-based Planning

Traditionally, the goal of motion planning is to find a collision free trajectory from the start con-figuration to the goal configuration. Optimization-based planning reduces trajectory computation toan optimization problem that minimizes the costs corresponding to collision-free, smoothness, anddynamics constraints. Specifically, the start configuration vector qstart and the goal configurationvector qend are defined in the configuration space C of a robot. In this case, the dimension D of Cis equal to the number of free joints in the robot. There may be several static and dynamic obstaclesin the environment, corresponding to rigid bodies. We assume that a solution trajectory has a fixedtime duration T , and discretize it into N (excluding the two endpoints qstart and qend) waypointsequally spaced in time. The trajectory can be also represented as a vector Q ∈RD·N :

Q = [qT1 ,q

T2 , ...,q

TN ]

T . (1)

Page 4: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

4 Chonhyon Park and Jia Pan and Dinesh Manocha

Similar to previous work [32, 15, 25], we define the objective function of our optimization prob-lem as:

minq1,...,qN

N

∑i=1

(cs(qi)+ cd(qi)+ co(qi))+12‖AQ‖2, (2)

where the three cost terms cs(·), cd(·), and co(·) represent the static obstacle cost, dynamic obstaclecost, and the problem specific additional constraints, respectively. ‖AQ‖2 represents the smoothnesscost which is computed by the sum of squared accelerations along the trajectory, using the same ma-trix A proposed by Kalakrishnan et al [15]. The solution to the optimization problem in Equation (2)corresponds to the optimal trajectory of the robot.

In order to compute static and dynamic obstacle costs, we use signed Euclidean Distance Trans-form (EDT) and geometric collision detection as in previous work [25]. We divide the workspaceinto a 3D voxel grid and precompute the distance to the boundary of the nearest static obstacle witheach voxel. Moreover, we approximate the robot’s shape B by using a set of overlapping spheresb ∈B. In this case, the static obstacle cost for a configuration qi can be computed by table lookupin the voxel map as follows:

cs(qi) = ∑b∈B

max(ε + rb−d(xb),0)‖xb‖, (3)

where rb is the radius of one sphere b, xb is the 3D point of sphere b computed from the kinematicmodel of the robot at configuration qi, d(x) is the signed EDT for a 3D point x, and ε is a smallsafety margin between robot and the obstacles.

EDT can be efficiently used to compute the cost of static obstacles, since it only requires a sim-ple table lookup after one-time one-time initialization. However, using EDT for dynamic obstaclesrequires recomputation of EDT during each step, which can be expensive. Therefore we use geo-metric collision detection between the robot and dynamic obstacles to formalize the cost of dynamicobstacles. Object-space collision detection algorithms based on bounding volume hierarchies [11]are used to compute the dynamic obstacle cost efficiently.

In real-world applications, we cannot make assumptions about the future trajectory of the ob-stacles. We can only locally estimate the trajectory based on sensor data. In order to guarantee thesafety of the planned trajectory, we compute a conservative local bound on the trajectories of dy-namic obstacles and use them for the collision detection. The allowed sensing error and the obstaclevelocity are used to determine the size of bounds. The conservative bound for an obstacle Od for thetime interval [t0, t1] is computed as follows:

Od([t0, t1]) =

⋃t∈[t0,t1]

(1+sensing error

Od(t))Od

(t) (4)

3.2 Planning and Execution Framework

In order to improve the responsiveness of the robot in dynamic environments, we use a replanningapproach previously used for sampling-based motion planning [12]. Instead of planning and exe-cuting the entire trajectory at once, this formulation interleaves the planning and execution threadswithin a small time interval ∆t . This approach allows us compute new estimates on the local trajec-tory of the obstacles based on latest sensor information. During each planning step, we compute anestimate of the position and velocity of dynamic obstacles based on senor data. Next, a conservativebound of dynamic obstacles during the local time interval is computed using these values and theplanner uses this bound to compute the cost for dynamic obstacles. This cost is only used during thetime interval ∆t , as the predicted positions of dynamic obstacles may not be valid over a long timehorizon. This bound guarantees the safeness of the trajectory during the planning interval, however

Page 5: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 5

COs1

qstart

qk

qend0

t0

T

C-Space at different time

I=

[t 0, t 1

]

COd([t0, t1])

COs2

t1

Q3Q1

Q2

Fig. 1: Multiple trajectories that arise in the optimization-based motion planning. The coordinatesystem shows how the configuration space changes over time as the dynamic obstacles move overtime: each plane slice represents the configuration space at time t. In the environment, there arethree C-obstacles: the two static obstacles COs

1, COs2 and the dynamic obstacle COd . The planned

trajectories start at time 0, stop at time T , and are represented by a set of way points qstart , q1, ...,qk, ..., qN , qend . The three trajectories for the time interval I = [t0, t1] are generated with differentrandom seeds and represent different solutions to the planner in these configurations correspondingto the dynamic obstacles. Though all the trajectories have no collision with obstacles during the timeinterval I, trajectory Q2 represents the most optimal solution based on smoothness and collision-freecost functions. Our goal is to use commodity parallel hardware to compute these multiple trajectoriesin parallel and improve the probability of computing an optimal solution.

the size of the bound increases as the planning interval increases. It turns out that large conservativebounds make it hard for planner to compute a solution or result in a less optimal solution in the giventime bound. Hence, it is important to choose a short time interval to improve the responsiveness ofthe robot. Our goal is to exploit the parallelism in commodity processors to improve the efficiencyof the optimization-based planner. This parallelism results in two benefits:

• The faster computation allows us to use shorter time intervals which can improve the responsive-ness and safety for robots working in fast changing environments.

• Based on parallel threads, we can try to compute multiple trajectories corresponding to differ-ent seed values, and thereby explore a broader configuration space to compute a more optimalsolution, as illustrated in Figure 1.

4 Parallel Replanning

Nowadays, all commodity processors have multiple cores. Even some of the robot systems areequipped with multi-core CPU processors. Furthermore, these robot systems provide expansibilityin terms of using many-core accelerators, such as graphics processing units (GPUs). These many-

Page 6: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

6 Chonhyon Park and Jia Pan and Dinesh Manocha

Goal

Setting

CPU GPU

Scheduler Motion

Planner

Robot

Controller Sensor

Environment

Robot

Motors

Parallel Trajectory Optimization

Fig. 2: The overall architecture of our parallel replanning algorithm. The planner consists of fourindividual modules(scheduler, motion planner, robot controller, sensor), and each of them runs as aseparate thread. When the motion planning module receives a planning request from the scheduler,it launches optimization of multiple trajectories in parallel.

core accelerators are massively parallel processors, which offer a very high peak performance (e.g.up to 1 TFlop single precision capaibilities). Our goal is to exploit the computational capabilitiesof these commodity parallel processors for optimization-based planners and real-time replaning. Inthis section, we present a new parallel algorithm to solve the optimization problem highlighted in inEquation (2).

Our parallel replanning algorithm is based on stochastic optimization solver introduced by [15]to solve Equation (2). The solver is a derivative-free method which allows us to optimize arbitrarycost functions for which derivatives are not available, or are non-differentiable or non-smooth. Weparallelize our algorithm in two ways. First, we parallelize optimization of a single trajectory byparallelizing each step of optimization by using multiple threads on a GPU. Second, we parallelizeoptimization of multiple trajectories by using different initial seed values. Since it is a randomizedalgorithm, the solver may converge to different local minima and the running time of the solver alsovaries based on the initial seed values. In practice, such parallelization can improve the responsive-ness and the quality of the trajectory computed by our planner.

In this section, we describe our parallel replanning algorithm, which exploits multiple cores. Firstwe present the framework of the parallel replanning pipeline with multiple trajectories. We alsopresent the also present the GPU-based algorithm for single trajectory optimization.

4.1 Parallelized Replanning with Multiple Trajectories

As shown in Figure 2, our algorithm consists of several modules: scheduler, motion planner, robotcontroller and sensor. The scheduler sends a planning request to the motion planner when it getsnew goal information. The motion planner starts optimizing multiple trajectory in parallel. When themotion planner computes a new trajectory which is safe for the given time interval ∆t , the schedulernotifies the trajectory to the robot controller to execute the trajectory. While the robot controllerexecutes the trajectory, the scheduler requests planning of the next execution interval to the motionplanner. The motion planner also gets updated environment description from the sensors and utilizes

Page 7: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 7

PLANNING T1

PLANNING T2

PLANNING T3

PLANNING T4

PLANNING T1

PLANNING T2

PLANNING T3

PLANNING T4

PLANNING T1

PLANNING T2

PLANNING T3

PLANNING T4

PLANNING T1

PLANNING T2

PLANNING T3

PLANNING T4

EXECUTION

EXECUTION

EXEC.

EXECUTION

𝑡0 𝑡1 𝑡2 𝑡3 𝑡𝑛−1 𝑡𝑛 𝑡𝑛+1

step 0 step 1 step 2 step n-1 step n

∆0 ∆1 ∆2 ∆𝑛−1 ∆𝑛

goal

time

Fig. 3: The timeline of interleaving planning and execution in parallel replanning. In this figure, weassume the number of trajectories computed by parallel optimization algorithm as four. At time t0,the planner starts planning for time interval [t1, t2], during the time budget [t0, t1]. It finds a solutionby trying to optimize four trajectories in parallel. At time t1, the planner is interrupted and returnsthe result corresponding to the best trajectory to scheduler module. Then scheduler module executesthe trajectory. The optimization of multiple trajectories makes it possible to explore a large subset ofconfiguration space and increases the probability of computing the global optimum. While the robotcontroller executes the planned robot motion, the planner starts planning for the next time step [t1,t2] and the result is used at time t2. This process is repeated until the goal is reached. During thereplanning, if the planner determines that one of the computed trajectories is the optimal solution, itterminates the rest of the trajectory computations and returns the optimal trajectory.

it to derive bounds on the trajectories of dynamic obstacles during the next time interval. Since allmodules run in separate threads, each module does not need to wait on other modules and can workconcurrently.

Figure 3 illustrates interleaved planning and execution with multiple trajectory planning. Duringstep i, the planner has a time budget ∆i = ti+1−ti, and it is also the time budget available for executionduring step i. During the planning computation in step i, the planner tries to generates trajectoriesof the next execution step, i.e, the time interval [ti+1, ti+2]. The sensor information at ti is used toestimate conservative bounds for the dynamic obstacles during the interval [ti+1, ti+2].

Within the time budget, multiple initial trajectories are refined by the optimization algorithm togenerate multiple solutions which are sub-optimal and have different costs. Some of the solutionsmay not be collision-free for the execution interval. It could be due to the limited time budget, orthe local optima corresponding to that particular solution. However, parallelization using multipletrajectories increases the possibility that there exists a collision-free trajectory and we can also expectto compute a higher quality solution, as we discussed in Section 3.2.

Page 8: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

8 Chonhyon Park and Jia Pan and Dinesh Manocha

Generate Initial

Trajectories

Generate Noise

Compute

Waypoint Cost

Compute Joint Cost

Compute Probability

Weights of Noise

Update Trajectories

Termination Check

Number of parallel GPU threads

during each step of trajectory optimization

k threads

k ∙ m threads

k ∙ m ∙ n threads

k ∙ m ∙ n ∙ d threads

k ∙ m threads

k threads

k threads

(k : number of trajectories)

(m : number of noise vectors

used for stochastic optimization)

(n : number of waypoints

in a trajectory)

(d : number of robot joints)

Fig. 4: The detailed breakdown of GPU trajectory optimization. It starts with the generation of kinitial trajectories. From these initial trajectories, the algorithm iterates over stochastic optimizationsteps. First it generates random noise vectors which are used for stochastic optimization, then com-putes the cost for each waypoint on each noise trajectory. These waypoint costs include collisioncost, end effector orientation cost, etc. We also compute joint cost, including smoothness cost orcomputing the torque constraints. The current trajectory cost is repeatedly improved using the costof noise trajectories until the algorithm satisfies termination criteria(time budget runs out, or thecurrent trajectory results in an optimal solution).

4.2 Highly Parallel Trajectory Optimization

With the parallelization of computation of multiple trajectories, our algorithm improves the respon-siveness of the planner by parallelization of the optimization of a single trajectory. We parallelizevarious aspects of the stochastic solver on the GPUs by using random noise vectors.

The trajectory optimization process and the number of threads used during each step are illus-trated in Figure 4. We assume that the number of parallelized trajectories is k, the number of noisevectors used for stochastic optimization is m, a trajectory is discretized into n waypoints, and thenumber of the free joints of the robot, i.e., the dimension of the configuration space is d. Then the al-gorithm uses (k,k ·m ·n ·d) threads in parallel according to these steps and exploits the computationalpower of GPUs.

The algorithm starts with the generation of k initial trajectories. As defined in Section 3, eachtrajectory is generated in the configuration space C (which has dimension d), which has n waypointsfrom qstart to qend . Then the algorithm generates m random noise vectors(with dimension d) for allthe n waypoints on the trajectory. These noise vectors are used to perform stochastic update of thetrajectory. Adding these m noise vectors to the current trajectory results in m noise trajectories, whichare slightly modified from the original trajectory. The cost for an waypoint, such as costs for static

Page 9: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 9

and dynamic obstacles are computed for each waypoint in the noise trajectories. As described in theSection 3.1, the static obstacle cost is computed by precomputed signed EDT. EDT is initialized onceduring the startup step of the planner and does not change over the course of the planner, the 3D spacepositions of the overlapping spheres b ∈B of the robot are computed by the kinematic model of therobot in the configuration of each waypoint. Collision detection for the cost of dynamic obstaclesis computed by GPU collision detection algorithm [24]. Smoothness cost, computed by a matrixmultiplication ‖AQ‖2 for each joint, can be computed efficiently using parallel capabilities of a GPU.When costs of all noise trajectories are computed, the update of current trajectory is implemented bymoving it towards the noise trajectory with low cost. It is computed by the weighted sum of noiseswhich are inversely proportional to their costs. At the end of each iteration, the algorithm decides tostop the optimization or repeat the next iteration. If a trajectory is determined to the global optimal,optimization of all trajectories are interrupted and the optimal solution is returned. Or if the giventime budget is expired, similarly, optimization of all trajectories are interrupted and the best solutionis returned.

5 Analysis

In this section, we analyze the benefits of parallelization and derive bounds on the improvement inresponsiveness and the quality of the trajectory computed by the planner.

5.1 Responsiveness

The use of multiple trajectories improves the responsiveness of our planner. The optimization func-tion corresponding to Equation 2 typically has multiple local minima. In general, any trajectorythat is collision-free, satisfies all constraints, and is suitably smooth can be regarded as an accept-able solution of optimization-based planning. In this section, we show the optimization of multipletrajectories by our GPU-based algorithm improves the performance of our planner.

We assume that the different random seeds used by the algorithm are uniformly distributed, thetime costs required by the solver to compute a solution are independent and identically distributed(i.i.d.) random variables, whose mean is µ and variation is σ2. Note that parameters µ and σ2 reflectthe capability of the solver: large µ implies that the environment is challenging and the solver needsmore time to compute an acceptable result; large σ2 means that the solver is sensitive to the initialvalues. Suppose there are n cores and we denote the time costs of different cores by X1, ..., Xn,respectively. Then the time cost for the parallelized solver is X = min(X1, ...,Xn), which is calledthe first order statistic of {Xi}. We measure the theoretical acceleration due to parallelization bycomputing the expected time costs without and with parallelization:

Definition 1. The theoretical acceleration of optimization-based planner with n threads is τ =E(Xi)E(X) =

µ

E(X) , where X = min(X1, ...,Xn).

If Xi follows the uniform distribution, then the acceleration ratio can be simply represented as τ =n+1

2 . A better way to describe the time cost of the solver is obtained based on a normal distribution.If Xi follows normal distribution, according to [5], then:

E(X)≈ µ +Φ−1(

1−α

n−2α +1)σ , (5)

where α = 0.375 and Φ(x) = 1√2π

∫ x−∞

e−t22 dt. We show the acceleration ratio for different number

of threads and different values of σ

µin Figure 5. It is obvious that when the number of parallel

Page 10: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

10 Chonhyon Park and Jia Pan and Dinesh Manocha

1 2 3 4 5 6 7 81

2

3

4

5

6

7

8

# of cores

acce

lera

tion

ratio

σ

µ= 0.1

σ

µ= 0.3

σ

µ= 0.4

σ

µ= 0.6

Fig. 5: Benefits of a parallel, multi-threaded algorithm in terms of the responsiveness improvement.We assume that the time costs of different threads using random initializations are i.i.d. Gaussianvariables with mean µ and variation σ2, which provide a measure of the complexity of the optimiza-tion problem. We show the theoretical acceleration under such assumption by varying the number ofparallel threads and the ratio between σ and µ .

threads increases, the acceleration ratio increases rapidly. Moreover, the acceleration is large whenσ

µis large, i.e., when the solver is sensitive to the initial seed value.We also analyze the responsiveness of the planner based on GPU parallelization. The computa-

tion of each waypoint and each joint are processed in parallel using multiple threads on a GPU and itimproves the performance of optimization algorithm. Figure 6 shows the performance of GPU-basedparallel optimization algorithm. When we perform multiple trajectory optimization using CPUs, thealgorithm assigns one CPU core to compute each trajectory. It implies that the increased numberof trajectories only uses more cores and does not affect the performance of a single trajectory op-timization computation. However, the GPU-based algorithm also utilizes various cores to improvethe performance of a single trajectory computation, as shown in Figure 4. If we increase the numberof trajectories, then it causes the system to share the resources for multiple trajectories. Overall, weobserve that by simultaneously optimizing multiple trajectories, we obtain a higher throughput usingGPUs. We observe that multiple trajectory optimization not only improves the overall performanceof the optimization algorithm, but also improves the quality of the trajectory.

Page 11: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 11

0 200 400 600 800 1000 1200

10 trajectories

1 trajectory

4 cores

2 cores

1 core

1,012.881

522.739

97.473

51.463

26.357

Iteration / sec

Multi-core CPU

Mani-core GPU

Fig. 6: Benefits of the parallel algorithm in terms of the performance of optimization algorithm. Thegraph shows the number of optimization iterations that can be performed per second. When multipletrajectories are used on a multicore CPU (by varying the number of cores) optimization, each coreis used to compute one single trajectory. Each trajectory computation is performed independently.The number of iterations performed per second increases as a linear function of the number of cores.In case of many-core GPU optimization, increasing the number of trajectories results in sharing ofGPU resources among different trajectory computations and we don’t observe a linear relationship.Overall, we see a better utilization of GPU resources, if we optimize a higher number of trajectoriesin parallel. In this case, we observe two times improvement in terms of GPU resource utilizationwhen we try to optimize 10 trajectories, as opposed to a single trajectory.

5.2 Quality

The parallel algorithm can also improve the probability of the planner in terms of computing theglobal optimal solution. The optimization problem in Equation 2 has D ·N degrees of freedom,where N tends to be a large number corresponding to several hundreds. It is difficult to find a globaloptimal solution when searching in such a high-dimensional space. However, we can show that theuse of multiple initializations can increase the probability of computing the global optima. Accordingto [33], the probability for a pure random search to find a global optima using n uniform samples isdefined as Theorem 1.

Lemma 1. An optimization-based planner with n threads will compute the global optima with theprobability 1− (1− |A||S| )n, where S is the entire search space, A is the neighborhood around theglobal optimal solutions and | · | is the measurement on the search space.

Here |A||S| measures the probability that one random sample lies in the neighborhood of global optimalsolutions. The stochastic solver in our implementation uses stochastic gradient method instead ofrandom search. However, the optimization solver with a specific initial value converges to the globaloptimum solution if and only if any of the intermediate trajectories generated during the optimizationalgorithm reaches the global minima. As a result, Lemma 1 provides a lower bound on the probabilitythat an optimization-based planner with n threads will compute the global optima. When the numberof threads increases, we have a higher chance of computing the global optimal trajectory.

Page 12: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

12 Chonhyon Park and Jia Pan and Dinesh Manocha

(a) Start configuration used in the performance mea-surement

(b) Goal configuration used in the performance mea-surement

Fig. 7: Planning environment used to evaluate the performance of our planner. The planner computesa trajectory of robot arm which avoids dynamic obstacles and moves horizontally from right to left.Green spheres are static, and red spheres are dynamic obstacles. Figure (a), (b) Show the start andgoal configurations of the right arm of the robot.

Scenario Average planning time (ms) Std. dev planning time (ms)CPU 1 core 810 0.339CPU 2 core 663 0.284CPU 4 core 622 0.180

GPU 1 trajectory 337 0.204 1 FailGPU 4 trajectory 203 0.326GPU 10 trajectory 60 0.071

Table 1: Results obtained from our trajectory computation algorithm based on different levels ofparallelization and number of trajectories (for the benchmarks shown in Figure 7). The planningtime decreases when the planner uses more trajectories.

6 Results

In this section, we highlight the performance of our parallel planning algorithm in dynamic environ-ments. All experiments are performed on a PC equipped with an Intel i7-2600 8-core CPU 3.4GHzwith 8GB of memory. Our GPU algorithm is implemented on an NVIDIA Geforce GTX580 graphicscard, which supports 512 CUDA cores.

Our first experiment is designed to estimate the responsiveness of the planner. We plan a trajectoryof the 7 degree-of-freedom right arm of PR2 in a simulation environment. In the environment shownin Figure 7, there are two static (green) and two moving (red) obstacles. We measure the elapsed timeto compute a collision-free solution with varying number of trajectories for both CPU and GPU-based planners. This experiment is performed to compute the appropriate time interval for a singleplanning time step during replanning. A shorter planning time makes the planner more responsive.We repeat the test 10 times for each scenario, and compute the average and standard deviation ofthe overall planning time. This result is shown in Table 1. We observe that the GPU-based plannerdemonstrates better performance than a CPU-based planner. In both cases, it is shown that whenmore trajectories are optimized in parallel, the performance of the planner increases. We restrict themaximum number of iterations to 500, and the planner failed to compute the collision-free solutiononce for a single trajectory case on GPUs. This happens because the single trajectory instance getsstuck in a local minima and is unable to compute an acceptable solution.

In the next experiment, we test our parallel replanning algorithm in dynamic environments with ahigh number of moving obstacles (Figure 8). The obstacles in the environments change their veloci-ties periodically. However, this information is not known to the planner. The planner uses replanning

Page 13: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 13

(a) Planning environment withmoving obstacles (correspondingto red spheres), which have non-constant velocities (shown withblue arrows).

(b) Planning environment with fastmoving obstacles

(c) Planning environment withslow moving obstacles

Fig. 8: Planning environments used in replanning experiments. The planner uses the latest obstacleposition and velocity to estimate the local trajectory. (a) There are several obstacles that change theirvelocities during the planning phase. (b)(c) The obstacles in the environment have varying (high orlow) speeds, shown with arrows.

technique to reach the goal while avoiding collisions with the obstacles. During each step, the plan-ner uses conservative local bounds that are based on the positions and velocities of the obstacles. Weobserve a different level of responsiveness between CPU and GPU-based planners. When the obsta-cles move at a high speed, the CPU-based planner may not be responsive. Moreover, we measure thecost of computing the entire solution trajectory, including robot execution. The cost used in this ex-periment consists of two costs, obstacle cost and smoothness cost. We measure the cost with varyingnumber of optimized trajectories in order to measure the effect of parallelization. We run 10 trialson the planning problem shown in Figure 8. Figure 9(a) highlights the performance. As the numberof optimized trajectories increases, the overall cost of entire trajectory computation decreases. Thisresult validates that the multiple trajectory optimization improves the quality of solution, as shownin Section 5.2.

We also measure the trajectory cost with moving obstacles with two sets of varying speeds. Weuse the same environment and only change the speed of obstacles. The result is shown in Figure 9(b).We observe that higher speed obstacles result in trajectories with lower costs associated with them.Moreover, in both cases, multiple trajectory optimization improves the quality of the solution.

7 Conclusions and Future Work

We present a novel, parallel algorithm for real-time replanning in dynamic environments. The under-lying planner uses an optimization-based formulation and we parallelize the computation on many-core GPUs. Moreover, we derive bounds on how parallelization improves the responsiveness and thequality of the trajectory computed by our planner.

There are many avenues for future work. Our current formulation doesn’t take into account anyuncertainty or inaccuracies in sensor data. We would like to integrate our approach with a physicalrobot , model different constraints on the motion, and evaluate its performance in real-world scenar-ios. Furthermore, we would to investigate other parallel optimization techniques to further improvethe performance.

Page 14: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

14 Chonhyon Park and Jia Pan and Dinesh Manocha

0 2000 4000 6000 8000 10000 12000

10 trajectories

4 trajectories

1 trajectory

4 cores

2 cores

1 core

29.86945975

84.146586

655.8292117

5608.3699

3755.1631

11574.8409

Trajectory Cost

Multicore CPU Manycore GPU

(a) Trajectory costs from replanning in dynamic environments

0 5000 10000 15000 20000 25000 30000

10 trajectories

4 trajectories

1 trajectory

4 cores

2 cores

1 core

609.414226

1435.272056

2800.323414

7033.6449

7531.9449

3477.0548

4833.663606

4585.207959

9081.604228

4013.1612

25445.7268

7118.6579

Trajectory Cost

High Speed Low Speed

Multicore CPU Manycore GPU

(b) Trajectory costs during the replanning algorithm based on obstacles moving high and low speeds

Fig. 9: Results obtained from the replanning in dynamic environments on a multi-core CPU and amany-core GPU. (a) Trajectory cost is measured for each planner. (b) Trajectory cost is measuredfor each planner with different obstacle speeds. The use of multiple trajectories in our replanningalgorithm results in trajectories with lower costs and thereby, improved quality.

References

1. Amato, N., Dale, L.: Probabilistic roadmap methods are embarrassingly parallel. In: Proceedings of IEEE Inter-national Conference on Robotics and Automation, pp. 688–694 vol.1 (1999)

2. Belkhouche, F.: Reactive path planning in a dynamic environment. Robotics, IEEE Transactions on 25(4), 902–911 (2009)

3. van den Berg, J., Overmars, M.: Roadmap-based motion planning in dynamic environments. IEEE Transactionson Robotics 21(5), 885–897 (2005)

4. Bialkowski, J., Karaman, S., Frazzoli, E.: Massively parallelizing the RRT and the RRT*. In: IEEE/RSJ Interna-tional Conference on Intelligent Robots and Systems, pp. 3513–3518 (2011)

5. Blom, G.: Statistical Estimates and Transformed Beta Variables. John Wiley and Sons, Inc (1958)

Page 15: Real-time Optimization-based Planning in Dynamic ...gamma.cs.unc.edu/ITOMP/wafr12.pdfReal-time Optimization-based Planning in Dynamic Environments using GPUs 3 heuristic, which picks

Real-time Optimization-based Planning in Dynamic Environments using GPUs 15

6. Brock, O., Khatib, O.: Elastic strips: A framework for motion generation in human environments. InternationalJournal of Robotics Research 21(12), 1031–1052 (2002)

7. Chen, P., Hwang, Y.: Sandros: a dynamic graph search algorithm for motion planning. IEEE Transactions onRobotics and Automation 14(3), 390–403 (1998)

8. Devaurs, D., Simeon, T., Cortes, J.: Parallelizing RRT on distributed-memory architectures. In: Proceedings ofIEEE International Conference on Robotics and Automation, pp. 2261 –2266 (2011)

9. Dragan, A., Ratliff, N., Srinivasa, S.: Manipulation planning with goal sets using constrained trajectory optimiza-tion. In: Proceedings of IEEE International Conference on Robotics and Automation, pp. 4582–4588 (2011)

10. Fiorini, P., Shiller, Z.: Motion planning in dynamic environments using velocity obstacles. International Journalof Robotics Research 17(7), 760–772 (1998)

11. Gottschalk, S., Lin, M.C., Manocha, D.: Obbtree: a hierarchical structure for rapid interference detection. In:SIGGRAPH, pp. 171–180 (1996)

12. Hauser, K.: On responsiveness, safety, and completeness in real-time motion planning. Autonomous Robots32(1), 35–48 (2012)

13. Hoff, K., Culver, T., Keyser, J., Lin, M., Manocha, D.: Interactive motion planning using hardware acceleratedcomputation of generalized voronoi diagrams. In: International Conference on Robotics and Automation, pp.2931–2937 (2000)

14. Hsu, D., Kindel, R., Latombe, J.C., Rock, S.: Randomized kinodynamic motion planning with moving obstacles.International Journal of Robotics Research 21(3), 233–255 (2002)

15. Kalakrishnan, M., Chitta, S., Theodorou, E., Pastor, P., Schaal, S.: STOMP: Stochastic trajectory optimization formotion planning. In: Proceedings of IEEE International Conference on Robotics and Automation, pp. 4569–4574(2011)

16. Kider, J., Henderson, M., Likhachev, M., Safonova, A.: High-dimensional planning on the gpu. In: Proceedingsof IEEE International Conference on Robotics and Automation, pp. 2515–2522 (2010)

17. Koenig, S., Tovey, C., Smirnov, Y.: Performance bounds for planning in unknown terrain. Artificial Intelligence147(1-2), 253–279 (2003)

18. Kunz, T., Reiser, U., Stilman, M., Verl, A.: Real-time path planning for a robot arm in changing environments.In: Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on, pp. 5906 –5911 (2010)

19. Lee, S.H., Kim, J., Park, F., Kim, M., Bobrow, J.: Newton-type algorithms for dynamics-based robot movementoptimization. Robotics, IEEE Transactions on 21(4), 657–667 (2005)

20. Lengagne, S., Mathieu, P., Kheddar, A., Yoshida, E.: Generation of dynamic motions under continuous con-straints: Efficient computation using b-splines and taylor polynomials. In: IEEE/RSJ International Conference onIntelligent Robots and Systems, pp. 698–703 (2010)

21. Likhachev, M., Ferguson, D.: Planning long dynamically feasible maneuvers for autonomous vehicles. Interna-tional Journal of Robotics Research 28(8), 933–945 (2009)

22. Likhachev, M., Ferguson, D., Gordon, G., Stentz, A., Thrun, S.: Anytime dynamic A*: An anytime, replanningalgorithm. In: Proceedings of the International Conference on Automated Planning and Scheduling (2005)

23. Lozano-Perez, T., O’Donnell, P.: Parallel robot motion planning. In: International Conference on Robotics andAutomation, pp. 1000–1007 (1991)

24. Pan, J., Lauterbach, C., Manocha, D.: g-Planner: Real-time motion planning and global navigation using gpus.In: Proceedings of AAAI Conference on Artificial Intelligence (2010)

25. Park, C., Pan, J., Manocha, D.: ITOMP: Incremental trajectory optimization for real-time replanning in dynamicenvironments. In: Proceedings of the International Conference on Automated Planning and Scheduling, to appear(2012)

26. Petti, S., Fraichard, T.: Safe motion planning in dynamic environments. In: Proceedings of IEEE/RSJ InternationalConference on Intelligent Robots and Systems, pp. 2210–2215 (2005)

27. Phillips, M., Likhachev, M.: Planning in domains with cost function dependent actions. In: Proceedings of AAAIConference on Artificial Intelligence (2011)

28. Phillips, M., Likhachev, M.: SIPP: Safe interval path planning for dynamic environments. In: Proceedings ofIEEE International Conference on Robotics and Automation, pp. 5628–5635 (2011)

29. Pisula, C., Hoff, K., Lin, M.C., Manocha, D.: Randomized path planning for a rigid body based on hardwareaccelerated voronoi sampling. In: International Workshop on Algorithmic Foundation of Robotics, pp. 279–292(2000)

30. Plaku, E., Kavraki, L.: Distributed sampling-based roadmap of trees for large-scale motion planning. In: Pro-ceedings of IEEE International Conference on Robotics and Automation, pp. 3868–3873 (2005)

31. Quinlan, S., Khatib, O.: Elastic bands: connecting path planning and control. In: Proceedings of IEEE Interna-tional Conference on Robotics and Automation, pp. 802–807 vol.2 (1993)

32. Ratliff, N., Zucker, M., Bagnell, J.A.D., Srinivasa, S.: CHOMP: Gradient optimization techniques for efficientmotion planning. In: Proceedings of International Conference on Robotics and Automation, pp. 489–494 (2009)

33. Rinnooy Kan, A., Timmer, G.: Stochastic global optimization methods part i: Clustering methods. MathematicalProgramming 39, 27–56 (1987)

34. Wilkie, D., van den Berg, J.P., Manocha, D.: Generalized velocity obstacles. In: Proceedings of IEEE/RSJ Inter-national Conference on Intelligent Robots and Systems, pp. 5573–5578 (2009)