Receding Horizon Control For Autonomous Aerial Vehicles

Receding Horizon Control of Autonomous Aerial Vehicles 1

John Bellingham 2, Arthur Richards 3, and Jonathan P. How 4

Massachusetts Institute of Technology, Cambridge MA 02139

Abstract

This paper presents a new approach to trajectory optimization for autonomous fixed-wing aerial vehicles performing large-scale maneuvers. The main result is a planner which designs nearly minimum time planar trajectories to a goal, constrained by no-fly zones and the vehicle's maximum speed and turning rate. Mixed-Integer Linear Programming (MILP) is used for the optimization, and is well suited to trajectory optimization because it can incorporate logical constraints, such as no-fly zone avoidance, and continuous constraints, such as aircraft dynamics. MILP is applied over a receding planning horizon to reduce the computational effort of the planner and to incorporate feedback. In this approach, MILP is used to plan short trajectories that extend towards the goal, but do not necessarily reach it. The cost function accounts for decisions beyond the planning horizon by estimating the time to reach the goal from the plan's end point. This time is estimated by searching a graph representation of the environment. This approach is shown to avoid entrapment behind obstacles, to yield near-optimal performance when comparison with the minimum arrival time found using a fixed horizon controller is possible, and to work consistently on large trajectory optimization problems that are intractable for the fixed horizon controller.

1 Introduction

The capabilities and roles of Unmanned Aerial Vehicles (UAVs) are evolving, and require new concepts for their control. A significant aspect of this control problem is optimizing the long-range kinodynamically constrained trajectory from the UAV's starting point to its goal. This problem is complicated by the fact that the space of possible control actions is extremely large, and that simplifications that reduce its dimensionality while preserving feasibility and near optimality are challenging.

Two well-known methods that have been applied to this problem are Probabilistic Road Maps [1] (PRMs) and Rapidly-exploring Random Trees [2] (RRTs). These methods reduce the dimensionality of the problem by sampling the possible control actions, but the resulting trajectories are generally not optimal. This work presents a new method for optimizing kinodynamically constrained trajectories that is near-optimal (shown to be within 3% on average for several examples) and computationally tractable.

1 The research was funded in part under DARPA contract # N6601-01-C-8075 and Air Force grant # F49620-01-1-0453.

2 Research Assistant, john [email protected]

3 Research Assistant, [email protected]

4 Associate Professor, [email protected]

Mixed-Integer Linear Programming (MILP) is the key element of this new approach. MILP extends linear programming to include variables that are constrained to take on integer or binary values. These variables can be used to add logical constraints into the optimization problem [3, 4], such as obstacle and collision avoidance [5, 6, 7]. Recent improvements in combinatorial optimization and computer speed make MILP a feasible tool for trajectory optimization. To reduce the computation time required by MILP for large trajectories, it can be used in a receding horizon framework, such as the basic approach presented in Ref. [5].

In general, receding horizon control (also called Model Predictive Control) designs an input trajectory that optimizes the plant's output over a period of time, called the planning horizon. The input trajectory is implemented over the shorter execution horizon, and the optimization is performed again starting from the state that is reached. This re-planning incorporates feedback to account for disturbances and plant modeling errors. In this problem setting, computation can be saved by applying MILP in a receding horizon framework to design a series of short trajectories to the goal instead of one long trajectory, since the computation required to solve a MILP problem grows non-linearly with its size.

One approach to ensuring that the successively planned short trajectories actually reach the goal is to minimize some estimate of the cost to go from the plan's end, or terminal point, to the goal. However, it is not obvious how to find an accurate cost-to-go estimate without planning the trajectory all the way to the goal, increasing the computation required. The approach suggested in [5] used an estimate of the time-to-go from the endpoint of the plan to the goal that was not cognizant of obstacles in this interval. This terminal penalty led to the aircraft becoming trapped behind obstacles. Control Lyapunov Functions have been used successfully as terminal penalties in other problem settings [8], but these are also incompatible with the presence of obstacles in the environment. This paper investigates a cost-to-go function based on a modified version of Dijkstra's Algorithm applied to a visibility graph representation of the environment.

2 Fixed Horizon Minimum Time Controller

A minimum arrival time controller using MILP over a fixed planning horizon was presented in Ref. [7]. At time step $k$ it designs a series of control inputs $\{u(k+i) \in \mathbb{R}^2 : i = 0, 1, \ldots, L-1\}$, that give the trajectory $\{x(k+i) \in \mathbb{R}^2 : i = 1, 2, \ldots, L\}$. Constraints are added to specify that one of the $L$ trajectory points $x(k+i) = [x_{k+i,1} \;\; x_{k+i,2}]^T$ must equal the goal $x_{goal}$. The optimization minimizes the time along this trajectory at which the goal is reached, using $L$ binary decision variables $b_{goal} \in \{0, 1\}$ as

$$\min_{u(\cdot)} \phi_1(b_{goal}, t) = \sum_{i=1}^{L} b_{goal,i}\, t_i \tag{1}$$

subject to

$$\begin{aligned}
x_{k+i,1} - x_{goal,1} &\le R\,(1 - b_{goal,i}) \\
x_{k+i,1} - x_{goal,1} &\ge -R\,(1 - b_{goal,i}) \\
x_{k+i,2} - x_{goal,2} &\le R\,(1 - b_{goal,i}) \\
x_{k+i,2} - x_{goal,2} &\ge -R\,(1 - b_{goal,i})
\end{aligned} \tag{2}$$

$$\sum_{i=1}^{L} b_{goal,i} = 1 \tag{3}$$

where $R$ is a large positive number, and $t_i$ is the time at which the trajectory point $x(k+i)$ is reached. When the binary variable $b_{goal,i}$ is 0, it relaxes the arrival constraint in Eqn. 2. Eqn. 3 ensures that the arrival constraint is enforced once.
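To make the role of the binaries in Eqns. 1 to 3 concrete, the following minimal sketch checks the big-$R$ arrival constraints for a candidate trajectory and evaluates the arrival-time cost. The function names, the default value of `R`, and the tolerance are assumptions made for illustration; this is a checker, not the MILP itself.

```python
def arrival_constraints_hold(traj, goal, b_goal, R=1e4, tol=1e-9):
    """Check Eqns. 2 and 3 for trajectory points traj[i] = (x1, x2)."""
    if sum(b_goal) != 1:  # Eqn. 3: the arrival constraint is enforced exactly once
        return False
    for (x1, x2), b in zip(traj, b_goal):
        slack = R * (1 - b) + tol  # b = 0 relaxes the constraint, b = 1 enforces it
        # Each absolute-value test stands for one <= / >= pair in Eqn. 2.
        if abs(x1 - goal[0]) > slack or abs(x2 - goal[1]) > slack:
            return False
    return True


def arrival_time(b_goal, times):
    """Cost of Eqn. 1: the time t_i of the step selected by b_goal."""
    return sum(b * t for b, t in zip(b_goal, times))


traj = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
goal = (2.0, 0.0)
print(arrival_constraints_hold(traj, goal, [0, 1, 0]))  # second point is at the goal
print(arrival_time([0, 1, 0], [1.0, 2.0, 3.0]))
```

Selecting any step whose point is not at the goal violates Eqn. 2, so a solver minimizing Eqn. 1 is pushed towards trajectories that reach the goal as early as the dynamics allow.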

To include collision avoidance in the optimization, constraints are added to ensure that none of the $L$ trajectory points penetrate any obstacles. Rectangular obstacles are used in this formulation, and are described by their lower left corner $(u_{low}, v_{low})$ and upper right corner $(u_{high}, v_{high})$. To avoid collisions, the following constraints must be satisfied by each trajectory point

$$\begin{aligned}
x_{k+i,1} &\le u_{low} + R\, b_{in,1} \\
x_{k+i,1} &\ge u_{high} - R\, b_{in,2} \\
x_{k+i,2} &\le v_{low} + R\, b_{in,3} \\
x_{k+i,2} &\ge v_{high} - R\, b_{in,4}
\end{aligned} \tag{4}$$

$$\sum_{j=1}^{4} b_{in,j} \le 3 \tag{5}$$

The $j$th constraint is relaxed if $b_{in,j} = 1$, and enforced if $b_{in,j} = 0$. Eqn. 5 ensures that at least one constraint in Eqn. 4 is active for the trajectory point. These constraints are applied to all trajectory points $\{x(k+i) : i = 1, 2, \ldots, L\}$. Note that the obstacle avoidance constraints are not applied between the trajectory points for this discrete-time system, so small incursions into obstacles are possible. As a result, the obstacle regions in the optimization must be slightly larger than the real obstacles to allow for this margin.
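The logic of Eqns. 4 and 5 can be illustrated by asking, for a single point and a single rectangle, which binaries would have to be set to 1. The sketch below is an illustration only: the function name and the tuple layout `(u_low, v_low, u_high, v_high)` are assumptions, and $R$ is taken to be large enough that any relaxed constraint holds.

```python
def avoidance_binaries(point, rect):
    """Return b_in = [b1, b2, b3, b4] satisfying Eqns. 4 and 5, or None if impossible."""
    x1, x2 = point
    u_low, v_low, u_high, v_high = rect
    # A constraint in Eqn. 4 can stay enforced (b = 0) only if the point satisfies it.
    satisfied = [x1 <= u_low, x1 >= u_high, x2 <= v_low, x2 >= v_high]
    b_in = [0 if ok else 1 for ok in satisfied]
    if sum(b_in) > 3:  # Eqn. 5 violated: all four sides relaxed, point is inside
        return None
    return b_in


rect = (0.0, 0.0, 2.0, 2.0)
print(avoidance_binaries((-1.0, 1.0), rect))  # left of the obstacle
print(avoidance_binaries((1.0, 1.0), rect))   # inside the obstacle
```

A point strictly inside the rectangle fails all four inequalities, so all four binaries would need to be 1, which Eqn. 5 forbids.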

The trajectory is also constrained by discretized dynamics, which model a fixed-wing aircraft with limited speed and turning rate. The latter is represented by a limit on the magnitude of the turning force $u(k+i)$ that can be applied [7].

$$\begin{bmatrix} x(k+i+1) \\ \dot{x}(k+i+1) \end{bmatrix} = A \begin{bmatrix} x(k+i) \\ \dot{x}(k+i) \end{bmatrix} + B\, u(k+i) \tag{6}$$

subject to the linear approximations of velocity and force magnitude limits

$$a(\dot{x}(k+i)) \le \dot{x}_{\max} \tag{7}$$

$$a(u(k+i)) \le u_{\max} \tag{8}$$

The constraints in Eqns. 7 and 8 make use of an approximation $a(r)$ of the length of a vector $r = (x_1, x_2)$

$$a(r) = s : \quad s \ge L_1 x_1 + L_2 x_2, \quad \forall (L_1, L_2) \in \mathcal{L} \tag{9}$$

where $\mathcal{L}$ is a finite set of unit vectors whose directions are distributed from 0° to 360°.

This formulation finds the minimum arrival time trajectory. Experience has shown that the computational effort required to solve this optimization problem can grow quickly and unevenly with the product of the length of the trajectory to be planned and the number of obstacles to be avoided [5, 7]. However, as discussed in the following sections, a receding horizon approach can be used to design large-scale trajectories.
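The vector-length approximation of Eqn. 9 can be evaluated directly: the smallest $s$ that satisfies every inequality is the largest projection of $r$ onto the unit vectors in the set. The sketch below assumes evenly spaced directions and a hypothetical parameter `n_dirs` for their number, which the text does not specify.

```python
import math


def approx_length(r, n_dirs=16):
    """Smallest s with s >= L1*x1 + L2*x2 for n_dirs evenly spaced unit vectors."""
    x1, x2 = r
    best = -math.inf
    for m in range(n_dirs):
        theta = 2.0 * math.pi * m / n_dirs
        best = max(best, math.cos(theta) * x1 + math.sin(theta) * x2)
    return best


r = (3.0, 4.0)
print(approx_length(r), math.hypot(*r))
```

With evenly spaced directions the approximation never exceeds the true length and is no smaller than the true length times $\cos(\pi / n)$ for $n$ directions, so a speed limit imposed through $a(\cdot)$ is slightly looser than the exact limit.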

3 Simple Terminal Cost Formulation

In order to reduce the computational effort required, MILP has been applied within a receding horizon framework. To enable a more direct comparison of the effects of the terminal penalty, the following provides a brief outline of the receding horizon approach suggested in Ref. [5]. The MILP trajectory optimization is repeatedly applied over a moving time-window of length $H_p$. The result is a series of control inputs $\{u(k+i) \in \mathbb{R}^2 : i = 0, 1, \ldots, H_p - 1\}$, that give the trajectory $\{x(k+i) \in \mathbb{R}^2 : i = 1, 2, \ldots, H_p\}$. The first part of this input trajectory, of length $H_e < H_p$, is executed before a new trajectory is planned. The cost function of this optimization is the terminal penalty $\phi_2(x(k+H_p))$, which finds the 1-norm of the distance between the trajectory's end point and the goal. The formulation is piecewise-linear and can be included in a MILP using slack variables as

$$\min_{u(\cdot)} \phi_2(x(k+H_p)) = b(x_{goal} - x(k+H_p)) \tag{10}$$

where $b(r)$ evaluates the sum of the absolute values of the components of $r$. Obstacle avoidance and dynamics constraints are also added.

[Figure: $\phi_2$ as terminal penalty]

Fig. 1: Starting point at left, goal at right. Circles show trajectory points. Receding horizon controller using simple terminal penalty $\phi_2$ and $H_p = 12$ becomes entrapped and fails to reach the goal.

This formulation is equivalent to the fixed horizon controller when the horizon length is just long enough to reach the goal. However, when the horizon length does not reach the goal, the optimization minimizes the approximate distance between the trajectory's terminal point and the goal. This choice of terminal penalty can prevent the aircraft from reaching the goal when the approximation does not reflect the length of a flyable path. This occurs if the line connecting $x(k+H_p)$ and the goal penetrates obstacles. This problem is especially apparent when the path encounters a concave obstacle, as shown in Fig. 1. If the terminal point that minimizes the distance to the goal is within the concavity behind an obstacle, then the controller will plan a trajectory into the concavity. If the path out of the concavity requires a temporary increase in the 1-norm distance to the goal, the aircraft can become trapped behind the obstacle. This is comparable to the entrapment in local minima that is possible using potential field methods.
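The slack-variable form of the 1-norm penalty in Eqn. 10 is not written out in the text. One standard encoding, sketched here with slack variables $s_1, s_2$ (names introduced for illustration, not taken from the paper), is:

```latex
\begin{aligned}
\min_{u(\cdot),\, s} \quad & s_1 + s_2 \\
\text{subject to} \quad & s_j \ge x_{goal,j} - x_{k+H_p,j}, \\
& s_j \ge -\left( x_{goal,j} - x_{k+H_p,j} \right), \qquad j = 1, 2
\end{aligned}
```

Because the objective pushes each $s_j$ down onto the larger of its two lower bounds, at the optimum $s_j = |x_{goal,j} - x_{k+H_p,j}|$ and the objective equals $b(x_{goal} - x(k+H_p))$ using only linear constraints.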

4 Improved Receding Horizon Control Strategy

This section presents a novel method for approximating the time-to-go and shows how this can be implemented in a MILP program, using only linear and binary variables. This approach avoids the difficulties associated with nonlinear programming, such as choosing a suitable initial guess for the optimization.

4.1 Control Architecture

The control strategy is comprised of a cost estimation phase and a trajectory design phase. The cost estimation phase computes a compact cost map of the approximate minimum time-to-go from a limited set of points to the goal. The cost estimation phase is performed once for a given obstacle field and position of the goal, and would be repeated if the environment changes. The trajectory designer uses this cost map information in the terminal penalties of the receding horizon optimization to design a series of short trajectory segments that are followed until the goal is reached. This division of computation between the cost estimation and trajectory design phases enables the trajectory optimization to use only linear relationships.

Fig. 2: Resolution Levels of the Planning Algorithm

An example of a result that would be expected from the trajectory design phase is shown schematically in Fig. 2. In this phase, a trajectory consistent with discretized aircraft dynamics is designed from $x(k)$ over a fine resolution planning horizon with length $H_p$ steps. The trajectory is optimized using MILP to minimize the cost function assessed at the plan's terminal point $x(k+H_p)$. This cost estimates the time to reach the goal from this point as the time to move from $x(k+H_p)$ to a visible point $x_{vis}$, whose cost-to-go was previously estimated, plus the cost-to-go estimate $C_{vis}$ for $x_{vis}$. As described in Section 4.2, $C_{vis}$ is estimated using a coarser model of the aircraft dynamics that can be evaluated very quickly. Only the first $H_e$ steps are executed before a new plan is formed starting from $x(k+H_e)$.
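The plan-execute-replan cycle described above can be sketched as a short loop. This is a structural illustration only: `plan` and `at_goal` are hypothetical callables standing in for the MILP trajectory designer and the arrival test, and the toy planner in the usage lines is not the paper's optimization.

```python
def receding_horizon(x0, plan, at_goal, h_e, max_iters=1000):
    """Repeatedly plan a short segment, execute its first h_e steps, and replan.

    plan(x) must return the list of planned states that follow state x.
    """
    executed = [x0]
    x = x0
    for _ in range(max_iters):
        if at_goal(x):
            return executed
        segment = plan(x)            # trajectory design phase over the planning horizon
        for state in segment[:h_e]:  # only the first h_e steps are executed
            executed.append(state)
            x = state
            if at_goal(x):
                return executed
    raise RuntimeError("goal not reached within max_iters plans")


# Toy 1-D usage: the "planner" steps towards 10 with a horizon of 4 steps.
def toy_plan(x):
    return [min(x + i, 10) for i in range(1, 5)]


path = receding_horizon(0, toy_plan, lambda x: x == 10, h_e=2)
print(path)
```

Because each plan starts from the state actually reached, any disturbance during execution is absorbed at the next replanning step.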

The use of two dynamics models with different levels of resolution exploits the trajectory planning problem's structure. On a long time-scale, a successful controller need only decide which combination of obstacle gaps to pass through in order to take the shortest dynamically feasible path. However, on a short time-scale, a successful controller must plan the dynamically feasible time-optimal route around the nearby obstacles to pass through the chosen gaps. The different resolution levels of the receding horizon controller described above allow it to make decisions on these two levels, without performing additional computation to over-plan the trajectory to an unnecessary level of detail.

The cost estimation is performed in MATLAB. It produces a data file containing the cost map in the AMPL language, and an AMPL model file specifies the form of the cost function and constraints. The CPLEX optimization program is used to solve the MILP problem and outputs the resulting input and position trajectory. MATLAB is used to simulate the execution of this trajectory up to $x(k+H_e)$, which leads to a new trajectory optimization problem with an updated starting point.

4.2 Computation of Cost Map

The shortest path around a set of polygonal obstacles to a goal, without regard for dynamics, is a series of joined line segments that connect the starting point, possibly obstacle vertices, and the goal. To find this path, a visibility graph can be formed whose nodes represent these points. Edges are added between pairs of nodes if the points they represent can be connected by a line that does not penetrate any obstacles. The visibility graph is searched using Dijkstra's Single Source Shortest Path Algorithm [9], starting at the goal, to find the shortest path from each node of the graph to the goal, and the corresponding distances.
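A minimal sketch of this step for axis-aligned rectangular obstacles follows. It is an illustration under assumptions rather than the authors' MATLAB code: rectangles are `(u_low, v_low, u_high, v_high)` tuples, a segment counts as blocked only if it passes through a rectangle's interior (so edges may run along obstacle sides), and all function names are hypothetical.

```python
import heapq
import math


def crosses_interior(p, q, rect, eps=1e-9):
    """True if segment p-q passes through the open interior of rect."""
    u_low, v_low, u_high, v_high = rect
    t_in, t_out = 0.0, 1.0  # parameter interval of the segment inside the rectangle
    for p_c, q_c, lo, hi in ((p[0], q[0], u_low, u_high), (p[1], q[1], v_low, v_high)):
        d = q_c - p_c
        if abs(d) < eps:  # segment parallel to this axis
            if not (lo + eps < p_c < hi - eps):
                return False
        else:
            t0, t1 = (lo - p_c) / d, (hi - p_c) / d
            t_in = max(t_in, min(t0, t1))
            t_out = min(t_out, max(t0, t1))
    return t_in + eps < t_out


def visibility_graph(points, rects):
    """Adjacency lists of (neighbor, distance) for mutually visible points."""
    edges = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if not any(crosses_interior(points[i], points[j], r) for r in rects):
                d = math.dist(points[i], points[j])
                edges[i].append((j, d))
                edges[j].append((i, d))
    return edges


def dijkstra(edges, source):
    """Shortest distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


rect = (1.0, -1.0, 2.0, 1.0)
# Node 0 is the goal, node 1 the start, nodes 2-5 the obstacle vertices.
points = [(3.0, 0.0), (0.0, 0.0), (1.0, -1.0), (2.0, -1.0), (2.0, 1.0), (1.0, 1.0)]
dist_to_goal = dijkstra(visibility_graph(points, [rect]), source=0)
print(dist_to_goal[1])
```

Searching from the goal outwards yields the distance-to-go for every node in a single pass, which is what the cost map requires.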

This work is concerned with minimum arrival time paths, not minimum distance paths. It is possible that the fastest path is not necessarily the shortest: longer paths can be faster if they require fewer sharp turns because the aircraft must slow down to turn sharply. This effect can be approximately accounted for by modifying Dijkstra's Algorithm. When Dijkstra's Algorithm examines an edge to see if it should be added to the tree of shortest paths being grown from the goal outwards, its length is divided by $\dot{x}_{\max}$, and a penalty is added that is proportional to the change in direction between the new edge and the next edge towards the goal in the tree. This approximately compensates for the reduction in aircraft speed necessary to make the turn between the two associated legs. With this modification, Dijkstra's Algorithm finds the approximate minimum time to reach the goal from each graph node.
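The modification can be sketched as follows. The text does not give the proportionality constant of the turn penalty, so `k_turn` (seconds per radian) is an assumed parameter, as are the function names and the adjacency-list input format; the heading change is measured with `math.atan2`.

```python
import heapq
import math


def turn_angle(a, b, c):
    """Absolute heading change in radians when flying a -> b -> c."""
    h1 = math.atan2(b[1] - a[1], b[0] - a[0])
    h2 = math.atan2(c[1] - b[1], c[0] - b[0])
    d = abs(h2 - h1) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def time_to_go(points, neighbors, goal, v_max, k_turn):
    """Approximate minimum time from each node to goal, grown outwards from goal.

    neighbors[u] lists the nodes visible from u. parent[u] is the next node
    towards the goal in the tree.
    """
    cost = {goal: 0.0}
    parent = {goal: None}
    done = set()
    heap = [(0.0, goal)]
    while heap:
        c, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in neighbors[u]:
            if v in done:
                continue
            new = c + math.dist(points[v], points[u]) / v_max  # edge length / max speed
            if parent[u] is not None:
                # Penalty for the turn between new edge v->u and tree edge u->parent[u].
                new += k_turn * turn_angle(points[v], points[u], points[parent[u]])
            if new < cost.get(v, math.inf):
                cost[v] = new
                parent[v] = u
                heapq.heappush(heap, (new, v))
    return cost, parent


points = [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 1.0)]  # node 0 is the goal
neighbors = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
cost, parent = time_to_go(points, neighbors, goal=0, v_max=1.0, k_turn=1.0)
print(cost[2], cost[3])
```

In this small example nodes 2 and 3 are the same distance from the goal, but node 3 incurs a 90 degree turn at node 1 and therefore receives the larger time estimate.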

In order to illustrate how the resulting cost map accounts for obstacles, their contribution to the cost is isolated in Fig. 3. To produce this graph, cost values were found over a fine grid of points in two fields of equal size, one with obstacles, and one without$^1$. The two sets of cost values were subtracted to remove the contribution of straight line distance to costs in the obstacle field. Areas of larger difference are shown in Fig. 3 by darker shading. Note that the cost is increasing into the concave obstacle. This increase is crucial to avoiding the entrapment shown in Fig. 1.

1 Cost values need not be found over such a fine grid to plan trajectories successfully. Since optimal large-scale trajectories tend to connect the starting point, obstacle vertices, and the goal, costs need only be found at these points. Many extra grid points are added here to more clearly demonstrate the trend in cost values.

Fig. 3: Difference between actual cost at various points in an obstacle field and cost in same region with no obstacles, showing the effects of obstacles on cost values. Goal is at center right.

4.3 Modified MILP Problem

The results of the cost estimation phase are provided to the trajectory design phase as pairs of a position where the approximate cost-to-go is known and the cost at that point $(x_{cost,j}, C_j)$. This new formulation includes a significantly different terminal cost that is a function of $x(k+H_p)$, and $(x_{vis}, C_{vis})$, a pair from the cost estimation phase. The optimization seeks to minimize the time to fly from $x(k+H_p)$ to the goal by choosing $x(k+H_p)$ and the pair $(x_{vis}, C_{vis})$ that minimize the time to fly from $x(k+H_p)$ to $x_{vis}$, plus the estimated time to fly from $x_{vis}$ to $x_{goal}$, $C_{vis}$,

$$\min_{u(\cdot)} \phi_3(x(k+H_p), x_{vis}, C_{vis}) = \frac{a(x_{vis} - x(k+H_p))}{v_{\max}} + C_{vis} \tag{11}$$

A key element in the algorithm is that the optimization is not free to choose $x(k+H_p)$ and $x_{vis}$ independently. Instead, $x_{vis}$ is constrained to be visible from $x(k+H_p)$. Note that visibility constraints are, in general, nonlinear because they involve checking whether every point along a line is outside of all obstacles. Because these nonlinear constraints cannot be included in a MILP problem, they are approximated by constraining a discrete set of interpolating points between $x(k+H_p)$ and $x_{vis}$ to lie outside of all obstacles. These interpolating points are a portion $\tau$ of the distance along the line-of-sight between $x(k+H_p)$ and $x_{vis}$

$$\forall \tau \in T : \; \left[ x(k+H_p) + \tau \, (x_{vis} - x(k+H_p)) \right] \notin X_{obst} \tag{12}$$

where $T \subset [0, 1]$ is a discrete set of interpolation distances and $X_{obst}$ is the obstacle space.

[Figure: $\phi_3$ as terminal penalty]

Fig. 4: Trajectories designed using receding horizon controller with $\phi_3$ terminal penalty avoid entrapment. Trajectories start at left and at center, goal is at right. Circles show trajectory points. $H_p = 12$.

The visibility constraint ensures that the length of the line between $x(k+H_p)$ and $x_{vis}$ is a good estimate of the length of a flyable path between them, and ultimately that the terminal penalty is a good estimate of the time to reach the goal from $x(k+H_p)$. The interpolating points are constrained to lie outside obstacles in the same way that the trajectory points are constrained to lie outside obstacles in the previous formulations (see Eqns. 4 and 5), so it is possible that portions of the line-of-sight between interpolating points penetrate obstacles. However, if a sufficient number of interpolating points is used, the line-of-sight will only be able to cut corners of the obstacles. In this case, the extra time required to fly around the corner is small, and the accuracy of the terminal penalty is not seriously affected.
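The sampled visibility test of Eqn. 12, and the corner cutting it permits, can be illustrated with a few lines. The function names and the rectangle layout `(u_low, v_low, u_high, v_high)` are assumptions for this sketch; in the MILP each sampled point would instead carry its own copy of the constraints in Eqns. 4 and 5.

```python
def inside(point, rect):
    """True if point lies strictly inside the rectangle."""
    u_low, v_low, u_high, v_high = rect
    return u_low < point[0] < u_high and v_low < point[1] < v_high


def visible_sampled(x_end, x_vis, rects, taus):
    """Approximate visibility: every interpolating point must lie outside all obstacles."""
    for tau in taus:
        p = (x_end[0] + tau * (x_vis[0] - x_end[0]),
             x_end[1] + tau * (x_vis[1] - x_end[1]))
        if any(inside(p, r) for r in rects):
            return False
    return True


rect = (1.0, -1.0, 2.0, 1.0)
x_end, x_vis = (0.9, 0.5), (1.3, 1.3)  # this line clips the upper-left corner of rect
print(visible_sampled(x_end, x_vis, [rect], taus=[0.0, 1.0]))       # endpoints only
print(visible_sampled(x_end, x_vis, [rect], taus=[0.0, 0.5, 1.0]))  # midpoint added
```

With only the endpoints sampled the corner cut goes undetected, while adding the midpoint catches it, which mirrors the trade-off between the number of interpolating points and the accuracy of the terminal penalty.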

The values of the position $x_{vis}$ and cost $C_{vis}$ are evaluated using the binary variables $b_{cost} \in \{0, 1\}$ and the $n$ points on the cost map as

$$x_{vis} = \sum_{j=1}^{n} b_{cost,j}\, x_{cost,j} \tag{13}$$

$$C_{vis} = \sum_{j=1}^{n} b_{cost,j}\, C_j \tag{14}$$

$$\sum_{j=1}^{n} b_{cost,j} = 1 \tag{15}$$

Obstacle avoidance constraints (Eqns. 4 and 5) are enforced without modification at $\{x(k+i) : i = 1, 2, \ldots, H_p\}$. The dynamics model (Eqn. 6), the velocity limit (Eqn. 7), and the control force limit (Eqn. 8) are also enforced in this formulation. This provides a completely linear receding horizon formulation of the trajectory design problem.
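For a fixed terminal point, the choice encoded by $b_{cost}$ in Eqns. 13 to 15 amounts to picking the one visible cost-map point that minimizes Eqn. 11. The brute-force sketch below illustrates that selection outside the MILP. It is an assumption-laden simplification: it uses the exact Euclidean length in place of $a(\cdot)$, and `is_visible` is a hypothetical callable supplied by the caller.

```python
import math


def terminal_penalty(x_end, cost_map, v_max, is_visible):
    """Return (phi_3 value, one-hot b_cost) for the best visible cost-map point.

    cost_map is a list of (x_cost_j, C_j) pairs. Returns None if no point is visible.
    """
    best_value, best_j = None, None
    for j, (x_cost, c) in enumerate(cost_map):
        if not is_visible(x_end, x_cost):
            continue
        value = math.dist(x_end, x_cost) / v_max + c  # time to x_vis plus C_vis
        if best_value is None or value < best_value:
            best_value, best_j = value, j
    if best_j is None:
        return None
    b_cost = [1 if j == best_j else 0 for j in range(len(cost_map))]  # Eqn. 15
    return best_value, b_cost


cost_map = [((3.0, 0.0), 0.0), ((1.0, 1.0), 2.414), ((1.0, -1.0), 5.0)]
# Toy visibility rule for the example: the goal itself is hidden from x_end.
result = terminal_penalty((0.0, 0.0), cost_map, 1.0, lambda a, b: b != (3.0, 0.0))
print(result)
```

In the MILP the same selection is made jointly with the choice of $x(k+H_p)$, which is why it must be expressed through binaries rather than a separate search.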

5 Results

The following examples demonstrate that the new receding horizon control strategy provides trajectories that are close to time-optimal and avoid entrapment, while maintaining computational tractability.

5.1 Avoidance of Entrapment

In order to test the performance of the improved cost penalty around concave obstacles, the improved terminal penalty $\phi_3$ was applied to the obstacle field considered in Section 3, and the resulting trajectories are shown in Fig. 4. The new cost function captures the difference between the distance to the goal and the length of a feasible path to the goal, which allows the receding horizon controller to plan trajectories that consistently reach the goal.

Fig. 5: The Effects of Plan Length. Increase in arrival time from optimal to that found by receding horizon controller is plotted with a solid line. Average total computation time is plotted with a dashed line. (Horizontal axis: Number of Steps per Plan. Vertical axes: Increase in plan length from the optimal plan to the receding horizon controller (%); Average total computation time (s).)

5.2 Optimality

The computational effort required by the receding horizon control strategy is significantly less than that of the fixed horizon controller because its planning horizon is much shorter. However, for the same reason, global optimality is not guaranteed. To examine this trade-off, a set of random obstacle fields was created, and a trajectory to the goal was planned using both the receding and fixed horizon controllers. The receding horizon controller was applied several times to each problem, each time with a longer planning horizon. The results are shown in Fig. 5. The extra number of time steps in the receding horizon controller's trajectory is plotted as a percentage of the minimum number of steps found using the fixed horizon controller, averaged over several obstacle fields. The plot shows that, on average, the receding horizon controller is within 3% of the optimum for a planning horizon longer than 7 time steps. The average total computation time for the receding horizon controller is also plotted, showing that the increase in computation time is roughly linear with plan length.

5.3 Computation Savings

The effects of problem complexity on computation time were also examined by timing the fixed and receding horizon controllers' computation on a 1 GHz PIII computer. The complexity of a MILP problem is related to its number of binary variables, which grows with the product of the number of obstacles and number of steps to be planned.

Fig. 6: Cumulative Computation Time vs. Complexity

Fig. 7: Sample Long Trajectory Designed using Receding Horizon Controller. Executed trajectory (plan plus velocity disturbance) shown with thick line, planned trajectory segments shown with thin lines.

A series of obstacle fields was created with several different values of this complexity metric, and a trajectory to the goal was planned using both the receding and fixed horizon controllers. The optimization of each plan was aborted if the optimum was not found in 600 seconds. The results are shown in Fig. 6, which gives the average cumulative computation time for the receding horizon controller, and the median computation time for the fixed horizon controller. The median computation time for the fixed horizon controller was over 600 seconds for all complexity levels over 1628. At several complexity levels for which its median computation time was below 600 seconds, the fixed horizon controller also failed to complete plans. The cumulative time required by the receding horizon controller to design all the trajectory segments to the goal was less than this time limit for every problem. All but the first of its plans can be computed during execution of the previous, so the aircraft can begin moving towards the goal much sooner.
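The link between problem size and the number of binary variables can be made concrete by counting the binaries that appear in the formulations above. The counts below are one reading of Eqns. 1 to 5 and 12 to 15, not a formula given in the paper, and the function and parameter names are assumptions.

```python
def fixed_horizon_binaries(n_obstacles, n_steps):
    """Binaries in the fixed horizon problem of Section 2."""
    goal = n_steps                     # b_goal,i: one per trajectory point (Eqns. 1-3)
    avoid = 4 * n_obstacles * n_steps  # b_in,j: four per obstacle per point (Eqns. 4-5)
    return goal + avoid


def receding_horizon_binaries(n_obstacles, h_p, n_interp, n_cost):
    """Binaries in one receding horizon plan of Section 4.3."""
    # Trajectory points and interpolating points share the same avoidance constraints.
    avoid = 4 * n_obstacles * (h_p + n_interp)
    return avoid + n_cost              # b_cost,j: one per cost-map point (Eqns. 13-15)


print(fixed_horizon_binaries(n_obstacles=10, n_steps=50))
print(receding_horizon_binaries(n_obstacles=10, h_p=12, n_interp=3, n_cost=42))
```

Under this reading the fixed horizon count grows with the full trajectory length, while each receding horizon plan depends only on the short horizon, the number of interpolating points, and the size of the cost map.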

Next, an extremely large problem, with a complexity of 6636, was attempted with the receding horizon controller. It successfully designed a trajectory that reached the goal in 316 time steps, in a cumulative computation time of 313.2 seconds. This controller took 2.97 seconds on average to design one trajectory segment. The fixed horizon controller could not solve this problem in 1200 seconds of computation time. A trajectory through the same obstacle field was also planned in the presence of velocity disturbances, causing the followed trajectory to differ significantly from each of the planned trajectory segments. By designing each trajectory segment from the state that is actually reached, the receding horizon controller compensates for the disturbance. The executed trajectory and planned trajectory segments are shown in Fig. 7.

6 Conclusions

This paper presents a new algorithm for designing long-range kinodynamically constrained trajectories for fixed-wing UAVs. It is based on MILP optimization within a receding horizon control framework. A novel terminal penalty for the receding horizon optimization is computed by finding a cost map for the environment, and connecting the aircraft trajectory over the planning horizon to the cost map. The resulting MILP problem can be solved with commercially available optimization software. Simulation results show that: (a) the receding horizon controller plans trajectories whose arrival times are within 3% of optimal; (b) the controller can successfully solve complex trajectory planning problems in practical computation times; and (c) the controller avoids entrapment behind obstacles.

References

[1] Kavraki, L. E., Lamiraux, F., and Holleman, C., "Probabilistic Roadmaps for Path Planning in High-Dimensional Configuration Spaces," IEEE Transactions on Robotics and Automation, 12(4):566-580.

[2] LaValle, S. M. and Kuffner, J. J., "Randomized Kinodynamic Planning," IEEE International Conference on Robotics and Automation, July, 1999.

[3] Floudas, C. A., "Nonlinear and Mixed-Integer Programming: Fundamentals and Applications," Oxford University Press, 1995.

[4] Williams, H. P., and Brailsford, S. C., "Computational Logic and Integer Programming," in Advances in Linear and Integer Programming, Editor J. E. Beasley, Clarendon Press, Oxford, 1996, pp. 249-281.

[5] Schouwenaars, T., De Moor, B., Feron, E. and How, J., "Mixed Integer Programming for Multi-Vehicle Path Planning," ECC, 2001.

[6] Richards, A., How, J., Schouwenaars, T., Feron, E., "Plume Avoidance Maneuver Planning Using Mixed Integer Linear Programming," AIAA GN&C Conf., 2001.

[7] Richards, A. and How, J., "Aircraft Trajectory Planning With Collision Avoidance Using Mixed Integer Linear Programming," to appear at ACC, 2002.

[8] Jadbabaie, A., Primbs, J. and Hauser, J., "Unconstrained receding horizon control with no terminal cost," presented at the ACC, Arlington VA, June 2001.

[9] Cormen, T. H., Leiserson, C. E., Rivest, R. L., Introduction to Algorithms, MIT Press, 1990.
