Speeding Up the Calculation of Heuristics for Heuristic Search-Based Planning

Yaxin Liu, Sven Koenig and David Furcy
College of Computing
Georgia Institute of Technology
Atlanta, GA 30312-0280
{yxliu, skoenig, dfurcy}@cc.gatech.edu
Abstract
Heuristic search-based planners, such as HSP 2.0, solve STRIPS-style planning problems efficiently but spend about eighty percent of their planning time on calculating the heuristic values. In this paper, we systematically evaluate alternative methods for calculating the heuristic values for HSP 2.0 and demonstrate that the resulting planning times differ substantially. HSP 2.0 calculates each heuristic value by solving a relaxed planning problem with a dynamic programming method similar to value iteration. We identify two different approaches for speeding up the calculation of heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values. We then show how these two approaches can be combined, resulting in our PINCH method. PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty percent in several domains and, in general, the amount of savings grows with the size of the domains, allowing HSP 2.0 to solve larger planning problems than was possible before in the same amount of time and without changing its overall operation.
Introduction

Heuristic search-based planners were introduced by (McDermott 1996) and (Bonet, Loerincs, & Geffner 1997) and are now very popular. Several of them entered the second planning competition at AIPS-2000, including HSP 2.0 (Bonet & Geffner 2001a), FF (Hoffmann & Nebel 2001a), GRT (Refanidis & Vlahavas 2001), and AltAlt (Nguyen, Kambhampati, & Nigenda 2002). Heuristic search-based planners perform a heuristic forward or backward search in the space of world states to find a path from the start state to a goal state. In this paper, we study HSP 2.0, a prominent heuristic search-based planner that won one of four honorable mentions for overall exceptional performance at the AIPS-2000 planning competition. It was one of the first planners that demonstrated how one can obtain informed heuristic values for STRIPS-style planning problems to make planning tractable. In its default configuration, it uses weighted A* searches with inadmissible heuristic values to perform forward searches in the space of world
Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
states. However, the calculation of heuristic values is time-consuming since HSP 2.0 calculates the heuristic value of each state that it encounters during the search by solving a relaxed planning problem. Consequently, its calculation of heuristic values comprises about 80 percent of its planning time (Bonet & Geffner 2001b). This suggests that one might be able to speed up its planning time by speeding up its calculation of heuristic values. Some heuristic search-based planners, for example, remove irrelevant operators before calculating the heuristic values (Hoffmann & Nebel 2001b), and other planners cache information obtained in a preprocessing phase to simplify the calculation of all heuristic values for a given planning problem (Bonet & Geffner 1999; Refanidis & Vlahavas 2001; Edelkamp 2001; Nguyen, Kambhampati, & Nigenda 2002).

Different from these approaches, we speed up HSP 2.0 without changing its heuristic values or overall operation. In this paper, we study whether different methods for calculating the heuristic values for HSP 2.0 result in different planning times. We systematically evaluate a large number of different methods and demonstrate that, indeed, the resulting planning times differ substantially. HSP 2.0 calculates each heuristic value by solving a relaxed planning problem with a dynamic programming method similar to value iteration. We identify two different approaches for speeding up the calculation of heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values by performing incremental calculations. Each of these approaches can be implemented individually in rather straightforward ways. The question arises, then, how to combine them and whether this is beneficial. Since the approaches cannot be combined easily, we develop a new method. The PINCH (Prioritized, INCremental Heuristics calculation) method exploits the fact that the relaxed planning problems that are used to determine the heuristic values for two different states are similar if the states are similar. Thus, we use a method from algorithm theory to avoid the parts of the plan-construction process that are identical to the previous ones and order the remaining value updates in an efficient way. We demonstrate that PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty
                     Non-Incremental Computations    Incremental Computations
Unordered Updates    VI, HSP2                        IVI
Ordered Updates      GBF, HSP1, GD                   PINCH

Table 1: Classification of Heuristic-Calculation Methods
percent in several domains and, in general, the amount of savings grows with the size of the domains, allowing HSP 2.0 to solve larger planning problems in the same amount of time than was possible before and without changing its overall operation.
Heuristic Search-Based Planning: HSP 2.0
In this section, we describe how HSP 2.0 operates. We first describe the planning problems that it solves and then how it calculates its heuristic values. We follow the notation and description from (Bonet & Geffner 2001b).
The Problem
HSP 2.0 solves STRIPS-style planning problems with ground operators. Such a STRIPS-style planning problem consists of a set of propositions P that are used to describe the states and operators, a set of ground operators O, the start state I ⊆ P, and the partially specified goal G ⊆ P. Each operator o ∈ O has a precondition list Prec(o) ⊆ P, an add list Add(o) ⊆ P, and a delete list Del(o) ⊆ P. The STRIPS planning problem induces a graph search problem that consists of a set of states (vertices) 2^P, a start state I, a set of goal states {s ⊆ P | G ⊆ s}, and, for each state s ⊆ P, a set of actions (directed edges) A(s) = {o ∈ O | Prec(o) ⊆ s}, where action o ∈ A(s) transitions from state s to state (s \ Del(o)) ∪ Add(o) with cost one. All operator sequences (paths) from the start state to any goal state in the graph (plans) are solutions of the STRIPS-style planning problem. The shorter the path, the higher the quality of the solution.
The Method and its Heuristics

In its default configuration, HSP 2.0 performs a forward search in the space of world states using weighted A* (Pearl 1985) with inadmissible heuristic values. It calculates the heuristic value of a given state by solving a relaxed version of the planning problem, where it recursively approximates (by ignoring all delete lists) the cost of achieving each goal proposition individually from the given state and then combines the estimates to obtain the heuristic value of the given state. In the following, we explain the calculation of heuristic values in detail. We use g_s(p) to denote the approximate cost of achieving proposition p from state s, and g_s(o) to denote the approximate cost of achieving the preconditions of operator o from state s. HSP 2.0 defines these quantities recursively, for all p ∈ P and o ∈ O (the minimum of an empty set is defined to be infinity and an empty sum is defined to be zero):

    g_s(p) = 0                               if p ∈ s
    g_s(p) = min_{o ∈ O(p)} [1 + g_s(o)]     otherwise        (1)

    g_s(o) = Σ_{p ∈ Prec(o)} g_s(p)                           (2)

where O(p) = {o ∈ O | p ∈ Add(o)} denotes the set of operators that add proposition p.
procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each p ∈ P do set x_p := 0 if p ∈ s, else x_p := ∞
    for each o ∈ O do set x_o := ∞
    repeat
        for each o ∈ O do set x_o := Σ_{p ∈ Prec(o)} x_p
        for each p ∈ P \ s do set x_p := min_{o ∈ O(p)} [1 + x_o]
    until the values of all x_p and x_o remain unchanged during an iteration
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 1: VI
Then, the heuristic value h_add(s) of state s can be calculated as h_add(s) = Σ_{p ∈ G} g_s(p). This allows HSP 2.0 to solve large planning problems, although it is not guaranteed to find shortest paths.
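The relaxation and the summation over the goal propositions fit in a few lines. The following Python sketch (our own encoding of a relaxed problem as (precondition set, add set) pairs, not HSP code) computes h_add by sweeping until the values settle:

```python
INF = float("inf")

def h_add(state, operators, goal):
    """VI-style sketch of Equations 1 and 2 (function name and the operator
    encoding are ours, not HSP's). operators is a list of (precondition_set,
    add_set) pairs; the relaxation ignores delete lists entirely."""
    props = set().union(goal, state, *(pre | add for pre, add in operators))
    g = {p: (0 if p in state else INF) for p in props}   # g_s(p)
    changed = True
    while changed:                      # sweep until no value changes
        changed = False
        for pre, add in operators:
            g_o = sum(g[p] for p in pre)                 # Equation 2
            for p in add:
                if p not in state and 1 + g_o < g[p]:    # Equation 1
                    g[p] = 1 + g_o
                    changed = True
    return sum(g[p] for p in goal)      # h_add(s) = sum over goal propositions
```

For instance, with operators {a} → {b} and {a, b} → {c}, the goal {c} gets the value 2 from the state {a}.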
Calculation of Heuristics
In the next sections, we describe different methods that solve Equations 1 and 2. These methods are summarized in Table 1. To describe them, we use a variable p if its values are guaranteed to be propositions, a variable o if its values are guaranteed to be operators, and a variable q if its values can be either propositions or operators. We also use the variables x_p (a proposition variable) for g_s(p) and x_o (an operator variable) for g_s(o). In principle, the value of variable x_p satisfies x_p = g_s(p) after termination and the value of variable x_o satisfies x_o = g_s(o) after termination, except for some methods that terminate immediately after the variables x_p for p ∈ G have their correct values because this already allows them to calculate the heuristic value.
VI and HSP2: Simple Calculation of Heuristics
Figure 1 shows a simple dynamic programming method that solves the equations using a form of value iteration. This VI (Value Iteration) method initializes the variables x_p to zero if p ∈ s, and initializes all other variables to infinity. It then repeatedly sweeps over all variables and updates them according to Equations 1 and 2, until no value changes any longer.

HSP 2.0 uses a variant of VI that eliminates the variables x_o by combining Equations 1 and 2. Thus, this HSP2 method performs only the following operation as part of its repeat-until loop:

    for each p ∈ P \ s do set x_p := min_{o ∈ O(p)} [1 + Σ_{p' ∈ Prec(o)} x_{p'}]

HSP2 needs more time than VI for each sweep but reduces the number of sweeps. For example, it needs only 5.0 sweeps on average in the Logistics domain, while VI needs 7.9 sweeps.
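The combined update can be sketched the same way as VI, keeping only proposition values at the price of recomputing the precondition sums inside the minimization on every sweep (a sketch with our own (precondition set, add set) operator encoding, not HSP code):

```python
INF = float("inf")

def h_add_hsp2(state, operators, goal):
    """Sketch of the HSP2 variant (function name and signature are ours):
    Equations 1 and 2 are combined, so only proposition values are stored and
    each sweep recomputes the operator sums inline."""
    props = set().union(goal, state, *(pre | add for pre, add in operators))
    g = {p: (0 if p in state else INF) for p in props}
    changed = True
    while changed:
        changed = False
        for p in props:
            if p in state:
                continue
            # combined update: min over achievers of 1 + sum of precondition values
            new = min((1 + sum(g[q] for q in pre)
                       for pre, add in operators if p in add), default=INF)
            if new != g[p]:
                g[p] = new
                changed = True
    return sum(g[p] for p in goal)
```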
Speeding Up the Calculation of Heuristics
We investigate two orthogonal approaches for speeding up HSP 2.0's calculation of heuristic values, namely to order the value updates (ordered updates) and to reuse information from the calculation of previous heuristic values (incremental computations), as shown in Table 1. We then describe how to combine them.
procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each p ∈ P do set x_p := 0 if p ∈ s, else x_p := ∞
    for each o ∈ O do set x_o := ∞
    repeat
        for each o ∈ O do
            set x := Σ_{p ∈ Prec(o)} x_p
            if x ≠ x_o then
                set x_o := x
                for each p ∈ Add(o) do
                    set x_p := min(x_p, 1 + x_o)
    until the values of all x_o remain unchanged during an iteration
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 2: GBF
IVI: Reusing Results from Previous Searches
When HSP 2.0 calculates a heuristic value h_add(s), it solves equations in g_s(p) and g_s(o). When HSP 2.0 then calculates the heuristic value h_add(s') of some other state s', it solves equations in g_{s'}(p) and g_{s'}(o). If g_s(p) = g_{s'}(p) for some p and g_s(o) = g_{s'}(o) for some o, one could just cache these values during the calculation of h_add(s) and then reuse them during the calculation of h_add(s'). Unfortunately, it is nontrivial to determine which values remain unchanged. We explain later, in the context of our PINCH method, how this can be done. For now, we exploit the fact that the values g_s(p) and g_{s'}(p) for the same p and the values g_s(o) and g_{s'}(o) for the same o are often similar if s and s' are similar. Since HSP 2.0 calculates the heuristic values of all children of a state in a row when it expands the state, it often calculates the heuristic values of similar states in succession. This fact can be exploited by changing VI so that it initializes x_p to zero if p ∈ s but does not re-initialize the other variables, resulting in the IVI (Incremental Value Iteration) method. IVI repeatedly sweeps over all variables until no value changes any longer. If the number of sweeps becomes larger than a given threshold, then IVI terminates the sweeps and simply calls VI to determine the values. IVI needs fewer sweeps than VI if HSP 2.0 calculates the heuristics of similar states in succession because then the values of the corresponding variables tend to be similar as well. For example, IVI needs only 6.2 sweeps on average in the Logistics domain, while VI needs 7.9 sweeps.
GBF, HSP1 and GD: Ordering the Value Updates
VI and IVI sweep over all variables in an arbitrary order. Their number of sweeps can be reduced by ordering the variables appropriately, similarly to the way values are ordered in the Prioritized Sweeping method used for reinforcement learning (Moore & Atkeson 1993).

Figure 2 shows the GBF (Generalized Bellman-Ford) method, which is similar to the variant of the Bellman-Ford method in (Cormen, Leiserson, & Rivest 1990). It orders the value updates of the variables x_p. It sweeps over all variables x_o and updates their values according to Equation 2. If the update changes the value of variable x_o, GBF iterates over the variables x_p for the propositions in the add list of operator o and updates their values according to Equation 1. Thus, GBF updates the values of the variables x_p only if their values might have changed.
procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each p ∈ P do set x_p := 0 if p ∈ s, else x_p := ∞
    for each o ∈ O do set x_o := ∞
    set L := s
    while L ≠ ∅ do
        set O' := the set of operators with no preconditions or with a precondition in L
        for each o ∈ O' do set x_o := Σ_{p ∈ Prec(o)} x_p
        set L' := ∅
        for each o ∈ O' do
            for each p ∈ Add(o) do
                set x := 1 + x_o
                if x < x_p then
                    set x_p := x
                    set L' := L' ∪ {p}
        set L := L'
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 3: HSP1
Figure 3 shows the method used by HSP 1.0. The HSP1 method orders the value updates of the variables x_p and x_o, similar to the variant of the Bellman-Ford method in (Bertsekas 2001). It uses an unordered list to remember the propositions whose variables changed their values. It sweeps over all variables x_o that have these propositions as preconditions and updates their values according to Equation 2. After each update of the value of a variable x_o, HSP1 iterates over the variables x_p for the propositions in the add list of operator o and updates their values according to Equation 1. The unordered list is then updated to contain all propositions whose variables changed their values in this step. Thus, HSP1 updates the values of the variables x_p and x_o only if their values might have changed.

Finally, the GD (Generalized Dijkstra) method (Knuth 1977) is a generalization of Dijkstra's graph search method (Dijkstra 1959) and uses a priority queue to sweep over the variables x_p and x_o in the order of increasing values. Similar to HSP1, its priority queue contains only those variables x_p and x_o whose values might have changed. To make it even more efficient, we terminate it immediately once the values of all x_p for p ∈ G are known to be correct. We do not give pseudocode for GD because it is a special case of the PINCH method, which we discuss next.¹
PINCH: The Best of Both Worlds
The two approaches for speeding up the calculation of heuristic values discussed above are orthogonal. We now describe our main contribution in this paper, namely how to combine them. This is complicated by the fact that the approaches themselves cannot be combined easily because the methods that order the value updates exploit the property that the values of the variables cannot increase during each computation of a heuristic value. Unfortunately, this property no longer holds when reusing the values of the variables from the calculation of previous heuristic values. We

¹PINCH from Figure 5 reduces to GD if one empties its priority queue and reinitializes the variables x_p and x_o before SolveEquations() is called again, as well as modifies the while loop of SolveEquations() to terminate immediately once the values of all x_p for p ∈ G are known to be correct.
thus need to develop a new method based on a method from algorithm theory. The resulting PINCH (Prioritized, INCremental Heuristics calculation) method solves Equations 1 and 2 by calculating only those values g_s(p) and g_s(o) that are different from the corresponding values g_{s'}(p) and g_{s'}(o) from the calculation of previous heuristic values. PINCH also orders the variables so that it updates the value of each variable at most twice. We will demonstrate in the experimental section that PINCH outperforms the other methods in many large planning domains and is not worse in most other large planning domains.
DynamicSWSF-FP
In this section, we describe DynamicSWSF-FP (Ramalingam & Reps 1996), the method from algorithm theory that we extend to speed up the calculation of the heuristic values for HSP 2.0. We follow the presentation in (Ramalingam & Reps 1996). A function g(x_1, ..., x_n) over the non-negative reals extended with infinity is called a strict weakly superior function (in short: swsf) if, for every j ∈ {1, ..., n}, it is monotone non-decreasing in variable x_j and satisfies: g(x_1, ..., x_n) ≤ x_j implies g(x_1, ..., x_n) = g(x_1, ..., x_{j-1}, ∞, x_{j+1}, ..., x_n). The swsf fixed point (in short: swsf-fp) problem is to compute the unique fixed point of n equations, namely the equations x_i = g_i(x_1, ..., x_n), in the n variables x_1, ..., x_n, where the g_i are swsf for i = 1, ..., n. The dynamic swsf-fp problem is to maintain the unique fixed point of the swsf equations after some or all of the functions g_i have been replaced by other swsf's. DynamicSWSF-FP solves the dynamic swsf-fp problem efficiently by recalculating only the values of variables that change, rather than the values of all variables. The authors of DynamicSWSF-FP have proved its correctness, completeness, and other properties and applied it to grammar problems and shortest-path problems (Ramalingam & Reps 1996).
Terminology and Variables

We use the following terminology to explain how to use DynamicSWSF-FP to calculate the heuristic values for HSP 2.0. We use the variables rhs_q to keep track of the current values of the right-hand sides g_q(x_1, ..., x_n) of the equations. It always holds that rhs_q = g_q(x_1, ..., x_n) (Invariant 1). If x_q = rhs_q then q is called consistent. Otherwise it is called inconsistent. If q is inconsistent then either x_q < rhs_q, in which case we call q underconsistent, or x_q > rhs_q, in which case we call q overconsistent. We can therefore compare x_q and rhs_q to check whether q is overconsistent or underconsistent. We maintain a priority queue that always contains exactly the inconsistent q (to be precise: it stores q rather than x_q) with priorities min(x_q, rhs_q) (Invariant 2).
Transforming the Equations

We transform Equations 1 and 2 as follows to ensure that they specify a swsf-fp problem, for all p ∈ P and o ∈ O:

    x_p = 0                               if p ∈ s
    x_p = min_{o ∈ O(p)} [1 + x_o]        otherwise        (3)

    x_o = 1 + Σ_{p ∈ Prec(o)} x_p                          (4)
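The effect of the transformation can be checked numerically: solving both systems of equations by naive iteration on a toy relaxed problem shows that the transformed proposition values come out exactly twice the original ones, so the heuristic value is recoverable from them. The sketch below uses our own (precondition set, add set) operator encoding:

```python
INF = float("inf")

def fixed_point(state, operators, transformed):
    """Toy check of the transformation (function name is ours): naively iterate
    Equations 1-2 (transformed=False) or Equations 3-4 (transformed=True) to
    their fixed point and return the proposition values."""
    props = set().union(state, *(pre | add for pre, add in operators))
    g_p = {p: (0 if p in state else INF) for p in props}
    g_o = [INF] * len(operators)
    changed = True
    while changed:
        changed = False
        for i, (pre, _add) in enumerate(operators):
            new = sum(g_p[p] for p in pre) + (1 if transformed else 0)
            changed |= new != g_o[i]
            g_o[i] = new
        for p in props:
            if p in state:
                continue
            new = min((1 + g_o[i] for i, (_pre, add) in enumerate(operators)
                       if p in add), default=INF)
            changed |= new != g_p[p]
            g_p[p] = new
    return g_p
```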
procedure AdjustVariable(q)
    if q is a proposition variable x_p then
        if p ∈ s then set rhs_p := 0
        else set rhs_p := min_{o ∈ O(p)} [1 + x_o]
    else /* q is an operator variable x_o */
        set rhs_o := 1 + Σ_{p ∈ Prec(o)} x_p
    if q is in the priority queue then delete it
    if x_q ≠ rhs_q then insert q into the priority queue with priority min(x_q, rhs_q)

procedure SolveEquations()
    while the priority queue is not empty do
        delete the element with the smallest priority from the priority queue and assign it to q
        if x_q > rhs_q then /* q is overconsistent */
            set x_q := rhs_q
            if q is a proposition variable x_p then
                for each o such that p ∈ Prec(o) do AdjustVariable(x_o)
            else /* q is an operator variable x_o */
                for each p with p ∈ Add(o) do AdjustVariable(x_p)
        else /* q is underconsistent */
            set x_q := ∞
            AdjustVariable(q)
            if q is a proposition variable x_p then
                for each o such that p ∈ Prec(o) do AdjustVariable(x_o)
            else /* q is an operator variable x_o */
                for each p with p ∈ Add(o) do AdjustVariable(x_p)

procedure Main()
    empty the priority queue
    set s to the state whose heuristic value needs to get computed
    for each variable q do set x_q := ∞
    for each variable q do AdjustVariable(q)
    forever do
        SolveEquations()
        /* use h_add(s) = ½ Σ_{p ∈ G} x_p */
        set s' := s and set s to the state whose heuristic value needs to get computed next
        for each p ∈ (s' \ s) ∪ (s \ s') do AdjustVariable(x_p)

Figure 4: PINCH
The only difference is in the calculation of x_o. The transformed equations specify a swsf-fp problem since 0, min_{o ∈ O(p)} [1 + x_o], and 1 + Σ_{p ∈ Prec(o)} x_p are all swsf in the x_p and x_o for all p ∈ P and o ∈ O. This means that the transformed equations can be solved with DynamicSWSF-FP. They can be used to calculate h_add(s) since it is easy to show that x_p = 2 g_s(p) and thus h_add(s) = ½ Σ_{p ∈ G} x_p.
Calculation of Heuristics with DynamicSWSF-FP
We apply DynamicSWSF-FP to the problem of calculating the heuristic values of HSP 2.0, which reduces to solving the swsf fixed point problem defined by Equations 3 and 4. The solution is obtained by SolveEquations() shown in Figure 4. In the algorithm, each x_p is a variable that contains the value of the corresponding proposition, and each x_o is a variable that contains the value of the corresponding operator. PINCH calls AdjustVariable() for each variable q to ensure that Invariants 1 and 2 hold before it calls SolveEquations() for the first time. It needs to call AdjustVariable() only for those q whose function g_q has changed before it calls SolveEquations() again. The invariants will automatically continue to hold for all other q. If the state whose heuristic value needs to get computed changes from s' to s, then this changes only those functions that correspond to the right-hand side of Equation 3 for which p ∈ (s' \ s) ∪ (s \ s'), in other words, where p is no longer part of the state (and the corresponding variable thus is no longer clamped to zero) or just became part of it (and the corresponding variable thus just became clamped to zero). SolveEquations() then operates as follows. The x_q solve Equations 3 and 4 if they are all consistent. Thus, SolveEquations() adjusts the values of the inconsistent x_q. It always removes the q with the smallest priority from the priority queue.
procedure AdjustVariable(q)
    if x_q ≠ rhs_q then
        if q is not in the priority queue then insert it with priority min(x_q, rhs_q)
        else change the priority of q in the priority queue to min(x_q, rhs_q)
    else if q is in the priority queue then delete it

procedure SolveEquations()
    while the priority queue is not empty do
        assign the element with the smallest priority in the priority queue to q
        if q is a proposition variable x_p then
            if x_p > rhs_p then /* overconsistent */
                delete p from the priority queue
                set x_old := x_p
                set x_p := rhs_p
                for each o such that p ∈ Prec(o) do
                    if x_old ≠ ∞ then set rhs_o := rhs_o − x_old + x_p
                    else set rhs_o := 1 + Σ_{p' ∈ Prec(o)} x_{p'}
                    AdjustVariable(x_o)
            else /* underconsistent */
                set x_p := ∞
                AdjustVariable(x_p)
                for each o such that p ∈ Prec(o) do
                    set rhs_o := ∞
                    AdjustVariable(x_o)
        else /* q is an operator variable x_o */
            if x_o > rhs_o then /* overconsistent */
                delete o from the priority queue
                set x_o := rhs_o
                for each p with p ∈ Add(o) do
                    if 1 + x_o < rhs_p then
                        set rhs_p := 1 + x_o
                        AdjustVariable(x_p)
            else /* underconsistent */
                set x_old := x_o
                set x_o := ∞
                AdjustVariable(x_o)
                for each p with p ∈ Add(o) do
                    if rhs_p = 1 + x_old then
                        set rhs_p := min_{o' ∈ O(p)} [1 + x_{o'}]
                        AdjustVariable(x_p)

procedure Main()
    empty the priority queue
    set s to the state whose heuristic value needs to get computed
    for each variable q do set x_q := ∞ and rhs_q := ∞
    for each o ∈ O with Prec(o) = ∅ do set rhs_o := 1
    for each p ∈ s do set rhs_p := 0
    for each variable q do AdjustVariable(q)
    forever do
        SolveEquations()
        /* use h_add(s) = ½ Σ_{p ∈ G} x_p */
        set s' := s and set s to the state whose heuristic value needs to get computed next
        for each p ∈ s \ s' do
            set rhs_p := 0
            AdjustVariable(x_p)
        for each p ∈ s' \ s do
            set rhs_p := min_{o ∈ O(p)} [1 + x_o]
            AdjustVariable(x_p)

Figure 5: Optimized PINCH
If q is overconsistent, then SolveEquations() sets x_q to the value of rhs_q. This makes q consistent. Otherwise, q is underconsistent and SolveEquations() sets x_q to infinity. This makes q either consistent or overconsistent. In the latter case, it remains in the priority queue. Whether q was underconsistent or overconsistent, its value got changed. SolveEquations() then calls AdjustVariable() to maintain Invariants 1 and 2. Once the priority queue is empty, SolveEquations() terminates since all x_q are consistent and thus solve Equations 3 and 4. One can prove that it changes the value of each x_q at most twice, namely at most once when it is underconsistent and at most once when it is overconsistent, and thus terminates in finite time (Ramalingam & Reps 1996).
Algorithmic Optimizations
PINCH can be optimized further. Its main inefficiency is that it often iterates over a large number of propositions or operators. Consider, for example, the case where x_o has the smallest priority during an iteration of the while-loop in SolveEquations() and x_o > rhs_o. At some point in time, SolveEquations() then executes the following loop:

    for each p with p ∈ Add(o) do AdjustVariable(x_p)

The for-loop iterates over all propositions that satisfy its condition. For each of them, the call AdjustVariable(x_p) executes

    set rhs_p := min_{o' ∈ O(p)} [1 + x_{o'}]

The calculation of rhs_p therefore iterates over all operators that contain p in their add list. However, this iteration can be avoided. Since x_o > rhs_o according to our assumption, SolveEquations() sets the value of x_o to rhs_o and thus decreases it. All other values remain the same. Thus, rhs_p cannot increase and one can recalculate it faster as follows:

    set rhs_p := min(rhs_p, 1 + x_o)

Figure 5 shows PINCH after this and other optimizations. In the experimental section, we use this method rather than the unoptimized one to reduce the planning time. This reduces the planning time, for example, by 20 percent in the Logistics domain and up to 90 percent in the Freecell domain.
Summary of Methods
Table 1 summarizes the methods that we have discussed and classifies them according to whether they order the value updates (ordered updates) and whether they reuse information from the calculation of previous heuristic values (incremental computations).

Both VI and IVI perform full sweeps over all variables and thus perform unordered updates. IVI reuses the values of variables from the computation of the previous heuristic value and is an incremental version of VI. We included HSP2, although it is very similar to VI and belongs to the same class, because it is the method used by HSP 2.0. Its only difference from VI is that it eliminates all operator variables, which simplifies the code.

GBF, HSP1, and GD order the value updates. They are listed from left to right in order of increasing number of binary ordering constraints between value updates. GBF performs full sweeps over all operator variables interleaved with partial sweeps over those proposition variables whose values might have changed because they depend on operator variables whose values have just changed. HSP1, the method used by HSP 1.0, alternates partial sweeps over operator variables and partial sweeps over proposition variables. Finally, both GD and PINCH order the value updates completely and thus do not perform any sweeps. This enables GD to update the value of each variable only once and PINCH to update the value of each variable only twice. PINCH reuses the values of variables from the computation of the previous heuristic value and is an incremental version of GD.
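The combination in the last cell of Table 1 can be sketched compactly. The following Python class is our own simplified reconstruction of the prioritized, incremental computation (the class name, the lazy-deletion heap, and the full recomputation of right-hand sides in `_compute_rhs` are our simplifications; the optimized PINCH of Figure 5 additionally updates the right-hand sides incrementally). It keeps a val/rhs pair per proposition and operator variable, queues only the inconsistent ones with priority min(val, rhs), and reprocesses them in order across successive heuristic computations:

```python
import heapq
from itertools import count

INF = float("inf")

class IncrementalHadd:
    """Simplified sketch of a prioritized, incremental solver for the
    transformed equations. Variables keep their values across heuristic
    computations; only inconsistent ones (val != rhs) are reprocessed."""

    def __init__(self, operators, goal):
        self.ops, self.goal = operators, goal
        self.props = set().union(goal, *(pre | add for pre, add in operators))
        self.achievers = {p: [i for i, (pre, add) in enumerate(operators) if p in add]
                          for p in self.props}
        self.consumers = {p: [i for i, (pre, add) in enumerate(operators) if p in pre]
                          for p in self.props}
        self.state = set()
        # proposition variables are keyed by name, operator variables by index
        self.val = {q: INF for q in list(self.props) + list(range(len(operators)))}
        self.rhs = dict(self.val)
        self.heap, self.tick = [], count()
        for q in self.val:                       # establish Invariants 1 and 2
            self.rhs[q] = self._compute_rhs(q)
            self._adjust(q)

    def _compute_rhs(self, q):
        if isinstance(q, int):                   # operator variable, Equation 4
            return 1 + sum(self.val[p] for p in self.ops[q][0])
        if q in self.state:                      # proposition variable, Equation 3
            return 0
        return min((1 + self.val[o] for o in self.achievers[q]), default=INF)

    def _adjust(self, q):
        if self.val[q] != self.rhs[q]:           # inconsistent: (re)queue q
            heapq.heappush(self.heap,
                           (min(self.val[q], self.rhs[q]), next(self.tick), q))

    def _solve(self):
        while self.heap:
            key, _, q = heapq.heappop(self.heap)
            if self.val[q] == self.rhs[q] or key != min(self.val[q], self.rhs[q]):
                continue                         # stale queue entry, skip it
            if self.val[q] > self.rhs[q]:        # overconsistent: lower the value
                self.val[q] = self.rhs[q]
            else:                                # underconsistent: raise to infinity
                self.val[q] = INF
                self._adjust(q)
            # propagate to the variables whose right-hand side mentions q
            succ = self.ops[q][1] if isinstance(q, int) else self.consumers[q]
            for w in succ:
                self.rhs[w] = self._compute_rhs(w)
                self._adjust(w)

    def h_add(self, state):
        changed = self.state ^ set(state)        # only these rhs values changed
        self.state = set(state)
        for p in changed & self.props:
            self.rhs[p] = self._compute_rhs(p)
            self._adjust(p)
        self._solve()
        total = sum(self.val[p] for p in self.goal)
        return total / 2                         # transformed values are doubled
```

Successive calls to `h_add` on similar states then touch only the variables whose values actually change, which is the source of the savings reported below.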
[Figure 6 consists of nine plots, one per domain (Blocks World, Blocks World: Random Instances, Elevator, Freecell, Logistics, Schedule, Gripper, Mprime, and Mystery). Each plot shows the relative speedup in planning time over HSP2, in percent, as a function of the problem size (#P + #O) for PINCH, GBF, HSP1, GD, IVI, and VI.]
Figure 6: Relative Speedup in Planning Time over HSP2
The relative speedup in planning time of PINCH over HSP2 tends to increase with the problem size. For example, PINCH outperforms HSP2 by over 80 percent for large problems in several domains. According to theoretical results given in (Ramalingam & Reps 1996), the complexity of PINCH depends on the number of operator and proposition variables whose values change from the calculation of one heuristic value to the next. Since some methods do not use operator variables, we only counted the number of proposition variables whose values changed. Table 2 lists the resulting dissimilarity measure for the Logistics domain. In this domain, #CV/#P is a good predictor of the relative speedup in planning time of PINCH over HSP2, with a negative correlation coefficient of -0.96. More generally, the dissimilarity measure is a good predictor of the relative speedup in planning time of PINCH over HSP2 in all domains, with negative correlation coefficients ranging from -0.96 to -0.36. This insight can be used to explain the relatively poor scaling behavior of PINCH in the Freecell domain. In all other domains, the dissimilarity measure is negatively correlated with the problem size, with negative correlation coefficients ranging from -0.92 to -0.57. In the Freecell domain, however, the dissimilarity measure is positively correlated with the problem size, with a positive correlation coefficient of 0.68. The Freecell domain represents a solitaire game. As the problem size increases, the number of cards increases while the numbers of columns and free cells remain constant. This results in additional constraints on the values of both proposition and operator variables and thus in a larger number of value changes from one calculation of the heuristic value to the next. This increases the dissimilarity measure and thus decreases the relative speedup of PINCH over HSP2. Finally, there are problems that HSP 2.0 with PINCH could solve within the 10-minute time limit that neither the standard HSP 2.0 distribution nor HSP 2.0 with GBF could solve. The Logistics domain is such a domain.
Second, GBF is at least as fast as HSP1, and HSP1 is at least as fast as GD. Thus, the stricter the ordering of the variable updates among the three methods with ordered updates and non-incremental computations, the smaller their relative speedup over HSP2. This can be explained with the increasing overhead that results from the need to maintain the ordering constraints with increasingly complex data structures. GBF does not use any additional data structures, HSP1 uses an unordered list, and GD uses a priority queue. (We implemented the priority queue in turn as a binary heap, a Fibonacci heap, and a multi-level bucket to ensure that this conclusion is independent of its implementation.) Since PINCH without value reuse reduces to GD, our experiments demonstrate that it is value reuse that allows PINCH to counteract the overhead associated with the priority queue and makes the planning time of PINCH so competitive.
Conclusions
In this paper, we systematically evaluated methods for calculating the heuristic values for HSP 2.0 with the h_add heuristic and demonstrated that the resulting planning times differ substantially. We identified two different approaches for speeding up the calculation of the heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values. We then showed how these two approaches can be combined, resulting in our PINCH (Prioritized, INCremental Heuristics calculation) method. PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty percent in several domains and, in general, the amount of savings grows with the size of the domains. This is an important property since, if a method has a high relative speedup for small problems, it wins by fractions of a second, which is insignificant in practice. However, if a method has a high relative speedup for large problems, it wins by minutes. Thus, PINCH allows HSP 2.0 to solve larger planning problems than was possible before in the same amount of time and without changing its operation. We also have preliminary results that show that PINCH speeds up HSP 2.0 with the h_max heuristic (Bonet & Geffner 2001b) by over 20 percent and HSP 2.0 with the h^2 heuristic (Haslum & Geffner 2000) by over 80 percent for small blocks-world instances. We are currently working on demonstrating similar savings for other heuristic search-based planners. For example, PINCH also applies in principle to the first stage of FF (Hoffmann & Nebel 2001a), where FF builds the planning graph.
Acknowledgments
We thank Blai Bonet and Hector Geffner for making the code of HSP 1.0 and HSP 2.0 available to us and for answering our questions. We also thank Sylvie Thiebaux and John Slaney for making their blocks-world planning task generator available to us. The Intelligent Decision-Making Group is partly supported by NSF awards to Sven Koenig under contracts IIS-9984827, IIS-0098807, and ITR/AP-0113881, as well as an IBM faculty partnership award. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsoring organizations, agencies, companies or the U.S. government.
References

Bertsekas, D. 2001. Dynamic Programming and Optimal Control. Athena Scientific, 2nd edition.

Bonet, B., and Geffner, H. 1999. Planning as heuristic search: New results. In Proceedings of the 5th European Conference on Planning, 360-372.

Bonet, B., and Geffner, H. 2001a. Heuristic search planner 2.0. Artificial Intelligence Magazine 22(3):77-80.

Bonet, B., and Geffner, H. 2001b. Planning as heuristic search. Artificial Intelligence - Special Issue on Heuristic Search 129(1):5-33.

Bonet, B.; Loerincs, G.; and Geffner, H. 1997. A robust and fast action selection mechanism. In Proceedings of the National Conference on Artificial Intelligence, 714-719.

Cormen, T.; Leiserson, C.; and Rivest, R. 1990. Introduction to Algorithms. MIT Press.

Dijkstra, E. 1959. A note on two problems in connexion with graphs. Numerische Mathematik 1:269-271.

Edelkamp, S. 2001. Planning with pattern databases. In Proceedings of the 6th European Conference on Planning, 13-24.

Haslum, P., and Geffner, H. 2000. Admissible heuristics for optimal planning. In Proceedings of the International Conference on Artificial Intelligence Planning and Scheduling, 70-82.

Hoffmann, J., and Nebel, B. 2001a. The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research 14:253-302.

Hoffmann, J., and Nebel, B. 2001b. RIFO revisited: Detecting relaxed irrelevance. In Proceedings of the 6th European Conference on Planning, 325-336.

Knuth, D. 1977. A generalization of Dijkstra's algorithm. Information Processing Letters 6(1):1-5.

McDermott, D. 1996. A heuristic estimator for means-ends analysis in planning. In Proceedings of the International Conference on Artificial Intelligence Planning and Scheduling, 142-149.

Moore, A., and Atkeson, C. 1993. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning 13(1):103-130.

Nguyen, X.; Kambhampati, S.; and Nigenda, R. S. 2002. Planning graph as the basis for deriving heuristics for plan synthesis by state space and CSP search. Artificial Intelligence 135(1-2):73-123.

Pearl, J. 1985. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley.

Ramalingam, G., and Reps, T. 1996. An incremental algorithm for a generalization of the shortest-path problem. Journal of Algorithms 21:267-305.

Refanidis, I., and Vlahavas, I. 2001. The GRT planning system: Backward heuristic construction in forward state-space planning. Journal of Artificial Intelligence Research 15:115-161.