
Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison

CHRISTIAN BLUM

Université Libre de Bruxelles

AND

ANDREA ROLI

Università degli Studi di Bologna

The field of metaheuristics for the application to combinatorial optimization problems is a rapidly growing field of research. This is due to the importance of combinatorial optimization problems for the scientific as well as the industrial world. We give a survey of the most important metaheuristics of today from a conceptual point of view. We outline the different components and concepts that are used in the different metaheuristics in order to analyze their similarities and differences. Two very important concepts in metaheuristics are intensification and diversification. These are the two forces that largely determine the behavior of a metaheuristic. They are in some way contrary but also complementary to each other. We introduce a framework, which we call the I&D frame, in order to put different intensification and diversification components into relation with each other. After outlining the advantages and disadvantages of different metaheuristic approaches, we conclude by pointing out the importance of hybridization of metaheuristics as well as the integration of metaheuristics and other methods for optimization.

Categories and Subject Descriptors: G.2.1 [Discrete Mathematics]: Combinatorics—combinatorial algorithms; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search—heuristic methods

General Terms: Algorithms

Additional Key Words and Phrases: Metaheuristics, combinatorial optimization, intensification, diversification.

C. Blum acknowledges support by the “Metaheuristics Network,” a Research Training Network funded by the Improving Human Potential program of the CEC, contract HPRN-CT-1999-00106. A. Roli acknowledges support by the CEC through a “Marie Curie Training Site” fellowship, contract HPMT-CT-2000-00032. The information provided is the sole responsibility of the authors and does not reflect the Community’s opinion. The Community is not responsible for any use that might be made of data appearing in this publication.
Authors’ addresses: C. Blum, Université Libre de Bruxelles, IRIDIA, Avenue Franklin Roosevelt 50, CP 194/6, 1050 Brussels, Belgium; email: [email protected]; A. Roli, DEIA—Università degli Studi di Bologna, Viale Risorgimento 2, Bologna, Italy; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or [email protected].
© 2003 ACM 0360-0300/03/0900-0268 $5.00

ACM Computing Surveys, Vol. 35, No. 3, September 2003, pp. 268–308.


1. INTRODUCTION

Many optimization problems of practical as well as theoretical importance consist of the search for a “best” configuration of a set of variables to achieve some goals. They seem to divide naturally into two categories: those where solutions are encoded with real-valued variables, and those where solutions are encoded with discrete variables. Among the latter ones we find a class of problems called Combinatorial Optimization (CO) problems. According to Papadimitriou and Steiglitz [1982], in CO problems we are looking for an object from a finite—or possibly countably infinite—set. This object is typically an integer number, a subset, a permutation, or a graph structure.

Definition 1.1. A Combinatorial Optimization problem P = (S, f) can be defined by:

—a set of variables X = {x1, . . . , xn};
—variable domains D1, . . . , Dn;
—constraints among variables;
—an objective function f to be minimized,1 where f : D1 × · · · × Dn → ℝ+.

The set of all possible feasible assignments is

S = {s = {(x1, v1), . . . , (xn, vn)} | vi ∈ Di, s satisfies all the constraints}.

S is usually called a search (or solution) space, as each element of the set can be seen as a candidate solution. To solve a combinatorial optimization problem one has to find a solution s∗ ∈ S with minimum objective function value, that is, f(s∗) ≤ f(s) ∀ s ∈ S. s∗ is called a globally optimal solution of (S, f) and the set S∗ ⊆ S is called the set of globally optimal solutions.
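To make Definition 1.1 concrete, the following minimal Python sketch instantiates a toy instance of one of the CO problems named below, the Travelling Salesman problem: the search space S is the set of permutations of the cities and f is the tour length. The distance values and the brute-force enumeration are purely illustrative assumptions of ours; enumerating S is of course only feasible for tiny instances.

import itertools

# Toy symmetric TSP as a CO problem P = (S, f): each position in the tour is a
# variable, each domain is the set of cities, the implicit constraint is that
# every city appears exactly once, and f is the length of the closed tour.
DIST = {(0, 1): 2, (0, 2): 9, (0, 3): 10, (1, 2): 6, (1, 3): 4, (2, 3): 3}

def dist(a, b):
    return DIST[(min(a, b), max(a, b))]

def f(tour):
    """Objective function to be minimized: total length of the closed tour."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

# S is finite, so for this tiny instance we can enumerate it and find a
# globally optimal solution s* with f(s*) <= f(s) for all s in S.
S = list(itertools.permutations(range(4)))
s_star = min(S, key=f)
print(s_star, f(s_star))   # a shortest tour and its length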

Examples of CO problems are the Travelling Salesman problem (TSP), the Quadratic Assignment problem (QAP), and Timetabling and Scheduling problems.

1 As maximizing an objective function f is the same as minimizing −f, in this work we will deal, without loss of generality, with minimization problems.

Due to the practical importance of CO problems, many algorithms to tackle them have been developed. These algorithms can be classified as either complete or approximate algorithms. Complete algorithms are guaranteed to find an optimal solution in bounded time for every finite-size instance of a CO problem (see Papadimitriou and Steiglitz [1982] and Nemhauser and Wolsey [1988]). Yet, for CO problems that are NP-hard [Garey and Johnson 1979], no polynomial-time algorithm exists, assuming that P ≠ NP. Therefore, complete methods might need exponential computation time in the worst case. This often leads to computation times too high for practical purposes. Thus, the use of approximate methods to solve CO problems has received more and more attention in the last 30 years. In approximate methods we sacrifice the guarantee of finding optimal solutions for the sake of getting good solutions in a significantly reduced amount of time.

Among the basic approximate methods we usually distinguish between constructive methods and local search methods. Constructive algorithms generate solutions from scratch by adding components to an initially empty partial solution until a solution is complete. They are typically the fastest approximate methods, yet they often return solutions of inferior quality when compared to local search algorithms. Local search algorithms start from some initial solution and iteratively try to replace the current solution by a better solution in an appropriately defined neighborhood of the current solution, where the neighborhood is formally defined as follows:

Definition 1.2. A neighborhood structure is a function N : S → 2^S that assigns to every s ∈ S a set of neighbors N(s) ⊆ S. N(s) is called the neighborhood of s.

The introduction of a neighborhood structure enables us to define the concept of locally minimal solutions.

Definition 1.3. A locally minimal solution (or local minimum) with respect to a neighborhood structure N is a solution ŝ such that ∀ s ∈ N(ŝ) : f(ŝ) ≤ f(s). We call ŝ a strict locally minimal solution if f(ŝ) < f(s) ∀ s ∈ N(ŝ).
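As an illustration of Definitions 1.2 and 1.3, the sketch below implements a simple swap neighborhood over permutations and a test for local minimality. The permutation representation is a hypothetical choice of ours, and the objective f is passed in as a parameter so that the functions apply to whatever objective is used.

from typing import Callable, Iterator, Tuple

Solution = Tuple[int, ...]

def neighborhood(s: Solution) -> Iterator[Solution]:
    """A neighborhood structure N: S -> 2^S obtained by swapping two positions."""
    for i in range(len(s) - 1):
        for j in range(i + 1, len(s)):
            t = list(s)
            t[i], t[j] = t[j], t[i]
            yield tuple(t)

def is_local_minimum(s: Solution, f: Callable[[Solution], float]) -> bool:
    """s is a local minimum w.r.t. N iff f(s) <= f(s') for all s' in N(s)."""
    return all(f(s) <= f(s2) for s2 in neighborhood(s))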

In the last 20 years, a new kind of approximate algorithm has emerged which basically tries to combine basic heuristic methods in higher-level frameworks aimed at efficiently and effectively exploring a search space. These methods are nowadays commonly called metaheuristics.2 The term metaheuristic, first introduced in Glover [1986], derives from the composition of two Greek words. Heuristic derives from the verb heuriskein (ευρισκειν) which means “to find”, while the prefix meta means “beyond, in an upper level”. Before this term was widely adopted, metaheuristics were often called modern heuristics [Reeves 1993].

This class of algorithms includes3—but is not restricted to—Ant Colony Optimization (ACO), Evolutionary Computation (EC) including Genetic Algorithms (GA), Iterated Local Search (ILS), Simulated Annealing (SA), and Tabu Search (TS). Up to now there is no commonly accepted definition for the term metaheuristic. It is just in the last few years that some researchers in the field have tried to propose a definition. In the following we quote some of them:

“A metaheuristic is formally defined as an iterative generation process which guides a subordinate heuristic by combining intelligently different concepts for exploring and exploiting the search space, learning strategies are used to structure information in order to find efficiently near-optimal solutions.” [Osman and Laporte 1996].

“A metaheuristic is an iterative master process that guides and modifies the operations of subordinate heuristics to efficiently produce high-quality solutions. It may manipulate a complete (or incomplete) single solution or a collection of solutions at each iteration. The subordinate heuristics may be high (or low) level procedures, or a simple local search, or just a construction method.” [Voß et al. 1999].

2 The increasing importance of metaheuristics is underlined by the biannual Metaheuristics International Conference (MIC). The 5th is being held in Kyoto in August 2003 (http://www-or.amp.i.kyoto-u.ac.jp/mic2003/).
3 In alphabetical order.

“Metaheuristics are typically high-level strategies which guide an underlying, more problem specific heuristic, to increase their performance. The main goal is to avoid the disadvantages of iterative improvement and, in particular, multiple descent by allowing the local search to escape from local optima. This is achieved by either allowing worsening moves or generating new starting solutions for the local search in a more “intelligent” way than just providing random initial solutions. Many of the methods can be interpreted as introducing a bias such that high quality solutions are produced quickly. This bias can be of various forms and can be cast as descent bias (based on the objective function), memory bias (based on previously made decisions) or experience bias (based on prior performance). Many of the metaheuristic approaches rely on probabilistic decisions made during the search. But, the main difference to pure random search is that in metaheuristic algorithms randomness is not used blindly but in an intelligent, biased form.” [Stützle 1999b].

“A metaheuristic is a set of concepts that can be used to define heuristic methods that can be applied to a wide set of different problems. In other words, a metaheuristic can be seen as a general algorithmic framework which can be applied to different optimization problems with relatively few modifications to make them adapted to a specific problem.” [Metaheuristics Network Website 2000].

Summarizing, we outline the fundamental properties which characterize metaheuristics:

—Metaheuristics are strategies that “guide” the search process.
—The goal is to efficiently explore the search space in order to find (near-)optimal solutions.
—Techniques which constitute metaheuristic algorithms range from simple local search procedures to complex learning processes.
—Metaheuristic algorithms are approximate and usually non-deterministic.
—They may incorporate mechanisms to avoid getting trapped in confined areas of the search space.
—The basic concepts of metaheuristics permit an abstract level description.
—Metaheuristics are not problem-specific.
—Metaheuristics may make use of domain-specific knowledge in the form of heuristics that are controlled by the upper level strategy.
—Today’s more advanced metaheuristics use search experience (embodied in some form of memory) to guide the search.

In short, we could say that metaheuristics are high-level strategies for exploring search spaces by using different methods. Of great importance hereby is that a dynamic balance is maintained between diversification and intensification. The term diversification generally refers to the exploration of the search space, whereas the term intensification refers to the exploitation of the accumulated search experience. These terms stem from the Tabu Search field [Glover and Laguna 1997], and it is important to clarify that the terms exploration and exploitation are sometimes used instead, for example in the Evolutionary Computation field [Eiben and Schippers 1998], with a more restricted meaning. In fact, the notions of exploitation and exploration often refer to rather short-term strategies tied to randomness, whereas intensification and diversification also refer to medium- and long-term strategies based on the usage of memory. The use of the terms diversification and intensification in their initial meaning is becoming more and more accepted by the whole field of metaheuristics. Therefore, we use them throughout the article. The balance between diversification and intensification as mentioned above is important, on one side to quickly identify regions in the search space with high quality solutions and on the other side not to waste too much time in regions of the search space which are either already explored or which do not provide high quality solutions.

The search strategies of different metaheuristics are highly dependent on the philosophy of the metaheuristic itself. Comparing the strategies used in different metaheuristics is one of the goals of Section 5. There are several different philosophies apparent in the existing metaheuristics. Some of them can be seen as “intelligent” extensions of local search algorithms. The goal of this kind of metaheuristic is to escape from local minima in order to proceed in the exploration of the search space and to move on to find other, hopefully better, local minima. This is for example the case in Tabu Search, Iterated Local Search, Variable Neighborhood Search, GRASP and Simulated Annealing. These metaheuristics (also called trajectory methods) work on one or several neighborhood structure(s) imposed on the members (the solutions) of the search space.

We can find a different philosophy in algorithms like Ant Colony Optimization and Evolutionary Computation. They incorporate a learning component in the sense that they implicitly or explicitly try to learn correlations between decision variables to identify high quality areas in the search space. This kind of metaheuristic performs, in a sense, a biased sampling of the search space. For instance, in Evolutionary Computation this is achieved by recombination of solutions and in Ant Colony Optimization by sampling the search space in every iteration according to a probability distribution.

The structure of this work is as follows: There are several approaches to classify metaheuristics according to their properties. In Section 2, we briefly list and summarize different classification approaches. Sections 3 and 4 are devoted to a description of the most important metaheuristics of today. Section 3 describes the most relevant trajectory methods and, in Section 4, we outline population-based methods. Section 5 aims at giving a unifying view on metaheuristics with respect to the way they achieve intensification and diversification.


This is done by the introduction of a unifying framework, the I&D frame. Finally, Section 6 offers some conclusions and an outlook on the future.

We believe that it is hardly possible to produce a completely accurate survey of metaheuristics that does justice to every viewpoint. Moreover, a survey of an immense area such as metaheuristics has to focus on certain aspects and therefore must unfortunately neglect others. Therefore, we want to clarify at this point that this survey is done from a conceptual point of view. We want to outline the different concepts that are used in different metaheuristics in order to analyze the similarities and the differences between them. We do not go into the implementation of metaheuristics, which is certainly an important aspect of metaheuristics research with respect to the increasing importance of efficiency and software reusability. We refer the interested reader to Whitley [1989], Grefenstette [1990], Fink and Voß [1999], Schaerf et al. [2000], and Voß and Woodruff [2002].

2. CLASSIFICATION OF METAHEURISTICS

There are different ways to classify and describe metaheuristic algorithms. Depending on the characteristics selected to differentiate among them, several classifications are possible, each of them being the result of a specific viewpoint. We briefly summarize the most important ways of classifying metaheuristics.

Nature-inspired vs. non-nature-inspired. Perhaps the most intuitive way of classifying metaheuristics is based on the origins of the algorithm. There are nature-inspired algorithms, like Genetic Algorithms and Ant Algorithms, and non-nature-inspired ones such as Tabu Search and Iterated Local Search. In our opinion this classification is not very meaningful for the following two reasons. First, many recent hybrid algorithms do not fit either class (or, in a sense, they fit both at the same time). Second, it is sometimes difficult to clearly attribute an algorithm to one of the two classes. So, for example, one might ask whether the use of memory in Tabu Search is not nature-inspired as well.

Population-based vs. single point search. Another characteristic that can be used for the classification of metaheuristics is the number of solutions used at the same time: Does the algorithm work on a population or on a single solution at any time? Algorithms working on single solutions are called trajectory methods and encompass local search-based metaheuristics, like Tabu Search, Iterated Local Search and Variable Neighborhood Search. They all share the property of describing a trajectory in the search space during the search process. Population-based metaheuristics, on the contrary, perform search processes which describe the evolution of a set of points in the search space.

Dynamic vs. static objective function. Metaheuristics can also be classified according to the way they make use of the objective function. While some algorithms keep the objective function given in the problem representation “as it is”, some others, like Guided Local Search (GLS), modify it during the search. The idea behind this approach is to escape from local minima by modifying the search landscape. Accordingly, during the search the objective function is altered by trying to incorporate information collected during the search process.

One vs. various neighborhood structures. Most metaheuristic algorithms work on one single neighborhood structure. In other words, the fitness landscape topology does not change in the course of the algorithm. Other metaheuristics, such as Variable Neighborhood Search (VNS), use a set of neighborhood structures which gives the possibility to diversify the search by swapping between different fitness landscapes.

Memory usage vs. memory-less methods. A very important feature to classify metaheuristics is the use they make of the search history, that is, whether they use memory or not.4

4 Here we refer to the use of adaptive memory, in contrast to rather rigid memory, as used for instance in Branch & Bound.


Memory-less algorithms perform a Markov process, as the only information they use to determine the next action is the current state of the search process. There are several different ways of making use of memory. Usually we differentiate between the use of short-term and long-term memory. The first usually keeps track of recently performed moves, visited solutions or, in general, decisions taken. The second is usually an accumulation of synthetic parameters about the search. The use of memory is nowadays recognized as one of the fundamental elements of a powerful metaheuristic.

In the following, we describe the most important metaheuristics according to the single point vs. population-based search classification, which divides metaheuristics into trajectory methods and population-based methods. This choice is motivated by the fact that this categorization permits a clearer description of the algorithms. Moreover, a current trend is the hybridization of methods in the direction of the integration of single point search algorithms into population-based ones. In the following two sections, we give a detailed description of the most important metaheuristics of today.

3. TRAJECTORY METHODS

In this section we outline metaheuristics called trajectory methods. The term trajectory methods is used because the search process performed by these methods is characterized by a trajectory in the search space. Hereby, a successor solution may or may not belong to the neighborhood of the current solution.

The search process of trajectory methods can be seen as the evolution in (discrete) time of a discrete dynamical system [Bar-Yam 1997; Devaney 1989]. The algorithm starts from an initial state (the initial solution) and describes a trajectory in the state space. The system dynamics depends on the strategy used; simple algorithms generate a trajectory composed of two parts: a transient phase followed by an attractor (a fixed point, a cycle or a complex attractor).

s ← GenerateInitialSolution()
repeat
    s ← Improve(N(s))
until no improvement is possible

Fig. 1. Algorithm: Iterative Improvement.

Algorithms with advanced strategies generate more complex trajectories, which cannot be subdivided into those two phases. The characteristics of the trajectory provide information about the behavior of the algorithm and its effectiveness with respect to the instance that is tackled. It is worth underlining that the dynamics is the result of the combination of algorithm, problem representation and problem instance. In fact, the problem representation together with the neighborhood structures defines the search landscape; the algorithm describes the strategy used to explore the landscape and, finally, the actual search space characteristics are defined by the problem instance to be solved.

We will first describe basic local search algorithms, before we proceed with the survey of more complex strategies. Finally, we deal with algorithms that are general explorative strategies which may incorporate other trajectory methods as components.

3.1. Basic Local Search: Iterative Improvement

The basic local search is usually called iterative improvement, since each move5 is only performed if the resulting solution is better than the current solution. The algorithm stops as soon as it finds a local minimum. The high level algorithm is sketched in Figure 1.

The function Improve(N(s)) can be in the extremes either a first improvement or a best improvement function, or any intermediate option. The former scans the neighborhood N(s) and chooses the first solution that is better than s; the latter exhaustively explores the neighborhood and returns one of the solutions with the lowest objective function value.

5 A move is the choice of a solution s′ from the neighborhood N(s) of a solution s.
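A minimal runnable rendering of Figure 1 and of the two extreme choices for Improve might look as follows; the neighborhood N is assumed to be a function returning an iterable of neighbors, as in the earlier sketch.

from typing import Callable, Iterable, TypeVar

S = TypeVar("S")

def first_improvement(s: S, f: Callable[[S], float], N: Callable[[S], Iterable[S]]) -> S:
    """Scan N(s) and return the first solution strictly better than s."""
    for s2 in N(s):
        if f(s2) < f(s):
            return s2
    return s

def best_improvement(s, f, N):
    """Exhaustively explore N(s) and return a neighbor with lowest f."""
    return min(N(s), key=f, default=s)

def iterative_improvement(s, f, N, improve=best_improvement):
    """Figure 1: repeat s <- Improve(N(s)) until no improvement is possible."""
    while True:
        s2 = improve(s, f, N)
        if f(s2) >= f(s):
            return s            # s is a local minimum
        s = s2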


s ← GenerateInitialSolution()
T ← T0
while termination conditions not met do
    s′ ← PickAtRandom(N(s))
    if (f(s′) < f(s)) then
        s ← s′                                   % s′ replaces s
    else
        Accept s′ as new solution with probability p(T, s′, s)
    endif
    Update(T)
endwhile

Fig. 2. Algorithm: Simulated Annealing (SA).

Both methods stop at local minima. Therefore, their performance strongly depends on the definition of S, f and N. The performance of iterative improvement procedures on CO problems is usually quite unsatisfactory. Therefore, several techniques have been developed to prevent algorithms from getting trapped in local minima; this is done by adding mechanisms that allow them to escape from local minima. This also implies that the termination conditions of metaheuristic algorithms are more complex than simply reaching a local minimum. Indeed, possible termination conditions include: a maximum CPU time, a maximum number of iterations, a solution s with f(s) less than a predefined threshold value is found, or the maximum number of iterations without improvements is reached.

3.2. Simulated Annealing

Simulated Annealing (SA) is commonly said to be the oldest among the metaheuristics and surely one of the first algorithms that had an explicit strategy to escape from local minima. The origins of the algorithm are in statistical mechanics (Metropolis algorithm) and it was first presented as a search algorithm for CO problems in Kirkpatrick et al. [1983] and Cerny [1985]. The fundamental idea is to allow moves resulting in solutions of worse quality than the current solution (uphill moves) in order to escape from local minima. The probability of doing such a move is decreased during the search. The high level algorithm is described in Figure 2.

The algorithm starts by generating an initial solution (either randomly or heuristically constructed) and by initializing the so-called temperature parameter T. Then, at each iteration a solution s′ ∈ N(s) is randomly sampled and it is accepted as new current solution depending on f(s), f(s′) and T. s′ replaces s if f(s′) < f(s) or, in case f(s′) ≥ f(s), with a probability which is a function of T and f(s′) − f(s). The probability is generally computed following the Boltzmann distribution exp(−(f(s′) − f(s))/T).

The temperature T is decreased6 during the search process; thus at the beginning of the search the probability of accepting uphill moves is high and it gradually decreases, converging to a simple iterative improvement algorithm. This process is analogous to the annealing process of metals and glass, which assume a low energy configuration when cooled with an appropriate cooling schedule. Regarding the search process, this means that the algorithm is the result of two combined strategies: random walk and iterative improvement. In the first phase of the search, the bias toward improvements is low and it permits the exploration of the search space; this erratic component is slowly decreased, thus leading the search to converge to a (local) minimum. The probability of accepting uphill moves is controlled by two factors: the difference of the objective functions and the temperature. On the one hand, at fixed temperature, the higher the difference f(s′) − f(s), the lower the probability to accept a move from s to s′. On the other hand, the higher T, the higher the probability of uphill moves.

The choice of an appropriate cooling schedule is crucial for the performance of the algorithm. The cooling schedule defines the value of T at each iteration k, Tk+1 = Q(Tk, k), where Q(Tk, k) is a function of the temperature and of the iteration number. Theoretical results on non-homogeneous Markov chains [Aarts et al. 1997] state that under particular conditions on the cooling schedule, the algorithm converges in probability to a global minimum as k → ∞.

6 T is not necessarily decreased in a monotonic fashion. Elaborate cooling schemes also incorporate an occasional increase of the temperature.


More precisely:

∃ Γ ∈ ℝ such that  lim_{k→∞} p(global minimum found after k steps) = 1  iff  ∑_{k=1}^{∞} exp(−Γ/Tk) = ∞.

A particular cooling schedule that fulfils the hypothesis for the convergence is the one that follows a logarithmic law: Tk+1 = Γ/log(k + k0) (where k0 is a constant). Unfortunately, cooling schedules which guarantee the convergence to a global optimum are not feasible in applications, because they are too slow for practical purposes. Therefore, faster cooling schedules are adopted in applications. One of the most used follows a geometric law: Tk+1 = αTk, where α ∈ (0, 1), which corresponds to an exponential decay of the temperature.
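The following sketch puts Figure 2 together with the geometric cooling rule Tk+1 = αTk. The initial temperature, cooling factor and iteration budget are illustrative parameter choices of ours, and N(s) is assumed to return a list of neighbors.

import math
import random

def simulated_annealing(s, f, N, T0=10.0, alpha=0.95, iters=10000):
    """Figure 2 with geometric cooling: uphill moves are accepted with the
    Boltzmann probability exp(-(f(s') - f(s)) / T), which shrinks as T decays."""
    T = T0
    best = s
    for _ in range(iters):
        s2 = random.choice(N(s))                # PickAtRandom(N(s))
        delta = f(s2) - f(s)
        if delta < 0 or random.random() < math.exp(-delta / T):
            s = s2                              # accept (downhill, or lucky uphill)
        if f(s) < f(best):
            best = s
        T *= alpha                              # Update(T): geometric cooling
    return best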

The cooling rule may vary during the search, with the aim of tuning the balance between diversification and intensification. For example, at the beginning of the search, T might be constant or linearly decreasing, in order to sample the search space; then, T might follow a rule such as the geometric one, to converge to a local minimum at the end of the search. More successful variants are nonmonotonic cooling schedules (e.g., see Osman [1993] and Lundy and Mees [1986]). Nonmonotonic cooling schedules are characterized by alternating phases of cooling and reheating, thus providing an oscillating balance between diversification and intensification.

The cooling schedule and the initial temperature should be adapted to the particular problem instance, since the cost of escaping from local minima depends on the structure of the search landscape. A simple way of empirically determining the starting temperature T0 is to initially sample the search space with a random walk to roughly evaluate the average and the variance of objective function values. But also more elaborate schemes can be implemented [Ingber 1996].
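One way to implement the random-walk estimate of T0 described above is sketched below. The target acceptance probability is an assumption of ours, used only to turn the sampled average uphill move into a temperature; the text itself prescribes nothing beyond sampling the average and variance of objective values.

import math
import random
import statistics

def estimate_T0(s, f, N, samples=100, target_accept=0.8):
    """Take a short random walk, record |f(s') - f(s)| along it, and choose T0
    so that an average uphill move would be accepted with probability
    target_accept, i.e. exp(-mean_delta / T0) = target_accept (an assumption)."""
    deltas = []
    for _ in range(samples):
        s2 = random.choice(N(s))
        deltas.append(abs(f(s2) - f(s)))
        s = s2                                   # random walk step
    mean_delta = statistics.mean(deltas) or 1.0  # guard against an all-zero walk
    return mean_delta / -math.log(target_accept)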

The dynamic process described by SA is a Markov chain [Feller 1968], as it follows a trajectory in the state space in which the successor state is chosen depending only on the incumbent one.

s ← GenerateInitialSolution()
TabuList ← ∅
while termination conditions not met do
    s ← ChooseBestOf(N(s) \ TabuList)
    Update(TabuList)
endwhile

Fig. 3. Algorithm: Simple Tabu Search (TS).

This means that basic SA is memory-less. However, the use of memory can be beneficial for SA approaches (see, e.g., Chardaire et al. [1995]).

SA has been applied to several CO problems, such as the Quadratic Assignment Problem (QAP) [Connolly 1990] and the Job Shop Scheduling (JSS) problem [Van Laarhoven et al. 1992]. References to other applications can be found in Aarts and Lenstra [1997], Ingber [1996] and Fleischer [1995]. SA is nowadays used as a component in metaheuristics, rather than applied as a stand-alone search algorithm. Variants of SA called Threshold Accepting and the Great Deluge Algorithm were presented by Dueck and Scheuer [1990] and Dueck [1993].

3.3. Tabu Search

Tabu Search (TS) is among the most cited and used metaheuristics for CO problems. The basic ideas of TS were first introduced in Glover [1986], based on earlier ideas formulated in Glover [1977].7 A description of the method and its concepts can be found in Glover and Laguna [1997]. TS explicitly uses the history of the search, both to escape from local minima and to implement an explorative strategy. We will first describe a simple version of TS, to introduce the basic concepts. Then, we will explain a more applicable algorithm and finally we will discuss some improvements.

The simple TS algorithm (see Figure 3) applies a best improvement local search as its basic ingredient and uses a short-term memory to escape from local minima and to avoid cycles.

7 Related ideas were labelled the steepest ascent/mildest descent method in Hansen [1986].


The short-term memory is implemented as a tabu list that keeps track of the most recently visited solutions and forbids moves toward them. The neighborhood of the current solution is thus restricted to the solutions that do not belong to the tabu list. In the following we will refer to this set as the allowed set. At each iteration the best solution from the allowed set is chosen as the new current solution. Additionally, this solution is added to the tabu list and one of the solutions that were already in the tabu list is removed (usually in FIFO order). Due to this dynamic restriction of allowed solutions in a neighborhood, TS can be considered as a dynamic neighborhood search technique [Stützle 1999b]. The algorithm stops when a termination condition is met. It might also terminate if the allowed set is empty, that is, if all the solutions in N(s) are forbidden by the tabu list.8

The use of a tabu list prevents the search from returning to recently visited solutions and therefore prevents endless cycling,9 and it forces the search to accept even uphill moves. The length l of the tabu list (i.e., the tabu tenure) controls the memory of the search process. With small tabu tenures the search will concentrate on small areas of the search space. On the opposite, a large tabu tenure forces the search process to explore larger regions, because it forbids revisiting a higher number of solutions. The tabu tenure can be varied during the search, leading to more robust algorithms. An example can be found in Taillard [1991], where the tabu tenure is periodically reinitialized at random from the interval [lmin, lmax]. A more advanced use of a dynamic tabu tenure is presented in Battiti and Tecchiolli [1994] and Battiti and Protasi [1997], where the tabu tenure is increased if there is evidence for repetitions of solutions (thus a higher diversification is needed), while it is decreased if there are no improvements (thus intensification should be boosted). More advanced ways to create a dynamic tabu tenure are described in Glover [1990].

8 Strategies to avoid stopping the search when the allowed set is empty include choosing the least recently visited solution, even if it is tabu.
9 Cycles of higher period are possible, since the tabu list has a finite length l which is smaller than the cardinality of the search space.

However, the implementation of short-term memory as a list that contains complete solutions is not practical, because managing a list of solutions is highly inefficient. Therefore, instead of the solutions themselves, solution attributes are stored.10 Attributes are usually components of solutions, moves, or differences between two solutions. Since more than one attribute can be considered, a tabu list is introduced for each of them. The set of attributes and the corresponding tabu lists define the tabu conditions which are used to filter the neighborhood of a solution and generate the allowed set. Storing attributes instead of complete solutions is much more efficient, but it introduces a loss of information, as forbidding an attribute means assigning the tabu status to probably more than one solution. Thus, it is possible that unvisited solutions of good quality are excluded from the allowed set. To overcome this problem, aspiration criteria are defined which allow a solution to be included in the allowed set even if it is forbidden by tabu conditions. Aspiration criteria define the aspiration conditions that are used to construct the allowed set. The most commonly used aspiration criterion selects solutions which are better than the current best one. The complete algorithm, as described above, is reported in Figure 4.
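A compact sketch combining the flavor of Figures 3 and 4 follows; the tabu list stores move attributes rather than complete solutions, with a FIFO tenure, and the aspiration criterion is the standard "better than the best found so far". The move_attr function, which extracts a hashable attribute of a move, is problem-specific and assumed to be supplied by the user.

from collections import deque

def tabu_search(s, f, N, move_attr, tenure=7, iters=1000):
    """Best admissible neighbor + attribute tabu list + aspiration criterion."""
    tabu = deque(maxlen=tenure)            # short-term memory; length = tabu tenure
    best = s
    for _ in range(iters):
        allowed = [s2 for s2 in N(s)
                   if move_attr(s, s2) not in tabu   # tabu condition
                   or f(s2) < f(best)]               # aspiration condition
        if not allowed:                    # allowed set empty: terminate (see text)
            break
        s2 = min(allowed, key=f)           # ChooseBestOf(AllowedSet)
        tabu.append(move_attr(s, s2))      # forbid this move attribute for a while
        s = s2
        if f(s) < f(best):
            best = s
    return best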

Tabu lists are only one of the possible ways of taking advantage of the history of the search. They are usually identified with the usage of short-term memory. Information collected during the whole search process can also be very useful, especially for a strategic guidance of the algorithm. This kind of long-term memory is usually added to TS by referring to four principles: recency, frequency, quality and influence.

10 In addition to storing attributes, some longer-term TS strategies also keep complete solutions (e.g., elite solutions) in the memory.


s ← GenerateInitialSolution()
InitializeTabuLists(TL1, . . . , TLr)
k ← 0
while termination conditions not met do
    AllowedSet(s, k) ← {s′ ∈ N(s) | s′ does not violate a tabu condition,
                        or it satisfies at least one aspiration condition}
    s ← ChooseBestOf(AllowedSet(s, k))
    UpdateTabuListsAndAspirationConditions()
    k ← k + 1
endwhile

Fig. 4. Algorithm: Tabu Search (TS).

while termination conditions not met do
    s ← ConstructGreedyRandomizedSolution()   % see Figure 6
    ApplyLocalSearch(s)
    MemorizeBestFoundSolution()
endwhile

Fig. 5. Algorithm: Greedy Randomized Adaptive Search Procedure (GRASP).

Recency-based memory records for each solution (or attribute) the most recent iteration it was involved in. Orthogonally, frequency-based memory keeps track of how many times each solution (attribute) has been visited. This information identifies the regions (or the subsets) of the solution space where the search was confined, or where it stayed for a high number of iterations. This kind of information about the past is usually exploited to diversify the search. The third principle (i.e., quality) refers to the accumulation and extraction of information from the search history in order to identify good solution components. This information can be usefully integrated in the solution construction. Other metaheuristics (e.g., Ant Colony Optimization) explicitly use this principle to learn about good combinations of solution components. Finally, influence is a property regarding choices made during the search and can be used to indicate which choices have shown to be the most critical. In general, the TS field is a rich source of ideas. Many of these ideas and strategies have been and are currently being adopted by other metaheuristics.

TS has been applied to most CO problems; examples of successful applications are the Robust Tabu Search for the QAP [Taillard 1991], the Reactive Tabu Search for the MAXSAT problem [Battiti and Protasi 1997], and applications to assignment problems [Dell’Amico et al. 1999].

TS approaches dominate the Job Shop Scheduling (JSS) problem area (see, e.g., Nowicki and Smutnicki [1996]) and the Vehicle Routing (VR) area [Gendreau et al. 2001]. Further current applications can be found at [Tabu Search website 2003].

3.4. Explorative Local Search Methods

In this section, we present more recently proposed trajectory methods. These are the Greedy Randomized Adaptive Search Procedure (GRASP), Variable Neighborhood Search (VNS), Guided Local Search (GLS) and Iterated Local Search (ILS).

3.4.1. GRASP. The Greedy Randomized Adaptive Search Procedure (GRASP), see Feo and Resende [1995] and Pitsoulis and Resende [2002], is a simple metaheuristic that combines constructive heuristics and local search. Its structure is sketched in Figure 5. GRASP is an iterative procedure, composed of two phases: solution construction and solution improvement. The best found solution is returned upon termination of the search process.

The solution construction mechanism (see Figure 6) is characterized by two main ingredients: a dynamic constructive heuristic and randomization.


s ← ∅                                        % s denotes a partial solution in this case
α ← DetermineCandidateListLength()           % definition of the RCL length
while solution not complete do
    RCLα ← GenerateRestrictedCandidateList(s)
    x ← SelectElementAtRandom(RCLα)
    s ← s ∪ {x}
    UpdateGreedyFunction(s)                  % update of the heuristic values (see text)
endwhile

Fig. 6. Greedy randomized solution construction.

Assuming that a solution s consists of a subset of a set of elements (solution components), the solution is constructed step-by-step by adding one new element at a time. The choice of the next element is done by picking it uniformly at random from a candidate list. The elements are ranked by means of a heuristic criterion that gives them a score as a function of the (myopic) benefit if inserted in the current partial solution. The candidate list, called the restricted candidate list (RCL), is composed of the best α elements. The heuristic values are updated at each step, thus the scores of elements change during the construction phase, depending on the possible choices. This constructive heuristic is called dynamic, in contrast to the static one which assigns a score to elements only before starting the construction. For instance, one of the static heuristics for the TSP is based on arc costs: the lower the cost of an arc, the higher its score. An example of a dynamic heuristic is the cheapest insertion heuristic, where the score of an element is evaluated depending on the current partial solution.

The length α of the restricted candidate list determines the strength of the heuristic bias. In the extreme case of α = 1 the best element would be added, thus the construction would be equivalent to a deterministic greedy heuristic. On the opposite, in case α = n the construction would be completely random (indeed, the choice of an element from the candidate list is done at random). Therefore, α is a critical parameter which influences the sampling of the search space. In Pitsoulis and Resende [2002] the most important schemes to define α are listed. The simplest scheme is, trivially, to keep α constant; it can also be changed at each iteration, either randomly or by means of an adaptive scheme.
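The construction phase of Figure 6 can be sketched as follows; score(s, e) stands for the (dynamic) greedy function that re-evaluates each remaining element against the current partial solution, and is a placeholder for a problem-specific heuristic.

import random

def grasp_construct(elements, score, alpha):
    """Greedy randomized construction: at each step, pick uniformly at random
    one of the alpha best-scored remaining elements (the RCL). alpha = 1 gives
    a deterministic greedy heuristic; alpha = len(elements) is purely random."""
    s, remaining = [], list(elements)
    while remaining:
        # Re-rank the remaining elements w.r.t. the partial solution s
        # (dynamic heuristic); higher score = more promising, as in the text.
        remaining.sort(key=lambda e: score(s, e), reverse=True)
        rcl = remaining[:alpha]                # restricted candidate list
        x = random.choice(rcl)                 # SelectElementAtRandom(RCL)
        s.append(x)
        remaining.remove(x)
    return s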

The second phase of the algorithm is a local search process, which may be a basic local search algorithm such as iterative improvement, or a more advanced technique such as SA or TS. GRASP can be effective if two conditions are satisfied:

—the solution construction mechanism samples the most promising regions of the search space;
—the solutions constructed by the constructive heuristic belong to basins of attraction of different locally minimal solutions.

The first condition can be met by the choice of an effective constructive heuristic and an appropriate length of the candidate list, whereas the second condition can be met by choosing the constructive heuristic and the local search in a way such that they fit well.

The description of GRASP as given above indicates that a basic GRASP does not use the history of the search process.11 The only memory requirement is for storing the problem instance and for keeping the best so-far solution. This is one of the reasons why GRASP is often outperformed by other metaheuristics. However, due to its simplicity, it is generally very fast and it is able to produce quite good solutions in a very short amount of computation time. Furthermore, it can be successfully integrated into other search techniques.

11 However, some extensions in this direction are cited in Pitsoulis and Resende [2002], and an example for a metaheuristic method using an adaptive greedy procedure depending on search history is Squeaky Wheel Optimization (SWO) [Joslin and Clements 1999].


Select a set of neighborhood structures Nk, k = 1, . . . , kmax
s ← GenerateInitialSolution()
while termination conditions not met do
    k ← 1
    while k < kmax do                    % Inner loop
        s′ ← PickAtRandom(Nk(s))         % Shaking phase
        s′′ ← LocalSearch(s′)
        if (f(s′′) < f(s)) then
            s ← s′′
            k ← 1
        else
            k ← k + 1
        endif
    endwhile
endwhile

Fig. 7. Algorithm: Variable Neighborhood Search (VNS).

Among the applications of GRASP, we mention the JSS problem [Binato et al. 2001], the graph planarization problem [Resende and Ribeiro 1997] and assignment problems [Prais and Ribeiro 2000]. A detailed and annotated bibliography references many more applications [Festa and Resende 2002].

3.4.2. Variable Neighborhood Search. Variable Neighborhood Search (VNS) is a metaheuristic proposed in Hansen and Mladenovic [1999, 2001], which explicitly applies a strategy based on dynamically changing neighborhood structures. The algorithm is very general and many degrees of freedom exist for designing variants and particular instantiations.12

At the initialization step, a set of neighborhood structures has to be defined. These neighborhoods can be arbitrarily chosen, but often a sequence |N1| < |N2| < · · · < |Nkmax| of neighborhoods with increasing cardinality is defined.13 Then an initial solution is generated, the neighborhood index is initialized and the algorithm iterates until a stopping condition is met (see Figure 7). VNS’ main cycle is composed of three phases: shaking, local search and move. In the shaking phase a solution s′ in the kth neighborhood of the current solution s is randomly selected.

12 The variants described in the following are also described in Hansen and Mladenovic [1999, 2001].
13 In principle they could be one included in the other, N1 ⊂ N2 ⊂ · · · ⊂ Nkmax. Nevertheless, such a sequence might produce an inefficient search, because a large number of solutions could be revisited.

Then, s′ becomes the starting point of the local search. The local search can use any neighborhood structure and is not restricted to the set of neighborhood structures Nk, k = 1, . . . , kmax. At the end of the local search process (terminated as soon as a predefined termination condition is verified) the new solution s′′ is compared with s and, if it is better, it replaces s and the algorithm starts again with k = 1. Otherwise, k is incremented and a new shaking phase starts using a different neighborhood.

The objective of the shaking phase is to perturb the solution so as to provide a good starting point for the local search. The starting point should belong to the basin of attraction of a different local minimum than the current one, but should not be “too far” from s, otherwise the algorithm would degenerate into a simple random multi-start. Moreover, choosing s′ in the neighborhood of the current best solution is likely to produce a solution that maintains some good features of the current one.
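A runnable sketch of Figure 7 might look as follows; neighborhoods is assumed to be a list of functions, each returning a non-empty list of neighbors, and local_search any procedure mapping a solution to a (locally optimized) solution. The fixed iteration budget stands in for the generic termination conditions.

import random

def vns(s, f, neighborhoods, local_search, iters=100):
    """Shaking in N_k, local search, then move or switch to a larger N_k."""
    for _ in range(iters):                        # outer termination condition
        k = 0
        while k < len(neighborhoods):             # inner loop over N_1..N_kmax
            s1 = random.choice(neighborhoods[k](s))   # shaking phase
            s2 = local_search(s1)
            if f(s2) < f(s):
                s, k = s2, 0      # improvement: move and restart from N_1
            else:
                k += 1            # no improvement: diversify with the next neighborhood
    return s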

The process of changing neighborhoods in case of no improvements corresponds to a diversification of the search. In particular, the choice of neighborhoods of increasing cardinality yields a progressive diversification. The effectiveness of this dynamic neighborhood strategy can be explained by the fact that a “bad” place on the search landscape given by one neighborhood could be a “good” place on the search landscape given by another neighborhood.14 Moreover, a solution that is locally optimal with respect to a neighborhood is probably not locally optimal with respect to another neighborhood. These concepts are known as “One Operator, One Landscape” and explained in Jones [1995a, 1995b]. The core idea is that the neighborhood structure determines the topological properties of the search landscape, that is, each neighborhood defines one landscape. The properties of a landscape are in general different from those of other landscapes; therefore a search strategy performs differently on them (see an example in Figure 8).

14 A “good” place in the search space is an area from which a good local minimum can be reached.


Fig. 8. Two search landscapes defined by two different neighborhoods. On the landscape shown in the graphic on the left, the best improvement local search stops at s1, while it proceeds to a better local minimum s2 on the landscape shown in the graphic on the right.

This property is directly exploited by a local search called Variable Neighborhood Descent (VND). In VND a best improvement local search (see Section 3.1) is applied, and, in case a local minimum is found, the search proceeds with another neighborhood structure. The VND algorithm can be obtained by substituting the inner loop of the VNS algorithm (see Figure 7) with the following pseudo-code:

s′ ← ChooseBestOf(Nk(s))
if (f(s′) < f(s)) then        % i.e., a better solution is found in Nk(s)
    s ← s′
else                          % i.e., s is a local minimum
    k ← k + 1
endif

As can be observed from the description given above, the choice of the neighborhood structures is the critical point of VNS and VND. The neighborhoods chosen should exploit different properties and characteristics of the search space, that is, the neighborhood structures should provide different abstractions of the search space. A variant of VNS is obtained by selecting the neighborhoods in such a way as to produce a problem decomposition (the algorithm is called Variable Neighborhood Decomposition Search—VNDS). VNDS follows the usual VNS scheme, but the neighborhood structures and the local search are defined on sub-problems. For each solution, all attributes (usually variables) are kept fixed except for k of them. For each k, a neighborhood structure Nk is defined. The local search only considers changes on the variables belonging to the sub-problem it is applied to. The inner loop of VNDS is the following:

s′ ← PickAtRandom(Nk(s))           % s and s′ differ in k attributes
s′′ ← LocalSearch(s′, Attributes)  % only moves involving the k attributes are allowed
if (f(s′′) < f(s)) then
    s ← s′′
    k ← 1
else
    k ← k + 1
endif

The decision whether to perform a move can be varied as well. The acceptance criterion based on improvements is strongly steepest descent-oriented and it might not be suited to effectively explore the search space. For example, when local minima are clustered, VNS can quickly find the best optimum in a cluster, but it has no guidance to leave that cluster and find another one. Skewed VNS (SVNS) extends VNS by providing a more flexible acceptance criterion that also takes into account the distance from the current solution.15 The new acceptance criterion is the following: besides always accepting improvements, worse solutions can be accepted if the distance from the current one is less than a value αρ(s, s′′).

15 A distance measure between solutions thus has to be formally defined.


Fig. 9. Basic GLS idea: Escaping from a valley in the landscape by increasing the objective function value of its solutions.

The function ρ(s, s′′) measures the distance between s and s′′, and α is a parameter that weights the importance of the distance between the two solutions in the acceptance criterion. The inner loop of SVNS can be sketched as follows:

if (f(s′′) − αρ(s, s′′) < f(s)) then
    s ← s′′
    k ← 1
else
    k ← k + 1
endif

VNS and its variants have been successfully applied to graph-based CO problems such as the p-Median problem [Hansen and Mladenovic 1997], the degree constrained minimum spanning tree problem [Ribeiro and Souza 2002], the Steiner tree problem [Wade and Rayward-Smith 1997] and the k-Cardinality Tree (KCT) problem [Mladenovic and Urosevic 2001]. References to more applications can be found in Hansen and Mladenovic [2001].

3.4.3. Guided Local Search. Tabu Search and Variable Neighborhood Search explicitly deal with dynamic neighborhoods with the aim of efficiently and effectively exploring the search space. A different approach for guiding the search is to dynamically change the objective function. Among the most general methods that use this approach is Guided Local Search (GLS) [Voudouris and Tsang 1999; Voudouris 1997].

The basic GLS principle is to help the search to gradually move away from local minima by changing the search landscape. In GLS, the set of solutions and the neighborhood structure are kept fixed, while the objective function f is dynamically changed with the aim of making the current local optimum “less desirable”. A pictorial description of this idea is given in Figure 9.

The mechanism used by GLS is based on solution features, which may be any kind of properties or characteristics that can be used to discriminate between solutions. For example, solution features in the TSP could be arcs between pairs of cities, while in the MAXSAT problem they could be the number of unsatisfied clauses. An indicator function Ii(s) indicates whether the feature i is present in solution s:

Ii(s) = 1 if feature i is present in solution s, and 0 otherwise.

The objective function f is modified to yield a new objective function f′ by adding a term that depends on the m features:

f′(s) = f(s) + λ · Σ_{i=1}^{m} pi · Ii(s),

where pi are called penalty parameters and λ is called the regularization parameter.


s ← GenerateInitialSolution()
while termination conditions not met do
    s ← LocalSearch(s, f′)
    for all features i with maximum utility Util(s, i) do
        pi ← pi + 1
    endfor
    Update(f′, p)    % p is the penalty vector
endwhile

Fig. 10. Algorithm: Guided Local Search (GLS).

The penalty parameters weight the importance of the features: the higher pi, the higher the importance of feature i, thus the higher the cost of having that feature in the solution. The regularization parameter balances the relevance of features with respect to the original objective function.

The algorithm (see Figure 10) works as follows: It starts from an initial solution and applies a local search method until a local minimum is reached. Then the array p = (p1, . . . , pm) of penalties is updated by incrementing some of the penalties and the local search is started again. The penalized features are those that have the maximum utility:

Util(s, i) = Ii(s) · ci / (1 + pi),

where ci are costs assigned to every feature i giving a heuristic evaluation of the relative importance of features with respect to others. The higher the cost, the higher the utility of features. Nevertheless, the cost is scaled by the penalty parameter to prevent the algorithm from being totally biased toward the cost and to make it sensitive to the search history.

The penalties update procedure can be modified by adding a multiplicative rule to the simple incrementing rule (that is applied at each iteration). The multiplicative rule has the form: pi ← pi · α, where α ∈ (0, 1). This rule is applied with a lower frequency than the incrementing one (for example every few hundreds of iterations) with the aim of smoothing the weights of penalized features so as to prevent the landscape from becoming too rugged. It is important to note that the penalties update rules are often very sensitive to the problem instance.
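Putting the pieces together, the following is a minimal Python sketch of GLS under simplifying assumptions: features(s) returns the set of feature indices present in s, cost maps each feature i to its heuristic cost ci, and local_search minimizes a given objective. All names are illustrative, and the optional multiplicative smoothing rule is omitted.

def guided_local_search(s0, f, features, cost, local_search, lam, iterations):
    p = {i: 0 for i in cost}                       # penalty vector, initially zero
    def f_aug(s):                                  # augmented objective f'
        return f(s) + lam * sum(p[i] for i in features(s))
    s = s0
    for _ in range(iterations):
        s = local_search(s, f_aug)                 # descend on f', not on f
        util = {i: cost[i] / (1 + p[i]) for i in features(s)}
        if util:
            u_max = max(util.values())
            for i, u in util.items():              # penalize max-utility features
                if u == u_max:
                    p[i] += 1
    return s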

GLS has been successfully applied to the weighted MAXSAT [Mills and Tsang 2000], the VR problem [Kilby et al. 1999], the TSP and the QAP [Voudouris and Tsang 1999].

3.4.4. Iterated Local Search. We conclude this presentation of explorative strategies with Iterated Local Search (ILS), the most general scheme among the explorative strategies. On the one hand, its generality makes it a framework for other metaheuristics (such as VNS); on the other hand, other metaheuristics can be easily incorporated as subcomponents. ILS is a simple but powerful metaheuristic algorithm [Stutzle 1999a, 1999b; Lourenco et al. 2001, 2002; Martin et al. 1991]. It applies local search to an initial solution until it finds a local optimum; then it perturbs the solution and it restarts local search. The importance of the perturbation is obvious: too small a perturbation might not enable the system to escape from the basin of attraction of the local optimum just found. On the other side, too strong a perturbation would make the algorithm similar to a random restart local search.

A local search is effective if it is able to find good local minima, that is, if it can find the basin of attraction of those states. When the search space is wide and/or when the basins of attraction of good local optima are small,16 a simple multi-start algorithm is almost useless. An effective search could be designed as a trajectory only in the set S∗ of local optima, instead of in the set S of all the states. Unfortunately, in most cases there is no feasible way of introducing a neighborhood structure for S∗. Therefore, a trajectory along local optima s1, s2, . . . , st is performed, without explicitly introducing a neighborhood structure, by applying the following scheme:

16 The basin of attraction size of a point s (in a finite space) is defined as the fraction of initial states of trajectories which converge to point s.


s0 ← GenerateInitialSolution()
s ← LocalSearch(s0)
while termination conditions not met do
    s′ ← Perturbation(s, history)
    s′′ ← LocalSearch(s′)
    s ← ApplyAcceptanceCriterion(s, s′′, history)
endwhile

Fig. 11. Algorithm: Iterated Local Search (ILS).

(1) Execute local search (LS) from an initial state s0 until a local minimum s is found.
(2) Perturb s and obtain s′.
(3) Execute LS from s′ until a local minimum s′′ is reached.
(4) On the basis of an acceptance criterion, decide whether to set s ← s′′.
(5) Goto step 2.

The requirement on the perturbation of s is to produce a starting point for local search such that a local minimum different from s is reached. However, this new local minimum should be closer to s than a local minimum produced by a random restart. The acceptance criterion acts as a counterbalance, as it filters and gives feedback to the perturbation action, depending on the characteristics of the new local minimum. A high level description of ILS as it is described in Lourenco et al. [2002] is given in Figure 11. Figure 12 shows a possible (lucky) ILS step.
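A compact Python rendering of this scheme, with the perturbation, local search, and acceptance criterion passed in by the caller (an illustrative interface, not a fixed API), could look as follows:

def iterated_local_search(s0, f, local_search, perturb, accept, max_iters):
    s = local_search(s0)                  # step (1): descend to a local optimum
    history = []
    for _ in range(max_iters):
        s1 = perturb(s, history)          # step (2): kick-move away from s
        s2 = local_search(s1)             # step (3): descend into a new basin
        s = accept(s, s2, history)        # step (4): e.g., keep s2 if f(s2) < f(s)
        history.append(f(s))              # memory of the search trajectory
    return s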

The design of ILS algorithms has several degrees of freedom in the choice of the initial solution, perturbation and acceptance criteria. A key role is played by the history of the search, which can be exploited both in the form of short and long term memory.

The construction of initial solutions should be fast (computationally not expensive), and initial solutions should be a good starting point for local search. The fastest way of producing an initial solution is to generate it at random; however, this is the easiest way only for problems that are unconstrained, whilst in other cases the construction of a feasible solution requires also constraint checking. Constructive methods, guided by heuristics, can also be adopted. It is worth underlining that an initial solution is considered a good starting point depending on the particular LS applied and on the problem structure, thus the algorithm designer's goal is to find a trade-off between speed and quality of solutions.

The perturbation is usually non-deterministic in order to avoid cycling. Its most important characteristic is the strength, roughly defined as the amount of changes made on the current solution. The strength can be either fixed or variable. In the first case, the distance between s and s′ is kept constant, independently of the problem size. However, a variable strength is in general more effective, since it has been experimentally found that, in most of the problems, the bigger the problem size, the larger should be the strength. More sophisticated schemes are possible; for example, the strength can be adaptive: it increases when more diversification is needed and it decreases when intensification seems preferable. VNS and its variants belong to this category. A second choice is the mechanism to perform perturbations. This may be a random mechanism, or the perturbation may be produced by a (semi-)deterministic method (e.g., a LS different from the one used in the main algorithm).

The third important component is the acceptance criterion. Two extreme examples consist in (1) accepting the new local optimum only in case of improvement and (2) in always accepting the new solution. In-between, there are several possibilities. For example, it is possible to adopt a kind of annealing schedule: accept all the improving new local optima and accept also the nonimproving ones with a probability that is a function of the temperature T and the difference of objective function values. In formulas:

p(Accept(s, s′′, history)) = 1 if f(s′′) < f(s), and exp(−(f(s′′) − f(s))/T) otherwise.
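For illustration, this annealing-like acceptance criterion can be written as a few lines of Python (a sketch; the temperature T would be lowered by the cooling schedule discussed next):

import math, random

def annealing_accept(f, s, s2, T):
    # Always accept improving local optima; accept worse ones with a
    # Boltzmann probability that shrinks as T decreases.
    if f(s2) < f(s):
        return s2
    if random.random() < math.exp(-(f(s2) - f(s)) / T):
        return s2
    return s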

The cooling schedule can be either monotonic (non-increasing in time) or non-monotonic (adapted to tune the balance between diversification and intensification).


Fig. 12. A desirable ILS step: the local minimum s is perturbed, then LS is applied and a new local minimum is found.

The non-monotonic schedule is particularly effective if it exploits the history of the search, in a way similar to the Reactive Tabu Search [Taillard 1991] mentioned at the end of the section about Tabu Search. When intensification seems no longer effective, a diversification phase is needed and the temperature is increased.

Examples for successful applications of ILS are to the TSP [Martin and Otto 1996; Johnson and McGeoch 1997], to the QAP [Lourenco et al. 2002], and to the Single Machine Total Weighted Tardiness (SMTWT) problem [den Besten et al. 2001]. References to other applications can be found in Lourenco et al. [2002].

4. POPULATION-BASED METHODS

Population-based methods deal in every iteration of the algorithm with a set (i.e., a population) of solutions17 rather than with a single solution. As they deal with a population of solutions, population-based algorithms provide a natural, intrinsic way for the exploration of the search space. Yet, the final performance depends strongly on the way the population is manipulated.

17 In general, especially in EC algorithms, we talk about a population of individuals rather than solutions.

The most studied population-based methods in combinatorial optimization are Evolutionary Computation (EC) and Ant Colony Optimization (ACO). In EC algorithms, a population of individuals is modified by recombination and mutation operators, and in ACO a colony of artificial ants is used to construct solutions guided by the pheromone trails and heuristic information.

4.1. Evolutionary Computation

Evolutionary Computation (EC) algorithms are inspired by nature's capability to evolve living beings well adapted to their environment. EC algorithms can be succinctly characterized as computational models of evolutionary processes. At each iteration a number of operators is applied to the individuals of the current population to generate the individuals of the population of the next generation (iteration). Usually, EC algorithms use operators called recombination or crossover to recombine two or more individuals to produce new individuals. They also use mutation or modification operators which cause a self-adaptation of individuals. The driving force in evolutionary algorithms is the selection of individuals based on their fitness (this can be the value of an objective function or the result of a simulation experiment, or some other kind of quality measure).


Individuals with a higher fitness have a higher probability to be chosen as members of the population of the next iteration (or as parents for the generation of new individuals). This corresponds to the principle of survival of the fittest in natural evolution. It is the capability of nature to adapt itself to a changing environment, which gave the inspiration for EC algorithms.

There has been a variety of slightly different EC algorithms proposed over the years. Basically, they fall into three different categories which have been developed independently from each other. These are Evolutionary Programming (EP) developed by Fogel [1962] and Fogel et al. [1966], Evolutionary Strategies (ES) proposed by Rechenberg [1973], and Genetic Algorithms initiated by Holland [1975] (see Goldberg [1989], Mitchell [1998], Reeves and Rowe [2002], and Vose [1999] for further references). EP arose from the desire to generate machine intelligence. While EP originally was proposed to operate on discrete representations of finite state machines, most of the present variants are used for continuous optimization problems. The latter also holds for most present variants of ES, whereas GAs are mainly applied to solve combinatorial optimization problems. Over the years, there have been quite a few overviews and surveys about EC methods. Among those are the ones by Back [1996], Fogel [1994], Spears et al. [1993], and Michalewicz and Michalewicz [1997]. Calegari et al. [1999] propose a taxonomy of EC algorithms.

In the following, we provide a “combinatorial optimization”-oriented introduction to EC algorithms. For doing this, we follow an overview work by Hertz and Kobler [2000], which gives, in our opinion, a good overview of the different components of EC algorithms and of the possibilities to define them.

Figure 13 shows the basic structure of every EC algorithm. In this algorithm, P denotes the population of individuals. A population of offspring is generated by the application of recombination and mutation operators, and the individuals for the next population are selected from the union of the old population and the offspring population.

P ← GenerateInitialPopulation()
Evaluate(P)
while termination conditions not met do
    P′ ← Recombine(P)
    P′′ ← Mutate(P′)
    Evaluate(P′′)
    P ← Select(P′′ ∪ P)
endwhile

Fig. 13. Algorithm: Evolutionary Computation (EC).
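A minimal Python sketch of this generic loop, with every operator supplied by the caller (an illustrative interface, not a fixed API), could be:

def evolutionary_computation(init_pop, evaluate, recombine, mutate, select, stop):
    P = init_pop()                 # list of individuals
    evaluate(P)
    while not stop(P):
        P1 = recombine(P)          # offspring via crossover
        P2 = mutate(P1)            # random modification / self-adaptation
        evaluate(P2)
        P = select(P2 + P)         # next generation from offspring ∪ parents
    return P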

The main features of an EC algorithm are outlined in the following.

Description of the Individuals. EC algorithms handle populations of individuals. These individuals are not necessarily solutions of the considered problem. They may be partial solutions, or sets of solutions, or any object which can be transformed into one or more solutions in a structured way. Most commonly used in combinatorial optimization is the representation of solutions as bit-strings or as permutations of n integer numbers. Tree-structures or other complex structures are also possible. In the context of Genetic Algorithms, individuals are called genotypes, whereas the solutions that are encoded by individuals are called phenotypes. This is to differentiate between the representation of solutions and solutions themselves. The choice of an appropriate representation is crucial for the success of an EC algorithm. Holland's [1975] schema analysis and Radcliffe's [1991] generalization to formae are examples of how theory can help to guide representation choices.

Evolution Process. In each iteration, it has to be decided which individuals will enter the population of the next iteration. This is done by a selection scheme. Choosing the individuals for the next population exclusively from the offspring is called generational replacement. If it is possible to transfer individuals of the current population into the next population, then we deal with a so-called steady state evolution process.

Most EC algorithms work with populations of fixed size, keeping at least the best individual always in the current population.


It is also possible to have a variable population size. In case of a continuously shrinking population size, the situation where only one individual is left in the population (or no crossover partners can be found for any member of the population) might be one of the stopping conditions of the algorithm.

Neighborhood Structure. A neighborhood function NEC : I → 2^I on the set of individuals I assigns to every individual i ∈ I a set of individuals NEC(i) ⊆ I which are permitted to act as recombination partners for i to create offspring. If an individual can be recombined with any other individual (as for example in the simple GA) we talk about unstructured populations, otherwise we talk about structured populations. An example for an EC algorithm that works on structured populations is the Parallel Genetic Algorithm proposed by Muhlenbein [1991].

Information Sources. The most common form of information sources to create offspring (i.e., new individuals) is a couple of parents (two-parent crossover). But there are also recombination operators that operate on more than two individuals to create a new individual (multi-parent crossover), see Eiben et al. [1994]. More recent developments even use population statistics for generating the individuals of the next population. Examples are the recombination operators called Gene Pool Recombination [Muhlenbein and Voigt 1995] and Bit-Simulated Crossover [Syswerda 1993], which make use of a distribution over the search space given by the current population to generate the next population.

Infeasibility. An important characteristic of an EC algorithm is the way it deals with infeasible individuals. When recombining individuals, the offspring might be potentially infeasible. There are basically three different ways to handle such a situation. The most simple action is to reject infeasible individuals. Nevertheless, for many problems (e.g., for timetabling problems) it might be very difficult to find feasible individuals. Therefore, the strategy of penalizing infeasible individuals in the function that measures the quality of an individual is sometimes more appropriate (or even unavoidable).

The third possibility consists in trying to repair an infeasible solution (see Eiben and Ruttkay [1997] for an example).

Intensification Strategy. In many applications it proved to be quite beneficial to use improvement mechanisms to improve the fitness of individuals. EC algorithms that apply a local search algorithm to every individual of a population are often called Memetic Algorithms [Moscato 1989, 1999]. While the use of a population ensures an exploration of the search space, the use of local search techniques helps to quickly identify “good” areas in the search space.

Another intensification strategy is the use of recombination operators that explicitly try to combine “good” parts of individuals (rather than, e.g., a simple one-point crossover for bit-strings). This may guide the search performed by EC algorithms to areas of individuals with certain “good” properties. Techniques of this kind are sometimes called linkage learning or building block learning (see Goldberg et al. [1991], van Kemenade [1996], Watson et al. [1998], and Harik [1999] as examples). Moreover, generalized recombination operators have been proposed in the literature, which incorporate the notion of “neighborhood search” into EC. An example can be found in Rayward-Smith [1994].

Diversification Strategy. One of the major difficulties of EC algorithms (especially when applying local search) is the premature convergence toward sub-optimal solutions. The most simple mechanism to diversify the search process is the use of a mutation operator. The simple form of a mutation operator just performs a small random perturbation of an individual, introducing a kind of noise. In order to avoid premature convergence there are ways of maintaining the population diversity. Probably the oldest strategies are crowding [DeJong 1975] and its close relative, preselection. Newer strategies are fitness sharing [Goldberg and Richardson 1987], respectively niching, whereby the reproductive fitness allocated to an individual in a population is reduced proportionally to the number of other individuals that share the same region of the search space.
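As an illustration of this idea, the following Python sketch reduces each individual's reproductive fitness by its niche count; the choice of bit-strings, Hamming distance, and a sharing radius sigma are assumptions of this sketch, and the classical scheme additionally uses a graded sharing function rather than a hard radius.

def shared_fitness(pop, fitness, sigma):
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    shared = []
    for ind in pop:
        # number of individuals (including ind itself) in the same niche
        niche_count = sum(1 for other in pop if hamming(ind, other) <= sigma)
        shared.append(fitness(ind) / niche_count)
    return shared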


Initial Phase:
SeedGeneration()
repeat
    DiversificationGenerator()
    Improvement()
    ReferenceSetUpdate()
until the reference set is of cardinality n

Scatter Search/Path Relinking Phase:
repeat
    SubsetGeneration()
    SolutionCombination()
    Improvement()
    ReferenceSetUpdate()
until termination criteria met

Fig. 14. Algorithm: Scatter Search and Path Relinking.


This concludes the list of the main features of EC algorithms. EC algorithms have been applied to most CO problems and optimization problems in general. Recent successes were obtained in the rapidly growing bioinformatics area (see, e.g., Fogel et al. [2002]), but also in multi-objective optimization [Coello Coello 2000], and in evolvable hardware [Sipper et al. 1997]. For an extensive collection of references to EC applications, we refer to Back et al. [1997]. In the following two sections, we are going to introduce two other population-based methods that are sometimes also regarded as being EC algorithms.

4.1.1. Scatter Search and Path Relinking. Scatter Search and its generalized form called Path Relinking [Glover 1999; Glover et al. 2000] differ from EC algorithms mainly by providing unifying principles for joining (or recombining) solutions based on generalized path constructions in Euclidean or neighborhood spaces. They also incorporate some ideas originating from Tabu Search methods, as, for example, the use of adaptive memory and associated memory-exploiting mechanisms. The template for Scatter Search (respectively, Path Relinking) is shown in Figure 14.

Scatter Search (respectively, Path Relinking) is a search strategy that generates a set of solutions from a chosen set of reference solutions corresponding to feasible solutions to the problem under consideration. This is done by making combinations of subsets of the current set of reference solutions. The resulting solutions are called trial solutions. These trial solutions may be infeasible solutions and are therefore usually modified by means of a repair procedure that transforms them into feasible solutions. An improvement mechanism is then applied in order to try to improve the set of trial solutions (usually this improvement procedure is a local search). These improved solutions form the set of dispersed solutions. The new set of reference solutions that will be used in the next iteration is selected from the current set of reference solutions and the newly created set of dispersed solutions. The components of the pseudo-code, which is shown in Figure 14, are explained in the following:

SeedGeneration(): One or more seed solutions, which are arbitrary trial solutions, are created and used to initiate the remainder of the method.

DiversificationGenerator(): This is a procedure to generate a collection of diverse trial solutions from an arbitrary trial solution (or seed solution).

Improvement(): In this procedure, an improvement mechanism—usually a local search—is used to transform a trial solution into one or more enhanced trial solutions. Neither the input nor the output solutions are required to be feasible, though the output solutions will more usually be expected to be so. It might be necessary to apply repair methods to infeasible solutions.

ReferenceSetUpdate(): The procedure for updating the reference set is responsible for building and maintaining a reference set consisting of a number of “best” solutions found in the course of the algorithm. The attribute “best” covers features such as quality of solutions and diversity of solutions (the solutions in the reference set should be of good quality and they should be diverse).


SubsetGeneration(): This method operates on the reference set, to produce a subset of its solutions as a basis for creating combined solutions.

SolutionCombination(): A procedure to transform a given subset of solutions produced by the subset generation method into one or more combined solutions. In Scatter Search, which was introduced for solutions encoded as points in the Euclidean space, new solutions are created by building linear combinations of reference solutions using both positive and negative weights. This means that trial solutions can be both inside and outside the convex region spanned by the reference solutions. In Path Relinking, the concept of combining solutions by making linear combinations of reference points is generalized to neighborhood spaces. Linear combinations of points in the Euclidean space can be re-interpreted as paths between and beyond solutions in a neighborhood space. To generate the desired paths, it is only necessary to select moves that satisfy the following condition: upon starting from an initiating solution, the moves must progressively introduce attributes contributed by a guiding solution. Multiparent path generation possibilities emerge in Path Relinking by considering the combined attributes provided by a set of guiding solutions, where these attributes are weighted to determine which moves are given higher priority.
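A minimal Python sketch of a single path relinking walk on bit-string solutions (an assumption of this sketch; the general method works in arbitrary neighborhood spaces) could be:

def path_relinking(s_init, s_guide, f):
    # Move step by step from the initiating solution toward the guiding
    # solution; each move greedily introduces one attribute of the guide.
    current, best = list(s_init), list(s_init)
    diff = [i for i in range(len(current)) if current[i] != s_guide[i]]
    while diff:
        i_best = min(diff, key=lambda i: f(current[:i] + [s_guide[i]] + current[i+1:]))
        current[i_best] = s_guide[i_best]
        diff.remove(i_best)
        if f(current) < f(best):          # keep the best intermediate solution
            best = list(current)
    return best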

Scatter Search enjoys increasing interest in recent years. Among other problems, it has been applied to multi-objective assignment problems [Laguna et al. 2000] and the Linear Ordering Problem (LOP) [Campos et al. 2001]. For further references, we refer to Glover et al. [2002]. Path Relinking is often used as a component in metaheuristics such as Tabu Search [Laguna et al. 1999] and GRASP [Aiex et al. 2003; Laguna and Marti 1999].

4.1.2. Estimation of Distribution Algorithms. In the last decade, more and more researchers tried to overcome the drawbacks of usual recombination operators of EC algorithms, which are likely to break good building blocks.

P ← InitializePopulation()
while termination criteria not met do
    Psel ← Select(P)    % Psel ⊆ P
    p(x) = p(x | Psel) ← EstimateProbabilityDistribution()
    P ← SampleProbabilityDistribution()
endwhile

Fig. 15. Algorithm: Estimation of Distribution Algorithms (EDAs).

So, a number of algorithms—sometimes called Estimation of Distribution Algorithms (EDAs) [Muhlenbein and Paaß 1996]—have been developed (see Figure 15 for the algorithmic framework). These algorithms, which have a theoretical foundation in probability theory, are also based on populations that evolve as the search progresses. EDAs use probabilistic modelling of promising solutions to estimate a distribution over the search space, which is then used to produce the next generation by sampling the search space according to the estimated distribution. After every iteration, the distribution is re-estimated. For a survey of EDAs, see Pelikan et al. [1999b].

One of the first EDAs that was proposed for combinatorial optimization is called Population-Based Incremental Learning (PBIL) [Baluja 1994; Baluja and Caruana 1995]. The objective of this method is to create a real-valued probability vector (each position corresponds to a binary decision variable) which—when used to sample the search space—generates high quality solutions with high probability. Initially, the values of the probability vector are initialized to 0.5 (for each variable there is equal probability to be set to 0 or 1). The goal of shifting the values of this probability vector in order to generate high quality solutions is accomplished as follows: a number of solution vectors are generated according to the probability vector. Then, the probability vector is shifted toward the generated solution vector(s) with highest quality. The distance that the probability vector is shifted depends on the learning rate parameter. Then, a mutation operator is applied to the probability vector. After that, the cycle is repeated. The probability vector can be regarded as a prototype vector for generating high-quality solution vectors with respect to the available knowledge about the search space.


The drawback of this method is the fact that it does not automatically provide a way to deal with constrained problems. In contrast to PBIL, which estimates a distribution of promising solutions assuming that the decision variables are independent, various other approaches try to estimate distributions taking into account dependencies between decision variables. An example for EDAs regarding pairwise dependencies between decision variables is MIMIC [de Bonet et al. 1997], and an example for multivariate dependencies is the Bayesian Optimization Algorithm (BOA) [Pelikan et al. 1999a].
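The core of PBIL can be sketched in a few lines of Python; this version maximizes a quality function f over bit-strings and omits the mutation of the probability vector for brevity (all names are illustrative):

import random

def pbil(n, f, sample_size, learning_rate, iterations):
    p = [0.5] * n                                    # uniform initial vector
    for _ in range(iterations):
        samples = [[1 if random.random() < p[i] else 0 for i in range(n)]
                   for _ in range(sample_size)]
        best = max(samples, key=f)                   # highest-quality sample
        for i in range(n):                           # shift p toward the best sample
            p[i] = (1 - learning_rate) * p[i] + learning_rate * best[i]
    return p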

The field of EDAs is still quite young and much of the research effort is focused on methodology rather than high-performance applications. Applications to Knapsack problems, the Job Shop Scheduling (JSS) problem, and other CO problems can be found in Larranaga and Lozano [2002].

4.2. Ant Colony Optimization

Ant Colony Optimization (ACO) is a metaheuristic approach proposed in Dorigo [1992, 1996, 1999]. In the course of this section, we keep close to the description of ACO as given in Dorigo and Di Caro [1999]. The inspiring source of ACO is the foraging behavior of real ants. This behavior—as described by Deneubourg et al. [1990]—enables ants to find shortest paths between food sources and their nest. While walking from food sources to the nest and vice versa, ants deposit a substance called pheromone on the ground. When they decide about a direction to go, they choose with higher probability paths that are marked by stronger pheromone concentrations. This basic behavior is the basis for a cooperative interaction which leads to the emergence of shortest paths.

ACO algorithms are based on a parametrized probabilistic model—the pheromone model—that is used to model the chemical pheromone trails. Artificial ants incrementally construct solutions by adding opportunely defined solution components to a partial solution under consideration.18 For doing that, artificial ants perform randomized walks on a completely connected graph G = (C, L) whose vertices are the solution components C and whose connections form the set L. This graph is commonly called the construction graph. When a constrained CO problem is considered, the problem constraints Ω are built into the ants' constructive procedure in such a way that in every step of the construction process only feasible solution components can be added to the current partial solution. In most applications, ants are implemented to build feasible solutions, but sometimes it is unavoidable to also let them construct infeasible solutions. Components ci ∈ C can have associated a pheromone trail parameter Ti, and connections lij ∈ L can have associated a pheromone trail parameter Tij. The set of all pheromone trail parameters is denoted by T. The values of these parameters—the pheromone values—are denoted by τi, respectively τij. Furthermore, components and connections can have associated a heuristic value ηi, respectively ηij, representing a priori or run time heuristic information about the problem instance. We henceforth denote the set of all heuristic values by H. These values are used by the ants to make probabilistic decisions on how to move on the construction graph. The probabilities involved in moving on the construction graph are commonly called transition probabilities. The first ACO algorithm proposed in the literature is called Ant System (AS) [Dorigo et al. 1996]. The pseudo-code for this algorithm is shown in Figure 16. For the sake of simplicity, we restrict the following description of AS to pheromone trail parameters and heuristic information on solution components.

In this algorithm, A denotes the set of ants and sa denotes the solution constructed by ant a ∈ A. After the initialization of the pheromone values, at each step of the algorithm each ant constructs a solution.

18 Therefore, the ACO metaheuristic can be applied to any CO problem for which a constructive procedure can be defined.


InitializePheromoneValues(T)
while termination conditions not met do
    for all ants a ∈ A do
        sa ← ConstructSolution(T, H)
    endfor
    ApplyOnlineDelayedPheromoneUpdate(T, {sa | a ∈ A})
endwhile

Fig. 16. Algorithm: Ant System (AS).

solution. These solutions are then used toupdate the pheromone values. The compo-nents of this algorithm are explained inmore detail in the following.

InitializePheromoneValues(T): At the beginning of the algorithm the pheromone values are initialized to the same small value ph > 0.

ConstructSolution(T, H): In the construction phase an ant incrementally builds a solution by adding solution components to the partial solution constructed so far. The probabilistic choice of the next solution component to be added is done by means of transition probabilities, which in AS are determined by the following state transition rule:

p(cr | sa[cl]) = [ηr]^α · [τr]^β / Σ_{cu ∈ J(sa[cl])} [ηu]^α · [τu]^β   if cr ∈ J(sa[cl]), and 0 otherwise.   (1)

In this formula, α and β are parameters to adjust the relative importance of heuristic information and pheromone values, and J(sa[cl]) denotes the set of solution components that are allowed to be added to the partial solution sa[cl], where cl is the last component that was added.

ApplyOnlineDelayedPheromoneUpdate(T, {sa | a ∈ A}): Once all ants have constructed a solution, the online delayed pheromone update rule is applied:

τj ← (1 − ρ) · τj + Σ_{a ∈ A} Δτj^{sa}   ∀ Tj ∈ T,   (2)

where

Δτj^{sa} = F(sa) if cj is a component of sa, and Δτj^{sa} = 0 otherwise.   (3)

Here, F : S → R+ is a function that satisfies f(s) < f(s′) ⇒ F(s) ≥ F(s′), ∀ s ≠ s′ ∈ S. F(·) is commonly called the quality function. Furthermore, 0 < ρ ≤ 1 is the pheromone evaporation rate. This pheromone update rule aims at an increase of pheromone on solution components that have been found in high-quality solutions.
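The two AS rules can be illustrated with the following Python sketch, restricted—as the description above—to pheromone and heuristic values on solution components; tau, eta, and quality are dictionaries/functions supplied by the caller (illustrative names, not a fixed API):

import random

def as_transition(candidates, tau, eta, alpha, beta):
    # Sample the next component according to the state transition rule (1).
    weights = [(eta[c] ** alpha) * (tau[c] ** beta) for c in candidates]
    r, acc = random.random() * sum(weights), 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if acc >= r:
            return c
    return candidates[-1]

def as_pheromone_update(tau, solutions, quality, rho):
    # Online delayed update, rules (2) and (3): evaporate all trails,
    # then deposit F(s) on every component of every ant's solution s.
    for c in tau:
        tau[c] *= (1 - rho)
    for s in solutions:
        for c in s:
            tau[c] += quality(s)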

while termination conditions not met do
    ScheduleActivities
        AntBasedSolutionConstruction()
        PheromoneUpdate()
        DaemonActions()    % optional
    end ScheduleActivities
endwhile

Fig. 17. Algorithm: Ant Colony Optimization (ACO).


In the following, we describe the more general ACO metaheuristic, which is based on the same basic principles as AS. The ACO metaheuristic framework that is shown in Figure 17 covers all the improvements and extensions of AS which have been developed over the years. It consists of three parts gathered in the ScheduleActivities construct. The ScheduleActivities construct does not specify how these three activities are scheduled and synchronized. This is up to the algorithm designer.

AntBasedSolutionConstruction(): An ant constructively builds a solution to the problem by moving through nodes of the construction graph G. Ants move by applying a stochastic local decision policy that makes use of the pheromone values and the heuristic values on components and/or connections of the construction graph (e.g., see the state transition rule of AS). While moving, the ant keeps in memory the partial solution it has built in terms of the path it was walking on the construction graph.

PheromoneUpdate(): When adding a component cj to the current partial solution, an ant can update the pheromone trail(s) τi and/or τij (in case the ant was walking on connection lij in order to reach component cj). This kind of pheromone update is called online step-by-step pheromone update. Once an ant has built a solution, it can retrace the same path backward (by using its memory) and update the pheromone trails of the used components and/or connections according to the quality of the solution it has built. This is called online delayed pheromone update.


Pheromone evaporation is the process by means of which the pheromone trail intensity on the components decreases over time. From a practical point of view, pheromone evaporation is needed to avoid a too rapid convergence of the algorithm toward a sub-optimal region. It implements a useful form of forgetting, favoring the exploration of new areas in the search space.

DaemonActions(): Daemon actions can be used to implement centralized actions which cannot be performed by single ants. Examples are the use of a local search procedure applied to the solutions built by the ants, or the collection of global information that can be used to decide whether it is useful or not to deposit additional pheromone to bias the search process from a nonlocal perspective. As a practical example, the daemon can observe the path found by each ant in the colony and choose to deposit extra pheromone on the components used by the ant that built the best solution. Pheromone updates performed by the daemon are called offline pheromone updates.

Within the ACO metaheuristic framework, as shortly described above, the currently best performing versions in practice are Ant Colony System (ACS) [Dorigo and Gambardella 1997] and MAX–MIN Ant System (MMAS) [Stutzle and Hoos 2000]. In the following, we are going to briefly outline the peculiarities of these algorithms.

Ant Colony System (ACS). The ACS algorithm has been introduced to improve the performance of AS. ACS is based on AS but presents some important differences. First, the daemon updates pheromone trails offline: At the end of an iteration of the algorithm—once all the ants have built a solution—pheromone is added to the arcs used by the ant that found the best solution from the start of the algorithm. Second, ants use a different decision rule to decide to which component to move next in the construction graph. The rule is called the pseudo-random-proportional rule.

With this rule, some moves are chosen deterministically (in a greedy manner), others are chosen probabilistically with the usual decision rule. Third, in ACS, ants perform only online step-by-step pheromone updates. These updates are performed to favor the emergence of other solutions than the best so far.

MAX–MIN Ant System (MMAS). MMAS is also an extension of AS. First, the pheromone trails are only updated offline by the daemon (the arcs that were used by the iteration best ant or the best ant since the start of the algorithm receive additional pheromone). Second, the pheromone values are restricted to an interval [τmin, τmax] and the pheromone trails are initialized to their maximum value τmax. Explicit bounds on the pheromone trails prevent that the probability to construct a solution falls below a certain value greater than 0. This means that the chance of finding a global optimum never vanishes during the course of the algorithm.
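The MMAS pheromone bounds amount to a simple clamping step after each update, as in this illustrative Python sketch:

def mmas_clamp(tau, tau_min, tau_max):
    # Keep all pheromone values in [tau_min, tau_max] so that no
    # transition probability can vanish during the run.
    for c in tau:
        tau[c] = min(tau_max, max(tau_min, tau[c]))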

Recently, researchers have been dealing with finding similarities between ACO algorithms and probabilistic learning algorithms such as EDAs. An important step into this direction was the development of the Hyper-Cube Framework for Ant Colony Optimization (HC-ACO) [Blum et al. 2001]. An extensive study on this subject has been presented in Zlochin et al. [2004], where the authors present a unifying framework for so-called Model-Based Search (MBS) algorithms. Also, the close relation of algorithms like Population-Based Incremental Learning (PBIL) [Baluja and Caruana 1995] and the Univariate Marginal Distribution Algorithm (UMDA) [Muhlenbein and Paaß 1996] to ACO algorithms in the Hyper-Cube Framework has been shown. We refer the interested reader to Zlochin et al. [2004] for more information on this subject. Furthermore, connections of ACO algorithms to Stochastic Gradient Descent (SGD) algorithms are shown in Meuleau and Dorigo [2002].

Successful applications of ACO include the application to routing in communication networks [Di Caro and Dorigo 1998],


the application to the Sequential Ordering Problem (SOP) [Gambardella and Dorigo 2000], and the application to Resource Constraint Project Scheduling (RCPS) [Merkle et al. 2002]. Further references to applications of ACO can be found in Dorigo and Stutzle [2002, 2003].

5. A UNIFYING VIEW ON INTENSIFICATION AND DIVERSIFICATION

In this section, we take a closer look at the concepts of intensification and diversification as the two powerful forces driving metaheuristic applications to high performance. We give a view on metaheuristics that is characterized by the way intensification and diversification are implemented. Although the relevance of these two concepts is commonly agreed, so far there is no unifying description to be found in the literature. Descriptions are very generic and metaheuristic specific. Therefore, most of them can be considered incomplete and sometimes they are even opposing. Depending on the paradigm behind a particular metaheuristic, intensification and diversification are achieved in different ways. Even so, we propose a unifying view on intensification and diversification. Furthermore, this discussion could lead to the goal-directed development of hybrid algorithms combining concepts originating from different metaheuristics.

5.1. Intensification and Diversification

Every metaheuristic approach should be designed with the aim of effectively and efficiently exploring a search space. The search performed by a metaheuristic approach should be “clever” enough to both intensively explore areas of the search space with high quality solutions, and to move to unexplored areas of the search space when necessary. The concepts for reaching these goals are nowadays called intensification and diversification. These terms stem from the TS field [Glover and Laguna 1997]. In other fields—such as the EC field—related concepts are often denoted by exploitation (related to intensification) and exploration (related to diversification).

However, the terms exploitation and exploration have a somewhat more restricted meaning. In fact, the notions of exploitation and exploration often refer to rather short term strategies tied to randomness, whereas intensification and diversification refer to rather medium and long term strategies based on the usage of memory. As the various different ways of using memory become increasingly important in the whole field of metaheuristics, the terms intensification and diversification are more and more adopted and understood in their original meaning.

An implicit reference to the concept of “locality” is often introduced when intensification and diversification are involved. The notion of “area” (or “region”) of the search space and of “locality” can only be expressed in a fuzzy way, as they always depend on the characteristics of the search space as well as on the definition of metrics on the search space (distances between solutions).

The literature provides several high level descriptions of intensification and diversification. In the following, we cite some of them.

“Two highly important components of Tabu Search are intensification and diversification strategies. Intensification strategies are based on modifying choice rules to encourage move combinations and solution features historically found good. They may also initiate a return to attractive regions to search them more thoroughly. Since elite solutions must be recorded in order to examine their immediate neighborhoods, explicit memory is closely related to the implementation of intensification strategies. The main difference between intensification and diversification is that during an intensification stage the search focuses on examining neighbors of elite solutions. [· · ·] The diversification stage on the other hand encourages the search process to examine unvisited regions and to generate solutions that differ in various significant ways from those seen before.” [Glover and Laguna 1997]


Later in the same book, Glover and Laguna write: “In some instances we may conceive of intensification as having the function of an intermediate term strategy, while diversification applies to considerations that emerge in the longer run.”

Furthermore, they write: “Strategic oscillation is closely linked to the origins of tabu search, and provides a means to achieve an effective interplay between intensification and diversification.”

“After a local minimizer is encountered, all points in its attraction basin lose any interest for optimization. The search should avoid wasting excessive computing time in a single basin and diversification should be activated. On the other hand, in the assumption that neighbors have correlated cost function values, some effort should be spent in searching for better points located close to the most recently found local minimum point (intensification). The two requirements are conflicting and finding a proper balance of diversification and intensification is a crucial issue in heuristics.” [Battiti 1996]

“A metaheuristic will be successful on a given optimization problem if it can provide a balance between the exploitation of the accumulated search experience and the exploration of the search space to identify regions with high quality solutions in a problem specific, near optimal way.” [Stutzle 1999b]

“Intensification is to search carefully and intensively around good solutions found in the past search. Diversification, on the contrary, is to guide the search to unvisited regions. These terminologies are usually used to explain the basic elements of Tabu Search, but these are essential to all the metaheuristic algorithms. In other words, various metaheuristic ideas should be understood from the viewpoint of these two concepts, and metaheuristic algorithms should be designed so that intensification and diversification play balanced roles.” [Yagiura and Ibaraki 2001]

“Holland frames adaption as a tension between exploration (the search for new, useful adaptations) and exploitation (the use and propagation of these adaptations). The tension comes about since any move toward exploration—testing previously unseen schemas or schemas whose instances seen so far have low fitness—takes away from the exploitation of tried and true schemas. In any system (e.g., a population of organisms) required to face environments with some degree of unpredictability, an optimal balance between exploration and exploitation must be found. The system has to keep trying out new possibilities (or else it could “over-adapt” and be inflexible in the face of novelty), but it also has to continually incorporate and use past experience as a guide for future behavior.”—M. Mitchell citing J. H. Holland in Mitchell [1998].

All these descriptions share the common view that there are two forces for which an appropriate balance has to be found. Sometimes these two forces were described as opposing forces. However, lately some researchers raised the question of how opposing intensification and diversification really are.

In 1998, Eiben and Schippers [1998] started a discussion about that in the field of Evolutionary Computation. They question the common opinion about EC algorithms, that they explore the search space by the genetic operators, while exploitation is achieved by selection. In their paper, they give examples of operators that one cannot unambiguously label as being either intensification or diversification. So, for example, an operator using a local search component to improve individuals is not merely a mechanism of diversification because it also comprises a strong element of intensification (e.g., in Memetic Algorithms). Another example is the heuristically guided recombination of good quality solutions. If the use of the accumulated search experience is identified with intensification, then a recombination operator is not merely a means of diversification; it also—as in the example above—has a strong intensification component.


Fig. 18. The I&D frame provides a unified view on intensification and diversification in metaheuristics (OG = I&D components solely guided by the objective function, NOG = I&D components solely guided by one or more functions other than the objective function, R = I&D components solely guided by randomness).


Especially the TS literature advocates the view that intensification and diversification cannot be characterized as opposing forces. For example, in Glover and Laguna [1997], the authors write: “Similarly, as we have noted, intensification and diversification are not opposed notions, for the best form of each contains aspects of the other, along a spectrum of alternatives.”

Intensification and diversification can be considered as effects of algorithm components. In order to understand similarities and differences among metaheuristics, a framework may be helpful in providing a unified view on intensification and diversification components. We define an I&D component as any algorithmic or functional component that has an intensification and/or a diversification effect on the search process. Accordingly, examples of I&D components are genetic operators, perturbations of probability distributions, the use of tabu lists, or changes in the objective function. Thus, I&D components are operators, actions, or strategies of metaheuristic algorithms.

In contrast to the still widely spread view that there are components that have either an intensification or a diversification effect, there are many I&D components that have both. In I&D components that are commonly labelled as intensification, the intensification component is stronger than the diversification component, and vice versa.

To clarify this, we developed a framework to put I&D components of different metaheuristics into relation with each other. We called this framework—shown in Figure 18—the I&D frame.

We depict the space of all I&D components as a triangle with the three corners corresponding to three extreme examples of I&D components. The corner denoted by OG corresponds to I&D components solely guided by the objective function of the problem under consideration. An example of an I&D component which is located very close to the corner OG is the steepest descent choice rule in local search. The corner denoted by NOG covers all I&D components guided by one or more functions other than the objective function, again without using any random component. An example for such a component is a deterministic restart mechanism based on global frequency counts of solution components. The third corner, which is denoted by R, comprises all I&D components that are completely random. This means that they are not guided by anything. For example, a restart of an EC approach with random individuals is located in that corner. From the description of the corners, it becomes clear that corner OG corresponds to I&D components with a maximum intensification effect and a minimum diversification effect. On the other hand, corners NOG, R and the segment between the two corners correspond to I&D components with a maximum diversification effect and a minimum intensification effect.19


All I&D components can be located somewhere on or in between the three corners, where the intensification effect becomes smaller the further away a mechanism is located from OG. At the same time, the diversification effect grows. In step with this gradient is the use I&D components make of the objective function. The less an I&D component is using the objective function, the further away from corner OG it has to be located. There is also a second gradient to be found in this frame (which is shown in the second graphic of Figure 18). Corner R stands for complete randomness. The less randomness is involved in an I&D component, the further away from corner R it has to be located. Finally, a third gradient describes the influence of criteria different from the objective function, which generally stem from the exploitation of the search history that is in some form kept in the memory. In the following, we analyze some basic I&D components intrinsic to the basic versions of the metaheuristics with respect to the I&D frame.

5.2. Basic I&D Components of Metaheuristics

The I&D components occurring in metaheuristics can be divided into basic (or intrinsic) ones and strategic ones. The basic I&D components are the ones that are defined by the basic ideas of a metaheuristic. On the other side, strategic I&D components are composed of techniques and strategies the algorithm designer adds to the basic metaheuristic in order to improve the performance by incorporating medium- and long-term strategies. Many of these strategies were originally developed in the context of a specific metaheuristic. However, it becomes more and more apparent that many of these strategies can also be very useful when applied in other metaheuristics. In the following, we choose some exemplary basic I&D components that are inherent to a metaheuristic and explain them in the context of the I&D frame.

19 There is no quantitative difference between corners NOG and R. The difference is rather qualitative.

With that, we show that most of the basic I&D components have an intensification character as well as a diversification character.

For many components and strategies of metaheuristics, it is obvious that they involve an intensification as well as a diversification component, because they make an explicit use of the objective function. For example, the basic idea of TS is a neighbor choice rule using one or more tabu lists. This I&D component has two effects on the search process. The restriction of the set of possible neighbors in every step has a diversifying effect on the search, whereas the choice of the best neighbor in the restricted set of neighbors (the best non-tabu move) has an intensifying effect on the search. The balance between these two effects can be varied by the length of the tabu list. Shorter tabu lists result in a lower influence of the diversifying effect, whereas longer tabu lists result in an overall higher influence of the diversifying effect. The location of this component in Figure 18 is on the segment between corners OG and NOG. The shorter the tabu lists, the closer is the location to corner OG, and vice versa.
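In a minimal Python sketch (ignoring aspiration criteria, and storing complete solutions in the tabu list for simplicity, whereas practical TS usually stores move attributes), the interplay of the two effects looks as follows: the filtering step diversifies, the greedy choice intensifies, and the list length tunes the balance.

def tabu_neighbor_choice(s, neighbors, f, tabu_list):
    # Filter out tabu neighbors (diversification), then pick the best
    # remaining neighbor with respect to f (intensification).
    allowed = [n for n in neighbors(s) if n not in tabu_list]
    return min(allowed, key=f) if allowed else None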

Another example for such an I&D component is the probabilistic acceptance criterion in conjunction with the cooling schedule in SA. The acceptance criterion is guided by the objective function and it also involves a changing amount of randomness. The decrease of the temperature parameter drives the system from diversification to intensification, eventually leading to a convergence20 of the system. Therefore, this I&D component is located in the interior of the I&D space between corners OG, NOG and R.

A third example is the following one. Ant Colony Optimization provides an I&D component that manages the update of the pheromone values. This component has the effect of changing the probability distribution that is used to sample the search space.

20Here, we use the term convergence in the senseof getting stuck in the basin of attraction of a localminimum.

ACM Computing Surveys, Vol. 35, No. 3, September 2003.

Page 29: Metaheuristics in Combinatorial Optimization: Overview and ...christian.blum/downloads/blum_roli_2003.pdf · A. Roli acknowledges support by the CEC through a “Marie Curie Training

296 C. Blum and A. Roli

the effect of changing the probability dis-tribution that is used to sample the searchspace. It is guided by the objective func-tion (solution components found in bettersolutions than others are updated witha higher amount of pheromone) and it isalso influenced by a function applying thepheromone evaporation. Therefore, thiscomponent is located on the line betweencorners OG and NOG. The effect of thismechanism is basically the intensificationof the search, but there is also a diversify-ing component that depends on the greed-iness of the pheromone update (the lessgreedy or deterministic, the higher is thediversifying effect).
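The following Python sketch is only illustrative (the representation of solutions as iterables of components and the deposit amount are assumptions, not the rule of a particular ACO variant); it shows how evaporation and the objective-function-biased deposit interact:

def update_pheromone(tau, solutions, objective, rho=0.1):
    """Evaporation followed by a quality-proportional deposit (minimization).

    tau maps solution components to pheromone values; each solution is
    treated as an iterable of its components.
    """
    for c in tau:
        tau[c] *= 1.0 - rho  # evaporation weakens old decisions (diversifying)
    for sol in solutions:
        amount = 1.0 / (1.0 + objective(sol))  # better solutions deposit more
        for c in sol:
            tau[c] = tau.get(c, 0.0) + amount  # objective-guided deposit (intensifying)
    return tau

Restricting the deposit to the best solution found makes the update greedier and thus moves the component closer to corner OG.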

For other strategies and components of metaheuristics, it is not immediately obvious that they have both an intensification and a diversification effect. An example is the random selection of a neighbor from the neighborhood of a current solution, as is done, for example, in the kick-move mechanism of ILS. Initially, one might think that there is no intensification involved and that this mechanism has a pure diversification effect caused by the use of randomness. However, for the following reason, this is not the case. Many strategies (such as the kick-move operator mentioned above) involve the explicit or implicit use of a neighborhood. A neighborhood structures the search space in the sense that it defines the topology of the so-called fitness landscape [Stadler 1995, 1996; Jones 1995a; Kauffman 1993], which can be visualized as a labelled graph. In this graph, nodes are solutions (labels indicate their objective function value) and arcs represent the neighborhood relation between states.21

21 The discussion of definitions and analysis of fitness landscapes is beyond the scope of this article. We refer the interested reader to Stadler [1995, 1996], Jones [1995a, 1995b], Fonlupt et al. [1999], Hordijk [1996], Kauffman [1993], and Reeves [1999].

A fitness landscape can be analyzed by means of statistical measures. One of the common measures is the autocorrelation, which provides information about how much the fitness will change when a move is made from one point to a neighboring one. Different landscapes differ in their ruggedness. A landscape with small (average) fitness differences between neighboring points is called smooth, and it will usually have just a few local optima. In contrast, a landscape with large (average) fitness differences is called rugged, and it will usually be characterized by many local optima. Most of the neighborhoods used in metaheuristics provide a degree of smoothness that is higher than that of a fitness landscape defined by a random neighborhood. This means that such a neighborhood is, in a sense, preselecting for every solution a set of neighbors for which the average fitness is not too different. Therefore, even when a solution is randomly selected from a set of neighbors, the objective function guidance is implicitly present. The consequence is that even for a random kick-move there is some degree of intensification involved, as long as a nonrandom neighborhood is considered.
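As an illustration, the autocorrelation mentioned above can be estimated empirically along a random walk; in the following Python sketch, random_neighbor and objective are hypothetical problem-specific hooks:

def walk_autocorrelation(start, random_neighbor, objective, steps=10000, lag=1):
    """Estimate the fitness autocorrelation at the given lag from a random
    walk: values near 1 indicate a smooth landscape, values near 0 a
    rugged one with many local optima."""
    f, s = [], start
    for _ in range(steps):
        f.append(objective(s))
        s = random_neighbor(s)
    mean = sum(f) / len(f)
    var = sum((x - mean) ** 2 for x in f) / len(f)
    if var == 0.0:
        return 1.0  # a flat landscape is trivially smooth
    cov = sum((f[i] - mean) * (f[i + lag] - mean)
              for i in range(len(f) - lag)) / (len(f) - lag)
    return cov / var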

For a mutation operator of an EC method that performs a random change of a solution, it is also not immediately clear that it can have both an intensification and a diversification effect. In the following, we assume a bit-string representation and a mutation operator that is characterized by flipping every bit of a solution with a certain probability. The implicit neighborhood used by this operator is the completely connected neighborhood. However, the neighbors have different probabilities of being selected. The ones that are (with respect to the Hamming distance) closer to the solution to which the operator is applied have a higher probability of being generated by the operator. With this observation, we can use the same argument as above in order to show an implicit use of objective function guidance. The balance between intensification and diversification is determined by the probability of flipping each bit. The higher this probability, the higher the diversification effect of the operator. In contrast, the lower this probability, the higher the intensification effect of this operator.
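A minimal Python sketch of this operator (with bit strings represented as lists of 0/1 integers) makes the role of the flip probability explicit:

import random

def bitflip_mutation(bits, p):
    """Flip every bit independently with probability p: a small p keeps the
    offspring close in Hamming distance (intensifying), a large p makes
    distant offspring likely (diversifying)."""
    return [b ^ 1 if random.random() < p else b for b in bits]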

On the other hand, there are some strategies that are often labelled as intensification, supposedly without having any diversifying effect. One example is the selection operator in EC algorithms. However, nearly all selection operators involve some degree of randomness (e.g., proportionate selection, tournament selection) and are therefore located somewhere between corners OG, NOG and R of the I&D frame. This means that they also have a diversifying effect. The balance between intensification and diversification depends on the function that assigns the selection probabilities. If the differences between the selection probabilities are quite high, the intensification effect is high, and similarly for the other extreme of having only small differences between the selection probabilities.
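Both selection schemes mentioned above can be sketched in a few lines of Python (fitness is a hypothetical function to be maximized; strictly positive fitness values are assumed for the roulette wheel):

import random

def tournament_selection(population, fitness, k=2):
    """Return the best of k uniformly drawn individuals; a larger k sharpens
    the selection probabilities and thus intensifies."""
    return max(random.sample(population, k), key=fitness)

def proportionate_selection(population, fitness):
    """Roulette-wheel selection: selection probability proportional to
    fitness; the remaining randomness is the diversifying effect."""
    total = sum(fitness(ind) for ind in population)
    r = random.uniform(0.0, total)
    acc = 0.0
    for ind in population:
        acc += fitness(ind)
        if acc >= r:
            return ind
    return population[-1]  # guard against floating-point rounding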

Even an operator like the neighbor choice rule of a steepest descent local search, which might be regarded as pure intensification, has a diversifying component in the sense that the search is "moving" in the search space with respect to a neighborhood. A neighborhood can be regarded as a function other than the objective function that makes implicit use of the objective function. Therefore, a steepest descent local search is located between corners OG and NOG, and it has both a strong intensification effect and a weak diversification character.

Based on these observations, we conclude that probably most of the basic I&D components used in metaheuristics have both an intensification and a diversification effect. However, the balance between intensification and diversification might be quite different for different I&D components. Table 1 attempts to summarize the basic I&D components that are inherent to the different metaheuristics.

5.3. Strategic Control of Intensification and Diversification

The right balance between intensification and diversification is needed to obtain an effective metaheuristic. Moreover, this balance should not be fixed or only changing in one direction (e.g., continuously increasing intensification). It should rather be dynamic. This issue is often treated in the literature, both implicitly and explicitly, when strategies to guide search algorithms are discussed.

Table 1. I&D components intrinsic to the basic metaheuristics

Metaheuristic   I&D component
SA              acceptance criterion + cooling schedule
TS              neighbor choice (tabu lists), aspiration criterion
EC              recombination, mutation, selection
ACO             pheromone update, probabilistic construction
ILS             black box local search, kick-move, acceptance criterion
VNS             black box local search, neighborhood choice, shaking phase, acceptance criterion
GRASP           black box local search, restricted candidate list
GLS             penalty function

The distinction between intensification and diversification is often interpreted with respect to the temporal horizon of the search. Short-term search strategies can be seen as the iterative application of tactics with a strong intensification character (for instance, the repeated application of greedy moves). When the horizon is enlarged, strategies referring to some sort of diversification usually come into play. Indeed, a general strategy usually proves its effectiveness especially in the long term.

The simplest strategy that coordinates the interplay of intensification and diversification, and that can achieve an oscillating balance between them, is the restart mechanism: under certain circumstances (e.g., a local optimum is reached, no improvements occur after a specific number of algorithm cycles, the search stagnates, or diversity is lost), the algorithm is restarted. The goal is to achieve a sufficient coverage of the search space in the long run; thus, the already visited regions should not be explored again. The computationally least expensive attempt to address this issue is a random restart. Every algorithm applying this naive diversification mechanism therefore incorporates an I&D component located in corner R of the I&D frame.
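A skeleton of such a restart mechanism, written in Python with hypothetical hooks random_solution and improve and an illustrative stagnation criterion, could look as follows:

def search_with_restarts(random_solution, improve, objective,
                         restarts=20, patience=100):
    """Random-restart skeleton: restart from a fresh random solution after
    `patience` consecutive non-improving steps (a simple stagnation test)."""
    best = None
    for _ in range(restarts):
        current, stagnation = random_solution(), 0
        while stagnation < patience:
            candidate = improve(current)
            if objective(candidate) < objective(current):
                current, stagnation = candidate, 0
            else:
                stagnation += 1
        if best is None or objective(current) < objective(best):
            best = current
    return best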


Usually, the most effective restart approaches make use of the search history. Examples of such restart strategies are those based on concepts such as global frequency and global desirability. The concept of global frequency is well known from TS applications. In this concept, the number of occurrences of solution components is counted during the run of the algorithm. These numbers, called the global frequency numbers, are then used for changing the heuristic constructive method, for example, to generate a new population for restarting an EC method or the initial solution for restarting a trajectory method. Similarly, the concept of global desirability (which keeps, for every solution component, the objective function value of the best solution it has been a member of) can be used to restart algorithms with a bias toward good quality solutions. I&D components based on global frequency can be located in corner NOG, while global desirability-based components are located along the segment NOG-OG. Examples of the use of nonrandom restarts can also be found in population-based methods. In EC algorithms, the new population can be generated by applying constructive heuristics22 (line R-OG). In ACO, this goal is addressed by smoothing or resetting pheromone values [Stützle and Hoos 2000]. In the latter case, if the pheromone reset is also based on the search history, the action is located inside the I&D frame.

22 See, for example, Freisleben and Merz [1996] and Grefenstette [1987].
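As an illustrative sketch (not a specific published strategy), the following Python function biases the constructive choices of a restart away from components with high global frequency counts:

import random

def frequency_biased_choice(candidates, global_frequency):
    """Pick a solution component with probability inversely related to its
    global frequency count, steering restarts toward rarely visited regions
    (an I&D component near corner NOG)."""
    weights = [1.0 / (1.0 + global_frequency.get(c, 0)) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]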

There are also strategies explicitly aimed at dynamically changing the balance between intensification and diversification during the search. A fairly simple strategy is used in SA, where an increase in diversification and a simultaneous decrease in intensification can be achieved by "reheating" the system and then cooling it down again (which corresponds to increasing the temperature parameter T and decreasing it again according to some scheme). Such a cooling scheme is called a nonmonotonic cooling scheme (e.g., see Lundy and Mees [1986] or Osman [1993]). Another example can be found in Ant Colony System (ACS). This ACO algorithm uses an additional I&D component aimed at introducing diversification during the solution construction phase. While an ant is walking on the construction graph to construct a solution, it reduces the pheromone values on the nodes/arcs of the construction graph that it visits. This has the effect of reducing the probability that the other ants take the same path. This additional pheromone update mechanism is called the step-by-step online pheromone update rule. The interplay between this component and the other pheromone update rules (online delayed pheromone update rules and the online pheromone update rule) leads to an oscillating balance between intensification and diversification.
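A sketch of this rule in Python, with illustrative parameter values, shifts the pheromone of a component just used by an ant toward a small value tau0:

def step_by_step_online_update(tau, component, xi=0.1, tau0=0.01):
    """Step-by-step (local) pheromone update: each time an ant uses a
    component during construction, its pheromone is moved toward tau0,
    making the same path less attractive for the following ants.
    The convex-combination form follows the local update of ACS; the
    parameter values here are illustrative."""
    tau[component] = (1.0 - xi) * tau[component] + xi * tau0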

Some more advanced strategies can be found in the literature. Often, they are described with respect to the particular metaheuristic in which they are applied. However, many of them are very general and can easily be adapted and reused in a different context. A very effective example is Strategic Oscillation [Glover and Laguna 1997].23 This strategy can be applied both to constructive methods and to improvement algorithms. Actions are invoked with respect to a critical level (the oscillation boundary), which usually corresponds to a steady state of the algorithm. Examples of steady states of an algorithm are local minima, the completion of solution constructions, or the situation where no components can be added to a partial solution such that it can be completed to a feasible solution. The oscillation strategy is defined by a pattern indicating the way to approach the critical level, to cross it, and to cross it again from the other side. This pattern defines the distance of moves from the boundary and the duration of the phases (of intensification and diversification). Different patterns generate different strategies; moreover, they can also be adaptive and change depending on the current state and history of the search process. Other representative examples of general strategies that dynamically coordinate intensification and diversification can be found in Battiti and Protasi [1997] and Blum [2002a, 2002b].

23 Indeed, in Glover and Laguna [1997] and in the literature related to TS, many strategies are described and discussed.

Furthermore, strategies are not restricted to single actions (e.g., variable assignments, moves), but may also guide the application of coordinated sequences of moves. Examples of such a strategy are given by so-called ejection chain procedures [Glover and Laguna 1997; Rego 1998, 2001]. These procedures provide a mechanism to perform compound moves, that is, compositions of different types of moves. For instance, in a problem defined over a graph (e.g., the VRP), it is possible to define two different moves: insertion and exchange of nodes; a compound move can thus be defined as the combination of an insertion and an exchange move. These procedures describe general strategies to combine the application of different neighborhood structures; thus, they provide an example of a general diversification/intensification interplay. Further examples of strategies that can be interpreted as mechanisms to produce compositions of interlinked moves can also be found in the literature concerning the integration of metaheuristics and complete techniques [Caseau and Laburthe 1999; Shaw 1998].

In conclusion, we would like to stress again that most metaheuristic components have both an intensification and a diversification effect. The higher the objective function bias, the higher the intensification effect. In contrast, diversification is achieved by following guiding criteria other than the objective function and also by the use of randomness. With the introduction of the I&D frame, metaheuristics can be analyzed by their signature in the I&D frame. This can be a first step toward the systematic design of metaheuristics, combining I&D components of different origin.

5.4. Hybridization of Metaheuristics

We conclude our work by discussing a very promising research issue: the hybridization of metaheuristics. In fact, many of the successful applications that we have cited in previous sections are hybridizations. In the following, we distinguish different forms of hybridization. The first one consists of including components from one metaheuristic into another one. The second form concerns systems that are sometimes labelled as cooperative search; they consist of various algorithms exchanging information in some way. The third form is the integration of approximate and systematic (or complete) methods. For a taxonomy of hybrid metaheuristics, see Talbi [2002].

Component Exchange Among Metaheuristics. One of the most popular ways of hybridization concerns the use of trajectory methods in population-based methods. Most of the successful applications of EC and ACO make use of local search procedures. The reason for this becomes apparent when analyzing the respective strengths of trajectory methods and population-based methods.

The power of population-based methods is certainly based on the concept of recombining solutions to obtain new ones. In EC algorithms and Scatter Search, explicit recombinations are implemented by one or more recombination operators. In ACO and EDAs, recombination is implicit, because new solutions are generated by using a distribution over the search space which is a function of earlier populations. This makes it possible to perform guided steps in the search space, which are usually "larger" than the steps done by trajectory methods. In other words, a solution resulting from a recombination in population-based methods is usually more "different" from the parents than, say, a successor solution from its predecessor solution (obtained by applying a move) in TS. We also have "big" steps in trajectory methods like ILS and VNS, but in these methods the steps are usually not guided (these steps are rather called "kick move" or "perturbation", indicating the lack of guidance). It is interesting to note that in all population-based methods there are mechanisms in which good solutions found during the search influence the search process in the hope of finding better solutions in between those solutions and current solutions. In Path Relinking, this idea is implemented in the most explicit way. The basic elements are guiding solutions (which are the good solutions found) and initiating solutions. New solutions are produced by applying moves to decrease the distance between the resulting solution and the guiding solution. In EC algorithms, this is often obtained by keeping the best solution (or a number of the best solutions) found since the beginning of the respective run of the algorithm in the population. This is called a steady state evolution process. Scatter Search performs a steady state process by definition. In some ACO implementations (see, e.g., Stützle and Hoos [2000] and Blum [2002b]), a pheromone updating schedule is applied such that in a situation where the algorithm has nearly converged to a solution, only the best solution found since the start of the algorithm is used for updating the pheromone trails. This corresponds to "changing direction" and directing the search process toward a very good solution in the hope of finding better ones on the way.

The strength of trajectory methods is rather to be found in the way they explore a promising region in the search space. Since in those methods local search is the driving component, a promising area in the search space is searched in a more structured way than in population-based methods. In this way, the danger of being close to good solutions but "missing" them is not as high as in population-based methods.

In summary, population-based methods are better at identifying promising areas in the search space, whereas trajectory methods are better at exploring promising areas in the search space. Thus, metaheuristic hybrids that in some way manage to combine the advantages of population-based methods with the strengths of trajectory methods are often very successful.

Cooperative Search. A loose form of hybridization is provided by cooperative search [Hogg and Williams 1993; Hogg and Huberman 1993; Bachelet and Talbi 2000; Denzinger and Offerman 1999; Toulouse et al. 1999a, 1999b; Sondergeld and Voß 1999], which consists of a search performed by possibly different algorithms that exchange information about states, models, entire subproblems, solutions, or other search space characteristics. Typically, cooperative search algorithms consist of the parallel execution of search algorithms with a varying level of communication. The algorithms can be different, or they can be instances of the same algorithm working on different models or running with different parameter settings. The algorithms composing a cooperative search system can be all approximate, all complete, or a mix of approximate and complete approaches.

Cooperative search nowadays receives more attention, which is, among other reasons, due to the increasing research on parallel implementations of metaheuristics. The aim of research on the parallelization of metaheuristics is twofold. First, metaheuristics should be redesigned to make them suitable for parallel implementation in order to exploit intrinsic parallelism. Second, an effective combination of metaheuristics has to be found, both to combine different characteristics and strengths and to design efficient communication mechanisms. Since the aim of this article is to provide an overview of the core ideas and strategies of metaheuristics, we refer to Crainic and Toulouse [2002a, 2002b] for a survey on the state of the art in parallel metaheuristics.

Integrating Metaheuristics and Systematic Methods. Concluding this discussion of hybrid metaheuristics, we briefly overview the integration of metaheuristics and systematic search techniques. This approach has recently produced very effective algorithms, especially when applied to real-world problems. Discussions of similarities, differences, and possible integration of metaheuristics and systematic search can be found in Freuder et al. [1995], Ginsberg [1993], Harvey [1995], and Glover and Laguna [1997]. A very successful example of such an integration is the combination of metaheuristics and Constraint Programming (CP) [Focacci et al. 2002; Pesant and Gendreau 1996, 1999; De Backer et al. 2000]. CP makes it possible to model a CO problem by means of variables, domains,24 and constraints, which can be mathematical or symbolic (global). The latter involve a set of variables and describe subproblems, thus reducing the modelling complexity by encapsulating well-defined parts of the problem into single constraints. Every constraint is associated with a filtering algorithm that deletes those values from a variable domain that do not contribute to feasible solutions. A CP system can be seen as the interaction of components (constraints) which communicate through shared variables. Constraints are activated as soon as the domain of any variable involved has been changed. Then, they perform a propagation phase, that is, they apply the filtering algorithm. This behavior stops as soon as no more values can be removed from the domains or at least one domain is empty (i.e., no feasible solution exists). Since the complexity of full constraint propagation is often exponential, the filtering is usually not complete. Therefore, at the end of the propagation phase, some domains may still contain infeasible values. Hence, a search phase is started, such as Branch & Bound. A survey on the integration of metaheuristics and CP is provided by Focacci et al. [2002].

24 We restrict the discussion to finite domains.

There are three main approaches for the integration of metaheuristics (especially trajectory methods) and systematic techniques (CP and tree search):

—A metaheuristic and a systematic method are applied sequentially (their execution can also be interleaved). For instance, the metaheuristic algorithm is run to produce some solutions that are then used as heuristic information by the systematic search. Vice versa, the systematic algorithm can be run to generate a partial solution which will then be completed by the metaheuristic.

—Metaheuristics use CP and/or tree search to efficiently explore the neighborhood, instead of simply enumerating the neighbors or randomly sampling the neighborhood.


—The third possibility consists of introducing concepts or strategies from either class of algorithms into the other. For example, the concepts of tabu list and aspiration criteria, defined in Tabu Search, can be used to manage the list of open nodes (i.e., the ones whose child nodes are not yet explored) in a tree search algorithm.

The first approach can be seen as an instance of cooperative search, and it represents a rather loose integration.

The second approach combines the advantages of a fast search space exploration by means of a metaheuristic with the efficient neighborhood exploration performed by a systematic method. A prominent example of this kind of integration is Large Neighborhood Search and related approaches [Shaw 1998; Caseau and Laburthe 1999]. These approaches are effective mainly when the neighborhood to explore is very large. Moreover, many real-world problems have additional constraints (called side constraints) which might make them unsuitable for the usual neighborhood exploration performed by metaheuristics, since metaheuristics usually just sample the neighborhood or enumerate its solutions. For instance, time window constraints often reduce the number of feasible solutions in a neighborhood, which might make a local search inefficient. Thus, domain filtering techniques can effectively support neighborhood exploration. In fact, for such kinds of neighborhoods, both sampling and enumeration are usually inefficient. More examples can be found in Pesant and Gendreau [1996, 1999] and Focacci et al. [2002].

The third approach preserves the search space exploration based on a systematic search (such as tree search), but sacrifices the exhaustive nature of the search [Ginsberg 1993; Harvey 1995; Harvey and Ginsberg 1995; Milano and Roli 2002]. The hybridization is usually achieved by integrating concepts developed for metaheuristics (e.g., probabilistic choices, aspiration criteria, heuristic construction) into tree search methods. A typical application of this integration is the use of a probabilistic backtracking instead of a deterministic, chronological backtracking. For instance, an extreme case is the random choice of a backtracking move. The list of possible backtracking moves can also be sorted by means of a dynamic heuristic or by a sampling of the leaves of the search tree. This sampling can be performed by a metaheuristic: the result of each possible backtracking move is chosen as the starting point for producing a complete solution by a metaheuristic (more than one solution can be generated from each partial solution). Then, the quality of these complete solutions is used as a score for probabilistically selecting a backtracking move. Another prominent example is the introduction of randomization into systematic techniques, as described in Gomes et al. [2000]. Many examples of this approach can be found in Focacci et al. [2002], Jussien and Lhomme [2002], Schaerf [1997], Dell'Amico and Lodi [2002], Prestwich [2002], and Della Croce and T'kindt [2003].
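The sampling-based selection of a backtracking move described above can be sketched as follows in Python, where complete_with_metaheuristic is a hypothetical hook that extends the partial solution reached by a backtracking move into a complete solution:

import random

def choose_backtracking_move(open_moves, complete_with_metaheuristic, objective):
    """Score each candidate backtracking move by the quality of a complete
    solution sampled from it, then select a move probabilistically, giving
    better completions a higher weight (minimization)."""
    scores = [objective(complete_with_metaheuristic(m)) for m in open_moves]
    weights = [1.0 / (1.0 + s) for s in scores]
    return random.choices(open_moves, weights=weights, k=1)[0]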

6. CONCLUSIONS

In this work, we have presented and compared the currently most important metaheuristic methods. In Sections 3 and 4, we outlined the basic metaheuristics as they are described in the literature. In Section 5, we then proposed a conceptual comparison of the different metaheuristics based on the way they implement the two main concepts for guiding the search process: intensification and diversification. This comparison is founded on the I&D frame, in which algorithmic components can be characterized by the criteria they depend upon (objective function, guiding functions, and randomization) and by their effect on the search process. Although metaheuristics are different in the sense that some of them are population-based (EC, ACO) and others are trajectory methods (SA, TS, ILS, VNS, GRASP), and although they are based on different philosophies, the mechanisms to efficiently explore a search space are all based on intensification and diversification. Nevertheless, it is possible to identify "subtasks" in the search process where some metaheuristics perform better than others. This has to be examined more closely in the future in order to be able to produce hybrid metaheuristics performing considerably better than their "pure" parents. In fact, we can find this phenomenon in many facets of life, not just in the world of algorithms. Mixing and hybridizing is often better than purity.

ACKNOWLEDGMENTS

We would like to thank Marco Dorigo, Joshua Knowles, Andrea Lodi, Michela Milano, Michael Sampels, and Thomas Stützle for suggestions and useful discussions. We would also like to thank the anonymous referees for many useful suggestions and comments.

REFERENCES

AARTS, E. H. L., KORST, J. H. M., AND LAARHOVEN, P. J. M. V. 1997. Simulated annealing. In Local Search in Combinatorial Optimization, E. H. L. Aarts and J. K. Lenstra, Eds. Wiley-Interscience, Chichester, England, 91–120.
AARTS, E. H. L. AND LENSTRA, J. K., EDS. 1997. Local Search in Combinatorial Optimization. Wiley, Chichester, UK.
AIEX, R. M., BINATO, S., AND RESENDE, M. G. C. 2003. Parallel GRASP with path-relinking for job shop scheduling. Paral. Comput. To appear.
BACHELET, V. AND TALBI, E. 2000. COSEARCH: A co-evolutionary metaheuristic. In Proceedings of Congress on Evolutionary Computation—CEC'2000. 1550–1557.
BÄCK, T. 1996. Evolutionary Algorithms in Theory and Practice. Oxford University Press, New York.
BÄCK, T., FOGEL, D. B., AND MICHALEWICZ, Z., EDS. 1997. Handbook of Evolutionary Computation. Institute of Physics Publishing Ltd, Bristol, UK.
BALUJA, S. 1994. Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning. Tech. Rep. No. CMU-CS-94-163, Carnegie Mellon University, Pittsburgh, Pa.
BALUJA, S. AND CARUANA, R. 1995. Removing the genetics from the standard genetic algorithm. In The International Conference on Machine Learning 1995, A. Prieditis and S. Russel, Eds. Morgan-Kaufmann Publishers, San Mateo, Calif., 38–46.
BAR-YAM, Y. 1997. Dynamics of Complex Systems. Studies in Nonlinearity. Addison-Wesley, Reading, Mass.
BATTITI, R. 1996. Reactive search: Toward self-tuning heuristics. In Modern Heuristic Search Methods, V. J. Rayward-Smith, I. H. Osman, C. R. Reeves, and G. D. Smith, Eds. Wiley, Chichester, UK, 61–83.

BATTITI, R. AND PROTASI, M. 1997. Reactive search, a history-based heuristic for MAX-SAT. ACM J. Exper. Algor. 2, Article 2.
BATTITI, R. AND TECCHIOLLI, G. 1994. The reactive tabu search. ORSA J. Comput. 6, 2, 126–140.
BINATO, S., HERY, W. J., LOEWENSTERN, D., AND RESENDE, M. G. C. 2001. A greedy randomized adaptive search procedure for job shop scheduling. In Essays and Surveys on Metaheuristics, P. Hansen and C. C. Ribeiro, Eds. Kluwer Academic Publishers.
BLUM, C. 2002a. ACO applied to group shop scheduling: A case study on intensification and diversification. In Proceedings of ANTS 2002—Third International Workshop on Ant Algorithms, M. Dorigo, G. Di Caro, and M. Sampels, Eds. Lecture Notes in Computer Science, vol. 2463. Springer Verlag, Berlin, Germany, 14–27.
BLUM, C. 2002b. Ant colony optimization for the edge-weighted k-cardinality tree problem. In GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, W. B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M. A. Potter, A. C. Schultz, J. F. Miller, E. Burke, and N. Jonoska, Eds. Morgan-Kaufmann, New York, 27–34.
BLUM, C., ROLI, A., AND DORIGO, M. 2001. HC-ACO: The hyper-cube framework for ant colony optimization. In Proceedings of MIC'2001—Metaheuristics International Conference. Vol. 2. Porto, Portugal, 399–403.
CALEGARY, P., CORAY, G., HERTZ, A., KOBLER, D., AND KUONEN, P. 1999. A taxonomy of evolutionary algorithms in combinatorial optimization. J. Heuristics 5, 145–158.
CAMPOS, V., GLOVER, F., LAGUNA, M., AND MARTÍ, R. 2001. An experimental evaluation of a scatter search for the linear ordering problem. J. Global Opt. 21, 397–414.
CASEAU, Y. AND LABURTHE, F. 1999. Effective forget-and-extend heuristics for scheduling problems. In Proceedings of CP-AI-OR'02—Fourth Int. Workshop on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, Ferrara, Italy. Also available at: www.deis.unibo.it/Events/Deis/Workshops/Proceedings.html.
ČERNÝ, V. 1985. A thermodynamical approach to the travelling salesman problem: An efficient simulation algorithm. J. Optim. Theory Appl. 45, 41–51.
CHARDAIRE, P., LUTTON, J. L., AND SUTTER, A. 1995. Thermostatistical persistency: A powerful improving concept for simulated annealing algorithms. Europ. J. Oper. Res. 86, 565–579.
COELLO COELLO, C. A. 2000. An updated survey of GA-based multiobjective optimization techniques. ACM Comput. Surv. 32, 2, 109–143.
CONNOLLY, D. T. 1990. An improved annealing scheme for the QAP. Europ. J. Oper. Res. 46, 93–100.
CRAINIC, T. G. AND TOULOUSE, M. 2002a. Introduction to the special issue on parallel meta-heuristics. J. Heuristics 8, 3, 247–249.
CRAINIC, T. G. AND TOULOUSE, M. 2002b. Parallel strategies for meta-heuristics. In Handbook of Metaheuristics, F. Glover and G. Kochenberger, Eds. International Series in Operations Research & Management Science, vol. 57. Kluwer Academic Publishers, Norwell, MA.
DE BACKER, B., FURNON, V., AND SHAW, P. 2000. Solving vehicle routing problems using constraint programming and metaheuristics. J. Heuristics 6, 501–523.
DE BONET, J. S., ISBELL JR., C. L., AND VIOLA, P. 1997. MIMIC: Finding optima by estimating probability densities. In Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems (NIPS'97), M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. MIT Press, Cambridge, MA, 424–431.
DEJONG, K. A. 1975. An analysis of the behavior of a class of genetic adaptive systems. Ph.D. thesis, University of Michigan, Ann Arbor, MI. Dissertation Abstracts International 36(10), 5140B, University Microfilms Number 76-9381.
DELLA CROCE, F. AND T'KINDT, V. 2003. A recovering beam search algorithm for the one machine dynamic total completion time scheduling problem. J. Oper. Res. Soc. To appear.
DELL'AMICO, M. AND LODI, A. 2002. On the integration of metaheuristic strategies in constraint programming. In Adaptive Memory and Evolution: Tabu Search and Scatter Search, C. Rego and B. Alidaee, Eds. Kluwer Academic Publishers, Boston, MA.
DELL'AMICO, M., LODI, A., AND MAFFIOLI, F. 1999. Solution of the cumulative assignment problem with a well-structured tabu search method. J. Heuristics 5, 123–143.
DEN BESTEN, M. L., STÜTZLE, T., AND DORIGO, M. 2001. Design of iterated local search algorithms: An example application to the single machine total weighted tardiness problem. In Proceedings of EvoStim'01. Lecture Notes in Computer Science. Springer, 441–452.
DENEUBOURG, J.-L., ARON, S., GOSS, S., AND PASTEELS, J.-M. 1990. The self-organizing exploratory pattern of the Argentine ant. J. Insect Behav. 3, 159–168.
DENZINGER, J. AND OFFERMAN, T. 1999. On cooperation between evolutionary algorithms and other search paradigms. In Proceedings of Congress on Evolutionary Computation—CEC'1999. 2317–2324.
DEVANEY, R. L. 1989. An Introduction to Chaotic Dynamical Systems, second ed. Addison-Wesley, Reading, Mass.


DI CARO, G. AND DORIGO, M. 1998. AntNet: Distributed stigmergetic control for communication networks. J. Artif. Int. Res. 9, 317–365.
DORIGO, M. 1992. Optimization, learning and natural algorithms (in Italian). Ph.D. thesis, DEI, Politecnico di Milano, Italy. pp. 140.
DORIGO, M. AND DI CARO, G. 1999. The ant colony optimization meta-heuristic. In New Ideas in Optimization, D. Corne, M. Dorigo, and F. Glover, Eds. McGraw-Hill, 11–32.
DORIGO, M., DI CARO, G., AND GAMBARDELLA, L. M. 1999. Ant algorithms for discrete optimization. Art. Life 5, 2, 137–172.
DORIGO, M. AND GAMBARDELLA, L. M. 1997. Ant colony system: A cooperative learning approach to the travelling salesman problem. IEEE Trans. Evolution. Comput. 1, 1 (Apr.), 53–66.
DORIGO, M., MANIEZZO, V., AND COLORNI, A. 1996. Ant system: Optimization by a colony of cooperating agents. IEEE Trans. Syst. Man Cybernet.—Part B 26, 1, 29–41.
DORIGO, M. AND STÜTZLE, T. 2002. The ant colony optimization metaheuristic: Algorithms, applications and advances. In Handbook of Metaheuristics, F. Glover and G. Kochenberger, Eds. International Series in Operations Research & Management Science, vol. 57. Kluwer Academic Publishers, Norwell, MA, 251–285.
DORIGO, M. AND STÜTZLE, T. 2003. Ant Colony Optimization. MIT Press, Boston, MA. To appear.
DUECK, G. 1993. New optimization heuristics. J. Comput. Phy. 104, 86–92.
DUECK, G. AND SCHEUER, T. 1990. Threshold accepting: A general purpose optimization algorithm appearing superior to simulated annealing. J. Comput. Phy. 90, 161–175.
EIBEN, A. E., RAUE, P.-E., AND RUTTKAY, Z. 1994. Genetic algorithms with multi-parent recombination. In Proceedings of the 3rd Conference on Parallel Problem Solving from Nature, Y. Davidor, H.-P. Schwefel, and R. Männer, Eds. Lecture Notes in Computer Science, vol. 866. Springer, Berlin, 78–87.
EIBEN, A. E. AND RUTTKAY, Z. 1997. Constraint satisfaction problems. In Handbook of Evolutionary Computation, T. Bäck, D. Fogel, and M. Michalewicz, Eds. Institute of Physics Publishing Ltd, Bristol, UK.
EIBEN, A. E. AND SCHIPPERS, C. A. 1998. On evolutionary exploration and exploitation. Fund. Inf. 35, 1–16.
FELLER, W. 1968. An Introduction to Probability Theory and Its Applications. Wiley, New York.
FEO, T. A. AND RESENDE, M. G. C. 1995. Greedy randomized adaptive search procedures. J. Global Optim. 6, 109–133.
FESTA, P. AND RESENDE, M. G. C. 2002. GRASP: An annotated bibliography. In Essays and Surveys on Metaheuristics, C. C. Ribeiro and P. Hansen, Eds. Kluwer Academic Publishers, 325–367.
FINK, A. AND VOß, S. 1999. Generic metaheuristics application to industrial engineering problems. Comput. Indust. Eng. 37, 281–284.
FLEISCHER, M. 1995. Simulated annealing: Past, present and future. In Proceedings of the 1995 Winter Simulation Conference, C. Alexopoulos, K. Kang, W. Lilegdon, and G. Goldsman, Eds. 155–161.
FOCACCI, F., LABURTHE, F., AND LODI, A. 2002. Local search and constraint programming. In Handbook of Metaheuristics, F. Glover and G. Kochenberger, Eds. International Series in Operations Research & Management Science, vol. 57. Kluwer Academic Publishers, Norwell, MA.
FOGEL, D. B. 1994. An introduction to simulated evolutionary optimization. IEEE Trans. Neural Netw. 5, 1 (Jan.), 3–14.
FOGEL, G. B., PORTO, V. W., WEEKES, D. G., FOGEL, D. B., GRIFFEY, R. H., MCNEIL, J. A., LESNIK, E., ECKER, D. J., AND SAMPATH, R. 2002. Discovery of RNA structural elements using evolutionary computation. Nucleic Acids Res. 30, 23, 5310–5317.
FOGEL, L. J. 1962. Toward inductive inference automata. In Proceedings of the International Federation for Information Processing Congress. Munich, 395–399.
FOGEL, L. J., OWENS, A. J., AND WALSH, M. J. 1966. Artificial Intelligence through Simulated Evolution. Wiley, New York.
FONLUPT, C., ROBILLIARD, D., PREUX, P., AND TALBI, E. 1999. Fitness landscapes and performance of meta-heuristics. In Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, S. Voß, S. Martello, I. Osman, and C. Roucairol, Eds. Kluwer Academic.
FREISLEBEN, B. AND MERZ, P. 1996. A genetic local search algorithm for solving symmetric and asymmetric traveling salesman problems. In International Conference on Evolutionary Computation. 616–621.
FREUDER, E. C., DECHTER, R., GINSBERG, M. L., SELMAN, B., AND TSANG, E. P. K. 1995. Systematic versus stochastic constraint satisfaction. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, IJCAI 1995. Vol. 2. Morgan-Kaufmann, 2027–2032.
GAMBARDELLA, L. M. AND DORIGO, M. 2000. Ant colony system hybridized with a new local search for the sequential ordering problem. INFORMS J. Comput. 12, 3, 237–255.
GAREY, M. R. AND JOHNSON, D. S. 1979. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman.
GENDREAU, M., LAPORTE, G., AND POTVIN, J.-Y. 2001. Metaheuristics for the vehicle routing problem. In The Vehicle Routing Problem, P. Toth and D. Vigo, Eds. SIAM Series on Discrete Mathematics and Applications, vol. 9. 129–154.
GINSBERG, M. L. 1993. Dynamic backtracking. J. Artif. Int. Res. 1, 25–46.
GLOVER, F. 1977. Heuristics for integer programming using surrogate constraints. Dec. Sci. 8, 156–166.


GLOVER, F. 1986. Future paths for integer programming and links to artificial intelligence. Comput. Oper. Res. 13, 533–549.
GLOVER, F. 1990. Tabu search—Part II. ORSA J. Comput. 2, 1, 4–32.
GLOVER, F. 1999. Scatter search and path relinking. In New Ideas in Optimization, D. Corne, M. Dorigo, and F. Glover, Eds. Advanced Topics in Computer Science Series. McGraw-Hill.
GLOVER, F. AND LAGUNA, M. 1997. Tabu Search. Kluwer Academic Publishers.
GLOVER, F., LAGUNA, M., AND MARTÍ, R. 2000. Fundamentals of scatter search and path relinking. Cont. Cybernet. 29, 3, 653–684.
GLOVER, F., LAGUNA, M., AND MARTÍ, R. 2002. Scatter search and path relinking: Advances and applications. In Handbook of Metaheuristics, F. Glover and G. Kochenberger, Eds. International Series in Operations Research & Management Science, vol. 57. Kluwer Academic Publishers, Norwell, MA.
GOLDBERG, D. E. 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, Reading, MA.
GOLDBERG, D. E., DEB, K., AND KORB, B. 1991. Don't worry, be messy. In Proceedings of the 4th International Conference on Genetic Algorithms. Morgan-Kaufmann, La Jolla, CA.
GOLDBERG, D. E. AND RICHARDSON, J. 1987. Genetic algorithms with sharing for multimodal function optimization. In Genetic Algorithms and Their Applications, J. J. Grefenstette, Ed. Lawrence Erlbaum Associates, Hillsdale, NJ, 41–49.
GOMES, C. P., SELMAN, B., CRATO, N., AND KAUTZ, H. 2000. Heavy-tailed phenomena in satisfiability and constraint satisfaction problems. J. Automat. Reason. 24, 67–100.
GREFENSTETTE, J. J. 1987. Incorporating problem specific knowledge into genetic algorithms. In Genetic Algorithms and Simulated Annealing, L. Davis, Ed. Morgan-Kaufmann, 42–60.
GREFENSTETTE, J. J. 1990. A user's guide to GENESIS 5.0. Tech. rep., Navy Centre for Applied Research in Artificial Intelligence, Washington, D.C.
HANSEN, P. 1986. The steepest ascent mildest descent heuristic for combinatorial programming. In Congress on Numerical Methods in Combinatorial Optimization. Capri, Italy.
HANSEN, P. AND MLADENOVIĆ, N. 1997. Variable neighborhood search for the p-median. Loc. Sci. 5, 207–226.
HANSEN, P. AND MLADENOVIĆ, N. 1999. An introduction to variable neighborhood search. In Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, S. Voß, S. Martello, I. Osman, and C. Roucairol, Eds. Kluwer Academic Publishers, Chapter 30, 433–458.
HANSEN, P. AND MLADENOVIĆ, N. 2001. Variable neighborhood search: Principles and applications. Europ. J. Oper. Res. 130, 449–467.
HARIK, G. 1999. Linkage learning via probabilistic modeling in the ECGA. Tech. Rep. No. 99010, IlliGAL, University of Illinois.
HARVEY, W. D. 1995. Nonsystematic backtracking search. Ph.D. thesis, CIRL, University of Oregon.
HARVEY, W. D. AND GINSBERG, M. L. 1995. Limited discrepancy search. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, IJCAI 1995 (Montreal, Que., Canada), C. S. Mellish, Ed. Vol. 1. Morgan-Kaufmann, 607–615.
HERTZ, A. AND KOBLER, D. 2000. A framework for the description of evolutionary algorithms. Europ. J. Oper. Res. 126, 1–12.
HOGG, T. AND HUBERMAN, A. 1993. Better than the best: The power of cooperation. In SFI 1992 Lectures in Complex Systems. Addison-Wesley, 163–184.
HOGG, T. AND WILLIAMS, C. 1993. Solving the really hard problems with cooperative search. In Proceedings of AAAI93. AAAI Press, 213–235.
HOLLAND, J. H. 1975. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, MI.
HORDIJK, W. 1996. A measure of landscapes. Evolut. Comput. 4, 4, 335–360.
INGBER, L. 1996. Adaptive simulated annealing (ASA): Lessons learned. Cont. Cybernet.—Special Issue on Simulated Annealing Applied to Combinatorial Optimization 25, 1, 33–54.
JOHNSON, D. S. AND MCGEOCH, L. A. 1997. The traveling salesman problem: A case study. In Local Search in Combinatorial Optimization, E. Aarts and J. Lenstra, Eds. Wiley, New York, 215–310.
JONES, T. 1995a. Evolutionary algorithms, fitness landscapes and search. Ph.D. thesis, Univ. of New Mexico, Albuquerque, NM.
JONES, T. 1995b. One operator, one landscape. Santa Fe Institute Tech. Rep. 95-02-025, Santa Fe Institute.
JOSLIN, D. E. AND CLEMENTS, D. P. 1999. "Squeaky wheel" optimization. J. Artif. Int. Res. 10, 353–373.
JUSSIEN, N. AND LHOMME, O. 2002. Local search with constraint propagation and conflict-based heuristics. Artif. Int. 139, 21–45.
KAUFFMAN, S. A. 1993. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.
KILBY, P., PROSSER, P., AND SHAW, P. 1999. Guided local search for the vehicle routing problem with time windows. In Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, S. Voß, S. Martello, I. Osman, and C. Roucairol, Eds. Kluwer Academic, 473–486.
KIRKPATRICK, S., GELATT, C. D., AND VECCHI, M. P. 1983. Optimization by simulated annealing. Science 220, 4598 (13 May 1983), 671–680.
LAGUNA, M., LOURENÇO, H., AND MARTÍ, R. 2000. Assigning proctors to exams with scatter search. In Computing Tools for Modeling, Optimization and Simulation: Interfaces in Computer Science and Operations Research, M. Laguna and J. L. González-Velarde, Eds. Kluwer Academic Publishers, Boston, MA, 215–227.

LAGUNA, M. AND MARTÍ, R. 1999. GRASP and path relinking for 2-layer straight line crossing minimization. INFORMS J. Comput. 11, 1, 44–52.
LAGUNA, M., MARTÍ, R., AND CAMPOS, V. 1999. Intensification and diversification with elite tabu search solutions for the linear ordering problem. Comput. Oper. Res. 26, 1217–1230.
LARRAÑAGA, P. AND LOZANO, J. A., EDS. 2002. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, Boston, MA.
LOURENÇO, H. R., MARTIN, O., AND STÜTZLE, T. 2001. A beginner's introduction to iterated local search. In Proceedings of MIC'2001—Metaheuristics International Conference. Vol. 1. Porto, Portugal, 1–6.
LOURENÇO, H. R., MARTIN, O., AND STÜTZLE, T. 2002. Iterated local search. In Handbook of Metaheuristics, F. Glover and G. Kochenberger, Eds. International Series in Operations Research & Management Science, vol. 57. Kluwer Academic Publishers, Norwell, MA, 321–353.
LUNDY, M. AND MEES, A. 1986. Convergence of an annealing algorithm. Math. Prog. 34, 1, 111–124.
MARTIN, O. AND OTTO, S. W. 1996. Combining simulated annealing with local search heuristics. Ann. Oper. Res. 63, 57–75.
MARTIN, O., OTTO, S. W., AND FELTEN, E. W. 1991. Large-step Markov chains for the traveling salesman problem. Complex Syst. 5, 3, 299–326.
MERKLE, D., MIDDENDORF, M., AND SCHMECK, H. 2002. Ant colony optimization for resource-constrained project scheduling. IEEE Trans. Evolut. Comput. 6, 4, 333–346.
METAHEURISTICS NETWORK WEBSITE. 2000. http://www.metaheuristics.net/. Visited in January 2003.
MEULEAU, N. AND DORIGO, M. 2002. Ant colony optimization and stochastic gradient descent. Artif. Life 8, 2, 103–121.
MICHALEWICZ, Z. AND MICHALEWICZ, M. 1997. Evolutionary computation techniques and their applications. In Proceedings of the IEEE International Conference on Intelligent Processing Systems (Beijing, China). Institute of Electrical & Electronics Engineers, Incorporated, 14–24.
MILANO, M. AND ROLI, A. 2002. On the relation between complete and incomplete search: An informal discussion. In Proceedings of CP-AI-OR'02—Fourth Int. Workshop on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems (Le Croisic, France). 237–250.
MILLS, P. AND TSANG, E. 2000. Guided local search for solving SAT and weighted MAX-SAT problems. In SAT2000, I. Gent, H. van Maaren, and T. Walsh, Eds. IOS Press, 89–106.
MITCHELL, M. 1998. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA.
MLADENOVIĆ, N. AND UROSEVIĆ, D. 2001. Variable neighborhood search for the k-cardinality tree. In Proceedings of MIC'2001—Metaheuristics International Conference. Vol. 2. Porto, Portugal, 743–747.
MOSCATO, P. 1989. On evolution, search, optimization, genetic algorithms and martial arts: Toward memetic algorithms. Tech. Rep. Caltech Concurrent Computation Program 826, California Institute of Technology, Pasadena, Calif.
MOSCATO, P. 1999. Memetic algorithms: A short introduction. In New Ideas in Optimization, D. Corne, M. Dorigo, and F. Glover, Eds. McGraw-Hill.
MÜHLENBEIN, H. 1991. Evolution in time and space—The parallel genetic algorithm. In Foundations of Genetic Algorithms, G. J. E. Rawlins, Ed. Morgan-Kaufmann, San Mateo, Calif.
MÜHLENBEIN, H. AND PAAß, G. 1996. From recombination of genes to the estimation of distributions. In Proceedings of the 4th Conference on Parallel Problem Solving from Nature—PPSN IV, H.-M. Voigt, W. Ebeling, I. Rechenberg, and H.-P. Schwefel, Eds. Lecture Notes in Computer Science, vol. 1411. Springer, Berlin, 178–187.
MÜHLENBEIN, H. AND VOIGT, H.-M. 1995. Gene pool recombination in genetic algorithms. In Proc. of the Metaheuristics Conference, I. H. Osman and J. P. Kelly, Eds. Kluwer Academic Publishers, Norwell, USA.
NEMHAUSER, G. L. AND WOLSEY, A. L. 1988. Integer and Combinatorial Optimization. Wiley, New York.
NOWICKI, E. AND SMUTNICKI, C. 1996. A fast taboo search algorithm for the job-shop problem. Manage. Sci. 42, 2, 797–813.
OSMAN, I. H. 1993. Metastrategy simulated annealing and tabu search algorithms for the vehicle routing problem. Ann. Oper. Res. 41, 421–451.
OSMAN, I. H. AND LAPORTE, G. 1996. Metaheuristics: A bibliography. Ann. Oper. Res. 63, 513–623.
PAPADIMITRIOU, C. H. AND STEIGLITZ, K. 1982. Combinatorial Optimization—Algorithms and Complexity. Dover Publications, Inc., New York.
PELIKAN, M., GOLDBERG, D. E., AND CANTÚ-PAZ, E. 1999a. BOA: The Bayesian optimization algorithm. In Proceedings of the Genetic and Evolutionary Computation Conference GECCO-99 (Orlando, Fla.), W. Banzhaf, J. Daida, A. E. Eiben, M. H. Garzon, V. Honavar, M. Jakiela, and R. E. Smith, Eds. Vol. I. Morgan-Kaufmann Publishers, San Francisco, CA, 525–532.
PELIKAN, M., GOLDBERG, D. E., AND LOBO, F. 1999b. A survey of optimization by building and using probabilistic models. Tech. Rep. No. 99018, IlliGAL, University of Illinois.


PESANT, G. AND GENDREAU, M. 1996. A view of localsearch in Constraint Programming. In Princi-ples and Practice of Constraint Programming—CP’96. Lecture Notes in Computer Science, vol.1118. Springer-Verlag, 353–366.

PESANT, G. AND GENDREAU, M. 1999. A constraintprogramming framework for local search meth-ods. J. Heuristics 5, 255–279.

PITSOULIS, L. S. AND RESENDE, M. G. C. 2002.Greedy randomized adaptive search proce-dure. In Handbook of Applied Optimization, P.Pardalos and M. Resende, Eds. Oxford Univer-sity Press, 168–183.

PRAIS, M. AND RIBEIRO, C. C. 2000. ReactiveGRASP: An application to a matrix decompo-sition problem in TDMA traffic assignment.INFORMS J. Comput. 12, 164–176.

PRESTWICH, S. 2002. Combining the scalability oflocal search with the pruning techniques ofsystematic search. Ann. Oper. Res. 115, 51–72.

RADCLIFFE, N. J. 1991. Forma Analysis and Ran-dom Respectful Recombination. In Proceed-ings of the Fourth International Conferenceon Genetic Algorithms, ICGA 1991. Morgan-Kaufmann, San Mateo, Calif., 222–229.

RAYWARD-SMITH, V. J. 1994. A unified approach totabu search, simulated annealing and genetic al-gorithms. In Applications of Modern Heuristics,V. J. Rayward-Smith, Ed. Alfred Waller Limited,Publishers.

RECHENBERG, I. 1973. Evolutionsstrategie: Opti-mierung technischer Systeme nach Prinzip-ien der biologischen Evolution. Frommann-Holzboog.

REEVES, C. R., Ed. 1993. Modern Heuristic Tech-niques for Combinatorial Problems. BlackwellScientific Publishing, Oxford, England.

REEVES, C. R. 1999. Landscapes, operators andheuristic search. Ann. Oper. Res. 86, 473–490.

REEVES, C. R. AND ROWE, J. E. 2002. Genetic Al-gorithms: Principles and Perspectives. A Guideto GA Theory. Kluwer Academic Publishers,Boston (USA).

REGO, C. 1998. Relaxed Tours and Path Ejectionsfor the Traveling Salesman Problem. Europ. J.Oper. Res. 106, 522–538.

REGO, C. 2001. Node-ejection chains for the ve-hicle routing problem: Sequential and paral-lel algorithms. Paral. Comput. 27, 3, 201–222.

RESENDE, M. G. C. AND RIBEIRO, C. C. 1997. AGRASP for graph planarization. Networks 29,173–189.

RIBEIRO, C. C. AND SOUZA, M. C. 2002. Variableneighborhood search for the degree constrainedminimum spanning tree problem. Disc. Appl.Math. 118, 43–54.

SCHAERF, A. 1997. Combining local search andlook-ahead for scheduling and constraint satis-faction problems. In Proceedings of the 15th In-

ternational Joint Conference on Artificial Intelli-gence, IJCAI 1997. Morgan-Kaufmann Publish-ers, San Mateo, CA, 1254–1259.

SCHAERF, A., CADOLI, M., AND LENZERINI, M. 2000.LOCAL++: a C++ framework for local search al-gorithms. Softw. Pract. Exp. 30, 3, 233–256.

SHAW, P. 1998. Using constraint programming andlocal search methods to solve vehicle routingproblems. In Principle and Practice of ConstraintProgramming—CP98, M. Maher and J.-F. Puget,Eds. Lecture Notes in Computer Science, vol.1520. Springer.

SIPPER, M., SANCHEZ, E., MANGE, D., TOMASSINI, M.,PEREZ-URIBE, A., AND STAUFFER, A. 1997. Aphylogenetic, ontogenetic, and epigenetic viewof bio-inspired hardware systems. IEEE Trans.Evolut. Comput. 1, 1, 83–97.

SONDERGELD, L. AND VOß, S. 1999. Cooperative in-telligent search using adaptive memory tech-niques. In Meta-Heuristics: Advances andTrends in Local Search Paradigms for Optimiza-tion, S. Voss, S. Martello, I. Osman, and C.Roucairol, Eds. Kluwer Academic Publishers,Chapter 21, 297–312.

SPEARS, W. M., JONG, K. A. D., BACK, T., FOGEL, D. B.,AND DE GARIS, H. 1993. An overview of evolu-tionary computation. In Proceedings of the Euro-pean Conference on Machine Learning (ECML-93), P. B. Brazdil, Ed. Vol. 667. Springer Verlag,Vienna, Austria, 442–459.

STADLER, P. F. 1995. Towards a theory of land-scapes. In Complex Systems and Binary Net-works, R. Lopez-Pena, R. Capovilla, R. Garcıa-Pelayo, H. Waelbroeck, and F. Zertuche, Eds.Lecture Notes in Physics, vol. 461. Springer-Verlag, Berlin, New York, 77–163. Also availableas SFI preprint 95-03-030.

STADLER, P. F. 1996. Landscapes and their corre-lation functions. J. Math. Chem. 20, 1–45. Alsoavailable as SFI preprint 95-07-067.

STUTZLE, T. 1999a. Iterated local search for thequadratic assignment problem. Tech. rep. aida-99-03, FG Intellektik, TU Darmstadt.

STÜTZLE, T. 1999b. Local Search Algorithms for Combinatorial Problems—Analysis, Algorithms and New Applications. DISKI—Dissertationen zur Künstlichen Intelligenz. infix, Sankt Augustin, Germany.

STÜTZLE, T. AND HOOS, H. H. 2000. MAX-MIN Ant System. Fut. Gen. Comput. Syst. 16, 8, 889–914.

SYSWERDA, G. 1993. Simulated crossover in genetic algorithms. In Proceedings of the 2nd Workshop on Foundations of Genetic Algorithms, L. Whitley, Ed. Morgan Kaufmann, San Mateo, Calif., 239–255.

TABU SEARCH WEBSITE. 2003. http://www.tabusearch.net. Visited in January 2003.

TAILLARD, E. 1991. Robust taboo search for the quadratic assignment problem. Paral. Comput. 17, 443–455.

TALBI, E.-G. 2002. A taxonomy of hybrid metaheuristics. J. Heuristics 8, 5, 541–564.

TOULOUSE, M., CRAINIC, T., AND SANSO, B. 1999a. An experimental study of the systemic behavior of cooperative search algorithms. In Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, S. Voß, S. Martello, I. Osman, and C. Roucairol, Eds. Kluwer Academic Publishers, Chapter 26, 373–392.

TOULOUSE, M., THULASIRAMAN, K., AND GLOVER, F. 1999b. Multi-level cooperative search: A new paradigm for combinatorial optimization and application to graph partitioning. In Proceedings of the 5th International Euro-Par Conference on Parallel Processing. Lecture Notes in Computer Science. Springer-Verlag, New York, 533–542.

VAN KEMENADE, C. H. M. 1996. Explicit filtering of building blocks for genetic algorithms. In Proceedings of the 4th Conference on Parallel Problem Solving from Nature—PPSN IV, H.-M. Voigt, W. Ebeling, I. Rechenberg, and H.-P. Schwefel, Eds. Lecture Notes in Computer Science, vol. 1141. Springer, Berlin, 494–503.

VAN LAARHOVEN, P. J. M., AARTS, E. H. L., AND LENSTRA, J. K. 1992. Job shop scheduling by simulated annealing. Oper. Res. 40, 113–125.

VOSE, M. 1999. The Simple Genetic Algorithm: Foundations and Theory. Complex Adaptive Systems. MIT Press, Cambridge, Mass.

VOß, S., MARTELLO, S., OSMAN, I. H., AND ROUCAIROL, C., Eds. 1999. Meta-Heuristics—Advances and Trends in Local Search Paradigms for Optimization. Kluwer Academic Publishers, Dordrecht, The Netherlands.

VOß, S. AND WOODRUFF, D., Eds. 2002. Optimization Software Class Libraries. Kluwer Academic Publishers, Dordrecht, The Netherlands.

VOUDOURIS, C. 1997. Guided local search for combinatorial optimization problems. Ph.D. dissertation, Department of Computer Science, University of Essex. 166 pp.

VOUDOURIS, C. AND TSANG, E. 1999. Guided local search. Europ. J. Oper. Res. 113, 2, 469–499.

WADE, A. S. AND RAYWARD-SMITH, V. J. 1997. Effective local search for the Steiner tree problem. Studies in Locational Analysis 11, 219–241. Also in Advances in Steiner Trees, D.-Z. Du, J. M. Smith, and J. H. Rubinstein, Eds. Kluwer, 2000.

WATSON, R. A., HORNBY, G. S., AND POLLACK, J. B. 1998. Modeling building-block interdependency. In Late Breaking Papers at the Genetic Programming 1998 Conference, J. R. Koza, Ed. Stanford University Bookstore, University of Wisconsin, Madison, Wisc.

WHITLEY, D. 1989. The GENITOR algorithm and selective pressure: Why rank-based allocation of reproductive trials is best. In Proceedings of the 3rd International Conference on Genetic Algorithms, ICGA 1989. Morgan Kaufmann, 116–121.

YAGIURA, M. AND IBARAKI, T. 2001. On metaheuristic algorithms for combinatorial optimization problems. Syst. Comput. Japan 32, 3, 33–55.

ZLOCHIN, M., BIRATTARI, M., MEULEAU, N., AND DORIGO, M. 2004. Model-based search for combinatorial optimization: A critical survey. Ann. Oper. Res. To appear.

Received July 2002; revised February 2003; accepted June 2003
