
The Crowding Approach to Niching in Genetic Algorithms

Ole J. Mengshoel, omengshoel@riacs.edu
RIACS, NASA Ames Research Center, Mail Stop 269-3, Moffett Field, CA 94035

David E. Goldberg, deg@uiuc.edu
Illinois Genetic Algorithms Laboratory, Department of General Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801

    Abstract


classical GA. Deterministic crowding has also been found to work well on test functions as well as in applications.

In this article, we present an algorithmic framework that supports different crowding algorithms, including different replacement rules and the use of multiple replacement rules in portfolios. While our main emphasis is on the probabilistic crowding algorithm (Mengshoel and Goldberg, 1999; Mengshoel, 1999), we also investigate other approaches, including deterministic crowding, within this framework. As the name suggests, probabilistic crowding is closely related to deterministic crowding, and as such shares many of deterministic crowding's strong characteristics. The main difference is the use of a probabilistic rather than a deterministic replacement rule (or acceptance function). In probabilistic crowding, stronger individuals do not always win over weaker individuals; they win proportionally according to their fitness. Using a probabilistic acceptance function is shown to give stable, predictable convergence that approximates the niching rule, a gold standard for niching algorithms.
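As a concrete illustration (not code from the article), the two acceptance functions can be sketched as follows in Python; the fitness-proportionate probability f_candidate / (f_candidate + f_incumbent) follows the fitness-proportional wording above, and the function names and the zero-fitness tie-breaking are our own assumptions.

```python
import random

def deterministic_acceptance(f_candidate, f_incumbent):
    """Deterministic rule: the fitter of the two individuals wins
    (breaking ties in favor of the candidate is our assumption)."""
    return f_candidate >= f_incumbent

def probabilistic_acceptance(f_candidate, f_incumbent):
    """Probabilistic rule: the candidate wins with probability
    f_candidate / (f_candidate + f_incumbent), assuming non-negative fitness."""
    total = f_candidate + f_incumbent
    if total == 0:
        return random.random() < 0.5  # both fitnesses zero: decide uniformly
    return random.random() < f_candidate / total
```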

We also present here a framework for analyzing crowding algorithms. We consider performance at equilibrium and during convergence to equilibrium. Further, we introduce a novel portfolio mechanism and discuss the benefit of integrating different replacement rules by means of this mechanism. In particular, we show the advantage of integrating deterministic and probabilistic crowding when selection pressure under probabilistic crowding alone is low. Our analysis, which includes population sizing results, is backed up by experiments that confirm our analytical results and further increase the understanding of how crowding, and in particular probabilistic crowding, operates.
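One possible reading of such a portfolio mechanism is sketched below, under the assumption that a replacement rule is drawn at random for each pairwise tournament; the 50/50 mixing weights and the helper name portfolio_acceptance are purely illustrative.

```python
import random

def portfolio_acceptance(rules, weights, f_candidate, f_incumbent):
    """Draw one replacement rule at random for this pairwise tournament,
    then apply it to decide whether the candidate replaces the incumbent."""
    rule = random.choices(rules, weights=weights, k=1)[0]
    return rule(f_candidate, f_incumbent)

# Example: a 50/50 mix of the two rules sketched above.
# portfolio_acceptance([deterministic_acceptance, probabilistic_acceptance],
#                      [0.5, 0.5], f_candidate=3.0, f_incumbent=4.0)
```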

A final contribution of this article is to identify a class of algorithms to which both deterministic and probabilistic crowding belong: local tournament algorithms. Other members of this class include the Metropolis algorithm (Metropolis et al., 1953), simulated annealing (Kirkpatrick et al., 1983), restricted tournament selection (Harik, 1995), elitist recombination (Thierens and Goldberg, 1994), and parallel recombinative simulated annealing (Mahfoud and Goldberg, 1995). Common to these algorithms is that competition is localized, in that it occurs between genotypically similar individuals. It turns out that slight variations in how tournaments are set up and take place are crucial to whether one obtains a niching algorithm or not. This class of algorithms is interesting because it is very efficient and can easily be applied in different settings, for example by changing or combining the replacement rules.

We believe this work is significant for several reasons. As already mentioned, niching algorithms reduce the effect of premature convergence or improve search for multiple optima. Finding multiple optima is useful, for example, in situations where there is uncertainty about the fitness function and robustness with respect to inaccuracies in the fitness function is desired. Niching and crowding algorithms also play a fundamental role in multi-objective optimization algorithms (Fonseca and Fleming, 1993; Deb, 2001) as well as in estimation of distribution algorithms (Pelikan and Goldberg, 2001; Sastry et al., 2005). We enable further progress in these areas by explicitly stating new and existing algorithms in an overarching framework, thereby improving the understanding of the crowding approach to niching. There are several informative experiments that compare different niching and crowding GAs (Ando et al., 2005b; Singh and Deb, 2006). However, less effort has been devoted to increasing the understanding of crowding from an analytical point of view, as we do here. Analytically, we also make a contribution with our portfolio framework, which enables easy combination of different replacement rules. Finally, while our focus here is on discrete multimodal optimization, a variant of probabilistic crowding has successfully been applied to hard multimodal optimization problems in high-dimensional continuous spaces (Ballester and Carter, 2003, 2004, 2006). We hope the present work will act as a catalyst to further progress also in this area.


The rest of this article is organized as follows. Section 2 presents fundamental concepts. Section 3 discusses local tournament algorithms. Section 4 discusses our crowding algorithms and replacement rules, including the probabilistic and deterministic crowding replacement rules. In Section 5, we analyze several variants of probabilistic and deterministic crowding. In Section 6, we introduce and analyze our approach to integrating different crowding replacement rules in a portfolio. Section 7 discusses how our analysis compares to previous analyses, using Markov chains, of stochastic search algorithms including genetic algorithms. Section 8 contains experiments that shed further light on probabilistic crowding, suggesting that it works well and in line with our analysis. Section 9 concludes and points out directions for future research.

    2 Preliminaries

To simplify the exposition we focus on GAs using binary strings, or bitstrings $x \in \{0, 1\}^m$, of length $m$. Distance is measured using the Hamming distance DISTANCE$(x, y)$ between two bitstrings $x, y \in \{0, 1\}^m$. More formally, we have the following definition.

Definition 1 (Distance) Let $x$, $y$ be bitstrings of length $m$ and let, for $x_i \in x$ and $y_i \in y$ where $1 \le i \le m$, $d(x_i, y_i) = 0$ if $x_i = y_i$ and $d(x_i, y_i) = 1$ otherwise. Now, the distance function DISTANCE$(x, y)$ is defined as follows:

$$\mathrm{DISTANCE}(x, y) = \sum_{i=1}^{m} d(x_i, y_i).$$

Our DISTANCE definition is often called genotypic distance; when we discuss distance in this article, the above definition is generally assumed.
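For illustration, Definition 1 can be computed directly; the following sketch assumes bitstrings are represented as equal-length Python sequences of 0/1 values.

```python
def distance(x, y):
    """Genotypic (Hamming) distance of Definition 1 between two
    equal-length bitstrings given as sequences of 0/1 values."""
    assert len(x) == len(y), "bitstrings must share the same length m"
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

# Example: DISTANCE(0110, 0011) = 2
print(distance([0, 1, 1, 0], [0, 0, 1, 1]))  # -> 2
```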

A natural way to analyze a stochastic search algorithm's operation on a problem instance is to use discrete time Markov chains with discrete state spaces.

Definition 2 (Markov chain) A (discrete time, discrete state space) Markov chain $M$ is defined as a 3-tuple $M = (S, V, P)$, where $S = \{s_1, \ldots, s_k\}$ defines the set of $k$ states while $V = (v_1, \ldots, v_k)$, a $k$-dimensional vector, defines the initial probability distribution. The conditional state transition probabilities $P$ can be characterized by means of a $k \times k$ matrix.

Only time-homogeneous Markov chains, where $P$ is constant, will be considered in this article. The performance of stochastic search algorithms, both evolutionary algorithms and stochastic local search algorithms, can be formalized using Markov chains (Goldberg and Segrest, 1987; Nix and Vose, 1992; Harik et al., 1997; De Jong and Spears, 1997; Spears and De Jong, 1997; Cantu-Paz, 2000; Hoos, 2002; Moey and Rowe, 2004a,b; Mengshoel, 2006). Unfortunately, if one attempts exact analysis, the size of $M$ becomes very large even for relatively small problem instances (Nix and Vose, 1992; Mengshoel, 2006). In Section 7 we provide further discussion of how our analysis provides an approximation compared to previous exact Markov chain analysis results.
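The following toy sketch, with an invented two-state transition matrix, illustrates how a time-homogeneous chain $M = (S, V, P)$ evolves: the distribution over states after $t$ steps is $V P^t$.

```python
import numpy as np

# Invented two-state toy chain M = (S, V, P); the numbers are illustrative only.
S = ["s1", "s2"]
V = np.array([1.0, 0.0])            # initial probability distribution over S
P = np.array([[0.9, 0.1],           # P[i, j] = Pr(next state s_j | current state s_i)
              [0.4, 0.6]])

def distribution_after(t):
    """Distribution over states after t steps of the time-homogeneous chain: V P^t."""
    return V @ np.linalg.matrix_power(P, t)

print(distribution_after(10))  # approaches the chain's stationary distribution
```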

In $M$, some states $O \subseteq S$ are of particular interest since they represent globally optimal states, and we introduce the following definition.

Definition 3 (Optimal states) Let $M = (S, V, P)$ be a Markov chain. Further, assume a fitness function $f : S \to \mathbb{R}$ and a globally optimal fitness function value $f^* \in \mathbb{R}$ that defines the globally optimal states $O = \{s \mid s \in S \text{ and } f(s) = f^*\}$.
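Given an explicit fitness function over a small state set, the optimal states of Definition 3 can be read off directly; the fitness values below are invented for illustration.

```python
# Hypothetical fitness values over a small state set (invented for illustration).
f = {"s1": 2.0, "s2": 5.0, "s3": 5.0, "s4": 1.0}

f_star = max(f.values())              # globally optimal fitness value f*
O = {s for s in f if f[s] == f_star}  # globally optimal states of Definition 3
print(f_star, O)                      # -> 5.0 and {'s2', 's3'} (set order may vary)
```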


                               Population-based                     Non-population-based
    Probabilistic acceptance   Probabilistic crowding;              Metropolis algorithm;
                               parallel recombinative               simulated annealing;
                               simulated annealing                  stochastic local search
    Deterministic acceptance   Deterministic crowding;              Local search
                               restricted tournament selection

    Table 1: Two key dimensions of local tournament algorithms: (i) the nature of the acceptance (or replacement) rule and (ii) the nature of the current state of the algorithm's search process.

The fitness function $f$ and the optimal states $O$ are independent of the stochastic search algorithm and its parameters. In general, of course, neither $M$ nor $O$ are explicitly specified. Rather, they are induced by the fitness function, the stochastic search algorithm, and the search algorithm's parameter settings. Finding $s^* \in O$ is often the purpose of computation, as it is given only implicitly by the optimal fitness function value $f^* \in \mathbb{R}$. More generally, we want to not only find globally optimal states, but also locally optimal states $L$, with $O \subseteq L$. Finding locally optimal states is, in particular, the purpose of niching algorithms including crowding GAs. Without loss of generality we consider maximization problems in this article; in other words, we seek global and local maxima of a fitness function $f$.
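As a small, non-authoritative illustration of locally optimal states over bitstrings, the sketch below calls a state a local maximum when no Hamming-distance-1 neighbor has strictly higher fitness; the bimodal fitness function is an invented example.

```python
from itertools import product

def hamming_neighbors(x):
    """All bitstrings (as tuples) at Hamming distance 1 from x."""
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]

def local_maxima(f, m):
    """States of {0,1}^m whose fitness no Hamming-1 neighbor strictly exceeds."""
    return [x for x in product((0, 1), repeat=m)
            if all(f(x) >= f(n) for n in hamming_neighbors(x))]

# Invented bimodal fitness over 4-bit strings, peaking at 0000 and 1111.
f = lambda x: max(x.count(0), x.count(1))
print(local_maxima(f, 4))  # -> [(0, 0, 0, 0), (1, 1, 1, 1)]
```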

    3 Crowding and Local Tournament Algorithms

In traditional GAs, mutation and recombination are done first, and then selection (or replacement) is performed second, without regard to distance (or the degree of similarity) between individuals. Other algorithms, such as pro