A THESIS SUBMITTED AS PART OF THE APPLIED PHYSICS PROGRAM IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF BACHELOR OF SCIENCE IN PHYSICS

Approximating the Evolution Time of the Eye: A Genetic Algorithms Approach

Dov J. Rhodes

DEPARTMENT OF PHYSICS
INDIANA UNIVERSITY

May 4, 2007




Abstract

We use genetic algorithms to estimate the evolution time of the eye. Our main purpose is to demonstrate that a deterministic approach, which neglects to consider locally optimal evolutionary paths, tends to underestimate the evolution time by at least a factor of 5, and probably more.


1 Introduction

A paper published in 1994 by the Swedish scientists Nilsson and Pelger [6] gained immediate worldwide fame for describing the evolution of an eye and approximating the time required for an eye to evolve from a simple patch that senses electromagnetic radiation. Nilsson and Pelger (NP) outlined an evolutionary path along which, by minute improvements at each step, a camera-type eye can evolve in approximately 360,000 years, which is extremely fast on an evolutionary time scale. Their paper drew particular attention because Darwin's theory of evolution, although considered by most to be scientific fact, has shown few pieces of evidence for the development of vision. A common criticism of Darwin's theory is that it struggles to justify the development of the eye. How can just part of an eye be useful to its owner? Is it possible to bridge the gap between imaging and mere absorption of radiation? This potential problem was acknowledged even by Darwin himself [3].

In the 1994 article, NP consider a parameter that they call "spatial resolution", a balance of noise versus optical blur. Holding the ambient intensity and eye size constant, there is supposedly an optimal aperture size; a smaller aperture causes a higher noise factor but less blur, and vice versa. NP claim that in 1829 increments of 1% improvement to the spatial resolution, their model develops a fully functional camera-type eye. In the next step, they quantify this change as equivalent to an object growing to be $1.01^{1829}$, or 80,129,540, times "longer". Approximating the average improvement per generation to be 0.005% (a pessimistic value according to NP), the number of generations $n$ required to increase the "length" by a proportion of 80,129,540 is determined by solving $(1.00005)^n = 80{,}129{,}540$, which results in $n = 363{,}992$. Hence, for a species with an average lifetime of one year (fish), NP claim that an eye is likely to evolve in about 360 thousand years.
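The arithmetic behind this estimate can be checked in a few lines. The sketch below assumes a per-generation improvement of 0.005%, the rate for which the quoted value of n ≈ 363,992 follows; the variable names are ours, not NP's.

```python
import math

steps = 1829               # number of 1% improvements claimed by NP
step_gain = 1.01           # each step improves spatial resolution by 1%
per_generation = 1.00005   # assumed 0.005% improvement per generation

# Total proportional change after all steps: 1.01 ** 1829
total_change = step_gain ** steps

# Solve per_generation ** n = total_change for n using logarithms
n_generations = steps * math.log(step_gain) / math.log(per_generation)

print(f"total change  ~ {total_change:.4g}")
print(f"generations n ~ {n_generations:,.0f}")
```

Both numbers should land close to the values quoted from NP, roughly 8 × 10^7 and 3.64 × 10^5 respectively.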

This time period is a mere blink on an evolutionary timescale. Is it possible that NP's approximation is correct? Possibly, but it is extremely improbable. Many weaknesses of NP's argument are laid out in David Berlinski's 2003 response to their article [1]. Berlinski has many strong arguments against the scientific credibility of NP; however, he does not demonstrate why their model is far from correct. The main problem with the NP model is that although the evolutionary path that it describes might be a legitimate one, it neglects divergent paths. It is easy to construct a situation in which the best temporary option for the improvement of an eye does not lead towards the development of the globally optimal solution. This idea motivates our alternative approach, the method of genetic algorithms. In this paper we use the genetic algorithm with a simplified (2-dimensional) version of NP's setup and show the error in their approach. We argue that if their approach is mistaken in the simplified model, it is even farther from reality in the full evolutionary setting.

1.1 The Genetic Algorithms Approach

Already in the 1960s, as computers began to grow more common, biologists were attempting to simulate evolution. In the 1970s, John Holland's book [4] described the genetic algorithm, a program that emulates natural selection, as a general method of optimization, going far beyond the scope of the original biological intention. The 1980s saw the beginning of a shift from theory to actual application of genetic algorithms (GA) as computing power increased dramatically, a process that is still in progress.

The imagery of peaks and valleys associated with the genetic algorithm was popularized by evolutionist Richard Dawkins [3]. In his book "Climbing Mount Improbable", Dawkins describes how in evolution there are typically locally optimal conditions (we will call them local minima, or valleys), but these might not resemble the globally optimal solution. A nice example is the pinhole eye: for a simple light-absorbing crevice with no lens or mirror, the larger the eye's aperture, the more blurred the image. Thus the most focused eye of this type would be a tiny "pinhole" eye. The obvious disadvantage is that this eye does not see as much, does not absorb as much information from its surroundings, and, as NP point out, the noise overwhelms the actual information. The pinhole eye is nevertheless a local minimum, the best choice within a small class of choices. In order for a pinhole eye to evolve into a lens eye, the pinhole eye must first grow (over many generations of genetic mixing and mutation), become less focused, and still survive the process of natural selection. The genetic algorithms analogue of this example is uphill movement upon an evolutionary landscape of peaks and valleys, going in the less probable direction. Only after climbing out of the valley does the pinhole type have access to other, possibly superior options such as the lens or mirror eye.


There are several significant differences between the GA approach and traditional optimization methods. The search algorithm runs through many paths in parallel rather than taking just a single path. This immediately makes the optimization process more efficient, as well as allowing the discovery of multiple solutions. The progression of the algorithm through these different paths is probabilistic and thus more flexible than a deterministic model. Stochastic variations are obviously necessary when dealing with natural selection, or any other potentially chaotic process. Finally, the only information necessary for GA is the objective function. No derivative information is required.

The GA method shows great potential for a variety of multi-dimensional optimization problems such as scheduling, and even optimizing the efficiency of control systems. Reference [5] gives an interesting description of the use of GA in diagnostics of aircraft engines.

2 Simple Genetic Algorithm

Here is an outline of a simple genetic algorithm from the Matlab GA toolbox [2]:

Before starting the program, a number of choices need to be made:

NIND := number of individuals in the population.
MAXGEN := number of generations (loop iterations).
NVAR := number of variables to be optimized (equivalent to chromosomes).
PRECI := precision, the length of the binary string representing each variable.
GGAP := generation gap, the fraction of the population replaced in each generation.
MUT := mutation probability for each chromosome bit passed to the next generation.

We also must choose an objective function, used to evaluate the selection value of each chromosome. More detail about objective functions will follow in the next section.

(I) INITIALIZE POPULATION: Create a set of individuals, henceforth referred to as Mambos.¹ Each Mambo is initially assigned a random binary string of length NVAR × PRECI. The initial choice of population is 20 Mambos. After running GA experiments with different precision values (20, 30, 40 bits), we concluded that within the scope of this project 30 bits suffice, and that choosing a higher number of bits results in nothing besides slower computation.
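As a minimal sketch of step (I), the population can be represented as a list of bit lists. This is plain Python rather than the Matlab toolbox, and the helper name `init_population` is our own.

```python
import random

NIND = 20    # individuals in the population
NVAR = 2     # variables per individual
PRECI = 30   # bits per variable

def init_population(nind, nvar, preci, rng):
    """Return nind random bit lists, each of length nvar * preci."""
    length = nvar * preci
    return [[rng.randint(0, 1) for _ in range(length)] for _ in range(nind)]

rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
population = init_population(NIND, NVAR, PRECI, rng)
print(len(population), len(population[0]))  # prints: 20 60
```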

(II) EVALUATE POPULATION: Convert each chromosome to a number and evaluate it with the objective function.
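The conversion in step (II) can be sketched as follows. We assume plain binary coding mapped linearly onto an interval [lower, upper]; the toolbox may use a different coding (for example Gray code), and the function name `decode` is hypothetical.

```python
def decode(chromosome, nvar, preci, lower, upper):
    """Map each preci-bit slice of a bit list to a real number in [lower, upper]."""
    values = []
    for v in range(nvar):
        bits = chromosome[v * preci:(v + 1) * preci]
        integer = int("".join(str(b) for b in bits), 2)
        fraction = integer / (2 ** preci - 1)  # 0.0 for all zeros, 1.0 for all ones
        values.append(lower + fraction * (upper - lower))
    return values

# Example with 4 bits per variable: 0000 maps to the lower bound, 1111 to the upper
print(decode([0, 0, 0, 0, 1, 1, 1, 1], nvar=2, preci=4, lower=-1.0, upper=1.0))
# prints: [-1.0, 1.0]
```

The decoded values are then passed to the objective function to obtain each Mambo's score.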

(III) GENERATION LOOP (repeated MAXGEN times):

(i) Assign Fitness - rank the population of Mambos according to the values given by the objective function.
(ii) Select individuals for breeding - only higher ranked Mambos are chosen, the number chosen depending on the generation gap.
(iii) Recombine - breed by combining the chromosomes of pairs of Mambos. The population is paired by rank, that is, the two best Mambos mate with each other, then the next best two, etc., taking part of the chromosome from each parent to create the baby Mambo. To avoid a strong bias towards a single parent, the recombination process uses a shuffle algorithm that mixes up the bits, breeds the parent Mambos by selecting a certain fraction of the bits from each one, and then unmixes the resulting child chromosome.
(iv) Apply mutation - each binary digit changes its value with some given probability.
(v) Evaluate offspring - back to the objective function.
(vi) Reinsert into population - baby Mambos grow up and join the reproductive circle.
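The steps above can be sketched as a compact loop in plain Python. This is not the toolbox code: selection is simple truncation by rank, the shuffle recombination is replaced by a uniform crossover (each bit taken from either parent with equal probability), and the names `run_ga`, `decode`, and `sphere` are our own.

```python
import random

def decode(chrom, nvar, preci, lo, hi):
    """Plain binary decoding of each preci-bit slice into [lo, hi]."""
    out = []
    for v in range(nvar):
        bits = chrom[v * preci:(v + 1) * preci]
        integer = int("".join(map(str, bits)), 2)
        out.append(lo + (hi - lo) * integer / (2 ** preci - 1))
    return out

def run_ga(objective, nind=20, maxgen=100, nvar=2, preci=30,
           ggap=0.9, mut=0.001, lo=-1.0, hi=1.0, seed=0):
    rng = random.Random(seed)
    length = nvar * preci

    def score(chrom):
        return objective(decode(chrom, nvar, preci, lo, hi))

    # (I) initialize a random population of bit lists
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(nind)]
    for _ in range(maxgen):
        # (i) rank: lowest objective value first (we minimize)
        pop.sort(key=score)
        # (ii) select the best ggap fraction as parents (rounded to an even count)
        nsel = max(2, int(ggap * nind) // 2 * 2)
        parents = pop[:nsel]
        # (iii) recombine neighbours in rank order with a uniform crossover
        #       (a stand-in for the toolbox's shuffle recombination)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            for _ in range(2):
                children.append([rng.choice(pair) for pair in zip(a, b)])
        # (iv) mutate: flip each bit with probability mut
        for child in children:
            for i in range(length):
                if rng.random() < mut:
                    child[i] ^= 1
        # (v)+(vi) reinsert: offspring replace the lowest-ranked individuals
        #          and are evaluated when the population is ranked again
        pop = pop[:nind - len(children)] + children
    pop.sort(key=score)
    best = pop[0]
    return decode(best, nvar, preci, lo, hi), score(best)

def sphere(xy):
    """Smooth bowl with its minimum at the origin."""
    return xy[0] ** 2 + xy[1] ** 2

point, value = run_ga(sphere)
print(point, value)
```

Because the few top-ranked individuals survive unchanged in each generation, the best objective value in this sketch never gets worse from one generation to the next.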

3 Objective Function - Different Landscape

NP's model develops an eye based on optimization of spatial resolution. This is a problem of many variables. One variable is the diameter of the aperture, which controls how much light is allowed into the eye. Another is the "posterior nodal distance", basically the depth of the eye, which affects the imaging capabilities. More complicated variables must be considered for the development of a lens. The goal of this paper is to demonstrate that even when considering just a simplified model of two variables, the deterministic approach is still fundamentally flawed. Furthermore, we have reason to believe that if an NP-type model falls short in a 2-dimensional simulation, then in a model with more than 2 variables the complexity of the landscape would only increase the need for a GA approach. A higher dimensional landscape tends to have more intricate peaks and valleys to consider. At a quantitative level, natural selection must be examined by means of probabilistic simulation in order to account for all of the different solutions that are possible.

¹ Thanks to Madeline for the creative name.

3.1 Test Function

To test the GA, we opted to use an objective function described by a superposition of scattered gaussian surfaces, with random sign and amplitude. This smooth surface with many local peaks and valleys is shown in figure 1. The global minimum is located in the center of the landscape - this is the optimal point, the eye that we are looking for.
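A test surface of this kind could be built as in the sketch below. The number of bumps, their width, the amplitude range, and the explicit deep well placed at the center are all our assumptions; the exact parameters behind figure 1 are not given in the text.

```python
import math
import random

def make_gaussian_landscape(n_bumps=30, width=0.15, seed=1):
    """Sum of Gaussians with random centres, signs and amplitudes on [-1, 1]^2."""
    rng = random.Random(seed)
    bumps = [(rng.uniform(-1, 1), rng.uniform(-1, 1),
              rng.choice((-1, 1)) * rng.uniform(0.2, 1.0))
             for _ in range(n_bumps)]
    # Assumption: one deep well at the centre is meant to act as the global minimum
    bumps.append((0.0, 0.0, -3.0))

    def f(x, y):
        return sum(a * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * width ** 2))
                   for cx, cy, a in bumps)
    return f

f = make_gaussian_landscape()
print(f(0.0, 0.0), f(0.9, 0.9))
```

For an unlucky seed, overlapping bumps could shift the lowest point away from the center, so the location of the global minimum should be checked numerically before using such a surface.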

Figure 2 demonstrates convergence of the GA to local (a) and global (b) minima, respectively, on the test landscape in figure 1. Note that in a single generation the Mambo population can make large jumps across the landscape. This happens partly because of mutation, and also because a Mambo need not mate with a similar Mambo - recall that individuals are paired by rank (or objective height) on the landscape rather than by proximity. This is analogous to the assumption that a Mambo will choose its mate based upon physical fitness rather than similarity.


Figure 1: GA Test Landscape: randomly distributed gaussian surfaces


Figure 2: Bird's-eye view of test GA: (a) converging to a local minimum (b) converging to a global minimum.


3.2 GA Landscape

To address the situation constructed by NP, we must similarly develop a landscape with a general downward trend towards the global minimum, but in our model include local minima as well. A one-dimensional illustration of such a landscape is shown in figure 3. Observe that in figure 3(a) the Mambo population converges upon the global minimum, while figure 3(b) shows that even with just a single variable the GA does not always converge to the very bottom.


Figure 3: One dimensional GA: (a) converging to global min. (b) converging to local min.
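To see why local minima matter for a deterministic model, consider a walker that always takes the locally best small step. The one-dimensional landscape below (a parabola with ripples) is a hypothetical stand-in, not the function used for figure 3.

```python
import math

def landscape(x):
    """Parabola with ripples: overall trend towards x = 0, plus local minima."""
    return x ** 2 + 0.1 * (1 - math.cos(8 * math.pi * x))

def greedy_descent(f, x, step=0.001, max_iter=100000):
    """Deterministic walker: step downhill until neither neighbour is lower."""
    for _ in range(max_iter):
        left, right = f(x - step), f(x + step)
        if min(left, right) >= f(x):
            break  # no lower neighbour: the walker sits in a (local) minimum
        x = x - step if left < right else x + step
    return x

x_end = greedy_descent(landscape, 0.9)
print(x_end, landscape(x_end))
```

Started at x = 0.9, the walker settles in the nearest valley, well short of the global minimum at x = 0 where the landscape value is zero. Escaping that valley requires uphill moves, which a purely greedy rule never makes.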


3.3 Extending to 2-dimensions

Keeping with the NP model, we must make sure that there exists at least one continuous path that is strictly decreasing towards the global minimum. The basic landscape we chose to fulfill this requirement is a paraboloid dotted with a 5x5 grid of smaller gaussian surfaces. The 25 local minima represent different eye types that might plausibly evolve. A lot of information about the various existing eye types can be found in reference [3]. Figure 4 below is an illustration of this surface, which we will refer to as the "eye landscape".


Figure 4: The eye landscape: (a) Paraboloid with a grid distribution of local wells, equal in depth. (b) Bird's-eye view.
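A surface with the same ingredients as the eye landscape can be written down directly. The well depth, width, and grid spacing below are illustrative guesses, since the text does not state the values used for figure 4, and this sketch does not verify the strictly decreasing path requirement.

```python
import math

def eye_landscape(x, y, depth=0.4, width=0.08, spacing=0.4):
    """Paraboloid on [-1, 1]^2 with a 5x5 grid of equally deep Gaussian wells."""
    value = x ** 2 + y ** 2
    for i in range(-2, 3):
        for j in range(-2, 3):
            cx, cy = i * spacing, j * spacing
            r2 = (x - cx) ** 2 + (y - cy) ** 2
            value -= depth * math.exp(-r2 / (2 * width ** 2))
    return value

# The central well is the global minimum; outer wells sit higher on the bowl
print(eye_landscape(0.0, 0.0), eye_landscape(0.4, 0.0), eye_landscape(0.8, 0.0))
```

Since all wells have equal depth, the paraboloid alone decides their ranking: the farther a well lies from the center, the higher its floor.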


The first step in our algorithm is the "feeder GA". If one starts with a randomly distributed population, it is likely that a Mambo is initially placed close enough to the global minimum that convergence is trivial. Thus we begin with a feeder landscape, a paraboloid centered at a corner of our GA landscape, in order to create an initial population of Mambos that is condensed in a small region. It is mainly due to mutation that the population then spreads out of its small initial region.

Figure 5: A feeder landscape gathers the initial population of Mambos into a small region
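The effect of the feeder landscape can be illustrated without running a full GA. In the sketch below the choice of corner, the domain [-1, 1]^2, and the shortcut of simply ranking random candidates (instead of evolving them) are all our assumptions.

```python
import random

def feeder_landscape(x, y, corner=(-1.0, -1.0)):
    """Paraboloid whose minimum sits at one corner of the [-1, 1]^2 domain."""
    return (x - corner[0]) ** 2 + (y - corner[1]) ** 2

rng = random.Random(0)  # fixed seed for repeatability
candidates = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]

# Crude stand-in for a feeder GA run: keep the 20 candidates ranked best
ranked = sorted(candidates, key=lambda p: feeder_landscape(*p))
initial_population = ranked[:20]

print(max(feeder_landscape(*p) for p in initial_population))
```

The retained points cluster near the chosen corner, far from the center of the eye landscape, so the main GA cannot succeed merely by a lucky initial placement.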


4 Results

To simulate a landscape analogous to NP's deterministic model, we first try out the GA on a paraboloidal surface with no local valleys. Given any initial population, and some finite (non-zero) probability of mutation, the GA always converges to the global minimum. In the analysis of our different GA experiments, we measure the convergence time, defined as the number of generations necessary for at least one Mambo to come within 1% of the global minimum. It is not necessary to wait for more individuals to converge to that same point because this is typically a very fast process once a single individual has achieved this optimal state. Our focus is then to compare the convergence time of the GA on the paraboloid to the convergence time on the eye landscape.
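The convergence time defined above can be computed from a record of the best objective value in each generation. The sketch interprets "within 1%" as 1% of the range of objective values above the global minimum; this interpretation, and the function and argument names, are our assumptions.

```python
def convergence_time(best_per_generation, global_min, value_range, tol=0.01):
    """First generation whose best objective value lies within tol * value_range
    of global_min, or None if that never happens."""
    threshold = global_min + tol * value_range
    for generation, best in enumerate(best_per_generation, start=1):
        if best <= threshold:
            return generation
    return None

# Made-up history of the best value per generation, for illustration only
history = [0.90, 0.40, 0.12, 0.015, 0.004, 0.004]
print(convergence_time(history, global_min=0.0, value_range=1.0))  # prints: 5
```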


Figure 6: Histograms representing the number of generations for the GA to converge to within 1% of the global minimum in 1,000 trial runs: (a) Paraboloid surface. (b) Eye landscape (observe that only 35% of runs converged).

Observe the GA run depicted by histogram (b) in figure 6. This data represents 1,000 GA runs with the following parameters:

NIND = 40
MAXGEN = 1000
NVAR = 2
PRECI = 30
GGAP = 0.90
MUT = 0.001

Approximately 35% of the GA runs on the eye landscape converged for this initial population of 40 Mambos. Note that 95% of the convergences occurred within 100 generations, and beyond this point convergence grows more and more scarce. There is no reason to suspect that this pattern of scarcity changes, but in order to make a conservative estimate of the lower bound for convergence time, we assign each GA run that did not converge (in 1,000 generations) a convergence time of exactly 1,000 generations.

Define Λ to be the ratio of the average convergence time on the eye landscape to the average convergence time on the smooth paraboloid. The 40-Mambo population depicted in the histograms in figure 6, even with our extremely conservative approach, yielded a ratio of Λ = 6.85. This means that given these initial parameters, NP's deterministic approximation for the evolution time of an eye is too short by a factor of at least 6.85.
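The ratio Λ, including the cap applied to runs that never converge, can be sketched as follows. The function names are ours and the sample numbers are made up for illustration; they are not the thesis data.

```python
def capped_mean(times, maxgen):
    """Average convergence time, counting non-converged runs (None) as maxgen."""
    return sum(maxgen if t is None else t for t in times) / len(times)

def lambda_ratio(eye_times, paraboloid_times, maxgen=1000):
    """Ratio of mean convergence times: eye landscape over smooth paraboloid."""
    return capped_mean(eye_times, maxgen) / capped_mean(paraboloid_times, maxgen)

# Made-up convergence times; None marks a run that did not converge
eye = [40, 75, None, None, 120, None]
paraboloid = [30, 35, 28, 41, 33, 37]
print(lambda_ratio(eye, paraboloid))
```

Because a non-converged run is counted as exactly MAXGEN generations although its true convergence time is at least that long, the resulting Λ is a lower bound.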

Experimenting with different initial populations, holding all other parameters constant, the GA produced the following results:

Figure 7: Λ ratio for different initial populations, given MAXGEN = 1000, NVAR = 2, PRECI = 30, GGAP = 0.90, MUT = 0.001


On average, our GA simulation produced a convergence ratio Λ = 5.41.

5 Concluding Remarks

Although the paraboloid landscape guarantees convergence, the GA is still a probabilistic algorithm and thus will not always converge quickly. As in evolution, the most efficient path is not necessarily the one taken. This fact suggests that our already conservative value of Λ = 5.41 would be even larger if compared with a truly deterministic algorithm such as the NP model. Even though their computation accounts to some extent for the average probability of evolutionary development over time, it fails to consider the countless different evolutionary paths, and instead chooses just one.

There remain still more questions about NP's deterministic model for eye development. David Berlinski, in his response to NP, explains eloquently that "improvement in visual acuity is no doubt a fine thing for an organism; but no form of biological change is without cost" [1]. He then gives the example that a slight improvement in spatial resolution, by changing the shape of an eye, might require an eye socket that significantly changes the structure of the skull. It is difficult to say which is the optimal evolutionary path. Our 2-variable simulation does not even consider such external difficulties. The more variables considered, the more evolutionary paths are available, and the less likely the convergence to any particular type of eye.

Given that our approach showed Nilsson and Pelger to be underestimating the evolution time of an eye by a factor of at least 5, it would not surprise the authors of this paper if a more realistic time were an order of magnitude higher still. Rather than 360 thousand generations, a reasonable lower bound should be at least $5 \times 360{,}000 = 1.8 \times 10^6$ generations, and if our previous speculations have merit, an order of magnitude higher would raise the estimate to around 18 million generations.

Future experiments that would be useful for improving the accuracy of our results might involve varying the mutation parameter and, most importantly, letting the algorithms run for longer, allowing the lower bound for convergence to be pushed even higher.

Many thanks to Ariel Balter, my unofficial mentor; Matt Shepherd, who guided me throughout the project; and of course to my advisor, Koby Rubinstein.


References

[1] David Berlinski. A scientific scandal. EBSCO, April 2003.

[2] Andrew Chipperfield et al. Genetic Algorithm Toolbox (for use with MATLAB). Department of Automatic Control and Systems Engineering, University of Sheffield, 1 edition.

[3] Richard Dawkins. Climbing Mount Improbable. Norton, 1996.

[4] John Holland. Adaptation in Natural and Artificial Systems. U. of Michigan Press, 1975.

[5] Takahisa Kobayashi and Donald L. Simon. A hybrid neural network-genetic algorithm technique for aircraft engine performance diagnostics. American Institute of Aeronautics and Astronautics, 2001.

[6] Dan-E. Nilsson and Susanne Pelger. A pessimistic estimate of the time required for an eye to evolve. Biological Sciences, 256(1345):53, 1994.
