Genetic Programming: An Introduction

The Lunacy of Evolving Computer Programs

Before we start, consider the general evolutionary algorithm:

Randomly create a population of solutions.

Evaluate each solution, giving each a score.

Pick the best and reproduce, mutate or crossover with other fit solutions to produce new solutions for the next generation.

The Lunacy of Evolving Computer Programs

Now consider what this means in the context of genetic programming:

Randomly create a population of programs.

Evaluate each program, giving each a score.

Pick the best and reproduce, mutate or crossover with other fit programs to produce new programs for the next generation.

The Lunacy of Evolving Computer Programs

A randomly generated C program:

#bjsieldi <dkjsldkfj.?+nit anim(tin x, rach*vrag[)}{nit x;rof (x = 10; : ) {touch ,? *wha”ts g01nG 0n?@; :]]

The Lunacy of Evolving Computer Programs

The argument against evolving programs

Randomly created programs have an infinitesimal chance of compiling, let alone doing what you want them to do.

Running a randomly created program will most likely produce array out-of-bounds errors, data-casting errors, core dumps and division-by-zero errors, and in general there is no way to tell in advance whether it will ever terminate (the halting problem).

Mutating and mixing segments of randomly created programs is as senseless as randomly creating them in the first place.

(How) does genetic programming get around this?

What makes GP different

                    Individual Representation         Individual Size (complexity)
GA (conventional)   Coded strings of numbers          Fixed-length strings
GP                  In general, LISP S-expressions    Variable

GP algorithm

Create a random population.

Evaluate the fitness function.

Apply the genetic operators (reproduction, crossover, mutation) probabilistically to obtain a new computer program.

Insert the new computer program into the new population.

The Genetic Programming Representation

The trick is to choose an underlying representation for programs such that the random creation, mutation and crossover of programs always yield a syntactically correct program.

The representation employed in genetic programming is a tree: this representation is natural for LISP programs and leads to elegant algorithms for creation, mutation and crossover.

Genetic Structure

Functions: can be conditional (if, then, etc.), sequential (+, -, etc.) or iterative (whileDo, etc.). These are the internal nodes of the tree.

Terminals: take no arguments and just return a value. These are the leaves of the tree.
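One way to make the function/terminal split concrete is to write a program tree as nested Python tuples and evaluate it recursively. This is a minimal sketch, not the representation used by any particular GP system; the names `FUNCTIONS` and `evaluate` and the tuple encoding are assumptions made for illustration.

```python
import operator

# Function set: name -> (arity, implementation). Names are illustrative.
FUNCTIONS = {
    "add": (2, operator.add),
    "sub": (2, operator.sub),
    "mul": (2, operator.mul),
}

def evaluate(tree, env):
    """Evaluate a tree: tuples are function nodes, anything else is a terminal."""
    if isinstance(tree, tuple):
        name, *args = tree
        arity, fn = FUNCTIONS[name]
        assert len(args) == arity, f"{name} expects {arity} arguments"
        return fn(*(evaluate(a, env) for a in args))
    # Terminal: a variable looked up in env, or a constant returned as-is.
    return env.get(tree, tree)

tree = ("add", ("mul", "X", "X"), 1)   # the expression X*X + 1
print(evaluate(tree, {"X": 3}))        # 10
```

Because every node is either a function with a known arity or a terminal, any tree built from these sets is a syntactically valid program.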

Evolving Trees

In fact, the representation is useful for the evolution of more than just LISP programs! The tree structures in a genetic programming population can be used to determine layouts for analogue electric circuits, create neural networks, parallelise computer programs and much, much more.

It’s a great representation because it can produce solutions of arbitrary size and complexity, as opposed to, for example, fixed-length genetic algorithms.

As we’ll be applying an evolutionary algorithm to this representation, we need to define creation, crossover and mutation operators.

Creation, Crossover and Mutation

The following shows how tree structures can be created, crossed and mutated.

Creation: randomly generate a tree using the functions and terminals provided

Crossover: pick crossover points in both parents and swap the subtrees. Even if the two parents are identical, the offspring will often be different, because the crossover point is chosen independently in each parent.

Mutation: pick a mutation point in one parent and replace its sub-tree with a randomly generated tree.
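The crossover and mutation operators can be sketched on trees written as nested tuples, where the first element of a tuple is the function name and the rest are its children. The helper names (`paths`, `get`, `replace`) and the uniform choice of crossover point are assumptions of this sketch; real GP systems often bias the choice towards internal nodes.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(mum, dad, rng):
    """Swap a randomly chosen subtree of mum with one of dad."""
    p = rng.choice(list(paths(mum)))
    q = rng.choice(list(paths(dad)))
    return replace(mum, p, get(dad, q)), replace(dad, q, get(mum, p))

def mutate(tree, rng, random_tree):
    """Replace a randomly chosen subtree with a freshly generated one."""
    point = rng.choice(list(paths(tree)))
    return replace(tree, point, random_tree(rng))

rng = random.Random(1)
parent = ("add", ("mul", "X", "X"), "X")
# Identical parents, but the two crossover points are drawn independently.
child_a, child_b = crossover(parent, parent, rng)
print(child_a, child_b)
```

Since whole subtrees are swapped, the offspring are always well-formed trees over the same function and terminal sets.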

Crossover

Mutation

Population Creation

When creating a population, it’d be nice to begin with many trees of different shapes and sizes. We can generate trees using the full or the grow method:

full - every path in the tree is the maximum length

grow - path lengths will vary up to the maximum length.

Typically, when a population is created, the “ramped half-and-half” technique is used. Trees of varying depths from the minimum to the maximum depth are created, and for each depth half are created using the full method and the other half using the grow method.
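A possible sketch of the full and grow methods and of ramped half-and-half follows. The function and terminal sets, the probability with which grow stops early, and the way depths are cycled are all assumptions chosen for illustration.

```python
import random

FUNCTIONS = {"add": 2, "sub": 2, "mul": 2}   # name -> arity (illustrative)
TERMINALS = ["X", 1]

def random_tree(depth, method, rng):
    """Build a tree with at most `depth` levels below the root."""
    if depth == 0:
        return rng.choice(TERMINALS)
    # grow may stop early by picking a terminal; full never does.
    stop = len(TERMINALS) / (len(TERMINALS) + len(FUNCTIONS))
    if method == "grow" and rng.random() < stop:
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    return (name,) + tuple(random_tree(depth - 1, method, rng)
                           for _ in range(FUNCTIONS[name]))

def ramped_half_and_half(pop_size, min_depth, max_depth, rng):
    """Cycle through the depths, alternating full and grow between cycles."""
    depths = range(min_depth, max_depth + 1)
    population = []
    for i in range(pop_size):
        depth = depths[i % len(depths)]
        method = "full" if (i // len(depths)) % 2 == 0 else "grow"
        population.append(random_tree(depth, method, rng))
    return population

def depth_of(tree):
    """Length of the longest path from the root to a leaf."""
    return 1 + max(depth_of(c) for c in tree[1:]) if isinstance(tree, tuple) else 0

rng = random.Random(0)
population = ramped_half_and_half(20, 2, 6, rng)
print(sorted(depth_of(t) for t in population))
```

With a population size that is a multiple of twice the number of depths, each depth receives exactly as many full trees as grow trees.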

Preparatory steps for GP

You’ve decided you want to use GP to solve a problem. To set up your GP runs, you need to do the following:

Determine the set of terminals (the leaves of your trees). In the programming context, these are usually variables, input values or action commands.

Determine the set of functions (the internal nodes of your trees).

Define the fitness measure.

Set the parameters for controlling the run: population size, maximum number of generations, and the mutation, crossover and reproduction rates (for example 1%, 90% and 9%).

Choose the method for terminating a run and designating a result.

Sufficiency & Closure

Function and terminal sets must satisfy the principles of closure and sufficiency:

Closure: every function f must be capable of accepting, as any of its arguments, the value of every terminal t from the terminal set and the return value of every function from the function set.

Sufficiency: A solution to the problem at hand must exist in the space of programs created from the function set and terminal set.

One way to satisfy closure is to make all terminals and functions return the same type (for example, integer). Another is to use strongly typed genetic programming, which ensures that all expressions are type-safe.
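Even when every node returns a number, division still breaks closure, because a zero denominator has no result. A common workaround is a "protected" division that returns a fixed value in that case. The sketch below returns 1, which is a widely used convention; the function name is an assumption.

```python
def protected_div(a, b):
    """Division that is defined for every pair of numeric inputs."""
    # Returning 1 for a zero denominator keeps the function total,
    # so any subtree can safely appear as the second argument.
    return a / b if b != 0 else 1.0

print(protected_div(6, 3))   # 2.0
print(protected_div(5, 0))   # 1.0
```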

Example: Symbolic Regression

Problem: Can GP evolve the function to fit the following data?

x    f(x)
0    0
1    4
2    30
3    120
4    340
5    780
6    1554
7    2800
8    4680

GP Symbolic Regression

Function Set: +, -, *, /

Terminal Set: X

Fitness Measure: use the absolute difference between the program's output and the target value f(x). Best normalized fitness is 0.

Parameters: Population Size = 500, Max Generations = 10, Crossover = 90%, Mutation = 1%, Reproduction = 9%. Selection is by Tournament Selection (size 5), Creation is performed using RAMP_HALF_AND_HALF.

Termination Condition: Program with fitness 0 found.
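The selection and operator settings listed above can be sketched as follows. Lower fitness is treated as better, matching the convention that the best fitness is 0; the function names are assumptions, and only standard-library calls are used.

```python
import random

def tournament(population, fitness, rng, size=5):
    """Return the fittest of `size` randomly drawn individuals (lower is better)."""
    return min(rng.sample(population, size), key=fitness)

def choose_operator(rng):
    """Pick a genetic operator with the 90% / 9% / 1% rates from the slide."""
    return rng.choices(["crossover", "reproduction", "mutation"],
                       weights=[90, 9, 1])[0]

rng = random.Random(0)
population = list(range(500))                   # stand-in individuals
winner = tournament(population, fitness=lambda v: v, rng=rng)
print(winner, choose_operator(rng))
```

A larger tournament size raises the selection pressure, since weak individuals are less likely to be the best of the sample.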

Results

The following zero-fitness individual was found after two generations:

(add (add (mul (mul X X) (mul X X)) (mul (mul X X) (- X))) (sub X (sub (sub (sub X X) (mul X X)) (mul (add X X) (mul X X)))))

which correctly captures the function:

f(x) = x^4 + x^3 + x^2 + x
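The claim can be checked mechanically by evaluating the evolved tree on the nine data points. In the sketch below the tree is transcribed as nested tuples, `neg` stands in for the unary minus written as (- X), and fitness is taken to be the sum of absolute errors over the data set, which is an assumption about how the per-point errors are combined.

```python
DATA = {0: 0, 1: 4, 2: 30, 3: 120, 4: 340, 5: 780, 6: 1554, 7: 2800, 8: 4680}

OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "neg": lambda a: -a,        # stands in for the unary (- X) in the result
}

def evaluate(tree, x):
    """Evaluate a nested-tuple expression at the given value of X."""
    if tree == "X":
        return x
    name, *args = tree
    return OPS[name](*(evaluate(a, x) for a in args))

def fitness(tree):
    """Sum of absolute errors over the data set; 0 is a perfect fit."""
    return sum(abs(evaluate(tree, x) - y) for x, y in DATA.items())

X = "X"
xx = ("mul", X, X)
individual = (
    "add",
    ("add", ("mul", xx, xx), ("mul", xx, ("neg", X))),
    ("sub", X, ("sub", ("sub", ("sub", X, X), xx), ("mul", ("add", X, X), xx))),
)

print(fitness(individual))  # 0
```

Expanding by hand gives the same answer: the first branch is x^4 - x^3 and the second is x + x^2 + 2x^3, which sum to x^4 + x^3 + x^2 + x.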

Santa Fe Trail

In the Santa Fe Trail, an ant must eat all the items of food on a trail. The ant can only turn left, turn right or move forward, and can only sense what is directly in front of it.

GP Santa Fe

Function Set: Prog2, Prog3, IfFoodAhead

Terminal Set: TurnLeft, TurnRight, MoveForward

Fitness Measure: count the number of items of food eaten after a fixed number of moves, and subtract it from 89. Worst fitness = 89, best fitness = 0.

Parameters: Population Size = 500, Max Generations = 50, Crossover = 90%, Mutation = 1%, Reproduction = 9%. Selection is by Tournament Selection (size 5), Creation is performed using RAMP_HALF_AND_HALF.

Termination Condition: Program with fitness 0 found.
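To show how such a program is executed, here is a small interpreter sketch for the three functions and three terminals. The grid size, the five-item toy trail, the move limit and the wrap-around edges are assumptions made for the example; the real trail has 89 items, so the fitness here subtracts from the toy trail's size instead.

```python
# Headings in clockwise order: east, south, west, north as (dx, dy).
HEADINGS = [(1, 0), (0, 1), (-1, 0), (0, -1)]

class Ant:
    def __init__(self, food, size, max_moves):
        self.food, self.size, self.max_moves = set(food), size, max_moves
        self.x = self.y = self.heading = 0      # start top-left, facing east
        self.moves = self.eaten = 0

    def ahead(self):
        """Cell directly in front of the ant (the grid wraps around)."""
        dx, dy = HEADINGS[self.heading]
        return (self.x + dx) % self.size, (self.y + dy) % self.size

    def act(self, action):
        if self.moves >= self.max_moves:
            return                               # out of moves: ignore the action
        self.moves += 1
        if action == "TurnLeft":
            self.heading = (self.heading - 1) % 4
        elif action == "TurnRight":
            self.heading = (self.heading + 1) % 4
        else:                                    # MoveForward
            self.x, self.y = self.ahead()
            if (self.x, self.y) in self.food:
                self.food.remove((self.x, self.y))
                self.eaten += 1

def run(tree, ant):
    """Execute one pass of the program tree."""
    if isinstance(tree, str):
        ant.act(tree)
    elif tree[0] == "IfFoodAhead":
        run(tree[1] if ant.ahead() in ant.food else tree[2], ant)
    else:                                        # Prog2 / Prog3: children in order
        for child in tree[1:]:
            run(child, ant)

def fitness(tree, food, size=8, max_moves=100):
    """Items left uneaten when the moves run out; 0 means all were eaten."""
    ant = Ant(food, size, max_moves)
    total = len(ant.food)
    while ant.moves < ant.max_moves and ant.food:
        run(tree, ant)                           # the program is re-run until done
    return total - ant.eaten

TRAIL = [(1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]  # toy trail with one right turn
print(fitness(("IfFoodAhead", "MoveForward", "TurnRight"), TRAIL))  # 0
print(fitness("MoveForward", TRAIL))                                # 2
```

The second program only ever walks along the top row, so it eats three items and leaves the two that lie after the turn.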

Some programs

Prog2(TurnRight)(TurnLeft)

Prog2(MoveForward)(MoveForward)

Result

Here’s how one agent fared:

Agent

(Prog3 (IfFoodAhead (IfFoodAhead (IfFoodAhead (IfFoodAhead (Prog2 MoveForward MoveForward) TurnLeft) TurnLeft) TurnLeft) (IfFoodAhead MoveForward (IfFoodAhead MoveForward (IfFoodAhead (IfFoodAhead (Prog2 MoveForward MoveForward) TurnLeft) TurnLeft)))) TurnLeft (Prog3 (IfFoodAhead (IfFoodAhead MoveForward TurnLeft) TurnRight) MoveForward TurnRight))

Smaller agents can be found!

Robot Wall-Following with GP (Koza, 1993)

Given: Odd-shaped room with robot in center.

Find: A control strategy for the robot that makes it move along the periphery.

GP Primitives:

Terminals: S0, S1..S11 (12 sensor readings, distance to wall).

Functions: IFLTE (if less than or equal), PROGN2, MF, MB (move forward/back), TL, TR (turn left/right).

Fitness Function: Fitness = number of peripheral cells visited.

Sample Individual/Strategy:

(IFLTE S3 S7 (MF) (PROGN2 (MB) (IFLTE S4 S9 (TL) (PROGN2 (MB) (TL)))))
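A strategy of this form can be read as "compare two sensors, then act". The sketch below interprets such a tree for a single set of sensor readings and records the motor commands it issues. It makes several assumptions: the tree is written as nested tuples, the sequencing function is spelled PROGN2 throughout, every motion command is written with parentheses, and motion commands return 0.0 as a placeholder value. No robot or room is simulated.

```python
def run(tree, sensors, actions):
    """Execute a strategy tree, appending the motor commands it issues."""
    if isinstance(tree, str):                 # sensor terminal, e.g. "S3"
        return sensors[tree]
    op, *args = tree
    if op == "IFLTE":                         # (IFLTE a b then else)
        a = run(args[0], sensors, actions)
        b = run(args[1], sensors, actions)
        return run(args[2] if a <= b else args[3], sensors, actions)
    if op == "PROGN2":                        # run both, return the second value
        run(args[0], sensors, actions)
        return run(args[1], sensors, actions)
    actions.append(op)                        # MF, MB, TL or TR
    return 0.0                                # placeholder return value (assumption)

STRATEGY = ("IFLTE", "S3", "S7", ("MF",),
            ("PROGN2", ("MB",),
             ("IFLTE", "S4", "S9", ("TL",),
              ("PROGN2", ("MB",), ("TL",)))))

issued = []
run(STRATEGY, {"S3": 1.0, "S7": 2.0, "S4": 0.0, "S9": 0.0}, issued)
print(issued)   # ['MF']
```

When S3 exceeds S7 the else-branch runs instead, backing up once and then either turning left or backing up again before turning, depending on S4 and S9.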

Wall-Following Evolution

Fitness function

The fitness function is based on executing the evolved programs on one or more prescribed test suites.

The test suites can be devised in the same way as those used when testing traditional manually produced programs.

Program size can also be included as part of the fitness, so that smaller programs are preferred.
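One simple way to fold program size into fitness is to add a small penalty per tree node to an error that is being minimised. The sketch below assumes trees written as nested tuples; the names and the weight of 0.01 are hypothetical and would need tuning for a real problem.

```python
def size(tree):
    """Number of nodes in a tree written as nested tuples."""
    return 1 + sum(size(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def penalised_fitness(error, tree, weight=0.01):
    """Lower is better: raw error plus a small charge per node."""
    return error + weight * size(tree)

small = ("add", "X", "X")
big = ("add", ("add", "X", "X"), ("sub", "X", "X"))   # same value, more nodes
print(penalised_fitness(0.0, small) < penalised_fitness(0.0, big))  # True
```

If the weight is too large, evolution may favour tiny programs that ignore the test data, so the penalty is usually kept small relative to the error.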

Fitness function

Error-based

– Fitness inversely proportional to total error on the test data.

– E.g. symbolic regression, classification, image compression, multiplexer design…

Cost-based

– Fitness inversely proportional to use of resources (e.g. time, space, money, materials, tree nodes)

– E.g. truck-backing, broom-balancing, energy network design…

Fitness function

Benefit-based

– Fitness proportional to accrued resources or other benefits.

– E.g. foraging, investment strategies

Parsimony-based

– Fitness partly proportional to the simplicity of the phenotypes.

– E.g. sorting algorithms, data compression…

Entropy-based

– Fitness directly or inversely proportional to the statistical entropy of a set of collections

– E.g. Random sequence generators, clustering algorithms, decision trees.

… Designer GP

In recent times, the tree-representation employed by GP has been used for automatic design of electrical circuits.

The tree is no longer a “program”, but should be considered a “program that builds circuits”.

The idea of building graph structures using commands embedded in a tree was developed by Frederic Gruau. He used it to evolve neural networks; Koza et al. now use it to evolve electric circuits.

The node functions Par (P) and Seq (S) change the topology of the graph. Other functions and terminals modify the values at the nodes. Everything begins with one embryonic cell with a pointer to the head of the tree.

Cellular Encoding

[Slides 31 to 42 carry only this title; their figures are not reproduced.]

So you want to use GP...

Genetic programming, at its heart, is the evolution of tree structures that can be interpreted as programs. Use GP to

solve problems where the solutions are naturally expressed as tree structures.

evolve LISP programs to solve a problem

evolve solutions in an indirect manner, by using the GP trees to build solutions to problems.

Your approach will be to determine the functions & terminals that constitute your trees, and how to interpret the resulting trees as solutions to your problem.