Neural Heuristics For Problem Solving: Using ANNs to Develop Heuristics for the 8-Puzzle by Bambridge E. Peterson


What is a problem? (informal)

• question to be answered

• paradox to be resolved

• obstacle to be overcome

• goal to be achieved

• crisis to be averted

• challenge to be met

What is a problem? (formal)

Formulate problem as a graph search

1. Initial state (question), goal state (answer)

2. Actions - allowable actions for a given state

3. Transition function - T(S, A) - given a state S and action A, return the resulting state S’ when A is performed in S

4. Goal test - function to test whether we’ve reached the goal

5. Path-cost function - keeps track of the path cost

(from Artificial Intelligence: A Modern Approach, 3rd Edition, by Russell and Norvig)
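The five components above can be sketched as a small Python container. This is only an illustration: `SearchProblem`, `path_cost`, and the toy integer-walk problem are hypothetical names, not part of the original project.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class SearchProblem:
    """Hypothetical container for the five parts of a search problem."""
    initial_state: State
    actions: Callable[[State], Iterable[Action]]         # allowable actions in a state
    transition: Callable[[State, Action], State]         # T(S, A) -> S'
    goal_test: Callable[[State], bool]                   # have we reached the goal?
    step_cost: Callable[[State, Action, State], float]   # cost of one step


def path_cost(problem: SearchProblem, plan: Sequence[Action]) -> Tuple[State, float]:
    """Apply a sequence of actions and accumulate the path cost."""
    state, total = problem.initial_state, 0.0
    for action in plan:
        nxt = problem.transition(state, action)
        total += problem.step_cost(state, action, nxt)
        state = nxt
    return state, total


# Toy problem (an assumption for illustration): walk from 0 to 3 on the integers.
toy = SearchProblem(
    initial_state=0,
    actions=lambda s: ("inc", "dec"),
    transition=lambda s, a: s + 1 if a == "inc" else s - 1,
    goal_test=lambda s: s == 3,
    step_cost=lambda s, a, s2: 1.0,
)

final_state, cost = path_cost(toy, ["inc", "inc", "inc"])
```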

Graph Search

[Figure: example graph with start node S, goal node G, and weighted edges]

Idea:

• Use an explored set to keep track of expanded nodes

• Use a frontier to store successor nodes still to be expanded

• Many search algorithms differ in how they store nodes in the frontier
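A minimal sketch of the explored-set and frontier idea, assuming a FIFO frontier (which gives breadth-first search). The function name `graph_search` and the small `GRAPH` dictionary are made up for illustration and are not the graph from the slide.

```python
from collections import deque


def graph_search(start, goal, neighbors):
    """Breadth-first graph search using a frontier and an explored set."""
    frontier = deque([start])   # successor nodes still to be expanded
    parent = {start: None}      # every node seen so far, with its predecessor
    explored = set()            # nodes that have already been expanded
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:   # walk predecessors back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        explored.add(node)
        for nxt in neighbors(node):
            if nxt not in explored and nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None                 # goal not reachable


# Hypothetical example graph (adjacency lists).
GRAPH = {"S": ["A", "B"], "A": ["G"], "B": ["A"], "G": []}
path = graph_search("S", "G", GRAPH.__getitem__)
```

Swapping the `deque` for a stack or a priority queue changes the search strategy without changing the rest of the loop.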

Graph Search

[Figure: example graph with start node S, goal node G, and weighted edges]

Some Examples:

• Breadth-first search

• Depth-first search

• Iterative-deepening

• Uniform cost

• Greedy best-first

• A*

• Iterative-deepening A*

Graph Search

[Figure: example graph with start node S, goal node G, and weighted edges]

A* search

• Order the priority queue using the cost function f(n) = g(n) + h(n)

• g(n) is the path cost to reach node n

• h(n) is the heuristic function: the estimated distance from n to the goal

• A* is optimal if h(n) is admissible and consistent
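A compact sketch of A* with a priority queue ordered by f(n) = g(n) + h(n). The weighted graph and the heuristic table `H` are hypothetical values chosen so that the heuristic is consistent; they are not taken from the slide's figure.

```python
import heapq
from itertools import count


def a_star(start, goal, neighbors, h):
    """Return (cost, path) of a cheapest path, or None if there is none."""
    tie = count()                          # tie-breaker so nodes are never compared
    frontier = [(h(start), next(tie), 0, start, None)]
    parent = {}                            # expanded nodes -> predecessor
    best_g = {start: 0}                    # cheapest known g(n) per node
    while frontier:
        _, _, g, node, par = heapq.heappop(frontier)
        if node in parent:
            continue                       # already expanded via a cheaper entry
        parent[node] = par
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        for nxt, step in neighbors(node):
            g2 = g + step
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), next(tie), g2, nxt, node))
    return None


# Hypothetical weighted graph: node -> [(successor, edge cost), ...]
GRAPH = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 12)],
    "B": [("G", 3)],
    "G": [],
}
H = {"S": 5, "A": 4, "B": 3, "G": 0}       # assumed consistent estimates

result = a_star("S", "G", GRAPH.__getitem__, H.__getitem__)
```

With h(n) = 0 for every node the same code behaves like uniform-cost search.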

Heuristics in Graph Search

• What is a heuristic?

o A general rule of thumb for solving a problem, usually developed through experience

• What is an admissible heuristic?

o A heuristic that never overestimates the path cost to the goal

• What is a consistent heuristic?

o One whose estimate never drops by more than the step cost: h(n) ≤ c(n, a, n’) + h(n’), so f(n) never decreases along a path (monotone)

• Why use heuristics?

o Brute-force search is slow when the state space is large

o A good heuristic reduces the number of nodes that must be explored

N-Puzzle

• n = i² - 1 for an integer i ≥ 2 (8-puzzle: i = 3, 15-puzzle: i = 4)

• sliding block puzzle on an i x i grid

• n tiles, 1 ‘blank space’

• start in random state

• can move one tile at a time

• exchange places with the ‘blank’ space

• can only move up, down, left, right

• 8-puzzle example (right)

• goal state is numbers 1 through n in order, left to right, top to bottom

N-Puzzle Heuristics

• 8-puzzle: 9!/2 = 181,440 reachable states

• 15-puzzle: 16!/2, approximately 10.5 trillion (1.05 x 10^13) states

• 24-puzzle: 25!/2, approximately 7.76 x 10^24 states

o Have fun with brute-force search in this state space

Why use heuristics? The N-puzzle is a good example

Something more ‘clever’ than a brute-force approach is needed...
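The state counts above follow directly from the factorial formula. A quick sketch (the function name `reachable_states` is a made-up helper):

```python
from math import factorial


def reachable_states(side: int) -> int:
    """Number of reachable states of the sliding puzzle on a side x side grid.

    All (side*side)! arrangements split into two halves of equal size,
    and only one half can be reached from the goal state.
    """
    return factorial(side * side) // 2


for side in (3, 4, 5):
    n = side * side - 1
    print(f"{n}-puzzle: {reachable_states(side):.3e} states")
```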

Manhattan Distance - the sum, over all tiles, of the city-block distance between each tile’s current position and its position in the goal state

Misplaced tiles - the total number of tiles not in their goal-state position
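Both heuristics can be sketched in a few lines of Python. The sketch assumes the state is a flat, row-major tuple with the blank written as side x side (9 for the 8-puzzle) and the goal (1, 2, ..., 9); the function names are illustrative.

```python
def manhattan_distance(state, side=3):
    """Sum of city-block distances of all tiles from their goal positions."""
    blank = side * side
    total = 0
    for idx, tile in enumerate(state):
        if tile == blank:
            continue                      # the blank is not counted
        goal_idx = tile - 1               # tile t belongs at index t - 1
        total += abs(idx // side - goal_idx // side)   # row distance
        total += abs(idx % side - goal_idx % side)     # column distance
    return total


def misplaced_tiles(state, side=3):
    """Number of tiles (blank excluded) that are not in their goal position."""
    blank = side * side
    return sum(1 for idx, tile in enumerate(state)
               if tile != blank and tile != idx + 1)


example = (8, 7, 1, 2, 9, 6, 3, 4, 5)
print(manhattan_distance(example), misplaced_tiles(example))
```

Both functions return 0 for the goal state and never overestimate the number of moves, since each move changes the position of exactly one tile by one cell.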

Symbolic vs. Subsymbolic

Symbolic

1. Symbols plus rules for their arrangement in space and transformation in time (syntax): a general definition of language

2. Infinitely many meaningful arrangements can be generated from a finite set of symbols

3. Natural languages

4. Formal languages

Manhattan Distance is a symbolic heuristic

Subsymbolic

1. Connectionist

2. Parallel distributed processing

3. Simultaneous processing among multiple parallel channels

Can we use machine learning to develop heuristics?

Subsymbolic heuristics aka “Neural Heuristics”...

So the goal is to develop a ‘better’ heuristic for the 8-puzzle...

Generating Training Data

• Generated 20,000 solved instances of the 8-puzzle, using Python to generate states and solve them with the A* algorithm

• Stored the instances in MongoDB as well as in a .txt file for processing in Octave

• Note: a puzzle can be represented internally as a vector, e.g. (3, 8, 2, 4, 5, 6, 1, 7, 9), using 9 to represent the blank space. Only certain operations are legal: the blank can swap places with an adjacent tile (up, down, left, right)
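A sketch of how the legal operations on this vector representation might look. The function name `successors` and the up/down/left/right ordering are assumptions; the original generator code is not shown in the slides.

```python
def successors(state, side=3):
    """Return the states reachable in one move from a flat, row-major state.

    The blank is written as side * side (9 for the 8-puzzle) and swaps
    places with the tile above, below, left, or right of it.
    """
    blank = side * side
    i = state.index(blank)
    row, col = divmod(i, side)
    result = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < side and 0 <= c < side:   # stay on the grid
            j = r * side + c
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]   # slide the neighbouring tile
            result.append(tuple(nxt))
    return result


moves = successors((3, 8, 2, 4, 5, 6, 1, 7, 9))
```

A blank in a corner yields two successors, on an edge three, and in the centre four.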

Training Data Fields (with example)

1. State n: 8, 7, 1, 2, 9, 6, 3, 4, 5

2. # states explored: 1571

3. # nodes added to frontier: 2448

4. MD heuristic: 18

5. Path-cost: 24

6. Time (on my machine): 94928 microseconds

General statistics

             MIN    MAX      MEAN       STD
Path-cost    45     31       22.1       3.38
MD           4      22       14.01      2.88
Frontier     9      21518    1572.35    1791.61
Explored     5      14760    1007.89    1172.33
Time (µs)    235    989591   54630.73   80138.67

Neural Heuristics

The idea...

• Train various MLP networks with backpropagation

• Goal is approximation (regression)

• Train the network with different targets:

o the optimal solution cost

o the difference between the optimal solution cost and the Manhattan distance of the state

o perhaps another...

Neural Network Input

• The 9-element input state S was transformed into an 81-element vector of 1’s and 0’s: bit 9 x k + t (positions k counted from 0, bits counted from 1) equals 1 if and only if S[k] = t

Example with 3 elements (bit 3 x k + t): [2, 1, 3] = [0 1 0 1 0 0 0 0 1]

Example: [3, 2, 1] = [0 0 1 0 1 0 1 0 0]

Tried this because of the following paper:

Likely-Admissible and Sub-Symbolic Heuristics (Ernandes and Gori, ECAI 2004)

http://www.dii.unisi.it/~ernandes/my_works/ernandes_gori_ECAI04.pdf
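The bit-vector encoding can be sketched as follows. The function name `encode_state` is illustrative, and Python's 0-based list indexing turns "bit n x k + t" into index n x k + (t - 1).

```python
def encode_state(state):
    """One-hot encode a puzzle state: one group of n bits per position.

    For a state of length n, index n * k + (t - 1) is 1 exactly when
    position k (0-based) holds the value t (1-based).
    """
    n = len(state)
    bits = [0] * (n * n)
    for k, t in enumerate(state):
        bits[n * k + (t - 1)] = 1
    return bits


print(encode_state([2, 1, 3]))
print(encode_state([3, 2, 1]))
```

For an 8-puzzle state the result has 81 entries, exactly 9 of which are 1.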

Neural Networks (cont.)

• # hidden units (one hidden layer) - 5, 10 and 15

• learning rate set at 0.1

• momentum 0.8

• Number of epochs 500-1000, 64 samples an epoch

• used tanh activation function for the hidden layer

• sigmoid activation function on output
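A rough NumPy sketch of the forward pass of such a network, assuming a single hidden layer (81-h-1) with tanh hidden units and a sigmoid output. The slides do not say how the regression target was mapped onto the sigmoid's (0, 1) range, so multiplying the output by 31 (the largest path cost in the training data) is purely an assumption, as are the function names and the random initial weights.

```python
import numpy as np


def init_mlp(n_in=81, n_hidden=10, seed=0):
    """Create small random weights for an n_in - n_hidden - 1 network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=n_hidden),
        "b2": 0.0,
    }


def forward(params, x, scale=31.0):
    """Heuristic estimate for one encoded state x (length n_in)."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])   # tanh hidden layer
    z = params["W2"] @ hidden + params["b2"]
    return scale / (1.0 + np.exp(-z))                   # scaled sigmoid output (assumed scaling)


params = init_mlp(n_hidden=10)
x = np.zeros(81)
x[[7, 15, 18]] = 1.0            # arbitrary active bits, for illustration only
estimate = forward(params, x)
```

Training with backpropagation, the learning rate, and momentum are not shown here; the sketch only illustrates how a trained set of weights would be evaluated inside A*.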

Neural Networks (cont.)

• 13,000 samples used for training set

• 2,000 samples for tuning

• 3,000 for testing the results of the trained MLP

• 3,000 for ‘official’ testing in Python using A*

• saved weights in a .txt file

• tested in Python using NumPy

Preliminary Results

A bit disappointing so far...

For the 3,000 remaining testing samples, I compared the stats between the Manhattan distance heuristic and various neural heuristics developed in training

Heuristic    MIN     MAX     MEAN    STD
MD           4       22      13.96   2.88
h*           15.84   27.15   22.49   1.88
h*_md        8.45    20.81   13.98   2.20
h*_md_avg    16.09   21.66   18.9    0.944

h* - heuristic developed with the optimal path cost as target

h*_md - heuristic developed with the optimal path cost minus the Manhattan distance as target

h*_md_avg - mean of the two heuristics above

Preliminary Results

A bit disappointing so far...

Examples...

Using the MD heuristic, it takes less than 1 second to solve 10 n-puzzle examples. The average number of nodes explored for these examples is 963, with 1,508 nodes added to the frontier

For the same puzzles, using h*, it took over 2 minutes to solve them, with an average of 27,000 nodes explored and 40,000 added to the frontier

Something isn’t right here...

Next Up

• Double check code for errors

• Try 9-h-1 topology, using just the state input without transformation into bit vector

• SVM - give Support Vector Machine a crack at it

• Discuss with Professor Hu

• Still a week left!

Questions?
