Sudoku Problem Solving using Backtracking, Constraint Propagation, Stochastic Hill Climbing and Artificial Bee Colony Algorithms-METU 2013

Abstract— In this paper, the constraint satisfaction problem

called Sudoku is studied using several algorithms. First,
Backtracking and Constraint Propagation algorithms are used to
solve the problem. However, because the generalized Sudoku
problem is NP-complete, these algorithms become impractical as
the puzzle grows beyond 9x9. Therefore, Artificial Bee Colony
(ABC) and Stochastic Hill Climbing optimization algorithms are
proposed to search for solutions to Sudoku problems that can be
larger than 9x9. We studied puzzles of different difficulty
levels. The time and memory performance of each method has been
measured in simulation and compared in this study.

I. INTRODUCTION

Sudoku is a logic-based combinatorial puzzle. The
objective is to fill a 9×9 grid with digits so that each
column, each row, and each of the nine 3×3 sub-grids that
compose the grid (also called "boxes", "blocks", "regions",
or "sub-squares") contains all of the digits from 1 to 9 with
no repetition.

Millions of people around the world are tackling one of the

hardest problems in computer science—without even

knowing it. The logic game Sudoku is a miniature version

of a longstanding mathematical challenge, and it entices

both puzzlers, who see it as an enjoyable plaything, and

researchers, who see it as a laboratory for algorithm design.

A hard puzzle requires complex pattern recognition skills;

for instance, if a player computes all possible digits for

each cell in a sub-grid and notices that two cells have

exactly the same two choices, those two digits can be

eliminated from all other cells in the sub-grid. No matter
the difficulty level, however, a dedicated puzzler can
eventually crack a 9-by-9 Sudoku game.

Afşar Saranli is an assistant professor at the Department of Electrical and
Electronics Engineering, Middle East Technical University, Turkey
(e-mail: [email protected]).
Ali Fatih Gunduz is a graduate student in the Department of Electrical and
Electronics Engineering, Middle East Technical University, Turkey.
Larasmoyo Nugroho is a graduate student in the Department of Aerospace
Engineering, Middle East Technical University, Turkey
(e-mail: [email protected]).

A computer can solve a 9-by-9 Sudoku within a second using
the same logical tricks that humans use, only much faster. On
a larger scale, however, such shortcuts are not powerful
enough, and checking the explosive number of combinations
becomes infeasible, even for the world's fastest computers. No
known algorithm is guaranteed to find a solution without
trying out a huge number of combinations. This places the
generalized Sudoku problem in an infamously difficult class,
called NP-complete, that includes problems of great practical
importance, such as scheduling, network routing, and gene
sequencing.

Even a 9x9 Sudoku grid admits a very large number of
different combinations. According to Felgenhauer and
Jarvis, the number of possible 9x9 Sudoku grids is
N = 6670903752021072936960, which is approximately
6.671×10^21.

Various algorithms have been applied to solve this
combinatorial problem. In this study, we try two different
approaches and compare them: a search-based approach built on
the backtracking algorithm, and an optimization approach built
on the Artificial Bee Colony (ABC) algorithm.
In the terminology of the textbook Artificial Intelligence: A
Modern Approach by S. Russell and P. Norvig, the task
environment is Discrete, Fully Observable, Static,
Deterministic, Sequential, and Single- or Multi-agent.

Brief Information About Sudoku

The game in its current form was invented by American

Howard Garns in 1979 and published by Dell Magazines as

"Numbers in Place." In 1984, Maki Kaji of Japan published

it in the magazine of his puzzle company Nikoli. He gave

the game its modern name of Sudoku, which means "Single
Numbers" in Japanese. The puzzle became popular in Japan


and was discovered there by New Zealander Wayne Gould,

who then wrote a computer program that would generate

Sudokus. He was able to get some puzzles printed in the

London newspaper “The Times” beginning in 2004. Soon

after, Sudoku-fever swept England. The puzzle finally

became popular in the U.S. in 2005. It has become a regular

feature in many newspapers and magazines and is enjoyed

by people all over the globe.

II. PROBLEM DEFINITION

A. Constraints of Sudoku

For a 9x9 Sudoku grid, a solution must satisfy three constraints:
1. Each row of cells contains the digits 1-9 exactly once.
2. Each column of cells contains the digits 1-9 exactly once.
3. Each 3x3 block contains the digits 1-9 exactly once.
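The three constraints can be checked mechanically. The following is a minimal Python sketch, not the authors' Java implementation; the function names `units` and `is_solved` and the parameter `n` (block width, 3 for a 9x9 grid) are assumptions made for illustration.

```python
def units(n=3):
    """Return every row, column and block of an (n*n) x (n*n) grid
    as a list of (row, col) coordinates."""
    size = n * n
    rows = [[(r, c) for c in range(size)] for r in range(size)]
    cols = [[(r, c) for r in range(size)] for c in range(size)]
    blocks = [[(br * n + r, bc * n + c) for r in range(n) for c in range(n)]
              for br in range(n) for bc in range(n)]
    return rows + cols + blocks


def is_solved(grid, n=3):
    """True if each row, column and block holds the digits 1..n*n exactly once."""
    digits = set(range(1, n * n + 1))
    # A unit has n*n cells, so equality with the digit set means "exactly once".
    return all({grid[r][c] for r, c in unit} == digits for unit in units(n))
```

For a 9x9 grid there are 27 units (9 rows, 9 columns, 9 blocks), and a completed grid is a solution only if `is_solved` holds for all of them.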

B. Solving Strategy

There exist many clever Sudoku solving strategies, but the
most basic one is to first write down, in each empty cell, all
entries that do not contradict any constraint with respect to
that cell. If a cell ends up having only one possible
entry, it is a "forced" entry that should be filled in.
Otherwise, the possible entries are tried successively
until an accepted solution appears. Our first approach
implements this strategy: make a permissible choice, see
where it leads, and if it fails, undo it and try the next
candidate.
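As an illustration of the "write down all possible entries" step, the sketch below computes the candidate digits of an empty cell and collects the forced entries. It is a hedged Python example under the assumption that the grid is a 9x9 list of lists with 0 marking an empty cell; `candidates` and `forced_entries` are hypothetical names, not taken from the paper.

```python
def candidates(grid, row, col):
    """Digits that can be placed in the empty cell (row, col) without
    repeating a digit already present in its row, column or 3x3 block."""
    used = set(grid[row])                                  # same row
    used |= {grid[r][col] for r in range(9)}               # same column
    br, bc = 3 * (row // 3), 3 * (col // 3)                # top-left of block
    used |= {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def forced_entries(grid):
    """Map every empty cell that has exactly one candidate to that digit."""
    forced = {}
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                options = candidates(grid, r, c)
                if len(options) == 1:
                    forced[(r, c)] = next(iter(options))
    return forced
```

Cells that are not forced are the ones for which several entries have to be tried in turn.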

The second strategy is to use an optimization algorithm.
For this purpose we implement the Artificial Bee Colony (ABC)
algorithm. ABC is a relatively recent optimization algorithm,
proposed by Dervis Karaboga in 2005 and motivated by the
intelligent foraging behavior of honey bees. There are three
kinds of bees, each responsible for a different task:
Employed Bees, Onlooker Bees and Scout Bees. The approach
relies on a fixed-size population whose members make random
selections to solve the Sudoku problem for a user-defined
number of iterations.
In a later revision of this document, we also studied the
Constraint Propagation and Stochastic Hill Climbing algorithms
and included them as well.
All four algorithms and their performance are described in
detail in the following sections.

Figure 1. A sample Sudoku problem and its solution. Black
digits are inputs and red digits are correctly inserted numbers.

III. BACKTRACKING SEARCH

A. Pseudocode

Procedure SOLVESUDOKU(row,col)
Begin
    if row > 8
        return true
    if values(row,col) is set
        return NEXT(row,col)
    else if values(row,col) is zero
        for i=1 to 9
            if it does not violate constraints to put i to values(row,col)
                values(row,col)=i
                if NEXT(row,col)==true
                    return true
        values(row,col)=0
        return false
End

Procedure NEXT(row,col)
Begin
    if col==8
        return SOLVESUDOKU(row+1,0)
    else
        return SOLVESUDOKU(row, col+1)
End
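The pseudocode above translates almost line by line into a runnable form. The following Python sketch is one possible rendering, not the authors' Java code; the helper `is_allowed` is an assumed name for the constraint test that the pseudocode leaves abstract, and the grid is assumed to be a 9x9 list of lists with 0 for empty cells.

```python
def is_allowed(values, row, col, digit):
    """True if digit does not yet appear in the row, column or 3x3 block."""
    if digit in values[row]:
        return False
    if any(values[r][col] == digit for r in range(9)):
        return False
    br, bc = 3 * (row // 3), 3 * (col // 3)
    return all(values[r][c] != digit
               for r in range(br, br + 3) for c in range(bc, bc + 3))


def solve_sudoku(values, row=0, col=0):
    """Fill values in place; return True if a complete solution was found."""
    if row > 8:                      # moved past the last row: grid is complete
        return True
    if values[row][col] != 0:        # given (or already set) cell: skip it
        return next_cell(values, row, col)
    for digit in range(1, 10):
        if is_allowed(values, row, col, digit):
            values[row][col] = digit
            if next_cell(values, row, col):
                return True
    values[row][col] = 0             # no digit worked: undo and backtrack
    return False


def next_cell(values, row, col):
    """Advance in row-major order, mirroring the NEXT procedure."""
    if col == 8:
        return solve_sudoku(values, row + 1, 0)
    return solve_sudoku(values, row, col + 1)
```

Calling `solve_sudoku(grid)` on a copy of the puzzle leaves the solution in that copy when it returns True. The recursion depth stays small (at most two frames per cell), so Python's default recursion limit is not a concern for a 9x9 grid.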

B. Structure of the Algorithm

This approach uses two mutually recursive procedures. The
first one is SolveSudoku, whose arguments are the row and
column indexes of the current cell. SolveSudoku tries to put
a valid number from 1 to 9 into the cell. If the assignment
succeeds, the program continues with the next cell. If the
assignment later leads to a constraint violation, the program
undoes it by backtracking and tries the next number. The
second procedure, Next, is responsible for advancing the
search to the next cell.

IV. ABC ALGORITHM

A. Pseudocode

Begin
[01] Initialize Population
[02] Evaluate Population
[03] Set cycle=1
[04] Repeat for each Employed Bee
[05]     Randomly select 2 cells from a randomly selected sub-grid and
         replace them if they are not fixed and if it increases the
         evaluation value
[06] Repeat for each Onlooker Bee
[07]     Evaluate Population and find the worst 30% of bees, randomly
         select one of them, and give the onlooker's solution to that
         bee if it has a worse grid
[08]     Evaluate Population and find the best 10% of bees, then
         randomly select one of their grids and copy its solution into
         the onlooker's grid
[09] Repeat for each Scout Bee
[10]     Randomly fill the grid without considering violations
[11] Memorize best solution
[12] cycle=cycle+1
[13] Until (termination condition is met)
[14] If cycle limit is reached, give the memorized best solution as
     the final solution
[15] Else if the accepted solution is met, give it
End

B. Structure of the Algorithm

The user sets the number of iteration cycles and the number of
bees through the user interface. Population composition and
cycle count play a critical role in the completeness and time
efficiency of the algorithm.
Bees are divided into three kinds: employed, scout and
onlooker. At the beginning, each bee is given a copy of the
input Sudoku puzzle and fills the empty cells randomly,
respecting the sub-grid constraint but not the row and column
constraints. Each bee evaluates its own grid using the
"Misplaced Cell" heuristic: the grid receives a penalty of
minus one for each misplaced cell. An accepted solution
therefore has the best possible heuristic value, 0.
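The paper does not spell out when a cell counts as "misplaced". The Python sketch below assumes one plausible reading: a cell is misplaced if its digit occurs more than once in its row or in its column (the sub-grid constraint is already satisfied by construction). The name `evaluate` and this definition are assumptions for illustration only.

```python
def evaluate(grid):
    """Return minus the number of misplaced cells, where a cell is taken
    to be misplaced if its digit repeats in its row or its column.
    A grid without row or column conflicts scores 0."""
    cols = list(zip(*grid))          # cols[c] is column c as a tuple
    misplaced = 0
    for r in range(9):
        for c in range(9):
            digit = grid[r][c]
            if grid[r].count(digit) > 1 or cols[c].count(digit) > 1:
                misplaced += 1
    return -misplaced
```

Under this reading, swapping two neighbouring digits in one row of a solved grid creates two column conflicts of two cells each, so the score drops from 0 to -4.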

In each iteration, each kind of bee acts differently. An
employed bee selects a sub-grid at random and then selects two
random cells from that sub-grid. If neither cell is a given
input (which cannot be changed), the employed bee replaces
those cells, provided the resulting grid has a higher
evaluation value. Otherwise it does nothing in that cycle.
An onlooker bee, on the other hand, looks at the grids of the
whole bee population and determines the 10% of bees with the
best evaluation values and the 30% with the worst. It then
randomly selects a bee from the best 10% and copies that bee's
grid into its own. Next, it randomly selects a bee from the
worst 30% and overwrites that worse grid with the onlooker's
grid.
Finally, a scout bee randomly generates a new solution from the
given input in each iteration, without considering its
evaluation value.
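The three bee roles can be put together in a compact sketch. The Python code below is an illustrative reading of the description, not the authors' Java implementation. Its assumptions: "replacing" two cells means swapping their digits, a cell is misplaced if its digit repeats in its row or column, scouts refill the grid while still respecting the sub-grid constraint, and the onlooker follows the order given in the prose (copy from the best 10% first, then overwrite one of the worst 30%). All function and parameter names are hypothetical.

```python
import random


def evaluate(grid):
    """Minus the number of cells whose digit repeats in its row or column."""
    cols = list(zip(*grid))
    return -sum(1 for r in range(9) for c in range(9)
                if grid[r].count(grid[r][c]) > 1 or cols[c].count(grid[r][c]) > 1)


def random_fill(puzzle, rng):
    """Copy the puzzle and give each sub-grid its missing digits in random order."""
    grid = [row[:] for row in puzzle]
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            cells = [(r, c) for r in range(br, br + 3) for c in range(bc, bc + 3)]
            missing = list(set(range(1, 10)) - {puzzle[r][c] for r, c in cells})
            rng.shuffle(missing)
            for r, c in cells:
                if puzzle[r][c] == 0:
                    grid[r][c] = missing.pop()
    return grid


def employed_step(grid, puzzle, rng):
    """Swap two non-given cells of a random sub-grid if that raises the score."""
    br, bc = 3 * rng.randrange(3), 3 * rng.randrange(3)
    cells = [(r, c) for r in range(br, br + 3) for c in range(bc, bc + 3)]
    (r1, c1), (r2, c2) = rng.sample(cells, 2)
    if puzzle[r1][c1] != 0 or puzzle[r2][c2] != 0:
        return                                   # given cells cannot change
    before = evaluate(grid)
    grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
    if evaluate(grid) <= before:                 # no improvement: undo the swap
        grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]


def abc_solve(puzzle, n_employed=10, n_onlooker=2, n_scout=1, cycles=1000, seed=0):
    """Return (best grid found, its score); a score of 0 means solved."""
    rng = random.Random(seed)
    n = n_employed + n_onlooker + n_scout
    bees = [random_fill(puzzle, rng) for _ in range(n)]
    best = [row[:] for row in max(bees, key=evaluate)]
    for _ in range(cycles):
        for i in range(n_employed):
            employed_step(bees[i], puzzle, rng)
        for i in range(n_employed, n_employed + n_onlooker):
            ranked = sorted(range(n), key=lambda b: evaluate(bees[b]))  # worst first
            top = ranked[-max(1, n // 10):]
            bottom = ranked[:max(1, 3 * n // 10)]
            bees[i] = [row[:] for row in bees[rng.choice(top)]]
            target = rng.choice(bottom)
            if evaluate(bees[target]) < evaluate(bees[i]):
                bees[target] = [row[:] for row in bees[i]]
        for i in range(n_employed + n_onlooker, n):
            bees[i] = random_fill(puzzle, rng)   # scouts restart from the input
        champion = max(bees, key=evaluate)
        if evaluate(champion) > evaluate(best):
            best = [row[:] for row in champion]  # memorize best solution
        if evaluate(best) == 0:
            break
    return best, evaluate(best)
```

The default population (10 employed, 2 onlooker, 1 scout) mirrors the combination reported as best in the simulation section. Because only swaps inside a sub-grid and whole-grid copies are used, every grid keeps the sub-grid constraint satisfied, so a score of 0 corresponds to a full solution.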

V. CONSTRAINT PROPAGATION ALGORITHM

A. Pseudocode

Procedure init (values[][])
Begin
    for i=0 to 8
        for j=0 to 8
            Cell[i][j] = values[i][j]
            Cell[i][j].domain.clear()
            if values[i][j] is 0
                Cell[i][j].domain.addAll(1..9)
    for i=0 to 8
        for j=0 to 8
            if values[i][j] is NOT 0
                for k=0 to 8
                    Cell[i][k].domain.remove(values[i][j])
                    assignIfOnlyOne(i,k)
                for k=0 to 8
                    Cell[k][j].domain.remove(values[i][j])
                    assignIfOnlyOne(k,j)
                for k=0 to 2
                    for kk=0 to 2
                        Cell[floor(i/3)*3+k][floor(j/3)*3+kk].domain.remove(values[i][j])
                        assignIfOnlyOne(floor(i/3)*3+k, floor(j/3)*3+kk)
End

Procedure assignIfOnlyOne (row, col)
Begin
    if Cell[row][col].domain.size() == 1
        v = Cell[row][col].domain.get(0)
        Cell[row][col] = v
        Cell[row][col].domain.clear()
        for k=0 to 8
            Cell[row][k].domain.remove(v)
            assignIfOnlyOne(row,k)
        for k=0 to 8
            Cell[k][col].domain.remove(v)
            assignIfOnlyOne(k,col)
        for k=0 to 2
            for kk=0 to 2
                Cell[floor(row/3)*3+k][floor(col/3)*3+kk].domain.remove(v)
                assignIfOnlyOne(floor(row/3)*3+k, floor(col/3)*3+kk)
End

B. Structure of Algorithm

We applied the constraint propagation algorithm to the Sudoku
instance at initialization and wrote down every value that was
the only remaining option for its cell. After updating the
puzzle, we applied the Backtracking algorithm to the resulting
Sudoku problem.
In this way, most of the time we did not need to run
Backtracking at all, since the problem was already completely
solved by Constraint Propagation at initialization.
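The propagation step can be sketched in Python as follows. This is an illustrative variant rather than the authors' code: it replaces the recursion of the pseudocode with an explicit worklist, which performs the same eliminations without deep call chains. The names `peers` and `propagate` are assumptions, and 0 again marks an empty cell.

```python
def peers(row, col):
    """All cells that share a row, column or 3x3 block with (row, col)."""
    br, bc = 3 * (row // 3), 3 * (col // 3)
    cells = {(row, k) for k in range(9)} | {(k, col) for k in range(9)}
    cells |= {(r, c) for r in range(br, br + 3) for c in range(bc, bc + 3)}
    cells.discard((row, col))
    return cells


def propagate(values):
    """Return a copy of values with every forced entry filled in.
    Cells that are still undecided keep the value 0."""
    grid = [row[:] for row in values]
    # Domain of each empty cell: digits not used by any already filled peer.
    domain = {(r, c): set(range(1, 10)) - {grid[pr][pc] for pr, pc in peers(r, c)}
              for r in range(9) for c in range(9) if grid[r][c] == 0}
    # Worklist of cells whose domain has shrunk to a single digit.
    pending = [cell for cell, options in domain.items() if len(options) == 1]
    while pending:
        cell = pending.pop()
        if cell not in domain or len(domain[cell]) != 1:
            continue                         # already assigned or inconsistent
        digit = domain.pop(cell).pop()
        grid[cell[0]][cell[1]] = digit
        for peer in peers(*cell):            # remove the digit from peer domains
            if peer in domain and digit in domain[peer]:
                domain[peer].discard(digit)
                if len(domain[peer]) == 1:
                    pending.append(peer)
    return grid
```

If the returned grid still contains zeros, it can be handed to a backtracking solver, which then has fewer empty cells to search over.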

VI. STOCHASTIC HILL CLIMBING

A. Pseudocode

X = Fill the empty cells randomly
flag = true
while flag is true
    flag = false
    for i=0 to 8
        for j=0 to 8
            r = random btw 0 to 8
            c = random btw 0 to 8
            if X[r][c] is a given input cell
                continue
            for k = 1 to 9
                oldScore = evaluate(X)
                temp = X[r][c]
                X[r][c] = k
                newScore = evaluate(X)
                if oldScore > newScore
                    X[r][c] = temp
                else if newScore > oldScore
                    flag = true
        end second for
    end first for
end while

B. Structure of Algorithm

In this algorithm, we first fill the empty cells of the Sudoku
problem randomly. Then, until no further local improvement is
possible, we repeatedly select a cell at random and try
inserting each value from 1 to 9 into it. During the insertion
step, a value that lowers the evaluation of the grid is
rejected and the previous value is restored.
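A runnable version of this loop might look like the Python sketch below. It is an illustration under stated assumptions, not the authors' implementation: the score is minus the number of cells whose digit repeats in its row, column or block, given cells are never modified, and a `max_sweeps` cap (an added safeguard, not part of the description) bounds the running time. The names `evaluate`, `hill_climb`, `seed` and `max_sweeps` are hypothetical.

```python
import random


def evaluate(grid):
    """Minus the number of cells whose digit repeats in its row, column or block."""
    def repeats(r, c):
        digit = grid[r][c]
        br, bc = 3 * (r // 3), 3 * (c // 3)
        block = [grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)]
        column = [grid[i][c] for i in range(9)]
        return (grid[r].count(digit) > 1 or column.count(digit) > 1
                or block.count(digit) > 1)
    return -sum(repeats(r, c) for r in range(9) for c in range(9))


def hill_climb(puzzle, seed=0, max_sweeps=30):
    """Return (grid, score); a score of 0 means the grid is a valid solution."""
    rng = random.Random(seed)
    free = [(r, c) for r in range(9) for c in range(9) if puzzle[r][c] == 0]
    # Random initial fill of the empty cells only.
    grid = [[d if d else rng.randint(1, 9) for d in row] for row in puzzle]
    score = evaluate(grid)
    improved, sweeps = True, 0
    while improved and free and sweeps < max_sweeps:
        improved = False
        sweeps += 1
        for _ in range(81):                  # one sweep: 81 random cell picks
            r, c = rng.choice(free)
            for digit in range(1, 10):
                old = grid[r][c]
                grid[r][c] = digit
                new_score = evaluate(grid)
                if new_score < score:
                    grid[r][c] = old         # worse: restore the previous value
                else:
                    if new_score > score:
                        improved = True      # strict improvement keeps us going
                    score = new_score
    return grid, score
```

Because moves that lower the score are always rejected, the search can stop at a grid that still contains violations, which is the local-optimum behaviour discussed in the simulation results.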

VII. SIMULATION RESULTS


In order to compare the algorithms and study their
effectiveness in solving Sudoku problems both visually and
numerically, a Graphical User Interface (GUI) was built in
Java. This GUI enables the user to run any of the algorithms
we implemented and to choose the difficulty level of the
puzzle. In addition, the GUI accepts user input for tuning the
composition of the bee population.

Figure 2: Snapshot from program with an unsolved Sudoku

problem

Figure 3: Solution of the Sudoku problem depicted in

Figure 2. In this picture reds show wrongly inserted numbers

whereas blacks show correctly inserted numbers and input

values.

In the simulation shown in Fig. 2, the start position has
40 of the 81 cells empty.
Fig. 3 shows a poor result obtained with a bee population
combination entered through the GUI. Fig. 4 shows the perfect
result, which is easily produced by the backtracking
algorithm.

Figure 4: Perfect Solution of the Sudoku problem depicted
in Figure 2.

The first major step in carrying out the Monte Carlo test is to
find the ideal bee population for the ABC algorithm, so that it
can be benchmarked against the backtracking algorithm.
The procedure for choosing a bee population combination is as
follows:
- Start with a considerable number of computation cycles.
- Start growing the number of employed bees (analogous to
worker bees in real life).
- Start growing the number of onlooker bees (analogous to
manager bees).
- Start adding scout bees (analogous to spy bees).
- For each combination formed, check the accumulated
fitness number. Pick the combination that produces the
lowest fitness number.
As the results show, a higher number of bees does not
necessarily give a better result when searching for the ideal
population.

Accumulated fitness = fitness of easy level + fitness of
medium level
The accumulated fitness is simply the sum of the fitness values
produced by each trial. There is no need to compute the
average, as the aim is only to find the lowest accumulated
fitness.

Figure 5: Comparison of fitness number accumulated

according to each combination of bee population

The lowest accumulated fitness number is given by
combination C of the bee population: 10 employed bees, 2
onlooker bees and 1 scout bee. The lowest accumulated
fitness number is 128.

Figure 6: Comparison of time and memory consumed by

each combination of bee population

An interesting finding is that the chosen bee population
combination needs a significant amount of memory (3.5MB). This
suggests that more accurate results are correlated with more
resources consumed by the agents during computation.

The second major step is to put the four algorithms
(Backtracking, Constraint Propagation, Hill Climbing and ABC)
side by side and compare their performance in terms of time,
memory and violations.
The following figures show the time and the amount of memory
needed by each algorithm to carry out the task in different
environments, i.e. at three different levels of difficulty.

Figure 7: Comparison of time consumed by each algorithm

The results above show that the most time-demanding algorithm
is ABC, whereas the least time-demanding algorithm is
Constraint Propagation.
As these comparisons show, the ABC algorithm is by nature very
time-demanding. The interaction between the three kinds of bee
agents is the main reason. The time demand of ABC can be
reduced by decreasing the cycle count; however, decreasing the
cycle count increases the number of violations.

Figure 8: Comparison of memory consumed by each

algorithm

On the other hand, Constraint Propagation is the most
memory-demanding algorithm. This is expected, as it requires
extra memory for domain values on the heap and for recursive
function calls on the stack.

Figure 9: Comparison of number of constraint violations

between ABC and Hill Climbing algorithms

Since the Backtracking and Constraint Propagation algorithms
did not produce solutions containing violations, they are not
included in the scatter graphs.
The ABC algorithm is observed to be more accurate than the Hill
Climbing algorithm. This gain is mainly due to the presence and
effort of the onlooker bees, which helped ABC avoid getting
stuck at local maxima.

VIII. CONCLUSION

The simulation results show that for 9x9 Sudoku problems it is
better to use the Backtracking algorithm, as it produces zero
constraint violations and performs well with respect to time
and memory.
Our optimization algorithms could not find the correct result
in every trial. Moreover, they consumed more resources.

REFERENCES

[1] http://www.math.cornell.edu/mec/Summer2009/Mahmood/Intro.html, 2009, last access 12 May 2013
[2] http://www.afjarvis.staff.shef.ac.uk/sudoku/felgenhauer_jarvis_spec1.pdf, last access 12 May 2013
[3] Pacurib, Seno, Yusiong, "Solving Sudoku Puzzles using Improved Artificial Bee Colony Algorithm", 2007, Fourth International Conference on Innovative Computing, Information and Control
[4] Arifiyanto, Wahyu Adhi, "Penggunaan Algoritma Backtracking Dalam Penyelesaian Permainan Sudoku", 2007
[5] http://www.cs.utexas.edu/~scottm/cs314/handouts/slides/Topic13RecursiveBacktracking_4Up.pdf, last access 12 May 2013
[6] http://spectrum.ieee.org/consumer-electronics/gaming/sudoku-science, last access 12 May 2013
[7] "Satisfaction by Belief Propagation: An Example of Using Sudoku", IEEE Mountain Workshop on Adaptive and Learning Systems, 2006, pp. 122-126.
[9] Das, K.N. et al., "A Retrievable GA for Solving Sudoku Puzzles", 2008
