Connect Four Solver

Connect 4 Algorithms

Design & Analysis of Algorithms

David Alligood, Scott Swiger, Jo Van Voorhis

University of North Carolina Wilmington

Abstract

Connect 4 is a zero-sum, perfect information, solved game. It can be played perfectly through mathematical and algorithmic deduction. In this project, we observed three algorithms (Min Max, Alpha Beta, and an algorithm of our own design) and their ability to play Connect 4, with the goal of determining which one played the most efficiently. This required analysis of each algorithm's runtimes and accuracy. Min Max and Alpha Beta use a game tree in order to make decisions, but our algorithm takes a different approach. How does it fare against these established, effective algorithms that have made their mark in the computing world?

Introduction

Connect 4 is a popular two-player turn-based strategy game published in 1974 by Milton Bradley. Players take turns placing their tokens in one of seven columns on a board that is seven columns wide and six rows high and stands upright. This upright position causes gravity to be a part of the game, as the tokens fall to the bottom of the column they are inserted into, allowing the tokens to stack up on each other. The object of the game is to be the first player to get four of their tokens in a line, either horizontally, vertically, or diagonally.

Connect 4 is a zero-sum game, which means one player's gain is exactly the other player's loss: either one player wins and the other loses, or the game is tied and neither player wins anything. It is also a game of perfect information, which means both players can see the entire state of the game, including all previous moves. For example, while Connect 4 is a game of perfect information, a game like Texas Hold 'em is not. This is because, even though the players are aware of what cards are on the table, they are not aware of what cards are in each other's hands. Because Connect 4 is a finite, zero-sum, perfect information game, it is solvable. This means it can be played perfectly every time through mathematical deduction and decision making. In this project, we will explore several algorithms and their ability to win Connect 4, looking at how often they win and the time they take to make moves. We will explore the question of what makes an algorithm the preferred algorithm in this case. An algorithm may win a lot of games, but what if it takes a long time to make each move? It may be effective, but is it the most efficient?

Formal Problem Statement

A Connect 4 board, s, is represented by a two-dimensional array with 6 rows and 7 columns. Each position on s is either empty or set to a value from the set a = {X, O}. (x, y) represents a position on s, where x = row number and y = column number, with 0 <= x <= 5 and 0 <= y <= 6. s is considered won when, for some position (x, y), all of the positions in one of the following lines lie on the board and hold the same value from a:

(x, y) = (x, y+1) = (x, y+2) = (x, y+3)
(x, y) = (x+1, y) = (x+2, y) = (x+3, y)
(x, y) = (x+1, y+1) = (x+2, y+2) = (x+3, y+3)
(x, y) = (x+1, y-1) = (x+2, y-2) = (x+3, y-3)

These represent the different ways in which a player can get 4 tokens in a line: horizontally, vertically, and diagonally. Getting 4 in a row is the main goal because this is how Connect 4 is won, and the manner in which this is achieved serves as the main basis for how the algorithms will be evaluated.
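The win conditions in the problem statement can be checked mechanically. The following is a minimal sketch rather than the implementation used in this project; the board encoding (a list of 6 rows of 7 cells holding "X", "O", or None) and the name has_four are assumptions made for illustration. It checks both diagonal directions.

```python
ROWS, COLS = 6, 7

# Direction vectors (row step, column step): horizontal, vertical,
# and the two diagonal directions.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def has_four(board, token):
    """Return True if `token` occupies four cells in a line on `board`."""
    for x in range(ROWS):
        for y in range(COLS):
            for dx, dy in DIRECTIONS:
                end_x, end_y = x + 3 * dx, y + 3 * dy
                # Skip lines that would run off the board.
                if not (0 <= end_x < ROWS and 0 <= end_y < COLS):
                    continue
                if all(board[x + i * dx][y + i * dy] == token for i in range(4)):
                    return True
    return False


board = [[None] * COLS for _ in range(ROWS)]
for y in range(4):
    board[5][y] = "X"  # four X tokens side by side in one row

x_has_won = has_four(board, "X")
o_has_won = has_four(board, "O")
```

Scanning every starting cell in four directions is simple rather than fast; a practical solver would usually only check the lines that pass through the most recent move.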

Context

Connect 4 is a solved game. This means a way to play perfectly has been discovered, where the first player can win every time even if the second player plays perfectly as well. Connect 4 was solved in 1988 by James Dow Allen and then separately the same year by Victor Allis. These solutions relied on knowledge and logic, a less program-based approach than we would expect to see today. John Tromp has since solved Connect 4 with brute force methods, using a database of the first eight moves of any Connect 4 game and all possible states these produce (Tromp). Perfect solvers for Connect 4 do exist as playable games. This is made possible through a very complicated integration and implementation of Min Max and Alpha Beta pruning, along with move ordering and transposition tables (Pons).

Experimental Procedure

Min Max

The Min Max, or Minimax, algorithm involves exploring every possible board state of Connect 4. It does this by recursively expanding each board state, creating one new board state for every possible move from it. This creates a "game tree", which the algorithm uses to determine which move is the best move at any given point in the game.

Each node in this game tree represents a different state of the board, from the initial, empty Connect 4 board all the way down to every possible terminal board state. A terminal state is a board state at the end of a game: either player 1 or player 2 has won, or the game has ended in a tie. Each board state has a score associated with it, called a utility value. Min Max gets its name from the fact that player 1 is always trying to maximize the score, while player 2 is trying to minimize it. The scores indicate which player is winning at that moment in the game. In order for each node in the tree to be scored, its score must be derived from the terminal states of the game. This is because the outcome of the game has already been determined at each terminal state, so these can be accurately scored. Terminal nodes where player 1 has won will have a large positive value; where player 2 has won, a large negative value. Where there's a tie, the score is zero.

The scores for the non-terminal nodes are determined by the scores of their child nodes. Among a set of sibling nodes, one score will be brought up to the parent node and will become that parent's score. Because terminal nodes are the only ones with scores at the start, the algorithm must trace the tree all the way down to the terminal nodes to begin scoring the other nodes. The score that gets carried up depends on whose turn it is at the parent node. Player 1 is considered the "max" player, which means he is trying to take the maximum score; player 2 is the "min" player, meaning he is trying to take the minimum score. If it is player 1's turn at the parent of a set of scored children, then the maximum score of the children becomes that parent's score. If it is player 2's turn, then the minimum score of the children becomes the score for the parent. This process repeats until every node in the tree has a score. With every board state evaluated, the game is considered "solved", as the perfect path of play can be determined from the tree to ensure a victory.

Min Max's complexity is O(b^d), where b is the number of possible moves at any given point in the game, and d is the number of turns until the end of the game, or "depth". In Connect 4, there are at most seven possible moves, as there are seven columns. Each player has 21 chips, and there are 42 slots on the board, so every chip could potentially be placed on the board. This means the maximum depth of a game of Connect 4 is 42 moves. Since each board state spawns up to seven new board states, and each of these new board states spawns up to seven more and so on, it is evident that the game tree for Connect 4 will end up being extremely large. In fact, there are 4,531,985,219,092 total board states in Connect 4 (Numberphile, 1:56), including the initial, empty board, along with every terminal state: player 1 wins, player 2 wins, and ties. With this many nodes to explore, it would take an unrealistic amount of time to iterate through the entire game tree. This makes a raw Min Max algorithm highly inefficient; while it can, in theory, play Connect 4 perfectly, it needs to search all the way to every terminal node and then back up through the tree to score every node. With over 4 trillion board states, the time required is not practical in the slightest.
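To get a feel for how quickly b^d grows, the bound can be evaluated directly. The sketch below is only arithmetic; the name leaf_bound is hypothetical. Note that b^d counts move sequences and is an upper bound, since columns fill up and games can end early, whereas the figure of roughly 4.5 trillion counts distinct board states.

```python
BRANCHING = 7   # columns available on an empty board
MAX_DEPTH = 42  # one move per slot on the board


def leaf_bound(b, d):
    """Upper bound b**d on the number of move sequences of length d."""
    return b ** d


depth_8 = leaf_bound(BRANCHING, 8)            # 5,764,801 sequences
full_game = leaf_bound(BRANCHING, MAX_DEPTH)  # a 36-digit number
```

A search limited to 8 moves ahead therefore looks at a few million sequences at most, while the bound for a full 42-move game is far beyond what can be enumerated.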

However, there is a solution to this runtime issue: a heuristic can be implemented to allow the algorithm to make decisions faster at the cost of accuracy. This means the computer won't be playing perfectly, but it can still make good decisions in far less time than perfect play requires. A heuristic function for Min Max allows board states to be evaluated by other means, rather than solely basing scores off the terminal states. This means the entire tree does not have to be explored in order for scores to be assigned to the nodes. For our implementation, we chose a heuristic function that evaluates a board state based on the number of 3-in-a-rows and 2-in-a-rows each player has at that moment. The goal of Connect 4 is to get four pieces in a line, so the more pieces a player has aligned, the closer he is to winning. Therefore, the board states where a player has a lot of alignments and the opponent has very few alignments will yield a higher score for the player with more alignments. The exact formula we used is:

(([# of player1's 4's] * 10000) + ([# of player1's 3's] * 100) + ([# of player1's 2's] * 10)) - (([# of player2's 4's] * 10000) + ([# of player2's 3's] * 100) + ([# of player2's 2's] * 10))

A four-in-a-row is weighted the most heavily because this is the win condition. If player 2 has more alignments than player 1, the score for that state will be negative, which indicates player 2 is winning at that current state because they are the "min" player. Positive scores signify a state in favor of player 1, the "max" player. This calculation allows board states to be scored in the middle of the game, without the need to pull scores from the terminal states. Now, the Min Max algorithm is playable as the runtime is cut tremendously. For our testing purposes, we had the computer explore the tree to a depth level of 8. This gives plenty of board states that have varying scores.
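The formula can be sketched in code as follows. The weights come from the formula above, but how a "2" or a "3" is counted is not spelled out in this paper, so the sketch makes a loud assumption: a player's k-in-a-row is any line of four cells that holds exactly k of that player's tokens and no opposing token. The board encoding and the names windows, side_score, and evaluate are also hypothetical.

```python
ROWS, COLS = 6, 7
WEIGHTS = {4: 10000, 3: 100, 2: 10}  # weights taken from the formula


def windows():
    """Yield every line of four cells on the board as a list of (row, col)."""
    for x in range(ROWS):
        for y in range(COLS):
            for dx, dy in [(0, 1), (1, 0), (1, 1), (1, -1)]:
                if 0 <= x + 3 * dx < ROWS and 0 <= y + 3 * dy < COLS:
                    yield [(x + i * dx, y + i * dy) for i in range(4)]


def side_score(board, token):
    """Weighted count of `token`'s 2s, 3s and 4s (assumed definition)."""
    total = 0
    for cells in windows():
        values = [board[x][y] for x, y in cells]
        own = values.count(token)
        if own + values.count(None) == 4:  # no opposing token in the line
            total += WEIGHTS.get(own, 0)
    return total


def evaluate(board, player1="X", player2="O"):
    """Positive favours player 1 (max); negative favours player 2 (min)."""
    return side_score(board, player1) - side_score(board, player2)


board = [[None] * COLS for _ in range(ROWS)]
board[5][0] = board[5][1] = board[5][2] = "X"  # three X tokens in one row
board[4][0] = "O"
score = evaluate(board)
```

Under this assumed counting rule, the example board scores 110 for player 1: one open three (100) plus one open two (10), with nothing yet for player 2.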

Alpha Beta

The Alpha Beta algorithm is a tree-traversing search algorithm whose primary goal is to decrease the number of nodes visited by the Minimax algorithm, making it an improvement over Minimax. Both Minimax and Alpha Beta belong to the branch and bound class of algorithms; however, Alpha Beta optimizes Minimax by “pruning” nodes whose values, when compared against the current alpha and beta values, cannot affect the final decision. Alpha and beta start out as each player's worst possible score. How much of the search tree can be eliminated depends on the depth of the search in player moves (plies), d, and the branching factor, b. The best case complexity for Alpha Beta is O(b^(d/2)) and its worst case complexity is O(b^d), which is the complexity of Minimax, so Alpha Beta never visits more nodes than Minimax and usually visits far fewer. In the best case, Alpha Beta's pruning lets it search roughly twice as deep as Minimax in the same runtime. As can be seen in the data, the initial move time is longer than subsequent moves as there are more game states.

Alpha Beta assigns two values, alpha and beta, at the beginning of the search that are maintained throughout the search. Alpha represents the minimum score that the maximizing player is assured of, and beta represents the maximum score that the minimizing player is assured of. The initial value for alpha is negative infinity (-∞) and the initial value for beta is positive infinity (+∞). These values represent each player's worst possible score at the start of the search. Once beta becomes less than or equal to alpha at a node (beta <= alpha), the remaining children of that node are pruned, because the opposing player already has a better option elsewhere and would never allow play to reach that node. The efficiency improvements over Minimax make Alpha Beta the superior algorithm to implement for a zero-sum game like Connect 4.
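The role of alpha and beta can be seen on a small hand-built tree. This is a minimal sketch, not the project's Connect 4 implementation; the tree encoding (a leaf is a number, an inner node is a list of children), the name alphabeta, and the visited list used to record evaluated leaves are assumptions for illustration.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of an explicit tree, skipping branches that cannot matter.

    `visited` is an optional list that records which leaves were evaluated.
    """
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if beta <= alpha:  # min already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if beta <= alpha:  # max already has a better option elsewhere
            break
    return best


tree = [[3, 5], [2, 9], [0, 7]]
seen = []
value = alphabeta(tree, maximizing=True, visited=seen)
```

On this tree the result is 3, the same value plain Minimax would return, but only the leaves 3, 5, 2, and 0 are evaluated; the leaves 9 and 7 are pruned because their parents can no longer beat the 3 already guaranteed to the max player.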

This image is an illustration of alpha-beta pruning in which the grayed out sections are the “pruned” subtrees that do not need to be explored. Just as with Min Max, the max and min levels represent whose turn it is: the player's or its opponent's.

In a standard Alpha Beta tree search, alpha and beta act as placeholders for the best score found so far for the first and second player, respectively. These placeholders can cause a drop in efficiency if the order in which moves are examined proves to be poor, because weak moves examined first tighten the bounds slowly and few branches are pruned. Sorting moves early on in the search is beneficial because the closer a good bound is established to the current move state, the more the number of positions searched decreases, and that decrease is exponential in the remaining depth.
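The effect of move order can be demonstrated by running the same search over the same small tree with its children arranged in two different orders. This is a sketch under assumptions: the tree encoding (a leaf is a number, an inner node is a list of children) and the name alphabeta_count are hypothetical, and the tree is hand-built rather than taken from a Connect 4 game.

```python
import math


def alphabeta_count(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Return (value, number of leaves evaluated) for an explicit tree."""
    if isinstance(node, (int, float)):
        return node, 1
    best = -math.inf if maximizing else math.inf
    leaves = 0
    for child in node:
        score, seen = alphabeta_count(child, not maximizing, alpha, beta)
        leaves += seen
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:  # remaining siblings cannot change the result
            break
    return best, leaves


good_order = [[3, 5], [2, 9], [0, 7]]  # strongest line examined first
poor_order = [[7, 0], [9, 2], [5, 3]]  # same tree, weakest line first

good_result = alphabeta_count(good_order, True)
poor_result = alphabeta_count(poor_order, True)
```

Both orders give the value 3, but the well-ordered tree needs 4 leaf evaluations while the poorly ordered one needs all 6. On a tree as deep as Connect 4's, that gap compounds at every level.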

Just as Minimax can be improved upon heuristically, so can Alpha Beta, using the same formula implemented for Minimax:

(([# of player1's 4's] * 10000) + ([# of player1's 3's] * 100) + ([# of player1's 2's] * 10)) - (([# of player2's 4's] * 10000) + ([# of player2's 3's] * 100) + ([# of player2's 2's] * 10))

When the Alpha Beta algorithm was tested against the computer with a depth level of 8, it proved to be significantly more efficient with respect to runtime and move time than Minimax.

Our Algorithm

The goal of our algorithm is to play the game of Connect Four as a human would. This is made possible by a numeric database that consists of every way four like-colored pieces could be positioned on a Connect Four board, which will be referred to as winning board states (see Figures 1 & 3). As a game plays out, our algorithm eliminates the winning board states that can no longer occur. For example, consider the game board pictured in Figure 2. The red pieces are those played by our algorithm and the yellow are those played by its human opponent. Because of the positioning of the three pieces, the following winning board states would be eliminated for yellow:

Vertical {J, K}

Horizontal {A, B, C, D, E, F, G, H}

Diagonal {J, K, S, V}

The following winning board states would be eliminated for red:

Vertical {M}

Horizontal {B, C, D}

Diagonal {S}

By separately eliminating the winning board states for both yellow and red, our algorithm can refer to red’s remaining winning board states to decide winning moves and to yellow’s remaining winning board states to anticipate the opponent’s next move, as well as to avoid setting the opponent up for a win. This type of strategic gameplay is what makes our algorithm's play so similar to that of a human.
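The elimination step can be sketched as follows. This is an illustration under assumptions, not the code used in this project: it generates the winning boards instead of reading them from a stored database, it assumes the cell numbering cell = 7 * row + column (which appears to match the numbers listed in Figure 1), and the names winning_lines and eliminate, as well as the two example moves, are hypothetical.

```python
ROWS, COLS = 6, 7


def winning_lines():
    """Every set of four aligned cells, numbered cell = 7 * row + column."""
    lines = []
    for row in range(ROWS):
        for col in range(COLS):
            for d_row, d_col in [(0, 1), (1, 0), (1, 1), (1, -1)]:
                end_row, end_col = row + 3 * d_row, col + 3 * d_col
                if 0 <= end_row < ROWS and 0 <= end_col < COLS:
                    lines.append(frozenset(
                        COLS * (row + i * d_row) + (col + i * d_col)
                        for i in range(4)))
    return lines


def eliminate(lines, opponent_cell):
    """Drop every line that the opponent's new piece has made impossible."""
    return [line for line in lines if opponent_cell not in line]


red_lines = winning_lines()     # lines still open to red
yellow_lines = winning_lines()  # lines still open to yellow

# Hypothetical opening: red plays cell 38, yellow answers with cell 39.
yellow_lines = eliminate(yellow_lines, 38)
red_lines = eliminate(red_lines, 39)
```

Starting from 69 lines each, red's piece on cell 38 removes 7 of yellow's lines and yellow's piece on cell 39 removes 5 of red's, leaving 62 and 64 respectively.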

In the portion of our algorithm that was used to gather the runtimes of each move it made, two methods are called, and their costs combine to give the algorithm's Big O (see Figure 4). Both methods contain the nested for loop:

for each (winning board in winning board states) {
    for each (state in winning board) {
        consider state;
    }
}

But, each method is iterating through a different amount of winning board states as the game goes on since one method is considering yellow’s winning board states and the other is considering red’s. This is the reason for combining the Big O of both methods as our algorithm must iterate through each winning board within red and yellow’s winning board states.

     Vertical       Horizontal     Diagonal
A    35 28 21 14    35 36 37 38    35 29 23 17
B    28 21 14  7    36 37 38 39    28 22 16 10
C    21 14  7  0    37 38 39 40    21 15  9  3
D    36 29 22 15    38 39 40 41    36 30 24 18
E    29 22 15  8    28 29 30 31    29 23 17 11
F    22 15  8  1    29 30 31 32    22 16 10  4
G    37 30 23 16    30 31 32 33    37 31 25 19
H    30 23 16  9    31 32 33 34    30 24 18 12
I    23 16  9  2    21 22 23 24    23 17 11  5
J    38 31 24 17    22 23 24 25    38 32 26 20
K    31 24 17 10    23 24 25 26    31 25 19 13
L    24 17 10  3    24 25 26 27    24 18 12  6
M    39 32 25 18    14 15 16 17    41 33 25 17
N    32 25 18 11    15 16 17 18    34 26 18 10
O    25 18 11  4    16 17 18 19    27 19 11  3
P    40 33 26 19    17 18 19 20    40 32 24 16
Q    33 26 19 12    10 11 12 13    33 25 17  9
R    26 19 12  5     9 10 11 12    26 18 10  2
S    41 34 27 20     8  9 10 11    39 31 23 15
T    34 27 20 13     7  8  9 10    32 24 16  8
U    27 20 13  6     0  1  2  3    25 17  9  1
V         -          1  2  3  4    38 30 22 14
W         -          2  3  4  5    31 23 15  7
X         -          3  4  5  6    24 16  8  0

Figure 1

Figure 2

Figure 3

Figure 4

As for the states within each winning board, there will always be four, because winning the game of Connect Four requires the alignment of four pieces. Therefore, the cost of these methods can be denoted (A x C) and (B x C), with “A” being the number of red’s remaining winning boards, “B” being the number of yellow’s remaining winning boards, and “C” being four, the number of states in every winning board. After combining and simplifying, our algorithm performs C(A + B) state checks per move; since C is a constant, this is O(A + B). In the worst case this is 552 checks, because the original database of winning board states contains 69 winning boards and 4(69 + 69) = 552. In the best case it is 8 checks, when there is only one winning board remaining for each of yellow and red.
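The per-move cost C(A + B) can be written out as a short calculation. The function name checks_per_move is hypothetical; the numbers are the ones given in the paragraph above.

```python
CELLS_PER_LINE = 4  # C: every winning board has exactly four states


def checks_per_move(red_remaining, yellow_remaining):
    """C * (A + B): states examined when both lists are scanned once."""
    return CELLS_PER_LINE * (red_remaining + yellow_remaining)


worst_case = checks_per_move(69, 69)  # full database for both colors
best_case = checks_per_move(1, 1)     # one winning board left for each
```

This gives 552 and 8 respectively, so the work per move shrinks as winning boards are eliminated but is always bounded by a small constant.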

Based on the information covered so far regarding our algorithm and its decreasing cost per move, it could be assumed that its average runtimes would show a roughly linear decrease after the initial move, since the more moves that are made, the less data there is to iterate through. Figure 5 is a stacked area chart that shows this general evolution of the time it takes our algorithm to make its move over various game lengths, which provides a correct interpretation of our algorithm’s Big O. The averaged runtimes, however, do not follow this pattern: the raw line chart and bar graph of our algorithm’s average runtimes (see Figure 6) have a shifting, roughly zero slope. This occurrence can be attributed to the way in which the human opponent plays the game. When the opponent’s approach is vertically inclined towards the center of the board (see Figure 7), winning board states can be eliminated much more quickly than when the opponent’s approach is horizontally inclined and away from the center of the board (see Figure 8). Hence, the average runtime is split between these two styles of play, which produce contrasting runtimes, consequently causing our algorithm’s average runtime to appear plateaued. In conclusion, our algorithm’s speed should be evaluated as a general evolution of move times over the duration of a given game and not by the averaged move times over the duration of all games.

Figure 5

Figure 6

Figure 7

Figure 8

Results and Data:

Figure 9

Figure 10

Figure 11

Holes:

The most glaring hole in our results is in the win percentage. Since our group members were the only ones to play against the algorithms, we only had three different brains playing Connect 4. Our skills, therefore, become a factor in the algorithms' win percentages. It may have helped to have a wider range of play testers, to get varying skill levels and a larger pool of test games, as we only had 30 for each algorithm. Another worry is that some games may have played out exactly like a test game conducted by another player, since we only made sure our individual games were unique.

Interpretation and Conclusions

It is evident in the results that Alpha Beta is faster than Min Max, which is to be expected as Alpha Beta prunes branches from the tree that it knows it does not need to iterate through. Another interesting note, indicated in Figure 11, is that the initial move times stand out as much longer than the end game times, which are minuscule by comparison. We believe this is because at the beginning of games, there are more options for the computer to consider, which takes more time to evaluate. As games drag on and get closer to the end, columns start filling up, reducing the number of possible plays and giving the computer less to evaluate, resulting in quicker runtimes at the end of the game.

The win percentages, as seen in Figure 9, show the drastic impact of a heuristic approach. Min Max and Alpha Beta strongly solve Connect 4 in their pure forms, but the implementation of a heuristic function brought their win percentages, or effectiveness, down to a little more than half. Although the effectiveness was damaged, the heuristic vastly improved the runtime for these two algorithms. Without it, these would not have even been playable. We think that this improvement is worth the accuracy loss, as the algorithms were still able to win over half the time and run in a reasonable amount of time.

Among the three algorithms we analyzed, we conclude that Alpha Beta is the most efficient because of its fast runtimes and ability to win games. While our algorithm had very similar runtimes to Alpha Beta (in fact, they were slightly faster, as Figure 10 shows), the difference is too minuscule to call our algorithm the best one. Alpha Beta was able to win considerably more often than our algorithm, with nearly identical runtimes; therefore, we agree that Alpha Beta is the preferred algorithm.

Future Work

Looking into heuristics that are playable, yet more accurate or even perfect, would be the first step toward creating something that is more efficient and player friendly. The process for something like this is highly complex, but we believe that it would be worth researching, as a strong heuristic can be very effective for Min Max and Alpha Beta. For the algorithm we created, we would add a method to determine if there are two tokens in a horizontal row in columns 1-5 to avoid the scenario of [insert scenario]. Another topic of interest is observing how different depths in the game tree affect the gameplay for each algorithm, since only a depth of 8 was tested. As the search gets deeper, how much more drastically would runtimes be affected? How much stronger would the algorithms play Connect 4? We would also like to add a randomized algorithm to act as a control for our testing, as it would provide a baseline to compare our algorithms to.

Questions

1. What is a zero-sum game?
2. What is the worst case complexity for Alpha Beta but best case for Min Max?
3. What is pruning?
4. Why is a heuristic needed for Min Max/Alpha Beta to be playable for Connect 4?
5. What is the drawback of a heuristic approach for Min Max/Alpha Beta?

Answers

1. A game where either one player wins everything and the other wins nothing, or neither player wins anything (tie). The total sum of one player's total gain and the other's total losses equal zero, giving it the name "zero-sum".

2. O(b^d) where b = number of possible moves; d = depth.

3. Eliminating branches from a tree that do not need to be searched.

4. Because it would take too long for the entire game tree to be iterated through.

5. There is a loss in gameplay accuracy, which means the algorithm won't play perfectly. The benefit of a heuristic approach is more playable runtimes.

Works Cited

YouTube, Numberphile, 1 Dec. 2013, youtu.be/yDWPi1pZ0Po.

Pons, Pascal. “Connect Four Solver.” connect4.Gamesolver.org, connect4.gamesolver.org/?pos=.

Tromp, John. “John's Connect 4 Playground.” John's Connect Four Playground, tromp.github.io/c4/c4.html.