
BUILDING THE CHECKERS 10-PIECE ENDGAME DATABASES




    J. Schaeffer, Y. Björnsson, N. Burch, R. Lake, P. Lu, S. Sutphen Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8


    Abstract In 1993, the CHINOOK team completed the computation of the 2- through 8-piece checkers endgame databases, consisting of roughly 444 billion positions. Until recently, nobody had attempted to extend this work. In November 2001, we began an effort to compute the 9- and 10-piece databases. By June 2003, the entire 9-piece database and the 5-piece versus 5-piece portion of the 10-piece database were completed. The result is a 13 trillion position database, compressed into 148 GB of data organized for real-time decompression. This represents the largest endgame database initiative yet attempted. The results obtained from these computations are being used to aid an attempt to weakly solve the game. This paper describes our experiences building large endgame databases.

    Keywords: Retrograde analysis, endgame databases, checkers

    1. Introduction

    Endgame databases have had an enormous impact in computer games research. They have been instrumental in building world championship programs (e.g., the World Man-Machine Checkers Champion CHINOOK (Schaeffer, 1997)), solving games (e.g., Nine Men's Morris (Gasser, 1996) and Awari (Romein and Bal, 2002, 2003)), and uncovering new insights into games.

    For converging games, where the number of pieces on the board reduces as the game progresses, larger endgame databases are a performance asset to a game-playing program, both in terms of reducing the size of the search tree and by replacing heuristic evaluations with perfect knowledge. However, there are practical considerations to building large databases, including the time required to compute them and the resulting size of the (compressed) databases. Few researchers and developers have the expertise, motivation, patience, and computing resources to push database technology to its limit (a recent exception is the solution to the game of Awari (Romein and Bal, 2002, 2003)). This means, for example, that the 6-piece chess endgame databases are unlikely to be completed in the near future.

    H. J. Van Den Herik et al. (eds.), Advances in Computer Games IFIP International Federation for Information Processing 2004


    CHINOOK is the World Man-Machine Checkers Champion (Schaeffer, 1997).¹

    The 8-piece endgame databases were a critical part of the program's success against the top human players. The databases contained secrets that were well beyond the understanding of even the premier players in the world. These databases were started in 1989 and completed in 1993: 444 billion positions compressed into 5.6 GB of data. These numbers may seem small by today's standards, but they were impressive in the early 1990s, when a state-of-the-art CPU was an Intel 486, 32 MB was considered to be a lot of memory, and 1 GB disks were new technology and very expensive.

    Beginning in November 2001, we started production runs for computing the 9- and 10-piece checkers endgame databases. The databases are not needed to improve the playing strength of checkers programs; there are currently at least five checkers programs that are superior to all human players. Rather, there is a more enticing goal: solving the game of checkers (or, more precisely, weakly solving the game (Allis, 1994)). The total search space for the game is 5 × 10^20 positions, a seemingly prohibitively large number. However, most of the search space is likely to be irrelevant to the proof, and resulting estimates of the proof-tree size are well within what is possible to compute with current technology. Building the 10-piece databases (specifically the key 5-piece versus 5-piece subset, where each side has the same number of pieces) is a key stepping stone to solving checkers.

    This paper describes our experiences building the 9- and 10-piece checkers databases. The task was daunting, given the need for 64-bit addressing, large computations (up to 171 billion positions at a time), large intermediate disk needs (over 1 TB), verification of the results, and fault tolerance. In 10 years, these numbers will seem trivial, but the techniques will be useful for the next large database computation.

    This paper makes the following contributions:

    1 the practical considerations that complicate any long-term data-intensive computation,

    2 the system issues that need to be addressed, including memory constraints, concurrency, compression, and fault tolerance,

    3 improved data compression techniques,

    4 data on the 9- and 10-piece checkers databases, and

    5 speculation on the likelihood of solving checkers in the near future.

    Section 2 describes the algorithms used to compute the 8-piece databases. Section 3 discusses the enhancements needed to move to the larger 10-piece databases. The results from building the databases and the implications for solving the game of checkers are in Section 4. Section 5 concludes with perspectives on building larger databases.

    ¹ There are over 100 checkers variants. The variant used here is played on an 8 × 8 board and is popular in the former British Commonwealth and in North America. So-called International Checkers is played on a 10 × 10 board and is popular in Russia, Europe, and Africa.

    2. Algorithms

    The important application-specific properties that influence the database algorithms are (Goldenberg et al., 2003) (the "Properties"):

    1 The game starts with 12 white and 12 black checkers on the board.

    2 A captured piece is removed from the board and cannot return.

    3 Checkers can be promoted to become kings (when the checker moves to the back rank of the opponent).

    4 Checkers move forward; kings move forward and backward.

    The algorithms used for the checkers computation are updated versions of those used to compute the CHINOOK 8-piece databases (Lake et al., 1994). This code had not been touched since the completion of the databases in 1993.

    The most common format of an endgame database stores a distance metric for each position. This metric is typically either the number of moves to win (if appropriate) or the number of moves to convert to another database. This level of detail is tremendously useful in practice, since it allows a game-playing program to play the "best" database moves without needing any search. However, this representation requires (at least) a byte of data per position, and the resulting database does not compress well. The philosophy adopted for building checkers databases has been to build the largest databases possible. Doing this necessitates storing the minimal amount of information per position in the database: recording only whether a position is a win, a loss, or a draw. The result facilitates the creation of large endgame databases that compress extremely well.

    For database calculations, each position is represented by 2 bits, encoding one of the values win (W), loss (L), at least a draw (D), and unknown (U). Using D to mean at-least-a-draw instead of exactly a draw is useful, since it reduces the amount of disk I/O done by the program (see the Lookups phase described below). A portion of the endgame database (a slice) is computed by resolving all positions as wins, losses, or draws. The final result is compressed, verified, and then added to the master copy of the completed databases.
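    The 2-bit representation can be illustrated with a minimal sketch that packs four position values into each byte. The numeric codes, the class name PackedValues, and its methods are assumptions made for illustration; the paper does not specify the actual bit layout used by the database code.

```python
# Hypothetical 2-bit codes; the actual encoding is not given in the paper.
UNKNOWN, WIN, LOSS, DRAW = 0, 1, 2, 3  # DRAW means "at least a draw"


class PackedValues:
    """Stores one 2-bit value per position, four positions per byte."""

    def __init__(self, num_positions):
        self.num_positions = num_positions
        # All bits start at zero, so every position starts as UNKNOWN.
        self.data = bytearray((num_positions + 3) // 4)

    def get(self, index):
        byte, slot = divmod(index, 4)
        return (self.data[byte] >> (2 * slot)) & 0b11

    def set(self, index, value):
        byte, slot = divmod(index, 4)
        shift = 2 * slot
        # Clear the old 2-bit field, then write the new value into it.
        cleared = self.data[byte] & ~(0b11 << shift) & 0xFF
        self.data[byte] = cleared | (value << shift)

    def unresolved(self):
        """Count positions that are still UNKNOWN."""
        return sum(1 for i in range(self.num_positions)
                   if self.get(i) == UNKNOWN)


values = PackedValues(10)
values.set(5, WIN)
values.set(7, DRAW)
print(values.get(5), values.get(7), values.unresolved())
```

    In this layout a slice is finished when unresolved() reaches zero. At four positions per byte, a computation of 171 billion positions (the largest mentioned in the Introduction) would occupy roughly 43 GB before any compression.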

    The 10-piece databases are huge (8.5 trillion positions for just the 5-piece versus 5-piece subset), and it is not practical to do the entire calculation as one big computation. Instead, the problem is broken down into smaller slices that can be solved more easily. The databases are broken down as follows:

    By pieces: The N-piece database can be computed once the (N-1)-piece database is done (by Property #2).

    By material: An N-piece database is further divided so that subsets with a different number of pieces per side can be computed in parallel (Property #2). For example, in the 9-piece database computation, the 8 pieces versus 1, 7 versus 2, 6 versus 3, and 5 versus 4 subsets can be computed in parallel.

    By number of kings: The material division is further broken down by the number of kings for each side (exploiting Property #3). For example, after 5 kings versus 4 kings has been computed, the subset 4 kings and 1 checker versus 4 kings can be computed (the one checker might promote, so the 5-king versus 4-king database must be computed first).

    By leading rank: A sub-database is further sliced into pieces by considering the position of each side's most advanced (leading) checker (from ranks 1 to 7). Positions where the leading checker is on rank R must be computed before those where the leading checker is on rank R - 1 (Property #4). For example, in the 4 kings and 1 checker versus 4 kings endgame, all positions where the checker is on the seventh rank must be computed before tackling all positions where the checker is on the sixth rank. For databases where each side has a checker, this technique results in dividing the computation into 49 (not-necessarily-equal) slices, dramatically reducing the size of the biggest computation to be performed.
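    The material and leading-rank divisions above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the ordering shown is one valid order that respects the rank dependency, not necessarily the order the authors used.

```python
def material_splits(num_pieces):
    """Return (larger side, smaller side) piece counts for an N-piece database."""
    return [(a, num_pieces - a)
            for a in range(num_pieces - 1, 0, -1)
            if a >= num_pieces - a]


def leading_rank_slices():
    """Return (rank_a, rank_b) pairs, most advanced leading checkers first."""
    return [(rank_a, rank_b)
            for rank_a in range(7, 0, -1)
            for rank_b in range(7, 0, -1)]


def order_is_valid(slices):
    """Check that advancing either leading checker reaches an earlier slice."""
    position = {s: i for i, s in enumerate(slices)}
    for (rank_a, rank_b), i in position.items():
        for successor in ((rank_a + 1, rank_b), (rank_a, rank_b + 1)):
            # Rank 8 means promotion, which belongs to another sub-database.
            if successor in position and position[successor] > i:
                return False
    return True


print(material_splits(9))          # [(8, 1), (7, 2), (6, 3), (5, 4)]
print(len(leading_rank_slices()))  # 49
print(order_is_valid(leading_rank_slices()))  # True
```

    The four material splits printed for 9 pieces are independent of one another, which is why they can run in parallel, whereas the 49 rank slices within one sub-database must respect the order checked by order_is_valid.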

    More details on the decomposition can be found in Lake et al. (1994). Table 1 shows how the 5-piece versus 5-piece subset of the 10-piece database ...

    Table 1

    Database    Total Positions    Slices
    5500         16,257,084,480         1
    5401        142,249,489,200         7
    5302        247,789,432,800         7
    5...
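    The first row of Table 1 can be checked with a short calculation. The sketch below assumes that the label 5500 denotes five kings against five kings with no checkers, and that a king may stand on any of the 32 playable squares; both are assumptions, since the labelling convention is not spelled out in this section. The helper name count_all_kings is hypothetical.

```python
from math import comb


def count_all_kings(first_side_kings, second_side_kings, squares=32):
    """Count placements of two sets of kings on distinct playable squares."""
    first = comb(squares, first_side_kings)
    second = comb(squares - first_side_kings, second_side_kings)
    return first * second


# Assumed reading of "5500": 5 kings versus 5 kings, no checkers.
print(count_all_kings(5, 5))  # 16257084480
```

    Under these assumptions the product C(32,5) × C(27,5) reproduces the 16,257,084,480 positions listed for 5500. Rows with checkers need extra care, because a checker cannot stand on its promotion rank, so they are not covered by this helper.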
