

Journal of Applied Intelligence 3, 207-224 (1993) © 1993 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

Tutoring Bishop-Pawn Endgames: An Experiment in Using Knowledge-Based Chess as a Domain for Intelligent Tutoring

DINESH GADWAL, JIM E. GREER, & GORDON I. McCALLA

ARIES Laboratory, Department of Computational Science, University of Saskatchewan, Saskatoon, Saskatchewan S7N 0W0, CANADA

Abstract. Most research in computer chess has focused on creating an excellent chess player, with relatively little concern given to modeling how humans play chess. The research reported in this article is aimed at investigating knowledge-based chess in the context of building a prototype chess tutor, UMRAO, which helps students learn how to play bishop-pawn endgames. In tutoring it is essential to take a knowledge-based approach, since students must learn how to manipulate strategic concepts, not how to carry out large-scale lookahead searches.

UMRAO uses an extension of Michie's advice language to represent expert and novice chess plans. For any given endgame, the system is able to compile the plans into a strategy graph, which elaborates strategies (both well formed and ill formed) that students might use as they solve the endgame problem. A strategy graph can be compiled "off-line," where real-time performance is not important. Later, during tutoring, the strategy graph can be accessed quickly in order to understand a student's moves in terms of his or her strategies. With such understanding, UMRAO is able to provide appropriate knowledge-based feedback to the student. Anderson et al. have called this tutoring paradigm "model tracing," but in the chess domain model tracing can be used without the need for immediate feedback that Anderson has required in his more complex abstract problem-solving domains. The chess domain thus allows experimentation with a variety of tutoring styles that range from immediate feedback to optional feedback, from strict tutor control of the feedback to student initiative in the choice of feedback. This points out UMRAO's most promising contribution: re-establishing chess as a vehicle for research in other areas of artificial intelligence, in this case intelligent tutoring systems. UMRAO also makes technical contributions to knowledge-based chess and to intelligent tutoring as well.

Key words: Knowledge-based chess, intelligent tutoring systems, planning, diagnosis

1. Introduction

Over the history of artificial intelligence (AI) research, chess playing has been the subject of intense investigation. With considerable success, researchers have been trying to create computer chess programs to equal or surpass the best human chess players. Most of these programs are based on the idea of carrying out massive amounts of search through a huge tree of potential moves and counter-moves. The relatively few chess programs that have been based on knowledge-based approaches to playing chess, using ideas like chunking, planning, and pattern matching [1-5], have been unable to compete in playing competence with the search-based programs. This has meant that the promise of chess to be the "drosophila" (or fruit fly) for artificial intelligence [6] and cognitive science [7,8] has not been fulfilled. The search-based programs contribute interesting ideas to theories of heuristic search, but do not contribute beyond that to central issues in AI or cognitive science. The problem lies with the objective of creating a chess player: the goal has been achieved without resolving most of the fundamental AI questions.

In this article we change the goal from creating a chess player to developing an intelligent tutoring system that helps a human to learn chess. By definition, a chess tutoring system must help a learner develop strategies and plans for solving chess problems. The system must, therefore, both be able to reason using these strategies and be able to recognize appropriate and inappropriate strategies being used by the learner. The change of goals from player to tutor thus necessitates a shift of emphasis from performance issues to cognitive and epistemological issues. This means that explorations into chess tutoring have the potential to shed light on many basic AI problems, and thereby return chess to a more central role in the understanding of intelligence. The fruit fly may take wing again!

As a first step towards building a full-fledged chess tutor, a prototype system, UMRAO, has been developed for helping novice students understand how to solve bishop-pawn chess endgame problems. Twenty-two problems (from [9]) involving two white pawns, a white king, a black bishop, and a black king have been used for system development. UMRAO works by showing a configuration of pieces to the student and then asking the student to play white and find a sequence of moves that results in the best possible outcome. As with a human chess tutor, the chess tutoring system not only makes moves to counteract the student's moves, but it also tries to understand the strategy that the student may be using so as to be able to provide relevant advice to the student on his/her moves at any point as the game is played out. There are numerous interesting questions that arise in building such a tutoring system, among them how to represent chess knowledge in strategic terms; how to use this strategic knowledge to actually make appropriate moves in response to the student's moves (sometimes it may be better to make suboptimal moves for pedagogical reasons); how to use strategic knowledge in order to recognize student strategies, both correct and incorrect; when to provide advice to the student and when not to; what kinds of advice to provide; and how to make the whole system efficient and effective, given restricted computational resources.

The approach we take is similar to that of AL1 [2] and AL3 [10]. The advice language in AL1 generates pieces of advice in terms of goals and subgoals consistent with human chess problem solving. In AL3, Bratko [10] extended AL1 by introducing plans and improving the control structure of AL1. While Bratko's extensions aimed at facilitating the use of higher-order concepts for problem solving, our interest in tutoring required extensions to deal with issues related to novice chess strategies, plans, and misconceptions.

This research contributes both to chess and to tutoring. Although chess endgames are simpler than other parts of the game, they are still complex enough to be interesting from a chess point of view. There are many subtle and interesting strategies; one wrong step can lead to disaster, and the kinds of techniques that work for endgames may be generalizable to other parts of the game. Moreover, chess endgames are amenable to a knowledge-based approach, as shown by earlier research [2,11]. Chess is also a suitable domain for intelligent tutoring system (ITS) research. It is well defined while at the same time being nontrivial. Chess problems have well-formed solutions, consisting of various interacting strategies. Chess is not as simple as the gaming environments of WEST [12] nor as complex as the programming environments of PROUST [13] or SCENT [14]. Chess falls into an interesting middle ground between these two poles.

Finally, there are practical reasons for investigating automated chess tutoring. Ideally, a chess tutoring system would be able to provide advice to the student that leads him/her to understand strategic chess thinking and to generalize from examples, while at the same time being adaptable to the individual needs of the student. Such a system would be a better learning tool than a search-based chess player not designed to aid human learning, since such chess-playing programs only play a game and do not provide any feedback on higher-level strategic concepts with which the novice should be concerned. A chess tutoring system would also be better than standard chess textbooks, since these textbooks do not interact with the student, nor do they cater to individual variance in the ability and experience of students.

2. System Design and Methodology

Before UMRAO was developed, efforts were first made to understand novice chess skills. Five novice chess players (players sufficiently versed in chess to begin making strategic decisions) were informally "tutored" in order to collect think-aloud problem-solving protocols. Each was asked to solve a set of 22 bishop-pawn endgame problems while their problem-solving behavior as well as the human expert tutoring behavior were recorded. These protocols were analyzed in order to extract important design features for UMRAO. The following principal design characteristics were identified:

• the system must be able to model correct and incorrect solution strategies;
• advice to students should be given in context;
• such advice should be conceptual, using higher-level concepts such as plans and goals instead of simple feedback specifying move correctness or giving the next correct move;
• students should be given an opportunity to explore the consequences of incorrect strategies.

The next stage was to actually design and implement UMRAO, based upon the architecture shown in figure 1. UMRAO consists of two parts: the EXPERT and the TUTOR. The EXPERT is run only once for each new bishop-pawn endgame problem. For each such problem it compiles a strategy graph representing plausible novice strategies and substrategies as well as expert-level strategies and counter-strategies. The TUTOR runs each time a novice wants to work on a particular problem. It uses the precompiled strategy graph to track and predict novice and expert moves while tutoring individual students.

[Figure 1: diagram showing two groups of task specialists, (feature detector, move generator, and constraint evaluator) and (plan recognizer, control executor, and feedback generator); the remaining labels are not legible.]

Fig. 1. The architecture of UMRAO.

In order to compile its strategy graph, the EXPERT uses a knowledge base called the plan library. The plan library contains expert- and novice-level plans in order to model correct as well as flawed lines of play. The plan representation is based on Michie's advice definition in AL1 [2] modified for representing novice misconceptions and faulty reasoning processes. The EXPERT uses the plan library to construct its strategy graph for a given problem. The graph is built recursively from the initial board position. Plausible novice and expert strategies are instantiated from plans in the plan library by matching features of the board position to the applicability predicate and feasibility predicate associated with the plans. These strategies constrain the wide range of possible moves to the few plausible moves the novice is likely to carry out. These plausible moves, in turn, lead to a new set of board positions when it is the system's turn to move. Using only expert-level plans from the plan library, optimal system strategies are created for each of these board positions, leading to another level of board positions to be considered by the novice. The process continues until an entire strategy graph is generated.

Whenever a student wants to work on a particular problem, the TUTOR uses the strategy graph associated with that problem to understand novice moves in the context of his or her strategies. Because the TUTOR tracks the novice's strategy (or strategies) at any point, it can comment on shortcomings in this strategy (or strategies). It is also able to provide explanations to the novice contrasting the expert and novice strategies, because it knows what an expert chess player would do in the same situation (the preferred expert-level strategy is marked in the graph). Note that the process of generating an explanation is very efficient, since the strategy graph has been generated long before the tutoring session takes place. The TUTOR only tracks through the graph; it does not build it.

Rather than forcing its advice on the student, UMRAO can give him or her the choice of what feedback, if any, to select. The student can just play on with no feedback, can ask for a hint, can request a detailed description of his or her problem, can ask for the winning strategy, or, if there is no winning strategy, can ask the tutor to back up to the last place where a winning strategy could still be employed. We are still experimenting with the feedback features of UMRAO as we observe what students like to do as they interact with the system.

In the following two sections we describe in more detail the EXPERT and TUTOR modules.

3. The EXPERT Module

The goal of the EXPERT module is, for a given bishop-pawn endgame, to generate a strategy graph to be used by the TUTOR module in representing the strategic implications at any point as the endgame is played out. The strategy graph is essentially a game tree, pruned of any branches not justified by valid plans, and annotated by the plans that were used to generate the nodes and branches in the tree. In order to build a strategy graph, the EXPERT module has a knowledge base of typical plans that could be used by novices or experts to solve bishop-pawn endgames. For a given endgame, the EXPERT employs a planning system that elaborates the strategy graph by applying all relevant plans recursively from the initial board position until the game ends in win, lose, or draw.

3.1. Plans

Plans are the backbone of UMRAO. They play an important role in defining expert, novice, and ill-formed strategies in a single framework. Moreover, through notions such as goals, constraints, patterns, etc., they provide the knowledge necessary for generating feedback to the student. Finally, they form the basis for interaction with the student by suggesting situations appropriate for tutor intervention. The representation of plans in UMRAO is based on Michie's advice structure [2]. A plan is an object with slots representing various aspects of the plan: side to play (black or white), type of plan (expert or novice), applicability predicate, feasibility predicate, better-goals, holding-goals, move-constraints for both sides, and decidability of the plan. An applicability predicate consists of board features, such as the pawn formation (blocked pawns, passed pawns, etc.), that if present would favor selection of the plan. Feasibility predicates give the likelihood for the success of a plan. The criteria for success or the purposes of the plan are described by its better-goals. The criteria for failure of the plan are defined by its holding-goals. For example, if a side has a passed pawn (applicability) and that pawn is not controlled by the bishop (feasibility), then the plan can be to "queen the pawn" (better-goal); at the same time, the pawn should be safe from attack (holding-goal), and the success of the plan decides the game (decidability). The move-constraints are represented as Mcx and Mcy, where Mcx defines the constraints on the moves for the side to play and Mcy defines the constraints for the opponent's moves in order to satisfy the goals (better-goals and holding-goals) of the plan.
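The slot structure described above can be pictured as a simple record. The following is a minimal sketch in Python, not UMRAO's actual implementation: the class name `Plan`, the field names, and the feature names `passed-pawn` and `controlled-by-bishop` are assumptions chosen to mirror the prose, and the instance encodes only the "queen the pawn" example from the text.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Plan:
    """One plan object; each field mirrors a slot named in the text."""
    identification: str
    to_play: tuple        # (side, level), e.g. ("white", "expert")
    applicability: Any    # board features that favor selecting the plan
    feasibility: Any      # features indicating the plan can succeed
    better_goal: Any      # criterion for success
    holding_goal: Any     # must stay true, otherwise the plan fails
    mcx: Any              # move-constraints for the side to play
    mcy: Any              # move-constraints for the opponent
    decidability: bool    # True if success of the plan decides the game


# Hypothetical encoding of the passed-pawn example; feature names are invented.
queen_the_pawn = Plan(
    identification="EXAMPLE",
    to_play=("white", "expert"),
    applicability=("passed-pawn", "P"),
    feasibility=("not", ("controlled-by-bishop", "P")),
    better_goal=("queen", "P"),
    holding_goal=("safe", "P"),
    mcx=("destination", "P", ("queening-square", "P")),
    mcy=None,
    decidability=True,
)
```

Keeping every slot as plain nested data, rather than as code, is what lets one set of slots describe both expert-level and novice-level plans.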

Although the plan representation is similar to Michie's advice structure, there are some significant differences worth mentioning. The most important difference is the addition of the three additional slots: applicability predicate, feasibility predicate, and decidability. These slots add further modularity and efficiency to the system. The knowledge stored in these slots corresponds to the knowledge stored in various methods and lemmas in Michie's framework. The applicability predicate and feasibility predicate represent the condition part of methods whose action part is a piece of advice. The knowledge stored in the decidability slot corresponds to an inference rule, which states that achievement of the better-goal contained in a particular piece of advice results in the outcome of a particular plan sequence. The other important difference from Michie's approach is the capability of defining not only expert-level plans but also novice-level plans using the same set of slots.

Figure 3 shows an example of a novice-level and an expert-level plan applicable in the board position shown in figure 2. The novice level plan can be summarized: "If you have double pawns and the first pawn is prevented by the bishop from queening: Exchange the first pawn with the bishop and queen the second pawn." The expert level plan can be summarized: "If you have double pawns and the first pawn is prevented by the bishop from queening and the opponent's king can reach the second pawn: Exchange the first pawn with the bishop and queen the second pawn, but in the meantime keep the opponent's king away."

Fig. 2. White to play and win [9].

First consider the expert plan in figure 3: plan W6. The applicability slot is a conjunction of three board features that must all be present in a given board position for this plan to be applicable. The first feature is (double-pawn P1 P2), which matches when there are two pawns in the same column. The second feature is (not (feasible W1)), which matches when a particular plan W1 (attempting to queen the pawn P1) is not feasible. This feature specifies a kind of plan ordering, which in this case suggests that only if plan W1 is not feasible should plan W6 be considered. The third feature is a geometric feature to check that the black king threatens the second of the double pawns. Although in this example the features are conjoined, any boolean connective (and, or, not) can be used in elaborating the applicable features. These features are matched by a general feature detector that consists of various methods that together can "read" a given feature and check whether that feature is present in a given board position. The feature detector is, in effect, a global interpreter for features, which allows the features themselves to be encoded declaratively. In this way procedural knowledge has been encoded in a declarative fashion. This ability to represent a variety of procedural knowledge in a declarative form adds to the representational power of the system.
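The idea of a feature detector acting as a global interpreter over declaratively encoded features can be sketched as a small recursive evaluator. This is an illustrative sketch only: the board encoding, the `DETECTORS` table, and the geometric test inside `black_king_threatens` are assumptions made for the example, and the position used is invented (it is not the position of figure 2).

```python
# Features are nested tuples: (name, arg, ...). The connectives and, or, not
# are handled by the interpreter; any other name is looked up in DETECTORS.

def double_pawn(board, p1, p2):
    """True when both pawns stand on the same file (column)."""
    return board["pieces"][p1][0] == board["pieces"][p2][0]

def feasible(board, plan_id):
    """Toy stand-in: feasibility of other plans is precomputed on the board."""
    return plan_id in board["feasible_plans"]

def black_king_threatens(board, pawn):
    """Toy stand-in: the black king is within two squares of the pawn."""
    kf, kr = board["pieces"]["BK"]
    pf, pr = board["pieces"][pawn]
    return max(abs(kf - pf), abs(kr - pr)) <= 2

DETECTORS = {
    "double-pawn": double_pawn,
    "feasible": feasible,
    "black-king-threatens": black_king_threatens,
}

def holds(feature, board):
    """Interpret one declarative feature against a board position."""
    op, *args = feature
    if op == "and":
        return all(holds(f, board) for f in args)
    if op == "or":
        return any(holds(f, board) for f in args)
    if op == "not":
        return not holds(args[0], board)
    return DETECTORS[op](board, *args)

W5_APPLICABILITY = ("and", ("double-pawn", "P1", "P2"), ("not", ("feasible", "W1")))
W6_APPLICABILITY = ("and", ("double-pawn", "P1", "P2"), ("not", ("feasible", "W1")),
                    ("black-king-threatens", "P2"))

# Invented positions as (file, rank) pairs, 0-indexed.
near = {"pieces": {"P1": (2, 5), "P2": (2, 4), "BK": (4, 4)}, "feasible_plans": set()}
far = {"pieces": {"P1": (2, 5), "P2": (2, 4), "BK": (7, 0)}, "feasible_plans": set()}
```

With these toy positions, `holds(W6_APPLICABILITY, near)` is true, while on `far` only the W5 applicability condition still matches, because the third conjunct fails.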

A plan can be applicable but not feasible in a given board position. There is thus a need for a separate slot to encode feasibility conditions. Feasibility conditions are also checked for by the feature detector. In plan W6 the feasibility slot is a conjunction of two features. The first feature (can-support WK (bishop-square P1)) suggests that the white king must be able to support a particular square, in this case the square that is common to both P1 and the bishop. To express this square requires the use of a nested subfeature (bishop-square P1). This ability to make composite features adds to the power of the feature representation language. The second feature in plan W6's feasibility slot is (can-prevent WK BK P2), which matches if the white king is able to prevent the black king from reaching the neighborhood of P2, the second pawn.

A novice plan relevant to the position in Figure 2

    ((Identification W5)
     (ToPlay        (white student))
     (Applicability (and (double-pawn P1 P2) (not (feasible W1))))
     (Feasibility   (can-support WK (bishop-square P1)))
     (Better-Goal   (and (exchange P1 B) (queen P2)))
     (Holding-Goal  (and (safe WK) (safe-from P1 B) (safe P2)))
     (Mcx           (and (destination P1 (queening-square P1))
                         (destination WK (near (bishop-square P1)))
                         (destination P2 (promotion-square P2))))
     (Mcy           nil)
     (Decidability  t))

An expert plan relevant to the same situation

    ((Identification W6)
     (ToPlay        (white expert))
     (Applicability (and (double-pawn P1 P2) (not (feasible W1)) (black-king-threatens P2)))
     (Feasibility   (and (can-support WK (bishop-square P1)) (can-prevent WK BK P2)))
     (Better-Goal   (and (exchange P1 B) (queen P2)))
     (Holding-Goal  (and (safe WK) (safe-from P1 B) (safe P2)))
     (Mcx           (and (destination P1 (queening-square P1))
                         (destination WK (near (bishop-square P1)))
                         (destination P2 (promotion-square P2))))
     (Mcy           ((move-toward BK P2)))
     (Decidability  t))

Fig. 3. Examples of expert- and novice-level plans.

The better-goal represents the criterion for success of the plan. In W6, the better goal is to exchange the first pawn for the bishop, and then to queen the second pawn. The holding-goal rep- resents criteria that must hold true throughout the plan in order for it not to fail. The holding goal for plan W6 is this: the white king must be safe throughout; the first pawn, P1, must not be taken by the bishop; and the second pawn, P2, must also be safe.

A move-constraint generally specifies the desirable movement of a particular piece in order to execute a particular plan. The movement of the piece is specified in terms of a square (or a group of squares) having a particular feature. Whenever UMRAO is considering whether a move should be made that is appropriate to a plan, the move must be evaluated with respect to the appropriate move-constraints of that plan. White moves must be evaluated relative to move-constraints represented in Mcx, and black moves relative to move-constraints in Mcy. This evaluation is carried out by the constraint evaluator. For example, in plan W6 the move-constraint Mcx (destination P1 (queening-square P1)) directs the sequence of pawn movements toward the queening square for pawn P1. Another more complex Mcx move-constraint (destination WK (near (bishop-square P1))) specifies that the white king should be moved to one of the squares near the square on the pawn's path controlled by the bishop. The third move-constraint in Mcx (destination P2 (promotion-square P2)) suggests that pawn P2 should move forward. The Mcy move-constraints are constraints on the movements of the opponent's pieces. In this case, the black king should be moving towards pawn P2.

Now compare the expert plan W6 and the novice plan W5. The plans have the same goals, as shown by the fact that their better and holding goals are the same. The difference between them starts with the difference in their applicability conditions. The novice has not considered/perceived the necessity of being concerned with a threat to P2 from the black king, as indicated by the absent (black-king-threatens P2) applicability condition. As a result the novice does not check for the feasibility of the black king capturing the second pawn, which produces a less adequate set of move-constraints (Mcx and Mcy). This, in turn, means that against a skillful player plan W5 will not succeed, where plan W6 will. In general a variety of novice-level plans can be generated by making different types of standard deviations from the expert-level plans. For example, if there is a correct plan Wx, then a novice strategy can be made from Wx by making the feasibility slot of Wx true to represent a novice plan where the student does not check the feasibility of Wx. Of course, not all such deviations may be meaningful, and they must be supported by empirical observation of actual novice behavior.
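The "make the feasibility slot true" deviation mentioned above can be sketched as a small transformation on a plan record. This is an illustration under stated assumptions: plans are represented here as Python dictionaries, the function name `skip_feasibility_check` and the identifier `W6-no-feasibility` are hypothetical, and only a few of W6's slots are copied from figure 3. Note that this derived plan is not W5, which differs from W6 in its applicability and Mcy slots as well.

```python
import copy

# A subset of the expert plan W6 from figure 3, encoded as nested tuples.
W6 = {
    "identification": "W6",
    "to_play": ("white", "expert"),
    "feasibility": ("and",
                    ("can-support", "WK", ("bishop-square", "P1")),
                    ("can-prevent", "WK", "BK", "P2")),
    "better_goal": ("and", ("exchange", "P1", "B"), ("queen", "P2")),
    "holding_goal": ("and", ("safe", "WK"), ("safe-from", "P1", "B"), ("safe", "P2")),
}

def skip_feasibility_check(expert_plan, novice_id):
    """Derive a novice plan whose feasibility slot is always true."""
    novice = copy.deepcopy(expert_plan)      # leave the expert plan untouched
    novice["identification"] = novice_id
    novice["to_play"] = (expert_plan["to_play"][0], "student")
    novice["feasibility"] = True             # the student never checks feasibility
    return novice

novice_plan = skip_feasibility_check(W6, "W6-no-feasibility")
```

The goals are carried over unchanged, which matches the observation that novice and expert plans can share goals while differing in the conditions they check.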

3.2. The Strategy Graph

For a given board position, the EXPERT module must produce a strategy graph that elaborates all possible plan-justified moves and counter-moves through to the end of the game. Each node in the strategy graph is called a cnode and, as in a game tree, corresponds to a board position and an indication of who is to move (white or black). However, unlike a conventional game tree, attached to the cnode are a number of knowledge-based attributes. First is a list of strategies that apply at this board position, where a strategy is just an instantiated plan whose applicability and feasibility slots have been matched by features in the current board position. In turn, the goals and move-constraints of each such instantiated plan imply that certain plausible moves can be carried out in the current board position. A specialist called the move generator is responsible for generating these possible moves. Made up of methods for generating pawn moves, bishop moves, and king moves, the move generator creates a move plan for each possible move in a given board position for a given strategy. These move plans contain information about how the better goal, holding goal, and move constraints have been satisfied, as well as an indication of the move that the move plan represents. Move plans are attached to the strategies from which they are generated. Finally, linked to a cnode are branches to successor board positions, one for each valid move represented in the move plans. Of course, a given move may be generated by more than one plan, so it is necessary to indicate with each branch which plans were ultimately responsible for mandating that move.

The root cnode represents the initial board position where it is always the student's turn to make a move. The strategy graph then alternates with each play between expert cnodes and student cnodes, as moves and counter-moves are made. The strategy graph eventually terminates when the game is over, win, lose, or draw. The basic philosophy for generating moves at each cnode is the same, whether it is a student or expert cnode. However, the available plans are different for each type of cnode. At student cnodes, only student-oriented plans for playing white need to be looked at. The particular set depends on the level of expertise the students using UMRAO are deemed to have. If the students are generally novices, then only novice plans need be included; if the students are more expert or more varied in ability, then more plans will be needed. In its current state, UMRAO assumes varied ability on the part of the students, and therefore provides a fairly large repertoire of plans. In the future, it would be better to have this set be individualized to each student through use of a student model. Right now, all plausible moves must be generated at a student cnode in order that the behavior of as many students as possible can be tracked. At expert cnodes, the situation is different. There needs to be only a small number of expert plans that represent optimal play for black. Moreover, there needs to be only one move actually chosen, since UMRAO has the choice of what move to make at an expert cnode.
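The alternation between student and expert cnodes can be sketched as a short recursive builder. This is a sketch under several assumptions, not the EXPERT module itself: positions are plain integers in a toy game rather than chess boards, the callbacks `student_moves`, `expert_moves`, and `is_terminal` are hypothetical stand-ins for the plan-driven move generator, the plan labels `Wa`, `Wb`, and `Bx` are invented, and the expert's preferred reply is assumed to be listed first.

```python
from dataclasses import dataclass, field


@dataclass
class CNode:
    position: object
    side: str                                      # "student" or "expert"
    branches: dict = field(default_factory=dict)   # move -> (plans, child CNode)


def build(position, side, student_moves, expert_moves, is_terminal):
    """Expand a strategy graph, alternating sides until the game ends."""
    node = CNode(position, side)
    if is_terminal(position):
        return node
    if side == "student":
        options = student_moves(position)      # keep every plan-justified move
        other = "expert"
    else:
        options = expert_moves(position)[:1]   # keep only the chosen reply
        other = "student"
    for move, plans, nxt in options:
        child = build(nxt, other, student_moves, expert_moves, is_terminal)
        node.branches[move] = (plans, child)
    return node


def all_nodes(node):
    """Yield every cnode reachable from the given one."""
    yield node
    for _, child in node.branches.values():
        yield from all_nodes(child)


# Toy game: a counter that ends once it reaches 3.
root = build(
    0, "student",
    student_moves=lambda p: [("a", ["Wa"], p + 1), ("b", ["Wb"], p + 2)],
    expert_moves=lambda p: [("x", ["Bx"], p + 1), ("y", ["Bx"], p + 2)],
    is_terminal=lambda p: p >= 3,
)
```

In the resulting structure every student cnode fans out to all plausible moves, while every expert cnode has at most one outgoing branch, each labeled with the plans that justify it.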

An example of the structure of a strategy graph is given in figure 4, which shows a portion of the strategy graph generated for the chess endgame problem shown in figure 2. Cnode C0 represents the initial board position; it is detailed in terms of its slots in figure 5. Four plans apply at this position: W1, W5, W6, and W4. Each of them is instantiated to corresponding strategies S1, S2, S3, and S4. The instantiation is indicated by truth conditions attached to the applicability, feasibility, better-goal and holding-goal slots of the corresponding plan, as can be seen in figure 6 showing strategy S2. S1 has affiliated with it only one move plan, Mp1, that mandates move m1 be made on the board. S2 and S3 each have three move plans, and these move plans suggest moves m1, m2, and m3, respectively. Finally, S4 has one move plan, directing that move m4 be made. To illustrate the idea of move plans, Mp4 for strategy S2 is shown in figure 7. Values associated with each move constraint reflect the progress the move makes towards meeting that constraint. An overall evaluation of the move is derived from the values of the move constraints. Note that move m1 is generated by three different plans (W1, W5, W6); moves m2 and m3 are generated by both plans W5 and W6; and move m4 is generated by only plan W4. Thus, during tutoring, in recognizing the strategic thinking underlying a student's move, UMRAO is not always certain which plan is behind a given move. Only if the student makes move m4 will the plan be unambiguous.

[Figure 4: strategy graph diagram. Legible branch labels include m4,W4; m1,W1,W5,W6; m2,W5,W6; m3,W5,W6; m5,B2; m6,B2; m7,B2; m9,W5.]

Fig. 4. An example strategy graph.

    Identification: C0
    Configuration:  (initial board position shown in Figure 2)
    SideToMove:     White - Student
    Parent:         nil
    Descendants:    ((m4 C1) (m1 C2) (m2 C3) (m3 C4))
    Strategies:     (S1 S2 S3 S4)
    Moves:          (m1 m2 m3 m4)
    BestResponse:   ((m3 S3))
    Result:         (White nil)

Fig. 5. The Cnode C0.

    Identification: S2
    From Plan:      W5
    Applicability:  ((and (double-pawn P1 P2) (not (feasible W1))) t)
    Feasibility:    ((can-support WK (bishop-square P1)) t)
    Better-Goal:    ((and (exchange P1 B) (queen P2)) nil)
    Holding-Goal:   ((and (safe WK) (safe-from P1 B) (safe P2)) t)
    Mcx:            (and (destination P1 (bishop-square P1))
                         (destination WK (near (bishop-square P1)))
                         (destination P2 (promotion-square P2)))
    Mcy:            nil
    Decidability:   t
    MovePlanObj:    (Mp2 Mp3 Mp4)

Fig. 6. The strategy S2.

The four different moves result in four possible successor board positions: C1, C2, C3, and C4. The branches to these successor cnodes are labeled with the move name as well as the plan(s) that generated the moves. Many nodes and branches in figure 4 are only shown skeletally: many strategies, move plans, and branches have been left out for reasons of space. For example, all moves beneath node C1 have been left out. It is important to note, however, that there is in fact only one branch from an expert cnode. For example, move m5 is the only move that black will consider making from position C4. Note that this move, as for all of black's moves, is also justified by a plan (B2). This is useful to UMRAO during tutoring for explaining the system's own strategic thinking.

Identification: Mp4
Move: m3
Plan: W5
Better-Goal: ((and (exchange P1 B) (queen P2)) nil)
Holding-Goal: ((and (safe WK) (safe-from P1 B) (safe P2)) t)
Mcx: (((destination P1 (bishop-square P1)) 0)
      ((destination WK (near (bishop-square P1))) 1)
      ((destination P2 (promotion-square P2)) 0))
Mcy: 0
Evaluation: 1

Fig. 7. The move plan object Mp4.
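The slot structures of figures 5 and 7 can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the class and field names are illustrative, not UMRAO's own, and the paper does not give the formula that derives Evaluation from the move constraints. The sketch assumes a plain sum, which happens to reproduce the value 1 shown for Mp4.

```python
from dataclasses import dataclass, field


@dataclass
class MovePlan:
    """One move suggested by a strategy (slot names loosely follow Fig. 7)."""
    ident: str
    move: str
    plan: str
    mcx: dict          # move constraint -> progress value
    mcy: int = 0

    def evaluation(self) -> int:
        # ASSUMPTION: a plain sum of constraint values; the paper only says
        # the evaluation is "derived from" them.
        return sum(self.mcx.values()) + self.mcy


@dataclass
class Cnode:
    """A board position in the strategy graph (slot names loosely follow Fig. 5)."""
    ident: str
    side_to_move: str
    descendants: dict = field(default_factory=dict)     # move -> successor cnode id
    plans_for_move: dict = field(default_factory=dict)  # move -> generating plans


mp4 = MovePlan(
    ident="Mp4", move="m3", plan="W5",
    mcx={
        "(destination P1 (bishop-square P1))": 0,
        "(destination WK (near (bishop-square P1)))": 1,
        "(destination P2 (promotion-square P2))": 0,
    },
)

c0 = Cnode(
    ident="C0", side_to_move="White - Student",
    descendants={"m4": "C1", "m1": "C2", "m2": "C3", "m3": "C4"},
    plans_for_move={
        "m1": ["W1", "W5", "W6"],
        "m2": ["W5", "W6"],
        "m3": ["W5", "W6"],
        "m4": ["W4"],
    },
)


def unambiguous_moves(cnode: Cnode) -> list:
    """Moves that exactly one plan generates, so the plan behind them is certain."""
    return sorted(m for m, plans in cnode.plans_for_move.items() if len(plans) == 1)
```

With the values from the example, `unambiguous_moves(c0)` contains only `m4`, matching the observation that only move m4 identifies its plan unambiguously.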

3.3. Generating the Strategy Graph

The planning system used for generating the strategy graph is based on a depth-first search algorithm over the space of plausible plans a chess player could use (unlike the usual search-based programs, which search over the space of moves). This is very similar to other knowledge-based chess-playing programs. However, the tutoring domain also necessitates some changes to the standard knowledge-based approach. Typical knowledge-based chess programs have to analyze only the best or most promising plans at any time. They do not have to look for other plans until the current plan must be abandoned for some reason. In contrast, UMRAO has to analyze all plausible plans at every student cnode. This is because the planning behavior of a novice differs from that of an expert. In other words, all the applicable plans have to be analyzed for their consequences. Even for the expert cnodes, static analysis is carried out at every cnode. Because of this additional requirement of analyzing faulty plans, a different design for the planning system had to be devised.

The planner consists of two modules: ExSearch and StSearch. The ExSearch module is applied when it is the system's turn to move, while StSearch is applied when a student's possible responses to system moves have to be generated and analyzed. ExSearch is very similar to move-generation algorithms in the usual knowledge-based chess programs. It analyzes all the moves that correspond to feasible plans until it gets one that results in a favorable position for the system. It then attaches the analysis of the best move to the evolving strategy graph. In contrast, StSearch must analyze all plausible student responses exhaustively, regardless of feasibility, since the strategy graph must reflect as wide a set of student moves as possible. All of these plausible student moves are incorporated into the strategy graph. ExSearch and StSearch are applied alternately as the strategy graph is elaborated.
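The alternation between the two modules can be sketched as a depth-first recursion. The code below is an illustration under assumptions, not UMRAO's planner: a plan is modeled as a hypothetical function from a position to (move, successor) pairs, the toy "game" is just counting, and the fallback in `ex_search` when no favorable move exists is an invented convention.

```python
def st_search(position, plans):
    """StSearch sketch: keep every move that any student plan proposes."""
    found = {}
    for plan_name, propose in plans.items():
        for move, successor in propose(position):
            plan_names, _ = found.setdefault(move, ([], successor))
            plan_names.append(plan_name)
    return found


def ex_search(position, plans, favorable):
    """ExSearch sketch: stop at the first move that reaches a favorable position."""
    fallback = {}
    for plan_name, propose in plans.items():
        for move, successor in propose(position):
            if favorable(successor):
                return {move: ([plan_name], successor)}
            if not fallback:  # ASSUMPTION: otherwise play the first move found
                fallback = {move: ([plan_name], successor)}
    return fallback


def build_graph(position, student_to_move, depth, domain):
    """Depth-first elaboration, alternating StSearch and ExSearch."""
    node = {"position": position,
            "side": "student" if student_to_move else "expert",
            "branches": {}}
    if depth == 0 or domain["is_terminal"](position):
        return node
    if student_to_move:
        candidates = st_search(position, domain["student_plans"])
    else:
        candidates = ex_search(position, domain["expert_plans"], domain["favorable"])
    for move, (plan_names, successor) in candidates.items():
        child = build_graph(successor, not student_to_move, depth - 1, domain)
        node["branches"][move] = {"plans": plan_names, "child": child}
    return node


# Toy domain (entirely hypothetical): a position is an integer, a move adds 1 or 2.
TOY = {
    "student_plans": {
        "W_step": lambda p: [("+1", p + 1)],
        "W_jump": lambda p: [("+2", p + 2)],
        "W_safe": lambda p: [("+1", p + 1)],
    },
    "expert_plans": {"B_reply": lambda p: [("+1", p + 1), ("+2", p + 2)]},
    "favorable": lambda p: p % 3 == 0,
    "is_terminal": lambda p: p >= 6,
}

graph = build_graph(0, True, 2, TOY)
```

In the resulting graph the student cnode keeps both moves, with "+1" labeled by two plans, while each expert cnode keeps a single branch, which is the asymmetry the text describes.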

Although our implementation is serial, it might be possible to achieve a degree of parallelism in the generation of the strategy graph. We are not talking here of generating nodes in parallel, which would soon lose its effectiveness as the graph exploded, but are only talking about the possibility of checking out the plans that might plausibly apply at a given node in parallel. This would be strictly bounded by the number of plans, which presumably would be both finite in number and fixed throughout the planning process. Another possible improvement in efficiency would be to prune the repertoire of plausible plans, especially at student nodes, according to expected characteristics of the students to be tutored. Novice students would, for example, not have access to certain plans available to more experienced students of chess. To implement this, however, would imply that several strategy graphs would have to be generated for each problem, one for each class of students to be tutored. It also would require the TUTOR module to have considerably enhanced diagnostic skills in order to be able to tell when the student had advanced to a new level of play, so that the appropriate new strategy graph could be "paged in." If enough efficiency could be achieved through these kinds of techniques, it might prove possible to run the EXPERT in real time as the TUTOR needs information about the student's current plans. The strategy graph would not be generated separately in advance, but would be generated dynamically as needed. This would considerably enhance the flexibility of the entire tutoring system, and would, paradoxically, allow further speed-ups in the search, since the same rules that apply to ExSearch could now be applied to StSearch: only if it seems that the student is straying from the plan we know him/her to be following need StSearch look broadly into alternative moves at a given node. Thus, only the student moves relevant to the current plan need be generated, not all plausible moves. Unfortunately, right now the planning system is too slow for such "real-time" experimentation. Although generating this graph is slow, the resulting graph can be accessed very quickly by the TUTOR, thus making the tutoring system as a whole efficient enough to run in real time. Fortunately, at least in the endgame scenarios considered here, it is possible to generate a fairly complete strategy graph. The comprehensiveness of the strategy graph is dependent upon the completeness of the plan library. If plausible plans have been omitted from the plan library, the strategy graph might not be able to anticipate or account for certain move sequences.

4. The TUTOR Module

Whereas the EXPERT module generates all the chess expertise required to help a student with a particular chess endgame problem, the TUTOR module actually carries out the tutoring. The student can choose to play any of the endgame problems from the problem library. The goal of the TUTOR module is to challenge students to solve new problems while monitoring and commenting upon their actions. The system can recognize optimal, less than optimal, or clearly irrelevant moves. The student continues problem solving while the TUTOR offers help, hints, explanations, and tutoring advice when needed or requested. The main pedagogical goal underlying the design of the TUTOR module is to be a partner and co-solver of problems with the student, who is encouraged to experiment with various strategies.

A variety of tutoring styles can be implemented with the tools provided in UMRAO. Four tutoring styles have been implemented and tested. In immediate feedback with strict tutor control, UMRAO immediately explains the student's strategy and describes a more suitable strategy as soon as the student makes a suboptimal move. This is similar to the immediate feedback pedagogical style used in many of Anderson's tutors [15]. In immediate feedback with student initiative, UMRAO alerts the student when a suboptimal move is made, but offers options for the student to explore a faulty line of play, or to request a hint or an explanation of the current suboptimal strategy or the best strategy. Optional feedback with strict tutor control does not allow the student to deviate from the path of optimal play, but satisfies students who want the tutor to provide explanations only when requested. Optional feedback with student initiative provides the student with a full set of options as to desired feedback and gives the student the ability to choose when to play and whether to undo the previous move. Once play reaches a terminal position, that is, an obvious win, loss, or draw, an obvious blunder by the student, or a move that deviates from any known strategy, the system always takes initiative to force the student to make an appropriate choice.

Fig. 8. Tutoring activity net (TAN).

4.1. The Tutoring Activity Net

The TUTOR conducts the tutoring session with the help of a control executor task-specialist that follows a control flow expressed in the Tutoring Activity Net (TAN) shown in figure 8. The TUTOR starts the tutoring session by asking the student to select a problem. When the student selects a particular problem, the TUTOR enters the TAN at the Introduce state.
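The control flow of the TAN, as described state by state in sections 4.1.1 to 4.1.6, can be read as a small transition table. The sketch below is one possible reading, not UMRAO's control executor: the event names are hypothetical labels, and the two transitions into Conclude are assumptions, since the text says only that Conclude is entered when a terminal node of the strategy graph is reached.

```python
# (state, event) -> next state. State names follow sections 4.1.1-4.1.6;
# event names are hypothetical labels introduced for this sketch.
TAN = {
    ("Introduce", "problem_shown"): "Get Move",
    ("Get Move", "invalid_move"): "Get Move",
    ("Get Move", "valid_move"): "Recognize Strategy",
    ("Recognize Strategy", "strategy_selected"): "Generate Feedback",
    ("Generate Feedback", "take_back"): "Get Move",
    ("Generate Feedback", "play"): "Make Move",
    ("Make Move", "moved"): "Get Move",
    # ASSUMPTION: a terminal cnode can be detected after either side's move.
    ("Generate Feedback", "terminal"): "Conclude",
    ("Make Move", "terminal"): "Conclude",
    ("Conclude", "replay"): "Get Move",
}


def run(events, state="Introduce"):
    """Follow the TAN from `state`, returning the sequence of states visited."""
    trace = [state]
    for event in events:
        state = TAN[(state, event)]
        trace.append(state)
    return trace
```

For example, `run(["problem_shown", "valid_move", "strategy_selected", "take_back"])` ends back in the Get Move state, which is the Take Back loop described in section 4.1.4.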

4.1.1. Introduce State. In this state, the TUTOR displays the selected endgame problem on the computer screen, gives a brief introduction to the problem, and focuses on the root cnode of the strategy graph for the selected problem. The student is asked to play a particular color, and the other side is played by the tutor. The TUTOR then branches to the Get Move state. Figure 9 shows the screen for the endgame problem shown in figure 2.

4.1.2. Get Move State. In the Get Move state, the student is asked to make a move by dragging a piece using the mouse to an appropriate new position. The TUTOR checks for the validity of the move (i.e., that it is one of the moves the tutor expects from the strategy graph), and if the move is invalid the student is asked to make another move. UMRAO distinguishes between a move that is invalid because it is illegal and one that is invalid because the student is not moving according to any recognizable plan. Illegal student moves are immediately undone by UMRAO.
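The two kinds of invalid move can be told apart with two membership tests. This is a minimal sketch, assuming the set of legal moves and the set of moves recorded at the current cnode are both available as collections; the function name and return labels are illustrative only.

```python
def classify_move(move, legal_moves, graph_moves):
    """Classify a student move as 'illegal', 'unrecognized', or 'valid'."""
    if move not in legal_moves:
        return "illegal"        # breaks the rules of chess: undone immediately
    if move not in graph_moves:
        return "unrecognized"   # legal, but justified by no plan in the graph
    return "valid"              # one of the moves the strategy graph expects


# Moves m1..m4 are those recorded at cnode C0; "m9" stands for a hypothetical
# legal move that no plan generates, and "mX" for a hypothetical illegal one.
graph_moves_c0 = {"m1", "m2", "m3", "m4"}
legal_moves_c0 = graph_moves_c0 | {"m9"}
```

Here `classify_move("m9", legal_moves_c0, graph_moves_c0)` yields `"unrecognized"`, while `"mX"` is classified as `"illegal"`.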

Forcing the student to follow pre-specified plans is an attribute of all "model tracing" tutors (such as those of Anderson et al. [15]). Such tutors cannot allow the student to stray outside the range of behavior that is expected (i.e., to go beyond the models the system has of possible student behavior). Where UMRAO is an advance over previous model-tracing tutors is in the quality of its model (i.e., the strategy graph) and in UMRAO's ability to generate this model automatically. Because the semantics of the chess domain are simpler than for the LISP, geometry, and algebra domains explored by Anderson et al., we are able to capture a wider range of possible student behavior in the UMRAO strategy graph than Anderson and his co-workers are able to capture in their production systems for their domains. This means that the student can be allowed to stray further afield in UMRAO than in Anderson's tutors. Because of this, we can experiment with tutoring styles other than immediate feedback with strict tutor control (see below). The generality and flexibility of our "model" is also enhanced by the fact that the strategy graph can be semiautomatically generated. Unlike other model-tracing tutors, for each new task the EXPERT can generate the strategy graph automatically. Our only knowledge engineering is to come up with the plan library for the EXPERT to use. Moreover, the strategy graph can be generated in a separate phase from the tutoring. This eliminates the necessity that other model-tracing tutors have to restrict the range of models in order to achieve real-time efficiency during the tutoring process. The creation of a sophisticated strategy graph (a process that is slow) can be done off-line from the use of that strategy graph for tutoring (a process that must be fast).

Fig. 9. Initial screen for sample bishop-pawn game "play white and win".

4.1.3. Recognize Strategy State. After getting a valid student move, the TUTOR branches to the Recognize Strategy state in the TAN where the student's plausible strategy is recognized with the help of the strategy recognizer, a task specialist responsible for associating a set of plausible strategies with a given student move. These strategies can be more or less read directly from the strategy graph. For example, in the graph shown in figure 4, if the student makes move m2, the strategy the student is using could be either S2 (corresponding to plan W5) or S3 (corresponding to plan W6). This set is passed on to the control executor of the TUTOR.

Once the set of plausible strategies is obtained corresponding to a student move, the next step is to select from this set of plausible strategies one most likely strategy on which to comment. This is a difficult credit assignment problem. A lot of information is required to single out one strategy from the set of plausible strategies, such as the skill level of the student, default knowledge about students of a particular level, student history, etc. This emphasizes the need for a sophisticated student model. At present no such student model exists in UMRAO, and as a result, a single strategy cannot necessarily be identified. At present, the system has some simple heuristics for selecting one strategy out of a set of plausible strategies for a student move for the purposes of generating advice. For example, if at a previous node the student has unambiguously been following a certain strategy (information available from the student history), then if this strategy is among the current options, it can be chosen. If even these heuristics fail, the system presents a menu of plausible strategies, and the student is asked to select the appropriate strategy. This points out an important limitation in UMRAO: the student may not find his current strategy in the menu, the student may be prompted to adopt a different strategy by having seen it, or the student may not understand one or more items in the menu. Further research is needed to overcome this limitation.
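The recognition and selection steps can be sketched as a lookup followed by a fallback chain. The table below encodes the move-to-strategy labels for cnode C0 given earlier (W1, W5, W6, W4 instantiated as S1, S2, S3, S4); the function names, the representation of student history as a list of unambiguously recognized strategies, and the `ask_student` callback are assumptions made for illustration.

```python
# Plausible strategies behind each move at cnode C0, read off the strategy graph.
STRATEGIES_FOR_MOVE = {
    "m1": ["S1", "S2", "S3"],
    "m2": ["S2", "S3"],
    "m3": ["S2", "S3"],
    "m4": ["S4"],
}


def plausible_strategies(move):
    """Strategy recognizer sketch: the set of strategies that justify `move`."""
    return STRATEGIES_FOR_MOVE.get(move, [])


def select_strategy(candidates, history, ask_student):
    """Pick one strategy to comment on.

    history: strategies recognized unambiguously at earlier nodes, oldest first
             (ASSUMPTION about how the student history is stored).
    ask_student: callback that shows a menu and returns the student's choice.
    """
    if len(candidates) == 1:
        return candidates[0]
    for past in reversed(history):      # heuristic: prefer a strategy the
        if past in candidates:          # student was already known to follow
            return past
    return ask_student(candidates)      # last resort: the menu
```

With an empty history, a move such as m2 falls through to the menu, which is exactly the case whose limitations the text discusses.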

4.1.4. Generate Feedback State. After figuring out the selected strategy, the TUTOR comes to the Generate Feedback state of the TAN, where it may comment about the appropriateness of the strategy underlying the student's move. If the student is following an ideal strategy, then UMRAO plays on without interruption. If a suboptimal move is made by the student, depending on the pedagogical style that is in effect, comments can be suppressed, made compulsorily by the TUTOR, or provided optionally to the student if he/she clicks the appropriate button on the screen.

The feedback options include Hint to Winning Variation, Explain Winning Variation, Best Move, Best Move and Strategy, Explain Strategy, Take Back, and Play. In the Hint to Winning Variation option, the TUTOR provides an appropriate hint about what the student should try to do in the given board position. In the Explain Winning Variation option, the TUTOR explains the winning variation in more detail. The TUTOR gives the correct next move when the Best Move option is selected. In the Best Move and Strategy option, the TUTOR not only provides the best move but also explains the corresponding strategy. The student can request a detailed explanation of a particular plausible strategy (among all strategies that are plausible in the current board position) through the Explain Strategy option. If the student wants to try a move other than the one just played, he/she selects the Take Back option, which branches the TUTOR back to the Get Move state of the TAN. If he/she wants to continue to play the game after a particular move, then he/she must select the Play option, and the TUTOR branches to the Make Move state in order to make its own move.

Verbalizations for the Hint to Winning Variation, Explain Winning Variation, Best Move and Strategy, and Explain Strategy options are produced through use of a set of feedback templates and a dictionary function that relates various predicates to standard English forms. For example, to Explain Strategy for any strategy object S, the system uses a template similar to the following:

(You have recognized the following features on the board
 (dictionary (plans-applicable S))
 and the goal is to:
 (dictionary (plans-better-goal S)))

In the template, the call to the plans-applicable function returns the applicability-condition slot of strategy S, while the plans-better-goal function returns the better-goal slot value of S. The function dictionary converts the system predicates returned by plans-applicable and plans-better-goal into English.
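The template mechanism can be imitated in Python with predicates written as nested tuples. This is a sketch only: the phrase table is hypothetical (the two goal phrases echo the hint wording shown in figure 11, the others are invented), and the handling of `and` and `not` is an assumption about what a dictionary function of this kind would need to do.

```python
# Hypothetical predicate-to-English table; UMRAO's actual wording is not given.
PHRASES = {
    ("double-pawn", "P1", "P2"): "your two pawns are doubled",
    ("feasible", "W1"): "plan W1 is feasible",
    ("exchange", "P1", "B"): "exchange the first pawn",
    ("queen", "P2"): "queen the second pawn",
}


def dictionary(predicate):
    """Turn a nested predicate tuple into an English phrase."""
    operator = predicate[0]
    if operator == "and":
        return " and ".join(dictionary(p) for p in predicate[1:])
    if operator == "not":
        return "it is not the case that " + dictionary(predicate[1])
    return PHRASES[predicate]


def explain_strategy(strategy):
    """Fill the Explain Strategy template from a strategy's slots."""
    return ("You have recognized the following features on the board: "
            + dictionary(strategy["applicability"])
            + ", and the goal is to: "
            + dictionary(strategy["better_goal"]) + ".")


# Slot values for strategy S2, taken from figure 6.
S2 = {
    "applicability": ("and", ("double-pawn", "P1", "P2"),
                      ("not", ("feasible", "W1"))),
    "better_goal": ("and", ("exchange", "P1", "B"), ("queen", "P2")),
}
```

Calling `explain_strategy(S2)` produces a sentence whose goal clause reads "exchange the first pawn and queen the second pawn".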

4.1.5. Make Move State. In the Make Move state, the TUTOR generates its own counter-move to the student's move. This simply involves reading off the single correct move from the strategy graph and updating the board on the screen, before branching to the Get Move state for the student's next move. The Make Move state could be made considerably more interesting in two ways. First, it might prove to be useful pedagogically if the EXPERT were to produce more than one move for the TUTOR to choose (a task that would make the EXPERT's job considerably more difficult). Novice students, for example, could benefit from having the system play suboptimally in order for them to explore the chess problem more deeply before they make a fatal error (in all of the chess problems in Averbach [9], one early wrong move by the student is invariably fatal against a clever opponent). The second interesting possibility in the Make Move state would be not to branch immediately to get the next student move, but instead branch back to a Generate Feedback state and have the student be able to query the system about its move. This might prove to be beneficial to students approaching experthood who might gain from understanding the system's reasoning. If these extensions were made, however, it would be much more difficult to decide what to do in the Make Move state. It would require, among other things, reference to a student model to determine what would be appropriate for the particular student being tutored.

4.1.6. Conclude State. At some point in the game, no more moves will be possible. This occurs when a terminal node of the strategy graph is reached. In this situation the TUTOR moves to the Conclude state of the TAN. The student is told either that he/she has successfully solved the endgame problem, or has failed to solve the problem. In the latter case, the student is given the option of replaying the game from the point where the student could still have made moves to win. This position is easy for the TUTOR to compute, since it need only look back from terminal nodes in the strategy graph representing winning positions for the student, and intersect them with the deepest cnode that was actually encountered during play. If the student decides to go back, the TUTOR refocuses on this cnode, the board is appropriately redrawn on the screen, and the TUTOR branches to the Get Move state. If the student declines to go back, or if he/she has been successful, the TUTOR offers the student the option of quitting, playing the game again from the start, or choosing another endgame problem to work on.
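The replay position described above can be computed by walking Parent links back from the winning terminal cnodes and then scanning the played path from its deepest cnode. The sketch below assumes the graph is available as a child-to-parent mapping (after the Parent slot in figure 5); the function names and the small example tree, including cnodes C5 to C8, are hypothetical.

```python
def still_winnable(parent, winning_terminals):
    """All cnodes from which some winning terminal for the student is reachable."""
    reachable = set()
    for node in winning_terminals:
        # Walk up the Parent links until the root or an already-visited cnode.
        while node is not None and node not in reachable:
            reachable.add(node)
            node = parent.get(node)
    return reachable


def replay_point(played_path, parent, winning_terminals):
    """Deepest cnode on the played path from which the student could still win."""
    winnable = still_winnable(parent, winning_terminals)
    for cnode in reversed(played_path):
        if cnode in winnable:
            return cnode
    return None


# Hypothetical fragment: C0 is the root; C6 and C8 are non-winning terminals,
# C7 is a winning terminal for the student.
PARENT = {"C0": None, "C1": "C0", "C4": "C0", "C5": "C4",
          "C6": "C5", "C7": "C5", "C8": "C1"}
```

In this fragment a game played along C0, C4, C5, C6 would be resumed at C5, whereas a game that left the winning line at the first move (C0, C1, C8) would be resumed at C0.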


4.2. Implementing a Variety of Pedagogical Styles

A variety of tutoring styles can be implemented with the tools provided in UMRAO. Four of the tutoring styles have been implemented and tested by manipulating appropriate options like Explain Strategy, Hint to Winning Variation, Explain Winning Variation, Best Move and Strategy, Best Move, Take Back, and Play. These four tutoring styles are given below.

4.2.1. Immediate Feedback and Strict Tutor Control. This style is similar to that employed in the Anderson et al. [15] tutors. It can be simulated by limiting selection to the Hint to Winning Variation, Explain Winning Variation, Explain Strategy, and Best Move options. In this case, as soon as the student makes an incorrect move, the system immediately explains the student's strategy and the best strategy by carrying out Explain Strategy, Best Move and Strategy, Best Move, and Hint to Winning Variation. At the same time the Take Back and Play options are disabled, forcing the student to move at the tutor's pace without a chance to rethink his or her previous move.

4.2.2. Immediate Feedback and Student Initiative. This style can be simulated by permitting selection of any of the Hint to Winning Variation, Explain Winning Variation, Explain Strategy, Best Move, Take Back, or Play options. Notice that in this case, the Take Back and Play options are enabled. The Play option lets a student explore a particular line of play at the student's own pace. The Take Back option lets a student try different moves at the same position with feedback on each move. As soon as the student makes an incorrect move, the system explains the current strategy and the best strategy, without waiting to be asked by the student.

4.2.3. Optional Feedback and Strict Tutor Control. Often students seem to prefer limited feedback and want the tutor to explain only when asked. This style of tutoring can be simulated by providing feedback only when the student selects any of the options Hint to Winning Variation, Explain Winning Variation, Explain Strategy, and Best Move. In order to maintain strict tutor control, the Take Back and Play options are disabled.

4.2.4. Optional Feedback and Student Initiative. This style can be simulated by enabling all of the options Hint to Winning Variation, Best Move and Strategy, Explain Strategy, Best Move, Take Back, and Play. These options are selected by the student whenever feedback is requested or a chance to take back the move is desired. Once the play reaches a terminal position, the system takes the initiative to tell the student the outcome of the game and suggest possible actions for the student, including backing up to previous positions to replay parts of the game.
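The four styles differ only in which options are enabled and in whether the tutor comments without being asked, so they can be summarized as a small configuration table. The sketch below copies the option lists from sections 4.2.1 to 4.2.4 as written; the style keys, the `Style` class, and the `must_comment` helper are illustrative names, not part of UMRAO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    immediate: bool      # tutor comments on a suboptimal move without being asked
    options: frozenset   # options the student may select


BASIC = {"Hint to Winning Variation", "Explain Winning Variation",
         "Explain Strategy", "Best Move"}

STYLES = {
    "immediate/strict": Style(True, frozenset(BASIC)),
    "immediate/student": Style(True, frozenset(BASIC | {"Take Back", "Play"})),
    "optional/strict": Style(False, frozenset(BASIC)),
    # Option list exactly as enumerated in section 4.2.4.
    "optional/student": Style(False, frozenset({
        "Hint to Winning Variation", "Best Move and Strategy",
        "Explain Strategy", "Best Move", "Take Back", "Play"})),
}


def must_comment(style_name, move_is_optimal):
    """True when the tutor should volunteer feedback on the move just played."""
    return STYLES[style_name].immediate and not move_is_optimal
```

Under this encoding, strict tutor control is simply the absence of Take Back and Play from a style's option set.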

We have not, as yet, performed many experiments with students using UMRAO under these different tutorial styles. Most of our experimentation to date has involved using optional feedback and student initiative. This has been the style most used with UMRAO, since it comes closest to making UMRAO a collaborator with the student, rather than a dictatorial expert. The other styles have been described here only to indicate the potential for UMRAO to serve as a vehicle for experimentation with various approaches to tutoring. We show in the next section some excerpts from a tutoring interaction.

4.3. Excerpts from a Tutoring Interaction

To give a flavor of the tutoring interaction that has been implemented, consider a sample tutoring session for the bishop-pawn endgame originally presented in figure 2. Figure 10 shows a screen where the student has chosen a move that is not the best move (as indicated in the feedback window at the bottom of the screen). In line with the optional feedback/student initiative tutoring style, the student is presented with a menu from which to choose the feedback and play options desired.

Figures 11 and 12 show some of the types of feedback provided by the system in response to a request for a hint (figure 11) and for a full explanation of the winning strategy (figure 12). Note the different levels of detail in the description provided.

Fig. 10. Student has chosen a sub-optimal move. (The screen shows the option menu HINT TO WINNING VARIATION, EXPLAIN WINNING VARIATION, BEST MOVE, BEST MOVE AND STRATEGY, EXPLAIN STRATEGY, TAKE BACK, PLAY, and the feedback line "...this is not the best move...please select from one of the above...")

Suppose that the student ignores the system's advice and decides to pursue a line of suboptimal play. The system reacts to the student's move, and after some time, a position is reached where it becomes obvious why the earlier move was not correct. (As shown in figure 13, after a pawn is lost there is no possible win for White). Play then returns to an earlier board position where there is a way for the student to correctly solve the problem. In the complete session, as the student tries out various moves, the TUTOR tries to recognize the plan behind these moves and offers a variety of feedback.

The interaction between the student and tutor is both exploratory and tutorial in nature. UMRAO's ability to engage in flexible model tracing enables the student to explore both correct and incorrect strategies. UMRAO demonstrates that representing and reasoning about expert and novice plans can provide an environment suitable for student learning.

Fig. 11. Hint feedback. (The screen shows the hint "TRY TO EXCHANGE THE FIRST PAWN AND QUEEN THE SECOND PAWN.")


Fig. 12. Strategy explanation feedback. (The screen shows the explanation "TRY TO (MOVE YOUR FIRST PAWN TO [...] YOUR KING NEAR THE B-SQUARE OF FIRST PAWN (AS B8 B7) TO SUPPORT THE FIRST PAWN FOR EXCHANGE THEN PROMOTE THE SECOND PAWN WITH THE SUPPORT OF KING) AT THE SAME TIME (KEEP THE BLACK KING AWAY FROM YOUR SECOND PAWN)")

5. Conclusion

UMRAO makes a number of contributions to both chess and tutoring. First, UMRAO extends traditional notions of knowledge-based chess. UMRAO has an interesting hybrid planning technique that is like traditional knowledge-based approaches in its search for optimal plans when it is the expert's turn to move, but is original in its need to consider all plausible plans, even suboptimal or incorrect ones, when it is the student's turn to move. Moreover, UMRAO delineates an elegant separation between plans and strategies. Plans are problem independent and stored in a plan library. Strategies are problem-specific instantiations of plans, automatically created by the system in response to particular board positions. This separation is used by UMRAO so that it can compile a particular strategy graph that is tailored to each endgame problem. Although this is a fairly slow search (in the order of a few minutes on a SPARCstation), the compilation of strategy graphs can be done "off-line" so that these graphs can later be efficiently used in real-time student-tutor interaction.

Fig. 13. White can no longer win. (The screen shows the feedback "BLACK is able to draw the game; BL2 successful (NO PAWN CAN QUEEN! BISHOP IS PREVENTING THE FIRST PAWN AND BLACK KING IS PREVENTING THE SECOND PAWN) Lets go back to pre. pos" and the line "...my move is... E5 - D6")

UMRAO also makes contributions to intelligent tutoring. UMRAO shows how the model-tracing tutoring methodology can be made flexible and responsive to the student, instead of rigid and dictatorial as it is often perceived to be. UMRAO also provides a laboratory for experiments as to the effectiveness of various kinds of tutoring strategies: the system's deep understanding of chess strategies allows tutor-controlled or student-controlled pedagogy and also allows a choice between immediate and delayed feedback to the student.

Finally, there are general contributions of this research that transcend particular subdisciplines of artificial intelligence. The tutoring domain re-establishes chess as a natural exploratory environment for ideas other than search, and provides the area of intelligent tutoring systems with a perfect domain for exploring deep ideas in diagnosis and pedagogy without the complexities of other domains. With further development UMRAO may also prove to be a practical contribution to the teaching of chess, more responsive and adaptable than a book, an expert that can not only present chess problems to students, but also solve them and comment strategically on them.

UMRAO is, of course, still a prototype system. Enhancements in both chess capabilities and tutoring abilities are needed. In chess, new bishop-pawn endgame problems must be added to UMRAO's repertoire, a process that likely will also necessitate a gradual increase in the number of plans in the plan library. Experience with the problems considered so far has shown that the number of extra plans that needs to be added for each new problem is decreasing (see [16] for a detailed description). There is hope that the plan library can be made virtually complete for the bishop-pawn endgame domain. In such an event, adding a new bishop-pawn endgame problem would be easy, simply requiring the automatic creation of a new strategy graph by the UMRAO EXPERT.

A more difficult problem, of course, would be extending UMRAO beyond bishop-pawn endgames. The plan library would have to be considerably enhanced to allow for strategic concepts involving new pieces and new patterns of pieces on the board. This should be possible for other, fairly sparse, endgame situations, but to handle the middle game or beginning game would be difficult. In these situations the UMRAO approach would have to be extensively revamped. It is problematical whether a pre-compiled strategy graph could represent the complexities of the interactions in these situations. Even if it could, the number of possible plans and plan variations would be huge, as would the strategy graph. What would be needed in these situations would be the ability to create new plans automatically (i.e., to learn them), preferably on-line in reaction to changing game circumstances. This would seem to be very difficult, so a possible alternative would be to see if plans could be described at multiple grain sizes (akin to programming strategies in the LISP domain in SCENT [17]), so that the strategy graph would still be compilable at a coarse-grained level, if not in fine-grained detail. Even this may prove problematical, however, as is argued in the next paragraph.

Extensions to the tutoring methodology will also be needed if UMRAO is to be generalized. One of the most significant problems with the current version of UMRAO is the occasional lapse in coverage in the strategy graph. Currently, the UMRAO TUTOR stops the student from making any move that is not justified by a strategy. This is a familiar problem with any model-tracing tutoring methodology, and is a difficult one to overcome unless there is complete coverage of all possible strategies. As argued in the context of our work in the program tutoring domain [18], it is possible that granularity might help in this situation by providing coarser-grained models that cover a wider range of behavior. The application of granularity to the chess domain is difficult, however, since specific student moves must be recognized in the context of chess plans, and even a coarse-grained plan must therefore predict specific moves. Moreover, unlike in the programming domain where recognizing a student programming strategy is a "one-shot" enterprise, in the chess domain all future moves that derive from this student move must also be allowed for, including moves that UMRAO itself must make. Thus, it seems that at the very least, UMRAO must be given the ability to do fine-grained analysis in real time, on-line, as the tutoring proceeds, perhaps using standard game-tree search algorithms if the plan library fails to provide any detailed support. Of course, it is an open question what kind of feedback to give to the student if his/her moves are understood using lookahead search of some sort rather than plans. Still, it is an intriguing possibility to combine search-based and knowledge-based approaches to gain better flexibility in chess tutoring.
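The combination of plan-based recognition with a search fallback might be sketched as follows. This is only an illustration under stated assumptions, not UMRAO's implementation: the strategy graph is assumed to be a dictionary from a position identifier to the moves that some strategy justifies, `search_fallback` is a hypothetical stand-in for a game-tree search that judges whether a move is acceptable, and the position, move, and strategy labels are invented.

```python
from typing import Callable, Dict, List, Tuple

# Assumed representation: position id -> {move: [names of strategies justifying it]}
StrategyGraph = Dict[str, Dict[str, List[str]]]


def interpret_move(
    graph: StrategyGraph,
    position: str,
    move: str,
    search_fallback: Callable[[str, str], bool],
) -> Tuple[str, List[str]]:
    """Classify a student move as plan-justified, search-justified, or unexplained."""
    strategies = graph.get(position, {}).get(move, [])
    if strategies:
        # The pre-compiled graph covers the move: plan-based feedback is possible.
        return ("plan", strategies)
    if search_fallback(position, move):
        # No plan covers the move, but lookahead judges it acceptable.
        return ("search", [])
    # Neither the plans nor the search can account for the move.
    return ("unexplained", [])


# Toy data; all labels here are hypothetical.
toy_graph: StrategyGraph = {"p0": {"Kd5": ["W5"], "Bc4": ["W2", "W5"]}}


def toy_search(position: str, move: str) -> bool:
    """Placeholder for a real game-tree search."""
    return move == "Kc5"


if __name__ == "__main__":
    for candidate in ("Bc4", "Kc5", "Ka1"):
        print(candidate, interpret_move(toy_graph, "p0", candidate, toy_search))
```

A move classified as "search" is exactly the case the open question above concerns: the tutor can let the move stand, but has no strategy in whose terms to explain it.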

A related problem faced by the TUTOR is the problem of student intentions differing from student actions. Sometimes, students may intend to make a move that is justified by certain strategic considerations, but may not be able to apply their strategic thought correctly to a given board position. In such situations, the TUTOR is likely to conclude that students are using a different strategy than they think they are using. Advice provided to a student based on such an erroneous assumption will likely be quite confusing. At the very least, the TUTOR should probably more clearly feed back to the student its assumptions about the student's strategy. Of course, this is made more difficult by the problem of how to express the essence of the strategy to the student in terms the student can understand (telling him/her that we feel he/she is using strategy W5 is not very useful!). At best, the TUTOR should be prepared to enter into a dialogue with the student about mutual strategic assumptions in order to clarify the student's own beliefs about his or her reasoning. This is done now only when the TUTOR is unable to identify a unique strategy using its heuristics, and the feedback is through artificially phrased menus from which the student must select. To make this interaction more flexible, it should be in natural language, which raises all the complexity of full-scale natural-language dialogue. In the short term, these problems may be partially overcome by being more careful in devising the pre-stored templates that are provided to the student so that they are understandable to typical novice chess players. In the longer term, there will probably have to be a student model that helps to provide context for the analysis of student behavior.
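The short-term remedy of pre-stored templates could look roughly like the sketch below. It is a hypothetical illustration only: the template wording is invented and does not describe what UMRAO's strategies actually mean, and the function names are assumptions. The tutor states its assumption in plain language when one strategy is identified, and offers a menu when several candidates remain.

```python
from typing import Dict, List

# Hypothetical templates; the descriptions are invented for illustration
# and do not reflect the real content of any UMRAO strategy.
TEMPLATES: Dict[str, str] = {
    "W5": "advance the pawn while your king shields it",
    "W2": "use the bishop to control the promotion square",
}


def describe(strategy_id: str) -> str:
    """Return a novice-readable phrase, falling back to the raw label."""
    return TEMPLATES.get(strategy_id, f"follow strategy {strategy_id}")


def feedback(candidates: List[str]) -> str:
    """State the tutor's assumption, or ask the student to choose."""
    if not candidates:
        return "I could not relate that move to any plan I know."
    if len(candidates) == 1:
        return f"It looks like you are trying to {describe(candidates[0])}."
    lines = ["Which of these were you trying to do?"]
    for number, strategy_id in enumerate(candidates, start=1):
        lines.append(f"  {number}. {describe(strategy_id)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(feedback(["W5"]))
    print(feedback(["W2", "W5"]))
```

Keeping the internal labels out of the student-facing text is the point of the design: the label W5 only ever appears when no template has been written for it.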

Even in its current state, however, the UMRAO prototype demonstrates the feasibility of having a knowledge-based chess tutor. Perhaps the most important contribution of UMRAO is the development of a framework to combine the research done in the field of intelligent tutoring systems (ITS), cognition in chess, and knowledge-based chess. Such a chess tutor can act as a testbed for various theories about expert and novice chess skills, can allow the exploration of knowledge representation and reasoning schemes for knowledge-based chess in a new and realistic domain, and, finally, can allow experimentation with various tutoring strategies in ITS. The main feature of the implemented design of UMRAO is its modularity and simplicity, qualifying it to be a good experimental system for exploring these various issues. In sum, this research provides a good starting point for a long-term project that can illuminate various AI problems and can act as a framework to bring the research done by chess cognitive scientists and knowledge engineers under a common umbrella.

Acknowledgments

We would like to thank the Natural Sciences and Engineering Research Council of Canada for financially supporting this research and the University of Saskatchewan for providing a scholarship to the first author during his graduate program.

References

1. H. Berliner and M. Campbell, "Using chunking to solve chess pawn endgames," Artif. Intell., vol. 23, pp. 97-120, 1984.

2. D. Michie, "A theory of advice," in Machine Intelligence, vol. 8, edited by T. Elcock and D. Michie, Edinburgh University Press: Edinburgh, pp. 30-59, 1977.

3. J. Pitrat, "A chess combination program which uses plans," Artif. Intell., vol. 8, pp. 275-321, 1977.

4. D.E. Wilkins, "Using patterns and plans in chess," Artif. Intell., vol. 14, pp. 165-203, 1980.

5. J. Schaeffer, "Long-range planning in computer chess," Proc. 1983 Annu. Conf. ACM, 1983, pp. 170-179.

6. R. Reddy, "Foundations and grand challenges of artificial intelligence," AI Magazine, vol. 9, no. 4, pp. 9-21, 1988.

7. N. Charness, "Expertise in chess and bridge," in Complex Information Processing: The Impact of H. A. Simon, edited by D. Klahr and K. Kotovsky, Lawrence Erlbaum: Hillsdale, NJ, pp. 183-208, 1989.

8. H. Simon and W. Chase, "Skill in chess," Am. Sci., vol. 61, no. 4, pp. 394-403, 1973.

9. Y. Averbach, Comprehensive Chess Endings, Vol. 1, Academic Press: New York, 1980.

10. I. Bratko, "Knowledge-based problem solving in AL3," in Machine Intelligence, Vol. 10, edited by D. Michie and J.H. Pao, Ellis Horwood: Chichester, 1982.

11. I. Bratko and T. Niblett, "Conjectures and refutations in a framework for chess endgame knowledge," in Expert Systems in the Micro-Electronic Age, edited by D. Michie, Edinburgh University Press: Edinburgh, 1979.

12. R. Burton and J.S. Brown, "An investigation of computer coaching for informal learning activities," in Intelligent Tutoring Systems, edited by D. Sleeman and J.S. Brown, Academic Press: New York, pp. 79-98, 1982.

13. W.L. Johnson and E.M. Soloway, "PROUST: Knowledge-based program debugging," Proc. Seventh Int. Software Eng. Conf., Orlando, FL, 1984, pp. 369-380.

14. G.I. McCalla, J.E. Greer, and the SCENT Research Team, "Intelligent advising in problem solving domains: the SCENT-3 architecture," Int. Conf. Intelligent Tutoring Systems, Montreal, 1988, pp. 124-131.

15. J.R. Anderson, F. Boyle, A. Corbett, and M. Lewis, "Cognitive modeling and intelligent tutoring," Artif. Intell., vol. 42, pp. 7-49, 1990.

16. D. Gadwal, "UMRAO: A chess endgame tutor," M.Sc. Thesis, Department of Computational Science, University of Saskatchewan, 1990.

17. G.I. McCalla, J.E. Greer, B. Barrie, and P. Pospisil, "Granularity hierarchies," Int. J. Comput. Math., Special Issue on Semantic Networks, vol. 23, pp. 363-375, 1992.

18. G.I. McCalla, J.E. Greer, and R. Coulman, "Enhancing the robustness of model-based recognition," Third International Workshop on User Modelling, Germany, pp. 240-248, 1992.

Dinesh Gadwal received a B.Tech. from the Indian Institute of Technology in New Delhi in 1986 and an M.Sc. from the University of Saskatchewan in 1990. He is now a Research Associate with the Department of Computational Science at the University of Saskatchewan. His research interests include artificial intelligence and software engineering.

Jim Greer received his Ph.D. from the University of Texas at Austin in 1987. He is an Associate Professor in Computational Science at the University of Saskatchewan. His interests are in the area of intelligent tutoring systems, user and student modeling, and knowledge representation.

Gordon McCalla is a Professor in and Head of the Department of Computational Science at the University of Saskatchewan. He has published over 70 research papers in various areas of artificial intelligence, with a particular interest in artificial intelligence and education. He is currently serving on the editorial boards of four different journals, including being co-editor in chief of Computational Intelligence.