Learning to rank moves in mahjong using SVM with tree kernels

Chikayama & Taura Lab.

M1 Ayato Miki


Outline

1. Introduction
2. Related work
3. Proposed method
◦ SVM with kernels
◦ Tree kernels in mahjong
◦ Learning to rank using SVM
4. Experiment
5. Conclusion

1. Introduction

Features = elements used to evaluate positions in games
◦ e.g. numbers and arrangements of pieces in Shogi

Difficulty in creating features
◦ Requires expert knowledge of the game
◦ Simple linear combinations are insufficient, e.g. XOR
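The XOR remark can be checked with a small sketch. The grid search, the weights and the product feature `x1 * x2` are illustrative assumptions, not part of the original slides: no linear evaluation function over the raw inputs separates XOR, while one added non-linear feature does.

```python
from itertools import product

# XOR truth table with labels in {-1, +1}.
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]


def linear_features(x):
    """Raw inputs only."""
    return (x[0], x[1])


def product_features(x):
    """Raw inputs plus one hand-made non-linear feature (assumption)."""
    return (x[0], x[1], x[0] * x[1])


def separates(weights, bias, data, features):
    """True if sign(w . f(x) + b) matches the label for every example."""
    return all(
        label * (sum(w * f for w, f in zip(weights, features(x))) + bias) > 0
        for x, label in data
    )


# Try every (w1, w2, b) on a coarse grid from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]
linear_ok = any(
    separates((w1, w2), b, XOR, linear_features)
    for w1, w2, b in product(grid, repeat=3)
)

# A single hand-picked weight vector is enough once x1 * x2 is available.
product_ok = separates((1, 1, -2), -0.5, XOR, product_features)

print(linear_ok, product_ok)
```

The grid search only illustrates the point; XOR is not linearly separable for any choice of weights.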

Objective

Use kernels as evaluation features in games
◦ Simple inputs: tree structures
◦ Expected to work as non-linear features: implicit classification in a high-dimensional feature space

Method overview

Classification of moves in mahjong using SVM with kernels
◦ “Evaluation functions > search” in mahjong: kernel methods are effective
◦ Tree kernels for tree structures of mahjong hands: similarity is represented by kernel functions
◦ Use expert game records: learn “expert moves > other moves” with SVM





2. Related work

Machine learning using simple features
◦ TD-Gammon [Tesauro, 1992]

Research on mahjong
◦ Learning from expert records [Kitagawa, 2007]

Machine learning in mahjong [Kitagawa, 2007]

Game      Mahjong
Features  Manually picked
Method    Bonanza method (without search)
Accuracy  56%

Mahjong features [Kitagawa, 2007]

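Features of player
◦ Concealed tiles in hand
◦ Combinations of 2 concealed tiles
◦ Combinations of 3 concealed tiles
◦ Composition and state of called tiles
◦ Number of melds
◦ Number of ryanmen (two-sided waits)
◦ Sum of the numbers of kanchan (closed waits) and penchan (edge waits)
◦ Number of pairs
◦ Whether the hand is tenpai (ready)
◦ Number of dora
◦ Whether the hand is concealed
◦ Whether the player is the dealer
◦ Whether the player has declared li-zhi
◦ Tiles the player has discarded

Features of opponents
◦ Composition and state of called tiles
◦ Number of calls
◦ Number of dora visible among called tiles
◦ Whether the opponent is the dealer
◦ Whether the opponent has declared li-zhi
◦ Tiles that are completely safe against that player
◦ Tiles that are relatively safe because of suji, kabe, etc.
◦ Point difference from the player

Features of field
◦ Whether it is the last hand of the game
◦ Number of remaining unseen tiles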





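Support Vector Machine (SVM) [Vapnik, 1965]

2-class linear classifier

$g(x) = w \cdot x + b$

[Figure: separating hyperplane $g(x) = 0$ between the margin boundaries $g(x) = 1$ and $g(x) = -1$, each at distance $1/\|w\|$ from the hyperplane]

Maximize the margin $2/\|w\|$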
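As a minimal sketch of these two quantities, the snippet below evaluates $g(x) = w \cdot x + b$ and the margin width $2/\|w\|$. The weight vector and bias are made-up numbers, and the function names are hypothetical.

```python
import math


def decision(w, b, x):
    """Evaluate g(x) = w . x + b; the predicted class is the sign of g(x)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_width(w):
    """Distance between the boundaries g(x) = 1 and g(x) = -1, i.e. 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Made-up parameters for illustration only.
w, b = (3.0, 4.0), -5.0

print(decision(w, b, (3.0, 4.0)))  # positive, so class +1
print(margin_width(w))
```

A smaller $\|w\|$ gives a wider margin, which is why training minimizes $\|w\|$ subject to the classification constraints.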

SVM with kernels [Cortes and Vapnik, 1995]

Method for non-linear classification

Explicit mapping: $x \mapsto \phi(x)$

Replace: $\phi(x_1) \cdot \phi(x_2) \rightarrow K(x_1, x_2)$
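The replacement step can be illustrated with a degree-2 polynomial kernel. This kernel is chosen only as a familiar example and is not the kernel used in this work: for 2-dimensional inputs, $K(x, y) = (x \cdot y)^2$ gives the same value as the dot product of the explicit features $\phi(x) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)$, without ever building $\phi$.

```python
import math


def dot(a, b):
    """Plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def phi(x):
    """Explicit degree-2 feature map for a 2-dimensional input."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """K(x, y) = (x . y)^2, computed in the original input space."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(y))   # map first, then take the dot product
implicit = poly2_kernel(x, y)    # kernel only, no feature map

print(explicit, implicit)
```

Both values agree up to floating-point rounding, so an SVM that only needs dot products can work in the feature space implicitly.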





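Tree structure

Hand
◦ Melds: concealed triplet, concealed sequence, open triplet, …
◦ Meld candidates: ryanmen (two-sided wait), kanchan (closed wait), pair, penchan (edge wait)
◦ Isolated tiles

Specific tiles as leaves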

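Example

[Figure: example tiles for each node type: isolated tile, concealed triplet, concealed sequence, ryanmen, kanchan, pair, penchan]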

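Tree kernels [Moschitti, 06]

Count common subtrees (SST)

[Figure: two hand trees (hand; melds, meld candidates, isolated tiles; concealed sequence, ryanmen, kanchan) and examples of the subtrees they share, such as a ryanmen node, a meld-candidates node with a ryanmen child, and a hand node with its melds and meld-candidates branches]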

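Subtree weight [Moschitti, 06]

Deep subtrees are not very important

[Figure: a melds node with children concealed sequence, concealed sequence, concealed triplet]

Weight of a subtree: $\lambda^{\text{depth}}$ $(0 < \lambda \le 1)$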

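Tree kernel function

$K_t(t_1, t_2) = \sum_{n_1 \in N_{t_1}} \sum_{n_2 \in N_{t_2}} \Delta(n_1, n_2)$

$\Delta(n_1, n_2) = \sum_{i=1}^{|F|} \lambda^{l(f_i)} I_i(n_1) I_i(n_2)$

◦ $N_t$: set of nodes in tree $t$
◦ $F = \{f_1, f_2, \dots, f_{|F|}\}$: subtree set
◦ $l(f_i)$: depth of subtree $f_i$
◦ $I_i(n) = 1$ if $f_i$ is rooted at node $n$, $0$ otherwise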
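The kernel above can be sketched in a few lines under simplifying assumptions. Here the subtree set $F$ contains only complete subtrees (a node with all of its descendants), which is narrower than the SST fragments counted by SVM-Light-TK. The depth of a single node is taken to be 1, children are treated as unordered, and the tile labels are hypothetical.

```python
from collections import Counter


def canon(tree):
    """Canonical copy of a (label, *children) tuple with children sorted."""
    label, *children = tree
    return (label, *sorted(canon(c) for c in children))


def depth(tree):
    """Number of levels in the subtree; a single node has depth 1 (assumption)."""
    return 1 + max((depth(c) for c in tree[1:]), default=0)


def subtrees(tree):
    """Yield the complete subtree rooted at every node of the tree."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)


def tree_kernel(t1, t2, lam=0.5):
    """Sum of lam ** depth(f) over all node pairs that root the same subtree f."""
    c1 = Counter(subtrees(canon(t1)))
    c2 = Counter(subtrees(canon(t2)))
    return sum(lam ** depth(f) * c1[f] * c2[f] for f in c1.keys() & c2.keys())


# Two toy hands that differ only in their isolated tile.
hand_a = ("hand",
          ("melds", ("concealed_sequence", ("2m",), ("3m",), ("4m",))),
          ("candidates", ("ryanmen", ("6p",), ("7p",))),
          ("isolated", ("9s",)))
hand_b = ("hand",
          ("melds", ("concealed_sequence", ("2m",), ("3m",), ("4m",))),
          ("candidates", ("ryanmen", ("6p",), ("7p",))),
          ("isolated", ("1s",)))

print(tree_kernel(hand_a, hand_b), tree_kernel(hand_a, hand_a))
```

Counting occurrences per subtree and multiplying the counts is equivalent to the double sum over node pairs, because each node roots exactly one complete subtree.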





Learning to rank [Shen et al. 03]

SVM is just a 2-class classifier
◦ How can it learn to rank?

If you want to know the ranks of three moves…

Input data -> pairs of moves: $(m_1, m_2), (m_1, m_3), (m_2, m_3)$

Order classifier ( > or < ): $m_1 > m_2,\ m_1 < m_3,\ m_2 < m_3$

Rank: $m_3 > m_1 > m_2$
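The three-move example can be sketched as follows. The slides do not say how the pairwise decisions are combined, so counting pairwise wins is an assumption here, and `toy_prefers` is a hypothetical stand-in for the trained order classifier.

```python
from itertools import combinations


def rank_moves(moves, prefers):
    """Rank moves by how many pairwise comparisons each one wins."""
    wins = {m: 0 for m in moves}
    for a, b in combinations(moves, 2):
        winner = a if prefers(a, b) else b
        wins[winner] += 1
    return sorted(moves, key=lambda m: wins[m], reverse=True)


# Hypothetical scores that reproduce m1 > m2, m1 < m3, m2 < m3.
TOY_SCORE = {"m1": 0.2, "m2": -0.7, "m3": 0.9}


def toy_prefers(a, b):
    """Stand-in for the order classifier: True when a should rank above b."""
    return TOY_SCORE[a] > TOY_SCORE[b]


ranking = rank_moves(["m1", "m2", "m3"], toy_prefers)
print(ranking)
```

With $n$ candidate moves this needs $n(n-1)/2$ classifier calls, which matters when each call is expensive.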

One training example has two tree instances

Learn and classify orders

Define a kernel function for relative order

$e_i = \langle t_i^l, t_i^r \rangle$

$K_r(e_1, e_2) = K_t(t_1^l, t_2^l) + K_t(t_1^r, t_2^r) - K_t(t_1^l, t_2^r) - K_t(t_1^r, t_2^l)$

Classify “$t^l > t^r$” and “$t^l < t^r$”
◦ Label +1 when $t^l > t^r$
◦ Label -1 when $t^l < t^r$
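A minimal sketch of the order kernel, assuming the sign pattern above. A plain dot product stands in for the tree kernel $K_t$ so that the result can be checked against the explicit difference of the two sides; the vectors are made-up values.

```python
def linear_kernel(a, b):
    """Stand-in base kernel (the slides use a tree kernel here)."""
    return sum(x * y for x, y in zip(a, b))


def preference_kernel(e1, e2, base_kernel):
    """K_r(e1, e2) for pairs e = (left, right) built from a base kernel."""
    (l1, r1), (l2, r2) = e1, e2
    return (base_kernel(l1, l2) + base_kernel(r1, r2)
            - base_kernel(l1, r2) - base_kernel(r1, l2))


# Made-up pairs of feature vectors.
e1 = ((1.0, 2.0), (0.0, 1.0))
e2 = ((2.0, 1.0), (1.0, 3.0))

value = preference_kernel(e1, e2, linear_kernel)

# With a linear base kernel, K_r equals (l1 - r1) . (l2 - r2).
d1 = tuple(a - b for a, b in zip(*e1))
d2 = tuple(a - b for a, b in zip(*e2))
print(value, linear_kernel(d1, d2))
```

Swapping the two sides of one pair flips the sign of the kernel value, which matches the +1 / -1 labeling of the two orders.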





4. Experiment

1. Experiment with the proposed method
2. Error analysis
3. Comparison with related work
4. Practical player

Environment

Machine spec
◦ Dual-Core AMD Opteron 2.4GHz
◦ 32GB RAM

Implementation
◦ SVM-Light-TK [Moschitti, 2004]: soft margin trade-off parameter C=0.1, optimization threshold ε=0.1

Method of experiment

Learn from tsumo positions in expert records
◦ “Offensive” positions only: nobody has declared “li-zhi”, and nobody has made 3 or more “chi”, “pon” or “kan” calls
◦ Using records of Totugeki Tohoku: ~285 games (~13,000 training positions)

Evaluation
◦ Accuracy rates of trained classifiers: are expert moves ranked as the best?
◦ 4-fold cross validation
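The evaluation criterion can be sketched as a top-k accuracy: the fraction of positions in which the expert's move is ranked within the top k candidates. The function name and the rank list below are made up for illustration and are not experimental data.

```python
def top_k_accuracy(expert_ranks, k):
    """Fraction of positions whose expert move is ranked within the top k.

    expert_ranks holds, per position, the 1-based rank that the classifier
    assigned to the move the expert actually played.
    """
    hits = sum(1 for rank in expert_ranks if rank <= k)
    return hits / len(expert_ranks)


# Made-up ranks for eight positions (not results from the experiment).
toy_ranks = [1, 3, 1, 2, 6, 1, 4, 1]

for k in (1, 2, 3, 5):
    print(f"rank 1-{k}: {top_k_accuracy(toy_ranks, k):.3f}")
```

With k = 1 this is the plain accuracy of choosing the expert's move; larger k gives the more lenient "rank 1-2", "rank 1-3" and "rank 1-5" measures.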

Accuracy rates

[Figure: accuracy rates (30% to 100%) versus number of training positions (0 to 14,000), with curves for rank 1, rank 1-2, rank 1-3 and rank 1-5]

Typical mistakes (1)

“Defensive” positions

Typical mistakes (2)

Positions that require knowledge of “yaku” (comparable to poker hands)

Comparison with SVM using linear features

Using features designed in [Kitagawa, 2007]
◦ Including board status information

Implementation
◦ Ranking SVM in SVM-Light [Joachims, 2002]

Accuracy rates with linear features

[Figure: accuracy rates (30% to 100%) versus number of training positions (0 to 14,000), with curves for rank 1, rank 1-2, rank 1-3 and rank 1-5]

Practical player?

Core2 Duo 1.06GHz
◦ 91819 support vectors
◦ 700ms for classification of one tree pair

7 seconds for deciding one move
◦ 4 seconds in dual threading
◦ Good enough for playing against human players





5. Conclusion

Classified ranks of moves with tree kernels
◦ Possible with simple input

57% accuracy
◦ Despite the lack of information on the field and opponents
◦ Increasing…

Fine accuracy with permissible cost

Future work

Classification analysis
◦ Positions that linear combinations cannot classify

Refine tree structure

Other information
◦ Hand information with other kernels: string kernels
◦ Information of field and opponents: add as linear combinations or other kernels

Heavy computing cost
◦ Classification time increases with the number of training positions
◦ Indispensable in other games
