Decision Trees

Keio University, web.sfc.keio.ac.jp/~maunz/DM09/DM09-07.pdf


Page 1

Decision Trees

Page 2

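Information content of an event with probability $p$: $-\log_2 p$

Entropy (average information) over events with probabilities $p_i$: $H = -\sum_i p_i \log_2 p_i$

Example 1: $M = 2^n$, $p_i = 1/2^n$

Example 2: $M = 2^n$, $p_1 = 1$, $p_2 = p_3 = \dots = p_{2^n} = 0$

Example 3: $M = n$, $p_i = 1/2^i$ ($1 \le i \le n-1$), $p_n = 1/2^{n-1}$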

As $n \to \infty$, the entropy of Example 3 approaches 2.

[Figures: bar charts of $p_i$ against $i$]
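The three example distributions can be checked numerically. The following is a minimal sketch, assuming the entropy is the usual Shannon entropy in bits; the function names `entropy`, `example1`, `example2`, and `example3` are illustrative and do not come from the slides.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; outcomes with p = 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def example1(n):
    """2**n equally likely outcomes."""
    m = 2 ** n
    return [1 / m] * m

def example2(n):
    """One certain outcome among 2**n."""
    m = 2 ** n
    return [1.0] + [0.0] * (m - 1)

def example3(n):
    """p_i = 1/2**i for i = 1..n-1, and p_n = 1/2**(n-1)."""
    return [1 / 2 ** i for i in range(1, n)] + [1 / 2 ** (n - 1)]

for n in (3, 4, 10):
    print(n, entropy(example1(n)), entropy(example2(n)), entropy(example3(n)))
```

Under these assumptions Example 1 gives an entropy of $n$ bits, Example 2 gives 0, and Example 3 stays below 2 while moving closer to it as $n$ grows.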

Page 3

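Example 3: $M = n$, $p_i = 1/2^i$ ($1 \le i \le n-1$), $p_n = 1/2^{n-1}$

Example 3-1: $n = 3$: $(1/2, 1/4, 1/4)$

Example 3-2: $n = 4$: $(1/2, 1/4, 1/8, 1/8)$

As $n \to \infty$, $H(s) \to 2$.

[Figures: bar charts of $p_i$ against $i$, for $i = 1, 2, 3, 4$ and for $i = 1, 2, 3$]

[Figure: $H(p)$ plotted against $p$ from 0 to 1, for two events with probabilities $p$ and $1-p$]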

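Balanced tree:

X>2?
  no:  X>1?
         no:  X=1
         yes: X=2
  yes: X>3?
         no:  X=3
         yes: X=4

Sequential tree:

X>1?
  no:  X=1
  yes: X>2?
         no:  X=2
         yes: X>3?
                no:  X=3
                yes: X=4

Number of questions to reach a leaf: 1, 2 (= $\log_2 4$), or 3 (= $\log_2 8$).

Average number of questions when all four values are equally likely: 2 for the balanced tree, 2.25 for the sequential tree.

With $p(1) = 1/2$, $p(2) = 1/4$, $p(3) = p(4) = 1/8$:

Sequential tree: $1/2 \cdot 1 + 1/4 \cdot 2 + 1/8 \cdot 3 + 1/8 \cdot 3 = 1.75$

Balanced tree: $1/2 \cdot 2 + 1/4 \cdot 2 + 1/8 \cdot 2 + 1/8 \cdot 2 = 2$

A balanced tree over $n$ values needs about $\log_2 n$ questions.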
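The averages for the two question trees can be reproduced with a few lines of Python. This is a minimal sketch: the dictionaries `balanced` and `sequential` record the depth of each leaf as read off the trees, and the names `expected_questions`, `uniform`, and `skewed` are illustrative.

```python
# Depth of each leaf = number of yes/no questions needed to identify X.
balanced = {1: 2, 2: 2, 3: 2, 4: 2}     # root asks X>2, then X>1 or X>3
sequential = {1: 1, 2: 2, 3: 3, 4: 3}   # asks X>1, then X>2, then X>3

def expected_questions(depths, probs):
    """Average number of questions, weighting each leaf by its probability."""
    return sum(probs[x] * depths[x] for x in depths)

uniform = {1: 1/4, 2: 1/4, 3: 1/4, 4: 1/4}
skewed = {1: 1/2, 2: 1/4, 3: 1/8, 4: 1/8}

for name, probs in (("uniform", uniform), ("skewed", skewed)):
    print(name,
          "balanced:", expected_questions(balanced, probs),
          "sequential:", expected_questions(sequential, probs))
```

For the skewed distribution the sequential tree averages 1.75 questions, which equals the entropy of that distribution; the balanced tree needs 2 questions regardless of the probabilities.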

Page 4

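Two classes: $P$ and $N$.

The example set $S$ contains $p$ examples of class $P$ and $n$ examples of class $N$.

Information needed to decide whether an example in $S$ belongs to $P$ or $N$:

$I(p, n) = -\frac{p}{p+n} \log_2 \frac{p}{p+n} - \frac{n}{p+n} \log_2 \frac{n}{p+n}$

(ID3/C4.5) Attribute $A$ partitions $S$ into $S_1, S_2, \dots, S_v$.

If $S_i$ contains $p_i$ examples of class $P$ and $n_i$ examples of class $N$, the expected information needed after splitting into $S_1, S_2, \dots, S_v$ is

$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n} I(p_i, n_i)$

Information gained by branching on $A$:

$Gain(A) = I(p, n) - E(A)$

age?   {9:5}: I(9,5)
  <=30    {2:3}: I(2,3)   weight 5/14   -> student?
  30..40  {4:0}: I(4,0)   weight 4/14   -> yes
  >40     {3:2}: I(3,2)   weight 5/14   -> credit rating?

- P: buys_computer = "yes"
- N: buys_computer = "no"
- I(p, n) = I(9, 5) = 0.940
- Attribute: age

Attribute $A$ splits $T$ into $T_1, T_2, \dots, T_n$: $SI_A(T)$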
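The numbers in the age example can be recomputed directly. The sketch below assumes the usual ID3 definitions: $I(p, n)$ is the two-class entropy, the expected information after a split is the size-weighted average of $I(p_i, n_i)$, and the gain is their difference. The function and variable names (`info`, `expected_info`, `age_partitions`) are illustrative.

```python
import math

def info(p, n):
    """I(p, n): entropy in bits of a set with p positive and n negative examples."""
    total = p + n
    return sum(-c / total * math.log2(c / total) for c in (p, n) if c > 0)

def expected_info(partitions):
    """Size-weighted average of I(p_i, n_i) over the subsets produced by a split."""
    total = sum(p + n for p, n in partitions)
    return sum((p + n) / total * info(p, n) for p, n in partitions)

# (yes, no) counts for age <=30, 30..40, >40, as shown in the tree.
age_partitions = [(2, 3), (4, 0), (3, 2)]

root_info = info(9, 5)
e_age = expected_info(age_partitions)
gain_age = root_info - e_age

print(f"I(9,5)    = {root_info:.3f}")
print(f"E(age)    = {e_age:.3f}")
print(f"Gain(age) = {gain_age:.3f}")
```

The pure subset {4:0} contributes nothing to the expected information, which is why splitting on age reduces the information needed.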

Page 5

$\chi^2$ test: observed values $O_i$

Page 6

$\chi^2$ statistic: $\sum_i (O_i - E_i)^2 / E_i$

With continuity correction: $\sum_i \{ |O_i - E_i| - 1/2 \}^2 / E_i$

2 × 2 contingency table

Page 7

$$\chi^2 = \sum_i \frac{\{ |O_i - E_i| - 1/2 \}^2}{E_i} = \frac{(24.5 - 20 - 1/2)^2}{24.5} + \frac{(50 - 45.5 - 1/2)^2}{45.5} + \frac{(15 - 10.5 - 1/2)^2}{10.5} + \frac{(19.5 - 15 - 1/2)^2}{19.5}$$

$|O_i - E_i| = |E_i - O_i|$
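The corrected statistic can be evaluated as follows. This is a minimal sketch: the arrangement of the observed counts 20, 50, 15, 15 into the rows of a 2 × 2 table is an assumption, chosen because it reproduces the expected values 24.5, 45.5, 10.5, and 19.5 used in the sum; the function names are illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi2_yates(table):
    """Chi-square statistic with the 1/2 continuity correction applied to every cell."""
    expected = expected_counts(table)
    return sum(
        (abs(o - e) - 0.5) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Assumed layout of the observed counts (rows sum to 70 and 30, columns to 35 and 65).
observed = [[20, 50], [15, 15]]

print(expected_counts(observed))
print(round(chi2_yates(observed), 3))
```

Every cell has $|O_i - E_i| = 4.5$, so each numerator is $4^2 = 16$ and the sum comes to about 3.35.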

C4.5 (1)

C4.5 (2)

Actual Predicted

Page 8

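Cross Validation

N-fold Cross Validation

Divide the example set $Ex$ into $N$ subsets $\{Ex_1, Ex_2, Ex_3, \dots, Ex_N\}$.

For each $Ex_i$ in $Ex$: train on $Ex - Ex_i$ and test on $Ex_i$.

Repeat $N$ times, once per subset.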
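The procedure can be sketched in Python. This is an illustration under stated assumptions, not the course's implementation: folds are formed by taking every N-th example, the per-fold scores are averaged, and `train_fn` and `score_fn` are hypothetical callables supplied by the caller.

```python
def n_fold_splits(examples, n_folds):
    """Yield (train, test) pairs: each fold Ex_i is the test set once, Ex - Ex_i the training set."""
    folds = [examples[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test

def cross_validate(examples, n_folds, train_fn, score_fn):
    """Average the score over all folds (train_fn and score_fn are hypothetical hooks)."""
    scores = []
    for train, test in n_fold_splits(examples, n_folds):
        model = train_fn(train)
        scores.append(score_fn(model, test))
    return sum(scores) / len(scores)

if __name__ == "__main__":
    data = list(range(10))
    for train, test in n_fold_splits(data, 5):
        print("test:", test, "train:", train)
```

Each example appears in exactly one test set, so every example is used for testing once and for training $N - 1$ times.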

ID     ..            E-Mail             …
00001  ..  M   31    [email protected]    VISA   …   YES
00002  ..  F   20    [email protected]    VISA   …   NO
…      …   …   …     …                  …      …


Page 9


ID     ..                  …
00001  ..  M   31    ac    VISA   …   YES
00002  ..  F   20    ne    VISA   …   NO
…      …   …   …     …     …      …

ID
00001
00001
00002
00002
00002
…      …

ID     ..
00001  ..  M   31    VISA   YES
00002  ..  F   20    VISA   NO
…      …   …   …     …      …


ID     ..                                     ..
00001  ..  M   31    VISA   Yes   Yes   No    ..   YES
00002  ..  F   20    VISA   No    Yes   Yes   ..   NO
…      …   …   …     …      …

Gini