Upload
hoanghanh
View
224
Download
0
Embed Size (px)
Citation preview
Knowledge Discovery in Databases I
SS 2016
Exercise 10:SVM + Decision Tree
Database Systems Group • Prof. Dr. Thomas Seidl
Exercise 10-1: SVM
205.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-1: SVM
305.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-1: SVM
405.07.2016Knowledge Discovery in Databases I: Exercise 10
w‘
b
Exercise 10-1: SVM
505.07.2016Knowledge Discovery in Databases I: Exercise 10
w‘
b
𝑤′ =1
0
𝑏 = −5
Exercise 10-1: SVM
605.07.2016Knowledge Discovery in Databases I: Exercise 10
w‘
b
𝑤′ =1
0
𝑏 = −5
ℎ 𝑥 = 𝑤′, 𝑥 + 𝑏 = 𝑥1 − 5
Exercise 10-1: SVM
705.07.2016Knowledge Discovery in Databases I: Exercise 10
w‘
b
𝑤′ =1
0
𝑏 = −5
ℎ 𝑥 = 𝑤′, 𝑥 + 𝑏 = 𝑥1 − 5
ℎ3
4= 3 − 5 < 0
ℎ7
4= 7 − 5 > 0
ℎ5
5= 5 − 5 = 0
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
805.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
905.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1005.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1105.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1205.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1305.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1405.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1505.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-2: Kernel Trick
To show: 𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝜙(𝑥), 𝜙(𝑦)
𝜙(𝑥), 𝜙(𝑦) = 1, 2𝑥1, 2𝑥2, 𝑥12, 𝑥2
2, 2𝑥1𝑥2T1, 2 𝑦1, 2𝑦2, 𝑦1
2, 𝑦22, 2𝑦1𝑦2
= 1 + 2𝑥1𝑦1 + 2𝑥2𝑦2 + 𝑥12𝑦1
2 + 𝑥22𝑦2
2 + 2𝑥1𝑥2𝑦1𝑦2
𝐾 𝑥, 𝑦 = 𝑥𝑇𝑦 + 1 2 = 𝑥1𝑦1 + 𝑥2𝑦2 + 1 2
= 𝑥1𝑦1𝑥1𝑦1 + 𝑥1𝑦1𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2𝑥1𝑦1 + 𝑥2𝑦2𝑥2𝑦2 + 𝑥2𝑦2 + 𝑥1𝑦1 + 𝑥2𝑦2 + 1
1605.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-3: Decision Tree
Gini-index and Information Gain
1705.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-3: Decision Tree
1805.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-3: Decision Tree
1905.07.2016Knowledge Discovery in Databases I: Exercise 10
Male: 6+,2-Female: 6+,2-
y: 4+,0-n: 8+,4-
y: 6+,1-n: 2+,1-l: 4+,2-
y: 6+,1-n: 6+,3-
y: 4+,1-n: 8+,3-
Exercise 10-3: Decision Tree
Recursion 1 + - 𝒈𝒊𝒏𝒊 𝑻𝒊 𝒈𝒊𝒏𝒊𝑨(𝑻)
Gender: maleGender: female
66
22
0.3750.375
0.3750
Useful: yesUseful: no
48
04
0.0000.444
0.3333
Beautiful: yesBeauftiful: littleBeautiful: no
642
121
0.2450.4440.444
0.3571
Self-made: yesSelf-made: no
66
13
0.2450.444
0.3571
Eatable: yesEatable: no
48
13
0.3200.397
0.3727
2005.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑔𝑖𝑛𝑖 𝑇 = 1 −
𝑗=1
𝑘
𝑝𝑗2 𝑔𝑖𝑛𝑖𝐴 𝑇 =
𝑖=1
𝑚|𝑇𝑖|
|𝑇|⋅ 𝑔𝑖𝑛𝑖(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 2 + - 𝒈𝒊𝒏𝒊 𝑻𝒊 𝒈𝒊𝒏𝒊𝑨(𝑻)
Gender: maleGender: female
26
22
0.5000.375
0.4167
Beautiful: yesBeauftiful: littleBeautiful: no
341
121
0.3750.4440.500
0.4306
Self-made: yesSelf-made: no
44
13
0.3200.490
0.4190
Eatable: yesEatable: no
35
13
0.3750.469
0.4375
2105.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑔𝑖𝑛𝑖 𝑇 = 1 −
𝑗=1
𝑘
𝑝𝑗2 𝑔𝑖𝑛𝑖𝐴 𝑇 =
𝑖=1
𝑚|𝑇𝑖|
|𝑇|⋅ 𝑔𝑖𝑛𝑖(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 3 (male)
+ - 𝒈𝒊𝒏𝒊 𝑻𝒊 𝒈𝒊𝒏𝒊𝑨(𝑻)
Beautiful: yesBeauftiful: littleBeautiful: no
110
101
0.5000.0000.000
0.2500
Self-made: yesSelf-made: no
11
11
0.5000.500
0.5000
Eatable: yesEatable: no
20
02
0.0000.000
0.0000
2205.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑔𝑖𝑛𝑖 𝑇 = 1 −
𝑗=1
𝑘
𝑝𝑗2 𝑔𝑖𝑛𝑖𝐴 𝑇 =
𝑖=1
𝑚|𝑇𝑖|
|𝑇|⋅ 𝑔𝑖𝑛𝑖(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 3 (female)
+ - 𝒈𝒊𝒏𝒊 𝑻𝒊 𝒈𝒊𝒏𝒊𝑨(𝑻)
Beautiful: yesBeauftiful: littleBeautiful: no
231
020
0.0000.4800.000
0.3000
Self-made: yesSelf-made: no
33
02
0.0000.480
0.3000
Eatable: yesEatable: no
15
11
0.5000.278
0.3333
2305.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑔𝑖𝑛𝑖 𝑇 = 1 −
𝑗=1
𝑘
𝑝𝑗2 𝑔𝑖𝑛𝑖𝐴 𝑇 =
𝑖=1
𝑚|𝑇𝑖|
|𝑇|⋅ 𝑔𝑖𝑛𝑖(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 4 + - 𝒈𝒊𝒏𝒊 𝑻𝒊 𝒈𝒊𝒏𝒊𝑨(𝑻)
Self-made: yesSelf-made: no
30
02
0.0000.000
0.0000
Eatable: yesEatable: no
03
11
0.0000.375
0.3000
2405.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑔𝑖𝑛𝑖 𝑇 = 1 −
𝑗=1
𝑘
𝑝𝑗2 𝑔𝑖𝑛𝑖𝐴 𝑇 =
𝑖=1
𝑚|𝑇𝑖|
|𝑇|⋅ 𝑔𝑖𝑛𝑖(𝑇𝑖)
Exercise 10-3: Decision Tree
2505.07.2016Knowledge Discovery in Databases I: Exercise 10
Exercise 10-3: Decision Tree
Recursion 1 + - 𝒆𝒏𝒕𝒓𝒐𝒑𝒚 𝑻𝒊 𝑰𝑮(𝑻)
Gender: maleGender: female
66
22
0.8110.811
0.0000
Useful: yesUseful: no
48
04
0.0000.918
0.1225
Beautiful: yesBeauftiful: littleBeautiful: no
642
121
0.5920.9180.918
0.0356
Self-made: yesSelf-made: no
66
13
0.5920.918
0.0356
Eatable: yesEatable: no
48
13
0.7220.845
0.0044
2605.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑒𝑛𝑡𝑟𝑜𝑝𝑦 𝑇 = −
𝑖=1
𝑘
𝑝𝑖 ∙ log2 𝑝𝑖 𝐼𝐺 𝑇, 𝐴 = 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇) −
𝑖=1
𝑚|𝑇𝑖|
|𝑇|∙ 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 2 + - 𝒆𝒏𝒕𝒓𝒐𝒑𝒚 𝑻𝒊 𝑰𝑮(𝑻)
Gender: maleGender: female
26
22
1.0000.811
0.0440
Beautiful: yesBeauftiful: littleBeautiful: no
341
121
0.8110.9181.000
0.0220
Self-made: yesSelf-made: no
44
13
0.7220.985
0.0426
Eatable: yesEatable: no
35
13
0.8110.954
0.0117
2705.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑒𝑛𝑡𝑟𝑜𝑝𝑦 𝑇 = −
𝑖=1
𝑘
𝑝𝑖 ∙ log2 𝑝𝑖 𝐼𝐺 𝑇, 𝐴 = 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇) −
𝑖=1
𝑚|𝑇𝑖|
|𝑇|∙ 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 3a(male)
+ - 𝒆𝒏𝒕𝒓𝒐𝒑𝒚 𝑻𝒊 𝑰𝑮(𝑻)
Beautiful: yesBeauftiful: littleBeautiful: no
110
101
1.0000.0000.000
0.5000
Self-made: yesSelf-made: no
11
11
1.0001.000
0.0000
Eatable: yesEatable: no
20
02
0.0000.000
1.0000
2805.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑒𝑛𝑡𝑟𝑜𝑝𝑦 𝑇 = −
𝑖=1
𝑘
𝑝𝑖 ∙ log2 𝑝𝑖 𝐼𝐺 𝑇, 𝐴 = 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇) −
𝑖=1
𝑚|𝑇𝑖|
|𝑇|∙ 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 3b(female)
+ - 𝒆𝒏𝒕𝒓𝒐𝒑𝒚 𝑻𝒊 𝑰𝑮(𝑻)
Beautiful: yesBeauftiful: littleBeautiful: no
231
020
0.0000.9710.000
0.2041
Self-made: yesSelf-made: no
33
02
0.0000.971
0.2041
Eatable: yesEatable: no
15
11
1.0000.650
0.0735
2905.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑒𝑛𝑡𝑟𝑜𝑝𝑦 𝑇 = −
𝑖=1
𝑘
𝑝𝑖 ∙ log2 𝑝𝑖 𝐼𝐺 𝑇, 𝐴 = 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇) −
𝑖=1
𝑚|𝑇𝑖|
|𝑇|∙ 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇𝑖)
Exercise 10-3: Decision Tree
Recursion 4 + - 𝒆𝒏𝒕𝒓𝒐𝒑𝒚 𝑻𝒊 𝑰𝑮(𝑻)
Self-made: yesSelf-made: no
30
02
0.0000.000
0.9710
Eatable: yesEatable: no
03
11
0.0000.811
0.3222
3005.07.2016Knowledge Discovery in Databases I: Exercise 10
𝑒𝑛𝑡𝑟𝑜𝑝𝑦 𝑇 = −
𝑖=1
𝑘
𝑝𝑖 ∙ log2 𝑝𝑖 𝐼𝐺 𝑇, 𝐴 = 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇) −
𝑖=1
𝑚|𝑇𝑖|
|𝑇|∙ 𝑒𝑛𝑡𝑟𝑜𝑝𝑦(𝑇𝑖)
Exercise 10-3: Decision Tree
3105.07.2016Knowledge Discovery in Databases I: Exercise 10