Machine Learning in Practice
MidTerm Review
Carolyn Penstein Rosé
Kishore Prahallad
Language Technologies Institute
Error Analysis
Error Analysis from Assgn4
=== Confusion Matrix ===
    a    b    c    d    e    f    g    h   <-- classified as
 1443  105   98  127  141   27  396   73 |  a = irf03
  934  190  125  264   86   58  604  163 |  b = irf04
  985   95  350  219  108   69  863  134 |  c = irf06
  841  152  177  774  127   80  524  161 |  d = irf07
  269   98   25   78  111  180 1208  294 |  e = irm02
  369   94   27   61   95  438 1062  235 |  f = irm05
  241   70   43   38  123  216 1457   88 |  g = irm06
  470   66   38   55   73  211 1188  422 |  h = irm07
Diagonal Elements are non-zero
Non-diagonal elements (the errors) should ideally be zero
Try to find an explanation for large error cells in confusion matrix
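To decide which error cells deserve an explanation first, it can help to rank the off-diagonal cells by size. The following is a minimal sketch using the Assgn4 matrix above; the function name `largest_errors` is an illustrative assumption and not part of the assignment or of Weka.

```python
# Confusion matrix from Assgn4: rows are actual classes, columns are predictions.
labels = ["irf03", "irf04", "irf06", "irf07", "irm02", "irm05", "irm06", "irm07"]
matrix = [
    [1443, 105, 98, 127, 141, 27, 396, 73],
    [934, 190, 125, 264, 86, 58, 604, 163],
    [985, 95, 350, 219, 108, 69, 863, 134],
    [841, 152, 177, 774, 127, 80, 524, 161],
    [269, 98, 25, 78, 111, 180, 1208, 294],
    [369, 94, 27, 61, 95, 438, 1062, 235],
    [241, 70, 43, 38, 123, 216, 1457, 88],
    [470, 66, 38, 55, 73, 211, 1188, 422],
]


def largest_errors(matrix, labels, top=5):
    """Return the `top` largest off-diagonal cells as (count, actual, predicted)."""
    cells = [
        (count, labels[i], labels[j])
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j  # skip the diagonal, which holds the correct classifications
    ]
    return sorted(cells, reverse=True)[:top]


for count, actual, predicted in largest_errors(matrix, labels):
    print(f"{actual} classified as {predicted}: {count}")
```

On this matrix the three largest cells are all instances misclassified as irm06 (actual irm02, irm07 and irm05), followed by irf06 and irf04 being classified as irf03.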
From Assgn6
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances          77      51.3333 %
Incorrectly Classified Instances        73      48.6667 %
Kappa statistic                          0.0235
=== Confusion Matrix ===
  a  b   <-- classified as
 33 40 |  a = negative
 33 44 |  b = positive
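The Kappa statistic can be recomputed from the confusion matrix alone, which makes clear why it sits near zero here. The sketch below assumes the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the row and column totals; the function name `kappa` is illustrative.

```python
def kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows actual, columns predicted)."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of matching row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


baseline = [[33, 40], [33, 44]]  # Assgn6 confusion matrix above
print(round(kappa(baseline), 4))  # 0.0235
```

With 51.3 % observed agreement against roughly 50.2 % expected by chance, the classifier is barely better than guessing.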
Ranked Attributes
Ranked attributes:
 16.6146   6465 life
 15.3272    996 bad
 14.3417   7565 nothing
 12.3659   2625 created
 12.24    12337 world
 11.7684   7798 others
 10.8115  10654 stupid
  9.6538  11050 terrible
  9.5345   2552 could
  9.0771   3388 dream
  8.86    11285 top
  8.4936   1992 children
Add Bigrams (only) and select Top 5 Attributes
Correctly Classified Instances          87      58 %
Incorrectly Classified Instances        63      42 %
Kappa statistic                          0.1414
=== Confusion Matrix ===
  a  b   <-- classified as
 12 61 |  a = negative
  2 75 |  b = positive
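Overall accuracy can hide a skew between classes, so it is worth reading this matrix row by row. The sketch below computes the fraction of each actual class that was classified correctly (per-class recall); the function name `recall_per_class` is an illustrative assumption.

```python
def recall_per_class(matrix, labels):
    """Fraction of each actual class (row) that was classified correctly."""
    return {
        label: row[i] / sum(row)
        for i, (label, row) in enumerate(zip(labels, matrix))
    }


bigram_top5 = [[12, 61], [2, 75]]  # confusion matrix above
recall = recall_per_class(bigram_top5, ["negative", "positive"])
for label, value in recall.items():
    print(f"{label}: {value:.3f}")
# negative: 0.164
# positive: 0.974
```

Here 136 of the 150 instances are labelled positive, so most of the gain in accuracy comes from the positive class while most negative reviews are misclassified.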
What are these Top 5 attributes?
Ranked attributes:
 7.745   22 entir_movi
 5.456   23 fall_flat
 5.456   42 million_dollar
 4.904   56 support_role
 4.904   59 visual_effect
Add All Features and do a Naïve Bayes
Correctly Classified Instances         107      71.3 %
Incorrectly Classified Instances        43      28.6 %
Kappa statistic                          0.4252
=== Confusion Matrix ===
  a  b   <-- classified as
 49 24 |  a = negative
 19 58 |  b = positive
Methods of Analyzing Error
Confusion amongst classes
  Check the confusion matrix
  Check to see what is common across these two classes
  Find out ways to remove these commonalities by feature extraction or selection
Algorithms
  Often errors are also due to the nature of the ML algorithm used
  Experiment with different algorithms
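One way to check what is common across two confused classes, for text data, is to list the tokens that appear in a large share of the documents of both classes; such tokens carry little information for separating them and are candidates for removal. This is only a sketch under stated assumptions: the function `shared_features`, the 0.5 threshold, and the toy documents are hypothetical and not taken from the assignments.

```python
from collections import Counter


def shared_features(docs_a, docs_b, min_fraction=0.5):
    """Tokens present in at least `min_fraction` of the documents of BOTH classes."""

    def doc_fraction(docs):
        # Count each token once per document, then normalise by class size.
        counts = Counter(token for doc in docs for token in set(doc.split()))
        return {token: n / len(docs) for token, n in counts.items()}

    frac_a, frac_b = doc_fraction(docs_a), doc_fraction(docs_b)
    return sorted(
        token
        for token in frac_a.keys() & frac_b.keys()
        if frac_a[token] >= min_fraction and frac_b[token] >= min_fraction
    )


# Hypothetical toy documents, for illustration only.
negative = ["the plot falls flat", "the acting is terrible", "the movie is stupid"]
positive = ["the movie is a dream", "the acting is superb", "the world it created"]
print(shared_features(negative, positive))  # ['is', 'the']
```

The same idea extends to other feature types: compare how a feature is distributed in each of the two classes and drop or transform the ones that look alike in both.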