Enhancing Performance of the Classical Naïve Bayesian Classifier Using the Information Gain Concept of Decision Trees

Research project for the degree of B.Sc. in CSE, under the supervision of Dr. Chowdhury Mofizur Rahman

Rafiul Sabbir (ID 011092023)
Dept. of CSE, UIU
7/29/2013
Contents

- Abstract of the work
- Why do we need it?
- Naïve Bayesian Classifier: definition, algorithm
- Gaussian Distribution
- Decision Tree: definition, algorithm
- Information Gain
- My Algorithm
- Experimental Design
- Experimental Results
- Remarks
Abstract of the work

- Apply the Naïve Bayesian classifier.
- Build a decision tree based on information gain and select attributes from it.
- Apply the Naïve Bayesian classifier using only the selected attributes.
- Minimize the time and space needed for analysis.
- Can work with a continuous data stream.
Why do we need it?

1. Nowadays the volume of data produced by internet users is getting ever larger.
2. Machine learning on such data is getting harder day by day.
3. Pre-processing the data may be a solution.
4. Using only the necessary data can make the learning process faster.
5. A better technique can make the process more organized by working with the necessary data only.
6. It cuts all unimportant attributes out of the data set.
7. The dataset becomes compact in terms of attributes, and calculation becomes fast.
8. We get better performance than before in terms of time and space.
Key Concepts

- Naïve Bayesian Classifier
- Gaussian Distribution
- Decision Tree
- Information Gain
Naïve Bayesian Classifier

The Naïve Bayesian classifier (NB) is a straightforward and frequently used method for supervised learning.

It provides a flexible way of dealing with any number of attributes or classes.

It is based on statistical probability theory.
Naïve Bayesian Classifier (continued)

It is asymptotically the fastest learning algorithm that examines all of its training input.

It has been demonstrated to perform surprisingly well in a very wide variety of problems, in spite of the simplistic nature of the model.

Furthermore, small amounts of bad data, or "noise," do not perturb the results by much.
Naïve Bayesian Classifier Algorithm

There are classes Ck into which the data are to be classified.

Each class has a probability P(Ck) that represents the prior probability of a sample belonging to Ck.

For n attribute values vj, the goal of classification is to find the conditional probability P(Ck | v1 ∧ v2 ∧ … ∧ vn).
Naïve Bayesian Classifier Algorithm (continued)

By Bayes' rule, this probability is equivalent to

    P(Ck | v1 ∧ … ∧ vn) = P(Ck) · P(v1 ∧ … ∧ vn | Ck) / P(v1 ∧ … ∧ vn)

and, under the naïve assumption that the attributes are conditionally independent given the class, the chosen class is the one maximizing

    P(Ck) · ∏j P(vj | Ck).
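To make the rule concrete, here is a minimal, illustrative sketch of count-based training and classification for categorical attributes; the toy data and function names are invented for the example, and this is not the project's own implementation:

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    # Estimate P(Ck) and P(vj | Ck) from simple counts.
    priors = Counter(labels)                    # class counts -> P(Ck)
    cond = defaultdict(Counter)                 # (class, attr index) -> value counts
    for x, c in zip(samples, labels):
        for j, v in enumerate(x):
            cond[(c, j)][v] += 1
    return priors, cond, len(labels)

def classify_nb(x, priors, cond, n):
    # Pick the class maximizing P(Ck) * prod_j P(vj | Ck).
    best_class, best_score = None, -1.0
    for c, count_c in priors.items():
        score = count_c / n
        for j, v in enumerate(x):
            score *= cond[(c, j)][v] / count_c  # P(vj | Ck) from counts
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy usage: predict a label from (outlook, windy).
X = [("sunny", "yes"), ("sunny", "no"), ("rain", "yes"), ("rain", "no")]
y = ["no", "yes", "no", "yes"]
priors, cond, n = train_nb(X, y)
print(classify_nb(("sunny", "no"), priors, cond, n))  # -> "yes"
```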
Gaussian Distribution

The probability density of the Gaussian distribution at a particular point X is

    g(X; µ, σ) = (1 / (σ·√(2π))) · exp( −(X − µ)² / (2σ²) )

where µ is the mean and σ is the standard deviation of the continuous-valued attribute X.
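When an attribute is continuous, NB commonly replaces the count-based P(vj | Ck) with this density, using the per-class mean and standard deviation. A small sketch, illustrative only:

```python
import math

def gaussian_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x.
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# e.g., the density of the value 5.1 for a class whose attribute
# has mean 5.0 and standard deviation 0.35
print(gaussian_pdf(5.1, 5.0, 0.35))
```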
Decision Tree

1. Decision trees are one of the most popular methods used for inductive inference.
2. The basic algorithm for decision tree induction is a greedy algorithm that constructs the tree in a top-down, recursive, divide-and-conquer manner.
3. The key concept for selecting an attribute while constructing a decision tree is Information Gain (IG).
Decision Tree Algorithm

The basic idea behind any decision tree algorithm is as follows (sketched in the code after this list):

- Choose the best attribute to split the remaining instances, using information gain, and make that attribute a decision node.
- Repeat this process recursively for each child.
- Stop when all the instances have the same target attribute value, when there are no more attributes, or when there are no more instances.
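A compact ID3-style sketch of this greedy procedure for categorical attributes, illustrative only (the project itself uses C4.5, which additionally handles continuous attributes and pruning):

```python
import math
from collections import Counter

def entropy(labels):
    # I(s1..sm) = -sum_k pk * log2(pk) over the class distribution.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    # Entropy reduction achieved by splitting on attribute index `attr`.
    n = len(labels)
    split = {}
    for row, lab in zip(rows, labels):
        split.setdefault(row[attr], []).append(lab)
    remainder = sum(len(part) / n * entropy(part) for part in split.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    if len(set(labels)) == 1:        # all instances share one class
        return labels[0]
    if not attrs:                    # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    children = {}
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        children[value] = build_tree(sub_rows, sub_labels,
                                     [a for a in attrs if a != best])
    return (best, children)
```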
Decision Tree Example

[Figure: a decision tree predicting commute time. The root node tests Leave At (8 AM / 9 AM / 10 AM); lower nodes test Stall? and Accident? (Yes / No); the leaves are Short, Medium, and Long.]

If we leave at 9 AM and no accident has happened on the road, what will our commute time be?
Information Gain

The critical step in building a decision tree is the selection of the best test attribute. The information gain measure is used to select the test attribute at each node in the tree.

The expected information needed to classify a given sample is

    I(s1, s2, …, sm) = −Σk pk · log2(pk)

where pk is the probability that an arbitrary sample belongs to class Ck, estimated by sk / s.
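For example (a worked case, not from the project's data): with s = 14 samples split 9 : 5 between two classes, I(s1, s2) = −(9/14)·log2(9/14) − (5/14)·log2(5/14) ≈ 0.940 bits.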
My Algorithm

1. Run the Naïve Bayesian classifier on the training data set.
2. Run C4.5 on the data from step 1.
3. Select, as relevant features, only the attributes that appear in the simplified decision tree.
4. Run the Naïve Bayesian classifier on the training data using only the attributes selected in step 3.
5. Compare the result of step 4 with that of step 1.

(A sketch of steps 2–4 follows this list.)
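A minimal sketch of steps 2–4 in scikit-learn terms. The project used Weka's C4.5 (J48); sklearn's DecisionTreeClassifier is a CART-style stand-in here, so treat this as an approximation rather than the project's exact code:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Step 2: fit a small decision tree on the training data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Step 3: keep only the attributes that actually appear as split
# nodes (leaf entries in tree_.feature are negative and are skipped).
used = np.unique(tree.tree_.feature[tree.tree_.feature >= 0])

# Step 4: Naïve Bayes restricted to the selected attributes.
nb_all = GaussianNB().fit(X, y)           # baseline NB (step 1)
nb_sel = GaussianNB().fit(X[:, used], y)  # selective NB
print("selected attributes:", used)
```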
Experimental Design

Each dataset is shuffled randomly, and disjoint training and test sets are produced as follows (a sketch of producing these splits is shown after the lists):

- 80% training & 20% test data
- 70% training & 30% test data
- 60% training & 40% test data

For each pair of training and test sets, run:

- Naïve Bayesian Classifier (NBC)
- C4.5
- Selective Bayesian Classifier (SBC)
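Producing the three splits might look like this (illustrative; shown here on the Iris data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Shuffle and carve out disjoint 20%, 30%, and 40% test sets.
for test_frac in (0.2, 0.3, 0.4):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_frac, shuffle=True, random_state=0)
    print(f"{1 - test_frac:.0%} train / {test_frac:.0%} test ->",
          len(X_tr), "training,", len(X_te), "test instances")
```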
Experimental Results

Number of instances and attributes before & after the decision tree step:

Dataset         # of instances   # of attributes   # of attributes selected
Iris            150              4                 2
Diabetes        768              8                 6
Ionosphere      351              34                14
Breast Cancer   286              9                 6
Ecoli           336              8                 7
Experimental Results (Continued)

Number of test instances classified properly (NB = Naïve Bayesian, SNB = Selective Naïve Bayesian):

Dataset         Training:Test   Test instances   NB correct   NB accuracy (%)   SNB correct   SNB accuracy (%)
Iris            80:20           30               27           90.00             29            96.67
Iris            70:30           45               42           93.33             43            95.56
Iris            60:40           60               56           93.33             57            95.00
Diabetes        80:20           154              119          77.27             126           81.81
Diabetes        70:30           231              173          76.20             181           78.35
Diabetes        60:40           308              239          77.60             246           79.87
Breast Cancer   80:20           137              134          97.81             135           98.54
Breast Cancer   70:30           205              200          97.56             202           98.54
Breast Cancer   60:40           274              261          95.26             264           96.35
Ecoli           80:20           68               56           82.35             58            85.29
Ecoli           70:30           101              81           80.20             82            81.19
Ecoli           60:40           135              110          81.48             110           81.48
Ionosphere      80:20           81               74           91.36             78            96.30
Ionosphere      70:30           106              97           91.51             100           94.34
Ionosphere      60:40           141              131          92.91             134           95.04
Experimental Results (Continued)

Results of 10-fold cross validation (instances classified properly in each fold):

Iris
NB correct   SNB correct   # of instances
15           16            16
16           16            16
14           14            16
16           16            16
13           13            16
16           16            16
15           15            16
15           16            16
15           16            16
15           15            15
Breast Cancer
NB correct   SNB correct   # of instances
65           63            69
68           68            69
68           68            69
66           65            69
65           65            69
66           66            69
68           69            69
67           68            69
65           66            69
67           69            69
Diabetes
NB correct   SNB correct   # of instances
69           68            77
53           56            77
56           57            77
61           62            77
65           64            77
56           56            77
56           57            77
60           59            77
52           54            77
59           60            77
Ecoli
NB correct   SNB correct   # of instances
21           21            34
31           31            34
31           31            34
26           26            34
25           25            34
23           23            34
24           24            34
27           27            34
29           29            34
30           30            34
Ionosphere
NB correct   SNB correct   # of instances
35           33            36
33           33            36
31           32            36
33           34            36
33           35            36
30           31            36
30           31            36
32           33            36
31           31            36
33           34            36
Remarks

Datasets:
- UCI Machine Learning Repository
- Weka-provided datasets

Software & Tools:
- Weka 3.6.9
- Python data mining libraries: sklearn, numpy, pylab
Thank You