NAÏVE BAYES CLASSIFIER


ACM Student Chapter, Heritage Institute of Technology
10th February, 2012
SIGKDD Presentation by Anirban Ghose, Parami Roy and Sourav Dutta

CLASSIFICATION
• What is it?
  Assigning a given piece of input data into one of a given number of categories.
  e.g.: Classifying kitchen items – separating cups and saucers.

CLASSIFICATION
• Why do we need it?
  Separating like things from unlike things.
  e.g.: Categorizing different kinds of livestock, such as cows and goats.

CLASSIFICATION
• Looking for identifiable patterns.
  Predicting whether an e-mail is spam or non-spam from patterns observed in previous mails.
  Automatic categorization of online articles.

CLASSIFICATION
• Allowing extrapolation.
  Given the red dots, predicting the value at the blue box.

Classification Techniques
• Decision Tree based methods
• Rule-based methods
• Memory based methods
• Neural Networks
• Naïve Bayes Classifier
• Support Vector Machines

Problem Statement
Play Tennis : Training Examples

Day   Outlook    Temperature  Humidity  Wind    Play Tennis
D1    Sunny      Hot          High      Weak    No
D2    Sunny      Hot          High      Strong  No
D3    Overcast   Hot          High      Weak    Yes
D4    Rain       Mild         High      Weak    Yes
D5    Rain       Cool         Normal    Weak    Yes
D6    Rain       Cool         Normal    Strong  No
D7    Overcast   Cool         Normal    Strong  Yes
D8    Sunny      Mild         High      Weak    No
D9    Sunny      Cool         Normal    Weak    Yes
D10   Rain       Mild         Normal    Weak    Yes
D11   Sunny      Mild         Normal    Strong  Yes
D12   Overcast   Mild         High      Strong  Yes
D13   Overcast   Hot          Normal    Weak    Yes
D14   Rain       Mild         High      Strong  No

Problem Statement
• Domain space : the set of values an attribute can have.
• Domain space of the previous example:
  o Outlook – {Sunny, Overcast, Rain}
  o Temperature – {Hot, Mild, Cool}
  o Humidity – {High, Normal}
  o Wind – {Strong, Weak}
  o Play Tennis – {Yes, No}

Problem Statement
• Instances (X) :
  The set of items over which the concept is defined.
  Here, the set of all possible days described by the attributes Outlook, Temperature, Humidity and Wind.
• Target concept (c) : the concept or function to be learned.
  c : X → {0,1}
  c(x) = 1 : Play Tennis = Yes
  c(x) = 0 : Play Tennis = No

Problem Statement
• Hypothesis (h) :
  A statement that is assumed to be true for the sake of argument.
  Here, a conjunction of constraints on the attributes.
  h : X → {0,1}
• For each attribute, the hypothesis constraint can be:
  ?        – any value is acceptable
  <value>  – a single required value
  Ø        – no value is acceptable
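To make the ?/<value>/Ø representation concrete, here is a minimal Python sketch (added here; the function and attribute names are illustrative assumptions, not from the slides) of evaluating such a conjunctive hypothesis on an instance:

```python
# Hypothetical sketch: evaluating a conjunctive hypothesis h(x) over the
# Play Tennis attributes. "?" accepts any value, a literal value must match
# exactly, and None (standing in for Ø) accepts nothing.
def satisfies(hypothesis, instance):
    for attribute, constraint in hypothesis.items():
        if constraint is None:                 # Ø : no value is acceptable
            return 0
        if constraint != "?" and constraint != instance[attribute]:
            return 0                           # required value not matched
    return 1                                   # all constraints satisfied -> h(x) = 1

h = {"Outlook": "Sunny", "Temperature": "?", "Humidity": "High", "Wind": "?"}
x = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Weak"}
print(satisfies(h, x))  # 1 -> x is a positive example for h
```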

Problem Statement
• Training examples – prior knowledge.
  A set of input vectors (instances), each with a label (outcome).
  Input vector : Outlook – Sunny, Temperature – Hot, Humidity – High, Wind – Weak
  Label : Play Tennis – No

Problem Statement
Training examples can be:
• Positive example : the instance satisfies all the constraints of the hypothesis; h(x) = 1
• Negative example : the instance does not satisfy one or more constraints of the hypothesis; h(x) = 0

Learning Algorithm
• Naïve Bayes Classifier – Supervised Learning
• Supervised Learning : the machine learning task of inferring a function from supervised (labelled) training data
  g : X → Y
  X : input space
  Y : output space

A Quick Recap
• Conditional Probability : P(A|B) = P(A∩B) / P(B)
• Multiplication Rule : P(A∩B) = P(A|B)·P(B) = P(B|A)·P(A)
• Independent Events : P(A∩B) = P(A)·P(B)
• Total Probability : P(A) = P(A|B)·P(B) + P(A|¬B)·P(¬B)
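As a small illustration (not part of the original slides), these identities can be checked numerically for a toy joint distribution over two events A and B; the numbers below are assumed for the example:

```python
# Assumed toy numbers: a joint distribution over two events A and B.
p_a_and_b = 0.12        # P(A ∩ B)
p_a_and_not_b = 0.28    # P(A ∩ ¬B)
p_b = 0.40              # P(B), so P(¬B) = 0.60

p_a_given_b = p_a_and_b / p_b                            # conditional probability, ≈ 0.30
p_a_given_not_b = p_a_and_not_b / (1 - p_b)              # ≈ 0.47
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)    # total probability, ≈ 0.40

print(p_a_given_b, p_a)
assert abs(p_a - (p_a_and_b + p_a_and_not_b)) < 1e-12    # consistent with the joint
```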

A Few Important Definitions
o Prior probability : Let p be an uncertain quantity. The prior probability is the probability distribution that expresses one's uncertainty about p before the data is taken into account.
o Posterior probability : The posterior probability of a random event or an uncertain proposition is the conditional probability that is assigned after the relevant evidence is taken into account.

Bayes' Theorem
o P(h) : prior probability of hypothesis h
o P(D) : prior probability that the training data D will be observed
o P(D | h) : probability of observing data D given some world in which hypothesis h holds
o P(h | D) : posterior probability of h (the quantity to be found)
• Then, as per Bayes' Theorem:
  P(h | D) = P(D | h) · P(h) / P(D)
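A minimal sketch of the theorem as code, with the evidence P(D) expanded by total probability over h and ¬h; the helper name and argument names are assumptions, not from the slides:

```python
# Bayes' theorem, P(h|D) = P(D|h) * P(h) / P(D), as a small helper.
# The evidence P(D) is expanded by total probability over h and ¬h.
def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    evidence = p_d_given_h * prior_h + p_d_given_not_h * (1.0 - prior_h)
    return p_d_given_h * prior_h / evidence

# Using the medical-diagnosis numbers that appear later in the deck:
print(posterior(0.008, 0.98, 0.03))  # ≈ 0.21
```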

MAP Hypothesis
• The maximum a posteriori (MAP) hypothesis is the most probable hypothesis h given the data D:
  h_MAP = argmax_{h ∈ H} P(h | D) = argmax_{h ∈ H} P(D | h) · P(h)
• If every hypothesis is equally likely a priori, i.e. P(hi) = P(hj) for all hi, hj in H, this reduces to the maximum likelihood hypothesis:
  h_ML = argmax_{h ∈ H} P(D | h)

Example
• A medical diagnosis problem with two alternative hypotheses:
  1) The patient has a particular form of cancer
  2) The patient does not have that particular form of cancer

Example – Bayes' Theorem

Test outcomes:
  a) + (positive – the test indicates the rare disease)
  b) − (negative – the test does not indicate the disease)

Prior knowledge:
  P(cancer) = 0.008        P(~cancer) = 0.992
  P(+ | cancer) = 0.98     P(− | cancer) = 0.02
  P(+ | ~cancer) = 0.03    P(− | ~cancer) = 0.97

Example – Bayes' Theorem

Suppose we now observe a new patient for whom the lab test returns a positive result.

Should we diagnose the patient as having cancer or not?

Solution
Applying Bayes' Theorem (the evidence P(+) is the same for both hypotheses, so it can be dropped):

  P(cancer | +)  ∝ P(+ | cancer) · P(cancer)   = (0.98)(0.008) ≈ 0.0078
  P(~cancer | +) ∝ P(+ | ~cancer) · P(~cancer) = (0.03)(0.992) ≈ 0.0298

Since 0.0298 > 0.0078, the MAP hypothesis is ~cancer: despite the positive test, the patient is diagnosed as not having cancer.
(Normalising, P(cancer | +) = 0.0078 / (0.0078 + 0.0298) ≈ 0.21.)
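The same arithmetic as a short Python sketch (variable names are mine); normalising the two scores also yields the actual posterior probabilities:

```python
# Prior knowledge from the slide.
p_cancer, p_not_cancer = 0.008, 0.992
p_pos_given_cancer, p_pos_given_not_cancer = 0.98, 0.03

# Unnormalised scores for a positive test result.
score_cancer = p_pos_given_cancer * p_cancer              # ≈ 0.0078
score_not_cancer = p_pos_given_not_cancer * p_not_cancer  # ≈ 0.0298

# Normalising gives the true posteriors; the MAP hypothesis is ~cancer.
evidence = score_cancer + score_not_cancer
print(round(score_cancer / evidence, 2))      # ≈ 0.21 -> P(cancer | +)
print(round(score_not_cancer / evidence, 2))  # ≈ 0.79 -> P(~cancer | +)
```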

Naïve Bayes Classifier
• Supervised Learning Technique
• Bayes' Theorem
• MAP Hypothesis

Naïve Bayes Classifier
• Prior knowledge:
  o Training data set
  o A new instance of data
• Objective:
  o Classify the new instance of data <a1, a2, …, an>
  o Find P(vj | a1, a2, …, an)
  o Find the required probability for all possible classifications
  o Find the maximum probability among them

Naïve Bayes Classifier
• We want the most probable classification vj given the attribute values:
  v_MAP = argmax_{vj ∈ V} P(vj | a1, a2, …, an)
• Using Bayes' Theorem:
  v_MAP = argmax_{vj ∈ V} P(a1, a2, …, an | vj) · P(vj) / P(a1, a2, …, an)
        = argmax_{vj ∈ V} P(a1, a2, …, an | vj) · P(vj)

Naïve Bayes Classifier
• Why naïve?
• All attributes are assumed to be conditionally independent given the classification.
• Hence P(a1, a2, …, an | vj) = ∏_{i=1..n} P(ai | vj)
• v_NB = argmax_{vj ∈ V} P(vj) · ∏_{i=1..n} P(ai | vj)
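A compact sketch of this decision rule under the stated independence assumption; the function name and the dictionary layout for the probabilities are assumptions, not from the slides:

```python
# v_NB = argmax over classes v of  P(v) * product over attributes of P(a_i | v).
# priors:      dict  class -> P(v)
# likelihoods: dict  (attribute, value, class) -> P(value | class)
def classify(instance, priors, likelihoods):
    best_class, best_score = None, -1.0
    for v, p_v in priors.items():
        score = p_v
        for attribute, value in instance.items():
            score *= likelihoods[(attribute, value, v)]
        if score > best_score:
            best_class, best_score = v, score
    return best_class
```

The two look-up dictionaries would be filled with frequencies estimated from the training set, as described on the following slides.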

Play Tennis : Training Examples
(The 14 training examples from the earlier table are repeated on this slide.)

New instance : < Sunny, Cool, High, Strong >

Probability Estimate
• We define our probability estimates to be the frequencies of data combinations within the training examples:
• P(vj) = fraction of times vj occurs in the training set
• P(ai | vj) = fraction of times ai occurs in those examples that are classified as vj

Example
• Let's calculate P(Overcast | Yes)
• Number of training examples classified as Yes = 9
• Number of times Outlook = Overcast given the classification is Yes = 4
• Hence, P(Overcast | Yes) = 4/9

• Prior probabilities:
  o P(Yes) = 9/14, i.e. P(playing tennis)
  o P(No) = 5/14, i.e. P(not playing tennis)
• The conditional probabilities P(ai | vj) are read off look-up tables of counts built from the training examples.

• P(Yes) · P(Sunny | Yes) · P(Cool | Yes) · P(High | Yes) · P(Strong | Yes)
    = 9/14 · 2/9 · 3/9 · 3/9 · 3/9 ≈ 0.0053
• P(No) · P(Sunny | No) · P(Cool | No) · P(High | No) · P(Strong | No)
    = 5/14 · 3/5 · 1/5 · 4/5 · 3/5 ≈ 0.0206
• Since 0.0206 > 0.0053, the classifier predicts Play Tennis = No: we can't play tennis given these weather conditions.
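A self-contained sketch (structure and names are mine) that estimates the probabilities by counting over the fourteen training examples above and reproduces both numbers for the new instance < Sunny, Cool, High, Strong >:

```python
from collections import Counter, defaultdict

# The 14 Play Tennis training examples: (Outlook, Temperature, Humidity, Wind, label).
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]

class_counts = Counter(row[-1] for row in data)   # {"Yes": 9, "No": 5}
value_counts = defaultdict(Counter)               # per class: (attribute index, value) -> count
for *attrs, label in data:
    for i, value in enumerate(attrs):
        value_counts[label][(i, value)] += 1

def score(instance, label):
    s = class_counts[label] / len(data)                              # P(v)
    for i, value in enumerate(instance):
        s *= value_counts[label][(i, value)] / class_counts[label]   # P(a_i | v)
    return s

new = ("Sunny", "Cool", "High", "Strong")
print(round(score(new, "Yes"), 4))   # 0.0053
print(round(score(new, "No"), 4))    # 0.0206  -> predict Play Tennis = No
```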

Drawback of the Estimate
• What happens if a probability estimate is zero?
• The estimate is zero when a particular attribute value never occurs in the training data set for a given classification.
• That zero factor then dominates the product v_NB for that classification, forcing it to zero regardless of the other probabilities.

Example
• Suppose that in a new training set the attribute Outlook never takes the value Overcast when the example is labelled Yes.
• Then P(Overcast | Yes) = 0
• and v_NB = P(Yes) · P(Overcast | Yes) · P(Cool | Yes) · … = 0

Solution
• Use the m-estimate of probability:
  P(ai | vj) = (nc + m·p) / (n + m)
  where
    n  : number of training examples with classification vj
    nc : number of those examples in which the attribute takes value ai
    p  : prior estimate of the attribute value's probability (e.g. 1/k for k possible values)
    m  : the equivalent sample size, which weights the prior
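A small sketch of the m-estimate; the function name and the example numbers (m = 3 and a uniform prior of 1/3 over the three Outlook values) are assumptions chosen for illustration:

```python
# m-estimate of P(a_i | v_j): (n_c + m * p) / (n + m)
#   n   : number of training examples with class v_j
#   n_c : number of those examples having attribute value a_i
#   p   : prior estimate of P(a_i | v_j), e.g. 1/k for k possible attribute values
#   m   : equivalent sample size (how heavily the prior is weighted)
def m_estimate(n_c, n, p, m):
    return (n_c + m * p) / (n + m)

# With m = 3 and a uniform prior over {Sunny, Overcast, Rain}, a zero count
# no longer produces a zero probability:
print(m_estimate(0, 9, 1/3, 3))  # ≈ 0.083 instead of 0
```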

Disadvantages of the Naïve Bayes Classifier
1) It requires initial knowledge of many probabilities.
2) Significant computational cost is needed to determine the Bayes optimal hypothesis.

Conclusion
• Naïve Bayes is based on the independence assumption
  o Training is very easy and fast
  o Testing is straightforward: just looking up tables or calculating conditional probabilities with normal distributions
• A popular generative model
  o Performance is competitive with most state-of-the-art classifiers, even when the independence assumption is violated
  o Many successful applications, e.g. spam mail filtering
  o A good candidate for a base learner in ensemble learning
