29 August 2013 Venkat Naïve Bayesian on CDF Pair Scores

Outline

Naïve Bayesian Overview

Adapting Naïve Bayesian to CDF Pairscores

Comparisons with Logistic Regression

Comparisons with the Voting Scheme

Bayesian Framework

Unbiased learning requires O(NK^d) samples for reasonable parameter estimation

Impractical for most values of d

Naïve Bayes Assumption

The Naïve Bayes assumption implies class-conditional independence

Requires O(NK) samples
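
Written out, class-conditional independence means the joint likelihood factors into per-feature terms. The notation below is assumed for illustration: x = (x_1, ..., x_d) is the feature vector and y is the class label.

```latex
P(x_1, \dots, x_d \mid y = c) = \prod_{j=1}^{d} P(x_j \mid y = c),
\qquad
\hat{y} = \arg\max_{c} \; P(y = c) \prod_{j=1}^{d} P(x_j \mid y = c)
```

Each factor can then be estimated separately per feature, which is why far fewer samples are needed than for the full joint distribution.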

Gaussian Naïve Bayes

What are the parameters to be estimated?

N Priors

Nd Likelihood functions
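
The parameter count above can be made concrete with a minimal NumPy sketch of Gaussian Naïve Bayes. It is an illustration under assumptions, not the author's implementation: the function names `fit_gnb` and `predict_gnb`, the `var_floor` smoothing term, and the integer class labels 0..N-1 are all hypothetical choices.

```python
import numpy as np

def fit_gnb(X, y, n_classes, var_floor=1e-9):
    """Estimate N priors and N*d Gaussian likelihoods (one mean and variance each)."""
    n, d = X.shape
    priors = np.zeros(n_classes)
    means = np.zeros((n_classes, d))
    variances = np.ones((n_classes, d))
    for c in range(n_classes):
        Xc = X[y == c]
        priors[c] = len(Xc) / n
        if len(Xc) > 0:
            means[c] = Xc.mean(axis=0)
            # var_floor (assumed) keeps the variance strictly positive
            variances[c] = Xc.var(axis=0) + var_floor
    return priors, means, variances

def predict_gnb(X, priors, means, variances):
    """Return argmax_c of log P(y=c) + sum_j log N(x_j; mean_cj, var_cj)."""
    diff = X[:, None, :] - means[None, :, :]            # shape (n, N, d)
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :]).sum(axis=2)
    with np.errstate(divide="ignore"):                  # empty classes get log(0) = -inf
        log_prior = np.log(priors)
    return np.argmax(log_prior[None, :] + log_lik, axis=1)
```

The returned arrays have shapes (N,), (N, d) and (N, d), matching the N priors and Nd likelihood functions listed above.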

Naïve Bayes on CDF Pairscores

Direct application of GNB to CDF pairscores is guaranteed to give poor results.

We must exploit knowledge of which features are irrelevant when conditioned on a class.

For instance, conditioned on class 7, the score from, say, the class-36-vs-class-9 model is irrelevant.

We have N = 50 classes and d = 1225 pairscores
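
The figures N = 50 and d = 1225 are consistent with one pairwise model per unordered pair of classes, since 50 * 49 / 2 = 1225. The sketch below enumerates the pairs and picks out the features that involve a given class. It assumes classes are indexed 0..49 and that pairs are ordered as `itertools.combinations` produces them; the helper name `relevant_pair_indices` is hypothetical, and the sketch does not reproduce the adaptation the slides go on to describe.

```python
from itertools import combinations

N_CLASSES = 50

# One pairwise model per unordered pair of classes (a, b) with a < b.
pairs = list(combinations(range(N_CLASSES), 2))

def relevant_pair_indices(c, pairs):
    """Indices of the pairscores whose pairwise model involves class c."""
    return [i for i, (a, b) in enumerate(pairs) if c in (a, b)]
```

Under this indexing, each class has 49 relevant pairscores out of 1225, and the class-36-vs-class-9 score is not among those selected for class 7.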

Likelihood Distributions

[Figure panels: P(s(c,c)|y=c), P(s(c,c)|y=c)]

Results (2nd level)

Naïve Bayesian : 57.18%

Voting : 59.01%

Logistic Regression : 57.51%

So, which is the overall best scheme?

Naïve Bayes vs Logistic Regression

GNB (generative) and LR (discriminative) essentially model the same classifier when the Naïve Bayes assumptions hold.

However, LR converges to its asymptotic accuracy more slowly than GNB

This is because LR requires exponentially more samples than GNB to obtain good parameter estimates
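
The sense in which GNB and LR model the same classifier can be seen in the two-class case. The derivation below is a standard textbook sketch and rests on assumptions not stated on the slides: binary labels, class priors P(y=1) = \pi, and a per-feature variance \sigma_j^2 shared by both classes. The symbols \mu_{j0}, \mu_{j1}, w_0 and w_j are introduced here for illustration.

```latex
P(y = 1 \mid x)
  = \frac{1}{1 + \exp\!\Big( \ln\frac{1-\pi}{\pi}
      + \sum_{j=1}^{d} \ln\frac{P(x_j \mid y = 0)}{P(x_j \mid y = 1)} \Big)}
  = \frac{1}{1 + \exp\!\big( w_0 + \sum_{j=1}^{d} w_j x_j \big)},

w_j = \frac{\mu_{j0} - \mu_{j1}}{\sigma_j^2},
\qquad
w_0 = \ln\frac{1-\pi}{\pi} + \sum_{j=1}^{d} \frac{\mu_{j1}^2 - \mu_{j0}^2}{2\sigma_j^2}
```

Under these assumptions the GNB posterior has exactly the logistic form that LR fits directly. The two differ in how the weights are estimated, not in the family of decision boundaries.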

[Figures: Logistic Regression; Naïve Bayesian; Naïve Bayesian]

When training data is scarce, GNB theoretically outperforms LR

Moreover, if LR only marginally outperforms GNB, then GNB should still be chosen for its lower variance.

Naïve Bayes vs Voting Scheme

Naïve Bayes is equivalent to a weighted voting scheme.

The unweighted voting scheme takes unbiased votes from the pairwise models, ignoring scores and scales.

The binary structure of the unweighted scheme has ill-defined bias-variance properties.

One can argue that it just happens to work well in this case.
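
The contrast between the two schemes can be sketched in a few lines. This is an illustration under assumptions, not the scheme evaluated above: the function names are hypothetical, pairscore `scores[i]` is assumed to favour the first class of `pairs[i]` when it exceeds `threshold`, and the log-likelihood matrix `log_lik[c, j] = log P(s_j | y = c)` is assumed to come from some already fitted model.

```python
import numpy as np

def unweighted_vote(scores, pairs, n_classes, threshold=0.5):
    """Each pairwise model casts one vote; score magnitude is ignored."""
    votes = np.zeros(n_classes, dtype=int)
    for s, (a, b) in zip(scores, pairs):
        # Assumed convention: a score above the threshold favours class a.
        votes[a if s > threshold else b] += 1
    return votes

def nb_weighted_vote(log_prior, log_lik):
    """Naive Bayes as weighted voting: feature j contributes log_lik[c, j] to class c."""
    return log_prior + log_lik.sum(axis=1)
```

In the unweighted scheme a score of 0.51 counts exactly as much as a score of 0.99, whereas in the Naïve Bayes sum each feature's contribution depends on how likely its observed score is under each class.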