Naïve Bayes based Model Billy Doran 09130985


Naïve Bayes based Model

Billy Doran

09130985


“If the model does what people do, do people do what the model does?”


Bayesian Learning

Determines the probability of a hypothesis H given a set of data D:

P(H|D) = P(D|H) P(H) / P(D)


P(H|D) = P(D|H) P(H) / P(D)

P(H) is the prior probability of H: the probability of observing H across the whole data set.

P(H|D) is the posterior probability of H: the probability of the hypothesis H given the data D.

P(D) is the prior probability of observing D. It is the same for every hypothesis, so it can be ignored when hypotheses are compared.

P(D|H) is the likelihood of observing the data given the hypothesis. Does the hypothesis reproduce the data?
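As a rough illustration of why P(D) can be dropped, the sketch below computes the posterior for two hypotheses and checks that the winning hypothesis is the same with and without the division by P(D). The hypothesis names and all numbers are made up for the example; they are not taken from the model.

```python
def posterior(likelihood, prior):
    """Return (P(H|D) per hypothesis, unnormalised scores P(D|H) * P(H))."""
    # Unnormalised score for each hypothesis: P(D|H) * P(H)
    joint = {h: likelihood[h] * prior[h] for h in prior}
    # P(D) is the sum of those scores, so it is identical for every H
    evidence = sum(joint.values())
    return {h: score / evidence for h, score in joint.items()}, joint


# Hypothetical values, chosen only for illustration
prior = {"H1": 0.3, "H2": 0.7}
likelihood = {"H1": 0.8, "H2": 0.2}

post, joint = posterior(likelihood, prior)
best_by_joint = max(joint, key=joint.get)  # P(D) ignored
best_by_post = max(post, key=post.get)     # P(D) included
print(best_by_joint, best_by_post)
```

Dividing by P(D) rescales every score by the same amount, so it only matters when the scores need to be read as probabilities.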


Maximum a Posteriori Probability

In order to classify an example as belonging to one category or another we aim to find the maximal value of

P(H|D)

For example, take the training pattern <A X C>. If we want to find the probability that this example belongs to Category A, the posterior probability is:

P(Category A|A,X,C)


Naïve Bayes

The Naïve Bayes algorithm assumes that the dimensions are conditionally independent given the category.

This means that we consider each dimension in terms of its probability given the category:

P(A,B|Cat A) = P(A|Cat A)P(B|Cat A)

Using this information we are able to build a table of the Conditional Probabilities for each dimension


Conditional Probability Table

Probabilities for Category A

             A        B        C
Dimension1   0.6666   0        0.1
Dimension2   0.5      0.1666   0.1
Dimension3   0        0.1666   0.1666

P(Dimension1=A|Category A) is 4/6, which is 0.6666
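Entries such as 4/6 come from counting how often a value appears on a dimension among the examples of a category. A minimal sketch of that counting step follows; the six example patterns are hypothetical stand-ins (the original training patterns are not listed here), chosen only so that Dimension1=A occurs in 4 of 6 cases.

```python
from collections import Counter
from fractions import Fraction


def conditional_probabilities(examples, n_dims):
    """Return one {value: P(value | category)} dict per dimension."""
    table = []
    for d in range(n_dims):
        # Count how often each value occurs on dimension d
        counts = Counter(example[d] for example in examples)
        table.append({v: Fraction(c, len(examples)) for v, c in counts.items()})
    return table


# Hypothetical Category A examples, NOT the original training set
category_a = [
    ("A", "A", "X"),
    ("A", "B", "C"),
    ("A", "A", "C"),
    ("A", "X", "B"),
    ("X", "A", "X"),
    ("C", "X", "X"),
]

cpt = conditional_probabilities(category_a, 3)
print(cpt[0]["A"], float(cpt[0]["A"]))
```

Values that never occur on a dimension are simply absent from the dictionary, which corresponds to a probability of 0 in the table.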


Calculation

In order to get the scores for the pattern <A B C> we first find the unnormalised score for each category:

P(Cat A|A,B,C) ∝ P(A|Cat A)P(B|Cat A)P(C|Cat A)P(Cat A) = 0.666*0.1666*0.1666*0.375 = 0.00688

P(Cat B|A,B,C) ∝ 0.166*0.5*0.1*0.375 = 0.0031125

P(Cat C|A,B,C) ∝ 0.1*0.1*0.833*0.375 = 0.00312375

Next we normalise the scores to get values in the range [0, 1]:

Cat A = 0.00688/(0.00688+0.0031125+0.00312375) = 0.52
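The two steps above (multiply the conditional probabilities by the prior, then normalise) can be sketched as follows. The factors are copied from the calculation as printed; the function and variable names are illustrative. Evaluating the Category A factors exactly as printed gives a score near 0.0069, slightly above the quoted 0.00688, so the normalised value may differ in the second decimal place.

```python
def unnormalised_score(factors, prior):
    """Multiply the per-dimension conditional probabilities by the prior."""
    score = prior
    for f in factors:
        score *= f
    return score


def normalise(scores):
    """Rescale the scores so that they sum to 1."""
    total = sum(scores.values())
    return {cat: s / total for cat, s in scores.items()}


PRIOR = 0.375  # same prior used for each category in the calculation

# Conditional probabilities for the pattern <A B C>, as printed
factors = {
    "A": (0.666, 0.1666, 0.1666),
    "B": (0.166, 0.5, 0.1),
    "C": (0.1, 0.1, 0.833),
}

scores = {cat: unnormalised_score(f, PRIOR) for cat, f in factors.items()}
posteriors = normalise(scores)
for cat in sorted(posteriors):
    print(cat, round(scores[cat], 8), round(posteriors[cat], 3))
```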


Conjunctions

In order to calculate the conjunction of categories we find the joint probability of the two categories

P(A&B) = P(A)P(B)

This is similar to the Prototype Theory for conjunctions.
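A small sketch of the product rule for conjunctions, using the single-category scores of training example 1 (0.571327651, 0.085707718, 0.342964631) as input. The helper name is illustrative only.

```python
from itertools import combinations


def conjunction_scores(single):
    """Score each pair of categories as the product of its members' scores."""
    return {
        f"{a}&{b}": single[a] * single[b]
        for a, b in combinations(sorted(single), 2)
    }


# Single-category scores for training example 1
example_1 = {"A": 0.571327651, "B": 0.085707718, "C": 0.342964631}

joint = conjunction_scores(example_1)
for name, value in joint.items():
    print(name, round(value, 6))
```

The products agree with the A&B, A&C and B&C columns listed for that example, which suggests the conjunction columns were obtained in this way.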


Training Data

#   A            B            C            A&B          A&C          B&C          Single  Joint
1   0.571327651  0.085707718  0.342964631  0.048967189  0.195945177  0.029394716  A
2   0.714316331  0.178525504  0.107158165  0.127523683  0.076544828  0.019130465  A
3   0.88264362   0.07323744   0.04411894   0.064642559  0.038941301  0.003231158  A
4   0.600528465  0.19937545   0.200096085  0.119730633  0.120163395  0.039894247  A
5   0.33415783   0.55490177   0.1109404    0.185424771  0.037071603  0.061561024  B       AB
6   0.543442811  0.407622871  0.048934318  0.221519719  0.026593003  0.019946747  A       AB
7   0.060206562  0.903785195  0.036008243  0.0544138    0.002167932  0.032543716  B
8   0.060206562  0.903785195  0.036008243  0.0544138    0.002167932  0.032543716  B
9   0.142893902  0.71472682   0.142379278  0.102130104  0.020345131  0.101762289  B
10  0.142893902  0.71472682   0.142379278  0.102130104  0.020345131  0.101762289  B
11  0.243388618  0.080805021  0.67580636   0.019667022  0.164483576  0.054608547  C
12  0.069905193  0.349651846  0.580442961  0.02444248   0.040575977  0.202952953  C
13  0.028615214  0.017175999  0.954208788  0.000491495  0.027304888  0.016389489  C
14  0.081233216  0.016188132  0.902578652  0.001315014  0.073319367  0.014611062  C
15  0.028615214  0.017175999  0.954208788  0.000491495  0.027304888  0.016389489  C
16  0.178514026  0.107151276  0.714334698  0.019128006  0.127518763  0.076541874  C


Training Data

The model is an almost perfectly consistent learner: it reproduces the category labels of the single-category training examples with 100% accuracy.

It classifies the conjunction examples #5 and #6 as B and A respectively. Both obtain a much higher score in the AB conjunction than in the AC or BC conjunctions.

This seems to suggest that each of these two examples is more representative of one member of the conjunction than of the other.


Test Data

#   A            B            C            A&B          A&C          B&C          Single  Joint
1   0.543442811  0.407622871  0.048934318  0.221519719  0.026593003  0.019946747  A>B>C   AB>AC>BC
2   0.069905193  0.349651846  0.580442961  0.02444248   0.040575977  0.202952953  C>B>A   BC>AC>AB
3   0.394851186  0.078685831  0.526462983  0.031069194  0.207874533  0.041425177  C>A>B   AC>BC>AB
4   0.192184325  0.346208696  0.461606979  0.066535885  0.088713626  0.15981235   C>B>A   BC>AC>AB
5   0.142893902  0.71472682   0.142379278  0.102130104  0.020345131  0.101762289  C>B=A   AB=AC>AC
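The Single and Joint orderings can be read off the numeric scores by sorting them. The sketch below does this for test pattern 1; the tie tolerance `tol` is an assumed parameter for deciding when two scores count as equal, not something specified by the model.

```python
def ranking(scores, tol=0.001):
    """Format scores as an ordering such as 'A>B>C', using '=' for near-ties."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    parts = [ordered[0][0]]
    for (_, prev_value), (name, value) in zip(ordered, ordered[1:]):
        # Treat two neighbouring scores as tied when they differ by at most tol
        parts.append("=" if prev_value - value <= tol else ">")
        parts.append(name)
    return "".join(parts)


# Scores for test pattern 1
single = {"A": 0.543442811, "B": 0.407622871, "C": 0.048934318}
joint = {"AB": 0.221519719, "AC": 0.026593003, "BC": 0.019946747}

print(ranking(single), ranking(joint))
```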


Graphs: Comparing Experimental results to Model results


Test Data

The results are generally consistent with the experimental data, except for test patterns #3 and #4:

For #3 the experimental data give AC>AB>BC, while the model generates AC>BC>AB.

For #4 the experimental data give C>B>A, while the model gives B>C>A.


Statistical Analysis

The mean correlation between the model and the experimental data was r = 0.88.

At alpha = 0.05 and df = n-2, this was significant.

Individual correlations: #1 0.82, #2 0.93, #3 0.85, #4 0.84, #5 0.88, #6 0.92
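For readers who want to check a significance claim of this kind, the sketch below computes a Pearson correlation and the t statistic used to test it with df = n-2. The sample size n = 6 is an assumption made for the example (one model score per single category and conjunction); the actual n is not stated here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t value for testing r against zero, with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Assumed sample size; r is the mean correlation reported above
t = t_statistic(0.88, 6)
# The two-tailed critical t for df = 4 at alpha = 0.05 is about 2.776
print(round(t, 3), t > 2.776)
```

Under that assumed n, the t value exceeds the critical value, which is in line with the reported significance.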


Unusual Predictions

How would the model handle <A B C>?

Output: A > B > C, AC > AB > BC

Is it possible to ask the model about a triple conjunction?

Example: <X X B>

Model predicts: C>B=A, AB=AC>ABC>BC


Conclusion

Naïve Bayes produces a good hypothesis of how people learn category classification.

The use of probabilities matches well with the underlying logic of the correlations between the dimensions and categories.

Creating a Causal Network might be an informative way to investigate further the interactions between the individual dimensions.


Limitations

Because the model uses a version of prototype theory to calculate its conjunctions, it is not able to capture overextension. To rectify this, the formula

D(x|C|Kc) = num_x_in_C / (size_C + num_x_in_Kc)

can be used to approximate overextension, where C is the category and Kc is the set of non-C categories.
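A minimal sketch of this measure, assuming the formula is read as a single fraction with num_x_in_C in the numerator and size_C + num_x_in_Kc in the denominator. The counts passed in below are hypothetical.

```python
def diagnosticity(num_x_in_c, size_c, num_x_in_kc):
    """D(x|C|Kc): how strongly feature x points to C rather than to Kc.

    Assumes the fraction num_x_in_C / (size_C + num_x_in_Kc).
    """
    return num_x_in_c / (size_c + num_x_in_kc)


# Hypothetical counts: x appears in 4 of 6 members of C
only_in_c = diagnosticity(4, 6, 0)    # x never occurs outside C
also_in_kc = diagnosticity(4, 6, 2)   # x also occurs twice outside C
print(round(only_in_c, 4), round(also_in_kc, 4))
```

Under this reading, each occurrence of x outside C lowers the score, so a feature shared with other categories counts for less than one that is unique to C.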


Limitations

The model also does not take into account negative evidence. While it captures the general trend of the categories, it does not, for example, represent the strength of negativity for Category C in test pattern #5.

This pattern is very similar to the conjunction patterns given in the training data. The strong negative reaction seems to be caused by the association between these conjunctions and categories A and B.


The End