
Page 1

Artificial Intelligence
Learning: decision lists, evaluation, Naive Bayesian networks

Peter Antal
antal@mit.bme.hu

September 26, 2016

Page 2

Algorithms for concept learning
◦ Best hypothesis vs. version space

PAC-learning for decision lists

The evaluation of performance

From predictions to optimal decisions

Learning naive Bayesian networks

Page 3

Each model specifies true/false for each proposition symbol

E.g. P1,2 = false, P2,2 = true, P3,1 = false

With these symbols, there are 8 possible models, which can be enumerated automatically.

Rules for evaluating truth with respect to a model m:

¬S is true iff S is false
S1 ∧ S2 is true iff S1 is true and S2 is true
S1 ∨ S2 is true iff S1 is true or S2 is true
S1 ⇒ S2 is true iff S1 is false or S2 is true, i.e., it is false iff S1 is true and S2 is false
S1 ⇔ S2 is true iff S1 ⇒ S2 is true and S2 ⇒ S1 is true

Simple recursive process evaluates an arbitrary sentence, e.g.,

¬P1,2 ∧ (P2,2 ∨ P3,1) = true ∧ (true ∨ false) = true ∧ true = true
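The recursive evaluation above is easy to mirror in code; the following Python sketch is an illustration added to these notes (the tuple-based sentence encoding and the symbol names are assumptions, not part of the original slides).

```python
# Minimal sketch of recursive truth evaluation for propositional sentences.
# Sentences are nested tuples: ("not", s), ("and", s1, s2), ("or", s1, s2),
# ("implies", s1, s2), ("iff", s1, s2), or a proposition symbol (a string).

def evaluate(sentence, model):
    """Return the truth value of `sentence` in `model` (a dict symbol -> bool)."""
    if isinstance(sentence, str):            # proposition symbol
        return model[sentence]
    op, *args = sentence
    if op == "not":
        return not evaluate(args[0], model)
    if op == "and":
        return evaluate(args[0], model) and evaluate(args[1], model)
    if op == "or":
        return evaluate(args[0], model) or evaluate(args[1], model)
    if op == "implies":
        return (not evaluate(args[0], model)) or evaluate(args[1], model)
    if op == "iff":
        return evaluate(args[0], model) == evaluate(args[1], model)
    raise ValueError(f"unknown operator: {op}")

# The example above: ¬P1,2 ∧ (P2,2 ∨ P3,1) in the model P1,2=false, P2,2=true, P3,1=false
model = {"P12": False, "P22": True, "P31": False}
sentence = ("and", ("not", "P12"), ("or", "P22", "P31"))
print(evaluate(sentence, model))  # True
```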



Page 5

Two sentences are logically equivalent iff they are true in the same models: α ≡ β iff α ⊨ β and β ⊨ α


Page 6

B1,1 ⇔ (P1,2 ∨ P2,1)

1. Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α):

(B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1)

2. Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β:

(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬(P1,2 ∨ P2,1) ∨ B1,1)

3. Move ¬ inwards using de Morgan's rules and double-negation:

(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ ((¬P1,2 ∧ ¬P2,1) ∨ B1,1)

4. Apply the distributivity law (∨ over ∧) and flatten:

(¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬P1,2 ∨ B1,1) ∧ (¬P2,1 ∨ B1,1)
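The same conversion can also be checked mechanically; below is a small illustrative sketch using SymPy's logic module (an addition to these notes, not part of the original slides). Its output should match the result above up to the ordering of clauses and literals.

```python
# Convert B1,1 <=> (P1,2 | P2,1) to CNF with SymPy and compare with the manual steps above.
from sympy import symbols
from sympy.logic.boolalg import Equivalent, to_cnf

B11, P12, P21 = symbols("B11 P12 P21")
sentence = Equivalent(B11, P12 | P21)

cnf = to_cnf(sentence)
print(cnf)
# Expected, up to ordering of clauses and literals:
# (B11 | ~P12) & (B11 | ~P21) & (P12 | P21 | ~B11)
```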


Page 7

Goal: selection of a logical function f: {0,1}^n → {0,1} from a function class C which is consistent with the data D_N = {(x1, y1), .., (xN, yN)}, i.e. f(xi) = yi for i = 1..N.

Predicted    Ref. = 0               Ref. = 1
0            True negative (TN)     False negative (FN)
1            False positive (FP)    True positive (TP)

Learning method:
◦ True negative / true positive: no change
◦ False negative: generalize
◦ False positive: specialize

Page 8

False negative: generalization

◦ Replace A ∧ B with A (drop a conjunct)

◦ Replace A with A ∨ B (add a disjunct)

False positive: specialization

◦ Replace A with A ∧ B (add a conjunct)

◦ Replace A ∨ B with A (drop a disjunct)


[Figure: grid of training examples marked + (positive) and - (negative), illustrating the generalization and specialization steps]
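A minimal current-best-hypothesis sketch of this generalize/specialize loop is given below. It is an illustration added to these notes, assuming hypotheses are conjunctions of attribute=value literals; the specialization step is only flagged, not carried out.

```python
# Sketch of a current-best-hypothesis learner for conjunctions of literals.
# A hypothesis is a set of (attribute, value) pairs; an example satisfies it
# if it agrees on every pair. Attribute names and data are invented examples.

def predicts_true(hypothesis, example):
    return all(example.get(attr) == val for attr, val in hypothesis)

def learn(examples):
    """examples: list of (dict attribute -> value, bool label)."""
    # Start with the most specific hypothesis consistent with the first positive example.
    first_x, _ = next((x, y) for x, y in examples if y)
    h = set(first_x.items())
    for x, y in examples:
        if y and not predicts_true(h, x):
            # False negative: generalize by dropping the conjuncts the example violates.
            h = {(a, v) for a, v in h if x.get(a) == v}
        elif not y and predicts_true(h, x):
            # False positive: specialize by adding a conjunct that excludes the example.
            # (Here we only report it; choosing which conjunct to add needs more bookkeeping.)
            print("false positive on", x, "- hypothesis must be specialized")
    return h

data = [({"Hungry": 1, "Rain": 0}, True),
        ({"Hungry": 1, "Rain": 1}, True),
        ({"Hungry": 0, "Rain": 1}, False)]
print(learn(data))  # {("Hungry", 1)}
```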

Page 9

Bound the set of consistent hypotheses with two limiting sets:

◦ S: the set of most specific consistent hypotheses

◦ G: the set of most general consistent hypotheses

Learning from (xi, yi): update Si and Gi

◦ For each hypothesis in Si:

FP: delete

FN: generalize to all neighbours

◦ For each hypothesis in Gi:

FP: specialize to all neighbours

FN: delete


[Figure: hypotheses ordered from specific to general, with the S and G sets bounding the version space]
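A compact sketch of one version-space (candidate elimination) update is shown below. It is added for illustration, assuming hypotheses are tuples of attribute values with "?" as a wildcard; the toy attribute domains and data are invented.

```python
# Sketch of candidate-elimination updates over conjunctive hypotheses.
WILDCARD = "?"

def satisfies(hypothesis, example):
    return all(h == WILDCARD or h == v for h, v in zip(hypothesis, example))

def more_general_or_equal(g, s):
    return all(gv == WILDCARD or gv == sv for gv, sv in zip(g, s))

def generalize(h, example):
    """Minimal generalization of h that covers the (positive) example."""
    return tuple(hv if hv == v else WILDCARD for hv, v in zip(h, example))

def specializations(h, example, domains):
    """Minimal specializations of h that exclude the (negative) example."""
    result = []
    for i, hv in enumerate(h):
        if hv == WILDCARD:
            for value in domains[i]:
                if value != example[i]:
                    result.append(h[:i] + (value,) + h[i + 1:])
    return result

def update(S, G, example, label, domains):
    if label:   # positive example
        G = [g for g in G if satisfies(g, example)]                 # FN in G: delete
        S = [generalize(s, example) if not satisfies(s, example) else s
             for s in S]                                            # FN in S: generalize
    else:       # negative example
        S = [s for s in S if not satisfies(s, example)]             # FP in S: delete
        G = [g2 for g in G                                          # FP in G: specialize
             for g2 in ([g] if not satisfies(g, example)
                        else specializations(g, example, domains))
             if any(more_general_or_equal(g2, s) for s in S)]       # keep only covers of S
    return S, G

domains = [("sunny", "rainy"), ("warm", "cold")]
S, G = [("sunny", "warm")], [(WILDCARD, WILDCARD)]       # seeded from a positive example
S, G = update(S, G, ("sunny", "cold"), True, domains)     # another positive example
S, G = update(S, G, ("rainy", "cold"), False, domains)    # a negative example
print("S =", S)   # [('sunny', '?')]
print("G =", G)   # [('sunny', '?')]
```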

Page 10

One possible representation for hypotheses

E.g., here is the "true" tree for deciding whether to wait (the restaurant example): [figure: decision tree]

Page 11

How many distinct decision trees with n Boolean attributes?
= number of Boolean functions
= number of distinct truth tables with 2^n rows = 2^(2^n)

E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees

How many purely conjunctive hypotheses (e.g., Hungry ∧ ¬Rain)?

Each attribute can be in (positive), in (negative), or out ⇒ 3^n distinct conjunctive hypotheses

More expressive hypothesis space:
◦ increases chance that the target function can be expressed
◦ increases number of hypotheses consistent with the training set
⇒ may get worse predictions
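For concreteness, these counts can be reproduced with a few lines of Python (added here as an illustration):

```python
# Reproduce the hypothesis-space counts for n Boolean attributes.
n = 6
num_boolean_functions = 2 ** (2 ** n)   # distinct decision trees / truth tables
num_conjunctive = 3 ** n                # each attribute: positive, negative, or absent
print(num_boolean_functions)            # 18446744073709551616
print(num_conjunctive)                  # 729
```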

Page 12

k-DL(n): decision lists over n attributes in which each test is a conjunction of at most k literals

Number of tests: Conj(n, k) = Σ_{i=0}^{k} C(2n, i) = O(n^k)

Number of test sequences: Conj(n, k)!

Number of decision lists: each test can yield Yes, No, or be absent, so |k-DL(n)| ≤ 3^Conj(n,k) · Conj(n,k)!
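A small Python check of these counting formulas, added for illustration (the values of n and k are arbitrary examples):

```python
# Evaluate the counting formulas for decision lists.
from math import comb, factorial

def conj(n, k):
    """Number of conjunctions of at most k literals over n Boolean attributes."""
    return sum(comb(2 * n, i) for i in range(k + 1))

n, k = 6, 2
c = conj(n, k)
print(c)                          # 79 tests
print(3 ** c * factorial(c))      # upper bound on |k-DL(n)|
```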

Page 13

Number of decision lists: |k-DL(n)| = 2^O(n^k log(n^k))

PAC sample complexity: m ≥ (1/ε) (ln(1/δ) + O(n^k log(n^k)))
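A sketch of evaluating this sample-size bound numerically, added for illustration; the O(...) term is replaced by the explicit ln|k-DL(n)| bound from the previous slide, and ε, δ, n, k are example values.

```python
# PAC sample-size bound m >= (1/eps) * (ln(1/delta) + ln|H|), with |H| bounded by
# the decision-list count 3^Conj(n,k) * Conj(n,k)! from the previous slide.
from math import comb, log, ceil

def conj(n, k):
    return sum(comb(2 * n, i) for i in range(k + 1))

def pac_sample_bound(n, k, eps, delta):
    c = conj(n, k)
    ln_h = c * log(3) + sum(log(i) for i in range(1, c + 1))   # ln(3^c * c!)
    return ceil((log(1 / delta) + ln_h) / eps)

print(pac_sample_bound(n=6, k=2, eps=0.1, delta=0.05))
```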

Page 14

Sensitivity: p(Prediction=TRUE|Ref=TRUE)

Specificity: p(Prediction=FALSE|Ref=FALSE)

PPV: p(Ref=TRUE|Prediction=TRUE)

NPV: p(Ref=FALSE|Prediction=FALSE)
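These four measures follow directly from the confusion-matrix counts; a minimal sketch with made-up counts (added for illustration):

```python
# Compute the four cost-free performance measures from a confusion matrix.
# The counts below are invented example values.
TP, FP, TN, FN = 40, 10, 45, 5

sensitivity = TP / (TP + FN)   # p(Prediction=TRUE | Ref=TRUE)
specificity = TN / (TN + FP)   # p(Prediction=FALSE | Ref=FALSE)
ppv = TP / (TP + FP)           # p(Ref=TRUE | Prediction=TRUE)
npv = TN / (TN + FN)           # p(Ref=FALSE | Prediction=FALSE)

print(sensitivity, specificity, ppv, npv)   # 0.888..., 0.818..., 0.8, 0.9
```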

Page 15

[Figure: decision tree over Bleeding (absent/weak/strong), Onset (early/late), Regularity (regular/irregular), and Mutation (h.wild/mutated); the leaves hold conditional probabilities such as P(D|a,l,m), P(D|a,l,h.w.), P(D|a,e), P(D|w,i,m), P(D|w,i,h.w.), P(D|w,r), and P(D|Bleeding=strong)]

Decision tree: each internal node represents a (univariate) test; the leaves contain the conditional probabilities given the values along the path.

Decision graph: if conditions are equivalent, the corresponding subtrees can be merged,
e.g. if (Bleeding=absent, Onset=late) ~ (Bleeding=weak, Regularity=irregular).
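As a data-structure sketch added to these notes, such a tree (or graph, when subtrees are shared) can be held as nested dictionaries keyed by test outcomes, with probabilities at the leaves; all attribute names and probability values below are invented for illustration.

```python
# Nested-tuple/dict sketch of a probability-estimation decision tree: internal nodes are
# (test attribute, {value: subtree}) pairs, leaves are P(D | path). All numbers are invented.
shared = ("Mutation", {"wild": 0.10, "mutated": 0.30})   # shared subtree = decision-graph merge
tree = ("Bleeding", {
    "absent": ("Onset", {"early": 0.05, "late": shared}),
    "weak":   ("Regularity", {"regular": 0.15, "irregular": shared}),
    "strong": 0.60,                                       # P(D | Bleeding=strong)
})

def predict(node, case):
    """Walk the tree along the attribute values in `case` and return P(D | path)."""
    while not isinstance(node, float):
        attribute, children = node
        node = children[case[attribute]]
    return node

print(predict(tree, {"Bleeding": "weak", "Regularity": "irregular", "Mutation": "mutated"}))  # 0.3
```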

Page 16

[Figure: class-conditional distributions of a test score for Healthy and Disease present, separated by a decision threshold t]

Page 17

[Figure: decision tree with actions a0 and a1 leading to outcomes o0 and o1]

Reported    Ref. = 0    Ref. = 1
0           C0|0        C0|1
1           C1|0        C1|1
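Given a predicted probability and such a cost matrix, the report that minimizes expected cost can be chosen directly; below is a minimal sketch with invented costs, added for illustration.

```python
# Choose the report (0 or 1) that minimizes expected cost, given P(Ref=1) and a
# cost matrix cost[report][ref]. The probability and costs are invented example values.
def optimal_report(p_ref1, cost):
    expected = {r: cost[r][0] * (1 - p_ref1) + cost[r][1] * p_ref1 for r in (0, 1)}
    return min(expected, key=expected.get), expected

cost = {0: {0: 0.0, 1: 10.0},   # C0|0, C0|1: missing a true case is expensive
        1: {0: 1.0, 1: 0.0}}    # C1|0, C1|1: a false alarm is cheap
print(optimal_report(p_ref1=0.2, cost=cost))   # reports 1 once 10*p > 1*(1-p), i.e. p > 1/11
```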

Page 18

Variables (nodes):

Flu: present/absent
FeverAbove38C: present/absent
Coughing: present/absent

[Figure: naive Bayes structure, Flu with children Fever and Coughing]

Model:

P(Flu=present)=0.001
P(Flu=absent)=1-P(Flu=present)

P(Fever=present|Flu=present)=0.6
P(Fever=absent|Flu=present)=1-0.6
P(Fever=present|Flu=absent)=0.01
P(Fever=absent|Flu=absent)=1-0.01

P(Coughing=present|Flu=present)=0.3
P(Coughing=absent|Flu=present)=1-0.3
P(Coughing=present|Flu=absent)=0.02
P(Coughing=absent|Flu=absent)=1-0.02

Assumptions:

1. Two types of nodes: a cause and its effects.

2. Effects are conditionally independent of each other given their cause.

Page 19

Decomposition of the joint:

P(Y, X1, .., Xn) = P(Y) ∏i P(Xi | Y, X1, .., Xi-1)   // by the chain rule

= P(Y) ∏i P(Xi | Y)   // by the N-BN (naive Bayesian network) assumption

2n+1 parameters!

Diagnostic inference:

P(Y | xi1, .., xik) = P(Y) ∏j P(xij | Y) / P(xi1, .., xik)

If Y is binary, then the odds
P(Y=1 | xi1, .., xik) / P(Y=0 | xi1, .., xik) = P(Y=1)/P(Y=0) · ∏j P(xij | Y=1) / P(xij | Y=0)

[Figure: naive Bayes structure, Flu with children Fever and Coughing]

P(Flu=present | Fever=absent, Coughing=present)
∝ P(Flu=present) · P(Fever=absent | Flu=present) · P(Coughing=present | Flu=present)
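A minimal Python sketch of this diagnostic inference, using the parameters from the previous slide (the data structures and function names are assumptions made for this example):

```python
# Naive Bayes diagnostic inference: P(Flu | evidence) for the two-effect model above.
prior = {"present": 0.001, "absent": 0.999}
cpt = {
    "Fever":    {"present": {"present": 0.6,  "absent": 0.4},
                 "absent":  {"present": 0.01, "absent": 0.99}},
    "Coughing": {"present": {"present": 0.3,  "absent": 0.7},
                 "absent":  {"present": 0.02, "absent": 0.98}},
}
# cpt[effect][flu_value][effect_value] = P(effect_value | flu_value)

def posterior(evidence):
    """evidence: dict like {"Fever": "absent", "Coughing": "present"}."""
    joint = {}
    for flu in ("present", "absent"):
        p = prior[flu]
        for effect, value in evidence.items():
            p *= cpt[effect][flu][value]
        joint[flu] = p
    z = sum(joint.values())            # P(evidence), the normalizing constant
    return {flu: p / z for flu, p in joint.items()}

print(posterior({"Fever": "absent", "Coughing": "present"}))
```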


Page 21

Naive concept learning

Learning decision lists

Decision trees and graphs

Optimal decisions

Error types in classification

Cost-free performance measures

Naive Bayesian network classifiers
