
Page 1: Learning DNF

159.734 LECTURE 2

Learning DNF

Machine Learning

Source: MIT OpenCourseWare

Pages 2–4: Supervised Learning

Supervised Learning

Given data (training set) D = {(x_1, y_1), …, (x_n, y_n)}

Classification problem: discrete Y

Regression problem: continuous Y

Goal: Find a hypothesis h in hypothesis class H that does a good job of mapping x to y.

Pages 5–8: Best Hypothesis

Best Hypothesis

Hypothesis should

do a good job of describing the data

not be too complex

• ideally: h(x_i) = y_i for every training example

• number of errors: E(h, D)

• complexity measure: C(h)

Minimize E(h, D) + α·C(h)

trade-off between fit and complexity
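A minimal sketch of this objective, assuming examples are (x, y) pairs and a hypothesis is any callable from x to a prediction; the names `errors` and `score` are our own, not from the slides:

```python
def errors(h, D):
    """E(h, D): the number of training examples the hypothesis gets wrong."""
    return sum(1 for x, y in D if h(x) != y)

def score(h, complexity, D, alpha=1.0):
    """The quantity to minimize: E(h, D) + alpha * C(h)."""
    return errors(h, D) + alpha * complexity
```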

Pages 9–12: Learning Conjunctions

Learning Conjunctions

Boolean features and output

H = conjunctions of features

      f1  f2  f3  f4  Y
x1     0   1   1   0  0
x2     1   0   1   1  1
x3     1   1   1   0  0
x4     0   0   1   1  1
x5     1   0   0   1  0
x6     0   1   1   1  1

h = f1 ∧ f3

E(h, D) = 3

C(h) = 2

Set α so that we're looking for the smallest h with 0 error.
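A quick check of these numbers, using an encoding of the table (our own choice) as 0/1 tuples; indices are 0-based, so f1 is position 0:

```python
# The table above as (features, label) pairs, features ordered f1..f4.
D = [
    ((0, 1, 1, 0), 0),  # x1
    ((1, 0, 1, 1), 1),  # x2
    ((1, 1, 1, 0), 0),  # x3
    ((0, 0, 1, 1), 1),  # x4
    ((1, 0, 0, 1), 0),  # x5
    ((0, 1, 1, 1), 1),  # x6
]

h = lambda f: int(f[0] == 1 and f[2] == 1)  # h = f1 ∧ f3
print(sum(1 for f, y in D if h(f) != y))    # -> 3, matching E(h, D) = 3
```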

Page 13: Algorithm

Algorithm

Could search in hypothesis space using tools we’ve already studied in AI

Instead, be greedy!

Start with h=True

All errors are on negative examples

On each step, add conjunct that rules out most new negatives (without excluding positives)

Pages 14–15: Pseudo-Code

Pseudo-Code

N = negative examples in D

h = True

Loop until N is empty

  For every feature j that does not have value 0 on any positive example:

    nj := number of examples in N for which fj = 0

  If no such feature exists, fail

  j* := the j for which nj is maximized

  h := h ∧ fj*

  N := N − examples in N for which fj* = 0
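A runnable sketch of this learner, reusing the (features, label) encoding above; representing a conjunction as a set of feature indices is our own choice:

```python
def learn_conjunction(examples):
    """Greedily build a conjunction (a set of 0-based feature indices)
    with zero training error, or raise ValueError if stuck."""
    pos = [f for f, y in examples if y == 1]
    N = [f for f, y in examples if y == 0]   # negatives not yet ruled out
    h = set()                                # empty conjunction = True
    while N:
        # Candidate features: 1 on every positive example, not already used.
        candidates = [j for j in range(len(examples[0][0]))
                      if j not in h and all(f[j] == 1 for f in pos)]
        # n[j]: how many remaining negatives feature j would rule out.
        n = {j: sum(f[j] == 0 for f in N) for j in candidates}
        if not candidates or max(n.values()) == 0:
            raise ValueError("no feature rules out any remaining negative")
        j_star = max(candidates, key=n.get)
        h.add(j_star)
        N = [f for f in N if f[j_star] == 1]
    return h
```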

Pages 16–20: Simulation

Simulation

      f1  f2  f3  f4  Y
x1     0   1   1   0  0
x2     1   0   1   1  1
x3     1   1   1   0  0
x4     0   0   1   1  1
x5     1   0   0   1  0
x6     0   1   1   1  1

N = {x1, x3, x5}, h = True

n3 = 1, n4 = 2 (only f3 and f4 are 1 on all positive examples)

N = {x5}, h = f4

n3 = 1, n4 = 0

N = { }, h = f4 ∧ f3

Done!
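Running the sketch from Page 15 on the encoded table reproduces this trace (assuming `D` and `learn_conjunction` from the earlier sketches are in scope):

```python
print(sorted(learn_conjunction(D)))  # -> [2, 3], i.e. h = f3 ∧ f4
```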

Pages 21–22: Simulation: A Harder Problem

Simulation: A Harder Problem

We made one negative example (x3) into a positive:

      f1  f2  f3  f4  Y
x1     0   1   1   0  0
x2     1   0   1   1  1
x3     1   1   1   0  1
x4     0   0   1   1  1
x5     1   0   0   1  0
x6     0   1   1   1  1

The only usable feature is f3: every other feature is 0 on some positive example.

f3 alone does not rule out x1, and we can't add any more features to h.

We're stuck.

This is the best we can do when H is purely conjunctions.

Live with the error, or change H.
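The sketch from Page 15 fails here exactly as described (again assuming the earlier definitions are in scope):

```python
D2 = list(D)
D2[2] = ((1, 1, 1, 0), 1)  # flip x3 from negative to positive
learn_conjunction(D2)      # raises ValueError: stuck after adding f3
```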

Pages 23–25: Disjunctive Normal Form

Disjunctive Normal Form

Like the opposite of conjunctive normal form (but, for now, without negations of the atoms)

Think of each disjunct as narrowing in on a subset of the positive examples

(A ∧ B) ∨ (C ∧ D) ∨ (A ∧ E)

      f1  f2  f3  f4  Y
x1     0   1   1   0  0
x2     1   0   1   1  1
x3     1   1   1   0  0
x4     0   0   1   1  1
x5     1   0   0   1  0
x6     0   1   1   1  1

(f3 ∧ f4) ∨ (f1 ∧ f2)
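A tiny evaluator for such negation-free DNF formulas, in the same representation as the earlier sketches (a formula as a list of conjunctions, each a set of 0-based feature indices; the encoding is our own):

```python
def eval_dnf(disjuncts, f):
    """1 if any disjunct has all of its features true in example f."""
    return int(any(all(f[j] == 1 for j in d) for d in disjuncts))

dnf = [{2, 3}, {0, 1}]              # (f3 ∧ f4) ∨ (f1 ∧ f2)
print(eval_dnf(dnf, (1, 0, 1, 1)))  # x2 -> 1, covered by f3 ∧ f4
```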

Pages 26–28: Learning DNF

Learning DNF

Let H be DNF expressions

C(h): the number of mentions of features

C((f3 ∧ f4) ∨ (f1 ∧ f2)) = 4

Really hard to search this space, so be greedy again!

A conjunction covers an example if all of the features mentioned in the conjunction are true in the example.
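In the set-of-indices representation used above, "covers" is one line (our own helper, used by the learner sketch below):

```python
def covers(conjunction, f):
    """True if every feature in the conjunction is 1 in example f."""
    return all(f[j] == 1 for j in conjunction)
```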

Page 29: First meeting

AlgorithmAlgorithmP = set of all positive examples

h = False

Loop until P is empty

Else, select a feature fj to add to r

r = r ^ fj

N = N – examples in n for which fj = 0

Covered := examples in P covered by r

r = True

N = set of all negative examples

Loop until N is empty

If all features are in r, fail

rhh ∨=

If Covered is empty, fail

Else, P := P = Covered

end

Pages 30–32: Choosing a Feature

Choosing a Feature

Heuristic:

vj = nj+ / max(nj−, 0.001)

nj+ = # not-yet-covered positive examples covered by r ∧ fj

nj− = # not-yet-ruled-out negative examples covered by r ∧ fj

Choose the feature with the largest value of vj

(Taking the max with 0.001 avoids division by zero: a feature that covers no remaining negatives gets a very large score.)
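Putting Pages 29–32 together: a runnable sketch of the greedy DNF learner, reusing `covers` and the tuple encoding from the earlier sketches; all names are our own:

```python
def learn_dnf(examples):
    """Greedily build a list of conjunctions (sets of 0-based feature
    indices) whose disjunction covers all positives and no negatives."""
    n_features = len(examples[0][0])
    P = [f for f, y in examples if y == 1]       # positives not yet covered
    h = []                                       # list of disjuncts (DNF)
    while P:
        r = set()                                # current conjunction = True
        N = [f for f, y in examples if y == 0]   # negatives not yet ruled out
        while N:
            best_j, best_v = None, -1.0
            for j in range(n_features):
                if j in r:
                    continue
                n_pos = sum(covers(r | {j}, f) for f in P)  # nj+
                n_neg = sum(covers(r | {j}, f) for f in N)  # nj-
                v = n_pos / max(n_neg, 0.001)
                if v > best_v:
                    best_j, best_v = j, v
            if best_j is None:
                raise ValueError("all features already in r")
            r.add(best_j)
            N = [f for f in N if covers(r, f)]
        covered = [f for f in P if covers(r, f)]
        if not covered:
            raise ValueError("conjunction covers no remaining positives")
        P = [f for f in P if not covers(r, f)]
        h.append(r)
    return h
```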

Pages 33–34: Simulation

Simulation

      f1  f2  f3  f4  Y
x1     0   1   1   0  0
x2     1   0   1   1  1
x3     1   1   1   0  1
x4     0   0   1   1  1
x5     1   0   0   1  0
x6     0   1   1   1  1

h = False, P = {x2, x3, x4, x6}

r = True, N = {x1, x5}

v1 = 2/1, v2 = 2/1, v3 = 4/1, v4 = 3/1

r = f3, N = {x1}

v1 = 2/0, v2 = 2/1, v4 = 3/0

r = f3 ∧ f4, N = { }

h = f3 ∧ f4, P = {x3}

r = True, N = {x1, x5}

v1 = 1/1, v2 = 1/1, v3 = 1/1, v4 = 0/1

r = f1, N = {x5}

v2 = 1/0, v3 = 1/0, v4 = 0/1

r = f1 ∧ f2, N = { }

h = (f3 ∧ f4) ∨ (f1 ∧ f2), P = { }

Done!
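Running the learner sketch on the harder table (the `D2` defined earlier) reproduces this result:

```python
print(learn_dnf(D2))  # -> [{2, 3}, {0, 1}], i.e. (f3 ∧ f4) ∨ (f1 ∧ f2)
```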