159.734 LECTURE
Learning DNF
Source: MIT OpenCourseWare
Machine Learning
Supervised Learning
Given data (training set) D
Classification problem: discrete Y
Regression problem: continuous Y
Goal: Find a hypothesis h in hypothesis class H that does a good job of mapping x to y.
Best Hypothesis
Hypothesis should
do a good job of describing the data
• ideally: h(xi) = yi
• number of errors: E(h, D)
not be too complex
• measure: C(h)
Minimize E(h, D) + α C(h)
trade-off between fit and complexity
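The minimization above can be sketched as a scoring function over candidate hypotheses. A minimal sketch in Python, where `predict` and `complexity` stand in for h(·) and C(h) (all names here are illustrative, not from the slides):

```python
def score(predict, complexity, data, alpha):
    """E(h, D) + alpha * C(h): training errors plus a complexity penalty."""
    errors = sum(1 for x, y in data if predict(x) != y)   # E(h, D)
    return errors + alpha * complexity                    # + alpha * C(h)

def best_hypothesis(candidates, data, alpha):
    # candidates: list of (predict, complexity) pairs; keep the cheapest
    return min(candidates, key=lambda h: score(h[0], h[1], data, alpha))
```

Larger alpha favors simpler hypotheses at the cost of more training errors; smaller alpha favors fitting the data.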
Learning Conjunctions
Boolean features and output
H = conjunctions of features

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 0
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

h = f1 ∧ f3
E(h, D) = 3
C(h) = 2
Set alpha so we're looking for smallest h with 0 error
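The two counts above can be checked directly. A sketch, encoding each row as a tuple (f1, f2, f3, f4) with its label Y (the 0/1 encoding is my reading of the slide's table):

```python
DATA = [                     # (f1, f2, f3, f4), Y
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 0),       # x3
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]

def h(x):                    # h = f1 ^ f3
    return x[0] and x[2]

E = sum(1 for x, y in DATA if h(x) != y)   # misclassified examples
C = 2                                      # features mentioned in h
print(E, C)                                # -> 3 2
```

h gets x3, x4, and x6 wrong, giving the E(h, D) = 3 on the slide.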
Algorithm
Could search in hypothesis space using tools we’ve already studied in AI
Instead, be greedy!
Start with h=True
All errors are on negative examples
On each step, add conjunct that rules out most new negatives (without excluding positives)
Pseudo-Code
N = negative examples in D
h = True
Loop until N is empty
  For every feature fj that does not have value 0 on any positive example:
    nj := number of examples in N for which fj = 0
  If no such feature found, fail
  j* := j for which nj is maximized
  h := h ∧ fj*
  N := N − examples in N for which fj* = 0
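A runnable version of the pseudo-code, assuming examples are 0/1 feature tuples with a 0/1 label (the encoding is mine; feature index j corresponds to f(j+1)):

```python
def learn_conjunction(data):
    """Greedy conjunction learner from the pseudo-code above.

    data: list of (features, label) pairs with 0/1 values.
    Returns the 0-based indices of the features in h, or raises if stuck.
    """
    positives = [x for x, y in data if y == 1]
    N = [x for x, y in data if y == 0]       # negatives not yet ruled out
    h = []                                   # h = True initially
    nfeat = len(data[0][0])
    while N:
        # only features that are never 0 on a positive example are usable
        usable = [j for j in range(nfeat) if j not in h
                  and all(x[j] == 1 for x in positives)]
        # n_j = number of remaining negatives that f_j would rule out
        scores = {j: sum(1 for x in N if x[j] == 0) for j in usable}
        if not scores or max(scores.values()) == 0:
            raise RuntimeError("no feature rules out a remaining negative")
        j_star = max(scores, key=scores.get)
        h.append(j_star)                     # h := h ^ f_j*
        N = [x for x in N if x[j_star] == 1]
    return h
```

On the slide's dataset this returns [3, 2], i.e. h = f4 ∧ f3, matching the simulation that follows.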
Simulation

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 0
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

N = {x1, x3, x5}, h = True
n3 = 1, n4 = 2
N = {x5}, h = f4
n3 = 1, n4 = 0
N = { }, h = f4 ∧ f3
Done!
Simulation: A Harder Problem
We made one negative into a positive

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

The only usable feature is f3
Can't add any more features to h
We're stuck
Best we can do when H is purely conjunctions
Live with the error or change H
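Why the learner gets stuck can be checked mechanically. A sketch on the modified table (x3 flipped to positive; the tuple encoding is mine):

```python
DATA = [                     # harder table: x3 (row 3) is now positive
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 1),       # x3 -- flipped from negative to positive
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]
positives = [x for x, y in DATA if y == 1]
# a feature is usable only if it is never 0 on a positive example
usable = [j for j in range(4) if all(x[j] == 1 for x in positives)]
print(usable)                # only f3 (index 2) survives
x1 = DATA[0][0]
print(x1[usable[0]])         # f3 is 1 on x1, so f3 cannot rule x1 out -> stuck
```

Flipping x3 kills f4 as a candidate (x3 has f4 = 0), and f3 alone cannot exclude x1.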
Disjunctive Normal Form
Like the opposite of conjunctive normal form (but, for now, without negations of the atoms)
Think of each disjunct as narrowing in on a subset of the positive examples
(A ∧ B) ∨ (C ∧ D) ∨ (A ∧ E)

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

(f3 ∧ f4) ∨ (f1 ∧ f2)
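The DNF above can be evaluated against the harder dataset (x3 positive). A sketch with my tuple encoding:

```python
DATA = [                     # harder table: x3 is positive
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 1),       # x3
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]

def h(x):                    # (f3 ^ f4) v (f1 ^ f2)
    return (x[2] and x[3]) or (x[0] and x[1])

errors = sum(1 for x, y in DATA if bool(h(x)) != bool(y))
print(errors)                # -> 0
```

The first disjunct covers x2, x4, x6; the second covers x3; neither covers a negative.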
Learning DNF
Let H be DNF expressions
C(h): number of mentions of features
C((f3 ∧ f4) ∨ (f1 ∧ f2)) = 4
Really hard to search this space, so be greedy again!
A conjunction covers an example if all of the features mentioned in the conjunction are true in the example.
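That covering test is a one-liner once conjunctions are lists of feature indices (an encoding I am assuming, not one from the slides):

```python
def covers(conjunction, x):
    """True iff every feature the conjunction mentions is 1 in example x."""
    return all(x[j] == 1 for j in conjunction)

print(covers([2, 3], (1, 0, 1, 1)))   # f3 ^ f4 covers this example -> True
print(covers([2, 3], (0, 1, 1, 0)))   # f4 is 0 here -> False
```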
Algorithm
P = set of all positive examples
h = False
Loop until P is empty
  r = True
  N = set of all negative examples
  Loop until N is empty
    If all features are in r, fail
    Else, select a feature fj to add to r
    r := r ∧ fj
    N := N − examples in N for which fj = 0
  Covered := examples in P covered by r
  If Covered is empty, fail
  Else, P := P − Covered
  h := h ∨ r
end
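A runnable sketch of the loop above, choosing fj with the vj heuristic from the next slide (the pseudo-code itself leaves the choice open). Examples are 0/1 tuples, conjunctions are lists of feature indices; both encodings are mine:

```python
def learn_dnf(data, eps=0.001):
    """Greedy DNF learner: grow one conjunction (rule) per outer pass."""
    nfeat = len(data[0][0])

    def covers(conj, x):
        return all(x[j] == 1 for j in conj)

    P = [x for x, y in data if y == 1]        # positives not yet covered
    h = []                                    # h = False
    while P:
        r = []                                # r = True
        N = [x for x, y in data if y == 0]    # negatives not yet ruled out
        while N:
            cands = [j for j in range(nfeat) if j not in r]
            if not cands:                     # all features are in r: fail
                raise RuntimeError("no feature left to add to r")

            def v(j):                         # v_j = n_j+ / max(n_j-, eps)
                n_plus = sum(covers(r + [j], x) for x in P)
                n_minus = sum(covers(r + [j], x) for x in N)
                return n_plus / max(n_minus, eps)

            j = max(cands, key=v)
            r.append(j)                       # r := r ^ f_j
            N = [x for x in N if x[j] == 1]
        covered = [x for x in P if covers(r, x)]
        if not covered:                       # r covers no positives: fail
            raise RuntimeError("r covers no positive examples")
        P = [x for x in P if not covers(r, x)]
        h.append(r)                           # h := h v r
    return h
```

On the harder dataset this yields [[2, 3], [0, 1]], i.e. (f3 ∧ f4) ∨ (f1 ∧ f2), matching the simulation below.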
Choosing a Feature
Heuristic: vj = nj+ / max(nj−, 0.001)
nj+ = # not-yet-covered positive examples covered by r ∧ fj
nj− = # not-yet-ruled-out negative examples covered by r ∧ fj
Choose the feature with the largest value of vj
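On the first step of the simulation that follows (r = True, no positives covered yet, N = {x1, x5}), the heuristic reproduces the slide's numbers. A sketch (the tuple encoding is mine):

```python
P = [(1, 0, 1, 1), (1, 1, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1)]   # x2, x3, x4, x6
N = [(0, 1, 1, 0), (1, 0, 0, 1)]                               # x1, x5

def v(j, r=()):
    covered = lambda x: all(x[k] == 1 for k in (*r, j))
    n_plus = sum(covered(x) for x in P)        # not-yet-covered positives
    n_minus = sum(covered(x) for x in N)       # not-yet-ruled-out negatives
    return n_plus / max(n_minus, 0.001)

print([v(j) for j in range(4)])   # -> [2.0, 2.0, 4.0, 3.0], so pick f3
```

The 0.001 floor keeps vj finite (and large) when a feature rules out every remaining negative.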
Simulation
h = False, P = {x2, x3, x4, x6}

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

r = True, N = {x1, x5}
v1 = 2/1, v2 = 2/1, v3 = 4/1, v4 = 3/1
r = f3, N = {x1}
v1 = 2/0, v2 = 2/1, v4 = 3/0
r = f3 ∧ f4, N = { }
h = f3 ∧ f4, P = {x3}

r = True, N = {x1, x5}
v1 = 1/1, v2 = 1/1, v3 = 1/1, v4 = 0/1
r = f1, N = {x5}
v2 = 1/0, v3 = 1/0, v4 = 0/1
r = f1 ∧ f2, N = { }
h = (f3 ∧ f4) ∨ (f1 ∧ f2), P = { }
Done!
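The second pass's scores can be spot-checked the same way, with P = {x3} (the only positive not covered by f3 ∧ f4) and the full negative set N = {x1, x5}. A sketch (tuple encoding is mine):

```python
P = [(1, 1, 1, 0)]                       # x3
N = [(0, 1, 1, 0), (1, 0, 0, 1)]         # x1, x5

def counts(j):
    # (n_j+, n_j-) for r = True, i.e. coverage by f_j alone
    return (sum(x[j] for x in P), sum(x[j] for x in N))

print([counts(j) for j in range(4)])     # -> [(1, 1), (1, 1), (1, 1), (0, 1)]
```

These are exactly the v1 = 1/1, v2 = 1/1, v3 = 1/1, v4 = 0/1 shown above, so the tie among f1, f2, f3 is broken arbitrarily in favor of f1.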