159.734 LECTURE
Learning DNF
Source: MIT OpenCourseWare
Machine Learning
Supervised Learning
Given data (training set) D
Classification problem: discrete Y
Regression problem: continuous Y
Goal: Find a hypothesis h in hypothesis class H that does a good job of mapping x to y.
Best Hypothesis
Hypothesis should
do a good job of describing the data
• ideally: h(xi) = yi
• number of errors: E(h, D)
not be too complex
• measure: C(h)
Minimize E(h, D) + α C(h)
trade-off between fit and complexity
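The minimization above can be sketched as a scoring function over candidate hypotheses. A minimal sketch in Python, where `predict` and `complexity` stand in for h(·) and C(h) (all names here are illustrative, not from the slides):

```python
def score(predict, complexity, data, alpha):
    """E(h, D) + alpha * C(h): training errors plus a complexity penalty."""
    errors = sum(1 for x, y in data if predict(x) != y)   # E(h, D)
    return errors + alpha * complexity                    # + alpha * C(h)

def best_hypothesis(candidates, data, alpha):
    # candidates: list of (predict, complexity) pairs; keep the cheapest
    return min(candidates, key=lambda h: score(h[0], h[1], data, alpha))
```

Larger alpha favors simpler hypotheses at the cost of more training errors; smaller alpha favors fitting the data.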
Learning Conjunctions
Boolean features and output
H = conjunctions of features

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 0
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

h = f1 ∧ f3
E(h, D) = 3
C(h) = 2
Set alpha so we're looking for smallest h with 0 error
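The two counts above can be checked directly. A sketch, encoding each row as a tuple (f1, f2, f3, f4) with its label Y (the 0/1 encoding is my reading of the slide's table):

```python
DATA = [                     # (f1, f2, f3, f4), Y
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 0),       # x3
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]

def h(x):                    # h = f1 ^ f3
    return x[0] and x[2]

E = sum(1 for x, y in DATA if h(x) != y)   # misclassified examples
C = 2                                      # features mentioned in h
print(E, C)                                # -> 3 2
```

h gets x3, x4, and x6 wrong, giving the E(h, D) = 3 on the slide.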
Algorithm
Could search in hypothesis space using tools we’ve already studied in AI
Instead, be greedy!
Start with h=True
All errors are on negative examples
On each step, add conjunct that rules out most new negatives (without excluding positives)
Pseudo-Code
N = negative examples in D
h = True
Loop until N is empty
  For every feature fj that does not have value 0 on any positive example:
    nj := number of examples in N for which fj = 0
  If no such feature found, fail
  j* := j for which nj is maximized
  h := h ∧ fj*
  N := N − examples in N for which fj* = 0
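A runnable version of the pseudo-code, assuming examples are 0/1 feature tuples with a 0/1 label (the encoding is mine; feature index j corresponds to f(j+1)):

```python
def learn_conjunction(data):
    """Greedy conjunction learner from the pseudo-code above.

    data: list of (features, label) pairs with 0/1 values.
    Returns the 0-based indices of the features in h, or raises if stuck.
    """
    positives = [x for x, y in data if y == 1]
    N = [x for x, y in data if y == 0]       # negatives not yet ruled out
    h = []                                   # h = True initially
    nfeat = len(data[0][0])
    while N:
        # only features that are never 0 on a positive example are usable
        usable = [j for j in range(nfeat) if j not in h
                  and all(x[j] == 1 for x in positives)]
        # n_j = number of remaining negatives that f_j would rule out
        scores = {j: sum(1 for x in N if x[j] == 0) for j in usable}
        if not scores or max(scores.values()) == 0:
            raise RuntimeError("no feature rules out a remaining negative")
        j_star = max(scores, key=scores.get)
        h.append(j_star)                     # h := h ^ f_j*
        N = [x for x in N if x[j_star] == 1]
    return h
```

On the slide's dataset this returns [3, 2], i.e. h = f4 ∧ f3, matching the simulation that follows.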
Simulation

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 0
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

N = {x1, x3, x5}, h = True
n3 = 1, n4 = 2
N = {x5}, h = f4
n3 = 1, n4 = 0
N = { }, h = f4 ∧ f3
Done!
Simulation: A Harder Problem
We made one negative into a positive

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

The only usable feature is f3
Can't add any more features to h
We're stuck
Best we can do when H is purely conjunctions
Live with the error or change H
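Why the learner gets stuck can be checked mechanically. A sketch on the modified table (x3 flipped to positive; the tuple encoding is mine):

```python
DATA = [                     # harder table: x3 (row 3) is now positive
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 1),       # x3 -- flipped from negative to positive
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]
positives = [x for x, y in DATA if y == 1]
# a feature is usable only if it is never 0 on a positive example
usable = [j for j in range(4) if all(x[j] == 1 for x in positives)]
print(usable)                # only f3 (index 2) survives
x1 = DATA[0][0]
print(x1[usable[0]])         # f3 is 1 on x1, so f3 cannot rule x1 out -> stuck
```

Flipping x3 kills f4 as a candidate (x3 has f4 = 0), and f3 alone cannot exclude x1.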
Disjunctive Normal Form
Like the opposite of conjunctive normal form (but, for now, without negations of the atoms)
Think of each disjunct as narrowing in on a subset of the positive examples
(A ∧ B) ∨ (C ∧ D) ∨ (A ∧ E)

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

(f3 ∧ f4) ∨ (f1 ∧ f2)
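The DNF above can be evaluated against the harder dataset (x3 positive). A sketch with my tuple encoding:

```python
DATA = [                     # harder table: x3 is positive
    ((0, 1, 1, 0), 0),       # x1
    ((1, 0, 1, 1), 1),       # x2
    ((1, 1, 1, 0), 1),       # x3
    ((0, 0, 1, 1), 1),       # x4
    ((1, 0, 0, 1), 0),       # x5
    ((0, 1, 1, 1), 1),       # x6
]

def h(x):                    # (f3 ^ f4) v (f1 ^ f2)
    return (x[2] and x[3]) or (x[0] and x[1])

errors = sum(1 for x, y in DATA if bool(h(x)) != bool(y))
print(errors)                # -> 0
```

The first disjunct covers x2, x4, x6; the second covers x3; neither covers a negative.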
Learning DNF
Let H be DNF expressions
C(h): number of mentions of features
C((f3 ∧ f4) ∨ (f1 ∧ f2)) = 4
Really hard to search this space, so be greedy again!
A conjunction covers an example if all of the features mentioned in the conjunction are true in the example.
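That covering test is a one-liner once conjunctions are lists of feature indices (an encoding I am assuming, not one from the slides):

```python
def covers(conjunction, x):
    """True iff every feature the conjunction mentions is 1 in example x."""
    return all(x[j] == 1 for j in conjunction)

print(covers([2, 3], (1, 0, 1, 1)))   # f3 ^ f4 covers this example -> True
print(covers([2, 3], (0, 1, 1, 0)))   # f4 is 0 here -> False
```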
Algorithm
P = set of all positive examples
h = False
Loop until P is empty
  r = True
  N = set of all negative examples
  Loop until N is empty
    If all features are in r, fail
    Else, select a feature fj to add to r
    r := r ∧ fj
    N := N − examples in N for which fj = 0
  Covered := examples in P covered by r
  If Covered is empty, fail
  Else, P := P − Covered
  h := h ∨ r
end
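A runnable sketch of the loop above, choosing fj with the vj heuristic from the next slide (the pseudo-code itself leaves the choice open). Examples are 0/1 tuples, conjunctions are lists of feature indices; both encodings are mine:

```python
def learn_dnf(data, eps=0.001):
    """Greedy DNF learner: grow one conjunction (rule) per outer pass."""
    nfeat = len(data[0][0])

    def covers(conj, x):
        return all(x[j] == 1 for j in conj)

    P = [x for x, y in data if y == 1]        # positives not yet covered
    h = []                                    # h = False
    while P:
        r = []                                # r = True
        N = [x for x, y in data if y == 0]    # negatives not yet ruled out
        while N:
            cands = [j for j in range(nfeat) if j not in r]
            if not cands:                     # all features are in r: fail
                raise RuntimeError("no feature left to add to r")

            def v(j):                         # v_j = n_j+ / max(n_j-, eps)
                n_plus = sum(covers(r + [j], x) for x in P)
                n_minus = sum(covers(r + [j], x) for x in N)
                return n_plus / max(n_minus, eps)

            j = max(cands, key=v)
            r.append(j)                       # r := r ^ f_j
            N = [x for x in N if x[j] == 1]
        covered = [x for x in P if covers(r, x)]
        if not covered:                       # r covers no positives: fail
            raise RuntimeError("r covers no positive examples")
        P = [x for x in P if not covers(r, x)]
        h.append(r)                           # h := h v r
    return h
```

On the harder dataset this yields [[2, 3], [0, 1]], i.e. (f3 ∧ f4) ∨ (f1 ∧ f2), matching the simulation below.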
Choosing a Feature
Heuristic: vj = nj+ / max(nj−, 0.001)
nj+ = # not-yet-covered positive examples covered by r ∧ fj
nj− = # not-yet-ruled-out negative examples covered by r ∧ fj
Choose the feature with the largest value of vj
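On the first step of the simulation that follows (r = True, no positives covered yet, N = {x1, x5}), the heuristic reproduces the slide's numbers. A sketch (the tuple encoding is mine):

```python
P = [(1, 0, 1, 1), (1, 1, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1)]   # x2, x3, x4, x6
N = [(0, 1, 1, 0), (1, 0, 0, 1)]                               # x1, x5

def v(j, r=()):
    covered = lambda x: all(x[k] == 1 for k in (*r, j))
    n_plus = sum(covered(x) for x in P)        # not-yet-covered positives
    n_minus = sum(covered(x) for x in N)       # not-yet-ruled-out negatives
    return n_plus / max(n_minus, 0.001)

print([v(j) for j in range(4)])   # -> [2.0, 2.0, 4.0, 3.0], so pick f3
```

The 0.001 floor keeps vj finite (and large) when a feature rules out every remaining negative.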
Simulation
h = False, P = {x2, x3, x4, x6}

      f1 f2 f3 f4 | Y
  x1   0  1  1  0 | 0
  x2   1  0  1  1 | 1
  x3   1  1  1  0 | 1
  x4   0  0  1  1 | 1
  x5   1  0  0  1 | 0
  x6   0  1  1  1 | 1

r = True, N = {x1, x5}
v1 = 2/1, v2 = 2/1, v3 = 4/1, v4 = 3/1
r = f3, N = {x1}
v1 = 2/0, v2 = 2/1, v4 = 3/0
r = f3 ∧ f4, N = { }
h = f3 ∧ f4, P = {x3}

r = True, N = {x1, x5}
v1 = 1/1, v2 = 1/1, v3 = 1/1, v4 = 0/1
r = f1, N = {x5}
v2 = 1/0, v3 = 1/0, v4 = 0/1
r = f1 ∧ f2, N = { }
h = (f3 ∧ f4) ∨ (f1 ∧ f2), P = { }
Done!
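The second pass's scores can be spot-checked the same way, with P = {x3} (the only positive not covered by f3 ∧ f4) and the full negative set N = {x1, x5}. A sketch (tuple encoding is mine):

```python
P = [(1, 1, 1, 0)]                       # x3
N = [(0, 1, 1, 0), (1, 0, 0, 1)]         # x1, x5

def counts(j):
    # (n_j+, n_j-) for r = True, i.e. coverage by f_j alone
    return (sum(x[j] for x in P), sum(x[j] for x in N))

print([counts(j) for j in range(4)])     # -> [(1, 1), (1, 1), (1, 1), (0, 1)]
```

These are exactly the v1 = 1/1, v2 = 1/1, v3 = 1/1, v4 = 0/1 shown above, so the tie among f1, f2, f3 is broken arbitrarily in favor of f1.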