K2 Algorithm Presentation
Learning Bayes Networks from Data
Haipeng Guo
Friday, April 21, 2000
KDD Lab, CIS Department, KSU
Presentation Outline
• Bayes Networks Introduction
• What’s K2?
• Basic Model and the Score Function
• K2 algorithm
• Demo
Bayes Networks Introduction
• A Bayes network B = (Bs, Bp)
• A Bayes network structure Bs is a directed acyclic graph in which nodes represent random domain variables and arcs between nodes represent probabilistic dependence.
• Bs is augmented by conditional probabilities, Bp, to form a Bayes network B.
• Example: Sprinkler
- Bs of Bayes Network: the structure
[Figure: directed acyclic graph over five nodes: x1 = Season, x2 = Sprinkler, x3 = Rain, x4 = Ground_moist, x5 = Ground_state]
- Bp of Bayes Network: the conditional probabilities

season:
P(Spring)  P(Summer)  P(Fall)  P(Winter)
0.25       0.25       0.25     0.25

sprinkler:
Season   P(on)  P(off)
Spring   0.75   0.25
Summer   1      0
Fall     0.75   0.25
Winter   0.25   0.75

Rain, Ground-moist, and Ground-state: each has its own conditional probability table
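To make the role of Bp concrete, here is a minimal Python sketch that stores the two tables above and combines them to get the marginal probability of the sprinkler state. It assumes, as the sprinkler table suggests, that Season is the only parent of Sprinkler; the dictionary layout and the function name `sprinkler_marginal` are illustrative choices, not part of the presentation.

```python
# Conditional probabilities copied from the season and sprinkler tables.
p_season = {"Spring": 0.25, "Summer": 0.25, "Fall": 0.25, "Winter": 0.25}
p_sprinkler_given_season = {
    "Spring": {"on": 0.75, "off": 0.25},
    "Summer": {"on": 1.0, "off": 0.0},
    "Fall": {"on": 0.75, "off": 0.25},
    "Winter": {"on": 0.25, "off": 0.75},
}


def sprinkler_marginal(state):
    """P(Sprinkler = state), summing out Season (assumed to be the only parent)."""
    return sum(
        p_season[season] * p_sprinkler_given_season[season][state]
        for season in p_season
    )


p_on = sprinkler_marginal("on")
p_off = sprinkler_marginal("off")
```

With these numbers the sprinkler is on with probability 0.25 × (0.75 + 1 + 0.75 + 0.25) = 0.6875.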
What’s K2?
• K2 is an algorithm for constructing a Bayes Network from a database of records
• “A Bayesian Method for the Induction of Probabilistic Networks from Data”, Gregory F. Cooper and Edward Herskovits, Machine Learning 9, 1992
Basic Model
• The problem: to find the most probable
Bayes-network structure given a database
• D – a database of cases
• Z – the set of variables represented by D
• Bsi, Bsj – two Bayes network structures containing exactly those variables that are in Z
$$ \frac{P(B_{s_i} \mid D)}{P(B_{s_j} \mid D)} = \frac{P(B_{s_i}, D) / P(D)}{P(B_{s_j}, D) / P(D)} = \frac{P(B_{s_i}, D)}{P(B_{s_j}, D)} $$
• By computing such ratios for pairs of Bayes network structures, we can rank a set of structures by their posterior probabilities.
• Based on four assumptions, the paper introduces an efficient formula for computing P(Bs,D), where Bs represents an arbitrary Bayes network structure containing just the variables in D.
Computing P(Bs,D)
• Assumption 1: The database variables, which we denote as Z, are discrete.
• Assumption 2: Cases occur independently, given a Bayes network model.
• Assumption 3: There are no cases that have variables with missing values.
• Assumption 4: The density function f(Bp|Bs) is uniform. Bp is a vector whose values denote the conditional-probability assignment associated with structure Bs.
$$ P(B_s, D) = P(B_s) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! $$
Where
D - the dataset; it has m cases (records)
Z - a set of n discrete variables: (x1, …, xn)
ri - a variable xi in Z has ri possible value assignments: (vi1, …, viri)
Bs - a Bayes network structure containing just the variables in Z
πi - each variable xi in Bs has a set of parents, which we represent with a list of variables πi
qi - there are qi unique instantiations of πi
wij - denotes the jth unique instantiation of πi relative to D
Nijk - the number of cases in D in which variable xi has the value vik and πi is instantiated as wij
Nij - $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$
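The formula can be checked by hand on a small database. The sketch below evaluates P(Bs,D) / P(Bs) exactly, using the ten-case table from the Demo Example with present coded as 1 and absent as 0. The names `CASES`, `ARITY`, and `likelihood_term`, and the choice to pass a structure as a dictionary from child index to parent indices, are assumptions made for illustration. Parent instantiations that never occur in D contribute a factor of 1, so only observed ones are visited.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

# Ten-case demo database; columns are x1, x2, x3 (1 = present, 0 = absent).
CASES = [
    (1, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0),
    (0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 0, 0),
]
ARITY = [2, 2, 2]  # r_i for each variable; values are assumed coded 0..r_i-1


def likelihood_term(cases, arity, parents):
    """Return P(Bs, D) / P(Bs) for a structure {child index: tuple of parent indices}."""
    total = Fraction(1)
    for i, r in enumerate(arity):
        pa = parents.get(i, ())
        # N_ij: cases per observed parent instantiation; N_ijk: split by value of x_i.
        n_ij = Counter(tuple(c[p] for p in pa) for c in cases)
        n_ijk = Counter((tuple(c[p] for p in pa), c[i]) for c in cases)
        for w, n in n_ij.items():
            total *= Fraction(factorial(r - 1), factorial(n + r - 1))
            for k in range(r):
                total *= factorial(n_ijk[(w, k)])
    return total


chain = likelihood_term(CASES, ARITY, {1: (0,), 2: (1,)})  # x1 -> x2 -> x3
fork = likelihood_term(CASES, ARITY, {1: (0,), 2: (0,)})   # x2 <- x1 -> x3
ratio = chain / fork
```

For this database the chain structure gives 1/449064000 (about 2.23 × 10^-9) and the ratio is 10, so under equal structure priors the chain is ten times more probable than the fork.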
Decrease the computational complexity
Three more assumptions to decrease the computational complexity to polynomial time:
<1> There is an ordering on the nodes such that if xi precedes xj, then we do not allow structures in which there is an arc from xj to xi.
<2> There exists a sufficiently tight limit on the number of parents of any node.
<3> P(πi → xi) and P(πj → xj) are independent when i ≠ j.
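$$ \max_{B_s} \left[ P(B_s, D) \right] = \prod_{i=1}^{n} \max_{\pi_i} \left[ P(\pi_i \to x_i) \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! \right] $$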
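A rough way to see why the ordering and the parent limit matter is to count how many parent sets an exhaustive per-node search would have to score. The sketch below does that count; the function name `parent_sets_to_score` and the example sizes are illustrative assumptions. K2 itself searches greedily and scores fewer sets than this.

```python
from math import comb


def parent_sets_to_score(n, u=None):
    """Count candidate parent sets, summed over n nodes under a fixed ordering.

    Node i (1-based) may only take parents from its i-1 predecessors.
    With u=None every subset is allowed; otherwise at most u parents.
    """
    total = 0
    for i in range(1, n + 1):
        preds = i - 1
        limit = preds if u is None else min(u, preds)
        total += sum(comb(preds, k) for k in range(limit + 1))
    return total


unbounded = parent_sets_to_score(10)     # ordering only
bounded = parent_sets_to_score(10, u=2)  # ordering plus at most 2 parents
```

For 10 nodes the ordering alone leaves 1023 parent sets, and adding a limit of two parents cuts this to 175. With u fixed, the count per node grows polynomially in the number of predecessors rather than exponentially.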
K2 algorithm: a heuristic search method
Use the following functions:
$$ g(i, \pi_i) = \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}! $$
where the Nijk are computed relative to πi being the parents of xi and relative to a database D.
Pred(xi) = {x1, ..., xi-1}
It returns the set of nodes that precede xi in the node ordering.
{Input: A set of nodes, an ordering on the nodes, an upper bound u on the number of parents a node may have, and a database D containing m cases}
{Output: For each node, a printout of the parents of the node}
Procedure K2
for i := 1 to n do
    πi := ∅;
    Pold := g(i, πi);
    OKToProceed := true;
    while OKToProceed and |πi| < u do
        let z be the node in Pred(xi) − πi that maximizes g(i, πi ∪ {z});
        Pnew := g(i, πi ∪ {z});
        if Pnew > Pold then
            Pold := Pnew;
            πi := πi ∪ {z};
        else OKToProceed := false;
    end {while}
    write("Node:", xi, "parents of this node:", πi);
end {for}
end {K2}
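The procedure translates fairly directly into Python. The following is a minimal sketch, not the authors' implementation: it assumes cases are tuples of integer-coded values, scores with log g via `math.lgamma`, and adds an explicit stop when a node has no remaining predecessors to try (a case the pseudocode leaves implicit). The names `log_g`, `k2`, and `CASES` are illustrative.

```python
from collections import Counter
from math import lgamma

# Ten-case demo database; columns are x1, x2, x3 (1 = present, 0 = absent).
CASES = [
    (1, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0),
    (0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 0, 0),
]


def log_g(cases, arity, i, parents):
    """log g(i, parents), with lgamma(n + 1) standing in for log n!."""
    n_ij, n_ijk = Counter(), Counter()
    for c in cases:
        w = tuple(c[p] for p in parents)
        n_ij[w] += 1
        n_ijk[(w, c[i])] += 1
    r = arity[i]
    # log (r_i - 1)! - log (N_ij + r_i - 1)! for each observed instantiation
    score = sum(lgamma(r) - lgamma(n + r) for n in n_ij.values())
    # log N_ijk! for each observed count (zero counts add log 0! = 0)
    score += sum(lgamma(n + 1) for n in n_ijk.values())
    return score


def k2(cases, arity, order, u):
    """Greedy K2 search; returns {node index: list of parent indices}."""
    result = {}
    for pos, i in enumerate(order):
        pi = []
        p_old = log_g(cases, arity, i, pi)
        while len(pi) < u:
            remaining = [z for z in order[:pos] if z not in pi]
            if not remaining:
                break  # no predecessor left to add
            z = max(remaining, key=lambda z: log_g(cases, arity, i, pi + [z]))
            p_new = log_g(cases, arity, i, pi + [z])
            if p_new > p_old:
                p_old = p_new
                pi.append(z)
            else:
                break
        result[i] = pi
    return result


parents = k2(CASES, arity=[2, 2, 2], order=[0, 1, 2], u=2)
```

On the demo database with ordering x1, x2, x3 and u = 2, this returns no parents for x1, x1 as the parent of x2, and x2 as the parent of x3.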
Conditional probabilities
• Let θijk denote the conditional probability P(xi = vik | πi = wij), that is, the probability that xi has value vik, for some k from 1 to ri, given that the parents of xi, represented by πi, are instantiated as wij. We call θijk a network conditional probability.
• Let ξ denote the four assumptions.
• The expected value of θijk:
$$ E[\theta_{ijk} \mid D, B_s, \xi] = \frac{N_{ijk} + 1}{N_{ij} + r_i} $$
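As a small worked case, the sketch below applies the expected-value formula to the ten-case table from the Demo Example (present coded as 1, absent as 0). The function name `expected_theta` and its argument layout are assumptions made for illustration.

```python
from fractions import Fraction

# Ten-case demo database; columns are x1, x2, x3 (1 = present, 0 = absent).
CASES = [
    (1, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0),
    (0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 1, 1), (0, 0, 0),
]


def expected_theta(cases, i, parents, w, v, r):
    """(N_ijk + 1) / (N_ij + r_i) for x_i = v given parents instantiated as w."""
    matching = [c for c in cases if tuple(c[p] for p in parents) == w]
    n_ij = len(matching)
    n_ijk = sum(1 for c in matching if c[i] == v)
    return Fraction(n_ijk + 1, n_ij + r)


# P(x2 = present | x1 = present): 4 of the 5 matching cases have x2 present.
theta_x2 = expected_theta(CASES, i=1, parents=(0,), w=(1,), v=1, r=2)
# P(x3 = present | x2 = present): all 5 matching cases have x3 present.
theta_x3 = expected_theta(CASES, i=2, parents=(1,), w=(1,), v=1, r=2)
```

The first estimate is (4 + 1) / (5 + 2) = 5/7 and the second is (5 + 1) / (5 + 2) = 6/7. The second stays below 1 even though every matching case has x3 present.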
Demo Example
Input:
Case  x1       x2       x3
1     present  absent   absent
2     present  present  present
3     absent   absent   present
4     present  present  present
5     absent   absent   absent
6     absent   present  present
7     present  present  present
8     absent   absent   absent
9     present  present  present
10    absent   absent   absent
The dataset is generated from the following structure:
x1 → x2 → x3
Note:
-- use log[g(i, πi)] instead of g(i, πi) to save running time
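One way to follow this note in Python is to replace each factorial by a log-gamma call, so the products in g become sums. The sketch below shows this for the contribution of a single parent instantiation; the function names are illustrative. Besides replacing big-integer products with floating-point sums, the log form also keeps scores representable when counts are large.

```python
from math import factorial, isfinite, lgamma, log


def log_g_term(counts):
    """log of (r-1)! / (N_ij + r - 1)! * prod_k N_ijk!, given the N_ijk counts."""
    r, n = len(counts), sum(counts)
    return lgamma(r) - lgamma(n + r) + sum(lgamma(c + 1) for c in counts)


def log_g_term_exact(counts):
    """Same quantity from exact integer factorials, for comparison on small counts."""
    r, n = len(counts), sum(counts)
    numerator = factorial(r - 1)
    for c in counts:
        numerator *= factorial(c)
    return log(numerator) - log(factorial(n + r - 1))


small = log_g_term([1, 4])      # equals log(1/30)
large = log_g_term([600, 400])  # still finite; the factorial ratio itself is far below 1e-280
```

Since every K2 comparison is of the form Pnew > Pold and the logarithm is monotonic, comparing log scores selects the same parents as comparing g directly.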