Exact Inference Algorithms: Bucket-elimination and more
COMPSCI 179, Spring 2010, Set 8: Rina Dechter
(Reading: chapter 14, Russell and Norvig)
Counting
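How many people?
[Figure: people standing in a line pass counts along the chain: 1, 2, 3, 4 in one direction and 4, 3, 2, 1 in the other; each person concludes the total is 5.]
SUM operator, CHAIN structure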
Maximization
What is the maximum?
[Figure: a tree of numbers (15, 23, 10, 32, 100, 65, 47, 50, 77, ...); max messages are passed along the edges until the maximum, 100, has been propagated through the tree.]
MAX operator, TREE structure
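Min-Cost Assignment
What is the minimum-cost configuration?

Variables: S ∈ {12”, 14”, 15”}, P ∈ {I, II, III}, H ∈ {60G, 80G}, B ∈ {6C, 9C}

Cost(P, B):
        6C    9C
I       30    50
II      40    55
III     ∞     60

Cost(S, P):
        I     II    III
12”     45    ∞     ∞
14”     50    60    70
15”     ∞     65    80

Cost(S, H):
        60G   80G
12”     30    50
14”     40    45
15”     50    ∞

Message from B to P (min over B): I → 30, II → 40, III → 60

Cost(S, P) + message from B:
        I     II    III
12”     75    ∞     ∞
14”     80    100   130
15”     ∞     105   140

Message from P to S (min over P): 12” → 75, 14” → 80, 15” → 105
Message from H to S (min over H): 12” → 30, 14” → 40, 15” → 50
Sum at S: 12” → 105, 14” → 120, 15” → 155

MIN-SUM operators, CHAIN structure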
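The min-sum computation on this chain can be sketched in a few lines of Python. The cost tables are copied from the slide; the dictionary layout and the function name `min_cost_configuration` are illustrative choices, not part of the original material.

```python
import math

INF = math.inf

# Cost tables copied from the slide (screen sizes written without the inch mark).
COST_PB = {("I", "6C"): 30, ("I", "9C"): 50,
           ("II", "6C"): 40, ("II", "9C"): 55,
           ("III", "6C"): INF, ("III", "9C"): 60}
COST_SP = {("12", "I"): 45, ("12", "II"): INF, ("12", "III"): INF,
           ("14", "I"): 50, ("14", "II"): 60, ("14", "III"): 70,
           ("15", "I"): INF, ("15", "II"): 65, ("15", "III"): 80}
COST_SH = {("12", "60G"): 30, ("12", "80G"): 50,
           ("14", "60G"): 40, ("14", "80G"): 45,
           ("15", "60G"): 50, ("15", "80G"): INF}

S = ["12", "14", "15"]
P = ["I", "II", "III"]
H = ["60G", "80G"]
B = ["6C", "9C"]


def min_cost_configuration():
    """Min-sum message passing on the chain B - P - S - H, collected at S."""
    # Message B -> P: cheapest B for each value of P.
    m_b = {p: min(COST_PB[p, b] for b in B) for p in P}
    # Message P -> S: cheapest (P, B) completion for each screen size.
    m_p = {s: min(COST_SP[s, p] + m_b[p] for p in P) for s in S}
    # Message H -> S: cheapest H for each screen size.
    m_h = {s: min(COST_SH[s, h] for h in H) for s in S}
    total = {s: m_p[s] + m_h[s] for s in S}

    # Decode the minimizing assignment by walking back along the chain.
    best_s = min(S, key=total.get)
    best_p = min(P, key=lambda p: COST_SP[best_s, p] + m_b[p])
    best_b = min(B, key=lambda b: COST_PB[best_p, b])
    best_h = min(H, key=lambda h: COST_SH[best_s, h])
    return total, (best_s, best_p, best_b, best_h)


if __name__ == "__main__":
    totals, best = min_cost_configuration()
    print(totals)  # {'12': 105, '14': 120, '15': 155}
    print(best)    # ('12', 'I', '6C', '60G')
```

The totals at S reproduce the slide's 105, 120 and 155, and the decoded configuration costs 45 + 30 + 30 = 105.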
Belief Updating
[Figure: Bayesian network. High temperature (H) and Faulty head (F) are parents of Mechanical problem (M); F is the parent of Read delays (R); M is the parent of Buzz sound (B).]

P(h,f,r,m,b) = P(h) P(f) P(m|h,f) P(r|f) P(b|m)

H   P(H)
0   .9
1   .1

F   P(F)
0   .99
1   .01

H F M   P(M|H,F)
0 0 0   .9
0 0 1   .1
0 1 0   .1
0 1 1   .9
1 0 0   .8
1 0 1   .2
1 1 0   .01
1 1 1   .99

F R   P(R|F)
0 0   .8
0 1   .2
1 0   .3
1 1   .7

M B   P(B|M)
0 0   .95
0 1   .05
1 0   .2
1 1   .8

P(F | B=1) = ?

h1(M) = P(B=1|M):
M   h1(M)
0   .05
1   .8

h2(H) = P(H):
H   h2(H)
0   .9
1   .1

Bel(M,H,F) = h2(H) * P(M|H,F) * h1(M):
H F M   Bel(M,H,F)
0 0 0   .0405
0 0 1   .072
0 1 0   .0045
0 1 1   .648
1 0 0   .004
1 0 1   .008
1 1 0   .00005
1 1 1   .0792

h3(F) = Σ_{H,M} Bel(M,H,F):
F   h3(F)
0   .1245
1   .73175

h4(F) = Σ_R P(R|F):
F   h4(F)
0   1
1   1

P(F,B=1) = P(F) * h3(F) * h4(F):
F   P(F,B=1)
0   .123255
1   .073175

Probability of evidence: P(B=1) = .19643
Updated belief: P(F=1|B=1) = .3725

SUM-PROD operators, POLY-TREE structure
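Belief updating (sum-prod)

[Figure: tree with root X; X has children Y and Z; Y has children T and R; Z has children L and M. Each edge carries two messages, one per direction: $m_{YX}(X)$ and $m_{XY}(X)$; $m_{ZX}(X)$ and $m_{XZ}(X)$; $m_{TY}(Y)$ and $m_{YT}(Y)$; $m_{RY}(Y)$ and $m_{YR}(Y)$; $m_{LZ}(Z)$ and $m_{ZL}(Z)$; $m_{MZ}(Z)$ and $m_{ZM}(Z)$.]

Upward messages:
$m_{ZX}(X) = \sum_{Z} P(Z|X)\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \sum_{L} P(L|Z)$
$m_{MZ}(Z) = \sum_{M} P(M|Z)$

Downward messages:
$m_{ZM}(Z) = \sum_{X} P(Z|X)\, m_{XZ}(X)\, m_{LZ}(Z)$
$m_{ZL}(Z) = \sum_{X} P(Z|X)\, m_{XZ}(X)\, m_{MZ}(Z)$
$m_{XZ}(X) = P(X)\, m_{YX}(X)$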
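As a minimal sketch of the upward pass, the following Python computes the messages toward the root X on the tree X → {Y, Z}, Y → {T, R}, Z → {L, M} and checks the resulting belief against brute-force enumeration. All CPT values are hypothetical (the slides give none for this tree), variables are assumed binary, and evidence is handled by restricting the sum over an observed variable to its observed value.

```python
from itertools import product

# Hypothetical CPTs for binary variables: CPT[child][parent_value][child_value].
P_X = [0.6, 0.4]
CPT = {
    "Y": [[0.7, 0.3], [0.2, 0.8]],
    "Z": [[0.9, 0.1], [0.4, 0.6]],
    "T": [[0.5, 0.5], [0.1, 0.9]],
    "R": [[0.8, 0.2], [0.3, 0.7]],
    "L": [[0.6, 0.4], [0.25, 0.75]],
    "M": [[0.95, 0.05], [0.5, 0.5]],
}
PARENT = {"Y": "X", "Z": "X", "T": "Y", "R": "Y", "L": "Z", "M": "Z"}
CHILDREN = {"X": ["Y", "Z"], "Y": ["T", "R"], "Z": ["L", "M"],
            "T": [], "R": [], "L": [], "M": []}


def upward_message(child, evidence):
    """m_{child,parent}(p) = sum_c P(c | p) * product of messages into child."""
    values = [evidence[child]] if child in evidence else [0, 1]
    incoming = [upward_message(g, evidence) for g in CHILDREN[child]]
    message = []
    for parent_value in (0, 1):
        total = 0.0
        for child_value in values:
            term = CPT[child][parent_value][child_value]
            for m in incoming:
                term *= m[child_value]
            total += term
        message.append(total)
    return message


def belief_x(evidence):
    """Bel(X) proportional to P(X) * m_YX(X) * m_ZX(X); X itself is assumed unobserved."""
    m_yx, m_zx = (upward_message(c, evidence) for c in CHILDREN["X"])
    unnorm = [P_X[x] * m_yx[x] * m_zx[x] for x in (0, 1)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def belief_x_brute_force(evidence):
    """Reference answer: sum the full joint over all consistent assignments."""
    names = ["X", "Y", "Z", "T", "R", "L", "M"]
    unnorm = [0.0, 0.0]
    for values in product((0, 1), repeat=len(names)):
        a = dict(zip(names, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = P_X[a["X"]]
        for child, parent in PARENT.items():
            p *= CPT[child][a[parent]][a[child]]
        unnorm[a["X"]] += p
    z = sum(unnorm)
    return [u / z for u in unnorm]


if __name__ == "__main__":
    ev = {"L": 1, "T": 0}
    print(belief_x(ev))
    print(belief_x_brute_force(ev))
```

With no evidence every upward message is identically 1, so the belief at X reduces to the prior P(X); evidence at the leaves is what makes the messages informative.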
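MPE (max-prod)

[Figure: tree with root X; X has children Y and Z; Y has children T and R; Z has children L and M. Each edge carries two messages, one per direction.]

Upward messages:
$m_{ZX}(X) = \max_{Z} P(Z|X)\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \max_{L} P(L|Z)$
$m_{MZ}(Z) = \max_{M} P(M|Z)$

Downward messages:
$m_{ZM}(Z) = \max_{X} P(Z|X)\, m_{XZ}(X)\, m_{LZ}(Z)$
$m_{ZL}(Z) = \max_{X} P(Z|X)\, m_{XZ}(X)\, m_{MZ}(Z)$
$m_{XZ}(X) = P(X)\, m_{YX}(X)$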
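CSP – consistency (projection-join)

[Figure: tree with root X; X has children Y and Z; Y has children T and R; Z has children L and M. Upward messages: $m_{YX}(X)$, $m_{ZX}(X)$, $m_{TY}(Y)$, $m_{RY}(Y)$, $m_{LZ}(Z)$, $m_{MZ}(Z)$.]

$\lambda_{ZX}(X) = \pi_{X}\big(R_{ZX} \bowtie \lambda_{LZ}(Z) \bowtie \lambda_{MZ}(Z)\big)$
$\lambda_{LZ}(Z) = \pi_{Z}(R_{LZ})$
$\lambda_{MZ}(Z) = \pi_{Z}(R_{MZ})$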
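#CSP (sum-prod)

[Figure: tree with root X; X has children Y and Z; Y has children T and R; Z has children L and M. Upward messages: $m_{YX}(X)$, $m_{ZX}(X)$, $m_{TY}(Y)$, $m_{RY}(Y)$, $m_{LZ}(Z)$, $m_{MZ}(Z)$.]

$\#sol = \sum_{X} m_{ZX}(X)\, m_{YX}(X)$
$m_{ZX}(X) = \sum_{Z} R_{ZX}\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \sum_{L} R_{LZ}$
$m_{MZ}(Z) = \sum_{M} R_{MZ}$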
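A small sketch of solution counting by sum-product on the same tree shape follows. The constraint used on every edge (parent and child must take different values) and the three-value domain are assumptions chosen for illustration; the slides do not specify the relations.

```python
DOMAIN = (0, 1, 2)
CHILDREN = {"X": ["Y", "Z"], "Y": ["T", "R"], "Z": ["L", "M"],
            "T": [], "R": [], "L": [], "M": []}


def allowed(parent_value, child_value):
    """Hypothetical relation R on every edge: the two values must differ."""
    return parent_value != child_value


def count_message(child):
    """m_{child,parent}(p) = sum over c of R(p, c) * product of messages into child."""
    incoming = [count_message(g) for g in CHILDREN[child]]
    message = {}
    for p in DOMAIN:
        total = 0
        for c in DOMAIN:
            if allowed(p, c):
                term = 1
                for m in incoming:
                    term *= m[c]
                total += term
        message[p] = total
    return message


def count_solutions(root="X"):
    """#sol = sum over root values of the product of incoming messages."""
    messages = [count_message(c) for c in CHILDREN[root]]
    total = 0
    for x in DOMAIN:
        term = 1
        for m in messages:
            term *= m[x]
        total += term
    return total


if __name__ == "__main__":
    print(count_solutions())  # 192
```

For this inequality constraint the count is the number of proper 3-colourings of a 7-node tree, 3 · 2^6 = 192, which gives an independent check on the message computation.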
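Tree-solving

[Figure: tree with root X; X has children Y and Z; Y has children T and R; Z has children L and M. Each edge carries two messages, one per direction.]

Belief updating (sum-prod)
$m_{ZX}(X) = \sum_{Z} P(Z|X)\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \sum_{L} P(L|Z)$
$m_{MZ}(Z) = \sum_{M} P(M|Z)$
$m_{ZM}(Z) = \sum_{X} P(Z|X)\, m_{XZ}(X)\, m_{LZ}(Z)$
$m_{ZL}(Z) = \sum_{X} P(Z|X)\, m_{XZ}(X)\, m_{MZ}(Z)$
$m_{XZ}(X) = P(X)\, m_{YX}(X)$

MPE (max-prod)
$m_{ZX}(X) = \max_{Z} P(Z|X)\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \max_{L} P(L|Z)$
$m_{MZ}(Z) = \max_{M} P(M|Z)$
$m_{ZM}(Z) = \max_{X} P(Z|X)\, m_{XZ}(X)\, m_{LZ}(Z)$
$m_{ZL}(Z) = \max_{X} P(Z|X)\, m_{XZ}(X)\, m_{MZ}(Z)$
$m_{XZ}(X) = P(X)\, m_{YX}(X)$

CSP – consistency (projection-join)
$\lambda_{ZX}(X) = \pi_{X}\big(R_{ZX} \bowtie \lambda_{LZ}(Z) \bowtie \lambda_{MZ}(Z)\big)$
$\lambda_{LZ}(Z) = \pi_{Z}(R_{LZ})$
$\lambda_{MZ}(Z) = \pi_{Z}(R_{MZ})$

#CSP (sum-prod)
$\#sol = \sum_{X} m_{ZX}(X)\, m_{YX}(X)$
$m_{ZX}(X) = \sum_{Z} R_{ZX}\, m_{LZ}(Z)\, m_{MZ}(Z)$
$m_{LZ}(Z) = \sum_{L} R_{LZ}$
$m_{MZ}(Z) = \sum_{M} R_{MZ}$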
Belief Propagation
• Instances of tree message passing algorithm
• Exact for trees
• Linear in the input size
• Importance:
– One of the first algorithms for inference in Bayesian networks
– Gives a cognitive dimension to its computations
– Basis for conditioning algorithms for arbitrary Bayesian networks
– Basis for Loopy Belief Propagation (approximate algorithms)
[Pearl, 1988]
Exact Inference Algorithms Bucket-elimination
Belief Updating
[Figure: Bayesian network over Smoking, Lung cancer, Bronchitis, X-ray, Dyspnoea]
P(lung cancer=yes | smoking=no, dyspnoea=yes) = ?
Belief updating: P(X|evidence)=?
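[Figure: “Moral” graph over A, B, C, D, E]

$P(a|e=0) \propto P(a,e=0) = \sum_{e=0,d,c,b} P(a)\,P(b|a)\,P(c|a)\,P(d|b,a)\,P(e|b,c)$
$= P(a) \sum_{e=0} \sum_{d} \sum_{c} P(c|a) \sum_{b} P(b|a)\,P(d|b,a)\,P(e|b,c)$

Variable Elimination: the innermost sum defines the function
$h^{B}(a,d,c,e) = \sum_{b} P(b|a)\,P(d|b,a)\,P(e|b,c)$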
Bucket elimination: Algorithm BE-bel (Dechter 1996)

Elimination operator: $\sum_{b}$

bucket B:  P(b|a)  P(d|b,a)  P(e|b,c)
bucket C:  P(c|a)  $h^{B}(a,d,c,e)$
bucket D:  $h^{C}(a,d,e)$
bucket E:  e=0  $h^{D}(a,e)$
bucket A:  P(a)  $h^{E}(a)$

Result: P(a|e=0)

W*=4, the “induced width” (max clique size)
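The bucket schedule above can be sketched as a generic sum-product elimination loop. This is an illustrative implementation, not the author's code: factors are stored as `(scope, table)` pairs, variables are assumed binary, all CPT numbers are hypothetical, and the evidence e=0 is represented by an indicator factor on E.

```python
from itertools import product


def make_factor(variables, fn):
    """Tabulate fn(assignment dict) over all binary assignments to `variables`."""
    variables = tuple(variables)
    table = {vals: fn(dict(zip(variables, vals)))
             for vals in product((0, 1), repeat=len(variables))}
    return variables, table


def multiply(f, g):
    (fv, ft), (gv, gt) = f, g
    scope = tuple(dict.fromkeys(fv + gv))  # union of scopes, order preserved
    return make_factor(scope, lambda a: ft[tuple(a[v] for v in fv)]
                       * gt[tuple(a[v] for v in gv)])


def sum_out(f, var):
    fv, ft = f
    rest = tuple(v for v in fv if v != var)
    return make_factor(rest, lambda a: sum(
        ft[tuple(x if v == var else a[v] for v in fv)] for x in (0, 1)))


def be_bel(factors, elimination_order):
    """Process one bucket per variable; each bucket's result joins the remaining factors."""
    factors = list(factors)
    for var in elimination_order:
        bucket = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        if not bucket:
            continue
        combined = bucket[0]
        for f in bucket[1:]:
            combined = multiply(combined, f)
        factors.append(sum_out(combined, var))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result


def bern(p_one, x):
    """Probability of x under a Bernoulli with P(x=1) = p_one."""
    return p_one if x == 1 else 1.0 - p_one


# Hypothetical CPTs for P(a) P(b|a) P(c|a) P(d|b,a) P(e|b,c), plus evidence e=0.
FACTORS = [
    make_factor(("A",), lambda v: bern(0.4, v["A"])),
    make_factor(("A", "B"), lambda v: bern((0.2, 0.7)[v["A"]], v["B"])),
    make_factor(("A", "C"), lambda v: bern((0.5, 0.1)[v["A"]], v["C"])),
    make_factor(("A", "B", "D"),
                lambda v: bern(0.1 + 0.3 * v["A"] + 0.5 * v["B"], v["D"])),
    make_factor(("B", "C", "E"),
                lambda v: bern(0.05 + 0.4 * v["B"] + 0.5 * v["C"], v["E"])),
    make_factor(("E",), lambda v: 1.0 if v["E"] == 0 else 0.0),
]


def posterior_a():
    """P(a | e=0), eliminating B, C, D, E in the order used on the slide."""
    _, table = be_bel(FACTORS, ["B", "C", "D", "E"])
    z = table[(0,)] + table[(1,)]
    return [table[(0,)] / z, table[(1,)] / z]


if __name__ == "__main__":
    print(posterior_a())
```

Before normalization the remaining table over A is P(a, e=0), so its sum is the probability of the evidence.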
BE-BEL: Student Network example
[Figure: Bayesian network over Difficulty, Intelligence, Grade, SAT, Letter, Job, Apply]
• P(J)?
[Figure: the example graph over A, B, C, D, E drawn along two orderings (top to bottom): E, D, C, B, A and B, C, D, E, A]
Fall 2003 ICS 275A - Constraint Networks
The induced-width
• Width: the max number of parents of any node in the ordered graph
• Induced graph: obtained by recursively connecting the parents of each node, going from the last node to the first
• Induced-width w*(d): the width of the induced graph along ordering d, i.e. the max number of parents over all nodes in the induced graph
• Induced-width of a graph: min w*(d) over all orderings d
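The definition translates directly into a short procedure. The sketch below is illustrative: the function name and edge-list representation are assumptions, and the example graph is the moral graph implied by the factorization P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c) used in these slides.

```python
def induced_width(edges, ordering):
    """Induced width of an undirected graph along `ordering` (first to last).

    Nodes are processed from last to first; the earlier neighbours
    ("parents") of each processed node are connected to one another.
    """
    position = {v: i for i, v in enumerate(ordering)}
    adj = {v: set() for v in ordering}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    width = 0
    for v in reversed(ordering):
        parents = {u for u in adj[v] if position[u] < position[v]}
        width = max(width, len(parents))
        for u in parents:
            adj[u] |= parents - {u}  # add the induced (fill) edges
    return width


# Moral graph of the example: CPT scopes {a,b}, {a,c}, {a,b,d}, {b,c,e}.
MORAL_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
               ("B", "E"), ("C", "E"), ("B", "C")]

if __name__ == "__main__":
    print(induced_width(MORAL_EDGES, ["A", "E", "D", "C", "B"]))  # 4 (B processed first)
    print(induced_width(MORAL_EDGES, ["A", "B", "C", "D", "E"]))  # 2 (E processed first)
```

Processing B first connects all four of its neighbours, which is why that ordering has width 4, while processing E first never creates a node with more than two parents.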
Complexity of elimination
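$O(n \exp(w^*(d)))$, where $w^*(d)$ is the induced width of the moral graph along ordering $d$.

The effect of the ordering: $w^*(d_1) = 4$, $w^*(d_2) = 2$

[Figure: “Moral” graph over A, B, C, D, E, and the same graph drawn along two orderings (top to bottom): B, C, D, E, A and E, D, C, B, A]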
More accurately: O(r exp(w*(d))), where r is the number of CPTs. For Bayesian networks r=n. For Markov networks?
BE-BEL
The impact of observations
[Figure panels: Moral graph; Induced moral graph; Adjusted graph for evidence in B; Induced-adjusted graph]
Probabilistic Inference Tasks
Belief updating: $BEL(X_i) = P(X_i = x_i \mid \text{evidence})$
Finding most probable explanation (MPE): $x^* = \arg\max_{x} P(x, e)$
Algorithm elim-mpe (Dechter 1996)

$MPE = \max_{x} P(x)$

$\sum$ is replaced by $\max$:
$MPE = \max_{a,e,d,c,b} P(a)\,P(c|a)\,P(b|a)\,P(d|a,b)\,P(e|b,c)$

Elimination operator: $\max_{b}$

bucket B:  P(b|a)  P(d|b,a)  P(e|b,c)
bucket C:  P(c|a)  $h^{B}(a,d,c,e)$
bucket D:  $h^{C}(a,d,e)$
bucket E:  e=0  $h^{D}(a,e)$
bucket A:  P(a)  $h^{E}(a)$

Result: MPE

W*=4, the “induced width” (max clique size)
Generating the MPE-tuple
B:  P(b|a)  P(d|b,a)  P(e|b,c)
C:  P(c|a)  $h^{B}(a,d,c,e)$
D:  $h^{C}(a,d,e)$
E:  e=0  $h^{D}(a,e)$
A:  P(a)  $h^{E}(a)$

1. $a' = \arg\max_{a} P(a) \cdot h^{E}(a)$
2. $e' = 0$
3. $d' = \arg\max_{d} h^{C}(a', d, e')$
4. $c' = \arg\max_{c} P(c|a') \cdot h^{B}(a', d', c, e')$
5. $b' = \arg\max_{b} P(b|a') \cdot P(d'|b,a') \cdot P(e'|b,c')$

Return $(a', b', c', d', e')$
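The two phases (max-elimination through the buckets, then assigning variables in reverse order) can be sketched for this five-variable network as follows. The CPT numbers are hypothetical and the variables are assumed binary; only the bucket structure and the five decoding steps follow the slide.

```python
from itertools import product

V = (0, 1)


def bern(p_one, x):
    """Probability of x under a Bernoulli with P(x=1) = p_one."""
    return p_one if x == 1 else 1.0 - p_one


# Hypothetical CPTs for the network P(a) P(b|a) P(c|a) P(d|b,a) P(e|b,c).
def p_a(a): return bern(0.4, a)
def p_b(b, a): return bern((0.2, 0.7)[a], b)
def p_c(c, a): return bern((0.5, 0.1)[a], c)
def p_d(d, b, a): return bern(0.1 + 0.3 * a + 0.5 * b, d)
def p_e(e, b, c): return bern(0.05 + 0.4 * b + 0.5 * c, e)


def joint(a, b, c, d, e):
    return p_a(a) * p_b(b, a) * p_c(c, a) * p_d(d, b, a) * p_e(e, b, c)


def elim_mpe(e_obs=0):
    # Backward phase: process buckets B, C, D, E with max as the operator.
    h_b = {(a, d, c, e): max(p_b(b, a) * p_d(d, b, a) * p_e(e, b, c) for b in V)
           for a, d, c, e in product(V, repeat=4)}
    h_c = {(a, d, e): max(p_c(c, a) * h_b[a, d, c, e] for c in V)
           for a, d, e in product(V, repeat=3)}
    h_d = {(a, e): max(h_c[a, d, e] for d in V) for a, e in product(V, repeat=2)}
    h_e = {a: h_d[a, e_obs] for a in V}  # bucket E only applies the evidence

    # Forward phase: steps 1-5 of the slide.
    a1 = max(V, key=lambda a: p_a(a) * h_e[a])
    e1 = e_obs
    d1 = max(V, key=lambda d: h_c[a1, d, e1])
    c1 = max(V, key=lambda c: p_c(c, a1) * h_b[a1, d1, c, e1])
    b1 = max(V, key=lambda b: p_b(b, a1) * p_d(d1, b, a1) * p_e(e1, b, c1))
    return (a1, b1, c1, d1, e1), p_a(a1) * h_e[a1]


if __name__ == "__main__":
    assignment, value = elim_mpe()
    print(assignment, value)
```

The value returned from bucket A equals the joint probability of the decoded tuple, which is how the forward phase can be checked against the backward phase.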
BE-MPE
Finding small induced-width
• NP-complete
• A tree has induced-width of ?
• Greedy algorithms:
– Min width
– Min induced-width
– Max-cardinality
– Fill-in (thought to be the best)
– See anytime min-width (Gogate and Dechter)
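As one illustration of the greedy family, here is a minimal sketch of the fill-in heuristic: repeatedly eliminate the node whose elimination would add the fewest new edges. The tie-breaking rule (alphabetical) and the function name are assumptions; the example graph is the moral graph of P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c).

```python
def min_fill_ordering(edges):
    """Greedy elimination order by fewest fill-in edges (ties broken by name).

    Returns (elimination_order, width), where width is the largest number
    of neighbours any node had at the moment it was eliminated.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def fill_in(v):
        # Number of neighbour pairs of v that are not yet connected.
        nbrs = sorted(adj[v])
        return sum(1 for i, x in enumerate(nbrs)
                   for y in nbrs[i + 1:] if y not in adj[x])

    elimination, width = [], 0
    while adj:
        v = min(sorted(adj), key=fill_in)
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for x in nbrs:
            adj[x].discard(v)
            adj[x] |= nbrs - {x}  # connect the neighbours of v
        elimination.append(v)
    return elimination, width


MORAL_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
               ("B", "E"), ("C", "E"), ("B", "C")]

if __name__ == "__main__":
    print(min_fill_ordering(MORAL_EDGES))  # (['D', 'A', 'B', 'C', 'E'], 2)
```

The returned list is the order in which variables are eliminated; under the convention where buckets are processed from the last node of an ordering to the first, the corresponding ordering d is this list reversed. On a tree the heuristic always finds a node with zero fill-in and returns width 1.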