Exact Inference Algorithms: Bucket-elimination and more

COMPSCI 179, Spring 2010, Set 8: Rina Dechter

(Reading: chapter 14, Russell and Norvig)

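Counting

How many people?

[Figure: a chain of five people passing counts along the line. Forward messages 1, 2, 3, 4; backward messages 4, 3, 2, 1; every person concludes that the total is 5.]

SUM operator, CHAIN structure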

Maximization

What is the maximum?

[Figure: a tree whose nodes hold the values 15, 23, 10, 32, 10, 100, 65, 47, 50, 77. Each node passes the largest value seen so far to its neighbors, until every node holds 100.]

MAX operator, TREE structure

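Min-Cost Assignment

What is minimum cost configuration?

Variables and values:

S: 12”, 14”, 15”
P: I, II, III
H: 60G, 80G
B: 6C, 9C

Cost of (P, B):

         6C    9C
I        30    50
II       40    55
III      ∞     60

Cost of (S, P):

         I     II    III
12”      45    ∞     ∞
14”      50    60    70
15”      ∞     65    80

Cost of (S, H):

         60G   80G
12”      30    50
14”      40    45
15”      50    ∞

Minimizing over B gives a cost for each value of P: I = 30, II = 40, III = 60.

Adding this to the (S, P) table:

         I     II    III
12”      75    ∞     ∞
14”      80    100   130
15”      ∞     105   140

Minimizing over P gives a cost for each value of S: 12” = 75, 14” = 80, 15” = 105.

Minimizing the (S, H) table over H gives: 12” = 30, 14” = 40, 15” = 50.

Adding the two messages at S:

12”      75 + 30 = 105
14”      80 + 40 = 120
15”      105 + 50 = 155

MIN-SUM operators, CHAIN structure

Image: http://download.intel.com/pressroom/kits/pentiumee/pentiumee_processor_back.jpg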
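The min-sum computation on this chain can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides: it assumes the chain is B - P - S - H, with P taking the values I, II, III and H taking the values 60G, 80G, as the cost tables suggest. The dictionary and variable names are made up for the example.

```python
INF = float("inf")

S_VALUES = ['12"', '14"', '15"']
P_VALUES = ["I", "II", "III"]

# Cost tables transcribed from the slide (assumed reading of the layout).
COST_PB = {"I": {"6C": 30, "9C": 50},
           "II": {"6C": 40, "9C": 55},
           "III": {"6C": INF, "9C": 60}}
COST_SP = {'12"': {"I": 45, "II": INF, "III": INF},
           '14"': {"I": 50, "II": 60, "III": 70},
           '15"': {"I": INF, "II": 65, "III": 80}}
COST_SH = {'12"': {"60G": 30, "80G": 50},
           '14"': {"60G": 40, "80G": 45},
           '15"': {"60G": 50, "80G": INF}}

# Eliminate B: cheapest B for each value of P.
msg_B = {p: min(COST_PB[p].values()) for p in P_VALUES}

# Eliminate P: combine (add) the (S, P) cost with the message from B, then minimize.
msg_P = {s: min(COST_SP[s][p] + msg_B[p] for p in P_VALUES) for s in S_VALUES}

# Eliminate H: cheapest H for each value of S.
msg_H = {s: min(COST_SH[s].values()) for s in S_VALUES}

# At S, add the two incoming messages and take the minimum.
total = {s: msg_P[s] + msg_H[s] for s in S_VALUES}
best_S = min(total, key=total.get)
best_cost = total[best_S]

print(msg_B, msg_P, msg_H, total, best_S, best_cost)
```

Each elimination step touches only one table and one incoming message, which is why the work grows linearly with the length of the chain rather than with the number of full configurations.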

Belief Updating

[Figure: poly-tree Bayesian network. High temperature (H) and Faulty head (F) are the parents of Mechanical problem (M); M is the parent of Buzz sound (B); F is the parent of Read delays (R).]

P(F | B=1) = ?

H   P(H)
0   .9
1   .1

F   P(F)
0   .99
1   .01

H  F  M   P(M|H,F)
0  0  0   .9
0  0  1   .1
0  1  0   .1
0  1  1   .9
1  0  0   .8
1  0  1   .2
1  1  0   .01
1  1  1   .99

F  R   P(R|F)
0  0   .8
0  1   .2
1  0   .3
1  1   .7

M  B   P(B|M)
0  0   .95
0  1   .05
1  0   .2
1  1   .8

Step 1: condition on the evidence B=1, h1(M) = P(B=1|M)

M   h1(M)
0   .05
1   .8

Step 2: h2(H) = P(H)

H   h2(H)
0   .9
1   .1

Step 3: P(M|H,F) * h1(M) * h2(H) = Bel(M,H,F)

H  F  M   Bel(M,H,F)
0  0  0   .0405
0  0  1   .072
0  1  0   .0045
0  1  1   .648
1  0  0   .004
1  0  1   .016
1  1  0   .00005
1  1  1   .0792

Step 4: sum out H and M, h3(F) = Σ_{H,M} Bel(M,H,F)

F   h3(F)
0   .1325
1   .73175

Step 5: sum out R, h4(F) = Σ_R P(R|F)

F   h4(F)
0   1
1   1

Step 6: P(F) * h3(F) * h4(F) = P(F, B=1)

F   P(F, B=1)
0   .131175
1   .0073175

Probability of evidence: P(B=1) = .1384925

Updated belief: P(F=1|B=1) = .0528

SUM-PROD operators, POLY-TREE structure

P(h,f,r,m,b) = P(h) P(f) P(m|h,f) P(r|f) P(b|m)
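The elimination steps for this query can be checked with a short Python sketch. It is only an illustration: the dictionary names are hypothetical, and the numbers are the CPT entries listed on the slide (in particular P(F=1) = .01 and P(M=1|H=1,F=0) = .2). Evaluated directly from those CPTs, the sketch yields P(B=1) ≈ .1385 and P(F=1|B=1) ≈ .053.

```python
# CPTs from the slide; only the probability of value 1 is stored where convenient.
P_H = {0: 0.9, 1: 0.1}
P_F = {0: 0.99, 1: 0.01}
P_M1 = {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.2, (1, 1): 0.99}  # P(M=1 | H, F)
P_R1 = {0: 0.2, 1: 0.7}                                       # P(R=1 | F)
P_B1 = {0: 0.05, 1: 0.8}                                      # P(B=1 | M)

def p_m(m, h, f):
    """P(M=m | H=h, F=f)."""
    return P_M1[(h, f)] if m == 1 else 1.0 - P_M1[(h, f)]

def p_r(r, f):
    """P(R=r | F=f)."""
    return P_R1[f] if r == 1 else 1.0 - P_R1[f]

# h1(M): the evidence B=1 turns P(B|M) into a function of M alone.
h1 = dict(P_B1)

# h3(F): multiply P(M|H,F), h1(M) and P(H), then sum out H and M.
h3 = {f: sum(p_m(m, h, f) * h1[m] * P_H[h] for h in (0, 1) for m in (0, 1))
      for f in (0, 1)}

# h4(F): summing P(R|F) over R gives 1, since R is unobserved.
h4 = {f: sum(p_r(r, f) for r in (0, 1)) for f in (0, 1)}

# P(F, B=1) = P(F) * h3(F) * h4(F); normalizing gives the updated belief.
joint = {f: P_F[f] * h3[f] * h4[f] for f in (0, 1)}
p_evidence = sum(joint.values())
posterior_f1 = joint[1] / p_evidence

print(h3, joint, p_evidence, posterior_f1)
```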

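[Figure: a tree rooted at X with children Y and Z; Y has children T and R; Z has children L and M. Every edge carries one message in each direction: m_{ZX}(X) and m_{XZ}(X), m_{YX}(X) and m_{XY}(X), m_{LZ}(Z) and m_{ZL}(Z), m_{MZ}(Z) and m_{ZM}(Z), m_{TY}(Y) and m_{YT}(Y), m_{RY}(Y) and m_{YR}(Y).]

Belief updating (sum-prod)

Upward messages:

$$m_{LZ}(Z) = \sum_{L} P(L \mid Z), \qquad m_{MZ}(Z) = \sum_{M} P(M \mid Z)$$

$$m_{ZX}(X) = \sum_{Z} P(Z \mid X)\, m_{LZ}(Z)\, m_{MZ}(Z)$$

Downward messages:

$$m_{XZ}(X) = P(X)\, m_{YX}(X)$$

$$m_{ZL}(Z) = \sum_{X} P(Z \mid X)\, m_{XZ}(X)\, m_{MZ}(Z)$$

$$m_{ZM}(Z) = \sum_{X} P(Z \mid X)\, m_{XZ}(X)\, m_{LZ}(Z)$$

MPE (max-prod)

Upward messages:

$$m_{LZ}(Z) = \max_{L} P(L \mid Z), \qquad m_{MZ}(Z) = \max_{M} P(M \mid Z)$$

$$m_{ZX}(X) = \max_{Z} P(Z \mid X)\, m_{LZ}(Z)\, m_{MZ}(Z)$$

Downward messages:

$$m_{XZ}(X) = P(X)\, m_{YX}(X)$$

$$m_{ZL}(Z) = \max_{X} P(Z \mid X)\, m_{XZ}(X)\, m_{MZ}(Z)$$

$$m_{ZM}(Z) = \max_{X} P(Z \mid X)\, m_{XZ}(X)\, m_{LZ}(Z)$$

CSP – consistency (projection-join)

Upward messages:

$$\lambda_{LZ}(Z) = \pi_{Z}\, R(L, Z), \qquad \lambda_{MZ}(Z) = \pi_{Z}\, R(M, Z)$$

$$\lambda_{ZX}(X) = \pi_{X}\big( R(X, Z) \bowtie \lambda_{LZ}(Z) \bowtie \lambda_{MZ}(Z) \big)$$

#CSP (sum-prod)

Upward messages, with each relation R read as a 0/1 function:

$$m_{LZ}(Z) = \sum_{L} R(L, Z), \qquad m_{MZ}(Z) = \sum_{M} R(M, Z)$$

$$m_{ZX}(X) = \sum_{Z} R(X, Z)\, m_{LZ}(Z)\, m_{MZ}(Z)$$

$$\#sol = \sum_{X} m_{ZX}(X)\, m_{YX}(X)$$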

Tree-solving

[Figure: the tree rooted at X (children Y and Z; Y has children T and R; Z has children L and M), with a message in each direction on every edge.]

The four tasks share one message-passing schedule and differ only in their operators:

Task                                   Combination    Elimination
Belief updating (sum-prod)             product        sum
MPE (max-prod)                         product        max
CSP – consistency (projection-join)    join           projection
#CSP (sum-prod)                        product        sum
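A small Python sketch can show how one upward pass serves several tasks once the elimination operator is swapped. Everything here is an assumption made for illustration: the variables are binary, every edge reuses the same made-up table, and the function names `upward` and `solve` are hypothetical. Only the tree shape (X with children Y and Z, Y with children T and R, Z with children L and M) comes from the slides.

```python
from math import prod

CHILDREN = {"X": ["Y", "Z"], "Y": ["T", "R"], "Z": ["L", "M"],
            "T": [], "R": [], "L": [], "M": []}
DOMAIN = (0, 1)

def upward(node, edge_fn, eliminate):
    """Message from `node` to its parent, indexed by the parent's value.

    edge_fn(v, p) is the function on the edge (child value v, parent value p);
    eliminate is sum (sum-prod) or max (max-prod).
    """
    child_msgs = [upward(c, edge_fn, eliminate) for c in CHILDREN[node]]
    return [
        eliminate(edge_fn(v, p) * prod(m[v] for m in child_msgs) for v in DOMAIN)
        for p in DOMAIN
    ]

def solve(root_fn, edge_fn, eliminate, root="X"):
    """Combine the messages arriving at the root and eliminate the root variable."""
    msgs = [upward(c, edge_fn, eliminate) for c in CHILDREN[root]]
    return eliminate(root_fn(x) * prod(m[x] for m in msgs) for x in DOMAIN)

# Made-up numbers: P(child | parent) is shared by all edges, P(X) is the root prior.
def cpt(child, parent):
    return [[0.9, 0.1], [0.2, 0.8]][parent][child]

def prior(x):
    return [0.6, 0.4][x]

total = solve(prior, cpt, sum)   # sum-prod with no evidence: sums to 1
mpe = solve(prior, cpt, max)     # max-prod: probability of the most likely tuple

# #CSP: every edge carries a "not equal" constraint, read as a 0/1 function.
def not_equal(child, parent):
    return 1 if child != parent else 0

count = solve(lambda x: 1, not_equal, sum)   # number of solutions

print(total, mpe, count)
```

With these numbers the most likely tuple sets every variable to 0, so the max-prod value equals 0.6 · 0.9^6, and the "not equal" constraints on a tree admit exactly two solutions.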

Belief Propagation

• Instances of tree message passing algorithm

• Exact for trees

• Linear in the input size

• Importance:
  – One of the first algorithms for inference in Bayesian networks
  – Gives a cognitive dimension to its computations
  – Basis for conditioning algorithms for arbitrary Bayesian networks
  – Basis for Loopy Belief Propagation (approximate algorithms)

[Pearl, 1988]

Exact Inference Algorithms: Bucket-elimination

Belief Updating

[Figure: Bayesian network over Smoking, Lung Cancer, Bronchitis, X-ray, Dyspnoea]

P(lung cancer = yes | smoking = no, dyspnoea = yes) = ?

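Belief updating: P(X|evidence) = ?

[Figure: “Moral” graph over the nodes A, B, C, D, E]

$$P(a \mid e=0) \propto P(a, e=0) = \sum_{e=0,\,d,\,c,\,b} P(a)\,P(b \mid a)\,P(c \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$$

$$= P(a) \sum_{e=0} \sum_{d} \sum_{c} P(c \mid a) \sum_{b} P(b \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$$

The innermost sum defines a new function:

$$h^{B}(a,d,c,e) = \sum_{b} P(b \mid a)\,P(d \mid b,a)\,P(e \mid b,c)$$

Variable Elimination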

Bucket elimination: Algorithm BE-bel (Dechter 1996)

Elimination operator: Σ_b

bucket B:   P(b|a)  P(d|b,a)  P(e|b,c)     →  h^B(a,d,c,e)
bucket C:   P(c|a)  h^B(a,d,c,e)           →  h^C(a,d,e)
bucket D:   h^C(a,d,e)                     →  h^D(a,e)
bucket E:   e=0  h^D(a,e)                  →  h^E(a)
bucket A:   P(a)  h^E(a)                   →  P(a|e=0)

W* = 4, the “induced width” (max clique size)
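The bucket schedule for P(a|e=0) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the algorithm as published: all variables are binary, the CPT numbers are invented for the example, and the helper names (`cpt`, `sum_out`, `restrict`, `be_bel`) are hypothetical. Only the network structure P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c) and the bucket order B, C, D, E come from the slides.

```python
from itertools import product

def cpt(child, parents, p1):
    """Factor for P(child | parents); p1 maps parent values to P(child=1 | ...)."""
    table = {}
    for pv in product((0, 1), repeat=len(parents)):
        table[(1,) + pv] = p1[pv]
        table[(0,) + pv] = 1.0 - p1[pv]
    return (child,) + tuple(parents), table

def sum_out(factors, var):
    """Multiply the factors of one bucket and sum out `var`."""
    scope = tuple(sorted({v for s, _ in factors for v in s} - {var}))
    table = {}
    for vals in product((0, 1), repeat=len(scope)):
        env = dict(zip(scope, vals))
        total = 0.0
        for x in (0, 1):
            env[var] = x
            term = 1.0
            for s, t in factors:
                term *= t[tuple(env[v] for v in s)]
            total += term
        table[vals] = total
    return scope, table

def restrict(factor, var, value):
    """Fix `var` to its observed value and drop it from the scope."""
    scope, table = factor
    i = scope.index(var)
    return (scope[:i] + scope[i + 1:],
            {k[:i] + k[i + 1:]: v for k, v in table.items() if k[i] == value})

def be_bel(factors, order, evidence):
    """Process buckets in `order`; what is left mentions only the query variable."""
    buckets = {v: [] for v in order}
    leftover = []

    def place(f):
        # A factor goes to the bucket of the first variable in `order` it mentions.
        for v in order:
            if v in f[0]:
                buckets[v].append(f)
                return
        leftover.append(f)

    for f in factors:
        place(f)
    for v in order:
        if v in evidence:
            for f in buckets[v]:
                place(restrict(f, v, evidence[v]))
        elif buckets[v]:
            place(sum_out(buckets[v], v))
    joint = []
    for a in (0, 1):
        p = 1.0
        for s, t in leftover:
            p *= t[(a,) if s else ()]
        joint.append(p)
    z = sum(joint)                     # probability of the evidence
    return [p / z for p in joint], z

# Invented CPT numbers for the network P(a) P(b|a) P(c|a) P(d|b,a) P(e|b,c).
FACTORS = [
    cpt("A", (), {(): 0.3}),
    cpt("B", ("A",), {(0,): 0.2, (1,): 0.7}),
    cpt("C", ("A",), {(0,): 0.4, (1,): 0.6}),
    cpt("D", ("B", "A"), {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}),
    cpt("E", ("B", "C"), {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.95}),
]
posterior, p_evidence = be_bel(FACTORS, ["B", "C", "D", "E"], {"E": 0})
print(posterior, p_evidence)
```

The largest table built here is the one created in bucket B, over a, c, d, e, which is the function whose size the induced width of this ordering bounds.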

BE-BEL

Student Network example

[Figure: Bayesian network over Difficulty, Intelligence, Grade, SAT, Letter, Apply, Job]

• P(J)?

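[Figure: the graph over A, B, C, D, E drawn along two orderings; nodes listed top to bottom: E, D, C, B, A and B, C, D, E, A.]

(Slide from ICS 275A, Constraint Networks, Fall 2003)

The induced-width

• Width: the maximum number of parents (earlier neighbors) of any node in the ordered graph.
• Induced graph: obtained by recursively connecting the parents of each node, going from the last node to the first.
• Induced-width w*(d): the width of the induced graph along ordering d, that is, the maximum number of parents over all nodes after the parents have been connected.
• Induced-width of a graph: the minimum of w*(d) over all orderings d.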

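Complexity of elimination

$$O\big(n \exp(w^{*}(d))\big)$$

where w*(d) is the induced width of the moral graph along ordering d.

The effect of the ordering:

$$w^{*}(d_1) = 4, \qquad w^{*}(d_2) = 2$$

[Figure: the “moral” graph over A, B, C, D, E, and the same graph drawn along the two orderings; nodes listed top to bottom: B, C, D, E, A and E, D, C, B, A.]

More accurately: O(r exp(w*(d))), where r is the number of CPTs. For Bayesian networks r = n. For Markov networks?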
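The two width values quoted for the example can be checked with a short Python sketch. It is illustrative only: the function name `induced_width` is hypothetical, and the edge list is the moral graph implied by the factorization P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c), where each child is linked to its parents and the parents B and C of E are linked to each other. An ordering is written first node to last; nodes are processed from last to first.

```python
def induced_width(edges, ordering):
    """Width of the induced graph along `ordering` (first node to last node)."""
    adj = {v: set() for v in ordering}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    pos = {v: i for i, v in enumerate(ordering)}
    width = 0
    for v in reversed(ordering):
        # Parents of v: neighbors that appear earlier in the ordering.
        parents = {u for u in adj[v] if pos[u] < pos[v]}
        width = max(width, len(parents))
        # Connect the parents to each other before moving to the next node.
        for u in parents:
            adj[u] |= parents - {u}
    return width

MORAL_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
               ("B", "E"), ("C", "E"), ("B", "C")]

w1 = induced_width(MORAL_EDGES, ["A", "E", "D", "C", "B"])   # B is processed first
w2 = induced_width(MORAL_EDGES, ["A", "B", "C", "D", "E"])   # E is processed first
print(w1, w2)
```

Processing B first connects all four of its neighbors, which is where the width of 4 comes from; processing E first never creates a node with more than two parents.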

BE-BEL

The impact of observations

[Figure: four panels: moral graph; induced moral graph; adjusted graph for evidence in B; induced adjusted graph.]

Probabilistic Inference Tasks

Belief updating:

$$BEL(X_i) = P(X_i = x_i \mid \text{evidence})$$

Finding most probable explanation (MPE):

$$x^{*} = \arg\max_{x} P(x, e)$$

Algorithm elim-mpe (Dechter 1996)

$$MPE = \max_{x} P(x)$$

Σ is replaced by max:

$$MPE = \max_{a,e,d,c,b} P(a)\,P(c \mid a)\,P(b \mid a)\,P(d \mid a,b)\,P(e \mid b,c)$$

Elimination operator: max_b

bucket B:   P(b|a)  P(d|b,a)  P(e|b,c)     →  h^B(a,d,c,e)
bucket C:   P(c|a)  h^B(a,d,c,e)           →  h^C(a,d,e)
bucket D:   h^C(a,d,e)                     →  h^D(a,e)
bucket E:   e=0  h^D(a,e)                  →  h^E(a)
bucket A:   P(a)  h^E(a)                   →  MPE

W* = 4, the “induced width” (max clique size)

Generating the MPE-tuple

B:   P(b|a)  P(d|b,a)  P(e|b,c)
C:   P(c|a)  h^B(a,d,c,e)
D:   h^C(a,d,e)
E:   e=0  h^D(a,e)
A:   P(a)  h^E(a)

1. $a' = \arg\max_{a} P(a)\, h^{E}(a)$
2. $e' = 0$
3. $d' = \arg\max_{d} h^{C}(a', d, e')$
4. $c' = \arg\max_{c} P(c \mid a')\, h^{B}(a', d', c, e')$
5. $b' = \arg\max_{b} P(b \mid a')\, P(d' \mid b, a')\, P(e' \mid b, c')$

Return (a', b', c', d', e')
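The forward max-elimination pass and the backward argmax pass can be sketched together in Python. This is an illustrative sketch with assumptions called out: the variables are binary, the CPT numbers are invented, the evidence e=0 is encoded as a 0/1 indicator factor placed in bucket E, and the helper names (`cpt`, `value`, `max_out`, `elim_mpe`) are hypothetical.

```python
from itertools import product
from math import prod

def cpt(child, parents, p1):
    """Factor for P(child | parents); p1 maps parent values to P(child=1 | ...)."""
    table = {}
    for pv in product((0, 1), repeat=len(parents)):
        table[(1,) + pv] = p1[pv]
        table[(0,) + pv] = 1.0 - p1[pv]
    return (child,) + tuple(parents), table

def value(factor, env):
    """Look up a factor at the assignment `env` (a dict variable -> value)."""
    scope, table = factor
    return table[tuple(env[v] for v in scope)]

def max_out(factors, var):
    """Multiply the factors of one bucket and maximize over `var`."""
    scope = tuple(sorted({v for s, _ in factors for v in s} - {var}))
    table = {}
    for vals in product((0, 1), repeat=len(scope)):
        env = dict(zip(scope, vals))
        table[vals] = max(prod(value(f, {**env, var: x}) for f in factors)
                          for x in (0, 1))
    return scope, table

def elim_mpe(factors, order):
    buckets = {v: [] for v in order}

    def place(f):
        # A factor goes to the bucket of the first variable in `order` it mentions.
        home = next((v for v in order if v in f[0]), None)
        if home is not None:
            buckets[home].append(f)

    for f in factors:
        place(f)
    mpe_value = 1.0
    for v in order:                        # forward pass: eliminate by max
        h = max_out(buckets[v], v)
        if h[0]:
            place(h)
        else:
            mpe_value *= h[1][()]          # constant produced by the last bucket
    assignment = {}
    for v in reversed(order):              # backward pass: choose argmax values
        assignment[v] = max(
            (0, 1),
            key=lambda x: prod(value(f, {**assignment, v: x}) for f in buckets[v]),
        )
    return mpe_value, assignment

# Invented CPT numbers; the last factor is the indicator for the evidence e=0.
FACTORS = [
    cpt("A", (), {(): 0.3}),
    cpt("B", ("A",), {(0,): 0.2, (1,): 0.7}),
    cpt("C", ("A",), {(0,): 0.4, (1,): 0.6}),
    cpt("D", ("B", "A"), {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}),
    cpt("E", ("B", "C"), {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.95}),
    (("E",), {(0,): 1.0, (1,): 0.0}),
]
mpe_value, mpe_tuple = elim_mpe(FACTORS, ["B", "C", "D", "E", "A"])
print(mpe_value, mpe_tuple)
```

The backward pass visits the buckets in reverse order, so every function in a bucket depends only on variables that have already been assigned, plus the bucket's own variable.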


BE-MPE

Finding small induced-width

• NP-complete
• A tree has induced-width of ?
• Greedy algorithms:
  – Min width
  – Min induced-width
  – Max-cardinality
  – Fill-in (thought to be the best)
  – See anytime min-width (Gogate and Dechter)
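One of the greedy heuristics listed above, fill-in, can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `min_fill_ordering` is hypothetical, ties are broken alphabetically (an arbitrary choice), and the example graph is the moral graph implied by P(a)P(b|a)P(c|a)P(d|b,a)P(e|b,c). The function returns the elimination order, which is the reverse of the ordering d in the convention where nodes are processed from last to first.

```python
def min_fill_ordering(edges):
    """Greedy fill-in heuristic: repeatedly eliminate the node whose removal
    would add the fewest edges among its remaining neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def fill(v):
        # Number of neighbor pairs of v that are not yet connected.
        nb = sorted(adj[v])
        return sum(1 for i, a in enumerate(nb)
                   for b in nb[i + 1:] if b not in adj[a])

    elimination, width = [], 0
    while adj:
        v = min(sorted(adj), key=fill)     # alphabetical tie-breaking
        neighbors = adj.pop(v)
        width = max(width, len(neighbors))
        for a in neighbors:
            adj[a].discard(v)
            adj[a] |= neighbors - {a}      # add the fill edges
        elimination.append(v)
    return elimination, width

MORAL_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
               ("B", "E"), ("C", "E"), ("B", "C")]

order, width = min_fill_ordering(MORAL_EDGES)
chain_order, chain_width = min_fill_ordering([("A", "B"), ("B", "C"), ("C", "D")])
print(order, width, chain_order, chain_width)
```

On this small graph the heuristic reaches width 2, the smaller of the two widths quoted for the example, and on a chain (a simple tree) it reaches width 1.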
