
Lauritzen-Spiegelhalter Algorithm Probabilistic Inference In Bayes Networks Haipeng Guo Nov. 08, 2000 KDD Lab, CIS Department, KSU


Presentation Outline

• Bayes Networks

• Probabilistic Inference in Bayes Networks

• L-S algorithm

• Computational Complexity Analysis

• Demo

Bayes Networks

• A Bayes network is a directed acyclic graph (DAG) together with a set of conditional probabilities.

• The nodes represent random variables, and the arcs between nodes represent conditional dependence between them.

• Each node contains a CPT (conditional probability table) that gives the probability of the node taking each of its values given the values of its parents.
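As a minimal sketch of this data structure, the snippet below stores one CPT per node and multiplies the matching entries to get the probability of a full assignment. The two-node network (Rain, WetGrass) and all numbers are hypothetical and are not taken from the slides.

```python
# Hypothetical two-node network Rain -> WetGrass; names and numbers are illustrative only.
parents = {"Rain": (), "WetGrass": ("Rain",)}

# One CPT per node: tuple of parent values -> distribution over the node's own values.
cpt = {
    "Rain": {(): {True: 0.2, False: 0.8}},
    "WetGrass": {
        (True,): {True: 0.9, False: 0.1},
        (False,): {True: 0.1, False: 0.9},
    },
}


def joint(assignment):
    """Probability of a full assignment: the product of P(node | parents)."""
    p = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[q] for q in pa)
        p *= cpt[node][pa_values][assignment[node]]
    return p


p_both = joint({"Rain": True, "WetGrass": True})  # 0.2 * 0.9
```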


Probabilistic Inference in Bayes Networks

• Probabilistic inference in a Bayes network is the process of computing the conditional probability of some variables given the values of other variables (the evidence).

• P(V=v | E=e): suppose that I observe the values e on a set of evidence variables E. What is the probability that variable V has value v, given e?
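The query P(V=v | E=e) can be answered directly, if inefficiently, by summing the joint distribution over every assignment consistent with the evidence. The sketch below does this for a hypothetical three-variable chain (Smoking, Bronchitis, Dyspnea) with made-up numbers; it is brute-force enumeration, not the L-S algorithm.

```python
from itertools import product

# Hypothetical chain Smoking -> Bronchitis -> Dyspnea with made-up numbers.
VARS = ("Smoking", "Bronchitis", "Dyspnea")


def joint(s, b, d):
    """P(S=s, B=b, D=d) = P(s) * P(b | s) * P(d | b) for binary 0/1 values."""
    p_s = 0.5                        # P(S=1)
    p_b = 0.6 if s else 0.3          # P(B=1 | S=s)
    p_d = 0.8 if b else 0.1          # P(D=1 | B=b)
    return ((p_s if s else 1 - p_s)
            * (p_b if b else 1 - p_b)
            * (p_d if d else 1 - p_d))


def query(var, val, evidence):
    """P(var = val | evidence), summing the joint over all consistent assignments."""
    num = den = 0.0
    for combo in product((0, 1), repeat=len(VARS)):
        a = dict(zip(VARS, combo))
        if any(a[e] != x for e, x in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(*combo)
        den += p
        if a[var] == val:
            num += p
    return num / den


posterior = query("Bronchitis", 1, {"Dyspnea": 1})
```

The loop visits every joint assignment, so its cost grows exponentially with the number of variables.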

Probabilistic Inference in Bayes Networks

Example problem:

• Suppose that a patient arrives and it is known for certain that he has recently visited Asia and has dyspnea.

• What impact does this have on the probabilities of the other variables in the network?

• Answering this requires probability propagation in the network.

Probabilistic Inference in Bayes Networks

• The problem of exact probabilistic inference in an arbitrary Bayes network is NP-Hard. [Cooper 1988]

• NP-Hard problems are at least as computationally complex as NP-complete problems.

• No algorithm has ever been found that solves an NP-complete problem in polynomial time.

• Although it has never been proved whether P = NP or not, many believe that no such polynomial-time algorithm exists.

• Accordingly, it is unlikely that we could develop a general-purpose efficient exact method for propagating probabilities in an arbitrary network.

Lauritzen-Spiegelhalter Algorithm

• The L-S algorithm is an exact probabilistic inference algorithm that applies to an arbitrary Bayes network and is efficient when the cliques of the network are small.

• S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B, 1988.

Lauritzen-Spiegelhalter Algorithm

• The L-S algorithm works in two steps:

First, it creates a tree of cliques (join tree or junction tree) from the original Bayes network.

Then, it computes probabilities for the cliques during a message propagation, and the individual node probabilities are calculated from the probabilities of the cliques.

L-S Algorithm: Cliques

• An undirected graph is complete if every pair of distinct nodes is adjacent.

• A clique W of G is a maximal complete subset of G; that is, there is no other complete subset of G which properly contains W.

[Figure: an undirected graph on the nodes A, B, C, D, E]

Clique 1: {A, B, C, D}

Clique 2: {B, D, E}
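The definition can be checked by brute force on the five-node example. The edge set below is an assumption: it is the smallest one consistent with the two cliques listed above, since the figure itself is not available. Enumerating all subsets is only practical for tiny graphs.

```python
from itertools import combinations

# Edge set assumed from the two cliques listed on the slide.
EDGES = ["AB", "AC", "AD", "BC", "BD", "CD", "BE", "DE"]
adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def is_complete(nodes, adj):
    """True if every pair of distinct nodes in `nodes` is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def maximal_cliques(adj):
    """All complete subsets that are not properly contained in another one."""
    nodes = sorted(adj)
    complete = [set(s)
                for k in range(1, len(nodes) + 1)
                for s in combinations(nodes, k)
                if is_complete(s, adj)]
    return [c for c in complete if not any(c < d for d in complete)]


cliques = maximal_cliques(adj)
```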

Step 1: Building the tree of cliques

In step 1, we begin with the DAG of a Bayes network and apply a series of graphical transformations that result in a join tree:

1. Construct a moral graph from the DAG of a Bayes network by marrying parents

2. Add arcs to the moral graph to form a triangulated graph, and create an ordering of all nodes using Maximum Cardinality Search

3. Identify the cliques of the triangulated graph and order them according to the node ordering

4. Connect these cliques to build a join tree

Step 1.1: Moralization

Input: G, the DAG of a Bayes network

Output: Gm, the moral graph relative to G

Algorithm: "marry" the parents and drop the directions (add an arc between every pair of parents of each node, then make all arcs undirected)

[Figure: the example DAG on the nodes A to H (left) and its moral graph (right)]
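A minimal sketch of moralization follows. The parent lists are an assumption: they are read off the conditional probabilities shown on the Step 1.4 slide (P(B|A), P(C|B,E), P(D|C), P(E|F), P(G|F), P(H|C,G)), because the figure of the DAG is not available here.

```python
from itertools import combinations

# Parent lists assumed from the CPTs listed on the Step 1.4 slide.
dag = {
    "A": [], "B": ["A"], "C": ["B", "E"], "D": ["C"],
    "E": ["F"], "F": [], "G": ["F"], "H": ["C", "G"],
}


def moralize(parents):
    """Return the moral graph as an adjacency dict of sets."""
    adj = {v: set() for v in parents}
    for child, pa in parents.items():
        for p in pa:                        # drop the direction of every arc
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):    # marry every pair of parents
            adj[p].add(q)
            adj[q].add(p)
    return adj


moral = moralize(dag)
```

Under this assumed DAG, moralization adds the two arcs B-E (parents of C) and C-G (parents of H).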

Step 1.2: Triangulation

Input: Gm, the moral graph

Output: a perfect ordering of the nodes and Gu, the triangulated graph of Gm

Algorithm:

1. Use Maximum Cardinality Search to create a perfect ordering of the nodes

2. Use the Fill-in Computation algorithm to triangulate Gm, which yields Gu

[Figure: the moral graph (left) and the same graph with the nodes numbered A1, B2, E3, C4, G5, F6, H7, D8 (right)]
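The two sub-steps can be sketched as follows. The moral graph is the one assumed from the CPTs on the Step 1.4 slide, and the `preference` argument is a hypothetical tie-breaking device set to the slide's numbering, so the run only confirms that A1, B2, E3, C4, G5, F6, H7, D8 is one valid Maximum Cardinality Search order. The fill-in routine is a simple quadratic version of the idea, not the linear-time algorithm.

```python
from itertools import combinations

# Moral graph of the example (edge list assumed from the CPTs on the Step 1.4 slide).
EDGES = ["AB", "BC", "EC", "BE", "FE", "FG", "CG", "CH", "GH", "CD"]
moral = {}
for u, v in EDGES:
    moral.setdefault(u, set()).add(v)
    moral.setdefault(v, set()).add(u)


def max_cardinality_search(adj, preference):
    """Repeatedly number the unnumbered node with the most numbered neighbours.

    Ties go to the node that appears first in `preference`.
    """
    order, numbered = [], set()
    while len(order) < len(adj):
        best = max((v for v in preference if v not in numbered),
                   key=lambda v: len(adj[v] & numbered))
        order.append(best)
        numbered.add(best)
    return order


def fill_in(adj, order):
    """Add arcs until each node's earlier-numbered neighbours are pairwise adjacent."""
    rank = {v: i for i, v in enumerate(order)}
    tri = {v: set(n) for v, n in adj.items()}
    added = []
    for v in reversed(order):
        earlier = sorted((u for u in tri[v] if rank[u] < rank[v]), key=rank.get)
        for u, w in combinations(earlier, 2):
            if w not in tri[u]:
                tri[u].add(w)
                tri[w].add(u)
                added.append((u, w))
    return tri, added


order = max_cardinality_search(moral, preference="ABECGFHD")
triangulated, added = fill_in(moral, order)
```

With this ordering the only arc added is E-G, which breaks the four-cycle E, C, G, F.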

Step 1.3: Identify Cliques

Input: Gu and an ordering of the nodes

Output: a list of the cliques of the triangulated graph Gu

Algorithm: use the Cliques-Finding algorithm to find the cliques of the triangulated graph, then order them by their highest-labeled node in the node ordering

Step 1.3: Identify Cliques

[Figure: the triangulated graph with the nodes numbered A1, B2, E3, C4, G5, F6, H7, D8 and its six cliques circled]

Clique 1: {A1, B2}

Clique 2: {B2, E3, C4}

Clique 3: {E3, C4, G5}

Clique 4: {E3, G5, F6}

Clique 5: {C4, G5, H7}

Clique 6: {C4, D8}

The cliques are ordered by their highest-labeled node in the node ordering.
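One simple way to identify and order the cliques, given a triangulated graph and a perfect ordering, is sketched below: each node together with its earlier-numbered neighbours is a candidate, and the candidates not contained in another candidate are the cliques. This is an illustrative stand-in for the Cliques-Finding algorithm named on the slide, not necessarily the same procedure; the edge list is the assumed moral graph plus the fill-in arc E-G.

```python
# Triangulated example graph: assumed moral graph plus the fill-in arc E-G.
EDGES = ["AB", "BC", "EC", "BE", "FE", "FG", "CG", "CH", "GH", "CD", "EG"]
tri = {}
for u, v in EDGES:
    tri.setdefault(u, set()).add(v)
    tri.setdefault(v, set()).add(u)

order = list("ABECGFHD")  # numbering shown on the slide: A1, B2, E3, C4, G5, F6, H7, D8


def ordered_cliques(tri, order):
    """Cliques of a triangulated graph, sorted by their highest-numbered node."""
    rank = {v: i for i, v in enumerate(order)}
    # Candidate for v: v plus its neighbours that were numbered before it.
    candidates = [{v} | {u for u in tri[v] if rank[u] < rank[v]} for v in order]
    cliques = [c for c in candidates if not any(c < d for d in candidates)]
    return sorted(cliques, key=lambda c: max(rank[v] for v in c))


cliques = ordered_cliques(tri, order)
```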

Step 1.4: Build tree of Cliques

Input: a list of cliques of the triangulated graph Gu

Output: a tree of cliques, with the separator nodes Si, the residual nodes Ri and the potential probability ψ(Clqi) computed for all cliques

Algorithm:

1. $S_i = \mathrm{Clq}_i \cap (\mathrm{Clq}_1 \cup \mathrm{Clq}_2 \cup \dots \cup \mathrm{Clq}_{i-1})$

2. $R_i = \mathrm{Clq}_i \setminus S_i$

3. If $i > 1$, identify a $j < i$ such that $\mathrm{Clq}_j$ is a parent of $\mathrm{Clq}_i$ (that is, $S_i \subseteq \mathrm{Clq}_j$)

4. Assign each node $v$ to a unique clique $\mathrm{Clq}_i$ such that $\{v\} \cup c(v) \subseteq \mathrm{Clq}_i$, where $c(v)$ denotes the parents of $v$

5. Compute $\psi(\mathrm{Clq}_i) = \prod_{v:\, f(v) = \mathrm{Clq}_i} P(v \mid c(v))$, where $f(v)$ is the clique assigned to $v$ in step 4, and $\psi(\mathrm{Clq}_i) = 1$ if no $v$ is assigned to $\mathrm{Clq}_i$

6. Store Clqi, Ri, Si and ψ(Clqi) at each vertex in the tree of cliques
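Steps 1 to 3 can be sketched on the six ordered cliques of the example. When several earlier cliques contain the separator, any of them can serve as the parent; taking the smallest index, as below, is an arbitrary choice of this sketch and may differ from the tree drawn on the slides.

```python
# Ordered cliques of the example (Clq1 .. Clq6).
cliques = [
    {"A", "B"}, {"B", "E", "C"}, {"E", "C", "G"},
    {"E", "G", "F"}, {"C", "G", "H"}, {"C", "D"},
]


def build_clique_tree(cliques):
    """Compute separator S_i, residual R_i and a parent index for every clique."""
    tree, seen = [], set()
    for i, clq in enumerate(cliques):
        sep = clq & seen                      # S_i: nodes shared with earlier cliques
        parent = None
        if i > 0:
            # Any earlier clique containing S_i qualifies; take the first one.
            parent = next(j for j in range(i) if sep <= cliques[j])
        tree.append({"clique": clq, "S": sep, "R": clq - sep, "parent": parent})
        seen |= clq
    return tree


tree = build_clique_tree(cliques)
```

Indices are 0-based here, so `parent == 1` refers to Clq2.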

Step 1.4: Build tree of Cliques

[Figure: the tree of cliques AB, BEC, ECG, EGF, CGH, CD with the separators B, EC, EG, CG, C on its edges]

Clq1 = {A, B}, R1 = {A, B}, S1 = {}, ψ(Clq1) = P(B|A)P(A)

Clq2 = {B, E, C}, R2 = {C, E}, S2 = {B}, ψ(Clq2) = P(C|B,E)

Clq3 = {E, C, G}, R3 = {G}, S3 = {E, C}, ψ(Clq3) = 1

Clq4 = {E, G, F}, R4 = {F}, S4 = {E, G}, ψ(Clq4) = P(E|F)P(G|F)P(F)

Clq5 = {C, G, H}, R5 = {H}, S5 = {C, G}, ψ(Clq5) = P(H|C,G)

Clq6 = {C, D}, R6 = {D}, S6 = {C}, ψ(Clq6) = P(D|C)

Ri: residual nodes; Si: separator nodes; ψ(Clqi): potential probability of clique i
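The assignment of conditional probabilities to cliques can be sketched as below. Each node goes to the first clique that contains the node and all of its parents; choosing the first such clique is an assumption of this sketch, and the parent lists are read off the potentials listed above.

```python
# Parent lists read off the potentials listed above.
dag = {
    "A": [], "B": ["A"], "C": ["B", "E"], "D": ["C"],
    "E": ["F"], "F": [], "G": ["F"], "H": ["C", "G"],
}
cliques = [
    {"A", "B"}, {"B", "E", "C"}, {"E", "C", "G"},
    {"E", "G", "F"}, {"C", "G", "H"}, {"C", "D"},
]


def assign_families(dag, cliques):
    """Map each clique index to the nodes whose CPT is multiplied into its potential."""
    assigned = {i: [] for i in range(len(cliques))}
    for v, pa in dag.items():
        family = {v} | set(pa)
        # First clique that contains v together with all of its parents.
        i = next(k for k, clq in enumerate(cliques) if family <= clq)
        assigned[i].append(v)
    return assigned


assigned = assign_families(dag, cliques)
```

A clique with an empty list, here index 2 (Clq3), gets the constant potential 1.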

Step 1: Conclusion

In step 1, we begin with the DAG of a Bayes network and apply a series of graphical transformations that result in a Permanent Tree of Cliques. Stored at each vertex in the tree are the following:

1. A clique Clqi

2. Si

3. Ri

4. ψ(Clqi)

Step 2: Computation inside the join tree

• In step 2, we start by making a copy of the Permanent Tree of Cliques, giving Clqi', Si', Ri' and ψ'(Clqi'), and add a field P'(Clqi') that is left unassigned at first.

• Then we compute the prior probability P'(Clqi') using the same updating algorithm that is used to determine the conditional probabilities given instantiated values.

• After initialization, we compute the posterior probability P'(Clqi') again, now with evidence, by passing upward and downward messages in the join tree.

• When the probabilities of all cliques are determined, we can compute the probability of each variable from any clique containing the variable.

Step 2: Message passing in the join tree

• The message passing process consists of first sending upward messages from the bottom of the tree to the top, then sending downward messages from the top to the bottom, modifying and accumulating the node properties (ψ'(Clqi') and P'(Clqi')) along the way.

• The upward message is a summed product of all the probabilities below the given node.

• The downward message carries the information for updating the node's prior probabilities.

Step 2: Message passing in the join tree

[Figure: the tree of cliques Clq1 to Clq6 with arrows for the upward messages and the downward messages]

• ψ'(Clqi') is modified as the message passing proceeds.

• P'(Clqi') is computed as the message passing proceeds.
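The two passes can be sketched on a tiny tree. The code below follows a common textbook form of the update (upward: sum the potential over the residual, divide it out, multiply it into the parent; downward: multiply each potential by the parent's separator marginal) and is an assumption about the details, which the slides do not spell out. The chain A, B, C, its numbers and all helper names are hypothetical.

```python
from itertools import product

VALUES = (0, 1)  # all variables are binary in this sketch


def tabulate(vars_, fn):
    """Table of fn(assignment) over every assignment of vars_."""
    return {vals: fn(dict(zip(vars_, vals)))
            for vals in product(VALUES, repeat=len(vars_))}


def project(vars_, table, keep):
    """Sum out every variable of vars_ that is not in keep."""
    out = {}
    for vals, p in table.items():
        a = dict(zip(vars_, vals))
        key = tuple(a[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def rescale(vars_, table, sub_vars, sub_table, divide=False):
    """Multiply (or divide) each entry by the matching entry of a table over sub_vars."""
    out = {}
    for vals, p in table.items():
        a = dict(zip(vars_, vals))
        f = sub_table[tuple(a[k] for k in sub_vars)]
        out[vals] = (p / f if f else 0.0) if divide else p * f
    return out


def propagate(cliques, seps, parent, psi):
    """Two-pass update; cliques must be ordered so that parent[i] < i."""
    psi = [dict(t) for t in psi]
    n = len(cliques)
    for i in range(n - 1, 0, -1):                      # upward pass
        lam = project(cliques[i], psi[i], seps[i])     # sum over the residual R_i
        psi[i] = rescale(cliques[i], psi[i], seps[i], lam, divide=True)
        psi[parent[i]] = rescale(cliques[parent[i]], psi[parent[i]], seps[i], lam)
    total = sum(psi[0].values())
    prob = [None] * n
    prob[0] = {k: v / total for k, v in psi[0].items()}
    for i in range(1, n):                              # downward pass
        p_sep = project(cliques[parent[i]], prob[parent[i]], seps[i])
        prob[i] = rescale(cliques[i], psi[i], seps[i], p_sep)
    return prob


# Hypothetical chain A -> B -> C with made-up numbers; cliques {A,B} and {B,C}.
p_a = {1: 0.3, 0: 0.7}
p_b1 = {1: 0.8, 0: 0.1}   # P(B=1 | A)
p_c1 = {1: 0.9, 0: 0.2}   # P(C=1 | B)
cliques = [("A", "B"), ("B", "C")]
seps = [(), ("B",)]
parent = [None, 0]
psi = [
    tabulate(cliques[0],
             lambda a: p_a[a["A"]] * (p_b1[a["A"]] if a["B"] else 1 - p_b1[a["A"]])),
    tabulate(cliques[1], lambda a: p_c1[a["B"]] if a["C"] else 1 - p_c1[a["B"]]),
]
prior = propagate(cliques, seps, parent, psi)

# Evidence C = 1: zero out the entries of the potential that contradict it.
psi_e = [psi[0], {k: (v if k[1] == 1 else 0.0) for k, v in psi[1].items()}]
posterior = propagate(cliques, seps, parent, psi_e)
p_b_given_c = project(cliques[1], posterior[1], ("B",))[(1,)]
```

Evidence is entered here by zeroing the contradicting entries before propagation; the normalization at the root then turns the clique tables into conditional probabilities.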

Step 2: Message passing in the join tree

• When the probabilities of all cliques are determined, for each vertex Clqi' and each variable v ∈ Clqi', compute

$$P'(v) := \sum_{w \in \mathrm{Clq}_i',\; w \neq v} P'(\mathrm{Clq}_i')$$
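The sum over the other variables of the clique can be sketched as below. The clique table for {A, B} and its numbers are hypothetical.

```python
from collections import defaultdict


def marginal(clique_vars, table, v):
    """Sum a clique table over every variable of the clique except v."""
    idx = clique_vars.index(v)
    out = defaultdict(float)
    for assignment, p in table.items():
        out[assignment[idx]] += p
    return dict(out)


# Hypothetical clique probability table P'(A, B); keys are (value of A, value of B).
clique_vars = ("A", "B")
table = {(1, 1): 0.24, (1, 0): 0.06, (0, 1): 0.07, (0, 0): 0.63}

p_b = marginal(clique_vars, table, "B")
```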

L-S Algorithm: Computational Complexity Analysis

1. Computations in the algorithm which creates the Permanent Tree of Cliques: O(nr^m)

• Moralization – O(n)

• Maximum Cardinality Search – O(n+e)

• Fill-in algorithm for triangulating – O(n+e)

The general problem of finding an optimal triangulation (minimal fill-in) is NP-Hard, but we are using a greedy heuristic.

• Find cliques and build join tree – O(n+e)

The general problem of finding the cliques of an arbitrary graph is NP-Hard, but our subject is a triangulated graph.

• Compute ψ(Clqi) – O(nr^m)

n = number of variables; e = number of arcs; m = the maximum number of variables in a clique; r = the maximum number of alternatives for a variable

L-S Algorithm: Computational Complexity Analysis

2. Computations in the updating algorithm: O(pr^m)

• Computation for sending all upward messages – 2pr^m

• Computation for sending all downward messages – pr^m

• Computation for receiving all upward messages – pr^m

• Computation for receiving all downward messages – pr^m

p = number of vertices in the tree of cliques

The L-S algorithm has a time complexity of O(pr^m); in the worst case it is bounded below by 2^m, i.e. Ω(2^m).
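To get a feel for the p·r^m term, the short sketch below evaluates it for a few clique sizes. The values p = 6 and m = 3 match the example tree of cliques; r = 2 (binary variables) and the larger values of m are illustrative assumptions.

```python
def table_entries(p, r, m):
    """Upper bound p * r**m on the total number of clique-table entries."""
    return p * r ** m


example = table_entries(p=6, r=2, m=3)                      # the six-clique example
growth = {m: table_entries(p=6, r=2, m=m) for m in (3, 10, 20, 30)}
```

Each extra variable in the largest clique multiplies the bound by r, which is why the size m of the largest clique, rather than the number of variables n, dominates the cost.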

L-S Algorithm: Computational Complexity Analysis

• It may seem that we should search for a better general-purpose algorithm to perform probability propagation.

• But in practice, most Bayes networks created by hand tend to contain small clusters of variables, and therefore a small value of m. So the L-S algorithm works efficiently for many applications, because the networks available so far are often sparse and irregular.

• The L-S algorithm could perform very badly on more general networks.

L-S Algorithm: Alternative methods

• Since the general problem of probability propagation is NP-Hard, it is unlikely that we could develop an efficient general-purpose algorithm for propagating probabilities in an arbitrary Bayes network.

• This suggests that research should be directed towards obtaining alternative methods which work in different cases:

1. Approximate algorithms

2. Monte Carlo techniques

3. Heuristic algorithms

4. Parallel algorithms

5. Special case algorithms

L-S Algorithm: Demo

• Laura works on step 1

• Ben works on step 2