
Bayesian Networks (Directed Acyclic Graphical Models)

The situation of a bell that rings whenever the outcomes of two coins are equal cannot be well represented by undirected graphical models.

A clique will be formed because of the induced dependency of the two coins given the bell.

[Figure: DAG Coin1 → Bell ← Coin2]


Bayesian Networks (BNs)

Examples of models for diseases & symptoms & risk factors

One variable for all diseases (values are diseases)

One variable per disease (values are True/False)

Naïve Bayesian Networks versus Bipartite BNs


Boundary Basis for Dependency Models

Let M be a dependency model over U={X1,…,Xn}. Let d be an ordering of these elements.

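A boundary basis of M with respect to d is a set of independence statements I(Xi, Bi, Ui-Bi) that hold in M, where Ui={X1,X2,…,Xi-1} and Bi is a subset of Ui, for i=1,…,n.

A boundary basis is minimal if every Bi is minimal, that is, no proper subset of Bi satisfies the corresponding statement.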

Example I: What is the boundary basis for P(X1,X2,X3,X4) = P(X1)P(X2|X1)P(X3|X2)P(X4|X3)?


Example I

A boundary basis and a boundary DAG for P(X1,X2,X3,X4) = P(X1)P(X2|X1)P(X3|X2)P(X4|X3):

I(X3, X2, X1)

I(X4, X3, {X1, X2})

[Figure: boundary DAG X1 → X2 → X3 → X4]

The directed acyclic graph (DAG) created by assigning each vertex Xi the parents Bi is called the boundary DAG of M relative to order d.
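The search for a minimal boundary basis can be sketched numerically. The snippet below is only an illustration: it assumes binary variables and made-up numbers for the chain of Example I (the values in `p1` and `trans` are not from the slides), and the helper names `marginal`, `independent`, and `boundary_dag` are hypothetical. For each variable in the order it looks, smallest sets first, for a set Bi of predecessors with I(Xi, Bi, Ui-Bi) and uses that Bi as the parent set.

```python
from itertools import combinations, product

# Illustrative numbers (assumptions, not from the slides) for a binary chain
# X1 -> X2 -> X3 -> X4. Variables are indexed 0..3 for X1..X4.
p1 = {0: 0.6, 1: 0.4}                                # P(X1)
trans = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # trans[prev][cur]

joint = {}
for x in product((0, 1), repeat=4):
    p = p1[x[0]]
    for prev, cur in zip(x, x[1:]):
        p *= trans[prev][cur]
    joint[x] = p


def marginal(idx):
    """Marginal table over the variable positions listed in idx."""
    table = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in idx)
        table[key] = table.get(key, 0.0) + p
    return table


def independent(i, b, rest, tol=1e-12):
    """Test I(X_i, B, rest) via P(i,b,rest) P(b) == P(i,b) P(b,rest)."""
    b, rest = tuple(b), tuple(rest)
    p_all = marginal((i,) + b + rest)
    p_b = marginal(b)
    p_ib = marginal((i,) + b)
    p_br = marginal(b + rest)
    nb = len(b)
    for key, p in p_all.items():
        xi, xb, xr = key[0], key[1:1 + nb], key[1 + nb:]
        if abs(p * p_b[xb] - p_ib[(xi,) + xb] * p_br[xb + xr]) > tol:
            return False
    return True


def boundary_dag(order):
    """Map each variable to a smallest set B_i with I(X_i, B_i, U_i - B_i)."""
    parents = {}
    for pos, i in enumerate(order):
        preds = order[:pos]
        for size in range(len(preds) + 1):
            found = [b for b in combinations(preds, size)
                     if independent(i, b, tuple(v for v in preds if v not in b))]
            if found:
                parents[i] = set(found[0])
                break
    return parents


parents = boundary_dag((0, 1, 2, 3))
print(parents)
```

With these numbers the search recovers the parent sets of the chain X1 → X2 → X3 → X4. The brute-force search is exponential in the number of predecessors, so it is meant only for small examples.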


Example II

A boundary basis and a boundary DAG for P(coin1,coin2,bell) = P(coin1)P(coin2)P(bell|coin1,coin2):

I(coin1, { }, coin2)

[Figure: boundary DAG Coin1 → Bell ← Coin2]
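The induced dependency in the coin and bell example can be checked by direct enumeration. The sketch below assumes two fair coins (the slides do not state the coin bias) and a bell that rings exactly when the outcomes agree; the helper `prob` is a hypothetical name.

```python
from itertools import product

# Assumption: both coins are fair; the bell rings exactly when they agree.
joint = {}
for c1, c2 in product("HT", repeat=2):
    joint[(c1, c2, c1 == c2)] = 0.25


def prob(pred):
    """Probability of the event described by pred(coin1, coin2, bell)."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))


# Marginally: P(c1=H, c2=H) equals P(c1=H) P(c2=H).
p_c1 = prob(lambda c1, c2, bell: c1 == "H")
p_c2 = prob(lambda c1, c2, bell: c2 == "H")
p_both = prob(lambda c1, c2, bell: c1 == "H" and c2 == "H")
marginally_independent = abs(p_both - p_c1 * p_c2) < 1e-12

# Given that the bell rang: the same product rule fails.
p_bell = prob(lambda c1, c2, bell: bell)
p_c1_b = prob(lambda c1, c2, bell: bell and c1 == "H") / p_bell
p_c2_b = prob(lambda c1, c2, bell: bell and c2 == "H") / p_bell
p_both_b = prob(lambda c1, c2, bell: bell and c1 == "H" and c2 == "H") / p_bell
independent_given_bell = abs(p_both_b - p_c1_b * p_c2_b) < 1e-12

print(marginally_independent, independent_given_bell)
```

Under these assumptions I(coin1, { }, coin2) holds while I(coin1, bell, coin2) does not, which is the pattern an undirected graph can only represent with a clique.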


Example III

In the order V,S,T,L,B,A,X,D, we have a boundary basis:

I(S, { }, V)

I(T, V, S)

I(L, S, {T, V})

…

I(X, A, {V,S,T,L,B})

[Figure: boundary DAG with edges V → T, S → L, S → B, T → A, L → A, A → X, A → D, B → D]

P(v,s,t,l,b,a,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Does I({X, D}, A, V) also hold in the dependency model P?


Definitions

1. A Directed Acyclic Graph (DAG) D=(U,E) is an I-map of a dependency model M over U if ID(X,Z,Y) ⇒ IM(X,Z,Y) for all disjoint subsets X, Y, Z of U.

2. D is a minimal I-map of M if by removing any edge, D ceases to be an I-map.

3. D is a perfect map of M if ID(X,Z,Y) ⇔ IM(X,Z,Y) for all disjoint subsets X, Y, Z of U.

Can we define “Independence” ID(X,Z,Y) graphically so that it answers these probabilistic independence questions?


From Separation in UGs

To d-Separation in DAGs


Paths

Intuition: dependency must “flow” along paths in the graph

A path is a sequence of neighboring variables

Examples: X ← A → D ← B and A ← L ← S → B

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III]


Path blockage

Every path is classified given the evidence:

active -- creates a dependency between the end nodes

blocked -- does not create a dependency between the end nodes

Evidence means the assignment of a value to a subset of nodes.


Path Blockage

Three cases:

Common cause: L ← S → B. The path is blocked when S is in the evidence and active otherwise.

Intermediate cause: S → L → A. The path is blocked when L is in the evidence and active otherwise.

Common effect: T → A ← L, where A has the child X. The path is blocked when neither A nor its descendant X is in the evidence, and active when A or X is.


Definition of Path Blockage

Definition: A path is active, given evidence Z, if

whenever we have the configuration T → A ← L on the path, then either A or one of its descendants is in Z, and

no other nodes in the path are in Z.

Definition: A path is blocked, given evidence Z, if it is not active.

Definition: X is d-separated from Y, given Z, if all paths between a node in X and a node in Y are blocked, given Z.


d-Separation


Example

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III]

ID(T,S|∅) = yes

ID(T,S|D) = no

ID(T,S|{D,L,B}) = yes
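The three d-separation queries of this example can be reproduced mechanically. The sketch below is one possible implementation, not the method of the slides: it uses a reachability search over (node, direction) states, and the names `PARENTS` and `d_separated` are hypothetical. The edge list is read off the factorization P(v)P(s)P(t|v)P(l|s)P(b|s)P(a|t,l)P(x|a)P(d|a,b).

```python
from collections import deque

# Parent lists read off the factorization of the example network.
PARENTS = {
    "V": [], "S": [], "T": ["V"], "L": ["S"], "B": ["S"],
    "A": ["T", "L"], "X": ["A"], "D": ["A", "B"],
}


def d_separated(parents, xs, ys, zs):
    """Return True if every path between xs and ys is blocked given zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    children = {v: [] for v in parents}
    for child, pars in parents.items():
        for p in pars:
            children[p].append(child)

    # Nodes in zs or with a descendant in zs: colliders at these nodes are open.
    open_colliders, stack = set(), list(zs)
    while stack:
        v = stack.pop()
        if v not in open_colliders:
            open_colliders.add(v)
            stack.extend(parents[v])

    # "up": the trail reached the node from one of its children.
    # "down": the trail reached the node from one of its parents.
    queue = deque((x, "up") for x in xs)
    visited = set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node in ys and node not in zs:
            return False  # an active path reaches ys
        if direction == "up" and node not in zs:
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in zs:
                queue.extend((c, "down") for c in children[node])
            if node in open_colliders:
                queue.extend((p, "up") for p in parents[node])
    return True


print(d_separated(PARENTS, {"T"}, {"S"}, set()))
print(d_separated(PARENTS, {"T"}, {"S"}, {"D"}))
print(d_separated(PARENTS, {"T"}, {"S"}, {"D", "L", "B"}))
```

The three calls correspond to the three queries above. The search assumes that the sets X, Y, and Z are disjoint, as in the definition.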


Example

In the order V,S,T,L,B,A,X,D, we get from the boundary basis:

ID(S, { }, V)

ID(T, V, S)

ID(L, S, {T, V})

…

ID(X, A, {V,S,T,L,B})

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III]


Main Result - Soundness


Bayesian Networks (Directed Acyclic Graphical Models)

Definition: Given a probability distribution P on a set of variables U, a DAG D = (U,E) is called a Bayesian Network of P iff D is a minimal I-map of P.


The first claim holds because any probability distribution is a semi-graphoid (Symmetry, Decomposition, Contraction, Weak union).


The second claim, uniqueness of the parent sets, holds due to: I(X,ZW1,YW2) and I(X,ZW2,YW1) ⇒ I(X,Z,YW1W2)

Proof:

(1) I(X, ZW1, YW2). Given.

(2) I(X, ZW2, YW1). Given.

(3) I(X, ZW1W2, Y) by weak union from (1).

(4) I(X, ZYW1, W2) by weak union from (1).

(5) I(X, ZYW2, W1) by weak union from (2).

(6) I(X, ZY, W1W2) by intersection from (4) and (5).

(7) I(X, Z, YW1W2) by intersection from (3) and (6).


d-separation

The definition of ID(X, Z, Y) is such that:

Soundness [Theorem 9]: ID(X, Z, Y) = yes implies IP(X, Z, Y) follows from the boundary Basis(D).

Completeness [Theorem 10]: ID(X, Z, Y) = no implies IP(X, Z, Y) does not follow from the boundary Basis(D).


Revisiting Example III

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III]

So does IP({X, D}, A, V) hold?

Enough to check d-separation!
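One way to carry out this check is sketched below. It does not trace paths; it uses the ancestral moral graph criterion, which is known to be equivalent to d-separation: keep only the ancestors of X ∪ Y ∪ Z, connect parents that share a child, drop edge directions, delete Z, and test connectivity. The names `PARENTS`, `ancestral_set`, and `separated` are hypothetical, and the edge list is read off the factorization of the example network.

```python
from itertools import combinations

PARENTS = {
    "V": [], "S": [], "T": ["V"], "L": ["S"], "B": ["S"],
    "A": ["T", "L"], "X": ["A"], "D": ["A", "B"],
}


def ancestral_set(parents, nodes):
    """The given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen


def separated(parents, xs, ys, zs):
    """Separation of xs and ys by zs in the moralized ancestral graph."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestral_set(parents, xs | ys | zs)
    adj = {v: set() for v in keep}
    for child in keep:
        pars = parents[child]  # all in keep, because keep is ancestral
        for p in pars:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pars, 2):  # marry parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    # Search from xs without entering nodes of zs.
    seen, stack = set(xs), list(xs)
    while stack:
        v = stack.pop()
        if v in ys:
            return False
        for w in adj[v]:
            if w not in zs and w not in seen:
                seen.add(w)
                stack.append(w)
    return True


print(separated(PARENTS, {"X", "D"}, {"V"}, {"A"}))
print(separated(PARENTS, {"X"}, {"V"}, {"A"}))
```

On this graph the first query reports no separation: once A is given, D stays connected to V through T, L, S, and B, because A is a common effect of T and L. The second query shows that X alone is separated from V by A.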


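Bayesian Networks with numbers

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III, each node annotated with its local distribution: p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b)]

P(v,s,t,l,b,a,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)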


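Bayesian Network (cont.)

Each Directed Acyclic Graph defines a factorization of the form:

$$p(x_1,\dots,x_n) = \prod_{i=1}^{n} p(x_i \mid \mathrm{pa}_i)$$

P(v,s,t,l,b,a,x,d) = P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

[Figure: DAG over V, S, T, L, A, B, X, D as in Example III, each node annotated with its local distribution: p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b)]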
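The factorization can be turned into a small computation. In the sketch below only the table for A follows the values p(A=y|L,T) given later in these slides; all other numbers are illustrative placeholders chosen for the example, and the names `PARENTS`, `P_TRUE`, and `joint` are hypothetical.

```python
from itertools import product

PARENTS = {
    "V": [], "S": [], "T": ["V"], "L": ["S"], "B": ["S"],
    "A": ["T", "L"], "X": ["A"], "D": ["A", "B"],
}

# P(var = True | parents), keyed by the parent values in PARENTS order.
# Only the entry for "A" follows the table in the slides; every other
# number is an illustrative assumption.
P_TRUE = {
    "V": {(): 0.01},
    "S": {(): 0.50},
    "T": {(True,): 0.05, (False,): 0.01},
    "L": {(True,): 0.10, (False,): 0.01},
    "B": {(True,): 0.60, (False,): 0.30},
    "A": {(False, False): 0.02, (True, False): 0.60,   # keys are (T, L)
          (False, True): 0.99, (True, True): 0.99},
    "X": {(True,): 0.98, (False,): 0.05},
    "D": {(True, True): 0.90, (True, False): 0.70,     # keys are (A, B)
          (False, True): 0.80, (False, False): 0.10},
}


def joint(assignment):
    """Product of the local distributions p(x_i | pa_i) for one assignment."""
    p = 1.0
    for var, pars in PARENTS.items():
        p_true = P_TRUE[var][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


names = list(PARENTS)
total = sum(joint(dict(zip(names, values)))
            for values in product((True, False), repeat=len(names)))
print(total)
```

Summing the product over all 256 assignments gives 1 up to rounding, which is a quick sanity check that the local tables define a joint distribution.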


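Independence in Bayesian networks

$$p(x_1,\dots,x_n) = \prod_{i=1}^{n} p(x_i \mid \mathrm{pa}_i) \qquad (*)$$

By the chain rule, every distribution satisfies

$$p(x_1,\dots,x_n) = \prod_{i=1}^{n} p(x_i \mid x_1,\dots,x_{i-1})$$

so, for an ordering in which parents precede their children, (*) holds exactly when, for every i,

IP( Xi ; {X1,…,Xi-1}\Pai | Pai )

This set of independence assertions is denoted Basis(G). All other independence assertions that are entailed by (*) are derivable using the semi-graphoid axioms.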


Local distributions - Asymmetric independence

[Figure: Tuberculosis (Yes/No) → Abnormality in Chest (Yes/No) ← Lung Cancer (Yes/No), with local distribution p(A|T,L)]

Table:
L  T  p(A=y|L,T)
n  n  0.02
n  y  0.60
y  n  0.99
y  y  0.99
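One reading of "asymmetric independence" in this table is that A is independent of T when L = y but not when L = n. The sketch below checks exactly that on the four table entries; the names `P_A_YES` and `a_ignores_t_when` are hypothetical.

```python
# p(A = y | L, T) from the table, keyed by (L, T).
P_A_YES = {
    ("n", "n"): 0.02,
    ("n", "y"): 0.60,
    ("y", "n"): 0.99,
    ("y", "y"): 0.99,
}


def a_ignores_t_when(l_value):
    """True if p(A=y | L=l_value, T) is the same for both values of T."""
    return P_A_YES[(l_value, "n")] == P_A_YES[(l_value, "y")]


independent_when_l_yes = a_ignores_t_when("y")
independent_when_l_no = a_ignores_t_when("n")
print(independent_when_l_yes, independent_when_l_no)
```

The independence holds only in the context L = y, so it cannot be expressed by removing the edge T → A from the graph; it lives in the numbers of the local distribution.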


COROLLARY 4: D is an I-map of P iff each variable X is conditionally independent in P of all its non-descendants, given its parents.

Proof (if): If each variable X is conditionally independent of all its non-descendants given its parents, then by decomposition it is also independent, given its parents, of its predecessors in any order d in which parents precede children. The statements of Basis(D) therefore hold in P, and by the soundness theorem D is an I-map.

Proof (only if): X is d-separated from all its non-descendants, given its parents. Since D is an I-map, by the soundness theorem the claim holds.


COROLLARY 5: If D=(U,E) is a boundary DAG of P constructed in some order d, then any topological order d’ of U will yield the same boundary DAG of P. (Hence construction order can be forgotten).

Proof: By Corollary 4, each variable X is d-separated from all its non-descendants, given its parents in the boundary DAG of P.

In particular, due to decomposition, X is independent, given its parents, of all previous variables in any topological order d’.


Extension of the Markov Chain Property

I(Xk, Xk-1, X1 … Xk-2) ⇒ I(Xk, Xk-1 Xk+1, X1 … Xk-2 Xk+2 … Xn)

Holds due to the soundness theorem. Converse holds when Intersection is assumed.

Markov Blankets in DAGs
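For a DAG, the Markov blanket of a node is usually taken to be its parents, its children, and the other parents of its children. The sketch below assumes that characterization; the function name `markov_blanket` and the dictionaries `CHAIN` and `EXAMPLE_DAG` are hypothetical. On a chain it returns the two neighbors Xk-1 and Xk+1, matching the extended Markov chain property above.

```python
def markov_blanket(parents, node):
    """Parents, children, and co-parents of node in a DAG given as parent lists."""
    children = [c for c, pars in parents.items() if node in pars]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])  # other parents of each child
    blanket.discard(node)
    return blanket


# A chain X1 -> X2 -> X3 -> X4 -> X5.
CHAIN = {"X1": [], "X2": ["X1"], "X3": ["X2"], "X4": ["X3"], "X5": ["X4"]}

# The eight-node example network, read off its factorization.
EXAMPLE_DAG = {
    "V": [], "S": [], "T": ["V"], "L": ["S"], "B": ["S"],
    "A": ["T", "L"], "X": ["A"], "D": ["A", "B"],
}

print(sorted(markov_blanket(CHAIN, "X3")))
print(sorted(markov_blanket(EXAMPLE_DAG, "A")))
```

In the example network the blanket of A contains B even though A and B are not adjacent, because both are parents of D.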


Consequence: There is no improvement to d-separation and no statement escapes graphical representation.

Reasoning: (1) If there were an independence statement σ not shown by d-separation, then σ must be true in all distributions that satisfy the basis. But Theorem 10 states that there exists a distribution that satisfies the basis and violates σ. (2) Same argument. [Note that (2) is a stronger claim.]