Artificial Intelligence: Probabilistic reasoning. Fall 2008. Professor: Luigi Ceccaroni.


Page 1: Artificial Intelligence Probabilistic reasoning

Artificial Intelligence: Probabilistic reasoning

Fall 2008

Professor: Luigi Ceccaroni

Page 2

Bayesian networks

• A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions.

• Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (links ≈ "directly influences")
– a conditional distribution for each node given its parents: P(Xi | Parents(Xi))

• In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values.
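As an illustrative sketch (not part of the slides), a CPT for a Boolean variable can be stored as a plain mapping from parent-value combinations to P(Xi = true). The example below uses the heart-attack CPT that appears later in this deck; the Python names are hypothetical:

```python
# Hypothetical encoding of a CPT: the dict maps each combination of
# parent values (Bp, Sm) to P(Ha = yes); P(Ha = no) is just 1 - p.
cpt_ha = {
    ("high", "yes"): 0.8,    # P(Ha=yes | Bp=high,   Sm=yes)
    ("normal", "yes"): 0.6,  # P(Ha=yes | Bp=normal, Sm=yes)
    ("high", "no"): 0.7,     # P(Ha=yes | Bp=high,   Sm=no)
    ("normal", "no"): 0.3,   # P(Ha=yes | Bp=normal, Sm=no)
}

def p_ha(value, bp, sm):
    """Look up P(Ha = value | Bp = bp, Sm = sm) in the CPT."""
    p_yes = cpt_ha[(bp, sm)]
    return p_yes if value == "yes" else 1.0 - p_yes

print(round(p_ha("no", "high", "yes"), 2))  # 0.2
```

Only one number per row needs to be stored, exactly as the compactness slide later points out.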

Page 3

Example

• The topology of the network encodes conditional independence assertions:
• Weather is independent of the other variables.
• Toothache and Catch are conditionally independent given Cavity.

Page 4

Example

• What is the probability of having a heart attack?

• This probability depends on 4 variables:
– Sport
– Diet
– Blood pressure
– Smoking

• Knowing the dependencies among these variables lets us build a Bayesian network.


Page 5

Constructing Bayesian networks

• 1. Choose an ordering of variables X1, …, Xn
• 2. For i = 1 to n:
– add Xi to the network
– select parents from X1, …, Xi-1 such that P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi-1)

This choice of parents guarantees:
P(X1, …, Xn) = ∏i=1..n P(Xi | X1, …, Xi-1)  (chain rule)
= ∏i=1..n P(Xi | Parents(Xi))  (by construction)

Page 6

Example

[Network structure: Sport and Diet are parents of Blood pressure; Blood pressure and Smoking are parents of Heart attack.]

Diet        P(Di)
balanced    0.4
unbalanced  0.6

Sport  P(Sp)
yes    0.1
no     0.9

Smoking  P(Sm)
yes      0.4
no       0.6

Diet    Sport  P(Bp=high)  P(Bp=normal)
bal.    yes    0.01        0.99
unbal.  yes    0.2         0.8
bal.    no     0.25        0.75
unbal.  no     0.7         0.3

Bp     Sm   P(Ha=yes)  P(Ha=no)
high   yes  0.8        0.2
norm.  yes  0.6        0.4
high   no   0.7        0.3
norm.  no   0.3        0.7

Page 7

Compactness

• A CPT for Boolean Xi with k Boolean parents has 2^k rows, one for each combination of parent values.

• Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p).

• If each of the n variables has no more than k parents (k << n), the complete network requires O(n · 2^k) numbers.

Page 8

Representation cost

• The network grows linearly with n, vs. O(2^n) for the full joint distribution.

• Examples:
– With 10 variables and at most 3 parents: 80 vs. 1024
– With 100 variables and at most 5 parents: 3200 vs. ~10^30
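The counts above can be verified with a few lines (a sketch; the helper names are illustrative):

```python
# Parameter counts for Boolean variables: a network with n variables and
# at most k parents each needs about n * 2^k numbers, while the full
# joint distribution over n variables needs 2^n.
def network_size(n, k):
    return n * 2 ** k

def joint_size(n):
    return 2 ** n

print(network_size(10, 3), joint_size(10))   # 80 1024
print(network_size(100, 5))                  # 3200
print(len(str(joint_size(100))))             # 2^100 has 31 digits (~10^30)
```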

Page 9

Semantics

The full joint distribution is defined as the product of the local conditional distributions:

P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi))

Example:
P(sp ∧ Di=balanced ∧ Bp=high ∧ ¬sm ∧ ¬ha) =
= P(sp) P(Di=balanced) P(Bp=high | sp, Di=balanced) P(¬sm) P(¬ha | Bp=high, ¬sm)

Page 10

Bayesian networks – Joint distribution - Example

P(ha ∧ Bp=high ∧ sm ∧ sp ∧ Di=balanced)

= P(ha | Bp = high, sm) P(Bp = high | sp, Di = balanced) P(sm) P(sp) P(Di = balanced)

= 0.8 x 0.01 x 0.4 x 0.1 x 0.4

= 0.000128
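As a quick check, the same product can be evaluated directly (a sketch using only the CPT entries from the earlier slide):

```python
# Reproduce the slide's joint probability
# P(ha ∧ Bp=high ∧ sm ∧ sp ∧ Di=balanced) as a product of CPT entries.
p = (0.8      # P(ha | Bp=high, sm)
     * 0.01   # P(Bp=high | sp, Di=balanced)
     * 0.4    # P(sm)
     * 0.1    # P(sp)
     * 0.4)   # P(Di=balanced)
print(round(p, 6))  # 0.000128
```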


Page 11

Exact inference in Bayesian networks: example

• Inference by enumeration:

P(X | e) = α P(X, e) = α Σy P(X, e, y)

• Let’s calculate:

P(Smoking | Heart attack = yes, Sport = no)

• The full joint distribution of the network is:

P(Sp, Di, Bp, Sm, Ha) =

= P(Sp) P(Di) P(Bp | Sp, Di) P(Sm) P(Ha | Bp, Sm)

• We want to calculate: P(Sm | ha, ¬sp).

Page 12

Exact inference in Bayesian networks: example

P(Sm | ha, ¬sp) = α P(Sm, ha, ¬sp) =

= α ΣDi∈{b,¬b} ΣBp∈{h,n} P(Sm, ha, ¬sp, Di, Bp) =

= α P(¬sp) P(Sm) ΣDi∈{b,¬b} P(Di) ΣBp∈{h,n} P(Bp | ¬sp, Di) P(ha | Bp, Sm) =

= α <0.9 * 0.4 * (0.4 * (0.25 * 0.8 + 0.75 * 0.6) + 0.6 * (0.7 * 0.8 + 0.3 * 0.6)),
0.9 * 0.6 * (0.4 * (0.25 * 0.7 + 0.75 * 0.3) + 0.6 * (0.7 * 0.7 + 0.3 * 0.3))> =

= α <0.253, 0.274> = <0.48, 0.52>
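The whole enumeration above can be sketched in a few lines of Python; the dict-based CPT encoding and variable names are illustrative choices, not part of the slides, while the numbers are the CPT entries given earlier:

```python
from itertools import product

# Inference by enumeration for P(Smoking | Ha = yes, Sport = no).
P_sp = {True: 0.1, False: 0.9}        # P(Sport)
P_di = {"bal": 0.4, "unbal": 0.6}     # P(Diet)
P_sm = {True: 0.4, False: 0.6}        # P(Smoking)
P_bp_high = {(True, "bal"): 0.01, (True, "unbal"): 0.2,
             (False, "bal"): 0.25, (False, "unbal"): 0.7}
P_ha_yes = {("high", True): 0.8, ("normal", True): 0.6,
            ("high", False): 0.7, ("normal", False): 0.3}

def joint(sp, di, bp, sm, ha):
    """Full joint: P(Sp) P(Di) P(Bp | Sp, Di) P(Sm) P(Ha | Bp, Sm)."""
    p_bp = P_bp_high[(sp, di)] if bp == "high" else 1 - P_bp_high[(sp, di)]
    p_ha = P_ha_yes[(bp, sm)] if ha else 1 - P_ha_yes[(bp, sm)]
    return P_sp[sp] * P_di[di] * p_bp * P_sm[sm] * p_ha

# For each value of the query variable Sm, sum out the hidden
# variables Di and Bp with the evidence fixed, then normalize.
unnorm = {sm: sum(joint(False, di, bp, sm, True)
                  for di, bp in product(("bal", "unbal"), ("high", "normal")))
          for sm in (True, False)}
total = sum(unnorm.values())
posterior = {sm: p / total for sm, p in unnorm.items()}
print(round(posterior[True], 2), round(posterior[False], 2))  # 0.48 0.52
```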

Page 13

Variable elimination algorithm

• The variable elimination algorithm lets us avoid the repeated calculations of inference by enumeration.
• Each variable is represented by a factor.
• Intermediate results are saved to be reused later.
• Irrelevant variables, being constant factors, are not computed directly.


Page 14

Variable elimination algorithm


CALCULA-FACTOR generates the factor corresponding to variable var in the joint probability distribution function.

PRODUCTO-Y-SUMA multiplies factors and sums over the hidden variable.

PRODUCTO multiplies a set of factors.
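A minimal Python sketch of these operations (the helper names factor_product and sum_out are hypothetical stand-ins for PRODUCTO and PRODUCTO-Y-SUMA), checked against the fHa x fBp step worked out on the following slides:

```python
# A factor is a pair (variables, table): a tuple of variable names plus
# a dict mapping each assignment tuple to a number.

def factor_product(f, g):
    """Multiply two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    variables = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for fa, fval in ft.items():
        fass = dict(zip(fv, fa))
        for ga, gval in gt.items():
            gass = dict(zip(gv, ga))
            # Keep only pairs of rows that agree on the shared variables.
            if all(fass[v] == gass[v] for v in fv if v in gass):
                merged = {**fass, **gass}
                table[tuple(merged[v] for v in variables)] = fval * gval
    return (variables, table)

def sum_out(factor, var):
    """Sum a factor over all values of one (hidden) variable."""
    fv, ft = factor
    i = fv.index(var)
    table = {}
    for a, val in ft.items():
        key = a[:i] + a[i + 1:]
        table[key] = table.get(key, 0.0) + val
    return (fv[:i] + fv[i + 1:], table)

# Reproduce the step fHa(Bp, Sm) x fBp(Bp, Di), then sum over Bp:
f_ha = (("Bp", "Sm"), {("high", "yes"): 0.8, ("high", "no"): 0.7,
                       ("normal", "yes"): 0.6, ("normal", "no"): 0.3})
f_bp = (("Bp", "Di"), {("high", "bal"): 0.25, ("high", "unbal"): 0.7,
                       ("normal", "bal"): 0.75, ("normal", "unbal"): 0.3})
f_ha_bp = sum_out(factor_product(f_ha, f_bp), "Bp")
print(round(f_ha_bp[1][("yes", "bal")], 2))  # 0.8*0.25 + 0.6*0.75 = 0.65
```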

Page 15

Variable elimination algorithm - Example

α P(¬sp) P(Sm) ΣDi∈{b,¬b} P(Di) ΣBp∈{h,n} P(Bp | ¬sp, Di) P(ha | Bp, Sm)

• Factor for variable Heart attack P(ha | Bp, Sm), fHa(Bp, Sm):


Bp      Sm   fHa(Bp, Sm)
high    yes  0.8
high    no   0.7
normal  yes  0.6
normal  no   0.3

Page 16

Variable elimination algorithm - Example

• Factor for variable Blood pressure P(Bp | ¬sp, Di), fBp(Bp, Di):

• To combine the factors just obtained, we calculate the product fHa(Bp, Sm) x fBp(Bp, Di) = fHa Bp(Bp, Sm, Di).

Bp      Di          fBp(Bp, Di)
high    balanced    0.25
high    unbalanced  0.7
normal  balanced    0.75
normal  unbalanced  0.3

Page 17

Variable elimination algorithm - Example

fHa Bp(Bp, Sm, Di) = fHa(Bp, Sm) x fBp(Bp, Di)

Bp      Sm   Di          fHa Bp(Bp, Sm, Di)
high    yes  balanced    0.8 * 0.25
high    yes  unbalanced  0.8 * 0.7
high    no   balanced    0.7 * 0.25
high    no   unbalanced  0.7 * 0.7
normal  yes  balanced    0.6 * 0.75
normal  yes  unbalanced  0.6 * 0.3
normal  no   balanced    0.3 * 0.75
normal  no   unbalanced  0.3 * 0.3


Page 18

Variable elimination algorithm - Example

• We sum over the values of variable Bp to obtain factor fHa Bp(Sm, Di).

• Factor for variable Di, fDi(Di):


Sm   Di          fHa Bp(Sm, Di)
yes  balanced    0.8 * 0.25 + 0.6 * 0.75 = 0.65
yes  unbalanced  0.8 * 0.7 + 0.6 * 0.3 = 0.74
no   balanced    0.7 * 0.25 + 0.3 * 0.75 = 0.4
no   unbalanced  0.7 * 0.7 + 0.3 * 0.3 = 0.58

Di          fDi(Di)
balanced    0.4
unbalanced  0.6

Page 19

Variable elimination algorithm - Example

• fHa Di Bp(Sm, Di) = fDi(Di) x fHa Bp(Sm, Di)

• We sum over the values of variable Di to obtain factor fHa Di Bp(Sm).


Sm   Di          fHa Di Bp(Sm, Di)
yes  balanced    0.65 * 0.4
yes  unbalanced  0.74 * 0.6
no   balanced    0.4 * 0.4
no   unbalanced  0.58 * 0.6

Sm   fHa Di Bp(Sm)
yes  0.65 * 0.4 + 0.74 * 0.6 = 0.704
no   0.4 * 0.4 + 0.58 * 0.6 = 0.508

Page 20

Variable elimination algorithm - Example

• Factor for variable Sm, fSm(Sm):

• fHa Sm Di Bp(Sm) = fSm(Sm) x fHa Di Bp(Sm)

• Normalizing, we obtain:

Sm   fSm(Sm)
yes  0.4
no   0.6

Sm   fHa Sm Di Bp(Sm)
yes  0.4 * 0.704 = 0.282
no   0.6 * 0.508 = 0.305

Sm   P(Sm | ha, ¬sp)
yes  0.48
no   0.52

Page 21

Summary

• Bayesian networks provide a natural representation for (causally induced) conditional independence.

• Topology + CPTs = compact representation of joint distribution.

• Generally easy for domain experts to construct.