# Variational Bayes and Variational Message Passing

• Variational Bayes and Variational Message Passing

Mohammad Emtiyaz Khan

CS, UBC

Variational Bayes and Variational Message Passing – p.1/16

• Variational Inference

Find a tractable distribution Q(H) that closely approximates the true posterior distribution P (H|V ).

$$
\begin{aligned}
\log P(V) &= \sum_H Q(H)\,\log P(V) \\
&= \sum_H Q(H)\,\log \frac{P(H,V)}{P(H|V)} \\
&= \sum_H Q(H)\,\log \left[\frac{P(H,V)}{Q(H)}\,\frac{Q(H)}{P(H|V)}\right] \\
&= \underbrace{\sum_H Q(H)\,\log \frac{P(H,V)}{Q(H)}}_{\mathcal{L}(Q)} \;+\; \underbrace{\sum_H -Q(H)\,\log \frac{P(H|V)}{Q(H)}}_{\mathrm{KL}(Q\|P)}
\end{aligned}
$$

• Variational Inference

$$
\log P(V) = \mathcal{L}(Q) + \mathrm{KL}(Q\|P) \tag{1}
$$

$$
\mathcal{L}(Q) = \sum_H Q(H)\,\log \frac{P(H,V)}{Q(H)} \tag{2}
$$

$$
\mathrm{KL}(Q\|P) = -\sum_H Q(H)\,\log \frac{P(H|V)}{Q(H)} \tag{3}
$$

Find the Q(H) that maximizes the lower bound L(Q) (and hence minimizes the KL divergence).

For Q(H) = P(H|V) the KL divergence vanishes, but P(H|V) is intractable; that intractability is exactly why we take a variational approach.

Trick: restrict Q(H) to a tractable class, then find the member of that class that minimizes the KL divergence.
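As a sanity check, the decomposition log P(V) = L(Q) + KL(Q||P) can be verified numerically on a toy model with one binary hidden variable. The joint-probability values and the choice of Q below are illustrative assumptions, not from the slides:

```python
import math

# Toy check of log P(V) = L(Q) + KL(Q||P): one binary hidden variable H,
# a fixed observation V folded into the joint table, and an arbitrary Q(H).
# The probability values are illustrative assumptions.
joint = {0: 0.12, 1: 0.28}                        # P(H=h, V)
p_v = sum(joint.values())                         # P(V) by marginalizing out H
post = {h: p / p_v for h, p in joint.items()}     # exact posterior P(H|V)

q = {0: 0.6, 1: 0.4}                              # any valid Q(H)

lower_bound = sum(q[h] * math.log(joint[h] / q[h]) for h in q)   # L(Q), Eq. (2)
kl = sum(-q[h] * math.log(post[h] / q[h]) for h in q)            # KL(Q||P), Eq. (3)

print(lower_bound + kl, math.log(p_v))   # both sides of Eq. (1) agree
```

Because Q differs from the true posterior here, KL is strictly positive and L(Q) sits strictly below log P(V).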

• Factorized Distributions

$$
Q(H) = \prod_i Q_i(H_i) \tag{4}
$$

Substituting this into the expression for the lower bound,

$$
\begin{aligned}
\mathcal{L}(Q) &= \sum_H \prod_i Q_i(H_i)\,\log \frac{P(H,V)}{\prod_i Q_i(H_i)} \\
&= \sum_H \prod_i Q_i(H_i)\,\log P(H,V) - \sum_H \prod_i Q_i(H_i) \sum_i \log Q_i(H_i) \\
&= \sum_H \prod_i Q_i(H_i)\,\log P(H,V) - \sum_i \sum_{H_i} Q_i(H_i)\,\log Q_i(H_i) \\
&= \sum_H \prod_i Q_i(H_i)\,\log P(H,V) + \sum_i \mathcal{H}(Q_i)
\end{aligned}
$$

where $\mathcal{H}(Q_i)$ denotes the entropy of $Q_i$.

• Factorized Distributions

Now separate out the terms involving a single factor Q_j. Here $\langle\cdot\rangle_{\sim Q_j(H_j)}$ denotes expectation with respect to all factors except $Q_j$.

$$
\begin{aligned}
\mathcal{L}(Q) &= \sum_{H_j} Q_j(H_j)\,\underbrace{\langle \log P(H,V) \rangle_{\sim Q_j(H_j)}}_{\log Q^*_j(H_j)} + \mathcal{H}(Q_j) + \sum_{i \neq j} \mathcal{H}(Q_i) \\
&= -\mathrm{KL}(Q_j \| Q^*_j) + \text{terms not involving } Q_j \tag{5}
\end{aligned}
$$

This bound is maximized with respect to Q_j when

$$
\log Q_j(H_j) = \log Q^*_j(H_j) = \langle \log P(H,V) \rangle_{\sim Q_j(H_j)} + c \tag{6}
$$

Iterating these updates over the factors is guaranteed to converge, since each update cannot decrease the bound.
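The iteration implied by Eq. (6) can be sketched as coordinate ascent on a toy model with two binary hidden variables; the joint table below is an illustrative assumption. Each update sets one factor to the normalized exponential of the expected log-joint, and the lower bound never decreases:

```python
import itertools
import math

# Mean-field coordinate ascent per Eq. (6), on a toy model with two binary
# hidden variables H1, H2 and the observation folded into the joint table.
# The joint values are illustrative assumptions, not from the slides.
logp = {(0, 0): math.log(0.30), (0, 1): math.log(0.10),
        (1, 0): math.log(0.15), (1, 1): math.log(0.25)}  # log P(H1, H2, V)

q1, q2 = {0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}

def elbo(q1, q2):
    # L(Q) = sum_H Q(H) [log P(H,V) - log Q(H)] under Q = Q1 * Q2
    return sum(q1[a] * q2[b] * (logp[a, b] - math.log(q1[a]) - math.log(q2[b]))
               for a, b in itertools.product((0, 1), (0, 1)))

def update(q_other, axis):
    # log Q*_j(h) = <log P(H,V)>_{~Qj} + const, then normalize (Eq. 6)
    logq = {h: sum(q_other[o] * logp[(h, o) if axis == 0 else (o, h)]
                   for o in (0, 1))
            for h in (0, 1)}
    z = sum(math.exp(v) for v in logq.values())
    return {h: math.exp(v) / z for h, v in logq.items()}

bounds = [elbo(q1, q2)]
for _ in range(20):
    q1 = update(q2, 0)
    q2 = update(q1, 1)
    bounds.append(elbo(q1, q2))

print(bounds[0], bounds[-1])   # the bound improves monotonically
```

The monotone improvement of `bounds` is exactly the convergence guarantee stated above.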

• Variational Bayes for Bayesian Networks

$$
\begin{aligned}
\log Q^*_j(H_j) &= \langle \log P(H,V) \rangle_{\sim Q_j(H_j)} + c \\
&= \sum_i \langle \log P(X_i|\mathrm{pa}_i) \rangle_{\sim Q_j(H_j)} + c \\
&= \langle \log P(H_j|\mathrm{pa}_j) \rangle_{\sim Q_j(H_j)} + \sum_{k \in \mathrm{ch}_j} \langle \log P(X_k|\mathrm{pa}_k) \rangle_{\sim Q_j(H_j)} + c
\end{aligned}
$$

Only the factors that involve $H_j$ survive in the last line: the conditional of $H_j$ itself and those of its children (for which $H_j$ is a parent); all other terms are absorbed into the constant.

• Exponential-Conjugate Models

$$
P(Y|\theta) = \exp[\phi_Y(\theta)^T u(Y) + f(Y) + g(\theta)] \tag{7}
$$

- $u(Y)$: natural statistic vector (8)
- $\phi_Y(\theta)$: natural parameter vector (9)
- $g(\theta)$: normalization term (10)

Example I: Bernoulli distribution

$$
p(x|\mu) = \mu^x (1-\mu)^{1-x} \tag{11}
$$
$$
\log p(x|\mu) = x \log \mu + (1-x) \log(1-\mu) \tag{12}
$$
$$
= \underbrace{\log \frac{\mu}{1-\mu}}_{\phi(\mu)}\,\underbrace{x}_{u(x)} + \underbrace{\log(1-\mu)}_{g(\mu)} \tag{13}
$$
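A quick pointwise check that the form (13) reproduces the Bernoulli log-density (12) for both outcomes; the value of µ is an arbitrary illustration:

```python
import math

# Check Eq. (13) against Eq. (12) for both outcomes of a Bernoulli with an
# arbitrary mean parameter mu.
mu = 0.3
phi = math.log(mu / (1 - mu))   # natural parameter phi(mu)
g = math.log(1 - mu)            # log-normalizer term g(mu)

for x in (0, 1):
    direct = x * math.log(mu) + (1 - x) * math.log(1 - mu)   # Eq. (12)
    expfam = phi * x + g                                     # Eq. (13)
    print(x, direct, expfam)    # the two forms agree
```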

• Exponential-Conjugate Models

$$
P(Y|\theta) = \exp[\phi_Y(\theta)^T u(Y) + f(Y) + g(\theta)] \tag{14}
$$

Re-parametrizing in terms of the natural parameter $\phi$:

$$
P(Y|\phi) = \exp[\phi^T u(Y) + f(Y) + \tilde g(\phi)]
$$

Property I: $\langle u(Y) \rangle_{P(Y|\phi)} = -\dfrac{d\tilde g(\phi)}{d\phi}$.

For the Bernoulli example,

$$
\log p(x|\mu) = \underbrace{\log \frac{\mu}{1-\mu}}_{\phi(\mu)}\,\underbrace{x}_{u(x)} + \underbrace{\log(1-\mu)}_{g(\mu)} \tag{15}
$$
$$
\phi = \log \frac{\mu}{1-\mu} \;\Rightarrow\; \mu = \frac{e^\phi}{1+e^\phi} \tag{16}
$$
$$
g(\mu) = \log(1-\mu) = -\log(1+e^\phi) = \tilde g(\phi) \tag{17}
$$
$$
E(x) = \langle u(x) \rangle = -\frac{d\tilde g(\phi)}{d\phi} = \frac{e^\phi}{1+e^\phi} = \mu \tag{18}
$$
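Property I can be checked with a finite-difference derivative of g̃(φ) = −log(1 + e^φ) from Eq. (17); the value of φ is an arbitrary illustration:

```python
import math

# Check Property I for the Bernoulli example: <u(x)> = mu must equal
# -d g_tilde(phi)/d phi, with g_tilde from Eq. (17).
def g_tilde(phi):
    return -math.log(1.0 + math.exp(phi))

phi = 0.7                                      # arbitrary natural parameter
mu = math.exp(phi) / (1.0 + math.exp(phi))     # mean from Eq. (16)

eps = 1e-6
fd = (g_tilde(phi + eps) - g_tilde(phi - eps)) / (2 * eps)  # d g_tilde/d phi
print(mu, -fd)   # these agree to finite-difference accuracy
```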

• Exponential-Conjugate Models

$$
P(Y|\theta) = \exp[\phi_Y(\theta)^T u(Y) + f(Y) + g(\theta)] \tag{19}
$$

Example II: Gaussian chain $\theta \to Y \to X \leftarrow \beta$, with a unit-precision prior on $Y$ and precision $\beta$ for $X$ given $Y$:

$$
p(Y|\theta) = (2\pi)^{-1/2} \exp\left[-\tfrac{1}{2}(Y-\theta)^2\right]
$$
$$
\log p(Y|\theta) = \underbrace{[\theta,\,-1/2]}_{\phi_Y(\theta)^T} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\tfrac{1}{2}\theta^2}_{g_Y(\theta)} \underbrace{-\tfrac{1}{2}\log(2\pi)}_{f_Y(Y)}
$$

$$
p(X|Y,\beta) = (2\pi)^{-1/2}\,\beta^{1/2} \exp\left[-\tfrac{\beta}{2}(X-Y)^2\right]
$$
$$
\log p(X|Y,\beta) = \underbrace{[\beta Y,\,-\beta/2]}_{\phi_X(Y,\beta)^T} \underbrace{\begin{bmatrix} X \\ X^2 \end{bmatrix}}_{u_X(X)} \underbrace{-\tfrac{1}{2}\left(\beta Y^2 - \log\beta\right)}_{g_X(Y,\beta)} \underbrace{-\tfrac{1}{2}\log(2\pi)}_{f_X(X)}
$$

• Exponential-Conjugate Models

Property II (multi-linearity): the same conditional can be written in exponential-family form with respect to any one of its arguments. For $\theta \to Y \to X \leftarrow \beta$,

$$
\begin{aligned}
\log p(X|Y,\beta) &= \underbrace{[\beta Y,\,-\beta/2]}_{\phi_X(Y,\beta)^T} \underbrace{\begin{bmatrix} X \\ X^2 \end{bmatrix}}_{u_X(X)} \underbrace{-\tfrac{1}{2}\left(\beta Y^2 - \log\beta\right)}_{g_X(Y,\beta)} - \tfrac{1}{2}\log(2\pi) \\
&= \underbrace{[\beta X,\,-\beta/2]}_{\phi_{XY}(X,\beta)^T} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\tfrac{1}{2}\left(\beta X^2 - \log\beta\right)}_{g_{XY}(X,\beta)} - \tfrac{1}{2}\log(2\pi)
\end{aligned}
$$

Compare with the prior of $Y$:

$$
\log p(Y|\theta) = \underbrace{[\theta,\,-1/2]}_{\phi_Y(\theta)^T} \underbrace{\begin{bmatrix} Y \\ Y^2 \end{bmatrix}}_{u_Y(Y)} \underbrace{-\tfrac{1}{2}\theta^2}_{g_Y(\theta)} - \tfrac{1}{2}\log(2\pi)
$$

The key point is that the child conditional, re-read as a function of $Y$, involves the same natural statistics $u_Y(Y)$ as $Y$'s own prior; this is what conjugacy buys us.
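The two exponential-family readings above can be checked numerically against the direct Gaussian log-density; the values of X, Y, and β are arbitrary illustrations:

```python
import math

# Numeric check of the two exponential-family readings of log p(X|Y, beta):
# once in the natural statistics of X, once in those of Y. Both must equal
# the direct Gaussian log-density with mean Y and precision beta.
X, Y, beta = 1.3, -0.4, 2.0   # arbitrary values

direct = -0.5 * math.log(2 * math.pi) + 0.5 * math.log(beta) \
         - 0.5 * beta * (X - Y) ** 2

# as a function of u_X(X) = [X, X^2]
as_x = (beta * Y) * X + (-beta / 2) * X**2 \
       - 0.5 * (beta * Y**2 - math.log(beta)) - 0.5 * math.log(2 * math.pi)

# as a function of u_Y(Y) = [Y, Y^2]
as_y = (beta * X) * Y + (-beta / 2) * Y**2 \
       - 0.5 * (beta * X**2 - math.log(beta)) - 0.5 * math.log(2 * math.pi)

print(direct, as_x, as_y)   # all three agree
```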

• Exponential-Conjugate Models

Consider the node $Y$ and its children in $\theta \to Y \to X \leftarrow \beta$:

$$
\begin{aligned}
\log P(Y|\theta) &= \phi_Y^T(\theta)\,u_Y(Y) + f_Y(Y) + g_Y(\theta) \\
\log P(X|Y,\beta) &= \phi_X^T(Y,\beta)\,u_X(X) + f_X(X) + g_X(Y,\beta) \\
&= \phi_{XY}^T(X,\beta)\,u_Y(Y) + g_{XY}(X,\beta) + \text{const}
\end{aligned}
$$

Recall that

$$
\begin{aligned}
\log Q^*_Y(Y) &= \langle \log P(Y|\theta) \rangle_{\sim Q_Y(Y)} + \langle \log P(X|Y,\beta) \rangle_{\sim Q_Y(Y)} + c \\
&= \langle \phi_Y^T(\theta)\,u_Y(Y) + f_Y(Y) + g_Y(\theta) \rangle_{\sim Q_Y(Y)} \\
&\quad + \langle \phi_{XY}^T(X,\beta)\,u_Y(Y) + g_{XY}(X,\beta) \rangle_{\sim Q_Y(Y)} + c \\
&= \langle \phi_Y(\theta) + \phi_{XY}(X,\beta) \rangle_{\sim Q_Y(Y)}^T\,u_Y(Y) + f_Y(Y) + c_1
\end{aligned}
$$

• Exponential-Conjugate Models

$$
\log Q^*_Y(Y) = \langle \phi_Y(\theta) + \phi_{XY}(X,\beta) \rangle_{\sim Q_Y(Y)}^T\,u_Y(Y) + f_Y(Y) + c_1
$$

Finally,

$$
\langle \phi_Y(\theta) \rangle = [\langle\theta\rangle,\,-1/2]
$$
$$
\langle \phi_{XY}(X,\beta) \rangle = \langle [\beta X,\,-\beta/2] \rangle = [\langle\beta\rangle\langle X\rangle,\,-\langle\beta\rangle/2]
$$

The latter factorizes because $X$ and $\beta$ sit in different factors of $Q$; the required moments $\langle X \rangle$ and $\langle \beta \rangle$ are expected natural statistics, obtained via Property I.

• Back to Bayesian Networks

For each node, write the relevant expectations as a function of the natural statistics of that node:

$$
\begin{aligned}
\log Q^*_Y(Y) &= \langle \log P(Y|\mathrm{pa}_Y) \rangle_{\sim Q_Y(Y)} + \sum_{k \in \mathrm{ch}_Y} \langle \log P(X_k|\mathrm{pa}_k) \rangle_{\sim Q_Y(Y)} + c \\
&= \Big\langle \phi_Y(\theta) + \sum_{k \in \mathrm{ch}_Y} \phi_{X_k Y}(X_k,\beta) \Big\rangle_{\sim Q_Y(Y)}^T u_Y(Y) + f_Y(Y) + c_1
\end{aligned}
$$

Then compute the expectation of the natural statistics of each child node, and use these to evaluate the bracketed quantity.

• Variational Message Passing

Message from a parent node $Y$ to a child node $X$:

$$
m_{Y \to X} = \langle u_Y \rangle \tag{20}
$$

Message from a child node $X$ to a parent node $Y$ (using the messages from $Y$'s co-parents $\mathrm{cp}_Y$):

$$
m_{X \to Y} = \tilde\phi_{XY}\big(\langle u_X \rangle, \{m_{i \to X}\}_{i \in \mathrm{cp}_Y}\big) \tag{21}
$$

Node $Y$ updates its posterior $Q^*_Y$:

$$
\phi^*_Y = \tilde\phi_Y\big(\{m_{i \to Y}\}_{i \in \mathrm{pa}_Y}\big) + \sum_{j \in \mathrm{ch}_Y} m_{j \to Y} \tag{22}
$$
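A minimal sketch of Eqs. (20)–(22) on the chain θ → Y → X ← β from the slides, assuming θ and β are held fixed (point masses) and X is observed; in that special case a single child-to-parent message recovers the exact Gaussian posterior over Y:

```python
# VMP sketch on theta -> Y -> X <- beta with theta and beta held fixed
# (point masses) and X observed. One message from the child X to Y, per
# Eqs. (21)-(22), yields the exact Gaussian posterior over Y.
# The numeric values are illustrative.
theta, beta, x_obs = 1.0, 2.0, 3.0

# prior contribution to Y: phi_Y(theta) = [theta, -1/2]
phi_prior = (theta, -0.5)
# child-to-parent message: phi_XY(<u_X>, beta) = [beta * <X>, -beta/2]
m_child = (beta * x_obs, -beta / 2)

# Eq. (22): natural-parameter contributions add up
phi_star = (phi_prior[0] + m_child[0], phi_prior[1] + m_child[1])

# convert natural parameters [a, b] of a Gaussian in u_Y = [Y, Y^2]
prec = -2 * phi_star[1]          # posterior precision
mean = phi_star[0] / prec        # posterior mean

# exact posterior: precision 1 + beta, mean (theta + beta * x)/(1 + beta)
print(mean, prec)   # 2.333..., 3.0
```

Additivity in natural-parameter space is what makes the update a simple sum of incoming messages.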


• Discussion

- Initialization and message-passing schedule
- Calculation of the lower bound
- Allowable models
- VIBES (software implementing variational message passing)

