Upload
bbkuhn
View
28
Download
3
Tags:
Embed Size (px)
Citation preview
Undirected Graphical ModelsY1 Y2 Y3 Y4
X1 X2 X3 X4
= Variable not observed at test time.
= Variable always observed.
= Factor. Encodes correlations between variables.
Undirected Graphical ModelsY1 Y2 Y3 Y4
X1 X2 X3 X4
= Variable not observed at test time.
= Variable always observed.
Sum-Product Message PassingInitialize all messages to 1 Until Convergence:
For each variable, Yi: For each F in Neighborhood(Yi):
For each v in Dom(Yi):
For each factor, F: For each Yi in Neighborhood(F):
For each v in Dom(Yi):
Sum-Product Message PassingInitialize all messages to 1 Until Convergence:
For each variable, Yi: For each F in Neighborhood(Yi):
For each v in Dom(Yi):
For each factor, F: For each Yi in Neighborhood(F):
For each v in Dom(Yi):
Sum-Product Message PassingInitialize all messages to 1 Until Convergence:
For each variable, Yi: For each F in Neighborhood(Yi):
For each v in Dom(Yi):
For each factor, F: For each Yi in Neighborhood(F):
For each v in Dom(Yi):
Sum-Product Message PassingInitialize all messages to 1 Until Convergence:
For each variable, Yi: For each F in Neighborhood(Yi):
For each v in Dom(Yi):
For each factor, F: For each Yi in Neighborhood(F):
For each v in Dom(Yi):
Sum-Product Message Passing
Yi
“Based on what my other neighbors said, this my distribution over my values.”
Sum-Product Message Passing
Yi
“Based on what my other neighbors said, this my distribution over your values.”
Sum-Product Message Passing
- If it converges, we get:
- For tree structured graphs, it is guaranteed to converge and the marginals will be exact.
- For loopy graphs, it may not converge, and if it converges, the marginals may not be correct.
- With some tricks, it tends to work very well in practice.
Why do we want them?
Y1 Y3 YnY2 …
They can be used to encode structure. E.g. Y1 through Yn must all take different values.
Why are they hard?
Y1 Y3 YnY2 …
Basic message passing has exponential complexity in the neighborhood size of the largest factor.
Fixed High Order Factors
1) Doesn’t depend on the parameters, so we don’t need the marginal to calculate gradients in MLE.
Fixed High Order Factors
2) The structure is so constrained, that the sum in message passing becomes tractable.