Relational Factor Graphs


1

Relational Factor Graphs

Lin Liao

Joint work with Dieter Fox

2

A Running Example

Collective classification of a person’s significant places

3

Features to Consider

Local features:
  Temporal: time of day, day of week, duration
  Geographic: near restaurants, near stores

Pair-wise features:
  Transitions: which place follows which place

Global features:
  Aggregates: number of homes or workplaces

4

Which Graphical Model?

Option 1: Bayesian networks and Probabilistic Relational Models

But the pair-wise relations may introduce cycles.

[Figure: four place nodes (Place 1, Place 2, Place 3, Place 4)]

5

Which Graphical Model?

Option 2: Markov networks and Relational Markov Networks

But aggregations can introduce huge cliques and lose independence relations.

[Figure: four place nodes (Place 1, Place 2, Place 3, Place 4) and a "Number of homes" node]

6

Motivation

We want a relational probabilistic model that is

Suitable to represent both undirected relations (e.g., pair-wise features) and directed relations (e.g., deterministic aggregation)

Able to address some of the computational issues at the template level

7

Outline

Representation
  Factor graphs [Kschischang et al. 2001, Frey 2003]
  Relational factor graphs

Inference
  Belief propagation
  Inference templates
  Summation template based on FFT

Experiments

8

Factor Graph

Undirected factor graph [Kschischang et al. 2001]

Bipartite graph that includes both variable nodes (x1,…,xN) and factor nodes (f1,…,fM)

Joint distribution of variables is proportional to the product of factor functions

[Figure: example factor graph with variable nodes x1, x2, x3, x4 and factor nodes f1, f2, f3]
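The "product of factor functions" definition can be illustrated with a minimal Python sketch. The factor scopes and table values below are assumptions chosen for illustration (a chain f1(x1,x2), f2(x2,x3), f3(x3,x4) over binary variables); they are not read off the slide's figure.

```python
import itertools

# Hypothetical factors over binary variables: name -> (scope, function).
# Scopes and values are assumed for illustration only.
factors = {
    "f1": (("x1", "x2"), lambda a, b: 2.0 if a == b else 1.0),
    "f2": (("x2", "x3"), lambda a, b: 3.0 if a == b else 1.0),
    "f3": (("x3", "x4"), lambda a, b: 1.0 if a == b else 0.5),
}
variables = ["x1", "x2", "x3", "x4"]


def unnormalized(assignment):
    """Product of all factor values for one full assignment."""
    score = 1.0
    for scope, fn in factors.values():
        score *= fn(*(assignment[v] for v in scope))
    return score


def joint_distribution():
    """Normalize the product of factors over all 2^4 assignments."""
    table = {}
    for values in itertools.product([0, 1], repeat=len(variables)):
        table[values] = unnormalized(dict(zip(variables, values)))
    z = sum(table.values())  # partition function
    return {k: v / z for k, v in table.items()}


joint = joint_distribution()
```

Enumerating every assignment is only feasible for tiny graphs; it is shown here to make the definition concrete, not as an inference method.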

9

Factor Graph

Directed factor graph [Frey 2003]

Allow some edges to be directed so as to unify Bayesian networks and Markov networks

A valid graph should have no directed cycles

[Figure: example directed factor graph with variable nodes x1, x2, x3, x4 and factor nodes f1, f2, f3]

10

Markov Network to Factor Graph

Factors represent the potential functions

[Figure: a Markov network (left) and the corresponding factor graph (right)]

11

Bayesian Network to Factor Graph

Factors represent the conditional probability tables

[Figure: a Bayesian network (left) and the corresponding factor graph (right)]

12

Unify MN and BN

[Figure labels: Local features, Place labels, Aggregation factor (+), Number of homes, Aggregate features]

13

Relational Factor Graph

A set of factor templates that can be used to instantiate (directed) factor graphs given data

Representation template
  Use SQL (similar to RMN)
  Guarantee no directed cycles

Inference template
  Optimization within a factor (discussed later)
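As a rough illustration of how an SQL-based representation template might instantiate factors from data, here is a minimal Python sketch using the standard-library sqlite3 module. The table `place`, its columns, the query, and the `instantiate` helper are all hypothetical; the deck's actual schema and template syntax are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per visited place, ordered by visit index.
conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, visit_order INTEGER)")
conn.executemany("INSERT INTO place VALUES (?, ?)", [(10, 1), (11, 2), (12, 3)])

# Hypothetical template: one pair-wise factor per pair of consecutive places.
TRANSITION_TEMPLATE = """
    SELECT a.id, b.id
    FROM place AS a JOIN place AS b
      ON b.visit_order = a.visit_order + 1
    ORDER BY a.visit_order
"""


def instantiate(conn, template, name):
    """Run a template query; each returned row becomes one factor's scope."""
    return [(name, tuple(f"label_{pid}" for pid in row))
            for row in conn.execute(template)]


factors = instantiate(conn, TRANSITION_TEMPLATE, "transition")
```

Each row returned by the query names the label variables that one instantiated factor connects, so the same template yields a different ground factor graph for each data set.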

14

Place Labeling: Schema

15

Place Labeling: Transition Features

[Figure: label variables Label1, Label2, Label3 connected by pair-wise factors]

16

Place Labeling: Aggregate Features

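[Figure: label variables Label1, Label2, Label3; one "=Home?" Boolean variable per label; a summation factor (+) over the Boolean variables; a "Num of homes" variable; an aggregate feature]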

17

Outline

Representation
  Factor graphs [Kschischang et al. 2001, Frey 2003]
  Relational factor graphs

Inference
  Belief propagation
  Inference templates
  Summation template based on FFT

Experiments

18

Inference in Factor Graph

Belief propagation: two types of messages

Message from variable x to factor f

Message from factor f to variable x

nx: factors adjacent to x; nf: variables adjacent to f
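In the notation defined on this slide, the two message types can be written with the standard sum-product updates of Kschischang et al. 2001. This is a sketch under the assumption that the slide uses those standard updates; the symbol μ for messages is introduced here for illustration.

```latex
% Message from variable x to factor f: product of messages from the other adjacent factors
\mu_{x \to f}(x) = \prod_{h \in n_x \setminus \{f\}} \mu_{h \to x}(x)

% Message from factor f to variable x: sum over the other arguments of f
\mu_{f \to x}(x) = \sum_{n_f \setminus \{x\}} f(n_f) \prod_{y \in n_f \setminus \{x\}} \mu_{y \to f}(y)
```

The sum in the second update ranges over all joint values of the variables in nf other than x, which is where the cost of a factor with many arguments comes from.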

19

Inference Templates

Simplest case: specify the function f(nf) and use the above formula to compute the message f -> x

Problem: complexity is exponential in the number of factor arguments. This can be very expensive for aggregation factors.

Inference templates allow users to specify optimized algorithms at the template level
  They are in a general form and easy to share
  They support template-level complexity analysis
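To make the exponential cost concrete, the following Python sketch computes the message from a deterministic summation factor to its output variable by enumerating every joint assignment of the inputs. The function name and the list-of-lists message format are assumptions for illustration, not the authors' implementation.

```python
import itertools


def brute_force_sum_message(input_msgs):
    """Message from a deterministic summation factor to its output variable.

    input_msgs[i][v] is the incoming message from input variable i at
    value v. Every joint assignment is enumerated, so the cost grows as
    k^n for n inputs with k values each.
    """
    n_out = sum(len(m) - 1 for m in input_msgs) + 1
    out = [0.0] * n_out
    for values in itertools.product(*(range(len(m)) for m in input_msgs)):
        weight = 1.0
        for msg, v in zip(input_msgs, values):
            weight *= msg[v]
        # The factor is 1 only when the output equals the sum of the inputs.
        out[sum(values)] += weight
    return out


message = brute_force_sum_message([[0.5, 0.5]] * 3)
```

With three binary inputs this loops over 8 assignments; with 30 it would loop over roughly a billion, which is the motivation for a specialized summation template.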

20

Summation Templates

[Figure: summation factor (+) with inputs xin1, xin2, ..., xin7, xin8 and output xout]

21

Summation: Forward Message

[Figure: summation factor (+) with inputs xin1, xin2, ..., xin7, xin8 and output xout]

Compute the distribution of the sum of independent variables xin1, ..., xin8

22

Summation: Forward Message

Convolution tree: each node can be computed using FFT; total complexity O(n log^2 n)
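A convolution tree of this kind can be sketched in Python with NumPy's real FFT. The function names are hypothetical, and the sketch assumes each input distribution is a 1-D array over the values 0, 1, 2, and so on; it is an illustration of the idea rather than the implementation used in the experiments.

```python
import numpy as np


def fft_convolve(a, b):
    """Linear convolution of two 1-D non-negative arrays via real FFT."""
    size = len(a) + len(b) - 1
    n_fft = 1 << (size - 1).bit_length()  # next power of two >= size
    out = np.fft.irfft(np.fft.rfft(a, n_fft) * np.fft.rfft(b, n_fft), n_fft)
    return np.clip(out[:size], 0.0, None)  # remove tiny negative round-off


def convolution_tree(dists):
    """Return all tree levels; levels[-1][0] is the distribution of the sum."""
    levels = [[np.asarray(d, dtype=float) for d in dists]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = [fft_convolve(prev[i], prev[i + 1])
               for i in range(0, len(prev) - 1, 2)]
        if len(prev) % 2:  # an unpaired node is carried up unchanged
            nxt.append(prev[-1])
        levels.append(nxt)
    return levels


levels = convolution_tree([[0.5, 0.5]] * 8)
sum_dist = levels[-1][0]
```

Keeping every level of the tree, rather than only the root, is what makes the intermediate results available for reuse later.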

23

Summation: Backward Message

[Figure: summation factor (+) with inputs xin1, xin2, ..., xin7, xin8 and output xout]

Message from xout defines a prior distribution of the sum. For each value of xin2, compute the distribution of the sum and weight it by the prior.
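The backward step described above can be sketched as follows. This is a deliberately naive Python version that recomputes the sum of the other inputs with np.convolve and does no caching; the function name and argument layout are assumptions for illustration.

```python
import numpy as np


def backward_message(input_msgs, target, prior_on_sum):
    """Message from the summation factor to the input variable `target`.

    input_msgs[i][v]: incoming message of input i at value v.
    prior_on_sum[s]: incoming message from xout at sum value s; it must
    cover every reachable sum.
    """
    # Distribution of the sum of all inputs except the target.
    others = np.array([1.0])
    for i, msg in enumerate(input_msgs):
        if i != target:
            others = np.convolve(others, np.asarray(msg, dtype=float))
    k = len(input_msgs[target])
    out = np.zeros(k)
    for v in range(k):
        # If the target takes value v, the total is v + (sum of the others).
        for s_others, p in enumerate(others):
            out[v] += p * prior_on_sum[v + s_others]
    return out


msg_to_x0 = backward_message([[0.5, 0.5]] * 3, target=0,
                             prior_on_sum=[0.0, 0.0, 1.0, 0.0])
```

In the usage line, the prior says the sum is exactly 2, so the message favors the value 1 for the target input over the value 0.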

24

Summation: Backward Message

If we reuse the results cached for the forward message, complexity becomes O(n log n)

25

Summation Templates

By using a convolution tree, FFT, and caching, the average complexity of passing a message through a summation factor is O(n log n) instead of exponential.

26

Learning

Estimate the weights for probabilistic factors (local features, pair-wise features, and aggregate features)

Optimize the weights to maximize the conditional likelihood of the labeled training data

The same algorithm as RMN
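Assuming the probabilistic factors are log-linear in the weights (an assumption; the slides do not give the parameterization), the conditional log-likelihood and its gradient take the familiar form below. Here w is the weight vector, y the labels, x the observations, and φ(x, y) a hypothetical vector of feature values summed over the instantiated factors.

```latex
\log p(y \mid x; w) = w^\top \phi(x, y) - \log \sum_{y'} \exp\big(w^\top \phi(x, y')\big)

\nabla_w \log p(y \mid x; w) = \phi(x, y) - \mathbb{E}_{p(y' \mid x; w)}\big[\phi(x, y')\big]
```

Under this assumption the gradient is the difference between observed and expected feature values, and the expectation is what requires inference in the instantiated factor graph.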

27

Experiments

Two data sets:
  “Single” data set: one person’s GPS data for 4 months
  “Multiple” data set: one-week GPS data from 5 subjects

Six candidate labels: Home, Work, Shopping, Dining, Friend, Others

Geographic knowledge is obtained from the Microsoft MapPoint Web Service

28

How Much Aggregates Help

Error rate        Multiple    Single
No aggregate      28%         9%
With aggregate    18%         6%

Test on “multiple” data set: leave-one-subject-out cross-validation

Test on “single” data set: cross-validation (train on 1 month, test on 3 months)

29

How Efficient the Optimized BP

30

Summary

Relational factor graph is SQL + (directed) factor graph

It is
  Suitable to represent both undirected relations and directed relations
  Convenient to use: no directed cycles
  Able to address computation issues at the template level
