Graph Partitioning using Bayesian Inference on GPU

Carl Yang, Steven Dalton, Maxim Naumov, Michael Garland, Aydın Buluç, John D. Owens

UC Davis, NVIDIA intern

[email protected]

March 26, 2018

Carl Yang, Steven Dalton, Maxim Naumov, Michael Garland, Aydın Buluç, John D. Owens (NVIDIA), Final Presentation, March 26, 2018

Overview

1 Introduction

2 Stochastic Block Model

3 Bayesian inference for graph partitioning

4 Parallelization strategy

5 Experiments

Problem: How can we break this graph up into smaller pieces so we can understand it?

Problem definition

Problem 1

Can MCMC be sped up by using a GPU?

Problem 2

How is convergence affected?

Problem 3

Is MCMC a scalable solution to the graph clustering problem?

Related work

Minimum-cut method

Hierarchical clustering

Girvan–Newman algorithm

Modularity maximization

Clique-based methods

Generative models

Idea

Before thinking about how to partition, we should come up with a model that generates what we are looking for.

Want:

The parameters should describe block structure in a graph.

The parameter values are unknown, but can be inferred from the data and the current state in a principled, statistical way.

Stochastic Block Model (SBM)

Holland, Laskey, and Leinhardt. "Stochastic blockmodels: First steps." Social Networks 5.2 (1983).

Parameters: $\eta_i$ → probability a node belongs to block $i$

$M_{rs}$ → probability an edge exists between block $r$ and block $s$

Rules for placing $N$ nodes in $B$ blocks:

1. Sample $b_i \sim \mathrm{Cat}(\eta)$ to obtain each node's colour.

2. Sample $e_{ij} \sim \mathrm{Poisson}(M)$ to determine which two blocks $r$ and $s$ the edge connects.

3. Sample $i \sim \mathrm{Uniform}(n_r)$ and $j \sim \mathrm{Uniform}(n_s)$ to get two nodes in blocks $r$ and $s$ respectively for edge $e_{ij}$.
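The three sampling rules above can be sketched in plain Python. This is only an illustration under assumptions that are not on the slide: `M[r][s]` is read as the Poisson rate for the number of edges from block `r` to block `s`, the Poisson draw uses Knuth's simple method, and the names `sample_poisson` and `sample_sbm` are hypothetical. Self-loops and repeated edges are allowed.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's method; adequate for the small rates used in this sketch."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_sbm(n_nodes, eta, M, seed=0):
    """Draw block labels b and a directed edge list from the model."""
    rng = random.Random(seed)
    n_blocks = len(eta)
    # Rule 1: each node's block is a categorical draw with probabilities eta.
    b = rng.choices(range(n_blocks), weights=eta, k=n_nodes)
    members = [[i for i in range(n_nodes) if b[i] == r] for r in range(n_blocks)]
    edges = []
    for r in range(n_blocks):
        for s in range(n_blocks):
            if not members[r] or not members[s]:
                continue  # no endpoints available in an empty block
            # Rule 2: number of edges between blocks r and s (assumed rate M[r][s]).
            for _ in range(sample_poisson(M[r][s], rng)):
                # Rule 3: endpoints chosen uniformly inside each block.
                edges.append((rng.choice(members[r]), rng.choice(members[s])))
    return b, edges

b, edges = sample_sbm(30, [0.5, 0.5], [[8.0, 1.0], [1.0, 8.0]], seed=1)
print(len(edges), "edges sampled")
```

With larger diagonal rates than off-diagonal rates, most sampled edges stay inside a block, which is the block structure the partitioner later tries to recover.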

Formulate clustering as exact recovery problem

1. Given $G$ and $b^{(t)}$, find $M^{(t)}$.

2. Given $G$ and $M^{(t)}$, find $\arg\max_b P(b \mid G, M)$. This becomes $b^{(t+1)}$.

Bayesian inference

We want to find the partition $b$ that maximizes:

$$P(b \mid G, M) = \frac{P(G \mid b, M)\, P(b, M)}{P(G)}$$

Taking negative logs of both sides, we want to minimize $\Sigma$:

$$\Sigma = \underbrace{-\log P(G \mid b, M)}_{S} \;\underbrace{-\log P(b, M)}_{L} \;\underbrace{+\log P(G)}_{\text{constant}}$$

$S$ is the amount of information required to describe the graph when the model is known.

$L$ is the amount of information required to describe the model.

Computing terms

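$S$ can be found by counting the number of configurations $\Omega$ of the graph. The fewer configurations, the better our model fits the graph:

$$S = \log\left(\frac{1}{\Omega}\right) = \log\left[\left(\frac{\prod_{rs} M_{rs}!}{\prod_r k_r^{+}!\,\prod_r k_r^{-}!}\right)^{-1}\right]$$

$L$ can be found by counting:

$$L = \underbrace{\log\left(\!\!\binom{B}{N}\!\!\right) + \log N! - \sum_r \log n_r!}_{b\text{ term}} + \underbrace{\log\left(\!\!\binom{B^2}{E}\!\!\right)}_{M\text{ term}}$$

Design decision: Ignore $L$ for now in the prototype, but leave room for it to be added in the future.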
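As a rough sketch, the $S$ term can be evaluated directly from the interblock edge count matrix using log-factorials. The reading of $k_r^{+}$ as the row sums of $M$ (out-degree of block $r$) and $k_r^{-}$ as the column sums (in-degree) is an assumption, as are the natural logarithm and the function names.

```python
import math

def log_factorial(n):
    """log(n!) via the log-gamma function, which avoids huge integers."""
    return math.lgamma(n + 1)

def description_length_S(M):
    """S = sum_r log k_r^+! + sum_r log k_r^-! - sum_rs log M_rs!"""
    n_blocks = len(M)
    k_out = [sum(M[r]) for r in range(n_blocks)]                              # assumed k_r^+
    k_in = [sum(M[r][s] for r in range(n_blocks)) for s in range(n_blocks)]   # assumed k_r^-
    log_num = sum(log_factorial(M[r][s]) for r in range(n_blocks) for s in range(n_blocks))
    log_den = sum(log_factorial(k) for k in k_out) + sum(log_factorial(k) for k in k_in)
    # log of the inverse of (numerator / denominator)
    return log_den - log_num

block_structured = [[2, 0], [0, 2]]   # all edges stay inside their block
mixed = [[1, 1], [1, 1]]              # same edge total, no block structure
print(description_length_S(block_structured), description_length_S(mixed))
```

Both matrices hold four edges, but the block-structured one yields the smaller $S$, matching the statement that fewer configurations means a better fit.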

Combinatorial optimization problem

So we want to find the partition $b$ such that $\Sigma$ is minimized.

However, for a graph of $B$ blocks and $N$ nodes, there are $B^N$ possible partitions $b$ for which we would need to compute that quantity.

We need an efficient way to traverse this large state space.
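To get a feel for how quickly $B^N$ grows, the short sketch below prints its order of magnitude. The node counts are taken from the datasets used later in the talk; the block count of 4 is purely hypothetical and chosen only for illustration.

```python
import math

def log10_partition_count(n_nodes, n_blocks):
    """log10 of B**N, the number of ways to assign N nodes to B labelled blocks."""
    return n_nodes * math.log10(n_blocks)

for n_nodes in (50, 1000, 500_000):
    exponent = log10_partition_count(n_nodes, 4)  # 4 blocks is a hypothetical choice
    print("%d nodes, 4 blocks: about 10^%.0f partitions" % (n_nodes, exponent))
```

Even the smallest graph is far beyond exhaustive enumeration, which motivates sampling the state space instead.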

MCMC sampling

1 Propose move.

2 Calculate move acceptance probability.

3 Commit move.

Upside: The chain converges to a stationary distribution, which is the probability distribution we are trying to find.

MCMC sampling applied to solve graph partitioning

Merge phase

1. Propose move
2. Calculate change in objective function
3. Get block move that improves objective function the most
4. Commit move
5. Go to 1) until $n_{\text{blocks}}^{\text{initial}} / r$ blocks are left

MCMC phase

1. Propose move
2. Calculate change in objective function
3. Calculate move acceptance probability
4. Commit move
5. Go to 1) until the MCMC chain has converged

Do Merge phase, MCMC phase, Merge phase, MCMC phase, etc. until the target cluster count has been reached.

1. Propose move

Counter-based RNG allows O(1) skip-ahead for each thread.

This allows independent random numbers to be generated within a device function.
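The idea of a counter-based generator can be illustrated in Python: the output is a pure function of a key and a counter, so any thread can jump straight to its own position without stepping through earlier values. The mixer below is the SplitMix64 finalizer, used here only as a stand-in; it is not the generator used in the talk, and the way `seed`, `thread_id`, and `counter` are combined is an assumption that has not been statistically vetted.

```python
MASK64 = (1 << 64) - 1

def mix64(z):
    """SplitMix64 finalizer: a stateless, invertible 64-bit bit mixer."""
    z = (z + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)

def counter_rand(seed, thread_id, counter):
    """Uniform float in [0, 1) as a pure function of (seed, thread_id, counter)."""
    key = mix64(seed ^ mix64(thread_id))          # per-thread key (illustrative scheme)
    return (mix64(key ^ counter) >> 11) / float(1 << 53)

# "Skip-ahead" is just indexing: thread 7 reads its 1,000,000th number directly.
print(counter_rand(seed=42, thread_id=7, counter=1_000_000))
```

Because no state is carried between calls, each simulated thread draws from its own stream with O(1) work per number, which is the property the slide relies on.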

2. Calculate objective function

Problem: How do we compute the objective function as if we have already made the move, but without actually changing our graph?

Key insight: Merge moves and node moves can both be expressed as the simultaneous element-wise addition of rows and columns of a matrix.
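A minimal sketch of this insight for a node move, assuming a dense adjacency matrix `A` of edge counts and an interblock count matrix `M` stored as nested lists (both representation choices and all function names are hypothetical). The node's out-edge contribution is subtracted from its old block's row and added to the new block's row, and the in-edge contribution does the same for columns, all on a copy so that the graph and the current model stay untouched. Self-loops are handled separately because they would otherwise be counted in both the row and the column update.

```python
def block_matrix(A, b, n_blocks):
    """Reference computation: M[r][s] = number of edges from block r to block s."""
    M = [[0] * n_blocks for _ in range(n_blocks)]
    for i, row in enumerate(A):
        for j, w in enumerate(row):
            M[b[i]][b[j]] += w
    return M

def moved_block_matrix(M, A, b, v, s):
    """Return a copy of M as if node v had moved to block s; A, b, M are untouched."""
    r = b[v]
    n_blocks = len(M)
    out = [0] * n_blocks   # v's out-edges, grouped by destination block
    inn = [0] * n_blocks   # v's in-edges, grouped by source block
    for j in range(len(A)):
        if j != v:
            out[b[j]] += A[v][j]
            inn[b[j]] += A[j][v]
    new_M = [row[:] for row in M]
    for t in range(n_blocks):
        new_M[r][t] -= out[t]   # row update: out-edge contribution leaves block r
        new_M[s][t] += out[t]   # ... and joins block s
        new_M[t][r] -= inn[t]   # column update: in-edge contribution leaves block r
        new_M[t][s] += inn[t]   # ... and joins block s
    new_M[r][r] -= A[v][v]      # self-loops move from (r, r) to (s, s)
    new_M[s][s] += A[v][v]
    return new_M

A = [[1, 1, 0, 2], [0, 0, 1, 0], [1, 0, 0, 1], [0, 3, 0, 0]]
b = [0, 0, 1, 1]
M = block_matrix(A, b, 2)
print(moved_block_matrix(M, A, b, v=0, s=1))
```

A merge of block `r` into block `s` follows the same pattern with the whole row and column of `r` added to those of `s`.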

We have a graph

How do we express, in matrix notation, node 1 being moved from blue to yellow?

Element-wise, move node 1's out-edge contribution from blue to yellow

Element-wise, move node 1's in-edge contribution from blue to yellow

Move complete

2. Calculate objective function

For sparse matrices, element-wise addition is equivalent to doing a set union.

A warp-wide sorting network allows us to do set unions using register memory.
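The equivalence between sparse element-wise addition and a set union can be seen in a sequential sketch. Here a sparse row is assumed to be a list of `(index, value)` pairs sorted by index; the result contains the union of the index sets, with values summed where both rows have an entry. This two-pointer merge is a CPU analogue only and is not the warp-wide sorting network mentioned above.

```python
def sparse_add(a, b):
    """Add two sparse vectors given as lists of (index, value) sorted by index."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        ia, va = a[i]
        ib, vb = b[j]
        if ia == ib:            # index present in both: sum the values
            out.append((ia, va + vb))
            i += 1
            j += 1
        elif ia < ib:           # index only in a
            out.append((ia, va))
            i += 1
        else:                   # index only in b
            out.append((ib, vb))
            j += 1
    out.extend(a[i:])           # leftovers from whichever input is longer
    out.extend(b[j:])
    return out

print(sparse_add([(0, 1), (3, 2)], [(1, 5), (3, 4), (7, 1)]))
```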

3. Commit move

A triple matrix product is used to update the model between the Merge and MCMC phases.

Hypothesis 1: Committing merge moves in parallel does not affect convergence rate.

Hypothesis 2: Committing MCMC moves in parallel does not affect convergence rate.
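The slide does not say which three matrices enter the triple product. One natural reading, offered here only as an assumption, is $M = C^{\top} A C$, where $A$ is the adjacency matrix and $C$ is the $N \times B$ one-hot block assignment matrix built from $b$. A dense, pure-Python sketch with hypothetical helper names:

```python
def one_hot(b, n_blocks):
    """N x B matrix with C[i][r] = 1 when node i is in block r."""
    return [[1 if b[i] == r else 0 for r in range(n_blocks)] for i in range(len(b))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Dense matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def interblock_counts(A, b, n_blocks):
    """Assumed triple product: M = C^T A C."""
    C = one_hot(b, n_blocks)
    return matmul(matmul(transpose(C), A), C)

A = [[1, 1, 0, 2], [0, 0, 1, 0], [1, 0, 0, 1], [0, 3, 0, 0]]
print(interblock_counts(A, [0, 0, 1, 1], 2))
```

The first product sums the adjacency rows of each block and the second sums the columns, so entry `(r, s)` counts edges from block `r` to block `s`. A production kernel would use sparse formats rather than dense lists.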

Parallelization summary

Implementations compared, grouped on the slide under "Reference impl." and "Our contribution":

Phase  Step           CPU Seq  CPU Par  GPU Seq  GPU Par
Merge  Propose move   par      par      par      par
Merge  Calculate obj  par      par      par      par
Merge  Commit move    seq      seq      seq      par
MCMC   Propose move   seq      par      par      par
MCMC   Calculate obj  seq      par      par      par
MCMC   Commit move    seq      seq      par      par

Experimental Setup

Hardware:

CPU: Intel Core i7-5820K CPU @ 3.30GHz, 32GB RAM

GPU: Titan Xp, 12GB RAM

Datasets:

Synthetic datasets with ground truth partitions for each node.

Nodes 50 100 1K 5K 20K 50K 500K

Edges 319 6K 20K 102K 409K 1M 10M

Speedup comparison

Figure: Speedup comparison across four implementations (CPU Seq, CPU Par, GPU Seq, GPU Par). x-axis: number of nodes (50 to 500,000); y-axis: speedup.

Runtime breakdown

Figure: Runtime breakdown between four implementations (CPU Seq, CPU Par, GPU Seq, GPU Par) into Build, Merge, and MCMC, for graphs of 50 to 50,000 nodes.

Rate of convergence

Figure: Change in objective function plotted against number of moves (GPU, CPU Seq, CPU Par).

Rate of convergence (in runtime)

Figure: Change in objective function plotted against runtime in seconds (GPU, CPU Seq, CPU Par).

Raw runtime numbers and accuracy

         CPU Seq            CPU Par            GPU Seq            GPU Par
Nodes    Time (s)  Acc (%)  Time (s)  Acc (%)  Time (s)  Acc (%)  Time (s)  Acc (%)
50       0.519     100      0.519     100      0.0876    100      0.0603    100
100      0.802     100      0.531     82       0.2249    100      0.1779    100
1000     5.193     81.41    0.939     100      3.153     100      1.5649    100
5000     16.443    90       2.255     81.7     27.093    92.943   3.113     87.6
20000    118.201   94.6     29.97     93.93    51.519    96.5     7.671     88.5
50000    272.249   89.8     97.68     87.15    2902.4    97.6     23.707    89.2

Takeaways

It is surprisingly easy to make MCMC converge.

However, it’s a different story to make MCMC scalable.

Future work

Use a specialized triple matrix product kernel to take advantage of knowledge about matrix structure.

Use load-balancing methods such as TWC to handle unbalanced data.

Try newer Bayesian inference methods such as minibatch MCMC and ADVI (automatic differentiation variational inference) that claim to scale better with data size than standard MCMC.

Add multi-GPU support.

Questions?

Stochastic Block Model (SBM)

Holland, Laskey, and Leinhardt. "Stochastic blockmodels: First steps." Social Networks 5.2 (1983).

Given $N$ nodes in $B$ blocks:

State: $b_i$ → block node $i$ belongs to

Parameters: $\eta_i$ → probability a node belongs in block $i$

$\lambda_{rs}$ → probability an edge exists between block $r$ and block $s$

1. Sample each node i.i.d. over $\eta_i$ to obtain each node's colour.

2. Sample each edge i.i.d. over $\mathrm{Poi}(\lambda_{rs})$ to obtain the blocks $r$ and $s$ it connects. For each edge, sample one node in block $r$ with probability $1/n_r$ and one node in block $s$ with probability $1/n_s$ to determine which two nodes the edge connects.

The probability of generating a graph $G$ and partition $b$ given parameters $\eta, \lambda$, assuming a Bernoulli edge distribution, is:

$$P(G \mid b, M) = \prod_i \eta_{b_i} \prod_{i<j} \lambda_{b_i b_j}^{A_{ij}} \left(1 - \lambda_{b_i b_j}\right)^{1 - A_{ij}}$$
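The product above is usually evaluated in log space to avoid underflow. A small sketch, assuming an undirected 0/1 adjacency matrix `A`, block labels `b`, and parameters `eta` and `lam` with every probability strictly between 0 and 1 (the function name is hypothetical):

```python
import math

def sbm_log_likelihood(A, b, eta, lam):
    """Log of the Bernoulli SBM probability: node terms plus one term per pair i < j."""
    n = len(b)
    total = sum(math.log(eta[b[i]]) for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            p = lam[b[i]][b[j]]
            # Edge present contributes log p, edge absent contributes log(1 - p).
            total += math.log(p) if A[i][j] else math.log(1.0 - p)
    return total

A = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # two disjoint edges
eta = [0.5, 0.5]
lam = [[0.9, 0.1], [0.1, 0.9]]
print(sbm_log_likelihood(A, [0, 0, 1, 1], eta, lam))   # partition matching the edges
print(sbm_log_likelihood(A, [0, 1, 0, 1], eta, lam))   # partition cutting both edges
```

With these assortative parameters, the partition that keeps each edge inside a block scores the higher log-probability.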

Variant of SBM we will use

Non-parametric: use a Bayesian formulation instead of maximum likelihood.

This solves the over-fitting problem.

Degree-corrected: add additional parameters $k_i$ for every node $i$ representing its propensity for high degree.

This accounts for the power-law degree distribution that many real-world graphs exhibit.

Expression

Taking negative logs of both sides:

$$-\log P(b \mid G, M) = \underbrace{-\log P(G \mid b, M)}_{S} \;\underbrace{-\log P(b, M)}_{L} \;\underbrace{+\log P(G)}_{\text{constant}}$$

Sequential MCMC for graph partitioning

Input: $b$: $N \times 1$ current block assignment vector, $M$: $B \times B$ interblock edge count matrix, $A$: $N \times N$ adjacency matrix

procedure MCMCSequential(b, M, A)
    for node i do
        Propose random move for i: block r → s
        Acceptance probability:
            $p_{\text{accept}} = \min\left[\exp(-\beta \Delta S)\,\frac{p_{s \to r}}{p_{r \to s}},\ 1\right]$
        Perform move by updating b, M
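One sweep of this procedure might look as follows in Python. This is a deliberately unoptimized sketch under several assumptions: proposals are uniform over blocks, so the ratio $p_{s \to r} / p_{r \to s}$ equals 1; $\Delta S$ is obtained by recomputing $M$ from scratch rather than by an incremental update; and $S$ uses the log-factorial form with $k_r^{+}$ and $k_r^{-}$ read as row and column sums of $M$. All function names are hypothetical.

```python
import math
import random

def block_matrix(A, b, n_blocks):
    """M[r][s] = number of edges from block r to block s."""
    M = [[0] * n_blocks for _ in range(n_blocks)]
    for i, row in enumerate(A):
        for j, w in enumerate(row):
            M[b[i]][b[j]] += w
    return M

def entropy(M):
    """S = sum_r log k_r^+! + sum_r log k_r^-! - sum_rs log M_rs!"""
    lf = lambda n: math.lgamma(n + 1)
    rows = [sum(r) for r in M]
    cols = [sum(c) for c in zip(*M)]
    return sum(map(lf, rows)) + sum(map(lf, cols)) - sum(lf(x) for r in M for x in r)

def mcmc_sweep(A, b, n_blocks, beta=1.0, rng=None):
    """Visit every node once; b is updated in place and returned with its S."""
    rng = rng or random.Random(0)
    S = entropy(block_matrix(A, b, n_blocks))
    for i in range(len(b)):
        r, s = b[i], rng.randrange(n_blocks)    # uniform proposal: block r -> s
        if s == r:
            continue
        b[i] = s                                 # tentatively apply the move
        S_new = entropy(block_matrix(A, b, n_blocks))
        log_ratio = -beta * (S_new - S)          # proposal ratio is 1 here
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            S = S_new                            # accept
        else:
            b[i] = r                             # reject and restore
    return b, S

A = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(mcmc_sweep(A, [0, 1, 0, 1], 2, beta=1.0, rng=random.Random(0)))
```

Working with the log of the acceptance ratio avoids overflow in `exp` when a move improves the objective by a large amount.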

Generative models

Idea

Before thinking about how to partition, we should come up with a model of what we are looking for.

The parameters should describe block structure.

The parameter values are unknown, but can be inferred from the data and the current state in a principled, statistical way.

Generative models: Sketch of algorithm

Given data $G$ and an initial guess of partition $b^{(0)}$, we can compute $M^{(1)}$ and $b^{(1)}$:

1. Compute model parameters $M^{(1)}$ using $G$ and $b^{(0)}$.

2. Make a better guess for partition $b^{(1)}$ using Bayesian inference:

$$\arg\max_b P(b \mid G, M) = \arg\max_b \frac{P(G \mid b, M)\, P(b, M)}{P(G)}$$

Computing terms

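$S$ can be found by counting the number of configurations $\Omega$ of the graph. The fewer configurations, the better our model fits the graph:

$$S = \log\left(\frac{1}{\Omega}\right) = \log\left[\left(\frac{\prod_{rs} M_{rs}!}{\prod_r k_r^{+}!\,\prod_r k_r^{-}!}\right)^{-1}\right]$$

$L$ can be found by counting:

$$L = \underbrace{\log\left(\!\!\binom{B}{N}\!\!\right) + \log N! - \sum_r \log n_r!}_{b\text{ term}} + \underbrace{\log\left(\!\!\binom{B^2}{E}\!\!\right)}_{M\text{ term}}$$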

Variable-at-a-time Metropolis-Hastings

Algorithm 1 Sequential MCMC.

Input: $b^0$: $N \times 1$ state vector initialized randomly
Output: $b^T$: $N \times 1$ vector drawn from the stationary distribution

for iteration t = 1, 2, ... do
    for node i = 1, 2, ..., N do
        Propose: $b_i^{\text{cand}} \sim q(b_i^{t} \mid b^{t-1})$
        Acceptance probability:
            $\alpha = \min\left(\frac{q(b_i^{t-1} \mid b_i^{\text{cand}})\,\pi(b_i^{\text{cand}})}{q(b_i^{\text{cand}} \mid b_i^{t-1})\,\pi(b_i^{t-1})},\ 1\right)$
        $u \sim \mathrm{Uniform}(0, 1)$
        if $u < \alpha$ then
            Accept proposal: $b_i^{t} \leftarrow b_i^{\text{cand}}$
        else
            Reject proposal: $b_i^{t} \leftarrow b_i^{t-1}$
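A generic version of Algorithm 1 can be written against a toy target to show the Hastings ratio in isolation. Everything below is illustrative: the target `pi` is an unnormalised product of per-coordinate weights rather than the graph posterior, the proposal is uniform over two labels, and the callback signatures are assumptions made for this sketch.

```python
import random

def metropolis_hastings(pi, q_sample, q_prob, b0, n_iters, seed=0):
    """Variable-at-a-time Metropolis-Hastings over a discrete state vector.

    pi(b)              -> unnormalised target density of state b (must be > 0)
    q_sample(cur, rng) -> candidate value proposed from current value cur
    q_prob(new, cur)   -> probability of proposing new when at cur
    """
    rng = random.Random(seed)
    b = list(b0)
    for _ in range(n_iters):
        for i in range(len(b)):
            cur = b[i]
            cand = q_sample(cur, rng)
            proposal = b[:i] + [cand] + b[i + 1:]
            # Hastings ratio: reverse proposal times new target over forward times old.
            ratio = (q_prob(cur, cand) * pi(proposal)) / (q_prob(cand, cur) * pi(b))
            if rng.random() < min(ratio, 1.0):
                b = proposal      # accept
            # otherwise keep the previous value (reject)
    return b

weights = [1.0, 3.0]              # toy target: label 1 is three times as likely as label 0

def pi(b):
    p = 1.0
    for x in b:
        p *= weights[x]
    return p

out = metropolis_hastings(pi, lambda cur, rng: rng.randrange(2),
                          lambda new, cur: 0.5, [0] * 200, n_iters=50, seed=1)
print(sum(out) / len(out))        # fraction of coordinates holding label 1
```

Under this toy target each coordinate should hold label 1 about three quarters of the time once the chain has mixed, which gives a quick sanity check on the acceptance rule.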

Where SBM fits into machine learning

Hidden Markov Model

Latent Variable Model

Variational auto-encoders
