

Design of a NVRAM Specialized Dynamic Graph Data Structure
Keita Iwabuchi 1,2, Roger A. Pearce 2, Brian Van Essen 2, Maya B. Gokhale 2, Satoshi Matsuoka 1

1. Tokyo Institute of Technology (Tokyo Tech) 2. Lawrence Livermore National Laboratory (LLNL)

A NVRAM Specialized Degree-Aware Dynamic Graph Data Structure

1. Streaming edges are ingested in Sorted, Random, or BFS order
2. Partition the edges using a 1D or 2D partitioning scheme (64 partitions)
3. Buffer a subset of edges (1 million) in DRAM
4. Insert the edge buffer into the graph data structure sequentially

Streaming edge insertion

Experiments

Motivation

Key Design Objectives
• Increase page-level locality of data stored in NVRAM
• Optimize for low-degree vertices
• Efficiently search and retrieve vertices, edges, and metadata
• Quickly locate a specific edge matching topological and metadata constraints

Our Approach
• Degree-aware data structures, where low-degree vertices are compactly represented
• Use Robin Hood hashing [Celis '86] because of its locality properties

Store and Process Large Dynamic Graphs
• Social networks, genome analysis, WWW, etc.
• Streaming graph updates (insert or delete edges or vertices)
• Efficiently store sparse scale-free graphs

Leverage Emerging NVRAM in HPC Systems
• NVRAM has lower cost and power consumption than DRAM
• Persistently store a distributed graph database across compute nodes with attached NVRAM
• Extends a node's memory capacity

Goal
High performance for:
• Insertion and deletion of vertices and edges
• Search for a specific edge based on edge metadata

[Overview figure: streaming edges arriving in Sorted, Random, or BFS order pass through a controller/partitioner and are distributed to compute nodes, each holding a partition of the dynamic graph data structure (this work) with vertices (v1, v2, v3) and their properties (p1, p2, p3).]

WebGraph 2012 [Lehmberg’14]

Configuration
• Catalyst cluster at LLNL with 800 GB of NVRAM per node (single node)
• Memory-mapped I/O using DI-MMAP as the interface to NVRAM, limiting the DRAM-resident portion of the graph DB (page buffer) to 4 GB
• Boost.Interprocess to allocate data structures in the memory-mapped region
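The configuration rests on placing the data structures in a file-backed mapping so they persist in NVRAM rather than DRAM. The sketch below shows the underlying mechanism with plain POSIX mmap so it stays self-contained (the poster uses DI-MMAP plus Boost.Interprocess on top of this idea); the file path and Edge layout are illustrative:

```cpp
// Sketch: a structure stored through a memory-mapped file survives
// unmapping and re-mapping, i.e. it lives in the backing store (NVRAM in
// the poster's setting), not in DRAM.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

struct Edge { unsigned long src, dst; double weight; };

// Write one edge into a file-backed mapping, unmap, re-map, and check
// that the edge survived.
bool mmap_roundtrip(const char* path) {
    const size_t len = sizeof(Edge) * 4;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0 || ftruncate(fd, (off_t)len) != 0) return false;

    Edge* edges = (Edge*)mmap(nullptr, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (edges == MAP_FAILED) { close(fd); return false; }
    edges[0] = {1, 2, 0.5};          // store directly in the mapping
    msync(edges, len, MS_SYNC);      // flush to the backing store
    munmap(edges, len);
    close(fd);

    fd = open(path, O_RDONLY);
    edges = (Edge*)mmap(nullptr, len, PROT_READ, MAP_SHARED, fd, 0);
    bool ok = (edges != MAP_FAILED && edges[0].src == 1 &&
               edges[0].dst == 2 && edges[0].weight == 0.5);
    if (edges != MAP_FAILED) munmap(edges, len);
    close(fd);
    unlink(path);
    return ok;
}
```

Because loads and stores go through the page cache, capping the cache at 4 GB (as in the experiments) forces the access pattern's page-level locality to dominate performance once the database outgrows it.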

Dataset
• Largest open-source webgraph to our knowledge: 120 billion edges
• Graph is 1D or 2D partitioned, modeling 64 partitions
• Vertex: webpage; Edge: hyperlink; Weight: N/A


Note that each data structure is distributed across the nodes

Baseline model (Boost)

Vertex table (unordered map), edge tables (vector)
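A minimal rendering of that baseline layout — a vertex table mapping each vertex ID to its dynamically growing edge table — could look like the following; the struct and method names are illustrative:

```cpp
// Sketch of the baseline model: STL/Boost-style containers with a vertex
// table (unordered map) pointing at per-vertex edge tables (vector).
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct BaselineGraph {
    // vertex table: vertex ID -> that vertex's edge table
    std::unordered_map<uint64_t, std::vector<uint64_t>> vertex_table;

    void insert_edge(uint64_t u, uint64_t v) {
        vertex_table[u].push_back(v);   // edge table grows dynamically
    }

    size_t degree(uint64_t u) const {
        auto it = vertex_table.find(u);
        return it == vertex_table.end() ? 0 : it->second.size();
    }
};
```

The pointer-chasing this layout requires — hash bucket to node to separately allocated vector — scatters a vertex's data across pages, which is exactly the page-level locality problem the degree-aware design targets.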

Results

Robin Hood Hashing (edge insertion example)

Degree Aware Edge Insertion Algorithm

• Degree aware data structures scale near-linearly with the number of edges inserted (up to 2 billion edges)

• Robin Hood Hashing improves page-level locality and overall performance when graph database grows beyond 4GB page cache

This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-POST-676096

Jeremiah Willcock, Indiana University

IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis 2015, Austin
Tokyo Institute of Technology · Lawrence Livermore National Laboratory

[Figure: example contents of the degree-aware tables (v: vertex ID, p: vertex property, w: edge weight). The low-degree table stores compact entries keyed by edges such as {v1, v2} and {v1, v3} together with the vertex property (p1) and edge weights (w1, w2); the mid-high-degree table holds the edge lists of higher-degree vertices such as v3 (weights w7, w8).]


[Figure: edge-insertion performance when inserting sorted and BFS-ordered edge lists. Markers show where the DegreeAware and Baseline graph databases exceed the 4 GB page cache; the final DegreeAware database sizes are 34 GB and 33 GB for the two workloads.]

Notation: u = source vertex ID, v = target vertex ID, L_TBL = low-degree table, MH_TBL = mid-high-degree table.

Insert Edge (u, v):
    d1 ← L_TBL.degree(u)
    if d1 > 0:                                # u is in the low-degree table
        if d1 < low_degree_threshold:
            L_TBL.insert(u, v)
        else:                                 # u outgrows the low-degree table
            MH_TBL.insert(L_TBL.pop(u))       # move u's existing edges
            MH_TBL.insert(u, v)
    else:
        d2 ← MH_TBL.degree(u)
        if d2 = 0:                            # u is a new vertex
            L_TBL.insert(u, v)
        else:
            MH_TBL.insert(u, v)
    if not (max_probedistance < long_probedistance):
        allocate a chain table                # bound probe distances in the hash table
    Finish

[Figure: Robin Hood hashing edge-insertion example on an 8-slot table spanning two pages (slots 0-3 on page 0, slots 4-7 on page 1). Edges {0, 1}, {1, 0}, {1, 2}, {5, 6}, {6, 5} occupy slots according to their hash values, with probe distances 0, 0, 1, 0, 0; inserting {1, 5} (hash value 1) probes past the occupied slots and settles with probe distance 2, so all keys hashing to the same value stay contiguous on one page.]