VENUS: Vertex-Centric Streamlined Graph Computation on a Single PC
Jiefeng Cheng1, Qin Liu2, Zhenguo Li1,
Wei Fan1, John C.S. Lui2, Cheng He1
1Huawei Noah’s Ark Lab
2 The Chinese University of Hong Kong
ICDE’15
Graph is everywhere
We have large graphs
• Web graph
• Social graph
• User-movie ratings graph
• …

Graph Computation
• PageRank
• Community detection
• ALS for collaborative filtering
• …
Mining from Big Graphs: two feasible ways
Distributed systems
• Pregel[SIGMOD'10], GraphLab[OSDI'12], GraphX[OSDI'14], Giraph, ...
• Expensive cluster, complex setup, writing distributed programs
Single-machine systems
• Disk: GraphChi[OSDI'12], X-Stream[SOSP'13]
• SSD: TurboGraph[KDD'13], FlashGraph[FAST'15]
• Computation time close to distributed systems
  • PageRank on Twitter graph (41M nodes, 1.4B edges)
    • Spark: 8.1 min with 50 machines (each with 2 CPUs, 7.5G RAM) [Stanton KDD'12]
    • VENUS: 8 min on a single machine with a quad-core CPU, 16G RAM
• Affordable, easy to program/debug
Existing Systems
Vertex-centric programming model: popularized by Pregel / GraphLab / GraphChi
• Each vertex updates itself based on its neighborhood

GraphChi
• Updated data on each vertex must be propagated to its neighbors through disk
• Extensive disk I/O

X-Stream
• Different API: edge-centric programming
• Less expressive; common algorithms must be re-implemented
• Also uses disk to propagate updates
Our Contributions
Design and implement a disk-based system, VENUS
• A new vertex-centric streamlined processing model
• Separates mutable vertex data from immutable edge data
• Reads/writes less data than other systems

Evaluation on large graphs
• Outperforms GraphChi and X-Stream
• Verifies that our design reduces data access
Vertex-Centric Programming
Consider GraphChi
for each iteration:
    for each vertex v:
        update(v)

void update(v):
    fetch data from each in-edge
    update data on v
    spread data to each out-edge
[Figure: vertex v with its data duplicated on in- and out-edges]
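The GraphChi-style loop above can be made concrete with a small Python sketch (a toy illustration, not GraphChi's real API): each edge carries a mutable value, so a vertex's new result must be copied onto every out-edge, which is exactly the duplicated data noted above.

```python
# Sketch of the GraphChi-style vertex-centric model: every edge stores a
# mutable value, so a vertex update must be spread onto each out-edge.
# Illustrative names only, not GraphChi's actual API.

class Edge:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst
        self.value = 0.0          # mutable data carried on the edge

def update(v, in_edges, out_edges, damping=0.85):
    """One PageRank-style update for vertex v."""
    rank = (1 - damping) + damping * sum(e.value for e in in_edges[v])
    for e in out_edges[v]:        # propagate through (disk-resident) edges
        e.value = rank / max(len(out_edges[v]), 1)
    return rank

# toy graph: 1->2, 1->3, 2->3
edges = [Edge(1, 2), Edge(1, 3), Edge(2, 3)]
in_edges = {v: [e for e in edges if e.dst == v] for v in (1, 2, 3)}
out_edges = {v: [e for e in edges if e.src == v] for v in (1, 2, 3)}

ranks = {}
for it in range(2):               # two iterations over all vertices
    for v in (1, 2, 3):
        ranks[v] = update(v, in_edges, out_edges)
```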
Vertex-Centric Programming
VENUS:
• Only store mutable values on vertices

Pros
• Less data access
• Enables "streamlined" processing

Cons
• Limited expressiveness
void update(v):
    fetch values from each in-neighbor
    update data on v
[Figure: vertex v reading values directly from its in-neighbors]
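Under the VENUS model, only vertices carry mutable data, and an update reads its in-neighbors' current values directly instead of per-edge copies. A minimal sketch with assumed names (not the system's API), again using a PageRank-style computation:

```python
# Sketch of the VENUS model: mutable data lives only on vertices; the
# immutable edge structure just tells update() who the in-neighbors are.

def update(v, in_nbrs, value, out_deg, damping=0.85):
    """New value of v computed from its in-neighbors' current values."""
    s = sum(value[u] / max(out_deg[u], 1) for u in in_nbrs[v])
    return (1 - damping) + damping * s

# toy graph: 1->2, 1->3, 2->3 (immutable structure)
in_nbrs = {1: [], 2: [1], 3: [1, 2]}
out_deg = {1: 2, 2: 1, 3: 0}
value = {v: 0.0 for v in (1, 2, 3)}   # the only mutable state

for it in range(2):                   # two iterations over all vertices
    for v in (1, 2, 3):
        value[v] = update(v, in_nbrs, value, out_deg)
```

Note that no edge is ever written: the same ranks are produced as in the edge-value formulation, but the only data written back is one value per vertex.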
VENUS Architecture
VENUS Architecture
Disk storage (offline)
• Sharding
• Separation of edge data and vertex data

Computing model (online)
• Load edge data sequentially
• Execute the update function on each vertex
• How to load vertex data and propagate updates?
Sharding
Graph cannot fit in RAM?
• Split the graph into shards

Each shard corresponds to an interval of vertices:
• G-shard: immutable structure of the graph
  • In-edges of nodes in the interval
• V-shard: mutable vertex values
  • Values of all vertices in the shard
Structure table: all g-shards
Value table: all vertex data (Vertex ID 1–12, one data slot per vertex)

Interval   I1=[1,4]           I2=[5,8]           I3=[9,12]
G-shard    7,9,10 → 1         6,7,8,11 → 5       2,3,4,10,11 → 9
           6,10 → 2           1,10 → 6           11 → 10
           1,2,6 → 3          3,10,11 → 7        4,6 → 11
           1,2,6,7,10 → 4     3,6,11 → 8         2,3,9,10,11 → 12
V-shard    I1∪{6,7,9,10}      I2∪{1,3,10,11}     I3∪{2,3,4,6}
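The table above follows a simple rule that a short sketch can reproduce: an edge goes into the g-shard of the interval containing its target, and the v-shard is the interval plus all in-edge sources outside it (helper names are illustrative):

```python
def build_shards(edges, intervals):
    """edges: list of (src, dst); intervals: list of (lo, hi), inclusive.
    Returns one (g_shard, v_shard) pair per interval."""
    shards = []
    for lo, hi in intervals:
        # g-shard: immutable in-edges of all vertices in the interval
        g = [(s, d) for (s, d) in edges if lo <= d <= hi]
        # v-shard: interval vertices plus every in-edge source
        v = set(range(lo, hi + 1)) | {s for (s, d) in g}
        shards.append((g, v))
    return shards

# edge list matching the example graph above (dst: [srcs])
edges = [(s, d) for d, srcs in {
    1: [7, 9, 10], 2: [6, 10], 3: [1, 2, 6], 4: [1, 2, 6, 7, 10],
    5: [6, 7, 8, 11], 6: [1, 10], 7: [3, 10, 11], 8: [3, 6, 11],
    9: [2, 3, 4, 10, 11], 10: [11], 11: [4, 6], 12: [2, 3, 9, 10, 11],
}.items() for s in srcs]

shards = build_shards(edges, [(1, 4), (5, 8), (9, 12)])
```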
Vertex-Centric Streamlined Processing
V-shards are much smaller than g-shards
• Load each v-shard entirely into memory

Scan each g-shard sequentially
• Execute the update function in parallel
Execution
Load v-shard 1
  scan g-shard 1: 7,9,10 → 1 | 6,10 → 2 | 1,2,6 → 3 | 1,2,6,7,10 → 4
Update v-shard 1 / Load v-shard 2
  scan g-shard 2: 6,7,8,11 → 5 | 1,10 → 6 | 3,10,11 → 7 | 3,6,11 → 8
Update v-shard 2 / Load v-shard 3
  scan g-shard 3: 2,3,4,10,11 → 9 | 11 → 10 | 4,6 → 11 | 2,3,9,10,11 → 12
Update v-shard 3

Parallelize execution and loading: update shard i while loading shard i+1
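The pipelining above can be sketched with a background prefetch thread, a minimal illustration using hypothetical load/update callbacks rather than the actual VENUS implementation:

```python
import threading

def run_pipeline(shards, load, update):
    """Overlap loading of shard i+1 with the update of shard i."""
    current = load(shards[0])              # load the first shard up front
    for i in range(len(shards)):
        box, t = [None], None
        if i + 1 < len(shards):
            def prefetch(j=i + 1):
                box[0] = load(shards[j])   # background load of next shard
            t = threading.Thread(target=prefetch)
            t.start()
        update(current)                    # compute on the current shard
        if t is not None:
            t.join()                       # wait for the prefetch to finish
            current = box[0]

# trace the order of operations on three shards
log = []
run_pipeline([1, 2, 3],
             load=lambda s: (log.append(('load', s)), s)[1],
             update=lambda s: log.append(('update', s)))
```

Each shard is fully loaded before it is updated, while the load of the next shard runs concurrently with the current update.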
Load and Update v-shards
Two I/O-efficient algorithms
• Algorithm 1: extension of PSW in GraphChi (skipped)
• Algorithm 2: merge-join
  • Load: merge-join between the value table and the v-shard
  • Update: write the values of interval [1,4] back to the value table

Use a value buffer to cache the value table

Value table on disk:             ID 1 2 3 4 5 6 7 8 9 10 11 12, each with its data
Vertices in v-shard 1 (on disk): ID 1 2 3 4 6 7 9 10
Loaded v-shard 1:                ID 1 2 3 4 6 7 9 10, each with its data
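The merge-join step can be sketched as one linear pass over the two sorted ID sequences (assuming, as in the example, that both the value table and the v-shard ID list are sorted by vertex ID; names are illustrative):

```python
def merge_join(value_table, vshard_ids):
    """value_table: list of (vertex_id, data) sorted by id.
    vshard_ids: sorted list of vertex ids needed by the v-shard.
    A single sequential scan of both yields the loaded v-shard."""
    loaded, i = {}, 0
    for vid, data in value_table:          # one pass over the value table
        while i < len(vshard_ids) and vshard_ids[i] < vid:
            i += 1
        if i < len(vshard_ids) and vshard_ids[i] == vid:
            loaded[vid] = data             # id present in the v-shard
            i += 1
    return loaded

# value table for vertices 1..12; v-shard 1 needs {1,2,3,4,6,7,9,10}
table = [(v, v * 10) for v in range(1, 13)]
vshard1 = merge_join(table, [1, 2, 3, 4, 6, 7, 9, 10])
```

Because both inputs are consumed strictly in order, the load is one sequential read, which is what makes the algorithm I/O-efficient on disk.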
Evaluation of VENUS
Setup: a commodity PC
• quad-core 3.4GHz CPU
• 16GB RAM and 4TB hard disk

Main competitors
• GraphChi and X-Stream

Applications
• PageRank
• WCC: weakly connected components
• CD: community detection
• ALS: alternating least squares for collaborative filtering
• Shortest path, label propagation, etc.
PageRank on Twitter
Twitter follow-graph: 41M nodes, 1.4B edges
Cost of update propagation: data writes and reads
Applications: WCC, CD, ALS
We could not implement CD on X-Stream due to its edge-centric programming model
Web-Scale Graph
Clueweb12: a web-scale graph
• 978 million nodes, 42.5 billion edges
• 402 GB on disk
• 2 iterations of PageRank

Computation time
• GraphChi: 4.3 hours
• X-Stream: 7.4 hours
• VENUS-I: 2 hours
• VENUS-II: 1.8 hours
Conclusion
We present VENUS, a disk-based graph computation system

Our design of graph storage and execution reduces data access and I/O

Evaluations show that VENUS outperforms GraphChi and X-Stream

VENUS also handles billion-scale problems
Thank you! Q&A