Fast Iterative Graph Computation with Block Updates: VLDB 2014 Talk

1. Fast Iterative Graph Computation with Block Updates
   Wenlei Xie*, Guozhang Wang+, David Bindel*, Alan Demers*, Johannes Gehrke*
   *Cornell University, +LinkedIn
2-3. Outline
   Motivation, Block model, Implementation, Experiments
4. Think-Like-A-Vertex Programming
   Vertex update function: computes the new state of vertex v from the current state of v, the state of the neighboring vertices of v, and the state of the incoming edges to v.
   Scheduling policy: decides the next vertex to be updated.
5. Vertex Model: Intuition
6. Simple Example: Shortest Path
   Vertex update function:
      VertexUpdate(Vertex v) {
        ForEach incoming edge (u, v) {
          v.dist = min(v.dist, u.dist + cost(u, v));
        }
      }
   Scheduling policy:
      Round robin: Bellman-Ford
      Prioritized: Dijkstra-like
7. Speedup for Different Applications
   [Figure panels: Belief Propagation, PageRank]
8-9. Throughput w.r.t. Computation Load
   [Figure regions: Computationally Light, Computationally Heavy; slide 9 adds the label "Memory Wall"]
10. Computationally Light Applications
   Many applications are computationally light: PageRank, Shortest Paths, Connected Component, Coordinate Descent.
   More CPUs do not imply better performance.
   Communication is the bottleneck.
11. Communication Bottleneck
   Also observed by researchers in the HPC community, in the context of PDE solvers [Barrett+94].
   Special graphs: grid, mesh.
   How does the HPC community overcome this bottleneck? Blocking [PattersonH96] [Anderson+99].
   Our observation: blocking can be applied to general graphs.
12. Outline
   Motivation, Block model, Implementation, Experiments
13. Block-Oriented Computation
   Block formulation:
      Block: a closely connected subgraph.
      The graph is pre-partitioned into disjoint blocks.
      Efficient software exists (e.g. METIS [KarypisAKS97]).
   Block update function: naturally extends the vertex update function.
14. Block Model: Intuition
15. Block Update Function
   How do we let the user specify the block update function?
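As a reference point for that question, the vertex-centric shortest-path example from slide 6 can be written out as runnable code. The following is a minimal Python sketch, not the authors' implementation: the function names `vertex_update`, `round_robin`, and `prioritized`, and the `in_edges` dictionary layout (vertex mapped to a list of `(source, cost)` pairs), are assumptions made for illustration, and edge costs are assumed to be non-negative.

```python
import heapq

INF = float("inf")

def vertex_update(v, dist, in_edges):
    """Relax every incoming edge (u, v); return True if dist[v] improved."""
    best = dist[v]
    for u, cost in in_edges.get(v, []):
        best = min(best, dist[u] + cost)
    changed = best < dist[v]
    dist[v] = best
    return changed

def round_robin(vertices, in_edges, source):
    """Round-robin policy (Bellman-Ford style): sweep until nothing changes."""
    dist = {v: INF for v in vertices}
    dist[source] = 0.0
    changed = True
    while changed:
        changed = False
        for v in vertices:
            if vertex_update(v, dist, in_edges):
                changed = True
    return dist

def prioritized(vertices, in_edges, source):
    """Prioritized policy (Dijkstra-like): expand the closest vertex first."""
    out_neighbors = {v: [] for v in vertices}
    for v, edges in in_edges.items():
        for u, _ in edges:
            out_neighbors[u].append(v)
    dist = {v: INF for v in vertices}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter distance was found later
        for v in out_neighbors[u]:
            if vertex_update(v, dist, in_edges):
                heapq.heappush(heap, (dist[v], v))
    return dist
```

Both drivers call the same `vertex_update`; only the order in which vertices are visited differs, which is the separation between update function and scheduling policy that the vertex model relies on.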
   Block programming abstraction? Recent work on sub-graph centric frameworks: Giraph++ [Tian+14], GoFFish [Simmhan+14].
16. Vertex Programming with Block Execution
   Sub-graph centric frameworks demonstrate the benefits of block updates, yet Think-Like-A-Vertex programming is more natural, and code for it already exists.
   Can we have the best of both worlds? Program in the Think-Like-A-Vertex style and execute with blocking to overcome the memory wall.
   Idea: define the block update as iteratively applying the vertex update.
   BlockUpdate = VertexUpdate + WithinBlockScheduler
17. Example: Eikonal PDE
   Eikonal PDE: widely used in physical simulations.
   Equations and figures adopted from "Fast two-scale methods for Eikonal equations" [ChaconV12].
18-19. Eikonal PDE (Contd.)
   An application over grid graphs.
   Natural block definition (slide 19).
20-24. Block Update in the Eikonal PDE
   Update the block based on boundary data.
   Fast Sweeping: sweep in four directions.
25. Vertex Programming with Block Execution
   Define the block update as iteratively applying the vertex update.
   Two-level scheduling:
      BlockUpdate = VertexUpdate + WithinBlockScheduler
         Simple and cheap scheduler: low overhead.
      AcrossBlockScheduler
         Scheduler at the block level.
         More intelligent (yet more expensive) scheduler to accelerate global convergence.
   Benefits: better cache utilization, reduced scheduling overhead.
26. Outline
   Motivation, Block model, Implementation, Experiments
27.
Implementation
   Following the BSP model: computation proceeds in iterations, and each vertex is processed at most once in each iteration.
   User-customizable functions for flexible scheduling.
28. Across Block Scheduling
   Dynamic block scheduling.
   Block priorities: an aggregation of the vertices' priorities (max, min, sum, avg).
   Maintaining block priorities. Optimization: maintain the aggregation incrementally.
29. Within Block Scheduling
   Low-overhead vertex scheduling is preferred.
   Static within-block scheduling:
      Inside-block ordering.
      Maximum number of inner iterations.
      Termination at block convergence.
   FIFO-style dynamic scheduling.
30. Outline
   Motivation, Block model, Implementation, Experiments
31. Experiments
   Micro benchmark: cache utilization.
   Experiments for different applications: Personalized PageRank, Single-Source Shortest Path, Etch Simulation.
32. Effect of Block Strategy
   [Figure: Time (sec) vs. block-level scheduling policy]
33. Effect on Within Block Scheduling
   [Figure: Time (sec) vs. max inner iterations]
34. The End
   Thank you! Questions?
35. References
[Barrett+94] R. Barrett, et al. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, 2nd edition, 1994.
[PattersonH96] D. Patterson and J. Hennessy. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, 2nd edition, 1996.
[Anderson+99] E. Anderson, et al. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, 3rd edition, 1999.
[KarypisAKS97] G. Karypis, et al. Multilevel hypergraph partitioning: Application in VLSI domain. In DAC, 1997.
[Zhao07] H. Zhao. Parallel implementation of fast sweeping method. Journal of Computational Mathematics, 25(4):421-429, 2007.
[ChaconV12] A. Chacon and A. Vladimirsky. Fast two-scale methods for Eikonal equations. SIAM J. Scientific Computing, 34(2), 2012.
[Recht+11] B. Recht, C. Re, S. J. Wright, and F. Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, 2011.
[Dean+12] J. Dean, et al.
Large scale distributed deep networks. In NIPS, 2012.
[Tian+14] Y. Tian, et al. From think like a vertex to think like a graph.
[Simmhan+14] Y. Simmhan, et al. GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics.
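The two-level scheduling from slides 25, 28, and 29 can be illustrated on single-source shortest paths. This is a minimal sequential Python sketch under stated assumptions, not the system presented in the talk (which follows the BSP model): the names `block_update` and `run_blocked_sssp` are hypothetical, the within-block scheduler is a static sweep bounded by `max_inner_iters`, the across-block scheduler picks the block with the smallest priority, and a block's priority is assumed to be the `min` aggregation of the distances that dirtied its vertices. Edge costs are assumed to be non-negative.

```python
INF = float("inf")

def vertex_update(v, dist, in_edges):
    """User-written vertex update: relax all incoming edges of v."""
    best = dist[v]
    for u, cost in in_edges.get(v, []):
        best = min(best, dist[u] + cost)
    changed = best < dist[v]
    dist[v] = best
    return changed

def block_update(block, dist, in_edges, max_inner_iters):
    """BlockUpdate = VertexUpdate + WithinBlockScheduler.

    Static within-block scheduling: sweep the block in a fixed order,
    at most max_inner_iters times, stopping early at block convergence.
    """
    changed, converged = set(), False
    for _ in range(max_inner_iters):
        sweep_changed = False
        for v in block:
            if vertex_update(v, dist, in_edges):
                changed.add(v)
                sweep_changed = True
        if not sweep_changed:
            converged = True
            break
    return changed, converged

def run_blocked_sssp(blocks, in_edges, source, max_inner_iters=2):
    """Across-block scheduler: always update the block with the best priority."""
    block_of = {v: b for b, block in enumerate(blocks) for v in block}
    out_neighbors = {v: [] for v in block_of}
    for v, edges in in_edges.items():
        for u, _ in edges:
            out_neighbors[u].append(v)
    dist = {v: INF for v in block_of}
    dist[source] = 0.0
    # Block priority = min over the priorities of its dirty vertices (assumed).
    block_prio = [INF] * len(blocks)
    for v in out_neighbors[source]:
        block_prio[block_of[v]] = 0.0
    block_updates = 0
    while True:
        b = min(range(len(blocks)), key=block_prio.__getitem__)
        if block_prio[b] == INF:
            break  # no block has pending work
        block_prio[b] = INF
        changed, converged = block_update(blocks[b], dist, in_edges, max_inner_iters)
        block_updates += 1
        for u in changed:
            for v in out_neighbors[u]:
                nb = block_of[v]
                # Incrementally maintain the min aggregation per block.
                if nb != b or not converged:
                    block_prio[nb] = min(block_prio[nb], dist[u])
    return dist, block_updates
```

The user only supplies `vertex_update`; the cheap inner loop handles vertices inside a block, and the more expensive priority decision is made once per block rather than once per vertex, which is where the reduced scheduling overhead in this sketch comes from.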