Presentation to the 9th International Conference on Parallel Processing and Applied Mathematics (PPAM11)
Parallel Community Detection for Massive Graphs
E. Jason Riedy, Henning Meyerhenke, David Ediger, and David A. Bader
13 September 2011
Main contributions
• First massively parallel algorithm for community detection.
• Very general algorithm using a weighted matching and applying to many optimization criteria.
• Partition a 122 million vertex, 1.99 billion edge graph into communities in 2 hours.
PPAM11—Parallel Community Detection—Jason Riedy 2/25
Exascale data analysis
Health care: Finding outbreaks, population epidemiology
Social networks: Advertising, searching, grouping
Intelligence: Decisions at scale, regulating algorithms
Systems biology: Understanding interactions, drug design
Power grid: Disruptions, conservation
Simulation: Discrete events, cracking meshes
• Graph clustering is common in all application areas.
These are not easy graphs.
Yifan Hu’s (AT&T) visualization of the LiveJournal data set
But no shortage of structure...
Protein interactions, Giot et al., “A Protein Interaction Map of Drosophila melanogaster”, Science 302, 1722-1736, 2003.
Jason’s network via LinkedIn Labs
• Locally, there are clusters or communities.
• First pass over a massive social graph:
  • Find smaller communities of interest.
  • Analyze / visualize top-ranked communities.
• Our part: First massively parallel community detection method.
Outline
Motivation
Defining community detection and metrics
Our parallel method
Results
  Implementation and platform details
  Performance
Conclusions and plans
Community detection
What do we mean?
• Partition a graph’s vertices into disjoint communities.
• A community locally maximizes some metric.
  • Modularity, conductance, ...
• Trying to capture that vertices are more similar within one community than between communities.
Jason’s network via LinkedIn Labs
Community detection
Assumptions
• Disjoint partitioning of vertices.
• There is no one unique answer.
  • Some metrics are NP-complete to optimize [Brandes, et al.].
• Graph is lossy representation.
• Want an adaptable detection method.
Jason’s network via LinkedIn Labs
Common community metric: Modularity
• Modularity: Deviation of connectivity in the community induced by a vertex set S from some expected background model of connectivity.
• We take [Newman]’s basic uniform model.
• Let m count all edges in graph G, m_S count the edges with both endpoints in S, and x_S count the edges with any endpoint in S. Modularity Q_S:
  $Q_S = \left( m_S - x_S^2 / 4m \right) / m$
• Total modularity: the sum of the modularities of the partition’s subsets.
• A sufficiently positive modularity implies some structure.
• Known issues: Resolution limit, NP-complete opt. prob.
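The modularity formula above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names are hypothetical, and it assumes x_S counts edge endpoints in S (so an edge inside S counts twice), which is the reading under which the formula agrees with Newman's modularity.

```python
from collections import defaultdict


def community_modularity(edges, community):
    """Return {community id: Q_S} for an undirected, unweighted edge list.

    `community` maps each vertex to its community id.
    """
    m = len(edges)
    m_s = defaultdict(int)  # edges with both endpoints in S
    x_s = defaultdict(int)  # edge endpoints in S (assumed reading of x_S)
    for u, v in edges:
        cu, cv = community[u], community[v]
        x_s[cu] += 1
        x_s[cv] += 1
        if cu == cv:
            m_s[cu] += 1
    return {c: (m_s[c] - x_s[c] ** 2 / (4 * m)) / m for c in x_s}


def total_modularity(edges, community):
    """Sum Q_S over all subsets of the partition."""
    return sum(community_modularity(edges, community).values())


# Two triangles joined by one edge, partitioned into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(total_modularity(edges, split))
```

For this example each triangle has m_S = 3 and x_S = 7 with m = 7, so the total is 5/14, about 0.357. Putting all six vertices in one community gives a total of 0.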
Sequential agglomerative method
[Figure: example graph on vertices A through G]
• A common method (e.g. [Clauset, et al.]) agglomerates vertices into communities.
• Each vertex begins in its own community.
• An edge is chosen to contract.
  • Merging maximally increases modularity.
  • Priority queue.
• Known often to fall into an O(n^2) performance trap with modularity [Wakita & Tsurumi].
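The sequential scheme can be sketched as follows. This is a minimal illustration, not the [Clauset, et al.] implementation: a linear scan over community pairs stands in for the priority queue, the function name is hypothetical, vertex ids are assumed to be comparable (e.g. integers), and the loop stops when no merge has positive gain, a stopping rule the slides do not state. The gain of merging communities a and b follows from Q_S: (e_ab - x_a x_b / 2m) / m, where e_ab counts the edges between them and x counts edge endpoints.

```python
def greedy_agglomerate(edges):
    """Repeatedly contract the community pair with the largest modularity gain."""
    m = len(edges)
    between = {}  # between[a][b]: number of edges joining communities a and b
    x = {}        # x[a]: edge endpoints in community a
    for u, v in edges:
        x[u] = x.get(u, 0) + 1
        x[v] = x.get(v, 0) + 1
        between.setdefault(u, {})
        between.setdefault(v, {})
        if u != v:
            between[u][v] = between[u].get(v, 0) + 1
            between[v][u] = between[v].get(u, 0) + 1
    members = {v: {v} for v in x}

    while True:
        best = None  # (gain, a, b) of the best merge found in this pass
        for a, nbrs in between.items():
            for b, e_ab in nbrs.items():
                if a < b:
                    gain = (e_ab - x[a] * x[b] / (2 * m)) / m
                    if gain > 0 and (best is None or gain > best[0]):
                        best = (gain, a, b)
        if best is None:
            return list(members.values())
        _, a, b = best
        # Contract b into a: merge members, endpoint counts, and adjacency.
        members[a] |= members.pop(b)
        x[a] += x.pop(b)
        for c, w in between.pop(b).items():
            del between[c][b]
            if c != a:
                between[a][c] = between[a].get(c, 0) + w
                between[c][a] = between[a][c]


# Two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = greedy_agglomerate(edges)
```

Only one pair merges per pass, so the work is inherently serial, and unbalanced merges are what lead to the quadratic behavior noted above.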
New parallel method
[Figure: example graph on vertices A through G]
• We use a matching to avoid the queue.
• Compute a heavy weight, large matching.
  • Greedy algorithm.
  • Maximal matching.
  • Within factor of 2 in weight.
• Merge all communities at once.
• Maintains some balance.
• Produces different results.
• Agnostic to weighting, matching...
  • Can maximize modularity, minimize conductance.
  • Modifying matching permits easy exploration.
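The score, match, and contract loop can be sketched sequentially in Python. This is not the authors' parallel implementation: a sort-based greedy matching stands in for their parallel greedy matching, all function names are hypothetical, vertex ids are assumed comparable, and the modularity gain (w - x_a x_b / 2m) / m is used as the edge score, with x counting edge endpoints.

```python
def score_edges(weights, x, m):
    """Modularity gain of merging the endpoints of each community-graph edge."""
    return {(a, b): (w - x[a] * x[b] / (2 * m)) / m
            for (a, b), w in weights.items()}


def greedy_matching(scores):
    """Greedy maximal matching over positively scored edges, heaviest first."""
    partner = {}
    for (a, b), s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if s <= 0:
            break  # remaining merges would not improve the metric
        if a not in partner and b not in partner:
            partner[a], partner[b] = b, a
    return partner


def contract(weights, x, partner):
    """Merge every matched pair at once; the smaller id names the result."""
    def rep(v):
        return min(v, partner.get(v, v))

    new_x, new_weights = {}, {}
    for v, xv in x.items():
        new_x[rep(v)] = new_x.get(rep(v), 0) + xv
    for (a, b), w in weights.items():
        ra, rb = rep(a), rep(b)
        if ra != rb:  # edges inside a merged pair drop out of the edge list
            key = (min(ra, rb), max(ra, rb))
            new_weights[key] = new_weights.get(key, 0) + w
    return new_weights, new_x


def matching_agglomerate(edges):
    """Return {vertex: community id} after repeated match-and-contract passes."""
    m = len(edges)
    weights, x = {}, {}
    for u, v in edges:
        x[u] = x.get(u, 0) + 1
        x[v] = x.get(v, 0) + 1
        if u != v:
            key = (min(u, v), max(u, v))
            weights[key] = weights.get(key, 0) + 1
    label = {v: v for v in x}
    while True:
        partner = greedy_matching(score_edges(weights, x, m))
        if not partner:
            return label
        weights, x = contract(weights, x, partner)
        label = {v: min(c, partner.get(c, c)) for v, c in label.items()}


label = matching_agglomerate([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])
```

Because each pass merges many disjoint pairs rather than the single best pair, the merge order, and so the final communities, can differ from the sequential method. Swapping `score_edges` for another metric changes the optimization criterion without touching the matching or contraction steps.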
Implementation: Cray XMT
Tolerates latency by massive multithreading.
• Hardware: 128 threads per processor
  • Every-cycle context switch
  • Many outstanding memory requests (180/proc)
• Flexibly supports dynamic load balancing
  • Globally hashed address space, no data cache
• Support for fine-grained, word-level synchronization
  • Full/empty bit with every memory word
• 128 processor XMT at Pacific Northwest Nat’l Lab
• 500 MHz processors, 16384 threads, 1 TB of shared memory
Image: cray.com
Implementation: Data structures
Extremely basic for graph G = (V,E)
• An array of (i, j; w) weighted edge pairs, i < j, |E|
• An array to store self-edges, d(i) = w, |V|
• A temporary floating-point array for scores, |E|
• Two additional temporary arrays (|V|, |E|) to store degrees, matching choices, linked list for compression, ...
• Weights count number of agglomerated vertices or edges.
• Scoring methods need only vertex-local counts.
• Greedy matching parallelizes trivially over the edge array.
• Contraction compacts the edge list in place through a temporary, hashed linked list of duplicates.
  • Full-empty bits or compare&swap keep correctness.
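A contraction step over this layout might look like the sketch below. It is an assumption-laden illustration rather than the XMT code: a Python dict stands in for the temporary hashed linked list of duplicates, the result is a new list instead of an in-place compaction, and `contract_edges` and `relabel` are hypothetical names.

```python
def contract_edges(edges, self_w, relabel):
    """Relabel an (i, j, w) edge array and merge duplicate edges.

    edges   : list of (i, j, w) with i < j
    self_w  : {vertex: self-edge weight d(i)}
    relabel : {vertex: community id after this round's merges}
    """
    new_self = {}
    for v, w in self_w.items():
        c = relabel[v]
        new_self[c] = new_self.get(c, 0) + w

    merged = {}  # stand-in for the hashed linked list of duplicates
    for i, j, w in edges:
        a, b = relabel[i], relabel[j]
        if a == b:
            # The edge now lies inside one community: fold it into the self-edge.
            new_self[a] = new_self.get(a, 0) + w
        else:
            key = (a, b) if a < b else (b, a)
            merged[key] = merged.get(key, 0) + w
    return [(a, b, w) for (a, b), w in merged.items()], new_self


edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 1)]
self_w = {0: 0, 1: 0, 2: 0, 3: 0}
new_edges, new_self = contract_edges(edges, self_w, {0: 0, 1: 0, 2: 2, 3: 2})
```

Here vertices 0 and 1 merge, as do 2 and 3. The two parallel edges between the merged communities collapse into one edge of weight 2, and each absorbed edge adds 1 to its community's self-edge weight.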
Data and experiment
• Generated R-MAT graphs [Chakrabarti, et al.] (a = 0.55, b = c = 0.1, and d = 0.25) and extracted the largest component.

Scale  Fact.  |V|       |E|          Avg. degree  Edge group
18     8      236 605   2 009 752    8.5          2M
18     16     252 427   3 936 239    15.6         4M
18     32     259 372   7 605 572    29.3         8M
19     8      467 993   3 480 977    7.4          4M
19     16     502 152   7 369 885    14.7         8M
19     32     517 452   14 853 837   28.7         16M

• Three runs with each parameter setting.
• Tested both modularity (CNM for Clauset-Newman-Moore) and McCloskey-Bader’s filtered modularity (MB).
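For readers unfamiliar with R-MAT, a minimal generator sketch follows. It assumes the usual convention of 2^scale vertices and edge factor × 2^scale generated edges, which is consistent with the table above. It omits the largest-component extraction and any noise or vertex permutation a production generator might add, and `rmat_edges` and its `seed` parameter are hypothetical names.

```python
import random


def rmat_edges(scale, edge_factor, a=0.55, b=0.1, c=0.1, d=0.25, seed=0):
    """Generate edge_factor * 2**scale R-MAT edges on 2**scale vertices.

    Each edge picks one quadrant of the adjacency matrix per bit of the
    vertex ids, with probabilities a, b, c, d (d is implied by the other three).
    """
    rng = random.Random(seed)
    edges = []
    for _ in range(edge_factor * (1 << scale)):
        i = j = 0
        for _ in range(scale):
            r = rng.random()
            i <<= 1
            j <<= 1
            if r < a:
                pass            # top-left quadrant
            elif r < a + b:
                j |= 1          # top-right quadrant
            elif r < a + b + c:
                i |= 1          # bottom-left quadrant
            else:
                i |= 1          # bottom-right quadrant
                j |= 1
        edges.append((i, j))
    return edges


sample = rmat_edges(scale=4, edge_factor=8, seed=1)
```

The raw output can contain self-loops and duplicate edges. Those would need to be handled, and the largest component extracted, to reproduce the setup described in the slides.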
Performance: Time
[Figure: time in seconds (log scale, 2^2 to 2^10) against number of processors (1 to 128), one panel each for CNM and MB, one series per edge group (2M, 4M, 8M, 16M).]
Time labels on the CNM panel, in plot order: 2.7s, 51.4s; 4.3s, 144.6s; 9.8s, 464.4s; 19.8s, 1321.1s.
Time labels on the MB panel, in plot order: 2.7s, 51.9s; 4.5s, 144.7s; 10.0s, 460.8s; 19.7s, 1316.6s.
Performance: Scaling
[Figure: speedup against number of processors (both axes 1 to 128), one panel each for CNM and MB, one series per edge group (2M, 4M, 8M, 16M).]
Labeled speedups, CNM: 2M: 19.2x at 32P; 4M: 33.4x at 48P; 8M: 48.6x at 96P; 16M: 66.8x at 128P.
Labeled speedups, MB: 2M: 19.4x at 48P; 4M: 32.5x at 48P; 8M: 48.0x at 96P; 16M: 66.7x at 128P.
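As a reading aid, the labeled CNM speedups can be converted to parallel efficiency (speedup divided by processor count). The efficiency figures are derived here and do not appear in the slides; `parallel_efficiency` is a hypothetical helper name.

```python
# Labeled CNM speedups from the scaling plot: edge group -> (speedup, processors).
cnm_speedup = {"2M": (19.2, 32), "4M": (33.4, 48), "8M": (48.6, 96), "16M": (66.8, 128)}


def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


for group, (speedup, procs) in cnm_speedup.items():
    print(f"{group}: {parallel_efficiency(speedup, procs):.2f} at {procs}P")
```

The labeled points sit between roughly 0.5 and 0.7 of linear speedup.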
Performance: Modularity
[Figure: modularity (axis 0.1 to 0.4) by scoring method (CNM, MB) for two implementations (Parallel, SNAP).]
Point labels as (scale, edge factor): modularity, in plot order:
(18,16): 0.171; (18,32): 0.084; (18,8): 0.240
(18,16): 0.187; (18,32): 0.104; (18,8): 0.319
(18,16): 0.425; (18,32): 0.418; (18,8): 0.423
(18,16): 0.013; (18,32): 0.009; (18,8): 0.023
• SNAP: GT’s sequential implementation.
• Plain modularity (CNM) known for too-large communities (resolution limit).
• MB filters out statistically negligible merges, smaller communities & modularity.
• Our more balanced, parallel algorithm falls between the two?
Performance: Large and real-world
Large R-MAT: Scale 27, edge factor 16 generated a component with 122 million vertices and 1.99 billion edges.
  7258 seconds ≈ 2 hours on 128P
  R-MAT known not to have much community structure...
LiveJournal: 4.8 million vertices, 68 million edges
  2779 seconds ≈ 46 minutes on 128P
  Long tail of around 1 000 matched pairs per iteration.
  Tail is important to modularity and community count, but to users?
Conclusions and plans
• First massively parallel algorithm based on matching for agglomerative community detection.
  • Non-deterministic algorithm, still studying impact.
• Now can easily experiment with agglomerative community detection on real-world graphs.
  • How volatile are modularity and conductance to perturbations?
  • What matching schemes work well?
  • How do different metrics compare in applications?
• Experimenting on OpenMP platforms.
  • CSR format is more cache-friendly, but uses more memory.
• Extending to streaming graph data!
• Code available on request, to be integrated into GT packages like GraphCT.
And if you can do better...
Then please do!
10th DIMACS Implementation Challenge: Graph Partitioning and Graph Clustering
• 12-13 Feb, 2012 in Atlanta, http://www.cc.gatech.edu/dimacs10/
• Paper deadline: 21 October, 2011, http://www.cc.gatech.edu/dimacs10/data/call-for-papers.pdf
• Challenge problem statement and evaluation rules: http://www.cc.gatech.edu/dimacs10/data/dimacs10-rules.pdf
• Co-sponsored by DIMACS, by the Command, Control, and Interoperability Center for Advanced Data Analysis (CCICADA); Pacific Northwest National Lab.; Sandia National Lab.; and Deutsche Forschungsgemeinschaft (DFG).
Acknowledgment of support
Bibliography I
D. Bader and J. McCloskey. Modularity and graph algorithms. Presented at UMBC, Sept. 2009.
J. Berry, B. Hendrickson, R. LaViolette, and C. Phillips. Tolerating the community detection resolution limit with edge weighting. CoRR, abs/0903.1072, 2009.
U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner. On modularity clustering. IEEE Trans. Knowledge and Data Engineering, 20(2):172–188, 2008.
Bibliography II
D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In Proc. 4th SIAM Intl. Conf. on Data Mining (SDM), Orlando, FL, Apr. 2004. SIAM.
A. Clauset, M. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111, 2004.
M. Newman. Modularity and community structure in networks. Proc. of the National Academy of Sciences, 103(23):8577–8582, 2006.
K. Wakita and T. Tsurumi. Finding community structure in mega-scale social networks. CoRR, abs/cs/0702048, 2007.