Graph Drawing Algorithms in Information Visualization
Yaniv Frishman
Technion - Computer Science Department - Ph.D. Thesis PHD-2009-02 - 2009
Graph Drawing Algorithms in Information Visualization
Research Thesis
In Partial Fulfillment of the Requirements for the
Degree of Doctor of Philosophy
Yaniv Frishman
Submitted to the Senate of the Technion - Israel Institute of Technology
Tevet, 5769 Haifa January, 2009
This Research Thesis Was Done Under The Supervision of
Prof. Ayellet Tal
in the Department of Computer Science.
The Generous Financial Help of the Technion is Gratefully Acknowledged.
Acknowledgements
Obtaining a Ph.D. is a great privilege. There are many people I would like to thank for helping me with this achievement.
I would like to express my gratitude to my advisor Prof. Ayellet Tal for supporting me during the different stages of this long journey. I would especially like to thank her for providing feedback and suggestions for improving my work. Thanks to her guidance, my presentation and writing skills have improved markedly, a valuable skill of its own.
I would like to thank my loving wife Maya for her support, encouragement and understanding, especially when I was occupied with my studies and consequently not there for her. It was very rewarding to share many joyful moments with her along the way.
This achievement would not have been possible without the help, support and encouragement of my parents Miriam and Dov. I would also like to thank them for all they have done for me. Special thanks go to my mother for taking an active part in producing my papers and accompanying videos. I would like to thank my brothers Etai and Ofri for their support. I would also like to mention my grandmother Bela who always helps me and encourages me. I would also like to dedicate this dissertation to the memory of my grandfather Solomon for his belief in the value of a higher education.
I would like to thank my parents-in-law Kitty and Arie and Maya’s grandmother for supporting me and providing an environment where I could concentrate on my studies.
Special thanks go to my friends Sivan Bercovici, Dr. Avi Steiner and Dr. Amit Mizrachi for their help and support along the way.
Contents
Abstract
List of Symbols and Abbreviations
1 Introduction
1.1 Information Visualization
1.2 Graph Drawing
1.3 Graphics Processing Units
1.4 Outline and Main Contributions
2 Related Work
2.1 Graph Drawing
2.2 General Purpose Computation on Graphics Processing Units (GPGPU)
3 Multi-Level Graph Layout on the GPU
3.1 Introduction
3.2 Related Work
3.3 Spectral Graph Partitioning
3.4 Multi-level layout Algorithm
3.5 GPU Implementation
3.6 Results
3.7 Visualization of ISP Router Networks
3.8 Conclusion and Future Work
4 Uncluttering Graph Layouts Using Anisotropic Diffusion and Mass Transport
4.1 Introduction
4.2 Related Work
4.3 The Algorithm
4.4 Computing an Optimal Mapping
4.5 Implementation on the GPU
4.6 Results
4.7 Conclusion and Future Work
5 Online Dynamic Graph Drawing
5.1 Introduction
5.2 Related Work
5.3 Overview
5.4 Algorithm
5.4.1 Computing Dynamic Layouts
5.4.2 Computing the Initial Layout L0
5.5 Implementation
5.6 Results
5.7 Application to Discussion Thread Visualization
5.8 Application to Social Network Visualization
5.9 Conclusion and Future Work
6 Dynamic Drawing of Clustered Graphs
6.1 Introduction
6.2 Problem Statement
6.3 The Algorithm
6.3.1 Overview
6.3.2 Supporting Clusters
6.3.3 Minimizing Visual Changes
6.3.4 Merging Graphs
6.3.5 Improving the Layout
6.3.6 Display and Animation
6.4 Visualizing Mobile Object Software
6.5 Conclusion and Future Work
7 MOVIS: A system for Visualizing Distributed Mobile Object Environments
7.1 Introduction
7.2 Related Work
7.3 Requirements
7.4 Physical and Logical Visualization
7.5 Visualization Consistency
7.6 Visualization Scalability
7.6.1 Levels of Detail
7.6.2 Clustering
7.7 Implementation
7.7.1 Event Generation
7.7.2 Event Synchronization Component
7.8 Results
7.9 Conclusion and Future Work
8 Conclusions
8.1 Contribution and Summary
8.2 Future Research
Bibliography
List of Figures
1.1 A straight-edge layout of an undirected, labeled graph.
1.2 A comparison of the peak floating-point calculation rate in giga floating point operations per second (GFLOPS) of Intel CPUs and ATI and NVIDIA GPUs. Image is reproduced from [164].
3.1 ISP router map. Each node represents a router. Edges link routers. Red nodes are external to the ISPs visualized. Other nodes are colored according to the ISP they belong to: green - Abovenet (US, 664 routers); blue - Exodus (US, 551 routers); black - Tiscali (Europe, 513 routers). A total of 5044 routers and 8043 connections are shown.
3.2 The power iteration algorithm
3.3 Algorithm overview
3.4 Representing a graph on the GPU. Left: A graph spatially partitioned into partitions; right: a corresponding location texture
3.5 Representing graph edges on the GPU. Node X has three neighbors: Y, Z and W.
3.6 Execution graph of GPU layout (rectangles = streams, ovals = kernels)
3.7 bcsstk31. Red: our layout, black: FM³ layout
3.8 Sierpinski 08. Red: our layout, black: FM³ layout
3.9 finan512. Red: our layout, black: FM³ layout
3.10 flower B. Red: our layout, black: FM³ layout
3.11 4elt. Red: our layout, black: Kamada-Kawai layout
3.12 ISP router map. Each node represents a router. Edges link routers. Red nodes are external to the ISPs visualized. Other nodes are colored according to the ISP they belong to: blue - Abovenet (US, 665 routers); black - Exodus (US, 554 routers); yellow - Ebone (Europe, 314 routers); pink - Tiscali (Europe, 514 routers); brown - Telstra (Australia, 3756 routers). A total of 10895 routers and 15667 connections are shown. Top left - GRIP layout. Bottom right - our layout.
4.1 Protein graph (V=30727, E=1206654). (a) FM³ [91] layout. (b) Improved layout. Note how displacing nodes outwards allows more details to become visible, especially in the center of the drawing. Also note that the overall structure of the graph is maintained.
4.2 Comparison between node overlap removal and graph uncluttering. (a) is a layout produced using neato [79] of a reduced version of the bcsstk32 graph from [204]. In (b) the node overlap removal algorithm from [77] is used. Note that although the overlaps between nodes are eliminated, the structure of the graph is not maintained and the center of the layout is cluttered. In (c) our algorithm is used. Note how the cluttered right side of the input layout is expanded, thus increasing node separation, while the structure of the graph is maintained.
4.3 Algorithm steps. Higher intensity represents higher values. Values are scaled to improve contrast.
4.4 Execution graph of finding the best advancement direction on the GPU in Step 3 (rectangles = textures, ovals = kernels, θ is the current direction being tested)
4.5 ug 380 graph (V=1104, E=3231). Note how when using our algorithm the center expands, reducing node density while the outer ring is unchanged. When using [49] the layout is hardly changed.
4.6 Add32 graph (V=4960, E=9462). Note how in (c) each of the rings is expanded, showing more detail.
4.7 ISP router graph (V=5044, E=8043). Nodes are color-coded by the ISP they belong to. Note how in (c) the blue nodes are uncluttered.
4.8 Bcsstk32 graph (V=44609, E=985046). Note how in (c) reducing the node density allows more of the mesh structure of the graph to be uncovered in the top left, bottom and middle of the graph.
5.1 Snapshots from the threads1 graph sequence, visualizing discussion threads at http://www.dailytech.com, left to right. Node labels in red show user names, edges link users replying to posted comments. Up to 119 users are shown. Discussion topics, marked as blue A n nodes, include GPUs (A 4864, A 4285), chipsets (A 4637, A 4425, A 4538 and A 4866) and CPUs (A 4589). A total of 144 messages are visualized.
5.2 Dynamic layout steps: (a) previous layout, Li−1 (b) merged graph (Step 1), color coded according to the positioning score Γ(v). Brighter nodes have a higher Γ. Here, nodes with Γ ∈ {0.1, 0.25, 1} are shown. (c) Pinning weights wpin(v) (Step 2). Brighter color corresponds to a higher wpin(v) (d) Final layout (Step 5), color coded according to the partitioning (Step 4)
5.3 Parallel force directed layout algorithm
5.4 Partition size effect on layout, graph bcsstk31, |V| = 35588, |E| = 572916
5.5 Sorting nodes by pinning weight wpin on the GPU. (a): A location texture separated to regions, color coded by the partition each node belongs to. (b): Nodes in each region are sorted from low wpin to high wpin.
5.6 Snapshots from layouts of the 3elt sequence (|V| ≈ 4000, |E| ≈ 10,500), left-to-right, top-to-bottom
5.7 Snapshots from the layouts of the newcomb fraternity data [152]. Left: our algorithm. Right: SoNIA algorithm [11, 12], used in [149].
5.8 Snapshots from the threads2 graph sequence, visualizing discussion threads at http://www.dailytech.com, left to right, top to bottom. 109 messages from 86 users in 5 discussion threads are shown. Discussion topics, marked as blue A n nodes, include computer games (A 5054), nuclear fusion (A 5027), low-cost PCs (A 5060), Windows/Linux switch (A 5069) and Christmas e-shopping (A 5082)
5.9 Snapshots from the Rimzu graph sequence, visualizing the social network at http://www.rimzu.com, left to right, top to bottom. Nodes represent users and edges represent connections between users. In the visualization the graph grows from V=216, E=544 to V=962, E=1561. Nodes are colored by age in a red → yellow → green scale.
6.1 Snapshots from an animation sequence
6.2 Incremental vs. non-incremental layout (from left to right)
6.3 Algorithm overview in pseudo-code
6.4 3D view of a clustered graph
6.5 2D view of a clustered graph
6.6 Comparing the three layout algorithms
6.7 Sample animation sequence (from left to right and top to bottom)
6.8 Density metric
6.9 Sum of cluster displacements
6.10 Number of clusters with the same size
6.11 Running times
7.1 MOVIS user interface. Small rectangles represent mobile objects. Color stripes show their movement history. Big rectangles represent the cores the objects reside in. Dashed lines represent physical communication between cores. Higher communication frequency is indicated by a higher frequency of alternation in the lines. Solid lines represent logical connections between objects. The square in the middle of the figure represents several cores which have been collapsed. The rectangle with a double boundary was selected by the user as the current focus of attention core.
7.2 Levels of detail. Several visualizations of the same mobile object network are shown. Parts of the graph are progressively collapsed. Note the stability in the layouts and the conservation of the overall structure of the graph.
7.3 Focus-based clustering algorithm
7.4 Event synchronization algorithm
7.5 Sample animation sequence of the mobile object simulator (from left to right and top to bottom)
7.6 Mailbox mobility in the DEM system. (a) Before movement. (b) A new core was created. A mailbox migrated to it.
7.7 Sending an e-mail in the DEM system
List of Tables
3.1 Graph information and running time [sec.]. Runtime columns show total running times for computing a layout.
4.1 Graph information and running times. The left side of the table gives information about the graphs. V and E are the number of graph nodes and edges, respectively. The central part of the table gives the running times in seconds of the algorithm from [49], using the same machine used to run our algorithm. The right side of the table shows the results of our algorithm. The width and height in pixels of the density image used is equal to √P. ITRS is the number of iterations of Equation 4.7 in Step 5. CPU is the total running time of the algorithm in seconds when using only the CPU. CPU+GPU is the total running time of the algorithm in seconds when using the GPU to accelerate Step 3.
5.1 Layout quality - values are averages for a sequence of layouts
5.2 Graph sequence information.
5.3 Running times [sec.]. The running times of the CPU only and GPU-accelerated implementation of the algorithm are shown. All times shown are total running times for computing a layout. Dynamic layout times are averaged over a sequence of layouts.
6.1 Average results of an animation sequence
Abstract
Information Visualization is the use of computer-supported, interactive, visual representations of data, and in particular, abstract data, to amplify cognition. Graph drawing addresses the problem of creating geometric representations of graphs.
This thesis addresses several related problems in graph drawing. First, we address the problem of quickly creating a layout of a large, general graph. We devise a multi-level algorithm which is based on spectral partitioning. The algorithm is able to produce aesthetic layouts in a fraction of the time required by existing algorithms. Next, we discuss an algorithm for improving an existing layout. This is done by warping the coordinates of the nodes, thus making use of empty and sparse regions in the image of the layout. We then turn our attention to dynamic graph drawing. The challenge here is to compute a stable and aesthetic layout while still making it easy for the user to comprehend the changes. The first dynamic algorithm discussed is a multi-level online incremental algorithm for drawing general graphs. Assigning different movement flexibilities to the nodes of the graph allows efficiently creating a stable layout. The second dynamic algorithm discussed is an online dynamic algorithm for drawing graphs which contain an inherent grouping into clusters. The algorithm uses node pinning, invisible spacer nodes and edge lengths and weights in order to minimize changes to the clustered structure of the graph.
In recent years, the programmability and computational power of commodity graphics processing units (GPUs) have increased tremendously. GPUs, traditionally used for graphics-related applications, are now being employed in many data-parallel problems. Unlike matrices and images, graphs are unstructured and hence graph layout does not seem to be suitable for acceleration on the GPU. In this research we present methods to accelerate both static and dynamic graph drawing using a GPU. In addition, using GPU-accelerated ray-casting, we are able to accelerate our graph improvement algorithm significantly. In all cases, using a GPU allows performing the required computation in a matter of seconds, even for large graphs. Accelerating static layout, dynamic layout and graph improvement problems by factors of 5.5, 17 and 135, respectively, is demonstrated.
The algorithms developed during this research have been used in different information visualization applications. We visualize the structure of the networks of Internet service providers using the static layout algorithm. Our graph improvement algorithm has been applied to improving graphs computed by various state-of-the-art algorithms on several applications, including bioinformatics, social interactions and finite element meshes. We study social networks and the evolution of discussion threads in Internet sites using our online dynamic graph drawing algorithm. Finally, we employ our dynamic clustered graph layout algorithm in order to visualize mobile object frameworks. In these frameworks objects migrate between hosts while the application is running. An innovative, graph-based, scalable, focus + context visualization is used to depict both the physical network of machines and the logical network of ties between mobile objects.
List of Symbols and Abbreviations
c core
C a set of clusters which form a partition of the vertex set V
Ci the i-th cluster
C i coarser graph of level i in the graph hierarchy
χ divergence free vector field
dG(u, v) the length of a shortest path between vertices u and v
Di set of nodes with a distance to modification equal to i
D(u, v) the distance between nodes u and v
Ddistorted(u, v) distorted distance between nodes u and v
Dfocal(u) the shortest distance between node u and the closest focal node
Dfocalavg (u, v) joint average distance of nodes u and v and a focal node
Dinitial initial density image of a graph layout
Dsmooth smoothed density image of a graph layout
Dtarget target density image
Du Jacobian matrix of the 2D function u
|Du| Determinant of the Jacobian matrix of the 2D function u
E edges
e, e′, e′′ events
Ec set of cluster-cluster edges
Ev set of vertex-vertex edges
F force acting on a graph node
fracdone fraction of the iterations done
G=(V,E) graph
Gi the i-th graph in a series of graphs
Γ(v) positioning score of node v
K optimal geometric node distance
Li the i-th layout in a series of graph layouts
λ temperature decay constant
Lfinal final graph layout
Li layout of graph i
Linitial initial graph layout
lmax length of ray until image boundary is met
µ density image
N nodes
∇⊥ gradient rotated by 90 degrees
Ω0, Ω1 subdomains of R^2
Pi the i-th partition of a graph
PN(v) the set of neighbors of the node v
pi, pi(x, y) position of node i
pos(v) position of node v
≺ precedes
r algorithm run
R the set of real numbers
t initial graph temperature
θ candidate advancement direction
θbest preferred advancement direction
U graph potential energy
u = (u1(x, y), u2(x, y)) image warp
V vertices
wij weight of edge between node i and node j in a graph
wpin(v) pinning weight of node v
W graph edge weights matrix
API application programming interface
CORBA common object request broker architecture
CPU central processing unit
DAG directed acyclic graph
FR Fruchterman-Reingold layout algorithm
GPGPU general purpose computation on graphics processing units
GPU graphics processing unit
GUI graphical user interface
ID identifier
IP internet protocol
ISP internet service provider
JDI java debug interface
KK Kamada-Kawai layout algorithm
MP mass-preservation
PDE partial differential equation
PVM parallel virtual machine
P2P peer to peer
RMI remote method invocation
SIMD single instruction multiple data
SPMD single program multiple data
UML unified modeling language
VLSI very large scale integration
Chapter 1
Introduction
This thesis addresses graph drawing in information visualization. In the implementation of some of the algorithms, graphics processing units (GPUs) are utilized in order to significantly reduce the running time. This chapter presents the background to the thesis and discusses the main contributions.
1.1 Information Visualization
Information Visualization is defined as the use of computer-supported, interactive, visual representations of data, and in particular, abstract data, to amplify cognition [29]. Using graphical representations of data makes use of the human visual system, which is able to rapidly process large amounts of data and has good pattern recognition abilities. Information visualization deals with abstract data, which has no inherent mapping to space. One of the challenges in information visualization is finding a way to map the data to an image in a way that makes it understandable. This is in contrast to scientific visualization, which deals with physically-based data that is inherently defined in a coordinate system. Using interactive and dynamic visual representations of data allows the user to modify the visualization. This allows the data to be analyzed by exploration. Users can develop an understanding of the structure and the connections inherent in the data by observing the effects of the interaction on the data. A few examples of visualization techniques include selective hiding of data, layering data, using 3D, scaling and warping techniques in order to use more screen space for important parts of the data (e.g. fisheye views) and using color and shading to convey information.
In many cases the information is dynamic in nature. In these cases, it is important to maintain coherence in the visualization, thereby helping the user conserve his/her mental image of the evolving information. Simply piecing together a series of static snapshots is not sufficient to create a dynamic visualization. The challenge is to create a coherent sequence of images that tells a story. The user looking at a dynamic visualization should be able to note changes as they unfold while maintaining an overall understanding of the data.
Dynamic information visualization provides many interesting research challenges. These range from innovative ways of collecting data, through techniques of processing data to provide meaningful insights, to cognition-amplifying methods of displaying information. As the digital revolution continues, massive amounts of information are becoming available. The dynamic visualization challenge is to harness the ample sources of information available today in a way that enhances understanding and aids the human mind in gaining insight into the evolving phenomena being studied.
In a world rich in communication, processing and display technologies, there is ample opportunity for innovative visualization techniques. Some applications of information visualization include graph and network visualization, security and network intrusion visualization, financial analysis, software visualization, text and document visualization and social network visualization.
1.2 Graph Drawing
Graphs are abstract mathematical objects that are designed for describing relations between objects. Graph drawing addresses the problem of finding the best way to draw a picture of a graph. One of the common methods used to draw a graph is the node-link diagram. In this visualization, nodes are drawn using dots, circles or other geometrical forms and edges are drawn using straight or curved lines. Arrows are used to show the orientation of directed edges. The information corresponding to the nodes and edges can be visualized using text labels at various positions in or next to a graph object, different colors (as on a subway map), or other visual elements such as thickness of lines, size of boxes, etc. A graph may be drawn in the plane or in three dimensions. It may be drawn completely, partially, or hierarchically, i.e., clusters are shrunk to a single node which can be expanded on request. Figure 1.1 shows an example of a drawing of an undirected graph which contains textual node names.
Figure 1.1: A straight-edge layout of an undirected, labeled graph.
Very different graph drawings or graph layouts can correspond to the same graph. In
the abstract graph, all that matters is which vertices are connected to which others by
how many edges. In the visual representation of the graph, however, the arrangement
of these vertices and edges impacts understandability, usability, fabrication cost, and
aesthetics. Therefore, graph drawing is a central problem in information visualization.
The criteria used to judge the quality of a graph layout depend on the application it is used for. In applications where the main goal is to produce layouts for human consumption, these criteria are appropriately called aesthetics criteria. However, in applications where graphs are drawn for other purposes, as in VLSI schematics for example, technical criteria such as wire length might be more important than aesthetics criteria. Some of the commonly used aesthetics criteria include: crossing and overlap minimization, bend minimization, area minimization, angle maximization, length minimization, symmetries and proper separation between nodes.
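To make two of these criteria concrete, the following is a minimal sketch that evaluates the number of edge crossings and the total edge length for two straight-edge layouts of the same abstract graph. The graph, the two layouts and all function names are hypothetical and chosen only for illustration; they are not taken from the algorithms developed in this thesis.

```python
from itertools import combinations
from math import dist

# Abstract graph: a 4-cycle a-b-c-d-a. Only the connectivity is fixed.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

# Two hypothetical layouts (node -> 2D position) of the same graph.
layout_square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
layout_bowtie = {"a": (0, 0), "b": (1, 1), "c": (1, 0), "d": (0, 1)}


def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): -1, 0 or 1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 intersect at a single interior point."""
    return (orient(p1, p2, p3) * orient(p1, p2, p4) < 0
            and orient(p3, p4, p1) * orient(p3, p4, p2) < 0)


def count_crossings(edges, layout):
    """Number of pairs of non-adjacent edges whose drawings cross."""
    crossings = 0
    for (u, v), (x, y) in combinations(edges, 2):
        if len({u, v, x, y}) < 4:
            continue  # edges sharing an endpoint are not counted
        if segments_cross(layout[u], layout[v], layout[x], layout[y]):
            crossings += 1
    return crossings


def total_edge_length(edges, layout):
    """Sum of the Euclidean lengths of all drawn edges."""
    return sum(dist(layout[u], layout[v]) for u, v in edges)


for name, layout in (("square", layout_square), ("bowtie", layout_bowtie)):
    print(name, count_crossings(edges, layout), total_edge_length(edges, layout))
```

Both layouts draw the same four edges, yet the second one introduces a crossing and longer edges, so it scores worse on both criteria. A practical layout algorithm would typically balance several such criteria rather than optimize a single one.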
Graph drawing has emerged in recent years as a very lively area in computer
Various methods for graph drawing have been proposed, such as hierarchical, planar,
circular, orthogonal, symmetric, spectral and force directed layouts [111, 112, 116, 154,
192,199].
Graph drawing has been used for numerous applications. A few examples include VLSI circuit design, social networks, bioinformatics, train network maps, genealogy, state machines, function call graph visualization, software evolution visualization, network visualization, databases, data structures, computer security, software engineering (e.g. UML diagramming, class browsers) and workflow management (e.g. flow chart generation).
1.3 Graphics Processing Units
Commodity computer graphics chips, known generically as Graphics Processing Units or
GPUs are one of the most accessible high-performance computational platforms [89,163,
178]. Intended initially for performing graphics related computations for applications
such as computer games, computer-aided design and visualization, GPUs have evolved
into programmable and economic parallel processing units. They exist in almost every PC
sold today and are evolving at a rapid rate. Figure 1.2 compares the peak floating-point
performance of CPUs and GPUs. Note that GPU performance has grown rapidly over time and
is much higher than CPU performance.
The GPU’s arithmetic power is a result of a specialized architecture, which evolved
over years to provide maximum performance on the highly parallel tasks of traditional
computer graphics [134, 148]. Unlike CPUs, which are optimized for high performance
on sequential code, where many transistors are dedicated to instruction-level parallelism
using techniques such as branch prediction and out-of-order execution, GPUs are opti-
mized for data-parallel applications. This allows a larger portion of the die area of GPUs
to be dedicated to computational units. Thus, using the same semiconductor technology,
GPUs are able to achieve much higher computation speeds compared to CPUs under
some conditions.
Unlike CPUs, which dedicate a large percentage of the die area to cache memory, GPUs
handle long memory access latency by quickly switching between multiple threads in
hardware. This enables the GPU to hide long memory access latencies without sacrificing
die area for large caches. Thus, the GPU is able to achieve good utilization of its
computational units. Unlike a CPU, the caches of the GPU are designed for short-term
reuse and are constructed to give a 2D access locality. This gives the GPU an advantage
in image related applications where data is accessed in 2D patterns.
Figure 1.2: A comparison of the peak floating-point calculation rate in giga floating point
operations per second (GFLOPS) of Intel CPUs and ATI and NVIDIA GPUs. Image is
reproduced from [164].
As graphics hardware has become more powerful, one of the primary goals of each new
architecture has been to increase the visual realism of rendered images. This is achieved by
implementing increasingly complex rendering algorithms in real-time on the GPU. From
the fixed-function graphics pipelines of several years ago [60], the GPU architecture has
been steadily progressing towards a more general-purpose architecture. New versions of
the graphics APIs [16, 181] expose more programmable parts of the GPU while adding
more instructions and programming flexibility.
In parallel to the evolution of the graphics hardware, the languages used to program
GPUs have been evolving. Starting from small programs written in assembly language,
there are now several C-like programming environments in which GPUs can be pro-
grammed. Some examples include Cg [140], Sh [142] and Brook [26]. In an effort to
streamline the use of GPUs for non-graphics high-performance computation, the major
GPU vendors have released programming environments that do not use the traditional
method of using the graphics driver to access the GPU. NVIDIA’s CUDA [156] and
ATI’s CTM [168] reduce the overhead of accessing the GPU, thus potentially improving
the efficiency of using the GPU.
The programmable units of the GPU are architected to follow the single program,
multiple data (SPMD) programming model, in which many independent elements are
processed in parallel using the same program. This model is well-suited for straight-line
programs in which many elements are processed in lockstep, running the exact same code.
Such code is single instruction, multiple data (SIMD). While today’s GPUs are capable of
executing code in which different execution paths are taken by each element, this results
in a performance penalty. Thus, GPU programs attempt to group elements into blocks,
in order to have coherent branches in each block. One of the main challenges in achieving
a speedup when using a GPU is understanding how to exploit the architecture of the
GPU effectively.
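The cost of divergent branches can be illustrated with a toy model. The sketch below is not GPU code: it assumes a hypothetical block size of 8 elements and simply counts the blocks whose elements disagree on which branch to take, before and after grouping the elements by branch outcome. The function name and the block size are assumptions made for illustration only.

```python
def divergent_blocks(branch_flags, block_size):
    """Count blocks in which the elements do not all take the same branch."""
    count = 0
    for start in range(0, len(branch_flags), block_size):
        block = branch_flags[start:start + block_size]
        if len(set(block)) > 1:  # both branch outcomes occur inside this block
            count += 1
    return count

# 64 elements whose branch outcome alternates from one element to the next.
flags = [i % 2 == 0 for i in range(64)]

interleaved = divergent_blocks(flags, block_size=8)      # every block is mixed
grouped = divergent_blocks(sorted(flags), block_size=8)  # outcomes grouped together
print(interleaved, grouped)
```

In this model, grouping elements with the same outcome removes all mixed blocks, which corresponds to the coherent branching that GPU programs try to achieve.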
The combined advances in GPU hardware architecture and programmability have given rise
to a vibrant developer community around GPGPU (general-purpose computation on graphics
processing units) applications [89, 162]. Taking advantage of the much
higher peak computation ability and memory bandwidth of the GPU, many algorithms
have been successfully implemented on the GPU with high speedups compared to CPU
implementations. Some examples of algorithms accelerated on GPUs include solving par-
tial differential equations, linear algebra, image and signal processing, segmentation, and
geometric computing [58, 89, 153, 163, 169]. See Section 2.2 for a more comprehensive
review of algorithms which were accelerated on GPUs.
1.4 Outline and Main Contributions
In this thesis we address several related problems in graph drawing for information visu-
alization. It is based on the papers [13, 63–69]. We start with the problem of drawing a
large, general graph quickly and aesthetically. Next, an algorithm for improving a graph
layout is presented. This algorithm can be used as a post-processing step of any domain-
specific layout algorithm. Our improvement algorithm unclutters the given layout by
making more efficient use of available screen space. Next, we address dynamic graph
drawing algorithms for both general and clustered graphs. These algorithms attempt
to preserve the mental map [145] and to produce, online, stable layouts of time-varying
graphs.
Applications that operate on structured data, such as matrices and images, are very
suitable for acceleration using a GPU. The challenge is how to use this hardware to
accelerate algorithms that utilize unstructured data, such as graphs. In this thesis we
present two GPU-accelerated graph drawing algorithms which are able to quickly compute
aesthetic layouts of large graphs. One is for the layout of a single graph and one is for
computing stable layouts of a sequence of graphs. Speedups of 5.5x to 17x relative to a
CPU implementation are demonstrated. In addition, using a GPU, we are able to accelerate
our algorithm for improving graph layouts by a factor of over 100.
Throughout this thesis we provide practical applications of our algorithms to different
information visualization problems. We demonstrate how the structure of Internet service
provider (ISP) networks can be visualized and analyzed using our static graph drawing
algorithm. We show how the layout of graphs from different application domains such
as bioinformatics, VLSI, and finite element meshes can be improved by our uncluttering
algorithm. We apply our dynamic graph drawing algorithm to visualization of social
networks and Internet discussion threads. Finally, we have developed a system for the
visualization of mobile objects [38,102,130], which are an extension of distributed objects.
This system uses clustered graphs to show the structure and interactions in a network
of mobile objects. A hierarchical, scalable focus + context technique is used in the
visualization.
In Chapters 3-7 we present the main research results of this thesis. The main
contributions of each chapter in the thesis are summarized below.
Chapter 3 presents an algorithm for static graph layout [65]. The algorithm is based
on the force-directed approach [18,70,113,199]. We propose a multi-level scheme which is
based on spectral partitioning. A technique to efficiently perform the layout on the GPU
is presented. The algorithm manages to compute high quality layouts of large graphs in
a fraction of the time required by existing algorithms of similar quality.
Chapter 4 discusses a technique for modifying an existing layout in order to reduce the
clutter in dense areas [69]. Using a physically-inspired evolution process, graph nodes are
dispersed more evenly in the available screen space. A mental-map preserving warping
process is used to displace the nodes. The complexity of the algorithm depends mainly on
the resolution of the image used for computing the density of information in the graph.
As such, the computation can be scaled according to the allotted running time. Using a
GPU, we are able to handle large graphs in a matter of seconds. Applications
to bioinformatics, VLSI and finite element meshes are demonstrated.
In Chapter 5 an algorithm for drawing a sequence of graphs online is presented [66,68].
While allowing arbitrary modifications to the graph, the algorithm strives to maintain
the global structure of the graph and thus the user’s mental map. The algorithm works
online and uses various execution culling methods in order to reduce the layout time and
handle large dynamic graphs. Techniques for representing graphs on the GPU allow a
speedup by a factor of up to 17 compared to the CPU implementation. Applications to
social networks and visualization of Internet discussion threads are presented.
Chapter 6 presents an algorithm for drawing a sequence of graphs that contain an
inherent grouping of their vertex set into clusters [63]. The algorithm works online and
allows arbitrary modifications to the graph. It uses node pinning and invisible nodes in
order to maintain the clustered structure of the graph during incremental layout. Several
metrics for measuring the quality of the dynamic layout of clustered graphs are discussed.
In Chapter 7, an application to the visualization of mobile objects is discussed.
In Chapter 7 a system for visualizing mobile object frameworks is presented [13,64,67].
In these frameworks, the objects migrate to remote hosts, along with their state and
behavior, while the application is running. A graph-based visualization is used to depict
both the physical layer (placement on hosts) and the logical layout (relations between
objects) of the system. The system is scalable and able to create a consistent visualization
of the distributed system.
Chapter 2
Related Work
In this chapter we discuss some research that is related to this thesis. We start with graph
drawing, which is a fundamental tool in many information visualization applications, such
as software visualization. Next, we review some applications of graphics processing units,
which are related to the research presented in this thesis. In the following chapters
additional, topic-specific references are discussed.
2.1 Graph Drawing
The general problem of drawing graphs, i.e., assigning coordinates to graph vertices,
edges and other elements, has been extensively studied [46, 111, 112, 116, 154, 192, 199].
In the following paragraphs we review different types of layout algorithms and give more
information about work related to this thesis.
Several classes of algorithms for drawing graphs have been developed. Selecting a
specific algorithm depends both on the type of graph to be laid out and on the
requirements for the resulting layout.
Graphs that are planar, i.e., can be drawn with no edge crossings, are often drawn
using planar layout algorithms [90, 114, 116, 180]. Tree-like structures are drawn using
tree layout [33,41,42,80,116,175]. Directed graphs that contain an inherent hierarchy are
drawn using hierarchical layout algorithms [46,52,76,116,194]. These algorithms attempt
to find a source and sink within a directed graph and arrange the nodes in layers with most
edges starting from the source and flowing in the direction of the sink. Such algorithms
try to minimize the number of crossings or the area of the layout. In some cases, it is
required to draw edges in either the horizontal or vertical direction, while trying to reduce
edge crossings. Examples include drawings used for circuit board and integrated circuit
design. In these cases orthogonal layout algorithms [15,81,116,166,196,197] are used.
A variety of algorithms have been devised for drawing general graphs. Among them,
spectral and force-directed layout algorithms are widespread. In spectral layout
[19, 121–123] the node coordinates are extracted from the eigenvectors of a matrix,
such as the Laplacian matrix of the graph, which is derived from the adjacency (con-
nectivity) matrix of the graph. In force-directed layout [39, 43, 50, 62, 70, 113, 193] a
gradient-descent minimization of an energy function based on physical analogies is used
to compute the layout. In this thesis we focus on force-directed layout.
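As an illustration of the spectral approach, the following minimal sketch derives 2D coordinates from the eigenvectors of the graph Laplacian (degree matrix minus adjacency matrix). It assumes a connected, undirected, unweighted graph and uses the two eigenvectors with the smallest non-zero eigenvalues as x and y coordinates. This is one common choice and not necessarily the exact variant used in the cited papers; the function name is hypothetical.

```python
import numpy as np

def spectral_layout(n, edges):
    """Return an (n, 2) array of coordinates for a connected undirected graph."""
    adj = np.zeros((n, n))
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1.0
    laplacian = np.diag(adj.sum(axis=1)) - adj
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(laplacian)
    # Column 0 is the constant eigenvector (eigenvalue 0); skip it.
    return eigenvectors[:, 1:3]

# Example: a cycle on six vertices.
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
coords = spectral_layout(6, cycle_edges)
print(coords)
```

For the 6-cycle used here, the two eigenvectors place the vertices evenly on a circle, which matches the intuitive drawing of a cycle.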
Force-directed layout One of the most popular techniques for graph layout is force-
directed layout. It uses physical analogies in order to converge to an aesthetically pleasing
drawing [18,39,43,50,62,70,113,193,199]. In this class of algorithms, the graph is modeled
as a system of particles that exert forces on each other. Springs are used to model the
edges in the graph. The direction and magnitude of the force exerted on the two particles
connected by an edge depend on the distance between them and the "stiffness" of the
spring. These are parameters that can be modified to produce different effects. Starting
from an initial position, force-directed algorithms strive to converge to an equilibrium
position, which often produces a good layout of the graph. It should be noted that, due
to the complexity of the problem, a local minimum is generally reached rather than a
global one.
The algorithm of Fruchterman and Reingold [70] is a well-known variant of the
force-directed layout technique. In this algorithm, the following force is defined
between each pair of vertices $p_u, p_v \in V$:
$$f_{\mathrm{repulsive}}(p_u, p_v) = \frac{l^2}{\|p_u - p_v\|} \cdot \frac{p_v - p_u}{\|p_v - p_u\|}.$$
Here, $l$ is a parameter of the algorithm, which is used to denote the natural length of
a spring attaching two vertices connected by an edge. In addition, an attractive force
is defined between every pair of vertices $p_u, p_v$ which are connected by an edge (i.e.
$(p_u, p_v) \in E$):
$$f_{\mathrm{attractive}}(p_u, p_v) = \frac{\|p_u - p_v\|^2}{l} \cdot \frac{p_v - p_u}{\|p_v - p_u\|}.$$
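The interplay of the two forces can be checked numerically. The sketch below evaluates both magnitudes for a single pair of adjacent vertices. As an assumption of this example, attraction is applied along the unit vector from $p_u$ to $p_v$ and repulsion in the opposite direction when computing the net force on $p_u$. The function names are hypothetical.

```python
import math

def repulsive_magnitude(dist, l):
    """l^2 / distance: acts between every pair of vertices."""
    return l * l / dist

def attractive_magnitude(dist, l):
    """distance^2 / l: acts only between vertices connected by an edge."""
    return dist * dist / l

def net_force_on_u(p_u, p_v, l):
    """Net force on p_u from an adjacent vertex p_v (sign convention assumed here)."""
    dx, dy = p_v[0] - p_u[0], p_v[1] - p_u[1]
    dist = math.hypot(dx, dy)
    ux, uy = dx / dist, dy / dist  # unit vector from p_u towards p_v
    scale = attractive_magnitude(dist, l) - repulsive_magnitude(dist, l)
    return (scale * ux, scale * uy)

# With l = 1: the forces cancel at distance 1, pull together beyond it, push apart below it.
for d in (0.5, 1.0, 2.0):
    print(d, net_force_on_u((0.0, 0.0), (d, 0.0), 1.0))
```

The two magnitudes are equal exactly when the distance is $l$, which is why $l$ acts as the natural length of an edge.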
The running time of each iteration of this algorithm is $O(|V|^2 + |E|)$: all vertex pairs
in the graph need to be considered for calculating repulsive forces and all edges are
considered in order
to calculate attractive forces. In order to prevent excessive changes, especially in later
stages of the iteration when the placement is close to a stable state, the algorithm uses a
time-dependent maximum displacement value which declines over time.
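The displacement cap can be sketched as follows. The code assumes that the net displacement of each vertex has already been computed from the forces, and it uses a geometric cooling schedule with a factor of 0.9. Both the schedule and the function names are assumptions of this example; the description above only states that the maximum displacement declines over time.

```python
import math

def capped_move(pos, disp, temperature):
    """Move each vertex along its net displacement, by at most `temperature`."""
    new_pos = {}
    for v, (x, y) in pos.items():
        dx, dy = disp[v]
        length = math.hypot(dx, dy)
        if length > 0.0:
            step = min(length, temperature)  # limit the size of the move
            x += dx / length * step
            y += dy / length * step
        new_pos[v] = (x, y)
    return new_pos

def cool(temperature, factor=0.9):
    """Hypothetical cooling schedule: shrink the cap by a constant factor."""
    return temperature * factor

pos = {0: (0.0, 0.0), 1: (1.0, 1.0)}
disp = {0: (10.0, 0.0), 1: (0.0, 0.1)}  # vertex 0 wants a large move, vertex 1 a small one
moved = capped_move(pos, disp, temperature=1.0)
print(moved, cool(1.0))
```

Large displacements are truncated to the current cap while small ones pass through unchanged, so late iterations can only make small adjustments.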
Kamada and Kawai [113] introduced a different variant of the force-directed layout
algorithm. The idea here is to minimize the energy of the layout directly, instead of
reducing the forces acting on the vertices. In this algorithm, the ideal distance between
two nodes is set as the length of the shortest path between them, multiplied by the ideal
length of a single edge. The resulting objective function is the sum over the potential
energies of all $n(n-1)/2$ springs,
$$U_{\mathrm{KamadaKawai}} = \sum_{u,v \in V} \frac{c}{d_G(u,v)^2} \cdot \left(\|p_u - p_v\| - l \cdot d_G(u,v)\right)^2,$$
where $d_G(u,v)$ denotes the length of a shortest path between vertices $u$ and $v$, $c$ is a
scaling constant and l is the ideal length of a single edge. To obtain a local minimum of
this objective function, a modified Newton-Raphson method is applied. In each iteration,
the vertex with the largest gradient magnitude is picked and displaced. The running time of this
algorithm is quite high, since it requires computing all-pairs shortest paths and it scans
all vertices in the graph and then only displaces one vertex.
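The objective function can be evaluated directly for small graphs. The following sketch assumes a connected, unweighted graph given as an adjacency dictionary, computes all-pairs shortest paths by breadth-first search, and sums the spring energy over all unordered vertex pairs. It only evaluates the energy and does not perform the Newton-Raphson minimization; the function names and default parameter values are hypothetical.

```python
import math
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def kamada_kawai_energy(pos, adj, l=1.0, c=1.0):
    """Sum of spring energies over all unordered vertex pairs of a connected graph."""
    d = {u: bfs_distances(adj, u) for u in adj}
    energy = 0.0
    for u, v in combinations(sorted(adj), 2):
        actual = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
        ideal = l * d[u][v]
        energy += c / d[u][v] ** 2 * (actual - ideal) ** 2
    return energy

# A path 0 - 1 - 2, drawn once stretched out and once folded onto itself.
path_adj = {0: [1], 1: [0, 2], 2: [1]}
straight = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)}
folded = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 0.0)}
print(kamada_kawai_energy(straight, path_adj), kamada_kawai_energy(folded, path_adj))
```

The stretched drawing realizes every ideal distance and therefore has zero energy, while the folded drawing is penalized because vertices 0 and 2 coincide although their graph distance is 2.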
Due to the high computational cost of force-directed algorithms, extending them for
drawing large graphs has been extensively studied [92]. One popular technique is to use a
multi-level approach [7,72,91,96,123,205]. The idea here is to recursively create reduced
graphs until a sufficiently small graph is obtained. Next, a series of graph layout
problems is solved: starting from the coarsest graph, finer and finer approximations of
the final layout are created.
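The construction of such a hierarchy can be sketched with a deliberately simple coarsening rule. The code below collapses the edges of a greedy matching at each level until only a few vertices remain. This rule is an assumption chosen for brevity and is a stand-in for the more elaborate coarsening schemes of the cited work; all names are hypothetical.

```python
def coarsen_once(vertices, edges):
    """Collapse a greedy matching: each matched edge becomes one coarse vertex."""
    merged_into = {}
    for u, v in edges:
        if u != v and u not in merged_into and v not in merged_into:
            merged_into[u] = u
            merged_into[v] = u  # v is collapsed into u
    for v in vertices:
        merged_into.setdefault(v, v)  # unmatched vertices survive unchanged
    coarse_vertices = sorted(set(merged_into.values()))
    coarse_edges = sorted({tuple(sorted((merged_into[u], merged_into[v])))
                           for u, v in edges
                           if merged_into[u] != merged_into[v]})
    return coarse_vertices, coarse_edges, merged_into

def build_hierarchy(vertices, edges, min_size=2):
    """Return the list of (vertices, edges) levels, from finest to coarsest."""
    levels = [(vertices, edges)]
    while len(levels[-1][0]) > min_size and levels[-1][1]:
        coarse_vertices, coarse_edges, _ = coarsen_once(*levels[-1])
        levels.append((coarse_vertices, coarse_edges))
    return levels

# A path on eight vertices shrinks to 4 and then to 2 vertices.
levels = build_hierarchy(list(range(8)), [(i, i + 1) for i in range(7)])
print([len(v) for v, _ in levels])
```

A multi-level layout would then lay out the last level first and use each coarse layout as the starting position for the next finer level.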
Various algorithms have been used to perform multi-level graph layout. In [205],
edge collapse operations, commonly used in computer graphics for mesh simplification,
are used. The algebraic multi-grid technique has also been used to successfully compute
high-quality layouts [123]. Creating "solar systems" by clustering nodes at a distance of
2 edges or less from a central node is described in [91]. Using a maximum-independent
set filtration in order to coarsen the graph is discussed in [72]. TopoLayout [7] is a
feature-based multi-level graph drawing algorithm. It creates a subgraph hierarchy by
recursively detecting topological features in the graph and replacing them with meta-
nodes. Chapter 3 describes a new multi-level force-directed algorithm, which is based on
spectral partitioning [59,170].
Dynamic graph drawing As opposed to static graph drawing where a single graph
is considered, dynamic graph drawing addresses the problem of computing layouts for a
sequence of related graphs. In offline dynamic graph drawing the entire sequence of
graphs to be laid out is known in advance. In contrast, in online dynamic graph drawing,
the graphs arrive one at a time and a layout is computed for each graph as it is provided.
Thus, an online algorithm is not able to take future changes to the graph into account
when computing the layout.
If the sequence of graphs to be laid out is known in advance, different algorithms
can be employed in order to solve the incremental layout challenge. One algorithm to
address this problem is discussed in [47]. The algorithm constructs a super-graph that
combines information from several adjacent timeslots in the animation sequence in order
to produce a smooth animation. In [128], a stratified, abstracted version of the graph is
used. An offline algorithm for the visualization of the evolution of software over time is
presented in [40,56].
Online visualization of social networks is discussed in [149]. An approach based on
Bayesian networks is described in [20]. Online drawing of orthogonal and hierarchical
graphs is discussed in [86]. Chapters 5 and 6 present online algorithms for drawing
general and clustered graphs, respectively.
Incremental drawing of directed acyclic graphs is discussed in [155], which uses a mod-
ification of the Sugiyama algorithm [194] in order to draw ranked digraphs. A heuristic
that moves nodes between adjacent layers is employed. Although the algorithm performs
well, it is restricted to graphs that contain an inherent hierarchy of nodes.
Clustered graph drawing Work on clustered graph drawing is less widespread. In [206],
a divide and conquer approach, in which each cluster is laid out separately and then the
clusters are composed to form the graph, is used. This approach has the drawback of
not taking edges between vertices belonging to different clusters into account. This may
result in many edge crossings or long edges. In [51], a method of drawing the clustering
hierarchies of the graph using different Z coordinates in a 3D view is discussed. Displaying
the graph in 3D makes it easier to present its recursive clustered structure. One
drawback of this work is that the entire structure is presented, which may be quite complex.
No means are provided for reducing the visual complexity of the graph.
Other research in clustered and compound graph layout includes [14, 25]. In Chapter 6
we present our online algorithm for drawing sequences of clustered graphs.
Commercial graph drawing software There are several companies which have graph
drawing products. These are used for different applications such as process and workflow
diagrams, business organizational charts, network management displays and supply-chain
diagrams.
The ILOG JViews Diagrammer package [107] includes a broad range of layout al-
gorithms implemented in Java. Algorithms supported include hierarchical layout, tree
layout, circular layout and a spring embedder. Nested subgraphs are supported, allowing
the user to expand the view of the contents of subgraphs. The graph can be annotated
with different line styles, labels and tooltips. The package includes some incremental lay-
out capability, ensuring that small changes do not force large diagram rearrangements.
Tom Sawyer offers graph layout software [200]. Algorithms supported include circular,
hierarchical, orthogonal and symmetric layout, as well as an interface for supplying
constraints on the layout. The package includes an edge router, which can help reduce
node-edge overlaps. The software has several interfaces, including ActiveX, C++, Java and
.Net. Emphasis is put on running the layout algorithms quickly. The package can interface
with analysis and visualization software.
yWorks offers the yFiles graph layout package [209]. Several layout algorithms are
supported, including circular, hierarchical, orthogonal and tree layout. In addition, an
“organic” layout algorithm exists, which seems to be based on force-directed layout.
Noteworthy is support for incremental layout of tree, hierarchical and circular drawings
[117], as well as support for incremental edge routing.
The ClearCase [37] software configuration and version management tool uses a
hierarchical graph layout algorithm [194] in order to display versioning information for
files and directories in a software project. This allows inspecting the modification
history of each software component in the project. In addition, the user interface allows
running comparison and merging queries on the graph being displayed. Using the graph
visualization, it is much easier for the user to understand the changes applied to a module.
While the commercial tools discussed above include implementations of several layout
algorithms, there is still ample room for research. Examples include incremental layout
of clustered graphs, incremental layout of multi-level clustered graphs, efficient handling
of large graphs, creating high-quality layouts and techniques for improving layout quality.
In this thesis, some of these challenges are addressed.
2.2 General Purpose Computation on Graphics Pro-
cessing Units (GPGPU)
In recent years, graphics processing units (GPUs) have been used in many applications
not directly related to computer graphics [58,153,163,169]. A few examples include
classification using support vector machines [31], sequence alignment in biomedical
applications [139], image compression [55], visual tracking [147], probabilistic sequence
search in biomedical applications [104], tone mapping [84], particle systems [118],
histogram generation [179], neural networks [160], level-set segmentation [177], wavelet
transforms [28], database queries [87, 88], geometric pattern matching [5] and acoustic
simulation [176]. This has been termed GPGPU, or general purpose computation on graphics
processing units.
The website [89] lists several hundred papers and research applications of GPUs. In this
section we review work in linear algebra, ray tracing, solution of PDEs and computation
of forces between particles, which is more relevant to our work.
GPUs have been successfully used to perform linear algebra and matrix computa-
tions. In [73] a system for solving dense linear systems using LU decompositions and
other techniques is described. The computation is accelerated using GPU architectural
features devised initially for texture processing, such as coordinate interpolation units. A
system for performing general-purpose linear algebra calculations on matrices and vectors
is presented in [127]. Applications to multi-dimensional finite differences, such as the 2D
wave equation and incompressible Navier-Stokes equations are presented. In [57,131] al-
gorithms for dense matrix multiplication are presented and their performance is analyzed.
A recent paper presents new ways of accelerating sparse matrix-vector multiplication on
the GPU, using scan primitives to perform the calculation efficiently [182].
The multi-grid algorithm [24,44] is an advanced, fast and popular approach to solving
large boundary value problems. In [17] an implementation of a sparse matrix conjugate
gradient solver and a regular-grid multi-grid solver on the GPU are discussed. Another
implementation of a multi-grid solver on the GPU is presented in [85]. In [109], an appli-
cation of creating marble-like textures on the GPU which uses the multi-grid technique
to solve PDEs (partial differential equations) in a fluid dynamics simulation, is presented.
In Chapter 4, a GPU is used to accelerate a ray casting algorithm. In [30] ray-triangle
intersections are performed on the GPU. The highest acceleration is achieved when caches
of coherent rays are processed. The algorithm is partitioned between the CPU and the
GPU, combining the strengths of both. In [171] all of the triangles in the scene are stored
on the GPU in a 3D grid. This allows the entire raytracer to run on the GPU, eliminating
the CPU-GPU communication bottleneck of [30]. More recent work, such as [61] uses
more advanced techniques, such as using kd-trees to accelerate the computation. Ray
casting on GPUs has also been used in order to perform volume rendering of 3D data.
In [126] a multi-pass ray casting technique, employing empty space skipping and early
termination is used. In [188] single-pass ray casting on the GPU, which improves accuracy
in the volume integral computation compared to texture slicing which uses framebuffer
precision of 8 to 16 bits, is introduced. In [137] an adaptive object and image-space
sampling density of multiresolution volumes is used to reduce running time.
Simulation of physical phenomena using PDEs is one of the many applications of
GPUs. In [99] a real-time simulation of fluid dynamics on the GPU, which uses Jacobi
iterations [44] to converge to a solution, is presented. Simulation of the dynamics of clouds
on the GPU is discussed in [100]. There, methods to efficiently process a 3D volume
using the GPU's 2D addressing capabilities are presented. Simulation in 3D of fluids in a
volume that contains obstacles is discussed in [136].
Chapters 3 and 5 show how the calculation of forces acting on nodes, used for graph
layout, can be accelerated on the GPU. There are many research problems in physics,
chemistry and astronomy where very similar calculations between interacting particles are
important. Hence, similarly to this research, researchers in these fields have turned to
the GPU in order to achieve high performance computation on cheap and accessible
platforms (i.e. GPUs). In [6, 135,190] the simulation of the dynamics of molecules using
a GPU is described. In [165] a world-wide distributed system for the simulation of protein
folding on the GPU and other high-performance computational platforms is described.
The simulation of N-body gravitational forces on the GPU is discussed in [157].
Chapter 3
Multi-Level Graph Layout on the GPU
This chapter presents a new algorithm for force directed graph layout on the GPU. The
algorithm, whose goal is to compute layouts accurately and quickly, has two contributions.
The first contribution is proposing a general multi-level scheme, which is based on spectral
partitioning. The second contribution is computing the layout on the GPU. Since the
GPU requires a data parallel programming model, the challenge is devising a mapping
of a naturally unstructured graph into a well-partitioned structured one. This is done
by computing a balanced partitioning of a general graph. This algorithm provides a
general multi-level scheme, which has the potential to be used not only for computation
on the GPU, but also on emerging multi-core architectures. The algorithm manages to
compute high quality layouts of large graphs in a fraction of the time required by existing
algorithms of similar quality. An application for visualization of the topologies of ISP
(Internet Service Provider) networks is presented. This chapter is based on [65].
The rest of this chapter is structured as follows. Section 3.1 gives an introduction.
Related work is discussed in Section 3.2. Partitioning graphs using spectral methods is
reviewed in Section 3.3. Section 3.4 presents the layout algorithm. The GPU implemen-
tation of the algorithm is reviewed in Section 3.5. Results are presented in Section 3.6.
An application to the visualization of Internet service provider networks is discussed in
Section 3.7. Finally, Section 3.8 concludes.
Figure 3.1: ISP router map. Each node represents a router. Edges link routers. Red
nodes are external to the ISPs visualized. Other nodes are colored according to the ISP
they belong to: green - Abovenet (US, 664 routers); blue - Exodus (US, 551 routers);
black - Tiscali (Europe, 513 routers). A total of 5044 routers and 8043 connections are
shown.
3.1 Introduction
Rapidly producing aesthetically pleasing, high-quality graph layouts is still a challenging
problem. For instance, one of the most popular graph layout algorithms, the force di-
rected algorithm, is computationally expensive. The complexity of each iteration of the
algorithm is $O(|V|^2 + |E|)$. On large graphs, the layout procedure can take anywhere from a
few seconds to several minutes to complete, hindering the capability to use this algorithm
to explore large data sets.
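As a rough illustration of this cost, the following sketch counts the force evaluations in a single iteration for the graph of Figure 3.1 (5044 nodes and 8043 edges). It assumes that every unordered vertex pair contributes one repulsive term and every edge one attractive term; the function name is hypothetical.

```python
def force_evaluations_per_iteration(num_vertices, num_edges):
    """Return (repulsive pair count, attractive edge count) for one iteration."""
    repulsive_pairs = num_vertices * (num_vertices - 1) // 2  # all unordered pairs
    return repulsive_pairs, num_edges

pairs, springs = force_evaluations_per_iteration(5044, 8043)
print(pairs, springs)
```

For this graph the repulsive term dominates: roughly 12.7 million vertex pairs compared to about 8 thousand edges per iteration.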
In recent years, a popular way to accelerate computations is to perform them on the
GPU (graphics processing unit) [58, 89, 163, 169]. This is due to the high computational
power, low cost, and ubiquity of GPUs in every modern PC. Please refer to Sections 1.3
and 2.2 for more information about accelerating computations using GPUs.
GPUs are geared towards repetitively performing the same computation on large
streams of data. Therefore, the GPU suits uniformly structured data, such as images
or matrices. Graphs do not possess a uniform structure, hence, they do not admit any
intuitive and natural representation that suits computation on the GPU.
This chapter proposes two ways in which force directed algorithms can be accelerated.
The first is a general multi-level scheme, which is based on spectral partitioning. The
second is computation of a graph layout on the GPU.
Multi-level graph layout algorithms have been proposed in the past [72,91,96,98,123,
172, 205]. In these algorithms, the given graph is recursively coarsened, to compute its
multi-level representation. In contrast, in our scheme, the algorithm works on a high-
detailed graph at all levels of the partitioning. Thus, a good hierarchical representation
of the graph is obtained. The scheme proposed in this chapter is a general multi-level
scheme, which is based on spectral partitioning. Using a coarse to fine approach, layouts
of increasing detail are computed. It is shown how coarse layouts of a graph can be
efficiently extended to the final high quality layout.
In addition, this chapter describes a method of representing graphs so as to make
efficient use of GPU resources. Partitioning is used to break the large problem into
smaller and similarly-sized problems that suit computation on the GPU or on other
data-parallel programming models. This algorithm exposes the underlying structure of
the graph, and thus can be used in a multi-level scheme.
Another algorithmic contribution of the chapter is devising a layout algorithm that
combines the strengths of two different well-known layout algorithms [70,113]. The pro-
duced layouts are as good as existing state of the art layouts [91, 92], yet computed at a
fraction of the running time. For example, a layout of the graph bcsstk31 is computed
using our approach in 5.8 seconds (using a GPU on a Core 2 machine) compared to 83
seconds in [91] (using a Pentium CPU).
Implementation-wise, the chapter elaborates on how force directed layout is acceler-
ated, by performing the time-consuming stages on the GPU. The data storage and the
stream processing are described.
Last but not least, the algorithm is applied to the visualization of the topologies of
Internet Service Provider (ISP) networks. In this application, illustrated in Figure 3.1,
nodes represent routers and edges represent the connections between them.
3.2 Related Work
Many algorithms have been proposed to perform graph layouts [116, 199]. This chapter
focuses on force directed layout [70, 113], which is based on simulating the graph as a
network of charged particles that repel each other, where edges are simulated by springs.
The algorithm is popular due to its ability to draw general undirected graphs, its ability
to be tailored according to specific requirements, and the aesthetically pleasing layouts
it produces. However, a major drawback of the algorithm is its high computational cost.
Some algorithms have been proposed to perform force directed layouts of large graphs
[92]. In [205] coarser representations of the graph are recursively built using the edge
collapse operation. Instead of computing all-pairs repulsion forces, only close-by nodes
are addressed. The algorithm in [96] creates coarse graphs using an approximation of
the k-center problem. A modified version of [113] is used to perform single level layout.
This algorithm requires $O(V^2)$ memory and $O(VE)$ time for a graph with $V$ nodes and
E edges. The algorithm in [9] computes repulsion forces in O(N log N) for N nodes.
In [172] a quadtree is used to accelerate layout and to visualize the graph in multiple
levels of detail. In [72] a maximum independent set filtration is used to coarsen the
graph. At each level new nodes are placed in accordance with their neighbors. A local
force computation is performed using both [113] and [70]. FM$^3$ [91] is a state of the art
multi-level algorithm [92]. There, solar systems are created, which consist of nodes at a
distance of two edges or less from the center of the solar system. A clever O(N log N)
approximation of the all-pairs repulsive forces is used to accelerate layout.
In [123] a simplified energy function is used, which allows more robust mathematical
treatment. The layout problem is reduced to an eigenvalue computation problem, which
is solved using an algebraic multi-grid approach. Although the resulting algorithm is very
rapid, the quality of the layout is limited [92]. This may be attributed to the algorithm
defining forces only along edges of the graph. In [98] a high dimensional embedding of
the graph is computed and then projected into the drawing plane, allowing a linear time
$O(E + V)$ algorithm.
In the current chapter, instead of working on increasingly coarsened graphs, the input
graph is partitioned into smaller and smaller parts. This helps construct an accurate
multi-level representation of the graph.
In recent years, GPUs have been successfully applied to numerous problems outside of
classical computer graphics [163]. Some GPU usage examples include solving differential
equations [85], linear algebra [73,127], signal processing [150], visualization [95,108] and
simulation [100,118,136], to name a few.
Several other GPU applications are somewhat related to ours. In [83,198] simulation
of deformable bodies using mass-spring systems is performed. However, while the mass-
spring algorithms take only nodes connected by edges into account, the force directed
algorithm considers all the nodes when calculating the force exerted on a node. GPUs
have also been used to simulate gravitational forces [157], where an approximate force
field is used to calculate forces. Accelerating dynamic graph drawing on the GPU has
been addressed in [66]. The focus of that work was on creating stable layouts of changing
graphs, whereas the current chapter addresses static layouts.
3.3 Spectral Graph Partitioning
Directly computing the layout of a large graph is both time-consuming and difficult.
This is due to the sensitivity of force directed layout to the initial conditions given to the
algorithm. To address these problems, multi-level schemes have been used [72,91,96,98,
123, 172, 205]. The key idea is that a good representation of the overall structure of the
graph will yield a layout of the “skeleton”, which can be quickly computed, and which
can assist in drawing the large input graph.
We propose an algorithm for creating a series of resolution decreasing representations
of the graph by recursively partitioning it. We require the parts to have similar size and
have a minimal cut between them. The former requirement helps preserve the balance
between the nodes during layout, while the latter guarantees that different parts are
weakly coupled and hence can be treated relatively independently.
While existing multi-level graph layout algorithms recursively coarsen the graph in
order to compute the multi-level representation, our algorithm works on a highly detailed
graph at all levels of the partitioning. This allows us to obtain a high-quality
representation of the graph, which does not suffer from the growing inaccuracy involved in
repetitively creating coarser and coarser representations of a reduced version of the graph.
To do so, we use spectral graph theory [36]. This theory has been used in the field of
parallel computation to partition computation dependency graphs, where the amount of
work between processors needs to be balanced [170]. It was also used in image segmenta-
tion, where normalized cuts were introduced [184]. The idea of using eigenvectors of the
Laplacian for finding partitions of graphs has a rich history [59].
Suppose that $w_{ij}$ is the weight of the edge $(i, j)$, $D$ is a diagonal matrix with
$D(i, i) \equiv \sum_j w_{ij}$, and $W(i, j) \equiv w_{ij}$ is the graph edge weights matrix.
The matrix $L = D - W$ is the Laplacian of graph $G$. The goal is to partition $G$ into two
equal-sized partitions $A, B$. For node $i$, we define $q_i = 1$ if $i \in A$ and $q_i = -1$
if $i \in B$. It can be shown [170] that the cut size $J$ is:
$$J = \mathrm{CutSize} = \frac{1}{4} \sum_{i,j} w_{ij} (q_i - q_j)^2 = \frac{1}{2} q^T (D - W) q.$$
This is so since if nodes $i$ and $j$ are in the same partition, $q_i - q_j$ is zero. If not,
the squared difference evaluates to $2^2$. Hence, dividing by four achieves the desired
result. The right hand side of the equality stems from the characteristics of the Laplacian
of the graph.
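The equality between the pairwise sum and the quadratic form can be checked numerically. The following is a minimal sketch, not part of the original implementation: the four-node graph and its weights are hypothetical, and the sum is assumed to run over ordered pairs $(i, j)$, so that each crossing edge appears twice.

```python
# Hypothetical 4-node weighted graph; the weights are illustrative only.
weights = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (0, 3): 3.0}
n = 4

# Symmetric weight matrix W and the diagonal of D (weighted degrees).
W = [[0.0] * n for _ in range(n)]
for (i, j), w in weights.items():
    W[i][j] = W[j][i] = w
deg = [sum(row) for row in W]


def cut_size_sum(q):
    """J = 1/4 * sum over ordered pairs (i, j) of w_ij * (q_i - q_j)^2."""
    return 0.25 * sum(W[i][j] * (q[i] - q[j]) ** 2
                      for i in range(n) for j in range(n))


def cut_size_quadratic(q):
    """J = 1/2 * q^T (D - W) q."""
    Lq = [deg[i] * q[i] - sum(W[i][j] * q[j] for j in range(n))
          for i in range(n)]
    return 0.5 * sum(q[i] * Lq[i] for i in range(n))


q = [1, 1, -1, -1]  # A = {0, 1}, B = {2, 3}
print(cut_size_sum(q), cut_size_quadratic(q))
```

Both functions return the same value for any indicator vector, and both return zero when all nodes are placed in one partition.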
In order to minimize $J$, we can relax the indicators $q_i$ to continuous values and take
the eigenvector corresponding to the second smallest eigenvalue of
$$(D - W) q = \lambda q.$$
This vector is known as the Fiedler vector [59]. (The eigenvector corresponding to the
smallest eigenvalue, $\lambda_1 = 0$, is $q_1 = (1, \ldots, 1)^T$.)
To compute the Fiedler vector, we use the power iteration algorithm [208], shown
in Figure 3.2. The input of the algorithm is a guess for the Fiedler vector, stored in
v2. The computed Fiedler vector is returned in v2. The algorithm is iterative. In each
iteration v2 is orthogonalized against the first eigenvector and multiplied by the matrix
B = gI − L, which reverses the order of the eigenvalues. Here g is the Gershgorin bound,
which bounds the magnitude of the largest eigenvalue of the Laplacian. This algorithm
fits sparse matrices (i.e., graphs), since it requires only matrix-vector multiplications. A
similar algorithm is used in [123] to directly compute the graph layout, whereas it is used
here only to partition the graph.
v2 = random guess
L = Laplacian(G)
g = Gershgorin_bound(L) = max_i ( L_ii + sum_{j != i} |L_ij| )
B = gI - L
v1 = (1 / sqrt(N)) * (1, ..., 1)    // first (known) eigenvector
do
    v2old = v2
    v2 = v2 - (v2^T * v1) v1
    v2 = B * v2
    v2 = v2 / ||v2||
until |v2old * v2^T - 1| < ε or max iteration count reached
Figure 3.2: The power iteration algorithm
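The pseudocode of Figure 3.2 can be sketched in Python as follows. This is an illustrative, single-level version only: it omits the multi-grid acceleration, and the function name `fiedler_vector`, the adjacency-dictionary input format, and the fixed random seed are assumptions made for the example.

```python
import math
import random


def fiedler_vector(adj, eps=1e-8, max_iter=10000, seed=0):
    """Power iteration on B = gI - L, following Figure 3.2.

    adj: {node: {neighbor: weight}} for nodes 0..N-1, symmetric (assumed format).
    """
    n = len(adj)
    deg = [sum(adj[i].values()) for i in range(n)]
    # Gershgorin bound: max_i (L_ii + sum_{j != i} |L_ij|) = max_i 2 * deg_i.
    g = max(2.0 * d for d in deg)
    v1 = [1.0 / math.sqrt(n)] * n          # first (known) eigenvector
    rng = random.Random(seed)
    v2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(max_iter):
        old = v2
        # Orthogonalize against v1.
        dot = sum(a * b for a, b in zip(v2, v1))
        v2 = [a - dot * b for a, b in zip(v2, v1)]
        # Multiply by B: (B v)_i = g v_i - (deg_i v_i - sum_j w_ij v_j).
        v2 = [g * v2[i] - deg[i] * v2[i]
              + sum(w * v2[j] for j, w in adj[i].items())
              for i in range(n)]
        # Normalize.
        norm = math.sqrt(sum(a * a for a in v2))
        v2 = [a / norm for a in v2]
        if abs(sum(a * b for a, b in zip(old, v2)) - 1.0) < eps:
            break
    return v2


# Path graph 0 - 1 - 2 - 3 with unit weights.
path = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
print(fiedler_vector(path))
```

For the path graph, the signs of the returned vector separate nodes {0, 1} from nodes {2, 3}, which is the balanced minimum cut.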
A drawback of the power iteration algorithm is its slow convergence rate. To accelerate
the convergence, a multi-grid algorithm is used. Instead of directly operating on the
largest Laplacian matrix, a series of coarsening operations is performed, until reaching a
minimal problem size. The coarsening algorithm is detailed in Section 3.4, Step 1. After
coarsening, the coarser problems are recursively solved and interpolated back, setting a
good initial guess for the next (finer) problem.
After computing the Fiedler vector v2, it is used to partition the graph. Each node
in the graph has a corresponding value in v2. Unlike the discrete partitioning case, the
values in v2 are continuous, and range between -1 and 1. The value corresponding to a node
determines the partition to which the node is assigned. The values in the vector v2 are
sorted from lowest to highest. This creates an ordering of the nodes of the graph. A set of
$k - 1$ splitting values is determined by sampling the sorted vector at $k - 1$ uniformly
spaced points. This splits the vector into $k$ regions. A node is assigned to the partition
whose region contains the node's value in v2.
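The splitting procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `split_by_fiedler` is hypothetical, and the splitting values are assumed to be sampled at ranks $m \cdot N / k$ (rounded down) of the sorted vector, which is one possible reading of "uniformly spaced points".

```python
def split_by_fiedler(v2, k=3):
    """Assign each node to one of k regions of the sorted Fiedler values."""
    n = len(v2)
    ordered = sorted(v2)
    # k - 1 splitting values sampled at uniformly spaced ranks (assumed rule).
    splits = [ordered[(m * n) // k] for m in range(1, k)]
    # The part index of a node is the number of splitting values
    # that do not exceed its Fiedler value.
    return [sum(1 for s in splits if value >= s) for value in v2]


# Six hypothetical Fiedler values split into k = 3 parts of two nodes each.
print(split_by_fiedler([0.9, -0.8, 0.1, -0.1, 0.7, -0.9], k=3))
# -> [2, 0, 1, 1, 2, 0]
```

Because the splits are taken from the sorted order, the resulting parts have similar sizes regardless of how the continuous values are distributed.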
Since the graph is partitioned into more than two parts, some clusters may be
disconnected. A post-processing stage that merges clusters is performed. Each cluster whose
size is below a threshold is merged with its largest neighboring cluster. In our
implementation, disconnected clusters smaller than 1/9 of the graph are merged.
The partitioning algorithm continues repetitively, building finer and finer represen-
tations of the graph. The finer representations are then used in a multi-level scheme,
described in Section 3.4, to compute a globally pleasing layout of the original graph.
In our implementation, any eigenproblem smaller than 128 nodes is solved directly,
since coarsening it further is not time-effective. For each problem, a maximum of
10000 power iterations is allowed and an accuracy of $\varepsilon = 10^{-8}$ is used.
It should be noted that although the spectral partitioning algorithm was conceived to
split the graph into two partitions, we partition by default into three parts ($k = 3$). This
is a heuristic that works well in practice and helps reduce the running time of the algorithm.
Our attempts to perform a more adaptive partitioning resulted in lower quality results.
3.4 Multi-Level Layout Algorithm
Given an undirected weighted graph $G = G_0 = (V, E)$, the goal of the algorithm is
to compute a straight-line drawing of G, assigning 2D coordinates to each node. Our
algorithm is based on the force-directed approach [70, 113, 116, 199], which simulates a
system of forces defined on the input graph and converges towards a local minimum
energy position, starting from an initial placement of the vertices.
Our algorithm has several key ideas. First, a multi-level scheme is used to compute
the layout. Instead of directly computing a layout for the input graph, several coarsened
versions of it are created. Starting from the coarsest version, a series of increasingly
detailed layouts are computed. Care is taken to interpolate positions from each coarse
layout and use them as the starting point for the next finer layout.
Second, spectral partitioning methods are used to compute lower resolution represen-
tations of the graph, as discussed in Section 3.3. Using this approach the difficult graph
partitioning problem is transformed to a 1D partitioning problem. Breaking the graph
into increasingly finer parts allows us to produce a series of increasingly detailed graphs,
which are used in the multi-level scheme.
Third, a layout algorithm which combines the strengths of [70, 113] is used. While [113]
is able to compute a good layout, given any starting point, it is time consuming. The
algorithm of [70] is faster and computes “smoother” layouts, but is more sensitive to the
initial conditions given to it. We propose an algorithm which combines the strengths of
both algorithms in order to produce the final layout.
The algorithm is composed of the following stages, shown in Figure 3.3. We elaborate
on each stage below.
1. Initial coarsening: Given $G = G_0$, compute $G_1, G_2, \ldots, G_{coarsest}$ where
$G_{k+1} = \mathrm{edge\ collapse}(G_k)$.
2. Partitioning initialization: set $P^{level=0}_{part\ num=0}$ to $G_{coarsest}$. Set $l = 0$.
3. Partitioning: try to partition each graph $P^l_n$. This creates a new set of graphs
$P^{l+1}_0, P^{l+1}_1, \ldots$. If no graph $P^l_n$ could be partitioned, go to step 7.
4. Multi-level construction: construct $L^l$ out of $G_{coarsest}$, where each node in $L^l$
corresponds to a graph $P^l_n$.
5. Layout initialization: compute an initial layout for $L^l$, using interpolated initial
positions from the coarser $L^{l-1}$.
6. Layout: compute the layout for $L^l$. This is the core step of the algorithm, which
uses our variant of the force-directed approach. Set $l = l + 1$, go to step 3.
7. Compute a layout for $G_{coarsest}$ using interpolated initial positions from $L^{finest}$, the
finest graph layout computed in step 6.
8. Final un-coarsening: Compute layouts for $G_{coarsest-1}, G_{coarsest-2}, \ldots, G_0$ by
repetitively interpolating from $G_i$ to $G_{i-1}$ and laying out $G_{i-1}$.
Figure 3.3: Algorithm overview
Initial coarsening (Step 1): In step 1, the graph is coarsened several times, as a
pre-processing stage that helps reduce computation time. At each level $k$, given a fine
graph $G_k$, a coarser representation $G_{k+1}$ is constructed using a series of edge collapse
operations [205]. A collapse operation replaces two connected nodes and the edge between
them by a single node, whose weight is the sum of the weights of the nodes being replaced.
The weights of the edges are updated accordingly. (The initial weight of a node/edge is
1.) The order of the edge collapse operations differs from that in [205]: First, candidate
nodes for elimination are sorted by their degree, so as to eliminate low-degree nodes first.
An adjacent edge of a low-degree node is chosen for collapse by maximizing the following
measure: $\frac{w(u,v)}{w(v)} + \frac{w(u,v)}{w(u)}$, where $w(x)$ is the weight of node $x$
and $w(x, y)$ is the weight of edge $(x, y)$. This function helps to preserve the topology of
the graph by “uniformly” collapsing highly connected nodes.
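The ordering and selection rule described above can be sketched as follows. The helper names `collapse_order` and `pick_collapse_edge`, and the dictionary-based graph representation, are hypothetical and serve only to illustrate the measure.

```python
def collapse_order(neighbors):
    """Candidate nodes sorted by degree, so low-degree nodes come first."""
    return sorted(neighbors, key=lambda x: len(neighbors[x]))


def pick_collapse_edge(v, node_w, edge_w, neighbors):
    """Return the neighbor u of v that maximizes w(u,v)/w(v) + w(u,v)/w(u)."""
    def measure(u):
        w_uv = edge_w[frozenset((u, v))]
        return w_uv / node_w[v] + w_uv / node_w[u]
    return max(neighbors[v], key=measure)


# Hypothetical graph: node 0 is adjacent to a heavy node 1 and a light node 2.
node_w = {0: 1.0, 1: 4.0, 2: 1.0}
edge_w = {frozenset((0, 1)): 2.0, frozenset((0, 2)): 1.0}
neighbors = {0: [1, 2], 1: [0], 2: [0]}

print(collapse_order(neighbors))                        # [1, 2, 0]
print(pick_collapse_edge(0, node_w, edge_w, neighbors))  # 1
```

In this example the edge (0, 1) scores 2/1 + 2/4 = 2.5 and the edge (0, 2) scores 1/1 + 1/1 = 2, so the heavier edge is selected for node 0.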
In our implementation, three initial coarsening steps are performed. This significantly
reduces the computation time of spectral partitioning (Step 3), while maintaining a good
relation between the input graph $G_0$ and $G_{coarsest}$.
Partitioning initialization (Step 2): This step initializes the variables used in the
recursive partitioning of graph $G_{coarsest}$ in the next step. The graph $P^0_0$, which is
set to $G_{coarsest}$, is created.
Partitioning (Step 3): The goal of this step is to create high quality coarser
representations of the graph $G_{coarsest}$, which are used in the multi-level layout scheme.
Starting from the single graph $P^0_0$ at level 0, for each level $l$ the graphs $P^l_n$ in
this level are partitioned as described in Section 3.3. Each graph $P^l_n$ is partitioned into
graphs $P^{l+1}_m$, by adding the corresponding edges from $P^l_n$. As the level number $l$
increases, $G_{coarsest}$ is partitioned into a growing number of graphs decreasing in size.
Multi-level construction (Step 4): A series of graphs $L^0, L^1, \ldots, L^{finest}$ of
increasing detail is created. At level $l$, the graph $L^l$ is created as follows. Each node
$n_k$ in $L^l$ corresponds to a single graph $P^l_k$ in level $l$. The weight of a node $n_k$
in $L^l$ is the sum of the weights of the nodes in the graph $P^l_k$ it corresponds to. Edges
$(n_k, n_j)$ in $L^l$ are created by summing the weights of the edges in $G_{coarsest}$ which
connect the nodes of $G_{coarsest}$ corresponding to $P^l_k$ and $P^l_j$.
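The construction of a level graph from a partition assignment can be sketched as follows. This is an illustrative version: the function name `build_level_graph` and the dictionary-based inputs are assumptions, and edges inside a part are simply dropped.

```python
def build_level_graph(node_w, edges, part_of):
    """Collapse each part into one node and sum the weights.

    node_w:  {node: weight} for the nodes of the coarsest graph
    edges:   {(a, b): weight} for its edges
    part_of: {node: part id} at the current level
    """
    level_node_w = {}
    for node, w in node_w.items():
        p = part_of[node]
        level_node_w[p] = level_node_w.get(p, 0) + w

    level_edge_w = {}
    for (a, b), w in edges.items():
        pa, pb = part_of[a], part_of[b]
        if pa != pb:  # edges inside a part do not appear in the level graph
            key = (min(pa, pb), max(pa, pb))
            level_edge_w[key] = level_edge_w.get(key, 0) + w
    return level_node_w, level_edge_w


# Hypothetical 4-node graph split into two parts: {0, 1} and {2, 3}.
node_w = {0: 1, 1: 1, 2: 2, 3: 1}
edges = {(0, 1): 1, (1, 2): 1, (0, 2): 2, (2, 3): 1}
part_of = {0: 0, 1: 0, 2: 1, 3: 1}
print(build_level_graph(node_w, edges, part_of))
# -> ({0: 2, 1: 3}, {(0, 1): 3})
```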
Layout initialization (Step 5): The goal of this stage is to compute a good initial
layout of $L^l$. This is done based on the layout of $L^{l-1}$, and proceeds as follows.
Initially, each node $p_i \in L^l$ is placed at the position of its parent node in $L^{l-1}$,
whose layout was already computed. Next, the position of each node is scaled, as follows:
$$p_i(x, y) = \sqrt{\frac{|V(L^l)|}{|V(L^{l-1})|}} \cdot p_i(x, y), \qquad (3.1)$$
where $V(L^k)$ is the set of nodes in $L^k$. The intuition behind Eq. 3.1 is that the scale
should be proportional to the ratio between the number of nodes in the graphs $L^l$ and
$L^{l-1}$. A square root is used since the area of the graph should be scaled linearly with
the node ratio. Finally, an iterative algorithm is used to improve the placement. At each
iteration, each node $i$ is placed at the average between its current position, $p_i$, and the
average position of its neighbors, $N(i)$, as follows:
$$p_i = \frac{1}{2} \left( p_i + \frac{1}{\mathrm{degree}(i)} \sum_{j \in N(i)} p_j \right).$$
This procedure creates a good initial placement, which is used in the next step. In our
implementation 50 iterations are used.
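The initialization step can be sketched as follows. This is a minimal illustration rather than the original implementation: the function name `init_layout` is hypothetical, all nodes are assumed to be updated simultaneously from the previous iteration's positions, and isolated nodes are assumed to keep their position.

```python
import math


def init_layout(parent_pos, parent_of, neighbors, n_coarse, iterations=50):
    """Initial placement for a finer level graph.

    parent_pos: {coarse node: (x, y)} layout of the coarser level
    parent_of:  {fine node: coarse node}
    neighbors:  {fine node: [fine nodes]}
    n_coarse:   number of nodes in the coarser level
    """
    # Eq. 3.1: scale by the square root of the node-count ratio.
    scale = math.sqrt(len(parent_of) / n_coarse)
    pos = {i: (scale * parent_pos[p][0], scale * parent_pos[p][1])
           for i, p in parent_of.items()}
    for _ in range(iterations):
        new_pos = {}
        for i, (x, y) in pos.items():
            nbrs = neighbors[i]
            if not nbrs:  # isolated node keeps its position (assumption)
                new_pos[i] = (x, y)
                continue
            ax = sum(pos[j][0] for j in nbrs) / len(nbrs)
            ay = sum(pos[j][1] for j in nbrs) / len(nbrs)
            new_pos[i] = (0.5 * (x + ax), 0.5 * (y + ay))
        pos = new_pos
    return pos


# Two coarse nodes, each the parent of two nodes on the path 0 - 1 - 2 - 3.
parent_pos = {0: (1.0, 0.0), 1: (-1.0, 0.0)}
parent_of = {0: 0, 1: 0, 2: 1, 3: 1}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(init_layout(parent_pos, parent_of, neighbors, n_coarse=2, iterations=1))
```

After a single iteration, nodes 1 and 2 move halfway toward the midpoint of their neighbors, which begins to separate nodes that started at the same parent position.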
Layout (Step 6): In this stage, a layout for $L^l$ is computed, using our variant of the
force directed approach. This is done utilizing the multi-level scheme, until the final
layout of the finest graph, $L^{finest}$, is computed. Using this scheme, it is possible to
retain important information about the overall structure of the graph from previous layouts,
which is extracted from the spectral partitioning of the graph.
There are a couple of common approaches to performing force directed layout. The
first common approach, exemplified by the Fruchterman-Reingold (FR) algorithm [70],
computes the forces directly. Each node is moved according to the forces acting on it.
It computes “smooth” layouts, but is sensitive to the initial conditions given to it. A
second common approach, used in the Kamada-Kawai (KK) algorithm [113], derives an
energy function from the forces and attempts to minimize the energy in order to create
the layout. The node that reduces the energy the most is moved in each step. This
algorithm is less sensitive to the initial conditions. However, it requires an expensive
all-pairs shortest path calculation and the computed layouts are less “smooth”.
In this chapter, an approach that combines the strengths of both algorithms is used.
The key idea is to use the KK approach to give the overall structure of the graph and
reduce the sensitivity to initial conditions. Then, the computed layout is used as an input
to the FR-based algorithm. On finer graphs, only the faster FR layout is used. By doing
so, we get a good initial placement from the KK algorithm and a “smooth”, aesthetically
more pleasing layout from the FR algorithm. Note that a combined approach is used
in [97] in order to meet node-size constraints. In the current chapter, however, FR is
used to refine the layout of finer graphs in the multi-level hierarchy.
The most expensive step of the FR algorithm is the computation of all-pairs repulsive
forces between nodes, which is crucial for obtaining a good layout. This step is accelerated
in two ways. First, the graph is geometrically partitioned. Instead of calculating all-pairs
repulsive forces, as customary, approximate forces are calculated. An exact calculation
is performed only for nodes contained in the same partition, while an approximate cal-
culation is performed for nodes belonging to different partitions. Second, the calculation
of the forces is parallelized and performed on the GPU.
Graph $L^l$ is now partitioned geometrically, according to the current layout, so as to
balance the number of nodes per partition. This is important in order to achieve good
load balance between the parallel processors of the GPU (Section 3.5). Moreover, since
the nodes in each partition are geometrically localized, it is possible to approximate
each partition with a single “heavy” node, as discussed below.
Specifically, a KD-tree-type partitioning is created. The nodes are partitioned ac-
cording to their median, alternating between the X and Y coordinates. This recursive
subdivision terminates when the size of the subset is below the required partition size.
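The recursive median subdivision can be sketched as follows. The function name `kd_partition` is hypothetical, and stopping once a subset holds at most `max_size` nodes is an assumption about how the termination condition is applied.

```python
def kd_partition(nodes, pos, max_size, axis=0):
    """Recursively split nodes at the median, alternating x (0) and y (1)."""
    if len(nodes) <= max_size:
        return [nodes]
    ordered = sorted(nodes, key=lambda n: pos[n][axis])
    mid = len(ordered) // 2
    return (kd_partition(ordered[:mid], pos, max_size, 1 - axis)
            + kd_partition(ordered[mid:], pos, max_size, 1 - axis))


# Eight hypothetical nodes on a 4 x 2 grid, split into partitions of two.
pos = {i: (i % 4, i // 4) for i in range(8)}
print(kd_partition(list(range(8)), pos, max_size=2))
# -> [[0, 1], [4, 5], [2, 3], [6, 7]]
```

Splitting at the median keeps the partitions equal in size up to one node, which is the property needed for load balancing on the GPU.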
The algorithm is iterative. In each iteration, the KD-tree is updated according to the
current layout (when required). Then, the center of gravity is found for each partition
and is used to replace the nodes it contains. Next, the forces applied to each node are
computed. Finally, the nodes are displaced according to the forces acting on them, while
bounding the allowed displacement according to an exponential convergence schedule, which
resembles simulated annealing.
The key to achieving high performance is to perform these computations (i.e., finding
the center of gravity of the partitions, calculating the various forces acting on the nodes,
and calculating the displacements) in parallel on the GPU for each node/partition.
In particular, the repulsive and attractive forces that are computed in parallel for each
node are as follows. The difference from [70] is that the forces from distant partitions are
approximated using their center of gravity $CG$. For each node $v$ that belongs to partition
$P_i$,
$$F^{repl}(v) = K^2 \left( \sum_{u \neq v,\, u \in P_i} \frac{pos(v) - pos(u)}{\|pos(v) - pos(u)\|^2} + \sum_{P_j \neq P_i} |P_j| \, \frac{pos(v) - CG(P_j)}{\|pos(v) - CG(P_j)\|^2} \right)$$
$$F^{attr}(v) = \sum_{u : (u,v) \in E} \frac{\|pos(u) - pos(v)\| \, (pos(u) - pos(v))}{K},$$
where $pos(u)$ is the 2D position vector of node $u$ and $CG(P_i)$ is the 2D position vector
of the center of gravity of partition $P_i$.
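The two force expressions can be sketched on the CPU as follows. This is a sequential illustration of what each GPU thread computes for one node; the function names and data layout are hypothetical, and coincident positions (zero distance) are not handled.

```python
import math


def repulsive_force(v, pos, parts, part_of, K):
    """Exact repulsion inside v's partition, center of gravity outside."""
    fx = fy = 0.0
    px, py = pos[v]
    for idx, members in enumerate(parts):
        if idx == part_of[v]:
            for u in members:
                if u == v:
                    continue
                dx, dy = px - pos[u][0], py - pos[u][1]
                d2 = dx * dx + dy * dy
                fx += dx / d2
                fy += dy / d2
        else:
            # Replace the whole partition by |P_j| nodes at its center of gravity.
            cx = sum(pos[u][0] for u in members) / len(members)
            cy = sum(pos[u][1] for u in members) / len(members)
            dx, dy = px - cx, py - cy
            d2 = dx * dx + dy * dy
            fx += len(members) * dx / d2
            fy += len(members) * dy / d2
    return K * K * fx, K * K * fy


def attractive_force(v, pos, neighbors, K):
    """Sum over the edges of v of ||d|| * d / K, with d = pos(u) - pos(v)."""
    fx = fy = 0.0
    for u in neighbors[v]:
        dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
        d = math.hypot(dx, dy)
        fx += d * dx / K
        fy += d * dy / K
    return fx, fy


# Hypothetical layout with two partitions of two nodes each.
pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (4.0, 1.0), 3: (4.0, -1.0)}
parts = [[0, 1], [2, 3]]
part_of = {0: 0, 1: 0, 2: 1, 3: 1}
print(repulsive_force(0, pos, parts, part_of, K=1.0))   # (-1.5, 0.0)
print(attractive_force(0, pos, {0: [1]}, K=0.5))         # (2.0, 0.0)
```

For node 0, the neighbor in the same partition contributes (-1, 0) and the distant partition, treated as two nodes at (4, 0), contributes (-0.5, 0).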
The attractive and repulsive forces are then summed up in parallel for every node,
resulting in an approximation of the total force applied to each node, $F^{total}(v)$. Then,
each node is displaced, in parallel, using a simulated annealing technique, which
exponentially decreases the allowed displacement:
$$pos_{new}(v) = pos(v) + \frac{F^{total}(v)}{\|F^{total}(v)\|} \min(t, \|F^{total}(v)\|).$$
Here, $t$ is the bound for the maximum displacement, which is initialized to
$K \cdot \sqrt{|V|}$ and decreases at each iteration by a factor $\lambda$. In our
implementation, $K = 0.1$ and $\lambda = 0.9$. This makes the scale of the graph proportional
to the number of vertices it contains and makes the annealing process stop after 50 iterations.
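The bounded displacement and the decay of $t$ can be sketched as follows. The function names `displace` and `temperature_schedule` are hypothetical, and a node with zero total force is assumed to stay in place.

```python
import math


def displace(pos, force, t):
    """Move along the total force by at most t."""
    mag = math.hypot(force[0], force[1])
    if mag == 0.0:
        return pos  # no force, no movement (assumption)
    step = min(t, mag) / mag
    return (pos[0] + force[0] * step, pos[1] + force[1] * step)


def temperature_schedule(num_nodes, K=0.1, lam=0.9, iterations=50):
    """Bounds used in successive iterations: K * sqrt(|V|), then times lam."""
    t = K * math.sqrt(num_nodes)
    schedule = []
    for _ in range(iterations):
        schedule.append(t)
        t *= lam
    return schedule


print(displace((0.0, 0.0), (3.0, 4.0), t=1.0))   # clipped to length 1
schedule = temperature_schedule(10000)
print(schedule[0], schedule[-1])
```

With $\lambda = 0.9$, the bound in the 50th iteration is $0.9^{49}$ times the initial bound, that is, below one percent of it, which is why later iterations only make local corrections.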
The simulated annealing technique makes the graph slowly freeze into position. Thus,
later iterations perform increasingly local corrections to the layout. Because of this be-
havior, it is possible to perform geometrical KD partitioning of the graph with decreasing
frequency.
In our implementation, re-partitioning is done on iterations 1-4 and then every 10
iterations. A total of 50 FR iterations are performed [205]. KK layout is performed on
graphs smaller than 1000 nodes. This constant was selected so that the layout time will not
be dominated by KK layout, which requires performing an expensive all-pairs shortest
path calculation. We use 2000 iterations in each KK layout.
Layout of $G_{coarsest}$ (Step 7): In this step, the layout of $L^{finest}$ is extended to a
layout for $G_{coarsest}$. Here, the same method applied in Steps 4–6 is used. Instead of
interpolating positions from $L^{i-1}$ to $L^i$, an initial placement for $G_{coarsest}$ is
computed using the existing layout of $L^{finest}$. The mapping of nodes between
$G_{coarsest}$ and $L^{finest}$ is performed similarly to Step 4: each graph $P^{finest}_n$
corresponds to several nodes in $G_{coarsest}$. After computing an initial placement for
$G_{coarsest}$, layout proceeds as discussed in Steps 5–6.
Final un-coarsening (Step 8): This step extends the layout of $G_{coarsest}$ to a layout
of the original graph $G = G_0$. In each iteration, the layout of $G_i$ is used to compute an
initial placement for the nodes of the finer graph $G_{i-1}$, using the algorithm described in
Step 5. Then, the force directed algorithm of Step 6 is applied to the initial placement
of nodes in $G_{i-1}$.
In our implementation, we do not perform force directed layout of the final graph
$G_0$, for which the layout is the most expensive. Instead, using the layout of $G_1$ and the
interpolation algorithm for computing initial positions, we are able to get a good layout
for $G_0$.
Complexity: The most time consuming steps of the algorithm are spectral partitioning
and the FR force directed layout. Assuming that each KD partition of the graph contains
$C_s$ nodes, the asymptotic FR complexity is $O(|E| + |V| \cdot (C_s + \frac{|V|}{C_s}))$,
which is minimized to $O(|E| + |V|^{1.5})$ when $C_s = \sqrt{|V|}$. The spectral partitioning
takes $O(|V|^{1.5})$ [184]. Therefore, the total complexity is $O(|E| + |V|^{1.5})$. When
$|E| \approx |V|$, the dominating term is $|V|^{1.5}$. However, due to the calculation's
simplicity and its parallel implementation, the actual running times are low, as discussed
in Section 3.6.
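The choice $C_s = \sqrt{|V|}$ follows from minimizing the per-node cost, which consists of $C_s$ exact interactions inside the partition and $|V| / C_s$ approximate interactions with the other partitions. A short derivation, treating $C_s$ as a continuous variable (an assumption made for the calculation):

```latex
f(C_s) = C_s + \frac{|V|}{C_s}, \qquad
f'(C_s) = 1 - \frac{|V|}{C_s^2} = 0
\;\Longrightarrow\; C_s = \sqrt{|V|},
\qquad
|V| \cdot f\!\left(\sqrt{|V|}\right) = 2\,|V|^{1.5} = O\!\left(|V|^{1.5}\right).
```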
3.5 GPU Implementation
This section describes how the GPU is utilized to accelerate the force-directed layout. It
elaborates on key details, which are briefly introduced in [66]. Figures that illustrate the
overall process are included.
The key to high performance on the GPU is using multiple processors, which operate
in parallel. The GPU schedules the execution of multiple threads, thus hiding memory
access latency. Each thread runs a small program called a kernel program, which computes
a single element of the output stream.
In the following, we first describe how the data is stored on the GPU and then how
the stream processing is performed [26].
Data Storage: On the GPU, input and output are represented as two-dimensional
arrays of data, called textures. The challenge is to map the graph and its elements onto
textures, even though graphs do not admit any intuitive and natural representation as
balanced arrays. Below, we describe the textures used to represent the graph.
To represent the graph layout, three textures are used: one texture for the nodes and
two textures for the edges.
The location texture holds the (x,y) positions of all the nodes in the graph. Each
graph node has a corresponding (u,v) index in the texture. As shown in Figure 3.4, the
nodes in each partition are stored at a rectangular region in the location texture. Recall
that Section 3.4 described how to partition a graph, so that the nodes in each partition are
geometrically close and the number of nodes in each partition is similar. This partitioning
is critical for the acceleration of the layout on the GPU for two reasons. First, storing
neighboring nodes (those that belong to the same partition) together maximizes memory
access locality. Thus, it makes efficient use of the GPU’s memory bandwidth, since
information regarding neighboring nodes will most likely reside in the cache. Second, since
the number of nodes in each partition is similar, the amount of computation performed on
each node is balanced. Thus, it makes efficient use of the GPU’s data parallel architecture,
which requires lock-step execution.
The location texture also holds the partition number of each node. Given a partition
of maximum size $C_{sz}$, the height and width of each rectangular region representing a
partition are set to $h_{partition} = \max(8, \sqrt{C_{sz}})$ and
$\lceil C_{sz} / h_{partition} \rceil$, respectively.
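As a small worked example of these dimensions, the sketch below computes the region size for a few partition sizes. The function name `region_size` is hypothetical, and rounding the square root up to an integer number of texels is an assumption, since the rounding rule is not stated.

```python
import math


def region_size(c_sz):
    """Height and width of the texture region for a partition of c_sz nodes."""
    height = max(8, math.ceil(math.sqrt(c_sz)))  # rounding up is an assumption
    width = math.ceil(c_sz / height)
    return height, width


for c_sz in (30, 100, 200):
    h, w = region_size(c_sz)
    print(c_sz, (h, w), h * w)  # the region always has room for c_sz nodes
```

For small partitions the minimum height of 8 dominates, so a 30-node partition occupies an 8 by 4 region with the last row only partially filled.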
Graph edges are represented by a neighbors texture and by an adjacency texture, as
shown in Figure 3.5. The adjacency texture, whose size is $O(|E|)$, contains lists of (u, v)
pointers into the location texture. These lists represent the neighbors of each node.
The neighbors texture holds for each node a pointer into the adjacency texture, to the
coordinates of the first neighbor of the node. Pointers to additional neighboring nodes
are stored in consecutive locations in the adjacency texture. Doing so improves access
locality. The degree of each node is also stored in the neighbors texture. The size of the
neighbors texture is equal to that of the location texture.
Figure 3.4: Representing a graph on the GPU. Left: A graph spatially partitioned into
partitions; right: a corresponding location texture
Figure 3.5: Representing graph edges on the GPU. Node X has three neighbors: Y,Z and
W.
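The relation between the neighbors and adjacency textures resembles a compressed adjacency list. The sketch below illustrates the idea with flat Python lists; the function names are hypothetical, and 1D offsets are used in place of the 2D (u, v) texture coordinates of the actual representation.

```python
def build_edge_arrays(neighbors):
    """neighbors[i] lists the neighbors of node i.

    Returns per-node first-neighbor offsets and degrees (the role of the
    neighbors texture) and the concatenated lists (the adjacency texture).
    """
    first, degree, adjacency = [], [], []
    for nbrs in neighbors:
        first.append(len(adjacency))
        degree.append(len(nbrs))
        adjacency.extend(nbrs)
    return first, degree, adjacency


def neighbors_of(i, first, degree, adjacency):
    """Read the consecutive adjacency entries belonging to node i."""
    return adjacency[first[i]:first[i] + degree[i]]


# Node 0 plays the role of X in Figure 3.5, with neighbors Y, Z and W (1, 2, 3).
first, degree, adjacency = build_edge_arrays([[1, 2, 3], [0], [0], [0]])
print(first, degree, adjacency)
print(neighbors_of(0, first, degree, adjacency))  # [1, 2, 3]
```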
The geometric (KD) partitions (described in Section 3.4, Step 6) are represented
using two textures: the partition information texture and the partition center of gravity
texture. The partition information texture holds the following information: (u0, v0) –
the coordinates in the location texture of the upper left corner of the partition, the width
and height of the partition rectangle in the location texture, the number of nodes in
the last row of the partition (which may be partially filled), and the number of nodes
in the partition. The partition center of gravity (C.G.) texture holds the current (x,y)
coordinates of the center of gravity of each partition. Two textures are used to represent
partitions not only because each texture is limited in the number of fields (to 4), but also
to separate between the constant information and the information modified during the
layout computation (i.e., center of gravity).
The forces computed during layout iteration are stored in two textures in a straightfor-
ward manner: the attractive force texture and the repulsive force texture. The attractive
force texture contains for each node the sum of the attractive forces F attr exerted on it by
its neighbors. The repulsive force texture holds the sum of repulsive forces, F repl: both
by nodes in the same partition and by the other partitions in the graph. Both textures
have the same dimensions as the location texture and contain the 2D components of the
forces, (Fx, Fy).
Stream processing: On the GPU, computation is performed by selecting the rendering
target, which is the stream, or the texture, to which the output should be written.
Next, an appropriate kernel program is loaded. Finally, graphics primitives such as
quadrilaterals, are rendered in order to invoke the computation. For each pixel in the
primitive (i.e., that the quadrilateral covers), the loaded kernel program is executed.
Below we describe the order of invocations of the kernel programs, and their input and
output textures. Figure 3.6 displays the execution graph of the algorithm.
The algorithm is composed of three main stages, each implemented as a separate
foreach loop, which is executed in parallel for all elements on the GPU. The first
loop calculates the center of gravity of each partition. The second loop calculates the
forces acting on each node. The third loop displaces nodes using simulated annealing.
The partition CG (center of gravity) kernel calculates the center of gravity of each
partition. The kernel reads information about each partition from the partition informa-
tion texture and from the location texture and writes its result into the partition center
of gravity texture. The GPU operates on all partitions in parallel.
Figure 3.6: Execution graph of GPU layout (rectangles = streams, ovals = kernels)

The repulse kernel, which is the most time-consuming kernel, calculates the repulsive
forces exerted on each node. The kernel reads information from the partition information,
the partition center of gravity, and the location textures. The output of the kernel is
written to the repulsive force texture. For each fragment, the kernel first calculates the
internal forces (exerted by the nodes contained in the partition that the node belongs to).
Then, it approximates the forces exerted by all other partitions. Both calculations use
branching and looping instructions in order to iterate over all other nodes in the
partition and over all other partitions. Since the partitions are similarly sized, good
branching consistency is maintained.
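
The two loops of the repulse kernel can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the force law (a Fruchterman-Reingold style magnitude K²/d), the constant `K`, the weighting of a remote partition by its node count, and all function and variable names are hypothetical and are not taken from the implementation described here.

```python
import math

K = 1.0  # assumed ideal edge length (hypothetical parameter)


def repulsion(p, q, weight=1.0):
    """Force pushing point p away from point q, with assumed magnitude weight*K^2/d."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)  # coincident points: no well-defined direction, skip
    magnitude = weight * K * K / d
    return (magnitude * dx / d, magnitude * dy / d)


def repulse_kernel(node, locations, part_of, partitions, centers):
    """Repulsive force on one node: exact inside its partition, approximated outside."""
    fx = fy = 0.0
    own = part_of[node]
    # Loop 1: internal forces from the other nodes of the node's own partition.
    for other in partitions[own]:
        if other != node:
            f = repulsion(locations[node], locations[other])
            fx, fy = fx + f[0], fy + f[1]
    # Loop 2: every other partition is replaced by its center of gravity.
    for p, members in enumerate(partitions):
        if p != own:
            f = repulsion(locations[node], centers[p], weight=len(members))
            fx, fy = fx + f[0], fy + f[1]
    return (fx, fy)


locations = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
partitions = [[0, 1], [2, 3]]
part_of = [0, 0, 1, 1]
centers = [(0.5, 0.0), (11.0, 0.0)]
force_on_0 = repulse_kernel(0, locations, part_of, partitions, centers)
print(force_on_0)
```

Both loops have trip counts that depend only on partition sizes and on the number of partitions, which is why similarly sized partitions keep the branching of neighboring fragments consistent.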
The attract kernel calculates the attractive forces caused by graph edges. It reads
the neighbors, adjacency, and location textures and writes its output to the attractive
forces texture. For each node, the kernel accesses the neighbors texture in order to get
a pointer into the adjacency texture, which stores, for each of the node’s neighbors, its
(u,v) texture coordinates in the location texture. For each neighboring node, the
attractive force is calculated and accumulated.
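
The indirection through the neighbors and adjacency textures resembles a compressed adjacency list. The sketch below illustrates this lookup in Python. It assumes that each neighbors entry holds a start offset and a neighbor count, that adjacency entries are flat node indices rather than (u,v) texture coordinates, and that the attractive force has the Fruchterman-Reingold style magnitude d²/K; these choices and all names are hypothetical.

```python
import math

K = 1.0  # assumed ideal edge length (hypothetical parameter)


def attract_kernel(node, neighbors, adjacency, locations):
    """Accumulate the attractive force on one node over all of its neighbors."""
    start, count = neighbors[node]      # "pointer" into the adjacency array
    px, py = locations[node]
    fx = fy = 0.0
    for slot in range(start, start + count):
        qx, qy = locations[adjacency[slot]]
        dx, dy = qx - px, qy - py
        d = math.hypot(dx, dy)
        if d > 0.0:
            magnitude = d * d / K       # assumed force law
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    return (fx, fy)


# Path graph 0 - 1 - 2, stored as (start, count) pairs plus a flat adjacency array.
neighbors = [(0, 1), (1, 2), (3, 1)]
adjacency = [1, 0, 2, 1]
locations = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0)]
force_on_1 = attract_kernel(1, neighbors, adjacency, locations)
print(force_on_1)
```

Since every edge appears once in the adjacency list of each of its endpoints, the total work over all nodes is proportional to the number of edges.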
graph          |V|      |E|      FM³ alg.   our alg.   our alg.     our alg.
                                 2.8GHz     3GHz       2.4GHz       2.4GHz Core 2 Duo
                                 Pentium    Pentium    Core 2 Duo   + 8800GTS GPU
flower B       9030     131241   11.9       3.25       2.21         1.59
4elt           14588    40176    N/A        8.094      4.973        3.237
crack          10240    30380    23.0       4.844      3.018        2.44
bcsstk31       35586    572913   83.6       25.329     14.199       5.754
bcsstk32       44609    985046   110.9      39.266     22.549       9.617
bcsstk33       8738     291583   23.8       5.141      2.986        2.486
fe pwt         36463    144794   69.0       22.985     13.48        5.44
finan512       74752    261120   158.2      79.268     43.645       12.267
fe ocean       143437   409593   355.9      158.849    86.32        15.536
Sierpinski 08  9843     19683    16.8       5.25       3.127        2.705

Table 3.1: Graph information and running time [sec.]. Runtime columns show total
running times for computing a layout.

Finally, the anneal kernel calculates the total force on each node. It reads the
attractive force, repulsive force, and location textures and updates a second copy of the
location texture. This double-buffering technique is used because the GPU cannot read
from and write to the same stream. In the next iteration, the updated location texture
is bound as input to the different kernels, thus facilitating feedback in our computation.
The anneal kernel also bounds the total displacement of each node according to the
current temperature of the layout. This temperature decreases exponentially at every
iteration, hence allowing the graph to “freeze” into its final layout.
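
The anneal step and the double-buffering scheme can be sketched as follows. The sketch assumes that the displacement of a node is the total force clamped to a length of at most the current temperature and that the temperature is multiplied by a constant factor after every iteration; the factor `COOLING`, the initial temperature and all names are hypothetical. For brevity the forces are kept fixed between iterations, whereas the actual algorithm recomputes them from the updated locations.

```python
import math

COOLING = 0.9  # assumed exponential cooling factor (hypothetical value)


def anneal_kernel(loc_in, loc_out, attract, repulse, temperature):
    """Read positions from loc_in and write displaced positions to loc_out."""
    for i, (x, y) in enumerate(loc_in):
        fx = attract[i][0] + repulse[i][0]
        fy = attract[i][1] + repulse[i][1]
        norm = math.hypot(fx, fy)
        if norm > temperature:           # bound the displacement by the temperature
            scale = temperature / norm
            fx, fy = fx * scale, fy * scale
        loc_out[i] = (x + fx, y + fy)


temperature = 1.0
front = [(0.0, 0.0), (1.0, 0.0)]         # buffer that is read in this iteration
back = [None] * len(front)               # buffer that is written in this iteration
attract = [(3.0, 4.0), (0.0, 0.0)]
repulse = [(0.0, 0.0), (0.1, 0.0)]

for _ in range(2):
    anneal_kernel(front, back, attract, repulse, temperature)
    front, back = back, front            # swap roles: the output becomes the next input
    temperature *= COOLING

print(front, temperature)
```

Swapping the two buffers after each pass plays the role of binding the updated location texture as the input of the next iteration.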
In total, the partition CG kernel performs O(|V|) operations; the repulse kernel
performs O(|V|^1.5) operations; the attract kernel performs O(|E|) operations; and the
anneal kernel performs O(|V|) operations. On the GPU, the computations within each kernel
are run in parallel.
3.6 Results
Our algorithm was tested on several well-known graphs, commonly used in the graph
drawing literature [204]. The bcsstk* graphs represent stiffness matrices. The Sierpinski
graph is a self-similar fractal composed of triangles. The finan512 graph is taken from
a linear programming matrix. The flower B graph is constructed by joining 6 circles of
length 50 at a single node before replacing each of the nodes by a complete subgraph
with 30 nodes (K30) [92]. The 4elt and crack graphs are 2D finite-element meshes.
The fe * graphs are unstructured meshes related to fluid dynamics, structural mechanics,
or combinatorial optimization problems. Figures 3.7-3.11 show some of the layouts
computed by our algorithm, whereas Table 3.1 gives information about the graphs. Each
image is accompanied by a layout computed by another algorithm [75,92].
It can be seen that the layouts computed by our algorithm compare well with FM³ [91].

Figure 3.7: bcsstk31. Red: our layout, black: FM³ layout
Figure 3.8: Sierpinski 08. Red: our layout, black: FM³ layout
Figure 3.9: finan512. Red: our layout, black: FM³ layout
Figure 3.10: flower B. Red: our layout, black: FM³ layout
Figure 3.11: 4elt. Red: our layout, black: Kamada-Kawai layout

The bcsstk31 graph (Figure 3.7) has a high edge density: |E|/|V| = 16. Moreover, it
has a regular mesh-like structure. This regularity is extracted in our layout, as a result
of the good partitioning and interpolation of the graph. Figure 3.8 shows the Sierpinski
graph, which demonstrates that the symmetry of the graph is maintained, even though
the holes in the graph are challenging, compared to more uniform mesh graphs. Figure 3.9
demonstrates the layout of the topologically challenging finan512. It is of similar quality
to FM³ and better than the other algorithms compared in [92]. Figure 3.10 shows the
flower B graph, which has a relatively high edge density: |E|/|V| ≥ 14. Here, k = 6 is
used for partitioning the graph and KK layout is performed on graphs of up to 128 nodes.
The 4elt graph, shown in Figure 3.11, exhibits large variations in node density and is thus
challenging for an algorithm that seeks to maintain equal edge lengths [205]. The layout
manages to show the interesting features of the graph: planarity and holes. Our layout
is more uniform and contains fewer overlaps than the Kamada-Kawai layout from [75].
For the performance tests, a PC equipped with a 2.4 GHz Intel Core 2 Duo CPU
and an NVIDIA 8800GTS GPU is used. Our algorithm was implemented in C++, Cg,
and OpenGL. Table 3.1 shows the running time of our algorithm when using only the
CPU and when using the GPU to accelerate the computation. It also shows the running
times for the FM³ algorithm, produced on a 2.8 GHz Intel Pentium 4 CPU. In addition, it
shows our algorithm on a slower machine (3.0 GHz Pentium 4), which is comparable to
the machine used for the reported experiments of FM³ [92].
Running our algorithm on a new GPU-equipped machine achieves a speedup of up to 22
times over FM³ running on an older machine. The GPU accelerates the total computation
time by a factor of up to 5.5. Without the GPU, on comparable hardware, our algorithm
runs 2-4 times faster than FM³.
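
The speedup figures can be recomputed directly from the running times in Table 3.1. The short Python sketch below does so; the data is copied from the table, while the helper `ratios` and the variable names are only illustrative.

```python
# Rows of Table 3.1: (graph, FM3 on 2.8 GHz Pentium 4, ours on 3 GHz Pentium 4,
#                     ours on 2.4 GHz Core 2 Duo, ours on Core 2 Duo + 8800GTS GPU)
TABLE_3_1 = [
    ("flower B", 11.9, 3.25, 2.21, 1.59),
    ("4elt", None, 8.094, 4.973, 3.237),   # no FM3 time is reported for 4elt
    ("crack", 23.0, 4.844, 3.018, 2.44),
    ("bcsstk31", 83.6, 25.329, 14.199, 5.754),
    ("bcsstk32", 110.9, 39.266, 22.549, 9.617),
    ("bcsstk33", 23.8, 5.141, 2.986, 2.486),
    ("fe pwt", 69.0, 22.985, 13.48, 5.44),
    ("finan512", 158.2, 79.268, 43.645, 12.267),
    ("fe ocean", 355.9, 158.849, 86.32, 15.536),
    ("Sierpinski 08", 16.8, 5.25, 3.127, 2.705),
]


def ratios(slow_col, fast_col):
    """Per-graph ratio of two runtime columns, skipping missing entries."""
    result = {}
    for row in TABLE_3_1:
        slow, fast = row[slow_col], row[fast_col]
        if slow is not None and fast is not None:
            result[row[0]] = slow / fast
    return result


fm3_vs_gpu = ratios(1, 4)   # FM3 versus our algorithm with the GPU
cpu_vs_gpu = ratios(3, 4)   # our algorithm on the same machine, without and with the GPU
fm3_vs_p4 = ratios(1, 2)    # FM3 versus our algorithm on comparable CPUs

best_graph = max(fm3_vs_gpu, key=fm3_vs_gpu.get)
print(best_graph, round(fm3_vs_gpu[best_graph], 1))
print(round(max(cpu_vs_gpu.values()), 1))
print(round(min(fm3_vs_p4.values()), 1), round(max(fm3_vs_p4.values()), 1))
```

On these numbers the largest ratios are obtained for the fe ocean graph, the largest graph in the table: roughly 22.9 against FM³ and roughly 5.6 for the GPU over the CPU-only run. The per-graph ratios on comparable CPUs range from about 2.0 to about 4.7.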
3.7 Visualization of ISP Router Networks
We have applied our algorithm to the visualization of Internet Service Provider (ISP)
router networks. The router network of an ISP consists of several points of presence
(POPs), each of which houses several routers. These routers are connected to the backbone
of the ISP and to routers connected to subscribers of the ISP. The data is taken from [2].
It was collected by using the traceroute tool to determine the route taken by packets
traversing the ISP’s network [186].
Figures 3.1 and 3.12 show layouts of the networks of several ISPs. Each node in the
graph corresponds to a router. Edges represent links between routers. Red nodes are not
associated with any ISP in the data – they are used to connect the ISP to the rest of the
Internet. The other nodes are color coded according to the ISP they belong to.
The layouts make evident some facts about these networks. First, most routers of
each ISP are clustered together. This can be seen from the large clusters of nodes having
the same color (excluding the red nodes). Second, two clusters are evident in Figure 3.12:
the brown cluster on the left, which represents an Australian ISP, and the rest of the
graph. The yellow and pink nodes represent European ISPs. The black and blue nodes
represent North American ISPs. The strongest connections exist between the two North
American ISPs. There are good connections between European and North American
ISPs. Connections between the Australian ISP and the other ISPs are sparser. Third,
the per-ISP clusters are further divided into small clusters of routers, perhaps in the
same city or nearby area. For instance, it can be seen that the brown routers belong to
a couple of clusters. Fourth, the red external routers, which do not belong to any ISP,
are used to link to the external world (outside the ISPs visualized). Fifth, the number of
external routers is about the same as the number of internal routers, hence each router
has one link on average to the world outside the ISP it belongs to. Sixth, the routers
have varying degrees. Some have high degree and are central points (such as the router
connecting the brown ISP and the yellow ISP), while others have low degree.

Figure 3.12: ISP router map. Each node represents a router. Edges link routers. Red
nodes are external to the ISPs visualized. Other nodes are colored according to the ISP
they belong to: blue - Abovenet (US, 665 routers); black - Exodus (US, 554 routers);
yellow - Ebone (Europe, 314 routers); pink - Tiscali (Europe, 514 routers); brown - Telstra
(Australia, 3756 routers). A total of 10895 routers and 15667 connections are shown. Top
left - GRIP layout. Bottom right - our layout.

Figure 3.12 also compares our layout to one computed by GRIP [72]. It can be
seen that GRIP’s layout does not display the overall, clustered structure of the graph.
Moreover, important edges, such as the ones connecting the brown cluster to the other
part of the graph, are not visible. However, the GRIP layout contains less overlap between
nodes. To compare the performance, both layouts were computed using only the CPU
on a 3GHz Pentium PC. Linux, required for GRIP, is not available on the PC with the
GPU. The running time of GRIP was 3 seconds and the running time of our algorithm
was 12 seconds. Trying to modify the parameters of GRIP resulted in a higher runtime,
but without an improvement in layout quality.
3.8 Conclusion and Future Work
This chapter has presented a new algorithm for multi-level force directed layout of graphs
on the GPU. The algorithm has several key ideas. First, it is a multi-level algorithm
based on spectral partitioning of the graph. Second, the algorithm combines the strengths
of both the Kamada-Kawai and Fruchterman-Reingold approaches, in order to compute a good
layout fast. Third, a geometric partitioning and interpolation method is proposed, which
facilitates the generation of good initial layouts of the finer versions of the graph.
Moreover, the chapter has demonstrated how the GPU can be used to accelerate the
algorithm by a factor of up to 5.5 compared to our CPU implementation.
Last but not least, it has been demonstrated that the algorithm computes meaningful
high quality layouts, while requiring significantly lower running times than existing algo-
rithms of similar quality. Moreover, the algorithm was applied to visualize ISP networks.
There are several avenues for future research. Using the stress majorization algorithm
[74] can help improve the coarsest layout computed. Computing the Fiedler vector,
which is used for the spectral partitioning, on the GPU, can further accelerate the
algorithm. This is a non-trivial task which requires sparse matrix multiplications, which
are difficult to accelerate on the GPU. However, in our case we are tasked with computing
many Fiedler vectors on different parts of the partitioned graph. Performing these
computations in parallel on the GPU can help improve the results. Creating a more balanced
graph hierarchy can help improve both runtime and layout quality. Currently, the
algorithm does not take steps to balance the number of nodes in each part of
the graph in the spectral partitioning phase. An improved graph partitioning algorithm,
such as one based on [183], may further improve the layout quality. Finally, a better force
approximation scheme may help improve the results.
Chapter 4
Uncluttering Graph Layouts Using Anisotropic Diffusion and Mass Transport
Many graph layouts include very dense areas, making the layout difficult to understand.
In this chapter, we propose a technique for modifying an existing layout in order to reduce
the clutter in dense areas. A physically-inspired evolution process, based on a modified
heat equation is used to create an improved layout density image, making better use of
available screen space. Using results from optimal mass transport problems, a warp to
the improved density image is computed. The graph nodes are displaced according to
the warp. The warp maintains the overall structure of the graph, thus preserving the
mental map, while reducing the clutter in dense areas of the layout. The complexity
of the algorithm depends mainly on the resolution of the image visualizing the graph
and is linear in the size of the graph. This allows scaling the computation according
to required running times. It is demonstrated how the algorithm can be significantly
accelerated using a graphics processing unit (GPU), resulting in the ability to handle
large graphs in a matter of seconds. Results on several layout algorithms and applications
are demonstrated. The material in this chapter is based on [69].
The rest of this chapter is structured as follows. Section 4.1 gives an introduction.
Related work is reviewed in Section 4.2. The algorithm is presented in Section 4.3. An
algorithm for the solution of mass transport problems and its connection to this chapter
are discussed in Section 4.4. Methods to reduce the running time of the algorithm
on a GPU are presented in Section 4.5. Results are presented in Section 4.6. Finally,
Section 4.7 concludes.
4.1 Introduction
Graph layouts often contain a highly varying local density. While some regions in the
generated layouts are sparse or even empty, others are very dense, containing many close-
by or overlapping edges and nodes. This results in low efficiency in utilizing the available
screen space.
Instead of developing a new layout algorithm, this chapter describes an algorithm
that can improve a given graph layout. This allows the user to select a layout algorithm
that is suited for the application at hand. The clutter in the layout can then be reduced
by our algorithm, resulting in a layout with a smaller node density in the high-density
regions of the original layout. This is achieved while preserving the overall structure of
the graph. Figure 4.1(a) shows an example of a cluttered layout. The layout is difficult
to read and the available screen space is not used effectively. Figure 4.1(b) shows the
enhanced layout. Note how the screen space is more efficiently used, allowing more details
of the graph to become visible.
Previous research has addressed the problem of reducing the visual clutter of graph
layouts. Lyons et al. [138] use a combination of a Voronoi diagram and a
force-directed type approach [50, 70, 113] in order to disperse nodes clustered together.
Merrick and Gudmundsson [143] modify the layout based on properties of the structure
of the underlying graph. However, these algorithms employ schemes that are either
computationally expensive or perform local improvements to the graph. In contrast,
the algorithm in this chapter is able to operate on large graphs, making a more global
enhancement to the layout.
Instead of operating on the abstract graph representation, the algorithm proposed in
this chapter operates on an image of the density of the input layout. The density image
is modified, making use of low-density regions in order to reduce the visual complexity in
high-density regions of the layout. A physically-inspired evolution of the density image
using a modified heat diffusion process is used to create the target density image. Given
the target density, a warp of the 2D layout is computed, in which dense regions are allowed
to expand and make use of available screen space. The warp is computed using results
from optimal mass transport problems [10, 94, 115]. The evolution process attempts to
retain the overall structure of the input graph layout, thus preserving the user’s mental
map [145] of the layout.
This chapter makes two contributions. First, a new algorithm for uncluttering
graph layouts in a mental-map preserving fashion is presented. Second, a method for
accelerating the computation of the target density, which is the most time-consuming
stage of the algorithm, using a graphics processing unit (GPU), is described. Several
examples, using various layout algorithms and applications, are provided to demonstrate
the capabilities of the algorithm.
Figure 4.1: Protein graph (V=30727, E=1206654). (a) FM³ [91] layout. (b) Improved
layout. Note how displacing nodes outwards allows more details to become visible,
especially in the center of the drawing. Also note that the overall structure of the graph
is maintained.
4.2 Related Work
This work is related to three sub-fields: algorithms for graph uncluttering, node overlap
removal in graph drawing and overlap removal in areas outside of graph drawing. In this
section we discuss related work in these fields.
Several papers have addressed the graph uncluttering problem. Lyons et al. [138]
attempt to more evenly distribute the nodes while maintaining the user’s mental map of
the original layout. Two algorithms are presented. The first uses a Voronoi diagram in
order to move nodes. The second algorithm repositions nodes inside a region defined by
a Voronoi diagram, according to the forces acting on them, defined using a force-directed
approach [50,70,113]. Using a Voronoi diagram performs only local enhancements, which
may not be sufficient in order to reduce clutter in dense areas of the graph.
Merrick and Gudmundsson [143] propose a technique for enlarging dense areas of
a given graph layout and shrinking sparse areas. Their algorithm first determines the
important nodes, then calculates the desired edge lengths, and finally repositions vertices
using the algorithm of Shimizu and Inoue [185], which tries to minimize the change in
the angles of the edges. Determining the important nodes, called node centrality, is an
expensive operation, taking O(V · E) for V nodes and E edges. It is thus not scalable
to large graphs. Centrality is determined according to graph-theoretic properties of the
underlying graph, which do not take the actual layout into account. Therefore, the
algorithm is not effective at uncluttering dense areas of the graph with non-central nodes.
Our algorithm attempts to solve these problems.
Two problems are related to, yet distinct from, graph uncluttering: node overlap
removal in graphs and overlap removal in other fields such as map cartography.
Hereafter we describe some related work on these issues.
While most graph drawing algorithms assume that nodes are dimensionless (i.e.,
point-sized), in practice nodes may be labeled, and the labels may overlap. Several
algorithms have been developed to remove overlaps between nodes.
Chuang et al. [35] use potential fields in order to remove overlaps. Gansner and
North [77] use an iterative Voronoi diagram method in order to tidy up the layout. Harel
and Koren [97] use a combination of a Kamada-Kawai [113] method and a modified
spring method, which takes node shapes into account when calculating forces in order to
converge to an overlap-free layout. Marriott et al. [141] use a constrained optimization
approach in order to remove overlaps. Eades and Nikolov [133] remove overlaps using
spring algorithms, followed by displacement of nodes in a way that preserves the mental
map as measured by the orthogonal node ordering model. Huang et al. [106] discuss the
force-transfer algorithm, which pushes overlapping nodes away from each other. Dwyer
et al. [49] solve a constrained optimization problem for each dimension separately.
The graph uncluttering problem addressed in this chapter is different from the node
overlap removal problem. Overlap removal attempts to compute a minimal displacement
of nodes in order to avoid overlaps, but may result in graphs that are still difficult to
comprehend since they include very dense areas. Moreover, while the algorithms discussed
above deal with removing overlaps between a small number of large, labeled nodes, our
algorithm attempts to improve layouts of large, dense graphs in a mental-map preserving
fashion. Finally, graph uncluttering attempts to maintain the original structure of the
graph, while overlap removal does not necessarily have this aim.
Figure 4.2 shows a comparison between the results of using a node overlap removal
algorithm [77] and using our graph uncluttering algorithm. It can be seen that the overlap
removal algorithm not only modifies the structure of the graph, but also leaves some dense
areas (Figure 4.2(b)). Our uncluttering algorithm improves the layout in a mental-map
conserving manner by expanding the graph to empty regions (Figure 4.2(c)).
(a) Input layout (V=247, E=1230)   (b) Removing node overlaps   (c) Uncluttering using our algorithm
Figure 4.2: Comparison between node overlap removal and graph uncluttering. (a) is a
layout produced using neato [79] of a reduced version of the bcsstk32 graph from [204].
In (b) the node overlap removal algorithm from [77] is used. Note that although the
overlaps between nodes are eliminated, the structure of the graph is not maintained and
the center of the layout is cluttered. In (c) our algorithm is used. Note how the cluttered
right side of the input layout is expanded, thus increasing node separation, while the
structure of the graph is maintained.
Overlap removal problems also arise in fields other than graph drawing. Deussen et
al. [45] present an extension of Lloyd’s method for distributing objects on the plane in
order to create stipple drawings. Chan et al. [32] use a density constrained minimization
formulation in order to compute overlap-free placements for components in integrated
circuits. Hayashi et al. [101] present an O(n²) algorithm for finding the minimum area
layout of a set of n rectangles that avoids intersections and preserves the orthogonal
ordering of the rectangles.
Map cartography attempts to create maps in which the size of regions is in proportion
to their population or some other analogous property. Gastner and Newman [82] perform
diffusion in order to create maps which have a uniform information density. There are
a couple of differences between their work and this chapter. First, in cartography an
attempt to conserve the area is made, while our algorithm tries to use sparse or empty
regions of the screen. Second, while isotropic diffusion is used in [82], anisotropic
diffusion is used here in order to avoid “collisions” between neighboring dense areas of
the graph.
4.3 The Algorithm
Given Linitial, which is a straight-edge layout of an undirected graph G = (V,E), the
goal of the algorithm is to produce an enhanced layout Lfinal. This layout should make
better use of the available screen space by dispersing nodes from high density regions to
surrounding regions, while maintaining the structure of the original layout. The algorithm
utilizes several key ideas. First, for each pixel in the image of the layout, we compute
the density of the information it contains. Second, we perform an evolution process in
order to improve this density, making use of unused areas of the image and reducing
the density in congested areas. Third, a warp is computed between the initial and the
improved densities. This image warp is used to modify the graph layout in a mental-map
preserving way, resulting in an enhanced layout. Algorithm 1 gives an overview of the
steps of the algorithm. We elaborate on each of these steps below.
Algorithm 1 Layout improvement algorithm
input: Linitial, layout of a graph G=(V,E)
output: Lfinal, modified layout of G
1. Compute Dinitial, the density image of the layout Linitial.
2. Calculate Dsmooth, a smoothed density image of Linitial, using the heat equation.
3. Calculate Dtarget, the target density image, using a modified heat evolution.
4. Calculate an optimal mapping u between Dsmooth and Dtarget.
5. Calculate Lfinal by displacing nodes according to the mapping u.

Computing the density image of the layout (Step 1): The first step of the
algorithm computes the density Dinitial of the given layout Linitial, as illustrated in
Figure 4.3(a) and (c). The intensity of each pixel in the density image is proportional to
the number of graph elements that cover the pixel. Using the density image, the cluttered
areas of the graph, which we wish to visualize more clearly, can be identified.
The density image can be computed using only the nodes or both the nodes and edges
of the graph. Our experiments indicate that using only the nodes produces better results.
This is because each edge has a rigid structure, while node concentrations consist of
individual points which can be dispersed by our algorithm to generate a more
understandable layout. The resolution of the computed image is configurable by the user.
While small grids reduce the running time of the algorithm, the quality of the results can
suffer, especially for large, dense graphs. In our experience, using a resolution of 257 by
257 pixels gave good results at a reasonable running time for a large variety of graphs,
and thus was used as the default. (Note that the multigrid algorithm requires a resolution
equal to k · 2^m + 1 where k,m ∈ N (see Section 4.4) [24].)
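
As a small worked example of the resolution constraint, the sketch below decomposes a resolution n into the form k · 2^m + 1 with the largest possible m. The function name `multigrid_levels` is hypothetical, and reading m as the number of times the grid can be halved is an assumption based on common multigrid practice rather than a statement from this chapter.

```python
def multigrid_levels(n):
    """Return (k, m) such that n == k * 2**m + 1, with m as large as possible."""
    if n < 2:
        raise ValueError("the resolution must be at least 2")
    k, m = n - 1, 0
    while k % 2 == 0:   # factor out powers of two from n - 1
        k //= 2
        m += 1
    return k, m


for n in (257, 193, 256):
    k, m = multigrid_levels(n)
    print(n, "=", k, "* 2^", m, "+ 1")
```

The default resolution of 257 decomposes as 1 · 2^8 + 1, whereas a resolution of 256 only admits m = 0.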
In our implementation, the density is computed using OpenGL and the GPU. Since
we are interested in identifying areas where several graph elements (i.e. nodes) occupy
the same screen pixel (i.e. overlap), we use blending in order to accumulate the density.
This is achieved by using a rendering mode in which the color of different overlapping
rendered primitives is accumulated. Thus, pixels that contain more graph elements will
have a higher value in the density image. Anti-aliasing is used to render a smoother
image.
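
The accumulation performed by blending can be imitated on the CPU as shown in the following sketch, which is not the OpenGL implementation. It assumes that only nodes are rendered, that each node contributes one unit to the single pixel it falls into, and that the layout is scaled uniformly to fit the image; anti-aliasing is omitted, and the name `density_image` is hypothetical.

```python
def density_image(layout, size=257):
    """Accumulate a size x size node density image from a list of (x, y) positions."""
    xs = [p[0] for p in layout]
    ys = [p[1] for p in layout]
    min_x, min_y = min(xs), min(ys)
    # Uniform scale so that the layout fits the image; guard against a zero extent.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    image = [[0.0] * size for _ in range(size)]
    for x, y in layout:
        col = int(round((x - min_x) / span * (size - 1)))
        row = int(round((y - min_y) / span * (size - 1)))
        image[row][col] += 1.0   # additive accumulation, as with blending
    return image


layout = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
image = density_image(layout, size=5)
print(image[0][0], image[2][2], image[4][4])
```

Pixels hit by several nodes receive proportionally larger values, which is the property used to identify cluttered areas.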
Note that in this chapter density images are used to compute an improved layout.
However, there can be other uses of density images. For instance, in [202] they have been
used to aid in visualization.
Smoothing the density image (Step 2): In this step the image Dinitial is smoothed
in order to create the image Dsmooth. This is a pre-processing phase that creates an
input that is more suitable for the warping algorithm in Step 4 and hence improves its
numerical stability.
We base the smoothing algorithm on the heat equation [191]. This is a partial differ-
ential equation (PDE) that models the variation of the temperature in a region over time.
Intuitively, this PDE implies that the rate of change in temperature over time depends
on the temperature difference between a point and its neighbors. The PDE describes a
diffusion process that can be used for smoothing. In addition, it has the desirable prop-
erty that given a potentially discontinuous initial temperature, it very rapidly becomes
continuous.
Given a 2D domain Ω we define the temperature in each point in the domain as
u(x, y). The heat equation is
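\[
\frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) \equiv k\nabla^2 u, \tag{4.1}
\]
where ∇² is the Laplacian operator and k is a constant describing the rate of heat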
diffusion. In our case, u(x, y) is set to the density Dinitial(x, y) computed in Step 1
and it is evolved to compute the smoother density Dsmooth(x, y). Appropriate boundary
conditions need to be set on the values of u. We define u = 0 on the boundary ∂Ω,
corresponding to setting a zero density at the boundary of the image of the layout.
To solve this equation numerically it is necessary to discretize the grid and use nu-
merical approximations for derivatives [44]. This results in the following discrete approx-
imation of Equation 4.1:
\[
\frac{u_{t+1}(i,j) - u_t(i,j)}{dt} = k\,\frac{u_t(i+1,j) - 2u_t(i,j) + u_t(i-1,j)}{(dx)^2} + k\,\frac{u_t(i,j+1) - 2u_t(i,j) + u_t(i,j-1)}{(dy)^2}, \tag{4.2}
\]
where u_t(i, j) is the value of the density at grid point (i, j) at time step t, dx and dy
are the grid spacings in the x and y directions, respectively, dt is the time step and
u_0(x, y) = Dinitial(x, y). Thus, given the density at every grid point at time t we are
able to compute the density at time t + 1. Figure 4.3(c) and (d) show the smoothing
performed by the heat equation. The Laplacian operator on the right-hand side of
Equation 4.2 can be represented by the following template [44]:
\[
\nabla^2 \approx \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \tag{4.3}
\]
which describes how the value at each grid point is updated, taking its neighbors into
consideration.

[Figure 4.3 panels: (a) Input layout Linitial for the 3elt graph, V=4720 E=13722; (b) Output graph Lfinal;
(c) Initial density Dinitial (Step 1); (d) Smoothed density Dsmooth (Step 2); (e) Target density Dtarget (Step 3);
(f) x-component of the warp u (Step 4); (g) y-component of the warp u (Step 4)]
Figure 4.3: Algorithm steps. Higher intensity represents higher values. Values are scaled
to improve contrast.

It should be noted that it is possible to perform the smoothing by performing a
convolution with the heat kernel. The iterative formulation discussed here serves as a
basis for the anisotropic case discussed in Step 3.
The algorithm uses several parameters. We use a square grid and therefore set dx =
dy = 1. Using k = 1 in the heat equation results in a reasonable diffusion rate. In
order to maintain numerical stability, it is required to have
$dt \le \frac{1}{8}\,\frac{(dx)^2 + (dy)^2}{k}$ [34], which for these parameters gives
dt ≤ 0.25. We use dt = 0.23. Thirty iterations of Equation 4.2 are run. This number
represents a tradeoff. If too few iterations are used, the smoothing will not be sufficient
for Step 4. If too many iterations are used, the image will be too smooth, potentially
reducing the displacements computed in Step 4.
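
The smoothing iteration can be written down directly from Equation 4.2. The following Python sketch uses the parameter values stated above (dx = dy = 1, k = 1, dt = 0.23, thirty iterations) and the zero boundary condition; it is a plain CPU illustration, and the function names `heat_step` and `smooth` are hypothetical.

```python
def heat_step(u, dt=0.23, k=1.0):
    """One explicit update of Equation 4.2 with dx = dy = 1 and u = 0 on the boundary."""
    rows, cols = len(u), len(u[0])
    new = [[0.0] * cols for _ in range(rows)]   # boundary pixels stay at zero
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Discrete Laplacian: the 5-point template of Matrix 4.3.
            laplacian = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                         - 4.0 * u[i][j])
            new[i][j] = u[i][j] + dt * k * laplacian
    return new


def smooth(density, iterations=30, dt=0.23, k=1.0):
    """Run the explicit heat iteration, checking the stability bound first."""
    if dt > (1.0 / 8.0) * (1.0 + 1.0) / k:      # dt <= (1/8) ((dx)^2 + (dy)^2) / k
        raise ValueError("time step violates the stability condition")
    u = density
    for _ in range(iterations):
        u = heat_step(u, dt, k)
    return u


size = 9
density = [[0.0] * size for _ in range(size)]
density[4][4] = 100.0                            # a single very dense pixel
smoothed = smooth(density)
print(round(smoothed[4][4], 3), round(smoothed[4][5], 3))
```

Starting from a single dense pixel, the iteration spreads the density symmetrically to the neighboring pixels while the peak value drops.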
Calculating the target density image (Step 3): Although the algorithm in
Step 2 has the advantage of creating a more uniform, evenly distributed density, it has
the disadvantage that the diffusion process takes into account only local properties of
the density, as governed by the heat equation. This is not desirable in our case since
it may lead to “collisions” between close-by high-density regions. We would
like to take the topology of the given graph density into consideration when calculating
an alternative, more uniform density with lower maximal values, corresponding to a less
cluttered layout. The goal of this step is to compute Dtarget, which is an improved density
image, given Dinitial.
Creating an improved, shape-aware density image is achieved by modifying the
evolution described by the heat equation (Equation 4.1). Instead of performing isotropic
diffusion as governed by the discrete Laplacian operator (shown in Matrix 4.3), we
modify the direction of the diffusion according to the shape of the density image. The
diffusion is performed in a direction that makes use of empty and low-density regions of
the image. This allows making more effective use of the screen space in the improved
layout Lfinal.
To select the preferred direction θbest at each time step and for each pixel of the current
density image µ, a ray-shooting process is performed. For location (x, y) in the density
image, given a possible diffusion direction θ, we calculate the following score
$$\mathrm{score}(x, y, \theta) = \int_{l=0}^{l=l_{max}} \mu(x + l\cos\theta,\; y + l\sin\theta)\, dl,$$
where lmax corresponds to a point on the ray that is on the image boundary. The intuition
behind this formula is that we sum up the amount of material we encounter when traveling
in direction θ from (x, y) up to the boundary of the density image. In discrete form, the
score is
$$\mathrm{score}(x, y, \theta) = \sum_{l=0}^{\lfloor l_{max} \rfloor} \mu(x + l\cos\theta,\; y + l\sin\theta). \tag{4.4}$$
The final advancement direction is
$$\theta_{best}(x, y) = \operatorname*{argmin}_{\theta \in [0, 2\pi]} \mathrm{score}(x, y, \theta), \tag{4.5}$$
which corresponds to the direction in which the least amount of material is encountered,
hence making the best use of available screen space (since we disperse the material to the
emptiest regions).
Since there are potentially several nodes located in the same pixel of the density image
µ, it is required to use sub-pixel accuracy in the sampling performed in Equation 4.4.
This is efficiently handled by using bilinear interpolation for sampling µ. Using higher
fidelity kernels is also possible, but would result in a significant decrease in performance.
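The ray-shooting score of Equation 4.4 and the direction selection of Equation 4.5 can be sketched on the CPU as follows. This is an illustrative sketch under stated assumptions: the image is indexed as `img[row][col]` with x as the column and y as the row, coordinates are in pixel units, the ray ends at the outermost pixel centers, and samples are clamped to the image. The function names are hypothetical.

```python
import math


def bilinear(img, x, y):
    """Sample img at real-valued (x, y) with bilinear interpolation."""
    rows, cols = len(img), len(img[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def ray_length(x, y, theta, cols, rows):
    """l_max: distance from (x, y) to the image boundary along theta."""
    dx, dy = math.cos(theta), math.sin(theta)
    limits = []
    if dx > 1e-12:
        limits.append((cols - 1 - x) / dx)
    elif dx < -1e-12:
        limits.append(x / -dx)
    if dy > 1e-12:
        limits.append((rows - 1 - y) / dy)
    elif dy < -1e-12:
        limits.append(y / -dy)
    return min(limits)


def score(img, x, y, theta):
    """Discrete score of Equation 4.4: material met along the ray."""
    rows, cols = len(img), len(img[0])
    steps = int(math.floor(ray_length(x, y, theta, cols, rows)))
    dx, dy = math.cos(theta), math.sin(theta)
    return sum(bilinear(img, x + l * dx, y + l * dy) for l in range(steps + 1))


def best_direction(img, x, y, n_angles=64):
    """Equation 4.5 restricted to n_angles evenly spaced candidates."""
    angles = [2.0 * math.pi * i / n_angles for i in range(n_angles)]
    return min(angles, key=lambda t: score(img, x, y, t))
```

Because the l = 0 sample is the pixel's own density for every θ, it does not affect which direction wins.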
Given θbest for every pixel in the current density image, we evolve the density according
to Equation 4.1, but replace the isotropic Laplacian operator in Matrix 4.3 with the
following anisotropic operator:
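$$\nabla^2_{anisotropic} \approx \begin{pmatrix} 0 & 1 + \sin(\theta_{best}) & 0 \\ 1 + \cos(\theta_{best}) & -4 & 1 - \cos(\theta_{best}) \\ 0 & 1 - \sin(\theta_{best}) & 0 \end{pmatrix}. \tag{4.6}$$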
The intuition behind this operator is that the averaging performed depends on the direc-
tion θbest, resulting in a new density that is biased in the required direction.
In summary, in this step, starting with µ = Dinitial, we iteratively compute Equa-
tion 4.5 and update µ using the anisotropic Laplacian Matrix 4.6, resulting in Dtarget.
In our implementation we calculate the best diffusion direction for 64 angles symmet-
rically distributed over the possible advancement directions (i.e. [0, 2π]). Five iterations
of the heat equation evolution (using Matrix 4.6) are performed between recalculations
of the best direction (Equation 4.5). This is a tradeoff between computation speed and
accuracy, which our experiments show produces good results. A total of 60 iterations of
the heat equation evolution are performed. This number is used in order to ensure that
the evolution of the target density Dtarget continues for more iterations than the evolution
of Dsmooth. Doing so allows the warp computed in Step 4 to expand the layout to unused
portions of the screen.
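One evolution step with the anisotropic template of Matrix 4.6 might look like the sketch below. Several details are assumptions rather than statements from the text: the explicit update form, the replicated border, and the convention that the top row of the template applies to the neighbor at row index r − 1, with the row index playing the role of y. The name `anisotropic_step` is hypothetical.

```python
import math


def anisotropic_step(density, theta_best, dt=0.23, k=1.0):
    """One explicit diffusion step using the template of Matrix 4.6.

    density and theta_best are indexed [row][col]; theta_best holds the
    preferred direction of each pixel (Equation 4.5).
    """
    rows, cols = len(density), len(density[0])

    def at(r, c):
        # Assumption: replicate the border value outside the image.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return density[r][c]

    out = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            s = math.sin(theta_best[r][c])
            co = math.cos(theta_best[r][c])
            # Neighbor weights of Matrix 4.6; the four weights sum to 4,
            # so a uniform density is left unchanged.
            lap = ((1 + s) * at(r - 1, c) + (1 - s) * at(r + 1, c)
                   + (1 + co) * at(r, c - 1) + (1 - co) * at(r, c + 1)
                   - 4.0 * density[r][c])
            new_row.append(density[r][c] + dt * k * lap)
        out.append(new_row)
    return out
```

Under these conventions, a pixel with θbest = 0 draws twice the usual weight from its left neighbor and nothing from its right neighbor, so material drifts in the direction (cos θbest, sin θbest).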
Computing an optimal warp (Step 4): After computing Dsmooth and Dtarget in
the previous steps, we are now ready to compute a warp u = (u1(x, y), u2(x, y)) that
maps location (x, y) in Dsmooth to location (u1(x, y), u2(x, y)) in Dtarget. Using u, we are
able to modify the layout, as discussed in Step 5, in order to compute Lfinal.
The warp procedure is based on the algorithm of Haker et al. [94], which is shown to
compute a warp that minimizes displacements. In our case this helps maintain the overall
structure of the graph, thus preserving the mental map. The key idea of the algorithm
is to iteratively converge to an optimal mapping by using a gradient descent technique.
More details are given in Section 4.4.
Computing the final layout (Step 5): In the final stage of the algorithm, the
positions of the nodes are modified in order to create the output layout Lfinal. Given the
optimal warping u = (u1(x, y), u2(x, y)) that was computed in Step 4, which is defined
over a discrete, regular grid, this step computes the updated positions of each node in
the graph, which are non-integral. Note that this stage modifies the node coordinates
and not the image of the layout.
The optimal warping u = (u1(x, y), u2(x, y)) gives for each pixel in the input density
a destination position in the image. Using the warp, new node positions are computed
using an iterative process. Given a node n with current position (xn, yn) (initialized to
the node position in Linitial), its updated position is set to
$$\begin{aligned} x_n^{updated} &= x_n + \alpha\,(u_1(x_n, y_n) - x_n) \\ y_n^{updated} &= y_n + \alpha\,(u_2(x_n, y_n) - y_n). \end{aligned} \tag{4.7}$$
The number of repetitions of Equation 4.7 is controlled by the user. Performing more
iterations results in a larger displacement, representing a tradeoff between node separation
and preserving the structure of the graph. The constant α, whose default value is 0.5, is
used to scale the displacement.
In order to compute the value of the functions u1 and u2 at the non-integral node
coordinates (xn, yn), bilinear interpolation is used. Using an interpolation method with
sub-pixel accuracy helps increase the separation between close-by nodes in the input
layout.
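The node update of Equation 4.7 can be sketched as follows. The sketch assumes that node coordinates are expressed in pixel units of the warp grid, that `u1` and `u2` are stored as `[row][col]` arrays, and that samples outside the grid are clamped; `displace_nodes` and `bilinear` are hypothetical names.

```python
def bilinear(img, x, y):
    """Sample img (indexed [row][col]) at real-valued (x, y)."""
    rows, cols = len(img), len(img[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def displace_nodes(nodes, u1, u2, iterations, alpha=0.5):
    """Apply Equation 4.7 `iterations` times to every (x, y) node position."""
    result = []
    for x, y in nodes:
        for _ in range(iterations):
            target_x = bilinear(u1, x, y)  # u1(x_n, y_n)
            target_y = bilinear(u2, x, y)  # u2(x_n, y_n)
            x, y = x + alpha * (target_x - x), y + alpha * (target_y - y)
        result.append((x, y))
    return result
```

For an identity warp (u1 = x, u2 = y) the nodes stay in place; each additional iteration moves a node a further fraction α of the way toward the destination given by the warp at its current position.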
Complexity: Step 1 requires traversing the nodes and edges of the graph, which is
$O(E + V)$ for a graph with E edges and V nodes. In addition it requires rasterizing the
nodes and edges, which is performed quickly on the GPU. Step 2 performs a fixed number
of iterations, each of which takes $O(P)$ for an image containing P pixels. Step 3 uses
a fixed number of directions, each requiring $O(\sqrt{P})$ work for summing up the densities
along the ray emanating from each of the P pixels. The total here is $O(P^{1.5})$. As discussed
in Section 4.4, Step 4 requires $O(P)$. Finally, the last step is $O(V)$. Hence, the total
runtime is $O(E + V + P^{1.5})$. As shown in Section 4.6, it is dominated by the time spent
in Step 3, which can be controlled by changing P.
4.4 Computing an Optimal Mapping
In this section we describe a method, based on optimal mass transport, for finding a
mapping between the two density images Dsmooth and Dtarget in a way that minimizes
displacements, thus preserving the structure of the graph.
First, a brief introduction to the optimal mass transport problem, which was first
formulated by Monge in 1781 and later by Kantorovich [115], is provided. Next, the
application of this problem to improving graph layouts is discussed. The section concludes
by briefly describing how the mass-transport problem is efficiently solved using
the algorithm of Haker et al. [94].
Let Ω0 and Ω1 be two subdomains of R2, with smooth boundaries. Positive density
functions µ0(x, y) and µ1(x, y) are defined on these domains, respectively. We assume
that
$$\iint_{\Omega_0} \mu_0(x, y)\, dx\, dy = \iint_{\Omega_1} \mu_1(x, y)\, dx\, dy, \tag{4.8}$$
i.e. the same total mass is contained in both regions. In our case of density images of
graph layouts, we assume Ω0 = Ω1 = [0, 1]× [0, 1].
Our purpose is to construct a mapping between Dsmooth computed in Step 2 and
Dtarget computed in Step 3. Unlike the classical setting described above, the densities
used in our case can be zero in some regions of the image - the ones not occupied by the
input graph layout. We therefore equalize the mass (in order to ensure Equation 4.8 is
met) and add a constant ε to each of the input densities before computing the optimal
mapping u, using the following relations:
$$\mu_0 = \varepsilon + D_{smooth}, \qquad \mu_1 = \varepsilon + D_{target}\, \frac{\iint_{\Omega_0} D_{smooth}(x, y)\, dx\, dy}{\iint_{\Omega_1} D_{target}(x, y)\, dx\, dy}, \tag{4.9}$$
where µ0, µ1 are the equalized and shifted densities which are used to compute the optimal
warp. In our implementation ε = 0.5.
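The equalization and shift of Equation 4.9 can be sketched in a few lines. The sketch assumes both images are sampled on the same grid, so that the integrals reduce to sums over pixels with a common cell area that cancels in the ratio; `equalize_and_shift` is a hypothetical name.

```python
def equalize_and_shift(d_smooth, d_target, eps=0.5):
    """Return (mu0, mu1) of Equation 4.9 for two images on the same grid."""
    mass_smooth = sum(map(sum, d_smooth))
    mass_target = sum(map(sum, d_target))
    if mass_target <= 0.0:
        raise ValueError("target density must contain some mass")
    # Rescale the target so that both images carry the same total mass,
    # then add eps everywhere so that both densities are strictly positive.
    scale = mass_smooth / mass_target
    mu0 = [[eps + v for v in row] for row in d_smooth]
    mu1 = [[eps + scale * v for v in row] for row in d_target]
    return mu0, mu1
```

After this step the two densities satisfy Equation 4.8 and are bounded below by ε, as required by the classical setting.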
Diffeomorphisms u = (u1(x, y), u2(x, y)) from Ω0 to Ω1, which map one density func-
tion to the other according to the following relation
$$\mu_0(x, y) = |Du(x, y)|\, \mu_1(u(x, y)) \tag{4.10}$$
are considered. Here Du is the Jacobian matrix of u and |Du| is its determinant [191].
Equation 4.10 is called the Mass Preservation (MP) property and accordingly u ∈ MP .
It implies, for example, that if a small region in Ω0 is mapped to a large region in Ω1,
there must be a corresponding decrease in density in order for the mass to be preserved.
Many mappings u that satisfy Equation 4.10 exist. We would like to choose an optimal
one for our application. We use the squared L2 Monge-Kantorovich distance, defined as
follows
$$d_2^2(\mu_0, \mu_1) = \inf_{u \in MP} \iint \|u(x, y) - (x, y)\|^2\, \mu_0(x, y)\, dx\, dy. \tag{4.11}$$
This distance places a penalty on the distance the map u moves each bit of material,
weighted by its mass. Hence, this distance fits our requirement of disturbing the input
graph layout as little as possible, in order to reduce changes to the structure of the layout,
thus conserving the user’s mental map.
A fundamental theoretical result [10,22,119] states that there exists a unique optimal
mapping u that is a gradient of a convex function ω, i.e. u = ∇ω. In order to find
the optimal mapping u, we use the algorithm of Haker et al. [94]. This algorithm has
two main stages. First, an initial mapping u0 is found. Next, the mapping is updated
iteratively in order to decrease the functional in Equation 4.11.
Finding an initial mapping is achieved by first solving a one-dimensional problem of
transporting mass in a direction parallel to the x-axis (Equation 4.12), followed by the
solution of a series of problems transporting mass parallel to the y-axis (Equation 4.13).
A function a = a(x) is implicitly defined by the equation
$$\int_0^{a(x)} \int_0^1 \mu_1(\eta, y)\, dy\, d\eta = \int_0^x \int_0^1 \mu_0(\eta, y)\, dy\, d\eta. \tag{4.12}$$
a(x) is determined by numerically calculating the integrals. Differentiating Equation 4.12
with respect to x gives
$$a'(x) \int_0^1 \mu_1(a(x), y)\, dy = \int_0^1 \mu_0(x, y)\, dy.$$
A function b = b(x, y) is now defined implicitly by the equation
$$a'(x) \int_0^{b(x, y)} \mu_1(a(x), \rho)\, d\rho = \int_0^y \mu_0(x, \rho)\, d\rho. \tag{4.13}$$
Given a(x), the function b(x, y) can be computed by numerically performing the integra-
tions in Equation 4.13. The initial mapping is set to be u0(x, y) = (a(x), b(x, y)).
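The first half of the initial mapping, the function a(x) of Equation 4.12, amounts to matching cumulative mass between the column marginals of the two densities. The sketch below is one possible numerical treatment, not the one used in [94]: it assumes piecewise-constant densities on n equal cells over [0, 1], strictly positive values, equal total mass, and images indexed as `mu[row][col]` with x along the columns. All function names are hypothetical.

```python
def cumulative(density, h):
    """Cumulative mass at the n + 1 cell edges of a piecewise-constant density."""
    cdf = [0.0]
    for d in density:
        cdf.append(cdf[-1] + d * h)
    return cdf


def match_mass_1d(src, dst):
    """For every cell edge x of src, return a(x) such that the mass of dst
    on [0, a(x)] equals the mass of src on [0, x]."""
    n = len(src)
    h = 1.0 / n
    f0, f1 = cumulative(src, h), cumulative(dst, h)
    a, cell = [], 0
    for i in range(n + 1):
        m = min(f0[i], f1[-1])  # guard against rounding past the total mass
        while cell < n - 1 and f1[cell + 1] < m:
            cell += 1  # advance to the dst cell that contains mass level m
        frac = (m - f1[cell]) / (dst[cell] * h)
        a.append((cell + min(max(frac, 0.0), 1.0)) * h)
    return a


def initial_map_x(mu0, mu1):
    """a(x) of Equation 4.12, computed from the column marginals."""
    rows = len(mu0)
    marg0 = [sum(col) / rows for col in zip(*mu0)]  # integral over y of mu0
    marg1 = [sum(col) / rows for col in zip(*mu1)]  # integral over y of mu1
    return match_mass_1d(marg0, marg1)
```

The same one-dimensional routine could in principle be reused per column for b(x, y) in Equation 4.13, with µ1 sampled at a(x); that part is omitted here.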
Considering u0 to be a vector field, the Helmholtz-Hodge decomposition [191] states
that u0 can be decomposed into the sum of a curl-free vector field ∇ω and a divergence-free
vector field χ, i.e. u0 = ∇ω + χ. In the 2D case a divergence-free vector field χ can
be written as χ = ∇⊥h for some scalar function h, where ⊥ represents rotation by 90°, so
$\nabla^{\perp} h = \left(-\frac{\partial h}{\partial y}, \frac{\partial h}{\partial x}\right)$. In this case the decomposition is u0 = ∇ω + ∇⊥h.
In order to compute the optimal MP mapping u = ∇ω, the second step of the algo-
rithm removes the curl from u0. This is achieved by using an iterative gradient descent
method. In each iteration the current mapping u is modified in order to reduce the func-
tional in Equation 4.11. Note that at all stages, the mapping u is a valid solution to the
mass-transport problem. Setting u = ∇ω + ∇⊥f , f is found by solving the following
Poisson Equation with a Dirichlet-type boundary condition:
$$\begin{aligned} \nabla^2 f &= -\operatorname{div}(u^{\perp}) \\ f &= 0 \ \text{on } \partial\Omega_0. \end{aligned} \tag{4.14}$$
The boundary condition ensures that the mapping will remain constrained in the given
domain. It is shown in [94] that the functional in Equation 4.11 can be reduced by the
following evolution equation:
$$\frac{\partial u}{\partial t} = \frac{1}{\mu_0}\, Du\, \nabla^{\perp} f. \tag{4.15}$$
The time step $\Delta t$ is set as $\Delta t = \min_{x,i} \left\| \frac{1}{\mu_0} (\nabla^{\perp} f)_i \right\|^{-1}$, where the subscript i stands for
the component of the vector. The algorithm iteratively solves the Poisson Equation and
updates the mapping u until the curl of u is below a given threshold. Our experiments
show that performing up to 30 iterations of Equation 4.15 is sufficient for obtaining a
high-quality warp.
A multi-grid method [24, 44] is used in order to quickly solve Equation 4.14. The
implementation uses the V-cycle algorithm to control the transition between grid levels,
Jacobi iterations for smoothing the solution and full weighting for downsampling solutions
between grids [24]. The complexity of the multi-grid method for an image containing P
pixels is O(P), resulting in a rapid solution. Computing Equations 4.12, 4.13 and 4.15
takes time linear in the image size. A fixed number of iterations of Equation 4.15 is
performed. Hence, the total complexity of this step in the algorithm is O(P).
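For illustration, the Jacobi iteration that serves as the smoother inside the multi-grid solver can be written for a Poisson problem of the form of Equation 4.14 as below. This is only the single-grid smoother under the assumption of unit grid spacing and a zero Dirichlet boundary; the V-cycle, the full-weighting restriction and the computation of the right-hand side −div(u⊥) are not shown. The names `laplacian` and `jacobi_poisson` are hypothetical.

```python
def laplacian(f):
    """Five-point Laplacian of f at interior grid points (zero elsewhere)."""
    rows, cols = len(f), len(f[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = (f[r - 1][c] + f[r + 1][c] + f[r][c - 1] + f[r][c + 1]
                         - 4.0 * f[r][c])
    return out


def jacobi_poisson(rhs, iterations):
    """Approximate f with laplacian(f) = rhs and f = 0 on the boundary."""
    rows, cols = len(rhs), len(rhs[0])
    f = [[0.0] * cols for _ in range(rows)]
    for _ in range(iterations):
        new = [[0.0] * cols for _ in range(rows)]  # boundary stays at zero
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                # Solve the five-point equation at (r, c) for its center
                # value, using the neighbors from the previous iterate.
                new[r][c] = 0.25 * (f[r - 1][c] + f[r + 1][c] + f[r][c - 1]
                                    + f[r][c + 1] - rhs[r][c])
        f = new
    return f
```

Used alone, Jacobi iteration converges slowly on large grids; the multi-grid method applies only a few such sweeps per level and corrects the smooth error components on coarser grids, which is what yields the O(P) cost quoted above.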
4.5 Implementation on the GPU
Computing the target density (Step 3) is the most time consuming stage of the algorithm
since we need to perform many computations for each pixel of the image. In this section
we describe how this step is implemented on the GPU, resulting in a significant speedup
of the running time of the algorithm, as shown in Section 4.6.
Please refer to Sections 1.3 and 2.2 for more information about accelerating compu-
tations using GPUs.
The GPU has several architectural characteristics that help improve the speed of
computation compared to the CPU. First, the GPU is highly parallel. It is able to run
hundreds of computational threads in parallel. In some cases, memory access latency
is hidden by switching to executing a different thread. Second, the GPU's memory system
is optimized for two-dimensional locality, as opposed to the one-dimensional locality
employed in CPUs. Our implementation on the GPU takes advantage of these properties.
Given the current density image µ as an input, the goal is to calculate for each pixel
the best advancement direction, θbest, as in Equation 4.5. This is done by finding for each
pixel the angle that minimizes the score in Equation 4.4.
Figure 4.4: Execution graph of finding the best advancement direction on the GPU in
Step 3 (rectangles = textures, ovals = kernels; θ is the current direction being tested).
Several textures, which are two-dimensional images or data arrays, are used to store
data on the GPU, as illustrated in Figure 4.4. The input density µ is stored in the density
texture. For each candidate direction θ, the current score for each pixel is stored in the
local metric texture. Two textures are used to store the current best angle for each pixel:
global metric #1 and global metric #2. We use two textures due to the GPU’s inability
to read and write to the same texture. At the end of the computation the global metric
texture holds the best advancement direction θbest for each pixel.
Computation on the GPU is achieved by running a kernel or fragment program for each
pixel in the image. The GPU is able to split the computation into hundreds of parallel
threads, thus achieving high performance. The computation, shown in Figure 4.4, is
performed using two kernels. The first kernel, called calc metric, calculates Equation 4.4
for each pixel in the image given the current direction θ. Given the coordinates of the
current pixel, lmax from Equation 4.4 is determined by calculating the closest intersection
of the ray in direction θ, starting at the current pixel, with a boundary of the image. Next,
the score is accumulated using Equation 4.4. During this process, bilinear interpolation
is used to access the density texture at non-integral coordinates.
The GPU is able to efficiently execute the calc metric kernel. For each direction θ,
when concurrently running the kernel on neighboring pixels, the accesses to the density
texture have 2D locality. This results in good utilization of the caches and memory
bandwidth of the GPU, which are optimized for 2D operations.
A second kernel, the merge kernel, is used to update the current best advancement
direction per pixel. This kernel accepts as input the previous best direction, stored in
the global metric texture, and the value of the score calculated in the current direction
θ, stored in the local metric texture. The kernel compares the two scores and writes to
its output the merged best score. After iteratively running the calc metric and merge
kernels for the set of all candidate angles, the global metric texture contains the value of
the best angle θbest for each pixel.
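The control flow of this two-kernel, multi-pass scheme can be mimicked on the CPU as in the sketch below. It is not GPU code and not the thesis implementation: the per-pixel score computation is passed in as a function `calc_metric(density, theta)`, and the global metric buffers are assumed to hold a (score, angle) pair per pixel, which the text implies but does not spell out. The two-element `buffers` list stands in for the two global metric textures.

```python
import math


def merge(prev_best, local_metric, theta):
    """Merge pass: per pixel, keep the (score, angle) pair with the lower score."""
    return [
        [(s, theta) if s < best[0] else best for best, s in zip(best_row, local_row)]
        for best_row, local_row in zip(prev_best, local_metric)
    ]


def find_best_directions(density, calc_metric, n_angles=64):
    """Run one calc_metric pass and one merge pass per candidate angle."""
    rows, cols = len(density), len(density[0])
    # Two copies of the global metric buffer: one is read, the other written.
    buffers = [[[(math.inf, 0.0)] * cols for _ in range(rows)], None]
    src = 0
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        local = calc_metric(density, theta)          # writes the local metric
        buffers[1 - src] = merge(buffers[src], local, theta)
        src = 1 - src                                # ping-pong: swap roles
    return [[best[1] for best in row] for row in buffers[src]]
```

Each pass does a bounded amount of work per pixel, which is the property that keeps a single kernel invocation short.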
It should be noted that it would be preferable to compute the best direction (Equation 4.5) in a
single pass, using the current density µ as the input and the best direction θbest(x, y) as
the output. This would remove the necessity for having a temporary texture for the local
result, performing the ping-pong algorithm between the two copies of the global metric
texture and running the merge kernel. However, in order to protect the system from fatal
errors, the graphics driver limits the amount of time a computational kernel is allowed to
run. The allotted time is insufficient to perform the computation in one pass, especially
in lower performance GPUs. Thus, we chose the multi-pass implementation discussed in
the previous paragraphs.
(a) Input FM³ layout (b) Removing node overlaps using [49] (c) Our improved layout
Figure 4.5: ug 380 graph (V=1104, E=3231). Note how when using our algorithm the
center expands, reducing node density while the outer ring is unchanged. When using [49]
the layout is hardly changed.
4.6 Results
Our algorithm was tested using the output of several state-of-the-art graph layout algo-
rithms in a variety of applications. Table 4.1 gives information about the graphs and the
parameters used in our algorithm. Below, we discuss the results of our algorithm and
compare them to the results obtained by the node overlap algorithm of Dwyer et al. [49].
Figures 4.1 and 4.5 show improvements of layouts computed by FM³ [91], which is
a multi-level force-directed algorithm. It uses solar systems, which consist of nodes at a
distance of two edges or less from the center of the solar system, in order to create the
graph hierarchy.
Figure 4.1 shows a layout of the protein graph, which is the unweighted version of
the protein homology graph presented in [3]. The layout contains a large, dense central
cluster. Applying our algorithm increases the percentage of screen space devoted to the
elements of the graph. This allows more of the fine details of the graph to become
visible, especially in the central region of the graph. Note how the overall structure of
the different elements of the graph, such as the different "spokes" it contains, is retained.
In comparison, the algorithm from [49] was not able to remove all of the overlaps and the
changes to the layout were small, similar to Figure 4.8 (b).
Figure 4.5 shows a layout of the ug 380 graph [1], which contains one node with a
very high degree. The layout contains a central core which is packed with many nodes.
In (b) the results of a node overlap removal algorithm [49] are shown. Since the input
layout contains hardly any overlaps, the result in (b) is very similar to (a) and the graph
remains cluttered. Applying our algorithm to this challenging case, shown in (c), results
in an increase in the radius of the central core, increasing the separation between the
nodes. The exterior nodes, which are sparser, are unaffected.
(a) Input layout by TopoLayout [7] (b) Removing node overlaps using [49] (c) Our improved layout
Figure 4.6: Add32 graph (V=4960, E=9462). Note how in (c) each of the rings is
expanded, showing more detail.
Figure 4.6 shows an improvement of the layout produced by TopoLayout, which is a
feature-based multi-level graph drawing algorithm [7]. It creates a subgraph hierarchy
by recursively detecting topological features in the graph and replacing them with meta-
nodes. Each feature is drawn using an algorithm tuned for the specific topology. The
graph hierarchy is drawn bottom-up using an area-aware algorithm. The figure shows
the add32 graph [204], which describes a 32-bit adder that contains many biconnected
components. In (b) the results of a node overlap removal algorithm [49] are shown.
Note that the structure of the input layout is significantly distorted, making it difficult
to comprehend the structure of the graph. Our improved layout, shown in (c), is able
to expand the circular clusters contained in the graph, better visualizing the intricate
details of the graph. For example, additional details about the composition of the inner
circle in the leftmost part of the graph become visible. Also, expanding the small circular
formation at the bottom right hand side of the graph allows more detail about the sub-
clusters it contains to become visible. Moreover, as opposed to (b), the layout in (c)
maintains the overall structure of the layout.
Figures 4.7 and 4.8 show improvements of the layouts produced by [65], which is a
multi-level force-directed graph layout algorithm. Spectral partitioning is used to create
the graph hierarchy. KD-tree type partitioning is used to accelerate the computation and
allows for an efficient GPU implementation.
Figure 4.7 shows the ISP graph, which represents the router networks of several in-
ternet service providers (ISPs) [2]. In the layout, green, black and blue nodes represent
routers belonging to the ISPs visualized, while red nodes show other routers used to
connect to the Internet. The layout in (b), computed by the algorithm from [49], man-
ages to displace nodes in order to avoid overlaps, while generally maintaining the overall
structure of the graph. Unlike our algorithm, the resulting layout does not attempt to
make use of sparse regions of the layout. Instead, small displacements are used in order
to avoid overlaps. Applying our algorithm to this layout, as shown in (c), improves the
separation between the nodes of the graph, while maintaining important characteristics of
the graph, such as the separation into clusters (excluding the red nodes). This is especially
evident in the blue cluster at the bottom right and among the red nodes in the center
left part of the graph. Note how the algorithm is able to expand each of the clusters
into surrounding sparse areas, allowing more details to become visible inside the clusters,
while still preserving the overall clustered structure of the graph.
Figure 4.8 shows the bcsstk32 graph [204], which represents a stiffness matrix. It has
a very high edge density: E/V > 22. The layout in (b), computed by the algorithm
from [49], is nearly identical to the input layout. The algorithm is not able to remove all
of the overlaps of the graph, even when we change the size of the squares representing the
nodes. In (c) our uncluttering algorithm is used. It stretches the input layout, making
the mesh-like structure of the graph more evident. Note that the overall structure and
features of the graph are conserved after the uncluttering process. Also note that in
the improved layout there are fewer highly concentrated areas, where the edges are totally
hidden. This makes the mesh structure of the graph visible in a larger portion of the
layout.
For our performance tests, we used a PC running Windows XP equipped with 2GB
RAM, an Intel Core 2 Duo E6750 2.66 GHz CPU and an NVIDIA 8800GTS GPU with
96 shader processors running at 1.2GHz. The algorithm was implemented using C++,
OpenGL and Cg.
(a) Input layout from [65] (b) Removing node overlaps using [49] (c) Our improved layout
Figure 4.7: ISP router graph (V=5044, E=8043). Nodes are color-coded by the ISP they
belong to. Note how in (c) the blue nodes are uncluttered.
(a) Input layout from [65] (b) Removing node overlaps using [49] (c) Our improved layout
Figure 4.8: Bcsstk32 graph (V=44609, E=985046). Note how in (c) reducing the node
density allows more of the mesh structure of the graph to be uncovered in the top left,
bottom and middle of the graph.
            graph information   | overlap removal [49] | our algorithm
graph       V       E           | CPU (s)              | √P    ITRS   CPU (s)   CPU+GPU (s)
protein     30727   1206654     | 543                  | 257   8      643       6.62
add32       4960    9462        | 2.23                 | 257   15     641       4.86
bcsstk32    44609   985046      | 462                  | 257   4      642       5.84
ISP         5044    8043        | 0.9                  | 257   25     643       5.19
ug 380      1104    3231        | 0.03                 | 257   30     643       4.86
Table 4.1: Graph information and running times. The left side of the table gives information
about the graphs. V and E are the number of graph nodes and edges, respectively.
The central part of the table gives the running times in seconds of the algorithm from [49],
using the same machine used to run our algorithm. The right side of the table shows
the results of our algorithm. The width and height, in pixels, of the density image used
are equal to √P. ITRS is the number of iterations of Equation 4.7 in Step 5. CPU is the
total running time of the algorithm in seconds when using only the CPU. CPU+GPU
is the total running time of the algorithm in seconds when using the GPU to accelerate
Step 3.
Table 4.1 gives information about the graphs and the running times. It is evident that
the running time is relatively independent of the size of the graph and the number of
displacement iterations made. This is so since the bulk of the computation time is spent
working on the different images the algorithm operates on. More specifically, as can be
seen from comparing the CPU and CPU+GPU columns, most of the time is spent in
Step 3, which involves a computationally demanding ray-shooting process (Equations 4.4
and 4.5). Using the GPU results in a very large speedup of this step, accelerating the
total runtime by up to 130 times. This reduces the total runtime to a few seconds.
Table 4.1 compares our running times to those of [49]. In the latter, there is a big
variation in the running time, since it depends on the number of overlaps. When there
are few overlaps (add32, ISP, ug 380), the algorithm runs quickly. Consequently, the
changes to the layout are small. In other cases (protein, bcsstk32), the running time is
higher. Due to the large variation in running times, in some cases it runs faster than our
GPU implementation while in others it runs slower.
There are several reasons why the GPU is able to accelerate Step 3 and therefore the
execution of the entire algorithm so significantly. First, since the amount of work per-
pixel is similar, there is good load balance between the different processors in the GPU.
Thus, the GPU is able to make efficient use of its computing power, which is much higher
than the CPU’s. Second, due to the 2D locality in the memory access pattern during
the ray-shooting process, the GPU is able to make efficient use of its caches. On the
CPU, however, accessing a 2D image requires lookups using pointers, which is inefficient.
Finally, as opposed to the CPU, the GPU contains built-in instructions for performing the
clamping operations needed for performing the interpolation of the values in the density
texture. In summary, this is a good example in which the architecture of the GPU is able
to provide a significant speedup compared to a CPU implementation.
4.7 Conclusion and Future Work
This chapter proposes a new algorithm for reducing the cluttering commonly occurring
in graph layouts. Given any graph layout, the algorithm moves nodes to empty regions
of the screen in a mental-map preserving way.
The algorithm has several key ideas. First, the density image of the computed graph
layout is used to decide how nodes will be displaced. Second, a diffusion process that takes
the structure of the density image into account computes an alternative node distribution,
making better use of the available screen space. Third, an optimal and mental-map pre-
serving warp, based on results from mass-transport problems, determines how to displace
the nodes. Although the mathematical techniques used in this chapter require a great
deal of computation, the chapter demonstrates how improved layouts can be computed
in a matter of seconds, by using the GPU to significantly accelerate the algorithm.
It has been shown that our algorithm is able to improve layouts of large graphs,
produced by a variety of well-known algorithms.
There are several future research directions. First, more research into edge uncluttering
is required. Possible techniques include edge-bundling which is based on the actual layout
of the graph (as opposed to the graph-theoretical structure of the graph) and bending
some of the edges. Second, a model for edge repulsion can help better separate the edges,
improving the readability of the improved layout. Third, the algorithm can be integrated
into an interactive graph exploration system in which the areas to unclutter are selected
by the user. This will allow interactively expanding the current region of interest at the
expense of the other parts of the graph. Finally, the algorithm can be used to enhance
the visualization of changes in a dynamic graph sequence.
It may be possible to accelerate the algorithm further by moving more parts to the
GPU. These include the multi-grid solution of the Poisson equation [85] (Equation 4.14)
and the iterative mass-transport evolution [174] (Equation 4.15).
Chapter 5
Online Dynamic Graph Drawing
This chapter presents an algorithm for drawing a sequence of graphs online. The algo-
rithm strives to maintain the global structure of the graph and thus the user’s mental
map, while allowing arbitrary modifications between consecutive layouts. The algorithm
works online and uses various execution culling methods in order to reduce the layout time
and handle large dynamic graphs. Techniques for representing graphs on the GPU allow
a speedup by a factor of up to 17 compared to the CPU implementation. The scalability
of the algorithm across GPU generations is demonstrated. Applications of the algorithm
to the visualization of discussion threads in Internet sites and to the visualization of social
networks are provided. The material in this chapter is based on [66,68].
The rest of the chapter is organized as follows. Section 5.1 gives an introduction.
Section 5.2 discusses related work. Section 5.3 formally defines the problem and gives an
overview of key algorithm ideas. Section 5.4 presents the algorithm in detail. Section 5.5
discusses our implementation. Section 5.6 presents results. Section 5.7 discusses an ap-
plication to Internet discussion threads visualization. Section 5.8 presents an application
to the visualization of social networks. Section 5.9 concludes the chapter.
5.1 Introduction
Many applications require dynamic graph drawing, i.e., the ability to draw a graph that is
modified over time [47,116,155], as illustrated in Figure 5.1. Sample applications include financial
analysis, network visualization, security, social networks, and software visualization. The
challenge in dynamic graph drawing is to compute a new layout that is both aesthet-
ically pleasing as it stands and fits well into the sequence of drawings of the evolving
Figure 5.1: Snapshots from the threads1 graph sequence, visualizing discussion threads
at http://www.dailytech.com, left to right. Node labels in red show user names, edges
link users replying to posted comments. Up to 119 users are shown. Discussion topics,
marked as blue A n nodes, include GPUs (A 4864, A 4285), chipsets (A 4637, A 4425,
A 4538 and A 4866) and CPUs (A 4589). A total of 144 messages are visualized.
graph. The latter criterion has been termed preserving the mental map [145] or dynamic
stability [155].
Most existing algorithms address the problem of offline dynamic graph drawing, where
the entire sequence of graphs to be drawn is known in advance [47, 56, 128]. This gives
the layout algorithm information about future changes in the graph, which allows it to
optimize the layouts generated across the entire sequence. For instance, the algorithm
can leave room in order to accommodate a node that appears later in the sequence. In
contrast, very little research has addressed the problem of online dynamic graph drawing,
where the graph sequence to be laid out is not known in advance [63,132].
This chapter proposes an online algorithm for dynamic layout of graphs. It attempts
to maintain the user’s mental map, while computing fast layouts that take the global
graph structure into account. The algorithm, which is based on force directed layout
techniques, controls the displacement of nodes according to the structure and changes
performed on the graph. By taking special care in order to represent the graph in a GPU-
efficient manner, the algorithm is able to make use of the GPU to significantly accelerate
the layout.
This chapter makes the following contributions. First, a novel, efficient algorithm
for online dynamic graph drawing is presented. It spends most of the execution time
on the parts of the graph being modified. Second, it is shown how the heaviest part
of the algorithm, performing force directed layout, can be implemented in a manner
suitable for execution on the GPU. This allows us to significantly shorten the layout time.
For example, incremental drawing of a graph of 32,000 nodes takes 0.704 seconds per
layout. Finally, two information visualization applications of the algorithm are presented.
The first is the visualization of the evolution over time of discussion threads in Internet
sites. In this application, illustrated in Figure 5.1, nodes represent users and edges
represent messages sent between users in discussion forums. The second application is
the visualization of the growth of a social network, shown in Figure 5.9. Here, nodes
represent users and edges represent connections between friends.
5.2 Related Work
Several algorithms address the problem of offline dynamic graph drawing, where the entire
sequence is known in advance. In [47], a meta-graph, built using information from the
entire graph sequence, is used in order to maintain the mental map. In [128], a stratified,
abstracted version of the graph is used. The nodes are topologically sorted into a tree-like
structure (before layout) in order to expose interesting features. An offline force directed
algorithm is used in [56] in order to create 2D and 3D animations of evolving graphs.
Creating smooth animation between changing sequences of graphs is addressed in [19].
A few algorithms have been proposed to address the online dynamic graph drawing
problem, where the graph sequence is not known in advance. An approach based on
Bayesian networks is described in [20]. A cost function that takes both aesthetic and
stability considerations into account is defined in [132]. Unfortunately, computing this
function is very expensive (45 seconds for a 63 node graph). An algorithm for visualizing
dynamic social networks is discussed in [149]. Drawing constrained graphs has also been
addressed. Incremental drawing of DAGs (directed acyclic graphs) is discussed in [155].
In [63] dynamic drawing of clustered graphs is addressed. Dynamic drawing of orthogonal
and hierarchical graphs is discussed in [86]. The current chapter aims at producing online
layouts of general graphs efficiently.
In recent years, GPUs have been successfully applied to numerous problems outside
of classical computer graphics [163]. Protein folding [165] and simulation of deformable
bodies using mass-spring systems [83,198] are related to our application. However, while
the mass-spring algorithms take only nodes connected by edges into account, the force
directed algorithm considers all the nodes when calculating the force exerted on a node.
GPUs have also been used to simulate gravitational forces [157], where an approximate
force field is used to calculate forces. A GPU-based implementation of the MDS (multidi-
mensional scaling) algorithm is discussed in [189]. Accelerating static graph drawing on
the GPU has been addressed by several authors [8, 65, 93]. A GPU accelerated force directed
layout algorithm using an Euler method is presented in [8]. Although a very large
acceleration is achieved, the complexity of the underlying algorithm is $O(|E| + |V|^2)$ for
$|E|$ edges and $|V|$ nodes. Please refer to Sections 1.3 and 2.2 for more information about
accelerating computations using GPUs.
5.3 Overview
Given, online, a series of undirected graphs $G_0 = (V_0, E_0), G_1 = (V_1, E_1), \ldots, G_n =
(V_n, E_n)$, the goal of the algorithm is to produce a sequence of layouts $L_0, L_1, \ldots, L_n$,
where $L_i$ is a straight-edge drawing of $G_i$. The updates $U_i$ that can be performed between
successive graphs $G_{i-1}$ and $G_i$ include adding or removing vertices and edges.
A key issue in dynamic graph drawing is the preservation of the mental map, i.e. the
stability of the layouts [145]. This is an important consideration since a user looking at a
graph drawing becomes gradually familiar with the structure of the graph. The quality of
the layout can be evaluated by measuring the movement of the nodes between successive
layouts, which should be small, especially in unchanged areas of the graph. In addition,
each layout in the sequence should satisfy the standard requirements from static graph
layouts, such as minimization of edge crossings, avoidance of node overlaps and layout
symmetry [116].
Among the different classes of graph drawing algorithms, the force directed algorithm
class [116, 199] is a natural choice in our case, for several reasons. First, different layout
criteria can be easily integrated into these algorithms. Second, in some of these algo-
rithms, it is possible to update node positions in parallel, thus making it possible to
efficiently employ the GPU’s parallel computation model. Finally, it is possible to use a
convergence scheme that resembles simulated annealing, in which nodes are slowly frozen
into position [70]. This is suitable for use in dynamic layout, where nodes have different
scales of movement.
Our algorithm utilizes several key ideas. In order to maintain the mental map, we
perform the following. First, nodes are initially placed using local graph properties and
information from the previous layout. Second, a movement flexibility degree is assigned
to each node, according to the changes in the graph. This allows the algorithm to “focus”
on nodes that may have large displacements. Third, an approach similar to simulated
annealing is used, where the graph slowly freezes into its final position. Fourth, the
changes between graphs are smoothly animated. In order to reduce the layout time while
maintaining layout quality, the graph is partitioned so that forces from distant nodes can
be approximated, and the GPU is used to accelerate the layout. Moreover, in order to
quickly compute aesthetic layouts, a multi-level force directed scheme is used.
5.4 Algorithm
Given a sequence of graphs G0, . . . Gn, our algorithm computes layouts L0, . . . Ln. This
section describes the algorithm in detail. We begin with describing how the online dy-
namic layouts Li, i ≥ 1 are computed, given Li−1 and Gi. Next, we discuss the algorithm
used to compute the initial layout L0.
5.4.1 Computing Dynamic Layouts
Given a sequence of undirected graphs G1, G2, . . . , Gn, the goal of the dynamic algorithm is
to compute online layouts L1, L2, . . . , Ln. Algorithm 2 is used to compute the layouts.
Figure 5.2 visualizes the main steps of the algorithm. We elaborate on these steps below.
Merging (Step 1): Computing a good initial position is vital for reducing the layout
time and maintaining dynamic stability [39,72]. The coordinates of nodes that exist both
in Gi−1 and in Gi are copied from Li−1. Nodes in Gi that do not exist in Gi−1 are assigned
coordinates while considering local graph properties, as follows.
Each unpositioned node v is examined in turn. Let PN(v) be the set of neighbors of
node v ∈ Vi that have already been assigned a position. If v has at least two positioned
Algorithm 2 Dynamic layout of graph $G_i$, $i \geq 1$
input: $G_i$, $L_{i-1}$ output: $L_i$
1. Merging: Merge layout $L_{i-1}$ and graph $G_i$ to produce an initial layout.
2. Pinning: Assign pinning weights to the nodes, which control the allowed displacement
of each node.
3. Coarsening: Set $C^0 = G_i$. Compute $C^1, C^2, \ldots, C^{coarsest}$ where $C^{k+1} =
\mathrm{edge\_collapse}(C^k)$. Set $l = coarsest$.
4. Compute a geometric partitioning of the nodes of $C^l$.
5. Perform incremental layout of $C^l$. If $l = 0$, go to step 7 and use the layout of $C^0$ as
$L_i$ (the layout of $G_i$).
6. Interpolation: Update the initial layout of $C^{l-1}$ using the layout of $C^l$. Set $l = l - 1$,
go to step 4.
7. Animation: Smoothly morph $L_{i-1}$ into $L_i$.
neighbors, v is placed at their weighted barycenter: $pos(v) = \frac{1}{|PN(v)|} \sum_{u \in PN(v)} pos(u)$. If v
has a single positioned neighbor, u, then v is positioned along the line between pos(u) and
the center of the bounding box of Li−1. This procedure is performed in a BFS (breadth-first
search) manner, starting from the positioned nodes. The nodes that cannot be placed
by this procedure are placed in a circle around the center of the bounding box of Li−1.
A positioning score Γ(v) ∈ [0, 1] is assigned to each node, based on the method used
to position it. These scores indicate the “confidence” in the node’s position. The higher
the positioning score, the better the initial placement is considered. The scores are used
to control the movement of nodes, as described in Step 2. The highest score is assigned
to nodes whose neighborhood has not changed between Gi−1 and Gi, since we are most
confident with their positions. A lower score is assigned to nodes that are positioned
according to two or more neighbors. An even lower score is assigned to nodes positioned
according to one neighbor. Finally, the lowest score is assigned to nodes for which no
good initial guess is known and which are therefore placed near the center of the bounding
box of the graph. In our implementation, scores of 1, 0.25, 0.1 and 0 are assigned to
nodes positioned according to their coordinates at Li−1, at the barycenter of two or more
neighbors, according to one neighbor (in a direction pointing away from the center of the
bounding box of the graph), and at the center of the bounding box of Li−1, respectively.
Figure 5.2 (b) shows an example of computing the positioning score Γ. Note that darker
nodes, with a lower Γ, are relatively localized. These changes are propagated to the rest
of the graph in the next step.
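The merging step can be illustrated by the following Python sketch. It is not the implementation used in this work. The function name `merge_layout`, the dictionary-based graph representation, and the `step` offset used for the single-neighbor rule and for the circle radius are assumptions made for illustration, since the text does not specify how far a node is placed from its neighbor. Every node copied from the previous layout receives a score of 1, following the implementation values quoted above.

```python
import math
from collections import deque

# Positioning scores quoted in the text.
SCORE_KEPT, SCORE_BARYCENTER, SCORE_ONE_NEIGHBOR, SCORE_UNPLACED = 1.0, 0.25, 0.1, 0.0


def merge_layout(adj, prev_pos, step=0.1):
    """Return (pos, score) for all nodes of the new graph.

    adj      -- dict: node -> set of neighbors in the new graph
    prev_pos -- dict: node -> (x, y) in the previous layout (assumed non-empty)
    step     -- assumed placement offset (not specified in the text)
    """
    pos = {v: prev_pos[v] for v in adj if v in prev_pos}
    score = {v: SCORE_KEPT for v in pos}

    # Center of the bounding box of the previous layout.
    xs = [p[0] for p in prev_pos.values()]
    ys = [p[1] for p in prev_pos.values()]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2

    # BFS outward from the already positioned nodes.
    queue = deque(sorted(pos))
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v in pos:
                continue
            placed = [pos[n] for n in adj[v] if n in pos]  # positions of PN(v)
            if len(placed) >= 2:
                pos[v] = (sum(p[0] for p in placed) / len(placed),
                          sum(p[1] for p in placed) / len(placed))
                score[v] = SCORE_BARYCENTER
            else:
                # Single positioned neighbor u: move away from the center.
                ux, uy = pos[u]
                dx, dy = ux - cx, uy - cy
                norm = math.hypot(dx, dy)
                if norm == 0.0:
                    dx, dy, norm = 1.0, 0.0, 1.0  # arbitrary direction
                pos[v] = (ux + step * dx / norm, uy + step * dy / norm)
                score[v] = SCORE_ONE_NEIGHBOR
            queue.append(v)

    # Nodes unreachable from any positioned node: circle around the center.
    rest = sorted(v for v in adj if v not in pos)
    for k, v in enumerate(rest):
        angle = 2 * math.pi * k / len(rest)
        pos[v] = (cx + step * math.cos(angle), cy + step * math.sin(angle))
        score[v] = SCORE_UNPLACED
    return pos, score
```

Sorting the nodes before traversal only serves to make the sketch deterministic; it is not required by the procedure described above.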
Pinning (Step 2): After all the nodes are placed, their pinning weights, $w_{pin}(v) \in [0, 1]$, which reflect the stiffness in the positions of the nodes, are computed [20,63,128].
The position of a node with a pinning weight 1 is fixed during layout, while a node with
a pinning weight 0 is completely free to move during layout.
Pinning weights are assigned using two sweeps. The first sweep, which is local, uses
information regarding the positioning scores Γ of the node and its neighbors:
\[ w_{pin}(v) = \alpha \cdot \Gamma(v) + (1 - \alpha) \frac{1}{degree(v)} \sum_{u:(u,v) \in E} \Gamma(u). \]
Taking the neighbors of v into account amounts to performing low pass filtering of the
pinning weights, according to graph connectivity information. This mimics the creation
of flexible ligaments in the graph around areas that were modified. Using a higher α
value will reduce the influence of the neighbors of a node on its displacement. In our
implementation α = 0.6.
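As a minimal sketch of the first sweep, the following Python function evaluates the formula above for every node. The function name and the dictionary-based representation are illustrative assumptions. The formula is undefined for an isolated node (degree zero); falling back to the node's own score in that case is also an assumption.

```python
def local_pinning_weights(adj, score, alpha=0.6):
    """First sweep: blend each node's positioning score with the mean
    positioning score of its neighbors.

    adj   -- dict: node -> set of neighbors
    score -- dict: node -> positioning score in [0, 1]
    """
    w_pin = {}
    for v, nbrs in adj.items():
        if nbrs:
            mean_nbr = sum(score[u] for u in nbrs) / len(nbrs)
        else:
            mean_nbr = score[v]  # assumption: isolated node uses its own score
        w_pin[v] = alpha * score[v] + (1 - alpha) * mean_nbr
    return w_pin
```

For example, on a path 0 - 1 - 2 in which only node 2 is new (score 0.1), node 1 receives 0.6 + 0.4 · 0.55 = 0.82, so it becomes slightly flexible even though its own position was copied from the previous layout.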
In the second sweep, the local changes are propagated, in order to create a global
effect. A BFS-type algorithm assigns each node a distance-to-modification measure, as
follows. The distance-zero node set, D0, is defined as the union of the set of nodes with a
pinning weight of less than one and the set of nodes adjacent to an edge that was either
added or removed from Gi−1. The distance-one set, D1, is defined as the subset of nodes
in V \D0 adjacent to a node in D0. In general, Di is the subset of nodes not yet marked,
which are adjacent to a node in Di−1. This process continues until all the nodes in V are
assigned to one of the sets $D_0, D_1, \ldots, D_{d_{max}}$. Note that according to this definition, the
nodes in set $D_i$, $i \geq 1$, were assigned $w_{pin} \equiv 1$ in the first sweep. In the second sweep, as
described below, some of these nodes are assigned a lower pinning weight. This gives the
Figure 5.2: Dynamic layout steps: (a) Previous layout, Li−1. (b) Merged graph (Step 1),
color coded according to the positioning score Γ(v). Brighter nodes have a higher Γ. Here,
nodes with Γ ∈ {0.1, 0.25, 1} are shown. (c) Pinning weights wpin(v) (Step 2). Brighter
color corresponds to a higher wpin(v). (d) Final layout (Step 5), color coded according to
the partitioning (Step 4).
layout algorithm more flexibility in adapting to changes in the graph.
Pinning weights are assigned to nodes based on their distance-to-modification. In
particular, nodes that are farther than some cutoff distance $d_{cutoff}$ are assigned a
pinning weight of one, thus remaining fixed, since they are far away from areas of the
graph that were changed. The movement of other nodes depends on the set $D_i$ they belong
to. This is done as follows. Given $d_{cutoff} = k \cdot d_{max}$, the nodes in $D_i$, $i \in [1, d_{cutoff}]$,
are assigned pinning weights:
\[ w_{pin} = \left(w_{pin}^{initial}\right)^{\left(1 - \frac{i}{d_{cutoff}}\right)}. \]
This assignment creates a decaying effect in which nodes farther away from $D_0$ are assigned
higher pinning weights. The constant $w_{pin}^{initial}$ is used to determine the decay in
pinning weight. The nodes in $D_{j+1}$ are assigned a pinning weight that is $\left(w_{pin}^{initial}\right)^{-1/d_{cutoff}}$
times the pinning weight of nodes in $D_j$. Note that a larger k results in a more global
effect, possibly trading layout stability for better layout quality (since nodes are more
free to move). Setting a higher $w_{pin}^{initial}$ will make the graph more rigid, thus limiting the
displacement of nodes already existing in the previous layout. In our implementation
$k = 0.5$ and $w_{pin}^{initial} = 0.35$.
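The second sweep can be sketched in Python as follows. This is an illustration rather than the implementation used in this work. The function name, the `touched` argument (the nodes adjacent to an added or removed edge) and the handling of nodes that cannot be reached from the distance-zero set (they keep their first-sweep weight) are assumptions.

```python
from collections import deque


def propagate_pinning(adj, w_pin, touched, k=0.5, w_initial=0.35):
    """Second sweep: assign pinning weights by distance-to-modification.

    adj     -- dict: node -> set of neighbors
    w_pin   -- dict: node -> pinning weight from the first sweep
    touched -- nodes adjacent to an added or removed edge (assumed input)
    Returns (new_w_pin, dist), where dist[v] is the index i of the set D_i.
    """
    # D_0: nodes with w_pin < 1 plus nodes adjacent to a changed edge.
    d0 = {v for v in adj if w_pin[v] < 1.0} | set(touched)
    dist = {v: 0 for v in d0}
    queue = deque(d0)
    while queue:  # multi-source BFS
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    d_max = max(dist.values(), default=0)
    d_cutoff = k * d_max

    out = dict(w_pin)  # D_0 and unreachable nodes keep their weights
    for v, i in dist.items():
        if i == 0:
            continue
        if i <= d_cutoff:
            out[v] = w_initial ** (1 - i / d_cutoff)
        else:
            out[v] = 1.0  # beyond the cutoff distance: fixed
    return out, dist
```

On a path of seven nodes in which only the first node was modified, d_max = 6 and d_cutoff = 3, so the nodes at distances 1, 2 and 3 receive weights 0.35^(2/3), 0.35^(1/3) and 1, and all farther nodes stay fixed.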
Figure 5.2 (c) shows an example of computing the pinning weights. Note how the local
changes in (b) are propagated to a larger portion of the graph. Also note the decaying
effect as the distance from the modified part, in the middle of the graph, increases. This
reflects the requirement that nodes further from the changed areas should undergo fewer
modifications during layout.
While pinning weights were proposed in the past [128], the approach taken here is
different. In the current chapter pinning weights are used as part of setting the allowed
displacement of nodes, prior to computing the layout. This controls the movement flex-
ibility of each node. In [128], nodes are displaced according to a combination of two
different forces. The relative strength of the forces is determined by weights that are
modified as the layout iterations progress.
Coarsening (Step 3): In this step a series of reduced versions of the graph, which
include initial positions, are constructed. These are used to compute increasingly detailed
“skeletons” of the final layout. At each level, given a fine graph, a coarser representation
is constructed by performing a series of edge collapse operations. This is done by replacing
two connected nodes and the edge between them by a single node, whose weight is the
sum of the weights of the nodes being replaced. The pinning weight of the new node is
set to the geometric mean of the pinning weights of the replaced nodes. The new node is
placed at the weighted average position of the corresponding fine nodes, biased according
to their weights. The weights of the edges are updated accordingly. (The weight of a
node/edge in the finest graph is 1.)
The order of the edge collapse operations is determined as follows. First, nodes, which
are candidates to be eliminated, are sorted by their degree (so as to eliminate low-degree
nodes first). An adjacent edge of an un-paired low-degree node is chosen for collapse by
maximizing the following measure: $\frac{w(u,v)}{w(v)} + \frac{w(u,v)}{w(u)}$, where w(x) is the weight of node x
and w(x, y) is the weight of edge (x, y). This function helps to preserve the topology of
the graph by “uniformly” collapsing highly connected nodes. Coarsening is used in [205],
where a different ordering of the edge collapse operations is used.
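One coarsening pass, as described above, might look like the following Python sketch. It is illustrative only. The function name, the use of `frozenset` keys for undirected edges, and the rule that parallel edges created by a collapse have their weights summed are assumptions (the text only states that edge weights are updated accordingly). The graph is assumed to have no self-loops.

```python
import math


def coarsen_once(node_w, edge_w, pos, w_pin):
    """Collapse a set of disjoint edges, visiting low-degree nodes first.

    node_w -- dict: node -> weight
    edge_w -- dict: frozenset({u, v}) -> weight
    pos    -- dict: node -> (x, y)
    w_pin  -- dict: node -> pinning weight
    Returns (node_w2, edge_w2, pos2, w_pin2, parent).
    """
    adj = {v: set() for v in node_w}
    for e in edge_w:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)

    parent = {}
    for v in sorted(node_w, key=lambda n: (len(adj[n]), n)):
        if v in parent:
            continue
        free = sorted(u for u in adj[v] if u not in parent)
        if not free:
            parent[v] = v  # no unpaired neighbor: v survives unchanged
            continue
        # Choose the edge maximizing w(u,v)/w(v) + w(u,v)/w(u).
        u = max(free, key=lambda n: edge_w[frozenset((n, v))] / node_w[v]
                + edge_w[frozenset((n, v))] / node_w[n])
        parent[v] = parent[u] = v  # v names the merged node

    groups = {}
    for v, p in parent.items():
        groups.setdefault(p, []).append(v)

    node_w2, pos2, w_pin2 = {}, {}, {}
    for p, members in groups.items():
        total = sum(node_w[m] for m in members)
        node_w2[p] = total
        pos2[p] = (sum(node_w[m] * pos[m][0] for m in members) / total,
                   sum(node_w[m] * pos[m][1] for m in members) / total)
        # Geometric mean of the pinning weights of the replaced nodes.
        w_pin2[p] = math.prod(w_pin[m] for m in members) ** (1 / len(members))

    edge_w2 = {}
    for e, w in edge_w.items():
        a, b = tuple(e)
        pa, pb = parent[a], parent[b]
        if pa != pb:
            key = frozenset((pa, pb))
            edge_w2[key] = edge_w2.get(key, 0) + w  # assumed: sum parallel edges
    return node_w2, edge_w2, pos2, w_pin2, parent
```

Applying the function repeatedly to its own output yields the sequence of increasingly coarse graphs; the returned `parent` map records which coarse node each fine node was mapped to, which is what the interpolation step needs.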
In our implementation, the coarsening stops either when the graph is reduced to
several hundred nodes or after four coarsening steps. Coarsening further may lead to
diminishing results due to the inaccuracy in the computed pinning weights of the coarse
graph.
Geometric partitioning (Step 4): The partitioning step is used to accelerate the
layout step, discussed below. There are three requirements that should be satisfied by
partitioning. First, the partitions should be geometrically localized, thus the nodes in
each partition should be relatively close to each other. This will let us represent each
partition using a single “heavy” node. Second, the number of nodes in each partition
should be similar. This is important in order to achieve good load balance between the
parallel processors of the GPU, as discussed in Section 5.5. Third, the algorithm should
be fast.
We have chosen to use a KD-tree-type partitioning. The algorithm works top down.
Given the positions of all nodes, they are sorted according to the X coordinate and the
index of the median node is located. The nodes are partitioned into two sets: one with
indices below the median and one with indices greater than or equal to the median index.
The algorithm proceeds recursively with the two subsets. This time, sorting is performed
according to the Y coordinate. The algorithm alternates between computing the median
X and Y coordinates. The recursive subdivision terminates when the size of the subset
is below the required partition size. Figure 5.2 (d) shows an example of computing a
geometric partitioning of a graph.
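The recursive median split can be sketched in a few lines of Python. The function name and the `max_size` parameter are illustrative assumptions; the sketch stops subdividing once a subset holds at most `max_size` nodes.

```python
def kd_partition(nodes, pos, max_size, axis=0):
    """Split nodes at the median, alternating between X (axis 0) and Y (axis 1).

    nodes    -- list of node identifiers
    pos      -- dict: node -> (x, y)
    max_size -- largest allowed partition size (must be at least 1)
    Returns a list of partitions, each a list of nodes.
    """
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if len(nodes) <= max_size:
        return [list(nodes)]
    ordered = sorted(nodes, key=lambda v: pos[v][axis])
    mid = len(ordered) // 2  # index of the median node
    return (kd_partition(ordered[:mid], pos, max_size, 1 - axis)
            + kd_partition(ordered[mid:], pos, max_size, 1 - axis))
```

Because each split is made at the median, the two halves differ in size by at most one node, which is what gives the partitions the similar sizes needed for load balancing.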
Layout (Step 5): This step of the algorithm computes the layout. Our algorithm
builds on the basic Fruchterman-Reingold (FR) force directed algorithm [70], which is
modified, so as to make it suitable both for incremental layout and for efficient imple-
mentation on the GPU. The basic algorithm is thus modified in three ways. First, an
approximate force model is used in order to speed up the calculation. Second, node pinning
allows individual control over the movement of each node. Third, the algorithm is
reformulated in a manner suitable for efficient implementation on the GPU.
Figure 5.3 outlines our algorithm. The input is a graph G = (V,E) decomposed into
partitions Pi, nodes with initial placement pos(v), and their pinning weights wpin(v). The
output is the positions for all nodes. The key idea of the algorithm is to converge into a
minimal energy configuration, which usually leads to aesthetically pleasing layouts.
The initialization of the algorithm includes setting the optimal geometric node dis-
tance K (that affects the scale of the graph), the initial annealing temperature t, the
temperature decay constant λ, and the fraction of the iterations done fracdone ∈ [0, 1].
Partitioning is used to accelerate the algorithm. Instead of calculating all-pair repul-
sive forces, as is customary, approximate forces are calculated. An exact calculation is
performed only for nodes contained in the same partition, while an approximate calcu-
lation is performed for nodes belonging to different partitions. The center of gravity is
found for each partition Pi and is used to replace the nodes in Pi.
Our experiments show that there is flexibility in the number of nodes in each partition,
e.g. Figure 5.4 shows that using twenty times fewer nodes in each partition has little effect
on the final layout. Moreover, it is not necessary to re-partition at every iteration, except
for the initial iterations of the initial layout (Algorithm 3, Step 4), where the nodes may
have a high displacement. During the incremental layout, the merge stage (Algorithm 2,
Step 1) already gives a good approximation of the final layout. In cases where there
are large changes between consecutive graphs, performing several re-partitioning steps
may improve the results. These cases can be identified using the following formula:
$frac_{done} = 0$, $K = 0.1$, $t = K \sqrt{|V|}$, $\lambda = 0.9$
do iteration_count times,
    update partitioning (Alg. 2 Step 4, Alg. 3 Step 3) if required
    parallel foreach partition $P_i \in P$,
        (1) calculate partition center of gravity $CG(P_i) = \frac{\sum_{v \in P_i} pos(v)}{|P_i|}$
    parallel foreach node $v$, $v \in P_i$ where $frac_{done} > w_{pin}(v)$,
        (2) $F^{repl}_{int}(v) = \sum_{u \in P_i, u \neq v} K^2 \frac{pos(v) - pos(u)}{\|pos(v) - pos(u)\|^2}$
        (3) $F^{repl}_{ext}(v) = \sum_{P_j \in P, P_j \neq P_i} K^2 |P_j| \frac{pos(v) - CG(P_j)}{\|pos(v) - CG(P_j)\|^2}$
        (4) $F^{repl}_{tot}(v) = F^{repl}_{int}(v) + F^{repl}_{ext}(v)$
        (5) $F^{attr}(v) = \sum_{u:(u,v) \in E} \frac{\|pos(u) - pos(v)\| (pos(u) - pos(v))}{K}$
    parallel foreach node $v$ where $frac_{done} > w_{pin}(v)$,
        (6) $F^{total}(v) = F^{repl}_{tot}(v) + F^{attr}(v)$
        (7) $pos_{new}(v) = pos(v) + \frac{F^{total}(v)}{\|F^{total}(v)\|} \min(t, \|F^{total}(v)\|)$
    $t \mathrel{*}= \lambda$, $frac_{done} \mathrel{+}= iteration\_count^{-1}$
Figure 5.3: Parallel force directed layout algorithm
$\frac{1}{|V|} \sum_{v \in V} (1 - w_{pin}(v))$, whose value is proportional to the changes performed to the graph.
This is so since the number of iterations during which each node v moves, is proportional
to (1− wpin(v)) (see Figure 5.3).
The key to efficient implementation of this algorithm on the GPU is deciding which
nodes will be processed by the parallel foreach loops. In order to reduce layout time and
maintain dynamic stability, only some of the nodes are displaced in each layout iteration.
For each node v, wpin(v) is compared to the current fraction of layout iterations done,
fracdone. Only nodes that satisfy fracdone > wpin(v) are processed. This makes it possible
to control the relative displacement of nodes. Nodes with a low pinning weight will be
displaced during more iterations of the algorithm. Thus, the pinning weight, assigned
according to the changes performed in the vicinity of each node, controls the stability of
Figure 5.4: Partition size effect on layout, graph bcsstk31, $|V| = 35588$, $|E| = 572916$: (a) $0.5\sqrt{|V|}$ partitions, (b) $10\sqrt{|V|}$ partitions.
node locations. Because the allowed displacement is decreased from one iteration to the
next, setting a higher pinning weight limits the total displacement of nodes.
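The effect of the pinning weight on the number of active iterations and on the total displacement can be checked with a small Python sketch. The function names are hypothetical. The sketch assumes the schedule of Figure 5.3: in iteration j (counting from zero) the fraction done is j / iteration_count and the temperature is t0 · λ^j. It computes the fraction directly instead of accumulating it, to avoid floating-point drift.

```python
def active_iterations(w_pin, iteration_count=50):
    """Number of iterations in which a node with this pinning weight moves."""
    return sum(1 for j in range(iteration_count)
               if j / iteration_count > w_pin)


def displacement_bound(w_pin, t0, decay=0.9, iteration_count=50):
    """Upper bound on the total displacement of a node: in each active
    iteration the node moves by at most the current temperature."""
    return sum(t0 * decay ** j for j in range(iteration_count)
               if j / iteration_count > w_pin)
```

With 50 iterations, a node with pinning weight 0 is active in 49 iterations, a node with weight 0.5 in 24, and a node with weight 1 in none. Since a pinned node only starts moving once the temperature has already decayed, its displacement bound shrinks much faster than its iteration count.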
Using this method, the algorithm spends computation time only on nodes which
should be displaced in each layout iteration. The amount of work done depends on the
changes performed to the graph. Areas which did not change are not processed, thereby
reducing the layout time. It is often possible to accelerate the incremental layout time
by a factor of two using this technique.
The algorithm computes the total force acting on each node in several steps. First,
the centers of gravity of all partitions are computed. Next, the set of active nodes, which
are allowed to be displaced in the current iteration, is determined. For each such node,
the repulsive forces $F^{repl}_{int}$, $F^{repl}_{ext}$ and the attractive force $F^{attr}$ acting on it are calculated.
Finally, the nodes are displaced by an amount bounded by the current temperature of
the algorithm, which slowly decays, mimicking particles freezing into position.
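A sequential Python sketch of one layout iteration is given below for illustration. It follows steps (1) to (7) of Figure 5.3 but runs on the CPU in plain Python; it is not the GPU implementation. The function name, the dictionary-based data layout and the `eps` guard against coincident points are assumptions, and every partition is assumed to be non-empty.

```python
import math


def layout_iteration(pos, edges, partitions, w_pin, frac_done, t, K=0.1, eps=1e-12):
    """One iteration of the partition-approximated force directed layout.

    pos        -- dict: node -> (x, y)
    edges      -- iterable of (u, v) pairs
    partitions -- list of non-empty node lists covering all nodes
    w_pin      -- dict: node -> pinning weight
    Returns a new position dictionary; pos itself is not modified.
    """
    part_of = {v: i for i, nodes in enumerate(partitions) for v in nodes}
    # (1) center of gravity of each partition
    cg = [(sum(pos[v][0] for v in nodes) / len(nodes),
           sum(pos[v][1] for v in nodes) / len(nodes)) for nodes in partitions]
    nbrs = {v: [] for v in pos}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)

    new_pos = dict(pos)
    for v, (x, y) in pos.items():
        if not frac_done > w_pin[v]:
            continue  # node is still pinned in this iteration
        fx = fy = 0.0
        i = part_of[v]
        for u in partitions[i]:  # (2) exact repulsion inside the partition
            if u != v:
                dx, dy = x - pos[u][0], y - pos[u][1]
                d2 = max(dx * dx + dy * dy, eps)
                fx += K * K * dx / d2
                fy += K * K * dy / d2
        for j, nodes in enumerate(partitions):  # (3) approximate repulsion
            if j != i:
                dx, dy = x - cg[j][0], y - cg[j][1]
                d2 = max(dx * dx + dy * dy, eps)
                fx += K * K * len(nodes) * dx / d2
                fy += K * K * len(nodes) * dy / d2
        for u in nbrs[v]:  # (5) attraction along edges
            dx, dy = pos[u][0] - x, pos[u][1] - y
            d = math.hypot(dx, dy)
            fx += d * dx / K
            fy += d * dy / K
        norm = math.hypot(fx, fy)  # (6), (7) bounded displacement
        if norm > 0.0:
            move = min(t, norm)
            new_pos[v] = (x + fx / norm * move, y + fy / norm * move)
    return new_pos
```

A full layout would call this function iteration_count times, multiplying t by λ and adding 1 / iteration_count to `frac_done` after each call, as in Figure 5.3. All forces are computed from the old positions, which is what allows the per-node work to run in parallel.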
Interpolation (Step 6): In this stage the computed layout of graph $C^l$ is interpolated
and used to update the initial layout of the higher-resolution graph $C^{l-1}$. Given a
node $v \in C^{l-1}$, which was mapped to node $p \in C^l$, node v is displaced by the following
amount:
\[ (1 - w_{pin}(v)) \, \frac{A(Bbox_{old}(C^l))}{A(Bbox_{new}(C^l))} \, (pos_{new}(p) - pos_{old}(p)), \]
where $A(Bbox_{old}(C^l))$ is the area of the bounding box of graph $C^l$ computed during the
coarsening step, $A(Bbox_{new}(C^l))$ is the area computed during the layout step, $pos_{old}(p)$ is
the position of node p computed during the coarsening step and $pos_{new}(p)$ is the position
of p computed during the layout step. The motivation for using this formula is as follows.
The amount $1 - w_{pin}(v)$ is used to displace nodes according to their pinning weights.
Nodes with a higher pinning weight are allowed a smaller displacement. Doing so helps
maintain the stability of the graph. Nodes with a lower pinning weight are allowed
greater flexibility in order to compute a high-quality layout. The displacement is scaled
according to the change in the area of the coarser $C^l$ due to the layout step. Finally, node
v is displaced according to the movement of the corresponding lower-resolution node p.
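The interpolation rule can be sketched in Python as follows. The function names and the dictionary-based representation are illustrative assumptions, as is the fallback to a scale of 1 when the new bounding box has zero area. The area ratio is taken as old over new, following the displacement formula above.

```python
def bbox_area(points):
    """Area of the axis-aligned bounding box of a collection of (x, y) points."""
    pts = list(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def interpolate(fine_pos, parent, w_pin, coarse_old, coarse_new):
    """Displace each fine node according to the movement of its coarse parent.

    fine_pos   -- dict: fine node -> (x, y) before interpolation
    parent     -- dict: fine node -> coarse node it was mapped to
    w_pin      -- dict: fine node -> pinning weight
    coarse_old -- dict: coarse node -> position after coarsening
    coarse_new -- dict: coarse node -> position after layout
    """
    a_old = bbox_area(coarse_old.values())
    a_new = bbox_area(coarse_new.values())
    scale = a_old / a_new if a_new > 0 else 1.0  # assumed guard for a degenerate box
    out = {}
    for v, (x, y) in fine_pos.items():
        p = parent[v]
        dx = coarse_new[p][0] - coarse_old[p][0]
        dy = coarse_new[p][1] - coarse_old[p][1]
        f = (1.0 - w_pin[v]) * scale
        out[v] = (x + f * dx, y + f * dy)
    return out
```

A fully pinned fine node (pinning weight 1) is left where it was, regardless of how far its coarse parent moved.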
Morphing (Step 7): The old layout Li−1 is morphed into the new layout Li. The
animation, showing a gradual change, helps the user maintain the mental map of the
graph. Node positions are linearly interpolated. Removed nodes and edges fade out,
then the nodes and edges move to their new position and finally added nodes and edges
fade into view.
Complexity: The asymptotic complexity of the merging, pinning, coarsening and
interpolation steps is $O(|E| + |V|)$. The complexity of the partitioning step is $O(|V| \cdot \log|V|)$: finding the median is linear at each level in the partition tree, which contains
$O(\log|V|)$ levels. Assuming that each partition contains $C_s$ nodes, the running time of
each layout iteration is $O(|E| + |V| \cdot (C_s + \frac{|V|}{C_s}))$. This expression is minimized when
$C_s = \sqrt{|V|}$, resulting in a total complexity of $O(|E| + |V|^{1.5})$. When $|E| \approx |V|$, the
dominating term is $|V|^{1.5}$. Although this may look relatively high, the simplicity of the
calculation and its parallel implementation on the GPU give good results, as discussed
in Section 5.6. We use 50 layout iterations [205].
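The optimal partition size follows from a short calculation, spelled out here for completeness. Each node interacts exactly with the $C_s$ nodes of its own partition and approximately with the $|V|/C_s$ partition centers, so the per-node cost is $f(C_s)$ below:

```latex
\[
f(C_s) = C_s + \frac{|V|}{C_s}, \qquad
f'(C_s) = 1 - \frac{|V|}{C_s^{2}} = 0
\;\Longrightarrow\; C_s = \sqrt{|V|}, \qquad
f''(C_s) = \frac{2|V|}{C_s^{3}} > 0,
\]
\[
f\left(\sqrt{|V|}\right) = 2\sqrt{|V|}
\;\Longrightarrow\;
O\left(|E| + |V| \cdot 2\sqrt{|V|}\right) = O\left(|E| + |V|^{1.5}\right).
\]
```

For instance, for a graph of 32,000 nodes this gives partitions of about 179 nodes each.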
5.4.2 Computing the Initial Layout L0
Algorithm 3 is used to compute a static layout of the first graph, G0. This algorithm
uses a multi-level force directed scheme in order to quickly compute an aesthetic layout.
Both the Kamada-Kawai (KK) [113] and Fruchterman-Reingold (FR) [70] algorithms are
employed. We elaborate on the steps of the algorithm below.
Algorithm 3 Static layout of the first graph, $G_0$
input: $G_0$ output: $L_0$
1. Coarsening: Set $C^0 = G_0$. Compute $C^1, C^2, \ldots, C^{coarsest}$ where $C^{k+1} =
\mathrm{edge\_collapse}(C^k)$. Set $l = coarsest$.
2. Perform KK layout of $C^{coarsest}$.
3. Compute a geometric partitioning of the graph nodes.
4. Perform layout of $C^l$. Update the partitioning (step 3) every few iterations. If $l = 0$,
terminate and use the layout of $C^0$ as $L_0$ (the layout of $G_0$).
5. Interpolate the layout of $C^l$ to form an initial layout for $C^{l-1}$. Set $l = l - 1$, go to
step 3.
Coarsening (Step 1): A similar method to Algorithm 2, Step 3 is utilized to create
a series of reduced versions of the graph, which are used to compute increasingly detailed
“skeletons” of the final layout. The coarsening continues recursively until a small graph
of several hundred nodes is created. This graph is then efficiently handled in the next
step and is used as a basis of a series of resolution-increasing layouts. Note that unlike
the incremental case, initial coordinates for the constructed graphs Ck, are not available.
KK layout (Step 2): The KK algorithm [113] is used to compute a force-directed
layout of the coarsest graph, Ccoarsest. This algorithm is used in conjunction with the
FR [70] force-directed algorithm (in Step 4) in order to produce an aesthetic layout.
While the KK algorithm is effective at producing a good placement from an arbitrary initial
position, the FR algorithm produces a “smoother” layout, is quicker, but is more
sensitive to the initial conditions given to it. Hence, combining the algorithms gives a
fast and aesthetic result. In our implementation 2000 iterations of the KK algorithm are
performed. Note that during incremental layout (Section 5.4.1) combining our multi-level
approach while reusing the previous layout as a starting point gives fast and good results
without incurring KK’s performance penalty.
Geometric partitioning (Step 3): The same algorithm as in step 4 of Algorithm 2
(Section 5.4.1) is used here.
FR layout (Step 4): In this step we perform force-directed layout of the current
graph in the hierarchy, C l. The algorithm is described in detail in Step 5 of Algorithm 2
(Section 5.4.1). Unlike the dynamic case, here pinning weights are not used and all nodes
are free to move in every layout iteration. In order to get improved results, we update
the node partitioning (Step 3) several times during the layout. The center of gravity of
each partition is updated every iteration, though. The algorithm terminates when the
layout of C0 = G0 is computed.
Interpolation (Step 5): In this stage the existing layout of C l is interpolated to
form an initial layout for the higher-resolution C l−1. Nodes in C l−1 are initially placed
near the position of their parent in C l.
5.5 Implementation
This section discusses the implementation of the algorithm. As will be shown in
Section 5.6, performing incremental layout, i.e. Algorithm 2, Step 5 (and similarly
Algorithm 3, Step 4), on the GPU can significantly reduce the overall running time of the
algorithm. Therefore, in this section we focus on describing the GPU implementation of
this step.
On the GPU, parallel computation is achieved by rendering graphics primitives that
cover several pixels. The GPU runs a program called a kernel program for each pixel
candidate, called a fragment. The key to high performance on the GPU is using multiple
fragment processors, which operate in parallel. The GPU suits uniformly structured
data, such as matrices. The challenge is representing graphs, which are unstructured, in
a manner that makes efficient use of GPU resources.
Implementing static force-directed layout on the GPU has been discussed in [65].
While the algorithm used here for static layout is different, the GPU implementation is
similar. This section reviews the GPU implementation and focuses on the changes needed
for dynamic layout.
Several textures are used on the GPU to represent the graph: the textures represent
the nodes, the partitions, the edges, and the forces. The location texture holds the (x,y)
positions of all the nodes in the graph. Each graph node has a corresponding (u,v) index
in the texture. As shown in Figure 5.5 (a), the nodes in each partition are stored in a
rectangular region in the location texture.
Bucket-sort is performed on the pinning weights of the nodes in each partition. Nodes
are placed into the texture in a left to right, top to bottom order, according to the bucket
they belong to, as shown in Figure 5.5 (b). The number of buckets is set to the number
of iterations of the layout algorithm. Sorting creates contiguous regions of nodes with
similar wpin values. This allows the algorithm to control the set of nodes whose positions
are updated at every layout iteration. Using appropriate rendering commands, the GPU
is instructed to process only the relevant nodes in each iteration, as discussed below.
Figure 5.5: Sorting nodes by pinning weight wpin on the GPU. (a): A location texture
separated into regions, color coded by the partition each node belongs to. (b): Nodes in
each region are sorted from low wpin to high wpin.
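The sorting and placement described above can be sketched on the CPU as follows. This is an illustration only: the function names are hypothetical, the pinning weights are assumed to lie in [0, 1], and node ids stand in for texture entries.

```python
def bucket_sort_by_pin(pin_weights, num_buckets):
    """Order node ids from low to high pinning weight using bucket sort.

    pin_weights: dict node id -> wpin, assumed to lie in [0, 1]
    num_buckets: number of buckets (the text sets this to the iteration count)
    """
    buckets = [[] for _ in range(num_buckets)]
    for node, w in pin_weights.items():
        # Clamp so that w == 1.0 falls into the last bucket.
        index = min(int(w * num_buckets), num_buckets - 1)
        buckets[index].append(node)
    return [node for bucket in buckets for node in bucket]

def place_in_region(ordered_nodes, region_width):
    """Assign (row, column) slots left to right, top to bottom."""
    return {node: divmod(i, region_width) for i, node in enumerate(ordered_nodes)}

weights = {"a": 0.9, "b": 0.0, "c": 0.35, "d": 0.3, "e": 1.0}
order = bucket_sort_by_pin(weights, num_buckets=4)
slots = place_in_region(order, region_width=2)
```

Nodes within one bucket keep their insertion order, which is sufficient here since only the bucket boundaries determine which nodes are processed in an iteration.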
The partition center of gravity texture holds the current (x,y) coordinates of the center
of gravity of each partition. Graph edges are represented using the neighbors texture and
the adjacency texture. The adjacency texture contains lists of (u, v) pointers into the
location texture, representing the neighbors of each node. The neighbors texture holds
for each node v, a pointer into the adjacency texture, to the coordinates of the first
neighbor of the node. Pointers to additional neighboring nodes are stored in consecutive
locations in the adjacency texture. The neighbors texture also holds the degree of each
node. The forces computed during layout are stored in two textures: the attractive force
texture and the repulsive force texture. The attractive force texture contains for each
node the sum of the attractive forces $F^{attr}$ exerted on it by its neighbors. The repulsive
force texture holds the sum of repulsive forces, both by nodes in the same partition
($F^{repl}_{int}$) and by the other partitions in the graph ($F^{repl}_{ext}$).
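The relation between the neighbors texture and the adjacency texture resembles a compressed adjacency list. The following sketch builds the two structures as plain Python lists; the function names are hypothetical, and integer node ids stand in for the (u, v) texture coordinates used on the GPU.

```python
def build_edge_arrays(num_nodes, edges):
    """Build flat 'adjacency' and per-node 'neighbors' arrays from an edge list.

    Returns (neighbors, adjacency) where neighbors[v] = (offset, degree) and
    adjacency[offset:offset + degree] lists the neighbors of node v.
    """
    lists = [[] for _ in range(num_nodes)]
    for a, b in edges:
        # Each undirected edge is stored twice, once per endpoint.
        lists[a].append(b)
        lists[b].append(a)
    neighbors, adjacency = [], []
    for v in range(num_nodes):
        neighbors.append((len(adjacency), len(lists[v])))
        adjacency.extend(lists[v])
    return neighbors, adjacency

def neighbors_of(v, neighbors, adjacency):
    """Follow the pointer for node v and read its consecutive neighbor entries."""
    offset, degree = neighbors[v]
    return adjacency[offset:offset + degree]

nbr, adj = build_edge_arrays(4, [(0, 1), (0, 2), (2, 3)])
```

Storing the degree next to the pointer lets a kernel loop over exactly the entries that belong to a node without any end-of-list marker.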
The overall storage complexity is O(|V | + |E|): every node and edge is stored a
fixed number of times. Each node is represented as four 32-bit floating-point values
in the following textures: location (two textures), forces (two textures) and neighbors.
Each edge is represented twice in the adjacency texture (once for each of the nodes in its
endpoints), whose entries are also four 32-bit floating-point numbers. For performance
reasons, information about the graph partitions is stored in three textures holding four
32-bit floating-point numbers each. These textures have the same size as the textures
representing nodes.
Hence, in the current implementation, a total of 32 32-bit numbers are stored per node
and 8 32-bit numbers are stored per edge in the different textures. This amounts to about
8MB of texture memory for the fe pwt graph with (V,E) = (32045, 112395). Modern
graphics cards have hundreds of megabytes of texture memory, making accommodation
of very large graphs possible. Note that for implementation ease, textures holding four
32-bit numbers are used in all cases. This is not always required, and using narrower
textures where possible can further reduce the memory footprint.
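The storage figures above can be checked with a short calculation. The constant and function names below are introduced for illustration; the texture counts and graph sizes are taken from the text.

```python
BYTES_PER_VALUE = 4          # one 32-bit floating-point number
VALUES_PER_TEXEL = 4         # every texture stores four values per entry

node_textures = 5            # location (2), forces (2), neighbors (1)
partition_textures = 3       # same size as the node textures
values_per_node = (node_textures + partition_textures) * VALUES_PER_TEXEL
values_per_edge = 2 * VALUES_PER_TEXEL   # each edge appears twice in the adjacency texture

def texture_bytes(num_nodes, num_edges):
    """Total texture memory in bytes for a graph of the given size."""
    return (num_nodes * values_per_node + num_edges * values_per_edge) * BYTES_PER_VALUE

total = texture_bytes(32045, 112395)     # the fe pwt graph
```

This gives 32 values per node, 8 values per edge and a total of roughly 7.7 million bytes for fe pwt, in line with the figure of about 8MB quoted above.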
Computing each layout iteration is done in several steps, which are implemented as
kernel programs that run on the GPU. The partition CG kernel calculates the center of
gravity of each partition, as shown in the line numbered (1) in Figure 5.3. The repulse
kernel calculates the repulsive forces exerted on each node. This kernel first calculates for
each fragment it processes, the internal forces, i.e. forces exerted by nodes contained in
the partition that the fragment belongs to. Then, it approximates the forces by all other
partitions. See lines (2)-(4) in Figure 5.3. The attract kernel is used to calculate the
attractive forces caused by graph edges. For each node, the kernel accesses the neighbors
texture in order to get a pointer into the adjacency texture, which contains the (u,v)
location texture coordinates of the node’s neighbors. For each neighboring node, the
attractive force is calculated and accumulated. This corresponds to line (5) in Figure 5.3.
Finally, the anneal kernel calculates the total force on each node, F total, and displaces
nodes accordingly, as shown in lines (6),(7) in Figure 5.3. This kernel updates a second
copy of the location texture. This double buffering is required since the GPU cannot
read from and write to the same texture.
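The sequence of kernels can be mimicked on the CPU by a single function. The sketch below is a simplified stand-in and not the algorithm of Figure 5.3: pinning weights are omitted, the force from each other partition is assumed to act from its center of gravity weighted by its node count, and the displacement cap max_step is an assumed substitute for the annealing schedule. All names are hypothetical.

```python
import math

def layout_iteration(pos, partition, edges, K=1.0, max_step=0.1):
    """Run one simplified layout iteration and return a NEW list of positions."""
    n = len(pos)
    # Partition CG step: members, center of gravity and size of each partition.
    members = {}
    for v in range(n):
        members.setdefault(partition[v], []).append(v)
    cg = {p: (sum(pos[v][0] for v in vs) / len(vs),
              sum(pos[v][1] for v in vs) / len(vs), len(vs))
          for p, vs in members.items()}

    def repulsion(v, ox, oy, weight):
        # Force on v pushing it away from the point (ox, oy): K^2 * d / |d|^2.
        dx, dy = pos[v][0] - ox, pos[v][1] - oy
        d2 = dx * dx + dy * dy
        if d2 == 0.0:
            return 0.0, 0.0
        return weight * K * K * dx / d2, weight * K * K * dy / d2

    force = [[0.0, 0.0] for _ in range(n)]
    for v in range(n):
        # Repulse step, internal part: exact forces from nodes in v's partition.
        sources = [(pos[u][0], pos[u][1], 1) for u in members[partition[v]] if u != v]
        # External part: one approximate force per other partition (assumed form).
        sources += [c for p, c in cg.items() if p != partition[v]]
        for ox, oy, weight in sources:
            fx, fy = repulsion(v, ox, oy, weight)
            force[v][0] += fx
            force[v][1] += fy
    # Attract step: each edge pulls its endpoints together with magnitude d^2 / K.
    for a, b in edges:
        dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
        d = math.hypot(dx, dy)
        force[a][0] += d * dx / K
        force[a][1] += d * dy / K
        force[b][0] -= d * dx / K
        force[b][1] -= d * dy / K
    # Anneal step: cap the displacement and write into a second buffer,
    # leaving the input buffer untouched (double buffering).
    new_pos = []
    for v in range(n):
        fx, fy = force[v]
        mag = math.hypot(fx, fy)
        scale = min(mag, max_step) / mag if mag > 0.0 else 0.0
        new_pos.append((pos[v][0] + fx * scale, pos[v][1] + fy * scale))
    return new_pos

old = [(0.0, 0.0), (4.0, 0.0)]
new = layout_iteration(old, partition=[0, 0], edges=[(0, 1)])
```

Returning a fresh list instead of updating pos in place mirrors the double buffering: every force in an iteration is computed from the same, unmodified set of input positions.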
In total, the partition CG kernel performs $O(|V|)$ operations; the repulse kernel performs
$O(|V|^{1.5})$ operations; the attract kernel performs $O(|E|)$ operations; and the anneal
kernel performs $O(|V|)$ operations. On the GPU, the computations executed in each kernel are
run in parallel. Since, as discussed below, only some of the nodes are operated on during
each layout iteration, in practice the average number of operations performed by each
kernel is lower than the maximum values presented above.
Recall that the nodes in each partition are sorted according to wpin, as shown in
Figure 5.5 (b). This allows us to control the nodes processed in each layout iteration, thus
spending GPU time only on the nodes which should move. Before each layout iteration,
for each rectangular texture region representing a partition of the graph, the rows which
contain nodes for which fracdone > wpin(v) are determined. A set of quadrilaterals which
cover the corresponding parts of each region are rendered. This instructs the GPU to
process only these nodes. OpenGL display lists are used in order to efficiently send these
rendering commands to the GPU. Note that this method operates on a per-row basis,
potentially causing a small amount of extra fragments to be processed for each region.
The processing of these extra fragments is avoided by conditionally updating the location
of a node only if fracdone > wpin(v).
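The row selection and the per-node test can be sketched as follows. The function names are hypothetical, and the sketch assumes that a region stores its nodes row by row in sorted order, so that the nodes to be processed form a prefix of the region.

```python
import math

def active_rows(sorted_pins, region_width, frac_done):
    """Return the number of texture rows to render for one partition region.

    sorted_pins:  pinning weights of the region's nodes, sorted low to high and
                  stored row by row, region_width nodes per row
    frac_done:    fraction of the layout iterations completed so far
    """
    # Because the weights are sorted, the active nodes form a prefix.
    active = sum(1 for w in sorted_pins if frac_done > w)
    return math.ceil(active / region_width)

def should_update(w_pin, frac_done):
    """Per-fragment test that filters the extra nodes in the last rendered row."""
    return frac_done > w_pin

pins = [0.0, 0.1, 0.2, 0.5, 0.5, 0.8, 0.9, 1.0]
rows = active_rows(pins, region_width=3, frac_done=0.4)
```

With frac_done = 0.6 two rows are rendered; the second row also contains the node with weight 0.8, which the per-fragment test then leaves unchanged.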
Note that our implementation does not require copying data from GPU memory
(textures) to CPU memory while performing the layout iterations. Keeping the data on
the graphics card enables full utilization of the GPU's compute and memory bandwidth
resources.
5.6 Results
Two criteria are used to measure the quality of the resulting dynamic layouts: average
displacement of nodes between each pair of successive layouts and potential energy. The
first criterion measures the stability of the layout. The second criterion judges the quality
of the layout. Lower energy (in absolute value) implies low stress in the graph, corresponding
to a good layout. The energy $U$ is derived from the relation $\vec{F} = -\nabla U$. Hence,
given the force $\vec{F}$, the energy can be derived by integrating. Given two nodes at positions
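$\vec{u}, \vec{v}$, connected by an edge, the attractive force acting along the edge is
$$\vec{F}^{attr} = \frac{1}{K}\,\|\vec{u}-\vec{v}\|\,(\vec{u}-\vec{v}) = -\nabla U^{attr},$$
hence
$$U^{attr} = \frac{-1}{3K}\,\|\vec{u}-\vec{v}\|^{3}.$$
The repulsive force between two nodes is
$$\vec{F}^{repl} = \frac{-(\vec{u}-\vec{v})}{\|\vec{u}-\vec{v}\|^{2}}\,K^{2} = -\nabla U^{repl},$$
hence
$$U^{repl} = \frac{1}{2}K^{2}\log(\|\vec{u}-\vec{v}\|^{2}).$$
The total energy is computed by summing over all edges and over all node pairs:
$U^{total} = U^{attr} + U^{repl}$, i.e.
$$U^{total} = \sum_{u:(u,v)\in E} \frac{-1}{3K}\,\|\vec{u}-\vec{v}\|^{3} + \sum_{u,v\in V,\,u\neq v} \frac{1}{2}K^{2}\log(\|\vec{u}-\vec{v}\|^{2}).$$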
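The total energy can be evaluated directly from a layout. The sketch below follows the formula as written, taking the second sum over ordered pairs of distinct nodes; counting each unordered pair once would halve that term, and which convention was used for the reported values is not stated. The function name is hypothetical, and all nodes are assumed to be at distinct positions.

```python
import math
from itertools import permutations

def total_energy(pos, edges, K=1.0):
    """Sum the attractive terms over edges and the repulsive terms over node pairs.

    pos:   dict node id -> (x, y), all positions assumed distinct
    edges: iterable of (u, v) pairs, each edge listed once
    """
    def dist(u, v):
        return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])

    u_attr = sum(-dist(u, v) ** 3 / (3.0 * K) for u, v in edges)
    # Ordered pairs u != v, as the formula is written (an interpretation).
    u_repl = sum(0.5 * K * K * math.log(dist(u, v) ** 2)
                 for u, v in permutations(pos, 2))
    return u_attr + u_repl

layout = {"a": (0.0, 0.0), "b": (2.0, 0.0)}
energy = total_energy(layout, [("a", "b")])
```

For two connected nodes at distance 2 with K = 1 this evaluates to -8/3 + log 4.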
Other static graph layout quality criteria are indirectly handled by the underlying
force-directed algorithm. Note that other criteria have also been used to measure mental map
preservation, for example the orthogonal ordering of nodes [145].
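The stability criterion, the average displacement of nodes between two successive layouts, can be computed as sketched below. Restricting the average to nodes that appear in both layouts is an assumption; the text does not specify how added or removed nodes are treated. The function name is hypothetical.

```python
import math

def average_displacement(prev_pos, curr_pos):
    """Mean Euclidean distance moved by nodes present in both layouts."""
    shared = prev_pos.keys() & curr_pos.keys()
    if not shared:
        return 0.0
    moved = sum(math.dist(prev_pos[v], curr_pos[v]) for v in shared)
    return moved / len(shared)

before = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}
after = {"a": (3.0, 4.0), "b": (1.0, 1.0), "d": (9.0, 9.0)}
delta_pos = average_displacement(before, after)
```

Averaging this quantity over all pairs of successive layouts in a sequence gives one number per sequence.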
The quality of the layout is compared to two algorithms. The first is a force-directed
non-incremental algorithm that lays out each graph in the sequence independently. This
algorithm, which is expected to produce the best layouts since it has no constraints,
is used to check the quality of our dynamic layouts. The second is a variant of our
dynamic algorithm which does not use pinning weights (i.e. wpin ≡ 0). This algorithm
demonstrates that simply using the previous placement is insufficient for generating stable
layouts. Note that the running time of these two algorithms is much higher than the
running time of our algorithm since they process all nodes in each layout iteration.
Several well-known graphs (3elt, 4elt, fe pwt, bcsstk31) are used to demonstrate our
algorithm [204]. The dynamic sequences are generated by performing random changes
on the graphs, modifying |E| and |V| by up to 15%. In addition, the sequences marked
threads1,2 and Rimzu come from real data, discussed in Sections 5.7, 5.8. In these graphs,
there are cases in which the changes between consecutive graphs in the series are small
(e.g. a few nodes are added). As discussed in Section 5.4.1, Step 5, the algorithm is able
to efficiently handle such changes by performing computations only on the nodes which
should be displaced in each layout iteration. Figure 5.6 shows a few snapshots from the
dynamic graph layout of 3elt.
Figure 5.6: Snapshots from layouts of the 3elt sequence (|V| ≈ 4000, |E| ≈ 10,500),
left-to-right, top-to-bottom
Another example is Newcomb’s fraternity data [152], which represents friendship rela-
tions between college students. This data was visualized using the SoNIA tool for social
network visualization [11,12,149]. As discussed in [149], the Newcomb data is best visual-
ized by the peer-influence (PI) algorithm of SoNIA, where nodes are displaced according
to forces exerted by neighbors.
Table 5.1 shows average results for the layout quality metrics. (Lower values are
better.) The Δpos column shows the average displacement of nodes and the |U^total|
column shows the absolute value of the potential energy of the graph.

graph         rimzu               threads1            threads2
metric        Δpos    |U^total|   Δpos    |U^total|   Δpos    |U^total|
non-incr      31.4    4418        1.45    39.2        1.06    9.72
basic-incr    4.62    4435        0.333   40.4        0.297   9.81
ours          0.274   3418        0.042   30.3        0.048   5.55

graph         newcomb             3elt                fe pwt
metric        Δpos    |U^total|   Δpos    |U^total|   Δpos    |U^total|
non-incr      0.48    1.82        25.9    2.73x10^5   105.5   9.59x10^5
basic-incr    0.221   1.81        2.3     3.06x10^5   10.7    9.37x10^5
ours          0.099   1.94        0.968   2.79x10^5   3.62    8.1x10^5

Table 5.1: Layout quality - values are averages for a sequence of layouts

It is clear that our incremental algorithm outperforms the other algorithms and maintains
dynamic stability.
The potential energies achieved by all algorithms are similar, demonstrating that the
quality of layouts computed by our algorithm is good. In some cases (like fe pwt) the
two incremental algorithms surprisingly perform better than the static one. This is due
to the fact that the force-directed algorithm finds a local minimum which depends on
the initial conditions, which are different for each algorithm used here. In summary,
the results demonstrate that our algorithm computes aesthetic layouts while decreasing
the movements of the nodes. This reduction does not come at the expense of layout
quality. The algorithm tries to maintain the structure of the graph, using node pinning
to propagate changes across the graph, allowing for new landmarks to be created, while
at the same time maintaining the mental map. Note that compared to the algorithm
of [66], using a multi-level incremental algorithm somewhat reduces the stability of the
layout. However, this gives the algorithm an opportunity to calculate a higher quality
layout.
Figure 5.7 shows a comparison of the SoNIA layouts using the PI algorithm and our
layouts. As can be seen, one of the advantages of our algorithm is the greater stability in
node positions, especially when only the edges of the graph are modified. Although both
SoNIA and our algorithm are based on force-directed methods, the more sophisticated
initial placement and pinning algorithms help improve the results.
For our performance tests we used two computers. The first is a PC with a 3 GHz
Pentium IV CPU and an NVIDIA 7900GS GPU. The second is a newer PC with a 2.4
GHz Intel Core 2 Duo E6600 CPU and an NVIDIA 8800GTS GPU. Our algorithm was
implemented using C++, Cg and OpenGL.
Figure 5.7: Snapshots from the layouts of the newcomb fraternity data [152]. Left: our
algorithm. Right: SoNIA algorithm [11,12], used in [149].
Graph name    avg. |V|    avg. |E|
3elt          4097        10468
4elt          14588       40176
bcsstk31      32715       48495
fe pwt        32045       112395

Table 5.2: Graph sequence information.
              3GHz Pentium + 7900GS GPU                2.4GHz Core 2 + 8800GTS GPU
Graph         initial layout      dynamic layout       initial layout      dynamic layout
name          CPU     CPU+GPU     CPU     CPU+GPU      CPU     CPU+GPU     CPU     CPU+GPU
3elt          2.72    1.49        0.764   0.249        1.72    1.27        0.436   0.2
4elt          17.6    2.98        5.91    0.777        10.4    2.22        3.38    0.39
bcsstk31      50.4    9.28        21.2    4.74         34      9.61        12.1    1.38
fe pwt        47.7    6.03        21      2.1          28.8    4.27        12      0.704

Table 5.3: Running times [sec.]. The running times of the CPU only and GPU-accelerated
implementation of the algorithm are shown. All times shown are total running times for
computing a layout. Dynamic layout times are averaged over a sequence of layouts.
Table 5.2 gives information about the graph sequences and Table 5.3 shows running
times, both when using only the CPU and when using the GPU to accelerate the
computation. As can be seen in the table, our GPU implementation provides a significant
speedup compared to the CPU. Using the older 7900GS GPU, a speedup of up to 10
times is achieved. Using the newer and faster 8800GTS GPU, the speedup increases to
up to 17 times, compared to the newer CPU. Due to the high ratio of arithmetic
operations to memory accesses, the algorithm is compute bound rather than memory bound.
Therefore, as demonstrated in the comparison between the PCs, the GPU implementation of the
algorithm is scalable.
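The quoted speedups can be reproduced from the dynamic layout columns of Table 5.3. The data structure and function name below are introduced for illustration; the timings are those listed in the table.

```python
# Dynamic layout times in seconds from Table 5.3: (CPU, CPU+GPU)
dynamic_times = {
    "7900GS": {"3elt": (0.764, 0.249), "4elt": (5.91, 0.777),
               "bcsstk31": (21.2, 4.74), "fe pwt": (21.0, 2.1)},
    "8800GTS": {"3elt": (0.436, 0.2), "4elt": (3.38, 0.39),
                "bcsstk31": (12.1, 1.38), "fe pwt": (12.0, 0.704)},
}

def best_speedup(times):
    """Return (graph, speedup) for the largest CPU / (CPU+GPU) ratio."""
    graph = max(times, key=lambda g: times[g][0] / times[g][1])
    cpu, gpu = times[graph]
    return graph, cpu / gpu

best = {machine: best_speedup(t) for machine, t in dynamic_times.items()}
```

On both machines the largest ratio is obtained for fe pwt: 21 / 2.1 = 10 on the older PC and 12 / 0.704, about 17, on the newer one.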
Focusing on the part of the algorithm that runs on the GPU leads to interesting
insights. For the fe pwt graph, the average time for computing the FR incremental
layout stage using the 7900GS GPU was 1.66 seconds. Using the 8800GTS GPU, the
time dropped to 0.417 seconds. This represents a significant performance increase between
GPU generations (∼ 4 times), which is larger than the performance increase between the
CPU generations [163]. The speedup is achieved while taking into account the overhead
of instructing the GPU to perform the layouts, which can be significant in the coarser
graphs. The speedup of performing the last layout stage (on the finest graph) is about 8
times.
There are several factors contributing to the increase in performance between the
GPUs. The new GPU has a different architecture, which is better suited for dealing with
graphs. Due to its smaller branch granularity, a smaller penalty is encountered when
dealing with non-uniform data, such as graphs. In addition, the 8800GTS uses a scalar
architecture, which is more efficient here, since the algorithm deals mostly with 2D and
1D quantities. Finally, the new GPU has more raw compute power.
5.7 Application to Discussion Thread Visualization
We applied our algorithm to the visualization of Internet discussion forums. We col-
lected data from several discussion threads at http://www.dailytech.com . This site
contains various hi-tech related news items. The discussion threads visualized contain
the comments people make on the news items. In the graph, each node represents a user.
Edges are constructed between the user adding a comment and users who replied to
that comment. Each discussion thread is represented by a node labeled A n where n is
the discussion thread number (corresponding to a news item).
In order to create the visualization, shown in Figures 5.1 and 5.8, several steps are
executed. First, the graph is transformed into a connected graph, as required by the graph
layout algorithm. This is achieved by adding an invisible root node and connecting it with
invisible edges to all the A n nodes representing the discussion threads. The connected
graph is then handed to the incremental layout algorithm.
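The first step can be sketched as follows. The graph representation, the function names and the root label are assumptions made for illustration; the check for connectivity is included only to show the effect of adding the root.

```python
from collections import deque

def add_invisible_root(adjacency, thread_nodes, root="__root__"):
    """Return a copy of the graph with a root node linked to every thread node.

    adjacency:    dict node -> set of neighbors (undirected)
    thread_nodes: the nodes representing discussion threads
    root:         label for the added node (a hypothetical name)
    """
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    graph[root] = set()
    for t in thread_nodes:
        graph[root].add(t)
        graph[t].add(root)
    return graph

def is_connected(graph):
    """Breadth-first search from an arbitrary node."""
    start = next(iter(graph))
    seen, queue = {start}, deque([start])
    while queue:
        for nbr in graph[queue.popleft()]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return len(seen) == len(graph)

threads = {"A_1": {"alice"}, "alice": {"A_1"}, "A_2": {"bob"}, "bob": {"A_2"}}
connected = add_invisible_root(threads, ["A_1", "A_2"])
```

The root and its edges take part in the layout computation but would be skipped when drawing.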
Second, in order to improve the visualization of the computed layout sequence, over-
lapping between node labels is addressed. A set of bounding boxes of drawn node labels
is maintained and updated after each label is drawn. If a new label to be drawn intersects
any of the bounding boxes of already drawn labels, it is drawn in the background,
farther away from the viewer and with a lighter color. Doing so prevents the new label
from occluding the text of any previously drawn labels. If a new label does not intersect
any of the existing labels, it is drawn in the foreground. Before each node label is drawn,
a rectangle with the same color as the background is drawn behind the node label. This
is done so each pixel will display the text of a single label (preventing overlaps).
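The label placement rule can be sketched as below. The box format and the function names are assumptions made for illustration, and the sketch assumes that the boxes of background labels are also added to the maintained set.

```python
def intersects(a, b):
    """Test two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def assign_label_layers(label_boxes):
    """Decide, in drawing order, whether each label goes to the foreground.

    label_boxes: list of (name, box) in the order the labels are drawn
    Returns a dict name -> "foreground" or "background".
    """
    drawn = []      # bounding boxes of all labels drawn so far
    layers = {}
    for name, box in label_boxes:
        overlaps = any(intersects(box, other) for other in drawn)
        layers[name] = "background" if overlaps else "foreground"
        drawn.append(box)
    return layers

labels = [("A_1", (0, 0, 10, 2)), ("alice", (8, 1, 15, 3)), ("bob", (20, 0, 25, 2))]
layers = assign_label_layers(labels)
```

Since the outcome depends on the drawing order, labels drawn earlier are more likely to end up in the foreground.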
Third, during animation, the nodes are drawn in a specific order which is designed
to visualize the interesting features of the evolving graph sequence more clearly. The
labels of important nodes should receive priority when drawn. These include nodes with
a high degree, acting as central nodes in the graph, and nodes whose neighborhood
in the graph has changed. Each node is assigned a score. Nodes with a higher score
are rendered before nodes with a lower score. This reduces the probability that an
important node’s label will be occluded. The score of each node v is set to score(v) =
degree(v) + β · degree_change(v), where degree_change(v) is the change of the degree of
node v between the current and previous graphs. The score helps emphasize the main
features of the evolving graph sequence. The constant β can be changed by the user. Its
default value is 2.
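The drawing order can be computed as sketched below. Two details are assumptions not stated in the text: the degree change is taken as an absolute difference, and a node that is absent from the previous graph is treated as having had degree 0. The function name is hypothetical.

```python
def draw_order(curr_degree, prev_degree, beta=2.0):
    """Return node ids sorted by decreasing score(v) = degree(v) + beta * degree_change(v)."""
    def score(v):
        # Assumed: absolute change, and degree 0 for nodes new to the graph.
        change = abs(curr_degree[v] - prev_degree.get(v, 0))
        return curr_degree[v] + beta * change
    return sorted(curr_degree, key=score, reverse=True)

previous = {"hub": 5, "quiet": 1, "rising": 1}
current = {"hub": 5, "quiet": 1, "rising": 3, "new": 1}
order = draw_order(current, previous)
```

Here the node whose degree grew from 1 to 3 scores 7 and is drawn before the stable hub with score 5, illustrating how β shifts priority towards changing parts of the graph.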
Figure 5.1 shows a sample visualization of 7 discussion threads with 119 users. Although
the graph more than doubles in size during the visualization, our layout manages to preserve
the mental map. Several insights can be gained from the visualization. Clusters are ev-
ident around the A n nodes, representing each discussion thread. As time progresses,
more clusters, representing new discussion threads, become visible. There are clusters of
various sizes – correlating to threads drawing different levels of attention. Some users
post messages on several threads while others discuss only one topic. Some users are
very active and post many messages, acting as central nodes in the graph. The degree of
nodes representing such users increases over time and they contribute to the connectivity
of the graph. Some users, who are drawn at the boundaries of the graph, contribute only
one comment.
As a second example we studied the latest headlines section of the website. We
selected five items, appearing over a span of three days, from seemingly unrelated fields:
computer games, nuclear fusion, low-cost PCs, Windows/Linux switch and Christmas
e-shopping. The number of comments for each article varied from 15 to 31. A total of 86
users contributed to the discussion threads. Figure 5.8 presents several snapshots from
the animation sequence showing the evolution of these discussion threads over time. A
movie showing the visualization is available in the supplementary material.
Looking at the visualization, several conclusions can be drawn. The graph is initially
partitioned into disconnected clusters, representing nuclear fusion, low-cost PCs and com-
puter games. Later, connections start to appear in the graph. The threads discussing
low-cost PCs and Windows/Linux switch are highly connected. Some connections exist
between these clusters and the computer game cluster. Surprisingly, several users
discussing nuclear fusion join both the computer games and Windows/Linux switch threads.
Good correlation also exists between nuclear fusion and the Christmas e-shopping
discussion.
Figure 5.8: Snapshots from the threads2 graph sequence, visualizing discussion threads at
http://www.dailytech.com, left to right, top to bottom. 109 messages from 86 users in 5
discussion threads are shown. Discussion topics, marked as blue A n nodes, include
computer games (A 5054), nuclear fusion (A 5027), low-cost PCs (A 5060), Windows/Linux
switch (A 5069) and Christmas e-shopping (A 5082).
5.8 Application to Social Network Visualization
Our algorithm was applied to the visualization of the growth of social networks. We used
data from the social network at http://www.rimzu.com. In this network, new users can
register after receiving an invitation from an existing user. Each user is able to list a set
of friends among the members of the network. In the visualization, users are represented
as nodes. Edges link each user to his/her friends.
Figure 5.9 shows a visualization of the growth of this network. The visualization
shows a period in time where the network grew considerably, from 216 nodes to 962
nodes. The visualization was created by constructing the graph of the network at equally
spaced intervals in time. As in the Internet threads visualization, a dummy invisible root
node was added in order to make the graph connected.
Figure 5.9: Snapshots from the Rimzu graph sequence, visualizing the social network at
http://www.rimzu.com, left to right, top to bottom. Nodes represent users and edges
represent connections between users. In the visualization the graph grows from V=216,
E=544 to V=962, E=1561. Nodes are colored by age in a red → yellow → green scale.
Several properties of the network are evident from the created visualization. The
graph has dozens of connected components. The fact that the graph is not connected is
surprising since members are able to join the network only after receiving an invitation.
There are many users who joined the network but did not list any friends. They are
represented as a cluster of nodes with degree zero (no edges). There are components of
varying complexity in the network. Some are very simple, connecting a handful of nodes,
while others are large and highly connected. Several tree-like components are visible.
These correspond to one user with several friends who are not linked between themselves.
There is one large component which exists from the beginning of the visualization.
Coloring the nodes by age reveals more information on the graph. Some components
of the graph were created in a relatively short time frame. Others, such as the large
component on the right, grow continuously.
Note how the algorithm manages to compute a stable, mental-map preserving layout
of the dynamic graph sequence while at the same time providing meaningful layouts
from which the insights discussed above can be extracted. This is especially challenging
due to the large growth of the network in the period visualized. A movie showing the
visualization is available in the supplementary material.
5.9 Conclusion and Future Work
We have presented an online algorithm for dynamic layout of graphs, whose goal is to
efficiently compute stable and aesthetic layouts. The algorithm rests on several key ideas.
First, a good initial layout is computed. Second, the allowed displacement of nodes
is controlled according to the changes applied to the graph. In particular, each node
is assigned an individual convergence schedule. Third, the global interactions in the
graph are approximated in order to maintain the structure of the graph and compute an
aesthetic layout. Fourth, a multi-level scheme is used in order to compute high-quality
layouts. Last but not least, the GPU is used to accelerate the algorithm, requiring the
representation of unstructured graphs in an ordered manner that fits the GPU.
It has been demonstrated that the algorithm computes an aesthetic layout, while
reducing displacement and maintaining the user’s mental map between layout iterations.
Our GPU implementation of the algorithm performs up to 17 times faster than the CPU
version. We have applied our algorithm to the visualization of discussion threads on the
Internet and to social network visualization.
There are several avenues for future research. An interesting research direction is
the extension of the algorithm to drawing multi-level clustered graphs. Finding ways to
implement more parts of the algorithm on the GPU will help accelerate the computa-
tion. Improving the algorithm used for morphing between layouts can further help in
maintaining the mental map.
Chapter 6
Dynamic Drawing of Clustered Graphs
This chapter presents an algorithm for drawing a sequence of graphs that contain an
inherent grouping of their vertex set into clusters. It differs from previous work on
dynamic graph drawing in the emphasis that is put on maintaining the clustered structure
of the graph during incremental layout. The algorithm works online and allows arbitrary
modifications to the graph. It is generic and can be implemented using a wide range
of static force-directed graph layout tools. This chapter introduces several metrics for
measuring layout quality of dynamic clustered graphs. The performance of our algorithm
is analyzed using these metrics. The algorithm has been successfully applied to visualizing
mobile object software. This chapter is based on [63].
The rest of this chapter is structured as follows. An introduction is given in Sec-
tion 6.1. Section 6.2 defines the problem. Section 6.3 describes the algorithm. A software
visualization application is presented in Section 6.4. Finally, Section 6.5 concludes and
discusses future directions.
6.1 Introduction
In clustered graphs, the vertices are divided between a set of components called clusters,
which form a partition of the vertex set. In some applications, the graphs are inherently
clustered [25]. In other cases, clustering has been successfully used in order to aid in the
visualization of graphs [210].
Many applications require dynamic graph drawing, i.e., the ability to
modify the graph [21, 47, 155]. Different types of graph modifications may be performed:
adding vertices and clusters, moving vertices between clusters, removing edges,
etc. The challenge in dynamic graph drawing is to compute a new layout that is both
aesthetically pleasing as it stands and fits well into the sequence of drawings of the
evolving graph. The latter criterion has been termed preserving the mental map [145]
or dynamic stability [155]. A short animation sequence showing incremental layouts of
clustered graphs computed by our algorithm is shown in Figure 6.1. In this dynamic
scenario, vertices move between clusters and thus the size of clusters changes, edges are
added, and clusters are added and removed. Yet, the relative locations of the clusters
and the vertices are preserved, while allowing changes in the size of clusters when deemed
necessary.
Figure 6.1: Snapshots from an animation sequence
One field in which clustered graphs arise is software visualization, and in particular,
visualization of mobile object frameworks [38, 102, 130]. Such frameworks extend the
distributed objects concept [158,195] in allowing the objects to migrate to remote hosts,
along with their state and behavior, while the application is executing (in order to speed
up interaction).
In these frameworks, the notion of a dynamic clustered graph arises quite naturally.
Every object is represented by a vertex in the graph. A machine is represented as a cluster
that contains the objects currently residing in it. The area occupied by a cluster is used
as a visual cue to the user regarding the number of objects located in the machine
represented by the cluster. Naturally, the graph being visualized evolves with time,
as objects migrate between machines and machines connect and disconnect from the
network. Our algorithm has been designed to show these interactions.
(a) Force-directed non-incremental layout
(b) Our incremental layout
Figure 6.2: Incremental vs. non-incremental layout (from left to right)
Compared to graphs which are not clustered, work on clustered graph drawing is less
widespread. In [206], a divide and conquer approach, in which each cluster is laid out
separately and then the clusters are composed to form the graph, is used. In [51], a
method of drawing the clustering hierarchies of the graph using different Z coordinates
in a 3D view is discussed. See also [14, 116] for a discussion of clustered and compound
graph layout.
Several algorithms address the problem of offline dynamic graph drawing, where the
entire sequence is known in advance. In [47], a meta-graph built using information from
the entire graph sequence is used in order to maintain the mental map. In [128] a
stratified, abstracted version of the graph is used to expose its underlying structure. An
offline force directed algorithm is used in [56] in order to create 2D and 3D animations
of evolving graphs. Creating smooth animation between changing sequences of graphs is
addressed in [19].
An online graph drawing algorithm is discussed in [132], where a cost function that
takes both aesthetic and stability considerations into account is defined and used.
Unfortunately, computing this function is very expensive (45 seconds for a 63 node graph).
Drawing constrained graphs has also been addressed. Incremental drawing of DAGs is
discussed in [155]. Dynamic drawing of orthogonal and hierarchical graphs is discussed
in [86].
The DA-TU system described in [105] allows navigating and interactively clustering
huge graphs. In [138] an algorithm that tries to improve the distribution of nodes in a
graph while maintaining the mental map is described. Finally, some commercial graph
layout packages such as [200,209] contain provisions for dynamic layout of graphs. As far
as we know, none of the above was designed to handle incremental drawing of clustered
graphs. Here, we wish to support adding and removing nodes, clusters and edges and
moving nodes between clusters.
In this chapter we propose a new algorithm for online incremental layout of clustered
graphs. The algorithm does not impose restrictions on the structure of the graph. It
allows drawing of edges not only between vertices but also between clusters, which is
used to convey information to the user. The algorithm provides a means of separating
the set of vertices in each cluster into a subset of vertices that stay in the same cluster
and a subset of vertices that might move to a different cluster. The layout of the vertices
inside the cluster is influenced by this separation.
The major design consideration of our algorithm is preserving the mental map while
the graph is being updated. We show that force directed layout techniques [18, 113,
199] can be used as a basic building block. However, they cannot be used as is, as
demonstrated in Figure 6.2(a), where clusters and vertices move considerably between
successive drawings. We propose a few enhancements to existing algorithms in order to
preserve the mental map, as shown in Figure 6.2(b), where only small variations in cluster
location and size are exhibited. Also note the stability of the vertices inside the clusters
as opposed to the non-incremental layout.
A key consideration in designing algorithms is the desirable properties of the results.
This chapter proposes several criteria for evaluating the quality of dynamic clustered
graphs. They include space compactness, minimization of the changes between frames
and run-time efficiency. We demonstrate that our algorithm performs well according to
these properties. Moreover, we show that this is the case when considering a software
visualization application.
6.2 Problem Statement
This section defines clustered graphs and possible graph updates. It also discusses criteria
by which the quality and stability of the layout is evaluated.
Definition 6.2.1. Partition: A k-way partition of a set $C$ is a family of subsets
$(C_1, C_2, \ldots, C_k)$ such that $\bigcup_{i=1}^{k} C_i = C$ and $C_i \cap C_j = \emptyset$ for $i \neq j$.
Definition 6.2.2. Clustered Graph: A clustered graph is an ordered quadruple
$G = (V, C, E_v, E_c)$, where $V$ is the vertex set, $C$ is a set of clusters which form a partition of
the vertex set $V$, $E_v$ is the set of vertex-vertex edges, $E_v \subseteq \{(v_i, v_j) \mid i \neq j,\ v_i, v_j \in V\}$,
and $E_c$ is the set of cluster-cluster edges, $E_c \subseteq \{(C_i, C_j) \mid i \neq j,\ C_i, C_j \in C\}$.
Given a series of clustered graphs G1, G2, . . . , Gn, the goal of the algorithm is to
produce a sequence of layouts L1, L2, . . . , Ln, where Li is a drawing of Gi, such that the
elements of the sets $V_i$, $C_i$, $E_{v_i}$, $E_{c_i}$ are assigned coordinates. Since the sequence of graphs Gi is not
known in advance, the algorithm is an online algorithm. The updates Ui that can be
performed between successive elements Gi−1 and Gi are: Adding or removing vertices,
edges or clusters, and modifying the partition of vertices into clusters (i.e. moving vertices
between clusters).
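Definitions 6.2.1 and 6.2.2 can be illustrated with a small data structure. The following is only a sketch: the class name, the field names and the use of Python sets are assumptions made for illustration and are not taken from the implementation described in this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ClusteredGraph:
    """Clustered graph G = (V, C, Ev, Ec) in the sense of Definition 6.2.2."""
    vertices: set
    clusters: dict                                    # cluster name -> set of member vertices
    vertex_edges: set = field(default_factory=set)    # pairs of vertices
    cluster_edges: set = field(default_factory=set)   # pairs of cluster names

    def validate(self):
        """Raise ValueError if the clusters do not partition V or an edge is invalid."""
        seen = set()
        for members in self.clusters.values():
            overlap = seen & members
            if overlap:
                raise ValueError(f"vertices in more than one cluster: {sorted(overlap)}")
            seen |= members
        if seen != self.vertices:
            raise ValueError("clusters do not cover exactly the vertex set")
        for u, v in self.vertex_edges:
            if u == v or u not in self.vertices or v not in self.vertices:
                raise ValueError(f"bad vertex-vertex edge: {(u, v)}")
        for a, b in self.cluster_edges:
            if a == b or a not in self.clusters or b not in self.clusters:
                raise ValueError(f"bad cluster-cluster edge: {(a, b)}")
        return True
```

Under this representation, moving a vertex between clusters amounts to removing it from one member set and adding it to another, after which `validate()` should still succeed.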
A key issue in incremental graph drawing is the stability of the layouts [145,155]. This
is important since a user looking at a graph drawing gradually becomes familiar with the
structure of the graph. We propose the following criteria for evaluating the quality of the
layout [23,145,155,199]:
1. The movement of clusters between successive drawings should be small. Specifically,
clusters that are not modified should remain in their previous position if possible.
The location of clusters plays an important role in the user’s mental map of the
graph.
2. The change in cluster size between successive drawings should be minimal when the
number of vertices in the cluster is similar. Unnecessarily large deviations in size
cause the user to be distracted.
3. Movement of vertices inside a cluster should be minimized. This improves layout
stability.
4. The size of each cluster Ci should be proportional to the number of vertices it
contains. This allows the user to quickly understand how the mobile objects are
distributed between cores.
5. In order to conserve screen space, the drawing of each cluster Ci should be compact.
6. In order to reduce graph cluttering, overlapping between vertices should be avoided
and overlapping between cluster boundaries should be minimal.
Our application to software visualization adds an additional requirement. The vertices in
each cluster are divided into two subsets: static objects, which remain in the same cluster
throughout the animation, and movable objects. This division should be visually apparent.
Note that there are classical aesthetic criteria such as the number of edge crossings,
the total edge length, etc. which we ignore here. However, the underlying static algorithm
used addresses these criteria.
6.3 The Algorithm
Given a sequence of clustered graphs G1, G2, . . . , Gn, our goal is to compute a sequence of
graph layouts L1, L2, . . . , Ln, so as to adhere as much as possible to the criteria discussed
in Section 6.2. A possible approach is to develop an incremental algorithm for drawing
clustered graphs from the ground up. A different approach, which we have pursued, is to
use an existing non-incremental graph layout algorithm as a basic block, and build the
incremental layout capability on top.
Among the different classes of graph drawing algorithms, the force directed algorithm
class seems to be the natural choice in our case [18, 40, 48, 113, 199]. Roughly speaking,
this approach simulates a system of forces defined on the input graph and outputs a local
minimum energy configuration. An edge is simulated by a spring connecting its endpoint
vertices. Edge length influences the optimal spring length and edge weight determines its
stiffness. The algorithm converges towards a minimum energy position, starting from an
initial placement of the vertices. In our case, the previous layout, Li−1, can be used as a
starting position for the new layout, Li. Extending a force directed algorithm to perform
a layout of clustered graphs is discussed in Section 6.3.2.
Our algorithm’s requirements from the underlying force-directed static layout algorithm
are that there exist ways to assign initial coordinates to vertices, to restrict their
movement, to set edge lengths and to add support for drawing clusters. Since few
assumptions are made regarding the underlying layout algorithm, a wide variety of existing
layout tools can be used. As such, our algorithm can add incremental layout capabilities
to most existing packages.
In our implementation we use the GraphViz graph drawing package [53] and its force
directed layout component, Neato [78, 113]. Neato avoids overlaps between vertices and
allows setting preferred edge lengths and weights. It also allows pinning down vertices.
Pinned vertices are not moved while the algorithm converges by moving vertices according
to the forces acting on them. However, Neato neither supports clustered graphs nor does it
support controlling the repulsive forces between vertices. These deficiencies are addressed
by our algorithm, as will be described next.
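The capabilities required of the static layout algorithm (initial coordinates, pinned vertices, preferred edge lengths and edge weights) can be illustrated with a toy spring relaxation. This is a minimal sketch under stated assumptions: it is not the method Neato uses, it models only springs along edges (no repulsive forces and no overlap removal), and the function name, parameters and step size are hypothetical.

```python
import math


def spring_layout(pos, edges, pinned=frozenset(), steps=200, rate=0.05):
    """Toy spring relaxation (illustration only).

    pos    : dict vertex -> (x, y) initial coordinates, e.g. from the previous layout
    edges  : iterable of (u, v, length, weight)
    pinned : vertices that must keep their initial coordinates
    """
    pos = {v: tuple(p) for v, p in pos.items()}
    for _ in range(steps):
        force = {v: [0.0, 0.0] for v in pos}
        for u, v, length, weight in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Positive when the spring is stretched: pulls u and v together.
            pull = weight * (dist - length) / dist
            force[u][0] += pull * dx
            force[u][1] += pull * dy
            force[v][0] -= pull * dx
            force[v][1] -= pull * dy
        for v in pos:
            if v not in pinned:  # pinned vertices never move
                pos[v] = (pos[v][0] + rate * force[v][0],
                          pos[v][1] + rate * force[v][1])
    return pos
```

A higher weight makes the corresponding spring stiffer, so the final edge length stays closer to its preferred length; a pinned vertex acts as a fixed anchor for its neighbors.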
We adopt the proposition made in [155] that vertex stability is more crucial than
edge stability. Specifically, we prefer changing edge lengths rather than moving vertices.
Moreover, in our case, cluster stability is more significant than vertex stability. Thus,
our algorithm utilizes the following key ideas.
First, dummy vertices and edges are used in order to create a clustered structure.
Since clusters are treated as vertices, their motion can be controlled. Second, invisible
place-holder vertices are used in order to minimize the movement of clusters and of
vertices within clusters. This is done while maintaining compactness and keeping the
size of the clusters proportional to the number of vertices they contain. Third, edge
length and weight are used as a means of controlling the changes made to the layout.
Fourth, to both achieve dynamic stability and distinguish between static and movable
vertices, the set of vertices is partitioned into two subsets: static and movable. The
subsets are laid out in a structure that approximates two concentric circles around the
center of the cluster. Static objects are placed in the inner circle and movable objects in
the outer one.
These ideas are elaborated in this section. After outlining the algorithm, various
phases and aspects of the algorithm are discussed in detail, including cluster support,
minimization of visual changes, and animations of graph updates.
6.3.1 Overview
To compute layout Li, only the last layout, Li−1, and the new graph that needs to be
laid out, Gi, are used. This is a fast and simple approach that fits well with the view
that incremental layout performs some local changes in the graph. In other words, the
previous layout is considered as a good starting point for the new layout, with some
adjustments made according to the changes that occurred.
The first step in computing the new layout, described in Section 6.3.4, is a merge
stage, which merges layout $L_{i-1}$ and graph $G_i$. In the second stage, an actual layout,
$L_i^1$, is computed using a static force directed layout algorithm with the modifications
described in Sections 6.3.2–6.3.3. In the third stage, the quality of this layout is checked,
as described in Section 6.3.5. If the layout is deemed satisfactory, it is accepted and
$L_i = L_i^1$. Otherwise, a second layout attempt is performed, producing layout $L_i^2$. During
this attempt, more freedom is given to the layout algorithm in terms of moving vertices,
at the expense of weakening the connection between the old and the new layouts. The
better of $L_i^1$ and $L_i^2$ is selected as the final drawing $L_i$. The final stage of the algorithm,
described in Section 6.3.6, animates the change between the drawings Li−1 and Li in a
smooth manner. The algorithm is summarized in Figure 6.3.
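The control flow of the overview can also be written as a short function. The sketch below is an assumption-laden illustration: every stage (`merge`, `layout`, `is_good_enough`, `relax`, `score`) is a hypothetical callable supplied by the caller, a lower score is taken to mean a better layout, and the animation stage is omitted.

```python
def incremental_drawing(prev_layout, new_graph, merge, layout, is_good_enough, relax, score):
    """Two-attempt control flow of Section 6.3.1; all stages are caller-supplied."""
    merged = merge(prev_layout, new_graph)   # partially laid-out graph
    first = layout(merged)                   # constrained first attempt
    if is_good_enough(first):
        return first
    second = layout(relax(first))            # relaxed attempt, seeded with the first
    # Lower score is better (for example, the density metric of Section 6.3.5).
    return first if score(first) <= score(second) else second
```

Passing the stages as parameters keeps the sketch independent of any particular layout package, in line with the requirement that few assumptions are made about the underlying static algorithm.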
6.3.2 Supporting Clusters
Adding an invisible dummy attractor vertex to each cluster, to which all of the vertices in
the cluster are connected with invisible edges, is proposed in [25], where repulsive forces
are also used, in order to increase cluster separation. One of the approaches discussed is a
divide and conquer algorithm, in which the clusters are first laid out separately and then
the different layouts are composed together. A hybrid approach that solves the problem
of neglecting inter-cluster edges, caused by this algorithm, is discussed in [206].
    procedure incremental_drawing ( L_{i-1}, G_i )
        G_i^m = merge_graphs ( L_{i-1}, G_i )
        L_i^1 = layout_graph ( G_i^m )
        if ( L_i^1 is good enough )
            L_i = L_i^1
        else
            L_i^2 = layout_graph ( modify_graph ( L_i^1 ) )
            L_i = better ( L_i^2, L_i^1 )
        animate_change ( L_{i-1}, L_i )
Figure 6.3: Algorithm overview in pseudo-code
We follow the approach of adding a dummy vertex to each cluster. However, separation
between the clusters, as well as meeting the other requirements described in Section 6.2,
is achieved differently. It is accomplished through proper settings of edge lengths and
weights, as described below.
Five kinds of edge lengths are utilized and indicate the expected level of proximity
between their adjacent vertices. The shortest length is assigned to the invisible edges
connecting static vertices to the dummy vertex of the cluster they belong to. The edges
connecting movable vertices and the dummy vertex are assigned longer lengths. This
creates a layout that resembles two concentric circles. The next type of edges is the
edges between vertices. If both vertices at the endpoints of the edge are contained in the
same cluster, a shorter length is set than if the vertices are in different clusters. This
increases the separation between clusters. The last kind of edges are cluster-cluster edges.
The length of these edges is variable and depends on the requested proximity between
the different clusters, which is determined by the application, e.g., by the amount of
interaction between clusters.
Edge weights are also used in our algorithm. Higher edge weights instruct the un-
derlying force-directed algorithm to try harder to generate edges with lengths close to
the optimal lengths supplied to the algorithm (as discussed above). Inter-cluster edges
are assigned lower weights than intra-cluster edges. This is done in an attempt to give
inter-cluster edges less influence on the layout. This is important when vertices move
between clusters. In such cases, it is preferable to stretch or shorten the length of the
edges somewhat, rather than displace vertices.
In our implementation, the lengths assigned to the edges connecting a static vertex to
a dummy vertex, a movable vertex to a dummy vertex, two regular vertices in the same
cluster and two regular vertices located in different clusters, are 1, 2, 1.5 and 4 units of
length, respectively. The lengths assigned to cluster-cluster edges vary between 5 and
6 units, where the dummy vertices are used as endpoints for cluster-cluster edges. The
weight of inter-cluster edges is set to 1 unit and the weight of intra-cluster edges is set to
2.5 units. These values represent a compromise between stable layouts and aesthetic ones.
Giving the user control over these parameters would tailor the visualization to the user’s
preferences.
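The length assignment described in this section can be sketched as follows. The four length constants are the values quoted above; the function name, the `("dummy", cluster)` naming of the invisible attractor vertices and the data layout are assumptions made for illustration. Cluster-cluster edges, whose lengths depend on the application, and edge weights are left out.

```python
# Preferred lengths quoted in Section 6.3.2 (units of length).
STATIC_TO_DUMMY = 1.0
MOVABLE_TO_DUMMY = 2.0
SAME_CLUSTER = 1.5
DIFFERENT_CLUSTERS = 4.0


def augmented_edges(cluster_of, movable, vertex_edges):
    """Return (u, v, preferred_length) triples for the augmented graph.

    cluster_of   : dict vertex -> cluster name
    movable      : set of vertices that may migrate; all others are static
    vertex_edges : iterable of (u, v) visible edges
    """
    edges = []
    for v, c in sorted(cluster_of.items()):
        dummy = ("dummy", c)  # hypothetical name for the invisible attractor vertex
        length = MOVABLE_TO_DUMMY if v in movable else STATIC_TO_DUMMY
        edges.append((dummy, v, length))
    for u, v in vertex_edges:
        same = cluster_of[u] == cluster_of[v]
        edges.append((u, v, SAME_CLUSTER if same else DIFFERENT_CLUSTERS))
    return edges
```

Because static vertices get the shortest edges to the dummy vertex and movable vertices get longer ones, a spring-based layout tends to place them at two different distances from the cluster center, which produces the two approximately concentric circles.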
6.3.3 Minimizing Visual Changes
Invisible vertices, called spacer vertices, are added to each cluster, in an attempt to reduce
the change in clusters’ outlines and minimize the movement of clusters between successive
layouts.
The spacer vertices are used as place-holders for regular vertices in a cluster. They are
connected with invisible edges to the dummy vertex of the cluster to which they belong,
like any other vertex in the cluster. When a vertex is removed from a cluster, a spacer
vertex is added to the cluster instead of it. The initial location of the spacer vertex is set
to be the location of the vertex that left the cluster. This is done in order to keep the
size of the cluster constant and in order to reserve space for a new vertex that might be
added to the cluster in the future. When a vertex moves (or is added) to a cluster, the
spacer vertex that is closest to its previous location is replaced by this new vertex.
However, when adding or removing spacers, the algorithm keeps the number of spacers
in a cluster between an upper and a lower fraction of the number of vertices in the cluster.
This is done in order to give the algorithm breathing room when modifying clusters.
Moreover, the limits are set so as to avoid a case in which a cluster with a very small
number of regular, visible vertices occupies a large area due to the many spacer vertices
it contains.
When calculating the outline of each cluster, which is often simply the bounding
box, the spacer vertices are taken into account as if they were regular visible vertices.
Obviously, this minimization of the movements comes at the expense of extra screen
space, which is occupied by the spacers.
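The spacer bookkeeping can be sketched as three small operations. This is an illustration under assumptions: a cluster is represented by a hypothetical dictionary with a `"vertices"` map and a `"spacers"` list, the fraction `high` is a placeholder because the text does not give the actual limits, an arriving vertex receives no initial position when no spacer is available, and only the upper limit on the number of spacers is shown.

```python
import math


def remove_vertex(cluster, v):
    """Replace a departing vertex by a spacer at the same coordinates."""
    cluster["spacers"].append(cluster["vertices"].pop(v))


def add_vertex(cluster, v, previous_location):
    """Put v on the spacer closest to its previous location, if any spacer exists."""
    if cluster["spacers"]:
        spot = min(cluster["spacers"], key=lambda p: math.dist(p, previous_location))
        cluster["spacers"].remove(spot)
    else:
        spot = None  # assumption: left for the layout algorithm to place
    cluster["vertices"][v] = spot


def trim_spacers(cluster, high=0.5):
    """Drop spacers beyond `high` times the number of regular vertices."""
    limit = math.floor(high * len(cluster["vertices"]))
    del cluster["spacers"][limit:]
```

Which spacers are dropped when the limit is exceeded is not specified in the text; the sketch simply discards the most recently added ones.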
6.3.4 Merging Graphs
The first step in performing the incremental layout is merging the new graph to be drawn,
$G_i$, and the previous graph drawing, $L_{i-1}$. The result of the merge stage is a partially
laid out graph, $G_i^m$, in which some of the vertices are assigned initial coordinates. After
merging, the graph $G_i^m$ is laid out by the static layout algorithm. The quality of the
resulting incremental layout depends on the initial conditions computed by the merging
algorithm.
Merging is performed in several steps. Unchanged and dummy vertices are assigned
initial coordinates from Li−1. Then, clusters to which vertices were both added and
removed are handled. The added and removed vertices of a cluster are paired up, and
the initial coordinates of an added vertex are set to the coordinates of a removed vertex.
Then, vertices that were added to a cluster or removed from it, but cannot be paired
up, are handled, as discussed in Section 6.3.3. Next, the vertices in new clusters, that
is, clusters that exist in Gi but not in Li−1, are inserted into the graph without initial
coordinates, along with new spacer vertices. The number of the latter is set to a constant
fraction of the number of vertices in the cluster.
The last stage of merging involves vertex pinning, which restricts vertex movement,
allowing it to move only as an indirect result of the movement of an unpinned vertex. We
have experimented with several strategies for computing the set of vertices to be pinned.
Our conclusion is that pinning all vertices that were assigned coordinates achieves good
results in terms of the dynamic stability of the layout. We have also observed that in
most cases the resulting layouts are aesthetically pleasing.
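For a single cluster that exists in both the previous layout and the new graph, the merge steps can be sketched as below. The function name and data layout are hypothetical, and the order in which added and removed vertices are paired is an arbitrary choice here, since the text does not specify one. Unpaired vertices are returned so that they can be handled with spacers as in Section 6.3.3.

```python
def merge_cluster(prev_pos, old_members, new_members):
    """Initial coordinates and pin set for one cluster.

    prev_pos    : dict vertex -> (x, y) from the previous layout
    old_members : set of vertices of the cluster in the previous layout
    new_members : set of vertices of the cluster in the new graph
    Returns (init, pinned, unpaired_added, unpaired_removed).
    """
    # Unchanged vertices keep their previous coordinates.
    init = {v: prev_pos[v] for v in old_members & new_members}
    added = sorted(new_members - old_members)
    removed = sorted(old_members - new_members)
    # Each paired added vertex starts where a removed vertex used to be.
    for a, r in zip(added, removed):
        init[a] = prev_pos[r]
    k = min(len(added), len(removed))
    pinned = set(init)  # pin every vertex that was assigned coordinates
    return init, pinned, added[k:], removed[k:]
```

Pinning everything in `init` corresponds to the strategy reported above as giving good dynamic stability.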
6.3.5 Improving the Layout
After computing the graph layout $L_i^1$, a cluster density metric determines whether the
layout is of satisfactory quality. For a cluster $C_i$, we define
$$\mathrm{density\_metric}(C_i) = \frac{\mathrm{area}(\mathrm{bounding\_box}(C_i))}{\mathrm{number\_of\_vertices}(C_i)}.$$
That is, the density metric of a cluster is the ratio between the area of its bounding
box and the number of vertices it contains. Higher values imply that the vertices in the
cluster are spaced further apart, which is not desirable. For the entire graph G we define
$$\mathrm{density\_metric}(G) = \max_{C_i \in G} \mathrm{density\_metric}(C_i).$$
Experience has shown that a correlation exists between high density metric values and
overlaps between clusters.
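The two definitions above translate directly into code. In this sketch vertices are treated as points, so the extent of the drawn vertices is ignored, and whether spacer vertices are included in the point list is left to the caller; both are simplifying assumptions, as are the function names.

```python
def cluster_density(points):
    """Bounding-box area divided by the number of vertices in the cluster."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return area / len(points)


def graph_density(clusters):
    """Maximum cluster density; `clusters` maps a cluster name to a list of (x, y)."""
    return max(cluster_density(pts) for pts in clusters.values())
```

For example, four vertices at the corners of a 2 by 2 square give a cluster density of 1.0, and the graph density is determined by the sparsest cluster.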
A second layout, $L_i^2$, is computed if the value of the graph density metric exceeds
a threshold. To improve the layout, the restrictions on vertex movement are relaxed.
The layout algorithm is re-run with the positions of the vertices in $L_i^1$ as the initial
condition. This time the vertices are not pinned down. This gives the layout algorithm
more freedom and allows it to converge to a better result. The new layout $L_i^2$ still
resembles $L_i^1$ because of the supplied initial condition. The final layout is selected as the
layout with the lower density metric between $L_i^1$ and $L_i^2$. Clearly, the choice between
$L_i^1$ and $L_i^2$ demonstrates the tradeoff between preserving the mental map and creating
an aesthetically pleasing layout. It should be noted that initial attempts to use more
relaxed constraints when computing $L_i^2$, such as removing some of the assigned vertex
coordinates, were counterproductive.
6.3.6 Display and Animation
We have investigated display in three dimensions, as illustrated in Figure 6.4, in order to
distinguish between vertex types and edge types. Vertex-vertex edges are drawn on the
lower plane, while cluster-cluster edges are drawn on the upper plane. In 3D, a cluster
is drawn as a semi-transparent pyramid with the cluster’s dummy vertex, which is the
endpoint of cluster-cluster edges, drawn at the apex of the pyramid. One of our guidelines
in creating this visualization is being able to collapse the 3D view into a 2D view in a
natural and comprehensible way, as illustrated in Figure 6.5, which shows a 2D drawing
of the graph from Figure 6.4. Color is also employed in order to help the user comprehend
the image – each cluster has a different color.
The transition between Li−1 and Li is performed using a sequence of intermediate
drawings generated by a linear interpolation of the coordinates of vertices, edges and
cluster boundaries.
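The linear interpolation used for the transition can be sketched for vertex positions as follows. The function name and the number of frames are hypothetical, and only vertices present in both layouts are interpolated; how added or removed vertices are shown during the animation is not covered by this sketch.

```python
def interpolate_layouts(start, end, frames):
    """Yield `frames` drawings that move linearly from `start` to `end`.

    start, end : dict vertex -> (x, y); only vertices present in both are moved.
    """
    shared = start.keys() & end.keys()
    for k in range(1, frames + 1):
        t = k / frames  # t runs from 1/frames up to 1
        yield {v: ((1 - t) * start[v][0] + t * end[v][0],
                   (1 - t) * start[v][1] + t * end[v][1]) for v in shared}
```

The same interpolation can be applied to the corner coordinates of cluster boundaries; edges follow automatically when they are drawn between the interpolated endpoints.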
Figure 6.4: 3D view of a clustered graph
Figure 6.5: 2D view of a clustered graph
6.4 Visualizing Mobile Object Software
Our layout algorithm has been used in the visualization of mobile object applications [38,
102, 130]. This framework extends the distributed objects concept, where objects can
migrate to remote hosts, along with their state and behavior, during the execution of the
application. The visualization should expose the connections, interactions and movements
of the objects that are distributed throughout a computer network. We discuss this
application in detail in Chapter 7. Here, we briefly review the visualization.
In our visualization, every object is depicted by a vertex. Connections between objects
are drawn as vertex–vertex edges. Each machine is represented by a cluster that contains
all of the objects currently residing on that machine. The set of cluster–cluster edges is
used to display physical connections between machines, as opposed to logical relations
that exist between objects.
Our algorithm is demonstrated in Figures 6.6 and 6.7 as well as in Figures 6.1 and 6.2.
A movie showing results is available at http://www.ee.technion.ac.il/~ayellet/Movies/FrishmanTal-1.mov.
The algorithm was tested on several graph sequences. Some of them
represent executions of real mobile object applications and others represent simulated
data.
To measure the quality of the resulting layouts, we identify several criteria. The first
is the density metric discussed in Section 6.3.5, which is used to measure the compactness
of the layout. The second is the sum of displacement of clusters between each pair of
successive layouts, which is used to measure the stability of the layout. The third is
the percentage of clusters with the same size between successive layouts, which helps to
demonstrate the effectiveness of using spacer vertices in minimizing visual changes to the
graph.
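The second and third criteria can be computed per pair of successive layouts as sketched below. The sketch assumes that a cluster's position is summarized by a single point (for instance the center of its bounding box) and its size by a (width, height) pair; the text does not state how position and size are measured, so these choices and the function names are assumptions.

```python
import math


def cluster_displacement(prev_centers, new_centers):
    """Sum of the distances moved by clusters present in both layouts."""
    shared = prev_centers.keys() & new_centers.keys()
    return sum(math.dist(prev_centers[c], new_centers[c]) for c in shared)


def same_size_fraction(prev_sizes, new_sizes, tol=1e-9):
    """Fraction of shared clusters whose (width, height) is unchanged within `tol`."""
    shared = prev_sizes.keys() & new_sizes.keys()
    if not shared:
        return 0.0
    same = sum(1 for c in shared
               if all(abs(a - b) <= tol for a, b in zip(prev_sizes[c], new_sizes[c])))
    return same / len(shared)
```

Averaging these values over all pairs of successive layouts in a sequence gives per-sequence summaries comparable in spirit to those reported below.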
The performance of our algorithm is compared to two other algorithms. The first
is a non-incremental algorithm which computes each layout from scratch using force-
directed methods. The second is a variant of our incremental algorithm in which vertices
are assigned initial coordinates computed in the merge stage, but vertex pinning and
spacer vertices are not used. We use this second algorithm in order to show that simply
reusing the initial coordinates from the previous graph does not yield satisfactory results.
Figure 6.6 shows a comparison of the layouts computed by the three algorithms. Note
that only our algorithm manages to compute stable layouts.
Figures 6.8-6.11 present a quantitative comparison of our algorithm to the two other
algorithms. The density metric is plotted in Figure 6.8. Higher values in the graph
represent sparse clusters, which should be avoided. All three algorithms produce simi-
lar results, which means that the incremental algorithm manages to compute compact
layouts of the graph. Figure 6.9 shows the sum of the displacements of clusters between
each pair of successive layouts. Lower values imply higher stability in the location of
clusters. As can be seen, our algorithm outperforms the other algorithms. Figure 6.10
depicts the number of clusters that maintain their size between each pair of successive
layouts. Higher values imply that there are fewer modifications to cluster outlines. It is
clear from the graphs that our algorithm produces much better results than the other
algorithms. Finally, Figure 6.11 depicts the running times of the algorithms. Both incremental
algorithms take more time to compute than the non-incremental algorithm. This
is mostly due to the extra processing done in the merge stage.
(a) Non-incremental layout
(b) Incremental layout without using pinning and spacers
(c) Our incremental layout
Figure 6.6: Comparing the three layout algorithms
Table 6.1 summarizes the average values of each of the above metrics. All algorithms
produce similar cluster densities. The cluster displacement of our algorithm is by far
superior to that of the non-incremental algorithm, averaging about one twelfth of the
latter's value. Reducing the movement of clusters has indeed been one of the main design
goals of the algorithm. The average percentage of clusters that remain the same size
under our algorithm is about four times that of the non-incremental algorithm. This is
facilitated by the spacer vertices that are used to minimize visual changes to the graph.
Finally, the running times of both incremental algorithms are about twice the running time
of the non-incremental algorithm, which is reasonable.
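The ratios quoted in this paragraph can be checked against the averages in Table 6.1 with a few lines of arithmetic. The numbers below are copied from the table; the variable names are only illustrative.

```python
# Averages copied from Table 6.1.
displacement = {"non_incremental": 4.0193, "with_pinning": 0.3311}
same_size = {"non_incremental": 0.1575, "with_pinning": 0.615}
run_time_ms = {"non_incremental": 492, "no_pinning": 1076, "with_pinning": 1084}

displacement_ratio = displacement["non_incremental"] / displacement["with_pinning"]
same_size_ratio = same_size["with_pinning"] / same_size["non_incremental"]
slowdown = {k: run_time_ms[k] / run_time_ms["non_incremental"]
            for k in ("no_pinning", "with_pinning")}

print(round(displacement_ratio, 1))                   # 12.1
print(round(same_size_ratio, 1))                      # 3.9
print({k: round(v, 2) for k, v in slowdown.items()})  # {'no_pinning': 2.19, 'with_pinning': 2.2}
```

These values agree with the statements "about one twelfth", "about four times" and "about twice".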
6.5 Conclusion and Future Work
We have presented an online algorithm for incremental layout of clustered graphs. The
algorithm uses a force directed static layout tool as a basic building block. The key idea of
the algorithm is to establish priorities of avoiding changes. First and foremost, movement
of clusters should be avoided, because clusters give insight into the basic structure of the
graph. Then, movement of vertices should be avoided, since vertices convey information
regarding the size of the clusters and aid in navigating the graph. Movement of edges is
considered the least critical.
To achieve this, our algorithm incorporates a few novel concepts. First, crucial vertices
(dummy and old) are pinned down. Second, invisible place-holders are used to minimize
changes. Finally, lengths and weights of edges are used to control both vertex placement
and graph modifications.
It has been demonstrated that the algorithm computes a compact and space efficient
graph layout, while minimizing the displacement and changes to clusters between layout
iterations.
The algorithm has been applied to the visualization of mobile object environments,
where both real and simulated data have been tested. Good results have been achieved
at the expense of higher running times. This is due both to the added complexity of
the algorithm and to the fact that our implementation is only loosely coupled to the
underlying static layout tool.
In future research, we plan to investigate enhancements to our 3D display mode.
We would also like to extend the spacer vertices concept to drawing the cluster bound-
aries. Allowing some flexibility in fitting the boundary around the vertices in the cluster
might improve the layout. An additional layout stage where each cluster is modeled as a
non-uniform node could help improve cluster separation [35]. Finally, using stronger con-
straints when a second layout is necessary might further improve the dynamic stability
of the algorithm.
Average / Algorithm                        non-incremental   no vertex pinning   with vertex pinning
density metric [area/vertices]             1.2516 × 10^4     1.1994 × 10^4       1.1936 × 10^4
cluster displacement [distance]            4.0193            1.4118              0.3311
fraction of clusters with the same size    0.1575            0.23                0.615
running time [ms.]                         492               1076                1084
Table 6.1: Average results of an animation sequence
Figure 6.7: Sample animation sequence (from left to right and top to bottom)
[Plots. Panels: (a) Non-incremental, (b) Without vertex pinning, (c) With vertex pinning. Horizontal axis: layout number (0 to 20). Vertical axis: density metric [area/nodes] (0.5 × 10^4 to 2.5 × 10^4). Legend in panel (c): final, layout1, layout2.]
Figure 6.8: Density metric
[Plots. Panels: (a) Non-incremental, (b) Without vertex pinning, (c) With vertex pinning. Horizontal axis: layout number (0 to 20). Vertical axis: displacement (0 to 12).]
Figure 6.9: Sum of cluster displacements
[Plots. Panels: (a) Non-incremental, (b) Without vertex pinning, (c) With vertex pinning. Horizontal axis: layout number (0 to 20). Vertical axis: number of clusters (0 to 4).]
Figure 6.10: Number of clusters with the same size
[Plots. Panels: (a) Non-incremental, (b) Without vertex pinning, (c) With vertex pinning. Horizontal axis: layout number (0 to 20). Vertical axis: time [ms.] (0 to 2500). Legend in panel (c): merge, layout1, layout2.]
Figure 6.11: Running times
Chapter 7
MOVIS: A System for Visualizing Distributed Mobile Object Environments
This chapter presents MOVIS – a system for visualizing mobile object frameworks. In
such frameworks, the objects can migrate to remote hosts, along with their state and
behavior, while the application is running. The graph–based visualization algorithm,
described in Chapter 6, is used to depict the physical and the logical connections in the
distributed object network. Scalability is achieved by using a focus+context technique
jointly with a user-steered clustering algorithm. In addition, an event synchronization
model for mobile objects is presented. The system has been applied to visualizing several
mobile object applications. This chapter is based on [13,64,67].
The rest of this chapter is structured as follows: Section 7.1 gives an introduction.
Section 7.2 discusses related work. In Section 7.3, the requirements of a mobile object vi-
sualization system are discussed. Our visualization is presented in Section 7.4. Section 7.5
addresses consistency. Section 7.6 discusses scalability. Section 7.7 discusses implemen-
tation issues. Results are presented in Section 7.8. Finally, Section 7.9 concludes and
discusses future directions.
7.1 Introduction
In recent years, distributed objects have become prominent in the design of distributed
applications [158]. Mobile objects are a natural evolution of the distributed objects
concept [4, 102, 103, 144, 159]. The mobile object paradigm allows programs to migrate
to remote hosts while they are running. It offers scalability, availability and flexibility
advantages compared to other methods of creating distributed applications. However,
such systems are more difficult to design and debug, two tasks in which visualization
can greatly assist. This chapter addresses the challenging problem of visualizing mobile
objects.
Figure 7.1: MOVIS user interface. Small rectangles represent mobile objects. Color
stripes show their movement history. Big rectangles represent the cores the objects reside
in. Dashed lines represent physical communication between cores. Higher communication
frequency is indicated by a higher frequency of alternation in the lines. Solid lines
represent logical connections between objects. The square in the middle of the figure
represents several cores which have been collapsed. The rectangle with a double boundary
was selected by the user as the current focus of attention core.
Mobile objects have two distinctive features. The first feature is code mobility: objects
can migrate to remote hosts, together with their state and behavior, while the application
is running. We refer to the processes hosting mobile objects as cores. The second feature
of mobile objects is location transparency, which allows the programmer to make calls to
objects regardless of their current location. Since the location of objects may change over
time, provisions must be supplied in order to track referenced objects. Unlike regular
distributed objects, in which the location of a remote object is fixed, when making a call
using a reference to a mobile object, the parameters may pass through several intermedi-
ary cores until reaching the called object. The introduction of intermediary cores allows
for a more scalable, lazy update of the location of a referenced object [102].
Although research on mobile objects is widespread, visualization of such frameworks
has hardly been done. As far as we know, the only work in this field is [207,
211]. These systems fall short in several aspects, including the types of events generated
and visualized, visualization consistency, and scalability, all of which are addressed in this
chapter.
This chapter makes the following contributions: First, we present MOVIS (Mobile
Object Visualization), a system for visualization of distributed mobile object environ-
ments. Second, we discuss the requirements of a visualization system for mobile objects.
Third, a graph-based visualization that concurrently shows the physical connections in
the computer network as well as the logical relations between the mobile objects is pre-
sented (see Figure 7.1). Fourth, a context-sensitive focus+context fisheye type display
technique is suggested in order to provide hierarchical information display and support
scalability. Fifth, a clustering algorithm, which is affected by nodes of interest to the
user, is presented. Sixth, we present a model for event synchronization that is used to
guarantee visualization consistency. Finally, we propose a method in which events are
automatically generated, avoiding additional work by the programmer of the application.
7.2 Related Work
Several tools have been developed for visualizing parallel and distributed programs [124].
The PVanim system [201] is a toolkit for creating visualizations of the execution of PVM
programs. PARADE [187] is an environment for developing visualizations of parallel and
distributed programs. In [146], tracing of CORBA [158] remote procedure calls is used to
analyze runtime activities and look for anomalous behavior. Vade [151] is a distributed
algorithm animation system in which visualizations can be created and executed on a
web page on the client’s machine. Pablo [173] provides analysis and presentation of
performance data for massively parallel distributed memory systems. Jinsight [167] is a
system for the visual exploration of the run-time behavior of complex Java programs.
Although research on mobile objects is widespread, visualization of such frameworks
has hardly been done. In [207], a modification of the process-time diagram, adopted
from XPVM [120], is used as the means of visualization. The creation, destruction
and movement of mobile objects are visualized. Event synchronization is handled by
timestamps and ordering rules. This system has a few drawbacks. It requires manual
annotation of source code in order to generate events. The system does not visualize
the physical connections between machines nor does it display the logical connections
between objects. Finally, it is not scalable. In [211] a visualization tool used to debug
mobile objects is presented. This tool is concerned mainly with checking the mobility of
objects as a function of time and identifying movement hotspots. Visualizations offered
include an object location display and movement history for an object. The system does
not visualize communication between objects or between computers hosting the objects.
Visualization consistency and scalability to large numbers of objects are not addressed.
In this chapter we discuss a different approach to the visualization of mobile object
frameworks, attempting to solve these problems.
7.3 Requirements
This section discusses the requirements of a visualization system for mobile objects:
1. Physical and logical visualization: A mobile object application has two dis-
tinct, yet related facets. The first is the physical computer network with the interconnec-
tions between the cores. The second is the logical network of mobile objects that can be
used to show the connections and interactions between objects. The visualization should
display both of these facets. This is important in order to easily detect cases where
closely interacting system components are placed on distant nodes. Using the visualiza-
tion the system architect will detect this inefficiency and modify the logic and layout of
the application in order to place such objects close together.
2. Interesting events: In any visualization system, the events that need to be
visualized greatly affect the design of the system. In the case of mobile objects, the
following interesting events should be visualized:
• Object Movement: The movement of objects between cores while the application
is running is the main difference between mobile object frameworks and regular
distributed applications. Therefore, a clear and concise depiction of such activities
is of great importance.
• Construction/destruction: Being a dynamic, distributed application, both objects
and cores may be added or removed during the execution of the application.
• Communication: Being distributed in nature, the messages sent between the differ-
ent parts of the system play a paramount role during execution of the application
and therefore provisions to visualize them should be supplied.
Event generation should be transparent both to the programmer and to the user of
the application. Moreover, care must be taken in order to reduce the perturbation of the
application caused by generating the events.
Being able to visualize movement makes it easy to identify cases where objects migrate
too often, which is inefficient. Visualizing communication can help expose closely
interacting objects, which should be placed in close proximity.
3. Consistent depiction: In a distributed, asynchronous environment there is no
global clock that can be used to synchronize events. This may lead to inconsistent
visualizations in which, for example, a message is shown to be received before it is sent.
One of the challenges in visualizing distributed systems is creating an animation that
provides a consistent depiction of events. This is especially challenging for mobile objects,
since parts of the application change their physical location during execution.
4. Scalability: One of the main challenges in software visualization is building a scal-
able visualization. This is especially important when dealing with networks of computers,
which can potentially generate massive amounts of information. A visualization system
should be able to process large amounts of data. This should be done while avoiding
swamping the user with unnecessary information and without slowing the response of the
visualization system to a point where it is no longer useful.
The user should be able to steer the visualization system to display relevant and
interesting data out of the large amount of information collected. This control should
be interactive, allowing the user to feed back to the system new requests based on the
knowledge accumulated while viewing the unfolding visualization. This will allow the user
to easily study interesting parts of a large distributed system, for example ones which
require tuning.
5. Dynamic graph layout: A graph is a natural way to represent the structure of
a software system. In the context of mobile objects, a dynamic, clustered graph is used.
Since the graph is dynamic, it is important to produce stable layouts that help maintain
the user's mental map of the system. This is required in order to avoid distracting the
user with confusing changes to the way the graph looks each time it changes.
In the following sections we describe how MOVIS addresses these requirements.
7.4 Physical and Logical Visualization
As discussed in Section 7.3, two simultaneous networks are of interest: the physical
network of cores (machines) and the logical relations and interactions between mobile
objects. A graph is a natural choice for visualizing a distributed network. In our case,
we need to simultaneously visualize two graphs. This is done using a clustered graph, as
defined in Definition 6.2.2.
A clustered graph is a natural choice for displaying the simultaneous physical and
logical graphs, as demonstrated in Figure 7.1. Every mobile object is depicted by a node in
the graph. The logical connections between objects are shown using solid edges connecting
the nodes. In order to overlay the physical structure of the network, clusters are used.
Each core is represented by a cluster that contains all of the objects currently residing
in that core. Dashed cluster-cluster edges are used to represent physical connections
between cores (see Figure 7.1), as opposed to logical relations that exist between objects.
As discussed in Section 6.3.6 and demonstrated in Figures 6.4 and 6.5, the graph can
be displayed either in 3D or 2D. Color and transparency (in 3D) are used to help the
user comprehend the visualization.
We use several techniques and attributes in order to display information in this graph.
Each cluster boundary is drawn using a different color. This helps the user track the
different clusters while changes are performed to the graph during the visualization.
Each node is drawn using color strips, as shown in Figure 7.1. The strips are colored
according to the location history of the object. The bottom strip is the current location
(i.e., colored with the same color as the cluster the node currently resides in), the strip
above corresponds to the previous location, etc. This “growing stacks” metaphor is
similar to the growing squares in [54].
In order to create a more scalable and meaningful display, we employ lazy construc-
tion of edges. Instead of cluttering the graph with node-node edges showing all of the
references between objects, an edge is drawn between two objects once a method call
between the objects is detected.
In addition to the existence of communication between objects or cores, the frequency
of this communication is of interest to the user. Line patterns are used to convey this
information. The higher the frequency of alternation in the dashed lines, the higher the
frequency of communication. See for example Figure 7.1. The sum of two weighted
averages is used to calculate the amount of communication between cores. The first is
the average number of objects moving between the cores connected by the edge. The
second is the average number of remote invocations performed between the two cores.
The averages are calculated using a weighted sliding window, taking the last N samples
into account.
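The calculation described above can be sketched as follows. This is only an illustration, not the MOVIS code: the text does not specify the weighting scheme, so the linearly increasing weights (newer samples count more), the class name `SlidingAverage` and the function name `communication_amount` are assumptions.

```python
from collections import deque


class SlidingAverage:
    """Weighted average over the last n samples.

    Assumption: weights grow linearly from the oldest sample (1) to the
    newest (k); the thesis only states that the window is weighted.
    """

    def __init__(self, n):
        self.samples = deque(maxlen=n)  # the oldest sample is dropped automatically

    def add(self, value):
        self.samples.append(value)

    def value(self):
        if not self.samples:
            return 0.0
        weights = range(1, len(self.samples) + 1)
        total = sum(w * s for w, s in zip(weights, self.samples))
        return total / sum(weights)


def communication_amount(moves, invocations):
    """Sum of the two weighted averages kept for one core-core edge."""
    return moves.value() + invocations.value()
```

One `SlidingAverage` would be kept per edge for object moves and another for remote invocations; their sum could then drive the dash frequency of the edge.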
Some mobile object frameworks [102] allow tagging of specific objects as stable, i.e.
objects that remain at the same location throughout their lifetime. This distinction
between stable and movable objects is visualized by laying out the objects in each cluster
using two concentric circles. The inner circle contains the stable objects while the outer
one contains movable ones.
As discussed in Chapter 6, we have developed a special incremental graph layout
algorithm tailored for the requirements of mobile object visualization [63]. The algorithm
produces a dynamic display of clustered graphs, attempting to preserve the user's mental
map of the graph, as it is being changed [145, 155]. The algorithm uses a static force-
directed layout algorithm as a basic building block [53,113,199]. It uses invisible dummy
nodes to create the clustered structure and place-holder nodes to maintain layout stability.
Edge length and weight are used as a means of controlling the changes made to the layout.
Animation is used in order to show different events. When a new graph layout is
performed, for example after an object moves between cores, the positions of nodes,
edges and clusters are linearly interpolated between the old and the new locations. A
method call between two remote objects is animated using a lightning bolt icon that
moves from the caller to the called object.
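The linear interpolation between the old and new layouts can be illustrated with a short sketch. The function name `lerp_positions`, the 2D coordinate tuples and the handling of nodes that are absent from the old layout are assumptions made for this example only.

```python
def lerp_positions(old, new, t):
    """Return node positions for animation parameter t in [0, 1].

    old, new: dict mapping node -> (x, y) in the old and new layouts.
    """
    frame = {}
    for node, (x1, y1) in new.items():
        # Assumption: a node missing from the old layout starts at its new position.
        x0, y0 = old.get(node, (x1, y1))
        frame[node] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return frame
```

Calling the function with t stepping from 0 to 1 yields the intermediate frames of the animation.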
7.5 Visualization Consistency
One of the main challenges in visualizing distributed environments is the accurate depic-
tion of events. Since in asynchronous distributed systems there is no way of knowing the
real ordering of events, it is necessary to generate a visualization that is consistent with
the events [129].
We base our solution to event synchronization on [151], where consistency of dis-
tributed environments with static objects was addressed, and extend it to support mobile
object frameworks. In [151], the following is assumed:
1. There is a fixed (known) number of processes.
2. A process can perform two types of actions: sending a message to a different process
and an internal computation, possibly modifying the process’s local state. Receiving
a message is considered an internal action.
3. The communication network and processes are reliable.
4. Messages sent by a single process to another process arrive in the order they were
sent.
5. The network is asynchronous - there is no universal clock.
Since the visualization process is part of the distributed environment, it cannot know
the relative order of actions performed by different processes. A way to solve this difficulty
is to introduce semantic causality.
Definition 7.5.1. With respect to a given algorithm run r, we say that an event e in r
semantically causes e′, denoted by e → e′, if one of the following holds:
1. e and e′ are on the same process, e occurs before e′ and the user specified that they
are semantically dependent.
2. e and e′ are on two different processes connected by a communication channel, e is
a send event and e′ is the corresponding receive event.
3. There is an event e′′ such that e → e′′ and e′′ → e′.
Let e and e′ be two events of the algorithm. Let An(e) and An(e′) be the animation
segments of these events, respectively. We say that an animation An(e) precedes an
animation An(e′), denoted by An(e) ≺ An(e′), if An(e) completes before An(e′) starts.
The following theorem has been proved in [151]:
Theorem 7.5.1. An animation is consistent with the execution of the algorithm if and
only if for every two algorithm events e and e′ such that e → e′, also An(e) ≺ An(e′).
That is, in order to ensure that the animation is consistent with the execution of
the algorithm, we have to ensure that for every two events e and e′, if e → e′ then
An(e) ≺ An(e′).
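The condition of Theorem 7.5.1 can be checked mechanically for a recorded animation. The sketch below assumes that each animation segment is given as a (start, end) time pair and that the causal pairs are listed explicitly; the function name `is_consistent` and this data layout are illustrative assumptions, not part of [151].

```python
def is_consistent(causal_pairs, intervals):
    """Check that An(e) precedes An(e2) for every pair e -> e2.

    causal_pairs: iterable of (e, e2), meaning e semantically causes e2.
    intervals: dict mapping event -> (start, end) of its animation segment.
    Assumption: a segment that ends exactly when the next one starts counts
    as preceding it.
    """
    for e, e2 in causal_pairs:
        end_e = intervals[e][1]
        start_e2 = intervals[e2][0]
        if end_e > start_e2:  # An(e) has not completed when An(e2) starts
            return False
    return True
```

Only the direct pairs need to be listed: as long as every segment ends no earlier than it starts, the transitive pairs of case 3 in Definition 7.5.1 follow from the direct ones.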
A possible implementation of this requirement is called receive synchronization. In this
method, reports of send and receive events are sent to the animation system immediately
after they take place and there is no delay in the execution of the algorithm. The
animation of the receive event is delayed until the corresponding send event has been
animated.
We now turn our attention to mobile object environments. The main differences
between this model and the distributed environments model, in the context of consistency,
are:
1. Assumption 1 is violated. Both cores and objects might join or leave the network.
2. Objects might move between cores.
3. Assumption 4 is violated. Since objects might move, messages sent by a single
object to another might be received out of order.
The first problem is addressed as follows. Dynamic creation and deletion of cores and
objects are modeled as internal messages. A core / object is introduced to the animation
system after its internal create event is received. A core / object is deleted from the
animation system once a deletion event is received and all preceding events have been
animated.
To solve the second problem, object movement between locations is modeled as a
method call between the sending and receiving cores. The parameters passed include the
state and behavior (code) of the object that is being moved from one core to the other.
This approach allows us to synchronize the events emitted by objects, even when they
migrate to remote cores. It is ensured that the movement event will be animated before
any events generated at the destination core are shown. Hence, when treating object
movement as a method call between cores, we are able to revert to using existing event
synchronization algorithms.
The third problem, out-of-order messages, should be solved by the middleware or the
application. It is not a visualization problem, but rather an inherent problem. When this
is solved, all that remains is to solve possible out-of-order reception of messages by the
visualization system. This can be done by adding an event counter to each object and
using the receive synchronization technique described above for visualization.
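The event counter idea can be sketched as a small reorder buffer that releases the events of one object strictly in counter order. The class name `ReorderBuffer` and the choice of starting the counter at 0 are assumptions for illustration.

```python
class ReorderBuffer:
    """Releases the events of one object in counter order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = 0   # counter value expected next (assumed to start at 0)
        self.pending = {}   # counter value -> event that arrived too early

    def receive(self, seq, event):
        """Store an incoming event and return the events that are now deliverable."""
        self.pending[seq] = event
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

The events returned by `receive` could then be passed on to receive synchronization as if they had arrived in order.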
We have chosen to perform synchronization at the core level. Using a finer-grained
approach requires extensive profiling of the application, possibly considerably slowing
down the execution. Like regular distributed applications, each core is viewed as a sepa-
rate process. Events notifying about communication between cores and activities internal
to each core are emitted and synchronized. The internal events in each core are serialized.
This may add redundant dependencies between activities that are independent in a core
but is guaranteed to create a consistent visualization. The alternative of asking the user
to explicitly define dependencies is not viable in the context of our problem.
Messages sent between cores are modeled as messages sent between processes. The
dependency between receiving the parameters for a message call and forwarding the
parameters to the next core on the way to the destination core is handled automatically
since these are two events that occur at the same core, one after the other. This is also
true for messages sending the return value back to the caller core.
Events showing average information that is periodically updated are not synchronized.
For example, in our system events notifying the amount of communication between cores
are periodically generated, yet not synchronized.
7.6 Visualization Scalability
As the number of objects and cores increases, the visualization might get cluttered with
information. Gaining any insight from the visualization will become increasingly difficult.
In this section we present a context sensitive focus + context technique that alleviates
this problem.
7.6.1 Levels of Detail
The visualization should provide the user with an overview of the graph while at the
same time allowing focusing on specific, user-defined areas in order to get more detailed
information [29, 71]. To achieve these goals, a hierarchy of levels of detail is defined,
allowing different parts of the graph to be displayed in different levels of detail.
At the highest level, full information is displayed, as shown in Figure 7.2(a). The next
level of the hierarchy omits information about the objects residing in each core and the
logical connections between objects. Instead of displaying a cluster for each core in the
network, a single node is used to depict each core. As before, cluster–cluster edges are
used to convey the physical connections to other cores in the network, as demonstrated
in Figure 7.2(b). The final level combines several cores into one node in the display. This
allows collapsing uninteresting parts of the graph into a small display area while still
showing the user the overall structure of the graph. The size of such nodes is proportional
to the number of cores they depict. Figures 7.2(c) and (d) demonstrate graphs containing
nodes of various levels of detail. Note the stability of the layouts and the way nodes are
collapsed as the level of detail is decreased.
The user has several methods to control which parts of the graph will be displayed
in which level of detail. The first is selecting focal nodes (cores) that are of primary
interest to the user and thus should be displayed with full detail. The second method
is navigating the graph using zoom-in and zoom-out operations. The third is choosing
the total number of nodes to be displayed in the graph and letting the system cluster
the graph nodes accordingly. Once the user selects focus nodes, a clustering algorithm
is employed in order to decide at what level of detail each core will be displayed, as
described in the next subsection.
Zoom–in and zoom–out operations are animated smoothly. The old nodes fade out of
the graph while the new nodes fade in. Next, the new nodes smoothly move to their final
location. This helps maintain the mental map. A similar animation is performed when
re–clustering is performed. The locations of the new clusters are calculated by the layout
algorithm, which takes into account the previous locations of the nodes comprising the
cluster, thus maintaining layout stability.
(a) Original graph – 43 clusters (b) Clustering to 36 clusters
(c) Clustering to 25 clusters (d) Clustering to 15 clusters
Figure 7.2: Levels of detail. Several visualizations of the same mobile object network are
shown. Parts of the graph are progressively collapsed. Note the stability in the layouts
and the conservation of the overall structure of the graph.
In order to improve scalability, the synchronization scheme presented in Section 7.5
can be extended to a hierarchy of synchronization units, which is constructed according
to the hierarchical representation of the graph. Each level in the hierarchy contains a
synchronization unit. Events are forwarded to the next (higher) level only if they are not
contained in the current level in the hierarchy. Using this method, the number of events
reaching the higher levels of the hierarchy (which represent more cores) is significantly
reduced. In order to further reduce the volume of events, instead of showing movements
of objects using animation, this information can be time-averaged and visualized by
changing the frequency of the dashed lines connecting cores.
7.6.2 Clustering
A clustering algorithm is used in order to compute the hierarchical representation of
the graph. The clustering is influenced by the focal nodes, which are interesting nodes
selected by the user. A fisheye type effect is used, where nodes farther away from the focal
nodes are displayed with less detail. The algorithm, which is summarized in Figure 7.3,
is based on an extension of the agglomerative clustering algorithm.
Input: Set of focal nodes; distances between nodes; number of desired clusters
Algorithm:
1. Calculate shortest distance between each node and the closest focus node.
2. Update distances between nodes according to distance to focus node.
3. Perform hierarchical clustering.
Output: Clustering hierarchy of the nodes
Figure 7.3: Focus-based clustering algorithm
The algorithm has several inputs. The first is a set of focal nodes (e.g., cores of
interest), selected interactively by the user. The second is the distances between nodes,
designated D(u, v), which correspond to the weights of edges in the graph. They are
calculated according to the frequency of method calls and object moves between cores, as
described in Section 7.4. The third input is the desired number of clusters. The output
of the algorithm is a hierarchical clustering of the graph.
In the first step of the algorithm, the shortest distance between each node u and the
closest focal node, $D_{focal}(u)$, is calculated. This is done using Dijkstra’s algorithm on
the focal nodes. Additionally, the maximum of the minimal distances is computed as
$d_{max} = \max_{v \in V} D_{focal}(v)$, where V is the set of nodes in the graph.
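The first step can be sketched as a single Dijkstra run that starts from all focal nodes at once. Running it from every focal node together (rather than once per focal node and taking the minimum, which gives the same distances) is an assumption, as are the function name `focal_distances` and the adjacency-dictionary graph representation. At least one focal node is assumed.

```python
import heapq


def focal_distances(graph, focal_nodes):
    """Distance from each reachable node to its closest focal node.

    graph: dict node -> dict neighbor -> non-negative edge distance D(u, v).
    Returns (d_focal, d_max), where d_max is the largest of these distances.
    """
    d_focal = {f: 0.0 for f in focal_nodes}
    heap = [(0.0, f) for f in focal_nodes]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > d_focal.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < d_focal.get(v, float("inf")):
                d_focal[v] = nd
                heapq.heappush(heap, (nd, v))
    return d_focal, max(d_focal.values())
```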
In the second step, the distances, D(u, v), between every pair of nodes u, v, are up-
dated according to their proximity to focal nodes. As opposed to the regular fisheye
technique, in which geometric distortion is used, our method moves the distortion to the
clustering phase. This results in a better layout since the graph is not distorted after
layout. We set the initial, joint average distance of nodes u and v and a focus node to
\[ D_{focal}^{avg}(u, v) = \frac{D_{focal}(u) + D_{focal}(v)}{2}. \]
It should be noted that the focal node used in $D_{focal}(u)$ may be different from the one
used in $D_{focal}(v)$. The distance D(u, v) is distorted to form $D_{distorted}(u, v)$, the updated
distance between nodes u and v, according to the following formula:
\[ D_{distorted}(u, v) = \frac{D(u, v)}{1 + C \cdot \frac{D_{focal}^{avg}(u, v)}{d_{max}}}. \]
The greater the average distance between the nodes and the closest focal node, the bigger
the distortion. This behavior mimics the fisheye effect. Nodes in the periphery are less
interesting and therefore have a higher probability of being clustered together, since they
are perceived to be close. In our implementation we use C = 3. Another option is to
have C depend on the size of the graph.
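As a small worked illustration of the distortion formula with C = 3, the sketch below computes the distorted distance for one pair of nodes. The function name `distorted_distance` and the treatment of the degenerate case where the maximum focal distance is zero are assumptions.

```python
def distorted_distance(d_uv, d_focal_u, d_focal_v, d_max, c=3.0):
    """Shrink D(u, v) according to how far u and v are from their closest focal nodes."""
    d_avg = (d_focal_u + d_focal_v) / 2.0
    if d_max == 0:
        return d_uv  # assumption: every node is focal, so nothing is distorted
    return d_uv / (1.0 + c * d_avg / d_max)
```

With a maximum focal distance of 2, an edge of length 8 between two focal nodes keeps its length, while the same edge between two nodes at focal distance 2 shrinks to 8 / (1 + 3) = 2, making the peripheral nodes more likely to be merged.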
In the last step of the algorithm the actual clustering is performed, using the distances
computed in the previous steps. A bottom–up, hierarchical clustering algorithm is used.
The algorithm starts by assigning every node to its own singleton cluster. It then
repeatedly and greedily joins the two closest clusters. The algorithm terminates when the
required number of clusters has been reached.
The distance between clusters $C_i$ and $C_j$ is calculated using a modified average
distance metric. Only edges (distances) in the set
$E_{ij} = \{ e = (u, v) \in G \mid u \in C_i, v \in C_j \}$, that is, edges directly connecting a
node $u \in C_i$ and a node $v \in C_j$ (i.e., edges crossing the boundary between the
clusters), are taken into account. The distance is
\[ Dist(C_i, C_j) = \frac{\sum_{(u,v) \in E_{ij}} D_{distorted}(u, v)}{|E_{ij}|}, \]
i.e., the sum of the lengths of the edges divided by the number of edges. This formula is
a tradeoff between an exact calculation and a rapid, approximate calculation.
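The clustering step and the modified average distance can be sketched as follows. This is a simplified illustration under several assumptions: it returns only the final partition rather than the full hierarchy, it uses a quadratic search for the closest pair, it stops early if no edge connects the remaining clusters, and the names `cluster_distance` and `agglomerate` are hypothetical.

```python
def cluster_distance(ci, cj, dist):
    """Average distorted length of the edges directly connecting ci and cj.

    dist: dict frozenset({u, v}) -> distorted distance, one entry per graph edge.
    Returns None when no edge crosses between the two clusters.
    """
    crossing = [dist[frozenset((u, v))] for u in ci for v in cj
                if frozenset((u, v)) in dist]
    return sum(crossing) / len(crossing) if crossing else None


def agglomerate(nodes, dist, k):
    """Greedily merge the two closest clusters until only k clusters remain."""
    clusters = [frozenset([n]) for n in nodes]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j], dist)
                if d is not None and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            break  # the remaining clusters are not connected by any edge
        _, i, j = best
        merged = clusters[i] | clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters
```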
7.7 Implementation
MOVIS was implemented on top of FarGo [102], a Java-based mobile object framework.
FarGo contains extensive monitoring facilities [103] and uses a source–to–source compiler
called Fargoc for generating proxies and other code used to implement support for mobile
objects. Our implementation is Java based. We use the Java3D API for generating the
visualization.
Our system is composed of several components. In each core (machine), a special
local profiling object, used to collect events, is instantiated. This object listens both to
events generated by the FarGo monitor and to events generated by our modified Fargoc
compiler. The events generated by each core are forwarded to a main event collection
object. This object either stores the events for offline visualization or forwards them to
the event synchronization unit, described in Section 7.7.2. After creating a synchronized
event list, from which a consistent run of the application can be constructed, the events
are sent to the visualization component. Events generated by the user, such as requests
for re-clustering or zoom in / zoom out operations are fused together with the events
collected from the system, in order to form a unified event queue that is visualized.
7.7.1 Event Generation
One of the goals of a program visualization system is to generate events with minimal
effort by the programmer and the user of the application being visualized, while perturb-
ing the running application as little as possible. In this section we describe how this is
achieved.
The interesting events are related to communication between mobile objects and move-
ment of objects between cores. Since location transparency needs to be maintained when
communication is performed between mobile objects, some kind of proxy needs to be used
in order to forward the method call to the actual destination object. This proxy is gen-
erated either statically [102] or dynamically [4, 159]. This is where the event generation
code is (automatically) inserted.
In order to trace method calls, the Fargoc compiler was modified to transparently
generate an event each time execution enters an interface method of a mobile object.
Generating events for movement of objects between cores is implemented by piggybacking
onto the migration code supplied by the middleware. Other types of actions for which
events need to be generated include the creation and destruction of mobile objects and
cores (e.g., connecting/disconnecting from the application network). This is handled by
tapping into an existing profiling interface.
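The idea of emitting an event inside the proxy can be illustrated with a generic forwarding proxy. This is not FarGo's API and not the code generated by Fargoc: in MOVIS the event code is inserted at compile time, whereas this sketch intercepts calls dynamically. The class name `EventProxy` and the `emit` callback are hypothetical.

```python
class EventProxy:
    """Forwards calls to a target object and reports each method entry."""

    def __init__(self, target, emit):
        self._target = target
        self._emit = emit  # callback receiving (object type name, method name)

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def wrapper(*args, **kwargs):
            self._emit((type(self._target).__name__, name))  # event on method entry
            return attr(*args, **kwargs)

        return wrapper
```

Because the caller only ever holds the proxy, neither the programmer nor the user of the application has to add any event-reporting code.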
7.7.2 Event Synchronization Component
The event synchronization component receives events from all of the event collection
objects located at the different cores that constitute the application to be monitored. It
reorders the events in order to generate a sequence of events that is consistent. This
stream of events is then visualized.
The implementation of the synchronization component follows several rules and
observations made in this chapter. The first is that all events generated at a core are reported
in FIFO order and each event depends on the previous event. The second is that a send
event should be reported (to the visualization) before the receive event that depends on it.
The algorithm is described in Figure 7.4.
For each core, the synchronization component maintains a queue of events. This queue
contains received events that cannot yet be forwarded to the visualization component, since
a send event they depend on has not yet been received by the synchronization component.
We will call the act of sending an event to the visualization component committing the event.
Committing a send event may be delayed since it is itself dependent on a previous
event that has not been committed yet.
procedure handle_event(event e)
    if (e’s core is blocked)
        queue_event(e)
    else
        if (e is a send or internal event)
            commit_event(e)
        else // e is a receive event
            if (e depends on a committed event)
                commit_event(e)
            else
                queue_event(e)

procedure commit_event(event e)
    send_to_vis(e)
    if (e can unblock a core)
        BFS_unblock_core(e.getCore())

procedure BFS_unblock_core(core c)
    active_cores_list ← c
    while (active_cores_list ≠ ∅)
        c = remove_first(active_cores_list)
        if (c has more queued events)
            ec = c.nextEvent()
            if (ec can be sent)
                send_to_vis(ec)
                if (ec is a send event)
                    dest(ec) = ec destination core
                    if (dest(ec) blocked on ec)
                        add dest(ec) to active_cores_list
                add c to active_cores_list
Figure 7.4: Event synchronization algorithm
When a new event is received by the synchronization component, the following is
done. First, a check is made if the core from which the event was sent is blocked, i.e.,
waiting for events. If this is the case, the event is added at the end of the event queue of
the core. If the core is not queuing events and the event is a send event, it is committed.
A check is made if there is another core that is blocked on this event. If this is the case,
events from the blocked core may be committed, according to their order in the queue.
If the newly received event is a receive event, a check is made to determine if the send
event that it depends on was already sent. If this is the case, the event is committed. If
this is not the case, the event is queued and its core enters the blocked state.
When a core unblocks, the queued events are committed. This, in turn, may cause
other cores to become unblocked (due to committing a send event that the blocked core
depends on). A list of active cores is maintained. Each time one event is committed from
a core, the activity switches over to the next core in the list. This is similar to advancing
in a graph using a BFS algorithm. The motivation for using this method is to create a
stream of events that will produce animation that is maximally parallel. Switching from
one core to another while committing events attempts to expose the possible parallelism
to the visualization component. The synchronization component can be modified to
produce a variety of interesting orderings, as described in [125].
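The procedure in Figure 7.4 can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the MOVIS implementation: each event carries a message identifier linking a send to its receive, a core counts as blocked exactly when its queue is non-empty, and the names Event, Synchronizer, handle and committed are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional, Set


@dataclass(frozen=True)
class Event:
    core: str                   # core that generated the event
    kind: str                   # "send", "recv" or "internal"
    msg: Optional[str] = None   # assumed id linking a send to its receive
    dest: Optional[str] = None  # destination core (send events only)


class Synchronizer:
    def __init__(self) -> None:
        self.queues: Dict[str, Deque[Event]] = defaultdict(deque)
        self.committed_sends: Set[Optional[str]] = set()
        self.committed: List[Event] = []  # stream passed to the visualization

    def _ready(self, e: Event) -> bool:
        # A receive may be committed only after its matching send.
        return e.kind != "recv" or e.msg in self.committed_sends

    def _blocked_on(self, core: Optional[str], msg: Optional[str]) -> bool:
        q = self.queues.get(core)
        return bool(q) and q[0].kind == "recv" and q[0].msg == msg

    def _emit(self, e: Event) -> None:
        self.committed.append(e)
        if e.kind == "send":
            self.committed_sends.add(e.msg)

    def handle(self, e: Event) -> None:
        q = self.queues[e.core]
        if q or not self._ready(e):
            q.append(e)  # the core is, or now becomes, blocked
            return
        self._emit(e)
        if e.kind == "send" and self._blocked_on(e.dest, e.msg):
            self._unblock(e.dest)

    def _unblock(self, start: str) -> None:
        # Commit one event per visit and rotate through the active cores.
        active: Deque[str] = deque([start])
        while active:
            c = active.popleft()
            q = self.queues[c]
            if not q or not self._ready(q[0]):
                continue  # nothing to commit; c leaves the active list
            ec = q.popleft()
            self._emit(ec)
            if (ec.kind == "send" and self._blocked_on(ec.dest, ec.msg)
                    and ec.dest not in active):
                active.append(ec.dest)
            if c not in active:
                active.append(c)  # revisit c after the other active cores
```

Re-appending a core to the end of the active list after each committed event is what interleaves events from different cores, in the spirit of the BFS-like traversal described above.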
7.8 Results
Our system has been used for visualizing several applications, including a mobile object
simulator, an e-commerce application [110] and a distributed e-mail system (abbreviated
DEM) [13]. We first present visualizations of our mobile object simulator and then
proceed to discuss the application of MOVIS to the DEM system.
Mobile object simulator In order to test our visualization system, a mobile object
simulator was implemented. The simulator uses a configuration file which governs the
activities of mobile objects it creates. The number of objects, their creation and destruc-
tion time and location, their movement and communication patterns are all specified in
the configuration file. Figure 7.5 demonstrates an animation sequence created with our
visualization algorithm. Note how the user's mental map is maintained during the animation
sequence. Also note the stripes which show the location history of each mobile
object.
Figure 7.5: Sample animation sequence of the mobile object simulator (from left to right
and top to bottom)
Mobile-object based E-mail application E-mail is one of the most popular Internet
applications. Nowadays, e-mail architectures are governed by a server-centric design,
which implies a handful of weaknesses such as a single point of failure, storage and
processing stress, bottlenecks and inefficiency.
The goal of the DEM system is to overcome these drawbacks. Service is provided by
using the participants' resources. Lightweight servers and users' mailboxes are scattered
among participants' computers instead of residing on a single server (or cluster). By using
the mobile objects paradigm, the mailboxes and servers are able to travel on the “live”
network, so that they continue their operation despite the fact that participants con-
stantly join and leave the network. Most of the communication is done directly between
users, thus removing the bottlenecks caused by mail servers. The system’s components
are replicated across numerous hosts, eliminating single point of failure problems. Storage
and processing stress is reduced as participants take an even share of the burden. This
yields a reliable and scalable system, with negligible operational and maintenance cost.
Visualization has been used during the development of this application – for debugging
purposes as well as for managing and monitoring its deployment across the network. Due
to the complexity of the architecture, its developer expressed a need for visualization
at the very early stages of implementation. Using visualization, several problems were
quickly discovered. For example, a case where an object does not flee from a core that is
shutting down was uncovered.
Figure 7.6: Mailbox mobility in the DEM system. (a) Before movement. (b) A new core
was created. A mailbox migrated to it.
In this application, icons have been used to represent the objects. The mailboxes
are displayed using a mailbox icon. Servers are represented as gray disks. Yellow pools
represent mailbox placeholders. Finally, the GUI is represented by a mailbox icon with
a white background.
Figure 7.7: Sending an e-mail in the DEM system
Figure 7.6 shows a visualization of the movement of a mailbox between computers.
In Figure 7.6(a) there is one mailbox in each core. In Figure 7.6(b) a mailbox moved to
a new core that connected to the service, shown at the bottom.
Filtering of method calls was used in order to show specific interesting events. For
example, Figure 7.7 shows an e-mail message being sent from the source mailbox directly
to the destination mailbox. The message, in transit, is drawn inside a red circle. An
accompanying movie can be found at
http://www.ee.technion.ac.il/~ayellet/Movies/FrishmanTal.mov.
7.9 Conclusion and Future Work
We have presented MOVIS – a system for visualizing mobile object frameworks. The key
features of these frameworks – object mobility, location transparency, and distributed
operation – are addressed by our system. A clustered graph is used to concurrently show
the physical connections between cores and the logical connections between objects. A
clustering algorithm, which is influenced by the areas of interest to the user, is used to
provide a hierarchical, scalable context+focus visualization. The overall complexity of
the graph is user controlled. The visualization is dynamic: incremental graph layout and
animation depict changes in a smooth, comprehensible manner.
MOVIS has been used for monitoring, debugging and presenting system architectures.
It has been used in several scenarios, including simulators, e-commerce and distributed
e-mail.
There are several avenues of future research. Additional levels of detail can be inte-
grated into the visualization. The existing profiling infrastructure can be used to supply
object-specific information such as memory usage and creation time. Information about
the cores themselves, such as thread count, memory usage and CPU usage can also be
integrated into the visualization.
Chapter 8
Conclusions
In this thesis we have examined several problems in the field of graph drawing in infor-
mation visualization. In this chapter we summarize the main results and propose several
topics for extending this work in future research.
8.1 Contribution and Summary
The major contribution of this thesis is addressing several interconnected problems in the
field of graph drawing. First, a new algorithm for solving the basic problem of computing
a layout for a single graph is presented. The algorithm is able to quickly compute layouts
of large graphs. One of the difficulties with graph layouts is the variable information
density in different parts of the screen. An algorithm that improves a layout computed
by any algorithm is presented next. One of the goals of the algorithm is to maintain the
overall structure of the graph while it is improved. This is one instance of the problem of
maintaining the mental map [145], which is also addressed in this research. In addition to
computing a layout for a single graph, we study methods of creating sequences of graph
layouts. The challenge here is maintaining aesthetics while at the same time maintaining
stability, in a way that allows the user to comprehend the changes performed on the
graph without being distracted by unwanted, abrupt changes to the layout. Dynamic
algorithms for both clustered and un-clustered graphs are discussed.
In recent years, graphics processing units (GPUs) have become increasingly powerful
and programmable. Devised for quickly rendering high-quality images for graphics tasks,
GPUs are architected for working in parallel on large, structured data. On the other
hand, graphs are inherently unstructured and hence do not seem suitable for processing
on GPUs. In this thesis we have demonstrated how a variety of problems related to graph
layout can be restructured and thus efficiently handled by the GPU.
The algorithms developed in this thesis have been used in several information visual-
ization applications. The dynamic clustered graph drawing algorithm is used as a basic
building block for a system for the visualization of mobile object frameworks. Focus+context
techniques are used to create a scalable visualization system, showing both physical
and logical interactions in the mobile object network. The static layout algorithm has
been used for visualization of the structure of the networks of internet service providers.
It is shown that the layout can provide meaningful insights about these networks. The
dynamic graph drawing algorithm has been used in two applications. The first
is the visualization of discussion threads occurring at an Internet news site. The second
is the visualization of the growth of an Internet-based social network.
The research presented in this thesis is based on the following papers [13, 63–69].
Below, we give more details about the contributions and main results of each chapter.
In Chapter 3, based on [65], a new algorithm for force directed graph layout on the
GPU was presented. The algorithm uses a multi-level scheme, which is based on spectral
partitioning. The strengths of the algorithms of [70,113] are combined in order to create
a high-quality layout of a simplified graph, which is the basis for the final layout. A new
scheme for extending coarse layouts to finer layouts, which creates good initial layouts
for coarse graphs, is discussed. Finally, the algorithm presented is able to efficiently
use the GPU to accelerate the layout. Using spectral partitioning and KD-partitioning
techniques, we are able to restructure the graph layout problem in a manner suitable for
acceleration on the GPU or any other data-parallel architecture. Thus, this algorithm
can be efficiently implemented on future parallel architectures.
It has been demonstrated that the algorithm is able to quickly compute aesthetic lay-
outs of different types of graphs. Using the GPU the layout computation was accelerated
by up to 5.5 times compared to a CPU implementation of the algorithm. Combined with
the inherent speed of the algorithm, this resulted in being able to compute layouts with
similar quality to state-of-the-art force-directed algorithms such as [91] in a fraction of
the running time. The algorithm has been applied to the visualization of the networks
of Internet service providers.
In Chapter 4 [69], the problem of reducing cluttering in graph layouts was addressed.
In many cases, graph layouts contain a non-uniform spatial density of information. While
some regions of the layout are highly congested, others are sparse or even empty. A new
algorithm for improving a given graph layout, computed by any layout algorithm, was
presented.
The algorithm is based on a physically-inspired evolution process, where the content
of dense areas of the layout is spread to surrounding empty areas. The evolution uses a
ray-casting approach in order to find a better distribution for the information contained
in the graph layout. An image warp, which is used to displace the nodes of the graph, is
computed. Results from optimal mass-transport problems are used in order to compute
this warp. Since the warp minimizes the displacements in each pixel of the image, the
algorithm is able to compute a mental-map preserving improvement of the layout.
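For reference, the classical L2 optimal mass-transport (Monge-Kantorovich) problem, in the form discussed in [10, 22], can be stated as follows. This is a generic statement rather than the exact formulation used in Chapter 4, and the symbols \mu_0, \mu_1, u and \Omega are chosen here only for illustration. Given an initial density \mu_0 and a target density \mu_1 of equal total mass on a domain \Omega, one seeks the mapping u that solves

```latex
\[
  \min_{u} \int_{\Omega} \lVert u(x) - x \rVert^{2}\, \mu_0(x)\, dx
  \quad \text{subject to} \quad
  \mu_0(x) = \lvert \det \nabla u(x) \rvert \, \mu_1\bigl(u(x)\bigr).
\]
```

The objective penalizes the squared displacement of every point, which is why a warp obtained in this way tends to keep nodes close to their original positions.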
Various acceleration techniques were used. The GPU was used to efficiently perform
the ray-casting, which is required to compute the updated layout density image. Using
the GPU reduced the total computation time by a factor of more than 100 compared with our CPU
implementation. A multi-grid method was used to accelerate the computation of the
image warp. These techniques resulted in being able to compute an updated layout in a
matter of seconds, even for large input graphs.
The algorithm has been applied to unclutter layouts of both small and large graphs
computed by several well-known algorithms. It was demonstrated that the algorithm is
able to better utilize the available screen space while maintaining the user’s mental map.
This makes it possible, for example, to create animations of the improvement process, where the
structure of the graph is maintained while the readability of the graph improves.
In Chapter 5, based on [66, 68], we described a new, GPU-accelerated algorithm for
online dynamic graph drawing. The algorithm is able to efficiently compute stable and
aesthetic layouts of a series of graphs, which contain arbitrary modifications between
consecutive graphs. The algorithm uses various execution culling techniques in order to
reduce the layout time, while maintaining the layout quality. Nodes are assigned individ-
ual movement flexibilities according to the changes to the graph. A multi-level scheme
for dynamic graphs is presented and used to improve the layout quality. The algorithm
has been applied to the visualization of several real datasets, including discussion threads
in Internet sites and visualization of social networks.
The algorithm was shown to compute high-quality layouts while reducing node dis-
placement and preserving the user’s mental map. Implementation on the GPU allowed
for a speedup of the total running time of the algorithm (including parts running on the
CPU) by up to 17 times. Further, it was shown that using newer GPUs results in an
even larger acceleration of the layout.
In Chapter 6, based on [63], an online algorithm for dynamic drawing of graphs which
contain an inherent grouping into clusters was discussed. The algorithm is based on a
few concepts. First, in order to maintain stability, some of the nodes of the graph are
pinned down. Second, invisible place-holder vertices are used to minimize changes to the
structure of the graph. Finally, edge lengths and weights are used to control the placement
of vertices and the modifications of the graph. Several metrics for measuring dynamic
layout quality were introduced. The algorithm has been applied to the visualization of
mobile objects, which is discussed in Chapter 7.
In Chapter 7, based on [13, 64, 67], a system for the visualization of mobile object
environments was presented. During this research several visualization challenges were
encountered. First, the visualization needs to be consistent with the execution of the
asynchronous application. Second, the user’s mental map needs to be maintained while
the visualization unfolds. Third, the visualization must be scalable. Distributed systems
are naturally scalable and mobile object systems are even more so. Devising a visual-
ization method that can scale well to hundreds of objects and machines is much harder
than providing a tool that can display a few objects. Fourth, simultaneously showing ac-
tivities in the physical network communication level and in the logical object interaction
level is required. This is important since objects are mobile and the interaction between
machines changes over time due to object migration. Finally, the massive amounts of
information available need to be filtered in order to allow the user to focus on interesting
events.
In our work we devised and implemented algorithms to solve all of these difficulties.
We made the following contributions. First, we identified and discussed the requirements
of a mobile object visualization system. Second, we supplied an algorithm to maintain
the consistency between the execution of the application and its visualization. We proved
the correctness of our synchronization algorithm. Third, we developed a focus+context
visualization algorithm [29, 71] which provides a system that is scalable and capable of
displaying large networks and many events. This was achieved by using a focus-based,
user-directed clustering algorithm and displaying information using different levels of
detail. Fourth, we displayed the logical and physical aspects of the system simultaneously
using a dynamic clustered graph. We used both 2D and 3D graphics in our visualization.
Finally, we devised a way to generate interesting events automatically, avoiding additional
work by the programmer of the application. We implemented our system on top of the
FarGo mobile object framework [102].
Our visualization system has been used in several scenarios, ranging from simulators
to distributed e-mail and e-commerce applications. It has been used for monitoring,
debugging, as well as for presenting system architectures. One byproduct of this work
has been the design and implementation of a distributed e-mail architecture that is based
on the mobile object paradigm [13].
8.2 Future Research
In our modern world, which is filled with different sources of information, being able to
visualize the large amounts of available information is an increasingly important task.
In this thesis, several algorithms for visualizing different types of information have been
presented. In this section we outline a few possible extensions of this thesis and related
research problems. More specific research ideas are included in the conclusions section of
each chapter.
In this research, an algorithm for dynamic drawing of clustered graphs has been
presented. Dynamic drawing of changing, nested hierarchical graphs of arbitrary depth
is an interesting research challenge. Adding more levels of detail and more information
while providing a consistent, mental-map preserving and understandable visualization is
significantly more complex. Some of the applications of such an algorithm include web
visualization and software visualization.
One of the major deficiencies of force-directed graph layout algorithms is the fact
that they converge to a local minimum position. This has two drawbacks. First, the final
outcome depends on the initial conditions used. Second, since the minimum is local, it is
not guaranteed that the optimal layout is computed. Hence, finding a high-quality layout
algorithm that converges to a global minimum is an interesting research problem.
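The dependence on initial conditions can be explored with a small experiment. The Python sketch below is not one of the algorithms of this thesis: it minimizes an assumed toy energy (unit-length springs on the edges plus a 1/d repulsion between all node pairs) by plain gradient descent from a seeded random start. The names energy, gradient and layout are hypothetical.

```python
import math
import random
from typing import List, Tuple

Pos = List[Tuple[float, float]]
Edges = List[Tuple[int, int]]


def energy(pos: Pos, edges: Edges) -> float:
    """Spring energy on edges plus a 1/d repulsion between every node pair."""
    e = sum((math.dist(pos[u], pos[v]) - 1.0) ** 2 for u, v in edges)
    n = len(pos)
    e += sum(1.0 / (math.dist(pos[i], pos[j]) + 1e-9)
             for i in range(n) for j in range(i + 1, n))
    return e


def gradient(pos: Pos, edges: Edges, h: float = 1e-6) -> Pos:
    """Central-difference gradient; slow, but easy to check by hand."""
    grad = []
    for i, (x, y) in enumerate(pos):
        def e_at(px: float, py: float) -> float:
            return energy(pos[:i] + [(px, py)] + pos[i + 1:], edges)
        grad.append(((e_at(x + h, y) - e_at(x - h, y)) / (2 * h),
                     (e_at(x, y + h) - e_at(x, y - h)) / (2 * h)))
    return grad


def layout(edges: Edges, n: int, seed: int, iters: int = 200):
    """Descend from a seeded random start; returns (positions, E_start, E_final)."""
    rng = random.Random(seed)
    pos = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    start = energy(pos, edges)
    step = 0.1
    for _ in range(iters):
        g = gradient(pos, edges)
        cand = [(x - step * gx, y - step * gy)
                for (x, y), (gx, gy) in zip(pos, g)]
        if energy(cand, edges) < energy(pos, edges):
            pos = cand   # accept: the energy strictly decreased
        else:
            step *= 0.5  # reject: shrink the step and try again
    return pos, start, energy(pos, edges)
```

Calling layout with several seeds on the same graph and comparing the returned final energies is one way to observe whether different starting positions settle in different minima; whether they do depends on the graph and on the energy chosen.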
Many graph drawing algorithms spend most of their effort optimizing the positions
of the nodes in the graph. Often, the edges are simply used to connect the nodes,
almost as an after-thought of the layout process. However, when looking at a layout, it
is evident that the edges are an important part of the layout, not only in terms of the
information they contain, but also in terms of the amount of screen space allocated for
drawing the edges. New static and dynamic algorithms that make the graph edges an
integral part of the layout process can generate improved and more readable layouts. An
especially challenging problem here is dynamic graph drawing. Updating node positions,
edge positions and edge shapes in a mental-map preserving way is an interesting future
research problem.
Another future research direction is applying the algorithms to diverse information vi-
sualization problems. One application is visualization for computer security. In this field,
graphs and especially dynamic graphs can provide a meaningful visualization, helping
identify behaviors and patterns. For example, visualization of the time-varying changes
in network traffic between Internet hosts can help identify suspicious node groups with
highly variable traffic, possibly due to a breach of security. Another area that has received
little attention thus far is using visualization tools in order to perform application management
tasks. An initial attempt, performed as part of our previous work [13], suggests
some promising results. Visualization of the process of developing software [27, 161, 203]
is another interesting application. Software is composed of hierarchical modules: source
files, classes and directories. These modules, their relative locations and the connections
between them constantly change during the lifetime of the software. Here, visualization
can be applied for project management, complexity analysis and understanding software
structure. To meet this challenge, algorithms for hiding some of the information, creating
smooth animations, and display algorithms need to be devised. Biological networks are
immensely complex. Applying graph layout techniques in order to visualize activity in
such networks can help researchers better comprehend the large amounts of data they
face. The size and complexity of biological data make graph drawing applications in this
field especially challenging.
Clearly, there is a growing interest in using GPUs to accelerate computations [89,163].
While GPUs have been successfully applied in graphics and visualization tasks, the use
of GPUs for accelerating information visualization tasks is not as common. In this thesis
some progress was made in applying GPUs in the field of graph drawing, which is one
of the central problems in information visualization. Implementing other graph-related
problems such as graph partitioning and clustering on the GPU can provide a basic
building block for a variety of applications, extending outside the field of information
visualization.
References
[1] AT&T graph library. linked from http://www.graphdrawing.org/.
[2] Rocketfuel maps and data. http://www.cs.washington.edu/research/networking/rocketfuel/.
[3] A. T. Adai, S. V. Date, S. Wieland, and E. M. Marcotte. LGL: creating a map of protein
function with an algorithm for visualizing very large biological networks. J. Mol.
Biol., pages 179–190, 2004.
[4] A. Acharya, M. Ranganathan, and J. Saltz. Sumatra: A language for resource-aware
mobile programs. In J. Vitek and C. Tschudin, editors, Mobile Object Systems:
Towards the Programmable Internet, number 1222 in Lecture Notes in Computer
Science, LNCS, pages 111–130. Springer-Verlag, 1996.
[5] D. Aiger and K. Kedem. Applying graphics hardware to achieve extremely fast
geometric pattern matching in two and three dimensional transformation space.
Inf. Process. Lett, 105(6):224–230, 2008.
[6] J. A. Anderson, C. D. Lorenz, and A. Travesset. General purpose molecular dy-
namics simulations fully implemented on graphics processing units. J. Comput.
Phys., 227(10):5342–5359, 2008.
[7] D. Archambault, T. Munzner, and D. Auber. TopoLayout: Multilevel graph layout
by topological features. IEEE Trans. on Visualization and Computer Graphics,
13(2):305–317, 2007.
[8] D. Auber and Y. Chiricota. Improved efficiency of spring embedders: Taking
advantage of GPU programming. In Visualization, Imaging, and Image Processing,
pages 169–175, 2007.
[9] J. Barnes and P. Hut. A hierarchical O(N log N) force-calculation algorithm. Nature,
324(4):446–449, 1986.
[10] J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the
Monge–Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–
393, Oct. 2000.
[11] S. Bender-deMoll and D. McFarland. The art and science of dynamic network
visualization. Journal of Social Structure, 7(2), 2006.
[12] S. Bender-deMoll and D. A. McFarland. SoNIA - social network image animator.
http://www.stanford.edu/group/sonia/.
[13] S. Bercovici, Y. Frishman, I. Keidar, and A. Tal. Decentralized electronic mail. In
International Workshop on Dynamic Distributed Systems (IWDDS), 2006.
[14] F. Bertault and M. Miller. An algorithm for drawing compound graphs. In J. Kratochvíl,
editor, Proc. 7th Int. Symp. Graph Drawing (GD 1999), number 1731 in
Lecture Notes in Computer Science, LNCS, pages 197–204. Springer-Verlag, 2000.
[15] T. Biedl and G. Kant. A better heuristic for orthogonal graph drawings. In Proc.
2nd European Symp. on Algorithms (ESA’94), number 855 in LNCS, pages 24–35,
1994.
[16] D. Blythe. The Direct3D 10 system. ACM Trans. Graph, 25(3):724–734, 2006.
[17] J. Bolz, I. Farmer, E. Grinspun, and P. Schröder. Sparse matrix solvers on the
GPU: conjugate gradients and multigrid. ACM Trans. Graph, 22(3):917–924, 2003.
[18] U. Brandes. 4. drawing on physical analogies. Lecture Notes in Computer Science,
LNCS, 2025:71–86, 2001.
[19] U. Brandes, D. Fleischer, and T. Puppe. Dynamic spectral layout of small worlds.
In Proc. 13th Int. Symp. Graph Drawing, GD, pages 25–36, 2005.
[20] U. Brandes and D. Wagner. A Bayesian paradigm for dynamic graph layout. In
Proc. 5th Int. Symp. Graph Drawing, GD, number 1353 in LNCS, pages 85–99,
1997.
[21] J. Branke. 9. dynamic graph drawing. Lecture Notes in Computer Science, LNCS,
2025:228–246, 2001.
[22] Y. Brenier. Polar factorization and monotone rearrangement of vector-valued func-
tions. Communications on pure and applied mathematics, 44(4):375–417, 1991.
[23] S. S. Bridgeman and R. Tamassia. A user study in similarity measures for graph
drawing. J. Graph Algorithms Appl, 6(3):225–254, 2002.
[24] W. L. Briggs, V. E. Henson, and S. F. McCormick. A multigrid tutorial: second
edition. SIAM, 2000.
[25] R. Brockenauer and S. Cornelsen. 8. drawing clusters and hierarchies. Lecture Notes
in Computer Science, LNCS, 2025:193–227, 2001.
[26] I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Han-
rahan. Brook for GPUs: stream computing on graphics hardware. ACM Trans. on
Graphics, 23(3):777–786, 2004.
[27] M. Burch, S. Diehl, and P. Weißgerber. Visual data mining in software archives.
In ACM Symposium on Software Visualization, pages 37–46, May 2005.
[28] C. Tenllado, J. Setoain, M. Prieto, L. Piñuel, and F. Tirado. Parallel
implementation of the 2D discrete wavelet transform on graphics processing units:
Filter bank versus lifting. IEEE Transactions on Parallel and Distributed Systems,
19(3):299–310, 2008.
[29] S. K. Card, J. D. Mackinlay, and B. Shneiderman, editors. Readings in Information
Visualization: Using Vision to Think. Morgan Kaufmann, 1999.
[30] N. A. Carr, J. D. Hall, and J. C. Hart. The ray engine. In SIGGRAPH/Eurographics
Workshop on Graphics Hardware, pages 37–46, 2002.
[31] B. Catanzaro, N. Sundaram, and K. Keutzer. Fast support vector machine training
and classification on graphics processors. In Proceedings of the 25th Annual Inter-
national Conference on Machine Learning (ICML 2008), pages 104–111, 2008.
[32] T. F. Chan, J. Cong, and K. Sze. Multilevel generalized force-directed method for
circuit placement. In P. Groeneveld and L. Scheffer, editors, ISPD, pages 185–192.
ACM, 2005.
[33] T. M. Chan. A near-linear area bound for drawing binary trees. In Proceedings of
the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages
161–168, 1999.
[34] S. C. Chapra and R. P. Canale. Numerical methods for engineers: with programming
and software applications, 3rd edition. McGraw Hill, 1998.
[35] J. H. Chuang, C. C. Lin, and H. C. Yen. Drawing graphs with nonuniform nodes
using potential fields. In G. Liotta, editor, Proc. 11th Int. Symp. Graph Drawing
(GD 2003), number 2912 in Lecture Notes in Computer Science, LNCS, pages
460–465. Springer-Verlag, 2004.
[36] F. R. K. Chung. Spectral graph theory. Regional Conference Series in Mathematics,
American Mathematical Society, 92:1–212, 1997.
[37] IBM Rational ClearCase, 2008. Available at
http://www-306.ibm.com/software/awdtools/clearcase/.
[38] W. R. Cockayne and M. Zyda, editors. Mobile Agents. Prentice Hall, 1998.
[39] J. D. Cohen. Drawing graphs to convey proximity: an incremental arrangement
method. ACM Trans. Comput.-Hum. Interact., 4(3):197–229, 1997.
[40] C. Collberg, S. Kobourov, J. Nagra, J. Pitts, and K. Wampler. A system for
graph-based visualization of the evolution of software. In Proceedings ACM 2003
Symposium on Software Visualization, pages 77–86. ACM, 2003.
[41] P. Crescenzi, G. Di Battista, and A. Piperno. A note on optimal area algorithms
for upward drawings of binary trees. Comput. Geom, 2:187–200, 1992.
[42] P. Crescenzi and A. Piperno. Optimal-area upward drawings of AVL trees. In Pro-
ceedings of the DIMACS International Workshop, Graph Drawing, GD’94, volume
894 of LNCS, pages 307–317. 1995.
[43] R. Davidson and D. Harel. Drawing graphs nicely using simulated annealing.
ACM Transactions on Graphics, 15(4):301–331, Oct. 1996.
[44] J. W. Demmel. Applied Numerical Linear Algebra. SIAM, 1997.
[45] O. Deussen, S. Hiller, C. van Overveld, and T. Strothotte. Floating points: A
method for computing stipple drawings. Computer Graphics Forum, 19(3), Aug.
2000. ISSN 1067-7055.
[46] G. Di Battista, P. Eades, R. Tamassia, and I. G. Tollis. Algorithms for drawing
graphs: An annotated bibliography. Computational Geometry: Theory and Appli-
cations, 4(5):235–282, 1994.
[47] S. Diehl and C. Görg. Graphs, They Are Changing - Dynamic Graph Drawing for
a Sequence of Graphs. In Proc. 10th Int. Symp. Graph Drawing, pages 23–31, 2002.
[48] T. Dwyer. Three dimensional UML using force directed layout. In P. Eades and
T. Pattison, editors, Australian Symposium on Information Visualisation, (invis.au
2001), volume 9 of Conferences in Research and Practice in Information Technology,
pages 77–85, Sydney, Australia, 2001. ACS.
[49] T. Dwyer, K. Marriott, and P. J. Stuckey. Fast node overlap removal. In P. Healy
and N. S. Nikolov, editors, Graph Drawing, volume 3843 of Lecture Notes in Com-
puter Science, pages 153–164. Springer, 2005.
[50] P. Eades. A heuristic for graph drawing. Congressus Numerantium, 42:149–160,
1984.
[51] P. Eades and Q. W. Feng. Multilevel visualization of clustered graphs. In S. C.
North, editor, Proc. 4th Int. Symp. Graph Drawing (GD 1996), number 1190 in
Lecture Notes in Computer Science, LNCS, pages 101–112. Springer-Verlag, 18–
20 Sept. 1996.
[52] P. Eades and K. Sugiyama. How to draw a directed graph. J. Information Process-
ing, 13(4):424–437, 1990.
[53] J. Ellson, E. R. Gansner, L. Koutsofios, S. C. North, and G. Woodhull. Graphviz
— open source graph drawing tools. In Proc. 9th Int. Symp. Graph Drawing (GD
2001), number 2265 in LNCS, pages 483–484, 2002.
[54] N. Elmqvist and P. Tsigas. Growing squares: animated visualization of causal
relations. In S. Diehl, J. T. Stasko, and S. N. Spencer, editors, Proceedings ACM
2003 Symposium on Software Visualization, pages 17–26. ACM, 2003.
[55] U. Erra. Toward real time fractal image compression using graphics hardware. In
Advances in Visual Computing, pages 723–728, 2005.
[56] C. Erten, P. J. Harding, S. G. Kobourov, K. Wampler, and G. V. Yee. GraphAEL:
Graph animations with evolving layouts. In Proc. 11th Int. Symp. Graph Drawing,
pages 98–110, 2003.
[57] K. Fatahalian, J. Sugerman, and P. Hanrahan. Understanding the efficiency of GPU
algorithms for matrix-matrix multiplication. In SIGGRAPH/EUROGRAPHICS
Workshop On Graphics Hardware, pages 133–137, 2004.
[58] R. Fernando, editor. GPU Gems: Programming Techniques, Tips, and Tricks for
Real-Time Graphics. 2004.
[59] M. Fiedler. A property of eigenvectors of nonnegative symmetric matrices and its
application to graph theory. Czechoslovak Mathematical Journal, 25(100):619–633,
1975.
[60] J. Foley, A. van Dam, S. Feiner, and J. Hughes. Computer Graphics: Principles
and Practice, second edition. Addison-Wesley Professional, 1990.
[61] T. Foley and J. Sugerman. KD-tree acceleration structures for a GPU raytracer.
In Graphics Hardware, pages 15–22, 2005.
[62] A. Frick, A. Ludwig, and H. Mehldau. A fast adaptive layout algorithm for undi-
rected graphs. In Graph Drawing, volume 894 of Lecture Notes in Computer Science,
pages 388–403. DIMACS, Oct. 1994.
[63] Y. Frishman and A. Tal. Dynamic drawing of clustered graphs. In Proc. of the
IEEE Symposium on Information Visualization, InfoVis, pages 191–198, 2004.
[64] Y. Frishman and A. Tal. Visualization of mobile object environments. In ACM
Symposium on Software Visualization, pages 145–154, 2005.
[65] Y. Frishman and A. Tal. Multi-level graph layout on the GPU. IEEE Trans. on
Visualization and Computer Graphics (Proc. InfoVis), 13(6):1310–1317, 2007.
[66] Y. Frishman and A. Tal. Online dynamic graph drawing. In EuroVis, pages 75–82,
2007.
[67] Y. Frishman and A. Tal. Movis: A system for visualizing distributed mobile object
environments. Journal of Visual Languages and Computing, 19(3):303–320, 2008.
[68] Y. Frishman and A. Tal. Online dynamic graph drawing. IEEE Transactions on
Visualization and Computer Graphics, 14(4):727–740, 2008.
[69] Y. Frishman and A. Tal. Uncluttering graph layouts using anisotropic diffusion and
mass transport. submitted for publication.
[70] T. M. J. Fruchterman and E. M. Reingold. Graph drawing by force-directed place-
ment. Software—Practice and Experience, 21(11):1129–1164, 1991.
[71] G. W. Furnas. Generalized fisheye views. In M. Mantei and P. Orbeton, editors,
Human Factors in Computing Systems, CHI’86 Conference Proceedings, pages 16–
23. ACM/SIGCHI, Special Issue of ACM SIGCHI Bulletin, 1986.
[72] P. Gajer, M. T. Goodrich, and S. G. Kobourov. A multi-dimensional approach to
force-directed layouts of large graphs. Comput. Geom, 29(1):3–18, 2004.
[73] N. Galoppo, N. K. Govindaraju, M. Henson, and D. Manocha. LU-GPU: Efficient
algorithms for solving dense linear systems on graphics hardware. In ACM / IEEE
Supercomputing, 2005.
[74] E. R. Gansner, Y. Koren, and S. C. North. Graph drawing by stress majorization.
In J. Pach, editor, Graph Drawing, 12th International Symposium, GD, volume
3383 of Lecture Notes in Computer Science, pages 239–250. Springer, 2004.
[75] E. R. Gansner, Y. Koren, and S. C. North. Topological fisheye views for visual-
izing large graphs. IEEE Transactions on Visualization and Computer Graphics,
11(4):457–468, 2005.
[76] E. R. Gansner, E. Koutsofios, S. C. North, and K.-P. Vo. A technique for drawing
directed graphs. IEEE Trans. Software Engineering, 19(3):214–230, Mar. 1993.
[77] E. R. Gansner and S. C. North. Improved force-directed layouts. In Graph Drawing,
volume 1547 of LNCS, pages 364–373, 1998.
[78] E. R. Gansner and S. C. North. Improved force-directed layouts. In S. Whitesides,
editor, Proc. 6th Int. Symp. Graph Drawing (GD 1998), number 1547 in Lecture
Notes in Computer Science, LNCS, pages 364–373. Springer-Verlag, 1998.
[79] E. R. Gansner and S. C. North. An open graph visualization system and its appli-
cations to software engineering. Software — Practice and Experience, 30(11):1203–
1234, 2000.
[80] A. Garg, M. T. Goodrich, and R. Tamassia. Planar upward tree drawings with
optimal area. Int. J. Comput. Geometry Appl, 6(3):333–356, 1996.
[81] A. Garg and R. Tamassia. A new minimum cost flow algorithm with applications
to graph drawing. In GD ’96: Proceedings of the Symposium on Graph Drawing,
pages 201–216, 1997.
[82] M. T. Gastner and M. E. J. Newman. Diffusion-based method for producing
density-equalizing maps. Proc. Nat. Acad. Sci. USA, 101(20):7499–7504, 2004.
[83] J. Georgii, F. Echtler, and R. Westermann. Interactive simulation of deformable
bodies on GPUs. In SimVis, pages 247–258, 2005.
[84] N. Goodnight, R. Wang, C. Woolley, and G. Humphreys. Interactive time-
dependent tone mapping using programmable graphics hardware. In Eurographics
Symposium on Rendering, pages 1–13, 2003.
[85] N. Goodnight, C. Woolley, G. Lewin, D. Luebke, and G. Humphreys. A multigrid
solver for boundary value problems using programmable graphics hardware. In
SIGGRAPH/Eurographics Workshop on Graphics Hardware, pages 102–111, 2003.
[86] C. Gorg, P. Birke, M. Pohl, and S. Diehl. Dynamic graph drawing of sequences of
orthogonal and hierarchical graphs. In Proc. 12th Int. Symp. Graph Drawing, GD,
volume 3383 of LNCS, pages 228–238, 2004.
[87] N. K. Govindaraju, J. Gray, R. Kumar, and D. Manocha. GPUTerasort: high
performance graphics co-processor sorting for large database management. In SIG-
MOD Conference, pages 325–336, 2006.
[88] N. K. Govindaraju, B. Lloyd, W. Wang, M. Lin, and D. Manocha. Fast computation
of database operations using graphics processors. In Proceedings of the 2004 ACM
SIGMOD International Conference on Management of Data, pages 215–226, 2004.
[89] GPGPU. http://www.gpgpu.org.
[90] C. Gutwenger and P. Mutzel. Planar polyline drawings with good angular resolu-
tion. In Proc. 6th Int. Symp. Graph Drawing, GD, volume 1547 of Lecture Notes
in Computer Science, LNCS, pages 167–182, 1998.
[91] S. Hachul and M. Junger. Drawing large graphs with a potential-field-based multi-
level algorithm. In Graph Drawing, pages 285–295, 2004.
[92] S. Hachul and M. Junger. An experimental comparison of fast algorithms for draw-
ing general large graphs. In Graph Drawing, volume 3843 of LNCS, pages 235–250,
2005.
[93] T. Hakamata, T. Caudell, and E. Angel. Force-directed graph layout using the GPU.
In Supercomputing ’06 Workshop “General-Purpose GPU Computing: Practice And
Experience”, 2006.
[94] S. Haker, L. Zhu, A. Tannenbaum, and S. Angenent. Optimal mass transport for
registration and warping. International Journal of Computer Vision, 60(3):225–240,
2004.
[95] C. D. Hansen, J. M. Kniss, A. E. Lefohn, and R. T. Whitaker. A streaming
narrow-band algorithm: Interactive computation and visualization of level sets.
IEEE Transactions on Visualization and Computer Graphics, 10(4):422–433, 2004.
[96] D. Harel and Y. Koren. A Fast Multi-Scale Algorithm for Drawing Large Graphs.
J. Graph Algorithms Appl., 6(3):179–202, 2002.
[97] D. Harel and Y. Koren. Drawing graphs with non-uniform vertices. In Proc.
Working Conference on Advanced Visual Interfaces (AVI’02), pages 157–166. ACM
Press, 2002.
[98] D. Harel and Y. Koren. Graph drawing by high-dimensional embedding. J. Graph
Algorithms Appl, 8(2):195–214, 2004.
[99] M. J. Harris. GPU Gems: Programming Techniques, Tips, and Tricks for Real-
Time Graphics, chapter 38: Fast Fluid Dynamics Simulation on the GPU, pages
637–665. Addison-Wesley, 2004.
[100] M. J. Harris, W. Baxter, T. Scheuermann, and A. Lastra. Simulation of cloud dy-
namics on graphics hardware. In SIGGRAPH/Eurographics Workshop on Graphics
Hardware, pages 92–101, 2003.
[101] K. Hayashi, M. Inoue, T. Masuzawa, and H. Fujiwara. A layout adjustment prob-
lem for disjoint rectangles preserving orthogonal order. In S. Whitesides, editor,
Graph Drawing, volume 1547 of Lecture Notes in Computer Science, pages 183–197.
Springer, 1998.
[102] O. Holder, I. Ben-Shaul, and H. Gazit. Dynamic layout of distributed applications in
FarGo. In Proceedings of the 1999 International Conference on Software Engineering,
pages 163–173. IEEE Computer Society Press / ACM Press, 1999.
[103] O. Holder, I. Ben-Shaul, and H. Gazit. System support for dynamic layout of dis-
tributed applications. In 19th International Conference on Distributed Computing
Systems (19th ICDCS’99), Austin, Texas, May 1999. IEEE.
[104] D. R. Horn, M. Houston, and P. Hanrahan. ClawHMMER: a streaming HMMer-
search implementation. In Proceedings of the 2005 ACM/IEEE Conference on Su-
percomputing, Seattle, Washington, 2005.
[105] M. L. Huang and P. Eades. A fully animated interactive system for clustering
and navigating huge graphs. In Proc. 6th Int. Symp. Graph Drawing (GD 1998),
number 1547 in LNCS, pages 374–383, 1998.
[106] X. Huang, W. Lai, A. S. M. Sajeev, and J. Gao. A new algorithm for removing
node overlapping in graph visualization. Inf. Sci., 177(14):2821–2844, 2007.
[107] ILOG JViews Diagrammer, 2008. Currently available at
http://www.ilog.com/products/jviews/graphlayout.
[108] T. Jansen, B. von Rymon-Lipinski, N. Hanssen, and E. Keeve. Fourier volume ren-
dering on the GPU using a split-stream-FFT. In Vision, modeling and visualization,
pages 395–403, 2004.
[109] X. Jin, S. Chen, and X. Mao. Computer-generated marbling textures: A GPU-based
design system. IEEE Computer Graphics and Applications, 27(2):78–84, 2007.
[110] A. Joseph, R. Dar, and Y. Almog. Active Market Project Report, 2000. Available
at http://softlab.technion.ac.il/project/amarket/html/home.htm.
[111] M. Junger and P. Mutzel, editors. Graph Drawing Software. Springer-Verlag, 2003.
[112] T. Kamada. Visualizing Abstract Objects and Relations. World Scientific, 1989.
[113] T. Kamada and S. Kawai. An algorithm for drawing general undirected graphs.
Information Processing Letters, 31(1):7–15, 1989.
[114] G. Kant. Drawing planar graphs using the canonical ordering. Algorithmica,
16(1):4–32, July 1996.
[115] L. V. Kantorovich. On a problem of Monge. Uspekhi Mat. Nauk., 3(2):225–226,
1948.
[116] M. Kaufmann and D. Wagner, editors. Drawing Graphs: Methods and Models.
2001.
[117] M. Kaufmann and R. Wiese. Maintaining the mental map for circular drawings.
In Graph Drawing, 10th International Symposium, volume 2528 of Lecture Notes in
Computer Science, pages 12–22, 2002.
[118] P. Kipfer, M. Segal, and R. Westermann. Uberflow: A GPU-based particle en-
gine. In Eurographics/SIGGRAPH Workshop on Graphics Hardware, pages 115–
122, 2004.
[119] M. Knott and C. S. Smith. On the optimal mapping of distributions. Journal of
Optimization Theory and Applications, 43(1):39–49, 1984.
[120] J. A. Kohl and G. A. Geist. The PVM 3.4 tracing facility and XPVM 1.1. In
H. El-Rewini and B. D. Shriver, editors, Proceedings of the Twenty-Ninth Hawaii
International Conference on System Sciences (HICSS-29), volume 1, pages 290–
299. IEEE Computer Society Press, 1996.
[121] Y. Koren. Drawing graphs by eigenvectors: Theory and practice. In Computers
and Mathematics with Applications, volume 45, pages 1867–1888. Elsevier, 2005.
[122] Y. Koren, L. Carmel, and D. Harel. ACE: A fast multiscale eigenvectors com-
putation for drawing huge graphs. In INFOVIS, pages 137–144. IEEE Computer
Society, 2002.
[123] Y. Koren, L. Carmel, and D. Harel. Drawing huge graphs by algebraic multigrid
optimization. Multiscale Modeling & Simulation, 1(4):645–673, 2003.
[124] E. Kraemer and J. Stasko. The visualization of parallel systems: an overview.
Journal of Parallel and Distributed Computing, 18(2):105–117, 1993.
[125] E. Kraemer and J. T. Stasko. Creating an accurate portrayal of concurrent execu-
tions. IEEE Concurrency, 6(1):36–46, 1998.
[126] J. Kruger and R. Westermann. Acceleration techniques for GPU-based volume
rendering. In IEEE Visualization, pages 287–292, 2003.
[127] J. Kruger and R. Westermann. Linear algebra operators for GPU implementa-
tion of numerical algorithms. In Proc. ACM SIGGRAPH, volume 22(3) of ACM
Transactions on Graphics, pages 908–916, 2003.
[128] G. Kumar and M. Garland. Visual exploration of complex time-varying graphs.
IEEE Trans. on Visualization and Computer Graphics, Proc. InfoVis, 2006.
[129] L. Lamport. Time, clocks, and the ordering of events in a distributed system. In
Communications of the ACM, pages 558–565, July 1978.
[130] D. Lange and M. Oshima. Seven Good Reasons for Mobile Agents. Communications
of the ACM, 42(3):88–89, 1999.
[131] E. S. Larsen and D. McAllister. Fast matrix multiplies using graphics hardware.
In ACM / IEEE Supercomputing, page 55, 2001.
[132] Y.-Y. Lee, C.-C. Lin, and H.-C. Yen. Mental Map Preserving Graph Drawing
Using Simulated Annealing, volume 60 of Conferences in Research and Practice in
Information Technology. 2006.
[133] W. Li, P. Eades, and N. Nikolov. Using spring algorithms to remove node over-
lapping. In Asia Pacific Symposium on Information Visualisation (APVIS2005),
volume 45 of CRPIT, pages 131–140, 2005.
[134] E. Lindholm, M. J. Kilgard, and H. Moreton. A user-programmable vertex engine.
In SIGGRAPH 2001, Computer Graphics Proceedings, pages 149–158, 2001.
[135] W. Liu, B. Schmidt, G. Voss, and W. Muller-Wittig. Molecular dynamics simula-
tions on commodity GPUs with CUDA. In HiPC, volume 4873 of Lecture Notes in
Computer Science, pages 185–196, 2007.
[136] Y. Liu, X. Liu, and E. Wu. Real-time 3D fluid simulation on GPU with complex
obstacles. In Pacific Conference on Computer Graphics and Applications, pages
247–256, 2004.
[137] P. Ljung. Adaptive sampling in single pass, GPU-based raycasting of multiresolu-
tion volumes. In Eurographics/IEEE VGTC Workshop on Volume Graphics, pages
39–46, Boston, Massachusetts, USA, 2006.
[138] K. A. Lyons, H. Meijer, and D. Rappaport. Algorithms for cluster busting in
anchored graph drawing. J. Graph Algorithms and Applications, 2(1):1–24, 1998.
[139] S. A. Manavski and G. Valle. CUDA compatible GPU cards as efficient hardware
accelerators for Smith-Waterman sequence alignment. BMC Bioinformatics, 9(Suppl
2):S10, 2008.
[140] W. R. Mark, R. S. Glanville, K. Akeley, and M. J. Kilgard. Cg: a system for pro-
gramming graphics hardware in a C-like language. ACM Transactions on Graphics,
22(3):896–907, July 2003.
[141] K. Marriott, P. J. Stuckey, V. Tam, and W. He. Removing node overlapping in
graph layout using constrained optimization. Constraints, 8(2):143–171, 2003.
[142] M. D. McCool, S. D. Toit, T. Popa, B. Chan, and K. Moule. Shader algebra. ACM
Transactions on Graphics, 23(3):787–795, 2004.
[143] D. Merrick and J. Gudmundsson. Increasing the readability of graph drawings with
centrality-based scaling. In Asia Pacific Symposium on Information Visualisation
(APVIS2006), volume 60 of CRPIT, pages 67–76, 2006.
[144] D. Milojicic, F. Douglis, and R. Wheeler, editors. Mobility: Processes, Computers
and Agents. ACM Press, 1999.
[145] K. Misue, P. Eades, W. Lai, and K. Sugiyama. Layout adjustment and the mental
map. J. Visual Languages and Computing, 6(2):183–210, 1995.
[146] J. Moe and D. A. Carr. Understanding distributed systems via execution trace
data. In International Workshop on Program Comprehension, pages 60–69. IEEE
Computer Society Press, 2001.
[147] A. S. Montemayor, R. Cabido, J. J. Pantrigo, and B. R. Payne. Bandwidth-
improved GPU particle filter for visual tracking. In 3rd Ibero-American Symposium
on Computer Graphics, SIACG, pages 874–881, 2006.
[148] J. Montrym and H. P. Moreton. The GeForce 6800. IEEE Micro, 25(2):41–51, 2005.
[149] J. Moody, D. McFarland, and S. Bender-deMoll. Dynamic network visualization.
American Journal of Sociology, 110(4):1206–1241, 2005.
http://www.journals.uchicago.edu/AJS/journal/issues/v110n4/080349/080349.html.
[150] K. Moreland and E. Angel. The FFT on a GPU. In SIGGRAPH/Eurographics
Workshop on Graphics Hardware, pages 112–119, 2003.
[151] Y. Moses, Z. Polunsky, A. Tal, and L. Ulitsky. Algorithm visualization for dis-
tributed environments. Journal of Visual Languages and Computing, 15(1):97–123,
2004.
[152] T. M. Newcomb. The acquaintance process. Holt, Rinehart and Winston, 1961.
[153] H. Nguyen, editor. GPU Gems 3. Addison-Wesley, 2007.
[154] T. Nishizeki and M. S. Rahman. Planar graph drawing. World Scientific, 2004.
[155] S. C. North. Incremental layout in DynaDAG. In Proc. 3rd Int. Symp. Graph Drawing,
number 1027 in LNCS, pages 409–418, 1995.
[156] NVIDIA. CUDA: Compute unified device architecture.
http://www.nvidia.com/object/cuda_home.html.
[157] L. Nyland, M. Harris, and J. Prins. The rapid evaluation of potential fields using
programmable graphics hardware. In ACM Workshop on General Purpose Com-
puting on Graphics Hardware, 2004.
[158] Object Management Group. The Common Object Request Broker: Architecture
and Specification. Revision 2.2, February 1998.
[159] ObjectSpace. ObjectSpace Voyager Core Package: Technical Overview, December
1997.
[160] K. Oh and K. Jung. GPU implementation of neural networks. Pattern Recognition,
37(6):1311–1314, June 2004.
[161] C. O’Reilly, D. Bustard, and P. Morrow. The war room command console - shared
visualizations for inclusive team coordination. In ACM Symposium on Software
Visualization, pages 57–65, May 2005.
[162] J. Owens, M. Houston, D. Luebke, S. Green, J. Stone, and J. Phillips. GPU
computing. Proceedings of the IEEE, 96(5):879–899, 2008.
[163] J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Kruger, A. E. Lefohn, and
T. J. Purcell. A survey of general-purpose computation on graphics hardware. In
Eurographics, pages 21–51, 2005.
[164] J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Kruger, A. E. Lefohn,
and T. J. Purcell. A survey of general-purpose computation on graphics hardware.
Computer Graphics Forum, 26(1):80–113, 2007.
[165] V. Pande. Folding@home on ATI GPU’s, 2006.
http://folding.stanford.edu/FAQ-ATI.html.
[166] A. Papakostas and I. G. Tollis. Orthogonal drawing of high degree graphs with small
area and few bends. In WADS: 5th Workshop on Algorithms and Data Structures,
1997.
[167] W. De Pauw, E. Jensen, N. Mitchell, G. Sevitsky, J. Vlissides, and J. Yang. Visualizing
the execution of Java programs. In S. Diehl, editor, Proceedings of the
International Seminar on Software Visualization, number 2269 in Lecture Notes in
Computer Science, LNCS, pages 151–162. Springer-Verlag, 2001.
[168] M. Peercy, M. Segal, and D. Gerstmann. A performance-oriented data parallel virtual
machine for GPUs. In ACM SIGGRAPH sketches. ACM Press, 2006.
[169] M. Pharr and R. Fernando, editors. GPU Gems 2 : Programming Techniques for
High-Performance Graphics and General-Purpose Computation. 2005.
[170] A. Pothen, H. Simon, and K. Liou. Partitioning sparse matrices with eigenvectors
of graphs. SIAM J. Matrix Anal. and Appl., 11:430–452, 1990.
[171] T. J. Purcell, I. Buck, W. R. Mark, and P. Hanrahan. Ray tracing on programmable
graphics hardware. ACM Transactions on Graphics, 21(3):703–712, 2002.
[172] A. J. Quigley and P. Eades. FADE: Graph drawing, clustering, and visual abstrac-
tion. In Graph Drawing, number 1984 in LNCS, pages 197–210, 2000.
[173] D. A. Reed, R. A. Aydt, R. J. Noe, P. C. Roth, K. A. Shields, B. W. Schwartz,
and L. F. Tavera. Scalable Performance Analysis: The Pablo Performance Analysis
Environment. In Proceedings of Scalable Parallel Libraries Conference, pages 104–
113. IEEE Computer Society, 1993.
[174] T. Rehman, E. Haber, G. Pryor, J. Melonakos, and A. Tannenbaum. 3D nonrigid
registration via optimal mass transport on the GPU. Accepted - Elsevier Journal
of Medical Image Analysis, 2008.
[175] E. M. Reingold and J. S. Tilford. Tidier drawings of trees. IEEE Trans. on Softw.
Eng., 7(2):223, Mar. 1981.
[176] N. Rober, U. Kaminski, and M. Masuch. Ray acoustics using computer graphics
technology. In 10th International Conference on Digital Audio Effects (DAFx-07),
pages 117–124, 2007.
[177] M. Rumpf and R. Strzodka. Level set segmentation in graphics hardware. In
International Conference on Image Processing, pages 1103–1106, 2001.
[178] S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W.-m.
W. Hwu. Optimization principles and application performance evaluation of a
multithreaded GPU using CUDA. In Proceedings of the 13th ACM SIGPLAN
Symposium on Principles and Practice of Parallel Programming, PPOPP, pages
73–82, 2008.
[179] T. Scheuermann and J. Hensley. Efficient histogram generation using scattering on
GPUs. In B. Gooch and P.-P. J. Sloan, editors, SI3D, pages 33–37. ACM, 2007.
[180] W. Schnyder. Embedding planar graphs on the grid. In SODA ’90: Proceedings
of the first annual ACM-SIAM symposium on Discrete algorithms, pages 138–148,
Philadelphia, PA, USA, 1990. Society for Industrial and Applied Mathematics.
[181] M. Segal and K. Akeley. The OpenGL graphics system: A specification, version 2.0.
www.opengl.org, 2004.
[182] S. Sengupta, M. Harris, Y. Zhang, and J. D. Owens. Scan primitives for GPU
computing. In Graphics Hardware, pages 97–106, San Diego, California, USA,
2007.
[183] E. Sharon, A. Brandt, and R. Basri. Fast multiscale image segmentation. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR-00), pages 70–77, Los Alamitos, June 13–15 2000. IEEE.
[184] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. on
PAMI, 22(8):888–905, 2000.
[185] E. Shimizu and R. Inoue. Time-distance mapping: visualization of transporta-
tion level of service. In Proc. of Symposium on Environmental Issues Related to
Infrastructure Development, pages 221–230, 2003.
[186] N. T. Spring, R. Mahajan, and D. Wetherall. Measuring ISP topologies with Rocketfuel.
In SIGCOMM, pages 133–145, 2002.
[187] J. T. Stasko and E. Kraemer. A methodology for building application-specific
visualizations of parallel programs. Journal of Parallel and Distributed Computing,
18(2):258–264, 1993.
[188] S. Stegmaier, M. Strengert, T. Klein, and T. Ertl. A simple and flexible volume ren-
dering framework for graphics-hardware-based raycasting. In Eurographics/IEEE
VGTC Workshop on Volume Graphics, pages 187–195, 2005.
[189] S. Ingram, T. Munzner, and M. Olano. Glimmer: Multilevel MDS on the GPU.
Technical Report UBC CS TR-2007-15, University of British Columbia, 2007.
[190] J. E. Stone, J. C. Phillips, P. L. Freddolino, D. J. Hardy, L. G. Trabuco, and
K. Schulten. Accelerating molecular modeling applications with graphics processors.
Journal of Computational Chemistry, 28(16):2618–2640, 2007.
[191] G. Strang. Introduction to Applied Mathematics. Wellesley-Cambridge press, 1986.
[192] K. Sugiyama. Graph Drawing and Applications for Software and Knowledge Engi-
neers. World Scientific, 2002.
[193] K. Sugiyama and K. Misue. A simple and unified method for drawing graphs:
Magnetic-spring algorithm. In Graph Drawing, volume 894 of Lecture Notes in
Computer Science, pages 364–375. DIMACS, Oct. 1994.
[194] K. Sugiyama, S. Tagawa, and M. Toda. Methods for visual understanding of hierarchical
system structures. IEEE Transactions on Systems, Man, and Cybernetics,
SMC-11(2):109–125, Feb. 1981.
[195] Sun Microsystems, Inc. Java Remote Method Invocation (RMI) Specification, De-
cember 1997.
[196] R. Tamassia. On embedding a graph in the grid with the minimum number of
bends. SIAM J. Comput., 16(3):421–444, 1987.
[197] R. Tamassia and I. Tollis. Planar grid embedding in linear time. IEEE Transactions
on Circuits and Systems, 36:1230–1234, 1989.
[198] E. Tejada and T. Ertl. Large Steps in GPU-based Deformable Bodies Simulation.
Simulation Modelling Practice and Theory, 13:703–715, 2005.
[199] I. G. Tollis, G. D. Battista, P. Eades, and R. Tamassia. Graph Drawing: Algorithms
for the Visualization of Graphs. Prentice Hall, 1999.
[200] Tom Sawyer Graph Layout Toolkit, 2004. Currently available at
http://www.tomsawyer.com.
[201] B. Topol, J. T. Stasko, and V. Sunderam. Pvanim: A tool for visualization
in network computing environments. Concurrency: Practice and Experience,
10(14):1197–1222, 1998.
[202] R. van Liere and W. C. de Leeuw. Graphsplatting: Visualizing graphs as continuous
fields. IEEE Trans. Vis. Comput. Graph, 9(2):206–212, 2003.
[203] L. Voinea, A. Telea, and J. J. van Wijk. CVSscan: Visualization of code evolution.
In ACM Symposium on Software Visualization, pages 47–56, May 2005.
[204] C. Walshaw. Graph collection.
http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/.
[205] C. Walshaw. A Multilevel Algorithm for Force-Directed Graph Drawing. J. Graph
Algorithms Appl., 7(3):253–285, 2003.
[206] X. Wang and I. Miyamoto. Generating customized layouts. In F.-J. Brandenburg,
editor, Proc. 3rd Int. Symp. Graph Drawing (GD 1995), number 1027 in Lecture
Notes in Computer Science, LNCS, pages 504–515. Springer-Verlag, 1996.
[207] Y. Wang and T. Kunz. Visualizing mobile agent executions. In E. Horlait, editor,
Second International Workshop on Mobile Agents for Telecommunication Applica-
tions (MATA 2000), number 1931 in Lecture Notes in Computer Science, LNCS,
pages 103–114. Springer-Verlag, 2000.
[208] D. S. Watkins. Fundamentals of Matrix Computations. John Wiley, 2002.
[209] R. Wiese, M. Eiglsperger, and M. Kaufmann. yFiles: Visualization and automatic
layout of graphs. In P. Mutzel, M. Junger, and S. Leipert, editors, Proc. 9th Int.
Symp. Graph Drawing (GD 2001), number 2265 in Lecture Notes in Computer
Science, LNCS, pages 453–454. Springer-Verlag, 2001.
[210] R. Wilson and R. Bergeron. Dynamic hierarchy specification and visualization. In
Proc. IEEE Symp. Information Visualization, InfoVis, pages 65–72, 1999.
[211] A. Wong, T. Dillon, M. Ip, and W. Lin. A generic visualization framework to
help debug mobile-object-based distributed programs running on large networks.
In WORDS, pages 240–250. IEEE Computer Society, 2001.
the problem in a form suited to implementation on a GPU or on other future parallel architectures. We showed how using the GPU yields a significant speedup in running time. The resulting algorithm is very fast and computes layouts of a quality comparable to that of the best algorithms in the field.
In Chapter 4 we present an algorithm for reducing the clutter that arises in graph layouts. In many cases the density of information in a graph layout is not uniform: while some parts of the image are sparse or even empty, other parts contain many nodes and edges. This makes it hard to extract information from the layout and uses the available screen area inefficiently. Using an evolution process that simulates the diffusion of heat in the plane, the proposed algorithm computes a new distribution of the information in the graph so that screen area is used more efficiently. The direction in which the information distribution evolves is determined by a ray-casting process, which is accelerated significantly by using the GPU. The warp of the image of the information distribution is then computed. Finally, the position of each node in the graph is updated according to the computed warp.
Chapter 5 presents an online algorithm for computing a sequence of layouts of a general dynamic graph. Beyond the requirement of an aesthetic drawing, handling dynamic graphs requires preserving the "mental map" that the user builds in their mind. Several methods are presented for reducing the amount of computation required by concentrating on the regions of the graph that have changed. The algorithm solves the dynamic layout problem by computing the layout at several levels of detail. Its central idea is to set the freedom of movement of each node according to the changes in its neighborhood in the graph. This yields a layout of the graph that is both stable and aesthetic. We also showed how the computation of the layout sequence can be accelerated on the GPU, improving running time by a factor of up to 17.
In Chapter 6 we discuss an algorithm for the online computation of a sequence of layouts of a clustered graph that changes over time. The algorithm preserves the user's mental map of the graph. As part of this research we developed several metrics for measuring the quality of a dynamic layout. The algorithm uses several tools. First, nodes that are not allowed to move and invisible edges are used to preserve the structure of the graph. Second, additional nodes are used to minimize the changes between one layout and the next. Third, weights on the edges are used to control the shape of the graph. In addition, the algorithm makes it possible to distinguish between dynamic nodes and nodes that remain fixed in place.
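The central idea summarized for Chapter 5, setting each node's freedom of movement according to the changes in its graph neighborhood, can be illustrated with a small Python sketch. This is an assumption-laden illustration and not the thesis algorithm: the names `node_mobility` and `damped_move`, the breadth-first distance measure, and the per-hop `decay` factor are all hypothetical choices.

```python
from collections import deque


def node_mobility(adjacency, changed, decay=0.5):
    """Assign each node a mobility in [0, 1] that decays with graph distance to a change.

    adjacency: dict mapping node -> iterable of neighbour nodes (undirected graph).
    changed:   nodes touched by the latest update (e.g. endpoints of new edges).
    decay:     hypothetical per-hop attenuation factor; not a value from the thesis.
    Nodes unreachable from any change get mobility 0.0 and stay fixed.
    """
    distance = {node: 0 for node in changed}
    queue = deque(changed)
    while queue:  # multi-source breadth-first search
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {node: decay ** distance[node] if node in distance else 0.0
            for node in adjacency}


def damped_move(position, proposed, mobility):
    """Move a node only a `mobility` fraction of the way to its proposed position."""
    (x, y), (px, py) = position, proposed
    return (x + mobility * (px - x), y + mobility * (py - y))


# Path graph 0 - 1 - 2 - 3 where node 0 has just gained a new edge.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
mobility = node_mobility(adjacency, changed=[0])
print(mobility)
print(damped_move((0.0, 0.0), (4.0, 2.0), mobility[2]))
```

Nodes far from any change barely move, which is one simple way to keep successive layouts stable while still letting the changed region rearrange itself.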
In Chapter 7 we present a system for visualizing mobile object systems [14, 61, 64]. In such systems, objects may move between computers while the application is running. A clustered graph is used to display both the physical layer (the placement of objects on the computers that belong to the system) and the logical layer (the relationships between objects). The system is scalable and produces a consistent visualization of the distributed system.
iii
existed in the past for familiarity with the graphics interface in order to be able to operate the GPU [26, 139, 141, 155, 167]. The combination of great computational power and flexible programmability in GPUs has paved the way for implementing a wealth of scientific applications on the graphics card [88, 161]. The architecture of the GPU differs from that of a conventional processor. Whereas on a conventional processor the challenge is to run serial code as fast as possible, the GPU excels in cases where identical steps must be performed in parallel on many pieces of data at the highest possible rate.
This thesis investigates several related problems in graph layout for information visualization. First, we focus on the problem of quickly computing a high-quality layout of a single general graph. Next, an algorithm for improving an existing layout of a graph is presented; it can be run to improve the output of any layout algorithm. We then address problems of dynamic graph layout, both for clustered graphs and for graphs with general structure. We show that the algorithms preserve the user's mental map of the graph well and produce, online, stable layouts for graphs that change over time.
Unlike images or matrices, which have an ordered and uniform structure, graphs have an irregular structure. At first glance, therefore, solving graph-related problems seems ill-suited to the computational capabilities of GPUs, which are built to perform the same sequence of operations in parallel on different pieces of data. In this research we showed that several graph computations can be carried out efficiently on GPUs, and can even produce results several times faster than is achievable on conventional processors.
This thesis presents various applications of the algorithms that were developed. We present static layouts of graphs representing networks of Internet infrastructure providers. We improve layouts of graphs from the fields of bioinformatics, circuit design and finite element meshes. We present the evolution over time of graphs representing Internet discussion groups and online social networks. In addition, we developed a software system for visualizing mobile object systems. In these systems, objects can move from computer to computer while the application is running. We present a scalable system that uses an algorithm for dynamic layout of a clustered graph in order to visualize the mobile objects.
In Chapter 3 we address the problem of computing a layout of a general undirected graph in the plane. We propose a method in which the graph is represented at several levels of detail. Using this method, a good layout can be obtained while reducing the computation time. To build the different levels of detail of the graph, the algorithm recursively partitions it into progressively smaller parts. The graph is partitioned by a spectral partitioning algorithm that operates on the Laplacian matrix of the graph, which is built from the adjacency relations between the nodes of the graph. The research uses properties of two existing layout algorithms in order to obtain an aesthetic layout. A new method is presented for converting a layout of a graph at a coarse level of detail into an initial layout of a more detailed graph. To accelerate the computation, the graph is divided into parts, and the interaction between distant parts of the graph is approximated. Since the parts produced by the algorithm are of similar size, it is possible to parallelize
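As an illustration of spectral partitioning on the graph Laplacian, the following is a minimal sketch of a single bisection step. It is not the thesis's implementation: it assumes a small, connected, undirected graph with nodes numbered `0..n-1`, uses a dense matrix with shifted power iteration to approximate the Fiedler vector (the eigenvector of the second-smallest Laplacian eigenvalue), and cuts at the median so that both parts have similar size. The function names are hypothetical.

```python
import math

def laplacian(n, edges):
    """Dense Laplacian L = D - A of an undirected simple graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][v] -= 1.0
        L[v][u] -= 1.0
        L[u][u] += 1.0
        L[v][v] += 1.0
    return L

def fiedler_vector(L, iterations=2000):
    """Approximate the Fiedler vector by power iteration on (shift*I - L)."""
    n = len(L)
    # All Laplacian eigenvalues are at most twice the maximum degree,
    # so shift*I - L has non-negative eigenvalues.
    shift = 2.0 * max(L[i][i] for i in range(n))
    x = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant eigenvector
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(yi * yi for yi in y)) or 1.0
        x = [yi / norm for yi in y]
    return x

def spectral_bisect(n, edges):
    """Split nodes into two similar-size parts by sorting on the Fiedler vector."""
    x = fiedler_vector(laplacian(n, edges))
    order = sorted(range(n), key=lambda i: x[i])
    half = n // 2
    return set(order[:half]), set(order[half:])
```

Applying `spectral_bisect` again to each part gives a recursive partition into progressively smaller pieces. For large graphs a sparse eigensolver would replace the dense power iteration used here.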
Abstract
Information visualization is the use of computer-based, interactive, graphical methods for the graphical presentation of information, especially abstract information, in order to help the user gain insights [29]. Graphical presentation makes it possible to harness the human visual system for understanding and analyzing the information. The visual system is characterized by its ability to absorb large amounts of information and to identify patterns in it. One of the central challenges in information visualization is finding ways to map abstract information to an image.
In many cases, the information to be visualized is dynamic. The challenge here is to produce a coherent series of images that "tell a story", so that the viewer can maintain his or her "mental map" of the information [139]. Assembling several static images is not sufficient for creating a visualization of dynamic information. A user viewing a dynamic visualization must be able to notice and understand the changes that occur in the information during the visualization. All of this must be done while preserving the overall structure of the representation of the information.

The graph drawing problem is one of the central problems in the field of information visualization [45, 110, 111, 115, 153, 190, 197]. Graphs are abstract mathematical objects that represent relationships between objects. A graph consists of a set of nodes and a set of edges that connect nodes. The graph drawing problem deals with finding a geometric representation of a graph. In this representation, a position in the plane is assigned to each node of the graph, and the edges are represented by connecting the nodes with curves or straight lines. There are infinitely many layouts for a given abstract graph. In a graph layout, the placement of the nodes and edges in the plane determines our ability to understand the structure of the graph and to gain insights about it.
רשתות , משולביםתכן מעגלים : מספר דוגמאות הם. לשיכון גרפים ישנם שימושים רביםמבני , גרף קריאות לפונקציות(הנדסת תוכנה , מכונות מצבים, ביואינפורמטיקה, חברתיות .רשתות תקשורת ותהליכי בקרה, )הצגת התפתחות של תוכנה, נתונים
Graphics processing units (GPUs), which form the core of a computer's graphics card, have evolved very rapidly in recent years [88, 162, 176]. In the past, the graphics processor was used only for applications directly related to graphics. As semiconductor manufacturing technology improved and the level of integration of devices on a piece of silicon increased, the computational power and flexibility of graphics processing units grew. As a result, advanced graphics algorithms that run in real time on graphics processors were implemented.
In parallel with the development of the hardware architecture of graphics processors, their programming languages have also evolved. Today, software for graphics processors can be written in high-level languages. Each new generation of processors reduces the restrictions imposed on programs running on graphics processors. New runtime environments, which are not directly tied to graphics applications, give the programmer better control over running computations on the graphics card, which improves performance and reduces the requirement that
The research was done under the supervision of Prof. Ayellet Tal in the Department of Computer Science.
I thank the Technion for its generous financial support of my studies.
Acknowledgements
Obtaining a Ph.D. is a great privilege. I would like to thank several people who helped me reach this achievement.

I would like to express my gratitude to my advisor, Prof. Ayellet Tal, for her support during the different stages of this long journey. I especially want to thank her for providing feedback and suggestions for improving the research. Thanks to her guidance, my writing skills and my ability to present topics have improved markedly, a valuable skill in its own right.

I want to thank my beloved wife Maya for her support, encouragement and understanding, especially when I was occupied with my studies and therefore could not be with her. It was very rewarding to share joyful moments with her along the way.

This achievement would not have been possible without the help, support and encouragement of my parents Miriam and Dov. I would like to thank them for everything they have done for me. Special thanks go to my mother for her active part in producing the papers and the videos that demonstrate my research. I also want to thank my brothers Etai and Ofri for their support. I would like to mention my grandmother Bela, who always helps and encourages me. I want to dedicate this research to my late grandfather Solomon, who always believed in the value of higher education.

I would like to thank Maya's parents, Kitty and Arie, and Maya's grandmother for supporting me and for providing an environment in which I could focus on my research.

Special thanks go to my friends Sivan Berkovich, Dr. Avi Steiner and Dr. Amit Mizrahi for their help and support.
Graph Drawing Algorithms in Information Visualization
Research Thesis
In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Yaniv Frishman
Submitted to the Senate of the Technion - Israel Institute of Technology
Tevet, 5769 Haifa January, 2009