I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis
Pankaj K. Agarwal, Lars Arge, and Ke Yi
Duke University and University of Aarhus
The Union-Find Problem
• A universe of N elements: x1, x2, …, xN
• Initially N singleton sets: {x1}, {x2}, …, {xN}
• Each set has a representative
• Maintain the partition under
  – Union(xi, xj): joins the sets containing xi and xj
  – Find(xi): returns the representative of the set containing xi
The Solution
[Figure: each set is stored as a rooted tree whose root is the set's representative. Union(d, h) joins two trees using link-by-rank; Find(n) walks to the root and applies path compression.]
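The two heuristics named on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the authors' code; the class name `UnionFind` and its method names are assumptions made for the example.

```python
class UnionFind:
    """In-memory disjoint-set forest (illustrative sketch, names are hypothetical)."""

    def __init__(self, elements):
        # Every element starts as the root (representative) of its own tree.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: both elements are already in one set
        # Link-by-rank: the root of smaller rank becomes a child of the other.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind("dhn")
uf.union("d", "h")
print(uf.find("d") == uf.find("h"))  # d and h now share a representative
```

Every `find` chases pointers to arbitrary locations, which is why this structure behaves well only while it fits in main memory.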
Complexity
• O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
  – α(·): inverse Ackermann function (grows very slowly!)
  – Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
• Batched (off-line) version
  – Entire sequence known in advance
  – Can be improved to linear on a RAM [Gabow and Tarjan 85]
  – Not possible on a pointer machine [Tarjan 79]
Simple and Good, as long as …
The entire data structure fits in memory
The I/O Model
Main memory of size M
Disk of infinite size
One I/O transfers B items between memory and disk
Our Results
• An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os expected
  – Same as sorting
  – Optimal in the worst case
• A practical algorithm using O(sort(N) log(N/M)) I/Os
• Applications to terrain analysis
  – Topological persistence: O(sort(N)) I/Os
  – Contour trees: O(sort(N)) I/Os
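To get a feel for the sort(N) bound, the sketch below evaluates (N/B) log_{M/B}(N/B) without its constant factor. The function name `sort_io_bound`, the clamp of the logarithm to at least 1, and the sample values of N, M, and B (counted in items, not bytes) are assumptions chosen for illustration.

```python
import math


def sort_io_bound(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring the constant hidden by O(.)."""
    blocks = n / b    # number of disk blocks occupied by the input
    fan_out = m / b   # number of blocks that fit in main memory
    # Clamp the log at 1 (an assumption of this sketch): reading the input
    # once already costs N/B I/Os.
    return blocks * max(1.0, math.log(blocks, fan_out))


n, m, b = 2**30, 2**25, 2**15          # hypothetical sizes, in items
print(sort_io_bound(n, m, b))           # roughly 1.5 * N/B
print(n)                                # one I/O per operation, for comparison
```

With these sample values the bound is several orders of magnitude below N, the cost of performing one I/O per operation.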
I/O-Efficient Batched Union-Find
• Assumption: no redundant unions
  – Each union must join two different sets
  – This assumption will be removed later
• Two-stage algorithm
  – Convert to interval union-find
    • Compute an order on the elements s.t. each union joins two adjacent sets
  – Solve batched interval union-find
Union Graph
1: Union(d, g)
2: Union(a, c)
3: Union(r, b)
4: Union(a, e)
5: Union(e, i)
6: Union(r, a)
7: Union(a, d)
8: Union(d, h)
9: Union(b, f)
[Figure: the union graph on the elements r, a, b, c, d, e, f, g, h, i, with one edge per union labeled by its time stamp 1–9, shown next to an equivalent union tree.]
Equivalent union trees
(Tree if no redundant unions)
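As a small illustration, the union graph of the nine example operations can be built and checked for being a tree. This is an in-memory sketch; `build_union_graph` and `is_tree` are hypothetical helper names, and the connectivity check assumes, as in this example, that the unions eventually merge all elements into one set.

```python
unions = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]


def build_union_graph(unions):
    """Adjacency lists; the i-th union (1-based) becomes an edge with time stamp i."""
    graph = {}
    for t, (x, y) in enumerate(unions, start=1):
        graph.setdefault(x, []).append((y, t))
        graph.setdefault(y, []).append((x, t))
    return graph


def is_tree(graph):
    """A connected graph with exactly (nodes - 1) edges is a tree."""
    n_edges = sum(len(adj) for adj in graph.values()) // 2
    if n_edges != len(graph) - 1:
        return False
    start = next(iter(graph))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v, _ in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(graph)


print(is_tree(build_union_graph(unions)))                 # no redundant unions
print(is_tree(build_union_graph(unions + [("g", "h")])))  # extra union closes a cycle
```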
Transforming the Union Tree
[Figure: the example union tree, with edges weighted by the time stamps 1–9, is transformed step by step into an equivalent union tree.]
Weights along each root-to-leaf path decrease
Formulating as a Batched Problem
[Figure: the original union tree next to the transformed union tree.]
For each edge, find the lowest ancestor edge with a higher weight
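The query stated on this slide can be answered naively by walking up the tree. The sketch below does this in memory for the example, with the tree rooted at r as in the figure. The dictionary layout and the name `lowest_heavier_ancestor_edge` are assumptions, and the pointer walk is only meant to make the query concrete; it is not the I/O-efficient method.

```python
# Example union tree rooted at r: child -> (parent, weight of the edge to the parent).
parent = {
    "a": ("r", 6), "b": ("r", 3),
    "c": ("a", 2), "d": ("a", 7), "e": ("a", 4),
    "f": ("b", 9), "g": ("d", 1), "h": ("d", 8), "i": ("e", 5),
}


def lowest_heavier_ancestor_edge(parent, u):
    """For the edge (u, parent of u), return the child endpoint of the lowest
    ancestor edge with a higher weight, or None if no such edge exists."""
    v, w = parent[u]
    while v in parent:           # the root has no parent edge
        if parent[v][1] > w:
            return v             # edge (v, parent of v) is heavier
        v = parent[v][0]
    return None


for u in sorted(parent):
    print(u, lowest_heavier_ancestor_edge(parent, u))
```

For instance, the edge above i has weight 5; its parent edge (e, a) has weight 4, so the walk continues to the edge (a, r) of weight 6.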
Cast in a Geometry Setting
[Figure: the example union tree and its Euler tour; each edge is drawn as a horizontal segment, labeled 1–9 by weight.]
Euler tour: in O(sort(N)) I/Os [Chiang et al. 95]
x: positions in the tour
y: weight
Cast in a Geometry Setting (cont.)
[Figure: the example union tree and the corresponding segments, labeled 1–9 by weight.]
Tree problem: for each edge, find the lowest ancestor edge with a higher weight
Geometric problem: for each segment, find the shortest segment above and containing it
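The correspondence between the tree query and the segment query can be checked on the example. In this sketch each edge becomes a segment spanning the tour positions at which the edge is first and last traversed, at height equal to its weight. The helper names and the quadratic scan are assumptions made for brevity; the slides solve the segment problem with distribution sweeping instead.

```python
children = {"r": ["a", "b"], "a": ["c", "d", "e"], "b": ["f"],
            "d": ["g", "h"], "e": ["i"]}
weight = {"a": 6, "b": 3, "c": 2, "d": 7, "e": 4, "f": 9, "g": 1, "h": 8, "i": 5}


def euler_segments(children, weight, root):
    """Map each edge (named by its child node) to (x_left, x_right, y)."""
    segments, clock = {}, 0

    def visit(u):
        nonlocal clock
        for c in children.get(u, []):
            clock += 1           # tour steps down the edge (u, c)
            left = clock
            visit(c)
            clock += 1           # tour steps back up the same edge
            segments[c] = (left, clock, weight[c])

    visit(root)
    return segments


def shortest_segment_above(segments, e):
    """Shortest segment that lies above segment e and contains its x-range."""
    left, right, y = segments[e]
    best = None
    for f, (fl, fr, fy) in segments.items():
        if fl < left and right < fr and fy > y:
            if best is None or fr - fl < segments[best][1] - segments[best][0]:
                best = f
    return best


segs = euler_segments(children, weight, "r")
for e in sorted(segs):
    print(e, segs[e], shortest_segment_above(segs, e))
```

Because the x-ranges of ancestor edges are nested, the shortest containing segment with a larger y is the lowest ancestor edge with a higher weight.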
Distribution Sweeping
[Figure: the plane divided into M/B vertical slabs, with the labels "checked here" and "checked recursively".]
Total cost: O(sort(N))
In-Order Traversal
[Figure: the transformed union tree, in which weights along each root-to-leaf path decrease, and the resulting left-to-right element order, extracted as: b r ac e i g d h f]
At u, with children u1, …, uk (in increasing order of weight):
1. Recursively visit the subtree at u1
2. Return u
3. For i = 2, …, k: recursively visit the subtree at ui
Claim: this traversal produces the right order
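The traversal and the claim can be tried out on the example. The transformed tree below is a reconstruction, not copied from the figure: it assumes each node is re-attached below the lower endpoint of the lowest ancestor edge with a higher weight, or below the root if there is none. The function names are hypothetical as well.

```python
# Reconstructed transformed tree (an assumption): node -> [(edge weight, child), ...]
children = {
    "r": [(3, "b"), (6, "a"), (7, "d"), (8, "h"), (9, "f")],
    "a": [(2, "c"), (4, "e"), (5, "i")],
    "d": [(1, "g")],
}
unions = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]


def in_order(children, u, out):
    """Visit the lightest child's subtree, then u, then the remaining subtrees."""
    kids = [c for _, c in sorted(children.get(u, []))]
    if kids:
        in_order(children, kids[0], out)
    out.append(u)
    for c in kids[1:]:
        in_order(children, c, out)
    return out


def unions_join_adjacent_sets(order, unions):
    """Check that every union merges two neighboring intervals of the order."""
    pos = {x: i for i, x in enumerate(order)}
    span = [(i, i) for i in range(len(order))]  # span[i]: interval of the set at position i
    for x, y in unions:
        (a_lo, a_hi), (b_lo, b_hi) = sorted([span[pos[x]], span[pos[y]]])
        if a_hi + 1 != b_lo:
            return False
        for i in range(a_lo, b_hi + 1):
            span[i] = (a_lo, b_hi)
    return True


order = in_order(children, "r", [])
print(" ".join(order))
print(unions_join_adjacent_sets(order, unions))
```

On this reconstruction the traversal yields an order in which all nine unions join adjacent intervals, which is the property the interval union-find stage relies on.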
Solving Interval Union-Find
[Figure: the operations plotted with the element order on the x-axis and time on the y-axis; the representative of a set is marked.]
Union: x: two operands, y: time stamp
Find: x: operand, y: time stamp
Four instances of batched ray shooting: O(sort(N))
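The geometric view can be illustrated with a naive scan. In the sketch below, the gap between two neighboring positions acts as a wall that disappears at the time stamp of the union merging across it, and a find at time t looks left and right for the nearest wall still standing. Everything here is an assumption for illustration: the element order (one order in which each example union joins adjacent sets), the names, the convention that a find at time t sees unions with smaller time stamps, and returning the whole interval rather than a representative. It does not reproduce the four ray-shooting instances from the slide.

```python
order = ["b", "r", "c", "a", "e", "i", "g", "d", "h", "f"]
# gap_removed_at[p]: time stamp of the example union that merges across the gap
# between positions p and p + 1 of the order above.
gap_removed_at = [3, 6, 2, 4, 5, 7, 1, 8, 9]


def find_interval(gap_removed_at, p, t):
    """Interval (lo, hi) of positions in the same set as position p at time t."""
    lo = hi = p
    # Shoot left: cross every gap that was already removed before time t.
    while lo > 0 and gap_removed_at[lo - 1] < t:
        lo -= 1
    # Shoot right in the same way.
    while hi < len(gap_removed_at) and gap_removed_at[hi] < t:
        hi += 1
    return lo, hi


lo, hi = find_interval(gap_removed_at, order.index("e"), 5.5)
print(order[lo:hi + 1])   # the set containing e after union 5 and before union 6
```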
Handling Redundant Unions
• Union tree becomes a general graph
• Compute the minimum spanning tree
  – O(sort(N)) I/Os (randomized) [Chiang et al. 95]
  – O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
  – Deterministic O(sort(N)) I/Os if graph is planar
  – Only MST edges are non-redundant
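A small in-memory sketch of the idea, assuming each union edge is weighted by its time stamp: Kruskal's algorithm then scans the edges in time order and keeps exactly those that connect two different components. The name `non_redundant_unions` and the four-operation sample are made up for the example, and the slides rely on the cited I/O-efficient MST algorithms rather than on this loop.

```python
def non_redundant_unions(unions):
    """Kruskal's algorithm with time stamps as weights; returns the kept time stamps."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for t, (x, y) in enumerate(unions, start=1):  # edges already sorted by weight
        rx, ry = find(x), find(y)
        if rx != ry:            # edge joins two components: an MST edge
            parent[rx] = ry
            kept.append(t)
    return kept


sample = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(non_redundant_unions(sample))   # union 3 closes a cycle and is dropped
```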
Applications
1. Topological Persistence
2. Contour Trees
Application: Topological Persistence
• Introduced by Edelsbrunner et al. 2000
• Measures the importance of features on a surface
  – Feature extraction
  – Topological de-noising
• Many applications
  – Surface modeling
  – Shape analysis
  – Terrain analysis
  – Computational biology
Topological Persistence Illustrated
Formulated as Batched Union-Find
• Represented as a triangulated mesh
• Consider minimum-saddle pairs
• On reaching
  – A minimum or maximum: do nothing
  – A regular point u: issue union(u, v) for a lower neighbor v
  – A saddle u: let v and w be nodes from the two connected pieces of u's lower link; issue find(v), find(w), union(u, v), union(u, w)
[Figure: a saddle and its lower link.]
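The pairing can be illustrated with the standard in-memory sweep on a graph, which is a simplification of the slide's setting: vertices are processed by increasing height, lower neighbors stand in for the lower link of a triangulated mesh, and unions are executed immediately instead of being collected into a batch. The function name, the requirement of distinct heights, and the tiny path graph are assumptions of this sketch.

```python
def minimum_saddle_pairs(height, edges):
    """Return (minimum, merging vertex) pairs; heights are assumed distinct."""
    neighbors = {v: [] for v in height}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    parent = {}  # forest over processed vertices; each root is the lowest vertex of its set

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    pairs = []
    for u in sorted(height, key=height.get):
        parent[u] = u
        for v in neighbors[u]:
            if v not in parent:          # v is higher than u, not processed yet
                continue
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if height[ru] > height[rv]:
                ru, rv = rv, ru          # ru is now the lower of the two roots
            if rv != u:
                pairs.append((rv, u))    # the set with the higher minimum ends at u
            parent[rv] = ru
    return pairs


height = {0: 1, 1: 5, 2: 2, 3: 6, 4: 0}       # vertex -> elevation
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]      # a path; merging vertices act as saddles
print(minimum_saddle_pairs(height, edges))
```

The persistence of a pair is the height difference between the merging vertex and the minimum it is paired with.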
Experiment 1: Random Union-Find
[Figure: experimental results with 128 MB of memory.]
Experiment 2: Topological Persistence on Terrain Data
• Neuse River Basin of North Carolina: ~0.5 billion points
• Entire data set (0.5 billion points): IM fails and EM takes 10 hours
[Figure: experimental results with 128 MB of memory.]
Contour Trees
Summary
• An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
  – Optimal in the worst case
• A practical algorithm using O(sort(N) log(N/M)) I/Os
• Applications to terrain analysis
  – Topological persistence: O(sort(N)) I/Os
  – Contour trees: O(sort(N)) I/Os
• Open question
  – On-line case: can we get below O(N α(N)) I/Os?
Thank you!
Previous Results
• Directly maintain contours
  – O(N log N) time [van Kreveld et al. 97]
  – Needs union-split-find for circular lists
  – Does not extend to higher dimensions
• Two sweeps by maintaining components, then merge
  – O(N log N) time [Carr et al. 03]
  – Extends to arbitrary dimensions
Join Tree and Split Tree
[Figure: the join tree and the split tree of an example with nodes 1–9.]
Qualified nodes
[Figure: the same join tree and split tree, redrawn without node 2.]
Final Contour Tree
[Figure: the join tree, the split tree, and the resulting contour tree on nodes 1–9.]
Hard to BATCH!
Another Characterization
[Figure: the join tree, the split tree, and the contour tree on nodes 1–9, with nodes u, v, and w marked.]
Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge
Now can BATCH!
Map to Rectangles
[Figure: the join tree and the split tree on nodes 1–9, with nodes u, v, and w marked.]
Can be solved in O(sort(N)) I/Os (practical, too)
Topological Persistence
Label Nodes with Intervals
Using Euler tour (O(sort(N)) I/Os)
[Figure: a tree on nodes 1–9 whose nodes are labeled with intervals.]