Finding Dominators in Flowgraphs
Linear-Time Algorithm¹ and Experimental Study²
Loukas Georgiadis
1 joint work with Robert E. Tarjan
2 joint work with Renato F. Werneck, Robert E. Tarjan, Spyridon Triantafyllis and David I. August
Dominators in a Flowgraph
Flowgraph: G = (V, E, r); each v in V is reachable from r
v dominates w if every path from r to w includes v
Set of dominators: Dom(w) = { v | v dominates w }
Trivial dominators: ∀ w ≠ r, w, r ∈ Dom(w)
Immediate dominator: idom(w) ∈ Dom(w) – {w} and is dominated by every v in Dom(w) – {w}
Goal: Find idom(v) for each v in V (immediate dominator tree)
Applications: Program optimization, code generation, circuit testing
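The definitions above can be checked directly on a small flowgraph. The following is a minimal sketch, not one of the algorithms discussed in this talk: it tests "v dominates w" by deleting v and seeing whether w is still reachable from r. The names `reachable`, `dominator_sets`, `immediate_dominators` and the adjacency-dict format `succ` are assumptions made for illustration.

```python
from collections import deque

def reachable(succ, r, removed=None):
    """Vertices reachable from r once `removed` is deleted from the graph."""
    if r == removed:
        return set()
    seen = {r}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v != removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def dominator_sets(succ, r):
    """Dom(w) for every w: v dominates w iff deleting v cuts w off from r."""
    vertices = reachable(succ, r)
    dom = {w: {w} for w in vertices}
    for v in vertices:
        survivors = reachable(succ, r, removed=v)
        for w in vertices - survivors:
            dom[w].add(v)
    return dom

def immediate_dominators(succ, r):
    """idom(w): the proper dominator of w that every other proper dominator
    dominates, i.e. the one with the largest dominator set."""
    dom = dominator_sets(succ, r)
    return {w: max(dom[w] - {w}, key=lambda v: len(dom[v]))
            for w in dom if w != r}

# Hypothetical example flowgraph with root "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(immediate_dominators(succ, "r"))
```

This brute-force check runs n reachability searches, so it takes O(nm) time; it is useful only as a reference against which faster algorithms can be tested.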
History
1979 Lengauer and Tarjan; O(m·α(m,n)) time.
1997 Alstrup, Harel, Lauridsen and Thorup; O(n+m) time for RAM.
1998 Buchsbaum, Kaplan, Rogers and Westbrook; claimed O(n+m) for Pointer Machine. (Corrected in 2004 to work in linear time for RAM.)
2004 Georgiadis and Tarjan
• We showed that the Buchsbaum et al. algorithm runs in O(m·α(m,n)) time.
• Based on Buchsbaum et al. we gave a linear-time algorithm for Pointer Machine, simpler than Alstrup et al. (no complicated data structures).
The Lengauer-Tarjan Algorithm
Depth-First Search → DFS Tree D
We refer to the vertices by their DFS numbers:
v < w : v was visited by DFS before w
The Lengauer-Tarjan Algorithm: Semidominators
Semidominator path (SDOM-path):
P = (v_0 = v, v_1, v_2, …, v_k = w) such that v_i > w, for 1 ≤ i ≤ k-1
Semidominator:
sdom(w) = min { v | SDOM-path from v to w }
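To make the definition concrete, semidominators can be computed by brute force: walk backwards from w, passing only through vertices with DFS number greater than w, and record the smallest vertex that has an edge into the explored region. This is a sketch for illustration only (it is not how Lengauer-Tarjan computes sdom); the function names and the `succ` adjacency-dict format are assumptions.

```python
def dfs_numbers(succ, r):
    """DFS (preorder) numbers starting at 1, as on the slides."""
    number = {}
    def visit(u):
        number[u] = len(number) + 1
        for v in succ.get(u, ()):
            if v not in number:
                visit(v)
    visit(r)
    return number

def semidominators_bruteforce(succ, r):
    number = dfs_numbers(succ, r)
    pred = {v: [] for v in number}
    for u in number:
        for v in succ.get(u, ()):
            pred[v].append(u)
    sdom = {}
    for w in number:
        if w == r:
            continue
        best, seen, stack = None, set(), [w]
        while stack:
            x = stack.pop()              # x is w or a valid intermediate vertex
            for u in pred[x]:
                # u starts an SDOM-path (u, x, ..., w)
                if best is None or number[u] < number[best]:
                    best = u
                # u may itself be an intermediate vertex only if u > w
                if number[u] > number[w] and u not in seen:
                    seen.add(u)
                    stack.append(u)
        sdom[w] = best
    return sdom

# Hypothetical example: sdom("c") is "a", although idom("c") is "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(semidominators_bruteforce(succ, "r"))
```

In this example the DFS visits r, a, b, c, d, e in that order, and vertex c shows that the semidominator and the immediate dominator can differ.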
The Lengauer-Tarjan Algorithm: Overview
1. Carry out a DFS.
2. Process the vertices in reverse preorder. For vertex w, compute sdom(w).
3. Implicitly define idom(w).
4. Explicitly define idom(w) by a preorder pass.
The Lengauer-Tarjan Algorithm: Evaluate minima on tree paths
Data structure: maintain a forest F that supports the operations:
link(v, w): Add the edge (v, w) to F.
eval(v): Let r = root of the tree that contains v in F. If v = r then return v. Otherwise return any vertex with minimum sdom among the vertices u that are proper descendants of r and ancestors of v.
Initially every vertex in V is a root in F.
Simple version: n links, m evals in O(m log n).
Sophisticated version: n links, m evals in O(mα(m,n)).
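The four overview steps and the link/eval structure fit together roughly as follows. This is a hedged sketch of the simple version (path compression without balancing), written from the standard textbook description rather than taken from the authors' code. It uses 0-based DFS numbers, and `eval_` is named that way only to avoid shadowing Python's built-in `eval`; all names are illustrative.

```python
def lengauer_tarjan_simple(succ, r):
    # Step 1: DFS. order[i] is the vertex with (0-based) DFS number i.
    num, order, parent = {}, [], []
    def dfs(u, p):
        num[u] = len(order)
        order.append(u)
        parent.append(p)
        for v in succ.get(u, ()):
            if v not in num:
                dfs(v, num[u])
    dfs(r, None)
    n = len(order)
    pred = [[] for _ in range(n)]
    for u in order:
        for v in succ.get(u, ()):
            pred[num[v]].append(num[u])

    semi = list(range(n))      # semi[w]: best sdom candidate found so far
    label = list(range(n))     # label[v]: minimum-semi vertex on the compressed path above v
    ancestor = [None] * n      # forest F; None marks a root
    bucket = [[] for _ in range(n)]
    idom = [None] * n

    def compress(v):
        a = ancestor[v]
        if ancestor[a] is not None:
            compress(a)
            if semi[label[a]] < semi[label[v]]:
                label[v] = label[a]
            ancestor[v] = ancestor[a]

    def eval_(v):
        if ancestor[v] is None:
            return v
        compress(v)
        return label[v]

    # Steps 2 and 3: reverse preorder; compute sdom, then define idom implicitly.
    for w in range(n - 1, 0, -1):
        for v in pred[w]:
            semi[w] = min(semi[w], semi[eval_(v)])
        bucket[semi[w]].append(w)
        p = parent[w]
        ancestor[w] = p                      # link(p, w)
        for v in bucket[p]:
            u = eval_(v)
            idom[v] = u if semi[u] < semi[v] else p
        bucket[p] = []

    # Step 4: preorder pass turns the implicit definitions into explicit ones.
    for w in range(1, n):
        if idom[w] != semi[w]:
            idom[w] = idom[idom[w]]
    return {order[w]: order[idom[w]] for w in range(1, n)}

# Hypothetical example flowgraph with root "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(lengauer_tarjan_simple(succ, "r"))
```

The recursive DFS and `compress` keep the sketch short; a production implementation would make both iterative to avoid deep recursion on large graphs.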
The Linear-Time Algorithm
Partition D into trivial and nontrivial microtrees. [Dixon and Tarjan ’97]
Nontrivial microtree: Maximal subtree of D of size ≤ g that contains at least one leaf of D.
Trivial microtree: Single internal vertex of D.
[Figure: DFS tree D with vertices 1 to 22, partitioned into trivial and nontrivial microtrees for g = 3.]
Core C: Tree D – nontrivial microtrees; has ≤ n/g leaves.
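The partition can be sketched in a few lines. The sketch assumes one particular reading of the definition, which the slides do not spell out: a vertex v roots a nontrivial microtree when its subtree in D has at most g vertices while its parent's subtree has more than g, and every vertex whose subtree has more than g vertices belongs to the core. The function name and the `children` dict format are hypothetical.

```python
def partition_microtrees(children, root, g):
    """children: dict mapping each vertex of the DFS tree D to its child list.
    Returns (core, microtree_of, core_leaves) under the assumed reading:
    size(v) > g  <=>  v is a core vertex (a trivial microtree)."""
    parent, order = {root: None}, [root]
    for v in order:                       # list grows while iterating: BFS over D
        for c in children.get(v, ()):
            parent[c] = v
            order.append(c)
    size = {}
    for v in reversed(order):             # children are handled before parents
        size[v] = 1 + sum(size[c] for c in children.get(v, ()))

    core = {v for v in order if size[v] > g}
    microtree_of = {}                     # non-core vertex -> root of its microtree
    for v in order:                       # parents first, so roots propagate down
        if v in core:
            continue
        p = parent[v]
        microtree_of[v] = v if (p is None or p in core) else microtree_of[p]

    core_leaves = {v for v in core
                   if not any(c in core for c in children.get(v, ()))}
    return core, microtree_of, core_leaves

# Hypothetical tree with 6 vertices and g = 2.
children = {1: [2, 3], 2: [4, 5], 3: [6]}
core, microtree_of, core_leaves = partition_microtrees(children, 1, 2)
print(core, microtree_of, core_leaves)
```

Under this reading each core leaf has more than g descendants, and the subtrees of distinct core leaves are disjoint, which is where the n/g bound on the number of core leaves comes from.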
Line: Path (v_1 = s, v_2, …, v_k = t) in C such that outdegree_C(v_i) = 1 for 1 ≤ i ≤ k-1, and outdegree_C(v_k) = 0 or > 1.
There are L ≤ 2n/g lines. Contract each line into a single vertex to obtain a tree C’ with L nodes.
[Figure: tree C’ with contracted vertices {1, 2, 3}, {4, 7, 8} and {15, 17}.]
Extend the definition of semidominators for the vertices of the nontrivial microtrees [Buchsbaum et al.]:
Pushed external dominator path (PXDOM-path): P = (v_0 = v, v_1, v_2, …, v_k = w) such that v_i ≥ root of microtree of w, for 1 ≤ i ≤ k-1.
Pushed external dominator: pxdom(w) = min { v | PXDOM-path from v to w }
For any vertex w of the core C
pxdom(w) = sdom(w)
The Linear-Time Algorithm: Overview
1. Compute internal dominators in each nontrivial microtree.
2. Compute pxdoms in each nontrivial microtree t by link and eval on C’ and Nearest Common Ancestor (NCA) queries on a tree built by the sdom values of the line that contains the parent of the root of t.
3. Compute sdoms in each line l by a top-down pass using link and eval on C’ and contracting connected components in l.
Remarks: link and eval run in linear time on C’. Buchsbaum et al. claimed that link and eval run in linear time on C, but the claim is false.
The Iterative Algorithm: Set-based
Dominators can be computed by iteratively solving the set of equations [Allen and Cocke, 1972]
Dom(v) = ( ⋂_{u ∈ pred(v)} Dom(u) ) ∪ {v}, v ≠ r
Initialization
Dom(r) = {r}
Dom(v) = ∅, v ≠ r
In the intersection we consider only the nonempty Dom(u).
Each Dom(v) set can be represented by an n-bit vector. Intersection becomes bit-wise AND.
Requires n² space. Very slow in practice.
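As a rough illustration of the bit-vector formulation, the sketch below stores each Dom(v) as a Python integer used as a bitset, starts from the initialization given above, and skips predecessors whose set is still empty. The function name, the BFS vertex order and the `succ` dict format are choices made for this example, not part of the original presentation.

```python
def dominators_bitvector(succ, r):
    # Number the reachable vertices in BFS order; bit i stands for vertices[i].
    vertices, index = [r], {r: 0}
    for u in vertices:                    # list grows while we iterate over it
        for v in succ.get(u, ()):
            if v not in index:
                index[v] = len(vertices)
                vertices.append(v)
    pred = {v: [] for v in vertices}
    for u in vertices:
        for v in succ.get(u, ()):
            pred[v].append(u)

    dom = {v: 0 for v in vertices}        # 0 encodes the empty set
    dom[r] = 1 << index[r]                # Dom(r) = {r}
    changed = True
    while changed:
        changed = False
        for v in vertices[1:]:
            meet = -1                     # all bits set: identity element for AND
            for u in pred[v]:
                if dom[u]:                # only nonempty Dom(u) take part
                    meet &= dom[u]
            if meet == -1:                # no predecessor has been reached yet
                continue
            new = meet | (1 << index[v])  # union with {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return {v: {u for u in vertices if (dom[v] >> index[u]) & 1}
            for v in vertices}

# Hypothetical example flowgraph with root "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(dominators_bitvector(succ, "r"))
```

Each of the n vertices keeps an n-bit integer, which is the n² space mentioned above.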
The Iterative Algorithm: Tree-based
Efficient implementation [Cooper, Harvey and Kennedy 2000]
dfs(r)
T ← {r}
changed ← true
while ( changed ) do
    changed ← false
    for all v in V – r in reverse postorder do
        x ← nca(pred(v))
        if x ≠ parent(v) then
            parent(v) ← x
            changed ← true
        end
    done
done
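The pseudocode above might be realized in Python along these lines. This is a sketch in the spirit of Cooper, Harvey and Kennedy, not their code: nca of two vertices is found by a two-finger walk that climbs the current tree by postorder number, and predecessors not yet in the tree T are skipped. The names `dominators_iterative` and `intersect` are illustrative.

```python
def dominators_iterative(succ, r):
    # DFS from r, recording postorder numbers.
    post = {}
    visited = set()
    def dfs(u):
        visited.add(u)
        for v in succ.get(u, ()):
            if v not in visited:
                dfs(v)
        post[u] = len(post)
    dfs(r)
    rpo = sorted(post, key=post.get, reverse=True)   # reverse postorder; rpo[0] == r
    pred = {v: [] for v in rpo}
    for u in rpo:
        for v in succ.get(u, ()):
            pred[v].append(u)

    parent = {r: r}                  # tree T, initially just the root

    def intersect(a, b):
        """nca of a and b in T: climb from whichever has the smaller postorder number."""
        while a != b:
            while post[a] < post[b]:
                a = parent[a]
            while post[b] < post[a]:
                b = parent[b]
        return a

    changed = True
    while changed:
        changed = False
        for v in rpo[1:]:
            x = None
            for u in pred[v]:
                if u in parent:      # only predecessors already in T
                    x = u if x is None else intersect(u, x)
            if parent.get(v) != x:
                parent[v] = x
                changed = True
    del parent[r]                    # the root has no immediate dominator
    return parent

# Hypothetical example flowgraph with root "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(dominators_iterative(succ, "r"))
```

Because vertices are visited in reverse postorder, every vertex other than r has at least one predecessor (its DFS parent) already in T when it is first processed.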
The Iterative Algorithm: Running Time
Each pairwise intersection takes O(n) time.
The number of iterations is ≤ d + 3. [Kam and Ullman ’76]
d = max #back-edges in any cycle-free path of G = O(n)
Running time = O(mn²)
This bound is tight, but very pessimistic in practice.
The Iterative Algorithm: Generic Tree-based
T ← T0 /* a spanning (sub)tree of G */
changed ← true
while ( changed ) do
    changed ← false
    for all v in V – r in the chosen order do
        x ← nca(pred(v))
        if x ≠ parent(v) then
            parent(v) ← x
            changed ← true
        end
    done
done
Good choices (in practice): T0 = a Breadth-First Search (BFS) tree; order = BFS order
A Hybrid Algorithm
Lemma: For any vertex w ≠ r,
idom(w) = NCA( I, parent(w), sdom(w) ).
I = (immediate) dominator tree
parent(w) = parent of w in the DFS tree D
SEMI-NCA:
1. Compute sdoms as in simple version of LT.
2. Construct I incrementally applying Lemma.
(NCA calculations implemented naïvely)
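A possible rendering of SEMI-NCA, under the same assumptions as a textbook simple Lengauer-Tarjan: semidominators come from link/eval with path compression, and the dominator tree I is then built in preorder by climbing from parent(w) until a vertex no larger than sdom(w) is reached, which is the naive NCA computation. Names and the 0-based DFS numbering are illustrative choices.

```python
def semi_nca(succ, r):
    # DFS numbering (0-based); order[i] is the vertex with DFS number i.
    num, order, parent = {}, [], []
    def dfs(u, p):
        num[u] = len(order)
        order.append(u)
        parent.append(p)
        for v in succ.get(u, ()):
            if v not in num:
                dfs(v, num[u])
    dfs(r, None)
    n = len(order)
    pred = [[] for _ in range(n)]
    for u in order:
        for v in succ.get(u, ()):
            pred[num[v]].append(num[u])

    semi, label, ancestor = list(range(n)), list(range(n)), [None] * n

    def compress(v):
        a = ancestor[v]
        if ancestor[a] is not None:
            compress(a)
            if semi[label[a]] < semi[label[v]]:
                label[v] = label[a]
            ancestor[v] = ancestor[a]

    def eval_(v):                          # named eval_ to avoid the built-in eval
        if ancestor[v] is None:
            return v
        compress(v)
        return label[v]

    # 1. Semidominators, in reverse preorder, as in the simple version of LT.
    for w in range(n - 1, 0, -1):
        for v in pred[w]:
            semi[w] = min(semi[w], semi[eval_(v)])
        ancestor[w] = parent[w]            # link(parent(w), w)

    # 2. Build I in preorder: idom(w) = NCA(I, parent(w), sdom(w)).
    idom = [None] * n
    for w in range(1, n):
        x = parent[w]
        while x > semi[w]:                 # climb I from parent(w) towards sdom(w)
            x = idom[x]
        idom[w] = x
    return {order[w]: order[idom[w]] for w in range(1, n)}

# Hypothetical example flowgraph with root "r".
succ = {"r": ["a", "e"], "a": ["b", "d"], "b": ["c"], "d": ["c"], "e": ["b"]}
print(semi_nca(succ, "r"))
```

The climb works with DFS numbers alone because every ancestor in I of a vertex has a smaller DFS number, so the first vertex at or below sdom(w) on the path from parent(w) to the root of I is the required nearest common ancestor.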
Experimental Results: Algorithms
• SLT: simple version of Lengauer-Tarjan
• LT: almost-linear-time version of Lengauer-Tarjan
• IDFS: DFS tree-based iterative
• IBFS: BFS tree-based iterative
• SNCA: SEMI-NCA
Experimental Results: Inputs
• Control-flow graphs from SPEC ’95 generated by the SUIF compiler (Stanford).
  > 4900 graphs, avg #vertices ~ 40, #edges ~ 55
  max #vertices ~ 2100, #edges ~ 3200
• Control-flow graphs from SPEC ’00 generated by the IMPACT compiler (UIUC).
  > 2000 graphs, avg #vertices ~ 25, #edges ~ 70
  max #vertices ~ 580, #edges ~ 3100
• VLSI circuits from ISCAS ’89 suite.
  50 graphs, avg #vertices ~ 3200, #edges ~ 5000
  max #vertices ~ 24000, #edges ~ 34000
Experimental Results: Times relative to BFS (geometric mean and geometric standard deviation)

            IDFS          IBFS          LT            SLT           SNCA
            mean  dev     mean  dev     mean  dev     mean  dev     mean  dev
CIRCUITS    5.89  1.19    6.17  1.42    6.71  1.18    4.62  1.15    4.40  1.14
SUIF-INT    2.45  1.50    2.25  1.62    3.69  1.40    2.48  1.33    2.73  1.45
IMPACT      2.60  1.65    2.24  1.77    4.02  1.40    2.74  1.33    2.56  1.31
IMPACTP     2.58  1.63    2.25  1.82    3.84  1.44    2.61  1.30    2.52  1.29
Experimental Results: Iterations and comparisons per vertex

                      iterations          comparisons per vertex
            SDP(%)    IDFS     IBFS       IDFS   IBFS   LT     SLT    SNCA
CIRCUITS    76.7      2.8000   3.2000     32.6   39.3   12.0   9.9    8.9
IMPACT      73.4      2.0686   1.4385     30.9   28.0   15.6   12.8   11.1
IMPACTP     88.6      2.0819   1.5376     30.2   32.2   15.5   12.3   10.9
SUIF-INT    63.9      2.0009   1.6659     14.9   17.2   11.2   8.6    7.2

SDP = percentage of vertices v that have parent(v) = sdom(v)
Experimental Results: Relative Running Times per Instance Size
[Figure: mean relative running time (w.r.t. BFS), from 0 to 10, plotted against log of instance size, from 1 to 10, for BFS, IDFS, IBFS, LT, SLT and SNCA.]