Early Global Program Optimizations Chapter 12.4-6
Mooly Sagiv
Outline
• Value Numbering
– Basic blocks
– Procedure based data flow analysis
• Sparse conditional constant propagation
Three Global Data Flow Problems
• Value Numbers: assign integer values to expressions
– e1 and e2 have the same value number (at the entry of a given block) if e1 and e2 evaluate to the same value every time control reaches this block
• Constant Propagation: assign integer values to program variables
– v has the constant value n (at the entry of a given block) if, every time control reaches this block, v has the value n
• (Formal) Common Available Sub-expression Evaluation: assign sets of available expressions
– e is available (at the entry of a given block) if, every time control reaches this block, e has already been computed
Example
read(i)
j ← i + 1
k ← i
l ← k + 1

i ← 2
j ← i * 2
k ← i + 2

read(i)
if i > 0 goto L1
j ← 2 * i
goto L2
L1: k ← 2 * i
L2: l ← 2 * i
Solutions
• General
– Develop the most precise algorithm that approximates the three problems together
• Specific
– Develop the most efficient algorithm that approximates one problem
– Apply several algorithms (multiple times)
The General Algorithm (Kildall 1973)
• An iterative algorithm
• The lattice of data flow information = “pools”: sets of classes of equivalent expressions, each class with an optional constant value
• Optimistically start with everything being available and iterate until no more expressions/values are removed
{}
read(i)
{{i}}
j ← i + 1
{{i}, {i+1, j}}
k ← i
{{i, k}, {i+1, k+1, j}}
l ← k + 1
{{i, k}, {i+1, k+1, j, l}}

{}
i ← 2
{{i, 2}}
j ← i * 2
{{i, 2}, {j, i*2, 4}}
k ← i + 2
{{i, 2}, {j, i*2, i+2, 4, k}}

{}
read(i)
{{i}}
if i > 0 goto L1
{{i}}
j ← 2 * i
{{i}, {2*i, j}}
goto L2
L1: {{i}}
k ← 2 * i
{{i}, {2*i, k}}
L2: {{i}, {2*i}}
l ← 2 * i
{{i}, {2*i, l}}
The General Algorithm (Kildall 1973)
• The lattice of data flow information
– Pool = sets of sets of equivalent expressions
– ∀p ∈ Pool, x, y ∈ p: x ≠ y ⇒ x ∩ y = {}
– ∀p ∈ Pool, x ∈ p, z1, z2 ∈ x: z1 = z2
• p1 ⊑ p2 ⇔ p2 is a refinement of p1: ∀x2 ∈ p2: ∃x1 ∈ p1: x2 ⊆ x1
• p1 ⊔ p2 = {x1 ∩ x2: x1 ∈ p1, x2 ∈ p2, x1 ∩ x2 ≠ {}}
• Init = {{}}
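The way two pools are combined at a control-flow merge (pairwise intersection of their classes, dropping empty results) can be sketched in Python. This is only an illustration: `join_pools` and `pool` are hypothetical helper names, and expressions are represented as plain strings.

```python
def pool(*classes):
    """Build a pool: a set of classes, each class a frozenset of expressions."""
    return {frozenset(c) for c in classes}


def join_pools(p1, p2):
    """Combine two pools by intersecting their classes pairwise.

    Only the equivalences that hold in both pools survive; empty
    intersections are dropped.
    """
    result = set()
    for x1 in p1:
        for x2 in p2:
            common = x1 & x2
            if common:
                result.add(common)
    return result


# Pools reaching a merge point from two branches (J and K are assigned
# 1 on one branch and 2 on the other).
then_pool = pool({"I<29"}, {"J", "K", "1"})
else_pool = pool({"I<29"}, {"J", "K", "2"})
merged = join_pools(then_pool, else_pool)
print(merged)
```

After the merge, J and K are still known to be equal to each other, but no longer to a constant.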
The effect of a statement st: x ← e on a pool P
• If e is in P, then e is redundant
• Otherwise, create a new class in P with the partial computations in the program whose operands are equivalent to those of e
• If e evaluates to a constant z, then add e to the (possibly new) class of z
• Remove all the expressions with argument x and add x to the class of e (and all its consequences)
Efficient Algorithms for Value Numbering
• For basic blocks the problem is easy (12.4.1)
• SSA can be used for extended basic blocks
• Reif & Lewis 1982: O(E α(E, E)) algorithm
• Alpern, Wegman, Zadeck (1988): O(E log E)
• Extensions for more constants: Knoop, Steffen, Rüthing 1999
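For the basic-block case, a common hash-based scheme gives every operand a value number and looks up each expression by its operator and operand numbers. The sketch below is a minimal illustration under assumptions: the tuple encoding of instructions, the `copy` pseudo-operator, and the function name `value_number_block` are all hypothetical, and commutativity is not exploited.

```python
from itertools import count


def value_number_block(instrs):
    """Value-number one basic block.

    instrs: list of (target, op, arg1, arg2); op "copy" means target <- arg1.
    Returns (value numbers of all names, targets whose value was already computed).
    """
    fresh = count()
    var_vn = {}     # variable or constant -> value number
    expr_vn = {}    # (op, vn1, vn2) -> value number
    redundant = []

    def vn_of(operand):
        # Names and constants seen for the first time get a fresh number.
        if operand not in var_vn:
            var_vn[operand] = next(fresh)
        return var_vn[operand]

    for target, op, a, b in instrs:
        if op == "copy":
            var_vn[target] = vn_of(a)
            continue
        key = (op, vn_of(a), vn_of(b))
        if key in expr_vn:
            redundant.append(target)    # same value computed earlier in the block
        else:
            expr_vn[key] = next(fresh)
        var_vn[target] = expr_vn[key]
    return var_vn, redundant


# j <- i + 1; k <- i; l <- k + 1
block = [("j", "+", "i", "1"), ("k", "copy", "i", None), ("l", "+", "k", "1")]
numbers, redundant = value_number_block(block)
print(numbers, redundant)
```

Here `l` receives the same value number as `j`, because `k` shares the number of `i`.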
Alpern, Wegman, Zadeck Algorithm
• Convert the program into SSA form
• Build a “value graph”: a directed graph representing symbolic execution of the program
• Find the “congruent” nodes by computing the coarsest partition of a set (as in automata minimization)
• Variables are detected as equivalent at a program point p if (i) the corresponding nodes are congruent and (ii) their defining assignments dominate p
A Simple Example
[Flow graph]
I<29?
  Y: J ← 1; K ← 1
  N: J ← 2; K ← 2
I<29?
  Y: L ← 1
  N: L ← 2
A Simple Example (Kildall’s Algorithm)
[Flow graph annotated with the pool before and after each block]
{{}}
I<29?
  Y: {{I<29}}  J ← 1; K ← 1  {{I<29}, {J, K, 1}}
  N: {{I<29}}  J ← 2; K ← 2  {{I<29}, {J, K, 2}}
{{I<29}, {J, K}}
I<29?
  Y: {{I<29}, {J, K}}  L ← 1  {{I<29}, {J, K}, {L, 1}}
  N: {{I<29}, {J, K}}  L ← 2  {{I<29}, {J, K}, {L, 2}}
{{I<29}, {J, K}, {L}}
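A Simple Example (Alpern, Wegman, Zadeck)
Block 1: I<29?
Block 2 (Y): J1 ← 1; K1 ← 1
Block 3 (N): J2 ← 2; K2 ← 2
Block 4: J3 ← φ4(J1, J2); K3 ← φ4(K1, K2); I<29?
Block 5 (Y): L1 ← 1
Block 6 (N): L2 ← 2
Block 7: L3 ← φ7(L1, L2)
[Figure: value graph of the SSA program. J3 and K3 are φ4 nodes and L3 is a φ7 node, each with l and r edges to the constant nodes 1 and 2; each I<29 test is a < node with edges to I and 29.]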
Global Value Graph
• Directed labeled graph
• Nodes may be labeled by
– constant numbers
– normal function symbols (no side effects)
– φ functions
• Directed edges from functions to arguments (ordered according to argument position)
Congruence
• Two nodes in the value graph are congruent if:
– the nodes have identical function labels
– the corresponding destinations of the edges leaving the nodes are congruent
• Tricky for value graphs with cycles
Loop Example
J ← 1
K ← 1
loop: I<29? (Y/N) selects between
  J ← J + 1; K ← K + 1
  J ← J + 2; K ← K + 2
both branches return to the loop test
Loop Example (SSA)
Block 1: J0 ← 1; K0 ← 1
Block 2: J1 ← φ2(J0, J4); K1 ← φ2(K0, K4); I<29?
Block 3: J2 ← J1 + 1; K2 ← K1 + 1
Block 4: J3 ← J1 + 2; K3 ← K1 + 2
Block 5: J4 ← φ5(J2, J3); K4 ← φ5(K2, K3); back to block 2
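Loop Example (Value Graph)
Block 1: J0 ← 1; K0 ← 1
Block 2: J1 ← φ2(J0, J4); K1 ← φ2(K0, K4); I<29?
Block 3: J2 ← J1 + 1; K2 ← K1 + 1
Block 4: J3 ← J1 + 2; K3 ← K1 + 2
Block 5: J4 ← φ5(J2, J3); K4 ← φ5(K2, K3)
[Figure: value graph of the SSA program. J0 and K0 are constant nodes 1; Z0, Z2 are constant nodes 1 and Z1, Z3 are constant nodes 2; J1 and K1 are φ2 nodes with l edges to J0/K0 and r edges to J4/K4; J2, J3, K2, K3 are + nodes with l edges to J1/K1 and r edges to the constants; J4 and K4 are φ5 nodes with l and r edges to J2, J3 and K2, K3. The graph is cyclic (J1 → J4 → J2 → J1).]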
A Simple Partitioning Algorithm
• Place all the nodes with the same label in the same partition
• Repeat splitting partitions with two nodes having corresponding edges leading to nodes in different partitions
A Simple Partitioning Algorithm
Partition = {P | ∀x, y ∈ P: label(x) == label(y)}
change = true
while (change) do
  change = false
  for each P ∈ Partition do
    for each fi do
      if (∃x, y ∈ P such that fi(x) and fi(y) are in different partitions) then
        split P
        change = true
      fi
    od
  od
od
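The loop above can be tried on the value graph of the loop example. The sketch below is one possible rendering, not the slide's exact procedure: it splits all partitions at once per round by comparing, for every node, its own block and the blocks of its successors. The graph encoding, the shared constant nodes `c1` and `c2`, and the name `congruence_classes` are assumptions for illustration.

```python
def congruence_classes(graph):
    """graph: node -> (label, tuple of successor nodes in argument order)."""
    # Optimistic start: nodes with the same label share a block.
    label_ids = {}
    block_of = {n: label_ids.setdefault(label, len(label_ids))
                for n, (label, _) in graph.items()}
    while True:
        # Nodes stay together only if their successors are in the same blocks.
        sig_ids = {}
        refined = {}
        for n, (_, succs) in graph.items():
            signature = (block_of[n],) + tuple(block_of[s] for s in succs)
            refined[n] = sig_ids.setdefault(signature, len(sig_ids))
        stable = len(sig_ids) == len(set(block_of.values()))
        block_of = refined
        if stable:
            break
    classes = {}
    for n, b in block_of.items():
        classes.setdefault(b, set()).add(n)
    return {frozenset(c) for c in classes.values()}


# Value graph of the SSA loop example (c1, c2 are the constants 1 and 2).
graph = {
    "J0": ("1", ()), "K0": ("1", ()), "c1": ("1", ()), "c2": ("2", ()),
    "J1": ("phi2", ("J0", "J4")), "K1": ("phi2", ("K0", "K4")),
    "J2": ("+", ("J1", "c1")), "K2": ("+", ("K1", "c1")),
    "J3": ("+", ("J1", "c2")), "K3": ("+", ("K1", "c2")),
    "J4": ("phi5", ("J2", "J3")), "K4": ("phi5", ("K2", "K3")),
}
classes = congruence_classes(graph)
print(classes)
```

Because the initial grouping is optimistic, J1 and K1 remain congruent even though their definitions depend on each other through the cycle.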
An O(E log E) Partitioning Algorithm (Aho, Hopcroft, Ullman 1974)
• Given:
– A set S
– A function f: S → S
– A partition of S into disjoint blocks π = {B1, B2, …, Bp}
• Find the coarsest partition (having the fewest blocks) π′ = {E1, E2, …, Eq} such that
– π′ is a refinement of π (π ⊑ π′)
– ∀a, b ∈ El: f(a), f(b) ∈ Ej for some j
WAITING ← {1, 2, …, p}
q ← p
while WAITING ≠ ∅ do
  select and delete an integer i from WAITING
  for m from 1 to k do
    INVERSE ← ∅
    for x in B[i] do INVERSE ← INVERSE ∪ fm⁻¹(x) end
    for each j such that B[j] ∩ INVERSE ≠ ∅ and not (B[j] ⊆ INVERSE) do
      q ← q + 1
      create a new block B[q]
      B[q] ← B[j] ∩ INVERSE
      B[j] ← B[j] − B[q]
      if j is in WAITING then add q to WAITING
      else if |B[j]| <= |B[q]| then add j to WAITING
      else add q to WAITING fi
    od
  od
od
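The refinement loop above can be sketched in Python as follows. This is a simplified illustration under assumptions: the names `coarsest_partition`, `functions`, and `initial_blocks` are hypothetical, functions are given as (possibly partial) dictionaries, and the inner scan visits every block instead of only those touching INVERSE, so the sketch does not achieve the O(E log E) bound.

```python
def coarsest_partition(functions, initial_blocks):
    """Refine initial_blocks until every block is mapped into one block by each function."""
    blocks = [set(b) for b in initial_blocks]
    # inverses[m][y] = {x | f_m(x) = y}
    inverses = []
    for f in functions:
        inv = {}
        for x, y in f.items():
            inv.setdefault(y, set()).add(x)
        inverses.append(inv)

    waiting = set(range(len(blocks)))
    while waiting:
        i = waiting.pop()
        splitter = set(blocks[i])        # copy: blocks[i] itself may be split below
        for inv in inverses:
            inverse = set()
            for x in splitter:
                inverse |= inv.get(x, set())
            for j in range(len(blocks)):  # blocks created in this pass are not rescanned
                inside = blocks[j] & inverse
                if inside and inside != blocks[j]:
                    blocks[j] -= inside
                    blocks.append(inside)
                    q = len(blocks) - 1
                    if j in waiting:
                        waiting.add(q)
                    elif len(blocks[j]) <= len(inside):
                        waiting.add(j)   # keep the smaller half as a future splitter
                    else:
                        waiting.add(q)
    return {frozenset(b) for b in blocks}


# Loop example: l and r give the first and second argument of each node.
l = {"J1": "J0", "K1": "K0", "J2": "J1", "K2": "K1",
     "J3": "J1", "K3": "K1", "J4": "J2", "K4": "K2"}
r = {"J1": "J4", "K1": "K4", "J2": "c1", "K2": "c1",
     "J3": "c2", "K3": "c2", "J4": "J3", "K4": "K3"}
by_label = [{"J0", "K0", "c1"}, {"c2"}, {"J1", "K1"},
            {"J2", "K2", "J3", "K3"}, {"J4", "K4"}]
result = coarsest_partition([l, r], by_label)
print(result)
```

Adding only the smaller half of a split block to the waiting set is the design choice that the logarithmic factor in the bound relies on.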
Generalizations
• Identify control flow structures and generate special functions
• Handle arrays with ACCESS, UPDATE
• But can we find all of Kildall’s equivalent expressions in O(E log E)?
Conditional Constant Propagation
• Conditions with constant values can be interpreted to improve precision
• A more precise solution is obtained “optimistically”
/* FRUIT, VARIETY, SHAPE, COLOR and their constants are assumed to be declared elsewhere */
char *Red = "red";
char *Yellow = "yellow";
char *Orange = "orange";
main()
{ FRUIT snack;
  VARIETY t1; SHAPE t2; COLOR t3;
  t1 = APPLE;
  t2 = ROUND;
  switch (t1) {
  case APPLE: t3 = Red;
    break;
  case BANANA: t3 = Yellow;
    break;
  case ORANGE: t3 = Orange; }
  printf("%s\n", t3); }

After conditional constant propagation (t3 is always "red"):
main()
{ printf("%s\n", "red"); }
Iterative Data-Flow Algorithm
Input: a flow graph G = (N, E, r)
       an initial value Init
       a monotonic function FB for every B in N
Output: in(B) for every B in N
Initialization:
  in(Entry) := Init
  for each node B in N − {Entry} do in(B) := ⊥
  WL := N − {Entry}
Iteration:
  while WL != {} do
    select and remove a node B from WL
    out := FB(in(B))
    for all B’ in succ(B) such that in(B’) != in(B’) ⊔ out do
      in(B’) := in(B’) ⊔ out
      WL := WL ∪ {B’}
Iterative Data-Flow Algorithm
Input: a flow graph G = (N, E, r)
       an initial value Init
       a monotonic function FB for every B in N
Output: in(B) for every B in N
Initialization:
  in(Entry) := Init
  for each node B in N − {Entry} do in(B) := ⊥
  WL := {Entry}
Iteration:
  while WL != {} do
    select and remove a node B from WL
    out := FB(in(B))
    for all B’ in succ(B) such that in(B’) != in(B’) ⊔ out do
      in(B’) := in(B’) ⊔ out
      WL := WL ∪ {B’}
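The worklist iteration starting from the entry node can be sketched in Python for plain (unconditional) constant propagation on the fruit example. Everything about the encoding is an assumption made for illustration: blocks carry a dictionary of constant assignments, `None` stands for the "not yet reached" value, and `TOP` marks a variable that is not a constant.

```python
TOP = "not-a-constant"


def join(a, b):
    """Join two environments; None means the block has not been reached yet."""
    if a is None:
        return b
    if b is None:
        return a
    merged = dict(a)
    for var, val in b.items():
        # Disagreeing values for the same variable become TOP.
        merged[var] = val if merged.get(var, val) == val else TOP
    return merged


def iterate(succ, assigns, entry, init):
    in_ = {b: None for b in succ}
    in_[entry] = init
    worklist = [entry]
    while worklist:
        b = worklist.pop()
        out = dict(in_[b])
        out.update(assigns[b])          # transfer function of block b
        for s in succ[b]:
            merged = join(in_[s], out)
            if merged != in_[s]:
                in_[s] = merged
                worklist.append(s)
    return in_


# Control-flow graph of the switch example; every case is treated as reachable.
succ = {"entry": ["apple", "banana", "orange"], "apple": ["print"],
        "banana": ["print"], "orange": ["print"], "print": []}
assigns = {"entry": {"t1": "APPLE", "t2": "ROUND"}, "apple": {"t3": "red"},
           "banana": {"t3": "yellow"}, "orange": {"t3": "orange"}, "print": {}}
result = iterate(succ, assigns, "entry", {})
print(result["print"])
```

Without interpreting the switch condition, all three cases flow into the print statement, so `t3` is not recognized as a constant there.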
Conditional Constant Propagation
• initialize the worklist to the entry node
• mark all edges as not executable
• repeat until the worklist is empty:
– select and remove a node from the worklist
– if it is an assignment then mark the successor edge as executable
– if it is a test then symbolically evaluate the test and mark the enabled successor edges as executable
  • if the test evaluates to true or to a non-constant value, mark the true edge executable
  • if the test evaluates to false or to a non-constant value, mark the false edge executable
– update the value of all the variables at the entry and exit of this node
– if there are changes then add all successors reachable from the node with edges marked executable to the worklist
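The steps above can be sketched in Python on the fruit example. This is a minimal illustration, not a definitive implementation: the encoding of blocks, the `tests` table mapping a block to the switched variable and its case targets, and the name `conditional_constants` are assumptions.

```python
TOP = "not-a-constant"


def join(a, b):
    """Join two environments; None means the block has not been reached yet."""
    if a is None:
        return b
    if b is None:
        return a
    merged = dict(a)
    for var, val in b.items():
        merged[var] = val if merged.get(var, val) == val else TOP
    return merged


def conditional_constants(succ, assigns, tests, entry):
    in_ = {b: None for b in succ}
    in_[entry] = {}
    executable = set()                       # flow edges known to be reachable
    worklist = [entry]
    while worklist:
        b = worklist.pop()
        out = dict(in_[b])
        out.update(assigns[b])
        if b in tests:
            var, targets = tests[b]          # targets: constant -> successor
            val = out.get(var, TOP)
            # A constant selector enables one edge; anything else enables all.
            enabled = [targets[val]] if val in targets else list(succ[b])
        else:
            enabled = list(succ[b])
        for s in enabled:
            first_time = (b, s) not in executable
            executable.add((b, s))
            merged = join(in_[s], out)
            if first_time or merged != in_[s]:
                in_[s] = merged
                worklist.append(s)
    return in_, executable


succ = {"entry": ["apple", "banana", "orange"], "apple": ["print"],
        "banana": ["print"], "orange": ["print"], "print": []}
assigns = {"entry": {"t1": "APPLE", "t2": "ROUND"}, "apple": {"t3": "red"},
           "banana": {"t3": "yellow"}, "orange": {"t3": "orange"}, "print": {}}
tests = {"entry": ("t1", {"APPLE": "apple", "BANANA": "banana", "ORANGE": "orange"})}
values, executable = conditional_constants(succ, assigns, tests, "entry")
print(values["print"], executable)
```

Only the APPLE case is ever marked executable, so `t3` reaches the print statement with the single value "red".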
Sparse Conditional Constant Propagation
• bring the program into SSA form
• initialize the analysis information:
– all variables are mapped to the initial “undefined” value (no value known yet)
– all flow edges are marked as not executable
• initialize the two worklists
– Flow-Worklist contains all edges of the flow graph with the entry node as source
– SSA-Worklist is empty
• repeat until both worklists are empty:
– select and remove an edge from one of the worklists
– if it is a flow edge then
  • if the edge is not marked executable then
    – mark it executable
    – if the target of the edge is a φ-node then call visit-φ
    – if it is the first time the node is visited (only one incoming flow edge is marked executable) and it is a normal node then call visit-instr
– if it is an SSA edge then
  • if the target of the edge is a φ-node then call visit-φ
  • if it is a normal node and at least one of the flow edges entering the node is marked executable then call visit-instr
• visit-φ: (the node is a φ-node)
– the assigned variable is given a value that is the join of the values of the arguments with incoming edges marked executable
• visit-instr: (the node is a normal node)
– determine the value of the expression of the node and update the variable in case of an assignment
– if there are changes then
  • if the node is an assignment then add all SSA edges with source at the target of the current edge to the SSA-worklist
  • if the node is a test then add all relevant flow edges to the Flow-worklist and mark them executable
    – if the test evaluates to true or to a non-constant value: add the true edge
    – if the test evaluates to false or to a non-constant value: add the false edge
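A compact Python sketch of the two-worklist scheme follows. It simplifies the steps above in ways that should be read as assumptions: every flow-graph node holds one instruction, the SSA worklist stores the using node rather than an edge, a node is revisited whenever a new incoming flow edge becomes executable, and the names `sccp`, `UNDEF`, `NAC`, and the tuple encoding of instructions are hypothetical.

```python
UNDEF, NAC = "undef", "not-a-constant"
OPS = {"copy": lambda a: a, "add": lambda a, b: a + b, "eq": lambda a, b: a == b}


def join(a, b):
    if a == UNDEF:
        return b
    if b == UNDEF:
        return a
    return a if a == b else NAC


def sccp(nodes, succ, entry):
    """nodes: name -> (op, target, args); phi args are (predecessor, variable) pairs."""
    value, executable, uses = {}, set(), {}
    for n, (op, _, args) in nodes.items():
        for a in args:
            var = a[1] if op == "phi" else a
            if isinstance(var, str):
                uses.setdefault(var, []).append(n)   # SSA edge: definition -> use

    def val(x):
        return value.get(x, UNDEF) if isinstance(x, str) else x

    flow_wl, ssa_wl = [("start", entry)], []

    def visit(n):
        op, target, args = nodes[n]
        if op == "phi":
            new = UNDEF
            for pred, var in args:
                if (pred, n) in executable:          # ignore unreachable inputs
                    new = join(new, val(var))
        else:
            vals = [val(a) for a in args]
            if NAC in vals:
                new = NAC
            elif UNDEF in vals:
                new = UNDEF
            else:
                new = OPS[op](*vals)
        if op == "eq":                               # test node: succ = [true, false]
            if new is True or new == NAC:
                flow_wl.append((n, succ[n][0]))
            if new is False or new == NAC:
                flow_wl.append((n, succ[n][1]))
            return
        if new != value.get(target, UNDEF):
            value[target] = new
            ssa_wl.extend(uses.get(target, []))
        flow_wl.extend((n, s) for s in succ[n])

    while flow_wl or ssa_wl:
        if flow_wl:
            edge = flow_wl.pop()
            if edge not in executable:
                executable.add(edge)
                visit(edge[1])
        else:
            n = ssa_wl.pop()
            if any(dst == n for _, dst in executable):
                visit(n)
    return value, executable


# a1 <- 1; if a1 == 1 then b1 <- 2 else b2 <- 3; b3 <- phi(b1, b2); c1 <- b3 + a1
nodes = {"n1": ("copy", "a1", (1,)), "n2": ("eq", None, ("a1", 1)),
         "n3": ("copy", "b1", (2,)), "n4": ("copy", "b2", (3,)),
         "n5": ("phi", "b3", (("n3", "b1"), ("n4", "b2"))),
         "n6": ("add", "c1", ("b3", "a1"))}
succ = {"n1": ["n2"], "n2": ["n3", "n4"], "n3": ["n5"],
        "n4": ["n5"], "n5": ["n6"], "n6": []}
value, executable = sccp(nodes, succ, "n1")
print(value)
```

Because the false edge of the test is never marked executable, the φ-node sees only `b1`, and both `b3` and `c1` are found to be constants.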