Induction Variable Analysis with Chains of Recurrences

Page 1: Induction Variable Analysis with Chains of Recurrences

Intel 9/12/05

Induction Variable Analysis with Chains of Recurrences

Brief tutorial on induction variable recognition

    Past and present methods for IV detection

Chains of recurrences: why and how?

Analyzing pointer arithmetic in loops

Array dependence testing for loop restructuring and vectorization

Results and conclusions

Page 2: Induction Variable Analysis with Chains of Recurrences

Induction Variable Recognition: a Classic Compiler Problem

Detect induction variables (IVs) in loops

The example loop has a basic IV: a scalar integer variable with one unconditional update

Values of derived IVs depend on values of basic IVs

Loop analysis algorithms detect IVs by analyzing back edges to detect loops in IR forms, e.g. AST, CFG, or SSA

Beware of aliases!

HLL:

    I = 0
    do
      …
      I = I+1
      …
    while (…)

CFG:

    … LDW R8,#0
    … ADD R8,#1
    … BNE L1

Page 3: Induction Variable Analysis with Chains of Recurrences

Most Loop Optimizations Rely on Accurate IV Recognition

Loop strength reduction [Allen69, Aho86]

IV elimination [Lowry69, Kennedy81, Aho86]

IV substitution [Gerlek95, Haghighat96, Wolfe92]

Loop iteration bounds analysis

Pointer-to-array conversion and array recovery [vanEngelen01b, Franke01]

Array dependence testing for loop restructuring [Banerjee88, Blume94, Goff91, Maydan91, Muchnick97, Psarris03, Pugh91, vanEngelen04, Wolfe92, Zima90] and others

Page 4: Induction Variable Analysis with Chains of Recurrences

A Classic Induction Variable Recognition Algorithm [Aho86]

Input: Loop L with reaching definition information and loop-invariant information

Output: Triple (i,stride,init) for each IV

1. Find basic IVs: represent basic IV i by triple (i,stride,init)
2. Find derived IVs: search for variables k with single assignment k = j ⊗ b, where b is constant (loop invariant):
     if j has triple (i,c,d) and ⊗ = * then k has triple (i, b*c, b*d)
     else if j has triple (i,c,d) and ⊗ = + then k has triple (i, c, b+d)
   (assuming no use of k before its def)
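The derived-IV rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples, the helper names `derive_triple` and `find_derived_ivs` are hypothetical, and the basic IV `j` with triple `("i", 3, 1)` is made up for the example.

```python
# Minimal sketch of step 2 of the classic algorithm (names are hypothetical).
# A triple (i, stride, init) describes a variable in terms of basic IV i.

def derive_triple(op, j_triple, b):
    """Triple of k for the single assignment k = j op b, b loop invariant."""
    i, c, d = j_triple
    if op == "*":
        return (i, b * c, b * d)
    if op == "+":
        return (i, c, b + d)
    raise ValueError("this rule only covers * and +")

def find_derived_ivs(known, assignments):
    """known: {name: triple}; assignments: (k, j, op, b) in loop order."""
    triples = dict(known)
    for k, j, op, b in assignments:
        # Only derive k when j already has a triple and k is assigned once.
        if j in triples and k not in triples:
            triples[k] = derive_triple(op, triples[j], b)
    return triples

# Assumed example: j has triple (i, 3, 1); then k = j*4 and m = k+5.
result = find_derived_ivs({"j": ("i", 3, 1)},
                          [("k", "j", "*", 4), ("m", "k", "+", 5)])
print(result["k"])  # ('i', 12, 4)
print(result["m"])  # ('i', 12, 9)
```

The sketch ignores the "no use of k before its def" condition, which a real implementation would check with reaching-definition information.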

Page 5: Induction Variable Analysis with Chains of Recurrences

IV Recognition for Loop Optimization

Example loop with IVs:

    I = 0
    J = 1
    while (I<N)
      I = I+1
      … = A[J]
      J = J+2
      K = 2*I
      A[K] = …
    endwhile

IV triples:

    (I,1,0) basic
    (J,2,1) basic
    (I,2,2) derived

After strength reduction:

    K = 0
    I = 0
    J = 1
    while (I<N)
      I = I+1
      … = A[J]
      J = J+2
      K = K+2
      A[K] = …
    endwhile

After IV elimination:

    K = 0
    while (K<2*N)
      … = A[K+1]
      K = K+2
      A[K] = …
    endwhile

Page 6: Induction Variable Analysis with Chains of Recurrences

IV Recognition for Loop Vectorization & Parallelization

Example loop with IVs:

    I = 0
    J = 1
    while (I<N)
      I = I+1
      … = A[J]
      J = J+2
      K = 2*I
      A[K] = …
    endwhile

After IV substitution (IVS) (note the affine indexes):

    for i=0 to N-1
      S1: … = A[2*i+1]
      S2: A[2*i+2] = …
    endfor

After parallelization:

    forall (i=0,N-1)
      … = A[2*i+1]
      A[2*i+2] = …
    endforall

Dep test: GCD test to solve dependence equation 2·i_d - 2·i_u = -1. Since 2 does not divide 1, there is no data dependence.

[Figure: array A[] with alternating written (W) and read (R) elements, W R W R W R, accessed through A[2*i+2] and A[2*i+1]]
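The GCD test used above can be sketched as follows. The function name `gcd_test` is a hypothetical helper; the test is only a necessary condition, since it ignores the loop bounds.

```python
from math import gcd

def gcd_test(coeffs, rhs):
    """True if a1*x1 + ... + an*xn = rhs can have an integer solution.

    A True result does not prove a dependence, because loop bounds are
    ignored; a False result proves there is none.
    """
    g = 0
    for a in coeffs:
        g = gcd(g, abs(a))
    if g == 0:               # all coefficients are zero
        return rhs == 0
    return rhs % g == 0

# Dependence equation from the slide: 2*i_d - 2*i_u = -1
may_depend = gcd_test([2, -2], -1)
print(may_depend)  # False: 2 does not divide 1
```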

Page 7: Induction Variable Analysis with Chains of Recurrences

IV Recognition isn’t Always Trivial …

IVs that are linear (affine) but not handled by [Aho86]:

    do
      K = J+1
      J = K+1
    while (…)

    do
      K = 3
      K = K+J
      if (…)
        J = K
      else
        J = J+3
      endif
    while (…)

It quickly gets more complicated with deps and flow

Note that transformations such as constant propagation, forward substitution, and code motion can help

Page 8: Induction Variable Analysis with Chains of Recurrences

More Powerful IV Recognition Methods

Linear IV recognition on SSA forms and FUD chains [Cytron91, Wolfe92]

Linear and nonlinear IV recognition with symbolic differencing [Haghighat95]

Linear and nonlinear IV recognition with recurrence system solvers [Gerlek95]

Linear and nonlinear IV recognition with chains of recurrences [vanEngelen01a]

Page 9: Induction Variable Analysis with Chains of Recurrences

A Method for Linear IV Recognition on FUD Chains

Similar algorithms by [Cytron91, Wolfe92]

Factored use-def (FUD) chains are similar to static single assignment (SSA) forms

Input: FUD chains
Output: Triple (i,stride,init) for each IV

1. Build depth-first spanning tree of operations
2. Start at φ-node in loop cycle to find basic IVs
3. Find derived IVs from other φ-nodes not in a cycle

Page 10: Induction Variable Analysis with Chains of Recurrences

Example

    I1 = 3
    M1 = 0
    do
      I2 = φ(I1,I3)
      J1 = φ(?,J3)
      K1 = φ(?,K2)
      L1 = φ(?,L2)
      M2 = φ(M1,M3)
      J2 = 3
      I3 = I2+1
      L2 = M2+1
      M3 = L2+2
      J3 = I3+J2
      K2 = 2*J3
    while (…)

IV triples:

    I2 = (i,1,3)
    J1 = (i,1,7)
    L1 = (i,3,1)
    K1 = (i,2,14)
    M2 = (i,3,0)

[Figure: spanning tree]

Page 11: Induction Variable Analysis with Chains of Recurrences

Symbolic Differencing

    do
      x = x+z
      y = z+1
      z = y+1
    while (…)

Use abstract interpretation to evaluate loop iterations and construct symbolic difference table of the IV values.

Iteration   x         diff   diff   y     diff   z     diff
1           x+z                     z+1          z
2           x+2z+2    z+2           z+3   2      z+2   2
3           x+3z+6    z+4    2      z+5   2      z+4   2

From difference tables compute the characteristic functions describing the IV progressions.

    x(i) = x0 + z0·i + (i^2 - i)
    y(i) = z0 + 2i + 1
    z(i) = z0 + 2i
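The differencing idea can be tried numerically. The sketch below is an assumption-laden simplification: it runs the loop on concrete start values (`x0 = 5`, `z0 = 3`, chosen arbitrarily) instead of interpreting it symbolically, builds the first and second differences of x, and compares x with the closed form given above.

```python
def run_loop(x, z, iterations):
    """Values of x after 0, 1, ..., `iterations` trips of the loop."""
    xs = [x]
    for _ in range(iterations):
        x = x + z
        y = z + 1
        z = y + 1
        xs.append(x)
    return xs

def differences(seq):
    """Forward differences of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]

x0, z0 = 5, 3                      # arbitrary start values for illustration
xs = run_loop(x0, z0, 6)
first = differences(xs)            # grows by 2 per iteration
second = differences(first)        # constant, so x is quadratic in i
closed = [x0 + z0 * i + (i * i - i) for i in range(7)]

print(second)        # [2, 2, 2, 2, 2]
print(xs == closed)  # True
```

A constant second difference is what lets the method infer a second-order polynomial for x.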

Page 12: Induction Variable Analysis with Chains of Recurrences

Symbolic Differencing: Oops

    for i=0 to n
      t = a
      a = b
      b = c+2*b-t+2
      c = c+d
      d = d+i
    endfor

[vanEngelen01a] identified a serious problem with the differencing method for IVs involving cyclic recurrence relations.

The inferred closed-form function might be incorrect when it is assumed that the polynomial order of the IVs is bounded.

Quiz: guess the maximum polynomial order of the characteristic functions of the IVs in the loop shown above.

Page 13: Induction Variable Analysis with Chains of Recurrences

Outline

Brief tutorial on induction variable recognition

Chains of recurrences: why and how?

    Chains of recurrences preliminaries [Zima92, vanEngelen01a]

    Chains of recurrences for IV recognition, loop analysis, and optimization [vanEngelen01a, vanEngelen01b, vanEngelen04]

Analyzing pointer arithmetic in loops

Array dependence testing for loop restructuring and vectorization

Results and conclusions

Page 14: Induction Variable Analysis with Chains of Recurrences

Preliminaries

A chain of recurrences (CR) represents a polynomial or exponential function, or a mix of both, evaluated over a unit-distance grid [Zima92]

Basic form: {init, ⊙, stride}, where ⊙ is + or *

Iteration   {init, ⊙, stride}                  f(i) = 2i+1 = {1,+,2}   f(i) = 2^i = {1,*,2}
i = 0       init                               1                       1
i = 1       init ⊙ stride                      3                       2
i = 2       init ⊙ stride ⊙ stride             5                       4
i = 3       init ⊙ stride ⊙ stride ⊙ stride    7                       8

Page 15: Induction Variable Analysis with Chains of Recurrences

Chains of Recurrences: General Formulation

The key idea is to represent a non-constant CR stride in CR form itself, thereby forming a chain of recurrences

Example: f(i) = i^2 = {0, +, s(i-1)} with s(i) = {1, +, 2}

Iteration   {init, ⊙, s(i-1)}             s(i) = {1, +, 2}   f(i) = {0, +, s(i-1)}
i = 0       init                           1                  0
i = 1       init ⊙ s(0)                    3                  1
i = 2       init ⊙ s(0) ⊙ s(1)             5                  4
i = 3       init ⊙ s(0) ⊙ s(1) ⊙ s(2)      7                  9
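A flat CR such as {0, +, 1, +, 2} can be evaluated on the grid i = 0, 1, 2, … with one operation per chain link per iteration. The sketch below assumes a list encoding `[c0, op1, c1, op2, c2, ...]` and a hypothetical helper `cr_values`; it is not taken from the cited papers.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def cr_values(cr, n):
    """Evaluate a flat CR, e.g. [0, '+', 1, '+', 2], at i = 0..n-1."""
    coeffs = list(cr[0::2])                 # c0, c1, ..., ck
    ops = [OPS[o] for o in cr[1::2]]        # op1, ..., opk
    out = []
    for _ in range(n):
        out.append(coeffs[0])
        # Advance every link using the not-yet-updated value of the next one.
        for j, op in enumerate(ops):
            coeffs[j] = op(coeffs[j], coeffs[j + 1])
    return out

print(cr_values([1, "+", 2], 4))           # [1, 3, 5, 7]   f(i) = 2i+1
print(cr_values([1, "*", 2], 4))           # [1, 2, 4, 8]   f(i) = 2^i
print(cr_values([0, "+", 1, "+", 2], 4))   # [0, 1, 4, 9]   f(i) = i^2
```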

Page 16: Induction Variable Analysis with Chains of Recurrences

Primary Application of CRs: Loop Strength Reduction

Loop strength reduction is straightforward to implement with CR forms of real-valued and complex functions [Zima92]

Method: add each CR and its nested CR stride as IVs to the loop nest

Page 17: Induction Variable Analysis with Chains of Recurrences

Example

Suppose f(i) = a + b·i + c·i^2 = {a, +, {b+c, +, 2c}}

We have two IVs x and y:

    f(i) = x = {x0, +, y} with x0 = a
    s(i) = y = {y0, +, 2c} with y0 = b+c

Add x and y as IVs to loop for efficient function evaluation over unit-distance grid i = 0, …, n:

    x = a
    y = b+c
    for i=0 to n
      f[i] = x
      x = x+y
      y = y+2*c
    endfor

[Figure: plot of s(i) against iteration, for iterations 1 to 10]
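As a quick check of the strength-reduced loop for f(i) = a + b·i + c·i^2, the sketch below compares it with direct evaluation. The coefficient values are arbitrary, and the function names are hypothetical.

```python
def f_direct(a, b, c, n):
    """Evaluate the polynomial with multiplications at every point."""
    return [a + b * i + c * i * i for i in range(n + 1)]

def f_strength_reduced(a, b, c, n):
    """Same values using only the two IVs x and y and additions."""
    out = []
    x = a            # x = {a, +, y}
    y = b + c        # y = {b+c, +, 2c}
    for _ in range(n + 1):
        out.append(x)
        x = x + y
        y = y + 2 * c
    return out

same = f_direct(3, -2, 5, 10) == f_strength_reduced(3, -2, 5, 10)
print(same)  # True
```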

Page 18: Induction Variable Analysis with Chains of Recurrences

Loop Strength Reduction: Multi-dimensional Loops

Algorithm for 2-d case (n-d similar)

1. Determine iteration order:

    for i = 0 to n
      for j = 0 to m

2. Compute CR forms for j-loop first by treating i invariant within j-loop
3. Compute multivariate CR (MCR) forms for i-loop

Page 19: Induction Variable Analysis with Chains of Recurrences

Example

Suppose f(i,j) = i^2 + i·j + 1

1. Create IV k for f(i,j) in j-loop:

    f(i,j) = k_j = {p_i, +, r_i}_j with p_i = i^2 + 1 and r_i = i

2. Create IVs for p_i and r_i in i-loop:

    p_i = {p0, +, q_i}_i with p0 = 1
    q_i = {q0, +, 2}_i with q0 = 1
    r_i = {r0, +, 1}_i with r0 = 0

3. Add IVs k, p, q, and r to loops

    p = 1
    q = 1
    r = 0
    for i = 0 to n
      k = p
      for j = 0 to m
        f[i,j] = k
        k = k+r
      endfor
      p = p+q
      q = q+2
      r = r+1
    endfor
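The two-dimensional loop nest with the IVs k, p, q, and r can be checked against direct evaluation of f(i,j) = i^2 + i·j + 1. This is a small sketch with hypothetical function names and arbitrary bounds.

```python
def f_direct(n, m):
    """f(i,j) = i^2 + i*j + 1 evaluated directly on the grid."""
    return [[i * i + i * j + 1 for j in range(m + 1)] for i in range(n + 1)]

def f_reduced(n, m):
    """Same grid filled with additions only, using IVs k, p, q, r."""
    table = []
    p, q, r = 1, 1, 0
    for _i in range(n + 1):
        row = []
        k = p                      # k = {p_i, +, r_i}_j
        for _j in range(m + 1):
            row.append(k)
            k = k + r
        table.append(row)
        p = p + q                  # p = {1, +, q}_i
        q = q + 2                  # q = {1, +, 2}_i
        r = r + 1                  # r = {0, +, 1}_i
    return table

same_grid = f_direct(4, 5) == f_reduced(4, 5)
print(same_grid)  # True
```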

Page 20: Induction Variable Analysis with Chains of Recurrences

How to Obtain CR Forms for Strength Reduction: CR Algebra

Algorithm to compute the CR form of a symbolic function f(i):

1. Replace i with {0,+,1} in the symbolic form of f
2. Compute CR form using the CR algebra rewrite rules (selected rules shown here):

    {x, +, y} + c          ⇒  {x+c, +, y}
    c{x, +, y}             ⇒  {c·x, +, c·y}
    {x, +, y} + {u, +, v}  ⇒  {x+u, +, y+v}
    {x, +, y} * {u, +, v}  ⇒  {x·u, +, y{u, +, v}+v{x, +, y}+y·v}

Example: f(i) = c·(i+a) = c·({0, +, 1}+a) = c{a, +, 1} = {c·a, +, c}
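The four rewrite rules can be sketched for pure-sum CRs with numeric coefficients. The encoding is an assumption made for this example: a list `[c0, c1, ..., ck]` stands for {c0, +, c1, +, …, +, ck}, and the helper names are hypothetical. The product rule is applied recursively, so the strides may themselves be CRs.

```python
def cr_add(f, g):
    """{x,+,y} + {u,+,v} => {x+u,+,y+v}; a constant c is the CR [c]."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def cr_scale(c, f):
    """c{x,+,y} => {c*x,+,c*y}."""
    return [c * a for a in f]

def cr_mul(f, g):
    """{x,+,y}*{u,+,v} => {x*u, +, y*{u,+,v} + v*{x,+,y} + y*v}."""
    if len(f) == 1:
        return cr_scale(f[0], g)
    if len(g) == 1:
        return cr_scale(g[0], f)
    y, v = f[1:], g[1:]            # the strides, possibly CRs themselves
    stride = cr_add(cr_add(cr_mul(y, g), cr_mul(v, f)), cr_mul(y, v))
    return [f[0] * g[0]] + stride

i = [0, 1]                                   # i = {0,+,1}
example = cr_scale(3, cr_add(i, [4]))        # c*(i+a) with c=3, a=4
square = cr_mul(i, i)                        # i*i
print(example)  # [12, 3]   i.e. {c*a, +, c}
print(square)   # [0, 1, 2] i.e. {0, +, 1, +, 2}
```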

Page 21: Induction Variable Analysis with Chains of Recurrences

But What About IV Recognition?

The key idea of our approach:

    Scan the loop to detect IV updates

    Determine the CR form of the IV using the update operation

In simple terms, the method looks for operations on IVs that can be represented as CR forms:

    do
      J = J+I
      I = I+3
      P = 2*P
    while (…)

    J = {J0, +, I}
    J = {J0, +, {I0, +, 3}}
    I = {I0, +, 3}
    P = {P0, *, 2}

Page 22: Induction Variable Analysis with Chains of Recurrences

Compiler Algorithms for IV Recognition with CRs

IV recognition algorithms [vanEngelen01a, vanEngelen04]:

1. Scan the loop in backward order to determine recurrence relations of scalar variables in the loop
2. Compute CR forms of recurrence relations
3. “Solve” CR forms by computing closed-form characteristic functions with the CR inverse rules

Note: GCC 4.x uses this algorithm, first published in [vanEngelen01a], applied to loops in SSA form. GCC developers refer to CRs as “scalar evolutions” (without proper justification).

Page 23: Induction Variable Analysis with Chains of Recurrences

Algorithm 1: Find Recurrences

Input: Loop L with live variable information
Output: Set S of recurrence relations of IVs

1. Start with set S = { ⟨v, v⟩ | v is live at loop header }
2. Search L from bottom to top: for each assignment v = x of expression x to scalar variable v, update tuples ⟨u, y⟩ in S by replacing v in y with x

Initial set: S = {⟨H, H⟩, ⟨I, I⟩, ⟨J, J⟩, ⟨K, K⟩}

Loop L           Step   S
do
  M = 2          5      S5 = {⟨H, H⟩, ⟨I, I+1⟩, ⟨J, J-H+2⟩, ⟨K, K+2*I⟩}
  L = J-H        4      S4 = {⟨H, H⟩, ⟨I, I+1⟩, ⟨J, J-H+M⟩, ⟨K, K+M*I⟩}
  J = L+M        3      S3 = {⟨H, H⟩, ⟨I, I+1⟩, ⟨J, L+M⟩, ⟨K, K+M*I⟩}
  K = K+M*I      2      S2 = {⟨H, H⟩, ⟨I, I+1⟩, ⟨J, J⟩, ⟨K, K+M*I⟩}
  I = I+1        1      S1 = {⟨H, H⟩, ⟨I, I+1⟩, ⟨J, J⟩, ⟨K, K⟩}
while (…)
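The backward scan can be sketched with plain string substitution. This is a deliberately simplified illustration: it assumes a straight-line loop body with no conditionals, represents expressions as strings, and leaves them unsimplified (extra parentheses instead of `J-H+2`). The function name `find_recurrences` is hypothetical.

```python
import re

def find_recurrences(live, assignments):
    """live: variables live at the loop header.
    assignments: (var, expr) pairs in loop order, expressions as strings."""
    s = {v: v for v in live}                     # S = { <v, v> }
    for var, expr in reversed(assignments):      # bottom to top
        pattern = re.compile(r"\b" + re.escape(var) + r"\b")
        # Replace var by its defining expression in every tuple <u, y>.
        s = {u: pattern.sub(lambda m: "(" + expr + ")", y)
             for u, y in s.items()}
    return s

loop = [("M", "2"), ("L", "J-H"), ("J", "L+M"), ("K", "K+M*I"), ("I", "I+1")]
S = find_recurrences(["H", "I", "J", "K"], loop)
for v in sorted(S):
    print(v, "->", S[v])
# H -> H
# I -> (I+1)
# J -> ((J-H)+(2))
# K -> (K+(2)*I)
```

Up to simplification, the result matches S5 in the table above.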

Page 24: Induction Variable Analysis with Chains of Recurrences

Algorithm 2: Compute CR Forms

Input: Set S with recurrence relations
Output: CR forms for IVs in S

1. For each relation ⟨v, x⟩ in S do:
     if x is of the form v then v = v0 (v is loop invariant)
     if x is of the form v + y then v = {v0, +, y}
     if x is of the form v * y then v = {v0, *, y}
     if x does not contain v then v = {v0, #, x} (v is wrap around)
2. Simplify the CR forms with the CR algebra rewrite rules

Recurrence relation in S   CR form               Simplified CR form
⟨H, H⟩                     H = H0                H = H0
⟨I, I+1⟩                   I = {I0, +, 1}        I = {I0, +, 1}
⟨J, J-H+2⟩                 J = {J0, +, 2-H}      J = {J0, +, 2-H0}
⟨K, K+2*I⟩                 K = {K0, +, 2*I}      K = {K0, +, 2I0, +, 2}

Page 25: Induction Variable Analysis with Chains of Recurrences

Algorithm 3: Solve

Input: CR forms for IVs
Output: Closed-form solutions for IVs (when possible)

1. For each CR form of v apply the CR inverse algebra, assuming loop is normalized for i = 0, …, n
2. Certain “exotic” mixed non-polynomial and non-exponential CR forms may not have closed forms

Loop L:

    do
      M = 2
      L = J-H
      J = L+M
      K = K+M*I
      I = I+1
    while (…)

Simplified CR form           Closed form (for i = 0, …)
J = {J0, +, 2-H0}            J = J0 + (2-H0)*i
K = {K0, +, 2I0, +, 2}       K = K0 + i^2 + (2I0-1)*i
I = {I0, +, 1}               I = I0 + i
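The closed forms in the table can be checked by simulation. For a pure-sum CR {c0, +, c1, +, …, +, ck}, one standard way to obtain the value at iteration i is the binomial sum Σ c_j·C(i, j); the sketch below uses that formula as a stand-in and does not claim to reproduce the CR inverse rules of the cited papers. Start values are arbitrary.

```python
from math import comb

def closed_form(coeffs, i):
    """Value at iteration i of the pure-sum CR {c0,+,c1,+,...,+,ck}."""
    return sum(c * comb(i, j) for j, c in enumerate(coeffs))

def simulate(I0, J0, K0, H0, iterations):
    """Run loop L and record (J, K, I) at the start of each iteration."""
    H, I, J, K = H0, I0, J0, K0
    trace = []
    for _ in range(iterations):
        trace.append((J, K, I))
        M = 2
        L = J - H
        J = L + M
        K = K + M * I
        I = I + 1
    return trace

I0, J0, K0, H0 = 3, 10, 7, 4          # arbitrary start values
trace = simulate(I0, J0, K0, H0, 8)
from_cr = [(closed_form([J0, 2 - H0], i),
            closed_form([K0, 2 * I0, 2], i),
            closed_form([I0, 1], i)) for i in range(8)]
from_slide = [(J0 + (2 - H0) * i,
               K0 + i * i + (2 * I0 - 1) * i,
               I0 + i) for i in range(8)]
print(trace == from_cr == from_slide)  # True
```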

Page 26: Induction Variable Analysis with Chains of Recurrences

Example 1

Initial set: S = {⟨x, x⟩, ⟨z, z⟩}

Loop L         Step   S
do
  x = x+z      3      S3 = {⟨x, x+z⟩, ⟨z, z+2⟩}
  y = z+1      2      S2 = {⟨x, x⟩, ⟨z, z+2⟩}
  z = y+1      1      S1 = {⟨x, x⟩, ⟨z, y+1⟩}
while (…)

CR form             Closed form
x = {x0, +, z}      x(i) = x0 + z0·i + i^2 - i
z = {z0, +, 2}      z(i) = z0 + 2i

Page 27: Induction Variable Analysis with Chains of Recurrences

Example 2

TRFD code segment from Perfect Benchmark with IV updates:

    DO I=1,M
      DO J=1,I
        ij = ij+1
        ijkl = ijkl+I-J+1
        DO K=I+1,M
          DO L=1,K
            ijkl = ijkl+1
            xijkl[ijkl]=xkl[L]
          ENDDO
        ENDDO
        ijkl = ijkl+ij+left
      ENDDO
    ENDDO

TRFD after aggressive induction variable substitution (IVS):

    DO I=0,M-1
      DO J=0,I
        DO K=0,M-I-2
          DO L=0,I+K+1
            tmp = ijkl+L+I*(K+(M+M*M+2*left+6)/4)+J*(left+(M+M*M)/2)+((I*I*M*M)+2*(K*K+3*K+I*I*(left+1))+M*I*I)/4+2
            xijkl[tmp] = xkl[L+1]
          ENDDO
        ENDDO
      ENDDO
    ENDDO

Page 28: Induction Variable Analysis with Chains of Recurrences

Recognizing Mixed Functional Forms and Reductions

Loop L:

    I = 1
    do
      F = F*I
      I = I+1
    while (…)

Simplified CR form: F = {F0, *, 1, +, 1}, I = {1, +, 1}

Factorial: F = F0 * i!

Loop L:

    I = 0; S = 0
    do
      S = S+A[I]
      I = I+2
    while (…)

Simplified CR form: S = {0, +, A[{0, +, 2}]}, I = {0, +, 2}

Reduction: S = ∑ A[2i]

Page 29: Induction Variable Analysis with Chains of Recurrences

Outline

Brief tutorial on induction variable recognition

Chains of recurrences: why and how?

Analyzing pointer arithmetic in loops

    Converting pointer references into array references to facilitate array dependence testing and loop restructuring

Array dependence testing for loop restructuring and vectorization

Results and conclusions

Page 30: Induction Variable Analysis with Chains of Recurrences

Converting Pointer References to Array References

Key observation: pointers with pointer arithmetic in loops often behave similarly to IVs

Use IV recognition algorithms to detect pointer-based IVs

Convert pointer-based IVs to closed-form characteristic functions to obtain array references

Page 31: Induction Variable Analysis with Chains of Recurrences

Pointer Access Descriptions of Pointer and Array References

A pointer access description (PAD) [vanEngelen01b] is a CR form of a pointer or array reference in a loop nest

PADs are computed with the CR-based IV algorithms

    short a[…], *p;
    int i;
    p = a;
    for (i=0; …; i++) {
      /* loop code from the table */
    }

Loop Code       PAD                  Sequence
a[i]            {a, +, 1}            a[0],a[1],a[2],a[3]
a[2*i+1]        {a+1, +, 2}          a[1],a[3],a[5],a[7]
a[(i*i-i)/2]    {a, +, 0, +, 1}      a[0],a[0],a[1],a[3]
a[1<<i]         {a+1, +, 1, *, 2}    a[1],a[2],a[4],a[8]
p++             {a, +, 1}            a[0],a[1],a[2],a[3]
p+=i            {a, +, 0, +, 1}      a[0],a[0],a[1],a[3]

Page 32: Induction Variable Analysis with Chains of Recurrences

Example

Lsp_az speech codec segment from ETSI with pointer updates:

    f += 2;
    lsp += 2;
    for (i = 2; i <= 5; i++) {
      *f = f[-2];
      for (j = 1; j < i; j++, f--)
        *f += f[-2]-2*(*lsp)*f[-1];
      *f -= 2*(*lsp);
      f += i;
      lsp += 2;
    }

Lsp_az speech codec segment after pointer-to-array conversion:

    for (i = 0; i <= 3; i++) {
      f[i+2] = f[i];
      for (j = 0; j <= i; j++)
        f[i-j+2] += f[i-j]-2*lsp[2*i+2]*f[i-j+1];
      f[1] -= 2*lsp[2*i+2];
    }

Note that all array index expressions are affine.
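The equivalence of the two Lsp_az versions can be checked by simulation. The sketch below is an assumption-based model, not the codec itself: C pointers are modelled as integer offsets into Python lists (`fp` and `lp` are hypothetical names), and the input data is made up.

```python
def lsp_az_pointer(f, lsp):
    """Pointer version; fp and lp are offsets standing in for f and lsp."""
    f = list(f)
    fp = 2                       # f += 2
    lp = 2                       # lsp += 2
    for i in range(2, 6):        # i = 2..5
        f[fp] = f[fp - 2]
        for _j in range(1, i):   # j = 1..i-1, with f-- after each body
            f[fp] += f[fp - 2] - 2 * lsp[lp] * f[fp - 1]
            fp -= 1
        f[fp] -= 2 * lsp[lp]
        fp += i
        lp += 2
    return f

def lsp_az_array(f, lsp):
    """Array version after pointer-to-array conversion."""
    f = list(f)
    for i in range(0, 4):        # i = 0..3
        f[i + 2] = f[i]
        for j in range(0, i + 1):
            f[i - j + 2] += f[i - j] - 2 * lsp[2 * i + 2] * f[i - j + 1]
        f[1] -= 2 * lsp[2 * i + 2]
    return f

f_in = [1, 2, 3, 4, 5, 6]                 # made-up data, 6 elements needed
lsp_in = [9, 8, 7, 6, 5, 4, 3, 2, 1]      # made-up data, 9 elements needed
equal = lsp_az_pointer(f_in, lsp_in) == lsp_az_array(f_in, lsp_in)
print(equal)  # True
```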

Page 33: Induction Variable Analysis with Chains of Recurrences

Outline

Brief tutorial on induction variable recognition

Chains of recurrences: why and how?

Analyzing pointer arithmetic in loops

Array dependence testing for loop restructuring and vectorization

    CR-based dependence testing

    Solving linear (affine), nonlinear, and symbolic equations

Results and conclusions

Page 34: Induction Variable Analysis with Chains of Recurrences

Benefits of Array & Pointer Dependence Testing with CRs

Reduced complexity of testing

    Eliminates IV substitution phase

    Eliminates the need for pointer-to-array conversion for dependence testing on pointer-based C code

Extended coverage

    Able to solve linear, nonlinear, and symbolic dependence equations

    Possibility to augment existing tests, such as the extreme value test and range test

Page 35: Induction Variable Analysis with Chains of Recurrences

CR Dependence Equations

Compute dependence equations in CR form for pointer and array accesses in loop nests directly without IV substitution or pointer-to-array conversion

Solve the equations by computing value ranges of the CR forms to determine solution intervals

If the solution space is empty, there is no dependence

Page 36: Induction Variable Analysis with Chains of Recurrences

Determining the Value Range of a CR Form on a Domain

Suppose x(i) = {x0, +, s(i-1)} for i = 0, …, n

    If s(i-1) > 0 then x(i) is monotonically increasing

    If s(i-1) < 0 then x(i) is monotonically decreasing

If a function is monotonic on its domain, then it is trivial to find its exact value range

Page 37: Induction Variable Analysis with Chains of Recurrences

Example

Determine the value range of x(i) = (i^2-i)/2 for i = 0,…,10

    Convert to CR form x(i) = {0, +, 0, +, 1} = {0, +, {0, +, 1}}

    Monotonically increasing, since {0, +, 1} = i ≥ 0 for i = 0,…,10

    Therefore, lower bound of x(i) is x(0) = 0 and upper bound is x(10) = (10^2-10)/2 = 45

Classic interval analysis often gives conservative results

    For this example, interval analysis gives the range ([0,10]^2-[0,10])/2 = ([0,100]+[-10,0])/2 = [-10,100]/2 = [-5,50]
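The contrast between the two range computations can be reproduced in a few lines. The interval helpers below are a minimal assumed implementation, valid only for the non-negative interval used here, and the monotonicity check is done by enumerating the stride rather than by CR reasoning.

```python
def interval_square(a):
    """Square of an interval (lo, hi), assuming 0 <= lo <= hi."""
    return (a[0] ** 2, a[1] ** 2)

def interval_sub(a, b):
    """a - b for intervals: smallest minus largest, largest minus smallest."""
    return (a[0] - b[1], a[1] - b[0])

def interval_half(a):
    return (a[0] / 2, a[1] / 2)

def x(i):
    return (i * i - i) // 2

i_range = (0, 10)

# Classic interval analysis treats the two occurrences of i independently.
naive = interval_half(interval_sub(interval_square(i_range), i_range))

# The stride of x is {0,+,1} = i, which is never negative on the domain,
# so x is monotonic and its range is given by the end points.
strides = [x(i + 1) - x(i) for i in range(0, 10)]
monotonic = all(s >= 0 for s in strides)
exact = (x(0), x(10)) if monotonic else None

print(naive)  # (-5.0, 50.0)
print(exact)  # (0, 45)
```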

Page 38: Induction Variable Analysis with Chains of Recurrences

Solving a Dependence Equation

    float a[…], *p, *q;
    p = a;
    q = a+2*n;
    for (i=0; i<n; i++) {
      t = *p;
    S: *p++ = *q;
      *q-- = t;
    }

CR forms of the pointers: p = {a, +, 1}, q = {a+2n, +, -1}

Dependence equation:

    {a, +, 1}_{i_d} = {a+2n, +, -1}_{i_u}

Constraints:

    0 ≤ i_d ≤ n-1
    0 ≤ i_u ≤ n-1

Rewrite dependence equation:

    {a, +, 1}_{i_d} = {a+2n, +, -1}_{i_u}
    ⇒ {a, +, 1}_{i_d} - {a+2n, +, -1}_{i_u} = 0
    ⇒ {{-2n, +, 1}_{i_u}, +, 1}_{i_d} = 0

Compute solution interval:

    Low[{{-2n, +, 1}_{i_u}, +, 1}_{i_d}] = Low[{-2n, +, 1}_{i_u}] = -2n
    Up[{{-2n, +, 1}_{i_u}, +, 1}_{i_d}] = Up[{-2n, +, 1}_{i_u} + n-1] = Up[-2n + 2n - 2] = -2

The solution interval [-2n, -2] does not contain 0: no dependence
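The solution-interval computation can be mirrored numerically. The sketch assumes n ≥ 1 and writes the CR {{-2n, +, 1}_{i_u}, +, 1}_{i_d} as the function h(i_d, i_u) = i_d + i_u - 2n; because both strides are positive, the bounds are taken at the corners of the iteration space. The function names are hypothetical.

```python
def solution_interval(n):
    """Low and Up of h(i_d, i_u) = i_d + i_u - 2n on 0 <= i_d, i_u <= n-1."""
    def h(i_d, i_u):
        return i_d + i_u - 2 * n
    low = h(0, 0)              # both strides are +1: minimum at the origin
    up = h(n - 1, n - 1)       # maximum at the last iteration of both loops
    return low, up

def may_depend(n):
    """A dependence is possible only if 0 lies in the solution interval."""
    low, up = solution_interval(n)
    return low <= 0 <= up

print(solution_interval(8))  # (-16, -2)
print(may_depend(8))         # False
```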

Page 39: Induction Variable Analysis with Chains of Recurrences

Dependence Testing on Nonlinear & Symbolic Accesses

    float a[…], *p, *q;
    p = q = a;
    for (i=0; i<n; i++) {
      for (j=0; j<=i; j++)
        *q += *++p;
      q++;
    }

    p = {{a+1, +, 1, +, 1}_i, +, 1}_j = a[(i^2+i)/2+j+1]
    q = {a, +, 1}_i = a[i]

CR dep. test disproves flow dependence (<, <)

    DO i = 1, M+1
      S1: A[I*N+10] = ...
      S2: ... = A[2*I+K]
      K = 2*K+N
    ENDDO

    S1: A[{N+10, +, N}_i]
    S2: A[{K0+2N, +, K0+N+2, *, 2}_i]

CR range test disproves dependence when K+N > 10 and K > 2

Page 40: Induction Variable Analysis with Chains of Recurrences

Some Observations wrt. Vectorization

Auto-vectorization (mostly) requires affine array accesses, e.g. to enable unimodular loop transformations

Best when vector loads/stores are memory aligned

Data remapping possible, but runtime remapping may outweigh vectorization speedup

CR forms of array and pointer accesses naturally represent memory access sequences, which may help in detecting aligned memory accesses and to support data remapping

Page 41: Induction Variable Analysis with Chains of Recurrences

Some Preliminary Results

Additional dependence pairs broken by CR test over Omega and range test:

Perfect Club Suite Benchmark   Additional dependence pairs broken
DYFESM*                        0
OCEAN                          63
QCD*                           51
BDNA                           6
TRFD                           10
MDG*                           62
MG3D*                          8

Results shown produced from CR test implementation in Polaris

* Test results incomplete due to Polaris memory issue

Page 42: Induction Variable Analysis with Chains of Recurrences

Conclusions

The CR-based compiler analysis framework supports:

    IV recognition and strength reduction optimizations

    Pointer-to-array conversion for analysis of C loops

    Array dependence testing with affine, nonlinear, and symbolic dependence equations

    Dependence testing on pointer arithmetic

    Induction variable substitution

Implementations

    GCC 4.x (with limitations!)

    From our lab: Polaris (TBA)

Page 43: Induction Variable Analysis with Chains of Recurrences

Further Reading

Robert van Engelen, Johnnie Birch, Yixin Shou, Burt Walsh, and Kyle Gallivan, “A Unified Framework for Nonlinear Dependence Testing and Symbolic Analysis”, in the proceedings of the ACM International Conference on Supercomputing (ICS), 2004, pages 106-115.

Robert van Engelen, Johnnie Birch, and Kyle Gallivan, “Array Dependence Testing with the Chains of Recurrences Algebra”, in the proceedings of the IEEE International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA), January 2004, pages 70-81.

Robert van Engelen and Kyle Gallivan, “An Efficient Algorithm for Pointer-to-Array Access Conversion for Compiling and Optimizing DSP Applications”, in proceedings of the 2001 International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA), January 2001, pages 80-89.

Robert van Engelen, “Efficient Symbolic Analysis for Optimizing Compilers”, in proceedings of the International Conference on Compiler Construction, ETAPS 2001, LNCS 2027, pages 118-132.

Page 44: Induction Variable Analysis with Chains of Recurrences

References

[Aho86] AHO, A., SETHI, R., AND ULLMAN, J. Compilers: Principles, Techniques and Tools. Addison-Wesley Publishing Company, Reading MA, 1985.

[Allen69] ALLEN, F.E. Program optimization, Annual Review in Automatic Programming, 5, pp. 239-307.

[Banerjee88] BANERJEE, U. Dependence Analysis for Supercomputing. Kluwer, Boston, 1988.

[Blume94] BLUME, W., AND EIGENMANN, R. The range test: a dependence test for symbolic non-linear expressions. In proceedings of Supercomputing (1994), pp. 528–537. 22, 2 (1994), 183–205.

[Cytron91] CYTRON, R., FERRANTE, J., ROSEN, B.K., WEGMAN, M.N., ZADECK, F.K. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, ACM Transactions on Programming Languages and Systems, 1991.

[Franke01] FRANKE, B., AND O’BOYLE, M. Compiler transformation of pointers to explicit array accesses in DSP applications. In proceedings of the ETAPS Conference on Compiler Construction 2001, LNCS 2027 (2001), pp. 69–85.

[Gerlek95] GERLEK, M., STOLZ, E., AND WOLFE, M. Beyond induction variables: Detecting and classifying sequences using a demand-driven SSA form. ACM Transactions on Programming Languages and Systems (TOPLAS) 17, 1 (Jan. 1995), pp. 85–122.

[Goff91] GOFF, G., KENNEDY, K., AND TSENG, C.-W. Practical dependence testing. In proceedings of the ACM SIGPLAN’91 Conference on Programming Language Design and Implementation (PLDI) (1991), vol. 26, pp. 15–29.

[Haghighat95] HAGHIGHAT, M. R. Symbolic Analysis for Parallelizing Compilers. Kluwer Academic Publishers, 1995.

[Kennedy81] KENNEDY, K. A survey of data flow analysis techniques, in Muchnick and Jones, (1981), pp. 5-54.

[Lowry69] LOWRY, E.S., AND MEDLOCK, C.W. Object code optimization, Communications of the ACM, 12 (1969), pp. 159-166.

[Maydan91] MAYDAN, D. E., HENNESSY, J. L., AND LAM, M. S. Efficient and exact data dependence analysis. In proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) (1991), ACM Press, pp. 1–14.

[Muchnick97] MUCHNICK, S. Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco, CA, 1997.

[Psarris03] PSARRIS, K. Program analysis techniques for transforming programs for parallel systems. Parallel Computing 28, 3 (2003), 455–469.

[Pugh91] PUGH, W., AND WONNACOTT, D. Eliminating false data dependences using the Omega test. In proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) (1992), pp. 140–151.

[vanEngelen01a] VAN ENGELEN, R. Efficient symbolic analysis for optimizing compilers. In proceedings of the ETAPS Conference on Compiler Construction, LNCS 2027 (2001), pp. 118–132.

[vanEngelen01b] VAN ENGELEN, R., AND GALLIVAN, K. An efficient algorithm for pointer-to-array access conversion for compiling and optimizing DSP applications. In proceedings of the International Workshop on Innovative Architectures for Future Generation High-Performance Processor and Systems (IWIA) (2001), pp. 80–89.

[vanEngelen04] VAN ENGELEN, R. A., BIRCH, J., SHOU, Y., WALSH, B., AND GALLIVAN, K. A. A unified framework for nonlinear dependence testing and symbolic analysis. In proceedings of the ACM International Conference on Supercomputing (ICS) (2004), pp. 106–115.

[Wolfe92] WOLFE, M. Beyond induction variables. In ACM SIGPLAN’92 Conf. on Programming Language Design and Implementation (1992), pp. 162–174.

[Wolfe96] WOLFE, M. High Performance Compilers for Parallel Computers. Addison-Wesley, Redwood City, CA, 1996.

[Zima90] ZIMA, H., AND CHAPMAN, B. Supercompilers for Parallel and Vector Computers. ACM Press, New York, 1990.

[Zima92] ZIMA, E. Recurrent relations and speed-up of computations using computer algebra systems. In proceedings of DISCO’92 (1992), LNCS 721, pp. 152–161.