Automatic Generation of Modular Mappings *
Hyuk-Jae Lee and Jose A.B. Fortes
School of Electrical and Computer Engineering
Purdue University, W. Lafayette, IN 47907
{hyuk,fortes}@ecn.purdue.edu
Abstract
Modular mappings have recently been proposed for optimizations of algorithms that cannot be
efficiently mapped by affine mappings. This paper addresses the problem of generating modular
mappings that satisfy conditions for validity and optimality. In general, this is a difficult problem
due to the presence of non-linear constraints. Hence, a method of $O(n^2)$ complexity is provided
to assign values to some entries of a transformation matrix so that non-linear constraints are
transformed into linear ones, where $n$ is the dimension of a computation domain. The proposed
heuristic attempts to reduce the number of value-assigned entries and exclude as few solutions
as possible. This paper also considers the issue of deriving the inverse transformation of a
given modular mapping. It identifies a class of modular functions whose inverses result directly
from computing the inverse of the (coefficient) matrix used to specify a modular mapping. An
efficient method of $O(n^2)$ complexity is provided to formulate the problem of generating such
modular mappings as an integer linear programming problem.
1 Introduction
Affine transformations are widely used for many optimizations (of programs with loops) in
parallelizing compilers and systolic array design [1]-[4]. Recently, modular mappings, described
by linear transformations modulo a constant vector, have been proposed for additional important
optimizations and designs [5]. Initial work on modular mappings focused on the characterization
of injectivity of a modular mapping [6]-[7]. These conditions can be used to identify the space
of valid modular time-space mappings of a given regular algorithm. Additional work concentrated
on finding constraints that capture conditions of modular mappings for fast schedules,
data alignment, data distribution, and efficient space allocation [8]-[10]. This paper addresses
the issue of how to combine all these constraints and systematically generate modular time-space
mappings. The proposed method is a heuristic one but, nevertheless, it is systematic,
computationally affordable, and attempts to exclude as few solutions as possible.
*This research was partially funded by the National Science Foundation under grants MIP-9500673 and CDA-9015696.
1063-6862/96 $5.00 © 1996 IEEE
To automatically generate a program that results from a modular time-space transformation,
it is necessary to compute the inverse of the transformation. This paper identifies conditions of
modular mappings (whose form is $T_{\vec m} : \vec j \mapsto (T\vec j) \bmod \vec m$, as discussed in Section 2) whose inverse
can also be easily derived from the inverse of the transformation matrix $T$ (i.e., the inverse is
of the form $T^{-1}_{\vec m} : \vec j \mapsto (T^{-1}\vec j) \bmod \vec m$). Then, a heuristic is proposed to formulate integer linear
programming problems whose solutions are modular mappings that satisfy these conditions.
The heuristic is not optimal in the sense that the solution of the integer linear programming
problem excludes some feasible modular mappings. However, it is similar to the above-mentioned
heuristic for combining multiple constraints and is systematic and computationally affordable
while attempting to include as many solutions as possible.
The rest of the paper is organized as follows. Section 2 defines and characterizes modular
mappings. Previous work on the conditions for injectivity of modular mappings is briefly reviewed.
Section 3 studies the problem of generating a modular mapping that satisfies injectivity
conditions as well as other constraints. Section 4 investigates conditions of modular mappings
whose inverse can be derived from the inverse of the transformation matrix. In addition, generation
of modular time-space mappings that satisfy the invertibility conditions is considered.
Conclusions are presented in Section 5.
2 Background
A time-space transformation is a mapping from an index set (iteration set, or computation
domain) of a (nested-loop) program into the domain of time and space (i.e., processors). A
modular time-space transformation is a special type of time-space transformation that can be
described by a linear transformation modulo a constant vector.
Definition 1 [modular function] A modular function, $T_{\vec m} : \mathbb{Z}^n \to \mathbb{Z}^{n'}$, is a mapping of the form:
$$T_{\vec m}(\vec j) = \begin{pmatrix} (T_{(1,*)} \cdot \vec j) \bmod m_1 \\ (T_{(2,*)} \cdot \vec j) \bmod m_2 \\ \vdots \\ (T_{(n',*)} \cdot \vec j) \bmod m_{n'} \end{pmatrix}$$
where $T_{(i,*)}$ is a row vector. The matrix $T$, whose rows are $T_{(1,*)}, \ldots, T_{(n',*)}$, and the vector
$\vec m = (m_1, \ldots, m_{n'})^T$ are called the transformation matrix and modulus vector, respectively. □
Definition 2 [modular time-space transformation] A modular time-space transformation, $T_{\vec m}$,
is a modular function that is injective when its domain is restricted to the index set $J$ of an
algorithm, i.e., $T_{\vec m} : J \to \mathbb{Z}^{n'}$ is injective. □
For the modular time-space transformations considered in the remainder of this paper, it is always
the case that $n = n'$.
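Definitions 1 and 2 can be illustrated with a minimal Python sketch. The function names, the example matrix, the modulus vector, and the rectangular index set below are hypothetical choices made only for this illustration; the injectivity test is a brute-force enumeration, not a method from this paper.

```python
from itertools import product

def modular_map(T, m, j):
    """Apply T_m(j): row i of T times j, reduced modulo m[i]."""
    return tuple(sum(t * x for t, x in zip(row, j)) % mi
                 for row, mi in zip(T, m))

def is_injective(T, m, index_set):
    """Brute-force test of Definition 2 on a finite index set."""
    images = [modular_map(T, m, j) for j in index_set]
    return len(set(images)) == len(images)

# Hypothetical example: a 3 x 4 rectangular index set with n = n' = 2.
J = list(product(range(3), range(4)))
m = (3, 4)
T_good = [[1, 1],
          [0, 1]]
T_bad = [[1, 1],
         [1, 1]]   # maps (0, 1) and (1, 0) to the same point

print(is_injective(T_good, m, J))
print(is_injective(T_bad, m, J))
```

Enumeration is only practical for small index sets, which is why closed-form conditions such as those reviewed next are of interest.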
It is not trivial to check whether a transformation matrix $T$ and a modulus vector $\vec m$ yield an
injective modular mapping. Initial work on the characterization of injective modular mappings
of rectangular index sets appears in [5], [6] and additional results are due to [7]. The following
theorem gives conditions on transformation matrices that guarantee the injectivity of the
corresponding modular mappings. Other injectivity conditions can also be found in [5]-[7].
Theorem 1 ([5]) Let $T_{\vec b}$ be a modular function of the index set $J_{\vec b}$. Let $\succ$ be an arbitrary total
order on the set $\{1, 2, \ldots, n\}$. $T_{\vec b}$ is injective if its transformation matrix $T$ satisfies (1) $t_{ii}$ is
relatively prime to $b_i$, and (2) $t_{ij} = 0$ if $i \succ j$. □
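The two conditions of Theorem 1 are easy to test mechanically. The sketch below is a hedged illustration: it reads condition (1) as a condition on the diagonal entries, assumes that the index set is the box $0 \le j_i < b_i$, and encodes the order as a list `chain` from greatest to least. The function names and the example matrix are hypothetical.

```python
from itertools import product
from math import gcd

def satisfies_theorem1(T, b, chain):
    """Check conditions (1) and (2) of Theorem 1.

    chain lists the 1-based indices from greatest to least, so
    chain = [2, 1] encodes the order "2 succeeds 1" (an assumed encoding).
    """
    rank = {idx: pos for pos, idx in enumerate(chain)}
    n = len(T)
    for i in range(1, n + 1):
        # (1) the diagonal entry must be relatively prime to b_i
        if gcd(T[i - 1][i - 1], b[i - 1]) != 1:
            return False
        # (2) t_ij must be zero whenever i succeeds j in the order
        for j in range(1, n + 1):
            if i != j and rank[i] < rank[j] and T[i - 1][j - 1] != 0:
                return False
    return True

def injective_on_box(T, b):
    """Brute-force injectivity of j -> (T j) mod b on 0 <= j_i < b_i."""
    box = list(product(*(range(bi) for bi in b)))
    images = {tuple(sum(t * x for t, x in zip(row, j)) % bi
                    for row, bi in zip(T, b)) for j in box}
    return len(images) == len(box)

# Hypothetical example with n = 2 and the order 2 > 1.
T = [[1, 2],
     [0, 3]]
b = (3, 4)
print(satisfies_theorem1(T, b, [2, 1]), injective_on_box(T, b))
```

The theorem is a sufficient condition only, so a matrix that fails the check under every order may still be injective.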
In addition to injectivity conditions, there are many other conditions that may have to be
imposed on modular mappings to guarantee their validity and optimality[8]-[10]. Due to space
limitations, full details of these constraints are not explained in this paper. However, in general,
the problem of generating a modular mapping can be described as follows:
Find $T$ and $T^*$ that
minimize $h_1(T)$ (or $h_2(T^*)$) subject to: (1) $GT > 0$, (2) $KT^* > 0$, (3) $TT^* = I$, (4) injectivity conditions.
Here, $T$ is the transformation matrix of the modular mapping, $h_1(T)$ (or $h_2(T^*)$) is the objective
function of this problem, and $G$ and $K$ are matrices. The inequalities in constraints (1) and (2)
can be replaced by equalities. Constraint (3) captures the fact that $T^*$ is the inverse of $T$. The
difficulty in solving this problem lies in the fact that some constraints are imposed on $T$ while
others are imposed on the inverse of $T$, so that the non-linear constraint (3) must be satisfied.
For the injectivity conditions, the problem needs to be decomposed into $n!\,n!$ subproblems, each of which
has linear injectivity conditions [5]. Hence, by linearizing constraint (3), this problem can be
converted into $n!\,n!$ linear programming problems.
3 Generation of Injective Modular Mappings with Constraints
The constraint $TT^{-1} = I$ consists of a set of non-linear equations of the form $\sum_k t_{ik} t^{-1}_{kj} = \delta_{ij}$,
where $t^{-1}_{kj}$ denotes the $(k, j)$ entry of $T^{-1}$ and $\delta_{ij}$ is one if $i = j$, and zero if $i \neq j$. To make
this condition linear, it is necessary to fix either $t_{ik}$ or $t^{-1}_{kj}$ to a constant. However, an arbitrary
choice of $t_{ik}$ or $t^{-1}_{kj}$ may result in a non-optimal solution. The more entries are chosen arbitrarily,
the smaller the search space and the lower the likelihood of finding an optimal solution. Hence, it is
desirable to minimize the number of arbitrarily chosen entries.
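As a small worked illustration of why fixing entries linearizes the constraint, consider $n = 2$ and the $(1,1)$ entry of $TT^{-1} = I$. The constants $c_1$ and $c_2$ below are hypothetical values introduced only for this sketch.

```latex
% n = 2, entry (1,1) of T T^{-1} = I: each product has two unknown factors.
\[ t_{11}\, t^{-1}_{11} + t_{12}\, t^{-1}_{21} = \delta_{11} = 1 . \]
% Fixing t_{11} = c_1 and t_{12} = c_2 leaves only entries of T^{-1} unknown:
\[ c_1\, t^{-1}_{11} + c_2\, t^{-1}_{21} = 1 , \]
% which is linear in t^{-1}_{11} and t^{-1}_{21}.
```

Each fixed entry removes candidate matrices from the search space, which is the cost that the graph-based heuristic of this section tries to keep small.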
Consider a directed graph $G(V, E, W)$ induced by a matrix $T$ and an order $\succ$ as follows:
nodes: $V = \{v_i \mid v_i$ represents the $i$th row of $T\}$.
edges: $E = \{(v_i, v_j) \mid j \succ i\}$, where $\succ$ is an order on the set $\{1, 2, \ldots, n\}$.
weights of edges:
- $w(v_i, v_j) = 1$ if $t_{ij}$ is not determined.
- $w(v_i, v_j) = 0$ if $t_{ij}$ is determined.
Figure 1. (a) Graph induced by $T$ and $2 \succ 3 \succ 4 \succ 1$ (b) Maximally merged graph
Two nodes $v_i$ and $v_j$ are adjacent with respect to $\succ$ if there does not exist any number $k$ between
$i$ and $j$ in the order $\succ$. Let $v_i$ and $v_j$ be adjacent with respect to $\succ$ and let $w(v_i, v_j) = 0$. Then,
a $(v_i, v_j)$-merged graph $G'$ is generated from a graph $G(V, E, W)$ as follows:
- two nodes $v_i$ and $v_j$ are merged into one node $v_{i,j}$;
- for a given node $v_l$ of $G$, $w(v_l, v_{i,j})$ becomes $w(v_l, v_i) + w(v_l, v_j)$;
- for a given node $v_l$ of $G$, $w(v_{i,j}, v_l)$ becomes $w(v_i, v_l) + w(v_j, v_l)$.
The maximally merged graph of $G$ is a graph generated by merging all pairs of adjacent nodes
connected by a zero-weight edge.
Example 1 Consider a $4 \times 4$ matrix $T$ [entries illegible in the source], where $*$ denotes undetermined entries, and the
order $2 \succ 3 \succ 4 \succ 1$ on the set $\{1, 2, 3, 4\}$. The graph induced by $T$ is shown in Fig. 1 (a). Since
$w(v_1, v_4) = 0$, $v_1$ and $v_4$ can be merged as shown in Fig. 1 (b). The weight $w(v_{1,4}, v_3)$ becomes
$w(v_1, v_3) + w(v_4, v_3) = 2$. Similarly, $w(v_{1,4}, v_2) = 1$ is obtained. There do not exist any other
adjacent nodes with zero edge weights. Hence, this graph is the maximally merged graph. □
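The merging procedure can be sketched in Python as follows. Since merged nodes always cover consecutive indices of the order, the sketch represents the graph as a list of blocks. The set `undet` of undetermined entries is a hypothetical pattern chosen to reproduce the weights quoted in Example 1; it is not the matrix of the example, and the function names are assumptions of this sketch.

```python
def block_weight(undet, lower, upper):
    """Sum of edge weights from the nodes in `lower` to the nodes in `upper`.

    An edge (v_i, v_j) has weight 1 when t_ij is undetermined and 0 otherwise.
    """
    return sum(1 for i in lower for j in upper if (i, j) in undet)

def maximally_merge(chain, undet):
    """Merge adjacent nodes joined by zero-weight edges until none remain.

    chain lists the indices from least to greatest under the order, so
    [1, 4, 3, 2] encodes 2 > 3 > 4 > 1. Each returned block is a merged node.
    """
    blocks = [[i] for i in chain]
    merged = True
    while merged:
        merged = False
        for k in range(len(blocks) - 1):
            if block_weight(undet, blocks[k], blocks[k + 1]) == 0:
                blocks[k:k + 2] = [blocks[k] + blocks[k + 1]]
                merged = True
                break
    return blocks

# Hypothetical undetermined entries (i, j) with j > i in the order.
undet = {(1, 3), (4, 3), (1, 2), (3, 2)}
blocks = maximally_merge([1, 4, 3, 2], undet)
print(blocks)                                  # [[1, 4], [3], [2]]
print(block_weight(undet, [1, 4], [3]))        # 2
print(block_weight(undet, [1, 4], [2]))        # 1
```

The result has three nodes, matching the maximally merged graph described in the example.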
Proposition 1 Let $T_{\vec m}$ be a modular mapping that satisfies the conditions of Theorem 1. The
condition $TT^{-1} = I$ becomes a system of linear equations if the maximally merged graph of the
graph induced by $T$ has at most two nodes. □
Proposition 1 provides the condition on the induced graph that guarantees that the equations
$TT^{-1} = I$ are linear. Suppose that an induced graph does not satisfy Proposition 1. Some
edge weights should be reset to zero so that the graph can be merged into a two-node graph. The
resetting of an edge weight implies determination of the corresponding entry of $T$, thus imposing
more restrictions on $T$. Hence, one needs to carefully choose the edges to be reset so as to minimize the
number of determined entries. In addition, the value chosen for a determined entry also
affects the quality of the solution of the resulting linear programming problem. However, in the
reset procedure, there is no exact information on which value results in a better solution. In
general, the constraint on an entry in the first row of $T$ often requires that the entry
be nonzero because the first row of $T$ is the schedule vector $\Pi$, which has a constraint of the form
$\Pi \neq 0$. On the other hand, the entries in the other rows of $T$ are often required to be zero for
the conditions of code generation discussed in the next section. Hence, in the reset step, it may
be desirable to assign one to all entries in the first row and zero to all entries in the other rows.
When an edge of a merged graph is reset, it affects as many undetermined entries as the
weight of the edge. Therefore, it is desirable to reset the edge with the smallest weight. After
an edge weight is reset, the corresponding adjacent nodes can be merged. These reset and merge
steps can be repeated until only two nodes are left. For a single reset/merge step, it is necessary
to find the edge with minimal weight among the edges connecting adjacent nodes. Since only
edges connecting adjacent nodes need to be compared, $O(n)$ time is required
for $n$ given nodes. Since $O(n)$ reset/merge steps are necessary, the time complexity of generating
a two-node maximally merged graph is $O(n^2)$.
Example 1 [continued] The maximally merged graph violates the condition in Proposition 1.
Hence, there are non-linear equations among the equations $TT^{-1} = I$. To obtain a two-node maximally
merged graph, one needs to merge either $v_{1,4}$ and $v_3$ or $v_3$ and $v_2$. Here, the edge connecting $v_3$
and $v_2$ has a smaller weight than the other. Hence, it is desirable to reset the weight $w(v_3, v_2)$ and
merge $v_3$ and $v_2$. The resulting graph is shown in Fig. 2. This graph satisfies Proposition 1.
This implies that determination of $t_{32}$ guarantees that the condition $TT^{-1} = I$ is linear. □
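The reset/merge loop can be sketched as below. This is a minimal illustration under assumptions: the set `undet` is the same hypothetical pattern that reproduces the weights of Example 1, the function names are invented for this sketch, and the code recomputes block weights from scratch for clarity, so it does not attain the $O(n^2)$ bound discussed above.

```python
def weight(undet, lower, upper):
    """Number of undetermined entries t_ij with i in `lower` and j in `upper`."""
    return sum((i, j) in undet for i in lower for j in upper)

def reset_and_merge(chain, undet):
    """Greedy reset/merge until at most two nodes remain.

    chain lists indices from least to greatest under the order.
    Returns the final blocks and the entries that had to be determined.
    """
    undet = set(undet)
    blocks = [[i] for i in chain]
    fixed = []
    while len(blocks) > 2:
        # Pick the adjacent pair joined by the lightest edge.
        k = min(range(len(blocks) - 1),
                key=lambda p: weight(undet, blocks[p], blocks[p + 1]))
        # Resetting the edge determines every undetermined entry it covers;
        # for a zero-weight edge nothing is determined (a plain merge).
        hit = sorted((i, j) for i in blocks[k] for j in blocks[k + 1]
                     if (i, j) in undet)
        fixed.extend(hit)
        undet.difference_update(hit)
        blocks[k:k + 2] = [blocks[k] + blocks[k + 1]]
    return blocks, fixed

# Hypothetical pattern matching the weights of Example 1 (order 2 > 3 > 4 > 1).
undet = {(1, 3), (4, 3), (1, 2), (3, 2)}
blocks, fixed = reset_and_merge([1, 4, 3, 2], undet)
print(blocks)   # [[1, 4], [3, 2]]
print(fixed)    # [(3, 2)]
```

Only $t_{32}$ is determined, in agreement with the continued example. The value assigned to a determined entry is a separate choice that the sketch leaves open.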
Figure 2. The maximally merged graph with two nodes
4 Generation of Modular Mappings for Code Generation
For automatic code generation, it is necessary to derive the inverse of a given time-space
mapping. The following proposition provides conditions on a modular mapping with
transformation matrix $T$ under which its inverse is a modular mapping with transformation matrix $T^{-1}$.
Proposition 2 Let $T_{\vec m}(\vec j) = (T\vec j) \bmod \vec m$ be a modular time-space transformation. The inverse of
$T_{\vec m}$ is $(T^{-1}\vec j) \bmod \vec m$ if either $t_{i\alpha} = 0$ for all $i$, $i \neq \alpha$, or $t_{\alpha j} = 0$ for all $j$, $j \neq \alpha$. In other words, for
any $\alpha \in \{1, 2, \ldots, n\}$ either $t_{\alpha\alpha}$ is the only nonzero entry of the $\alpha$th row or it is the only nonzero
entry of the $\alpha$th column. □
Example 2 Consider a modular time-space transformation $T_{\bmod \vec m}(\cdot)$ with $T$ [matrix illegible in the source]
and $\vec m = (3, 4, 5)^T$. Proposition 2 indicates that $(T^{-1}(\cdot))_{\bmod \vec m}$ is not the inverse of $(T(\cdot))_{\bmod \vec m}$.
Consider another modular transformation $T'_{\bmod \vec m}(\cdot)$ with $T'$ [matrix illegible in the source] and $\vec m = (3, 4, 5)^T$. In
this case, the transformation matrix satisfies the condition of Proposition 2. Hence, $(T'^{-1}(\cdot))_{\bmod \vec m}$
is the inverse of $T'_{\bmod \vec m}(\cdot)$. □
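The condition of Proposition 2 and its effect on the inverse can be checked numerically. In the sketch below, both matrices and their hand-computed integer inverses are hypothetical examples (they are not the matrices of Example 2), the index set is assumed to be the box $0 \le j_i < m_i$ with $\vec m = (3, 4, 5)^T$, and the function names are invented for this illustration.

```python
from itertools import product

def satisfies_prop2(T):
    """For every a, t_aa is the only nonzero entry of row a or of column a."""
    n = len(T)
    for a in range(n):
        row_clean = all(T[a][j] == 0 for j in range(n) if j != a)
        col_clean = all(T[i][a] == 0 for i in range(n) if i != a)
        if not (row_clean or col_clean):
            return False
    return True

def apply_mod(T, m, j):
    """Compute (T j) mod m componentwise."""
    return tuple(sum(t * x for t, x in zip(row, j)) % mi
                 for row, mi in zip(T, m))

def inverse_works(T, T_inv, m):
    """Check that (T_inv ((T j) mod m)) mod m == j on the whole box."""
    box = product(*(range(mi) for mi in m))
    return all(apply_mod(T_inv, m, apply_mod(T, m, j)) == j for j in box)

m = (3, 4, 5)
# Hypothetical matrix violating the condition (row 2 and column 2 both
# contain an off-diagonal nonzero entry), with its integer inverse.
T1 = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
T1_inv = [[1, 0, 0], [-1, 1, 0], [1, -1, 1]]
# Hypothetical matrix satisfying the condition, with its integer inverse.
T2 = [[1, 1, 1], [0, 1, 0], [0, 0, 1]]
T2_inv = [[1, -1, -1], [0, 1, 0], [0, 0, 1]]

print(satisfies_prop2(T1), inverse_works(T1, T1_inv, m))   # False False
print(satisfies_prop2(T2), inverse_works(T2, T2_inv, m))   # True True
```

For `T1`, the point $(2, 3, 0)$ is one index for which applying $T^{-1}$ modulo $\vec m$ fails to recover the original index.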
Consider the generation of modular mappings that satisfy Proposition 2. As in Section 3, a
graph-theoretical approach can be used. Consider a directed graph $G(V, E, W)$ induced by an
$n \times n$ matrix $T$ and an order $\succ$ as follows:
nodes: $V = \{v_i \mid v_i$ represents the $i$th row of $T\}$.
edges: $E = \{(v_i, v_j) \mid j \succ i\}$, where $\succ$ is an order on the set $\{1, 2, \ldots, n\}$.
weights of edges:
- $w(v_i, v_j) = 1$ if $t_{ij}$ is not determined.
- $w(v_i, v_j) = 0$ if $t_{ij} = 0$.
- $w(v_i, v_j) = -n$ if $t_{ij} \neq 0$.
This graph is the same as that of Section 3 except for the edge weights when $t_{ij} \neq 0$. As in
Section 3, two adjacent nodes can be merged if the weight between these two nodes is zero.
Figure 3. (a) Graph induced by $T$ and $2 \succ 3 \succ 4 \succ 1$ (b) Maximally merged graph
Example 3 Let $T$ be a $4 \times 4$ transformation matrix [entries illegible in the source], where $*$ denotes
undetermined entries, and let $2 \succ 3 \succ 4 \succ 1$ be an order on the set $\{1, 2, 3, 4\}$. The graph
induced by $T$ is shown in Fig. 3 (a). Fig. 3 (b) shows the maximally merged graph. □
The following proposition gives the condition on the induced graph that guarantees that the
corresponding modular mapping satisfies Proposition 2.
Proposition 3 A transformation matrix $T$ satisfies the conditions of Proposition 2 if the
maximally merged graph of the graph induced by $T$ has at most two nodes. □
If an induced graph cannot be merged into a two-node graph, then, as done in Section 3, it
is necessary to choose undetermined entries to be zero and repeat the reset/merge steps until
only two nodes are left. The only difference occurs when an entry $t_{ij}$ is initially determined to
be nonzero. Then, it is not possible to merge $v_i$ and $v_j$ because the corresponding weight is
initially set to $-n$, which is small enough to prevent the weight from becoming zero even if the two
nodes $v_i$ and $v_j$ are merged with other nodes and the weight $w(v_i, v_j)$ between them increases
by summation with other weights.
Example 3 [continued] The maximally merged graph violates the condition in Proposition 3
because it has three nodes. Hence, it is necessary to merge either $v_{1,4}$ and $v_3$ or $v_3$ and $v_2$.
The two nodes $v_{1,4}$ and $v_3$ cannot be merged because the connecting weight is negative. Therefore,
it is necessary to reset the edge $(v_3, v_2)$ and merge these two nodes. The resulting graph is shown in
Fig. 4. This graph has only two nodes, and therefore satisfies Proposition 3. □
Figure 4. The maximally merged graph with two nodes
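The variant of the heuristic used in this section can be sketched as follows. The entry statuses below are a hypothetical pattern chosen so that the run mirrors Example 3 (they are not the matrix of the example), the three status labels and the function names are assumptions of this sketch, and the weight $-n$ is taken directly from the text.

```python
def edge_weight(status, n, i, j):
    """Weight of (v_i, v_j): '*' undetermined, 'z' fixed to zero, 'nz' fixed nonzero."""
    return {'*': 1, 'z': 0, 'nz': -n}[status[(i, j)]]

def block_weight(status, n, lower, upper):
    """Sum of edge weights from the nodes in `lower` to the nodes in `upper`."""
    return sum(edge_weight(status, n, i, j) for i in lower for j in upper)

def reset_and_merge(chain, status):
    """Greedy reset/merge; edges with negative weight can never be merged.

    chain lists indices from least to greatest under the order.
    Returns the final blocks and the entries that were set to zero.
    """
    n = len(chain)
    status = dict(status)
    blocks = [[i] for i in chain]
    zeroed = []
    while len(blocks) > 2:
        weights = [(block_weight(status, n, blocks[k], blocks[k + 1]), k)
                   for k in range(len(blocks) - 1)]
        candidates = [(w, k) for w, k in weights if w >= 0]
        if not candidates:
            raise ValueError("no adjacent pair of nodes can be merged")
        _, k = min(candidates)
        # Resetting the edge sets every undetermined entry it covers to zero.
        for i in blocks[k]:
            for j in blocks[k + 1]:
                if status[(i, j)] == '*':
                    status[(i, j)] = 'z'
                    zeroed.append((i, j))
        blocks[k:k + 2] = [blocks[k] + blocks[k + 1]]
    return blocks, zeroed

# Hypothetical statuses for the order 2 > 3 > 4 > 1 (chain from least to greatest).
status = {(1, 4): 'z', (1, 3): 'nz', (4, 3): '*',
          (1, 2): 'z', (4, 2): '*', (3, 2): '*'}
blocks, zeroed = reset_and_merge([1, 4, 3, 2], status)
print(blocks)   # [[1, 4], [3, 2]]
print(zeroed)   # [(3, 2)]
```

In this run the nonzero entry $t_{13}$ keeps the weight between $v_{1,4}$ and $v_3$ negative, so the only admissible reset is the edge $(v_3, v_2)$.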
A modular mapping that satisfies Proposition 2 always satisfies Proposition 1 in Section 3.
Hence, a modular mapping generated by the method in this section also guarantees linearity of
constraints of the problem of generating a modular mapping. The procedure discussed in this
section also requires $O(n^2)$ time.
5 Conclusions
This paper addresses the problem of systematically generating modular time-space mappings
that simultaneously satisfy many conditions for validity and optimality. In general, this can be
formulated as a non-linear integer programming problem. This paper proposes an $O(n^2)$-time
heuristic that fixes a small number of entries of a transformation matrix so that the non-linear
program can be converted to a linear program, where $n$ is the dimension of a computation domain.
By fixing some entries, the solution space of the modular mapping generation problem is reduced.
However, the proposed heuristic attempts to preserve as much of the solution space as possible
while maintaining computationally affordable complexity. For automatic code generation, the
inverse of a modular mapping is required. Hence, this paper provides (invertibility) conditions
for modular mappings whose inverse can be derived from the inverse of the transformation matrix.
An $O(n^2)$-time heuristic is provided to formulate an integer programming problem to generate
modular mappings that satisfy invertibility conditions as well as other conditions.
References
[1] A. Darte and Y. Robert. On the alignment problem. Parallel Processing Letters, 4(3):259-270,
Sep. 1994.
[2] S.-Y. Kung. VLSI array processors. Prentice-Hall, 1988.
[3] G.-J. Li and B.W. Wah. The design of optimal systolic arrays. IEEE Trans. Comput.,
C-34:66-77, Jan. 1985.
[4] W. Shang and J.A.B. Fortes. Time optimal linear schedules for algorithms with uniform
dependencies. IEEE Trans. Comput., C-40:723-742, June 1991.
[5] H.-J. Lee and J.A.B. Fortes. On the injectivity of modular mappings. In Proc. Int. Conf.
Application-Specific Array Processors, pages 236-247, Aug. 1994.
[6] H.-J. Lee and J.A.B. Fortes. Modular mappings of rectangular algorithms. Technical Report
TR-EE 94-22, Electrical Engr., Purdue Univ., May 1994.
[7] A. Darte, M. Dion, and Y. Robert. A characterization of one-to-one modular mappings. In
Proc. 7th IEEE Symp. Parallel Distributed Processing, pages 382-389, Oct. 1995.
[8] H.-J. Lee and J.A.B. Fortes. Toward data distribution independent parallel matrix multiplication.
In Proc. Int. Parallel Processing Symposium, pages 436-440, April 1995.
[9] H.-J. Lee and J.A.B. Fortes. Data alignments for modular mappings of BLAS-like algorithms.
In Proc. Int. Conf. Application-Specific Array Processors, pages 34-41, July 1995.
[10] H.-J. Lee and J.A.B. Fortes. Conditions of blocked BLAS-like algorithms for data alignment
and communication minimization. In Proc. Int. Conf. Parallel Processing, volume 3, pages
220-223, Aug. 1995.