
A geometrical data-parallel language


Jean-Luc Dekeyser, Dominique Lazure, Philippe Marquet
dekeyser, lazure, [email protected]
Laboratoire d'Informatique Fondamentale de Lille
Université des Sciences et Technologies de Lille
Batiment M3, Cité Scientifique, 59655 Villeneuve d'Ascq cedex, France

Publication ERA-127, April 1993, revised January 1994
ACM Sigplan Notices, 1994

This project is partially supported by Digital Equipment Corporation in the frame of an EERP/DPRI contract.

Abstract

The HELP project proposes a data-parallel programming model that lets a programmer develop an algorithm as close as possible to the way he thinks about it. In many parts of a data-parallel program, the manipulations of data can be modeled as geometrical migrations inside a Cartesian reference space.

We define the C-HELP language within the family of explicit data-parallel languages: communications and computations are separated, and any vector description is banished.

This model and the associated languages are based on the hyper-space notion, and algorithm development follows an original semantics in which computations are limited to a set of hyper-space points. The hyper-space is not only a compilation-oriented concept: it is a multi-dimensional virtual array integrated into the programming model, and it provides a reference frame for every object access.

Introduction

Parallel programming techniques are divided into two fields: task parallelism and data parallelism. The two techniques coexist, and the community of parallelism specialists is divided into two clans. Nevertheless, a connection between the two philosophies can be observed: task parallelism is chosen for the execution model (supported by the architecture), whereas data parallelism becomes the programming model (supported by the compilers).

Recent work has led to the emergence of two standard programming languages: HPF [For93] and HPC. With these tools it is possible to exploit the power of parallel architectures efficiently, but no general methodology is provided for the scientist who develops parallel algorithms. These new standards are based on standard languages (Fortran, C) to which some features are added, independently of the programming model.

For the C-based languages, the specific parallel features often derive directly from the architecture, while the Fortran-derived languages do not provide deterministic optimization criteria (one can optimize a program only after executing it).
Hence, data-parallel algorithm development needs tools that simplify access to the power of the architecture.

The aim of the HELP project is to propose a general methodology integrating all the steps of parallel development. This methodology is particularly well suited to scientific algorithms. It relies on the following three principles:

Unification of the data-parallel development tools. The design, the translation, and the debugging of a data-parallel program have to be grounded on a single concept: the geometrical model.
Data-parallel programming deals with computations applied in parallel to homogeneous data structures. These structures are often multi-dimensional arrays, and their manipulation can be described as geometrical moves inside a Cartesian space.
The resolution methodology elaborated during the algorithm design has to be directly supported by the syntax and the semantics of the parallel programming language. The debugging tool also has to rely on this methodology.

Explicit communications and parallel computations. To be efficient, parallelism has to be explicitly expressed by the programmer. Like vector architectures, massively parallel machines require explicit communications and explicit calls to parallel computations.
The compiler does not have to take object migrations across the network into account. As far as efficiency is concerned, these migrations are too influential to be hidden from the programmer.

Separation of communications and computations. Since communications and computations are explicit, the legibility of a parallel program requires that these two fundamental operations be separated at both the model and the language levels. The generated code also becomes more efficient.

HELP provides an explicit and imperative data-parallel language in which the communication and computation phases are clearly distinguished. The HELP-based languages reflect this distinction by providing explicit communication operators. In contrast to High Performance Fortran [For93], any communication generated by a computation on non-aligned operands is banished in the HELP model.

In this article, we detail every step of the development of a scientific algorithm using the HELP model. We present the main characteristics of the HELP implementation inside the C-HELP language. The inversion of a square matrix by the Gauss-Jordan algorithm is developed as an example; it shows how well the HELP model fits matrix computation algorithms.

1 Data-parallel think

Before writing a parallel program, one has to choose a strategy to parallelize the problem. To support this choice, we propose the geometrical model.

1.1 The geometrical model

Scientific data-parallel algorithms mainly rely on the manipulation of regular structures (vectors, matrices...) to which computations are applied, either globally or limited to a regular subset (for example, to a matrix row).

The algorithm description is then simplified if the programming model is restricted to the manipulation of geometrical objects and their geometrical subsets.
For example, in many numerical algorithms, a matrix has to interact with one or several of its rows or columns. In this case, and in order to avoid implicit communications, the data-parallel model has to provide a replication primitive that extends the vector and makes it conform to the matrix.

The programming model able to implement these algorithms consists in defining a work space, moving data-parallel objects inside this space with geometrical primitives, and then triggering computations locally at the points of this space. These computations may be triggered in parallel because each point owns its elemental operands.

A geometrical approach allows the programmer to visualize his algorithm inside the space. Understanding the data-parallel model is made easier by associating the geometrical visualization with the geometrical expression of the algorithm.

1.2 Example: Gauss-Jordan, the algorithm

The Gauss-Jordan square matrix inversion algorithm is made up of three phases: the initialization, the diagonalization and the inverse computation.

Initialization. The square matrix A has to be inverted. A and an identity matrix of the same size are coupled together. The operations applied to A are simultaneously applied to this second matrix. The "union" of the two matrices is called M.

[Figure: the matrix M, made of A on the left and the identity matrix on the right.]

Diagonalization. In order to zero the elements of column p outside the pivot row, we apply to each row r, with $r \neq p$:

$\mathrm{row}_r = \mathrm{row}_r - \mathrm{row}_p \times A_{rp} / d_p$

where $d_p$ is the diagonal element of row p (the pivot). Since communications are explicit, we have to broadcast the pivot row ($\mathrm{row}_p$) to every other row, and the pivot column ($\mathrm{col}_p$) to every other column, in order to make $A_{rp}$ visible along row r. This operation may be carried out in parallel.

[Figure: diagonalization step, showing the diagonal elements $d_1, d_2, \ldots, d_p$, the pivot row $\mathrm{row}_p$, the pivot column $\mathrm{col}_p$ and a row $\mathrm{row}_r$.]

Inverse computation. After the diagonalization phase, each matrix row has to be divided by the diagonal element of the same index:

$\mathrm{row}_r = \mathrm{row}_r / d_r$

In order to bring each element of the diagonal to the right part of the matrix M, we use the geometrical primitives of projection and replication. This communication is carried out in parallel for each row.
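The three phases can be summarized by a short sequential sketch. It is written in Python rather than in a HELP language, uses exact fractions, and assumes that no pivot is zero (the text above does not describe any row exchange); the function name gauss_jordan_inverse is only illustrative.

```python
from fractions import Fraction


def gauss_jordan_inverse(a):
    """Invert a square matrix following the three phases described above."""
    n = len(a)
    # Initialization: M is A coupled with an identity matrix of the same size.
    m = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(a)]
    # Diagonalization: row_r = row_r - row_p * A_rp / d_p for every r != p.
    for p in range(n):
        pivot = m[p][p]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: this sketch does no row exchange")
        pivot_row = list(m[p])                    # broadcast to every other row
        pivot_col = [m[r][p] for r in range(n)]   # makes A_rp visible along row r
        for r in range(n):
            if r != p:
                m[r] = [x - y * pivot_col[r] / pivot
                        for x, y in zip(m[r], pivot_row)]
    # Inverse computation: row_r = row_r / d_r, read on the right half of M.
    return [[m[r][n + c] / m[r][r] for c in range(n)] for r in range(n)]
```

In the data-parallel version, the two inner loops disappear: every point of M performs its own update once the pivot row and the pivot column have been replicated.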

[Figure: inverse computation, showing the diagonal elements $d_1, d_2, \ldots, d_r, \ldots, d_N$ and the resulting $A^{-1}$.]

This basic example demonstrates that a data-parallel model using a 2-dimensional space and geometrical communication primitives is well suited to general matrix computations: geometrical primitives carry out the data migrations, after which the data are available at the points where the computations have to be triggered.

1.3 Hyper-Space: A geometrical reference for the HELP model

Interacting objects are always manipulated inside a hyper-space. A hyper-space is a geometrical set of points where the parallel objects (matrices, vectors) are positioned and moved. The HELP hyper-space point is the basic entity for computations: every computation on data-parallel objects is triggered at the point level, on local data.

Several hyper-spaces may be defined for a single algorithm, for different sizes or data shapes.

The hyper-space groups and aligns together the objects that share the same distribution. The programmer specifies the data distribution of a program through directives attached to the hyper-space. The objects of a given hyper-space are thus allocated with the same distribution algorithm.

A hyper-space is defined as a Cartesian frame of points with positive coordinates. Some information is attached to each hyper-space:

• The size of each dimension (number of points).
• Priority orders between the hyper-space dimensions, allowing the programmer to favour either parallelism or communications; two concepts are provided for this purpose:

  The elementary block specifies a geometrical set of hyper-space points. Communications between two points of the same block never generate any physical communication. In particular, a full dimension may be grouped inside a single elementary block of the same size. During the projection, the HELP compiler then allocates this dimension in memory.

  The parallelism priority tree. A priority order between the hyper-space dimensions allows the programmer to favour the physical projection of a dimension onto the processor network. Several examples of this parallelism priority tree are developed in part 2.

Secondary dimensions can be defined as the composition of several hyper-space dimensions. This feature makes it possible to manipulate non-regular objects such as the diagonal of a matrix. Secondary dimensions are also used with the geometrical communication primitives.

The data-parallel objects (DPO) are necessarily linked to a hyper-space. Their shape and position are dynamic during execution, but the association of a DPO with a hyper-space is fixed. These objects are multi-dimensional arrays, defined relative to the hyper-space dimensions, either primary or secondary. Each point included in the object geometry holds one of its elements.

1.4 Two programming levels

The HELP thinking model clearly separates communications from computations. Accordingly, HELP languages provide two programming levels: the microscopic level, where computations local to the points are triggered; and the macroscopic level, where communications are modeled by calls to geometrical primitives.

2 Data-parallel writing

After designing the algorithm at the model level, one has to translate it into a data-parallel language. Such a language is derived from a classical language (C or Fortran) into which features derived from the model are included.

One can also program an algorithm directly with HelpDraw. HelpDraw is a graphical editor derived from the HELP model, providing all the geometrical features of the hyper-space notion.

In order to ease the learning of the data-parallel language, sequential code is still expressed in the host language. Only the data-parallel part of the algorithm is written with specific constructs. These constructs are identical for C-HELP and Fortran-HELP.

2.1 The C-HELP language

Hyper-Spaces declaration

For each hyper-space dimension, the programmer has to declare:

• A name, in order to increase the legibility of the C-HELP code.
• The size (number of points).
• The elementary block size. By default, this size is 1. The star symbol ('*') causes the dimension to be mapped in memory.

The parallelism priority is specified by a priority tree. The dimensions with the highest priority (appearing at the highest level of the tree) are mapped onto the physical network, in order to favour parallelism. If two dimensions have the same priority (the same level in the tree), priority is assigned from left to right. By default, the priority tree has a single level and the dimensions appear in declaration order.

Examples of priority tree. A 6 × 2 hyper-space is mapped onto a 2-dimensional grid machine with 3 × 2 processors. Two layers are necessary to map two points per physical processor. Three different mappings can be obtained with different priority trees:

    hspace plan [ x = 6 , y = 2 ] (x,(y));

[Figure: placement of the hyper-space points (x,y) on the processor grid for this priority tree.]

    hspace plan [ x = 6 , y = 2 ] ((x),y);

[Figure: placement of the hyper-space points (x,y) on the processor grid for this priority tree.]

    hspace plan [ x = 6 , y = 2 ] (x,y);

[Figure: placement of the hyper-space points (x,y) on the processor grid for this priority tree.]

Secondary dimensions. Secondary dimensions are obtained by composing several hyper-space dimensions. The composition may involve primary or secondary dimensions, provided that no primary dimension appears more than once in the expression.

The size of a secondary dimension cannot be specified by the programmer.

    hspace plan [ x = 100 , y = 100 , d = (x , y) ];

[Figure: the hyper-space plan with its primary dimensions x and y and the secondary dimension d.]

The hyper-space declaration follows the grammar:

    <hspace>             = hspace <id> `[' <axis-list> `]' [ <priority-axis> ]
    <axis-list>          = <axis-decl> `,' <axis-list> | <axis-decl>
    <axis-decl>          = <id> `=' <axis-def>
    <axis-def>           = <axis-size> [ <axis-blocking> ] | <secondary-axis>
    <secondary-axis>     = `(' <axis-name-list> `)'
    <axis-name-list>     = <id> `,' <axis-name-list> | <id> `,' <id>
    <axis-size>          = <int-const-expr>
    <axis-blocking>      = `(' <axis-blocking-size> `)'
    <axis-blocking-size> = `*' | <int-const-expr>
    <priority-axis>      = `(' <priority-axis-list> `)'
    <priority-axis-list> = <priority> `,' <priority-axis-list> | <priority>
    <priority>           = <id> | <priority-axis>

Data-Parallel Object declaration

By default, every DPO is dynamic in size and position inside the hyper-space. A DPO whose size and position are known at compile time can be declared static with the keyword steady. Such a DPO cannot migrate over the hyper-space.

A DPO is built over primary and secondary dimensions; however, the primary dimensions used to define a secondary dimension cannot all be used together with that secondary dimension to allocate a DPO.

The DPO declaration follows the grammar:

    <dpo>                     = [ steady ] dpo <type> <hs-name> <position-size> <dpo-decl-list>
    <position-size>           = `[' <position-size-axis-list> `]'
                              | <conformity-copy>
    <position-size-axis-list> = <position-size-axis> `,' <position-size-axis-list>
                              | <position-size-axis>
    <position-size-axis>      = [ <axe-name> `=' ] <position-size-spec>
    <position-size-spec>      = <int-expr>
                              | `*'
                              | <int-expr> `:' <int-expr>
                              | <int-expr> `:' `*'
                              | <int-expr> `;' <int-expr>
    <conformity-copy>         = `conform' `(' <dpo-name> `)'

In a DPO declaration, the hyper-space dimensions are referenced either by name or by the declaration order of the primary dimensions of the hyper-space.

To specify the position and the size along a dimension, one can give:

• Only the position; the DPO is then one point wide along this dimension.
• `*'; the DPO spans the whole dimension (same size as the hyper-space).
• A pair lower bound `:' upper bound.
• A pair lower bound `;' length.

    hspace cubicspace [ x=100 , y=100 , z=100, d=(x,y) ] ;
    dpo float cubicspace [ x=40;30 , y=80;20, z=1 ] cube ;
    dpo float cubicspace [ x=30:100, y=10 , z=* ] plan ;
    dpo float cubicspace [ x=1 , z=80 , y=* ] line ;
    dpo float cubicspace [ x=1 , y=50 , z=50] point ;
    dpo float cubicspace [ x=30 , d=1;70 , z=1 ] diag ;
    dpo float cubicspace conform(line) sameline;

[Figure: the hyper-space cubicspace (axes x, y, z) with the DPOs line, diag, point, plan and cube.]
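The four forms of position-size specification can be illustrated by a small Python sketch that turns one specification into an inclusive interval of 1-based coordinates. The helper resolve_axis is hypothetical and not part of C-HELP; it only accepts integer constants, and it assumes that `lower;length` covers the points lower to lower+length-1.

```python
def resolve_axis(spec, axis_size):
    """Return (lower, upper), 1-based and inclusive, for one position-size spec."""
    spec = spec.replace(" ", "")
    if spec == "*":                        # whole dimension
        return 1, axis_size
    if ":" in spec:                        # lower bound : upper bound (or '*')
        lower, upper = spec.split(":")
        return int(lower), (axis_size if upper == "*" else int(upper))
    if ";" in spec:                        # lower bound ; length (assumed inclusive)
        lower, length = spec.split(";")
        return int(lower), int(lower) + int(length) - 1
    return int(spec), int(spec)            # position only: one point wide
```

Under these assumptions, the x extent of cube (`x=40;30`) resolves to the points 40 to 69, and the z extent of plan (`z=*`) to the points 1 to 100.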

The microscopic level

Microscopic operations are computed locally at the hyper-space points. They consist in applying scalar operators or functions to the DPO values belonging to the same point. All the scalar operators of the host language are extended to DPO manipulation.

For two DPOs to interact, the HELP model requires the programmer to ensure their conformity (same position and same shape). Since the projection directives are attached to the hyper-space, the logical conformity of two DPOs implies the physical conformity of their data. There is no implicit communication in C-HELP, which makes code generation simpler and the generated code more efficient.

The conformity domain. By default, two interacting DPOs have to be conform; the conformity domain is defined as the set of points where these DPOs are allocated.

    A = B + C    /* legal */

[Figure: A, B and C occupy the same points; conformity domain = A = B = C.]

    A = B + C    /* illegal */

[Figure: A, B and C do not all occupy the same points.]

Constraint domain, conformity rule. Sometimes a computation has to be restricted to the points common to DPOs of different shapes or positions. The on(DPO) operator and constructor limit the conformity domain to the points where DPO is allocated. The objects interacting under the influence of an on(DPO) must include DPO. The constraint domain can be reduced successively by nesting on constructors.

    A + B          /* illegal : A and B are not conform. */
    on(B) A + B    /* legal : A includes B. */

[Figure: B is allocated on a subset of the points of A.]

The conformity rule must be applied to all the operands of every microscopic operator (except injective assignment, see below). The rule states when DPOs are conform:

1. If the conformity domain is expressed by a conformity constructor on, each DPO domain has to include this domain.
2. If there is no explicit conformity domain, all the DPOs must have the same shape and the same position inside the hyper-space.

Scalar data are considered conform to any DPO.

Moreover, a DPO appearing as the argument of a nested on has to be included in the previous conformity domain (successive reductions of the conformity domain).

[Figure: the DPOs A, B, C and D in the plane (x, y).]

    on (B) {            /* conformity domain = allocation domain of B */
        A + B;          /* correct : A and B include the conformity domain */
        on (C) {        /* correct : the conformity domain includes C */
                        /* the conformity domain becomes the domain of C */
            D + A;      /* correct : D and A include the conformity domain */
        }               /* the conformity domain back to B */
        D + B;          /* error: D does not include B */
    }
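The conformity rule amounts to a few set inclusions. The following Python sketch models the allocation domain of a DPO as a set of hyper-space points; the helpers box, conform and nested_on are illustrative names, and the coordinates used in the usage note are invented, since the positions in the figure are not given numerically.

```python
def box(x_lo, x_hi, y_lo, y_hi):
    """Allocation domain of a 2-D DPO as a set of hyper-space points."""
    return frozenset((x, y)
                     for x in range(x_lo, x_hi + 1)
                     for y in range(y_lo, y_hi + 1))


def conform(operands, domain=None):
    """Conformity rule: with an 'on' domain, every operand must include it;
    without one, all operands must share the same allocation domain."""
    if domain is not None:
        return all(domain <= op for op in operands)
    return all(op == operands[0] for op in operands)


def nested_on(current, inner):
    """A nested 'on' may only reduce the current conformity domain."""
    if current is not None and not inner <= current:
        raise ValueError("nested on(): domain not included in the current one")
    return inner
```

With, for instance, A = box(1, 8, 1, 8), B = box(2, 6, 2, 6), C = box(3, 4, 3, 4) and D = box(3, 8, 3, 4), the checks reproduce the commented example: conform([A, B], B) and conform([D, A], nested_on(B, C)) hold, whereas conform([D, B], B) does not.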

Masked domain. Like most data-parallel languages, C-HELP makes it possible to mask some points of the conformity domain with the operator where(DPO_expr). The DPOs appearing in the expression DPO_expr and inside the where block have to be conform. The new masked domain is composed of the points where the evaluation of DPO_expr is true. The masked domain is a subset of the conformity domain.

The microscopic operations nested in a masked domain are computed only at the points of the masked domain. Moreover, the application of a macroscopic operator or a function call masks the current conformity domain (see below). After the where block, the masked domain is inverted for the execution of the elsewhere block.

    /* A, B, C, and D are conform */
    where (A!=0)                      /* mask1 : (A!=0) */
        where (C!=0)                  /* mask2 : (mask1) && (C!=0) */
            A = ( (B / C) + D ) / A ;
        elsewhere                     /* mask3 : not(mask2) */
            A = D / A;
    elsewhere                         /* mask4 : not(mask1) */
        A = 1 ;

The C-language operator expr ? then_expr : else_expr is extended to DPO operands. This constructor is valid if expr, then_expr and else_expr satisfy the conformity rule. The resulting DPO is allocated on the conformity domain.

Association. The association operator <- is applied to a DPO:

    dpo-name <- dpo-expr

The right-hand expression must satisfy the conformity rule. The left-hand DPO is reallocated on the conformity domain obtained from the right-hand expression and receives the values of the expression. This operator is not allowed inside an on or a where. Only a dynamic DPO can be associated.

Injective assignment. The DPO assignment does not affect the size or position of the assigned DPO. The conformity rule is applied to the right-hand side of the = sign. The local assignment is effective only at the points of the conformity domain of the right-hand side (this domain may be masked). The assigned DPO has to include the domain of the right-hand side expression.

    A = B ;
    C <- B ;

[Figure: the DPOs A, B and C in the plane (x, y), before and after the two statements.]
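The nested where/elsewhere example can be emulated point by point. The Python sketch below treats four conform DPOs as equal-length lists; it assumes that the inner elsewhere stays inside mask1 (a masked domain being a subset of the enclosing one), which also avoids dividing by A where A is zero. The function name masked_update is illustrative.

```python
def masked_update(a, b, c, d):
    """Emulate the nested where/elsewhere example, one point at a time."""
    out = []
    for ai, bi, ci, di in zip(a, b, c, d):
        mask1 = ai != 0                 # where (A != 0)
        mask2 = mask1 and ci != 0       #     where (C != 0)
        mask3 = mask1 and not mask2     #     elsewhere (assumed to stay in mask1)
        if mask2:
            out.append((bi / ci + di) / ai)
        elif mask3:
            out.append(di / ai)
        else:                           # elsewhere : not(mask1)
            out.append(1)
    return out
```

Each point evaluates exactly one of the three assignments, so the loop body could run on all points simultaneously.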

Example: Gauss-Jordan, the initialization

Following the algorithm described with the HELP model, the square matrix inversion is implemented with the following declarations:

    #define N 100
    hspace planar [ x = 2*N , y = N , d = (x,y) ] ;

    dpo float planar GaussJordan (A)
    dpo float planar[ x=1:N, y=* ] A ;
    {
        steady dpo float planar [ x=*, y=* ] M ;
        dpo float planar [ x=1+N;N, y=* ] res ;
        dpo float planar [ x=1, d=1:N ] D ;

        M = A ;
        M = ipoint(res,x)==ipoint(res,y) ? 1 : 0 ;
    }

To initialize the matrix M, A is assigned to the left part of M, whereas the result of the conditional expression is computed on the constraint domain res and assigned to the right part of M. This conditional expression returns 1 on the diagonal and 0 elsewhere (ipoint is an intrinsic function of HELP).

The macroscopic level

Communications are made explicit at the macroscopic level. The macroscopic primitives handle DPOs globally, without any microscopic computation on their elements. From a source DPO, a temporary DPO is produced; the size and position of this resulting DPO depend on the size and position of the source.

Macroscopic primitive classification. The macroscopic primitives can be split into three families:

Migration. The shape of the resulting DPO is the same as that of the source DPO. Only the position can be modified. For example: translation primitives, with which the programmer expresses the move of a DPO along a hyper-space dimension; primitives exchanging two dimensions, which permute two axes (rotation).

Reshape. The shape of the resulting DPO differs from that of the source. For example: extraction primitives (row(s) or column(s) of a matrix, for example), expansion, or replication of a DPO along a hyper-space dimension...

Shake. The resulting DPO and the source DPO are conform. These primitives provide internal migration of the DPO elements. For example: wrapped shift along a DPO dimension, symmetrical permutations...

[Figure: effect of the primitives trans(x,lg), transabs(x,off), exchange(x,y), exchangeabs(x,y), extract(x,off), extractabs(x,off), expand(x,3) and expand(x) on a DPO in the plane (x, y).]

Macroscopic primitive application. A macroscopic primitive is triggered by applying the dot constructor `.', following the syntax:

    <dpo-ref> = <dpo-expr> `.' <macro-list>                            (1)
              | <dpo-expr> `.' `(' <dpo-bool-expr> `)' <macro-list>
                `:' <dpo-scal-expr>                                    (2)
              | ...

The application of a list of macroscopic primitives <macro-list> produces a temporary DPO whose size and position depend on the source DPO and on the primitives applied.

Note that the enclosing conformity domain is masked by the dot application: the expression <dpo-expr> is evaluated ignoring the enclosing conformity domain, and the conformity rule is applied locally to this expression.

Optionally, a Macroscopic Control Domain (MCD, <dpo-bool-expr>) identifies a subset of the points of the source DPO to which the primitive list is applied; the expressions <dpo-expr> and <dpo-bool-expr> must produce two conform results. The resulting DPO is conform to the result of applying the macroscopic primitives to the source DPO, regardless of the MCD. The value of a resulting point that is not the target of an MCD point is obtained by evaluating the expression <dpo-scal-expr>; this expression must therefore be conform to the resulting DPO.

    A = (C!=0 ?(B/C):0) . (C!=0) trans(x,100) : 0 ;

Only the values of B/C at points where C is non-zero are transferred.

[Figure: B and C translated along x onto A.]
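To make the three kinds of behaviour concrete, the following Python sketch represents a DPO as a dictionary from hyper-space points to values and emulates a translation, an extraction, an expansion and a translation restricted by an MCD. The function names mirror the primitives above, but the signatures are assumptions made for this sketch: axes are given as tuple indices (0 for x, 1 for y), extract uses absolute coordinates, and expand takes the size of the replicated dimension explicitly.

```python
def trans(dpo, axis, lg):
    """Migration: move every element by lg points along one axis."""
    moved = {}
    for point, value in dpo.items():
        target = list(point)
        target[axis] += lg
        moved[tuple(target)] = value
    return moved


def extract(dpo, axis, index):
    """Reshape: keep only the elements whose coordinate on `axis` is index."""
    return {p: v for p, v in dpo.items() if p[axis] == index}


def expand(dpo, axis, size):
    """Reshape: replicate a one-point-thick DPO over 1..size along `axis`."""
    out = {}
    for point, value in dpo.items():
        for k in range(1, size + 1):
            target = list(point)
            target[axis] = k
            out[tuple(target)] = value
    return out


def trans_mcd(dpo, control, axis, lg, default):
    """trans under a control domain: result points whose source lies outside
    the control set receive `default` instead of a transferred value."""
    result = trans(dpo, axis, lg)
    allowed = trans({p: True for p in dpo if control[p]}, axis, lg)
    return {p: (v if p in allowed else default) for p, v in result.items()}
```

For example, extract(M, 1, i) followed by expand(..., 1, N) replicates row i of M over all rows, which is the communication pattern needed to make a pivot row visible everywhere.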

[Figure: effect of the shake primitives shifttor(x,2), shifttor(x,-2) and flip(x).]

Example: Gauss-Jordan, the diagonalization

A diagonalization step of the matrix M is carried out by the expression:

    M -= M.extract (y,i) .expand (y)
       * M.extract (x,i) .expand (x)
       / M.scalar (i,i) ;

Hence, the algorithm is directly translated from the geometrical numerical thinking to the geometrical HELP model:

• The pivot row is communicated to every other row.
• The pivot column is communicated to every other column.
• The pivot value is converted to a scalar (so that it is visible over the whole hyper-space).

Then the new value of each element of the matrix M is computed at each point in parallel. This computation is a linear combination of rows that zeroes the pivot column. To avoid zeroing the pivot itself, the computation is triggered inside a mask block.

The HELP functions

The function notion of the host language is extended to DPO manipulation. A new type of function is defined: the microscopic functions.

Microscopic functions. Microscopic functions extend the set of microscopic operators. Like these operators, they are computed locally at the active points of the conformity domain and do not generate any communication.

Microscopic functions are declared with the micro keyword and written in the host language syntax, in which only calls to microscopic functions are allowed. When such a function is called, the actual DPO parameters have to satisfy the conformity rule. The microscopic function is applied at each point of the conformity domain, or of the masked domain if there is one.

    micro int gcd (a,b)
    int a, b ;
    {
        if (a > b)
            return gcd (a-b, b) ;
        if (a < b)
            return gcd (a, b-a) ;
        return a ;
    }

    dpo int planar [*,*] A, B ;

    A = gcd (A, B) ;
    where (B>30)
        A = gcd (A, 10) ;

The arithmetical functions of the standard libraries are implicitly extended to microscopic functions.

General functions. The hyper-space is a reference frame for every access to a DPO. Hence, for a DPO parameter of a function, this frame is needed to access its geometry and its element values; the hyper-space of a DPO parameter therefore has to be passed as an argument.

In order to provide a powerful model of function (or library) calls, the notion of sub-hyper-space is defined. A hyper-space parameter is considered as a sub-hyper-space of the calling hyper-space.

For a function call, the programmer has to express explicitly the position of the formal sub-hyper-space inside the calling hyper-space. The DPO parameters, the local DPOs and the result may be positioned inside this formal sub-hyper-space.

    dpo float planar MatMul(planar,M1,M2)
    hspace planar [x,y];
    dpo float planar [*,*] M1,M2;
    {
        int iter;
        dpo float planar[*,*] res;

        res=0;
        M1 = M1 . shifttor(x,-1*(ipoint(M1,y)-1));
        M2 = M2 . shifttor(y,-1*(ipoint(M1,x)-1));
        res = M1 * M2;
        for (iter=1;iter<SizeDim(M1,x);iter++) {
            M1 = M1 . shifttor(x,-1);
            M2 = M2 . shifttor(y,-1);
            res += M1*M2 ;
        }
        return res;
    }

    hspace cubic [a=100,b=100,c=100];
    dpo float cubic [a=31;70 ,b=75 , c=1;70] Mat1,Mat2,Mat3;
    Mat3 <- MatMul(cubic(a(31:*),b(75),c),Mat1,Mat2);

[Figure: the sub-hyper-space (x, y) positioned inside the calling hyper-space (a, b, c).]

A call to a general function is independent of the current context; in particular, the current activity is not taken into account for this call. The evaluation of the actual parameters masks the enclosing conformity domain.

If the hyper-space of the DPO parameters is in the scope of the function, there is no need to pass it as an argument (see the Gauss-Jordan example).
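The MatMul function above has the structure commonly known as Cannon's algorithm: an initial skew of both operands, followed by repeated wrapped shifts and point-wise multiply-accumulate steps. The Python sketch below emulates it sequentially on square lists of lists. The helpers shift_rows and shift_cols stand in for shifttor along x and y; the mapping of x and y to list indices and the direction of the shifts are assumptions of this sketch, not taken from the C-HELP definition.

```python
def shift_rows(m, amounts):
    """Wrapped shift along x: row i is rotated left by amounts[i]."""
    n = len(m)
    return [[row[(j + amounts[i]) % n] for j in range(n)]
            for i, row in enumerate(m)]


def shift_cols(m, amounts):
    """Wrapped shift along y: column j is rotated up by amounts[j]."""
    n = len(m)
    return [[m[(i + amounts[j]) % n][j] for j in range(n)] for i in range(n)]


def mat_mul(m1, m2):
    """Square matrix product by skewing and wrapped shifts."""
    n = len(m1)
    m1 = shift_rows(m1, list(range(n)))     # initial skew of the rows of M1
    m2 = shift_cols(m2, list(range(n)))     # initial skew of the columns of M2
    res = [[m1[i][j] * m2[i][j] for j in range(n)] for i in range(n)]
    for _ in range(1, n):
        m1 = shift_rows(m1, [1] * n)
        m2 = shift_cols(m2, [1] * n)
        res = [[res[i][j] + m1[i][j] * m2[i][j] for j in range(n)]
               for i in range(n)]
    return res
```

After the skew, the point (i, j) holds m1[i][k] and m2[k][j] for the same k, and each shift advances k by one, so every multiplication is local to a point; only the shifts communicate.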


Example: Gauss-Jordan

    /**************************************************************************/
    /* Gauss-Jordan algorithm. Square matrix (N * N) inversion.               */
    #define N 100
    hspace planar [ x = 2*N , y = N , d = (x,y) ] ;

    #define row(i)  extract(y,i)
    #define col(i)  extract(x,i)
    #define diag(i) scalar(i,i)

    dpo float planar GaussJordan (A)
    dpo float planar[ x=1:N, y=* ] A ;
    {
        int i ;
        steady dpo float planar [ x=*, y=* ] M ;
        dpo float planar [ x=1+N;N, y=* ] res ;
        dpo float planar [ x=1, d=1:N ] D ;

        /*************************** Initialization ******************************/
        /* The assignment is injective, the conformity has not to be verified. */
        M = A ;
        /* ipoint(DPO,x) returns each point coordinate onto x dimension. */
        M = ipoint(res,x)==ipoint(res,y) ? 1 : 0 ;

        /*************************** Diagonalization *****************************/
        for (i=1;i<=N;i++)                 /* one step per pivot, 1..N */
            where (ipoint(M,y) != i)
                M -= M.row(i).expand(y) * M.col(i).expand(x) / M.diag(i) ;

        /*************************** Inverse computation *************************/
        on (D) D = M ;
        on (res)
            res = M / D.exchange(d,y).trans(x,N).expand(x,N) ;

        return res ;
    }

2.2 HelpDraw: a programming environment

In order to program data-parallel algorithms interactively, we have defined HelpDraw, a graphical environment allowing the visualization of the HELP macroscopic primitives [BDM93].

The first HelpDraw feature is a dual HELP code editor. The first part of this editor is a standard text editor, whereas the second one is dedicated to the graphical editing of the macroscopic parts of the algorithm. HelpDraw provides a direct-manipulation interface that allows the programmer to handle DPOs graphically, via the mouse, menus or dialog boxes. The HELP code resulting from these manipulations is automatically produced and linked with the text editor.

HelpDraw also has a demonstrational aspect that makes repetitive DPO manipulations easier for the programmer: for example, directly reaching a known goal, such as obtaining the conformity of two DPOs, or applying the same macroscopic DPO manipulation several times.

3 Related works

In order to distinguish the data-parallel languages, several criteria are taken into account:

Basic language. Data-parallel languages extend a classical language.

Data-parallel object keyword. To specify that an object is data-parallel, a data-parallel language based on C often provides a keyword, whereas the Fortran extensions implicitly treat arrays as parallel constructs.

Abstract machine. An abstract machine provides a support for the definition of parallel objects and for the realization of operations such as communications.

Object virtuality. A language provides object virtuality if a data-parallel object may be declared independently of the size of the target machine.

Access to machine. Some data-parallel languages provide both virtuality and access to the characteristics of the physical machine.

Object dynamicity. An object is dynamic if its size and its shape may be changed during execution.

Heterogeneous alignment. We call heterogeneous alignment the alignment of two objects of different shapes.

Explicit communications. To generate communications, a data-parallel language can either express them explicitly or make two objects interact through different descriptions of these objects; in the latter case, the communications are implicit.

Explicit distribution specification. Some data-parallel languages provide the possibility to specify the mapping of the abstract machine onto the physical processors.

Figure 1 describes the main features of some well-known data-parallel languages:

Fortran extensions: Fortran-90 [MR90, ANS91], CM-Fortran [Thi90], MP-Fortran [Mas91a], Fortran-D [FHK+91], HPF [For93].

C extensions: C* [RS87, Fra91], MPL [Mas91b], Hyper-C (POMP-C) [Par92, Hyp93].

Other extensions: ACTUS [Per79, PCM83], PARALLAXIS [Brä89, BBES91], *Lisp [Thi91].

Columns: Data-parallel object keyword; Abstract machine; Objects virtuality; Access to machine; Objects dynamicity; Heterogeneous alignment; Explicit communications; Explicit distribution specification.

    Language          Entries, in column order
    C*                shape, √ √ √ √
    MPL               plural, √ √
    POMPC / HyperC    collection, link, √ √ √ √ √
    Fortran-90        array, √, size
    CM-Fortran        array, LAYOUT, √ √ √
    MP-Fortran        array, √ √ √
    Fortran-D         array, DECOMPOSITION, √ √ √
    HPF               array, TEMPLATE PROCESSORS, √ √ √ √ √
    ACTUS             EOP, √
    PARALLAXIS        VECTOR, CONFIGURATION CONNECTION, √ √ √
    *Lisp             PVAR, static config., √ √
    C-HELP            dpo, hspace, √ √ √ √ √

Figure 1: Main characteristics of usual data-parallel languages

Conclusion

The HELP model fulfils the three principles for the design of a data-parallel programming model:

Unify the data-parallel application development chain. The HELP model relies on the geometrical manipulation of data-parallel objects; this manipulation is the originality of the C-HELP language, and the HelpDraw environment relies on it as well. HelpDraw is also used for the step-by-step execution of a HELP program.

Make the communications and the parallel computations explicit. The data-parallel objects are identified; there is no automatic parallelization of loops. The conformity rule applied to each operand of a microscopic operation implies the absence of implicit communications.

Separate the communications and the computations: The help macroscopic operations trigger physical and logical communications, whereas computations are triggered only by microscopic operations.

This article has presented square matrix inversion under the help model, using the Gauss-Jordan algorithm. This example can be written, compiled and run on a 16k-processor MasPar MP-1 [Bla90]. The C-help compiler generates MPL code, and ongoing development uses it to implement compilation techniques derived from the help model.

The help project is now moving towards support for sparse matrices inside the C-help compiler. The usual manipulations of these irregular structures are well suited to the geometrical model [Saa90, PE93]. Two complementary approaches are possible: the manipulations induced by the sparsity of the data can be either implicit or explicit. The help run-time seems able to provide implicit manipulation of these structures with only a little explicit sparsity information and without any specific sparse-dedicated algorithm.

References

[ANS91] ANSI. FORTRAN 90, May 1991. X3J3 Draft S8.118 submitted as text for ANSI X3.198-1991.

[BBES91] Ingo Barth, Thomas Bräunl, Stefan Engelhardt, and Frank Sembach. PARALLAXIS version 2 user manual. Technical Report 2/91, Fakultät Informatik, Universität Stuttgart, Germany, February 1991.

[BDM93] Akram Benalia, Jean-Luc Dekeyser, and Philippe Marquet. HelpDraw graphical environment: A step beyond data parallel programming languages. In Fifth Int'l Conf. on Human-Computer Interaction, pages 591-596, Orlando, FL, August 1993. Elsevier Science Publishers.

[Bla90] Tom Blank. The MasPar MP-1 architecture. In Proceedings of the IEEE Compcon Spring 1990, pages 20-24, San Francisco, CA, February 1990. IEEE Society Press.

[Brä89] Thomas Bräunl. Structured SIMD programming in PARALLAXIS. Structured Programming, 10(3):121-132, July 1989.

[FHK+91] Geoffrey Fox, Seema Hiranandani, Ken Kennedy, Charles Koelbel, Uli Kremer, Chau-Wen Tseng, and Min-You Wu. FORTRAN D language specification. Research Report TR90-141, Dept. of Computer Science, Rice University, April 1991.

[For93] High Performance Fortran Forum. High Performance Fortran language specification, version 1.0. Rice University, Houston, TX, May 1993.

[Fra91] James L. Frankel. C* language reference manual. Technical report, Thinking Machines Corporation, Cambridge, MA, May 1991.

[Hyp93] HyperParallel Technologies, Palaiseau, France. HyperC Documentation Kit, 1993.

[Mas91a] MasPar Computer Corp., Sunnyvale, CA. MasPar Fortran: Reference Manual, Software Version 1.0, March 1991. Doc. 9303-0000, Rev. A1.

[Mas91b] MasPar Computer Corp., Sunnyvale, CA. MasPar Parallel Application Language (MPL): User Guide, Software Version 2.0, March 1991. Doc. 9302-0100, Rev. A4.

[MR90] Michael Metcalf and John Reid. FORTRAN 8x Explained. Oxford Science Publications, 1990. Revised edition, reprinted with corrections.

[Par92] Nicolas Paris. Definition of POMPC (version 1.99). Rapport de Recherches 92-5-bis, LIENS, École Normale Supérieure, Paris, March 1992.

[PCM83] R. H. Perrott, D. Crookes, and P. Milligan. The programming language Actus. Software: Practice and Experience, 13(4):305-322, April 1983.

[PE93] Serge Petiton and Guy Edjlali. Data parallel structures and algorithms for sparse matrix computation. In Advances in Parallel Computing. North-Holland, September 1993.

[Per79] R. H. Perrott. A language for array and vector processors. ACM Transactions on Programming Languages and Systems, 1(2):177-195, October 1979.

[RS87] John Rose and Guy L. Steele Jr. C*: An extended C language for data parallel programming. Technical Report PL87-5, Thinking Machines Corp., April 1987.

[Saa90] Youcef Saad. Sparskit: a basic toolkit for sparse matrix computations. Technical report, University of Minnesota, Minneapolis, 1990.

[Thi90] Thinking Machines Corporation, Cambridge, MA. Getting Started in CM FORTRAN, February 1990. Version 5.2-0.6.

[Thi91] Thinking Machines Corporation, Cambridge, MA. Getting Started in *Lisp, June 1991.
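
The Gauss-Jordan inversion mentioned in the conclusion can be sketched in a way that mirrors the separation between communications and computations. What follows is a minimal sketch in plain Python, not C-help: the helper names (broadcast_row, broadcast_col, gauss_jordan_inverse) and the use of partial pivoting are assumptions made for illustration and are not taken from the help model. Data movements are isolated in the two broadcast helpers, so that the elimination step is a purely element-wise computation on operands of identical shape, loosely in the spirit of the conformity rule.

```python
def broadcast_row(m, k):
    """Communication step (hypothetical helper): copy row k onto every row."""
    return [list(m[k]) for _ in range(len(m))]


def broadcast_col(m, k):
    """Communication step (hypothetical helper): copy column k onto every column."""
    return [[row[k]] * len(row) for row in m]


def gauss_jordan_inverse(a):
    """Invert a square matrix given as a list of rows (illustrative sketch)."""
    n = len(a)
    # Build the augmented matrix [A | I].
    m = [[float(x) for x in a[i]] + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for k in range(n):
        # Partial pivoting (an assumption of this sketch): choose the row
        # with the largest entry in column k, then exchange rows.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if abs(m[p][k]) < 1e-12:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]

        # Computation: normalise the pivot row.
        pivot = m[k][k]
        m[k] = [x / pivot for x in m[k]]

        # Communications: spread the pivot row and the pivot column over
        # the whole matrix so that every operand has the same shape.
        prow = broadcast_row(m, k)
        pcol = broadcast_col(m, k)

        # Computation: element-wise update, restricted to rows other than k.
        m = [[m[i][j] if i == k else m[i][j] - pcol[i][j] * prow[i][j]
              for j in range(2 * n)]
             for i in range(n)]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in m]
```

Calling gauss_jordan_inverse([[4, 7], [2, 6]]) returns a matrix close to [[0.6, -0.7], [-0.2, 0.4]], up to floating-point rounding. On a data-parallel machine the two broadcasts would be the only steps requiring inter-processor communication; in this sequential sketch they are ordinary copies.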