Published on 02-Aug-2016


<ul><li><p>Automation and Remote Control, Vol. 63, No. 9, 2002, pp. 1506–1514. Translated from Avtomatika i Telemekhanika, No. 9, 2002, pp. 153–163. Original Russian Text Copyright © 2002 by Podlazov.</p>
<p>TECHNICAL DIAGNOSIS</p>
<p>Arbitrary Group Permutations on Hypercube and Nonblockability of Cube-connected Cycles</p>
<p>V. S. Podlazov</p>
<p>Trapeznikov Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia</p>
<p>Received January 3, 2002</p>
<p>Abstract. For packet switching of arbitrary group permutations on the hypercube and of arbitrary permutations on the cube-connected cycles with a small number of node channels, methods of conflictless realization are proposed, and their speed is considered.</p>
<p>1. INTRODUCTION</p>
<p>Works on the communication networks of highly parallel multiprocessor computer systems focus on permutation of data elements between \(N\) processors (network nodes). Consideration is usually given to permutations where each network node contains a single data element before and after the operation. Group permutation, where each network node contains a group of \(r \ge 2\) data elements before and after the operation, is given much less consideration. A group permutation can be decomposed into a sequence of \(r\) conventional permutations, but the decomposition is a nontrivial operation of computational complexity \(O(rN)\).</p>
<p>Communication networks are usually divided into dynamic (dedicated) and static (direct) networks [1]. The former are multistage (\(n\)-cube, inverse \(n\)-cube, omega, or Clos–Benes) networks usually based on channel switching. The latter are characterized by a rigid neighborhood of nodes and make use of packet switching (multiring, hypercube, multidimensional grid, or cube-connected cycles).</p>
<p>Among the dynamic networks, only the Clos–Benes ones are nonblockable on arbitrary permutations or, more correctly, conditionally nonblockable because any permutation is realized by an individual schedule. 
Depending on the degree of parallelism, the algorithm of schedule compilation requires from \(O(\log_2^2 N)\) to \(O(N \log_2 N)\) operations.</p>
<p>In practice, channel switching or its derivative, mixed packet-channel switching (the wormhole and cut-through techniques), is used without preliminary compilation of schedules, that is, with possible blockings. This substantially reduces the effective width (parallelism) of the switch, that is, the mean number of data elements transmitted concurrently through it. For example, the effective width of the \(n\)-cube on arbitrary permutations is only \(\sqrt{N}\) [1].</p>
<p>Among the static networks, the full \(p\)-ary multiring and the generalized \(p\)-ary hypercube are nonblockable on arbitrary permutations. Nonblockability is attained by using packet switching and realizing any permutation according to unique static schedules structured as counter-forests [2–6]. On a network with \(N = p^r\) nodes and \(m_G = r(p - 1)\) input-output channels at each node, an arbitrary permutation is realized in \(n_G\) cycles obeying the following expression:</p>
<p>\[ n_G(N) = \left\lceil \frac{p^{r_b} - 1}{(p - 1)\, r_b} \right\rceil + \left\lceil \frac{p^{r_e} - 1}{(p - 1)\, r_e} \right\rceil, \quad \text{where } r_b = \lceil r/2 \rceil \text{ and } r_e = \lfloor r/2 \rfloor. \tag{1} \]</p>
<p>0005-1179/02/6309-1506$27.00 © 2002 MAIK Nauka/Interperiodica</p></li>
<li><p>For even \(r\), (1) assumes a more convenient form</p>
<p>\[ n_G(N) = 2 \left\lceil \frac{2(\sqrt{N} - 1)}{(p - 1)\, r} \right\rceil = 2 \left\lceil \frac{2(\sqrt{N} - 1)}{m_G} \right\rceil. \tag{2} \]</p>
<p>According to (1), the number of cycles for an arbitrary permutation is minimal and can vary only if the number of channels in the nodes varies. In this case, the effective width of the hypercube is</p>
<p>\[ N/n_G \approx 0.25 \sqrt{N} \log_2 N, \tag{3} \]</p>
<p>which for \(N \ge 256\) is much greater than for the multistage \(n\)-cube.</p>
<p>The full multiring and generalized hypercube have rather complicated nodes. There exist static switches with nodes of much lower complexity for their comparable number. 
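</p><p>As a quick numerical check, formula (1) can be evaluated directly. The sketch below is only an illustration: it assumes that both fractions in (1) are taken in ceiling brackets, a reading consistent with the 8 cycles of Table 1 for \(N = 256\) and with the \(n_G\) column of Table 2. The function names are hypothetical and do not come from the paper.</p>

```python
from math import isqrt


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def cycles_general(p: int, r: int) -> int:
    """Formula (1): n_G for N = p**r nodes, assuming ceiling brackets."""
    if p < 2 or r < 2:
        raise ValueError("this sketch expects p >= 2 and r >= 2")
    r_b = (r + 1) // 2  # ceil(r / 2)
    r_e = r // 2        # floor(r / 2)
    return (ceil_div(p**r_b - 1, (p - 1) * r_b)
            + ceil_div(p**r_e - 1, (p - 1) * r_e))


def cycles_even(p: int, r: int) -> int:
    """Formula (2): the same count for even r, written via sqrt(N) and m_G."""
    if r % 2:
        raise ValueError("formula (2) is stated for even r only")
    m_g = r * (p - 1)      # input-output channels per node
    sqrt_n = isqrt(p**r)   # exact, because r is even
    return 2 * ceil_div(2 * (sqrt_n - 1), m_g)


if __name__ == "__main__":
    for r in (6, 7, 8, 11, 12):
        print(f"N = 2^{r}: n_G = {cycles_general(2, r)}")
```

<p>For \(p = 2\) the loop reports 6, 7, 8, 18, and 22 cycles for \(r = 6, 7, 8, 11, 12\), and for even \(r\) both functions agree because the fractions in (1) and (2) are identical.</p><p>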
For the same number of nodes \(N = p^r\), for example, the \(p\)-ary \(r\)-cube (multidimensional grid) has \(m_{rC} = \log_p N\) input-output channels at each node, and the cube-connected cycles [7] have for \(N = r2^r\) nodes only \(m_{cC} = 2\) or \(3\) input-output channels at each node. One should discriminate between the cube-connected cycles and cyclic cubes [8], which are close parametrically and quite distinct structurally. Together with the hypercube, they are Cayley graphs [9] and have a smaller diameter than the hypercube with a close number of nodes.</p>
<p>No deterministic methods of conflictless realization of arbitrary permutations with given cycle delays are known for these switches. This gives rise to the question of whether transmission by the counter-forest schedules is applicable to them and what delays are reached in this case. The present author obtained a positive answer for the toroidal multidimensional grids (\(p\)-ary \(r\)-cubes for \(N = p^r\)) [10]. This paper proposes a method of realization of arbitrary permutations on cube-connected cycles and examines its characteristics.</p>
<p>2. HYPERCUBE, CUBE-CONNECTED CYCLES, AND GROUP PERMUTATION ON THE HYPERCUBE</p>
<p>The ordinary (or binary) hypercube has \(N = 2^r\) nodes. Each node has an \(r\)-digit binary number \(x = x_{r-1} \ldots x_i \ldots x_0\), where \(x_i \in [0, 1]\) and \(x = \sum_{i=0}^{r-1} x_i 2^i\). Any two nodes in the hypercube with numbers differing in one and only one \(i\)th position are connected by a duplex channel regarded as that of the \(i\)th dimension (\(i \in [0, r-1]\)). A formal length \(2^i\) is assigned to the channel of the \(i\)th dimension. The nodes with the numbers having \(i\)th positions \(x_i\) and \(y_i\) are connected by a channel from \(x_i\) to \(y_i\) of formal length \(2^i\) if and only if \((x_i + 1) \bmod 2 = y_i\).</p>
<p>We characterize the hypercube by a set of formal channel lengths</p>
<p>\[ \{S_{m_G}\} = \{{}_1S = 1,\ {}_2S,\ \ldots,\ {}_{m_G}S\}, \]</p>
<p>where \({}_1S &lt; {}_2S &lt; \ldots &lt; {}_{m_G}S\), \(m_G = r\), and \({}_{i+1}S = 2^i\) (\(0 \le i \le r-1\)).</p>
<p>A route from the node with the number \(x = x_{r-1} \ldots x_i \ldots x_0\) to the node with the number \(y = y_{r-1} \ldots y_i \ldots y_0\) has the decomposition \((d_{r-1}, \ldots, d_0)\) for \(d_i \in [0, 1]\) if \((x_i + d_i) \bmod 2 = y_i\) is satisfied for each \(i\). The data element moving over the hypercube uses one cycle to pass the channel of the \(i\)th dimension if \(d_i = 1\), and does not move along the channel of the \(i\)th dimension if \(d_i = 0\). The formal length \(d = \sum_{i=0}^{r-1} d_i 2^i\) is assigned to the route with the decomposition \((d_{r-1}, \ldots, d_0)\).</p>
<p>Passage of any data element along any route in the hypercube is defined by the route schedule characterized by the formal route length. It defines the sequence of passing the channels whose lengths are involved in the decomposition of this route and the numbers of the cycles in which these channels are passed. These cycles can be nonadjacent, that is, alternate with cycles where elements stay in nodes without moving. The latter cycles are treated as passage of a zero-length channel.</p></li>
<li><p>Fig. 1. Three-dimensional cube-connected cycles. The channels of the original hypercube are shown by bold lines.</p>
<p>The method of conflictless realization of arbitrary permutation is based on using a static counter-forest schedule where any two route schedules coinciding in a cycle with nonzero length of the passed channel coincide either in all preceding or in all succeeding cycles. This schedule enables conflictless realization of an arbitrary permutation in the number of cycles obeying (1) for \(p = 2\). It is constructed as a direct Cartesian product of the initial and final unilateral schedules of much smaller size. Table 1 shows examples of such schedules for \(N = 256\). For a greater number of nodes, examples can be found in [2–6, 10]. In the initial schedule, the channels from the first half of the set \(\{S_{m_G}\}\) are used, and in the final schedule, those from the second half are used. 
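</p><p>The route decomposition and its split between the initial and final schedules can be sketched in a few lines. This is a minimal illustration, not the author's procedure: the function names are hypothetical, and for odd \(r\) it assumes that the initial schedule takes the \(\lceil r/2 \rceil\) shortest channels.</p>

```python
def route_decomposition(x: int, y: int, r: int) -> list[int]:
    """Digits [d_0, ..., d_{r-1}] with (x_i + d_i) mod 2 == y_i for every i."""
    if not (0 <= x < 2**r and 0 <= y < 2**r):
        raise ValueError("node numbers must have r binary digits")
    return [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(r)]


def formal_length(digits: list[int]) -> int:
    """Formal route length d = sum of d_i * 2**i."""
    return sum(d << i for i, d in enumerate(digits))


def split_route(digits: list[int]) -> tuple[int, int]:
    """Formal lengths served by the initial and by the final schedule.

    Assumption: the initial schedule owns the ceil(r / 2) shortest channels.
    """
    r_b = (len(digits) + 1) // 2
    initial = formal_length(digits[:r_b])
    return initial, formal_length(digits) - initial


if __name__ == "__main__":
    digits = route_decomposition(178, 25, 8)
    print(formal_length(digits), split_route(digits))
```

<p>For the nodes 178 and 25 of the 256-node hypercube, the route has formal length 171 and splits into the lengths 11 and 160, which are the rows of Table 1 consulted in the initial and final schedules.</p><p>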
The route schedules coinciding in a cycle coincide in all preceding cycles in the initial schedule and in all succeeding cycles in the final schedule.</p>
<p>The cube-connected cycles of dimensionality \(r\) with \(N = r2^r\) nodes are obtained from the ordinary \(2^r\)-node hypercube by replacing each node by a group of \(r\) nodes enumerated within each group from \([0, r-1]\) and connected by a unilateral ring channel (ring) or two counter-rings. Each node of any group is connected with the node of the same name (number) of another group by a duplex channel of the original hypercube whose dimension number coincides with the node number. Figure 1 depicts an example of three-dimensional cube-connected cycles where the nodes of each group are connected by a pair of counter-rings. In this case, each node has only three input-output channels independently of the dimensionality of the hypercube. If the nodes of a group are connected by one ring, there are only two such channels.</p>
<p>Table 1. Unilateral schedules for the hypercube with \(N = 256\) nodes</p>
<table>
<tr><th>Route length L (initial schedule)</th><th>Initial schedule: channel lengths passed in cycles T = 1–4, in order of passage</th><th>Final schedule: channel lengths passed in cycles T = 5–8, in order of passage</th><th>Route length L (final schedule)</th></tr>
<tr><td>0</td><td></td><td></td><td>0</td></tr>
<tr><td>1</td><td>1</td><td>16</td><td>16</td></tr>
<tr><td>2</td><td>2</td><td>32</td><td>32</td></tr>
<tr><td>3</td><td>1, 2</td><td>32, 16</td><td>48</td></tr>
<tr><td>4</td><td>4</td><td>64</td><td>64</td></tr>
<tr><td>5</td><td>4, 1</td><td>16, 64</td><td>80</td></tr>
<tr><td>6</td><td>2, 4</td><td>64, 32</td><td>96</td></tr>
<tr><td>7</td><td>4, 1, 2</td><td>32, 16, 64</td><td>112</td></tr>
<tr><td>8</td><td>8</td><td>128</td><td>128</td></tr>
<tr><td>9</td><td>8, 1</td><td>16, 128</td><td>144</td></tr>
<tr><td>10</td><td>2, 8</td><td>128, 32</td><td>160</td></tr>
<tr><td>11</td><td>2, 8, 1</td><td>16, 128, 32</td><td>176</td></tr>
<tr><td>12</td><td>4, 8</td><td>128, 64</td><td>192</td></tr>
<tr><td>13</td><td>8, 1, 4</td><td>64, 16, 128</td><td>208</td></tr>
<tr><td>14</td><td>2, 8, 4</td><td>64, 128, 32</td><td>224</td></tr>
<tr><td>15</td><td>4, 1, 2, 8</td><td>128, 32, 16, 64</td><td>240</td></tr>
</table></li>
<li><p>Solution of the problem of arbitrary permutation on cube-connected cycles necessarily requires solution of the problem of arbitrary group permutation on the ordinary hypercube. This becomes evident if each group of nodes of the cube-connected cycles is folded into a node of the ordinary hypercube retaining the data elements contained in the nodes of the group. 
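</p><p>The folding argument can be illustrated with a small sketch. It assumes, purely for illustration, that node \(k\) of the cube-connected cycles lies in group \(k\) div \(r\) at ring position \(k\) mod \(r\); this numbering and the function names are hypothetical. Under that assumption, any permutation of the \(r2^r\) nodes folds into a group permutation in which every hypercube node sends and receives exactly \(r\) data elements.</p>

```python
import random
from collections import Counter


def fold_permutation(perm: list[int], r: int) -> list[tuple[int, int]]:
    """Fold a CCC permutation into (source group, destination group) routes.

    Assumed numbering: CCC node k belongs to group k // r (a hypercube node).
    """
    return [(src // r, dst // r) for src, dst in enumerate(perm)]


def is_group_permutation(routes: list[tuple[int, int]], r: int, groups: int) -> bool:
    """True if every hypercube node sends and receives exactly r elements."""
    sent = Counter(src for src, _ in routes)
    received = Counter(dst for _, dst in routes)
    return all(sent[g] == r and received[g] == r for g in range(groups))


if __name__ == "__main__":
    r = 3                      # three-dimensional cube-connected cycles
    groups = 2**r              # nodes of the folded hypercube
    perm = list(range(r * groups))
    random.Random(0).shuffle(perm)
    print(is_group_permutation(fold_permutation(perm, r), r, groups))
```

<p>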
Therefore, we consider a method of realizing group permutation on the hypercube where each data element of any group is transmitted according to a conflictless schedule (Table 1) intended for the ordinary hypercube. Cycles of data element transmission from any node having the same names (numbers) are united into a hypercycle bearing the number of its component cycles. In any hypercycle, the data elements are transmitted from any node in an arbitrary order. The following theorem is valid.</p>
<p>Theorem 1. For conflictless realization of group permutation on the hypercube according to a counter-forest schedule, it is necessary and sufficient that the hypercycle consists of \(r\) cycles.</p>
<p>Proof of Necessity. There exist group permutations such that all data elements of some groups have identical routes. Therefore, their conflictless transmission requires \(r\) cycles.</p>
<p>Proof of Sufficiency. Let us assume, on the contrary, that the hypercycle must consist of \(r + s\) cycles, where \(s \ge 1\), which means that there exists a group permutation whose conflictless realization requires transmission of \(r + s\) data elements from some node in some hypercycle. This in turn implies that the route schedules for these data elements coincide in some schedule cycle. According to the property of the counter-forest schedule, all these elements have coinciding route schedules in all preceding or in all succeeding schedule cycles. Therefore, they must have the same sending node or the same destination node in the given group permutation, that is, belong to the same group. But there are only \(r\) such elements, which leads to a contradiction.</p>
<p>Corollary 1. In each hypercycle, each node has at most \(r\) data elements to be transmitted along a channel of any dimension.</p>
<p>Corollary 2. 
An arbitrary group permutation is realized without conflicts on the hypercube in \(r n_G\) cycles.</p>
<p>Among the counter-forest schedules, there are orthogonal schedules [2] enabling one to realize two arbitrary (partial) permutations concurrently and without conflicts. For even \(r\), the orthogonal schedules are mirror reflections of each other, that is, have inversely enumerated cycles. For odd \(r\), the mirror schedules are constructed by inserting empty cycles to equalize the lengths of the initial and final schedules. When using mirror schedules, the group is divided into two subgroups of \(\lceil r/2 \rceil\) and \(\lfloor r/2 \rfloor\) elements, the data elements of the first subgroup being transmitted according to the direct schedule, and those of the second subgroup, according to the mirror schedule. The data elements of different subgroups are transmitted simultaneously. The following counterpart of Theorem 1 with appropriately modified corollaries is valid.</p>
<p>Theorem 2. For conflictless realization of group permutation on the hypercube according to the mirror forest schedules, it is necessary and sufficient that the hypercycle consists of \(\lceil r/2 \rceil\) cycles.</p>
<p>3. ARBITRARY PERMUTATIONS ON CUBE-CONNECTED CYCLES</p>
<p>We return to conflictless realization of arbitrary permutation on the cube-connected cycles with \(N = r2^r\) nodes and decompose all nodes into groups, each group containing nodes connected by ring channels. Theorem 2 shows that for conflictless realization of permutation it suffices to transmit the data elements between the nodes of different groups in hypercycles of \(\lceil r/2 \rceil\) cycles.</p></li>
<li><p>Table 2. 
Characteristics of cube-connected cycles</p>
<table>
<tr><th colspan="3">Hypercube</th><th colspan="4">Cube-connected cycles</th><th colspan="3">Relative characteristics</th></tr>
<tr><th>\(N_G\)</th><th>\(m_G\)</th><th>\(n_G\)</th><th>\(N_{cC}\)</th><th>\(m_{cC}\)</th><th>\(r\)</th><th>\(n_{cC}\)</th><th>\(m_G/m_{cC}\)</th><th>\(n_{cC}/n_G\)</th><th>G</th></tr>
<tr><td>64</td><td>6</td><td>6</td><td>64</td><td>2</td><td>4</td><td>42</td><td>2.67</td><td>7</td><td>2.62</td></tr>
<tr><td>2K</td><td>11</td><td>18</td><td>2K</td><td>2</td><td>8</td><td>406</td><td>5.5</td><td>22.56</td><td>4.10</td></tr>
<tr><td>64</td><td>6</td><td>6</td><td>64</td><td>3</td><td>4</td><td>16</td><td>2</td><td>2.67</td><td>1.34</td></tr>
<tr><td>128</td><td>7</td><td>7</td><td>160</td><td>3</td><td>5</td><td>34</td><td>2.33</td><td>4.85</td><td>2.08</td></tr>
<tr><td>2K</td><td>11</td><td>18</td><td>2K</td><td>3</td><td>8</td><td>120</td><td>3.67</td><td>6.67</td><td>1.82</td></tr>
<tr><td>4K</td><td>12</td><td>22</td><td>4.5K</td><td>3</td><td>9</td><td>268</td><td>4</td><td>12.18</td><td>3.05</td></tr>
</table>
<p>To enable transmission of the data elements along the given routes, transmission between the nodes of each group must precede transmission in the hypercycle. It also must be executed after the last hypercycle. Conflictless transmission of a data element from each node to any other node may be required before the first hypercycle and after the last hypercycle. This transmission is carried out concurrently by nodes (as in a loop register) and in the case of a single ring requires \(D = r - 1\) cycles, where \(D\) is the ring diameter. In the case of two counter-rings, each data element must be transmitted along the ring having the least internode distance. In this case, intragroup transmission requires only \(D = \lfloor r/2 \rfloor\) cycles, \(D\) denoting here the diameter of the pair of counter-rings.</p>
<p>Conflictless transmission of at most \(r\) data elements between any two nodes of any group may be required before the rest of the hypercycles. It is carried out in succession by data elements and concurrently by nodes and requires \(Dr\) or \(D\lceil r/2 \rceil\) cycles, respectively, for one ring or two counter-rings in the group. The last expression is due to the fact that the data elements of different subgroups are transmitted concurrently along shorter routes.</p>
<p>Since intragroup and intergroup transmissions make use of different channels (group rings and hypercube channels), for any hypercycle they can obviously be partially superposed in time so that, after transmission along a ring, any data element is transmitted through the hypercube channel concurrently with transmission of the next data element along the ring. 
In doing so, one hypercycle does not combine transmission of the first data element along the ring(s) and of the last data element through the hypercube channel, but they are combined in all neighboring hypercycles, except the first one. These hypercycles will be referred to as combined. Duration of the first combined hypercycle is equal to the diameter of the group. With regard for partial combination of the neighboring hypercycles, durations of the remaining combined hypercycles are \(Dr\) or \(D\lceil r/2 \rceil\) cycles. The following theorem summarizes the above argument.</p>
<p>Theorem 3. Arbitrary (partial) permutation on cube-c...</p></li></ul>
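<p>The ring diameters and intragroup transmission times used in Section 3 can be checked by brute force. The sketch below is an illustration under stated assumptions: the function names are hypothetical, a single ring is taken to be unidirectional, and two counter-rings let each element take the shorter direction.</p>

```python
def ring_diameter(r: int, counter_rings: bool) -> int:
    """Largest hop count between two nodes of an r-node group."""
    if r < 1:
        raise ValueError("a group needs at least one node")
    worst = 0
    for src in range(r):
        for dst in range(r):
            forward = (dst - src) % r
            # With counter-rings the element takes the shorter direction.
            hops = min(forward, (src - dst) % r) if counter_rings else forward
            worst = max(worst, hops)
    return worst


def intragroup_cycles(r: int, counter_rings: bool) -> int:
    """Cycles quoted in the text: D*r for one ring, D*ceil(r/2) for two."""
    d = ring_diameter(r, counter_rings)
    return d * ((r + 1) // 2) if counter_rings else d * r


if __name__ == "__main__":
    for r in (3, 4, 8, 9):
        print(r, ring_diameter(r, False), ring_diameter(r, True),
              intragroup_cycles(r, False), intragroup_cycles(r, True))
```

<p>The brute-force diameters reproduce \(D = r - 1\) for a single ring and \(D = \lfloor r/2 \rfloor\) for a pair of counter-rings.</p>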