
Error-free parallel rational arithmetic for optical and VLSI computing


E. V. Krishnamurthy and V. K. Murthy

An error-free carry-free (parallel) rational arithmetic system based on residue and p-adic representation is introduced. In this system, Farey rational numbers, whose numerators and denominators are bounded, are encoded into Para-Hensel codes (parallel rational Hensel codes), and the parallel element-wise arithmetic is performed using these codes. The algorithms for encoding into and decoding from the Para-Hensel code and the arithmetic algorithms are described. This system will have extensive applications in massively parallel processors.

I. Introduction

Many papers have appeared on residue arithmetic processors for optical and VLSI computing [1-6]. One important advantage of residue arithmetic is that it requires no carry mechanism [7,8]. This allows fast parallel computations to be performed in a single clock cycle, and so massively parallel systolic processors can be built [3].

Residue arithmetic processors essentially perform integer addition, subtraction, and multiplication modulo a prime; division, however, is a very involved and complicated operation, and no efficient algorithm exists for it [7]. Thus rational arithmetic cannot be realized using the residue arithmetic system as such.

This paper considers the extension of residue arithmetic systems to perform carry-free (parallel), error-free rational arithmetic. The term carry-free is well known; it means that there is no carry propagation in the arithmetic algorithms. By error-free, we mean that each rational is exactly represented [8] by a finite-length linear string, without the representational error that usually occurs in a conventional p-ary system when the denominator is relatively prime to the base of representation. For example, $(1/7)_{10} = 0.142857\ldots$ has no finite-length exact representation.

E. V. Krishnamurthy is with University of Waikato, Department of Computer Science, Hamilton, New Zealand. V. K. Murthy is a software consultant.

Received 23 March 1987.
0003-6935/87/224819-04$02.00/0.
© 1987 Optical Society of America.

Thus the conventional p-ary (or decimal) arithmetic system is neither error-free nor carry-free. One way to perform error-free rational arithmetic is to use an ordered-pair or fractional notation. This system is not carry-free. Also, each rational addition or subtraction in such a system requires three integer multiplications and one integer addition or subtraction, and each rational multiplication or division requires two integer multiplications. Besides, another operation is required to reduce the result to its lowest form. Thus the fractional arithmetic system is quite expensive and cumbersome.

An alternative approach to error-free rational arithmetic is the p-adic system, recently studied in detail [8]. The p-adic system is more economical than fractional arithmetic, since it requires only one addition/subtraction operation for each rational addition/subtraction and one multiplication/division for each rational multiplication/division; also, it does not require reduction to the lowest form. However, this system is not carry-free.

For a massively parallel realization of rational arithmetic one needs a carry-free, error-free arithmetic system. This paper describes the basic principles of a new rational arithmetic system with these properties. This system combines the residue and p-adic [8] (Hensel code) representations of rational numbers to derive a new code called the Para-Hensel code (for parallel rational Hensel code). Using the Para-Hensel codes, parallel rational arithmetic can be realized.

This paper confines itself essentially to practical algorithms for

(i) forward mapping from Farey rational numbers (rationals whose numerators and denominators are bounded) to Para-Hensel codes (PHCs);

15 November 1987 / Vol. 26, No. 22 / APPLIED OPTICS 4819

(ii) inverse mapping from PHCs to rationals;

(iii) arithmetic using the PHC.

The basic theoretical framework for the construction of Hensel codes and Para-Hensel codes, and the key theorems ensuring the validity of the encoding and decoding algorithms, are available in Gregory and Krishnamurthy [8].

II. Para-Hensel Codes

(i) Farey Rationals $F_N$

Let $P = \{p_1, p_2, \ldots, p_n\}$ be a set of n distinct primes and

$$M = \prod_{i=1}^{n} p_i.$$

Consider any rational number a/b that belongs to the finite subset of the rational numbers

$$F_N = \{a/b : \gcd(a,b) = 1 \text{ and } 0 \le |a| \le N \text{ and } 0 < |b| \le N\}$$

(gcd = greatest common divisor), known as the Farey rationals of order N, whose numerators and denominators are less than or equal to N, an arbitrary integer. Let M be such that $M \ge 2N^2 + 1$, or $N \le \mathrm{INT}\{\mathrm{SQRT}[(M - 1)/2]\}$, where INT = lower integral part and SQRT = square root function.

Then we can construct Para-Hensel codes of the order-N Farey rationals, suitable for exact parallel rational arithmetic.
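The bound relating M and N can be checked numerically. The following is a minimal Python sketch; the names `farey_order` and `in_farey` are illustrative choices, not from the paper, and the code assumes Python 3.8 or later for `math.isqrt` and `math.prod`.

```python
from math import gcd, isqrt, prod

def farey_order(primes):
    """Largest N with M >= 2*N*N + 1, where M is the product of the primes."""
    m = prod(primes)
    # floor(sqrt((M - 1) / 2)), computed exactly on integers
    return isqrt((m - 1) // 2)

def in_farey(a, b, n):
    """True if a/b is an order-n Farey rational: reduced and bounded by n."""
    return b != 0 and abs(a) <= n and abs(b) <= n and gcd(a, b) == 1

print(farey_order([2, 3, 5, 7]))    # M = 210  -> 10
print(farey_order([3, 5, 7, 11]))   # M = 1155 -> 24
```

Both prime sets are the ones used in the worked examples of this paper, so the printed orders can be compared with the values of N quoted there.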

(ii) Para-Hensel Codes for $F_N$

Let a/b be any nonzero rational number. For each prime $p_i$ we can express a/b as

$$a/b = (a_i/b_i)\, p_i^{\,r_i},$$

with $\gcd(a_i,b_i) = \gcd(a_i,p_i) = \gcd(b_i,p_i) = 1$.

Then we define the Para-Hensel code of a/b [denoted by PHC(a/b)] as the set of n ordered pairs:

$$\mathrm{PHC}(a/b) = \{(\alpha_1,r_1), (\alpha_2,r_2), \ldots, (\alpha_n,r_n)\},$$

where, for $i = 1,2,\ldots,n$,

$$\alpha_i = (a_i b_i^{-1}) \bmod p_i,$$

$r_i$ = exponent of $p_i$ (positive, negative, or zero).

Note that $b_i^{-1}$ exists since $\gcd(b_i,p_i) = 1$; it can be computed using an extended Euclidean algorithm [see also Sec. III(ii)]. In analogy with the floating-point number system, we call $\alpha_i$ the mantissa and $r_i$ the exponent.

Our aim is to establish a bijective mapping between $F_N$ and a set of n ordered pairs denoting

$$\mathrm{PHC}(a/b), \quad a/b \in F_N,$$

so that we can encode $a/b \in F_N$ into PHC and decode PHC to its equivalent $a/b \in F_N$.

To represent zero, we may use an integer $z_i$ (usually $z_i = 0$) and write $0 \bmod p_i = (0, z_i)$, $i = 1,2,\ldots,n$. Note that the exponent for zero is not unique.

(iii) Example

Let P = {2,3,5,7} and a/b = 3/7; then

PHC(3/7) = {(1,0), (1,1), (4,0), (3,-1)}.
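The forward mapping can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions rather than the authors' own code: the name `phc_encode` is hypothetical, zero is encoded with exponent 0 (the "usual" choice mentioned above), and the modular inverse uses Python's built-in `pow(x, -1, p)` (Python 3.8 or later) in place of an explicit extended Euclidean algorithm.

```python
def phc_encode(a, b, primes):
    """Encode the rational a/b as a list of (mantissa, exponent) pairs."""
    if b == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    code = []
    for p in primes:
        if a == 0:
            code.append((0, 0))        # zero: mantissa 0, exponent z_i = 0
            continue
        ai, bi, r = a, b, 0
        while ai % p == 0:             # pull factors of p out of the numerator
            ai //= p
            r += 1
        while bi % p == 0:             # and out of the denominator
            bi //= p
            r -= 1
        # mantissa = a_i * b_i^{-1} mod p; b_i is now coprime to p
        mantissa = (ai * pow(bi % p, -1, p)) % p
        code.append((mantissa, r))
    return code

print(phc_encode(3, 7, [2, 3, 5, 7]))   # [(1, 0), (1, 1), (4, 0), (3, -1)]
```

Each prime is handled independently of the others, which is what makes the encoding (and the arithmetic built on it) element-wise parallel.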

(iv) Additive/Multiplicative Inverses of PHC

To perform inverse operations, such as subtraction/division, we define, respectively, the additive/multiplicative inverses of PHC(a/b). Thus

$$\mathrm{PHC}(-a/b) = \{(-\alpha_1,r_1), \ldots, (-\alpha_n,r_n)\},$$

where $-\alpha_i$ ($i = 1,2,\ldots,n$) is the additive inverse of $\alpha_i$ mod $p_i$, which equals $(p_i - \alpha_i)$, and

$$\mathrm{PHC}(b/a) = \{(\alpha_1^{-1},-r_1), \ldots, (\alpha_n^{-1},-r_n)\},$$

where $\alpha_i^{-1}$ ($i = 1,2,\ldots,n$) is the multiplicative inverse of $\alpha_i$ mod $p_i$ and $-r_i$ denotes the opposite sign for the exponent.

(v) Examples

Let P = {2,3,5,7}, $N \le 10$. Then

PHC(3/7) = {(1,0), (1,1), (4,0), (3,-1)},
PHC(-3/7) = {(1,0), (2,1), (1,0), (4,-1)},
PHC(7/3) = {(1,0), (1,-1), (4,0), (5,1)}.
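The two inverses act on each pair separately, which the following minimal Python sketch makes explicit. The function names `phc_neg` and `phc_inv` are illustrative, and the refusal to invert a zero mantissa is an assumption of this sketch (a zero mantissa has no multiplicative inverse modulo a prime).

```python
def phc_neg(code, primes):
    """Additive inverse: negate each mantissa modulo its prime."""
    return [((-alpha) % p, r) for (alpha, r), p in zip(code, primes)]

def phc_inv(code, primes):
    """Multiplicative inverse: invert each mantissa and flip each exponent."""
    if any(alpha == 0 for alpha, _ in code):
        raise ZeroDivisionError("a zero mantissa has no inverse")
    # pow(alpha, -1, p) is the modular inverse (Python 3.8+)
    return [(pow(alpha, -1, p), -r) for (alpha, r), p in zip(code, primes)]

P = [2, 3, 5, 7]
c = [(1, 0), (1, 1), (4, 0), (3, -1)]   # PHC(3/7)
print(phc_neg(c, P))   # [(1, 0), (2, 1), (1, 0), (4, -1)]
print(phc_inv(c, P))   # [(1, 0), (1, -1), (4, 0), (5, 1)]
```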

III. Decoding a Para-Hensel Code

Consider the set of images $Q_P$ of the set of rational numbers $Q$:

$$Q_P = \{\mathrm{PHC}(a/b) : a/b \in Q\},$$

and the finite subset of $Q_P$ consisting of the images of the order-N Farey fractions:

$$Q_N = \{\mathrm{PHC}(a/b) : a/b \in F_N\}.$$

The mapping PHC: $F_N \to Q_N$ is onto (a surjection), and it is one to one in the sense that the inverse mapping $Q_N \to F_N$ exists and is obtained by using two well-known classical algorithms (Chinese remaindering [7-9] and the extended Euclidean algorithm [8,9]).

Before proceeding to describe the algorithms, we define the following terms:

(a) Let $M_+ = \prod_i p_i$, where $p_i$ is a modulus for which the exponent $r_i$ in PHC(a/b) is greater than zero. If no such modulus exists, $M_+ = 1$.

(b) Let $M_0 = \prod_i p_i$, where $p_i$ is a modulus for which the exponent $r_i$ in PHC(a/b) is zero. If no such modulus exists, the computation cannot proceed further, and PHC(a/b) remains undefined. In such a case a new set of primes is to be chosen to compute the PHC.

(c) Let $M_- = \prod_i p_i$, where $p_i$ is a modulus for which the exponent $r_i$ in PHC(a/b) is negative. If no such modulus exists, $M_- = 1$.

Note that $M = M_+ M_0 M_-$.

(i) Chinese Remaindering Algorithm

Let PHC(a/b) be given. Let $\alpha$ in $I_{M_0} = \{0, 1, 2, \ldots, (M_0 - 1)\}$ be the integer which has the property that for each modulus $p_i$ in $M_0$ ($i = 1, \ldots, m$), $m \le n$,

$$\alpha \bmod p_i = a\, b^{-1} \bmod p_i,$$

or, in other words, $\alpha$ is the integer representation mod $M_0$ of the residues $\alpha_i \bmod p_i$ ($i = 1,2,\ldots,m$, $m \le n$) for which $r_i = 0$. It is well known that $\alpha$ can be computed using the Chinese remaindering algorithm (CRA) with inputs $\alpha_i$ [9].

Algorithm CRA: This algorithm takes $\alpha_i$ (for which $r_i = 0$) ($i = 1,2,\ldots,m$) and reconstructs $\alpha$ over $I_{M_0}$.

begin
    M ← 1; R ← α_1;
    for k = 2 until m do
    begin
        M ← M * p_{k-1};
        h ← M^{-1} mod p_k;
        d ← (α_k - R) h mod p_k;
        R ← R + d M
    end;
    α ← R
end.

Now we compute $\alpha^*$ defined by

$$\alpha^* = [\alpha \cdot M_- \cdot M_+^{-1} (\bmod\ M_0)] \bmod M_0.$$
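Algorithm CRA and the computation of $\alpha^*$ translate almost line for line into Python. The sketch below is illustrative: the names `cra` and `alpha_star` are hypothetical, codes are taken to be lists of (mantissa, exponent) pairs in the same order as the primes, and `pow(x, -1, m)` (Python 3.8 or later) stands in for the modular inverse.

```python
from math import prod

def cra(residues, moduli):
    """Rebuild the integer in [0, prod(moduli)) from its residues."""
    m, r = 1, residues[0] % moduli[0]
    for k in range(1, len(moduli)):
        m *= moduli[k - 1]                        # M <- M * p_{k-1}
        h = pow(m, -1, moduli[k])                 # h <- M^{-1} mod p_k
        d = ((residues[k] - r) * h) % moduli[k]   # d <- (alpha_k - R) h mod p_k
        r += d * m                                # R <- R + d M
    return r

def alpha_star(code, primes):
    """Return alpha* = alpha * M_- * M_+^{-1} mod M_0 for a Para-Hensel code."""
    zero = [(a, p) for (a, r), p in zip(code, primes) if r == 0]
    if not zero:
        raise ValueError("no zero exponent: M_0 undefined, code not decodable")
    m_plus = prod(p for (_, r), p in zip(code, primes) if r > 0)
    m_minus = prod(p for (_, r), p in zip(code, primes) if r < 0)
    m_zero = prod(p for _, p in zero)
    alpha = cra([a for a, _ in zero], [p for _, p in zero])
    return (alpha * m_minus * pow(m_plus, -1, m_zero)) % m_zero

print(cra([1, 10], [5, 11]))                                       # 21
print(alpha_star([(1, -3), (1, 1), (4, 0), (4, 0)], [2, 3, 5, 7]))  # 26
```

Only the signs of the exponents enter this step; their magnitudes are recovered implicitly later through the Farey bound N.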

Then the following extended Euclidean algorithm (EEA) maps PHC(a/b) onto $F_N$.

(ii) Extended Euclidean Algorithm [8,9]

This algorithm computes the element of $F_N$ corresponding to PHC(a/b). It can also be modified to compute the multiplicative inverse $\alpha_i^{-1}$ of $\alpha_i$ mod $p_i$ by setting A ← p_i, B ← 0, A0 ← α_i, and B0 ← 1, and replacing the if and while statements by "while RA ≠ 1 do"; when RA = 1, $\alpha_i^{-1} \bmod p_i$ = RB.

Given M, find N = INT{SQRT[(M - 1)/2]}.

Algorithm EEA

begin
    A ← M_+ M_0; B ← 0;
    RA ← A0 ← M_+ α*; RB ← B0 ← M_-;
    if A0/B0 ∈ F_N stop; else
    while RA/RB ∉ F_N do
    begin
        Q ← quotient(A, A0);
        RA ← A - Q * A0;
        RB ← B - Q * B0;
        A ← A0; A0 ← RA;
        B ← B0; B0 ← RB
    end;
    Result ← RA/RB
end.

Note: If no $r_i$ equals zero, $M_0$ remains undefined, and hence PHC(a/b) is not defined and so not decodable.
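A Python rendering of Algorithm EEA and of its modular-inverse variant is sketched below. Several points are assumptions of this sketch rather than statements from the paper: the function names `eea` and `mod_inverse` are hypothetical; B is initialised to 0, as in the inverse variant; the test for membership in $F_N$ checks only the bounds on numerator and denominator (the addition example of Sec. IV returns the unreduced 7/7, which suggests no gcd test is applied); and the guards against a zero remainder are added for safety.

```python
def eea(m_plus, m_zero, m_minus, a_star, n):
    """Return (RA, RB) with |RA| <= n and 0 < |RB| <= n, following Algorithm EEA."""
    a, b = m_plus * m_zero, 0
    a0, b0 = m_plus * a_star, m_minus
    while not (abs(a0) <= n and 0 < abs(b0) <= n):
        if a0 == 0:
            raise ValueError("no order-n Farey rational matches this code")
        q = a // a0                  # Q <- quotient(A, A0)
        a, a0 = a0, a - q * a0       # remainder sequence
        b, b0 = b0, b - q * b0       # matching cofactor sequence
    return a0, b0

def mod_inverse(alpha, p):
    """Same loop, run until the remainder is 1; returns alpha^{-1} mod p."""
    a, b, a0, b0 = p, 0, alpha % p, 1
    while a0 != 1:
        if a0 == 0:
            raise ZeroDivisionError("alpha is not invertible modulo p")
        q = a // a0
        a, a0 = a0, a - q * a0
        b, b0 = b0, b - q * b0
    return b0 % p                    # reduce RB into the range 0..p-1

print(eea(3, 35, 2, 26, 10))   # (3, -8), i.e. -3/8
print(mod_inverse(3, 7))       # 5
```

The first call uses $M_+ = 3$, $M_0 = 35$, $M_- = 2$, $\alpha^* = 26$, and N = 10; the remainders are scaled by $M_+$ and the cofactors by $M_-$, so the factors removed when forming $\alpha^*$ are restored automatically.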

(iii) Examples

(a) PHC(a/b) = {(1,-1), (1,0), (5,-1), (10,0)}, P = {3,5,7,11}, M = 1155, N = 24.

Here $M_0 = 5 \cdot 11 = 55$; $M_+ = 1$; thus $M_+^{-1} (\bmod\ M_0) = 1$, $M_- = 21$.

Using algorithm CRA, we get $\alpha = 21$,

$$\alpha^* = [21 \cdot 21] \bmod 55 = 1.$$

Using algorithm EEA, we get

$$a/b = 1/21 \in F_{24}.$$

(b) PHC(a/b) = {(1,-3), (1,1), (4,0), (4,0)}, P = {2,3,5,7}.

Thus M = 210, N = 10, $M_0 = 35$, $M_+ = 3$, $M_- = 2$, $M_+^{-1} \bmod M_0 = 3^{-1} \bmod 35 = 12$, $\alpha = 4$, $\alpha^* = [4 \cdot 2 \cdot 12] \bmod 35 = 26$.

Using algorithm EEA, we get $a/b = -3/8 \in F_{10}$.
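The two examples can be replayed end to end with a compact decoder that chains the three steps (Chinese remaindering over the zero-exponent moduli, the $\alpha^*$ correction, and the extended Euclidean loop). This is a sketch under assumptions: `phc_decode` is a hypothetical name, B starts at 0, the stopping test checks only the bounds $|RA| \le N$ and $0 < |RB| \le N$, and the result is returned as a `fractions.Fraction`, which reduces to lowest terms and moves the sign to the numerator.

```python
from fractions import Fraction
from math import isqrt, prod

def phc_decode(code, primes):
    """Decode a list of (mantissa, exponent) pairs back to a rational."""
    n = isqrt((prod(primes) - 1) // 2)            # order N of the Farey set
    m_plus = prod(p for (_, r), p in zip(code, primes) if r > 0)
    m_minus = prod(p for (_, r), p in zip(code, primes) if r < 0)
    zero = [(al, p) for (al, r), p in zip(code, primes) if r == 0]
    if not zero:
        raise ValueError("M_0 undefined: choose a new set of primes")
    # Chinese remaindering over the moduli with zero exponent
    m_zero, alpha = 1, 0
    for al, p in zero:
        d = ((al - alpha) * pow(m_zero, -1, p)) % p
        alpha += d * m_zero
        m_zero *= p
    a_star = (alpha * m_minus * pow(m_plus, -1, m_zero)) % m_zero
    # Extended Euclidean loop, stopped at the first pair within the bound N
    a, b = m_plus * m_zero, 0
    a0, b0 = m_plus * a_star, m_minus
    while not (abs(a0) <= n and 0 < abs(b0) <= n):
        if a0 == 0:
            raise ValueError("code does not decode to an order-N Farey rational")
        q = a // a0
        a, a0 = a0, a - q * a0
        b, b0 = b0, b - q * b0
    return Fraction(a0, b0)

print(phc_decode([(1, -1), (1, 0), (5, -1), (10, 0)], [3, 5, 7, 11]))  # 1/21
print(phc_decode([(1, -3), (1, 1), (4, 0), (4, 0)], [2, 3, 5, 7]))     # -3/8
```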

IV. Parallel Rational Arithmetic Algorithms

The parallel rational arithmetic algorithms use elementwise addition, subtraction, multiplication, and division on the PHC. In the following we use $\oplus$, $\ominus$, $\otimes$, $\oslash$ for the PHC add, subtract, multiply, and divide operations, respectively. The symbols $+$, $-$, $\cdot$, and $^{-1}$ (superfix) are used for addition, subtraction, multiplication, and inversion modulo a prime $p_i$, respectively, in the mantissa.

(i) Algorithms

Let $(\alpha_i, r_i)$ and $(\beta_i, s_i)$ denote the two rational operands in PHC form for $i = 1,2,\ldots,n$.

(a) Addition $\oplus$. The rules for addition are (for $i = 1,2,\ldots,n$):

(i) $(0, z_i) \oplus (\alpha_i, r_i) = (\alpha_i, r_i)$,

(ii) $(\beta_i, s_i) \oplus (0, z_i) = (\beta_i, s_i)$,

(iii) For $\alpha_i, \beta_i \ne 0$:

$$(\alpha_i, r_i) \oplus (\beta_i, s_i) = \begin{cases} (\gamma_i, r_i) & \text{for } r_i = s_i, \\ (\alpha_i, r_i) & \text{for } r_i < s_i, \\ (\beta_i, s_i) & \text{for } s_i < r_i, \end{cases}$$

where $\gamma_i = (\alpha_i + \beta_i) \bmod p_i$.

For example, if P = {3,5,7,11}, N = 24,

PHC(5/7) = {(2,0), (3,1), (5,-1), (7,0)},
PHC(2/7) = {(2,0), (1,0), (2,-1), (5,0)},
PHC(5/7) $\oplus$ PHC(2/7) = {(1,0), (1,0), (0,-1), (1,0)}.

To convert, we find M = 1155, $M_+ = 1$, $M_0 = 165$, $M_- = 7$; $1^{-1} \bmod 165 = 1$; $\alpha = 1$, $\alpha^* = 7$; on using the EEA algorithm we obtain 7/7.

(b) Subtraction $\ominus$: For $i = 1,2,\ldots,n$ in parallel:

$$(\alpha_i, r_i) \ominus (\beta_i, s_i) = (\alpha_i, r_i) \oplus (p_i - \beta_i, s_i).$$

(This is a complemented addition.) For example, if P = {3,5,7,11}, N = 24,

PHC(5/7) $\ominus$ PHC(2/7) = {(0,0), (4,0), (3,-1), (2,0)}.

To convert we find M = 1155, $M_+ = 1$, $M_0 = 165$, $M_- = 7$; $1^{-1} \bmod 165 = 1$; $\alpha = 24$, $\alpha^* = 3$; on using the EEA algorithm we obtain 3/7.
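The addition and subtraction rules can be written as a short Python sketch. The names `phc_add` and `phc_sub` are illustrative, and codes are assumed to be lists of (mantissa, exponent) pairs in the same order as the primes. The complement is taken modulo $p_i$ so that a zero mantissa stays zero.

```python
def phc_add(x, y, primes):
    """Element-wise PHC addition following rules (i)-(iii)."""
    out = []
    for (al, r), (be, s), p in zip(x, y, primes):
        if al == 0:
            out.append((be, s))                # rule (i): zero plus y
        elif be == 0:
            out.append((al, r))                # rule (ii): x plus zero
        elif r == s:
            out.append(((al + be) % p, r))     # equal exponents: add mantissas
        elif r < s:
            out.append((al, r))                # the smaller exponent wins
        else:
            out.append((be, s))
    return out

def phc_sub(x, y, primes):
    """Complemented addition: x plus the additive inverse of y."""
    neg_y = [((p - be) % p, s) for (be, s), p in zip(y, primes)]
    return phc_add(x, neg_y, primes)

P = [3, 5, 7, 11]
x = [(2, 0), (3, 1), (5, -1), (7, 0)]   # PHC(5/7)
y = [(2, 0), (1, 0), (2, -1), (5, 0)]   # PHC(2/7)
print(phc_add(x, y, P))   # [(1, 0), (1, 0), (0, -1), (1, 0)]
print(phc_sub(x, y, P))   # [(0, 0), (4, 0), (3, -1), (2, 0)]
```

Since each output pair depends only on the two input pairs for the same prime, the loop body could run on n independent processors with no communication between them.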


(c) Multiplication $\otimes$: For $i = 1,2,\ldots,n$ in parallel,

$$(\alpha_i, r_i) \otimes (\beta_i, s_i) = ([\alpha_i \cdot \beta_i] \bmod p_i,\ r_i + s_i).$$

For example, if P = {3,5,7,11}, N = 24,

PHC(5/7) $\otimes$ PHC(1/2) = {(2,0), (3,1), (5,-1), (7,0)} $\otimes$ {(2,0), (3,0), (4,0), (6,0)} = {(1,0), (4,1), (6,-1), (9,0)}.

To convert we find M = 1155, $M_+ = 5$, $M_- = 7$, $M_0 = 33$; $5^{-1} \bmod 33 = 20$; $\alpha = 31$, $\alpha^* = 17$; on using the EEA algorithm we obtain 5/14.

(d) Division $\oslash$: For $i = 1,2,\ldots,n$ in parallel,

$$(\alpha_i, r_i) \oslash (\beta_i, s_i) = ([\alpha_i \cdot \beta_i^{-1}] \bmod p_i,\ r_i - s_i).$$

For example, if P = {3,5,7,11},

PHC(5/7) $\oslash$ PHC(2/7) = {(1,0), (3,1), (6,0), (8,0)}.

To convert we find M = 1155, $M_+ = 5$, $M_0 = 231$, $M_- = 1$; $5^{-1} \bmod 231 = 185$; $\alpha = 118$, $\alpha^* = 116$; on using the EEA algorithm we obtain 5/2.
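Multiplication and division are even simpler than addition, since mantissas and exponents are combined independently. The Python sketch below is illustrative: `phc_mul` and `phc_div` are hypothetical names, and raising an error on a zero divisor mantissa is an assumption of this sketch (the paper describes abandoning the computation for that prime instead).

```python
def phc_mul(x, y, primes):
    """Multiply mantissas modulo each prime and add the exponents."""
    return [((al * be) % p, r + s)
            for (al, r), (be, s), p in zip(x, y, primes)]

def phc_div(x, y, primes):
    """Multiply by the inverse mantissa and subtract the exponents."""
    if any(be == 0 for be, _ in y):
        raise ZeroDivisionError("divisor has a zero mantissa")
    # pow(be, -1, p) is the modular inverse (Python 3.8+)
    return [((al * pow(be, -1, p)) % p, r - s)
            for (al, r), (be, s), p in zip(x, y, primes)]

P = [3, 5, 7, 11]
five_sevenths = [(2, 0), (3, 1), (5, -1), (7, 0)]
one_half = [(2, 0), (3, 0), (4, 0), (6, 0)]
two_sevenths = [(2, 0), (1, 0), (2, -1), (5, 0)]
print(phc_mul(five_sevenths, one_half, P))      # [(1, 0), (4, 1), (6, -1), (9, 0)]
print(phc_div(five_sevenths, two_sevenths, P))  # [(1, 0), (3, 1), (6, 0), (8, 0)]
```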

(ii) Remarks

The PHC addition and multiplication operations are commutative, but not associative, and the distributive law fails. This is analogous to the conventional floating-point addition and multiplication operations.

As an example, for $p_i = 2$,

[(1,0) $\oplus$ (1,0)] $\oplus$ (1,2) = (1,2),
(1,0) $\oplus$ [(1,0) $\oplus$ (1,2)] = (0,0).

This phenomenon leads to a noncanonical form of representation for PHC [see subsection (iii)]. Although this poses no problem for the conversion of PHC, it could force us to abandon the computation for a particular $p_i$ when the mantissa is zero and the inverse of a PHC is needed. In the above case (1,2) is invertible but (0,0) is not; during a computation, if we get the form (1,2) and an inverse is to be taken, the computation proceeds. If, however, we get the (0,0) form, the computation with that particular prime $p_i$ is abandoned, broadcasting failure. This aspect will be explained in a following paper dealing with the inversion of matrices [10].
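The failure of associativity for $p_i = 2$ can be reproduced with a single-component version of the addition rules. This is a minimal sketch; the helper name `phc_add_pair` is hypothetical.

```python
def phc_add_pair(x, y, p):
    """Add two (mantissa, exponent) pairs for one prime p."""
    (al, r), (be, s) = x, y
    if al == 0:
        return (be, s)                 # zero plus y
    if be == 0:
        return (al, r)                 # x plus zero
    if r == s:
        return ((al + be) % p, r)      # equal exponents: add mantissas
    return (al, r) if r < s else (be, s)

p = 2
left = phc_add_pair(phc_add_pair((1, 0), (1, 0), p), (1, 2), p)
right = phc_add_pair((1, 0), phc_add_pair((1, 0), (1, 2), p), p)
print(left, right)   # (1, 2) (0, 0)
```

The two groupings give pairs with different mantissas, so a later inversion succeeds for one grouping and has to be abandoned for the other.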

(iii) Noncanonical Nature of PHC

The Para-Hensel codes are not truly canonical like the p-adic representations or Hensel codes. There could be more than one PHC representation when either the numerator or the denominator of a given rational is not relatively prime to one or more of the $p_i$.

For example, in the subtraction example of subsection (i), {(0,0), (4,0), (3,-1), (2,0)} on conversion yields 3/7. However, we obtain PHC(3/7) = {(1,1), (4,0), (3,-1), (2,0)} if we use the algorithm in Sec. II for P = {3,5,7,11}. This phenomenon is similar to having, for example, 3/7 = 6/14 in rational numbers. The noncanonical nature does not, however, interfere with decoding the Para-Hensel codes.

V. Concluding Remarks

In this paper we described an error-free, carry-free rational arithmetic system based on Para-Hensel codes. This will have extensive applications in constructing massively parallel arithmetic processors to do rational computation in a specified range. For example, if we use sixty-four 16-bit processors and sixty-four primes of 16-bit size, the order of Farey fraction N that can be handled is $N = 2^{511}$, that is, rationals whose numerators and denominators have roughly 154 decimal digits.
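The capacity estimate can be explored numerically. The sketch below assumes, purely for illustration, that the sixty-four moduli are the largest primes below $2^{16}$; the paper does not say which primes are meant, and the helper name `primes_below` is hypothetical. It computes M, the resulting order N from the bound $M \ge 2N^2 + 1$, and the size of N in bits and decimal digits, which should land close to the $2^{511}$ figure quoted above.

```python
from math import isqrt, prod

def primes_below(limit):
    """Sieve of Eratosthenes: all primes strictly less than limit."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit - 1) + 1):
        if sieve[i]:
            # strike out multiples of i starting from i*i
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]

primes = primes_below(1 << 16)[-64:]   # assumption: the 64 largest 16-bit primes
m = prod(primes)
n = isqrt((m - 1) // 2)                # largest N with M >= 2*N*N + 1
print(n.bit_length(), len(str(n)))     # size of N in bits and in decimal digits
```

Choosing smaller 16-bit primes would lower M and therefore N, so the figure depends on the prime set actually used.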

The arithmetic system described here has several other novel features such as fault-tolerance and error-correcting capabilities. These are being investigated.

The authors thank H. Schroder, Australian National University, for comments. The authors also thank the reviewers for suggesting improvements in the presentation of this paper.

References

1. E. Swartzlander, "Digital Optical Arithmetic," Appl. Opt. 25, 3021 (1986).
2. D. Psaltis and D. Casasent, "Optical Residue Arithmetic: A Correlation Approach," Appl. Opt. 18, 163 (1979).
3. J. Jackson and D. Casasent, "Optical Systolic Array Processor using Residue Arithmetic," Appl. Opt. 22, 2817 (1983).
4. P. R. Beaudet, A. P. Goutzoulis, E. C. Malarkey, and J. C. Bradley, "Residue Arithmetic Techniques for Optical Processing of Adaptive Phased Array Radars," Appl. Opt. 25, 3097 (1986).
5. A. Tai, I. Cindrich, J. R. Fienup, and C. C. Aleksoff, "Optical Residue Arithmetic Computer with Programmable Computation Modules," Appl. Opt. 18, 2812 (1979).
6. A. Huang, Y. Tsunoda, J. W. Goodman, and S. Ishihara, "Optical Computation using Residue Arithmetic," Appl. Opt. 18, 149 (1979).
7. N. Z. Szabo and R. I. Tanaka, Residue Arithmetic and its Applications to Computer Technology (McGraw-Hill, New York, 1967).
8. R. T. Gregory and E. V. Krishnamurthy, Methods and Applications of Error-free Computation (Springer-Verlag, New York, 1984).
9. E. V. Krishnamurthy, Error-free Polynomial Matrix Computations (Springer-Verlag, New York, 1985).
10. V. Murthy, Exact Parallel Matrix Inversion using Para-Hensel Codes with Systolic Processors (to be published).
