jesse-ross-waters
Chapter 6-2: Multiplier
Multiplier
Next Lecture
  Divider
  Floating Point Numbers
Multiplication of Positive Numbers
Uses the usual algorithm for multiplying integers
The algorithm applies to unsigned numbers and to positive numbers
The product of two n-digit numbers can be accommodated in 2n digits
Binary multiplication of positive operands can be implemented in a purely combinational, two-dimensional logic array

          1 1 0 1    (13)   Multiplicand M
        x 1 0 1 1    (11)   Multiplier Q
          -------
          1 1 0 1
        1 1 0 1             Partial products
      0 0 0 0
    1 1 0 1
  ---------------
  1 0 0 0 1 1 1 1    (143)  Product P
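Formal Representation
\[
Z = X \cdot Y = \sum_{k=0}^{M+N-1} Z_k 2^k
  = \left( \sum_{i=0}^{M-1} X_i 2^i \right) \left( \sum_{j=0}^{N-1} Y_j 2^j \right)
  = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} X_i Y_j 2^{i+j}
\]
with
\[
X = \sum_{i=0}^{M-1} X_i 2^i, \qquad Y = \sum_{j=0}^{N-1} Y_j 2^j
\]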
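The double sum in the formal representation can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper names `to_bits` and `product_by_double_sum` are made up for illustration, and the operands are assumed to be unsigned.

```python
def to_bits(value, width):
    """Return the bits of an unsigned value, least significant bit first."""
    return [(value >> k) & 1 for k in range(width)]


def product_by_double_sum(x_bits, y_bits):
    """Evaluate sum_i sum_j X_i * Y_j * 2^(i+j) for LSB-first bit lists."""
    total = 0
    for i, xi in enumerate(x_bits):
        for j, yj in enumerate(y_bits):
            # Each bit product X_i AND Y_j carries weight 2^(i+j).
            total += (xi & yj) << (i + j)
    return total


z = product_by_double_sum(to_bits(13, 4), to_bits(11, 4))
print(z)  # 143
```

Each term of the double sum corresponds to one AND gate in the combinational array: the gate output X_i·Y_j enters the column of weight 2^(i+j).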
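Multiplier Implementation
[Figure: 4 × 4 array multiplier. Multiplicand bits m3 m2 m1 m0 enter at the top and travel along diagonal paths; multiplier bits q0 to q3 each control one row. The partial product PP0 = 0 enters the first row, rows produce PP1, PP2, PP3, and PP4 = p7, p6, ..., p0 = Product.
Typical cell: a full adder (FA) with inputs mj, qi, the bit of incoming partial product PPi and carry-in, and outputs the bit of outgoing partial product PP(i+1) and carry-out.]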
Array Multiplier
[Figure: 4 × 4 array multiplier built from half adders (HA) and full adders (FA), with multiplicand inputs m3 m2 m1 m0, multiplier inputs q0 to q3, and product outputs P7 to P0.]
Ripple-Carry Array Multiplier
For the multiplication operation M × Q = P for 4-bit operands
M: m3 m2 m1 m0   Q: q3 q2 q1 q0   P: p7 p6 p5 p4 p3 p2 p1 p0   miqj = mi·qj
[Figure: three rows of four full adders (FA). The inputs are the bit products m0q0 to m3q3 and 0 carry-ins; the outputs are p7 to p0. Carries ripple along each row.]
The M × N Array Multiplier: Critical Path
\[
D_{mult} = [(M-1) + (N-2)] \, D_{carry} + (N-1) \, D_{sum} + 1 \, D_{and}
\]
[Figure: array of HA and FA cells with two critical paths marked, Critical Path 1 and Critical Path 2.]
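As a worked instance of the critical-path expression, the sketch below plugs in numbers. The function name and the unit delays are illustrative assumptions, not values from the slides.

```python
def array_multiplier_delay(m, n, d_carry, d_sum, d_and):
    """Critical-path delay of an M x N array multiplier.

    Implements D_mult = [(M-1) + (N-2)] * D_carry + (N-1) * D_sum + D_and.
    """
    return ((m - 1) + (n - 2)) * d_carry + (n - 1) * d_sum + d_and


# Assumed unit delays for a 4 x 4 array: (3 + 2) * 1 + 3 * 1 + 1 = 9
delay = array_multiplier_delay(4, 4, d_carry=1.0, d_sum=1.0, d_and=1.0)
print(delay)  # 9.0
```

The delay grows linearly in both M and N, which is why the later slides look for ways to avoid rippling carries in every row.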
Multiplier Implementation
The main component in each cell is an adder circuit
Each AND gate determines whether a multiplicand bit mj is added to the incoming partial product bit, based on the value of the multiplier bit qi
Each row i (0 ≤ i ≤ 3) where qi = 1 adds the multiplicand, appropriately shifted, to the incoming partial product PPi to generate PP(i+1)
If qi = 0, PPi is passed vertically downward unchanged
PP0 is all 0s
PP4 is the desired product
The multiplicand is shifted left one position per row by the diagonal signal path
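The row-by-row behaviour of the array can be mimicked in software. This is a hedged sketch rather than a gate-accurate model: `full_adder` and `array_multiply` are illustrative names, operands are assumed unsigned, and every cell is modelled as a full adder (the first row therefore adds to an all-zero PP0).

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def array_multiply(m, q, n=4):
    """Multiply two unsigned n-bit numbers one array row at a time."""
    m_bits = [(m >> j) & 1 for j in range(n)]
    pp = [0] * (2 * n)                 # PP0 is all 0s (LSB first)
    for i in range(n):                 # row i is controlled by q_i
        q_i = (q >> i) & 1
        carry = 0
        for j in range(n):             # cell in column i + j (diagonal shift)
            addend = m_bits[j] & q_i   # AND gate: add m_j only if q_i = 1
            pp[i + j], carry = full_adder(pp[i + j], addend, carry)
        pp[i + n] = carry              # row carry-out extends the partial product
    return sum(bit << k for k, bit in enumerate(pp))


print(array_multiply(13, 11))  # 143
```

When q_i = 0 every addend in the row is 0, so the incoming partial product passes through unchanged, as the slide states.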
Another Method of Multiplier Design
The previous algorithm may be impractical for large numbers because it uses many gates
Multiplication can be performed using a mixture of combinational array techniques and sequential techniques that require less combinational logic
In early computers, because of the cost of logic gates, the adder circuitry in the ALU was used to perform multiplication sequentially
This is called a sequential circuit binary multiplier
[Figure: sequential circuit binary multiplier. Register A (initially 0, bits a(n-1) ... a0) and multiplier register Q (bits q(n-1) ... q0) shift right together with carry flip-flop C. An n-bit adder adds to A either the multiplicand M (bits m(n-1) ... m0) or 0, selected by a MUX under the Add/Noadd control that the control sequencer derives from q0.]

Example: M = 1 1 0 1 (13), Q = 1 0 1 1 (11)

                             C   A      Q
  Initial configuration      0   0000   1011
  First cycle     Add        0   1101   1011
                  Shift      0   0110   1101
  Second cycle    Add        1   0011   1101
                  Shift      0   1001   1110
  Third cycle     No add     0   1001   1110
                  Shift      0   0100   1111
  Fourth cycle    Add        1   0001   1111
                  Shift      0   1000   1111
  Product (A, Q) = 1000 1111 (143)
Sequential Circuit Binary Multiplier
This circuit performs multiplication by using a single adder n times to implement the spatial addition performed by the n rows of ripple-carry adders
Registers A and Q combined hold PPi while multiplier bit qi generates the signal Add/Noadd
Add/Noadd controls the addition of the multiplicand M to PPi to generate PP(i+1)
The product is computed in n cycles
The partial product grows in length by 1 bit per cycle from the initial vector PP0 of n 0s in register A
The carry-out from the adder is stored in flip-flop C
At the start, the multiplier is loaded into register Q, the multiplicand into register M, and C and A are cleared to 0
Sequential Circuit Binary Multiplier
At the end of each cycle, C, A and Q are shifted right by one bit position to allow for the growth of the partial product as the multiplier is shifted out of register Q
Because of this shifting, multiplier bit qi appears in the LSB position of Q to generate the Add/Noadd signal at the correct time, starting with q0 during the first cycle, q1 during the second cycle, etc.
If the adder has a delay of 10 ns, and the control setting and the shift operations take another 10 ns per cycle, a hardwired multiply in a 32-bit word-length computer would take about 640 ns (32 cycles × 20 ns)
Multiply instructions took much longer to execute than Add instructions in early computers
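The register-level procedure (add or no add, then shift C, A and Q right) can be sketched as follows. The function name and the returned trace format are assumptions for illustration; operands are unsigned n-bit values.

```python
def sequential_multiply(m, q, n=4):
    """Shift-and-add multiplication with registers C, A and Q.

    Returns the 2n-bit product and a trace of (A, Q) after each shift.
    """
    mask = (1 << n) - 1
    c, a = 0, 0                      # C and A are cleared at the start
    trace = []
    for _ in range(n):               # the product is computed in n cycles
        if q & 1:                    # LSB of Q generates Add/Noadd
            total = a + m
            c, a = total >> n, total & mask   # carry-out is stored in C
        # Shift C, A and Q right by one bit position.
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
        trace.append((a, q))
    return (a << n) | q, trace


product, trace = sequential_multiply(0b1101, 0b1011)
print(product)  # 143
print([(format(a, "04b"), format(q, "04b")) for a, q in trace])
```

For M = 1101 and Q = 1011 the trace after the four shifts is (0110, 1101), (1001, 1110), (0100, 1111), (1000, 1111), so A and Q together end up holding 1000 1111.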
Signed Operand Multiplication
Multiplication of signed operands generates a double-length product in the 2's complement number system
Consider the case of a positive multiplier and a negative multiplicand
When we add a negative multiplicand to a partial product, we must extend the sign bit value of the multiplicand to the left as far as the product will extend
The previous hardware can be used for negative multiplicands if it provides for sign extension of the partial products
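Sign Extension of Negative Multiplicand
The negative number must be the multiplicand and the positive number the multiplier

Example: 1 0 0 1 1 (-13) × 0 1 0 1 1 (+11)
Each summand is written sign-extended to the 10-bit product width; the leading 1s of the nonzero summands are the sign extension (shown in blue in the original figure)

  Multiplier bit   Summand
  q0 = 1           1 1 1 1 1 1 0 0 1 1
  q1 = 1           1 1 1 1 1 0 0 1 1 0
  q2 = 0           0 0 0 0 0 0 0 0 0 0
  q3 = 1           1 1 1 0 0 1 1 0 0 0
  q4 = 0           0 0 0 0 0 0 0 0 0 0
  Product          1 1 0 1 1 1 0 0 0 1   (-143)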
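The sign-extension rule can be tried out with fixed-width arithmetic. This is a sketch under stated assumptions: the function name is made up, the multiplicand may be negative, the multiplier must be non-negative, and all arithmetic is done modulo 2^(2n) to imitate a 2n-bit register.

```python
def multiply_negative_multiplicand(m, q, n=5):
    """Add sign-extended, shifted copies of m for each 1 bit of q.

    m is a signed n-bit multiplicand, q a non-negative n-bit multiplier.
    Returns (signed product, 2n-bit two's-complement pattern).
    """
    width = 2 * n
    mask = (1 << width) - 1
    m_ext = m & mask                 # m sign-extended to 2n bits
    acc = 0
    for i in range(n):
        if (q >> i) & 1:
            acc = (acc + (m_ext << i)) & mask
    # Reinterpret the 2n-bit pattern as a signed number.
    signed = acc - (1 << width) if acc >> (width - 1) else acc
    return signed, acc


value, pattern = multiply_negative_multiplicand(-13, 11)
print(value, format(pattern, "010b"))  # -143 1101110001
```

Without the sign extension (that is, adding only the 5-bit pattern 10011), the upper product bits would be wrong, which is the point the slide makes.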
Booth Algorithm
A powerful algorithm for signed-number multiplication that treats positive and negative numbers uniformly
So far, the number of additions equals the number of 1s in the multiplier
Consider a multiplication in which the multiplier is positive and has a single block of 1s (e.g., binary 0011110 = decimal 30)
To derive the product, we could add four appropriately shifted versions of the multiplicand (one for each of the four 1s)
We can reduce the number of operations by regarding the multiplier as the difference between two numbers, i.e., decimal 32 - 2, or binary 0100000 - 0000010
This suggests that the product can be generated by adding 2^5 times the multiplicand to the 2's complement of 2^1 times the multiplicand
The sequence of required operations can be recoded as 0 +1 0 0 0 -1 0
Booth Algorithm
The multiplier is scanned from right to left
-1 times the shifted multiplicand is selected when moving from 0 to 1
+1 times the shifted multiplicand is selected when moving from 1 to 0
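The recoding rule can be written as a few lines of code. This is an illustrative sketch; `booth_recode` is a made-up name and the multiplier is passed as a bit string, most significant bit first.

```python
def booth_recode(bits):
    """Booth-recode a 2's-complement multiplier given as an MSB-first string.

    Scanning right to left, each digit is (bit to the right) - (current bit),
    with an implied 0 to the right of the LSB.
    Returns the recoded digits MSB first, each in {-1, 0, +1}.
    """
    digits = []
    right = 0                        # implied 0 to the right of the LSB
    for ch in reversed(bits):
        bit = int(ch)
        digits.append(right - bit)   # 0 -> 1 gives -1, 1 -> 0 gives +1
        right = bit
    digits.reverse()
    return digits


print(booth_recode("0011110"))  # [0, 1, 0, 0, 0, -1, 0]
```

A run of 1s of any length therefore costs one subtraction at its right end and one addition just past its left end.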
Normal and Booth Multiplication Schemes
[Figure: multiplicand 0 1 0 1 1 0 1 (45) times multiplier 0 0 1 1 1 1 0 (30), worked two ways.
Normal: the multiplier digits are 0 0 +1 +1 +1 +1 0, so four shifted copies of the multiplicand are added.
Booth: the recoded multiplier is 0 +1 0 0 0 -1 0, so only two nonzero summands are needed: the 2's complement of the multiplicand (1 1 1 1 1 1 1 0 1 0 0 1 1, sign-extended) at position 1, and the multiplicand (0 0 0 1 0 1 1 0 1) at position 5.
Both schemes give the product 0 0 0 1 0 1 0 1 0 0 0 1 1 0 (1350).]
Booth Recoding of a Multiplier
When the least significant bit is 1, assume an implied 0 lies to its right

  Multiplier:  0  0  1  0  1  1  0  0  1  1  1  0  1  0  1  1  0  0
  Recoded:     0 +1 -1 +1  0 -1  0 +1  0  0 -1 +1 -1 +1  0 -1  0  0
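Booth Multiplication with a Negative Multiplier
Example: 0 1 1 0 1 (+13) × 1 1 0 1 0 (-6); the recoded multiplier is 0 -1 +1 -1 0
Each summand is written sign-extended to the 10-bit product width

  Recoded digit    Summand
   0 (position 0)  0 0 0 0 0 0 0 0 0 0
  -1 (position 1)  1 1 1 1 1 0 0 1 1 0
  +1 (position 2)  0 0 0 0 1 1 0 1 0 0
  -1 (position 3)  1 1 1 0 0 1 1 0 0 0
   0 (position 4)  0 0 0 0 0 0 0 0 0 0
  Product          1 1 1 0 1 1 0 0 1 0   (-78)

Handles both positive and negative multipliers uniformly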
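A compact way to see that the Booth technique handles either sign is to run it on signed integers. The sketch below is an assumption-laden illustration: `booth_multiply` is a hypothetical name, and it relies on Python's `>>` behaving as an arithmetic shift on negative integers, so `(q >> i) & 1` yields the 2's-complement bits of q.

```python
def booth_multiply(m, q, n=5):
    """Multiply signed m by signed n-bit q using Booth recoding."""
    acc = 0
    right = 0                        # implied 0 to the right of the LSB
    for i in range(n):
        bit = (q >> i) & 1           # 2's-complement bit i of q
        digit = right - bit          # recoded digit: -1, 0 or +1
        acc += digit * (m << i)      # add or subtract the shifted multiplicand
        right = bit
    return acc


print(booth_multiply(13, -6))  # -78
```

For q = -6 (binary 11010) the digits produced are 0, -1, +1, -1, 0 from position 4 down to position 0, giving -104 + 52 - 26 = -78.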
Correctness of Booth Technique for Negative Multipliers
Let the leftmost zero of a negative number X be at bit position k:
\[
X = 1 1 \ldots 1 0 \, x_{k-1} \ldots x_0
\]
The value of X is given by
\[
V(X) = -2^{k+1} + x_{k-1} 2^{k-1} + \ldots + x_0 2^0
\]
Examples:
  X = 11000 (-8): V(X) = -2^3
  X = 11001 (-7): V(X) = -2^3 + 1
For example, binary 110110 (decimal -10) is recoded as 0 -1 +1 0 -1 0, and -2^4 + 2^3 - 2 = -10
Booth Multiplier Recoding Scheme

  Multiplier            Version of multiplicand
  Bit i    Bit i-1      selected by bit i
  0        0             0 × M
  0        1            +1 × M
  1        0            -1 × M
  1        1             0 × M
Booth Recoded Multipliers
Achieves some efficiency in the number of additions required when the multiplier has a few large blocks of 1s

  Worst-case multiplier:   0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
  Recoded:                +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1

  Ordinary multiplier:     1  1  0  0  0  1  0  1  1  0  1  1  1  1  0  0
  Recoded:                 0 -1  0  0 +1 -1 +1  0 -1 +1  0  0  0 -1  0  0

  Good multiplier:         0  0  0  0  1  1  1  1  1  0  0  0  0  1  1  1
  Recoded:                 0  0  0 +1  0  0  0  0 -1  0  0  0 +1  0  0 -1
Fast Multiplication
Bit-pair recoding of multipliers
  Halves the maximum number of summands
  Derived from the Booth algorithm
(+1 -1) is equivalent to (0 +1)
  Because (+1 -1) selects +2M at position i (that is, +1 × M at position i+1) and -M at position i, and +2M - M = +M = (0 +1)
  Instead of adding -1 times the multiplicand M at shift position i to +1 × M at position i+1, the same result can be obtained by adding +1 × M at position i
(+1 0) is equivalent to (0 +2)
(-1 +1) is equivalent to (0 -1)
The Booth-recoded multiplier is examined two bits at a time, starting from the right
Multiplier Bit-Pair Recoding

(a) Example of bit-pair recoding derived from Booth recoding

  Multiplier (with sign extension and implied 0 to right of LSB):  1 | 1 1 0 1 0 | 0
  Booth recoding:      0   0  -1  +1  -1   0
  Bit-pair recoding:       0      -1      -2

(b) Table of multiplicand selection decisions

  Multiplier bit-pair    Multiplier bit on the right    Multiplicand selected
  i+1      i             i-1                            at position i
  0        0             0                               0 × M
  0        0             1                              +1 × M
  0        1             0                              +1 × M
  0        1             1                              +2 × M
  1        0             0                              -2 × M
  1        0             1                              -1 × M
  1        1             0                              -1 × M
  1        1             1                               0 × M
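Multiplication Requiring Only n/2 Summands
Example: 0 1 1 0 1 (+13) × 1 1 0 1 0 (-6)
Each summand is written sign-extended to the 10-bit product width

Booth recoding 0 -1 +1 -1 0 (three nonzero summands):
  -1 at position 1   1 1 1 1 1 0 0 1 1 0
  +1 at position 2   0 0 0 0 1 1 0 1 0 0
  -1 at position 3   1 1 1 0 0 1 1 0 0 0
  Product            1 1 1 0 1 1 0 0 1 0   (-78)

Bit-pair recoding 0 -1 -2 (two nonzero summands):
  -2 at position 0   1 1 1 1 1 0 0 1 1 0
  -1 at position 2   1 1 1 1 0 0 1 1 0 0
   0 at position 4   0 0 0 0 0 0 0 0 0 0
  Product            1 1 1 0 1 1 0 0 1 0   (-78)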
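The selection table for bit-pair recoding reduces to one arithmetic rule per pair: digit = -2·b(i+1) + b(i) + b(i-1). The sketch below applies it; the function names are illustrative, and sign extension for odd n is assumed to come from Python's arithmetic right shift on negative integers.

```python
def bit_pair_recode(q, n):
    """Return radix-4 digits of signed n-bit q, least significant pair first.

    Each digit is -2*b[i+1] + b[i] + b[i-1] and lies in {-2, -1, 0, +1, +2}.
    """
    digits = []
    right = 0                            # implied 0 to the right of the LSB
    for i in range(0, n, 2):
        b_i = (q >> i) & 1
        b_next = (q >> (i + 1)) & 1      # sign extension when i + 1 >= n
        digits.append(-2 * b_next + b_i + right)
        right = b_next
    return digits


def bit_pair_multiply(m, q, n):
    """Multiply m by q using about n/2 summands."""
    return sum(d * (m << (2 * k)) for k, d in enumerate(bit_pair_recode(q, n)))


print(bit_pair_recode(-6, 5))        # [-2, -1, 0]
print(bit_pair_multiply(13, -6, 5))  # -78
```

Read most significant pair first, the digits for -6 are 0 -1 -2, and the two nonzero summands are -2 × 13 at position 0 and -1 × 13 at position 2.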
Ripple-Carry Array Disadvantage
Multiplication requires many additions
Using a ripple-carry array is slow
Consider the addition of three n-bit numbers W, X, Y to produce the sum Z
  We can first add W to X to generate a number A
  Then we can add A to Y to produce Z
  This can be done by using two ripple-carry adders
A Different Approach
Instead of adding W to X to produce A in the upper ripple-carry adder, let's introduce the bits of Y into the carry inputs
This generates the sum vector S and the saved carries C as the outputs
In the second row, S and C are added in a ripple-carry adder to produce Z
Carry-save addition can speed up this process
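Carry-save addition of three numbers can be expressed with bitwise operations. This is a minimal sketch assuming non-negative integers; `carry_save_add` is an illustrative name, and the three operands are simply 45, 90 and 180 (45 shifted by 0, 1 and 2 positions).

```python
def carry_save_add(w, x, y):
    """Reduce three numbers to a sum vector S and a saved-carry vector C.

    Every bit position acts as an independent full adder, so no carry
    propagates between positions. W + X + Y == S + C.
    """
    s = w ^ x ^ y                                # per-bit sum outputs
    c = ((w & x) | (w & y) | (x & y)) << 1       # per-bit carries, weighted x2
    return s, c


s, c = carry_save_add(45, 90, 180)
z = s + c                                        # one carry-propagate addition
print(s, c, z)  # 195 120 315
```

Only the final `s + c` needs a carry-propagating adder; the step that produces S and C takes one full-adder delay regardless of the word length.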
Carry Save Array
For the multiplication operation M × Q = P for 4-bit operands
M: m3 m2 m1 m0   Q: q3 q2 q1 q0   P: p7 p6 p5 p4 p3 p2 p1 p0
[Figure: carry-save array of three rows of four full adders (FA), with bit-product inputs m0q0 to m3q3 and outputs p7 to p0.]
Q: Do you see any saving here?
Example: Ripple-Carry vs. Carry-Save

              1 0 1 1 0 1    (45)     M
            x 1 1 1 1 1 1    (63)     Q
              -----------
              1 0 1 1 0 1             A
            1 0 1 1 0 1               B
          1 0 1 1 0 1                 C
        1 0 1 1 0 1                   D
      1 0 1 1 0 1                     E
    1 0 1 1 0 1                       F
  -----------------------
  1 0 1 1 0 0 0 1 0 0 1 1    (2,835)  Product
Carry-Save Addition Approach: Complete Example
[Figure: complete carry-save example for M = 1 0 1 1 0 1 (45) × Q = 1 1 1 1 1 1 (63) with summands A to F. A, B, C are reduced to S1, C1 and D, E, F to S2, C2; then S1, C1, S2 are reduced to S3, C3; then S3, C3, C2 are reduced to S4, C4; a final addition of S4 and C4 gives the product.]
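Schematic Representation of C.S.A.
[Figure: carry-save adder tree for six summands.
  Level 1 CSA: A, B, C produce S1, C1; D, E, F produce S2, C2
  Level 2 CSA: S1, C1, S2 produce S3, C3
  Level 3 CSA: S3, C3, C2 produce S4, C4
  Final addition: S4 + C4 = Product]
About 1.7 log2 k - 1.7 CSA steps are needed to reduce k summands to two vectors, where k is the number of summands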
Example Ripple-Carry vs. Carry-Save
Carry-save addition transforms W, X and Y into S and C
Advantages: all bits of S and C are produced in a short fixed amount of time after W, X, and Y are applied
  Each row takes approximately one full-adder delay
  Carry propagation takes place only in the last row
  A carry-lookahead adder could be used effectively to add the S and C vectors because all bits of S and C are available in parallel
Consider the addition of many summands
  We can group the summands in threes and perform carry-save addition on each of these groups in parallel to generate S and C
  Next, group all the S and C vectors into threes and perform carry-save addition on them
  Continue this process until there are only two vectors remaining
  These remaining vectors can be added in a ripple-carry or a carry-lookahead adder to produce the sum
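The grouping procedure for many summands can be sketched as a loop over CSA levels. The code is an illustration under stated assumptions: summands are non-negative integers, the function names are made up, and leftover vectors that do not fill a group of three are simply passed to the next level.

```python
def carry_save_add(w, x, y):
    """Reduce three numbers to (S, C) with W + X + Y == S + C."""
    return w ^ x ^ y, ((w & x) | (w & y) | (x & y)) << 1


def csa_tree_sum(summands):
    """Reduce summands with carry-save levels, then do one final addition.

    Returns (sum, number of CSA levels used).
    """
    vectors = list(summands)
    levels = 0
    while len(vectors) > 2:
        full = len(vectors) - len(vectors) % 3
        reduced = []
        for k in range(0, full, 3):              # groups of three, in parallel
            reduced.extend(carry_save_add(*vectors[k:k + 3]))
        reduced.extend(vectors[full:])           # leftovers wait for the next level
        vectors = reduced
        levels += 1
    return sum(vectors), levels                  # final carry-propagate addition


# Six summands of 45 x 63: the multiplicand shifted by 0..5 positions.
total, levels = csa_tree_sum([45 << i for i in range(6)])
print(total, levels)  # 2835 3
```

For six summands the loop goes 6, 4, 3, 2 vectors, that is three CSA levels followed by the final addition, which agrees with the estimate 1.7 log2 6 - 1.7 ≈ 2.7 rounded up.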