Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
Introduction to
CMOS VLSI
Design
Lecture 13:
Datapath Functional UnitsDavid Harris, Harvey Mudd College
Kartik Mohanram and Steven Levitan
University of Pittsburgh
CMOS VLSI Design12: Datapath Functional Units Slide 2
Outline
Comparators
Shifters
Multi-input Adders
Multipliers
CMOS VLSI Design12: Datapath Functional Units Slide 3
Comparators
0’s detector: A = 00…000
1’s detector: A = 11…111
Equality comparator: A = B
Magnitude comparator: A < B
CMOS VLSI Design12: Datapath Functional Units Slide 4
1’s & 0’s Detectors
1’s detector: N-input AND gate
0’s detector: NOTs + 1’s detector (N-input NOR)
A0
A1
A2
A3
A4
A5
A6
A7
allones
A0
A1
A2
A3
allzeros
allones
A1
A2
A3
A4
A5
A6
A7
A0
CMOS VLSI Design12: Datapath Functional Units Slide 5
Equality Comparator
Check if each bit is equal (XNOR, aka equality gate)
1’s detect on bitwise equality
A[0]B[0]
A = B
A[1]B[1]
A[2]B[2]
A[3]B[3]
CMOS VLSI Design12: Datapath Functional Units Slide 6
Magnitude Comparator
Compute B-A and look at sign
B-A = B + ~A + 1
For unsigned numbers, carry out is sign bit
A0
B0
A1
B1
A2
B2
A3
B3
A = BZ
C
A B
N A B
CMOS VLSI Design12: Datapath Functional Units Slide 7
Signed vs. Unsigned
For signed numbers, comparison is harder
– C: carry out
– Z: zero (all bits of A-B are 0)
– N: negative (MSB of result)
– V: overflow (inputs had different signs, output sign B)
CMOS VLSI Design12: Datapath Functional Units Slide 8
Shifters
Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR 1 = ____ 1011 LSL1 = ____
Arithmetic Shift:
– Shifts number left or right. Rt shift sign extends
• 1011 ASR1 = ____ 1011 ASL1 = ____
Rotate:
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = ____ 1011 ROL1 = ____
CMOS VLSI Design12: Datapath Functional Units Slide 9
Shifters
Logical Shift:
– Shifts number left or right and fills with 0’s
• 1011 LSR 1 = 0101 1011 LSL1 = 0110
Arithmetic Shift:
– Shifts number left or right. Rt shift sign extends
• 1011 ASR1 = 1101 1011 ASL1 = 0110
Rotate:
– Shifts number left or right and fills with lost bits
• 1011 ROR1 = 1101 1011 ROL1 = 0111
CMOS VLSI Design12: Datapath Functional Units Slide 10
Funnel Shifter
A funnel shifter can do all six types of shifts
Selects N-bit field Y from 2N-bit input
– Shift by k bits (0 k < N)
B C
offsetoffset + N-1
0N-12N-1
Y
CMOS VLSI Design12: Datapath Functional Units Slide 11
Funnel Shifter Operation
Computing N-k requires an adder
CMOS VLSI Design12: Datapath Functional Units Slide 12
Funnel Shifter Operation
Computing N-k requires an adder
CMOS VLSI Design12: Datapath Functional Units Slide 13
Funnel Shifter Operation
Computing N-k requires an adder
CMOS VLSI Design12: Datapath Functional Units Slide 14
Funnel Shifter Operation
Computing N-k requires an adder
CMOS VLSI Design12: Datapath Functional Units Slide 15
Funnel Shifter Operation
Computing N-k requires an adder
CMOS VLSI Design12: Datapath Functional Units Slide 16
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design12: Datapath Functional Units Slide 17
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design12: Datapath Functional Units Slide 18
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design12: Datapath Functional Units Slide 19
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design12: Datapath Functional Units Slide 20
Simplified Funnel Shifter
Optimize down to 2N-1 bit input
CMOS VLSI Design12: Datapath Functional Units Slide 21
Funnel Shifter Design 1
N N-input multiplexers
– Use 1-of-N hot select signals for shift amount
– nMOS pass transistor design (Vt drops!)k[1:0]
s0
s1
s2
s3
Y3
Y2
Y1
Y0
Z0
Z1
Z2
Z3
Z4
Z5
Z6
left Inverters & Decoder
CMOS VLSI Design12: Datapath Functional Units Slide 22
Funnel Shifter Design 2
Log N stages of 2-input muxes
– No select decoding needed
Y3
Y2
Y1
Y0
Z0
Z1
Z2
Z3
Z4
Z5
Z6
k0
k1
left
CMOS VLSI Design12: Datapath Functional Units Slide 23
Multi-input Adders
Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = _____
CMOS VLSI Design12: Datapath Functional Units Slide 24
Multi-input Adders
Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = 10111
CMOS VLSI Design12: Datapath Functional Units Slide 25
Multi-input Adders
Suppose we want to add k N-bit words
– Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k-1 N-input CPAs
– Large and slow
+
+
0001 0111
+
1101 0010
10101
10111
CMOS VLSI Design12: Datapath Functional Units Slide 26
Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
– Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
– Produce N sums and N carry outs
Z4
Y4
X4
S4
C4
Z3
Y3
X3
S3
C3
Z2
Y2
X2
S2
C2
Z1
Y1
X1
S1
C1
XN...1
YN...1
ZN...1
SN...1
CN...1
n-bit CSA
CMOS VLSI Design12: Datapath Functional Units Slide 27
CSA Application
Use k-2 stages of CSAs
– Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
10110101_
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
X
Y
Z
S
C
A
B
S
CMOS VLSI Design12: Datapath Functional Units Slide 28
CSA Application
Use k-2 stages of CSAs
– Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
10110101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
A
B
S
CMOS VLSI Design12: Datapath Functional Units Slide 29
CSA Application
Use k-2 stages of CSAs
– Keep result in carry-save redundant form
Final CPA computes actual result
4-bit CSA
5-bit CSA
0001 0111 1101 0010
+
10110101_
01010_ 00011
0001
0111
+1101
1011
0101_
X
Y
Z
S
C
0101_
1011
+0010
00011
01010_
X
Y
Z
S
C
01010_
+ 00011
10111
A
B
S
10111
CMOS VLSI Design12: Datapath Functional Units Slide 30
Multiplication
Example: 1100 : 1210
0101 : 510
CMOS VLSI Design12: Datapath Functional Units Slide 31
Multiplication
Example: 1100 : 1210
0101 : 510
1100
CMOS VLSI Design12: Datapath Functional Units Slide 32
Multiplication
Example: 1100 : 1210
0101 : 510
1100
0000
CMOS VLSI Design12: Datapath Functional Units Slide 33
Multiplication
Example: 1100 : 1210
0101 : 510
1100
0000
1100
CMOS VLSI Design12: Datapath Functional Units Slide 34
Multiplication
Example: 1100 : 1210
0101 : 510
1100
0000
1100
0000
CMOS VLSI Design12: Datapath Functional Units Slide 35
Multiplication
Example: 1100 : 1210
0101 : 510
1100
0000
1100
0000
00111100 : 6010
CMOS VLSI Design12: Datapath Functional Units Slide 36
Multiplication
Example:
M x N-bit multiplication
– Produce N M-bit partial products
– Sum these to produce M+N-bit product
1100 : 1210
0101 : 510
1100
0000
1100
0000
00111100 : 6010
multiplier
multiplicand
partial
products
product
CMOS VLSI Design12: Datapath Functional Units Slide 37
General Form
Multiplicand: Y = (yM-1, yM-2, …, y1, y0)
Multiplier: X = (xN-1, xN-2, …, x1, x0)
Product:1 1 1 1
0 0 0 0
2 2 2M N N M
j i i j
j i i j
j i i j
P y x x y
x0y
5x
0y
4x
0y
3x
0y
2x
0y
1x
0y
0
y5
y4
y3
y2
y1
y0
x5
x4
x3
x2
x1
x0
x1y
5x
1y
4x
1y
3x
1y
2x
1y
1x
1y
0
x2y
5x
2y
4x
2y
3x
2y
2x
2y
1x
2y
0
x3y
5x
3y
4x
3y
3x
3y
2x
3y
1x
3y
0
x4y
5x
4y
4x
4y
3x
4y
2x
4y
1x
4y
0
x5y
5x
5y
4x
5y
3x
5y
2x
5y
1x
5y
0
p0
p1
p2
p3
p4
p5
p6
p7
p8
p9
p10
p11
multiplier
multiplicand
partial
products
product
CMOS VLSI Design12: Datapath Functional Units Slide 38
Dot Diagram
Each dot represents a bit
partial products
multip
lier x
x0
x15
CMOS VLSI Design12: Datapath Functional Units Slide 39
Array Multipliery
0y
1y
2y
3
x0
x1
x2
x3
p0
p1
p2
p3
p4
p5
p6
p7
B
ASin Cin
SoutCout
BA
CinCout
Sout
Sin
=
CSA
Array
CPA
critical path BA
Sout
Cout CinCout
Sout
=Cin
BA
CMOS VLSI Design12: Datapath Functional Units Slide 40
Rectangular Array
Squash array to fit rectangular floorplany
0y
1y
2y
3
x0
x1
x2
x3
p0
p1
p2
p3
p4
p5
p6
p7
CMOS VLSI Design12: Datapath Functional Units Slide 41
Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r
partial products.
– Faster and smaller?
– Called radix-2r encoding
Ex: r = 2: look at pairs of bits
– Form partial products of 0, Y, 2Y, 3Y
– First three are easy, but 3Y requires adder
CMOS VLSI Design12: Datapath Functional Units Slide 42
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 43
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 44
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 45
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 46
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 47
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 48
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 49
Booth Encoding
Instead of 3Y, try –Y, then increment next partial
product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
CMOS VLSI Design12: Datapath Functional Units Slide 50
Booth Hardware
Booth encoder generates control lines for each PP
– Booth selectors choose PP bits
Mi
yj
Xi
yj-1
2Xi
PPij
Booth
Selector
Booth
Encoder
x2i+1
x2i
x2i-1
CMOS VLSI Design12: Datapath Functional Units Slide 51
Sign Extension
Partial products can be negative
– Require sign extension, which is cumbersome
– High fanout on most significant bit
mu
ltiplie
r x
x0
x15
0
00
x-1
x16
x17
ssssssssssssssss
ssssssssssssss
ssssssssssss
ssssssssss
ssssssss
ssssss
ssss
ss
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
CMOS VLSI Design12: Datapath Functional Units Slide 52
Simplified Sign Ext.
Sign bits are either all 0’s or all 1’s
– Note that all 0’s is all 1’s + 1 in proper column
– Use this to reduce loading on MSB
s111111111111111s
s1111111111111s
s11111111111s
s111111111s
s1111111s
s11111s
s111s
s1s
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
CMOS VLSI Design12: Datapath Functional Units Slide 53
Even Simpler Sign Ext.
No need to add all the 1’s in hardware
– Precompute the answer!
ssss
ss1
ss1
ss1
ss1
ss1
ss1
ss
PP0
PP1
PP2
PP3
PP4
PP5
PP6
PP7
PP8
CMOS VLSI Design12: Datapath Functional Units Slide 54
Advanced Multiplication
Signed vs. unsigned inputs
Higher radix Booth encoding
Array vs. tree CSA networks