Upload
malli-rao
View
12
Download
4
Embed Size (px)
Citation preview
EE1411
© Digital Integrated Circuits2ndArithmetic Circuits
Digital Integrated Digital Integrated CircuitsCircuitsA Design PerspectiveA Design Perspective
Arithmetic CircuitsArithmetic Circuits
Jan M. RabaeyAnantha ChandrakasanBorivoje Nikolic
January, 2003
EE1412
© Digital Integrated Circuits2ndArithmetic Circuits
A Generic Digital ProcessorA Generic Digital Processor
MEMORY
DATAPATH
CONTROL
INP
UT
-OU
TP
UT
EE1413
© Digital Integrated Circuits2ndArithmetic Circuits
Building Blocks for Digital ArchitecturesBuilding Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic.)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
EE1414
© Digital Integrated Circuits2ndArithmetic Circuits
An Intel MicroprocessorAn Intel Microprocessor
9-1
Mux
9-1
Mux
5-1
Mux
2-1
Mux
ck1
CARRYGEN
SUMGEN+ LU
1000um
b
s0
s1
g64
sum sumb
LU : LogicalUnit
SU
MS
EL
a
to Cache
node1
RE
G
Itanium has 6 integer execution units like this
EE1415
© Digital Integrated Circuits2ndArithmetic Circuits
Bit-Sliced DesignBit-Sliced Design
Bit 3
Bit 2
Bit 1
Bit 0
Reg
iste
r
Add
er
Shif
ter
Mul
tipl
exer
ControlD
ata-
In
Dat
a-O
ut
Tile identical processing elements
EE1416
© Digital Integrated Circuits2ndArithmetic Circuits
Bit-Sliced DatapathBit-Sliced Datapath
Adder stage 1
Wiring
Adder stage 2
Wiring
Adder stage 3
Bit s
lice 0
Bit s
lice 2
Bit s
lice 1
Bit s
lice 63
Sum Select
Shifter
Multiplexers
Loopback Bus
From register files / Cache / Bypass
To register files / CacheLoopback B
us
Loopback Bus
EE1417
© Digital Integrated Circuits2ndArithmetic Circuits
Itanium Integer DatapathItanium Integer Datapath
Fetzer, Orton, ISSCC’02
EE1418
© Digital Integrated Circuits2ndArithmetic Circuits
AddersAdders
EE1419
© Digital Integrated Circuits2ndArithmetic Circuits
Full-AdderFull-AdderA B
Cout
Sum
Cin Fulladder
EE14110
© Digital Integrated Circuits2ndArithmetic Circuits
The Binary AdderThe Binary Adder
S A B Ci =
A= BCi ABCi ABCi ABCi+ + +
Co AB BCi ACi+ +=
A B
Cout
Sum
Cin Fulladder
EE14111
© Digital Integrated Circuits2ndArithmetic Circuits
Express Sum and Carry as a function of P, G, DExpress Sum and Carry as a function of P, G, D
Define 3 new variable which ONLY depend on A, B
Generate (G) = AB
Propagate (P) = A B
Delete = A B
Can also derive expressions for S and Co based on D and P
Propagate (P) = A BNote that we will be sometimes using an alternate definition for
EE14112
© Digital Integrated Circuits2ndArithmetic Circuits
The Ripple-Carry AdderThe Ripple-Carry Adder
Worst case delay linear with the number of bits
Goal: Make the fastest possible carry path circuit
FA FA FA FA
A0 B0
S0
A1 B1
S1
A2 B2
S2
A3 B3
S3
Ci,0 Co,0
(Ci,1)
Co,1 Co,2
td = O(N)
tadder = (N-1)tcarry + tsum
EE14113
© Digital Integrated Circuits2ndArithmetic Circuits
Complimentary Static CMOS Full AdderComplimentary Static CMOS Full Adder
28 Transistors
A B
B
A
Ci
Ci A
X
VDD
VDD
A B
Ci BA
B VDD
A
B
Ci
Ci
A
B
A CiB
Co
VDD
EE14114
© Digital Integrated Circuits2ndArithmetic Circuits
Inversion PropertyInversion Property
A B
S
CoCi FA
A B
S
CoCi FA
S A B Ci S A B Ci
=
Co A B Ci Co A B Ci
=
EE14115
© Digital Integrated Circuits2ndArithmetic Circuits
Minimize Critical Path by Reducing Inverting StagesMinimize Critical Path by Reducing Inverting Stages
Exploit Inversion Property
A3
FA FA FA
Even cell Odd cell
FA
A0 B0
S0
A1 B1
S1
A2 B2
S2
B3
S3
Ci,0 Co,0 Co,1 Co,3C
EE14116
© Digital Integrated Circuits2ndArithmetic Circuits
A Better Structure: The Mirror AdderA Better Structure: The Mirror Adder
VDD
Ci
A
BBA
B
A
A BKill
Generate"1"-Propagate
"0"-Propagate
VDD
Ci
A B Ci
Ci
B
A
Ci
A
BBA
VDD
SCo
24 transistors
EE14117
© Digital Integrated Circuits2ndArithmetic Circuits
Mirror AdderMirror AdderStick Diagram
CiA B
VDD
GND
B
Co
A Ci Co Ci A B
S
EE14118
© Digital Integrated Circuits2ndArithmetic Circuits
The Mirror AdderThe Mirror Adder•The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
•When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion
capacitances is particularly important.
•The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell .
•The transistors connected to Ci are placed closest to the output.
•Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.
EE14119
© Digital Integrated Circuits2ndArithmetic Circuits
Transmission Gate Full AdderTransmission Gate Full Adder
A
B
P
Ci
VDDA
A A
VDD
Ci
A
P
AB
VDD
VDD
Ci
Ci
Co
S
Ci
P
P
P
P
P
Sum Generation
Carry Generation
Setup
EE14120
© Digital Integrated Circuits2ndArithmetic Circuits
Manchester Carry ChainManchester Carry Chain
CoCi
Gi
Di
Pi
Pi
VDD
CoCi
Gi
Pi
VDD
EE14121
© Digital Integrated Circuits2ndArithmetic Circuits
Manchester Carry ChainManchester Carry Chain
G2
C3
G3
Ci,0
P0
G1
VDD
G0
P1 P2 P3
C3C2C1C0
EE14122
© Digital Integrated Circuits2ndArithmetic Circuits
Manchester Carry ChainManchester Carry Chain
Pi + 1 Gi + 1
Ci
Inverter/Sum Row
Propagate/Generate Row
Pi Gi
Ci - 1Ci + 1
VDD
GND
Stick Diagram
EE14123
© Digital Integrated Circuits2ndArithmetic Circuits
Carry-Bypass AdderCarry-Bypass Adder
FA FA FA FA
P0 G1 P0 G1 P2 G2 P3 G3
Co,3Co,2Co,1Co,0Ci ,0
FA FA FA FA
P0 G1 P0 G1 P2 G2 P3 G3
Co,2Co,1Co,0Ci,0
Co,3
Mul
tipl
exer
BP=PoP1P2P3
Idea: If (P0 and P1 and P2 and P3 = 1)then Co3 = C0, else “kill” or “generate”.
Also called Carry-Skip
EE14124
© Digital Integrated Circuits2ndArithmetic Circuits
Carry-Bypass Adder (cont.)Carry-Bypass Adder (cont.)
Carrypropagation
Setup
Bit 0–3
Sum
M bits
tsetup
tsum
Carrypropagation
Setup
Bit 4–7
Sum
tbypass
Carrypropagation
Setup
Bit 8–11
Sum
Carrypropagation
Setup
Bit 12–15
Sum
tadder = tsetup + Mtcarry + (N/M-1)tbypass + (M-1)tcarry + tsum
EE14125
© Digital Integrated Circuits2ndArithmetic Circuits
Carry Ripple versus Carry BypassCarry Ripple versus Carry Bypass
N
tp
ripple adder
bypass adder
4..8
EE14126
© Digital Integrated Circuits2ndArithmetic Circuits
Carry-Select AdderCarry-Select AdderSetup
"0" Carry Propagation
"1" Carry Propagation
Multiplexer
Sum Generation
Co,k-1 Co,k+3
"0"
"1"
P,G
Carry Vector
EE14127
© Digital Integrated Circuits2ndArithmetic Circuits
Carry Select Adder: Critical Path Carry Select Adder: Critical Path
0
1
Sum Generation
Multiplexer
1-Carry
0-Carry
Setup
Ci,0 Co,3 Co,7 Co,11 Co,15
S0–3
Bit 0–3 Bit 4–7 Bit 8–11 Bit 12–15
0
1
Sum Generation
Multiplexer
1-Carry
0-Carry
Setup
S4–7
0
1
Sum Generation
Multiplexer
1-Carry
0-Carry 0-Carry
Setup
S8–11
0
1
Sum Generation
Multiplexer
1-Carry
Setup
S
EE14128
© Digital Integrated Circuits2ndArithmetic Circuits
Linear Carry Select Linear Carry Select
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Bit 0-3 Bit 4-7 Bit 8-11 Bit 12-15
S0-3 S4-7 S8-11 S12-15
Ci,0
(1)
(1)
(5)(6) (7) (8)
(9)
(10)
(5) (5) (5)(5)
EE14129
© Digital Integrated Circuits2ndArithmetic Circuits
Square Root Carry Select Square Root Carry Select
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Setup
"0" Carry
"1" Carry
Multiplexer
Sum Generation
"0"
"1"
Bit 0-1 Bit 2-4 Bit 5-8 Bit 9-13
S0-1 S2-4 S5-8 S9-13
Ci,0
(4) (5) (6) (7)
(1)
(1)
(3) (4) (5) (6)
Mux
Sum
S14-19
(7)
(8)
Bit 14-19
(9)
(3)
EE14130
© Digital Integrated Circuits2ndArithmetic Circuits
Adder Delays - Comparison Adder Delays - Comparison
Square root select
Linear select
Ripple adder
20 40N
t p(in
un
it de
lays
)
600
10
0
20
30
40
50
EE14131
© Digital Integrated Circuits2ndArithmetic Circuits
LookAhead - Basic IdeaLookAhead - Basic Idea
Co k f A k Bk Co k 1– Gk P kCo k 1–+= =
AN-1, BN-1A1, B1
P1
S1
• • •
• • • SN-1
PN-1Ci, N-1
S0
P0Ci,0 Ci,1
A
EE14132
© Digital Integrated Circuits2ndArithmetic Circuits
Look-Ahead: TopologyLook-Ahead: Topology
Co k Gk Pk Gk 1– Pk 1– Co k 2–+ +=
Co k Gk Pk Gk 1– Pk 1– P1 G0 P0 Ci 0+ + + +=
Expanding Lookahead equations:
All the way:
Co,3
Ci,0
VDD
P0
P1
P2
P3
G0
G1
G2
EE14133
© Digital Integrated Circuits2ndArithmetic Circuits
Logarithmic Look-Ahead AdderLogarithmic Look-Ahead Adder
A7
F
A6A5A4A3A2A1
A0
A0
A1
A2
A3
A4
A5
A6
A7
F
tp log2(N)
tp N
EE14134
© Digital Integrated Circuits2ndArithmetic Circuits
Carry Lookahead TreesCarry Lookahead Trees
Co 0 G0 P0Ci 0+=
Co 1 G1 P1 G0 P1P0 Ci 0+ +=
Co 2 G2 P2G1 P2 P1G0 P+ 2 P1P0C i 0+ +=
G2 P2G1+ = P2P1 G0 P0Ci 0+ + G 2:1 P2:1Co 0+=
Can continue building the tree hierarchically.
EE14135
© Digital Integrated Circuits2ndArithmetic Circuits
Tree AddersTree Adders
16-bit radix-2 Kogge-Stone tree
(A0,
B0)
(A1,
B1)
(A2,
B2)
(A3,
B3)
(A4,
B4)
(A5,
B5)
(A6,
B6)
(A7,
B7)
(A8,
B8)
(A9,
B9)
(A10
, B10
)
(A11
, B11
)
(A12
, B12
)
(A13
, B13
)
(A14
, B14
)
(A15
, B15
)
S0
S1
S2
S3
S4
S5
S6
S7
S8
S9
S10
S11
S12
S13
S14
S15
EE14136
© Digital Integrated Circuits2ndArithmetic Circuits
Tree AddersTree Adders(a
0, b
0)
(a1, b
1)
(a2, b
2)
(a3, b
3)
(a4, b
4)
(a5, b
5)
(a6, b
6)
(a7, b
7)
(a8, b
8)
(a9, b
9)
(a1
0,
b1
0)
(a1
1,
b1
1)
(a1
2,
b1
2)
(a1
3,
b1
3)
(a1
4,
b1
4)
(a1
5,
b1
5)
S0
S1
S2
S3
S4
S5
S6
S7
S8
S9
S1
0
S1
1
S1
2
S1
3
S1
4
S1
5
16-bit radix-4 Kogge-Stone Tree
EE14137
© Digital Integrated Circuits2ndArithmetic Circuits
Sparse TreesSparse Trees(a
0, b
0)
(a1, b
1)
(a2, b
2)
(a3, b
3)
(a4, b
4)
(a5, b
5)
(a6, b
6)
(a7, b
7)
(a8, b
8)
(a9, b
9)
(a1
0,
b1
0)
(a1
1,
b1
1)
(a1
2,
b1
2)
(a1
3,
b1
3)
(a1
4,
b1
4)
(a1
5,
b1
5)
S1
S3
S5
S7
S9
S1
1
S1
3
S1
5
S0
S2
S4
S6
S8
S1
0
S1
2
S1
4
16-bit radix-2 sparse tree with sparseness of 2
EE14138
© Digital Integrated Circuits2ndArithmetic Circuits
Tree AddersTree Adders(A
0, B
0)
(A1,
B1)
(A2,
B2)
(A3,
B3)
(A4,
B4)
(A5,
B5)
(A6,
B6)
(A7,
B7)
(A8,
B8)
(A9,
B9)
(A10
, B
10)
(A11
, B
11)
(A12
, B
12)
(A13
, B
13)
(A14
, B
14)
(A15
, B
15)
S0
S1
S2
S3
S4
S5
S6
S7
S8
S9
S10
S11
S12
S13
S14
S15
Brent-Kung Tree
EE14139
© Digital Integrated Circuits2ndArithmetic Circuits
Example: Domino AdderExample: Domino Adder
VDD
Clk Pi= ai + bi
Clk
ai bi
VDD
Clk Gi = aibi
Clk
ai
bi
Propagate Generate
EE14140
© Digital Integrated Circuits2ndArithmetic Circuits
Example: Domino AdderExample: Domino Adder
VDD
Clkk
Pi:i-k+1
Pi-k:i-2k+1
Pi:i-2k+1
VDD
Clkk
Gi:i-k+1
Pi:i-k+1
Gi-k:i-2k+1
Gi:i-2k+1
Propagate Generate
EE14141
© Digital Integrated Circuits2ndArithmetic Circuits
Example: Domino SumExample: Domino SumVDD
Clk
Gi:0
Clk
Sum
VDD
Clkd
Clk
Gi:0
Clk
Si1
Clkd
Si0
Keeper
EE14142
© Digital Integrated Circuits2ndArithmetic Circuits
MultipliersMultipliers
EE14143
© Digital Integrated Circuits2ndArithmetic Circuits
The Binary MultiplicationThe Binary Multiplication
Z X·· Y Zk2k
k 0=
M N 1–+
= =
Xi2i
i 0=
M 1–
Yj2j
j 0=
N 1–
=
XiYj2i j+
j 0=
N 1–
i 0=
M 1–
=
X Xi2i
i 0=
M 1–
=
Y Yj2j
j 0=
N 1–
=
with
EE14144
© Digital Integrated Circuits2ndArithmetic Circuits
The Binary MultiplicationThe Binary Multiplication
x
Partial products
Multiplicand
Multiplier
Result
1 0 1 0 1 0
1 0 1 0 1 0
1 0 1 0 1 0
1 1 1 0 0 1 1 1 0
0 0 0 0 0 0
1 0 1 0 1 0
1 0 1 1
EE14145
© Digital Integrated Circuits2ndArithmetic Circuits
The Array MultiplierThe Array MultiplierY0
Y1
X3 X2 X1 X0
X3
HA
X2
FA
X1
FA
X0
HA
Y2X3
FA
X2
FA
X1
FA
X0
HA
Z1
Z3Z6Z7 Z5 Z4
Y3X3
FA
X2
FA
X1
FA
X0
HA
EE14146
© Digital Integrated Circuits2ndArithmetic Circuits
The MxN Array MultiplierThe MxN Array Multiplier— Critical Path— Critical Path
HA FA FA HA
HAFAFAFA
FAFA FA HA
Critical Path 1
Critical Path 2
Critical Path 1 & 2
EE14147
© Digital Integrated Circuits2ndArithmetic Circuits
Carry-Save MultiplierCarry-Save MultiplierHA HA HA HA
FAFAFAHA
FAHA FA FA
FAHA FA HA
Vector Merging Adder
EE14148
© Digital Integrated Circuits2ndArithmetic Circuits
Multiplier FloorplanMultiplier Floorplan
SCSCSCSC
SCSCSCSC
SCSCSCSC
SC
SC
SC
SC
Z0
Z1
Z2
Z3Z4Z5Z6Z7
X0X1X2X3
Y1
Y2
Y3
Y0
Vector Merging Cell
HA Multiplier Cell
FA Multiplier Cell
X and Y signals are broadcastedthrough the complete array.( )
EE14149
© Digital Integrated Circuits2ndArithmetic Circuits
Wallace-Tree MultiplierWallace-Tree Multiplier
6 5 4 3 2 1 0 6 5 4 3 2 1 0
Partial products First stage
Bit position
6 5 4 3 2 1 0 6 5 4 3 2 1 0
Second stage Final adder
FA HA
(a) (b)
(c) (d)
EE14150
© Digital Integrated Circuits2ndArithmetic Circuits
Wallace-Tree MultiplierWallace-Tree Multiplier
Partial products
First stage
Second stage
Final adder
FA FA FA
HA HA
FA
x3y3
z7 z6 z5 z4 z3 z2 z1 z0
x3y2x2y3
x1y1x3y0 x2y0 x0y1x0y2
x2y2x1y3
x1y2x3y1x0y3 x1y0 x0y
EE14151
© Digital Integrated Circuits2ndArithmetic Circuits
Wallace-Tree MultiplierWallace-Tree Multiplier
FA
FA
FA
FA
y0 y1 y2
y3
y4
y5
S
Ci-1
Ci-1
Ci-1
Ci
Ci
Ci
FA
y0 y1 y2
FA
y3 y4 y5
FA
FA
CC S
Ci-1
Ci-1
Ci-1
Ci
Ci
Ci
EE14152
© Digital Integrated Circuits2ndArithmetic Circuits
Multipliers —SummaryMultipliers —Summary
• Optimization Goals Different Vs Binary Adder
• Once Again: Identify Critical Path
• Other possible techniques
- Data encoding (Booth)- Pipelining
FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION
- Logarithmic versus Linear (Wallace Tree Mult)
EE14153
© Digital Integrated Circuits2ndArithmetic Circuits
ShiftersShifters
EE14154
© Digital Integrated Circuits2ndArithmetic Circuits
The Binary ShifterThe Binary Shifter
Ai
Ai-1
Bi
Bi-1
Right Leftnop
Bit-Slice i
...
EE14155
© Digital Integrated Circuits2ndArithmetic Circuits
The Barrel ShifterThe Barrel Shifter
Sh3Sh2Sh1Sh0
Sh3
Sh2
Sh1
A3
A2
A1
A0
B3
B2
B1
B0
: Control Wire
: Data Wire
Area Dominated by Wiring
EE14156
© Digital Integrated Circuits2ndArithmetic Circuits
4x4 barrel shifter4x4 barrel shifter
BufferSh3Sh2Sh1Sh0
A3
A2
A1
A0
Widthbarrel ~ 2 pm M
EE14157
© Digital Integrated Circuits2ndArithmetic Circuits
Logarithmic ShifterLogarithmic ShifterSh1 Sh1 Sh2 Sh2 Sh4 Sh4
A3
A2
A1
A0
B1
B0
B2
B3
EE14158
© Digital Integrated Circuits2ndArithmetic Circuits
A3
A2
A1
A0
Out3
Out2
Out1
Out0
0-7 bit Logarithmic Shifter0-7 bit Logarithmic Shifter