7/28/2019 Lecture ASIP 8, 9
http://slidepdf.com/reader/full/lecture-asip-8-9 1/42
VLSI Architecture :: MEL G642
Dr. A. Amalin Prince
BITS Pilani K.K. Birla Goa Campus
Department of Electrical, Electronics and Instrumentation Engineering
Contents
– MAC fundamentals
– MAC implementations
– A MAC case study
– MAC integration
Datapath in a DSP processor
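[Figure: block diagram of a DSP processor. The data path (DP) contains the RF, ALU and MAC. The addressing path (AGU) contains AGU1 and AGU2. Also shown: the control path (CP), the memories PM, DM1 and DM2, and the processor memory and register busses.]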
MAC general
MAC instructions
– Multiplication arithmetic
– MAC and iterative instructions
– Double-precision arithmetic instructions
– Move data from and to MAC
– Other instructions
Why MAC
MAC: Multiplication and accumulation unit
– Performs convolution-based algorithms
  o FIR, IIR, autocorrelation, cross-correlation
– Supports most transformation algorithms
  o FFT and DCT need MAC hardware
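[Figure: 5-tap FIR filter. The input x(n) passes through four Z-1 delay elements, giving x(n-1), x(n-2), x(n-3) and x(n-4). Each tap is multiplied by its coefficient c(0) to c(4), and the products are summed to produce y(n).]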
Why MAC
Data x(n) is shifted through a FIFO buffer consisting of 4 registers, so that x(n) becomes x(n-1), x(n-1) becomes x(n-2), and so on, in the next clock cycle.
$$ y(n) = \sum_{i=0}^{m-1} x(n-i)\, c(i) $$
All arithmetic operations are mapped to hardware in parallel.
There are five multipliers (one per coefficient c(0) to c(4)) and four full adders.
A sample of y(n) is computed per clock cycle:
y(n) = x(n)*c(0) + x(n-1)*c(1) + x(n-2)*c(2) + x(n-3)*c(3) + x(n-4)*c(4)
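The FIR sum above can be checked with a small software model. This is only a behavioural sketch: the function names `fir_sample` and `fir_filter` are illustrative, and the delay line is modelled as a Python list rather than hardware registers.

```python
def fir_sample(taps, coeffs):
    """Compute y(n) = sum over i of x(n-i) * c(i); taps[0] holds x(n)."""
    return sum(x * c for x, c in zip(taps, coeffs))


def fir_filter(samples, coeffs):
    """Shift each new sample into the delay line and emit one y(n) per step."""
    taps = [0] * len(coeffs)          # x(n), x(n-1), ..., all cleared at start
    outputs = []
    for sample in samples:
        taps = [sample] + taps[:-1]   # x(n) becomes x(n-1), and so on
        outputs.append(fir_sample(taps, coeffs))
    return outputs


# Feeding a unit impulse returns the coefficients, one per step.
print(fir_filter([1, 0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))
```

In the fully parallel hardware each loop iteration corresponds to one clock cycle. A single MAC unit would instead spend one cycle per term of the sum.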
MAC basics
MAC: Multiplication and accumulation unit
– Adder = accumulator; accumulator register
[Figure: (a) MAC without pipeline; (b) MAC with pipeline. Blocks shown in both: multiplier (inputs MOA, MOB), accumulator (inputs AOA, AOB), ACR = accumulating register, flag circuit. Variant (b) adds a pipeline stage.]
MUL circuit
Multiplications
How to manage double precision?
How to manage signed?
[Figure: hardware multiplication options: fractional or integer multiplication; signed or unsigned multiplication; result with double precision or single precision.]
The 16-bit signed and 16-bit unsigned multiplications can both be implemented on a 17b × 17b signed multiplier.
In general, an (N+1)×(N+1) signed multiplier can give N-bit signed and unsigned multiplication.
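The idea behind the 17b × 17b signed multiplier can be sketched in software: each 16-bit operand is widened by one bit, either a copy of its sign bit (signed) or a zero (unsigned), and a single signed multiply then serves both cases. The helper names `to_signed`, `extend17` and `mul17` are illustrative and do not come from the slides.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extend17(word16, signed):
    """Widen a 16-bit word to a 17-bit signed operand.

    signed=True  replicates bit 15 into bit 16 (sign extension).
    signed=False prepends a 0, so the 17-bit value is never negative.
    """
    word16 &= 0xFFFF
    return to_signed(word16, 16) if signed else word16


def mul17(a16, b16, a_signed, b_signed):
    """One signed 17x17 multiply covering all four signed/unsigned cases."""
    return extend17(a16, a_signed) * extend17(b16, b_signed)


print(mul17(0xFFFF, 0xFFFF, True, True))    # (-1) * (-1)
print(mul17(0xFFFF, 0xFFFF, False, False))  # 65535 * 65535
```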
Basic multiplication instructions
No.  Specifications on the result
M1   Signed integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {B[15], B[15:0]}
M2   Signed-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {A[15], A[15:0]} * {“0”, B[15:0]}
M3   Unsigned-signed integer multiplication, double-precision result
     ACR[31:0] <= {“0”, A[15:0]} * {B[15], B[15:0]}
M4   Unsigned-unsigned integer multiplication, double-precision result
     ACR[31:0] <= {“0”, A[15:0]} * {“0”, B[15:0]}
M5   Signed integer multiplication, single-precision result, no rounding
     ACR[31:16] <= SAT(2^16 * ({A[15], A[15:0]} * {B[15], B[15:0]}))
M6   Signed fractional multiplication, double-precision result
     ACR[31:0] <= SAT(2 * ({A[15], A[15:0]} * {B[15], B[15:0]}))
M7   Signed fractional multiplication, single-precision rounded result
     ACR[31:16] <= SAT(Round(2 * ({A[15], A[15:0]} * {B[15], B[15:0]})))
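A behavioural sketch of M1, M6 and M7 may help show why the fractional variants need the factor of 2 and the saturation. It assumes 16-bit operands in Q15 format, and it assumes rounding is done by adding half an LSB of the kept part (0x8000) before the lower half is dropped; the slides do not state the rounding mode. The function names are illustrative.

```python
def s16(word):
    """Interpret a 16-bit word as a signed two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def sat32(value):
    """Clamp to the signed 32-bit range, as SAT() does in the table."""
    return max(-(1 << 31), min((1 << 31) - 1, value))


def m1(a, b):
    """M1: signed integer multiply, full 32-bit result."""
    return s16(a) * s16(b)


def m6(a, b):
    """M6: signed fractional multiply; Q15 x Q15 gives Q30, doubled to Q31."""
    return sat32(2 * s16(a) * s16(b))


def m7(a, b):
    """M7: as M6, then round and keep ACR[31:16] (assumed round-half-up)."""
    return sat32(2 * s16(a) * s16(b) + 0x8000) >> 16


print(hex(m6(0x4000, 0x4000)))  # 0.5 * 0.5 in Q31
print(hex(m7(0x8000, 0x8000)))  # (-1) * (-1) cannot be represented: saturates
```

The second call is the one case where the doubled product overflows 32 bits, which is why saturation appears in M6 and M7 but not in M1.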
Multiplication of long data
ACR <= X[31:0] × Y[15:0]
ACR[47:0] <= {X[31], X[31:16]} * {Y[15], Y[15:0]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[15], Y[15:0]});

ACR <= X[31:0] × Y[31:0]
ACR[63:0] <= {X[31], X[31:16]} * {Y[31], Y[31:16]}
           + 2^-16 * ({“0”, X[15:0]} * {Y[31], Y[31:16]})
           + 2^-16 * ({X[31], X[31:16]} * {“0”, Y[15:0]})
           + 2^-32 * ({“0”, X[15:0]} * {“0”, Y[15:0]});
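The decomposition of a long multiplication into 17-bit signed partial products can be verified numerically. The sketch below works in integer weights, so the slide's factors of 2^-16 and 2^-32 relative to the top product become left shifts of the higher products by 16 and 32. The names `mul32x16` and `mul32x32` are illustrative.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split32(x):
    """Return (signed upper half, unsigned lower half) of a 32-bit word."""
    return to_signed(x >> 16, 16), x & 0xFFFF


def mul32x16(x, y):
    """X[31:0] * Y[15:0] from two partial products."""
    xh, xl = split32(x)
    ys = to_signed(y, 16)
    return ((xh * ys) << 16) + xl * ys


def mul32x32(x, y):
    """X[31:0] * Y[31:0] from four partial products."""
    xh, xl = split32(x)
    yh, yl = split32(y)
    return ((xh * yh) << 32) + ((xl * yh + xh * yl) << 16) + xl * yl


x, y = 0x89ABCDEF, 0x7654
print(mul32x16(x, y) == to_signed(x, 32) * to_signed(y, 16))
```

Only the upper halves carry a sign; the lower halves are zero-extended. This is why every partial product fits the same 17b × 17b signed multiplier.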
An example of MUL
MAC instructions
Guard Operations In MAC
MAC circuit
MAC instructions
Single step (signed) MAC
– Integer
– Fractional
(Signed) Convolution
– Integer
– Fractional
Difference between MAC and convolution
– In the control path, not shown here
Double-Precision Arithmetic
Double-Precision Arithmetic in MAC
No.  Specifications on the result
D1   Double-precision data add/sub double-precision data
     Saturate(ACRx[39:0] ± ACRy[39:0])
D2   Double-precision data add/sub single-precision data, aligned to LSB
     Saturate(ACRx[39:0] ± {24’bOPB[15], OPB[15:0]})
D3   Double-precision data add/sub single-precision data, aligned to MSB
     Saturate(ACRx[39:0] ± {8’bOPB[15], OPB[15:0], 16’b0})
D4   Double-precision data add/sub single-precision immediate
     Saturate(ACRx[39:0] ± {24’bimmediate[15], immediate[15:0]})
D5   Absolute operation on double-precision data
     if ACRx[39] Saturate(INV(ACRx[39:0]) + “1”) else ACRx
D6   Compare two double-precision data and set flags
     set flag: Saturate(ACRx[39:0] - ACRy[39:0])
D7   Simple scale by MUX instead of by shift logic
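A few of the double-precision operations can be modelled on a 40-bit accumulator as follows. This is a sketch under stated assumptions: accumulator values are held as signed Python integers, `Saturate` is taken to clamp to the signed 40-bit range, and only the add forms of D1 to D3 plus D5 are shown. The function names are illustrative.

```python
ACR_MAX = (1 << 39) - 1   # largest signed 40-bit value
ACR_MIN = -(1 << 39)      # smallest signed 40-bit value


def saturate40(value):
    """Clamp a result to the signed 40-bit accumulator range."""
    return max(ACR_MIN, min(ACR_MAX, value))


def s16(word):
    """Interpret a 16-bit word as a signed two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def d1_add(acrx, acry):
    """D1: accumulator plus accumulator."""
    return saturate40(acrx + acry)


def d2_add_lsb(acrx, opb):
    """D2: add a 16-bit operand sign-extended by 24 bits (aligned to LSB)."""
    return saturate40(acrx + s16(opb))


def d3_add_msb(acrx, opb):
    """D3: add a 16-bit operand placed in bits 31:16 (aligned to MSB)."""
    return saturate40(acrx + (s16(opb) << 16))


def d5_abs(acrx):
    """D5: absolute value; negating ACR_MIN overflows, so it saturates."""
    return saturate40(-acrx) if acrx < 0 else acrx


print(d1_add(ACR_MAX, 1) == ACR_MAX)   # overflow is clamped, not wrapped
print(d5_abs(ACR_MIN) == ACR_MAX)      # the one case where abs saturates
```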
Scaling in DSP :: MAC
With Double Precision Arithmetic
Control Signals
Move / change data types
Move data from MAC and to MAC
Basic load
– Loads one half of ACRn and keeps the other half.
  o When loading the higher part, also fill in the guards
– Loads one half of ACRn and clears the other half.
  o When loading the higher part, also fill in the guards
– Loads both the lower and higher parts of ACRn.
  o Fill in the guards using the sign of the higher part
Move data to MAC
Specifications on the result
L1  ACRn <= {8’bA[15], A[15:0], ACRn[15:0]}  // keep lower part
L2  ACRn <= {8’bA[15], A[15:0], 16’h0000}    // clear lower part
L3  ACRn <= {ACRn[39:16], A[15:0]}           // keep higher part
L4  ACRn <= {24’bA[15], A[15:0]}             // sign extension into higher part
L5  ACRn <= {8’bA[15], A[15:0], B[15:0]}     // load A and B from RF
L6  ACRn <= {A[7:0], ACRn[31:0]}             // restore guards
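The bit concatenations for loads L1, L2, L3, L5 and L6 can be modelled with shifts and masks. In this sketch the 40-bit ACR is held as an unsigned bit pattern, `8’bA[15]` is read as eight copies of the sign bit of A, and the function names are illustrative rather than taken from the slides.

```python
def guard_bits(a):
    """Eight copies of A[15], positioned in ACR[39:32]."""
    return (0xFF if a & 0x8000 else 0x00) << 32


def load_l1(acr, a):
    """L1: load A into the higher part, keep ACR[15:0]."""
    return guard_bits(a) | ((a & 0xFFFF) << 16) | (acr & 0xFFFF)


def load_l2(a):
    """L2: load A into the higher part, clear the lower part."""
    return guard_bits(a) | ((a & 0xFFFF) << 16)


def load_l3(acr, a):
    """L3: load A into the lower part, keep ACR[39:16]."""
    return (acr & 0xFFFFFF0000) | (a & 0xFFFF)


def load_l5(a, b):
    """L5: load A into the higher part and B into the lower part."""
    return guard_bits(a) | ((a & 0xFFFF) << 16) | (b & 0xFFFF)


def load_l6(acr, a):
    """L6: restore the guard bits from A[7:0], keep ACR[31:0]."""
    return ((a & 0xFF) << 32) | (acr & 0xFFFFFFFF)


print(hex(load_l1(0x1234, 0x8001)))  # negative A fills the guards with ones
print(hex(load_l2(0x7FFF)))          # positive A leaves the guards at zero
```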
Move data to MAC: Logic
MAC Modified
Move data from MAC
Specifications on the result
1  Rn <= ACRn[31:16]                                          // Rn is a register in RF
2  Rn <= ACRn[15:0]                                           // Rn is a register in RF
3  Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]                      // Rn and Rn+1 in RF
4  M1 <= ACRn[31:16]; M2 <= ACRn[15:0];                       // M1, M2: memories
5  Rn <= ACRn[31:16]; Rn+1 <= ACRn[15:0]; Rn+2 <= ACRn[39:32]
6  Rn <= {8’h00, ACRn[39:32]};                                // guard to register file RF
MAC integration
Flags in MAC
Usually control code is implemented using ALU instructions
– Flags in the MAC are not used much
Mainly for exceptions
– MAC has
  o Saturation flag (FMO)
  o Sign flag (FMS)
  o Zero flag (FMZ)
Data operation sequence
Very important
– Add guard bits
– Operation (iteration) and scaling
– Round after iteration
– Saturation and removing guard bits
– Truncation and output
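The order of these steps can be illustrated with a Q15 dot product. The sketch makes several assumptions not stated on the slide: 16-bit Q15 inputs, a 40-bit accumulator (32-bit result plus 8 guard bits), rounding by adding 0x8000, and no scaling step. The name `dot_q15` is illustrative.

```python
def s16(word):
    """Interpret a 16-bit word as a signed two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def dot_q15(xs, cs):
    """Q15 dot product following the guard/iterate/round/saturate/truncate order."""
    # 1. Guard bits: the accumulator is wider than the 32-bit result.
    #    Python integers are unbounded, so this model assumes the loop is
    #    short enough that 8 guard bits would not overflow in hardware.
    acc = 0
    # 2. Iterate: accumulate fractional products (Q15 * Q15 * 2 = Q31).
    for x, c in zip(xs, cs):
        acc += 2 * s16(x) * s16(c)
    # 3. Round once, after the whole iteration (assumed round-half-up).
    acc += 0x8000
    # 4. Saturate to 32 bits, which removes the guard bits.
    acc = max(-(1 << 31), min((1 << 31) - 1, acc))
    # 5. Truncate: keep the upper 16 bits as the output word.
    return (acc >> 16) & 0xFFFF


print(hex(dot_q15([0x4000, 0x4000], [0x4000, 0x4000])))  # 0.25 + 0.25
print(hex(dot_q15([0x7FFF] * 4, [0x7FFF] * 4)))          # overflows: saturates
```

Rounding and saturating only once, at the end, keeps intermediate sums exact. Saturating inside the loop could clip a partial sum that later terms would have brought back into range.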
Physical critical path
What is the physical critical path?
Physical critical path
[Figure: MAC input path. The sources D-mem 1, D-mem 2, D-mem 3, D-mem 4, a constant, RF OPA (32-to-1 selection) and RF OPB (32-to-1 selection) reach the MAC input over long wires. Annotation: "As MAC input: very heavy fan-out here!"]
Physical critical path
[Figure: accumulator registers ACR1, ACR2, ..., ACRm with register select logic and data select logic. Annotations: heavy fan-out for MAC internal logic; long wire from RF and data memory.]
Pipeline
[Figure: three pipelining options, each built from a multiplier (*), an accumulator, the ACR and a flag circuit: (a) MAC in one clock cycle; (b) MAC using two clocks; (c) MAC using three clocks.]
Example :: MAC Design
Design a MAC unit capable of the following operations:
o OP0: No operation
o OP1: ACR = 0
o OP2: ACR = A * B (fractional multiplication, signed)
o OP3: ACR = A * B + ACR (fractional multiplication, signed)
o OP4: ACR = 1.25 * ACR (scaling)
o OP5: Load ACR with a fractional value from a register
o OP6: ACR = SATURATE(ROUND(ACR))
o OP7: RF = ACR[7:0]
o OP8: RF = ACR[15:8]
o OP9: RF = SIGNEXTEND(ACR[19:16])

Constraints:
• A and B are 8 bits; registers are 8 bits.
• ACR is 20 bits (including 4 guard bits).
• Only one multiplier may be used. You should select as small a multiplier as necessary. You also need to annotate whether it is signed or unsigned.
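As a starting point for this exercise, a behavioural model of OP0 to OP4 might look like the sketch below. It is not the intended solution. It assumes that A and B are signed Q7 fractions, that the fractional product is 2 * A * B, that OP3 and OP4 saturate to the 20-bit ACR (the slide does not say whether they saturate or wrap), and that the 1.25 scale is formed as ACR + ACR/4 with an arithmetic shift. The function names are illustrative.

```python
ACR_BITS = 20  # 16-bit result plus 4 guard bits


def s8(byte):
    """Interpret an 8-bit value as a signed two's-complement number."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def saturate20(value):
    """Clamp to the signed 20-bit ACR range (assumed overflow behaviour)."""
    limit = 1 << (ACR_BITS - 1)
    return max(-limit, min(limit - 1, value))


def frac_mul(a, b):
    """Signed fractional multiply: Q7 * Q7 doubled to Q15."""
    return 2 * s8(a) * s8(b)


def mac_step(acr, op, a=0, b=0):
    """Return the next ACR value for operations OP0 to OP4."""
    if op == "OP0":
        return acr                              # no operation
    if op == "OP1":
        return 0                                # clear
    if op == "OP2":
        return frac_mul(a, b)                   # multiply
    if op == "OP3":
        return saturate20(acr + frac_mul(a, b))  # multiply-accumulate
    if op == "OP4":
        return saturate20(acr + (acr >> 2))     # 1.25 * ACR without a multiplier
    raise ValueError(f"operation {op} is not modelled in this sketch")


acr = mac_step(0, "OP2", 0x40, 0x40)      # 0.5 * 0.5
acr = mac_step(acr, "OP3", 0x40, 0x40)    # add another 0.25
print(acr, mac_step(acr, "OP4"))
```

Forming 1.25 * ACR as an add plus a shift keeps OP4 off the multiplier, which matters under the single-multiplier constraint.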
Example :: MAC Design
Example :: MAC Design
The End :: Thank you for your attention
Questions?