VLSI Arithmetic Lecture 10: Multipliers Prof. Vojin G. Oklobdzija University of California http://www.ece.ucdavis.edu/acsel


VLSI Arithmetic

Lecture 10: Multipliers

Prof. Vojin G. Oklobdzija
University of California

http://www.ece.ucdavis.edu/acsel

April 22, 2023 2

Multiplication Algorithm*

*from Parhami

Multiplication*

*from Parhami

Multiplier Recoding*

*from Parhami

Multiplication by Constants

*from Parhami

Fast Multipliers

*from Parhami

Using Higher Radix Multiplier

*from Parhami

Higher Radix Multiplier

*from Parhami

Booth’s Recoding

*from Parhami

Higher Radix Multipliers

*from Parhami

Tree and Array Multipliers

*from Parhami

Generating Partial Products

*from G. Bewick

Generating Partial Products using Booth’s Recoding

*from G. Bewick

Booth Partial Product Selector Logic

*from G. Bewick

Tree Multipliers

*from Parhami

Reduction using 4:2 Compressors

*from G. Bewick

Multiplier Design, April 22, 2023

A Method for Generation of Fast Parallel Multipliers

by

Vojin G. Oklobdzija
David Villeger
Simon S. Liu

Electrical and Computer Engineering
University of California
Davis

Fast Parallel Multipliers

Objective

Improved Speed of Parallel Multiplier via:

• Improvements in Partial-Product Bit Reduction Techniques

• Optimization of the Final Adder for the Uneven Signal Arrival Profile from the Multiplier Tree

Multiplication

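Algorithm:

$$P = XY = X \times \sum_{i=0}^{n-1} y_i r^i = \sum_{i=0}^{n-1} X \times y_i r^i$$

$$p^{(0)} = 0 \quad \text{initially}$$

$$p^{(j+1)} = \frac{1}{r}\left(p^{(j)} + r^n X y_j\right) \quad \text{for } j = 0, \ldots, n-1$$

$$p^{(n)} = XY \quad \text{after } n \text{ steps}$$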
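The recurrence on this slide can be checked numerically. The following is a minimal sketch, assuming non-negative integer operands and a multiplier Y that fits in n radix-r digits; the function name `shift_add_multiply` is hypothetical and not part of the lecture.

```python
def shift_add_multiply(x, y, n, r=2):
    """Evaluate p(j+1) = (p(j) + r**n * x * y_j) / r for j = 0..n-1, with p(0) = 0.

    y_j is digit j of y in radix r (least significant digit first).
    Assumes x >= 0 and 0 <= y < r**n.
    """
    digits = [(y // r**i) % r for i in range(n)]  # y_0 .. y_(n-1)
    p = 0                                         # p(0) = 0 initially
    for j in range(n):
        # The division is exact: p(j) is a multiple of r**(n-j) and n-j >= 1.
        p = (p + r**n * x * digits[j]) // r
    return p                                      # p(n) = x * y after n steps


if __name__ == "__main__":
    # Operands taken from the 12 x 12 example later in the deck (B54 * B1B).
    print(hex(shift_add_multiply(0xB54, 0xB1B, n=12)))
```

Integer arithmetic is used so that every intermediate p(j) stays exact; a hardware implementation would instead keep a fixed-width register and shift right by one digit per step.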

Parallel Multipliers

(figure labels: Step 0, Step 1, Step 2, Step 3, Step 4)

Minimum Number of Stages (Dadda’s Rule)
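Dadda's rule is commonly stated through the sequence of maximum column heights 2, 3, 4, 6, 9, 13, 19, 28, ..., where each term is the previous one multiplied by 3/2 and rounded down. A short sketch, assuming reduction with (3,2) counters only (the function name `min_counter_stages` is hypothetical):

```python
def min_counter_stages(rows):
    """Minimum number of (3,2)-counter stages needed to reduce `rows`
    partial-product rows to 2 rows, using the height sequence
    d_1 = 2, d_(j+1) = floor(3 * d_j / 2).
    """
    height, stages = 2, 0
    while height < rows:
        height = height * 3 // 2   # next term of 2, 3, 4, 6, 9, 13, 19, 28, ...
        stages += 1
    return stages


if __name__ == "__main__":
    for n in (3, 4, 6, 9, 13, 19, 24, 32, 64):
        print(n, min_counter_stages(n))
```

For a 24-row matrix this gives 7 stages, since 19 < 24 <= 28.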

Their Schemes

Use of 4:2 Compressors

A. Weinberger 1981
M. Santoro 1988

4:2 Compressor

(block diagram of a 4-2 cell; labels: I1, I2, I3, I4, Co, Ci, C, S)
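The behavior of the 4:2 cell can be illustrated with a small bit-level model. This sketch assumes the common construction from two cascaded full adders, which the slide itself does not spell out; the function names are hypothetical, and the port names follow the figure labels.

```python
from itertools import product


def full_adder(a, b, c):
    """(3,2) counter: returns (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(i1, i2, i3, i4, ci):
    """4:2 compressor modeled as two cascaded full adders (an assumption).

    Returns (s, c, co). co depends only on I1..I3, never on ci, so a row of
    these cells does not ripple a carry horizontally.
    """
    s1, co = full_adder(i1, i2, i3)
    s, c = full_adder(s1, i4, ci)
    return s, c, co


def check_compressor():
    """Exhaustively confirm I1+I2+I3+I4+Ci = S + 2*(C + Co)."""
    for i1, i2, i3, i4, ci in product((0, 1), repeat=5):
        s, c, co = compressor_4_2(i1, i2, i3, i4, ci)
        assert i1 + i2 + i3 + i4 + ci == s + 2 * (c + co)
    return True


if __name__ == "__main__":
    print(check_compressor())
```

The redesigned cells discussed on the following slides compute the same function with a shorter critical path; this model only captures the arithmetic, not the delay.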

Critical Signal Path in a 4:2 Compressor Tree

Re-designed 4:2 Compressor with 3 XOR Delay (Nagamatsu, Toshiba)

(circuit diagram; labels: Cin, I1, I2, I3, I4, 0, 1, S, C, Cout)

Critical Path in a 4:2 Compressor

Signal Arrival Profile for RWT (3:2) and MWT (4:2)

Using 9:2 Compressors

(P. Song, G. De Micheli 1991)

Compressor Tree Implemented with 9:2 Compressors

9:2 Compressor Structure

Critical Path: (Equivalent XOR Gate Delays)

(plot: delay in XOR gates, 0 to 24, versus multiplier width in bits, 0 to 100; curves: 3,2 Counter, 9to2 Compressor (Redesigned), 4:2 Compressor (Redesigned))

Delay Expressed as No. of XOR Gate Delays

Use of Higher-Order Compressors

D. Villeger, V.G. Oklobdzija 1993

Design of a 13:2 Compressor from a 9:2 Compressor

Delay Profile of a 24:2 Compressor Tree

Compressor Family Characteristics

Using Carry-Propagate Adders

(G. Bewick 1993) (D. Villeger & V. G. Oklobdzija 1993)

Column Compression Tree Consisting of 4-bit Adders

Bit Reduction Using 4-bit Adders (24X24)

Idea !!!!!

A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach: TDM

(Oklobdzija, Villeger, Liu, 1994)

Partial Product Matrix Divided into Vertical Compressor Slices

(figure labels: Vertical Slices, Horizontal Propagation, Carry and Sum Connection to the Final Adder, Carry Propagate Adder)

Example of a 12 X 12 Multiplication

(Partial Product for X*Y = B54 * B1B)

1 0 1 1 0 1 0 1 0 1 0 0
1 0 1 1 0 1 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 1 0 1 0 1 0 0
1 0 1 1 0 1 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 1 0 1 0 1 0 0
1 0 1 1 0 1 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 1 0 1 0 1 0 0

Vertical Compressor Slice - VCS

3-Dimensional View of Partial Product Reduction

(figure: full adders (FA) stacked along the Time axis within one vertical slice, ending at the Final Adder; sample slice bits 0 0 1 1 0 1 0)
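The partial-product rows of this example can be regenerated from the two operands. The sketch below is illustrative only; `partial_product_rows` is a hypothetical helper name, and row i is understood to carry weight 2**i (that is, it is shifted left by i positions in the matrix).

```python
def partial_product_rows(x, y, n):
    """Row i equals x when bit i of y is 1, and 0 otherwise (LSB row first)."""
    return [x if (y >> i) & 1 else 0 for i in range(n)]


def rows_as_text(rows, n):
    """Format each row as n space-separated bits, most significant bit first."""
    return [" ".join(format(row, "0{}b".format(n))) for row in rows]


if __name__ == "__main__":
    X, Y, N = 0xB54, 0xB1B, 12
    rows = partial_product_rows(X, Y, N)
    for line in rows_as_text(rows, N):
        print(line)
    # Summing the rows with their weights must reproduce the product.
    assert sum(row << i for i, row in enumerate(rows)) == X * Y
```

Each column of the shifted matrix is what the slides call a vertical compressor slice: all bits of equal weight that must be reduced to two before the final adder.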

Signal Delays in a Full Adder (3,2) Counter

(figure labels: A, B, Cin, Sum, Carry, Fast Input, Fast Output)

Three-Dimensional Optimization Method: TDM
(Oklobdzija, Villeger, Liu, 1996)

(figure: two full adders with ports A, B, Cin, Sum, Carry next to a 4:2 cell with ports I1, I2, I3, I4, Cin, Cout; annotation: 3 XOR delays)

Method

(figure: full adders with ports a, b, cin, s, c annotated with signal arrival times; panels: TDM Arrangement, Worst Case)

Two cases of signals passing through the next level

(figure: full adders with ports a, b, cin, s, c annotated with signal arrival times; panels: Best Balanced Case, Average Case)

Example of Delay Optimization

Example of an Optimized Interconnection

(figure: full adders at bit (n-1) position and bit (n) position; inputs arrive after 2 xor, 0 xor and 1 xor; outputs are ready after 3 xor and 3 xor)

Example of a not Optimized Interconnection

(figure: the same adders and the same input arrival times of 2 xor, 0 xor and 1 xor; outputs are ready after 4 xor and 3 xor)

The 9th Vertical Compressor Slice of a Multiplier

(figure: a chain of adders with ports A, B, Cin, C, S; input arrival times start at 0 and the annotated delays grow from .5 through 5 XOR delays)

Computer Tools

Method for Optimal Interconnection

(figure: full adders with ports a, b, cin, s, c at Level i and Level i+1; labels: Signals with Different Delay Levels; Signals with short delays; Signals with intermediate delays; Signals with long delays; Add slow input delay; Add fast input delay; Bypass Signals, No Delay Added)

Design of Parallel Multipliers

Algorithm for Automatic Generation of Partial Product Array.

Initialize:

Form 2N-1 lists Li ( i = 0,...,2N-2 ), each consisting of pi elements, where:

pi = i+1 for i ≤ N-1 and pi = 2N-1-i for i ≥ N

An element j of a list Li ( j = 0,...,pi-1 ) is a pair <nj, Dj>i where:

nj : is a unique node-identifying name

Dj : is the delay of the signal arriving at node nj, measured with respect to some reference point.

For i = 0, 1 and 2N-2: connect the nodes from the corresponding lists Li directly to the CPA.
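The initialization step can be sketched as follows. The node names (`pp_<column>_<index>`) and the choice of a zero starting delay for every partial-product bit are assumptions made for illustration; the slide only requires that each name be unique and each delay be measured from a common reference point.

```python
def initial_column_lists(n):
    """Build the 2N-1 lists L_0 .. L_(2N-2) of (node_name, delay) pairs.

    List L_i holds p_i elements, with p_i = i+1 for i <= N-1 and
    p_i = 2N-1-i for i >= N.
    """
    lists = []
    for i in range(2 * n - 1):
        p_i = i + 1 if i <= n - 1 else 2 * n - 1 - i
        # Hypothetical naming scheme; every delay starts at 0.0 XOR delays.
        lists.append([("pp_{}_{}".format(i, j), 0.0) for j in range(p_i)])
    return lists


if __name__ == "__main__":
    heights = [len(column) for column in initial_column_lists(4)]
    print(heights)  # column heights for a 4 x 4 multiplier
```

The column heights add up to N*N, one entry per bit of the partial-product matrix.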

Delays

$$\mathrm{Delay}(S) = \max\{\mathrm{Delay}(A) + D_{A-S},\ \mathrm{Delay}(B) + D_{B-S},\ \mathrm{Delay}(C_{in}) + D_{C_{in}-S}\}$$

$$\mathrm{Delay}(C) = \max\{\mathrm{Delay}(A) + D_{A-C},\ \mathrm{Delay}(B) + D_{B-C},\ \mathrm{Delay}(C_{in}) + D_{C_{in}-C}\}$$

In our case the delays in a FA are:

$FA_{A \to S} = FA_{B \to S} = 2$ XOR delays, and $FA_{C_{in} \to S} = FA_{A \to C} = FA_{B \to C} = FA_{C_{in} \to C} = 1$ XOR delay.

In a HA:

$HA_{A \to S} = HA_{B \to S} = 1$ XOR delay, while $HA_{A \to C} = HA_{B \to C} = 0.5$ XOR delay.
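These delay equations translate directly into code. The sketch below uses the FA and HA pin-to-pin delays quoted on this slide; the function names are hypothetical. It also reproduces the optimized and non-optimized interconnection example, where inputs arrive after 0, 1 and 2 XOR delays.

```python
from itertools import permutations

# Pin-to-pin delays in XOR-delay units, as given on the slide.
FA = {"A-S": 2.0, "B-S": 2.0, "Cin-S": 1.0, "A-C": 1.0, "B-C": 1.0, "Cin-C": 1.0}
HA = {"A-S": 1.0, "B-S": 1.0, "A-C": 0.5, "B-C": 0.5}


def fa_output_delays(a, b, cin):
    """Return (Delay(S), Delay(C)) of a full adder for given input arrival times."""
    s = max(a + FA["A-S"], b + FA["B-S"], cin + FA["Cin-S"])
    c = max(a + FA["A-C"], b + FA["B-C"], cin + FA["Cin-C"])
    return s, c


def ha_output_delays(a, b):
    """Return (Delay(S), Delay(C)) of a half adder for given input arrival times."""
    return max(a + HA["A-S"], b + HA["B-S"]), max(a + HA["A-C"], b + HA["B-C"])


def best_fa_assignment(arrivals):
    """Try every pin assignment of three signals; keep the one with the earliest outputs."""
    return min(permutations(arrivals), key=lambda pins: fa_output_delays(*pins))


if __name__ == "__main__":
    print(fa_output_delays(0, 1, 2))  # latest signal on Cin
    print(fa_output_delays(2, 0, 1))  # latest signal on A
    print(best_fa_assignment((2, 0, 1)))
```

Routing the latest-arriving signal to Cin makes both outputs ready after 3 XOR delays, whereas routing it to A delays the sum to 4 XOR delays, which matches the two interconnection examples in the deck.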

For i = 2 to i = 2N-3 {Partial Product Array Generation}
Begin For
    if length of Li is even Then
    Begin If
        sort the elements of Li in ascending order by the values of delay Dj
        connect an HA to the first 2 elements of Li starting with the slowest input
        Ds = max {Delay(A) + DA-S, Delay(B) + DB-S}
        Dc = max {Delay(A) + DA-C, Delay(B) + DB-C}
        remove 2 elements from Li
        insert the pair <Ds,NetName> into Li
        insert the pair <Dc,NetName> into Li+1
        decrement the length of Li
        increment the length of Li+1
    End If;

    while length of Li > 3
    Begin While
        sort the elements of Li in ascending order by the values of delay Dj
        connect an FA to the first 3 elements of Li starting with the slowest input of the FA:
        Ds = max {Delay(A) + DA-S, Delay(B) + DB-S, Delay(Cin) + DCin-S}
        Dc = max {Delay(A) + DA-C, Delay(B) + DB-C, Delay(Cin) + DCin-C}
        remove 3 elements from Li
        insert the pair <Ds,NetName> into Li
        insert the pair <Dc,NetName> into Li+1
        subtract 2 from the length of Li
        increment the length of Li+1
    End While;

    sort the elements of Li
    connect an FA to the last 3 nodes of Li
    connect the S and C to the bit i and i+1 of the CPA
End For;
End Method;
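The procedure above can be prototyped in a few dozen lines. What follows is a rough sketch rather than the authors' tool. It assumes all partial-product bits arrive at time 0, and it carries the bit value next to each delay so the reduction can be checked against X*Y. Two choices are my own: a column left with a single signal is passed straight to the CPA, and the same rule is applied to the columns above 2N-3, because carries spill into them. The name `tdm_reduce` is hypothetical.

```python
def tdm_reduce(x, y, n):
    """Reduce the n-by-n partial-product matrix of x*y, column by column.

    Every signal is a (delay, bit) pair. Returns (cpa, value): cpa[i] holds the
    signals handed to bit i of the final adder, and value is the number those
    signals represent, which should equal x*y.
    """
    FA_AB_S, FA_CIN_S, FA_C = 2.0, 1.0, 1.0   # full-adder delays in XOR units
    HA_S, HA_C = 1.0, 0.5                     # half-adder delays in XOR units
    width = 4 * n                             # generous headroom (assumption)
    cols = [[] for _ in range(width)]
    cpa = [[] for _ in range(width)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append((0.0, (x >> i) & (y >> j) & 1))

    def by_delay(sig):
        return sig[0]

    for i in range(width - 1):
        col = cols[i]
        if i < 2:                             # columns 0 and 1 go straight to the CPA
            cpa[i].extend(col)
            continue
        col.sort(key=by_delay)
        if col and len(col) % 2 == 0:         # even length: one HA on the two earliest
            (da, a), (db, b) = col[0], col[1]
            del col[:2]
            col.append((max(da, db) + HA_S, a ^ b))
            cols[i + 1].append((max(da, db) + HA_C, a & b))
        while len(col) >= 3:
            col.sort(key=by_delay)
            # Earliest two signals take A and B (slow pins); the third takes Cin.
            (da, a), (db, b), (dc, c) = col[:3]
            del col[:3]
            s = (max(da + FA_AB_S, db + FA_AB_S, dc + FA_CIN_S), a ^ b ^ c)
            cy = (max(da, db, dc) + FA_C, (a & b) | (a & c) | (b & c))
            if col:                           # more signals remain in this column
                col.append(s)
                cols[i + 1].append(cy)
            else:                             # last FA: S and C feed CPA bits i, i+1
                cpa[i].append(s)
                cpa[i + 1].append(cy)
        cpa[i].extend(col)                    # zero or one leftover signal
    value = sum(bit << i for i, sigs in enumerate(cpa) for _, bit in sigs)
    return cpa, value


if __name__ == "__main__":
    cpa, value = tdm_reduce(0xB54, 0xB1B, 12)
    assert value == 0xB54 * 0xB1B
    # Arrival profile seen by the final adder, one entry per bit position.
    print([max((d for d, _ in sigs), default=0.0) for sigs in cpa[:24]])
```

The printed list is the kind of uneven signal arrival profile the final adder has to be tuned to; the exact numbers depend on the simplifications stated above and are not claimed to match the published TDM results.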

Competing Approaches

Comparison between TDM and other representative schemes, in XOR levels.

Multiplier     Wallace     4:2 Tree    Fadavi-Ardekani    TDM
Word-length    Tree [7]    [11]        [16]
 3              2           2           2                  2
 4              4           3           3                  3
 6              6           6           5                  5
 8              8           6           7                  5
 9              8           8           7                  6
11             10           9           8                  7
12             10           9           8                  7
16             12           9          10                  8
19             12          12          11                  9
24             14          12          12                 10
32             16          12          13                 11
42             16          15          14                 12
53             18          15          15                 13
64             20          15          16                 14
95             20          18          17                 15

Critical Path Delay [CMOS: Leff = 1 µ, T = 25 °C, Vcc = 5 V]

N = 24 bits    4:2 Design    9:2 Design    Fadavi-Ardekani    TDM Design
Delay [ns]     14.0          13.0          11.7               10.5

Equivalent XOR Delays

(plot: delay in XOR levels, 0 to 24, versus multiplier width, 0 to 100; curves: TDM, Fadavi-Ardekani, 9:2, 4:2, 3,2)

Algorithm for Implementation of Fast Parallel Multipliers

[1] V. G. Oklobdzija, D. Villeger, and S. S. Liu, “A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach,” IEEE Transactions on Computers, Vol. 45, No. 3, March 1996.

[2] V. G. Oklobdzija and D. Villeger, “Improving Multiplier Design by Using Improved Column Compression Tree and Optimized Final Adder in CMOS Technology,” IEEE Transactions on VLSI Systems, Vol. 3, No. 2, June 1995, 25 pages.

[3] V. G. Oklobdzija and D. Villeger, “Multiplier Design Utilizing Improved Column Compression Tree and Optimized Final Adder in CMOS Technology,” Proceedings of the 1993 International Symposium on VLSI Technology, Systems and Applications, pp. 209-212, 1993.

[4] P. Stelling, C. Martel, V. G. Oklobdzija, and R. Ravi, “Optimal Circuits for Parallel Multipliers,” IEEE Transactions on Computers, Vol. 47, No. 3, pp. 273-285, March 1998.

Organization of Hitachi's DPL multiplier

(block diagram; labels: 54 bit, 54 bit, Booth's Encoder, Wallace's tree of 4-2 cells, 108 bit, 108-b CLA Adder, Conditional Carry Selection (CCS))

Hitachi's 4:2 compressor structure

(circuit diagram built from MUX cells; labels: I1, I2, I3, I4, Ci, Co, C, S; annotation: 3 GATES)

DPL multiplexer circuit

(circuit diagram; labels: MUX, D0, D1, S, OUT, H, L)

Addition Under Non-equal Signal Arrival Profile

Assumption

P. Stelling, V. G. Oklobdzija, "Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier", special issue on VLSI Arithmetic, Journal of VLSI Signal Processing, Kluwer Academic Publishers, Vol. 14, No. 3, December 1996

Signal Arrival Profile from the Parallel Multiplier Partial-Product Reduction Tree

Oklobdzija and Villeger, IEEE Transactions on VLSI Systems, June, 1995

Performing Multiply-Add Operation in the Multiply Time

P. Stelling, V. G. Oklobdzija, "Achieving Multiply-Accumulate Operation in the Multiply Time", Thirteenth International Symposium on Computer Arithmetic, Pacific Grove, California, July 5-9, 1997.

Final Adder: Implementation

RECOMMENDATIONS

Fast Parallel Multipliers

• Different Counter and Compressor Families were compared. The best way is to build a compressor of the maximal size (i.e. the entire size of the multiplier)

• The Essence of the optimal tree is optimal wiring and NOT the choice of counter/compressor family

• The use of Carry-Propagate Adders is advantageous in the first stage of larger multipliers, and for particular technologies

• Tuning of the Final Adder to the signal arrival profile is more important than the speed of the Final Adder.

THE END