View
224
Download
1
Embed Size (px)
Citation preview
Lecture 9 Sept 28
Chapter 3 Arithmetic for Computers
Arithmetic for Computers
Operations on integers Addition and subtraction Multiplication and division Dealing with overflow
Floating-point real numbers Representation and operations
§3.1
Intro
ductio
n
Integer Addition
Example: 7 + 6
§3.2
Add
ition
and
Subtra
ction
Overflow if result out of range Adding +ve and –ve operands, no overflow Adding two +ve operands
Overflow if result sign is 1 Adding two –ve operands
Overflow if result sign is 0
Integer Subtraction
Add negation of second operand Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111–6: 1111 1111 … 1111 1010+1: 0000 0000 … 0000 0001
Overflow if result out of range Subtracting two +ve or two –ve operands, no
overflow Subtracting +ve from –ve operand
Overflow if result sign is 0 Subtracting –ve from +ve operand
Overflow if result sign is 1
Dealing with Overflow
Some languages (e.g., C) ignore overflow Use MIPS addu, addui, subu instructions
Other languages (e.g., Ada, Fortran) require raising an exception Use MIPS add, addi, sub instructions On overflow, invoke exception handler
Save PC in exception program counter (EPC) register
Jump to predefined handler address mfc0 (move from coprocessor reg)
instruction can retrieve EPC value, to return after corrective action
Two’s-Complement Addition and Subtraction
Binary adder used as 2’s-complement adder/subtractor.
AddSub
x y
y
x
k /
k /
k /
y or y
Adder
c out
c in
k /
Simple Adders
Binary half-adder (HA) and full-adder (FA).
x y c s 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 0
Inputs Outputs
HA
x y
c
s
x y c c s 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1
Inputs Outputs
c out c in
out in x
y
s
FA
Digit-set interpretation:{0, 1} + {0, 1} = {0, 2} + {0, 1}
Digit-set interpretation:{0, 1} + {0, 1} + {0, 1} = {0, 2} + {0, 1}
Full-Adder Implementations
Full adder implemented with two half-adders, by means of two 4-input multiplexers, and as two-level gate network.
(a) FA built of two HAs
(c) Two-level AND-OR FA (b) CMOS mux-based FA
1
0
3
2
HA
HA
1
0
3
2
0
1
x y
x y
x y
s
s s
c out
c out
c out
c in
c in
c in
Ripple-Carry Adder: Slow But Simple
Ripple-carry binary adder with 32-bit inputs and output.
x
s
y
c c
x
s
y
c
x
s
y
c
c out c in
0 0
0
c 0
1 1
1
1 2
31
31
31
31
FA FA FA 32 . . .
Critical path
Carry Propagation Networks
The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum
bits.
Carry network
. . . . . .
x i y i
g p
s
i i
i
c i c i+1
c k 1
c k
c k 2 c 1
c 0
g p 1 1 g p 0 0
g p k 2 k 2 g p i+1 i+1 g p k 1 k 1
c 0 . . . . . .
0 0 0 1 1 0 1 1
annihilated or killed propagated generated (impossible)
Carry is: g i p i
gi = xi yi pi = xi yi
Carry-Lookahead Logic with 4-Bit Block
Blocks needed in the design of carry-lookahead adders with four-way grouping of bits.
Blo
ck s
ign
al g
en
era
tion
p [i, i+3]
c i
Inte
rme
idte
ca
rrie
s
c i+1 c i+2 c i+3 g [i, i+3]
p i+3 g i+3 p i+2 g i+2 p i+1 g i+1 p i g i
Another Speed-Up Method: Carry Select
Carry-select addition circuit
c out c in Adder
Version 1 of sum bits 1
0
x [a, b ]
c out c in Adder
Version 0 of sum bits
y [a, b]
s [a, b]
c a
0 1
Allows doubling of adder width with a single-mux additional delay
Arithmetic for Multimedia
Graphics and media processing operates on vectors of 8-bit and 16-bit data Use 64-bit adder, with partitioned carry chain
Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
SIMD (single-instruction, multiple-data)
Saturating operations On overflow, result is largest representable
value c.f. 2’s-complement modulo arithmetic
E.g., clipping in audio, saturation in video
Shift-Add Multiplication
Multiplication of 4-bit numbers in dot notation
Multiplicand
Partial products bit-matrix
x y
z
y x 2 0 0
y x 2 1 1
y x 2 2 2
y x 2 3 3
Multiplier
Product
z(j+1) = (z(j) + yj x 2k) 2–1 with z(0) = 0 and z(k) = z |––– add –––|
|–– shift right ––|
Multiplication
Start with long-multiplication approach
1000× 1001 1000 0000 0000 1000 1001000
Length of product is the
sum of operand lengths
multiplicand
multiplier
product
§3.3
Mu
ltiplica
tion
Multiplication Hardware
Initially 0
Optimized Multiplier
Perform steps in parallel: add/shift
One cycle per partial-product additionThat’s ok, if frequency of multiplications is low
Faster Multiplier
Uses multiple adders Cost/performance tradeoff
Can be pipelined Several multiplications performed in parallel
MIPS Multiplication
Two 32-bit registers for product HI: most-significant 32 bits LO: least-significant 32-bits
Instructions mult rs, rt / multu rs, rt
64-bit product in HI/LO mfhi rd / mflo rd
Move from HI/LO to rd Can test HI value to see if product overflows 32 bits
mul rd, rs, rt Least-significant 32 bits of product –> rd
Division Check for 0 divisor Long division approach
If divisor ≤ dividend bits 1 bit in quotient, subtract
Otherwise 0 bit in quotient, bring down next
dividend bit Restoring division
Do the subtract, and if remainder goes < 0, add divisor back
Signed division Divide using absolute values Adjust sign of quotient and
remainder as required
10011000 1001010 -1000 10 101 1010 -1000 10
n-bit operands yield n-bitquotient and remainder
quotient
dividend
remainder
divisor
Division Hardware
Initially dividend
Initially divisor in left
half
Optimized Divider
One cycle per partial-remainder subtraction Looks a lot like a multiplier!
Same hardware can be used for both
MIPS Division
Use HI/LO registers for result HI: 32-bit remainder LO: 32-bit quotient
Instructions div rs, rt / divu rs, rt No overflow or divide-by-0 checking
Software must perform checks if required Use mfhi, mflo to access result
Binary and Decimal Multiplication
Step-by-step multiplication examples for 4-digit unsigned numbers.
Position 7 6 5 4 3 2 1 0 Position 7 6 5 4 3 2 1 0========================= =========================x24 1 0 1 0 x104 3 5 2 8y 0 0 1 1 y 4 0 6 7========================= =========================z
(0) 0 0 0 0 z (0) 0 0 0 0
+y0x24 1 0 1 0 +y0x104 2 4 6 9 6–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z
(1) 0 1 0 1 0 10z (1) 2 4 6 9 6
z (1) 0 1 0 1 0 z
(1) 0 2 4 6 9 6+y1x24 1 0 1 0 +y1x104 2 1 1 6 8–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z
(2) 0 1 1 1 1 0 10z (2) 2 3 6 3 7 6
z (2) 0 1 1 1 1 0 z
(2) 2 3 6 3 7 6+y2x24 0 0 0 0 +y2x104 0 0 0 0 0–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z
(3) 0 0 1 1 1 1 0 10z (3) 0 2 3 6 3 7 6
z (3) 0 0 1 1 1 1 0 z
(3) 0 2 3 6 3 7 6+y3x24 0 0 0 0 +y3x104 1 4 1 1 2–––––––––––––––––––––––––– ––––––––––––––––––––––––––2z
(4) 0 0 0 1 1 1 1 0 10z (4) 1 4 3 4 8 3 7 6
z (4) 0 0 0 1 1 1 1 0 z
(4) 1 4 3 4 8 3 7 6========================= =========================