View
234
Download
8
Embed Size (px)
Citation preview
Chapter 3Arithmetic for Computers
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Taxonomy of Computer Information Information
Instructions AddressesData
Numeric Non-numeric
Fixed-point Floating-point Character(ASCII)
Boolean
Unsigned(ordinal)
Signed
Sign-magnitude 2’s complement
Single-precision
Other
Double-precision
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Number Format Considerations
Type of numbers (integer, fraction, real, complex) Range of values
between smallest and largest values wider in floating-point formats
Precision of values (max. accuracy) usually related to number of bits allocated
n bits => represent 2n values/levels Value/weight of least-significant bit
Cost of hardware to store & process numbers (some formats more difficult to add, multiply/divide, etc.)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Unsigned Integers
Positional number system:
an-1an-2 …. a2a1a0 =
an-1x2n-1 + an-2x2n-2 + … + a2x22 + a1x21 + a0x20
Range = [(2n -1) … 0]
Carry out of MSB has weight 2n
Fixed-point fraction:
0.a-1a-2 …. a-n = a-1x2-1 + a-2x2-2 + … + a-nx2-n
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Signed Integers
Sign-magnitude format (n-bit values):
A = San-2 …. a2a1a0 (S = sign bit)
Range = [-(2n-1-1) … +(2n-1-1)] Addition/subtraction difficult (multiply easy) Redundant representations of 0
2’s complement format (n-bit values):
-A represented by 2n – A Range = [-(2n-1) … +(2n-1-1)] Addition/subtraction easier (multiply harder) Single representation of 0
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Computing the 2’s Complement
To compute the 2’s complement of A:
Let A = an-1an-2 …. a2a1a0
2n - A = 2n -1 + 1 – A = (2n -1) - A + 1
1 1 … 1 1 1
- an-1 an-2 …. a2 a1 a0 + 1
-------------------------------------
a’n-1a’n-2 …. a’2a’1a’0 + 1 (one’s complement + 1)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
2’s Complement Arithmetic
Let (2n-1-1) ≥ A ≥ 0 and (2n-1-1) ≥ B ≥ 0
Case 1: A + B (2n-2) ≥ (A + B) ≥ 0
Since result < 2n, there is no carry out of the MSB Valid result if (A + B) < 2n-1
MSB (sign bit) = 0 Overflow if (A + B) ≥ 2n-1
MSB (sign bit) = 1 if result ≥ 2n-1
Carry into MSB
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
2’s Complement Arithmetic
Case 2: A - B Compute by adding: A + (-B) 2’s complement: A + (2n - B) -2n-1 < result < 2n-1 (no overflow possible) If A ≥ B: 2n + (A - B) ≥ 2n
Weight of adder carry output = 2n
Discard carry (2n), keeping (A-B), which is ≥ 0 If A < B: 2n + (A - B) < 2n
Adder carry output = 0 Result is 2n - (B - A) =
2’s complement representation of -(B-A)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
2’s Complement Arithmetic
Case 3: -A - B Compute by adding: (-A) + (-B) 2’s complement: (2n - A) + (2n - B) = 2n + 2n - (A + B) Discard carry (2n), making result 2n - (A + B)
= 2’s complement representation of -(A + B) 0 ≥ result ≥ -2n Overflow if -(A + B) < -2n-1
MSB (sign bit) = 0 if 2n - (A + B) < 2n-1 no carry into MSB
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Relational OperatorsCompute A-B & test ALU “flags” to compare A vs. B
ZF = result zero OF = 2’s complement overflowSF = sign bit of result CF = adder carry output
Signed Unsigned
A = B ZF = 1 ZF = 1
A <> B ZF = 0 ZF = 0
A ≥ B (SF OF) = 0 CF = 1 (no borrow)
A > B (SF OF) + ZF = 0 CF ZF’ = 1
A ≤ B (SF OF) + ZF = 1 CF ZF’ = 0
A < B (SF OF) = 1 CF = 0 (borrow)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
MIPS Overflow Detection
An exception (interrupt) occurs when overflow detected for add,addi,sub Control jumps to predefined address for exception Interrupted address is saved for possible resumption
Details based on software system / language example: flight control vs. homework assignment
Don't always want to detect overflow— new MIPS instructions: addu, addiu, subu
note: addiu still sign-extends!note: sltu, sltiu for unsigned comparisons
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Designing the Arithmetic & Logic Unit (ALU)
Provide arithmetic and logical functions as needed by the instruction set
Consider tradeoffs of area vs. performance
(Material from Appendix B)
32
32
32
operation
result
a
b
ALU
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Not easy to decide the “best” way to build something Don't want too many inputs to a single gate (fan in) Don’t want to have to go through too many gates (delay) For our purposes, ease of comprehension is important
Let's look at a 1-bit ALU for addition:
How could we build a 1-bit ALU for add, and, and or? How could we build a 32-bit ALU?
Different Implementations
cout = a b + a cin + b cin
sum = a xor b xor cin
Sum
CarryIn
CarryOut
a
b
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Building a 32 bit ALU
b
0
2
Result
Operation
a
1
CarryIn
CarryOut
Result31a31
b31
Result0
CarryIn
a0
b0
Result1a1
b1
Result2a2
b2
Operation
ALU0
CarryIn
CarryOut
ALU1
CarryIn
CarryOut
ALU2
CarryIn
CarryOut
ALU31
CarryIn
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Two's complement approach: just negate b and add.
What about subtraction (a – b) ?
0
2
Result
Operation
a
1
CarryIn
CarryOut
0
1
Binvert
b
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Adding a NOR function
Can also choose to invert a. How do we get “a NOR b” ?
Binvert
a
b
CarryIn
CarryOut
Operation
1
0
2+
Result
1
0
Ainvert
1
0
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Need to support the set-on-less-than instruction
(slt)
remember: slt is an arithmetic instruction
produces a 1 if rs < rt and 0 otherwise
use subtraction: (a-b) < 0 implies a < b
Need to support test for equality (beq $t5, $t6, $t7)
use subtraction: (a-b) = 0 implies a = b
Tailoring the ALU to the MIPS
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Supporting slt
Use this ALU for most significant bitall other bits
Binvert
a
b
CarryIn
Operation
1
0
2+
Result
1
0
3Less
Overflowdetection
Set
Overflow
Ainvert
1
0
Binvert
a
b
CarryIn
CarryOut
Operation
1
0
2+
Result
1
0
Ainvert
1
0
3Less
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Supporting slt
a0
Operation
CarryInALU0Less
CarryOut
b0
CarryIn
a1 CarryInALU1Less
CarryOut
b1
Result0
Result1
a2 CarryInALU2Less
CarryOut
b2
a31 CarryInALU31Less
b31
Result2
Result31
......
...
BinvertAinvert
0
0
0 Overflow
Set
CarryIn
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Test for equality
Notice control lines:
0000 = and0001 = or0010 = add0110 = subtract0111 = slt1100 = NOR•Note: zero is a 1 when the result is zero!
a0
Operation
CarryInALU0Less
CarryOut
b0
a1 CarryInALU1Less
CarryOut
b1
Result0
Result1
a2 CarryInALU2Less
CarryOut
b2
a31 CarryInALU31Less
b31
Result2
Result31
......
...
Bnegate
Ainvert
0
0
0 Overflow
Set
CarryIn...
...Zero
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Conclusion
We can build an ALU to support the MIPS instruction set key idea: use multiplexor to select the output we want we can efficiently perform subtraction using two’s complement we can replicate a 1-bit ALU to produce a 32-bit ALU
Important points about hardware all of the gates are always working the speed of a gate is affected by the number of inputs to the gate the speed of a circuit is affected by the number of gates in series
(on the “critical path” or the “deepest level of logic”) Our primary focus: comprehension, however,
Clever changes to organization can improve performance(similar to using better algorithms in software)
We saw this in multiplication, let’s look at addition now
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Is a 32-bit ALU as fast as a 1-bit ALU? Is there more than one way to do addition?
two extremes: ripple carry and sum-of-products
Can you see the ripple? How could you get rid of it?
c1 = b0c0 + a0c0 + a0b0
c2 = b1c1 + a1c1 + a1b1 c2 =
c3 = b2c2 + a2c2 + a2b2 c3 =
c4 = b3c3 + a3c3 + a3b3 c4 = Not feasible! Why?
Problem: ripple carry adder is slow
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
One-bit Full-Adder Circuit
ai
bi
XOR
AND
XOR
ANDOR
ci
sumi
Ci+1
FAi
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
32-bit Ripple-Carry Adder
FA0
FA1
FA2
FA31
c0 a0 b0
a1 b1
a2 b2
a31 b31
sum0
sum1
sum2
sum31
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
How Fast is Ripple-Carry Adder?
Longest delay path (critical path) runs from cin to sum31.
Suppose delay of full-adder is 100ps. Critical path delay = 3,200ps Clock rate cannot be higher than 1012/3,200 =
312MHz. Must use more efficient ways to handle carry.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Fast Adders In general, any output of a 32-bit adder can be evaluated
as a logic expression in terms of all 65 inputs. Levels of logic in the circuit can be reduced to log2N for
N-bit adder. Ripple-carry has N levels. More gates are needed, about log2N times that of ripple-
carry design. Fastest design is known as carry lookahead adder.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
N-bit Adder Design Options
Type of adder Time complexity
(delay)
Space complexity
(size)
Ripple-carry O(N) O(N)
Carry-lookahead O(log2N) O(N log2N)
Carry-skip O(√N) O(N)
Carry-select O(√N) O(N)
Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture:A Quantitative Approach, Second Edition, San Francisco, California,1990.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
An approach in-between our two extremes Motivation:
If we didn't know the value of carry-in, what could we do? When would we always generate a carry? gi = aibi
When would we propagate the carry? pi = ai + bi
Did we get rid of the ripple?c1 = g0 + p0c0
c2 = g1 + p1c1 c2 = g1 + p1 g0 + p1p0c0
c3 = g2 + p2c2 c3 = …
c4 = g3 + p3c3 c4 = …
Feasible! Why?
Carry-lookahead adder
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Can’t build a 16 bit adder this way... (too big) Could use ripple carry of 4-bit CLA adders Better: use the CLA principle again!
Use principle to build bigger adders
a4 CarryIn
ALU1 P1 G1
b4a5b5a6b6a7b7
a0 CarryIn
ALU0 P0 G0
b0
Carry-lookahead unit
a1b1a2b2a3b3
CarryIn
Result0–3
pigi
ci + 1
pi + 1
gi + 1
C1
Result4–7
a8 CarryIn
ALU2 P2 G2
b8a9b9
a10b10a11b11
ci + 2
pi + 2
gi + 2
C2
Result8–11
a12 CarryIn
ALU3 P3 G3
b12a13b13a14b14a15b15
ci + 3
pi + 3
gi + 3
C3
Result12–15
ci + 4C4
CarryOut
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Carry-Select Adder16-bit ripple carry adder
a0-a15
b0-b15
cin
sum0-sum15
16-bit ripple carry adder
a16-a31
b16-b31
0
16-bit ripple carry adder
a16-a31
b16-b31
1
Mu
ltip
lex
er
sum16-sum31
0
1This is known ascarry-select adder
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
ALU Summary
We can build an ALU to support MIPS addition Our focus is on comprehension, not performance Real processors use more sophisticated techniques for arithmetic Where performance is not critical, hardware description languages
allow designers to completely automate the creation of hardware!
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Multiplication
More complicated than addition accomplished via shifting and addition
More time and more area Let's look at 3 versions based on a gradeschool
algorithm
0010 (multiplicand)
__x_1011 (multiplier)
Negative numbers: convert and multiply there are better techniques, we won’t look at them
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Multiplication: Implementation
DatapathControl
MultiplicandShift left
64 bits
64-bit ALU
ProductWrite
64 bits
Control test
MultiplierShift right
32 bits
32nd repetition?
1a. Add multiplicand to product and
place the result in Product register
Multiplier0 = 01. Test
Multiplier0
Start
Multiplier0 = 1
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
No: < 32 repetitions
Yes: 32 repetitions
Done
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Final Version
Multiplicand
32 bits
32-bit ALU
ProductWrite
64 bits
Controltest
Shift right
32nd repetition?
Product0 = 01. Test
Product0
Start
Product0 = 1
3. Shift the Product register right 1 bit
No: < 32 repetitions
Yes: 32 repetitions
Done
What goes here?
•Multiplier starts in right half of product
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Multiplying Signed Numbers withBoothe’s Algorithm Consider A x B where A and B are signed integers
(2’s complemented format) Decompose B into the sum B1 + B2 + … + Bn
A x B = A x (B1 + B2 + … + Bn)= (A x B1) + (A x B2) + … (A x Bn)
Let each Bi be a single string of 1’s embedded in 0’s:
…0011…1100… Example:
0110010011100 = 0110000000000 + 0000010000000 + 0000000011100
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Boothe’s Algorithm
Scanning from right to left, bit number u is the first 1 bit of the string and bit v is the first 0 left of the string:
v u
Bi = 0 … 0 1 … 1 0 … 0
= 0 … 0 1 … 1 1 … 1 (2v – 1)
- 0 … 0 0 … 0 1 … 1 (2u – 1)
= (2v – 1) - (2u – 1)
= 2v – 2u
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Boothe’s Algorithm Decomposing B: A x B = A x (B1 + B2 + … ) = A x [(2v1 – 2u1) + (2v2 – 2u2) + …] = (A x 2v1) – (A x 2u1) + (A x 2v2) – (A x 2u2) …
A x B can be computed by adding and subtracted shifted values of A: Scan bits right to left, shifting A once per bit When the bit string changes from 0 to 1, subtract
shifted A from the current product P – (A x 2u) When the bit string changes from 1 to 0, add shifted A
to the current product P + (A x 2v)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Floating Point Numbers
We need a way to represent a wide range of numbers numbers with fractions, e.g., 3.1416 large number: 976,000,000,000,000 = 9.76 × 1014
small number: 0.0000000000000976 = 9.76 × 10-14
Representation: sign, exponent, significand: (–1)sign ×significand ×2exponent
more bits for significand gives more accuracy more bits for exponent increases range
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Scientific Notation
Scientific notation: 0.525×105 = 5.25×104 = 52.5×103 5.25 ×104 is in normalized scientific notation.
position of decimal point fixed leading digit non-zero
Binary numbers 5.25 = 101.01 = 1.0101×22
Binary point multiplication by 2 moves the point to the left division by 2 moves the point to the right
Known as floating point format.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Binary to Decimal Conversion
Binary (-1)S (1.b1b2b3b4) × 2E
Decimal (-1)S × (1 + b1×2-1 + b2×2-2 + b3×2-3 + b4×2-4) × 2E
Example: -1.1100 × 2-2 (binary) = - (1 + 2-1 + 2-2) ×2-2
= - (1 + 0.5 + 0.25)/4
= - 1.75/4
= - 0.4375 (decimal)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
IEEE Std. 754 Floating-Point Format
S E: 8-bit Exponent F: 23-bit Fraction
bits 0-22 bits 23-30 bit 31
S E: 11-bit Exponent F: 52-bit Fraction +
bits 0-19 bits 20-30 bit 31
Continuation of 52-bit Fraction
bits 0-31
Single-Precision
Double-Precision
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
IEEE 754 floating-point standard
Represented value = (–1)sign ×F) ×2exponent – bias Exponent is “biased” (excess-K format) to make sorting easier
bias of 127 for single precision and 1023 for double precision E values in [1 .. 254] (0 and 255 reserved) Range = 2-126 to 2+127 (1038 to 10+38)
Significand in sign-magnitude, normalized form Significand = (1 + F) = 1. b-1b-1b-1…b-23
Suppress storage of leading 1
Overflow: Exponent requiring more than 8 bits. Number can be positive or negative.
Underflow: Fraction requiring more than 23 bits. Number can be positive or negative.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
IEEE 754 floating-point standard
Example:
Decimal: -5.75 = - ( 4 + 1 + ½ + ¼ ) Binary: -101.11 = -1.0111 x 22
Floating point: exponent = 129 = 10000001
IEEE single precision: 110000001011100000000000000000000
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Examples
1.1010001 × 210100 = 0 10010011 10100010000000000000000 = 1.6328125 × 220
-1.1010001 × 210100 = 1 10010011 10100010000000000000000 = -1.6328125 × 220
1.1010001 × 2-10100 = 0 01101011 10100010000000000000000 = 1.6328125 × 2-20
-1.1010001 × 2-10100 = 1 01101011 10100010000000000000000 = -1.6328125 × 2-20
Biased exponent (0-255), bias 127 (01111111) to be subtracted
0.50.1250.00781250.6328125
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
NegativeOverflow
PositiveOverflow
Expressible numbers
Numbers in 32-bit Formats Two’s complement integers
Floating point numbers
Ref: W. Stallings, Computer Organization and Architecture, Sixth Edition, Upper Saddle River, NJ: Prentice-Hall.
-231 231-10
Expressible negativenumbers
Expressible positivenumbers
0-2-127 2-127
Positive underflowNegative underflow
(2 – 2-23)×2127- (2 – 2-23)×2127
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
IEEE 754 Special Codes
± 1.0 × 2-127
Smallest positive number in single-precision IEEE 754 standard. Interpreted as positive/negative zero. Exponent less than -127 is positive underflow (regard as zero).
S 00000000 00000000000000000000000
S 11111111 00000000000000000000000
Zero
Infinity ± 1.0 × 2128 Largest positive number in single-precision IEEE 754 standard. Interpreted as ± ∞ If true exponent = 128 and fraction ≠ 0, then the number is greater
than ∞. It is called “not a number” or NaN (interpret as ∞).
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Addition and Subtraction
Addition/subtraction of two floating-point numbers:
Example: 2 x 103 Align mantissas: 0.2 x 104
+ 3 x 104 + 3 x 104
3.2 x 104
General Case:
m1 x 2e1 ± m2 x 2e2 = (m1 ± m2 x 2e2-e1) x 2e1 for e1 > e2
= (m1 x 2e1-e2 ± m2) x 2e2 for e2 > e1
Shift smaller mantissa right by | e1 - e2| bits to align the mantissas.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Addition/Subtraction Algorithm
0. Zero check- Change the sign of subtrahend
- If either operand is 0, the other is the result
1. Significand alignment: right shift smaller significand until two exponents are identical.
2. Addition: add significands and report exception if overflow occurs.
3. Normalization- Shift significand bits to normalize.
- report overflow or underflow if exponent goes out of range.
4. Rounding
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Add/Subtract (PH Text Figs. 3.16/17)
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Example Subtraction: 0.5ten- 0.4375ten
Step 0: Floating point numbers to be added
1.000two×2-1 and -1.110two×2-2
Step 1: Significand of lesser exponent is shifted right until exponents match
-1.110two×2-2 → - 0.111two×2-1
Step 2: Add significands, 1.000two + (- 0.111two)
Result is 0.001two ×2-1
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Example (Continued) Step 3: Normalize, 1.000two× 2-4
No overflow/underflow since
127 ≥ exponent ≥ -126 Step 4: Rounding, no change since the sum
fits in 4 bits.
1.000two ×2-4 = (1+0)/16 = 0.0625ten
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Multiplication: Basic Idea
(m1 x 2e1) x (m2 x 2e2) = (m1 x m2) x 2e1+e2
Separate signs Add exponents Multiply significands Normalize, round, check overflow Replace sign
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Multiplication Algorithm
P-H Figure 3.18
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Mult. Illustration Multiply 0.5ten and -0.4375ten (answer = - 0.21875ten) or
Multiply 1.000two×2-1 and -1.110two×2-2
Step 1: Add exponents-1 + (-2) = -3
Step 2: Multiply significands 1.000
×1.110 0000 1000 100010001110000 Product is 1.110000
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Mult. Illustration (Cont.)
Step 3: Normalization: If necessary, shift significand right and increment
exponent.
Normalized product is 1.110000 × 2-3
Check overflow/underflow: 127 ≥ exponent ≥ -126 Step 4: Rounding: 1.110 × 2-3
Step 5: Sign: Operands have opposite signs,
Product is -1.110 × 2-3
Decimal value = - (1+0.5+0.25)/8 = - 0.21875ten
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
FP Division: Basic Idea
Separate sign. Check for zeros and infinity. Subtract exponents. Divide significands. Normalize/overflow/underflow. Rounding. Replace sign.
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
MIPS Floating Point 32 floating point registers, $f0, . . . , $f31 FP registers used in pairs for double precision; $f0
denotes double precision content of $f0,$f1 Data transfer instructions:
lwc1 $f1, 100($s2) # $f1←Mem[$s1+100] swc1 $f1, 100($s2) # Mem[$s2+100]←$f1
Arithmetic instructions: (xxx=add, sub, mul, div) xxx.s single precision xxx.d double precision
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Floating Point Complexities
Operations are somewhat more complicated (see text)
In addition to overflow we can have “underflow”
Accuracy can be a big problem
IEEE 754 keeps two extra bits, guard and round
four rounding modes
positive divided by zero yields “infinity”
zero divide by zero yields “not a number”
other complexities Implementing the standard can be tricky Not using the standard can be even worse
see text for description of 80x86 and Pentium bug!
Spring 2005 ELEC 5200/6200From Patterson/Hennessey Slides
Chapter Three Summary
Computer arithmetic is constrained by limited precision Bit patterns have no inherent meaning but standards do exist
two’s complement IEEE 754 floating point
Computer instructions determine “meaning” of the bit patterns Performance and accuracy are important so there are many
complexities in real machines Algorithm choice is important and may lead to hardware
optimizations for both space and time (e.g., multiplication)
You may want to look back (Section 3.10 is great reading!)