REPRESENTATION OF REAL NUMBER
Presented by: Pawan Yadav, Puneet Vinayak
CONTENTS:
- Floating point numbers
- Decimal-binary conversion
- Floating point representation
- Mantissa
- Exponent
- Normalization
- IEEE floating point representation
- Floating point arithmetic
- Errors in floating point arithmetic
FLOATING POINT NUMBERS
In computer science, a real number is usually stored as a floating point number. In the decimal system, a decimal point (radix point) separates the whole part of a number from its fractional part.
Examples:
37.25 (whole part = 37, fractional part = .25)
123.567
10.12345678
For example, 37.25 can be analyzed as:
10^1    10^0    10^-1    10^-2
Tens    Units   Tenths   Hundredths
3       7       2        5
37.25 = 3 × 10 + 7 × 1 + 2 × 1/10 + 5 × 1/100
BINARY EQUIVALENT
In the binary representation of a floating point number, the column values are as follows:
… 2^6 2^5 2^4 2^3 2^2 2^1 2^0 . 2^-1 2^-2 2^-3 2^-4 …
… 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 …
… 64 32 16 8 4 2 1 . .5 .25 .125 .0625 …
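The column values above can be applied mechanically. The following is a minimal sketch, assuming the input is a string of 0s and 1s with an optional radix point; `binary_to_decimal` is a hypothetical helper name used only for illustration.

```python
def binary_to_decimal(text):
    """Sum the column values of a binary string such as '101.11'."""
    whole, _, frac = text.partition(".")
    value = 0.0
    # Columns to the left of the radix point: 2^0, 2^1, 2^2, ...
    for i, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** i
    # Columns to the right of the radix point: 2^-1, 2^-2, ...
    for i, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -i
    return value

print(binary_to_decimal("101.11"))  # 5.75
```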
DECIMAL BINARY CONVERSION
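Repeatedly multiply the fraction by two, recording the integer part of each product as the next binary digit, until the fraction becomes zero.
0.8125 × 2 = 1.625  → 1
0.625  × 2 = 1.25   → 1
0.25   × 2 = 0.5    → 0
0.5    × 2 = 1.0    → 1
Reading the digits from top to bottom: 0.8125_10 = 0.1101_2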
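The repeated-doubling procedure can be sketched in a few lines. This is an illustration under assumptions: `fraction_to_binary` is a hypothetical helper, exact rational arithmetic is used so that the fraction really does reach zero when it can, and `max_bits` caps the output because many decimal fractions (0.1, for instance) never terminate in binary.

```python
from fractions import Fraction

def fraction_to_binary(text, max_bits=32):
    """Convert a decimal fraction 0 <= x < 1 (given as a string) to binary digits."""
    frac = Fraction(text)
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        if frac >= 1:          # integer part is 1: emit 1 and keep the remainder
            bits.append("1")
            frac -= 1
        else:                  # integer part is 0
            bits.append("0")
    return "0." + ("".join(bits) or "0")

print(fraction_to_binary("0.8125"))  # 0.1101
```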
SCIENTIFIC NOTATION OF FLOATING NUMBERS
Decimal:
-123,000,000,000,000 = -1.23 × 10^14
0.000 000 000 000 000 123 = +1.23 × 10^-16
Binary:
110 1100 0000 0000 = 1.1011 × 2^14
-0.0000 0000 0000 0001 1011 = -1.1011 × 2^-16
FLOATING POINT NUMBER REPRESENTATION
If x is a real number, then its normal form representation is:
x = f × Base^E
where f is the mantissa and E is the exponent.
Examples:
125.32_10 = 0.12532 × 10^3 (mantissa 0.12532, exponent 3)
-125.32_10 = -0.12532 × 10^3
0.0546_10 = 0.546 × 10^-1
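The decimal normal form can be computed by shifting the decimal point until the mantissa starts with 0 followed by a nonzero digit. A minimal sketch, assuming decimal string input; `normal_form` is a hypothetical helper name, not part of any library.

```python
from decimal import Decimal

def normal_form(text):
    """Return (f, E) with x = f × 10^E and 0.1 <= |f| < 1."""
    x = Decimal(text)
    if x == 0:
        return Decimal(0), 0
    # adjusted() is the power of ten of the leading digit (2 for 125.32).
    exponent = x.adjusted() + 1
    mantissa = x.scaleb(-exponent)   # shift the decimal point, digits unchanged
    return mantissa, exponent

for text in ("125.32", "-125.32", "0.0546"):
    f, e = normal_form(text)
    print(f"{text} = {f} × 10^{e}")
```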
NORMALIZED AND UNNORMALIZED
NORMALIZATION PROCESS
FLOATING POINT FORMAT FOR BINARY NUMBERS
IEEE FLOATING POINT REPRESENTATION
- more exponent bits give greater range
- more significant bits give greater accuracy
The first, or leftmost, field of our floating point representation is the sign bit: 0 for a positive number, 1 for a negative number.
The second field of the floating point number is the exponent. Since we must be able to represent both positive and negative exponents, the exponent is stored with a bias of 127: the stored value is the true exponent plus 127. An exponent of 5 is therefore stored as 127 + 5 = 132; an exponent of -5 is stored as 127 + (-5) = 122.
The biased exponent, the value actually stored, ranges from 0 through 255. This is the range of values that can be represented by an 8-bit unsigned binary number.
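The bias convention is a single addition or subtraction. The sketch below assumes the 8-bit exponent field and bias of 127 described above; `to_biased` and `from_biased` are hypothetical helper names.

```python
BIAS = 127  # bias for the 8-bit exponent field

def to_biased(exponent):
    """Return the stored (biased) value for a true exponent."""
    stored = exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("exponent does not fit in an 8-bit field")
    return stored

def from_biased(stored):
    """Recover the true exponent from the stored value."""
    return stored - BIAS

print(to_biased(5), format(to_biased(5), "08b"))  # 132 10000100
print(to_biased(-5))                              # 122
```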
The mantissa is the set of 0's and 1's to the right of the radix point of the normalized binary number (normalized means the digit to the left of the radix point is 1). Example: in 1.00101 × 2^3 the mantissa is 00101.
The mantissa is stored in a 23-bit field.
NORMALIZING NUMBERS
Examples:
134.15_10 = 0.13415 × 10^3
0.0021_10 = 0.21 × 10^-2
101.11_2 = 0.10111 × 2^3 or 1.0111 × 2^2 (hidden 1)
0.011_2 = 0.11 × 2^-1 or 1.1 × 2^-2 (hidden 1)
AB.CD_16 = 0.ABCD × 16^2
0.00AC_16 = 0.AC × 16^-2
Note that the concept of a hidden 1 applies only to binary.
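Both binary normal forms can be checked with Python's standard `math.frexp`, which returns a mantissa m with 0.5 <= |m| < 1 (the 0.1xxx form). Doubling that mantissa and lowering the exponent by one gives the 1.xxx hidden-1 form. A small sketch; `normalize_binary` is a hypothetical helper name.

```python
import math

def normalize_binary(x):
    """Return ((m, e), (m1, e1)) with x = m × 2^e = m1 × 2^e1.

    m is in [0.5, 1), i.e. 0.1xxx in binary; m1 is in [1, 2), i.e. 1.xxx.
    """
    m, e = math.frexp(x)
    return (m, e), (m * 2, e - 1)

# 101.11 in binary is 5.75; 0.011 in binary is 0.375
print(normalize_binary(5.75))   # ((0.71875, 3), (1.4375, 2))
print(normalize_binary(0.375))  # ((0.75, -1), (1.5, -2))
```

Here 0.71875 is 0.10111 in binary and 1.4375 is 1.0111 in binary.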
CONVERTING DECIMAL FLOATING POINT VALUES TO STORED IEEE STANDARD VALUES
Example: Find the IEEE floating point representation of 40.15625.
Step 1. Compute the binary equivalent of the whole part and the fractional part (convert 40 and .15625 to their binary equivalents).
40.15625_10 = 101000.00101_2
Step 2. Normalize the number by moving the binary point to the right of the leftmost one.
101000.00101 = 1.0100000101 × 2^5
Step 3. Convert the exponent to a biased exponent.
127 + 5 = 132
132_10 = 10000100_2
Step 4. Store the results from above.
Sign   Exponent (from step 3)   Mantissa (from step 2)
0      10000100                 01000001010000000000000
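The stored fields can be checked against the machine's own single-precision encoding. The sketch below uses the standard `struct` module to pack a value as a 32-bit float and then splits the bit pattern into its three fields; `float32_fields` is a hypothetical helper name.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent bits, mantissa bits) of x stored as a 32-bit float."""
    # Pack as big-endian float32, then reinterpret the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased
    mantissa = bits & 0x7FFFFF        # 23 bits, hidden 1 not stored
    return sign, format(exponent, "08b"), format(mantissa, "023b")

print(float32_fields(40.15625))
# (0, '10000100', '01000001010000000000000')
```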
CONVERT 10.37 TO SINGLE PRECISION FLOATING POINT
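The same four steps can be followed in code for 10.37, whose fractional part does not terminate in binary and so must be rounded to fit 23 bits. The sketch below is an illustration under assumptions: `encode_single` is a hypothetical helper, it handles only nonzero values in the normal range, and it rounds to the nearest representable mantissa.

```python
from fractions import Fraction
import struct

BIAS = 127

def encode_single(text):
    """Return (sign, biased exponent, 23-bit mantissa) for a decimal string."""
    value = Fraction(text)
    if value == 0:
        raise ValueError("zero has a special encoding not covered here")
    sign = 1 if value < 0 else 0
    value = abs(value)

    # Normalize so that 1 <= value < 2, tracking the exponent.
    exponent = 0
    while value >= 2:
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1

    # Drop the hidden 1 and keep 23 fraction bits, rounding to nearest.
    mantissa = round((value - 1) * 2**23)
    if mantissa == 2**23:      # rounding carried into the hidden bit
        mantissa = 0
        exponent += 1

    return sign, exponent + BIAS, mantissa

sign, exponent, mantissa = encode_single("10.37")
print(sign, format(exponent, "08b"), format(mantissa, "023b"))
# 0 10000010 01001011110101110000101

# Cross-check against the platform's single-precision encoding.
bits = (sign << 31) | (exponent << 23) | mantissa
print(bits == struct.unpack(">I", struct.pack(">f", 10.37))[0])
```

Here 10.37 normalizes to roughly 1.29625 × 2^3, so the stored exponent is 127 + 3 = 130.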
Floating point arithmetic
FLOATING-POINT ADDITION
Assume a 4-digit decimal mantissa.
FLOATING POINT SUBTRACTION (USING 4 DIGIT MANTISSA)
Addition and subtraction require both terms to be on the same scale (the same exponent):
0.2361 × 10^6 - 0.1455 × 10^4
= 0.2361 × 10^6 - 0.001455 × 10^6 {both × 10^6}
= (0.2361 - 0.001455) × 10^6
= 0.234645 × 10^6
≈ 0.2346 × 10^6 {4 digit mantissa}
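The alignment step can be made explicit in code. This is a minimal sketch, assuming numbers are held as (mantissa, exponent) pairs in base 10 and that the final mantissa is rounded to 4 digits; `subtract` and `normalize` are hypothetical helper names.

```python
from decimal import Decimal, ROUND_HALF_EVEN

DIGITS = 4  # mantissa width assumed above

def normalize(mantissa, exponent):
    """Rescale so 0.1 <= |mantissa| < 1, then round to DIGITS digits."""
    if mantissa == 0:
        return Decimal(0), 0
    shift = mantissa.adjusted() + 1
    mantissa = mantissa.scaleb(-shift)
    exponent += shift
    step = Decimal(1).scaleb(-DIGITS)
    mantissa = mantissa.quantize(step, rounding=ROUND_HALF_EVEN)
    if abs(mantissa) >= 1:            # rounding produced 1.0000
        mantissa = mantissa.scaleb(-1)
        exponent += 1
    return mantissa, exponent

def subtract(a, b):
    """Compute a - b for (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    e = max(ea, eb)
    # Align both mantissas to the larger exponent before subtracting.
    ma = ma.scaleb(ea - e)
    mb = mb.scaleb(eb - e)
    return normalize(ma - mb, e)

print(subtract((Decimal("0.2361"), 6), (Decimal("0.1455"), 4)))
# (Decimal('0.2346'), 6)
```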
REAL NUMBER MULTIPLICATION (USING 4 DIGIT MANTISSA)
In multiplication the work is in the mantissas: multiply the mantissas and add the exponents.
(0.2361 × 10^2) × (0.1455 × 10^4)
= (0.2361 × 0.1455) × 10^(2+4) {add indices}
= 0.03435255 × 10^6
= 0.3435255 × 10^5
≈ 0.3435 × 10^5 {4 digit mantissa}
Notice that multiplication must work from the most significant digit downwards, since at some point the result has to be truncated to fit the mantissa.
REAL NUMBER DIVISION (USING 4 DIGIT MANTISSA)
(0.2361 × 10^2) / (0.1455 × 10^4)
= (0.2361 / 0.1455) × 10^(2-4) {subtract indices}
= 1.6226804 × 10^-2
= 0.16226804 × 10^-1
≈ 0.1623 × 10^-1 {4 digit mantissa}
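The multiplication and division examples can be reproduced with Python's standard `decimal` module by limiting the working precision to 4 significant digits. This is a sketch, not a model of any particular hardware; `as_normal_form` is a hypothetical helper that rewrites a result in the 0.dddd × 10^E form used above.

```python
from decimal import Context, Decimal

ctx = Context(prec=4)  # 4 significant digits, rounding to nearest

def as_normal_form(x):
    """Return (f, E) with x = f × 10^E and 0.1 <= |f| < 1."""
    if x == 0:
        return Decimal(0), 0
    exponent = x.adjusted() + 1
    return x.scaleb(-exponent), exponent

a = Decimal("0.2361E2")
b = Decimal("0.1455E4")

product = ctx.multiply(a, b)
quotient = ctx.divide(a, b)

print(as_normal_form(product))   # (Decimal('0.3435'), 5)
print(as_normal_form(quotient))  # (Decimal('0.1623'), -1)
```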
ERRORS IN FLOATING POINT ARITHMETIC
Round-off error, for example:
5.6999 ≈ 5.7
7.238 ≈ 7.24
Truncation error, for example:
4.67444444 → 4.674
5.45676767 → 5.4567
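The two kinds of error can be compared side by side. The sketch below assumes decimal string inputs and a chosen number of decimal places; `round_off` and `truncate` are hypothetical helper names built on the standard `decimal` module.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def round_off(text, places):
    """Round to the nearest value with the given number of decimal places."""
    step = Decimal(1).scaleb(-places)
    return Decimal(text).quantize(step, rounding=ROUND_HALF_UP)

def truncate(text, places):
    """Discard every digit beyond the given number of decimal places."""
    step = Decimal(1).scaleb(-places)
    return Decimal(text).quantize(step, rounding=ROUND_DOWN)

for text, places in [("5.6999", 1), ("7.238", 2)]:
    approx = round_off(text, places)
    print(text, "rounds to", approx, "error", abs(Decimal(text) - approx))

for text, places in [("4.67444444", 3), ("5.45676767", 4)]:
    approx = truncate(text, places)
    print(text, "truncates to", approx, "error", abs(Decimal(text) - approx))
```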
thanks