Representing Real Numbers Using Floating Point Notation Lecture 6 CSCI 1405, CSCI 1301 Introduction to Computer Science Fall 2009

Real numbers in Binary System

Remember that:

(101.001)_2 = 1*2^2 + 0*2^1 + 1*2^0 + 0*2^{-1} + 0*2^{-2} + 1*2^{-3}

= 2^2 + 2^0 + 2^{-3} = 5.125

(101011.101)_2 (which is 43.625) = 1.01011101 * 2^5.
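The positional expansion above can be checked with a small Python sketch. The function name `binary_to_decimal` is only an illustration and is not part of the lecture; it sums each digit times its power of two, exactly as in the worked example.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '101.001' as sum(digit * 2**position)."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Digits left of the point carry weights 2^0, 2^1, ... counted from the right.
    for i, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** i
    # Digits right of the point carry weights 2^-1, 2^-2, ... counted from the left.
    for i, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -i
    return value


print(binary_to_decimal("101.001"))     # 5.125
print(binary_to_decimal("101011.101"))  # 43.625
```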

Single Precision Floating Point

• A single precision floating point unit is a packet of 32 bits.

• It is divided into three sections of one bit, eight bits, and twenty-three bits, in that order.

Sign: 1 bit | Exponent: 8 bits | Mantissa: 23 bits

Single Precision Floating Point

• Sign Field: one bit long. It is either 0 or 1; 0 indicates that the number is positive, 1 that it is negative. The number 1.01011101 * 2^5 is positive, so this field has a value of 0.

• Exponent Field: eight bits long, and serves as the "exponent" of the number. The stored "exponent" is actually 127 greater than the "real" exponent. In our 1.01011101 * 2^5 number, the eight-bit exponent field has a decimal value of 5 + 127 = 132. In binary this is 10000100. (Note: for normalized numbers the real exponent ranges from -126 to +127; the field values 0 and 255 are reserved for special cases such as zero, infinity, and NaN.)

• Mantissa Field: twenty-three bits long, and serves as the "mantissa." In our 1.01011101 * 2^5 number, the most significant 1 is assumed to be there and is left out, which gives us one more bit of precision. Only the digits after the binary point are stored, padded with 0s on the right to 23 bits. Thus, the mantissa field for our number is 01011101000000000000000.

Putting the sign, exponent, and mantissa fields together:

0 10000100 01011101000000000000000 = 01000010001011101000000000000000
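To see the three fields at work, here is a minimal Python sketch that splits a 32-bit pattern into sign, exponent, and mantissa and rebuilds the value. The helper name `decode_single` is hypothetical, and the sketch assumes a normalized number (exponent field between 1 and 254); it does not handle zero, infinity, or NaN. The standard `struct` module is used only as a cross-check.

```python
import struct


def decode_single(bits: str):
    """Split a 32-character bit string into its fields and rebuild the value.

    Assumes a normalized number (exponent field between 1 and 254).
    """
    sign = int(bits[0])
    exponent_field = int(bits[1:9], 2)
    mantissa_field = bits[9:]
    real_exponent = exponent_field - 127  # undo the +127 bias
    # Restore the implied leading 1 in front of the 23 stored fraction bits.
    significand = 1 + int(mantissa_field, 2) / 2 ** 23
    value = (-1) ** sign * significand * 2 ** real_exponent
    return sign, exponent_field, mantissa_field, value


pattern = "01000010001011101000000000000000"
sign, exponent_field, mantissa_field, value = decode_single(pattern)
print(sign, exponent_field, mantissa_field, value)
# 0 132 01011101000000000000000 43.625

# Cross-check against Python's own single precision encoding of 43.625.
packed = struct.pack(">f", 43.625)
machine_bits = format(int.from_bytes(packed, "big"), "032b")
print(machine_bits == pattern)  # True
```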

Conversion from Decimal to Floating Point Representation

(329.390625)_10 = (?)_2

(329)_10 = (101001001)_2

(.390625)_10 = (0.011001)_2

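Fraction * 2 = Result -> Bit (read top to bottom)

0.390625 * 2 = 0.78125 -> 0

0.78125 * 2 = 1.5625 -> 1

0.5625 * 2 = 1.125 -> 1

0.125 * 2 = 0.25 -> 0

0.25 * 2 = 0.5 -> 0

0.5 * 2 = 1 -> 1

Remaining fraction: 0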
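The repeated multiplication by 2 shown above can be sketched in Python as follows. The name `fraction_to_binary` and the `max_bits` cutoff are assumptions made for this illustration; `Fraction` is used so the arithmetic is exact, and the built-in `bin` handles the integer part.

```python
from fractions import Fraction


def fraction_to_binary(text: str, max_bits: int = 23) -> str:
    """Convert a decimal fraction in [0, 1), given as a string, to binary digits."""
    frac = Fraction(text)  # exact value of a decimal string such as "0.390625"
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac >= 1)  # the integer part of the product is the next bit
        bits.append(str(bit))
        frac -= bit           # keep only the fractional part for the next step
    return "".join(bits)


print(bin(329)[2:])                    # 101001001
print(fraction_to_binary("0.390625"))  # 011001
```

The `max_bits` limit matters for fractions such as 0.1, whose binary expansion never terminates.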

Conversion from Decimal to Floating Point Representation

(329.390625)_10 = (101001001.011001)_2

= 1.01001001011001 * 2^8

• The sign is positive, so the sign field is 0.

• The exponent is 8. 8 + 127 = 135, so the exponent field is 10000111.

• The mantissa is merely 01001001011001 (remember the implied 1 of the mantissa means we don't include the leading 1) plus however many 0s we have to add to the right side to make that binary number 23 bits long.
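The three steps above can be combined into one short Python sketch. The function `encode_single` is hypothetical and makes simplifying assumptions: the input is a nonzero decimal string within the normalized range, and extra mantissa bits are truncated rather than rounded as real hardware would do. The result is compared with the encoding produced by the standard `struct` module.

```python
import struct
from fractions import Fraction


def encode_single(text: str) -> str:
    """Return 'sign exponent mantissa' for a nonzero decimal string.

    Assumes the value lies in the normalized range; extra bits are truncated.
    """
    value = Fraction(text)
    sign = "1" if value < 0 else "0"
    value = abs(value)

    # Normalize to 1.xxxx * 2^exponent.
    exponent = 0
    while value >= 2:
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1

    exponent_field = format(exponent + 127, "08b")  # add the bias of 127

    # Drop the implied leading 1 and emit 23 fraction bits.
    frac = value - 1
    mantissa = ""
    for _ in range(23):
        frac *= 2
        bit = int(frac >= 1)
        mantissa += str(bit)
        frac -= bit

    return f"{sign} {exponent_field} {mantissa}"


fields = encode_single("329.390625")
print(fields)  # 0 10000111 01001001011001000000000

# Cross-check against Python's own single precision encoding.
packed = struct.pack(">f", 329.390625)
machine_bits = format(int.from_bytes(packed, "big"), "032b")
print(fields.replace(" ", "") == machine_bits)  # True
```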

Thank You
