Lecture2-2010dmrocke.ucdavis.edu/Class/EAD115 Fall 2010.old/Lecture2-2010.pdf · • 1 sign bit •...

EAD 115

Numerical Solution of Engineering and Scientific Problems

David M. Rocke

Department of Applied Science

Computer Representation of Numbers

• Counting numbers (unsigned integers) are the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, …

• In almost all computers, these numbers are represented in binary (base 2) rather than decimal.

• We count 0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, …

Fixed length Integers

• Data storage is generally in bytes, where 1 byte = 8 bits.

• With one-byte integers, the smallest integer that can be stored is 0, and the largest is $11111111_2 = 2^8 - 1 = 255$.

• Internet (IPv4) addresses consist of four bytes, so no part of an IP address exceeds 255 (UC Davis is 168.150.243.2).

• The IP address 168.150.243.2 looks like this in binary:

10101000

10010110

11110011

00000010
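The conversion above can be checked with a short Python sketch. The function name `ipv4_to_binary` is an illustrative choice, not something from the lecture.

```python
def ipv4_to_binary(address):
    """Return each byte of a dotted IPv4 address as an 8-bit binary string."""
    return [format(int(part), "08b") for part in address.split(".")]

octets = ipv4_to_binary("168.150.243.2")
for octet in octets:
    print(octet)
```

Each byte is padded to eight digits, so small values such as 2 still show all of their leading zeros.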

More Unsigned Integers

• Two-byte, or 16-bit, short integers can represent any whole number from 0 to 65,535

• Long integers of four bytes, or 32 bits, can represent any whole number from 0 to 4,294,967,295

• If each disk block has a long integer as its address, and each disk block has 4,096 bytes, then the disk can hold 16 TB ($2^{32} \times 2^{12} = 2^{44}$ bytes)
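As a quick check of these ranges, the following sketch computes the largest unsigned value for a given word length and the capacity of a disk addressed by 32-bit block numbers. It assumes 4,096-byte blocks and counts a terabyte as $2^{40}$ bytes; the names are illustrative.

```python
def max_unsigned(bits):
    """Largest value an unsigned integer of the given width can hold."""
    return 2**bits - 1

BLOCK_SIZE = 4096                      # assumed block size in bytes
addressable_blocks = 2**32             # one block per 32-bit address
disk_bytes = addressable_blocks * BLOCK_SIZE

print(max_unsigned(8), max_unsigned(16), max_unsigned(32))
print(disk_bytes // 2**40, "TB")
```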

Application: Digital Audio

• Uncompressed digital audio can be represented as a sequence of loudness levels

• A pure tone has a sequence that evolves as a sine wave

• The loudness levels can be represented as unsigned integers, giving all possible values

Pure Tone

6-bit Audio

Sampling Rate

• The sampling rate is the number of times per second that a loudness measure is taken

• CDs are sampled 44,100 times per second (44.1 kHz)

• Digital recordings are typically 44.1, 48, 96, or 192 kHz

Word Length

• 8-bit audio has $2^8$ = 256 discrete loudness levels. This is crude

• 16-bit audio has $2^{16}$ = 65,536 loudness levels. This is what is used for CDs

• Audio is now often recorded at 24 bits, which gives $2^{24}$ = 16,777,216 levels and is difficult to distinguish from the smooth original

Loudest Sound

• In 16-bit audio, the loudest sound that can be recorded has a numerical value of 65,535 ($2^{16} - 1$)

• If the input in a recording goes over this level, it is still recorded at 65,535

• This leads to distorted sound, which is much more unpleasant than analog overload distortion (as with Jimi Hendrix)
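A minimal sketch of this clipping behavior, assuming samples in the range [-1, 1] are mapped linearly onto unsigned 16-bit levels. The gain, frequency, and sample count are illustrative values, not taken from the lecture.

```python
import math

MAX_LEVEL = 2**16 - 1  # largest unsigned 16-bit value, 65,535

def quantize(sample, max_level=MAX_LEVEL):
    """Map a sample in [-1, 1] to an integer level, clipping anything outside."""
    level = round((sample + 1.0) / 2.0 * max_level)
    return min(max(level, 0), max_level)

def pure_tone(gain, frequency=440.0, rate=44100, count=200):
    """Quantized samples of a sine wave at the given gain."""
    return [quantize(gain * math.sin(2 * math.pi * frequency * n / rate))
            for n in range(count)]

clean = pure_tone(gain=0.9)      # stays inside the available range
clipped = pure_tone(gain=1.5)    # overloads: peaks are flattened at MAX_LEVEL
print(max(clean), max(clipped), clipped.count(MAX_LEVEL))
```

With a gain above 1 the tops and bottoms of the sine wave are flattened, which is the source of the distortion.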

Pure Tone with no Headroom

Signed integers

• 16-bit signed integers can represent any whole number from -32,768 to 32,767 (in the usual two's-complement representation)

Integer Overflow

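• Suppose we are using one-byte signed integers, which can represent any whole number from -128 to 127.

• What happens when we add 100 and 100? The answer should be 200, but…

• 1100100 + 1100100 = 11001000, which needs eight bits for the magnitude, but only seven are available once one bit is used for the sign. The stored result is wrong: dropping the top bit gives 1001000, or 72, and a two's-complement machine reads 11001000 as -56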
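The overflow can be simulated in Python, whose own integers never overflow, by masking the result to a fixed width. This is a sketch with illustrative helper names; real hardware behavior depends on the representation in use.

```python
def truncate_to_bits(value, bits):
    """Keep only the lowest `bits` bits of a non-negative integer."""
    return value & ((1 << bits) - 1)

def as_twos_complement(value, bits=8):
    """Interpret the lowest `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value

total = 100 + 100
print(format(total, "b"))             # bit pattern of 200
print(truncate_to_bits(total, 7))     # magnitude truncated to seven bits
print(as_twos_complement(total, 8))   # same byte read as two's complement
```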

Decimal Numbers

• Decimal numbers or floating point (vs. fixed point) are represented in scientific notation.

• 1,437,526 = $0.1437526 \times 10^7$

• Exponent +7, mantissa +1437526

• We represent this in binary on a computer

• Typical single/double precision:

– 1 sign bit

– 8/11 exponent bits (one sign)

– 23/52 bit mantissa

Hypothetical 7-bit Reals

• 1 sign bit

• 3 exponent bits

• 3 mantissa bits

• Mantissa normalized to be between 0.5 and 1 to avoid wasting bits. We don’t want to use a mantissa of 001 when we could use a mantissa of 100 instead, since 100 and 101 (for example) look the same when truncated. (We could omit the leading 1.)

Smallest Positive Number

• Sign 0 (positive)

• Exponent sign 1 (negative)

• Exponent magnitude 11 (3 in decimal)

• Mantissa: the smallest normalized value is 100 (the next smallest is 011, which has a leading 0).

• 100 represents $2^{-1} = 0.5$ in decimal.

• Smallest positive number is $0.5 \times 2^{-3} = 2^{-4} = 1/16$

• If we divide this by 2 we get 0 (underflow)!

Largest Positive Number

• Sign 0 (positive)

• Exponent sign 0 (positive)

• Exponent magnitude 11 (3 in decimal)

• Mantissa: the largest normalized value is 111

• 111 represents $2^{-1} + 2^{-2} + 2^{-3} = 0.875$ in decimal.

• Largest positive number is $0.875 \times 2^3 = 7$

• If we multiply this by 2 we get overflow!

Many Numbers Cannot be Represented Exactly

• 1/3 in our 7-bit real has the following representation:

0 1 0 1 1 0 1

(sign 0; exponent sign 1 with magnitude 01; mantissa 101, so the value is $0.625 \times 2^{-1}$)

• This is .3125 instead of .3333333 because that is as close as it can get

• When multiplied by 3, the result is 0.9375 instead of 1

• (3)(1/3) = 0.9375!
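The hypothetical 7-bit format is small enough to enumerate completely. The sketch below assumes the layout described on these slides (sign bit, exponent sign bit, two exponent magnitude bits, three mantissa bits read as a binary fraction); the function name `decode` is illustrative.

```python
from itertools import product

def decode(bits):
    """Decode a 7-character bit string in the hypothetical 7-bit format."""
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[2:4], 2) * (-1 if bits[1] == "1" else 1)
    mantissa = int(bits[4:7], 2) / 8   # three fractional bits: .b1 b2 b3
    return sign * mantissa * 2.0**exponent

# Every positive value whose mantissa is normalized (leading mantissa bit is 1).
positive = sorted({decode("0" + "".join(b))
                   for b in product("01", repeat=6) if b[3] == "1"})

smallest, largest = positive[0], positive[-1]
nearest_third = min(positive, key=lambda v: abs(v - 1 / 3))
print(smallest, largest, nearest_third, 3 * nearest_third)
```

Enumerating the values this way also shows how unevenly they are spaced: they are dense near zero and sparse near the top of the range.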

Limitations of Floating Point

• There is a limited range of quantities that can be represented

• There is only a finite number of quantities that can be represented in a given range

• Chopping = truncation or rounding of numbers that cannot be represented exactly

Machine Epsilon

• Machine epsilon is the largest computer number ε such that (1 + ε) - 1 = 0

• Excel uses double precision, which has a 52-bit mantissa.

• Machine epsilon is about this size: $2^{-52} \approx 2.2 \times 10^{-16}$

Some Excel Arithmetic

ε          (1 + ε) - 1
1E-13      1E-13
1E-14      0.999E-14
5E-15      5.11E-15
1E-15      0
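A similar experiment can be run in Python, whose `float` is an IEEE double. This sketch uses a closely related and common definition of machine epsilon, the smallest power of two ε for which 1 + ε is still greater than 1. The printed values need not match the Excel column above, since Excel may apply its own rounding to results.

```python
import sys

def machine_epsilon():
    """Halve eps until adding half of it to 1.0 no longer changes the sum."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps

eps = machine_epsilon()
print(eps, eps == sys.float_info.epsilon)

# Repeat the (1 + e) - 1 experiment from the table in double precision.
for e in (1e-13, 1e-14, 5e-15, 1e-15):
    print(e, (1.0 + e) - 1.0)
```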

Precision and Accuracy

• Precision means the variability between estimates

• Accuracy means the amount of deviation between the estimate and the “true value”

Errors of approximation

• True Value = Approximation + Error

• $E_T$ = TV - Approx

• (True) Relative error is $\varepsilon_T = E_T / TV$

• Absolute (relative) error is the absolute value of the (relative) error

• $\varepsilon_A = E_A$ / Approximation

• Both the error and the relative error can matter

Example

• True Value = 20

• Approximation = 20.5

• $E_T$ = TV - Approximation = -0.5

• (True) Relative error is $\varepsilon_T = E_T / TV$ = -0.5/20 = -0.025 or -2.5%

• $E_A = |E_T|$ = 0.5

• $\varepsilon_A$ = 0.025 or 2.5%
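The same arithmetic as a small Python sketch. The function names are illustrative, and the relative error is taken with respect to the true value, as in the example.

```python
def true_error(true_value, approximation):
    """E_T = TV - Approx."""
    return true_value - approximation

def relative_error(true_value, approximation):
    """eps_T = E_T / TV."""
    return true_error(true_value, approximation) / true_value

e_t = true_error(20.0, 20.5)
eps_t = relative_error(20.0, 20.5)
print(e_t, eps_t, abs(e_t), f"{abs(eps_t):.1%}")
```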

(Series) truncation error

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$

Roundoff Error

• Results from the approximate representation of numbers in a computer

• Accumulation over many computations

• Addition or subtraction of small and large numbers

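$$\begin{aligned}
(n-1)s^2 &= \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
(n-1)s^2 &= \sum_{i=1}^{n} (x_i^2 - 2 x_i \bar{x} + \bar{x}^2) \\
(n-1)s^2 &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 \\
(n-1)s^2 &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
(n-1)s^2 &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{aligned}$$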

Shortcut or Mistake?

• The variance of the data set {1,2,3,4,5} is 2.5.

• The variance of the data set {100,000,001, 100,000,002, …} is the same because the spacing has not changed

• The shortcut formula gives 2 for the variance in Excel

• If the sequence starts 1,000,000,001, the variance by the shortcut is 0!
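The effect can be reproduced in double-precision Python. This is a sketch with illustrative function names; the exact wrong value the shortcut returns depends on the order of operations, so it may differ from what Excel reports.

```python
def two_pass_variance(data):
    """Subtract the mean first, then sum the squared deviations."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def shortcut_variance(data):
    """Shortcut formula: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return (sum(x * x for x in data) - n * mean * mean) / (n - 1)

small = [1.0, 2.0, 3.0, 4.0, 5.0]
large = [1_000_000_000.0 + k for k in range(1, 6)]

print(two_pass_variance(small), shortcut_variance(small))
print(two_pass_variance(large), shortcut_variance(large))
```

The shortcut subtracts two numbers near $5 \times 10^{18}$ that agree in almost all of their digits, so the small difference that carries the variance is lost to roundoff. The two-pass form works with deviations of size 1 or 2 and stays exact here.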

Taylor’s Theorem

• Can often approximate a function by a polynomial

• The error in the approximation is related to the first omitted term

• There are several forms for the error

• We will use this kind of analysis extensively in this course

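$$\begin{aligned}
f(x) &= f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n \\
R_n &= \int_a^x \frac{(x-t)^n}{n!} f^{(n+1)}(t)\,dt \\
R_n &= \frac{(x-a)^{n+1}}{(n+1)!} f^{(n+1)}(\xi), \quad \xi \text{ is between } a \text{ and } x
\end{aligned}$$

$$\begin{aligned}
f(x+h) &= f(x) + f'(x)h + \frac{f''(x)}{2!}h^2 + \cdots + \frac{f^{(n)}(x)}{n!}h^n + R_n \\
R_n &= \frac{h^{n+1}}{(n+1)!} f^{(n+1)}(\xi)
\end{aligned}$$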

Series Truncation Error

• In general, the more terms in a Taylor series, the smaller the error

• In general, the smaller the step size h, the smaller the error

• Error is $O(h^{n+1})$, so halving the step size should reduce the error by a factor on the order of $2^{n+1}$

• In general, the smoother the function, the smaller the error
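The step-size claim can be checked numerically. The sketch below uses $e^x$ expanded about 0 as a convenient test function (an illustrative choice) and compares the truncation error of the order-2 polynomial at h = 0.1 and h = 0.05.

```python
import math

def taylor_exp(h, n):
    """Order-n Taylor polynomial of e^x about 0, evaluated at h."""
    return sum(h**k / math.factorial(k) for k in range(n + 1))

def truncation_error(h, n):
    """Absolute difference between e^h and its order-n Taylor polynomial."""
    return abs(math.exp(h) - taylor_exp(h, n))

n = 2
ratio = truncation_error(0.1, n) / truncation_error(0.05, n)
print(ratio, 2 ** (n + 1))   # the ratio should be close to 2^(n+1)
```

The ratio is not exactly $2^{n+1}$ because the higher-order terms of the remainder also contribute, but it approaches that value as h shrinks.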

Taylor Series Approximation of a Polynomial

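$$\begin{aligned}
f(x) &= -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 \\
f(0) &= 1.2, \quad f(1) = 0.2 \\
\hat{f}_0(1) &= f(0) = 1.2 \\
f'(0) &= -0.25 \\
\hat{f}_1(1) &= f(0) + f'(0)(1) = 1.2 - 0.25 = 0.95 \\
f''(0) &= -1 \\
\hat{f}_2(1) &= 1.2 - 0.25 - \frac{(1)(1)}{2!} = 0.95 - 0.5 = 0.45
\end{aligned}$$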

$$\begin{aligned}
f(x) &= -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 \\
f(0) &= 1.2; \quad f'(0) = -0.25; \quad f''(0) = -1; \quad f'''(0) = -0.9 \\
f''''(0) &= -2.4; \quad f^{(n)}(0) = 0, \; n > 4 \\
f(x) &= 1.2 - 0.25(x) + (-1/2)(x)^2 + (-0.9/6)(x)^3 + (-2.4/24)(x)^4 \\
\hat{f}_2(x) &= 1.2 - 0.25(x) + (-1/2)(x)^2 \\
\hat{f}_2(x) &= -0.5x^2 - 0.25x + 1.2
\end{aligned}$$

$$\begin{aligned}
f(x) &= -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 \\
f(1) &= 0.2; \quad f'(1) = -2.1; \quad f''(1) = -3.1; \quad f'''(1) = -3.3 \\
f''''(1) &= -2.4; \quad f^{(n)}(1) = 0, \; n > 4 \\
f(x) &= 0.2 - 2.1(x-1) + (-3.1/2)(x-1)^2 + (-3.3/6)(x-1)^3 + (-2.4/24)(x-1)^4 \\
\hat{f}_2(x) &= 0.2 - 2.1(x-1) + (-3.1/2)(x-1)^2 \\
\hat{f}_2(x) &= 0.2 - 2.1x + 2.1 - 1.55x^2 + 3.1x - 1.55 \\
\hat{f}_2(x) &= -1.55x^2 + 1.0x + 0.75
\end{aligned}$$
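The Taylor polynomials of this example can be generated mechanically from the coefficients. This is a sketch with illustrative names; it stores the polynomial as a coefficient list in increasing powers of x and needs Python 3.8 or later for `math.perm`.

```python
import math

# f(x) = 1.2 - 0.25x - 0.5x^2 - 0.15x^3 - 0.1x^4, lowest power first
COEFFS = [1.2, -0.25, -0.5, -0.15, -0.1]

def derivative_at(coeffs, k, a):
    """k-th derivative at a of the polynomial sum(coeffs[j] * x**j)."""
    return sum(c * math.perm(j, k) * a ** (j - k)
               for j, c in enumerate(coeffs) if j >= k)

def taylor(coeffs, a, order, x):
    """Taylor polynomial of the given order about a, evaluated at x."""
    return sum(derivative_at(coeffs, k, a) / math.factorial(k) * (x - a) ** k
               for k in range(order + 1))

# Approximations to f(1) from expansions about a = 0, orders 0 through 4.
approximations = [taylor(COEFFS, 0.0, n, 1.0) for n in range(5)]
print(approximations)

# Derivatives at a = 1, for the expansion about that point.
print([derivative_at(COEFFS, k, 1.0) for k in range(5)])
```

The order-4 value reproduces f(1) up to roundoff, since a fourth-degree polynomial has no higher-order terms to truncate.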

Approximating Polynomials

• Any fourth degree polynomial has a fifth derivative that is identically zero

• The remainder term for the order four Taylor series contains the fifth derivative at a point, so it is zero.

• Thus the order four Taylor series approximation is exact; that is, it is the polynomial itself.

• The Taylor approximation of order n to a function f(x) at a point a, written $\hat{f}_n(x; a)$, is the best polynomial approximation to f() at a in the following sense:

– It is a polynomial

– It is of order n or less (no terms higher than $x^n$)

– It matches the value and first n derivatives of f() at a.

Taylor Series and Euler’s Method

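$$\begin{aligned}
\frac{dv}{dt} &= g - \frac{c}{m}v \\
v(x+h) &= v(x) + v'(x)h + \frac{v''(x)}{2!}h^2 + \frac{v'''(x)}{3!}h^3 + \cdots \\
v'(x) &= g - \frac{c}{m}v(x) \\
v''(x) &= -\frac{c}{m}v'(x) = -\frac{gc}{m} + \left(\frac{c}{m}\right)^2 v(x)
\end{aligned}$$

$$\begin{aligned}
\frac{dv}{dt} &= g - \frac{c}{m}v \\
v(t_{i+1}) &= v(t_i) + \frac{dv}{dt}(t_{i+1} - t_i) + R_1 \\
v(t_{i+1}) &\approx v(t_i) + \left(g - \frac{c}{m}v(t_i)\right)(t_{i+1} - t_i) \\
R_1 &= \frac{v''(\xi)}{2!}(t_{i+1} - t_i)^2 = \frac{v''(\xi)}{2}h^2 = O(h^2)
\end{aligned}$$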
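A minimal sketch of Euler's method for this equation, with a check that the error of a single step shrinks like $h^2$. The values of g, c, and m are illustrative assumptions, not taken from these slides, and the closed-form solution assumes v(0) = 0.

```python
import math

G, C, M = 9.81, 12.5, 68.1   # assumed values for g, c, m (illustration only)

def exact_velocity(t):
    """Closed-form solution of dv/dt = g - (c/m) v with v(0) = 0."""
    return G * M / C * (1.0 - math.exp(-C * t / M))

def euler_velocity(t_end, h):
    """Apply v(t_{i+1}) = v(t_i) + (g - (c/m) v(t_i)) h starting from v(0) = 0."""
    v = 0.0
    for _ in range(round(t_end / h)):
        v += (G - C / M * v) * h
    return v

def local_error(h):
    """Error after one Euler step of size h from t = 0."""
    return abs(exact_velocity(h) - euler_velocity(h, h))

ratio = local_error(0.2) / local_error(0.1)
print(ratio)   # halving h should cut the one-step error by about a factor of 4
```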

Nonlinearity and Step Size

• For the first-order Taylor approximation, the more nearly linear the function is, the better the approximation

• The smaller the step size, the better the approximation

$$\begin{aligned}
f(x) &= x^m \\
f'(x) &= m x^{m-1} \\
f(x+h) &= f(x) + f'(x)h + R_1 \\
R_1 &= \frac{f''(\xi)}{2!}h^2 = \frac{m(m-1)}{2!}\xi^{m-2}h^2
\end{aligned}$$

Numerical Differentiation

First Forward Difference

$$\begin{aligned}
f(x_{i+1}) &= f(x_i) + f'(x_i)(x_{i+1} - x_i) + O\left[(x_{i+1} - x_i)^2\right] \\
f'(x_i)(x_{i+1} - x_i) &= f(x_{i+1}) - f(x_i) + O\left[(x_{i+1} - x_i)^2\right] \\
f'(x_i) &= \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i} + O\left[x_{i+1} - x_i\right] \\
f'(x_i) &= \frac{\Delta f_i}{h} + O(h)
\end{aligned}$$

First Backward Difference

$$\begin{aligned}
f(x_{i-1}) &= f(x_i) - f'(x_i)h + O(h^2) \\
f'(x_i) &= \frac{f(x_i) - f(x_{i-1})}{h} + O(h) \\
f'(x_i) &= \frac{\nabla f_i}{h} + O(h)
\end{aligned}$$

First Centered Difference

$$\begin{aligned}
f(x_{i+1}) &= f(x_i) + f'(x_i)h + 0.5 f''(x_i)h^2 + O(h^3) \\
f(x_{i-1}) &= f(x_i) - f'(x_i)h + 0.5 f''(x_i)h^2 + O(h^3) \\
f(x_{i+1}) - f(x_{i-1}) &= 2 f'(x_i)h + O(h^3) \\
f'(x_i) &= \frac{f(x_{i+1}) - f(x_{i-1})}{2h} + O(h^2)
\end{aligned}$$

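$$\begin{aligned}
f(x) &= -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 \\
h &= 0.5; \quad x = 0.5 \\
f(0.5) &= 0.925; \quad f'(0.5) = -0.9125 \\
f(0) &= 1.2; \quad f(1) = 0.2 \\
\text{Forward: } f'(0.5) &\approx (0.2 - 0.925)/0.5 = -1.45, \quad |(-0.9125 + 1.45)/(-0.9125)| = 0.589 \\
\text{Backward: } f'(0.5) &\approx (0.925 - 1.2)/0.5 = -0.55, \quad |(-0.9125 + 0.55)/(-0.9125)| = 0.397 \\
\text{Centered: } f'(0.5) &\approx (0.2 - 1.2)/((2)(0.5)) = -1.00, \quad |(-0.9125 + 1.00)/(-0.9125)| = 0.096
\end{aligned}$$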

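$$\begin{aligned}
f(x) &= -0.1x^4 - 0.15x^3 - 0.5x^2 - 0.25x + 1.2 \\
h &= 0.25; \quad x = 0.5 \\
f(0.5) &= 0.925; \quad f'(0.5) = -0.9125 \\
f(0.25) &= 1.10351563; \quad f(0.75) = 0.63632813 \\
\text{Forward: } f'(0.5) &\approx (0.63632813 - 0.925)/0.25 = -1.155, \quad |(-0.9125 + 1.155)/(-0.9125)| = 0.265 \\
\text{Backward: } f'(0.5) &\approx (0.925 - 1.10351563)/0.25 = -0.714, \quad |(-0.9125 + 0.714)/(-0.9125)| = 0.217 \\
\text{Centered: } f'(0.5) &\approx (0.63632813 - 1.10351563)/0.5 = -0.934, \quad |(-0.9125 + 0.934)/(-0.9125)| = 0.024
\end{aligned}$$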

Summary of Example

Relative Error    h = 0.5    h = 0.25
Forward           0.589      0.265
Backward          0.397      0.217
Centered          0.096      0.024
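The summary table can be regenerated with a few lines of Python. This is a sketch with illustrative function names; the exact derivative is coded by hand from the polynomial.

```python
def f(x):
    return -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2

def df(x):
    """Exact derivative of f, used as the true value."""
    return -0.4 * x**3 - 0.45 * x**2 - x - 0.25

def forward(f, x, h):
    return (f(x + h) - f(x)) / h

def backward(f, x, h):
    return (f(x) - f(x - h)) / h

def centered(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

RULES = {"Forward": forward, "Backward": backward, "Centered": centered}

def relative_errors(x, h):
    """Absolute relative error of each difference rule at x for step h."""
    true = df(x)
    return {name: abs((true - rule(f, x, h)) / true)
            for name, rule in RULES.items()}

coarse = relative_errors(0.5, 0.5)
fine = relative_errors(0.5, 0.25)
for name in RULES:
    print(f"{name:9s} {coarse[name]:.3f} {fine[name]:.3f}")
```

Halving h roughly halves the forward and backward errors and cuts the centered error by about a factor of four, in line with the O(h) and O(h²) error terms.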

Second Differences

$$\begin{aligned}
f''(x_i) &\approx \Delta^2 f(x_i)/h^2 = h^{-2}\left[\Delta f(x_{i+1}) - \Delta f(x_i)\right] \\
f''(x_i) &\approx h^{-2}\left[\left(f(x_{i+2}) - f(x_{i+1})\right) - \left(f(x_{i+1}) - f(x_i)\right)\right] \\
f''(x_i) &\approx h^{-2}\left[f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)\right]
\end{aligned}$$

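$$\begin{aligned}
I f_i &= f_i, \quad F f_i = f_{i+1} \\
\Delta f_i &= f_{i+1} - f_i = (F - I) f_i \\
\Delta^2 &= (F - I)(F - I) = F^2 - FI - IF + I^2 = F^2 - 2F + I \\
\Delta^2 f_i &= f_{i+2} - 2f_{i+1} + f_i
\end{aligned}$$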

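$$\begin{aligned}
\nabla f_i &= f_i - f_{i-1} = (I - B) f_i \\
\nabla^2 f_i &= (I - B)^2 f_i = (I - 2B + B^2) f_i \\
\nabla^2 f_i &= f_i - 2f_{i-1} + f_{i-2} \\
B = F^{-1}: \quad \Delta\nabla &= (F - I)(I - B) = F - I - FB + B = F - 2I + B \\
(F - B) f_i &= f_{i+1} - f_{i-1} \\
(F - B)^2 f_i &= (F^2 - 2FB + B^2) f_i = f_{i+2} - 2f_i + f_{i-2}
\end{aligned}$$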

Second Derivatives

Second Forward Difference

$$\begin{aligned}
f(x_{i+2}) &= f(x_i) + f'(x_i)(2h) + 0.5 f''(x_i)(2h)^2 + O(h^3) \\
f(x_{i+1}) &= f(x_i) + f'(x_i)(h) + 0.5 f''(x_i)(h)^2 + O(h^3) \\
-2f(x_{i+1}) &= -2f(x_i) - 2f'(x_i)(h) - f''(x_i)(h)^2 + O(h^3) \\
f(x_{i+2}) - 2f(x_{i+1}) + f(x_i) &= f''(x_i)(h)^2 + O(h^3) \\
f''(x_i) &= \frac{f(x_{i+2}) - 2f(x_{i+1}) + f(x_i)}{h^2} + O(h) \\
f''(x_i) &= \frac{\Delta^2 f_i}{h^2} + O(h)
\end{aligned}$$

Second Derivatives

Second Backward Difference

$$\begin{aligned}
f''(x_i) &= \frac{f(x_i) - 2f(x_{i-1}) + f(x_{i-2})}{h^2} + O(h) \\
f''(x_i) &= \frac{\nabla^2 f_i}{h^2} + O(h)
\end{aligned}$$

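Second Derivatives

Second Centered Difference

$$\begin{aligned}
f''(x_i) &= \frac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2} + O(h^2) \\
f''(x_i) &= \frac{\Delta\nabla f_i}{h^2} + O(h^2)
\end{aligned}$$

Here $\Delta$ is the forward difference and $\nabla$ the backward difference.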
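The three second-difference formulas can be compared on the polynomial used in the earlier first-derivative example. This is a sketch with illustrative names, using x = 0.5 and h = 0.1 as an arbitrary test point and step.

```python
def f(x):
    return -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2

def d2f(x):
    """Exact second derivative of f, used as the true value."""
    return -1.2 * x**2 - 0.9 * x - 1.0

def second_forward(f, x, h):
    return (f(x + 2 * h) - 2 * f(x + h) + f(x)) / h**2

def second_backward(f, x, h):
    return (f(x) - 2 * f(x - h) + f(x - 2 * h)) / h**2

def second_centered(f, x, h):
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

x, h = 0.5, 0.1
errors = {
    "forward": abs(second_forward(f, x, h) - d2f(x)),
    "backward": abs(second_backward(f, x, h) - d2f(x)),
    "centered": abs(second_centered(f, x, h) - d2f(x)),
}
print(errors)
```

The centered formula is much closer to the true value here, consistent with its O(h²) error against O(h) for the one-sided formulas.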

Propagation of Error

• Suppose that we have an approximation of the quantity x, and we then transform the value of x by a function f(x).

• How is the error in f(x) related to the error in x?

• How can we determine this if f is a function of several inputs?

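With $\tilde{x}$ the approximation to $x$:

$$f(x) = f(\tilde{x}) + f'(\tilde{x})(x - \tilde{x}) + \frac{f''(\tilde{x})}{2!}(x - \tilde{x})^2 + \cdots$$

$$f(x) - f(\tilde{x}) \approx f'(\tilde{x})(x - \tilde{x})$$

If the error is bounded, $|x - \tilde{x}| \le B$:

$$|f(x) - f(\tilde{x})| \le |f'(\tilde{x})|\,B$$

If the error is random with standard deviation $SD(\tilde{x})$:

$$SD(f(\tilde{x})) \approx |f'(\tilde{x})|\,SD(\tilde{x})$$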

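$$\begin{aligned}
\tilde{x}_1 &= x_1 + \varepsilon_1 \\
\tilde{x}_2 &= x_2 + \varepsilon_2 \\
f(\tilde{x}_1, \tilde{x}_2) &= f(x_1, x_2) + \varepsilon_1 \frac{\partial f}{\partial x_1}(x_1, x_2) + \varepsilon_2 \frac{\partial f}{\partial x_2}(x_1, x_2) + \cdots
\end{aligned}$$

If the errors are bounded, $|\varepsilon_i| \le B_i$:

$$|f(\tilde{x}_1, \tilde{x}_2) - f(x_1, x_2)| \le B_1 \left|\frac{\partial f}{\partial x_1}(x_1, x_2)\right| + B_2 \left|\frac{\partial f}{\partial x_2}(x_1, x_2)\right|$$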
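A minimal sketch of the first-order error bound for a function of several inputs. The partial derivatives are estimated with centered differences, and the example function, evaluation point, and input bounds are illustrative assumptions.

```python
def partial(f, point, i, step=1e-6):
    """Centered-difference estimate of the partial derivative of f in input i."""
    hi = list(point)
    lo = list(point)
    hi[i] += step
    lo[i] -= step
    return (f(*hi) - f(*lo)) / (2 * step)

def error_bound(f, point, bounds):
    """First-order bound: sum over inputs of |df/dx_i| * B_i."""
    return sum(abs(partial(f, point, i)) * b for i, b in enumerate(bounds))

def area(width, height):
    """Example function of two inputs."""
    return width * height

bound = error_bound(area, (2.0, 3.0), (0.1, 0.2))
print(bound)   # |3| * 0.1 + |2| * 0.2 to first order
```

The bound is a first-order estimate; it ignores the product of the two input errors, which matters only when the errors are large.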

Stability and Condition

• If small changes in the input produce large changes in the answer, the problem is said to be ill conditioned or unstable

• Numerical methods should be able to cope with ill conditioned problems

• Naïve methods may not meet this requirement

• The condition number is the ratio of the relative error of the output to the relative error of the input

The error of the input is $\varepsilon = \tilde{x} - x$, and $\varepsilon / x$ is the relative error of the input.

The error of the output is

$$f(\tilde{x}) - f(x) \approx \varepsilon f'(x)$$

and the relative error of the output is

$$\frac{f(\tilde{x}) - f(x)}{f(x)} \approx \frac{\varepsilon f'(x)}{f(x)}$$

The ratio of the output RE to the input RE is

$$\frac{[f(\tilde{x}) - f(x)] / f(x)}{\varepsilon / x} \approx \frac{\varepsilon f'(x)}{(\varepsilon / x) f(x)} = \frac{x f'(x)}{f(x)}$$
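The final ratio, $x f'(x) / f(x)$, is easy to evaluate when the derivative is known. The sketch below compares a well-conditioned function with an ill-conditioned one; the two test functions and evaluation points are illustrative choices.

```python
import math

def condition_number(f, df, x):
    """Ratio of output relative error to input relative error, x f'(x) / f(x)."""
    return x * df(x) / f(x)

# Square root: relative errors are halved, wherever it is evaluated.
cond_sqrt = condition_number(math.sqrt, lambda x: 0.5 / math.sqrt(x), 4.0)

# Tangent just below pi/2: tiny input errors are hugely magnified.
cond_tan = condition_number(math.tan, lambda x: 1.0 / math.cos(x) ** 2, 1.57)

print(cond_sqrt, cond_tan)
```

A condition number near or below 1 means input errors are not amplified, while a large value signals an ill-conditioned evaluation.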
