CS-110 Computational Engineering Part 3


CS-110 Computational Engineering

Part 3

• A. Avudainayagam

• Department of Mathematics

Root Finding: f(x)=0

Method 1: The Bisection Method

Theorem: If f(x) is continuous in [a,b] and f(a)f(b) < 0, then there is at least one root of f(x) = 0 in (a,b).

[Figure: a single root x1 in (a,b), with f(a) < 0 and f(b) > 0; and a second curve with multiple roots x1, ..., x5 in (a,b).]

The Method

• Find an interval [x0, x1] so that f(x0)f(x1) < 0. (This may not be easy. Use your knowledge of the physical phenomenon which the equation represents.)

• Cut the interval length into half with each iteration by examining the sign of the function at the midpoint

  x2 = (x0 + x1)/2

• If f(x2) = 0, x2 is the root.

• If f(x2) ≠ 0 and f(x0)f(x2) < 0, the root lies in [x0, x2].

• Otherwise the root lies in [x2, x1].

• Repeat the process until the interval shrinks to a desired level.

Number of Iterations and Error Tolerance

• Length of the interval (where the root lies) after n iterations:

  e_n = (x1 - x0) / 2^n

• We can fix the number of iterations so that the root lies within an interval of chosen length ε (error tolerance):

  n ≥ ln((x1 - x0)/ε) / ln 2

• If n satisfies this, the midpoint lies within a distance ε/2 of the actual root.

Though the root lies in a small interval, |f(x)| may not be small if f(x) has a large slope. Conversely, if |f(x)| is small, x may not be close to the root if f(x) has a small slope.

So we use both these facts for the termination criterion: we first choose an error tolerance ε on |f(x)|, i.e. |f(x)| < ε, and m, the maximum number of iterations.

Pseudo code (Bisection Method)

1. Input ε > 0, m > 0, x1 > x0 so that f(x0)f(x1) < 0.
   Compute f0 = f(x0).
   k = 1 (iteration count)

2. Do {
   (a) Compute x2 = (x0 + x1)/2 and f2 = f(x2).
   (b) If f2 f0 < 0, set x1 = x2; otherwise set x0 = x2 and f0 = f2.
   (c) Set k = k + 1.
   }

3. While |f2| > ε and k ≤ m

4. Set x = x2, the root.
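A minimal Python sketch of the bisection pseudocode (function and parameter names such as `bisect`, `eps`, `m` are ours, not from the slides):

```python
def bisect(f, x0, x1, eps=1e-10, m=100):
    """Bisection method: f must change sign on [x0, x1]."""
    f0 = f(x0)
    if f0 * f(x1) > 0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    for _ in range(m):
        x2 = (x0 + x1) / 2          # midpoint of the current bracket
        f2 = f(x2)
        if abs(f2) < eps:           # termination criterion on |f(x)|
            break
        if f0 * f2 < 0:             # root lies in [x0, x2]
            x1 = x2
        else:                       # root lies in [x2, x1]
            x0, f0 = x2, f2
    return x2

# illustrative use: root of x^3 - x - 2 = 0 bracketed by [1, 2]
root = bisect(lambda x: x**3 - x - 2, 1.0, 2.0)
```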

False Position Method (Regula Falsi)

Instead of bisecting the interval [x0, x1], we choose as x2 the point where the straight line through the end points meets the x-axis, and bracket the root with [x0, x2] or [x2, x1] depending on the sign of f(x2).

False Position Method

[Figure: the chord through (x0, f0) and (x1, f1) cuts the x-axis at x2, between x0 and x1, on the curve y = f(x).]

Straight line through (x0, f0), (x1, f1):

  y = f1 + ((f1 - f0)/(x1 - x0)) (x - x1)

New end point x2:

  x2 = x1 - f1 (x1 - x0)/(f1 - f0)

False Position Method (Pseudo Code)

1. Choose ε > 0 (tolerance on |f(x)|), m > 0 (maximum number of iterations), k = 1 (iteration count), and x0, x1 so that f0 f1 < 0, where f0 = f(x0), f1 = f(x1).

2. Do {
   (a) Compute x2 = x1 - f1 (x1 - x0)/(f1 - f0) and f2 = f(x2).
   (b) If f0 f2 < 0, set x1 = x2, f1 = f2; otherwise set x0 = x2, f0 = f2.
   (c) k = k + 1
   }

3. While (|f2| > ε) and (k ≤ m)

4. x = x2, the root.
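A Python sketch of the false position pseudocode (our names; illustrative only):

```python
def false_position(f, x0, x1, eps=1e-10, m=200):
    """Regula falsi: like bisection, but x2 is where the chord cuts the x-axis."""
    f0, f1 = f(x0), f(x1)
    if f0 * f1 > 0:
        raise ValueError("the root must be bracketed: f(x0) f(x1) < 0")
    for _ in range(m):
        # intersection of the chord through (x0, f0), (x1, f1) with the x-axis
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        f2 = f(x2)
        if abs(f2) < eps:
            break
        if f0 * f2 < 0:             # root in [x0, x2]
            x1, f1 = x2, f2
        else:                       # root in [x2, x1]
            x0, f0 = x2, f2
    return x2

root = false_position(lambda x: x**3 - x - 2, 1.0, 2.0)
```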

Newton-Raphson Method / Newton’s Method

At an approximation xk to the root, the curve is approximated by the tangent to the curve at xk, and the next approximation x(k+1) is the point where the tangent meets the x-axis.

[Figure: tangents at x0, x1, x2 on y = f(x), converging to the root.]

Tangent at (xk, fk):

  y = f(xk) + f'(xk)(x - xk)

This tangent cuts the x-axis at x(k+1):

  x(k+1) = xk - f(xk)/f'(xk)

Warning: if f'(xk) is very small, the method fails.

• Two function evaluations per iteration (f and f').

Newton's Method - Pseudo code

1. Choose ε > 0 (function tolerance |f(x)| < ε), m > 0 (maximum number of iterations), x0 (initial approximation), k = 1 (iteration count). Compute f0 = f(x0).

2. Do {
   q = f'(x0) (evaluate the derivative at x0)
   x1 = x0 - f0/q
   x0 = x1
   f0 = f(x0)
   k = k + 1
   }

3. While (|f0| > ε) and (k ≤ m)

4. x = x1, the root.
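A Python sketch of Newton's method (our names; the derivative guard threshold is an assumption):

```python
def newton(f, df, x0, eps=1e-12, m=50):
    """Newton-Raphson: x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for _ in range(m):
        fx = f(x)
        if abs(fx) < eps:
            break
        d = df(x)
        if abs(d) < 1e-14:          # method fails when f'(x_k) is very small
            raise ZeroDivisionError("derivative too small")
        x = x - fx / d
    return x

# illustrative use: sqrt(5) as the root of f(x) = x^2 - 5, starting at x0 = 2
root = newton(lambda x: x * x - 5, lambda x: 2 * x, 2.0)
```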

Getting caught in a cycle of Newton's Method

Alternate iterations fall at the same point: the iterates oscillate between xk and x(k+1). No convergence.

[Figure: iterates cycling between xk and x(k+1) on y = f(x).]

Newton's Method for finding the square root of a number, x = √a:

  f(x) = x² - a = 0

  x(k+1) = xk - (xk² - a)/(2 xk) = (xk + a/xk)/2

Example: a = 5, initial approximation x0 = 2.

x1 = 2.25
x2 = 2.236111111
x3 = 2.236067978
x4 = 2.236067978

The Secant Method

• Newton's method requires 2 function evaluations per iteration (f and f').

• The secant method requires only 1 function evaluation per iteration and converges nearly as fast as Newton's method at a simple root.

• Start with two points x0, x1 near the root (no need to bracket the root as in the bisection or regula falsi methods).

• x(k-1) is dropped once x(k+1) is obtained.

The Secant Method (Geometrical Construction)

• Two initial points x0, x1 are chosen.

• The next approximation x2 is the point where the straight line joining (x0, f0) and (x1, f1) meets the x-axis.

• Take (x1, x2) and repeat.

[Figure: the secant through (x0, f0) and (x1, f1) cuts the x-axis at x2, on the curve y = f(x).]

The Secant Method (Pseudo Code)

1. Choose ε > 0 (function tolerance |f(x)| < ε), m > 0 (maximum number of iterations), x0, x1 (two initial points near the root), f0 = f(x0), f1 = f(x1), k = 1 (iteration count).

2. Do {
   x2 = x1 - f1 (x1 - x0)/(f1 - f0)
   x0 = x1
   f0 = f1
   x1 = x2
   f1 = f(x2)
   k = k + 1
   }

3. While (|f1| > ε) and (k ≤ m)

4. x = x1, the root.
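A Python sketch of the secant pseudocode (our names; illustrative):

```python
def secant(f, x0, x1, eps=1e-12, m=100):
    """Secant method: one new function evaluation per iteration."""
    f0, f1 = f(x0), f(x1)
    for _ in range(m):
        if abs(f1) < eps:
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # secant step
        x0, f0 = x1, f1                        # drop the oldest point
        x1, f1 = x2, f(x2)
    return x1

# illustrative use: sqrt(5) from the two nearby points 2 and 3
root = secant(lambda x: x * x - 5, 2.0, 3.0)
```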

General remarks on Convergence

# The false position method in general converges faster than the bisection method (but not always; a counterexample exists).

# The bisection method and the false position method are guaranteed to converge.

# The secant method and the Newton-Raphson method are not guaranteed to converge.

Order of Convergence

# A measure of how fast an algorithm converges .

Let α be the actual root: f(α) = 0.

Let xk be the approximate root at the kth iteration. The error at the kth iteration is ek = |xk - α|.

The algorithm converges with order p if there exists a constant C such that

  e(k+1) ≤ C ek^p

Order of Convergence

# Bisection method: p = 1 (linear convergence)
# False position: generally superlinear (1 < p < 2)
# Secant method: superlinear, p = (1 + √5)/2 ≈ 1.618
# Newton-Raphson method: p = 2 (quadratic)

Machine Precision

# The smallest positive float εM that can be added to one and produce a sum that is greater than one.

Pseudo code to find ‘Machine Epsilon’

1. Set M = 1

2. Do

{

M = M / 2

x = 1 + M

}

3. While ( x < 1 )

4. M = 2 M
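The pseudocode above translates directly to Python (for IEEE double-precision floats the result is 2⁻⁵²):

```python
def machine_epsilon():
    """Halve M until 1 + M no longer exceeds 1, then double back once."""
    m = 1.0
    while True:
        m /= 2
        x = 1.0 + m
        if not x > 1.0:      # 1 + M rounded back to 1: M is below epsilon
            break
    return 2 * m

eps = machine_epsilon()
```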

Solutions Of Linear Algebraic Systems

AX = B

A : n × n matrix, B : n × 1 vector, X : n × 1 vector (unknown)

• If |A| ≠ 0, the system has a unique solution.

• If |A| = 0, the matrix A is called a singular matrix.

# Direct Methods

# Iterative Methods

The Augmented Matrix

[A | B] =

  a11  a12  a13 ... a1n | b1
  a21  a22  a23 ... a2n | b2
  ...
  an1  an2  an3 ... ann | bn

Elementary row operations

1. Interchange rows j and k.

2. Multiply row k by a nonzero constant α.

3. Add α times row j to row k.

• Use elementary row operations and transform the equation so that the solution becomes apparent

Diagonal Matrix (D):

  a11   0    0
   0   a22   0
   0    0   a33

Upper Triangular Matrix (U):

  a11  a12  a13
   0   a22  a23
   0    0   a33

Lower Triangular Matrix (L):

  a11   0    0
  a21  a22   0
  a31  a32  a33

Gauss Elimination Method

Using row operations, the matrix is reduced to an upper triangular matrix and the solution is obtained by back substitution.

Partial Pivot Method

Step 1: Look for the largest element in absolute value (called the pivot) in the first column and interchange its row with the first row.

Step 2: Multiply the first row by an appropriate number and subtract it from the second row so that the first element of the second row becomes zero. Do this for the remaining rows so that all the elements in the first column except the topmost element are zero.

Now the (augmented) matrix looks like:

  x x x x x
  0 x x x x
  0 x x x x
  0 x x x x

Now look for the pivot in the second column below the first row and interchange its row with the second row. Multiply the second row by appropriate numbers and subtract it from the remaining rows (below the second row) so that the matrix looks like:

  x x x x x
  0 x x x x
  0 0 x x x
  0 0 x x x

Now look for the pivot in the third column below the second row and interchange its row with the third row. Proceed till the matrix looks like:

  x x x x x
  0 x x x x
  0 0 x x x
  0 0 0 x x

Now the solution is obtained by back substitution.

Gauss Elimination (An example)

Solve:

  1  0  2     x1     1
  2  2  5     x2  =  3
  4  1  8     x3     2

Augmented matrix:

  1  0  2 | 1
  2  2  5 | 3
  4  1  8 | 2

Pivot: interchange rows 1 and 3:

  4  1  8 | 2
  2  2  5 | 3
  1  0  2 | 1

Eliminate the first column (row 2 - (2/4) row 1, row 3 - (1/4) row 1):

  4   1     8 | 2
  0   1.5   1 | 2
  0  -0.25  0 | 0.5

Eliminate the second column (row 3 + (0.25/1.5) row 2):

  4   1     8      | 2
  0   1.5   1      | 2
  0   0     0.1667 | 0.8333

Solution by back substitution:

x3 = 0.8333/0.1667 = 5

x2 = (2 - 5)/1.5 = -2

x1 = (2 - x2 - 8 x3)/4 = (2 + 2 - 40)/4 = -9

x1 = -9, x2 = -2, x3 = 5

Gauss Elimination (Pseudo Code)

1. Input matrix A of order n × n and RHS vector B, and form the augmented matrix [A | B].

2. For j = 1 to n do
   {
   (a) Compute the pivot index p, j ≤ p ≤ n, such that |A(p,j)| = max over j ≤ i ≤ n of |A(i,j)|.
   (b) If A(p,j) = 0, print "singular matrix" and exit.
   (c) If p > j, interchange rows p and j.
   (d) For each i > j, subtract A(i,j)/A(j,j) times row j from row i.
   }

3. Now the augmented matrix is [C | D], where C is upper triangular.

4. Solve by back substitution. For j = n down to 1 compute:

   x_j = (D_j - Σ(i = j+1 to n) C(j,i) x_i) / C(j,j)
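A Python sketch of the pseudocode (our names; the demo system takes the worked example's rows as [1,0,2], [2,2,5], [4,1,8] with B = (1,3,2), which reproduces the back substitution shown, x = (-9, -2, 5)):

```python
def gauss_solve(A, B):
    """Solve AX = B by Gauss elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b] for row, b in zip(A, B)]          # augmented matrix [A|B]
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(M[i][j]))  # pivot index
        if M[p][j] == 0:
            raise ValueError("singular matrix")
        if p != j:
            M[j], M[p] = M[p], M[j]                     # interchange rows
        for i in range(j + 1, n):
            r = M[i][j] / M[j][j]
            for k in range(j, n + 1):
                M[i][k] -= r * M[j][k]                  # eliminate below pivot
    x = [0.0] * n
    for j in range(n - 1, -1, -1):                      # back substitution
        s = sum(M[j][i] * x[i] for i in range(j + 1, n))
        x[j] = (M[j][n] - s) / M[j][j]
    return x

x = gauss_solve([[1.0, 0, 2], [2, 2, 5], [4, 1, 8]], [1.0, 3, 2])
```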

Determinant of a Matrix

Some properties of the determinant:

1. det(Aᵀ) = det(A)

2. det(AB) = det(A) det(B)

3. det(U) = U11 U22 U33 ... Unn for triangular U

4. det[E1(k,j) A] = -det A (interchanging rows k and j)

5. det[E2(k,α) A] = α det A (multiplying row k by α)

6. det[E3(k,j,α) A] = det A (adding α times row j to row k)

Matrix Determinant

The Gauss elimination procedure uses two types of row operations:

1. Interchange of rows for pivoting.

2. Add α times row j to row k.

The first operation changes the sign of the determinant, while the second has no effect on the determinant.

To obtain the matrix determinant, follow Gauss elimination to the upper triangular form U, keeping count r of the interchanges of rows. Then

  det A = (-1)^r U11 U22 ... Unn

Matrix Determinant (Pseudo Code)

1. Set r = 0.

2. For j = 1 to n do
   {
   (a) Compute the pivot index p, j ≤ p ≤ n, such that |A(p,j)| = max over j ≤ i ≤ n of |A(i,j)|.
   (b) If A(p,j) = 0, print "Singular Matrix" and exit.
   (c) If p > j, interchange rows p and j and set r = r + 1.
   (d) For each i > j, subtract A(i,j)/A(j,j) times row j from row i.
   }

3. det = (-1)^r A11 A22 ... Ann

Matrix Determinant (An example)

  A =   1  4  0        r = 0
        0  2  6
       -1  0  1

Eliminate the first column (row 3 + row 1):

        1  4  0        r = 0
        0  2  6
        0  4  1

Pivot in the second column: interchange rows 2 and 3:

        1  4  0        r = 1
        0  4  1
        0  2  6

Eliminate the second column (row 3 - (2/4) row 2):

        1  4  0        r = 1
        0  4  1
        0  0  5.5

det = (-1)¹ (1)(4)(5.5) = -22
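A Python sketch of the determinant pseudocode (our names; the demo uses the example matrix as reconstructed above, with A31 = -1, an assumption consistent with det = -22):

```python
def det(A):
    """Determinant via Gauss elimination with partial pivoting."""
    n = len(A)
    M = [row[:] for row in A]
    sign = 1
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(M[i][j]))
        if M[p][j] == 0:
            return 0.0                     # singular matrix
        if p != j:
            M[j], M[p] = M[p], M[j]
            sign = -sign                   # each interchange flips the sign
        for i in range(j + 1, n):
            r = M[i][j] / M[j][j]
            for k in range(j, n):
                M[i][k] -= r * M[j][k]
    d = float(sign)
    for j in range(n):
        d *= M[j][j]                       # product of the diagonal of U
    return d

d = det([[1.0, 4, 0], [0, 2, 6], [-1, 0, 1]])
```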

LU Decomposition

• When Gauss elimination is used to solve a linear system, the row operations are applied simultaneously to the RHS vector.

• If the system is solved again for a different RHS vector, the row operations must be repeated.

• The LU decomposition eliminates the need to repeat the row operations each time a new RHS vector is used.

The matrix A (in AX = B) is written as the product of a lower triangular matrix L and an upper triangular matrix U. The diagonal elements of the upper triangular matrix are chosen to be 1 to get a consistent system of equations.

n = 3:

  A11 A12 A13     L11   0    0      1  U12  U13
  A21 A22 A23  =  L21  L22   0      0   1   U23
  A31 A32 A33     L31  L32  L33     0   0    1

9 equations for 9 unknowns L11, L21, L22, L31, L32, L33, U12, U13, U23:

  A11 A12 A13     L11   L11 U12         L11 U13
  A21 A22 A23  =  L21   L21 U12 + L22   L21 U13 + L22 U23
  A31 A32 A33     L31   L31 U12 + L32   L31 U13 + L32 U23 + L33

First column gives  (L11, L21, L31) = (A11, A21, A31)

First row gives     (U12, U13) = (A12/L11, A13/L11)

Second column gives (L22, L32) = (A22 - L21 U12, A32 - L31 U12)

Second row gives    U23 = (A23 - L21 U13)/L22

Third column gives  L33 = A33 - L31 U13 - L32 U23

LU Decomposition (Pseudo Code)

L(i,1) = A(i,1) , i = 1, 2, ..., n

U(1,j) = A(1,j)/L(1,1) , j = 2, 3, ..., n

For j = 2, 3, ..., n-1:

  L(i,j) = A(i,j) - Σ(k = 1 to j-1) L(i,k) U(k,j) , i = j, j+1, ..., n

  U(j,k) = (A(j,k) - Σ(i = 1 to j-1) L(j,i) U(i,k)) / L(j,j) , k = j+1, ..., n

L(n,n) = A(n,n) - Σ(k = 1 to n-1) L(n,k) U(k,n)
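A Python sketch of the pseudocode (Crout decomposition with unit diagonal in U; no pivoting, so nonzero pivots L(j,j) are assumed; the demo matrix is the example below, with L32 taken as -3.5 so that L·U reproduces A):

```python
def lu_crout(A):
    """Crout LU decomposition: A = L U, with 1s on the diagonal of U."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):        # column j of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for k in range(j + 1, n):    # row j of U
            U[j][k] = (A[j][k] - sum(L[j][i] * U[i][k] for i in range(j))) / L[j][j]
    return L, U

A = [[2.0, 5, 1], [1, 3, 1], [3, 4, 2]]
L, U = lu_crout(A)
```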

LU Decomposition (Example)

  2  5  1     2   0    0     1  2.5  0.5
  1  3  1  =  1   0.5  0     0   1    1
  3  4  2     3  -3.5  4     0   0    1

     A            L               U

The values of U are stored in the zero space of L:

  Q = [L \ U] =   L11  U12  U13 ... U1n
                  L21  L22  U23 ... U2n
                  ...
                  Ln1  Ln2  Ln3 ... Lnn

After each element of A is employed, it is not needed again, so Q can be stored in place of A.

After LU decomposition of the matrix A, the system AX = B is solved in 2 steps: (1) forward substitution, (2) backward substitution.

AX = LUX = B. Put UX = Y. Then

  L Y = B :

  L11   0    0    Y1     B1
  L21  L22   0    Y2  =  B2
  L31  L32  L33   Y3     B3

Forward substitution:

  Y1 = B1/L11
  Y2 = (B2 - L21 Y1)/L22
  Y3 = (B3 - L31 Y1 - L32 Y2)/L33

Then solve

  U X = Y :

  1  U12  U13   x1     Y1
  0   1   U23   x2  =  Y2
  0   0    1    x3     Y3

Back substitution:

  x3 = Y3
  x2 = Y2 - U23 x3
  x1 = Y1 - U12 x2 - U13 x3

As in Gauss elimination, LU decomposition must employ pivoting to avoid division by zero and to minimize round off errors. The pivoting is done immediately after computing each column.
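The two substitution steps can be sketched in Python (our names; the demo L, U are the factors from the example, with L32 = -3.5, and B is the first unit vector):

```python
def lu_solve(L, U, B):
    """Solve A X = B given A = L U (U with unit diagonal):
    forward substitution for L Y = B, then back substitution for U X = Y."""
    n = len(B)
    Y = [0.0] * n
    for i in range(n):                       # forward substitution
        Y[i] = (B[i] - sum(L[i][j] * Y[j] for j in range(i))) / L[i][i]
    X = [0.0] * n
    for i in range(n - 1, -1, -1):           # back substitution
        X[i] = Y[i] - sum(U[i][j] * X[j] for j in range(i + 1, n))
    return X

L = [[2.0, 0, 0], [1, 0.5, 0], [3, -3.5, 4]]
U = [[1.0, 2.5, 0.5], [0, 1, 1], [0, 0, 1]]
x = lu_solve(L, U, [1.0, 0.0, 0.0])          # first column of the inverse
```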

Matrix Inverse Using the LU Decomposition

• LU decomposition can be used to obtain the inverse of the original coefficient matrix.

• Each column j of the inverse is determined by using a unit vector (with 1 in the jth row) as the RHS vector.

Matrix inverse using LU decomposition - an example

  2  5  1     2   0    0     1  2.5  0.5
  1  3  1  =  1   0.5  0     0   1    1
  3  4  2     3  -3.5  4     0   0    1

     A            L               U

The first column of the inverse of A is obtained by solving L U x = (1, 0, 0)ᵀ.

First solve L d = (1, 0, 0)ᵀ by forward substitution:

  2   0    0    d1     1
  1   0.5  0    d2  =  0       giving  d = (0.5, -1, -1.25)ᵀ
  3  -3.5  4    d3     0

Then solve U x = d by back substitution:

  1  2.5  0.5   x1     0.5
  0   1    1    x2  =  -1      giving  x = (0.5, 0.25, -1.25)ᵀ
  0   0    1    x3     -1.25

So the first column of A⁻¹ is (0.5, 0.25, -1.25)ᵀ. The second and third columns are obtained by taking the RHS vector as (0, 1, 0)ᵀ and (0, 0, 1)ᵀ.

Iterative Methods for the Solution of Linear Systems

• Jacobi iteration

• Gauss - Seidel iteration

Jacobi Iteration

AX = B

n = 3

a11x1 + a12x2 + a13x3 = b1

a21x1+ a22x2 + a23x3 = b2

a31x1 + a32x2 + a33x3 = b3

Preprocessing :

1. Arrange the equations so that the diagonal terms of the coefficient matrix are not zero.

2. Make row transformation if necessary to make the diagonal elements as large as possible.

Rewrite the equation as

x1 = 1/a11 (b1 - a12x2 - a13x3)

x2 = 1/a22 (b2 - a21x1 - a23x3)

x3 = 1/a33 (b3 - a31x1 - a32x2)

• Choose an initial approximation vector (x1(0), x2(0), x3(0)). If no prior information is available, then (x1(0), x2(0), x3(0)) = (0, 0, 0) will do.

Iterate:

  x_i(k+1) = (1/a_ii) (b_i - Σ(j ≠ i) a_ij x_j(k)) , 1 ≤ i ≤ n

Stop when the relative change is small enough, e.g. when

  max over i of |x_i(k+1) - x_i(k)| / |x_i(k+1)| × 100 < ε

Jacobi Pseudo Code
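A minimal Python sketch of the Jacobi iteration (function names, the absolute-change stopping test, and the diagonally dominant demo system are ours):

```python
def jacobi(A, B, x0=None, tol=1e-10, max_iter=500):
    """Jacobi iteration: every component of the new estimate is
    computed from the previous estimate only."""
    n = len(B)
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(max_iter):
        new = [(B[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
               for i in range(n)]
        if max(abs(new[i] - x[i]) for i in range(n)) < tol:
            return new
        x = new
    return x

# hypothetical diagonally dominant system with solution (1, 1, 1)
x = jacobi([[4.0, 1, 1], [1, 5, 2], [1, 2, 6]], [6.0, 8, 9])
```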

Diagonally Dominant Matrices

  |a_ii| > Σ(j = 1 to n, j ≠ i) |a_ij|

i.e., the diagonal element of each row is larger than the sum of the absolute values of the other elements in the row.

Diagonal dominance is a sufficient but not necessary condition for the convergence of the Jacobi or Gauss-Seidel iteration.

The Gauss-Seidel iteration

• In the Jacobi iteration, all the components x1(k+1), x2(k+1), x3(k+1), ..., xn(k+1) of the new estimate are computed from the current estimate x1(k), x2(k), x3(k), ..., xn(k).

• However, when x_i(k+1) is computed, the updated estimates x1(k+1), x2(k+1), ..., x_{i-1}(k+1) are already available.

• The Gauss-Seidel iteration takes advantage of the latest information available when updating an estimate.

Gauss-Seidel vs Jacobi

First iteration.

Jacobi:

  x1(1) = (b1 - a12 x2(0) - a13 x3(0)) / a11
  x2(1) = (b2 - a21 x1(0) - a23 x3(0)) / a22
  x3(1) = (b3 - a31 x1(0) - a32 x2(0)) / a33

Gauss-Seidel:

  x1(1) = (b1 - a12 x2(0) - a13 x3(0)) / a11
  x2(1) = (b2 - a21 x1(1) - a23 x3(0)) / a22
  x3(1) = (b3 - a31 x1(1) - a32 x2(1)) / a33

Subsequent iterations follow the same pattern: Jacobi uses only values from the previous iteration, while Gauss-Seidel always uses the most recently updated values.
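The Gauss-Seidel variant differs from the Jacobi sketch only in updating components in place (our names; same hypothetical demo system):

```python
def gauss_seidel(A, B, x0=None, tol=1e-10, max_iter=500):
    """Gauss-Seidel: each component is updated immediately, so later
    components in the same sweep already use the new values."""
    n = len(B)
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            new = (B[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            change = max(change, abs(new - x[i]))
            x[i] = new                      # immediate (in-place) update
        if change < tol:
            break
    return x

x = gauss_seidel([[4.0, 1, 1], [1, 5, 2], [1, 2, 6]], [6.0, 8, 9])
```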

Matrix Condition Number

• Vector and matrix norms: a norm is a real valued function that gives a measure of the size or length of a vector or matrix.

• Example: the length of the vector A = (a, b, c),

  ||A|| = √(a² + b² + c²)

is a norm.

For the n dimensional vector A = (x1, x2, x3, ..., xn):

  ||A|| = √(x1² + x2² + ... + xn²)

For the matrix A_ij:

  ||A|| = √( Σ(i = 1 to n) Σ(j = 1 to n) A_ij² )

These are called Euclidean norms.

Other Useful Norms

For vectors:

  ||A||1 = Σ(i = 1 to n) |x_i|              (sum of the absolute values of the elements)

  ||A||p = ( Σ(i = 1 to n) |x_i|^p )^(1/p)  (the p-norm)

  ||A||∞ = max over 1 ≤ i ≤ n of |x_i|      (maximum magnitude norm)

For matrices:

  ||A||1 = max over 1 ≤ j ≤ n of Σ(i = 1 to n) |A_ij|   (column sum norm)

  ||A||∞ = max over 1 ≤ i ≤ n of Σ(j = 1 to n) |A_ij|   (row sum norm)

Properties of norms

1. ||x|| > 0 for x ≠ 0

2. ||x|| = 0 for x = 0

3. ||αx|| = |α| ||x||

4. ||x + y|| ≤ ||x|| + ||y||

Ill Conditioning

Consider the system:

   100  -200   x1     100
  -200   401   x2  =  -100

The exact solution: x1 = 201, x2 = 100.

Consider the slightly perturbed system:

   101  -200   x1     100
  -200   400   x2  =  -100

A11 is increased by 1 percent; A22 is decreased by 0.25 percent (approx.).

Solution of the perturbed system: x1 = 50, x2 = 24.75.

Near 75% change in the solution.

When the solution of a system is highly sensitive to the values of the coefficient matrix or the RHS vector, we say that the system is ill-conditioned, or the matrix is ill-conditioned.

Ill-conditioned systems will result in large accumulated round off errors.

Condition number of a Matrix

k(A) = || A || || A-1 ||

• This is a measure of the relative sensitivity of the solution to

small change in the RHS vector.

• If k is large, the system is regarded as being ill - conditioned

# The row - sum norm is usually employed to find k(A)

Condition Number of the Matrix

  A =  100  -200        A⁻¹ =  4.01  2
      -200   401                2     1

Row-sum norm of A: ||A|| = max(300, 601) = 601.

Row-sum norm of A⁻¹: ||A⁻¹|| = max(6.01, 3) = 6.01.

Condition number k(A) = 601 (6.01) = 3612 (large).

A is ill-conditioned.

Equations that are not properly scaled may give rise to large condition numbers.

Consider

  A =    1     1         A⁻¹ =  0.5   0.0005
       1000 -1000                0.5  -0.0005

k(A) = ||A|| ||A⁻¹|| = 2000 × 0.5005 = 1001 (large)

Equilibration: scale each row of A by its largest element:

  A =  1   1             A⁻¹ =  0.5   0.5
       1  -1                    0.5  -0.5

k(A) = ||A|| ||A⁻¹|| = 2 (1) = 2

Great improvement in the condition number.

• Checking the residual is not enough when the condition number is high.

• Consider

  34  55   x     21
  55  89   y  =  34

Given x = -0.11 and y = 0.45:

  34(-0.11) + 55(0.45) - 21 = 0.01 ≈ 0
  55(-0.11) + 89(0.45) - 34 = 0.00 ≈ 0

Yet the exact solution is x = -1, y = 1: a tiny residual does not mean the candidate solution is close to the true one.
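The condition number computation can be sketched for 2×2 matrices in Python (our names; the demo matrix assumes the signs [[100, -200], [-200, 401]], which reproduces ||A|| = 601 and ||A⁻¹|| = 6.01 from the example):

```python
def row_sum_norm(A):
    """Row-sum (infinity) norm of a matrix."""
    return max(sum(abs(v) for v in row) for row in A)

def cond_2x2(A):
    """k(A) = ||A|| ||A^-1|| using the row-sum norm; 2x2 only,
    where the inverse has a closed form."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    Ainv = [[d / det, -b / det], [-c / det, a / det]]
    return row_sum_norm(A) * row_sum_norm(Ainv)

k = cond_2x2([[100.0, -200.0], [-200.0, 401.0]])   # ≈ 3612
```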

Interpolation

[Figure: data points P1, P2, P3, P4 and an interpolating curve in the x-y plane.]

Given the data (xk, yk), k = 1, 2, 3, ..., n, find a function f passing through each of the data samples:

  f(xk) = yk , 1 ≤ k ≤ n

Once f is determined, we can use it to predict the value of y at points other than the samples.

Piecewise Linear Interpolation

• A straight line segment is used between each adjacent pair of the data points.

[Figure: data at x1, x2, x3, x4, x5 joined by straight line segments.]

On [xk, xk+1]:

  f(x) = yk + ((y(k+1) - yk)/(x(k+1) - xk)) (x - xk) , 1 ≤ k < n

Polynomial Interpolation

For the data set (xk, yk), k = 1, ..., n, we find a polynomial of degree (n - 1) that has n constants:

  f(x) = Σ(k = 1 to n) a_k x^(k-1)

The n interpolation constraints f(xk) = yk give the linear system

  1  x1  x1² ... x1^(n-1)    a1     y1
  1  x2  x2² ... x2^(n-1)    a2  =  y2
  ...
  1  xn  xn² ... xn^(n-1)    an     yn

• Not feasible for large data sets, since the condition number of this system increases rapidly with increasing n.

Lagrange Interpolating Polynomial

The Lagrange polynomials Qk(x) are constructed as follows:

  Q1(x) = ((x - x2)(x - x3)...(x - xn)) / ((x1 - x2)(x1 - x3)...(x1 - xn))

  Q2(x) = ((x - x1)(x - x3)...(x - xn)) / ((x2 - x1)(x2 - x3)...(x2 - xn))
  .
  .
  .
  Qk(x) = ((x - x1)...(x - x(k-1))(x - x(k+1))...(x - xn)) / ((xk - x1)...(xk - x(k-1))(xk - x(k+1))...(xk - xn))

• Each Qk(x) is a polynomial of degree (n - 1).

• Qk(xj) = 1 for j = k, and 0 for j ≠ k.

• The polynomial curve that passes through the data set (xk, yk), k = 1, 2, ..., n is

  f(x) = y1 Q1(x) + y2 Q2(x) + ... + yn Qn(x)

• The polynomial is written directly, without having to solve a system of equations.

Lagrange Interpolation (Pseudo Code)

Choose x, the point where the function value is required.

y = 0
Do for i = 1 to n
  product = y_i
  Do for j = 1 to n
    if (i ≠ j)
      product = product × (x - x_j)/(x_i - x_j)
    end if
  end Do
  y = y + product
End Do
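A Python sketch of the pseudocode (our names; the demo data sample the polynomial x², so interpolating off the samples reproduces x² exactly):

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial at x."""
    y = 0.0
    for i in range(len(xs)):
        product = ys[i]
        for j in range(len(xs)):
            if i != j:
                product *= (x - xs[j]) / (xs[i] - xs[j])
        y += product
    return y

# data from y = x^2 at x = 0, 1, 2; the degree-2 interpolant is x^2 itself
y3 = lagrange([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0)
```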

Least Square Fit

If the number of samples is large, or if the dependent variable contains measurement noise, it is often better to find a function which minimizes an error criterion such as

  E = Σ(k = 1 to n) (f(xk) - yk)²

A function that minimizes E is called the least square fit.

Straight Line Fit

• To fit a straight line through the n points (x1, y1), (x2, y2), ..., (xn, yn), assume f(x) = a1 + a2 x.

• Error:

  E = Σ(k = 1 to n) (f(xk) - yk)² = Σ(k = 1 to n) (a1 + a2 xk - yk)²

• Find a1, a2 which minimize E:

  ∂E/∂a1 = 2 Σ (a1 + a2 xk - yk) = 0

  ∂E/∂a2 = 2 Σ (a1 + a2 xk - yk) xk = 0

This gives the normal equations:

   n      Σ xk     a1     Σ yk
   Σ xk   Σ xk²    a2  =  Σ xk yk

Straight Line Fit (An example)

Fit a straight line through the five points

(0, 2.10), (1, 2.85), (2, 1.10), (3, 3.20), (4, 3.90)

a11 = n = 5
a22 = Σ xk² = 1 + 4 + 9 + 16 = 30
a12 = a21 = Σ xk = 0 + 1 + 2 + 3 + 4 = 10
b1 = Σ yk = 2.10 + 2.85 + 1.10 + 3.20 + 3.90 = 13.15
b2 = Σ xk yk = 2.85 + 2(1.10) + 3(3.20) + 4(3.90) = 30.25

   5   10    a1     13.15
  10   30    a2  =  30.25

a1 = 1.84, a2 = 0.395, f(x) = 1.84 + 0.395x
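The normal equations can be solved directly in Python (our names; the second data point is taken as (1, 2.85), which matches the sums Σyk = 13.15 and Σxkyk = 30.25 used in the example):

```python
def fit_line(xs, ys):
    """Least-squares straight line f(x) = a1 + a2*x via the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the normal-equation matrix
    a1 = (sxx * sy - sx * sxy) / det
    a2 = (n * sxy - sx * sy) / det
    return a1, a2

a1, a2 = fit_line([0, 1, 2, 3, 4], [2.10, 2.85, 1.10, 3.20, 3.90])
```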

Data Representation

Integers - Fixed Point Numbers

Decimal System - base 10 uses 0,1,2,…,9

(396)10 = (6 × 10⁰) + (9 × 10¹) + (3 × 10²)

Binary System - base 2 uses 0, 1

(11001)2 = (1 × 2⁰) + (0 × 2¹) + (0 × 2²) + (1 × 2³) + (1 × 2⁴) = (25)10

Decimal to Binary Conversion

Convert (39)10 to binary form (base = 2):

  39 ÷ 2 = 19 + remainder 1
  19 ÷ 2 =  9 + remainder 1
   9 ÷ 2 =  4 + remainder 1
   4 ÷ 2 =  2 + remainder 0
   2 ÷ 2 =  1 + remainder 0
   1 ÷ 2 =  0 + remainder 1

Put the remainders in reverse order:

(100111)2 = (1 × 2⁰) + (1 × 2¹) + (1 × 2²) + (0 × 2³) + (0 × 2⁴) + (1 × 2⁵) = (39)10
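The repeated-division procedure is a few lines of Python (our names; illustrative):

```python
def to_binary(n):
    """Convert a non-negative integer to a binary digit string by
    repeated division by 2; remainders, reversed, give the digits."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, r = divmod(n, 2)     # quotient and remainder
        bits.append(str(r))
    return "".join(reversed(bits))
```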

Largest number that can be stored in m digits:

base 10: (999...9) = 10^m - 1
base 2:  (111...1) = 2^m - 1

m = 3: (999)10 = 10³ - 1 ; (111)2 = 2³ - 1

Limitation: memory cells consist of multiples of 8 bits (1 byte), each position containing 1 binary digit.

• Common cell lengths for integers : k = 16 bits

k = 32 bits

Sign - Magnitude Notation

First bit is used for a sign

0 - non negative number 1 - negative number

The remaining bits are used to store the binary magnitude of the number.

Limit of 16 bit cell : (32767)10

Limit of 32 bit cell : (2 147 483 647)10

Two's Complement Notation

Definition: the two's complement of a negative integer I in a k-bit cell:

  two's complement of I = 2^k + I

(E.g.) The two's complement of (-3)10 in a 3-bit cell
= (2³ - 3)10 = (5)10 = (101)2, so (-3)10 will be stored as 101.

Storage scheme for storing an integer I in a k-bit cell in two's complement notation:

  stored value C = I       , I ≥ 0 (first bit = 0)
                 = 2^k + I , I < 0

The two's complement notation admits one more negative number than the sign-magnitude notation.

(E.g.) Take a 2-bit cell (k = 2).

Range in sign-magnitude notation: ±(2¹ - 1) = ±1

  -1 = 11
   1 = 01

Range in two's complement notation:

  two's complement of -1 = 2² - 1 = (3)10 = (11)2
  two's complement of -2 = 2² - 2 = (2)10 = (10)2
  two's complement of -3 = 2² - 3 = (1)10 - not possible (it would collide with +1)

Floating Point Numbers

Integer part + fractional part:

Decimal System - base 10: 235.7846
Binary System - base 2: 10011.11101

Fractional part (0.7846)10 = 7 × 10⁻¹ + 8 × 10⁻² + 4 × 10⁻³ + 6 × 10⁻⁴

Fractional part (0.11101)2 = 1 × 2⁻¹ + 1 × 2⁻² + 1 × 2⁻³ + 0 × 2⁻⁴ + 1 × 2⁻⁵

Binary Fraction → Decimal Fraction

(10.11)2:

Integer part (10)2 = 0 × 2⁰ + 1 × 2¹ = 2
Fractional part (0.11)2 = 1 × 2⁻¹ + 1 × 2⁻² = 0.5 + 0.25 = 0.75

Decimal fraction = (2.75)10

Decimal Fraction → Binary Fraction

Convert (0.9)10 to a binary fraction:

  0.9 × 2 = 0.8 + integer part 1
  0.8 × 2 = 0.6 + integer part 1
  0.6 × 2 = 0.2 + integer part 1
  0.2 × 2 = 0.4 + integer part 0
  0.4 × 2 = 0.8 + integer part 0   ← repetition

(0.9)10 = (0.1110011001100...)2, with the block 1100 repeating.

Scientific Notation (Decimal)

0.0000747 = 7.47 × 10⁻⁵
31.4159265 = 3.14159265 × 10¹
9,700,000,000 = 9.7 × 10⁹

Binary

(10.01)2 = (1.001)2 × 2¹
(0.110)2 = (1.10)2 × 2⁻¹

The computer stores a binary approximation to x:

  x ≈ ± q × 2ⁿ , q - mantissa, n - exponent

(-39.9)10 = (-100111.11100 1100 1100 ...)2 = (-1.0011111100 1100 ...)2 × (2⁵)10

32 bits: first bit for the sign, next 8 bits for the exponent, 23 bits for the mantissa.

With the mantissa truncated to 23 bits, the decimal value of the stored number (-39.9)10 is

= -39.90000152587890625

Round-off errors can be reduced by efficient programming practice.

# The number of operations (multiplications and additions) must be kept to a minimum (complexity theory).

An Example of Efficient Programming

Problem: evaluate the polynomial

  P(x) = a4x⁴ + a3x³ + a2x² + a1x + a0

at a given x. Evaluating each term separately requires 13 multiplications and 4 additions. Bad programming!

An Efficient method (Horner’s method)

P(x) = a4x4 + a3x3 + a2x2 + a1x + a0

= a0 + x(a1 + x(a2 + x(a3 + xa4) ))

Requires 4 multiplications and 4 additions.

Pseudo code for an nth degree polynomial:

Input a's, n, x
p = 0
For i = n, n-1, ..., 0
  p = x·p + a(i)
End
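Horner's scheme in Python (our names; coefficients are passed highest degree first):

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x from coefficients [a_n, ..., a1, a0]
    (highest degree first): n multiplications and n additions."""
    p = 0.0
    for a in coeffs:
        p = p * x + a
    return p

value = horner([1, 0, -5], 3)    # x^2 - 5 at x = 3
```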

Numerical Integration

• When do we need numerical integration?
  • The antiderivative of f(x) is difficult to find.
  • f(x) is available only in a tabular form.

To evaluate ∫(a to b) f(x) dx, divide [a, b] into n subintervals of width h = (b - a)/n.

Trapezoidal Rule

  ∫(a to b) f(x) dx ≈ (h/2) [f(a) + 2{f(a+h) + f(a+2h) + ... + f(a+(n-1)h)} + f(b)]

Simpson's Rule (n even)

  ∫(a to b) f(x) dx ≈ (h/3) [f(a) + 4{f(a+h) + f(a+3h) + ...} + 2{f(a+2h) + f(a+4h) + ...} + f(b)]
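Both rules can be sketched in Python (our names; the demo integrates x² on [0, 1], whose exact value is 1/3, and Simpson's rule is exact for quadratics up to rounding):

```python
def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals of width h."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * s

def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))   # odd nodes
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))   # even interior nodes
    return h * s / 3

t = trapezoid(lambda x: x * x, 0.0, 1.0, 100)
s = simpson(lambda x: x * x, 0.0, 1.0, 10)
```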

Euler's Method (Pseudo Code)

Example problem: y' = x y, y(0) = 1. Exact solution: y = e^(x²/2).

Input: n (number of steps)
h = 1/n (step size)
x = 0, y = 1 (initial condition)
For i = 1 to n
  y = y + h·x·y   (solution loop; here f(x, y) = x y)
  x = x + h       (update the x value)
print y, x
end
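A Python sketch of explicit Euler (our names; the demo integrates the example problem y' = x y, y(0) = 1 to x = 1, where the exact solution e^(x²/2) gives e^0.5 ≈ 1.6487):

```python
def euler(f, x0, y0, h, n):
    """Explicit Euler: y_{k+1} = y_k + h f(x_k, y_k), n steps of size h."""
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

approx = euler(lambda x, y: x * y, 0.0, 1.0, 0.001, 1000)   # estimate y(1)
```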

A Second Order Method: Taylor Series Method

  y' = f(x, y) , y(x0) = y0

At (x0, y0), y' can be calculated:

  y'(x0) = f(x0, y0)

At (x0, y0), y'' can also be calculated by differentiating y' = f(x, y):

  y''(x0) = (f_x + f_y · dy/dx) at (x0, y0)

Taylor's Method

  y(x0 + h) = y(x0) + h y'(x0) + (h²/2) y''(x0)

Calculate y(x0 + h) from this, and proceed to x0 + 2h, ...

Taylor Series Method: An Example

  y' = x y , y(0) = 1

  y'(0) = 0
  y'' = y + x y' , so y''(0) = 1

Taylor series solution at x = 0.1 (h = 0.1):

  y(0.1) = y(0) + h y'(0) + (h²/2) y''(0)
         = 1 + 0 + (0.01/2)(1)
         = 1.005

Exact solution: y = e^(x²/2), so y(0.1) = e^0.005 ≈ 1.005.
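The second-order Taylor step for this particular equation can be sketched in Python (our names; y'' = y + x·y' is specific to y' = x y):

```python
import math

def taylor2(x0, y0, h, n):
    """Second-order Taylor method for y' = x*y:
    y(x+h) ≈ y + h*y' + (h^2/2)*y'', with y'' = y + x*y'."""
    x, y = x0, y0
    for _ in range(n):
        d1 = x * y               # y' = x y
        d2 = y + x * d1          # y'' = y + x y'
        y += h * d1 + 0.5 * h * h * d2
        x += h
    return y

y_01 = taylor2(0.0, 1.0, 0.1, 1)    # one step reproduces y(0.1) ≈ 1.005
```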

Finite Difference Method for Boundary Value Problems

On a grid of points x_i with spacing h, the derivatives are replaced by central differences:

  y'(x_i) ≈ (y(x_i + h) - y(x_i - h)) / (2h)

  y''(x_i) ≈ (y(x_i + h) - 2 y(x_i) + y(x_i - h)) / h²
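The two central-difference formulas can be checked numerically in Python (our names; the step sizes are illustrative choices balancing truncation and rounding error):

```python
import math

def d1_central(y, x, h=1e-5):
    """Central difference approximation to y'(x), error O(h^2)."""
    return (y(x + h) - y(x - h)) / (2 * h)

def d2_central(y, x, h=1e-4):
    """Central difference approximation to y''(x), error O(h^2)."""
    return (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)

slope = d1_central(math.sin, 0.0)     # ≈ cos(0) = 1
curve = d2_central(math.sin, 0.5)     # ≈ -sin(0.5)
```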
