Continuous Mathematics
Hilary Term 2015
Jonathan Whiteley
Preliminaries
A course webpage, from which all material can be downloaded, is located at
www.cs.ox.ac.uk/teaching/courses/2014-2015/ContMath/
The worksheets are designed so that (approximately) worksheet 1
covers the material in lectures 1-4, worksheet 2 covers the material in
lectures 5-8 etc.
A textbook for this course is “Foundations of Science Mathematics”
by D.S. Sivia and S.G. Rawlings
Other suitable textbooks are “Advanced Engineering Mathematics”
by E. Kreyszig, and “Mathematical Techniques” by D.W. Jordan and
P. Smith
Location
⇒ • Mathematical preliminaries
• Partial differentiation
• Taylor series
• Critical points
• Solution of nonlinear equations
• Constrained optimisation
• Integration
• Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Mathematical preliminaries
Powers
$y^N$ means the number $y$ multiplied by itself $N$ times:
$$y^1 = y \qquad y^2 = y \times y \qquad y^3 = y \times y \times y \qquad y^4 = y \times y \times y \times y$$
Four properties of powers are:
$$x^M x^N = x^{M+N} \qquad (x^M)^N = x^{MN} \qquad x^{-N} = \frac{1}{x^N} \qquad x^0 = 1$$
Logarithms
In this course we will use base e for logarithms and so
$$y = \log x \Leftrightarrow x = e^y = \exp(y)$$
where e = 2.718281828459046...
The symbol "⇔" means "implies and is implied by" and denotes that the expressions on either side are equivalent
This definition of a logarithm makes sense for computer scientists — in all common programming languages the expression "log(x)" corresponds to the logarithm with base e of the variable x
Hence, in this course
$$\log x = \log_e x = \ln x$$
Properties of logarithms include
$$\log xy = \log x + \log y \qquad \log \frac{x}{y} = \log x - \log y \qquad \log(x^N) = N \log x$$
Example: simplify $\log \frac{x}{y^4}$
$$\log \frac{x}{y^4} = \log x - \log(y^4) = \log x - 4\log y$$
Trigonometry
Using the standard definitions of the sine, cosine and tangent of an angle we may deduce that
$$\tan\theta = \frac{\sin\theta}{\cos\theta}$$
We also have the inverses of the sine, cosine and tangent functions, defined by
$$y = \sin x \Leftrightarrow x = \arcsin y \qquad y = \cos x \Leftrightarrow x = \arccos y \qquad y = \tan x \Leftrightarrow x = \arctan y$$
We will use the (standard) notation
$$\sin^2 x = (\sin x)^2 \qquad \sin x^2 = \sin(x^2)$$
with similar notation for other trigonometric and log functions
We also have the following formulae
$$\sin(x+y) = \sin x\cos y + \sin y\cos x$$
$$\cos(x+y) = \cos x\cos y - \sin x\sin y$$
$$\sin^2 x + \cos^2 x = 1$$
Complex numbers
The imaginary number i is defined by $i^2 = -1$
We may perform standard arithmetic on complex numbers: if a, b, c, d are real numbers then
$$(a + bi) + (c + di) = (a + c) + (b + d)i$$
$$(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i$$
$$\frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{(c + di)(c - di)} = \frac{(ac + bd) + (bc - ad)i}{c^2 + d^2}$$
If the complex number z is given by $z = a + bi$, where a and b are real numbers, then the conjugate of z, denoted by $\bar{z}$, is given by
$$\bar{z} = a - bi$$
If θ is a real number then
$$e^{i\theta} = \cos\theta + i\sin\theta$$
The modulus of a number
Sometimes we are interested in the size of a number, and not the sign: we define |x|, known as the modulus of x, by
$$|x| = \begin{cases} x & x \geq 0 \\ -x & x < 0 \end{cases}$$
[Figure: plot of y = |x| for -10 ≤ x ≤ 10]
Factorial notation
For a positive integer n the factorial of n, denoted by n!, is given by
$$n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1$$
For example
$$1! = 1 \qquad 2! = 2 \times 1 = 2 \qquad 3! = 3 \times 2 \times 1 = 6 \qquad 7! = 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 5040$$
Note that $n! = n \times (n-1)!$
We also define 0! = 1
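The recurrence n! = n × (n-1)!, with base case 0! = 1, translates directly into code; a minimal sketch (the name `factorial` is just illustrative, not course material):

```python
def factorial(n):
    """Return n! using the recurrence n! = n * (n-1)!, with 0! = 1."""
    return 1 if n == 0 else n * factorial(n - 1)

assert factorial(7) == 5040
```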
Summation notation
Suppose we want an expression for the sum of the squares of the first N positive integers, which we may write verbosely as
$$S = 1 + 4 + 9 + 16 + \cdots + (N-1)^2 + N^2$$
There is a more convenient notation for this:
$$S = \sum_{n=1}^{N} n^2$$
The notation above tells us to:
1. take each integer n between 1 and N inclusive;
2. evaluate n² for each, and calculate the sum of these results.
For example
$$\sum_{n=-3}^{5} n = (-3) + (-2) + (-1) + 0 + 1 + 2 + 3 + 4 + 5 = 9$$
$$\sum_{k=2}^{5} k(k-1) = 2 \times 1 + 3 \times 2 + 4 \times 3 + 5 \times 4 = 40$$
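Summation notation maps directly onto a loop or a generator expression; a small sketch reproducing the two examples above:

```python
# sum of n for n = -3, ..., 5 (range is half-open, so the stop value is 6)
print(sum(n for n in range(-3, 6)))            # 9

# sum of k(k-1) for k = 2, ..., 5
print(sum(k * (k - 1) for k in range(2, 6)))   # 40
```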
Functions
A function f, defined for x in a specified interval (which may be infinite), assigns a unique value f(x) to each such x.
Functions are often illustrated by plotting y = f(x) against x
[Figure: plots of f(x) = x³ - 10x² - 5 and f(x) = exp(sin(x²))]
Composite functions
The second example function above — f(x) = exp(sin x²) — is known as a composite function
To evaluate f(x) for a given value of x we first evaluate t = sin x²
f(x) is then evaluated by substituting t into
$$f(x) = \exp(t)$$
Limits of functions
Suppose the functions f(x), g(x), h(x) are defined as follows
$$f(x) = x^2 - 3x \qquad g(x) = 5x^2 + x \qquad h(x) = \frac{f(x)}{g(x)}$$
We then have f(0) = 0 and g(0) = 0, and so h(0) = 0/0 which is not defined
Is there, however, anything useful we can say about the behaviour of h(x) when x is close to zero?
We may write
$$h(x) = \frac{x(x-3)}{x(5x+1)}$$
and so for x ≠ 0 we may write
$$h(x) = \frac{x-3}{5x+1}$$
[Figure: plot of h(x) for 0 < x ≤ 1]
For x = ε, where ε is as close to zero as we want without actually being zero, and ε may be positive or negative, h(x) approaches -3
h(x) is said to approach the limit -3 as x → 0, which is written mathematically as
$$\lim_{x \to 0} h(x) = -3$$
Some results on limits
Suppose two functions f(x) and g(x) have the following limits as x → a
$$\lim_{x \to a} f(x) = A \qquad \lim_{x \to a} g(x) = B$$
The following properties then hold
$$\lim_{x \to a} (f(x) + g(x)) = A + B \qquad \lim_{x \to a} (f(x)g(x)) = AB$$
Differentiation
Loosely speaking, the derivative of a function is the gradient or slope of the tangent to the graph of the function (provided it exists)
In the figure below, the derivative at x = 1 is given by Δy/Δx
[Figure: tangent to a curve at x = 1, with increments Δx and Δy]
Calculating the derivative from first principles
Drawing a tangent at each point on the graph to calculate the derivative isn't possible
We will use the concept of limits to systematically calculate the gradient
For the function y = f(x) the gradient is commonly denoted by either
$$\frac{dy}{dx} \quad \text{or} \quad f'(x)$$
The gradient, or slope, of y = f(x) is defined as the following limit
$$\frac{dy}{dx} = \lim_{s \to 0} \frac{f(x+s) - f(x)}{s}$$
As s gets closer to zero — without actually reaching zero — the fraction becomes a better approximation to the slope
Example: differentiating y = x³
First we need to note that
$$(x+s)^3 = x^3 + 3x^2 s + 3xs^2 + s^3$$
Then
$$\frac{dy}{dx} = \lim_{s \to 0} \frac{(x+s)^3 - x^3}{s} = \lim_{s \to 0} \frac{3x^2 s + 3xs^2 + s^3}{s} = \lim_{s \to 0} \left(3x^2 + 3xs + s^2\right) = 3x^2$$
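The limit definition also suggests a numerical check: evaluate the fraction (f(x+s) - f(x))/s for decreasing s and watch it approach f'(x) = 3x². A sketch, not part of the course material:

```python
f = lambda x: x**3
x = 2.0                               # exact derivative is 3 * x**2 = 12
for s in [1e-1, 1e-3, 1e-5]:
    print(s, (f(x + s) - f(x)) / s)   # tends to 12 as s shrinks
```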
Some derivatives

y                 dy/dx
A, constant       0
x^n, n ≠ 0        n x^{n-1}
sin x             cos x
cos x             -sin x
e^x               e^x
Higher derivatives
Sometimes we want to calculate not just the derivative of a function, but also the derivative of the derivative
This is known as the second derivative and is denoted by $\frac{d^2y}{dx^2}$
By definition,
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right)$$
Example:
$$y = 3x^4 + \sin x - 4e^x$$
$$\frac{dy}{dx} = 12x^3 + \cos x - 4e^x$$
The second derivative is given by
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(12x^3 + \cos x - 4e^x\right) = 36x^2 - \sin x - 4e^x$$
Higher derivatives — for example the third and fourth derivatives — may also be defined:
$$\frac{d^3y}{dx^3} = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) \qquad \frac{d^4y}{dx^4} = \frac{d}{dx}\left(\frac{d^3y}{dx^3}\right)$$
Alternative notation for higher derivatives of y = f(x) is
$$f''(x) = \frac{d^2y}{dx^2} \qquad f'''(x) = \frac{d^3y}{dx^3} \qquad f^{(n)}(x) = \frac{d^ny}{dx^n}$$
Derivatives of inverse functions
Suppose y = f(x)
x = g(y) is said to be an inverse function of f(x) if, and only if, the following is true:
$$y = f(x) \Leftrightarrow x = g(y)$$
We then use the notation f⁻¹(x) to denote the inverse function, where
$$f^{-1}(x) = g(x)$$
Graphically, an inverse function is given by reflecting in the line y = x so that the x- and y-axes are interchanged
Example: The function f(x) is defined by
$$f(x) = \exp(x^{1/3}), \quad x > 0$$
Calculate the inverse function f⁻¹(x)
We want to find g(y) such that y = f(x) ⇔ x = g(y)
$$y = \exp(x^{1/3}), \quad x > 0$$
$$\log y = x^{1/3}, \quad y > 1$$
$$x = (\log y)^3, \quad y > 1$$
and so
$$g(y) = (\log y)^3, \quad y > 1$$
The inverse function is then given by
$$f^{-1}(x) = (\log x)^3, \quad x > 1$$
f(x) and f⁻¹(x) are shown below — note the symmetry about the line y = x
[Figure: plots of f(x) and f⁻¹(x), symmetric about the line y = x]
Returning to our initial illustration of a derivative
[Figure: tangent to a curve with increments Δx and Δy]
We see that the derivative of the inverse function, dx/dy, is given by
$$\frac{dx}{dy} = \frac{\Delta x}{\Delta y}$$
and so
$$\frac{dx}{dy} = 1 \Big/ \frac{dy}{dx}$$
Calculating the derivative of the inverse function can be a useful "trick" for differentiating a function
Example: differentiate y = log x for x > 0
We may write this as x = e^y
We then have
$$\frac{dx}{dy} = e^y = x$$
Hence
$$\frac{dy}{dx} = 1 \Big/ \frac{dx}{dy} = \frac{1}{x}$$
Example: differentiate y = arcsin x for -1 < x < 1
$$x = \sin y$$
$$\frac{dx}{dy} = \cos y = \sqrt{1 - \sin^2 y} \quad \text{using } \sin^2 y + \cos^2 y = 1$$
$$= \sqrt{1 - x^2}$$
Therefore
$$\frac{dy}{dx} = 1 \Big/ \frac{dx}{dy} = \frac{1}{\sqrt{1 - x^2}}$$
A cautionary note
We have used the formula
$$\frac{dx}{dy} = 1 \Big/ \frac{dy}{dx}$$
This only holds for first derivatives: it is not true that
$$\frac{d^2x}{dy^2} = 1 \Big/ \frac{d^2y}{dx^2}$$
Instead, we have to perform the differentiation:
$$\frac{d^2x}{dy^2} = \frac{d}{dy}\left(\frac{dx}{dy}\right)$$
Location
✓ Mathematical preliminaries
⇒ • Partial differentiation
• Taylor series
• Critical points
• Solution of nonlinear equations
• Constrained optimisation
• Integration
• Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Partial differentiation
All the functions we have considered to date are of one variable — they have one "input" and return one "output"
For example, f(x) = sin x requires that we need only specify x to calculate f(x)
We will now think about functions of two (or more) variables, for example f(x, y) = sin(y² + x) - cos(y - x²)
We need to specify both x and y to evaluate the single output f(x, y)
Below is a plot of the surface defined by f(x, y) = sin(y² + x) - cos(y - x²)
[Figure: surface plot of f(x, y) = sin(y² + x) - cos(y - x²)]
When we have a function g(x, y) it is often useful to differentiate g with respect to the two variables x and y separately
The partial derivative of g with respect to x, denoted by ∂g/∂x or g_x, is the derivative of g with respect to x with the other variable y treated as a constant.
The partial derivative of g with respect to y has an analogous definition, and is denoted by ∂g/∂y or g_y
Using the example of g(x, y) = sin(x + y) + 3x²y + cos²(3x + y⁶)
$$\frac{\partial g}{\partial x} = \cos(x+y) + 6xy - 6\sin(3x+y^6)\cos(3x+y^6)$$
$$\frac{\partial g}{\partial y} = \cos(x+y) + 3x^2 - 12y^5\sin(3x+y^6)\cos(3x+y^6)$$
Composite functions
In an earlier slide we saw a composite function f(x) = exp(sin x²)
This could be written f(x) = g(h(x)), where h(x) = sin x² and g(x) = exp(x)
This can be differentiated using the chain rule:
$$f'(x) = g'(h(x))\,h'(x)$$
Example: differentiate f(x) = sinⁿ x for n ≠ 0
Recall that sinⁿ x = (sin x)ⁿ
We write g(x) = xⁿ and h(x) = sin x so that f(x) = g(h(x))
Differentiating g(x) and h(x) gives
$$g'(x) = nx^{n-1} \qquad h'(x) = \cos x$$
The derivative of f is then given by
$$f'(x) = g'(h(x))h'(x) = n(\sin x)^{n-1}\cos x = n\sin^{n-1}x\,\cos x$$
A similar chain rule exists for functions of two variables
If f(x, y) = g(h(x, y)) then
$$\frac{\partial f}{\partial x} = g'(h(x,y))\frac{\partial h}{\partial x} \qquad \frac{\partial f}{\partial y} = g'(h(x,y))\frac{\partial h}{\partial y}$$
Suppose f(x, y) = sin(x² + y).
We can write f(x, y) = g(h(x, y)) where
$$g(x) = \sin x \qquad h(x, y) = x^2 + y$$
The partial derivatives of f(x, y) are given by
$$\frac{\partial f}{\partial x} = \cos(x^2+y)\,\frac{\partial h}{\partial x} = 2x\cos(x^2+y)$$
$$\frac{\partial f}{\partial y} = \cos(x^2+y)\,\frac{\partial h}{\partial y} = \cos(x^2+y)$$
The product rule
Suppose f(x) = u(x)v(x), where u(x), v(x), f(x) are functions of one variable.
The derivative of f is then given by
$$f'(x) = u'(x)v(x) + u(x)v'(x)$$
The product rule for a function of one variable has an analogous definition for functions of two variables.
Suppose f(x, y) = u(x, y)v(x, y). Then
$$\frac{\partial f}{\partial x} = \frac{\partial u}{\partial x}v + u\frac{\partial v}{\partial x} \qquad \frac{\partial f}{\partial y} = \frac{\partial u}{\partial y}v + u\frac{\partial v}{\partial y}$$
Example: The function f(x, y) is given by
$$f(x, y) = e^{x^2 + \sin y}(1 + x + y)$$
Evaluate the partial derivatives ∂f/∂x and ∂f/∂y
We write f(x, y) = u(x, y)v(x, y) where
$$u(x, y) = e^{x^2 + \sin y} \qquad v(x, y) = 1 + x + y$$
The partial derivatives of u and v are given by
$$\frac{\partial u}{\partial x} = 2x e^{x^2+\sin y} \qquad \frac{\partial v}{\partial x} = 1 \qquad \frac{\partial u}{\partial y} = \cos y\, e^{x^2+\sin y} \qquad \frac{\partial v}{\partial y} = 1$$
The partial derivatives of f are then given by
$$\frac{\partial f}{\partial x} = \left(2x e^{x^2+\sin y}\right)(1 + x + y) + e^{x^2+\sin y}$$
$$\frac{\partial f}{\partial y} = \left(\cos y\, e^{x^2+\sin y}\right)(1 + x + y) + e^{x^2+\sin y}$$
Differentiation of quotients
Suppose f(x) = u(x)/v(x), where u(x), v(x), f(x) are functions of one variable.
We may differentiate f(x) using the following formula
$$f'(x) = \frac{vu' - uv'}{v^2}$$
For functions of two variables where f(x, y) = u(x, y)/v(x, y) we also have a quotient rule given by
$$\frac{\partial f}{\partial x} = \frac{v\frac{\partial u}{\partial x} - u\frac{\partial v}{\partial x}}{v^2} \qquad \frac{\partial f}{\partial y} = \frac{v\frac{\partial u}{\partial y} - u\frac{\partial v}{\partial y}}{v^2}$$
Higher derivatives
As with functions of one variable, we may want to calculate higher derivatives as well as first derivatives
For example
$$\frac{\partial^2 g}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial g}{\partial x}\right) \qquad \frac{\partial^2 g}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial g}{\partial y}\right)$$
$$\frac{\partial^2 g}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial g}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial g}{\partial x}\right) = \frac{\partial^2 g}{\partial y \partial x} \quad \text{(the order of differentiation can be changed)}$$
Example: write down all the second order partial derivatives of
$$g(x, y) = \sin(x+y) + 3x^2y + \cos^2(3x+y^6)$$
We already have
$$\frac{\partial g}{\partial x} = \cos(x+y) + 6xy - 6\sin(3x+y^6)\cos(3x+y^6)$$
$$\frac{\partial g}{\partial y} = \cos(x+y) + 3x^2 - 12y^5\sin(3x+y^6)\cos(3x+y^6)$$
Then
$$\frac{\partial^2 g}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial g}{\partial x}\right) = -\sin(x+y) + 6y - 18\cos^2(3x+y^6) + 18\sin^2(3x+y^6)$$
$$\frac{\partial^2 g}{\partial x \partial y} = \frac{\partial^2 g}{\partial y \partial x} = \frac{\partial}{\partial x}\left(\frac{\partial g}{\partial y}\right) = -\sin(x+y) + 6x - 36y^5\cos^2(3x+y^6) + 36y^5\sin^2(3x+y^6)$$
$$\frac{\partial^2 g}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial g}{\partial y}\right) = -\sin(x+y) - 60y^4\sin(3x+y^6)\cos(3x+y^6) - 72y^{10}\cos^2(3x+y^6) + 72y^{10}\sin^2(3x+y^6)$$
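Derivatives like these are easy to check with a computer algebra system; a sketch using sympy (assuming it is available) that also confirms the mixed partials agree:

```python
import sympy as sp

x, y = sp.symbols('x y')
g = sp.sin(x + y) + 3*x**2*y + sp.cos(3*x + y**6)**2

gxy = sp.diff(g, x, y)          # differentiate in x then y
gyx = sp.diff(g, y, x)          # differentiate in y then x
print(sp.simplify(gxy - gyx))   # 0: the order of differentiation can be changed
```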
Taylor series
If we assume that a function f(x) may be differentiated as many times as we wish, and that these derivatives are continuous, then we may write f(x) as a Taylor series
$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$$
By substituting x = x₀ into this formula we obtain
$$a_0 = f(x_0)$$
Differentiating the Taylor series gives
$$f'(x) = \sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}$$
Note that the series now begins at n = 1. Substituting x = x₀ into this formula yields
$$a_1 = f'(x_0)$$
Differentiating again gives
$$f''(x) = \sum_{n=2}^{\infty} n(n-1) a_n (x - x_0)^{n-2}$$
Substituting x = x₀ into this formula yields
$$a_2 = \frac{1}{2} f''(x_0)$$
Repeating, we obtain
$$a_n = \frac{f^{(n)}(x_0)}{n!}$$
The Taylor series about the point x = x₀ can then be written
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n$$
Example: find the Taylor series for f(x) = sin x around the point x = 0
We have the following derivatives for integer values of n
$$f^{(4n)}(x) = \sin x \qquad f^{(4n+1)}(x) = \cos x \qquad f^{(4n+2)}(x) = -\sin x \qquad f^{(4n+3)}(x) = -\cos x$$
Our Taylor series is therefore
$$f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
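A short sketch comparing the truncated series x - x³/3! + x⁵/5! - x⁷/7! with math.sin (an illustration, not course material):

```python
import math

def sin_taylor(x, terms=4):
    """Partial sum of the Taylor series of sin about 0: x - x^3/3! + x^5/5! - ..."""
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
               for k in range(terms))

for x in [0.1, 1.0, 2.0]:
    print(x, sin_taylor(x), math.sin(x))   # close near 0, drifts for larger x
```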
Example: find the Taylor series for f(x) = 3 - 4x² + eˣ around the point x = 3
First we write down the derivatives of f(x):
$$f'(x) = -8x + e^x \qquad f''(x) = -8 + e^x \qquad f^{(n)}(x) = e^x \quad n = 3, 4, 5, \ldots$$
The Taylor series about x = 3 is then given by
$$f(x) = f(3) + f'(3)(x-3) + \frac{f''(3)}{2}(x-3)^2 + \frac{f'''(3)}{6}(x-3)^3 + \frac{f''''(3)}{24}(x-3)^4 + \cdots$$
$$= -33 + e^3 + \left(-24 + e^3\right)(x-3) + \frac{-8 + e^3}{2}(x-3)^2 + \frac{e^3}{6}(x-3)^3 + \frac{e^3}{24}(x-3)^4 + \cdots$$
A plot of the true function y = f(x) = 3 - 4x² + eˣ, together with the Taylor series up to and including the linear, the quadratic and the cubic term, is shown in the figure below
[Figure: y = 3 - 4x² + eˣ with Taylor approximations of degree 1, 2 and 3]
For the Taylor series:
• All Taylor series approximate f(x) better near to x = 3
• Adding more terms allows the approximation to remain reasonably good over a wider region
The points above are emphasised by zooming in to the region near x = 3
[Figure: the same curves zoomed in near x = 3]
Error in Taylor series
It is possible to bound the error given by a Taylor series truncated at order N
It can be shown that
$$f(x) = \sum_{n=0}^{N} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(N+1)}(x^*)}{(N+1)!}(x - x_0)^{N+1}$$
where x* is an (unknown) point between x and x₀.
Suppose we know that $\left|f^{(N+1)}(x)\right| < A$ for some constant A. Then
$$\left| f(x) - \sum_{n=0}^{N} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \right| < \frac{A}{(N+1)!}\left|(x - x_0)^{N+1}\right|$$
Earlier we showed that the Taylor series approximation to f(x) = 3 - 4x² + eˣ about x = 3, truncated at the quadratic terms, was
$$T_2(x) = -33 + e^3 + \left(-24 + e^3\right)(x-3) + \frac{-8 + e^3}{2}(x-3)^2$$
Suppose we restrict ourselves to the region 2 < x < 4.
We know that |f'''(x)| < e⁴, and |(x - 3)³| < 1
The maximum error is therefore e⁴/6
Use of Taylor series to calculate limits
Example: use the Taylor series for f(x) = sin x to evaluate the limit $\lim_{x \to 0} \frac{\sin x}{x}$
Using the Taylor series for f(x) = sin x about the point x = 0 we may write
$$\frac{f(x)}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots$$
Substituting x = 0 into the expression above gives
$$\lim_{x \to 0} \frac{\sin x}{x} = 1$$
Critical points
A critical point of a function f(x) is a point where the slope is zero, and so f'(x) = 0
A critical point may be a local maximum, a local minimum, or a saddle point
Below there is a local maximum at x = 0, a local minimum at x = 1 and a saddle point at x = 2
[Figure: a curve with a local maximum at x = 0, a local minimum at x = 1 and a saddle point at x = 2]
Suppose f(x) has a critical point at x = x₀
By definition, the slope is zero at x = x₀ and so f'(x₀) = 0
Recall that we saw earlier that Taylor series were good local approximations to a function in a small region
We can use a Taylor series about x = x₀ to tell us more about the critical point, i.e. whether it is a minimum, maximum or saddle
Suppose f(x) has a critical point at x = x₀, and so f'(x₀) = 0
Suppose further that f''(x₀) = A ≠ 0 for some constant A
We then have a Taylor series expansion, up to and including the quadratic terms, about x = x₀ given by
$$f(x) \approx f(x_0) + \frac{1}{2}A(x - x_0)^2$$
The local behaviour in the region of the critical point is a quadratic function, and so the critical point must be a maximum or a minimum
[Figure: quadratic local behaviour near a critical point for A > 0 (a minimum) and A < 0 (a maximum)]
We therefore see that:
• If f'(x₀) = 0 and f''(x₀) > 0 then x₀ is a minimum value
• If f'(x₀) = 0 and f''(x₀) < 0 then x₀ is a maximum value
Note that we haven't considered the case f''(x₀) = 0 yet
Example: classify the critical values of f(x) = exp(x³/3 - x)
Differentiating,
$$f'(x) = (x^2 - 1)\exp\left(\frac{x^3}{3} - x\right)$$
$$f''(x) = (x^4 - 2x^2 + 2x + 1)\exp\left(\frac{x^3}{3} - x\right)$$
At critical points f'(x) = 0
As exp(x³/3 - x) is never zero, the only critical points are x = ±1
f''(-1) < 0 and so x = -1 is a maximum value; f''(1) > 0 and so x = 1 is a minimum value
The graph below verifies that x = -1 is a maximum value of f(x), and x = 1 is a minimum value
[Figure: plot of f(x) = exp(x³/3 - x) showing the maximum at x = -1 and the minimum at x = 1]
Note that the above analysis has required f''(x₀) ≠ 0 at a critical point
The special case that f''(x) = 0 at a critical point
Example: classify the critical points of f(x) = exp(x³).
Differentiating
$$f'(x) = 3x^2\exp(x^3) \qquad f''(x) = (6x + 9x^4)\exp(x^3)$$
We see that x = 0 is the only critical point
However, f''(0) = 0 and so we can't use the earlier theory to classify the critical point
To classify the critical point we look for a higher order Taylor expansion. Differentiating again:
$$f'''(x) = (6 + 54x^3 + 27x^6)\exp(x^3)$$
and so f'''(0) = 6
The Taylor series, up to and including the cubic term, about x = 0 is
$$T_3(x) = 1 + x^3$$
This is not a minimum or a maximum, so must be a saddle point, as can be seen by plotting T₃(x)
[Figure: f(x) = exp(x³) and its cubic Taylor series near x = 0]
We see that the Taylor series about the critical point at x = 0 allows us to deduce that this critical point is a saddle
Another example: classify the critical point of f(x) = sin(x⁴) - 5 at x = 0
Differentiating:
$$f'(x) = 4x^3\cos(x^4)$$
$$f''(x) = 12x^2\cos(x^4) - 16x^6\sin(x^4)$$
$$f'''(x) = (24x - 64x^9)\cos(x^4) - 144x^5\sin(x^4)$$
$$f''''(x) = (24 - 1152x^8)\cos(x^4) + (256x^{12} - 816x^4)\sin(x^4)$$
We see that f'(x) = f''(x) = f'''(x) = 0 at x = 0
We need to go as far as the term in x⁴ in the Taylor series to classify this critical point
In this case we have:
$$f(x) \approx -5 + \frac{f''''(0)}{4!}x^4 = -5 + x^4$$
Hence x = 0 is a minimum value of f(x), as can be seen by plotting the Taylor series about x = 0
[Figure: f(x) = sin(x⁴) - 5 and its quartic Taylor series near x = 0]
Summary for classifying critical points:
• Find the points x_c such that f'(x_c) = 0
• If f''(x_c) ≠ 0 use the simple method described earlier for classifying whether it is a local minimum or maximum
• If f''(x_c) = 0 find the smallest n ≥ 3 such that f⁽ⁿ⁾(x_c) ≠ 0. The local behaviour is then given by the Taylor series
$$f(x) \approx f(x_c) + \frac{f^{(n)}(x_c)}{n!}(x - x_c)^n$$
Use this Taylor series to classify the critical point — it can be a maximum, a minimum, or a saddle (see the sketch below)
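The procedure above can be mechanised with sympy; a sketch (assuming sympy is available, and with the helper name `classify` purely illustrative) that finds the smallest non-vanishing derivative at a critical point:

```python
import sympy as sp

x = sp.symbols('x')

def classify(f, xc):
    """Classify a critical point xc of f via the smallest n with f^(n)(xc) != 0."""
    n, d = 2, sp.diff(f, x, 2)
    while d.subs(x, xc) == 0:       # keep differentiating until nonzero at xc
        n, d = n + 1, sp.diff(d, x)
    val = d.subs(x, xc)
    if n % 2 == 1:                  # odd leading term: neither max nor min
        return 'saddle'
    return 'minimum' if val > 0 else 'maximum'

print(classify(sp.exp(x**3), 0))       # saddle
print(classify(sp.sin(x**4) - 5, 0))   # minimum
```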
Taylor series for functions of two variables
If f(x) is a function of one variable, we have seen that we may expand f(x) as a Taylor series about the point x = x₀
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$$
Even when the infinite series was truncated at only a few terms, the polynomial approximation was very effective in the region of x = x₀
Taylor series are also available for functions of two variables, allowing us to write these functions as a polynomial expansion
We may expand f(x, y) as a Taylor series about x = x₀, y = y₀
In this case we write
$$f(x, y) = A_{0,0} + \left[A_{1,0}(x - x_0) + A_{1,1}(y - y_0)\right] + \left[A_{2,0}(x - x_0)^2 + A_{2,1}(x - x_0)(y - y_0) + A_{2,2}(y - y_0)^2\right]$$
$$+ \left[A_{3,0}(x - x_0)^3 + A_{3,1}(x - x_0)^2(y - y_0) + A_{3,2}(x - x_0)(y - y_0)^2 + A_{3,3}(y - y_0)^3\right] + \cdots$$
$$= \sum_{n=0}^{\infty}\left(\sum_{m=0}^{n} A_{n,m}(x - x_0)^{n-m}(y - y_0)^m\right)$$
We may establish the values of A_{i,j} in a similar manner to Taylor series of one variable.
Substituting x = x₀, y = y₀ into the Taylor series gives
$$A_{0,0} = f(x_0, y_0)$$
To calculate A_{1,0} we calculate the partial derivative of the Taylor series with respect to x:
$$\frac{\partial f}{\partial x} = A_{1,0} + \left[2A_{2,0}(x - x_0) + A_{2,1}(y - y_0)\right] + \left[3A_{3,0}(x - x_0)^2 + 2A_{3,1}(x - x_0)(y - y_0) + A_{3,2}(y - y_0)^2\right] + \cdots$$
Substituting x = x₀, y = y₀ gives
$$A_{1,0} = \left.\frac{\partial f}{\partial x}\right|_{x=x_0, y=y_0}$$
Similarly,
$$A_{1,1} = \left.\frac{\partial f}{\partial y}\right|_{x=x_0, y=y_0}$$
To calculate A_{2,0} we calculate the second partial derivative of the Taylor series with respect to x:
$$\frac{\partial^2 f}{\partial x^2} = 2A_{2,0} + \left[6A_{3,0}(x - x_0) + 2A_{3,1}(y - y_0)\right] + \cdots$$
Substituting x = x₀, y = y₀ gives
$$A_{2,0} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{x=x_0, y=y_0}$$
Similarly,
$$A_{2,2} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{x=x_0, y=y_0}$$
A_{2,1} can be calculated from the second partial derivative ∂²f/∂x∂y:
$$\frac{\partial^2 f}{\partial x \partial y} = A_{2,1} + \left[2A_{3,1}(x - x_0) + 2A_{3,2}(y - y_0)\right] + \cdots$$
Substituting x = x₀, y = y₀ gives
$$A_{2,1} = \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{x=x_0, y=y_0}$$
The Taylor expansion for f(x, y), up to and including quadratic terms, about the point x = x₀, y = y₀ is given by
$$f(x, y) \approx A_{0,0} + \left[A_{1,0}(x - x_0) + A_{1,1}(y - y_0)\right] + \left[A_{2,0}(x - x_0)^2 + A_{2,1}(x - x_0)(y - y_0) + A_{2,2}(y - y_0)^2\right]$$
where
$$A_{0,0} = f(x_0, y_0) \qquad A_{1,0} = \left.\frac{\partial f}{\partial x}\right|_{x=x_0, y=y_0} \qquad A_{1,1} = \left.\frac{\partial f}{\partial y}\right|_{x=x_0, y=y_0}$$
$$A_{2,0} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{x=x_0, y=y_0} \qquad A_{2,1} = \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{x=x_0, y=y_0} \qquad A_{2,2} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{x=x_0, y=y_0}$$
The full Taylor expansion for f(x, y) about the point x = x₀, y = y₀ is given by
$$f(x, y) = \sum_{n=0}^{\infty}\left(\sum_{m=0}^{n} A_{n,m}(x - x_0)^{n-m}(y - y_0)^m\right)$$
It can be shown that
$$A_{n,m} = \frac{1}{m!(n-m)!}\left.\frac{\partial^n f}{\partial x^{n-m}\partial y^m}\right|_{x=x_0, y=y_0}$$
Example: find the Taylor series expansion of f(x, y) = xe^y - y³ about the point x = 0, y = 0, up to and including all quadratic terms
The partial derivatives are given by
$$\frac{\partial f}{\partial x} = e^y \qquad \frac{\partial f}{\partial y} = xe^y - 3y^2$$
$$\frac{\partial^2 f}{\partial x^2} = 0 \qquad \frac{\partial^2 f}{\partial x \partial y} = e^y \qquad \frac{\partial^2 f}{\partial y^2} = xe^y - 6y$$
Substituting x = 0, y = 0 into the partial derivatives gives
$$A_{0,0} = 0 \quad A_{1,0} = 1 \quad A_{1,1} = 0 \quad A_{2,0} = 0 \quad A_{2,1} = 1 \quad A_{2,2} = 0$$
The Taylor series is therefore
$$f(x, y) \approx x + xy$$
Example: find the Taylor series expansion of f(x, y) = xe^y - y³ about the point x = 2, y = 1, up to and including all quadratic terms
Substituting x = 2, y = 1 into the partial derivatives (calculated earlier) gives
$$A_{0,0} = 2e - 1 \quad A_{1,0} = e \quad A_{1,1} = 2e - 3 \quad A_{2,0} = 0 \quad A_{2,1} = e \quad A_{2,2} = e - 3$$
The Taylor series in this case is
$$f(x, y) \approx (2e - 1) + e(x - 2) + (2e - 3)(y - 1) + e(x - 2)(y - 1) + (e - 3)(y - 1)^2$$
Critical points in two dimensions
As with functions of one variable, we can define critical points of a function of two variables, f(x, y), to be points where the function has zero slope (in any direction)
At critical points ∂f/∂x = ∂f/∂y = 0
We can classify critical points so that, roughly speaking, a critical point (x₀, y₀) is one of:
• A maximum value, where f(x, y) < f(x₀, y₀) in some region around (x₀, y₀) for (x, y) ≠ (x₀, y₀)
• A minimum value, where f(x, y) > f(x₀, y₀) in some region around (x₀, y₀) for (x, y) ≠ (x₀, y₀)
• A saddle point otherwise
As with functions of one variable, we can classify a critical point at (x₀, y₀) by looking at a quadratic Taylor series approximation in the region of the critical point:
$$f(x, y) \approx f(x_0, y_0) + A_{2,0}(x - x_0)^2 + A_{2,1}(x - x_0)(y - y_0) + A_{2,2}(y - y_0)^2$$
where
$$A_{2,0} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{x=x_0, y=y_0} \qquad A_{2,1} = \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{x=x_0, y=y_0} \qquad A_{2,2} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{x=x_0, y=y_0}$$
Assuming A_{2,0} ≠ 0 we can write this as
$$f(x, y) \approx f(x_0, y_0) + A_{2,0}\left[(x - x_0) + \frac{A_{2,1}}{2A_{2,0}}(y - y_0)\right]^2 + \frac{4A_{2,0}A_{2,2} - A_{2,1}^2}{4A_{2,0}}(y - y_0)^2$$
Suppose that
$$A_{2,0} > 0 \quad \text{and} \quad 4A_{2,0}A_{2,2} - A_{2,1}^2 > 0$$
Then both of the last two terms on the right hand side of the equation above are positive, and so f(x, y) > f(x₀, y₀) for (x, y) ≠ (x₀, y₀) in some region around (x₀, y₀)
(x₀, y₀) is therefore a local minimum of the function f(x, y)
Now suppose that
$$A_{2,0} < 0 \quad \text{and} \quad 4A_{2,0}A_{2,2} - A_{2,1}^2 > 0$$
Then both of the last two terms are negative, and so f(x, y) < f(x₀, y₀) for (x, y) ≠ (x₀, y₀) in some region around (x₀, y₀)
(x₀, y₀) is therefore a local maximum of the function f(x, y)
Now suppose 4A_{2,0}A_{2,2} - A_{2,1}^2 < 0
The last two terms on the right hand side of that equation have different signs, and so the value of f(x, y) either increases or decreases depending on which path we take
(x₀, y₀) is therefore a saddle point of the function f(x, y)
Summary of classifying critical points of functions of two variables
• Find the points (x₀, y₀) where ∂f/∂x = ∂f/∂y = 0
• Calculate the second partial derivatives
$$A_{2,0} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{x=x_0, y=y_0} \qquad A_{2,1} = \left.\frac{\partial^2 f}{\partial x \partial y}\right|_{x=x_0, y=y_0} \qquad A_{2,2} = \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{x=x_0, y=y_0}$$
• Then:
  – If A_{2,0} > 0 and 4A_{2,0}A_{2,2} - A_{2,1}^2 > 0 then (x₀, y₀) is a local minimum of the function f(x, y)
  – If A_{2,0} < 0 and 4A_{2,0}A_{2,2} - A_{2,1}^2 > 0 then (x₀, y₀) is a local maximum of the function f(x, y)
  – If 4A_{2,0}A_{2,2} - A_{2,1}^2 < 0 then (x₀, y₀) is a saddle point of the function f(x, y)
Example: find all critical points of the function
$$f(x, y) = x^2 + 2xy - y^2 + y^3$$
and classify them as maxima, minima or saddle points
The first partial derivatives are given by
$$\frac{\partial f}{\partial x} = 2x + 2y \qquad \frac{\partial f}{\partial y} = 2x - 2y + 3y^2$$
At critical points, ∂f/∂x = ∂f/∂y = 0
Now ∂f/∂x = 0 ⇒ x = -y
Substituting into ∂f/∂y = 0 gives
$$-4y + 3y^2 = 0$$
or equivalently y(3y - 4) = 0
Critical points are therefore (0, 0) and (-4/3, 4/3)
To classify the critical points we calculate the second derivatives
$$\frac{\partial^2 f}{\partial x^2} = 2 \qquad \frac{\partial^2 f}{\partial x \partial y} = 2 \qquad \frac{\partial^2 f}{\partial y^2} = 6y - 2$$
At x = 0, y = 0
$$A_{2,0} = 1 \qquad A_{2,1} = 2 \qquad A_{2,2} = -1$$
The quantity 4A_{2,0}A_{2,2} - A_{2,1}^2 = -8 < 0 and so the point (0, 0) is a saddle
At x = -4/3, y = 4/3
$$A_{2,0} = 1 \qquad A_{2,1} = 2 \qquad A_{2,2} = 3$$
The quantity 4A_{2,0}A_{2,2} - A_{2,1}^2 = 8 > 0, and A_{2,0} > 0, and so the point (-4/3, 4/3) is a local minimum
Example: find, and classify, the critical points of the function
$$f(x, y) = e^{x+y}\left(x^2 - xy + y^2\right)$$
At critical points, ∂f/∂x = ∂f/∂y = 0
The first partial derivatives are given by
$$\frac{\partial f}{\partial x} = e^{x+y}\left(x^2 - xy + y^2 + 2x - y\right)$$
$$\frac{\partial f}{\partial y} = e^{x+y}\left(x^2 - xy + y^2 - x + 2y\right)$$
Setting the first partial derivatives to zero:
$$e^{x+y}\left(x^2 - xy + y^2 + 2x - y\right) = 0$$
$$e^{x+y}\left(x^2 - xy + y^2 - x + 2y\right) = 0$$
Multiplying both equations by e^{-x-y}, and subtracting the second equation from the first, yields 3x - 3y = 0, i.e. x = y
This implies that x² + x = 0, i.e. x = 0, -1
The critical points are therefore (0, 0) and (-1, -1)
To classify the critical points we need the second derivatives
$$\frac{\partial^2 f}{\partial x^2} = e^{x+y}\left(x^2 - xy + y^2 + 4x - 2y + 2\right)$$
$$\frac{\partial^2 f}{\partial x \partial y} = e^{x+y}\left(x^2 - xy + y^2 + x + y - 1\right)$$
$$\frac{\partial^2 f}{\partial y^2} = e^{x+y}\left(x^2 - xy + y^2 - 2x + 4y + 2\right)$$
At (0, 0), A_{2,0} = 1, A_{2,1} = -1 and A_{2,2} = 1
We have A_{2,0} > 0 and 4A_{2,0}A_{2,2} - A_{2,1}^2 = 3 > 0 and so (0, 0) is a minimum value.
At (-1, -1), A_{2,0} = ½e⁻², A_{2,1} = -2e⁻² and A_{2,2} = ½e⁻²
In this case 4A_{2,0}A_{2,2} - A_{2,1}^2 = -3e⁻⁴ < 0 and so (-1, -1) is a saddle point
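The whole procedure — solve for the critical points, then test the sign of 4A₂,₀A₂,₂ - A₂,₁² — can be reproduced with sympy; a sketch under the assumption that sympy is available (the exp factor is divided out before solving, since it is never zero):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x + y) * (x**2 - x*y + y**2)

# divide out the never-zero exponential, leaving a polynomial system
fx = sp.expand(sp.diff(f, x) / sp.exp(x + y))
fy = sp.expand(sp.diff(f, y) / sp.exp(x + y))
pts = sp.solve([fx, fy], [x, y], dict=True)

for p in pts:
    A20 = sp.Rational(1, 2) * sp.diff(f, x, 2).subs(p)
    A21 = sp.diff(f, x, y).subs(p)
    A22 = sp.Rational(1, 2) * sp.diff(f, y, 2).subs(p)
    disc = 4*A20*A22 - A21**2
    kind = 'saddle' if disc < 0 else ('minimum' if A20 > 0 else 'maximum')
    print(p, kind)   # (0, 0) minimum, (-1, -1) saddle
```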
The gradient vector
If f(x, y) is a function of two variables the gradient vector, often called "grad f", is given by
$$\nabla f = \begin{pmatrix} \partial f/\partial x \\ \partial f/\partial y \end{pmatrix}$$
Example: if f(x, y) = sin x + y e^{cos x} then
$$\nabla f = \begin{pmatrix} \cos x - y\sin x\, e^{\cos x} \\ e^{\cos x} \end{pmatrix}$$
Suppose f(x, y) is a function of two variables
Suppose that we are following a path through the (x, y)-plane given by
$$x = x(p) \qquad y = y(p)$$
We may then want to calculate the gradient of f with respect to the parameter p
The plot below shows the curve defined by
$$x = e^p \qquad y = p^2$$
[Figure: the curve (eᵖ, p²) in the (x, y)-plane]
Suppose we want to evaluate the derivative of f(x, y) along the curve given by x = x(p), y = y(p)
As the curve x = x(p), y = y(p) is parameterised by a single variable p, the rate of change is a total derivative rather than a partial derivative
Define F(p) = f(x(p), y(p)). Then
$$F'(p) = \lim_{h \to 0} \frac{F(p + h) - F(p)}{h}$$
We will now evaluate this limit using a Taylor series expansion to expand F(p)
We will write
$$s = x(p + h) \quad s_0 = x(p) \quad t = y(p + h) \quad t_0 = y(p)$$
A Taylor series expansion of f about the point (s₀, t₀), up to and including the linear terms, gives
$$f(s, t) \approx f(s_0, t_0) + (s - s_0)\left.\frac{\partial f}{\partial x}\right|_{x=s_0, y=t_0} + (t - t_0)\left.\frac{\partial f}{\partial y}\right|_{x=s_0, y=t_0}$$
We may then write
$$F(p + h) - F(p) = f(x(p+h), y(p+h)) - f(x(p), y(p)) = f(s, t) - f(s_0, t_0)$$
$$\approx (x(p+h) - x(p))\left.\frac{\partial f}{\partial x}\right|_{x=x(p), y=y(p)} + (y(p+h) - y(p))\left.\frac{\partial f}{\partial y}\right|_{x=x(p), y=y(p)}$$
Dividing by h gives
$$\frac{F(p+h) - F(p)}{h} \approx \frac{x(p+h) - x(p)}{h}\left.\frac{\partial f}{\partial x}\right|_{x=x(p), y=y(p)} + \frac{y(p+h) - y(p)}{h}\left.\frac{\partial f}{\partial y}\right|_{x=x(p), y=y(p)}$$
Taking the limit h → 0 of the above expression gives
$$F'(p) = x'(p)\left.\frac{\partial f}{\partial x}\right|_{x=x(p), y=y(p)} + y'(p)\left.\frac{\partial f}{\partial y}\right|_{x=x(p), y=y(p)}$$
The rate of change of f with respect to p is given by
$$\frac{df}{dp} = \frac{\partial f}{\partial x}\frac{dx}{dp} + \frac{\partial f}{\partial y}\frac{dy}{dp}$$
If we define the vector t to be
$$\mathbf{t} = \begin{pmatrix} x'(p) \\ y'(p) \end{pmatrix}$$
then
$$\frac{df}{dp} = (\nabla f) \cdot \mathbf{t} \quad \text{where the } \cdot \text{ is the scalar product}$$
Example: Calculate the derivative of f(x, y) = (x - y)⁴ + (x + y)² along the curve x = sin p, y = p²
The partial derivatives of f are given by
$$\frac{\partial f}{\partial x} = 4(x - y)^3 + 2(x + y) \qquad \frac{\partial f}{\partial y} = -4(x - y)^3 + 2(x + y)$$
and
$$x'(p) = \cos p \qquad y'(p) = 2p$$
The derivative with respect to p is then given by
$$\frac{df}{dp} = \cos p\left(4(\sin p - p^2)^3 + 2(\sin p + p^2)\right) + 2p\left(-4(\sin p - p^2)^3 + 2(\sin p + p^2)\right)$$
Change of coordinate system
Now suppose F(p, q) = f(x(p, q), y(p, q))
This is a common application — instead of using (x, y) coordinates, we have a different coordinate system (p, q)
There is a specified map from the (x, y) coordinate system to the (p, q) system given by
$$x = x(p, q) \qquad y = y(p, q)$$
Example: polar coordinates
Polar coordinates are defined by
$$x = r\cos\theta \qquad y = r\sin\theta$$
where r is the distance from the origin and θ is the angle, in the anti-clockwise direction, between the x-axis and the line from the origin to the point (x, y)
[Figure: polar coordinates (r, θ) of a point in the plane]
Suppose F(p, q) = f(x(p, q), y(p, q))
By treating q as constant we may calculate the partial derivative of F with respect to p by using the chain rule:
$$\frac{\partial F}{\partial p} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial p} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial p}$$
Similarly
$$\frac{\partial F}{\partial q} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial q} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial q}$$
Example: Suppose f(x, y) = sin x + e^y, and F(r, θ) = f(x, y) where (r, θ) are the polar coordinates given by
$$x = r\cos\theta \qquad y = r\sin\theta$$
Calculate the first derivatives of F with respect to r and θ
$$\frac{\partial F}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \cos x\cos\theta + e^y\sin\theta = \cos(r\cos\theta)\cos\theta + e^{r\sin\theta}\sin\theta$$
$$\frac{\partial F}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} = -r\cos x\sin\theta + re^y\cos\theta = r\left(-\cos(r\cos\theta)\sin\theta + e^{r\sin\theta}\cos\theta\right)$$
Suppose we have a coordinate transformation
$$x = x(p, q) \qquad y = y(p, q)$$
and we know that an inverse transformation exists so that
$$p = p(x, y) \qquad q = q(x, y)$$
Clearly we can write down the partial derivatives ∂x/∂p, ∂x/∂q, ∂y/∂p, ∂y/∂q
The inverse transformation may be hard to write down explicitly — is it possible to calculate the partial derivatives ∂p/∂x etc. without explicitly performing this inversion?
We already have
$$x = x(p, q) \qquad y = y(p, q)$$
We now introduce a further change of variables
$$p = p(s, t) \qquad q = q(s, t)$$
This allows us to write x and y in terms of s and t
$$x = x(p(s, t), q(s, t)) \qquad y = y(p(s, t), q(s, t))$$
We may now calculate the partial derivatives of both x and y with respect to s and t:
$$\frac{\partial x}{\partial s} = \frac{\partial x}{\partial p}\frac{\partial p}{\partial s} + \frac{\partial x}{\partial q}\frac{\partial q}{\partial s} \qquad \frac{\partial x}{\partial t} = \frac{\partial x}{\partial p}\frac{\partial p}{\partial t} + \frac{\partial x}{\partial q}\frac{\partial q}{\partial t}$$
$$\frac{\partial y}{\partial s} = \frac{\partial y}{\partial p}\frac{\partial p}{\partial s} + \frac{\partial y}{\partial q}\frac{\partial q}{\partial s} \qquad \frac{\partial y}{\partial t} = \frac{\partial y}{\partial p}\frac{\partial p}{\partial t} + \frac{\partial y}{\partial q}\frac{\partial q}{\partial t}$$
Now suppose x = s and y = t
We then have
$$1 = \frac{\partial x}{\partial p}\frac{\partial p}{\partial x} + \frac{\partial x}{\partial q}\frac{\partial q}{\partial x} \qquad 0 = \frac{\partial x}{\partial p}\frac{\partial p}{\partial y} + \frac{\partial x}{\partial q}\frac{\partial q}{\partial y}$$
$$0 = \frac{\partial y}{\partial p}\frac{\partial p}{\partial x} + \frac{\partial y}{\partial q}\frac{\partial q}{\partial x} \qquad 1 = \frac{\partial y}{\partial p}\frac{\partial p}{\partial y} + \frac{\partial y}{\partial q}\frac{\partial q}{\partial y}$$
In matrix form
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \partial x/\partial p & \partial x/\partial q \\ \partial y/\partial p & \partial y/\partial q \end{pmatrix}\begin{pmatrix} \partial p/\partial x & \partial p/\partial y \\ \partial q/\partial x & \partial q/\partial y \end{pmatrix}$$
Hence
$$\begin{pmatrix} \partial p/\partial x & \partial p/\partial y \\ \partial q/\partial x & \partial q/\partial y \end{pmatrix} = \begin{pmatrix} \partial x/\partial p & \partial x/\partial q \\ \partial y/\partial p & \partial y/\partial q \end{pmatrix}^{-1}$$
The inverse of a 2 × 2 matrix may easily be calculated using the formula
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
If x = x(p, q), y = y(p, q) is an invertible coordinate map we may therefore calculate the first partial derivatives of the inverse map p = p(x, y), q = q(x, y) by:
• calculating the first partial derivatives of x = x(p, q), y = y(p, q); and
• inverting the matrix of these partial derivatives
Example: given polar coordinates
$$x = r\cos\theta \qquad y = r\sin\theta$$
calculate the first partial derivatives of r and θ with respect to x and y
We have
$$\begin{pmatrix} \partial x/\partial r & \partial x/\partial \theta \\ \partial y/\partial r & \partial y/\partial \theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$$
We therefore have
$$\begin{pmatrix} \partial r/\partial x & \partial r/\partial y \\ \partial \theta/\partial x & \partial \theta/\partial y \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}^{-1} = \frac{1}{r}\begin{pmatrix} r\cos\theta & r\sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
and so
$$\frac{\partial r}{\partial x} = \cos\theta \qquad \frac{\partial r}{\partial y} = \sin\theta \qquad \frac{\partial \theta}{\partial x} = -\frac{1}{r}\sin\theta \qquad \frac{\partial \theta}{\partial y} = \frac{1}{r}\cos\theta$$
Given polar coordinates
$$x = r\cos\theta \qquad y = r\sin\theta$$
we can now write first and higher partial derivatives with respect to x and y in terms of partial derivatives with respect to r and θ
We can write
$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial x}\frac{\partial}{\partial \theta} = \cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}$$
and
$$\frac{\partial}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial y}\frac{\partial}{\partial \theta} = \sin\theta\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\frac{\partial}{\partial \theta}$$
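The matrix identity above is easy to verify numerically — build the forward Jacobian, invert it with numpy, and compare against the closed-form entries. A sketch, assuming numpy is available:

```python
import numpy as np

r, theta = 2.0, 0.6
# Jacobian of the forward map (r, theta) -> (x, y)
J = np.array([[np.cos(theta), -r*np.sin(theta)],
              [np.sin(theta),  r*np.cos(theta)]])
Jinv = np.linalg.inv(J)   # rows: (dr/dx, dr/dy) and (dtheta/dx, dtheta/dy)
print(Jinv)
print(np.cos(theta), np.sin(theta), -np.sin(theta)/r, np.cos(theta)/r)
```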
Suppose f(r, θ) = r²
We then have
$$\frac{\partial f}{\partial x} = \cos\theta\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\frac{\partial f}{\partial \theta} = 2r\cos\theta = 2x$$
This isn't surprising, as f(x, y) = x² + y², and so ∂f/∂x = 2x
Suppose f(r, θ) = θ
We then have
$$\frac{\partial f}{\partial x} = \cos\theta\frac{\partial f}{\partial r} - \frac{\sin\theta}{r}\frac{\partial f}{\partial \theta} = -\frac{\sin\theta}{r} = -\frac{y}{x^2 + y^2}$$
This isn't surprising, as f(x, y) = arctan(y/x), and so ∂f/∂x = -y/(x² + y²)
We may calculate higher derivatives by writing
$$\frac{\partial^2 u}{\partial x^2} = \left(\cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)\left(\cos\theta\frac{\partial u}{\partial r} - \frac{\sin\theta}{r}\frac{\partial u}{\partial \theta}\right)$$
$$= \cos^2\theta\frac{\partial^2 u}{\partial r^2} + \frac{2\cos\theta\sin\theta}{r^2}\frac{\partial u}{\partial \theta} - \frac{2\cos\theta\sin\theta}{r}\frac{\partial^2 u}{\partial \theta \partial r} + \frac{\sin^2\theta}{r}\frac{\partial u}{\partial r} + \frac{\sin^2\theta}{r^2}\frac{\partial^2 u}{\partial \theta^2}$$
Noting that
$$\frac{\partial}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial \theta}{\partial y}\frac{\partial}{\partial \theta} = \sin\theta\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\frac{\partial}{\partial \theta}$$
we may use the same method to calculate other higher derivatives such as ∂²u/∂y² and ∂²u/∂x∂y
For example:
$$\frac{\partial^2 u}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial y}\right) = \left(\cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)\left(\sin\theta\frac{\partial u}{\partial r} + \frac{\cos\theta}{r}\frac{\partial u}{\partial \theta}\right)$$
and
$$\frac{\partial^3 u}{\partial x^3} = \left(\cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)\left(\cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)\left(\cos\theta\frac{\partial u}{\partial r} - \frac{\sin\theta}{r}\frac{\partial u}{\partial \theta}\right)$$
Which coordinate system is it best to evaluate partial derivatives in?
Example: if u = r², evaluate ∂²u/∂x²
Using
$$\frac{\partial^2 u}{\partial x^2} = \cos^2\theta\frac{\partial^2 u}{\partial r^2} + \frac{2\cos\theta\sin\theta}{r^2}\frac{\partial u}{\partial \theta} - \frac{2\cos\theta\sin\theta}{r}\frac{\partial^2 u}{\partial \theta \partial r} + \frac{\sin^2\theta}{r}\frac{\partial u}{\partial r} + \frac{\sin^2\theta}{r^2}\frac{\partial^2 u}{\partial \theta^2}$$
we see that
$$\frac{\partial^2 u}{\partial x^2} = (\cos^2\theta)(2) + \frac{2\cos\theta\sin\theta}{r^2}(0) - \frac{2\cos\theta\sin\theta}{r}(0) + \frac{\sin^2\theta}{r}(2r) + \frac{\sin^2\theta}{r^2}(0) = 2$$
Alternatively we could have written
$$u = r^2 = x^2 + y^2$$
and then deduced that ∂²u/∂x² = 2 much more quickly
Another example: if u = θ, evaluate ∂²u/∂x²
Again using the formula for ∂²u/∂x² in polar coordinates, we see very simply that
$$\frac{\partial^2 u}{\partial x^2} = \frac{2\cos\theta\sin\theta}{r^2} = \frac{2xy}{(x^2 + y^2)^2}$$
Alternatively:
$$u = \arctan\frac{y}{x}$$
and so
$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial}{\partial x}\left(\arctan\frac{y}{x}\right)\right) = \cdots = \frac{2xy}{(x^2 + y^2)^2}$$
This time, calculating ∂²u/∂x² using the polar coordinate formula is much easier
Location
✓ Mathematical preliminaries
✓ Partial differentiation
✓ Taylor series
✓ Critical points
⇒ • Solution of nonlinear equations
• Constrained optimisation
• Integration
• Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Solution of nonlinear equations
The first step in identifying critical points of a function f(x) of one variable is to find the values of x where f'(x) = 0.
If f(x) = sin(cos(3x⁵ + 5x³ + x)), then critical points satisfy
$$-\cos(\cos(3x^5 + 5x^3 + x))\sin(3x^5 + 5x^3 + x)(15x^4 + 15x^2 + 1) = 0$$
This nonlinear equation is not easy to solve — there is no closed form solution of a general nonlinear equation
We also do not know, in general, whether a solution exists
If a solution does exist, it is not always clear whether or not it is unique
Similarly, if f(x, y) is a function of two variables then critical points (x, y) satisfy ∂f/∂x = ∂f/∂y = 0
For f(x, y) = sin(cos(3x⁵y² + 5x³ + xy³)) critical points satisfy the nonlinear system of equations
$$-\cos(\cos(3x^5y^2 + 5x^3 + xy^3))\sin(3x^5y^2 + 5x^3 + xy^3)(15x^4y^2 + 15x^2 + y^3) = 0$$
$$-\cos(\cos(3x^5y^2 + 5x^3 + xy^3))\sin(3x^5y^2 + 5x^3 + xy^3)(6x^5y + 3xy^2) = 0$$
This is more difficult than the previous example — we now need to solve a system of nonlinear equations
For general nonlinear equations there is no guarantee that a solution exists
If a solution does exist there is no guarantee that it is unique
The simplest nonlinear equation is: find a real number x that satisfies the quadratic equation ax² + bx + c = 0. Clearly, if x exists then
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
• A solution exists if b² - 4ac ≥ 0
• The solution is unique if b² - 4ac = 0
We can solve some simple nonlinear systems, e.g. quadratic equations and very simple trigonometric equations
In general there isn't an analytic representation or formula for the solution of the nonlinear equation f(x) = 0
The best that can be done is iterative methods — start with an initial guess x₀ to the solution and then use a formula
$$x_i = g(x_{i-1}), \quad i = 1, 2, 3, \ldots$$
to calculate updates to the estimate of the solution
Hopefully this iteration will converge, but this can't always be guaranteed even if a solution does exist
The Newton–Raphson method for a single nonlinear equation
[Figure: linearisation of y = f(x) at x_{n-1}; the tangent at (x_{n-1}, y_{n-1}) crosses the x-axis at x_n, near the root x*]
The Newton–Raphson method is an iterative method
At each iteration the equation is linearised about the current iterate, and the resulting linear equation is used to update the iterate
In this case, x_n is a closer approximation to x* than x_{n-1}
The linearisation on the previous slide gives
$$f'(x_{n-1}) = \text{slope} = \frac{y_{n-1} - 0}{x_{n-1} - x_n} = \frac{f(x_{n-1})}{x_{n-1} - x_n}$$
which may be written as the explicit Newton–Raphson iteration
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}$$
We also need an initial guess x₀
A word of warning — the Newton–Raphson method can diverge.
In the plot below x_n is a worse approximation to x* than x_{n-1}
[Figure: a tangent at (x_{n-1}, y_{n-1}) whose x-axis crossing x_n is further from the root x*]
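The iteration is only a few lines of code; a sketch (the names `newton_raphson` and `fprime` are illustrative) applied to the example that follows:

```python
import math

def newton_raphson(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1}) until |f(x)| < tol."""
    x = x0
    for _ in range(max_iter):
        x = x - f(x) / fprime(x)
        if abs(f(x)) < tol:
            return x
    raise RuntimeError('no convergence: the method can diverge')

f = lambda x: x * math.exp(-x)
fp = lambda x: math.exp(-x) * (1 - x)
print(newton_raphson(f, fp, 0.2))   # converges to the root x = 0
```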
Example: solve f(x) = xe⁻ˣ = 0
[Figure: plot of f(x) = xe⁻ˣ]
Clearly there is one unique solution, x = 0
Noting that f'(x) = e⁻ˣ(1 - x), the Newton–Raphson iteration is given by
$$x_n = x_{n-1} - \frac{x_{n-1}e^{-x_{n-1}}}{e^{-x_{n-1}}(1 - x_{n-1})} = x_{n-1} - \frac{x_{n-1}}{1 - x_{n-1}} = \frac{-x_{n-1}^2}{1 - x_{n-1}}$$
When does Newton’s method work?
Initial guess x₀ = 0.2
This gives iterates
x₁ = -0.05
x₂ = -0.002381
x₃ = -5.655 × 10⁻⁶
x₄ = -3.1984 × 10⁻¹¹
We see that x_n → 0 as required
Initial guess x₀ = 0.5
This gives iterates
x₁ = -0.5
x₂ = -0.1667
x₃ = -0.02381
x₄ = -0.0005537
x₅ = -3.064 × 10⁻⁷
x₆ = -9.390 × 10⁻¹⁴
We again see that x_n → 0 as required
Initial guess x₀ = 0.99
This gives iterates
x₁ = -98.01
x₂ = -97.02
x₃ = -96.03
x₄ = -95.04
x₅ = -94.05
x₆ = -93.06
The first iterate x₁ was not what we wanted. But the error is decreasing (slightly) with subsequent iterates. What happens as n → ∞?
[Figure: |x_n - x*| against n on linear and logarithmic scales]
We see that the absolute error initially increases, and then decreases to zero
Initial guess x₀ = 2
This gives iterates
x₁ = 4
x₂ = 5.333
x₃ = 6.564
x₄ = 7.744
x₅ = 8.892
x₆ = 10.02
This doesn't look good. What happens as n → ∞?
[Figure: |x_n - x*| against n, growing without bound]
We see that the absolute error continues to increase as n increases
We will now give a proof of convergence of the Newton–Raphson method under certain conditions
Proof of convergence for the Newton–Raphson method
Suppose x = x* is a solution of f(x) = 0
Suppose further that:
• f(x) is a continuous function with continuous first and second derivatives on a closed interval x* - K ≤ x ≤ x* + K for some K > 0
• there exists a positive constant A such that
$$\frac{|f''(x)|}{|f'(y)|} \leq A$$
for all x* - K ≤ x ≤ x* + K and x* - K ≤ y ≤ x* + K
Let h be the minimum of K and 1/A.
The Newton–Raphson method will converge if the initial guess x₀ satisfies |x₀ - x*| < h.
We need to prove that x_n → x* as n → ∞
The Newton–Raphson iteration is given by
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})} \quad n = 1, 2, 3, \ldots$$
A Taylor expansion about x = x_{n-1} yields
$$0 = f(x^*) = f(x_{n-1}) + (x^* - x_{n-1})f'(x_{n-1}) + \frac{1}{2}(x^* - x_{n-1})^2 f''(\eta_{n-1})$$
for n = 1, 2, 3, ..., with η_{n-1} between x* and x_{n-1}
Eliminating f(x_{n-1}) between the two equations on the previous slide and rearranging gives
$$x^* - x_n = -\frac{(x^* - x_{n-1})^2 f''(\eta_{n-1})}{2f'(x_{n-1})}$$
We will now assume that |x* - x_{n-1}| ≤ h, and will show that this implies that |x* - x_n| ≤ h
We have assumed that
$$|x^* - x_{n-1}| \leq h \leq \frac{1}{A}$$
η_{n-1} lies between x* and x_{n-1}
As h ≤ K our conditions on f(x) allow us to deduce that
$$\frac{|f''(\eta_{n-1})|}{|f'(x_{n-1})|} \leq A$$
Therefore
$$|x^* - x_n| \leq \frac{1}{2}|x^* - x_{n-1}|$$
We can now deduce that if |x* - x_{n-1}| ≤ h then
$$|x^* - x_n| \leq \frac{1}{2}|x^* - x_{n-1}| \leq \frac{1}{2}h \leq h$$
Hence, provided |x* - x₀| ≤ h, then |x* - x₁| ≤ h, and |x* - x₂| ≤ h, and |x* - x₃| ≤ h ...
We have shown that successive iterates x_n lie in the interval |x* - x_n| ≤ h, but not that x_n → x* as n → ∞.
To do that, we note that
$$|x^* - x_n| \leq \frac{1}{2}|x^* - x_{n-1}| \leq \frac{1}{2^2}|x^* - x_{n-2}| \leq \frac{1}{2^3}|x^* - x_{n-3}| \leq \cdots \leq \frac{1}{2^n}|x^* - x_0|$$
Clearly x_n → x* as n → ∞
Returning to our original example of f(x) = xe⁻ˣ, we saw that Newton's method
• converged with absolute error decreasing monotonically for x₀ = 0.2, 0.5
• converged non-monotonically for x₀ = 0.99
• diverged for x₀ = 2
We will now explain this in terms of the conditions required for convergence of Newton's method
Recall the conditions for Newton's method to converge:
Suppose x = x* is a solution of f(x) = 0 and that:
• f(x) is a continuous function with continuous first and second derivatives on a closed interval x* - K ≤ x ≤ x* + K for some K > 0
• there exists a positive constant A such that
$$\frac{|f''(x)|}{|f'(y)|} \leq A$$
for all x* - K ≤ x ≤ x* + K and x* - K ≤ y ≤ x* + K
Let h be the minimum of K and 1/A.
The Newton–Raphson method will converge if the initial guess x₀ satisfies |x₀ - x*| < h.
The first condition, on continuity of f(x), f'(x) and f''(x), is satisfied for f(x) = xe⁻ˣ
We also have
$$\frac{|f''(x)|}{|f'(y)|} = \frac{e^y}{|1 - y|}\,e^{-x}|x - 2|$$
As the only solution to f(x) = 0 is x* = 0, we want to bound the quantity above for -K ≤ x, y ≤ K
Noting that, for 0 < K < 1,
$$e^{-x}|x - 2| \leq e^K(K + 2) \quad -K \leq x \leq K$$
$$\frac{e^y}{|1 - y|} \leq \frac{e^K}{1 - K} \quad -K \leq y \leq K$$
we may use
$$A = \frac{e^{2K}(K + 2)}{1 - K}$$
We now have convergence of Newton's method for any initial guess x₀ that satisfies |x₀| ≤ min(K, 1/A)
Note that A — and therefore 1/A — is a function of K
We therefore want to choose K so that h = min(K, 1/A) is as large as possible
Below we plot K and 1/A as functions of K
[Figure: K and 1/A plotted against K; the curves intersect at K ≈ 0.2234]
The largest value of h we can take is where the curves intersect — this is at K = h = 0.2234. By taking this value of K we can guarantee convergence for any initial guess satisfying |x₀| ≤ 0.2234.
We can guarantee convergence with absolute error decreasing monotonically for any initial guess satisfying |x₀| ≤ 0.2234. This was verified when x₀ = 0.2.
For |x₀| > 0.2234 we cannot say whether the Newton–Raphson method will converge or not: we haven't proved divergence. For example:
• When x₀ = 0.5 the Newton–Raphson method converged with absolute error decreasing monotonically
• When x₀ = 0.99 the Newton–Raphson method converged non-monotonically
• When x₀ = 2 the Newton–Raphson method diverged
The theory provides sufficient conditions for the Newton–Raphson method to converge, but it doesn't follow that the method will diverge if those conditions are not met
It can be useful to analyse the difference equation that arises to investigate the behaviour of Newton's method
When more than one root exists this can be useful in identifying which root a given initial guess converges to
For example, suppose we want to solve f(x) = x(1 - x) = 0, with initial guess x₀
This equation has roots at x = 0 and x = 1
The Newton–Raphson method gives, for n = 1, 2, 3, ...
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})} = x_{n-1} - \frac{x_{n-1}(1 - x_{n-1})}{1 - 2x_{n-1}} = \frac{-x_{n-1}^2}{1 - 2x_{n-1}}$$
We will now consider the cases x₀ < 0.5, x₀ = 0.5, x₀ > 0.5 separately.
Suppose x₀ < 0.5
We then have
$$x_1 = -\frac{x_0^2}{1 - 2x_0} < 0$$
Suppose x_{n-1} < 0
We then have, for n = 1, 2, 3, ...
$$x_n = -\frac{x_{n-1}^2}{1 - 2x_{n-1}} < 0$$
We therefore have x₂ < 0, x₃ < 0, x₄ < 0, ...
We also have, for n = 1, 2, 3, ...
$$x_n = x_{n-1} - \frac{x_{n-1}(1 - x_{n-1})}{1 - 2x_{n-1}}$$
and so
$$x_n - x_{n-1} = -\frac{x_{n-1}(1 - x_{n-1})}{1 - 2x_{n-1}}$$
When x_{n-1} < 0 we have
$$x_n - x_{n-1} > 0$$
and so x_n > x_{n-1} for n = 2, 3, 4, ...
x_n is therefore an increasing sequence
x_n < 0 for n = 2, 3, 4, ... and so x_n is bounded above by 0
As x_n is an increasing sequence of numbers that is bounded above it must converge to a limit.
The only possible limit is x_n → 0 as n → ∞
Hence, for any x₀ < 0.5 the Newton–Raphson method will converge to the root at x = 0
Suppose x₀ = 0.5
Remembering that
$$x_n = -\frac{x_{n-1}^2}{1 - 2x_{n-1}}$$
we see that we cannot compute x₁
The Newton–Raphson method will not work for x₀ = 0.5
For x₀ > 0.5 a similar analysis to the case x₀ < 0.5 reveals that the Newton–Raphson method will converge to the root x = 1
Alternatively, appeal to symmetry about x = 0.5
Newton's method for systems of nonlinear equations
The Newton–Raphson method for a single nonlinear equation was underpinned by making a linear approximation to the function about the point x = x_{n-1} to calculate the next iterate x_n
[Figure: linearisation of y = f(x) about x_{n-1} and the resulting iterate x_n]
This is equivalent to making a Taylor series approximation to f(x) about the point x = x_{n-1} and neglecting quadratic and higher order terms: we may then write
$$0 = f(x^*) \approx f(x_{n-1}) + (x^* - x_{n-1})f'(x_{n-1})$$
Rearranging gives
$$x^* \approx x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}$$
We therefore take our next iterate to be
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}$$
and recover the Newton–Raphson method
We may extend the Taylor series approach to systems of nonlinear equations
Suppose we want to solve the system of two equations in two unknowns
$$f(x, y) = 0 \qquad g(x, y) = 0$$
Note that, in common with systems of linear equations, we expect one equation per unknown variable
We have already seen that we may expand a function of two variables f(x, y) as a Taylor series about the point (x_{n-1}, y_{n-1}) up to and including linear terms:
$$f(x, y) \approx f(x_{n-1}, y_{n-1}) + A(x - x_{n-1}) + B(y - y_{n-1})$$
where
$$A = \left.\frac{\partial f}{\partial x}\right|_{x=x_{n-1}, y=y_{n-1}} \qquad B = \left.\frac{\partial f}{\partial y}\right|_{x=x_{n-1}, y=y_{n-1}}$$
Similarly, g(x, y) may be expanded as a Taylor series about the point (x_{n-1}, y_{n-1}) up to and including linear terms:
$$g(x, y) \approx g(x_{n-1}, y_{n-1}) + C(x - x_{n-1}) + D(y - y_{n-1})$$
where
$$C = \left.\frac{\partial g}{\partial x}\right|_{x=x_{n-1}, y=y_{n-1}} \qquad D = \left.\frac{\partial g}{\partial y}\right|_{x=x_{n-1}, y=y_{n-1}}$$
Suppose x = x*, y = y* is a root of the system of equations
$$f(x, y) = 0 \qquad g(x, y) = 0$$
Using our linear Taylor series expansions we have
$$0 = f(x^*, y^*) \approx f(x_{n-1}, y_{n-1}) + A(x^* - x_{n-1}) + B(y^* - y_{n-1})$$
$$0 = g(x^*, y^*) \approx g(x_{n-1}, y_{n-1}) + C(x^* - x_{n-1}) + D(y^* - y_{n-1})$$
or, in matrix form,
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} x^* - x_{n-1} \\ y^* - y_{n-1} \end{pmatrix} \approx -\begin{pmatrix} f(x_{n-1}, y_{n-1}) \\ g(x_{n-1}, y_{n-1}) \end{pmatrix}$$
Inverting this linear system — hoping it is invertible — gives
$$\begin{pmatrix} x^* \\ y^* \end{pmatrix} \approx \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} - \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}\begin{pmatrix} f(x_{n-1}, y_{n-1}) \\ g(x_{n-1}, y_{n-1}) \end{pmatrix}$$
We then use this approximation to x* and y* as our next iterate, i.e.
$$\begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} - \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1}\begin{pmatrix} f(x_{n-1}, y_{n-1}) \\ g(x_{n-1}, y_{n-1}) \end{pmatrix}$$
A first example: solve the system of equations
$$f(x, y) = 4x^2 + y^2 - 4 = 0$$
$$g(x, y) = x + y - \sin(x - y) = 0$$
using the initial guess x₀ = 1, y₀ = 0.
By evaluating the partial derivatives of f(x, y) and g(x, y) at x = x_{n-1}, y = y_{n-1}, we obtain
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 8x_{n-1} & 2y_{n-1} \\ 1 - \cos(x_{n-1} - y_{n-1}) & 1 + \cos(x_{n-1} - y_{n-1}) \end{pmatrix}$$
The Newton iteration then becomes
$$\begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} - \begin{pmatrix} 8x_{n-1} & 2y_{n-1} \\ 1 - \cos(x_{n-1} - y_{n-1}) & 1 + \cos(x_{n-1} - y_{n-1}) \end{pmatrix}^{-1}\begin{pmatrix} 4x_{n-1}^2 + y_{n-1}^2 - 4 \\ x_{n-1} + y_{n-1} - \sin(x_{n-1} - y_{n-1}) \end{pmatrix}$$
This iteration gives

n   x              y               f(x, y)         g(x, y)
0   1              0               0               1.59 × 10⁻¹
1   1              -0.1029207154   1.06 × 10⁻²     4.55 × 10⁻³
2   0.9986087598   -0.1055307239   1.46 × 10⁻⁵     6.63 × 10⁻⁷
3   0.9986069441   -0.1055304923   1.32 × 10⁻¹¹    1.87 × 10⁻¹²
For this system of equations, we see that
$$f(-x, -y) = f(x, y) \qquad g(-x, -y) = -g(x, y)$$
If (x, y) is a solution to f(x, y) = 0 = g(x, y) then it follows that (-x, -y) is also a solution
Do we know which root a given starting guess will converge to?
Another example: find complex numbers z such that e^z - z - 2 = 0
Writing z = x + iy, where x, y are real, this becomes the system of equations
$$e^x\cos y - x - 2 = 0$$
$$e^x\sin y - y = 0$$
We may apply Newton's method to this system, noting that
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} e^x\cos y - 1 & -e^x\sin y \\ e^x\sin y & e^x\cos y - 1 \end{pmatrix}$$
On the next slide, the black open circles correspond to some roots of the system of equations — it can be shown that there are an infinite number of these roots
The red dots that are connected by broken red lines show the convergence paths for different initial guesses to the solution, with a red dot corresponding to one iteration
Note that:
• Newton's method doesn't necessarily converge to the nearest root for a given initial condition
• The path to the root isn't always uni-directional
[Figure: roots of e^z - z - 2 = 0 in the (x, y)-plane, with Newton iteration paths from several initial guesses]
Generalisation of Newton's method to larger systems
Suppose we have a system of M equations in M unknown variables:
$$f_1(x_1, x_2, x_3, \ldots, x_M) = 0$$
$$f_2(x_1, x_2, x_3, \ldots, x_M) = 0$$
$$\vdots$$
$$f_M(x_1, x_2, x_3, \ldots, x_M) = 0$$
We may generalise Newton's method to these systems
Let us suppose that $\mathbf{x}_{n-1} = (x_1^{(n-1)}, x_2^{(n-1)}, \ldots, x_M^{(n-1)})$ is an iterative estimate to x that satisfies f(x) = 0
We may linearise f_i as a Taylor series, up to and including the linear term, about x_{n-1} as
$$f_i(\mathbf{x}) \approx f_i(\mathbf{x}_{n-1}) + \sum_{j=1}^{M} J_{ij}\left(x_j - x_j^{(n-1)}\right)$$
where
$$J_{ij} = \left.\frac{\partial f_i}{\partial x_j}\right|_{\mathbf{x} = \mathbf{x}_{n-1}}$$
Writing
$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_M \end{pmatrix} \qquad \mathbf{f}(\mathbf{x}) = \begin{pmatrix} f_1(\mathbf{x}) \\ f_2(\mathbf{x}) \\ f_3(\mathbf{x}) \\ \vdots \\ f_M(\mathbf{x}) \end{pmatrix} \qquad J = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_M} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_M} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_M}{\partial x_1} & \frac{\partial f_M}{\partial x_2} & \cdots & \frac{\partial f_M}{\partial x_M} \end{pmatrix}$$
The linearisation becomes, in vector form,
$$\mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_{n-1}) + J\,(\mathbf{x} - \mathbf{x}_{n-1})$$
where all the entries of J are evaluated at x = x_{n-1}
If x is chosen such that f(x) = 0 then
$$\mathbf{x} \approx \mathbf{x}_{n-1} - J^{-1}\mathbf{f}(\mathbf{x}_{n-1})$$
Newton's method becomes, given an initial guess x₀:
$$\mathbf{x}_n = \mathbf{x}_{n-1} - J^{-1}\mathbf{f}(\mathbf{x}_{n-1})$$
for n = 1, 2, 3, ..., where all entries of J are evaluated at x_{n-1}
As with the Newton–Raphson method for scalar equations, this iteration will converge provided x₀ is close enough to the solution
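A sketch of the vector iteration with numpy (assuming it is available), applied to the 2 × 2 example from earlier; rather than forming J⁻¹ explicitly, it solves the linear system at each step, which is the usual numerically preferable choice:

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=20):
    """x_n = x_{n-1} - J^{-1} f(x_{n-1}), via a linear solve at each step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x = x - np.linalg.solve(J(x), F(x))
        if np.linalg.norm(F(x)) < tol:
            return x
    raise RuntimeError('no convergence')

# the earlier example: f = 4x^2 + y^2 - 4, g = x + y - sin(x - y)
F = lambda v: np.array([4*v[0]**2 + v[1]**2 - 4,
                        v[0] + v[1] - np.sin(v[0] - v[1])])
J = lambda v: np.array([[8*v[0], 2*v[1]],
                        [1 - np.cos(v[0] - v[1]), 1 + np.cos(v[0] - v[1])]])
print(newton_system(F, J, [1.0, 0.0]))   # approx. [0.99860694, -0.10553049]
```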
Location
✓ Mathematical preliminaries
✓ Partial differentiation
✓ Taylor series
✓ Critical points
✓ Solution of nonlinear equations
⇒ • Constrained optimisation
• Integration
• Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Constrained optimisation
So far we have only looked at unconstrained optimisation
We will now think about constrained optimisation
This has applications in, for example, machine learning
In general, some quantifiable observables may be known, and we want to optimise other observables
Suppose we want to calculate the minimum value of f(x, y) = x + y, subject to x and y lying on the unit circle x² + y² = 1.
This is an example of constrained optimisation
In this case we can parameterise the unit circle by
$$x = \cos\theta, \quad y = \sin\theta$$
and then we have to minimise the expression
$$\cos\theta + \sin\theta$$
It is then straightforward to show that the constrained minimum value is -√2
Suppose a curve in the (x, y)-plane is given implicitly by
$$g(x, y) = 3x^2 + 3y^2 + 4xy - 2 = 0$$
If we want to locate the point on g(x, y) = 0 with the maximum square of the distance from the origin, we could write this as:
Maximise f(x, y) = x² + y² subject to g(x, y) = 0
In this case there isn't an obvious parameterisation of the curve defined by g(x, y) = 0.
Another example, from three dimensions: suppose we want to minimise
$$f(x, y, z) = x + y + z$$
subject to the constraints that the point (x, y, z) lies on the intersection of the two spheres given by
$$g(x, y, z) = x^2 + y^2 + z^2 - 1 = 0$$
$$h(x, y, z) = (x - 1)^2 + y^2 + z^2 - 1 = 0$$
Again, parameterising the combination of the constraints is not simple
Lagrange multipliers are a systematic method for optimising functions subject to constraints
Suppose we want to minimise (or maximise) a function f(x, y) subject to a constraint g(x, y) = 0.
Let (x(t), y(t)) be a point moving along the curve given by the constraint g(x, y) = 0 without pausing — this implies that x'(t) and y'(t) are never simultaneously zero
From the chain rule we know that, along the curve given by (x(t), y(t)),
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \qquad \frac{dg}{dt} = \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt}$$
At a maximum or minimum of f(x, y) subject to the constraint we will have
$$\frac{df}{dt} = 0$$
As g(x, y) always takes the value g(x, y) = 0 on the curve (x(t), y(t)), its value will not change and so
$$\frac{dg}{dt} = 0$$
In matrix form:
$$\begin{pmatrix} \partial f/\partial x & \partial f/\partial y \\ \partial g/\partial x & \partial g/\partial y \end{pmatrix}\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
As we have chosen (x(t), y(t)) such that x'(t) and y'(t) are never simultaneously zero, the matrix on the previous slide must be singular
The rows of the matrix must be proportional to each other, and so
$$\frac{\partial f}{\partial x} = \lambda\frac{\partial g}{\partial x} \qquad \frac{\partial f}{\partial y} = \lambda\frac{\partial g}{\partial y}$$
for some constant λ known as a Lagrange multiplier
We may now re-write our constrained optimisation problem of minimising (or maximising) a function f(x, y) subject to a constraint g(x, y) = 0 as the (possibly nonlinear) system of equations
$$\frac{\partial f}{\partial x} = \lambda\frac{\partial g}{\partial x} \qquad \frac{\partial f}{\partial y} = \lambda\frac{\partial g}{\partial y} \qquad g(x, y) = 0$$
This system of equations has three equations for three unknowns (λ, x, y)
This can be thought of as minimising or maximising the function of three variables F(x, y, λ) = f(x, y) - λg(x, y)
Our original example was: minimise f(x, y) = x + y subject to the constraint g(x, y) = x² + y² - 1 = 0
The Lagrange multiplier system of equations is
$$1 = 2\lambda x \qquad 1 = 2\lambda y \qquad x^2 + y^2 = 1$$
Noting that the first two equations give x = y = 1/(2λ), we may use the final equation to deduce that
$$\lambda^2 = 1/2, \quad \text{i.e.} \quad \lambda = \pm\sqrt{1/2}$$
Maxima or minima of f are therefore at the points (√(1/2), √(1/2)) and (-√(1/2), -√(1/2))
Noting that
$$f\left(\sqrt{1/2}, \sqrt{1/2}\right) = \sqrt{2} \qquad f\left(-\sqrt{1/2}, -\sqrt{1/2}\right) = -\sqrt{2}$$
we see that the minimum value of f(x, y) subject to the constraint g(x, y) = 0 is -√2
Example from earlier:
Maximise f(x, y) = x² + y² subject to g(x, y) = 3x² + 3y² + 4xy - 2 = 0
The Lagrange multiplier system of equations is
$$2x = \lambda(6x + 4y) \qquad 2y = \lambda(6y + 4x) \qquad 3x^2 + 3y^2 + 4xy - 2 = 0$$
Dividing the first equation by the second equation gives
$$\frac{2x}{2y} = \frac{6x + 4y}{6y + 4x}$$
which simplifies to x = ±y
Considering first the case x = y, the final equation gives
$$x = y = \pm\sqrt{1/5}$$
For the case x = -y the final equation gives
$$-y = x = \pm 1$$
We have now identified four points where the maximum of f(x, y) may occur.
$$f\left(\sqrt{1/5}, \sqrt{1/5}\right) = \frac{2}{5} \qquad f\left(-\sqrt{1/5}, -\sqrt{1/5}\right) = \frac{2}{5} \qquad f(1, -1) = 2 \qquad f(-1, 1) = 2$$
Hence the maximum value of x² + y², subject to 3x² + 3y² + 4xy - 2 = 0, is 2
Extra constraints: minimise (or maximise) f(x, y, z) subject to the two constraints g(x, y, z) = 0 and h(x, y, z) = 0
This can be posed as: minimise (or maximise) F(x, y, z, λ, μ) = f(x, y, z) - λg(x, y, z) - μh(x, y, z), which gives the following equations for the unknowns x, y, z, λ, μ
$$\frac{\partial f}{\partial x} = \lambda\frac{\partial g}{\partial x} + \mu\frac{\partial h}{\partial x} \qquad \frac{\partial f}{\partial y} = \lambda\frac{\partial g}{\partial y} + \mu\frac{\partial h}{\partial y} \qquad \frac{\partial f}{\partial z} = \lambda\frac{\partial g}{\partial z} + \mu\frac{\partial h}{\partial z}$$
$$g = 0 \qquad h = 0$$
201 & 202 Continuous Mathematics
Earlier example: minimise

f(x, y, z) = x + y + z

subject to the constraints

g(x, y, z) = x² + y² + z² − 1 = 0
h(x, y, z) = (x − 1)² + y² + z² − 1 = 0

The Lagrange multiplier system of equations is

1 = 2λx + 2µ(x − 1)
1 = 2λy + 2µy
1 = 2λz + 2µz
x² + y² + z² = 1
(x − 1)² + y² + z² = 1

The last two equations give x = 1/2, and the second and third equations give y = z

The fourth equation then gives y = ±√(3/8)

Noting that

f(1/2, √(3/8), √(3/8)) = 1/2 + √(3/2)
f(1/2, −√(3/8), −√(3/8)) = 1/2 − √(3/2)

we see that the minimum value of f subject to the constraints is 1/2 − √(3/2)
In the previous two examples we haven’t bothered to calculate the
Lagrange multipliers — is there an interpretation of them?
The answer is “sometimes”
Suppose we want to calculate the maximum value of f(x, y) = x + y on the circle of radius c centred at the origin.

The constraint may be written g(x, y) = x² + y² − c² = 0

We may write the maximisation problem as: maximise

F(x, y, λ) = x + y − λ(x² + y² − c²)

The Lagrange multiplier system of equations is

1 = 2λx
1 = 2λy
x² + y² − c² = 0

There are two solutions to this system of equations:

x = c/√2,  y = c/√2,  λ = 1/(√2 c)
x = −c/√2,  y = −c/√2,  λ = −1/(√2 c)

Noting that

f(c/√2, c/√2) = √2 c
f(−c/√2, −c/√2) = −√2 c

we see that the maximum of f, subject to the constraint, is √2 c
Note that F is also dependent on the radius of the circle, c.
Suppose we change c by a small amount δc
This will change F by a small amount δF
By the definition of a derivative

∂F/∂c = δF/δc = 2cλ

and so

δF = 2cλ δc = 2c (1/(√2 c)) δc = √2 δc
In this case the Lagrange multiplier gives an indication of how small
variations of a parameter in the problem may affect the minima and
maxima
Location
å Mathematical preliminaries
å Partial differentiation
å Taylor series
å Critical points
å Solution of nonlinear equations
å Constrained optimisation
⇒ • Integration
• Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Integration
Suppose we define F (a) to be the area enclosed by the lines x = 0,
x = a, the x−axis and the curve y = f(x): this is known as the
integral of f(x) between x = 0 and x = a and is denoted by
F(a) = ∫_0^a f(x) dx
F (a) is equal to the shaded region in the diagram below
[Figure: plot of y = f(x); the region bounded by x = 0, x = a, the x-axis and the curve is shaded]
Now let us consider F(a + s), which is the sum of the two shaded regions in the diagram below

[Figure: plot of y = f(x) with shaded regions up to x = a and between x = a and x = a + s]

We can see that

F(a + s) − F(a) = area of the darker shaded region
Approximating the area of the darker shaded region by

(s/2) ( f(a) + f(a + s) )

allows us to write

( F(a + s) − F(a) )/s ≈ (1/2) ( f(a) + f(a + s) )

Taking the limit as s → 0:

lim_{s→0} ( F(a + s) − F(a) )/s = (1/2) lim_{s→0} ( f(a) + f(a + s) )

which may be written

F′(a) = f(a)

We now see that

F(a) = ∫_0^a f(x) dx  ⇒  f(x) = F′(x)
We see that we can think of integration as being the inverse of
differentiation — if we want to integrate f(x) we want to find a
function F (x) such that F ′(x) = f(x)
Indefinite integrals
We have defined F (a) to be the area enclosed by the lines x = 0,
x = a, the x−axis and the curve y = f(x)
Choosing x = 0 was arbitrary: we could pick x to be any value
x = x0: this would simply add an arbitrary constant to the value of
F (a)
Adding this arbitrary constant to the integral is known as indefinite
integration
For an indefinite integral we remove the limits from the integral sign and write, e.g.

∫ 2x dx

Suppose we want to evaluate the indefinite integral ∫ 2x dx

On the previous slide we thought of integration as being the inverse of differentiation

We therefore want to find a function whose derivative is 2x

We therefore deduce that

∫ 2x dx = x² + A

where A is an arbitrary constant
Useful integrals

∫ sin x dx = −cos x + A
∫ cos x dx = sin x + A
∫ sin kx dx = −(1/k) cos kx + A
∫ cos kx dx = (1/k) sin kx + A
∫ 1 dx = x + A
∫ x dx = x²/2 + A
∫ xⁿ dx = xⁿ⁺¹/(n + 1) + A,  (n ≠ −1)
∫ e^x dx = e^x + A
∫ e^{kx} dx = (1/k) e^{kx} + A
∫ (1/x) dx = log|x| + A
∫ 1/(1 + x) dx = log|1 + x| + A
∫ 1/(1 − x) dx = −log|1 − x| + A
∫ f′(x) [f(x)]ⁿ dx = [f(x)]ⁿ⁺¹/(n + 1) + A
∫ f′(x)/f(x) dx = log|f(x)| + A
Definite integrals
Suppose we want to find the area between the lines x = a and x = b, the x-axis and the curve y = f(x)

This is denoted by the definite integral

∫_a^b f(x) dx

The phrase definite integral is used because we want to evaluate an area, rather than simply find a function
Example: find the area enclosed by the x-axis, the lines x = 1 and x = 2, and the curve y = x³

Area = ∫_1^2 x³ dx
     = [ x⁴/4 + A ]_1^2
     = ( 2⁴/4 + A ) − ( 1⁴/4 + A )
     = 4 − 1/4
Note that the arbitrary constant A disappears — we do not usually
worry about it in the working for definite integrals
Integration by substitution
Suppose we want to evaluate ∫ cos x sin⁴x dx

Writing u = sin x, and noting that du/dx = cos x, we can (ignoring a good deal of mathematical rigour) write

du = cos x dx

Noting that cos x dx appears in the integral allows us to re-write the integral
We may now write

∫ cos x sin⁴x dx = ∫ sin⁴x du = ∫ u⁴ du = u⁵/5 + A = (1/5) sin⁵x + A
Definite integrals by substitution
Suppose we want to evaluate

∫_0^1 1/(1 + x²) dx

We may use the substitution x = tan u: we then have

dx/du = sec²u = 1 + tan²u = 1 + x²

Hence

1/(1 + x²) dx = du
We now have to think about the limits of the integral:
When x = 0, u = 0
When x = 1, u = π/4
Hence

∫_0^1 1/(1 + x²) dx = ∫_0^{π/4} du = π/4
Integration by parts
Recall the derivative of a product of two functions u(x) and v(x):
d/dx (uv) = (du/dx) v + u (dv/dx)

Rearranging and integrating gives the indefinite integral

∫ u (dv/dx) dx = uv − ∫ v (du/dx) dx + A

or the definite integral

∫_a^b u (dv/dx) dx = [uv]_a^b − ∫_a^b v (du/dx) dx
Example: evaluate ∫ x sin x dx

In this case we use

u = x,  dv/dx = sin x

We then have

du/dx = 1,  v = −cos x

and so, integrating by parts,

∫ x sin x dx = −x cos x + ∫ cos x dx + A = −x cos x + sin x + A
Example: calculate the area enclosed by the curve y = x³ log x and the x-axis between x = 1 and x = 2

The area is given by

∫_1^2 x³ log x dx

Integrating by parts, we take

u = log x,  dv/dx = x³

We then have

du/dx = 1/x,  v = x⁴/4
and so, integrating by parts,

∫_1^2 x³ log x dx = [ (x⁴/4) log x ]_1^2 − ∫_1^2 (x³/4) dx
                  = 4 log 2 − [ x⁴/16 ]_1^2
                  = 4 log 2 − 15/16
                  = log 16 − 15/16
Reduction formulae for integration
Suppose

I_n = ∫_0^1 xⁿ e^x dx,  n = 0, 1, 2, . . .

We have

I_0 = ∫_0^1 e^x dx = [e^x]_0^1 = e − 1

We may also use integration by parts to calculate a recurrence relation for n ≥ 1
Using

u = xⁿ,  dv/dx = e^x

we have

du/dx = n xⁿ⁻¹,  v = e^x

and so

I_n = ∫_0^1 xⁿ e^x dx = [xⁿ e^x]_0^1 − n ∫_0^1 xⁿ⁻¹ e^x dx = e − n I_{n−1}

We therefore have

I_0 = e − 1
I_1 = e − I_0 = 1
I_2 = e − 2I_1 = e − 2
I_3 = e − 3I_2 = 6 − 2e
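Recurrences such as this are well suited to a short program. Below is a minimal Python sketch (the function names are mine, not from the notes) that evaluates I_n by the recurrence and checks it against a crude midpoint-rule estimate of the integral.

```python
import math

def I_recurrence(n):
    """Evaluate I_n = e - n*I_{n-1} with I_0 = e - 1."""
    I = math.e - 1.0
    for k in range(1, n + 1):
        I = math.e - k * I
    return I

def I_midpoint(n, strips=10000):
    """Crude midpoint-rule estimate of the integral of x^n e^x on [0, 1]."""
    h = 1.0 / strips
    return sum(((i + 0.5) * h) ** n * math.exp((i + 0.5) * h) * h
               for i in range(strips))

for n in range(4):
    print(n, I_recurrence(n), I_midpoint(n))
# For n = 3 both values are close to 6 - 2e = 0.5634...
```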
Example: evaluate ∫ (x + 3)/(x² + 2x − 8) dx by using partial fractions

Noting that the denominator can be factorised as

x² + 2x − 8 = (x − 2)(x + 4)

we write

(x + 3)/(x² + 2x − 8) = (x + 3)/((x − 2)(x + 4))
                      = A/(x − 2) + B/(x + 4)
                      = ( A(x + 4) + B(x − 2) ) / ( (x − 2)(x + 4) )

We therefore have, for all x,

x + 3 = A(x + 4) + B(x − 2)

Putting x = 2 gives A = 5/6

Putting x = −4 gives B = 1/6

We therefore have

(x + 3)/(x² + 2x − 8) = (1/6) ( 5/(x − 2) + 1/(x + 4) )

This allows us to write

∫ (x + 3)/(x² + 2x − 8) dx = ∫ (1/6) ( 5/(x − 2) + 1/(x + 4) ) dx
                           = (1/6) ( 5 log|x − 2| + log|x + 4| ) + A
Example: evaluate ∫ (x² − 2x − 1)/((x − 1)(x² + 1)) dx by using partial fractions

As we now have a quadratic factor in the denominator that we can't factorise, we look for a partial fractions decomposition of the form

(x² − 2x − 1)/((x − 1)(x² + 1)) = A/(x − 1) + (Bx + C)/(x² + 1)

Proceeding as before:

(x² − 2x − 1)/((x − 1)(x² + 1)) = ( A(x² + 1) + (Bx + C)(x − 1) ) / ( (x − 1)(x² + 1) )

and so x² − 2x − 1 = A(x² + 1) + (Bx + C)(x − 1)

x = 1 ⇒ A = −1
x = 0 ⇒ C = 0
x = 2 ⇒ B = 2

Hence

(x² − 2x − 1)/((x − 1)(x² + 1)) = −1/(x − 1) + 2x/(x² + 1)

∫ (x² − 2x − 1)/((x − 1)(x² + 1)) dx = ∫ ( −1/(x − 1) + 2x/(x² + 1) ) dx
                                     = −log|x − 1| + log(x² + 1) + log A
                                     = log ( A(x² + 1)/|x − 1| )
Example with a repeated root

To decompose

(−x² − 5x + 58) / ( (x + 3)(x − 5)² )

into partial fractions we look for a decomposition of the form

(−x² − 5x + 58) / ( (x + 3)(x − 5)² ) = A/(x + 3) + B/(x − 5) + C/(x − 5)²

Exercise: show that A = 1, B = −2, C = 1

This allows us to integrate (−x² − 5x + 58)/((x + 3)(x − 5)²)
Summary of methods for integration
We have seen a few methods for evaluating ∫ f(x) dx:
• Inspection, i.e. by writing down a function F (x) such that
F ′(x) = f(x)
• Substitution
• Integration by parts
• Reduction formulae
• Partial fractions
Numerical integration
Evaluating ∫_a^b f(x) dx requires us to find a function F(x) such that F′(x) = f(x)
This isn’t always possible
For definite integrals — which are equivalent to finding an area — we
may use numerical methods to approximate the area
The trapezium rule
Using the diagram below, we may write

∫_{x_0}^{x_8} f(x) dx = Σ_{i=1}^{8} ∫_{x_{i−1}}^{x_i} f(x) dx

This is equivalent to saying that the total area under the curve is equal to the sum of the areas in the strips

[Figure: the area under f(x) divided into eight strips by the nodes x_0, x_1, . . . , x_8]
We will assume all the strips have the same width h; with the eight strips above, h = (x_8 − x_0)/8

The area under the curve in the strip between x_{i−1} and x_i may be approximated by a trapezium with area

h ( f(x_{i−1}) + f(x_i) ) / 2

We may therefore approximate the integral by

∫_{x_0}^{x_8} f(x) dx ≈ Σ_{i=1}^{8} h ( f(x_{i−1}) + f(x_i) ) / 2
                      = h ( f(x_0)/2 + Σ_{i=1}^{7} f(x_i) + f(x_8)/2 )
More generally, suppose we want to evaluate ∫_a^b f(x) dx

We divide the interval a < x < b into N intervals of equal width h, where

h = (b − a)/N

We then have

∫_a^b f(x) dx ≈ h ( f(x_0)/2 + Σ_{i=1}^{N−1} f(x_i) + f(x_N)/2 )
Intuitively we expect the approximation to become more accurate as
N increases (and h decreases)
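The composite trapezium rule translates directly into a few lines of code. The following is a minimal Python sketch (the function name is mine); it assumes f can be evaluated at any point of [a, b].

```python
import math

def trapezium(f, a, b, N):
    """Composite trapezium rule with N equal strips on [a, b]."""
    h = (b - a) / N
    total = 0.5 * (f(a) + f(b))   # end points carry weight 1/2
    for i in range(1, N):
        total += f(a + i * h)     # interior nodes carry weight 1
    return h * total

print(trapezium(math.sin, 0.0, math.pi, 64))  # 1.9995983886..., as in the table below
```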
Example: use the trapezium rule to approximate ∫_0^π sin x dx
Clearly the true value of this integral is 2
The table below shows how the value computed by the trapezium
rule varies as N is increased
N Integral N Integral
2 1.570796326794897 64 1.999598388640037
4 1.896118897937040 128 1.999899600184202
8 1.974231601945551 256 1.999974900235052
16 1.993570343772339 512 1.999993725070576
32 1.998393360970145 1024 1.999998431268381
As we increase N we get closer and closer to the true value of 2
Below we plot — on logarithmic scales — the error as a function of
h = π/N
[Figure: log–log plot of the absolute error against h = π/N for the trapezium rule]
We see that — on these logarithmic axes — the gradient of the graph
is 2
Suppose the absolute error, E, is given by

E = A hⁿ

for constants A, n as h → 0

This is equivalent to

log E = log A + n log h
Suppose we have values of E measured at different values of h
We can estimate n by plotting E against h on logarithmic axes — n
is the gradient of the straight line
We see that n = 2 for the trapezium rule — this method is said to be
second order
When h is small, halving h will divide the error by a factor of 4
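The slope n can also be estimated from just two error measurements, without plotting: if E ≈ A hⁿ then n ≈ log(E₁/E₂) / log(h₁/h₂). A small sketch using the trapezium function from the earlier code block:

```python
import math

E = lambda N: abs(2.0 - trapezium(math.sin, 0.0, math.pi, N))
h1, h2 = math.pi / 64, math.pi / 128
n = math.log(E(64) / E(128)) / math.log(h1 / h2)
print(n)  # close to 2, confirming that the trapezium rule is second order
```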
Proof of the error bound for the trapezium rule
We have already noted that the shaded area under the curve below is
equal to the sum of the areas of the strips
[Figure: as before, the area under f(x) divided into strips by the nodes x_0, x_1, . . . , x_8]
We will first bound the absolute error in the trapezium rule
approximation to the area of each strip and then use this to bound
the total absolute error
The Taylor series expansion of f(x) on the interval x_0 < x < x_1, up to and including linear terms and a quadratic error term, is

f(x) = f(x_0) + (x − x_0) f′(x_0) + (1/2)(x − x_0)² f′′(ζ)

where ζ is some value such that x_0 < ζ < x_1

This allows us to write

f(x_1) = f(x_0) + (x_1 − x_0) f′(x_0) + (1/2)(x_1 − x_0)² f′′(ζ)

or, re-arranging,

f′(x_0) = ( f(x_1) − f(x_0) )/(x_1 − x_0) − (1/2)(x_1 − x_0) f′′(ζ)
        = ( f(x_1) − f(x_0) )/h − (h/2) f′′(ζ)

We may now write

∫_{x_0}^{x_1} f(x) dx = ∫_{x_0}^{x_1} ( f(x_0) + (x − x_0) f′(x_0) + (1/2)(x − x_0)² f′′(ζ) ) dx
                      = h f(x_0) + (h²/2) f′(x_0) + (h³/6) f′′(ζ)
                      = h f(x_0) + (h²/2) ( ( f(x_1) − f(x_0) )/h − (h/2) f′′(ζ) ) + (h³/6) f′′(ζ)
                      = (h/2) ( f(x_0) + f(x_1) ) − (h³/12) f′′(ζ)
This may be used to bound the error on one strip — we will now use
this to bound the whole error
We divide the interval a < x < b into N intervals of equal size h, and so Nh = b − a

We may write

∫_a^b f(x) dx = Σ_{i=1}^{N} ∫_{x_{i−1}}^{x_i} f(x) dx

On each strip,

∫_{x_{i−1}}^{x_i} f(x) dx = (h/2) ( f(x_{i−1}) + f(x_i) ) − (h³/12) f′′(ζ_i)

where ζ_i lies in the interval x_{i−1} < ζ_i < x_i

We then have

∫_a^b f(x) dx = Σ_{i=1}^{N} ( (h/2) ( f(x_{i−1}) + f(x_i) ) − (h³/12) f′′(ζ_i) )
              = h ( f(x_0)/2 + Σ_{i=1}^{N−1} f(x_i) + f(x_N)/2 ) − Σ_{i=1}^{N} (h³/12) f′′(ζ_i)

The first term on the right-hand side of the equation above is the trapezium rule approximation to the integral

The absolute error then satisfies

| ∫_a^b f(x) dx − h ( f(x_0)/2 + Σ_{i=1}^{N−1} f(x_i) + f(x_N)/2 ) | = | Σ_{i=1}^{N} (h³/12) f′′(ζ_i) |
Now let us assume that |f′′(x)| ≤ F for a < x < b

We then have

| ∫_a^b f(x) dx − h ( f(x_0)/2 + Σ_{i=1}^{N−1} f(x_i) + f(x_N)/2 ) | ≤ Σ_{i=1}^{N} (h³/12) F = (N h³/12) F = (1/12)(b − a) F h²

Hence the absolute error varies like h², as observed in our numerical example
Example: by considering the integral ∫_0^1 x³ dx, and by using the trapezium rule, show that

4 lim_{N→∞} Σ_{n=1}^{N−1} n³/N⁴ = 1

Let I = ∫_0^1 x³ dx. Clearly I = 1/4.

Applying the trapezium rule with N intervals gives h = 1/N and x_n = nh, n = 0, 1, 2, 3, . . . , N.

We then have

I ≈ (1/N) ( 0/2 + Σ_{n=1}^{N−1} (nh)³ + 1/2 )
  = (1/N) ( 0/2 + Σ_{n=1}^{N−1} n³/N³ + 1/2 )
In the limit N → ∞ the trapezium rule approximation converges to the true value.

Therefore

1/4 = lim_{N→∞} (1/N) ( 0/2 + Σ_{n=1}^{N−1} n³/N³ + 1/2 )
    = lim_{N→∞} (1/N) Σ_{n=1}^{N−1} n³/N³
    = lim_{N→∞} Σ_{n=1}^{N−1} n³/N⁴

Multiplying both sides by 4 gives the required result.
Simpson’s rule
Simpson’s rule is another numerical method for approximating
definite integrals
To evaluate ∫_a^b f(x) dx we again divide the interval a < x < b into N equally sized intervals of width h, with interval i defined by x_{i−1} < x < x_i

For Simpson's rule, N must be an even number

Simpson's rule then gives

∫_a^b f(x) dx ≈ (h/3) ( f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + 4f(x_5) + . . . + 2f(x_{N−2}) + 4f(x_{N−1}) + f(x_N) )
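Simpson's rule is equally brief in code. A minimal Python sketch along the same lines as the trapezium function above (again, the function name is mine, and N is assumed even):

```python
import math

def simpson(f, a, b, N):
    """Composite Simpson's rule with N equal strips; N must be even."""
    if N % 2:
        raise ValueError("Simpson's rule requires an even number of intervals")
    h = (b - a) / N
    total = f(a) + f(b)
    for i in range(1, N):
        total += (4 if i % 2 else 2) * f(a + i * h)  # weights 4, 2, 4, 2, ...
    return h * total / 3.0

print(simpson(math.sin, 0.0, math.pi, 64))  # 2.0000000645..., as in the table below
```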
Example: use Simpson's rule to approximate ∫_0^π sin x dx
The true value of this integral is 2
The table below shows how the value computed by Simpson’s rule
varies as N is increased
N Integral N Integral
2 2.094395102393195 64 2.000000064530002
4 2.004559754984421 128 2.000000004032257
8 2.000269169948388 256 2.000000000252002
16 2.000016591047935 512 2.000000000015752
32 2.000001033369413 1024 2.000000000000984
A plot of the absolute error against h on logarithmic axes is given
below
[Figure: log–log plot of the absolute error against h for Simpson's rule]
We see that — on these logarithmic axes — the gradient of the graph
is 4
Halving h (i.e. doubling N) will therefore reduce the absolute error
by a factor of 16
A bound for the absolute error of the integral approximated by Simpson's rule exists (proof not examinable)

The absolute error is bounded by

E ≤ (b − a) h⁴ G / 180

where |f′′′′(x)| < G for a < x < b
This explains why plotting the absolute error against h on
logarithmic axes resulted in a graph of gradient 4
Location
å Mathematical preliminaries
å Partial differentiation
å Taylor series
å Critical points
å Solution of nonlinear equations
å Constrained optimisation
å Integration
⇒ • Fourier series
• First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
Fourier series
A function f(x) is periodic with period a if, for all x,
f(x+ a) = f(x)
For example, cos(2πx/a) is periodic with period a

Note that the period is not unique — cos(2πx/a) is also periodic with period 2a
If f and g are periodic with period a then f + g and fg are also
periodic with period a
Fourier series allow us to represent periodic functions as infinite
linear sums of trigonometric functions
Initially we will consider functions of period 2π — it is trivial to
re-scale for functions of other periods
The functions 1, sinx, cosx, sin 2x, cos 2x, . . . , sinnx, cosnx, . . . are
periodic functions with period 2π
We will require integrals of products of these functions when deriving
Fourier series
Recall that
sin(A+B) = sinA cosB + cosA sinB
sin(A−B) = sinA cosB − cosA sinB
This allows us to write

sin A cos B = (1/2) ( sin(A + B) + sin(A − B) )

Similarly, by expanding cos(A ± B) we may deduce that

cos A cos B = (1/2) ( cos(A + B) + cos(A − B) )
sin A sin B = (1/2) ( cos(A − B) − cos(A + B) )

Suppose m, n are positive integers with m ≠ n. Then

∫_{−π}^{π} sin mx cos nx dx = (1/2) ∫_{−π}^{π} ( sin(m + n)x + sin(m − n)x ) dx
                            = (1/2) [ −cos(m + n)x/(m + n) − cos(m − n)x/(m − n) ]_{−π}^{π}
                            = 0

Now suppose m = n:

∫_{−π}^{π} sin mx cos nx dx = (1/2) ∫_{−π}^{π} sin 2mx dx = 0

Hence, for all positive integers m, n we have ∫_{−π}^{π} sin mx cos nx dx = 0
Again working with positive integers m, n, if m ≠ n

∫_{−π}^{π} cos mx cos nx dx = (1/2) ∫_{−π}^{π} ( cos(m + n)x + cos(m − n)x ) dx
                            = (1/2) [ sin(m + n)x/(m + n) + sin(m − n)x/(m − n) ]_{−π}^{π}
                            = 0

and, if m = n,

∫_{−π}^{π} cos mx cos nx dx = ∫_{−π}^{π} cos²mx dx = (1/2) ∫_{−π}^{π} (1 + cos 2mx) dx = π

We then have, for positive integers m, n,

∫_{−π}^{π} cos mx cos nx dx = π if m = n, and 0 if m ≠ n

Similarly,

∫_{−π}^{π} sin mx sin nx dx = π if m = n, and 0 if m ≠ n
We are now in a position to write down the Fourier series of a function f(x) with period 2π. We write

f(x) = a_0/2 + Σ_{n=1}^{∞} ( a_n cos nx + b_n sin nx )

where a_0, a_1, a_2, . . . , and b_1, b_2, . . . , are to be determined.

Assuming we can interchange the order in which we integrate and sum an infinite series we may write

∫_{−π}^{π} f(x) cos mx dx = ∫_{−π}^{π} (a_0/2) cos mx dx + Σ_{n=1}^{∞} a_n ∫_{−π}^{π} cos mx cos nx dx + Σ_{n=1}^{∞} b_n ∫_{−π}^{π} cos mx sin nx dx

Taking m = 0 above gives

a_0 = (1/π) ∫_{−π}^{π} f(x) dx

For m = 1, 2, . . . the properties of integrals of trigonometric functions derived earlier give

a_m = (1/π) ∫_{−π}^{π} f(x) cos mx dx
We may also write

∫_{−π}^{π} f(x) sin mx dx = ∫_{−π}^{π} (a_0/2) sin mx dx + Σ_{n=1}^{∞} a_n ∫_{−π}^{π} sin mx cos nx dx + Σ_{n=1}^{∞} b_n ∫_{−π}^{π} sin mx sin nx dx

For m = 1, 2, . . . the properties of integrals of trigonometric functions derived earlier give

b_m = (1/π) ∫_{−π}^{π} f(x) sin mx dx

Example: the function f(x) is periodic with period 2π and, for −π < x ≤ π, is defined by f(x) = x². Find the Fourier series representation of f(x).

Using the expressions for the Fourier coefficients derived earlier we have

a_0 = (1/π) ∫_{−π}^{π} x² dx = 2π²/3
For m = 1, 2, . . .

a_m = (1/π) ∫_{−π}^{π} x² cos mx dx
    = (1/π) ( [ x² sin mx / m ]_{−π}^{π} − (2/m) ∫_{−π}^{π} x sin mx dx )
    = (2/(mπ)) ( [ x cos mx / m ]_{−π}^{π} − (1/m) ∫_{−π}^{π} cos mx dx )
    = (2/(m²π)) π ( cos mπ + cos(−mπ) )
    = 4(−1)^m / m²

where we have used the property of the cosine function cos mπ = (−1)^m

Also, for m = 1, 2, . . .

b_m = (1/π) ∫_{−π}^{π} x² sin mx dx
    = (1/π) ( [ −x² cos mx / m ]_{−π}^{π} + (2/m) ∫_{−π}^{π} x cos mx dx )
    = (2/(mπ)) ( [ x sin mx / m ]_{−π}^{π} − (1/m) ∫_{−π}^{π} sin mx dx )
    = 0
We now have expressions for all the Fourier coefficients.
A Fourier series is an infinite sum, so we cannot evaluate it exactly.
But we can evaluate a finite number of the terms.
Denote by F_N the Fourier series truncated after the N-th harmonic:

F_N(x) = a_0/2 + Σ_{n=1}^{N} ( a_n cos nx + b_n sin nx )
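For the f(x) = x² example the truncated series is easy to evaluate numerically. A short Python sketch (the function name is mine), using the coefficients a_0 = 2π²/3, a_n = 4(−1)ⁿ/n² and b_n = 0 derived above:

```python
import math

def F_N(x, N):
    """Truncated Fourier series of f(x) = x^2 on -pi < x <= pi."""
    total = (2.0 * math.pi ** 2 / 3.0) / 2.0        # a_0 / 2
    for n in range(1, N + 1):
        total += 4.0 * (-1) ** n / n ** 2 * math.cos(n * x)
    return total

for N in (1, 2, 3, 5):
    print(N, F_N(1.0, N))  # approaches f(1) = 1 as N increases
```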
[Figure: F_1 (left) and F_2 (right) plotted on −3 ≤ x ≤ 3; solid lines represent the Fourier series, broken lines the function f(x) = x²]
[Figure: F_3 (left) and F_5 (right) plotted on −3 ≤ x ≤ 3; solid lines represent the Fourier series, broken lines the function f(x) = x²]
As expected, adding extra terms increases the accuracy of the
approximation
Odd and Even functions

In the previous example we could have deduced that b_1, b_2, . . . , were zero without resorting to integrating by parts

The coefficient b_m is given by

b_m = (1/π) ∫_{−π}^{π} f(x) sin mx dx
If f(x) is an even function, then — as sinmx is an odd function —
the integrand will be an odd function
The integral is zero under these conditions, and so b1 = b2 = . . . = 0
for even functions.
The coefficient a_m is given by

a_m = (1/π) ∫_{−π}^{π} f(x) cos mx dx
If f(x) is an odd function then the integrand will be an odd function
The integral is zero under these conditions, and so a1 = a2 = . . . = 0
for odd functions.
This is very useful, but make sure you show you know what is going
on if you use it in an exam
Do Fourier series converge?

Two definitions

• A function f(x) is piecewise continuous on the interval a < x ≤ b if a < x ≤ b can be divided into a finite number of subintervals on each of which f(x) is continuous and the limits at the left and right endpoints of the subintervals exist
• A function f(x) is piecewise smooth on the interval a < x ≤ b if
f(x) and f ′(x) are piecewise continuous
If f(x) is piecewise smooth on −π < x ≤ π then the Fourier series for f(x) converges at all points to the value

(1/2) lim_{δ→0} ( f(x + δ) + f(x − δ) )
Clearly if f(x) is continuous at x = x0 then the Fourier series
converges to f(x0)
Fourier series can be used to express constants as infinite sums.

Using the earlier example of f(x) = x², −π < x ≤ π, with f(x) having period 2π, we see that f(x) is continuous for all x, and that f′(x) is piecewise continuous with finite discontinuities at x = (2n + 1)π.
[Figure: the periodic extension of f(x) = x² plotted for −15 ≤ x ≤ 15]
The Fourier series at x = π will therefore converge to f(π) = π², i.e.

π² = π²/3 + Σ_{n=1}^{∞} ( 4(−1)ⁿ/n² ) cos nπ

Noting that cos nπ = (−1)ⁿ we have

π² = Σ_{n=1}^{∞} 6/n²
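This is the classic result Σ 1/n² = π²/6 in disguise, and a two-line numerical check is possible (a Python sketch):

```python
import math

print(sum(6.0 / n ** 2 for n in range(1, 100001)), math.pi ** 2)
# roughly 9.8695 against 9.8696: the partial sums converge (slowly) to pi^2
```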
An example Fourier series for a discontinuous function

Let f(x) = x, −π < x ≤ π, with f(x) having period 2π
[Figure: the sawtooth periodic extension of f(x) = x plotted for −15 ≤ x ≤ 15]
As f(x) is an odd function we know that the Fourier coefficients
a0, a1, a2, . . . are all zero
The coefficients b_1, b_2, . . . , are given by

b_n = (1/π) ∫_{−π}^{π} x sin nx dx
    = (1/π) ( [ −x cos nx / n ]_{−π}^{π} + ∫_{−π}^{π} (cos nx / n) dx )
    = (1/π) ( −π(−1)ⁿ/n − π(−1)ⁿ/n )
    = 2(−1)ⁿ⁺¹/n
Truncated Fourier series are shown on the next slide
[Figure: F_2, F_4, F_6 and F_8 plotted on −5 ≤ x ≤ 5; solid lines represent the Fourier series, broken lines the function f(x) = x, −π < x ≤ π, with f(x) having period 2π]
We have noted that f(x) has a discontinuity at x = π. The Fourier series at x = π will converge to

(1/2) lim_{δ→0} ( f(π + δ) + f(π − δ) ) = (1/2) lim_{δ→0} ( f(−π + δ) + f(π − δ) )
                                        = (1/2)(−π + π)
                                        = 0

We see by evaluating the Fourier series at x = π that this is indeed true
Fourier series for functions with period 2a
Suppose f(x) is a function with period 2a.
Noting that the functions 1, cos(nπx/a), sin(nπx/a), n = 1, 2, 3, . . ., have period 2a we may write the Fourier series for f(x) as

f(x) = a_0/2 + Σ_{n=1}^{∞} ( a_n cos(nπx/a) + b_n sin(nπx/a) )

where, using similar analysis to that carried out earlier,

a_0 = (1/a) ∫_{−a}^{a} f(x) dx
a_n = (1/a) ∫_{−a}^{a} f(x) cos(nπx/a) dx,  n = 1, 2, . . .
b_n = (1/a) ∫_{−a}^{a} f(x) sin(nπx/a) dx,  n = 1, 2, . . .
Example: the function f(x) is defined by f(x) = e^x for −a < x ≤ a, and is periodic with period 2a. Find the Fourier series of f(x). Hence show that, for any a > 0,

sinh a / a + Σ_{n=1}^{∞} 2a(−1)ⁿ sinh a / (a² + n²π²) = 1.

What value does the Fourier series converge to at x = a?

We have

a_0 = (1/a) ∫_{−a}^{a} e^x dx = (e^a − e^{−a})/a = 2 sinh a / a

For n = 1, 2, . . ., integrating by parts gives

a_n = (1/a) ∫_{−a}^{a} e^x cos(nπx/a) dx = 2a(−1)ⁿ sinh a / (a² + n²π²)

b_n = (1/a) ∫_{−a}^{a} e^x sin(nπx/a) dx = −2nπ(−1)ⁿ sinh a / (a² + n²π²)
The function f(x) is continuous at x = 0, and so the Fourier series for f converges to f(0) = 1. Hence, noting that sin 0 = 0 and cos 0 = 1, we have, on substituting x = 0 into the Fourier series:

1 = sinh a / a + Σ_{n=1}^{∞} ( (2a(−1)ⁿ sinh a / (a² + n²π²)) cos 0 − (2nπ(−1)ⁿ sinh a / (a² + n²π²)) sin 0 )
  = sinh a / a + Σ_{n=1}^{∞} 2a(−1)ⁿ sinh a / (a² + n²π²)

The function f(x) is discontinuous at x = a, and the Fourier series converges to

(1/2) lim_{δ→0} ( f(a + δ) + f(a − δ) ) = (1/2) lim_{δ→0} ( f(−a + δ) + f(a − δ) )
                                        = (1/2)(e^{−a} + e^a)
                                        = cosh a
Location
å Mathematical preliminaries
å Partial differentiation
å Taylor series
å Critical points
å Solution of nonlinear equations
å Constrained optimisation
å Integration
å Fourier series
⇒ • First order initial value ordinary differential equations
• Second order boundary value ordinary differential equations
• Simple partial differential equations
First order initial value ordinary differential equations
An ordinary differential equation is an equation that contains
derivatives of a function of one variable
The order of the differential equation is the order of the highest derivative in the equation
An example first order ordinary differential equation:

(dy/dx)³ + dy/dx + y = x³ + sin x

An example second order ordinary differential equation:

d²y/dx² + y = 0
The simplest first order differential equations are very similar to
integration.
Example

dy/dx = x³ + 5x² + 1

The general solution of this equation is

y = ∫ (x³ + 5x² + 1) dx = x⁴/4 + 5x³/3 + x + A

where A is an arbitrary constant

The solution above is a general solution, and includes an arbitrary constant A

To determine A we need an initial condition: i.e. the value of y at a given value of x

Suppose we are told that y = 5 when x = 1: we then substitute this into the general solution to give

5 = 1/4 + 5/3 + 1 + A

from which we deduce that A = 25/12 and so

y = x⁴/4 + 5x³/3 + x + 25/12
Separable first order equations
Suppose a differential equation can be written in the form

dy/dx = f(x)/g(y)

Equations such as these are called separable equations and, provided we can perform the integration, we can solve them by writing the equation as

∫ g(y) dy = ∫ f(x) dx

Example

dy/dx = (e^x + x³)/y

∫ y dy = ∫ (e^x + x³) dx

and so

y²/2 = e^x + x⁴/4 + A

y = ±√( 2 (e^x + x⁴/4 + A) )

for an arbitrary constant A
Homogeneous first order equations

A homogeneous first order equation is an equation that can be written

dy/dx = f(v),  where v = y/x

For example:

dy/dx = (x³ + x²y + y³)/(x³ + xy²)
      = x³ (1 + y/x + y³/x³) / ( x³ (1 + y²/x²) )
      = (1 + v + v³)/(1 + v²)

For a homogeneous equation, y = xv

Using the product rule,

dy/dx = v + x dv/dx

Using the example above,

v + x dv/dx = (1 + v + v³)/(1 + v²)

equivalently  x dv/dx = 1/(1 + v²)
This equation is a separable equation:

∫ (1 + v²) dv = ∫ (1/x) dx

v + v³/3 = log|Ax|

for arbitrary constant A

Now eliminate v to write the solution in terms of the original variables x and y:

x²y + y³/3 = x³ log|Ax|
Integrating factors
Example

dy/dx + y = e^{−x},  y = 5 when x = 0

Multiply by e^x:

e^x dy/dx + e^x y = 1

Re-write as

d/dx ( y e^x ) = 1

Integrate:

y e^x = ∫ 1 dx = x + A

y = e^{−x} (x + A)

Fitting the initial condition:

5 = e^0 (0 + A)

Hence y = e^{−x} (x + 5)
Multiplying by e^x on the previous slide might have appeared a very inspired choice

It wasn't

There are systematic ways to calculate these integrating factors for equations that can be written in the form

dy/dx + f(x) y = g(x)

In these cases, the equation should be multiplied by the integrating factor

e^{∫ f(x) dx}
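Computer algebra systems automate this procedure. As an illustration (a sketch assuming Python with sympy available; not part of the notes), dsolve recovers the solution of the first integrating-factor example above:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x) + y(x), sp.exp(-x))   # dy/dx + y = e^{-x}
sol = sp.dsolve(ode, y(x), ics={y(0): 5})      # initial condition y(0) = 5
print(sol)  # Eq(y(x), (x + 5)*exp(-x))
```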
Example

x dy/dx + 2y = x

Re-write this as

dy/dx + (2/x) y = 1,  and so f(x) = 2/x

The integrating factor is then given by

e^{∫ (2/x) dx} = e^{2 log x} = e^{log(x²)} = x²   (as 2 log x = log(x²))

Now multiply the re-written equation by x²:

x² dy/dx + 2xy = x²

The left hand side can be written as a single derivative:

d/dx ( y x² ) = x²

Integrate:

y x² = x³/3 + A

or equivalently, y = x/3 + A/x²
Another integrating factor example

(1/(2x)) dy/dx + y = 1

Re-write this as

dy/dx + 2xy = 2x

Integrating factor:

e^{∫ 2x dx} = e^{x²}

Multiply the re-written equation by e^{x²}:

e^{x²} dy/dx + 2x e^{x²} y = 2x e^{x²}

Re-write the left-hand side:

d/dx ( y e^{x²} ) = 2x e^{x²}

Integrate:

y e^{x²} = ∫ 2x e^{x²} dx = e^{x²} + A

or equivalently, y = 1 + A e^{−x²}
The previous example that we solved using the method of integrating
factors was
(1/(2x)) dy/dx + y = 1

This can be re-written as

dy/dx = 2x(1 − y)

which is in separable form
Might have been easier to solve it in separable form
Summary on First Order Differential Equations
• Calculate general solution
– Try to write as a separable equation first
– If not, try to use an integrating factor
• General solution will include an arbitrary constant—this may be
eliminated using initial conditions (if these are given)
Numerical methods for first order equations
Suppose we want to solve the differential equation
dy/dx = sin²(xy),  y = 0.5 when x = 0
We are unable to integrate this
Instead, we may calculate a numerical solution of the differential
equation
Overview of numerical methods for ODEs
Suppose we want to calculate the numerical solution of

dy/dx = f(x, y),  y = y_0 when x = x_0,  x_0 < x < X
We first divide the region x0 < x < X into N equally spaced intervals
of width h = (X − x0)/N
These N intervals are bounded by the equally spaced nodes
x0, x1, x2, x3, . . . , xN−1, xN , where xi − xi−1 = h
We now want to calculate a set of values y1, y2, y3, . . . , yN−1, yN ,
where yi approximates the value of y at x = xi
Below is an example set of xi and yi when N = 4
[Figure: nodes x_0, x_1, x_2, x_3, x_4 with the corresponding approximate solution values y_0, y_1, y_2, y_3, y_4 marked]
Different numerical methods calculate the values
y0, y1, y2, y3, . . . , yN−1, yN in different ways
The forward Euler method
Our model differential equation is

dy/dx = f(x, y),  y = y_0 when x = x_0,  x_0 < x < X

The forward Euler method calculates the values y_1, y_2, y_3, . . . , y_N using the formula

( y_i − y_{i−1} )/h = f(x_{i−1}, y_{i−1}),  i = 1, 2, 3, . . . , N

This may be written as the explicit formula

y_i = y_{i−1} + h f(x_{i−1}, y_{i−1}),  i = 1, 2, 3, . . . , N
Methods where an explicit expression for yi may be written down are
known as explicit methods
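In code the forward Euler update is a direct transcription of this formula. A minimal Python sketch (the function name is mine), applied to the worked example that follows:

```python
import math

def forward_euler(f, x0, y0, X, N):
    """Approximate y' = f(x, y), y(x0) = y0, on [x0, X] using N steps."""
    h = (X - x0) / N
    x, y = x0, y0
    ys = [y0]
    for _ in range(N):
        y = y + h * f(x, y)   # the explicit update
        x = x + h
        ys.append(y)
    return ys

print(forward_euler(lambda x, y: y + math.exp(x), 0.0, 1.0, 1.0, 4))
# [1, 1.5, 2.1960..., 3.1572..., 4.4757...]
```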
Example: use the forward Euler method, with 4 intervals, to approximate the solution of

dy/dx = y + e^x,  y = 1 when x = 0,  0 < x < 1

In this case we have h = 1/4 = 0.25, x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1

From the initial conditions, y_0 = 1

We now apply the forward Euler method to calculate y_1:

y_1 = y_0 + h f(x_0, y_0) = 1 + 0.25 (1 + e^0) = 1.5

Similarly, noting that x_1 = 0.25,

y_2 = y_1 + h f(x_1, y_1) = 1.5 + 0.25 (1.5 + e^{0.25}) = 2.1960

Continuing,

y_3 = 3.1572
y_4 = 4.4757
The backward Euler method

Using the same model differential equation

dy/dx = f(x, y),  y = y_0 when x = x_0,  x_0 < x < X

The backward Euler method calculates the values y_1, y_2, y_3, . . . , y_N using the formula

( y_i − y_{i−1} )/h = f(x_i, y_i),  i = 1, 2, 3, . . . , N

It is not always possible to write an explicit expression for y_i for the backward method — methods such as this are known as implicit methods

Example: use the backward Euler method, with 4 intervals, to approximate the solution of

dy/dx = y + e^x,  y = 1 when x = 0,  0 < x < 1

Again h = 1/4 = 0.25, x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1

From the initial conditions, y_0 = 1

We now apply the backward Euler method to calculate y_i, i = 1, 2, 3, 4:

y_i = y_{i−1} + h f(x_i, y_i)
    = y_{i−1} + 0.25 (y_i + e^{x_i})
    = (4/3) ( y_{i−1} + 0.25 e^{x_i} )
We can then proceed as for the forward Euler method, calculating
successive values of yi
y1 = 1.7613
y2 = 2.8980
y3 = 4.5697
y4 = 6.9990
Example: use the backward Euler method, with 4 intervals, to
approximate the solution of
dy/dx = x + e^y,  y = 1 when x = 0,  0 < x < 1

Again h = 1/4 = 0.25, x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1

From the initial conditions, y_0 = 1

We now apply the backward Euler method to calculate y_i, i = 1, 2, 3, 4:

y_i = y_{i−1} + h f(x_i, y_i) = y_{i−1} + 0.25 (x_i + e^{y_i})
There isn't an explicit expression for the values of y_i in this case

For example, y_1 satisfies the nonlinear equation

y_1 = 1 + 0.25 (0.25 + e^{y_1})

Mathematical techniques for solving such nonlinear equations, and computational implementations of these techniques in Matlab, exist
An obvious question is why go to the trouble of implementing the
backward Euler method in cases such as this?
A comparison of the implementation of the forward and backward Euler methods
We will compare the forward and backward Euler methods using the
model ODE
dy/dx = −λy,  y = 1 when x = 0,  0 < x < 10

where λ > 0 is a constant

This ODE has analytic solution y = e^{−λx}
In this case both the forward and backward Euler methods have
explicit representations
The forward Euler method for this problem is

y_i = (1 − λh) y_{i−1}

The backward Euler method for this problem is

y_i = y_{i−1}/(1 + λh)
We will now compare the solutions for different values of λ and h
We will start by setting λ = 1, and try N = 20, 40, 80, 160
We would expect that increasing N — and therefore decreasing h —
will make the numerical solution more accurate
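Both updates are one-liners, so the experiment on the following slides is easy to reproduce. A hedged Python sketch (my own function; it records only the final values):

```python
def euler_compare(lam, N, X=10.0):
    """Final values of forward and backward Euler for y' = -lam*y, y(0) = 1."""
    h = X / N
    yf = yb = 1.0
    for _ in range(N):
        yf = (1.0 - lam * h) * yf     # forward (explicit) update
        yb = yb / (1.0 + lam * h)     # backward (implicit) update
    return yf, yb

print(euler_compare(1.0, 20))    # both small and positive: well behaved
print(euler_compare(10.0, 40))   # forward Euler blows up: |1 - 10(0.25)| = 1.5 > 1
```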
[Figure: forward Euler, backward Euler and true solution for λ = 1 with N = 20 (left) and N = 40 (right)]
[Figure: forward Euler, backward Euler and true solution for λ = 1 with N = 80 (left) and N = 160 (right)]
We see that for λ = 1 progressively increasing N improves the
accuracy of the solution
We now set λ = 10.
Below are the forward Euler and backward Euler simulations for
N = 160, 80
[Figure: forward Euler, backward Euler and true solution for λ = 10 with N = 160 (left) and N = 80 (right)]
Note the behaviour of the forward Euler method for N = 80
Still with λ = 10 we use N = 40
[Figure: for λ = 10, N = 40 the forward Euler solution grows to order 10⁷, while the backward Euler and true solutions remain between 0 and 1]
Note the scale on the y−axis
What happened?
The plot on the previous slide, with λ = 10 and N = 40, is re–plotted
with different y−axes below
[Figure: the same λ = 10, N = 40 computation re-plotted with y-axis limits of ±100 (left) and [0, 1] (right)]
We see that the backward Euler solution is well–behaved although
not as accurate as we would like
The forward Euler solution is wildly inaccurate
To explain the phenomenon on the previous slide we return to the forward and backward Euler approximations we wrote down earlier

Noting that h = 10/N these may be written

Forward Euler:  y_i = (1 − 10λ/N) y_{i−1}
Backward Euler: y_i = y_{i−1}/(1 + 10λ/N)

These approximations allow us to write

Forward Euler:  y_i = (1 − 10λ/N)^i y_0
Backward Euler: y_i = y_0/(1 + 10λ/N)^i

For this model problem we expect, as λ > 0, that the value of y_i will decrease to zero as i increases

Both methods are written in the form

y_i = A^i y_0

For y_i to decay to zero we require −1 < A < 1
For the backward Euler method, A = (1 + 10λ/N)^{−1}

As both λ > 0 and N > 0 we can deduce that 0 < A < 1

Hence, for all values of N, A satisfies the condition −1 < A < 1 specified above

The backward Euler method will always be well-behaved in this sense, whatever value of N we choose

For the forward Euler method, A = 1 − 10λ/N

As λ > 0 and N > 0 we can deduce that A < 1

However, with λ = 10, for N < 50 we have A < −1, and so A^i will not decay to zero as i increases

Instead, under these conditions, the modulus of A^i will increase, as we have seen for our simulations with N = 40, λ = 10
Using a very simple example we have demonstrated a common phenomenon associated with the forward and backward Euler methods, namely that there is usually a critical value of h above which the forward Euler method gives a nonsensical answer

The backward Euler method is more reliable as this instability doesn't happen

The extra reliability offered by the backward Euler method often comes at the cost of having to solve a nonlinear algebraic equation for each value of y_i
Example: By approximating the differential equation

dy/dx = −λy,  y = 1 when x = 0

using the forward Euler method, show that

lim_{N→∞} (1 − λ/N)^N = e^{−λ}

This equation has true solution y = e^{−λx}, and so y = e^{−λ} when x = 1
Suppose we use the forward Euler method with N equally sized intervals on the interval 0 < x < 1

We then have interval width h = 1/N, and x_n = nh, n = 0, 1, 2, . . . , N

The initial conditions tell us that y_0 = 1

The values of y_n, n = 1, 2, 3, . . . , N are given by the forward Euler method:

y_{n+1} = y_n + (1/N)(−λ y_n) = (1 − λ/N) y_n

Applying our forward Euler approximation:

y_1 = (1 − λ/N) y_0 = (1 − λ/N)
y_2 = (1 − λ/N) y_1 = (1 − λ/N)²
y_3 = (1 − λ/N) y_2 = (1 − λ/N)³
. . .
y_N = (1 − λ/N)^N

y_N approximates the value of y at x = x_N = 1

As N → ∞, y_N will approach the true value of y at x = 1, and so

lim_{N→∞} (1 − λ/N)^N = e^{−λ}
Location
å Mathematical preliminaries
å Partial differentiation
å Taylor series
å Critical points
å Solution of nonlinear equations
å Constrained optimisation
å Integration
å Fourier series
å First order initial value ordinary differential equations
⇒ • Second order boundary value ordinary differential equations
• Simple partial differential equations
Second order boundary value ordinary differential equations
A second order boundary value problem (BVP) is an equation of the
form
3 d²y/dx² + 6 dy/dx − 9y = 1 + x,

valid in a specified interval, say, 0 ≤ x ≤ 1, together with two boundary conditions: one at each end, for example:

y = 2 at x = 0,  y = 5 at x = 1.

The equation is known as second order because the highest derivative is the second derivative d²y/dx².
We will see later that the boundary conditions can be a bit more
general than those given above
Homogeneous Second Order BVPs

A BVP is a homogeneous BVP if there are no terms on the right-hand side, for example

3 d²y/dx² + 6 dy/dx − 9y = 0,

valid for 0 ≤ x ≤ 1, together with the following boundary conditions:

y = 2 at x = 0,  y = 5 at x = 1.

Equations such as these can be solved by looking for a solution of the form y = e^{αx}

We then have

dy/dx = α e^{αx},  d²y/dx² = α² e^{αx}

Using the example of

a d²y/dx² + b dy/dx + cy = 0

we substitute y = e^{αx} and obtain

e^{αx} ( aα² + bα + c ) = 0

As e^{αx} ≠ 0 we therefore have the auxiliary equation

aα² + bα + c = 0

This quadratic equation has two roots α_1, α_2 — the general solution is

y = A e^{α_1 x} + B e^{α_2 x}

for arbitrary constants A and B that are fitted from the boundary conditions
There are two obvious difficulties with the approach on the previous
slide
• α1 and α2 may be complex numbers
• If α1 = α2 then we only have one solution and so can fit only one
arbitrary constant
We will first show an example with real distinct roots, and return to
the other cases later
Calculate the solution of

3 d²y/dx² + 6 dy/dx − 9y = 0,

subject to boundary conditions

y = 2 at x = 0,  y = 5 at x = 1.

Start off by solving the quadratic auxiliary equation

3α² + 6α − 9 = 0

i.e. derivatives in the differential equation are replaced by powers of α

α = ( −6 ± √(6² − 4 × 3 × (−9)) ) / (2 × 3) = 1 or −3
The auxiliary equation has real roots α_1 = 1 and α_2 = −3.

The roots of the auxiliary equation are real and distinct, so the general solution of

3 d²y/dx² + 6 dy/dx − 9y = 0

is

y = A e^{α_1 x} + B e^{α_2 x},  i.e.  y = A e^x + B e^{−3x}

A and B are unknown constants — they can be found by applying the boundary conditions

The boundary conditions are

y = 2 at x = 0,  y = 5 at x = 1.

Simultaneous equations:

BC at x = 0:  2 = A + B
BC at x = 1:  5 = A e + B e^{−3}
Solve (with slightly messy algebra):

A = (5 − 2e^{−3})/(e − e^{−3}),  B = (2e − 5)/(e − e^{−3})

The solution is therefore

y = ( (5 − 2e^{−3})/(e − e^{−3}) ) e^x + ( (2e − 5)/(e − e^{−3}) ) e^{−3x}

Suppose the auxiliary equation has complex roots given by

α_1 = λ + µi,  α_2 = λ − µi

We may modify the approach to avoid the use of complex numbers

The general solution is

y = Re( C e^{(λ+µi)x} + D e^{(λ−µi)x} )

where C, D may be complex
Remembering that

e^{(λ+µi)x} = e^{λx} ( cos µx + i sin µx )
e^{(λ−µi)x} = e^{λx} ( cos µx − i sin µx )

we may write

y = e^{λx} Re( (C + D) cos µx + (C − D) i sin µx )
  = e^{λx} ( A cos µx + B sin µx )

where the real numbers A and B are given by

A = Re(C + D),  B = Re( i(C − D) )

A and B are both fitted from the boundary conditions

An example with complex roots:

4 d²y/dx² + π² y = 0,  0 ≤ x ≤ 1

with boundary conditions

y = 2 at x = 0,  y = −5 at x = 1

Auxiliary equation:

4α² + π² = 0

The solution of the auxiliary equation is

α = −(π/2)i or (π/2)i
The roots are imaginary, α = ±(π/2)i

Using the notation above, λ = 0 and µ = π/2

As the roots are imaginary (i.e. they have no real part) the general solution is

y = A sin(πx/2) + B cos(πx/2)

Boundary conditions:

y = 2 at x = 0,  y = −5 at x = 1

These boundary conditions give

2 = A sin 0 + B cos 0,  −5 = A sin(π/2) + B cos(π/2)

Using sin 0 = 0, cos 0 = 1, sin(π/2) = 1, cos(π/2) = 0, we can see that

A = −5,  B = 2

and the solution is

y = −5 sin(πx/2) + 2 cos(πx/2)
Another example with complex roots:

2 d²y/dx² − 8 dy/dx + 26y = 0,  0 ≤ x ≤ π/2

with boundary conditions

y = 1 at x = 0,  y = 0 at x = π/2

Auxiliary equation:

2α² − 8α + 26 = 0

The roots of the auxiliary equation are

α = 2 + 3i, 2 − 3i

The roots of the auxiliary equation have both a real part and an imaginary part:

α = 2 ± 3i

This time we have λ = 2 and µ = 3

In this case, the general solution is

y = e^{2x} ( A sin 3x + B cos 3x )
Boundary conditions:

y = 1 at x = 0,  y = 0 at x = π/2

These boundary conditions give

1 = A sin 0 + B cos 0,  0 = e^{π} ( A sin(3π/2) + B cos(3π/2) )

Using sin 0 = 0, cos 0 = 1, sin(3π/2) = −1, cos(3π/2) = 0, we can see that

A = 0,  B = 1

and the solution is

y = e^{2x} cos 3x

Repeated root of the auxiliary equation

Returning to our original example

a d²y/dx² + b dy/dx + cy = 0

suppose the auxiliary equation

aα² + bα + c = 0

has a repeated real root α

It follows that

α = −b/(2a)
y = e^{αx} is one solution of the differential equation

Suppose y = x e^{αx}. We then have

dy/dx = (1 + αx) e^{αx}
d²y/dx² = α(2 + αx) e^{αx}

We then have

a d²y/dx² + b dy/dx + cy = ( x(aα² + bα + c) + 2aα + b ) e^{αx} = 0

and so y = x e^{αx} is also a solution of the homogeneous equation

If the auxiliary equation has a repeated root α, the general solution is therefore

y = e^{αx} (A + Bx)
An example with a repeated root:

d²y/dx² − 4 dy/dx + 4y = 0

The auxiliary equation is

α² − 4α + 4 = 0

This equation only has one distinct root — the repeated root α = 2.

Under these conditions, the general solution is

y = e^{αx} (A + Bx),  i.e.  y = e^{2x} (A + Bx)

Solving General Homogeneous Equations

General homogeneous equation:

P d²y/dx² + Q dy/dx + R y = 0

Solve the auxiliary equation Pα² + Qα + R = 0

Three cases:

1. Real distinct roots, α_1 ≠ α_2. General solution is y = A e^{α_1 x} + B e^{α_2 x}
2. Only one real root, α. General solution is y = e^{αx} (A + Bx)
3. Complex roots, α = λ ± µi. General solution is y = e^{λx} (A sin µx + B cos µx)
First write down general solution
This general solution will include two arbitrary constants A and B
Use boundary conditions to set up simultaneous equations for the
constants A and B
A homogeneous example with slightly different boundary conditions

d²y/dx² + y = 0,  0 ≤ x ≤ π

subject to boundary conditions

y = 7 at x = 0,  dy/dx = 3 at x = π

Auxiliary equation: α² + 1 = 0 has roots α = ±i

The general solution is therefore

y = A sin x + B cos x
The general solution is

y = A sin x + B cos x

The boundary condition at x = π is in terms of dy/dx

From the general solution we see that

dy/dx = A cos x − B sin x

Boundary condition y = 7 at x = 0 implies that

A sin 0 + B cos 0 = 7, i.e. B = 7

Boundary condition dy/dx = 3 at x = π implies that

A cos π − B sin π = 3, i.e. A = −3

The solution is therefore

y = −3 sin x + 7 cos x
A warning example

d²y/dx² + y = 0,  0 ≤ x ≤ π

subject to boundary conditions

y = 7 at x = 0,  y = −7 at x = π

Auxiliary equation: α² + 1 = 0 has roots α = ±i

The general solution is therefore

y = A sin x + B cos x

Boundary condition y = 7 at x = 0 implies that

A sin 0 + B cos 0 = 7, i.e. B = 7

Boundary condition y = −7 at x = π implies that

A sin π + B cos π = −7, i.e. B = 7

Both boundary conditions tell us that B = 7, but neither gives information on A

All we can say is that

y = A sin x + 7 cos x

where A is any constant — the solution is said to be non-unique
Another warning example

d²y/dx² + y = 0,  0 ≤ x ≤ π

subject to boundary conditions

y = 7 at x = 0,  y = 5 at x = π

Auxiliary equation: α² + 1 = 0 has roots α = ±i

The general solution is therefore

y = A sin x + B cos x

Boundary condition y = 7 at x = 0 implies that

A sin 0 + B cos 0 = 7, i.e. B = 7

Boundary condition y = 5 at x = π implies that

A sin π + B cos π = 5, i.e. B = −5

As in the previous example the boundary conditions do not tell us what the constant A is

One boundary condition tells us that B = −5 and the other tells us that B = 7

As the boundary conditions give us conflicting information on B, no solution exists that is compatible with the boundary conditions
Inhomogeneous BVPs

A general inhomogeneous BVP is of the form

A d²y/dx² + B dy/dx + Cy = f(x),  a ≤ x ≤ b

with boundary conditions given at x = a and x = b

Let y_H be the general solution of the homogeneous equation

A d²y/dx² + B dy/dx + Cy = 0,  a ≤ x ≤ b

and let y_PS be any solution of the inhomogeneous equation above.

The general solution of the inhomogeneous BVP is then

y = y_H + y_PS

This is because

A d²y/dx² + B dy/dx + Cy = ( A d²y_H/dx² + B dy_H/dx + C y_H ) + ( A d²y_PS/dx² + B dy_PS/dx + C y_PS ) = 0 + f(x)

and it therefore satisfies the inhomogeneous BVP
As with homogeneous equations this general solution will contain two
unknown constants—these are then determined from the two
boundary conditions
The general solution of

3 d²y/dx² + 6 dy/dx − 9y = 1 + x,  0 ≤ x ≤ 1

is given by the sum of:

• the general solution to the homogeneous equation, y_H; and
• any solution of the inhomogeneous equation — known as a particular solution, y_PS

We know the general solution to the homogeneous equation — it is the solution to an earlier example:

y_H = A e^x + B e^{−3x}

We now need a particular solution of the inhomogeneous equation

3 d²y/dx² + 6 dy/dx − 9y = 1 + x,  0 ≤ x ≤ 1

As the right hand side is a linear function we will look for a linear solution:

y_PS = Px + Q

This choice of particular solution yields dy_PS/dx = P, and d²y_PS/dx² = 0
Substituting this into the differential equation:

6P − 9(Px + Q) = 1 + x,

which may be written

(6P − 9Q − 1) − (9P + 1)x = 0

If this is true for all x, we must have P = −1/9, and Q = −5/27

The general solution of our differential equation is therefore

y = A e^x + B e^{−3x} − x/9 − 5/27
We can now fit boundary conditions for the inhomogeneous problem
in the same way as for homogeneous problems, for example
y = 2 at x = 0, y = 5 at x = 1.
Example: find a general solution of

d²y/dx² + y = 1 + e^{3x}

We need the solution to the homogeneous equation, y_H, and a particular solution, y_PS

The solution to the homogeneous equation is clearly

y_H = A sin x + B cos x

The right hand side is a combination of a constant and a multiple of e^{3x}.

Look for a particular solution that mirrors this:

y_PS = P + Q e^{3x}
dy_PS/dx = 3Q e^{3x}
d²y_PS/dx² = 9Q e^{3x}

Substitute into the differential equation:

(9Q + Q) e^{3x} + P = 1 + e^{3x}

and so P = 1, and Q = 1/10
The general solution is therefore

y = y_H + y_PS = A sin x + B cos x + 1 + (1/10) e^{3x}

Example: find a general solution of

d²y/dx² + 4 dy/dx + 3y = sin 2x

We need the solution to the homogeneous equation, and a particular solution

The solution to the homogeneous equation is clearly

y_H = A e^{−x} + B e^{−3x}
The right hand side is sin 2x

Try a particular solution that is a combination of sin 2x and cos 2x:

y_PS = P sin 2x + Q cos 2x
dy_PS/dx = 2P cos 2x − 2Q sin 2x
d²y_PS/dx² = −4P sin 2x − 4Q cos 2x

Substitute into the differential equation:

(−4P − 8Q + 3P) sin 2x + (−4Q + 8P + 3Q) cos 2x = sin 2x

and so P = −1/65, and Q = −8/65

The general solution is therefore

y = y_H + y_PS = A e^{−x} + B e^{−3x} − (1/65)(sin 2x + 8 cos 2x)
Example: find a general solution of

d²y/dx² + y = sin x

We need the solution to the homogeneous equation, y_H, and a particular solution, y_PS

The solution to the homogeneous equation is clearly

y_H = A sin x + B cos x

The right hand side is sin x

This would suggest trying, as in the previous example,

y_PS = P sin x + Q cos x

but this is identical in form to y_H, so it won't work.

Instead try

y_PS = x (P sin x + Q cos x)
dy_PS/dx = (P − Qx) sin x + (Q + Px) cos x
d²y_PS/dx² = (−2Q − Px) sin x + (2P − Qx) cos x
Substitute into the differential equation:

(−2Q − Px + Px) sin x + (2P − Qx + Qx) cos x = sin x

and so P = 0, and Q = −1/2

The general solution is therefore

y = y_H + y_PS = A sin x + B cos x − (1/2) x cos x

Example: find a general solution of

d²y/dx² − 2 dy/dx + y = e^x

The auxiliary equation is α² − 2α + 1 = 0, and so α = 1, 1

The general solution to the homogeneous problem is y_H = (A + Bx) e^x

The obvious choice for a particular solution is y_PS = C e^x, but this is a solution of the homogeneous equation

Multiplying by x gives y_PS = C x e^x — but this is also a solution of the homogeneous equation
Multiply by x again, and try y_PS = C x² e^x

We then have

dy_PS/dx = C (x² + 2x) e^x
d²y_PS/dx² = C (x² + 4x + 2) e^x

Substitute into the given equation:

C ( x² + 4x + 2 − 2(x² + 2x) + x² ) e^x = e^x

This gives C = 1/2

The general solution is y = (A + Bx + x²/2) e^x

Example: find a general solution of

d²y/dx² + 4y = 3x sin x

The solution to the homogeneous problem is y_H = A cos 2x + B sin 2x

For a particular solution, try

y_PS = (C + Dx)(E sin x + F cos x)
     = CE sin x + CF cos x + DEx sin x + DFx cos x
     = P sin x + Q cos x + Rx sin x + Sx cos x
We then have

dy_PS/dx = (R − Q) sin x + (P + S) cos x − Sx sin x + Rx cos x
d²y_PS/dx² = −(P + 2S) sin x + (2R − Q) cos x − Rx sin x − Sx cos x

Substitute into the given equation and equate coefficients:

x sin x :  −R + 4R = 3
x cos x :  −S + 4S = 0
sin x :    −P − 2S + 4P = 0
cos x :    2R − Q + 4Q = 0

These equations give P = 0, R = 1, S = 0, Q = −2/3

The general solution is therefore y = A cos 2x + B sin 2x + x sin x − (2/3) cos x
The previous example was of the form

P d²y/dx² + Q dy/dx + Ry = f(x) g(x)

Let y_1 be a suitable particular solution if the right-hand side were f(x), and y_2 a suitable particular solution if the right-hand side were g(x)

A suitable form of particular solution to try for the equation above is then the product y_PS = y_1 y_2
Example: find a general solution of

x² d²y/dx² + x dy/dx − 4y = x² + x⁴

The left-hand side of this equation is different to those we have seen earlier — the coefficients of the derivatives of y are not constants, but are functions of x

Note that a multiple of x² multiplies the second derivative, and a multiple of x multiplies the first derivative

Equations such as these can be transformed to constant coefficient equations using the substitution x = e^t — we can then write the equation for y as a function of t

If x = e^t then

dy/dt = (dy/dx)(dx/dt) = (dy/dx) e^t = (dy/dx) x

Similarly,

d²y/dt² = d/dt ( (dy/dx) e^t ) = (d²y/dx²)(dx/dt) e^t + (dy/dx) e^t = x² d²y/dx² + dy/dt

We can therefore use the following substitutions:

x dy/dx = dy/dt,  x² d²y/dx² = d²y/dt² − dy/dt
Using these substitutions the given equation becomes

d²y/dt² − 4y = e^{2t} + e^{4t}

The solution to the homogeneous problem is

y_H = A e^{2t} + B e^{−2t}

A suitable particular solution is (exercise — why?)

y_PS = C t e^{2t} + D e^{4t}

Plugging y_PS into the given equation and equating coefficients of e^{2t} and e^{4t} yields C = 1/4 and D = 1/12.

The general solution is therefore

y = A e^{2t} + B e^{−2t} + (1/4) t e^{2t} + (1/12) e^{4t}

In terms of the original variables this may be written

y = A x² + B/x² + (1/4) x² log x + x⁴/12
Example: Find the solution of

dx/dt = x − 2y,  dy/dt = y − 2x

subject to initial conditions x = 2, y = 4 at t = 0

The solution method is similar to that for simultaneous equations — use one equation to isolate one of the variables, then substitute it into the other equation

From the first equation

y = (1/2) ( x − dx/dt )

from which we may deduce that

dy/dt = (1/2) ( dx/dt − d²x/dt² )

Substituting for y and dy/dt in the second equation gives

(1/2) ( dx/dt − d²x/dt² ) = (1/2) ( x − dx/dt ) − 2x

which can be re-written as

d²x/dt² − 2 dx/dt − 3x = 0

The auxiliary equation, α² − 2α − 3 = 0, has roots α = −1, 3 and so the general solution for x is

x = A e^{−t} + B e^{3t}

and by substituting into our expression for y above:

y = A e^{−t} − B e^{3t}
Fitting the initial conditions x = 2, y = 4 at t = 0 gives

A + B = 2,  A − B = 4

from which we can deduce that A = 3, B = −1

The solutions are therefore

x = 3e^{−t} − e^{3t}
y = 3e^{−t} + e^{3t}
Summary for Solving Inhomogeneous BVPs
• Find general solution of homogeneous problem, yH
• Find particular solution, yPS
• General solution of inhomogeneous problem is then y = yH + yPS
• Calculate arbitrary constants from boundary conditions if
necessary
Summary for finding the solution of homogeneous problems

General homogeneous equation:

P d²y_H/dx² + Q dy_H/dx + R y_H = 0

Solve the auxiliary equation Pα² + Qα + R = 0

Three cases:

1. Real distinct roots, α_1 ≠ α_2. General solution is y_H = A e^{α_1 x} + B e^{α_2 x}
2. Only one real root, α. General solution is y_H = e^{αx} (A + Bx)
3. Complex roots, α = λ ± µi. General solution is y_H = e^{λx} (A sin µx + B cos µx)

Summary for finding particular solutions

General equation:

P d²y/dx² + Q dy/dx + Ry = f(x)

f(x)                      y_PS
Polynomial of degree n    Polynomial of degree n
e^{kx}                    A e^{kx}, if e^{kx} does not appear in y_H
e^{kx}                    A x e^{kx}, if e^{kx} appears in y_H
sin x                     A sin x + B cos x, if sin x and cos x do not appear in y_H
sin x                     x (A sin x + B cos x), if sin x and cos x appear in y_H
Difference equations revisited

General equation:

P y_{n+2} + Q y_{n+1} + R y_n = f(n),  n = 0, 1, 2, . . . , N − 2

with either

• Initial conditions y_0, y_1
• Boundary conditions y_0, y_N

Summary for solving inhomogeneous difference equations

• Find the general solution of the homogeneous problem, y_n^{(H)}
• Find a particular solution, y_n^{(PS)}
• The general solution of the inhomogeneous problem is then y_n = y_n^{(H)} + y_n^{(PS)}
• Calculate the arbitrary constants from the initial or boundary conditions if necessary
Note the similarities with solving boundary value ODEs
Summary for finding the solution of homogeneous problems

General homogeneous equation:

P y_{n+2} + Q y_{n+1} + R y_n = 0,  n = 0, 1, 2, . . . , N − 2

Solve the auxiliary equation Pα² + Qα + R = 0

Two cases:

1. Distinct roots, α_1 ≠ α_2. General solution is y_n^{(H)} = A α_1^n + B α_2^n
2. One repeated root, α. General solution is y_n^{(H)} = α^n (A + Bn)

Summary for finding particular solutions

General equation:

P y_{n+2} + Q y_{n+1} + R y_n = f(n)

f(n)                      y_n^{(PS)}
Polynomial of degree m    Polynomial of degree m
e^{kn}                    A e^{kn}
sin kn                    A sin kn + B cos kn

If y_n^{(PS)} contains a multiple of y_n^{(H)}, multiply by n until it doesn't
Example: solve the following difference equation

p y_{n+2} − y_{n+1} + (1 − p) y_n = −1,  n = 0, 1, 2, . . . , N − 2

with y_0 = y_N = 0, for a given 0 < p < 1.

Step 1: calculate y_n^{(H)}. The auxiliary equation is

p α² − α + (1 − p) = 0

This has roots α = (1 − p)/p, 1

The roots are repeated if p = 1/2, and distinct otherwise

Case 1: p = 1/2. The solution to the homogeneous problem is

y_n^{(H)} = A + Bn

A particular solution is of the form y_n^{(PS)} = C n²

Substituting into the difference equation gives C = −1

The general solution is

y_n = A + Bn − n²

Fitting the boundary conditions y_0 = y_N = 0 gives A = 0, B = N and so

y_n = n(N − n)
Case 2: p ≠ 1/2. The solution to the homogeneous problem is

y_n^{(H)} = A ((1 − p)/p)^n + B

A particular solution is of the form y_n^{(PS)} = C n

Substituting into the difference equation gives C = 1/(1 − 2p)

The general solution is

y_n = A ((1 − p)/p)^n + B + n/(1 − 2p)

Fitting the boundary conditions gives

y_n = (1/(2p − 1)) ( N ( ((1 − p)/p)^n − 1 ) / ( ((1 − p)/p)^N − 1 ) − n )
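A quick numerical check of the p ≠ 1/2 closed form: evaluate it and confirm that it satisfies both the recurrence and the boundary conditions. A Python sketch (the function name is mine):

```python
def check(p, N):
    """Evaluate the closed-form solution and its residual in the recurrence."""
    r = (1.0 - p) / p
    y = [(N * (r ** n - 1.0) / (r ** N - 1.0) - n) / (2.0 * p - 1.0)
         for n in range(N + 1)]
    # residual of p*y[n+2] - y[n+1] + (1-p)*y[n] - (-1); should be ~0
    res = max(abs(p * y[n + 2] - y[n + 1] + (1.0 - p) * y[n] + 1.0)
              for n in range(N - 1))
    return y[0], y[N], res

print(check(0.3, 10))  # (0.0, 0.0, ~1e-13): boundary values and recurrence both hold
```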
Numerical solution of second order boundary value problems
We will now develop methods for calculating the numerical solution
of second order boundary value ordinary differential equations
Suppose we want to calculate the numerical solution of
P d²y/dx² + Q dy/dx + Ry = f(x),  X_0 < x < X_1

with boundary conditions y = Y_0 at x = X_0 and y = Y_1 at x = X_1
We first divide the region X0 < x < X1 into N equally spaced
intervals of width h = (X1 −X0)/N
These N intervals are bounded by the equally spaced nodes
x0 = X0, x1, x2, . . . , xN = X1, where the solution is approximated by
y0, y1, y2, . . . , yN
Here is an example of a numerical solution when N = 4

[Figure: nodes x_0, x_1, x_2, x_3, x_4 with the approximate solution values y_0, y_1, y_2, y_3, y_4 marked]
To calculate the numerical solution we first need a numerical approximation of the second derivative d²y/dx²

When x_n is not on the boundary, a Taylor series approximation of y(x) about x = x_n, neglecting cubic and higher terms, gives

y_{n+1} ≈ y(x_{n+1}) ≈ y(x_n) + h y′(x_n) + (h²/2) y′′(x_n)
y_{n−1} ≈ y(x_{n−1}) ≈ y(x_n) − h y′(x_n) + (h²/2) y′′(x_n)

Adding these equations, and using the approximation y(x_n) ≈ y_n, gives an approximation to the second derivative:

y′′(x_n) ≈ ( y_{n−1} − 2y_n + y_{n+1} ) / h²
Suppose

d²y/dx² = −1

with boundary conditions y = 0 at x = 0, 1

The true solution is y = x(1 − x)/2

The points x_n satisfy

x_n = nh = n/N,  n = 0, 1, 2, . . . , N

The numerical approximation is

( y_{n−1} − 2y_n + y_{n+1} ) / h² = −1,  n = 1, 2, 3, . . . , N − 1

together with boundary conditions y_0 = y_N = 0

The difference relation can be written

y_{n+2} − 2y_{n+1} + y_n = −h²,  n = 0, 1, 2, . . . , N − 2

The solution to the homogeneous problem is

y_n^{(H)} = A + Bn

A particular solution is y_n^{(PS)} = C n²

Substituting into the equation yields C = −h²/2

The general solution is y_n = A + Bn − (nh)²/2

Fitting y_0 = 0 = y_N gives A = 0, B = Nh²/2
The solution is therefore
y_n = n h^2 (N − n)/2 = nh (Nh − nh)/2 = x_n (1 − x_n)/2,   since Nh = 1
In this case the solution matches the true solution at all points xn
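The same answer can be obtained by assembling the N − 1 interior equations as a tridiagonal linear system and solving it directly; the sketch below (an illustration, not part of the notes) does this with NumPy and confirms the agreement with x(1 − x)/2.

import numpy as np

N = 4
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

# interior equations: y[n-1] - 2 y[n] + y[n+1] = -h^2, n = 1, ..., N-1
A = np.diag(-2.0 * np.ones(N - 1)) + np.diag(np.ones(N - 2), 1) + np.diag(np.ones(N - 2), -1)
b = -h**2 * np.ones(N - 1)

y = np.zeros(N + 1)                     # boundary values y[0] = y[N] = 0
y[1:N] = np.linalg.solve(A, b)

print(np.max(np.abs(y - x * (1.0 - x) / 2.0)))   # zero to rounding error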
Another example:
d^2y/dx^2 − y = 0,   0 < x < 1
with boundary conditions y = 3 at x = 0, and y = e + 2/e at x = 1
True solution is y = e^x + 2 e^{−x}
We again have
x_n = nh = n/N,   n = 0, 1, 2, . . . , N
and y_n satisfies
(y_{n−1} − 2 y_n + y_{n+1}) / h^2 − y_n = 0,   n = 1, 2, . . . , N − 1
with y_0 = 3 and y_N = e + 2/e
Difference relation can be written
y_{n+2} − (2 + h^2) y_{n+1} + y_n = 0,   n = 0, 1, . . . , N − 2
Auxiliary equation is
α^2 − (2 + h^2)α + 1 = 0
with (real) roots
α_1 = 1 + h^2/2 + h√(1 + h^2/4),   α_2 = 1 + h^2/2 − h√(1 + h^2/4)
General solution is y_n = A α_1^n + B α_2^n
Fitting boundary conditions gives
A + B = 3,   A α_1^N + B α_2^N = e + 2/e
giving
A = (e + 2/e − 3 α_2^N) / (α_1^N − α_2^N),   B = (e + 2/e − 3 α_1^N) / (α_2^N − α_1^N)
Below is the numerical solution for N = 10 (circles) and the true
solution (solid line)
(Figure: y against x for 0 ≤ x ≤ 1, with y ranging from about 2.8 to 3.6)
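The plotted comparison can be reproduced directly; the sketch below (not from the notes) computes α_1, α_2, A and B for N = 10 and measures the largest deviation of the difference-equation solution from the true solution e^x + 2e^{−x}.

import numpy as np

N = 10
h = 1.0 / N
a1 = 1.0 + h**2 / 2.0 + h * np.sqrt(1.0 + h**2 / 4.0)
a2 = 1.0 + h**2 / 2.0 - h * np.sqrt(1.0 + h**2 / 4.0)

rhs = np.e + 2.0 / np.e
A = (rhs - 3.0 * a2**N) / (a1**N - a2**N)
B = (rhs - 3.0 * a1**N) / (a2**N - a1**N)

n = np.arange(N + 1)
x = n * h
y = A * a1**n + B * a2**n
print(np.max(np.abs(y - (np.exp(x) + 2.0 * np.exp(-x)))))   # small, O(h^2)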
Another example:
d^2y/dx^2 − y = x,   0 < x < 1
with boundary conditions y = 0 at x = 0, and y = 0 at x = 1
True solution is y = (e/(e^2 − 1)) (e^x − e^{−x}) − x
We again have
x_n = nh = n/N,   n = 0, 1, 2, . . . , N
This time y_n satisfies
(y_{n−1} − 2 y_n + y_{n+1}) / h^2 − y_n = x_n,   n = 1, 2, . . . , N − 1
with y_0 = 0 and y_N = 0
Difference relation can be written
y_{n+2} − (2 + h^2) y_{n+1} + y_n = h^2 x_{n+1} = (n + 1) h^3,   n = 0, 1, . . . , N − 2
Using the previous example, y_n^{(H)} = A α_1^n + B α_2^n
As α_1 ≠ 1 and α_2 ≠ 1, the particular solution is of the form y_n^{(PS)} = P + Qn
Substituting y_n^{(PS)} into the inhomogeneous equation determines P
and Q
The boundary conditions then determine A and B
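Alternatively the finite-difference equations can be solved directly as a linear system, avoiding the need to find P, Q, A and B by hand; a minimal NumPy sketch (not from the notes) is given below.

import numpy as np

N = 10
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

# interior equations: y[n-1] - (2 + h^2) y[n] + y[n+1] = h^2 x[n], n = 1, ..., N-1
A = np.diag(-(2.0 + h**2) * np.ones(N - 1)) + np.diag(np.ones(N - 2), 1) + np.diag(np.ones(N - 2), -1)
b = h**2 * x[1:N]

y = np.zeros(N + 1)                     # boundary values y[0] = y[N] = 0
y[1:N] = np.linalg.solve(A, b)

true = np.e / (np.e**2 - 1.0) * (np.exp(x) - np.exp(-x)) - x
print(np.max(np.abs(y - true)))          # small, O(h^2)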
Location
✓ Mathematical preliminaries
✓ Partial differentiation
✓ Taylor series
✓ Critical points
✓ Solution of nonlinear equations
✓ Constrained optimisation
✓ Integration
✓ Fourier series
✓ First order initial value ordinary differential equations
✓ Second order boundary value ordinary differential equations
⇒ • Simple partial differential equations
Simple partial differential equations
Ordinary differential equations are differential equations that only
depend on total derivatives, e.g. dy/dx, d^2y/dx^2
Partial differential equations are differential equations that depend
on partial derivatives, for example
∂u/∂t + u ∂u/∂x = x + u
∂/∂x ( (u/(1 + u)) ∂u/∂x ) + ∂/∂y ( (3u/(1 + u)) ∂u/∂y ) = e^u
The heat equation
Suppose a metal bar occupies the region 0 < x < L. Assuming the
temperature T (x, t) is uniform across the bar’s cross section, the
temperature in the bar is given by
∂T/∂t = D ∂^2T/∂x^2 + f(x, t)
where D > 0 is the thermal diffusivity and f(x, t) is an internal heat
source (if one exists)
We assume that we know the initial temperature
T(x, 0) = T_0(x)
for some given function T_0(x)
We also require boundary conditions for T or ∂T/∂x at x = 0 and x = L
for all times t > 0
Separable solutions to the heat equation
Example: Find T (x, t) that satisfies
∂T/∂t = D ∂^2T/∂x^2,   0 < x < L
with initial conditions
T(x, 0) = 3 sin(πx/L)
with boundary conditions T = 0 at x = 0, L
We will assume throughout these examples that D > 0
We will first show that if we can find a solution then it is unique
Suppose there are two solutions, T1 and T2
Let U = T1 − T2
It then follows by substituting U into the equation and boundary
conditions that
∂U/∂t = D ∂^2U/∂x^2,   0 < x < L
with initial conditions
U(x, 0) = 0
and with boundary conditions U = 0 at x = 0, L
We define E(t) by
E(t) = ∫_0^L [U(x, t)]^2 dx
It follows immediately that
E(0) = 0,   E(t) ≥ 0
dE/dt = d/dt ∫_0^L [U(x, t)]^2 dx
      = ∫_0^L ∂/∂t [U(x, t)]^2 dx
      = ∫_0^L 2U ∂U/∂t dx
      = ∫_0^L 2UD ∂^2U/∂x^2 dx
      = [ 2UD ∂U/∂x ]_0^L − ∫_0^L 2D (∂U/∂x)^2 dx
      = − ∫_0^L 2D (∂U/∂x)^2 dx
      ≤ 0
Hence E(t) is a decreasing function
The only way all the conditions on E(t) can be met is if E(t) = 0 at
all times t
This gives
∫_0^L [U(x, t)]^2 dx = 0
and can only be true if U(x, t) = 0 for all values of x and t
This implies that T1 = T2 — i.e. if there are two solutions, then they
must be equal
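The inequality dE/dt ≤ 0 can also be observed numerically; the sketch below (illustrative, not from the notes) evolves an arbitrary profile U with U = 0 at both ends using an explicit finite-difference scheme and prints a trapezoidal estimate of E(t), which decreases at every output time.

import numpy as np

D, L, N = 1.0, 1.0, 50
h = L / N
dt = 0.4 * h**2 / D                      # small enough for stability of the explicit scheme
x = np.linspace(0.0, L, N + 1)
U = np.sin(np.pi * x / L)                # any profile with U = 0 at x = 0 and x = L

for block in range(5):
    for _ in range(200):
        U[1:-1] += D * dt * (U[:-2] - 2.0 * U[1:-1] + U[2:]) / h**2
    E = np.sum(U[1:]**2 + U[:-1]**2) * h / 2.0   # trapezoidal estimate of E(t)
    print(E)                                     # decreases monotonically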
We look for a separable solution T (x, t) = X(x)S(t)
If we find a solution then we know by uniqueness that it will be the
only solution
We have
∂T/∂t = X(x) S'(t),   ∂^2T/∂x^2 = X''(x) S(t)
We may write the governing equation as
X(x) S'(t) = D X''(x) S(t)
equivalently
S'(t) / (D S(t)) = X''(x) / X(x)
The right–hand–side of the last equation is a function only of x and
not of t
The left–hand–side is a function only of t and not of x
The only way this can simultaneously be true is if both sides are
equal to a constant, i.e.
S'(t) / (D S(t)) = X''(x) / X(x) = λ
We now think about boundary conditions
We have T (0, t) = 0 and T (L, t) = 0
As T (x, t) = X(x)S(t) we must have
X(0)S(t) = 0, X(L)S(t) = 0
We don’t want S(t) = 0 for all times t — this would give T (x, t) = 0
We therefore have boundary conditions on X given by X(0) = 0 and
X(L) = 0
We have
X''(x) / X(x) = λ
together with boundary conditions X(0) = 0 and X(L) = 0
Suppose λ > 0. Then we can write λ = k^2, and so
d^2X/dx^2 − k^2 X = 0
This has general solution X = A e^{kx} + B e^{−kx}
The boundary conditions then give A = B = 0, and so X(x) = 0, and
then T(x, t) = 0
This isn't what we want — the assumption λ > 0 must be false
Suppose now that λ = 0
We then have
X''(x) / X(x) = 0
together with boundary conditions X(0) = 0 and X(L) = 0
Again, this only has solution X(x) = 0
We must have λ < 0
We therefore write
X''(x) / X(x) = −k^2
This equation has general solution
X(x) = A sin kx + B cos kx
We now fit the boundary conditions
X(0) = 0 ⇒ B = 0
X(L) = 0 ⇒ A sin kL = 0
If A = 0 we would have the trivial solution X(x) = 0, and so
T (x, t) = 0 which violates the initial conditions
Instead, sin kL = 0 and so kL = nπ where n = 1, 2, 3, . . .
When kL = nπ the equation for S(t) is
S'(t) = − (D n^2 π^2 / L^2) S(t)
which has general solution
S(t) = C e^{−D n^2 π^2 t / L^2}
Combining the solutions for X(x) and S(t), we see that
E_n sin(nπx/L) e^{−D n^2 π^2 t / L^2},   where E_n = AC,
is a solution that satisfies the boundary conditions for any n = 1, 2, 3, . . .
The general solution is the sum of these solutions:
T(x, t) = Σ_{n=1}^∞ E_n sin(nπx/L) e^{−D n^2 π^2 t / L^2}
where E_n, n = 1, 2, 3, . . . are constants that are fitted from the initial
conditions
In this case, E_1 = 3 and E_n = 0 for n = 2, 3, 4, . . . and so
T(x, t) = 3 sin(πx/L) e^{−D π^2 t / L^2}
We see that initially the temperature inside the bar is positive
The ends of the bar are maintained at a temperature T = 0, and so
we would expect that this would cool the bar down
This is evident from the solution — as t→∞ we see that T (x, t)→ 0
for all values of x
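A short sketch (not from the notes; D and L are given illustrative values) that evaluates the solution at the midpoint of the bar and shows the exponential decay towards zero.

import math

D, L = 1.0, 2.0        # illustrative values; the notes keep D and L general

def T(x, t):
    return 3.0 * math.sin(math.pi * x / L) * math.exp(-D * math.pi**2 * t / L**2)

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, T(L / 2.0, t))     # the midpoint temperature decays towards zero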
Another separable solution to the heat equation
Consider the PDE
∂T/∂t = D ∂^2T/∂x^2,   0 < x < L
with boundary conditions ∂T/∂x = 0 at x = 0, and T = 0 at x = L, and
initial conditions
T(x, 0) = L^2 − x^2
We again proceed by seeking a separable solution T (x, t) = X(x)S(t)
Boundary conditions give X ′(0) = 0 and X(L) = 0
As with the previous example we may write
X''(x) / X(x) = S'(t) / (D S(t)) = −k^2
from which we may deduce that
X(x) = A sin kx + B cos kx
The boundary condition X'(0) = 0 implies A = 0
The boundary condition X(L) = 0 gives B cos kL = 0
For a non-trivial solution we require kL = (n + 1/2)π, n = 0, 1, 2, . . .
We therefore have, for n = 0, 1, 2, . . .
X(x) = B cos((2n + 1)πx / (2L)),   k = (2n + 1)π / (2L)
Associated with this X(x) is the equation for S(t):
S'(t) = − ((2n + 1)^2 π^2 D / (4L^2)) S(t)
with solution
S(t) = C e^{−(2n+1)^2 π^2 D t / (4L^2)}
For n = 0, 1, 2, . . . the following is a solution of the PDE:
P_n cos((2n + 1)πx / (2L)) e^{−(2n+1)^2 π^2 D t / (4L^2)}
where P_n is a constant
A general solution is a linear sum of these solutions:
T(x, t) = Σ_{n=0}^∞ P_n cos((2n + 1)πx / (2L)) e^{−(2n+1)^2 π^2 D t / (4L^2)}
The constants P_n are determined by the initial conditions
T(x, 0) = L^2 − x^2
Setting t = 0 in the infinite sum and equating to the initial
conditions gives
L^2 − x^2 = Σ_{n=0}^∞ P_n cos((2n + 1)πx / (2L))
We now use an approach similar to that used for Fourier series to
determine the constants P_n
Assuming we may interchange the order of the infinite summation
and integration gives
∫_0^L (L^2 − x^2) cos((2m + 1)πx / (2L)) dx = Σ_{n=0}^∞ P_n ∫_0^L cos((2m + 1)πx / (2L)) cos((2n + 1)πx / (2L)) dx
Remembering that
cos A cos B = (1/2) (cos(A + B) + cos(A − B))
we have, for integers m ≠ n:
∫_0^L cos((2m + 1)πx / (2L)) cos((2n + 1)πx / (2L)) dx
  = (1/2) ∫_0^L cos((n + m + 1)πx / L) + cos((n − m)πx / L) dx
  = (1/2) [ (L / ((n + m + 1)π)) sin((n + m + 1)πx / L) + (L / ((n − m)π)) sin((n − m)πx / L) ]_0^L
  = 0
We also have
∫_0^L cos^2((2n + 1)πx / (2L)) dx = (1/2) ∫_0^L 1 + cos((2n + 1)πx / L) dx = L/2
The constants P_n are therefore given by
P_n = (2/L) ∫_0^L (L^2 − x^2) cos((2n + 1)πx / (2L)) dx
which can be evaluated by integration by parts
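Instead of doing the integration by parts by hand, the coefficients can be checked numerically; the sketch below (not from the notes, with L set to 1 for illustration) computes P_n by quadrature and verifies that the truncated cosine series reproduces the initial condition L^2 − x^2.

import numpy as np

L = 1.0
x = np.linspace(0.0, L, 2001)

def trap(f_vals):
    # simple trapezoidal rule on the grid x
    return float(np.sum((f_vals[1:] + f_vals[:-1]) * np.diff(x)) / 2.0)

def P(n):
    return (2.0 / L) * trap((L**2 - x**2) * np.cos((2 * n + 1) * np.pi * x / (2 * L)))

series = sum(P(n) * np.cos((2 * n + 1) * np.pi * x / (2 * L)) for n in range(50))
print(np.max(np.abs(series - (L**2 - x**2))))   # small truncation/quadrature error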
One final, short example on separable solutions of the heat equation
Solve
∂T/∂t = ∂^2T/∂x^2,   0 < x < 1, t > 0
with boundary conditions
T = 2 at x = 0, and T = 4 at x = 1
and initial conditions
T(x, 0) = 2 + 2x + 3 sin πx
All the boundary conditions we have considered before are of the
form T = 0 or ∂T/∂x = 0, which are known as homogeneous boundary
conditions
This allows us to find non–trivial sine and cosine solutions in the x
variable
At first sight, we can't do this for the non-homogeneous boundary
conditions for this problem.
But there is a way around it — write U = T − (2 + 2x)
Noting that
∂U/∂t = ∂T/∂t,   ∂^2U/∂x^2 = ∂^2T/∂x^2
we see that U satisfies the PDE
∂U/∂t = ∂^2U/∂x^2,   0 < x < 1, t > 0
The boundary conditions become
U = 0 at x = 0, and U = 0 at x = 1
and the initial conditions become
U(x, 0) = 3 sin πx
The equation for U has homogeneous boundary conditions, so can be
solved in the same way as we have solved earlier equations
The solution for T can then be recovered by writing
T (x, t) = U(x, t) + 2 + 2x
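For this problem the earlier example (with D = 1 and L = 1) gives U(x, t) = 3 sin(πx) e^{−π^2 t}, so T(x, t) = 2 + 2x + 3 sin(πx) e^{−π^2 t}; the short sketch below (not from the notes) checks the boundary and initial conditions.

import math

def T(x, t):
    # U(x,t) = 3 sin(pi x) exp(-pi^2 t) from the earlier example with D = 1, L = 1
    return 2.0 + 2.0 * x + 3.0 * math.sin(math.pi * x) * math.exp(-math.pi**2 * t)

print(T(0.0, 0.7), T(1.0, 0.7))                                    # boundary values 2 and 4 at any t
print(T(0.25, 0.0) - (2.0 + 0.5 + 3.0 * math.sin(math.pi / 4)))    # initial condition, ~0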
Similarity solutions to the heat equation
Suppose we want to solve
∂T/∂t = D ∂^2T/∂x^2,   x, t > 0
with initial condition
T (x, 0) = 0, x > 0
and boundary conditions
T (0, t) = U, T (∞, t) = 0, t > 0
Physically this corresponds to a semi–infinite bar occupying the
region 0 < x <∞, that is initially at zero temperature. At time t = 0
the end at x = 0 is raised to temperature U
Let η = x/√(Dt), and let T = f(η)
η is known as a similarity variable
On Worksheet 1 you showed that for the heat equation on the
previous slide this implies that
f''(η) + (1/2) η f'(η) = 0
Integrating once gives, for arbitrary constant B,
f'(η) = B e^{−η^2/4}
Integrating once more gives, for arbitrary constant A,
f(η) = A + B ∫_{s=0}^η e^{−s^2/4} ds
Noting that x > 0, t = 0 corresponds to η = x/√(Dt) = ∞, the initial
condition corresponds to f(∞) = 0
Similarly, x = 0, t > 0 corresponds to η = 0 and so the first boundary
condition corresponds to f(0) = U .
x =∞, t > 0 corresponds to η =∞ — the second boundary
condition corresponds to f(∞) = 0. Note this is consistent with other
conditions on f
Using these conditions on f(0) and f(∞) we may determine A and B
to give
T(x, t) = f(η) = U ( 1 − (∫_{s=0}^η e^{−s^2/4} ds) / (∫_{s=0}^∞ e^{−s^2/4} ds) )
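Since ∫_0^η e^{−s^2/4} ds = √π erf(η/2) and ∫_0^∞ e^{−s^2/4} ds = √π, the solution can equivalently be written T(x, t) = U erfc(x / (2√(Dt))); the sketch below (not from the notes) evaluates it with Python's standard library.

import math

def T(x, t, U=1.0, D=1.0):
    # similarity solution written in terms of the complementary error function
    return U * math.erfc(x / (2.0 * math.sqrt(D * t)))

print(T(0.0, 1.0))     # equals U at the heated end x = 0
print(T(5.0, 0.01))    # essentially zero far from the end at early times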
Poisson’s equation
Poisson’s equations in two dimensions is the partial differential
equation
D
(∂2u
∂x2+∂2u
∂y2
)+ f(x, y) = 0
where D is constant
This models many time independent diffusion processes — e.g.
chemical, heat — where f(x, y) is a source term
Different coordinate systems
An exercise on Worksheet 2 was to show that, for cylindrical polar
coordinates x = r cos θ, y = r sin θ,
∂^2u/∂x^2 + ∂^2u/∂y^2 = (1/r) ∂/∂r ( r ∂u/∂r ) + (1/r^2) ∂^2u/∂θ^2
We may therefore write Poisson's equation as
(1/r) ∂/∂r ( r ∂u/∂r ) + (1/r^2) ∂^2u/∂θ^2 + f(r, θ) = 0
Example: Solve
∂^2u/∂x^2 + ∂^2u/∂y^2 − x^2 − y^2 = 0
for x^2 + y^2 < 4, with boundary condition u = 5 on x^2 + y^2 = 4
Noting that x^2 + y^2 = r^2 we may write this as
(1/r) ∂/∂r ( r ∂u/∂r ) + (1/r^2) ∂^2u/∂θ^2 = r^2
for r < 2, with boundary condition u = 5 on r = 2
As there is no dependence on θ in the boundary conditions or source
term we seek a solution u = u(r), i.e. we neglect the dependence on θ
The partial derivatives with respect to r are now total derivatives,
and the partial derivatives with respect to θ are zero
The equation therefore becomes
(1/r) d/dr ( r du/dr ) = r^2
This has general solution
u = (1/16) r^4 + A log r + B
We first note that u must be finite at r = 0 — as lim_{r→0} log r = −∞, this requires A = 0
The other boundary condition is u = 5 on r = 2 — this yields B = 4
The solution is therefore
u = (1/16) r^4 + 4 = (1/16) (x^2 + y^2)^2 + 4
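A quick check (not from the notes): the candidate solution satisfies the PDE, which can be confirmed by comparing a finite-difference Laplacian against x^2 + y^2, and it takes the value 5 on r = 2.

def u(x, y):
    return (x**2 + y**2)**2 / 16.0 + 4.0

def laplacian(f, x, y, h=1.0e-3):
    # five-point finite-difference estimate of u_xx + u_yy
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4.0 * f(x, y)) / h**2

xp, yp = 0.7, -1.1
print(laplacian(u, xp, yp), xp**2 + yp**2)   # these should agree closely
print(u(2.0, 0.0), u(0.0, 2.0))              # both equal 5 on the boundary r = 2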
Separable solutions to Poisson’s equation
Suppose we want to solve
∂^2u/∂x^2 + ∂^2u/∂y^2 = 0,   0 < x, y < 1
with boundary conditions
u = 0, x = 0, x = 1, y = 1
u = x(1− x), y = 0
We seek a separable solution u(x, y) = X(x)Y(y), with boundary
conditions
X(0) = 0,   X(1) = 0,   Y(1) = 0
Substitution into the given PDE gives X ′′Y +XY ′′ = 0 which may
be written
X''/X = −Y''/Y = −k^2
We have two boundary conditions on X, so start with this equation
first:
X'' + k^2 X = 0
Boundary conditions give, for arbitrary constant B,
X(x) = B sin nπx,   n = 1, 2, 3, . . .
and k = nπ.
The ODE for Y(y) becomes
Y'' − n^2 π^2 Y = 0
This has general solution
Y(y) = C e^{nπy} + D e^{−nπy}
Applying the boundary condition Y(1) = 0 gives D = −C e^{2nπ}
General solution becomes Y(y) = C ( e^{nπy} − e^{nπ(2−y)} )
The solution may be written
u(x, y) = Σ_{n=1}^∞ P_n ( e^{nπy} − e^{nπ(2−y)} ) sin nπx
To set u = x(1 − x) on y = 0 we note that
x(1 − x) = u(x, 0) = Σ_{n=1}^∞ P_n ( 1 − e^{2nπ} ) sin nπx
We will now follow the Fourier type approach to determine the
constants Pn
Noting that for integers m, n we have
∫_0^1 sin mπx sin nπx dx = 0 if m ≠ n,   and 1/2 if m = n
we may write
P_n = ( 2 / (1 − e^{2nπ}) ) ∫_0^1 x(1 − x) sin nπx dx
The coefficients Pn may be evaluated by integration by parts.
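A numerical check (not from the notes): compute P_n by quadrature and confirm that the truncated series reproduces the boundary data u(x, 0) = x(1 − x).

import numpy as np

x = np.linspace(0.0, 1.0, 1001)

def trap(f_vals):
    # simple trapezoidal rule on the grid x
    return float(np.sum((f_vals[1:] + f_vals[:-1]) * np.diff(x)) / 2.0)

def P(n):
    return 2.0 * trap(x * (1 - x) * np.sin(n * np.pi * x)) / (1.0 - np.exp(2 * n * np.pi))

def u(y, terms=30):
    return sum(P(n) * (np.exp(n * np.pi * y) - np.exp(n * np.pi * (2 - y)))
               * np.sin(n * np.pi * x) for n in range(1, terms + 1))

print(np.max(np.abs(u(0.0) - x * (1 - x))))   # small truncation/quadrature error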