4 Partial Differentiation

University College Cork

Many equations in engineering, physics and mathematics tie together more than two variables. Examples include Ohm's Law ($V = IR$) and the equation for an ideal gas, $PV = nRT$, which gives the relationship between pressure ($P$), volume ($V$) and temperature ($T$). If we vary any two of these then the behaviour of the third can be calculated:

$$P = \frac{nRT}{V}, \qquad V = \frac{nRT}{P}, \qquad T = \frac{PV}{nR}.$$

How $P$ varies as we change $T$ and $V$ is easy to see from the above, but we want to adapt the tools of one-variable calculus to help us investigate functions of more than one variable.

For the most part we shall concentrate on functions of two variables such as $z = x^2 + y^2$ or $z = x \sin(y + e^x)$. Graphically $z = f(x, y)$ describes a surface in 3D space: varying the $x$- and $y$-coordinates gives the $z$-coordinate, producing the surface:

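[Figure: the surface $z = f(x, y)$ plotted over the $(x, y)$-plane, with a point $(x_0, y_0)$ and the corresponding height $z_0 = f(x_0, y_0)$ marked.]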

One task of interest will be to maximise or minimise such functions, where we may have to take into account limitations on the domain of definition, i.e. those points $(x, y)$ for which we calculate $z = f(x, y)$. Restrictions on the domain can come from both mathematical and physical reasons. For example, the function $T = \frac{PV}{nR}$ above makes mathematical sense for negative $P$ or $V$, but, physically, negative pressure or volume has no obvious meaning.

Exercise 4.1. What is the domain of the function $z = f(x, y) = \sqrt{1 - x^2 - y^2}$?

Solution.

Consider the function $z = \ln(x + y)$. From its definition we see that it is defined only when $x + y > 0$, that is, only for points $(x, y) \in \mathbb{R}^2$ lying above the line $y = -x$. Moreover, on any line $x + y = a$ for $a > 0$ we have $z = \ln a$, that is, $z$ maintains a constant value, so we have a contour line of the surface:

[Figure: the $(x, y)$-plane showing the boundary line $x + y = 0$ and a contour line $x + y = a$.]

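Note also that on any line with equation $y = mx + c$ for constants $m > -1$ and $c$, we have

$$z = \ln\big((m + 1)x + c\big) = \ln\Big(x + \frac{c}{m + 1}\Big) + \ln(m + 1)$$

for all $x > -c/(m + 1)$, so that we get a copy of the graph of the logarithm curve when not travelling parallel to the lines of constancy. (The factorisation needs $m + 1 > 0$; for $m < -1$ the function is instead defined for $x < -c/(m + 1)$ and we get a reflected copy of the curve.) For example on $y = x$, $z = \ln(2x) = \ln x + \ln 2$.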

As another example, consider the function $z = x^2 + y^2$. If we choose a positive value for $z$, for example $z = 4$, then the points $(x, y)$ that can give rise to this value are those satisfying $x^2 + y^2 = 4 = 2^2$, i.e. those on the circle centred on the origin of radius 2. On the other hand if we fix a value for $x$, for example $x = 0$, then we have $z = y^2$. If we fix $x = 1$ then $z = y^2 + 1$. Both of these are parabolas, and indeed fixing any value of $x$ produces such a curve. Symmetrically, fixing a value for $y$ also produces a parabola, e.g. $z = x^2 + (-3)^2 = x^2 + 9$.

Note that at $(x, y) = (0, 0)$, $z = 0$, but if $x \neq 0$ or $y \neq 0$, then $x^2 > 0$ or $y^2 > 0$, and it follows that $z > 0$. Thus the minimum value taken by this function is $z = 0$, at the origin. This contrasts with our earlier example $z = \ln(x + y)$ where if we move along $y = x$ we have $z = \ln(2x) = \ln x + \ln 2$, which diverges to $-\infty$ as $x \to 0$, and diverges to $+\infty$ as $x \to +\infty$. Thus there is no overall maximum or minimum value.

Unfortunately in general it is harder to picture what is happening with less simple multivariable functions, such as $z = \sin(x^2 + y) + e^{xy}$. One useful technique illustrated above for $z = x^2 + y^2$ is to hold either $x$ or $y$ constant. For example consider $z = x^2(1 - y) - xy^2 + y^3$. Setting $x = 0$ gives

$$z = y^3 \;\Rightarrow\; \frac{dz}{dy} = 3y^2,$$

and setting $x = 1$ gives

$$z = y^3 - y^2 - y + 1 = (y - 1)^2(y + 1) \;\Rightarrow\; \frac{dz}{dy} = 3y^2 - 2y - 1 = (y - 1)(3y + 1).$$

On the other hand if $y = -2$ then

$$z = 3x^2 - 4x - 8 \;\Rightarrow\; \frac{dz}{dx} = 6x - 4 = 2(3x - 2).$$

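All of these slices through the surface give us an insight into the behaviour of the function:

[Figure: three slices through the surface, showing the curves $z = y^3$ (for $x = 0$), $z = y^3 - y^2 - y + 1$ (for $x = 1$) and $z = 3x^2 - 4x - 8$ (for $y = -2$).]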

4.1 Definition of partial derivatives

Suppose that $z = f(x, y)$ is a function of two variables. We define partial derivatives taken with respect to $x$ and with respect to $y$ by:

$$\frac{\partial z}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \qquad \frac{\partial z}{\partial y} = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k},$$

whenever these limits exist. These definitions mirror those for the one variable case. For $\frac{\partial z}{\partial x}$ we are holding the value of $y$ fixed, altering $x$ by a small amount $h$ to get the point $(x + h, y)$, and calculating the slopes of straight line approximations to the tangent in the $x$-direction. Similarly, calculating $\frac{\partial z}{\partial y}$ involves holding $x$ fixed and finding the limit of approximations to the tangent in the $y$-direction.

Applying this definition to the function $z = x^2(1 - y) - xy^2 + y^3$ above we have

$$\begin{aligned}
\frac{\partial z}{\partial x} &= \lim_{h \to 0} \frac{\big[(x + h)^2(1 - y) - (x + h)y^2 + y^3\big] - \big[x^2(1 - y) - xy^2 + y^3\big]}{h} \\
&= \lim_{h \to 0} \frac{(2xh + h^2)(1 - y) - hy^2}{h} \\
&= \lim_{h \to 0} \big[(2x + h)(1 - y) - y^2\big] = 2x(1 - y) - y^2,
\end{aligned}$$

and this limit exists at all points $(x, y)$ in the plane. A similar calculation shows that

$$\frac{\partial z}{\partial y} = -x^2 - 2xy + 3y^2.$$
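The limit definition can be checked numerically. The sketch below is only an illustration: the function names, the step size and the sample point are assumptions, not part of the notes. It evaluates the difference quotients for a small step and compares them with the two closed forms obtained above.

```python
def z(x, y):
    """Example surface from the text: z = x^2 (1 - y) - x y^2 + y^3."""
    return x**2 * (1 - y) - x * y**2 + y**3


def difference_quotient_x(f, x, y, h=1e-6):
    """(f(x + h, y) - f(x, y)) / h with y held fixed."""
    return (f(x + h, y) - f(x, y)) / h


def difference_quotient_y(f, x, y, k=1e-6):
    """(f(x, y + k) - f(x, y)) / k with x held fixed."""
    return (f(x, y + k) - f(x, y)) / k


def zx_exact(x, y):
    """Limit computed in the text: 2x(1 - y) - y^2."""
    return 2 * x * (1 - y) - y**2


def zy_exact(x, y):
    """Limit stated in the text: -x^2 - 2xy + 3y^2."""
    return -x**2 - 2 * x * y + 3 * y**2


x0, y0 = 1.5, -0.5  # arbitrary sample point, chosen only for illustration
approx_x = difference_quotient_x(z, x0, y0)
approx_y = difference_quotient_y(z, x0, y0)
print(approx_x, zx_exact(x0, y0))
print(approx_y, zy_exact(x0, y0))
```

Each printed pair should agree to several decimal places; shrinking the step improves the agreement until rounding error takes over.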


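However, in practice it is rarely necessary to go back to the definitions as we did above. Indeed, since all we are doing is holding one variable fixed, we can treat this variable as a constant in our calculations. For example if $z = x^2 + xy^5 - 6x^3y + y^4$ then

$$\begin{aligned}
\frac{\partial z}{\partial x} &= \frac{d}{dx}(x^2) + y^5 \frac{d}{dx}(x) - 6y \frac{d}{dx}(x^3) + y^4 \frac{d}{dx}(1) \\
&= 2x + y^5 \times 1 - 6y \times 3x^2 + y^4 \times 0 = 2x + y^5 - 18x^2y.
\end{aligned}$$

Similarly,

$$\begin{aligned}
\frac{\partial z}{\partial y} &= x^2 \frac{d}{dy}(1) + x \frac{d}{dy}(y^5) - 6x^3 \frac{d}{dy}(y) + \frac{d}{dy}(y^4) \\
&= x^2 \times 0 + x \times 5y^4 - 6x^3 \times 1 + 4y^3 = 5xy^4 - 6x^3 + 4y^3.
\end{aligned}$$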

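Using this technique we can make use of known results from one-variable theory such as the product and quotient rules, and the chain rule if we are considering a function of one variable. (The general chain rule for partial derivatives is a little more complicated.) For example if $z = \sin(xy)\,e^{x+y}$ then

$$\begin{aligned}
\frac{\partial z}{\partial x} &= \frac{\partial}{\partial x}\big(\sin(xy)\big) \times e^{x+y} + \sin(xy) \times \frac{\partial}{\partial x} e^{x+y} && \text{(product rule)} \\
&= \cos(xy) \frac{\partial}{\partial x}(xy) \times e^{x+y} + \sin(xy) \times e^{x+y} \frac{\partial}{\partial x}(x + y) && \text{(chain rule, twice)} \\
&= y\cos(xy)\,e^{x+y} + \sin(xy)\,e^{x+y}.
\end{aligned}$$

A similar calculation shows that $\frac{\partial z}{\partial y} = x\cos(xy)\,e^{x+y} + \sin(xy)\,e^{x+y}$.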

4.1.1 Alternative notations

If $z = f(x, y)$, then $\frac{\partial z}{\partial x}$ is often written as $f_x(x, y)$, and $\frac{\partial z}{\partial y}$ as $f_y(x, y)$. A further alternative is to write these as $f_1(x, y)$ and $f_2(x, y)$ respectively, where the number indicates whether the derivative is being taken with respect to the first or second variable.

Exercise 4.2. For which points $(x, y)$ is the function $z = \ln(y - x^2) + \sin(x + y^2)$ defined? Sketch the domain of definition. Calculate $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$.

Solution.


Exercise 4.3 (S03 6(a i)). Compute $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ when $z = x^2y + 3x\sin(x - 2y)$.

Solution.

4.1.2 Functions of more variables

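We can extend the notion of partial derivatives to functions of any (finite) number of variables in a natural way. For example if $w = \sin(x + y) + z^2e^x$ then

$$\begin{aligned}
\frac{\partial w}{\partial x} &= \cos(x + y)\frac{\partial}{\partial x}(x + y) + z^2\frac{\partial}{\partial x}e^x = \cos(x + y) + z^2e^x, \\
\frac{\partial w}{\partial y} &= \cos(x + y)\frac{\partial}{\partial y}(x + y) + 0 = \cos(x + y), \\
\frac{\partial w}{\partial z} &= 0 + \frac{\partial}{\partial z}(z^2)\,e^x = 2ze^x.
\end{aligned}$$

The geometrical significance of such functions is not so immediate as for functions of only two variables.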

4.2 Tangent planes

By holding one variable fixed in the definition of partial derivatives, for example setting $y = b$, we are taking a surface $z = f(x, y)$, intersecting it with the plane $y = b$, and then taking derivatives of the resulting curve in this plane:

[Figure: the curve cut from the surface by the plane $y = b$, drawn against the $x$- and $z$-axes, with the tangent at $x = a$ rising by $f_x(a, b)$ for a unit step in $x$.]

For one-variable calculus we are interested in approximating a function $y = f(x)$ by finding a tangent line to the curve at some point $\big(a, f(a)\big)$. For two variables the appropriate object is the tangent plane.

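[Figure: a surface with the tangent vectors $\mathbf{t}_x$, $\mathbf{t}_y$ and the normal vector $\mathbf{n}$ drawn at a point.]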

By taking partial derivatives in the orthogonal directions corresponding to the $x$- and $y$-axes we can produce vectors that are in the direction of the tangent lines to the two curves obtained by intersecting the surface with the planes $x = a$ and $y = b$. In particular the tangent plane at the point $\big(a, b, f(a, b)\big)$ contains this point, and should contain the two tangent lines through this point in the directions specified by the vectors obtained from these partial derivatives.

In the plane $y = b$ a tangent vector is $\mathbf{t}_x = \big(1, 0, f_x(a, b)\big)$, and in the plane $x = a$ a tangent vector is $\mathbf{t}_y = \big(0, 1, f_y(a, b)\big)$. Both of these vectors lie in the tangent plane, hence their vector product $\mathbf{n} = \mathbf{t}_x \times \mathbf{t}_y$ is a normal vector to the plane:

$$\mathbf{n} = \mathbf{t}_x \times \mathbf{t}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & f_x(a, b) \\ 0 & 1 & f_y(a, b) \end{vmatrix} = \big(-f_x(a, b), -f_y(a, b), 1\big),$$

and so the tangent plane has equation

$$\begin{aligned}
& \big[(x, y, z) - \big(a, b, f(a, b)\big)\big] \cdot \big(-f_x(a, b), -f_y(a, b), 1\big) = 0 \\
\Leftrightarrow\; & -f_x(a, b)(x - a) - f_y(a, b)(y - b) + z - f(a, b) = 0 \\
\Leftrightarrow\; & z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b).
\end{aligned}$$

This function of $x$ and $y$ is also known as the linearisation of $z = f(x, y)$ at the point $(a, b)$.
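The tangent-plane formula translates directly into a small helper. The following is a minimal sketch, assuming the partial derivatives are supplied as ordinary Python functions; the names `linearisation` and `normal_vector` and the choice of the surface $z = x^2 + y^2$ at $(1, 2)$ are illustrative assumptions.

```python
def linearisation(f, fx, fy, a, b):
    """Return L(x, y) = f(a, b) + fx(a, b)(x - a) + fy(a, b)(y - b)."""
    f0, p, q = f(a, b), fx(a, b), fy(a, b)

    def L(x, y):
        return f0 + p * (x - a) + q * (y - b)

    return L


def normal_vector(fx, fy, a, b):
    """Normal to the tangent plane: (-fx(a, b), -fy(a, b), 1)."""
    return (-fx(a, b), -fy(a, b), 1.0)


# Illustrative surface z = x^2 + y^2 with its partial derivatives.
f = lambda x, y: x**2 + y**2
fx = lambda x, y: 2 * x
fy = lambda x, y: 2 * y

L = linearisation(f, fx, fy, 1.0, 2.0)
n = normal_vector(fx, fy, 1.0, 2.0)
print(n)                            # normal vector at (1, 2, 5)
print(L(1.1, 2.1), f(1.1, 2.1))     # plane value versus surface value nearby
```

Close to $(a, b)$ the plane value and the surface value differ only by a second-order amount, which is the sense in which the linearisation approximates $f$.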

Exercise 4.4 (A04 8(b)). Find the equation of the tangent plane and the normal line to the surface $z = x^2y^3 - \sin\big(\pi x + \frac{\pi}{2}y\big)$ at the point $(2, 1)$. Find the intersection of this normal line with the $(x, y)$-plane.

Solution.


Exercise 4.5 (S03 6(b)). Find the equation of the tangent plane to the surface $z = f(x, y) = \sqrt{8 - 3x^2 - y^2}$ at the point $(1, 2, 1)$. Write down the linear approximation to $f(x, y)$ at $(1, 2)$ and use it to find an approximate value for $f(1.05, 1.95)$.

Solution.

Exercise 4.6. Find the tangent planes to the surface $z = f(x, y) = 3xy + x + y^2$ at $(1, 0)$ and at $(-1, 2)$. Find the line of intersection of these two planes.

Solution.


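Example 4.7. Find the tangent plane to the surface

$$z = e^{2xy} + \tan\frac{\pi(x + y)}{16}$$

at the point $(3, 1, e^6 + 1)$.

Solution. Since $z = e^{2xy} + \tan\frac{\pi(x + y)}{16}$, we have

$$\frac{\partial z}{\partial x} = 2ye^{2xy} + \frac{\pi}{16}\sec^2\frac{\pi(x + y)}{16}, \qquad \frac{\partial z}{\partial y} = 2xe^{2xy} + \frac{\pi}{16}\sec^2\frac{\pi(x + y)}{16}.$$

So when $x = 3$ and $y = 1$, $z = e^6 + \tan\frac{\pi}{4} = e^6 + 1$ and

$$\frac{\partial z}{\partial x} = 2e^6 + \frac{\pi}{16}\cdot\frac{1}{(1/\sqrt{2})^2} = 2e^6 + \frac{\pi}{8}, \qquad \frac{\partial z}{\partial y} = 6e^6 + \frac{\pi}{16}\cdot\frac{1}{(1/\sqrt{2})^2} = 6e^6 + \frac{\pi}{8}.$$

So the tangent plane has equation

$$z = e^6 + 1 + \Big(2e^6 + \frac{\pi}{8}\Big)(x - 3) + \Big(6e^6 + \frac{\pi}{8}\Big)(y - 1).$$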

Example 4.8 (S05 8(c)). Find the equation of the tangent plane to the surface with equation $z = y\cos(x - y)$ at the point $(2, 2, 2)$.

Solution. Taking partial derivatives of $z$ we get

$$\frac{\partial z}{\partial x} = -y\sin(x - y)\frac{\partial}{\partial x}(x - y) = -y\sin(x - y),$$

and

$$\frac{\partial z}{\partial y} = \cos(x - y) - y\sin(x - y)\frac{\partial}{\partial y}(x - y) = \cos(x - y) + y\sin(x - y).$$

So at $(2, 2, 2)$ we have $\frac{\partial z}{\partial x} = -2\sin 0 = 0$ and $\frac{\partial z}{\partial y} = \cos 0 + 2\sin 0 = 1$. Thus the tangent plane is

$$z = 2 + 0 \times (x - 2) + 1 \times (y - 2) \;\Rightarrow\; z = y.$$

4.3 Higher order derivatives

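Suppose $z = x\sin y + x^2y$. Then

$$\frac{\partial z}{\partial x} = \sin y + 2xy \quad\text{and}\quad \frac{\partial z}{\partial y} = x\cos y + x^2.$$

Both of these partial derivatives are again functions of $x$ and $y$, so we can differentiate both of them, either with respect to $x$, or with respect to $y$. This gives us a total of four second order partial derivatives:

$$\begin{aligned}
\frac{\partial^2 z}{\partial x^2} &= \frac{\partial}{\partial x}\Big(\frac{\partial z}{\partial x}\Big) = 2y, &
\frac{\partial^2 z}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\Big(\frac{\partial z}{\partial x}\Big) = \cos y + 2x, \\
\frac{\partial^2 z}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\Big(\frac{\partial z}{\partial y}\Big) = \cos y + 2x, &
\frac{\partial^2 z}{\partial y^2} &= \frac{\partial}{\partial y}\Big(\frac{\partial z}{\partial y}\Big) = -x\sin y.
\end{aligned}$$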


Remark. The mixed partial derivatives in this case are equal: $\frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial x\,\partial y}$. This is not something special about our particular example, but is true for all reasonably well-behaved functions (for instance, whenever the second order partial derivatives are continuous). However it is possible to find functions for which this is not true (examples can be found in calculus text books).

In the $\frac{\partial}{\partial x}$ notation, the order of taking the derivatives is given by reading the variables in the denominator from right to left. For example $\frac{\partial^2 z}{\partial y\,\partial x}$ means calculate $\frac{\partial z}{\partial x}$, then differentiate the result with respect to $y$. When using the $f_x$ or $f_1$ notations, the convention is the other way: the subscripts are read left to right. That is, $f_{xy} = (f_x)_y$, the derivative with respect to $y$ of the derivative with respect to $x$. However, in light of the remark above, these conventions can usually be ignored for most functions, since the results in either order will be the same.

For example, if $f(x, y) = x^2 + xy^2$, then

$$f_x(x, y) = 2x + y^2, \qquad f_y(x, y) = 2xy,$$

and so

$$f_{xx}(x, y) = 2, \qquad f_{xy}(x, y) = 2y = f_{yx}(x, y), \qquad f_{yy}(x, y) = 2x.$$
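The equality of the mixed partial derivatives for $f(x, y) = x^2 + xy^2$ can also be observed numerically. The sketch below uses central differences; the helper names `d_dx` and `d_dy`, the step size and the sample point are assumptions made for illustration only.

```python
def f(x, y):
    """Example from the text: f(x, y) = x^2 + x y^2."""
    return x**2 + x * y**2


def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g in x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g in y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


f_xy = d_dy(d_dx(f))  # differentiate in x first, then in y: (f_x)_y
f_yx = d_dx(d_dy(f))  # differentiate in y first, then in x: (f_y)_x

x0, y0 = 0.7, 1.3  # arbitrary sample point
print(f_xy(x0, y0), f_yx(x0, y0), 2 * y0)
```

All three printed numbers should be close to $2y_0$, matching $f_{xy} = f_{yx} = 2y$.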

Exercise 4.9. Compute all the second order partial derivatives of the function $f(x, y) = \sin(x + xy)$.

Solution.

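Example 4.10. Compute $\frac{\partial z}{\partial x}$, $\frac{\partial z}{\partial y}$ and $\frac{\partial^2 z}{\partial x^2}$ when $z = x^3y + e^{x+y^2} + y\sin x$.

Solution. Since $z = x^3y + e^{x+y^2} + y\sin x$ we have

$$\frac{\partial z}{\partial x} = 3x^2y + e^{x+y^2} + y\cos x, \qquad \frac{\partial^2 z}{\partial x^2} = 6xy + e^{x+y^2} - y\sin x,$$

and

$$\frac{\partial z}{\partial y} = x^3 + 2ye^{x+y^2} + \sin x.$$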


Example 4.11. Consider the following function of three variables:

$$f(x, y, z) = x^2y^3 + \sin(x^2 + z) - \frac{xy}{z}.$$

Find the partial derivatives $f_x$, $f_y$, $f_{yx}$ and $f_{xyz}$.

Solution. Since $f(x, y, z) = x^2y^3 + \sin(x^2 + z) - \frac{xy}{z}$ we have

$$f_x = 2xy^3 + 2x\cos(x^2 + z) - \frac{y}{z}, \qquad f_y = 3x^2y^2 - \frac{x}{z},$$

$$f_{yx} = (f_y)_x = 6xy^2 - \frac{1}{z}, \qquad f_{xyz} = (f_{xy})_z = (f_{yx})_z = \frac{1}{z^2}.$$

4.4 Exercises

1. Find the domains of the following functions. Sketch these domains in parts (i) and (iii).

(i) $f(x, y) = \sqrt{8 - 3x^2 - y^2}$

(ii) $f(x, y) = \sin(x - y) + \frac{1}{x - y^2}$

(iii) $f(x, y) = \ln(1 - x^2 - y^2) + \frac{x^2 + 1}{x + y}$

(iv) $f(x, y) = \sin(x^2 + e^{xy})$

2. Find all the first order derivatives of the following functions:

(i) $f(x, y) = x^3 - 4xy^2 + y^4$

(ii) $f(x, y) = x^2e^y - 4y$

(iii) $f(x, y) = x^2\sin xy - 3y^2$

(iv) $f(x, y, z) = 3x\sin y + 4x^3y^2z$

3. Find the indicated partial derivatives:

(i) $f(x, y) = x^3 - 4xy^2 + 3y$ : $f_{xx}$, $f_{yy}$, $f_{xy}$

(ii) $f(x, y) = x^4 - 3x^2y^3 + 5y$ : $f_{xx}$, $f_{xy}$, $f_{xyy}$

(iii) $f(x, y, z) = e^{2xy} - \frac{z^2}{y} + xz\sin y$ : $f_{xx}$, $f_{yy}$, $f_{yyzz}$

4. Find the equation of the tangent plane and normal line to the following surfaces at the given point:

(i) $z = x^2 + y^2 - 1$ at $(2, 1, 4)$

(ii) $z = e^{-x^2 - y^2}$ at $(0, 0, 1)$

(iii) $z = \sin x\cos y$ at $(0, \pi, 0)$

(iv) $z = x^3 - 2xy$ at $(-2, 3, 4)$

(v) $z = \sqrt{x^2 + y^2}$ at $(-3, 4, 5)$

(vi) $z = \frac{4x}{y}$ at $(1, 2, 2)$

4.5 The Chain Rule

For one-variable calculus if $y = f(x)$, i.e. $y$ is a function of $x$, and if $x = x(t)$, i.e. $x$ is a function of the variable $t$, then $y$ can be viewed as a function of $t$, and has derivative

$$\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}.$$

Suppose instead that $z = f(x, y)$ is a function of two variables, each of which is written in terms of the single variable $t$. Then we think of $z$ just as a function of $t$, and differentiate. The result depends on the partial derivatives with respect to $x$ and $y$ through the following formula:

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$


As an example, suppose $z = x^2 - xy$, and that $x = \sin t$, $y = t^2$. Then

$$\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \\
&= (2x - y)\frac{d}{dt}(\sin t) + (-x)\frac{d}{dt}(t^2) = (2\sin t - t^2)\cos t - 2t\sin t.
\end{aligned}$$

In this case we could avoid use of the chain rule, since direct substitution for $x$ and $y$ gives $z = \sin^2 t - t^2\sin t$. But substitution may not always be convenient, or even available.
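The two routes to $\frac{dz}{dt}$ for $z = x^2 - xy$, $x = \sin t$, $y = t^2$ can be compared numerically. This is a minimal sketch: the function names, the central-difference helper and the sample value of $t$ are assumptions chosen for illustration.

```python
import math


def dz_dt_chain(t):
    """Chain rule: dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt)."""
    x, y = math.sin(t), t**2
    dz_dx = 2 * x - y
    dz_dy = -x
    dx_dt = math.cos(t)
    dy_dt = 2 * t
    return dz_dx * dx_dt + dz_dy * dy_dt


def z_direct(t):
    """z after substituting x = sin t and y = t^2."""
    return math.sin(t) ** 2 - t**2 * math.sin(t)


def central_difference(g, t, h=1e-6):
    """Numerical derivative of a one-variable function g at t."""
    return (g(t + h) - g(t - h)) / (2 * h)


t0 = 0.8  # arbitrary sample value
chain_value = dz_dt_chain(t0)
direct_value = central_difference(z_direct, t0)
print(chain_value, direct_value)
```

The two printed values should agree to many decimal places, as the chain rule predicts.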

Exercise 4.12. Suppose $z = x^2y + y^2$, where $x = \cos t$ and $y = 1/t$. Find $\frac{dz}{dt}$.

Solution.

Exercise 4.13. The pressure, volume and temperature of an ideal gas are related by the equation $PV = 8.31T$. Find the rate at which the pressure is changing when the temperature is 300 K and increasing at a rate of 0.1 K s$^{-1}$, and the volume is 100 l and increasing at a rate of 0.2 l s$^{-1}$.

Solution.

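Example 4.14. The volume of a right circular cylinder of base radius $r$ and height $h$ is $V = \pi r^2h$. If the radius is decreasing at a rate of 3 cm s$^{-1}$ while the height is increasing at a rate of 2 cm s$^{-1}$, what is the rate of change of $V$ when $r = 40$ cm and $h = 110$ cm?

Solution. Since $V = \pi r^2h$, $\frac{\partial V}{\partial r} = 2\pi rh$ and $\frac{\partial V}{\partial h} = \pi r^2$. So thinking of $V$ as a function of $t$ we have, by the chain rule,

$$\frac{dV}{dt} = \frac{\partial V}{\partial r}\frac{dr}{dt} + \frac{\partial V}{\partial h}\frac{dh}{dt} = 2\pi rh\frac{dr}{dt} + \pi r^2\frac{dh}{dt}.$$

For the problem we have $r = 40$, $h = 110$, $\frac{dr}{dt} = -3$ and $\frac{dh}{dt} = 2$, and so

$$\frac{dV}{dt} = 2\pi \times 40 \times 110 \times (-3) + \pi \times 40^2 \times 2 = -23200\pi \ \text{cm}^3\,\text{s}^{-1}.$$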
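The arithmetic in this example is easy to reproduce. The sketch below simply evaluates the chain-rule expression for the cylinder; the function name `cylinder_volume_rate` is an assumption introduced for illustration.

```python
import math


def cylinder_volume_rate(r, h, dr_dt, dh_dt):
    """dV/dt for V = pi r^2 h, using dV/dr = 2 pi r h and dV/dh = pi r^2."""
    dV_dr = 2 * math.pi * r * h
    dV_dh = math.pi * r**2
    return dV_dr * dr_dt + dV_dh * dh_dt


# Values from Example 4.14 (lengths in cm, rates in cm per second).
rate = cylinder_volume_rate(r=40.0, h=110.0, dr_dt=-3.0, dh_dt=2.0)
print(rate / math.pi)  # the coefficient of pi, close to -23200
print(rate)            # the rate itself in cm^3 per second
```

The negative sign shows that the shrinking radius outweighs the growing height at this instant.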


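Example 4.15. The voltage $V$ in an electrical circuit is slowly decreasing as the battery wears out. The resistance is slowly increasing as the resistor heats up. Use Ohm's Law $V = IR$ to find how the current $I$ is changing at the moment when $R = 400\,\Omega$, $I = 0.08$ A, $\frac{dV}{dt} = -0.01$ V s$^{-1}$ and $\frac{dR}{dt} = 0.03\,\Omega$ s$^{-1}$.

Solution. From $V = IR$ we get $I = \frac{V}{R}$, and so the chain rule gives

$$\begin{aligned}
\frac{dI}{dt} &= \frac{\partial I}{\partial V}\frac{dV}{dt} + \frac{\partial I}{\partial R}\frac{dR}{dt} \\
&= \frac{1}{R}\frac{dV}{dt} - \frac{V}{R^2}\frac{dR}{dt} = \frac{1}{R}\frac{dV}{dt} - \frac{I}{R}\frac{dR}{dt} \\
&= \frac{1}{400} \times (-0.01) - \frac{0.08}{400} \times 0.03 \\
&= \frac{1}{400}(-0.01 - 0.08 \times 0.03) = -0.000031 \ \text{A s}^{-1}.
\end{aligned}$$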
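As a quick check of the numbers, the same chain-rule expression can be evaluated directly. This is a sketch only; the function name `current_rate` is an assumption, and the voltage is recovered from the given current and resistance via $V = IR$.

```python
def current_rate(V, R, dV_dt, dR_dt):
    """dI/dt for I = V / R, using dI/dV = 1/R and dI/dR = -V/R^2."""
    dI_dV = 1.0 / R
    dI_dR = -V / R**2
    return dI_dV * dV_dt + dI_dR * dR_dt


R, I = 400.0, 0.08   # ohms and amperes, from Example 4.15
V = I * R            # Ohm's Law gives the voltage at this instant
rate = current_rate(V, R, dV_dt=-0.01, dR_dt=0.03)
print(rate)          # in amperes per second, close to -3.1e-05
```

Both terms are negative here: the falling voltage and the rising resistance each reduce the current.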

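Now suppose that $z = f(x, y)$, a function of the two variables $x$ and $y$, and that each of these in turn depends on two variables $s$ and $t$. Then, viewing $z$ as a function of $s$ and $t$, we have the two partial derivatives

$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \quad\text{and}\quad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$

As an example, suppose that $z = xy - y^2$, and that $x = e^{s+t}$ and $y = \frac{s}{t}$. Then

$$\begin{aligned}
\frac{\partial z}{\partial s} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = y\frac{\partial}{\partial s}(e^{s+t}) + (x - 2y)\frac{\partial}{\partial s}\Big(\frac{s}{t}\Big) \\
&= \frac{s}{t} \times e^{s+t} + \Big(e^{s+t} - \frac{2s}{t}\Big) \times \frac{1}{t} = \frac{1}{t^2}\big[(s + 1)te^{s+t} - 2s\big].
\end{aligned}$$

Similarly,

$$\frac{\partial z}{\partial t} = \frac{s}{t^3}\big[t(t - 1)e^{s+t} + 2s\big].$$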
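Both closed forms can be tested against numerical derivatives of the substituted function. The sketch below assumes a central-difference helper and an arbitrary sample point $(s, t) = (0.5, 2)$; all names are illustrative.

```python
import math


def z_of(s, t):
    """z = x*y - y^2 after substituting x = e^(s+t) and y = s/t."""
    x = math.exp(s + t)
    y = s / t
    return x * y - y**2


def dz_ds_formula(s, t):
    """Closed form from the text for the partial derivative in s."""
    return ((s + 1) * t * math.exp(s + t) - 2 * s) / t**2


def dz_dt_formula(s, t):
    """Closed form from the text for the partial derivative in t."""
    return s * (t * (t - 1) * math.exp(s + t) + 2 * s) / t**3


def central_difference(g, u, h=1e-6):
    return (g(u + h) - g(u - h)) / (2 * h)


s0, t0 = 0.5, 2.0  # arbitrary sample point with t0 != 0
num_ds = central_difference(lambda s: z_of(s, t0), s0)  # t held fixed
num_dt = central_difference(lambda t: z_of(s0, t), t0)  # s held fixed
print(num_ds, dz_ds_formula(s0, t0))
print(num_dt, dz_dt_formula(s0, t0))
```

Each numerical derivative holds the other variable fixed, which is exactly what the partial derivative means.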

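Exercise 4.16. If $z = e^x\sin y$ where $x = st^2$ and $y = s^2t$, find $\frac{\partial z}{\partial s}$.

Solution.

Exercise 4.17 (A04 8(c)). Suppose that $z = f(u, v)$ and that the variables $u$ and $v$ depend on $x$ and $y$ through $u = x^2y + y^2$ and $v = e^x\cos(\pi y)$. If $\frac{\partial z}{\partial u} = -4$ and $\frac{\partial z}{\partial v} = 3$ at the point $(u, v) = (4, 1)$, find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ at the corresponding point $(x, y) = (0, 2)$.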


Solution.

Exercise 4.18. If $g(s, t) = f(s^2 - t^2, t^2 - s^2)$, and if $f$ has partial derivatives with respect to both variables, show that $g$ satisfies the equation

$$t\frac{\partial g}{\partial s} + s\frac{\partial g}{\partial t} = 0.$$

Solution.

In general, if $z$ is a function depending on the $m$ variables $x_1, x_2, \ldots, x_m$, and each of these is defined in terms of the $n$ variables $y_1, y_2, \ldots, y_n$, then we can think of $z$ as a function of the $y_j$, and take $n$ different partial derivatives, which are given by

$$\frac{\partial z}{\partial y_j} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial y_j} + \frac{\partial z}{\partial x_2}\frac{\partial x_2}{\partial y_j} + \cdots + \frac{\partial z}{\partial x_m}\frac{\partial x_m}{\partial y_j}.$$

The previous two formulae given are special cases of this general version of the chain rule.
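The general formula is a sum over the intermediate variables, one sum for each $y_j$. A minimal sketch follows, assuming the partial derivatives have already been evaluated at a point and stored in plain lists; the name `chain_rule` and the data layout are assumptions. It is applied to the earlier example $z = xy - y^2$, $x = e^{s+t}$, $y = s/t$, so $m = n = 2$.

```python
import math


def chain_rule(dz_dx, dx_dy):
    """Return [dz/dy_1, ..., dz/dy_n] given
    dz_dx[i]    = partial of z with respect to x_i   (length m)
    dx_dy[i][j] = partial of x_i with respect to y_j (m rows, n columns)."""
    m, n = len(dz_dx), len(dx_dy[0])
    return [sum(dz_dx[i] * dx_dy[i][j] for i in range(m)) for j in range(n)]


s, t = 0.5, 2.0  # arbitrary sample values with t != 0
x, y = math.exp(s + t), s / t
dz_dx = [y, x - 2 * y]           # partials of z = x*y - y^2 in x and y
dx_dy = [[x, x],                 # partials of x = e^(s+t) in s and t
         [1 / t, -s / t**2]]     # partials of y = s/t in s and t
dz_ds, dz_dt = chain_rule(dz_dx, dx_dy)
print(dz_ds, dz_dt)
```

In matrix language this is the row vector of partial derivatives of $z$ multiplied by the matrix of partial derivatives of the $x_i$ with respect to the $y_j$.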

4.6 Directional derivatives; the gradient operator

Given a function $z = f(x, y)$, we have defined the two partial derivatives $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ by making small changes in one variable while holding the other fixed. This amounts to moving a small distance in the $(x, y)$-plane parallel to one or other of the coordinate axes. This can be generalised by moving in any direction in the $(x, y)$-plane as specified by a unit vector $\mathbf{c} = (c, d)$ (so $c^2 + d^2 = 1$). The directional derivative of $z = f(x, y)$ at the point $\mathbf{r} = (a, b)$ in the direction $\mathbf{c}$ is

$$D_{\mathbf{c}}f(\mathbf{r}) = \lim_{h \to 0} \frac{f(a + hc, b + hd) - f(a, b)}{h}.$$


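Particular examples are given by taking $\mathbf{c} = \mathbf{i} = (1, 0)$ or $\mathbf{c} = \mathbf{j} = (0, 1)$, the unit vectors in the coordinate directions, since from these we just recover the partial derivatives:

$$D_{\mathbf{i}}f(\mathbf{r}) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} = f_x(\mathbf{r}), \qquad D_{\mathbf{j}}f(\mathbf{r}) = f_y(\mathbf{r}).$$

More generally, having fixed a unit vector $\mathbf{c}$ and a point $\mathbf{r}$, we are taking the one-variable function $g(h) := f(\mathbf{r} + h\mathbf{c}) = f(a + hc, b + hd)$, differentiating it, and evaluating at $h = 0$. Using the chain rule we get

$$\begin{aligned}
\frac{d}{dh}g(h) &= f_x(a + hc, b + hd)\frac{d}{dh}(a + hc) + f_y(a + hc, b + hd)\frac{d}{dh}(b + hd) \\
&= c\,f_x(\mathbf{r} + h\mathbf{c}) + d\,f_y(\mathbf{r} + h\mathbf{c})
\end{aligned}$$

for all $h \in \mathbb{R}$, and so setting $h = 0$ we get

$$D_{\mathbf{c}}f(\mathbf{r}) = c\,f_x(\mathbf{r}) + d\,f_y(\mathbf{r}) = (c, d) \cdot \big(f_x(\mathbf{r}), f_y(\mathbf{r})\big).$$

That is, $D_{\mathbf{c}}f(\mathbf{r})$ can be calculated by finding the vector of partial derivatives, evaluating this at the point $\mathbf{r}$, and taking the dot product of the result with the direction vector $\mathbf{c}$. The vector $\big(f_x(\mathbf{r}), f_y(\mathbf{r})\big)$ is known as the gradient of $f$, and is denoted $\nabla f(\mathbf{r})$, so that

$$D_{\mathbf{c}}f(\mathbf{r}) = \mathbf{c} \cdot \nabla f(\mathbf{r}).$$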

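For example, if we take $f(x, y) = 4xy - x^3y^2$, $\mathbf{c} = \big(\frac{3}{5}, \frac{4}{5}\big)$ and $\mathbf{r} = (2, -3)$ then

$$f_x(x, y) = 4y - 3x^2y^2 \;\Rightarrow\; f_x(\mathbf{r}) = -120, \qquad f_y(x, y) = 4x - 2x^3y \;\Rightarrow\; f_y(\mathbf{r}) = 56,$$

and so

$$D_{\mathbf{c}}f(\mathbf{r}) = \Big(\frac{3}{5}, \frac{4}{5}\Big) \cdot (-120, 56) = \frac{-360 + 224}{5} = -\frac{136}{5}.$$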

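Recall that for any two vectors $\mathbf{a}$ and $\mathbf{b}$, $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta$, where $\theta$ is the angle between these two vectors. Applying this to our formula for directional derivatives, and noting that $|\mathbf{c}| = 1$, we get

$$D_{\mathbf{c}}f(\mathbf{r}) = |\nabla f(\mathbf{r})||\mathbf{c}|\cos\theta = |\nabla f(\mathbf{r})|\cos\theta.$$

But $-1 \leq \cos\theta \leq 1$, so the directional derivative is maximised if we take $\theta = 0$, when $\cos\theta = 1$, that is, if we take $\mathbf{c}$ in the same direction as $\nabla f(\mathbf{r})$. Alternatively, if we take $\mathbf{c}$ in the opposite direction, so that $\theta = \pi$, and hence $\cos\theta = -1$, then $D_{\mathbf{c}}f(\mathbf{r}) = -|\nabla f(\mathbf{r})|$, the most negative value possible.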
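The claim that the gradient direction maximises the directional derivative can be observed by brute force. The sketch below sweeps unit vectors around the circle for the gradient $(-120, 56)$ of the previous example; the grid of 3600 angles and the variable names are assumptions chosen for illustration.

```python
import math

grad = (-120.0, 56.0)  # gradient of f(x, y) = 4xy - x^3 y^2 at r = (2, -3)


def directional(theta):
    """Directional derivative along the unit vector (cos theta, sin theta)."""
    return math.cos(theta) * grad[0] + math.sin(theta) * grad[1]


# Sample directions every 0.1 degrees and keep the one with the largest value.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=directional)

grad_norm = math.hypot(*grad)
grad_angle = math.atan2(grad[1], grad[0]) % (2 * math.pi)
print(best, grad_angle)               # best sampled angle vs gradient angle
print(directional(best), grad_norm)   # largest sampled value vs |grad f|
```

Up to the resolution of the grid, the best direction coincides with the direction of the gradient and the largest value with its length.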

Exercise 4.19. Find the directional derivative of $f(x, y) = x^3 - 3xy + 4y^2$ in the direction given by the unit vector $\mathbf{c}$ at an angle $\pi/6$ to the $x$-axis. What is $D_{\mathbf{c}}f(1, 2)$?

Solution.


Exercise 4.20. If $f(x, y) = xe^y$, find the rate of change at the point $P$ with position vector $(2, 0)$ in the direction from $P$ to the point $Q$ with position vector $\big(\frac{1}{2}, 2\big)$. In which direction does $f$ have the maximum rate of change? What is this maximum rate of change?

Solution.

Example 4.21 (S05 8(b)). Find the directional derivative of the function

$$f(x, y) = 5xy^2 - 4x^3y$$

at the point $P = (1, 2)$ in the direction of the vector $(5, 12)$. What is the maximum rate of decrease of the function at $P$, and in which direction does this occur?

Solution. Since $5^2 + 12^2 = 169 = 13^2$, a unit vector in the required direction is $\mathbf{c} = \frac{1}{13}(5, 12)$. Also,

$$f(x, y) = 5xy^2 - 4x^3y \;\Rightarrow\; \nabla f = (5y^2 - 12x^2y,\; 10xy - 4x^3)$$

and so the directional derivative in the direction of $\mathbf{c}$ at $P$ is

$$\frac{1}{13}(5, 12) \cdot \nabla f(1, 2) = \frac{1}{13}(5, 12) \cdot (20 - 24, 20 - 4) = \frac{1}{13}(-20 + 192) = \frac{172}{13}.$$

The maximum rate of decrease occurs in the direction of $-\nabla f(1, 2) = (4, -16)$, and is

$$|\nabla f(1, 2)| = \sqrt{(-4)^2 + 16^2} = 4\sqrt{17}.$$

4.6.1 Tangent planes revisited

Consider the equation

$$x^2 + y^2 + z^2 = 1. \tag{S}$$

This can be written as $|\mathbf{r}|^2 = 1$, which is equivalent to $|\mathbf{r}| = 1$, where $\mathbf{r} = (x, y, z)$ is the position of a general point in space. Thus a point satisfies the equation (S) precisely if it is distance 1 from the origin, that is, if it is on the sphere of radius 1 whose centre is the origin.

It is impossible to rearrange this equation to get a single function $z = f(x, y)$, since we have the two possibilities:

$$z = \sqrt{1 - x^2 - y^2} \quad\text{and}\quad z = -\sqrt{1 - x^2 - y^2},$$

and note that these only make sense whenever $x^2 + y^2 \leq 1$, that is when the point $(x, y)$ lies in the disc of radius 1 centred on $(0, 0)$ in the $(x, y)$-plane.

Define a function $F : \mathbb{R}^3 \to \mathbb{R}$ by $F(\mathbf{r}) = F(x, y, z) = x^2 + y^2 + z^2$. The equation of the sphere is given by $F(\mathbf{r}) = 1$, that is, setting $F$ equal to a constant value. Surfaces generated this way are called the level surfaces of the function $F$, and provide a more general way of describing surfaces than equations of the form $z = f(x, y)$. Note that if we define $F(\mathbf{r}) := z - f(x, y)$ then our first type of surface is nothing but the level surface

$$F(\mathbf{r}) = 0.$$

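Consider now the problem of finding the equation of a tangent plane to such a surface $F(\mathbf{r}) = k$ at the point $\mathbf{a}$. To do this imagine a point moving around on the surface. At time $t$ its position vector is $\mathbf{r}(t) = \big(x(t), y(t), z(t)\big)$, and suppose that at time $t = 0$ it passes through $\mathbf{a}$. That is

$$\mathbf{r}(0) = \big(x(0), y(0), z(0)\big) = \mathbf{a} \quad\text{and}\quad F\big(\mathbf{r}(t)\big) = k \text{ for all } t,$$

the second equation following since the point is constrained to lie on the surface. Since the function $t \mapsto F\big(\mathbf{r}(t)\big)$ is constant, its derivative is 0. But we can also evaluate this using the chain rule, which gives

$$\begin{aligned}
0 &= F_x\big(\mathbf{r}(t)\big)\frac{dx(t)}{dt} + F_y\big(\mathbf{r}(t)\big)\frac{dy(t)}{dt} + F_z\big(\mathbf{r}(t)\big)\frac{dz(t)}{dt} \\
&= \Big(F_x\big(\mathbf{r}(t)\big), F_y\big(\mathbf{r}(t)\big), F_z\big(\mathbf{r}(t)\big)\Big) \cdot \Big(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}, \frac{dz(t)}{dt}\Big) = \nabla F\big(\mathbf{r}(t)\big) \cdot \frac{d\mathbf{r}(t)}{dt},
\end{aligned}$$

where $\nabla F$ is the gradient vector of $F$ made up of its partial derivatives, which is the 3D analogue of $\nabla f$ considered above. Putting $t = 0$ in this equation shows that $\nabla F(\mathbf{a})$ is orthogonal to $\frac{d\mathbf{r}(0)}{dt}$, which is the tangent vector to the path taken by the point when passing through $\mathbf{a}$ (i.e. its velocity, when thinking in terms of its motion). Since this is true for any path passing through $\mathbf{a}$, it follows that the vector $\nabla F(\mathbf{a})$ must be orthogonal to the surface at this point, and so provides the normal vector for the tangent plane. Hence all points $\mathbf{r}$ on the tangent plane at $\mathbf{a}$ satisfy the equation

$$(\mathbf{r} - \mathbf{a}) \cdot \nabla F(\mathbf{a}) = 0.$$

[Figure: a path on the surface through $\mathbf{a}$, showing the normal vector $\nabla F$ and the tangent vector $\frac{d\mathbf{r}(t)}{dt}$.]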


For an example consider $F(x, y, z) = x^2 + y^2 + z^2$ as above. We have

$$\nabla F = \Big(\frac{\partial}{\partial x}(x^2 + y^2 + z^2), \frac{\partial}{\partial y}(x^2 + y^2 + z^2), \frac{\partial}{\partial z}(x^2 + y^2 + z^2)\Big) = 2(x, y, z),$$

a vector parallel to the position vector of the point $\mathbf{r}$, showing that the normal line to the sphere at any point can be continued back to the origin. In particular at the point $(1, 0, 0)$ on the surface of the sphere, the tangent plane has equation

$$\big[(x, y, z) - (1, 0, 0)\big] \cdot (2, 0, 0) = 0 \;\Leftrightarrow\; 2(x - 1) = 0 \;\Leftrightarrow\; x = 1.$$

This equation could not have been calculated with our earlier method, because the partial derivatives of $z = \pm\sqrt{1 - x^2 - y^2}$ do not exist at $(x, y) = (1, 0)$: the tangent plane there is vertical.

Finally, if we are given a surface by means of the equation $z = f(x, y)$ then, rewriting this as $F(x, y, z) := z - f(x, y) = 0$, we see that the normal vector at the point $(x, y, z) = \big(a, b, f(a, b)\big)$ is $\big(-f_x(a, b), -f_y(a, b), 1\big)$, as shown previously.
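The recipe $(\mathbf{r} - \mathbf{a}) \cdot \nabla F(\mathbf{a}) = 0$ is short enough to code directly. This is a sketch under the assumption that the gradient is supplied as a Python function; the name `tangent_plane` and the returned form $\mathbf{n} \cdot \mathbf{r} = k$ are illustrative choices.

```python
def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)


def tangent_plane(grad, a):
    """Return (n, k) describing the plane n . r = k through the point a,
    obtained by expanding (r - a) . grad(a) = 0."""
    n = grad(*a)
    k = sum(n_i * a_i for n_i, a_i in zip(n, a))
    return n, k


n, k = tangent_plane(grad_F, (1.0, 0.0, 0.0))
print(n, k)  # normal (2, 0, 0) and constant 2, i.e. 2x = 2, so x = 1
```

The same helper applies to any level surface once its gradient is known, including surfaces that cannot be written in the form $z = f(x, y)$ near the point.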

Exercise 4.22. Find the tangent plane to the hyperboloid $x^2 - y^2 + 2z^2 = 1$ at the point $(3, 4, -2)$. At which points is the normal line to the surface parallel to the line through the points $(3, -1, 0)$ and $(5, 3, 6)$?

Solution.


Exercise 4.23. Show that $(0, -1, 2)$ lies on both of the following surfaces: $x^2 + 4y + z^2 = 0$ and $x^2 + y^2 + z^2 - 6z + 7 = 0$. Show, moreover, that the surfaces are tangent to one another at this point.


Solution.

4.7 Critical/stationary points

A function $z = f(x, y)$ has a local maximum at the point $(a, b)$ if $f(a, b) \geq f(x, y)$ for all $(x, y)$ close to $(a, b)$. More precisely, $z = f(x, y)$ has a local maximum at the point $(a, b)$ if we can find some number $r > 0$ such that when we evaluate $f(x, y)$ at any point in the disc with centre $(a, b)$ and radius $r$, we have $f(a, b) \geq f(x, y)$.

[Figure: the disc of radius $r$ centred on $(a, b)$ in the $(x, y)$-plane.]

Similarly, $z = f(x, y)$ has a local minimum at the point $(c, d)$ if $f(c, d) \leq f(x, y)$ for all points $(x, y)$ close to $(c, d)$ in the same sense, i.e. all points in some disc of positive radius centred on $(c, d)$.

We shall make use of partial derivatives to locate candidates for such points. Note that if $(a, b)$ is a local maximum or a local minimum, then the tangent plane at this point should be horizontal, which is equivalent to saying that the normal vector $\mathbf{n}$ to the surface/tangent plane should have 0 for its $x$- and $y$-components.

Recall that $\mathbf{n} = \big(-f_x(a, b), -f_y(a, b), 1\big)$. Consequently a point $(a, b)$ in the domain of a function $z = f(x, y)$ is called a critical or stationary point if

$$f_x(a, b) = f_y(a, b) = 0.$$

This is a necessary condition for there to be a local maximum or a local minimum at $(a, b)$ (when the partial derivatives exist), but is not a sufficient condition.

It tallies with the fact that if there was a local maximum in the surface at that point then $x = a$ would be a local maximum in the curve $z = g(x) = f(x, b)$, and $y = b$ would be a local maximum in the curve $z = h(y) = f(a, y)$, the curves obtained by intersecting the surface with the planes $y = b$ and $x = a$ respectively.


4.7.1 The second derivative test

Recall that for a function $y = f(x)$, if $f'(a) = 0$ then we can check to see whether there is a local maximum or local minimum at $x = a$ by calculating $f''(a)$ (providing these derivatives exist) and seeing if the result is negative (a local maximum) or positive (a local minimum).

Given a function $z = f(x, y)$ of two variables, its discriminant is the function

$$D_f(a, b) = f_{xx}(a, b)f_{yy}(a, b) - f_{xy}(a, b)^2 = \begin{vmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{vmatrix},$$

where we are assuming that $f_{xy}(a, b) = f_{yx}(a, b)$.

Theorem 4.24. Suppose that $z = f(x, y)$ has partial derivatives up to second order, and that $(a, b)$ is a critical point of the function.

(i) If $D_f(a, b) > 0$ and $f_{xx}(a, b) > 0$ then $f$ has a local minimum at $(a, b)$.

(ii) If $D_f(a, b) > 0$ and $f_{xx}(a, b) < 0$ then $f$ has a local maximum at $(a, b)$.

(iii) If $D_f(a, b) < 0$ then $f$ has a saddle point at $(a, b)$.

(iv) If $D_f(a, b) = 0$ then no conclusion can be drawn.

Remark. The matrix whose determinant is taken to form the discriminant is symmetric, and so, by an earlier theorem, we know that it has two real (possibly repeated) eigenvalues. The inequality $D_f(a, b) > 0$ is equivalent to saying that the eigenvalues are either both positive or both negative. The inequality $D_f(a, b) < 0$ is equivalent to saying that the eigenvalues are of different sign. If $D_f(a, b) = 0$ then at least one eigenvalue is 0, which leads to the lack of conclusion.

The theorem follows from the multidimensional version of Taylor's Theorem which implies that for points $(x, y)$ close to $(a, b)$

$$f(x, y) \approx f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) + \frac{1}{2}\begin{bmatrix} x - a & y - b \end{bmatrix} \begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix}.$$

So if there is a critical point at $(a, b)$ then

$$f(x, y) \approx f(a, b) + \frac{1}{2}\begin{bmatrix} x - a & y - b \end{bmatrix} \begin{bmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix}.$$

Local maxima and minima are relatively easy to visualise, and the final possibility ($D_f(a, b) = 0$) being inconclusive can happen for reasons similar to the one variable case (e.g. there is a point of inflection when looking at the curve passing through $\big(a, b, f(a, b)\big)$ in some direction). Case (iii) is a phenomenon that occurs for surfaces but not for curves. In one direction the curve obtained by intersecting the surface with a plane has a local minimum, in the orthogonal direction the corresponding curve has a local maximum.

Figure 1: $z = x^2 + y^2$

Figure 2: $z = -x^2 - y^2$

Figure 3: $z = x^2 - y^2$


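For each of the above examples $\frac{\partial z}{\partial x} = 2x$ or $\frac{\partial z}{\partial x} = -2x$, and $\frac{\partial z}{\partial y} = 2y$ or $\frac{\partial z}{\partial y} = -2y$. Consequently in all cases the only critical point is the origin $(x, y) = (0, 0)$.

When $z = x^2 + y^2$,

$$\frac{\partial^2 z}{\partial x^2} = 2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = 2 \;\Rightarrow\; D_f(0, 0) = 4.$$

Since $f_{xx}(0, 0) = 2 > 0$, the point $(0, 0)$ is a local minimum for this function.

When $z = -x^2 - y^2$,

$$\frac{\partial^2 z}{\partial x^2} = -2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = -2 \;\Rightarrow\; D_f(0, 0) = 4.$$

Since $f_{xx}(0, 0) = -2 < 0$, the point $(0, 0)$ is a local maximum for this function.

When $z = x^2 - y^2$,

$$\frac{\partial^2 z}{\partial x^2} = 2, \quad \frac{\partial^2 z}{\partial x\,\partial y} = 0, \quad \frac{\partial^2 z}{\partial y^2} = -2 \;\Rightarrow\; D_f(0, 0) = -4,$$

so the point $(0, 0)$ is a saddle point for this function.

Perhaps more instructively one should consider the intersections of each surface with the planes $y = 0$ and $x = 0$, which produce the curves $z = \pm x^2$ and $z = \pm y^2$.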
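The second derivative test is a short decision procedure, and the three classifications above can be reproduced mechanically. The sketch below assumes the second order partial derivatives at the critical point have already been computed by hand; the function name `classify` and the returned strings are illustrative choices.

```python
def classify(fxx, fyy, fxy):
    """Apply the second derivative test at a critical point, given the
    values of f_xx, f_yy and f_xy there (with f_xy = f_yx assumed)."""
    D = fxx * fyy - fxy**2  # the discriminant D_f(a, b)
    if D > 0:
        # D > 0 forces f_xx to be non-zero, so one of these two cases holds.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"


print(classify(2, 2, 0))    # z = x^2 + y^2 at the origin
print(classify(-2, -2, 0))  # z = -x^2 - y^2 at the origin
print(classify(2, -2, 0))   # z = x^2 - y^2 at the origin
```

The test only uses values at the critical point itself, so locating the critical points by solving $f_x = f_y = 0$ remains a separate step.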

Exercise 4.25. Locate and classify the stationary points of $f(x, y) = x^3 - 2y^2 - 2y^4 + 3x^2y$.

Solution.


Exercise 4.26 (S04 8(c)). Locate and classify the stationary points of the function $z = 2x^2 + y^3 - x^2y - 3y$.

Solution.


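Example 4.27. Locate and classify all the critical points of the function $z = x\sin y$.

Solution. $z = x\sin y$, so $\frac{\partial z}{\partial x} = \sin y$ and $\frac{\partial z}{\partial y} = x\cos y$. So to have $\frac{\partial z}{\partial x} = 0$ we need $\sin y = 0$, that is $y = n\pi$ for $n = 0, \pm 1, \pm 2, \ldots$

But note: $\cos n\pi = (-1)^n \neq 0$ for all $n$, and we also need $\frac{\partial z}{\partial y} = x\cos y = 0$, where $\cos y \neq 0$. So we must have $x = 0$. Thus the critical points of $z$ are $(0, n\pi)$ for $n = 0, \pm 1, \pm 2, \ldots$

But now $\frac{\partial^2 z}{\partial x^2} = 0$ and $\frac{\partial^2 z}{\partial x\,\partial y} = \cos y$, so that

$$\frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \Big(\frac{\partial^2 z}{\partial x\,\partial y}\Big)^2 = -\cos^2 y = -1$$

for $x = 0$ and $y = n\pi$. Thus each of the critical points is a saddle point.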

4.8 Exercises

1. Show that for any $k \geq 0$ the function $z = \sin kx \cos kct$ satisfies the wave equation $c^2\frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 z}{\partial t^2}$. Show that if $f(u)$ is any function of one variable that is twice differentiable (i.e. $f'(u)$ and $f''(u)$ exist) then $z = f(x - ct)$ is also a solution of this equation.

2. Use the chain rule to calculate the indicated derivatives:

(i) $g'(t)$ where $g(t) = f\big(x(t), y(t)\big)$, $f(x, y) = x^2y - \sin y$, $x = \sqrt{t^2 + 1}$, $y = e^t$.

(ii) $g'(t)$ where $g(t) = f\big(x(t), y(t)\big)$, $f(x, y) = \sqrt{x^2 + y^2}$, $x = \sin t$, $y = t^2 + 2$.

(iii) $\frac{\partial z}{\partial u}$ and $\frac{\partial z}{\partial v}$, where $z = \sin(xy)$, $x = u^2v$, $y = ve^u$.

(iv) $\frac{\partial z}{\partial u}$ and $\frac{\partial z}{\partial v}$, where $z = xy^3$, $x = e^{u^2}$, $y = \sqrt{v^2 + 1}\,\sin u$.

3. Find the directional derivatives of the given functions at the given point and in the given direction:

(i) $f(x, y) = x\sin(xy)$ at $\big(1, \frac{\pi}{2}\big)$ in the direction of $\mathbf{c} = \frac{1}{\sqrt{2}}(1, -1)$.

(ii) $f(x, y) = x^2 + \ln(x - y)$ at $(2e, e)$ in the direction of the vector $\mathbf{a} = (4, -7)$.

(iii) $g(x, y) = y^2 - x^2 - x$ at $(4, 5)$ in the direction of the vector from $P = (3, 7)$ to $Q = (-1, 9)$.

(iv) $f(x, y, z) = x + y^2z - xz^3$ at $(2, 0, 2)$ in the direction of the normal vector to the plane $x - 3y + 4z = 5$.

4. Find the tangent plane to the following surfaces at the given point:

(i) $x^2 - y^2 + z^2 = 13$ at the point $(4, 2, 1)$.

(ii) $x\sin(yz) = 2$ at the point $\big(2, 2, \frac{\pi}{4}\big)$.

(iii) $xy - e^{x - yz} = 31$ at the point $(16, 2, 8)$.

5. Locate and classify all of the critical points of the following functions:

(i) $z = e^{-x^2}(y^2 + 1)$

(ii) $z = 4xy - x^4 - y^4 + 4$

(iii) $z = y^2 + x^2y + x^2 - 2y$

(iv) $z = e^{-x^2 - y^2}$

(v) $z = x^2 - \frac{4xy}{y^2 + 1}$

(vi) $z = xye^{-x^2 - y^2}$
