
Section 5: The Jacobian matrix and applications.

S1: Motivation

S2: Jacobian matrix + differentiability

S3: The chain rule

S4: Inverse functions

Images from "Thomas' Calculus" by Thomas, Weir, Hass & Giordano, 2008, Pearson Education, Inc.


S1: Motivation.

The main aim of this section is to consider "general" functions, to define a general derivative, and to look at its properties.

In fact, we have slowly been doing this. We first considered vector–valued functions of one variable, f : R → Rn,

f(t) = (f1(t), . . . , fn(t)),

and defined the derivative as

f′(t) = (f1′(t), . . . , fn′(t)).

We then considered real–valued functions of two and three variables, f : R2 → R and f : R3 → R, and (as we will see later) we may think of the derivatives of these functions, respectively, as

∇f = (∂f/∂x, ∂f/∂y),

∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z).


There are still more general functions than the types considered above. If we combine the elements of each, then we can form "vector–valued functions of many variables".

A function f : Rm → Rn (n > 1) is a vector–valued function of m variables.

Example 1

f(x, y, z) = ( x + y + z )
             (    xyz    )

defines a function from R3 to R2.
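As a quick sketch, Example 1's function can be evaluated componentwise in Python (the function name f_example1 is our own label, not from the notes):

```python
# Example 1: f : R^3 -> R^2, f(x, y, z) = (x + y + z, xyz).
def f_example1(x, y, z):
    # Each component function produces one coordinate of the output vector.
    return (x + y + z, x * y * z)

# For instance, f(1, 2, 3) = (6, 6).
print(f_example1(1, 2, 3))  # (6, 6)
```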


When it comes to these vector–valued functions, we should write vectors as column vectors (essentially because matrices act on column vectors); however, we will use both vertical columns and horizontal m–tuple notation. Thus, for example, for the vector x ∈ R3 we will write both

( x )
( y )   or   (x, y, z)   (and xi + yj + zk),
( z )

and so we could write f : R3 → R2 as

f(x, y, z) = ( f1(x, y, z) )
             ( f2(x, y, z) )

and

f(x, y, z) = (f1(x, y, z), f2(x, y, z))
           = f1(x, y, z)i + f2(x, y, z)j,

or combinations of columns and m–tuples.


In Example 1, the real–valued functions

f1(x, y, z) = x + y + z   and   f2(x, y, z) = xyz

are called the co–ordinate or component functions of f, and we may write

f = ( f1 )
    ( f2 ).

Generally, any f : Rm → Rn is determined by n co–ordinate functions f1, . . . , fn and we write

f = ( f1(x1, . . . , xm) )
    ( f2(x1, . . . , xm) )
    (        . . .       )
    ( fn(x1, . . . , xm) ).     (1)


We shall be most interested in the cases f : R2 → R2 and f : R3 → R3, because this is where most applications occur and because these cases will prove extremely useful in our topic on multiple integration.

For these special cases we can use the following notation:

f(x) = f(x, y)
     = (f1(x, y), f2(x, y))
     = f1(x, y)i + f2(x, y)j,

f(x) = f(x, y, z)
     = (f1(x, y, z), f2(x, y, z), f3(x, y, z))
     = f1(x, y, z)i + f2(x, y, z)j + f3(x, y, z)k.


One way of visualizing f, say f : R2 → R2, is to think of f as a transformation between co–ordinate planes, so that f may stretch, compress, rotate, etc., sets in its domain.

This viewpoint will be particularly useful when dealing with multiple integration and change of variables.


S2: Jacobian matrix + differentiability.

Our first problem is how to define the derivative of a vector–valued function of many variables. Recall that if f : R2 → R then we can form the directional derivative, i.e.,

Du f = u1 ∂f/∂x + u2 ∂f/∂y = ∇f · u,

where u = (u1, u2) is a unit vector. Thus, knowledge of the gradient of f gives information about all directional derivatives. Therefore it is reasonable to take

∇p f = (∂f/∂x (p), ∂f/∂y (p))

as the derivative of f at p. (The story is more complicated than this, but when we say f is "differentiable" we mean that ∇f represents the derivative, to be discussed a little later.)


More generally, if f : Rm → R then we take the derivative at p to be the row vector

(∂f/∂x1 (p), ∂f/∂x2 (p), . . . , ∂f/∂xm (p)) = ∇p f.

Now take f : Rm → Rn, where f is as in equation (1); then the natural candidate for the derivative of f at p is

Jp f = ( ∂f1/∂x1   ∂f1/∂x2   . . .   ∂f1/∂xm )
       ( ∂f2/∂x1   ∂f2/∂x2   . . .   ∂f2/∂xm )
       (   . . .     . . .   . . .    . . .  )
       ( ∂fn/∂x1   ∂fn/∂x2   . . .   ∂fn/∂xm ),

where the partial derivatives are evaluated at p. This n × m matrix is called the Jacobian matrix of f. Writing the function f as a column helps us to get the rows and columns of the Jacobian matrix the right way round. Note that the "Jacobian" is usually the determinant of this matrix when the matrix is square, i.e., when m = n.
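This definition can be sketched numerically: each column of the Jacobian is approximated by a central difference in one input variable. The helper name jacobian_fd and the step h below are our own choices, not part of the notes:

```python
def jacobian_fd(f, p, h=1e-6):
    """Approximate the n x m Jacobian of f : R^m -> R^n at p
    by central differences, one column per input variable."""
    m = len(p)
    n = len(f(p))
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        up = list(p); up[j] += h
        dn = list(p); dn[j] -= h
        fu, fd = f(up), f(dn)
        for i in range(n):
            J[i][j] = (fu[i] - fd[i]) / (2 * h)
    return J

# Example 1: f(x, y, z) = (x + y + z, xyz) at p = (1, 2, 3).
f = lambda p: (p[0] + p[1] + p[2], p[0] * p[1] * p[2])
J = jacobian_fd(f, [1.0, 2.0, 3.0])
# Rows correspond to component functions, columns to variables;
# expect approximately [[1, 1, 1], [6, 3, 2]].
```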


Example 2 Find the Jacobian matrix of f from Example 1 and evaluate it at (1, 2, 3).
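With f1(x, y, z) = x + y + z and f2(x, y, z) = xyz, the partial derivatives give

Jf = ( 1    1    1  )
     ( yz   xz   xy ),

so that

J(1,2,3) f = ( 1   1   1 )
             ( 6   3   2 ).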


Most of the cases we will be looking at have m = n = 2 or m = n = 3. Suppose u = u(x, y) and v = v(x, y). If we define f : R2 → R2 by

f(x, y) = ( u(x, y) ) ≡ ( f1 )
          ( v(x, y) )   ( f2 ),

then the Jacobian matrix is

Jf = ( ∂u/∂x   ∂u/∂y )
     ( ∂v/∂x   ∂v/∂y )

and the Jacobian (determinant) is

det(Jf) = | ∂u/∂x   ∂u/∂y |
          | ∂v/∂x   ∂v/∂y |  =  ∂u/∂x ∂v/∂y − ∂v/∂x ∂u/∂y.

We often denote det(Jf) by ∂(u, v)/∂(x, y).


Example 3 Consider the transformation from polar to Cartesian co–ordinates, where

x = r cos θ and y = r sin θ.

We have

∂(x, y)/∂(r, θ) = | ∂x/∂r   ∂x/∂θ |
                  | ∂y/∂r   ∂y/∂θ |  =  | cos θ   −r sin θ |
                                        | sin θ    r cos θ |  =  r.
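A numerical sanity check of this determinant, as a sketch: the sample point and finite–difference step below are arbitrary choices of ours.

```python
import math

def det2(a, b, c, d):
    # Determinant of the 2 x 2 matrix [[a, b], [c, d]].
    return a * d - b * c

def polar_jacobian_det(r, theta, h=1e-6):
    """Approximate d(x,y)/d(r,theta) by central differences."""
    x = lambda r, t: r * math.cos(t)
    y = lambda r, t: r * math.sin(t)
    dxdr = (x(r + h, theta) - x(r - h, theta)) / (2 * h)
    dxdt = (x(r, theta + h) - x(r, theta - h)) / (2 * h)
    dydr = (y(r + h, theta) - y(r - h, theta)) / (2 * h)
    dydt = (y(r, theta + h) - y(r, theta - h)) / (2 * h)
    return det2(dxdr, dxdt, dydr, dydt)

# The determinant should come out close to r at any sample point.
print(polar_jacobian_det(2.0, 0.7))  # approximately 2.0
```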


We have already noted that if f : Rm → Rn then the Jacobian matrix at each point a ∈ Rm is an n × m matrix. Such a matrix Ja f gives us a linear map Da f : Rm → Rn defined by

(Da f)(x) := Ja f · x for all x ∈ Rm.

Note that x is a column vector.

When we say f : Rm → Rn is differentiable at q we mean that the affine function A(x) := f(q) + (Jq f) · (x − q) is a "good" approximation to f(x) near x = q, in the sense that

lim_{x→q} ‖f(x) − f(q) − (Jq f) · (x − q)‖ / ‖x − q‖ = 0,

where

‖x − q‖ = √((x1 − q1)² + . . . + (xm − qm)²).
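To see this limit in action numerically, one can watch the error ratio shrink as x → q. A sketch for Example 1's f at q = (1, 2, 3), approaching along a direction of our own choosing; the Jacobian rows used are (1, 1, 1) and (yz, xz, xy) evaluated at q:

```python
import math

# f(x, y, z) = (x + y + z, xyz), with Jacobian at q = (1, 2, 3)
# given by rows (1, 1, 1) and (yz, xz, xy) = (6, 3, 2).
def f(x, y, z):
    return (x + y + z, x * y * z)

q = (1.0, 2.0, 3.0)
J = [[1.0, 1.0, 1.0], [6.0, 3.0, 2.0]]

def error_ratio(t):
    """||f(x) - f(q) - J(x - q)|| / ||x - q|| along x = q + t(1, 1, 1)."""
    d = (t, t, t)
    x = tuple(qi + di for qi, di in zip(q, d))
    fx, fq = f(*x), f(*q)
    lin = [sum(J[i][j] * d[j] for j in range(3)) for i in range(2)]
    num = math.hypot(fx[0] - fq[0] - lin[0], fx[1] - fq[1] - lin[1])
    return num / math.sqrt(sum(di * di for di in d))

# The ratio shrinks roughly linearly in t, tending to 0 as t -> 0.
for t in (1e-1, 1e-2, 1e-3):
    print(t, error_ratio(t))
```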


You should compare this to the one variable case: a function f : R → R is differentiable at a if lim_{h→0} (f(a + h) − f(a))/h exists, and we call this limit f′(a). But we could equally well say this as: f : R → R is differentiable at a if there is a number, written f′(a), for which

lim_{h→0} |f(a + h) − f(a) − f′(a) · h| / |h| = 0,

because a linear map L : R → R can only operate by multiplication with a number.

How do we easily recognize a differentiable function? If all of the entries (the partial derivatives) of the Jacobian matrix of f are continuous, then f is differentiable.


Example 4 Write the derivative of the function in Example 1 at (1, 2, 3) as a linear map.

Suppose f and g are two differentiable functions from Rm to Rn. It is easy to see that the derivative of f + g is the sum of the derivatives of f and g. We can take the dot product of f and g to get a function from Rm to R, and then differentiate that. The result is a sort of product rule, but I'll leave you to work out what happens. Since we cannot divide vectors, there cannot be a quotient rule; so, of the standard differentiation rules, that leaves the chain rule.


S3: The chain rule.

Now suppose that g : Rm → Rs and f : Rs → Rn. We can form the composition f ◦ g by mapping with g first and then following with f:

x → g(x) → f(g(x)),     (2)

(f ◦ g)(x) := f(g(x)) for all x ∈ Rm.

Example 5 Let g : R2 → R2 and f : R2 → R3 be defined, respectively, by

g(x, y) := ( x + y )   and   f(x, y) := ( sin x )
           (  xy   )                    ( x − y )
                                        (  xy   ).

Then f ◦ g is defined by

(f ◦ g)(x, y) = f(g(x, y)) = f(x + y, xy) = (  sin(x + y)  )
                                            ( x + y − xy   )
                                            ( (x + y)(xy)  ).


Let b = g(p) ∈ Rs. If f and g in (2) above are differentiable then the maps Jp g : Rm → Rs and Jb f : Rs → Rn are defined, and we have the following general result.

Theorem 1 (The Chain Rule) Suppose that g : Rm → Rs and f : Rs → Rn are differentiable. Then

Jp(f ◦ g) = Jg(p) f · Jp g.

This is again just like the one variable case, except that now we are multiplying matrices (see below).


Example 6 Consider Example 5:

g(x, y) = ( x + y )   and   f(x, y) = ( sin x )
          (  xy   )                   ( x − y )
                                      (  xy   ).

Find Jp(f ◦ g), where p = (a1, a2). We have

Jp g = ( 1   1 )        =  ( 1    1  )
       ( y   x ) at p      ( a2   a1 ).

Also

Jg(p) f = ( cos x    0 )
          (   1     −1 )
          (   y      x ) at x = a1 + a2, y = a1a2

        = ( cos(a1 + a2)     0     )
          (      1          −1     )
          (    a1a2      a1 + a2   ).


(Example 6, continued.) And

Jp(f ◦ g) = ( cos(x + y)   cos(x + y) )
            (   1 − y        1 − x    )
            ( 2xy + y²      x² + 2xy  ) at p

          = ( cos(a1 + a2)    cos(a1 + a2) )
            (    1 − a2          1 − a1    )
            ( 2a1a2 + a2²    a1² + 2a1a2   ).

We observe that

( cos(a1 + a2)    cos(a1 + a2) )     ( cos(a1 + a2)     0     )
(    1 − a2          1 − a1    )  =  (      1          −1     )  ·  ( 1    1  )
( 2a1a2 + a2²    a1² + 2a1a2   )     (    a1a2      a1 + a2   )     ( a2   a1 ).
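The chain rule in this example can also be checked numerically. A sketch: both sides of Theorem 1 are built with finite differences at a sample point of our own choosing (here a1 = 1, a2 = 2), and compared entry by entry.

```python
import math

def g(p):
    x, y = p
    return [x + y, x * y]

def f(p):
    x, y = p
    return [math.sin(x), x - y, x * y]

def jac(fun, p, h=1e-6):
    """Central-difference Jacobian of fun at p."""
    n = len(fun(p))
    J = [[0.0] * len(p) for _ in range(n)]
    for j in range(len(p)):
        up = list(p); up[j] += h
        dn = list(p); dn[j] -= h
        fu, fd = fun(up), fun(dn)
        for i in range(n):
            J[i][j] = (fu[i] - fd[i]) / (2 * h)
    return J

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

p = [1.0, 2.0]  # a1 = 1, a2 = 2
lhs = jac(lambda q: f(g(q)), p)        # J_p(f o g), computed directly
rhs = matmul(jac(f, g(p)), jac(g, p))  # J_{g(p)} f . J_p g
# The two 3 x 2 matrices should agree to finite-difference accuracy.
```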


The one variable chain rule is a special case of the chain rule we have just met; the same can be said for the chain rules we saw in earlier sections.

Let x : R → R be a differentiable function of t and let u : R → R be a differentiable function of x. Then (u ◦ x) : R → R is given by (u ◦ x)(t) = u(x(t)). In the notation of this chapter,

Jt(u ◦ x) = Jx(t)u · Jt x,

i.e.,

[ d/dt (u ◦ x) ]_t = [ du/dx ]_{x(t)} [ dx/dt ]_t.

We usually write this as

du/dt = du/dx · dx/dt,

keeping in mind that when we write du/dt we are thinking of u as a function of t, i.e., u(x(t)), and when we write du/dx we are thinking of u as a function of x.


Now suppose we have x = x(t), y = y(t) and z = f(x, y). Then

Jt(f ◦ x) = Jx(t)f · Jt x.

Therefore

d/dt f(x(t), y(t)) = ( ∂f/∂x   ∂f/∂y ) ( dx/dt )
                                       ( dy/dt ),

so that

df/dt = ∂f/∂x dx/dt + ∂f/∂y dy/dt,

which is just what we saw in earlier sections.
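For a concrete check of this formula, here is a sketch with a sample curve and function of our own choosing (x(t) = cos t, y(t) = sin t, f(x, y) = x²y); the direct derivative and the chain-rule expression should agree:

```python
import math

# Sample curve and function: x(t) = cos t, y(t) = sin t, f(x, y) = x^2 * y.
x = lambda t: math.cos(t)
y = lambda t: math.sin(t)
f = lambda x_, y_: x_ ** 2 * y_

def df_dt_direct(t, h=1e-6):
    """d/dt f(x(t), y(t)) by a central difference."""
    return (f(x(t + h), y(t + h)) - f(x(t - h), y(t - h))) / (2 * h)

def df_dt_chain(t):
    """(df/dx) dx/dt + (df/dy) dy/dt, from the exact partials."""
    dfdx = 2 * x(t) * y(t)   # df/dx = 2xy
    dfdy = x(t) ** 2         # df/dy = x^2
    dxdt = -math.sin(t)
    dydt = math.cos(t)
    return dfdx * dxdt + dfdy * dydt

t0 = 0.8
print(df_dt_direct(t0), df_dt_chain(t0))  # the two values agree
```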


S4: Inverse functions.

In first year (or earlier) you will have met the inverse function theorem, which says essentially that if f′(a) is not zero, then there is a differentiable inverse function f⁻¹ defined near f(a) with

[ d/dt (f⁻¹) ]_{f(a)} = 1 / f′(a).

What happens in the multi–variable case?


Let us consider a case where we can write down the inverse. For polar coordinates we have

x = r cos θ,   y = r sin θ,

r = √(x² + y²),   θ = arctan(y/x).

Now, differentiating, we obtain

∂r/∂x = x / √(x² + y²) = r cos θ / r = cos θ   and   ∂x/∂r = cos θ,

i.e.,

∂r/∂x ≠ 1 / (∂x/∂r).

We see that the one variable inverse function theorem does not apply to partial derivatives. However, there is a simple generalisation if we use the multivariable derivative, that is, the Jacobian matrix.


To continue with the polar coordinate example, define

f(r, θ) = ( x(r, θ) )  =  ( r cos θ )     (3)
          ( y(r, θ) )     ( r sin θ )

and

g(x, y) = ( r(x, y) )  =  ( √(x² + y²)  )     (4)
          ( θ(x, y) )     ( arctan(y/x) ).

Consider

(f ◦ g)(x, y) = f(g(x, y)) = f(r, θ) = (x, y) = Id(x, y).

Therefore f ◦ g = Id, the identity operator on R2. Similarly g ◦ f = Id.

Recall that

Id(x, y) = (x, y),   so that   J(Id) = ( 1   0 )
                                       ( 0   1 )  ≡  the 2 × 2 identity matrix.


Thus by the chain rule

Jf · Jg = J(Id) = ( 1   0 )
                  ( 0   1 )  =  Jg · Jf,

so that (Jf)⁻¹ = Jg. (Note that, for simplicity, the points of evaluation have been left out.) Therefore

( ∂r/∂x   ∂r/∂y )⁻¹     ( ∂x/∂r   ∂x/∂θ )
( ∂θ/∂x   ∂θ/∂y )    =  ( ∂y/∂r   ∂y/∂θ ).

We can check this directly by substituting ∂r/∂x = x / √(x² + y²) = cos θ, etc.
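This can also be checked numerically. A sketch: build Jf and Jg at matching points from their exact entries and multiply them; the sample point is arbitrary, chosen with r > 0 and x > 0 so that arctan(y/x) is on the right branch.

```python
import math

def Jf(r, theta):
    """Jacobian of (r, theta) -> (x, y) = (r cos theta, r sin theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def Jg(x, y):
    """Jacobian of (x, y) -> (r, theta) = (sqrt(x^2 + y^2), arctan(y/x))."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],       # dr/dx, dr/dy
            [-y / r2, x / r2]]    # dtheta/dx, dtheta/dy

r, theta = 2.0, 0.5
x, y = r * math.cos(theta), r * math.sin(theta)
A, B = Jf(r, theta), Jg(x, y)
# A . B should be the 2 x 2 identity matrix.
prod = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
```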

The same idea works in general:


Theorem 2 (The Inverse Function Theorem) Let f : Rn → Rn be differentiable at p. If Jp f is an invertible matrix then there is an inverse function f⁻¹ : Rn → Rn defined in some neighbourhood of b = f(p), and

(Jb f⁻¹) = (Jp f)⁻¹.

Note that the inverse function may only exist in a small region around b = f(p).

Example 7 We saw earlier that for polar coordinates, with the notation of equation (3),

Jf = ( cos θ   −r sin θ )
     ( sin θ    r cos θ ),

with determinant r. So it follows from the inverse function theorem that the inverse function g is differentiable if r ≠ 0.


Example 8 The function f : R2 → R2 is given by

f(x, y) = ( u )  =  ( x² − y² )
          ( v )     ( x² + y² ).

Where is f invertible? Find the Jacobian matrix of f⁻¹ where f is invertible.

SOLN: Jf = ( 2x   −2y )
           ( 2x    2y )

and det Jf = 8xy, so f is invertible everywhere except the axes.

Jf⁻¹ = ( 2x   −2y )⁻¹
       ( 2x    2y )

     = 1/(8xy) (  2y   2y )
               ( −2x   2x )

     = 1/4 (  x⁻¹   x⁻¹ )
           ( −y⁻¹   y⁻¹ ).

Translating to (u, v) coordinates (taking x, y > 0, so that x = √((u + v)/2) and y = √((v − u)/2)), this is

Jf⁻¹ = (√2/4) (  (u + v)^(−1/2)   (u + v)^(−1/2) )
              ( −(v − u)^(−1/2)   (v − u)^(−1/2) ).
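A quick numerical check of the (x, y)-form of this inverse, as a sketch at an arbitrary sample point off the axes: the product Jf · Jf⁻¹ should be the identity matrix.

```python
def Jf(x, y):
    # Jacobian of f(x, y) = (x^2 - y^2, x^2 + y^2).
    return [[2 * x, -2 * y],
            [2 * x,  2 * y]]

def Jf_inv(x, y):
    """(1/4) [[1/x, 1/x], [-1/y, 1/y]], valid for x, y != 0."""
    return [[0.25 / x,  0.25 / x],
            [-0.25 / y, 0.25 / y]]

x, y = 1.5, 2.0
A, B = Jf(x, y), Jf_inv(x, y)
prod = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
# prod should be the 2 x 2 identity matrix.
```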


Finally, let us apply the inverse function theorem to the Jacobian determinants. We recall that

∂(r, θ)/∂(x, y) = det Jg = | ∂r/∂x   ∂r/∂y |
                           | ∂θ/∂x   ∂θ/∂y |

and

∂(x, y)/∂(r, θ) = det Jf = | ∂x/∂r   ∂x/∂θ |
                           | ∂y/∂r   ∂y/∂θ |.

Since Jg and Jf are inverse matrices, their determinants are reciprocals:

∂(r, θ)/∂(x, y) = 1 / (∂(x, y)/∂(r, θ)).

This sort of result is true for any change of variable, in any number of dimensions, and will prove very useful in integration.
