Matrix Differentiation

CS5240 Theoretical Foundations in Multimedia

Leow Wee Kheng

Department of Computer Science

School of Computing

National University of Singapore


Linear Fitting Revisited

Linear fitting solves this problem:

Given $n$ data points $\mathbf{p}_i = [x_{i1} \cdots x_{im}]^\top$, $1 \le i \le n$, and their corresponding values $v_i$, find a linear function $f$ that minimizes the error

$$E = \sum_{i=1}^{n} \left( f(\mathbf{p}_i) - v_i \right)^2. \tag{1}$$

The linear function $f(\mathbf{p})$ has the form

$$f(\mathbf{p}) = f(x_1, \ldots, x_m) = a_1 x_1 + \cdots + a_m x_m + a_{m+1}. \tag{2}$$

The data points are organized into a matrix equation

$$\mathbf{D}\mathbf{a} = \mathbf{v}, \tag{3}$$

where

$$\mathbf{D} = \begin{bmatrix} x_{11} & \cdots & x_{1m} & 1 \\ \vdots & \ddots & \vdots & \vdots \\ x_{n1} & \cdots & x_{nm} & 1 \end{bmatrix}, \quad \mathbf{a} = \begin{bmatrix} a_1 \\ \vdots \\ a_m \\ a_{m+1} \end{bmatrix}, \quad \text{and} \quad \mathbf{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}. \tag{4}$$

The least-squares solution of Eq. 3 is

$$\mathbf{a} = (\mathbf{D}^\top \mathbf{D})^{-1} \mathbf{D}^\top \mathbf{v}. \tag{5}$$
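As a minimal sketch of Eq. 5, the following Python snippet builds D from a few data points and solves the normal equations (D⊤D)a = D⊤v. The helper names (`transpose`, `matmul`, `solve`, `fit_linear`) and the sample data are made up for illustration and are not part of the lecture. It solves the linear system by Gaussian elimination rather than forming the inverse explicitly.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fit_linear(points, values):
    """Return a = [a_1, ..., a_m, a_{m+1}] solving (D^T D) a = D^T v."""
    D = [list(p) + [1.0] for p in points]  # append the constant column of ones
    Dt = transpose(D)
    DtD = matmul(Dt, D)
    Dtv = [sum(d * v for d, v in zip(row, values)) for row in Dt]
    return solve(DtD, Dtv)

# Hypothetical data lying exactly on v = 2x + 1 (so m = 1).
points = [(0.0,), (1.0,), (2.0,), (3.0,)]
values = [1.0, 3.0, 5.0, 7.0]
a = fit_linear(points, values)
print(a)  # approximately [2.0, 1.0]
```

The sketch assumes D⊤D is invertible, which requires at least m + 1 data points in general position.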

Denote each row of $\mathbf{D}$ as $\mathbf{d}_i^\top$. Then,

$$E = \sum_{i=1}^{n} (\mathbf{d}_i^\top \mathbf{a} - v_i)^2 = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{6}$$

So, the linear least squares problem can be described very compactly as

$$\min_{\mathbf{a}} \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{7}$$

To show that the solution in Eq. 5 minimizes the error $E$, we need to differentiate $E$ with respect to $\mathbf{a}$ and set the derivative to zero:

$$\frac{dE}{d\mathbf{a}} = \mathbf{0}. \tag{8}$$

How to do this differentiation?

The obvious (but hard) way:

$$E = \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr)^2. \tag{9}$$

Expanding the equation explicitly gives

$$\frac{\partial E}{\partial a_k} = \begin{cases} 2 \displaystyle\sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr) x_{ik}, & \text{for } k \ne m+1, \\ 2 \displaystyle\sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr), & \text{for } k = m+1. \end{cases}$$

Then, set $\partial E / \partial a_k = 0$ and solve for $a_k$. This is slow, tedious and error prone!

[Figure slide: Which one do you like to be?]

[Figure slide: At least like these?]

Matrix Derivatives

There are 6 common types of matrix derivatives. Columns give the type of the quantity being differentiated; rows give the type of the variable.

$$\begin{array}{l|ccc} \text{Type} & \text{Scalar} & \text{Vector} & \text{Matrix} \\ \hline \text{Scalar} & \frac{\partial y}{\partial x} & \frac{\partial \mathbf{y}}{\partial x} & \frac{\partial \mathbf{Y}}{\partial x} \\ \text{Vector} & \frac{\partial y}{\partial \mathbf{x}} & \frac{\partial \mathbf{y}}{\partial \mathbf{x}} & \\ \text{Matrix} & \frac{\partial y}{\partial \mathbf{X}} & & \end{array}$$

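Derivatives by Scalar

Numerator layout notation:

$$\frac{\partial y}{\partial x}, \qquad \frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix}, \qquad \frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial x}, \qquad \frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} & \cdots & \frac{\partial y_m}{\partial x} \end{bmatrix} \equiv \frac{\partial \mathbf{y}^\top}{\partial x}.$$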

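Derivatives by Vector

Numerator layout notation:

$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} & \cdots & \frac{\partial y}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix} \equiv \frac{\partial \mathbf{y}}{\partial \mathbf{x}^\top}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix} \equiv \frac{\partial \mathbf{y}^\top}{\partial \mathbf{x}}.$$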

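Derivative by Matrix

Numerator layout notation:

$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{m1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1n}} & \cdots & \frac{\partial y}{\partial x_{mn}} \end{bmatrix} \equiv \frac{\partial y}{\partial \mathbf{X}^\top}.$$

Denominator layout notation:

$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{1n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{m1}} & \cdots & \frac{\partial y}{\partial x_{mn}} \end{bmatrix} \equiv \frac{\partial y}{\partial \mathbf{X}}.$$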

Pictorial Representation

[Figure: pictorial comparison of numerator layout and denominator layout.]

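Caution

◮ Most books and papers don't state which convention they use.

◮ Reference [2] uses both conventions but clearly differentiates them:

$$\frac{\partial y}{\partial \mathbf{x}^\top} = \begin{bmatrix} \frac{\partial y}{\partial x_1} & \cdots & \frac{\partial y}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix},$$

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}^\top} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial \mathbf{y}^\top}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}.$$

◮ It is best not to mix the two conventions in your equations.

◮ We adopt numerator layout notation.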
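To make the difference between the two layouts concrete, here is a minimal Python sketch that estimates a vector-by-vector derivative by central differences. The function `f` and the helper name `jacobian_numerator` are hypothetical, chosen only for illustration. In numerator layout the result has one row per component of y; the denominator layout result is its transpose.

```python
def jacobian_numerator(f, x, h=1e-6):
    """Central-difference Jacobian in numerator layout: J[i][j] = dy_i / dx_j."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def f(x):
    # Hypothetical example: y in R^3 as a function of x in R^2.
    return [x[0] * x[1], x[0] + 2.0 * x[1], x[1] ** 2]

J_num = jacobian_numerator(f, [1.0, 2.0])   # 3 x 2 (m x n), numerator layout
J_den = [list(col) for col in zip(*J_num)]  # 2 x 3 (n x m), denominator layout

print(len(J_num), len(J_num[0]))  # 3 2
print(len(J_den), len(J_den[0]))  # 2 3
```

At x = (1, 2) the exact numerator-layout Jacobian is [[2, 1], [1, 2], [0, 4]], which the finite-difference estimate should match up to rounding error.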

Commonly Used Derivatives

Here, scalar $a$, vector $\mathbf{a}$ and matrix $\mathbf{A}$ are not functions of $x$ and $\mathbf{x}$.

$$\frac{d\mathbf{a}}{dx} = \mathbf{0} \quad \text{(column matrix)} \tag{C1}$$

$$\frac{da}{d\mathbf{x}} = \mathbf{0}^\top \quad \text{(row matrix)} \tag{C2}$$

$$\frac{da}{d\mathbf{X}} = \mathbf{0}^\top \quad \text{(matrix)} \tag{C3}$$

$$\frac{d\mathbf{a}}{d\mathbf{x}} = \mathbf{0} \quad \text{(matrix)} \tag{C4}$$

$$\frac{d\mathbf{x}}{d\mathbf{x}} = \mathbf{I} \tag{C5}$$

$$\frac{d\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \frac{d\mathbf{x}^\top\mathbf{a}}{d\mathbf{x}} = \mathbf{a}^\top \tag{C6}$$

$$\frac{d\mathbf{x}^\top\mathbf{x}}{d\mathbf{x}} = 2\mathbf{x}^\top \tag{C7}$$

$$\frac{d(\mathbf{x}^\top\mathbf{a})^2}{d\mathbf{x}} = 2\mathbf{x}^\top\mathbf{a}\mathbf{a}^\top \tag{C8}$$

$$\frac{d\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{A} \tag{C9}$$

$$\frac{d\mathbf{x}^\top\mathbf{A}}{d\mathbf{x}} = \mathbf{A}^\top \tag{C10}$$

$$\frac{d\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top) \tag{C11}$$
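A small worked instance may help in reading (C11). The 2×2 matrix below is an arbitrary choice made for illustration, not taken from the lecture; the result is written as a row, following numerator layout.

```latex
\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\;\Longrightarrow\;
\mathbf{x}^\top \mathbf{A} \mathbf{x} = x_1^2 + 5 x_1 x_2 + 4 x_2^2 .

% Differentiating component by component:
\frac{d\,\mathbf{x}^\top \mathbf{A} \mathbf{x}}{d\mathbf{x}}
= \begin{bmatrix} 2 x_1 + 5 x_2 & 5 x_1 + 8 x_2 \end{bmatrix} .

% The same row obtained from (C11):
\mathbf{x}^\top (\mathbf{A} + \mathbf{A}^\top)
= \begin{bmatrix} x_1 & x_2 \end{bmatrix}
  \begin{bmatrix} 2 & 5 \\ 5 & 8 \end{bmatrix}
= \begin{bmatrix} 2 x_1 + 5 x_2 & 5 x_1 + 8 x_2 \end{bmatrix} .
```

Both routes give the same row matrix, as (C11) predicts.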

Math Notation

We represent a vector $\mathbf{x}$ as a column matrix

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}.$$

Its transpose $\mathbf{x}^\top$ is a row matrix

$$\mathbf{x}^\top = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix}.$$

Consider two vectors $\mathbf{x}$ and $\mathbf{y}$ with the same number of components. Their inner product $\mathbf{x}^\top\mathbf{y}$ is actually a 1×1 matrix:

$$\mathbf{x}^\top\mathbf{y} = [\, s \,]$$

where

$$s = \sum_{i=1}^{m} x_i y_i.$$

For notational convenience, we usually drop the matrix brackets and regard the inner product as a scalar, i.e.,

$$\mathbf{x}^\top\mathbf{y} = \sum_{i=1}^{m} x_i y_i.$$

Derivatives of Scalar by Scalar

$$\frac{\partial (u + v)}{\partial x} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial x} \tag{SS1}$$

$$\frac{\partial uv}{\partial x} = u \frac{\partial v}{\partial x} + v \frac{\partial u}{\partial x} \quad \text{(product rule)} \tag{SS2}$$

$$\frac{\partial g(u)}{\partial x} = \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial x} \quad \text{(chain rule)} \tag{SS3}$$

$$\frac{\partial f(g(u))}{\partial x} = \frac{\partial f(g)}{\partial g} \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial x} \quad \text{(chain rule)} \tag{SS4}$$

Derivatives of Vector by Scalar

$$\frac{\partial a\mathbf{u}}{\partial x} = a \frac{\partial \mathbf{u}}{\partial x} \tag{VS1}$$

where $a$ is not a function of $x$.

$$\frac{\partial \mathbf{A}\mathbf{u}}{\partial x} = \mathbf{A} \frac{\partial \mathbf{u}}{\partial x} \tag{VS2}$$

where $\mathbf{A}$ is not a function of $x$.

$$\frac{\partial \mathbf{u}^\top}{\partial x} = \left( \frac{\partial \mathbf{u}}{\partial x} \right)^\top \tag{VS3}$$

$$\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial x} = \frac{\partial \mathbf{u}}{\partial x} + \frac{\partial \mathbf{v}}{\partial x} \tag{VS4}$$

$$\frac{\partial \mathbf{g}(\mathbf{u})}{\partial x} = \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial x} \quad \text{(chain rule)} \tag{VS5}$$

with consistent matrix layout.

$$\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial x} = \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial x} \quad \text{(chain rule)} \tag{VS6}$$

with consistent matrix layout.

Derivatives of Matrix by Scalar

$$\frac{\partial a\mathbf{U}}{\partial x} = a \frac{\partial \mathbf{U}}{\partial x} \tag{MS1}$$

where $a$ is not a function of $x$.

$$\frac{\partial \mathbf{A}\mathbf{U}\mathbf{B}}{\partial x} = \mathbf{A} \frac{\partial \mathbf{U}}{\partial x} \mathbf{B} \tag{MS2}$$

where $\mathbf{A}$ and $\mathbf{B}$ are not functions of $x$.

$$\frac{\partial (\mathbf{U} + \mathbf{V})}{\partial x} = \frac{\partial \mathbf{U}}{\partial x} + \frac{\partial \mathbf{V}}{\partial x} \tag{MS3}$$

$$\frac{\partial \mathbf{U}\mathbf{V}}{\partial x} = \mathbf{U} \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \mathbf{V} \quad \text{(product rule)} \tag{MS4}$$

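Derivatives of Scalar by Vector

$$\frac{\partial au}{\partial \mathbf{x}} = a \frac{\partial u}{\partial \mathbf{x}} \tag{SV1}$$

where $a$ is not a function of $\mathbf{x}$.

$$\frac{\partial (u + v)}{\partial \mathbf{x}} = \frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial \mathbf{x}} \tag{SV2}$$

$$\frac{\partial uv}{\partial \mathbf{x}} = u \frac{\partial v}{\partial \mathbf{x}} + v \frac{\partial u}{\partial \mathbf{x}} \quad \text{(product rule)} \tag{SV3}$$

$$\frac{\partial g(u)}{\partial \mathbf{x}} = \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{x}} \quad \text{(chain rule)} \tag{SV4}$$

$$\frac{\partial f(g(u))}{\partial \mathbf{x}} = \frac{\partial f(g)}{\partial g} \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{x}} \quad \text{(chain rule)} \tag{SV5}$$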

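$$\frac{\partial \mathbf{u}^\top\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top \frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \quad \text{(product rule)} \tag{SV6}$$

where $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout.

$$\frac{\partial \mathbf{u}^\top\mathbf{A}\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top\mathbf{A} \frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top\mathbf{A}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \quad \text{(product rule)} \tag{SV7}$$

where $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout, and $\mathbf{A}$ is not a function of $\mathbf{x}$.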

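Derivatives of Scalar by Matrix

$$\frac{\partial au}{\partial \mathbf{X}} = a \frac{\partial u}{\partial \mathbf{X}} \tag{SM1}$$

where $a$ is not a function of $\mathbf{X}$.

$$\frac{\partial (u + v)}{\partial \mathbf{X}} = \frac{\partial u}{\partial \mathbf{X}} + \frac{\partial v}{\partial \mathbf{X}} \tag{SM2}$$

$$\frac{\partial uv}{\partial \mathbf{X}} = u \frac{\partial v}{\partial \mathbf{X}} + v \frac{\partial u}{\partial \mathbf{X}} \quad \text{(product rule)} \tag{SM3}$$

$$\frac{\partial g(u)}{\partial \mathbf{X}} = \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{X}} \quad \text{(chain rule)} \tag{SM4}$$

$$\frac{\partial f(g(u))}{\partial \mathbf{X}} = \frac{\partial f(g)}{\partial g} \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{X}} \quad \text{(chain rule)} \tag{SM5}$$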

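Derivatives of Vector by Vector

$$\frac{\partial a\mathbf{u}}{\partial \mathbf{x}} = a \frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \mathbf{u} \frac{\partial a}{\partial \mathbf{x}} \quad \text{(product rule)} \tag{VV1}$$

$$\frac{\partial \mathbf{A}\mathbf{u}}{\partial \mathbf{x}} = \mathbf{A} \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \tag{VV2}$$

where $\mathbf{A}$ is not a function of $\mathbf{x}$.

$$\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}} \tag{VV3}$$

$$\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{x}} = \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \quad \text{(chain rule)} \tag{VV4}$$

$$\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \quad \text{(chain rule)} \tag{VV5}$$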

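Notes on Denominator Layout

In some cases, the results of denominator layout are the transpose of those of numerator layout. Moreover, the chain rule for denominator layout goes from right to left instead of left to right.

$$\begin{array}{lll} & \text{Numerator Layout Notation} & \text{Denominator Layout Notation} \\[4pt] \text{(C6)} & \frac{d\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \mathbf{a}^\top & \frac{d\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \mathbf{a} \\[8pt] \text{(C11)} & \frac{d\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top) & \frac{d\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = (\mathbf{A} + \mathbf{A}^\top)\mathbf{x} \\[8pt] \text{(VV5)} & \frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial \mathbf{x}} & \frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \end{array}$$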

Derivations of Derivatives

$$\frac{d\mathbf{a}^\top\mathbf{x}}{d\mathbf{x}} = \frac{d\mathbf{x}^\top\mathbf{a}}{d\mathbf{x}} = \mathbf{a}^\top \tag{C6}$$

(The not-so-hard way)

Let $s = \mathbf{a}^\top\mathbf{x} = a_1 x_1 + \cdots + a_n x_n$. Then, $\frac{\partial s}{\partial x_i} = a_i$. So, $\frac{ds}{d\mathbf{x}} = \mathbf{a}^\top$.

(The easier way)

Let $s = \mathbf{a}^\top\mathbf{x} = \sum_i a_i x_i$. Then, $\frac{\partial s}{\partial x_i} = a_i$. So, $\frac{ds}{d\mathbf{x}} = \mathbf{a}^\top$.

$$\frac{d\mathbf{x}^\top\mathbf{x}}{d\mathbf{x}} = 2\mathbf{x}^\top \tag{C7}$$

Let $s = \mathbf{x}^\top\mathbf{x} = \sum_i x_i^2$. Then, $\frac{\partial s}{\partial x_i} = 2x_i$. So, $\frac{ds}{d\mathbf{x}} = 2\mathbf{x}^\top$.

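$$\frac{d(\mathbf{x}^\top\mathbf{a})^2}{d\mathbf{x}} = 2\mathbf{x}^\top\mathbf{a}\mathbf{a}^\top \tag{C8}$$

Let $s = \mathbf{x}^\top\mathbf{a}$. Then, $\frac{\partial s^2}{\partial x_i} = 2s \frac{\partial s}{\partial x_i} = 2 s a_i$. So, $\frac{ds^2}{d\mathbf{x}} = 2\mathbf{x}^\top\mathbf{a}\mathbf{a}^\top$.

$$\frac{d\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{A} \tag{C9}$$

(The hard way)

$$\mathbf{A}\mathbf{x} = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{n1} x_1 + \cdots + a_{nn} x_n \end{bmatrix}.$$

(The easy way)

Let $\mathbf{s} = \mathbf{A}\mathbf{x}$. Then, $s_i = \sum_j a_{ij} x_j$, and $\frac{\partial s_i}{\partial x_j} = a_{ij}$. So, $\frac{d\mathbf{s}}{d\mathbf{x}} = \mathbf{A}$.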

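$$\frac{d\mathbf{x}^\top\mathbf{A}}{d\mathbf{x}} = \mathbf{A}^\top \tag{C10}$$

Let $\mathbf{y}^\top = \mathbf{x}^\top\mathbf{A}$, and let $\mathbf{a}_i$ denote the $i$-th column of $\mathbf{A}$. Then, $y_i = \mathbf{x}^\top\mathbf{a}_i$. Applying (C6) yields $\frac{dy_i}{d\mathbf{x}} = \mathbf{a}_i^\top$. So, $\frac{d\mathbf{y}^\top}{d\mathbf{x}} = \mathbf{A}^\top$.

$$\frac{d\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top) \tag{C11}$$

Apply (SV6), with $\mathbf{u} = \mathbf{x}$ and $\mathbf{v} = \mathbf{A}\mathbf{x}$, to $\frac{d\mathbf{x}^\top\mathbf{A}\mathbf{x}}{d\mathbf{x}}$ and obtain

$$\mathbf{x}^\top \frac{d\mathbf{A}\mathbf{x}}{d\mathbf{x}} + (\mathbf{A}\mathbf{x})^\top \frac{d\mathbf{x}}{d\mathbf{x}}.$$

Next, apply (C9) to the first part of the sum and (C5) to the second, and obtain $\mathbf{x}^\top\mathbf{A} + (\mathbf{A}\mathbf{x})^\top$, which is $\mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$.

(Need to prove SV6: Homework.)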
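One simple way to sanity-check a derived formula such as (C11) is to compare it with a finite-difference estimate. The sketch below does this in plain Python; the matrix `A`, the point `x`, and the helper names (`quad_form`, `grad_fd`, `grad_c11`) are arbitrary choices made for illustration.

```python
def quad_form(A, x):
    """Return the scalar x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def grad_fd(f, x, h=1e-6):
    """Central-difference estimate of df/dx, returned as a row (list)."""
    g = []
    for k in range(len(x)):
        xp = list(x)
        xp[k] += h
        xm = list(x)
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def grad_c11(A, x):
    """Row x^T (A + A^T) from (C11): entry k is sum_i x_i (A[i][k] + A[k][i])."""
    n = len(x)
    return [sum(x[i] * (A[i][k] + A[k][i]) for i in range(n)) for k in range(n)]

# Arbitrary non-symmetric test matrix and evaluation point.
A = [[1.0, 2.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, 2.0]]
x = [1.0, -2.0, 0.5]

numeric = grad_fd(lambda z: quad_form(A, z), x)
analytic = grad_c11(A, x)
max_diff = max(abs(p - q) for p, q in zip(numeric, analytic))
print(max_diff)  # expected to be tiny (rounding error only)
```

Using a non-symmetric A matters here: with a symmetric matrix, the common mistake of writing 2x⊤A would go unnoticed.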

Linear Fitting Revisited

Now, let us show that the solution

$$\mathbf{a} = (\mathbf{D}^\top\mathbf{D})^{-1}\mathbf{D}^\top\mathbf{v}$$

minimizes the error

$$E = \sum_{i=1}^{n} (\mathbf{d}_i^\top\mathbf{a} - v_i)^2 = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2.$$

Proof:

$$\begin{aligned} E = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2 &= (\mathbf{D}\mathbf{a} - \mathbf{v})^\top(\mathbf{D}\mathbf{a} - \mathbf{v}) \\ &= (\mathbf{a}^\top\mathbf{D}^\top - \mathbf{v}^\top)(\mathbf{D}\mathbf{a} - \mathbf{v}) \\ &= \mathbf{a}^\top\mathbf{D}^\top\mathbf{D}\mathbf{a} - \mathbf{a}^\top\mathbf{D}^\top\mathbf{v} - \mathbf{v}^\top\mathbf{D}\mathbf{a} + \mathbf{v}^\top\mathbf{v}. \end{aligned}$$

Apply (C11), (C6), (C9) and (C2) to the four terms.

$$\begin{aligned} \frac{dE}{d\mathbf{a}} &= \mathbf{a}^\top(\mathbf{D}^\top\mathbf{D} + \mathbf{D}^\top\mathbf{D}) - (\mathbf{D}^\top\mathbf{v})^\top - \mathbf{v}^\top\mathbf{D} + \mathbf{0}^\top \\ &= 2\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} - 2\mathbf{v}^\top\mathbf{D}. \end{aligned}$$

Set $dE/d\mathbf{a} = \mathbf{0}^\top$ and obtain

$$\begin{aligned} 2\mathbf{a}^\top\mathbf{D}^\top\mathbf{D} - 2\mathbf{v}^\top\mathbf{D} &= \mathbf{0}^\top \\ \mathbf{a}^\top\mathbf{D}^\top\mathbf{D} &= \mathbf{v}^\top\mathbf{D}. \end{aligned}$$

Transpose both sides of the equation and get

$$\begin{aligned} \mathbf{D}^\top\mathbf{D}\mathbf{a} &= \mathbf{D}^\top\mathbf{v} \\ \mathbf{a} &= (\mathbf{D}^\top\mathbf{D})^{-1}\mathbf{D}^\top\mathbf{v}. \qquad \blacksquare \end{aligned}$$
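The result can also be checked numerically. The following Python sketch, under the assumption of a small made-up data set with m = 1, evaluates the gradient 2a⊤D⊤D − 2v⊤D (written equivalently as 2(Da − v)⊤D) at the solution of the normal equations and confirms that it vanishes there. The data and the names `error`, `gradient` and `a_star` are illustrative only.

```python
# Made-up noisy samples of roughly v = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
vs = [1.1, 2.9, 5.2, 6.8, 9.1]
D = [[x, 1.0] for x in xs]  # each row is d_i^T = [x_i, 1]

def error(a):
    """E = ||Da - v||^2."""
    return sum((d[0] * a[0] + d[1] * a[1] - v) ** 2 for d, v in zip(D, vs))

def gradient(a):
    """dE/da = 2 a^T D^T D - 2 v^T D = 2 (Da - v)^T D, as a row of length 2."""
    r = [d[0] * a[0] + d[1] * a[1] - v for d, v in zip(D, vs)]
    return [2.0 * sum(ri * d[k] for ri, d in zip(r, D)) for k in range(2)]

# Solve the 2x2 normal equations D^T D a = D^T v by Cramer's rule.
s_xx = sum(x * x for x in xs)
s_x = sum(xs)
n = float(len(xs))
s_xv = sum(x * v for x, v in zip(xs, vs))
s_v = sum(vs)
det = s_xx * n - s_x * s_x
a_star = [(s_xv * n - s_x * s_v) / det, (s_xx * s_v - s_x * s_xv) / det]

print(a_star)            # approximately [1.99, 1.04]
print(gradient(a_star))  # both entries close to zero
print(error(a_star) <= error([a_star[0] + 0.1, a_star[1] - 0.1]))  # True
```

A vanishing gradient alone only identifies a stationary point; because E is a sum of squares and therefore convex in a, that stationary point is a minimum, which the perturbation check in the last line illustrates.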

Summary

◮ Matrix calculus studies calculus of matrices.

◮ There are 6 common derivatives of matrices.

◮ There are 2 competing notational conventions: numerator layout notation vs. denominator layout notation.

◮ We adopt numerator layout notation.

◮ Do not mix the two conventions in your equations.

◮ Use matrix differentiation to prove that the pseudo-inverse minimizes the sum square error.

Use the right tool and become lightning fast!

Probing Questions

◮ Is there a simple way to double check that the derivative result makes sense?

◮ Why do we use sum square error for linear fitting? Can we use other forms of errors?

◮ Six common types of matrix derivatives are discussed. Three other types are left out. Can we work out the other derivatives, e.g., derivatives of vector by matrix or matrix by matrix?

Homework

1. What are the key concepts that you have learned?

2. Prove the product rule SV3 using scalar product rule SS2.

$$\frac{\partial uv}{\partial \mathbf{x}} = u \frac{\partial v}{\partial \mathbf{x}} + v \frac{\partial u}{\partial \mathbf{x}} \tag{SV3}$$

3. Prove the product rule SV6 using SV3.

$$\frac{\partial \mathbf{u}^\top\mathbf{v}}{\partial \mathbf{x}} = \mathbf{u}^\top \frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \tag{SV6}$$

where $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ and $\frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ are in numerator layout.

4. Q2 of AY2015/16 Final Evaluation.

References

1. J. E. Gentle, Matrix Algebra: Theory, Computations, and Applications in Statistics, Springer, 2007.

2. H. Lütkepohl, Handbook of Matrices, John Wiley & Sons, 1996.

3. K. B. Petersen and M. S. Pedersen, The Matrix Cookbook, 2012. www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf

4. Wikipedia, Matrix Calculus. en.wikipedia.org/wiki/Matrix_calculus