
Chapter 17
Full Rank Decomposition

The lion and the calf shall lie down together, but the calf won’t get much sleep.

Woody Allen: The Scrolls

This chapter shows how helpful it is to express a matrix A as a product UV′, where both U and V have full column ranks.

Theorem 17 (Full rank decomposition). Let A be an n×m matrix with rank r > 0. Then A can be written as a product

A = UV′, (17.1)

where
rank(Un×r) = rank(Vm×r) = r , (17.2)

i.e., U and V have full column ranks.

Proof. Let U be an n × r matrix whose columns form a basis for the column space of A. Then every vector in the column space of A can be expressed as a linear combination of the columns of U. In particular, every column ai of A can be written as

ai = Uvi , i = 1, . . . ,m , (17.3)

for some vi ∈ Rr (which is unique after fixing the basis U). Hence there exists a matrix Vm×r such that

A = UV′ = U(v1 : . . . : vm) . (17.4)

Because
r = rank(A) = rank(UV′) ≤ rank(V′) ≤ r , (17.5)

we observe that V has full column rank, and the theorem is proved. □

We note that in the full rank decomposition (17.1) the columns of U form a basis for the column space C (A), and clearly the columns of V form a basis for C (A′):

C (U) = C (A) , C (V) = C (A′) . (17.6)

Sometimes it is convenient to choose the columns of U or V orthonormal.
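As a numerical aside (our own sketch, not from the book), such a decomposition with orthonormal U can be computed from the SVD; the helper name `full_rank_decomposition` is ours:

```python
import numpy as np

def full_rank_decomposition(A, tol=1e-10):
    """Return (U, V) with A = U @ V.T, both of full column rank r.

    The first r left singular vectors give an orthonormal basis U of
    C(A); V absorbs the singular values, so its columns span C(A').
    """
    Uf, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol))          # numerical rank
    U = Uf[:, :r]                     # orthonormal columns spanning C(A)
    V = Vt[:r, :].T * s[:r]           # full column rank, spans C(A')
    return U, V

# Example: a 4 x 3 matrix of rank 2.
A = np.outer([1., 2, 0, 1], [1., 0, 1]) + np.outer([0., 1, 1, 0], [0., 1, 1])
U, V = full_rank_decomposition(A)
assert np.allclose(A, U @ V.T)
assert U.shape == (4, 2) and V.shape == (3, 2)
```

Any other basis of C (A) works equally well; the SVD merely makes the choice of an orthonormal U mentioned above concrete.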

S. Puntanen et al., Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty, DOI 10.1007/978-3-642-10473-2_1, © Springer-Verlag Berlin Heidelberg 2011


It is worth emphasizing that the full rank decomposition is (like some other tricks in this book) mathematically very simple, but it can be an amazingly handy tool in appropriate situations. As references, we may mention Marsaglia & Styan (1974a, Th. 1), Bhimasankaram (1988), and Piziak & Odell (1999).

Photograph 17.1 Pochiraju Bhimasankaram (Hyderabad, 2007).

17.1 Some Properties of an Idempotent Matrix

In this section we consider three properties of an idempotent matrix An×n that can be easily proved using the full rank decomposition

A = UV′, (17.7)

where rank(Un×r) = rank(Vn×r) = r. The first property is the following:

Proposition 17.1. With the above notation,

A = A2 ⇐⇒ V′U = Ir . (17.8)

Proof. To prove (17.8), we first assume that A is idempotent:

UV′ = UV′UV′ . (17.9)

Premultiplying (17.9) by the matrix (U′U)−1U′ yields the equality

V′ = V′UV′ . (17.10)

Now postmultiplying (17.10) by V(V′V)−1 gives our claim V′U = Ir. On the other hand, if V′U = Ir, then

A2 = UV′UV′ = UIrV′ = A , (17.11)

and thus (17.8) is proved. □

As a second result, we prove the following implication:

Proposition 17.2.

A = A2 =⇒ rank(A) = tr(A) . (17.12)

Proof. This comes at once from (17.8):

tr(A) = tr(UV′) = tr(V′U) = tr(Ir) = r = rank(A) . (17.13)

□
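Both propositions are easy to check numerically; the following sketch (ours, not from the book) takes an orthogonal projector, forms its full rank decomposition via the SVD, and verifies V′U = Ir and tr = rank:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
P = X @ np.linalg.solve(X.T @ X, X.T)     # projector onto C(X); idempotent

# Full rank decomposition P = U V' with orthonormal U, via the SVD.
Uf, s, Vt = np.linalg.svd(P)
r = int(np.sum(s > 1e-10))                # numerical rank of P
U = Uf[:, :r]
V = Vt[:r, :].T * s[:r]                   # so that P == U @ V.T

assert np.allclose(P @ P, P)              # P is idempotent
assert np.allclose(V.T @ U, np.eye(r))    # Proposition 17.1
assert round(np.trace(P)) == r            # Proposition 17.2: tr(P) = rank(P)
```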

Also the following result can be easily proved using the full rank decomposition (see Groß, Trenkler & Troschke 1997):

Proposition 17.3. Let A = A2. Then

A = A′ ⇐⇒ C (A) = C (A′) . (17.14)

Proof. The idempotency of A implies that A has a full rank decomposition A = UV′, where V′U = Ir, and

C (A) = C (U) , C (A′) = C (V) . (17.15)

If C (A) = C (A′), then the orthogonal projectors onto C (A) = C (U) and onto C (A′) = C (V) must be identical, i.e.,

U(U′U)−1U′ = V(V′V)−1V′. (17.16)

Premultiplying (17.16) by UV′ we obtain, using V′U = Ir,

U(U′U)−1U′ = UV′ = A , (17.17)

and so A indeed is symmetric. The converse is immediate, since A = A′ trivially implies C (A) = C (A′). □
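A small numerical illustration of Proposition 17.3 (ours, not from the book): an oblique projector is idempotent but not symmetric, so its row and column spaces must differ, while an orthogonal projector has C (A) = C (A′).

```python
import numpy as np

# Oblique projector: idempotent but NOT symmetric, so by Proposition 17.3
# its column and row spaces must differ.
A = np.array([[1., 1.],
              [0., 0.]])
assert np.allclose(A @ A, A)          # idempotent
assert not np.allclose(A, A.T)        # not symmetric
# Indeed C(A) = span{(1,0)'} while C(A') = span{(1,1)'}.

# Orthogonal projector onto span{(1,1)'}: symmetric idempotent, C(P) = C(P').
P = np.full((2, 2), 0.5)
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
```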

17.2 Rank Additivity

We prove the following result:

Proposition 17.4. Let A and B be non-null n × m matrices, and let rank(A) = a, rank(B) = b. Then the following statements are equivalent:

(a) rank(A + B) = rank(A) + rank(B),
(b) dim C (A) ∩ C (B) = dim C (A′) ∩ C (B′) = 0.

Proof. Assume first that (a) holds. Then, in view of

rk(A) + rk(B) = rk(A + B) = rk[(A : B)(Im : Im)′] ≤ rk(A : B)
= rk(A) + rk(B) − dim C (A) ∩ C (B) , (17.18a)

rk(A) + rk(B) = rk(A + B) = rk[(In : In)(A′ : B′)′] ≤ rk[(A′ : B′)′]
= rk(A) + rk(B) − dim C (A′) ∩ C (B′) , (17.18b)

it is clear that (a) implies (b).


To go the other way round, let A and B have the full rank decompositions

A = A1A′2 , B = B1B′2 , (17.19)

where

C (A) = C (A1) , A1 ∈ Rn×a, C (B) = C (B1) , B1 ∈ Rn×b, (17.20a)
C (A′) = C (A2) , A2 ∈ Rm×a, C (B′) = C (B2) , B2 ∈ Rm×b. (17.20b)

Then

A + B = A1A′2 + B1B′2 = (A1 : B1)(A2 : B2)′ := UV′ . (17.21)

In view of the disjointness assumption C (A2) ∩ C (B2) = {0}, we have

rk(Vm×(a+b)) = rk(A2 : B2) = rk(A2) + rk(B2) = a + b , (17.22)

which means that V has full column rank. This further implies that rk(U) = rk(UV′) since

rk(U) ≥ rk(UV′) ≥ rk[UV′V(V′V)−1] = rk(U) . (17.23)

Hence we have

rk(A + B) = rk(UV′) = rk(U) = rk(A1 : B1) , (17.24)

which, in light of the disjointness assumption C (A1) ∩ C (B1) = {0}, yields

rk(A + B) = rk(A1 : B1) = rk(A1) + rk(B1) , (17.25)

and hence our claim is proved. □

As references to Proposition 17.4, we may mention Marsaglia & Styan (1972), and Marsaglia & Styan (1974a, Th. 11); see also Rao & Bhimasankaram (2000, p. 132).
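Proposition 17.4 is also easy to check numerically; the matrices below are our own toy examples:

```python
import numpy as np

rk = np.linalg.matrix_rank

# Disjoint column spaces AND disjoint row spaces: the ranks add up.
A = np.outer([1., 0, 0], [1., 0, 0, 0])     # rank 1
B = np.outer([0., 1, 0], [0., 1, 1, 0])     # rank 1, spaces disjoint from A's
assert rk(A + B) == rk(A) + rk(B) == 2

# Overlapping column spaces: additivity fails.
C = np.outer([1., 0, 0], [0., 1, 0, 0])     # C(C) == C(A)
assert rk(A + C) == 1 < rk(A) + rk(C)
```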

17.3 Cochran’s Theorem: a Simple Version

In this example we consider the following simple version of Cochran’s Theorem; for extended versions, see, e.g., Marsaglia & Styan (1974a), Anderson & Styan (1982), and Bapat (2000, p. 60).

Proposition 17.5. Let A and B be n× n matrices satisfying the condition

A + B = In . (17.26)


Then the following statements are equivalent:

(a) rank(A) + rank(B) = n,
(b) A2 = A and B2 = B,
(c) AB = 0.

Proof. “(c) =⇒ (b)”. If (c) holds, then, in view of (17.26), B = In−A, and

AB = A(In −A) = A−A2 = 0 , (17.27)

and hence A = A2 (and similarly B = B2).

“(b) =⇒ (a)”. If (b) holds, then rank(A) = tr(A) and rank(B) = tr(B),

and (17.26) implies

n = tr(In) = tr(A + B) = tr(A) + tr(B) = rank(A) + rank(B) . (17.28)

“(a) =⇒ (c)”. We assume that a + b = n, where a = rank(A) and b = rank(B) = n − a. Consider the full rank decompositions of A and B:

A = A1A′2 , A1 ∈ Rn×a , B = B1B′2 , B1 ∈ Rn×(n−a) , (17.29)

which means that

A1A′2 + B1B′2 = (A1 : B1)(A2 : B2)′ := FG = In . (17.30)

Because (A1 : B1) = F is an n × n matrix satisfying the equation FG = In, the matrix G is the inverse of F, and hence satisfies the equation

GF = (A2 : B2)′(A1 : B1) = In , (17.31)

that is,

(A2 : B2)′(A1 : B1) =
( A′2A1  A′2B1 )     ( Ia  0    )
( B′2A1  B′2B1 )  =  ( 0   In−a )  =  In . (17.32)

The final claim AB = 0 is achieved when the equation

A′2B1 = 0 (17.33)

is premultiplied by A1 and postmultiplied by B′2. □
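Proposition 17.5 can be illustrated with a projector and its complement (our own sketch): with A + B = In, all three conditions hold simultaneously.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 2))
A = X @ np.linalg.solve(X.T @ X, X.T)   # orthogonal projector onto C(X)
B = np.eye(5) - A                       # complementary projector, A + B = I5

rk = np.linalg.matrix_rank
assert rk(A) + rk(B) == 5                               # (a)
assert np.allclose(A @ A, A) and np.allclose(B @ B, B)  # (b)
assert np.allclose(A @ B, np.zeros((5, 5)))             # (c)
```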


17.4 Proof of the RCR Using the Full Rank Decomposition

In Chapter 6 (p. 145) we have already proved the rank cancellation rule:

LAY = MAY and rank(AY) = rank(A) =⇒ LA = MA , (17.34)

but it is of interest to give a proof using the full rank decomposition (as done in Marsaglia & Styan 1974a). To do this, let An×m have a full rank decomposition

A = UV′, (17.35)

where

rank(Un×r) = rank(Vm×r) = r = rank(An×m) . (17.36)

Assumption LAY = MAY can be written as

L ·UV′ ·Y = M ·UV′ ·Y , i.e., LU(V′Y) = MU(V′Y) . (17.37)

Therefore, if the r × p matrix V′Y has full row rank, then we can postmultiply (17.37) by (V′Y)′[(V′Y)(V′Y)′]−1 and obtain the equality

LU = MU . (17.38)

Our claim would then follow by postmultiplying (17.38) by V′. Our task is, therefore, to show that

rank(V′Y) = r . (17.39)

Assumption rank(AY) = rank(A) implies that rank(UV′Y) = r. Because U has full column rank, we get

rank(V′Y) ≥ rank(UV′Y) ≥ rank[(U′U)−1U′UV′Y] = rank(V′Y) , (17.40)

and hence indeed (17.39) holds.
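The rank cancellation rule is easy to exercise numerically; the construction of M below, perturbing L in the left null space of A so that the perturbation vanishes on A, is our own illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 5))  # rank 3
Y = rng.standard_normal((5, 3))          # generic, so rank(AY) = rank(A)
assert np.linalg.matrix_rank(A @ Y) == np.linalg.matrix_rank(A) == 3

L = rng.standard_normal((2, 4))
# Perturb L in the left null space of A: the perturbation annihilates A,
# so L A Y = M A Y although L != M.
n_left = np.linalg.svd(A)[0][:, 3:]      # basis of C(A)^perp
M = L + rng.standard_normal((2, 1)) @ n_left.T

assert np.allclose(L @ A @ Y, M @ A @ Y)
assert not np.allclose(L, M)
assert np.allclose(L @ A, M @ A)         # the rank cancellation rule (17.34)
```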

17.5 Exercises

17.1. Let A have a full rank decomposition A = UV′. Prove the claim (4.13) (p. 107): A+ = V(V′V)−1(U′U)−1U′.
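The formula of Exercise 17.1 can be checked numerically against NumPy’s pseudoinverse (our sketch; a random U and V have full column rank with probability one):

```python
import numpy as np

rng = np.random.default_rng(3)
U = rng.standard_normal((5, 2))          # full column rank
V = rng.standard_normal((4, 2))
A = U @ V.T                              # a full rank decomposition of A

# A+ = V (V'V)^{-1} (U'U)^{-1} U'
Aplus = V @ np.linalg.solve(V.T @ V, np.linalg.solve(U.T @ U, U.T))
assert np.allclose(Aplus, np.linalg.pinv(A))
```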

17.2. Proposition 17.5 (p. 352) can be generalized as follows: Let A1, . . . , Am be n × n matrices such that In = A1 + · · · + Am. Then the following three conditions are equivalent:

(a) n = rank(A1) + · · ·+ rank(Am),


(b) A2i = Ai for i = 1, . . . ,m,

(c) AiAj = 0 for all i ≠ j.

Confirm the following: Let z ∼ Nn(µ, In) and let z′z = z′A1z + · · · + z′Amz. Then any of the three above conditions is a necessary and sufficient condition for z′Aiz to be independently distributed as χ2[rank(Ai), ·]. For the χ2-distribution, see page 18.

17.3. Let

W = In + X − 2J ,

where X is an n × n symmetric involutory doubly-stochastic matrix, i.e., X2 = In and X1n = 1n, and J = (1/n)1n1′n.

(a) Show that W is scalar-potent, i.e., W2 = cW for some scalar c.
(b) Find the scalar c and confirm that W is nonnegative definite.
(c) Find the rank(W) as a function of the trace tr(X) and hence show that tr(X) is even if and only if n is even.
(d) Show that rank(W) = 1 if and only if n + tr(X) = 4. [When rank(W) = 1 then the matrix In − W is a Householder transformation, see Exercise 18.23 (p. 390).]

For an application of the results in this exercise to magic squares see Chu, Drury, Styan & Trenkler (2010).

17.4. (a) Let A and B be symmetric n × n idempotent matrices and let Z be n × n nonnegative definite, such that A + Z = B. Show that Z is idempotent and AZ = 0.

(b) Let the random variables x1 and x2 follow central chi-squared distributions with degrees of freedom f1 and f2, respectively, such that x1 − x2 = x3 ≥ 0 with probability 1. Then show that x3 follows a chi-squared distribution and find the number of degrees of freedom. Show also that x2 and x3 are independently distributed.

The results in this exercise may be called the Hogg–Craig theorem, following results in Hogg & Craig (1958); see also Ogasawara & Takahashi (1951) and Styan (1970).

17.5 (Group inverse). Let A be an n × n matrix such that rank(A) = rank(A2). Then the group inverse A# is the unique matrix satisfying

(a) AA#A = A, (b) A#AA# = A#, and (c) A#A2 = A.

If A has a full rank decomposition UV′, show that V′U is nonsingular and that A# = U(V′U)−2V′.

For more about the group inverse see Ben-Israel & Greville (2003, pp. 156–161).
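The group-inverse formula of Exercise 17.5 can be checked numerically against the three defining conditions (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
U = rng.standard_normal((5, 2))
V = rng.standard_normal((5, 2))
A = U @ V.T                              # here rank(A) = rank(A^2) = 2
W = V.T @ U                              # nonsingular when rank(A) = rank(A^2)
assert abs(np.linalg.det(W)) > 1e-10

Winv = np.linalg.inv(W)
Ag = U @ Winv @ Winv @ V.T               # A# = U (V'U)^{-2} V'

assert np.allclose(A @ Ag @ A, A)        # (a)
assert np.allclose(Ag @ A @ Ag, Ag)      # (b)
assert np.allclose(Ag @ A @ A, A)        # (c): A# A^2 = A
```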


Philatelic Item 17.1 As observed by Tee (2003), “Determinants were applied in 1683 by the Japanese mathematician Takakazu Seki Kôwa (1642–1708) in the construction of the resolvent of a system of polynomial equations [but] were independently invented in 1693 by Gottfried Wilhelm von Leibniz (1646–1716)”; see also Farebrother, Styan & Tee (2003). Leibniz was a German mathematician and philosopher who invented infinitesimal calculus independently of Isaac Newton (1643–1727). The stamp (left panel) for Seki was issued by Japan in 1992 (Scott 2147) and the stamp (right panel) for Leibniz by St. Vincent in 1991 (Scott 1557) as “Head librarian for the electors of Hannover (& co-inventor of calculus)”.

Philatelic Item 17.2 Charles Lutwidge Dodgson (1832–1898), better known by the pen name Lewis Carroll, was an English author, mathematician, Anglican clergyman and photographer. As a mathematician, Dodgson was the author of Condensation of Determinants (1866) and Elementary Treatise on Determinants (1867). His most famous writings, however, are Alice’s Adventures in Wonderland and its sequel Through the Looking-Glass. The sheetlet was issued by Tristan da Cunha in 1981 (Scott 287a) and shows the Dodgson family outside the Croft Rectory (in Darlington, Yorkshire), c. 1860; Charles is shown seated on the ground at the left of the group. Edwin Heron Dodgson (1846–1918), Charles’s youngest brother (shown kneeling third from the right), apparently saved the population of Tristan da Cunha from starvation. For more see Farebrother, Jensen & Styan (2000).