
Chapter 1. Vector Spaces

Theorem 1. Let V be a vector space and let S(V ) be the set of all subspaces of V . If S, T ∈ S(V ), then

S ∩ T = glb {S, T}

and

S + T = lub {S, T} .

Proof. It is clear that S ∩ T is a lower bound of {S, T}. If U ⊆ S and U ⊆ T , then U ⊆ S ∩ T . Similarly, S + T is an upper bound of {S, T}. If U ⊇ S and U ⊇ T , then s + t ∈ U whenever s ∈ S and t ∈ T . Therefore U ⊇ S + T . �

Theorem 2. Let V be a vector space and let S, T be subspaces of V with bases B1, B2

respectively. Then the following are equivalent:

(1) B1 ∩B2 is empty and B1 ∪B2 is a basis for V .
(2) V = S ⊕ T .

Proof. Suppose that B = B1 ∪B2 is a basis for V and B1 ∩B2 is empty. If v ∈ V then we may write

v = a1v1 + · · ·+ anvn

where v1, . . . , vn ∈ B. Rearranging terms shows that v = s + t for some s ∈ S and t ∈ T , and therefore V = S + T . Now suppose that S ∩ T ≠ {0}, so that there exists v ≠ 0 such that

v = a1v1 + · · ·+ amvm

and

v = b1w1 + · · ·+ bnwn

where v1, . . . , vm ∈ B1, w1, . . . , wn ∈ B2, not all a1, . . . , am are zero, and not all b1, . . . , bn are zero. Since B1 ∩B2 is empty, all of v1, . . . , vm, w1, . . . , wn are distinct, and

a1v1 + · · ·+ amvm − b1w1 − · · · − bnwn = 0

contradicts the linear independence of B.

Conversely, suppose that V = S ⊕ T . If v ∈ B1 ∩ B2, then v ∈ S ∩ T , which is a contradiction. Therefore B1 ∩ B2 is empty. We know that B1 ∪ B2 spans V since V = S + T . Suppose that

a1v1 + · · ·+ amvm + b1w1 + · · ·+ bnwn = 0

where v1, . . . , vm ∈ B1, w1, . . . , wn ∈ B2, not all a1, . . . , am are zero, and not all b1, . . . , bn are zero. Then

a1v1 + · · ·+ amvm = −b1w1 − · · · − bnwn


is a nonzero element in S ∩ T , which is a contradiction. This shows that B1 ∪ B2 is linearly independent, and consequently B1 ∪B2 is a basis for V . �
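
For a concrete illustration, take V = R3, S = 〈(1, 0, 0), (0, 1, 0)〉 with basis B1 = {(1, 0, 0), (0, 1, 0)}, and T = 〈(1, 1, 1)〉 with basis B2 = {(1, 1, 1)}. Then B1 ∩ B2 is empty, B1 ∪ B2 is a basis for R3, and R3 = S ⊕ T .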

Theorem 3. Any subspace of a vector space has a complement.

Proof. Let W be a subspace of V . There exists a basis B0 of W and a basis B of V containing B0, by Theorem 1.9. Therefore

V = span(B0)⊕ span(B \B0)

= W ⊕ span(B \B0)

by Theorem 2. �

Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B = {bi | i ∈ I} and S is a subspace of V . Let {B1, . . . , Bk} be a partition of B. Then is it true that

S = (S ∩ 〈B1〉)⊕ · · · ⊕ (S ∩ 〈Bk〉)?

What if S ∩ 〈Bi〉 ≠ {0} for all i?

No. Take

S = 〈e1, e2, e3〉 , B1 = {(1, 0, 0, 1), (0, 1, 0, 0)} , B2 = {(0, 0, 1, 1), (1, 0, 0, 0)}

in R4. We have

(S ∩ 〈B1〉)⊕ (S ∩ 〈B2〉) = 〈e2〉 ⊕ 〈e1〉 ≠ S

but S ∩ 〈Bi〉 ≠ {0} for all i.

Theorem 5. [Exercise 1.6] Let S, T, U ∈ S(V ). If U ⊆ S, then

S ∩ (T + U) = (S ∩ T ) + U.

Proof. If v ∈ S ∩ (T + U) then v ∈ S and v = t + u where t ∈ T and u ∈ U ⊆ S. Therefore t = v − u ∈ S and v ∈ (S ∩ T ) + U . Conversely, if v ∈ (S ∩ T ) + U then v = s + u where s ∈ S ∩ T and u ∈ U ⊆ S. Therefore v ∈ S and v ∈ S ∩ (T + U). �

Theorem 6. [Exercise 1.8] Let v = (a1, . . . , an) ∈ Rn be a strongly positive vector.

(1) Let M = min {a1, . . . , an}. Then any vector w ∈ Rn with ‖v − w‖ < M is strongly positive.

(2) If a subspace S of Rn contains v, then S has a basis of strongly positive vectors.


Proof. Suppose that w = (b1, . . . , bn) ∈ Rn has some bi ≤ 0 and ‖v − w‖ < M . Then ai − bi ≥ ai ≥ M , so (ai − bi)² ≥ M², which contradicts

(a1 − b1)² + · · ·+ (an − bn)² < M².

Let S be a subspace of Rn that contains v, and let {v = v1, v2, . . . , vm} be a basis of S containing v. Choose a positive number C such that

‖v − (v + vi/C)‖ = ‖vi/C‖ = ‖vi‖ /C < M

for every i ≥ 2; then

{v, v + v2/C, . . . , v + vm/C}

is a basis of strongly positive vectors by (1). �
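
For a quick numerical check of (1), take v = (3, 1) ∈ R2, so that M = 1. The vector w = (2.6, 0.4) satisfies ‖v − w‖ = √(0.4² + 0.6²) < 1 and is indeed strongly positive.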

Theorem 7. [Exercise 1.14] Let V be a finite-dimensional vector space over an infinite field F . If S1, . . . , Sk are subspaces of V of equal dimension, then there is a subspace T of V for which V = Si ⊕ T for all i = 1, . . . , k.

Proof. Let n be the dimension of V and let m be the common dimension of S1, . . . , Sk. We use induction on the value of n−m. If m = n, we can choose T = {0}. Otherwise, assume that the statement is true for k subspaces of dimension m + 1. By Theorem 1.2, we can choose a vector v ∈ V \ (S1 ∪ · · · ∪ Sk). The subspaces {Si ⊕ 〈v〉} have dimension m + 1, so applying the induction hypothesis gives a subspace T of V such that V = (Si ⊕ 〈v〉)⊕ T for all i. Then V = Si ⊕ (〈v〉 ⊕ T ), which shows that 〈v〉 ⊕ T is a common complement of S1, . . . , Sk. �

Lemma 8. If V contains an infinite set that is linearly independent, then V is infinite-dimensional.

Proof. Let S be an infinite linearly independent set in V . Suppose that a finite set {v1, . . . , vn} spans V , and choose n + 1 elements from S. This contradicts Theorem 1.10. �

Theorem 9. [Exercise 1.15] The vector space C of all continuous functions from R to R is infinite-dimensional.

Proof. For any positive integer n, define fn : R→ R by

fn(x) = 1− |x− n| if n− 1 < x < n+ 1, and fn(x) = 0 otherwise.

Each fn is continuous, fn(n) = 1, and fn(k) = 0 for every integer k ≠ n, so evaluating a relation a1f1 + · · ·+ aNfN = 0 at x = k gives ak = 0 for each k. The set {fn | n ≥ 1} is therefore linearly independent, and C is infinite-dimensional by Lemma 8. �


Theorem 10. [Exercise 1.19] Let V be an n-dimensional real vector space and suppose that S is a subspace of V with dim(S) = n − 1. Define an equivalence relation ≡ on the set V \ S by v ≡ w if the “line segment”

L(v, w) = {rv + (1− r)w | 0 ≤ r ≤ 1}

has the property that L(v, w) ∩ S = ∅. Then ≡ is an equivalence relation and it has exactly two equivalence classes.

Proof. Let B0 = (b1, . . . , bn−1) be an ordered basis for S and complete it to an ordered basis B = (b1, . . . , bn) of V . Let φ : V → R be the map that takes a vector to its nth coordinate with respect to B, and let sgn : R \ {0} → {−1, 1} be the sign function sgn(x) = x/ |x|. Define an equivalence relation ∼ on V \ S by

v ∼ w ⇔ sgn(φ(v)) = sgn(φ(w)).

We want to show that ≡ is identical to ∼, i.e. v ≡ w if and only if v ∼ w. Let v, w ∈ V \ S with v ≢ w. Then rv + (1 − r)w ∈ S for some 0 < r < 1 (note r ≠ 0, 1 since v, w /∈ S), and rφ(v) + (1− r)φ(w) = 0. Rewriting this as

φ(v) = ((r − 1)/r)φ(w)

shows that sgn(φ(v)) = −sgn(φ(w)) and v ≁ w. Conversely, let v, w ∈ V \ S with v ≁ w. Put

r = −φ(w)/(φ(v)− φ(w)),

which satisfies 0 < r < 1 since φ(v) and φ(w) have opposite signs, so that rφ(v) + (1− r)φ(w) = 0 and therefore rv + (1− r)w ∈ S. This proves that v ≢ w, and the result follows since ∼ is an equivalence relation with exactly two classes, the preimages under sgn ◦ φ of the positive and the negative numbers. �
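
For instance, with V = R2 and S the x-axis, the two equivalence classes are the open upper and lower half-planes: a segment joining two points on the same side of the x-axis never meets it, while a segment joining a point above to a point below must cross it.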

Theorem 11. [Exercise 1.22] Every subspace S of Rn is a closed set.

Proof. Let B0 = (v1, . . . , vm) be an ordered orthonormal basis for S and complete it to an ordered orthonormal basis B = (v1, . . . , vn) of Rn. Let x ∈ Sc, and write

[x]B = (x1, . . . , xn).

At least one of xm+1, . . . , xn is not equal to zero, for otherwise x ∈ S. Let

M = min {|xi| | i ≥ m+ 1 and xi ≠ 0} .


We want to show that the open ball B(x,M) is wholly contained in Sc. Let y ∈ Rn with ‖x− y‖ = ‖[x]B − [y]B‖ < M , and write

[y]B = (y1, . . . , yn).

If y ∈ S then ym+1 = · · · = yn = 0, so that

(x1 − y1)² + · · ·+ (xm − ym)² + xm+1² + · · ·+ xn² < M²

contradicts the fact that |xi| ≥ M for at least one i ≥ m + 1. This completes the proof. �

Theorem 12. [Exercise 1.24] Let V be an n-dimensional vector space over an infinite field F and suppose that S1, . . . , Sk are subspaces of V with dim(Si) ≤ m < n. Then there is a subspace T of V of dimension n−m for which T ∩ Si = {0} for all i.

Proof. Extend each Si to a subspace S′i of dimension m and apply Theorem 7 to obtain a subspace T with V = S′i ⊕ T for all i. Then dim(T ) = n−m and T ∩ Si ⊆ T ∩ S′i = {0} for all i. �

Theorem 13. [Exercise 1.26] Let V be a real vector space with complexification V C

and let U be a subspace of V C. Then the following are equivalent:

(1) There is a subspace S of V for which U = SC = {s+ it | s, t ∈ S}.
(2) U is closed under complex conjugation χ : V C → V C defined by u+ iv 7→ u− iv.

Proof. (1)⇒ (2) since SC is closed under complex conjugation. Suppose that (2) holds and let S = {u ∈ V | u+ iv ∈ U for some v ∈ V }. We want to show that U = SC. If u+ iv ∈ U , then u ∈ S and v ∈ S since v − iu = −i(u + iv) ∈ U . Conversely, if s + it ∈ SC then s + iv1 ∈ U and t+ iv2 ∈ U for some v1, v2. Therefore

s+ it = ((s+ iv1)/2 + (s− iv1)/2) + i ((t+ iv2)/2 + (t− iv2)/2) ∈ U,

since U is a subspace of V C that is closed under conjugation. �

Chapter 2. Linear Transformations

Theorem 14. [Universal property of a free object] Let V and W be vector spaces, let B be a basis for V and let τ0 : B → W be any function. Then τ0 extends uniquely to a linear map τ : V → W . That is, there exists a unique linear map τ : V → W such that τi = τ0, where i : B → V is the canonical injection.


Proof. For any v ∈ V write v = a1b1 + · · · + anbn where b1, . . . , bn ∈ B, and define τv = a1τ0b1 + · · · + anτ0bn. It is clear that τ is well-defined and linear. Suppose that τ ′ : V → W is a linear map that extends τ0. Let v ∈ V and write v = a1b1 + · · ·+ anbn where b1, . . . , bn ∈ B. Then

τ ′v = a1τ0b1 + · · ·+ anτ0bn = τv,

which shows that τ ′ = τ . �

Theorem 15. [Universal property of a free object, converse] Let B be a subset of V such that for every vector space W and every τ0 : B → W there exists a unique linear map τ : V → W that extends τ0. Then B is a basis for V .

Proof. We may assume that V ≠ {0}, for otherwise the result follows trivially. Suppose that B is linearly dependent, i.e.

a1b1 + · · ·+ anbn = 0

where b1, . . . , bn are distinct elements from B and not all a1, . . . , an are zero. We may write without loss of generality

b1 = −(1/a1)(a2b2 + · · ·+ anbn),

and we may choose τ0 : B → V such that

τ0b1 ≠ −(1/a1)(a2τ0b2 + · · ·+ anτ0bn).

This produces a contradiction when we extend τ0 to a linear map, and therefore B is linearly independent. If B does not span V , then we may choose a vector v ∈ V \ span(B) so that B ∪ {v} is linearly independent. Setting τb = b for each b ∈ B and then either τv = 0 or τv ≠ 0 produces two different linear maps extending the canonical injection i : B → V , contradicting uniqueness. Therefore B spans V . �

Theorem 16. Let τ ∈ L(V,W ) be an isomorphism. Let S ⊆ V . Then

(1) S spans V if and only if τS spans W .
(2) S is linearly independent in V if and only if τS is linearly independent in W .
(3) S is a basis for V if and only if τS is a basis for W .

Proof. We will only prove (1) and (2) in one direction, for the same argument can beapplied with the isomorphism τ−1. Suppose that τS spans W , and let v ∈ V . Thenτv ∈ W , so we may write

τv = a1τs1 + · · ·+ anτsn

= τ(a1s1 + · · ·+ ansn)


for some s1, . . . , sn ∈ S. Since τ is an isomorphism, applying τ−1 gives

v = a1s1 + · · ·+ ansn,

which shows that S spans V . Now suppose that τS is linearly independent in W and

a1s1 + · · ·+ ansn = 0

where s1, . . . , sn ∈ S. Then

0 = τ(a1s1 + · · ·+ ansn)

= a1τs1 + · · ·+ anτsn,

and a1 = · · · = an = 0 by the linear independence of τS. (3) follows immediately from(1) and (2). �

Theorem 17. Let V be a vector space and let ρ ∈ L(V ).

(1) If V = S ⊕ T then

ρS,T + ρT,S = ι.

(2) If ρ = ρS,T then

im(ρ) = S and ker(ρ) = T

and so

V = im(ρ)⊕ ker(ρ).

In other words, ρ is a projection onto its image along its kernel. Moreover,

v ∈ im(ρ) ⇔ ρv = v.

(3) If σ ∈ L(V ) has the property that

V = im(σ)⊕ ker(σ) and σ|im(σ) = ι

then σ is a projection onto im(σ) along ker(σ).

Proof. (1) and (2) are obvious. Let v = v1 + v2 ∈ V where v1 ∈ im(σ) and v2 ∈ ker(σ).Then

σ(v) = σ(v1 + v2)

= ι(v1) + σ(v2)

= v1,

which proves (3). �

Theorem 18. Let V = S⊕T . Then (S, T ) reduces τ ∈ L(V ) if and only if τ commuteswith ρS,T .


Proof. By Theorem 2.23, S and T are τ -invariant if and only if

(*) ρS,T τρS,T = ρS,T τ

and

(**) (ι− ρS,T )τ(ι− ρS,T ) = (ι− ρS,T )τ

since ρS,T + ρT,S = ι. Given (*), (**) is equivalent to

τ − τρS,T − ρS,T τ + ρS,T τρS,T = τ − ρS,T τ,

which by (*) reduces to τρS,T = ρS,T τ . Conversely, if τρS,T = ρS,T τ then ρS,T τρS,T = ρ²S,T τ = ρS,T τ , so (*) holds. Both conditions taken together are therefore equivalent to τρS,T = ρS,T τ . �

Theorem 19. [Exercise 2.1] Let A ∈Mm,n have rank k.

(1) There exist matrices X ∈ Mm,k and Y ∈ Mk,n, both of rank k, such that A = XY .
(2) A has rank 1 if and only if it has the form A = xT y where x and y are nonzero row matrices.

Proof. If A has rank k, then

A = P [ Ik 0 ; 0 0 ] Q

(in block form) by performing elementary row and column operations, where P and Q are invertible. But

A = P ′Q′

where P ′ contains the first k columns of P and Q′ contains the first k rows of Q. This proves (1). Suppose now that A has rank 1; (1) shows that A = xT y for nonzero row matrices x and y. Conversely, if x and y are nonzero row matrices then

rk(xT y) ≤ min {rk(xT ), rk(y)} = 1,

and xT y ≠ 0 since some entry xiyj is nonzero. Therefore rk(xT y) = 1, which proves (2). �

Theorem 20. [Exercise 2.2] Let τ ∈ L(V,W ), where dim(V ) = dim(W ) < ∞. Then τ is injective if and only if τ is surjective. This does not hold if the finiteness condition is removed.


Proof. By Theorem 2.8, dim(ker(τ)) = 0 if and only if dim(im(τ)) = dim(W ).

Let V be the set of all functions f : R → R such that f(x) = 0 whenever x < −1 or x > 1, as a real vector space. Define the linear map τ : V → V by (τf)(x) = f(2x); then τ is injective but not surjective. �

Theorem 21. [Exercise 2.3] Let τ ∈ L(V,W ). Then τ is an isomorphism if and onlyif it carries a basis for V to a basis for W .

Proof. Theorem 16 shows that if τ is an isomorphism, then it carries a basis for V toa basis for W . Conversely, suppose that τ carries a basis B for V to a basis B′ for W .Clearly τ is surjective. If

τ(a1v1 + · · ·+ anvn) = a1τv1 + · · ·+ anτvn

= 0

where v1, . . . , vn ∈ B, then a1 = · · · = an = 0 since τv1, . . . , τvn ∈ B′. This shows thatτ is a bijection. �

Theorem 22. [Exercise 2.5] If V = S ⊕ T , then V ∼= S ⊞ T .

Proof. Let ϕ : S ⊞ T → S ⊕ T be given by (s, t) 7→ s + t. This map is easily seen to be an isomorphism. �

Remark 23. [Exercise 2.6] Let V = A + B and define the linear map τ : A ⊞ B → V by (v, w) 7→ v + w. This map is always surjective, but is injective if and only if V = A⊕B.

Theorem 24. [Exercise 2.7] Let τ ∈ LF (V ) where dim(V ) = n <∞. Let A ∈Mn(F ).Suppose that there is an isomorphism σ : V → F n with the property that στ = Aσ.Then there exists an ordered basis B for which A = [τ ]B.

Proof. Choose B = (σ−1e1, . . . , σ−1en), which is a basis by Theorem 16. Then A = στσ−1 = [τ ]B. �

Definition 25. Let T be a subset of L(V ). A subspace S of V is T -invariant if S isτ -invariant for every τ ∈ T . Also, V is T -irreducible if the only T -invariant subspacesof V are {0} and V .

Theorem 26. [Exercise 2.8] Suppose that TV ⊆ L(V ), TW ⊆ L(W ), V is TV -irreducibleand W is TW -irreducible. Let α ∈ L(V,W ) satisfy αTV = TWα, that is, for any µ ∈ TVthere is a λ ∈ TW such that αµ = λα and for any λ ∈ TW there is a µ ∈ TV such thatαµ = λα. Then α = 0 or α is an isomorphism.

Proof. Suppose first that kerα ≠ {0}, let v ∈ kerα, and let µ ∈ TV . There exists a λ ∈ TW such that αµ = λα, so αµv = λαv = 0 and µv ∈ kerα. This shows that kerα is a TV -invariant subspace and therefore kerα = V , i.e. α = 0. Similarly, suppose that


imα ≠ W , let w ∈ imα so that w = αv for some v ∈ V , and let λ ∈ TW . There exists a µ ∈ TV such that αµ = λα, so λw = λαv = αµv ∈ imα. This shows that imα is a TW -invariant subspace and therefore imα = {0}, i.e. α = 0. We conclude that if α fails to be injective or surjective, then α = 0. �

Theorem 27. [Exercise 2.9] Let τ ∈ L(V ) where dim(V ) <∞. If rk(τ 2) = rk(τ), then im(τ) ∩ ker(τ) = {0}.

Proof. If v ∈ im(τ) ∩ ker(τ) is nonzero, there exists some nonzero x ∈ V such that τx = v ≠ 0 and τ 2x = τv = 0. We know that ker(τ) ⊆ ker(τ 2), and since x /∈ ker(τ) but x ∈ ker(τ 2), the nullity of τ must be strictly less than the nullity of τ 2. By Theorem 2.8, rk(τ 2) ≠ rk(τ). �

Theorem 28. [Exercises 2.10,2.11] Let τ ∈ L(U, V ) and σ ∈ L(V,W ).

(1) rk(στ) ≤ min {rk(τ), rk(σ)}.
(2) null(στ) ≤ null(τ) + null(σ).

Proof. Obvious. �

Theorem 29. [Exercise 2.12] Let τ, σ ∈ L(V ) where τ is invertible. Then rk(τσ) =rk(στ) = rk(σ).

Proof. Obvious. �

Theorem 30. [Exercise 2.13] Let τ, σ ∈ L(V,W ). Then rk(τ + σ) ≤ rk(τ) + rk(σ).

Proof. im(τ + σ) ⊆ im(τ) + im(σ), and dim(im(τ) + im(σ)) ≤ rk(τ) + rk(σ). �

Theorem 31. [Exercise 2.14] Let S be a subspace of V .

(1) There exists a τ ∈ L(V ) such that ker(τ) = S.
(2) There exists a σ ∈ L(V ) such that im(σ) = S.

Proof. Choose a complement T so that V = S ⊕ T . Define τ to be the projection ontoT along S, and define σ to be the projection onto S along T . By Theorem 2.21, τ andσ satisfy (1) and (2). Note that τ and σ as defined here are orthogonal. �

Theorem 32. [Exercise 2.15] Let τ, σ ∈ L(V ).

(1) σ = τµ for some µ ∈ L(V ) if and only if im(σ) ⊆ im(τ).
(2) σ = µτ for some µ ∈ L(V ) if and only if ker(τ) ⊆ ker(σ).


Proof. If σ = τµ and v = σx for some x ∈ V , then v = τµx ∈ im(τ). Conversely, suppose that im(σ) ⊆ im(τ), and let B be a basis for V . For each b ∈ B we have σb ∈ im(σ) ⊆ im(τ), so we may choose a vector µ0b ∈ V with τ(µ0b) = σb. By Theorem 14, µ0 extends to a linear map µ ∈ L(V ), and τµb = σb for every b ∈ B, so τµ = σ. The proof for (2) is similar: if ker(τ) ⊆ ker(σ), define µ on im(τ) by µ(τv) = σv, which is well-defined since τv = τv′ implies v − v′ ∈ ker(τ) ⊆ ker(σ), and extend µ linearly to V . �

Theorem 33. [Exercise 2.16] Let dim(V ) < ∞ and suppose that τ ∈ L(V ) satisfies τ 2 = 0. Then 2 rk(τ) ≤ dim(V ).

Proof. Since τ 2 = 0, we have im(τ) ⊆ ker(τ) and rk(τ) ≤ null(τ). By Theorem 2.8,

2 rk(τ) ≤ rk(τ) + null(τ) = dim(V ). �
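
For example, the map τ ∈ L(R2) given by τ(x, y) = (y, 0) satisfies τ 2 = 0, and indeed 2 rk(τ) = 2 ≤ 2 = dim(R2).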

Theorem 34. [Exercise 2.18] Let V have basis B = {v1, . . . , vn} and assume that the base field F for V has characteristic 0. Suppose that for each 1 ≤ i, j ≤ n we define τi,j ∈ L(V ) by

τi,j(vk) = vk if k ≠ i, and τi,j(vi) = vi + vj .

Then the τi,j are invertible and form a basis for L(V ).

Proof. Define εi,j ∈ L(V ) by εi,j(vi) = vj and εi,j(vk) = 0 for k ≠ i; then τi,j = ι + εi,j. We can compute inverses: τi,i has inverse 2−1εi,i + Σk≠i εk,k, since [τi,i]B is diagonal with entries 2, 1, . . . , 1, and for i ≠ j the inverse of τi,j is ι− εi,j, since

(ι+ εi,j) (ι− εi,j) = ι− ε²i,j = ι.

To see that the τi,j form a basis for L(V ), suppose that

a1,1τ1,1 + · · ·+ an,nτn,n = (Σi,j ai,j) ι+ Σi,j ai,jεi,j = 0.


For each k, we have

(*) (Σi,j ai,j) vk + Σj ak,jvj = 0,

since εi,jvk = 0 unless i = k, in which case εk,jvk = vj. By the linear independence of B, ak,j = 0 for all (k, j) with j ≠ k. Equation (*) then shows that

(Σi ai,i) + ak,k = 0

for each k; viewing these as n equations in n unknowns, ak,k = 0 for each k, since U + I is invertible over a field of characteristic 0, where U is the n× n matrix with all entries equal to 1. Since dim L(V ) = n² and the n² maps τi,j are linearly independent, they form a basis. �

Remark 35. [Exercise 2.19] Let τ ∈ L(V ). If S is a τ -invariant subspace of V , must there be a subspace T of V for which (S, T ) reduces τ? No, not even when τ is an isomorphism. For example, let τ ∈ L(R2) be given by τ(x, y) = (x + y, y) and let S = 〈e1〉. Then S is τ -invariant, but every complement of S has the form T = 〈(a, 1)〉 for some a ∈ R, and τ(a, 1) = (a+ 1, 1) /∈ T , so no complement of S is τ -invariant.

Example 36. [Exercise 2.20] Find an example of a vector space V and a proper subspace S of V for which V ∼= S. Let Fa denote the set of all functions f : R → R such that f(x) = 0 whenever x < −a or x > a. Fix 0 < α < β; then Fα is a proper subspace of Fβ, and ϕ : Fβ → Fα defined by (ϕf)(x) = f(βx/α) is an isomorphism, so V = Fβ is isomorphic to its proper subspace S = Fα.

Theorem 37. [Exercise 2.21] Let dim(V ) < ∞. If τ, σ ∈ L(V ) then στ = ι impliesthat τ and σ are invertible and that σ = p(τ) for some polynomial p(x) ∈ F [x].

Proof. The first assertion is clear since ker(σ) ≠ {0} or ker(τ) ≠ {0} implies that ker(στ) ≠ {0}. Since L(V ) is finite-dimensional, there exists a nonzero polynomial p(x) = Σk≥0 akx^k ∈ F [x] such that p(τ) = 0. Such a polynomial can be taken to have nonzero constant term a0, for otherwise we may apply τ−1 to both sides. Then

−a0ι = Σk≥1 akτ^k,

and multiplying both sides by σ (using στ^k = τ^(k−1)) gives

−a0σ = Σk≥0 ak+1τ^k, so σ = −Σk≥0 (ak+1/a0) τ^k. �

Theorem 38. [Exercise 2.22] Let τ ∈ L(V ). The following are equivalent:

(1) τσ = στ for all σ ∈ L(V ).
(2) τ = aι for some a ∈ F .


Proof. (2)⇒ (1) is obvious. Suppose that τ ≠ aι for all a ∈ F . Then there exists some v ∈ V such that {v, τv} is linearly independent. Complete this to a (possibly infinite) basis B of V , and choose σ ∈ L(V ) such that σv = v, στv = v + τv, and σb = b for all b ∈ B \ {v, τv}. Then τσv = τv ≠ στv, which shows that τ and σ do not commute. This proves (1)⇒ (2). �

Theorem 39. [Exercise 2.23] Let V be a vector space over a field F of characteristicother than 2, and let ρ, σ be projections.

(1) The difference ρ− σ is a projection if and only if

ρσ = σρ = σ

in which case

im(ρ− σ) = im(ρ) ∩ ker(σ) and ker(ρ− σ) = ker(ρ)⊕ im(σ).

(2) If ρ and σ commute, then ρσ is a projection, in which case

im(ρσ) = im(ρ) ∩ im(σ) and ker(ρσ) = ker(ρ) + ker(σ).

Proof. The difference ρ− σ is a projection if and only if ι− (ρ− σ) = (ι− ρ) + σ is a projection; this sum is a projection if and only if ι− ρ ⊥ σ, i.e. σ = ρσ = σρ. In this case, im(ρ) ∩ ker(σ) ⊆ im(ρ− σ), since v = ρx and v ∈ ker(σ) implies that (ρ− σ)v = ρ²x = v. If v = (ρ− σ)x then σv = σρx− σ²x = 0 and v = (ρ− ρσ)x = ρ(ι− σ)x, which shows the reverse inclusion. If v ∈ ker(ρ) then (ρ− σ)v = ρv − σρv = 0, and if v ∈ im(σ) then (ρ− σ)v = ρ(v − σv) = 0. Therefore ker(ρ) + im(σ) ⊆ ker(ρ− σ). Conversely, if v ∈ ker(ρ− σ) then v = (v − ρv) + σv ∈ ker(ρ) + im(σ). Finally, v ∈ ker(ρ) ∩ im(σ) implies that v = σv = ρσv = ρv = 0, which shows that the sum is direct.

If ρ and σ commute then ρσρσ = ρ²σ² = ρσ. Clearly im(ρσ) ⊆ im(ρ) and im(ρσ) = im(σρ) ⊆ im(σ). If v ∈ im(ρ) ∩ im(σ) then v = ρx = σx′ so that

ρσv = σρv = σρx = σv = σx′ = v,

which shows that v ∈ im(ρσ). Clearly ker(ρ) + ker(σ) ⊆ ker(ρσ). If v ∈ ker(ρσ) then v = (v − ρv) + ρv ∈ ker(ρ) + ker(σ), which proves that ker(ρσ) = ker(ρ) + ker(σ). �
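
As a concrete instance of both parts, let ρ, σ ∈ L(R3) be given by ρ(x, y, z) = (x, y, 0) and σ(x, y, z) = (x, 0, 0). Then ρσ = σρ = σ, and ρ − σ maps (x, y, z) to (0, y, 0): it is the projection onto im(ρ) ∩ ker(σ) = 〈e2〉 along ker(ρ)⊕ im(σ) = 〈e3〉 ⊕ 〈e1〉.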

Theorem 40. [Exercise 2.24] Let f : Rn → R be a continuous function with the property that f(x+ y) = f(x) + f(y). Then f is a linear functional on Rn.

Proof. By induction, f(nx) = nf(x) for all integers n, and consequently f(rx) = rf(x) for all r ∈ Q. Let a ∈ R, and let {an} be a sequence in Q such that an → a. Then

f(ax) = f( limn→∞ anx ) = limn→∞ f(anx) = limn→∞ anf(x) = af(x)

since f is continuous, which completes the proof. �

Theorem 41. [Exercise 2.25] Any linear functional f : Rn → R is a continuous map.

Proof. If f = 0 then f is trivially continuous, so assume M = max {|f(e1)| , . . . , |f(en)|} > 0. Let x ∈ Rn and ε > 0. For all y ∈ Rn with y ≠ x and

‖x− y‖ < ε/(Mn),

we have

|f(x)− f(y)| = |f(x− y)| = ‖x− y‖ |f((x− y)/ ‖x− y‖)| .

Write

(x− y)/ ‖x− y‖ = a1e1 + · · ·+ anen

with a1² + · · ·+ an² = 1. Then

|f(x)− f(y)| = ‖x− y‖ |a1f(e1) + · · ·+ anf(en)| ≤ M ‖x− y‖ (|a1|+ · · ·+ |an|) ≤ Mn ‖x− y‖ < ε

since |ai| ≤ 1 for all i. �

Theorem 42. [Exercise 2.32] Let V be a real vector space with complexification V C

and let σ ∈ L(V C). Then the following are equivalent:

(1) σ is a complexification, i.e. σ has the form τC for some τ ∈ L(V ).
(2) σ commutes with the conjugate map χ : V C → V C defined by χ(u+ iv) = u− iv.

Proof. Suppose that σ = τC for some τ ∈ L(V ). Then

σχ(u+ iv) = σ(u− iv) = τu− iτv = χ(τu+ iτv) = χσ(u+ iv).


Conversely, suppose that σ commutes with χ. Let τ : V → V be given by v 7→ Re(σ(v + 0i)). Given u, v ∈ V , write σ(u+ iv) = x+ iy where x, y ∈ V . We have

x = (x+ iy)/2 + (x− iy)/2 = σ(u+ iv)/2 + σ(u− iv)/2 = σ(u+ 0i) = τu

and similarly y = τv. This shows that σ = τC. �

Chapter 3. The Isomorphism Theorems

Remark 43. [Exercise 3.1] If V is infinite-dimensional and S is an infinite-dimensional subspace of V , must the dimension of V/S be finite? No. Let V be the space of all real sequences that are eventually zero, with standard basis {e1, e2, e3, . . . }, and let S = 〈e2, e4, e6, . . . 〉. Both V and S are infinite-dimensional, and V/S ∼= 〈e1, e3, e5, . . . 〉 is also infinite-dimensional.

Theorem 44. [Exercise 3.2] Let S be a subspace of V . Then the function that assigns to each intermediate subspace S ⊆ T ⊆ V the subspace T/S of V/S is an order-preserving (with respect to set inclusion) bijection between the set of all subspaces of V containing S and the set of all subspaces of V/S.

Proof. Let ψ denote the correspondence T 7→ T/S, which is clearly order-preserving. Suppose that T1/S = T2/S; we want to show that T1 = T2. If v ∈ T1, then v + S ∈ T1/S = T2/S and we may choose an element v′ ∈ v + S ⊆ T2. We have v′ − v = s for some s ∈ S, so v = v′ − s ∈ T2. Therefore T1 ⊆ T2, and similarly T2 ⊆ T1. This shows that ψ is injective. Now let

W = {u+ S | u ∈ U}

be a subspace of V/S, where U ⊆ V . Let T be the union of all cosets in W :

T = ⋃u∈U (u+ S).

If x, y ∈ T then x + S, y + S ∈ W ; since W is a subspace of V/S,

rx+ S, (x+ y) + S ∈ W,

which implies that rx, x + y ∈ T . Hence, T is a subspace of V containing S. It is clear that T/S = W ; this proves that ψ is surjective. �

Theorem 45. [Exercise 3.3] Let τ : V → W be a linear map. Then im(τ) ∼= V/ ker(τ).


Proof. By Theorem 3.4, there exists a map τ ′ : V/ ker(τ) → W such that ker(τ ′) = ker(τ)/ ker(τ) = {0}. Then τ ′ is injective and

im(τ) = im(τ ′) ∼= V/ ker(τ). �

Remark 46. [Exercise 3.5] Let S be a subspace of V . Starting with a basis {s1, . . . , sk} for S, how would you find a basis for V/S?

Complete the given basis of S to a basis {s1, . . . , sk, vk+1, . . . , vn} of V . We have

V = 〈s1〉 ⊕ · · · ⊕ 〈sk〉 ⊕ 〈vk+1〉 ⊕ · · · ⊕ 〈vn〉

and

S = 〈s1〉 ⊕ · · · ⊕ 〈sk〉

so that ϕ : 〈vk+1〉 ⊕ · · · ⊕ 〈vn〉 → V/S defined by v 7→ v + S is an isomorphism. Then ϕ carries the basis {vk+1, . . . , vn} to a basis {vk+1 + S, . . . , vn + S} of V/S.

Remark 47. [Exercise 3.6] The rank-nullity theorem follows from the first isomorphismtheorem.

Let τ ∈ L(V,W ) with dim(V ) < ∞; we have im(τ) ∼= V/ ker(τ), so dim(im(τ)) =dim(V ) − dim(ker(τ)). However, the main part of this proof applies the dimensionformula for quotient spaces (Corollary 3.7). The formula follows from taking a comple-ment of ker(τ), which is the same technique used in the original proof of the rank-nullitytheorem.

Remark 48. [Exercise 3.7] Let τ ∈ L(V ) and let S be a subspace of V . Define a map τ ′ : V/S → V/S by v + S 7→ τv + S. When is τ ′ well-defined? If τ ′ is well-defined, is it a linear transformation? What are im(τ ′) and ker(τ ′)?

τ ′ is well-defined if and only if τv − τv′ = τ(v − v′) ∈ S whenever v, v′ ∈ V and v − v′ ∈ S. That is, τ ′ is well-defined if and only if S is a τ -invariant subspace. In that case τ ′ is a linear map, and we may compute

im(τ ′) = {τv + S | v ∈ V } = (im(τ) + S)/S ∼= im(τ)/(im(τ) ∩ S)

and

ker(τ ′) = {v + S | τv ∈ S} = τ−1(S)/S.

Theorem 49. [Exercise 3.8] For any nonzero vector v ∈ V , there exists a linear functional f ∈ V ∗ for which f(v) ≠ 0.

Proof. Let F be the base field of V . Choose a basis B of V that contains v, and define f : V → F by f(v) = 1 and f(b) = 0 for every b ∈ B \ {v}. �

Corollary 50. [Exercise 3.9] A vector v ∈ V is zero if and only if f(v) = 0 for allf ∈ V ∗. Two vectors v, w ∈ V are equal if and only if f(v) = f(w) for all f ∈ V ∗.


Theorem 51. [Exercise 3.10] Let S be a proper subspace of a finite-dimensional vectorspace V and let v ∈ V \ S. Then there exists a linear functional f ∈ V ∗ for whichf(v) = 1 and f(s) = 0 for all s ∈ S.

Proof. Choose a basis B0 of S and choose a basis B of V that contains B0∪{v}. Definef ∈ V ∗ by f(v) = 1 and f(x) = 0 for all x ∈ B \ {v}. This functional has the desiredproperties. �

Example 52. [Exercise 3.11] Find a vector space V and decompositions

V = A⊕B = C ⊕D

with A ∼= C but B ≇ D. Hence A ∼= C does not imply that a complement of A is isomorphic to a complement of C.

Consider the vector space V over R of all sequences f : N→ R (where 0 ∈ N). We say that f is an even sequence if f(2k + 1) = 0 for all k ∈ N, and that f is an odd sequence if f(2k) = 0 for all k ∈ N. Choose A to be the space of all even sequences, choose B to be the space of all odd sequences, choose C = V , and choose D = {0}. Then A ∼= C since ϕ : A→ C given by (ϕf)(k) = f(2k) is an isomorphism. But clearly B ≇ {0}.

Example 53. [Exercise 3.12] Find isomorphic vector spaces V and W with

V = S ⊕B and W = S ⊕D

but B ≇ D. Hence V ∼= W does not imply that V/S ∼= W/S.

Using the terminology of Example 52, choose S to be the space of all even sequences, choose B to be the space of all odd sequences, and choose D = {0}. Then V is the space of all sequences, which is isomorphic to W = S ⊕ D = S, the space of all even sequences. Here V/S ∼= B is nonzero while W/S = {0}.

Lemma 54. Let S, T be subspaces of V . Then there exist subspaces S ′, T ′, U such that

S = S ′ ⊕ (S ∩ T ),

T = T ′ ⊕ (S ∩ T ),

V = S ′ ⊕ (S ∩ T )⊕ T ′ ⊕ U.

Proof. Since S ∩ T is a subspace of both S and T , we may choose complements S ′ and T ′ of S ∩ T in S and T respectively. Choose a complement U of S + T so that V = (S + T )⊕ U . It is clear that S + T = S ′ + (S ∩ T ) + T ′; it remains to show that the sum is direct. Suppose that v ∈ S ′ ∩ [T ′ + (S ∩ T )] = S ′ ∩ T . Then v ∈ S ′ and v ∈ S ∩ T , so v = 0. Similarly, v ∈ T ′ ∩ [S ′ + (S ∩ T )] implies that v = 0. Finally, if v ∈ (S ∩ T ) ∩ (S ′ + T ′) then v = s + t where s ∈ S ′ and t ∈ T ′. Since t = v − s ∈ S we have t = 0 and since s = v − t ∈ T we have s = 0. This completes the proof. �


Theorem 55. [Exercise 3.13] Let V be a vector space with

V = S1 ⊕ T1 = S2 ⊕ T2.

If S1 and S2 have finite codimension in V , then so does S1 ∩ S2 and

codim(S1 ∩ S2) ≤ dim(T1) + dim(T2).

Proof. Applying Lemma 54 gives a decomposition

S1 = S ′1 ⊕ (S1 ∩ S2),

S2 = S ′2 ⊕ (S1 ∩ S2),

V = S ′1 ⊕ (S1 ∩ S2)⊕ S ′2 ⊕ U.

We can compute

V/(S1 ∩ S2) ∼= S ′1 ⊕ S ′2 ⊕ U = (S ′1 ⊕ U) + (S ′2 ⊕ U).

Then V/(S1 ∩ S2) must be finite-dimensional, being the sum of the finite-dimensional vector spaces S ′2 ⊕ U ∼= V/S1 and S ′1 ⊕ U ∼= V/S2. Furthermore,

dim(V/(S1 ∩ S2)) = dim((S ′1 ⊕ U) + (S ′2 ⊕ U))

≤ dim(S ′1 ⊕ U) + dim(S ′2 ⊕ U)

= dim(V/S1) + dim(V/S2)

= dim(T1) + dim(T2)

since T1 ∼= V/S1 and T2 ∼= V/S2. �

Remark 56. [Exercise 3.15] Let B be a basis for an infinite-dimensional vector space V and define, for all b ∈ B, the map b′ ∈ V ∗ by b′(c) = 1 if c = b and 0 otherwise, for all c ∈ B. Does B′ = {b′ | b ∈ B} form a basis for V ∗? What do you conclude about the concept of a dual basis?

No, B′ is not a basis for V ∗. For example, the functional f ∈ V ∗ that sends all elements of B to 1 is not a linear combination of elements from B′, since such combinations must be finite. Therefore the concept of a dual basis only applies to finite-dimensional vector spaces.

Theorem 57. [Exercise 3.16] If S and T are subspaces of V , then (S ⊕ T )∗ ∼= S∗ ⊞ T ∗.

Proof. Define ϕ : S∗ ⊞ T ∗ → (S ⊕ T )∗ by

(ϕ(f, g))(s+ t) = f(s) + g(t).

This map is easily seen to be linear. Given a functional f ∈ (S ⊕ T )∗,

ϕ(f |S, f |T )(s+ t) = f(s) + f(t) = f(s+ t),

which shows that ϕ is surjective. If ϕ(f, g) = 0, then f(s) + g(t) = 0 for every s ∈ S and t ∈ T ; taking t = 0 and then s = 0 shows that f and g are both zero, so ϕ is injective. �


Theorem 58. [Exercise 3.17] 0× = 0 and ι× = ι, where 0 is the zero operator and ι : V → V is the identity.

Proof. For all f ∈ W ∗,

(0×f)(v) = (f0)(v) = f(0) = 0,

and for all f ∈ V ∗,

(ι×f)(v) = (fι)(v) = f(v). �

Theorem 59. [Exercise 3.18] Let S be a subspace of V . Then (V/S)∗ ∼= S0.

Proof. Define ϕ : (V/S)∗ → S0 by (ϕf)(v) = f(v + S), which is clearly linear. Given any g ∈ S0 define f ∈ (V/S)∗ by f(v + S) = g(v); this is well-defined since v + S = v′ + S implies

f(v + S)− f(v′ + S) = g(v)− g(v′) = g(v − v′) = 0,

as v − v′ ∈ S. Then we have ϕf = g, which shows that ϕ is surjective. If ϕf = 0 then f(v + S) = 0 for all v ∈ V , i.e. f = 0. �

Theorem 60. [Exercise 3.19] Let V and W be vector spaces.

(1) (τ + σ)× = τ× + σ× for τ, σ ∈ L(V,W ).
(2) (rτ)× = rτ× for any r ∈ F and τ ∈ L(V,W ).

Proof. For all f ∈ W ∗,

[(τ + σ)×f ](v) = f((τ + σ)v) = f(τv) + f(σv) = (τ×f)(v) + (σ×f)(v),

and for all r ∈ F , f ∈ W ∗,

[(rτ)×f ](v) = f((rτ)v) = rf(τv) = r(τ×f)(v). �

Theorem 61. [Exercise 3.20] Let τ ∈ L(V,W ), where V and W are finite-dimensional. Then rk(τ) = rk(τ×).

Proof. By Theorem 3.19 we have im(τ×) = ker(τ)0 ∼= S∗ where S is any complement of ker(τ) in V . Then

rk(τ×) = dim(S∗) = dim(S) = rk(τ). �

Theorem 62. Let V be a vector space over a field F and let X, Y ⊆ V .

(1) X0 is a subspace of V ∗.


(2) If X ⊆ Y then Y 0 ⊆ X0.
(3) X ⊆ span(X) ⊆ X00.

Proof. Let f, g ∈ X0, a ∈ F and x ∈ X. Clearly 0 ∈ X0; we also have f(x) = g(x) = 0, so (f + g)(x) = 0 and (af)(x) = 0. Therefore f + g, af ∈ X0, which proves (1). For (2), let f ∈ Y 0 so that f(y) = 0 for every y ∈ Y . In particular, f(x) = 0 for every x ∈ X, so f ∈ X0.

If v ∈ span(X) then v = a1v1 + · · ·+ anvn where v1, . . . , vn ∈ X. If f ∈ X0 then

f(v) = a1f(v1) + · · ·+ anf(vn) = 0,

so v ∈ X00. This proves (3). �

Chapter 4. Modules I: Basic Properties

Theorem 63. [Universal property of a free object] Let F and M be R-modules, let B be a basis for F and let τ0 : B → M be any function. Then τ0 extends uniquely to an R-map τ : F → M . That is, there exists a unique R-map τ : F → M such that τi = τ0, where i : B → F is the canonical injection. The proof is the same as that of Theorem 14.

Theorem 64. [Universal property of a free object, converse] Let F be an R-module. Let B be a subset of F such that for every R-module M and every τ0 : B → M there exists a unique R-map τ : F → M that extends τ0. Then B is a basis for F .

Proof. Let F (B) be the free module on B. There exists a unique R-map f : F → F (B) that extends the inclusion map i : B → F (B). There also exists a unique R-map g : F (B) → F that extends the inclusion map i1 : B → F . Since fg : F (B) → F (B) agrees with idF (B) on B and B is a basis for F (B), we must have fg = idF (B). On the other hand, gf : F → F agrees with idF on B, and by the given conditions on B we have gf = idF . Therefore g is an isomorphism, and g carries the basis B of F (B) to the basis B of F (Theorem 70). �

Theorem 65. [Exercise 4.2] Let S = {v1, . . . , vn} be a subset of a module M . Then N = 〈S〉 is the smallest submodule of M containing S. That is, N is contained in every submodule of M that contains S.

Proof. Any submodule of M that contains S must also contain all linear combinations of elements from S, i.e. 〈S〉. �

Theorem 66. [Exercise 4.3] Let M be an R-module and let I be an ideal in R. LetIM be the set of all finite sums of the form

r1v1 + · · ·+ rnvn

where ri ∈ I and vi ∈ M . Then IM is a submodule of M .


Proof. It is clear that IM ⊆M and that IM is closed under addition. If r ∈ R, ri ∈ Iand vi ∈M , then

r(r1v1 + · · ·+ rnvn) = (rr1)v1 + · · ·+ (rrn)vn ∈ IM

since I is an ideal in R. �

Theorem 67. [Exercise 4.4] Let M be an R-module and let S(M) be the set of allsubmodules of M . If S, T ∈ S(M), then

S ∩ T = glb {S, T}

and

S + T = lub {S, T} .

Proof. Same as Theorem 1. �

Theorem 68. [Exercise 4.5] Let S1 ⊆ S2 ⊆ · · · be an ascending sequence of submodules of an R-module M . Then the union ⋃Si is a submodule of M .

Proof. If r ∈ R and v ∈ ⋃Si, then v ∈ Sk for some k. Therefore rv ∈ Sk ⊆ ⋃Si. If v, v′ ∈ ⋃Si, then v ∈ Sj and v′ ∈ Sk for some j, k. Since v, v′ ∈ Smax(j,k), we have v + v′ ∈ Smax(j,k) ⊆ ⋃Si. �

Example 69. [Exercise 4.6] Give an example of a module M that has a finite basis but with the property that not every spanning set in M contains a basis and not every linearly independent set in M is contained in a basis.

Consider Z² = Z⊕ Z as a Z-module, which has a basis {(1, 0), (0, 1)}. The spanning set

{(2, 0), (3, 0), (0, 2), (0, 3)}

does not contain a basis, and the linearly independent set {(2, 0)} is not contained in a basis.

Theorem 70. [Exercise 4.8] Let τ ∈ HomR(M,N) be an R-isomorphism. If B is abasis for M , then τB is a basis for N .

Proof. Same as Theorem 16. �

Theorem 71. [Exercise 4.9] Let M be an R-module and let τ ∈ HomR(M,M) be anR-endomorphism. Then τ is idempotent, that is, τ 2 = τ , if and only if

M = ker(τ)⊕ im(τ) and τ |im(τ) = ι.


Proof. Suppose that τ 2 = τ and let v ∈ M . Since τ(v − τv) = τv − τ 2v = 0, we have v = (v − τv) + τv ∈ ker(τ) + im(τ). If v ∈ ker(τ) and v ∈ im(τ) then v = τx for some x ∈ M , and v = τ 2x = τv = 0. This shows that M = ker(τ) ⊕ im(τ), and clearly τ |im(τ) = ι. Conversely, suppose that M = ker(τ) ⊕ im(τ) and τ |im(τ) = ι. Let v ∈ M so that v = x + y where x ∈ ker(τ) and y ∈ im(τ). Then

τ 2v = τ 2y = τy = τ(x+ y) = τv. �

Example 72. [Exercise 4.10] Consider the ring R = F [x, y] of polynomials in two variables. Show that the set M consisting of all polynomials in R that have zero constant term is an R-module. Show that M is not a free R-module.

The first claim is obvious. For the second claim, suppose that B is a basis for M . If B contains two distinct elements f(x, y) and g(x, y), then

g(x, y)f(x, y)− f(x, y)g(x, y) = 0,

which contradicts the linear independence of f(x, y) and g(x, y). Therefore B contains exactly one element f(x, y), for an empty set cannot span M . Since x, y ∈ M = 〈f(x, y)〉, the polynomial f(x, y) divides both x and y, so f(x, y) is a nonzero constant; but every element of M has zero constant term, a contradiction. Therefore M is not free.

Theorem 73. [Exercise 4.11] Let R be an integral domain and let M be an R-module. If v1, . . . , vn is linearly independent over R, then so is rv1, . . . , rvn for any nonzero r ∈ R.

Proof. If

a1rv1 + · · ·+ anrvn = 0,

then a1r = · · · = anr = 0. Since r ≠ 0 and R is an integral domain, we must have a1 = · · · = an = 0. �

Theorem 74. [Exercise 4.12] If a nonzero commutative ring R with identity has the property that every finitely generated R-module is free, then R is a field.

Proof. Let I be an ideal of R with {0} ⊂ I ⊂ R. Then R/I = 〈1 + I〉 is a nonzero finitely generated R-module, hence free with some nonempty basis B. Choose b + I ∈ B and a nonzero i ∈ I. Then i(b + I) = I, which contradicts the linear independence of B. Therefore every ideal of R is {0} or R, which shows that R is simple and therefore a field. �


Theorem 75. [Exercise 4.13] Let M and N be R-modules. If S is a submodule of M and T is a submodule of N then

(M ⊕N)/(S ⊕ T ) ∼= (M/S) ⊞ (N/T ).

Proof. The R-map ϕ : M ⊕N → (M/S) ⊞ (N/T ) given by (m,n) 7→ (m+ S, n+ T ) is surjective and ker(ϕ) = S ⊕ T . �

Remark 76. [Exercise 4.14] If R is a commutative ring with identity and I is an ideal of R, then I is an R-module. What is the maximum size of a linearly independent set in I? Under what conditions is I free?

Any linearly independent set in I contains at most 1 element, for if a, b ∈ I are distinct nonzero elements then ba − ab = 0 is a nontrivial dependence relation. Therefore if I is free then I is principal. If R is an integral domain, then I is free if and only if I is principal.

Theorem 77. [Exercise 4.15] Let R be an integral domain and let M be an R-module.

(1) Mtor is a submodule of M .
(2) M/Mtor is torsion-free.
(3) If R is not an integral domain, then Mtor may not be a submodule of M .

Proof. If x ∈ Mtor then ax = 0 for some nonzero a ∈ R. Therefore a(rx) = 0 and rx ∈ Mtor. If x, y ∈ Mtor then ax = by = 0 for some nonzero a, b ∈ R. Therefore ab(x + y) = 0 and x + y ∈ Mtor, noting that ab ≠ 0 since R is an integral domain. This shows that Mtor is a submodule of M . For (2), suppose that r(x + Mtor) = Mtor for some x + Mtor ∈ M/Mtor and some nonzero r ∈ R. Then rx ∈ Mtor, and srx = 0 for some nonzero s ∈ R. Since sr ≠ 0, we have x ∈ Mtor, i.e. x + Mtor = Mtor. This shows that M/Mtor is torsion-free. For (3), consider Z6 as a Z-module; 2 and 3 are torsion elements but 2 + 3 = 5 is not. �

Example 78. [Exercise 4.16]

(1) Find a module M that is finitely generated by torsion elements but for which ann(M) = {0}.

(2) Find a torsion module M for which ann(M) = {0}.

Proof. For (1), take R = Z6 and M = 〈2, 3〉 = Z6 as a module over itself. The generators 2 and 3 are torsion elements, since 3 · 2 = 0 and 2 · 3 = 0, but ann(1) = {0}, so ann(M) = {0}. For (2), let {pi}i≥1 be the sequence of all prime numbers and let M be the Z-module defined by the infinite external direct sum

Z2 ⊞ Z3 ⊞ · · ·⊞ Zpi ⊞ · · · .

If x = (k1, . . . , kn, 0, 0, . . . ) ∈ M , then p1 · · · pn ∈ ann(x) and x is a torsion element. However, ann(M) = {0}, for if r ∈ ann(M) is nonzero and pi is the largest prime number dividing r, then r /∈ ann(x) where x has a 1 in the (i + 1)th position and zeros elsewhere. �


Theorem 79. [Exercise 4.19] Any R-module M is isomorphic to the R-module HomR(R,M).

Proof. Define ϕ : HomR(R,M) → M by f 7→ f(1). If f, g ∈ HomR(R,M) and r ∈ R, we have

ϕ(f + g) = (f + g)(1) = f(1) + g(1) = ϕ(f) + ϕ(g),

ϕ(rf) = rf(1) = rϕ(f),

which shows that ϕ is an R-map. If ϕ(f) = f(1) = 0 then for any r ∈ R,

f(r) = rf(1) = 0,

so f = 0. This shows that ϕ is injective. If v ∈ M then we may define an R-map f : R → M by f(r) = rv. Since f(1) = v, we have ϕ(f) = v, which shows that ϕ is surjective. Therefore HomR(R,M) ∼= M . �

Theorem 80. [Exercise 4.20] Let R and S be commutative rings with identity and letf : R → S be a ring homomorphism. Then any S-module is also an R-module underthe scalar multiplication rv = f(r)v.

Proof. Obvious. �

Lemma 81. Let f : Zn → Zm be a map with f(j) ≡ kj (mod m) for some k. Then f is a Z-map if and only if m/ gcd(m,n) divides f(1) = k.
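
To see this, note that such an f is well-defined on Zn precisely when k(j + n) ≡ kj (mod m) for all j, that is, when m divides kn; and m | kn if and only if m/ gcd(m,n) divides k. When this holds, f(i + j) = f(i) + f(j) and f(rj) = rf(j), so f is a Z-map.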

Theorem 82. [Exercise 4.21] HomZ(Zn,Zm) ∼= Zd where d = gcd(n,m).

Proof. Define the Z-map ϕ : Zm → HomZ(Zn,Zm) by k 7→ ψk, where ψk : Zn → Zm is given by j 7→ k(m/d)j. This is well-defined by Lemma 81. Given f ∈ HomZ(Zn,Zm), we know by Lemma 81 that m/d divides f(1), and ψf(1)/(m/d) = f since

ψf(1)/(m/d)(j) = f(1)j = f(j).

Therefore ϕ(f(1)/(m/d)) = f , which shows that ϕ is surjective. Suppose that ϕ(k) = ψk = 0. This is true if and only if

ψk(1) = km/d ≡ 0 (mod m) ⇔ d | k.

Then ker(ϕ) = dZm and HomZ(Zn,Zm) ∼= Zm/dZm ∼= Zd. �
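
As a quick check of the formula, take n = 4 and m = 6, so that d = 2. The Z-maps Z4 → Z6 are exactly j 7→ 0 and j 7→ 3j (note that 3 · 4 ≡ 0 (mod 6)), and this two-element group is isomorphic to Z2.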

Theorem 83. [Exercise 4.22] Let R be a commutative ring with identity. If I, J areideals of R for which R/I ∼= R/J as R-modules, then I = J . If R/I ∼= R/J only asrings, then I = J may be false.


Proof. Let ϕ : R/I → R/J be an R-isomorphism. Let i ∈ I. Then

i+ J = i(1 + J) = iϕ(ϕ−1(1 + J)) = ϕ(iϕ−1(1 + J)) = ϕ(I) = J,

since ϕ−1(1 + J) = a + I for some a ∈ R and i(a + I) = ia + I = I. Therefore i ∈ J and I ⊆ J . Similarly, J ⊆ I.

Alternatively, ann(R/I) = ann(R/J) since R/I ∼= R/J , but ann(R/I) = I and ann(R/J) = J .

If R/I ∼= R/J only as rings, the conclusion may not hold. For example, take R = Z3[x], I = 〈x³ − x + 1〉 and J = 〈x³ − x² + 1〉. We have R/I ∼= R/J as finite fields, but I ≠ J . �

Chapter 5. Modules II: Free and Noetherian Modules

Remark 84. [Exercise 5.1] If M is a free R-module and τ : M → N is an epimorphism, then must N also be free? No. Take M = Z² as a Z-module with basis {(1, 0), (0, 1)}, and N = Z2 ⊕ Z2 with the epimorphism (m,n) 7→ ([m], [n]); N is not free, since a nonzero free Z-module is infinite.

Theorem 85. [Exercise 5.2] Let I be an ideal of R. If R/I is a free R-module, then I = {0} or I = R.

Proof. If R/I = {0}, then I = R and we are done. Otherwise, let B be a (nonempty) basis for R/I. Let i ∈ I and choose any r + I ∈ B. Then i(r + I) = I, and since B is linearly independent, i = 0. �

Theorem 86. [Exercise 5.4] Let S be a submodule of an R-module M . If M is finitely generated, so is the quotient module M/S.

Proof. Let P = {p1, . . . , pn} be a finite subset of M such that M = 〈P 〉. Then {p1 + S, . . . , pn + S} spans M/S, for if m + S ∈ M/S then

m = a1p1 + · · ·+ anpn

and

m+ S = a1(p1 + S) + · · ·+ an(pn + S). �

Theorem 87. [Exercise 5.5] Let S be a submodule of an R-module M . If both S andM/S are finitely generated, then so is M .


Proof. Let P = {p1 + S, . . . , pn + S} be a finite subset of M/S such that M/S = 〈P 〉, and let Q = {q1, . . . , qm} be a finite subset of S such that S = 〈Q〉. Let v ∈ M so that v + S ∈ M/S and

v + S = a1p1 + · · ·+ anpn + S.

Then v − (a1p1 + · · ·+ anpn) ∈ S, so

v − (a1p1 + · · ·+ anpn) = b1q1 + · · ·+ bmqm.

This shows that M = 〈{p1, . . . , pn} ∪Q〉. �

Theorem 88. [Exercise 5.7] Let τ : M → N be an R-homomorphism.

(1) If M is finitely generated, then so is im(τ).
(2) If ker(τ) and im(τ) are finitely generated, then M = ker(τ) + S where S is a finitely generated submodule of M . Hence, M is finitely generated.

Proof. For (1), im(τ) ∼= M/ ker(τ), which is finitely generated by Theorem 86. (2) follows from Theorem 87. �

Theorem 89. [Exercise 5.8] If R is Noetherian and I is an ideal of R, then R/I is also Noetherian.

Proof. Let

J1 ⊆ J2 ⊆ J3 ⊆ · · ·

be an ascending sequence of ideals of R/I. By the correspondence theorem, for each i we can write Ji = Si/I where Si is an ideal of R, and

S1 ⊆ S2 ⊆ S3 ⊆ · · ·

is an ascending sequence of ideals of R. Since R is Noetherian, there exists a k such that

Sk = Sk+1 = Sk+2 = · · · ,

and therefore

Jk = Jk+1 = Jk+2 = · · · . �

Theorem 90. [Exercise 5.9] If R is Noetherian, then so is R[x1, . . . , xn].

Proof. Follows from induction on n and the Hilbert basis theorem. �

Example 91. [Exercise 5.10] Find an example of a commutative ring with identity that does not satisfy the ascending chain condition.

The ring R = Z[x1, x2, x3, . . . ] of polynomials in infinitely many variables is not Noetherian. Specifically,

〈x1〉 ⊆ 〈x1, x2〉 ⊆ 〈x1, x2, x3〉 ⊆ · · ·

is an ascending chain of ideals of R that does not become constant.


Theorem 92. [Exercise 5.11]

(1) An R-module M is cyclic if and only if it is isomorphic to R/I where I is an ideal of R.
(2) An R-module M is simple if and only if it is isomorphic to R/I where I is a maximal ideal of R.

(3) For any nonzero commutative ring R with identity, a simple R-module exists.

Proof. If M is cyclic, then M = 〈a〉 for some a ∈ M , and ϕ : R → M given by r 7→ ra is a surjection. Therefore M ∼= R/ ker(ϕ). Conversely, let ϕ : R/I → M be an isomorphism, and let a = ϕ(1 + I). Then for any v ∈ M , there exists some r ∈ R such that

v = ϕ(ϕ−1(v)) = ϕ(r + I) = ra,

which shows that M = 〈a〉. This proves (1).

For (2), suppose that M is simple and choose any nonzero v ∈ M . Since 〈v〉 is a nonzero submodule of M , we must have M = 〈v〉. By (1), M is isomorphic to R/I where I is an ideal of R. Furthermore, I is a maximal ideal since R/I ∼= M is simple. Conversely, suppose that M ∼= R/I where I is a maximal ideal. Then M is simple since R/I is simple.

For (3), choose any maximal ideal I of R (which always exists). Then R/I is simple by (2). �
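
For instance, over R = Z the module Z6 ∼= Z/6Z is cyclic but not simple (it has the proper nonzero submodule 〈2〉), while Z5 ∼= Z/5Z is simple since 5Z is a maximal ideal of Z.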

Example 93. [Exercise 5.12] If R is not a principal ideal domain, then there may existan n-generated R-module that contains a submodule that is not n-generated.

Take R = Z[x]. Then R is a 1-generated R-module since R = 〈1〉, but 〈2, x〉 =2Z+ xZ[x] is not 1-generated.

Theorem 94. [Exercise 5.13] Let R be a commutative ring with identity.

(1) R is Noetherian if and only if every finitely generated R-module is Noetherian.
(2) Suppose that R is a principal ideal domain. If an R-module M is n-generated, then any submodule of M is also n-generated.

Proof. One direction for (1) is obvious. For the other direction, we use induction on the number n of generating elements of an R-module M . If n = 0 then M = {0}, and trivially all submodules of M are finitely generated. Now suppose that every n-generated R-module is Noetherian, and let M be an (n + 1)-generated R-module with M = 〈v1, . . . , vn+1〉. Let M ′ = 〈v1, . . . , vn〉, let S be a submodule of M , and


let T = S ∩ M ′. By the induction hypothesis, T is finitely generated since M ′ isn-generated. Now consider

S/T = S/(S ∩M ′) ∼= (S +M ′)/M ′.

If S/T is finitely generated then S is finitely generated, by Theorem 87. Therefore it remains to show that (S + M ′)/M ′ is finitely generated. Let ϕ : Rn+1 → M be the epimorphism defined by (r1, . . . , rn+1) 7→ r1v1 + · · · + rn+1vn+1, and let ψ : Rn+1 → R be the R-map that sends each tuple to its last coordinate. Then U = ψ(ϕ−1(S + M ′)) is a submodule of R that is generated by some finite set G = {g1, . . . , gk} ⊆ R, since R is Noetherian. Let G∗ = {g1vn+1 + M ′, . . . , gkvn+1 + M ′}. For any v + M ′ ∈ (S + M ′)/M ′, we can choose some (r1, . . . , rn+1) ∈ ϕ−1({v}), and

v +M ′ = r1v1 + · · ·+ rn+1vn+1 +M ′ = rn+1vn+1 +M ′ ∈ 〈G∗〉

since rn+1 ∈ U . This shows that (S +M ′)/M ′ = 〈G∗〉.

For (2), we can modify the proof by choosing some a ∈ R such that U = 〈a〉. ThenS/T ∼= (S + M ′)/M ′ is cyclic, and S is generated by n + 1 elements (see Theorem87). �

Theorem 95. [Exercise 5.14] Any R-module M is isomorphic to the quotient of a free module F . If M is finitely generated, then F can also be taken to be finitely generated.

Proof. Let F (M) be the free module on M and let ϕ : F (M) → M be the (unique) epimorphism that extends idM . Then M ∼= F (M)/ ker(ϕ). Now suppose that M = 〈v1, . . . , vn〉, and let ϕ : Rn → M be the epimorphism defined by

(r1, . . . , rn) 7→ r1v1 + · · ·+ rnvn.

Then M ∼= Rn/ ker(ϕ). �

Theorem 96. [Exercise 5.15] Let S, T, T1, T2 be submodules of a module M , where all modules are free and have finite rank.

(1) If S ∼= T , then M/S ∼= M/T .
(2) If S ⊕ T1 ∼= S ⊕ T2, then T1 ∼= T2.
(3) The above statements do not necessarily hold if the modules are not free or do not have finite rank.

Proof. If S ∼= T then rk(S) = rk(T ). Since the quotients M/S and M/T are here assumed to be free of finite rank,

rk(M/S) = rk(M)− rk(S) = rk(M)− rk(T ) = rk(M/T ),

and free modules of the same finite rank are isomorphic, so M/S ∼= M/T .


Similarly, if S ⊕ T1 ∼= S ⊕ T2 then

rk(T1) = rk(S ⊕ T1)− rk(S) = rk(S ⊕ T2)− rk(S) = rk(T2),

so T1 ∼= T2.

For (3), see Example 53. �

Chapter 6. Modules over a Principal Ideal Domain

Lemma 97. Let α1, . . . , αn be pairwise relatively prime, let µ = α1 · · ·αn, let βi = µ/αi for each i, and choose ai ∈ R such that

a1β1 + · · ·+ anβn = 1.

Then ai and αi are relatively prime for each i.

Proof. Let d be a GCD of ak and αk. Then d clearly divides akβk, and d divides aiβi when i ≠ k since αk divides βi. Therefore d divides

a1β1 + · · ·+ anβn = 1

and d must be a unit. �

Theorem 98. Let M be an R-module and suppose that

M = A1 + · · ·+ An

where the submodules Ai have pairwise relatively prime orders. Then the sum is direct.

Proof. Denote the order of each Ai by αi, and let µ = α1 · · ·αn. Suppose that

v1 + · · ·+ vn = 0

where vi ∈ Ai. Then for each i,

0 = (µ/αi)(v1 + · · ·+ vn) = (µ/αi)vi,

since αj divides µ/αi for every j ≠ i, and

1 = o((µ/αi)vi) = o(vi)/ gcd(µ/αi, o(vi)) = o(vi)

since o(vi) divides αi and gcd(µ/αi, αi) = 1. Therefore vi = 0. �

Theorem 99. [Exercise 6.1] Any free module over an integral domain is torsion-free.


Proof. Let R be an integral domain and let M be a free R-module. Let B be a basis for M . If v ∈ M is nonzero and rv = 0 for some r, then we can write

r(a1v1 + · · ·+ anvn) = 0

where v1, . . . , vn ∈ B and at least one ai is nonzero, say ak. Since B is a basis, rak = 0, and since R is an integral domain, r = 0. �

Theorem 100. [Exercise 6.2] Let M be a finitely generated torsion module over a principal ideal domain. The following are equivalent:

(1) M is indecomposable.
(2) M has only one elementary divisor (including multiplicity).
(3) M is cyclic of prime power order.

Proof. If M has two or more elementary divisors, it must be decomposable by definition, so (1) ⇒ (2). If M has only one elementary divisor p^e, then M = 〈v〉 where ann(〈v〉) = 〈p^e〉, which shows (2) ⇒ (3). Suppose that M = A ⊕ B where A and B are proper submodules of M . If o(A) and o(B) are relatively prime, then M cannot have prime power order. If o(A) and o(B) are not relatively prime, then M cannot be cyclic by Theorem 6.17. This proves (3) ⇒ (1). �
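
For example, over R = Z the module Z4 is cyclic of order 2² and is indecomposable, whereas Z6 ∼= Z2 ⊕ Z3 is cyclic but decomposable, since its order 6 is not a prime power.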

Theorem 101. [Exercise 6.3] Let R be a principal ideal domain and R+ the field ofquotients. Then R+ is an R-module, and any nonzero finitely generated submodule ofR+ is a free module of rank 1.

Proof. If a/b ∈ R+, defining r (a/b) = (ra)/b gives R+ the structure of an R-module.Furthermore, R+ is free with basis {1}, and the result follows from Theorem 6.5. �

Theorem 102. [Exercise 6.4] Let R be a principal ideal domain. Let M be a finitelygenerated torsion-free R-module. Suppose that N is a submodule of M for which N isa free R-module of rank 1 and M/N is a torsion module. Then M is a free R-moduleof rank 1.

Proof. By Theorem 6.8, we know that M is free (with a finite basis). Let B ={v1, . . . , vn} be a basis for M . Since M/N is a torsion module, for each i there ex-ists a nonzero ri ∈ R such that ri(vi + N) = N and rivi ∈ N . Suppose that n ≥ 2.Since N has rank 1, {r1v1, . . . , rnvn} must be linearly dependent by Theorem 5.5. Eachri is nonzero, so this contradicts the linear independence of B. Therefore n ≤ 1, andsince N ⊆M is nonempty, we must have n = 1. �

Example 103. [Exercise 6.5] Show that the primary cyclic decomposition of a torsionmodule over a principal ideal domain is not unique (even though the elementary divisorsare).

Page 31: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

31

Consider Z2 × Z2 as a Z-module. We have the two decompositions

Z2 × Z2 = 〈(1, 0)〉 ⊕ 〈(0, 1)〉= 〈(1, 1)〉 ⊕ 〈(0, 1)〉 .

Example 104. [Exercise 6.6] Show that if M is a finitely generated R-module where Ris a principal ideal domain, then the free summand in the decomposition M = F ⊕Mtor

need not be unique.

Consider Z× Z2 as Z-module. We have the two decompositions

Z× Z2 = 〈(1, 0)〉 ⊕ 〈(0, 1)〉= 〈(1, 1)〉 ⊕ 〈(0, 1)〉 .

Theorem 105. [Exercise 6.7] If 〈v〉 is a cyclic R-module of order a then the mapτ : R→ 〈v〉 defined by τr = rv is a surjective R-homomorphism with kernel 〈a〉 and so〈v〉 ∼= R/ 〈a〉.

Proof. Obvious. �

Theorem 106. [Exercise 6.8] If R is an integral domain with the property that allsubmodules of cyclic R-modules are cyclic, then R is a principal ideal domain.

Proof. Observe that R is a cyclic R-module with R = 〈1〉. If I is an ideal of R, then Iis a submodule of R and I = 〈a〉 for some a ∈ I. This shows that I is a principal idealdomain. �

Theorem 107. [Exercise 6.9] Let F be a finite field and let F ∗ be the set of all nonzeroelements of F .

(1) If p(x) ∈ F [x] is a nonconstant polynomial over F and if r ∈ F is a root ofp(x), then x− r is a factor of p(x).

(2) Every nonconstant polynomial p(x) ∈ F [x] of degree n has at most n distinctroots in F .

(3) F ∗ is a cyclic group under multiplication.

Proof. Since F [x] is a Euclidean domain, we can write

p(x) = (x− r)q(x) + c.

where c ∈ F . But 0 = p(r) = c, so x− r is a factor of p(x) and this proves (1). For (2),if p(x) has distinct roots {r1, . . . , rm} where m > n then

m∏i=1

(x− ri)

is a factor of p(x) with degree m, which contradicts p(x) having degree n.

Page 32: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

32

F ∗ is a group under multiplication, and we can define F ∗ as a Z-module with scalarmultiplication kf = fk. Since F ∗ is finite, it must be finitely generated. Then we havea primary decomposition

(*) F ∗ = Mp1 ⊕ · · · ⊕Mpn

where p1, . . . , pn are distinct prime numbers and o(Mpi) = peii for each i. Fix some iand let k = peii . Choose an element fi ∈ Mpi with o(fi) = k so that the order of thesubgroup 〈fi〉 is k. Every element of Mpi is a root of the polynomial p(x) = xk − 1, so|Mpi | ≤ k by (2). This shows that Mpi = 〈fi〉 is cyclic. Applying Theorem 6.4 to (*),we have

F ∗ = 〈f1〉 ⊕ · · · ⊕ 〈fn〉= 〈f1 + · · ·+ fn〉 ,

which proves (3). �

Lemma 108. Let R be a principal ideal domain and let M be an R-module. If v, w ∈Mwhere 〈w〉 is a submodule of 〈v〉 and o(w) = o(v), then 〈w〉 = 〈v〉.

Proof. We have w = rv for some r ∈ R, and we are given that o(w) = o(rv) = o(v).By Theorem 6.3, 〈w〉 = 〈rv〉 = 〈v〉. �

Theorem 109. [Exercise 6.10] Let R be a principal ideal domain and let M = 〈v〉 bea cyclic R-module with order α. Then for every β ∈ R such that β divides α there is aunique submodule of M of order β.

Proof. Let α = pe11 · · · penn and β = pf11 · · · pfnn be factorizations of α and β where fi ≤ eifor each i. By Theorem 6.17, we can write

M = 〈v1〉 ⊕ · · · ⊕ 〈vn〉

where 〈vi〉 is a primary submodule of M and o(〈vi〉) = peii . Then o(pei−fii 〈vi〉) = pfiiand

N = pe1−f11 〈v1〉 ⊕ · · · ⊕ pen−fnn 〈vn〉is a submodule of M with order β. To prove uniqueness, suppose that N ′ is a submoduleM of order β. Since N ′ is cyclic, applying Theorem 6.17 gives a decomposition

N ′ = 〈w1〉 ⊕ · · · ⊕ 〈wn〉

where o(〈wi〉) = pfii . Fix some i; we want to show that 〈wi〉 = pei−fii 〈vi〉. By definition,

〈vi〉 = {v ∈M | peii v = 0}is a primary submodule of M and therefore 〈wi〉 must be a submodule of 〈vi〉. Letx ∈ 〈wi〉 so that x = kvi for some k ∈ R. We have

pfii x = pfii kvi = 0,

Page 33: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

33

so o(vi) = peii divides pfii k and pei−fii must divide k. This shows that x ∈ pei−fii 〈vi〉 and

therefore 〈wi〉 is a submodule of pei−fii 〈vi〉. But 〈wi〉 = pei−fii 〈vi〉 by Lemma 108, whichproves the uniqueness of N . �

Theorem 110. [Exercise 6.11] Let M be a free module of finite rank over a principalideal domain R. Let N be a submodule of M . If M/N is torsion, then rk(N) = rk(M).

Proof. Let B0 be a basis for N and let B = {v1, . . . , vn} be a basis for M . Since B0 isa linearly independent set in M , we have

rk(N) = |B0| ≤ rk(M)

by Theorem 5.5. Suppose that n > rk(N). For each i, the element vi + N ∈ M/N istorsion, and there exists some nonzero ri ∈ R such that rivi + N = N , i.e. rivi ∈ N .By Theorem 5.5, {r1v1, . . . , rnvn} is a linearly dependent set, and

a1r1v1 + · · ·+ anrnvn = 0

for some a1, . . . , an ∈ R where not all ai are zero. But every ri is nonzero, so {v1, . . . , vn}is linearly dependent, which is a contradiction. Therefore

rk(N) ≤ rk(M) = n ≤ rk(N)

and rk(N) = rk(M). �

Example 111. [Exercise 6.13] Show that the rational numbers Q form a torsion-freeZ-module that is not free.

It is clear that Q is torsion-free. Suppose that B is a basis for Q. If r1, r2 are distinctelements of B, then we may write r1 = a1/b1 and r2 = a2/b2 so that

(a2b1)r1 + (−a1b2)r2 = a1a2 − a1a2 = 0.

This contradicts the linear independence of B, so B must contain exactly one elementr. But r/2 cannot be written as an integer multiple of r, so Q cannot be free.

More on Complemented Submodules.

Theorem 112. [Exercise 6.14] Let R be a principal ideal domain and let M be a freeR-module.

(1) A submodule N of M is complemented if and only if M/N is free.(2) If M is also finitely generated, then N is complemented if and only if M/N is

torsion-free.

Proof. If M/N is free, then applying Theorem 5.6 to the quotient map τ : M →M/Nshows that N is complemented. Conversely, if N is complemented then M = N ⊕ Sfor some submodule S of M , and M/N ∼= S by Theorem 4.16. But S is free, so M/Nis free. If M is finitely generated then we know that M/N is finitely generated, by

Page 34: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

34

Theorem 86. Theorem 6.8 shows that M/N is free if and only if M/N is torsion-free,which proves (2). �

Theorem 113. [Exercise 6.15] Let M be a free module of finite rank over a principalideal domain R.

(1) If N is a complemented submodule of M , then rk(N) = rk(M) if and only ifN = M .

(2) (1) need not hold if N is not complemented.(3) N is complemented if and only if any basis for N can be extended to a basis for

M .

Proof. One direction for (1) is evident. Write M = N ⊕ S. If rk(N) = rk(M) thenrk(S) = 0 and S = {0}, which shows that N = M . To show (2), let M = Z andN = 2Z as Z-modules. Then M is free with basis {1} and N is free with basis {2}, butN 6= M .

Suppose that M = N ⊕S where S is a submodule of M , and let B1 be a basis for S. IfB0 is a basis for N , then B0 ∪B1 is a basis for M . Conversely, suppose that any basisfor N can be extended to a basis for M . Let B0 be a basis for N and extend this to abasis B of M . Then

M = N ⊕ 〈B \B0〉 ,which shows that N is complemented. This proves (3). �

Theorem 114. [Exercise 6.16] Let M and N be free modules of finite rank over aprincipal ideal domain R. Let τ : M → N be an R-map.

(1) ker(τ) is complemented.(2) im(τ) need not be complemented.(3) rk(M) = rk(ker(τ)) + rk(im(τ)) = rk(ker(τ)) + rk(M/ ker(τ)).(4) If τ is surjective, then τ is an isomorphism if and only if rk(M) = rk(N).(5) If L is a submodule of M and if M/L is free, then

rk(M/L) = rk(M)− rk(L).

Proof. By considering τ as a map onto the free module im(τ), Theorem 5.6 shows thatker(τ) is complemented and that

(*) M = ker(τ)⊕Nwhere N ∼= im(τ). However, im(τ) need not be complemented; we can take τ : Z→ Zdefined by k 7→ 2k. This proves (1) and (2). We have im(τ) ∼= M/ ker(τ) by the firstisomorphism theorem, so from (*) we see that

rk(M) = rk(ker(τ)) + rk(N)

= rk(ker(τ)) + rk(im(τ))

Page 35: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

35

= rk(ker(τ)) + rk(M/ ker(τ)),

which proves (3). Suppose that τ is surjective. If τ is an isomorphism then ker(τ) = {0}and rk(M) = rk(N) by (3). Conversely, if rk(M) = rk(N) then rk(ker(τ)) = 0 andker(τ) = {0}. This proves (4). Finally, (5) follows from taking the quotient mapπ : M →M/L in (3). �

Theorem 115. [Exercise 6.17] Let R be a commutative ring with identity. A submoduleN of an R-module M is said to be pure in M if whenever v ∈ M \ N , then rv /∈ Nfor all nonzero r ∈ R.

(1) N is pure if and only if v ∈ N and v = rw for nonzero r ∈ R implies thatw ∈ N .

(2) N is pure if and only if M/N is torsion-free.(3) If R is a principal ideal domain and M is finitely generated, then N is pure if

and only if M/N is free.(4) If L and N are pure submodules of M , then so is L ∩N .(5) If N is pure in M , then L ∩N is pure in L for any submodule L of M .

Proof. (1) is the contrapositive of the definition of pure. Suppose that N is pure andsuppose that r(v + N) = N for some r 6= 0. Then rv ∈ N which implies that v ∈ Nand v + N = N . Conversely, suppose that M/N is torsion-free. If v ∈ N and v = rw,then r(w + N) = N and w ∈ N . This proves (2), and (3) follows from Theorem 6.8.Suppose that L and N are pure submodules of M . If v ∈ L ∩N and v = rw for somenonzero r ∈ R, then w ∈ L and w ∈ N , which shows that L ∩N is pure. This proves(4). Now suppose that N is pure in M and L is any submodule of M . If v ∈ L ∩ Nand v = rw for some nonzero r ∈ R and w ∈ L, then w ∈ N . This proves (5). �

Theorem 116. [Exercise 6.18] Let M be a free module of finite rank over a principalideal domain R. Let L and N be submodules of M with L complemented in M . Then

rk(L+N) + rk(L ∩N) = rk(L) + rk(N).

Proof. By the second isomorphism theorem,

(N + L)/L ∼= N/(N ∩ L).

Applying Theorem 114 gives

rk(N + L)− rk(L) = rk(N)− rk(N ∩ L).

Page 36: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

36

Chapter 7. The Structure of a Linear Operator

Theorem 117. Let V be a vector space over a field F and let τ ∈ L(V ). If f(x), g(x) ∈F [x] are coprime polynomials such that f(x)g(x) annihilates Vτ , then

V = ker(f(τ))⊕ ker(g(τ)).

Proof. Write

a(x)f(x) + b(x)g(x) = 1

for some a(x), b(x) ∈ F [x]. If v ∈ V then

v = b(τ)g(τ)v + a(τ)f(τ)v ∈ ker(f(τ)) + ker(g(τ))

since

f(τ)b(τ)g(τ)v = 0 = g(τ)a(τ)f(τ)v.

If v ∈ ker(f(τ)) ∩ ker(g(τ)) then

v = b(τ)g(τ)v + a(τ)f(τ)v = 0,

which shows that the sum is direct. �

Remark 118. [Exercise 7.1] We have seen that any τ ∈ L(V ) can be used to make V intoan F [x]-module. Does every module V over F [x] come from some τ ∈ L(V )? Explain.

Given an F [x]-module V , define τ : V → V by v 7→ xv; τ is easily seen to be a linearmap. If

p(x) =∑i

aixi ∈ F [x]

and v ∈ V , then

p(x)v =∑i

aixiv

=∑i

aiτiv

= p(τ)v.

Thus the F [x]-module V does come from the linear operator τ .

Theorem 119. [Exercise 7.2] Let τ ∈ L(V ) have minimal polynomial

mτ (x) = pe11 (x) · · · penn (x)

where pi(x) are distinct monic primes. The following are equivalent:

(1) Vτ is cyclic.(2) deg(mτ (x)) = dim(V ), i.e. τ is nonderogatory.

Page 37: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

37

(3) The elementary divisors of τ are the prime power factors peii (x) and so

Vτ = 〈v1〉 ⊕ · · · ⊕ 〈vk〉is a direct sum of τ -cyclic submodules 〈vi〉 of order peii (x).

Proof. If Vτ is cyclic, then V is τ -cyclic of dimension deg(mτ (x)) by Theorem 7.5, whichshows (1)⇒ (2). If deg(mτ (x)) = dim(V ), then mτ (x) = cτ (x) and

pe11 (x), . . . , penn (x)

must be the complete list of elementary divisors. This proves (2)⇒ (3). Finally, if (3)holds then Vτ is cyclic by Theorem 6.4, proving (3)⇒ (1). �

Theorem 120. [Exercise 7.3] A matrix A ∈Mn(F ) is nonderogatory if and only if itis similar to a companion matrix.

Proof. If A is nonderogatory, then Theorem 7.9 shows that VA is cyclic and Theorem7.12 shows that the multiplication operator τA can be represented by a companionmatrix. This means that A is similar to a companion matrix. Conversely, if A =B(C[p(x)])B−1 then [τA]B = C[p(x)], and VA is cyclic by Theorem 7.12. By Theorem7.9, A is nonderogatory. �

Theorem 121. [Exercise 7.4] If A and B are block diagonal matrices with the sameblocks, but in possibly different order, then A and B are similar.

Proof. One simple way to see this is to explicitly produce a permutation matrix P suchthat A = PBP−1. We will use another approach here. Let A′ be the matrix formedby taking the blocks of A into the elementary divisor version of the rational canonicalform; then A is similar to A′. Define B′ similarly so that B is similar to B′. Theresulting companion blocks in A′ must be the same as those in B′ up to reordering, soA′ is similar to B′ by Theorem 7.14. �

Remark 122. [Exercise 7.5] Let A ∈ Mn(F ). Justify the statement that the entries ofany invariant factor version of a rational canonical form for A are “rational” expressionsin the entries of A, hence the origin of the term rational canonical form. Is the sametrue for the elementary divisor version?

If F = C and A contains only rational entries, then any invariant factor version of arational canonical form contains only polynomials with rational coefficients. This isdespite the fact that the invariant factors will split in C[x]. The elementary divisorsmay not be polynomials with rational coefficients, so the elementary divisor version isnot “rational” as in the invariant factor version.

Theorem 123. [Exercise 7.6] Let τ ∈ L(V ) where V is finite-dimensional. If p(x) ∈F [x] is irreducible and if p(τ) is not one-to-one, then p(x) divides the minimal polyno-mial of τ .

Page 38: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

38

Proof. Let K = ker(p(τ)). If p(τ) is not one-to-one, then K 6= {0} and ann(K) is notgenerated by a unit in F [x]. Since p(x) annihilates K and is irreducible, we must haveann(K) = 〈p(x)〉. But the minimal polynomial of τ also annihilates K, so it must bedivisible by p(x). �

Theorem 124. [Exercise 7.7] The minimal polynomial of τ ∈ L(V ) is the least commonmultiple of its elementary divisors.

Proof. Follows from Theorem 7.7. �

Remark 125. [Exercise 7.8] Let τ ∈ L(V ) where V is a finite-dimensional vector spaceover a field F . Describe conditions on the minimal polynomial of τ that are equivalentto the fact that the elementary divisor version of the rational canonical form of τ isdiagonal. What can you say about the elementary divisors?

If the elementary divisor version of the rational canonical form is diagonal, then eachelementary divisor must be linear. Equivalently, the minimal polynomial of τ splits intolinear factors over F .

Theorem 126. [Exercise 7.10] Given any multiset of monic prime power polynomials

M ={pe1,11 (x), · · · , pe1,k11 (x), · · · , · · · , pen,1n (x), · · · , pen,knn (x)

}and given any vector space V of dimension equal to the sum of the degrees of thesepolynomials, there exists an operator τ ∈ L(V ) whose multiset of elementary divisorsis M .

Proof. Define the matrix

A = diag(C[p

e1,11 (x)], · · · , C[p

e1,k11 (x)], · · · , · · · , C[pen,1n (x)], · · · , C[p

en,knn (x)]

).

Then by Theorem 7.12, the operator τA(v) = Av has elementary divisors equal toM . �

Example 127. [Exercise 7.11] Find all rational canonical forms (up to the order ofthe blocks on the diagonal) for a linear operator on R6 having minimal polynomial(x− 1)2(x+ 1)2.

The possible elementary divisor multisets and their rational canonical forms for boththe elementary divisor and invariant factor versions are:

Page 39: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

39

• {(x− 1)2, (x+ 1)2, (x− 1)2} with matrices0 −11 2

0 −11 −2

0 −11 2

and

0 0 0 −11 0 0 00 1 0 20 0 1 0

0 −11 2

• {(x− 1)2, (x+ 1)2, (x+ 1)2} with matrices

0 −11 2

0 −11 −2

0 −11 −2

and

0 0 0 −11 0 0 00 1 0 20 0 1 0

0 −11 −2

• {(x− 1)2, (x+ 1)2, x− 1, x− 1} with matrices

0 −11 2

0 −11 −2

11

and

0 0 0 −11 0 0 00 1 0 20 0 1 0

11

• {(x− 1)2, (x+ 1)2, x+ 1, x+ 1} with matrices

0 −11 2

0 −11 −2

−1−1

and

0 0 0 −11 0 0 00 1 0 20 0 1 0

−1−1

• {(x− 1)2, (x+ 1)2, x− 1, x+ 1} with matrices

0 −11 2

0 −11 −2

1−1

and

0 0 0 −11 0 0 00 1 0 20 0 1 0

0 11 0

Page 40: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

40

Example 128. [Exercise 7.12] How many possible rational canonical forms (up to orderof blocks) are there for linear operators on R6 with minimal polynomial (x−1)(x+1)2?

We give the general method for computation here, even if it is unnecessary for thisexample. The largest invariant factor must be (x − 1)(x + 1)2, so the remaining ele-mentary divisors must be a combination of x − 1, x + 1 and (x + 1)2. The number ofdifferent elementary divisor multisets is given by the coefficient of x3 in the generatingfunction

f(x) = (1 + x+ x2 + · · · )2(1 + x2 + x4 + · · · ).We can expand f :

f(x) =1

(1− x)2(1− x2)

=1

8

(1

1 + x

)+

1

8

(1

1− x

)+

1

4

(1

(1− x)2

)+

1

2

(1

(1− x)3

)=∑n≥0

(1

8(−1)n +

1

8+

1

4(n+ 1) +

1

2

(n+ 2

2

))xn.

The coefficient of x3 is then equal to

1 +1

2

(5

2

)= 6.

Remark 129. [Exercise 7.13]

(1) Show that if A and B are n × n matrices, at least one of which is invertible,then AB and BA are similar.

(2) What do the matrices

A =

[1 00 0

]and B =

[0 10 0

]have to do with this issue?

(3) Show that even without the assumption on invertibility the matrices AB andBA have the same characteristic polynomial.

Assume without loss of generality that A is invertible. Then

AB = A(BA)A−1,

so AB and BA are similar. For (2), we can compute

AB =

[0 10 0

]and BA =

[0 00 0

].

These two matrices are not similar, so (1) may not hold if both A and B are singular.

Page 41: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

41

For (3), write (by row and column operations)

A = PIn,rQ

where P and Q are invertible and In,r is an n× n matrix that has the r× r identity inthe upper left-hand corner and 0s elsewhere. Let B′ = QBP . Then AB = PIn,rB

′P−1

and BA = Q−1B′In,rQ, so it remains to show that det(xI − In,rB′) = det(xI −B′In,r),for AB is similar to In,rB

′ and BA is similar to B′In,r. Write

B′ =

[B′1,1 B′2,1B′2,1 B′2,2

]where B′1,1 is an r × r matrix. Then

det(xI − In,rB′) = det

(xI −

[B′1,1 B′2,1

0 0

])=

∣∣∣∣xI −B′1,1 −B′2,10 xI

∣∣∣∣= xn−r det(xI −B′1,1)

and

det(xI −B′In,r) = det

(xI −

[B′1,1 0B′2,1 0

])=

∣∣∣∣xI −B′1,1 0−B′2,1 xI

∣∣∣∣= xn−r det(xI −B′1,1),

which completes the proof.

Example 130. [Exercise 7.14] Let τ be a linear operator on F 4 with minimal poly-nomial mτ (x) = (x2 + 1)(x2 − 2). Find the elementary divisor version of the rationalcanonical form for τ if F = Q, F = R or F = C.

If F = Q, the matrix is 0 −11 0

0 21 0

.If F = R, we can write mτ (x) = (x2 + 1)(x+

√2)(x−

√2) and the matrix is

0 −11 0

−√

2 √2

.

Page 42: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

42

If F = C, we can write mτ (x) = (x+ i)(x− i)(x+√

2)(x−√

2) and the matrix is−i

i

−√

2 √2

.Remark 131. [Exercise 7.15] Suppose that the minimal polynomial of τ ∈ L(V ) isirreducible. What can you say about the dimension of V ?

If the minimal polynomial mτ (x) is irreducible, then any other elementary divisors mustbe equal to mτ (x). Therefore the dimension of V is a multiple of deg(mτ (x)).

Theorem 132. [Exercise 7.16] Let τ ∈ L(V ) where V is finite-dimensional. Supposethat p(x) is an irreducible factor of the minimal polynomial m(x) of τ . Suppose furtherthat u, v ∈ V have the property that o(u) = o(v) = p(x). Then u = f(τ)v for somepolynomial f(x) if and only if v = g(τ)u for some polynomial g(x).

Proof. Suppose that u = f(τ)v for some polynomial f(x). Since f(x) /∈ ann(v) =〈p(x)〉, it must be relatively prime to p(x). Write

a(x)f(x) + b(x)p(x) = 1

for some polynomials a(x), b(x). Then

v = a(x)f(x)v + b(x)p(x)v

= a(x)u

since p(x) annihilates v. The other direction follows by symmetry. �

Chapter 8. Eigenvalues and Eigenvectors

Theorem 133. [Exercise 8.1] Let J be the n× n matrix whose entries are all equal to1. Find the minimal polynomial and characteristic polynomial of J and the eigenvalues.

Proof. We can calculate J2 = nJ , so the minimal polynomial is

mJ(x) = x2 − nx = x(x− n).

Since J has rank 1, we can find n− 1 eigenvectors for the eigenvalue 0. Also, (1, . . . , 1)is an eigenvector with eigenvalue n. Therefore the characteristic polynomial of J isxn−1(x− n). �

Page 43: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

43

Example 134. [Exercise 8.2] Prove that the eigenvalues of a matrix do not form acomplete set of invariants under similarity.

Consider the following matrices:

A =

[1 00 1

]and B =

[0 −11 2

].

Both have characteristic polynomial (x−1)2, but mA(x) = x−1 while mB(x) = (x−1)2.

Theorem 135. [Exercise 8.3] A linear operator τ ∈ L(V ) is invertible if and only if 0is not an eigenvalue of τ .

Proof. As a simple consequence of the definition, 0 is not an eigenvalue of τ if and onlyif τ − 0ι = τ is invertible. �

Theorem 136. [Exercise 8.4] Let A be an n×n matrix over a field F that contains allroots of the characteristic polynomial of A. Then det(A) is the product of the eigenvaluesof A, counting multiplicity.

Proof. We have cA(x) = det(xI − A) = (x− λ1) · · · (x− λn). Setting x = 0,

det(−A) = (−1)n det(A) = (−1)nλ1 · · ·λn

so det(A) = λ1 · · ·λn. �

Theorem 137. [Exercise 8.5] If λ is an eigenvalue of τ , then p(λ) is an eigenvalue ofp(τ), for any polynomial p(x). Also, if λ 6= 0, then λ−1 is an eigenvalue for τ−1.

Proof. Let p(x) =∑

k akxk be any polynomial. If τv = λv for some nonzero v, then

p(τ)v =∑k

akτkv =

∑k

akλkv = p(λ)v.

If λ 6= 0, then

τ−1v = λ−1(τ−1λv) = λ−1τ−1τv = λ−1v.

Theorem 138. [Exercise 8.6] An operator τ ∈ L(V ) is nilpotent if τn = 0 for somepositive n ∈ N.

(1) If τ is nilpotent, then the (point) spectrum of τ is {0}.(2) There exists a nonnilpotent operator with (point) spectrum {0}.

Page 44: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

44

Proof. The following proof for (1) is only valid if V is finite-dimensional: Let n denotethe dimension of V and let τ ∈ L(V ) be a nilpotent operator with τ k = 0 for some k > 0.Since xk annihilates Vτ we have mτ (x) = xj for some j ≤ k. But the characteristicpolynomial has the same roots as mτ (x), so cτ (x) = xn. We now give a more generalproof.

We can assume that V 6= {0}, for otherwise (1) is trivial. If τ is injective, then τv 6= 0for every v 6= 0. If we choose any nonzero v ∈ V , then τnv must be nonzero, whichcontradicts the fact that τn = 0. Therefore ker(τ) 6= {0} and 0 ∈ Spec(τ). If τv = λvfor some nonzero v ∈ V , then

λnv = τnv = 0

and λn = 0 = λ since v 6= 0. Therefore Spec(τ) = {0}, which proves (1).

(2) is not true if V is finite-dimensional, since the Jordan normal form shows thatoperators with spectrum {0} are similar to a matrix with only subdiagonal entries.However, we can show (2) by choosing an infinite-dimensional vector space. Let V =R[x] and consider the differential operator D : V → V . To see that Spec(τ) = {0}, wehave D(1) = 0 which shows that 0 ∈ Spec(τ), and if Df(x) = λf(x) and f(x) 6= 0 thenλ = 0 for otherwise the degree of Df(x) is strictly less than the degree of λf(x). Butτ is not nilpotent, since Dn(xn) 6= 0 for every n. �

Theorem 139. [Exercise 8.7] If σ, τ ∈ L(V ) and one of σ and τ is invertible, thenστ ∼ τσ and so στ and τσ have the same eigenvalues, counting multiplicity.

Proof. Assume without loss of generality that σ is invertible. Then τσ = σ−1(στ)σ. �

Example 140. [Exercise 8.8]

(1) Find a linear operator τ that is not idempotent but for which τ 2(ι− τ) = 0.(2) Find a linear operator τ that is not idempotent but for which τ(ι− τ)2 = 0.(3) Prove that if τ 2(ι− τ) = τ(ι− τ)2 = 0, then τ is idempotent.

For (1) and (2), we can choose operators on R4 defined by the matrices

A =

1

10 01 0

and B =

1 01 1

00

,where A2(I − A) = 0 and B(I −B)2 = 0. The condition in (3) implies that

τ 2 − τ 3 = 0 = τ(ι− 2τ + τ 2) = τ − 2τ 2 + τ 3,

so τ 3 = τ 2 and

τ − τ 2 = τ − 2τ 2 + τ 2 = τ − 2τ 2 + τ 3 = 0.

Page 45: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

45

Remark 141. [Exercise 8.9] An involution is a linear operator θ for which θ2 = ι. If τis idempotent, what can you say about 2τ − ι? Construct a one-to-one correspondencebetween the set of idempotents on V and the set of involutions.

If τ is idempotent, then τ 2 = τ and

(2τ − ι)2 = 4τ 2 − 4τ + ι = ι.

Let V be a vector space over a field F with characteristic other than 2. Let I be theset of idempotent operators on V and let J be the set of involutions on V . Then themap f : I → J defined by

τ 7→ 2τ − ιis a bijection, for if τ ∈ J then

f

(τ + ι

2

)= τ

and if f(τ) = f(τ ′) then 2τ − ι = 2τ ′ − ι which implies that τ = τ ′.

Lemma 142. Let V be a vector space over a field F . Let τ ∈ L(V ) and suppose that

B = (v, τv, . . . , τn−1v)

is a basis for an n-dimensional subspace S. Let {p0(x), . . . , pn−1(x)} be a set of poly-nomials where each pk(x) has degree k. Then

B′ = (p0(τ)v, p1(τ)v, . . . , pn−1(τ)v)

is also a basis for S.

Proof. For each k, write pk(x) =∑k

j=0 ajxj. Define a linear operator σ : V → V by

[σ]B =

a0,0a1,0 a1,1...

.... . .

an−1,0 an−1,1 · · · an−1,n−1

.Since each pk(x) has degree k, the coefficient ak,k must be nonzero. This implies thatσ is invertible and therefore an isomorphism; it carries the basis B to B′. By Theorem16, B′ is also a basis. �

Theorem 143. [Exercise 8.11] Let V be a vector space over a field F . Let τ ∈ L(V )and let

S =⟨v, τv, . . . , τ de−1v

⟩be a τ -cyclic subspace of V with minimal polynomial p(x)e where p(x) is prime of degreed. Let σ = p(τ) restricted to 〈v〉 = Fv. Then S is the direct sum of d σ-cyclic subspaceseach of dimension e, that is,

S = T1 ⊕ · · · ⊕ Td.

Page 46: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

46

Proof. For each 0 ≤ i < d, let

Bi+1 ={τ iv, p(τ)τ iv, . . . , p(τ)e−1τ iv

}.

ThenB1∪· · ·∪Bd is a set that contains exactly de polynomials, with degrees 0, 1, . . . , de−1. Applying Lemma 142 completes the proof. �

Theorem 144. [Exercise 8.12] Fix ε > 0. Define the “almost” Jordan block to be thematrix

J (λ, n) =

λε λ

. . .. . .

ε λ

.Every complex square matrix is similar to a matrix composed of “almost” Jordan blocks(for the given ε).

Proof. Let A be a complex square matrix. By Theorem 8.6, A is similar to a matrix ofthe form

diag(J (a1, e1), . . . ,J (ak, ek)).

For each 1 ≤ i ≤ k, define the similarity block Bi = diag(1, ε, . . . , εek) so thatBiJ (ai, ei)B

−1i = J (ai, ei). Then A is similar to

diag(J (a1, e1), . . . , J (ak, ek))

with similarity matrix diag(B1, . . . , Bk). �

Remark 145. [Exercise 8.13] Show that the Jordan canonical form is not very robust inthe sense that a small change in the entries of a matrix A may result in a large jumpin the entries of the Jordan form J . Consider the matrix

Aε =

[ε 01 0

].

What happens to the Jordan form of Aε as ε→ 0?

If ε 6= 0, then

J =

[0 00 ε

]with eigenvectors (0, 1) and (ε, 1). But if ε = 0,

J =

[0 01 0

]since the eigenvalue λ = 0 now has algebraic multiplicity 2.

Page 47: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

47

Example 146. [Exercise 8.14] Give an example of a complex nonreal matrix all ofwhose eigenvalues are real. Show that any such matrix is similar to a real matrix.What about the type of the invertible matrices that are used to bring the matrix toJordan form?

The matrix

A =

[i 11 −i

]has characteristic polynomial cA(x) = x2, so all eigenvalues are real. The Jordan normalform of any matrix M with eigenvalues all real must also be real, and M is similar toits Jordan normal form. However, if M = PJP−1 where M is nonreal and J is real,then P must be complex for otherwise the product PJP−1 would be real.

Theorem 147. Let τ ∈ L(V ) and suppose that τ is represented by a Jordan blockJ(λ, e). Then mτ (x) = (x− λ)e and

J(λ, e) ∼ C[(x− λ)e].

Proof. Since

J(λ, e)− λI =

01 0

. . .. . .

1 0

,a simple computation shows that (J(λ, e) − λI)e = 0 and (x − λ)e annihilates Vτ .Then mτ (x) = (x − λ)k for some k ≤ e, but (J(λ, e) − λI)k 6= 0 if k < e. Thereforemτ (x) = (x − λ)e and τ is nonderogatory. Similarity to the companion matrix nowfollows from Theorem 7.12. �

Theorem 148. The Jordan normal form, up to the order of the Jordan blocks, is acomplete invariant for similarity.

Proof. If τ, σ ∈ L(V ) and τ ∼ σ, then their Jordan normal forms are identical up toorder of Jordan blocks since their elementary divisors are the same. Conversely, if

J = diag(J(λ1, e1), . . . , J(λk, ek))

is a Jordan form matrix, then

J ∼ diag(C[(x− λ1)e1 ], . . . , C[(x− λk)ek ])by Theorem 147. Applying Theorem 7.14 completely determines the elementary divisorsof J . �

Theorem 149. [Exercise 8.15] Let J = [τ ]B be the Jordan form of a linear operatorτ ∈ L(V ). For a given Jordan block of J(λ, e) let U be the subspace of V spanned bythe basis vectors of B associated with that block.

Page 48: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

48

(1) τ |U has a single eigenvalue λ with geometric multiplicity 1. In other words,there is essentially only one eigenvector (up to scalar multiple) associated witheach Jordan block. Hence, the geometric multiplicity of λ for τ is the number ofJordan blocks for λ.

(2) The algebraic multiplicity of an eigenvalue λ is the sum of the dimensions of theJordan blocks associated with λ.

(3) The number of Jordan blocks in J is the maximum number of linearly indepen-dent eigenvectors of τ .

(4) If the algebraic multiplicity of every eigenvalue is equal to its geometric multi-plicity, then each Jordan block is a 1× 1 matrix.

Proof. For (1), it suffices to show that the matrix J(λ, e) satisfies the required proper-ties. We can compute

J(λ, e)− λI =

01 0

. . .. . .

1 0

,which has a one-dimensional kernel:

ker(J(λ, e)− λI) =

⟨0...01

⟩.

Since the Jordan blocks are taken over subspaces with pairwise trivial intersections, thegeometric multiplicity of λ is the number of Jordan blocks. This proves (1). Each k×kJordan block for an eigenvalue λ contributes k to the total algebraic multiplicity of λ,so (2) follows. (3) is immediate from (1) and the definition of geometric multiplicity.Finally, if the algebraic multiplicity of every eigenvalue is equal to its geometric multi-plicity, then for each λ the sum of the dimensions of the Jordan blocks for λ is equal tothe number of Jordan blocks for λ. In other words, each Jordan block has size 1. Thisproves (4). �

Theorem 150. [Exercise 8.16] Assume that the base field F is algebraically closed.Then assuming that the eigenvalues of a matrix A are known, it is possible to determinethe Jordan form J of A by looking at the rank of various matrix powers. A matrix B isnilpotent if Bn = 0 for some n > 0. The smallest such exponent is called the indexof nilpotence.

(1) Let J = J(λ, n) be a single Jordan block of size n× n. Then J − λI is nilpotentof index n. Thus, n is the smallest integer for which rk(J − λI)n = 0.

Page 49: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

49

Now let J be a matrix in Jordan form but possessing only one eigenvalue λ.

(2) J − λI is nilpotent. If m is its index of nilpotence, then m is the maximum sizeof the Jordan blocks of J and rk(J − λI)m−1 is the number of Jordan blocks inJ of maximum size.

(3) rk(J −λI)m−2 is equal to 2 times the number of Jordan blocks of maximum sizeplus the number of Jordan blocks of size one less than the maximum.

(4) The sequence rk(J −λI)k for k = 1, . . . ,m uniquely determines the number andsize of all of the Jordan blocks in J , that is, it uniquely determines J up to theorder of the blocks.

(5) Now let J be an arbitrary Jordan matrix. If λ is an eigenvalue for J , then thesequence rk(J − λI)k for k = 1, . . . ,m, where m is the first integer for whichrk(J − λI)m = rk(J − λI)m+1, uniquely determines J up to the order of theblocks.

(6) For any matrix A with spectrum {λ1, . . . , λs} the sequence rk(A − λiI)k fori = 1, . . . , s and k = 1, . . . ,m, where m is the first integer for which rk(A −λiI)m = rk(A− λiI)m+1, uniquely determines the Jordan matrix J for A up tothe order of the blocks.

This provides an alternative proof of the uniqueness of the Jordan normal form.

Proof. (1) follows from Theorem 147. In (2), since J has only one eigenvalue λ, thematrix J −λI is lower triangular with zero diagonal entries; this is clearly nilpotent. If

J = diag(J(λ, e1), . . . , J(λ, ek))

then

(J − λI)n = diag((J(λ, e1)− λI)n, . . . , (J(λ, ek)− λI)n).

= diag(Bn1 , . . . , B

nk )

where we write Bi = J(λ, ei) − λI for convenience. If m is the index of nilpotence ofJ − λI, then m must be the index of nilpotence for at least one of Bi, and no otherblock has a higher index of nilpotence. By (1), the maximum size of any Jordan blockin J must be m. Also, for each Bi with maximum size, Bm−1

i consists of a single 1 inthe bottom-left corner, so its rank is 1. This proves (2). In general, if Bi has size ei,then Bk

i contains exactly ei− k nonzero entries, and has rank ei− k− 1. From this, (3)and(4) are clear. If J is an arbitrary Jordan matrix, we can reorder the blocks to groupblocks with the same eigenvalue together. Applying (4) then proves (5) and (6). �

Theorem 151. [Exercise 8.17] Let A ∈Mn(F ).

(1) If all the roots of the characteristic polynomial of A lie in F , then A is similarto its transpose AT .

(2) Any matrix is similar to its transpose.

Page 50: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

50

Proof. If all the roots of cA(x) lie in F , then A ∼ J where J is a matrix in Jordan form.But every Jordan block is similar to its transpose, since

λ1 λ

. . .. . .

1 λ

=

1

. ..

1

λ 1

λ. . .. . . 1

λ

1

. ..

1

.Therefore A ∼ J ∼ JT ∼ AT , which proves (1). If cA(x) does not split into linearfactors in F , then let K ⊇ F be the splitting field of cA(x). Now A ∼ AT in K, soA ∼ AT in F by Theorem 7.20. �

The Trace of a Matrix.

Theorem 152. [Exercise 8.18] Let A ∈Mn(F ).

(1) tr(rA) = r tr(A) for r ∈ F .(2) tr(A+B) = tr(A) + tr(B).(3) tr(AB) = tr(BA).(4) tr(ABC) = tr(CAB) = tr(BCA), but tr(ABC) does not necessarily equal

tr(ACB).(5) The trace is an invariant under similarity.(6) If F is algebraically closed, then the trace of A is the sum of the eigenvalues of

A, including multiplicity.

Proof. (1) and (2) are obvious. For (3), we have

tr(AB) =n∑i=1

(AB)i,i =n∑i=1

n∑j=1

Ai,jBj,i =n∑j=1

n∑i=1

Bj,iAi,j = tr(BA).

(4) follows immediately from (3), noting that

tr

([1 00 0

] [0 10 0

] [0 11 0

])= tr

([1 00 0

])= 1

but

tr

([1 00 0

] [0 11 0

] [0 10 0

])= tr

([0 00 0

])= 0.

If A = PBP−1 then

tr(A) = tr(PBP−1) = tr(BP−1P ) = tr(B),

which proves (5). If F is algebraically closed, then A is similar to its Jordan normalform, which has the eigenvalues of A as its diagonal entries. This proves (6). �

Page 51: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

51

Example 153. [Exercise 8.19] Use the concept of the trace of a matrix, as defined inthe previous exercise, to prove that there are no matrices A,B ∈ Mn(C) for whichAB −BA = I.

If AB − BA = I then n = tr(I) = tr(AB − BA) = tr(AB) − tr(BA) = 0, which is acontradiction.

Theorem 154. [Exercise 8.20] Let T : Mn(F ) → F be a function with the followingproperties. For all matrices A,B ∈Mn(F ) and r ∈ F ,

(1) T (rA) = rT (A).(2) T (A+B) = T (A) + T (B).(3) T (AB) = T (BA).

Then there exists s ∈ F for which T (A) = s tr(A), for all A ∈ Mn(F ). That is, up toa constant factor, tr is the unique function that satisfies the above three properties.

Proof. If A = PBP−1 then

T (A) = T (PBP−1) = T (BP−1P ) = tr(B),

which shows that T is a similarity invariant. We also have T (0) = T (0 ·I) = 0T (I) = 0.Write Ei,j for an n× n matrix with 1 in the (i, j) position and 0 everywhere else. Lets = T (E1,1); we have T (Ei,i) = s for every i since Ei,i ∼ E1,1. Now consider Ei,j wherei 6= j. We have Ei,jEj,j = Ei,j but Ej,jEi,j = 0, so

T (Ei,j) = T (Ei,jEj,j) = T (Ej,jEi,j) = 0.

Finally, let A ∈Mn(F ) and write

A =n∑i=1

n∑j=1

ai,jEi,j

so that

T (A) = T

(n∑i=1

n∑j=1

ai,jEi,j

)=

n∑i=1

n∑j=1

ai,jT (Ei,j) = s

n∑i=1

ai,i = s tr(A).

Commuting Operators.

Lemma 155. If τ ∈ L(V ) is diagonalizable and S is a τ -invariant subspace of V , thenτ |S is also diagonalizable.

Proof. Since τ is diagonalizable, its minimal polynomial mτ (x) has no multiple roots.But S is τ -invariant, so mτ (τ |S) = 0 and mτ |S(x) divides mτ (x). Then mτ |S(x) has nomultiple roots, so τ |S is diagonalizable. �

Page 52: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

52

Theorem 156. [Exercise 8.21] A pair of linear operators σ, τ ∈ L(V ) is simultane-ously diagonalizable if there is an ordered basis B of V for which [τ ]B and [σ]B areboth diagonal, that is, B is an ordered basis of eigenvectors for both τ and σ. Twodiagonalizable operators σ and τ are simultaneously diagonalizable if and only if theycommute, that is, στ = τσ.

Proof. If τ and σ are simultaneously diagonalizable, then [τ ]B = D1 and [σ]B = D2

where D1 and D2 are diagonal matrices. We have

[τσ]B = D1D2 = D2D1 = [στ ]B ,

which shows that τ and σ commute. Conversely, suppose that στ = τσ and write

V = Eλ1 ⊕ · · · ⊕ Eλk

where λ1, . . . , λk are the distinct eigenvalues of τ . If v ∈ Eλi ,

τ(σv) = στv = λi(σv)

so σv ∈ Eλi . This shows that (Eλ1 , . . . , Eλk) reduces σ. But each σ|Eλi is diagonalizable

by Lemma 155, so there exists a basis Bi of Eλi such that[σ|Eλi

]Bi

is diagonal. Let

B = B1 ∪ · · · ∪Bk; then [σ]B is diagonal and [τ ]B is also diagonal. �

Theorem 157. [Exercise 8.22] Let σ, τ ∈ L(V ). If σ and τ commute, then everyeigenspace of σ is τ -invariant. Thus, if F is a commuting family, then every eigenspaceof any member of F is F-invariant.

Proof. As in Theorem 156. �

Theorem 158. [Exercise 8.23] Let F be a finite family of operators in L(V ) with theproperty that each operator in F has a full set of eigenvalues in the base field F , thatis, the characteristic polynomial splits over F . If F is a commuting family, then F hasa common eigenvector v ∈ V .

Proof. Write F = {τ1, . . . , τn}. If n = 1, then we are done. Otherwise, assumethat any commuting family of n − 1 operators has a common eigenvector. Choosean eigenspace Eλ of τn; then Eλ is F -invariant by Theorem 157. By the inductionhypothesis, {τ1|Eλ , . . . , τn−1|Eλ} has a common eigenvector v ∈ Eλ. But v is also aneigenvector of τn, and this completes the proof. �

Page 53: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

53

Gersgorin Disks.

Example 159. [Exercise 8.25] Find and sketch the Gersgorin region and the eigenvaluesfor the matrix

A =

1 2 34 5 67 8 9

.Write Dr(z0) for the closed disc {z ∈ C | |z − z0| ≤ r}. The row region is

GR(A) = D5(1) ∪D10(5) ∪D15(9)

and the column region is

GC(A) = D11(1) ∪D10(5) ∪D9(9).

Therefore the Gersgorin region is

G(A) = D5(1) ∪D10(5) ∪D9(9).

Definition 160. A matrix A ∈ Mn(C) is diagonally dominant if for each k =1, . . . , n,

|Akk| ≥ Rk(A),

and it is strictly diagonally dominant if strict inequality holds.

Theorem 161. [Exercise 8.26] If A is strictly diagonally dominant, then it is invertible.

Proof. Any eigenvalue λ must lie in the Gersgorin row regionn⋃k=1

{z ∈ C | |z − Akk| ≤ Rk(A)} .

Since A is strictly diagonally dominant, we have

Rk(A) ≥ |λ− Akk|≥ |Akk| − |λ|> Rk(A)− |λ|

for all k, which implies that |λ| > 0. By Theorem 135, A is invertible. �

Example 162. [Exercise 8.27,8.28] Find a matrix A ∈Mn(C) that is diagonally domi-nant but not invertible, and find a matrix B ∈Mn(C) that is invertible but not strictlydiagonally dominant.

Take

A =

[1 11 1

]and B =

[0 11 0

].

Page 54: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

54

Semisimple Operators.

Definition 163. A linear operator τ ∈ L(V ) on a finite-dimensional vector space issemisimple if every τ -invariant subspace has a complement that is also τ -invariant.

Theorem 164. Let τ ∈ L(V ) be a semisimple operator. If S is a τ -invariant subspaceof V , then τ |S is also semisimple.

Proof. Let T be a τ |S-invariant subspace of S. Since T is τ -invariant, there exists aτ -invariant subspace U of V such that V = T ⊕U . Then S = T ⊕ (U ∩S) by Theorem5, so U ∩ S is a τ |S-invariant complement of T in S. �

Theorem 165. Let V be a finite-dimensional vector space over an algebraically closedfield F . Then τ is semisimple if and only if it is diagonalizable.

Proof. Suppose that τ is diagonalizable and write

V = Eλ1 ⊕ · · · ⊕ Eλkwhere λ1, . . . , λk are distinct eigenvalues. If S is a τ -invariant subspace, then τ |S isdiagonalizable by Theorem 155 and we can write

S = Eλ1 ⊕ · · · ⊕ Eλk

where each Eλi ⊆ Eλi may be trivial. For each i, choose a subspace Dλi such that

Eλi = Eλi ⊕Dλi ; then

T = Dλ1 ⊕ · · · ⊕ Dλkis a τ -invariant complement of S (any subspace of an eigenspace is τ -invariant). To provethe converse, we use induction on the dimension n of V . If n = 0 then the statement istrivial. Otherwise, assume that every semisimple linear operator on a vector space ofdimension n− 1 over F is diagonalizable. Since F is algebraically closed, τ has a someeigenvector v. Choose a τ -invariant subspace T of V such that T = 〈v〉 ⊕ T ; then τ |Tis semisimple, and by the induction hypothesis, also diagonalizable. Therefore

T = 〈v〉 ⊕ 〈v1〉 ⊕ · · · ⊕ 〈vk〉

where v1, . . . , vk are eigenvectors of τ , so τ is diagonalizable. �

Chapter 9. Real and Complex Inner Product Spaces

Theorem 166. [Exercise 9.1] If a matrix M is unitary, upper triangular and haspositive entries on the main diagonal, then it must be the identity matrix.

Page 55: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

55

Proof. We use induction on the size of M . If M is a 1 × 1 matrix, the result is clear.Otherwise, write

M =

[m1,1 ∗

0 M1

]=

m1,1 m1,2 · · · m1,n

m2,2

. . ....

. . . mn−1,nmn,n

,noting that the columns m1, . . . ,mn of M form a orthonormal set. For each i > 1 wehave 〈m1,mi〉 = m1,1m1,i = 0 and m1,1 > 0, so m1,i = 0. Also 〈m1,m1〉 = m2

1,1 = 1 andm1,1 > 0, so m1,1 = 1. Since M is unitary and

M−1 =

[1 00 M−1

1

],

M1 must also be unitary and upper triangular with positive entries on its main diagonal.By the induction hypothesis, M1 = I and therefore M = I. �

Theorem 167. [Exercise 9.2] Any triangularizable matrix is unitarily (orthogonally)triagularizable.

Proof. Let A be a triangularizable matrix with A = PBP−1 where B is upper triangu-lar. Write P = QR where Q is unitary and R is upper triangular; then

A = (QR)B(QR)−1 = Q(RBR−1)Q−1

where RBR−1 is upper triangular. �

Theorem 168. [Exercise 9.3] In the triangle inequality

‖u+ v‖ ≤ ‖u‖+ ‖v‖ ,equality holds if and only if one of u and v is a nonnegative scalar multiple of the other.

Proof. If u = av where a ≥ 0, then

‖u+ v‖ = (a+ 1) ‖v‖ = ‖u‖+ ‖v‖ ;

the same holds when v = au. Conversely, if ‖u+ v‖ = ‖u‖+ ‖v‖ then

‖u‖2 + 〈u, v〉+ 〈v, u〉+ ‖v‖2 = ‖u‖2 + 2 ‖u‖ ‖v‖+ ‖v‖2 .But

|〈u, v〉| ≤ ‖u‖ ‖v‖ = Re(〈u, v〉) ≤ |〈u, v〉| ,so 〈u, v〉 is equal to ‖u‖ ‖v‖. Equality in the Cauchy-Schwarz inequality holds only ifone of u and v is a scalar multiple of the other; assume without loss of generality thatu = av for some a ∈ F . In this case,

‖u‖ ‖v‖ = 〈u, v〉 = 〈av, v〉 = a ‖v‖2

Page 56: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

56

so a must be nonnegative. �

Lemma 169. [Exercise 9.4] If u, v ∈ V then

‖u+ v‖2 + ‖u− v‖2 = 2(‖u‖2 + ‖v‖2).

Proof. We have‖u+ v‖2 = ‖u‖2 + 〈u, v〉+ 〈v, u〉+ ‖v‖2

and‖u− v‖2 = ‖u‖2 − 〈u, v〉 − 〈v, u〉+ ‖v‖2 .

Theorem 170. [Exercise 9.5] If u, v, w ∈ V , then

‖w − u‖2 + ‖w − v‖2 =1

2‖u− v‖2 + 2

∥∥∥∥w − 1

2(u+ v)

∥∥∥∥2 .Proof. We have

(*) ‖w − u‖2 + ‖w − v‖2 = 2 ‖w‖2 + ‖u‖2 + ‖v‖2 − 〈w, u〉 − 〈u,w〉 − 〈w, v〉 − 〈v, w〉while the right hand side is

1

2

(‖u− v‖2 + ‖u+ v‖2

)+ 2 ‖w‖2 − 〈w, u+ v〉 − 〈u+ v, w〉

= 2 ‖w‖2 + ‖u‖2 + ‖v‖2 − 〈w, u+ v〉 − 〈u+ v, w〉 ,which equals (*). �

Theorem 171. [Exercise 9.6] Let V be an inner product space with basis B. The innerproduct is uniquely defined by the values 〈u, v〉 for all u, v ∈ B.

Proof. If u, v ∈ V , write

u = a1b1 + · · ·+ ambm,

v = a′1b′1 + · · ·+ a′nb

′n

where ai, a′i ∈ F and bi, b

′i ∈ B. Then

〈u, v〉 =

⟨m∑i=1

aibi,

n∑j=1

a′jb′j

=m∑i=1

ai

⟨bi,

n∑j=1

a′jb′j

=m∑i=1

n∑j=1

aia′i 〈bi, b′i〉 ,

Page 57: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

57

which depends only on the values of 〈·, ·〉 for vectors in B. �

Theorem 172. [Exercise 9.7] Two vectors u and v in a real inner product space V areorthogonal if and only if

‖u+ v‖2 = ‖u‖2 + ‖v‖2 .

Proof. This is clear from the expansion

‖u+ v‖2 = ‖u‖2 + 2 〈u, v〉+ ‖v‖2 .�

Theorem 173. [Exercise 9.8] Any isometry is injective.

Proof. Let τ ∈ L(V,W ) be an isometry. If τv = 0 then

〈v, v〉 = 〈τv, τv〉 = 0

and v = 0. �

Theorem 174. [Exercise 9.9] Let V 6= {0} be an inner product space. Then V has aHilbert basis.

Proof. Let A be the collection of all orthonormal sets in V , partially ordered by setinclusion; A is not empty since we can choose a nonzero v ∈ V and take {v/ ‖v‖} ∈ A.Suppose that C = {Oi | i ∈ I} ⊆ S is a nonempty chain in S. The union

U =⋃i∈I

Oi

is orthonormal and U ⊆ A. By Zorn’s lemma, A has a maximal element O. This is aHilbert basis. �

Theorem 175. [Exercise 9.10] Let V be a finite-dimensional inner product space, letv ∈ V and let v be the orthogonal projection of v onto a subspace S of V . Then

‖v‖ ≤ ‖v‖ .

Proof. Since v − v ⊥ v,

‖v‖2 = ‖v − v + v‖2 = ‖v − v‖2 + ‖v‖2 ≥ ‖v‖2 .�

Theorem 176. [Exercise 9.11-9.13] Let O = {u1, . . . , uk} be an orthonormal subset ofan inner product space V and let S = 〈O〉. The following are equivalent:

(1) O is an orthonormal basis for V .

(2) 〈O〉⊥ = {0}.(3) Every vector is equal to its Fourier expansion, that is, v = v for all v ∈ V .

Page 58: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

58

(4) Bessel’s identity holds for all v ∈ V , that is, ‖v‖ = ‖v‖.(5) Parseval’s identity holds for all v, w ∈ V , that is,

〈v, w〉 = [v]O · [w]O

where · is the standard inner product in F k.

Proof. We first prove (1) ⇔ (2). Suppose that O is an orthonormal basis and there

exists a nonzero v ∈ 〈O〉⊥. Then O ∪ {v/ ‖v‖} is orthonormal, so O is not maximal.Conversely, if O is not maximal then O ⊂ P where P is orthonormal; any v ∈ P \ Ois nonzero and a member of 〈O〉⊥. Now suppose that (1) holds. By Theorem 9.14,

v− v ∈ 〈O〉⊥ = {0} and v = v. This implies (3), while (3) implies (4). If (4) holds andv ⊥ 〈O〉, then

‖v‖2 = ‖v‖2 = ‖〈v, u1〉u1 + · · ·+ 〈v, uk〉uk‖2 = 0,

so v = 0. This implies (2). It now remains to show (1)⇔ (5). If (1) holds then (3) alsoholds. Therefore

〈v, w〉 = 〈v, w〉 = [v]O · [w]Osince O is an isometry. Suppose that (5) holds and v ⊥ 〈O〉. Then v = 0, so

〈v, v〉 = [v]O · [v]O = 0

and v = 0, which implies (2). �

Theorem 177. [Exercise 9.14] Let u = (r1, . . . , rn) and v = (s1, . . . , sn) be in Rn.Then

(|r1s1|+ · · ·+ |rnsn|)2 ≤ (r21 + · · ·+ r2n)(s21 + · · ·+ s2n).

Proof. Assume each rk, sk is nonnegative; we can compute(∑k

r2k

)(∑k

s2k

)−

(∑k

rksk

)2

=∑j,k

(r2js2k − rjrksjsk)

=∑j 6=k

(r2js2k − rjrksjsk)

=∑j<k

(r2js2k − rjrksjsk) +

∑k<j

(r2js2k − rjrksjsk)

=∑j<k

(r2js2k − rjrksjsk) +

∑j<k

(r2ks2j − rkrjsksj)

=∑j<k

(rjsk − rksj)2

≥ 0.

Page 59: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

59

Theorem 178. Let V be an inner product space and let X, Y ⊆ V (cf. Theorem 62).

(1) X⊥ is a subspace of V .(2) If X ⊆ Y then Y ⊥ ⊆ X⊥.(3) X ⊆ span(X) ⊆ X⊥⊥.

Proof. Let v, w ∈ X⊥, a ∈ F and x ∈ X. Clearly 0 ∈ X⊥; we also have 〈v, x〉 =〈w, x〉 = 0 so 〈v + w, x〉 = 0 and 〈av, x〉 = 0. Therefore v + w, av ∈ X⊥, which proves(1). For (2), let v ∈ Y ⊥ so that v ⊥ y for every y ∈ Y . In particular, v ⊥ x for everyx ∈ X, so v ∈ X⊥.

If v ∈ span(X) then v = a1v1 + · · ·+ anvn where v1, . . . , vn ∈ X. If w ∈ X⊥ then

〈v, w〉 = a1 〈v1, w〉+ · · ·+ an 〈vn, w〉 = 0,

so v ∈ X⊥⊥. This proves (3). �

Theorem 179. [Exercise 9.15] Let V be a finite-dimensional inner product space. Forany subset X of V , we have X⊥⊥ = span(X). In particular, if X is a subspace of Vthen X⊥⊥ = X.

Proof. We know that span(X) ⊆ X⊥⊥ from Theorem 178. If v ∈ X⊥⊥ then v = s + s′

where s ∈ span(X) and s′ ∈ span(X)⊥, by Theorem 9.15. Since X ⊆ span(X) wehave span(X)⊥ ⊆ X⊥ by Theorem 178, so s′ ∈ X⊥ and s′ ⊥ v. But s′ ⊥ s, so s′ isorthogonal to itself and s′ = 0. This implies that v = s ∈ span(X). �

Example 180. [Exercise 9.16] Let P3 be the inner product space of all polynomials ofdegree at most 3, under the inner product

〈p(x), q(x)〉 =

ˆ ∞−∞

p(x)q(x)e−x2

dx.

Apply the Gram-Schmidt process to the basis {1, x, x2, x3}, thereby computing the firstfour Hermite polynomials (at least up to a multiplicative constant).

We have

h0(x) = 1

h1(x) = x−´∞−∞ xe

−x2 dx´∞−∞ e

−x2 dx= x

h2(x) = x2 −´∞−∞ x

2e−x2dx´∞

−∞ e−x2 dx

−´∞−∞ x

3e−x2dx´∞

−∞ x2e−x2 dx

x = x2 − 1

2

Page 60: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

60

h3(x) = x3 −´∞−∞ x

3e−x2dx´∞

−∞ e−x2 dx

−´∞−∞ x

4e−x2dx´∞

−∞ x2e−x2 dx

x−´∞−∞(x5 − 1

2x3)e−x

2dx´∞

−∞(x2 − 12)2e−x2 dx

x2

= x3 − 3

2x.

Theorem 181. [Exercise 9.17] The linear map τ : V → V ∗ defined by τx = 〈·, x〉 isinjective.

Proof. If 〈·, x〉 = 0 then 〈x, x〉 = 0, so x = 0. �

Theorem 182. [Exercise 9.18] Let V be a complex inner product space and let S be asubspace of V . Suppose that v ∈ V is a vector for which 〈v, s〉 + 〈s, v〉 ≤ 〈s, s〉 for alls ∈ S. Then v ∈ S⊥.

Proof. Let s ∈ S be a nonzero vector. For z ∈ C,

〈v, zs〉+ 〈zs, v〉 ≤ 〈zs, zs〉 ,

z 〈v, s〉+ z〈v, s〉 ≤ |z|2 ‖s‖2 .

For all r ∈ R, putting z = r gives 2rRe(〈v, s〉) ≤ r2 ‖s‖2 and putting z = ir gives2r Im(〈v, s〉) ≤ r2 ‖s‖2. Therefore

2 |Re(〈v, s〉)|‖s‖2

≤ |r| and2 |Im(〈v, s〉)|‖s‖2

≤ |r| ;

taking r → 0 shows that |Re(〈v, s〉)| = |Im(〈v, s〉)| = 0. This completes the proof.

If S is assumed to be finite-dimensional, we sketch an alternative proof. The givencondition implies that ‖v‖ ≤ ‖v − s‖ for all s ∈ S; if v 6= 0 then Theorem 9.14implies that ‖v‖ ≤ ‖v − v‖ < ‖v‖, which is a contradiction. Therefore v = 0 andv = v − v ∈ S⊥. �

Example 183. [Exercise 9.19] If V and W are inner product spaces, consider thefunction on V �W defined by

〈(v1, w1), (v2, w2)〉 = 〈v1, v2〉+ 〈w1, w2〉 .

Is this an inner product on V �W?

Yes. Clearly 〈(v, w), (v, w)〉 ≥ 0, and

〈(v, w), (v, w)〉 = ‖v‖2 + ‖w‖2 = 0

implies that ‖v‖ = ‖w‖ = 0. The (conjugate) symmetry and linearity conditions aresimilarly satisfied.

Page 61: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

61

Theorem 184. [Exercise 9.20] If V is a real normed space and if the norm satisfiesthe parallelogram law

‖u+ v‖2 + ‖u− v‖2 = 2 ‖u‖2 + 2 ‖v‖2 ,

then the polarization identity

〈u, v〉 =1

4(‖u+ v‖2 − ‖u− v‖2)

defines an inner product on V .

Proof. Since

0 = ‖v − v‖ ≤ ‖−v‖+ ‖v‖ = 2 ‖v‖ ,we have ‖v‖ ≥ 0 for all v ∈ V . Therefore 〈v, v〉 = ‖v‖2 is positive-definite. Also, 〈·, ·〉is clearly symmetric, and

8 〈u, x〉+ 8 〈v, x〉 = 2(‖u+ x‖2 + ‖v + x‖2)− 2(‖u− x‖2 + ‖v − x‖2)= ‖u+ v + 2x‖2 + ‖u− v‖2 − ‖u+ v − 2x‖2 − ‖u− v‖2

= 4 〈u+ v, 2x〉

and 2(〈u, x〉+ 〈v, x〉) = 〈u+ v, 2x〉. Setting v = 0 gives 〈v, 2x〉 = 2 〈v, x〉 and thus

〈u+ v, x〉 =1

2〈u+ v, 2x〉 = 〈u, x〉+ 〈v, x〉 .

For fixed nonzero u, v ∈ V , define f : R → R by r 7→ 〈u, rv〉. Let a ∈ R, ε > 0 andchoose δ = min(ε/(2 ‖u‖ ‖v‖), ‖u‖ / ‖v‖). Then for all |b− a| < δ,

|f(b)− f(a)| = |〈u, (b− a)v〉|

=1

4(‖u+ (b− a)v‖+ ‖u− (b− a)v‖) |‖u+ (b− a)v‖ − ‖u− (b− a)v‖|

≤ 1

4(2 ‖u‖+ 2 ‖(b− a)v‖) ‖2(b− a)v‖

< δ ‖v‖ (‖u‖+ δ ‖v‖)≤ ε

where the triangle inequality is used in the third line. This shows that f is continuous.We have

f(2k) =⟨u, 2kv

⟩= 2k 〈u, v〉 = 2kf(1)

for all k ∈ Z and f(a+ b) = f(a) + f(b) for all a, b ∈ R. Let r ∈ R and let

r =n∑

k=−∞

bk2k

Page 62: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

62

be a binary expansion of r, where bk ∈ {0, 1}. Then for all m,

f

(n∑

k=−m

bk2k

)=

n∑k=−m

bk2kf(1),

and since f is continuous, letting m→∞ gives f(r) = rf(1), i.e. 〈u, rv〉 = r 〈u, v〉. �

Theorem 185. [Exercise 9.21] Let S be a subspace of a finite-dimensional inner productspace V . Then each coset in V/S contains exactly one vector that is orthogonal to S(cf. Theorem 59).

Proof. From Theorem 9.15, V = S � S⊥. Since S⊥ is a complement of S, by Theorem3.6 we have S⊥ ∼= V/S. The result follows immediately. �

Extensions of Linear Functionals.

Theorem 186. [Exercise 9.22] Let f be a linear functional on a subspace S of a finite-dimensional inner product space V . Let f(v) = 〈v,Rf〉. Suppose that g ∈ V ∗ is an

extension of f , that is, g|S = f . Then Rg = Rf , where Rg is the orthogonal projectionof Rg onto S.

Proof. Write Rg = Rg + x where Rg ∈ S and x ∈ S⊥. We have

〈Rg −Rf , Rg −Rf〉 = 〈Rg −Rf , Rg −Rf − x〉

= 〈Rg −Rf , Rg〉 − 〈Rg −Rf , Rf〉= 0

since x ⊥ Rg −Rf and 〈v,Rg〉 = 〈v,Rf〉 for all v ∈ S. Therefore Rg = Rf . �

Theorem 187. [Exercise 9.23] Let f be a nonzero linear functional on a subspace Sof a finite-dimensional inner product space V and let K = ker(f). If g ∈ V ∗ is anextension of f , then Rg ∈ K⊥ \ S⊥. Moreover, for each vector u ∈ K⊥ \ S⊥ there isexactly one scalar λ for which the linear functional g(x) = 〈x, λu〉 is an extension of f .

Proof. Let Rg be the orthogonal projection of Rg onto S. By Theorem 186, Rg = Rf

where Rf is the Riesz vector for f . If v ∈ K then 〈v,Rg〉 = 〈v,Rf〉 = f(v) = 0, but〈Rf , Rg〉 = 〈Rf , Rf〉 > 0 since f 6= 0. Therefore Rg ∈ K⊥ \ S⊥. Write V = S � S⊥ =〈w〉 �K � S⊥ where w ∈ K⊥ and w is nonzero. By Theorem 9.18, Rf is given by

Rf =f(w)

‖w‖2w.

Page 63: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

63

If u ∈ K⊥ \ S⊥ then u = aw + z ∈ 〈w〉 for some nonzero a ∈ F and some z ∈ S⊥; ifg(x) = 〈x, λu〉 is an extension of f , then

g(Rf ) = 〈Rf , λaw〉 = λa〈Rf ,‖w‖2

f(w)Rf〉 =

λa ‖w‖2

f(w)‖Rf‖2

and g(Rf ) = f(Rf ) = ‖Rf‖2 so that

λ =f(w)

a ‖w‖2.

Conversely, this value of λ does define an extension of f , for 〈x, λu〉 = 〈x,Rf〉 =f(x). �

Lemma 188. Let V be a finite-dimensional inner product space and let f : V → F bea nonzero linear functional. Write V = 〈w〉 � ker(f) where w ∈ ker(f)⊥ is nonzero.Then Rf ∈ 〈w〉.

Proof. By definition, for any v ∈ ker(f) we have 〈v,Rf〉 = f(v) = 0. So Rf ∈ ker(f)⊥ =〈w〉. �

Positive Linear Functionals on Rn.

Theorem 189. [Exercise 9.24] A linear functional f on Rn is positive if and only ifRf ≥ 0 and strictly positive if and only if Rf � 0.

Proof. Write Rf = (r1, . . . , rn) and v = (v1, . . . , vn). If Rf ≥ 0 and v > 0, then

Rf · v = r1v1 + · · ·+ rnvn ≥ 0

since ri, vi ≥ 0. Conversely, if some rk < 0, then put vk = 1 and vi = 0 for i 6= k sothat Rf · v = rkvk is not nonnegative. If Rf � 0 and v > 0, then Rf · v > 0 sincer1, . . . , rn > 0 at least one of vi is positive. Conversely, if some rk ≤ 0, then put vk = 1and vi = 0 for i 6= k so that Rf · v = rkvk is not strictly positive. �

Lemma 190. Let p, s ∈ Rn where p is strongly positive. Then there exists a C > 0such that ∣∣∣∣〈v, s〉〈v, p〉

∣∣∣∣ < C

for all positive vectors v ∈ Rn.

Proof. Let E = {v ∈ Rn | ‖v‖ = 1 and v > 0}, which is a compact set. By Theorem 41,the maps v 7→ 〈v, s〉 and v 7→ 〈v, p〉 are continuous. Additionally, 〈v, p〉 6= 0 for anyv ∈ E since v > 0 and p� 0. Therefore f(v) = 〈v, s〉 / 〈v, p〉 is a continuous on E, and

Page 64: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

64

since E is compact, f(E) is compact. Choose a C > 0 such that |f(v)| < C wheneverv ∈ E. Then

|f(v)| =∣∣∣∣〈v/ ‖v‖ , s〉〈v/ ‖v‖ , p〉

∣∣∣∣ < C

for all v > 0. �

Theorem 191. [Exercise 9.25] Let f : S → R be a strictly positive linear functionalon a subspace S of Rn. Then f has a strictly positive extension to Rn.

Proof. We split the proof into two cases. We write V for Rn. Suppose that S containsa strictly positive vector v1 and let K = ker(f). By Theorem 188, we can writeV = S⊥�S = S⊥�〈Rf〉�K where Rf ∈ K⊥ is nonzero. If v ∈ ker(f) then f(v) = 0,so K does not contain any strictly positive vectors. By the fact given in the text,K⊥ = S⊥ � 〈Rf〉 contains some strongly positive vector z = s + aRf . Furthermore,since a 〈Rf , v1〉 = 〈z, v1〉 > 0, we must have a 6= 0. So z ∈ K⊥ \ S⊥, and by Theorem187 there exists a λ such that the linear functional g(x) = 〈x, λz〉 extends f . But v1 > 0and z � 0, so

λ 〈v1, z〉 = g(v1) = f(v1) > 0

implies that λ > 0. Finally, applying Theorem 189 shows that g is strictly positive.

Now suppose that S does not contain any strictly positive vectors. Then S⊥ containsa strongly positive vector v1 by the fact given in the text. Choose a subspace T of S⊥

such that S⊥ = 〈v1〉 � T , and write V = S⊥ � S = S⊥ � 〈w〉 �K where K = ker(f)and w ∈ K⊥. By Lemma 190 there exists a C > 0 such that |〈v, w〉 / 〈v, v1〉| < C forall v > 0. Let M = C |f(w)| and define g : Rn → R by

a1v1 + t+ aw + k 7→ a1M + af(w)

where a1, a ∈ R, t ∈ T and k ∈ K. If v = a1v1 + t + aw + k is a positive vector thena1 = 〈v, v1〉 > 0 and a = 〈v, w〉, so∣∣∣∣ aa1f(w)

∣∣∣∣ < C |f(w)| = M.

Therefore

g(v) = a1M + af(w)

= a1

(M +

a

a1f(w)

)> 0.

This shows that g is a strictly positive extension of f . �

Page 65: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

65

Theorem 192. [Exercise 9.26] If V is a real inner product space, then we can definean inner product on its complexification V C as follows:

〈u+ vi, x+ yi〉 = 〈u, x〉+ 〈v, y〉+ (〈v, x〉 − 〈u, y〉)i.Then

‖(u+ vi)‖2 = ‖u‖2 + ‖v‖2 .

Proof. We can compute

〈u+ vi, u+ vi〉 = 〈u, u〉+ 〈v, v〉+ (〈v, u〉 − 〈u, v〉)i= ‖u‖2 + ‖v‖2

since 〈v, u〉 = 〈u, v〉. �

Conjugate-linear Maps.

Definition 193. A function σ : V → W on complex vector spaces is conjugate-linearor antilinear if

σ(v + v′) = σv + σv′

andσ(rv) = rσv

for all v, v′ ∈ V and r ∈ C.

Definition 194. Let V be a complex vector space. The complex conjugate V of Vis the complex vector space of all formal complex conjugates of V , that is,

V = {v | v ∈ V } ,equipped with addition and scalar multiplication defined by v+w = v + w and rv = rv.If B is a basis for V , then B =

{b | b ∈ B

}is easily seen to be a basis for V .

Theorem 195. If σ : V → W is a conjugate-linear map, then the corresponding mapσ : V → W defined by v 7→ σ(v) is a linear map.

Proof. For all v, v′ ∈ V and r ∈ C, we have

σ(v + v′) = σ(v + v′) = σ(v) + σ(v′) = σ(v) + σ(v′)

andσ(rv) = σ(rv) = σ(rv) = rσ(v) = rσ(v).

Theorem 196. Let σ : U → V and τ : V → W be conjugate-linear maps.

(1) τσ : U → W is linear.(2) If ker(τ) = {0} then τ is injective.

Page 66: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

66

(3) If τ is a (conjugate) isomorphism and V is finite-dimensional, then V ∼= W inthe usual sense.

Proof. For (3), Theorem 195 shows that τ : V → W is a linear isomorphism, and sincedim(V ) = dim(V ), we have V ∼= V ∼= W . �

Chapter 10. Structure Theory for Normal Operators

Theorem 197. Let V and W be inner product spaces.

(1) ι∗ = ι, where ι : V → V is the identity.(2) 0∗ = 0, where 0 is the zero operator.

Proof. For all v ∈ V and w ∈ W ,

〈ιv, w〉 = 〈v, w〉 = 〈v, ιw〉and

〈0v, w〉 = 0 = 〈v, 0w〉 .�

Theorem 198. Let V and W be finite-dimensional inner product spaces. For everyσ, τ ∈ L(V,W ) and r ∈ F ,

(1) (σ + τ)∗ = σ∗ + τ ∗.(2) (rτ)∗ = rτ ∗.(3) τ ∗∗ = τ .(4) If V = W , then (στ)∗ = τ ∗σ∗.(5) If τ is invertible, then (τ−1)∗ = (τ ∗)−1.(6) If V = W and p(x) ∈ R[x], then p(τ)∗ = p(τ ∗).

Proof. For (1) and (2), we have for all v ∈ V , w ∈ W and r ∈ C,

〈(σ + τ)v, w〉 = 〈σv, w〉+ 〈τv, w〉 = 〈v, σ∗w〉+ 〈v, τ ∗w〉 = 〈v, (σ∗ + τ ∗)w〉and

〈rτv, w〉 = r 〈v, τ ∗w〉 = 〈v, rτ ∗w〉 .Part (3) follows from the conjugate symmetry of the inner product:

〈τ ∗w, v〉 = 〈v, τ ∗w〉 = 〈τv, w〉 = 〈w, τv〉 .For (4),

〈στv, w〉 = 〈τv, σ∗w〉 = 〈v, τ ∗σ∗w〉and (5) follows from (4) by setting σ = τ−1. Part (6) is clear from (1), (2) and (4). �

Page 67: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

67

Theorem 199. Let V be a finite-dimensional inner product space and let S and T besubspaces of V . If ρ is a projection onto S along T , then (ρS,T )∗ = ρT⊥,S⊥.

Proof. We have ρ2 = ρ since ρ is a projection; by Theorem 198, (ρ∗)2 = ρ∗ and ρ∗ is aprojection. But

im(ρ∗) = ker(ρ)⊥ = T⊥ and ker(ρ∗) = im(ρ)⊥ = S⊥

by Theorem 10.3, so ρ∗ is a projection onto T⊥ along S⊥. �

Remark 200. [Exercise 10.1] Let τ ∈ L(U, V ). If τ is surjective, find a formula for theright inverse of τ in terms of τ ∗. If τ is injective, find a formula for the left inverse ofτ in terms of τ ∗.

If im(τ) = V , then im(ττ ∗) = V and ker(ττ ∗) = V ⊥ = {0} so that ττ ∗ is invertible.Therefore

(ττ ∗)(ττ ∗)−1 = ι

and τ ∗(ττ ∗)−1 is a right inverse for τ . Similarly, if ker(τ) = {0} then im(τ ∗τ) = {0}⊥ =V and ker(τ ∗τ) = {0} so that τ ∗τ is invertible. Therefore

(τ ∗τ)−1(τ ∗τ) = ι

and (τ ∗τ)−1τ ∗ is a left inverse for τ .

Theorem 201. [Exercise 10.2] Let τ ∈ L(V ) where V is a complex vector space andlet

τ1 =1

2(τ + τ ∗) and τ2 =

1

2i(τ − τ ∗).

Then τ1 and τ2 are self-adjoint, and

τ = τ1 + iτ2 and τ ∗ = τ1 − iτ2.Furthermore, if τ = σ1 + iσ2 where σ1 and σ2 are self-adjoint, then σ1 = τ1 and σ2 = τ2.

Proof. Computing adjoints gives

τ ∗1 =1

2(τ ∗ + τ) = τ1 and τ ∗2 = − 1

2i(τ ∗ − τ) = τ2,

and the last statement follows. If τ = σ1 + iσ2 where σ1 and σ2 are self-adjoint, thenτ ∗ = σ1 − iσ2 and we have

σ1 + iσ2 = τ1 + iτ2 and σ1 − iσ2 = τ1 − iτ2.Adding the equations gives σ1 = τ1 and σ2 = τ2. �

Theorem 202. [Exercise 10.3] All of the roots of the characteristic polynomial of askew-Hermitian matrix are pure imaginary.

Page 68: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

68

Proof. If λ is an eigenvalue of a skew-Hermitian matrix A, then Av = λv implies that−Av = A∗v = λv and (λ + λ)v = 0. Since v 6= 0, we have λ + λ = 2 Re(λ) = 0, i.e. λis pure imaginary. �

Example 203. [Exercise 10.4] Give an example of a normal operator that is neitherself-adjoint nor unitary.

The matrix

A =

[0 ii 0

]is skew-Hermitian since

A∗ =

[0 −i−i 0

]= −A.

Theorem 204. [Exercise 10.5] Let V be a complex inner product space. If ‖τv‖ =‖τ ∗v‖ for all v ∈ V , then τ is normal.

Proof. For all v ∈ V ,

〈τ ∗τv, v〉 = 〈τv, τv〉 = 〈τ ∗v, τ ∗v〉 = 〈ττ ∗v, v〉and τ ∗τ = ττ ∗ by Theorem 9.2. �

Theorem 205. [Exercise 10.6] Let τ be a normal operator on a complex finite-dimensionalinner product space V or a self-adjoint operator on a real finite-dimensional inner prod-uct space.

(1) τ ∗ = p(τ) for some polynomial p(x) ∈ C[x].(2) For any σ ∈ L(V ), στ = τσ implies στ ∗ = τ ∗σ. In other words, τ ∗ commutes

with all operators that commute with τ .

Proof. We can assume that V is a complex inner product space, for if τ is self-adjointthen the statements follow trivially. Part (1) follows from the spectral resolution

τ = λ1ρ1 + · · ·+ λkpk

sinceτ ∗ = λ1ρ1 + · · ·+ λkρk = p(τ)

where p(x) ∈ C[x] is a polynomial such that p(λi) = λi for all i. For (2), if στ = τσthen σ commutes with any polynomial in τ ; since (1) shows that τ ∗ = p(τ) for somep(x) ∈ C[x], σ commutes with τ ∗. �

Theorem 206. A pair of linear operators σ, τ ∈ L(V ) is simultaneously unitarilydiagonalizable if there is an orthonormal basis O of V for which [τ ]O and [σ]O areboth diagonal, that is, O is an basis of orthonormal eigenvectors for both τ and σ. Twonormal operators σ and τ are simultaneously unitarily diagonalizable if and only if theycommute, that is, στ = τσ (cf. Theorem 156).

Page 69: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

69

Proof. If τ and σ are simultaneously unitarily diagonalizable, then [τ ]O = D1 and[σ]O = D2 where D1 and D2 are diagonal matrices. We have

[τσ]O = D1D2 = D2D1 = [στ ]O ,

which shows that τ and σ commute. Conversely, suppose that στ = τσ and write

V = Eλ1 � · · · � Eλkwhere λ1, . . . , λk are the distinct eigenvalues of τ . If v ∈ Eλi ,

τ(σv) = στv = λi(σv)

so σv ∈ Eλi . This shows that (Eλ1 , . . . , Eλk) reduces σ. But each σ|Eλi is diagonalizable

by Lemma 155, so there exists an orthonormal basis Oi of Eλi such that[σ|Eλi

]Oi

is

diagonal. Let O = O1 ∪ · · · ∪ Ok; then [σ]O is diagonal and [τ ]O is also diagonal. �

Theorem 207. [Exercise 10.7] A linear operator τ on a finite-dimensional complexinner product space V is normal if and only if whenever S is an invariant subspaceunder τ , so is S⊥ (cf. Theorem 165).

Proof. Suppose that τ is normal and write

V = Eλ1 � · · · � Eλkwhere λ1, . . . , λk are distinct eigenvalues. If S is a τ -invariant subspace, then τ |S isdiagonalizable by Theorem 155 and we can write

S = Eλ1 � · · · � Eλkwhere each Eλi ⊆ Eλi may be trivial and the sum is orthogonal by Theorem 10.8. For

each i, choose a subspace Dλi such that Eλi = Eλi �Dλi ; then

S⊥ = Dλ1 � · · · � Dλkis τ -invariant since any subspace of an eigenspace is τ -invariant. To prove the converse,let v1 be an eigenvector of τ (which exists since C is algebraically closed). Since 〈v1〉 is τ -

invariant, the orthogonal complement 〈v〉⊥ is also τ -invariant. Choosing an eigenvectorfor τ |〈v〉⊥ and continuing the process gives a decomposition

V = 〈v1〉 � · · · � 〈vn〉where each 〈vi〉 is an invariant subspace, and therefore each vi is an eigenvector. ByTheorem 10.9, τ is normal. �

Corollary 208. Let τ ∈ L(V ) be a normal operator. If S is τ -invariant then (S, S⊥)reduces τ , and τ |S is normal.

Theorem 209. [Exercise 10.8] Let V be a finite-dimensional inner product space andlet τ be a normal operator on V .

Page 70: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

70

(1) If τ is idempotent, then it is also self-adjoint.(2) If τ is nilpotent, then τ = 0.(3) If τ 2 = τ 3, then τ is idempotent.

Proof. If τ 2 = τ then τ is a projection, and since τ is normal, ker(τ) = ker(τ ∗) = im(τ)⊥.Therefore τ is an orthogonal projection and is self-adjoint by Theorem 10.5, whichproves (1). Part (2) follows from the fact that a diagonalizable nilpotent matrix mustbe zero (see Theorem 138). For (3), if τ 2 = τ 3 then the minimal polynomial of τ dividesx3− x2 = x2(x− 1). Since τ is normal, mτ (x) must divide x(x− 1). If mτ (x) = x thenτ = 0, if mτ (x) = x− 1 then τ = ι, and if mτ (x) = x(x− 1) then τ 2 = τ . In all threecases, τ is idempotent. �

Theorem 210. [Exercise 10.9] If τ is a normal operator on a finite-dimensional com-plex inner product space, then the algebraic multiplicity is equal to the geometric mul-tiplicity for all eigenvalues of τ .

Proof. This is true for any diagonalizable operator. �

Theorem 211. [Exercise 10.10] Two orthogonal projections σ, ρ ∈ L(V ) are orthogonalto each other if and only if im(σ) ⊥ im(ρ).

Proof. Since σ and ρ are orthogonal projections, they are self-adjoint. Suppose thatσρ = ρσ = 0. If v ∈ im(σ) and w ∈ im(ρ), then v = σx and w = ρy for some x, y ∈ V .We have

〈v, w〉 = 〈σx, ρy〉 = 〈x, σρy〉 = 0,

which shows that im(σ) ⊥ im(ρ). Conversely, if im(σ) ⊥ im(ρ) and v ∈ V ,

〈σρv, σρv〉 =⟨ρv, σ2ρv

⟩= 0

so σρ = 0. Similarly, ρσ = 0. �

Theorem 212. [Exercise 10.11] Let τ be a normal operator and let σ be any operatoron V . If the eigenspaces of τ are σ-invariant, then τ and σ commute.

Proof. Write τ = λ1ρ1 + · · · + λkρk where each ρi is the orthogonal projection ontothe eigenspace Eλi . Since each eigenspace is σ-invariant, σ commutes with every ρi byTheorem 2.24. Therefore σ commutes with τ . �

Theorem 213. [Exercise 10.12] If τ and σ are normal operators on a finite-dimensionalcomplex inner product space and if τθ = θσ for some operator θ then τ ∗θ = θσ∗.

Proof. Write

V = Eλ1 � · · · � Eλk

Page 71: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

71

where λ1, . . . , λk are the distinct eigenvalues of σ. If v ∈ Eλi then τθv = θσv = λiθv, soθ(Eλi) is an eigenspace of τ with eigenvalue λi. By Theorem 10.8,

σv = λiv ⇔ σ∗v = λiv and τθv = λiθv ⇔ τ ∗θv = λiθv.

Then for all v = v1 + · · ·+ vk where vi ∈ Eλi we have

τ ∗θv = τ ∗θv1 + · · ·+ τ ∗θvk

= λiθv1 + · · ·+ λkθvk

= θσ∗v1 + · · ·+ θσ∗vk

= θσ∗v.

Theorem 214. [Exercise 10.13] If two normal n × n complex matrices are similar,then they are unitarily similar, that is, similar via a unitary matrix.

Proof. If A and B are similar normal matrices, then A = PDP ∗ and B = QDQ∗ whereP and Q are unitary and D is diagonal. Therefore A = PQ∗DQP ∗ where PQ∗ isunitary. �

Theorem 215. [Exercise 10.14] If ν is a unitary operator on a complex inner productspace, then there exists a self-adjoint operator σ for which ν = eiσ.

Proof. Since ν is unitary, it has a spectral resolution ν = λ1ρ1+· · ·+λkρk where |λj| = 1for all j. Write λj = eiθj ; then

σ = θ1ρ1 + · · ·+ θkρk

is self-adjoint since every θj is real, and ν = eiσ. �

Theorem 216. [Exercise 10.15] A positive operator has a unique positive square root.

Proof. Let τ be a positive operator; we already know that τ has a positive square root.If σ = λ1ρ1 + · · ·+ λkρk is a positive square root of τ , then

τ = σ2 = λ21ρ1 + · · ·+ λ2kρk

is the spectral resolution of τ . Since this expression is unique and all eigenvalues of σare nonnegative, the λi are uniquely determined. �

Theorem 217. [Exercise 10.16] If τ has a square root, that is, if τ = σ2 for somepositive operator σ, then τ is positive.

Proof. Since σ is normal with nonnegative eigenvalues, τ is also normal with nonnega-tive eigenvalues. Theorem 10.22 then shows that τ is positive. �

Page 72: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

72

Theorem 218. [Exercise 10.17] If σ ≤ τ (that is, τ−σ is positive) and if θ is a positiveoperator that commutes with both σ and τ , then σθ ≤ τθ.

Proof. This is equivalent to proving that the product of two commuting positive oper-ators σ, τ is positive. By Theorem 206, there exists an orthonormal basis O such that[σ]O and [τ ]O are both diagonal. Furthermore, since σ and τ are positive, both matriceshave nonnegative diagonal entries. Therefore the product [στ ]O also has nonnegativediagonal entries, and Theorem 10.22 shows that στ is positive. �

Theorem 219. [Exercise 10.18] An invertible linear operator τ ∈ L(V ) is positive ifand only if it has the form τ = ρ∗ρ where ρ is upper triangularizable. Moreover, ρ canbe chosen with positive eigenvalues, in which case the factorization is unique.

Proof. If τ = ρ∗ρ then τ is positive by Theorem 10.23. Conversely, if τ is positive andinvertible then it must be positive-definite. Therefore τ = (

√τ)∗√τ is a decomposition

where√τ is positive-definite (and has positive eigenvalues). �

Remark 220. [Exercise 10.19] Does every self-adjoint operator on a finite-dimensionalreal inner product space have a square root?

No, take the linear operator τ : R→ R given by x 7→ −x.

Theorem 221. Let A be an m × n complex matrix. Let a1, . . . , an be the columnsof A and let a1, . . . , am be the rows of A. Then the diagonal entries of A∗A are‖a1‖2 , . . . , ‖an‖2 and the diagonal entries of AA∗ are ‖a1‖2 , . . . , ‖am‖2.

Proof. For every k,

(A∗A)k,k =n∑i=1

(A∗)k,iAi,k =n∑i=1

Ai,kAi,k = ‖ak‖2

and

(AA∗)k,k =m∑i=1

Ak,i(A∗)i,k =

m∑i=1

Ak,iAk,i = ‖ak‖2 .

Theorem 222. Let V be a finite-dimensional inner product space and let τ ∈ L(V ). Iftr(τ ∗τ) = tr(ττ ∗) = 0, then τ = 0.

Proof. This follows by choosing an orthonormal basis for V and applying Theorem221. Alternatively, τ ∗τ is a positive operator by Theorem 10.23, so all eigenvalues arenonnegative. Since tr(τ ∗τ) = 0, all eigenvalues are 0 and τ ∗τ = 0. But ker(τ) =ker(τ ∗τ) = V , so τ = 0. �

Page 73: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

73

Theorem 223. [Exercise 10.20] Let τ be a linear operator on Cn and let λ1, . . . , λn bethe eigenvalues of τ , each one written a number of times equal to its algebraic multi-plicity. Then ∑

i

|λi|2 ≤ tr(τ ∗τ).

Equality holds if and only if τ is normal.

Proof. We can unitarily triangularize τ to prove the first statement. Let O be anorthonormal basis such that T = [τ ]O is upper triangular with columns t1, . . . , tn. Thenby Theorem 221,

tr(τ ∗τ) = tr([τ ∗]O [τ ]O) =∑i

‖ti‖2 ≥∑i

|λi|2

since the diagonal entries of [τ ]O are the eigenvalues of τ . If equality holds then∑i

|λi|2 =∑i

‖ti‖2 =∑i,j

|Ti,j|2 =∑i

|λi|2 +∑i 6=j

|Ti,j|2 ,

so Ti,j = 0 for all i 6= j. Therefore T is diagonal and τ is normal. Conversely, if τ isnormal then equality follows by applying Theorem 221 to any unitary diagonalizationof τ . �

Theorem 224. [Exercise 10.21] Let τ ∈ L(V ) where V is a real inner product space.Then the Hilbert space adjoint satisfies (τ ∗)C = (τC)∗.

Proof. For all a, b, c, d ∈ R,⟨τC(a+ ib), c+ id

⟩= 〈τa+ iτb, c+ id〉= 〈τa, c〉+ 〈τb, d〉+ (〈τb, c〉 − 〈τa, d〉)i= 〈a, τ ∗c〉+ 〈b, τ ∗d〉+ (〈b, τ ∗c〉 − 〈a, τ ∗d〉)i= 〈a+ ib, τ ∗c〉+ 〈b− ia, τ ∗d〉= 〈a+ ib, τ ∗c〉+ 〈a+ ib, iτ ∗d〉= 〈a+ ib, τ ∗(c+ id)〉 .

Chapter 11. Metric Vector Spaces: The Theory of Bilinear Forms

Note: this section is incomplete.

Theorem 225. A metric vector space V is nonsingular if and only if all representingmatrices MB are nonsingular.

Page 74: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

74

Proof. Suppose that V is nonsingular. If MB[v]B = 0 for some v then 〈x, v〉 =[x]TBMB[v]B = 0 for all x ∈ V , so v = 0 since rad(V ) = {0}. Therefore MB is in-vertible. Conversely, suppose that all representing matrices MB are invertible and letv ∈ rad(V ). Then 〈x, v〉 = [x]TBMB[v]B = 0 for all x ∈ V , and in particular, [ei]

TBMB[v]B

for all i. Therefore MB[v]B = 0 and v = 0 since MB is invertible. �

Theorem 226. Let τ ∈ L(V,W ) be a linear map between finite-dimensional metricvector spaces V and W .

(1) Let B = {v1, . . . , vn} be a basis for V . Then τ is an isometry if and only if τ isbijective and

〈τvi, τvj〉 = 〈vi, vj〉for all i, j.

(2) If V is orthogonal and char(F ) 6= 2, then τ is an isometry if and only if it isbijective and

〈τv, τv〉 = 〈v, v〉for all v ∈ V .

(3) Suppose that τ : V → W is an isometry and

V = S � S⊥ and W = T � T⊥.If τS = T , then τ(S⊥) = T⊥.

Proof. Let u, v ∈ V and write u = a1v1 + · · ·+ anvn and v = b1v1 + · · ·+ bnvn. Then

〈τu, τv〉 =

⟨τ

n∑i=1

aivi, τn∑j=1

bjvj

=n∑i=1

n∑j=1

aibj 〈τvi, τvj〉

=n∑i=1

n∑j=1

aibj 〈vi, vj〉

= 〈u, v〉 ,which proves (1). Part (2) follows from the identity

〈u, v〉 =1

2〈u+ v, u+ v〉 − 1

2〈u, u〉 − 1

2〈v, v〉 .

For (3), let v ∈ τ(S⊥) and t ∈ T . Then v = τx for some x ∈ S⊥ and t = τy for somey ∈ S since τS = T . Therefore

〈v, t〉 = 〈τx, τy〉 = 〈x, y〉 = 0

and τ(S⊥) ⊆ T⊥. But dim(τ(S⊥)) = dim(S⊥) = dim(T⊥), so τ(S⊥) = T⊥. �

Page 75: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

75

Chapter 12. Metric Spaces

Theorem 227. [Exercise 12.1,12.2]

(1) If x1, . . . , xn ∈M then d(x1, xn) ≤ d(x1, x2) + · · ·+ d(xn−1, xn).(2) If x, y, a, b ∈M then |d(x, y)− d(a, b)| ≤ d(x, a) + d(y, b).(3) If x, y, z ∈M then |d(x, z)− d(y, z)| ≤ d(x, y).

Proof. We use induction on n to prove (1). If n = 2 then the result is clear. Otherwise,assume that the result holds for n− 1 elements of M . Then

d(x1, xn) ≤ d(x1, xn−1) + d(xn−1, xn)

≤ d(x1, x2) + · · ·+ d(xn−1, xn).

For (2),

d(x, y) ≤ d(x, a) + d(a, b) + d(y, b),

d(a, b) ≤ d(x, a) + d(x, y) + d(y, b).

Similarly, for (3),

d(x, z) ≤ d(x, y) + d(y, z),

d(y, z) ≤ d(x, y) + d(x, z).

Example 228. [Exercise 12.3] Let S ⊆ `∞ be the subspace of all binary sequences(sequences of 0s and 1s). Describe the metric on S.

The `∞ metric on S is identical to the discrete metric.

Theorem 229. [Exercise 12.5] Let 1 ≤ p <∞.

(1) If x = (xn) ∈ `p then xn → 0.(2) There exists a sequence that converges to 0 but is not an element of any `p for

1 ≤ p <∞.

Proof. If (xn) ∈ `p then∞∑n=1

|xn|p <∞

by definition. Therefore |xn|p → 0 and xn → 0. For (2), choose the sequence x =(1/ log n)n≥2. Then

∞∑n=2

1

(log n)p

never converges for any 1 ≤ p <∞, so x /∈ `p. �

Page 76: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

76

Theorem 230. [Exercise 12.6]

(1) If x = (xn) ∈ `p, then x ∈ `q for all q > p.(2) There exists a sequence x = (xn) that is in `p for p > 1, but is not in `1.

Proof. If (xn) ∈ `p then∞∑n=1

|xn|p <∞.

For some integer N , we have |xn|p < 1 for all n ≥ N . For these n we have |xn|q ≤ |xn|psince q > p, and therefore

∞∑n=1

|xn|q <∞

by the comparison test. For (2), take xn = 1/n. �

Theorem 231. [Exercise 12.7] A subset S of a metric space M is open if and only ifS contains an open neighborhood of each of its points.

Proof. If S is open, then S clearly contains an open neighborhood (in fact an open ball)around each of its points. Conversely, if every point x ∈ S has an open neighborhoodN in S, then we may choose an open ball around x since N is open. �

Theorem 232. The union of any collection of open sets in a metric space is open.

Proof. Let {Ei} be a collection of open sets and let E =⋃iEi. If x ∈ E then x ∈ Ei

for some i. Since Ei is open, there exists an open ball B ⊆ Ei around x. But B ⊆ E,and this proves that E is open. �

Theorem 233. [Exercise 12.8] The intersection of any collection of closed sets in ametric space is closed.

Proof. This follows from Theorem 232 and the fact that(⋂i

Ei

)c

=⋃i

Eci .

Theorem 234. [Exercise 12.9] Let (M,d) be a metric space. The diameter of anonempty subset S ⊆M is

δ(S) = supx,y∈S

d(x, y).

A set S is bounded if δ(S) <∞.

(1) S is bounded if and only if there is some x ∈M and r ∈ R for which S ⊆ B(x, r).

Page 77: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

77

(2) δ(S) = 0 if and only if S consists of a single point.(3) S ⊆ T implies δ(S) ≤ δ(T ).(4) If S and T are bounded, then S ∪ T is also bounded.

Proof. Suppose that S is bounded and choose any p ∈ S. Then S ⊆ B(p, δ(S) + 1)since any x ∈ S has d(p, x) ≤ δ(S). Conversely, if S ⊆ B(p, r) for some p ∈ S then

d(x, y) ≤ d(x, p) + d(p, y) < 2r

for every x, y ∈ S. Therefore supx,y∈S d(x, y) ≤ 2r, and in particular, it is finite.This proves (1). For (2), if S contains exactly one point x then δ(S) = d(x, x) = 0.Conversely, if x, y ∈ S are distinct points then 0 < d(x, y) ≤ δ(S). For (3), we haved(x, y) ≤ δ(T ) for every x, y ∈ S, so δ(S) ≤ δ(T ). For (4), suppose that S and T arebounded. Choose any two points p ∈ S and q ∈ T . Let x, y ∈ S ∪ T . If x, y ∈ S thend(x, y) ≤ δ(S) and if x, y ∈ T then d(x, y) ≤ δ(T ). If x ∈ S and y ∈ T then

d(x, y) ≤ d(x, p) + d(p, q) + d(q, y) ≤ δ(S) + d(p, q) + δ(T ),

and the same inequality holds if x ∈ T and x ∈ S. In all cases the above inequalityholds, so S ∪ T is bounded. �

Theorem 235. [Exercise 12.10] Let (M,d) be a metric space. Let d′ be the functiondefined by

d′(x, y) =d(x, y)

1 + d(x, y).

(1) (M,d′) is a metric space and M is bounded under this metric, even if it is notbounded under the metric d.

(2) (M,d) and (M,d′) have the same open sets.

Proof. We only prove the triangle inequality for d′. Let x, y, z ∈M and let a = d(x, y),b = d(x, z), c = d(z, y) for convenience. Then

a ≤ b+ c

a ≤ b+ bc+ c+ cb

a+ ab+ ac ≤ (b+ ab+ bc) + (c+ ac+ cb)

Adding abc to both sides and simplifying gives

a(1 + c)(1 + b) ≤ b(1 + a)(1 + c) + c(1 + a)(1 + b)

a

1 + a≤ b

1 + b+

c

1 + c,

which proves that d′(x, y) ≤ d′(x, z) + d′(z, y). Since

d(x, y)

1 + d(x, y)= 1− 1

1 + d(x, y)≤ 1,

Page 78: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

78

every subset of M is bounded. This proves (1). The metric d′ is monotonically increas-ing as a function of d, so (2) follows. �

Theorem 236. [Exercise 12.11] If S and T are subsets of a metric space (M,d), wedefine the distance between S and T by

ρ(S, T ) = infx∈S,t∈T

d(x, y).

(1) x ∈ cl(S) if and only if ρ({x}, S) = 0.(2) (Is it true that ρ(S, T ) = 0 if and only if S = T? Is ρ a metric?

Proof. If x ∈ cl(S), then every open ball B(x, δ) contains a point of S. Thereforeinfs∈S d(x, s) ≤ δ for arbitrarily small δ, and ρ({x}, S) = 0. Conversely, if ρ({x}, S) = 0then for every δ > 0 there exists a s ∈ S such that d(x, s) < δ, so x ∈ cl(S). Thisproves (1). For (2), choose S = Q and T = R \ Q. Then ρ(S, T ) = 0 but S 6= T (infact they are disjoint), so ρ cannot be a metric. �

Theorem 237. [Exercise 12.12,12.13] Let (M,d) be a metric space and let S ⊆ M .The following are equivalent:

(1) x ∈M is a limit point of S.(2) Every neighborhood of x meets S in a point other than x itself.(3) Every open ball B(x, r) contains infinitely many points of S.

Proof. (1) ⇔ (2) is obvious. Suppose that x ∈ M is a limit point of S and let B(x, r)be an open ball. Choose a sequence {xi} of points in B(x, r)∩ S as follows: take x1 tobe any point in B(x, r) ∩ S not equal to x, which exists since x is a limit point of S.Having chosen the points x1, . . . , xk, let

r′ = min1≤i≤k

d(x, xi)

and choose some xk+1 in B(x, r′) ∩ S not equal to x. Continuing this process producesan infinite sequence of distinct points in B(x, r) ∩ S. This proves (1) ⇒ (3). For theconverse, suppose that x is not a limit point of S. Then there exists an open ball B(x, r)such that B(x, r)∩S is empty or contains only x; this contradicts the condition in (3).Therefore (3)⇒ (1). �

Theorem 238. [Exercise 12.14] Limits are unique. That is, (xn) → x and (xn) → yimplies that x = y.

Proof. Let ε > 0 and choose M,N such that d(x, xn) < ε whenever n ≥ M andd(y, xn) < ε whenever n ≥ N . For n = max(M,N) we have

d(x, y) ≤ d(x, xn) + d(xn, y) < 2ε;

since ε was arbitrary, x = y. �

Page 79: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

79

Theorem 239. [Exercise 12.15] Let S be a subset of a metric space M . Then x ∈ cl(S)if and only if there exists a sequence (xn) in S that converges to x.

Proof. Let x ∈ cl(S). If x ∈ S then the constant sequence (x) converges to x. Otherwise,x is a limit point of S. For each integer n ≥ 1, choose a point xn from B(x, 1/n) ∩ S.Then (xn) → x since d(x, xn) < 1/n → 0 as n → ∞. Conversely, suppose that thereis a sequence (xn) in S that converges to x. We may assume xn 6= x for all n, forotherwise x ∈ S and we are done. If B(x, r) is a neighborhood of x, then there existsan integer N such that d(x, xn) < r whenever n ≥ N . In particular, xN ∈ B(x, r) ∩ Sand xN 6= x by our previous assumption. This proves that x is a limit point of S. �

Theorem 240. [Exercise 12.16] The closure has the following properties:

(1) S ⊆ cl(S).(2) cl(cl(S)) = cl(S).(3) cl(S ∪ T ) = cl(S) ∪ cl(T ).(4) cl(S ∩ T ) ⊆ cl(S) ∩ cl(T ). Equality does not necessarily hold.

Proof. We only prove (3). The inclusion cl(S) ∪ cl(T ) ⊆ cl(S ∪ T ) is obvious. We haveS ∪ T ⊆ cl(S) ∪ cl(T ), and since cl(S) ∪ cl(T ) is closed, cl(S ∪ T ) ⊆ cl(S) ∪ cl(T ).This proves (3). Part (4) is similar. If we take S = {1/n | n ∈ Z, n ≥ 1} and T ={−1/n | n ∈ Z, n ≥ 1}, then cl(S ∩ T ) is empty but cl(S) ∩ cl(T ) = {0}. �

Theorem 241. [Exercise 12.17]

(1) The closed ball B(x, r) is always a closed subset.(2) There exists a metric space in which the closure of an open ball B(x, r) is not

equal to the closed ball B(x, r).

Proof. Let E = B(x, r). If p ∈ Ec then d(x, p) > r. Choose some s with d(x, p) > s > r;then B(p, d(x, p)− s) is an open ball around p that is contained in Ec. This shows thatE is closed. For (2), consider Z with the discrete metric. We have B(0, 1) = {0} andB(0, 1) = Z, but cl(B(0, 1)) = {0}. �

Theorem 242. [Exercise 12.20] A discrete metric space is separable if and only if itis countable.

Proof. Let M be a discrete metric space. If S ⊆ M then cl(S) = S since every subsetof M is closed. Therefore, S is dense in M if and only if S = cl(S) = M . The resultfollows easily. �

Theorem 243. [Exercise 12.25] Any convergent sequence is a Cauchy sequence.

Page 80: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

80

Proof. Let (xn) be a sequence that converges to some point x and let ε > 0. Choose aninteger N such that d(x, xn) < ε/2 for all n ≥ N . Then for all m,n ≥ N we have

d(xm, xn) ≤ d(xm, x) + d(x, xn) < ε.

Theorem 244. [Exercise 12.26] If (xn) → x in a metric space M , show that anysubsequence (xnk) of (xn) also converges to x.

Proof. Obvious. �

Theorem 245. [Exercise 12.27] If (xn) is a Cauchy sequence in a metric space M andsome subsequence (xnk) of (xn) converges to x ∈M , then (xn) converges to x.

Proof. Let ε > 0. Choose an integer M such that d(x, xnk) < ε/2 for all k such thatnk ≥ M . Also choose an integer N such that d(xm, xn) < ε/2 whenever m,n ≥ N .Then for all n ≥ max(M,N), there exists a nk ≥ n, and

d(x, xn) ≤ d(x, xnk) + d(xnk , xn) < ε.

Theorem 246. [Exercise 12.28] If (xn) is a Cauchy sequence, then the set {xn} isbounded. However, a bounded sequence is not necessarily a Cauchy sequence.

Proof. Choose an integer N such that d(xm, xn) < 1 whenever m,n ≥ N . In particular,d(xN , xn) < 1 for all n > N , so {xN , xN+1, . . . } is bounded. Since {x1, . . . , xN−1} isfinite, it is also bounded. Therefore {xn} is bounded. The converse is not true, sincethe sequence defined by xn = (−1)n is bounded but not Cauchy. �

Theorem 247. [Exercise 12.29] Let (xn) and (yn) be Cauchy sequences in a metricspace M . Then the sequence dn = d(xn, yn) converges.

Proof. Let ε > 0. Choose integers M,N such that d(xm, xn) < ε/2 whenever m,n ≥Mand d(ym, yn) < ε/2 whenever m,n ≥ N . Then for all m,n ≥ max(M,N),

dm − dn = d(xm, ym)− d(xn, yn)

≤ d(xm, xn) + d(xn, yn) + d(yn, ym)− d(xn, yn)

= d(xm, xn) + d(yn, ym)

< ε,

and a similar argument shows that dn − dm < ε, so |dm − dn| < ε. This proves that(dn) is a Cauchy sequence. Since R is complete, (dn) converges. �

Theorem 248. [Exercise 12.36] The metric spaces C[a, b] and C[c, d], under the supmetric, are isometric.

Page 81: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

81

Proof. Take the isometry ϕ : C[a, b]→ C[c, d] given by

(ϕf)(x) = f

(a+

b− ad− c

(x− c)).

Theorem 249. [Exercise 12.37] Let p, q ≥ 1 with p + q = pq. If∑|xn|p and

∑|yn|q

converge, then∑|xnyn| converges and

∞∑n=1

|xnyn| ≤

(∞∑n=1

|xn|p)1/p( ∞∑

n=1

|yn|q)1/q

.

Proof. We first establish the inequality

(*) uv ≤ up

p+vq

q

for positive real numbers u and v. We have

uv = (up)1/p(vq)1/q

= exp

(1

plog up

)exp

(1

qlog vq

)= exp

(1

plog up +

1

qlog vq

)≤ 1

pexp(log up) +

1

qexp(log vq)

=up

p+vq

q

since exp is convex and 1/p+ 1/q = 1. Now let

X =∞∑n=1

|xn|p and Y =∞∑n=1

|yn|q .

We can assume that X, Y 6= 0, for otherwise the result follows immediately. For eachn, set

u =|xn|X1/p

and v =|yn|Y 1/q

in (*) to obtain

|xn| |yn|X1/pY 1/q

≤ 1

pX|xn|p +

1

qY|yn|q .

Page 82: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

82

By the comparison test the series∑|xn| |yn| converges, and summing on n gives

1

X1/pY 1/q

∞∑n=1

|xn| |yn| ≤1

pX

∞∑n=1

|xn|p +1

qY

∞∑n=1

|yn|q

=1

p+

1

q

= 1.

Therefore∞∑n=1

|xn| |yn| ≤ X1/pY 1/q

=

(∞∑n=1

|xn|p)1/p( ∞∑

n=1

|yn|q)1/q

.

Theorem 250. [Exercise 12.38] Let p ≥ 1. If∑|xn|p and

∑|yn|p converge, then∑

|xn + yn|p converges and(∞∑n=1

|xn + yn|p)1/p

(∞∑n=1

|xn|p)1/p

+

(∞∑n=1

|yn|p)1/p

.

Proof. The case p = 1 is clear since |xn + yn| ≤ |xn| + |yn| for all n. Now assume thatp > 1. We have

|xn + yn|p = |xn + yn| |xn + yn|p−1

≤ |xn| |xn + yn|p−1 + |yn| |xn + yn|p−1

for all n; summing from 1 to k gives

k∑n=1

|xn + yn|p ≤k∑

n=1

|xn| |xn + yn|p−1 +k∑

n=1

|yn| |xn + yn|p−1 .

Let q = p/(p− 1) so that 1/p+ 1/q = 1. Applying Theorem 249 to the finite sums onthe right, we have

k∑n=1

|xn| |xn + yn|p−1 ≤

(k∑

n=1

|xn|p)1/p( k∑

n=1

|xn + yn|p)1/q

,

k∑n=1

|yn| |xn + yn|p−1 ≤

(k∑

n=1

|yn|p)1/p( k∑

n=1

|xn + yn|p)1/q

Page 83: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

83

and therefore (k∑

n=1

|xn + yn|p)1/p

(k∑

n=1

|xn|p)1/p

+

(k∑

n=1

|yn|p)1/p

.

Since the left hand side is nonnegative and the right hand side converges as k → ∞,the left hand side must also converge. Taking k →∞ gives(

∞∑n=1

|xn + yn|p)1/p

(∞∑n=1

|xn|p)1/p

+

(∞∑n=1

|yn|p)1/p

.

Chapter 13. Hilbert Spaces

Theorem 251. Let V be an inner product space.

(1) If (xn)→ x then ‖xn‖ → ‖x‖.(2) If (xn)→ x and (yn)→ y then 〈xn, yn〉 → 〈x, y〉.

Proof. For (1), |‖xn‖ − ‖x‖| ≤ ‖xn − x‖ → 0 as n → ∞. Part (2) follows from thepolarization identities and (1). �

Theorem 252. Let τ : H1 → H2 be a bounded linear map.

(1) ‖τ‖ = sup‖x‖=1

‖τx‖.

(2) ‖τ‖ = sup‖x‖≤1

‖τx‖.

(3) ‖τ‖ = inf {c ∈ R | ‖τx‖ ≤ c ‖x‖ for all x ∈ H1}.

Proof. By definition, ‖τv‖ ≤ ‖τ‖ for all ‖v‖ = 1, so sup‖x‖=1 ‖τx‖ ≤ ‖τ‖. For any‖v‖ 6= 0 we have

‖τv‖‖v‖

=

∥∥∥∥τ ( v

‖v‖

)∥∥∥∥ ≤ sup‖x‖=1

‖τx‖ ,

so ‖τ‖ ≤ sup‖x‖=1 ‖τx‖. The proof of (2) is similar. Let

M = inf {c ∈ R | ‖τx‖ ≤ c ‖x‖ for all x ∈ H1} .Since ‖τx‖ ≤ ‖τ‖ ‖x‖ for all x ∈ H1, we have M ≤ ‖τ‖. But ‖τx‖ ≤ M ‖x‖ for allx 6= 0, so ‖τ‖ ≤M . �

Theorem 253. [Exercise 13.1] The sup metric on the metric space C[a, b] of continuousfunctions on [a, b] does not come from an inner product.

Page 84: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

84

Proof. Let f(t) = 1 and g(t) = (t− a)/(b− a). Then

‖f + g‖+ ‖f − g‖ = 2 + 1

6= 2(‖f‖2 + ‖g‖2) = 2,

so the norm does not satisfy the parallelogram law. �

Theorem 254. [Exercise 13.3] Let V be an inner product space and let A and B besubsets of V .

(1) A ⊆ B ⇒ B⊥ ⊆ A⊥.(2) A⊥ is a closed subspace of V .(3) [cspan(A)]⊥ = A⊥.

Proof. (1) is proved in Theorem 178. Let x be a limit point of A⊥; by Theorem 239,there exists a sequence (xn) in A⊥ that converges to x. If a ∈ A then 〈xn, a〉 = 0 forevery n, so 〈x, a〉 = limn→∞ 〈xn, a〉 = 0. Therefore x ∈ A⊥, and this proves (2). For(3), we have [cspan(A)]⊥ ⊆ A⊥ by (1). Let x ∈ A⊥. If a ∈ cspan(A) then there existsa sequence (an) in span(A) that converges to a. Since 〈an, x〉 = 0 for every n, we have〈a, x〉 = limn→∞ 〈an, x〉 = 0. Therefore x ∈ [cspan(A)]⊥, which proves (3). �

Remark 255. [Exercise 13.4] Let V be an inner product space and S ⊆ V . Under whatconditions is S⊥⊥⊥ = S⊥?

Theorem 254 shows that S⊥ is closed, so if V is a Hilbert space then S⊥⊥⊥ = cl(S⊥) =S⊥ by Theorem 13.13.

Theorem 256. [Exercise 13.5] A subspace S of a Hilbert space H is closed if and onlyif S = S⊥⊥.

Proof. By Theorem 13.13, cl(S) = S⊥⊥. Therefore S = S⊥⊥ if and only if S = cl(S),i.e. S is closed. �

Theorem 257. [Exercise 13.7] Let O = {u1, u2, . . . } be an orthonormal set in H. Ifx =

∑k rkuk converges, then

‖x‖2 =∞∑k=1

|rk|2 .

Proof. For all n, ∥∥∥∥∥n∑k=1

rkuk

∥∥∥∥∥2

=n∑k=1

|rk|2 .

Since ‖·‖ is continuous, letting n→∞ proves the result. �

Page 85: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

85

Theorem 258. [Exercise 13.8] If an infinite series∞∑k=1

xk

converges absolutely in a Hilbert space H, then it also converges in the sense of the“net” definition.

Proof. Suppose that∑∞

k=1 xk → x and let ε > 0. Choose an integer N such that|∑n

k=1 xk − x| < ε/2 for every n ≥ N . Since the sum converges absolutely, we may alsochoose an integer N ′ ≥ N such that

∑∞k=N ′ ‖xk‖ < ε/2. Let K = {1, 2, . . . , N ′}, let T

be a finite set with K ⊂ T ⊂ N, and let m = maxT . Then∣∣∣∣∣∑k∈T

xk − x

∣∣∣∣∣ =

∣∣∣∣∣∣∑k∈K

xk − x+∑k∈T\K

xk

∣∣∣∣∣∣≤

∣∣∣∣∣∑k∈K

xk − x

∣∣∣∣∣+∑k∈T\K

‖xk‖

< ε.

Example 259. [Exercise 13.10] Find a countably infinite sum of real numbers thatconverges in the sense of partial sums, but not in the sense of nets.

Consider the sum∞∑k=1

(−1)k

k.

If this sum converges in the “net” sense, then by Theorem 13.20 there exists a finiteset I ⊆ K such that

J ∩ I = ∅, J finite⇒

∣∣∣∣∣∑k∈J

(−1)k

k

∣∣∣∣∣ ≤ ε.

Let m = max I and let J = {m+ 1,m+ 3, . . . ,m+ 2n+ 1}. As n→∞ we must havea contradiction since the harmonic series diverges.

Theorem 260. [Exercise 13.11] If a Hilbert space H has infinite Hilbert dimension,then no Hilbert basis for H is a Hamel basis.

Proof. Choose some infinite sequence (vn) from H and consider the element

v =∞∑n=1

1

nvn.

Page 86: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

86

The sum converges since∑∞

n=1 1/n2 converges. But if H is a Hamel basis, then v canbe written as a (finite) linear combination of elements from H, which contradicts theequation above. �

Theorem 261. [Exercise 13.13] Any linear map between finite-dimensional Hilbertspaces is bounded.

Proof. Let τ ∈ L(V,W ) and choose an orthonormal basis B = {v1, . . . , vm} for V . LetM = max {‖τv1‖ , . . . , ‖τvm‖}. If ‖x‖ = 1 then we can write x = a1v1 + · · · + amvmwhere |a1| , . . . , |am| ≤ 1, so

‖τx‖ = ‖a1τv1 + · · ·+ amτvm‖≤ |a1| ‖τv1‖+ |am| ‖τvm‖≤ mM.

This shows that τ is bounded. �

Theorem 262. [Exercise 13.14] If f ∈ H∗, then ker(f) is a closed subspace of H.(Note that H∗ denotes the set of all bounded linear functionals on H.)

Proof. Since f is bounded, it is continuous. But ker(f) = f−1({0}), which is closedsince {0} is closed. �

Theorem 263. [Exercise 13.15] A Hilbert space H is separable if and only if hdim(H) ≤ℵ0.

Proof. Let B be a Hilbert basis for H so that cspan(B) = H, i.e. B is dense in H. Itfollows that H is separable if and only if hdim(H) = |B| ≤ ℵ0. �

Remark 264. [Exercise 13.16] Can a Hilbert space have countably infinite Hamel di-mension?

Suppose that a Hilbert space H with infinite Hilbert dimension has a countably infiniteHamel basis B = {v1, v2, . . . }. By Theorem 9.11, there exists an orthogonal sequenceO = {u1, u2, . . . } with the property that 〈u1, . . . , uk〉 = 〈v1, . . . , vk〉 for all k > 0. Inparticular, span(O) = H, so O is a Hilbert basis. This contradicts Theorem 260.

Theorem 265. [Exercise 13.18] Let τ and σ be bounded linear operators on H.

(1) ‖rτ‖ ≤ |r| ‖τ‖.(2) ‖τ + σ‖ ≤ ‖τ‖+ ‖σ‖.(3) ‖τσ‖ ≤ ‖τ‖ ‖σ‖.

Page 87: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

87

Proof. For all ‖x‖ = 1 we have

‖(rτ)x‖ = |r| ‖τx‖ ≤ |r| ‖τ‖ ,‖(τ + σ)x‖ ≤ ‖τx‖+ ‖σx‖ ≤ ‖τ‖+ ‖σ‖ ,‖τσx‖ ≤ ‖τ‖ ‖σx‖ ≤ ‖τ‖ ‖σ‖ .

The results follow from the properties of sup and Theorem 252. �

Theorem 266. [Exercise 13.19] There exists a conjugate isomorphism between H∗ andH for any Hilbert space H. In particular, if H is a real Hilbert space then H ∼= H∗.

Proof. Let R : H∗ → H be the Riesz map, which is injective by Theorem 13.32.Furthermore, for any z0 ∈ H the linear functional x 7→ 〈x, z0〉 is bounded, so R is abijection. It is easy to see that R is a conjugate-linear map. �

Chapter 14. Tensor Products

Definition 267. If f : V1 × · · · × Vn → W is a multilinear map, then we denote thesubspace of W generated by f(V1 × · · · × Vn) by Im(f).

Theorem 268. Let U, V be vector spaces. A pair (T, t) is a tensor product of U withV if and only if the following two conditions hold:

(1) T = Im(t).(2) If f : U × V → W is a bilinear map then there exists a (not necessarily unique)

linear map τ : T → W such that f = τt.

Proof. Suppose that the two conditions hold; we need to prove that if τ, σ : T → W aretwo linear maps such that f = τt and f = σt, then τ = σ. But (τ − σ)t = f − f = 0,which implies that τ = σ since t is surjective (from the first condition). Now supposethat (T, t) is a tensor product of U with V . Condition (2) follows immediately, so itremains to prove (1). Write X = Im(t) and U ⊗ V = T .

U × V U ⊗ V

X

t

ρρ j

Page 88: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

88

Define the bilinear map ρ : U × V → X by (u, v) 7→ u ⊗ v; there exists a uniqueρ : U ⊗ V → X such that ρt = ρ. Let j : X → U ⊗ V be the inclusion map. Nowjρ = t, so jρt = jρ = t. But ιt = t where ι is the identity map on U ⊗ V , so by theuniversal property (for bilinear maps from U×V to U⊗V ) we have jp = ι. This showsthat j is surjective, i.e. U ⊗ V ⊆ X. �

Theorem 269. Let V1, . . . , Vn,W1, . . . ,Wm be vector spaces over a field F .

(1) There exists a unique isomorphism

τ : (V1 ⊗ · · · ⊗ Vn)⊗ (W1 ⊗ · · · ⊗Wm)→ V1 ⊗ · · · ⊗ Vn ⊗W1 ⊗ · · · ⊗Wm

for which

τ [(v1 ⊗ · · · ⊗ vn)⊗ (w1 ⊗ · · · ⊗ wm)] = v1 ⊗ · · · ⊗ vn ⊗ w1 ⊗ · · · ⊗ wm.In particular,

(U ⊗ V )⊗W ∼= U ⊗ (V ⊗W ) ∼= U ⊗ V ⊗W.(2) If π is a permutation of the indices {1, . . . , n}, then there exists a unique iso-

morphismσ : V1 ⊗ · · · ⊗ Vn → Vπ(1) ⊗ · · · ⊗ Vπ(n)

for whichσ(v1 ⊗ · · · ⊗ vn) = vπ(1) ⊗ · · · ⊗ vπ(n).

(3) There exists a unique isomorphism ρ : F ⊗ V → V for which

ρ(a⊗ v) = av.

Similarly, there exists a unique isomorphism ρ′ : V ⊗ F → V for which

ρ′(v ⊗ a) = av.

Hence, F ⊗ V ∼= V ∼= V ⊗ F .(4) There exists a unique isomorphism ϕ : (U ⊕V )⊗W → (U ⊗W )� (V ⊗W ) for

whichϕ((u+ v)⊗ w) = (u⊗ w, v ⊗ w),

and similarly W ⊗ (U ⊕ V ) ∼= (W ⊗ U) � (W ⊗ V ).

Proof. Define the multilinear maps

t : (V1 ⊗ · · · ⊗ Vn)× (W1 ⊗ · · · ⊗Wm)→ V1 ⊗ · · · ⊗ Vn ⊗W1 ⊗ · · · ⊗Wm

(v1 ⊗ · · · ⊗ vn, w1 ⊗ · · · ⊗ wm) 7→ v1 ⊗ · · · ⊗ vn ⊗ w1 ⊗ · · · ⊗ wmand

s : V1 × · · · × Vn → Vπ(1) ⊗ · · · ⊗ Vπ(n)v1 × · · · × vn 7→ vπ(1) ⊗ · · · ⊗ vπ(n)

Page 89: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

89

so that the linear maps τ and σ exist by the universal property. It is easy to see thatthese maps are bijections; this proves (1) and (2). For (3), define the bilinear mapr : F × V → V by (a, v) 7→ av so that there exists a unique linear map ρ : F ⊗ V → Vwith ρ(a⊗v) = av. But linear map v 7→ 1⊗v is an inverse to ρ, so ρ is an isomorphism.For (4), define

e : (U ⊕ V )×W → (U ⊗W ) � (V ⊗W )

(u+ v, w) 7→ (u⊗ w, v ⊗ w)

so that ϕ exists by the universal property. Similarly, the bilinear maps (u,w) 7→ u⊗wand (v, w) 7→ v ⊗ w induce unique linear maps f : U ⊗ W → (U ⊕ V ) ⊗ W andg : V ⊗W → (U ⊕ V )⊗W . The map h : (U ⊗W ) � (V ⊗W )→ (U ⊕ V )⊗W givenby (a, b) 7→ f(a) + g(b) is an inverse to e, so e is an isomorphism. �

Theorem 270. Let z ∈ U ⊗ V with

z =r∑i=1

xi ⊗ yi

where the sets {xi} ⊆ U and {yi} ⊆ V contain exactly r elements and are linearlyindependent. If {wi} ⊆ U and {zi} ⊆ V are sets with exactly r elements and arelinearly independent, then the equation

z =r∑i=1

wi ⊗ zi

holds if and only if there exist invertible r × r matrices A,B for which ATB = I and

wi =r∑j=1

Ai,jxj and zi =r∑

k=1

Bi,kyk

for i = 1, . . . , r.

Proof. If there exist matrices A,B satisfying the given conditions, then

r∑i=1

wi ⊗ zi =r∑i=1

r∑j=1

r∑k=1

Ai,jBi,k(xj ⊗ yk)

=r∑j=1

r∑k=1

r∑i=1

ATj,iBi,k(xj ⊗ yk)

=r∑j=1

r∑k=1

[ATB]j,k(xj ⊗ yk)

Page 90: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

90

=r∑i=1

xi ⊗ yi

= z

since ATB = I. The converse was shown in (14.3) from the text. �

Theorem 271. [Exercise 14.1] If τ : W → X is a linear map and b : U × V → W isbilinear, then τ ◦ b : U × V → X is bilinear.

Proof. We have

(τ ◦ b)(αx+ βy, z) = τ(αb(x, z) + βb(y, z))

= α(τ ◦ b)(x, z) + β(τ ◦ b)(y, z).Similarly, τ ◦ b is linear in the other coordinate. �

Theorem 272. [Exercise 14.2] The only map that is both linear and n-linear (forn ≥ 2) is the zero map.

Proof. Let τ : V n → W be a linear map that is also n-linear. For all v1, . . . , vn ∈ V ,

τ(v1, . . . , vn) =n∑k=1

τ(0, . . . , vk, . . . , 0)

= 0

where the first equality follows from the linearity of τ and the second equality followsfrom the n-linearity of τ . �

Example 273. [Exercise 14.3] Find an example of a bilinear map τ : V × V → Wwhose image im(τ) = {τ(u, v) | u, v ∈ V } is not a subspace of W .

The tensor map t : U × V → U ⊗ V is an example, since its image consists of alldecomposable tensors. If u ⊗ v, u′ ⊗ v′ are decomposable tensors, then u ⊗ v + u′ ⊗ v′is not necessarily a decomposable tensor.

Theorem 274. [Exercise 14.4] Let B = {ui | i ∈ I} be a basis for U and let C ={vj | j ∈ J} be a basis for V . Then the set

D = {ui ⊗ vj | i ∈ I, j ∈ J}is a basis for U ⊗ V .

Proof. We give an argument that shows that D is linearly independent and spans U⊗V .Suppose that ∑

i,j

ri,jui ⊗ vj =∑i

ui ⊗∑j

ri,jvj = 0.

Page 91: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

91

Since B is linearly independent, Theorem 14.5 shows that∑j

ri,jvj = 0

for each i. But C is linearly independent, so ri,j = 0 for all i, j. This shows that D islinearly independent. Let t : U × V → U ⊗ V be the tensor map. Since B and C spanU and V respectively, we have

span({t(ui, vj) | i ∈ I, j ∈ J}) = span({t(u, v) | u ∈ U, v ∈ V })= Im(t)

= U ⊗ Vby Theorem 268. �

Theorem 275. [Exercise 14.5] The following property of a pair (W, g : U × V → W )with g bilinear characterizes the tensor product (U ⊗ V, t : U × V → U ⊗ V ) up toisomorphism: For a pair (W, g : U × V → W ) with g bilinear, if {ui} is a basis for Uand {vi} is a basis for V , then {g(ui, vj)} is a basis for W .

Proof. Theorem 274 already shows one direction; we must prove that if (W, g : U×V →W ) satisfies the property, then it is a tensor product of U with V . But {ui ⊗ vj}is a basis for U ⊗ V , so we can define a linear map τ : U ⊗ V → W by settingτ(ui ⊗ vj) = g(ui, vj). This map carries a basis of U ⊗ V to a basis of W , so it is anisomorphism and therefore W ∼= U ⊗ V . �

Theorem 276. [Exercise 14.7] Let X and Y be nonempty sets. Then FX×Y ∼= FX⊗FY ,where FA is the free vector space on A.

Proof. Let t : FX × FY → FX ⊗ FY be the tensor map. Define the bilinear mapτ : FX ×FY → FX×Y by

τ

(∑i

aixi,∑j

bjyj

)=∑i,j

aibj(xi, yj)

so that there exists a unique τ : FX ⊗FY → FX×Y with τ t = τ . Clearly ker(τ) = {0},so τ is injective. If ∑

i,j

ri,j(xi, yj) ∈ FX×Y

then

τ

(∑i,j

ri,j(xi ⊗ yj)

)=∑i,j

ri,j(xi, yj),

which shows that τ is surjective. �

Page 92: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

92

Theorem 277. [Exercise 14.8] Let u, u′ ∈ U and v, v′ ∈ V with u ⊗ v 6= 0. Thenu⊗ v = u′ ⊗ v′ if and only if u′ = ru and v′ = r−1v for some r 6= 0.

Proof. One direction is evident. Suppose that u⊗ v = u′⊗ v′. For any multilinear mapf : U × V → W we have f(u, v) = τ(u⊗ v) and in particular,

f(u, v) = τ(u⊗ v) = τ(u′ ⊗ v′) = f(u′, v′).

This is true for any multilinear map f , so it is true if g ∈ U∗, h ∈ V ∗ and f : U×V → Fis given by f(u, v) = g(u)h(v):

g(u)h(v) = g(u′)h(v′).

Note that u⊗ v 6= 0 implies that u, v 6= 0; we can choose h ∈ V ∗ with h(v) 6= 0 so that

g(u) = g

(h(v′)

h(v)u′

)for all g ∈ U∗. By Corollary 50 we have

u =h(v′)

h(v)u′,

noting that h(v′)/h(v) 6= 0 since u 6= 0. Now choose g ∈ U∗ with g(u) = 1 so that

h(v) = g(u)h(v) = g(u′)h(v′) = h

(h(v)

h(v′)v′

)for all h ∈ V ∗. Applying Corollary 50 once more shows that

v =h(v)

h(v′)v′.

Note that this result can also be proved by using Theorem 14.6. �

Theorem 278. [Exercise 14.9] Let B = {bi} be a basis for U and C = {cj} be a basisfor V . Any function f : B × C → W can be extended uniquely to a linear functionf : U⊗V → W . Therefore, the function f can be extended in a unique way to a bilinear

map f : U × V → W . All bilinear maps are in fact obtained in this way.

Proof. The first statement follows from the fact that {bi ⊗ cj} is a basis for U ⊗ V and

Theorem 14. Let t : U × V → U ⊗ V be the tensor map. Since ft : U × V → W is a

bilinear map that extends f , the function f exists. In fact, if f is a bilinear map that

extends f then f = τt for a unique τ : U ⊗ V → W such that τ(bi ⊗ cj) = f(bi, cj) for

all i, j. But f is already such a linear map, so f = τt = ft. This proves the secondstatement. If g : U × V → W is a bilinear map then we can define f(bi, cj) = g(bi, cj)

Page 93: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

93

for all i, j. The function f can be extended in a unique way to a bilinear map f such

that f(bi, cj) = f(bi, cj) for all i, j. By the uniqueness of f , we have f = g. �

Theorem 279. Let U, V be vector spaces, let S be a subspace of U and let T be asubspace of V . If (T, t) is a tensor product of U with V then (Im(t|S×T ), t|S×T ) is atensor product of S with T .

Proof. Let f : S × T → W be a bilinear map. Choose a bilinear map g : U × V → Wsuch that g|S×T = f ; there exists a linear map τ : T → W with g = τt. But

f = g|S×T = τ |Im(t|S×T )t|S×T ,so condition (1) of Theorem 268 is satisfied. Condition (2) is automatically satisfied,so this proves the result. �

Lemma 280. [Exercise 14.10] Let S1, S2 be subspaces of V . Then

(S1 ⊗ V ) ∩ (S2 ⊗ V ) = (S1 ∩ S2)⊗ Vwhere the tensor products are interpreted as subspaces of V ⊗ V (cf. Theorem 279).

Proof. Let B = {vi} be a basis of V . Let z ∈ (S1 ⊗ V ) ∩ (S2 ⊗ V ) and write

z =m∑i=1

xi ⊗ vji =n∑i=1

yi ⊗ vki

where xi ∈ S1 and yi ∈ S2. After reindexing, we have

z =r∑i=1

xi ⊗ vji =r∑i=1

yi ⊗ vji

so thatr∑i=1

(xi − yi)⊗ vji = 0.

Since the {vji} are linearly independent, xi = yi for each i and therefore z ∈ (S1∩S2)⊗V .The other inclusion is obvious. �

Lemma 281. [Exercise 14.11] Let S ⊆ U and T ⊆ V be subspaces of vector spaces Uand V respectively. Then

(S ⊗ V ) ∩ (U ⊗ T ) = S ⊗ T.

Proof. Choose a subspace T ′ ⊆ V such that V = T ⊕ T ′. Then

(S ⊗ V ) ∩ (U ⊗ T ) = (S ⊗ (T + T ′)) ∩ (U ⊗ T )

= ((S ⊗ T ) + (S ⊗ T ′)) ∩ (U ⊗ T )

= (S ⊗ T ) + ((S ⊗ T ′) ∩ (U ⊗ T ))

Page 94: Chapter 1. Vector Spaces - wj32 · Chapter 1. Vector Spaces ... Example 4. [Exercise 1.4] Suppose that V is a vector space with basis B= fb ... Let V and W be vector spaces, let

94

by Theorem 5. But

(S ⊗ T ′) ∩ (U ⊗ T ) ⊆ (U ⊗ T ′) ∩ (U ⊗ T )

= U × (T ′ ∩ T )

= {0}by Lemma 280 so

(S ⊗ V ) ∩ (U ⊗ T ) = (S ⊗ T ) + {0} = S ⊗ T.�

Theorem 282. [Exercise 14.12] Let S1, S2 ⊆ U and T1, T2 ⊆ V be subspaces of U andV respectively. Then

(S1 ⊗ T1) ∩ (S2 ⊗ T2) = (S1 ∩ S2)⊗ (T1 ∩ T2).

Proof. Applying Lemma 280 twice gives

(S1 ⊗ T1) ∩ (S2 ⊗ T2) ⊆ (S1 ⊗ V ) ∩ (S2 ⊗ V ) = (S1 ∩ S2)⊗ Vand

(S1 ⊗ T1) ∩ (S2 ⊗ T2) ⊆ (U ⊗ T1) ∩ (U ⊗ T2) = U ⊗ (T1 ∩ T2).By Lemma 281,

(S1 ⊗ T1) ∩ (S2 ⊗ T2) ⊆ ((S1 ∩ S2)⊗ V ) ∩ (U ⊗ (T1 ∩ T2))= (S1 ∩ S2)⊗ (T1 ∩ T2).

The other inclusion is clear. �

Theorem 283. [Exercise 14.14] Let ιX denote the identity operator on a vector spaceX. For any vector spaces V and W , we have ιV � ιW = ιV⊗W .

Proof. By definition, ιV � ιW : V ⊗W → V ⊗W is the unique linear map such that

(ιV � ιW )(v ⊗ w) = ιV v ⊗ ιWw = v ⊗ wfor all v ∈ V and w ∈ W . But ιV⊗W also satisfies the equation above, so ιV � ιW =ιV⊗W . �

Theorem 284. [Exercise 14.15] Let τ1 : U → V, τ2 : V → W and σ1 : U′ → V′, σ2 : V′ → W′. Then
(τ2 ◦ τ1) ⊗ (σ2 ◦ σ1) = (τ2 ⊗ σ2) ◦ (τ1 ⊗ σ1).

Proof. For all u ∈ U and u′ ∈ U′,
[(τ2 ⊗ σ2) ◦ (τ1 ⊗ σ1)](u ⊗ u′) = (τ2 ⊗ σ2)(τ1u ⊗ σ1u′) = (τ2 ◦ τ1)u ⊗ (σ2 ◦ σ1)u′.
By definition, the result follows. �
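In matrix form this is the mixed-product property of the Kronecker product. A quick numerical sanity check with numpy (the sizes and random matrices below are arbitrary test data, not taken from the text):

import numpy as np

rng = np.random.default_rng(0)
# tau1 : U -> V, tau2 : V -> W and sigma1 : U' -> V', sigma2 : V' -> W' as matrices
A1, A2 = rng.normal(size=(3, 2)), rng.normal(size=(4, 3))   # tau1, tau2
B1, B2 = rng.normal(size=(5, 4)), rng.normal(size=(2, 5))   # sigma1, sigma2

lhs = np.kron(A2 @ A1, B2 @ B1)            # (tau2 tau1) tensor (sigma2 sigma1)
rhs = np.kron(A2, B2) @ np.kron(A1, B1)    # (tau2 tensor sigma2)(tau1 tensor sigma1)
assert np.allclose(lhs, rhs)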


Theorem 285. [Exercise 14.16] Let V be a vector space over a field F and let K be a field extension of F. Then Fn ⊗F K ≅ Kn as K-spaces, with scalar multiplication on Fn ⊗F K defined as α(v ⊗ k) = v ⊗ αk where α ∈ K.

Proof. Define the multilinear F-map f : Fn × K → Kn by (v, r) ↦ rv. This induces a linear F-map ϕ : Fn ⊗F K → Kn satisfying ϕ(v ⊗ r) = rv. But if k ∈ K then
ϕ(k(v ⊗ r)) = ϕ(v ⊗ kr) = krv = kϕ(v ⊗ r),
so ϕ is also a linear K-map. Furthermore, ϕ carries the basis {ei ⊗ 1} of the K-space Fn ⊗F K to the basis {ei} of Kn, so ϕ is an isomorphism. �

Example 286. [Exercise 14.17] Prove that in a tensor product U ⊗ U for which dim(U) ≥ 2, not all vectors have the form u ⊗ v for some u, v ∈ U.

Let u, v ∈ U be linearly independent vectors and consider z = u ⊗ v + v ⊗ u. By Theorem 14.6 the rank of z is 2, so it does not have the form x ⊗ y.

Theorem 287. [Exercise 14.18] For the block matrix
M = [ A  B ]
    [ 0  C ]
we have det(M) = det(A) det(C).

Proof. Suppose that M is an n × n matrix and A is an m × m matrix. Define the map f : Mm → F given by
X ↦ det [ X  B ]
        [ 0  I ],
which is multilinear and antisymmetric in the columns of X. Corollary 14.20 shows that
(*) f(X) = det [ I  B ] det(X) = det(X)
               [ 0  I ]
since the last matrix is upper triangular. If we define the map g : Mn−m → F given by
X ↦ det [ A  B ]
        [ 0  X ],
a similar argument shows that
g(X) = det [ A  B ] det(X) = det(A) det(X)
           [ 0  I ]
by using (*). Therefore
det [ A  B ] = g(C) = det(A) det(C).
    [ 0  C ]
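A quick numerical check of det(M) = det(A) det(C) for a block upper triangular matrix, using numpy (sizes and entries are arbitrary test data):

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 4))
M = np.block([[A, B], [np.zeros((4, 3)), C]])   # block upper triangular
assert np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(C))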


Theorem 288. [Exercise 14.19] Let A, B ∈ Mn(F). If either A or B is invertible, then the matrices A + αB are invertible except for a finite number of α's.

Proof. If A is invertible, then for α ≠ 0,
det(A + αB) = 0 ⇔ det(A(I + αA⁻¹B)) = 0
            ⇔ det(I + αA⁻¹B) = 0
            ⇔ det(α⁻¹I + A⁻¹B) = 0.
But A⁻¹B has only a finite number of eigenvalues, so det(A + αB) = 0 for only finitely many values of α. If B is invertible, a similar argument applies. �
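Concretely, the exceptional values are α = −1/λ for the nonzero eigenvalues λ of A⁻¹B, so there are at most n of them. A small numpy illustration under this reading (the matrices are arbitrary test data):

import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n))          # generically invertible
B = rng.normal(size=(n, n))
eig = np.linalg.eigvals(np.linalg.inv(A) @ B)
bad = [-1 / lam for lam in eig if abs(lam) > 1e-12]   # at most n exceptional alphas

# det(A + alpha*B) vanishes at these alphas; every other alpha gives an invertible matrix
for a in bad:
    assert abs(np.linalg.det(A + a * B)) < 1e-8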

The Tensor Product of Matrices.

Theorem 289. [Exercise 14.20] Let A = (ai,j) be the matrix of a linear operator τ ∈ L(V) with respect to the ordered basis A = (u1, . . . , un). Let B = (bi,j) be the matrix of a linear operator σ ∈ L(V) with respect to the ordered basis B = (v1, . . . , vn). Consider the ordered basis C = (ui ⊗ vj) ordered lexicographically; that is, ui ⊗ vj < uℓ ⊗ vk if i < ℓ, or i = ℓ and j < k. Then the matrix of τ ⊗ σ with respect to C is

A ⊗ B = [ a1,1B  a1,2B  · · ·  a1,nB ]
        [ a2,1B  a2,2B  · · ·  a2,nB ]
        [   ⋮      ⋮     ⋱      ⋮   ]
        [ an,1B  an,2B  · · ·  an,nB ].

This matrix is called the tensor product, Kronecker product or direct product of the matrix A with the matrix B.

Proof. Let x, y ∈ V and write
x = ∑_{i=1}^{n} ri ui and y = ∑_{j=1}^{n} sj vj.
We have
(τ ⊗ σ)(x ⊗ y) = (τ ⊗ σ)( ∑_{1≤i,j≤n} ri sj (ui ⊗ vj) )
  = ∑_{1≤i,j≤n} ri sj (τui ⊗ σvj)
  = ∑_{1≤i,j≤n} ri sj ( ∑_{k=1}^{n} a_{k,i} uk ⊗ ∑_{ℓ=1}^{n} b_{ℓ,j} vℓ )
  = ∑_{1≤i,j,k,ℓ≤n} ri sj a_{k,i} b_{ℓ,j} (uk ⊗ vℓ),
so that
([(τ ⊗ σ)(x ⊗ y)]C)k = ∑_{1≤i,j≤n} ri sj a_{⌈k/n⌉,i} b_{n+k−n⌈k/n⌉,j}
  = ∑_{i=1}^{n²} a_{⌈k/n⌉,⌈i/n⌉} b_{n+k−n⌈k/n⌉, n+i−n⌈i/n⌉} r_{⌈i/n⌉} s_{n+i−n⌈i/n⌉}
  = ∑_{i=1}^{n²} (A ⊗ B)_{k,i} r_{⌈i/n⌉} s_{n+i−n⌈i/n⌉}
  = ∑_{i=1}^{n²} (A ⊗ B)_{k,i} ([x ⊗ y]C)i
  = ((A ⊗ B)[x ⊗ y]C)k.

Therefore [τ ⊗ σ]C = A⊗B. �
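The defining property can be checked directly with numpy, whose kron uses exactly this lexicographic ordering (the matrices and vectors below are arbitrary test data):

import numpy as np

rng = np.random.default_rng(3)
n = 3
A = rng.normal(size=(n, n))   # matrix of tau
B = rng.normal(size=(n, n))   # matrix of sigma
x = rng.normal(size=n)
y = rng.normal(size=n)

# (tau tensor sigma)(x tensor y) = (tau x) tensor (sigma y), with coordinates given by np.kron
assert np.allclose(np.kron(A, B) @ np.kron(x, y), np.kron(A @ x, B @ y))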

Example 290. [Exercise 14.21] Show that the tensor product is not, in general, commutative.

If
A = [ 1  0 ]   and   B = [ 0  1 ]
    [ 0  1 ]             [ 1  0 ]
then
A ⊗ B = [ 0  1  0  0 ]   and   B ⊗ A = [ 0  0  1  0 ]
        [ 1  0  0  0 ]                 [ 0  0  0  1 ]
        [ 0  0  0  1 ]                 [ 1  0  0  0 ]
        [ 0  0  1  0 ]                 [ 0  1  0  0 ].
Theorem 291. [Exercise 14.22-14.30] Let A, B be matrices over F (square, of the sizes indicated in each statement).

(1) The tensor product A ⊗ B is bilinear in both A and B.
(2) A ⊗ B = 0 if and only if A = 0 or B = 0.
(3) (A ⊗ B)T = AT ⊗ BT.
(4) (A ⊗ B)* = A* ⊗ B* when F = C.
(5) If u, v ∈ Fn, then (as row vectors) uTv = uT ⊗ v.
(6) Let Am,n, Bp,q, Cn,k, Dq,r be matrices of the given sizes. Then
(A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).
In particular, if u, v are column vectors of appropriate sizes then
Au ⊗ Bv = (A ⊗ B)(u ⊗ v).
(7) If A and B are invertible, then so is A ⊗ B and
(A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹.
(8) tr(A ⊗ B) = tr(A) tr(B).
(9) Suppose that F is algebraically closed. If A has eigenvalues λ1, . . . , λn and B has eigenvalues µ1, . . . , µm, both lists including multiplicity, then A ⊗ B has eigenvalues
{λiµj | 1 ≤ i ≤ n, 1 ≤ j ≤ m},
again counting multiplicity.
(10) det(An,n ⊗ Bm,m) = det(An,n)^m det(Bm,m)^n.

Proof. Parts (1) to (5) are clear. Part (6) follows from Theorem 284. If k = r = 1 then C and D are column vectors, so we have the result that
Au ⊗ Bv = (A ⊗ B)(u ⊗ v) = (A ⊗ B)(u1v, . . . , unv)T,
which is simply a version of Theorem 289. For (7), put C = A⁻¹ and D = B⁻¹ in (6). For (8), we have
tr(A ⊗ B) = tr(a1,1B) + · · · + tr(an,nB)
  = (a1,1 + · · · + an,n) tr(B)
  = tr(A) tr(B).
For (9), let u1, . . . , un be eigenvectors for λ1, . . . , λn and let v1, . . . , vm be eigenvectors for µ1, . . . , µm. Then
(A ⊗ B)(ui ⊗ vj) = Aui ⊗ Bvj = λiµj(ui ⊗ vj)
for each i, j. Part (10) follows from (9), taking the algebraic closure of F if necessary. �
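A few of these identities checked numerically with numpy; the test matrices are arbitrary, and symmetric matrices are used so that all eigenvalues are real, which makes the comparison in (9) straightforward:

import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(3, 3)); A = A + A.T       # symmetric 3x3
B = rng.normal(size=(2, 2)); B = B + B.T       # symmetric 2x2
K = np.kron(A, B)

assert np.allclose(K.T, np.kron(A.T, B.T))                                           # (3)
assert np.allclose(np.linalg.inv(K), np.kron(np.linalg.inv(A), np.linalg.inv(B)))    # (7)
assert np.isclose(np.trace(K), np.trace(A) * np.trace(B))                            # (8)
lam, mu = np.linalg.eigvalsh(A), np.linalg.eigvalsh(B)                               # (9)
assert np.allclose(np.sort(np.outer(lam, mu).ravel()), np.sort(np.linalg.eigvalsh(K)))
assert np.isclose(np.linalg.det(K), np.linalg.det(A) ** 2 * np.linalg.det(B) ** 3)   # (10)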

Applications of the Exterior Algebra.

Remark 292. If u, v ∈ ∧V satisfy u = αv for some scalar α, we write u/v for α.

Theorem 293. Let V be a finite-dimensional vector space. If v1, . . . , vk ∈ V are distinct vectors, then v1 ∧ · · · ∧ vk = 0 if and only if v1, . . . , vk are linearly dependent.

Proof. Suppose that v1, . . . , vk are linearly dependent and assume without loss of generality that a1v1 + · · · + akvk = 0 with a1 ≠ 0. Then v1 = −a1⁻¹(a2v2 + · · · + akvk), so
v1 ∧ · · · ∧ vk = −a1⁻¹(a2v2 + · · · + akvk) ∧ v2 ∧ · · · ∧ vk = 0.

Conversely, suppose that v1, . . . , vk are linearly independent. We can extend {v1, . . . , vk} to a basis {v1, . . . , vn} of V, and since {v1 ∧ · · · ∧ vn} is a basis of ∧n V by Theorem 14.17, we must have v1 ∧ · · · ∧ vn ≠ 0. But v1 ∧ · · · ∧ vk is a factor of v1 ∧ · · · ∧ vn, so v1 ∧ · · · ∧ vk ≠ 0. �

Definition 294. Let τ ∈ L(V, V) where V is finite-dimensional. The alternating multilinear map
f : Vn → ∧n V,
(v1, . . . , vn) ↦ τv1 ∧ · · · ∧ τvn
induces a unique linear operator ϕ : ∧n V → ∧n V with ϕ(v1 ∧ · · · ∧ vn) = τv1 ∧ · · · ∧ τvn. Since ∧n V is one-dimensional, we can define the determinant det(τ) to be the unique number satisfying ϕ = det(τ)ι, where ι is the identity on ∧n V. Similarly, the alternating multilinear map
(v1, . . . , vn) ↦ ∑_{k=1}^{n} v1 ∧ · · · ∧ τvk ∧ · · · ∧ vn
induces a unique linear operator ψ : ∧n V → ∧n V; the trace tr(τ) is defined as the unique number satisfying ψ = tr(τ)ι.

Theorem 295. Let τ, σ ∈ L(V, V) where dim(V) = n. Let A = [a1 · · · an] be an n × n matrix with entries in a field F, and let {e1, . . . , en} be the standard basis of Fn.

(1) det(ι) = 1 and det(0) = 0.
(2) det(τσ) = det(τ) det(σ).
(3) det(rτ) = r^n det(τ).
(4) a1 ∧ · · · ∧ an = det(A)e1 ∧ · · · ∧ en.
(5) det(A) = ∑_{ρ∈Sn} (−1)^ρ Aρ(1),1 · · · Aρ(n),n.
(6) det(AT) = det(A).
(7) If ϕ : V → W is an isomorphism, then det(ϕτϕ⁻¹) = det(τ).
(8) If A = [τ]B where B is a basis of V, then det(A) = det(τ).

Proof. (1) is obvious. For all v1, . . . , vn ∈ V,
det(τσ)v1 ∧ · · · ∧ vn = τσv1 ∧ · · · ∧ τσvn
  = det(τ)σv1 ∧ · · · ∧ σvn
  = det(τ) det(σ)v1 ∧ · · · ∧ vn
and
det(rτ)v1 ∧ · · · ∧ vn = rτv1 ∧ · · · ∧ rτvn
  = r^n τv1 ∧ · · · ∧ τvn
  = r^n det(τ)v1 ∧ · · · ∧ vn,
which proves (2) and (3). (4) follows from the computation
a1 ∧ · · · ∧ an = Ae1 ∧ · · · ∧ Aen = det(A)e1 ∧ · · · ∧ en.
For (5),
det(A)e1 ∧ · · · ∧ en = Ae1 ∧ · · · ∧ Aen
  = ∑_{i1=1}^{n} Ai1,1 ei1 ∧ · · · ∧ ∑_{in=1}^{n} Ain,n ein
  = ∑_{i1=1}^{n} · · · ∑_{in=1}^{n} Ai1,1 · · · Ain,n ei1 ∧ · · · ∧ ein
  = ∑_{ρ∈Sn} Aρ(1),1 · · · Aρ(n),n eρ(1) ∧ · · · ∧ eρ(n), since any term with a repeated index vanishes,
  = ∑_{ρ∈Sn} (−1)^ρ Aρ(1),1 · · · Aρ(n),n e1 ∧ · · · ∧ en.
For (6),
∑_{ρ∈Sn} (−1)^ρ Aρ(1),1 · · · Aρ(n),n = ∑_{ρ∈Sn} (−1)^{ρ⁻¹} A1,ρ⁻¹(1) · · · An,ρ⁻¹(n)
  = ∑_{ρ∈Sn} (−1)^ρ A1,ρ(1) · · · An,ρ(n).
Finally, suppose that ϕ : V → W is an isomorphism. We have an isomorphism ψ : ∧n V → ∧n W induced by the map (v1, . . . , vn) ↦ ϕv1 ∧ · · · ∧ ϕvn, so for all v1, . . . , vn ∈ V we have
det(τ)v1 ∧ · · · ∧ vn = τv1 ∧ · · · ∧ τvn
  = τϕ⁻¹ϕv1 ∧ · · · ∧ τϕ⁻¹ϕvn
  = ψ⁻¹(ϕτϕ⁻¹ϕv1 ∧ · · · ∧ ϕτϕ⁻¹ϕvn)
  = ψ⁻¹(det(ϕτϕ⁻¹)ϕv1 ∧ · · · ∧ ϕvn)
  = det(ϕτϕ⁻¹)v1 ∧ · · · ∧ vn.
This proves (7), and (8) follows by considering the isomorphism ϕ : V → Fn that takes B to the standard basis of Fn. �
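Part (5) is the Leibniz expansion of the determinant. A short numpy check of it against numpy's built-in determinant (the matrix is arbitrary test data):

import numpy as np
from itertools import permutations

def leibniz_det(A):
    """Sum over permutations rho of sgn(rho) * A[rho(1),1] * ... * A[rho(n),n]."""
    n = A.shape[0]
    total = 0.0
    for rho in permutations(range(n)):
        # sign of the permutation, computed by counting inversions
        sign = (-1) ** sum(rho[i] > rho[j] for i in range(n) for j in range(i + 1, n))
        term = 1.0
        for col in range(n):
            term *= A[rho[col], col]
        total += sign * term
    return total

A = np.random.default_rng(5).normal(size=(4, 4))
assert np.isclose(leibniz_det(A), np.linalg.det(A))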

Theorem 296. Let A = [v1 · · · vn] be an n × n matrix with entries in a field F. The following are equivalent:

(1) A is invertible.
(2) v1 ∧ · · · ∧ vn ≠ 0.
(3) det(A) ≠ 0.


Proof. (1) and (2) are equivalent by Theorem 293, and (2) ⇔ (3) follows from Theorem 295. �

Theorem 297. [Cramer's rule] Let A ∈ Mn(F) be an invertible matrix with columns v1, . . . , vn and let x = (x1, . . . , xn) be a column vector. If Ax = y then
xk = (v1 ∧ · · · ∧ vk−1 ∧ y ∧ vk+1 ∧ · · · ∧ vn) / (v1 ∧ · · · ∧ vn) = det(Ak) / det(A)
for every k, where Ak is the matrix obtained by replacing the kth column of A with y.

Proof. Let {e1, . . . , en} be the standard basis of Fn and write x = x1e1 + · · · + xnen. For every k we have
v1 ∧ · · · ∧ vk−1 ∧ y ∧ vk+1 ∧ · · · ∧ vn = v1 ∧ · · · ∧ vk−1 ∧ A(x1e1 + · · · + xnen) ∧ vk+1 ∧ · · · ∧ vn = xk v1 ∧ · · · ∧ vn,
so the result follows from Theorem 295. �
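Cramer's rule in the form xk = det(Ak)/det(A) is easy to confirm numerically with numpy (the system below is arbitrary test data):

import numpy as np

rng = np.random.default_rng(6)
n = 4
A = rng.normal(size=(n, n))
x = rng.normal(size=n)
y = A @ x

for k in range(n):
    Ak = A.copy()
    Ak[:, k] = y                     # replace the k-th column of A with y
    assert np.isclose(x[k], np.linalg.det(Ak) / np.linalg.det(A))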

Chapter 15. Positive Solutions to Linear Systems: Convexity and Separation

Theorem 298. [Exercise 15.1] Any hyperplane has the form H(N, ‖N‖²) for an appropriate vector N.

Proof. Any hyperplane
H(N, b) = {x ∈ Rn | 〈N, x〉 = b}
is equivalent to the hyperplane
H(rN, rb) = {x ∈ Rn | 〈rN, x〉 = rb}
for any nonzero r ∈ R. In particular, setting r = b/‖N‖² gives the desired result. �

Theorem 299. [Exercise 15.2] If A is an m × n matrix then the set C = {Ax | x ∈ Rn, x > 0} is a convex cone in Rm.

Proof. It is clear that C is a cone. If x, y ∈ C then x = Ax′ and y = Ay′ for some x′, y′ ∈ Rn+. For 0 ≤ t ≤ 1,
tx + (1 − t)y = A(tx′ + (1 − t)y′) ∈ C
since Rn+ is convex. Therefore C is convex. �


Theorem 300. [Exercise 15.3] If A and B are strictly separated subsets of Rn and if A is compact, then A and B are strongly separated as well.

Proof. Suppose that A and B are strictly separated by the hyperplane H(N, b), and assume without loss of generality that 〈N, A〉 < b < 〈N, B〉. Since A is compact, the continuous function a ↦ 〈N, a〉 takes on its maximum b′ < b at some a0 ∈ A. We have
〈N, A〉 ≤ b′ < (b + b′)/2 < b < 〈N, B〉,
so A and B are strongly separated by the hyperplane H(N, (b + b′)/2). �

Theorem 301. [Exercise 15.4] Let V be a real vector space and let X be a subset of V. The following are equivalent:

(1) X is closed under the taking of convex combinations of any two of its points.
(2) X is closed under the taking of arbitrary convex combinations, that is, for all n ≥ 1,
x1, . . . , xn ∈ X and ∑_{i=1}^{n} ri = 1 and 0 ≤ ri ≤ 1
implies that
∑_{i=1}^{n} rixi ∈ X.

Proof. The direction (2) ⇒ (1) is clear. Suppose that (1) holds; we use induction on n. If n = 1, 2 then (2) clearly holds. Otherwise, suppose that (2) holds for n − 1 and let x1, . . . , xn ∈ X with
∑_{i=1}^{n} ri = 1 and 0 ≤ ri ≤ 1.
If r1 = · · · = rn−1 = 0 then we are done. Otherwise, let r = r1 + · · · + rn−1 so that
∑_{i=1}^{n−1} ri/r = 1 and 0 ≤ ri/r ≤ 1 for 1 ≤ i ≤ n − 1,
and therefore
∑_{i=1}^{n−1} (ri/r)xi ∈ X
by the induction hypothesis. Now r + rn = 1 and 0 ≤ r, rn ≤ 1, so
r ∑_{i=1}^{n−1} (ri/r)xi + rnxn = ∑_{i=1}^{n} rixi ∈ X
by (1). �

Remark 302. [Exercise 15.5] Explain why an (n − 1)-dimensional subspace of Rn is the solution set of a linear equation of the form a1x1 + · · · + anxn = 0.

Let a = (a1, . . . , an) and assume a ≠ 0; then the solution set is the set of all x ∈ Rn such that 〈a, x〉 = 0, i.e. the kernel of the linear functional x ↦ 〈a, x〉. It is clear that the kernel of any nonzero linear functional is (n − 1)-dimensional. Conversely, given an (n − 1)-dimensional subspace S of Rn, choose a nonzero a ∈ S⊥; then S is contained in the kernel of x ↦ 〈a, x〉, and since both have dimension n − 1 they are equal.

Theorem 303. [Exercise 15.6] If N ∈ Rn is nonzero and b ∈ R then

H+(N, b) ∩H−(N, b) = H(N, b)

and H◦+(N, b), H◦−(N, b), H(N, b) are pairwise disjoint with

H◦+(N, b) ∪H◦−(N, b) ∪H(N, b) = Rn.

Proof. Obvious. �

Theorem 304. [Exercise 15.7] A function T : Rn → Rm is affine if it has the form T(v) = τv + b for b ∈ Rm, where τ ∈ L(Rn, Rm). If C ⊆ Rn is convex, then so is T(C) ⊆ Rm.

Proof. If x, y ∈ T(C) then x = τx′ + b and y = τy′ + b for some x′, y′ ∈ C. For 0 ≤ t ≤ 1,
tx + (1 − t)y = t(τx′ + b) + (1 − t)(τy′ + b) = τ(tx′ + (1 − t)y′) + b ∈ T(C)
since tx′ + (1 − t)y′ ∈ C. �

Theorem 305. [Exercise 15.8]

(1) There exist cones in R2 that are not convex.
(2) A subset X of Rn is a convex cone if and only if x, y ∈ X implies that λx + µy ∈ X for all λ, µ ≥ 0.

Proof. An example for (1) can be constructed by taking
{ax | a ≥ 0} ∪ {ay | a ≥ 0}
for any two nonzero and nonparallel vectors x, y ∈ R2. For (2), if X is a convex cone then
λx + µy = (λ + µ)( (λ/(λ + µ))x + (µ/(λ + µ))y ) ∈ X
if λ + µ ≠ 0 (and λx + µy = 0 = 0·x ∈ X when λ = µ = 0, since X is a cone). The converse is clear. �

Theorem 306. [Exercise 15.9] The convex hull of a set {x1, . . . , xn} in Rn is bounded.


Proof. This follows easily from Theorem 15.4, but we give another argument. Let M = max {‖x1‖, . . . , ‖xn‖}. The closed disc
D = {x ∈ Rn | ‖x‖ ≤ M}
is a convex set that contains S = {x1, . . . , xn}, so C(S) ⊆ D by definition. Since D is bounded, C(S) is bounded. �

Theorem 307. [Exercise 15.10] Suppose that a vector x ∈ Rn has two distinct representations as convex combinations of the vectors v1, . . . , vn. Then the vectors v2 − v1, . . . , vn − v1 are linearly dependent.

Proof. Suppose that
x = a1v1 + · · · + anvn and x = b1v1 + · · · + bnvn
where (a1, . . . , an) ≠ (b1, . . . , bn), 0 ≤ ai, bi ≤ 1 and ∑ai = ∑bi = 1. Since ∑(ai − bi) = 0 we have
∑_{i=1}^{n} (ai − bi)vi = 0 = ∑_{i=1}^{n} (ai − bi)v1,
so
∑_{i=2}^{n} (ai − bi)(vi − v1) = 0.
Furthermore, we cannot have ai − bi = 0 for all i ≥ 2; otherwise,
a1 = 1 − a2 − · · · − an = 1 − b2 − · · · − bn = b1,
which contradicts the fact that (a1, . . . , an) ≠ (b1, . . . , bn). This shows that v2 − v1, . . . , vn − v1 are linearly dependent. �

Theorem 308. Let H = {x ∈ Rn | 〈N, x〉 = 0} be a linear hyperplane with N ≠ 0. If N′ ≠ 0 and N′ ∈ H⊥, then N′ is a nonzero scalar multiple of N.

Proof. We cannot have 〈N, N′〉 = 0, for otherwise N′ ∈ H and 〈N′, N′〉 = 0. Now
〈N, (‖N‖²/〈N, N′〉)N′ − N〉 = 〈N, N′〉‖N‖²/〈N, N′〉 − ‖N‖² = 0,
so the vector (‖N‖²/〈N, N′〉)N′ − N lies in H. Since N′ ∈ H⊥,
〈N′, (‖N‖²/〈N, N′〉)N′ − N〉 = ‖N′‖²‖N‖²/〈N, N′〉 − 〈N′, N〉 = 0.
Therefore |〈N, N′〉| = ‖N′‖ ‖N‖ and the result follows from Theorem 9.3. �

Theorem 309. [Exercise 15.11] Let C be a nonempty convex subset of Rn and let H(N, b) be a hyperplane disjoint from C. Then C lies in one of the open half-spaces determined by H(N, b).


Proof. By Theorem 298, we can assume that the hyperplane is of the form H(N, ‖N‖²). Apply Theorem 15.6 to the convex set C − N and the subspace
S = H(N, ‖N‖²) − N = {x ∈ Rn | 〈N, x〉 = 0};
there exists a nonzero N′ ∈ S⊥ such that
〈N′, x〉 ≥ ‖N′‖² > 0
for all x ∈ C − N. By Theorem 308, N′ = rN for some nonzero r ∈ R. Assume that r > 0. For all x ∈ C we have
〈N, x〉 = 〈N, x − N + N〉 = 〈N, x − N〉 + ‖N‖² = (1/r)〈N′, x − N〉 + ‖N‖²,
and since 〈N′, x − N〉 ≥ ‖N′‖²,
〈N, x〉 ≥ (1/r)‖N′‖² + ‖N‖² > ‖N‖².
This shows that C lies in the open half-space H◦+(N, ‖N‖²). Similarly, if r < 0 then C ⊆ H◦−(N, ‖N‖²). �

Remark 310. [Exercise 15.12] The conclusion of Theorem 15.6 may fail if we assume only that C is closed and convex.

Let S be the x-axis in R2 and let C be the set
{(x, y) | x ≥ 0, f(x) ≤ y ≤ f(x) + 1}
where f is any nonnegative and monotonically decreasing function continuous on [0, ∞) with limx→∞ f(x) = 0. For example, f(x) = 1/(x + 1) is such a function. Then C is closed and convex, but C cannot be strongly separated from S since limx→∞ f(x) = 0.

Example 311. [Exercise 15.13] Find two nonempty convex subsets of R2 that are strictly separated but not strongly separated.

Take X = {(x, y) | x > 0} and Y = {(x, y) | x < 0}.

Theorem 312. [Exercise 15.14] Let X and Y be subsets of Rn. Then X and Y are strongly separated by H(N, b) if and only if
〈N, x′〉 > b for all x′ ∈ Xε and 〈N, y′〉 < b for all y′ ∈ Yε,
where Xε = X + εB(0, 1) and Yε = Y + εB(0, 1) and where B(0, 1) is the closed unit ball.

Proof. Obvious. �


Remark 313. [Exercise 15.15] Show that Farkas's lemma cannot be improved by replacing vA ≤ 0 in statement 2) with vA < 0 (strict inequality in every component).

Let
A = [ 1  −1 ]   and   b = [ 0 ]
    [ 0   0 ]             [ 1 ].
There is no solution to the system Ax = b, but there is also no vector v = (v1, v2) ∈ Rm for which vA < 0 in every component, since
[v1  v2] [ 1  −1 ] = [v1  −v1] < 0
         [ 0   0 ]
implies that 0 < v1 < 0.

Chapter 16. Affine Geometry

Theorem 314. Let S and T be subspaces of V and let X = x + S and Y = y + T be flats in V.

(1) If X ‖ Y then X ⊆ Y, Y ⊆ X or X ∩ Y = ∅.
(2) X ‖ Y if and only if some translation of one of these flats is contained in the other.

Proof. If S ⊆ T then w + X ⊆ Y for some w ∈ V by Theorem 16.1. If X ∩ Y ≠ ∅ then Theorem 16.1 shows that X ⊆ Y. Similarly, T ⊆ S implies that X ∩ Y = ∅ or Y ⊆ X. This proves (1). Part (2) follows almost directly from Theorem 16.1. �

Theorem 315. [Exercise 16.1] If x1, . . . , xn ∈ V, then the set S = {∑rixi | ∑ri = 0} is a subspace of V.

Proof. The proof is obvious, but the geometric interpretation is not. For example, if x1, . . . , xn are linearly independent then S will be of dimension n − 1, and will include every vector of the form xi − xj. �

Theorem 316. [Exercise 16.2] If X ⊆ V is nonempty and x ∈ X then
affhull(X) = x + 〈X − x〉.

Proof. It is clear that x + 〈X − x〉 is a flat that contains X, so affhull(X) ⊆ x + 〈X − x〉. Since x ∈ affhull(X) we can write affhull(X) = x + S where S is a subspace of V. For each y ∈ X we have y ∈ x + S and y − x ∈ S, so 〈X − x〉 ⊆ S. Therefore
x + 〈X − x〉 ⊆ x + S = affhull(X). �


Example 317. [Exercise 16.3] The set X = {(0, 0), (1, 0), (0, 1)} in (Z2)² is closed under the formation of lines, but not affine hulls.

If x, y ∈ X then any point on the line between x and y must simply be either x or y. Therefore X is closed under the formation of lines. However, since 1 + 1 + 1 = 1 we have the affine combination
(0, 0) + (1, 0) + (0, 1) = (1, 1) ∉ X,
so X is not affine closed.

Theorem 318. [Exercise 16.4, 16.5]

(1) A flat contains the origin 0 if and only if it is a subspace.
(2) A flat X in V is a subspace if and only if for some x ∈ X we have rx ∈ X for some 1 ≠ r ∈ F.

Proof. (1) follows from the basic properties of cosets. For (2), write X = x + S where S is a subspace of V. Since x, rx ∈ X we have x − rx = (1 − r)x ∈ S, and since 1 − r ≠ 0 this gives x ∈ S. Therefore 0 = x − x ∈ x + S = X, and the result follows from (1). �

Theorem 319. [Exercise 16.6] The join of a collection C = {xi + Si | i ∈ K} of flats in V is the intersection of all flats that contain all flats in C.

Proof. Obvious. �

Theorem 320. [Exercise 16.8-16.10] Let V be an n-dimensional vector space over a field F with n ≥ 1. Let X = x + S and Y = y + T be flats in V.

(1) If dim(X) = dim(Y) and X ‖ Y, then S = T.
(2) If X and Y are disjoint hyperplanes, then S = T.
(3) Let v ∉ X. There is exactly one flat containing v, parallel to X and having the same dimension as X.

Proof. For (1) we have S ⊆ T or T ⊆ S. In either case, dim(S) = dim(T) implies that S = T. For (2), suppose that S ≠ T. There is some v ∈ S \ T or v ∈ T \ S; in either case T + 〈v〉 or S + 〈v〉 spans V since dim(S) = dim(T) = n − 1, so S + T = V. Now x − y = s + t for some s ∈ S and t ∈ T, so
x − s = y + t ∈ X ∩ Y,
contradicting the assumption that X and Y are disjoint; hence S = T. For (3), write X = x + S. The flat v + S satisfies the desired properties. Suppose that y + T contains v, is parallel to v + S and has the same dimension as v + S. Then S = T by (1) and y + T = y + S = v + S since v ∈ y + T. �

Theorem 321. [Exercise 16.11] Let V be a finite-dimensional vector space over a field F with char(F) ≠ 2.

(1) The join X ∨ Y of two flats may not be the set of all lines connecting all points in the union of these flats.
(2) If X = x + S and Y = y + T are flats with X ∩ Y ≠ ∅, then X ∨ Y is the union of all lines xy where x ∈ X and y ∈ Y.
(3) If char(F) = 2 then (2) does not necessarily hold.

Proof. Let X ⊂ R2 be the x-axis and let Y = X + (0, 1). We have X ∨ Y = R2, but there is no vertical line of length 2 that joins two points in X ∪ Y. If X and Y are flats with X ∩ Y ≠ ∅, then X ∨ Y = p + (S + T) for some p ∈ X ∩ Y by Theorem 16.5. If z = p + s + t ∈ X ∨ Y where s ∈ S and t ∈ T, then
z = (1/2)(p + 2s) + (1/2)(p + 2t)
is on the line joining the points p + 2s ∈ X and p + 2t ∈ Y. This proves (2). If char(F) = 2 then the flats
X = {(0, 0), (1, 0)} and Y = {(0, 0), (0, 1)}
satisfy X ∩ Y ≠ ∅ and X ∨ Y = (Z2)². But Example 317 shows that X ∪ Y is closed under the formation of lines, so (1, 1) is not on any line xy where x ∈ X and y ∈ Y. �

Theorem 322. [Exercise 16.12] If X ‖ Y and X ∩ Y = ∅, then

dim(X ∨ Y ) = max {dim(X), dim(Y )}+ 1.

Proof. This follows directly from Theorem 16.6 and the fact that S ⊆ T or T ⊆ S. �

Theorem 323. [Exercise 16.13] Let dim(V) = 2.

(1) The join of any two distinct points is a line.
(2) The intersection of any two nonparallel lines is a point.

Proof. We apply Theorem 16.6 here. If X and Y are distinct points then dim(X ∨ Y) = 0 + 0 + 1 = 1, so X ∨ Y is a line. If X = x + 〈v〉 and Y = y + 〈w〉 are two nonparallel lines then 〈v〉 + 〈w〉 = V since v and w are linearly independent, so x − y = av + bw for some a, b ∈ F and
x − av = y + bw ∈ X ∩ Y.
But 〈v〉 ∩ 〈w〉 = {0}, so dim(X ∩ Y) = 0 and X ∩ Y is a point. This proves (2). �

Theorem 324. [Exercise 16.14] Let dim(V) = 3.

(1) The join of any two distinct points is a line.
(2) The intersection of any two nonparallel planes is a line.
(3) The join of any two lines whose intersection is a point is a plane.
(4) The intersection of two coplanar nonparallel lines is a point.
(5) The join of any two distinct parallel lines is a plane.
(6) The join of a line and a point not on that line is a plane.
(7) The intersection of a plane and a line not parallel to that plane is a point.

Proof. (1) is obvious. Let X = x + S and Y = y + T be two planes as in (2). We have S + T = V, so x − y = s + t for some s ∈ S and t ∈ T, and x − s = y + t ∈ X ∩ Y. Furthermore,
dim(S ∩ T) = dim(S) + dim(T) − dim(S + T) = 2 + 2 − 3 = 1,
so X ∩ Y is a line. Let X, Y be two lines as in (3); we have
dim(X ∨ Y) = 1 + 1 − 0 = 2
by Theorem 16.6. Part (4) follows from applying part (2) of Theorem 323. Let X, Y be lines as in (5); we have
dim(X ∨ Y) = dim(S + T) + 1 = 2.
For (6), let X = x + S be a line and let Y = y + {0} be a point not on that line. Then
dim(X ∨ Y) = dim(S + {0}) + 1 = 2.
For (7), let X = x + S be a plane and let Y = y + 〈v〉 be a line not parallel to that plane, where v ∉ S. Then S + 〈v〉 = V, so x − y = s + av for some s ∈ S and a ∈ F, and x − s = y + av ∈ X ∩ Y. We have dim(S ∩ 〈v〉) = 0 since v ∉ S, so X ∩ Y is a point. �

Chapter 17. Singular Values and the Moore-Penrose Inverse

Theorem 325. [Exercise 17.1] For any τ ∈ L(U), the singular values of τ* are the same as those of τ.

Proof. Let U = 〈u1〉 ⊕ · · · ⊕ 〈un〉 be an orthogonal decomposition of U where u1, . . . , un are eigenvectors of τ*τ with corresponding eigenvalues λ1, . . . , λn. Then
ττ*(τui) = τλiui = λiτui
for each i, and furthermore
〈τui, τuj〉 = 〈ui, τ*τuj〉 = λj〈ui, uj〉,
which shows that {τu1, . . . , τun} is an orthogonal set. Therefore τu1, . . . , τun are linearly independent eigenvectors of ττ* with eigenvalues λ1, . . . , λn. Since τ*τ and ττ* have the same eigenvalues, τ and τ* have the same singular values.

An alternative proof is to note that if A = PΣQ* where P, Q are unitary and Σ is diagonal with nonnegative entries, then
A* = QΣ*P* = QΣP*,
so the singular values of A* are the same as those of A by the uniqueness of Σ. �


Example 326. [Exercise 17.2] Find the singular values and the singular value decomposition of the matrix
A = [ 3  1 ]
    [ 6  2 ].
Find A⁺.

We compute
A*A = [ 3  6 ] [ 3  1 ] = [ 45  15 ]
      [ 1  2 ] [ 6  2 ]   [ 15   5 ],
which has orthonormal eigenvectors and eigenvalues
u1 = (3/√10, 1/√10) with λ1 = 50,
u2 = (−1/√10, 3/√10) with λ2 = 0.
The only singular value is s1 = 5√2, with
v1 = (1/(5√2)) A u1 = (1/√5, 2/√5),
v2 = (−2/√5, 1/√5).
Therefore the singular value decomposition of A is
[ 3  1 ] = [ 1/√5  −2/√5 ] [ 5√2  0 ] [  3/√10  1/√10 ]
[ 6  2 ]   [ 2/√5   1/√5 ] [   0  0 ] [ −1/√10  3/√10 ],
and the MP inverse is
A⁺ = QΣ′P* = [ 3/50  3/25 ]
             [ 1/50  1/25 ].
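These numbers are easy to confirm with numpy: np.linalg.svd returns the singular values 5√2 and 0, and np.linalg.pinv returns the matrix A⁺ computed above.

import numpy as np

A = np.array([[3.0, 1.0], [6.0, 2.0]])
P, s, Qh = np.linalg.svd(A)                      # A = P @ diag(s) @ Qh
assert np.allclose(s, [5 * np.sqrt(2), 0])
assert np.allclose(A, P @ np.diag(s) @ Qh)
assert np.allclose(np.linalg.pinv(A), [[3/50, 3/25], [1/50, 1/25]])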

Example 327. [Exercise 17.4] Let X = [x1 · · · xm]T be a column matrix over C. Find a singular value decomposition of X.

We have
X*X = [ |x1|² + · · · + |xm|² ] = [ ‖X‖² ],
which has an eigenvector [1] with eigenvalue ‖X‖². The left singular vector is
v1 = (1/‖X‖) [x1 · · · xm]T,
and v2, . . . , vm are vectors chosen so that {v1, . . . , vm} is an orthonormal basis for Cm. The singular value decomposition of X is then
X = [v1 · · · vm] (‖X‖, 0, . . . , 0)T [1].

Theorem 328. [Exercise 17.5] Let A ∈ Mm,n(F) and let B ∈ Mm+n,m+n(F) be the square matrix
B = [ 0   A ]
    [ A*  0 ].
Counting multiplicity, the nonzero eigenvalues of B are precisely the singular values of A together with their negatives.

Proof. If
[ 0   A ] [ v ] = [ Au  ] = [ λv ]
[ A*  0 ] [ u ]   [ A*v ]   [ λu ]
with λ ≠ 0, then
A*Au = A*λv = λ²u,
so |λ| is a singular value of A with right singular vector u. Conversely, suppose that s ≠ 0 is a singular value of A with right singular vector u, so that A*Au = s²u. Let v = (1/s)Au. Then
[ 0   A ] [ v ] = [ Au  ] = [ sv ]
[ A*  0 ] [ u ]   [ A*v ]   [ su ]
and
[ 0   A ] [  v ] = [ −Au ] = [ −sv ]
[ A*  0 ] [ −u ]   [ A*v ]   [  su ],
so s and −s are eigenvalues of B with eigenvectors (v, u) and (v, −u). �
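A numerical check of this correspondence, using an arbitrary rectangular complex matrix; the remaining eigenvalues of B (beyond ±s for each singular value s) are zeros:

import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
m, n = A.shape
B = np.block([[np.zeros((m, m)), A], [A.conj().T, np.zeros((n, n))]])   # Hermitian

s = np.linalg.svd(A, compute_uv=False)           # singular values of A
expected = np.sort(np.concatenate([s, -s, np.zeros(m + n - 2 * len(s))]))
assert np.allclose(np.sort(np.linalg.eigvalsh(B)), expected)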

Theorem 329. [Exercise 17.6] For any matrix A ∈ Mm,n(F), its adjoint A*, its transpose AT and its conjugate Ā all have the same singular values. If U and U′ are unitary, then A and UAU′ have the same singular values.

Proof. Theorem 325 shows that A* has the same singular values as A. If v ∈ Fn is an eigenvector of A*A with real eigenvalue λ, then Ā*Āv̄ is the conjugate of A*Av = λv, so Ā*Āv̄ = λv̄ and λ is also an eigenvalue of Ā*Ā. By symmetry, Ā has exactly the same singular values as A. Therefore AT = (Ā)* has the same singular values as A. If U and U′ are unitary, then
(UAU′)*(UAU′) = (U′)*A*U*UAU′ = (U′)*A*AU′,
which has the same eigenvalues as A*A, so A and UAU′ have the same singular values. �


The Frobenius Norm.

Definition 330. If A = (ai,j) is an m × n complex matrix, then the Frobenius norm of A is
‖A‖F = ( ∑_{i,j} |ai,j|² )^{1/2}.

Theorem 331. Let A ∈ Mm,n(C), B ∈ Mn,p(C) and U ∈ Mm,m(C).

(1) ‖A‖F² = tr(A*A) = tr(AA*).
(2) If U is unitary then ‖UA‖F = ‖A‖F.
(3) ‖AB‖F ≤ ‖A‖F ‖B‖F.

Proof. Part (1) follows from Theorem 221. If U is unitary then
‖UA‖F² = tr(A*U*UA) = tr(A*A) = ‖A‖F²,
which proves (2). For (3), applying the Cauchy-Schwarz inequality gives
‖AB‖F² = ∑_{i=1}^{m} ∑_{j=1}^{p} | ∑_{k=1}^{n} Ai,k Bk,j |²
  ≤ ∑_{i=1}^{m} ∑_{j=1}^{p} ( ∑_{k=1}^{n} |Ai,k|² )( ∑_{k=1}^{n} |Bk,j|² )
  = ( ∑_{i=1}^{m} ∑_{k=1}^{n} |Ai,k|² )( ∑_{k=1}^{n} ∑_{j=1}^{p} |Bk,j|² )
  = ‖A‖F² ‖B‖F². �

Theorem 332. [Exercise 17.8] If s1, . . . , sk are the singular values of A, then ‖A‖F² = ∑si² (cf. Theorem 223).

Proof. Let A = PΣQ* be the singular value decomposition of A. Since P and Q are unitary,
‖A‖F² = ‖PΣQ*‖F² = ‖Σ‖F² = ∑si²
by Theorem 331. �
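Both characterizations of the Frobenius norm are easy to confirm with numpy (the matrix below is arbitrary test data):

import numpy as np

rng = np.random.default_rng(8)
A = rng.normal(size=(4, 3))
s = np.linalg.svd(A, compute_uv=False)

fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro ** 2, np.trace(A.T @ A))    # Theorem 331(1)
assert np.isclose(fro ** 2, np.sum(s ** 2))       # Theorem 332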


Chapter 18. An Introduction to Algebras

Theorem 333. Any associative F-algebra A is isomorphic to a subalgebra of the endomorphism algebra LF(A). In fact, if µa is the left multiplication map defined by
µa x = ax,
then the map µ : A → LF(A) given by a ↦ µa is an algebra embedding, called the left regular representation of A.

Proof. It is clear that µ is a homomorphism. If µa = 0 then 0 = µa1 = a1 = a, so µ is injective. �

Theorem 334. [Exercise 18.1] The subalgebra generated by a nonempty subset X of an algebra A is the subspace spanned by the products of finite subsets of elements of X:
〈X〉alg = 〈x1 · · · xn | xi ∈ X〉.

Proof. Obvious. Note that 1 ∈ 〈X〉alg due to the case n = 0, which is an empty product. �

Theorem 335. [Exercise 18.3] Let A, B be associative unital algebras over a field F. If ϕ : A → B is an algebra homomorphism, then ker(ϕ) is an ideal of A.

Proof. Firstly, note that 0 ∈ ker(ϕ). If x, y ∈ ker(ϕ) then ϕ(x + y) = ϕx + ϕy = 0, so x + y ∈ ker(ϕ). Similarly, x − y ∈ ker(ϕ). If a, b ∈ A and x ∈ ker(ϕ) then
ϕ(axb) = (ϕa)(ϕx)(ϕb) = 0.
This shows that ker(ϕ) is an ideal of A. �

Theorem 336. [Exercise 18.5] If A is an algebra over a field F and S ⊆ A is nonempty, define the centralizer CA(S) of S to be the set of elements of A that commute with all elements of S. The centralizer CA(S) is a subalgebra of A.

Proof. Clearly 0, 1 ∈ CA(S). If x, y ∈ CA(S), r ∈ F and a ∈ S, we have
(x + y)a = xa + ya = ax + ay = a(x + y),
xya = xay = axy,
and
rxa = rax = a(rx),
so x + y, xy and rx all lie in CA(S).


Example 337. [Exercise 18.6] Show that Z6 is not an algebra over any field.

The center of Z6 is simply Z6, and the only subring of Z6 containing the identity is Z6 itself. This is not a field, so Theorem 18.1 shows that Z6 cannot be an algebra over any field.

Remark 338. [Exercise 18.7] Let A = F[a] be the algebra generated over F by a single algebraic element a.

(1) A is isomorphic to the quotient algebra F[x]/〈p(x)〉, where 〈p(x)〉 is the ideal generated by some p(x) ∈ F[x].
(2) What can you say about p(x)?
(3) What is the dimension of A?
(4) What happens if a is not algebraic?

Let ma(x) be the minimal polynomial of a. Define ϕ : F[x] → A by f(x) ↦ f(a). If f(a) = 0 then ma(x) divides f(x), so ker(ϕ) ⊆ 〈ma(x)〉. But ma(a) = 0, so 〈ma(x)〉 ⊆ ker(ϕ). Therefore
A = im(ϕ) ≅ F[x]/〈ma(x)〉,
since im(ϕ) = {f(a) | f(x) ∈ F[x]} = F[a] = A; thus we may take p(x) = ma(x). The dimension of A is deg(ma(x)). If a is not algebraic, then there is no minimal polynomial and this construction does not work.

Theorem 339. Let G, H be groups and let F be a field. For any homomorphism ϕ : G → H, there exists a unique homomorphism ψ : F[G] → F[H] such that
ψ(r1g1 + · · · + rngn) = r1ϕ(g1) + · · · + rnϕ(gn).

Proof. Define ψ as specified. Clearly ψ(1) = ϕ(1) = 1. Let x, y ∈ F[G] and r ∈ F; write
x = ∑_{i=1}^{n} rigi and y = ∑_{i=1}^{n} sigi
where g1, . . . , gn ∈ G (and some ri's or si's may be zero). Then ψ(rx) = rψ(x),
ψ(x + y) = ∑_{i=1}^{n} (ri + si)ϕ(gi) = ψ(x) + ψ(y)
and
ψ(xy) = ψ( ∑_{i=1}^{n} ∑_{j=1}^{n} risj gigj )
  = ∑_{i=1}^{n} ∑_{j=1}^{n} risj ϕ(gigj)
  = ( ∑_{i=1}^{n} riϕ(gi) )( ∑_{j=1}^{n} sjϕ(gj) )
  = ψ(x)ψ(y). �

Theorem 340. [Exercise 18.8] Let G = {a1, . . . , an} be a finite group with a1 = 1. For x ∈ F[G] of the form
x = r1a1 + · · · + rnan,
let T(x) = r1 + · · · + rn. Then T : F[G] → F is an algebra homomorphism, where F is an algebra over itself.

Proof. Let z : G → {1} be the trivial homomorphism. By Theorem 339, there exists a unique homomorphism f from F[G] to F[{1}] such that
f(r1a1 + · · · + rnan) = (r1 + · · · + rn)1.
The result follows from the fact that r·1 ↦ r is an isomorphism from F[{1}] to F. �

Theorem 341. [Exercise 18.9] A homomorphism σ : A → B of F-algebras induces an isomorphism σ̄ : A/ker(σ) → im(σ) defined by σ̄(a + ker(σ)) = σa.

Proof. If a + ker(σ) = b + ker(σ) then a − b ∈ ker(σ), so σa − σb = σ(a − b) = 0. This shows that σ̄ is well-defined, and it is now easy to see that σ̄ is an isomorphism. �

Example 342. [Exercise 18.10] The quaternion field H is an R-algebra and a field.

We only check that nonzero elements of H are invertible. Let q = a + bi + cj + dk ≠ 0, let q̄ = a − bi − cj − dk, and define |q|² = a² + b² + c² + d². It is easy to check that qq̄ = |q|², so the inverse of q is q̄/|q|² (noting that |q|² ≠ 0 since q ≠ 0).

Example 343. [Exercise 18.11] Describe the left regular representation of the quaternions using the ordered basis B = (1, i, j, k).

Let q = a + bi + cj + dk. We compute
qi = −b + ai + dj − ck,
qj = −c − di + aj + bk,
qk = −d + ci − bj + ak,
so the matrix representation of q is
[ a  −b  −c  −d ]
[ b   a  −d   c ]
[ c   d   a  −b ]
[ d  −c   b   a ].
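A quick check that these 4 × 4 matrices really multiply like quaternions; the helper name left_mult below is ours, not from the text, and simply builds the matrix displayed above:

import numpy as np

def left_mult(a, b, c, d):
    """Matrix of left multiplication by q = a + bi + cj + dk in the basis (1, i, j, k)."""
    return np.array([[a, -b, -c, -d],
                     [b,  a, -d,  c],
                     [c,  d,  a, -b],
                     [d, -c,  b,  a]], dtype=float)

I, J, K = left_mult(0, 1, 0, 0), left_mult(0, 0, 1, 0), left_mult(0, 0, 0, 1)
E = np.eye(4)
assert np.allclose(I @ I, -E) and np.allclose(J @ J, -E) and np.allclose(K @ K, -E)
assert np.allclose(I @ J, K) and np.allclose(J @ K, I) and np.allclose(K @ I, J)
assert np.allclose(I @ J @ K, -E)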


Theorem 344. [Exercise 18.12] Let Sn be the group of permutations (bijective functions) of the ordered set X = (x1, . . . , xn) under composition.

(1) Each σ ∈ Sn defines a linear isomorphism τσ on the vector space V with basis X over a field F. This defines an algebra homomorphism f : F[Sn] → LF(V) with the property that f(σ) = τσ.
(2) The representation f is faithful if and only if n < 3.

Proof. To each σ ∈ Sn there is an associated permutation matrix in Mn(F). The algebra homomorphism f can then be defined by extending linearly to the group algebra F[Sn]. If n ≥ 4 then |Sn| = n! > n² = dim(LF(V)) implies that the set {τσ | σ ∈ Sn} is linearly dependent. There exist scalars r1, . . . , rn! ∈ F, not all zero, such that
r1τσ1 + · · · + rn!τσn! = 0,
so f(r1σ1 + · · · + rn!σn!) = 0 and the representation cannot be faithful. If n = 2 the only two matrices are
[ 1  0 ]   and   [ 0  1 ]
[ 0  1 ]         [ 1  0 ],
which are clearly linearly independent. If n = 3 we have the linear combination
[ 1 0 0 ]   [ 0 0 1 ]   [ 0 1 0 ]   [ 0 1 0 ]   [ 0 0 1 ]   [ 1 0 0 ]
[ 0 1 0 ] + [ 1 0 0 ] + [ 0 0 1 ] − [ 1 0 0 ] − [ 0 1 0 ] − [ 0 0 1 ] = 0,
[ 0 0 1 ]   [ 0 1 0 ]   [ 1 0 0 ]   [ 0 0 1 ]   [ 1 0 0 ]   [ 0 1 0 ]
so the representation cannot be faithful. �
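The dependence used for n = 3 can be verified mechanically: the sum of the even permutation matrices equals the sum of the odd ones (both are the all-ones matrix). A short numpy check, with the permutation matrices built from scratch:

import numpy as np
from itertools import permutations

def perm_matrix(p):
    """Matrix sending e_j to e_{p(j)} for a permutation p of {0, ..., n-1}."""
    n = len(p)
    M = np.zeros((n, n))
    for j in range(n):
        M[p[j], j] = 1
    return M

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

total = sum(sign(p) * perm_matrix(p) for p in permutations(range(3)))
assert np.allclose(total, 0)   # a nontrivial linear dependence among the six matrices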

Theorem 345. [Exercise 18.13, 18.14] Let V be a vector space over a field F.

(1) The algebra LF(V) has center
Z = {r1 | r ∈ F},
and so LF(V) is central.
(2) The set I of all elements of LF(V) that have finite rank is an ideal of LF(V) and is contained in all other nonzero ideals of LF(V).
(3) LF(V) is simple if and only if V is finite-dimensional.

Proof. Theorem 38 proves (1). Let J be a nonzero ideal of LF(V). Theorem 18.3 proves that any operator of rank 1 is an element of J. Let τ ∈ LF(V) with rank k and write
V = 〈v1〉 ⊕ · · · ⊕ 〈vk〉 ⊕ W
where im(τ) = 〈v1〉 ⊕ · · · ⊕ 〈vk〉. By Theorem 2.25, there exists a resolution of the identity
ι = ρ1 + · · · + ρk + ρ
where each ρi is a projection onto 〈vi〉 and ρ is a projection onto W. Now
τ = ρ1τ + · · · + ρkτ
and each ρiτ ∈ J, so τ ∈ J. This proves (2). If LF(V) is simple then I = LF(V), where I is defined as in (2). Therefore the identity operator has finite rank and V is finite-dimensional. Conversely, if V is finite-dimensional then I = LF(V). If J is a nonzero ideal of LF(V) then (2) shows that I ⊆ J, which implies that J = LF(V). This proves (3). �

Corollary 346. [Exercise 18.15] The matrix algebras Mn(F ) are central and simple.

Theorem 347. [Exercise 18.16] An element a ∈ A is left-invertible if there is a b ∈ A for which ba = 1, in which case b is called a left inverse of a. Similarly, a ∈ A is right-invertible if there is a b ∈ A for which ab = 1, in which case b is called a right inverse of a. Left and right inverses are called one-sided inverses and an ordinary inverse is called a two-sided inverse. Let a ∈ A be algebraic over F.

(1) ab = 0 for some b ≠ 0 if and only if ca = 0 for some c ≠ 0. (Does c necessarily equal b?)
(2) If a has a one-sided inverse b, then b is a two-sided inverse. (Does this hold if a is not algebraic?)
(3) Let a, b ∈ A be algebraic. Then ab is invertible if and only if a and b are invertible, in which case ba is also invertible.

Proof. If ab = 0 and b ≠ 0 then ma(x) = xp(x) for some p(x) ∈ F[x] by Theorem 18.4. Now ap(a) = p(a)a = 0 and p(a) ≠ 0 since deg(p(x)) < deg(ma(x)). By symmetry, ca = 0 for some c ≠ 0 implies that ab = 0 for some b ≠ 0. This proves (1). Note that c need not equal b; for instance, we can replace c by 2c (if char(F) ≠ 2). For (2), suppose that ba = 1. If ma(x) = xp(x) then 0 = ap(a), so 0 = bap(a) = p(a) gives a contradiction since deg(p(x)) < deg(ma(x)). By Theorem 18.4, a has a two-sided inverse and b = baa⁻¹ = a⁻¹. If a is not algebraic then (2) does not necessarily hold, since there exist injective linear maps on infinite-dimensional vector spaces that are not surjective, i.e. linear maps with left inverses but not right inverses. For (3), if ab is invertible then abc = cab = 1 for some c ∈ A, so a⁻¹ = bc and b⁻¹ = ca by (2). Also, baa⁻¹b⁻¹ = 1, so (ba)⁻¹ = a⁻¹b⁻¹. �