
CLASS NOTES ON LINEAR ALGEBRA

STEVEN DALE CUTKOSKY

1. Matrices

Suppose that F is a field. Fn will denote the set of n × 1 column vectors with coefficients in F, and Fm will denote the set of 1 × m row vectors with coefficients in F. Mm,n(F ) will denote the set of m × n matrices A = (aij) with coefficients in F. We will sometimes denote the zero vector of Fn by 0n, the zero vector of Fm by 0m and the zero vector of Mm,n by 0m,n. We will also sometimes abbreviate these different zeros as ~0 or 0. We will let ei ∈ Fn be the column vector whose entries are all zero, except for a 1 in the i-th row. If A ∈ Mm,n(F ) then At ∈ Mn,m(F ) will denote the transpose of A. We will write A = (A1, A2, . . . , An) where Ai ∈ Fm are the columns of A, and

A = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{pmatrix}

where Aj ∈ Fn are the rows of A. For x = (x1, . . . , xn)t ∈ Fn, we have the formula

Ax = x1A1 + x2A2 + · · · + xnAn.

In particular, Aei = Ai for 1 ≤ i ≤ n and (ei)tAej = aij for 1 ≤ i ≤ m, 1 ≤ j ≤ n. We also have

Ax = \begin{pmatrix} A_1 x \\ A_2 x \\ \vdots \\ A_m x \end{pmatrix} = \begin{pmatrix} (A_1)^t \cdot x \\ (A_2)^t \cdot x \\ \vdots \\ (A_m)^t \cdot x \end{pmatrix}

where for x = (x1, . . . , xn)t, y = (y1, . . . , yn)t ∈ Fn, x · y = x1y1 + x2y2 + · · · + xnyn is the dot product on Fn.
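In code, the two descriptions of Ax above (combination of columns, and dot products with rows) agree entry by entry. A minimal Python sketch with a made-up 2 × 3 integer matrix (not from the notes):

```python
# Sketch: the two descriptions of Ax from above, for a 2x3 integer matrix.
A = [[1, 2, 3],
     [4, 5, 6]]   # A in M_{2,3}
x = [7, 8, 9]     # x in F^3

m, n = len(A), len(A[0])

# Ax as x1*A1 + ... + xn*An (linear combination of the columns of A)
by_columns = [sum(x[j] * A[i][j] for j in range(n)) for i in range(m)]

# Ax as the column of dot products of the rows of A with x
dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
by_rows = [dot(A[i], x) for i in range(m)]

assert by_columns == by_rows == [50, 122]
```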

Lemma 1.1. The following are equivalent for two matrices A,B ∈Mm,n(F ).

1. A = B.
2. Ax = Bx for all x ∈ Fn.
3. Aei = Bei for 1 ≤ i ≤ n.

N will denote the natural numbers {0, 1, 2, . . .}. Z+ is the set of positive integers {1, 2, 3, . . .}.



2. Span, Linear Dependence, Basis and Dimension

In this section we suppose that V is a vector space over a field F. We will denote the zero vector in V by 0V . Sometimes, we will abbreviate, and write the zero vector as ~0 or 0.

Definition 2.1. Suppose that S is a nonempty subset of V . Then

Span(S) = {c1v1 + · · ·+ cnvn | n ∈ Z+, v1, . . . , vn ∈ S and c1, . . . , cn ∈ F}.

This definition is valid even when S is an infinite set. We define Span(∅) = {~0}.

Lemma 2.2. Suppose that S is a subset of V . Then Span(S) is a subspace of V .

Definition 2.3. Suppose that S is a subset of V. S is a linearly dependent set if there exists n ∈ Z+, distinct v1, . . . , vn ∈ S and c1, . . . , cn ∈ F such that c1, . . . , cn are not all zero, and c1v1 + c2v2 + · · · + cnvn = 0. S is a linearly independent set if it is not linearly dependent.

Observe that ∅ is a linearly independent set.

Definition 2.4. Suppose that S is a subset of V which is a linearly independent set and Span(S) = V. Then S is a basis of V.

Theorem 2.5. Suppose that V is a vector space over a field F . Then V has a basis.

We have that the empty set ∅ is a basis of the 0 vector space {~0}.

Theorem 2.6. (Extension Theorem) Suppose that V is a vector space over a field F and S is a subset of V which is a linearly independent set. Then there exists a basis of V which contains S.

Theorem 2.7. Suppose that V is a vector space over a field F and S1 and S2 are two bases of V. Then S1 and S2 have the same cardinality.

This theorem allows us to make the following definition.

Definition 2.8. Suppose that V is a vector space over a field F. Then the dimension of V is the cardinality of a basis of V.

The dimension of {~0} is 0.

Definition 2.9. V is called a finite dimensional vector space if V has a finite basis.

For the most part, we will consider finite dimensional vector spaces. If V is finite dimensional of dimension n, a basis is considered as an ordered set β = {v1, . . . , vn}.

Lemma 2.10. Suppose that V is a finite dimensional vector space and W is a subspace of V. Then dim W ≤ dim V, and dim W = dim V if and only if V = W.

Lemma 2.11. Suppose that V is a finite dimensional vector space and β = {v1, v2, . . . , vn} is a basis of V. Suppose that v ∈ V. Then v has a unique expression

v = c1v1 + c2v2 + · · ·+ cnvn

with c1, . . . , cn ∈ F .

Lemma 2.12. Let {v1, . . . , vn} be a set of generators of a vector space V. Let {v1, . . . , vr} be a maximal subset of linearly independent elements. Then {v1, . . . , vr} is a basis of V.
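Lemma 2.12 suggests an algorithm: scan the generators in order and keep each one that is not in the span of those already kept. A sketch over F = Q, using exact Gaussian elimination for the span test (the generators and the helper name are illustrative, not from the notes):

```python
from fractions import Fraction

def independent_subset(vectors):
    """Greedily keep the vectors not in the span of the ones kept so far.

    Maintains a list of echelonized rows; a vector is new exactly when it
    does not reduce to zero against them. Exact arithmetic over Q."""
    rows, kept = [], []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for r in rows:
            p = next(i for i, c in enumerate(r) if c != 0)  # pivot position of r
            if w[p] != 0:
                t = w[p] / r[p]
                w = [wi - t * ri for wi, ri in zip(w, r)]
        if any(c != 0 for c in w):
            rows.append(w)
            kept.append(v)
    return kept

gens = [(1, 2, 3), (2, 4, 6), (1, 0, 0), (0, 2, 3)]
print(independent_subset(gens))
# [(1, 2, 3), (1, 0, 0)]: a basis of Span(gens), since (0,2,3) = (1,2,3) - (1,0,0)
```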



3. Direct Sums

Suppose that V is a vector space over a field F, and W1,W2, . . . ,Wm are subspaces of V. Then the sum

W1 + · · · + Wm = Span(W1 ∪ · · · ∪ Wm) = {w1 + w2 + · · · + wm | wi ∈ Wi for 1 ≤ i ≤ m}
is the subspace of V spanned by W1,W2, . . . ,Wm.

The sum W1 + · · · + Wm is called a direct sum, denoted by W1⊕W2⊕· · ·⊕Wm, if every element v ∈ W1 + W2 + · · · + Wm has a unique expression v = w1 + · · · + wm with wi ∈ Wi for 1 ≤ i ≤ m.

We have the following useful criterion.

Lemma 3.1. The sum W1 +W2 + · · ·+Wm is a direct sum if and only if 0 = w1 +w2 +· · ·+ wm with wi ∈Wi for 1 ≤ i ≤ m implies wi = 0 for 1 ≤ i ≤ m.

In the case when V is finite dimensional, the direct sum is characterized by the following equivalent conditions.

Lemma 3.2. Suppose that V is a finite dimensional vector space. Let n = dim(W1 + · · · + Wm) and ni = dim Wi for 1 ≤ i ≤ m. The following conditions are equivalent:

1. If v ∈ W1 + W2 + · · · + Wm then v has a unique expression v = w1 + · · · + wm with wi ∈ Wi for 1 ≤ i ≤ m.

2. Suppose that βi = {wi,1, . . . , wi,ni} are bases of Wi for 1 ≤ i ≤ m. Then β = {w1,1, . . . , w1,n1 , w2,1, . . . , wm,nm} is a basis of W1 + · · · + Wm.

3. n = n1 + · · ·+ nm.

The following lemma gives a useful criterion.

Lemma 3.3. Suppose that U and W are subspaces of a vector space V. Then V = U⊕W if and only if V = U + W and U ∩ W = {~0}.

4. Linear Maps

In this section, we will suppose that all vector spaces are over a fixed field F .

We recall the definition of equality of maps, which we will use repeatedly to show that two maps are the same. Suppose that f : V → W and g : V → W are maps (functions). Then f = g if and only if f(v) = g(v) for all v ∈ V.

If f : U → V and g : V → W, we will usually write the composition g ◦ f : U → W as gf. If x ∈ U, we will sometimes write fx for f(x), and gfx for (g ◦ f)(x).

Definition 4.1. Suppose that V and W are vector spaces. A map L : V → W is linear if L(v1 + v2) = L(v1) + L(v2) for v1, v2 ∈ V and L(cv) = cL(v) for v ∈ V and c ∈ F.

Lemma 4.2. Suppose that L : V →W is a linear map. Then

1. L(0V ) = 0W .
2. L(−v) = −L(v) for v ∈ V.

Lemma 4.3. Suppose that L : V → W is a linear map and T : W → U is a linear map. Then the composition TL : V → U is a linear map.

If S and T are sets, a map f : S → T is 1-1 and onto if and only if there exists a map g : T → S such that g ◦ f = IS and f ◦ g = IT , where IS is the identity map of S and IT is the identity map of T. When this happens, g is uniquely determined. We write g = f−1 and say that g is the inverse of f.

Definition 4.4. Suppose that L : V → W is a linear map. L is an isomorphism if there exists a linear map T : W → V such that TL = IV is the identity map of V, and LT = IW is the identity map of W.

The map T in the above definition is unique, if it exists, by the following lemma. If L is an isomorphism, we write L−1 = T, and call T the inverse of L.

Lemma 4.5. Suppose that L : V → W is a map. Suppose that T : W → V and S : W → V are maps which satisfy TL = IV , LT = IW and SL = IV , LS = IW . Then S = T.

Lemma 4.6. A linear map L : V → W is an isomorphism if and only if L is 1-1 and onto.

Lemma 4.7. Suppose that L : V →W is a linear map. Then

1. Image L = {L(v) | v ∈ V } is a subspace of W.
2. Kernel L = {v ∈ V | L(v) = 0} is a subspace of V.

Lemma 4.8. Suppose that L : V → W is a linear map. Then L is 1-1 if and only if Kernel L = {0}.

Definition 4.9. Suppose that F is a field, and V, W are vector spaces over F. Let LF (V,W ) be the set of linear maps from V to W.

We will sometimes write L(V,W ) = LF (V,W ) if the field is understood to be F .

Lemma 4.10. Suppose that F is a field, and V, W are vector spaces over F. Then LF (V,W ) is a vector space.

Theorem 4.11. (Dimension Theorem) Suppose that L : V → W is a homomorphism of finite dimensional vector spaces. Then

dim Kernel L + dim Image L = dim V.

Proof. Let {v1, . . . , vr} be a basis of Kernel L. Extend this to a basis {v1, . . . , vr, vr+1, . . . , vn} of V. Let wi = L(vi) for r + 1 ≤ i ≤ n. We will show that {wr+1, . . . , wn} is a basis of Image L. Since L is linear, {v1, . . . , vn} spans V, and L(vi) = 0 for 1 ≤ i ≤ r, the vectors {wr+1, . . . , wn} span Image L. We must show that {wr+1, . . . , wn} are linearly independent. Suppose that we have a relation

cr+1wr+1 + · · ·+ cnwn = 0

for some cr+1, . . . , cn ∈ F . Then

L(cr+1vr+1 + · · ·+ cnvn) = cr+1L(vr+1) + · · ·+ cnL(vn) = cr+1wr+1 + · · ·+ cnwn = 0.

Thus cr+1vr+1 + · · · + cnvn ∈ Kernel L, so that we have an expansion

cr+1vr+1 + · · · + cnvn = d1v1 + · · · + drvr

for some d1, . . . , dr ∈ F , which we can rewrite as

d1v1 + · · ·+ drvr − cr+1vr+1 − · · · − cnvn = 0.

Since {v1, . . . , vn} are linearly independent, we have that d1 = · · · = dr = cr+1 = · · · = cn = 0. Thus {wr+1, . . . , wn} are linearly independent. �
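Numerically, the Dimension Theorem for L = LA on Fn reads rank A + nullity A = n. A quick NumPy check over F = R with an arbitrary example matrix (the tolerance 1e-10 is a numerical convention, not part of the theory):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [5., 7., 9.]])   # third row = first + second, so rank 2

n = A.shape[1]
rank = np.linalg.matrix_rank(A)                 # dim Image L_A

# dim Kernel L_A: count the (near-)zero singular values of A
sing = np.linalg.svd(A, compute_uv=False)
nullity = sum(s < 1e-10 for s in sing)

assert rank + nullity == n   # dim Kernel L_A + dim Image L_A = dim F^n
```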


Page 5: Matricesfaculty.missouri.edu/~cutkoskys/LinearAlgebra.pdfDe nition 2.9. V is called a nite dimensional vector space if V has a nite basis. For the most part, we will consider nite

Corollary 4.12. Suppose that V is a finite dimensional vector space and Φ : V → V and Ψ : V → V are linear maps such that ΨΦ = IV . Then Φ and Ψ are isomorphisms and Ψ = Φ−1.

Theorem 4.13. (Universal Property of Vector Spaces) Suppose that V and W are finite dimensional vector spaces, β = {v1, . . . , vn} is a basis of V, and w1, . . . , wn ∈ W. Then there exists a unique linear map L : V → W such that L(vi) = wi for 1 ≤ i ≤ n.

Proof. We first prove existence. For v ∈ V, define L(v) = c1w1 + c2w2 + · · · + cnwn if v = c1v1 + · · · + cnvn with c1, . . . , cn ∈ F. L is a well defined map since every v ∈ V has a unique expression v = c1v1 + · · · + cnvn with c1, . . . , cn ∈ F, as β is a basis of V. We have that L(vi) = wi for 1 ≤ i ≤ n. We leave the verification that L is linear to the reader.

Now we prove uniqueness. Suppose that T : V → W is a linear map such that T (vi) = wi for 1 ≤ i ≤ n. Suppose that v ∈ V. Then v = c1v1 + · · · + cnvn for some c1, . . . , cn ∈ F. We have that

T (v) = T (c1v1 + · · · + cnvn) = c1T (v1) + · · · + cnT (vn) = c1w1 + · · · + cnwn = L(v).

Since T (v) = L(v) for all v ∈ V , we have that T = L. �

Lemma 4.14. Suppose that A ∈ Mm,n(F ). Then the map LA : Fn → Fm defined by LA(x) = Ax for x ∈ Fn is linear.

Lemma 4.15. Suppose that A = (A1, . . . , An) ∈Mm,n(F ). Then

Image LA = Span{A1, . . . , An}.

Kernel LA = {x ∈ Fn | Ax = 0} is the solution space to Ax = 0.

Lemma 4.16. Suppose that A,B ∈ Mm,n(F ). Then LA+B = LA + LB. Suppose that c ∈ F. Then cLA = LcA. Suppose that A ∈ Mm,n(F ) and C ∈ Ml,m(F ). Then the composition of maps LCLA = LCA.

Lemma 4.17. The map Φ : Mm,n(F ) → LF (Fn, Fm) defined by Φ(A) = LA for A ∈ Mm,n(F ) is an isomorphism.

Proof. By Lemma 4.16, Φ is a linear map. We will now show that Kernel Φ is {0}. Let A ∈ Kernel Φ. Then LA = 0. Thus Ax = 0 for all x ∈ Fn. By Lemma 1.1, we have that A = 0. Thus Kernel Φ = {0} and Φ is 1-1. Suppose that L ∈ LF (Fn, Fm). Let wi = L(ei) for 1 ≤ i ≤ n. Let A = (w1, . . . , wn) ∈ Mm,n(F ). For 1 ≤ i ≤ n, we have that LA(ei) = Aei = wi. By Theorem 4.13, L = LA. Thus Φ is onto, and Φ is an isomorphism. �
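The onto direction of this proof is constructive: the matrix of L ∈ LF (Fn, Fm) has columns L(e1), . . . , L(en). A NumPy sketch with a made-up linear map over R:

```python
import numpy as np

# An example linear map L : R^3 -> R^2, given as a Python function.
def L(x):
    x1, x2, x3 = x
    return np.array([2*x1 - x3, x1 + 5*x2])

n = 3
I = np.eye(n)
# Column i of A is L(e_i), exactly as in the proof of Lemma 4.17.
A = np.column_stack([L(I[:, i]) for i in range(n)])

x = np.array([1., 2., 3.])
assert np.allclose(A @ x, L(x))   # L = L_A, by Theorem 4.13
print(A)   # [[ 2.  0. -1.]
           #  [ 1.  5.  0.]]
```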

Definition 4.18. Suppose that V is a finite dimensional vector space. Suppose that β = {v1, . . . , vn} is a basis of V. Define the coordinate vector (v)β ∈ Fn of v with respect to β by

(v)β = (c1, . . . , cn)t,

where c1, . . . , cn ∈ F are the unique elements of F such that v = c1v1 + · · ·+ cnvn.

Lemma 4.19. Suppose that V is a finite dimensional vector space. Suppose that β = {v1, . . . , vn} is a basis of V. Suppose that u1, u2 ∈ V and c ∈ F. Then (u1 + u2)β = (u1)β + (u2)β and (cu1)β = c(u1)β.

Theorem 4.20. Suppose that V is a finite dimensional vector space. Let β = {v1, . . . , vn}be a basis of V . Then the map Φ : V → Fn defined by Φ(v) = (v)β is an isomorphism.



Proof. Φ is a linear map by Lemma 4.19. Note that Φ(vi) = ei for 1 ≤ i ≤ n. By Theorem 4.13, there exists a unique linear map Ψ : Fn → V such that Ψ(ei) = vi for 1 ≤ i ≤ n (where {e1, . . . , en} is the standard basis of Fn). Now ΨΦ : V → V is a linear map which satisfies ΨΦ(vi) = vi for 1 ≤ i ≤ n. By Theorem 4.13, there is a unique linear map from V → V which takes vi to vi for 1 ≤ i ≤ n. Since the identity map IV of V has this property, we have that ΨΦ = IV . By a similar calculation, ΦΨ = IFn . Thus Φ is an isomorphism (with inverse Ψ). �

Definition 4.21. Suppose that V and W are finite dimensional vector spaces. Suppose that β = {v1, . . . , vn} is a basis of V and β′ = {w1, . . . , wm} is a basis of W. Suppose that L : V → W is a linear map. Define the matrix Mββ′(L) ∈ Mm,n(F ) of L with respect to the bases β and β′ to be the matrix

Mββ′(L) = ((L(v1))β′ , (L(v2))β′ , . . . , (L(vn))β′).

The matrix Mββ′(L) of L with respect to β and β′ has the following important property:

(1) Mββ′(L)(v)β = (L(v))β′

for all v ∈ V. In fact, Mββ′(L) is the unique matrix A such that A(v)β = (L(v))β′ for all v ∈ V.
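Concretely, column i of Mββ′(L) is the β′-coordinate vector of L(vi), and coordinates with respect to a basis are found by solving a linear system. A NumPy sketch over R, with a hypothetical L and hypothetical bases:

```python
import numpy as np

M = np.array([[1., 2.],
              [3., 4.]])      # L = L_M in the standard basis of R^2
beta  = np.array([[1., 1.],
                  [1., -1.]])   # columns v1, v2: a basis of R^2
betap = np.array([[2., 1.],
                  [0., 1.]])    # columns w1, w2: another basis of R^2

def coords(B, v):
    """(v)_B: solve B c = v for the coordinate vector c."""
    return np.linalg.solve(B, v)

# Column i of M^beta_beta'(L) is (L(v_i))_beta'.
Mbb = np.column_stack([coords(betap, M @ beta[:, i]) for i in range(2)])

# Property (1): M^beta_beta'(L) (v)_beta = (L(v))_beta' for all v.
v = np.array([5., -7.])
assert np.allclose(Mbb @ coords(beta, v), coords(betap, M @ v))
```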

Lemma 4.22. Suppose that V and W are finite dimensional vector spaces. Suppose that β = {v1, . . . , vn} is a basis of V and β′ = {w1, . . . , wm} is a basis of W. Suppose that L1, L2 ∈ LF (V,W ) and c ∈ F. Then Mββ′(L1 + L2) = Mββ′(L1) + Mββ′(L2) and Mββ′(cL1) = cMββ′(L1).

Theorem 4.23. Suppose that V and W are finite dimensional vector spaces. Suppose that β = {v1, . . . , vn} is a basis of V and β′ = {w1, . . . , wm} is a basis of W. Then the map Λ : LF (V,W ) → Mm,n(F ) defined by Λ(L) = Mββ′(L) is an isomorphism.

Proof. Λ is a linear map by Lemma 4.22. It remains to verify that Λ is 1-1 and onto, from which it will follow that Λ is an isomorphism. Suppose that L1, L2 ∈ LF (V,W ) and Λ(L1) = Λ(L2). Then Mββ′(L1) = Mββ′(L2), so that (L1(vi))β′ = (L2(vi))β′ for 1 ≤ i ≤ n. Thus L1(vi) = L2(vi) for 1 ≤ i ≤ n. Since β is a basis of V, L1 = L2 by the uniqueness statement of Theorem 4.13. Thus Λ is 1-1. Suppose that A = (aij) ∈ Mm,n(F ). Write A = (A1, . . . , An) where Ai = (a1,i, . . . , am,i)t ∈ Fm are the columns of A. Let zi = a1,iw1 + · · · + am,iwm ∈ W for 1 ≤ i ≤ n. By Theorem 4.13, there exists a linear map L : V → W such that L(vi) = zi for 1 ≤ i ≤ n. We have that (L(vi))β′ = (zi)β′ = Ai for 1 ≤ i ≤ n. We have that Λ(L) = Mββ′(L) = (A1, . . . , An) = A, and thus Λ is onto. �

Corollary 4.24. Suppose that V is a vector space of dimension n and W is a vector space of dimension m. Then LF (V,W ) is a vector space of dimension mn.

Lemma 4.25. Suppose that U, V and W are finite dimensional vector spaces, with respective bases β1, β2, β3. Suppose that L1 ∈ LF (U, V ) and L2 ∈ LF (V,W ). Then

Mβ1β3(L2L1) = Mβ2β3(L2)Mβ1β2(L1).

Theorem 4.26. Suppose that V is a finite dimensional vector space of dimension n, and β is a basis of V. Then the map Λ : LF (V, V ) → Mn,n(F ) defined by Λ(L) = Mββ(L) is a ring isomorphism.



5. Bilinear Forms

Let U, V,W be vector spaces over a field F. A bilinear map from U × V to W is a map which is linear in each variable separately. Let Bil(U × V,W ) be the set of bilinear maps from U × V to W. Bil(U × V,W ) is a vector space over F. We call an element of BilF (V × V, F ) a bilinear form. A bilinear form is usually written as a pairing < v,w > ∈ F for v, w ∈ V. A bilinear form < , > is symmetric if < v,w > = < w, v > for all v, w ∈ V. A symmetric bilinear form is nondegenerate if whenever v ∈ V is such that < v,w > = 0 for all w ∈ V then v = 0.

Theorem 5.1. Let F be a field and A ∈ Mm,n(F ). Let gA : Fm × Fn → F be defined by gA(v, w) = vtAw. Then gA ∈ BilF (Fm × Fn, F ). Further, the map Ψ : Mm,n(F ) → BilF (Fm × Fn, F ) defined by Ψ(A) = gA is an isomorphism of F -vector spaces.

Also, in the case when m = n:

1. Ψ gives a 1-1 correspondence between n × n matrices and bilinear forms on Fn.
2. Ψ gives a 1-1 correspondence between n × n symmetric matrices and symmetric forms on Fn.
3. Ψ gives a 1-1 correspondence between invertible n × n symmetric matrices and nondegenerate symmetric forms on Fn.

From now on in this section, suppose that V is a vector space over a field F, and < , > is a nondegenerate symmetric form on V. An important example is the dot product on Fm where F is a field. We begin by stating the identity

(2) < v,w > = (1/2)[< v + w, v + w > − < v, v > − < w,w >]

for v, w ∈ V (here we assume that 2 ≠ 0 in F, so that we may divide by 2).
Define v, w ∈ V to be orthogonal (or perpendicular) if < v,w > = 0. Suppose that S ⊂ V is a subset. Define

S⊥ = {v ∈ V |< v,w >= 0 for all w ∈ S}.

Lemma 5.2. Suppose that S is a subset of V. Then

1. S⊥ is a subspace of V.
2. If U is the subspace Span(S) of V, then U⊥ = S⊥.

Lemma 5.3. Suppose that

A = \begin{pmatrix} A_1 \\ \vdots \\ A_m \end{pmatrix} ∈ Mm,n(F )

has rows A1, . . . , Am. Let U = Span{(A1)t, . . . , (Am)t} ⊂ Fn, which is the column space of At. Then U⊥ = Kernel LA.

A basis β = {v1, . . . , vn} of V is called an orthogonal basis if < vi, vj > = 0 whenever i ≠ j.

Theorem 5.4. Suppose that V is a finite dimensional vector space with a symmetric nondegenerate bilinear form < , >. Then V has an orthogonal basis.

Proof. We prove the theorem by induction on n = dim V. If n = 1, the theorem is true since any basis is an orthogonal basis.

Assume that dim V = n > 1, and that any subspace of V of dimension less than n has an orthogonal basis. There are two cases. The first case is when every v ∈ V satisfies < v, v > = 0. Then by (2), we have < v,w > = 0 for every v, w ∈ V. Thus any basis of V is an orthogonal basis.

Now assume that there exists v1 ∈ V such that < v1, v1 > ≠ 0. Let U = Span{v1}, a one dimensional subspace of V. Suppose that v ∈ V. Let

c = < v, v1 > / < v1, v1 > ∈ F.

Then v − cv1 ∈ U⊥, since < v − cv1, v1 > = < v, v1 > − c < v1, v1 > = 0, and thus v = cv1 + (v − cv1) ∈ U + U⊥. It follows that V = U + U⊥. We have that U ∩ U⊥ = {~0}, since < v1, v1 > ≠ 0. Thus V = U⊕U⊥. Since U⊥ has dimension n − 1 by Lemma 3.2, we have by induction that U⊥ has an orthogonal basis {v2, . . . , vn}. Thus {v1, v2, . . . , vn} is an orthogonal basis of V. �
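The proof of Theorem 5.4 is an algorithm: pick v1 with < v1, v1 > ≠ 0, replace each remaining basis vector v by v − cv1 with c as above, and recurse on U⊥. A sketch for forms < v,w > = vtBw on Qn given by a symmetric matrix B (exact arithmetic; B is a made-up example, and the signs of the resulting < vi, vi > are the data appearing in Theorems 5.5 and 5.6 below):

```python
from fractions import Fraction

def orthogonal_basis(B):
    """Orthogonal basis of Q^n for the form <v,w> = v^t B w (B symmetric),
    following the proof of Theorem 5.4; exact arithmetic over F = Q."""
    n = len(B)
    form = lambda v, w: sum(v[i] * B[i][j] * w[j] for i in range(n) for j in range(n))
    basis = [[Fraction(i == j) for j in range(n)] for i in range(n)]  # e_1, ..., e_n
    out = []
    while basis:
        v1 = next((v for v in basis if form(v, v) != 0), None)
        if v1 is None:
            # all current basis vectors are isotropic; by identity (2), if every
            # u + w were isotropic too, the form would vanish on the span
            for a in range(len(basis)):
                for b in range(len(basis)):
                    if a != b and form(basis[a], basis[b]) != 0:
                        basis[a] = [x + y for x, y in zip(basis[a], basis[b])]
                        v1 = basis[a]
                        break
                if v1 is not None:
                    break
        if v1 is None:
            out.extend(basis)   # the form vanishes here: case one of the proof
            break
        basis.remove(v1)
        # v -> v - c*v1 with c = <v,v1>/<v1,v1> puts each remaining v into U^perp
        basis = [[vi - (form(v, v1) / form(v1, v1)) * ui for vi, ui in zip(v, v1)]
                 for v in basis]
        out.append(v1)
    return out

B = [[Fraction(c) for c in row] for row in [[1, 2, 0], [2, 1, 1], [0, 1, 1]]]
basis = orthogonal_basis(B)
form = lambda v, w: sum(v[i] * B[i][j] * w[j] for i in range(3) for j in range(3))
assert all(form(v, w) == 0 for v in basis for w in basis if v is not w)
```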

Theorem 5.5. Let V be a finite dimensional vector space over a field F, with a symmetric bilinear form < , >. Assume dim V > 0. Let V0 be the subspace of V defined by

V0 = V⊥ = {v ∈ V | < v,w > = 0 for all w ∈ V }.

Let {v1, . . . , vn} be an orthogonal basis of V. Then the number of integers i such that < vi, vi > = 0 is equal to the dimension of V0.

Proof. Order the {v1, . . . , vn} so that < vi, vi > = 0 for 1 ≤ i ≤ r and < vi, vi > ≠ 0 for r < i ≤ n. Suppose that w ∈ V. Then w = c1v1 + · · · + cnvn for some c1, . . . , cn ∈ F, and

< vi, w > = c1 < vi, v1 > + · · · + cn < vi, vn > = ci < vi, vi > = 0

for 1 ≤ i ≤ r. Thus v1, . . . , vr ∈ V0.
Now suppose that v ∈ V0. We have v = d1v1 + · · · + dnvn for some d1, . . . , dn ∈ F. We have

0 = < vi, v > = d1 < vi, v1 > + · · · + dn < vi, vn > = di < vi, vi >

for 1 ≤ i ≤ n. Since < vi, vi > ≠ 0 for r < i ≤ n we have di = 0 for r < i ≤ n. Thus v = d1v1 + · · · + drvr ∈ Span{v1, . . . , vr}. Thus V0 = Span{v1, . . . , vr}. Since v1, . . . , vr are linearly independent, they are a basis of V0, and thus dim V0 = r. �

The dimension of V0 in Theorem 5.5 is called the index of nullity of the form.

Theorem 5.6. (Sylvester’s Theorem) Let V be a finite dimensional vector space over R with a symmetric bilinear form. There exists an integer r with the following property. If {v1, . . . , vn} is an orthogonal basis of V, then there are precisely r integers i such that < vi, vi > > 0.

Proof. Let {v1, . . . , vn} and {w1, . . . , wn} be orthogonal bases of V. We may suppose that their elements are arranged so that < vi, vi > > 0 if 1 ≤ i ≤ r, < vi, vi > < 0 if r + 1 ≤ i ≤ s and < vi, vi > = 0 if s + 1 ≤ i ≤ n. Similarly, < wi, wi > > 0 if 1 ≤ i ≤ r′, < wi, wi > < 0 if r′ + 1 ≤ i ≤ s′ and < wi, wi > = 0 if s′ + 1 ≤ i ≤ n.

We first prove that v1, . . . , vr, wr′+1, . . . , wn are linearly independent. Suppose we have a relation

x1v1 + · · · + xrvr + yr′+1wr′+1 + · · · + ynwn = 0

for some x1, . . . , xr, yr′+1, . . . , yn ∈ R. Then

x1v1 + · · · + xrvr = −(yr′+1wr′+1 + · · · + ynwn).

Let ci = < vi, vi > and di = < wi, wi > for all i. Taking the product of each side of the preceding equation with itself, we have

c1x1² + · · · + crxr² = dr′+1yr′+1² + · · · + ds′ys′².

Since the left hand side is ≥ 0 and the right hand side is ≤ 0, we must have that both sides of this equation are zero. Thus x1 = · · · = xr = 0. Since wr′+1, . . . , wn are linearly independent, yr′+1 = · · · = yn = 0.

Since dim V = n, we have that r + n − r′ ≤ n, so that r ≤ r′. Interchanging the roles of the two bases in the proof, we also obtain r′ ≤ r, so that r = r′. �

The integer r in the proof of Sylvester’s theorem is called the index of positivity of the form.

6. Exercises

1. Let V be an n-dimensional vector space over a field F. The dual space of V is the vector space V ∗ = L(V, F ). Elements of V ∗ are called functionals.

i) Suppose that β = {v1, . . . , vn} is a basis of V. For 1 ≤ i ≤ n, define v∗i ∈ V ∗ by v∗i (v) = ci if v = c1v1 + · · · + cnvn ∈ V (so that v∗i (vj) = δij). Prove that the v∗i are elements of V ∗ and that they form a basis of V ∗. (This basis is called the dual basis to β.)

ii) Suppose that V has a nondegenerate symmetric bilinear form < , >. For v ∈ V define Lv : V → F by Lv(w) = < v,w > for w ∈ V. Show that Lv ∈ V ∗, and that the map Φ : V → V ∗ defined by Φ(v) = Lv is an isomorphism of vector spaces.

2. Suppose that L : V → W is a linear map of finite dimensional vector spaces. Define L̂ : W ∗ → V ∗ by L̂(ϕ) = ϕL for ϕ ∈ W ∗.

i) Show that L̂ is a linear map.

ii) Suppose that L is 1-1. Is L̂ 1-1? Is L̂ onto? Prove your answers.

iii) Suppose that L is onto. Is L̂ 1-1? Is L̂ onto? Prove your answers.
iv) Suppose that A ∈ Mm,n(F ) where F is a field, and L = LA : Fn → Fm. Let {e1, . . . , en} be the standard basis of Fn, and {f1, . . . , fm} be the standard basis of Fm. Let β = {e∗1, . . . , e∗n} be the dual basis of (Fn)∗ and let β′ = {f∗1 , . . . , f∗m} be the dual basis of (Fm)∗. Compute Mβ′β(L̂).

3. Let V be a vector space of dimension n, and W be a subspace of V. Define
PerpV ∗(W ) = {ϕ ∈ V ∗ | ϕ(w) = 0 for all w ∈ W}.
i) Prove that dim W + dim PerpV ∗(W ) = n.
ii) Suppose that V has a nondegenerate symmetric bilinear form < , >. Prove that the map v ↦ Lv of Problem 1.ii) induces an isomorphism of
W⊥ = {v ∈ V | < v,w > = 0 for all w ∈ W}
with PerpV ∗(W ). Conclude that dim W + dim W⊥ = n.

iii) Use the conclusions of Problem 3.ii) to prove that the column rank of a matrix is equal to the row rank of a matrix (see Definition 7.4). This gives a different proof of part of Theorem 7.5.

iv) (rank-nullity theorem) Suppose that A ∈ Mm,n(F ) is a matrix. Show that the dimension of the solution space
{x ∈ Fn | Ax = ~0}
is equal to n − rank A. Thus with the notation of Lemma 5.3,

dim U⊥ + rank A = n.

v) Consider the complex vector space C2 with the dot product as nondegenerate symmetric bilinear form. Give an example of a subspace W of C2 such that C2 ≠ W⊕W⊥. Verify for your example that dim W + dim W⊥ = dim C2 = 2.

4. Suppose that V is a finite dimensional vector space with a nondegenerate symmetric bilinear form < , >. An operator Φ on V is a linear map Φ : V → V. If Φ is an operator on V, show that there exists a unique operator Ψ : V → V such that < Φ(v), w > = < v,Ψ(w) > for all v, w ∈ V. Ψ is called the transpose of Φ, and we write Ψ = Φt.

Hint: Define a map Ψ : V → V as follows. For w ∈ V, let L : V → F be the map defined by L(v) = < Φ(v), w > for v ∈ V. Verify that L is linear, so that L ∈ V ∗. By Problem 1.ii), there exists a unique w′ ∈ V such that for all v ∈ V, we have L(v) = < v,w′ >. Define Ψ(w) = w′. Prove that Ψ is linear, and that < Φ(v), w > = < v,Ψ(w) > for all v, w ∈ V. Don’t forget to prove that Ψ is unique!

5. Let V = Fn with the dot product as nondegenerate symmetric bilinear form. Suppose that L : V → V is an operator. Show that Lt = LAt if A ∈ Mn×n(F ) is the n × n matrix such that L = LA.

6. Give a “down to earth” (but “coordinate dependent”) proof of Problem 4, starting like this. Let β = {v1, . . . , vn} be a basis of V. The linear map Φβ : V → Fn defined by Φβ(v) = (v)β is an isomorphism, with inverse Ψ : Fn → V defined by Ψ((x1, . . . , xn)t) = x1v1 + · · · + xnvn. Define a product [ , ] on Fn by [x, y] = < Ψ(x),Ψ(y) >. Verify that [ , ] is a nondegenerate symmetric bilinear form on Fn. By our classification of nondegenerate symmetric bilinear forms on Fn, we know that [x, y] = xtBy for some symmetric invertible matrix B ∈ Mn×n(F ).

7. Determinants

Theorem 7.1. Suppose that F is a field, and n is a positive integer. Then there exists a unique function Det : Mn,n(F ) → F which satisfies the following properties:

1. Det is multilinear in the columns of matrices in Mn,n(F ).
2. Det is alternating in the columns of matrices in Mn,n(F ); that is, Det(A) = 0 whenever two columns of A are equal.
3. Det(In) = 1.

Existence can be proven by defining a function on A = (aij) ∈ Mn,n(F ) by

Det(A) = ∑σ∈Sn sgn(σ) aσ(1),1 aσ(2),2 · · · aσ(n),n

where for a permutation σ ∈ Sn, sgn(σ) = 1 if σ is even and sgn(σ) = −1 if σ is odd.

It can be verified that Det is multilinear, alternating and Det(In) = 1.
Uniqueness is proven by showing that if a function Φ : Mn,n(F ) → F is multilinear and alternating on columns and satisfies Φ(In) = 1 then Φ = Det.
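The displayed sum over Sn can be evaluated directly (at n! cost), which makes it a useful cross-check for small matrices. A Python sketch, computing sgn(σ) by counting inversions:

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    """Sign of a permutation (a tuple of images of 0..n-1): count inversions."""
    inv = sum(1 for i in range(len(sigma)) for j in range(i + 1, len(sigma))
              if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def det(A):
    """Det(A) = sum over sigma in S_n of sgn(sigma)*a_{sigma(1),1}...a_{sigma(n),n}."""
    n = len(A)
    return sum(sgn(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

A = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]
print(det(A))   # 25, matching cofactor expansion: 2*(12-0) + 1*(1-0)
```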

Lemma 7.2. Suppose that A,B ∈ Mn,n(F ). Then


1. Det(AB) = Det(A)Det(B).
2. Det(At) = Det(A).

Lemma 7.3. Suppose that A ∈ Mn,n(F ). Then the following are equivalent:

1. Det(A) ≠ 0.
2. The columns {A1, . . . , An} of A are linearly independent.
3. The solution space to Ax = 0 is (0).

Proof. 2. is equivalent to 3. by the dimension formula, applied to the linear map LA : Fn → Fn.

We will now establish that 1. is equivalent to 2. Suppose that {A1, . . . , An} are linearly dependent. Then, after possibly permuting the columns of A, there exist c1, . . . , cn−1 ∈ F such that An = c1A1 + · · · + cn−1An−1. Thus

Det(A) = Det(A1, . . . , An−1, An)
= Det(A1, . . . , An−1, c1A1 + · · · + cn−1An−1)
= c1Det(A1, . . . , An−1, A1) + · · · + cn−1Det(A1, . . . , An−1, An−1)
= 0.

Now suppose that {A1, . . . , An} are linearly independent. Then dim Image LA = n. By the dimension theorem, dim Kernel LA = 0. Thus LA is an isomorphism, so A is invertible with an inverse B. We have that 1 = Det(In) = Det(AB) = Det(A)Det(B). Thus Det(A) ≠ 0. �

Definition 7.4. Suppose that A ∈Mm,n(F ).

1. The column rank of A is the dimension of the column space Span{A1, . . . , An}, which is a subspace of Fm.
2. The row rank of A is the dimension of the row space Span{A1, . . . , Am}, which is a subspace of Fn.
3. The rank of A is the largest integer r such that there exists an r × r submatrix B of A such that Det(B) ≠ 0.

Theorem 7.5. Suppose that A ∈ Mm,n(F ). Then the column rank of A, the row rank of A and the rank of A are all equal.

Proof. Let s be the column rank of A, and let r be the rank of A. We will show that s = r. There exists an r × r submatrix B of A such that Det(B) ≠ 0. There exist 1 ≤ i1 < i2 < · · · < ir ≤ m and 1 ≤ j1 < j2 < · · · < jr ≤ n such that B is obtained from A by deleting the rows Ai from A such that i ∉ {i1, . . . , ir} and deleting the columns Aj such that j ∉ {j1, . . . , jr}. We have that the columns of B are linearly independent, so the columns Aj1 , . . . , Ajr of A are linearly independent. Thus s ≥ r.

We now will prove that r ≥ s by induction on n, from which it follows that r = s. The case n = 1 is easily verified; r and s are either 0 or 1, depending on whether A is zero or not.

Assume that the induction statement is true for matrices with fewer than n columns. After permuting the columns of A, we may assume that the first s columns of A are linearly independent. Since the column space is a subspace of Fm, we have that s ≤ m. Let B be the matrix consisting of the first s columns of A. If s < n, then by induction we have that the rank of B is s, so that A has rank r ≥ s.

We have reduced to the case where s = n ≤ m. Let C be the m × (n − 1) submatrix of A consisting of the first n − 1 columns of A. By induction, there exists an (n − 1) × (n − 1) submatrix E of C whose determinant is nonzero. After possibly interchanging rows of A, we may assume that E consists of the first n − 1 rows of C. For 1 ≤ i ≤ n, let Ei be the column vector consisting of the first n − 1 rows of the i-th column of A, and for 1 ≤ i ≤ n and n ≤ j ≤ m, let Eji be the column vector consisting of the first n − 1 rows of the i-th column of A, followed by the j-th row of the i-th column of A.

Since Det(E) ≠ 0, {E1, . . . , En−1} are linearly independent, and are thus a basis of Fn−1. Thus there are unique elements a1, . . . , an−1 ∈ F such that

(3) a1E1 + · · ·+ an−1En−1 = En.

Suppose that all determinants of n × n submatrices of A are zero. Then {Ej1, . . . , Ejn} are linearly dependent for n ≤ j ≤ m. By uniqueness of the relation (3), we must then have that

a1Ej1 + · · · + an−1Ejn−1 − Ejn = 0

for n ≤ j ≤ m. Thus

a1A1 + · · · + an−1An−1 − An = 0,

a contradiction to our assumption that A has column rank n. Thus some n × n submatrix of A has nonzero determinant, and thus r ≥ s.

Taking the transpose of A, the above argument shows that the row rank of A is equal to the rank of A. �
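For small matrices, Definition 7.4 and Theorem 7.5 can be verified by brute force: find the largest r with a nonzero r × r minor and compare with the column rank. A sketch (exponential cost, for illustration only; the example matrix is made up):

```python
from itertools import combinations, permutations

def det(A):
    """Leibniz-formula determinant, as in Theorem 7.1."""
    n = len(A)
    def sgn(s):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
        return -1 if inv % 2 else 1
    total = 0
    for s in permutations(range(n)):
        p = sgn(s)
        for j in range(n):
            p *= A[s[j]][j]
        total += p
    return total

def rank_by_minors(A):
    """Largest r such that some r x r submatrix of A has nonzero determinant."""
    m, n = len(A), len(A[0])
    for r in range(min(m, n), 0, -1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                if det([[A[i][j] for j in cols] for i in rows]) != 0:
                    return r
    return 0

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
print(rank_by_minors(A))   # 2: row two is twice row one, and det[[1,2],[1,0]] = -2
```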

8. Rings

A set A is a ring if it has an addition operation under which A is an abelian group, and an associative multiplication operation such that a(b + c) = ab + ac and (b + c)a = ba + ca for all a, b, c ∈ A. A further has a multiplicative identity 1A (written 1 when there is no danger of confusion). A ring A is a commutative ring if ab = ba for all a, b ∈ A.

Suppose that A and B are rings. A ring homomorphism ϕ : A → B is a mapping such that ϕ is a homomorphism of abelian groups, ϕ(ab) = ϕ(a)ϕ(b) for a, b ∈ A and ϕ(1A) = 1B.

Suppose that B is a commutative ring, A is a subring of B and S is a subset of B. The subring of B generated by S and A is the intersection of all subrings T of B which contain A and S. The subring of B generated by S and A is denoted by A[S]. It should be verified that A[S] is in fact a subring of B, and that

A[S] = {∑ ai1,...,ir s1^i1 · · · sr^ir | r ∈ N, n ∈ N, ai1,...,ir ∈ A, s1, . . . , sr ∈ S}, where the sum is over indices 0 ≤ i1, . . . , ir ≤ n.

Now suppose that R is a commutative ring, and let Ri = R for i ∈ N. Let

T = {{ai} ∈ ×i∈NRi | ai = 0 for i ≫ 0}.

We can also write a sequence {ai} ∈ T as (a0, a1, a2, . . . , ar, 0, 0, . . .) for some r ∈ N. T is a ring with addition {ai} + {bi} = {ai + bi} and multiplication {ai}{bj} = {ck} where ck = ∑i+j=k aibj.

The zero element 0T is the sequence all of whose terms are 0, and 1T is the sequence all of whose terms are zero except the zeroth term, which is 1. That is, 0T = (0, 0, . . .) and 1T = (1, 0, 0, . . .). Let x be the sequence whose first term is 1 and all other terms are zero; that is, x = (0, 1, 0, 0, . . .). Then xi is the sequence whose i-th term is 1 and all other terms are zero. The natural map of R into T which maps a ∈ R to the sequence whose zeroth term is a and all other terms are zero identifies R with a subring of T. Suppose that {ai} ∈ T. Then there is some natural number r such that ai = 0 for all i > r. We have that

{ai} = (a0, a1, . . . , ar, 0, 0, . . .) = a0 + a1x + · · · + arxr.

Thus the subring R[x] of T generated by x and R is equal to T. R[x] is called a polynomial ring.

Lemma 8.1. Suppose that f(x), g(x) ∈ R[x] have expansions f(x) = a0 + a1x + · · · + arxr ∈ R[x] and g(x) = b0 + b1x + · · · + brxr ∈ R[x] with a0, . . . , ar ∈ R and b0, . . . , br ∈ R. Suppose that f(x) = g(x). Then ai = bi for 0 ≤ i ≤ r.

Proof. We have f(x) = (a0, a1, . . . , ar, 0, . . .) and g(x) = (b0, b1, . . . , br, 0, . . .) as elements of ×i∈NRi. Since two sequences in ×i∈NRi are equal if and only if their coefficients are equal, we have that ai = bi for all i. �

Theorem 8.2. (Universal Property of Polynomial Rings) Suppose that A and B are commutative rings and ϕ : A → B is a ring homomorphism. Suppose that b ∈ B. Then there exists a unique ring homomorphism ϕ̄ : A[x] → B such that ϕ̄(a) = ϕ(a) for a ∈ A and ϕ̄(x) = b.

Proof. We first prove existence. Define ϕ̄ : A[x] → B by ϕ̄(f(x)) = ϕ(a0) + ϕ(a1)b + · · · + ϕ(ar)br for

(4) f(x) = a0 + a1x + · · · + arxr ∈ A[x]

with a0, . . . , ar ∈ A. This map is well defined, since the expansion (4) is unique by Lemma 8.1. It should be verified that ϕ̄ is a ring homomorphism with the desired properties.

We will now prove uniqueness. Suppose that Ψ : A[x] → B is a ring homomorphism such that Ψ(a) = ϕ(a) for a ∈ A and Ψ(x) = b. Suppose that f(x) ∈ A[x]. Write f(x) = a0 + a1x + · · · + arxr with a0, . . . , ar ∈ A. We have

Ψ(f(x)) = Ψ(a0 + a1x + · · · + arxr) = Ψ(a0) + Ψ(a1)Ψ(x) + · · · + Ψ(ar)Ψ(x)r
= ϕ(a0) + ϕ(a1)b + · · · + ϕ(ar)br = ϕ̄(a0 + a1x + · · · + arxr) = ϕ̄(f(x)).

Since this identity is true for all f(x) ∈ A[x], we have that Ψ = ϕ̄. �

Let F be a field, and F [x] be a polynomial ring over F. Suppose that f(x) = a0 + a1x + · · · + anxn ∈ F [x] with a0, . . . , an ∈ F and an ≠ 0. The degree of f(x) is defined to be deg f(x) = n. If f(x) = 0 define deg f(x) = −∞. For f, g ∈ F [x] we have

deg(fg) = deg f + deg g

and

deg(f + g) ≤ max{deg f, deg g}.

f ∈ F [x] is a unit (has a multiplicative inverse in F [x]) if and only if f is a nonzero element of F, if and only if deg f = 0 (this follows from the formulas on degree). F [x] is a domain; that is, it has no zero divisors. A basic result is Euclidean division.

Theorem 8.3. Suppose that f, g ∈ F [x] with f ≠ 0. Then there exist unique polynomials q, r ∈ F [x] such that g = qf + r with deg r < deg f.
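Euclidean division is ordinary long division on coefficients. A sketch over F = Q, representing g = a0 + a1x + · · · as the coefficient list [a0, a1, . . .] (the representation and helper name are conventions of this sketch, not of the notes):

```python
from fractions import Fraction

def polydivmod(g, f):
    """Return (q, r) with g = q*f + r and deg r < deg f, over Q.
    Polynomials are lists of Fractions, lowest degree first; [] is zero."""
    g = [Fraction(c) for c in g]
    f = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(g) - len(f) + 1, 1)
    while len(g) >= len(f) and any(c != 0 for c in g):
        shift = len(g) - len(f)
        c = g[-1] / f[-1]            # leading coefficient of the next quotient term
        q[shift] = c
        for i, fi in enumerate(f):   # subtract c * x^shift * f from g
            g[shift + i] -= c * fi
        while g and g[-1] == 0:      # trim so len(g) - 1 = deg g
            g.pop()
    return q, g

# g = x^3 + 2x + 1, f = x^2 + 1:  q = x, r = x + 1
q, r = polydivmod([1, 2, 0, 1], [1, 0, 1])
print(q, r)   # [0, 1] and [1, 1] as Fractions
```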

Corollary 8.4. Suppose that f(x) ∈ F [x] is nonzero. Then f(x) has at most deg f distinct roots in F.


Corollary 8.5. F [x] is a principal ideal domain; that is, every ideal in F [x] is generated by a single element.

An element f(x) ∈ F [x] is defined to be irreducible if it has degree ≥ 1, and f cannot be written as a product f = gh with g, h ∉ F. A polynomial f(x) ∈ F [x] of positive degree n is monic if f(x) = a0 + a1x + · · · + an−1x^n−1 + x^n for some a0, . . . , an−1 ∈ F.

Theorem 8.6. (Unique Factorization) Suppose that f(x) ∈ F [x] has positive degree. Then there is a factorization

(5) f(x) = cf1^n1 · · · fr^nr

where r is a positive integer, f1, . . . , fr ∈ F [x] are monic irreducible, 0 ≠ c ∈ F and n1, . . . , nr are positive integers.
This factorization is unique, in the sense that any other factorization of f(x) as a product of monic irreducibles is obtained from (5) by permuting the fi.

If F is an algebraically closed field (such as the complex numbers C) the monic irreducibles are exactly the linear polynomials x − α for α ∈ F.

Theorem 8.7. Suppose that F is an algebraically closed field and f(x) ∈ F [x] has positive degree. Then there is a (unique) factorization

(6) f(x) = c(x − α1)^n1 · · · (x − αr)^nr

where r is a positive integer, α1, . . . , αr ∈ F are distinct, 0 ≠ c ∈ F and n1, . . . , nr are positive integers.

Suppose that f(x), g(x) ∈ F [x] are not both zero. A common divisor of f and g in F [x] is an element h ∈ F [x] such that h divides f and h divides g in F [x]. A greatest common divisor of f and g in F [x] is an element d ∈ F [x] such that d is a common divisor of f and g in F [x], and every other common divisor of f and g in F [x] divides d.

Theorem 8.8. Suppose that f, g ∈ F [x] are not both zero. Then

1. There exists a unique monic greatest common divisor d(x) = gcd(f, g) of f and g in F [x].
2. There exist p, q ∈ F [x] such that d = pf + qg.
3. Suppose that K is a field containing F. Then d is a greatest common divisor of f and g in K[x].
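Parts 1 and 2 of Theorem 8.8 are computed by the extended Euclidean algorithm, which iterates the division above while tracking the combinations p and q. A self-contained sketch over Q (it repeats a compact division helper so it runs on its own):

```python
from fractions import Fraction

def polydivmod(g, f):
    g, f = [Fraction(c) for c in g], [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(g) - len(f) + 1, 1)
    while len(g) >= len(f) and any(c != 0 for c in g):
        shift, c = len(g) - len(f), g[-1] / f[-1]
        q[shift] = c
        for i, fi in enumerate(f):
            g[shift + i] -= c * fi
        while g and g[-1] == 0:
            g.pop()
    return q, g

def padd(a, b):
    return [x + y for x, y in zip(a, b)] + (a[len(b):] or b[len(a):])

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def ext_gcd(f, g):
    """Monic d = gcd(f, g) together with p, q such that d = p*f + q*g."""
    r0, r1 = [Fraction(c) for c in f], [Fraction(c) for c in g]
    p0, p1 = [Fraction(1)], [Fraction(0)]
    q0, q1 = [Fraction(0)], [Fraction(1)]
    while r1 and any(c != 0 for c in r1):
        quo, rem = polydivmod(r0, r1)
        r0, r1 = r1, rem
        p0, p1 = p1, padd(p0, [-c for c in pmul(quo, p1)])
        q0, q1 = q1, padd(q0, [-c for c in pmul(quo, q1)])
    lead = r0[-1]   # normalize so the gcd is monic
    return [c / lead for c in r0], [c / lead for c in p0], [c / lead for c in q0]

# f = (x-1)(x-2) = x^2 - 3x + 2,  g = (x-1)(x-3) = x^2 - 4x + 3,  gcd = x - 1
d, p, q = ext_gcd([2, -3, 1], [3, -4, 1])
print(d)   # [-1, 1] as Fractions, i.e. x - 1
```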

9. Eigenvalues and Eigenvectors

Suppose that F is a field.

Definition 9.1. Suppose that V is a vector space over a field F and L : V → V is a linear map. An element λ ∈ F is an eigenvalue of L if there exists a nonzero element v ∈ V such that L(v) = λv. A nonzero vector v ∈ V such that L(v) = λv is called an eigenvector of L with eigenvalue λ. If λ ∈ F is an eigenvalue of L, then

E(λ) = {v ∈ V | L(v) = λv}

is the eigenspace of λ for L.

Lemma 9.2. Suppose that V is a vector space over a field F, L : V → V is a linear map, and λ ∈ F is an eigenvalue of L. Then E(λ) is a subspace of V of positive dimension.


Theorem 9.3. Suppose that V is a vector space over a field F, and L : V → V is a linear map. Let λ1, . . . , λr ∈ F be distinct eigenvalues of L. Then the subspace of V spanned by E(λ1), . . . , E(λr) is a direct sum; that is,

E(λ1) + E(λ2) + · · · + E(λr) = E(λ1)⊕E(λ2)⊕· · ·⊕E(λr).

Proof. We prove the theorem by induction on r. The case r = 1 follows from the definition of direct sum. Now assume that the theorem is true for r − 1 eigenvalues. Suppose that vi ∈ E(λi) for 1 ≤ i ≤ r, wi ∈ E(λi) for 1 ≤ i ≤ r and

(7) v1 + v2 + · · ·+ vr = w1 + w2 + · · ·+ wr.

We must show that vi = wi for 1 ≤ i ≤ r. Multiplying equation (7) by λr, we obtain

(8) λrv1 + λrv2 + · · ·+ λrvr = λrw1 + λrw2 + · · ·+ λrwr.

Applying L to equation (7) we obtain

(9) λ1v1 + λ2v2 + · · ·+ λrvr = λ1w1 + λ2w2 + · · ·+ λrwr.

Subtracting equation (8) from (9), we obtain

(λ1 − λr)v1 + · · ·+ (λr−1 − λr)vr−1 = (λ1 − λr)w1 + · · ·+ (λr−1 − λr)wr−1.

Now by induction on r, and since λi − λr ≠ 0 for 1 ≤ i ≤ r − 1, we get that vi = wi for 1 ≤ i ≤ r − 1. Substituting into equation (7), we then obtain that vr = wr, and we see that the sum is direct. �

Corollary 9.4. Suppose that V is a finite dimensional vector space over a field F, and L : V → V is linear. Then L has at most dim(V ) distinct eigenvalues.

10. Characteristic Polynomials

Suppose that F is a field. We let F [t] be a polynomial ring over F .

Definition 10.1. Suppose that F is a field, and A ∈ Mn,n(F ). The characteristic polynomial of A is the polynomial pA(t) = Det(tIn − A) ∈ F [t].

pA(t) is a monic polynomial of degree n.
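pA(t) can be computed exactly without symbolic algebra, for instance by the Faddeev-LeVerrier recursion (not covered in these notes; it divides by 1, . . . , n, so assume a field of characteristic zero such as Q):

```python
from fractions import Fraction

def charpoly(A):
    """Coefficients [c0, ..., cn] of p_A(t) = t^n + c_{n-1} t^{n-1} + ... + c0,
    lowest degree first, via the Faddeev-LeVerrier recursion over Q."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    c = [Fraction(0)] * n + [Fraction(1)]          # c[n] = 1: p_A is monic
    M = [[Fraction(0)] * n for _ in range(n)]      # M_0 = 0
    for k in range(1, n + 1):
        # M_k = A M_{k-1} + c_{n-k+1} I
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -(1/k) trace(A M_k)
        tr = sum(sum(A[i][l] * M[l][i] for l in range(n)) for i in range(n))
        c[n - k] = -tr / k
    return c

# p_A(t) = (t-2)(t-3) = t^2 - 5t + 6 for this upper triangular example
print(charpoly([[2, 1], [0, 3]]))   # [6, -5, 1] as Fractions
```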

Lemma 10.2. Suppose that V is an n-dimensional vector space over a field F, and L : V → V is a linear map. Let β, β′ be two bases of V. Then

pMββ(L)(t) = pMβ′β′(L)(t).

Proof. We have that

Mβ′β′(L) = Mββ′(I)Mββ(L)Mβ′β(I).

Let A = Mββ(L), B = Mβ′β′(L) and C = Mβ′β(I), so that B = C−1AC. Then

pMβ′β′(L)(t) = Det(tIn − B) = Det(tIn − C−1AC)
= Det(C−1(tIn − A)C) = Det(C)−1Det(tIn − A)Det(C)
= Det(tIn − A) = pMββ(L)(t). �


Definition 10.3. Suppose that V is an n-dimensional vector space over a field F, and L : V → V is a linear map. Let β be a basis of V. Define the characteristic polynomial pL(t) of L to be pL(t) = pMββ(L)(t).

Lemma 10.2 shows that the definition of characteristic polynomial of L is well defined; it is independent of choice of basis of V.

Theorem 10.4. Suppose that V is an n-dimensional vector space over a field F, and L : V → V is a linear map. Then the eigenvalues of L are the roots of pL(t) = 0 which are in F. In particular, we have another proof that L has at most n distinct eigenvalues.

Proof. Let β be a basis of V, and suppose that λ ∈ F. Then the following are equivalent:

1. λ is an eigenvalue of L.
2. There exists a nonzero vector v ∈ V such that Lv = λv.
3. The kernel of λI − L : V → V is nonzero.
4. The solution space to Mββ(λI − L)y = 0 is nonzero.
5. The solution space to (λIn − Mββ(L))y = 0 is nonzero.
6. Det(λIn − Mββ(L)) = 0.
7. pL(λ) = 0. �

Definition 10.5. Suppose that V is an n-dimensional vector space over a field F, and L : V → V is a linear map. L is diagonalizable if there exists a basis β of V such that Mββ(L) is a diagonal matrix.

Theorem 10.6. Suppose that V is an n-dimensional vector space over a field F, and L : V → V is a linear map. Then the following are equivalent:

1. L is diagonalizable.
2. There exists a basis of V consisting of eigenvectors of L.
3. Let λ1, . . . , λr be the distinct eigenvalues of L. Then
dim E(λ1) + · · · + dim E(λr) = n.

Proof. We first prove 1 implies 2. By assumption, there exists a basis β = {v1, . . . , vn} of V such that

Mββ(L) = ((L(v1))β, . . . , (L(vn))β) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}

is a diagonal matrix. Thus L(vi) = λivi for 1 ≤ i ≤ n and so β is a basis of V consisting of eigenvectors of L.

We now prove 2 implies 3. By assumption, there exists a basis β = {v1, . . . , vn} of V consisting of eigenvectors of L. Thus if λ1, . . . , λr are the distinct eigenvalues of L, then V ⊂ E(λ1) + · · · + E(λr). Thus

V = E(λ1) + · · · + E(λr) = E(λ1)⊕· · ·⊕E(λr)

by Theorem 9.3, so

n = dim V = dim E(λ1) + · · · + dim E(λr).


We now prove 3 implies 1. We have that E(λ1) + · · · + E(λr) = E(λ1)⊕· · ·⊕E(λr) by Theorem 9.3, so that E(λ1)⊕· · ·⊕E(λr) is a subspace of V which has dimension n = dim V. Thus V = E(λ1)⊕· · ·⊕E(λr).
Let {vi1, . . . , visi} be a basis of E(λi) for 1 ≤ i ≤ r. Then

β = {v11, . . . , v1s1 , v21, . . . , v2s2 , . . . , vr1, . . . , vrsr}

is a basis of V = E(λ1)⊕· · ·⊕E(λr). We have that Mββ(L) = ((L(v11))β, . . . , (L(vrsr))β) is a diagonal matrix. �

Lemma 10.7. Suppose that A ∈ Mn,n(F ) is a matrix. Then LA is diagonalizable if and only if A is similar to a diagonal matrix (there exists an invertible matrix B ∈ Mn,n(F ) and a diagonal matrix D ∈ Mn,n(F ) such that B−1AB = D).

Proof. Suppose that LA is diagonalizable. Then there exists a basis β = {v1, . . . , vn} of Fn consisting of eigenvectors of LA. Let λi be the eigenvalue of vi, so that Avi = λivi for 1 ≤ i ≤ n. We have that

D = Mββ(LA) = ((LA(v1))β, . . . , (LA(vn))β) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}

is a diagonal matrix. Let β∗ = {E1, . . . , En} be the standard basis of Fn. Then

D = Mββ(LA) = Mβ∗β(idFn)Mβ∗β∗(LA)Mββ∗(idFn) = B−1AB

where B = Mββ∗(idFn) = (v1, v2, . . . , vn).
Now suppose that B−1AB = D is a diagonal matrix. Let B = (B1, . . . , Bn) and β = {B1, . . . , Bn}, a basis of Fn since B is invertible. Then

Mββ(LA) = Mβ∗β(idFn)Mβ∗β∗(LA)Mββ∗(idFn) = B−1AB = D

is a diagonal matrix,

D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix},

so LA(Bi) = λiBi for 1 ≤ i ≤ n, and so {B1, . . . , Bn} is a basis of eigenvectors of LA, and so LA is diagonalizable. �
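Over F = R or C the basis of eigenvectors can be found numerically, and Lemma 10.7 then reads B−1AB = D with B the matrix whose columns are that basis. A NumPy sketch (floating point, so equality holds only up to rounding):

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])   # p_A(t) = t^2 - 7t + 10 = (t-5)(t-2)

# Columns of B are eigenvectors of L_A; lam lists the matching eigenvalues.
lam, B = np.linalg.eig(A)
D = np.linalg.inv(B) @ A @ B           # B^{-1} A B, as in Lemma 10.7

assert np.allclose(D, np.diag(lam))    # D is diagonal, with the eigenvalues of A
print(lam)                             # 5 and 2 (order may vary)
```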

11. Triangulation

Let V be a finite dimensional vector space over a field F, of positive dimension n ≥ 1. Suppose that L : V → V is a linear map. A subspace W of V is L-invariant if L(W ) ⊂ W.

A fan {V1, . . . , Vn} of L in V is a finite sequence of L-invariant subspaces

V1 ⊂ V2 ⊂ · · · ⊂ Vn = V

such that dim Vi = i for 1 ≤ i ≤ n.
A fan basis (of the fan {V1, . . . , Vn}) for L is a basis {v1, . . . , vn} of V such that {v1, . . . , vi} is a basis of Vi for 1 ≤ i ≤ n. Given a fan, a fan basis always exists.


Theorem 11.1. Let β = {v1, . . . , vn} be a fan basis for L. Then the matrix Mββ(L) of L with respect to the basis β is an upper triangular matrix.

Proof. Let {V1, . . . , Vn} be the fan corresponding to the fan basis. L(Vi) ⊂ Vi for all i implies there exist aij ∈ F such that

L(v1) = a11v1
L(v2) = a21v1 + a22v2
...
L(vn) = an1v1 + · · · + annvn.

Thus

Mββ(L) = ((L(v1))β, (L(v2))β, . . . , (L(vn))β) = (aji)

is upper triangular. �

Definition 11.2. A linear map L : V → V is triangulizable if there exists a basis β of V such that Mββ(L) is upper triangular. A matrix A ∈ Mn,n(F ) is triangulizable if there exists an invertible matrix B ∈ Mn,n(F ) such that B−1AB is upper triangular.

A matrix A is triangulizable if and only if the linear map LA : Fn → Fn is triangulizable; if β is a fan basis for LA, then

Mββ(LA) = Mstβ(I)Mstst(LA)Mβst(I) = B−1AB

where I : Fn → Fn is the identity map, st denotes the standard basis of Fn, and B = Mβst(I).

Suppose that V is a vector space, and W1, W2 are subspaces such that V = W1⊕W2. Then every element v ∈ V has a unique expression v = w1 + w2 with w1 ∈ W1 and w2 ∈ W2. We may thus define a projection π1 : V → W1 by π1(v) = w1 if v = w1 + w2 with w1 ∈ W1 and w2 ∈ W2, and define a projection π2 : V → W2 by π2(v) = w2 for v = w1 + w2 ∈ V. These projections are linear maps, which depend on both W1 and W2. Composing with the inclusion W1 ⊂ V, we can view π1 as a map from V to V. Composing with the inclusion W2 ⊂ V, we can view π2 as a map from V to V. Then π1 + π2 = IV .

Theorem 11.3. Let V be a nonzero finite dimensional vector space over a field F, and let L : V → V be a linear map. Suppose that the characteristic polynomial pL(t) factors into linear factors in F [t] (this will always happen if F is algebraically closed). Then there exists a fan of L in V, so that L is triangulizable.

Proof. We prove the theorem by induction on n = dim V. If n = 1, then any basis of V is a fan basis for V. Assume that the theorem is true for linear maps of vector spaces W over F of dimension less than n = dim V. Since pL(t) splits into linear factors in F [t], and pL(t) has degree n > 0, pL(t) has a root λ1 in F, which is thus an eigenvalue of L. Let v1 ∈ V be an eigenvector of L with the eigenvalue λ1. Let V1 be the one dimensional subspace of V generated by {v1}. Extend {v1} to a basis β = {v1, v2, . . . , vn} of V. Let W = Span{v2, . . . , vn}. β′ = {v2, . . . , vn} is a basis of W. V is the direct sum V = V1⊕W. Let π1 : V → V1 and π2 : V → W be the projections. The composed linear map is π2L : W → W. From

Mββ(L) = \begin{pmatrix} \lambda_1 & * \\ 0 & M^{\beta'}_{\beta'}(\pi_2 L) \end{pmatrix},


we calculate pL(t) = (t − λ1)pπ2L(t). Since pL(t) factors into linear factors in F [t], pπ2L(t) also factors into linear factors in F [t].

By induction, there exists a fan of π2L in W, say {W1, . . . ,Wn−1}. Let Vi = V1 + Wi−1 for 2 ≤ i ≤ n. The subspaces Vi form a chain

V1 ⊂ V2 ⊂ · · · ⊂ Vn = V.

Let {u1, . . . , un−1} be a fan basis for π2L, so that {u1, . . . , uj} is a basis of Wj for 1 ≤ j ≤ n − 1. Then {v1, u1, . . . , ui−1} is a basis of Vi. We will show that {V1, . . . , Vn} is a fan of L in V, with fan basis {v1, u1, . . . , un−1}.

We have that L = IL = (π1 + π2)L = π1L + π2L. Let v ∈ Vi. We have an expression v = cv1 + wi−1 with c ∈ F and wi−1 ∈ Wi−1. π1L(v) = π1(L(v)) ∈ V1 ⊂ Vi. π2L(v) = π2L(cv1) + π2L(wi−1). Now π2L(cv1) = cπ2L(v1) = cπ2(λ1v1) = 0 and π2L(wi−1) ∈ Wi−1 since {W1, . . . ,Wn−1} is a fan of π2L. Thus π2L(v) ∈ Wi−1 and so L(v) ∈ Vi. Thus Vi is L-invariant, and we have shown that {V1, . . . , Vn} is a fan of L in V. �
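Numerically, the triangulation of Theorem 11.3 over C is the Schur decomposition, which SciPy computes directly; this is a numeric stand-in for the fan construction in the proof, using a unitary change of basis:

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0., -1.],
              [1.,  0.]])   # rotation; eigenvalues +-i, so triangulate over C

T, Z = schur(A, output='complex')    # Z unitary, T = Z^H A Z upper triangular
assert np.allclose(Z @ T @ Z.conj().T, A)
assert np.allclose(T, np.triu(T))    # the diagonal of T carries the eigenvalues
print(np.diag(T))                    # approximately i and -i
```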

12. The minimal polynomial of a linear operator

Suppose that L : V → V is a linear map, where V is a finite dimensional vector space over a field F. Let LF (V, V ) be the F -vector space of linear maps from V to V. LF (V, V ) is a ring with multiplication given by composition of maps. Let I : V → V be the identity map, which is the multiplicative identity of the ring LF (V, V ). There is a natural inclusion of the field F as a subring (and subvector space) of LF (V, V ), obtained by mapping λ ∈ F to λI ∈ LF (V, V ). Let F [L] be the subring of LF (V, V ) generated by F and L.

F [L] = {a0I + a1L+ · · ·+ arLr | r ∈ N and a0, . . . , ar ∈ F}.

F [L] is a commutative ring. By the universal property of polynomial rings, there is a surjective ring homomorphism ϕ from the polynomial ring F [t] onto F [L], obtained by mapping f(t) ∈ F [t] to f(L). ϕ is also a vector space homomorphism. Since F [L] is a subspace of LF (V, V ), which is an F -vector space of dimension n², F [L] is a finite dimensional vector space. Thus the kernel of ϕ is nonzero (as F [t] is infinite dimensional over F ). Let mL(t) be the monic generator of the kernel of ϕ. We have an isomorphism

F [L] ≅ F [t]/(mL(t)).

mL(t) is the minimal polynomial of L.
We point out here that if g(t), h(t) ∈ F [t] are polynomials, then since F [L] is a commutative ring, we have equality of composition of operators

g(L) ◦ h(L) = h(L) ◦ g(L).

We generally write g(L)h(L) = h(L)g(L) to denote the composition of operators. If v ∈ V, we will also write Lv for L(v) and f(L)v for f(L)(v).

The above theory also works for matrices. The set Mn,n(F ) of n × n matrices is a vector space over F, and is also a ring with matrix multiplication. F is embedded as a subring by the map λ → λIn, where In denotes the n × n identity matrix. Given A ∈ Mn,n(F ), F [A] is a commutative subring, and there is a natural surjective ring homomorphism F [t] → F [A]. The kernel is generated by a monic polynomial mA(t) which is the minimal polynomial of A.

Now suppose that V is a finite dimensional vector space over F, and β is a basis of V. Then the map LF (V, V ) → Mn,n(F ) defined by ϕ ↦ Mββ(ϕ) for ϕ ∈ LF (V, V ) is a vector space isomorphism, and a ring isomorphism.


Suppose that L : V → V is a linear map, and let A = Mββ(L). Then we have an induced isomorphism of F [L] with F [A]. For a polynomial f(x) ∈ F [x], we have that the image of f(L) in F [A] is

Mββ(f(L)) = f(Mββ(L)) = f(A).

Thus we have that mA(t) = mL(t).
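Since mA(t) is the monic polynomial of least degree with mA(A) = 0, it can be found by flattening I, A, A², . . . into vectors and stopping at the first linear dependence. A sketch over F = Q with exact arithmetic (the helper names are inventions of this sketch):

```python
from fractions import Fraction

def solve_in_span(cols, target):
    """Coefficients expressing target as a Q-linear combination of the vectors
    in cols, or None if target is not in their span. Gaussian elimination."""
    m, k = len(target), len(cols)
    M = [[cols[j][i] for j in range(k)] + [target[i]] for i in range(m)]
    piv, r = [], 0
    for c in range(k):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        piv.append(c)
        r += 1
    if any(row[-1] != 0 for row in M[r:]):   # inconsistent: not in the span
        return None
    x = [Fraction(0)] * k
    for i, c in enumerate(piv):
        x[c] = M[i][-1]
    return x

def minpoly(A):
    """m_A(t) over Q, coefficients lowest degree first: stop at the first power
    A^d lying in Span(I, A, ..., A^{d-1})."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    flat = lambda X: [X[i][j] for i in range(n) for j in range(n)]
    P = [[Fraction(i == j) for j in range(n)] for i in range(n)]   # P = A^0 = I
    powers = []
    while True:
        c = solve_in_span(powers, flat(P)) if powers else None
        if c is not None:
            return [-ci for ci in c] + [Fraction(1)]   # A^d - sum c_k A^k = 0
        powers.append(flat(P))
        P = [[sum(A[i][k] * P[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]

print(minpoly([[1, 0], [0, 1]]))   # [-1, 1]: m_A(t) = t - 1, dividing p_A = (t-1)^2
```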

Lemma 12.1. Suppose that A is an n × n matrix with coefficients in a field F, and K is an extension field of F. Let f(x) be the minimal polynomial of A over F, and let g(x) be the minimal polynomial of A over K. Then f(x) = g(x).

Proof. Let a1, a2, . . . , ar ∈ K be elements of K which are linearly independent over F. We will use the following observation. Suppose that A1, . . . , Ar ∈ Mn,n(F ) and a1A1 + · · · + arAr = 0 in Mn,n(K). Then since each coefficient of a1A1 + · · · + arAr is zero, we have that A1 = A2 = · · · = Ar = 0.

Write g(x) = xe + be−1xe−1 + · · · + b0 with bi ∈ K. Let {a1 = 1, a2, . . . , ar} be a basis of the subspace of K (recall that K is a vector space over F ) spanned by {1, be−1, . . . , b0}. Expand bi = ci1a1 + · · · + cirar with cij ∈ F. Let f1(x) = xe + ce−1,1xe−1 + · · · + c0,1 and fj(x) = ce−1,jxe−1 + · · · + c0,j for j ≥ 2. fj(x) ∈ F [x] for all j, and g(x) = a1f1(x) + · · · + arfr(x). We have that 0 = g(A) = a1f1(A) + · · · + arfr(A), which implies that fj(A) = 0 for all j, as observed above. Thus f(x) divides fj(x) in F [x] for all j, and then f(x) divides g(x) in K[x]. Since f(A) = 0, g(x) divides f(x) in K[x], so f and g generate the same ideal in K[x], so f = g is the unique monic generator. �

13. The theorem of Hamilton-Cayley

Theorem 13.1. Let V be a nonzero finite dimensional vector space over an algebraically closed field F, and let L : V → V be a linear map. Let pL(t) be its characteristic polynomial. Then pL(L) = 0.

Here “pL(L) = 0” means that pL(L) is the zero map on V .

Proof. By Theorem 11.3, there exists a fan {V1, . . . , Vn} of L in V, with associated fan basis β = {v1, . . . , vn}. The matrix Mββ(L) ∈ Mn,n(F ) is upper triangular. Write Mββ(L) = (aij). Then

pL(t) = Det(tIn − Mββ(L)) = (t − a11)(t − a22) · · · (t − ann).

We shall prove by induction on i that

(L − a11I) · · · (L − aiiI)v = 0

for all v ∈ Vi.
We first prove this in the case i = 1. Then (L − a11I)v = Lv − a11v = 0 for v ∈ V1, since V1 is in the eigenspace for the eigenvalue a11.
Now assume that i > 1, and that

(L− a11I) · · · (L− ai−1,i−1I)v = 0

for all v ∈ Vi−1. Suppose that v ∈ Vi. Then v = v′ + cvi for some v′ ∈ Vi−1 and c ∈ F. Let z = (L − aiiI)v′. z ∈ Vi−1 because L(Vi−1) ⊂ Vi−1 and aiiv′ ∈ Vi−1. By induction,

(L− a11I) · · · (L− ai−1,i−1I)(L− aiiI)v′ = (L− a11I) · · · (L− ai−1,i−1I)z = 0.

We have (L− aiiI)cvi ∈ Vi−1, since L(vi) = ai1v1 + ai2v2 + · · ·+ aiivi. Thus

(L − a11I) · · · (L − ai−1,i−1I)(L − aiiI)cvi = 0,


and thus

(L − a11I) · · · (L − aiiI)v = 0.

Taking i = n, we obtain pL(L)v = 0 for all v ∈ Vn = V, so pL(L) = 0. �

Corollary 13.2. Let F be an algebraically closed field and suppose that A ∈ Mn,n(F ). Let pA(t) be its characteristic polynomial. Then pA(A) = 0.

Here “pA(A) = 0” means that pA(A) is the zero matrix.

Proof. Let L = LA : Fn → Fn. Then 0 = pL(L). Let β be the standard basis of Fn. Since A = Mββ(L), we have that pL(t) = pA(t) ∈ F [t]. Now

pA(A) = pL(Mββ(L)) = Mββ(pL(L)) = Mββ(0) = 0

where the rightmost zero in the above equation denotes the n × n zero matrix. �

Corollary 13.3. Suppose that F is a field, and A ∈Mn,n(F ). Then pA(A) = 0.

Proof. Let K be an algebraically closed field containing F. From the natural inclusion Mn,n(F ) ⊂ Mn,n(K) we may regard A as an element of Mn,n(K). The computation pA(t) = Det(tIn − A) does not depend on the field F or K. By Corollary 13.2, pA(A) = 0. �
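A direct numeric check of Corollary 13.3 for a sample matrix: take the characteristic polynomial coefficients (NumPy's np.poly computes them for a square array) and evaluate pA(A) by Horner's rule with matrix multiplication:

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [0., 1., 3.],
              [4., 0., 1.]])

# Coefficients of p_A(t), highest degree first: t^3 + c2 t^2 + c1 t + c0.
coeffs = np.poly(A)

# Horner's rule with matrix arguments: P ends up equal to p_A(A).
P = np.zeros_like(A)
for c in coeffs:
    P = P @ A + c * np.eye(3)

assert np.allclose(P, 0)   # Cayley-Hamilton: p_A(A) is the zero matrix
```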

Theorem 13.4. (Hamilton-Cayley) Suppose that V is a nonzero finite dimensional vector space over a field F and L : V → V is a linear map. Then pL(L) = 0.

Proof. Let β be a basis of V, and let A = Mββ(L) ∈ Mn,n(F ). We have pL(t) = pA(t). By Corollary 13.3, pA(A) = 0. Now

Mββ(pL(L)) = pL(Mββ(L)) = pA(A) = 0.

Thus pL(L) = 0. �

Corollary 13.5. Suppose that V is a non zero finite dimensional vector space over a field F and L : V → V is a linear map. Then the minimal polynomial m_L(t) divides the characteristic polynomial p_L(t) in F[t].

Proof. By the Hamilton-Cayley Theorem, p_L(t) is in the kernel of the homomorphism F[t] → F[L] which takes t to L and is the identity on F. Since m_L(t) is a generator for this ideal, m_L(t) divides p_L(t). □
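
The divisibility in Corollary 13.5 can also be observed computationally. The sketch below (not from the notes; it assumes sympy) finds the minimal polynomial of a matrix as the first linear dependence among the powers I, A, A^2, . . . , and checks that it divides the characteristic polynomial:

    import sympy as sp

    def min_poly(A, t):
        """Minimal polynomial of A: first monic dependence among I, A, A^2, ..."""
        n = A.shape[0]
        powers = [sp.eye(n)]
        for k in range(1, n + 1):
            powers.append(powers[-1] * A)
            # Columns are the flattened powers; a nullspace vector is a dependence.
            M = sp.Matrix([[P[i, j] for P in powers]
                           for i in range(n) for j in range(n)])
            ns = M.nullspace()
            if ns:
                c = ns[0] / ns[0][-1]      # normalize: highest power has coefficient 1
                return sp.expand(sum(c[i] * t**i for i in range(k + 1)))

    t = sp.symbols('t')
    A = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 2]])
    m = min_poly(A, t)                          # (t - 2)**2
    p = sp.expand((t * sp.eye(3) - A).det())    # (t - 2)**3
    print(sp.factor(m), sp.rem(p, m, t))        # remainder 0: m_A(t) divides p_A(t)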

14. Invariant Subspaces

Suppose that V is a finite dimensional vector space over a field F, and L : V → V is a linear map. Suppose that W_1, . . . , W_r are L-invariant subspaces of V such that V = W_1 ⊕ · · · ⊕ W_r. Let β_i = {v_{i,1}, . . . , v_{i,σ(i)}} be bases of W_i, and let

β = {v_{1,1}, . . . , v_{1,σ(1)}, v_{2,1}, . . . , v_{r,σ(r)}},

which is a basis of V. Since the W_i are L-invariant, the matrix of L with respect to β is a block matrix

M^β_β(L) =
[ A_1  0   · · ·  0   ]
[ 0    A_2 · · ·  0   ]
[ ...                 ]
[ 0    0   · · ·  A_r ]

where A_i = M^{β_i}_{β_i}(L|W_i) are σ(i) × σ(i) matrices.


Lemma 14.1. Suppose that V is a vector space over a field F and L : V → V is a linear map. Suppose that f(t) ∈ F[t] and let W be the kernel of f(L) : V → V. Then W is an L-invariant subspace of V.

Proof. Let x ∈ W. Then f(L)Lx = Lf(L)x = L0 = 0. Thus Lx ∈ W. □

Theorem 14.2. Let f(t) ∈ F[t]. Suppose that f = f_1f_2 with f_1, f_2 ∈ F[t], deg f_1 ≥ 1, deg f_2 ≥ 1 and gcd(f_1, f_2) = 1. Suppose that V is a vector space over a field F, L : V → V is a linear map and f(L) = 0. Let W_1 = kernel f_1(L) and W_2 = kernel f_2(L). Then V = W_1 ⊕ W_2.

Proof. Since gcd(f_1, f_2) = 1, there exist g_1, g_2 ∈ F[t] such that 1 = g_1(t)f_1(t) + g_2(t)f_2(t), so that

(10) g_1(L)f_1(L) + g_2(L)f_2(L) = I

is the identity map of V. Let x ∈ V. Then

x = g_1(L)f_1(L)x + g_2(L)f_2(L)x.

Now g_1(L)f_1(L)x ∈ W_2 since

f_2(L)g_1(L)f_1(L)x = g_1(L)f_1(L)f_2(L)x = g_1(L)f(L)x = g_1(L)0 = 0.

Similarly, g_2(L)f_2(L)x ∈ W_1. Thus V = W_1 + W_2. To show that the sum is direct, we must show that an expression x = w_1 + w_2 with w_1 ∈ W_1 and w_2 ∈ W_2 is uniquely determined by x. Apply g_1(L)f_1(L) to this expression to get

g_1(L)f_1(L)x = g_1(L)f_1(L)w_1 + g_1(L)f_1(L)w_2 = 0 + g_1(L)f_1(L)w_2.

Now apply (10) to w_2, to get

w_2 = g_1(L)f_1(L)w_2 + g_2(L)f_2(L)w_2 = g_1(L)f_1(L)w_2 + 0,

which implies that w_2 = g_1(L)f_1(L)x is uniquely determined by x. Similarly, w_1 = g_2(L)f_2(L)x is uniquely determined by x. □

Theorem 14.3. Let V be a vector space over a field F, and let L : V → V be a linear map. Suppose f(t) ∈ F[t] satisfies f(L) = 0. Suppose that f(t) = f_1(t)f_2(t) · · · f_r(t) where f_i(t) ∈ F[t] and gcd(f_i, f_j) = 1 if i ≠ j. Let W_i = kernel f_i(L) for 1 ≤ i ≤ r. Then V = W_1 ⊕ W_2 ⊕ · · · ⊕ W_r.

Proof. We prove the theorem by induction on r, the case r = 1 being trivial. We have that gcd(f_1, f_2f_3 · · · f_r) = 1 since gcd(f_1, f_i) = 1 for 2 ≤ i ≤ r. Theorem 14.2 thus implies V = W_1 ⊕ W, where W = kernel f_2(L)f_3(L) · · · f_r(L). Since W is L-invariant by Lemma 14.1, f_j(L) maps W to W for 2 ≤ j ≤ r. By induction on r, we have that W = U_2 ⊕ · · · ⊕ U_r, where for j ≥ 2, U_j is the kernel of f_j(L) : W → W. Thus V = W_1 ⊕ U_2 ⊕ · · · ⊕ U_r.

We will prove that W_j = U_j for j ≥ 2, which will establish the theorem. We have that U_j ⊂ W_j. Let v ∈ W_j, with j ≥ 2. Then v ∈ kernel f_j(L) implies v ∈ W, since the operators f_i(L) commute and f_j(L)v = 0. Thus v ∈ W_j ∩ W = U_j. □

Corollary 14.4. Suppose V is a vector space over a field F. Let L : V → V be a linear map. Suppose that a monic polynomial f(t) ∈ F[t] satisfies f(L) = 0, and f(t) splits into linear factors in F[t] as

f(t) = (t − α_1)^{m_1} · · · (t − α_r)^{m_r}

where α_1, . . . , α_r ∈ F are distinct. Let W_i = kernel (L − α_iI)^{m_i}. Then

V = W_1 ⊕ W_2 ⊕ · · · ⊕ W_r.


As an application of this Corollary, we consider the following example. Let A(C) be the set of analytic functions on C, and let D = d/dx. Suppose that n ∈ N and a_0, a_1, . . . , a_{n−1} ∈ C. Let

V = {f ∈ A(C) | D^nf + a_{n−1}D^{n−1}f + · · · + a_0f = 0}.

Then V is a complex vector space. Let

p(t) = t^n + a_{n−1}t^{n−1} + · · · + a_0 ∈ C[t].

Suppose that p(t) factors as

p(t) = (t − α_1)^{m_1} · · · (t − α_r)^{m_r}

with α_1, . . . , α_r ∈ C distinct. Then by Corollary 14.4, we have that

V = ⊕_{i=1}^r W_i

where

W_i = {f ∈ A(C) | (D − α_iI)^{m_i}f = 0}.

Now from the following Lemma, we are able to construct a basis of V. In fact, we see that

{e^{α_1x}, xe^{α_1x}, . . . , x^{m_1−1}e^{α_1x}, e^{α_2x}, . . . , x^{m_r−1}e^{α_rx}}

is a basis of V.

Lemma 14.5. Let α ∈ C, and m ∈ N. Let

W = {f ∈ A(C) | (D − αI)^mf = 0}.

Then

{e^{αx}, xe^{αx}, . . . , x^{m−1}e^{αx}}

is a basis of W.

Proof. It can be verified by induction on m that

(D − αI)^mf = e^{αx}D^m(e^{−αx}f)

for f ∈ A(C). Thus f ∈ W if and only if D^m(e^{−αx}f) = 0. The functions whose m-th derivatives are zero are the polynomials of degree ≤ m − 1. Hence the space of solutions to (D − αI)^mf = 0 is the space generated by {e^{αx}, xe^{αx}, . . . , x^{m−1}e^{αx}}. It remains to verify that these functions are linearly independent. Suppose that there exist c_i ∈ C such that

c_0e^{αs} + c_1se^{αs} + · · · + c_{m−1}s^{m−1}e^{αs} = 0

for all s ∈ C. Let

q(t) = c_0 + c_1t + · · · + c_{m−1}t^{m−1} ∈ C[t].

We have q(s)e^{αs} = 0 for all s ∈ C. But e^{αs} ≠ 0 for all s ∈ C, so q(s) = 0 for infinitely many s ∈ C. Since the equation q(t) = 0 has at most a finite number of roots if q(t) ≠ 0, we must have q(t) = 0. Thus c_i = 0 for 0 ≤ i ≤ m − 1. □
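
As an illustration (not part of the notes; it assumes sympy), one can check this basis description for a concrete operator. For p(t) = (t − 1)^2(t + 2) = t^3 − 3t + 2, the solution space of f''' − 3f' + 2f = 0 should be spanned by e^x, xe^x and e^{−2x}:

    import sympy as sp

    x = sp.symbols('x')
    f = sp.Function('f')
    # p(t) = (t - 1)**2 * (t + 2) = t**3 - 3*t + 2, so consider f''' - 3f' + 2f = 0.
    ode = f(x).diff(x, 3) - 3 * f(x).diff(x) + 2 * f(x)
    print(sp.dsolve(ode, f(x)))
    # f(x) = (C1 + C2*x)*exp(x) + C3*exp(-2*x), matching the basis of Lemma 14.5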


15. Cyclic Vectors

Definition 15.1. Let V be a finite dimensional vector space over a field F, and let L : V → V be a linear map. V is called L-cyclic if there exists v ∈ V such that

V = F[L]v = {f(L)v | f(t) ∈ F[t]}.

We will say that the vector v is L-cyclic.

Lemma 15.2. Let V be a finite dimensional vector space over a field F, and let L : V → V be a linear map. Suppose that V is L-cyclic and v ∈ V is an L-cyclic vector. Let d be the degree of the minimal polynomial m_L(t) ∈ F[t] of L. Then {v, Lv, . . . , L^{d−1}v} is a basis of V.

Proof. Suppose w ∈ V. Then w = f(L)v for some f(t) ∈ F[t]. We have f(t) = p(t)m_L(t) + r(t) with p(t), r(t) ∈ F[t] and deg r(t) < d. Thus w = f(L)v = r(L)v ∈ Span{v, Lv, . . . , L^{d−1}v}. This shows that V = Span{v, Lv, . . . , L^{d−1}v}. Suppose that there is a dependence relation

c_0v + c_1Lv + · · · + c_sL^sv = 0

where 0 ≤ s ≤ d − 1, all c_i ∈ F and c_s ≠ 0. Let f(t) = c_0 + c_1t + · · · + c_st^s ∈ F[t]. Let w ∈ V. Then there exists g(t) ∈ F[t] such that w = g(L)v, and

f(L)w = f(L)g(L)v = g(L)f(L)v = g(L)0 = 0.

Thus the operator f(L) = 0, which implies that m_L(t) divides f(t), which is impossible since f(t) is nonzero and has degree less than d. Thus {v, Lv, . . . , L^{d−1}v} is linearly independent, and is a basis of V. □

With the hypotheses of Lemma 15.2, express m_L(t) = a_0 + a_1t + · · · + a_{d−1}t^{d−1} + t^d with the a_i ∈ F. Let β = {v, Lv, . . . , L^{d−1}v}, a basis of V. Then

M^β_β(L) =
[ 0 0 · · · 0 0 −a_0     ]
[ 1 0 · · · 0 0 −a_1     ]
[ 0 1 · · · 0 0 −a_2     ]
[ ...                    ]
[ 0 0 · · · 1 0 −a_{d−2} ]
[ 0 0 · · · 0 1 −a_{d−1} ].
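
This is the companion matrix of m_L(t). As a quick sketch (not from the notes; it assumes sympy), one can build it from sample coefficients a_i and confirm that its characteristic polynomial recovers m_L(t):

    import sympy as sp

    def cyclic_matrix(a):
        """Matrix of L on {v, Lv, ..., L^{d-1}v} for m_L(t) = t^d + a_{d-1}t^{d-1} + ... + a_0."""
        d = len(a)
        M = sp.zeros(d, d)
        for i in range(1, d):
            M[i, i - 1] = 1          # L maps the basis vector L^{i-1}v to L^i v
        for i in range(d):
            M[i, d - 1] = -a[i]      # L(L^{d-1}v) = -a_0 v - ... - a_{d-1} L^{d-1}v
        return M

    t = sp.symbols('t')
    M = cyclic_matrix([2, -3, 0])    # a sample m_L(t) = t^3 - 3t + 2
    print(sp.factor(M.charpoly(t).as_expr()))   # (t - 1)**2 * (t + 2)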

16. Nilpotent Operators

Suppose that V is a vector space, and L : V → V is a linear map. We will say that L is nilpotent if there exists a positive integer r such that L^r = 0. If r is the smallest positive integer such that L^r = 0, then m_L(t) = t^r.
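
For instance (an illustration, not from the notes), the 3 × 3 shift matrix is nilpotent with minimal polynomial t^3; with sympy:

    import sympy as sp

    N = sp.Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # shift: e_3 -> e_2 -> e_1 -> 0
    print(N**2)   # nonzero, so m_N(t) is not t^2
    print(N**3)   # the zero matrix, so m_N(t) = t^3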

Theorem 16.1. Let V be a finite dimensional vector space over a field F, and let L : V → V be a nilpotent linear map. Then V is a direct sum of L-invariant L-cyclic subspaces.

Proof. We prove the theorem by induction on dim V. If V has dimension 1, then V is L-cyclic. Assume that the theorem is true for vector spaces of dimension less than s = dim V.

Let r be the smallest integer such that L^r = 0, so that the minimal polynomial of L is m_L(t) = t^r. Since L^{r−1} ≠ 0, there exists w ∈ V such that L^{r−1}w ≠ 0. Let v = L^{r−1}w. Then Lv = 0, so Kernel L ≠ 0. Now

dim L(V) + dim Kernel L = dim V


implies dim L(V) < dim V.

By induction on s = dim V, L(V) is a direct sum of L-invariant subspaces which are L-cyclic, say

L(V) = W_1 ⊕ · · · ⊕ W_m.

Let w_i ∈ W_i be an L|W_i-cyclic vector, so that if r_i is the degree of the minimal polynomial m_{L|W_i}(t), then W_i has a basis {w_i, Lw_i, . . . , L^{r_i−1}w_i} by Lemma 15.2. Since L|W_i is nilpotent, m_{L|W_i}(t) = t^{r_i}. Let v_i ∈ V be such that Lv_i = w_i.

Let V_i be the cyclic subspace F[L]v_i of V. Since L^{r_i+1}v_i = L^{r_i}w_i = 0, the set {v_i, Lv_i, . . . , L^{r_i}v_i} spans V_i. Suppose that

d_0v_i + d_1Lv_i + · · · + d_{r_i}L^{r_i}v_i = 0

with d_0, . . . , d_{r_i} ∈ F. Then

0 = L(0) = L(d_0v_i + d_1Lv_i + · · · + d_{r_i}L^{r_i}v_i)
  = d_0Lv_i + · · · + d_{r_i−1}L^{r_i}v_i + d_{r_i}L^{r_i+1}v_i
  = d_0w_i + · · · + d_{r_i−1}L^{r_i−1}w_i.

Thus d_0 = · · · = d_{r_i−1} = 0, since {w_i, . . . , L^{r_i−1}w_i} is a basis of W_i. Now since L^{r_i}v_i = L^{r_i−1}w_i ≠ 0, we have d_{r_i} = 0. Thus {v_i, Lv_i, . . . , L^{r_i}v_i} is linearly independent, and is thus a basis of V_i.

We will prove that the subspace V′ = V_1 + · · · + V_m of V is a direct sum. We have to prove that if

(11) 0 = u_1 + · · · + u_m

with u_i ∈ V_i, then u_i = 0 for all i. Since u_i ∈ V_i, we have an expression u_i = f_i(L)v_i where f_i(t) ∈ F[t] is a polynomial. Thus (11) becomes

(12) f_1(L)v_1 + · · · + f_m(L)v_m = 0.

Apply L to (12), and recall that Lf_i(L) = f_i(L)L, to get

f_1(L)w_1 + · · · + f_m(L)w_m = 0.

Now W_1 + · · · + W_m is a direct sum decomposition of L(V) by L-invariant subspaces, so f_i(L)w_i = 0 for all i with 1 ≤ i ≤ m, which implies that f_i(L)(W_i) = 0, since W_i is w_i-cyclic. Thus m_{L|W_i}(t) = t^{r_i} divides f_i(t). In particular, t divides f_i(t) in F[t]. Write f_i(t) = g_i(t)t for some polynomial g_i(t). Then f_i(L) = g_i(L)L, and (12) implies

g_1(L)w_1 + · · · + g_m(L)w_m = 0.

Thus g_i(L)w_i = 0 for all i, since L(V) is the direct sum of the W_i. This implies that t^{r_i} divides g_i(t) in F[t], so that t^{r_i+1} divides f_i(t), which implies that u_i = f_i(L)v_i = 0. This proves that V′ is the direct sum V′ = V_1 ⊕ · · · ⊕ V_m.

An element of L(V) is of the form

f_1(L)w_1 + · · · + f_m(L)w_m = f_1(L)Lv_1 + · · · + f_m(L)Lv_m = L(f_1(L)v_1 + · · · + f_m(L)v_m)

for some polynomials f_i(t) ∈ F[t]. Thus L(V) ⊂ L(V′). Since V′ ⊂ V, we have L(V′) ⊂ L(V) and so L(V′) = L(V). Now let v ∈ V. Then Lv = Lv′ for some v′ ∈ V′, so L(v − v′) = 0. Thus v = v′ + (v − v′) ∈ V′ + Kernel L. We conclude that V = V′ + Kernel L (which may not be a direct sum).

Let β′ = {L^jv_i | 1 ≤ i ≤ m, 0 ≤ j ≤ r_i} be the basis we have constructed of V′. We extend β′ to a basis of V by using elements of Kernel L, to get a basis β = {β′, z_1, . . . , z_e} of V where z_1, . . . , z_e ∈ Kernel L. Each z_j satisfies Lz_j = 0, so z_j is an eigenvector for L,


and the one dimensional space generated by z_j is L-invariant and cyclic. Let this subspace be Z_j. We have

V = V′ ⊕ Z_1 ⊕ · · · ⊕ Z_e = V_1 ⊕ · · · ⊕ V_m ⊕ Z_1 ⊕ · · · ⊕ Z_e,

expressing V as a direct sum of L-cyclic subspaces. □

17. Jordan Form

Definition 17.1. Suppose that V is a finite dimensional vector space over a field F, and L : V → V is a linear map. A basis β of V is a Jordan basis for L if the matrix M^β_β(L) is a block matrix

J = M^β_β(L) =
[ J_1  0   · · ·  0   ]
[ 0    J_2 · · ·  0   ]
[ ...                 ]
[ 0    0   · · ·  J_m ],

where each J_i is a Jordan block; that is, a matrix of the form

J_i =
[ α_i  1    0   · · ·  0    0   ]
[ 0    α_i  1   · · ·  0    0   ]
[ 0    0    α_i · · ·  0    0   ]
[ ...                           ]
[ 0    0    0   · · ·  α_i  1   ]
[ 0    0    0   · · ·  0    α_i ]

for some α_i ∈ F.

A matrix of the form J is called a Jordan form.

Theorem 17.2. Suppose that V is a finite dimensional vector space over a field F, and L : V → V is a linear map. Suppose that the minimal polynomial m_L(t) splits into linear factors in F[t]. Then V has a Jordan basis for L.

Proof. By assumption, we have a factorization m_L(t) = (t − α_1)^{a_1} · · · (t − α_s)^{a_s} with α_1, . . . , α_s ∈ F distinct. Let

V_i = {v ∈ V | (L − α_iI)^{a_i}v = 0}

for 1 ≤ i ≤ s. We proved in Lemma 14.1 and Corollary 14.4 that the V_i are L-invariant subspaces of V and V = V_1 ⊕ · · · ⊕ V_s. It thus suffices to prove the theorem in the case that there exists α ∈ F and r ∈ N such that (L − αI)^r = 0. In this case we apply Theorem 16.1 to the nilpotent operator T = L − αI, to show that V is a direct sum of T-cyclic subspaces. Since a T-invariant subspace is L-invariant, it suffices to prove the theorem in the case that T is nilpotent and V is T-cyclic. Let v be a T-cyclic vector. There exists a positive integer r such that the minimal polynomial of T is m_T(t) = t^r, and by Lemma 15.2, β = {T^{r−1}v, . . . , Tv, v} is a basis of V. Recalling that T = L − αI, we have

L(L − αI)^{r−1}v = α(L − αI)^{r−1}v
L(L − αI)^{r−2}v = (L − αI)^{r−1}v + α(L − αI)^{r−2}v
...
Lv = (L − αI)v + αv.


Thus the matrix of L with respect to the basis β is the Jordan block

M^β_β(L) =
[ α 1 0 · · · 0 0 ]
[ 0 α 1 · · · 0 0 ]
[ 0 0 α · · · 0 0 ]
[ ...             ]
[ 0 0 0 · · · α 1 ]
[ 0 0 0 · · · 0 α ]. □

The proof of the following theorem reduces the calculation of the Jordan form of an operator L to a straightforward calculation, if we are able to determine the eigenvalues (roots of the characteristic equation) of L.

Theorem 17.3. Suppose that V is a finite dimensional vector space over a field F, and L : V → V is a linear map. Suppose that the minimal polynomial m_L(t) splits into linear factors in F[t]. For α ∈ F and i ∈ N, define

s_i(α) = dim Kernel (L − αI)^i.

Then the Jordan form of L is uniquely determined, up to permutation of Jordan blocks, by the s_i(α). In particular, the Jordan form of L is uniquely determined up to permutation of Jordan blocks.

Proof. Suppose that α ∈ F. Let s_i = s_i(α) for 1 ≤ i ≤ n = dim V. Suppose that β is a Jordan basis of V. Let

A = M^β_β(L) =
[ J_1  0   · · ·  0   ]
[ 0    J_2 · · ·  0   ]
[ ...                 ]
[ 0    0   · · ·  J_s ]

where the

J_i =
[ α_i  1    0  · · ·  0   ]
[ 0    α_i  1  · · ·  0   ]
[ ...                     ]
[ 0    0    0  · · ·  α_i ]

are Jordan blocks. Suppose that J_i has size n_i for 1 ≤ i ≤ s. We have that

dim Kernel (L − αI)^i = dim Kernel L_{M^β_β((L−αI)^i)} = dim Kernel L_{(A−αI_n)^i},

where I_n is the n × n identity matrix and L_{(A−αI_n)^i} : F^n → F^n is the associated linear map. We have that

(A − αI_n)^i =
[ (J_1 − αI_{n_1})^i  0                  · · ·  0                  ]
[ 0                   (J_2 − αI_{n_2})^i · · ·  0                  ]
[ ...                                                              ]
[ 0                   0                  · · ·  (J_s − αI_{n_s})^i ].


Now if α_j ≠ α, then

(J_j − αI_{n_j})^i =
[ (α_j − α)^i  ∗            ∗  · · ·  ∗           ]
[ 0            (α_j − α)^i  ∗  · · ·  ∗           ]
[ ...                                             ]
[ 0            0            0  · · ·  (α_j − α)^i ]

is upper triangular with nonzero diagonal entries, hence has zero kernel, while

(J_j − α_jI_{n_j})^i =
[ 0 · · · 0 1 0 · · · 0 ]
[ 0 · · · 0 0 1 · · · 0 ]
[ ...                   ]
[ 0 · · · 0 0 0 · · · 0 ]

for 1 ≤ i ≤ n_j − 1, where the 1 in the first row occurs in the (i + 1)-st column. We also have that (J_j − α_jI_{n_j})^{n_j} = 0. We have

dim Kernel L_{(J_j−αI_{n_j})^i} =
  { 0            if α_j ≠ α,
  { min{i, n_j}  if α_j = α.

For 1 ≤ i ≤ n, let r_i be the number of Jordan blocks J_j of size n_j = i with α_j = α. Then

s_1 = r_1 + r_2 + · · · + r_n
...
s_i = r_1 + 2r_2 + · · · + (i − 1)r_{i−1} + i(r_i + · · · + r_n)
...
s_n = r_1 + 2r_2 + 3r_3 + · · · + nr_n.

From the above system we obtain r_1 = 2s_1 − s_2, r_i = 2s_i − s_{i+1} − s_{i−1} for 2 ≤ i ≤ n − 1, and r_n = s_n − s_{n−1}. Since the Jordan form of L is determined up to permutation of the Jordan blocks by the knowledge of the r_i = r_i(α) for α ∈ F, and the s_i = s_i(α) are completely determined by L, the Jordan form of L is completely determined up to permutation of Jordan blocks. □
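
The block-counting formulas can be tried directly. The following sketch (not part of the notes; it assumes sympy and a sample matrix) computes s_i(α) = dim Kernel (A − αI)^i and recovers the number r_i of Jordan blocks of each size:

    import sympy as sp

    A = sp.Matrix([
        [2, 1, 0, 0],
        [0, 2, 0, 0],
        [0, 0, 2, 0],
        [0, 0, 0, 3]])
    n, alpha = A.shape[0], 2
    # s[i] = dim Kernel (A - alpha I)^i = n - rank((A - alpha I)^i), with s[0] = 0.
    s = [0] + [n - ((A - alpha * sp.eye(n)) ** i).rank() for i in range(1, n + 1)]
    # r_i = 2 s_i - s_{i-1} - s_{i+1} (reading s_{n+1} as s_n) counts blocks of size i.
    r = {i: 2 * s[i] - s[i - 1] - (s[i + 1] if i < n else s[i]) for i in range(1, n + 1)}
    print(r)                    # {1: 1, 2: 1, 3: 0, 4: 0}: one 1x1 and one 2x2 block
    print(A.jordan_form()[1])   # the Jordan form J agrees with these block counts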

18. Exercises

Suppose that V is a finite dimensional vector space over an algebraically closed field F, and L : V → V is a linear map.

1. If L is nilpotent and not the zero map, show that L is not diagonalizable.
2. Show that L can be written in the form L = D + N where D : V → V is diagonalizable, N : V → V is nilpotent, and DN = ND.
3. Give a formula for the minimal polynomial of L, in terms of its Jordan form.
4. Let p_L(t) be the characteristic polynomial of L, and suppose that it has a factorization

   p_L(t) = ∏_{i=1}^r (t − α_i)^{m_i}

   where α_1, . . . , α_r are distinct. Let f(t) be a polynomial. Express the characteristic polynomial p_{f(L)}(t) as a product of factors of degree 1.


19. Inner Products

For z ∈ C, let \overline{z} denote the complex conjugate of z; z ∈ R if and only if z = \overline{z}.

Suppose that V is a vector space over F, where F = R or F = C. An inner product on V is a (respectively, real or complex) valued function on V × V such that for v_1, v_2, w ∈ V and α_1 ∈ F:

1. < v_1 + v_2, w > = < v_1, w > + < v_2, w >,
2. < α_1v_1, w > = α_1 < v_1, w >,
3. < v, w > = \overline{< w, v >},
4. < w, w > ≥ 0 and < w, w > = 0 if and only if w = 0.

An inner product space is a vector space with an inner product. A complex inner product is often called an Hermitian product.

A real inner product is a positive definite, symmetric bilinear form. However, an Hermitian inner product is not even a bilinear form. In spite of this difference, the theory for real and complex inner products is very similar.

From now on in this section, suppose that V is an inner product space, with an inner product <,>.

For v ∈ V, we define the norm of v by

||v|| = √(< v, v >).

From the definition of an inner product, we obtain the polarization identities. If V is a real inner product space, then for v, w ∈ V, we have

(13) < v, w > = (1/4)||v + w||^2 − (1/4)||v − w||^2.

If V is an Hermitian inner product space, then for v, w ∈ V, we have

(14) < v, w > = (1/4)||v + w||^2 − (1/4)||v − w||^2 + (i/4)||v + iw||^2 − (i/4)||v − iw||^2.
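
A quick numerical check of (14) (an illustration, not from the notes; it assumes numpy), using the standard Hermitian product on C^4, which is linear in the first slot:

    import numpy as np

    rng = np.random.default_rng(0)
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    w = rng.normal(size=4) + 1j * rng.normal(size=4)
    ip = lambda a, b: np.sum(a * np.conj(b))   # <a, b>, conjugate-linear in b
    n2 = lambda a: ip(a, a).real               # ||a||^2
    rhs = (n2(v + w) - n2(v - w) + 1j * n2(v + 1j * w) - 1j * n2(v - 1j * w)) / 4
    print(np.isclose(ip(v, w), rhs))           # True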

A property of inner products that we will use repeatedly is the fact that if x, y ∈ V and < v, x > = < v, y > for all v ∈ V, then x = y.

We say that vectors v, w ∈ V are orthogonal or perpendicular if < v, w > = 0. Suppose that S ⊂ V is a subset. Define

S^⊥ = {v ∈ V | < v, w > = 0 for all w ∈ S}.

Lemma 19.1. Suppose that S is a subset of V. Then

1. S^⊥ is a subspace of V.
2. If U is the subspace Span(S) of V, then U^⊥ = S^⊥.

A set of nonzero vectors {v_1, . . . , v_r} in V is called orthogonal if < v_i, v_j > = 0 whenever i ≠ j.

Lemma 19.2. Suppose that {v_1, . . . , v_r} are nonzero orthogonal vectors in V. Then v_1, . . . , v_r are linearly independent.

A set of vectors {u_1, . . . , u_r} in V is called orthonormal if < u_i, u_j > = 0 if i ≠ j and ||u_i|| = 1 for all i.

Lemma 19.3. Suppose that {u_1, . . . , u_r} are orthonormal vectors in V. Then u_1, . . . , u_r are linearly independent.

A basis of V consisting of orthonormal vectors is called an orthonormal basis.


Lemma 19.4. Let {v_1, . . . , v_s} be a set of nonzero orthogonal vectors in V, and v ∈ V. Let

c_i = < v, v_i > / < v_i, v_i >.

1. We have that

   v − ∑_{i=1}^s c_iv_i ∈ Span{v_1, . . . , v_s}^⊥.

2. Let a_1, . . . , a_s ∈ F. Then

   ||v − ∑_{i=1}^s c_iv_i|| ≤ ||v − ∑_{i=1}^s a_iv_i||

   with equality if and only if a_i = c_i for all i. Thus c_1v_1 + · · · + c_sv_s gives the best approximation to v as a linear combination of v_1, . . . , v_s.

Proof. We first prove 1. For 1 ≤ j ≤ s, we have

< v − ∑_{i=1}^s c_iv_i, v_j > = < v, v_j > − ∑_{i=1}^s c_i < v_i, v_j > = < v, v_j > − c_j < v_j, v_j > = 0.

We now prove 2. We have that

||v − ∑_{i=1}^s a_iv_i||^2 = ||v − ∑_{i=1}^s c_iv_i + ∑_{i=1}^s (c_i − a_i)v_i||^2
                          = ||v − ∑_{i=1}^s c_iv_i||^2 + ||∑_{i=1}^s (c_i − a_i)v_i||^2,

since v − ∑_{i=1}^s c_iv_i ∈ Span{v_1, . . . , v_s}^⊥. □
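
As a small numerical illustration of part 1 (not from the notes; it assumes numpy), in R^3 with the dot product:

    import numpy as np

    v1 = np.array([1.0, 1.0, 0.0])
    v2 = np.array([1.0, -1.0, 0.0])              # v1, v2 are orthogonal
    v = np.array([3.0, 1.0, 2.0])
    c1 = np.dot(v, v1) / np.dot(v1, v1)          # c_i = <v, v_i> / <v_i, v_i>
    c2 = np.dot(v, v2) / np.dot(v2, v2)
    resid = v - c1 * v1 - c2 * v2
    print(np.dot(resid, v1), np.dot(resid, v2))  # both 0.0: resid lies in Span{v1, v2}^perp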

Theorem 19.5. (Gram-Schmidt) Suppose that V is a finite dimensional inner product space. Suppose that {u_1, . . . , u_r} is a set of orthonormal vectors in V. Then {u_1, . . . , u_r} can be extended to an orthonormal basis {u_1, . . . , u_r, u_{r+1}, . . . , u_n} of V.

Proof. First extend u_1, . . . , u_r to a basis {u_1, . . . , u_r, v_{r+1}, . . . , v_n} of V. Inductively define

w_i = v_i − ∑_{j=1}^{i−1} < v_i, u_j > u_j

and

u_i = (1/||w_i||)w_i

for r + 1 ≤ i ≤ n.

Let V_r = Span{u_1, . . . , u_r} and V_i = Span{u_1, . . . , u_r, v_{r+1}, . . . , v_i} for r + 1 ≤ i ≤ n. Then V_i = Span{u_1, . . . , u_i} for r + 1 ≤ i ≤ n. By Lemma 19.4, we have that u_i ∈ (V_{i−1})^⊥ for r + 1 ≤ i ≤ n. Thus {u_1, . . . , u_n} is an orthonormal set of vectors which forms a basis of V. □

Corollary 19.6. Suppose that W is a subspace of V. Then V = W ⊕ W^⊥.

Proof. First construct, using Gram-Schmidt, an orthonormal basis {w_1, . . . , w_s} of W. Now apply Gram-Schmidt again to extend this to an orthonormal basis {w_1, . . . , w_s, u_1, . . . , u_r} of V. Let U = Span{u_1, . . . , u_r}. We have that V = W ⊕ U. It remains to show that U = W^⊥. Let x ∈ U. Then x = ∑_{i=1}^r c_iu_i for some c_i ∈ F. For all j,

< x, w_j > = ∑_{i=1}^r c_i < u_i, w_j > = 0.


Thus x ∈ W^⊥ and U ⊂ W^⊥.

Suppose x ∈ W^⊥. Expand x = ∑_{i=1}^r c_iu_i + ∑_{j=1}^s d_jw_j for some c_i, d_j ∈ F. For all k,

0 = < x, w_k > = ∑_{i=1}^r c_i < u_i, w_k > + ∑_{j=1}^s d_j < w_j, w_k > = d_k.

Thus x ∈ U and W^⊥ ⊂ U. □

We can now give another proof of Theorem 7.5, when F = R.

Corollary 19.7. Suppose that A ∈ M_{m,n}(R). Then the row rank of A is equal to the column rank of A.

Suppose that V, W are inner product spaces (both over R or over C), with inner products <,> and [,] respectively. We will say that a linear map ϕ : V → W is a map of inner product spaces if < v, w > = [ϕ(v), ϕ(w)] for all v, w ∈ V.

Corollary 19.8. Suppose that V is a real inner product space of dimension n. Consider R^n as an inner product space with the dot product. Then there is an isomorphism ϕ : V → R^n of inner product spaces.

Corollary 19.9. Suppose that V is an Hermitian inner product space of dimension n. Consider C^n as an inner product space with the standard Hermitian product, [v, w] = v^t\overline{w} for v, w ∈ C^n. Then there is an isomorphism ϕ : V → C^n of inner product spaces.

Proof. Let {v_1, . . . , v_n} be an orthonormal basis of V. Let {e_1, . . . , e_n} be the standard basis of C^n. We can thus define a linear map ϕ : V → C^n by ϕ(v) = a_1e_1 + · · · + a_ne_n if v = a_1v_1 + · · · + a_nv_n with a_1, . . . , a_n ∈ C. Since {e_1, . . . , e_n} is a basis of C^n, ϕ is an isomorphism. We have < v_i, v_j > = δ_{ij} and [ϕ(v_i), ϕ(v_j)] = [e_i, e_j] = δ_{ij} for 1 ≤ i, j ≤ n.

Suppose that v = a_1v_1 + · · · + a_nv_n ∈ V and w = b_1v_1 + · · · + b_nv_n ∈ V. Then

< v, w > = ∑_{i,j=1}^n a_i\overline{b_j}δ_{ij} = ∑_{i=1}^n a_i\overline{b_i}.

We also calculate

[ϕ(v), ϕ(w)] = [∑_{i=1}^n a_ie_i, ∑_{j=1}^n b_je_j] = ∑_{i,j=1}^n a_i\overline{b_j}δ_{ij} = ∑_{i=1}^n a_i\overline{b_i}.

Thus < v, w > = [ϕ(v), ϕ(w)] for all v, w ∈ V. □

20. Symmetric, Hermitian, Orthogonal and Unitary Operators

Throughout this section, assume that V is a finite dimensional inner product space. Let <,> be the inner product. Suppose that L : V → V is a linear map.

Recall that the dual space of V is the vector space V^∗ = L_F(V, F), and dim V^∗ = dim V.

Lemma 20.1. Suppose that z ∈ V. Then the map ϕ_z : V → F defined by ϕ_z(v) = < v, z > is a linear map. Suppose that ψ ∈ V^∗. Then there exists a unique z ∈ V such that ψ = ϕ_z.

Proof. Let {v_1, . . . , v_n} be an orthonormal basis of V with dual basis {v_1^∗, . . . , v_n^∗}. Suppose that ψ ∈ V^∗. Then there exist (unique) c_1, . . . , c_n ∈ F such that ψ = c_1v_1^∗ + · · · + c_nv_n^∗, so that ψ(v_i) = c_i for 1 ≤ i ≤ n. Let z = \overline{c_1}v_1 + · · · + \overline{c_n}v_n ∈ V. Then

ϕ_z(v_i) = < v_i, z > = c_1 < v_i, v_1 > + · · · + c_n < v_i, v_n > = c_i = ψ(v_i)

for 1 ≤ i ≤ n. Since {v_1, . . . , v_n} is a basis of V, we have that ψ = ϕ_z. Suppose that w ∈ V and ϕ_w = ψ. Expand w = d_1v_1 + · · · + d_nv_n. We have that

c_i = ϕ_w(v_i) = < v_i, w > = \overline{d_i}

for 1 ≤ i ≤ n. Thus d_i = \overline{c_i} for 1 ≤ i ≤ n, and w = z. □

Theorem 20.2. There exists a unique linear map L^∗ : V → V such that < Lv, w > = < v, L^∗w > for all v, w ∈ V.

Proof. For fixed w ∈ V, the map ψ_w : V → F defined by ψ_w(v) = < L(v), w > is a linear map. Hence ψ_w ∈ V^∗. By Lemma 20.1, there exists a unique vector z_w ∈ V such that ψ_w = ϕ_{z_w} (where ϕ_{z_w}(v) = < v, z_w > for v ∈ V). Thus we have a function L^∗ : V → V defined by L^∗(w) = z_w for w ∈ V.

We will now verify that L^∗ is linear. Suppose w_1, w_2 ∈ V. For v ∈ V,

< v, L^∗(w_1 + w_2) > = < L(v), w_1 + w_2 > = < L(v), w_1 > + < L(v), w_2 >
                      = < v, L^∗(w_1) > + < v, L^∗(w_2) > = < v, L^∗(w_1) + L^∗(w_2) >.

Since this identity holds for all v ∈ V, we have that L^∗(w_1 + w_2) = L^∗(w_1) + L^∗(w_2). Suppose w ∈ V and c ∈ F. Then

< v, L^∗(cw) > = < L(v), cw > = \overline{c} < L(v), w > = \overline{c} < v, L^∗(w) > = < v, cL^∗(w) >.

Since this identity holds for all v ∈ V, we have that L^∗(cw) = cL^∗(w). Thus L^∗ is linear.

Now we prove uniqueness. Suppose that T : V → V is a linear map such that < L(v), w > = < v, T(w) > for all v, w ∈ V. Then for any w ∈ V, < v, T(w) > = < v, L^∗(w) > for all v ∈ V. Thus T(w) = L^∗(w), and L^∗ is unique. □

L∗ is called the adjoint of L.

Definition 20.3. L is called self adjoint if L∗ = L.

If V is a real inner product space, a self adjoint operator is called symmetric, and we sometimes write L^∗ = L^t. If V is an Hermitian inner product space, a self adjoint operator is called Hermitian.

Lemma 20.4. Let V = R^n with the standard inner product. Suppose that L = L_A : R^n → R^n for some A ∈ M_{n,n}(R). Then L^∗ = L_{A^t}.

Lemma 20.5. Let V = C^n with the standard Hermitian inner product. Suppose that L = L_A : C^n → C^n for some A ∈ M_{n,n}(C). Then L^∗ = L_{\overline{A}^t}.

Proof. Let {e_1, . . . , e_n} be the standard basis of C^n. Let B ∈ M_{n,n}(C) be the unique matrix such that L^∗ = L_B. Write A = (a_{ij}) and B = (b_{ij}). For 1 ≤ i, j ≤ n, we have

< Le_i, e_j > = < e_i, L^∗e_j > = < e_i, Be_j > = e_i^t\overline{Be_j} = e_i^t\overline{B}e_j = \overline{b_{ij}}.

We also calculate

< Le_i, e_j > = < Ae_i, e_j > = (Ae_i)^t\overline{e_j} = e_i^tA^te_j = a_{ji}.

Thus \overline{b_{ij}} = a_{ji}, so b_{ij} = \overline{a_{ji}} and B = \overline{A}^t. □
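
A numerical check of Lemmas 20.4 and 20.5 (not from the notes; it assumes numpy): for the standard Hermitian product on C^n, the adjoint of L_A is given by the conjugate transpose of A.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    v = rng.normal(size=3) + 1j * rng.normal(size=3)
    w = rng.normal(size=3) + 1j * rng.normal(size=3)
    ip = lambda a, b: np.sum(a * np.conj(b))      # <a, b>, linear in the first slot
    print(np.isclose(ip(A @ v, w), ip(v, np.conj(A).T @ w)))  # True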

Lemma 20.6. The following are equivalent:

1. LL^∗ = I.
2. ||Lv|| = ||v|| for all v ∈ V.
3. < Lv, Lw > = < v, w > for all v, w ∈ V.


Proof. Suppose that 1. holds, so that LL^∗ = I. Then L^∗ is a right inverse of L, so that L is invertible with inverse L^∗. Thus L^∗L = I. Suppose that v ∈ V. Then

||Lv||^2 = < Lv, Lv > = < v, L^∗Lv > = < v, v > = ||v||^2,

and thus 2. holds.

Now suppose that 2. holds. Then 3. holds by the appropriate polarization identity (13) or (14).

Finally suppose that 3. holds. For fixed w ∈ V, we have < v, w > = < Lv, Lw > = < v, L^∗Lw > for all v ∈ V. Thus w = L^∗Lw. Since this holds for all w ∈ V, L^∗L = I, so that L is an isomorphism with LL^∗ = I, and 1. holds. □

The above Lemma motivates the following definition.

Definition 20.7. L : V → V is an isometry if LL^∗ = I, where I is the identity map of V.

If V is a real inner product space, then an isometry is called an orthogonal transformation (or real unitary). If V is an Hermitian inner product space, then an isometry is called a unitary transformation.

Lemma 20.8. Let {v_1, . . . , v_n} be an orthonormal basis of V. Then L is an isometry if and only if {Lv_1, . . . , Lv_n} is an orthonormal basis of V.

Theorem 20.9. If W is an L-invariant subspace of V and L = L^∗, then W^⊥ is an L-invariant subspace of V.

Proof. Let v ∈ W^⊥. For all w ∈ W we have < Lv, w > = < v, L^∗w > = < v, Lw > = 0, since Lw ∈ W. Hence Lv ∈ W^⊥. □

Lemma 20.10. Suppose that V is an Hermitian inner product space and L is Hermitian. Suppose that v ∈ V. Then < Lv, v > ∈ R.

Proof. We have < Lv, v > = < v, L^∗v > = < v, Lv > = \overline{< Lv, v >}. Thus < Lv, v > ∈ R. □

21. The Spectral Theorem

Lemma 21.1. Let A ∈ M_{n,n}(R) be symmetric. Then A has a real eigenvalue.

Proof. Regarding A as a matrix in M_{n,n}(C), we have that A is Hermitian. Now A has a complex eigenvalue λ, with an eigenvector v ∈ C^n. Let <,> be the standard Hermitian product on C^n. We have that A = A^t = \overline{A}^t, so L_A : C^n → C^n is Hermitian, by Lemma 20.5. By Lemma 20.10, we then have that < Av, v > = λ< v, v > is real. Since < v, v > is real and nonzero, we have that λ is real. □

Theorem 21.2. Suppose that V is a finite dimensional real vector space with an inner product. Suppose that L : V → V is a symmetric linear map. Then L has an eigenvector.

Proof. Let β = {v_1, . . . , v_n} be an orthonormal basis of V. We have an isomorphism Φ : V → R^n defined by Φ(v) = (v)_β for v ∈ V. Let [,] be the dot product on R^n. Let A = M^β_β(L) ∈ M_{n,n}(R). We have

< v, w > = [(v)_β, (w)_β]

for v, w ∈ V. For v, w ∈ V, we calculate

< L(v), w > = [(L(v))_β, (w)_β] = [A(v)_β, (w)_β] = [(v)_β, A^t(w)_β]

and

< L(v), w > = < v, L(w) > = [(v)_β, A(w)_β].

Since [(v)_β, A^t(w)_β] = [(v)_β, A(w)_β] for all v, w ∈ V, we have that A = A^t. Thus A has a real eigenvalue λ by Lemma 21.1, and λ necessarily has a real eigenvector x ∈ R^n. Let y = Φ^{−1}(x) ∈ V. Then

(L(y))_β = A(y)_β = Ax = λx = (λy)_β.

Thus L(y) = λy. Since y is nonzero, as x is nonzero, we have that y is an eigenvector of L. □

Theorem 21.3. Suppose that V is a finite dimensional real vector space with an inner product. Suppose that L : V → V is a symmetric linear map. Then V has an orthonormal basis consisting of eigenvectors of L.

Proof. We prove the Theorem by induction on n = dim V. By Theorem 21.2 there exists an eigenvector v ∈ V for L. Let W be the one dimensional subspace of V generated by v. W is L-invariant. By Theorem 20.9, W^⊥ is also L-invariant. We have that V = W ⊕ W^⊥, so dim W^⊥ = dim V − 1. The restriction of L to W^⊥ is a symmetric linear map of W^⊥ to itself. By induction on n, there exists an orthonormal basis {u_2, . . . , u_n} of W^⊥ consisting of eigenvectors of L. Thus {u_1 = (1/||v||)v, u_2, . . . , u_n} is an orthonormal basis of V consisting of eigenvectors of L. □

Corollary 21.4. Let A ∈ M_{n,n}(R) be symmetric. Then there exists an orthogonal matrix P ∈ M_{n,n}(R) such that P^tAP = P^{−1}AP is a diagonal matrix.

Proof. R^n has an orthonormal basis β = {u_1, . . . , u_n} of eigenvectors of L_A : R^n → R^n by Lemma 20.4 and Theorem 21.3. Let λ_1, . . . , λ_n be the respective eigenvalues. Then M^β_β(L_A) is the diagonal matrix D with λ_1, λ_2, . . . , λ_n as diagonal elements. Let β_{st} = {e_1, . . . , e_n} be the standard basis of R^n. Let P = M^β_{β_{st}}(I) = (u_1, . . . , u_n), so that P^{−1} = M^{β_{st}}_β(I). L_P is an isometry by Lemma 20.8, since Pe_i = u_i for 1 ≤ i ≤ n. Thus P is orthogonal and P^{−1} = P^t by Lemmas 20.4 and 20.6. Then

D = M^β_β(L_A) = M^{β_{st}}_β(I) M^{β_{st}}_{β_{st}}(L_A) M^β_{β_{st}}(I) = P^{−1}AP = P^tAP. □
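
A quick numerical illustration of Corollary 21.4 (not part of the notes; it assumes numpy): numpy's eigh returns exactly such an orthogonal P for a symmetric matrix.

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.normal(size=(4, 4))
    A = (B + B.T) / 2                      # a random symmetric matrix
    lam, P = np.linalg.eigh(A)             # columns of P: orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(4)))         # P is orthogonal
    print(np.allclose(P.T @ A @ P, np.diag(lam)))  # P^t A P is diagonal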

The proof of Theorem 21.3 also proves the following theorem.

Theorem 21.5. Suppose that V is a finite dimensional complex vector space with an Hermitian inner product. Suppose that L : V → V is a Hermitian linear map. Then V has an orthonormal basis consisting of eigenvectors of L. All eigenvalues of L are real.

Corollary 21.6. Let A ∈ M_{n,n}(C) be Hermitian (A = \overline{A}^t). Then there exists a unitary matrix U ∈ M_{n,n}(C) such that \overline{U}^tAU = U^{−1}AU is a real diagonal matrix.
