Linear Algebra Cambridge Mathematical Tripos Part IB


    LINEAR ALGEBRA

    SIMON WADSLEY

    Contents

1. Vector spaces
1.1. Definitions and examples
1.2. Linear independence, bases and the Steinitz exchange lemma
1.3. Direct sum
2. Linear maps
2.1. Definitions and examples
2.2. Linear maps and matrices
2.3. The first isomorphism theorem and the rank-nullity theorem
2.4. Change of basis
2.5. Elementary matrix operations
3. Duality
3.1. Dual spaces
3.2. Dual maps
4. Bilinear Forms (I)
5. Determinants of matrices
6. Endomorphisms
6.1. Invariants
6.2. Minimal polynomials
6.3. The Cayley-Hamilton Theorem
6.4. Multiplicities of eigenvalues and Jordan Normal Form
7. Bilinear forms (II)
7.1. Symmetric bilinear forms and quadratic forms
7.2. Hermitian forms
8. Inner product spaces
8.1. Definitions and basic properties
8.2. Gram-Schmidt orthogonalisation
8.3. Adjoints
8.4. Spectral theory

    Date: Michaelmas 2015.


    Lecture 1

    1. Vector spaces

Linear algebra can be summarised as the study of vector spaces and linear maps between them. This is a second 'first course' in Linear Algebra. That is to say, we will define everything we use but will assume some familiarity with the concepts (picked up from the IA course Vectors & Matrices for example).

    1.1. Definitions and examples.

    Examples.

(1) For each non-negative integer $n$, the set $\mathbb{R}^n$ of column vectors of length $n$ with real entries is a vector space (over $\mathbb{R}$). An $(m \times n)$-matrix $A$ with real entries can be viewed as a linear map $\mathbb{R}^n \to \mathbb{R}^m$ via $v \mapsto Av$. In fact, as we will see, every linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is of this form. This is the motivating example and can be used for intuition throughout this course. However, it comes with a specified system of co-ordinates given by taking the various entries of the column vectors. A substantial difference between this course and Vectors & Matrices is that we will work with vector spaces without a specified set of co-ordinates. We will see a number of advantages to this approach as we go.

(2) Let $X$ be a set and $\mathbb{R}^X := \{f : X \to \mathbb{R}\}$ be equipped with an addition given by $(f+g)(x) := f(x) + g(x)$ and a multiplication by scalars (in $\mathbb{R}$) given by $(\lambda f)(x) := \lambda (f(x))$. Then $\mathbb{R}^X$ is a vector space (over $\mathbb{R}$), in some contexts called the space of scalar fields on $X$. More generally, if $V$ is a vector space over $\mathbb{R}$ then $V^X = \{f : X \to V\}$ is a vector space in a similar manner, a space of vector fields on $X$.

(3) If $[a, b]$ is a closed interval in $\mathbb{R}$ then $C([a,b], \mathbb{R}) := \{f \in \mathbb{R}^{[a,b]} \mid f \text{ is continuous}\}$ is an $\mathbb{R}$-vector space by restricting the operations on $\mathbb{R}^{[a,b]}$. Similarly
$$C^\infty([a,b], \mathbb{R}) := \{f \in C([a,b], \mathbb{R}) \mid f \text{ is infinitely differentiable}\}$$
is an $\mathbb{R}$-vector space.

(4) The set of $(m \times n)$-matrices with real entries is a vector space over $\mathbb{R}$.

Notation. We will use $\mathbb{F}$ to denote an arbitrary field. However the schedules only require consideration of $\mathbb{R}$ and $\mathbb{C}$ in this course. If you prefer you may understand $\mathbb{F}$ to always denote either $\mathbb{R}$ or $\mathbb{C}$ (and the examiners must take this view).

What do our examples of vector spaces above have in common? In each case we have a notion of addition of vectors and scalar multiplication of vectors by elements in $\mathbb{R}$.

Definition. An $\mathbb{F}$-vector space is an abelian group $(V, +)$ equipped with a function $\mathbb{F} \times V \to V$; $(\lambda, v) \mapsto \lambda v$ such that

(i) $\lambda(\mu v) = (\lambda\mu)v$ for all $\lambda, \mu \in \mathbb{F}$ and $v \in V$;
(ii) $\lambda(u + v) = \lambda u + \lambda v$ for all $\lambda \in \mathbb{F}$ and $u, v \in V$;
(iii) $(\lambda + \mu)v = \lambda v + \mu v$ for all $\lambda, \mu \in \mathbb{F}$ and $v \in V$;
(iv) $1v = v$ for all $v \in V$.

Note that this means that we can add, subtract and rescale elements in a vector space and these operations behave in the ways that we are used to. Note also that in general a vector space does not come equipped with a co-ordinate system, or notions of length, volume or angle. We will discuss how to recover these later in the course. At that point particular properties of the field $\mathbb{F}$ will be important.

Convention. We will always write $0$ to denote the additive identity of a vector space $V$. By slight abuse of notation we will also write $0$ to denote the vector space $\{0\}$.

Exercise.

(1) Convince yourself that all the vector spaces mentioned thus far do indeed satisfy the axioms for a vector space.

(2) Show that for any $v$ in any vector space $V$, $0v = 0$ and $(-1)v = -v$.

Definition. Suppose that $V$ is a vector space over $\mathbb{F}$. A subset $U \subseteq V$ is an ($\mathbb{F}$-linear) subspace if

(i) for all $u_1, u_2 \in U$, $u_1 + u_2 \in U$;
(ii) for all $\lambda \in \mathbb{F}$ and $u \in U$, $\lambda u \in U$;
(iii) $0 \in U$.

Remarks.

(1) It is straightforward to see that $U \subseteq V$ is a subspace if and only if $U \neq \emptyset$ and $\lambda u_1 + \mu u_2 \in U$ for all $u_1, u_2 \in U$ and $\lambda, \mu \in \mathbb{F}$.

(2) If $U$ is a subspace of $V$ then $U$ is a vector space under the inherited operations.

    Examples.

(1) $\left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in \mathbb{R}^3 : x_1 + x_2 + x_3 = t \right\}$ is a subspace of $\mathbb{R}^3$ if and only if $t = 0$.

(2) Let $X$ be a set. We define the support of a function $f : X \to \mathbb{F}$ to be
$$\operatorname{supp} f := \{x \in X : f(x) \neq 0\}.$$
Then $\{f \in \mathbb{F}^X : |\operatorname{supp} f| < \infty\}$ is a subspace of $\mathbb{F}^X$.


for $V$. So suppose that $v_1 + U = v_2 + U$. Then $(v_1 - v_2) \in U$ and so $\lambda v_1 - \lambda v_2 = \lambda(v_1 - v_2) \in U$ for each $\lambda \in \mathbb{F}$ since $U$ is a subspace. Thus $\lambda v_1 + U = \lambda v_2 + U$ as required. $\square$

    Lecture 2

    1.2. Linear independence, bases and the Steinitz exchange lemma.

Definition. Let $V$ be a vector space over $\mathbb{F}$ and $S \subseteq V$ a subset of $V$. Then the span of $S$ in $V$ is the set of all finite $\mathbb{F}$-linear combinations of elements of $S$,
$$\langle S \rangle := \left\{ \sum_{i=1}^{n} \lambda_i s_i : \lambda_i \in \mathbb{F},\ s_i \in S,\ n \geq 0 \right\}.$$

Remark. For any subset $S \subseteq V$, $\langle S \rangle$ is the smallest subspace of $V$ containing $S$.

Example. Suppose that $V$ is $\mathbb{R}^3$.

If $S = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} \right\}$ then $\langle S \rangle = \left\{ \begin{pmatrix} a \\ b \\ b \end{pmatrix} : a, b \in \mathbb{R} \right\}$.

Note also that every subset of $S$ of order $2$ has the same span as $S$.

Example. Let $X$ be a set and for each $x \in X$, define $\delta_x : X \to \mathbb{F}$ by
$$\delta_x(y) = \begin{cases} 1 & \text{if } y = x \\ 0 & \text{if } y \neq x. \end{cases}$$
Then $\langle \delta_x : x \in X \rangle = \{f \in \mathbb{F}^X : |\operatorname{supp} f| < \infty\}$.


Convention. The span of the empty set is the zero subspace $0$. Thus the empty set is a basis of $0$. One may consider this to not be so much a convention as the only reasonable interpretation of the definitions of span, linearly independent and basis in this case.

Lemma. A subset $S$ of a vector space $V$ over $\mathbb{F}$ is linearly dependent if and only if there exist $s_0, s_1, \ldots, s_n \in S$ distinct and $\lambda_1, \ldots, \lambda_n \in \mathbb{F}$ such that $s_0 = \sum_{i=1}^{n} \lambda_i s_i$.

Proof. Suppose that $S$ is linearly dependent so that $\sum \lambda_i s_i = 0$ for some $s_i \in S$ distinct and $\lambda_i \in \mathbb{F}$ with $\lambda_j \neq 0$ say. Then
$$s_j = \sum_{i \neq j} \frac{-\lambda_i}{\lambda_j} s_i.$$
Conversely, if $s_0 = \sum_{i=1}^{n} \lambda_i s_i$ then $(-1)s_0 + \sum_{i=1}^{n} \lambda_i s_i = 0$. $\square$

Proposition. Let $V$ be a vector space over $\mathbb{F}$. Then $\{e_1, \ldots, e_n\}$ is a basis for $V$ if and only if every element $v \in V$ can be written uniquely as $v = \sum_{i=1}^{n} \lambda_i e_i$ with $\lambda_i \in \mathbb{F}$.

Proof. First we observe that by definition $\{e_1, \ldots, e_n\}$ spans $V$ if and only if every element $v$ of $V$ can be written in at least one way as $v = \sum \lambda_i e_i$ with $\lambda_i \in \mathbb{F}$. So it suffices to show that $\{e_1, \ldots, e_n\}$ is linearly independent if and only if there is at most one such expression for every $v \in V$.

Suppose that $\{e_1, \ldots, e_n\}$ is linearly independent and $v = \sum \lambda_i e_i = \sum \mu_i e_i$ with $\lambda_i, \mu_i \in \mathbb{F}$. Then $\sum (\lambda_i - \mu_i) e_i = 0$. Thus by definition of linear independence, $\lambda_i - \mu_i = 0$ for $i = 1, \ldots, n$ and so $\lambda_i = \mu_i$ for all $i$.

Conversely if $\{e_1, \ldots, e_n\}$ is linearly dependent then we can write $\sum \lambda_i e_i = 0 = \sum 0 e_i$ for some $\lambda_i \in \mathbb{F}$ not all zero. Thus there are two ways to write $0$ as an $\mathbb{F}$-linear combination of the $e_i$. $\square$

    The following result is necessary for a good notion of dimension for vector spaces.

Theorem (Steinitz exchange lemma). Let $V$ be a vector space over $\mathbb{F}$. Suppose that $S = \{e_1, \ldots, e_n\}$ is a linearly independent subset of $V$ and $T \subseteq V$ spans $V$. Then there is a subset $T'$ of $T$ of order $n$ such that $(T \setminus T') \cup S$ spans $V$. In particular $n \leq |T|$.

This is sometimes stated as follows (with the assumption that $T$ is finite).

Corollary. If $\{e_1, \ldots, e_n\} \subseteq V$ is linearly independent and $\{f_1, \ldots, f_m\}$ spans $V$, then $n \leq m$ and, possibly after reordering the $f_i$, $\{e_1, \ldots, e_n, f_{n+1}, \ldots, f_m\}$ spans $V$.

We prove the theorem by replacing elements of $T$ by elements of $S$ one by one.

Proof of the Theorem. Suppose that we've already found a subset $T_r$ of $T$ of order $0 \leq r < n$ such that $T'_r := (T \setminus T_r) \cup \{e_1, \ldots, e_r\}$ spans $V$. Then we can write
$$e_{r+1} = \sum_{i=1}^{k} \lambda_i t_i$$
with $\lambda_i \in \mathbb{F}$ and $t_i \in T'_r$. Since $\{e_1, \ldots, e_{r+1}\}$ is linearly independent there must be some $1 \leq j \leq k$ such that $\lambda_j \neq 0$ and $t_j \notin \{e_1, \ldots, e_r\}$. Let $T_{r+1} = T_r \cup \{t_j\}$ and
$$T'_{r+1} = (T \setminus T_{r+1}) \cup \{e_1, \ldots, e_{r+1}\} = (T'_r \setminus \{t_j\}) \cup \{e_{r+1}\}.$$
Now
$$t_j = \frac{1}{\lambda_j} e_{r+1} - \sum_{i \neq j} \frac{\lambda_i}{\lambda_j} t_i,$$
so $t_j \in \langle T'_{r+1} \rangle$ and $\langle T'_{r+1} \rangle = \langle T'_{r+1} \cup \{t_j\} \rangle \supseteq \langle T'_r \rangle = V$.

Now we can inductively construct $T' = T_n$ with the required properties. $\square$

    Lecture 3

Corollary. Let $V$ be a vector space with a basis of order $n$.

(a) Every basis of $V$ has order $n$.
(b) Any $n$ LI vectors in $V$ form a basis for $V$.
(c) Any $n$ vectors in $V$ that span $V$ form a basis for $V$.
(d) Any set of linearly independent vectors in $V$ can be extended to a basis for $V$.
(e) Any finite spanning set in $V$ contains a basis for $V$.

Proof. Suppose that $S = \{e_1, \ldots, e_n\}$ is a basis for $V$.

(a) Suppose that $T$ is another basis of $V$. Since $S$ spans $V$ and any finite subset of $T$ is linearly independent, $|T| \leq n$. Since $T$ spans and $S$ is linearly independent, $|T| \geq n$. Thus $|T| = n$ as required.

(b) Suppose $T$ is a LI subset of $V$ of order $n$. If $T$ did not span we could choose $v \in V \setminus \langle T \rangle$. Then $T \cup \{v\}$ is a LI subset of $V$ of order $n+1$, a contradiction.

(c) Suppose $T$ spans $V$ and has order $n$. If $T$ were LD we could find $t_0, t_1, \ldots, t_m \in T$ distinct such that $t_0 = \sum_{i=1}^{m} \lambda_i t_i$ for some $\lambda_i \in \mathbb{F}$. Thus $V = \langle T \rangle = \langle T \setminus \{t_0\} \rangle$, so $T \setminus \{t_0\}$ is a spanning set for $V$ of order $n - 1$, a contradiction.

(d) Let $T = \{t_1, \ldots, t_m\}$ be a linearly independent subset of $V$. Since $S$ spans $V$ we can find $s_1, \ldots, s_m$ in $S$ such that $(S \setminus \{s_1, \ldots, s_m\}) \cup T$ spans $V$. Since this set has order (at most) $n$ it is a basis containing $T$.

(e) Suppose that $T$ is a finite spanning set for $V$ and let $T' \subseteq T$ be a subset of minimal size that still spans $V$. If $|T'| = n$ we're done by (c). Otherwise $|T'| > n$ and so $T'$ is LD as $S$ spans. Thus there are $t_0, \ldots, t_m$ in $T'$ distinct such that $t_0 = \sum \lambda_i t_i$ for some $\lambda_i \in \mathbb{F}$. Then $V = \langle T' \rangle = \langle T' \setminus \{t_0\} \rangle$, contradicting the minimality of $T'$. $\square$

Exercise. Prove that (e) holds for any spanning set in a f.d. $V$.

Definition. If a vector space $V$ over $\mathbb{F}$ has a finite basis $S$ then we say that $V$ is finite dimensional (or f.d.). Moreover, we define the dimension of $V$ by
$$\dim_{\mathbb{F}} V = \dim V = |S|.$$
If $V$ does not have a finite basis then we will say that $V$ is infinite dimensional.

Remarks.
(1) By the last corollary the dimension of a finite dimensional space $V$ does not depend on the choice of basis $S$. However the dimension does depend on $\mathbb{F}$. For example $\mathbb{C}$ has dimension $1$ viewed as a vector space over $\mathbb{C}$ (since $\{1\}$ is a basis) but dimension $2$ viewed as a vector space over $\mathbb{R}$ (since $\{1, i\}$ is a basis).


(2) If we wanted to be more precise then we could define the dimension of an infinite dimensional space to be the cardinality of any basis for $V$. But we have not proven enough to see that this would be well-defined; in fact there are no problems.

Lemma. If $V$ is f.d. and $U \subsetneq V$ is a proper subspace then $U$ is also f.d. Moreover, $\dim U < \dim V$.


Proof. Let $\{u_1, \ldots, u_m\}$ be a basis for $U$ and extend to a basis $\{u_1, \ldots, u_m, v_{m+1}, \ldots, v_n\}$ for $V$. It suffices to show that $S := \{v_{m+1} + U, \ldots, v_n + U\}$ is a basis for $V/U$.

Suppose that $v + U \in V/U$. Then we can write
$$v = \sum_{i=1}^{m} \lambda_i u_i + \sum_{j=m+1}^{n} \mu_j v_j.$$
Thus
$$v + U = \sum_{i=1}^{m} \lambda_i (u_i + U) + \sum_{j=m+1}^{n} \mu_j (v_j + U) = 0 + \sum_{j=m+1}^{n} \mu_j (v_j + U)$$
and so $S$ spans $V/U$. To show that $S$ is LI, suppose that
$$\sum_{j=m+1}^{n} \mu_j (v_j + U) = 0.$$
Then $\sum_{j=m+1}^{n} \mu_j v_j \in U$ so we can write $\sum_{j=m+1}^{n} \mu_j v_j = \sum_{i=1}^{m} \lambda_i u_i$ for some $\lambda_i \in \mathbb{F}$. Since the set $\{u_1, \ldots, u_m, v_{m+1}, \ldots, v_n\}$ is LI we deduce that each $\mu_j$ (and $\lambda_i$) is zero as required. $\square$

    Lecture 4

1.3. Direct sum. There are two related notions of direct sum of vector spaces and the distinction between them can often cause confusion to newcomers to the subject. The first is sometimes known as the internal direct sum and the second as the external direct sum. However it is common to gloss over the difference between them.

Definition. Suppose that $V$ is a vector space over $\mathbb{F}$ and $U$ and $W$ are subspaces of $V$. Recall that the sum of $U$ and $W$ is defined to be $U + W = \{u + w : u \in U, w \in W\}$.

We say that $V$ is the (internal) direct sum of $U$ and $W$, written $V = U \oplus W$, if $V = U + W$ and $U \cap W = 0$. Equivalently $V = U \oplus W$ if every element $v \in V$ can be written uniquely as $u + w$ with $u \in U$ and $w \in W$.

We also say that $U$ and $W$ are complementary subspaces in $V$.

Example. Suppose that $V = \mathbb{R}^3$ and
$$U = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} : x_1 + x_2 + x_3 = 0 \right\}, \quad W_1 = \left\langle \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\rangle \quad \text{and} \quad W_2 = \left\langle \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \right\rangle;$$
then $V = U \oplus W_1 = U \oplus W_2$.

Note in particular that $U$ does not have only one complementary subspace in $V$.

Definition. Given any two vector spaces $U$ and $W$ over $\mathbb{F}$ the (external) direct sum $U \oplus W$ of $U$ and $W$ is defined to be the set of pairs
$$\{(u, w) : u \in U, w \in W\}$$
with addition given by
$$(u_1, w_1) + (u_2, w_2) = (u_1 + u_2, w_1 + w_2)$$
and scalar multiplication given by
$$\lambda(u, w) = (\lambda u, \lambda w).$$

Exercise. Show that $U \oplus W$ is a vector space over $\mathbb{F}$ with the given operations and that it is the internal direct sum of its subspaces $\{(u, 0) : u \in U\}$ and $\{(0, w) : w \in W\}$.

More generally we can make the following definitions.

Definition. If $U_1, \ldots, U_n$ are subspaces of $V$ then $V$ is the (internal) direct sum of $U_1, \ldots, U_n$, written
$$V = U_1 \oplus \cdots \oplus U_n = \bigoplus_{i=1}^{n} U_i,$$
if every element $v$ of $V$ can be written uniquely as $v = \sum_{i=1}^{n} u_i$ with $u_i \in U_i$.

Definition. If $U_1, \ldots, U_n$ are any vector spaces over $\mathbb{F}$ their (external) direct sum is the vector space
$$\bigoplus_{i=1}^{n} U_i := \{(u_1, \ldots, u_n) \mid u_i \in U_i\}$$
with natural coordinate-wise operations.

From now on we will drop the adjectives 'internal' and 'external' from 'direct sum'.

    2. Linear maps

    2.1. Definitions and examples.

Definition. Suppose that $U$ and $V$ are vector spaces over a field $\mathbb{F}$. Then a function $\phi : U \to V$ is a linear map if

(i) $\phi(u_1 + u_2) = \phi(u_1) + \phi(u_2)$ for all $u_1, u_2 \in U$;
(ii) $\phi(\lambda u) = \lambda \phi(u)$ for all $u \in U$ and $\lambda \in \mathbb{F}$.

Notation. We write $L(U, V)$ for the set of linear maps $U \to V$.

Remarks.

(1) We can combine the two parts of the definition into one as: $\phi$ is linear if and only if $\phi(\lambda u_1 + \mu u_2) = \lambda \phi(u_1) + \mu \phi(u_2)$ for all $\lambda, \mu \in \mathbb{F}$ and $u_1, u_2 \in U$. Linear maps should be viewed as functions between vector spaces that respect their structure as vector spaces.

(2) If $\phi$ is a linear map then $\phi$ is a homomorphism of the underlying abelian groups. In particular $\phi(0) = 0$.

(3) If we want to stress the field $\mathbb{F}$ then we will say a map is $\mathbb{F}$-linear. For example, complex conjugation defines an $\mathbb{R}$-linear map from $\mathbb{C}$ to $\mathbb{C}$ but it is not $\mathbb{C}$-linear.

Examples.
(1) Let $A$ be an $n \times m$ matrix with coefficients in $\mathbb{F}$; write $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$. Then $\phi : \mathbb{F}^m \to \mathbb{F}^n$; $\phi(v) = Av$ is a linear map.


To see this let $\lambda, \mu \in \mathbb{F}$ and $u, v \in \mathbb{F}^m$. As usual, let $A_{ij}$ denote the $ij$th entry of $A$ and $u_j$ (resp. $v_j$) the $j$th coordinate of $u$ (resp. $v$). Then for $1 \leq i \leq n$,
$$(\phi(\lambda u + \mu v))_i = \sum_{j=1}^{m} A_{ij}(\lambda u_j + \mu v_j) = \lambda(\phi(u))_i + \mu(\phi(v))_i$$
so $\phi(\lambda u + \mu v) = \lambda\phi(u) + \mu\phi(v)$ as required.

(2) If $X$ is any set and $g \in \mathbb{F}^X$ then $m_g : \mathbb{F}^X \to \mathbb{F}^X$; $m_g(f)(x) := g(x)f(x)$ for $x \in X$ is linear.

(3) For all $x \in [a, b]$, $\delta_x : C([a,b], \mathbb{R}) \to \mathbb{R}$; $f \mapsto f(x)$ is linear.

(4) $I : C([a,b], \mathbb{R}) \to C([a,b], \mathbb{R})$; $I(f)(x) = \int_a^x f(t)\,dt$ is linear.

(5) $D : C^\infty([a,b], \mathbb{R}) \to C^\infty([a,b], \mathbb{R})$; $(Df)(t) = f'(t)$ is linear.

(6) If $\phi, \psi : U \to V$ are linear and $\lambda \in \mathbb{F}$ then $\phi + \psi : U \to V$ given by $(\phi + \psi)(u) = \phi(u) + \psi(u)$ and $\lambda\phi : U \to V$ given by $(\lambda\phi)(u) = \lambda(\phi(u))$ are linear. In this way $L(U, V)$ is a vector space over $\mathbb{F}$.

Definition. We say that a linear map $\phi : U \to V$ is an isomorphism if there is a linear map $\psi : V \to U$ such that $\psi\phi = \operatorname{id}_U$ and $\phi\psi = \operatorname{id}_V$.

Lemma. Suppose that $U$ and $V$ are vector spaces over $\mathbb{F}$. A linear map $\phi : U \to V$ is an isomorphism if and only if $\phi$ is a bijection.

Proof. Certainly an isomorphism $\phi : U \to V$ is a bijection since it has an inverse as a function between the underlying sets $U$ and $V$. Suppose that $\phi : U \to V$ is a linear bijection and let $\psi : V \to U$ be its inverse as a function. We must show that $\psi$ is also linear. Let $\lambda, \mu \in \mathbb{F}$ and $v_1, v_2 \in V$. Then
$$\phi(\psi(\lambda v_1 + \mu v_2)) = \lambda v_1 + \mu v_2 = \lambda\phi\psi(v_1) + \mu\phi\psi(v_2) = \phi\left(\lambda\psi(v_1) + \mu\psi(v_2)\right).$$
Since $\phi$ is injective it follows that $\psi(\lambda v_1 + \mu v_2) = \lambda\psi(v_1) + \mu\psi(v_2)$, i.e. $\psi$ is linear as required. $\square$

Definition. Suppose that $\phi : U \to V$ is a linear map. The image of $\phi$ is $\operatorname{Im} \phi := \{\phi(u) : u \in U\}$. The kernel of $\phi$ is $\ker \phi := \{u \in U : \phi(u) = 0\}$.

    Examples.

(1) Let $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$ and let $\phi : \mathbb{F}^m \to \mathbb{F}^n$ be the linear map defined by $x \mapsto Ax$. Then the system of equations
$$\sum_{j=1}^{m} A_{ij} x_j = b_i; \quad 1 \leq i \leq n$$
has a solution if and only if $\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \in \operatorname{Im} \phi$. The kernel of $\phi$ consists of the solutions $\begin{pmatrix} x_1 \\ \vdots \\ x_m \end{pmatrix}$ to the homogeneous equations
$$\sum_{j=1}^{m} A_{ij} x_j = 0; \quad 1 \leq i \leq n.$$


(2) Let $\phi : C^\infty(\mathbb{R}, \mathbb{R}) \to C^\infty(\mathbb{R}, \mathbb{R})$ be given by
$$\phi(f)(t) = f''(t) + p(t)f'(t) + q(t)f(t)$$
for some $p, q \in C^\infty(\mathbb{R}, \mathbb{R})$. A function $g \in C^\infty(\mathbb{R}, \mathbb{R})$ is in the image of $\phi$ precisely if
$$f''(t) + p(t)f'(t) + q(t)f(t) = g(t)$$
has a solution in $C^\infty(\mathbb{R}, \mathbb{R})$. Moreover, $\ker \phi$ consists of the solutions to the differential equation
$$f''(t) + p(t)f'(t) + q(t)f(t) = 0$$
in $C^\infty(\mathbb{R}, \mathbb{R})$.

    Lecture 5

Proposition. Suppose that $\phi : U \to V$ is an $\mathbb{F}$-linear map.

(a) If $\phi$ is injective and $S \subseteq U$ is linearly independent then $\phi(S) \subseteq V$ is linearly independent.
(b) If $\phi$ is surjective and $S \subseteq U$ spans $U$ then $\phi(S)$ spans $V$.
(c) If $\phi$ is an isomorphism and $S$ is a basis then $\phi(S)$ is a basis.

Proof. (a) Suppose $\phi$ is injective, $S \subseteq U$ and $\phi(S)$ is linearly dependent. Then there are $s_0, \ldots, s_n \in S$ distinct and $\lambda_1, \ldots, \lambda_n \in \mathbb{F}$ such that
$$\phi(s_0) = \sum_{i=1}^{n} \lambda_i \phi(s_i) = \phi\left( \sum_{i=1}^{n} \lambda_i s_i \right).$$
Since $\phi$ is injective it follows that $s_0 = \sum_{1}^{n} \lambda_i s_i$ and $S$ is LD.

(b) Now suppose that $\phi$ is surjective, $S \subseteq U$ spans $U$ and let $v \in V$. There is $u \in U$ such that $\phi(u) = v$ and there are $s_1, \ldots, s_n \in S$ and $\lambda_1, \ldots, \lambda_n \in \mathbb{F}$ such that $\sum \lambda_i s_i = u$. Then $\sum \lambda_i \phi(s_i) = v$. Thus $\phi(S)$ spans $V$.

(c) Follows immediately from (a) and (b). $\square$

Note that $\phi$ is injective if and only if $\ker \phi = 0$ and that $\phi$ is surjective if and only if $\operatorname{Im} \phi = V$.

Corollary. If two finite dimensional vector spaces are isomorphic then they have the same dimension.

Proof. If $\phi : U \to V$ is an isomorphism and $S$ is a finite basis for $U$ then $\phi(S)$ is a basis of $V$ by the proposition. Since $\phi$ is an injection $|S| = |\phi(S)|$. $\square$

Proposition. Suppose that $V$ is a vector space over $\mathbb{F}$ of dimension $n < \infty$. Then the map $\Phi$ sending an isomorphism $\phi : \mathbb{F}^n \to V$ to the ordered basis $(\phi(e_1), \ldots, \phi(e_n))$ of $V$, where $(e_1, \ldots, e_n)$ is the standard basis of $\mathbb{F}^n$, is a bijection from the set of isomorphisms $\mathbb{F}^n \to V$ to the set of ordered bases of $V$.


Proof. Suppose that $\Phi(\phi) = \Phi(\psi)$ for isomorphisms $\phi, \psi : \mathbb{F}^n \to V$. Then
$$\phi\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \sum_{i=1}^{n} x_i \phi(e_i) = \sum_{i=1}^{n} x_i \psi(e_i) = \psi\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
for all $\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{F}^n$, so $\phi = \psi$. Thus $\Phi$ is injective.

Suppose now that $(v_1, \ldots, v_n)$ is an ordered basis for $V$ and define $\phi : \mathbb{F}^n \to V$ by
$$\phi\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \sum_{i=1}^{n} x_i v_i.$$
Then $\phi$ is injective since $v_1, \ldots, v_n$ are LI, $\phi$ is surjective since $v_1, \ldots, v_n$ span $V$, and $\phi$ is easily seen to be linear. Thus $\phi$ is an isomorphism such that $\Phi(\phi) = (v_1, \ldots, v_n)$, and $\Phi$ is surjective as required. $\square$

Thus choosing a basis for an $n$-dimensional vector space $V$ corresponds to choosing an identification of $V$ with $\mathbb{F}^n$.

    2.2. Linear maps and matrices.

Proposition. Suppose that $U$ and $V$ are vector spaces over $\mathbb{F}$ and $S := \{e_1, \ldots, e_n\}$ is a basis for $U$. Then every function $f : S \to V$ extends uniquely to a linear map $\phi : U \to V$.

Slogan: To define a linear map it suffices to specify its values on a basis.

Proof. First we prove uniqueness: suppose that $f : S \to V$ and $\phi$ and $\psi$ are two linear maps $U \to V$ extending $f$. Let $u \in U$ so that $u = \sum u_i e_i$ for some $u_i \in \mathbb{F}$. Then
$$\phi(u) = \phi\left( \sum_{i=1}^{n} u_i e_i \right) = \sum_{i=1}^{n} u_i \phi(e_i).$$
Similarly, $\psi(u) = \sum_{1}^{n} u_i \psi(e_i)$. Since $\phi(e_i) = f(e_i) = \psi(e_i)$ for each $1 \leq i \leq n$ we see that $\phi(u) = \psi(u)$ for all $u \in U$ and so $\phi = \psi$.

That argument also shows us how to construct a linear map $\phi$ that extends $f$. Every $u \in U$ can be written uniquely as $u = \sum_{i=1}^{n} u_i e_i$ with $u_i \in \mathbb{F}$. Thus we can define $\phi(u) = \sum u_i f(e_i)$ without ambiguity. Certainly $\phi$ extends $f$ so it remains to show that $\phi$ is linear. So we compute for $u = \sum u_i e_i$ and $v = \sum v_i e_i$,
$$\phi(\lambda u + \mu v) = \phi\left( \sum_{i=1}^{n} (\lambda u_i + \mu v_i) e_i \right) = \sum_{i=1}^{n} (\lambda u_i + \mu v_i) f(e_i) = \lambda \sum_{i=1}^{n} u_i f(e_i) + \mu \sum_{i=1}^{n} v_i f(e_i) = \lambda\phi(u) + \mu\phi(v)$$
as required. $\square$

    Remarks.

(1) With a little care the proof of the proposition can be extended to the case where $U$ is not assumed finite dimensional.


(2) It is not hard to see that the only subsets $S$ of $U$ that satisfy the conclusions of the proposition are bases: spanning is necessary for the uniqueness part and linear independence is necessary for the existence part. The proposition should be considered a key motivation for the definition of a basis.

Corollary. If $U$ and $V$ are finite dimensional vector spaces over $\mathbb{F}$ with (ordered) bases $(e_1, \ldots, e_m)$ and $(f_1, \ldots, f_n)$ respectively then there is a bijection
$$\operatorname{Mat}_{n,m}(\mathbb{F}) \to L(U, V)$$
that sends a matrix $A$ to the unique linear map $\phi$ such that $\phi(e_i) = \sum_j a_{ji} f_j$.

Interpretation: the $i$th column of the matrix $A$ tells us where the $i$th basis vector of $U$ goes (as a linear combination of the basis vectors of $V$).

Proof. If $\phi : U \to V$ is a linear map then for each $1 \leq i \leq m$ we can write $\phi(e_i)$ uniquely as $\phi(e_i) = \sum_j a_{ji} f_j$ with $a_{ji} \in \mathbb{F}$. The proposition tells us that every matrix $A = (a_{ij})$ arises in this way from some linear map and that $\phi$ is determined by $A$. $\square$

Definition. We call the matrix corresponding to a linear map $\phi \in L(U, V)$ under this corollary the matrix representing $\phi$ with respect to $(e_1, \ldots, e_m)$ and $(f_1, \ldots, f_n)$.

Exercise. Show that the bijection given by the corollary is even an isomorphism of vector spaces. By finding a basis for $\operatorname{Mat}_{n,m}(\mathbb{F})$, deduce that $\dim L(U, V) = \dim U \cdot \dim V$.

Proposition. Suppose that $U$, $V$ and $W$ are finite dimensional vector spaces over $\mathbb{F}$ with bases $R := (u_1, \ldots, u_r)$, $S := (v_1, \ldots, v_s)$ and $T := (w_1, \ldots, w_t)$ respectively. If $\phi : U \to V$ is a linear map represented by the matrix $A$ with respect to $R$ and $S$, and $\psi : V \to W$ is a linear map represented by the matrix $B$ with respect to $S$ and $T$, then $\psi\phi$ is the linear map $U \to W$ represented by $BA$ with respect to $R$ and $T$.

Proof. Verifying that $\psi\phi$ is linear is straightforward: suppose $x, y \in U$ and $\lambda, \mu \in \mathbb{F}$; then
$$\psi\phi(\lambda x + \mu y) = \psi(\lambda\phi(x) + \mu\phi(y)) = \lambda\psi\phi(x) + \mu\psi\phi(y).$$
Next we compute $\psi\phi(u_i)$ as a linear combination of the $w_j$:
$$\psi\phi(u_i) = \psi\left( \sum_k A_{ki} v_k \right) = \sum_k A_{ki} \psi(v_k) = \sum_{k,j} A_{ki} B_{jk} w_j = \sum_j (BA)_{ji} w_j$$
as required. $\square$
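A quick numerical sanity check of the proposition; a minimal sketch with the standard bases, so that a linear map simply is its matrix (random integer matrices are an arbitrary choice here):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 3))   # represents phi : F^3 -> F^4
B = rng.integers(-3, 4, size=(2, 4))   # represents psi : F^4 -> F^2

u = rng.integers(-3, 4, size=3)
# applying phi then psi agrees with the single matrix BA
assert np.array_equal(B @ (A @ u), (B @ A) @ u)
```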


    Lecture 6

2.3. The first isomorphism theorem and the rank-nullity theorem. The following analogue of the first isomorphism theorem for groups holds for vector spaces.

Theorem (The first isomorphism theorem). Let $\phi : U \to V$ be a linear map between vector spaces over $\mathbb{F}$. Then $\ker\phi$ is a subspace of $U$ and $\operatorname{Im}\phi$ is a subspace of $V$. Moreover $\phi$ induces an isomorphism $\bar\phi : U/\ker\phi \to \operatorname{Im}\phi$ given by
$$\bar\phi(u + \ker\phi) = \phi(u).$$

Proof. Certainly $0 \in \ker\phi$. Suppose that $u_1, u_2 \in \ker\phi$ and $\lambda, \mu \in \mathbb{F}$. Then
$$\phi(\lambda u_1 + \mu u_2) = \lambda\phi(u_1) + \mu\phi(u_2) = 0 + 0 = 0.$$
Thus $\ker\phi$ is a subspace of $U$. Similarly $0 \in \operatorname{Im}\phi$ and for $u_1, u_2 \in U$,
$$\lambda\phi(u_1) + \mu\phi(u_2) = \phi(\lambda u_1 + \mu u_2) \in \operatorname{Im}(\phi).$$
By the first isomorphism theorem for groups $\bar\phi$ is a bijective homomorphism of the underlying abelian groups so it remains to verify that $\bar\phi$ respects multiplication by scalars. But $\bar\phi(\lambda u + \ker\phi) = \phi(\lambda u) = \lambda\phi(u) = \lambda\bar\phi(u + \ker\phi)$ for all $\lambda \in \mathbb{F}$ and $u \in U$. $\square$

Definition. Suppose that $\phi : U \to V$ is a linear map between finite dimensional vector spaces.

The number $n(\phi) := \dim \ker\phi$ is called the nullity of $\phi$. The number $r(\phi) := \dim \operatorname{Im}\phi$ is called the rank of $\phi$.

Corollary (The rank-nullity theorem). If $\phi : U \to V$ is a linear map between f.d. vector spaces over $\mathbb{F}$ then
$$r(\phi) + n(\phi) = \dim U.$$

Proof. Since $U/\ker\phi \cong \operatorname{Im}\phi$ they have the same dimension. But
$$\dim U = \dim(U/\ker\phi) + \dim\ker\phi$$
by an earlier computation, so $\dim U = r(\phi) + n(\phi)$ as required. $\square$

We are about to give another proof of the rank-nullity theorem not using quotient spaces or the first isomorphism theorem. However, the proof above is illustrative of the power of considering quotients.

Proposition. Suppose that $\phi : U \to V$ is a linear map between finite dimensional vector spaces. Then there are bases $(e_1, \ldots, e_n)$ for $U$ and $(f_1, \ldots, f_m)$ for $V$ such that the matrix representing $\phi$ is
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$$
where $r = r(\phi)$. In particular $r(\phi) + n(\phi) = \dim U$.

Proof. Let $(e_{k+1}, \ldots, e_n)$ be a basis for $\ker\phi$ (here $n(\phi) = n - k$) and extend it to a basis $(e_1, \ldots, e_n)$ for $U$ (we're being careful about ordering now so that we don't have to change it later). Let $f_i = \phi(e_i)$ for $1 \leq i \leq k$.

We claim that $(f_1, \ldots, f_k)$ form a basis for $\operatorname{Im}\phi$ (so that $k = r(\phi)$).


Suppose first that $\sum_{i=1}^{k} \lambda_i f_i = 0$ for some $\lambda_i \in \mathbb{F}$. Then $\phi\left( \sum_{i=1}^{k} \lambda_i e_i \right) = 0$ and so $\sum_{i=1}^{k} \lambda_i e_i \in \ker\phi$. But $\ker\phi \cap \langle e_1, \ldots, e_k \rangle = 0$ by construction and so $\sum_{i=1}^{k} \lambda_i e_i = 0$. Since $e_1, \ldots, e_k$ are LI, each $\lambda_i = 0$. Thus we have shown that $\{f_1, \ldots, f_k\}$ is LI.

Now suppose that $v \in \operatorname{Im}\phi$, so that $v = \phi(\sum_{i=1}^{n} \lambda_i e_i)$ for some $\lambda_i \in \mathbb{F}$. Since $\phi(e_i) = 0$ for $i > k$ and $\phi(e_i) = f_i$ for $i \leq k$, $v = \sum_{i=1}^{k} \lambda_i f_i \in \langle f_1, \ldots, f_k \rangle$. So $(f_1, \ldots, f_k)$ is a basis for $\operatorname{Im}\phi$ as claimed (and $k = r(\phi)$).

We can extend $\{f_1, \ldots, f_r\}$ to a basis $\{f_1, \ldots, f_m\}$ for $V$. Now
$$\phi(e_i) = \begin{cases} f_i & 1 \leq i \leq r \\ 0 & r+1 \leq i \leq n \end{cases}$$
so the matrix representing $\phi$ with respect to our choice of bases is as in the statement. $\square$

The proposition says that the rank of a linear map between two finite dimensional vector spaces is its only basis-independent invariant (or more precisely any other invariant can be deduced from it).

This result is very useful for computing dimensions of vector spaces in terms of known dimensions of other spaces.

    Examples.

(1) Let $W = \{x \in \mathbb{R}^5 \mid x_1 + x_2 + x_5 = 0 \text{ and } x_3 - x_4 - x_5 = 0\}$. What is $\dim W$? Consider $\phi : \mathbb{R}^5 \to \mathbb{R}^2$ given by $\phi(x) = \begin{pmatrix} x_1 + x_2 + x_5 \\ x_3 - x_4 - x_5 \end{pmatrix}$. Then $\phi$ is a linear map with image $\mathbb{R}^2$ (since $\phi(1, 0, 0, 0, 0)^T = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\phi(0, 0, 1, 0, 0)^T = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$) and $\ker\phi = W$. Thus $\dim W = n(\phi) = 5 - r(\phi) = 5 - 2 = 3$; see the numerical check after this list.

More generally, one can use the rank-nullity theorem to see that $m$ linear equations in $n$ unknowns have a space of solutions of dimension at least $n - m$.

(2) Suppose that $U$ and $W$ are subspaces of a finite dimensional vector space $V$, and let $\phi : U \oplus W \to V$ be the linear map given by $\phi((u, w)) = u + w$. Then $\ker\phi = \{(u, -u) \mid u \in U \cap W\} \cong U \cap W$, and $\operatorname{Im}\phi = U + W$. Thus
$$\dim U \oplus W = \dim(U + W) + \dim(U \cap W).$$
We can then recover $\dim U + \dim W = \dim(U + W) + \dim(U \cap W)$.
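Example (1) is easy to verify numerically. A sketch, assuming numpy and scipy are available (scipy.linalg.null_space returns an orthonormal basis of the kernel):

```python
import numpy as np
from scipy.linalg import null_space

# matrix of phi : R^5 -> R^2 from Example (1)
A = np.array([[1., 1., 0.,  0.,  1.],    # x1 + x2 + x5
              [0., 0., 1., -1., -1.]])   # x3 - x4 - x5

r = np.linalg.matrix_rank(A)   # r(phi) = 2
K = null_space(A)              # columns form a basis of ker(phi) = W
assert K.shape[1] == A.shape[1] - r == 3   # rank-nullity: n(phi) = 5 - 2
```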

Corollary (of the rank-nullity theorem). Suppose that $\phi : U \to V$ is a linear map between two vector spaces of dimension $n < \infty$. Then $\phi$ is injective if and only if $\phi$ is surjective if and only if $\phi$ is an isomorphism.

Proof. $\phi$ is injective if and only if $n(\phi) = 0$.


By the rank-nullity theorem $n(\phi) = 0$ if and only if $r(\phi) = n$, and the latter is equivalent to $\phi$ being surjective. $\square$

    This enables us to prove the following fact about matrices.

Lemma. Let $A$ be an $n \times n$ matrix over $\mathbb{F}$. The following are equivalent:

(a) there is a matrix $B$ such that $BA = I_n$;
(b) there is a matrix $C$ such that $AC = I_n$.

Moreover, if (a) and (b) hold then $B = C$ and we write $A^{-1} = B = C$; we say $A$ is invertible.

Proof. Let $\alpha, \beta, \gamma, \iota : \mathbb{F}^n \to \mathbb{F}^n$ be the linear maps represented by $A$, $B$, $C$ and $I_n$ respectively (with respect to the standard basis of $\mathbb{F}^n$).

Note first that (a) holds if and only if there exists $\beta$ such that $\beta\alpha = \iota$. This last implies that $\alpha$ is injective, which in turn implies that $\alpha$ is an isomorphism by the previous result. Conversely if $\alpha$ is an isomorphism there does exist $\beta$ such that $\beta\alpha = \iota$ by definition. Thus (a) holds if and only if $\alpha$ is an isomorphism.

Similarly (b) holds if and only if there exists $\gamma$ such that $\alpha\gamma = \iota$. This last implies that $\alpha$ is surjective and so an isomorphism by the previous result. Thus (b) also holds if and only if $\alpha$ is an isomorphism.

If $\alpha$ is an isomorphism then $\beta$ and $\gamma$ must both be the set-theoretic inverse of $\alpha$ and so $B = C$ as claimed. $\square$

    Lecture 7

    2.4. Change of basis.

Theorem. Suppose that $(e_1, \ldots, e_m)$ and $(u_1, \ldots, u_m)$ are two bases for a vector space $U$ over $\mathbb{F}$ and $(f_1, \ldots, f_n)$ and $(v_1, \ldots, v_n)$ are two bases of another vector space $V$. Let $\phi : U \to V$ be a linear map, $A$ be the matrix representing $\phi$ with respect to $(e_1, \ldots, e_m)$ and $(f_1, \ldots, f_n)$, and $B$ be the matrix representing $\phi$ with respect to $(u_1, \ldots, u_m)$ and $(v_1, \ldots, v_n)$. Then
$$B = Q^{-1} A P$$
where $u_i = \sum_k P_{ki} e_k$ for $i = 1, \ldots, m$ and $v_j = \sum_l Q_{lj} f_l$ for $j = 1, \ldots, n$.

Note that one can view $P$ as the matrix representing the identity map from $U$ with basis $(u_1, \ldots, u_m)$ to $U$ with basis $(e_1, \ldots, e_m)$, and $Q$ as the matrix representing the identity map from $V$ with basis $(v_1, \ldots, v_n)$ to $V$ with basis $(f_1, \ldots, f_n)$. Thus both are invertible with inverses represented by the identity maps going in the opposite directions.

Proof. On the one hand, by definition
$$\phi(u_i) = \sum_j B_{ji} v_j = \sum_{j,l} B_{ji} Q_{lj} f_l = \sum_l (QB)_{li} f_l.$$
On the other hand, also by definition
$$\phi(u_i) = \phi\left( \sum_k P_{ki} e_k \right) = \sum_{k,l} P_{ki} A_{lk} f_l = \sum_l (AP)_{li} f_l.$$
Thus $QB = AP$ as the $f_l$ are LI. Since $Q$ is invertible the result follows. $\square$
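The change of basis formula is easy to test numerically. A sketch with random matrices (a random square matrix is invertible with probability 1, an assumption made for brevity): columns of P express the new basis of U in terms of the old, and likewise Q for V.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((n, m))    # phi w.r.t. bases (e_i), (f_j)
P = rng.standard_normal((m, m))    # u_i = sum_k P[k, i] e_k
Q = rng.standard_normal((n, n))    # v_j = sum_l Q[l, j] f_l

B = np.linalg.inv(Q) @ A @ P       # phi w.r.t. bases (u_i), (v_j)

x = rng.standard_normal(m)         # coordinates of a vector w.r.t. (u_i)
lhs = Q @ (B @ x)                  # its image, converted to (f_j) coordinates
rhs = A @ (P @ x)                  # same image computed via (e_i) coordinates
assert np.allclose(lhs, rhs)      # QB = AP
```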


Definition. We say two matrices $A, B \in \operatorname{Mat}_{n,m}(\mathbb{F})$ are equivalent if there are invertible matrices $P \in \operatorname{Mat}_m(\mathbb{F})$ and $Q \in \operatorname{Mat}_n(\mathbb{F})$ such that $Q^{-1}AP = B$.

Note that equivalence is an equivalence relation. It can be reinterpreted as follows: two matrices are equivalent precisely if they represent the same linear map with respect to different bases.

We saw earlier that for every linear map $\phi$ between f.d. vector spaces there are bases for the domain and codomain such that $\phi$ is represented by a matrix of the form
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$
Moreover $r = r(\phi)$ is independent of the choice of bases. We can now rephrase this as follows.

Corollary. If $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$ there are invertible matrices $P \in \operatorname{Mat}_m(\mathbb{F})$ and $Q \in \operatorname{Mat}_n(\mathbb{F})$ such that $Q^{-1}AP$ is of the form
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$
Moreover $r$ is uniquely determined by $A$, i.e. every equivalence class contains precisely one matrix of this form.

Definition. If $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$ then the column rank of $A$, written $r(A)$, is the dimension of the subspace of $\mathbb{F}^n$ spanned by the columns of $A$; the row rank of $A$ is the column rank of $A^T$.

Note that if we take $\phi$ to be the linear map represented by $A$ with respect to the standard bases of $\mathbb{F}^m$ and $\mathbb{F}^n$ then $r(A) = r(\phi)$; i.e. column rank = rank. Moreover, since $r(\phi)$ is defined in a basis-invariant way, the column rank of $A$ is constant on equivalence classes.

Corollary (Row rank equals column rank). If $A \in \operatorname{Mat}_{m,n}(\mathbb{F})$ then $r(A) = r(A^T)$.

Proof. Let $r = r(A)$. There exist invertible $P$, $Q$ such that
$$Q^{-1}AP = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$
Thus
$$P^T A^T (Q^{-1})^T = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$$
and so $r = r(A^T)$. Thus $A$ and $A^T$ have the same rank. $\square$

    2.5. Elementary matrix operations.

Definition. We call the following three types of invertible $n \times n$ matrices elementary matrices:

$$S^n_{ij} := \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 0 & \cdots & 1 & \\ & & \vdots & \ddots & \vdots & \\ & & 1 & \cdots & 0 & \\ & & & & & \ddots \end{pmatrix} \quad \text{for } i \neq j$$
(the identity matrix except that the $(i,i)$ and $(j,j)$ entries are $0$ and the $(i,j)$ and $(j,i)$ entries are $1$),
$$E^n_{ij}(\lambda) := \begin{pmatrix} 1 & & & & \\ & \ddots & & \lambda & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \quad \text{for } i \neq j,\ \lambda \in \mathbb{F}$$
(the identity matrix with an additional entry $\lambda$ in the $(i,j)$ position), and
$$T^n_i(\lambda) := \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \quad \text{for } \lambda \in \mathbb{F} \setminus \{0\}$$
(the identity matrix with the $(i,i)$ entry replaced by $\lambda$).

We make the following observations: if $A$ is an $m \times n$ matrix then $A S^n_{ij}$ (resp. $S^m_{ij} A$) is obtained from $A$ by swapping the $i$th and $j$th columns (resp. rows); $A E^n_{ij}(\lambda)$ (resp. $E^m_{ij}(\lambda) A$) is obtained from $A$ by adding $\lambda \cdot$(column $i$) to column $j$ (resp. adding $\lambda \cdot$(row $j$) to row $i$); and $A T^n_i(\lambda)$ (resp. $T^m_i(\lambda) A$) is obtained from $A$ by multiplying column (resp. row) $i$ by $\lambda$.

    Recall the following result.

Proposition. If $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$ there are invertible matrices $P \in \operatorname{Mat}_m(\mathbb{F})$ and $Q \in \operatorname{Mat}_n(\mathbb{F})$ such that $Q^{-1}AP$ is of the form
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$

Pure matrix proof of the Proposition. We claim that there are elementary matrices $E^n_1, \ldots, E^n_a$ and $F^m_1, \ldots, F^m_b$ such that $E^n_a \cdots E^n_1 A F^m_1 \cdots F^m_b$ is of the required form. This suffices since all the elementary matrices are invertible and products of invertible matrices are invertible.

Moreover, to prove the claim it suffices to show that there is a sequence of elementary row and column operations that reduces $A$ to the required form.

If $A = 0$ there is nothing to do. Otherwise, we can find a pair $i, j$ such that $A_{ij} \neq 0$. By swapping rows $1$ and $i$ and then swapping columns $1$ and $j$ we can reduce to the case that $A_{11} \neq 0$. By multiplying row $1$ by $\frac{1}{A_{11}}$ we can further assume that $A_{11} = 1$.

Now, given $A_{11} = 1$, we can add $-A_{1j}$ times column $1$ to column $j$ for each $1 < j \leq m$ and then add $-A_{i1}$ times row $1$ to row $i$ for each $1 < i \leq n$ to reduce further to the case that $A$ is of the form
$$\begin{pmatrix} 1 & 0 \\ 0 & B \end{pmatrix}.$$
Now by induction on the size of $A$ we can find elementary row and column operations that reduce $B$ to the required form. Applying these same operations to $A$ we complete the proof. $\square$

Note that the algorithm described in the proof can easily be implemented on a computer in order to actually compute the matrices $P$ and $Q$.
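For instance, here is a minimal Python sketch of that algorithm (a toy over the reals using floating-point pivot tests rather than the exact field arithmetic of the proof; the helper name rank_normal_form is ours). It returns $L$ and $P$ with $LAP$ in the stated form, where $L$ plays the role of $Q^{-1}$, being the product of the row operations applied.

```python
import numpy as np

def rank_normal_form(A, tol=1e-12):
    """Reduce A to [[I_r, 0], [0, 0]] by elementary row/column operations.
    Returns (L, P, R) with R = L @ A @ P and L, P invertible."""
    n, m = A.shape
    L, P, R = np.eye(n), np.eye(m), A.astype(float).copy()
    r = 0
    while True:
        idx = np.argwhere(np.abs(R[r:, r:]) > tol)   # find a pivot
        if idx.size == 0:
            return L, P, R
        i, j = idx[0] + r
        R[[r, i], :] = R[[i, r], :]; L[[r, i], :] = L[[i, r], :]  # swap rows
        R[:, [r, j]] = R[:, [j, r]]; P[:, [r, j]] = P[:, [j, r]]  # swap cols
        piv = R[r, r]
        R[r, :] /= piv; L[r, :] /= piv                # scale row r to pivot 1
        for i2 in range(n):                           # clear column r
            if i2 != r and abs(R[i2, r]) > tol:
                c = R[i2, r]; R[i2, :] -= c * R[r, :]; L[i2, :] -= c * L[r, :]
        for j2 in range(m):                           # clear row r
            if j2 != r and abs(R[r, j2]) > tol:
                c = R[r, j2]; R[:, j2] -= c * R[:, r]; P[:, j2] -= c * P[:, r]
        r += 1

A = np.array([[2., 4., 1.], [4., 8., 3.]])
L, P, R = rank_normal_form(A)
assert np.allclose(L @ A @ P, R) and np.allclose(R, np.diag([1., 1., 0.])[:2])
```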

3. Duality

3.1. Dual spaces.

Remark. If we think of elements of $V$ as column vectors with respect to some basis,
$$\sum x_i e_i = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},$$
then we can view elements of $V^*$ as row vectors with respect to the dual basis,
$$\sum a_i \epsilon_i = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}.$$
Then
$$\left( \sum a_i \epsilon_i \right)\left( \sum x_j e_j \right) = \sum a_i x_i = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

Corollary. If $V$ is f.d. then $\dim V^* = \dim V$.

Proposition. Suppose that $V$ is a f.d. vector space over $\mathbb{F}$ with bases $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_n)$, and that $P$ is the change of basis matrix from $(e_1, \ldots, e_n)$ to $(f_1, \ldots, f_n)$, i.e. $f_i = \sum_{k=1}^{n} P_{ki} e_k$ for $1 \leq i \leq n$.

Let $(\epsilon_1, \ldots, \epsilon_n)$ and $(\eta_1, \ldots, \eta_n)$ be the corresponding dual bases so that
$$\epsilon_i(e_j) = \delta_{ij} = \eta_i(f_j) \quad \text{for } 1 \leq i, j \leq n.$$
Then the change of basis matrix from $(\epsilon_1, \ldots, \epsilon_n)$ to $(\eta_1, \ldots, \eta_n)$ is given by $(P^{-1})^T$, i.e. $\epsilon_i = \sum_l (P^T)_{li} \eta_l$.

Proof. Let $Q = P^{-1}$. Then $e_j = \sum_k Q_{kj} f_k$, so we can compute
$$\left( \sum_l P_{il} \eta_l \right)(e_j) = \sum_{k,l} (P_{il} \eta_l)(Q_{kj} f_k) = \sum_{k,l} P_{il} \delta_{lk} Q_{kj} = \delta_{ij}.$$
Thus $\epsilon_i = \sum_l P_{il} \eta_l$ as claimed. $\square$

Definition.
(a) If $U \subseteq V$ then the annihilator of $U$ is $U^0 := \{\theta \in V^* \mid \theta(u) = 0\ \forall u \in U\} \subseteq V^*$.
(b) If $W \subseteq V^*$, then the annihilator of $W$ is $W^0 := \{v \in V \mid \theta(v) = 0\ \forall \theta \in W\} \subseteq V$.

Example. Consider $\mathbb{R}^3$ with standard basis $(e_1, e_2, e_3)$ and $(\mathbb{R}^3)^*$ with dual basis $(\epsilon_1, \epsilon_2, \epsilon_3)$, $U = \langle e_1 + 2e_2 + e_3 \rangle \leq \mathbb{R}^3$ and $W = \langle \epsilon_1 - \epsilon_3, \epsilon_1 - 2\epsilon_2 + 3\epsilon_3 \rangle \leq (\mathbb{R}^3)^*$. Then $U^0 = W$ and $W^0 = U$.

Proposition. Suppose that $V$ is f.d. over $\mathbb{F}$ and $U \subseteq V$ is a subspace. Then $\dim U + \dim U^0 = \dim V$.

Proof 1. Let $(e_1, \ldots, e_k)$ be a basis for $U$, extend to a basis $(e_1, \ldots, e_n)$ for $V$ and consider the dual basis $(\epsilon_1, \ldots, \epsilon_n)$ for $V^*$.

We claim that $U^0$ is spanned by $\epsilon_{k+1}, \ldots, \epsilon_n$. Certainly if $j > k$, then $\epsilon_j(e_i) = 0$ for each $1 \leq i \leq k$ and so $\epsilon_j \in U^0$. Suppose now that $\theta \in U^0$. We can write $\theta = \sum_{i=1}^{n} \lambda_i \epsilon_i$ with $\lambda_i \in \mathbb{F}$. Now,
$$0 = \theta(e_j) = \lambda_j \quad \text{for each } 1 \leq j \leq k.$$
So $\theta = \sum_{j=k+1}^{n} \lambda_j \epsilon_j$. Thus $U^0$ is the span of $\epsilon_{k+1}, \ldots, \epsilon_n$ and
$$\dim U^0 = n - k = \dim V - \dim U$$
as claimed. $\square$


Proof 2. Consider the restriction map $V^* \to U^*$ given by $\theta \mapsto \theta|_U$. Since every linear map $U \to \mathbb{F}$ can be extended to a linear map $V \to \mathbb{F}$ this map is a linear surjection. Moreover its kernel is $U^0$. Thus $\dim V^* = \dim U^* + \dim U^0$ by the rank-nullity theorem. The proposition follows from the statements $\dim U^* = \dim U$ and $\dim V^* = \dim V$. $\square$

Exercise (Proof 3). Show that $(V/U)^* \cong U^0$ and deduce the result.

Of course all three of these proofs are really the same but presented with differing levels of sophistication.

    Lecture 9

    3.2. Dual maps.

Definition. Let $V$ and $W$ be vector spaces over $\mathbb{F}$ and suppose that $\alpha : V \to W$ is a linear map. The dual map to $\alpha$ is the map $\alpha^* : W^* \to V^*$ given by $\theta \mapsto \theta\alpha$.

Note that $\theta\alpha$ is the composite of two linear maps and so is linear. Moreover, if $\lambda, \mu \in \mathbb{F}$, $\theta_1, \theta_2 \in W^*$ and $v \in V$ then
$$\alpha^*(\lambda\theta_1 + \mu\theta_2)(v) = (\lambda\theta_1 + \mu\theta_2)(\alpha v) = \lambda\theta_1(\alpha v) + \mu\theta_2(\alpha v) = \left(\lambda\alpha^*(\theta_1) + \mu\alpha^*(\theta_2)\right)(v).$$
Therefore $\alpha^*(\lambda\theta_1 + \mu\theta_2) = \lambda\alpha^*(\theta_1) + \mu\alpha^*(\theta_2)$ and $\alpha^*$ is linear, i.e. $\alpha^* \in L(W^*, V^*)$.

Lemma. Suppose that $V$ and $W$ are f.d. with bases $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ respectively. Let $(\epsilon_1, \ldots, \epsilon_n)$ and $(\eta_1, \ldots, \eta_m)$ be the corresponding dual bases. If $\alpha : V \to W$ is represented by $A$ with respect to $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ then $\alpha^*$ is represented by $A^T$ with respect to $(\eta_1, \ldots, \eta_m)$ and $(\epsilon_1, \ldots, \epsilon_n)$.

Proof. We're given that $\alpha(e_i) = \sum_k A_{ki} f_k$ and must compute $\alpha^*(\eta_i)$ in terms of $\epsilon_1, \ldots, \epsilon_n$:
$$\alpha^*(\eta_i)(e_j) = \eta_i(\alpha(e_j)) = \eta_i\left( \sum_k A_{kj} f_k \right) = \sum_k A_{kj} \delta_{ik} = A_{ij}.$$
Thus $\alpha^*(\eta_i) = \sum_k A_{ik} \epsilon_k = \sum_k (A^T)_{ki} \epsilon_k$ as required. $\square$

    Remarks.

(1) If $\alpha : U \to V$ and $\beta : V \to W$ are linear maps then $(\beta\alpha)^* = \alpha^*\beta^*$.
(2) If $\alpha, \beta : U \to V$ then $(\alpha + \beta)^* = \alpha^* + \beta^*$.
(3) If $B = Q^{-1}AP$ is an equality of matrices with $P$ and $Q$ invertible, then
$$B^T = P^T A^T (Q^{-1})^T = \left( (P^{-1})^T \right)^{-1} A^T (Q^{-1})^T$$
as we should expect at this point.

Lemma. Suppose that $\alpha \in L(V, W)$ with $V$, $W$ f.d. over $\mathbb{F}$. Then

(a) $\ker \alpha^* = (\operatorname{Im} \alpha)^0$;
(b) $r(\alpha^*) = r(\alpha)$; and
(c) $\operatorname{Im} \alpha^* = (\ker \alpha)^0$.

Proof. (a) Suppose $\theta \in W^*$. Then $\theta \in \ker\alpha^*$ if and only if $\alpha^*(\theta) = 0$, if and only if $\theta(\alpha(v)) = 0$ for all $v \in V$, if and only if $\theta \in (\operatorname{Im}\alpha)^0$.

(b) As $\operatorname{Im}\alpha$ is a subspace of $W$, we've seen that $\dim\operatorname{Im}\alpha + \dim(\operatorname{Im}\alpha)^0 = \dim W$. Using part (a) we can deduce that $r(\alpha) + n(\alpha^*) = \dim W = \dim W^*$. But the rank-nullity theorem gives $r(\alpha^*) + n(\alpha^*) = \dim W^*$, so $r(\alpha^*) = r(\alpha)$.

(c) Suppose that $\phi \in \operatorname{Im}\alpha^*$. Then there is some $\theta \in W^*$ such that $\phi = \alpha^*(\theta) = \theta\alpha$. Therefore for all $v \in \ker\alpha$, $\phi(v) = \theta(\alpha(v)) = \theta(0) = 0$. Thus $\operatorname{Im}\alpha^* \subseteq (\ker\alpha)^0$. But $\dim\ker\alpha + \dim(\ker\alpha)^0 = \dim V$. So
$$\dim(\ker\alpha)^0 = \dim V - n(\alpha) = r(\alpha) = r(\alpha^*) = \dim\operatorname{Im}\alpha^*$$
and so the inclusion must be an equality. $\square$

Notice that we have reproven that row rank = column rank in a more conceptually satisfying way.

Lemma. Let $V$ be a vector space over $\mathbb{F}$. There is a canonical linear map $\operatorname{ev} : V \to V^{**}$ given by $\operatorname{ev}(v)(\theta) = \theta(v)$.

Proof. First we must show that $\operatorname{ev}(v) \in V^{**}$ whenever $v \in V$. Suppose that $\theta_1, \theta_2 \in V^*$ and $\lambda, \mu \in \mathbb{F}$. Then
$$\operatorname{ev}(v)(\lambda\theta_1 + \mu\theta_2) = \lambda\theta_1(v) + \mu\theta_2(v) = \lambda\operatorname{ev}(v)(\theta_1) + \mu\operatorname{ev}(v)(\theta_2).$$
Next, we must show $\operatorname{ev}$ is linear, i.e. $\operatorname{ev}(\lambda v_1 + \mu v_2) = \lambda\operatorname{ev}(v_1) + \mu\operatorname{ev}(v_2)$ whenever $v_1, v_2 \in V$, $\lambda, \mu \in \mathbb{F}$. We can show this by evaluating both sides at each $\theta \in V^*$. Then
$$\operatorname{ev}(\lambda v_1 + \mu v_2)(\theta) = \theta(\lambda v_1 + \mu v_2) = (\lambda\operatorname{ev}(v_1) + \mu\operatorname{ev}(v_2))(\theta)$$
so $\operatorname{ev}$ is linear. $\square$

Lemma. Suppose that $V$ is f.d. Then the canonical linear map $\operatorname{ev} : V \to V^{**}$ is an isomorphism.

Proof. Suppose that $\operatorname{ev}(v) = 0$. Then $\theta(v) = \operatorname{ev}(v)(\theta) = 0$ for all $\theta \in V^*$. Thus $\langle v \rangle^0$ has dimension $\dim V$. It follows that $\langle v \rangle$ is a space of dimension $0$, so $v = 0$. In particular we've proven that $\operatorname{ev}$ is injective.

To complete the proof it suffices to observe that $\dim V^{**} = \dim V^* = \dim V$, so any injective linear map $V \to V^{**}$ is an isomorphism. $\square$

Remarks.

(1) The lemma tells us more than that there is an isomorphism between $V$ and $V^{**}$. It tells us that there is a way to define such an isomorphism canonically, that is to say without choosing bases. This means that we can, and from now on we will, identify $V$ and $V^{**}$ whenever $V$ is f.d. In particular for $v \in V$ and $\theta \in V^*$ we can write $v(\theta) = \theta(v)$.

(2) Although the canonical linear map $\operatorname{ev} : V \to V^{**}$ always exists, it is not an isomorphism in general if $V$ is not f.d.

Lemma. Suppose $V$ and $W$ are f.d. over $\mathbb{F}$. After identifying $V$ with $V^{**}$ and $W$ with $W^{**}$ via $\operatorname{ev}$ we have:

(a) If $U$ is a subspace of $V$ then $U^{00} = U$.
(b) If $\alpha \in L(V, W)$ then $\alpha^{**} = \alpha$.


Proof. (a) Let $u \in U$. Then $u(\theta) = \theta(u) = 0$ for all $\theta \in U^0$. Thus $u \in U^{00}$, i.e. $U \subseteq U^{00}$. But
$$\dim U^{00} = \dim V^* - \dim U^0 = \dim V - \dim U^0 = \dim U,$$
so $U = U^{00}$.

(b) Suppose that $(e_1, \ldots, e_n)$ is a basis for $V$ and $(f_1, \ldots, f_m)$ is a basis for $W$, and $(\epsilon_1, \ldots, \epsilon_n)$ and $(\eta_1, \ldots, \eta_m)$ are the corresponding dual bases. Then if $\alpha$ is represented by $A$ with respect to $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$, $\alpha^*$ is represented by $A^T$ with respect to $(\eta_1, \ldots, \eta_m)$ and $(\epsilon_1, \ldots, \epsilon_n)$.

Since we can view $(e_1, \ldots, e_n)$ as the dual basis to $(\epsilon_1, \ldots, \epsilon_n)$, as
$$e_i(\epsilon_j) = \epsilon_j(e_i) = \delta_{ij},$$
and $(f_1, \ldots, f_m)$ as the dual basis of $(\eta_1, \ldots, \eta_m)$ (by a similar computation), $\alpha^{**}$ is represented by $(A^T)^T = A$. $\square$

Lecture 10

Proposition. Suppose $V$ is f.d. over $\mathbb{F}$ and $U_1, U_2$ are subspaces of $V$. Then

(a) $(U_1 + U_2)^0 = U_1^0 \cap U_2^0$ and
(b) $(U_1 \cap U_2)^0 = U_1^0 + U_2^0$.

Proof. (a) Suppose that $\theta \in V^*$. Then $\theta \in (U_1 + U_2)^0$ if and only if $\theta(u_1 + u_2) = 0$ for all $u_1 \in U_1$ and $u_2 \in U_2$, if and only if $\theta(u) = 0$ for all $u \in U_1 \cup U_2$, if and only if $\theta \in U_1^0 \cap U_2^0$.

(b) By part (a), $(U_1^0 + U_2^0)^0 = U_1^{00} \cap U_2^{00} = U_1 \cap U_2$. Thus
$$(U_1 \cap U_2)^0 = (U_1^0 + U_2^0)^{00} = U_1^0 + U_2^0$$
as required. $\square$

    4. Bilinear Forms (I)

Let $V$ and $W$ be vector spaces over $\mathbb{F}$.

Definition. $\psi : V \times W \to \mathbb{F}$ is a bilinear form if it is linear in both arguments; i.e. if $\psi(v, -) : W \to \mathbb{F}$ is in $W^*$ for all $v \in V$ and $\psi(-, w) : V \to \mathbb{F}$ is in $V^*$ for all $w \in W$.

Examples.

(0) The map $V \times V^* \to \mathbb{F}$; $(v, \theta) \mapsto \theta(v)$ is a bilinear form.

(1) $V = W = \mathbb{R}^n$; $\psi(x, y) = \sum_{i=1}^{n} x_i y_i$ is a bilinear form.

(2) Suppose that $A \in \operatorname{Mat}_{m,n}(\mathbb{F})$. Then $\psi : \mathbb{F}^m \times \mathbb{F}^n \to \mathbb{F}$; $\psi(v, w) = v^T A w$ is a bilinear form.

(3) If $V = W = C([0,1], \mathbb{R})$ then $\psi(f, g) = \int_0^1 f(t)g(t)\,dt$ is a bilinear form.

Definition. Let $(e_1, \ldots, e_n)$ be a basis for $V$ and $(f_1, \ldots, f_m)$ a basis of $W$, and $\psi : V \times W \to \mathbb{F}$ a bilinear form. Then the matrix $A$ representing $\psi$ with respect to $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ is given by $A_{ij} = \psi(e_i, f_j)$.

Remark. If $v = \sum \lambda_i e_i$ and $w = \sum \mu_j f_j$ then
$$\psi\left( \sum_i \lambda_i e_i, \sum_j \mu_j f_j \right) = \sum_{i=1}^{n} \lambda_i \psi\left( e_i, \sum_j \mu_j f_j \right) = \sum_{i=1}^{n} \sum_{j=1}^{m} \lambda_i \mu_j \psi(e_i, f_j).$$


Therefore if $A$ is the matrix representing $\psi$ with respect to $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ we have
$$\psi(v, w) = \begin{pmatrix} \lambda_1 & \cdots & \lambda_n \end{pmatrix} A \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_m \end{pmatrix}$$
and $\psi$ is determined by the matrix representing it.

Proposition. Suppose that $(e_1, \ldots, e_n)$ and $(v_1, \ldots, v_n)$ are two bases of $V$ such that $v_i = \sum_{k=1}^{n} P_{ki} e_k$ for $i = 1, \ldots, n$, and $(f_1, \ldots, f_m)$ and $(w_1, \ldots, w_m)$ are two bases of $W$ such that $w_i = \sum_{l=1}^{m} Q_{li} f_l$ for $i = 1, \ldots, m$. Let $\psi : V \times W \to \mathbb{F}$ be a bilinear form represented by $A$ with respect to $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ and by $B$ with respect to $(v_1, \ldots, v_n)$ and $(w_1, \ldots, w_m)$. Then
$$B = P^T A Q.$$

Proof.
$$B_{ij} = \psi(v_i, w_j) = \psi\left( \sum_{k=1}^{n} P_{ki} e_k, \sum_{l=1}^{m} Q_{lj} f_l \right) = \sum_{k,l} P_{ki} Q_{lj} \psi(e_k, f_l) = (P^T A Q)_{ij}. \qquad \square$$
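Again this is straightforward to confirm numerically; a minimal sketch with random real matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
A = rng.standard_normal((n, m))    # psi w.r.t. (e_i), (f_j)
P = rng.standard_normal((n, n))    # v_i = sum_k P[k, i] e_k
Q = rng.standard_normal((m, m))    # w_j = sum_l Q[l, j] f_l

B = P.T @ A @ Q                    # psi w.r.t. (v_i), (w_j)

# psi(v_i, w_j) computed in the old coordinates equals B[i, j]:
i, j = 1, 0
assert np.isclose(P[:, i] @ A @ Q[:, j], B[i, j])
```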

Definition. We define the rank of $\psi$ to be the rank of any matrix representing $\psi$. Since $r(P^T A Q) = r(A)$ for any invertible matrices $P$ and $Q$, this is independent of any choices.

A bilinear form $\psi$ gives linear maps $\psi_L : V \to W^*$ and $\psi_R : W \to V^*$ by the formulae
$$\psi_L(v)(w) = \psi(v, w) = \psi_R(w)(v)$$
for $v \in V$ and $w \in W$.

Exercise. If $\psi : V \times V^* \to \mathbb{F}$; $\psi(v, \theta) = \theta(v)$, then $\psi_L : V \to V^{**}$ sends $v$ to $\operatorname{ev}(v)$ and $\psi_R : V^* \to V^*$ is the identity map.

Lemma. Let $(\epsilon_1, \ldots, \epsilon_n)$ be the dual basis to $(e_1, \ldots, e_n)$ and $(\eta_1, \ldots, \eta_m)$ be the dual basis to $(f_1, \ldots, f_m)$. Then $A$ represents $\psi_R$ with respect to $(f_1, \ldots, f_m)$ and $(\epsilon_1, \ldots, \epsilon_n)$, and $A^T$ represents $\psi_L$ with respect to $(e_1, \ldots, e_n)$ and $(\eta_1, \ldots, \eta_m)$.

Proof. We can compute $\psi_L(e_i)(f_j) = \psi(e_i, f_j) = A_{ij}$ and so $\psi_L(e_i) = \sum_{j=1}^{m} (A^T)_{ji} \eta_j$; and $\psi_R(f_j)(e_i) = \psi(e_i, f_j) = A_{ij}$ and so $\psi_R(f_j) = \sum_{i=1}^{n} A_{ij} \epsilon_i$. $\square$

Definition. We call $\ker \psi_L$ the left kernel of $\psi$ and $\ker \psi_R$ the right kernel of $\psi$.

Note that
$$\ker \psi_L = \{v \in V \mid \psi(v, w) = 0 \text{ for all } w \in W\}$$
and
$$\ker \psi_R = \{w \in W \mid \psi(v, w) = 0 \text{ for all } v \in V\}.$$


More generally, if $T \subseteq V$ we write
$$T^\perp := \{w \in W \mid \psi(t, w) = 0 \text{ for all } t \in T\}$$
and if $U \subseteq W$ we write
$${}^\perp U := \{v \in V \mid \psi(v, u) = 0 \text{ for all } u \in U\}.$$

Definition. We say a bilinear form $\psi : V \times W \to \mathbb{F}$ is non-degenerate if $\ker \psi_L = 0 = \ker \psi_R$. Otherwise we say that $\psi$ is degenerate.

Lemma. Let $V$ and $W$ be f.d. vector spaces over $\mathbb{F}$ with bases $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_m)$ and let $\psi : V \times W \to \mathbb{F}$ be a bilinear form represented by the matrix $A$ with respect to those bases. Then $\psi$ is non-degenerate if and only if the matrix $A$ is invertible. In particular, if $\psi$ is non-degenerate then $\dim V = \dim W$.

Proof. The condition that $\psi$ is non-degenerate is equivalent to $\ker \psi_L = 0$ and $\ker \psi_R = 0$, which is in turn equivalent to $n(A) = 0 = n(A^T)$. This last is equivalent to $r(A) = \dim V$ and $r(A^T) = \dim W$. Since row rank and column rank agree, this final statement is equivalent to $A$ being invertible as required. $\square$

It follows that, when $V$ and $W$ are f.d., defining a non-degenerate bilinear form $\psi : V \times W \to \mathbb{F}$ is equivalent to defining an isomorphism $\psi_L : V \to W^*$ (or equivalently an isomorphism $\psi_R : W \to V^*$).

    Lecture 11

    5. Determinants of matrices

Recall that $S_n$ is the group of permutations of the set $\{1, \ldots, n\}$. Moreover we can define a group homomorphism $\epsilon : S_n \to \{\pm 1\}$ such that $\epsilon(\sigma) = 1$ whenever $\sigma$ is a product of an even number of transpositions and $\epsilon(\sigma) = -1$ whenever $\sigma$ is a product of an odd number of transpositions.

Definition. If $A \in \operatorname{Mat}_n(\mathbb{F})$ then the determinant of $A$ is
$$\det A := \sum_{\sigma \in S_n} \epsilon(\sigma) \prod_{i=1}^{n} A_{i\sigma(i)}.$$

Example. If $n = 2$ then $\det A = A_{11}A_{22} - A_{12}A_{21}$.

Lemma. $\det A = \det A^T$.

Proof.
$$\det A^T = \sum_{\sigma \in S_n} \epsilon(\sigma) \prod_{i=1}^{n} A_{\sigma(i)i} = \sum_{\sigma \in S_n} \epsilon(\sigma) \prod_{i=1}^{n} A_{i\sigma^{-1}(i)} = \sum_{\tau \in S_n} \epsilon(\tau^{-1}) \prod_{i=1}^{n} A_{i\tau(i)} = \det A$$
since $\epsilon(\tau^{-1}) = \epsilon(\tau)$. $\square$


Lemma. Let $A \in \operatorname{Mat}_n(\mathbb{F})$ be upper triangular, i.e.
$$A = \begin{pmatrix} a_1 & & * \\ & \ddots & \\ 0 & & a_n \end{pmatrix}.$$
Then $\det A = \prod_{i=1}^{n} a_i$.

Proof.
$$\det A = \sum_{\sigma \in S_n} \epsilon(\sigma) \prod_{i=1}^{n} A_{i\sigma(i)}.$$
Now $A_{i\sigma(i)} = 0$ if $i > \sigma(i)$. So $\prod_{i=1}^{n} A_{i\sigma(i)} = 0$ unless $i \leq \sigma(i)$ for all $i = 1, \ldots, n$. Since $\sigma$ is a permutation, $\prod_{i=1}^{n} A_{i\sigma(i)}$ is only non-zero when $\sigma = \operatorname{id}$. The result follows immediately. $\square$

Definition. A volume form $d$ on $\mathbb{F}^n$ is a function $\mathbb{F}^n \times \cdots \times \mathbb{F}^n \to \mathbb{F}$ ($n$ factors); $(v_1, \ldots, v_n) \mapsto d(v_1, \ldots, v_n)$ such that

(i) $d$ is multilinear, i.e. for each $1 \leq i \leq n$ and $v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n \in \mathbb{F}^n$,
$$d(v_1, \ldots, v_{i-1}, -, v_{i+1}, \ldots, v_n) \in (\mathbb{F}^n)^*;$$
(ii) $d$ is alternating, i.e. whenever $v_i = v_j$ for some $i \neq j$ then $d(v_1, \ldots, v_n) = 0$.

One may view a matrix $A \in \operatorname{Mat}_n(\mathbb{F})$ as an $n$-tuple of elements of $\mathbb{F}^n$ given by its columns, $A = (A^{(1)} \cdots A^{(n)})$ with $A^{(1)}, \ldots, A^{(n)} \in \mathbb{F}^n$.

Lemma. $\det : \mathbb{F}^n \times \cdots \times \mathbb{F}^n \to \mathbb{F}$; $(A^{(1)}, \ldots, A^{(n)}) \mapsto \det A$ is a volume form.

Proof. To see that $\det$ is multilinear it suffices to see that $\prod_{i=1}^{n} A_{i\sigma(i)}$ is multilinear for each $\sigma \in S_n$, since a sum of (multi)linear functions is (multi)linear. Since one term from each column appears in each such product this is easy to see.

Suppose now that $A^{(k)} = A^{(l)}$ for some $k \neq l$. Let $\tau$ be the transposition $(k\ l)$. Then $A_{ij} = A_{i\tau(j)}$ for every $i, j$ in $\{1, \ldots, n\}$. We can write $S_n$ as a disjoint union of cosets $A_n \sqcup \tau A_n$. Then
$$\sum_{\sigma \in \tau A_n} \epsilon(\sigma) \prod_i A_{i\sigma(i)} = -\sum_{\sigma \in A_n} \epsilon(\sigma) \prod_i A_{i\sigma(i)}.$$
Thus $\det A = \sum_{\sigma \in A_n} \epsilon(\sigma) \prod_i A_{i\sigma(i)} - \sum_{\sigma \in A_n} \epsilon(\sigma) \prod_i A_{i\sigma(i)} = 0$. $\square$

We continue thinking about volume forms.

Lemma. Let $d$ be a volume form. Swapping two entries changes the sign, i.e.
$$d(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -d(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n).$$

Proof. Consider $d(v_1, \ldots, v_i + v_j, \ldots, v_i + v_j, \ldots, v_n) = 0$. Expanding the left-hand side using linearity of the $i$th and $j$th coordinates we obtain
$$d(v_1, \ldots, v_i, \ldots, v_i, \ldots, v_n) + d(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) + d(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n) + d(v_1, \ldots, v_j, \ldots, v_j, \ldots, v_n) = 0.$$
Since the first and last terms on the left are zero, the statement follows immediately. $\square$

Corollary. If $\sigma \in S_n$ then $d(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = \epsilon(\sigma)\, d(v_1, \ldots, v_n)$.


Theorem. Let $d$ be a volume form on $\mathbb{F}^n$. Let $A$ be a matrix with $i$th column $A^{(i)} \in \mathbb{F}^n$. Then
$$d(A^{(1)}, \ldots, A^{(n)}) = \det A \cdot d(e_1, \ldots, e_n).$$
In other words, $\det$ is the unique volume form $d$ such that $d(e_1, \ldots, e_n) = 1$.

Proof. We compute
$$d(A^{(1)}, \ldots, A^{(n)}) = d\left( \sum_{i=1}^{n} A_{i1} e_i, A^{(2)}, \ldots, A^{(n)} \right) = \sum_i A_{i1}\, d(e_i, A^{(2)}, \ldots, A^{(n)}) = \sum_{i,j} A_{i1} A_{j2}\, d(e_i, e_j, A^{(3)}, \ldots, A^{(n)}) = \cdots = \sum_{i_1, \ldots, i_n} \left( \prod_{j=1}^{n} A_{i_j j} \right) d(e_{i_1}, \ldots, e_{i_n}).$$
But $d(e_{i_1}, \ldots, e_{i_n}) = 0$ unless $i_1, \ldots, i_n$ are distinct, that is unless there is some $\sigma \in S_n$ such that $i_j = \sigma(j)$. Thus
$$d(A^{(1)}, \ldots, A^{(n)}) = \sum_{\sigma \in S_n} \left( \prod_{j=1}^{n} A_{\sigma(j)j} \right) d(e_{\sigma(1)}, \ldots, e_{\sigma(n)}).$$
But $d(e_{\sigma(1)}, \ldots, e_{\sigma(n)}) = \epsilon(\sigma)\, d(e_1, \ldots, e_n)$, so we're done (using $\det A^T = \det A$). $\square$

Remark. We can interpret this as saying that for every matrix $A$,
$$d(Ae_1, \ldots, Ae_n) = \det A \cdot d(e_1, \ldots, e_n).$$
The same proof gives $d(Av_1, \ldots, Av_n) = \det A \cdot d(v_1, \ldots, v_n)$ for all $v_1, \ldots, v_n \in \mathbb{F}^n$. We can view this result as the motivation for the formula defining the determinant; $\det A$ is the unique way to define the volume scaling factor of the linear map given by $A$.

Theorem. Let $A, B \in \operatorname{Mat}_n(\mathbb{F})$. Then $\det(AB) = \det A \det B$.

Proof. Let $d$ be a non-zero volume form on $\mathbb{F}^n$, for example $\det$. Then we can compute
$$d(ABe_1, \ldots, ABe_n) = \det(AB)\, d(e_1, \ldots, e_n)$$
by the last theorem. But we can also compute
$$d(ABe_1, \ldots, ABe_n) = \det A \cdot d(Be_1, \ldots, Be_n) = \det A \det B\, d(e_1, \ldots, e_n)$$
by the remark extending the last theorem. Thus as $d(e_1, \ldots, e_n) \neq 0$ we can see that $\det(AB) = \det A \det B$. $\square$


    Lecture 12

Corollary. If $A$ is invertible then $\det A \neq 0$ and $\det(A^{-1}) = \frac{1}{\det A}$.

Proof. We can compute
$$1 = \det I_n = \det(AA^{-1}) = \det A \det A^{-1}.$$
Thus $\det A^{-1} = \frac{1}{\det A}$ as required. $\square$

Theorem. Let $A \in \operatorname{Mat}_n(\mathbb{F})$. The following statements are equivalent:
(a) $A$ is invertible;
(b) $\det A \neq 0$;
(c) $r(A) = n$.

Proof. We've seen that (a) implies (b) above.

Suppose that $r(A) < n$. Then by the rank-nullity theorem $n(A) > 0$ and so there is some $\lambda \in \mathbb{F}^n \setminus 0$ such that $A\lambda = 0$, i.e. there is a linear relation between the columns of $A$: $\sum_{i=1}^{n} \lambda_i A^{(i)} = 0$ for some $\lambda_i \in \mathbb{F}$ not all zero. Suppose that $\lambda_k \neq 0$ and let $B$ be the matrix with $i$th column $e_i$ for $i \neq k$ and $k$th column $\lambda$. Then $AB$ has $k$th column $0$. Thus $\det AB = 0$. But we can compute $\det AB = \det A \det B = \lambda_k \det A$. Since $\lambda_k \neq 0$, $\det A = 0$. Thus (b) implies (c).

Finally (c) implies (a) by the rank-nullity theorem: $r(A) = n$ implies $n(A) = 0$ and the linear map corresponding to $A$ is bijective as required. $\square$

Notation. Let $\hat{A}_{ij}$ denote the submatrix of $A$ obtained by deleting the $i$th row and the $j$th column.

Lemma. Let $A \in \operatorname{Mat}_n(\mathbb{F})$. Then

(a) (expanding the determinant along the $j$th column) $\det A = \sum_{i=1}^{n} (-1)^{i+j} A_{ij} \det \hat{A}_{ij}$;
(b) (expanding the determinant along the $i$th row) $\det A = \sum_{j=1}^{n} (-1)^{i+j} A_{ij} \det \hat{A}_{ij}$.

Proof. Since $\det A = \det A^T$ it suffices to verify (a). Now
$$\det A = \det(A^{(1)}, \ldots, A^{(n)}) = \det\left( A^{(1)}, \ldots, \sum_i A_{ij} e_i, \ldots, A^{(n)} \right) = \sum_i A_{ij} \det(A^{(1)}, \ldots, e_i, \ldots, A^{(n)}) = \sum_i A_{ij} (-1)^{i+j} \det B$$
where
$$B = \begin{pmatrix} \hat{A}_{ij} & 0 \\ * & 1 \end{pmatrix}.$$
Finally for $\sigma \in S_n$, $\prod_{i=1}^{n} B_{i\sigma(i)} = 0$ unless $\sigma(n) = n$, and we see easily that $\det B = \det \hat{A}_{ij}$ as required. $\square$

Definition. Let $A \in \operatorname{Mat}_n(\mathbb{F})$. The adjugate matrix $\operatorname{adj} A$ is the element of $\operatorname{Mat}_n(\mathbb{F})$ such that
$$(\operatorname{adj} A)_{ij} = (-1)^{i+j} \det \hat{A}_{ji}.$$


Theorem. Let $A \in \operatorname{Mat}_n(\mathbb{F})$. Then
$$(\operatorname{adj} A)A = A(\operatorname{adj} A) = (\det A) I_n.$$
Thus if $\det A \neq 0$ then $A^{-1} = \frac{1}{\det A} \operatorname{adj} A$.

Proof. We compute
$$((\operatorname{adj} A)A)_{jk} = \sum_{i=1}^{n} (\operatorname{adj} A)_{ji} A_{ik} = \sum_{i=1}^{n} (-1)^{j+i} \det \hat{A}_{ij}\, A_{ik}.$$
The right-hand side is $\det A$ if $k = j$. If $k \neq j$ then the right-hand side is the determinant of the matrix obtained by replacing the $j$th column of $A$ by the $k$th column. Since the resulting matrix has two identical columns, $((\operatorname{adj} A)A)_{jk} = 0$ in this case. Therefore $(\operatorname{adj} A)A = (\det A)I_n$ as required.

We can now obtain $A \operatorname{adj} A = (\det A)I_n$ either by using a similar argument using the rows or by considering the transpose of $A \operatorname{adj} A$. The final part follows immediately. $\square$
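A direct transcription of the cofactor formula, checked against numpy's inverse; a sketch only (cost roughly $O(n^5)$, so never used in practice, in line with the remark below):

```python
import numpy as np

def adjugate(A):
    """adj(A)[i, j] = (-1)^{i+j} det(A-hat_{ji})."""
    n = A.shape[0]
    adj = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # A-hat_{ji}: delete row j and column i
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
assert np.allclose(adjugate(A) @ A, np.linalg.det(A) * np.eye(3))
assert np.allclose(np.linalg.inv(A), adjugate(A) / np.linalg.det(A))
```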

Remark. Note that the entries of the adjugate matrix are all given by polynomials in the entries of $A$. Since the determinant is also a polynomial, it follows that the entries of the inverse of an invertible square matrix are given by a rational function (i.e. a ratio of two polynomial functions) in the entries of $A$. Whilst this is a very useful fact from a theoretical point of view, computationally there are better ways of computing the determinant and inverse of a matrix than using these formulae.

We'll complete this section on determinants of matrices with a couple of results about block triangular matrices.

Lemma. Let $A$ and $B$ be square matrices. Then
$$\det \begin{pmatrix} A & C \\ 0 & B \end{pmatrix} = \det(A)\det(B).$$

Proof. Suppose $A \in \operatorname{Mat}_k(\mathbb{F})$ and $B \in \operatorname{Mat}_l(\mathbb{F})$ with $k + l = n$, so $C \in \operatorname{Mat}_{k,l}(\mathbb{F})$. Define
$$X = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix};$$
then
$$\det X = \sum_{\sigma \in S_n} \epsilon(\sigma) \prod_{i=1}^{n} X_{i\sigma(i)}.$$
Since $X_{ij} = 0$ whenever $i > k$ and $j \leq k$, the terms with $\sigma$ such that $\sigma(i) \leq k$ for some $i > k$ are all zero. So we may restrict the sum to those $\sigma$ such that $\sigma(i) > k$ for $i > k$, i.e. those that restrict to a permutation of $\{1, \ldots, k\}$. We may factorise these as $\sigma = \sigma_1 \sigma_2$ with $\sigma_1 \in S_k$ and $\sigma_2$ a permutation of $\{k+1, \ldots, n\}$. Thus
$$\det X = \sum_{\sigma_1, \sigma_2} \epsilon(\sigma_1 \sigma_2) \prod_{i=1}^{k} X_{i\sigma_1(i)} \prod_{j=1}^{l} X_{j+k,\, \sigma_2(j+k)} = \left( \sum_{\sigma_1 \in S_k} \epsilon(\sigma_1) \prod_{i=1}^{k} A_{i\sigma_1(i)} \right)\left( \sum_{\sigma_2 \in S_l} \epsilon(\sigma_2) \prod_{j=1}^{l} B_{j\sigma_2(j)} \right) = \det A \det B. \qquad \square$$
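A one-line numerical confirmation of the lemma (np.block assembles the block matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2))
X = np.block([[A, C], [np.zeros((2, 3)), B]])
assert np.isclose(np.linalg.det(X), np.linalg.det(A) * np.linalg.det(B))
```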

Corollary.
$$\det \begin{pmatrix} A_1 & & * \\ & \ddots & \\ 0 & & A_k \end{pmatrix} = \prod_{i=1}^{k} \det A_i.$$

Warning: it is not true in general that if $A, B, C, D \in \operatorname{Mat}_n(\mathbb{F})$ and $M$ is the element of $\operatorname{Mat}_{2n}(\mathbb{F})$ given by
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
then $\det M = \det A \det D - \det B \det C$.

    Lecture 13

    6. Endomorphisms

    6.1. Invariants.

Definition. Suppose that $V$ is a finite dimensional vector space over $\mathbb{F}$. An endomorphism of $V$ is a linear map $\alpha : V \to V$. We'll write $\operatorname{End}(V)$ to denote the vector space of endomorphisms of $V$ and $\iota$ to denote the identity endomorphism of $V$.

When considering endomorphisms as matrices it is usual to choose the same basis for $V$ for both the domain and the range.

Lemma. Suppose that $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_n)$ are bases for $V$ such that $f_i = \sum_k P_{ki} e_k$. Let $\alpha \in \operatorname{End}(V)$, $A$ be the matrix representing $\alpha$ with respect to $(e_1, \ldots, e_n)$ and $B$ the matrix representing $\alpha$ with respect to $(f_1, \ldots, f_n)$. Then $B = P^{-1}AP$.

Proof. This is a special case of the change of basis formula for linear maps between f.d. vector spaces. $\square$

Definition. We say matrices $A$ and $B$ are similar (or conjugate) if $B = P^{-1}AP$ for some invertible matrix $P$.

Recall $\operatorname{GL}_n(\mathbb{F})$ denotes all the invertible matrices in $\operatorname{Mat}_n(\mathbb{F})$. Then $\operatorname{GL}_n(\mathbb{F})$ acts on $\operatorname{Mat}_n(\mathbb{F})$ by conjugation, and two such matrices are similar precisely if they lie in the same orbit. Thus similarity is an equivalence relation.

An important problem is to classify elements of $\operatorname{Mat}_n(\mathbb{F})$ up to similarity (i.e. classify $\operatorname{GL}_n(\mathbb{F})$-orbits). It will help us to find basis-independent invariants of the corresponding endomorphisms. For example we'll see that given $\alpha \in \operatorname{End}(V)$ the rank, trace, determinant, characteristic polynomial and eigenvalues of $\alpha$ are all basis-independent.


Recall that the trace of $A \in \operatorname{Mat}_n(\mathbb{F})$ is defined by $\operatorname{tr} A = \sum_i A_{ii} \in \mathbb{F}$.

Lemma.

(a) If $A \in \operatorname{Mat}_{n,m}(\mathbb{F})$ and $B \in \operatorname{Mat}_{m,n}(\mathbb{F})$ then $\operatorname{tr} AB = \operatorname{tr} BA$.
(b) If $A$ and $B$ are similar then $\operatorname{tr} A = \operatorname{tr} B$.
(c) If $A$ and $B$ are similar then $\det A = \det B$.

Proof. (a)
$$\operatorname{tr} AB = \sum_{i=1}^{n} \sum_{j=1}^{m} A_{ij} B_{ji} = \sum_{j=1}^{m} \sum_{i=1}^{n} B_{ji} A_{ij} = \operatorname{tr} BA.$$
If $B = P^{-1}AP$ then:

(b) $\operatorname{tr} B = \operatorname{tr}((P^{-1}A)P) = \operatorname{tr}(P(P^{-1}A)) = \operatorname{tr} A$.

(c) $\det B = \det P^{-1} \det A \det P = \frac{1}{\det P} \det A \det P = \det A$. $\square$
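Both invariances are easy to observe numerically; a minimal sketch (a random square matrix is invertible with probability 1, an assumption made for brevity):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))   # tr AB = tr BA

M = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))
N = np.linalg.inv(P) @ M @ P             # N is similar to M
assert np.isclose(np.trace(N), np.trace(M))
assert np.isclose(np.linalg.det(N), np.linalg.det(M))
```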

Definition. Let $\alpha \in \operatorname{End}(V)$, $(e_1, \ldots, e_n)$ be a basis for $V$ and $A$ the matrix representing $\alpha$ with respect to $(e_1, \ldots, e_n)$. Then the trace of $\alpha$, written $\operatorname{tr}\alpha$, is defined to be the trace of $A$ and the determinant of $\alpha$, written $\det\alpha$, is defined to be the determinant of $A$.

We've proven that the trace and determinant of $\alpha$ do not depend on the choice of basis $(e_1, \ldots, e_n)$.

Definition. Let $\alpha \in \operatorname{End}(V)$.

(a) $\lambda \in \mathbb{F}$ is an eigenvalue of $\alpha$ if there is $v \in V \setminus 0$ such that $\alpha v = \lambda v$.
(b) $v \in V$ is an eigenvector for $\alpha$ if $\alpha(v) = \lambda v$ for some $\lambda \in \mathbb{F}$.
(c) When $\lambda \in \mathbb{F}$, the $\lambda$-eigenspace of $\alpha$, written $E_\alpha(\lambda)$ or simply $E(\lambda)$, is the set of $\lambda$-eigenvectors of $\alpha$; i.e. $E_\alpha(\lambda) = \ker(\alpha - \lambda\iota)$.
(d) The characteristic polynomial of $\alpha$ is defined by
$$\chi_\alpha(t) = \det(t\iota - \alpha).$$

Remarks.

(1) $\chi_\alpha(t)$ is a monic polynomial in $t$ of degree $\dim V$.

(2) $\lambda \in \mathbb{F}$ is an eigenvalue of $\alpha$ if and only if $\ker(\alpha - \lambda\iota) \neq 0$, if and only if $\lambda$ is a root of $\chi_\alpha(t)$.

(3) If $A \in \operatorname{Mat}_n(\mathbb{F})$ we can define $\chi_A(t) = \det(tI_n - A)$. Then similar matrices have the same characteristic polynomials.

Lemma. Let $\alpha \in \operatorname{End}(V)$ and $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $\alpha$. Then $E(\lambda_1) + \cdots + E(\lambda_k)$ is a direct sum of the $E(\lambda_i)$.

Proof. Suppose that $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} y_i$ with $x_i, y_i \in E(\lambda_i)$. Consider the linear maps
$$\pi_j := \prod_{i \neq j} (\alpha - \lambda_i \iota).$$


Then
$$\pi_j\Big(\sum_{i=1}^{k} x_i\Big) = \sum_{i=1}^{k}\pi_j(x_i) = \sum_{i=1}^{k}\prod_{r\neq j}(\alpha - \lambda_r\iota)(x_i) = \sum_{i=1}^{k}\prod_{r\neq j}(\lambda_i - \lambda_r)\,x_i = \prod_{r\neq j}(\lambda_j - \lambda_r)\,x_j.$$
Similarly, π_j(Σ_{i=1}^k y_i) = Π_{r≠j}(λ_j − λ_r) y_j. Thus since Π_{r≠j}(λ_j − λ_r) ≠ 0, x_j = y_j and the expression is unique. □

Note that the proof of this lemma shows that any set of non-zero eigenvectors with distinct eigenvalues is linearly independent.

Definition. α ∈ End(V) is diagonalisable if there is a basis for V such that the corresponding matrix is diagonal.

Theorem. Let α ∈ End(V). Let λ_1, …, λ_k be the distinct eigenvalues of α. Write E_i = E(λ_i). Then the following are equivalent:
(a) α is diagonalisable;
(b) V has a basis consisting of eigenvectors of α;
(c) V = ⊕_{i=1}^k E_i;
(d) Σ dim E_i = dim V.

Proof. Suppose that (e_1, …, e_n) is a basis for V and A is the matrix representing α with respect to this basis. Then α(e_i) = Σ_j A_{ji} e_j. Thus A is diagonal if and only if each e_i is an eigenvector for α, i.e. (a) and (b) are equivalent.
Now (b) is equivalent to V = Σ E_i and we've proven that Σ E_i = ⊕_{i=1}^k E_i, so (b) and (c) are equivalent.
The equivalence of (c) and (d) is a basic fact about direct sums that follows from Example Sheet 1 Q10. □
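Criterion (d) is easy to test mechanically; a minimal sketch assuming sympy, with a single Jordan block as the illustration:

```python
from sympy import Matrix

A = Matrix([[2, 1], [0, 2]])    # one eigenvalue, one-dimensional eigenspace
# eigenvects() returns triples (eigenvalue, algebraic multiplicity, eigenspace basis)
total = sum(len(basis) for _, _, basis in A.eigenvects())
assert total < A.rows             # sum of dim E(lambda_i) < dim V ...
assert not A.is_diagonalizable()  # ... so A is not diagonalisable
```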

    Lecture 14

    6.2. Minimal polynomials.

    6.2.1. An aside on polynomials.

Definition. A polynomial over F is something of the form
f(t) = a_m t^m + ⋯ + a_1 t + a_0
for some m ≥ 0 and a_0, …, a_m ∈ F. The largest n such that a_n ≠ 0 is the degree of f, written deg f. Thus deg 0 = −∞.


It is straightforward to show that
deg(f + g) ≤ max(deg f, deg g)
and
deg fg = deg f + deg g.

Notation. We write F[t] := {polynomials with coefficients in F}.

Note that a polynomial over F defines a function F → F but we don't identify the polynomial with this function. For example if F = Z/pZ then t^p ≠ t in F[t] even though they define the same function on F. If you are restricting F to be just R or C this point is not important.

Lemma (Polynomial division). Given f, g ∈ F[t], g ≠ 0, there exist q, r ∈ F[t] such that f(t) = q(t)g(t) + r(t) and deg r < deg g.
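For instance, quotient and remainder can be computed with sympy's div; a small sketch assuming that library:

```python
from sympy import symbols, div, degree

t = symbols('t')
f = t**3 + 2*t + 1
g = t**2 + 1
q, r = div(f, g, t)                  # f = q*g + r
assert ((q*g + r) - f).expand() == 0
assert degree(r, t) < degree(g, t)   # deg r < deg g
```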


    6.2.2. Minimal polynomials.

Notation. Given f(t) = Σ_{i=0}^m a_i t^i ∈ F[t], A ∈ Mat_n(F) and α ∈ End(V) we write
$$f(A) := \sum_{i=0}^{m} a_i A^i \qquad\text{and}\qquad f(\alpha) := \sum_{i=0}^{m} a_i \alpha^i.$$
Here A^0 = I_n and α^0 = ι.

Theorem. Suppose that α ∈ End(V). Then α is diagonalisable if and only if there is a non-zero polynomial p(t) ∈ F[t] that can be expressed as a product of distinct linear factors such that p(α) = 0.

Proof. Suppose that α is diagonalisable and λ_1, …, λ_k ∈ F are the distinct eigenvalues of α. Thus if v is an eigenvector for α then α(v) = λ_i v for some i = 1, …, k. Let p(t) = Π_{j=1}^k (t − λ_j).
Since α is diagonalisable, V = ⊕_{i=1}^k E(λ_i) and each v ∈ V can be written as v = Σ v_i with v_i ∈ E(λ_i). Then
$$p(\alpha)(v) = \sum_{i=1}^{k} p(\alpha)(v_i) = \sum_{i=1}^{k}\prod_{j=1}^{k}(\lambda_i - \lambda_j)\,v_i = 0.$$

Thus p(α)(v) = 0 for all v ∈ V and so p(α) = 0 ∈ End(V).
Conversely, suppose p(α) = 0 for p(t) = Π_{i=1}^k (t − λ_i) with λ_1, …, λ_k ∈ F distinct (note that without loss of generality we may assume that p is monic). We will show that V = ⊕_{i=1}^k E(λ_i). Since the sum of eigenspaces is always direct, it suffices to show that every element v ∈ V can be written as a sum of eigenvectors.
Let
$$q_j(t) := \prod_{\substack{i=1\\ i\neq j}}^{k} \frac{(t-\lambda_i)}{(\lambda_j - \lambda_i)}$$
for j = 1, …, k. Thus q_j(λ_i) = δ_{ij}.
Now q(t) = Σ_{j=1}^k q_j(t) ∈ F[t] has degree at most k − 1 and q(λ_i) = 1 for each i = 1, …, k. It follows that q(t) = 1 (a polynomial of degree at most k − 1 taking the value 1 at k distinct points must be constantly 1).
Let π_j: V → V be defined by π_j = q_j(α). Then Σ π_j = q(α) = ι. Let v ∈ V. Then v = ι(v) = Σ π_j(v). But
$$(\alpha - \lambda_j\iota)\,q_j(\alpha) = \frac{1}{\prod_{i\neq j}(\lambda_j - \lambda_i)}\,p(\alpha) = 0.$$
Thus π_j(v) ∈ ker(α − λ_j ι) = E(λ_j) and we're done. □

Remark. In the above proof, if v ∈ E(λ_i) then π_j(v) = q_j(λ_i)v = δ_{ij}v. So π_j is a projection onto E(λ_j) along ⊕_{i≠j} E(λ_i).
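To make the projections π_j concrete, here is a minimal sketch assuming sympy; the 2 × 2 matrix is our own illustration with eigenvalues 1 and 3:

```python
from sympy import Matrix, eye

A = Matrix([[2, 1], [1, 2]])   # eigenvalues 1 and 3
I = eye(2)
# q_1(t) = (t - 3)/(1 - 3) and q_2(t) = (t - 1)/(3 - 1), as in the proof
P1 = (A - 3*I) / (1 - 3)
P2 = (A - 1*I) / (3 - 1)

assert P1 + P2 == I                    # the projections sum to the identity
assert P1*P1 == P1 and P2*P2 == P2     # each pi_j is idempotent
assert A*P1 == 1*P1 and A*P2 == 3*P2   # pi_j projects onto E(lambda_j)
```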


    Lecture 15

Definition. The minimal polynomial of α ∈ End(V) is the non-zero monic polynomial m_α(t) of least degree such that m_α(α) = 0.

Note that if dim V = n < ∞ then ι, α, α², …, α^{n²} are n² + 1 elements of the n²-dimensional space End(V), so they are linearly dependent; thus some non-zero polynomial of degree at most n² kills α and m_α exists.


and so α(v) ∈ E_i as claimed. Thus we can view α|_{E_i} as an endomorphism of E_i.
Now since α is diagonalisable, the minimal polynomial m_α of α has distinct linear factors. But m_α(α|_{E_i}) = m_α(α)|_{E_i} = 0. Thus α|_{E_i} is diagonalisable for each E_i and we can find B_i a basis of E_i consisting of eigenvectors of α. Then B = ⋃_{i=1}^k B_i is a basis for V. Moreover α and β are both diagonal with respect to this basis. □

6.3. The Cayley-Hamilton Theorem. Recall that the characteristic polynomial of an endomorphism α ∈ End(V) is defined by χ_α(t) = det(tι − α).

Theorem (Cayley–Hamilton Theorem). Suppose that V is a f.d. vector space over F and α ∈ End(V). Then χ_α(α) = 0. In particular m_α divides χ_α (and so deg m_α ≤ dim V).

Remarks.
(1) It is tempting to substitute t = A into χ_A(t) = det(tI_n − A), but it is not possible to make sense of this.
(2) If p(t) ∈ F[t] and
$$A = \begin{pmatrix} \lambda_1 & & 0\\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$
is diagonal then
$$p(A) = \begin{pmatrix} p(\lambda_1) & & 0\\ & \ddots & \\ 0 & & p(\lambda_n) \end{pmatrix}.$$
So as χ_A(t) = Π_{i=1}^n (t − λ_i) we see χ_A(A) = 0. So Cayley–Hamilton is obvious when α is diagonalisable.

Definition. α ∈ End(V) is triangulable if there is a basis for V such that the corresponding matrix is upper triangular.

Lemma. An endomorphism α is triangulable if and only if χ_α(t) can be written as a product of linear factors. In particular if F = C then every matrix is triangulable.

Proof. Suppose that α is triangulable and is represented by
$$\begin{pmatrix} a_1 & & *\\ & \ddots & \\ 0 & & a_n \end{pmatrix}$$
with respect to some basis. Then
$$\chi_\alpha(t) = \det\begin{pmatrix} t-a_1 & & *\\ & \ddots & \\ 0 & & t-a_n \end{pmatrix} = \prod_{i=1}^{n}(t - a_i).$$
Thus χ_α is a product of linear factors.
We'll prove the converse by induction on n = dim V. If n = 1 every matrix is triangulable. Suppose that n > 1 and the result holds for all endomorphisms of spaces of smaller dimension. By hypothesis χ_α(t) has a root λ ∈ F. Let U = E(λ) ≠ 0. Let W be a vector space complement for U in V. Let u_1, …, u_r be


a basis for U and w_{r+1}, …, w_n a basis for W, so that u_1, …, u_r, w_{r+1}, …, w_n is a basis for V. Then α is represented by a matrix of the form
$$\begin{pmatrix} \lambda I_r & *\\ 0 & B \end{pmatrix}.$$
Moreover because this matrix is block triangular we know that
χ_α(t) = χ_{λI_r}(t) χ_B(t) = (t − λ)^r χ_B(t).
Thus as χ_α is a product of linear factors, χ_B must be also. Let β be the linear map W → W defined by B. (Warning: β is not just α|_W in general. However it is true that (α − β)(w) ∈ U for all w ∈ W.) Since dim W < dim V there is another basis v_{r+1}, …, v_n for W such that the matrix C representing β is upper triangular.
Since for each j = 1, …, n − r, α(v_{j+r}) = u_j + Σ_{k=1}^{n−r} C_{kj} v_{k+r} for some u_j ∈ U, the matrix representing α with respect to the basis u_1, …, u_r, v_{r+1}, …, v_n is of the form
$$\begin{pmatrix} \lambda I_r & *\\ 0 & C \end{pmatrix}$$
which is upper triangular. □

    Lecture 16

Recall the following lemma.

Lemma. If V is a finite dimensional vector space over C then every α ∈ End(V) is triangulable.

Example. The real matrix
$$\begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}$$
is not similar to an upper triangular matrix over R for θ ∉ πZ since its eigenvalues are e^{±iθ} ∉ R. Of course it is similar to a diagonal matrix over C.

Theorem (Cayley–Hamilton Theorem). Suppose that V is a f.d. vector space over F and α ∈ End(V). Then χ_α(α) = 0. In particular m_α divides χ_α (and so deg m_α ≤ dim V).

Proof of Cayley–Hamilton when F = C. Since F = C we've seen that there is a basis (e_1, …, e_n) for V such that α is represented by an upper triangular matrix
$$A = \begin{pmatrix} \lambda_1 & & *\\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}.$$
Then we can compute χ_α(t) = Π_{i=1}^n (t − λ_i). Let V_j = ⟨e_1, …, e_j⟩ for j = 0, …, n, so we have
0 = V_0 ⊂ V_1 ⊂ ⋯ ⊂ V_{n−1} ⊂ V_n = V
with dim V_j = j. Since α(e_i) = Σ_{k=1}^n A_{ki}e_k = Σ_{k=1}^i A_{ki}e_k, we see that α(V_j) ⊂ V_j for each j = 0, …, n. Moreover (α − λ_j ι)(e_j) = Σ_{k=1}^{j−1} A_{kj}e_k, so
(α − λ_j ι)(V_j) ⊂ V_{j−1} for each j = 1, …, n.


Thus we see inductively that Π_{i=j}^n (α − λ_i ι)(V_n) ⊂ V_{j−1}. In particular
$$\prod_{i=1}^{n}(\alpha - \lambda_i\iota)(V) \subset V_0 = 0.$$
Thus χ_α(α) = 0 as claimed. □

Remark. It is straightforward to extend this to the case F = R: since R ⊂ C, if A ∈ Mat_n(R) then we can view A as an element of Mat_n(C) to see that χ_A(A) = 0. But then if α ∈ End(V) for any vector space V over R we can take A to be the matrix representing α over some basis. Then χ_α(α) = χ_A(α) is represented by χ_A(A) and so is zero.

Second proof of Cayley–Hamilton. Let A ∈ Mat_n(F) and let B = tI_n − A. We can compute that adj B is an n × n matrix with entries elements of F[t] of degree at most n − 1. So we can write
adj B = B_{n−1}t^{n−1} + B_{n−2}t^{n−2} + ⋯ + B_1 t + B_0
with each B_i ∈ Mat_n(F). Now we know that B adj B = (det B)I_n = χ_A(t)I_n, i.e.
(tI_n − A)(B_{n−1}t^{n−1} + B_{n−2}t^{n−2} + ⋯ + B_1 t + B_0) = (t^n + a_{n−1}t^{n−1} + ⋯ + a_0)I_n
where χ_A(t) = t^n + a_{n−1}t^{n−1} + ⋯ + a_0. Comparing coefficients in t^k for k = n, …, 0 we see
$$\begin{aligned} B_{n-1} &= I_n\\ B_{n-2} - AB_{n-1} &= a_{n-1}I_n\\ B_{n-3} - AB_{n-2} &= a_{n-2}I_n\\ &\ \,\vdots\\ -AB_0 &= a_0 I_n. \end{aligned}$$
Multiplying the equation coming from the coefficient of t^k by A^k, we get
$$\begin{aligned} A^n B_{n-1} &= A^n\\ A^{n-1}B_{n-2} - A^n B_{n-1} &= a_{n-1}A^{n-1}\\ A^{n-2}B_{n-3} - A^{n-1}B_{n-2} &= a_{n-2}A^{n-2}\\ &\ \,\vdots\\ -AB_0 &= a_0 I_n. \end{aligned}$$
Summing, the left-hand sides telescope to zero, and we get 0 = χ_A(A) as required. □
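The theorem is easy to test on an example by evaluating χ_A at A with Horner's rule; a sketch assuming sympy:

```python
from sympy import Matrix, symbols, eye, zeros

t = symbols('t')
A = Matrix([[1, 2, 0], [0, 1, 1], [3, 0, 2]])
coeffs = A.charpoly(t).all_coeffs()        # [1, a_{n-1}, ..., a_0]

chi_of_A = zeros(3, 3)
for c in coeffs:
    chi_of_A = chi_of_A * A + c * eye(3)   # Horner evaluation of chi_A(A)

assert chi_of_A == zeros(3, 3)             # Cayley-Hamilton: chi_A(A) = 0
```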

Lemma. Let α ∈ End(V), λ ∈ F. Then the following are equivalent:
(a) λ is an eigenvalue of α;
(b) λ is a root of χ_α(t);
(c) λ is a root of m_α(t).

Proof. We saw the equivalence of (a) and (b) in section 6.1.
Suppose that λ is an eigenvalue of α. There is some v ∈ V non-zero such that αv = λv. Then for any polynomial p ∈ F[t], p(α)v = p(λ)v, so
0 = m_α(α)v = m_α(λ)(v).


Since v ≠ 0 it follows that m_α(λ) = 0. Thus (a) implies (c).
Since m_α(t) is a factor of χ_α(t) by the Cayley–Hamilton Theorem, we see that (c) implies (b).
Alternatively, we could prove (c) implies (a) directly: suppose that m_α(λ) = 0. Then m_α(t) = (t − λ)g(t) for some g ∈ F[t]. Since deg g < deg m_α, g(α) ≠ 0. Thus there is some v ∈ V such that g(α)(v) ≠ 0. But then (α − λι)(g(α)(v)) = m_α(α)(v) = 0. So g(α)(v) ≠ 0 is a λ-eigenvector of α and so λ is an eigenvalue of α. □

Example. What is the minimal polynomial of
$$A = \begin{pmatrix} 1 & 0 & 2\\ 0 & 1 & 1\\ 0 & 0 & 2 \end{pmatrix}?$$
We can compute χ_A(t) = (t − 1)²(t − 2). So by Cayley–Hamilton m_A(t) is a factor of (t − 1)²(t − 2). Moreover by the lemma it must be a multiple of (t − 1)(t − 2). So m_A is one of (t − 1)(t − 2) and (t − 1)²(t − 2).
We can compute
$$(A - I)(A - 2I) = \begin{pmatrix} 0 & 0 & 2\\ 0 & 0 & 1\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} -1 & 0 & 2\\ 0 & -1 & 1\\ 0 & 0 & 0 \end{pmatrix} = 0.$$
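A mechanical confirmation of this product, assuming sympy:

```python
from sympy import Matrix, eye, zeros

A = Matrix([[1, 0, 2], [0, 1, 1], [0, 0, 2]])
assert (A - eye(3)) * (A - 2*eye(3)) == zeros(3, 3)   # so m_A(t) = (t-1)(t-2)
```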

Thus m_A(t) = (t − 1)(t − 2). Since this has distinct roots, A is diagonalisable.

6.4. Multiplicities of eigenvalues and Jordan Normal Form.

Definition (Multiplicity of eigenvalues). Suppose that α ∈ End(V) and λ is an eigenvalue of α:
(a) the algebraic multiplicity of λ is
a_λ := the multiplicity of λ as a root of χ_α(t);
(b) the geometric multiplicity of λ is
g_λ := dim E(λ);
(c) another useful number is
c_λ := the multiplicity of λ as a root of m_α(t).

Examples.
(1) If
$$A = \begin{pmatrix} \lambda & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1\\ & & & \lambda \end{pmatrix} \in \operatorname{Mat}_n(F)$$
then g_λ = 1 and a_λ = c_λ = n.
(2) If A = λI then g_λ = a_λ = n and c_λ = 1.

Lemma. Let α ∈ End(V) and λ ∈ F an eigenvalue of α. Then
(a) 1 ≤ g_λ ≤ a_λ and
(b) 1 ≤ c_λ ≤ a_λ.


Proof. (a) By definition if λ is an eigenvalue of α then E(λ) ≠ 0, so g_λ ≥ 1. Suppose that v_1, …, v_g is a basis for E(λ) and extend it to a basis v_1, …, v_n for V. Then α is represented by a matrix of the form
$$\begin{pmatrix} \lambda I_g & *\\ 0 & B \end{pmatrix}.$$
Thus χ_α(t) = χ_{λI_g}(t) χ_B(t) = (t − λ)^g χ_B(t). So a_λ ≥ g = g_λ.
(b) We've seen that if λ is an eigenvalue of α then λ is a root of m_α(t), so c_λ ≥ 1. Cayley–Hamilton says m_α(t) divides χ_α(t), so c_λ ≤ a_λ. □

Lemma. Suppose that F = C and α ∈ End(V). Then the following are equivalent:
(a) α is diagonalisable;
(b) a_λ = g_λ for all eigenvalues λ of α;
(c) c_λ = 1 for all eigenvalues λ of α.

Proof. To see that (a) is equivalent to (b) suppose that the distinct eigenvalues of α are λ_1, …, λ_k. Then α is diagonalisable if and only if dim V = Σ_{i=1}^k dim E(λ_i) = Σ_{i=1}^k g_{λ_i}. But g_λ ≤ a_λ for each eigenvalue, and Σ_{i=1}^k a_{λ_i} = deg χ_α = dim V by the Fundamental Theorem of Algebra. Thus α is diagonalisable if and only if g_{λ_i} = a_{λ_i} for each i = 1, …, k.
Since by the Fundamental Theorem of Algebra, for any such α, m_α(t) may be written as a product of linear factors, α is diagonalisable if and only if these factors are distinct. This is equivalent to c_λ = 1 for every eigenvalue λ of α. □

Definition. We say that a matrix A ∈ Mat_n(C) is in Jordan Normal Form (JNF) if it is a block diagonal matrix
$$A = \begin{pmatrix} J_{n_1}(\lambda_1) & 0 & \cdots & 0\\ 0 & J_{n_2}(\lambda_2) & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & J_{n_k}(\lambda_k) \end{pmatrix}$$
where k ≥ 1, n_1, …, n_k ∈ N are such that Σ_{i=1}^k n_i = n and λ_1, …, λ_k ∈ C (not necessarily distinct), and J_m(λ) ∈ Mat_m(C) has the form
$$J_m(\lambda) := \begin{pmatrix} \lambda & 1 & & 0\\ & \lambda & \ddots & \\ & & \ddots & 1\\ 0 & & & \lambda \end{pmatrix}.$$
We call the J_m(λ) Jordan blocks.

Note J_m(λ) = λI_m + J_m(0).

    Lecture 17

Theorem (Jordan Normal Form). Every matrix A ∈ Mat_n(C) is similar to a matrix in JNF. Moreover this matrix in JNF is uniquely determined by A up to reordering the Jordan blocks.


Remarks.
(1) Of course, we can rephrase this as: whenever α is an endomorphism of a f.d. C-vector space V, there is a basis of V such that α is represented by a matrix in JNF. Moreover, this matrix is uniquely determined by α up to reordering the Jordan blocks.
(2) Two matrices in JNF that differ only in the ordering of the blocks are similar. A corresponding basis change arises as a reordering of the basis vectors.

Examples.
(1) Every 2 × 2 matrix in JNF is of the form
$$\begin{pmatrix} \lambda & 0\\ 0 & \mu \end{pmatrix} \text{ with } \lambda\neq\mu, \qquad \begin{pmatrix} \lambda & 0\\ 0 & \lambda \end{pmatrix} \qquad\text{or}\qquad \begin{pmatrix} \lambda & 1\\ 0 & \lambda \end{pmatrix}.$$
The minimal polynomials are (t − λ)(t − μ), (t − λ) and (t − λ)² respectively. The characteristic polynomials are (t − λ)(t − μ), (t − λ)² and (t − λ)² respectively. Thus we see that the JNF is determined by the minimal polynomial of the matrix in this case (but not by just the characteristic polynomial).

(2) Suppose now that λ_1, λ_2 and λ_3 are distinct complex numbers. Then every 3 × 3 matrix in JNF is one of six forms:
$$\begin{pmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3 \end{pmatrix},\quad \begin{pmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_2 \end{pmatrix},\quad \begin{pmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_2 & 1\\ 0 & 0 & \lambda_2 \end{pmatrix},$$
$$\begin{pmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_1 & 0\\ 0 & 0 & \lambda_1 \end{pmatrix},\quad \begin{pmatrix} \lambda_1 & 0 & 0\\ 0 & \lambda_1 & 1\\ 0 & 0 & \lambda_1 \end{pmatrix}\quad\text{and}\quad \begin{pmatrix} \lambda_1 & 1 & 0\\ 0 & \lambda_1 & 1\\ 0 & 0 & \lambda_1 \end{pmatrix}.$$
The minimal polynomials are (t−λ_1)(t−λ_2)(t−λ_3), (t−λ_1)(t−λ_2), (t−λ_1)(t−λ_2)², (t−λ_1), (t−λ_1)² and (t−λ_1)³ respectively. The characteristic polynomials are (t−λ_1)(t−λ_2)(t−λ_3), (t−λ_1)(t−λ_2)², (t−λ_1)(t−λ_2)², (t−λ_1)³, (t−λ_1)³ and (t−λ_1)³ respectively. So in this case the minimal polynomial does not determine the JNF by itself (when the minimal polynomial is of the form (t−λ_1)(t−λ_2) the JNF must be diagonal, but it cannot be determined whether λ_1 or λ_2 occurs twice on the diagonal); however the minimal and characteristic polynomials together do determine the JNF. In general even these two bits of data together don't suffice to determine everything.

We recall that
$$J_n(\lambda) = \begin{pmatrix} \lambda & 1 & & 0\\ & \lambda & \ddots & \\ & & \ddots & 1\\ 0 & & & \lambda \end{pmatrix}.$$
Thus if (e_1, …, e_n) is the standard basis for C^n we can compute (J_n(λ) − λI_n)e_1 = 0 and (J_n(λ) − λI_n)e_i = e_{i−1} for 1 < i ≤ n. Thus (J_n(λ) − λI_n)^k maps e_1, …, e_k to 0 and e_{k+j} to e_j for 1 ≤ j ≤ n − k. That is,
$$(J_n(\lambda) - \lambda I_n)^k = \begin{pmatrix} 0 & I_{n-k}\\ 0 & 0 \end{pmatrix} \quad\text{for } k < n,$$
and (J_n(λ) − λI_n)^k = 0 for k ≥ n.
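A sketch of these powers for a single 4 × 4 block, assuming sympy (the block is built by hand):

```python
from sympy import Matrix, eye, zeros

n, lam = 4, 5
# J_4(5): lam on the diagonal, 1s on the superdiagonal
J = Matrix(n, n, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))
N = J - lam * eye(n)

for k in range(1, n):
    assert (N**k).rank() == n - k    # rank n - k, so nullity k, for k < n
assert N**n == zeros(n, n)           # (J_n(lam) - lam*I_n)^n = 0
```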


Thus if A = J_n(λ) is a single Jordan block, then χ_A(t) = m_A(t) = (t − λ)^n, so λ is the only eigenvalue of A. Moreover dim E(λ) = 1. Thus a_λ = c_λ = n and g_λ = 1.

Now let A be a block diagonal square matrix, i.e.
$$A = \begin{pmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & A_k \end{pmatrix}.$$
Then χ_A(t) = Π_{i=1}^k χ_{A_i}(t). Moreover, if p ∈ F[t] then
$$p(A) = \begin{pmatrix} p(A_1) & 0 & \cdots & 0\\ 0 & p(A_2) & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & p(A_k) \end{pmatrix},$$
so m_A(t) is the lowest common multiple of m_{A_1}(t), …, m_{A_k}(t).
We also have n(p(A)) = Σ_{i=1}^k n(p(A_i)) for any p ∈ F[t], where n(·) denotes nullity.
In general a_λ is the sum of the sizes of the blocks with eigenvalue λ, which is the same as the number of λs on the diagonal; g_λ is the number of blocks with eigenvalue λ; and c_λ is the size of the largest block with eigenvalue λ.

Theorem. If α ∈ End(V) and A in JNF represents α with respect to some basis, then the number of Jordan blocks J_n(λ) of A with eigenvalue λ and size n ≥ r ≥ 1 is given by
$$|\{\text{Jordan blocks } J_n(\lambda) \text{ in } A \text{ with } n \geq r\}| = n\big((\alpha - \lambda\iota)^r\big) - n\big((\alpha - \lambda\iota)^{r-1}\big).$$

Proof. We work blockwise with

$$A = \begin{pmatrix} J_{n_1}(\lambda_1) & & 0\\ & \ddots & \\ 0 & & J_{n_k}(\lambda_k) \end{pmatrix}.$$
We can compute that
n((J_m(λ) − λI_m)^r) = min(m, r)
and
n((J_m(μ) − λI_m)^r) = 0 when μ ≠ λ.
Adding up for each block we get, for r ≥ 1,
$$\begin{aligned} n\big((\alpha-\lambda\iota)^r\big) - n\big((\alpha-\lambda\iota)^{r-1}\big) &= n\big((A-\lambda I)^r\big) - n\big((A-\lambda I)^{r-1}\big)\\ &= \sum_{\substack{1\le i\le k\\ \lambda_i=\lambda}}\big(\min(r, n_i) - \min(r-1, n_i)\big)\\ &= |\{1\le i\le k \mid \lambda_i=\lambda,\ n_i\ge r\}|\\ &= |\{\text{Jordan blocks } J_n(\lambda) \text{ in } A \text{ with } n\ge r\}| \end{aligned}$$
as required. □

Because these nullities are basis-invariant, it follows that if it exists then the Jordan normal form representing α is unique up to reordering the blocks, as claimed.
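Here is a sketch of the counting formula on a small matrix already in JNF, assuming sympy:

```python
from sympy import Matrix, eye, diag

def nullity(M):
    return M.cols - M.rank()

# Blocks J_2(3) and J_1(3): two blocks of size >= 1, one of size >= 2
A = diag(Matrix([[3, 1], [0, 3]]), Matrix([[3]]))
N = A - 3*eye(3)

blocks_of_size_at_least = lambda r: nullity(N**r) - nullity(N**(r - 1))
assert blocks_of_size_at_least(1) == 2
assert blocks_of_size_at_least(2) == 1
```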


Theorem (Generalised eigenspace decomposition). Let V be a f.d. C-vector space and α ∈ End(V). Suppose that
m_α(t) = (t − λ_1)^{c_1} ⋯ (t − λ_k)^{c_k}
with λ_1, …, λ_k distinct. Then
V = V_1 ⊕ V_2 ⊕ ⋯ ⊕ V_k
where V_j = ker((α − λ_j ι)^{c_j}) is an α-invariant subspace (called a generalised eigenspace).

Note that in the case c_1 = c_2 = ⋯ = c_k = 1 we recover the diagonalisability theorem.

Sketch of proof. Let p_j(t) = Π_{i≠j} (t − λ_i)^{c_i}. Then p_1, …, p_k have no common factor, i.e. they are coprime. Thus by Euclid's algorithm we can find q_1, …, q_k ∈ C[t] such that Σ_{i=1}^k q_i p_i = 1.
Let π_j = q_j(α)p_j(α) for j = 1, …, k. Then Σ_{j=1}^k π_j = ι. Since m_α(α) = 0, (α − λ_j ι)^{c_j} π_j = 0, thus Im π_j ⊂ V_j.
Suppose that v ∈ V. Then
v = ι(v) = Σ_{j=1}^k π_j(v) ∈ Σ V_j.
Thus V = Σ V_j.
But π_i π_j = 0 for i ≠ j, and so π_i = π_i(Σ_{j=1}^k π_j) = π_i² for 1 ≤ i ≤ k. Thus π_j|_{V_j} = ι_{V_j}, and if v = Σ v_j with v_j ∈ V_j then v_j = π_j(v). So V = ⊕ V_j as claimed. □

Lecture 18

Using the generalised eigenspace decomposition theorem we can, by considering the action of α on its generalised eigenspaces separately, reduce the proof of the existence of Jordan normal form for α to the case that it has only one eigenvalue λ. By considering (α − λι) we can even reduce to the case that 0 is the only eigenvalue.

Definition. We say that α ∈ End(V) is nilpotent if there is some k ≥ 0 such that α^k = 0.

Note that α is nilpotent if and only if m_α(t) = t^k for some 1 ≤ k ≤ n. When F = C this is equivalent to 0 being the only eigenvalue for α.

Example. Let
$$A = \begin{pmatrix} 3 & -2 & 0\\ 1 & 0 & 0\\ 1 & 0 & 1 \end{pmatrix}.$$
Find an invertible matrix P such that P^{-1}AP is in JNF.
First we compute the eigenvalues of A:
$$\chi_A(t) = \det\begin{pmatrix} t-3 & 2 & 0\\ -1 & t & 0\\ -1 & 0 & t-1 \end{pmatrix} = (t-1)\big(t(t-3)+2\big) = (t-1)^2(t-2).$$


Next we compute the eigenspaces:
$$A - I = \begin{pmatrix} 2 & -2 & 0\\ 1 & -1 & 0\\ 1 & 0 & 0 \end{pmatrix}$$
which has rank 2 and kernel spanned by (0, 0, 1)^T. Thus E_A(1) = ⟨(0, 0, 1)^T⟩. Similarly
$$A - 2I = \begin{pmatrix} 1 & -2 & 0\\ 1 & -2 & 0\\ 1 & 0 & -1 \end{pmatrix}$$
also has rank 2, with kernel spanned by (2, 1, 2)^T; thus E_A(2) = ⟨(2, 1, 2)^T⟩. Since dim E_A(1) + dim E_A(2) = 2 < 3, A is not diagonalisable. Thus
m_A(t) = χ_A(t) = (t − 1)²(t − 2)
and the JNF of A is
$$J = \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{pmatrix}.$$
So we want to find a basis (v_1, v_2, v_3) such that Av_1 = v_1, Av_2 = v_1 + v_2 and Av_3 = 2v_3, or equivalently (A − I)v_2 = v_1, (A − I)v_1 = 0 and (A − 2I)v_3 = 0. Note that under these conditions (A − I)²v_2 = 0 but (A − I)v_2 ≠ 0.
We compute
$$(A - I)^2 = \begin{pmatrix} 2 & -2 & 0\\ 1 & -1 & 0\\ 2 & -2 & 0 \end{pmatrix}.$$
Thus
ker(A − I)² = ⟨(1, 1, 0)^T, (0, 0, 1)^T⟩.
Take v_2 = (1, 1, 0)^T, v_1 = (A − I)v_2 = (0, 0, 1)^T and v_3 = (2, 1, 2)^T. Then these are linearly independent, so form a basis for C³, and if we take P to have columns v_1, v_2, v_3 we see that P^{-1}AP = J as required.
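sympy's built-in Jordan form routine (which returns P and J with A = PJP^{-1}) confirms the computation; a sketch assuming that library:

```python
from sympy import Matrix

A = Matrix([[3, -2, 0], [1, 0, 0], [1, 0, 1]])
P, J = A.jordan_form()        # A == P*J*P**-1
assert P * J * P.inv() == A
print(J)                      # blocks J_2(1) and J_1(2), up to ordering
```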

    7. Bilinear forms (II)

    7.1. Symmetric bilinear forms and quadratic forms.

Definition. Let V be a vector space over F. A bilinear form φ: V × V → F is symmetric if φ(v_1, v_2) = φ(v_2, v_1) for all v_1, v_2 ∈ V.

Example. Suppose S ∈ Mat_n(F) is a symmetric matrix (i.e. S^T = S); then we can define a symmetric bilinear form φ: F^n × F^n → F by

$$\varphi(x, y) = x^T S y = \sum_{i,j=1}^{n} x_i S_{ij} y_j.$$


It remains to prove uniqueness. Suppose that φ is such a symmetric bilinear form. Then for v, w ∈ V,
$$\begin{aligned} q(v + w) &= \varphi(v + w, v + w)\\ &= \varphi(v, v) + \varphi(v, w) + \varphi(w, v) + \varphi(w, w)\\ &= q(v) + 2\varphi(v, w) + q(w). \end{aligned}$$
Thus φ(v, w) = ½(q(v + w) − q(v) − q(w)). □
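A quick numerical check of this identity, assuming sympy; the symmetric matrix is an arbitrary illustration:

```python
from sympy import Matrix, Rational

S = Matrix([[1, 1, 3], [1, 1, 2], [3, 2, 1]])    # symmetric: S.T == S
phi = lambda x, y: (x.T * S * y)[0]
q = lambda v: phi(v, v)

x, y = Matrix([1, 2, 3]), Matrix([4, 5, 6])
assert phi(x, y) == Rational(1, 2) * (q(x + y) - q(x) - q(y))
```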

    Lecture 19

Theorem (Diagonal form for symmetric bilinear forms). If φ: V × V → F is a symmetric bilinear form on a f.d. vector space V over F, then there is a basis (e_1, …, e_n) for V such that φ is represented by a diagonal matrix.

Proof. By induction on n = dim V. If n = 0, 1 the result is clear. Suppose that we have proven the result for all spaces of dimension strictly smaller than n.
If φ(v, v) = 0 for all v ∈ V, then by the polarisation identity φ is identically zero and is represented by the zero matrix with respect to every basis. Otherwise, we can choose e_1 ∈ V such that φ(e_1, e_1) ≠ 0. Let
U = {u ∈ V | φ(e_1, u) = 0} = ker φ(e_1, −): V → F.
By the rank-nullity theorem, U has dimension n − 1, and e_1 ∉ U, so U is a complement to ⟨e_1⟩ in V.
Consider φ|_{U×U}: U × U → F, a symmetric bilinear form on U. By the induction hypothesis, there is a basis (e_2, …, e_n) for U such that φ|_{U×U} is represented by a diagonal matrix. The basis (e_1, …, e_n) satisfies φ(e_i, e_j) = 0 for i ≠ j and we're done. □

Example. Let q be the quadratic form on R³ given by
$$q\begin{pmatrix} x\\ y\\ z \end{pmatrix} = x^2 + y^2 + z^2 + 2xy + 4yz + 6xz.$$
Find a basis (f_1, f_2, f_3) for R³ such that q is of the form
q(af_1 + bf_2 + cf_3) = λa² + μb² + νc².

Method 1. Let φ be the bilinear form represented by the matrix
$$A = \begin{pmatrix} 1 & 1 & 3\\ 1 & 1 & 2\\ 3 & 2 & 1 \end{pmatrix}$$
so that q(v) = φ(v, v) for all v ∈ R³.
Now q(e_1) = 1 ≠ 0, so let f_1 = e_1 = (1, 0, 0)^T. Then φ(f_1, v) = f_1^T A v = v_1 + v_2 + 3v_3. So we choose f_2 such that φ(f_1, f_2) = 0 but φ(f_2, f_2) ≠ 0. For example,
q((1, −1, 0)^T) = 0 but q((3, 0, −1)^T) = −8 ≠ 0.


So we can take f_2 = (3, 0, −1)^T. Then φ(f_2, v) = f_2^T A v = (0, 1, 8)v = v_2 + 8v_3.
Now we want φ(f_1, f_3) = φ(f_2, f_3) = 0; f_3 = (5, −8, 1)^T will work. Then
q(af_1 + bf_2 + cf_3) = a² − 8b² + 8c².

Method 2. Complete the square:
$$\begin{aligned} x^2 + y^2 + z^2 + 2xy + 4yz + 6xz &= (x + y + 3z)^2 - 2yz - 8z^2\\ &= (x + y + 3z)^2 - 8\Big(z + \frac{y}{8}\Big)^2 + \frac{y^2}{8}. \end{aligned}$$
Now solve x + y + 3z = 1, z + y/8 = 0 and y = 0 to obtain f_1 = (1, 0, 0)^T; solve x + y + 3z = 0, z + y/8 = 1 and y = 0 to obtain f_2 = (−3, 0, 1)^T; and solve x + y + 3z = 0, z + y/8 = 0 and y = 1 to obtain f_3 = (−5/8, 1, −1/8)^T.
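Either basis can be checked by a single congruence computation; a sketch for Method 1, assuming sympy:

```python
from sympy import Matrix, diag

A = Matrix([[1, 1, 3], [1, 1, 2], [3, 2, 1]])
P = Matrix([[1, 3, 5], [0, 0, -8], [0, -1, 1]])    # columns f1, f2, f3
assert P.T * A * P == diag(1, -8, 8)                # q = a^2 - 8b^2 + 8c^2
```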

Corollary. Let φ be a symmetric bilinear form on a f.d. C-vector space V. Then there is a basis (v_1, …, v_n) for V such that φ is represented by a matrix of the form
$$\begin{pmatrix} I_r & 0\\ 0 & 0 \end{pmatrix}$$
with r = r(φ), or equivalently such that the corresponding quadratic form q is given by q(Σ_{i=1}^n a_i v_i) = Σ_{i=1}^r a_i².

Proof. We have already shown that there is a basis (e_1, …, e_n) such that φ(e_i, e_j) = δ_{ij}λ_j for some λ_1, …, λ_n ∈ C. By reordering the e_i we can assume that λ_i ≠ 0 for 1 ≤ i ≤ r and λ_i = 0 for i > r. Since we're working over C, for each 1 ≤ i ≤ r, λ_i has a non-zero square root μ_i, say. Defining v_i = (1/μ_i)e_i for 1 ≤ i ≤ r and v_i = e_i for r + 1 ≤ i ≤ n, we see that φ(v_i, v_j) = 0 if i ≠ j or i = j > r, and φ(v_i, v_i) = 1 if 1 ≤ i ≤ r, as required. □

Corollary. Every symmetric matrix in Mat_n(C) is congruent to a matrix of the form
$$\begin{pmatrix} I_r & 0\\ 0 & 0 \end{pmatrix}.$$

Corollary. Let φ be a symmetric bilinear form on a f.d. R-vector space V. Then there is a basis (v_1, …, v_n) for V such that φ is represented by a matrix of the form
$$\begin{pmatrix} I_p & 0 & 0\\ 0 & -I_q & 0\\ 0 & 0 & 0 \end{pmatrix}$$
with p, q ≥ 0 and p + q = r(φ), or equivalently such that the corresponding quadratic form q is given by
q(Σ_{i=1}^n a_i v_i) = Σ_{i=1}^p a_i² − Σ_{i=p+1}^{p+q} a_i².

Proof. We have already shown that there is a basis (e_1, …, e_n) such that φ(e_i, e_j) = δ_{ij}λ_j for some λ_1, …, λ_n ∈ R. By reordering the e_i we can assume that there is a p such that λ_i > 0 for 1 ≤ i ≤ p, λ_i < 0 for p + 1 ≤ i ≤ r(φ), and λ_i = 0 for i > r(φ). Since we're working over R we can define μ_i = √λ_i for 1 ≤ i ≤ p, μ_i = √(−λ_i) for p + 1 ≤ i ≤ r(φ), and μ_i = 1 for i > r(φ). Defining v_i = (1/μ_i)e_i we see that φ is represented by the given matrix with respect to v_1, …, v_n. □


Definition. A symmetric bilinear form φ on a real vector space V is
(a) positive definite if φ(v, v) > 0 for all v ∈ V \ 0;
(b) positive semi-definite if φ(v, v) ≥ 0 for all v ∈ V;
(c) negative definite if φ(v, v) < 0 for all v ∈ V \ 0;
(d) negative semi-definite if φ(v, v) ≤ 0 for all v ∈ V.
We say a quadratic form is ...-definite if the corresponding bilinear form is so.

    Lecture 20

Examples. If φ is a symmetric bilinear form on R^n represented by
$$\begin{pmatrix} I_p & 0\\ 0 & 0 \end{pmatrix}$$
then φ is positive semi-definite. Moreover φ is positive definite if and only if n = p.
If instead φ is represented by
$$\begin{pmatrix} -I_q & 0\\ 0 & 0 \end{pmatrix}$$
then φ is negative semi-definite. Moreover φ is negative definite if and only if n = q.

Theorem (Sylvester's Law of Inertia). Let V be a finite-dimensional real vector space and let φ be a symmetric bilinear form on V. Then there are unique integers p, q such that V has a basis v_1, …, v_n with respect to which φ is represented by the matrix
$$\begin{pmatrix} I_p & 0 & 0\\ 0 & -I_q & 0\\ 0 & 0 & 0 \end{pmatrix}.$$

Proof. We've already done the existence part. We also already know that p + q = r(φ) is unique. To see p is unique we'll prove that p is the largest dimension of a subspace P of V such that φ|_{P×P} is positive definite.
Let v_1, …, v_n be some basis with respect to which φ is represented by
$$\begin{pmatrix} I_p & 0 & 0\\ 0 & -I_q & 0\\ 0 & 0 & 0 \end{pmatrix}$$
for some choice of p. Then φ is positive definite on the space spanned by v_1, …, v_p. Thus it remains to prove that there is no larger such subspace.
Let P be any subspace of V such that φ|_{P×P} is positive definite and let Q be the space spanned by v_{p+1}, …, v_n. The restriction of φ to Q × Q is negative semi-definite, so P ∩ Q = 0. So dim P + dim Q = dim(P + Q) ≤ n. Thus dim P ≤ p as required. □

Definition. The signature of the symmetric bilinear form φ given in the Theorem is defined to be p − q.

Corollary. Every real symmetric matrix is congruent to a matrix of the form
$$\begin{pmatrix} I_p & 0 & 0\\ 0 & -I_q & 0\\ 0 & 0 & 0 \end{pmatrix}.$$
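Sylvester's law says p and q do not depend on the diagonalising basis; the two bases found in the earlier example illustrate this. A sketch assuming sympy:

```python
from sympy import Matrix, Rational

A = Matrix([[1, 1, 3], [1, 1, 2], [3, 2, 1]])
P1 = Matrix([[1, 3, 5], [0, 0, -8], [0, -1, 1]])          # Method 1 basis
P2 = Matrix([[1, -3, Rational(-5, 8)],
             [0,  0, 1],
             [0,  1, Rational(-1, 8)]])                    # Method 2 basis
for P in (P1, P2):
    D = P.T * A * P                    # the diagonal forms differ ...
    signs = [D[i, i] for i in range(3)]
    p = sum(1 for d in signs if d > 0)
    q = sum(1 for d in signs if d < 0)
    assert (p, q) == (2, 1)            # ... but p and q agree: signature 1
```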


7.2. Hermitian forms. Let V be a vector space over C and let φ be a symmetric bilinear form on V. Then φ can never be positive definite since φ(iv, iv) = −φ(v, v) for all v ∈ V. We'd like to fix this.

Definition. Let V and W be vector spaces over C. Then a sesquilinear form is a function φ: V × W → C such that
φ(λ_1 v_1 + λ_2 v_2, w) = λ̄_1 φ(v_1, w) + λ̄_2 φ(v_2, w) and
φ(v, μ_1 w_1 + μ_2 w_2) = μ_1 φ(v, w_1) + μ_2 φ(v, w_2)
for all λ_1, λ_2, μ_1, μ_2 ∈ C, v, v_1, v_2 ∈ V and w, w_1, w_2 ∈ W.

    for all 1, 2, 1, 2C, v , v1, v2V andw, w1, w2W.Definition. Letbe a sesquilinear form onVWand letVhave basis (v1, . . . , vm)andWhave basis (w1, . . . , wn). ThematrixA representing with respect to thesebases is defined by Aij =(vi, wj).

Suppose that Σ λ_i v_i ∈ V and Σ μ_j w_j ∈ W. Then
$$\varphi\Big(\sum\lambda_i v_i, \sum\mu_j w_j\Big) = \sum_{i=1}^{m}\sum_{j=1}^{n}\bar{\lambda}_i A_{ij}\mu_j = \bar{\lambda}^T A\mu.$$

Definition. A sesquilinear form φ: V × V → C is said to be Hermitian if φ(x, y) = \overline{φ(y, x)} for all x, y ∈ V.

Notice that if φ is a Hermitian form on V then φ(v, v) ∈ R for all v ∈ V and φ(λv, λv) = |λ|²φ(v, v) for all λ ∈ C. Thus it is meaningful to speak of positive/negative (semi-)definite Hermitian forms and we will do so.

Lemma. Let φ: V × V → C be a sesquilinear form on a complex vector space V with basis (v_1, …, v_n). Then φ is Hermitian if and only if the matrix A representing φ with respect to this basis satisfies A = Ā^T (we also say the matrix A is Hermitian).

Proof. If φ is Hermitian then
A_{ij} = φ(v_i, v_j) = \overline{φ(v_j, v_i)} = \overline{A_{ji}}.
Conversely, if A = Ā^T then
$$\varphi\Big(\sum\lambda_i v_i, \sum\mu_j v_j\Big) = \bar{\lambda}^T A\mu = \mu^T A^T\bar{\lambda} = \mu^T\bar{A}\bar{\lambda} = \overline{\bar{\mu}^T A\lambda} = \overline{\varphi\Big(\sum\mu_j v_j, \sum\lambda_i v_i\Big)}$$
as required. □

Proposition (Change of basis). Suppose that φ is a Hermitian form on a f.d. complex vector space V and that e_1, …, e_n and v_1, …, v_n are bases for V such that v_i = Σ_{k=1}^n P_{ki} e_k. Let A be the matrix representing φ with respect to e_1, …, e_n and B the matrix representing φ with respect to v_1, …, v_n. Then B = P̄^T A P.

Proof. We compute
$$B_{ij} = \varphi\Big(\sum_{k=1}^{n}P_{ki}e_k, \sum_{l=1}^{n}P_{lj}e_l\Big) = \sum_{k,l}(\bar{P}^T)_{ik}\,\varphi(e_k, e_l)\,P_{lj} = [\bar{P}^T A P]_{ij}$$
as required. □

Lemma (Polarisation Identity). A Hermitian form φ on a complex vector space V is determined by the function ψ: V → R; v ↦ φ(v, v).


Proof. It can be checked that
$$\varphi(x, y) = \frac{1}{4}\big(\psi(x + y) - i\,\psi(x + iy) - \psi(x - y) + i\,\psi(x - iy)\big). \qquad\square$$
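A quick check of this identity for a concrete Hermitian matrix, assuming sympy (φ here is conjugate-linear in its first argument, as above):

```python
from sympy import Matrix, I, Rational, simplify

S = Matrix([[2, 1 + I], [1 - I, 3]])   # Hermitian: S equals its conjugate transpose
phi = lambda x, y: (x.H * S * y)[0]     # x.H is the conjugate transpose of x
psi = lambda v: phi(v, v)

x, y = Matrix([1 + I, 2]), Matrix([3, -I])
rhs = Rational(1, 4) * (psi(x + y) - I*psi(x + I*y) - psi(x - y) + I*psi(x - I*y))
assert simplify(phi(x, y) - rhs) == 0
```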

Theorem (Hermitian version of Sylvester's Law of Inertia). Let V be a f.d. complex vector space and suppose that φ: V × V → C is a Hermitian form on V. Then there is a basis (v_1, …, v_n) of V with respect to which φ is represented by a matrix of the form
$$\begin{pmatrix} I_p & 0 & 0\\ 0 & -I_q & 0\\ 0 & 0 & 0 \end{pmatrix}.$$
Moreover p and q depend only on φ, not on the basis.

Notice that for such a basis, φ(Σ λ_i v_i, Σ λ_i v_i) = Σ_{i=1}^p |λ_i|² − Σ_{j=p+1}^{p+q} |λ_j|².

Sketch of Proof. This is nearly identical to the real case. For existence: if ψ is identically