
1 Background and Review Material - College of William & Mary



CR Humber Intermediate Linear Algebra Lecture Notes Spring

Background and Review Material

1.1 Algebraic Structures

For a nonempty set X, a relation ∼ on X is an equivalence relation if the following axioms hold:

x ∼ x, ∀x ∈ X  (Reflexivity)
x ∼ y ⇒ y ∼ x  (Symmetry)
x ∼ y and y ∼ z ⇒ x ∼ z  (Transitivity)

Given an equivalence relation ∼ on a set X, the equivalence class of x ∈ X is the set of all elements equivalent to x, denoted by

[x] = {y ∈ X | x ∼ y}.

Recall that a field is a nonempty set F together with two operations

(x, y) ↦ x + y  (Addition)
(x, y) ↦ xy  (Multiplication)

from F × F → F, satisfying

F1) x + y = y + x, ∀x, y ∈ F,
F2) x + (y + z) = (x + y) + z, ∀x, y, z ∈ F,
F3) There exists a unique element 0 ∈ F such that x + 0 = x, ∀x ∈ F,
F4) For every x ∈ F, there exists a unique y ∈ F such that x + y = 0,
F5) xy = yx, ∀x, y ∈ F,
F6) x(yz) = (xy)z, ∀x, y, z ∈ F,
F7) There exists a unique element 1 ∈ F (1 ≠ 0) such that x1 = x, ∀x ∈ F,
F8) For every x ≠ 0 in F, there exists a unique y ∈ F such that xy = 1,
F9) x(y + z) = xy + xz, ∀x, y, z ∈ F.

For example, the rational numbers, real numbers, and complex numbers, denoted Q, R, and C respectively, are all fields. These fields satisfy the inclusions Q ⊆ R ⊆ C.

Example (Integers modulo n). Suppose two integers a, b ∈ Z and a positive integer n ≥ 1 satisfy a = nq + b, that is, q is the quotient and b is the remainder in dividing a by n. This can be equivalently expressed as n|(a − b) (read “n divides a − b”). Given a, b ∈ Z, we say a is congruent to b modulo n, and write a ≡ b mod n, if n|(a − b). First, note that ≡ is an equivalence relation, which forms the basis for modular arithmetic. In such a case, n is referred to as the modulus.
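The three equivalence-relation axioms for congruence can be spot-checked mechanically. The following sketch (an illustration added here, not part of the notes) partitions a window of integers into classes modulo n = 3:

```python
# A minimal sketch: congruence modulo n as an equivalence relation
# on a finite window of integers.

def congruent(a: int, b: int, n: int) -> bool:
    """a ≡ b (mod n) iff n divides a - b."""
    return (a - b) % n == 0

n = 3
window = range(-6, 7)

# Partition the window into the equivalence classes [0], [1], ..., [n-1].
classes = {r: [a for a in window if congruent(a, r, n)] for r in range(n)}
print(classes[0])  # the multiples of 3 in the window

# Spot-check the three axioms on the window.
assert all(congruent(a, a, n) for a in window)                     # reflexive
assert all(congruent(b, a, n) for a in window
           for b in window if congruent(a, b, n))                  # symmetric
assert all(congruent(a, c, n)
           for a in window for b in window for c in window
           if congruent(a, b, n) and congruent(b, c, n))           # transitive
```

The classes [0], [1], [2] are exactly the fibers of `a % 3`, which is why modular arithmetic can be computed on representatives.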


The set of all equivalence classes of the integers for a modulus n is called the integers modulo n and denoted by

Z/nZ = {[0], [1], . . . , [n − 1]}.

Note that the notation Zn is also used for the integers modulo n. For example, Z/2Z = {[0], [1]}, since any even integer is equivalent to 0 (mod 2), while any odd integer is equivalent to 1 (mod 2).
Fact: If p is prime, Zp is a field.

A group is a nonempty set G together with a binary operation (typically denoted ⋆, ·, or +) that satisfies:

G1) (a ⋆ b) ⋆ c = a ⋆ (b ⋆ c), ∀a, b, c ∈ G,
G2) There exists an element e ∈ G such that a ⋆ e = a = e ⋆ a, ∀a ∈ G,
G3) For each a ∈ G, there exists an element a−1 ∈ G such that a ⋆ a−1 = a−1 ⋆ a = e.

A group is called abelian (or commutative) if a ⋆ b = b ⋆ a for all a, b ∈ G. If G is abelian, it is common to write 1/b for b−1 and a/b for ab−1. Note also that if the group operation is addition (+), then it is common to write 0 for e, −a for a−1, and a − b for a + (−b).
Examples:

1) (R, +) - real numbers with ordinary addition

2) (C, +) - complex numbers with ordinary addition

3) (Z, +) - integers with ordinary addition

4) (Zn, +) - cyclic group of order n (addition modulo n)

A subset H of a group G is called a subgroup of G if

SG1) 1 ∈ H,
SG2) a, b ∈ H ⇒ ab ∈ H,
SG3) a ∈ H ⇒ a−1 ∈ H.

Example (General Linear Groups). The set

GLn(R) = {A ∈ Mn(R) | det(A) ≠ 0}

of (real) n × n invertible matrices, along with ordinary matrix multiplication, is a group. The set

GLn(C) = {A ∈ Mn(C) | det(A) ≠ 0}

of (complex) n × n invertible matrices, along with ordinary matrix multiplication, is a group. These are the general linear groups of degree n.
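Closure of GLn under multiplication rests on the identity det(AB) = det(A)·det(B). A quick sanity check with an assumed 2 × 2 example (not from the notes):

```python
# 2x2 real matrices with nonzero determinant are closed under
# multiplication, since det(AB) = det(A) * det(B).

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(m1, m2):
    return [[sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]   # det = -2
B = [[0, 1], [1, 1]]   # det = -1
AB = matmul2(A, B)
print(det2(AB), det2(A) * det2(B))  # both equal 2, hence AB is invertible
```

Since the product of two nonzero reals is nonzero, det(AB) ≠ 0 whenever det(A), det(B) ≠ 0.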


Given a group G and a subgroup H, we may define an equivalence relation as follows:

a ≡ b ⇐⇒ a−1b ∈H,

or equivalently,
a ≡ b ⇐⇒ b ∈ aH.

Given an element a ∈ G, the equivalence class of a under this equivalence relation is called a left coset of H in G, denoted by aH, since

[a] = {b ∈ G | a ≡ b} = {b ∈ G | a−1b ∈ H}
    = {b ∈ G | a−1b = c, c ∈ H}
    = {b ∈ G | b = ac, c ∈ H}
    = aH.

In additive notation, we denote left cosets as a + H, since the equivalence relation is expressed as (−a) + b ∈ H. Likewise, we may define a second equivalence relation

a ≡ b ⇐⇒ ba−1 ∈ H ⇐⇒ b ∈ Ha.

The equivalence class of a under this equivalence relation is called a right coset of H in G, denoted by Ha, or H + a in additive notation. In either case, the element a is called a coset representative for a + H or H + a. The following lemma will be useful for our discussion on rings.

Lemma. Let H be a subgroup of a group G and let a, b ∈ G.

i) aH = bH if and only if b−1a ∈ H. In particular, aH = H if and only if a ∈ H.

ii) If aH ∩ bH ≠ ∅, then aH = bH.

iii) |aH| = |H| for all a ∈ G.
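The lemma can be visualized in a small finite group. The sketch below (an assumed example, added here) lists the cosets of H = {0, 4, 8} in the additive group Z12:

```python
# Cosets of H = {0, 4, 8} in the additive group Z_12: they partition
# the group and all have size |H|, as the lemma asserts.

G = set(range(12))
H = {0, 4, 8}

cosets = {frozenset((a + h) % 12 for h in H) for a in G}
print(sorted(sorted(c) for c in cosets))

assert all(len(c) == len(H) for c in cosets)     # |a + H| = |H|
assert set().union(*cosets) == G                 # the cosets cover G
assert sum(len(c) for c in cosets) == len(G)     # distinct cosets are disjoint
```

There are exactly |G|/|H| = 4 cosets, which is the content of Lagrange's theorem in this example.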

A ring is a nonempty set R, together with two binary operations (addition and multiplication), such that:

R1) R is a commutative group with respect to addition,

R2) Multiplication is associative,

R3) Multiplication is distributive with respect to addition.


More concretely, the conditions that R be a ring are:

1) r + (s + t) = (r + s) + t, ∀r, s, t ∈ R,
2) r + s = s + r, ∀r, s ∈ R,
3) There exists 0 ∈ R such that r + 0 = 0 + r = r, ∀r ∈ R,
4) For every r ∈ R there exists −r ∈ R such that r + (−r) = (−r) + r = 0,
5) (rs)t = r(st), ∀r, s, t ∈ R,
6) (r + s)t = rt + st, ∀r, s, t ∈ R,
7) r(s + t) = rs + rt, ∀r, s, t ∈ R.

A ring is called commutative if rs = sr for all r, s ∈ R. If there exists an element e ∈ R such that re = er = r for every r ∈ R, we call R a ring with identity. The multiplicative identity e is most often denoted by 1 (or 1R). Note that some authors define a ring as always having a multiplicative identity; in this case, a set satisfying the ring axioms except the existence of an identity is referred to as a pseudo-ring. Given the definition of a ring, we may give a more concise definition of a field: a field is a commutative ring with identity, other than the zero ring, in which every nonzero element has a multiplicative inverse.

Example. The set Zn = {0, 1, . . . , n − 1} is a commutative ring under addition and multiplication modulo n, 1 being the identity.

Example. The set Mn(F) is a noncommutative ring under matrix addition and multiplication, where the identity is denoted by I or In.

Example. Denote the set of all polynomials in a single variable x, with coefficients in the field F, by F[x]. With polynomial addition and multiplication, F[x] is a commutative ring. What is the identity? We denote the set of polynomials in two (or more) variables by F[x, y] or F[x1, x2, . . . , xn]. These are also commutative rings with identity.

Let R be a ring with identity. The smallest positive integer c for which c · 1 = 0 is called the characteristic of R, denoted char R. If no such c exists, we say R has characteristic 0. Here, c · 1 should be understood as the sum 1 + 1 + · · · + 1 (c times).

Fact: Any finite ring has nonzero characteristic. Any finite field has prime characteristic.

Example.

1) char R = 0, since a · 1 = 0 ⇒ a = 0 for a ∈ R. Likewise, char Z = 0.

2) In the ring Zn, we have n · 1 = 0 and k · 1 ≠ 0 for k = 1, . . . , n − 1. Thus, char Zn = n. Recall that Zn is a field if and only if n is prime.
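The computation char Zn = n can be mimicked by brute force, repeatedly adding 1 to itself modulo n (an illustrative sketch, not part of the notes):

```python
# The characteristic of Z_n: the smallest c >= 1 with c * 1 ≡ 0 (mod n).

def characteristic_zn(n: int) -> int:
    c, total = 1, 1 % n
    while total != 0:            # keep adding 1 until the sum wraps to 0
        c += 1
        total = (total + 1) % n
    return c

print([characteristic_zn(n) for n in (2, 5, 6, 12)])  # [2, 5, 6, 12]
```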


Example (Fields of characteristic 2). Suppose F is a field with char F = 2. Then 2 · 1 = 0 and, furthermore, 2x = 0 for every x ∈ F. Equivalently, x + x = 0, so x = −x for every x ∈ F.

A subring of a ring R is a subset S of R that is a ring using the operations of R and having the same multiplicative identity. Note that if 1R = 0R, then R is called the zero ring.

Theorem. A nonempty subset S of a ring R is a subring if and only if

1) the multiplicative identity 1R of R is in S,

2) S is closed under subtraction: a, b ∈ S ⇒ a − b ∈ S,

3) S is closed under multiplication: a, b ∈ S ⇒ ab ∈ S.

Let R be a ring. A nonempty subset I of R is called an ideal if

I1) a, b ∈ I ⇒ a − b ∈ I (closed under subtraction),
I2) a ∈ I, r ∈ R ⇒ ar ∈ I and ra ∈ I.

Fact: If an ideal I contains the unit element 1, then I = R.

Example (Polynomial ideals).

1) Let d be a polynomial over F and set I = dF[x]. That is, I is the set of all multiples df of d by arbitrary f ∈ F[x]. The set I is an ideal, called the principal ideal generated by d.

2) Let F be a subfield of the complex numbers and define I = (x + 2)F[x] + (x2 + 8x + 16)F[x], which is an ideal. Then,

(x2 + 8x + 16) − x(x + 2) = 6x + 16 ∈ I

and

6x + 16 − 6(x + 2) = 4 ∈ I.

Since 4 is invertible in F, this shows that 1 ∈ I, so I = F[x].
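The two subtractions in part 2) can be checked mechanically. The sketch below (helpers added for illustration, not part of the notes) stores polynomials as coefficient lists in ascending powers:

```python
# Verify: (x^2 + 8x + 16) - x(x + 2) = 6x + 16, then 6x + 16 - 6(x + 2) = 4.

def poly_sub(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

f = [16, 8, 1]                         # x^2 + 8x + 16
g = [2, 1]                             # x + 2
r = poly_sub(f, poly_mul([0, 1], g))   # f - x*g
print(r)                               # [16, 6, 0], i.e. 6x + 16
s = poly_sub(r, poly_mul([6], g))      # r - 6*g
print(s)                               # [4, 0, 0], i.e. the constant 4
```

Both intermediate results are elements of I, since I is closed under subtraction and under multiplication by ring elements.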

3) Let S be a subset of a ring with identity R. Consider the set

⟨S⟩ = {r1s1 + · · · + rnsn | ri ∈ R, si ∈ S, n ≥ 1}

of all finite linear combinations of elements of S, with coefficients in R. ⟨S⟩ is an ideal, called the ideal generated by S. It is the smallest ideal of R that contains S.


For rings, we may define cosets under the group structure (with respect to the addition operation). Since addition is commutative, we do not need to distinguish between left and right cosets; rather, we simply have cosets. Given a subset S of a commutative ring with identity R, we define an equivalence relation

a ≡ b ⇐⇒ a− b ∈ S.

If a ≡ b we say that a and b are congruent modulo S (congruent mod S) and we write a ≡ b mod S. The cosets of S in R are the sets

a + S = {a + s | s ∈ S}.

We denote the set of all cosets of S in R by

R/S = {a + S | a ∈ R},

which is read as “R mod S”. We define the operation of coset addition by

(a+ S) + (b+ S) = (a+ b) + S.

If S is a subgroup of the abelian group R, then R/S is an abelian group under coset addition. Similarly, we define the operation of coset multiplication by

(a+ S)(b+ S) = ab+ S.

For coset multiplication to be well-defined, we require

b+ S = b′ + S⇒ ab+ S = ab′ + S,

or equivalently, by part (i) of the coset lemma above (in additive notation),

b − b′ ∈ S ⇒ a(b − b′) ∈ S.

Read carefully, the requirement is

a ∈ R, b ∈ S ⇒ ab ∈ S,

that is, S must be an ideal for coset multiplication to be well-defined. Conversely, if S is an ideal then coset multiplication is well-defined.

Theorem. Let R be a commutative ring with identity. The quotient R/I is a ring under coset addition and multiplication if and only if I is an ideal. In this case, R/I is called the quotient ring of R modulo I.
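Well-definedness of coset multiplication can be checked exhaustively in a small quotient. Below, R = Z12 and I = {0, 4, 8} (the multiples of 4), an assumed example added for illustration:

```python
# In R = Z_12 with ideal I = {0, 4, 8}, the product (a + I)(b + I) = ab + I
# does not depend on the chosen coset representatives.

R = range(12)
I = {0, 4, 8}

def coset(a):
    return frozenset((a + i) % 12 for i in I)

ok = all(coset(a * b) == coset(a2 * b2)
         for a in R for b in R
         for a2 in coset(a) for b2 in coset(b))
print(ok)  # True: multiplication is well-defined on R/I
```

Replacing I by a mere subgroup that is not an ideal would make the check fail, which is exactly the point of the theorem.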

An ideal I in a ring R is a maximal ideal if I ≠ R and if, whenever J is an ideal satisfying I ⊆ J ⊆ R, either J = I or J = R.


Theorem. Let R be a commutative ring with identity and let I be an ideal of R. Then I is a maximal ideal if and only if R/I is a field.

Proof.

1) Suppose R/I is a field and J is an ideal of R such that I ⊊ J ⊆ R. Let x ∈ J but x ∉ I, so x + I is a nonzero element of R/I. By assumption, there is a y ∈ R such that (x + I)(y + I) = 1 + I. Thus, xy + I = 1 + I, which implies that 1 − xy ∈ I ⊆ J. Since J is an ideal, xy ∈ J. Finally, 1 = (1 − xy) + xy ∈ J, since 1 − xy ∈ J and xy ∈ J, so J = R.

2) Now, assume I is a maximal ideal and let x ∈ R, x ∉ I. We want to show that x + I has a multiplicative inverse. Define

J = {xr + a | r ∈ R, a ∈ I}.

Claim: J is an ideal properly containing I. First, J is closed under subtraction, since (xr + a) − (xr′ + a′) = x(r − r′) + (a − a′) ∈ J. Next, for xr + a ∈ J and y ∈ R,

y(xr + a) = x(ry) + ya ∈ J,

since ya ∈ I (by the fact that I is an ideal). Clearly, if a ∈ I then a ∈ J (take r = 0), and x ∈ J (take r = 1, a = 0) while x ∉ I, so I ⊊ J.

By maximality, J = R, which implies that xr + a = 1 for some r ∈ R, a ∈ I. Thus,

1 + I = (xr + a) + I = xr + I = (x + I)(r + I),

since a ∈ I. That is, every nonzero element of R/I has an inverse.

Let R be a ring. A nonzero element r ∈ R is called a zero divisor if there exists a nonzero s ∈ R such that rs = 0. A commutative ring R with identity is called an integral domain if it contains no zero divisors.
Fact: A finite integral domain is a field.
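For instance (an assumed example, added here), the zero divisors of Z6 can be found by brute force, showing Z6 is not an integral domain, while Z7 is:

```python
# Zero divisors of Z_n: nonzero r with r*s ≡ 0 (mod n) for some nonzero s.

def zero_divisors(n):
    return sorted({r for r in range(1, n)
                   if any((r * s) % n == 0 for s in range(1, n))})

print(zero_divisors(6))   # [2, 3, 4], e.g. 2 * 3 ≡ 0 (mod 6)
print(zero_divisors(7))   # []: Z_7 is a field, hence an integral domain
```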

1.2 Some Set Theory Results

A (binary) relation R on a set A is a subset of A × A. A relation R is usually denoted by ∼ (or ≤, ⊆) and defined by α ∼ β ⇐⇒ (α, β) ∈ R ⊂ A × A. If we wish to specify the relation associated to a specific set, we may specify a pair (A, ∼). A set A is said to be partially ordered if there is a binary relation ∼ such that

1) α ∼ α, ∀α ∈ A (Reflexivity),
2) α ∼ β and β ∼ α ⇒ α = β (Antisymmetry),
3) α ∼ β and β ∼ γ ⇒ α ∼ γ (Transitivity).


Note that set inclusion ⊆ is a relation between sets and most likely the only one we will use. A subset B ⊆ A of a partially ordered set (A, ∼) is called totally ordered if for any x, y ∈ B we have x ∼ y or y ∼ x. An element x ∈ B ⊆ A is called an upper bound for B if y ∼ x for all y ∈ B. A partially ordered set (A, ∼) is called inductive if every totally ordered subset of A (chain in A) has an upper bound in A. An element x ∈ A is called maximal if x ∼ y implies x = y.

Lemma (Zorn's Lemma). If a partially ordered set (A, ∼) is inductive, then there exists a maximal element of A.

1.3 Vector Spaces

A nonempty set V with the following two operations on elements of V,

(x, y) ↦ x + y, V × V → V  (addition)
(α, x) ↦ αx, F × V → V  (scalar multiplication),

is called a vector space (sometimes linear space) over the field F if the following conditions are satisfied for all c1, c2 ∈ F and α, β, γ ∈ V:

V1) α + β = β + α
V2) (α + β) + γ = α + (β + γ)
V3) ∃! 0 ∈ V such that α + 0 = α
V4) ∀α ∃! −α ∈ V such that α + (−α) = 0
V5) 1α = α
V6) c1(c2α) = (c1c2)α
V7) (c1 + c2)α = c1α + c2α
V8) c1(α + β) = c1α + c1β.

In particular, if F = R we call V a real vector space, while if F = C we say V is a complex vector space. The elements α ∈ V are called vectors.
Examples
You should first verify that {0}, R, C are each vector spaces, then verify that the following examples are vector spaces.

1) Rn = {(x1, x2, . . . , xn) | xi ∈ R for i = 1, . . . , n}

2) Cn = {(z1, z2, . . . , zn) | zj ∈ C, j = 1, . . . , n}

3) {(z1, z2, z1 + z2) | z1, z2 ∈ C}

4) F[x]n, the set of polynomials in the variable x up to order n (less than or equal to n).


1.3.1 Function Spaces

Let X be a nonempty set and V a vector space. Consider the set of functions mapping X to V,

F = {f : X → V}.

F is a vector space with operations

(f + g)(x) = f(x) + g(x),
(αf)(x) = αf(x).

Note that the 0-vector in F is the function identically equal to zero: µ(x) ≡ 0 for all x ∈ X. It is also worth noting that we can regard the vector spaces Rn, Cn as function spaces by the following identifications:

Rn = {f : {1, 2, . . . , n} → R}

Cn = {g : {1, 2, . . . , n} → C}

1.3.2 Subspaces

An important concept when dealing with vector spaces is that of a subspace. Let V be a vector space. We call a nonempty subset M ⊂ V a subspace of V if M is a vector space with the operations of V restricted to M.

Theorem (Subspace Test). A nonempty subset M of a vector space V is a subspace if and only if for all c1, c2 ∈ F and α, β ∈ M the linear combination c1α + c2β ∈ M. That is, M is a nonempty subset of V which is closed with respect to addition and scalar multiplication.

Note that a vector space is a subspace of itself; however, this is usually not of much use.
Examples
Let Ω be an open subset of Rn. The following are subspaces of the space of all functions from Ω to C:

C(Ω) = the space of all continuous functions from Ω to C

Ck(Ω) = the space of all complex-valued functions which are continuous and have continuous partial derivatives up to order k

C∞(Ω) = the space of infinitely differentiable functions defined on Ω

P(Ω) = the space of all polynomials in the n variables (x1, x2, . . . , xn)

Remarks:
Given two subspaces M, N of a vector space V:


i) If M ⊂ N, then M is a subspace of N.

ii) M ∩ N is always a subspace.

iii) M ∪ N is not necessarily a subspace.

Exercise: Show that

W = {x = (x1, x2, 0)T | x1, x2 ∈ R}

is a subspace of R3.
Exercise: If M, N are two subspaces of a vector space V, then M ∪ N is a subspace if and only if M ⊆ N or N ⊆ M.

Theorem. If V is a vector space over an infinite field F, then V cannot be the union of a finite collection of proper subspaces.

If M, N are subspaces of a vector space V, we define the sum

M + N = {u + v | u ∈ M, v ∈ N}.

Likewise, given a collection of subspaces {Sk}k∈K, we define the space

∑k∈K Sk = {s1 + · · · + sn | sj ∈ ⋃k∈K Sk}

of finite sums of elements in the union. The sum of any collection of subspaces of V is a subspace of V.

1.3.3 Linear Independence and Bases

Given a vector space V, denote the set of all subsets of V by P(V) and the set of all subspaces of V by S(V) (sometimes called the lattice of subspaces). Thus, S(V) ⊆ P(V) and there is a natural (canonical) map L : P(V) → S(V) sending a subset S ∈ P(V) to its (linear) span L(S) ∈ S(V), defined by

L(S) = {c1α1 + · · · + cnαn | ci ∈ F, αi ∈ S, n ≥ 0}.

Note that L(S) is the subspace of finite linear combinations of elements of S and it is the smallest subspace of V containing S. The map L sending a subset S to its linear span is surjective, since any subspace is a subset. The following theorem contains other properties of the map L.


Theorem. The map L : P(V) → S(V) satisfies

i) if S1 ⊆ S2 then L(S1) ⊆ L(S2),

ii) if α ∈ L(S), then there exists a finite subset S ′ ⊆ S such that α ∈ L(S ′),

iii) S ⊆ L(S) for all S ∈ P (V ),

iv) for every S ∈ P (V ), L(L(S)) = L(S),

v) if β ∈ L(S ∪ {α}) and β ∉ L(S), then α ∈ L(S ∪ {β}), where α, β ∈ V and S ∈ P(V).

A subset S is linearly dependent (over F) if there exist a finite subset {α1, . . . , αn} ⊆ S and nonzero scalars c1, . . . , cn ∈ F such that

c1α1 + · · · + cnαn = 0.

Otherwise, we say S is linearly independent; that is,

c1α1 + · · · + cnαn = 0 ⇒ ci = 0 for each i = 1, . . . , n.

Example. Set V = R, F1 = Q, F2 = R. Note that V is a vector space over both F1 and F2. Let

S = {α1 = 1, α2 = √2},

which is linearly independent over F1. However,

√2 α1 + (−1)α2 = 0,

so S is linearly dependent over F2.

A basis of a vector space V is a subset S that is linearly independent and such that L(S) = V. If L(S) = V, we say the set S spans V. If S is a basis for V, then any vector α ∈ V can be written uniquely as a linear combination of elements from S; that is, there exist unique c1, . . . , cn ∈ F such that α = c1α1 + · · · + cnαn, where αi ∈ S. Note that every vector space has a basis.

Example.

a) Consider the vector space Rn. Define the vectors δi = (0, . . . , 0, 1, 0, . . . , 0)t of all 0's, with a 1 in the ith position. The standard basis for Rn consists of the vectors

{δ1, δ2, . . . , δn}.

More generally, this is the standard basis for the vector space Fn of n-tuples of elements from the field F. Note that the standard basis elements are sometimes denoted by ei or εi.


b) Consider the vector space F[x]n of polynomials in the variable x, up to degree n, with coefficients in the field F. The set of monomials

{1, x, x2, . . . , xn}

is a basis for F[x]n.

c) Consider the vector space Mm,n(F) of m × n matrices with entries in the field F. Define the elementary matrices δij as having 1 in the (i, j) position and 0 everywhere else. The standard basis for Mm,n(F) is the set

{δij | i = 1, . . . , m; j = 1, . . . , n}.

As with the standard basis for Fn, these elementary matrices are often denoted by eij or εij. Sometimes this is necessary to avoid confusion with Kronecker's delta or the δ-distribution.

Recall the following fact regarding solutions of homogeneous systems of linear equations.

Theorem. If A ∈ Mm,n(F) with m < n, then Ax = 0 has a nontrivial solution.

With this in mind, we can prove the following theorem for finite-dimensional vector spaces.

Theorem. Let V be a vector space which is spanned by a finite set of vectors {α1, . . . , αn}. Then any linearly independent set of vectors in V is finite and contains no more than n elements.

Proof. Denote the set of vectors that span V by α = {α1, . . . , αn}, so for any v ∈ V there exist scalars c1, . . . , cn ∈ F such that v = c1α1 + · · · + cnαn. Let β = {β1, . . . , βm} be a set of m vectors in V, where m > n; we show that β is linearly dependent. Since each βj ∈ L(α), j = 1, . . . , m, there exist scalars aij ∈ F such that

β1 = a11α1 + . . . + an1αn
⋮
βj = a1jα1 + . . . + anjαn
⋮
βm = a1mα1 + . . . + anmαn.


Now we seek scalars d1, . . . , dm ∈ F, not all 0, such that d1β1 + . . . + dmβm = 0. Since each βj can be expressed as a linear combination of the αi ∈ α, we have

d1β1 + · · · + dmβm = ∑j dj (∑i aijαi) = ∑i (∑j aijdj) αi,

where j runs from 1 to m and i from 1 to n. This defines a homogeneous system of linear equations

∑j aijdj = 0,  1 ≤ i ≤ n,

for which there exist d1, . . . , dm not all 0, since m > n. Thus,

d1β1 + · · · + dmβm = 0,

which shows that β is linearly dependent.

For the finite case, this enables us to prove the following theorem.

Theorem. If B1, B2 are two bases of a vector space V, then |B1| = |B2|.

Proof.
Finite Case: The simplest way to prove the finite case is to use the previous theorem twice. Suppose V is a vector space with a finite basis B1 = {α1, . . . , αn}. We must show that any other basis must have n elements. Suppose B2 = {β1, . . . , βm} is another basis of V. If m > n, the previous theorem shows that B2 is linearly dependent, thus not a basis. But if m < n, B1 is linearly dependent by the same argument. In either case, we have a contradiction, so we must have m = n.

Infinite Case: Suppose V does not have a finite basis. Let B1, B2 be two infinite bases of V and let α ∈ B1. Since B2 is a basis, there exists a finite subset Bα ⊂ B2 such that α ∈ L(Bα) and α ∉ L(B′) for any proper subset B′ ⊂ Bα. This yields a well-defined function φ : B1 → P(B2), defined by φ(α) = Bα. We may now use the following result from set theory.
Fact: Let A and B be sets and suppose |A| = ∞. If for each α ∈ A we have some finite set Bα ⊂ B, then

|A| ≥ |⋃α∈A Bα|.


Since for each α ∈ B1 there is a finite subset Bα ⊂ B2, we must have

|B1| ≥ |⋃α∈B1 Bα|.

Furthermore, since α ∈ L(Bα) for all α ∈ B1, V = L(⋃α∈B1 Bα), which shows that ⋃α∈B1 Bα is a subset of B2 that spans all of V. Thus, ⋃α∈B1 Bα = B2, so |B1| ≥ |B2|. Reversing the roles of B1 and B2 shows that |B2| ≥ |B1|.

The following theorem shows that any subset that spans V contains a basis.

Theorem. Let V be a vector space and suppose V = L(S), where S is a subset of V. Then S contains a basis; that is, there is a basis B of V such that B ⊆ S.

Proof. If S does not contain a nonzero vector, then the result is trivial, since V must be the zero vector space. Suppose S contains a nonzero vector α and define

𝒮 = {A ⊆ S | A is linearly independent over F}.

Since {α} ∈ 𝒮, 𝒮 is nonempty. We may place a partial ordering on 𝒮 by set inclusion, ⊆. If A = {Ak | k ∈ K} is a totally ordered subset of 𝒮, then ⋃k∈K Ak is an upper bound for A in 𝒮, so (𝒮, ⊆) is inductive. By Zorn's Lemma, 𝒮 must have a maximal element, which we denote B.
Claim: B is a basis of V.
Suppose, for contradiction, that B is not a basis of V; in particular, suppose that L(B) ≠ V, so S ⊄ L(B). In this case, there must be an s ∈ S that is not a linear combination of elements of B (i.e. s ∈ S\L(B)), since otherwise V = L(S) ⊆ L(L(B)) = L(B). Thus, B ∪ {s} is linearly independent over F, that is, B ∪ {s} ∈ 𝒮. Since s ∉ L(B), s ∉ B, hence B ∪ {s} is strictly larger than B, which contradicts the maximality of B. Thus, L(B) = V.

On the other hand, the following theorem shows that any linearly independent subset of V can be extended to a basis of V.

Theorem. If S is a linearly independent subset of V over F, there exists a basis B of V such that S ⊆ B.


Proof Sketch: Suppose L(S) ≠ V, since otherwise S is a basis of V. Then there is a vector β ∈ V\L(S). Again, we may appeal to Zorn's Lemma. Consider the collection

𝒜 = {A ∈ P(V) | S ⊆ A, A is linearly independent}.

Note that, since S ∈ 𝒜, 𝒜 is nonempty. As in the previous theorem, we place a partial ordering on 𝒜 by set inclusion, and a similar argument shows that (𝒜, ⊆) is inductive. Hence, there is a maximal element B, which, by maximality, must be a basis of V.

If for any positive integer n there exist n linearly independent vectors in V, we say that V is infinite dimensional. If V is not infinite dimensional, then there is a positive integer n such that:

a) there is a set of n vectors that are linearly independent;

b) if m > n, then any m elements are linearly dependent.

In this case, we call n the dimension of V , which is the largest possible number of linearlyindependent vectors and the number of vectors in any basis of V .

Theorem. Let V be a vector space over F.

a) If M is a subspace of V, then dim M ≤ dim V.

b) If V is finite-dimensional and M is a subspace of V such that dim M = dim V, then M = V.

c) If M is a subspace of V, then there exists a subspace M′ of V such that M + M′ = V and M ∩ M′ = {0}.

d) If V is finite-dimensional and M1, M2 are subspaces of V, then

dim(M1 + M2) + dim(M1 ∩ M2) = dim M1 + dim M2.

Note, in particular, that b) does not necessarily hold if V is infinite-dimensional.

Proof (part d) only). Write S = M1 and T = M2.
If S ∩ T = {0}, choose bases BS of S and BT of T; then BS ∩ BT = ∅ and BS ∪ BT is a basis of S + T. Hence, dim(S + T) = dim(S) + dim(T).
Now, suppose S ∩ T ≠ {0} and let BST be a basis for S ∩ T. Note that S ∩ T is a subspace of both S and T (and of V as well). So, we may extend BST to a basis BST ∪ BS of S, where BST ∩ BS = ∅. Likewise, we may extend BST to a basis BST ∪ BT of T, where


BST ∩ BT = ∅. We claim that BS ∪ BST ∪ BT is a basis of S + T. To show linear independence, suppose there are subsets B′S ⊆ BS, B′ST ⊆ BST, B′T ⊆ BT such that

(a1α1 + · · · + anαn) + (b1γ1 + · · · + bmγm) + (c1β1 + · · · + cpβp) = 0,

where B′S = {α1, . . . , αn}, B′ST = {γ1, . . . , γm}, B′T = {β1, . . . , βp}. By the displayed equation, we have c1β1 + · · · + cpβp ∈ S ∩ T, so c1β1 + · · · + cpβp = x1γ1 + · · · + xmγm for some xj ∈ F. Since BST ∪ BT is linearly independent, we must have ck = 0 for each k (also, xj = 0 for each j). Then, we are left with a1α1 + · · · + anαn + b1γ1 + · · · + bmγm = 0. Since BS ∪ BST is a basis of S, we have ai = 0, i = 1, . . . , n and bj = 0, j = 1, . . . , m. Thus, BS ∪ BST ∪ BT is a basis of S + T and, writing n = |BS|, m = |BST|, p = |BT|,

dim(S + T) + dim(S ∩ T) = (n + m + p) + m,
dim S + dim T = (n + m) + (p + m),

which agree.
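The dimension count can be spot-checked numerically. The sketch below uses an assumed pair of planes in R3 (not an example from the notes), computing dimensions as ranks via exact rational row reduction:

```python
# Spot-check dim(M1 + M2) + dim(M1 ∩ M2) = dim M1 + dim M2 for two
# coordinate planes in R^3.

from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, via Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# M1 = span{e1, e2}, M2 = span{e2, e3}; their intersection is span{e2}.
M1 = [[1, 0, 0], [0, 1, 0]]
M2 = [[0, 1, 0], [0, 0, 1]]
dim_sum = rank(M1 + M2)                   # dim(M1 + M2) = 3
print(dim_sum + 1, rank(M1) + rank(M2))   # 4 == 4, with dim(M1 ∩ M2) = 1
```

Here dim(M1 ∩ M2) = 1 is read off directly (the intersection is the e2-axis); the formula then balances: 3 + 1 = 2 + 2.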

Example. Consider the infinite dimensional vector space F[x] and the subspace

M = {∑ ci x2i | ci ∈ F}

of even polynomials. A basis of M is the set of all even powers of x (of which there are infinitely many), so dim M = dim F[x], but M ≠ F[x].

Example. Consider the vector space Mn(F) of n × n matrices over the field F. If char(F) = 2, then it is possible to have matrices which are both symmetric and skew-symmetric. Recall from the example on fields of characteristic 2 that if char(F) = 2, then x = −x for every x ∈ F. For example, consider M2(F), where char(F) = 2, and let a, b, c, d ∈ F. A matrix is symmetric if

[ a b ]   [ a c ]
[ c d ] = [ b d ].

Additionally, the matrix is skew-symmetric if

[ a b ]   [ −a −c ]
[ c d ] = [ −b −d ].

To further highlight this example, consider the field F = Z2 and let

A = [ 1 1 ]
    [ 1 0 ].

Then, At = A, and

−At = [ −1 −1 ] = A,
      [ −1  0 ]

since −1 ≡ 1 (mod 2). Thus, A is both symmetric and skew-symmetric.
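The computation above can be replayed mechanically, reducing entries mod 2 (a small added sketch):

```python
# Over F = Z_2, the matrix A = [[1, 1], [1, 0]] is both symmetric
# (A^t = A) and skew-symmetric (-A^t = A), since -1 ≡ 1 (mod 2).

A = [[1, 1],
     [1, 0]]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def neg_mod2(m):
    return [[(-x) % 2 for x in row] for row in m]

At = transpose(A)
print(At == A)            # True: A is symmetric
print(neg_mod2(At) == A)  # True: A is also skew-symmetric
```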


Suppose V is a finite-dimensional vector space over F and B = {α1, . . . , αn} is a basis of V. There is a canonical map, called the coordinate map, φB : V → Fn, defined by

φB(v) = [v]B = (c1, . . . , cn)T,

where v = c1α1 + · · · + cnαn. That is, the coordinate map sends a vector to the n-tuple of its coefficients in Fn.
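For a concrete (assumed) basis of R2, the coordinate map amounts to solving a small linear system; the sketch below uses Cramer's rule:

```python
# Coordinates of v with respect to the basis B = {(1, 1), (1, -1)} of R^2:
# solve c1*b1 + c2*b2 = v by Cramer's rule on the 2x2 change-of-basis matrix.

def coords(basis, v):
    (a, c), (b, d) = basis          # b1, b2 are the matrix columns
    det = a * d - b * c             # nonzero precisely when B is a basis
    c1 = (v[0] * d - b * v[1]) / det
    c2 = (a * v[1] - v[0] * c) / det
    return (c1, c2)

B = [(1, 1), (1, -1)]
v = (3, 1)
print(coords(B, v))   # (2.0, 1.0): indeed v = 2*(1,1) + 1*(1,-1)
```

Uniqueness of the coordinates is exactly the unique-representation property of a basis stated earlier.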

1.3.4 Internal vs. External Direct Sum

Let {Vk}k∈K be a family of vector spaces over the (same) field F. We define the external direct sum of this family as

⊕k∈K Vk = {f : K → ⋃k∈K Vk | f(k) ∈ Vk for all k, and f(k) = 0 except for finitely many k ∈ K}.

In other words, the function f has finite support. Equivalently, for a finite index set K = {1, . . . , n},

⊕k∈K Vk = {(v1, . . . , vn) | vi ∈ Vi, i = 1, . . . , n}.

In either case, we make this into a vector space by defining pointwise addition and scalar multiplication.
On the other hand, we may define a vector space from a family of subspaces of a vector space V over F. Let {Sk}k∈K be a family of subspaces of V over the field F. We say V is the internal direct sum of {Sk}k∈K if for any v ∈ V there exist unique vk ∈ Sk such that v = v1 + · · · + vn. In this case, we write

V = ⊕k Sk,

where Sk is referred to as a direct summand of V. If V = S ⊕ T, then T is called the complement of S in V.

Theorem. For any subspace S of a vector space V, there exists a subspace T of V such that V = S ⊕ T. That is, any subspace of V has a complement in V.

Proof. First, assume V is finite-dimensional. Choose a basis B = {αi1, αi2, . . . , αim} of S and extend it to a basis α = {α1, . . . , αn} of V, where {i1, . . . , im} ⊆ {1, . . . , n} and m ≤ n. Define B′ = {αim+1, . . . , αin}, where {im+1, . . . , in} = {1, . . . , n}\{i1, . . . , im}, and let T = L(B′). If v ∈ V has expression v = c1α1 + · · · + cnαn with respect to α, then after reordering (if necessary)

v = (ci1αi1 + · · · + cimαim) + (cim+1αim+1 + · · · + cinαin) = vs + vt,

with vs ∈ S and vt ∈ T. Clearly, S ∩ T = {0} and T is a subspace, by definition.
There is nothing dramatically different about the infinite-dimensional case. If α is an infinite basis of V containing a basis Bα of S, then define B′ = α\Bα and T = L(B′).

.. Linear Transformations

Given vector spaces V ,W over a (common) field F , a map T : V →W satisfying

T (c1α + c2β) = c1T α + c2T β, (L)

for every c1, c2 ∈ F and α, β ∈ V, is called a linear transformation or homomorphism. In this context, the terms linear transformation and homomorphism are used interchangeably.

Example ..

1) The coordinate map φB : V → Fn, where V is finite-dimensional, is a homomorphism from V to n-tuples of coordinates.

2) In the vector space V = Mm,n(F), left matrix multiplication by a matrix A ∈ Mm(F),

T B = AB, B ∈ V,

is a homomorphism from V to V.

3) Let V = Ck(I), where k ≥ 2 and I is a subinterval of R. Ordinary differentiation f 7→ f′ is a linear transformation from Ck(I) to Ck−1(I).

4) Let A = [a1, b1] × · · · × [an, bn] ⊂ Rn and consider the vector space V = R(A) of Riemann-integrable functions. Integration defines a linear transformation T f = ∫A f.
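Restricted to a finite-dimensional subspace, the differentiation map of example 3 is given by a matrix acting on coefficient vectors — a small sketch (numpy; the degree cutoff and coefficient convention are illustrative assumptions):

```python
import numpy as np

# Differentiation on polynomials of degree <= 3, represented by coefficient
# vectors (c0, c1, c2, c3) for c0 + c1*x + c2*x^2 + c3*x^3; the image lives in
# degree <= 2, so the matrix is 3 x 4.
D = np.array([[0., 1., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.]])

p = np.array([5., 4., 3., 2.])    # 5 + 4x + 3x^2 + 2x^3
assert np.allclose(D @ p, [4., 6., 6.])   # derivative: 4 + 6x + 6x^2
```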

We denote the set of all linear transformations from V to W, over F, by HomF(V, W) or, alternately, L(V, W). If it is not necessary to specify the field, we often drop the subscript, denoting this set simply as Hom(V, W). Note that we may define the vector space W V, consisting of all maps from V to W, with pointwise operations.

Fact: Hom(V, W) is a subspace of W V.

In the following, we define several special kinds of homomorphisms.

Definition ..

1) A homomorphism T ∈ Hom(V, V) is called an endomorphism. The space of all endomorphisms is often written simply as Hom(V).

2) An injective (1-1) homomorphism is called a monomorphism.

3) A surjective (onto) homomorphism is called an epimorphism.

4) A bijective homomorphism is called an isomorphism.

5) A bijective endomorphism is called an automorphism.

Additionally, we define the following sets associated with a homomorphism T .

Definition ..

1) The kernel of T is defined by

ker(T) = {v ∈ V | T v = 0}.

Alternately, the kernel of T is called the nullspace, which is denoted by N(T).

2) The image of T is defined by

im(T) = {w ∈ W | w = T v for some v ∈ V}.

Alternately, the image of T is called the range (space), which is denoted by R(T).

The following theorem is a good exercise.

Theorem .. Let T be a homomorphism from V to W .

a) T is a monomorphism if and only if ker(T) = {0}.

b) T is an epimorphism if and only if im(T ) =W .

If there is an isomorphism T : V → W, then we say V and W are isomorphic and write V ≅ W. By Exercises and , ker(T) is a subspace of V, while im(T) is a subspace of W. The dimension of ker(T) is often called the nullity of T, denoted by null(T). The dimension of im(T) is called the rank of T, denoted by rank(T).

Theorem .. Let V ,W be vector spaces over F and let T ∈Hom(V ,W ).

a) If T is an epimorphism, then dimV ≥ dimW .

b) If dim V = dim W < ∞, then T is an isomorphism if and only if T is injective or surjective. That is, T is injective if and only if T is surjective.

c) dim(im T) + dim(ker T) = dim V. In other notation, that is rank T + null T = dim V.
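The rank-nullity identity in part c) can be checked on a concrete matrix map (a sympy sketch, used here for its exact null space computation; the matrix is an illustrative assumption):

```python
import sympy as sp

# Rank-nullity for T: R^3 -> R^2 given by a matrix: dim(im T) is the rank and
# dim(ker T) is the number of null space basis vectors.
A = sp.Matrix([[1, 2, 3],
               [2, 4, 6]])
rank, nullity = A.rank(), len(A.nullspace())
assert rank == 1 and nullity == 2
assert rank + nullity == A.cols    # rank T + null T = dim V
```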


Theorem .. Let V be a finite-dimensional vector space with basis α = {α1, . . . , αn}. Identify Wn with functions from {1, . . . , n} to W. For every collection of vectors β = (β1, . . . , βn) ∈ Wn there exists a unique homomorphism T ∈ Hom(V, W) such that T αi = βi, for each i = 1, . . . , n.

Proof. Recall that the coordinate map φα : V → Fn is an isomorphism. Define a homomorphism from Fn to W by

Lβ(x1, . . . , xn) = x1β1 + · · · + xnβn.

It is left as an exercise to show that Lβ is a homomorphism. Now, consider the map T = Lβ ∘ φα. Since φα(αi) = δi, we have

T αi = Lβ(δi) = βi,

which holds for each i = 1, . . . , n. Now, suppose there is another S ∈ Hom(V, W) such that Sαi = βi, for each i. Let v ∈ V be an arbitrary vector with expression v = a1α1 + · · · + anαn. Then

T v = T(a1α1 + · · · + anαn) = a1Tα1 + · · · + anTαn
    = a1β1 + · · · + anβn
    = a1Sα1 + · · · + anSαn
    = S(a1α1 + · · · + anαn)
    = Sv.

Since T v = Sv for any v ∈ V, we have T = S.
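The theorem in action, for V = R² with the standard basis: choosing images of the basis vectors determines the whole map, whose matrix has those images as columns (numpy sketch; the chosen images are illustrative assumptions):

```python
import numpy as np

# Prescribe T(e1) and T(e2); the unique homomorphism T: R^2 -> R^3 is the matrix
# whose columns are those images.
beta = [np.array([1., 2., -1.]), np.array([0., 1., 3.])]
T = np.column_stack(beta)

v = np.array([2., 5.])                                 # v = 2 e1 + 5 e2
assert np.allclose(T @ v, 2 * beta[0] + 5 * beta[1])   # T v = sum a_i beta_i
```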

Corollary .. Let V be as in the previous theorem. Every basis α of V determines an isomorphism Ψ(α) : Hom(V, W) → Wn.

Proof. Define Ψ(α) : Hom(V, W) → Wn by Ψ(α)T = (T α1, . . . , T αn)t and inverse ξ : Wn → Hom(V, W) by ξ((β1, . . . , βn)t) = Lβ ∘ φα, with Lβ as in the proof of the previous theorem.

Let V and W be finite-dimensional vector spaces over F, with α = {α1, . . . , αn} a basis for V and β = {β1, . . . , βm} a basis for W. Consider any homomorphism T ∈ Hom(V, W). If


T αi ≠ 0, then T αi ∈ im(T) ⊆ W. Thus, there are aij ∈ F such that T αi = ai1β1 + · · · + aimβm for each i = 1, . . . , n. Now, recall the coordinate map φβ : W → Fm. By the definition, we have

φβ(T αi) = φβ(ai1β1 + · · · + aimβm) = (ai1, ai2, . . . , aim)t ∈ Fm.

Then, we define the homomorphism Γ(α, β) : Hom(V, W) → Mm,n(F) by

Γ(α, β)T = [φβ(T α1), φβ(T α2), . . . , φβ(T αn)]. (.)

This construction yields the matrix Γ(α, β)T ∈ Mm,n(F), called the matrix of T relative to the bases α and β or, simply, the matrix representation of T. This suggests the following: if V is n-dimensional and W is m-dimensional, then Hom(V, W) is nm-dimensional.
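The construction of Γ(α, β)T can be carried out numerically: each column is the β-coordinate vector of T αi, obtained by solving against the basis matrix (numpy sketch; the map and both bases are illustrative assumptions):

```python
import numpy as np

A = np.array([[2., 1.], [0., 3.]])         # T in standard coordinates
alpha = np.array([[1., 1.], [1., -1.]]).T  # columns: alpha_1, alpha_2 (domain basis)
beta  = np.array([[1., 0.], [1., 1.]]).T   # columns: beta_1, beta_2 (codomain basis)

# phi_beta(w) solves beta @ c = w; doing this for all T(alpha_i) at once gives Gamma
Gamma = np.linalg.solve(beta, A @ alpha)

# check: Gamma carries alpha-coordinates of v to beta-coordinates of T v
v = np.array([3., 5.])
assert np.allclose(Gamma @ np.linalg.solve(alpha, v),
                   np.linalg.solve(beta, A @ v))
```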

Given a homomorphism T ∈ Hom(V), we say a subspace S of V is T-invariant if T(S) ⊆ S. Furthermore, given a subset L of Hom(V), a subspace S of V is called L-invariant if S is T-invariant for every T ∈ L. The vector space V is called L-irreducible if the only L-invariant subspaces of V are {0} and V.

. Quotient Spaces

Let S be a subspace of a vector space V. Recall that we may define an equivalence relation by

u ≡ v ⇐⇒ u − v ∈ S,

in which case, we say u is equivalent (congruent) to v mod S. Furthermore, the equivalence classes are

[v] = v + S,

cosets of S in V, where v is the coset representative. Additionally, recall that we denote the set of all cosets of S in V by

V/S = {v + S | v ∈ V},

which is the quotient space of V mod S. We make V/S into a vector space by defining

a(v + S) = av + S,
(u + S) + (v + S) = (u + v) + S,

for a ∈ F and u, v ∈ V. In this vector space, recall that the equivalence class of 0 is the subspace S. When looking at quotient spaces, an important mapping is the canonical projection (natural projection) πS : V → V/S defined by

πS(v) = v + S,

which sends each vector to its corresponding coset of S in V .


Lemma .. The canonical projection πS : V → V /S is surjective and ker(πS) = S.

Proof. First, suppose πS(v) = 0, so v + S = S, since the 0-coset is the space S. But v + S = S ⇐⇒ v ∈ S, so ker(πS) = S. If w + S is a coset in V/S, then πS(w) = w + S, so πS is surjective.

By the rank-nullity theorem .(c), we have the following result, since im πS = V/S and ker πS = S.

Corollary .. If V is finite-dimensional and S a subspace of V , then

dim V = dim S + dim(V/S).

Here, dim(V/S) is called the codimension of S in V.

Theorem . (First Isomorphism Theorem). Let T ∈ Hom(V, W) and suppose S is a subspace of V such that T(S) = 0; that is, S ⊆ ker T. There exists a unique ξ ∈ Hom(V/S, W) such that the following diagram commutes:

[commutative diagram: T maps V to W across the top; πS maps V down to V/S; ξ maps V/S up to W]

That is, ξ ∘ πS = T.

Proof. By the condition we are trying to satisfy, we define ξ ∈Hom(V /S,W ) by

ξ(v + S) = T v, v ∈ V

and show that ξ is, indeed, a linear transformation, followed by a uniqueness argument. We must also show that this gives a well-defined mapping. To show this, suppose u ∈ v + S. We must show that T u = T v. Equivalently, we suppose u − v ∈ S, in which case T(u − v) = 0, since S ⊆ ker T. By linearity, it follows that T u = T v. This shows that ξ is well-defined, in the sense that this mapping only depends on the coset, rather than a particular coset representative.

Now, let a, b ∈ F and u, v ∈ V. By definition, we have

ξ(a[u] + b[v]) = ξ([au + bv]) = T(au + bv)
              = aT(u) + bT(v)
              = aξ([u]) + bξ([v]),


so ξ is indeed a linear transformation. Also, for v ∈ V , we have

ξ(πS(v)) = ξ([v]) = T v

by definition, so the diagram commutes.

Lastly, suppose there is another τ ∈ Hom(V/S, W) such that τ ∘ πS = T. Then τ([v]) = T v = ξ([v]) for any [v] ∈ V/S. Since πS is surjective, we have τ = ξ (equality must hold for all v ∈ V).

Corollary .. For T ∈ Hom(V, W) we have im T ≅ V/ ker T.

Now, we consider the scenario in which V = M ⊕ N, so N is a complement of M in V. In this case, any element v ∈ V can be expressed uniquely as v = vm + vn, where vm ∈ M and vn ∈ N. As in Exercise , we may define a projection operator P : V → N by P(vm + vn) = vn. The operator P is called the projection onto N along M. Clearly, we have

im(P ) =N

and since

P(vm + vn) = 0 ⇒ vn = 0,

we also have ker(P) = M. If we apply the First Isomorphism Theorem, in particular Corollary ., we have

im(P) ≅ V/ ker(P) ⇐⇒ N ≅ V/M.

This proves the following result.

Corollary .. Any complement of the subspace S in V is isomorphic to V /S.
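The projection onto N along M can be sketched in coordinates: decompose v against a basis adapted to M ⊕ N and keep the N-part (numpy; the particular subspaces of R² are illustrative assumptions):

```python
import numpy as np

m, n = np.array([1., 0.]), np.array([1., 1.])  # V = M ⊕ N, M = span{m}, N = span{n}
B = np.column_stack([m, n])

def P(v):
    a, b = np.linalg.solve(B, v)   # unique decomposition v = a*m + b*n
    return b * n                   # projection onto N along M

v = np.array([3., 2.])
assert np.allclose(P(P(v)), P(v))                      # P^2 = P
assert np.allclose(np.linalg.solve(B, P(v))[0], 0.0)   # ker P = M: P v has no m-part
```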

Theorem . (Second Isomorphism Theorem). Suppose M, N are subspaces of a vector space V such that M ⊆ N. Then

(V/M)/(N/M) ≅ V/N.

Proof. First, note that N/M is a subspace of V/M. Define a homomorphism T ∈ Hom(V/M, V/N) by T(v + M) = v + N. Since M ⊆ N, if u + M = v + M, then u − v ∈ M ⊆ N, hence u + N = v + N, showing that T is well-defined. The 0-coset in V/N is N; thus,

T(u + M) = N ⇐⇒ u ∈ N
          ⇐⇒ u + M ∈ N/M,

which shows that ker T = N/M. Also, T is surjective, so im T = V/N. By Corollary .,

im T ≅ V/ ker T,

which, in this case, reads

V/N ≅ (V/M)/(N/M).


The next isomorphism theorem is a generalization of Corollary .. Note that, in some of the literature, the second and third isomorphism theorems appear in reverse order.

Theorem . (Third Isomorphism Theorem). Let M, N be subspaces of a vector space V. Then

(M + N)/M ≅ N/(M ∩ N).

Proof. Define T : M + N → N/(M ∩ N) by T(m + n) = n + (M ∩ N). (T is well-defined: if m + n = m′ + n′ with m, m′ ∈ M and n, n′ ∈ N, then n − n′ = m′ − m ∈ M ∩ N.) Since ker T = M, Corollary . gives

(M + N)/M ≅ N/(M ∩ N).

Likewise, a similar map may be defined to show

(M + N)/N ≅ M/(M ∩ N).

A vector space A over F with a third operation, vector multiplication, from A × A → A, satisfying

A1) (ab)c = a(bc), for all a, b, c ∈ A,
A2) a(b + c) = ab + ac and (a + b)c = ac + bc, for all a, b, c ∈ A,
A3) x(ab) = (xa)b = a(xb), for all x ∈ F, a, b ∈ A,

is called an associative algebra (or linear algebra). If there is a multiplicative identity, 1 ∈ A such that a1 = 1a = a for every a ∈ A, then A is an (associative) algebra with identity. An algebra is commutative if ab = ba for every a, b ∈ A.

If A, B are algebras, then a homomorphism ϕ ∈ Hom(A, B) is called an algebra homomorphism if it additionally satisfies ϕ(ab) = ϕ(a)ϕ(b) for all a, b ∈ A. That is, an algebra homomorphism preserves multiplication.
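For example, M2(R) with matrix multiplication is an associative algebra with identity that is not commutative; a quick numerical spot-check of A1)–A3) on sample elements (numpy; the matrices are chosen for illustration):

```python
import numpy as np

a = np.array([[0., 1.], [0., 0.]])
b = np.array([[0., 0.], [1., 0.]])
c = np.array([[1., 2.], [3., 4.]])
x = 2.5

assert np.allclose((a @ b) @ c, a @ (b @ c))    # A1: associativity
assert np.allclose(a @ (b + c), a @ b + a @ c)  # A2: distributivity
assert np.allclose(x * (a @ b), (x * a) @ b)    # A3: compatibility with scalars
assert not np.allclose(a @ b, b @ a)            # not commutative: ab != ba
```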

Exercises - Section .

. (a) Define S = {A ∈ Mn(F) | At = A}, the subset of all symmetric matrices in Mn(F). Show that S is a subspace of Mn(F).

(b) Define T = {A ∈ Mn(F) | At = −A}, the subset of all skew-symmetric matrices in Mn(F). Show that T is a subspace of Mn(F).

. Let S be a subspace of a vector space V and let α, β ∈ V. Set A = α + S and B = β + S. Show that A = B or A ∩ B = ∅.

. If S1,S2 are subsets of a vector space V , show that L(S1 ∪ S2) = L(S1) +L(S2).

. Determine two subspaces S1,S2 of R2 such that S1 ∪ S2 is not a subspace of R2.


. (Dedekind’s Modular Law) Let R, S, T be subspaces of a vector space V with S ⊆ R. Show R ∩ (S + T) = S + (R ∩ T).

. Let V be a finite-dimensional vector space with basis B. Show that the coordinate map φB is bijective. Show φB preserves addition and scalar multiplication, that is

φB(av + bw) = aφB(v) + bφB(w),

where a,b ∈ F and v,w ∈ V . This shows that φB is an isomorphism from V onto Fn,

where dimV = n.

. Let V be the vector space of all functions from R to R and define

Ve = {f ∈ V | f(−x) = f(x)}
Vo = {f ∈ V | f(−x) = −f(x)}

the subsets of even and odd functions, respectively.

(a) Show Ve and Vo are subspaces of V

(b) Prove Ve +Vo = V

(c) Prove Ve ∩ Vo = {0}

This shows that V = Ve ⊕Vo.

. Let S1, S2 be subspaces of a vector space V such that S1 + S2 = V and S1 ∩ S2 = {0}. Prove that for each α ∈ V there are unique vectors α1 ∈ S1, α2 ∈ S2 such that α = α1 + α2.

. Define V = R as a vector space over the field F = Q. Show that this vector space is infinite-dimensional.

. For a homomorphism T ∈Hom(V ,W ), show that ker(T ) is a subspace of V .

. For a homomorphism T ∈Hom(V ,W ), show that im(T ) is a subspace of W .

. Suppose V =M ⊕N and define linear operators PM , PN ∈Hom(V ) by

PM(vm + vn) = vm
PN(vm + vn) = vn,

where vm ∈ M, vn ∈ N. The operators PM : V → M and PN : V → N are called projection operators. Show that

(a) PM² = PM and PN² = PN.

(b) PM + PN = IV , where IV is the identity on V .


(c) PMPN = 0 = PNPM
(d) V = im(PM) ⊕ im(PN)

. Let T ∈ Hom(V). Show that T² = 0 if and only if there exist two subspaces M, N of V such that

(a) M + N = V
(b) M ∩ N = {0}
(c) T(N) = 0
(d) T(M) ⊆ N.

. An operator T ∈ Hom(V) is called an involution if T² = IV. If T is an involution, show there exist subspaces M, N of V such that

(a) M + N = V
(b) M ∩ N = {0}
(c) T v = v for every v ∈ M
(d) T w = −w for every w ∈ N.

. Let T ∈ Hom(R3, R3) be defined by

T(u, v, w) = (3u, u − v, 2u + v + w).

Determine if T is a monomorphism. Why is the answer to this question enough to determine whether T is an isomorphism (automorphism) or not?

. Suppose T ∈ Hom(F3, F3) is an endomorphism such that

T δ1 = (1, 2, −1)t;  T δ2 = (−1, 1, −2)t;  T δ3 = (2, 0, 2)t,

where {δ1, δ2, δ3} is the standard basis of F3. Describe T explicitly; that is, give a formula for T(u, v, w) as in Problem .


. Consider the vector space V = C([−1, 1]) of continuous, real-valued functions defined on [−1, 1]. Let Se = {f ∈ V | f(−x) = f(x)} be the subspace of even functions and So = {f ∈ V | f(−x) = −f(x)} the subspace of odd functions. Define the operator T as the indefinite integral

(T f)(x) = ∫₀ˣ f(s) ds.

Are the subspaces Se and So T-invariant?

. Show that E(V ) = Hom(V ,V ) with composition of endomorphisms,

(T1, T2) 7→ T1 ∘ T2,

defines an associative algebra with identity. Is this algebra commutative?

. Let M be a subspace of a vector space V and let {m1, . . . , mn} be a basis of M. Explain how to determine a basis for V/M.

Modules; Linear and Multilinear functionals

. Linear functionals

In this section, we discuss a special class of homomorphisms defined on a vector space V, namely those that map V to the field F. A map f ∈ Hom(V, F) is called a linear functional. That is, f is a map from V to F such that

f(av + bw) = af(v) + bf(w),

for all a, b ∈ F and v, w ∈ V. The set of all linear functionals defined on V is called the dual space of V, denoted by V∗. So, V∗ is simply a shorthand notation for Hom(V, F). We have already shown, in section .., that Hom(V, W) is a vector space in general, so V∗ is a vector space (with pointwise operations).

Example ..

1) Consider the vector space C[0,1] of all continuous, real-valued functions defined on [0,1]. Definite integration defines a linear functional; for any f ∈ C[0,1] define

ϕ(f) = ∫₀¹ f(x) dx.

Since integration is linear, ϕ is an element of the dual space C[0,1]∗.

2) On any function space, C[0,1] or F[x] for example, point evaluation defines a linear functional. For any p ∈ F[x], define

g(p) = p(0),

point evaluation at 0. So, g is an element of the dual space F[x]∗.
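The linearity of the point-evaluation functional is easy to spot-check when polynomials are stored as coefficient lists (a plain-Python sketch; the representation and sample polynomials are illustrative assumptions):

```python
# Polynomials as coefficient lists (c0, c1, c2, ...); g is evaluation at 0.
def evaluate(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def g(p):
    return evaluate(p, 0)    # point evaluation at 0, i.e. the constant term

p, q = [1, 2, 3], [4, 0, 5]                 # 1 + 2x + 3x^2 and 4 + 5x^2
a, b = 2, -3
combo = [a * ci + b * di for ci, di in zip(p, q)]
assert g(combo) == a * g(p) + b * g(q)      # g(ap + bq) = a g(p) + b g(q)
```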

Remark .. Recall the matrix representation of a homomorphism T ∈ Hom(V, W), given by (.). If f is a linear functional on V, we may determine its matrix representation, since linear functionals are special types of homomorphisms. Since linear functionals map to the field F, which has basis β = {1}, the calculation of Γ(α, β)f is greatly simplified. As an example, let V = F[x]2 and define

f(p) = ∫₀² p(x) dx.

Taking the standard basis α = {α1, α2, α3} of V, where α1 = 1, α2 = x, α3 = x², we have

f(α1) = ∫₀² 1 dx = 2,
f(α2) = ∫₀² x dx = 2,
f(α3) = ∫₀² x² dx = 8/3.

Thus, the matrix of f is given by

Γ(α, β)f = (φβ f(α1)  φβ f(α2)  φβ f(α3)) = (2  2  8/3).

As a verification, let p(x) = ax² + bx + c. Evaluating f(p) directly yields

∫₀² (ax² + bx + c) dx = (8/3)a + 2b + 2c.

We may also compute f(p) by multiplying the coefficients of p in α by the matrix of f, giving

(2  2  8/3) (c, b, a)t = 2c + 2b + (8/3)a.

Similarly, if V = F[x]2 and f(p) = p(t), for some fixed t ∈ F, it is straightforward to verify that the matrix of f is given by (1  t  t²).
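The verification above can be replayed in exact arithmetic: applying the row matrix (2, 2, 8/3) to the coordinates of p agrees with direct integration (a stdlib `fractions` sketch; the sample coefficients are illustrative assumptions):

```python
from fractions import Fraction as Fr

row = [Fr(2), Fr(2), Fr(8, 3)]      # matrix of f on the basis 1, x, x^2
a, b, c = Fr(3), Fr(-1), Fr(5)      # p(x) = a x^2 + b x + c
coords = [c, b, a]                  # coordinates of p in the basis (1, x, x^2)

matrix_value = sum(r * z for r, z in zip(row, coords))
direct_value = Fr(8, 3) * a + 2 * b + 2 * c   # integral of p over [0, 2]
assert matrix_value == direct_value
```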


Our next few results are concerned with the structure of the dual space V∗ and its relation with the vector space V. First, note that if v ∈ V is nonzero, then there exists a linear functional f ∈ V∗ such that f(v) ≠ 0. In other words, if f(v) = 0 for every f ∈ V∗, then v = 0. Since dim F = 1, any linear functional is either identically 0 or surjective (Exercise). Furthermore, by the rank-nullity theorem, if f ≠ 0 and dim V < ∞ we have

dim(ker f) = dim V − 1.

In fact, the following theorem shows how the image and kernel of a nonzero functionalrelate to V .

Theorem .. If f ∈ V∗ and f(v) ≠ 0 for some v ∈ V, then V = L(v) ⊕ ker(f).

Proof. We must show that L(v) ∩ ker(f) = {0} and L(v) + ker(f) = V. First, suppose u ∈ L(v) ∩ ker(f), so f(u) = 0 and u = av for some a ∈ F. Thus f(av) = 0, which implies that af(v) = 0; but f(v) ≠ 0, hence a = 0, so u = 0.

Now, let u ∈ V be an arbitrary vector, say f(u) = b ∈ F. We will show that u ∈ L(v) + ker(f). Since f(v) ≠ 0, we may set f(v) = c ≠ 0. We have f((b/c)v) = (b/c)f(v) = b = f(u), and clearly (b/c)v ∈ L(v). Also,

f(u − (b/c)v) = f(u) − f(u) = 0,

so u − (b/c)v ∈ ker(f). Since

u = (b/c)v + (u − (b/c)v),

we have u ∈ L(v) + ker(f). Finally, V = L(v) ⊕ ker(f).

Now, suppose V has a basis α. We may define a set of linear functionals α∗j ∈ V∗ by

α∗j(αi) = δi,j.

The set α∗ = {α∗i | i ∈ ∆} is linearly independent, as shown in the following theorem.

Theorem .. If V is finite-dimensional, then dim V = dim V∗. Furthermore, the set α∗ is a basis of V∗.

Proof. It suffices to show that α∗ is a basis of V∗, from which it follows that dim V = dim V∗. To show linear independence, suppose c1α∗1 + · · · + cnα∗n = 0 for some ci ∈ F.


Thus, c1α∗1(v) + · · · + cnα∗n(v) = 0 for all v ∈ V. In particular, this must be true on each basis element αi ∈ α of V. So,

c1α∗1(αj) + · · · + cnα∗n(αj) = 0, for each j = 1, . . . , n,

which implies that cj = 0, for each j = 1, . . . , n, by the definition α∗i(αj) = δi,j. Hence, the set α∗ is linearly independent.

Now, we must show that α∗ spans V∗, that is, L(α∗) = V∗, which follows from linearity.

Let f ∈ V∗ be an arbitrary nonzero linear functional. If v ∈ V has expression v = a1α1 + · · · + anαn, then

f(v) = f(a1α1 + · · · + anαn) = a1f(α1) + · · · + anf(αn), (.)

which we will show is equivalent to the expression f(α1)α∗1(v) + · · · + f(αn)α∗n(v). To wit,

f(α1)α∗1(v) + · · · + f(αn)α∗n(v) = ∑i f(αi) α∗i(a1α1 + · · · + anαn)
                                 = ∑i f(αi) ∑j aj α∗i(αj)
                                 = a1f(α1) + · · · + anf(αn),

which again follows from the definition α∗i(αj) = δi,j. In conclusion, any f ∈ V∗ can be expressed as f = f(α1)α∗1 + · · · + f(αn)α∗n, hence L(α∗) = V∗.

The basis α∗ is called the dual basis of α.
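In coordinates, the dual basis has a tidy description: if the αi are the columns of a matrix A, the functionals α∗j are the rows of A⁻¹, since A⁻¹A = I encodes α∗j(αi) = δi,j (numpy sketch; the basis is an illustrative assumption):

```python
import numpy as np

A = np.array([[1., 1.], [0., 1.]])       # basis alpha_1 = (1,0), alpha_2 = (1,1)
dual = np.linalg.inv(A)                   # row j represents the functional alpha*_j

assert np.allclose(dual @ A, np.eye(2))   # alpha*_j(alpha_i) = delta_ij
v = np.array([3., 5.])
coords = dual @ v                         # alpha*_j(v) are the coordinates of v
assert np.allclose(A @ coords, v)
```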

Example .. Let V = R[x]2 = {ax² + bx + c | a, b, c ∈ R} be the vector space of real polynomials of degree 2 or less. For any p ∈ V, define three linear functionals by

f1(p) = ∫₀¹ p(x) dx,

f2(p) = ∫₀² p(x) dx,

f3(p) = ∫₀⁻¹ p(x) dx.

We will determine a basis of V for which f = {f1, f2, f3} is the dual basis. Let

p1(x) = a1x² + b1x + c1
p2(x) = a2x² + b2x + c2
p3(x) = a3x² + b3x + c3

be polynomials in V. For f to be the dual basis of {p1, p2, p3} we must have fi(pj) = δi,j for 1 ≤ i, j ≤ 3. To determine p1, the conditions are

∫₀¹ (a1x² + b1x + c1) dx = 1  ⇒  (1/3)a1 + (1/2)b1 + c1 = 1
∫₀² (a1x² + b1x + c1) dx = 0  ⇒  (8/3)a1 + 2b1 + 2c1 = 0
∫₀⁻¹ (a1x² + b1x + c1) dx = 0  ⇒  −(1/3)a1 + (1/2)b1 − c1 = 0.

This yields the linear system of equations

[ 1/3  1/2   1 ] [a1]   [1]
[ 8/3   2    2 ] [b1] = [0]
[−1/3  1/2  −1 ] [c1]   [0].

Likewise, the conditions for p2 and p3 yield the linear systems

[ 1/3  1/2   1 ] [a2]   [0]
[ 8/3   2    2 ] [b2] = [1]
[−1/3  1/2  −1 ] [c2]   [0]

and

[ 1/3  1/2   1 ] [a3]   [0]
[ 8/3   2    2 ] [b3] = [0]
[−1/3  1/2  −1 ] [c3]   [1].

Thus, if we define

G = [ 1/3  1/2   1 ]
    [ 8/3   2    2 ]
    [−1/3  1/2  −1 ],

then

G⁻¹ = [a1 a2 a3]
      [b1 b2 b3]
      [c1 c2 c3].

One can check that this yields

p1(x) = −(3/2)x² + x + 1,
p2(x) = (1/2)x² − 1/6,
p3(x) = −(1/2)x² + x − 1/3.
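The computation in this example can be reproduced in exact arithmetic (a sympy sketch; sympy itself is an assumption of this illustration, not part of the notes): the columns of G⁻¹ give the coefficients of the pj, and the duality conditions fi(pj) = δi,j can be verified by symbolic integration.

```python
import sympy as sp

x = sp.symbols('x')
G = sp.Matrix([[sp.Rational(1, 3), sp.Rational(1, 2), 1],
               [sp.Rational(8, 3), 2, 2],
               [sp.Rational(-1, 3), sp.Rational(1, 2), -1]])
C = G.inv()                       # column j holds (a_j, b_j, c_j)
polys = [C[0, j] * x**2 + C[1, j] * x + C[2, j] for j in range(3)]

ends = [1, 2, -1]                 # upper limits of integration for f1, f2, f3
for i, e in enumerate(ends):
    for j, p in enumerate(polys):
        assert sp.integrate(p, (x, 0, e)) == (1 if i == j else 0)  # f_i(p_j) = delta_ij

# the third dual polynomial, exactly:
assert sp.expand(polys[2] - (-sp.Rational(1, 2) * x**2 + x - sp.Rational(1, 3))) == 0
```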

Given our previous remarks, the dual space V∗ is, itself, a vector space, so we may define its dual space, (V∗)∗, which we typically denote by V∗∗. The space V∗∗ is called the double dual and it consists of linear functionals in Hom(V∗, F). A vector space V is called reflexive if V ≅ V∗∗.

We define a map ϕ : V × V∗ → F, called the dual pairing, by ϕ(v, f) = f(v), for all v ∈ V, f ∈ V∗. The dual pairing simply evaluates the given linear functional at the given vector. This is an example of a bilinear functional, meaning that the functional is linear over V and over V∗ (linear in both components). In general, a map ω : U × V → W is called bilinear if

ω(au1 + bu2, v) = aω(u1, v) + bω(u2, v)
ω(u, av1 + bv2) = aω(u, v1) + bω(u, v2),

for all a, b ∈ F, u, u1, u2 ∈ U and v, v1, v2 ∈ V. You should verify that the dual pairing satisfies the above conditions. Sometimes, the dual pairing is referred to as the natural


bilinear functional. The dual pairing ϕ yields a natural homomorphism ψ : V → V∗∗ defined by ψ(v) = ϕ(v, ·), where the right-hand side takes any f ∈ V∗ and evaluates f(v). Thus, ψ(v) ∈ V∗∗ for each v ∈ V. The linearity of ψ follows directly from the linearity of ϕ.

Theorem .. The map ψ : V → V ∗∗ is injective. If dimV <∞, then ψ is an isomorphism.

Proof. To show ψ is injective, suppose ψ(v) = 0. So, ϕ(v, ·) = 0, which means that f(v) = 0 for all f ∈ V∗, which is true if and only if v = 0. Hence, ker ψ = {0}.

Now, suppose dim V < ∞, so dim V∗ < ∞, from which it follows that dim V = dim V∗∗ < ∞. By Theorem .b, ψ must also be surjective, hence ψ is an isomorphism.

Corollary .. If dim V < ∞, then V is reflexive.

.. Operator adjoints and perp spaces

Now, if S is a subspace of V, we define the set

S⊥ = {f ∈ V∗ | ϕ(s, f) = 0, ∀s ∈ S},

which is read as S “perp.” Note that this is exactly the same as what some texts refer to as the annihilator of S, denoted by S0, since any linear functional in this set sends any element of S to 0. Analogously, if M is a subspace of V∗, we define the set

M⊥ = {v ∈ V | ϕ(v, f) = 0, ∀f ∈ M}.

Note that S⊥ is a subspace of V∗, while M⊥ is a subspace of V.

Example ..

a) Let V = R4 and consider the following three linear functionals

f1(x1,x2,x3,x4) = x1 + 2x2 + 2x3 + x4

f2(x1,x2,x3,x4) = 2x2 + x4

f3(x1,x2,x3,x4) = −2x1 − 4x3 + 3x4.

We can determine the subspace S of V that is perpendicular to {f1, f2, f3}; that is, the subspace that f1, f2, f3 annihilates. We must determine vectors (x1, x2, x3, x4) ∈ R4 such that

x1 + 2x2 + 2x3 + x4 = 0
2x2 + x4 = 0
−2x1 − 4x3 + 3x4 = 0,


which can be expressed in matrix form

[ 1  2  2  1 ]
[ 0  2  0  1 ] (x1, x2, x3, x4)t = (0, 0, 0)t.
[−2  0 −4  3 ]

The row-reduced echelon form of the above matrix is

[1 0 2 0]
[0 1 0 0]
[0 0 0 1],

hence S is spanned by the vector (−2, 0, 1, 0)t.

b) Now, let V = F4 where F = Z5 and consider the same linear functionals f1, f2, f3.

Now, f3 can be rewritten as

f3(x1, x2, x3, x4) = −2x1 − 4x3 + 3x4 ≡ 3x1 + x3 + 3x4 mod 5.

In this case, determining the subspace perpendicular to {f1, f2, f3} is equivalent to solving the linear system

[1 2 2 1]
[0 2 0 1] (x1, x2, x3, x4)t = (0, 0, 0)t.
[3 0 1 3]

Again, we need only row reduce to determine S, keeping in mind that all operations are modulo 5. By the following calculations,

r3 ← r3 + 2r1:  [1 2 2 1; 0 2 0 1; 0 4 0 0]
r1 ← r1 + 4r2:  [1 0 2 0; 0 2 0 1; 0 4 0 0]
r3 ← 4r3:       [1 0 2 0; 0 2 0 1; 0 1 0 0]
r2 ← 3r2:       [1 0 2 0; 0 1 0 3; 0 1 0 0]

and one more pass (r2 ← r2 + 4r3, then r2 ← 2r2, then swapping r2 and r3) brings the matrix to the same reduced form as in part a), [1 0 2 0; 0 1 0 0; 0 0 0 1]. We find that S is spanned by the vector (−2, 0, 1, 0)t, which is equivalent to (3, 0, 1, 0)t mod 5.
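The mod-5 reduction above can be automated: since 5 is prime, every nonzero element of Z5 is invertible, so ordinary Gaussian elimination works with modular inverses (a plain-Python sketch; the helper name is an assumption of this illustration, and `pow(x, -1, p)` requires Python 3.8+):

```python
def rref_mod_p(M, p):
    """Row-reduce a matrix over Z_p (p prime), using modular inverses for pivots."""
    M = [[x % p for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, p)                 # modular inverse of the pivot
        M[r] = [(x * inv) % p for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                M[i] = [(a - M[i][c] * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return M

A = [[1, 2, 2, 1], [0, 2, 0, 1], [3, 0, 1, 3]]
R = rref_mod_p(A, 5)
assert R == [[1, 0, 2, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
# null space mod 5: x3 free, x1 = -2 x3 = 3 x3, so S = span{(3, 0, 1, 0)}
s = (3, 0, 1, 0)
assert all(sum(a * b for a, b in zip(row, s)) % 5 == 0 for row in A)
```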

The following results follow by careful application of the definitions and are left as an exercise.

Theorem .. Let N,S be subspaces of a vector space V .


a) If N ⊆ S, then S⊥ ⊆N⊥.

b) S ⊆ (S⊥)⊥

c) (N ∪ S)⊥ =N⊥ ∩ S⊥

d) L(S)⊥ = S⊥.

Theorem .. Let S be a subspace of a finite-dimensional vector space V . Then

dimV = dimS + dimS⊥.

Proof. Let α = {α1, . . . , αn} be a basis of V and let α∗ = {α∗1, . . . , α∗n} be the corresponding dual basis. We already know that dim V = dim S + dim(V/S), so we will show that dim S⊥ = dim(V/S). After possibly reindexing, we may assume that BS = {α1, . . . , αm} is a basis of S, in which case the cosets of B′ = α\BS = {αm+1, . . . , αn} form a basis of V/S. We claim that B∗ = {α∗m+1, . . . , α∗n} is a basis of S⊥. Each α∗j with j ≥ m + 1 satisfies α∗j(αi) = 0 for i = 1, . . . , m, hence α∗j(s) = 0 for all s ∈ S, so B∗ ⊆ S⊥. Conversely, any f ∈ V∗ can be written f = f(α1)α∗1 + · · · + f(αn)α∗n; if f ∈ S⊥, the coefficients f(α1), . . . , f(αm) vanish, so f ∈ L(B∗). Since B∗ is linearly independent (being a subset of the basis α∗), it is a basis of S⊥, and dim S⊥ = n − m = dim(V/S). Therefore dim V = dim S + dim S⊥.

Given any homomorphism T ∈ Hom(V, W), the duality between a vector space and its dual space allows us to define the (operator) adjoint of T, denoted by T∗. The adjoint T∗ ∈ Hom(W∗, V∗) is defined by the condition T∗f = f ∘ T for every f ∈ W∗. In particular, if ϕW : W × W∗ → F and ϕV : V × V∗ → F denote the dual pairings for W and V, respectively, then the adjoint satisfies

ϕW(T v, f) = ϕV(v, T∗f),

for all v ∈ V, f ∈ W∗. As a consequence of the first isomorphism theorem, we have (V/S)∗ ≅ S⊥ (Exercise ). The following theorem details one connection between adjoints and perp spaces.

Theorem .. Let T ∈Hom(V ,W ). Then

a) (imT ∗)⊥ = kerT

b) kerT ∗ = (imT )⊥ .

Proof. a) Let ϕV and ϕW denote the dual pairings for V and W, respectively. Note that

(im T∗)⊥ = {α ∈ V | ϕV(α, T∗g) = 0, ∀T∗g ∈ im T∗}.


Let α ∈ (imT ∗)⊥. By definition of the adjoint, we have

ϕV (α,T ∗g) = 0, ∀g ∈W ∗

⇐⇒ ϕW (T α,g) = 0 ∀g ∈W ∗

⇐⇒ g(T α) = 0 ∀g ∈W ∗.

Since this is true for every g ∈ W∗, we must have T α = 0, so α ∈ ker T. Thus, (im T∗)⊥ ⊆ ker T. On the other hand, since our logic holds in the reverse order, we also have ker T ⊆ (im T∗)⊥, hence ker T = (im T∗)⊥.

b) Note that ker T∗ = {g ∈ W∗ | T∗g = 0}, where T∗g ∈ V∗ equals 0 if and only if (T∗g)v = 0 for every v ∈ V. So, if g ∈ ker T∗, we have

(T∗g)v = 0 ∀v ∈ V
⇐⇒ ϕV(v, T∗g) = 0 ∀v ∈ V
⇐⇒ ϕW(T v, g) = 0 ∀v ∈ V,

so g ∈ (im T)⊥ ⇐⇒ g ∈ ker T∗.

An important property of adjoints is detailed in the following theorem.

Theorem .. If T ∈Hom(U,V ) and L ∈Hom(V ,W ), then

(LT )∗ = T ∗L∗.

Proof. Note that (LT)∗ ∈ Hom(W∗, U∗), since LT ∈ Hom(U, W). Let u ∈ U, g ∈ W∗ be arbitrary. Then,

ϕW (LT u,g) = ϕV (T u,L∗g) = ϕU (u,T ∗L∗g),

by the corresponding definitions for L∗ and T∗. Since u, g are arbitrary, the equality follows.

The following corollary is left as an exercise (Exercise ).

Corollary .. If T ∈Hom(V ) is invertible, then

(T ∗)−1 = (T −1)∗.

We record two important results about the adjoint of a homomorphism.

Theorem .. Let T ∈ Hom(V, W), where V and W are finite-dimensional. Then rank(T) = rank(T∗).


Proof. Since V is finite-dimensional, we have

dim V = rank T + nullity T ⇒ rank T = dim V − nullity T.

By Theorem .a, ker T = (im T∗)⊥, hence

rank T = dim V∗ − dim(im T∗)⊥,

where we also used the fact that dim V∗ = dim V. Note that, since T∗ ∈ Hom(W∗, V∗), im(T∗) is a subspace of V∗. Finally, by Theorem ., dim V∗ = dim(im T∗) + dim(im T∗)⊥, so rank T∗ = dim V∗ − dim(im T∗)⊥ as well, and the two expressions agree: rank T = rank T∗.

Theorem .. Let T ∈ Hom(V, W), where V and W are finite-dimensional with bases α and β, respectively. Denote the corresponding dual bases by α∗ and β∗, respectively. Then,

Γ(β∗, α∗)T∗ = (Γ(α, β)T)t.

That is, the matrix representation of the adjoint T∗ is the transpose of the matrix representation of T.

. Multilinear Functionals and modules

Let V1, . . . , Vn and V be vector spaces over the field F, where n ≥ 1 is finite. A mapping φ : V1 × · · · × Vn → V is called multilinear (of degree n) if it is linear in each component. That is, φ is multilinear if for each i = 1, . . . , n

φ(v1, . . . , avi + bwi, . . . , vn) = aφ(v1, . . . , vi, . . . , vn) + bφ(v1, . . . , wi, . . . , vn),

for all a, b ∈ F, vi, wi ∈ Vi and vj ∈ Vj, j ≠ i. Note that multilinear mappings of degree 1 and 2 are what we usually call linear and bilinear mappings, respectively. So, homomorphisms are a special case of multilinear mappings.

One of the most important multilinear mappings in linear algebra is the determinant of a square matrix. We will spend some time developing the theory of special types of multilinear mappings and modules as we develop the notion of determinants. Another special type of multilinear mapping is one for which V = F in the definition, which we call a multilinear functional.
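The determinant's multilinearity in the rows — the key example mentioned above — is easy to observe numerically (numpy sketch; the fixed rows and coefficients are illustrative assumptions):

```python
import numpy as np

fixed_rows = [np.array([0., 1., 2.]), np.array([1., 0., 1.])]  # rows 2 and 3 held fixed
u, w = np.array([1., 2., 3.]), np.array([4., 5., 6.])
a, b = 2.0, -1.0

def det_with_first_row(r):
    # the determinant as a function of the first row only
    return np.linalg.det(np.vstack([r] + fixed_rows))

# linearity in the first row: det(au + bw, ...) = a det(u, ...) + b det(w, ...)
assert np.isclose(det_with_first_row(a * u + b * w),
                  a * det_with_first_row(u) + b * det_with_first_row(w))
```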

.. Modules

Let R be a commutative ring with identity. A set M is called an R-module (or module over R) if


M1) there is an addition operation (α, β) 7→ α + β from M × M → M, under which M is a commutative group,

M2) there is a multiplication operation (c, α) 7→ cα from R × M → M such that

i) (c1 + c2)α = c1α + c2α,

ii) c(α + β) = cα + cβ,

iii) (c1c2)α = c1(c2α),

iv) 1α = α,

for all c,c1, c2 ∈ R and α,β ∈M.

If R is not commutative, we may still define the structure of a module, but we must distinguish between left R-modules and right R-modules. Note that modules play the role of vector spaces; however, since not every element of R necessarily has a multiplicative inverse, we must be cautious with scalar multiplication.

Example ..

1) The set Rn consisting of n-tuples of ring elements is an R-module. This is one of the most basic, and common, examples; one that we will likely refer to often. As in the case of Fn, the operations on Rn are componentwise addition and scalar multiplication.

2) Likewise, the set Mm,n(R) of m × n matrices with elements in R is an R-module.

3) R, itself, can be regarded as an R-module.
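For instance, Z² is a Z-module under componentwise operations; a quick check of the axioms in M2) (plain-Python sketch; the helper names and sample elements are illustrative assumptions):

```python
# Z^2 as a Z-module: componentwise addition and integer scalar multiplication.
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def smul(c, u):
    return tuple(c * a for a in u)

u, v = (3, -1), (2, 5)
c1, c2 = 4, -2
assert smul(c1 + c2, u) == add(smul(c1, u), smul(c2, u))     # (i)
assert smul(c1, add(u, v)) == add(smul(c1, u), smul(c1, v))  # (ii)
assert smul(c1 * c2, u) == smul(c1, smul(c2, u))             # (iii)
assert smul(1, u) == u                                       # (iv)
```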

Most of the concepts defined for vector spaces can be analogously defined for R-modules, although extra care must be taken. For instance, if a, b ∈ R are nonzero and u, v ∈ M, suppose au + bv = 0, that is to say {u, v} is linearly dependent. As opposed to vector spaces, we cannot necessarily conclude that u is a scalar multiple of v, or vice versa, since a and b may not have a multiplicative inverse. Another important difference between vector spaces and modules is that a module does not necessarily admit a basis.

A basis for an R-module M is a linearly independent subset of M which spans (generates) M. An R-module M is called free if either M = {0} or M has a basis. If M has a finite basis with n elements, M is called a free R-module with n generators. Furthermore, we say a module M is finitely generated if it contains a finite subset which spans M. The rank of a finitely generated module is the smallest integer k such that some k elements span M. Note that a module may be finitely generated without having a finite basis!

A submodule of an R-module M is a nonempty subset S of M that is an R-module under the same operations.


Theorem .. A nonempty subset S of an R-module M is a submodule if and only if

c1, c2 ∈ R, α,β ∈ S ⇒ c1α + c2β ∈ S.

Theorem .. Let R be a commutative ring with identity. If M is a free R-module with n generators, then the rank of M is n.

Analogous to vector spaces, if M,N are R-modules and f : M → N, we call f an R-homomorphism if

f(au + bv) = af(u) + bf(v),

for all a,b ∈ R, u,v ∈ M. Note, however, that the term “linear transformation" is not typically used in this context. The set of all R-homomorphisms from M to N is denoted by HomR(M,N), which is an R-module under addition of functions and scalar multiplication. We will also refer to R-monomorphisms, R-epimorphisms, R-endomorphisms and R-isomorphisms, which are defined analogously. In particular, we define the dual module, M∗, which consists of all R-homomorphisms from M to R; that is, M∗ = HomR(M,R).

.. Multilinear forms

As in the previous section, R denotes a commutative ring with identity. Let M1, . . . ,Mk be R-modules. A mapping L : M1 × · · · ×Mk → R which satisfies

L(v1, . . . , avi + bwi , . . . , vk) = aL(v1, . . . , vi , . . . , vk) + bL(v1, . . . , wi , . . . , vk),

for each i = 1, . . . , k, all a,b ∈ R, vi, wi ∈ Mi and vj ∈ Mj, j ≠ i, is called a multilinear form of degree k or a k-linear form. Note that multilinear forms are analogous to multilinear functionals, the difference being whether we are working over a ring or a field (modules or vector spaces).

Let M be an R-module. For a positive integer k let

Mk = M × · · · × M (k copies) = {(v1, . . . , vk) | vi ∈ M},

that is, the Cartesian product of k copies of M. Of particular interest are multilinear forms mapping Mk to R. We will denote the collection of all multilinear forms on Mk by T k(M). Under pointwise addition and scalar multiplication,

(aL)(v1, . . . , vk) = a(L(v1, . . . , vk)),
(L + L′)(v1, . . . , vk) = L(v1, . . . , vk) + L′(v1, . . . , vk),

T k(M) becomes an R-module. Note that, for k = 1, T 1(M) is the dual module M∗.

Let L be an element of T k(M). We say L is alternating if L(v1, . . . , vk) = 0 whenever vi = vj for some i ≠ j. As a consequence of the definition, we have the following useful fact, which says that if we swap any two arguments, L changes sign.


Lemma .. Let L ∈ T k(M) be alternating. Then

L(v1, . . . , vi , . . . , vj , . . . , vk) = −L(v1, . . . , vj , . . . , vi , . . . , vk)

for any i ≠ j.

Proof. If L is alternating, then

L(v1, . . . , vi + vj , . . . , vi + vj , . . . ,vk) = 0.

However, since L is linear in each component, the left hand side can be expanded to yield

L(v1, . . . , vi , . . . , vi , . . . , vk) +L(v1, . . . , vi , . . . , vj , . . . , vk) +L(v1, . . . , vj , . . . , vi , . . . , vk)

+L(v1, . . . , vj , . . . , vj , . . . , vk) = 0.

Again, by the definition of alternating, the first and last terms on the left hand side are zero, leaving

L(v1, . . . , vi , . . . , vj , . . . , vk) +L(v1, . . . , vj , . . . ,vi , . . . , vk) = 0

=⇒ L(v1, . . . , vi , . . . , vj , . . . , vk) = −L(v1, . . . , vj , . . . , vi , . . . , vk)

Example ..

1) Consider the R-module Mn(R) of n×n matrices with entries in R. We may form many examples of n-linear forms by constructing functions which are linear with respect to the rows of an n×n matrix or the columns of the matrix. We will define n-linear forms on Mn(R) as being linear with respect to the rows. By this, we mean a mapping which associates a scalar to each matrix and is linear in each row when all other rows are held fixed. More specifically, we may construct an n-linear form as follows. Let r1, . . . , rn denote the rows of a matrix A ∈ Mn(R). When constructing n-linear forms of this type, we may write L(A) = L(r1, . . . , rn), so that the n-linear form may be regarded as a function of the rows of A. For example, the determinant defined below gives an n-linear form L : Rn × · · · × Rn → R via L(r1, . . . , rn) = det(A). On Mn(R), an n-linear form L is alternating if L(A) = 0 whenever any two rows of A are identical. By Lemma ., we have L(A′) = −L(A), where A′ is the same matrix as A with two rows swapped.

2) Let j1, . . . , jn be positive integers, 1 ≤ jk ≤ n, and let s be a fixed element of R. For each A ∈ Mn(R), denote the (i, j) entry by A(i, j) and define

L(A) = sA(1, j1) · · ·A(n,jn). (.)

We want to show that L is n-linear with respect to the rows of A. Fix all rows of A except the ith row. Then, we may write

L(ri) = sA(1, j1) · · ·A(n,jn) = bA(i, ji),


where b is the product of s with all factors except the ith. Then, if r′i = (A′(i,1), . . . ,A′(i,n)) is any row vector in Rn, we have

L(cri + r ′i ) = (cA(i, ji) +A′(i, ji))b = cL(ri) +L(r ′i ).

Clearly, this holds for each i, so L is n-linear. For example, let φ(A) = a11 a22 · · · ann be the product of the diagonal entries of A. Then, φ is an n-linear function of the form (.).

.. Formal definition of determinant

Suppose L is an alternating n-linear form on Mn(R). Let A ∈ Mn(R) be a matrix with rows r1, . . . , rn. If we denote the rows of the n×n identity matrix over R by e1, . . . , en, then

ri = ∑_{j=1}^n A(i, j) ej , 1 ≤ i ≤ n.

Since L is n-linear, we have

L(A) = L(r1, . . . , rn) = L(∑_{j=1}^n A(1, j) ej , r2, . . . , rn) = ∑_{j=1}^n A(1, j) L(ej , r2, . . . , rn).

Similarly, since L is linear in each component, we may write

L(A) = L(r1, . . . , rn) = L(∑_{j1=1}^n A(1, j1) e_{j1}, r2, . . . , rn)

= ∑_{j1=1}^n A(1, j1) L(e_{j1}, ∑_{j2=1}^n A(2, j2) e_{j2}, . . . , rn)

= ∑_{j1,j2} A(1, j1) A(2, j2) L(e_{j1}, e_{j2}, ∑_{j3=1}^n A(3, j3) e_{j3}, . . . , rn)

= · · ·

= ∑_{j1,j2,...,jn} A(1, j1) · · · A(n, jn) L(e_{j1}, . . . , e_{jn}), (.)

where the sum in (.) is over all sequences (j1, . . . , jn) of positive integers less than or equal to n. Thus, since L is n-linear, L is a finite sum of functions of the form (.). By


assumption, L is also alternating, thus L(e_{j1}, . . . , e_{jn}) = 0 whenever any two of the indices are equal. To simplify our analysis of (.), we may assume no two indices are equal, since otherwise the corresponding term is 0.

In order to study (.) in more detail, let us digress and take a brief look at permutations. A sequence of integers (j1, . . . , jn), with 1 ≤ jk ≤ n, such that no two of the jk's are equal, is called a permutation of degree n. In general, a permutation of a set A is a function from A to A that is 1-1 and onto (bijective). For our purposes, we will primarily be interested in permutations of the set ∆ = {1, . . . , n}. A bijective map permuting the elements of a set is often denoted by σ. For example, suppose σ(1) = j1, σ(2) = j2, . . . , σ(n) = jn. In this case, the action of σ may be represented by a 2×n array

σ = ( 1  2  · · ·  n
      j1 j2 · · · jn ).

Example .. Suppose n = 6 and let

σ = ( 1 2 3 4 5 6
      3 2 4 6 1 5 ).

Then, σ is a bijection given by σ (1) = 3,σ (2) = 2,σ (3) = 4,σ (4) = 6,σ (5) = 1,σ (6) = 5.

A permutation group of a set A is a set of permutations of A that forms a group under function composition. In particular, we denote the group of permutations of ∆ = {1, . . . , n} by Sn, which is called the symmetric group of degree n. More precisely, the law (σ,τ) 7→ στ mapping Sn × Sn → Sn satisfies

1) σ(τγ) = (στ)γ, for all σ,τ,γ ∈ Sn,

2) there exists an element 1 ∈ Sn such that σ1 = σ = 1σ for all σ ∈ Sn,

3) for every σ ∈ Sn there exists a τ ∈ Sn such that στ = 1 = τσ,

making Sn into a group. The order of a permutation group of a set A is the number of distinct permutations of A. Note that the order of Sn is n!, which is denoted by |Sn| = n!.

A permutation σ ∈ Sn is called an m-cycle if σ cyclically permutes a sequence of m elements, leaving all other elements fixed. That is, for m > 1 and elements i1, . . . , im ∈ ∆, σ(i1) = i2, σ(i2) = i3, . . . , σ(im−1) = im, σ(im) = i1, and σ(k) = k for all k ∈ ∆ \ {i1, . . . , im}. If σ is such an m-cycle, we often use cycle notation to express the action of σ. In array notation we have

σ = ( i1 i2 · · · im−1 im  k
      i2 i3 · · · im   i1  k ),

whereas, in cycle notation we would write

σ = (i1i2 · · · im)(k).

Note that any number not appearing in a cycle is fixed by that cycle.


Example ..

1) The permutation

σ1 = ( 1 2 3 4 5 6
       3 2 5 4 1 6 )

is a 3-cycle, expressed in cycle notation as

σ1 = (135)(2)(4)(6).

However, it is more common to omit the cycles containing a single element, in which case the above permutation is written as

σ1 = (135).

2) The permutation

σ2 = ( 1 2 3 4 5 6
       2 3 6 1 4 5 )

is a 6-cycle, expressed in cycle notation as

σ2 = (123654).

3) Now, consider the permutation

τ = ( 1 2 3 4 5 6
      4 1 2 3 6 5 ).

By definition, τ is not a cycle. However, τ can be expressed as a product of cycles, specifically the product of a 2-cycle and a 4-cycle,

τ = ( 1 2 3 4 5 6     ( 1 2 3 4 5 6
      4 1 2 3 5 6 )     1 2 3 4 6 5 ).

Each of the permutations in the product can be expressed in cycle notation as

(1432)(5)(6) and (1)(2)(3)(4)(56),

but again, it is more common to omit the 1-cycles, expressing the permutation as the product

τ = (1432)(56).

We say two cycles are disjoint if they have no numbers in common. For example, the cycles (135) and (46) are disjoint, while the cycles (135) and (654) are not disjoint. A 2-cycle in Sn is called a transposition.

We now list several standard results about permutation groups, omitting proofs.


Theorem .. Any permutation σ ∈ Sn can be expressed as a product of disjoint cycles.

Theorem .. Every permutation σ ∈ Sn, n > 1, can be expressed as a product of transpositions.

Note that a factorization of a permutation into a product of transpositions is not unique. However, if σ can be factored into an even number of transpositions, then any factorization of σ into transpositions will have an even number of transpositions. Likewise, if σ can be factored into an odd number of transpositions, any such factorization will have an odd number of transpositions. This allows us to define σ ∈ Sn to be even if it can be factored into an even number of transpositions, and odd if σ can be factored into an odd number of transpositions. If σ is even, we define the sign of σ to be 1 and write sgn(σ) = 1, while if σ is odd we define sgn(σ) = −1.

With the tools of basic permutation theory, we may return to our discussion of determinants. We may now express the formula (.) as

L(A) = ∑_{σ∈Sn} A(1,σ(1)) · · · A(n,σ(n)) L(e_{σ(1)}, . . . , e_{σ(n)}).
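The parity bookkeeping above is easy to mechanize. A minimal sketch (the function name and the 0-indexed tuple convention are ours, not from the notes): decompose the permutation into disjoint cycles, use the fact that an m-cycle factors into m − 1 transpositions, and read off the sign.

```python
def sgn(perm):
    """Sign of a permutation of {0, ..., n-1}, given as a tuple of images.

    Walk each disjoint cycle once; an m-cycle is a product of m - 1
    transpositions, so the total transposition count determines the sign.
    """
    seen = set()
    transpositions = 0
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        transpositions += length - 1
    return -1 if transpositions % 2 else 1

# sigma1 = (1 3 5) from the examples, shifted to 0-indexing: 0->2, 2->4, 4->0
print(sgn((2, 1, 4, 3, 0, 5)))  # a 3-cycle = 2 transpositions, so sign is 1
```

Since the sign only depends on the transposition count mod 2, any factorization into transpositions would give the same answer, consistent with the well-definedness fact stated above.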

We are also in position to restate Lemma . in terms of permutations.

Lemma .. If L is an alternating n-linear form on Mn and σ ∈ Sn, then for all (v1, . . . , vn) ∈ Mn,

L(v_{σ(1)}, v_{σ(2)}, . . . , v_{σ(n)}) = sgn(σ) L(v1, v2, . . . , vn).

Proof. Since every σ ∈ Sn is a product of transpositions, we may repeatedly apply Lemma . for each transposition. So, if σ = σ1 · · · σk, where each σj is a transposition, note that sgn(σj) = −1 and sgn(σ) = (−1)^k. For each σj, we apply Lemma ., which results in a sign change each time, hence

L(vσ (1),vσ (2), . . . , vσ (n)) = (−1)kL(v1,v2, . . . , vn).

Finally, we define the determinant of a matrix A ∈Mn(R) as the n-linear form

det(A) = ∑_{σ∈Sn} sgn(σ) A(1,σ(1)) · · · A(n,σ(n)). (.)

This definition is arrived at by seeking a determinant function on Mn(R). A determinant function on Mn(R) is an alternating n-linear form D such that D(I) = 1. The above definition of det is clearly a determinant function; in fact, it is the only one.


Theorem .. Let R be a commutative ring with identity and let n be a positive integer. The function det defined by (.) is the unique alternating n-linear form on Mn(R) such that det(I) = 1. Furthermore, if L is any alternating n-linear form on Mn(R), then for each A ∈ Mn(R),

L(A) = det(A)L(I).

Example .. Let A ∈ M2(R) be a 2×2 matrix. We want to compute the determinant of A using (.). Note that |S2| = 2! = 2, so there are only two permutations of {1,2}: the identity σ1 and the transposition σ2 = (12). Furthermore, sgn(σ1) = 1 and sgn(σ2) = −1, since σ1 is the product of 0 transpositions and σ2 is a single transposition. Using (.), we have

det(A) = ∑_{σ∈S2} sgn(σ) A(1,σ(1)) A(2,σ(2))

= sgn(σ1) A(1,σ1(1)) A(2,σ1(2)) + sgn(σ2) A(1,σ2(1)) A(2,σ2(2))

= A(1,1) A(2,2) − A(1,2) A(2,1).

In more familiar form, if

A = ( a b
      c d ),

this yields det(A) = ad − bc.
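The permutation-sum definition (.) can be checked numerically. A sketch (all names are ours): here the sign of each permutation is computed by counting inversions, which gives the same parity as counting transpositions.

```python
from itertools import permutations
from math import prod

def sgn(p):
    """Sign via inversion count (same parity as the transposition count)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Sum over sigma in S_n of sgn(sigma) * A[0][s(0)] * ... * A[n-1][s(n-1)]."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

print(det([[1, 2], [3, 4]]))  # ad - bc = 1*4 - 2*3 = -2
```

For a 2×2 input this reproduces ad − bc; the same function works for any n, at the cost of n! terms.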

Theorem .. Let A,B ∈Mn(R). Then

det(AB) = det(A)det(B).

Proof. Let B be a fixed n×n matrix over R and define

D(A) = det(AB),

for any A ∈Mn(R). Denoting the rows of A by r1, . . . , rn, we may write

D(A) =D(r1, . . . , rn) = det(r1B, . . . , rnB).

It is straightforward to verify that D is alternating and n-linear, since det is alternating and n-linear. By the previous theorem, we have

D(A) = det(A)D(I).

But, D(I) = det(e1B, . . . , enB) = det(B), hence

det(AB) = det(A)det(B).
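The product rule can be spot-checked numerically. A sketch (names ours) that computes det by cofactor expansion along the first row, which agrees with the permutation-sum definition:

```python
def det(A):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
print(det(matmul(A, B)) == det(A) * det(B))  # True
```

Since the entries are integers, this check also illustrates that the identity holds over the commutative ring Z, not just over a field.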

Exercises - Section .


. Let f ∈ Hom(V,F) be a linear functional with f ≠ 0. Show that V / ker(f) ≅ F.

. If S is a subspace of a vector space V , use the first isomorphism theorem to show:

(a) (V /S)∗ ≅ S⊥

(b) V ∗ ≅ S∗ ⊕ S⊥

. Let T ∈Hom(V ) be an invertible linear operator. Use Theorem . to show that

(T ∗)−1 = (T −1)∗.

. If N1, N2 are submodules of an R-module M, show that N1 ∩ N2 and N1 + N2 are submodules of M.

Tensor products

. Free vector spaces and free modules

We begin our pursuit of constructing the tensor product by discussing how to form a vector space (or R-module) from a given set. Let X be any set. Formally, we construct a vector space over the field F by constructing the direct sum of |X| copies of F. Let F〈X〉 = {f : X → F | f(x) = 0 for all but finitely many x ∈ X}. We make F〈X〉 into a vector space by defining addition and scalar multiplication component-wise; that is, if f, g ∈ F〈X〉 and a ∈ F, then the elements f + g ∈ F〈X〉 and af ∈ F〈X〉 are defined by

(f + g)(x) = f(x) + g(x),
(af)(x) = af(x).

A basis of F〈X〉 is given by the collection {δx | x ∈ X}, where

δx(u) = { 1, x = u
          0, x ≠ u.

To show that {δx | x ∈ X} is linearly independent, suppose

∑_{i=1}^n ai δ_{xi} = 0  =⇒  ∑_{i=1}^n ai δ_{xi}(u) = 0 for all u ∈ X.

In particular, for each xj , j = 1, . . . ,n, we have

∑_{i=1}^n ai δ_{xi}(xj) = 0  =⇒  aj = 0.


Now, to show that {δx | x ∈ X} spans F〈X〉, let f ∈ F〈X〉 be arbitrary. By definition, we know f(x) = 0 for all but finitely many x ∈ X. Suppose f(xi) = ai for i = 1, . . . , n, where x1, . . . , xn is the finite collection of elements on which f is nonzero. We will show that f may be expressed by the sum f = ∑_{i=1}^n ai δ_{xi}. Evaluating this sum at xj , we have

∑_{i=1}^n ai δ_{xi}(xj) = a1 δ_{x1}(xj) + a2 δ_{x2}(xj) + · · · + an δ_{xn}(xj) = aj ,

since δ_{xi}(xj) = 0 for i ≠ j and δ_{xj}(xj) = 1. The vector space F〈X〉 is called the free vector space on X. Although, formally, the basis is given by {δx | x ∈ X}, there is a natural injection map ι : X → F〈X〉 defined by ι(x) = δx. It is straightforward to show that ι is injective (indeed, a bijection onto the basis {δx | x ∈ X}), so we may regard the set X as a basis of F〈X〉. That is, we identify an element x ∈ X with the function that has value 1 on x and 0 on every other element of X. Furthermore, the free vector space F〈X〉 consists of formal, finite linear combinations of elements of X.

Similarly, if M is a set, we may construct the free R-module on M by the same process. We omit the details of this construction, as they are almost identical to the vector space case. Analogous to free vector spaces, we will denote the free R-module on M by R〈M〉.
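Concretely, elements of F〈X〉 can be modeled as finitely supported coefficient maps: the keys are elements of X and the values are the finitely many nonzero coefficients. A minimal sketch (the representation and names are ours):

```python
def delta(x):
    """The basis vector delta_x: coefficient 1 on x, 0 elsewhere."""
    return {x: 1}

def add(f, g):
    """Pointwise sum, dropping zero coefficients to keep the support finite."""
    h = dict(f)
    for x, c in g.items():
        h[x] = h.get(x, 0) + c
    return {x: c for x, c in h.items() if c != 0}

def scale(a, f):
    """Scalar multiple a*f."""
    return {x: a * c for x, c in f.items() if a * c != 0}

# the formal linear combination 3*"apple" - 2*"pear" in F<X>
v = add(scale(3, delta("apple")), scale(-2, delta("pear")))
print(v)  # {'apple': 3, 'pear': -2}
```

Note that X here is a set of strings with no algebraic structure of its own; the vector space operations live entirely in the coefficients, exactly as in the formal construction.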

. Universal problem

Suppose V1, V2, V are all vector spaces over F and let φ : V1 × V2 → V be a bilinear mapping. We may construct other bilinear mappings from φ as follows. Let h : V → W be a homomorphism from V to another vector space W (over F). Then h ∘ φ is a bilinear mapping from V1 × V2 to W.

Exercise .. Verify that h ∘ φ is a bilinear mapping.

We wish to answer the following question.

Problem .. Can we determine a vector space V and bilinear mapping φ (as above) so that, given any bilinear mapping

ψ : V1 × V2 → W,

there exists exactly one homomorphism h ∈ Hom(V,W) such that h ∘ φ = ψ?

Problem . is called the universal problem (for bilinear mappings), a solution of which consists of a vector space and a bilinear mapping, which we denote as a pair (V,φ). What we seek is, in some sense, the most general bilinear mapping on V1 × V2. We will produce a solution to the universal problem by construction. However, we will first show that a solution (V,φ) is unique up to isomorphism.


Lemma .. The solution to the universal problem is unique up to isomorphism.

Proof. Suppose (V,φ) and (V′,φ′) are two solutions to Problem .. Since (V,φ) is a solution, for any ψ : V1 × V2 → W there exists h : V → W such that h ∘ φ = ψ. Setting W = V′ and ψ = φ′, there must exist an h : V → V′ such that

h ∘ φ = φ′. (.)

Likewise, since (V′,φ′) is a solution, for any ψ : V1 × V2 → W there exists h′ : V′ → W such that h′ ∘ φ′ = ψ. Setting W = V and ψ = φ, there must exist an h′ : V′ → V such that

h′ ∘ φ′ = φ. (.)

On one hand, we may substitute (.) into (.), yielding

h ∘ (h′ ∘ φ′) = φ′,

which, by the uniqueness part of the universal property, can only be true if h ∘ h′ = idV′. On the other hand, we may substitute (.) into (.), yielding

h′ ∘ (h ∘ φ) = φ,

which can only be true if h′ ∘ h = idV. Thus, h : V → V′ and h′ : V′ → V are inverse isomorphisms.

We have shown that, if a solution exists, the solution to Problem . is unique. However, it is worth reiterating that the solution depends on V1 × V2. More plainly, we may express this by saying (V,φ) is a solution to the universal problem for V1 × V2, or (V,φ) is universal for V1 × V2.

.. Constructing a solution to the universal problem

Let V1, V2 be two vector spaces over F and consider the free vector space F〈V1 × V2〉 consisting of linear combinations of elements in V1 × V2 with coefficients in F. Note that F〈V1 × V2〉 is typically enormous: its basis is indexed by all pairs of vectors, not just pairs of basis vectors. (It is the quotient constructed below that has dimension n1n2, where dim(V1) = n1 and dim(V2) = n2.) Recall, from Section ., that F〈V1 × V2〉 consists of functions mapping V1 × V2 into F and has a basis consisting of {δ(u,v) | u ∈ V1, v ∈ V2}. Also, recall that there is a natural injection ι : V1 × V2 → F〈V1 × V2〉 defined by ι(u,v) = δ(u,v).

Let S be the subspace of F〈V1 × V2〉 generated by elements of one of the following forms

δ(u1+u2,v) − δ(u1,v) − δ(u2,v) (.)

δ(u,v1+v2) − δ(u,v1) − δ(u,v2) (.)

δ(au,v) − aδ(u,v) (.)

δ(u,av) − aδ(u,v), (.)


where a ∈ F, u, u1, u2 ∈ V1 and v, v1, v2 ∈ V2. By the natural injection, we may identify such elements with elements of the form

(u1 + u2, v) − (u1, v) − (u2, v),
(u, v1 + v2) − (u, v1) − (u, v2),
(au, v) − a(u, v),
(u, av) − a(u, v),

where a ∈ F, u, u1, u2 ∈ V1 and v, v1, v2 ∈ V2. However, despite the notation being more complicated, we will utilize the former notation.

Set V = F〈V1 × V2〉/S, elements of which are cosets of the form δ(u,v) + S. There is a natural mapping φ : V1 × V2 → V defined by

φ(u,v) = δ(u,v) + S, (.)

which can be expressed as the composition φ = πS ∘ ι, where πS : F〈V1 × V2〉 → V is the canonical projection given by πS(δ(u,v)) = δ(u,v) + S, as defined in Section ..

Theorem .. The pair (V ,φ) is a solution to Problem ..

Proof. We first show that φ : V1 ×V2→ V is bilinear. We need to show that

φ(u1 + u2, v) = φ(u1, v) + φ(u2, v), (.)
φ(u, v1 + v2) = φ(u, v1) + φ(u, v2), (.)

and

φ(au, v) = aφ(u, v), (.)
φ(u, av) = aφ(u, v). (.)

By the definition (.) of φ, (.) is equivalent to showing

δ(u1+u2,v) + S = (δ(u1,v) + S) + (δ(u2,v) + S).

However, this is true if and only if

δ(u1+u2,v) − δ(u1,v) − δ(u2,v) ∈ S,

which follows from the definition of S. Similarly, (.) is true if and only if

δ(u,v1+v2) − δ(u,v1) − δ(u,v2) ∈ S.

Relations (.)-(.) follow from the definition of S, as well.


Now, we need to show for any bilinear ψ : V1 × V2 → W there exists a linear h : V → W such that h ∘ φ = ψ. Define a linear transformation T ∈ Hom(F〈V1 × V2〉, W) by T(δ(u,v)) = ψ(u,v). Since the elements δ(u,v) are a basis for F〈V1 × V2〉, this definition of T is well defined. The map ψ is bilinear, so

ψ(u1 + u2, v) = ψ(u1, v) + ψ(u2, v),
ψ(u, v1 + v2) = ψ(u, v1) + ψ(u, v2),
ψ(au, v) = aψ(u, v),
ψ(u, av) = aψ(u, v),

or equivalently,

ψ(u1 + u2, v) − ψ(u1, v) − ψ(u2, v) = 0,
ψ(u, v1 + v2) − ψ(u, v1) − ψ(u, v2) = 0,
ψ(au, v) − aψ(u, v) = 0,
ψ(u, av) − aψ(u, v) = 0.

By the definition of T , these conditions translate to the conditions

T (δ(u1+u2,v))− T (δ(u1,v))− T (δ(u2,v)) = 0,

T (δ(u,v1+v2))− T (δ(u,v1))− T (δ(u,v2)) = 0,

T (δ(au,v))− aT (δ(u,v)) = 0,

T (δ(u,av))− aT (δ(u,v)) = 0.

Since T is linear, we equivalently have

T (δ(u1+u2,v) − δ(u1,v) − δ(u2,v)) = 0,

T (δ(u,v1+v2) − δ(u,v1) − δ(u,v2)) = 0,

T (δ(au,v) − aδ(u,v)) = 0,

T (δ(u,av) − aδ(u,v)) = 0,

which shows that T(S) = {0}, or equivalently, S ⊆ ker(T), since each of the elements in the above conditions is of one of the forms (.)-(.). Thus, the first isomorphism theorem yields a linear transformation h ∈ Hom(V,W) such that h ∘ πS = T, that is, h(πS(δ(u,v))) = T(δ(u,v)). Since φ = πS ∘ ι and by the definition of T, we have h(φ(u,v)) = ψ(u,v) for all (u,v) ∈ V1 × V2. Hence, h ∘ φ = ψ, so the pair (V,φ) is a solution to the universal problem.

Now that we know a solution to the universal problem exists and is unique, we are ready to introduce notation. The vector space V = F〈V1 × V2〉/S is henceforth denoted by V1 ⊗ V2 and called the tensor product of V1 and V2. Elements of V1 ⊗ V2 are cosets of the


form φ(u,v) = δ(u,v) + S, which we henceforth denote as u ⊗ v, so φ(u,v) = u ⊗ v. Furthermore, we call such elements tensors and we call φ the canonical map. Since φ is bilinear, we know

φ(u1 +u2,v) = φ(u1,v) +φ(u2,v),

which implies

(u1 + u2) ⊗ v = u1 ⊗ v + u2 ⊗ v.

Likewise, since

φ(u, v1 + v2) = φ(u, v1) + φ(u, v2),
φ(au, v) = aφ(u, v),
φ(u, av) = aφ(u, v),

we have

u ⊗ (v1 + v2) = u ⊗ v1 +u ⊗ v2,

(au) ⊗ v = a(u ⊗ v),
u ⊗ (av) = a(u ⊗ v).

Tensors of the form u ⊗ v are called pure tensors (or elementary, or decomposable). It may not be apparent, but not every tensor is pure. In general, we can say that any tensor is a finite sum of pure tensors; that is, any element of V1 ⊗ V2 is of the form

∑_{i,j} aij ui ⊗ vj ,

where ui ,vj are basis elements of V1,V2, respectively. This is due to the following result.

Theorem .. Let B1, B2 be bases for V1, V2, respectively. The set {β1 ⊗ β2 | β1 ∈ B1, β2 ∈ B2} is a basis of V1 ⊗ V2.

This theorem can be proven by a straightforward argument; however, the next theorem provides another approach, along with a useful result.

Theorem .. If u1, . . . , un are linearly independent elements of V1 and v1, . . . , vn ∈ V2 are arbitrary, then

∑_i ui ⊗ vi = 0  ⇒  vi = 0 for all i.

Moreover, u ⊗ v = 0 if and only if u = 0 or v = 0.


Proof. Suppose ∑_i ui ⊗ vi = 0. The universal problem states that for any bilinear mapping ψ : V1 × V2 → W there is a linear map h : V1 ⊗ V2 → W such that h ∘ φ = ψ. Thus,

0 = h(∑_i ui ⊗ vi) = ∑_i (h ∘ φ)(ui , vi) = ∑_i ψ(ui , vi).

Since this holds for any bilinear mapping, we may define a particular bilinear mapping that does what we need. Let f ∈ V1∗, g ∈ V2∗ be two linear functionals and define

ψ(u,v) = f(u)g(v).

The reader may check that ψ : V1 × V2 → F is a bilinear mapping. Having assumed that u1, . . . , un are linearly independent, we may consider the corresponding dual functionals uj∗ ∈ V1∗ defined by

uj∗(ui) = δij .

Then, setting f = uj∗, we have ψ(u,v) = uj∗(u)g(v) and

∑_i uj∗(ui) g(vi) = 0,

where g ∈ V2∗ is arbitrary. Thus, by the definition of uj∗, we have g(vj) = 0 for any g ∈ V2∗, hence vj = 0. Repeating the argument for each j yields the desired result.

Corollary .. If dim(V1) = n1 and dim(V2) = n2, then dim(V1 ⊗V2) = n1n2.

For example, using the standard basis {e1, e2} of F2, a basis of F2 ⊗ F2 is {e1 ⊗ e1, e1 ⊗ e2, e2 ⊗ e1, e2 ⊗ e2}.

Now, suppose ϕ ∈ Hom(V, V′) and ρ ∈ Hom(W, W′) are linear transformations. The universal problem allows us to associate linear maps with bilinear maps. Define a mapping from V × W to V′ ⊗ W′ by (v,w) 7→ ϕ(v) ⊗ ρ(w). It is straightforward to check that this map is bilinear, since both ϕ and ρ are linear. For instance, we have

(v1 + v2,w) 7→ ϕ(v1 + v2)⊗ ρ(w),

but since ϕ is linear, we have

ϕ(v1 + v2)⊗ ρ(w) = (ϕ(v1) +ϕ(v2))⊗ ρ(w) = ϕ(v1)⊗ ρ(w) +ϕ(v2)⊗ ρ(w),


by the bilinearity of the canonical map. By the universal property, we may associate to the bilinear map (v,w) 7→ ϕ(v) ⊗ ρ(w) a linear map h ∈ Hom(V ⊗ W, V′ ⊗ W′). Note that the map (v,w) 7→ ϕ(v) ⊗ ρ(w) from V × W to V′ ⊗ W′ is taking the place of ψ in the universal problem. Thus, we have

h(φ(v,w)) = ϕ(v)⊗ ρ(w),

or equivalently,

h(v ⊗ w) = ϕ(v) ⊗ ρ(w).

We denote this particular linear map h by ϕ ⊗ ρ. Note that ϕ ⊗ ρ : V ⊗ W → V′ ⊗ W′ is a linear map. In conclusion, given linear maps ϕ : V → V′ and ρ : W → W′, we may define the tensor product of these maps by the condition

(ϕ ⊗ ρ)(v ⊗ w) = ϕ(v) ⊗ ρ(w).

Example .. Suppose ϕ : F2 → F2 and ρ : F2 → F2 are linear transformations. Let

A = ( a b
      c d )

and

B = ( a′ b′
      c′ d′ )

be the matrices of ϕ and ρ in the standard basis {e1, e2}, respectively. By our previous discussion, we may define a linear map ϕ ⊗ ρ : F2 ⊗ F2 → F2 ⊗ F2. Let's compute the matrix of ϕ ⊗ ρ with respect to the basis {e1 ⊗ e1, e1 ⊗ e2, e2 ⊗ e1, e2 ⊗ e2}. Just as we would for any other linear transformation, we must compute the action of ϕ ⊗ ρ on each basis element. What helps us here is that for any u ∈ F2, ϕ(u) = Au. Likewise, ρ(u) = Bu. Furthermore, by the definition of ϕ ⊗ ρ, we know

ϕ ⊗ ρ(u ⊗ v) = ϕ(u)⊗ ρ(v) = (Au)⊗ (Bv).

Naturally, we will denote the matrix of ϕ ⊗ ρ as A⊗B. Thus, we have

A⊗B(e1 ⊗ e1) = (Ae1) ⊗ (Be1) = (ae1 + ce2) ⊗ (a′e1 + c′e2)
= ae1 ⊗ (a′e1 + c′e2) + ce2 ⊗ (a′e1 + c′e2)
= ae1 ⊗ a′e1 + ae1 ⊗ c′e2 + ce2 ⊗ a′e1 + ce2 ⊗ c′e2
= aa′(e1 ⊗ e1) + ac′(e1 ⊗ e2) + ca′(e2 ⊗ e1) + cc′(e2 ⊗ e2).

This shows that the coordinates of A⊗B(e1 ⊗ e1) in the basis {e1 ⊗ e1, e1 ⊗ e2, e2 ⊗ e1, e2 ⊗ e2} are

( aa′ ac′ ca′ cc′ )^t.


Likewise, we have

A⊗B(e1 ⊗ e2) = (Ae1) ⊗ (Be2) = (ae1 + ce2) ⊗ (b′e1 + d′e2)
= ab′(e1 ⊗ e1) + ad′(e1 ⊗ e2) + cb′(e2 ⊗ e1) + cd′(e2 ⊗ e2),

A⊗B(e2 ⊗ e1) = (Ae2) ⊗ (Be1) = (be1 + de2) ⊗ (a′e1 + c′e2)
= ba′(e1 ⊗ e1) + bc′(e1 ⊗ e2) + da′(e2 ⊗ e1) + dc′(e2 ⊗ e2),

A⊗B(e2 ⊗ e2) = (Ae2) ⊗ (Be2) = (be1 + de2) ⊗ (b′e1 + d′e2)
= bb′(e1 ⊗ e1) + bd′(e1 ⊗ e2) + db′(e2 ⊗ e1) + dd′(e2 ⊗ e2).

Hence, the matrix A⊗B is given by

( aa′ ab′ ba′ bb′
  ac′ ad′ bc′ bd′
  ca′ cb′ da′ db′
  cc′ cd′ dc′ dd′ ).

Upon closer inspection, we see that each 2×2 block of A⊗B is a scalar multiple of

( a′ b′
  c′ d′ ),

so A⊗B can be expressed in block form as

( aB bB
  cB dB ).

The previous example illustrates the Kronecker product. More generally, if ϕ ∈ Hom(Fn, Fm) has matrix representation A ∈ Mm,n(F) and ρ ∈ Hom(Fk, Fl) has matrix representation B ∈ Ml,k(F), then the matrix

A⊗B = ( a11B · · · a1nB
          ⋮           ⋮
        am1B · · · amnB )

is called the Kronecker product of A and B.
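The block structure makes the Kronecker product easy to implement directly; a sketch in plain Python lists (names ours). Entry (i, j) of A⊗B lies in block (i // p, j // q) of the block matrix, at within-block position (i % p, j % q):

```python
def kron(A, B):
    """Kronecker product of an m x n matrix A and a p x q matrix B."""
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q]
             for j in range(n * q)]
            for i in range(m * p)]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
print(kron(A, B))  # [[0, 5, 0, 10], [6, 7, 12, 14], [0, 15, 0, 20], [18, 21, 24, 28]]
```

Reading the output in 2×2 blocks recovers exactly the pattern ( aB bB ; cB dB ) from the example above.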


. Tensor algebras

Let A be a commutative ring with identity and M an A-module. First, define the tensor product of n copies of M, denoted by M⊗n or T n(M), or T nA(M) if the underlying ring needs to be explicitly stated. In these notes, the typical notation will be T n(M). The tensor product T n(M) consists of elements of the form

x1 ⊗ x2 ⊗ · · · ⊗ xn,

where each xi belongs to M. Define a family of A-linear maps mpq : T p(M) ⊗ T q(M) → T p+q(M) by the product

(x1 ⊗ · · · ⊗ xp) · (xp+1 ⊗ · · · ⊗ xp+q) = x1 ⊗ · · · ⊗ xp ⊗ xp+1 ⊗ · · · ⊗ xp+q.

The family of multiplication maps defined by these products is associative. Define the tensor algebra of the module M, denoted by T(M) or TA(M), as the direct sum

T(M) = ⊕_{n≥0} T n(M),

with the corresponding associative product. Given this definition, the tensor algebra of M is a graded A-algebra of type N. For each n ≥ 0, the module T n(M) consists of the homogeneous elements of degree n. Naturally, we have the following correspondence between the tensor powers, the ring A, and the module M: T 0(M) = A, T 1(M) = M. Note that the multiplicative unit, denoted by 1, belongs to T 0(M) = A.

There is a canonical injection, often denoted by φ, φM, or iM, mapping the module M to the tensor algebra T(M). As a consequence of the product on T(M), the elements of T n(M) can be expressed as

x1 ⊗ x2 ⊗ · · · ⊗ xn = φ(x1) ·φ(x2) · · ·φ(xn). (.)

Hence, the elements of T(M) are often expressed as x1 x2 · · · xn, by suppressing the tensor notation (and the dot notation).

The following proposition becomes quite useful when the exterior algebra and the universal enveloping algebra (of a Lie algebra) are introduced.

Proposition .. Suppose E is an A-algebra and f is an A-linear map from M to E. There exists a unique A-algebra homomorphism g : T(M) → E such that f = g ∘ φ.

(Commutative triangle: φ : M → T(M) and g : T(M) → E, with f = g ∘ φ : M → E.)

By the canonical injection φ and the relations (.), if g exists, we have

g(x1 ⊗ · · · ⊗ xn) = f (x1) · · ·f (xn). (.)


Normed Spaces

For vector spaces, and various other spaces, it is quite useful to have some notion of distance. This section is devoted to reviewing the basics about normed spaces. A normed space (or normed linear space) is a vector space V on which there is defined a function ‖ · ‖ : V → R, called the norm, such that the following hold:

a) ‖x‖ ≥ 0 and ‖x‖ = 0 if and only if x = 0

b) ‖αx‖ = |α| ‖x‖

c) ‖x+ y‖ ≤ ‖x‖+ ‖y‖.

In this case, we often write that (V, ‖ · ‖) is a normed space, unless it is apparent which norm corresponds to V.

Note that every normed space is a metric space, with the metric given by d(x,y) = ‖x − y‖.

Example .. 1) We define a norm on the real vector space Rn. Let x = (x1, x2, . . . , xn)^t ∈ Rn; then

‖x‖2 = ( ∑_{i=1}^n |xi|^2 )^{1/2}. (.)

2) More generally, for 1 ≤ p ≤ ∞ we define the p-norms for x ∈ Rn by

‖x‖p = ( ∑_{i=1}^n |xi|^p )^{1/p}, for 1 ≤ p < ∞, (.)

‖x‖∞ = max_{1≤i≤n} |xi|. (.)

As an exercise, you should show that ‖ · ‖p is indeed a norm for p = 1, 2, ∞. Note that when n = 1 all of the p-norms coincide with the absolute value. It can be shown that

‖x‖∞ = lim_{p→∞} ‖x‖p.
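These norms are easy to experiment with numerically. A sketch (function names ours) that also illustrates the limit ‖x‖∞ = lim_{p→∞} ‖x‖p: as p grows, the largest component dominates the sum.

```python
def p_norm(x, p):
    """The p-norm (sum |x_i|^p)^(1/p) for 1 <= p < infinity."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def max_norm(x):
    """The infinity-norm max |x_i|."""
    return max(abs(xi) for xi in x)

x = (3.0, -4.0)
print(p_norm(x, 2))   # Euclidean norm: 5.0
print(p_norm(x, 1))   # 7.0
print(max_norm(x))    # 4.0
print(p_norm(x, 100))  # already very close to the max norm 4.0
```

Note that ‖x‖p is a decreasing function of p for fixed x, so the approach to ‖x‖∞ is from above.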

Example .. For p ∈ [1,∞], we define the sequence spaces ℓp (the little ℓp spaces) as

ℓp = {x = (xn)n≥1 : ‖x‖ℓp < ∞},


where the corresponding p-norms are given by

‖x‖ℓp = ( ∑_{n=1}^∞ |xn|^p )^{1/p} for 1 ≤ p < ∞, (.)

‖x‖ℓ∞ = sup_{n≥1} |xn| for p = ∞. (.)

The ℓp spaces consist of all infinite sequences of complex numbers for which the corresponding norm is finite.

We can define similar norms on function spaces. We must be careful, however, as some of these notions require technical details that we do not have time to cover in this course. If any of these details become necessary, we will cover them as they arise.

Example .. For the space of continuous functions on the closed interval [a,b], denoted C[a,b], we define the maximum norm by

‖f‖∞ = max_{a≤x≤b} |f(x)|, f ∈ C[a,b].

In addition to being able to discuss the relative size of a vector or the distance between two vectors, norms allow us to discuss the convergence of sequences.

Definition .. Let (V, ‖ · ‖) be a normed space. We say that a sequence (xn) in V converges to an element x ∈ V if for every ε > 0 there exists a number N(ε) such that for every n ≥ N(ε) we have ‖xn − x‖ < ε. In this case, we may write lim_{n→∞} xn = x or xn → x.

Definition .. Let ‖ · ‖1 and ‖ · ‖2 be norms on a vector space V. We say that the norms ‖ · ‖1 and ‖ · ‖2 are equivalent if there exist positive constants α, β such that

α‖x‖1 ≤ ‖x‖2 ≤ β‖x‖1

for all x ∈ V .

An important fact, which we may or may not prove depending on time, is the following.

Claim. Let V be a finite-dimensional vector space. Then any two norms ‖ · ‖1, ‖ · ‖2 on V are equivalent.

This statement is not true for an infinite-dimensional space. Norms also allow us to define the concepts of continuity, open/closed sets, extreme values, and a host of other notions we have from calculus/real analysis.
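For a concrete instance of the claim on Rn: the 1-norm and 2-norm satisfy (1/√n)‖x‖1 ≤ ‖x‖2 ≤ ‖x‖1, i.e. the definition holds with α = 1/√n and β = 1 (the first inequality is Cauchy-Schwarz, the second follows by expanding (Σ|xi|)²). A quick pure-Python spot check on random vectors:

```python
import math, random

def norm1(x): return sum(abs(t) for t in x)
def norm2(x): return math.sqrt(sum(t * t for t in x))

random.seed(0)
n = 5
alpha, beta = 1.0 / math.sqrt(n), 1.0   # alpha*||x||_1 <= ||x||_2 <= beta*||x||_1
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    assert alpha * norm1(x) <= norm2(x) + 1e-12
    assert norm2(x) <= beta * norm1(x) + 1e-12
print("equivalence constants verified on 1000 random vectors")
```

Of course, random sampling illustrates the inequality rather than proving it; the proof is the exercise.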


Exercise .. Define a normed space as the vector space V = C[0,1] with the norm defined by

‖v‖p = ( ∫_0^1 |v(x)|^p dx )^{1/p}, 1 ≤ p < ∞,

‖v‖∞ = sup_{0≤x≤1} |v(x)|.

Consider the sequence

un(x) = 1 − nx, for 0 ≤ x ≤ 1/n,
un(x) = 0, for 1/n < x ≤ 1.

Show that

‖un‖p = (n(p + 1))^{-1/p}, 1 ≤ p < ∞,

however

‖un‖∞ = 1.

Thus, the sequence un converges (to the zero function) in the normed spaces (V, ‖ · ‖p) for 1 ≤ p < ∞, but diverges in the normed space (V, ‖ · ‖∞).
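The closed form ‖un‖p = (n(p+1))^{-1/p} can be sanity-checked numerically before proving it, by approximating the integral with a midpoint rule (the helpers u and lp_norm are ours, not from the notes):

```python
def u(n, x):
    # the "hat at zero" sequence from the exercise
    return 1.0 - n * x if x <= 1.0 / n else 0.0

def lp_norm(f, p, samples=100000):
    # midpoint-rule approximation of ( ∫_0^1 |f(x)|^p dx )^(1/p)
    h = 1.0 / samples
    total = sum(abs(f((i + 0.5) * h)) ** p for i in range(samples))
    return (total * h) ** (1.0 / p)

for n in (2, 4, 8):
    for p in (1, 2):
        approx = lp_norm(lambda x: u(n, x), p)
        exact = (n * (p + 1)) ** (-1.0 / p)
        print(n, p, approx, exact)
```

The two columns agree to several digits, and both shrink as n grows, while max |un| stays pinned at 1.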

Exercise .. Show that the following two norms are equivalent on C1[0,1]:

‖f‖a = |f(0)| + ∫_0^1 |f′(x)| dx,

‖f‖b = ∫_0^1 |f(x)| dx + ∫_0^1 |f′(x)| dx.

. Operators on normed spaces

Given two spaces V, W, an operator L from V → W is a mapping which assigns to each element of V a unique element in W. The domain of L, D(L), is the following subset of V

D(L) = { v ∈ V : L(v) exists },

while the range of L is the following subset of W

R(L) = { w ∈ W : w = L(v) for some v ∈ V }.

Another important set related to an operator is the nullspace

N(L) = { v ∈ V : L(v) = 0 }.
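For a finite-dimensional illustration of these sets, take L to be multiplication by a rank-one matrix; the vectors below (our own example, not from the notes) land in N(L), while R(L) is the line spanned by (1, 2)^t:

```python
def apply(A, v):
    # matrix-vector product: the action of the linear operator L
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0]]   # second row is twice the first, so rank(A) = 1

# These two vectors span the (2-dimensional) nullspace: A maps them to zero.
for v in ([2.0, -1.0, 0.0], [3.0, 0.0, -1.0]):
    print(apply(A, v))  # [0.0, 0.0]

# A vector not in N(L); its image lies on the line spanned by (1, 2):
print(apply(A, [1.0, 0.0, 0.0]))  # [1.0, 2.0]
```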

Given two operators L and M from V →W , we define the addition of operators by

(L+M)(v) = L(v) +M(v).


This defines a new operator, L + M, with D(L + M) = D(L) ∩ D(M). Likewise, for α ∈ K we define scalar multiplication of an operator by

(αL)(v) = αL(v).

Again, this defines a new operator αL : V →W with domain D(L).

Definition .. An operator L : V → W is one-to-one (or 1-1) if

L(v1) = L(v2) ⇒ v1 = v2, v1, v2 ∈ D(L).

An operator which is 1-1 is also said to be injective. If R(L) = W, we say L is surjective. Furthermore, if L is both injective and surjective, we say L is bijective.

Example .. Let V = C[0,1] = W be the vector space of continuous functions on [0,1]. Define the differentiation operator by

(d/dx) f = f′, f ∈ C1[0,1].

In this case, we have D(d/dx) = C1[0,1] ⊂ C[0,1]. Is d/dx an injection and/or surjection?

For operators on normed spaces, properties such as continuity and boundedness are defined in terms of vectors and sequences of vectors in the domain of the operator.

Definition .. Let V, W be two normed spaces. An operator L : V → W is continuous at v ∈ D(L) ⊂ V if for every sequence vn ∈ D(L) we have

vn → v in V ⇒ L(vn) → L(v) in W.

We say that L is continuous if it is continuous at every v ∈ D(L). The operator L is said to be bounded if for any γ > 0, there is an R > 0 such that for v ∈ D(L) we have

‖v‖ ≤ γ ⇒ ‖L(v)‖ ≤ R.

Exercise .. Let V = C[0,1] be the space of complex-valued continuous functions defined on the closed interval [0,1]. Define an operator L by

L(v)(t) = v(0) + ∫_0^t v(s) ds.

What is the domain of L? How about the range? Show that the function v(t) = ae^t is a fixed point of L for any a ∈ C. Analogous to functions, a fixed point of an operator L is an element v ∈ D(L) such that L(v) = v.
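Before proving the fixed-point claim, it can be checked numerically: with v(t) = a e^t we have L(v)(t) = a + a(e^t − 1) = a e^t. The sketch below approximates the integral with a midpoint rule (the name L here is our stand-in for the operator, restricted to real a for simplicity):

```python
import math

def L(v, t, samples=20000):
    # L(v)(t) = v(0) + ∫_0^t v(s) ds, approximated by the midpoint rule
    h = t / samples
    return v(0.0) + h * sum(v((i + 0.5) * h) for i in range(samples))

a = 2.5
v = lambda s: a * math.exp(s)
for t in (0.25, 0.5, 1.0):
    print(t, L(v, t), v(t))  # the two values agree: L(v) = v
```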


Definition .. If L : V → W satisfies the following property:

L(αv1 + βv2) = αL(v1) + βL(v2) for all v1, v2 ∈ V, α, β ∈ K,

we say that L is linear. We denote the set of all linear operators mapping V to W by L(V, W), and by L(V) whenever V = W. In this case, we often use the notation Lv in place of L(v).

Proposition .. Suppose L is a linear operator from a normed space V to a normed space W. Then

a) the continuity of L over the whole space V is equivalent to continuity at any one point (usually taken to be v = 0);

b) L is bounded if and only if there exists a constant α ≥ 0 such that

‖Lv‖W ≤ α‖v‖V for all v ∈ V;

c) L is continuous on V if and only if L is bounded on V.

As a consequence of Proposition ., we define a norm on L(V, W) by

‖L‖V,W = sup_{v≠0} ‖Lv‖W / ‖v‖V. (.)

This norm is usually referred to as the operator norm. The following theorem is obtained as a consequence.

Theorem .. The set L(V ,W ) with the norm (.) is a normed space.

Proof. Exercise.

In fact, as we will discover soon, under certain conditions the space L(V, W) is a Banach space. The following property of the operator norm can sometimes be useful:

‖Lv‖W ≤ ‖L‖V,W ‖v‖V for all v ∈ V.
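For a matrix acting on R2 with the 2-norm, the operator norm of a diagonal matrix is just its largest diagonal entry in absolute value, and the supremum in the definition can be probed by sampling random directions (our own toy example):

```python
import math, random

def norm2(x): return math.sqrt(sum(t * t for t in x))

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2.0, 0.0],
     [0.0, 1.0]]   # diagonal matrix: its operator 2-norm is 2

random.seed(1)
best = 0.0
for _ in range(5000):
    v = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    ratio = norm2(apply(A, v)) / norm2(v)
    assert ratio <= 2.0 + 1e-12   # ||Av|| <= ||A|| ||v|| for every v
    best = max(best, ratio)
print(best)  # close to 2, attained in directions near the first axis
```

Sampling only gives lower bounds on the supremum; here the exact value 2 is known, so the two bracket the operator norm.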

Example .. For a normed space V, the identity operator I : V → V, defined by

Iv = v,

is an element of L(V) with ‖I‖ = 1.