Chapter 1
Vector Spaces
Let C be the field of complex numbers, R the field of real numbers, and Q the field of rational numbers.
For the rest of this discussion, we will use the symbol F to denote any of these fields.
1.1 Vector Spaces
1.1.1 Definition
Let V be a nonempty set whose elements are called vectors. Suppose that a law of composition denoted by "+" is defined on V (i.e. u + v ∈ V whenever u, v ∈ V). Let F denote a field of scalars. Suppose that with each α ∈ F and each v ∈ V there is associated a scalar multiple αv, where αv ∈ V. Then V is said to be a vector space over F if:
1. u+ (v + w) = (u+ v) + w for all u, v, w ∈ V.
2. u+ v = v + u for all u, v ∈ V.
3. There is a vector called the zero vector, 0v, such that u + 0v = 0v + u = u for all u ∈ V.
4. Given u ∈ V, there is −u ∈ V such that u+ (−u) = (−u) + u = 0v.
These are the axioms of an Abelian group.
5. α(u+ v) = αu+ αv for all α ∈ F and u, v ∈ V.
6. (α + β)u = αu+ βu for all α, β ∈ F and u ∈ V.
7. α(βu) = (αβ)u for all α, β ∈ F and u ∈ V.
8. 1F·u = u for all u ∈ V, where 1F is the multiplicative identity of F.
1.1.2 Examples
1. The Euclidean vector space Rn = {(a1, ..., an) : ai ∈ R}, where addition and scalar multiplication are defined by:
(a1, ..., an) + (b1, ..., bn) = (a1 + b1, ..., an + bn) and α(a1, ..., an) = (αa1, ..., αan).
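As a quick illustration (an added sketch, not part of the original notes), these componentwise operations can be written directly in Python, with tuples standing in for vectors in Rn:

```python
# Componentwise addition and scalar multiplication in R^n,
# modelling vectors as Python tuples (illustrative sketch only).
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    return tuple(alpha * a for a in u)

u, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
print(add(u, v))      # (5.0, 7.0, 9.0)
print(scale(2.0, u))  # (2.0, 4.0, 6.0)
```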
2. The vector space C[a, b] of all continuous functions on [a, b], under addition and scalar multiplication given by:
For f, g ∈ C[a, b] and α ∈ R, f + g and αf are given by the rules (f + g)(x) = f(x) + g(x) and (αf)(x) = α(f(x)).
3. Example 1 can be generalized to Fn in the same manner.
4. The vector space Mmn(F) of all m × n matrices over the field F under the usual addition and scalar multiplication of matrices.
5. Let F[x] denote the set of all polynomials in x with coefficients in F, i.e. the set of all

a0 + a1x + a2x^2 + ... + akx^k + ...

where at most a finite number of the coefficients are nonzero. Then F[x] is a vector space under the usual addition of polynomials and multiplication by scalars.
6. Let C1[a, b] denote the set of all real-valued functions f defined on the closed interval [a, b] which have the properties:
(a) f is differentiable at each point of [a, b] and
(b) f ′ is continuous on [a, b].
Then C1[a, b] is a vector space over R. In general, Cn[a, b] is the vector space of all functions f such that
(a) f is n-times differentiable at each point of [a, b] and
(b) f^(n) is continuous on [a, b].
Note that C[a, b] ⊇ C1[a, b] ⊇ C2[a, b] ⊇ ....
1.1.3 Theorem
Let V be a vector space over F.
1. α0v = 0v for any α ∈ F.
2. 0Fu = 0v for all u ∈ V.
3. α(u1 + u2 + ...+ un) = αu1 + αu2 + ...+ αun where α ∈ F and ui ∈ V.
4. (α1 + α2 + ...+ αn)u = α1u+ α2u+ ...+ αnu where αi ∈ F and u ∈ V.
5. −(−u) = u for all u ∈ V.
6. (−1)u = −u for all u ∈ V.
1.2 Subspaces
1.2.1 Definition
Let V be a vector space over F and M ⊆ V. Then M is called a subspace of V if M is a vector spaceover F using the same addition and scalar multiplication as in V.
Theorem 1.2.1 (Subspace criterion) A nonempty subset M ⊆ V is a subspace of V iff
1. m1 +m2 ∈M for all m1,m2 ∈M. (M is closed under addition)
2. αm ∈M whenever m ∈M and α ∈ F. (M is closed under scalar multiplication)
Note that the zero vector of M is 0v, the same as the zero vector of V.
1.2.2 Examples
1. Let M = {(a, a) : a ∈ R}. Then M is a subspace of R2.
2. {0v} and V are subspaces of any vector space V.
3. C1[a, b] is a subspace of C[a, b].
4. M = {(x, y, z) ∈ R3 : x+ y = z} is a subspace of R3.
5. M = {(x, y, z) ∈ R3 : y + z = 1} is not a subspace of R3. Note that 0v = (0, 0, 0) /∈ M.
Note that conditions (1), (2) of the subspace criterion can be replaced by a single condition:
αu+ βv ∈M
whenever u, v ∈M and α, β ∈ F.
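As a numerical sanity check (an added sketch with sample vectors chosen for illustration, not a proof), the single condition αu + βv ∈ M can be tested for the subspace M = {(x, y, z) ∈ R3 : x + y = z} from Example 4:

```python
# Check the closure condition alpha*u + beta*v in M for
# M = {(x, y, z) in R^3 : x + y = z}; illustrative sketch only.
def in_M(w):
    x, y, z = w
    return x + y == z

def comb(alpha, u, beta, v):
    return tuple(alpha * a + beta * b for a, b in zip(u, v))

u, v = (1, 2, 3), (4, -1, 3)          # both satisfy x + y = z
assert in_M(u) and in_M(v)
for alpha, beta in [(2, 3), (-1, 5), (0, 0)]:
    assert in_M(comb(alpha, u, beta, v))
print("closure holds for the sampled combinations")
```

Of course, a finite sample only illustrates the criterion; the proof is the one-line algebraic check that (x1 + x2) + (y1 + y2) = z1 + z2.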
1.2.3 Theorem
The intersection of any family of subspaces of a vector space V is a subspace of V. (That is, if (Mi)i∈I is a family of subspaces of V, then ∩i∈I Mi is a subspace of V.)
Note: The union of subspaces is not necessarily a subspace. It is easy to see that M1 = {(a, 0) : a ∈ R} and M2 = {(0, b) : b ∈ R} are two subspaces of R2,
but M1 ∪ M2 is not a subspace since (1, 0), (0, 1) ∈ M1 ∪ M2 but (1, 0) + (0, 1) = (1, 1) /∈ M1 ∪ M2.
1.2.4 Definition
Let v1, v2, ..., vk be a finite set of vectors in a vector space V over F. An element of V of the form α1v1 + ... + αkvk is called a linear combination of the vectors v1, v2, ..., vk.
1.2.5 Theorem
Let V be a vector space over F and v1, v2, ..., vk ∈ V. Let M consist of all linear combinations of v1, v2, ..., vk, i.e. M = {α1v1 + ... + αkvk : αi ∈ F}. Then
1. M is a subspace of V.
2. M is the smallest subspace of V that contains v1, ..., vk.
3. M is the intersection of all subspaces of V that contain v1, ..., vk.
1.2.6 Definition
The subspace M is called the subspace spanned by v1, v2, ..., vk (or the subspace generated by v1, v2, ..., vk), and we write M = sp{v1, v2, ..., vk} or M = [v1, v2, ..., vk].
1.2.7 Examples
1. Let v1 = (1, 2), v2 = (2, 1) ∈ R2. Then
M = [v1, v2] = {a(1, 2) + b(2, 1) : a, b ∈ R} = {(a+ 2b, 2a+ b) : a, b ∈ R} = R2. (Prove it.)
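One way to carry out the requested verification (an added sketch using sympy, not the notes' own argument) is to check that the matrix with columns v1, v2 is invertible, so every vector of R2 is a combination of them:

```python
# v1 = (1, 2), v2 = (2, 1) span R^2 iff the matrix [v1 v2] is invertible.
from sympy import Matrix

M = Matrix([[1, 2],
            [2, 1]])   # columns are v1, v2
print(M.det())   # -3, nonzero, so [v1, v2] = R^2
```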
2. Consider all functions in C1(R) which satisfy the differential equation:

d²y/dx² − 3 dy/dx + 2y = 0

This equation has general solution y = Ae^(2x) + Be^x. This says that the solutions form a subspace of C1(R), namely [e^x, e^(2x)].
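This can be checked symbolically (an added sketch assuming sympy is available): each basis function satisfies the equation, and linearity then gives every combination Ae^(2x) + Be^x.

```python
# Check with sympy that e^x and e^(2x) satisfy y'' - 3y' + 2y = 0,
# so every combination A*e^(2x) + B*e^x is a solution (sketch only).
from sympy import symbols, exp, diff, simplify

x = symbols('x')
for y in (exp(x), exp(2*x)):
    residual = diff(y, x, 2) - 3*diff(y, x) + 2*y
    assert simplify(residual) == 0
print("both basis solutions satisfy the equation")
```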
We are now able to define the subspace generated by a (not necessarily finite) subset of V.
1.2.8 Definition
Let S be a subset of a vector space V over a field F. We define the subspace generated by S, denoted by [S] or sp(S), to be the intersection of all subspaces that contain S.
Let H be the set of all linear combinations of all finite subsets of S. Then H = [S].
1.2.9 Examples
1. Let V be the vector space of all infinite sequences of real numbers, i.e. V = {(a1, a2, ..., ak, ...) : ai ∈ R}. For each i ≥ 1, let ei = (0, 0, 0, ..., 0, 1, 0, ...), which has 1 in the ith place and 0 elsewhere. Then [e1, e2, e3, ...] is the subspace of V that consists of all linear combinations of all finite subsets of {e1, e2, e3, ...}. These are exactly the vectors with at most a finite number of nonzero entries. Thus, [e1, e2, e3, ...] ⊂ V.
2. Let S = ∅. Then ∅ is a subset of every subspace of V. What is [∅]? Let M = [∅]. Since {0v} is a subspace that is contained in every subspace, {0v} ⊆ M. Conversely, {0v} is a subspace that contains ∅, so M ⊆ {0v}. Therefore, [∅] = {0v}.
1.2.10 Definition
Let M1,M2, ...,Mk be subspaces of a vector space V. We define the sum of M1,M2, ...,Mk as
M1 +M2 + ...+Mk = {m1 +m2 + ...+mk : mi ∈Mi for all i = 1, 2, ..., k}
1.2.11 Theorem
With the above notations, M1 +M2 + ...+Mk is a subspace of V.
1.2.12 Examples
(a) Let M1 = [(1, 1, 0)] = {(a, a, 0) : a ∈ R} and M2 = [(1, 0, 1)] = {(b, 0, b) : b ∈ R}. Then M1 + M2 = {(a + b, a, b) : a, b ∈ R}. This is the set of all vectors of the form (x, y, z) such that x = y + z, which is the equation of a plane.
(b) Let M1 = [(1, 0, 0)], M2 = [(0, 1, 0)], M3 = [(0, 0, 1)]. Then M1 + M2 + M3 = R3.
NOTES:
(a) Mi ⊆ M1 + M2 + ... + Mk for all i = 1, 2, ..., k, i.e. each Mi is a subspace of the sum.
(b) If m ∈ M1 + M2 + ... + Mk, it need not be true that m ∈ M1 or m ∈ M2 or ... or m ∈ Mk. See the last example.
(c) Let u1, u2, ..., ur and v1, v2, ..., vs be two finite subsets of V. Then
[u1, u2, ..., ur] + [v1, v2, ..., vs] = [u1, u2, ..., ur, v1, v2, ..., vs].
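Note (c) means the sum of two spans is computed by pooling the generators. As an added sketch with sympy (illustrating Example (a) above, where M1 = [(1, 1, 0)] and M2 = [(1, 0, 1)]):

```python
# The sum M1 + M2 of two spans is spanned by the generators of both,
# illustrating [u1,...,ur] + [v1,...,vs] = [u1,...,ur,v1,...,vs].
from sympy import Matrix

M1_gen = Matrix([1, 1, 0])
M2_gen = Matrix([1, 0, 1])
S = Matrix.hstack(M1_gen, M2_gen)   # columns generate M1 + M2
print(S.rank())   # 2: M1 + M2 is a plane in R^3
```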
1.3 Linear Dependence and Independence
1.3.1 Definition
Let v1, v2, ..., vk be a finite set of vectors in a vector space V over F. We say v1, v2, ..., vk are linearly dependent in V if there are scalars α1, α2, ..., αk, not all zero, such that α1v1 + α2v2 + ... + αkvk = 0v.
If v1, v2, ..., vk are not linearly dependent, they are called linearly independent. This meansthat whenever α1v1 + α2v2 + ...+ αkvk = 0v, then α1 = α2 = ... = αk = 0.
1.3.2 Examples
i. e1, e2, e3 are L.I. in R3.
ii. The set of vectors (3, 0, −3), (−1, 1, 2), (4, 2, −2), (2, 1, 1) is linearly dependent since 2(3, 0, −3) + 2(−1, 1, 2) − (4, 2, −2) + 0(2, 1, 1) = (0, 0, 0).
iii. Let v1 = (1, 2, −1, 3), v2 = (2, −2, 1, −1), v3 = (1, 8, −4, 10), v4 = (5, −2, 1, 1). Writing these vectors as the columns of a matrix and row reducing:

[  1  2  1  5 ]           [ 1 0  3  1 ]
[  2 −2  8 −2 ]  −−RREF→  [ 0 1 −1  2 ]
[ −1  1 −4  1 ]           [ 0 0  0  0 ]
[  3 −1 10  1 ]           [ 0 0  0  0 ]

Hence, v3 = 3v1 − v2 and v4 = v1 + 2v2. So, v1 + 2v2 + 0v3 − v4 = 0v. Therefore, v1, v2, v3, v4 are linearly dependent.
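The row reduction in this example can be reproduced with sympy (an added sketch, assuming sympy is available): the RREF of the matrix whose columns are v1, ..., v4 exposes the dependence relations among the columns.

```python
# RREF of the matrix whose columns are v1, v2, v3, v4; the non-pivot
# columns of the RREF give the dependence relations (sketch only).
from sympy import Matrix

A = Matrix([[1, 2, 1, 5],
            [2, -2, 8, -2],
            [-1, 1, -4, 1],
            [3, -1, 10, 1]])
R, pivots = A.rref()
print(R)        # rows (1, 0, 3, 1), (0, 1, -1, 2) and two zero rows
print(pivots)   # (0, 1): only the first two columns are pivot columns
```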
iv. A. Any subset of a linearly independent set is linearly independent.
B. If a subset of v1, v2, ..., vk is linearly dependent, then v1, v2, ..., vk is linearly dependent.
C. Any finite set of vectors that contains the zero vector is linearly dependent.
D. Two vectors v1, v2 are linearly dependent iff one of them is a multiple of the otherone.
Earlier in this chapter, we defined the vector spaces C[a, b] and Cn[a, b]. We extend this definition to any interval (a, b), (a, ∞), [a, ∞), etc. So if I is any interval, we have

C(I) ⊇ C1(I) ⊇ ... ⊇ Cn(I) ⊇ ...

These are real vector spaces with pointwise addition and scalar multiplication. The zero vector is the zero function 0fn, defined by 0fn(x) = 0 for all x ∈ I.
1.3.3 Examples
1. 1, x, x2 in C(R) are L.I.
Let a·1 + bx + cx2 = 0 for all x ∈ R.
x = 0 ⇒ a = 0
x = 1 ⇒ b+ c = 0
x = −1 ⇒ −b+ c = 0
Therefore, a = b = c = 0. Hence 1, x, x2 are L.I.
2. 1, sin2 x, cos 2x are L.D. since 1·1 − 2·sin2 x − 1·cos 2x = 0 for all x ∈ R. Therefore, these functions are L.D. in the space C(R).
1.3.4 Theorem
Let y1(x), y2(x), ..., yn(x) be n functions in Cn−1[a, b]. If there is x0 ∈ [a, b] such that

det [ y1(x0)        y2(x0)        ...  yn(x0)
      y′1(x0)       y′2(x0)       ...  y′n(x0)
      y′′1(x0)      y′′2(x0)      ...  y′′n(x0)
      ...
      y1^(n−1)(x0)  y2^(n−1)(x0)  ...  yn^(n−1)(x0) ]  ≠ 0
Then y1(x), y2(x), ..., yn(x) are L.I. in Cn−1[a, b].
Proof. Let λ1, λ2, ..., λn be scalars such that

λ1y1(x) + λ2y2(x) + ... + λnyn(x) = 0fn

Differentiating this equation n − 1 times gives:

λ1y′1(x) + λ2y′2(x) + ... + λny′n(x) = 0fn
λ1y′′1(x) + λ2y′′2(x) + ... + λny′′n(x) = 0fn
...
λ1y1^(n−1)(x) + λ2y2^(n−1)(x) + ... + λnyn^(n−1)(x) = 0fn
Evaluating these equations at x = x0 gives

λ1y1(x0) + λ2y2(x0) + ... + λnyn(x0) = 0
λ1y′1(x0) + λ2y′2(x0) + ... + λny′n(x0) = 0
λ1y′′1(x0) + λ2y′′2(x0) + ... + λny′′n(x0) = 0
...
λ1y1^(n−1)(x0) + λ2y2^(n−1)(x0) + ... + λnyn^(n−1)(x0) = 0
Write

A = [ y1(x0)        ...  yn(x0)
      ...
      y1^(n−1)(x0)  ...  yn^(n−1)(x0) ]

Thus the last equations can be written as:

A (λ1, λ2, ..., λn)^T = (0, 0, ..., 0)^T

We are given that det(A) ≠ 0, so A−1 exists. Multiplying by A−1 gives

(λ1, λ2, ..., λn)^T = A−1 (0, 0, ..., 0)^T = (0, 0, ..., 0)^T

Hence λ1 = λ2 = ... = λn = 0, and so y1, y2, ..., yn are L.I. □
1.3.5 Definition
Let the situation be as described above. Then
det [ y1(x)        y2(x)        ...  yn(x)
      y′1(x)       y′2(x)       ...  y′n(x)
      y′′1(x)      y′′2(x)      ...  y′′n(x)
      ...
      y1^(n−1)(x)  y2^(n−1)(x)  ...  yn^(n−1)(x) ]

is called the WRONSKIAN of y1, y2, ..., yn at x and is denoted by W(y1, y2, ..., yn)(x).
The theorem states: if we can find x0 ∈ I such that W(y1, y2, ..., yn)(x0) ≠ 0, then y1, y2, ..., yn are L.I. in C(n−1)(I). This is not an "if and only if".
1.3.6 Examples
Decide whether each of the following sets are L.I or L.D in the indicated space.
1. x, sin x in C1(R).
2. ex, xex, x2ex in C2(I) where I is any interval.
3. 1, x, x2, ..., xn−1 in C(n−1)(I) where I is any interval.
4. y1(x) = x3, y2(x) = |x|3 in C[−1, 1]. For these functions, W(y1, y2)(x) = 0 for all x ∈ [−1, 1], but they are L.I. For the Wronskian, consider the three cases x > 0, x = 0, x < 0.
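The Wronskian test in these exercises can be checked symbolically (an added sketch using sympy's `wronskian` helper). For the set in item 2, the Wronskian is nonzero everywhere, so the functions are L.I. on any interval:

```python
# Wronskian of e^x, x e^x, x^2 e^x with sympy; nonzero, so the
# functions are linearly independent (illustrative sketch only).
from sympy import symbols, exp, simplify, wronskian

x = symbols('x')
W = simplify(wronskian([exp(x), x*exp(x), x**2*exp(x)], x))
print(W)   # simplifies to 2*exp(3*x), which is nonzero for every x
```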
1.4 Basis and Dimension
1.4.1 Definition
Let V be a vector space over a field F. The finite set {v1, v2, ..., vn} is said to form a basis for V if
(a) {v1, v2, ..., vn} is a linearly independent set.
(b) sp{v1, v2, ..., vn} = V.
1.4.2 Examples
1. e1, e2, ..., en form a basis for Fn.
2. v1 = (1, 2, 1), v2 = (0, 1, 2), v3 = (2, 1, 1) form a basis for R3.
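Example 2 can be verified with a determinant (an added sketch, assuming sympy): three vectors form a basis for R3 exactly when the matrix having them as rows (or columns) is invertible.

```python
# v1, v2, v3 form a basis for R^3 iff this matrix has nonzero determinant.
from sympy import Matrix

M = Matrix([[1, 2, 1],
            [0, 1, 2],
            [2, 1, 1]])   # rows are v1, v2, v3
print(M.det())   # 5, nonzero, so v1, v2, v3 form a basis
```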
1.4.3 Theorem
Let v1, v2, ..., vn be linearly independent in V. Let v ∈ [v1, v2, ..., vn]. Then v can be written uniquely in the form v = α1v1 + α2v2 + ... + αnvn.
1.4.4 Corollary
Let v1, v2, ..., vn be a basis for V. Then each v ∈ V can be written uniquely as a linear combination of v1, v2, ..., vn.
Let u ∈ V, then there are unique scalars α1, ..., αn such that u = α1v1 + α2v2 + ...+ αnvn. Thescalars α1, ..., αn are called the coordinates of u relative to the basis v1, ..., vn.
Let v1, ..., vn be a basis for the vector space V and let u1, u2, ..., uk be any k vectors in V. Then for each 1 ≤ j ≤ k there are unique scalars aij such that

uj = a1jv1 + a2jv2 + ... + anjvn,  1 ≤ j ≤ k

Put

A = [ a11 ... a1j ... a1k
      a21 ... a2j ... a2k
      ...
      an1 ... anj ... ank ]

so A ∈ Mnk(F).
1.4.5 Theorem
Let the situation be as above. Let λ1, λ2, ..., λk be scalars. Then

λ1u1 + λ2u2 + ... + λkuk = 0v

⇐⇒  λ1 (a11, ..., an1)^T + λ2 (a12, ..., an2)^T + ... + λk (a1k, ..., ank)^T = (0, ..., 0)^T

⇐⇒  A (λ1, ..., λk)^T = (0, ..., 0)^T
This theorem says there is a dependence relation between the vectors u1, u2, ..., uk iff there is exactly the same dependence relation between their columns of coordinates.
1.4.6 Corollary
Let v1, v2, ..., vn be a basis for V and let u1, u2, ..., uk be linearly independent in V. Then k ≤ n.
1.4.7 Corollary
Let V be a vector space over F. Then any two bases for V have the same number of elements.
1.4.8 Definition
If a vector space V has a finite basis, then any two bases will have the same number of elements. This number is defined to be the dimension of V; we denote it by dim(V), and V is said to be finite dimensional.
A vector space which is not finite dimensional is said to be infinite dimensional.
1.4.9 Examples
1. Fn has dimension n. A basis is e1, e2, ..., en.
2. Cn[a, b] is an infinite dimensional space.
Proof. Suppose dim(Cn[a, b]) = k. Then there cannot be a set of k + 1 vectors in it that are L.I. But we saw before that 1, x, x2, ..., xk ∈ Cn[a, b] are indeed L.I., a contradiction.
3. M22(F) has dimension 4.
4. Let M be the subspace of M22(F) consisting of all symmetric 2 × 2 matrices. Then dim(M) = 3. (Try to prove it.)
1.4.10 Lemma
Let u1, u2, ..., uk ∈ V and suppose that for some r with 1 ≤ r ≤ k, ur ∈ [u1, u2, ..., ur−1, ur+1, ..., uk]. Then [u1, u2, ..., ur, ..., uk] = [u1, u2, ..., ur−1, ur+1, ..., uk].
Note: Let V be a finitely generated vector space, say V = [v1, v2, ..., vs] (note that v1, v2, ..., vs are not necessarily L.I.). By repeated application of the last lemma we can reduce the spanning set to a linearly independent spanning set.
1.4.11 Theorem
A linearly independent set of vectors in a finite dimensional vector space V can be extended to abasis for V.
1.4.12 Example
Show that v1 = (1, 1, 0, 2), v2 = (1, 2, 1, 1), v3 = (2, 1, 1, 0) are linearly independent in R4 and extend them to a basis for R4.
Solution: Let A be the matrix whose columns are v1, v2, v3, e1, e2, e3, e4, and row reduce:

A = [ 1 1 2 1 0 0 0           [ 1 0 0 0 −1  1  1
      1 2 1 0 1 0 0    −→ M =   0 1 0 0  2 −2 −1
      0 1 1 0 0 1 0             0 0 1 0 −2  3  1
      2 1 0 0 0 0 1 ]           0 0 0 1  3 −5 −2 ]

The pivots lie in the first four columns, so v1, v2, v3 are L.I. and extend to the basis v1, v2, v3, e1 for R4.
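This example can be reproduced with sympy (an added sketch): row reduce the matrix whose columns are v1, v2, v3 followed by the standard basis, and read off the pivot columns.

```python
# Extend the L.I. set v1, v2, v3 to a basis of R^4 by appending the
# standard basis and taking the pivot columns of the RREF (sketch only).
from sympy import Matrix, eye

v1, v2, v3 = Matrix([1, 1, 0, 2]), Matrix([1, 2, 1, 1]), Matrix([2, 1, 1, 0])
A = Matrix.hstack(v1, v2, v3, eye(4))
R, pivots = A.rref()
print(pivots)   # (0, 1, 2, 3): v1, v2, v3 together with e1 form a basis
```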
Exercises
Set1.
(1) Find a matrix in reduced echelon form which is row equivalent to the matrix
A = [  2 −9 −3 3 −4
      −1  6  2 2  2
       1  3  1 1 −2 ]

Hence, find the complete solution of the linear system

A (x1, x2, x3, x4, x5)^T = (0, 0, 0)^T
(2) Let V denote the subspace of R6 spanned by the vectors
v1 = (1, 1, 2, 1, 1, 1), v2 = (3, 4, 3, 3, 5, 5), v3 = (−2,−3,−1, 0,−6, 3),
v4 = (2, 4,−2, 2, 6, 2), v5 = (3, 2, 9, 1, 3, 1)
Find a subset of this spanning set which forms a basis of V, and express the remaining vectors in the spanning set as linear combinations of your basis vectors.
(3) Determine a basis of R4 which contains the vectors v1 = (1, 2, 3, 4), v2 = (3, 2, 1, 0). Express the standard basis e1, e2, e3, e4 as linear combinations of the vectors in your basis.
Set2.
(1) Determine which of the following subsets of R3 is a subspace of R3 :
1. M1 = {(x, y, z) ∈ R3 : 7x − 2y + z = 0},
2. M2 = {(x, y, z) ∈ R3 : 3x + 2y + z = 10},
3. M3 = {(x, y, z) ∈ R3 : 3x + 2y + z ≥ 0},
4. M4 = {(x, y, z) ∈ R3 : y = x2},
5. M5 = {(x, y, z) ∈ R3 : y2 = x2}.
(2) Determine which of the following subsets of C[−1, 1] is a subspace of C[−1, 1] :
1. N1 = {f ∈ C[−1, 1] : f(0) = −1}
2. N2 = {f ∈ C[−1, 1] : f(−1/2) = 0}
3. N3 = {f ∈ C[−1, 1] : f(1/4) ≤ 0}
4. N4 = {f ∈ C[−1, 1] : ∫ from −1 to 1 of f(t) dt = 0}
(3) Let V be a vector space over R and let W = V × V, the set of all ordered pairs (u, v), where addition and multiplication by complex numbers in W are given by:
(u1, v1) + (u2, v2) = (u1 + u2, v1 + v2),
(a+ bi)(u, v) = (au− bv, bu+ av)
Show that W is a vector space over C.
Set3.
(1) Let u1, u2, ..., ur and v1, v2, ..., vs be two sets of vectors in a vector space V over F. Prove that
[u1, u2, ..., ur] = [v1, v2, ..., vs]
if and only if each ui ∈ [v1, v2, ..., vs] and each vj ∈ [u1, u2, ..., ur]. Show that
1. [(1,−1, 2), (2, 1,−1)] = [(3, 0, 1), (0,−3, 5), (5, 1, 0)] in R3.
2. [sin2 x, cos2 x, sin x cos x] = [1, sin 2x, cos 2x] in C[−π, π].
(2) Let M be a subspace of a vector space V over F and let u, v, w ∈ V.
1. Suppose that u+ v + w = 0v. Prove that [u, v] = [v, w].
2. Suppose that v /∈M but v ∈M + [u]. Show that u ∈M + [v].
(3) Let V be a real vector space of all infinite sequences of real numbers (a1, a2, a3, ...) with the usualcomponentwise addition and multiplication by scalars. For nN, let en = (0, 0, ..., 0, 1, 0, ...),which has 1 in the ith position and 0 every where else, and let f = (1, 1, 1, ..., 1, 1, ...0). PutW = [f, e1, e2, e3, ...].
Determine whether each of the following statements is true or false, giving reasons for youranswer:
1. (0, 0, 0, 1, 1, ..., 1, ...) ∈ W.
2. (1, 0, 1, 0, 1, 0...) ∈ W.
3. (1, 2, 3, 4, ..., n, n + 1, ...) ∈ W.
Set4.
(1) Let a ≤ b < c ≤ d be real numbers and let y1(x), y2(x), ..., yn(x) ∈ C[a, d]
(so that y1(x), y2(x), ..., yn(x) ∈ C[b, c]).
Either prove or give a counterexample to the following statements:
1. If y1(x), y2(x), ..., yn(x) are linearly independent in C[b, c], then y1(x), y2(x), ..., yn(x) are linearly independent in C[a, d].
2. If y1(x), y2(x), ..., yn(x) are linearly independent in C[a, d], then y1(x), y2(x), ..., yn(x) are linearly independent in C[b, c].
(2) Determine whether the following are linearly dependent or linearly independent:
1. log t, t2 log t in C(0,∞).
2. t^α, t^β, t^γ in C(0, ∞) where α, β, γ are distinct real numbers.
3. sin3 t, sin t − (1/3) sin 3t in C(−∞, ∞).
4. e^(ax) sin bx, e^(ax) cos bx (b ≠ 0) in C(R).
5. g(t), tg(t), t2g(t) in C(I) where g is any continuous function with g(t) ≠ 0 for all t ∈ I.
Set5.
(1) 1. Find the dimension of the subspace M1 of C5 given by
M1 = {(a, b, c, d, e) ∈ C5 : a = ib, c+ id− e = 0}.
2. Find the dimension of the subspace M2 of Mnn(F) given by
M2 = {A ∈Mnn(F) : AT = A}.
3. Let Pn+1[x] denote the real vector space of all polynomials with coefficients in R of degree ≤ n. Find dim(Pn+1[x]).
Let n ≥ 2 and let M3 = {p(x) ∈ Pn+1[x] : p(1) = p′(1) = 0}.
Verify that M3 is a subspace of Pn+1[x] and find dimM3.
(2) Verify that 1, sin x, sin 2x, cos x are linearly independent in C[−π, π]. Put M = [1, sin x, sin 2x, cos x]. Find a basis for the subspace N of M that is generated by
u1 = 1− sinx− sin 2x,
u2 = 1− sin 2x− cos x,
u3 = sinx− cos x,
u4 = 2 + sin x,
u5 = 1 + sin x+ sin 2x+ cos x.
Set6.
(1) The subspaces M1,M2 of R4 are given by:
M1 = [(1, 0, 3, −2), (2, −1, 1, 0), (0, 1, 5, −4), (2, −1, 2, −1)]
M2 = [(−1, 1, 0, −1), (0, 1, −1, 2), (−1, 2, −1, 1), (−2, −1, 3, −8)]
1. Find bases for M1,M2,M1 +M2.
2. Calculate dim(M1 ∩M2).
3. Examine your calculations to see if you can find a basis for M1 ∩M2.
4. M1 +M2 = ?
(2) Let L,M,N be finite dimensional subspaces of a vector space V over F. Verify that L ∩M ⊆L ∩ (M +N).
Prove that
dim(L + M + N) = dim L + dim M + dim N
if and only if
L ∩ (M + N) = M ∩ (N + L) = N ∩ (L + M) = {0v}.
Chapter 2
Linear Mappings
2.1 Definition
Let U, V be vector spaces over a field F. Let φ : U → V be a mapping which satisfies:
1. φ(u + v) = φ(u) + φ(v) for all u, v ∈ U.
2. φ(αu) = αφ(u) for all u ∈ U, α ∈ F.
Then φ is called a linear mapping from U to V.
Other names used: linear map, linear transformation, homomorphism.
Note that conditions (1),(2) can be combined to
3. φ(αu + βv) = αφ(u) + βφ(v) for all u, v ∈ U and α, β ∈ F.
2.2 Examples
1. Let A ∈ Mm,n(F) be a fixed matrix, and let φ : Fn → Fm be given by

φ((x1, ..., xn)^T) = A (x1, ..., xn)^T
Then φ is a linear mapping. (Verify)
2. Let φ : R2 → R3 where φ(x, y) = (x, y, 0). Then φ is a linear mapping.
Note that this example is a special case of Example 1, where

A = [ 1 0
      0 1
      0 0 ]

and the vectors in R2 and R3 are written in column form.
3. For n ≥ 1, let φ : Cn(I) → Cn−1(I) be given by φ(f) = df/dx. This is a linear mapping since (f + g)′ = f′ + g′ and (αf)′ = αf′.
2.3 Consequences of the Definition
1. φ(α1v1 + α2v2 + ...+ αnvn) = α1φ(v1) + α2φ(v2) + ...+ αnφ(vn).
Proof: By induction on n.
2. φ(0v) = 0v.
Proof: Note that 0v = 0v + 0v. Applying φ to both sides gives φ(0v) = φ(0v) + φ(0v), and adding −φ(0v) to both sides gives φ(0v) = 0v.
2.4 Definition
Let U, V be vector spaces over a field F and let φ : U → V be a linear mapping. Then
1. The set of vectors {u ∈ U : φ(u) = 0v} is called the kernel of φ and is denoted by ker φ; it is a subset of U.
2. The set of vectors {v ∈ V : v = φ(u) for some u ∈ U} is called the image of φ and is denoted by Im(φ); it is a subset of V.
2.5 Theorem
Let U, V be vector spaces over a field F. Let φ : U → V be a linear mapping. Then ker φ is a subspace of U and Im(φ) is a subspace of V.
Proof. Exercise
2.6 Example
φ : R3 → R4 where φ(x, y, z) = (x, x, y − z, x). It is easy to show that φ is linear, with

(x, y, z)^T ↦ [ 1 0  0
                1 0  0
                0 1 −1
                1 0  0 ] (x, y, z)^T

ker φ:
φ(x, y, z) = (0, 0, 0, 0) iff (x, x, y − z, x) = (0, 0, 0, 0) iff x = 0 and y − z = 0, that is x = 0, y = z. A typical element of ker φ is (0, y, y) = y(0, 1, 1). Thus ker φ = [(0, 1, 1)] and dim(ker φ) = 1.

Im(φ):
φ(x, y, z) = (x, x, y − z, x) = x(1, 1, 0, 1) + (y − z)(0, 0, 1, 0). Hence Im(φ) = [(1, 1, 0, 1), (0, 0, 1, 0)], and these two vectors form a basis for Im(φ). So dim(Im φ) = 2.
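The kernel and image of this map are the nullspace and column space of its matrix, which sympy can compute directly (an added sketch, assuming sympy is available):

```python
# Kernel and image of phi(x, y, z) = (x, x, y - z, x) via its matrix.
from sympy import Matrix

A = Matrix([[1, 0, 0],
            [1, 0, 0],
            [0, 1, -1],
            [1, 0, 0]])
null = A.nullspace()      # basis of ker(phi)
col = A.columnspace()     # basis of Im(phi)
print(null[0].T)          # the kernel basis vector (0, 1, 1)
print(len(null), len(col))   # 1 2, and 1 + 2 = 3 = dim(R^3) (rank-nullity)
```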
2.7 Theorem
Let U, V be vector spaces over a field F with U finite dimensional, and let φ : U → V be a linear mapping. Then ker(φ) and Im(φ) are finite dimensional and
dim(kerφ) + dim(Im φ) = dim(U).
Proof. ker φ is a subspace of the finite dimensional vector space U and so is finite dimensional. Let u1, u2, ..., un be a basis for U and let v ∈ Im φ. Then there exists u ∈ U such that u = α1u1 + ... + αnun for some scalars α1, ..., αn, and v = φ(u) = φ(α1u1 + ... + αnun) = α1φ(u1) + ... + αnφ(un). So v ∈ [φ(u1), ..., φ(un)]. Hence, Im(φ) ⊆ [φ(u1), ..., φ(un)]. Therefore, Im(φ) is finite dimensional.
Let v1, ..., vk be a basis for ker(φ). Then v1, ..., vk is a linearly independent set of vectors in U and so can be extended to a basis for U, say v1, ..., vk, vk+1, ..., vn. This says that dim ker φ = k and dim U = n, so we have to show that dim Im φ = n − k.
Exactly as in the first part of the proof, φ(v1), φ(v2), ..., φ(vn) span Im φ. However, v1, ..., vk span ker φ, so φ(v1) = ... = φ(vk) = 0v. Hence, φ(vk+1), ..., φ(vn) span Im φ. Let αk+1, ..., αn be scalars such that αk+1φ(vk+1) + ... + αnφ(vn) = 0v. That is, φ(αk+1vk+1 + ... + αnvn) = 0v. Thus, αk+1vk+1 + ... + αnvn ∈ ker(φ). Since v1, ..., vk is a basis for ker φ, αk+1vk+1 + ... + αnvn = β1v1 + ... + βkvk for some scalars β1, ..., βk. Since v1, v2, ..., vn are L.I., αk+1 = ... = αn = 0 = β1 = ... = βk. Therefore φ(vk+1), ..., φ(vn) are L.I. and so form a basis for Im(φ). Hence dim Im(φ) = n − k. □
The proof shows that if u1, u2, ..., un is a basis for U, then Im φ is spanned by φ(u1), ..., φ(un); i.e. for any φ(u) ∈ Im φ, φ(u) = φ(α1u1 + ... + αnun) = α1φ(u1) + ... + αnφ(un).
Therefore, once the images φ(u1), ..., φ(un) of the basis vectors are known, φ is completely determined on U.
2.8 Theorem
Let U, V be vector spaces over a field F with dim(U) = n. Let u1, ..., un be a basis for U and v1, ..., vn be any n vectors in V. Then there is a unique linear mapping φ : U → V such that φ(ui) = vi for all 1 ≤ i ≤ n.
Proof. Let u ∈ U; then there are unique scalars α1, ..., αn such that u = α1u1 + ... + αnun. Define φ : U → V by
φ(u) = φ(α1u1 + ...+ αnun) = α1v1 + ...+ αnvn
This is a well defined mapping since the scalars α1, ..., αn are unique. It is easy to check that φ is indeed a linear mapping (left as an exercise). Note that for each 1 ≤ i ≤ n, ui = 0u1 + ... + 0ui−1 + 1·ui + 0ui+1 + ... + 0un, so φ(ui) = vi. Suppose that ψ : U → V is a linear mapping such that ψ(ui) = vi for all 1 ≤ i ≤ n. Then for any u ∈ U, if
u = α1u1 + ...+ αnun, then
φ(u) = α1v1 + ...+ αnvn
= α1ψ(u1) + ...+ αnψ(un)
= ψ(α1u1 + ...+ αnun)
= ψ(u)
Hence, ψ(u) = φ(u) for all u ∈ U. Hence ψ = φ. �
2.9 Example
Let e1, e2, e3 be the standard basis for R3 and let v1 = (1, 1), v2 = (1, 1), v3 = (1, 2). Let φ be the linear mapping with φ(e1) = v1, φ(e2) = v2, φ(e3) = v3. What is φ(x, y, z)?
2.10 Sums, Scalar multiples and Composite:
1. Let U, V be vector spaces over a field F, and φ1, φ2 : U → V be linear maps. Then φ1 + φ2 :U → V is a linear map where (φ1 + φ2)(u) = φ1(u) + φ2(u).
2. Let U, V be vector spaces over a field F, let φ : U → V be a linear map, and let α ∈ F. Then αφ : U → V, where (αφ)(u) = α(φ(u)), is a linear map.
3. Let U, V,W be vector spaces over a field F and ψ : U → V, φ : V → W be linear maps. Thenφ ◦ ψ : U → W is a linear map.
2.11 Example
Let φ : R2 → R3 and T : R3 → R4 be given by φ(x, y) = (x, x, y) and T(x, y, z) = (x, x, y − z, x). Find a formula for (T ◦ φ)(x, y).
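This composite can be computed via matrices (an added sketch, assuming sympy): composition of linear maps corresponds to multiplication of their matrices, in the order B·A.

```python
# Matrices of phi(x,y) = (x, x, y) and T(x,y,z) = (x, x, y - z, x);
# the product B*A is the matrix of T∘phi (illustrative sketch only).
from sympy import Matrix

A = Matrix([[1, 0], [1, 0], [0, 1]])               # matrix of phi
B = Matrix([[1, 0, 0], [1, 0, 0], [0, 1, -1], [1, 0, 0]])   # matrix of T
print(B * A)   # so (T∘phi)(x, y) = (x, x, x - y, x)
```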
Isomorphisms
In this section we discuss the problem of when two vector spaces are "essentially the same". For example, compare R4 with M22(R):

(a, b, c, d)                  ↔  [ a b
                                   c d ]

(x, y, z, w)                  ↔  [ x y
                                   z w ]

(a + x, b + y, c + z, d + w)  ↔  [ a+x b+y
                                   c+z d+w ]

so addition corresponds to addition, and similarly for multiplication by a scalar.
Let U, V be vector spaces over a field F, and let φ : U → V be a linear map. Suppose that φ is a bijection (1-1 and onto). Then there is the inverse map φ−1 : V → U, given by φ−1(v) = u iff u is the unique element of U such that φ(u) = v. Also φ−1 ◦ φ = IU and φ ◦ φ−1 = IV.
2.12 Theorem
Let φ : U → V be a bijective linear mapping. Then φ−1 : V → U is also a linear mapping.
Proof: Exercise. □
2.13 Definition
A linear mapping φ : U → V which is also a bijection is called an isomorphism, and we say U is isomorphic to V. If this is the case, then φ−1 : V → U is also an isomorphism and V is isomorphic to U. Hence, we can say that U and V are isomorphic if we can set up a bijective linear mapping between them.
2.14 Example
The two vector spaces R4 and M22(R) are isomorphic via the map

(a, b, c, d) ↦ [ a b
                 c d ].
Note that φ is a bijection iff φ is onto and 1-1; that is equivalent to Im(φ) = V and ker(φ) = {0v}, respectively.
2.15 Theorem
φ is injective iff ker(φ) = {0v}.
2.16 Theorem
Let U, V be finite dimensional vector spaces over F. Then U, V are isomorphic iff dim U = dim V.
Proof. Suppose U, V are isomorphic. Then there is a bijective linear map φ : U → V. Since φ is surjective and injective, Im(φ) = V and ker(φ) = {0v} respectively. By Theorem 2.7,

dim U = dim ker φ + dim Im φ = 0 + dim V = dim V
Conversely, suppose dim U = dim V = n, say. Let u1, u2, ..., un and v1, v2, ..., vn be bases for U and V respectively. By Theorem 2.8 there is a unique linear mapping φ : U → V such that φ(ui) = vi for all 1 ≤ i ≤ n. Let u ∈ U; then u = α1u1 + ... + αnun. Suppose u ∈ ker φ. Then
0v = φ(u)
= φ(α1u1 + ...+ αnun)
= α1φ(u1) + ...+ αnφ(un)
= α1v1 + ...+ αnvn
Since v1, ..., vn are linearly independent, we must have α1 = ... = αn = 0. Thus ker φ = {0v} and so φ is 1-1. Again by Theorem 2.7, we have dim(Im φ) = n = dim V. Since Im(φ) is a subspace of V, Im(φ) = V, and so φ is surjective and hence an isomorphism. □
2.17 Corollary
If U is a vector space over F and dimU = n, then U is isomorphic to Fn.
2.18 Example
Let U = M22(R), V = R4. Then U, V are isomorphic since both of them have dimension 4.
2.19 Theorem
Isomorphism is an equivalence relation on the collection of all vector spaces over a field F.
Linear Mappings and Matrices
Recall that any linear mapping φ : Fn → Fm can be represented by an m × n matrix.
Now let U, V be finite dimensional vector spaces over F, say dim U = n and dim V = m. Let φ : U → V be a linear mapping, and let B = {u1, ..., un} and B′ = {v1, ..., vm} be ordered bases for U and V respectively. Then there are unique scalars aij such that
φ(u1) = a11v1 + a21v2 + ... + am1vm
φ(u2) = a12v1 + a22v2 + ... + am2vm
...
φ(un) = a1nv1 + a2nv2 + ... + amnvm
Put

A = [ a11 a12 ... a1n
      a21 a22 ... a2n
      ...
      am1 am2 ... amn ] ∈ Mmn(F)
A is called the matrix of φ relative to the bases B for U and B′ for V.
Conversely, let B = (bij) ∈ Mmn(F). Then there is a unique linear mapping ψ : U → V such that ψ(ui) = b1iv1 + b2iv2 + ... + bmivm (1 ≤ i ≤ n), and the matrix of ψ relative to the bases {u1, ..., un}, {v1, ..., vm} is B.
Hence, relative to fixed bases there is a one-to-one correspondence between the set of all m × n matrices and the set of all linear maps from U to V.
Suppose φ : U → V is a linear map and suppose that A = (aij) is the matrix of φ relative to the bases B = {u1, ..., un} for U and B′ = {v1, ..., vm} for V. Let u ∈ U. Then there are unique scalars α1, ..., αn such that u = α1u1 + ... + αnun. Hence the coordinate vector of u relative to the basis B is [u]B = (α1, ..., αn)^T. Then
φ(u) = φ(α1u1 + ...+ αnun)
= α1φ(u1) + ...+ αnφ(un)
= α1(a11v1 + ...+ am1vm)
+ α2(a12v1 + ...+ am2vm)
+ ....
+ αn(a1nv1 + ...+ amnvm)
= (a11α1 + a12α2 + ...+ a1nαn)v1
+ (a21α1 + a22α2 + ...+ a2nαn)v2
+ ...
+ (am1α1 + am2α2 + ...+ amnαn)vm
Hence, the column of coordinates of φ(u) in terms of v1, ..., vm is

[ a11α1 + a12α2 + ... + a1nαn
  a21α1 + a22α2 + ... + a2nαn
  ...
  am1α1 + am2α2 + ... + amnαn ]  =  A (α1, α2, ..., αn)^T

That is, [φ(u)]B′ = A[u]B, where A = [φ]BB′ is the matrix of φ (or representing φ) relative to the ordered bases B for U and B′ for V.
2.20 Example
Let φ : R3 → R4 be given by φ(x, y, z) = (x + y + z, x − y, y, y − z). This is a linear mapping. Consider the standard bases B = {e1, e2, e3} for R3 and B′ = {f1, f2, f3, f4} for R4. Then
φ(e1) = (1, 1, 0, 0) = f1 + f2
φ(e2) = (1,−1, 1, 1) = f1 − f2 + f3 + f4
φ(e3) = (1, 0, 0,−1) = f1 − f4
Hence,

A = [ 1  1  1
      1 −1  0
      0  1  0
      0  1 −1 ]

This linear mapping is

(x, y, z)^T ↦ A (x, y, z)^T
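As a quick check (an added sketch, assuming sympy), applying the matrix A to a symbolic column vector reproduces the formula for φ:

```python
# Verify that A*(x, y, z)^T reproduces phi(x, y, z) = (x+y+z, x-y, y, y-z).
from sympy import Matrix, symbols

x, y, z = symbols('x y z')
A = Matrix([[1, 1, 1],
            [1, -1, 0],
            [0, 1, 0],
            [0, 1, -1]])
print(A * Matrix([x, y, z]))   # column (x + y + z, x - y, y, y - z)
```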
2.21 Theorem
Let U, V, W be vector spaces over F of dimensions n, m, p respectively, and let u1, ..., un, v1, ..., vm, w1, ..., wp be bases for U, V, W respectively. Let φ : U → V and ψ : V → W be linear maps represented by the matrices A, B relative to these bases. Then the composite ψ ◦ φ : U → W is represented by BA relative to the bases u1, ..., un for U and w1, ..., wp for W.
Proof. Write A = (aij), B = (bij). Then φ(ui) = ∑_{h=1}^{m} ahi vh and ψ(vh) = ∑_{k=1}^{p} bkh wk. Then

(ψ ◦ φ)(ui) = ψ(φ(ui)) = ψ( ∑_{h=1}^{m} ahi vh )
            = ∑_{h=1}^{m} ahi ψ(vh) = ∑_{h=1}^{m} ahi ( ∑_{k=1}^{p} bkh wk )
            = ∑_{k=1}^{p} ( ∑_{h=1}^{m} bkh ahi ) wk = ∑_{k=1}^{p} (BA)ki wk

□
2.22 Example
Let U be a vector space over F and u1, ..., un a basis for U. Then the identity linear map IU : U → U is represented by the identity matrix In.
2.23 Theorem
Let U, V be n−dimensional vector spaces over F and suppose φ : U → V is represented by a matrix Arelative to the bases u1, ..., un for U and v1, ..., vn for V. Then φ is an isomorphism iff A is invertible.
Proof. Suppose φ is an isomorphism; then φ−1 exists and φ−1 : V → U is linear. Suppose that φ−1 is represented by the matrix B relative to the given bases. Then φ−1 ◦ φ = IU and BA = In. Also, φ ◦ φ−1 = IV and AB = In. Hence B = A−1.
Conversely, suppose A−1 exists. Let ψ : V → U be the linear mapping which has the matrix A−1 relative to the bases v1, ..., vn and u1, ..., un. (That is, if A−1 = (αij) then ψ(vi) = ∑_{h=1}^{n} αhi uh.) Then φ ◦ ψ is represented by AA−1 = In, so φ ◦ ψ = IV. Similarly, ψ ◦ φ is represented by A−1A = In, so ψ ◦ φ = IU. Therefore ψ = φ−1 and φ is an isomorphism. □
Change of Basis
Let u1, ..., un and v1, ..., vm be bases for U, V respectively, let φ : U → V be a linear mapping, and suppose that φ is represented by A = (aij) relative to these bases. That is, φ(ui) = a1iv1 + ... + amivm = ∑_{h=1}^{m} ahi vh.
Let u′1, ..., u′n be another basis for U and v′1, ..., v′m be another basis for V. What is the matrix of φ relative to these new bases?
First, we need to express φ(u′i) in terms of v′1, ..., v′m. Now u1, ..., un and u′1, ..., u′n are two bases for U and so are connected by an invertible matrix Q = (qij) such that

u′i = q1iu1 + q2iu2 + ... + qniun,  1 ≤ i ≤ n

Thus Q is n × n. Similarly, v1, ..., vm and v′1, ..., v′m are connected by an invertible matrix P = (pij) such that

vh = p1hv′1 + p2hv′2 + ... + pmhv′m,  1 ≤ h ≤ m.
Then

φ(u′i) = φ( ∑_{h=1}^{n} qhi uh ) = ∑_{h=1}^{n} qhi φ(uh)
       = ∑_{h=1}^{n} qhi ( ∑_{k=1}^{m} akh vk ) = ∑_{k=1}^{m} ( ∑_{h=1}^{n} akh qhi ) vk
       = ∑_{k=1}^{m} (AQ)ki vk
       = ∑_{k=1}^{m} (AQ)ki ( ∑_{r=1}^{m} prk v′r ) = ∑_{r=1}^{m} ( ∑_{k=1}^{m} prk (AQ)ki ) v′r
       = ∑_{r=1}^{m} (PAQ)ri v′r

Therefore, the matrix representing φ relative to the new bases u′1, ..., u′n for U and v′1, ..., v′m for V is

B = PAQ.
2.24 Example
Let φ : R3 → R2 be the linear mapping given by φ(x, y, z) = (x − 2y + 3z, 2x + y − z). Then the matrix representing φ relative to the standard bases is

A = ( 1 −2  3 )
    ( 2  1 −1 )

since

φ(e1) = (1, 2) = f1 + 2f2
φ(e2) = (−2, 1) = −2f1 + f2
φ(e3) = (3, −1) = 3f1 − f2

where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) and f1 = (1, 0), f2 = (0, 1).

If u′1 = (1, 3, 2), u′2 = (0, 1, 1), u′3 = (1, 1, 1) is another basis for R3 and v′1 = (1, 1), v′2 = (1, 2) is another basis for R2, find the matrix of φ relative to these new bases. Now

u′1 = (1, 3, 2) = e1 + 3e2 + 2e3
u′2 = (0, 1, 1) = 0e1 + e2 + e3
u′3 = (1, 1, 1) = e1 + e2 + e3

Hence,

Q = ( 1 0 1 )
    ( 3 1 1 )
    ( 2 1 1 )

Also

v′1 = (1, 1) = f1 + f2
v′2 = (1, 2) = f1 + 2f2

and therefore

f1 = 2v′1 − v′2,  f2 = −v′1 + v′2.

Hence,

P = (  2 −1 )
    ( −1  1 )

Therefore the required matrix is

PAQ = (  2 −1 ) ( 1 −2  3 ) ( 1 0 1 )   ( −1  2 2 )
      ( −1  1 ) ( 2  1 −1 ) ( 3 1 1 ) = (  2 −1 0 )
                            ( 2 1 1 )

That means

φ(u′1) = −v′1 + 2v′2
φ(u′2) = 2v′1 − v′2
φ(u′3) = 2v′1
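The computation B = PAQ can be checked numerically; a minimal sketch using plain nested lists (no libraries assumed):

```python
# Verify B = PAQ from the example, with matrices stored as lists of rows.

def matmul(X, Y):
    """Multiply matrices X (p x q) and Y (q x r) given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[2, -1], [-1, 1]]
A = [[1, -2, 3], [2, 1, -1]]
Q = [[1, 0, 1], [3, 1, 1], [2, 1, 1]]

B = matmul(matmul(P, A), Q)
print(B)  # [[-1, 2, 2], [2, -1, 0]]
```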
2.25 Example
Let A be an m × n matrix. Then A defines a linear mapping φ : Fn → Fm by φ(ei) = (a1i, a2i, ..., ami). The matrix of φ relative to the standard bases is A.

A basis for ker φ can be extended to a basis for Fn: say u1, u2, ..., uk, ..., un is a basis for Fn where uk+1, ..., un is a basis for ker φ. Put v1 = φ(u1), ..., vk = φ(uk). Then v1, ..., vk is a basis for Im(φ). This basis can be extended to a basis for Fm, say v1, ..., vk, vk+1, ..., vm. Then

φ(u1) = v1
φ(u2) = v2
...
φ(uk) = vk
φ(uk+1) = 0Fm
...
φ(un) = 0Fm

So, the matrix of φ relative to the bases u1, ..., un and v1, ..., vm is

( Ik 0 )
( 0  0 ) = PAQ.

Thus, given an m × n matrix A, there exist an invertible n × n matrix Q and an invertible m × m matrix P such that PAQ has this block form.

Now, k = dim(Im(φ)). Im(φ) is spanned by φ(e1), φ(e2), ..., φ(en), i.e.

Im(φ) = [(a11, a21, ..., am1), ..., (a1n, a2n, ..., amn)].

Thus Im(φ) is the space spanned by the columns of A. So, dim(Im(φ)) is the maximum number of linearly independent columns of A, and this is the rank of A. Hence, k = rank(A).
Let φ : V → V be a linear mapping. In calculating matrices representing φ, it is customary to use the same basis on both sides. Therefore, relative to the basis B = {v1, ..., vn}, φ is represented by an n × n matrix A = (aij), where

φ(v1) = a11v1 + a21v2 + ... + an1vn
...
φ(vn) = a1nv1 + a2nv2 + ... + annvn

Let B′ = {v′1, ..., v′n} be another basis for V. Then there is an invertible matrix P = (pij) such that

v′i = p1iv1 + p2iv2 + ... + pnivn, 1 ≤ i ≤ n,

so that

vj = q1jv′1 + q2jv′2 + ... + qnjv′n, 1 ≤ j ≤ n,

where Q = (qij) = P−1. Then relative to the basis B′ = {v′1, ..., v′n}, φ is represented by P−1AP. Also, any invertible matrix P will define a change of basis for V. So two n × n matrices A, B represent the same linear mapping (relative to different bases) if and only if there is an invertible matrix P such that B = P−1AP.
2.26 Definition
Let A, B ∈ Mnn(F). Then we say B is similar to A if there is an invertible matrix P such that B = P−1AP.
Therefore two matrices represent the same linear mapping if and only if they are similar.
2.27 Proposition
Let A,B be similar matrices, then
1. det(A) = det(B).
2. Trace(A) = Trace(B).
Proof. Exercise. �

Note that the converse of this proposition is not true. That is, if det(A) = det(B) or Trace(A) = Trace(B), then A, B need not be similar.
2.28 Definition
Let V be an n-dimensional vector space and φ : V → V a linear mapping. Then any two matrices representing φ are similar, and so have the same determinant and trace. Therefore, we define
1. det(φ) = det of any matrix representing φ.
2. Trace (φ) = trace of any matrix representing φ.
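Proposition 2.27 (similar matrices share determinant and trace) can be sanity-checked on a small example; the 2 × 2 matrices below are chosen arbitrarily, and P−1 is computed by hand:

```python
# Check that B = P^{-1} A P has the same determinant and trace as A.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace2(M):
    return M[0][0] + M[1][1]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]          # inverse of P (det P = 1)

B = matmul(matmul(P_inv, A), P)    # B is similar to A
print(det2(A), det2(B))            # -2 -2
print(trace2(A), trace2(B))        # 5 5
```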
Exercises
Set7.
(1) Determine whether each of the following mappings is linear. If they are linear, find bases for the kernel and image.
1. φ1 : R3 → R4 given by φ1(a, b, c) = (a + b, b + c, 2, a − b − c).
2. φ2 : R4 → R3 given by φ2(a, b, c, d) = (a + c + d, a + b + 2d, b − c + d).
(2) Let B ∈ Mnn(F), and let φ : Mnn(F) → Mnn(F) be given by φ(A) = AB − BA for all A ∈ Mnn(F). Verify that φ is a linear mapping.

Find a basis for ker φ when n = 3 and

B = ( 0 1 0 )
    ( 1 0 1 )
    ( 0 1 0 )

Hence calculate dim(Im φ).
(3) Let U, V be vector spaces over F and let φ : U → V be a linear mapping. Either prove or givea counterexample to the following statements:
1. If u1, u2, ..., uk are linearly independent in U, then φ(u1), φ(u2), ..., φ(uk) are linearly independent in V.
2. If φ(u1), φ(u2), ..., φ(uk) are linearly independent in V then u1, u2, ..., uk are linearly inde-pendent in U.
Set8.
(1) Let u1 = (0, 1, 1), u2 = (1, 0, 1), u3 = (1, 1, 0) ∈ R3. Express each of e1, e2, e3 in terms of u1, u2, u3. Deduce that u1, u2, u3 is a basis for R3.
Let φ : R3 → R4 be the unique linear mapping such that
φ(u1) = (0,−1, 2, 2), φ(u2) = (1,−1, 1, 2), φ(u3) = (2,−1, 0, 6).
Find φ(x, y, z).
(2) Let Pn(R) denote the vector space of all polynomials over R of degree < n. Find a basis forthe subspace
M1 = {p(x) ∈ Pn(R) : p′(0) = p(1) = 0}
Find a basis for the subspace M2 of Rn given by
M2 = {(a1, a2, ..., an) ∈ Rn : a1 + an = a2 + an−1 = 0}
Deduce that M1 is isomorphic to M2 and write an explicit isomorphism.
(3) The linear mapping φ : R3 → R2 has matrix

A = (  1 0 2 )
    ( −1 4 3 )

relative to the bases u1 = (1, 0, 0), u2 = (1, 1, 0), u3 = (1, 1, 1) for R3 and v1 = (1, 0), v2 = (1, 1) for R2. Find φ(x, y, z).
Set9.
(1) The linear mapping φ : R3 → R3 is given by
φ(a, b, c) = (a+ 2b+ 2c,−a+ 3b+ 8c, 2a− b+ γc)
Write down the matrix representing φ relative to the standard basis e1, e2, e3 for R3. Deducethat φ is an isomorphism unless γ takes one particular value.
(2) φ : R4 → R3 is represented by the matrix

A = ( 1 2 −1  0 )
    ( 2 3 −1 −1 )
    ( 1 0  1 −2 )

relative to the standard bases for R4 and R3. Find a basis for ker φ and extend this basis to a basis for R4. Hence, find invertible matrices Q (4 × 4) and P (3 × 3) such that

PAQ = ( Ik 0 )
      ( 0  0 )
)(3) Find detφ, trace φ, rank φ for the linear mapping φ : C3 → C3 given by
φ(x, y, z) = (2x+ y − z, x+ 2y + z,−x+ y + 2z)
Chapter 3
Inner Product Spaces
In this chapter we restrict the field of scalars to C or R. The results will be stated in terms of C, which involves complex conjugates. To get the R version, ignore the conjugates.
3.1 Inner Product
3.1.1 Definition
Let V be a vector space over C. Suppose that with each ordered pair of vectors u, v there is associated a scalar denoted by < u, v >. Then < , > is said to be an inner product on V, and V is called an inner product space, if:
1. < u, v + w >=< u, v > + < u,w > for all u, v, w ∈ V.
2. < v, u > = \overline{< u, v >}. (complex conjugate)
3. < αu, v >= α < u, v > where α ∈ C.
4. < u, u >≥ 0 and < u, u >= 0 iff u = 0.
Note that:
• By putting u = v in (2) we get < u, u > = \overline{< u, u >}. Therefore, < u, u > is always a real number.

• < u, αv > = \overline{< αv, u >} = \overline{α < v, u >} = \overline{α} \overline{< v, u >} = \overline{α} < u, v >.

• < u + v, w > = \overline{< w, u + v >} = \overline{< w, u > + < w, v >} = \overline{< w, u >} + \overline{< w, v >} = < u, w > + < v, w >.
3.1.2 Example
The scalar product on R2 and R3 given by < (x1, y1), (x2, y2) > = x1x2 + y1y2 is an inner product. This inner product can be generalized to Rn by

< (a1, a2, ..., an), (b1, b2, ..., bn) > = a1b1 + a2b2 + ... + anbn.

This is called the standard scalar product on Rn. For Cn,

< (a1, a2, ..., an), (b1, b2, ..., bn) > = a1\overline{b1} + a2\overline{b2} + ... + an\overline{bn}.

This is called the standard scalar product on Cn.
3.1.3 Example
The form < , > on R2 given by < (x1, y1), (x2, y2) > = x1x2 + x1y2 + x2y1 + 2y1y2 is an inner product. Checking (1), (2), (3) is straightforward. For (4), by completing the square,

< (x1, y1), (x1, y1) > = x1^2 + 2x1y1 + 2y1^2 = (x1 + y1)^2 + y1^2 ≥ 0,

and this is 0 iff x1 + y1 = 0 and y1 = 0, that is, iff (x1, y1) = (0, 0).
3.1.4 Example
In the vector space C[a, b], for any f, g ∈ C[a, b] let

< f, g > = ∫_a^b f(x)g(x) dx.

This is an inner product.

Proof:

1. < f, g + h > = ∫_a^b f(x)[g(x) + h(x)] dx = ∫_a^b f(x)g(x) dx + ∫_a^b f(x)h(x) dx = < f, g > + < f, h >.

2. < f, g > = ∫_a^b f(x)g(x) dx = ∫_a^b g(x)f(x) dx = < g, f >.

3. < αf, g > = ∫_a^b αf(x)g(x) dx = α ∫_a^b f(x)g(x) dx = α < f, g >.

4. Since (f(x))^2 ≥ 0 for all x ∈ [a, b],

< f, f > = ∫_a^b (f(x))^2 dx ≥ 0.

Suppose < f, f > = ∫_a^b (f(x))^2 dx = 0 and f(c) ≠ 0 for some c ∈ [a, b]. Then (f(c))^2 > 0, and since y = f(x) is continuous, y = (f(x))^2 is continuous. Hence (f(x))^2 is positive in some region around x = c, and therefore ∫_a^b (f(x))^2 dx > 0. Thus f(c) ≠ 0 is not possible: if ∫_a^b (f(x))^2 dx = 0 then f(x) = 0 for all x. This makes C[a, b] an inner product space.

In C[−1, 1], what is < x^2 − 1, x^3 >?
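The question above can be answered numerically: the integrand (x^2 − 1)x^3 is odd, so the inner product is 0 over the symmetric interval [−1, 1]. A midpoint-rule sketch (the helper `inner` is hypothetical, not from the text):

```python
def inner(f, g, a, b, steps=10000):
    """Midpoint-rule approximation of the inner product integral of f*g on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h)
               for i in range(steps)) * h

val = inner(lambda x: x**2 - 1, lambda x: x**3, -1.0, 1.0)
print(abs(val) < 1e-9)  # True: the integrand is an odd function
```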
3.1.5 Proposition. (consequences of the definition of inner product)
Let V be an inner product space. Then
1. < 0v, v > = 0 for any v ∈ V.

2. If u ∈ V is such that < u, v > = 0 for all v ∈ V, then u = 0v.

3. < α1u1 + ... + αkuk, β1v1 + ... + βmvm > = ∑_{i=1}^{k} ∑_{j=1}^{m} αi \overline{βj} < ui, vj >, a total of km terms.

Proof. Exercise. �

Which vector spaces can be turned into an inner product space?
3.1.6 Theorem
Let V be a finite dimensional vector space over C. Then V can be given the structure of an innerproduct space.
Proof. Suppose dim(V ) = n. Let u1, u2, ..., un be a basis for V. If u, v ∈ V, then there exist unique scalars x1, ..., xn, y1, ..., yn such that

u = x1u1 + x2u2 + ... + xnun,  v = y1u1 + y2u2 + ... + ynun.

Define < u, v > = x1\overline{y1} + x2\overline{y2} + ... + xn\overline{yn}. Then axioms 1, 2, 3 are routine to verify. For axiom 4,

< u, u > = x1\overline{x1} + ... + xn\overline{xn} = |x1|^2 + |x2|^2 + ... + |xn|^2 ≥ 0,

and this is 0 iff x1 = x2 = ... = xn = 0, that is, iff u = 0. �
3.1.7 Example
In R2 (over R, so the conjugates drop out), a basis is e1 = (1, 0), e2 = (0, 1). For any vector u = (x, y) we have (x, y) = xe1 + ye2, and so the inner product constructed above is just the scalar product x1x2 + y1y2. But with the basis e = (1, 0), f = (1, 1), we have (x, y) = (x − y)e + yf. Hence,

< (x1, y1), (x2, y2) > = (x1 − y1)(x2 − y2) + y1y2 = x1x2 − x1y2 − x2y1 + 2y1y2.
3.2 Length and Distance
3.2.1 Definition
Let V be an inner product space. For u ∈ V, (< u, u >)^{1/2} is called the length of u, or the norm of u, and is denoted by ||u||. (Note that < u, u > is a nonnegative real number.)
3.2.2 Example
Let V = R2 with the scalar product, and let u = (x, y). Then < u, u > = x^2 + y^2, so ||u|| = √(x^2 + y^2).

If ||u|| = 1, then u is called a unit vector. Note that if u ≠ 0, then u/||u|| is a unit vector, since

< u/||u||, u/||u|| > = (1/||u||) < u, (1/||u||) u > = (1/||u||^2) < u, u > = (1/||u||^2) ||u||^2 = 1.
3.2.3 Definition
Let V be an inner product space and u, v ∈ V. We define the distance of u from v to be ||u − v||.Note that ||u− v|| = ||v − u||.
3.2.4 Example
In R2 with the scalar product, let u = (x, y), v = (a, b). Then u − v = (x − a, y − b), so ||u − v||^2 = (x − a)^2 + (y − b)^2 and ||u − v|| = √((x − a)^2 + (y − b)^2).
3.2.5 Example
Let V = C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx. Then

||f − g||^2 = ∫_a^b (f(x) − g(x))^2 dx.
3.3 Orthogonal and Orthonormal Sets
Let V be an inner product space.
3.3.1 Definition
Let v1, v2, ..., vk be elements of V. Then

1. v1, v2, ..., vk are said to form an orthogonal set of vectors if < vi, vj > = 0 whenever i ≠ j.

2. v1, v2, ..., vk are said to form an orthonormal set of vectors if < vi, vj > = 0 when i ≠ j and < vi, vj > = 1 when i = j.
Note that
1. An orthogonal set can contain the zero vector.

2. If v1, v2, ..., vk is an orthogonal set of nonzero vectors, then v1/||v1||, v2/||v2||, ..., vk/||vk|| is an orthonormal set.
3.3.2 Example
In R2 with the scalar product, (−1, 1), (1, 1) is an orthogonal set. Since ||(−1, 1)|| = √2 and ||(1, 1)|| = √2, an orthonormal set is (−1/√2, 1/√2), (1/√2, 1/√2).
3.3.3 Example
In C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx, orthogonality means ∫_a^b f(x)g(x) dx = 0. Thus cos x, sin x are orthogonal in C[−π, π], since

< cos x, sin x > = ∫_{−π}^{π} cos x sin x dx = ∫_{−π}^{π} (1/2) sin 2x dx = −(1/4)[cos 2x]_{−π}^{π} = −(1/4)(cos 2π − cos(−2π)) = 0,

and

< cos x, cos x > = ∫_{−π}^{π} cos^2 x dx = (1/2) ∫_{−π}^{π} (cos 2x + 1) dx = (1/2)[(sin 2x)/2 + x]_{−π}^{π} = π.

Therefore || cos x|| = √π. Similarly, || sin x|| = √π. Hence an orthonormal set is cos x/√π, sin x/√π.
3.3.4 Theorem
Let u, v be vectors in an inner product space and let a ∈ C. Then

1. ||au|| = |a| ||u||.

2. | < u, v > | ≤ ||u|| ||v||. (Cauchy-Schwarz inequality)

3. ||u + v|| ≤ ||u|| + ||v||. (triangle inequality)

Proof.

1. ||au||^2 = < au, au > = a\overline{a} < u, u > = |a|^2 ||u||^2. Taking positive square roots, ||au|| = |a| ||u||.

2. If u = 0v or v = 0v, then both sides are 0. So suppose v ≠ 0v. Put α = < u, v >/||v||^2 and consider < u − αv, u − αv >. Then

0 ≤ < u − αv, u − αv > = < u, u > + < u, −αv > + < −αv, u > + < −αv, −αv >
  = < u, u > − \overline{α} < u, v > − α < v, u > + α\overline{α} < v, v >.   (∗)

But

\overline{α} < u, v > = (\overline{< u, v >}/||v||^2) < u, v > = | < u, v > |^2 / ||v||^2,
α < v, u > = (< u, v >/||v||^2) \overline{< u, v >} = | < u, v > |^2 / ||v||^2,
α\overline{α} < v, v > = (| < u, v > |^2 / ||v||^4) ||v||^2 = | < u, v > |^2 / ||v||^2.

So (∗) becomes 0 ≤ ||u||^2 − | < u, v > |^2/||v||^2, and therefore | < u, v > |^2 ≤ ||u||^2 ||v||^2. Again taking positive square roots, we get | < u, v > | ≤ ||u|| ||v||.

3.

||u + v||^2 = < u + v, u + v >
  = < u, u > + < u, v > + < v, u > + < v, v >
  = ||u||^2 + < u, v > + \overline{< u, v >} + ||v||^2
  = ||u||^2 + 2Re(< u, v >) + ||v||^2
  ≤ ||u||^2 + 2| < u, v > | + ||v||^2
  ≤ ||u||^2 + 2||u|| ||v|| + ||v||^2
  = (||u|| + ||v||)^2.

Taking positive square roots, ||u + v|| ≤ ||u|| + ||v||. �
In terms of our usual examples, the Cauchy-Schwarz inequality gives:

1. In Rn with the scalar product: if u = (x1, ..., xn), v = (y1, ..., yn) then

| < u, v > | = |x1y1 + ... + xnyn| ≤ (x1^2 + ... + xn^2)^{1/2} (y1^2 + ... + yn^2)^{1/2}.

This holds for any real numbers x1, ..., xn, y1, ..., yn.

2. In C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx, we have

| ∫_a^b f(x)g(x) dx | ≤ ( ∫_a^b (f(x))^2 dx )^{1/2} ( ∫_a^b (g(x))^2 dx )^{1/2}.

If u ≠ 0v and v ≠ 0v, Cauchy-Schwarz says | < u, v > | ≤ ||u|| ||v||, and so | < u, v > |/(||u|| ||v||) ≤ 1. In a real space this means

−1 ≤ < u, v >/(||u|| ||v||) ≤ 1.

Hence there is a unique θ with 0 ≤ θ ≤ π such that < u, v >/(||u|| ||v||) = cos θ, i.e. < u, v > = ||u|| ||v|| cos θ. (θ is defined to be the angle between u and v.)
3.3.5 Theorem
Let u1, u2, ..., ur be nonzero orthogonal vectors in the inner product space V. Then u1, u2, ..., ur are linearly independent.

Proof. Let a1, ..., ar be scalars such that a1u1 + ... + arur = 0v. Then, for each j,

0 = < 0v, uj > = < a1u1 + ... + arur, uj >
  = a1 < u1, uj > + ... + aj < uj, uj > + ... + ar < ur, uj >
  = a1·0 + ... + aj < uj, uj > + ... + ar·0
  = aj < uj, uj >.

Now < uj, uj > ≠ 0 since uj ≠ 0 for all 1 ≤ j ≤ r, and therefore aj = 0 for all j. �

Note that in an n-dimensional vector space, the number of elements in an orthogonal set of nonzero vectors cannot exceed n. In the next theorem we show that we can always find a basis consisting of orthogonal vectors for any finite dimensional inner product space.
3.3.6 Theorem
(Gram-Schmidt Orthogonalisation Process) Let u1, u2, ..., un be linearly independent vectors in an inner product space V. Then there exists an orthonormal set of vectors w1, w2, ..., wn such that for 1 ≤ k ≤ n, the subspace spanned by u1, ..., uk is the same as the subspace spanned by w1, ..., wk.

Proof. We proceed by induction. First, we construct a nonzero orthogonal set v1, ..., vn. Let v1 = u1. This obviously has the required spanning property. Next suppose we have constructed a nonzero orthogonal set v1, ..., vr such that [v1, ..., vk] = [u1, ..., uk] for 1 ≤ k ≤ r. Define

vr+1 = ur+1 − (< ur+1, v1 >/||v1||^2) v1 − (< ur+1, v2 >/||v2||^2) v2 − ... − (< ur+1, vr >/||vr||^2) vr.   (∗)

If vr+1 = 0v, then rearranging (∗) writes ur+1 as a linear combination of v1, ..., vr. Hence ur+1 ∈ [v1, ..., vr] = [u1, ..., ur], so ur+1 is a linear combination of u1, ..., ur, which contradicts the linear independence of u1, ..., un. Hence vr+1 ≠ 0v. Also, for 1 ≤ j ≤ r, since < vi, vj > = 0 for i ≠ j,

< vr+1, vj > = < ur+1, vj > − (< ur+1, vj >/||vj||^2) < vj, vj > = 0.

Therefore v1, ..., vr, vr+1 is an orthogonal set of nonzero vectors. It remains to show that [u1, ..., ur+1] = [v1, ..., vr+1]. So let

M1 = [u1, ..., ur+1], M2 = [v1, ..., vr+1].

By hypothesis, u1, ..., ur ∈ M2, and by (∗), ur+1 ∈ M2. It follows that M1 ⊆ M2. Hence r + 1 = dim M1 ≤ dim M2 ≤ r + 1. Since M1 ⊆ M2 and dim M1 = dim M2, we get M1 = M2.

Finally, put wj = vj/||vj|| for all j. �
3.3.7 Corollary
Every finite dimensional inner product space possesses an orthonormal basis.
3.3.8 Example
Find an orthonormal basis for the subspace of R3 (with the scalar product as inner product) spanned by (1, 2, 1), (3, 4, 1), and extend it to an orthonormal basis for R3.

Solution: Let u1 = (1, 2, 1), u2 = (3, 4, 1). Clearly u1, u2 are linearly independent. Let v1 = u1 and

v2 = u2 − (< u2, v1 >/||v1||^2) v1 = (3, 4, 1) − (12/6)(1, 2, 1) = (1, 0, −1).

The required orthonormal basis is w1 = (1/√6, 2/√6, 1/√6), w2 = (1/√2, 0, −1/√2).

To extend to an orthonormal basis for R3, we need a nonzero vector v3 such that < v3, v1 > = 0 and < v3, v2 > = 0. Writing v3 = (a, b, c), we get < v3, v1 > = a + 2b + c = 0 and < v3, v2 > = a − c = 0. So a = c and b = −a, and we can choose v3 = (1, −1, 1) and w3 = (1/√3, −1/√3, 1/√3).
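The Gram-Schmidt process of Theorem 3.3.6 can be sketched in code; a minimal version for R^n with the standard scalar product, applied to the two vectors of this example (the helper names are ours, not from the text):

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace (assumes the
    input vectors are linearly independent, as the theorem requires)."""
    basis = []
    for u in vectors:
        v = list(u)
        for w in basis:
            c = dot(v, w)                  # component of v along w
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = sqrt(dot(v, v))
        basis.append([vi / norm for vi in v])
    return basis

w = gram_schmidt([(1, 2, 1), (3, 4, 1)])
print(w[0])  # ≈ [0.408, 0.816, 0.408], i.e. (1/sqrt(6))(1, 2, 1)
print(w[1])  # ≈ [0.707, 0.0, -0.707], i.e. (1/sqrt(2))(1, 0, -1)
```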
Now suppose u1, ..., un is an orthonormal basis for the n-dimensional inner product space V, and let u ∈ V. Then there are scalars a1, ..., an such that u = a1u1 + ... + anun. So for any 1 ≤ j ≤ n,

< u, uj > = < a1u1 + ... + anun, uj > = ∑_{h=1}^{n} ah < uh, uj > = aj < uj, uj > = aj.

Note that < ui, uj > = δij, which is 0 if i ≠ j and 1 if i = j. Hence, for any u ∈ V,

u = ∑_{h=1}^{n} < u, uh > uh.
3.3.9 Example
Express (1, 1, 1) as a linear combination of the orthonormal basis in the previous example.

Solution: With u = (1, 1, 1), a1 = < u, w1 > = 4/√6, a2 = < u, w2 > = 0, a3 = < u, w3 > = 1/√3. So

u = (1, 1, 1) = (4/√6) w1 + (1/√3) w3.
3.4 Infinite Dimensional Inner Product Spaces

Let V be a vector space over a field F. A subset X of V is said to be linearly independent in V if every finite subset of X is linearly independent. If V is an inner product space, we say that {ui}_{i∈I} is an orthonormal set if

< ui, uj > = 1 if i = j and < ui, uj > = 0 if i ≠ j.

If {ui}_{i∈I} is an orthonormal set, then any finite subset is linearly independent. Hence, in this case, orthonormal ⇒ linearly independent.

Let V be an infinite dimensional inner product space. In Chapter 1, we saw that there exists an infinite sequence u1, u2, u3, ... such that u1, u2, ..., uk is linearly independent for all k ≥ 1. Applying Gram-Schmidt to u1, u2, ..., uk for each k therefore produces an infinite orthonormal sequence w1, w2, w3, ... such that

[u1, u2, ..., uk] = [w1, w2, ..., wk] for all k ≥ 1.

Hence, every infinite dimensional inner product space has an infinite sequence of orthonormal vectors.
3.4.1 Example
The sequence 1, cos x, cos 2x, ..., cos nx, sin x, sin 2x, ..., sin nx is an orthogonal set of functions in C[−π, π] with inner product ∫_{−π}^{π} f(x)g(x) dx. The corresponding orthonormal set is

1/√(2π), cos x/√π, cos 2x/√π, ..., cos nx/√π, sin x/√π, sin 2x/√π, ..., sin nx/√π.

It follows that

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

is an infinite orthonormal (and therefore linearly independent) sequence in C[−π, π].

The principal application of this chapter is the theory of Fourier series, which expresses a function in the form

(1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx).

So far we have worked with continuous functions. This turns out to be too restrictive. Sums, scalar multiples and integrals remain well defined if we allow reasonable discontinuities.
3.4.2 Definition
A function f : [a, b] → R is called piecewise continuous on [a, b] if for each x0 ∈ [a, b],

lim_{x→x0+} f(x) and lim_{x→x0−} f(x)

both exist (taking the appropriate one-sided limit at the endpoints), and f has at most a finite number of discontinuities. When the limits exist, we denote them by f(x0+) and f(x0−).

Piecewise continuous ≡ at most a finite number of discontinuities, and left and right limits at all points.

Let f, g be piecewise continuous on [a, b]. Then f, g have at most a finite number of discontinuities, so f + g has this property too; moreover (f + g)(x0+) and (f + g)(x0−) both exist. Thus the sum of piecewise continuous functions is piecewise continuous. Also, for any α ∈ R, αf is piecewise continuous.
Notation. Let PC[a, b] denote the collection of all piecewise continuous functions on [a, b]. The calculation above shows that PC[a, b] is closed under addition and scalar multiplication. Hence PC[a, b] is a subspace of the vector space of all real valued functions. Let f ∈ PC[a, b] with discontinuities at x1 < x2 < ... < xn and continuous everywhere else. Then

∫_a^b f(x) dx = ∫_a^{x1} f(x) dx + ∫_{x1}^{x2} f(x) dx + ... + ∫_{x_{n−1}}^{x_n} f(x) dx + ∫_{x_n}^{b} f(x) dx,

and this defines ∫_a^b f(x) dx. Therefore < f, g > = ∫_a^b f(x)g(x) dx is well defined. However, this is not an inner product on PC[a, b]. Consider f ∈ PC[0, 1] where f(x) = 1 at x = 0.3 and x = 0.7, while f(x) = 0 elsewhere. Then ∫_0^1 (f(x))^2 dx = 0 while f ≠ 0.
3.5 Normalized Functions
3.5.1 Definition
(Normalized piecewise continuous function) A piecewise continuous function f : [a, b] → R is called normalized if

f(x) = (1/2)[f(x+) + f(x−)]

for all x ∈ (a, b), and f(a) = f(b) = (1/2)[f(a+) + f(b−)].

Note that if f is continuous, then f(x) = f(x+) = f(x−).
3.5.2 Theorem
Let NPC[a, b] denote the set of functions f : [a, b] → R that are normalized and piecewise continuous on [a, b]. Then NPC[a, b] is an inner product space, with the usual addition and scalar multiplication of functions and inner product

< f, g > = ∫_a^b f(x)g(x) dx.

Proof. We have already seen that the sum of piecewise continuous functions is again piecewise continuous, and similarly for multiplication by scalars. Let f, g ∈ NPC[a, b]. Then

(f + g)(x0) = f(x0) + g(x0)
  = [f(x0+) + f(x0−)]/2 + [g(x0+) + g(x0−)]/2
  = [f(x0+) + g(x0+) + f(x0−) + g(x0−)]/2
  = [(f + g)(x0+) + (f + g)(x0−)]/2.

Also

(αf)(x0) = α f(x0) = α[f(x0+) + f(x0−)]/2 = [(αf)(x0+) + (αf)(x0−)]/2.

Therefore NPC[a, b] is closed under addition and scalar multiplication. The rest of the vector space axioms follow from the properties of functions.

For the inner product we only prove the last axiom. Suppose f ∈ NPC[a, b] and < f, f > = 0, that is, ∫_a^b (f(x))^2 dx = 0. We show f is the zero function. To the contrary, assume there is x0 ∈ [a, b] such that f(x0) ≠ 0. Since f(x0) = [f(x0+) + f(x0−)]/2 ≠ 0, not both of f(x0+) and f(x0−) can be 0; say f(x0+) ≠ 0. Then (f(x0+))^2 > 0. Since f has only a finite number of discontinuities, we must have |f(x)| > 0 in some range x0 < x < c. So

0 < ∫_{x0}^{c} (f(x))^2 dx ≤ ∫_a^b (f(x))^2 dx.

This is a contradiction. �
3.5.3 Definition

1. A function f : R → R is said to be periodic of period 2π if f(x + 2π) = f(x) for all x ∈ R.

2. Let f : [−π, π] → R. The periodic extension of f is the function f̃ : R → R which is periodic of period 2π and satisfies f̃(x) = f(x) for −π ≤ x ≤ π.

(Thus, f̃(x) = f(x − 2nπ), where n ∈ Z is such that x − 2nπ ∈ [−π, π].)
3.5.4 Example

f(x) = x + 2π if −3π < x < −π; 0 if x = −π; x if −π < x < π; 0 if x = π; x − 2π if π < x < 3π; and so on.
3.6 Projection onto Subspaces: (Best Approximation)
3.6.1 Lemma
Let w1, ..., wn be a basis for a subspace W of the real inner product space V. For v ∈ V, the following are equivalent:

1. < v, w > = 0 for all w ∈ W.

2. < v, wi > = 0 for 1 ≤ i ≤ n.

Proof. (1) ⇒ (2) is obvious. (2) ⇒ (1): Let w ∈ W. Then there are scalars α1, ..., αn ∈ R such that w = α1w1 + ... + αnwn. So

< v, w > = < v, α1w1 + ... + αnwn > = ∑_{i=1}^{n} αi < v, wi > = 0. �

A vector v satisfying these conditions is said to be perpendicular (or orthogonal) to W, and we write v ⊥ W.
3.6.2 Lemma: (Pythagorean Lemma)
If u1, ..., uk are orthogonal, then ||u1 + u2 + ... + uk||^2 = ||u1||^2 + ... + ||uk||^2.

Proof. Since u1, ..., uk are orthogonal, < ui, uj > = 0 for all i ≠ j. Hence

||u1 + u2 + ... + uk||^2 = < u1 + ... + uk, u1 + ... + uk >
  = < u1, u1 > + < u2, u2 > + ... + < uk, uk >
  = ||u1||^2 + ||u2||^2 + ... + ||uk||^2. �
3.6.3 Theorem: Orthogonal Projection
Let V be a real inner product space, let W be a finite dimensional subspace of V, and let v ∈ V. Then:

1. v can be written uniquely in the form v = w + x where w ∈ W and x ⊥ W.

2. w is the best approximation to v by a member of W, in the sense that for all w′ ∈ W with w′ ≠ w,

||v − w′|| > ||v − w||.

Proof.

1. Let w1, ..., wn be an orthonormal basis for W. Let

w = ∑_{i=1}^{n} < v, wi > wi

and put x = v − w. Clearly w ∈ W. To show that x ⊥ W it is sufficient, by Lemma 3.6.1, to show that < x, wi > = 0 for 1 ≤ i ≤ n. So

< x, wi > = < v − w, wi > = < v, wi > − < ∑_{j=1}^{n} < v, wj > wj, wi > = < v, wi > − < v, wi > = 0.

For uniqueness, suppose also v = x̃ + w̃, where w̃ ∈ W and x̃ ⊥ W. Write w̃ = α1w1 + ... + αnwn. Then

αj = < w̃, wj > = < v − x̃, wj > = < v, wj > − < x̃, wj > = < v, wj >.

Hence w̃ = ∑_{j=1}^{n} αj wj = ∑_{j=1}^{n} < v, wj > wj = w, and so x̃ = x. This gives the uniqueness.

2. For w′ ∈ W with w′ ≠ w,

v − w′ = (v − w) + (w − w′).

Note that x = v − w ⊥ W and w − w′ ∈ W, so x ⊥ (w − w′). By the Pythagorean lemma,

||v − w′||^2 = ||(v − w) + (w − w′)||^2 = ||v − w||^2 + ||w − w′||^2 > ||v − w||^2. �
3.6.4 Example
Find the distance of (1, 3, 2) in R3 from the plane through (0, 0, 0), (1, 0, 0), (1, 1, 1).

Solution: Apply Gram-Schmidt to (1, 0, 0), (1, 1, 1) to get an orthonormal basis for the plane W. Let v1 = (1, 0, 0); then v2 = (1, 1, 1) − (1/1)(1, 0, 0) = (0, 1, 1). Hence

w1 = (1, 0, 0), w2 = (0, 1/√2, 1/√2).

Now, with v = (1, 3, 2),

w = ∑_{i=1}^{2} < v, wi > wi = 1·(1, 0, 0) + (5/√2)·(0, 1/√2, 1/√2) = (1, 5/2, 5/2).

So x = v − w = (1, 3, 2) − (1, 5/2, 5/2) = (0, 1/2, −1/2), and the distance is ||x|| = √(1/4 + 1/4) = 1/√2.
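The projection computation of this example can be reproduced numerically; a sketch assuming the orthonormal basis w1, w2 already found by Gram-Schmidt:

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v = (1, 3, 2)
w1 = (1, 0, 0)
w2 = (0, 1 / sqrt(2), 1 / sqrt(2))

# orthogonal projection of v onto W = [w1, w2]
proj = [dot(v, w1) * a + dot(v, w2) * b for a, b in zip(w1, w2)]
x = [vi - pi for vi, pi in zip(v, proj)]   # component of v perpendicular to W
dist = sqrt(dot(x, x))
print(round(dist, 4))  # 0.7071, i.e. 1/sqrt(2)
```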
3.7 Convergence in an Inner Product Space
3.7.1 Definition
Let V be an inner product space and let (un) be a sequence of vectors in V. We say (un) converges to u if

lim_{n→∞} ||un − u|| = 0.

(Note that ||un − u|| is a real number.) In the case of functions, ||fn − f|| → 0 means that the area between fn and f tends to 0.
3.8 Definition
Let xk ∈ V. Then ∑_{k=1}^{∞} xk is said to converge to x if ||∑_{k=1}^{n} xk − x|| → 0 as n → ∞.
3.8.1 Theorem
Let (un) → u and (vn) → v in an inner product space V, let ∑_{n=1}^{∞} xn be a convergent series in V, and let (wn) be an infinite orthonormal sequence in V. Then

1. lim_{n→∞} (un + vn) = u + v.

2. lim_{n→∞} (λun) = λu where λ ∈ F.

3. For w ∈ V, lim_{n→∞} < un, w > = < u, w >; that is, < lim_{n→∞} un, w > = lim_{n→∞} < un, w >.

4. < ∑_{n=1}^{∞} xn, w > = ∑_{n=1}^{∞} < xn, w >. (This includes the statement that the series of real numbers on the right hand side is convergent.)

5. Let α1, α2, ... be scalars such that for y ∈ V, ∑_{n=1}^{∞} αn wn = y. Then αn = < y, wn >.

Proof.

1. By the triangle inequality,

0 ≤ ||(un + vn) − (u + v)|| = ||(un − u) + (vn − v)|| ≤ ||un − u|| + ||vn − v|| → 0 + 0 = 0.

2. 0 ≤ ||λun − λu|| = |λ| ||un − u|| → |λ|·0 = 0.

3. By Cauchy-Schwarz,

| < un, w > − < u, w > | = | < un − u, w > | ≤ ||un − u|| ||w|| → 0·||w|| = 0.

Hence < un, w > → < u, w >, that is, lim_{n→∞} < un, w > = < u, w > = < lim_{n→∞} un, w >.

4.

< ∑_{n=1}^{∞} xn, w > = < lim_{n→∞} ∑_{k=1}^{n} xk, w >
  = lim_{n→∞} < ∑_{k=1}^{n} xk, w >  (by part 3)
  = lim_{n→∞} ∑_{k=1}^{n} < xk, w >  (by linearity of the inner product)
  = ∑_{k=1}^{∞} < xk, w >.

5.

< y, wn > = < ∑_{k=1}^{∞} αk wk, wn >
  = ∑_{k=1}^{∞} < αk wk, wn >  (by part 4)
  = ∑_{k=1}^{∞} αk < wk, wn >  (using orthonormality)
  = αn·1 = αn. �
3.9 Fourier Series
Consider the infinite orthonormal sequence

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

and the inner product on NPC[−π, π] given by

< f, g > = ∫_{−π}^{π} f(t)g(t) dt.

Then, for f ∈ NPC[−π, π], the series ∑_k < f, wk > wk becomes

A0 (1/√(2π)) + ∑_{n=1}^{∞} ( An cos nx/√π + Bn sin nx/√π )   (∗)

where

A0 = ∫_{−π}^{π} f(t) (1/√(2π)) dt,
An = ∫_{−π}^{π} f(t) (cos nt/√π) dt,
Bn = ∫_{−π}^{π} f(t) (sin nt/√π) dt.

Thus (∗) becomes

(1/(2π)) ∫_{−π}^{π} f(t) dt + ∑_{n=1}^{∞} [ ( (1/π) ∫_{−π}^{π} f(t) cos nt dt ) cos nx + ( (1/π) ∫_{−π}^{π} f(t) sin nt dt ) sin nx ].

This is of the form

(1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx)

where

a0 = (1/π) ∫_{−π}^{π} f(t) dt,
an = (1/π) ∫_{−π}^{π} f(t) cos nt dt,
bn = (1/π) ∫_{−π}^{π} f(t) sin nt dt.

These numbers are called the Fourier coefficients of f.

Let w1, w2, ..., wn, ... be an infinite orthonormal sequence in V and let v ∈ V. Then < v, wn > is called the n-th generalized Fourier coefficient of v relative to w1, w2, ..., wn, ...
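The Fourier coefficient formulas above can be evaluated numerically; a sketch using a midpoint rule, tried on the test function f(x) = x (this particular f and the helper names are ours, not from the text):

```python
from math import pi, cos, sin

def integrate(g, a, b, steps=20000):
    """Simple midpoint rule; adequate for these smooth integrands."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def fourier_coeffs(f, n):
    """a_n and b_n of f on [-pi, pi], per the formulas in the text."""
    a_n = integrate(lambda t: f(t) * cos(n * t), -pi, pi) / pi
    b_n = integrate(lambda t: f(t) * sin(n * t), -pi, pi) / pi
    return a_n, b_n

# For f(x) = x the coefficients are a_n = 0, b_n = 2(-1)^{n+1}/n.
for n in (1, 2, 3):
    a_n, b_n = fourier_coeffs(lambda x: x, n)
    print(n, round(a_n, 6), round(b_n, 6))
```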
3.9.1 Definition
The infinite orthonormal sequence w1, w2, ..., wn, ... in V is said to be complete in V if

∑_{n=1}^{∞} < v, wn > wn = v for all v ∈ V.
3.9.2 Theorem
Let w1, w2, ..., wn, ... be an infinite orthonormal sequence in the inner product space V, and let u, v ∈ V. Then

1. Bessel's inequality: ∑_{n=1}^{∞} | < v, wn > |^2 is convergent and

∑_{n=1}^{∞} | < v, wn > |^2 ≤ ||v||^2.

2. Riemann-Lebesgue lemma: < v, wn > → 0 as n → ∞.

3. The following statements are equivalent:

(a) w1, w2, ..., wn, ... is a complete sequence in V.

(b) ∑_{n=1}^{∞} < v, wn >^2 = ||v||^2 for all v ∈ V. (Parseval's identity)

(c) ∑_{n=1}^{∞} < u, wn > < v, wn > = < u, v > for all u, v ∈ V. (Plancherel's identity)
Proof. Fix n and put W = [w1, w2, ..., wn]. By the orthogonal projection theorem, we can write

v = xn + un, where un = ∑_{k=1}^{n} < v, wk > wk and xn ⊥ un.

By the Pythagorean lemma, ||v||^2 = ||xn||^2 + ||un||^2.

1.

||un||^2 = ||∑_{k=1}^{n} < v, wk > wk||^2
  = ∑_{k=1}^{n} || < v, wk > wk ||^2   (Pythagorean lemma)
  = ∑_{k=1}^{n} | < v, wk > |^2 ||wk||^2
  = ∑_{k=1}^{n} | < v, wk > |^2,

and

∑_{k=1}^{n} | < v, wk > |^2 = ||un||^2 ≤ ||un||^2 + ||xn||^2 = ||v||^2.

Hence ∑_{k=1}^{∞} | < v, wk > |^2 is a series of nonnegative terms whose partial sums are bounded above by ||v||^2, so the series converges with sum ≤ ||v||^2.

2. Since ∑_{n=1}^{∞} < v, wn >^2 converges, < v, wn >^2 → 0 and so < v, wn > → 0.

3. (c) ⇒ (b): Put u = v.

(b) ⇒ (a):

||v − ∑_{k=1}^{n} < v, wk > wk||^2 = ||xn||^2 = ||v||^2 − ||un||^2 = ||v||^2 − ∑_{k=1}^{n} < v, wk >^2 → 0 (since (b) holds).

(a) ⇒ (c): Assume (a) holds, so u = ∑_{k=1}^{∞} αk wk where αk = < u, wk >. Then

< u, v > = < ∑_{k=1}^{∞} αk wk, v >
  = ∑_{k=1}^{∞} < αk wk, v >   (by part 4 of Theorem 3.8.1)
  = ∑_{k=1}^{∞} αk < wk, v >
  = ∑_{k=1}^{∞} < u, wk > < v, wk >. �
3.9.3 Lemma
Let h be periodic of period 2π and piecewise continuous. Then

∫_{−π}^{π} h(t) dt = ∫_{−π+a}^{π+a} h(t) dt.

Proof. Write I1 = ∫_{−π}^{π} h(t) dt and I2 = ∫_{−π+a}^{π+a} h(t) dt. Then

I1 = ∫_{−π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(t) dt,
I2 = ∫_{−π}^{π+a} h(t) dt − ∫_{−π}^{−π+a} h(t) dt.

So

I2 − I1 = ∫_{π}^{π+a} h(t) dt − ∫_{−π}^{−π+a} h(t) dt.

Put u = t + 2π in the second integral; then, by periodicity,

I2 − I1 = ∫_{π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(u − 2π) du = ∫_{π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(u) du = 0.

Hence I1 = I2. �
3.9.4 Lemma
Let h : R → R be integrable. Then

1. If h is even, ∫_{−π}^{π} h(t) dt = 2 ∫_{0}^{π} h(t) dt.

2. If h is odd, ∫_{−π}^{π} h(t) dt = 0.

Proof. Exercise. �
Proof. Exercise. �
3.9.5 Example
Let f : R → R be given by f(x) = x + π if −π < x < 0, and f(x) = x − π if 0 < x < π. Then f is piecewise continuous and periodic of period 2π, and

a0 = (1/π) ∫_{−π}^{π} f(x) dx,
an = (1/π) ∫_{−π}^{π} f(t) cos nt dt,
bn = (1/π) ∫_{−π}^{π} f(t) sin nt dt.

Note that f and sin are odd functions, while cos is even. Hence f(t) cos nt is odd while f(t) sin nt is even. Therefore a0 = an = 0 and

bn = (2/π) ∫_{0}^{π} f(t) sin nt dt = (2/π) ∫_{0}^{π} (t − π) sin nt dt = −2/n.

So the Fourier series is ∑_{n=1}^{∞} (−2 sin nx)/n, and we write f(x) ∼ ∑_{n=1}^{∞} (−2 sin nx)/n.
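The value bn = −2/n in this example can be checked by numerical integration; a small sketch (the helper names are ours):

```python
from math import pi, sin

def integrate(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g on [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def b_coeff(n):
    # b_n = (2/pi) * integral of (t - pi) sin(nt) over [0, pi]
    return (2 / pi) * integrate(lambda t: (t - pi) * sin(n * t), 0, pi)

for n in (1, 2, 5):
    print(n, round(b_coeff(n), 6))  # approx -2/n
```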
3.9.6 Example
The function f : R → R given by f(x) = x(π − x), −π < x < π, is periodic of period 2π and normalized piecewise continuous. This function is neither even nor odd. So

a0 = (1/π) ∫_{−π}^{π} x(π − x) dx = −(2/3)π^2,

an = (1/π) ∫_{−π}^{π} x(π − x) cos nx dx
   = −(1/π) ∫_{−π}^{π} x^2 cos nx dx   (xπ cos nx is odd)
   = −(1/π) [x^2 sin nx/n]_{−π}^{π} + (1/π) ∫_{−π}^{π} (2x sin nx)/n dx
   = (4/(nπ)) ∫_{0}^{π} x sin nx dx   (x sin nx is even)
   = (4/(nπ)) ( [−x cos nx/n]_0^{π} + ∫_0^{π} (cos nx)/n dx )
   = (4/(nπ)) (−π cos nπ/n)
   = (4/n^2)(−1)^{n+1},

bn = (1/π) ∫_{−π}^{π} x(π − x) sin nx dx
   = (1/π) ∫_{−π}^{π} (xπ sin nx − x^2 sin nx) dx
   = ∫_{−π}^{π} x sin nx dx   (x^2 sin nx is odd)
   = 2 ∫_{0}^{π} x sin nx dx
   = 2 [−x cos nx/n]_0^{π} + 2 ∫_0^{π} (cos nx)/n dx
   = (2π/n)(−1)^{n+1}.

Hence

f(x) ∼ −π^2/3 + ∑_{n=1}^{∞} (−1)^{n+1} ( (4/n^2) cos nx + (2π/n) sin nx ).
We now consider convergence of the Fourier series: for x0 ∈ R, under what circumstances does it follow that

f(x0) = (1/2)a0 + ∑_{n=1}^{∞} (an cos nx0 + bn sin nx0)?

3.9.7 Definition

Suppose f : R → R is normalized and piecewise continuous on R. If lim_{h→0+} [f(x0 + h) − f(x0)]/h exists, then it is called the right derivative of f at x0. Similarly, if lim_{h→0−} [f(x0 + h) − f(x0)]/h exists, then it is called the left derivative of f at x0. If f is differentiable at x0, then both exist and both have value f′(x0).
3.9.8 Theorem: (Pointwise Convergence Theorem)
Suppose f : R → R is periodic of period 2π, normalized and piecewise continuous, and let

f(x) ∼ (1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx).

Suppose f has both a left and a right derivative at x0. Then the Fourier series converges at x = x0 to f(x0), i.e.

f(x0) = (1/2)a0 + ∑_{n=1}^{∞} (an cos nx0 + bn sin nx0).

Before we can prove this theorem, we need a few lemmas.
Before we can prove this theorem, we need a few lemmas.
3.9.9 Lemma 1
Let n ≥ 1 be an integer, then 12+∑n
k=1 cos kx =sin(n+ 1
2)x
2 sin 12x
where x = 2rπ.
Proof. Since sin(A+B)− sin(A−B) = 2 cosA sinB, then
2 cos kx sin1
2x = sin(k +
1
2x)− sin(k − 1
2)x
Hence
n∑k=1
2(cos kx sin1
2x) =
n∑k=1
sin[(k +1
2x)− sin(k − 1
2)x]
= sin(n+1
2− sin
1
2x
Dividing by 2 sin 12x and rearranging gives the result. �
3.9.10 Definition

For n ≥ 0, 1/2 + Σ_{k=1}^{n} cos kx is called the nth Dirichlet kernel and is denoted by D_n(x); in particular D₀(x) = 1/2.

Properties of D_n(x):

1. D_n is an even function.

2. D_n is periodic of period 2π.

3. ∫₀^π D_n(x) dx = ∫₀^π (1/2) dx + Σ_{k=1}^{n} ∫₀^π cos kx dx = π/2 + 0 = π/2.
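These facts can be checked numerically (an illustrative sketch, not part of the notes): the closed form of Lemma 1 and the integral in property 3.

```python
import numpy as np

# D_n(x) = 1/2 + sum_{k=1}^n cos(kx) = sin((n + 1/2)x) / (2 sin(x/2)), x != 2*r*pi.
def D(n, x):
    return 0.5 + sum(np.cos(k * x) for k in range(1, n + 1))

n, x = 7, 1.234
closed = np.sin((n + 0.5) * x) / (2 * np.sin(x / 2))
print(D(n, x), closed)  # the two values agree

# Property 3: integral of D_n over [0, pi] is pi/2 (midpoint rule).
pts = 100000
t = (np.arange(pts) + 0.5) * np.pi / pts
print(np.sum(D(n, t)) * np.pi / pts)  # approx pi/2
```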
For n ≥ 0 write

S_n(x) = (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx).

S_n(x) is called the nth partial sum of the Fourier series.
3.9.11 Lemma 2

With the notation as above, for x ∈ R,

S_n(x) − f(x) = (1/π) ∫₀^π (f(x + u) − f(x⁺)) D_n(u) du + (1/π) ∫₀^π (f(x − u) − f(x⁻)) D_n(u) du.
Proof.

S_n(x) = (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx).

Substituting for the Fourier coefficients,

S_n(x) = (1/2π) ∫_{−π}^{π} f(u) du + Σ_{k=1}^{n} [ (cos kx/π) ∫_{−π}^{π} f(u) cos ku du + (sin kx/π) ∫_{−π}^{π} f(u) sin ku du ]

       = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} (cos kx cos ku + sin kx sin ku) ) f(u) du

       = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} cos k(x − u) ) f(u) du.

Put x − u = t (note that x is a constant); then

S_n(x) = (1/π) ∫_{x+π}^{x−π} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) (−dt)

       = (1/π) ∫_{−π+x}^{π+x} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) dt.

Since cos kt and f are periodic of period 2π,

S_n(x) = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) dt

       = (1/π) ∫_{−π}^{π} D_n(t) f(x − t) dt.

Putting u = −t and using the fact that D_n is even,

S_n(x) = (1/π) ∫_{−π}^{π} D_n(u) f(x + u) du

       = (1/π) ∫_{−π}^{0} D_n(v) f(x + v) dv + (1/π) ∫₀^{π} D_n(u) f(x + u) du.

Putting v = −u in the first integral we get

S_n(x) = (1/π) ∫₀^{π} D_n(u) f(x − u) du + (1/π) ∫₀^{π} D_n(u) f(x + u) du.   (∗)

And, since f is normalized,

f(x) = f(x) · 1 = f(x) · (2/π) ∫₀^{π} D_n(u) du

     = (2/π) ∫₀^{π} f(x) D_n(u) du

     = (2/π) ∫₀^{π} ( (f(x⁺) + f(x⁻))/2 ) D_n(u) du

     = (1/π) ∫₀^{π} f(x⁺) D_n(u) du + (1/π) ∫₀^{π} f(x⁻) D_n(u) du.   (∗∗)
The result comes from subtracting (∗∗) from (∗). □

Recall the Riemann–Lebesgue lemma: if w₁, w₂, ..., w_n, ... is an infinite orthonormal sequence, then ⟨v, w_n⟩ → 0 as n → ∞. Now

cos x/√π, cos 2x/√π, ..., cos nx/√π, ...   and   sin x/√π, sin 2x/√π, ..., sin nx/√π, ...

are a pair of infinite orthonormal sequences. So, for any f ∈ NPC[−π, π],

lim_{n→∞} ∫_{−π}^{π} f(t) cos nt dt = 0  and  lim_{n→∞} ∫_{−π}^{π} f(t) sin nt dt = 0.
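A numerical illustration of Riemann–Lebesgue (a sketch, not from the notes), using the piecewise continuous function f(t) = sign(t), whose sine integrals can be computed exactly as 2(1 − cos nπ)/n, i.e. 4/n for odd n and 0 for even n:

```python
import numpy as np

# Midpoint-rule approximation of integral_{-pi}^{pi} sign(t) sin(n t) dt.
def sine_integral(n, pts=200000):
    dt = 2 * np.pi / pts
    t = -np.pi + (np.arange(pts) + 0.5) * dt
    return np.sum(np.sign(t) * np.sin(n * t)) * dt

for n in (1, 11, 101):
    print(n, sine_integral(n))  # roughly 4, 4/11, 4/101 -- tending to 0
```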
3.9.12 Lemma 3

Let h : [0, π] → R be a piecewise continuous function. Then

lim_{n→∞} ∫₀^{π} h(t) sin(n + 1/2)t dt = 0.
Proof. We extend h to [−π, π] by

g(t) = { 0, if −π ≤ t < 0;  h(t), if 0 ≤ t ≤ π.

Then g is piecewise continuous on [−π, π] and

∫₀^{π} h(t) sin(n + 1/2)t dt = ∫_{−π}^{π} g(t) sin(n + 1/2)t dt

 = ∫_{−π}^{π} g(t) ( sin nt cos(t/2) + cos nt sin(t/2) ) dt

 = ∫_{−π}^{π} ( g(t) cos(t/2) ) sin nt dt + ∫_{−π}^{π} ( g(t) sin(t/2) ) cos nt dt.

Since g(t) cos(t/2) and g(t) sin(t/2) are piecewise continuous, both integrals tend to 0 by the Riemann–Lebesgue lemma. □
Back to Theorem 3.9.8: we know f is normalized, piecewise continuous, periodic of period 2π, and possesses left and right derivatives at a point x = x₀. We must show that S_n(x₀) − f(x₀) → 0.

By Lemma 2,

S_n(x₀) − f(x₀) = (1/π) ∫₀^{π} [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] D_n(u) du

 = (1/π) ∫₀^{π} [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] · sin(n + 1/2)u / (2 sin(u/2)) du   (by Lemma 1).

Put

φ(u) = [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] / (2 sin(u/2)).

If we can show that φ is piecewise continuous on [0, π] then we can use Lemma 3. There can be no problems except at u = 0, since f and sin(u/2) are piecewise continuous on [0, π]. However,

φ(u) = [ (f(x₀ + u) − f(x₀⁺))/u − (f(x₀ − u) − f(x₀⁻))/(−u) ] · (u/2)/sin(u/2),

and as u → 0⁺ the first quotient tends to the right derivative of f at x₀, the second to the left derivative, and (u/2)/sin(u/2) → 1. So φ has a finite limit as u → 0⁺ and is piecewise continuous on [0, π]. By Lemma 3, S_n(x₀) − f(x₀) → 0, as required. □
For f, g ∈ NPC[−π, π] with Fourier coefficients a_n, b_n and c_n, d_n respectively,

∫_{−π}^{π} f(x)g(x) dx = ( √(π/2) a₀ )( √(π/2) c₀ ) + Σ_{n=1}^{∞} [ (√π a_n)(√π c_n) + (√π b_n)(√π d_n) ],

i.e.

(1/π) ∫_{−π}^{π} f(x)g(x) dx = (1/2)a₀c₀ + Σ_{n=1}^{∞} (a_n c_n + b_n d_n).   (PLANCHEREL)

In particular, with g = f,

(1/π) ∫_{−π}^{π} [f(x)]² dx = (1/2)a₀² + Σ_{n=1}^{∞} (a_n² + b_n²).   (PARSEVAL)
3.9.13 Example

Let

f(x) = { x + π, if −π < x < 0;  x − π, if 0 < x < π.

The Fourier series is

f ∼ −2 ( sin x/1 + sin 2x/2 + ... + sin nx/n + ... ) = Σ_{n=1}^{∞} (−2/n) sin nx,

so a₀ = a_n = 0 and b_n = −2/n. Hence

(1/π) ∫_{−π}^{π} [f(x)]² dx = (1/π) ∫_{−π}^{0} [f(x)]² dx + (1/π) ∫₀^{π} [f(x)]² dx

 = (1/π) ∫_{−π}^{0} (x + π)² dx + (1/π) ∫₀^{π} (x − π)² dx

 = (2/3)π².

By Parseval,

(2/3)π² = 0 + Σ_{n=1}^{∞} (0 + 4/n²) = Σ_{n=1}^{∞} 4/n².
Note that Σ_{n=1}^{∞} 1/n² = π²/6.
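The partial sums of this Parseval identity can be checked directly (an aside, not from the notes): Σ 4/n² approaches (2/3)π², equivalently Σ 1/n² → π²/6.

```python
import numpy as np

# Partial sum of sum 4/n^2; the tail beyond N is about 4/N.
N = 100000
s = np.sum(4.0 / np.arange(1, N + 1) ** 2)
print(s, 2 * np.pi ** 2 / 3)  # the two values are close
```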
Completeness

Recall that the infinite orthonormal sequence w₁, w₂, ..., w_n, ... is said to be complete if

|| u − Σ_{k=1}^{n} ⟨u, w_k⟩ w_k || → 0 as n → ∞

for every u ∈ V. In NPC[−π, π], the infinite orthonormal sequence has been

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

The proof that these functions are complete in NPC[−π, π] is long and difficult and will be omitted, but the result is important: if

f ∼ (1/2)a₀ + Σ_{n=1}^{∞} (a_n cos nx + b_n sin nx)

then ||S_n − f|| → 0 for any f ∈ NPC[−π, π], i.e.
∫_{−π}^{π} ( (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx) − f(x) )² dx → 0 as n → ∞.
The Plancherel and Parseval theorems say:

w₁, w₂, ..., w_n, ... is complete ⟺ ⟨u, v⟩ = Σ_{i=1}^{∞} ⟨u, w_i⟩⟨v, w_i⟩   (Plancherel)

 ⟺ ⟨u, u⟩ = Σ_{i=1}^{∞} ⟨u, w_i⟩²   (Parseval)

For NPC[−π, π], the sequence w₁, w₂, ..., w_n, ... is

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...
So,

⟨f(x), 1/√(2π)⟩ = ∫_{−π}^{π} f(x) (1/√(2π)) dx = √(π/2) · (1/π) ∫_{−π}^{π} f(x) dx = √(π/2) a₀,

⟨f(x), cos nx/√π⟩ = ∫_{−π}^{π} f(x) (cos nx/√π) dx = √π · (1/π) ∫_{−π}^{π} f(x) cos nx dx = √π a_n.

Similarly,

⟨f(x), sin nx/√π⟩ = √π b_n.
Exercises
Set10.
(1) Determine which of the following rules defines an inner product on R2
⟨(x₁, y₁), (x₂, y₂)⟩ = (i) x₁x₂

 (ii) 2(x₁x₂ + y₁y₂)

 (iii) −2(x₁x₂ + y₁y₂)

 (iv) (x₁x₂)² + (y₁y₂)²

 (v) x₁x₂ − 2x₁y₂ − 2x₂y₁ + 5y₁y₂

 (vi) x₁y₂ + x₂y₁.
(2) Show that the relation
< A,B >= trace (BTA)
for all A,B ∈Mnn(C) defines an inner product on Mnn(C).
(3) Let n ≥ 1 be an integer. Show that the functions
1, cos x, cos 2x, ..., cosnx, sinx, sin 2x, ..., sinnx
are an orthogonal set of functions in C[−π, π] with its usual inner product.
(i.e. ⟨f(x), g(x)⟩ = ∫_{−π}^{π} f(x)g(x) dx). Calculate the corresponding orthonormal set.
Set11.
(1) Verify that the relation

⟨u, v⟩ = x₁x₂ + 2y₁y₂ + 3z₁z₂ − x₁y₂ − x₂y₁ + x₁z₂ + x₂z₁,

where u = (x₁, y₁, z₁), v = (x₂, y₂, z₂), defines an inner product on R³. Find an orthonormal basis for this space.

Show that

|x₁x₂ + 2y₁y₂ + 3z₁z₂ − x₁y₂ − x₂y₁ + x₁z₂ + x₂z₁| ≤ (x₁² + 2y₁² + 3z₁² − 2x₁y₁ + 2x₁z₁)^{1/2} (x₂² + 2y₂² + 3z₂² − 2x₂y₂ + 2x₂z₂)^{1/2}

for all real numbers x₁, x₂, y₁, y₂, z₁, z₂.
(2) Find an orthonormal basis for the subspace of R⁴ (with the standard scalar product as inner product) which is generated by
u1 = (1, 1,−1, 1), u2 = (1,−1,−2, 2), u3 = (4, 2, 3, 1), u4 = (3, 1, 0, 2)
and extend it to an orthonormal basis for R4.
Express (9, 4,−3, 5) in terms of your orthonormal basis.
Set12.
(1) Find an orthonormal basis for the subspace [1, x, x², x³] of C[−1, 1] with the usual inner product ⟨f, g⟩ = ∫_{−1}^{1} f(t)g(t) dt.
(2) Let V be a real inner product space.
1. Show that ||u + v||² + ||u − v||² = 2||u||² + 2||v||² for all u, v ∈ V. Obtain a theorem about parallelograms by considering R² with the standard scalar product.

2. Suppose u, v ∈ V and ||u|| = ||v||. Prove that u + v and u − v are orthogonal. Obtain a theorem about rhombuses (rhombi?) by considering R² with the standard scalar product.
Set13.
(1) f : R → R is periodic of period 2π, normalized and piecewise continuous, where

f(x) = { x, if −π < x < 0;  π − x, if 0 < x < π.

Sketch the graph of f for the range −3π ≤ x ≤ 3π and write down a formula for f over this range.

(2) Find the best approximation to the function f(x) = x in the subspace

[sin x, sin 2x, ..., sin nx]  (n ≥ 1 an integer)
of NPC[−π, π] with its standard inner product.
(3) Find the distance of the point (1, 0, 2) in R³ from the plane which passes through the origin and the points (1, −1, 0) and (0, 1, 1).
Set14.
(1) Let W be a subspace of the real inner product space V. Define

W⊥ = {u ∈ V : ⟨u, w⟩ = 0 for all w ∈ W}.

Show that W⊥ is a subspace of V.

Show that V = W ⊕ W⊥ when W is finite dimensional. (The proof is one line using the orthogonal projection theorem.)

W⊥ is called the orthogonal complement of W.

Find a basis for the orthogonal complement W⊥ of the subspace W when:

1. W = [(1, 2, 0, 1), (1, −1, 1, 0)] ⊆ R⁴ with the standard scalar product as inner product.

2. W is the subspace of all symmetric matrices in M₃₃(R) with inner product ⟨A, B⟩ = trace(BᵀA) (see Set 9).
(2) Use the Parseval identity and the function f(x) = x to show that the infinite orthonormal sequence

1/√(2π), cos x/√π, cos 2x/√π, ..., cos nx/√π, ...

is not complete in NPC[−π, π].
Set15.
(1) For each of the following functions, draw a graph of its periodic normalized extension over [−3π, 3π] and find its Fourier series:

1. f(x) = { 0, if −π < x < 0;  x, if 0 < x < π.

2. g(x) = e^{−x} for −π < x < π.
(2) f is normalized, piecewise continuous and has period π (i.e. f(x + π) = f(x)). By means of suitable changes of variables, show that

∫_{−π}^{0} f(t) cos nt dt = ∫₀^{π} f(u) cos nu cos nπ du,

∫_{−π}^{0} f(t) sin nt dt = ∫₀^{π} f(u) sin nu cos nπ du.

Deduce that, for n ≥ 1, the Fourier coefficients of f are given by:

a_n = { 0, if n is odd;  (2/π) ∫₀^{π} f(t) cos nt dt, if n is even.

b_n = { 0, if n is odd;  (2/π) ∫₀^{π} f(t) sin nt dt, if n is even.
Set16.
(1) Let θ be a real number which is not an integer. The function f ∈ NPC[−π, π] satisfies f(x) = cos θx for −π ≤ x ≤ π. Sketch the graph of the periodic extension of f when θ = 1/4.

Show that

f ∼ sin πθ/(πθ) + Σ_{n=1}^{∞} ( (−1)ⁿ 2θ sin πθ / (π(θ² − n²)) ) cos nx.

Deduce that

csc θπ = 1/(θπ) + (2θ/π) Σ_{n=1}^{∞} (−1)ⁿ/(θ² − n²),

cot θπ = 1/(θπ) + (2θ/π) Σ_{n=1}^{∞} 1/(θ² − n²).
(2) By considering suitable even and odd functions on [−π, π], show that

π/2 − (4/π) Σ_{r=1}^{∞} cos(2r − 1)x/(2r − 1)² = x = 2 Σ_{n=1}^{∞} (−1)^{n+1} sin nx/n  if 0 ≤ x < π.

What is the value of

1. π/2 − (4/π) Σ_{r=1}^{∞} cos(2r − 1)x/(2r − 1)²  if −π < x < 0?

2. 2 Σ_{n=1}^{∞} (−1)^{n+1} sin nx/n  if −π < x < 0?
Set17.
(1) f : R → R is even, periodic of period 2π, normalized and

f(x) = { 0, if 0 < x < π/2;  1, if π/2 < x < π.

Show that

f(x) = 1/2 − (2/π) ( cos x/1 − cos 3x/3 + cos 5x/5 − cos 7x/7 + ... )

for all x ∈ R. Find the sums of

1. Σ_{n=1}^{∞} (−1)^{n+1}/(2n − 1),

2. 1 + 1/3 − 1/5 − 1/7 + 1/9 + 1/11 − 1/13 − 1/15 + ...
(2) The 2π-periodic functions f, g ∈ NPC[−π, π] are given by

f(x) = π²x − x³  (−π ≤ x ≤ π),

g(x) = x  (−π < x < π).
It is given that their Fourier series are

f ∼ Σ_{n=1}^{∞} ( 12(−1)^{n+1}/n³ ) sin nx,

g ∼ Σ_{n=1}^{∞} ( 2(−1)^{n+1}/n ) sin nx.

Use these two expansions to show that

1. Σ_{n=1}^{∞} 1/n⁴ = π⁴/90,

2. Σ_{n=1}^{∞} 1/n⁶ = π⁶/945.

Use part (2) to show that

3. Σ_{n=1}^{∞} 1/(2n − 1)⁶ = π⁶/960.
Chapter 4
Diagonalization
Recall that for A ∈ M_{nn}(F), X ∈ Fⁿ (written as a column) is called an eigenvector for A if:

1. X ≠ 0,

2. AX = λX for some λ ∈ F.

λ is called the corresponding eigenvalue. Note that if X is an eigenvector for A and if k ≠ 0, then kX is an eigenvector for A, since

A(kX) = k(AX) = k(λX) = λ(kX).

Suppose X is an eigenvector for A with corresponding eigenvalue λ. Then AX = λX, and so (λI − A)X = 0 is a system of n homogeneous equations in n unknowns; this system has a non-trivial solution (X ≠ 0, by definition). Hence the matrix of coefficients must be singular, that is, det(λI − A) = 0. This is a polynomial of degree n in the variable λ. Its roots (zeros) will give the eigenvalues. This polynomial is called the characteristic polynomial of A.
4.1 Example
Let A = ( 0 1
         −1 0 ).

Then det(λI − A) = λ² + 1. If F = C then we have two complex eigenvalues i, −i. If F = R then A has no eigenvalues.
4.2 Definition
Let φ : V → V be a linear mapping of the n-dimensional vector space V. Then u ∈ V is called an eigenvector of φ if

1. u ≠ 0_V,

2. φ(u) = λu for some λ ∈ F.
Let φ : V → V be a linear mapping, and let φ be represented by A = (a_{ij}) relative to the basis v₁, v₂, ..., v_n. Let u = x₁v₁ + x₂v₂ + ... + x_nv_n ∈ V. Then

φ(u) = Σ_{i=1}^{n} x_i φ(v_i) = Σ_{i=1}^{n} x_i ( Σ_{h=1}^{n} a_{hi} v_h ) = Σ_{h=1}^{n} ( Σ_{i=1}^{n} a_{hi} x_i ) v_h.

If u is an eigenvector of φ, then

φ(u) = λu ⟺ Σ_{h=1}^{n} ( Σ_{i=1}^{n} a_{hi} x_i ) v_h = Σ_{h=1}^{n} λ x_h v_h

 ⟺ Σ_{i=1}^{n} a_{hi} x_i = λ x_h for each h

 ⟺ AX = λX,  where X = (x₁, ..., x_n)ᵀ.

Thus λ is an eigenvalue for φ iff it is an eigenvalue for the representing matrix A.
4.3 Theorem
Let A ∈ M_{nn}(F). Then A is similar to a diagonal matrix if and only if A possesses n linearly independent eigenvectors. (That is, iff Fⁿ possesses a basis consisting of eigenvectors of A.)

Proof. Suppose X₁, X₂, ..., X_n are n linearly independent eigenvectors of A. Put P = (X₁|X₂|...|X_n), so P is the matrix which has these eigenvectors as its columns. Then P is invertible and

AP = (AX₁|AX₂|...|AX_n) = (λ₁X₁|λ₂X₂|...|λ_nX_n) = (X₁|X₂|...|X_n) diag(λ₁, λ₂, ..., λ_n) ≡ PΛ.

Hence P⁻¹AP = Λ.

Conversely, if P⁻¹AP = Λ, then the columns of P must be linearly independent, and the above calculation worked backwards shows that these columns are eigenvectors of A. □
4.4 Lemma
Eigenvectors corresponding to distinct eigenvalues are linearly independent.
Proof. We prove it by induction. Let X₁, X₂, ..., X_n be eigenvectors corresponding to the distinct eigenvalues λ₁, ..., λ_n. If n = 1, then the result holds since X₁ ≠ 0, and any nonzero vector is linearly independent. Suppose n > 1 and that the result holds for fewer eigenvectors. Let a₁, ..., a_n ∈ F be such that

a₁X₁ + a₂X₂ + ... + a_nX_n = 0.   (4.1)

Apply A to this equation to get

a₁AX₁ + a₂AX₂ + ... + a_nAX_n = 0,

i.e.

a₁λ₁X₁ + a₂λ₂X₂ + ... + a_nλ_nX_n = 0.   (4.2)

Now (4.2) − λ_n(4.1) gives

a₁(λ₁ − λ_n)X₁ + a₂(λ₂ − λ_n)X₂ + ... + a_{n−1}(λ_{n−1} − λ_n)X_{n−1} = 0.

By the induction hypothesis, the vectors X₁, ..., X_{n−1} are linearly independent, so

a_i(λ_i − λ_n) = 0 for 1 ≤ i ≤ n − 1.

But λ_i ≠ λ_n for all 1 ≤ i ≤ n − 1, so a_i = 0 for all 1 ≤ i ≤ n − 1. So equation (4.1) reduces to a_nX_n = 0. However, X_n ≠ 0 because it is an eigenvector, so a_n = 0. The result follows by induction. □
4.5 Corollary
If A ∈ M_{nn}(F) has n distinct eigenvalues, then A is similar to a diagonal matrix.

Note: This is an "if..., then" statement, not an "if and only if". Thus if A has a repeated eigenvalue, then initially this tells us nothing.

Recall that A, B represent the same linear mapping if and only if B = P⁻¹AP for some invertible matrix P, i.e. if and only if A and B are similar.
4.6 Theorem
If A, B are similar, then they have the same characteristic polynomial.

Proof. Assume A, B are similar; then B = P⁻¹AP for some invertible matrix P. So

det(λI − B) = det(λI − P⁻¹AP) = det[P⁻¹(λI − A)P] = det P⁻¹ det(λI − A) det P = det(λI − A). □

Thus all matrices representing φ have the same characteristic polynomial. This we define to be the characteristic polynomial of φ.
Similarity to a matrix of simple form
The simplest type of matrix is the diagonal matrix. As we saw before, if the eigenvalues of A are distinct, then A is diagonalizable, and the proof shows that if

P⁻¹AP = diag(λ₁, λ₂, ..., λ_n),

then the columns of P are eigenvectors and λ₁, ..., λ_n are the eigenvalues.
4.7 Example
1. The matrix A = ( 4 −6 6
                   12 −14 12
                   12 −12 10 )

has characteristic polynomial (λ + 2)²(λ − 4), and with

P = ( 1 1 0
      2 1 1
      2 0 1 )

we get P⁻¹AP = diag(4, −2, −2).

2. The matrix A = ( −3 1 −1
                    −7 5 −1
                    −6 6 −2 )

also has characteristic polynomial (λ + 2)²(λ − 4), but has only two linearly independent eigenvectors:

λ = 4 gives an eigenvector (0, 1, 1)ᵀ.

λ = −2 gives an eigenvector (1, 1, 0)ᵀ.

In this case, there are only two linearly independent eigenvectors, and so this matrix is not diagonalizable.
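A short numerical check of part 1 (an illustrative sketch, not from the notes):

```python
import numpy as np

# With the stated P, P^{-1} A P should be diag(4, -2, -2).
A = np.array([[4, -6, 6], [12, -14, 12], [12, -12, 10]], float)
P = np.array([[1, 1, 0], [2, 1, 1], [2, 0, 1]], float)
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 8))  # diag(4, -2, -2)
```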
Let A ∈ M_{nn}(C) have characteristic polynomial

det(λI − A) = (λ − λ₁)^{α₁}(λ − λ₂)^{α₂}···(λ − λ_k)^{α_k},

so that A has k distinct eigenvalues λ₁, λ₂, ..., λ_k.

4.8 Definition

For 1 ≤ j ≤ k, α_j is called the algebraic multiplicity of the eigenvalue λ_j. Thus α_j is the power to which (λ − λ_j) appears in the characteristic polynomial. [If all the algebraic multiplicities are 1, then A is diagonalizable.]
4.9 Definition

For 1 ≤ j ≤ k, put

E_{λ_j} = {X ∈ Cⁿ : AX = λ_jX}.

[Note that 0 ∈ E_{λ_j}, so E_{λ_j} ≠ ∅.]

This is a subspace of Cⁿ, called the eigenspace corresponding to λ_j. Any nonzero vector in E_{λ_j} is an eigenvector for λ_j.
4.10 Definition

dim(E_{λ_j}) is called the geometric multiplicity of λ_j. (See the previous examples.)

Note that if

det(λI − A) = (λ − λ₁)^{α₁}(λ − λ₂)^{α₂}···(λ − λ_k)^{α_k}

then α₁ + α₂ + ... + α_k = n. We denote the geometric multiplicity of λ_j by γ_j and the algebraic multiplicity by α_j.
4.11 Theorem
With the above notation, the geometric multiplicity of λ_j ≤ the algebraic multiplicity of λ_j. Furthermore, if for some eigenvalue λ_j the geometric multiplicity < the algebraic multiplicity, then A is not similar to a diagonal matrix.
Proof. Let φ : Cⁿ → Cⁿ be the linear mapping given by φ(X) = AX. Then φ is represented by A relative to the standard basis. Let X₁, X₂, ..., X_{γ_j} be a basis for the eigenspace E_{λ_j}. Extend this to a basis X₁, X₂, ..., X_{γ_j}, ..., X_n for Cⁿ. Then AX_i = λ_jX_i for 1 ≤ i ≤ γ_j. So relative to this basis φ is represented by a block matrix

B = ( λ_j I_{γ_j}  C
      0            D ) = P⁻¹AP for some invertible P.

Hence

(λ − λ₁)^{α₁}···(λ − λ_j)^{α_j}···(λ − λ_k)^{α_k} = det(λI − A) = det(λI − P⁻¹AP)

 = det( (λ − λ_j)I_{γ_j}  −C
        0                 λI − D )

 = (λ − λ_j)^{γ_j} det(λI − D).

So (λ − λ_j)^{γ_j} divides the right-hand side, hence divides the left-hand side. So γ_j ≤ α_j.

Suppose γ_j < α_j, and assume that A is similar to a diagonal matrix. Then there is an invertible matrix P such that P⁻¹AP is diagonal with the eigenvalues on the diagonal, λ_j appearing α_j times. The columns of P corresponding to the λ_j's will be linearly independent eigenvectors corresponding to λ_j, and there are α_j of them. So α_j ≤ dim(E_{λ_j}) = γ_j. This contradicts γ_j < α_j. Hence A is not similar to a diagonal matrix. □
Let A ∈ M_{nn}(C), and let φ : Cⁿ → Cⁿ given by φ(X) = AX be a linear mapping. Cⁿ is an inner product space with the standard scalar product as inner product:

⟨(x₁, ..., x_n), (y₁, ..., y_n)⟩ = x₁ȳ₁ + x₂ȳ₂ + ... + x_nȳ_n = XᵀȲ,

where X = (x₁, ..., x_n)ᵀ, Y = (y₁, ..., y_n)ᵀ. For B ∈ M_{nn}(C), say B = (b_{ij}), let B̄ = (b̄_{ij}). It is easy to show that for any matrices B₁, B₂:

1. conj(B₁ + B₂) = B̄₁ + B̄₂,

2. conj(B₁B₂) = B̄₁B̄₂.
4.12 Theorem
Let v₁, ..., v_n be an orthonormal basis for the inner product space V over C. Let P = (p_{ij}) ∈ M_{nn}(C), and put

u_i = p_{1i}v₁ + p_{2i}v₂ + ... + p_{ni}v_n  (1 ≤ i ≤ n).

Then u₁, u₂, ..., u_n is an orthonormal basis for V if and only if P̄ᵀP = I_n.

Proof.

⟨u_i, u_j⟩ = δ_{ij}

⟺ ⟨p_{1i}v₁ + ... + p_{ni}v_n, p_{1j}v₁ + ... + p_{nj}v_n⟩ = δ_{ij}

⟺ p_{1i}p̄_{1j} + p_{2i}p̄_{2j} + ... + p_{ni}p̄_{nj} = δ_{ij}   (∗)

⟺ (PᵀP̄)_{ij} = δ_{ij}

⟺ PᵀP̄ = I_n

⟺ P̄ᵀP = I_n. □

Note that (∗) says that the columns of P are mutually perpendicular unit vectors, i.e. orthonormal vectors in Cⁿ.
4.13 Definition
A matrix P = (p_{ij}) ∈ M_{nn}(C) satisfying P̄ᵀP = I_n is called a unitary matrix.
4.14 Example
Let

P = ( 1/√3   i/√6   i/√2
      i/√3  −1/√6   1/√2
     −i/√3  −2/√6   0 ),

so that

P̄ᵀ = ( 1/√3  −i/√3   i/√3
       −i/√6  −1/√6  −2/√6
       −i/√2   1/√2   0 ).

Then P̄ᵀP = I₃, and so P⁻¹ = P̄ᵀ.

In a real space this condition reduces to PᵀP = I_n, which defines a real orthogonal matrix; thus P⁻¹ = Pᵀ. If P is real orthogonal, then PᵀP = I_n; taking determinants, det Pᵀ det P = det I_n, i.e. (det P)² = 1. Thus det P = ±1. P is called proper orthogonal if det P = 1 and improper orthogonal if det P = −1.
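Taking the (3, 2) entry of the matrix P in Example 4.14 to be −2/√6 (the value that makes its columns orthonormal; the printed entry is garbled, so this is an assumption), unitarity can be checked numerically:

```python
import numpy as np

# conj(P).T @ P should be the identity, so P^{-1} = conj(P).T.
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s3, 1j/s6, 1j/s2],
              [1j/s3, -1/s6, 1/s2],
              [-1j/s3, -2/s6, 0]])
print(np.round(P.conj().T @ P, 10))  # identity
```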
4.15 Example
We construct a real orthogonal matrix whose first column is a multiple of (1, 1, −2)ᵀ. Let the second column be (a, b, c)ᵀ; then (a, b, c)ᵀ ⊥ (1, 1, −2)ᵀ gives a + b − 2c = 0, and we can choose (a, b, c) = (1, −1, 0). Let the third column be (x, y, z)ᵀ. Then ⟨(x, y, z), (1, 1, −2)⟩ = x + y − 2z = 0 and ⟨(x, y, z), (1, −1, 0)⟩ = x − y = 0, so we can take (x, y, z) = (1, 1, 1). The columns are then those of

( 1  1 1
  1 −1 1
 −2  0 1 ),

and normalizing each column gives the orthogonal matrix

P = ( 1/√6   1/√2  1/√3
      1/√6  −1/√2  1/√3
     −2/√6   0     1/√3 ).

Then det P = −1, so P is improper. Interchanging two columns produces a proper orthogonal matrix:

( 1/√6  1/√3   1/√2
  1/√6  1/√3  −1/√2
 −2/√6  1/√3   0 ).
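The construction can be checked numerically (an aside, not from the notes): P is orthogonal with determinant −1, and swapping two columns flips the sign of the determinant.

```python
import numpy as np

# P from Example 4.15; Q interchanges its last two columns.
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s6, 1/s2, 1/s3],
              [1/s6, -1/s2, 1/s3],
              [-2/s6, 0, 1/s3]])
Q = P[:, [0, 2, 1]]
print(np.round(P.T @ P, 10))                              # identity
print(round(np.linalg.det(P)), round(np.linalg.det(Q)))   # -1 1
```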
4.16 Definition
H ∈ M_{nn}(C) is said to be Hermitian if H̄ᵀ = H.

Note that for H = (a_{ij}) to be Hermitian we must have a_{ji} = ā_{ij}.

4.17 Example

( 2      1 − i  i
  1 + i  −1     2
  −i     2      3 )

is Hermitian. In particular, a real symmetric matrix is an Hermitian matrix.
4.18 Theorem
The eigenvalues of an Hermitian matrix are always real.
Proof. Let H be an n × n Hermitian matrix, λ an eigenvalue, and X = (x₁, x₂, ..., x_n)ᵀ a corresponding eigenvector. Then X ≠ 0, so |x₁|² + |x₂|² + ... + |x_n|² > 0 (i.e. X̄ᵀX > 0). Now HX = λX, so

X̄ᵀHX = λX̄ᵀX.   (1)

Taking conjugates of HX = λX gives H̄X̄ = λ̄X̄; transposing, X̄ᵀH̄ᵀ = λ̄X̄ᵀ. Multiplying on the right by X,

X̄ᵀH̄ᵀX = λ̄X̄ᵀX.

However H̄ᵀ = H, so

X̄ᵀHX = λ̄X̄ᵀX.   (2)

(1) − (2) gives

0 = (λ − λ̄)X̄ᵀX,  with X̄ᵀX > 0.

So λ − λ̄ = 0 and λ = λ̄, i.e. λ is real. □
4.19 Corollary
The roots of the characteristic polynomial of a real symmetric matrix are real.
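A numerical illustration (an aside, not from the notes): the eigenvalues of the Hermitian matrix from Example 4.17 come out real.

```python
import numpy as np

# Eigenvalues of a Hermitian matrix are real (Theorem 4.18).
H = np.array([[2, 1 - 1j, 1j],
              [1 + 1j, -1, 2],
              [-1j, 2, 3]])
evals = np.linalg.eigvals(H)
print(np.max(np.abs(evals.imag)))  # numerically zero
```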
4.20 Lemma 1
Let H ∈ M_{nn}(C) be Hermitian. Then ⟨HX, Y⟩ = ⟨X, HY⟩ for any X, Y ∈ Cⁿ.

Proof.

⟨HX, Y⟩ = (HX)ᵀȲ = XᵀHᵀȲ = XᵀH̄Ȳ   (H̄ᵀ = H implies Hᵀ = H̄)

 = Xᵀ conj(HY) = ⟨X, HY⟩. □
4.21 Lemma 2
Let V be an inner product space over C (resp. R) and let φ : V → V be a linear mapping satisfying

⟨φ(u), v⟩ = ⟨u, φ(v)⟩ for all u, v ∈ V.

Let φ be represented relative to the orthonormal basis w₁, w₂, ..., w_n by H. Then H is Hermitian (resp. real symmetric).
Proof. Let H = (h_{ij}). Then

⟨φ(w_i), w_j⟩ = ⟨h_{1i}w₁ + h_{2i}w₂ + ... + h_{ni}w_n, w_j⟩ = h_{ji},

while

⟨w_i, φ(w_j)⟩ = ⟨w_i, h_{1j}w₁ + h_{2j}w₂ + ... + h_{nj}w_n⟩ = h̄_{ij}.

Since ⟨φ(w_i), w_j⟩ = ⟨w_i, φ(w_j)⟩, we get h_{ji} = h̄_{ij}. Thus H̄ᵀ = H. □
4.22 Lemma 3
Eigenvectors corresponding to distinct eigenvalues of an Hermitian matrix are orthogonal.

Proof. Let λ ≠ µ be eigenvalues of the Hermitian matrix H with corresponding eigenvectors X, Y. Then by Lemma 1, ⟨HX, Y⟩ = ⟨X, HY⟩, so

λ⟨X, Y⟩ = ⟨λX, Y⟩ = ⟨HX, Y⟩ = ⟨X, HY⟩ = ⟨X, µY⟩ = µ̄⟨X, Y⟩ = µ⟨X, Y⟩

(µ is real by Theorem 4.18). Since λ ≠ µ we have ⟨X, Y⟩ = 0, that is, X and Y are orthogonal. □
4.23 Example
Let A = ( 2 1 1
          1 2 1
          1 1 2 ).

Then det(λI − A) = (λ − 1)²(λ − 4), so the eigenvalues are λ = 4 and λ = 1.

An eigenvector corresponding to λ = 4 is (1, 1, 1)ᵀ; a unit eigenvector is (1/√3, 1/√3, 1/√3)ᵀ.

An eigenvector corresponding to λ = 1 is (1, −1, 0)ᵀ, and another eigenvector (a, b, c)ᵀ with (a, b, c)ᵀ ⊥ (1, −1, 0)ᵀ is (1, 1, −2)ᵀ. The corresponding unit eigenvectors are (1/√2, −1/√2, 0)ᵀ and (1/√6, 1/√6, −2/√6)ᵀ.

Hence

P = ( 1/√3   1/√2   1/√6
      1/√3  −1/√2   1/√6
      1/√3   0     −2/√6 )

is an orthogonal matrix (P⁻¹ = Pᵀ) and P⁻¹AP = diag(4, 1, 1).
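A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P should be orthogonal and P^T A P = diag(4, 1, 1).
A = np.array([[2, 1, 1], [1, 2, 1], [1, 1, 2]], float)
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s3, 1/s2, 1/s6],
              [1/s3, -1/s2, 1/s6],
              [1/s3, 0, -2/s6]])
print(np.round(P.T @ P, 10))      # identity
print(np.round(P.T @ A @ P, 10))  # diag(4, 1, 1)
```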
4.24 Theorem
Let V be an n-dimensional inner product space over F (where F = C or R). Let φ : V → V be a linear mapping satisfying ⟨φ(u), v⟩ = ⟨u, φ(v)⟩ for all u, v ∈ V. Then V possesses an orthonormal basis consisting of eigenvectors of φ.

Proof. The proof is by induction on n = dim(V). If n = 1, then any unit vector will be an orthonormal basis for V and also an eigenvector for φ. Suppose n > 1 and the result holds for spaces of smaller dimension equipped with a linear mapping with the given property. Let λ be an eigenvalue of φ (i.e. an eigenvalue of any matrix representing φ). By Lemma 2 these matrices include Hermitian ones, so λ is real by Theorem 4.18. Let M be the eigenspace of V corresponding to λ; then dim(M) > 0. Let w₁, ..., w_k be an orthonormal basis for M. Note that w₁, ..., w_k are eigenvectors of φ corresponding to λ.

(Recall that the orthogonal complement M⊥ of M is M⊥ = {x ∈ V : ⟨x, u⟩ = 0 for all u ∈ M}.)

By the orthogonal projection theorem, V = M ⊕ M⊥ and dim(M⊥) = dim(V) − dim(M) = n − k < n. Let ψ be the restriction of φ to M⊥ (i.e. ψ(m′) = φ(m′) for all m′ ∈ M⊥). Let v ∈ M⊥; then for any u ∈ M,

⟨u, ψ(v)⟩ = ⟨u, φ(v)⟩ = ⟨φ(u), v⟩ = ⟨λu, v⟩ = λ⟨u, v⟩ = 0.

This shows that ψ(v) ∈ M⊥; thus ψ maps M⊥ to M⊥. Now dim(M⊥) < n and ψ satisfies ⟨ψ(u), v⟩ = ⟨u, ψ(v)⟩. Hence by the induction hypothesis, M⊥ possesses an orthonormal basis consisting of eigenvectors of ψ (= φ), say w_{k+1}, ..., w_n. Then w₁, w₂, ..., w_k, w_{k+1}, ..., w_n is an orthonormal basis for V, and all its members are eigenvectors of φ. □
4.25 Corollary
Let S be a real symmetric matrix. Then there exists a real orthogonal matrix P such that P⁻¹SP = PᵀSP = Δ, a diagonal matrix.

Similarity in M_{nn}(C) is an equivalence relation, so it splits M_{nn}(C) into disjoint equivalence classes. The simplest type of matrix in any equivalence class would be a diagonal matrix. However, we have seen that there are matrices which are not similar to a diagonal matrix. We look at this case now.
Jordan Canonical Form:
Let A ∈ M_{nn}(C). Then the result is that there exists an invertible matrix P such that P⁻¹AP has the block-diagonal form

( J(λ₁)
        J(λ₂)
              ⋱
                J(λ_m) ),

where each block down the diagonal has the form (illustrated here for a 4 × 4 block)

J(λ_i) = ( λ_i 0   0   0
           1   λ_i 0   0
           0   1   λ_i 0
           0   0   1   λ_i ).

For example:

( 3 0 0
  0 2 0
  0 1 2 ),  ( 2 0 0 0
              1 2 0 0
              0 0 3 0
              0 0 1 3 ),  ( 1 0 0
                            0 2 0
                            0 0 3 ),  ( 3 0 0
                                        1 3 0
                                        0 1 3 ),  ( 3 0 0
                                                    0 3 0
                                                    0 1 3 ).
Cayley–Hamilton Theorem:

If det(xI − A) = xⁿ + a_{n−1}xⁿ⁻¹ + ... + a₁x + a₀, then Aⁿ + a_{n−1}Aⁿ⁻¹ + ... + a₁A + a₀I = 0. (That is, every matrix satisfies its own characteristic equation.) Thus if

det(xI − A) = (x − λ₁)(x − λ₂)···(x − λ_n)

then

(A − λ₁I)(A − λ₂I)···(A − λ_nI) = 0.

The order of the factors is irrelevant, since powers of A commute.
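A quick numerical instance (an aside, not from the notes), using the fact that for a 2 × 2 matrix the characteristic polynomial is x² − (tr A)x + det A:

```python
import numpy as np

# Cayley-Hamilton: A satisfies its own characteristic equation.
A = np.array([[3, 1], [2, 2]], float)  # char poly: x^2 - 5x + 4
residual = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(np.round(residual, 10))  # zero matrix
```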
4.26 Lemma
Let A ∈ M_{nn}(C), and let X be a column vector in Cⁿ such that AᵏX = 0 but Aᵏ⁻¹X ≠ 0. Then X, AX, A²X, ..., Aᵏ⁻¹X are linearly independent in Cⁿ.

Proof. Let a₀, a₁, ..., a_{k−1} be scalars such that

a₀X + a₁AX + a₂A²X + ... + a_{k−1}Aᵏ⁻¹X = 0.   (4.3)

Multiplying on the left by Aᵏ⁻¹ gives

a₀Aᵏ⁻¹X + a₁AᵏX + a₂Aᵏ⁺¹X + ... + a_{k−1}A²ᵏ⁻²X = 0.

Since AᵏX = 0 (and hence AᵐX = 0 for all m ≥ k), this reduces to a₀Aᵏ⁻¹X = 0. Since Aᵏ⁻¹X ≠ 0, we have a₀ = 0. So equation (4.3) becomes

a₁AX + a₂A²X + ... + a_{k−1}Aᵏ⁻¹X = 0.

Applying Aᵏ⁻² gives a₁ = 0. Repeating the process gives a₂ = ... = a_{k−1} = 0. □
Jordan form for a 3 × 3 matrix

(I) A has eigenvalues λ, µ, ν, all different. Then there is an invertible matrix P such that

P⁻¹AP = ( λ 0 0
          0 µ 0
          0 0 ν ).
(II) (A) A has eigenvalues λ, µ, µ with λ ≠ µ and dim E_µ = 2. Then there is an invertible matrix P such that

P⁻¹AP = ( λ 0 0
          0 µ 0
          0 0 µ ).

(B) A has eigenvalues λ, µ, µ with λ ≠ µ and dim E_µ = 1. Then the characteristic polynomial of A is (x − λ)(x − µ)², so by the Cayley–Hamilton Theorem,

(A − λI)(A − µI)² = 0.

Next, we will see what P⁻¹AP looks like.

Let φ : C³ → C³ be defined by φ(X) = (A − λI)X. Now ker φ = E_λ and dim ker φ = 1, so dim(Im φ) = 2. Since dim E_µ = 1, we must be able to find Y ∈ Im(φ) \ E_µ, i.e. Y = (A − λI)Z for some Z ∈ C³ with (A − µI)Y ≠ 0. Therefore Y ≠ 0 and (A − µI)Y ≠ 0, but

(A − µI)²Y = (A − µI)²(A − λI)Z = 0·Z = 0.

By the Lemma, Y and (A − µI)Y are linearly independent. Let X₁ be an eigenvector corresponding to λ. Put X₂ = Y and X₃ = (A − µI)Y.

Claim: X₁, X₂, X₃ are linearly independent.

Assume a₁X₁ + a₂X₂ + a₃X₃ = 0. Applying (A − µI)² to this equation gives

a₁(A − µI)²X₁ + a₂(A − µI)²X₂ + a₃(A − µI)²X₃ = 0
a₁(A − µI)²X₁ + a₂·0 + a₃·0 = 0
a₁(A² − 2µA + µ²I)X₁ = 0
a₁(λ² − 2µλ + µ²)X₁ = 0
a₁(λ − µ)²X₁ = 0.

But (λ − µ)² ≠ 0 and X₁ ≠ 0, so a₁ = 0. The equation reduces to a₂X₂ + a₃X₃ = 0; but X₂, X₃ are known to be linearly independent, so a₂ = a₃ = 0.

We have AX₁ = λX₁; (A − µI)X₂ = X₃, so AX₂ = µX₂ + X₃; and (A − µI)X₃ = 0, so AX₃ = µX₃. Put P = (X₁|X₂|X₃). Then

AP = (AX₁|AX₂|AX₃) = (λX₁ | µX₂ + X₃ | µX₃) = (X₁|X₂|X₃) ( λ 0 0
                                                            0 µ 0
                                                            0 1 µ ).

So P⁻¹AP = ( λ 0 0
             0 µ 0
             0 1 µ ).

Method:

(a) Choose X₁ to be an eigenvector for λ.

(b) Choose Z such that (A − µI)(A − λI)Z ≠ 0.

(c) Put X₂ = (A − λI)Z, X₃ = (A − µI)X₂.
4.27 Example
A = ( −3 1 −1
      −7 5 −1
      −6 6 −2 ),  det(xI − A) = (x − 4)(x + 2)².

For x = −2 the only eigenvector (up to scalar multiples) is (1, 1, 0)ᵀ, so A is NOT diagonalizable.

So let λ = 4, µ = −2. Then

(A − 4I) = ( −7 1 −1
             −7 1 −1
             −6 6 −6 ),  (A + 2I) = ( −1 1 −1
                                      −7 7 −1
                                      −6 6  0 ).

When λ = 4 an eigenvector is X₁ = (0, 1, 1)ᵀ. Take Z = (1, 0, 0)ᵀ; then

X₂ = (A − 4I)Z = (−7, −7, −6)ᵀ  and  X₃ = (A + 2I)X₂ = (6, 6, 0)ᵀ.

So

P = ( 0 −7 6
      1 −7 6
      1 −6 0 )  and  P⁻¹AP = ( 4 0 0
                               0 −2 0
                               0 1 −2 ).
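A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P^{-1} A P should be the stated Jordan form.
A = np.array([[-3, 1, -1], [-7, 5, -1], [-6, 6, -2]], float)
P = np.array([[0, -7, 6], [1, -7, 6], [1, -6, 0]], float)
J = np.linalg.inv(P) @ A @ P
print(np.round(J, 8))  # [[4,0,0],[0,-2,0],[0,1,-2]]
```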
(III) det(xI − A) = (x − µ)³.

By the Cayley–Hamilton Theorem, (A − µI)³ = 0.

If (A − µI) = 0, then A = µI and is already in canonical form. So we may assume (A − µI) ≠ 0. This splits into two cases:

(A) (A − µI)² ≠ 0. Choose X with (A − µI)²X ≠ 0. By the Lemma, X, (A − µI)X, (A − µI)²X are linearly independent. Put

X₁ = X,
X₂ = (A − µI)X,
X₃ = (A − µI)²X.

So (A − µI)X₁ = X₂ ⟹ AX₁ = µX₁ + X₂;
(A − µI)X₂ = X₃ ⟹ AX₂ = µX₂ + X₃;
(A − µI)X₃ = (A − µI)³X = 0 ⟹ AX₃ = µX₃.

Put P = (X₁|X₂|X₃); then

P⁻¹AP = ( µ 0 0
          1 µ 0
          0 1 µ ).
4.28 Example
A = ( −2 1 1
      −5 3 2
      −4 1 2 ).

Then det(xI − A) = (x − 1)³ and

(A − µI) = (A − I) = ( −3 1 1
                       −5 2 2
                       −4 1 1 ),  (A − I)² = ( 0 0 0
                                               −3 1 1
                                               3 −1 −1 ) ≠ 0.

So let X₁ = (0, 1, 0)ᵀ, X₂ = (A − I)X₁ = (1, 2, 1)ᵀ, X₃ = (A − I)²X₁ = (0, 1, −1)ᵀ. Hence

P = ( 0 1 0
      1 2 1
      0 1 −1 ),  P⁻¹AP = ( 1 0 0
                           1 1 0
                           0 1 1 ).
(B) (A − µI) ≠ 0 while (A − µI)² = 0. Define φ : C³ → C³ by φ(X) = (A − µI)X. Then ker φ = E_µ. For X ∈ C³,

(A − µI)((A − µI)X) = (A − µI)²X = 0.

Hence Im(φ) ⊆ ker φ. Now

dim(Im(φ)) > 0,
dim(Im(φ)) ≤ dim ker φ,
dim(Im(φ)) + dim ker φ = 3.

The only way to satisfy these is to have dim(Im(φ)) = 1 and dim ker φ = 2, i.e. dim E_µ = 2. Choose:

i. X₁ so that (A − µI)X₁ ≠ 0,

ii. X₂ = (A − µI)X₁,

iii. X₃ such that X₂, X₃ is a basis for E_µ.

Note that (A − µI)X₂ = (A − µI)²X₁ = 0, so X₂ is an eigenvector for µ.

Claim: X₁, X₂, X₃ are linearly independent. Let a₁X₁ + a₂X₂ + a₃X₃ = 0. Applying (A − µI):

a₁(A − µI)X₁ + a₂·0 + a₃·0 = 0,

i.e. a₁X₂ = 0; X₂ ≠ 0, so a₁ = 0. Thus a₂X₂ + a₃X₃ = 0, which implies a₂ = a₃ = 0 since X₂, X₃ are a basis for E_µ. Put P = (X₁|X₂|X₃); then P is invertible and

AX₁ = µX₁ + X₂,
AX₂ = µX₂,
AX₃ = µX₃.
So,

P⁻¹AP = ( µ 0 0
          1 µ 0
          0 0 µ ).
4.29 Example
A = ( 0 1 0
      −4 4 0
      0 0 2 ).

Then det(xI − A) = (x − 2)³ and

(A − 2I) = ( −2 1 0
             −4 2 0
             0 0 0 ),  (A − 2I)² = 0₃₃.

Put X₁ = (0, 1, 0)ᵀ, X₂ = (A − 2I)X₁ = (1, 2, 0)ᵀ, and take X₃ = (0, 0, 1)ᵀ as an eigenvector for µ = 2. Then

P = ( 0 1 0
      1 2 0
      0 0 1 ),  P⁻¹AP = ( 2 0 0
                          1 2 0
                          0 0 2 ).
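A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P^{-1} A P should be the stated Jordan form.
A = np.array([[0, 1, 0], [-4, 4, 0], [0, 0, 2]], float)
P = np.array([[0, 1, 0], [1, 2, 0], [0, 0, 1]], float)
J = np.linalg.inv(P) @ A @ P
print(np.round(J, 8))  # [[2,0,0],[1,2,0],[0,0,2]]
```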
Exercises
Set18.
(1) Let A ∈ M_{nn}(F) have characteristic polynomial

det(λI − A) = λⁿ + b_{n−1}λⁿ⁻¹ + ... + b₁λ + b₀.

Express det A in terms of b_{n−1}, b_{n−2}, ..., b₀. Deduce that 0 is an eigenvalue of A if and only if A is singular.
(2) Determine whether either of the following matrices is similar to a diagonal matrix. In the case when one has this property, find an invertible matrix P such that P⁻¹AP = Λ:

( 1 −3 3
  3 −5 3
  6 −6 4 ),  ( 0  1  0
               0  0  1
               16 12 0 ).

Deduce that matrices which have the same characteristic polynomial are not necessarily similar.
(3) Find 8 different diagonal matrices D_i ∈ M₃₃(R) (1 ≤ i ≤ 8) such that

D_i² + D_i = ( 0 0 0
               0 2 0
               0 0 6 ).

It is given that the matrix A ∈ M₃₃(R) has eigenvalues 0, 2, 6. Prove that there are at least 8 different solutions (for X) of the matrix equation

X² + X = A.
(4) Let A = ( 3 2
             1 2 ) ∈ M₂₂(R).

1. Find the complete solution of the simultaneous differential equations:

dy₁/dx = 3y₁ + 2y₂
dy₂/dx = y₁ + 2y₂

2. Find B ∈ M₂₂(R) such that B² = A.
Set19.
(1) Let A,B be n× n real orthogonal matrices. Verify that B(A+B)TA = A+B.
By taking B to be I or −I (both of which are real orthogonal matrices), show that
1. if A is improper, then −1 is an eigenvalue of A,
2. if A is proper and n is odd, then 1 is an eigenvalue of A,
3. if A is improper and n is even, then 1 is an eigenvalue of A.
(2) S ∈ M_{nn}(C) is called skew-Hermitian if S̄ᵀ = −S. Write down a 3 × 3 skew-Hermitian matrix not all of whose entries are real.

Prove that the eigenvalues of a skew-Hermitian matrix all have the form ib where b ∈ R.
(3) 1. Find a unitary matrix U such that ŪᵀBU is diagonal, where B = ( 3      1 − i
                                                                       1 + i  2 ).

2. Find a real orthogonal matrix P such that PᵀAP is a diagonal matrix, where

A = ( 2 −4 2
      −4 2 −2
      2 −2 −1 ).
Set20.
(1) For each of the following matrices, determine its Jordan canonical form. In each case, find a non-singular matrix P such that P⁻¹AP is in Jordan form:

1. A = ( −4 2 −1
         −1 −1 −1
         −2 3 −4 )

2. B = ( 1 2 2
         2 1 2
         2 2 1 )

3. C = ( 0 0 −1
         −2 2 −1
         −8 9 −2 )

4. D = ( −4 3 −1
         −2 1 −1
         −2 3 −3 )
(2) Find the general solution, on any interval containing 0, of the following system of simultaneous differential equations:

y₁′ = 2y₁ + y₃
y₂′ = −y₁ + 2y₂ + 2y₃
y₃′ = −y₁ − y₂ + 5y₃

Find the particular solution for which y₁(0) = y₂(0) = 1, y₃(0) = 0.