Chapter 1
Vector Spaces
Let C be the field of complex numbers, R the field of real numbers, and Q the field of rational numbers.
For the rest of this discussion, we will use the symbol F to denote any of these fields.
1.1 Vector Spaces
1.1.1 Definition
Let V be a nonempty set whose elements are called vectors. Suppose that a law of composition denoted by "+" is defined on V (i.e. u + v ∈ V whenever u, v ∈ V). Let F denote a field of scalars. Suppose that with each α ∈ F and each v ∈ V there is associated a scalar multiple αv, where αv ∈ V. Then V is said to be a vector space over F if:
1. u+ (v + w) = (u+ v) + w for all u, v, w ∈ V.
2. u+ v = v + u for all u, v ∈ V.
3. There is a vector called the zero vector, 0v, such that u + 0v = 0v + u = u for all u ∈ V.
4. Given u ∈ V, there is −u ∈ V such that u+ (−u) = (−u) + u = 0v.
These are the axioms of an Abelian group.
5. α(u+ v) = αu+ αv for all α ∈ F and u, v ∈ V.
6. (α + β)u = αu+ βu for all α, β ∈ F and u ∈ V.
7. α(βu) = (αβ)u for all α, β ∈ F and u ∈ V.
8. 1F·u = u for all u ∈ V, where 1F is the multiplicative identity of F.
1.1.2 Examples
1. The Euclidean vector space Rn = {(a1, ..., an) : ai ∈ R}, where addition and scalar multiplication are defined by:
(a1, ..., an) + (b1, ..., bn) = (a1 + b1, ..., an + bn) and α(a1, ..., an) = (αa1, ..., αan).
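As a quick illustration (an added sketch, not part of the original notes), these componentwise operations can be written directly in Python, with tuples standing in for vectors in Rn:

```python
# Componentwise addition and scalar multiplication in R^n,
# modelling vectors as Python tuples (illustrative sketch only).
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    return tuple(alpha * a for a in u)

u, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
print(add(u, v))      # (5.0, 7.0, 9.0)
print(scale(2.0, u))  # (2.0, 4.0, 6.0)
```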
2. The vector space C[a, b] of all continuous functions on [a, b], under addition and scalar multiplication given by:
For f, g ∈ C[a, b] and α ∈ R, f + g and αf are given by the rules (f + g)(x) = f(x) + g(x) and (αf)(x) = α(f(x)).
3. Example 1 can be generalized to Fn in the same manner.
4. The vector space Mmn(F) of all m × n matrices over the field F under the usual addition and scalar multiplication of matrices.
5. Let F[x] denote the set of all polynomials in x with coefficients in F, i.e. the set of all

a0 + a1x + a2x^2 + ... + akx^k + ...

where at most a finite number of the coefficients are nonzero. Then F[x] is a vector space under the usual addition of polynomials and multiplication by scalars.
6. Let C1[a, b] denote the set of all real-valued functions f defined on the closed interval [a, b] which have the properties:
(a) f is differentiable at each point of [a, b] and
(b) f ′ is continuous on [a, b].
Then C1[a, b] is a vector space over R. In general, Cn[a, b] is the vector space of all functions f such that
(a) f is n-times differentiable at each point of [a, b] and
(b) f^(n) is continuous on [a, b].
Note that C[a, b] ⊇ C1[a, b] ⊇ C2[a, b] ⊇ ....
1.1.3 Theorem
Let V be a vector space over F.
1. α0v = 0v for any α ∈ F.
2. 0Fu = 0v for all u ∈ V.
3. α(u1 + u2 + ...+ un) = αu1 + αu2 + ...+ αun where α ∈ F and ui ∈ V.
4. (α1 + α2 + ...+ αn)u = α1u+ α2u+ ...+ αnu where αi ∈ F and u ∈ V.
5. −(−u) = u for all u ∈ V.
6. (−1)u = −u for all u ∈ V.
1.2 Subspaces
1.2.1 Definition
Let V be a vector space over F and M ⊆ V. Then M is called a subspace of V if M is a vector spaceover F using the same addition and scalar multiplication as in V.
Theorem 1.2.1 (Subspace criterion) A nonempty subset M ⊆ V is a subspace of V iff
1. m1 +m2 ∈M for all m1,m2 ∈M. (M is closed under addition)
2. αm ∈M whenever m ∈M and α ∈ F. (M is closed under scalar multiplication)
Note that the zero vector of M is 0v, the same as the zero vector of V.
1.2.2 Examples
1. Let M = {(a, a) : a ∈ R}. Then M is a subspace of R2.
2. {0v} and V are subspaces of any vector space V.
3. C1[a, b] is a subspace of C[a, b].
4. M = {(x, y, z) ∈ R3 : x+ y = z} is a subspace of R3.
5. M = {(x, y, z) ∈ R3 : y + z = 1} is not a subspace of R3. Note that 0v = (0, 0, 0) /∈ M.
Note that conditions (1), (2) of the subspace criterion can be replaced by a single condition:
αu+ βv ∈M
whenever u, v ∈M and α, β ∈ F.
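As a numerical sanity check (an added sketch with sample vectors chosen for illustration, not a proof), the single condition αu + βv ∈ M can be tested for the subspace M = {(x, y, z) ∈ R3 : x + y = z} from Example 4:

```python
# Check the closure condition alpha*u + beta*v in M for
# M = {(x, y, z) in R^3 : x + y = z}; illustrative sketch only.
def in_M(w):
    x, y, z = w
    return x + y == z

def comb(alpha, u, beta, v):
    return tuple(alpha * a + beta * b for a, b in zip(u, v))

u, v = (1, 2, 3), (4, -1, 3)          # both satisfy x + y = z
assert in_M(u) and in_M(v)
for alpha, beta in [(2, 3), (-1, 5), (0, 0)]:
    assert in_M(comb(alpha, u, beta, v))
print("closure holds for the sampled combinations")
```

Of course, a finite sample only illustrates the criterion; the proof is the one-line algebraic check that (x1 + x2) + (y1 + y2) = z1 + z2.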
1.2.3 Theorem
The intersection of any family of subspaces of a vector space V is a subspace of V. (That is, if (Mi)i∈I is a family of subspaces of V, then ∩i∈I Mi is a subspace of V.)
Note: The union of subspaces is not necessarily a subspace. It is easy to see that M1 = {(a, 0) : a ∈ R} and M2 = {(0, b) : b ∈ R} are two subspaces of R2,
but M1 ∪ M2 is not a subspace since (1, 0), (0, 1) ∈ M1 ∪ M2 but (1, 0) + (0, 1) = (1, 1) /∈ M1 ∪ M2.
1.2.4 Definition
Let v1, v2, ..., vk be a finite set of vectors in a vector space V over F. An element of V of the form α1v1 + ... + αkvk is called a linear combination of the vectors v1, v2, ..., vk.
1.2.5 Theorem
Let V be a vector space over F and v1, v2, ..., vk ∈ V. Let M consist of all linear combinations of v1, v2, ..., vk, i.e. M = {α1v1 + ... + αkvk : αi ∈ F}. Then
1. M is a subspace of V.
2. M is the smallest subspace of V that contains v1, ..., vk.
3. M is the intersection of all subspaces of V that contain v1, ..., vk.
1.2.6 Definition
The subspace M is called the subspace spanned by v1, v2, ..., vk (or the subspace generated by v1, v2, ..., vk), and we write M = sp{v1, v2, ..., vk} or M = [v1, v2, ..., vk].
1.2.7 Examples
1. Let v1 = (1, 2), v2 = (2, 1) ∈ R2. Then
M = [v1, v2] = {a(1, 2) + b(2, 1) : a, b ∈ R} = {(a+ 2b, 2a+ b) : a, b ∈ R} = R2. (Prove it.)
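One way to carry out the requested verification (an added sketch using sympy, not the notes' own argument) is to check that the matrix with columns v1, v2 is invertible, so every vector of R2 is a combination of them:

```python
# v1 = (1, 2), v2 = (2, 1) span R^2 iff the matrix [v1 v2] is invertible.
from sympy import Matrix

M = Matrix([[1, 2],
            [2, 1]])   # columns are v1, v2
print(M.det())   # -3, nonzero, so [v1, v2] = R^2
```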
2. Consider all functions in C1(R) which satisfy the differential equation:

d²y/dx² − 3 dy/dx + 2y = 0

This equation has general solution y = Ae^(2x) + Be^x. This says that the solutions form a subspace of C1(R), namely [e^x, e^(2x)].
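This can be checked symbolically (an added sketch assuming sympy is available): each basis function satisfies the equation, and linearity then gives every combination Ae^(2x) + Be^x.

```python
# Check with sympy that e^x and e^(2x) satisfy y'' - 3y' + 2y = 0,
# so every combination A*e^(2x) + B*e^x is a solution (sketch only).
from sympy import symbols, exp, diff, simplify

x = symbols('x')
for y in (exp(x), exp(2*x)):
    residual = diff(y, x, 2) - 3*diff(y, x) + 2*y
    assert simplify(residual) == 0
print("both basis solutions satisfy the equation")
```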
We are now able to define the subspace generated by a (not necessarily finite) subset of V.
1.2.8 Definition
Let S be a subset of a vector space V over a field F. We define the subspace generated by S, denoted by [S] or sp(S), to be the intersection of all subspaces that contain S.
Let H be the set of all linear combinations of all finite subsets of S. Then H = [S].
1.2.9 Examples
1. Let V be the vector space of all infinite sequences of real numbers, i.e. V = {(a1, a2, ..., ak, ...) : ai ∈ R}. For each i ≥ 1, let ei = (0, 0, 0, ..., 0, 1, 0, ...), which has 1 in the ith place and 0 elsewhere. Then [e1, e2, e3, ...] is the subspace of V that consists of all linear combinations of all finite subsets of {e1, e2, e3, ...}. These are exactly the vectors with at most a finite number of nonzero entries. Thus, [e1, e2, e3, ...] ⊂ V.
2. Let S = ∅. Then ∅ is a subset of every subspace of V. What is [∅]? Let M = [∅]. Since {0v} is a subspace that is contained in every subspace, {0v} ⊆ M. Conversely, {0v} is a subspace that contains ∅, so M ⊆ {0v}. Therefore, [∅] = {0v}.
1.2.10 Definition
Let M1,M2, ...,Mk be subspaces of a vector space V. We define the sum of M1,M2, ...,Mk as
M1 +M2 + ...+Mk = {m1 +m2 + ...+mk : mi ∈Mi for all i = 1, 2, ..., k}
1.2.11 Theorem
With the above notations, M1 +M2 + ...+Mk is a subspace of V.
1.2.12 Examples
(a) Let M1 = [(1, 1, 0)] = {(a, a, 0) : a ∈ R} and M2 = [(1, 0, 1)] = {(b, 0, b) : b ∈ R}. Then M1 + M2 = {(a + b, a, b) : a, b ∈ R}. This is the set of all vectors of the form (x, y, z) such that x = y + z, which is the equation of a plane.
(b) Let M1 = [(1, 0, 0)], M2 = [(0, 1, 0)], M3 = [(0, 0, 1)]. Then M1 + M2 + M3 = R3.
NOTES:
(a) Mi ⊆ M1 + M2 + ... + Mk for all i = 1, 2, ..., k, i.e. each Mi is a subspace of the sum.
(b) If m ∈ M1 + M2 + ... + Mk, it need not be true that m ∈ M1 or m ∈ M2 or ... or m ∈ Mk. See the last example.
(c) Let u1, u2, ..., ur and v1, v2, ..., vs be two finite subsets of V. Then
[u1, u2, ..., ur] + [v1, v2, ..., vs] = [u1, u2, ..., ur, v1, v2, ..., vs].
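Note (c) means the sum of two spans is computed by pooling the generators. As an added sketch with sympy (illustrating Example (a) above, where M1 = [(1, 1, 0)] and M2 = [(1, 0, 1)]):

```python
# The sum M1 + M2 of two spans is spanned by the generators of both,
# illustrating [u1,...,ur] + [v1,...,vs] = [u1,...,ur,v1,...,vs].
from sympy import Matrix

M1_gen = Matrix([1, 1, 0])
M2_gen = Matrix([1, 0, 1])
S = Matrix.hstack(M1_gen, M2_gen)   # columns generate M1 + M2
print(S.rank())   # 2: M1 + M2 is a plane in R^3
```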
1.3 Linear Dependence and Independence
1.3.1 Definition
Let v1, v2, ..., vk be a finite set of vectors in a vector space V over F. We say v1, v2, ..., vk are linearly dependent in V if there are scalars α1, α2, ..., αk, not all zero, such that α1v1 + α2v2 + ... + αkvk = 0v.
If v1, v2, ..., vk are not linearly dependent, they are called linearly independent. This meansthat whenever α1v1 + α2v2 + ...+ αkvk = 0v, then α1 = α2 = ... = αk = 0.
1.3.2 Examples
i. e1, e2, e3 are L.I. in R3.
ii. The set of vectors (3, 0, −3), (−1, 1, 2), (4, 2, −2), (2, 1, 1) is linearly dependent since 2(3, 0, −3) + 2(−1, 1, 2) − (4, 2, −2) + 0(2, 1, 1) = (0, 0, 0).
iii. Let v1 = (1, 2, −1, 3), v2 = (2, −2, 1, −1), v3 = (1, 8, −4, 10), v4 = (5, −2, 1, 1). Writing these vectors as the columns of a matrix and row reducing:

[  1  2  1  5 ]           [ 1 0  3  1 ]
[  2 −2  8 −2 ]  −−RREF→  [ 0 1 −1  2 ]
[ −1  1 −4  1 ]           [ 0 0  0  0 ]
[  3 −1 10  1 ]           [ 0 0  0  0 ]

Hence, v3 = 3v1 − v2 and v4 = v1 + 2v2. So, v1 + 2v2 + 0v3 − v4 = 0v. Therefore, v1, v2, v3, v4 are linearly dependent.
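The row reduction in this example can be reproduced with sympy (an added sketch, assuming sympy is available): the RREF of the matrix whose columns are v1, ..., v4 exposes the dependence relations among the columns.

```python
# RREF of the matrix whose columns are v1, v2, v3, v4; the non-pivot
# columns of the RREF give the dependence relations (sketch only).
from sympy import Matrix

A = Matrix([[1, 2, 1, 5],
            [2, -2, 8, -2],
            [-1, 1, -4, 1],
            [3, -1, 10, 1]])
R, pivots = A.rref()
print(R)        # rows (1, 0, 3, 1), (0, 1, -1, 2) and two zero rows
print(pivots)   # (0, 1): only the first two columns are pivot columns
```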
iv. A. Any subset of a linearly independent set is linearly independent.
B. If a subset of v1, v2, ..., vk is linearly dependent, then v1, v2, ..., vk is linearly dependent.
C. Any finite set of vectors that contains the zero vector is linearly dependent.
D. Two vectors v1, v2 are linearly dependent iff one of them is a multiple of the otherone.
Earlier in this chapter, we defined the vector spaces C[a, b] and Cn[a, b]. We extend this definition to any interval (a, b), (a, ∞), [a, ∞), etc. So if I is any interval, we have

C(I) ⊇ C1(I) ⊇ ... ⊇ Cn(I) ⊇ ...

These are real vector spaces with pointwise addition and scalar multiplication. The zero vector is the zero function 0fn, defined by 0fn(x) = 0 for all x ∈ I.
1.3.3 Examples
1. 1, x, x2 in C(R) are L.I.
Let a·1 + bx + cx2 = 0 for all x ∈ R.
x = 0 ⇒ a = 0
x = 1 ⇒ b+ c = 0
x = −1 ⇒ −b+ c = 0
Therefore, a = b = c = 0. Hence 1, x, x2 are L.I.
2. 1, sin2 x, cos 2x are L.D. since 1·1 − 2·sin2 x − 1·cos 2x = 0 for all x ∈ R. Therefore, these functions are L.D. in the space C(R).
1.3.4 Theorem
Let y1(x), y2(x), ..., yn(x) be n functions in Cn−1[a, b]. If there is x0 ∈ [a, b] such that

det [ y1(x0)        y2(x0)        ...  yn(x0)
      y′1(x0)       y′2(x0)       ...  y′n(x0)
      y′′1(x0)      y′′2(x0)      ...  y′′n(x0)
      ...
      y1^(n−1)(x0)  y2^(n−1)(x0)  ...  yn^(n−1)(x0) ]  ≠ 0
Then y1(x), y2(x), ..., yn(x) are L.I. in Cn−1[a, b].
Proof. Let λ1, λ2, ..., λn be scalars such that

λ1y1(x) + λ2y2(x) + ... + λnyn(x) = 0fn

Differentiating this equation n − 1 times gives:

λ1y′1(x) + λ2y′2(x) + ... + λny′n(x) = 0fn
λ1y′′1(x) + λ2y′′2(x) + ... + λny′′n(x) = 0fn
...
λ1y1^(n−1)(x) + λ2y2^(n−1)(x) + ... + λnyn^(n−1)(x) = 0fn
Evaluating these equations at x = x0 gives

λ1y1(x0) + λ2y2(x0) + ... + λnyn(x0) = 0
λ1y′1(x0) + λ2y′2(x0) + ... + λny′n(x0) = 0
λ1y′′1(x0) + λ2y′′2(x0) + ... + λny′′n(x0) = 0
...
λ1y1^(n−1)(x0) + λ2y2^(n−1)(x0) + ... + λnyn^(n−1)(x0) = 0
Write

A = [ y1(x0)        ...  yn(x0)
      ...
      y1^(n−1)(x0)  ...  yn^(n−1)(x0) ]

Thus the last equations can be written as:

A (λ1, λ2, ..., λn)^T = (0, 0, ..., 0)^T

We are given that det(A) ≠ 0, so A−1 exists. Multiplying by A−1 gives

(λ1, λ2, ..., λn)^T = A−1 (0, 0, ..., 0)^T = (0, 0, ..., 0)^T

Hence λ1 = λ2 = ... = λn = 0, and so y1, y2, ..., yn are L.I. □
1.3.5 Definition
Let the situation be as described above. Then
det [ y1(x)        y2(x)        ...  yn(x)
      y′1(x)       y′2(x)       ...  y′n(x)
      y′′1(x)      y′′2(x)      ...  y′′n(x)
      ...
      y1^(n−1)(x)  y2^(n−1)(x)  ...  yn^(n−1)(x) ]

is called the WRONSKIAN of y1, y2, ..., yn at x and is denoted by W(y1, y2, ..., yn)(x).
The theorem states: if we can find x0 ∈ I such that W(y1, y2, ..., yn)(x0) ≠ 0, then y1, y2, ..., yn are L.I. in C(n−1)(I). This is not an "if and only if".
1.3.6 Examples
Decide whether each of the following sets are L.I or L.D in the indicated space.
1. x, sin x in C1(R).
2. ex, xex, x2ex in C2(I) where I is any interval.
3. 1, x, x2, ..., xn−1 in C(n−1)(I) where I is any interval.
4. y1(x) = x3, y2(x) = |x|3 in C[−1, 1]. For these functions, W(y1, y2)(x) = 0 for all x ∈ [−1, 1], but they are L.I. For the Wronskian, consider the three cases x > 0, x = 0, x < 0.
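The Wronskian test in these exercises can be checked symbolically (an added sketch using sympy's `wronskian` helper). For the set in item 2, the Wronskian is nonzero everywhere, so the functions are L.I. on any interval:

```python
# Wronskian of e^x, x e^x, x^2 e^x with sympy; nonzero, so the
# functions are linearly independent (illustrative sketch only).
from sympy import symbols, exp, simplify, wronskian

x = symbols('x')
W = simplify(wronskian([exp(x), x*exp(x), x**2*exp(x)], x))
print(W)   # simplifies to 2*exp(3*x), which is nonzero for every x
```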
1.4 Basis and Dimension
1.4.1 Definition
Let V be a vector space over a field F. The finite set {v1, v2, ..., vn} is said to form a basis for V if
(a) {v1, v2, ..., vn} is a linearly independent set.
(b) sp{v1, v2, ..., vn} = V.
1.4.2 Examples
1. e1, e2, ..., en form a basis for Fn.
2. v1 = (1, 2, 1), v2 = (0, 1, 2), v3 = (2, 1, 1) form a basis for R3.
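Example 2 can be verified with a determinant (an added sketch, assuming sympy): three vectors form a basis for R3 exactly when the matrix having them as rows (or columns) is invertible.

```python
# v1, v2, v3 form a basis for R^3 iff this matrix has nonzero determinant.
from sympy import Matrix

M = Matrix([[1, 2, 1],
            [0, 1, 2],
            [2, 1, 1]])   # rows are v1, v2, v3
print(M.det())   # 5, nonzero, so v1, v2, v3 form a basis
```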
1.4.3 Theorem
Let v1, v2, ..., vn be linearly independent in V. Let v ∈ [v1, v2, ..., vn]. Then v can be written uniquely in the form v = α1v1 + α2v2 + ... + αnvn.
1.4.4 Corollary
Let v1, v2, ..., vn be a basis for V. Then each v ∈ V can be written uniquely as a linear combination of v1, v2, ..., vn.
Let u ∈ V, then there are unique scalars α1, ..., αn such that u = α1v1 + α2v2 + ...+ αnvn. Thescalars α1, ..., αn are called the coordinates of u relative to the basis v1, ..., vn.
Let v1, ..., vn be a basis for the vector space V and let u1, u2, ..., uk be any k vectors in V. Then for each 1 ≤ j ≤ k there are unique scalars aij such that

uj = a1jv1 + a2jv2 + ... + anjvn,  1 ≤ j ≤ k

Put

A = [ a11 ... a1j ... a1k
      a21 ... a2j ... a2k
      ...
      an1 ... anj ... ank ]

so A ∈ Mnk(F).
1.4.5 Theorem
Let the situation be as above. Let λ1, λ2, ..., λk be scalars. Then

λ1u1 + λ2u2 + ... + λkuk = 0v

⇐⇒  λ1 (a11, ..., an1)^T + λ2 (a12, ..., an2)^T + ... + λk (a1k, ..., ank)^T = (0, ..., 0)^T

⇐⇒  A (λ1, ..., λk)^T = (0, ..., 0)^T
This theorem says there is a dependence relation between the vectors u1, u2, ..., uk iff there is exactly the same dependence relation between their columns of coordinates.
1.4.6 Corollary
Let v1, v2, ..., vn be a basis for V and let u1, u2, ..., uk be linearly independent in V. Then k ≤ n.
1.4.7 Corollary
Let V be a vector space over F. Then any two bases for V have the same number of elements.
1.4.8 Definition
If a vector space V has a finite basis, then any two bases will have the same number of elements. This number is defined to be the dimension of V; we denote it by dim(V), and V is said to be finite dimensional.
A vector space which is not finite dimensional is said to be infinite dimensional.
1.4.9 Examples
1. Fn has dimension n. A basis is e1, e2, ..., en.
2. Cn[a, b] is an infinite dimensional space.
Proof. Suppose dim(Cn[a, b]) = k. Then there cannot be a set of k + 1 vectors in it that are L.I. But we saw before that 1, x, x2, ..., xk ∈ Cn[a, b] are indeed L.I., a contradiction.
3. M22(F) has dimension 4.
4. Let M be the subspace of M22(F) consisting of all symmetric 2 × 2 matrices. Then dim(M) = 3. (Try to prove it.)
1.4.10 Lemma
Let u1, u2, ..., uk ∈ V and suppose that for some r with 1 ≤ r ≤ k, ur ∈ [u1, u2, ..., ur−1, ur+1, ..., uk]. Then [u1, u2, ..., ur, ..., uk] = [u1, u2, ..., ur−1, ur+1, ..., uk].
Note: Let V be a finitely generated vector space, say V = [v1, v2, ..., vs] (note that v1, v2, ..., vs are not necessarily L.I.). By repeated application of the last lemma we can reduce the spanning set to a linearly independent spanning set.
1.4.11 Theorem
A linearly independent set of vectors in a finite dimensional vector space V can be extended to abasis for V.
1.4.12 Example
Show that v1 = (1, 1, 0, 2), v2 = (1, 2, 1, 1), v3 = (2, 1, 1, 0) are linearly independent in R4 and extend them to a basis for R4.
Solution: Let A be the matrix whose columns are v1, v2, v3, e1, e2, e3, e4, and row reduce:

A = [ 1 1 2 1 0 0 0           [ 1 0 0 0 −1  1  1
      1 2 1 0 1 0 0    −→ M =   0 1 0 0  2 −2 −1
      0 1 1 0 0 1 0             0 0 1 0 −2  3  1
      2 1 0 0 0 0 1 ]           0 0 0 1  3 −5 −2 ]

The pivots lie in the first four columns, so v1, v2, v3 are L.I. and extend to the basis v1, v2, v3, e1 for R4.
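This example can be reproduced with sympy (an added sketch): row reduce the matrix whose columns are v1, v2, v3 followed by the standard basis, and read off the pivot columns.

```python
# Extend the L.I. set v1, v2, v3 to a basis of R^4 by appending the
# standard basis and taking the pivot columns of the RREF (sketch only).
from sympy import Matrix, eye

v1, v2, v3 = Matrix([1, 1, 0, 2]), Matrix([1, 2, 1, 1]), Matrix([2, 1, 1, 0])
A = Matrix.hstack(v1, v2, v3, eye(4))
R, pivots = A.rref()
print(pivots)   # (0, 1, 2, 3): v1, v2, v3 together with e1 form a basis
```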
Exercises
Set1.
(1) Find a matrix in reduced echelon form which is row equivalent to the matrix
A = [  2 −9 −3 3 −4
      −1  6  2 2  2
       1  3  1 1 −2 ]

Hence, find the complete solution of the linear system

A (x1, x2, x3, x4, x5)^T = (0, 0, 0)^T
(2) Let V denote the subspace of R6 spanned by the vectors
v1 = (1, 1, 2, 1, 1, 1), v2 = (3, 4, 3, 3, 5, 5), v3 = (−2,−3,−1, 0,−6, 3),
v4 = (2, 4,−2, 2, 6, 2), v5 = (3, 2, 9, 1, 3, 1)
Find a subset of this spanning set which forms a basis of V, and express the remaining vectors in the spanning set as linear combinations of your basis vectors.
(3) Determine a basis of R4 which contains the vectors v1 = (1, 2, 3, 4), v2 = (3, 2, 1, 0). Express the standard basis e1, e2, e3, e4 as linear combinations of the vectors in your basis.
Set2.
(1) Determine which of the following subsets of R3 is a subspace of R3 :
1. M1 = {(x, y, z) ∈ R3 : 7x − 2y + z = 0},
2. M2 = {(x, y, z) ∈ R3 : 3x + 2y + z = 10},
3. M3 = {(x, y, z) ∈ R3 : 3x + 2y + z ≥ 0},
4. M4 = {(x, y, z) ∈ R3 : y = x2},
5. M5 = {(x, y, z) ∈ R3 : y2 = x2}.
(2) Determine which of the following subsets of C[−1, 1] is a subspace of C[−1, 1] :
1. N1 = {f ∈ C[−1, 1] : f(0) = −1}
2. N2 = {f ∈ C[−1, 1] : f(−1/2) = 0}
3. N3 = {f ∈ C[−1, 1] : f(1/4) ≤ 0}
4. N4 = {f ∈ C[−1, 1] : ∫ from −1 to 1 of f(t) dt = 0}
(3) Let V be a vector space over R and let W = V × V, the set of all ordered pairs (u, v), where addition and multiplication by complex numbers in W are given by:
(u1, v1) + (u2, v2) = (u1 + u2, v1 + v2),
(a+ bi)(u, v) = (au− bv, bu+ av)
Show that W is a vector space over C.
Set3.
(1) Let u1, u2, ..., ur and v1, v2, ..., vs be two sets of vectors in a vector space V over F. Prove that
[u1, u2, ..., ur] = [v1, v2, ..., vs]
if and only if each ui ∈ [v1, v2, ..., vs] and each vj ∈ [u1, u2, ..., ur]. Show that
1. [(1,−1, 2), (2, 1,−1)] = [(3, 0, 1), (0,−3, 5), (5, 1, 0)] in R3.
2. [sin2 x, cos2 x, sin x cos x] = [1, sin 2x, cos 2x] in C[−π, π].
(2) Let M be a subspace of a vector space V over F and let u, v, w ∈ V.
1. Suppose that u+ v + w = 0v. Prove that [u, v] = [v, w].
2. Suppose that v /∈M but v ∈M + [u]. Show that u ∈M + [v].
(3) Let V be a real vector space of all infinite sequences of real numbers (a1, a2, a3, ...) with the usualcomponentwise addition and multiplication by scalars. For nN, let en = (0, 0, ..., 0, 1, 0, ...),which has 1 in the ith position and 0 every where else, and let f = (1, 1, 1, ..., 1, 1, ...0). PutW = [f, e1, e2, e3, ...].
Determine whether each of the following statements is true or false, giving reasons for youranswer:
1. (0, 0, 0, 1, 1, ..., 1, ...) ∈ W.
2. (1, 0, 1, 0, 1, 0...) ∈ W.
3. (1, 2, 3, 4, ..., n, n + 1, ...) ∈ W.
Set4.
(1) Let a ≤ b < c ≤ d be real numbers and let y1(x), y2(x), ..., yn(x) ∈ C[a, d]
(so that y1(x), y2(x), ..., yn(x) ∈ C[b, c]).
Either prove or give a counterexample to the following statements:
1. If y1(x), y2(x), ..., yn(x) are linearly independent in C[b, c], then y1(x), y2(x), ..., yn(x) are linearly independent in C[a, d].
2. If y1(x), y2(x), ..., yn(x) are linearly independent in C[a, d], then y1(x), y2(x), ..., yn(x) are linearly independent in C[b, c].
(2) Determine whether the following are linearly dependent or linearly independent:
1. log t, t2 log t in C(0,∞).
2. t^α, t^β, t^γ in C(0, ∞) where α, β, γ are distinct real numbers.
3. sin3 t, sin t − (1/3) sin 3t in C(−∞, ∞).
4. e^(ax) sin bx, e^(ax) cos bx (b ≠ 0) in C(R).
5. g(t), tg(t), t2g(t) in C(I) where g is any continuous function with g(t) ≠ 0 for all t ∈ I.
Set5.
(1) 1. Find the dimension of the subspace M1 of C5 given by
M1 = {(a, b, c, d, e) ∈ C5 : a = ib, c+ id− e = 0}.
2. Find the dimension of the subspace M2 of Mnn(F) given by
M2 = {A ∈Mnn(F) : AT = A}.
3. Let Pn+1[x] denote the real vector space of all polynomials with coefficients in R of degree ≤ n. Find dim(Pn+1[x]).
Let n ≥ 2 and let M3 = {p(x) ∈ Pn+1[x] : p(1) = p′(1) = 0}.
Verify that M3 is a subspace of Pn+1[x] and find dimM3.
(2) Verify that 1, sin x, sin 2x, cos x are linearly independent in C[−π, π]. Put M = [1, sin x, sin 2x, cos x]. Find a basis for the subspace N of M that is generated by
u1 = 1− sinx− sin 2x,
u2 = 1− sin 2x− cos x,
u3 = sinx− cos x,
u4 = 2 + sin x,
u5 = 1 + sin x+ sin 2x+ cos x.
Set6.
(1) The subspaces M1,M2 of R4 are given by:
M1 = [(1, 0, 3, −2), (2, −1, 1, 0), (0, 1, 5, −4), (2, −1, 2, −1)]
M2 = [(−1, 1, 0, −1), (0, 1, −1, 2), (−1, 2, −1, 1), (−2, −1, 3, −8)]
1. Find bases for M1,M2,M1 +M2.
2. Calculate dim(M1 ∩M2).
3. Examine your calculations to see if you can find a basis for M1 ∩M2.
4. M1 +M2 = ?
(2) Let L,M,N be finite dimensional subspaces of a vector space V over F. Verify that L ∩M ⊆L ∩ (M +N).
Prove that
dim(L + M + N) = dim L + dim M + dim N
if and only if
L ∩ (M + N) = M ∩ (N + L) = N ∩ (L + M) = {0v}.
Chapter 2
Linear Mappings
2.1 Definition
Let U, V be vector spaces over a field F. Let φ : U → V be a mapping which satisfies:
1. φ(u + v) = φ(u) + φ(v) for all u, v ∈ U.
2. φ(αu) = αφ(u) for all u ∈ U, α ∈ F.
Then φ is called a linear mapping from U to V.
Other names used: linear map, linear transformation, homomorphism.
Note that conditions (1),(2) can be combined to
3. φ(αu + βv) = αφ(u) + βφ(v) for all u, v ∈ U and α, β ∈ F.
2.2 Examples
1. Let A ∈ Mm,n(F) be a fixed matrix, and let φ : Fn → Fm be given by

φ((x1, ..., xn)^T) = A (x1, ..., xn)^T
Then φ is a linear mapping. (Verify)
2. Let φ : R2 → R3 where φ(x, y) = (x, y, 0). Then φ is a linear mapping.
Note that this example is a special case of Example 1, where

A = [ 1 0
      0 1
      0 0 ]

and the vectors in R2 and R3 are written in column form.
3. For n ≥ 1, let φ : Cn(I) → Cn−1(I) be given by φ(f) = df/dx. This is a linear mapping since (f + g)′ = f′ + g′ and (αf)′ = αf′.
2.3 Consequences of the Definition
1. φ(α1v1 + α2v2 + ...+ αnvn) = α1φ(v1) + α2φ(v2) + ...+ αnφ(vn).
Proof: By induction on n.
2. φ(0v) = 0v.
Proof: Note that 0v = 0v + 0v. Applying φ to both sides gives φ(0v) = φ(0v) + φ(0v), and adding −φ(0v) to both sides gives φ(0v) = 0v.
2.4 Definition
Let U, V be vector spaces over a field F and let φ : U → V be a linear mapping. Then
1. The set of vectors {u ∈ U : φ(u) = 0v} is called the kernel of φ and is denoted by ker φ; it is a subset of U.
2. The set of vectors {v ∈ V : v = φ(u) for some u ∈ U} is called the image of φ and is denoted by Im(φ); it is a subset of V.
2.5 Theorem
Let U, V be vector spaces over a field F. Let φ : U → V be a linear mapping. Then ker φ is a subspace of U and Im(φ) is a subspace of V.
Proof. Exercise
2.6 Example
φ : R3 → R4 where φ(x, y, z) = (x, x, y − z, x). It is easy to show that φ is linear, with

(x, y, z)^T ↦ [ 1 0  0
                1 0  0
                0 1 −1
                1 0  0 ] (x, y, z)^T

ker φ:
φ(x, y, z) = (0, 0, 0, 0) iff (x, x, y − z, x) = (0, 0, 0, 0) iff x = 0 and y − z = 0, that is x = 0, y = z. A typical element of ker φ is (0, y, y) = y(0, 1, 1). Thus ker φ = [(0, 1, 1)] and dim(ker φ) = 1.

Im(φ):
φ(x, y, z) = (x, x, y − z, x) = x(1, 1, 0, 1) + (y − z)(0, 0, 1, 0). Hence Im(φ) = [(1, 1, 0, 1), (0, 0, 1, 0)], and these two vectors form a basis for Im(φ). So dim(Im φ) = 2.
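The kernel and image of this map are the nullspace and column space of its matrix, which sympy can compute directly (an added sketch, assuming sympy is available):

```python
# Kernel and image of phi(x, y, z) = (x, x, y - z, x) via its matrix.
from sympy import Matrix

A = Matrix([[1, 0, 0],
            [1, 0, 0],
            [0, 1, -1],
            [1, 0, 0]])
null = A.nullspace()      # basis of ker(phi)
col = A.columnspace()     # basis of Im(phi)
print(null[0].T)          # the kernel basis vector (0, 1, 1)
print(len(null), len(col))   # 1 2, and 1 + 2 = 3 = dim(R^3) (rank-nullity)
```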
2.7 Theorem
Let U, V be vector spaces over a field F with U finite dimensional, and let φ : U → V be a linear mapping. Then ker(φ) and Im(φ) are finite dimensional and
dim(kerφ) + dim(Im φ) = dim(U).
Proof. ker φ is a subspace of the finite dimensional vector space U and so is finite dimensional. Let u1, u2, ..., un be a basis for U and let v ∈ Im φ. Then there exists u ∈ U such that u = α1u1 + ... + αnun for some scalars α1, ..., αn, and v = φ(u) = φ(α1u1 + ... + αnun) = α1φ(u1) + ... + αnφ(un). So v ∈ [φ(u1), ..., φ(un)]. Hence, Im(φ) ⊆ [φ(u1), ..., φ(un)]. Therefore, Im(φ) is finite dimensional.
Let v1, ..., vk be a basis for ker(φ). Then v1, ..., vk is a linearly independent set of vectors in U and so can be extended to a basis for U, say v1, ..., vk, vk+1, ..., vn. This says that dim ker φ = k and dim U = n, so we have to show that dim Im φ = n − k.
Exactly as in the first part of the proof, φ(v1), φ(v2), ..., φ(vn) span Im φ. However, v1, ..., vk span ker φ, so φ(v1) = ... = φ(vk) = 0v. Hence, φ(vk+1), ..., φ(vn) span Im φ. Let αk+1, ..., αn be scalars such that αk+1φ(vk+1) + ... + αnφ(vn) = 0v. That is, φ(αk+1vk+1 + ... + αnvn) = 0v. Thus, αk+1vk+1 + ... + αnvn ∈ ker(φ). Since v1, ..., vk is a basis for ker φ, αk+1vk+1 + ... + αnvn = β1v1 + ... + βkvk for some scalars β1, ..., βk. Since v1, v2, ..., vn are L.I., αk+1 = ... = αn = 0 = β1 = ... = βk. Therefore φ(vk+1), ..., φ(vn) are L.I. and so form a basis for Im(φ). Hence dim Im(φ) = n − k. □
The proof shows that if u1, u2, ..., un is a basis for U, then Im φ is spanned by φ(u1), ..., φ(un); i.e. for any φ(u) ∈ Im φ, φ(u) = φ(α1u1 + ... + αnun) = α1φ(u1) + ... + αnφ(un).
Therefore, once the images φ(u1), ..., φ(un) of the basis vectors are known, φ is completely determined on U.
2.8 Theorem
Let U, V be vector spaces over a field F with dim(U) = n. Let u1, ..., un be a basis for U and v1, ..., vn be any n vectors in V. Then there is a unique linear mapping φ : U → V such that φ(ui) = vi for all 1 ≤ i ≤ n.
Proof. Let u ∈ U; then there are unique scalars α1, ..., αn such that u = α1u1 + ... + αnun. Define φ : U → V by
φ(u) = φ(α1u1 + ...+ αnun) = α1v1 + ...+ αnvn
This is a well defined mapping since the scalars α1, ..., αn are unique. It is easy to check that φ is indeed a linear mapping (left as an exercise). Note that for each 1 ≤ i ≤ n, ui = 0u1 + ... + 0ui−1 + 1·ui + 0ui+1 + ... + 0un, so φ(ui) = vi. Suppose that ψ : U → V is a linear mapping such that ψ(ui) = vi for all 1 ≤ i ≤ n. Then for any u ∈ U, if
u = α1u1 + ...+ αnun, then
φ(u) = α1v1 + ...+ αnvn
= α1ψ(u1) + ...+ αnψ(un)
= ψ(α1u1 + ...+ αnun)
= ψ(u)
Hence, ψ(u) = φ(u) for all u ∈ U. Hence ψ = φ. �
2.9 Example
Let e1, e2, e3 be the standard basis for R3 and let v1 = (1, 1), v2 = (1, 1), v3 = (1, 2). Let φ be the linear mapping with φ(e1) = v1, φ(e2) = v2, φ(e3) = v3. What is φ(x, y, z)?
2.10 Sums, Scalar multiples and Composite:
1. Let U, V be vector spaces over a field F, and φ1, φ2 : U → V be linear maps. Then φ1 + φ2 :U → V is a linear map where (φ1 + φ2)(u) = φ1(u) + φ2(u).
2. Let U, V be vector spaces over a field F, let φ : U → V be a linear map, and let α ∈ F. Then αφ : U → V, where (αφ)(u) = α(φ(u)), is a linear map.
3. Let U, V,W be vector spaces over a field F and ψ : U → V, φ : V → W be linear maps. Thenφ ◦ ψ : U → W is a linear map.
2.11 Example
Let φ : R2 → R3 and T : R3 → R4 be given by φ(x, y) = (x, x, y) and T(x, y, z) = (x, x, y − z, x). Find a formula for (T ◦ φ)(x, y).
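This composite can be computed via matrices (an added sketch, assuming sympy): composition of linear maps corresponds to multiplication of their matrices, in the order B·A.

```python
# Matrices of phi(x,y) = (x, x, y) and T(x,y,z) = (x, x, y - z, x);
# the product B*A is the matrix of T∘phi (illustrative sketch only).
from sympy import Matrix

A = Matrix([[1, 0], [1, 0], [0, 1]])               # matrix of phi
B = Matrix([[1, 0, 0], [1, 0, 0], [0, 1, -1], [1, 0, 0]])   # matrix of T
print(B * A)   # so (T∘phi)(x, y) = (x, x, x - y, x)
```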
Isomorphisms
In this section we discuss the problem of when two vector spaces are "essentially the same". For example, compare R4 with M22(R):

(a, b, c, d)                  ↔  [ a b
                                   c d ]

(x, y, z, w)                  ↔  [ x y
                                   z w ]

(a + x, b + y, c + z, d + w)  ↔  [ a+x b+y
                                   c+z d+w ]

so addition corresponds to addition, and similarly for multiplication by a scalar.
Let U, V be vector spaces over a field F, and let φ : U → V be a linear map. Suppose that φ is a bijection (1-1 and onto). Then there is the inverse map φ−1 : V → U, given by φ−1(v) = u iff u is the unique element of U such that φ(u) = v. Also φ−1 ◦ φ = IU and φ ◦ φ−1 = IV.
2.12 Theorem
Let φ : U → V be a bijective linear mapping. Then φ−1 : V → U is also a linear mapping.
Proof: Exercise. □
2.13 Definition
A linear mapping φ : U → V which is also a bijection is called an isomorphism, and we say U is isomorphic to V. If this is the case, then φ−1 : V → U is also an isomorphism and V is isomorphic to U. Hence, we can say that U and V are isomorphic if we can set up a bijective linear mapping between them.
2.14 Example
The two vector spaces R4 and M22(R) are isomorphic via the map

(a, b, c, d) ↦ [ a b
                 c d ].
Note that φ is a bijection iff φ is onto and 1-1; that is equivalent to Im(φ) = V and ker(φ) = {0v}, respectively.
2.15 Theorem
φ is injective iff ker(φ) = {0v}.
2.16 Theorem
Let U, V be finite dimensional vector spaces over F. Then U, V are isomorphic iff dim U = dim V.
Proof. Suppose U, V are isomorphic. Then there is a bijective linear map φ : U → V. Since φ is surjective and injective, Im(φ) = V and ker(φ) = {0v} respectively. By Theorem 2.7,

dim U = dim ker φ + dim Im φ = 0 + dim V = dim V
Conversely, suppose dim U = dim V = n, say. Let u1, u2, ..., un and v1, v2, ..., vn be bases for U and V respectively. By Theorem 2.8 there is a unique linear mapping φ : U → V such that φ(ui) = vi for all 1 ≤ i ≤ n. Let u ∈ U; then u = α1u1 + ... + αnun. Suppose u ∈ ker φ. Then
0v = φ(u)
= φ(α1u1 + ...+ αnun)
= α1φ(u1) + ...+ αnφ(un)
= α1v1 + ...+ αnvn
Since v1, ..., vn are linearly independent, we must have α1 = ... = αn = 0. Thus ker φ = {0v} and so φ is 1-1. Again by Theorem 2.7, we have dim(Im φ) = n = dim V. Since Im(φ) is a subspace of V, Im(φ) = V, and so φ is surjective and hence an isomorphism. □
2.17 Corollary
If U is a vector space over F and dimU = n, then U is isomorphic to Fn.
2.18 Example
Let U = M22(R), V = R4. Then U, V are isomorphic since both of them have dimension 4.
2.19 Theorem
Isomorphism is an equivalence relation on the collection of all vector spaces over a field F.
Linear Mappings and Matrices
Recall that any linear mapping φ : Fn → Fm can be represented by an m × n matrix.
Now let U, V be finite dimensional vector spaces over F, say dim U = n and dim V = m. Let φ : U → V be a linear mapping, and let B = {u1, ..., un} and B′ = {v1, ..., vm} be ordered bases for U and V respectively. Then there are unique scalars aij such that
φ(u1) = a11v1 + a21v2 + ... + am1vm
φ(u2) = a12v1 + a22v2 + ... + am2vm
...
φ(un) = a1nv1 + a2nv2 + ... + amnvm
Put

A = [ a11 a12 ... a1n
      a21 a22 ... a2n
      ...
      am1 am2 ... amn ] ∈ Mmn(F)
A is called the matrix of φ relative to the bases B for U and B′ for V.
Conversely, let B = (bij) ∈ Mmn(F). Then there is a unique linear mapping ψ : U → V such that ψ(ui) = b1iv1 + b2iv2 + ... + bmivm (1 ≤ i ≤ n), and the matrix of ψ relative to the bases {u1, ..., un}, {v1, ..., vm} is B.
Hence, relative to fixed bases there is a one-to-one correspondence between the set of all m × n matrices and the set of all linear maps from U to V.
Suppose φ : U → V is a linear map and suppose that A = (aij) is the matrix of φ relative to the bases B = {u1, ..., un} for U and B′ = {v1, ..., vm} for V. Let u ∈ U. Then there are unique scalars α1, ..., αn such that u = α1u1 + ... + αnun. Hence the coordinate vector of u relative to the basis B is [u]B = (α1, ..., αn)^T. Then
φ(u) = φ(α1u1 + ...+ αnun)
= α1φ(u1) + ...+ αnφ(un)
= α1(a11v1 + ...+ am1vm)
+ α2(a12v1 + ...+ am2vm)
+ ....
+ αn(a1nv1 + ...+ amnvm)
= (a11α1 + a12α2 + ...+ a1nαn)v1
+ (a21α1 + a22α2 + ...+ a2nαn)v2
+ ...
+ (am1α1 + am2α2 + ...+ amnαn)vm
Hence, the column of coordinates of φ(u) in terms of v1, ..., vm is

[ a11α1 + a12α2 + ... + a1nαn
  a21α1 + a22α2 + ... + a2nαn
  ...
  am1α1 + am2α2 + ... + amnαn ]  =  A (α1, α2, ..., αn)^T

That is, [φ(u)]B′ = A[u]B, where A = [φ]BB′ is the matrix of φ (or representing φ) relative to the ordered bases B for U and B′ for V.
2.20 Example
Let φ : R3 → R4 be given by φ(x, y, z) = (x + y + z, x − y, y, y − z). This is a linear mapping. Consider the standard bases B = {e1, e2, e3} for R3 and B′ = {f1, f2, f3, f4} for R4. Then
φ(e1) = (1, 1, 0, 0) = f1 + f2
φ(e2) = (1,−1, 1, 1) = f1 − f2 + f3 + f4
φ(e3) = (1, 0, 0,−1) = f1 − f4
Hence,

A = [ 1  1  1
      1 −1  0
      0  1  0
      0  1 −1 ]

This linear mapping is

(x, y, z)^T ↦ A (x, y, z)^T
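As a quick check (an added sketch, assuming sympy), applying the matrix A to a symbolic column vector reproduces the formula for φ:

```python
# Verify that A*(x, y, z)^T reproduces phi(x, y, z) = (x+y+z, x-y, y, y-z).
from sympy import Matrix, symbols

x, y, z = symbols('x y z')
A = Matrix([[1, 1, 1],
            [1, -1, 0],
            [0, 1, 0],
            [0, 1, -1]])
print(A * Matrix([x, y, z]))   # column (x + y + z, x - y, y, y - z)
```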
2.21 Theorem
Let U, V, W be vector spaces over F of dimensions n, m, p respectively, and let u1, ..., un, v1, ..., vm, w1, ..., wp be bases for U, V, W respectively. Let φ : U → V and ψ : V → W be linear maps represented by the matrices A, B relative to these bases. Then the composite ψ ◦ φ : U → W is represented by BA relative to the bases u1, ..., un for U and w1, ..., wp for W.
Proof. Write A = (aij), B = (bij). Then φ(ui) = ∑_{h=1}^{m} ahi vh and ψ(vh) = ∑_{k=1}^{p} bkh wk. Then

(ψ ◦ φ)(ui) = ψ(φ(ui)) = ψ( ∑_{h=1}^{m} ahi vh )
            = ∑_{h=1}^{m} ahi ψ(vh) = ∑_{h=1}^{m} ahi ( ∑_{k=1}^{p} bkh wk )
            = ∑_{k=1}^{p} ( ∑_{h=1}^{m} bkh ahi ) wk = ∑_{k=1}^{p} (BA)ki wk

□
2.22 Example
Let U be a vector space over F and u1, ..., un a basis for U. Then the identity linear map IU : U → U is represented by the identity matrix In.
2.23 Theorem
Let U, V be n−dimensional vector spaces over F and suppose φ : U → V is represented by a matrix Arelative to the bases u1, ..., un for U and v1, ..., vn for V. Then φ is an isomorphism iff A is invertible.
Proof. Suppose φ is an isomorphism; then φ−1 exists and φ−1 : V → U is linear. Suppose that φ−1 is represented by the matrix B relative to the given bases. Then φ−1 ◦ φ = IU and BA = In. Also, φ ◦ φ−1 = IV and AB = In. Hence B = A−1.
Conversely, suppose A−1 exists. Let ψ : V → U be the linear mapping which has the matrix A−1 relative to the bases v1, ..., vn and u1, ..., un. (That is, if A−1 = (αij) then ψ(vi) = ∑_{h=1}^{n} αhi uh.) Then φ ◦ ψ is represented by AA−1 = In, so φ ◦ ψ = IV. Similarly, ψ ◦ φ is represented by A−1A = In, so ψ ◦ φ = IU. Therefore ψ = φ−1 and φ is an isomorphism. □
Change of Basis
Let u1, ..., un and v1, ..., vm be bases for U, V respectively, let φ : U → V be a linear mapping, and suppose that φ is represented by A = (aij) relative to these bases. That is, φ(ui) = a1iv1 + ... + amivm = ∑_{h=1}^{m} ahi vh.
Let u′1, ..., u′n be another basis for U and v′1, ..., v′m be another basis for V. What is the matrix of φ relative to these new bases?
First, we need to express φ(u′i) in terms of v′1, ..., v′m. Now u1, ..., un and u′1, ..., u′n are two bases for U and so are connected by an invertible matrix Q = (qij) such that

u′i = q1iu1 + q2iu2 + ... + qniun,  1 ≤ i ≤ n

Thus Q is n × n. Similarly, v1, ..., vm and v′1, ..., v′m are connected by an invertible matrix P = (pij) such that

vh = p1hv′1 + p2hv′2 + ... + pmhv′m,  1 ≤ h ≤ m.
Then

φ(u′i) = φ( ∑_{h=1}^{n} qhi uh ) = ∑_{h=1}^{n} qhi φ(uh)
       = ∑_{h=1}^{n} qhi ( ∑_{k=1}^{m} akh vk ) = ∑_{k=1}^{m} ( ∑_{h=1}^{n} akh qhi ) vk
       = ∑_{k=1}^{m} (AQ)ki vk
       = ∑_{k=1}^{m} (AQ)ki ( ∑_{r=1}^{m} prk v′r ) = ∑_{r=1}^{m} ( ∑_{k=1}^{m} prk (AQ)ki ) v′r
       = ∑_{r=1}^{m} (PAQ)ri v′r

Therefore, the matrix representing φ relative to the new bases u′1, ..., u′n for U and v′1, ..., v′m for V is

B = PAQ.
2.24 Example
Let φ : R3 → R2 be the linear mapping given by φ(x, y, z) = (x − 2y + 3z, 2x + y − z). Then the matrix representing φ relative to the standard bases is

A = ( 1 −2  3 )
    ( 2  1 −1 )

since

φ(e1) = (1, 2) = f1 + 2f2
φ(e2) = (−2, 1) = −2f1 + f2
φ(e3) = (3, −1) = 3f1 − f2

where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) and f1 = (1, 0), f2 = (0, 1).

If u′1 = (1, 3, 2), u′2 = (0, 1, 1), u′3 = (1, 1, 1) is another basis for R3 and v′1 = (1, 1), v′2 = (1, 2) is another basis for R2, find the matrix of φ relative to these new bases. Now

u′1 = (1, 3, 2) = e1 + 3e2 + 2e3
u′2 = (0, 1, 1) = 0e1 + e2 + e3
u′3 = (1, 1, 1) = e1 + e2 + e3

Hence,

Q = ( 1 0 1 )
    ( 3 1 1 )
    ( 2 1 1 )

Also

v′1 = (1, 1) = f1 + f2
v′2 = (1, 2) = f1 + 2f2

and therefore

f1 = 2v′1 − v′2,  f2 = −v′1 + v′2.

Hence,

P = (  2 −1 )
    ( −1  1 )

Therefore the required matrix is

PAQ = (  2 −1 ) ( 1 −2  3 ) ( 1 0 1 )   ( −1  2 2 )
      ( −1  1 ) ( 2  1 −1 ) ( 3 1 1 ) = (  2 −1 0 )
                            ( 2 1 1 )

That means

φ(u′1) = −v′1 + 2v′2
φ(u′2) = 2v′1 − v′2
φ(u′3) = 2v′1
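The computation B = PAQ can be checked numerically; a minimal sketch using plain nested lists (no libraries assumed):

```python
# Verify B = PAQ from the example, with matrices stored as lists of rows.

def matmul(X, Y):
    """Multiply matrices X (p x q) and Y (q x r) given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[2, -1], [-1, 1]]
A = [[1, -2, 3], [2, 1, -1]]
Q = [[1, 0, 1], [3, 1, 1], [2, 1, 1]]

B = matmul(matmul(P, A), Q)
print(B)  # [[-1, 2, 2], [2, -1, 0]]
```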
2.25 Example
Let A be an m × n matrix. Then A defines a linear mapping φ : Fn → Fm by φ(ei) = (a1i, a2i, ..., ami). The matrix of φ relative to the standard bases is A.

A basis for ker φ can be extended to a basis for Fn: say u1, u2, ..., uk, ..., un is a basis for Fn where uk+1, ..., un is a basis for ker φ. Put v1 = φ(u1), ..., vk = φ(uk). Then v1, ..., vk is a basis for Im(φ). This basis can be extended to a basis for Fm, say v1, ..., vk, vk+1, ..., vm. Then

φ(u1) = v1
φ(u2) = v2
...
φ(uk) = vk
φ(uk+1) = 0Fm
...
φ(un) = 0Fm

So, the matrix of φ relative to the bases u1, ..., un and v1, ..., vm is

( Ik 0 )
( 0  0 ) = PAQ.

Thus, given an m × n matrix A, there exist an invertible n × n matrix Q and an invertible m × m matrix P such that PAQ has this block form.

Now, k = dim(Im(φ)). Im(φ) is spanned by φ(e1), φ(e2), ..., φ(en), i.e.

Im(φ) = [(a11, a21, ..., am1), ..., (a1n, a2n, ..., amn)].

Thus Im(φ) is the space spanned by the columns of A. So, dim(Im(φ)) is the maximum number of linearly independent columns of A, and this is the rank of A. Hence, k = rank(A).
Let φ : V → V be a linear mapping. In calculating matrices representing φ, it is customary to use the same basis on both sides. Therefore, relative to the basis B = {v1, ..., vn}, φ is represented by an n × n matrix A = (aij), where

φ(v1) = a11v1 + a21v2 + ... + an1vn
...
φ(vn) = a1nv1 + a2nv2 + ... + annvn

Let B′ = {v′1, ..., v′n} be another basis for V. Then there is an invertible matrix P = (pij) such that

v′i = p1iv1 + p2iv2 + ... + pnivn, 1 ≤ i ≤ n,

so that

vj = q1jv′1 + q2jv′2 + ... + qnjv′n, 1 ≤ j ≤ n,

where Q = (qij) = P−1. Then relative to the basis B′ = {v′1, ..., v′n}, φ is represented by P−1AP. Also, any invertible matrix P will define a change of basis for V. So two n × n matrices A, B represent the same linear mapping (relative to different bases) if and only if there is an invertible matrix P such that B = P−1AP.
2.26 Definition
Let A, B ∈ Mnn(F). Then we say B is similar to A if there is an invertible matrix P such that B = P−1AP.
Therefore two matrices represent the same linear mapping if and only if they are similar.
2.27 Proposition
Let A,B be similar matrices, then
1. det(A) = det(B).
2. Trace(A) = Trace(B).
Proof. Exercise. �

Note that the converse of this proposition is not true. That is, if det(A) = det(B) or Trace(A) = Trace(B), then A, B need not be similar.
2.28 Definition
Let V be an n-dimensional vector space and φ : V → V a linear mapping. Then any two matrices representing φ are similar, and so have the same determinant and trace. Therefore, we define
1. det(φ) = det of any matrix representing φ.
2. Trace (φ) = trace of any matrix representing φ.
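Proposition 2.27 (similar matrices share determinant and trace) can be sanity-checked on a small example; the 2 × 2 matrices below are chosen arbitrarily, and P−1 is computed by hand:

```python
# Check that B = P^{-1} A P has the same determinant and trace as A.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace2(M):
    return M[0][0] + M[1][1]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]          # inverse of P (det P = 1)

B = matmul(matmul(P_inv, A), P)    # B is similar to A
print(det2(A), det2(B))            # -2 -2
print(trace2(A), trace2(B))        # 5 5
```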
Exercises
Set7.
(1) Determine whether each of the following mappings is linear. If they are linear, find bases for the kernel and image.
1. φ1 : R3 → R4 given by φ1(a, b, c) = (a + b, b + c, 2, a − b − c).
2. φ2 : R4 → R3 given by φ2(a, b, c, d) = (a + c + d, a + b + 2d, b − c + d).
(2) Let B ∈ Mnn(F), and let φ : Mnn(F) → Mnn(F) be given by φ(A) = AB − BA for all A ∈ Mnn(F). Verify that φ is a linear mapping.

Find a basis for ker φ when n = 3 and

B = ( 0 1 0 )
    ( 1 0 1 )
    ( 0 1 0 )

Hence calculate dim(Im φ).
(3) Let U, V be vector spaces over F and let φ : U → V be a linear mapping. Either prove or givea counterexample to the following statements:
1. If u1, u2, ..., uk are linearly independent in U, then φ(u1), φ(u2), ..., φ(uk) are linearly independent in V.
2. If φ(u1), φ(u2), ..., φ(uk) are linearly independent in V then u1, u2, ..., uk are linearly inde-pendent in U.
Set8.
(1) Let u1 = (0, 1, 1), u2 = (1, 0, 1), u3 = (1, 1, 0) ∈ R3. Express each of e1, e2, e3 in terms of u1, u2, u3. Deduce that u1, u2, u3 is a basis for R3.
Let φ : R3 → R4 be the unique linear mapping such that
φ(u1) = (0,−1, 2, 2), φ(u2) = (1,−1, 1, 2), φ(u3) = (2,−1, 0, 6).
Find φ(x, y, z).
(2) Let Pn(R) denote the vector space of all polynomials over R of degree < n. Find a basis forthe subspace
M1 = {p(x) ∈ Pn(R) : p′(0) = p(1) = 0}
Find a basis for the subspace M2 of Rn given by
M2 = {(a1, a2, ..., an) ∈ Rn : a1 + an = a2 + an−1 = 0}
Deduce that M1 is isomorphic to M2 and write an explicit isomorphism.
(3) The linear mapping φ : R3 → R2 has matrix

A = (  1 0 2 )
    ( −1 4 3 )

relative to the bases u1 = (1, 0, 0), u2 = (1, 1, 0), u3 = (1, 1, 1) for R3 and v1 = (1, 0), v2 = (1, 1) for R2. Find φ(x, y, z).
Set9.
(1) The linear mapping φ : R3 → R3 is given by
φ(a, b, c) = (a+ 2b+ 2c,−a+ 3b+ 8c, 2a− b+ γc)
Write down the matrix representing φ relative to the standard basis e1, e2, e3 for R3. Deducethat φ is an isomorphism unless γ takes one particular value.
(2) φ : R4 → R3 is represented by the matrix

A = ( 1 2 −1  0 )
    ( 2 3 −1 −1 )
    ( 1 0  1 −2 )

relative to the standard bases for R4 and R3. Find a basis for ker φ and extend this basis to a basis for R4. Hence, find invertible matrices Q (4 × 4) and P (3 × 3) such that

PAQ = ( Ik 0 )
      ( 0  0 )
)(3) Find detφ, trace φ, rank φ for the linear mapping φ : C3 → C3 given by
φ(x, y, z) = (2x+ y − z, x+ 2y + z,−x+ y + 2z)
Chapter 3
Inner Product Spaces
In this chapter we restrict the field of scalars to C or R. The results will be stated in terms of C, which involves complex conjugates. To get the R version, ignore the conjugates.
3.1 Inner Product
3.1.1 Definition
Let V be a vector space over C. Suppose that with each ordered pair of vectors u, v there is associated a scalar denoted by < u, v >. Then < , > is said to be an inner product on V, and V is called an inner product space, if:
1. < u, v + w >=< u, v > + < u,w > for all u, v, w ∈ V.
2. < v, u > = \overline{< u, v >}. (complex conjugate)
3. < αu, v >= α < u, v > where α ∈ C.
4. < u, u >≥ 0 and < u, u >= 0 iff u = 0.
Note that:
• By putting u = v in (2) we get < u, u > = \overline{< u, u >}. Therefore, < u, u > is always a real number.

• < u, αv > = \overline{< αv, u >} = \overline{α < v, u >} = \overline{α} \overline{< v, u >} = \overline{α} < u, v >.

• < u + v, w > = \overline{< w, u + v >} = \overline{< w, u > + < w, v >} = \overline{< w, u >} + \overline{< w, v >} = < u, w > + < v, w >.
3.1.2 Example
The scalar product on R2 and R3 given by < (x1, y1), (x2, y2) > = x1x2 + y1y2 is an inner product. This inner product can be generalized to Rn by

< (a1, a2, ..., an), (b1, b2, ..., bn) > = a1b1 + a2b2 + ... + anbn.

This is called the standard scalar product on Rn. For Cn,

< (a1, a2, ..., an), (b1, b2, ..., bn) > = a1\overline{b1} + a2\overline{b2} + ... + an\overline{bn}.

This is called the standard scalar product on Cn.
3.1.3 Example
The form < , > on R2 given by < (x1, y1), (x2, y2) > = x1x2 + x1y2 + x2y1 + 2y1y2 is an inner product. Checking (1), (2), (3) is straightforward. For (4), by completing the square,

< (x1, y1), (x1, y1) > = x1^2 + 2x1y1 + 2y1^2 = (x1 + y1)^2 + y1^2 ≥ 0,

and this is 0 iff x1 + y1 = 0 and y1 = 0, that is, iff (x1, y1) = (0, 0).
3.1.4 Example
In the vector space C[a, b], for any f, g ∈ C[a, b] let

< f, g > = ∫_a^b f(x)g(x) dx.

This is an inner product.

Proof:

1. < f, g + h > = ∫_a^b f(x)[g(x) + h(x)] dx = ∫_a^b f(x)g(x) dx + ∫_a^b f(x)h(x) dx = < f, g > + < f, h >.

2. < f, g > = ∫_a^b f(x)g(x) dx = ∫_a^b g(x)f(x) dx = < g, f >.

3. < αf, g > = ∫_a^b αf(x)g(x) dx = α ∫_a^b f(x)g(x) dx = α < f, g >.

4. Since (f(x))^2 ≥ 0 for all x ∈ [a, b],

< f, f > = ∫_a^b (f(x))^2 dx ≥ 0.

Suppose < f, f > = ∫_a^b (f(x))^2 dx = 0 and f(c) ≠ 0 for some c ∈ [a, b]. Then (f(c))^2 > 0, and since y = f(x) is continuous, y = (f(x))^2 is continuous. Hence (f(x))^2 is positive in some region around x = c, and therefore ∫_a^b (f(x))^2 dx > 0. Thus f(c) ≠ 0 is not possible: if ∫_a^b (f(x))^2 dx = 0 then f(x) = 0 for all x. This makes C[a, b] an inner product space.

In C[−1, 1], what is < x^2 − 1, x^3 >?
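The question above can be answered numerically: the integrand (x^2 − 1)x^3 is odd, so the inner product is 0 over the symmetric interval [−1, 1]. A midpoint-rule sketch (the helper `inner` is hypothetical, not from the text):

```python
def inner(f, g, a, b, steps=10000):
    """Midpoint-rule approximation of the inner product integral of f*g on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h)
               for i in range(steps)) * h

val = inner(lambda x: x**2 - 1, lambda x: x**3, -1.0, 1.0)
print(abs(val) < 1e-9)  # True: the integrand is an odd function
```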
3.1.5 Proposition. (consequences of the definition of inner product)
Let V be an inner product space. Then
1. < 0v, v > = 0 for any v ∈ V.

2. If u ∈ V is such that < u, v > = 0 for all v ∈ V, then u = 0v.

3. < α1u1 + ... + αkuk, β1v1 + ... + βmvm > = ∑_{i=1}^{k} ∑_{j=1}^{m} αi \overline{βj} < ui, vj >, a total of km terms.

Proof. Exercise. �

Which vector spaces can be turned into an inner product space?
3.1.6 Theorem
Let V be a finite dimensional vector space over C. Then V can be given the structure of an innerproduct space.
Proof. Suppose dim(V ) = n. Let u1, u2, ..., un be a basis for V. If u, v ∈ V, then there exist unique scalars x1, ..., xn, y1, ..., yn such that

u = x1u1 + x2u2 + ... + xnun,  v = y1u1 + y2u2 + ... + ynun.

Define < u, v > = x1\overline{y1} + x2\overline{y2} + ... + xn\overline{yn}. Then axioms 1, 2, 3 are routine to verify. For axiom 4,

< u, u > = x1\overline{x1} + ... + xn\overline{xn} = |x1|^2 + |x2|^2 + ... + |xn|^2 ≥ 0,

and this is 0 iff x1 = x2 = ... = xn = 0, that is, iff u = 0. �
3.1.7 Example
In R2 (over R, so the conjugates drop out), a basis is e1 = (1, 0), e2 = (0, 1). For any vector u = (x, y) we have (x, y) = xe1 + ye2, and so the inner product constructed above is just the scalar product x1x2 + y1y2. But with the basis e = (1, 0), f = (1, 1), we have (x, y) = (x − y)e + yf. Hence,

< (x1, y1), (x2, y2) > = (x1 − y1)(x2 − y2) + y1y2 = x1x2 − x1y2 − x2y1 + 2y1y2.
3.2 Length and Distance
3.2.1 Definition
Let V be an inner product space. For u ∈ V, (< u, u >)^{1/2} is called the length of u, or the norm of u, and is denoted by ||u||. (Note that < u, u > is a nonnegative real number.)
3.2.2 Example
Let V = R2 with the scalar product, and let u = (x, y). Then < u, u > = x^2 + y^2, so ||u|| = √(x^2 + y^2).

If ||u|| = 1, then u is called a unit vector. Note that if u ≠ 0, then u/||u|| is a unit vector, since

< u/||u||, u/||u|| > = (1/||u||) < u, (1/||u||) u > = (1/||u||^2) < u, u > = (1/||u||^2) ||u||^2 = 1.
3.2.3 Definition
Let V be an inner product space and u, v ∈ V. We define the distance of u from v to be ||u − v||.Note that ||u− v|| = ||v − u||.
3.2.4 Example
In R2 with the scalar product, let u = (x, y), v = (a, b). Then u − v = (x − a, y − b), so ||u − v||^2 = (x − a)^2 + (y − b)^2 and ||u − v|| = √((x − a)^2 + (y − b)^2).
3.2.5 Example
Let V = C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx. Then

||f − g||^2 = ∫_a^b (f(x) − g(x))^2 dx.
3.3 Orthogonal and Orthonormal Sets
Let V be an inner product space.
3.3.1 Definition
Let v1, v2, ..., vk be elements of V. Then

1. v1, v2, ..., vk are said to form an orthogonal set of vectors if < vi, vj > = 0 whenever i ≠ j.

2. v1, v2, ..., vk are said to form an orthonormal set of vectors if < vi, vj > = 0 when i ≠ j and < vi, vj > = 1 when i = j.
Note that
1. An orthogonal set can contain the zero vector.

2. If v1, v2, ..., vk is an orthogonal set of nonzero vectors, then v1/||v1||, v2/||v2||, ..., vk/||vk|| is an orthonormal set.
3.3.2 Example
In R2 with the scalar product, (−1, 1), (1, 1) is an orthogonal set. Since ||(−1, 1)|| = √2 and ||(1, 1)|| = √2, an orthonormal set is (−1/√2, 1/√2), (1/√2, 1/√2).
3.3.3 Example
In C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx, orthogonality means ∫_a^b f(x)g(x) dx = 0. Thus cos x, sin x are orthogonal in C[−π, π], since

< cos x, sin x > = ∫_{−π}^{π} cos x sin x dx = ∫_{−π}^{π} (1/2) sin 2x dx = −(1/4)[cos 2x]_{−π}^{π} = −(1/4)(cos 2π − cos(−2π)) = 0,

and

< cos x, cos x > = ∫_{−π}^{π} cos^2 x dx = (1/2) ∫_{−π}^{π} (cos 2x + 1) dx = (1/2)[(sin 2x)/2 + x]_{−π}^{π} = π.

Therefore || cos x|| = √π. Similarly, || sin x|| = √π. Hence an orthonormal set is cos x/√π, sin x/√π.
3.3.4 Theorem
Let u, v be vectors in an inner product space and let a ∈ C. Then

1. ||au|| = |a| ||u||.

2. | < u, v > | ≤ ||u|| ||v||. (Cauchy-Schwarz inequality)

3. ||u + v|| ≤ ||u|| + ||v||. (triangle inequality)

Proof.

1. ||au||^2 = < au, au > = a\overline{a} < u, u > = |a|^2 ||u||^2. Taking positive square roots, ||au|| = |a| ||u||.

2. If u = 0v or v = 0v, then both sides are 0. So suppose v ≠ 0v. Put α = < u, v >/||v||^2 and consider < u − αv, u − αv >. Then

0 ≤ < u − αv, u − αv > = < u, u > + < u, −αv > + < −αv, u > + < −αv, −αv >
  = < u, u > − \overline{α} < u, v > − α < v, u > + α\overline{α} < v, v >.   (∗)

But

\overline{α} < u, v > = (\overline{< u, v >}/||v||^2) < u, v > = | < u, v > |^2 / ||v||^2,
α < v, u > = (< u, v >/||v||^2) \overline{< u, v >} = | < u, v > |^2 / ||v||^2,
α\overline{α} < v, v > = (| < u, v > |^2 / ||v||^4) ||v||^2 = | < u, v > |^2 / ||v||^2.

So (∗) becomes 0 ≤ ||u||^2 − | < u, v > |^2/||v||^2, and therefore | < u, v > |^2 ≤ ||u||^2 ||v||^2. Again taking positive square roots, we get | < u, v > | ≤ ||u|| ||v||.

3.

||u + v||^2 = < u + v, u + v >
  = < u, u > + < u, v > + < v, u > + < v, v >
  = ||u||^2 + < u, v > + \overline{< u, v >} + ||v||^2
  = ||u||^2 + 2Re(< u, v >) + ||v||^2
  ≤ ||u||^2 + 2| < u, v > | + ||v||^2
  ≤ ||u||^2 + 2||u|| ||v|| + ||v||^2
  = (||u|| + ||v||)^2.

Taking positive square roots, ||u + v|| ≤ ||u|| + ||v||. �
In terms of our usual examples, the Cauchy-Schwarz inequality gives:

1. In Rn with the scalar product: if u = (x1, ..., xn), v = (y1, ..., yn) then

| < u, v > | = |x1y1 + ... + xnyn| ≤ (x1^2 + ... + xn^2)^{1/2} (y1^2 + ... + yn^2)^{1/2}.

This holds for any real numbers x1, ..., xn, y1, ..., yn.

2. In C[a, b] with < f, g > = ∫_a^b f(x)g(x) dx, we have

| ∫_a^b f(x)g(x) dx | ≤ ( ∫_a^b (f(x))^2 dx )^{1/2} ( ∫_a^b (g(x))^2 dx )^{1/2}.

If u ≠ 0v and v ≠ 0v, Cauchy-Schwarz says | < u, v > | ≤ ||u|| ||v||, and so | < u, v > |/(||u|| ||v||) ≤ 1. In a real space this means

−1 ≤ < u, v >/(||u|| ||v||) ≤ 1.

Hence there is a unique θ with 0 ≤ θ ≤ π such that < u, v >/(||u|| ||v||) = cos θ, i.e. < u, v > = ||u|| ||v|| cos θ. (θ is defined to be the angle between u and v.)
3.3.5 Theorem
Let u1, u2, ..., ur be nonzero orthogonal vectors in the inner product space V. Then u1, u2, ..., ur are linearly independent.

Proof. Let a1, ..., ar be scalars such that a1u1 + ... + arur = 0v. Then, for each j,

0 = < 0v, uj > = < a1u1 + ... + arur, uj >
  = a1 < u1, uj > + ... + aj < uj, uj > + ... + ar < ur, uj >
  = a1·0 + ... + aj < uj, uj > + ... + ar·0
  = aj < uj, uj >.

Now < uj, uj > ≠ 0 since uj ≠ 0 for all 1 ≤ j ≤ r, and therefore aj = 0 for all j. �

Note that in an n-dimensional vector space, the number of elements in an orthogonal set of nonzero vectors cannot exceed n. In the next theorem we show that we can always find a basis consisting of orthogonal vectors for any finite dimensional inner product space.
3.3.6 Theorem
(Gram-Schmidt Orthogonalisation Process) Let u1, u2, ..., un be linearly independent vectors in an inner product space V. Then there exists an orthonormal set of vectors w1, w2, ..., wn such that for 1 ≤ k ≤ n, the subspace spanned by u1, ..., uk is the same as the subspace spanned by w1, ..., wk.

Proof. We proceed by induction. First, we construct a nonzero orthogonal set v1, ..., vn. Let v1 = u1. This obviously has the required spanning property. Next suppose we have constructed a nonzero orthogonal set v1, ..., vr such that [v1, ..., vk] = [u1, ..., uk] for 1 ≤ k ≤ r. Define

vr+1 = ur+1 − (< ur+1, v1 >/||v1||^2) v1 − (< ur+1, v2 >/||v2||^2) v2 − ... − (< ur+1, vr >/||vr||^2) vr.   (∗)

If vr+1 = 0v, then rearranging (∗) writes ur+1 as a linear combination of v1, ..., vr. Hence ur+1 ∈ [v1, ..., vr] = [u1, ..., ur], so ur+1 is a linear combination of u1, ..., ur, which contradicts the linear independence of u1, ..., un. Hence vr+1 ≠ 0v. Also, for 1 ≤ j ≤ r, since < vi, vj > = 0 for i ≠ j,

< vr+1, vj > = < ur+1, vj > − (< ur+1, vj >/||vj||^2) < vj, vj > = 0.

Therefore v1, ..., vr, vr+1 is an orthogonal set of nonzero vectors. It remains to show that [u1, ..., ur+1] = [v1, ..., vr+1]. So let

M1 = [u1, ..., ur+1], M2 = [v1, ..., vr+1].

By hypothesis, u1, ..., ur ∈ M2, and by (∗), ur+1 ∈ M2. It follows that M1 ⊆ M2. Hence r + 1 = dim M1 ≤ dim M2 ≤ r + 1. Since M1 ⊆ M2 and dim M1 = dim M2, we get M1 = M2.

Finally, put wj = vj/||vj|| for all j. �
3.3.7 Corollary
Every finite dimensional inner product space possesses an orthonormal basis.
3.3.8 Example
Find an orthonormal basis for the subspace of R3 (with the scalar product as inner product) spanned by (1, 2, 1), (3, 4, 1), and extend it to an orthonormal basis for R3.

Solution: Let u1 = (1, 2, 1), u2 = (3, 4, 1). Clearly u1, u2 are linearly independent. Let v1 = u1 and

v2 = u2 − (< u2, v1 >/||v1||^2) v1 = (3, 4, 1) − (12/6)(1, 2, 1) = (1, 0, −1).

The required orthonormal basis is w1 = (1/√6, 2/√6, 1/√6), w2 = (1/√2, 0, −1/√2).

To extend to an orthonormal basis for R3, we need a nonzero vector v3 such that < v3, v1 > = 0 and < v3, v2 > = 0. Writing v3 = (a, b, c), we get < v3, v1 > = a + 2b + c = 0 and < v3, v2 > = a − c = 0. So a = c and b = −a, and we can choose v3 = (1, −1, 1) and w3 = (1/√3, −1/√3, 1/√3).
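The Gram-Schmidt process of Theorem 3.3.6 can be sketched in code; a minimal version for R^n with the standard scalar product, applied to the two vectors of this example (the helper names are ours, not from the text):

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace (assumes the
    input vectors are linearly independent, as the theorem requires)."""
    basis = []
    for u in vectors:
        v = list(u)
        for w in basis:
            c = dot(v, w)                  # component of v along w
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = sqrt(dot(v, v))
        basis.append([vi / norm for vi in v])
    return basis

w = gram_schmidt([(1, 2, 1), (3, 4, 1)])
print(w[0])  # ≈ [0.408, 0.816, 0.408], i.e. (1/sqrt(6))(1, 2, 1)
print(w[1])  # ≈ [0.707, 0.0, -0.707], i.e. (1/sqrt(2))(1, 0, -1)
```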
Now suppose u1, ..., un is an orthonormal basis for the n-dimensional inner product space V, and let u ∈ V. Then there are scalars a1, ..., an such that u = a1u1 + ... + anun. So for any 1 ≤ j ≤ n,

< u, uj > = < a1u1 + ... + anun, uj > = ∑_{h=1}^{n} ah < uh, uj > = aj < uj, uj > = aj.

Note that < ui, uj > = δij, which is 0 if i ≠ j and 1 if i = j. Hence, for any u ∈ V,

u = ∑_{h=1}^{n} < u, uh > uh.
3.3.9 Example
Express (1, 1, 1) as a linear combination of the orthonormal basis in the previous example.

Solution: With u = (1, 1, 1), a1 = < u, w1 > = 4/√6, a2 = < u, w2 > = 0, a3 = < u, w3 > = 1/√3. So

u = (1, 1, 1) = (4/√6) w1 + (1/√3) w3.
3.4 Infinite Dimensional Inner Product Spaces

Let V be a vector space over a field F. A subset X of V is said to be linearly independent in V if every finite subset of X is linearly independent. If V is an inner product space, we say that {ui}_{i∈I} is an orthonormal set if

< ui, uj > = 1 if i = j and < ui, uj > = 0 if i ≠ j.

If {ui}_{i∈I} is an orthonormal set, then any finite subset is linearly independent. Hence, in this case, orthonormal ⇒ linearly independent.

Let V be an infinite dimensional inner product space. In Chapter 1, we saw that there exists an infinite sequence u1, u2, u3, ... such that u1, u2, ..., uk is linearly independent for all k ≥ 1. Applying Gram-Schmidt to u1, u2, ..., uk for each k therefore produces an infinite orthonormal sequence w1, w2, w3, ... such that

[u1, u2, ..., uk] = [w1, w2, ..., wk] for all k ≥ 1.

Hence, every infinite dimensional inner product space has an infinite sequence of orthonormal vectors.
3.4.1 Example
The sequence 1, cos x, cos 2x, ..., cos nx, sin x, sin 2x, ..., sin nx is an orthogonal set of functions in C[−π, π] with inner product ∫_{−π}^{π} f(x)g(x) dx. The corresponding orthonormal set is

1/√(2π), cos x/√π, cos 2x/√π, ..., cos nx/√π, sin x/√π, sin 2x/√π, ..., sin nx/√π.

It follows that

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

is an infinite orthonormal (and therefore linearly independent) sequence in C[−π, π].

The principal application of this chapter is the theory of Fourier series, which expresses a function in the form

(1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx).

So far we have worked with continuous functions. This turns out to be too restrictive. Sums, scalar multiples and integrals remain well defined if we allow reasonable discontinuities.
3.4.2 Definition
A function f : [a, b] → R is called piecewise continuous on [a, b] if for each x0 ∈ [a, b],

lim_{x→x0+} f(x) and lim_{x→x0−} f(x)

both exist (taking the appropriate one-sided limit at the endpoints), and f has at most a finite number of discontinuities. When the limits exist, we denote them by f(x0+) and f(x0−).

Piecewise continuous ≡ at most a finite number of discontinuities, and left and right limits at all points.

Let f, g be piecewise continuous on [a, b]. Then f, g have at most a finite number of discontinuities, so f + g has this property too; moreover (f + g)(x0+) and (f + g)(x0−) both exist. Thus the sum of piecewise continuous functions is piecewise continuous. Also, for any α ∈ R, αf is piecewise continuous.
Notation. Let PC[a, b] denote the collection of all piecewise continuous functions on [a, b]. The calculation above shows that PC[a, b] is closed under addition and scalar multiplication. Hence PC[a, b] is a subspace of the vector space of all real valued functions. Let f ∈ PC[a, b] with discontinuities at x1 < x2 < ... < xn and continuous everywhere else. Then

∫_a^b f(x) dx = ∫_a^{x1} f(x) dx + ∫_{x1}^{x2} f(x) dx + ... + ∫_{x_{n−1}}^{x_n} f(x) dx + ∫_{x_n}^{b} f(x) dx,

and this defines ∫_a^b f(x) dx. Therefore < f, g > = ∫_a^b f(x)g(x) dx is well defined. However, this is not an inner product on PC[a, b]. Consider f ∈ PC[0, 1] where f(x) = 1 at x = 0.3 and x = 0.7, while f(x) = 0 elsewhere. Then ∫_0^1 (f(x))^2 dx = 0 while f ≠ 0.
3.5 Normalized Functions
3.5.1 Definition
(Normalized piecewise continuous function) A piecewise continuous function f : [a, b] → R is called normalized if

f(x) = (1/2)[f(x+) + f(x−)]

for all x ∈ (a, b), and f(a) = f(b) = (1/2)[f(a+) + f(b−)].

Note that if f is continuous, then f(x) = f(x+) = f(x−).
3.5.2 Theorem
Let NPC[a, b] denote the set of functions f : [a, b] → R that are normalized and piecewise continuous on [a, b]. Then NPC[a, b] is an inner product space, with the usual addition and scalar multiplication of functions and inner product

< f, g > = ∫_a^b f(x)g(x) dx.

Proof. We have already seen that the sum of piecewise continuous functions is again piecewise continuous, and similarly for multiplication by scalars. Let f, g ∈ NPC[a, b]. Then

(f + g)(x0) = f(x0) + g(x0)
  = [f(x0+) + f(x0−)]/2 + [g(x0+) + g(x0−)]/2
  = [f(x0+) + g(x0+) + f(x0−) + g(x0−)]/2
  = [(f + g)(x0+) + (f + g)(x0−)]/2.

Also

(αf)(x0) = α f(x0) = α[f(x0+) + f(x0−)]/2 = [(αf)(x0+) + (αf)(x0−)]/2.

Therefore NPC[a, b] is closed under addition and scalar multiplication. The rest of the vector space axioms follow from the properties of functions.

For the inner product we only prove the last axiom. Suppose f ∈ NPC[a, b] and < f, f > = 0, that is, ∫_a^b (f(x))^2 dx = 0. We show f is the zero function. To the contrary, assume there is x0 ∈ [a, b] such that f(x0) ≠ 0. Since f(x0) = [f(x0+) + f(x0−)]/2 ≠ 0, not both of f(x0+) and f(x0−) can be 0; say f(x0+) ≠ 0. Then (f(x0+))^2 > 0. Since f has only a finite number of discontinuities, we must have |f(x)| > 0 in some range x0 < x < c. So

0 < ∫_{x0}^{c} (f(x))^2 dx ≤ ∫_a^b (f(x))^2 dx.

This is a contradiction. �
3.5.3 Definition

1. A function f : R → R is said to be periodic of period 2π if f(x + 2π) = f(x) for all x ∈ R.

2. Let f : [−π, π] → R. The periodic extension of f is the function f̃ : R → R which is periodic of period 2π and satisfies f̃(x) = f(x) for −π ≤ x ≤ π.

(Thus, f̃(x) = f(x − 2nπ), where n ∈ Z is such that x − 2nπ ∈ [−π, π].)
3.5.4 Example

f(x) = x + 2π if −3π < x < −π; 0 if x = −π; x if −π < x < π; 0 if x = π; x − 2π if π < x < 3π; and so on.
3.6 Projection onto Subspaces: (Best Approximation)
3.6.1 Lemma
Let w1, ..., wn be a basis for a subspace W of the real inner product space V. For v ∈ V, the following are equivalent:

1. < v, w > = 0 for all w ∈ W.

2. < v, wi > = 0 for 1 ≤ i ≤ n.

Proof. (1) ⇒ (2) is obvious. (2) ⇒ (1): Let w ∈ W. Then there are scalars α1, ..., αn ∈ R such that w = α1w1 + ... + αnwn. So

< v, w > = < v, α1w1 + ... + αnwn > = ∑_{i=1}^{n} αi < v, wi > = 0. �

A vector v satisfying these conditions is said to be perpendicular (or orthogonal) to W, and we write v ⊥ W.
3.6.2 Lemma: (Pythagorean Lemma)
If u1, ..., uk are orthogonal, then ||u1 + u2 + ... + uk||^2 = ||u1||^2 + ... + ||uk||^2.

Proof. Since u1, ..., uk are orthogonal, < ui, uj > = 0 for all i ≠ j. Hence

||u1 + u2 + ... + uk||^2 = < u1 + ... + uk, u1 + ... + uk >
  = < u1, u1 > + < u2, u2 > + ... + < uk, uk >
  = ||u1||^2 + ||u2||^2 + ... + ||uk||^2. �
3.6.3 Theorem: Orthogonal Projection
Let V be a real inner product space, let W be a finite dimensional subspace of V, and let v ∈ V. Then:

1. v can be written uniquely in the form v = w + x where w ∈ W and x ⊥ W.

2. w is the best approximation to v by a member of W, in the sense that for all w′ ∈ W with w′ ≠ w,

||v − w′|| > ||v − w||.

Proof.

1. Let w1, ..., wn be an orthonormal basis for W. Let

w = ∑_{i=1}^{n} < v, wi > wi

and put x = v − w. Clearly w ∈ W. To show that x ⊥ W it is sufficient, by Lemma 3.6.1, to show that < x, wi > = 0 for 1 ≤ i ≤ n. So

< x, wi > = < v − w, wi > = < v, wi > − < ∑_{j=1}^{n} < v, wj > wj, wi > = < v, wi > − < v, wi > = 0.

For uniqueness, suppose also v = x̃ + w̃, where w̃ ∈ W and x̃ ⊥ W. Write w̃ = α1w1 + ... + αnwn. Then

αj = < w̃, wj > = < v − x̃, wj > = < v, wj > − < x̃, wj > = < v, wj >.

Hence w̃ = ∑_{j=1}^{n} αj wj = ∑_{j=1}^{n} < v, wj > wj = w, and so x̃ = x. This gives the uniqueness.

2. For w′ ∈ W with w′ ≠ w,

v − w′ = (v − w) + (w − w′).

Note that x = v − w ⊥ W and w − w′ ∈ W, so x ⊥ (w − w′). By the Pythagorean lemma,

||v − w′||^2 = ||(v − w) + (w − w′)||^2 = ||v − w||^2 + ||w − w′||^2 > ||v − w||^2. �
3.6.4 Example
Find the distance of (1, 3, 2) in R3 from the plane through (0, 0, 0), (1, 0, 0), (1, 1, 1).

Solution: Apply Gram-Schmidt to (1, 0, 0), (1, 1, 1) to get an orthonormal basis for the plane W. Let v1 = (1, 0, 0); then v2 = (1, 1, 1) − (1/1)(1, 0, 0) = (0, 1, 1). Hence

w1 = (1, 0, 0), w2 = (0, 1/√2, 1/√2).

Now, with v = (1, 3, 2),

w = ∑_{i=1}^{2} < v, wi > wi = 1·(1, 0, 0) + (5/√2)·(0, 1/√2, 1/√2) = (1, 5/2, 5/2).

So x = v − w = (1, 3, 2) − (1, 5/2, 5/2) = (0, 1/2, −1/2), and the distance is ||x|| = √(1/4 + 1/4) = 1/√2.
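The projection computation of this example can be reproduced numerically; a sketch assuming the orthonormal basis w1, w2 already found by Gram-Schmidt:

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v = (1, 3, 2)
w1 = (1, 0, 0)
w2 = (0, 1 / sqrt(2), 1 / sqrt(2))

# orthogonal projection of v onto W = [w1, w2]
proj = [dot(v, w1) * a + dot(v, w2) * b for a, b in zip(w1, w2)]
x = [vi - pi for vi, pi in zip(v, proj)]   # component of v perpendicular to W
dist = sqrt(dot(x, x))
print(round(dist, 4))  # 0.7071, i.e. 1/sqrt(2)
```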
3.7 Convergence in an Inner Product Space
3.7.1 Definition
Let V be an inner product space and let (un) be a sequence of vectors in V. We say (un) converges to u if

lim_{n→∞} ||un − u|| = 0.

(Note that ||un − u|| is a real number.) In the case of functions, ||fn − f|| → 0 means that the area between fn and f tends to 0.
3.8 Definition
Let xk ∈ V. Then ∑_{k=1}^{∞} xk is said to converge to x if ||∑_{k=1}^{n} xk − x|| → 0 as n → ∞.
3.8.1 Theorem
Let (un) → u and (vn) → v in an inner product space V, let ∑_{n=1}^{∞} xn be a convergent series in V, and let (wn) be an infinite orthonormal sequence in V. Then

1. lim_{n→∞} (un + vn) = u + v.

2. lim_{n→∞} (λun) = λu where λ ∈ F.

3. For w ∈ V, lim_{n→∞} < un, w > = < u, w >; that is, < lim_{n→∞} un, w > = lim_{n→∞} < un, w >.

4. < ∑_{n=1}^{∞} xn, w > = ∑_{n=1}^{∞} < xn, w >. (This includes the statement that the series of real numbers on the right hand side is convergent.)

5. Let α1, α2, ... be scalars such that for y ∈ V, ∑_{n=1}^{∞} αn wn = y. Then αn = < y, wn >.

Proof.

1. By the triangle inequality,

0 ≤ ||(un + vn) − (u + v)|| = ||(un − u) + (vn − v)|| ≤ ||un − u|| + ||vn − v|| → 0 + 0 = 0.

2. 0 ≤ ||λun − λu|| = |λ| ||un − u|| → |λ|·0 = 0.

3. By Cauchy-Schwarz,

| < un, w > − < u, w > | = | < un − u, w > | ≤ ||un − u|| ||w|| → 0·||w|| = 0.

Hence < un, w > → < u, w >, that is, lim_{n→∞} < un, w > = < u, w > = < lim_{n→∞} un, w >.

4.

< ∑_{n=1}^{∞} xn, w > = < lim_{n→∞} ∑_{k=1}^{n} xk, w >
  = lim_{n→∞} < ∑_{k=1}^{n} xk, w >  (by part 3)
  = lim_{n→∞} ∑_{k=1}^{n} < xk, w >  (by linearity of the inner product)
  = ∑_{k=1}^{∞} < xk, w >.

5.

< y, wn > = < ∑_{k=1}^{∞} αk wk, wn >
  = ∑_{k=1}^{∞} < αk wk, wn >  (by part 4)
  = ∑_{k=1}^{∞} αk < wk, wn >  (using orthonormality)
  = αn·1 = αn. �
3.9 Fourier Series
Consider the infinite orthonormal sequence

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

and the inner product on NPC[−π, π] given by

< f, g > = ∫_{−π}^{π} f(t)g(t) dt.

Then, for f ∈ NPC[−π, π], the series ∑_k < f, wk > wk becomes

A0 (1/√(2π)) + ∑_{n=1}^{∞} ( An cos nx/√π + Bn sin nx/√π )   (∗)

where

A0 = ∫_{−π}^{π} f(t) (1/√(2π)) dt,
An = ∫_{−π}^{π} f(t) (cos nt/√π) dt,
Bn = ∫_{−π}^{π} f(t) (sin nt/√π) dt.

Thus (∗) becomes

(1/(2π)) ∫_{−π}^{π} f(t) dt + ∑_{n=1}^{∞} [ ( (1/π) ∫_{−π}^{π} f(t) cos nt dt ) cos nx + ( (1/π) ∫_{−π}^{π} f(t) sin nt dt ) sin nx ].

This is of the form

(1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx)

where

a0 = (1/π) ∫_{−π}^{π} f(t) dt,
an = (1/π) ∫_{−π}^{π} f(t) cos nt dt,
bn = (1/π) ∫_{−π}^{π} f(t) sin nt dt.

These numbers are called the Fourier coefficients of f.

Let w1, w2, ..., wn, ... be an infinite orthonormal sequence in V and let v ∈ V. Then < v, wn > is called the n-th generalized Fourier coefficient of v relative to w1, w2, ..., wn, ...
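The Fourier coefficient formulas above can be evaluated numerically; a sketch using a midpoint rule, tried on the test function f(x) = x (this particular f and the helper names are ours, not from the text):

```python
from math import pi, cos, sin

def integrate(g, a, b, steps=20000):
    """Simple midpoint rule; adequate for these smooth integrands."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def fourier_coeffs(f, n):
    """a_n and b_n of f on [-pi, pi], per the formulas in the text."""
    a_n = integrate(lambda t: f(t) * cos(n * t), -pi, pi) / pi
    b_n = integrate(lambda t: f(t) * sin(n * t), -pi, pi) / pi
    return a_n, b_n

# For f(x) = x the coefficients are a_n = 0, b_n = 2(-1)^{n+1}/n.
for n in (1, 2, 3):
    a_n, b_n = fourier_coeffs(lambda x: x, n)
    print(n, round(a_n, 6), round(b_n, 6))
```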
3.9.1 Definition
The infinite orthonormal sequence w1, w2, ..., wn, ... in V is said to be complete in V if

∑_{n=1}^{∞} < v, wn > wn = v for all v ∈ V.
3.9.2 Theorem
Let w1, w2, ..., wn, ... be an infinite orthonormal sequence in the inner product space V, and let u, v ∈ V. Then

1. Bessel's inequality: ∑_{n=1}^{∞} | < v, wn > |^2 is convergent and

∑_{n=1}^{∞} | < v, wn > |^2 ≤ ||v||^2.

2. Riemann-Lebesgue lemma: < v, wn > → 0 as n → ∞.

3. The following statements are equivalent:

(a) w1, w2, ..., wn, ... is a complete sequence in V.

(b) ∑_{n=1}^{∞} < v, wn >^2 = ||v||^2 for all v ∈ V. (Parseval's identity)

(c) ∑_{n=1}^{∞} < u, wn > < v, wn > = < u, v > for all u, v ∈ V. (Plancherel's identity)
Proof. Fix n and put W = [w1, w2, ..., wn]. By the orthogonal projection theorem, we can write

v = xn + un, where un = ∑_{k=1}^{n} < v, wk > wk and xn ⊥ un.

By the Pythagorean lemma, ||v||^2 = ||xn||^2 + ||un||^2.

1.

||un||^2 = ||∑_{k=1}^{n} < v, wk > wk||^2
  = ∑_{k=1}^{n} || < v, wk > wk ||^2   (Pythagorean lemma)
  = ∑_{k=1}^{n} | < v, wk > |^2 ||wk||^2
  = ∑_{k=1}^{n} | < v, wk > |^2,

and

∑_{k=1}^{n} | < v, wk > |^2 = ||un||^2 ≤ ||un||^2 + ||xn||^2 = ||v||^2.

Hence ∑_{k=1}^{∞} | < v, wk > |^2 is a series of nonnegative terms whose partial sums are bounded above by ||v||^2, so the series converges with sum ≤ ||v||^2.

2. Since ∑_{n=1}^{∞} < v, wn >^2 converges, < v, wn >^2 → 0 and so < v, wn > → 0.

3. (c) ⇒ (b): Put u = v.

(b) ⇒ (a):

||v − ∑_{k=1}^{n} < v, wk > wk||^2 = ||xn||^2 = ||v||^2 − ||un||^2 = ||v||^2 − ∑_{k=1}^{n} < v, wk >^2 → 0 (since (b) holds).

(a) ⇒ (c): Assume (a) holds, so u = ∑_{k=1}^{∞} αk wk where αk = < u, wk >. Then

< u, v > = < ∑_{k=1}^{∞} αk wk, v >
  = ∑_{k=1}^{∞} < αk wk, v >   (by part 4 of Theorem 3.8.1)
  = ∑_{k=1}^{∞} αk < wk, v >
  = ∑_{k=1}^{∞} < u, wk > < v, wk >. �
3.9.3 Lemma
Let h be periodic of period 2π and piecewise continuous. Then

∫_{−π}^{π} h(t) dt = ∫_{−π+a}^{π+a} h(t) dt.

Proof. Write I1 = ∫_{−π}^{π} h(t) dt and I2 = ∫_{−π+a}^{π+a} h(t) dt. Then

I1 = ∫_{−π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(t) dt,
I2 = ∫_{−π}^{π+a} h(t) dt − ∫_{−π}^{−π+a} h(t) dt.

So

I2 − I1 = ∫_{π}^{π+a} h(t) dt − ∫_{−π}^{−π+a} h(t) dt.

Put u = t + 2π in the second integral; then, by periodicity,

I2 − I1 = ∫_{π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(u − 2π) du = ∫_{π}^{π+a} h(t) dt − ∫_{π}^{π+a} h(u) du = 0.

Hence I1 = I2. �
3.9.4 Lemma
Let h : R → R be integrable. Then

1. If h is even, ∫_{−π}^{π} h(t) dt = 2 ∫_{0}^{π} h(t) dt.

2. If h is odd, ∫_{−π}^{π} h(t) dt = 0.

Proof. Exercise. �
Proof. Exercise. �
3.9.5 Example
Let f : R → R be given by f(x) = x + π if −π < x < 0, and f(x) = x − π if 0 < x < π. Then f is piecewise continuous and periodic of period 2π, and

a0 = (1/π) ∫_{−π}^{π} f(x) dx,
an = (1/π) ∫_{−π}^{π} f(t) cos nt dt,
bn = (1/π) ∫_{−π}^{π} f(t) sin nt dt.

Note that f and sin are odd functions, while cos is even. Hence f(t) cos nt is odd while f(t) sin nt is even. Therefore a0 = an = 0 and

bn = (2/π) ∫_{0}^{π} f(t) sin nt dt = (2/π) ∫_{0}^{π} (t − π) sin nt dt = −2/n.

So the Fourier series is ∑_{n=1}^{∞} (−2 sin nx)/n, and we write f(x) ∼ ∑_{n=1}^{∞} (−2 sin nx)/n.
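The value bn = −2/n in this example can be checked by numerical integration; a small sketch (the helper names are ours):

```python
from math import pi, sin

def integrate(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g on [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def b_coeff(n):
    # b_n = (2/pi) * integral of (t - pi) sin(nt) over [0, pi]
    return (2 / pi) * integrate(lambda t: (t - pi) * sin(n * t), 0, pi)

for n in (1, 2, 5):
    print(n, round(b_coeff(n), 6))  # approx -2/n
```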
3.9.6 Example
The function f : R → R given by f(x) = x(π − x), −π < x < π, is periodic of period 2π and normalized piecewise continuous. This function is neither even nor odd. So

a0 = (1/π) ∫_{−π}^{π} x(π − x) dx = −(2/3)π^2,

an = (1/π) ∫_{−π}^{π} x(π − x) cos nx dx
   = −(1/π) ∫_{−π}^{π} x^2 cos nx dx   (xπ cos nx is odd)
   = −(1/π) [x^2 sin nx/n]_{−π}^{π} + (1/π) ∫_{−π}^{π} (2x sin nx)/n dx
   = (4/(nπ)) ∫_{0}^{π} x sin nx dx   (x sin nx is even)
   = (4/(nπ)) ( [−x cos nx/n]_0^{π} + ∫_0^{π} (cos nx)/n dx )
   = (4/(nπ)) (−π cos nπ/n)
   = (4/n^2)(−1)^{n+1},

bn = (1/π) ∫_{−π}^{π} x(π − x) sin nx dx
   = (1/π) ∫_{−π}^{π} (xπ sin nx − x^2 sin nx) dx
   = ∫_{−π}^{π} x sin nx dx   (x^2 sin nx is odd)
   = 2 ∫_{0}^{π} x sin nx dx
   = 2 [−x cos nx/n]_0^{π} + 2 ∫_0^{π} (cos nx)/n dx
   = (2π/n)(−1)^{n+1}.

Hence

f(x) ∼ −π^2/3 + ∑_{n=1}^{∞} (−1)^{n+1} ( (4/n^2) cos nx + (2π/n) sin nx ).
We now consider convergence of the Fourier series: for x0 ∈ R, under what circumstances does it follow that

f(x0) = (1/2)a0 + ∑_{n=1}^{∞} (an cos nx0 + bn sin nx0)?

3.9.7 Definition

Suppose f : R → R is normalized and piecewise continuous on R. If lim_{h→0+} [f(x0 + h) − f(x0)]/h exists, then it is called the right derivative of f at x0. Similarly, if lim_{h→0−} [f(x0 + h) − f(x0)]/h exists, then it is called the left derivative of f at x0. If f is differentiable at x0, then both exist and both have value f′(x0).
3.9.8 Theorem: (Pointwise Convergence Theorem)
Suppose f : R → R is periodic of period 2π, normalized and piecewise continuous, and let

f(x) ∼ (1/2)a0 + ∑_{n=1}^{∞} (an cos nx + bn sin nx).

Suppose f has both a left and a right derivative at x0. Then the Fourier series converges at x = x0 to f(x0), i.e.

f(x0) = (1/2)a0 + ∑_{n=1}^{∞} (an cos nx0 + bn sin nx0).

Before we can prove this theorem, we need a few lemmas.
Before we can prove this theorem, we need a few lemmas.
3.9.9 Lemma 1
Let n ≥ 1 be an integer, then 12+∑n
k=1 cos kx =sin(n+ 1
2)x
2 sin 12x
where x = 2rπ.
Proof. Since sin(A+B)− sin(A−B) = 2 cosA sinB, then
2 cos kx sin1
2x = sin(k +
1
2x)− sin(k − 1
2)x
Hence
n∑k=1
2(cos kx sin1
2x) =
n∑k=1
sin[(k +1
2x)− sin(k − 1
2)x]
= sin(n+1
2− sin
1
2x
Dividing by 2 sin 12x and rearranging gives the result. �
3.9.10 Definition

For n ≥ 0, 1/2 + Σ_{k=1}^{n} cos kx is called the nth Dirichlet kernel and is denoted by D_n(x); in particular D₀(x) = 1/2.

Properties of D_n(x):

1. D_n is an even function.

2. D_n is periodic of period 2π.

3. ∫₀^π D_n(x) dx = ∫₀^π (1/2) dx + Σ_{k=1}^{n} ∫₀^π cos kx dx = π/2 + 0 = π/2.
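These facts can be checked numerically (an illustrative sketch, not part of the notes): the closed form of Lemma 1 and the integral in property 3.

```python
import numpy as np

# D_n(x) = 1/2 + sum_{k=1}^n cos(kx) = sin((n + 1/2)x) / (2 sin(x/2)), x != 2*r*pi.
def D(n, x):
    return 0.5 + sum(np.cos(k * x) for k in range(1, n + 1))

n, x = 7, 1.234
closed = np.sin((n + 0.5) * x) / (2 * np.sin(x / 2))
print(D(n, x), closed)  # the two values agree

# Property 3: integral of D_n over [0, pi] is pi/2 (midpoint rule).
pts = 100000
t = (np.arange(pts) + 0.5) * np.pi / pts
print(np.sum(D(n, t)) * np.pi / pts)  # approx pi/2
```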
For n ≥ 0 write

S_n(x) = (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx).

S_n(x) is called the nth partial sum of the Fourier series.
3.9.11 Lemma 2

With the notation as above, for x ∈ R,

S_n(x) − f(x) = (1/π) ∫₀^π (f(x + u) − f(x⁺)) D_n(u) du + (1/π) ∫₀^π (f(x − u) − f(x⁻)) D_n(u) du.
Proof.

S_n(x) = (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx).

Substituting for the Fourier coefficients,

S_n(x) = (1/2π) ∫_{−π}^{π} f(u) du + Σ_{k=1}^{n} [ (cos kx/π) ∫_{−π}^{π} f(u) cos ku du + (sin kx/π) ∫_{−π}^{π} f(u) sin ku du ]

       = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} (cos kx cos ku + sin kx sin ku) ) f(u) du

       = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} cos k(x − u) ) f(u) du.

Put x − u = t (note that x is a constant); then

S_n(x) = (1/π) ∫_{x+π}^{x−π} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) (−dt)

       = (1/π) ∫_{−π+x}^{π+x} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) dt.

Since cos kt and f are periodic of period 2π,

S_n(x) = (1/π) ∫_{−π}^{π} ( 1/2 + Σ_{k=1}^{n} cos kt ) f(x − t) dt

       = (1/π) ∫_{−π}^{π} D_n(t) f(x − t) dt.

Putting u = −t and using the fact that D_n is even,

S_n(x) = (1/π) ∫_{−π}^{π} D_n(u) f(x + u) du

       = (1/π) ∫_{−π}^{0} D_n(v) f(x + v) dv + (1/π) ∫₀^{π} D_n(u) f(x + u) du.

Putting v = −u in the first integral we get

S_n(x) = (1/π) ∫₀^{π} D_n(u) f(x − u) du + (1/π) ∫₀^{π} D_n(u) f(x + u) du.   (∗)

And, since f is normalized,

f(x) = f(x) · 1 = f(x) · (2/π) ∫₀^{π} D_n(u) du

     = (2/π) ∫₀^{π} f(x) D_n(u) du

     = (2/π) ∫₀^{π} ( (f(x⁺) + f(x⁻))/2 ) D_n(u) du

     = (1/π) ∫₀^{π} f(x⁺) D_n(u) du + (1/π) ∫₀^{π} f(x⁻) D_n(u) du.   (∗∗)
The result comes from subtracting (∗∗) from (∗). □

Recall the Riemann–Lebesgue lemma: if w₁, w₂, ..., w_n, ... is an infinite orthonormal sequence, then ⟨v, w_n⟩ → 0 as n → ∞. Now

cos x/√π, cos 2x/√π, ..., cos nx/√π, ...   and   sin x/√π, sin 2x/√π, ..., sin nx/√π, ...

are a pair of infinite orthonormal sequences. So, for any f ∈ NPC[−π, π],

lim_{n→∞} ∫_{−π}^{π} f(t) cos nt dt = 0  and  lim_{n→∞} ∫_{−π}^{π} f(t) sin nt dt = 0.
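A numerical illustration of Riemann–Lebesgue (a sketch, not from the notes), using the piecewise continuous function f(t) = sign(t), whose sine integrals can be computed exactly as 2(1 − cos nπ)/n, i.e. 4/n for odd n and 0 for even n:

```python
import numpy as np

# Midpoint-rule approximation of integral_{-pi}^{pi} sign(t) sin(n t) dt.
def sine_integral(n, pts=200000):
    dt = 2 * np.pi / pts
    t = -np.pi + (np.arange(pts) + 0.5) * dt
    return np.sum(np.sign(t) * np.sin(n * t)) * dt

for n in (1, 11, 101):
    print(n, sine_integral(n))  # roughly 4, 4/11, 4/101 -- tending to 0
```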
3.9.12 Lemma 3

Let h : [0, π] → R be a piecewise continuous function. Then

lim_{n→∞} ∫₀^{π} h(t) sin(n + 1/2)t dt = 0.
Proof. We extend h to [−π, π] by

g(t) = { 0, if −π ≤ t < 0;  h(t), if 0 ≤ t ≤ π.

Then g is piecewise continuous on [−π, π] and

∫₀^{π} h(t) sin(n + 1/2)t dt = ∫_{−π}^{π} g(t) sin(n + 1/2)t dt

 = ∫_{−π}^{π} g(t) ( sin nt cos(t/2) + cos nt sin(t/2) ) dt

 = ∫_{−π}^{π} ( g(t) cos(t/2) ) sin nt dt + ∫_{−π}^{π} ( g(t) sin(t/2) ) cos nt dt.

Since g(t) cos(t/2) and g(t) sin(t/2) are piecewise continuous, both integrals tend to 0 by the Riemann–Lebesgue lemma. □
Back to Theorem 3.9.8: we know f is normalized, piecewise continuous, periodic of period 2π, and possesses left and right derivatives at a point x = x₀. We must show that S_n(x₀) − f(x₀) → 0.

By Lemma 2,

S_n(x₀) − f(x₀) = (1/π) ∫₀^{π} [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] D_n(u) du

 = (1/π) ∫₀^{π} [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] · sin(n + 1/2)u / (2 sin(u/2)) du   (by Lemma 1).

Put

φ(u) = [ (f(x₀ + u) − f(x₀⁺)) + (f(x₀ − u) − f(x₀⁻)) ] / (2 sin(u/2)).

If we can show that φ is piecewise continuous on [0, π] then we can use Lemma 3. There can be no problems except at u = 0, since f and sin(u/2) are piecewise continuous on [0, π]. However,

φ(u) = [ (f(x₀ + u) − f(x₀⁺))/u − (f(x₀ − u) − f(x₀⁻))/(−u) ] · (u/2)/sin(u/2),

and as u → 0⁺ the first quotient tends to the right derivative of f at x₀, the second to the left derivative, and (u/2)/sin(u/2) → 1. So φ has a finite limit as u → 0⁺ and is piecewise continuous on [0, π]. By Lemma 3, S_n(x₀) − f(x₀) → 0, as required. □
For f, g ∈ NPC[−π, π] with Fourier coefficients a_n, b_n and c_n, d_n respectively,

∫_{−π}^{π} f(x)g(x) dx = ( √(π/2) a₀ )( √(π/2) c₀ ) + Σ_{n=1}^{∞} [ (√π a_n)(√π c_n) + (√π b_n)(√π d_n) ],

i.e.

(1/π) ∫_{−π}^{π} f(x)g(x) dx = (1/2)a₀c₀ + Σ_{n=1}^{∞} (a_n c_n + b_n d_n).   (PLANCHEREL)

In particular, with g = f,

(1/π) ∫_{−π}^{π} [f(x)]² dx = (1/2)a₀² + Σ_{n=1}^{∞} (a_n² + b_n²).   (PARSEVAL)
3.9.13 Example

Let

f(x) = { x + π, if −π < x < 0;  x − π, if 0 < x < π.

The Fourier series is

f ∼ −2 ( sin x/1 + sin 2x/2 + ... + sin nx/n + ... ) = Σ_{n=1}^{∞} (−2/n) sin nx,

so a₀ = a_n = 0 and b_n = −2/n. Hence

(1/π) ∫_{−π}^{π} [f(x)]² dx = (1/π) ∫_{−π}^{0} [f(x)]² dx + (1/π) ∫₀^{π} [f(x)]² dx

 = (1/π) ∫_{−π}^{0} (x + π)² dx + (1/π) ∫₀^{π} (x − π)² dx

 = (2/3)π².

By Parseval,

(2/3)π² = 0 + Σ_{n=1}^{∞} (0 + 4/n²) = Σ_{n=1}^{∞} 4/n².
Note that Σ_{n=1}^{∞} 1/n² = π²/6.
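The partial sums of this Parseval identity can be checked directly (an aside, not from the notes): Σ 4/n² approaches (2/3)π², equivalently Σ 1/n² → π²/6.

```python
import numpy as np

# Partial sum of sum 4/n^2; the tail beyond N is about 4/N.
N = 100000
s = np.sum(4.0 / np.arange(1, N + 1) ** 2)
print(s, 2 * np.pi ** 2 / 3)  # the two values are close
```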
Completeness

Recall that the infinite orthonormal sequence w₁, w₂, ..., w_n, ... is said to be complete if

|| u − Σ_{k=1}^{n} ⟨u, w_k⟩ w_k || → 0 as n → ∞

for every u ∈ V. In NPC[−π, π], the infinite orthonormal sequence has been

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...

The proof that these functions are complete in NPC[−π, π] is long and difficult and will be omitted, but the result is important: if

f ∼ (1/2)a₀ + Σ_{n=1}^{∞} (a_n cos nx + b_n sin nx)

then ||S_n − f|| → 0 for any f ∈ NPC[−π, π], i.e.
∫_{−π}^{π} ( (1/2)a₀ + Σ_{k=1}^{n} (a_k cos kx + b_k sin kx) − f(x) )² dx → 0 as n → ∞.
The Plancherel and Parseval theorems say:

w₁, w₂, ..., w_n, ... is complete ⟺ ⟨u, v⟩ = Σ_{i=1}^{∞} ⟨u, w_i⟩⟨v, w_i⟩   (Plancherel)

 ⟺ ⟨u, u⟩ = Σ_{i=1}^{∞} ⟨u, w_i⟩²   (Parseval)

For NPC[−π, π], the sequence w₁, w₂, ..., w_n, ... is

1/√(2π), cos x/√π, sin x/√π, cos 2x/√π, sin 2x/√π, ..., cos nx/√π, sin nx/√π, ...
So,

⟨f(x), 1/√(2π)⟩ = ∫_{−π}^{π} f(x) (1/√(2π)) dx = √(π/2) · (1/π) ∫_{−π}^{π} f(x) dx = √(π/2) a₀,

⟨f(x), cos nx/√π⟩ = ∫_{−π}^{π} f(x) (cos nx/√π) dx = √π · (1/π) ∫_{−π}^{π} f(x) cos nx dx = √π a_n.

Similarly,

⟨f(x), sin nx/√π⟩ = √π b_n.
Exercises
Set10.
(1) Determine which of the following rules defines an inner product on R2
⟨(x₁, y₁), (x₂, y₂)⟩ = (i) x₁x₂

 (ii) 2(x₁x₂ + y₁y₂)

 (iii) −2(x₁x₂ + y₁y₂)

 (iv) (x₁x₂)² + (y₁y₂)²

 (v) x₁x₂ − 2x₁y₂ − 2x₂y₁ + 5y₁y₂

 (vi) x₁y₂ + x₂y₁.
(2) Show that the relation
< A,B >= trace (BTA)
for all A,B ∈Mnn(C) defines an inner product on Mnn(C).
(3) Let n ≥ 1 be an integer. Show that the functions
1, cos x, cos 2x, ..., cosnx, sinx, sin 2x, ..., sinnx
are an orthogonal set of functions in C[−π, π] with its usual inner product.
(i.e. ⟨f(x), g(x)⟩ = ∫_{−π}^{π} f(x)g(x) dx). Calculate the corresponding orthonormal set.
Set11.
(1) Verify that the relation

⟨u, v⟩ = x₁x₂ + 2y₁y₂ + 3z₁z₂ − x₁y₂ − x₂y₁ + x₁z₂ + x₂z₁,

where u = (x₁, y₁, z₁), v = (x₂, y₂, z₂), defines an inner product on R³. Find an orthonormal basis for this space.

Show that

|x₁x₂ + 2y₁y₂ + 3z₁z₂ − x₁y₂ − x₂y₁ + x₁z₂ + x₂z₁| ≤ (x₁² + 2y₁² + 3z₁² − 2x₁y₁ + 2x₁z₁)^{1/2} (x₂² + 2y₂² + 3z₂² − 2x₂y₂ + 2x₂z₂)^{1/2}

for all real numbers x₁, x₂, y₁, y₂, z₁, z₂.
(2) Find an orthonormal basis for the subspace of R⁴ (with the standard scalar product as inner product) which is generated by
u1 = (1, 1,−1, 1), u2 = (1,−1,−2, 2), u3 = (4, 2, 3, 1), u4 = (3, 1, 0, 2)
and extend it to an orthonormal basis for R4.
Express (9, 4,−3, 5) in terms of your orthonormal basis.
Set12.
(1) Find an orthonormal basis for the subspace [1, x, x², x³] of C[−1, 1] with the usual inner product ⟨f, g⟩ = ∫_{−1}^{1} f(t)g(t) dt.
(2) Let V be a real inner product space.
1. Show that ||u + v||² + ||u − v||² = 2||u||² + 2||v||² for all u, v ∈ V. Obtain a theorem about parallelograms by considering R² with the standard scalar product.

2. Suppose u, v ∈ V and ||u|| = ||v||. Prove that u + v and u − v are orthogonal. Obtain a theorem about rhombuses (rhombi?) by considering R² with the standard scalar product.
Set13.
(1) f : R → R is periodic of period 2π, normalized and piecewise continuous, where

f(x) = { x, if −π < x < 0;  π − x, if 0 < x < π.

Sketch the graph of f for the range −3π ≤ x ≤ 3π and write down a formula for f over this range.

(2) Find the best approximation to the function f(x) = x in the subspace

[sin x, sin 2x, ..., sin nx]  (n ≥ 1 an integer)
of NPC[−π, π] with its standard inner product.
(3) Find the distance of the point (1, 0, 2) in R³ from the plane which passes through the origin and the points (1, −1, 0) and (0, 1, 1).
Set14.
(1) Let W be a subspace of the real inner product space V. Define

W⊥ = {u ∈ V : ⟨u, w⟩ = 0 for all w ∈ W}.

Show that W⊥ is a subspace of V.

Show that V = W ⊕ W⊥ when W is finite dimensional. (The proof is one line using the orthogonal projection theorem.)

W⊥ is called the orthogonal complement of W.

Find a basis for the orthogonal complement W⊥ of the subspace W when:

1. W = [(1, 2, 0, 1), (1, −1, 1, 0)] ⊆ R⁴ with the standard scalar product as inner product.

2. W is the subspace of all symmetric matrices in M₃₃(R) with inner product ⟨A, B⟩ = trace(BᵀA) (see Set 9).
(2) Use the Parseval identity and the function f(x) = x to show that the infinite orthonormal sequence

1/√(2π), cos x/√π, cos 2x/√π, ..., cos nx/√π, ...

is not complete in NPC[−π, π].
Set15.
(1) For each of the following functions, draw a graph of its periodic normalized extension over [−3π, 3π] and find its Fourier series:

1. f(x) = { 0, if −π < x < 0;  x, if 0 < x < π.

2. g(x) = e^{−x} for −π < x < π.
(2) f is normalized, piecewise continuous and has period π (i.e. f(x + π) = f(x)). By means of suitable changes of variables, show that

∫_{−π}^{0} f(t) cos nt dt = ∫₀^{π} f(u) cos nu cos nπ du,

∫_{−π}^{0} f(t) sin nt dt = ∫₀^{π} f(u) sin nu cos nπ du.

Deduce that, for n ≥ 1, the Fourier coefficients of f are given by:

a_n = { 0, if n is odd;  (2/π) ∫₀^{π} f(t) cos nt dt, if n is even.

b_n = { 0, if n is odd;  (2/π) ∫₀^{π} f(t) sin nt dt, if n is even.
Set16.
(1) Let θ be a real number which is not an integer. The function f ∈ NPC[−π, π] satisfies f(x) = cos θx for −π ≤ x ≤ π. Sketch the graph of the periodic extension of f when θ = 1/4.

Show that

f ∼ sin πθ/(πθ) + Σ_{n=1}^{∞} ( (−1)ⁿ 2θ sin πθ / (π(θ² − n²)) ) cos nx.

Deduce that

csc θπ = 1/(θπ) + (2θ/π) Σ_{n=1}^{∞} (−1)ⁿ/(θ² − n²),

cot θπ = 1/(θπ) + (2θ/π) Σ_{n=1}^{∞} 1/(θ² − n²).
(2) By considering suitable even and odd functions on [−π, π], show that

π/2 − (4/π) Σ_{r=1}^{∞} cos(2r − 1)x/(2r − 1)² = x = 2 Σ_{n=1}^{∞} (−1)^{n+1} sin nx/n  if 0 ≤ x < π.

What is the value of

1. π/2 − (4/π) Σ_{r=1}^{∞} cos(2r − 1)x/(2r − 1)²  if −π < x < 0?

2. 2 Σ_{n=1}^{∞} (−1)^{n+1} sin nx/n  if −π < x < 0?
Set17.
(1) f : R → R is even, periodic of period 2π, normalized and

f(x) = { 0, if 0 < x < π/2;  1, if π/2 < x < π.

Show that

f(x) = 1/2 − (2/π) ( cos x/1 − cos 3x/3 + cos 5x/5 − cos 7x/7 + ... )

for all x ∈ R. Find the sums of

1. Σ_{n=1}^{∞} (−1)^{n+1}/(2n − 1),

2. 1 + 1/3 − 1/5 − 1/7 + 1/9 + 1/11 − 1/13 − 1/15 + ...
(2) The 2π-periodic functions f, g ∈ NPC[−π, π] are given by

f(x) = π²x − x³  (−π ≤ x ≤ π),

g(x) = x  (−π < x < π).
It is given that their Fourier series are

f ∼ Σ_{n=1}^{∞} ( 12(−1)^{n+1}/n³ ) sin nx,

g ∼ Σ_{n=1}^{∞} ( 2(−1)^{n+1}/n ) sin nx.

Use these two expansions to show that

1. Σ_{n=1}^{∞} 1/n⁴ = π⁴/90,

2. Σ_{n=1}^{∞} 1/n⁶ = π⁶/945.

Use part (2) to show that

3. Σ_{n=1}^{∞} 1/(2n − 1)⁶ = π⁶/960.
Chapter 4
Diagonalization
Recall that for A ∈ M_{nn}(F), X ∈ Fⁿ (written as a column) is called an eigenvector for A if:

1. X ≠ 0,

2. AX = λX for some λ ∈ F.

λ is called the corresponding eigenvalue. Note that if X is an eigenvector for A and if k ≠ 0, then kX is an eigenvector for A, since

A(kX) = k(AX) = k(λX) = λ(kX).

Suppose X is an eigenvector for A with corresponding eigenvalue λ. Then AX = λX, and so (λI − A)X = 0 is a system of n homogeneous equations in n unknowns; this system has a non-trivial solution (X ≠ 0, by definition). Hence the matrix of coefficients must be singular, that is, det(λI − A) = 0. This is a polynomial of degree n in the variable λ. Its roots (zeros) will give the eigenvalues. This polynomial is called the characteristic polynomial of A.
4.1 Example
Let A = ( 0 1
         −1 0 ).

Then det(λI − A) = λ² + 1. If F = C then we have two complex eigenvalues i, −i. If F = R then A has no eigenvalues.
4.2 Definition
Let φ : V → V be a linear mapping of the n-dimensional vector space V. Then u ∈ V is called an eigenvector of φ if

1. u ≠ 0_V,

2. φ(u) = λu for some λ ∈ F.
Let φ : V → V be a linear mapping, and let φ be represented by A = (a_{ij}) relative to the basis v₁, v₂, ..., v_n. Let u = x₁v₁ + x₂v₂ + ... + x_nv_n ∈ V. Then

φ(u) = Σ_{i=1}^{n} x_i φ(v_i) = Σ_{i=1}^{n} x_i ( Σ_{h=1}^{n} a_{hi} v_h ) = Σ_{h=1}^{n} ( Σ_{i=1}^{n} a_{hi} x_i ) v_h.

If u is an eigenvector of φ, then

φ(u) = λu ⟺ Σ_{h=1}^{n} ( Σ_{i=1}^{n} a_{hi} x_i ) v_h = Σ_{h=1}^{n} λ x_h v_h

 ⟺ Σ_{i=1}^{n} a_{hi} x_i = λ x_h for each h

 ⟺ AX = λX,  where X = (x₁, ..., x_n)ᵀ.

Thus λ is an eigenvalue for φ iff it is an eigenvalue for the representing matrix A.
4.3 Theorem
Let A ∈ M_{nn}(F). Then A is similar to a diagonal matrix if and only if A possesses n linearly independent eigenvectors. (That is, iff Fⁿ possesses a basis consisting of eigenvectors of A.)

Proof. Suppose X₁, X₂, ..., X_n are n linearly independent eigenvectors of A. Put P = (X₁|X₂|...|X_n), so P is the matrix which has these eigenvectors as its columns. Then P is invertible and

AP = (AX₁|AX₂|...|AX_n) = (λ₁X₁|λ₂X₂|...|λ_nX_n) = (X₁|X₂|...|X_n) diag(λ₁, λ₂, ..., λ_n) ≡ PΛ.

Hence P⁻¹AP = Λ.

Conversely, if P⁻¹AP = Λ, then the columns of P must be linearly independent, and the above calculation worked backwards shows that these columns are eigenvectors of A. □
4.4 Lemma
Eigenvectors corresponding to distinct eigenvalues are linearly independent.
Proof. We prove it by induction. Let X₁, X₂, ..., X_n be eigenvectors corresponding to the distinct eigenvalues λ₁, ..., λ_n. If n = 1, then the result holds since X₁ ≠ 0, and any nonzero vector is linearly independent. Suppose n > 1 and that the result holds for fewer eigenvectors. Let a₁, ..., a_n ∈ F be such that

a₁X₁ + a₂X₂ + ... + a_nX_n = 0.   (4.1)

Apply A to this equation to get

a₁AX₁ + a₂AX₂ + ... + a_nAX_n = 0,

i.e.

a₁λ₁X₁ + a₂λ₂X₂ + ... + a_nλ_nX_n = 0.   (4.2)

Now (4.2) − λ_n(4.1) gives

a₁(λ₁ − λ_n)X₁ + a₂(λ₂ − λ_n)X₂ + ... + a_{n−1}(λ_{n−1} − λ_n)X_{n−1} = 0.

By the induction hypothesis, the vectors X₁, ..., X_{n−1} are linearly independent, so

a_i(λ_i − λ_n) = 0 for 1 ≤ i ≤ n − 1.

But λ_i ≠ λ_n for all 1 ≤ i ≤ n − 1, so a_i = 0 for all 1 ≤ i ≤ n − 1. So equation (4.1) reduces to a_nX_n = 0. However, X_n ≠ 0 because it is an eigenvector, so a_n = 0. The result follows by induction. □
4.5 Corollary
If A ∈ M_{nn}(F) has n distinct eigenvalues, then A is similar to a diagonal matrix.

Note: This is an "if..., then" statement, not an "if and only if". Thus if A has a repeated eigenvalue, then initially this tells us nothing.

Recall that A, B represent the same linear mapping if and only if B = P⁻¹AP for some invertible matrix P, i.e. if and only if A and B are similar.
4.6 Theorem
If A, B are similar, then they have the same characteristic polynomial.

Proof. Assume A, B are similar; then B = P⁻¹AP for some invertible matrix P. So

det(λI − B) = det(λI − P⁻¹AP) = det[P⁻¹(λI − A)P] = det P⁻¹ det(λI − A) det P = det(λI − A). □

Thus all matrices representing φ have the same characteristic polynomial. This we define to be the characteristic polynomial of φ.
Similarity to a matrix of simple form
The simplest type of matrix is the diagonal matrix. As we saw before, if the eigenvalues of A are distinct, then A is diagonalizable, and the proof shows that if

P⁻¹AP = diag(λ₁, λ₂, ..., λ_n),

then the columns of P are eigenvectors and λ₁, ..., λ_n are the eigenvalues.
4.7 Example
1. The matrix A = ( 4 −6 6
                   12 −14 12
                   12 −12 10 )

has characteristic polynomial (λ + 2)²(λ − 4), and with

P = ( 1 1 0
      2 1 1
      2 0 1 )

we get P⁻¹AP = diag(4, −2, −2).

2. The matrix A = ( −3 1 −1
                    −7 5 −1
                    −6 6 −2 )

also has characteristic polynomial (λ + 2)²(λ − 4), but has only two linearly independent eigenvectors:

λ = 4 gives an eigenvector (0, 1, 1)ᵀ.

λ = −2 gives an eigenvector (1, 1, 0)ᵀ.

In this case, there are only two linearly independent eigenvectors, and so this matrix is not diagonalizable.
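A short numerical check of part 1 (an illustrative sketch, not from the notes):

```python
import numpy as np

# With the stated P, P^{-1} A P should be diag(4, -2, -2).
A = np.array([[4, -6, 6], [12, -14, 12], [12, -12, 10]], float)
P = np.array([[1, 1, 0], [2, 1, 1], [2, 0, 1]], float)
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 8))  # diag(4, -2, -2)
```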
Let A ∈ M_{nn}(C) have characteristic polynomial

det(λI − A) = (λ − λ₁)^{α₁}(λ − λ₂)^{α₂}···(λ − λ_k)^{α_k},

so that A has k distinct eigenvalues λ₁, λ₂, ..., λ_k.

4.8 Definition

For 1 ≤ j ≤ k, α_j is called the algebraic multiplicity of the eigenvalue λ_j. Thus α_j is the power to which (λ − λ_j) appears in the characteristic polynomial. [If all the algebraic multiplicities are 1, then A is diagonalizable.]
4.9 Definition

For 1 ≤ j ≤ k, put

E_{λ_j} = {X ∈ Cⁿ : AX = λ_jX}.

[Note that 0 ∈ E_{λ_j}, so E_{λ_j} ≠ ∅.]

This is a subspace of Cⁿ, called the eigenspace corresponding to λ_j. Any nonzero vector in E_{λ_j} is an eigenvector for λ_j.
4.10 Definition

dim(E_{λ_j}) is called the geometric multiplicity of λ_j. (See the previous examples.)

Note that if

det(λI − A) = (λ − λ₁)^{α₁}(λ − λ₂)^{α₂}···(λ − λ_k)^{α_k}

then α₁ + α₂ + ... + α_k = n. We denote the geometric multiplicity of λ_j by γ_j and the algebraic multiplicity by α_j.
4.11 Theorem
With the above notation, the geometric multiplicity of λ_j ≤ the algebraic multiplicity of λ_j. Furthermore, if for some eigenvalue λ_j the geometric multiplicity < the algebraic multiplicity, then A is not similar to a diagonal matrix.
Proof. Let φ : Cⁿ → Cⁿ be the linear mapping given by φ(X) = AX. Then φ is represented by A relative to the standard basis. Let X₁, X₂, ..., X_{γ_j} be a basis for the eigenspace E_{λ_j}. Extend this to a basis X₁, X₂, ..., X_{γ_j}, ..., X_n for Cⁿ. Then AX_i = λ_jX_i for 1 ≤ i ≤ γ_j. So relative to this basis φ is represented by a block matrix

B = ( λ_j I_{γ_j}  C
      0            D ) = P⁻¹AP for some invertible P.

Hence

(λ − λ₁)^{α₁}···(λ − λ_j)^{α_j}···(λ − λ_k)^{α_k} = det(λI − A) = det(λI − P⁻¹AP)

 = det( (λ − λ_j)I_{γ_j}  −C
        0                 λI − D )

 = (λ − λ_j)^{γ_j} det(λI − D).

So (λ − λ_j)^{γ_j} divides the right-hand side, hence divides the left-hand side. So γ_j ≤ α_j.

Suppose γ_j < α_j, and assume that A is similar to a diagonal matrix. Then there is an invertible matrix P such that P⁻¹AP is diagonal with the eigenvalues on the diagonal, λ_j appearing α_j times. The columns of P corresponding to the λ_j's will be linearly independent eigenvectors corresponding to λ_j, and there are α_j of them. So α_j ≤ dim(E_{λ_j}) = γ_j. This contradicts γ_j < α_j. Hence A is not similar to a diagonal matrix. □
Let A ∈ M_{nn}(C), and let φ : Cⁿ → Cⁿ given by φ(X) = AX be a linear mapping. Cⁿ is an inner product space with the standard scalar product as inner product:

⟨(x₁, ..., x_n), (y₁, ..., y_n)⟩ = x₁ȳ₁ + x₂ȳ₂ + ... + x_nȳ_n = XᵀȲ,

where X = (x₁, ..., x_n)ᵀ, Y = (y₁, ..., y_n)ᵀ. For B ∈ M_{nn}(C), say B = (b_{ij}), let B̄ = (b̄_{ij}). It is easy to show that for any matrices B₁, B₂:

1. conj(B₁ + B₂) = B̄₁ + B̄₂,

2. conj(B₁B₂) = B̄₁B̄₂.
4.12 Theorem
Let v₁, ..., v_n be an orthonormal basis for the inner product space V over C. Let P = (p_{ij}) ∈ M_{nn}(C), and put

u_i = p_{1i}v₁ + p_{2i}v₂ + ... + p_{ni}v_n  (1 ≤ i ≤ n).

Then u₁, u₂, ..., u_n is an orthonormal basis for V if and only if P̄ᵀP = I_n.

Proof.

⟨u_i, u_j⟩ = δ_{ij}

⟺ ⟨p_{1i}v₁ + ... + p_{ni}v_n, p_{1j}v₁ + ... + p_{nj}v_n⟩ = δ_{ij}

⟺ p_{1i}p̄_{1j} + p_{2i}p̄_{2j} + ... + p_{ni}p̄_{nj} = δ_{ij}   (∗)

⟺ (PᵀP̄)_{ij} = δ_{ij}

⟺ PᵀP̄ = I_n

⟺ P̄ᵀP = I_n. □

Note that (∗) says that the columns of P are mutually perpendicular unit vectors, i.e. orthonormal vectors in Cⁿ.
4.13 Definition
A matrix P = (p_{ij}) ∈ M_{nn}(C) satisfying P̄ᵀP = I_n is called a unitary matrix.
4.14 Example
Let

P = ( 1/√3   i/√6   i/√2
      i/√3  −1/√6   1/√2
     −i/√3  −2/√6   0 ),

so that

P̄ᵀ = ( 1/√3  −i/√3   i/√3
       −i/√6  −1/√6  −2/√6
       −i/√2   1/√2   0 ).

Then P̄ᵀP = I₃, and so P⁻¹ = P̄ᵀ.

In a real space this condition reduces to PᵀP = I_n, which defines a real orthogonal matrix; thus P⁻¹ = Pᵀ. If P is real orthogonal, then PᵀP = I_n; taking determinants, det Pᵀ det P = det I_n, i.e. (det P)² = 1. Thus det P = ±1. P is called proper orthogonal if det P = 1 and improper orthogonal if det P = −1.
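Taking the (3, 2) entry of the matrix P in Example 4.14 to be −2/√6 (the value that makes its columns orthonormal; the printed entry is garbled, so this is an assumption), unitarity can be checked numerically:

```python
import numpy as np

# conj(P).T @ P should be the identity, so P^{-1} = conj(P).T.
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s3, 1j/s6, 1j/s2],
              [1j/s3, -1/s6, 1/s2],
              [-1j/s3, -2/s6, 0]])
print(np.round(P.conj().T @ P, 10))  # identity
```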
4.15 Example
We construct a real orthogonal matrix whose first column is a multiple of (1, 1, −2)ᵀ. Let the second column be (a, b, c)ᵀ; then (a, b, c)ᵀ ⊥ (1, 1, −2)ᵀ gives a + b − 2c = 0, and we can choose (a, b, c) = (1, −1, 0). Let the third column be (x, y, z)ᵀ. Then ⟨(x, y, z), (1, 1, −2)⟩ = x + y − 2z = 0 and ⟨(x, y, z), (1, −1, 0)⟩ = x − y = 0, so we can take (x, y, z) = (1, 1, 1). The columns are then those of

( 1  1 1
  1 −1 1
 −2  0 1 ),

and normalizing each column gives the orthogonal matrix

P = ( 1/√6   1/√2  1/√3
      1/√6  −1/√2  1/√3
     −2/√6   0     1/√3 ).

Then det P = −1, so P is improper. Interchanging two columns produces a proper orthogonal matrix:

( 1/√6  1/√3   1/√2
  1/√6  1/√3  −1/√2
 −2/√6  1/√3   0 ).
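The construction can be checked numerically (an aside, not from the notes): P is orthogonal with determinant −1, and swapping two columns flips the sign of the determinant.

```python
import numpy as np

# P from Example 4.15; Q interchanges its last two columns.
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s6, 1/s2, 1/s3],
              [1/s6, -1/s2, 1/s3],
              [-2/s6, 0, 1/s3]])
Q = P[:, [0, 2, 1]]
print(np.round(P.T @ P, 10))                              # identity
print(round(np.linalg.det(P)), round(np.linalg.det(Q)))   # -1 1
```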
4.16 Definition
H ∈ M_{nn}(C) is said to be Hermitian if H̄ᵀ = H.

Note that for H = (a_{ij}) to be Hermitian we must have a_{ji} = ā_{ij}.

4.17 Example

( 2      1 − i  i
  1 + i  −1     2
  −i     2      3 )

is Hermitian. In particular, a real symmetric matrix is an Hermitian matrix.
4.18 Theorem
The eigenvalues of an Hermitian matrix are always real.
Proof. Let H be an n × n Hermitian matrix, λ an eigenvalue, and X = (x₁, x₂, ..., x_n)ᵀ a corresponding eigenvector. Then X ≠ 0, so |x₁|² + |x₂|² + ... + |x_n|² > 0 (i.e. X̄ᵀX > 0). Now HX = λX, so

X̄ᵀHX = λX̄ᵀX.   (1)

Taking conjugates of HX = λX gives H̄X̄ = λ̄X̄; transposing, X̄ᵀH̄ᵀ = λ̄X̄ᵀ. Multiplying on the right by X,

X̄ᵀH̄ᵀX = λ̄X̄ᵀX.

However H̄ᵀ = H, so

X̄ᵀHX = λ̄X̄ᵀX.   (2)

(1) − (2) gives

0 = (λ − λ̄)X̄ᵀX,  with X̄ᵀX > 0.

So λ − λ̄ = 0 and λ = λ̄, i.e. λ is real. □
4.19 Corollary
The roots of the characteristic polynomial of a real symmetric matrix are real.
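A numerical illustration (an aside, not from the notes): the eigenvalues of the Hermitian matrix from Example 4.17 come out real.

```python
import numpy as np

# Eigenvalues of a Hermitian matrix are real (Theorem 4.18).
H = np.array([[2, 1 - 1j, 1j],
              [1 + 1j, -1, 2],
              [-1j, 2, 3]])
evals = np.linalg.eigvals(H)
print(np.max(np.abs(evals.imag)))  # numerically zero
```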
4.20 Lemma 1
Let H ∈ M_{nn}(C) be Hermitian. Then ⟨HX, Y⟩ = ⟨X, HY⟩ for any X, Y ∈ Cⁿ.

Proof.

⟨HX, Y⟩ = (HX)ᵀȲ = XᵀHᵀȲ = XᵀH̄Ȳ   (H̄ᵀ = H implies Hᵀ = H̄)

 = Xᵀ conj(HY) = ⟨X, HY⟩. □
4.21 Lemma 2
Let V be an inner product space over C (resp. R) and let φ : V → V be a linear mapping satisfying

⟨φ(u), v⟩ = ⟨u, φ(v)⟩ for all u, v ∈ V.

Let φ be represented relative to the orthonormal basis w₁, w₂, ..., w_n by H. Then H is Hermitian (resp. real symmetric).
Proof. Let H = (h_{ij}). Then

⟨φ(w_i), w_j⟩ = ⟨h_{1i}w₁ + h_{2i}w₂ + ... + h_{ni}w_n, w_j⟩ = h_{ji},

while

⟨w_i, φ(w_j)⟩ = ⟨w_i, h_{1j}w₁ + h_{2j}w₂ + ... + h_{nj}w_n⟩ = h̄_{ij}.

Since ⟨φ(w_i), w_j⟩ = ⟨w_i, φ(w_j)⟩, we get h_{ji} = h̄_{ij}. Thus H̄ᵀ = H. □
4.22 Lemma 3
Eigenvectors corresponding to distinct eigenvalues of an Hermitian matrix are orthogonal.

Proof. Let λ ≠ µ be eigenvalues of the Hermitian matrix H with corresponding eigenvectors X, Y. Then by Lemma 1, ⟨HX, Y⟩ = ⟨X, HY⟩, so

λ⟨X, Y⟩ = ⟨λX, Y⟩ = ⟨HX, Y⟩ = ⟨X, HY⟩ = ⟨X, µY⟩ = µ̄⟨X, Y⟩ = µ⟨X, Y⟩

(µ is real by Theorem 4.18). Since λ ≠ µ we have ⟨X, Y⟩ = 0, that is, X and Y are orthogonal. □
4.23 Example
Let A = ( 2 1 1
          1 2 1
          1 1 2 ).

Then det(λI − A) = (λ − 1)²(λ − 4), so the eigenvalues are λ = 4 and λ = 1.

An eigenvector corresponding to λ = 4 is (1, 1, 1)ᵀ; a unit eigenvector is (1/√3, 1/√3, 1/√3)ᵀ.

An eigenvector corresponding to λ = 1 is (1, −1, 0)ᵀ, and another eigenvector (a, b, c)ᵀ with (a, b, c)ᵀ ⊥ (1, −1, 0)ᵀ is (1, 1, −2)ᵀ. The corresponding unit eigenvectors are (1/√2, −1/√2, 0)ᵀ and (1/√6, 1/√6, −2/√6)ᵀ.

Hence

P = ( 1/√3   1/√2   1/√6
      1/√3  −1/√2   1/√6
      1/√3   0     −2/√6 )

is an orthogonal matrix (P⁻¹ = Pᵀ) and P⁻¹AP = diag(4, 1, 1).
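A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P should be orthogonal and P^T A P = diag(4, 1, 1).
A = np.array([[2, 1, 1], [1, 2, 1], [1, 1, 2]], float)
s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s3, 1/s2, 1/s6],
              [1/s3, -1/s2, 1/s6],
              [1/s3, 0, -2/s6]])
print(np.round(P.T @ P, 10))      # identity
print(np.round(P.T @ A @ P, 10))  # diag(4, 1, 1)
```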
4.24 Theorem
Let V be an n-dimensional inner product space over F (where F = C or R). Let φ : V → V be a linear mapping satisfying ⟨φ(u), v⟩ = ⟨u, φ(v)⟩ for all u, v ∈ V. Then V possesses an orthonormal basis consisting of eigenvectors of φ.

Proof. The proof is by induction on n = dim(V). If n = 1, then any unit vector will be an orthonormal basis for V and also an eigenvector for φ. Suppose n > 1 and the result holds for spaces of smaller dimension equipped with a linear mapping with the given property. Let λ be an eigenvalue of φ (i.e. an eigenvalue of any matrix representing φ). By Lemma 2 these matrices include Hermitian ones, so λ is real by Theorem 4.18. Let M be the eigenspace of V corresponding to λ; then dim(M) > 0. Let w₁, ..., w_k be an orthonormal basis for M. Note that w₁, ..., w_k are eigenvectors of φ corresponding to λ.

(Recall that the orthogonal complement M⊥ of M is M⊥ = {x ∈ V : ⟨x, u⟩ = 0 for all u ∈ M}.)

By the orthogonal projection theorem, V = M ⊕ M⊥ and dim(M⊥) = dim(V) − dim(M) = n − k < n. Let ψ be the restriction of φ to M⊥ (i.e. ψ(m′) = φ(m′) for all m′ ∈ M⊥). Let v ∈ M⊥; then for any u ∈ M,

⟨u, ψ(v)⟩ = ⟨u, φ(v)⟩ = ⟨φ(u), v⟩ = ⟨λu, v⟩ = λ⟨u, v⟩ = 0.

This shows that ψ(v) ∈ M⊥; thus ψ maps M⊥ to M⊥. Now dim(M⊥) < n and ψ satisfies ⟨ψ(u), v⟩ = ⟨u, ψ(v)⟩. Hence by the induction hypothesis, M⊥ possesses an orthonormal basis consisting of eigenvectors of ψ (= φ), say w_{k+1}, ..., w_n. Then w₁, w₂, ..., w_k, w_{k+1}, ..., w_n is an orthonormal basis for V, and all its members are eigenvectors of φ. □
4.25 Corollary
Let S be a real symmetric matrix. Then there exists a real orthogonal matrix P such that P⁻¹SP = PᵀSP = Δ, a diagonal matrix.

Similarity in M_{nn}(C) is an equivalence relation, so it splits M_{nn}(C) into disjoint equivalence classes. The simplest type of matrix in any equivalence class would be a diagonal matrix. However, we have seen that there are matrices which are not similar to a diagonal matrix. We look at this case now.
Jordan Canonical Form:
Let A ∈ M_{nn}(C). Then the result is that there exists an invertible matrix P such that P⁻¹AP has the block-diagonal form

( J(λ₁)
        J(λ₂)
              ⋱
                J(λ_m) ),

where each block down the diagonal has the form (illustrated here for a 4 × 4 block)

J(λ_i) = ( λ_i 0   0   0
           1   λ_i 0   0
           0   1   λ_i 0
           0   0   1   λ_i ).

For example:

( 3 0 0
  0 2 0
  0 1 2 ),  ( 2 0 0 0
              1 2 0 0
              0 0 3 0
              0 0 1 3 ),  ( 1 0 0
                            0 2 0
                            0 0 3 ),  ( 3 0 0
                                        1 3 0
                                        0 1 3 ),  ( 3 0 0
                                                    0 3 0
                                                    0 1 3 ).
Cayley–Hamilton Theorem:

If det(xI − A) = xⁿ + a_{n−1}xⁿ⁻¹ + ... + a₁x + a₀, then Aⁿ + a_{n−1}Aⁿ⁻¹ + ... + a₁A + a₀I = 0. (That is, every matrix satisfies its own characteristic equation.) Thus if

det(xI − A) = (x − λ₁)(x − λ₂)···(x − λ_n)

then

(A − λ₁I)(A − λ₂I)···(A − λ_nI) = 0.

The order of the factors is irrelevant, since powers of A commute.
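A quick numerical instance (an aside, not from the notes), using the fact that for a 2 × 2 matrix the characteristic polynomial is x² − (tr A)x + det A:

```python
import numpy as np

# Cayley-Hamilton: A satisfies its own characteristic equation.
A = np.array([[3, 1], [2, 2]], float)  # char poly: x^2 - 5x + 4
residual = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
print(np.round(residual, 10))  # zero matrix
```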
4.26 Lemma
Let A ∈ M_{nn}(C), and let X be a column vector in Cⁿ such that AᵏX = 0 but Aᵏ⁻¹X ≠ 0. Then X, AX, A²X, ..., Aᵏ⁻¹X are linearly independent in Cⁿ.

Proof. Let a₀, a₁, ..., a_{k−1} be scalars such that

a₀X + a₁AX + a₂A²X + ... + a_{k−1}Aᵏ⁻¹X = 0.   (4.3)

Multiplying on the left by Aᵏ⁻¹ gives

a₀Aᵏ⁻¹X + a₁AᵏX + a₂Aᵏ⁺¹X + ... + a_{k−1}A²ᵏ⁻²X = 0.

Since AᵏX = 0 (and hence AᵐX = 0 for all m ≥ k), this reduces to a₀Aᵏ⁻¹X = 0. Since Aᵏ⁻¹X ≠ 0, we have a₀ = 0. So equation (4.3) becomes

a₁AX + a₂A²X + ... + a_{k−1}Aᵏ⁻¹X = 0.

Applying Aᵏ⁻² gives a₁ = 0. Repeating the process gives a₂ = ... = a_{k−1} = 0. □
Jordan form for a 3 × 3 matrix

(I) A has eigenvalues λ, µ, ν, all different. Then there is an invertible matrix P such that

P⁻¹AP = ( λ 0 0
          0 µ 0
          0 0 ν ).
(II) (A) A has eigenvalues λ, µ, µ with λ ≠ µ and dim E_µ = 2. Then there is an invertible matrix P such that

P⁻¹AP = ( λ 0 0
          0 µ 0
          0 0 µ ).

(B) A has eigenvalues λ, µ, µ with λ ≠ µ and dim E_µ = 1. Then the characteristic polynomial of A is (x − λ)(x − µ)², so by the Cayley–Hamilton Theorem,

(A − λI)(A − µI)² = 0.

Next, we will see what P⁻¹AP looks like.

Let φ : C³ → C³ be defined by φ(X) = (A − λI)X. Now ker φ = E_λ and dim ker φ = 1, so dim(Im φ) = 2. Since dim E_µ = 1, we must be able to find Y ∈ Im(φ) \ E_µ, i.e. Y = (A − λI)Z for some Z ∈ C³ with (A − µI)Y ≠ 0. Therefore Y ≠ 0 and (A − µI)Y ≠ 0, but

(A − µI)²Y = (A − µI)²(A − λI)Z = 0·Z = 0.

By the Lemma, Y and (A − µI)Y are linearly independent. Let X₁ be an eigenvector corresponding to λ. Put X₂ = Y and X₃ = (A − µI)Y.

Claim: X₁, X₂, X₃ are linearly independent.

Assume a₁X₁ + a₂X₂ + a₃X₃ = 0. Applying (A − µI)² to this equation gives

a₁(A − µI)²X₁ + a₂(A − µI)²X₂ + a₃(A − µI)²X₃ = 0
a₁(A − µI)²X₁ + a₂·0 + a₃·0 = 0
a₁(A² − 2µA + µ²I)X₁ = 0
a₁(λ² − 2µλ + µ²)X₁ = 0
a₁(λ − µ)²X₁ = 0.

But (λ − µ)² ≠ 0 and X₁ ≠ 0, so a₁ = 0. The equation reduces to a₂X₂ + a₃X₃ = 0; but X₂, X₃ are known to be linearly independent, so a₂ = a₃ = 0.

We have AX₁ = λX₁; (A − µI)X₂ = X₃, so AX₂ = µX₂ + X₃; and (A − µI)X₃ = 0, so AX₃ = µX₃. Put P = (X₁|X₂|X₃). Then

AP = (AX₁|AX₂|AX₃) = (λX₁ | µX₂ + X₃ | µX₃) = (X₁|X₂|X₃) ( λ 0 0
                                                            0 µ 0
                                                            0 1 µ ).

So P⁻¹AP = ( λ 0 0
             0 µ 0
             0 1 µ ).

Method:

(a) Choose X₁ to be an eigenvector for λ.

(b) Choose Z such that (A − µI)(A − λI)Z ≠ 0.

(c) Put X₂ = (A − λI)Z, X₃ = (A − µI)X₂.
4.27 Example
A = ( −3 1 −1
      −7 5 −1
      −6 6 −2 ),  det(xI − A) = (x − 4)(x + 2)².

For x = −2 the only eigenvector (up to scalar multiples) is (1, 1, 0)ᵀ, so A is NOT diagonalizable.

So let λ = 4, µ = −2. Then

(A − 4I) = ( −7 1 −1
             −7 1 −1
             −6 6 −6 ),  (A + 2I) = ( −1 1 −1
                                      −7 7 −1
                                      −6 6  0 ).

When λ = 4 an eigenvector is X₁ = (0, 1, 1)ᵀ. Take Z = (1, 0, 0)ᵀ; then

X₂ = (A − 4I)Z = (−7, −7, −6)ᵀ  and  X₃ = (A + 2I)X₂ = (6, 6, 0)ᵀ.

So

P = ( 0 −7 6
      1 −7 6
      1 −6 0 )  and  P⁻¹AP = ( 4 0 0
                               0 −2 0
                               0 1 −2 ).
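A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P^{-1} A P should be the stated Jordan form.
A = np.array([[-3, 1, -1], [-7, 5, -1], [-6, 6, -2]], float)
P = np.array([[0, -7, 6], [1, -7, 6], [1, -6, 0]], float)
J = np.linalg.inv(P) @ A @ P
print(np.round(J, 8))  # [[4,0,0],[0,-2,0],[0,1,-2]]
```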
(III) det(xI − A) = (x − µ)³.

By the Cayley–Hamilton Theorem, (A − µI)³ = 0.

If (A − µI) = 0, then A = µI and is already in canonical form. So we may assume (A − µI) ≠ 0. This splits into two cases:

(A) (A − µI)² ≠ 0. Choose X with (A − µI)²X ≠ 0. By the Lemma, X, (A − µI)X, (A − µI)²X are linearly independent. Put

X₁ = X,
X₂ = (A − µI)X,
X₃ = (A − µI)²X.

So (A − µI)X₁ = X₂ ⟹ AX₁ = µX₁ + X₂;
(A − µI)X₂ = X₃ ⟹ AX₂ = µX₂ + X₃;
(A − µI)X₃ = (A − µI)³X = 0 ⟹ AX₃ = µX₃.

Put P = (X₁|X₂|X₃); then

P⁻¹AP = ( µ 0 0
          1 µ 0
          0 1 µ ).
4.28 Example
A = ( −2 1 1
      −5 3 2
      −4 1 2 ).

Then det(xI − A) = (x − 1)³ and

(A − µI) = (A − I) = ( −3 1 1
                       −5 2 2
                       −4 1 1 ),  (A − I)² = ( 0 0 0
                                               −3 1 1
                                               3 −1 −1 ) ≠ 0.

So let X₁ = (0, 1, 0)ᵀ, X₂ = (A − I)X₁ = (1, 2, 1)ᵀ, X₃ = (A − I)²X₁ = (0, 1, −1)ᵀ. Hence

P = ( 0 1 0
      1 2 1
      0 1 −1 ),  P⁻¹AP = ( 1 0 0
                           1 1 0
                           0 1 1 ).
(B) (A − µI) ≠ 0 while (A − µI)² = 0. Define φ : C³ → C³ by φ(X) = (A − µI)X. Then ker φ = E_µ. For X ∈ C³,

(A − µI)((A − µI)X) = (A − µI)²X = 0.

Hence Im(φ) ⊆ ker φ. Now

dim(Im(φ)) > 0,
dim(Im(φ)) ≤ dim ker φ,
dim(Im(φ)) + dim ker φ = 3.

The only way to satisfy these is to have dim(Im(φ)) = 1 and dim ker φ = 2, i.e. dim E_µ = 2. Choose:

i. X₁ so that (A − µI)X₁ ≠ 0,

ii. X₂ = (A − µI)X₁,

iii. X₃ such that X₂, X₃ is a basis for E_µ.

Note that (A − µI)X₂ = (A − µI)²X₁ = 0, so X₂ is an eigenvector for µ.

Claim: X₁, X₂, X₃ are linearly independent. Let a₁X₁ + a₂X₂ + a₃X₃ = 0. Applying (A − µI):

a₁(A − µI)X₁ + a₂·0 + a₃·0 = 0,

i.e. a₁X₂ = 0; X₂ ≠ 0, so a₁ = 0. Thus a₂X₂ + a₃X₃ = 0, which implies a₂ = a₃ = 0 since X₂, X₃ are a basis for E_µ. Put P = (X₁|X₂|X₃); then P is invertible and

AX₁ = µX₁ + X₂,
AX₂ = µX₂,
AX₃ = µX₃.
So,

P⁻¹AP = ( µ 0 0
          1 µ 0
          0 0 µ ).
4.29 Example
A = ( 0 1 0
      −4 4 0
      0 0 2 ).

Then det(xI − A) = (x − 2)³ and

(A − 2I) = ( −2 1 0
             −4 2 0
             0 0 0 ),  (A − 2I)² = 0₃₃.

Put X₁ = (0, 1, 0)ᵀ, X₂ = (A − 2I)X₁ = (1, 2, 0)ᵀ, and take X₃ = (0, 0, 1)ᵀ as an eigenvector for µ = 2. Then

P = ( 0 1 0
      1 2 0
      0 0 1 ),  P⁻¹AP = ( 2 0 0
                          1 2 0
                          0 0 2 ).
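A short numerical check of this example (an aside, not from the notes):

```python
import numpy as np

# P^{-1} A P should be the stated Jordan form.
A = np.array([[0, 1, 0], [-4, 4, 0], [0, 0, 2]], float)
P = np.array([[0, 1, 0], [1, 2, 0], [0, 0, 1]], float)
J = np.linalg.inv(P) @ A @ P
print(np.round(J, 8))  # [[2,0,0],[1,2,0],[0,0,2]]
```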
Exercises
Set18.
(1) Let A ∈ M_{nn}(F) have characteristic polynomial

det(λI − A) = λⁿ + b_{n−1}λⁿ⁻¹ + ... + b₁λ + b₀.

Express det A in terms of b_{n−1}, b_{n−2}, ..., b₀. Deduce that 0 is an eigenvalue of A if and only if A is singular.
(2) Determine whether either of the following matrices is similar to a diagonal matrix. In the case when one has this property, find an invertible matrix P such that P⁻¹AP = Λ:

( 1 −3 3
  3 −5 3
  6 −6 4 ),  ( 0  1  0
               0  0  1
               16 12 0 ).

Deduce that matrices which have the same characteristic polynomial are not necessarily similar.
(3) Find 8 different diagonal matrices D_i ∈ M₃₃(R) (1 ≤ i ≤ 8) such that

D_i² + D_i = ( 0 0 0
               0 2 0
               0 0 6 ).

It is given that the matrix A ∈ M₃₃(R) has eigenvalues 0, 2, 6. Prove that there are at least 8 different solutions (for X) of the matrix equation

X² + X = A.
(4) Let A = ( 3 2
             1 2 ) ∈ M₂₂(R).

1. Find the complete solution of the simultaneous differential equations:

dy₁/dx = 3y₁ + 2y₂
dy₂/dx = y₁ + 2y₂

2. Find B ∈ M₂₂(R) such that B² = A.
Set19.
(1) Let A,B be n× n real orthogonal matrices. Verify that B(A+B)TA = A+B.
By taking B to be I or −I (both of which are real orthogonal matrices), show that
1. if A is improper, then −1 is an eigenvalue of A,
2. if A is proper and n is odd, then 1 is an eigenvalue of A,
3. if A is improper and n is even, then 1 is an eigenvalue of A.
(2) S ∈ M_{nn}(C) is called skew-Hermitian if S̄ᵀ = −S. Write down a 3 × 3 skew-Hermitian matrix not all of whose entries are real.

Prove that the eigenvalues of a skew-Hermitian matrix all have the form ib where b ∈ R.
(3) 1. Find a unitary matrix U such that ŪᵀBU is diagonal, where B = ( 3      1 − i
                                                                       1 + i  2 ).

2. Find a real orthogonal matrix P such that PᵀAP is a diagonal matrix, where

A = ( 2 −4 2
      −4 2 −2
      2 −2 −1 ).
Set20.
(1) For each of the following matrices, determine its Jordan canonical form. In each case, find a non-singular matrix P such that P⁻¹AP is in Jordan form:

1. A = ( −4 2 −1
         −1 −1 −1
         −2 3 −4 )

2. B = ( 1 2 2
         2 1 2
         2 2 1 )

3. C = ( 0 0 −1
         −2 2 −1
         −8 9 −2 )

4. D = ( −4 3 −1
         −2 1 −1
         −2 3 −3 )
(2) Find the general solution, on any interval containing 0, of the following system of simultaneous differential equations:

y₁′ = 2y₁ + y₃
y₂′ = −y₁ + 2y₂ + 2y₃
y₃′ = −y₁ − y₂ + 5y₃

Find the particular solution for which y₁(0) = y₂(0) = 1, y₃(0) = 0.