Class Notes
MA 2030
Linear Algebra and Numerical Analysis
Arindama Singh & A. V. Jayanthan
IIT Madras
Caution
The writing style here is not refined.
It is only a class note, not a book.
You must learn correct writing style from your teacher.
1 Vector Spaces
The notion of a vector space is an abstraction of the familiar set of vectors in two or three dimensional Euclidean
space.
Let R denote the set of all real numbers.
We first recall certain 'good' properties of vectors in the real plane:
There exists a vector, namely 0, such that for all x ∈ R2, x+ 0 = x = 0 + x.
For every x ∈ R2, there exists another vector, denoted by −x, such that x+ (−x) = 0 = (−x) + x.
Scalar multiplication distributes over addition.
Both addition and scalar multiplication are 'associative'.
For all x ∈ R2, 1 · x = x.
We will always use x, y, z, u, v, w for vectors.
We will use the Greek letters α, β, γ, . . . and a, b, c for scalars.
The symbol 0 will stand for the ‘zero vector’ as well as ‘zero scalar’. From the context, you should know which one it
represents.
Let R denote the set of all real numbers and C denote the set of all complex numbers.
F denotes either R or C. Whenever needed, we will specifically mention the set of scalars.
Definition 1.1. A set V with two operations + (addition) and · (scalar multiplication) is said to be a vector space
over F if it satisfies the following axioms:
1. x+ y = y + x, for all x, y ∈ V.
2. (x+ y) + z = x+ (y + z), for all x, y, z ∈ V .
3. There exists 0 ∈ V such that x+ 0 = x for all x ∈ V .
4. for each x ∈ V , there exists (−x) ∈ V such that x+ (−x) = 0.
5. α · (x+ y) = α · x+ α · y for all α ∈ F and for all x, y ∈ V .
6. (α + β) · x = α · x + β · x, for all α, β ∈ F and for all x ∈ V.
7. (αβ) · x = α · (β · x) for all α, β ∈ F, for all x ∈ V .
8. 1 · x = x for all x ∈ V .
Example 1.1.
1. V = {0} is a vector space over F.
2. R2 := {(a, b)t : a, b ∈ R} with the usual addition and scalar multiplication is a vector space over R.
3. More generally, Rn := {(a1, . . . , an)t : ai ∈ R} with the usual addition and scalar multiplication is a vector space
over R.
A vector in Rn will either be represented as a column vector, i.e., in the form (a1, a2, . . . , an)t, or
will be represented as a row vector, i.e., in the form (a1, . . . , an).
We will be using both notations, according to context and convenience.
4. V = {(x1, x2) ∈ R2 : x2 = 0} is a vector space over R under the usual addition and scalar multiplication.
5. V = {(x1, x2) ∈ R2 : 2x1 − x2 = 0} is a vector space over R under the usual addition and scalar multiplication.
6. Is V = {(x1, x2) ∈ R2 : 3x1 + 5x2 = 1} a vector space over R?
7. Pn := {a0 + a1t + · · · + antn : ai ∈ F} with the usual polynomial addition and scalar multiplication is a vector
space over F.
8. The set Mm×n(F) of all m× n matrices with entries from F with the usual matrix addition and scalar multipli-
cation is a vector space over F.
9. V = R2, for (a1, a2), (b1, b2) ∈ V and α ∈ R, define (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2), α(a1, a2) = (0, 0) if
α = 0 and α(a1, a2) = (αa1, a2/α) if α ≠ 0. Is V a vector space over R?
10. V = {f : [a, b] → R : f is a function}. For f, g ∈ V, define f + g to be the map (f + g)(x) = f(x) + g(x) for all x ∈ [a, b].
For α ∈ R and f ∈ V, define αf to be the map (αf)(x) = αf(x) for all x ∈ [a, b].
What is the ’zero vector’ in V ?
The map f such that f(x) = 0 for all x ∈ [a, b].
For f ∈ V , −f , defined by (−f)(x) = −f(x) is the additive inverse.
V is a vector space over R.
11. V = {f : R → R : d²f/dx² + f = 0}. Define addition and scalar multiplication as in the previous example.
For f, g ∈ V, d²(f + g)/dx² + (f + g) = (d²f/dx² + f) + (d²g/dx² + g) = 0.
Similarly, d²(αf)/dx² + (αf) = α(d²f/dx² + f) = 0.
Therefore, V is closed under addition and scalar multiplication. Other properties can easily be verified. Hence
V is a vector space over R.
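For Example 9 above, one can test the axioms numerically. A minimal sketch (the helper names `smul` and `add` are ours, not the notes'), showing that axiom 6 fails for the modified scalar multiplication:

```python
# Sketch for Example 9: usual addition on R^2, but scalar multiplication
# alpha.(a1, a2) = (alpha*a1, a2/alpha) for alpha != 0 (and (0, 0) for alpha = 0).
# Axiom 6, (alpha + beta).x = alpha.x + beta.x, fails, so V is not a vector space.

def smul(alpha, v):
    a1, a2 = v
    return (0.0, 0.0) if alpha == 0 else (alpha * a1, a2 / alpha)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

x = (1.0, 1.0)
lhs = smul(1 + 1, x)                # (1 + 1).x = (2.0, 0.5)
rhs = add(smul(1, x), smul(1, x))   # 1.x + 1.x = (2.0, 2.0)
print(lhs, rhs)
```

Similar spot-checks settle the other "Is V a vector space?" questions in this list.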
Theorem 1.1. Let V be a vector space over F. Then
1. The zero element is unique,
i.e., if there exists θ1, θ2 such that x+ θ1 = x and x+ θ2 = x, for all x ∈ V , then θ1 = θ2.
2. Additive inverse for each vector is unique,
i.e., for x ∈ V , if there exist x1 and x2 such that x+ x1 = 0 = x+ x2, then x1 = x2.
3. for any x ∈ V and α ∈ F:
(a) 0 · x = 0 (b) (−1) · x = −x (c) α · 0 = 0.
Proof.
1. θ1 = θ1 + θ2 = θ2.
2. x1 = x1 + 0 = x1 + (x + x2) = (x1 + x) + x2 = 0 + x2 = x2.
3(a). 0 · x + 0 = 0 · x = (0 + 0) · x = 0 · x + 0 · x ⇒ 0 · x = 0 (adding −(0 · x) to both sides).
3(b). x + (−1)x = 1 · x + (−1) · x = (1 + (−1)) · x = 0 · x = 0 ⇒ (−1)x = −x.
3(c). 0 + α · 0 = α · 0 = α · (0 + 0) = α · 0 + α · 0⇒ α · 0 = 0. �
A subspace of a vector space is a subset which carries the 'same structure'. We have seen that:
• W = {(x1, x2) ∈ R2 : 2x1 − x2 = 0} is a vector space.
• P3 is a vector space and is a subset of the vector space P4.
Definition 1.2. Let W be a subset of a vector space V . Then W is called a subspace of V if W is a vector space
with respect to the operations of addition and scalar multiplication as in V .
Example 1.2.
1. {0} ⊂ V is a subspace for any vector space V .
2. W = {(x1, x2, x3) ∈ R3 : 2x1 − x2 + x3 = 0} is a subspace of R3.
More generally, if A is an m×n matrix and x = (x1, · · · , xn), then the set of all solutions of the equation Ax = 0
is a subspace of Rn. (Prove it!)
3. Pn is a subspace of Pm for n ≤ m.
4. C[a, b] := {f : [a, b]→ R : f is a continuous function} is a vector subspace of
F [a, b] := {f : [a, b]→ R : f is a function}.
5. R[a, b] := {f : [a, b] → R : f is Riemann integrable} is a vector subspace of F [a, b]; in fact, C[a, b] is a subspace of R[a, b], since every continuous function on [a, b] is integrable.
6. Ck[a, b] := {f : [a, b] → R : dkf/dxk exists} is a vector subspace of C[a, b].
7. Pn can also be seen as a subspace of C[a, b].
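The claim in item 2, that the solution set of Ax = 0 is a subspace, comes down to linearity: if Ax = 0 and Ay = 0, then A(x + αy) = Ax + αAy = 0. A small numeric sketch (the helper `constraint` is ours) with the equation 2x1 − x2 + x3 = 0 from above:

```python
# Solutions of a homogeneous linear equation are closed under x + alpha*y.
# constraint(v) computes 2*x1 - x2 + x3; v solves the equation iff it returns 0.

def constraint(v):
    return 2 * v[0] - v[1] + v[2]

x = (1, 2, 0)    # constraint(x) = 0
y = (0, 1, 1)    # constraint(y) = 0
alpha = 7
z = tuple(xi + alpha * yi for xi, yi in zip(x, y))  # z = (1, 9, 7)
print(constraint(x), constraint(y), constraint(z))  # 0 0 0
```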
Do we have to verify all 8 conditions to check whether a given subset of a vector space is a subspace?
Theorem 1.2. Let W be a subset of a vector space V . Then W is a subspace of V if and only if W is non-empty
and x + αy ∈ W for all x, y ∈ W and α ∈ F.
Proof. If W is a subspace, then obviously the given condition is satisfied.
Conversely suppose W is a subset which satisfies the given condition. The commutativity and associativity of
addition, distributive properties and scalar multiplication with 1 are satisfied in V and hence true in W too. Therefore,
we only need to verify the existence of ‘zero vector’ and ‘additive inverse’.
W ≠ ∅ ⇒ ∃ x ∈ W ⇒ x + (−1)x = 0 ∈ W. Hence the additive identity exists.
Now for y ∈W , take x = 0 and α = −1 in the given condition. We get 0 + (−1)y = −y ∈W . Hence additive inverse
exists.
Therefore, W is a subspace of V . �
Given two vector subspaces, what are the other possibilities of obtaining new vector subspaces from them?
Theorem 1.3. Let V1 and V2 be subspaces of a vector space V. Then V1 ∩ V2 is a subspace of V.
Proof. V1 and V2 are subspaces ⇒ 0 ∈ V1 ∩ V2. Therefore V1 ∩ V2 ≠ ∅.
Suppose x, y ∈ V1 ∩ V2 and α ∈ F. Then x + αy belongs to both V1 and V2 (since they are subspaces) and hence
x + αy ∈ V1 ∩ V2. Therefore, V1 ∩ V2 is a subspace. □
If V1 and V2 are subspaces, then is V1 ∪ V2 a subspace?
Is the union of x-axis and y-axis a subspace of R2?
Theorem 1.4. Let V1 and V2 be subspaces of a vector space. Then V1 ∪ V2 is a subspace if and only if either V1 ⊆ V2 or V2 ⊆ V1.
Proof. If V1 ⊆ V2 (respectively, V2 ⊆ V1), then V1 ∪ V2 = V2 (respectively, V1), which is a subspace.
Conversely, assume that V1 ∪ V2 is a subspace and V1 ⊈ V2. We want to show that V2 ⊆ V1.
Let x ∈ V2. Since V1 ⊈ V2, ∃ y ∈ V1 \ V2.
⇒ x, y ∈ V1 ∪ V2 ⇒ x + y ∈ V1 ∪ V2, since V1 ∪ V2 is a subspace.
If x + y ∈ V2, then y = (x + y) − x ∈ V2, which is a contradiction.
⇒ x + y ∈ V1 ⇒ x = (x + y) − y ∈ V1 ⇒ V2 ⊆ V1. □
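The x-axis/y-axis question above can be checked directly. A minimal sketch (the predicate names are ours) showing that the union is not closed under addition:

```python
# The x-axis and y-axis are both subspaces of R^2, but their union is not:
# the sum of one nonzero vector from each axis leaves the union.

def on_x_axis(v): return v[1] == 0
def on_y_axis(v): return v[0] == 0
def in_union(v): return on_x_axis(v) or on_y_axis(v)

u, v = (1, 0), (0, 1)
s = (u[0] + v[0], u[1] + v[1])
print(in_union(u), in_union(v), in_union(s))  # True True False
```

Neither axis is contained in the other, matching the "only if" direction of Theorem 1.4.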
Example 1.3. Let V = C[−1, 1] and V0 = {f ∈ V : f is an odd function }. Check if V0 is a subspace of V.
Solution: The zero function is odd, hence belongs to V0 ⇒ V0 ≠ ∅. If f, g ∈ V0 and α ∈ R, then
(f + αg)(−x) = f(−x) + αg(−x) = −f(x) + α(−g(x)) = −(f + αg)(x) ⇒ f + αg
is an odd function. Hence V0 is a subspace of V.
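A numeric spot-check of Example 1.3, taking sin x and x³ as sample odd functions (our choice, for illustration):

```python
import math

# If f and g are odd, the combination h = f + alpha*g satisfies h(-x) = -h(x).
f = math.sin
g = lambda x: x ** 3
alpha = 2.5
h = lambda x: f(x) + alpha * g(x)

checks = [abs(h(-x) + h(x)) for x in (0.3, 1.0, 2.2)]
print(checks)  # all (numerically) zero
```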
Problems for Section 1
1. Let V be a vector space over F, where F is R or C. Show the following:
(a) For all x, y, z ∈ V, x+ y = z + y implies x = z.
(b) For all α, β ∈ F and x ∈ V with x ≠ 0, αx ≠ βx iff α ≠ β.
(c) V must have an infinite number of elements.
2. Is the set of positive real numbers with the operations defined by x+ y = xy and αx = xα, a vector space over
R?
3. Let V = [0,∞). For x, y ∈ V, α ∈ R, define x ⊕ y = xy, α ⊙ x = |α|x. Is (V, ⊕, ⊙) a vector space over R? Give
reasons.
4. In each of the following a non-empty set V is given and some operations are defined. Check whether V is a
vector space with these operations.
(a) V = {(x1, 0) : x1 ∈ R} with addition and scalar multiplication as in R2.
(b) V = {(x1, x2) ∈ R2 : 2x1 + 3x2 = 0} with addition and scalar multiplication as in R2.
(c) V = {(x1, x2) ∈ R2 : x1 + x2 = 1} with addition and scalar multiplication as in R2.
(d) V = R2, for (a1, a2), (b1, b2) ∈ V and α ∈ R, define (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2); 0 · (a1, a2) = (0, 0)
and for α ≠ 0, α(a1, a2) = (αa1, a2/α).
(e) V = C2, for (a1, a2), (b1, b2) ∈ V and α ∈ C, define (a1, a2) + (b1, b2) = (a1 + 2b1, a2 + 3b2), α(a1, a2) =
(αa1, αa2).
(f) V = R2, for (a1, a2), (b1, b2) ∈ V and α ∈ R define (a1, a2) + (b1, b2) = (a1 + b1, a2 + b2), α(a1, a2) = (a1, 0).
5. Check if V0 is a subspace of V :
(a) Let V = P3 and V0 = {a0 + a1t + a2t2 + a3t3 : a0 = 0}.
(b) Let V = P3 and V0 = {a0 + a1t + a2t2 + a3t3 : a0 + a1 + a2 + a3 = 0}.
(c) V = R2 and V0 = any straight line passing through the origin. What are other subspaces of R2?
6. Is R[a, b], the set of all real valued Riemann integrable functions on [a, b], a vector space?
7. Is the set of all polynomials of degree 5 with usual addition and scalar multiplication of polynomials a vector
space?
8. Let S be a non-empty set, s ∈ S. Let V be the set of all functions f : S → R with f(s) = 0. Is V a vector space
over R with the usual addition and scalar multiplication of functions?
9. In each of the following, a vector space V and a subset W are given. Check whether W is a subspace of V .
(a) V = R2; W = {(x1, x2) : x2 = 2x1 − 1}.
(b) V = R3; W = {(x1, x2, x3) : 2x1 − x2 − x3 = 0}.
(c) V = C[0, 1]; W = {f ∈ V : f is differentiable}.
(d) V = C[−1, 1]; W = {f ∈ V : f is an odd function }.
(e) V = C[0, 1]; W = {f ∈ V : f(x) ≥ 0 for all x}.
(f) V = P3(R); W = {at+ bt2 + ct3 : a, b, c ∈ R}.
(g) V = P3(C); W = {a+ bt+ ct2 + dt3 : a, b, c, d ∈ C, a+ b+ c+ d = 0}.
(h) V = P3(C); W = {a+ bt+ ct2 + dt3 : a, b, c, d integers }.
10. Let A ∈ Rn×n. Let 0 ∈ Rn×1 be the zero vector. Is the set of all x ∈ Rn×1 with Ax = 0 a subspace of Rn×1?
11. Suppose U is a subspace of V and V is a subspace of W. Is U a subspace of W?
12. Let ℓ1(N) be the set of all absolutely summable sequences and ℓ∞(N) be the set of all bounded sequences with
entries from F. Show that ℓ1(N) and ℓ∞(N) are vector spaces over F.
13. For a nonempty set S, let V be the vector space of all functions from S to R. Show that the set B(S) of all
bounded functions on S is a subspace of V.
14. Give an example to show that union of two subspaces need not be a subspace.
2 Span
Let V be a vector space and v ∈ V. Let V0 = {αv : α ∈ F}. Then V0 is a subspace of V. In general, if v1, . . . , vn ∈ V,
then V0 = {α1v1 + · · · + αnvn : αi ∈ F for all i = 1, . . . , n} is a subspace of V.
Definition 2.1. Let V be a vector space and v1, . . . , vn ∈ V . Then, by a linear combination of v1, . . . , vn, we mean
an element in V of the form α1v1 + · · ·+ αnvn with αj ∈ F, j = 1, . . . , n.
Let S be a non-empty subset of V . Then the set of all linear combinations of elements of S is called the span of
S, and is denoted by spanS.
Span of the empty set is taken to be {0}.
Example 2.1.
1. span({0}) = {0}.
2. C = span{1, i} with scalars from R.
3. Let e1 = (1, 0), e2 = (0, 1). Then R2 = span{e1, e2}.
4. If ei denotes the vector having 1 at the ith place and 0 elsewhere, then Rn = span{e1, . . . , en}.
5. P3 = span{1, t, t2, t3}.
6. Let P denote the set of all polynomials of all degrees. Then P = span{1, t, t2, . . .}.
Caution: The set S can have infinitely many elements, but a linear combination involves only finitely many of them.
Theorem 2.1. Let V be a vector space and S ⊆ V . Then spanS is the smallest subspace of V that contains S.
Proof. If S = ∅, then spanS = {0}. If S 6= ∅, x, y ∈ spanS, then x = α1x1 + · · ·+ αnxn and y = β1y1 + · · ·+ βmym
for some αi, βj ∈ F and xi, yj ∈ S. Then for α ∈ F,
x + αy = α1x1 + · · · + αnxn + αβ1y1 + · · · + αβmym ∈ spanS.
If V0 is a subspace containing S, then V0 contains all linear combinations of elements of S.
⇒ spanS ⊆ V0 ⇒ spanS is the smallest subspace containing S. �
Exercise 2.1. Prove or give a counterexample to the following:
1. S is a subspace of V if and only if spanS = S.
2. If S is a subset of a vector space V, then spanS = ⋂{Y : Y is a subspace of V containing S}.
3. Let u1 = (a11, . . . , am1), . . . , un = (a1n, . . . , amn). Then the linear system of equations
a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . .
am1x1 + am2x2 + · · · + amnxn = bm
has a solution vector x = (x1, . . . , xn) if and only if b = (b1, . . . , bm) is in the span of {u1, . . . , un}.
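Item 3 can be tested mechanically: b lies in span{u1, . . . , un} exactly when appending b to the uᵢ's does not increase the rank. A sketch using exact rational row reduction (the helper `rank` is ours):

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots after forward elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

U = [(1, 0, 1), (0, 1, 1)]
in_span = rank(U + [(2, 3, 5)]) == rank(U)       # (2, 3, 5) = 2*u1 + 3*u2
not_in_span = rank(U + [(1, 0, 0)]) == rank(U)   # (1, 0, 0) is not in the span
print(in_span, not_in_span)  # True False
```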
Definition 2.2. Let V1 and V2 be subspaces of a vector space V . Sum of these subspaces is defined as
V1 + V2 = {u+ v : u ∈ V1 and v ∈ V2}.
Theorem 2.2. V1 + V2 = span(V1 ∪ V2).
Proof. Let u+ v ∈ V1 + V2. Then clearly u+ v ∈ span(V1 ∪ V2). [Note that V1 ∪ V2 ⊆ V1 + V2.]
Let v ∈ span(V1 ∪ V2). Then there exist u1, . . . , un ∈ V1 and v1, . . . , vm ∈ V2 such that
v = α1u1 + · · ·+ αnun + β1v1 + · · ·+ βmvm.
Since α1u1 + · · ·+ αnun ∈ V1 and β1v1 + · · ·+ βmvm ∈ V2, v ∈ V1 + V2.
Therefore, V1 + V2 = span(V1 ∪ V2), and in particular V1 + V2 is a subspace of V. □
What is x-axis + y-axis ?
Any vector (x, y) ∈ R2 can be uniquely written as x(1, 0) + y(0, 1).
Uniqueness can fail for other spanning sets. For example, take u = (1, 1), v = (−1, 2), w = (1, 0). Then
(2, 3) = 3(1, 1) + 0(−1, 2) − 1(1, 0) = 1(1, 1) + 1(−1, 2) + 2(1, 0).
In general, suppose V1 ∩ V2 = {0}. Then every element of V1 + V2 can be written uniquely as x1 + x2 with x1 ∈ V1
and x2 ∈ V2.
Uniqueness breaks down if V2 contains a nonzero scalar multiple of some nonzero vector of V1, i.e., if V1 ∩ V2 ≠ {0}.
The sum of two subsets S1 and S2 of a vector space V can be defined as the set of all vectors of the form x + y
where x ∈ S1 and y ∈ S2. However, we do not require it right now.
Problems for Section 2
1. Do the polynomials t3 − 2t2 + 1, 4t2 − t + 3, and 3t − 2 span P3? Justify your answer.
2. What is span{tn : n = 0, 2, 4, 6, . . .}?
3. Let u, v1, v2, . . . , vn be n+ 1 distinct vectors in a real vector space V.
Take S1 = {v1, v2, . . . , vn} and S2 = {u, v1, v2, . . . , vn}. Prove that
(a) If span(S1) = span(S2), then u ∈ span(S1).
(b) If u ∈ span(S1), then span(S1) = span(S2).
4. Let S be a subset of a vector space V . Show that S is a subspace if and only if S = span(S).
5. Let U be a subspace of V, v ∈ V \ U . Show that for every x ∈ span({v} ∪ U), there exists a unique pair
(α, y) ∈ F× U such that x = αv + y.
6. Show that span{e1 + e2, e2 + e3, e3 + e1} is R3, where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).
7. Let V be a vector space and A,B be subsets of V . Prove or disprove the following:
(a) A is a subspace of V if and only if span(A) = A.
(b) If A ⊆ B, then span(A) ⊆ span(B).
(c) span(A ∪B) = span(A) + span(B).
(d) span(A ∩B) ⊆ span(A) ∩ span(B).
8. Let U, V be subspaces of a vector space W. Prove or disprove the following:
(a) U ∩ V and U + V are subspaces of W .
(b) U + V = V iff U ⊆ V.
(c) U ∪ V is a subspace if and only if U ⊆ V or V ⊆ U.
(d) Let U ∩ V = {0}. If x ∈ U + V, then there are unique u ∈ U, v ∈ V such that x = u+ v.
9. Let u1(t) = 1, and for j = 2, 3, . . . , let uj(t) = 1 + t + . . . + tj−1. Show that span{u1, . . . , un} is Pn−1, and
span{u1, u2, . . .} is P.
10. Let U, V,W be subspaces of a real vector space X.
(a) Prove that (U ∩ V ) + (U ∩W ) ⊆ U ∩ (V +W ).
(b) Give an example with appropriate U, V,W,X to show that
U ∩ (V + W) ⊄ (U ∩ V) + (U ∩ W).
3 Linear Independence
Definition 3.1. A set of vectors {v1, . . . , vn} is said to be linearly dependent if one of the vectors can be written
as a linear combination of the others, i.e., there exist j ∈ {1, . . . , n} and αi ∈ F, i ≠ j, such that vj = ∑_{i ≠ j} αivi.
A set of vectors {v1, . . . , vn} is said to be linearly independent if the set is not linearly dependent, i.e., none of
the vectors can be written as a linear combination of the others.
Thus, {v1, . . . , vn} is linearly dependent if there exists j such that vj ∈ span{vi : i ≠ j}.
Given a set of vectors, how do we verify these properties?
{v1, . . . , vn} is linearly dependent ⇒ vj = α1v1 + · · ·+ αj−1vj−1 + αj+1vj+1 + · · ·+ αnvn.
⇒ α1v1 + · · ·+ αnvn = 0 taking αj = −1.
Conversely, suppose α1v1 + · · ·+ αnvn = 0 for some αi ∈ F.
Suppose αj ≠ 0 for at least one j. Then
vj = −(α1/αj)v1 − · · · − (αn/αj)vn (the sum omitting the j-th term).
Therefore, {v1, . . . , vn} is linearly dependent.
So we conclude:
• {v1, . . . , vn} is linearly dependent if and only if
there exist α1, . . . , αn ∈ F, not all zero, such that α1v1 + · · ·+ αnvn = 0.
• {v1, . . . , vn} is linearly independent if and only if
for any set of scalars α1, . . . , αn ∈ F such that at least one of them is non-zero, α1v1 + · · · + αnvn ≠ 0.
i.e., for scalars α1, . . . , αn ∈ F, if α1v1 + · · ·+ αnvn = 0, then αi = 0 for all i = 1, . . . , n.
Example 3.1.
1. {(1, 0), (0, 1)} is linearly independent in R2.
2. Is {(1, 0, 0), (1, 1, 0), (1, 1, 1)} linearly independent in R3?
Suppose α(1, 0, 0) + β(1, 1, 0) + γ(1, 1, 1) = (0, 0, 0).
⇒ α+ β + γ = 0; β + γ = 0; γ = 0.
⇒ α = β = γ = 0 ⇒ the vectors are linearly independent.
3. {1, t, t2} ⊂ P2 is linearly independent.
4. Is {sinx, cosx} ⊂ C[−π, π] linearly independent?
Solution: Suppose α sinx+ β cosx = 0.
Putting x = 0 we get β = 0.
Now putting x = π/2, we get α = 0.
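Independence checks like the one in item 2 can also be brute-forced over small integer coefficients; only the trivial combination should give zero. A sketch:

```python
from itertools import product

# Search all integer coefficient triples (a, b, c) in [-3, 3] for combinations
# a*v1 + b*v2 + c*v3 = 0.  Independence means only (0, 0, 0) survives.
v = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
solutions = [
    (a, b, c)
    for a, b, c in product(range(-3, 4), repeat=3)
    if all(a * v[0][i] + b * v[1][i] + c * v[2][i] == 0 for i in range(3))
]
print(solutions)  # only the trivial combination (0, 0, 0)
```

This only samples finitely many coefficients, of course; the back-substitution argument above is the actual proof.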
Note that
• Any set containing the zero vector is linearly dependent.
• The set {e1, . . . , en} is linearly independent and span{e1, . . . , en} = Rn.
• The set {1, t, . . . , tn} is linearly independent and Pn = span{1, t, . . . , tn}.
• Moreover, that {v1, . . . , vn} is linearly dependent DOES NOT IMPLY that each vector is in the span of the
remaining vectors.
Exercise 3.1. Are the following true?
1. {sinx, sin 2x, sin 3x, . . . , sinnx} ⊂ C[−π, π] is linearly independent.
2. {(1, 0), (1, 1), (2, 2)} is linearly dependent and (1, 0) ∉ span{(1, 1), (2, 2)}.
Example 3.2. {u, v} is linearly dependent if and only if one of them is a scalar multiple of the other.
Solution: If {u, v} is linearly dependent, then either u ∈ span{v} or v ∈ span{u}, i.e., either u = αv or v = βu.
Conversely suppose one of them is a scalar multiple of the other, say u = αv. Then u ∈ span{v} ⇒ {u, v} is
linearly dependent.
Example 3.3. If {u1, . . . , un} is linearly dependent, then for any vector v ∈ V, the set {u1, . . . , un, v} is linearly
dependent.
Solution: Suppose α1u1 + · · · + αnun = 0 with αi ≠ 0 for at least one i.
⇒ α1u1 + · · ·+ αnun + 0 · v = 0
⇒ {u1, . . . , un, v} is linearly dependent.
Theorem 3.1. Let {v1, v2, . . . , vn} be a linearly dependent ordered set in a vector space V. Then there exists k ∈ {1, . . . , n} such that vk ∈ span{v1, . . . , vk−1}.
Proof. If v1 is the zero vector, then k = 1 works, with the convention that {v1, . . . , vk−1} is then the empty set, whose span is {0}. So, assume that
v1 ≠ 0. Since the given set of vectors is linearly dependent, we have scalars α1, . . . , αn, not all zero, such that
α1v1 + α2v2 + · · ·+ αnvn = 0.
Let k be the maximum integer in {1, 2, . . . , n} such that αk ≠ 0. This means αk ≠ 0 but all of αk+1, . . . , αn are zero.
Notice that if αn ≠ 0, then k = n. Now, the equation above reduces to
α1v1 + α2v2 + · · · + αkvk = 0, αk ≠ 0.
In that case, vk = −(1/αk)(α1v1 + α2v2 + · · · + αk−1vk−1).
That is, vk ∈ span{v1, . . . , vk−1}. �
Notice that considering the linearly dependent set as an ordered set has this bonus.
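Theorem 3.1 is effectively an algorithm: scanning the ordered set left to right, the first vk whose addition does not raise the rank lies in the span of its predecessors. A sketch (helper names are ours) using exact rational row reduction:

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots after forward elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def first_dependent(vectors):
    """Least k (1-based) with v_k in span{v_1, ..., v_{k-1}}, or None."""
    for k in range(1, len(vectors) + 1):
        if rank(vectors[:k]) == rank(vectors[:k - 1]):
            return k
    return None

print(first_dependent([(1, 0), (0, 1), (1, 1)]))  # 3
```

For a zero first vector this returns k = 1, matching the convention in the proof.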
Problems for Section 3
1. Answer the following questions with justification:
(a) Is every subset of a linearly independent set linearly independent?
(b) Is every subset of a linearly dependent set linearly dependent?
(c) Is every superset of a linearly independent set linearly independent?
(d) Is every superset of a linearly dependent set linearly dependent?
(e) Is union of two linearly independent sets linearly independent?
(f) Is union of two linearly dependent sets linearly dependent?
(g) Is intersection of two linearly independent sets linearly independent?
(h) Is intersection of two linearly dependent sets linearly dependent?
2. Give three vectors in R2 such that none of the three is a scalar multiple of another.
3. Suppose S is a set of vectors and some v ∈ S is not a linear combination of other vectors in S. Is S linearly
independent?
4. In each of the following, a vector space V and A ⊆ V are given. Determine whether A is linearly dependent and
if it is, express one of the vectors in A as a linear combination of the remaining vectors.
(a) V = R3, A = {(1, 0,−1), (2, 5, 1), (0,−4, 3)}.
(b) V = R3, A = {(1, 2, 3), (4, 5, 6), (7, 8, 9)}.
(c) V = R3, A = {(1,−3,−2), (−3, 1, 3), (2, 5, 7)}.
(d) V = P3, A = {t2 − 3t+ 5, t3 + 2t2 − t+ 1, t3 + 3t2 − 1}.
(e) V = P3, A = {−2t3 − 11t2 + 3t+ 2, t3 − 2t2 + 3t+ 1, 2t3 + t2 + 3t− 2}.
(f) V = P3, A = {6t3 − 3t2 + t+ 2, t3 − t2 + 2t+ 3, 2t3 + t2 − 3t+ 1}.
(g) V = F2×2, A = {(1 0; 0 1), (0 0; 0 1), (0 1; 1 0)}, where (a b; c d) denotes the 2 × 2 matrix with rows (a, b) and (c, d).
(h) V = {f : R→ R}, A = {2, sin2 t, cos2 t}.
(i) V = {f : R→ R}, A = {1, sin t, sin 2t}.
(j) V = C([−π, π]), A = {sin t, sin 2t, . . . , sinnt} where n is some natural number.
5. Show that two vectors (a, b) and (c, d) in R2 are linearly independent if and only if ad − bc ≠ 0.
6. Let A = (aij) ∈ Rn×n and let w1, . . . , wn be the n columns of A. Let {u1, . . . , un} be linearly independent in
Rn. Define vectors v1, . . . , vn by vj = a1ju1 + . . .+ anjun, for j = 1, 2, . . . , n.
Show that {v1, v2, . . . , vn} is linearly independent iff {w1, w2, . . . , wn} is linearly independent.
7. Let A,B be subsets of a vector space V. Prove or disprove:
span(A) ∩ span(B) = {0} iff A ∪B is linearly independent.
8. Show that a subset {v1, v2, . . . , vn} of a vector space V is linearly independent iff the function f : Fn → V
defined by f(α1, α2, . . . , αn) = α1v1 + · · · + αnvn is one-one.
9. Suppose {u1, . . . , un} is linearly independent and Y is a subspace of V with span{u1, . . . , un} ∩ Y = {0}. Prove
that every vector x in the span of {u1, . . . , un} ∪ Y can be written uniquely as x = α1u1 + · · ·+ αnun + y with
α1, . . . , αn ∈ F and y ∈ Y .
10. Let p1(t) = 1 + t+ 3t2, p2(t) = 2 + 4t+ t2, p3(t) = 2t+ 5t2. Are the polynomials p1, p2, p3 linearly independent?
11. Prove that in the vector space of all real valued functions, the set of functions {ex, xex, x3ex} is linearly inde-
pendent.
12. Suppose V1 and V2 are subspaces of a vector space V such that V1 ∩ V2 = {0}. Show that every x ∈ V1 + V2 can
be written uniquely as x = x1 + x2 with x1 ∈ V1 and x2 ∈ V2.
4 Basis and dimension
Definition 4.1. Let V be a vector space over F. A set which is linearly independent and spans a vector space V is
called a basis of V .
Notice that a basis depends on the underlying field since linear combinations depend on the field.
A basis is NOT unique: for example, both {1} and {2} are R-bases of R. In fact {x}, for any non-zero x ∈ R, is an
R-basis of R.
Example 4.1. Verify if {(1, 1), (1, 2)} is an R-basis of R2.
Solution: Since neither of them is a scalar multiple of the other, the set is linearly independent.
Is span{(1, 1), (1, 2)} = R2?
Does the equation (x, y) = α(1, 1) + β(1, 2) have a solution in α, β?
α + β = x; α + 2β = y ⇒ β = y − x and α = 2x − y.
⇒ span{(1, 1), (1, 2)} = R2. Hence {(1, 1), (1, 2)} is an R-basis of R2.
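The formulas β = y − x and α = 2x − y derived above give the coordinates of any (x, y) in this basis. A quick sketch:

```python
def coords(x, y):
    """Coordinates (alpha, beta) of (x, y) in the basis {(1,1), (1,2)}."""
    return 2 * x - y, y - x

samples = [(3, 5), (-1, 4), (0, 0)]
recombined = [
    (a * 1 + b * 1, a * 1 + b * 2)      # alpha*(1,1) + beta*(1,2)
    for a, b in (coords(x, y) for x, y in samples)
]
print(recombined)  # [(3, 5), (-1, 4), (0, 0)]
```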
Example 4.2. {e1, . . . , en} is an R-basis of Rn. This basis is called Standard Basis of Rn.
Consider the set {Mij : i = 1, . . . , m; j = 1, . . . , n}, where Mij is the matrix with (i, j)-th entry 1 and all other entries
0. Then this is a basis of Mm×n(F), called the standard basis.
Exercise 4.1. Prove (1-2) and answer (3-4):
1. {1, 1 + t, 1 + t+ t2} is a basis for P2.
2. If {p1(t), . . . , pr(t)} ⊂ P is a set of polynomials such that deg p1 < deg p2 < · · · < deg pr, then {p1(t), . . . , pr(t)} is
linearly independent.
3. Find a basis for Pn similar to (1).
4. Can you find a basis for R2 consisting of 3 vectors?
Theorem 4.1. Let V be a vector space and B ⊂ V . Then the following are equivalent:
1. B is a basis of V
2. B is a maximal linearly independent set in V ,
i.e., B is linearly independent and every proper superset of B is linearly dependent.
3. B is a minimal spanning set of V ,
i.e., span(B) = V and no proper subset of B spans V.
Proof.
(1) ⇒ (2): Since B is a basis, span(B) = V. If v ∈ V \ B, then v is a linear combination of elements of B ⇒ B ∪ {v}
is linearly dependent ⇒ B is a maximal linearly independent subset of V.
(2) ⇒ (3): Since B is linearly independent, for any v ∈ B, v ∉ span(B \ {v}) ⇒ no proper subset of B can span V. If
v ∈ V \ span(B), then B ∪ {v} is linearly independent, which contradicts maximality ⇒ span(B) = V.
(3) ⇒ (1): Assume that B is a minimal spanning set of V ⇒ span(B) = V . Suppose B is linearly dependent, i.e.,
there exists u ∈ B such that u ∈ span(B \ {u}). Thus, span(B \ {u}) = V, which contradicts the assumption that B
is a minimal spanning set. �
Definition 4.2. A vector space V is said to be finite dimensional if there exists a finite basis for V . A vector space
which is not finite dimensional is called infinite dimensional vector space.
Example 4.3. Fn,Pn,Mm×n(F) are finite dimensional vector spaces over F.
Theorem 4.2. If a vector space has a finite spanning set, then it has a finite basis.
Proof. Let V = spanS for some finite subset S of V. If S is linearly independent, then S is a basis.
Otherwise, there exists u1 ∈ S such that u1 ∈ span(S \ {u1}). Therefore, span(S \ {u1}) = V . If S1 = S \ {u1} is
linearly independent, then S1 is a basis.
Otherwise, one can repeat the process. The process has to stop since S is a finite set and we end up with a subset
Sk of S such that Sk is linearly independent and spanSk = V . �
Example 4.4. Compute a basis of V = {(x, y) ∈ R2 : 2x − y = 0}.
(x, y) ∈ V ⇒ y = 2x ⇒ (x, y) = (x, 2x) = x(1, 2) ⇒ V = span{(1, 2)}. Since {(1, 2)} is linearly independent, it is a
basis for V.
Example 4.5. Compute a basis of V = {(x, y, z) ∈ R3 : x − 2y + z = 0}.
(x, y, z) ∈ V ⇒ x = 2y − z ⇒ (x, y, z) = (2y − z, y, z) = y(2, 1, 0) + z(−1, 0, 1)
⇒ V = span{(2, 1, 0), (−1, 0, 1)}. Check that these vectors are linearly independent.
⇒ {(2, 1, 0), (−1, 0, 1)} is a basis of V.
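A sketch verifying Example 4.5 on a few sample parameters (the helper names are ours): every solution of x − 2y + z = 0 is y(2, 1, 0) + z(−1, 0, 1), and each such vector does lie in the plane.

```python
b1, b2 = (2, 1, 0), (-1, 0, 1)

def general_solution(y, z):
    """The parametrisation (2y - z, y, z) found above."""
    return (2 * y - z, y, z)

ok = True
for y, z in [(1, 0), (0, 1), (3, -2)]:
    v = general_solution(y, z)
    ok &= (v[0] - 2 * v[1] + v[2] == 0)                        # v is in the plane
    ok &= (v == tuple(y * a + z * b for a, b in zip(b1, b2)))  # v = y*b1 + z*b2
print(ok)  # True
```

Independence of b1 and b2 can be read off the last two coordinates: y·b1 + z·b2 = 0 forces y = z = 0.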
Note: In R, any two real numbers are linearly dependent.
Exercise 4.2.
1. Compute a basis of V = {(x1, . . . , x5) ∈ R5 : x1 + x3 − x5 = 0 and x2 − x4 = 0}.
2. In R2, any three vectors are linearly dependent.
3. Any four polynomials in P2 are linearly dependent.
Spanning sets and linearly independent sets share a little secret.
Theorem 4.3. Let A = {u1, . . . , um}, B = {v1, . . . , vn} be subsets of a vector space V. Suppose that A is linearly
independent and B spans V. Then m ≤ n.
Proof. Assume that m > n. Then we have vectors un+1, . . . , um in A. Since B is a spanning set, u1 ∈ span(B). Thus,
the set
B1 = {u1, v1, v2, . . . , vn}
is linearly dependent. Now, consider B1 as an ordered set, the ordering of its elements being as they are written,
that is, u1 is its first element, v1 is its second element, and so on. By Theorem 3.1, we have a vector vk such that
vk ∈ span{u1, v1, . . . , vk−1}. Remove this vk from B1 to obtain the set
C1 = {u1, v1, . . . , vk−1, vk+1, . . . , vn}.
Notice that span(C1) = span(B1) = span(B) = V. Add u2 to the set C1 to form the set
B2 = {u2, u1, v1, . . . , vk−1, vk+1, . . . , vn}.
Again, B2 is linearly dependent. Then, for some j, vj ∈ span{u2, u1, v1, . . . , vj−1}. Notice that due to the linear
independence of {u1, . . . , um}, u2 is not in the linear span of u1; only one of the v's can be in the span of the previous vectors.
Remove this vj from B2 to obtain a set C2. Again, span(C2) = span(B2) = span(B) = V.
If possible, continue this process of introducing a u and removing a v for n steps. Finally, vn is removed and we
end up with the set Cn = {un, un−1, . . . , u2, u1} which spans V. Then un+1 ∈ span(Cn). This is a contradiction since
A is linearly independent. Therefore, our assumption that m > n is wrong. �
Given a vector space, do we have an upper bound for the cardinality of a linearly independent set? The following
corollaries answer such questions.
Corollary 4.4. Let V be a vector space with a basis consisting of n elements. Then any subset of V having n + 1 vectors
is linearly dependent.
Corollary 4.5. Any two bases of a finite dimensional vector space have the same cardinality.
Corollary 4.6. If a vector space contains an infinite linearly independent subset, then it is an infinite dimensional
space.
Definition 4.3. Suppose V is a finite dimensional vector space. Then the cardinality of a basis is said to be the
dimension of V , denoted by dimV .
Observation: Let V be a vector space of dimension n and A be a subset of V containing m vectors. Then
(a) If A is linearly independent, then m ≤ n.
(b) If m > n, then A is linearly dependent.
(c) If A is linearly independent and m = n, then A is a basis of V .
Example 4.6.
1. dimR = 1
2. dimRn = n
3. Consider C as a vector space over R; then dimC = 2.
4. Consider Cn as a vector space over R. What is dimCn? Can you produce an R-basis for Cn?
5. dimPn = n+ 1
6. What is the dimension of Mm×n(R)?
7. The set of all polynomials of all degrees, P, is an infinite dimensional vector space.
Solution: Suppose the dimension is finite, say dimP = n. Then any set of n + 1 vectors is linearly dependent.
But {1, t, . . . , tn} is a set of n + 1 linearly independent vectors, a contradiction.
8. C[a, b] is an infinite dimensional vector space.
Solution: Take the collection of functions {fn : fn(x) = xn for all x ∈ [a, b]; n = 0, 1, 2, . . .}. Then {f0, f1, . . . , fn}
is linearly independent for every n. So, C[a, b] cannot have a finite basis.
9. What is the dimension of {0}? Ans: dim{0} = 0.
Let {u, v} ⊂ R2 be linearly independent. Then can span{u, v} be a proper subset of R2?
Let {u, v, w} ⊂ R3 be linearly independent. Then, can span{u, v, w} be a proper subset of R3?
If V is a finite dimensional vector space, then so is any subspace of V .
For, if a subspace W contains an infinite linearly independent set, then that set will remain linearly independent in V
as well. So, V is infinite dimensional, which contradicts our assumption.
Theorem 4.7. If V is a finite dimensional vector space and W is a proper subspace of V , then dimW < dimV .
Proof. Since V is finite dimensional, so is W. Let {w1, . . . , wm} be a basis of W and v ∈ V \ W. Then {w1, . . . , wm, v}
is linearly independent in V. (Prove this!) So, V contains m + 1 linearly independent vectors.
If a basis of V had m or fewer elements, then there could not be a linearly independent set of cardinality m + 1.
Thus, dimV ≥ m + 1 > m = dimW. □
If W is a subspace of a finite dimensional vector space V , then can we relate their bases?
Can we say that BW ⊂ BV ?
Let W = {(x, 0) : x ∈ R} and V = R2. Then {(2, 0)} is a basis of W and {(1, 0), (0, 1)} is a basis of V.
Theorem 4.8. Let W be a subspace of a finite dimensional vector space V and BW = {u1, . . . , um} be a basis of W .
Then there exists a basis BV of V such that BW ⊆ BV .
Proof. If W = V, then there is nothing to prove. Suppose W is properly contained in V. Let um+1 ∈ V \ W. Then
{u1, . . . , um, um+1} is linearly independent in V. Let W1 = span{u1, . . . , um, um+1}.
If dimV = m + 1, then W1 = V and hence we are done.
If not, then W1 is properly contained in V and hence there exists um+2 ∈ V \W1. So, {u1, . . . , um+2} is linearly
independent in V .
If dimV = m+ 2, then we are done. Otherwise continue as before.
If dimV = n, then Wn−m = span{u1, . . . , um, um+1, . . . , un} will be equal to V . �
Given two subspaces V1 and V2 of a finite dimensional vector space, we have two other subspaces,
span(V1 ∪ V2) = V1 + V2 and V1 ∩ V2.
How are the dimensions of these spaces related with the dimensions of the individual spaces?
We can easily say that dim(V1 ∩ V2) ≤ dimVi for each i = 1, 2.
If V1 ⊆ V2, then V1 ∩ V2 = V1, so equality can hold in the above inequality for one of the i’s.
Exercise 4.3. Prove that if dim(V1 ∩ V2) = dimV1, then V1 ⊆ V2.
Can we relate dim(V1 + V2) to dimV1 and dimV2?
Let V1 = {(x, y) ∈ R2 : 2x− y = 0} and V2 = {(x, y) ∈ R2 : x+ y = 0}. What is dim(V1 + V2)?
Let V1 = {(x, y, z) ∈ R3 : x+ y + z = 0} and V2 = {(x, y, z) ∈ R3 : x− y − z = 0}. What is dim(V1 + V2)?
{(1, 0,−1), (1,−1, 0)} is a basis of V1 and {(1, 1, 0), (1, 0, 1)} is a basis of V2.
Then V1 + V2 = R3 ⇒ dim(V1 + V2) = 3 < dimV1 + dimV2.
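This example can be checked numerically. A small NumPy sketch (variable names are mine): the dimension of a span is the rank of the matrix whose rows are the spanning vectors, and the intersection of the two planes is the solution space of their two defining equations.

```python
import numpy as np

# Bases of the two planes from the example above (rows are basis vectors).
B1 = np.array([[1, 0, -1], [1, -1, 0]])   # basis of V1 = {x + y + z = 0}
B2 = np.array([[1, 1, 0], [1, 0, 1]])     # basis of V2 = {x - y - z = 0}

dim_V1 = np.linalg.matrix_rank(B1)        # 2
dim_V2 = np.linalg.matrix_rank(B2)        # 2

# V1 + V2 is spanned by the union of the two bases:
dim_sum = np.linalg.matrix_rank(np.vstack([B1, B2]))   # 3

# The intersection is the solution space of both plane equations:
normals = np.array([[1, 1, 1], [1, -1, -1]])
dim_intersection = 3 - np.linalg.matrix_rank(normals)  # 1: the planes meet in a line

print(dim_V1, dim_V2, dim_sum, dim_intersection)       # 2 2 3 1
```

Note that dim_sum (3) falls short of dim_V1 + dim_V2 (4) by exactly dim_intersection (1).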
Are we counting something twice?
Theorem 4.9. Let V1 and V2 be finite dimensional subspaces of a vector space V .
Then dim(V1 + V2) = dimV1 + dimV2 − dim(V1 ∩ V2).
Proof. If V1 ∩ V2 = {0}, then the argument below applies with an empty basis for V1 ∩ V2. Suppose V1 ∩ V2 ≠ {0}. Then V1 ∩ V2 is a finite dimensional vector space.
Let {x1, . . . , xn} be a basis of V1 ∩ V2. If B1 is a basis of V1 and B2 is a basis of V2, then span(B1 ∪B2) = V1 + V2.
Then V1 + V2 has a finite spanning set ⇒ V1 + V2 is finite dimensional.
Note that V1 ∩ V2 is a subspace of V1 as well as V2.
Let {x1, . . . , xn, y1, . . . , ym} be a basis of V1 and {x1, . . . , xn, w1, . . . , wk} be a basis of V2.
Claim: B = {x1, . . . , xn, y1, . . . , ym, w1, . . . , wk} is a basis of V1 + V2.
Proof of the Claim: First we show that span(B) = V1 + V2.
Let v1 + v2 ∈ V1 + V2. Then
v1 = α1x1 + · · ·+ αnxn + αn+1y1 + · · ·+ αn+mym;
v2 = β1x1 + · · ·+ βnxn + βn+1w1 + · · ·+ βn+kwk.
⇒ v1 + v2 = (α1 + β1)x1 + · · ·+ (αn + βn)xn + αn+1y1 + · · ·+ αn+mym + βn+1w1 + · · ·+ βn+kwk
⇒ V1 + V2 ⊆ span(B) ⇒ span(B) = V1 + V2.
We now prove that B is linearly independent.
Suppose α1x1 + · · ·+ αnxn + β1y1 + · · ·+ βmym + γ1w1 + · · ·+ γkwk = 0.
⇒ α1x1 + · · ·+ αnxn + β1y1 + · · ·+ βmym = −γ1w1 − · · · − γkwk
⇒ − γ1w1 − · · · − γkwk ∈ V1 ∩ V2.
⇒ − γ1w1 − · · · − γkwk = a1x1 + · · ·+ anxn for some a1, . . . , an ∈ F
⇒ a1x1 + · · ·+ anxn + γ1w1 + · · ·+ γkwk = 0.
{x1, . . . , xn, w1, . . . , wk} is a basis of V2
⇒ they are linearly independent
⇒ a1 = · · · = an = γ1 = · · · = γk = 0.
Substituting the values of γi’s, we get α1x1 + · · ·+ αnxn + β1y1 + · · ·+ βmym = 0.
Since {x1, . . . , xn, y1, . . . , ym} is linearly independent, α1 = · · · = αn = β1 = · · · = βm = 0.
⇒ {x1, . . . , xn, y1, . . . , ym, w1, . . . , wk} is a basis for V1 + V2
Therefore dim(V1 + V2) = n+m+ k = (n+m) + (n+ k)− n = dimV1 + dimV2 − dim(V1 ∩ V2). □
Corollary 4.10. Two distinct planes through the origin in R3 intersect in a line.
Problems for Section 4
1. Determine which of the following sets form bases for P2.
(a) {−1− t− 2t2, 2 + t− 2t2, 1− 2t+ 4t2}.
(b) {1 + 2t+ t2, 3 + t2, t+ t2}.
(c) {1 + 2t+ 3t2, 4− 5t+ 6t2, 3t+ t2}.
2. Let M = (aij) be an m× n matrix with aij ∈ F and n > m. Show that there exists (α1, . . . , αn) ∈ Fn such that
ai1α1 + ai2α2 + · · ·+ ainαn = 0, for all i = 1, . . . ,m.
3. Let {x, y, z} be a basis for a vector space V. Is {x+ y, y + z, z + x} also a basis for V ?
4. Extend the set {1 + t2, 1− t2} to a basis of P3.
5. Find a basis for the subspace {(x1, x2, x3) ∈ R3 : x1 + x2 + x3 = 0} of R3.
6. Is {1 + tn, t+ tn, . . . , tn−1 + tn, tn} a basis for Pn?
7. Let u1 = 1 and let uj = 1 + t+ t2 + · · ·+ tj−1 for j = 2, 3, 4, . . .
Is span{u1, . . . , un} = Pn? Is span{u1, u2, . . .} = P?
8. Prove: The set {v1, . . . , vn} is a basis for a vector space V iff corresponding to each v ∈ V, there exist unique
α1, . . . , αn ∈ F such that v = α1v1 + · · ·+ αnvn.
9. Prove that the only non-trivial proper subspaces of R2 are the straight lines passing through the origin.
10. Find bases and dimensions of the following subspaces of R5:
(a) {(x1, x2, x3, x4, x5) ∈ R5 : x1 − x3 − x4 = 0}.
(b) {(x1, x2, x3, x4, x5) ∈ R5 : x2 = x3 = x4, x1 + x5 = 0}.
(c) span{(1,−1, 0, 2, 1), (2, 1,−2, 0, 0), (0,−3, 2, 4, 2),
(3, 3,−4,−2,−1), (2, 4, 1, 0, 1), (5, 7,−3,−2, 0)}.
11. Find the dimension of the subspace span{1 + t2,−1 + t+ t2,−6 + 3t, 1 + t2 + t3, t3} of P3.
12. For i = 1, 2, . . . ,m, and j = 1, 2, . . . , n, let Eij ∈ Fm×n be the matrix whose ij-th entry is 1 and all other entries
are 0. Show that the set of all such Eij is a basis for Fm×n.
13. Find a basis, and hence dimension, for each of the following subspaces of the vector space of all twice differentiable
functions from R to R:
(a) {x ∈ V : x′′ + x = 0}.
(b) {x ∈ V : x′′ − 4x′ + 3x = 0}.
(c) {x ∈ V : x′′′ − 6x′′ + 11x′ − 6x = 0}.
14. Let U = {[ a −a ; b c ] : a, b, c ∈ R} and V = {[ a b ; −a c ] : a, b, c ∈ R}, where [ p q ; r s ] denotes the 2× 2 matrix with rows (p, q) and (r, s).
(a) Prove that U and V are subspaces of R2×2.
(b) Find bases, and hence dimensions, for U ∩ V, U, V, and U + V.
15. Show that if V1 and V2 are subspaces of R9 such that dimV1 = 5 = dimV2, then V1 ∩ V2 ≠ {0}.
16. Let {e1, e2, e3} be the standard basis of R3. What is span{e1 + e2, e2 + e3, e3 + e1}?
17. Given a0, a1, . . . , an ∈ R, let V = {x(t) ∈ Ck[0, 1] : an x^(n)(t) + · · ·+ a1 x′(t) + a0 x(t) = 0}.
Show that V is a subspace of Ck[0, 1], and find its dimension.
18. Let V = span{(1, 2, 3), (2, 1, 1)} and W = span{(1, 0, 1), (3, 0,−1)}. Find a basis for V ∩W . Also, find dim(V +W ).
19. Given real numbers a0, a1, . . . , ak, let V be the set of all solutions x ∈ Ck[a, b] of the differential equation
a0 (d^k x/dt^k) + a1 (d^{k−1}x/dt^{k−1}) + · · ·+ ak x = 0.
Show that V is a vector space over R. What is dimV ?
20. Consider each polynomial in P as a function from the set {0, 1, 2} to R. Is the set of vectors {t, t2, t3, t4, t5} linearly independent?
21. Suppose S is a set consisting of n elements and V is the set of all real valued functions defined on S. Show that
V is a vector space of dimension n.
5 Linear Transformations
Let A denote an m × n matrix with real entries. Then for any vector x = (x1, . . . , xn)t ∈ Rn, Ax is a vector in Rm.
Also, it satisfies the properties:
A(x+ y) = Ax+Ay and A(αx) = αAx for x, y ∈ Rn and α ∈ R.
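These two properties are easy to check numerically; a minimal NumPy sketch, where the matrix A and the vectors are arbitrary choices of mine:

```python
import numpy as np

# A hypothetical 2x3 matrix A, viewed as a map from R^3 to R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])
x = np.array([1.0, 0.0, 2.0])
y = np.array([-1.0, 1.0, 1.0])
alpha = 2.5

# The two defining properties of a linear map:
additive = np.allclose(A @ (x + y), A @ x + A @ y)
homogeneous = np.allclose(A @ (alpha * x), alpha * (A @ x))
print(additive, homogeneous)   # True True
```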
If f and g are two differentiable functions from an interval [a, b] to R, then
(d/dt)(f + g) = df/dt + dg/dt and (d/dt)(αf) = α df/dt.
If f and g are two continuous functions from [a, b] to R, then
∫a^b (f + g)(t) dt = ∫a^b f(t) dt + ∫a^b g(t) dt and ∫a^b (αf)(t) dt = α ∫a^b f(t) dt.
The interesting maps in a domain of mathematical discourse are the structure-preserving maps. In R, such maps are the continuous maps, since they map closed bounded intervals to closed bounded intervals. In group theory, these maps are the homomorphisms. In topology, these maps are the homeomorphisms. In vector spaces, these are the maps which
preserve the two operations of addition and scalar multiplication; they are called linear transformations.
Definition 5.1. Let V1 and V2 be vector spaces over F. A function T : V1 → V2 is said to be a linear transformation
(or a linear map) if T (x+ y) = T (x) + T (y) and T (αx) = αT (x) for every x, y ∈ V1 and for every α ∈ F.
A bijective linear transformation is called an isomorphism. Two spaces are called isomorphic to each other if
there exists an isomorphism from one to the other.
Example 5.1.
1. T : V → V, T (v) = 0 for all v ∈ V .
2. T : V → V, T (v) = v.
3. Let A ∈ Mm×n(R) and TA : Rn → Rm be the map defined by TA(x) = Ax. Then, we have seen that TA is a
linear map.
4. Let V be any vector space and α ∈ F. Then the map T : V → V defined by T (v) = αv is a linear transformation.
5. Define Tj : Rn → R by Tj(x1, . . . , xn) = xj . Then Tj is linear. More generally, for α1, . . . , αn ∈ R, the map T (x1, . . . , xn) = α1x1 + · · ·+ αnxn is a linear transformation.
6. Let V be a vector space with basis {u1, . . . , un}. Given any vector u, there exist unique scalars α1, . . . , αn ∈ F such
that u = α1u1 + · · ·+ αnun. Define the map Ti : V → F by Ti(u) = αi. Then Ti is a linear map.
Solution: Let u, v ∈ V ; u = α1u1 + · · ·+ αnun and v = β1u1 + · · ·+ βnun. ⇒ u+ v = (α1 + β1)u1 + · · ·+ (αn + βn)un.
This is a representation of u + v as a linear combination of {u1, . . . , un}. Since {u1, . . . , un} is a basis, this is THE
unique representation for u+ v.
⇒ Ti(u+ v) = αi + βi = Ti(u) + Ti(v). Similarly the other condition can be verified.
Exercise 5.1. In a similar manner, define a map T : V → Fn and prove that the map is linear.
Example 5.2.
1. Let a0 ∈ [a, b]. Define Ta0 : C[a, b]→ F by Ta0(f) = f(a0). Verify that Ta0 is a linear transformation.
2. Let T : C1[a, b]→ C[a, b] be defined by T (f) = f ′. Then T is linear.
3. Let T : C1[a, b]→ C[a, b] be defined by T (f) = αf + βf ′. Then verify that T is linear.
4. If T1 and T2 are linear transformations from V1 to V2, then the map T : V1 → V2 defined by
T (v) = αT1(v) + βT2(v) is a linear transformation.
5. Let A = [ cosφ sinφ ; − sinφ cosφ ]. Then for any x = (r cos θ, r sin θ) ∈ R2, the map T : R2 → R2 defined by Tx = Ax is the rotation through the angle φ (clockwise). We have already seen that this is a linear map.
Caution: Every map that ‘looks linear’ need not be linear: T : R→ R defined by T (x) = 2x+ 3.
Theorem 5.1. Let T : V → W be a linear transformation. Then for all vectors u, v, v1, . . . , vn ∈ V and scalars
α1, . . . , αn ∈ F:
(a) T (0) = 0
(b) T (u− v) = T (u)− T (v)
(c) T (α1v1 + · · ·+ αnvn) = α1T (v1) + · · ·+ αnT (vn).
Proof. (a) T (0) = T (0 + 0) = T (0) + T (0) ⇒ T (0) = 0.
(b) T (u− v) = T (u+ (−1)v) = T (u) + T ((−1)v) = T (u) + (−1)T (v) = T (u)− T (v).
(c) Apply induction on n. □
Consider the ‘differentiation map’ D : P3 → P2, defined by D(p(t)) = p′(t). We know that D(t3) = 3t2, D(t2) = 2t, D(t) = 1 and D(1) = 0; using the linearity of the differential operator, we can then obtain D(p(t)) for any polynomial p(t) ∈ P3.
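Since D is determined by its values on the basis {1, t, t2, t3}, it can be encoded as a matrix acting on coefficient vectors. A small NumPy sketch (the encoding as coefficient vectors is mine):

```python
import numpy as np

# Represent p(t) = a0 + a1 t + a2 t^2 + a3 t^3 by (a0, a1, a2, a3).
# D is determined by D(1)=0, D(t)=1, D(t^2)=2t, D(t^3)=3t^2; the columns
# below are the coordinate vectors of these images in the basis {1, t, t^2}.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]])

p = np.array([5, -1, 4, 2])    # p(t) = 5 - t + 4t^2 + 2t^3
print(D @ p)                   # [-1  8  6]: coefficients of p'(t) = -1 + 8t + 6t^2
```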
Suppose T : R2 → R is a linear map such that T (1, 0) = 2 and T (0, 1) = −1; then what is T (2, 3)?
What information is required to describe a linear map T?
Theorem 5.2. Let V be a finite dimensional vector space with basis B = {v1, . . . , vn}. If T1 and T2 are two linear
maps from V to another vector space W such that T1(vi) = T2(vi) for all i = 1, . . . , n, then T1(v) = T2(v) for all
v ∈ V .
Proof. Let v ∈ V . {v1, . . . , vn} is a basis of V ⇒ v = α1v1 + · · ·+ αnvn for some αi’s ∈ F. Then
T1(v) = T1(α1v1 + · · ·+ αnvn)
= α1T1(v1) + · · ·+ αnT1(vn) (using Theorem 5.1(c))
= α1T2(v1) + · · ·+ αnT2(vn) (using the hypothesis)
= T2(α1v1 + · · ·+ αnvn)
= T2(v). □
This theorem tells us that if T : V → W is linear and V is finite dimensional, then we need to know only what T
does to basis vectors in V . This determines T completely.
Example 5.3. Let T : R3 → R2 be linear such that
T (1, 0, 0) = (2, 3), T (0, 1, 0) = (−1, 4) and T (0, 0, 1) = (5,−3). Then
T (3,−4, 5) = 3T (1, 0, 0) + (−4)T (0, 1, 0) + 5T (0, 0, 1) = 3(2, 3) + (−4)(−1, 4) + 5(5,−3) = (35,−22).
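The computation in Example 5.3 amounts to a matrix–vector product: stack the images T (e1), T (e2), T (e3) as columns. A quick NumPy check (the matrix name is mine):

```python
import numpy as np

# Columns are T(e1), T(e2), T(e3) from Example 5.3.
T = np.array([[2, -1, 5],
              [3, 4, -3]])
x = np.array([3, -4, 5])
print(T @ x)   # [ 35 -22], matching the hand computation
```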
Exercise 5.2. Let T : R2 → R2 be a map such that T (1, 1) = (1,−1), T (0, 1) = (−1, 1) and T (2,−1) = (1, 0). Can T
be a linear transformation?
Suppose we take a basis {v1, . . . , vn} of V and arbitrarily chosen vectors w1, . . . , wn ∈ W . Does there exist a linear
map T : V →W such that T (vi) = wi?
Theorem 5.3. Let V be a finite dimensional vector space with basis B = {v1, . . . , vn}. Let W be a vector space
containing the n vectors w1, . . . , wn. Then, there exists a unique linear map T : V → W such that T (vi) = wi for
i = 1, . . . , n.
Proof. We need to construct a map from V to W and prove that this map is linear and is unique.
Let v ∈ V ⇒ v = α1v1 + · · ·+ αnvn for some α1, . . . , αn ∈ F.
Define T (v) = α1w1 + · · ·+αnwn. Since, {v1, . . . , vn} is a basis, given a vector v, the scalars α1, . . . , αn are unique.
Therefore, this map is well-defined.
Linearity: Let u, v ∈ V . Then u = α1v1 + · · ·+ αnvn and v = β1v1 + · · ·+ βnvn
⇒ u+ v = (α1 + β1)v1 + · · ·+ (αn + βn)vn.
Thus
T (u+ v) = (α1 + β1)w1 + · · ·+ (αn + βn)wn
= (α1w1 + · · ·+ αnwn) + (β1w1 + · · ·+ βnwn)
= T (u) + T (v).
Similarly,
T (αu) = T (αα1v1 + · · ·+ ααnvn)
= αα1w1 + · · ·+ ααnwn = α(α1w1 + · · ·+ αnwn)
= αT (u).
Therefore T is a linear transformation.
Uniqueness follows from the previous theorem. □
Caution: The vectors w1, . . . , wn appearing in the previous theorem need not be distinct, nor even linearly independent.
Example 5.4. Construct a linear map T : R2 →W , where W = {(x1, x2, x3) | x1 − x2 − x3 = 0}. Describe the map
completely.
Solution: Start with a basis {v1 = (1, 0), v2 = (0, 1)} of R2. Choose any two vectors in W , for example w1 = (1, 1, 0)
and w2 = (1, 0, 1). We want T (1, 0) = (1, 1, 0) and T (0, 1) = (1, 0, 1).
Then define T (x1, x2) = x1(1, 1, 0) + x2(1, 0, 1) = (x1 + x2, x1, x2).
Therefore, this is a linear map from R2 to W .
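A quick numerical check that every image of this T lies in W ; the function below just restates the formula from the solution:

```python
import numpy as np

def T(x1, x2):
    # T(1,0) = (1,1,0), T(0,1) = (1,0,1), extended by linearity.
    return np.array([x1 + x2, x1, x2])

# Every image satisfies the defining equation x1 - x2 - x3 = 0 of W:
for (a, b) in [(1.0, 0.0), (0.0, 1.0), (3.5, -2.0)]:
    w = T(a, b)
    assert w[0] - w[1] - w[2] == 0
print("all images lie in W")
```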
Exercise 5.3. Find another linear map from R2 to W .
Let A be the 3× 3 matrix with rows (1,−1, 1), (0, 1, 2), (−2, 1, 0).
Consider the linear transformation T : R3 → R3 given by T (x) = Ax for every x ∈ R3.
Then Ae1 = (1, 0,−2); Ae2 = (−1, 1, 1) and Ae3 = (1, 2, 0).
Note that (1, 0,−2) = 1 · e1 + 0 · e2 + (−2) · e3.
Definition 5.2. Let V be a vector space and B = {v1, . . . , vn} be an ordered basis of V . Let v ∈ V. There exist
unique scalars α1, . . . , αn ∈ F such that v = α1v1 + · · · + αnvn. The matrix of v with respect to the basis B is
the column matrix [v]B := (α1, . . . , αn)t, written as a column. The column matrix [v]B is also called the co-ordinate vector of
v with respect to the ordered basis B of V.
Notice that the co-ordinate vector is well-defined, since the scalars α1, . . . , αn are uniquely determined from v and
the basis vectors in the ordered basis B for V.
Example 5.5. Let B = {(1,−1), (1, 0)}. Find [(0, 1)]B .
Solution: (0, 1) = −1(1,−1) + 1(1, 0) ⇒ [(0, 1)]B = (−1, 1)t.
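Finding [v]B is solving a linear system: the columns of the coefficient matrix are the basis vectors. A NumPy sketch for Example 5.5 (variable names are mine):

```python
import numpy as np

# Basis B = {(1,-1), (1,0)}; columns of M are the basis vectors.
M = np.array([[1.0, 1.0],
              [-1.0, 0.0]])
v = np.array([0.0, 1.0])

coords = np.linalg.solve(M, v)   # solves M @ coords = v
print(coords)                    # [-1.  1.], i.e. [v]_B = (-1, 1)^t
```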
Example 5.6. Let B = {1, 1 + t, 1 + t2} ⊂ P2. Is B a basis of P2? Find [1 + t+ t2]B .
Note here that the matrices would be different if we altered the positions of the basis vectors; i.e., the matrices w.r.t.
{1, 1 + t2, 1 + t} and {1, 1 + t, 1 + t2} are different.
That is, the position of vectors in a basis is important when we compute the matrix of a vector.
An ordered basis is a basis with the positions of the basis vectors fixed.
As bases of R3, {e1, e2, e3} and {e2, e1, e3} are same, but as ordered bases they are NOT THE SAME.
Let V and W be vector spaces over F and B1 = {v1, . . . , vn} and B2 = {w1, . . . , wm} be bases of V and W respectively.
Let T : V →W be a linear transformation.
If u ∈ V , then u = α1v1 + · · ·+ αnvn
⇒ T (u) = α1T (v1) + · · ·+ αnT (vn).
⇒ Matrix of T (u) can be obtained once we know the matrices of T (v1), . . . , T (vn).
Definition 5.3. Let V and W be vector spaces over F and B1 = {v1, . . . , vn} and B2 = {w1, . . . , wm} be ordered bases of V
and W respectively. Let T : V → W be a linear transformation. Let [T (vi)]B2 denote the matrix of T (vi) w.r.t. B2.
Then the matrix of T with respect to the ordered bases B1 and B2 is [T ]B1,B2 := [[T (v1)]B2 · · · [T (vn)]B2 ].
The definition says the following: Suppose
T (v1) = α11w1 + · · ·+ αm1wm.
T (v2) = α12w1 + · · ·+ αm2wm.
⋮
T (vn) = α1nw1 + · · ·+ αmnwm.
Then [T ]B1,B2 is the m× n matrix whose j-th column is [T (vj)]B2 :
[T ]B1,B2 =
α11 α12 · · · α1n
α21 α22 · · · α2n
 ⋮    ⋮         ⋮
αm1 αm2 · · · αmn
Caution: Mark which αij goes where.
Example 5.7. Let T : R2 → R3 be given by T (x1, x2) = (2x1 − x2, x1 + x2, x2 − x1), B1 = {e1, e2} and B2 = {e1, e2, e3}. Then
T (e1) = (2, 1,−1) = 2e1 + 1e2 + (−1)e3
T (e2) = (−1, 1, 1) = −1e1 + 1e2 + 1e3.
Therefore [T ]B1,B2 is the 3× 2 matrix with rows (2,−1), (1, 1), (−1, 1).
Note that if A = [T ]B1,B2 , then T (x1, x2)t = A (x1, x2)t.
Example 5.8. Let D : P3 → P2 be the map given by D(p) = p′.
1. Let A = {1, t, t2, t3} and B = {1, t, t2}. Then [D]A,B is the 3× 4 matrix with rows (0, 1, 0, 0), (0, 0, 2, 0), (0, 0, 0, 3).
2. Let B = {1, 1 + t, 1 + t2}. Then [D]A,B is the 3× 4 matrix with rows (0, 1,−2,−3), (0, 0, 2, 0), (0, 0, 0, 3).
Example 5.9. Let T : P2 → P3 be the map T (p(t)) =
∫0^t p(s) ds.
If A = {1, 1 + t, t+ t2} and B = {1, t, t+ t2, t2 + t3}, then [T ]A,B is the 4× 3 matrix with rows (0, 0, 0), (1, 1/2,−1/6), (0, 1/2, 1/6), (0, 0, 1/3).
How are the matrices of x, T and T (x) related?
Theorem 5.4. Let V, W be finite dimensional vector spaces. Let A = {v1, . . . , vn} be an ordered basis of V and
B = {w1, . . . , wm} be an ordered basis of W . Let T : V →W be a linear transformation. Then for every x ∈ V,
[T (x)]B = [T ]A,B [x]A.
i.e., every linear transformation can be realized as a matrix multiplication.
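Theorem 5.4 can be checked on Example 5.7, where both bases are standard, so [x]A = x and [T (x)]B = T (x). A short NumPy verification (names are mine):

```python
import numpy as np

def T(x1, x2):
    # The map from Example 5.7.
    return np.array([2*x1 - x2, x1 + x2, x2 - x1])

# [T] w.r.t. the standard bases, from Example 5.7.
M = np.array([[2, -1],
              [1, 1],
              [-1, 1]])

x = np.array([3, 4])
# [T(x)]_B = [T]_{A,B} [x]_A reduces here to T(x) = M x:
assert np.array_equal(M @ x, T(3, 4))
print(M @ x)   # [2 7 1]
```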
Theorem 5.5. Let V, W be finite dimensional vector spaces. Let A = {v1, . . . , vn} be an ordered basis of V and
B = {w1, . . . , wm} be an ordered basis of W . If T1 and T2 are linear transformations from V to W and α ∈ F, then
1. [T1 + T2]A,B = [T1]A,B + [T2]A,B.
2. [αT1]A,B = α[T1]A,B.
Theorem 5.6. Let V, W, Z be finite dimensional vector spaces with A, B, C their respective ordered bases. Let
T1 : V →W and T2 : W → Z be linear transformations. Then
[T2 ◦ T1]A,C = [T2]B,C [T1]A,B .
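Theorem 5.6 can be checked numerically on the differentiation and integration maps, taking the standard monomial bases everywhere (so the matrix of the integration map below is recomputed for those bases and differs from the one in Example 5.9). Differentiating after integrating returns the original polynomial, so the product of the matrices should be the identity:

```python
import numpy as np

# [D]: derivative P3 -> P2 and [S]: integration P2 -> P3 (p -> int_0^t p),
# both w.r.t. the standard monomial bases.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)
S = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 1/2, 0],
              [0, 0, 1/3]])

# [D o S] = [D][S]: differentiating after integrating gives back p.
print(D @ S)   # the 3x3 identity matrix
```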
Problems for Section 5
1. In each of the following determine whether T : R2 → R2 is a linear transformation:
(a) T (α, β) = (1, β) (b) T (α, β) = (α, α2)
(c) T (α, β) = (sinα, 0) (d) T (α, β) = (|α|, β)
(e) T (α, β) = (α+ 1, β) (f) T (α, β) = (2α+ β, α+ β2).
2. Let T : R2 → R2 be a linear transformation with T (1, 0) = (1, 4) and T (1, 1) = (2, 5). What is T (2, 3)? Is T
one-one?
3. In each of the following, determine whether a linear transformation T exists:
(a) T : R2 → R3 with T (1, 1) = (1, 0, 2) and T (2, 3) = (1,−1, 4).
(b) T : R3 → R2 with T (1, 0, 3) = (1, 1) and T (−2, 0,−6) = (2, 1).
(c) T : R3 → R2 with T (1, 1, 0) = (0, 0), T (0, 1, 1) = (1, 1) and T (1, 0, 1) = (1, 0).
(d) T : C1[0, 1]→ R with T (u) = ∫0^1 (u(t))2 dt.
(e) T : C1[0, 1]→ R2 with T (u) = (∫0^1 u(t) dt, u′(0)).
(f) T : P3 → R with T (a+ bt2) = 0 for any a, b ∈ R.
(g) an onto map T : Pn(R)→ R with T (p(x)) = p(α), for a fixed α ∈ R.
4. Let T1 : C1[0, 1]→ C[0, 1] and T2 : C[0, 1]→ R be defined by T1(u) = u′ and T2(v) = ∫0^1 v(t) dt. Find, if possible,
T1T2 and T2T1. Are they linear transformations?
5. Let U, V be vector spaces with {u1, . . . , un} a basis for U. Let v1, . . . , vn ∈ V. Show that
(a) There exists a unique linear transformation T : U → V with T (ui) = vi for i = 1, 2, . . . , n.
(b) This T is one-one iff {v1, . . . , vn} is linearly independent.
(c) This T is onto iff span{v1, . . . , vn} = V.
6. Construct an isomorphism between Fn+1 and Pn(F).
7. Let V1 and V2 be finite dimensional vector spaces and T : V1 → V2 be a linear transformation. Give reasons for
the following:
(a) rankT ≤ dimV1.
(b) T is onto implies dimV2 ≤ dimV1.
(c) T is one-one implies dimV1 ≤ dimV2.
(d) dimV1 > dimV2 implies T is not one-one.
(e) dimV1 < dimV2 implies T is not onto.
(f) Suppose dimV1 = dimV2. Then, T is one-one if and only T is onto.
8. Let T : R3 → R3 be defined by T (α, β, γ) = (β + γ, γ + α, α+ β). Find [T ]E1,E2 where
(a) E1 = {(1, 0, 0), (0, 0, 1), (0, 1, 0)}, E2 = {(0, 0, 1), (1, 0, 0), (0, 1, 0)}.
(b) E1 = {(1, 1,−1), (−1, 1, 1), (1,−1, 1)},E2 = {(−1, 1, 1), (1,−1, 1), (1, 1,−1)}.
9. Define T : P2(R)→ R by T (f) = f(2). Compute [T ] using the standard bases of the spaces.
10. Define T : R2 → R3 by T (a, b) = (a− b, a, 2b+ b). Let B be the standard basis for R2, C = {(1, 2), (2, 3)}, and D = {(1, 1, 0), (0, 1, 1), (2, 2, 3)}. Compute [T ]B,D and [T ]C,D.
11. Let T : P2 → P3 be defined by T (a+ bt+ ct2) = at+ bt2 + ct3. If E1 = {1 + t, 1− t, t2} and E2 = {1, 1 + t, 1 + t+ t2, t3}, then what is [T ]E1,E2?
12. Let E1 = {[ 1 0 ; 0 0 ], [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 0 0 ; 0 1 ]}, E2 = {1, t, t2} and E3 = {1}.
(a) Define T : R2×2 → R2×2 by T (A) = At. Compute [T ]E1,E1 .
(b) Define T : P2(R)→ R2×2 by T (f) = [ f ′(0) 2f(1) ; 0 f ′(3) ]. Compute [T ]E2,E1 .
(c) Define T : R2×2 → R by T (A) = tr(A). Compute [T ]E1,E3.
13. Given bases E1 = {1 + t, 1− t, t2} and E2 = {1, 1 + t, 1 + t+ t2, t3} for P2(R) and P3(R), respectively, and the
linear transformation S : P2(R)→ P3(R) with S(p(t)) = t p(t), find the matrix [S ]E1,E2.
14. Let E1 = {u1, . . . , un} and E2 = {v1, . . . , vm} be bases of V1 and V2, respectively. Show the following:
(a) If T ∈ L(V1, V2), then T is one-one iff the columns of [T ]E1,E2 are linearly independent.
(b) If T ∈ L(V1, V2), then T is not one-one iff det[T ]E1,E2 = 0.
(c) If {g1, . . . , gm} is the ordered dual basis of L(V2,F) with respect to the basis E2 of V2, then for every
T ∈ L(V1, V2), [T ]E1,E2 = (gi(Tuj)).
(d) If T1, T2, T ∈ L(V1, V2) and α ∈ F, then [T1 + T2]E1,E2 = [T1]E1,E2 + [T2]E1,E2 and [αT ]E1,E2 = α[T ]E1,E2 .
(e) Suppose {Mij : i = 1, . . . ,m; j = 1, . . . , n} is a basis of Fm×n. Let Tij ∈ L(V1, V2) be the linear transformation with [Tij ]E1,E2 = Mij . Then {Tij : i = 1, . . . ,m; j = 1, . . . , n} is a basis of L(V1, V2).
6 Rank and Nullity
Definition 6.1. Let V and W be vector spaces and T : V →W be a linear transformation. Then
Kernel of T = Null Space of T = N(T ) := {v ∈ V : T (v) = 0}.
Range of T = Range Space of T = R(T ) := {w ∈W : w = T (v) for some v ∈ V }.
Note that N(T ) ≠ ∅, since T (0) = 0. For the same reason R(T ) ≠ ∅.
Let u, v ∈ N(T ) and α ∈ F. Then T (u+ αv) = T (u) + αT (v) = 0 ⇒ u+ αv ∈ N(T )
⇒ N(T ) is a subspace of V . Thus the terminology “null space”.
Similarly, if u, v ∈ R(T ) and α ∈ F, then there exist x, y ∈ V such that T (x) = u and T (y) = v. Therefore,
T (x+ αy) = T (x) + αT (y) = u+ αv ⇒ u+ αv ∈ R(T ) ⇒ R(T ) is a subspace of W .
Thus the terminology “Range space”.
Theorem 6.1. If T : V →W is a linear transformation, then N(T ) is a subspace of V and R(T ) is a subspace of W.
Example 6.1. Let T : R3 → R2 be defined by T (x1, x2, x3) = (x1 + x2, x1 − x3). Find a basis for N(T ) and a basis
for R(T ).
Solution: T (x1, x2, x3) = (0, 0) ⇒ x1 = −x2 and x1 = x3. Therefore
N(T ) = {(x1, x2, x3) : x1 = −x2 = x3} = span{(1,−1, 1)}.
T (0, 1, 0) = (1, 0) and T (0, 0,−1) = (0, 1). Since R(T ) is a subspace containing (1, 0) and (0, 1), R(T ) = R2.
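Numerically, N(T ) can be read off from the singular value decomposition of the matrix of T : the right-singular vectors with zero singular value span the null space. A NumPy sketch for Example 6.1 (the tolerance is an arbitrary choice of mine):

```python
import numpy as np

# Matrix of T from Example 6.1 w.r.t. standard bases.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, -1.0]])

# Right-singular vectors with (numerically) zero singular value span N(T).
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]           # here: one vector, proportional to (1, -1, 1)
print(rank, null_basis.shape)    # 2 (1, 3)
assert np.allclose(A @ null_basis.T, 0)
```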
Example 6.2. Let T : R2 → R3 be defined by T (x1, x2) = (x1 + x2, x1 − x2, 0). Find R(T ) and N(T ).
Definition 6.2. Let T : V →W be a linear transformation. Then
Nullity of T= nullT := dimN(T ) and Rank of T= rankT := dimR(T ).
Recall: A function f : X → Y is called one-one if f(x1) = f(x2) ⇒ x1 = x2. f is called onto if for every y ∈ Y , there exists
x ∈ X such that f(x) = y. If f is one-one and onto, then it is called bijective. Moreover, f is bijective if and only if
there exists g : Y → X such that g ◦ f = idX and f ◦ g = idY . The map g is usually denoted by f−1.
Theorem 6.2. A linear transformation T : V →W is one-one if and only if N(T ) = {0}.
Proof. Assume that T is one-one. Let T (x) = 0. Then T (x) = T (0) ⇒ x = 0 ⇒ N(T ) = {0}.
Conversely, suppose N(T ) = {0}. Let T (x1) = T (x2).
Then T (x1)− T (x2) = 0 ⇒ T (x1 − x2) = 0 ⇒ x1 − x2 ∈ N(T ).
Since N(T ) = {0}, x1 = x2 ⇒ T is one-one. □
Theorem 6.3. Let T : V →W be a bijective linear transformation. Then T−1 : W → V is linear.
Proof. Let w1, w2 ∈ W . Then there exist v1, v2 ∈ V such that T (v1) = w1 and T (v2) = w2. Therefore T (v1 + v2) =
w1 + w2 ⇒ T−1(w1 + w2) = v1 + v2 = T−1(w1) + T−1(w2). Similarly one can show that T−1(αw1) = αT−1(w1). □
Example 6.3. Let T : R2 → R2 be defined by T (x1, x2) = (x1 − x2, 2x1 + x2). Then T is an isomorphism.
Solution: Suppose T (x1, x2) = (0, 0). Then x1 = x2 and 2x1 = −x2. This implies that x1 = x2 = 0. That is,
N(T ) = {0}. Therefore T is one-one.
T (1, 0) = (1, 2) and T (0, 1) = (−1, 1) ⇒ (1, 2), (−1, 1) ∈ R(T ). Since they are linearly independent R(T ) = R2.
Therefore T is an isomorphism.
Example 6.4. Let {u1, . . . , un} be an ordered basis of V over F.
Define T : V → Fn by T (α1u1 + · · ·+ αnun) = (α1, . . . , αn). Prove that T is an isomorphism.
Let V and W be finite dimensional vector spaces and T : V → W a bijective linear transformation. Let A and B
denote ordered bases of V and W respectively.
Then there exists a linear transformation T−1 such that T−1 ◦ T = idV and T ◦ T−1 = idW . From the correspondence between linear transformations and matrices, it follows that the matrix [T ]A,B is invertible and
[T−1]B,A = ([T ]A,B)−1.
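A small check of [T−1]B,A = ([T ]A,B)−1 on Example 6.3, with standard bases (matrix names are mine):

```python
import numpy as np

# [T] for T(x1, x2) = (x1 - x2, 2x1 + x2) from Example 6.3 (standard bases).
M = np.array([[1.0, -1.0],
              [2.0, 1.0]])
M_inv = np.linalg.inv(M)

# T^{-1} undoes T: the two matrices multiply to the identity.
assert np.allclose(M_inv @ M, np.eye(2))

w = np.array([3.0, 0.0])    # w = T(1, -2)
print(M_inv @ w)            # [ 1. -2.], recovering the pre-image
```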
Let T : V →W be a linear transformation and {v1, . . . , vn} be a basis of V . If w ∈ R(T ), then there exists v ∈ V such that T (v) = w, i.e., w = T (α1v1 + · · ·+ αnvn) for some αi’s in F
⇒ w ∈ span{T (v1), . . . , T (vn)}
⇒ R(T ) = span{T (v1), . . . , T (vn)}.
Can you find an onto map T : R→ R2?
Can you find a one-one map T : R2 → R?
What are all linear maps from R to R?
Exercise 6.1. Let {v1, . . . , vn} be a basis of V and T : V →W be a linear transformation. Prove that
T is one-one if and only if {T (v1), . . . , T (vn)} is linearly independent in W .
Exercise 6.2. T : V →W is an isomorphism if and only if for every basis {v1, . . . , vn} of V ,
the set {T (v1), . . . , T (vn)} is a basis of W .
Theorem 6.4. Let V be a finite dimensional vector space. Let T : V →W be a linear transformation. Then
rankT + nullT = dimV.
Proof. Let dimV = n. If n = 0, then V = {0}, and then there is nothing to prove. Suppose n ≥ 1.
Let {v1, . . . , vk} be a basis of N(T ), for some k ∈ {0, 1, . . . , n}. Note that if T is one-one, then k = 0 which means
the set is empty. Also, if T is the zero map, then N(T ) = V and R(T ) = {0}. This implies that k = n and hence the
theorem is true.
Assume that T is not the zero map. Extend the basis of N(T ) to a basis of V , say {v1, . . . , vk, vk+1, . . . , vn}.
If we prove that {T (vk+1), . . . , T (vn)} is a basis of R(T ), then the theorem is proved.
Let w ∈ R(T ). Then there exists a v ∈ V such that T (v) = w. Let v = α1v1 + · · ·+ αnvn. Then
T (v) = α1T (v1) + · · ·+ αkT (vk) + αk+1T (vk+1) + · · ·+ αnT (vn)
= αk+1T (vk+1) + · · ·+ αnT (vn)
∈ span{T (vk+1), . . . , T (vn)}.
Therefore R(T ) = span{T (vk+1), . . . , T (vn)}.
Suppose βk+1T (vk+1) + · · ·+ βnT (vn) = 0.
⇒ T (βk+1vk+1 + · · ·+ βnvn) = 0.
⇒ βk+1vk+1 + · · ·+ βnvn ∈ N(T ), so βk+1vk+1 + · · ·+ βnvn = β1v1 + · · ·+ βkvk for some β1, . . . , βk ∈ F.
⇒ β1v1 + · · ·+ βkvk − βk+1vk+1 − · · · − βnvn = 0.
⇒ βi = 0 for all i = 1, . . . , n
⇒ {T (vk+1), . . . , T (vn)} is linearly independent.
⇒ dimR(T ) = n− k = dimV − dimN(T ). □
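For maps given by matrices, the theorem says rank + nullity = number of columns. A quick NumPy check on a hypothetical matrix with one dependent row:

```python
import numpy as np

# A hypothetical 3x5 matrix viewed as a linear map R^5 -> R^3.
A = np.array([[1, 2, 0, -1, 3],
              [0, 1, 1, 2, 0],
              [1, 3, 1, 1, 3]], dtype=float)   # row 3 = row 1 + row 2

n = A.shape[1]                       # dimension of the domain
rank = np.linalg.matrix_rank(A)      # dim R(T) = 2
nullity = n - rank                   # dim N(T) = 3, by the theorem
print(rank, nullity, rank + nullity == n)   # 2 3 True
```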
Observation:
1. There does not exist a one-one linear map T : R2 → R (more generally, from Rm to Rn for any m > n).
2. There does not exist an onto linear map T : R→ R2 (more generally, from Rm to Rn for any m < n).
3. If dimV = dimW and T : V →W is linear, then T is one-one if and only if T is onto.
Example 6.5.
1. T : Pn → Rn+1 defined by T (a0 + a1t+ · · ·+ antn) = (a0, . . . , an) is an isomorphism.
2. T : Rn → Rn defined by T (a1, . . . , an) = (a1, a1 + a2, . . . , a1 + · · ·+ an) is an isomorphism.
Recall that two spaces are isomorphic to each other if there exists an isomorphism between them. We have seen
that if T : V →W is an isomorphism, then dimV = dimW . Is the converse true? If dimV = dimW , then are V and
W isomorphic?
Let B1 = {v1, . . . , vn} be a basis of V and B2 = {w1, . . . , wn} be a basis of W .
Define T : V →W by T (α1v1 + · · ·+ αnvn) = α1w1 + · · ·+ αnwn.
Since B1 and B2 span V and W respectively, T is onto. Since B1 and B2 are linearly independent, T is one-one.
Therefore T is an isomorphism.
Theorem 6.5. V and W are isomorphic if and only if dimV = dimW .
If dim(V ) = n, then there exists an isomorphism from V to Fn. Given an ordered basis B = {v1, v2, . . . , vn} of V, each
v ∈ V is associated with its co-ordinate vector in Fn. Clue: v = α1v1 + α2v2 + · · ·+ αnvn ⇒ [v]B = (α1, . . . , αn)t.
There can be many isomorphisms. But we have one special isomorphism: it maps v to its co-ordinate vector. The
map φv1,...,vn : V → Fn with φv1,...,vn(v) = [v]B is called the canonical basis isomorphism between V and Fn.
Let B = {v1, . . . , vn} be an ordered basis of V and C = {w1, . . . , wm} an ordered basis of W. Take the standard bases
for Fn and Fm. The canonical basis isomorphisms are:
φv1,...,vn from V to Fn and φw1,...,wm from W to Fm.
Let T : V →W be a linear transformation and [T ]B,C be its matrix representation. Then
V −−−−T−−−−→ W
φv1,...,vn ↓≅          ≅↓ φw1,...,wm
Fn −−−[T ]B,C−−−→ Fm
This means T = (φw1,...,wm)−1 ◦ [T ]B,C ◦ φv1,...,vn .
The co-ordinate vectors are related to the so-called co-ordinate functionals. We use some more terminology:
Definition 6.3. The set of all linear transformations from V1 to V2 is denoted by L(V1, V2).
A linear transformation T : V → V is called a linear operator on V.
A linear transformation T : V → F is called a linear functional. The set of all linear functionals on a vector
space V is called the dual space of V and is denoted by V ′.
Example 6.6.
1. f1 : Rn → R defined by f1(x1, . . . , xn) = x1 is a linear functional.
2. fj : Rn → R defined by fj(x1, . . . , xn) = xj is a linear functional.
3. Let (v1, . . . , vn) be an ordered basis for a vector space V. Define fj : V → F by fj(v) = αj when v = α1v1 + · · ·+ αnvn. Verify that fj is a linear functional.
These functionals are called the co-ordinate functionals with respect to the given basis.
4. Let τ ∈ [a, b] be fixed. Let fτ : C[a, b]→ F be defined by fτ (x) = x(τ), for x ∈ C[a, b]. Then fτ is a linear functional.
5. Fix points τ1, . . . , τn in [a, b], and scalars α1, . . . , αn in F. Let f : C[a, b]→ F be defined by f(x) = α1x(τ1) + · · ·+ αnx(τn) for x ∈ C[a, b]. Then f is a linear functional.
6. Define f : C[a, b]→ F by f(x) = ∫a^b x(t) dt, for x ∈ C[a, b]. Then f is a linear functional.
Theorem 6.6. Let V and W be vector spaces over F. Then L(V,W ) is a vector space with usual addition of functions
and multiplication of a function with a scalar.
Theorem 6.7. If dim V = n, dim W = m, then dim L(V,W ) = mn. Therefore, dim V ′ = n.
Proof. Due to the canonical basis isomorphisms (remember the diagram?), L(V,W ) is isomorphic to Fm×n.
Alternative proof: Fix two ordered bases: {v1, . . . , vn} for V and {w1, . . . , wm} for W. Define linear transformations
Tij : V →W for i = 1, . . . , n, j = 1, . . . ,m by Tij(vi) = wj , Tij(any other vk) = 0.
Now, show that the set of all these Tij is a basis of L(V,W ). □
In particular, a basis for V ′ now looks like the set of functionals
fi : V → F, where fi(vi) = 1, fi(any other vk) = 0.
This basis is called the dual basis of V ′ with respect to the basis {v1, . . . , vn} of V.
What does the matrix of a linear functional look like?
And what do the co-ordinate vectors of linear functionals look like in the dual basis?
Problems for Section 6
1. In the following, prove that T is a linear transformation. Determine rankT and nullT by finding bases for R(T )
and N(T ).
(a) T : R3 → R2; T (a1, a2, a3) = (a1 − a2, 2a3).
(b) T : R2 → R3; T (a1, a2) = (a1 + a2, 0, 2a1 − a2).
(c) T : R3×3 → R2×2; T ((aij)) = [ 2a11 − a12   a13 + 2a12 ; 0   0 ].
(d) T : P2(R)→ P3(R); T (f(x)) = xf(x) + f ′(x).
(e) T : Rn×n → R; T (A) = tr(A).
2. Give an example for each of the following:
(a) A linear transformation T : R2 → R2 with N(T ) = R(T ).
(b) Distinct linear transformations T,U with N(T ) = N(U) and R(T ) = R(U).
3. Let V be a non-trivial real vector space. Let T : V → R be a non-zero linear map. Prove or disprove: T is onto
iff nullT = dimV − 1.
4. Let U, V be finite dimensional real vector spaces, and T : U → V linear. Prove or disprove:
(a) rankT ≤ dimU.
(b) If T is onto, then dimV ≤ dimU.
(c) If T is one-one, then dimU ≤ dimV.
(d) If dimU = dimV, then “T is one-one iff T is onto”.
5. Let A ∈ Rm×n. Let rankA, rowrank A and colrank A denote, respectively, the rank of A when considered as
a linear transformation, the maximum number of linearly independent rows of A, and the maximum number of
linearly independent columns of A. In A, we say that a row is superfluous if it is a linear combination of other
rows. Similarly, we call a column to be superfluous if it is a linear combination of other columns. Prove the
following:
(a) Omission of a superfluous column does not alter rowrank A.
(b) Omission of a superfluous row does not alter colrank A.
(c) rankA = rowrank A = colrank A.
6. Let V be the vector space of real valued functions on R which have derivatives of all orders. Let T : V → V be
the differential operator: Tx = x′. What is N(T )?
7. Let T : V → V be a linear operator such that T 2 = T . Let I denote the identity operator. Prove that
R(T ) = N(I − T ) and N(T ) = R(I − T ).
8. Find bases for the null space N(T ) and the range space R(T ) of the linear transformation T in each of the
following:
(a) T : R2 → R2 defined by T (x1, x2) = (x1 − x2, 2x2),
(b) T : R2 → R3 defined by T (x1, x2) = (x1 + x2, 0, 2x1 − x2),
(c) T : Rn×n → R defined by T (A) = tr(A).
(Recall: tr(A), the trace of a square matrix A, is the sum of its diagonal elements.)
9. Let T : V1 → V2 be a linear transformation, where both V1 and V2 are finite dimensional. Prove the following:
(a) There exists a subspace V0 of V1 such that V1 = N(T ) + V0 and V0 ∩N(T ) = {0}.
(b) If V0 is as in (a) and {v1, . . . , vk} is a basis of V0, then R(T ) = span{Tv1, . . . , T vk}.
10. Let the linear transformation A : P2(R) → P3(R) be defined by A(p(t)) = t p(t) + dp(t)/dt. Find N(A) and R(A).
11. Let B : V →W be a linear transformation, where V and W are real vector spaces with dimW < dimV < 2013.
Show that B cannot be one-one.
7 Eigenvalues & Eigenvectors
Let A = [0 1; 1 0]. Here, A : R2 → R2. It transforms straight lines to straight lines or points.
Get me a straight line which is transformed to itself.
A(x, y)t = [0 1; 1 0] (x, y)t = (y, x)t.
Thus, the line {(x, x) : x ∈ R} never moves. So also the line {(x,−x) : x ∈ R}.
Observe: A(x, x)t = 1 · (x, x)t and A(x, −x)t = (−1) · (x, −x)t.
Definition 7.1. Let V be a vector space over F, and T : V → V be a linear operator. A scalar λ ∈ F is called an
eigenvalue of T if there exists a non-zero vector v ∈ V such that Tv = λv. Such a vector v is called an eigenvector
of T for (or, associated with) the eigenvalue λ.
Example 7.1.
1. Consider the map T : R3 → R3 by T (a, b, c) = (a, a+ b, a+ b+ c). It has an eigenvector (0, 0, 1) associated with
the eigenvalue 1. Is (0, 0, c) also an eigenvector associated with the same eigenvalue 1?
2. Let T : P → P be defined by T (p(t)) = tp(t). Then, T has no eigenvector and no eigenvalue.
3. Let T : P[0, 1] → P[0, 1] be defined by T (p(t)) = p′(t). Then, since derivative of a constant polynomial is 0,
which equals 0 times the polynomial, the eigenvectors are any non-zero constant polynomial and the associated
eigenvalue is 0.
Theorem 7.1. A vector v ≠ 0 is an eigenvector of T : V → V for the eigenvalue λ ∈ F iff v ∈ N(T − λI). Thus λ is
an eigenvalue of T iff the operator T − λI is not one-one.
Proof. Tv = λv iff (T − λI)v = 0 iff v ∈ N(T − λI). □
Theorem 7.2. Let V over F be finite dimensional. Let A be the matrix for an operator T : V → V with respect to
some basis. Then λ ∈ F is an eigenvalue of T iff det(A− λI) = 0.
Proof. Let dimV = n. In the same basis, the matrix of T − λI is A− λI. Now, T − λI is not one-one iff A− λI is not
one-one iff rank(A− λI) < n iff det(A− λI) = 0.
To see the last ‘iff’, suppose rank(A − λI) < n. Then some column of A − λI is a linear combination of the other columns,
so det(A − λI) = 0. Conversely, suppose rank(A − λI) = n. Then A − λI is invertible. Now, 1 = det(I) =
det(A − λI) det((A − λI)−1) shows that det(A − λI) ≠ 0. □
Notice that if a different basis is chosen, then the matrix of T with respect to such a basis can be written as P−1AP
for some invertible matrix P. In that case,
det(P−1AP − λI) = det(P−1(A− λI)P ) = det(A− λI).
This says that the polynomial det(A− λI) is independent of the particular matrix representation A of T.
Definition 7.2. Let T be a linear operator on a finite dimensional vector space V. Let A be a matrix of T with respect
to some basis of V. The polynomial det(A− tI) in the variable t is called the characteristic polynomial of T (also
of the matrix A).
Example 7.2. For T : R3 → R3 with T (a, b, c) = (a, a + b, a + b + c), we have

det [1−λ 0 0; 1 1−λ 0; 1 1 1−λ] = 0 ⇒ (1 − λ)3 = 0 ⇒ λ = 1, 1, 1 are the eigenvalues of T.

Solving T (a, b, c) = 1 × (a, b, c), we get a + b = b and a + b + c = c, so a = b = 0. The eigenvectors are (0, 0, c) with c ≠ 0.
Thus, eigenvalues of a matrix A are simply the zeros of the characteristic polynomial of A.
Therefore, if A is a matrix with complex entries then eigenvalues of A exist as complex numbers.
When a matrix A has real entries and the underlying field is R, then it may not have any eigenvalue at all.
It is thus advantageous to consider real entries of a matrix as complex entries.
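As a quick numerical illustration (a sketch assuming Python with numpy, which is not part of these notes), the eigenvalues of the matrix from Example 7.2 come out as the zeros of det(A − λI):

```python
import numpy as np

# Matrix of T(a, b, c) = (a, a+b, a+b+c) with respect to the standard
# basis, as in Example 7.2; its characteristic polynomial is (1 - lambda)^3.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

eigenvalues = np.linalg.eigvals(A)  # zeros of det(A - lambda I)

# A repeated eigenvalue is computed only approximately, hence the tolerance.
assert np.allclose(eigenvalues, 1.0, atol=1e-4)
```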
Example 7.3.
1. For T : R2 → R2 with T (a, b) = (b, −a), we have det [0−λ 1; −1 0−λ] = λ2 + 1 = 0. It has no real roots.
Thus the operator T has no eigenvalues.
2. For T : C2 → C2 with T (a, b) = (b,−a), we have λ2+1 = 0⇒ λ = i, −i, the eigenvalues of T. The corresponding
eigenvectors are obtained by solving
T (a, b) = i(a, b) and T (a, b) = −i(a, b).
For λ = i, we have b = ia, −a = ib. Thus, (a, ia) is an eigenvector for a ≠ 0.
For eigenvalue −i, the eigenvectors are (a, −ia) for a ≠ 0.
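The same computation can be confirmed numerically over C (a sketch assuming numpy):

```python
import numpy as np

# Matrix of T(a, b) = (b, -a) from Example 7.3 in the standard basis.
# Over R it has no eigenvalues; over C they are i and -i.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
lam = np.linalg.eigvals(A)          # returned as complex numbers

assert np.allclose(sorted(lam, key=lambda z: z.imag), [-1j, 1j])
```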
Recollect: Eigenvalues of A are the zeros of det(A− λI), which are in the underlying field.
Theorem 7.3. A and At have the same eigenvalues.
Proof. det(At − λI) = det((A − λI)t) = det(A − λI). □
Theorem 7.4. Similar matrices have the same eigenvalues.
Proof. det(P−1AP − λI) = det(P−1(A − λI)P ) = det(P−1) det(A − λI) det(P ) = det(A − λI). □
Theorem 7.5. If A is diagonal or upper triangular or lower triangular, then its diagonal elements are precisely its
eigenvalues.
Proof. det(A − λI) = (a11 − λ) · · · (ann − λ). □
Theorem 7.6. det(A) equals the product, and tr(A) the sum, of all eigenvalues.

Proof. Let λ1, . . . , λn be the eigenvalues of A. Now, det(A − λI) = (λ1 − λ) · · · (λn − λ). Put λ = 0 to get det(A) = λ1 · · · λn.
For the trace, expand det(A − λI) and equate the coefficients of λn−1. How?
Expanding along the first row, the coefficient of λn−1 in det(A − λI) equals the coefficient of λn−1 in (a11 − λ)(a22 − λ) · · · (ann − λ), which is (−1)n−1(a11 + · · · + ann) = (−1)n−1 tr(A).
But the coefficient of λn−1 in (λ1 − λ) · · · (λn − λ) is (−1)n−1 ∑ λi. Hence tr(A) = ∑ λi. □
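Theorem 7.6 is easy to check numerically; a small sketch (assuming numpy) on an arbitrary 2 × 2 matrix:

```python
import numpy as np

# det(A) should equal the product of the eigenvalues of A,
# and tr(A) their sum (Theorem 7.6).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam = np.linalg.eigvals(A)

assert np.isclose(np.prod(lam), np.linalg.det(A))
assert np.isclose(np.sum(lam), np.trace(A))
```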
Theorem 7.7. (Cayley-Hamilton). Any square matrix satisfies its characteristic polynomial.
Theorem 7.8. Eigenvalues of a real symmetric matrix or of a Hermitian matrix are real. Eigenvalues of a real skew-symmetric matrix or of a skew-Hermitian matrix are purely imaginary or zero.

Proof. Let A be Hermitian, i.e., A∗ = A, where A∗ denotes the conjugate transpose of A. Let λ ∈ C be an eigenvalue of A with an eigenvector v ∈ Cn.
Now, Av = λv ⇒ v∗Av = λ v∗v. But (v∗Av)∗ = v∗A∗v = v∗Av and (v∗v)∗ = v∗v, so both scalars are real.
Thus λ = v∗Av/v∗v is real.
When A is skew-Hermitian, (v∗Av)∗ = v∗A∗v = −v∗Av, so v∗Av is purely imaginary.
Then λ is purely imaginary or zero. □
Theorem 7.9. Eigenvectors associated with distinct eigenvalues of an n× n matrix are linearly independent.
Proof. Let λ1, . . . , λn be the distinct eigenvalues of A, and let v1, . . . , vn be corresponding eigenvectors.
We use induction on k ∈ {1, . . . , n}. For k = 1, since v1 ≠ 0, {v1} is linearly independent.
Induction Hypothesis: for k = m, suppose {v1, . . . , vm} is linearly independent.
Now, for k = m + 1, suppose
α1v1 + α2v2 + · · · + αmvm + αm+1vm+1 = 0. (∗)
Applying A to (∗) and using Avi = λivi gives
α1λ1v1 + α2λ2v2 + · · · + αmλmvm + αm+1λm+1vm+1 = 0.
Now, multiply (∗) by λm+1 and subtract from the last equation to get
α1(λ1 − λm+1)v1 + · · · + αm(λm − λm+1)vm = 0.
By the Induction Hypothesis, αi(λi − λm+1) = 0 for each i. Since the λi are distinct, λi − λm+1 ≠ 0, so αi = 0 for i = 1, 2, . . . , m.
Then, from (∗), αm+1vm+1 = 0, so αm+1 = 0 as well. Hence all αi = 0. □
Suppose an n × n matrix A has n distinct eigenvalues. Then the corresponding eigenvectors form a basis for Fn.
In this basis of eigenvectors, the linear transformation that A represents has a matrix representation, say B. Now, B is a diagonal matrix. Reason?
Suppose E = {v1, . . . , vj , . . . , vn}. Look at A as a linear transformation. Since vj is an eigenvector of A with eigenvalue λj,
Av1 = λ1v1, . . . , Avj = λjvj , . . . , Avn = λnvn.
Therefore, B = [A]E,E = diag(λ1, . . . , λn). Note that B = P−1AP, where the columns of P are v1, . . . , vn.
So, we have an alternative proof:
Form the matrix P by taking the eigenvectors v1, . . . , vn as its columns. Since these are linearly independent, P is invertible. See that P−1AP = B:
Be1 = P−1APe1 = P−1Av1 = λ1P−1v1 = λ1e1. Similarly, Bej = λjej for j = 1, 2, . . . , n. This means B is a diagonal matrix with diagonal entries λ1, . . . , λn.
We will discuss diagonalizable matrices later.
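The alternative proof above can be replayed numerically (a sketch assuming numpy): place the eigenvectors as columns of P and check that P−1AP is diagonal.

```python
import numpy as np

# A = [0 1; 1 0] has distinct eigenvalues 1 and -1 (see the start of this
# section), so its eigenvectors form a basis of R^2.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
lam, P = np.linalg.eig(A)            # columns of P are eigenvectors
B = np.linalg.inv(P) @ A @ P         # matrix of A in the eigenvector basis

assert np.allclose(B, np.diag(lam))
assert np.allclose(sorted(lam), [-1.0, 1.0])
```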
Problems for Section 7
1. Let A ∈ Fn×n. Prove that rankA < n iff det(A) = 0.
2. Find the characteristic equation, the eigenvalues and the associated eigenvectors for the matrices given below.
(a) [3 0; 8 −1]   (b) [3 2; −1 0]   (c) [−2 −1; 5 2]   (d) [−2 0 3; −2 3 0; 0 0 5]   (e) [0 1 0; 0 0 1; 4 −17 8].
3. Find the eigenvalues and associated eigenvectors of the differentiation operator d/dt : P3 → P3.
4. Prove: The eigenvalues of a triangular matrix (upper or lower) are the entries on the diagonal.
5. Can any non-zero vector in any non-trivial vector space be an eigenvector of some linear transformation?
6. Given a scalar λ, can any non-zero vector in any non-trivial vector space be an eigenvector associated with the
eigenvalue λ of some linear transformation?
7. Prove: if T is invertible and has eigenvalues λ1, . . . , λn, then T−1 has eigenvalues 1/λ1, ..., 1/λn.
8. Let T : V → V be a linear transformation. Prove the following:
(a) If T is a bijection and 0 6= λ ∈ F, then λ is an eigenvalue of T if and only if 1/λ is an eigenvalue of T−1.
(b) If λ is an eigenvalue of T then λk is an eigenvalue of T k.
(c) If λ is an eigenvalue of T and α ∈ F, then λ+ α is an eigenvalue of T + αI.
(d) If p(t) = a0 + a1t + · · · + ak tk for some a0, a1, . . . , ak in F, and if λ is an eigenvalue of T, then p(λ) is an
eigenvalue of p(T ) := a0I + a1T + · · · + ak Tk.
9. Let T1 and T2 be linear operators on V , λ is an eigenvalue of T1 and µ is an eigenvalue of T2. Is it necessary
that λµ is an eigenvalue of T1T2? Why? What is wrong with the following statement?
λµ an eigenvalue of T1T2 because, if T1x = λx and T2x = µx, then
T1T2x = T1(µx) = µT1x = µλx = λµx.
10. Let A be an n× n matrix and α be a scalar such that each row (or each column) sums to α. Show that α is an
eigenvalue of A.
8 Inner Products
To study geometrical problems, in which lengths and angles play a role, we need additional structures in a vector
space. For example, in R2 and R3, we have scalar product of vectors.
In R2, we have the length ‖x‖ = √(x · x) and cos(angle(x, y)) = (x · y)/(‖x‖ ‖y‖).
Let V be a vector space over F, which is either R or C. We define inner product in a vector space V by accepting
some of the fundamental properties of the scalar product in R2 or R3.
Definition 8.1. An inner product on a vector space V is a map (x, y) ↦ 〈x, y〉 which associates to each pair of vectors x, y in V a scalar 〈x, y〉 satisfying
(a) 〈x, x〉 ≥ 0 for each x ∈ V.
(b) 〈x, x〉 = 0 iff x = 0.
(c) 〈x + y, z〉 = 〈x, z〉 + 〈y, z〉 for all x, y, z ∈ V.
(d) 〈αx, y〉 = α〈x, y〉 for each α ∈ F and for all x, y ∈ V.
(e) 〈y, x〉 = 〈x, y〉∗ for all x, y ∈ V, where z∗ denotes the complex conjugate of the scalar z.
A vector space with an inner product on it is called an inner product space (ips).
Example 8.1.
1. The scalar product on R2 and also on R3 are inner products.
2. For x = (x1, x2, . . . , xn), y = (y1, y2, . . . , yn) ∈ Fn, 〈x, y〉 = ∑_{j=1}^n xj yj∗.
This inner product is called the standard inner product on Fn.
3. Let V be a vector space. Let B = {u1, u2, . . . , un} be an ordered basis for V. Let x = ∑_{i=1}^n αiui and
y = ∑_{i=1}^n βiui. Define 〈x, y〉B = ∑_{i=1}^n αi βi∗. This is an inner product on V.
4. Let V be a vector space with dimV = n. Let T : V → Fn be a bijective linear transformation. Then
〈x, y〉T = 〈Tx, Ty〉 is an inner product on V. Here, 〈 · 〉 on the right hand side denotes an inner product on Fn.
5. Let t1, t2, . . . , tn+1 be distinct real numbers. For any p, q ∈ Pn, define 〈p, q〉 = ∑_{i=1}^{n+1} p(ti) q(ti)∗. This is an inner
product on Pn.
6. For f, g ∈ C[a, b], take 〈f, g〉 = ∫_a^b f(t) g(t)∗ dt. This is an inner product on C[a, b].
7. In all the above examples, consider R as the underlying scalar field and drop the conjugation from the definition
of inner products. Then the resulting function 〈·, ·〉 is an inner product on the corresponding vector space.
Theorem 8.1. Let V be an ips. For all x, y, z ∈ V and for all α ∈ F, 〈x, y + z〉 = 〈x, y〉 + 〈x, z〉 and 〈x, αy〉 = α∗〈x, y〉.

Proof. 〈x, y + z〉 = 〈y + z, x〉∗ = (〈y, x〉 + 〈z, x〉)∗ = 〈y, x〉∗ + 〈z, x〉∗ = 〈x, y〉 + 〈x, z〉.
〈x, αy〉 = 〈αy, x〉∗ = (α〈y, x〉)∗ = α∗〈y, x〉∗ = α∗〈x, y〉. □
Definition 8.2. Let V be an ips. For any x ∈ V, the length of x, also called the norm of x is ‖x‖ =√〈x, x〉.
For any x ∈ V, ‖x‖ ≥ 0 and ‖x‖ = 0 iff x = 0.
Theorem 8.2. Let x, y ∈ V, an ips. The parallelogram law holds: ‖x+ y‖2 + ‖x− y‖2 = 2‖x‖2 + 2‖y‖2.
Proof. ‖x+ y‖2 = 〈x+ y, x+ y〉 = 〈x, x〉+ 〈x, y〉+ 〈y, x〉+ 〈y, y〉. Complete the proof. �
Theorem 8.3. (Cauchy-Schwarz). Let x, y ∈ V, an ips. Then |〈x, y〉| ≤ ‖x‖ ‖y‖. Further, |〈x, y〉| = ‖x‖ ‖y‖ iff
{x, y} is linearly dependent.

Proof. If y = 0, then the claim is obvious. Assume y ≠ 0. Set α = 〈x, y〉/〈y, y〉. Then α∗ = 〈y, x〉/〈y, y〉.
Now, 0 ≤ ‖x − αy‖2 = 〈x − αy, x − αy〉 = 〈x, x〉 − α∗〈x, y〉 − α[〈y, x〉 − α∗〈y, y〉]
= ‖x‖2 − (〈y, x〉/〈y, y〉)〈x, y〉 = ‖x‖2 − |〈x, y〉|2/‖y‖2.
Therefore, |〈x, y〉| ≤ ‖x‖ ‖y‖.
Next, equality holds iff y = 0 or x = αy. □
Theorem 8.4. (Triangle Inequality). For all x, y ∈ V, an ips, ‖x+ y‖ ≤ ‖x‖+ ‖y‖.
Proof. ‖x + y‖2 = 〈x + y, x + y〉 = ‖x‖2 + 〈x, y〉 + 〈y, x〉 + ‖y‖2 = ‖x‖2 + 2 Re〈x, y〉 + ‖y‖2
≤ ‖x‖2 + 2|〈x, y〉| + ‖y‖2 ≤ ‖x‖2 + 2‖x‖‖y‖ + ‖y‖2 = (‖x‖ + ‖y‖)2. □
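Both inequalities can be sanity-checked on concrete complex vectors (a sketch assuming numpy; np.vdot conjugates its first argument, so with the convention 〈x, y〉 = ∑ xj yj∗, the inner product is vdot(y, x)):

```python
import numpy as np

# Cauchy-Schwarz and the triangle inequality for the standard inner
# product on C^3.
x = np.array([1 + 2j, 0.0, 3j])
y = np.array([2.0, 1 - 1j, 1.0])
ip = np.vdot(y, x)                   # <x, y> = sum_j x_j * conj(y_j)
nx, ny = np.linalg.norm(x), np.linalg.norm(y)

assert abs(ip) <= nx * ny                        # Cauchy-Schwarz
assert np.linalg.norm(x + y) <= nx + ny          # triangle inequality
```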
Definition 8.3. Let x, y ∈ V, an ips. The acute angle between x and y is denoted by θ(x, y), and is defined by
cos θ(x, y) = |〈x, y〉| / (‖x‖ ‖y‖). The vector x is orthogonal to y, written x ⊥ y, iff 〈x, y〉 = 0.
Notice that if x ⊥ y, then clearly, y ⊥ x. Moreover, 0 ⊥ x for every x.
Example 8.2.
1. Let {e1, e2, . . . , en} be the standard basis for Rn. Then ei ⊥ ej whenever i ≠ j.
2. In C[0, 2π], define 〈f, g〉 = ∫_0^{2π} f(t)g(t) dt. Since ∫_0^{2π} cos mt sin nt dt = 0 for m ≠ n, we have cos mt ⊥ sin nt
whenever m ≠ n.
Theorem 8.5. (Pythagoras). Let V be an ips. Let x, y ∈ V.
(a) If x ⊥ y, then ‖x + y‖2 = ‖x‖2 + ‖y‖2.
(b) Suppose V is a real vector space. If ‖x + y‖2 = ‖x‖2 + ‖y‖2, then x ⊥ y.

Proof. (a) 〈x + y, x + y〉 = ‖x‖2 + 〈x, y〉 + 〈y, x〉 + ‖y‖2. Since x ⊥ y, both 〈x, y〉 = 0 = 〈y, x〉.
(b) Let V be a real vector space. Then 〈x, y〉 = 〈y, x〉. If ‖x + y‖2 = ‖x‖2 + ‖y‖2, then 2〈x, y〉 = 0, so 〈x, y〉 = 0. □
Example 8.3. Take V = C, a complex ips with 〈x, y〉 = x y∗, as usual.
Now, ‖1 + i‖2 = (1 + i)(1 + i)∗ = (1 + i)(1 − i) = 2 = ‖1‖2 + ‖i‖2. But 〈1, i〉 = 1 × (−i) = −i ≠ 0.
Definition 8.4. Let V be an ips, S ⊆ V, and x ∈ V.
(a) x ⊥ S iff for each y ∈ S, x ⊥ y.
(b) S⊥ = {x ∈ V : x ⊥ S}.
(c) S is called an orthogonal set when x, y ∈ S, x ≠ y implies x ⊥ y.
Example 8.4. Let V be an ips. Then
1. V⊥ = {0}.
2. {0}⊥ = V.
3. If S is a superset of some basis for V, then S⊥ = {0}.
For (3), let x ∈ V. Then x = ∑_{i=1}^n αiui for some n, some αi ∈ F and some ui ∈ S. If y ⊥ S, then y ⊥ x as well.
That is, y ⊥ V, so y = 0.
Definition 8.5. Let V be an ips. A set S ⊆ V is called an orthonormal set if S is orthogonal and ‖x‖ = 1 for each
x ∈ S. If V is finite dimensional, then an orthonormal set which is also a basis for V is called an orthonormal basis
of V.
An orthonormal basis in a general ips is, in fact, a maximal orthonormal set. Thus, in an infinite dimensional ips,
an orthonormal basis may fail to be a basis. It so happens that in a finite dimensional inner product space, a maximal
orthonormal set becomes a basis for the space. Since we are concerned with finite dimensional ips only, the above
definition is enough for us.
Example 8.5.
1. The standard basis of Rn is an orthonormal basis of it.
2. The set of functions {cos mt : m ∈ N} in the real ips C[0, 2π] with inner product 〈f, g〉 = ∫_0^{2π} f(t)g(t) dt is an
orthogonal set. But ∫_0^{2π} cos2 t dt = π ≠ 1. Hence, it is not an orthonormal set.
3. However, {(cos mt)/√π : m ∈ N} is an orthonormal set in C[0, 2π].
Theorem 8.6. Every orthogonal set of non-zero vectors is linearly independent.
Every orthonormal set is linearly independent.
Proof. Let S be an orthogonal set of non-zero vectors in an ips V. For n ∈ N, distinct vi ∈ S, and αi ∈ F, suppose ∑_{i=1}^n αivi = 0. Then for each
j ∈ {1, . . . , n}, 0 = 〈∑_{i=1}^n αivi, vj〉 = ∑_{i=1}^n αi〈vi, vj〉 = αj〈vj , vj〉. Since vj ≠ 0, αj = 0. □
Theorem 8.7. Let S = {u1, u2, . . . , un} be an orthonormal set in an ips V. Let x ∈ span(S). Then

x = ∑_{j=1}^n 〈x, uj〉uj and ‖x‖2 = ∑_{j=1}^n |〈x, uj〉|2.

Proof. Write x = α1u1 + α2u2 + · · · + αnun. Then 〈x, uj〉 = αj . Next,
‖x‖2 = 〈∑_j αjuj , ∑_i αiui〉 = ∑_j ∑_i αj αi∗ 〈uj , ui〉 = ∑_j αj αj∗ = ∑_{j=1}^n |〈x, uj〉|2. □
Theorem 8.8. (Fourier Expansion and Parseval’s Identity). Let {v1, v2, . . . , vn} be an orthonormal basis for
an ips V. Let x ∈ V. Then

x = ∑_{j=1}^n 〈x, vj〉vj and ‖x‖2 = ∑_{j=1}^n |〈x, vj〉|2.
Theorem 8.9. (Bessel’s Inequality). Let {u1, u2, . . . , un} be an orthonormal set in an ips V. Let x ∈ V. Then

∑_{j=1}^n |〈x, uj〉|2 ≤ ‖x‖2.

Proof. Let y = ∑_{j=1}^n 〈x, uj〉uj . Then 〈x, ui〉 = 〈y, ui〉. That is, x − y ⊥ ui for each i. So, x − y ⊥ y.
By Pythagoras’ theorem, ‖x‖2 = ‖x − y‖2 + ‖y‖2 ≥ ‖y‖2. As y ∈ span{u1, . . . , un}, by Theorem 8.7, ‖y‖2 = ∑_{j=1}^n |〈y, uj〉|2 = ∑_{j=1}^n |〈x, uj〉|2. □
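A concrete check of Bessel's inequality with an orthonormal set that is not a basis (a sketch assuming numpy):

```python
import numpy as np

# Orthonormal set {e1, e2} in R^3; Bessel's inequality says
# |<x, e1>|^2 + |<x, e2>|^2 <= ||x||^2.
x = np.array([3.0, 4.0, 5.0])
us = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
coeff_sq = sum(np.dot(x, u) ** 2 for u in us)    # 9 + 16 = 25

assert coeff_sq <= np.dot(x, x)                  # 25 <= 50
```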
Problems for Section 8
1. Check whether each of the following is an inner product on the given vector spaces.
(a) 〈x, y〉 = x1y1 for x = (x1, x2), y = (y1, y2) in V = R2.
(b) 〈x, y〉 = x1y1 for x = (x1, x2), y = (y1, y2) in V = C2.
(c) 〈x, y〉 = ∑_{j=0}^n x(tj)y(tj) for x, y in V = Pn([0, 1],R), where t0, t1, . . . , tn are distinct points in [0, 1].
(d) 〈x, y〉 = ∫_0^1 x(t)y(t) dt for x, y in V = R([0, 1],R).
(e) 〈A,B〉 = trace(A + B) for A,B in V = R2×2.
(f) 〈A,B〉 = trace(AtB) for A,B in V = R3×3.
(g) 〈x, y〉 = ∫_0^1 x′(t)y′(t) dt for x, y in V = C1([0, 1],R).
(h) 〈x, y〉 = x(0)y(0) + ∫_0^1 x′(t)y′(t) dt for x, y in V = C1([0, 1],R).
(i) 〈(a, b), (c, d)〉 = ac − bd on R2.
(j) 〈f, g〉 = ∫_0^{1/2} f(t)g(t) dt on C[0, 1].
(k) 〈f, g〉 = ∫_0^1 f′(t)g(t) dt on P.
(l) 〈f, g〉 = ∫_0^1 f′(t)g′(t) dt on C1[0, 1].
2. Let B be a basis for a finite dimensional inner product space. Prove that if 〈x, y〉 = 0 for all x ∈ B, then y = 0.
3. Let A = (aij) ∈ R2×2. For x, y ∈ R2×1, let fA(x, y) = ytAx. Show that fA is an inner product on R2×1 if and
only if a12 = a21, a11 > 0, a22 > 0, and a11a22 − a12a21 > 0.
4. Let V be an inner product space, and let x, y ∈ V. Show the following:
(a) ‖x‖ ≥ 0.
(b) x = 0 iff ‖x‖ = 0.
(c) ‖αx‖ = |α|‖x‖, for all α ∈ F.
(d) ‖x+ αy‖ = ‖x− αy‖ for all α ∈ F iff 〈x, y〉 = 0.
(e) If ‖x+ y‖ = ‖x‖+ ‖y‖, then either y = 0 or x = αy, for some α ∈ F.
5. Let V1 and V2 be inner product spaces with inner products 〈·, ·〉1 and 〈·, ·〉2 respectively. On V = V1×V2, define
〈(x1, x2), (y1, y2)〉V := 〈x1, y1〉1 + 〈x2, y2〉2 for all (x1, x2), (y1, y2) ∈ V.
Show that 〈·, ·〉V is an inner product on V .
6. Let 〈·, ·〉1 and 〈·, ·〉2 be inner products on a vector space V . Show that
〈x, y〉 := 〈x, y〉1 + 〈x, y〉2 for all x, y ∈ V
defines another inner product on V .
7. Let T : V → Fn be a linear isomorphism (bijective linear transformation). Show that
〈x, y〉T := 〈Tx, Ty〉Fn for all x, y ∈ V,
defines an inner product on V. Here, 〈·, ·〉Fn is the standard inner product on Fn.
8. Let V be an inner product space over C. Prove that for all x, y ∈ V, Re〈ix, y〉 = −Im〈x, y〉.
9. Let V1 and V2 be inner product spaces. Let T : V1 → V2 be a linear transformation. Prove that for all
(x, y) ∈ V1 × V2,
〈Tx, Ty〉 = 〈x, y〉 if and only if ‖Tx‖ = ‖x‖.
[Notice that both the inner products are denoted by 〈·, ·〉. Hint: For the ‘only if’ part, use 〈T (x + y), T (x + y)〉 = 〈x + y, x + y〉 and the previous problem.]
10. For S ⊆ V, an inner product space, define S⊥ = {x ∈ V : 〈x, u〉 = 0, ∀u ∈ S}. Show that
(a) S⊥ is a subspace of V
(b) V ⊥ = {0}
(c) {0}⊥ = V
(d) S ⊆ S⊥⊥
(e) If W is a subspace of a finite dimensional inner product space V, then W⊥⊥ = W.
11. Show that W = {x ∈ R4 : x ⊥ (1, 0,−1, 1), x ⊥ (2, 3,−1, 2)} is a subspace of R4, where R4 has the standard
inner product. Find a basis for W .
12. For vectors x, y ∈ V, an inner product space, prove that x+ y ⊥ x− y iff ‖x‖ = ‖y‖.
13. Let x ⊥ y in an inner product space V. Prove that ‖x+ y‖2 = ‖x‖2 + ‖y‖2. Deduce Pythagoras theorem in R2.
14. Prove the following:
(a) If the scalar field is R, then show that the converse of the Pythagoras theorem holds, that is,
if ‖x+ y‖2 = ‖x‖2 + ‖y‖2, then x ⊥ y.
(b) If the scalar field is C, then show that the converse of Pythagoras theorem need not be true.
[Hint: Take V = C with standard inner product and x = α, y = i β for nonzero real numbers α, β ∈ R.]
15. Prove that every orthogonal set (hence, orthonormal set) in an inner product space is linearly independent.
16. Let V be an inner product space over F with dimV = n. Consider the inner product space Fn with the standard
inner product. Prove that there exists a linear isometry from V onto Fn, i.e., a surjective linear operator
T : V → Fn with ‖T (x)‖ = ‖x‖, for all x ∈ V . [Notice that both the norms are denoted by ‖ · ‖.]
17. Let V be an inner product space with basis B = {u1, u2, . . . , un}. If α1, α2, . . . αn ∈ F, then show that there is
exactly one vector x ∈ V such that 〈x, uj〉 = αj , for j = 1, 2, . . . , n.
18. Let V be a finite dimensional inner product space and let B = {u1, u2, . . . , un} be an orthonormal basis for V .
Let T be a linear operator on V , and A the matrix of T in the ordered basis B. Prove that Aij = 〈Tuj , ui〉.
19. Let V = Cn×n with the inner product 〈A,B〉 = tr(A∗B). Find the orthogonal complement of the subspace of
diagonal matrices.
20. Let {u1, u2, . . . , uk} be an orthonormal set in an inner product space V and let a1, a2, . . . , ak ∈ R. Prove that
‖∑_{i=1}^k aiui‖2 = ∑_{i=1}^k |ai|2‖ui‖2.
21. Let {u1, u2, . . . , un} be an orthonormal basis of a finite dimensional inner product space V. Show that for
any x, y ∈ V, 〈x, y〉 = ∑_{k=1}^n 〈x, uk〉〈uk, y〉. Then deduce that there is a linear isometry T : V → Fn, that is,
‖T (x)‖ = ‖x‖ for each x ∈ V.
9 Gram-Schmidt Orthogonalization
Given two linearly independent vectors u1, u2 in the plane, how do we construct two orthogonal vectors?
Keep v1 = u1. Subtract from u2 its projection on v1, to get v2. Then v2 ⊥ v1.
What is the projection of u2 on u1?
Its length is |〈u2, u1〉|/‖u1‖ and its direction is that of u1, i.e., u1/‖u1‖. Thus
v1 = u1, v2 = u2 − (〈u2, v1〉/〈v1, v1〉) v1.
We may continue this process of taking out projections in n dimensions.
Let V = span{u1, u2, . . . , un}. Define
v1 = u1,
v2 = u2 − (〈u2, v1〉/〈v1, v1〉) v1,
...
vn = un − (〈un, v1〉/〈v1, v1〉) v1 − (〈un, v2〉/〈v2, v2〉) v2 − · · · − (〈un, vn−1〉/〈vn−1, vn−1〉) vn−1.
Theorem 9.1. Let {u1, u2, . . . , un} be linearly independent. Then
{v1, v2, . . . , vn} is orthogonal and span{v1, v2, . . . , vn} = span{u1, u2, . . . , un}.
Proof. 〈v2, v1〉 = 〈u2 − (〈u2, v1〉/〈v1, v1〉) v1, v1〉 = 〈u2, v1〉 − (〈u2, v1〉/〈v1, v1〉)〈v1, v1〉 = 0. Use induction. □
Example 9.1. The vectors u1 = (1, 0, 0), u2 = (1, 1, 0), u3 = (1, 1, 1) form a basis for R3. Apply Gram-Schmidt
Orthogonalization.
Solution:
v1 = (1, 0, 0).
v2 = u2 − (〈u2, v1〉/〈v1, v1〉) v1 = (1, 1, 0) − [((1, 1, 0) · (1, 0, 0))/((1, 0, 0) · (1, 0, 0))] (1, 0, 0) = (1, 1, 0) − 1 (1, 0, 0) = (0, 1, 0).
v3 = u3 − (〈u3, v1〉/〈v1, v1〉) v1 − (〈u3, v2〉/〈v2, v2〉) v2 = (1, 1, 1) − ((1, 1, 1) · (1, 0, 0)) (1, 0, 0) − ((1, 1, 1) · (0, 1, 0)) (0, 1, 0)
= (1, 1, 1) − (1, 0, 0) − (0, 1, 0) = (0, 0, 1).
The set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is orthogonal.
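The formulas above translate directly into code; a short sketch (assuming numpy) that reproduces Example 9.1:

```python
import numpy as np

def gram_schmidt(us):
    """Orthogonalize the rows of `us` by the Gram-Schmidt formulas above."""
    vs = []
    for u in us:
        # subtract the projections of u on all previously computed v's
        v = u - sum((np.dot(u, w) / np.dot(w, w)) * w for w in vs)
        vs.append(v)
    return np.array(vs)

us = np.array([[1.0, 0.0, 0.0],
               [1.0, 1.0, 0.0],
               [1.0, 1.0, 1.0]])
vs = gram_schmidt(us)

# Example 9.1: the result is the standard basis of R^3.
assert np.allclose(vs, np.eye(3))
```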
Example 9.2. The vectors u1 = (1, 1, 0), u2 = (0, 1, 1), u3 = (1, 0, 1) form a basis for F3. Apply Gram-Schmidt
Orthogonalization.
Solution:
v1 = (1, 1, 0).
v2 = u2 − (〈u2, v1〉/〈v1, v1〉) v1 = (0, 1, 1) − [((0, 1, 1) · (1, 1, 0))/((1, 1, 0) · (1, 1, 0))] (1, 1, 0) = (0, 1, 1) − (1/2)(1, 1, 0) = (−1/2, 1/2, 1).
v3 = u3 − (〈u3, v1〉/〈v1, v1〉) v1 − (〈u3, v2〉/〈v2, v2〉) v2
= (1, 0, 1) − (1/2)(1, 1, 0) − (1/3)(−1/2, 1/2, 1) = (2/3, −2/3, 2/3).
The set {(1, 1, 0), (−1/2, 1/2, 1), (2/3, −2/3, 2/3)} is orthogonal.
Example 9.3. The vectors u1 = 1, u2 = t, u3 = t2 form a linearly independent set in the ips of all polynomials
considered as functions from [−1, 1] to R; the inner product is 〈p(t), q(t)〉 = ∫_{−1}^1 p(t)q(t) dt. Applying the Gram-Schmidt
process, we see:
v1 = u1 = 1.
v2 = u2 − (〈u2, v1〉/〈v1, v1〉) v1 = t − [∫_{−1}^1 t dt / ∫_{−1}^1 dt] 1 = t.
v3 = u3 − (〈u3, v1〉/〈v1, v1〉) v1 − (〈u3, v2〉/〈v2, v2〉) v2 = t2 − [∫_{−1}^1 t2 dt / ∫_{−1}^1 dt] 1 − [∫_{−1}^1 t3 dt / ∫_{−1}^1 t2 dt] t = t2 − 1/3.
The set {1, t, t2 − 1/3} is an orthogonal set in this ips.
In fact, orthogonalization of {1, t, t2, t3, t4, . . .} with the above inner product gives the Legendre Polynomials.
An orthogonal set can be made orthonormal by dividing each vector by its norm. Therefore, every finite dimensional
inner product space has an orthonormal basis.
Problems for Section 9
1. Consider R3 with the standard inner product. In each of the following, find orthogonal vectors obtained from
the given vectors using Gram-Schmidt orthogonalization procedure:
(a) (1, 2, 0), (2, 1, 0), (1, 1, 1)
(b) (1, 1, 1), (1,−1, 1), (1, 1,−1)
(c) (0, 1, 1), (0, 1,−1), (−1, 1,−1).
2. Consider R3 with the standard inner product. In each of the following, find a vector of norm 1 which is orthogonal
to the given two vectors:
(a) (2, 1, 0), (1, 2, 1)
(b) (1, 2, 3), (2, 1,−2)
(c) (0, 2,−1), (−1, 2,−1).
(d) {(1, 0, 1), (1, 0,−1), (0, 3, 4)}
[Hint: If {u1, u2} is linearly independent, then find a vector u3 so that {u1, u2, u3} is also linearly independent;
use Gram-Schmidt orthogonalization procedure. Alternatively, Find α, β, γ such that u3 = (α, β, γ) satisfies
〈u1, u3〉 = 0 and 〈u2, u3〉 = 0. Next, normalize u3.]
3. Consider R4 with the standard inner product. In each of the following, find V ⊥0 where V0 = span{u1, u2}:
(a) u1 = (1, 2, 0, 1), u2 = (2, 1, 0,−1)
(b) u1 = (1, 1, 1, 0), u2 = (1,−1, 1, 1)
(c) u1 = (0, 1, 1,−1), u2 = (0, 1,−1, 1).
[Hint: Extend {u1, u2} to a basis {u1, u2, u3, u4}; obtain orthogonal vectors v1, v2, v3, v4 by Gram-Schmidt or-
thogonalization procedure. Then V ⊥0 = span{v3, v4}.]
4. Consider C3 with standard inner product. Find an orthonormal basis for the subspace spanned by the vectors
(1, 0, i) and (2, 1, 1 + i).
5. Consider R3×3 with the inner product 〈A,B〉 = trace(BTA). Using Gram-Schmidt orthogonalization procedure,
find a nonzero matrix which is orthogonal to both the matrices [1 1 1; 1 −1 1; 1 1 −1] and [1 0 1; 1 1 0; 0 1 1].
6. Consider the polynomials uj(t) = tj−1 for j = 1, 2, 3 in the vector space of all polynomials with real coefficients.
By Gram-Schmidt orthogonalization procedure, find orthogonal polynomials obtained from u1, u2, u3 with respect
to the following inner products:
(a) 〈p, q〉 = ∫_0^1 p(t)q(t) dt
(b) 〈p, q〉 = ∫_{−1}^1 p(t)q(t) dt
(c) 〈p, q〉 = ∫_{−1}^0 p(t)q(t) dt.
7. Equip P3(R) with the inner product 〈f, g〉 = ∫_0^1 f(t)g(t) dt.
(a) Find the orthogonal complement of the subspace of constant polynomials.
(b) Apply Gram-Schmidt process to the ordered basis {1, t, t2, t3}.
10 QR Factorization
Let u1, u2, . . . , un be the columns of A ∈ Rm×n, m ≥ n. Suppose the columns are linearly independent.
Orthonormalize them to get v1, v2, . . . , vn. Now, span{u1, . . . , uk} = span{v1, . . . , vk} for k = 1, . . . , n. Thus,
u1 = r11v1
u2 = r12v1 + r22v2
...
un = r1nv1 + r2nv2 + · · · + rnnvn, where rij = 〈uj , vi〉.
Writing R = (rij) for i, j = 1, 2, . . . , n and Q = [v1 v2 · · · vn], we see that A = [u1 u2 · · · un] = QR.
Since columns of Q are orthonormal, Q ∈ Rm×n, R ∈ Rn×n, QtQ = I and R is upper triangular.
A matrix with real entries, whose columns are orthonormal, is called an orthogonal matrix.
Definition 10.1. The QR factorization of a matrix A is the determination of an orthogonal matrix Q and an upper
triangular matrix R such that A = QR.
We thus see that
Theorem 10.1. Any matrix A ∈ Rm×n, whose columns are linearly independent, does have a QR factorization.
Moreover, R is invertible.
Example 10.1. Let A = [1 1; 0 1; 1 1].
Orthonormalization of the columns of A yields Q = [1/√2 0; 0 1; 1/√2 0].
Since A = QR and QtQ = I, we have R = QtA = [√2 √2; 0 1]. Verify that A = QR.
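numpy computes the same factorization (a sketch, assuming numpy; Q's columns are determined only up to sign, so compare QR with A rather than Q with the example):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Q, R = np.linalg.qr(A)     # reduced QR: Q is 3x2, R is 2x2 upper triangular

assert np.allclose(Q @ R, A)                 # A = QR
assert np.allclose(Q.T @ Q, np.eye(2))       # orthonormal columns
assert np.allclose(np.tril(R, -1), 0.0)      # R is upper triangular
```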
Problems for Section 10
1. Find a QR-factorization of each of the following matrices:
(a) [0 1; 1 1; 0 1]   (b) [1 0 2; 0 1 1; 1 2 0]   (c) [1 1 2; 0 1 −1; 1 1 0; 0 0 1]
11 Best Approximation
What is the best point on a line that approximates a point not on the line?
What is the best point on a circle that approximates a point not on the circle?
What is the best point on a surface that approximates a point not on the surface?
Definition 11.1. Let U be a subspace of an ips V. Let v ∈ V. A vector u ∈ U is a best approximation of v if
‖v − u‖ ≤ ‖v − x‖ for each x ∈ U.
Theorem 11.1. Let U be a subspace of an ips V. A vector u ∈ U is a best approximation of v ∈ V iff v − u ⊥ U.
Moreover, a best approximation is unique.
Proof: (a) Suppose v − u ⊥ U. Let x ∈ U. Now, u− x ∈ U. By Pythagoras’ Theorem,
‖v − x‖2 = ‖(v − u) + (u− x)‖2 = ‖v − u‖2 + ‖u− x‖2 ≥ ‖v − u‖2.
Hence, ‖v − u‖ ≤ ‖v − x‖ for each x ∈ U. That is, u is a best approximation of v.
(b) Suppose u is a best approximation of v. Then
‖v − u‖ ≤ ‖v − x‖ for each x ∈ U. (∗)
Let y ∈ U. To show: 〈v − u, y〉 = 0.
For y = 0, clearly 〈v − u, y〉 = 0.
For y ≠ 0, let α = 〈v − u, y〉/‖y‖2. Then 〈v − u, αy〉 = |α|2‖y‖2 and also 〈αy, v − u〉 = |α|2‖y‖2.
From (∗),
‖v − u‖2 ≤ ‖v − u − αy‖2 = 〈v − u − αy, v − u − αy〉 = ‖v − u‖2 − 〈v − u, αy〉 − 〈αy, v − u〉 + |α|2‖y‖2 = ‖v − u‖2 − |α|2‖y‖2.
Thus |α|2‖y‖2 ≤ 0, so α = 0. Hence, 〈v − u, y〉 = 0.
(c) If both u, w are best approximations of v, then ‖v − u‖ ≤ ‖v − w‖ and ‖v − w‖ ≤ ‖v − u‖. So, ‖v − u‖ = ‖v − w‖. Now, since v − w ⊥ U and w − u ∈ U, we have
‖v − u‖2 = ‖(v − w) + (w − u)‖2 = ‖v − w‖2 + ‖w − u‖2 = ‖v − u‖2 + ‖w − u‖2.
Thus, ‖w − u‖2 = 0, i.e., w = u. □
Theorem 11.2. Let {u1, . . . , un} be an orthonormal basis for a subspace U of an ips V. Let v ∈ V. The unique best approximation of v from U is u = ∑_{i=1}^n 〈v, ui〉ui.

Proof: Let x ∈ U. Then x = ∑_{j=1}^n 〈x, uj〉uj . Now,
〈v − u, x〉 = 〈v − ∑_{i=1}^n 〈v, ui〉ui , ∑_{j=1}^n 〈x, uj〉uj〉 = ∑_{j=1}^n 〈x, uj〉∗〈v, uj〉 − ∑_{i=1}^n ∑_{j=1}^n 〈v, ui〉〈x, uj〉∗〈ui, uj〉
= ∑_{j=1}^n 〈x, uj〉∗〈v, uj〉 − ∑_{i=1}^n 〈v, ui〉〈x, ui〉∗ = 0.
That is, v − u ⊥ U. □
If {u1, . . . , un} is a basis for U, write u = ∑_{j=1}^n βjuj . Our requirement v − u ⊥ ui means determining the βj from
〈v − ∑_{j=1}^n βjuj , ui〉 = 0. Thus, we solve the linear system ∑_{j=1}^n 〈uj , ui〉βj = 〈v, ui〉, i = 1, . . . , n.
Example 11.1.
1. What is the best approximation of v = (1, 0) ∈ R2 from U = {(a, a) : a ∈ R}?
Solution: Find (α, α) so that (1, 0) − (α, α) ⊥ (β, β) for all β, i.e., so that (1 − α, −α) · (1, 1) = 0. So,
α = 1/2. The best approximation here is (1/2, 1/2).
2. In C[0, 1] over R, with 〈f, g〉 = ∫_0^1 f(t)g(t) dt, what is the best approximation of t2 from P1?
Solution: Determine α, β so that t2 − (α + βt) ⊥ 1 and t2 − (α + βt) ⊥ t. That is,
∫_0^1 (t2 − α − βt) dt = 0 = ∫_0^1 (t3 − αt − βt2) dt.
This gives 1/3 − α − β/2 = 0 = 1/4 − α/2 − β/3. Solving, α = −1/6 and β = 1. The best approximation is −1/6 + t.
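The linear system ∑_j 〈uj, ui〉βj = 〈v, ui〉 for the second example can be set up and solved numerically (a sketch assuming numpy; the Gram integrals 〈tj, ti〉 = 1/(i + j + 1) are computed by hand):

```python
import numpy as np

# Best approximation of t^2 from P1 in C[0,1] with <f,g> = integral_0^1 f g dt.
# Basis of P1: u0 = 1, u1 = t. Gram matrix G_ij = <u_j, u_i> and
# right-hand side <t^2, u_i>, for i, j = 0, 1.
G = np.array([[1.0, 1/2],
              [1/2, 1/3]])
rhs = np.array([1/3, 1/4])
alpha, beta = np.linalg.solve(G, rhs)

# Matches Example 11.1: the best approximation is -1/6 + t.
assert np.allclose([alpha, beta], [-1/6, 1.0])
```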
Best approximation can be used to compute the best approximate solution of a linear system of equations.
Definition 11.2. Let U be a vector space, V an ips, A : U → V a linear transformation. A vector u ∈ U is a Best
Approximate Solution (also called a Least Square Solution) of the equation Ax = y if ‖Au− y‖ ≤ ‖Az− y‖ for
all z ∈ U.
Thus, u is a best approximate solution of Ax = y iff v = Au is a best approximation of y from R(A), the range
space of A. We then obtain the following result.
Theorem 11.3. Let U be a vector space, V an ips, A : U → V a linear transformation.
1. If R(A) is finite dimensional, then Ax = y has a best approximate solution.
2. A vector u ∈ U is a best approximate solution iff Au− y ⊥ R(A).
3. A best approximate solution is unique iff A is one-one.
Theorem 11.4. Let A ∈ Rm×n. A vector u ∈ Rn is a best approximate solution of Ax = y iff AtAu = Aty.
Proof: The columns u1, . . . , un of A span R(A).
Thus u is a best approximate solution of Ax = y
iff 〈Au − y, ui〉 = 0 for i = 1, . . . , n
iff uti(Au − y) = 0 for these i
iff At(Au − y) = 0
iff AtAu = Aty. □
Example 11.2. Let A = [1 1; 0 0], y = (0, 1)t, u = (1, −1)t.
Now, AtAu = Aty. Thus u is a best approximate solution of Ax = y.
Notice that Ax = y has no solution.
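A numerical version of this example (a sketch assuming numpy; np.linalg.lstsq returns a least square solution even when AtA is singular, as it is here):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 0.0]])
y = np.array([0.0, 1.0])
u, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)

# Any best approximate solution satisfies the normal equations A^t A u = A^t y.
assert np.allclose(A.T @ A @ u, A.T @ y)
```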
We can use QR factorization in computing a best approximate solution of a linear system.
Theorem 11.5. Let A be a matrix with real entries, whose columns are linearly independent. Then, the best approx-
imate solution of Ax = y is given by u = R−1Qty, where A = QR is the QR factorization of A.
Proof: Let u = R−1Qty. Now, AtAu = RtQtQRR−1Qty = RtQty = Aty.
That is, u satisfies the equation AtAx = Aty. □
Why need u not be a solution of Ax = y?
Because Au = QRR−1Qty = QQty, which need not equal y.
But if a solution v exists for Ax = y, then v = u. Reason?
Av = y ⇒ QRv = y ⇒ Rv = Qty (multiply by Qt) ⇒ v = R−1Qty = u.
Note that the system Ru = Qty is easy to solve since R is upper triangular.
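Theorem 11.5 as a computation (a sketch assuming numpy; numpy.linalg.solve is used in place of hand back-substitution on the triangular system Ru = Qty):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # linearly independent columns
y = np.array([1.0, 2.0, 3.0])

Q, R = np.linalg.qr(A)
u = np.linalg.solve(R, Q.T @ y)  # u = R^{-1} Q^t y

# u satisfies the normal equations, so it is the best approximate solution.
assert np.allclose(A.T @ A @ u, A.T @ y)
assert np.allclose(u, [0.0, 2.0])
```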
Problems for Section 11
1. Find the best approximation of x ∈ V from U where
(a) V = R3, x = (1, 2, 1), U = span{(3, 1, 2), (1, 0, 1)}.
(b) V = R3, x = (1, 2, 1), U = {(α, β, γ) ∈ R3 : α+ β + γ = 0}.
(c) V = R4, x = (1, 0,−1, 1), U = span{(1, 0,−1, 1), (0, 0, 1, 1)}.
(d) V = C[−1, 1], x(t) = et, U = P4.
2. Let A ∈ Rm×n and y ∈ Rm. Show that there exists x ∈ Rn such that ‖Ax − y‖ ≤ ‖Au − y‖ for all u ∈ Rn iff
AtAx = Aty.
3. Let A ∈ Rm×n and y ∈ Rm. If columns of A are linearly independent, then show that there exists a unique
x ∈ Rn such that AtAx = Aty.
4. Find the best approximate (least squares) solution for the system Ax = y, where
(a) A = [3  1]
        [1  2]
        [2 −1],  y = (1, 0, −2)t;
(b) A = [ 1  1  1]
        [−1  0  1]
        [ 1 −1  0]
        [ 0  1 −1],  y = (0, 1, −1, −2)t.
12 Diagonalization
By changing bases, matrices can assume nice forms.
Definition 12.1. Let V be a finite dimensional vector space. A linear operator T : V → V is diagonalizable if there
exists an ordered basis E for V such that [T ]E,E is diagonal.
Write such a basis as E = {u1, . . . , un}. Write [T ]E,E = diag(λ1, . . . , λn). Then Tui = λiui. That is, λi is an
eigenvalue of T with the associated eigenvector ui. Hence, T : V → V is diagonalizable iff V has a basis consisting of
eigenvectors of T.
How do we get such a basis for V ?
Take V as a vector space over C. Given any basis B for V, we have the matrix A = [T ]B,B for T. Eigenvalues of T
are the zeros of the characteristic polynomial det(A−λI) = 0. If λ is an eigenvalue of T, then its associated eigenvector
u is a solution of Au = λu. This means u ∈ N(T − λI). Now, how many linearly independent solution vectors u of
Au = λu are there? It is dimN(T − λI). Is it the same as the multiplicity of λ as a zero of det(A− λI)?
Definition 12.2. Let V be a finite dimensional complex vector space. Let T : V → V be a linear transformation.
Let λ be an eigenvalue of T.
The algebraic multiplicity of λ is the number of times λ is a zero of the characteristic polynomial of T.
The geometric multiplicity of λ is dimN(T − λI).
Example 12.1. Let
A = [1 0]       B = [1 1]
    [0 1]  and      [0 1].
λ = 1 has algebraic multiplicity 2 for both A and B.
For geometric multiplicities, we solve Ax = x and Bx = x.
Ax = x gives x = x, which is satisfied by the linearly independent vectors (1, 0)t and (0, 1)t.
Thus, N(A− λI) has dimension 2.
Writing x = (a, b)t, Bx = x gives a+ b = a and b = b. That is, b = 0 and a can be any complex number. For example, (1, 0)t.
dimN(B − λI) = 1. Geometric multiplicity of λ = 1 for A is 2 but for B is 1.
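The geometric multiplicity can be computed mechanically as dimN(M − λI) = n − rank(M − λI). A Python sketch for the matrices of Example 12.1 (the helper names rank and geometric_multiplicity are ours; rank is found by Gaussian elimination):

```python
# Geometric multiplicity of an eigenvalue lam: n - rank(M - lam*I).

def rank(M, tol=1e-12):
    M = [row[:] for row in M]            # work on a copy
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # Pick the largest available pivot in column c.
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]), default=None)
        if pivot is None or abs(M[pivot][c]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [M[i][j] - f * M[r][j] for j in range(cols)]
        r += 1
    return r

def geometric_multiplicity(M, lam):
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

A = [[1, 0], [0, 1]]
B = [[1, 1], [0, 1]]
print(geometric_multiplicity(A, 1))  # 2
print(geometric_multiplicity(B, 1))  # 1
```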
Theorem 12.1. Let V be a complex vector space with dimV = n. Let T : V → V be linear. Geometric multiplicity
of any eigenvalue of T is less than or equal to its algebraic multiplicity. T is diagonalizable iff geometric multiplicity
of each eigenvalue is equal to its algebraic multiplicity iff sum of geometric multiplicities of all eigenvalues is n.
Therefore, the matrix
[1 1]
[0 1]
is not diagonalizable.
Things work out better in an ips.
In an ips V, we start with an orthonormal basis B = {u1, . . . , un}. Now, Tui = ∑k aik uk gives 〈Tui, uj〉 = aij . That
is, Tui = ∑k 〈Tui, uk〉uk. Thus, [T ]B,B = (〈Tui, uj〉).
Definition 12.3. A linear operator T : V → V, where V is an ips, is called a self-adjoint operator if for all
x, y ∈ V, 〈Tx, y〉 = 〈x, Ty〉.
Let B = {u1, . . . , un} be an orthonormal basis for V. Let T be self-adjoint. The entries of matrix A = [T ]B,B = (aij)
satisfy
aij = 〈Tui, uj〉 = 〈ui, Tuj〉, which is the complex conjugate of 〈Tuj , ui〉 = aji.
That is, A = [T ]B,B is hermitian (A∗ = A).
Conversely, if [T ]B,B is hermitian for some orthonormal basis B, then it is easy to see that T is self-adjoint.
In case V is a real vector space, [T ]B,B is real symmetric.
Theorem 12.2. A self-adjoint operator on a finite dimensional ips is diagonalizable. Thus, all hermitian matrices
and all real symmetric matrices are diagonalizable. In general, a normal matrix (one with AA∗ = A∗A) is diagonalizable.
Since change of basis results in P−1AP, we say that a matrix A is diagonalizable iff there exists an invertible matrix
P such that P−1AP is a diagonal matrix. We also say that A is diagonalized by the matrix P.
Given a matrix A ∈ Cn×n, we find its eigenvalues and the associated eigenvectors.
We then check whether n such linearly independent eigenvectors exist or not.
If not, we cannot diagonalize A.
If yes, then, we put the eigenvectors together as columns to form the matrix P.
Then, P−1AP is a diagonalization of A.
If A is hermitian, then the eigenvalues are real.
We can also orthonormalize the eigenvectors.
Then, putting together the orthonormalized eigenvectors of A, as columns of P, we see that P ∗ = P−1.
That is, A can be diagonalized by a unitary matrix P.
If A is real symmetric, then the eigenvectors are also real. Thus the unitary matrix P satisfies P ∗ = P t = P−1.
That is, A can be diagonalized by an orthogonal matrix.
Note: If A has distinct eigenvalues, its eigenvectors are linearly independent, thus A is diagonalizable.
Example 12.2. The matrix
A = [ 1 −1 −1]
    [−1  1 −1]
    [−1 −1  1]
is real symmetric.
It has eigenvalues −1, 2, 2, with associated respective eigenvectors
(1/√3, 1/√3, 1/√3)t,  (−1/√2, 1/√2, 0)t,  (−1/√6, −1/√6, 2/√6)t.
They form an orthonormal basis for R3. Taking
P = [1/√3  −1/√2  −1/√6]
    [1/√3   1/√2  −1/√6]
    [1/√3    0     2/√6],
we see that P−1 = P t and
P−1AP = P tAP = [−1 0 0]
                [ 0 2 0]
                [ 0 0 2].
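The diagonalization in Example 12.2 can be checked by direct multiplication. A Python sketch (the helper names matmul and transpose are ours):

```python
# Check Example 12.2: P is orthogonal and Pt A P = diag(-1, 2, 2).
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

s3, s2, s6 = math.sqrt(3), math.sqrt(2), math.sqrt(6)
A = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
P = [[1/s3, -1/s2, -1/s6],
     [1/s3,  1/s2, -1/s6],
     [1/s3,  0,     2/s6]]

D = matmul(transpose(P), matmul(A, P))
# D equals diag(-1, 2, 2) up to floating-point rounding.
```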
Problems for Section 12
1. Show that the eigenvalues of a real skew symmetric matrix are purely imaginary.
2. Show that the eigenvalues of a real orthogonal matrix have absolute value 1.
3. Show that the eigenvectors corresponding to distinct eigenvalues of a real symmetric matrix are orthogonal.
4. If x and y are eigenvectors corresponding to distinct eigenvalues of a real symmetric matrix of order 3, then
show that the cross product of x and y is a third eigenvector linearly independent with x and y.
5. Diagonalize the following matrices:
(a) [0 1 1]    (b) [ 7 −2  0]
    [1 0 1]        [−2  6 −2]
    [1 1 0]        [ 0 −2  5].
6. Give examples of matrices which cannot be diagonalized.
7. Let V be a finite dimensional vector space with dimV = n and T ∈ L(V, V ). Prove that T is diagonalizable if
and only if V has a basis consisting of eigenvectors of T.
8. Which of the following linear transformations T are diagonalizable? For each diagonalizable T, find the basis E
and the diagonal matrix [T ]E,E .
(a) T : R3 → R3 such that T (x1, x2, x3) = (x1 + x2 + x3, x1 + x2 − x3, x1 − x2 + x3).
(b) T : P3 → P3 such that T (a0 + a1t+ a2t2 + a3t3) = a1 + 2a2t+ 3a3t2.
(c) T : R3 → R3 such that Te1 = 0, T e2 = e1, T e3 = e2.
(d) T : R3 → R3 such that Te1 = e2, T e2 = e3, T e3 = 0.
(e) T : R3 → R3 such that Te1 = e3, T e2 = e2, T e3 = e1.
9. Show that the linear transformation T : R3 → R3 corresponding to each of the following matrices is diagonalizable.
Also find a basis of eigenvectors of T for R3.
(a) [ 3/2  −1/2  0]    (b) [ 3  −1/2  −3/2]
    [−1/2   3/2  0]        [ 1   3/2   3/2]
    [ 1/2  −1/2  1]        [−1  −1/2   5/2]
10. Check whether the linear transformation T : R3 → R3 corresponding to each of the following matrices is diago-
nalizable. If diagonalizable, find a basis of eigenvectors for the space R3:
(a) [1  1  1]    (b) [1 1 1]    (c) [1 0 1]
    [1 −1  1]        [0 1 1]        [1 1 0]
    [1  1 −1]        [0 0 1]        [0 1 1]
13 System of Linear Equations
Definition 13.1. Let A = (aij) ∈ Fm×n, b ∈ Fm. Then
a11x1 + · · ·+ a1nxn = b1
          ...
am1x1 + · · ·+ amnxn = bm
is called a system of linear equations for the unknowns x1, . . . , xn with coefficients in F. If the bi are all zero, the
system is said to be homogeneous.
We consider the matrix A as a linear transformation A : Fn → Fm and the system is written as Ax = b. The
solution set of the system Ax = b is
Sol(A, b) = {x ∈ Fn : Ax = b}.
The system Ax = b is solvable if Sol(A, b) ≠ ∅.
Theorem 13.1. Ax = b is solvable iff rankA = rank [A|b].
Proof. Ax = b is solvable iff b = Ax for some x ∈ Fn iff b ∈ R(A). Suppose b ∈ R(A).
As R(A) = span{Aei} = span of the columns of A, rankA does not change if b is added to the set of column vectors of A.
Then rankA = rank[A|b].
Conversely, suppose rankA = rank[A|b]. The columns of [A|b] generate a subspace, call it U, containing the columns
of A and b. R(A) is a subspace of U. But R(A) and U now have the same dimension. Hence, U = R(A).
That is, b ∈ R(A). �
Theorem 13.2. Let x0 ∈ Fn be a solution of Ax = b. Then Sol(A, b) = x0 +N(A) = {x0 + x : x ∈ N(A)}.
Proof. If x ∈ N(A), then A(x0 + x) = Ax0 +Ax = Ax0 = b. That is, x0 + x ∈ Sol(A, b). If v ∈ Sol(A, b) and Ax0 = b,
then A(v − x0) = Av −Ax0 = b− b = 0. That is, v − x0 ∈ N(A). Then, v ∈ x0 +N(A). �
Corollary 13.3. Let r = nullA = n− rankA.
1. If x0 is a solution of Ax = b and {v1, . . . , vr} is a basis for N(A),
then Sol(A, b) = {x0 + λ1v1 + · · ·+ λrvr : λi ∈ F}.
2. A solvable system Ax = b is uniquely solvable iff N(A) = {0} iff rankA = n.
3. If A is a square matrix, then Ax = b is uniquely solvable iff det(A) ≠ 0.
Theorem 13.4. Let det(A) ≠ 0. Denote by Ci the ith column of A. Then the solution of Ax = b is given by
xi = det(A[Ci ← b])/ det(A).
Proof. Write Ax = b as x1C1 + · · ·+ xnCn = b. Moving b to the left side:
x1C1 + · · ·+ (xiCi − b) + · · ·+ xnCn = 0.
That is, the columns of the new matrix with columns
C1, . . . , xiCi − b, . . . , Cn
are linearly dependent. Thus, det(A[Ci ← xiCi − b]) = 0.
This gives xidet(A)− det(A[Ci ← b]) = 0. �
Cramer’s rule helps in studying the map (A, b) ↦ x, when det(A) ≠ 0.
But it is an impractical method for computing the solution of a system.
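For very small systems, however, Theorem 13.4 translates directly into code. A Python sketch (the helper names det and cramer are ours; the determinant is computed by cofactor expansion, which is itself impractical beyond tiny n):

```python
# Cramer's rule (Theorem 13.4): x_i = det(A[C_i <- b]) / det(A).

def det(M):
    # Determinant by cofactor expansion along the first row.
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def cramer(A, b):
    d = det(A)
    assert d != 0, "Cramer's rule requires det(A) != 0"
    n = len(A)
    x = []
    for i in range(n):
        # Replace the ith column of A by b.
        Ai = [[b[r] if c == i else A[r][c] for c in range(n)] for r in range(n)]
        x.append(det(Ai) / d)
    return x

A = [[2, 1], [1, 3]]
b = [3, 5]
print(cramer(A, b))  # [0.8, 1.4]
```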
Definition 13.2. There are three kinds of Elementary Row Operations for a matrix A ∈ Fm×n:
R1. Interchange of two rows.
R2. Multiplication of a row by a nonzero constant.
R3. Replacing a row by the sum of that row and a multiple of another row.
Similarly, Elementary Column Operations are defined.
Column operations do not alter the column rank of a matrix. Since column rank of a matrix is the rank of the
matrix, rank of a matrix can be computed using this.
Similarly, using elementary row operations, the row rank of a matrix can be computed.
We need to bring the matrix to a form where the diagonal elements a11, . . . , arr in the new matrix are nonzero,
all entries below the diagonal are zero, and the rows beyond the rth row are zero rows. It would show that the row
rank of the matrix is r.
Theorem 13.5. Elementary row operations preserve the row rank of a matrix.
Similarly, elementary column operations preserve the column rank of a matrix.
Theorem 13.6. For a matrix A, rowrank(A) = colrank(A) = rankA.
Proof. Call a row of A superfluous if it is a linear combination of the other rows. Omitting a superfluous row
does not alter the row rank.
Similarly, define superfluous columns and consider their omission.
If the jth row is superfluous, then each entry in it is the same linear combination of the corresponding entries of the other rows.
Then a linear combination of the columns is zero whenever the same combination of the columns with the jth entries
deleted is zero.
Thus, omitting the jth row does not alter the column rank.
A similar argument shows that omitting a superfluous column does not alter the row rank. �
Example 13.1.
[1 1 1]   [1 1 1]   [1 1 1]   [1 1 0]   [1 0 0]
[2 2 2] → [2 2 2] → [0 0 0] → [0 0 0] → [0 0 0]
[3 3 3]   [0 0 0]   [0 0 0]   [0 0 0]   [0 0 0]
Hence, rank of the matrix is 1.
Theorem 13.7. If we change the augmented matrix [A|b] by elementary row transformations into a matrix [A′|b′],
then Sol(A, b) = Sol(A′, b′).
Proof. Elementary row operations are just an abbreviated way of manipulating the equations linearly. �
Gaussian Elimination
1. Start with the augmented matrix [A|b]. If the (1,1) entry is 0, then interchange the first row with another row so
that the (1,1) entry of the new matrix is nonzero. Replace each other row by that row minus a suitable multiple of
the first row, to make all (j, 1) entries zero for j > 1.
2. After the kth step, do not touch the first k rows. Continue as in Step 1 with the rest of the matrix.
3. After the (n− 1)th step, the matrix is of the form [A′|b′], where A′ is upper triangular. Use back-substitution to
solve it.
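For the case of a square invertible A, Steps 1–3 above can be sketched in Python as follows (the function name gauss_solve is ours; in Step 1 we interchange so as to pick the largest available pivot, i.e. partial pivoting):

```python
# Gaussian Elimination (Steps 1-3): reduce [A|b] to upper triangular form,
# then back-substitute. Assumes A is square and invertible.

def gauss_solve(A, b):
    n = len(A)
    M = [A[i][:] + [b[i]] for i in range(n)]   # augmented matrix [A|b]
    for k in range(n):
        # Steps 1-2: bring the largest pivot in column k into row k.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            M[i] = [M[i][j] - f * M[k][j] for j in range(n + 1)]
    # Step 3: back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
print(gauss_solve(A, b))  # approximately [2.0, 3.0, -1.0]
```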
(A) Start as in Gaussian Elimination and proceed as long as possible. After, say, t steps, you find that the first t
diagonal elements are nonzero, but no interchange among the last m− t rows can fill the (t+ 1, t+ 1) entry with a
nonzero element.
Now, the augmented matrix looks like:
[a′11   *   · · ·   *       | b′1]
[ 0    a′22 · · ·   *    *  |  * ]
[ ·         · · ·           |  · ]
[ 0     0   · · ·  a′tt     |  * ]
[ 0     0   · · ·   0    B′ |  * ]
where the first column of the block B′ (the (t+ 1)th column of the matrix) is entirely zero.
(B) Use interchange of columns from among the last n− t columns, keeping track of the corresponding variables, and
continue as in Step (A). Reach finally a matrix where the first r diagonal elements are nonzero, and the remaining
m− r rows are zero rows in the matrix A. Call the b-entries b′i.
Now, the augmented matrix looks like:
[a′11   *   · · ·   *   | b′1  ]
[ 0    a′22 · · ·   *   |  *   ]
[ ·         · · ·       |  ·   ]
[ 0     0   · · ·  a′rr | b′r  ]
[ 0     0   · · ·   0   | b′r+1]
[ ·                     |  ·   ]
[ 0     0   · · ·   0   | b′m  ]
(C) If one of b′r+1, . . . , b′m is nonzero, then Ax = b is not solvable.
(D) Otherwise, omit the last m− r rows. You now end up with the augmented matrix [T |S|b′], where T is an upper
triangular r × r invertible matrix, S is an r × k matrix (k = n− r), and b′ is an r-vector.
Now, the augmented matrix looks like:
[a′11   *   · · ·   *       | b′1]
[ 0    a′22 · · ·   *    S  |  · ]
[ ·         · · ·           |  · ]
[ 0     0   · · ·  a′rr     | b′r]
(E) Write the unknowns as y1, . . . , yr, z1, . . . , zk. This is a permutation of the original x1, . . . , xn.
The system now reads Ty + Sz = b′.
(F) Solve Ty = b′ and adjoin k zeros to get w0. For the jth vector wj , solve Ty = −Sj , where Sj is the jth column
of S; adjoin 1 at the jth position and 0 at the other adjoined positions to get wj .
(We want to get k linearly independent solutions, since null [T |S] = k.)
(G) Reorder the entries of the vectors w0, . . . , wk to get v0, . . . , vk, undoing the reordering of the unknowns done in
Step (B).
(H) Sol(A, b) = {v0 + λ1v1 + · · ·+ λkvk : λi ∈ F}.
Example 13.2. Suppose after Step D, you obtain:
x+ y + z = 1, y + 3z = 1.
You might take y = 1− 3z, x = 1− z − (1− 3z) = 2z giving
Sol(A, b) = {(2z, 1− 3z, z) : z ∈ F} = {(0, 1, 0) + λ(2,−3, 1) : λ ∈ F}.
Here, (0, 1) is obtained by solving x+ y = 1, y = 1, and (2,−3) is obtained by solving x+ y = −1, y = −3.
Note that −1 is the negative of the coefficient of z in the first equation, and −3 is that in the second equation.
Example 13.3. Solve the following system Ax = b by Gaussian Elimination:
2x1 + 3x2 + x3 + 4x4 − 9x5 = 17
x1 + x2 + x3 + x4 − 3x5 = 6
x1 + x2 + x3 + 2x4 − 5x5 = 8
2x1 + 2x2 + 2x3 + 3x4 − 8x5 = 14.

[2 3 1 4 −9 | 17]    [1 1 1 1 −3 |  6]    [1 1  1 1 −3 | 6]
[1 1 1 1 −3 |  6]  → [2 3 1 4 −9 | 17]  → [0 1 −1 2 −3 | 5]
[1 1 1 2 −5 |  8]    [1 1 1 2 −5 |  8]    [0 0  0 1 −2 | 2]
[2 2 2 3 −8 | 14]    [2 2 2 3 −8 | 14]    [0 0  0 1 −2 | 2]

  [1 1  1 1 −3 | 6]           [1 1 1  1 −3 | 6]
→ [0 1 −1 2 −3 | 5]  →Col3,4  [0 1 2 −1 −3 | 5]
  [0 0  0 1 −2 | 2]           [0 0 1  0 −2 | 2]
  [0 0  0 0  0 | 0]           [0 0 0  0  0 | 0]
The pivot unknowns are x1, x2, x4; the others are arbitrary. Write x3 = α, x5 = β. Then the system looks like:
x1 + x2 + x4 = 6− α+ 3β
x2 + 2x4 = 5 + α+ 3β
x4 = 2 + 2β
Back substitution gives: x1 = 3− 2α+ 2β, x2 = 1 + α− β, x3 = α, x4 = 2 + 2β, x5 = β.
Thus, Sol(A, b) = {(3, 1, 0, 2, 0)t + α(−2, 1, 1, 0, 0)t + β(2,−1, 0, 2, 1)t : α, β ∈ F}.
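The solution family of Example 13.3 can be checked by substitution. A short Python sketch (the helper name residual is ours) verifies that every member of the family solves Ax = b:

```python
# Check Example 13.3: every member of the solution family solves Ax = b.

A = [[2, 3, 1, 4, -9],
     [1, 1, 1, 1, -3],
     [1, 1, 1, 2, -5],
     [2, 2, 2, 3, -8]]
b = [17, 6, 8, 14]

v0 = [3, 1, 0, 2, 0]      # particular solution
v1 = [-2, 1, 1, 0, 0]     # basis vectors of N(A)
v2 = [2, -1, 0, 2, 1]

def residual(alpha, beta):
    # Compute Ax - b for x = v0 + alpha*v1 + beta*v2.
    x = [v0[i] + alpha * v1[i] + beta * v2[i] for i in range(5)]
    return [sum(A[r][i] * x[i] for i in range(5)) - b[r] for r in range(4)]

# Any choice of alpha and beta gives a zero residual.
print(residual(0, 0))   # [0, 0, 0, 0]
print(residual(5, -7))  # [0, 0, 0, 0]
```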
Problems for Section 13
1. Let M = (aij) be an m× n matrix with aij ∈ F and n > m. Show that there exists (α1, . . . , αn) ∈ Fn such that
ai1α1 + ai2α2 + · · ·+ ainαn = 0, for all i = 1, . . . ,m. Interpret the result for linear systems.
2. Let A ∈ Fm×n have columns A1, . . . , An. Let b ∈ Fm. Show the following:
(a) The equation Ax = 0 has a non-zero solution if and only if A1, . . . , An are linearly dependent.
(b) The equation Ax = b has at least one solution if and only if b ∈ span{A1, . . . , An}.
(c) The equation Ax = b has at most one solution if and only if A1, . . . , An are linearly independent.
(d) The equation Ax = b has a unique solution if and only if rankA = rank[A|b] = n = number of unknowns.
3. Consider the system of linear equations: x1 − x2 + 2x3 − 3x4 = 7, 4x1 + 3x3 + x4 = 9, 2x1 − 5x2 + x3 = −2,
3x1 − x2 − x3 + 2x4 = −2. By determining ranks, decide whether the system has a solution. Using Gaussian
Elimination, determine whether the system has a solution. Connect the ranks with the process of Gaussian
Elimination. Determine the solution set of the system.
4. Using Gaussian Elimination, find all possible values of k ∈ R such that the system of linear equations:
x+ y + 2z − 5w = 3, 2x+ 5y − z − 9w = −3,
x− 2y + 6z − 7w = 7, 2x+ 2y + 2z + kw = −4
has more than one solution.
5. Prove: If U is a subspace of Fn and x ∈ Fn, then there exists a system of linear equations having n equations
and n unknowns, with coefficients in F, whose solution set equals x+ U.