EE448/528 Version 1.0 John Stensby
CH9.DOC Page 9-1
Chapter 9
Eigenvalues, Eigenvectors and Canonical Forms Under Similarity
Eigenvalues and eigenvectors play a prominent role in many applications of numerical
linear algebra and matrix theory. In this chapter, we provide basic results on this subject. Then,
we use these results to establish necessary and sufficient conditions for the diagonalization of a
square matrix under a similarity transformation. Finally, we develop the Jordan canonical form of
a matrix, a canonical form that has many applications.
Let T : U → U be a linear operator on a vector space U over the scalar field F. We are
interested in non-zero vectors $\vec{X}$ that map under T into scalar multiples of themselves. That is,
we are interested in non-zero vectors $\vec{X} \in U$ that satisfy
$$T[\vec{X}] = \lambda\vec{X} \tag{9-1}$$
for some scalar λ ∈ F. Such a vector $\vec{X}$ is said to be an eigenvector corresponding to the
eigenvalue λ.
Example
Let I : U → U be the identity operator. For every $\vec{X} \in U$, $I[\vec{X}] = \vec{X}$. Here, λ = 1 is an eigenvalue
of I, and every non-zero vector in U is an eigenvector.
Example
Let T : R^2 → R^2 rotate every vector counterclockwise by π/2 radians. The scalar field is R, the set
of real numbers. Note that no non-zero vector is mapped to a scalar multiple of itself. Hence
T : R^2 → R^2 has no eigenvalues or eigenvectors.
[Figure: a vector $\vec{X}$ and its image $T(\vec{X})$, rotated counterclockwise by π/2.]
This lack of eigenvalues and eigenvectors will not occur if we use F = C, the field of complex
numbers. Hence, in applications where eigenvalues play a role, we use the complex number field.
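The rotation example can be checked numerically. The sketch below assumes the standard matrix of a counterclockwise rotation by π/2 with respect to the usual basis of R^2 (this matrix is not written out in the text), and the helper name `matvec` is made up for illustration. It confirms that, over C, the pairs λ = +j, [1 -j]^T and λ = -j, [1 +j]^T satisfy the eigenvalue equation.

```python
# Counterclockwise rotation by pi/2 in R^2, written with respect to the usual basis.
A = [[0, -1],
     [1,  0]]

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Candidate eigenpairs over the complex field (1j is Python's imaginary unit).
pairs = [(1j, [1, -1j]), (-1j, [1, 1j])]
for lam, x in pairs:
    # Check A x = lam x component by component.
    assert matvec(A, x) == [lam * xi for xi in x]
```

Over the real field the same matrix has characteristic polynomial λ^2 + 1, which has no real roots, in agreement with the example.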
Let $\vec{X}$ and λ be an eigenvector and eigenvalue, respectively, so that $T[\vec{X}] = \lambda\vec{X}$. Let c ∈ F
= C be any non-zero scalar. Then, by the linearity of T, we have
$$T[c\vec{X}] = \lambda(c\vec{X}) \tag{9-2}$$
so that $c\vec{X}$ is an eigenvector. Hence, eigenvectors are defined only up to an arbitrary non-zero scalar multiple.
Two or more linearly independent eigenvectors can be associated with a given eigenvalue.
In fact, for a given eigenvalue λ, the set
$$S_\lambda = \{\vec{X} \in U : T(\vec{X}) = \lambda\vec{X}\} \tag{9-3}$$
is a subspace known as the eigenspace associated with λ (note that $\vec{0}$ is in the eigenspace, but $\vec{0}$ is
not an eigenvector). Finally, the dimension of eigenspace $S_\lambda$ is known as the geometric
multiplicity of λ. In what follows, we use γ to denote the geometric multiplicity of an eigenvalue.
For a given basis, the transformation T : U → U can be represented by an n×n matrix A.
In terms of this basis, a representation for the eigenvectors can be given. Also, the eigenvalues
and eigenvectors satisfy
$$(A - \lambda I)\vec{X} = \vec{0}. \tag{9-4}$$
Hence, the eigenspace associated with eigenvalue λ is just the kernel of (A - λI).
While the matrix representing T is basis dependent, the eigenvalues and eigenvectors are
not. The eigenvalues of T : U → U can be found by computing the eigenvalues of any matrix that
represents T. Let n×n matrix A represent T : U → U with respect to some fixed basis. Then the
eigenvalues are the roots of the nth-order characteristic polynomial
$$|A - \lambda I| = \det(A - \lambda I) = 0 \tag{9-5}$$
(note the notation introduced here: |A| means the determinant of matrix A). The eigenvalues
can be complex or real-valued. They can occur as simple roots or as multiple roots of the
characteristic polynomial. The number of times (i.e., the multiplicity) that λ appears as a root of
det(A - λI) is called the algebraic multiplicity of λ. We use αk to denote the algebraic
multiplicity of eigenvalue λk. A basis for the eigenspace associated with λ can be found by
computing a basis for the kernel of (A - λI).
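Example
$$A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}, \quad \text{so that} \quad |\lambda I - A| = \begin{vmatrix} \lambda - 1 & 3 & -3 \\ -3 & \lambda + 5 & -3 \\ -6 & 6 & \lambda - 4 \end{vmatrix} = (\lambda + 2)^2(\lambda - 4) = 0.$$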
The distinct eigenvalues are λ1 = -2 and λ2 = 4. Eigenvalue λ1 = -2 has algebraic multiplicity α1 =
2, and eigenvalue λ2 = 4 has algebraic multiplicity α2 = 1. Now we find the eigenvectors.
Consider first the eigenvalue λ1 = -2. The matrix
$$[A - \lambda I]_{\lambda = \lambda_1 = -2} = \begin{bmatrix} 3 & -3 & 3 \\ 3 & -3 & 3 \\ 6 & -6 & 6 \end{bmatrix}$$
has a nullity of two, and $\vec{X}_{11} = [1\ 1\ 0]^T$ and $\vec{X}_{12} = [-1\ 0\ 1]^T$ are two linearly independent
eigenvectors that span the two-dimensional eigenspace associated with λ1 = -2. Hence λ1 = -2
has geometric and algebraic multiplicities of γ1 = α1 = 2. Now, consider λ2 = 4. The matrix
$$[A - \lambda I]_{\lambda = \lambda_2 = 4} = \begin{bmatrix} -3 & -3 & 3 \\ 3 & -9 & 3 \\ 6 & -6 & 0 \end{bmatrix}$$
has a nullity of 1, and $\vec{X}_2 = [1\ 1\ 2]^T$ spans the one-dimensional eigenspace associated with λ2 = 4.
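The geometric multiplicities in this example can be reproduced as n minus the rank of (A - λI). The following is a minimal sketch using exact rational arithmetic; the helper names `rank` and `shifted` are made up for illustration and are not part of the notes.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def shifted(A, lam):
    """Return the matrix A - lam*I."""
    return [[a - (lam if i == j else 0) for j, a in enumerate(row)]
            for i, row in enumerate(A)]

A = [[1, -3, 3],
     [3, -5, 3],
     [6, -6, 4]]

n = len(A)
# Geometric multiplicity = nullity of (A - lam*I) = n - rank(A - lam*I).
gamma = {lam: n - rank(shifted(A, lam)) for lam in (-2, 4)}
print(gamma)  # {-2: 2, 4: 1}
```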
Eigenvector Indexing
From time to time, subscripts and superscripts need to be placed on eigenvectors (and the
generalized eigenvectors that are introduced below). In the literature, there is not one indexing
scheme that is predominant (there are faults with all eigenvector indexing schemes). Notice the
indexing scheme that was introduced by the previous example. On some eigenvectors, we placed
two subscripts; we wrote $\vec{X}_{jk}$. The first subscript (the "j" subscript) associates the eigenvector
with one of the numerically distinct eigenvalues (each of which can have an algebraic multiplicity
greater than one); we have 1 ≤ j ≤ d, where d is the number of distinct eigenvalues. The second
subscript (the "k" subscript) orders the eigenvector in the set of independent eigenvectors
associated with the jth eigenvalue; we have 1 ≤ k ≤ γj, where γj is the geometric multiplicity of the
jth eigenvalue. Sometimes, we place only one subscript on an eigenvector. This one subscript
may associate the eigenvector with a distinct eigenvalue, or it may order the eigenvector in a set
of independent eigenvectors (or it may do both). When one subscript appears, its meaning can be
inferred from context (or its meaning will be stated explicitly). Finally, on an eigenvector,
subscripts are used only when necessary; we will drop all subscripts when they are not needed to
clarify notation.
Eigenvalues of Similar Matrices
Recall that n×n matrices A and B are said to be similar if there exists a nonsingular n×n
matrix P such that $A = P^{-1}BP$. The matrix representing a linear transformation depends on the
underlying basis; however, all matrices that represent a given linear transformation are similar to one another.
Furthermore, they have the same eigenvalues and represent the same eigenvectors (only the coordinate
representation of an eigenvector changes with the basis).
Theorem 9-1
Similar matrices have the same eigenvalues and eigenvectors.
Proof: This follows directly from the basic definitions since eigenvalues and eigenvectors are
associated with an underlying linear transformation and not with any particular matrix or vector
representation. Let $\vec{\beta}_1, \vec{\beta}_2, \ldots, \vec{\beta}_n$ and $\vec{\beta}_1', \vec{\beta}_2', \ldots, \vec{\beta}_n'$ denote the “old” and “new” bases,
respectively, for the vector space; in terms of a non-singular transformation matrix P, the “old”
and “new” bases are related as shown by (3-27). Let $\vec{X}$ and A denote the “old” representations for the
eigenvector and matrix, respectively. Then $\vec{X}' = P^{-1}\vec{X}$ and $A' = P^{-1}AP$ denote the “new” representations
for the eigenvector and matrix, respectively (see (3-29) and (3-41)). With respect to the “new”
basis, the “old” eigen problem $A\vec{X} = \lambda\vec{X}$ becomes
$$(PA'P^{-1})P\vec{X}' = \lambda(P\vec{X}'). \tag{9-6}$$
After multiplication on the left by $P^{-1}$, (9-6) becomes the “new” eigen problem
$$A'\vec{X}' = \lambda\vec{X}'. \tag{9-7}$$
So, while a similarity transformation changes the matrix and vector representations, it does not
change the underlying linear transformation or its eigenvalues/eigenvectors.♥
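As a complementary check (a standard determinant argument, not part of the proof given in these notes), similar matrices also share the same characteristic polynomial. A sketch, writing $A' = P^{-1}AP$ and using the product rule for determinants:

```latex
\begin{aligned}
\det(A' - \lambda I)
  &= \det\!\left(P^{-1}AP - \lambda P^{-1}P\right)
   = \det\!\left(P^{-1}(A - \lambda I)P\right) \\
  &= \det(P^{-1})\,\det(A - \lambda I)\,\det(P)
   = \det(A - \lambda I).
\end{aligned}
```

Since the characteristic polynomials agree, the eigenvalues and their algebraic multiplicities agree as well.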
Theorem 9-2
For eigenvalue λ, the geometric multiplicity γ does not exceed the algebraic multiplicity α.
Proof: The geometric multiplicity γ of eigenvalue λ is defined independently of any matrix
representing linear transformation T : U → U. The characteristic equation, eigenvalues and
eigenvectors are the same for all matrices that represent T. Hence, to represent transformation T,
we can choose the matrix that makes the proof of this theorem obvious. Let γ be the dimension
of eigenspace $S_\lambda$ (γ is the geometric multiplicity of λ). Let eigenvectors $\vec{X}_1, \ldots, \vec{X}_\gamma$ be a basis for
eigenspace $S_\lambda$ (eigenvector subscripts are used here as an index into the set of basis vectors). This
linearly independent set of eigenvectors can be extended to a basis
$$\underbrace{\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_\gamma}_{\text{eigenvectors that span eigenspace } S_\lambda},\ \underbrace{\vec{X}_{\gamma+1}, \vec{X}_{\gamma+2}, \ldots, \vec{X}_n}_{\text{any other independent vectors}} \tag{9-8}$$
of n-dimensional U. The vectors $\vec{X}_{\gamma+1}, \ldots, \vec{X}_n$ can be arbitrary as long as they are independent of
each other and independent of the first γ eigenvectors. Now, $T(\vec{X}_i) = \lambda\vec{X}_i$ for 1 ≤ i ≤ γ. With
respect to (9-8), the matrix A representing T has the form
$$A = \begin{bmatrix} \lambda I_\gamma & A_{\gamma,n-\gamma} \\ 0 & A_{n-\gamma,n-\gamma} \end{bmatrix}, \tag{9-9}$$
where the first block row has γ rows, the second block row has n-γ rows, the first block column has γ columns, and the
second block column has n-γ columns; $I_\gamma$ is the γ×γ identity matrix.
Sub-matrix $A_{\gamma,n-\gamma}$ is γ×(n-γ) and $A_{n-\gamma,n-\gamma}$ is (n-γ)×(n-γ). These sub-matrices are non-zero, in general;
the values they contain are of no concern to us. Since (9-9) is block upper triangular, its characteristic
polynomial (in the variable μ) factors as $\det(A - \mu I) = (\lambda - \mu)^{\gamma}\det(A_{n-\gamma,n-\gamma} - \mu I)$. From this it is evident that the
algebraic multiplicity of λ is at least equal to γ. Hence, for any eigenvalue, the algebraic
multiplicity ≥ geometric multiplicity (α ≥ γ).♥
Theorem 9-3
Let λ1, λ2, ... , λs be any s distinct eigenvalues, and let $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_s$ (subscripts are used
here to associate an eigenvector with a distinct eigenvalue) be the associated eigenvectors. These
s eigenvectors are linearly independent.
Proof (by contradiction)
Suppose the set of s vectors is dependent. Re-order the eigenvectors so that the first k are
linearly independent and the remaining s-k vectors are dependent on the first k vectors. Then, we
can write the unique representation
$$\vec{X}_s = \sum_{i=1}^{k} c_i\vec{X}_i, \quad c_i \in F. \tag{9-10}$$
Since $\vec{X}_s \neq \vec{0}$, there are non-zero $c_i$ in (9-10). Apply the linear transformation T to (9-10) and
obtain
$$\lambda_s\vec{X}_s = \sum_{i=1}^{k} c_i\lambda_i\vec{X}_i. \tag{9-11}$$
There are two possibilities. First, if λs = 0, then λi ≠ 0, 1 ≤ i ≤ k, since λ1, ... , λs are distinct. In this case,
(9-11) expresses $\vec{0}$ as a combination of $\vec{X}_1, \ldots, \vec{X}_k$ with at least one non-zero coefficient $c_i\lambda_i$, so λs =
0 implies that $\vec{X}_1, \ldots, \vec{X}_k$ are dependent, a contradiction. The second possibility is that λs ≠ 0, so
that we can write
$$\vec{X}_s = \sum_{i=1}^{k} c_i\left(\frac{\lambda_i}{\lambda_s}\right)\vec{X}_i. \tag{9-12}$$
Since there are non-zero $c_i$, and $\lambda_i/\lambda_s \neq 1$ due to distinct eigenvalues, (9-12) is different from
(9-10), a contradiction (since representation (9-10) is unique). Hence, for either possibility, we
have a contradiction, and the s eigenvectors $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_s$ are independent.♥ Note that the
converse of this theorem is not true (independent eigenvectors are not always associated with
distinct eigenvalues).
Theorem 9-3 tells us a lot about matrices with distinct eigenvalues (distinct eigenvalues
are a common occurrence in practical applications). An n×n matrix with n distinct eigenvalues has n linearly
independent eigenvectors. When this occurs, it is possible to use the n independent eigenvectors
to form a basis of n-dimensional U, a useful thing to do when proving theorems.
Let n×n matrix A represent T : U → U with respect to some fixed basis. Suppose T has n
linearly independent eigenvectors $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_n$ (subscripts are used to index the eigenvectors in
this set of n independent eigenvectors), and we use them as a basis of n-dimensional space U. We
want to find the matrix D that represents linear transformation T with respect to this basis of
eigenvectors. Use these independent eigenvectors to define the n×n transformation matrix
$$P \equiv \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}. \tag{9-13}$$
Then, with respect to the eigenvector basis, the matrix D that represents T is $D = P^{-1}AP$. But this
implies that AP = PD, a result that can be written as
$$A\begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix} = \begin{bmatrix} \lambda_1\vec{X}_1 & \lambda_2\vec{X}_2 & \cdots & \lambda_n\vec{X}_n \end{bmatrix} = \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}. \tag{9-14}$$
But, Equation (9-14) leads to the observation that
$$D = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} = \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}^{-1} A \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}. \tag{9-15}$$
Hence, when a basis of eigenvectors is used, the n×n matrix representing T : U → U is diagonal
with the eigenvalues appearing on the diagonal.
If n×n matrix A has n distinct eigenvalues, then there is a basis of eigenvectors that can be
used as columns of n×n matrix P. And, with a similarity transformation, matrix P can be used to
diagonalize matrix A. More generally, if each eigenvalue of A has equal geometric and algebraic
multiplicities, then there are n linearly independent eigenvectors, and A can be diagonalized as
described above.
The converse is true as well. That is, if an n×n nonsingular matrix P exists such that $P^{-1}AP$
is diagonal, then we can conclude
1. The eigenvalues of A appear on the diagonal of $P^{-1}AP$, and
2. The columns of P are n linearly independent eigenvectors of matrix A.
We have argued the following theorem.
Theorem 9-4
An n×n matrix A is similar to a diagonal matrix D if and only if there are n linearly independent
eigenvectors of A. Furthermore, the eigenvalues of A must appear on the diagonal of D.
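Example
$$A = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$$
det(A - λI) = λ^2 + 1, so that λ = ±j are the eigenvalues. λ1 = +j has the eigenvector $\vec{X}_1 = [1\ \ 1-j]^T$.
λ2 = -j has the eigenvector $\vec{X}_2 = [1\ \ 1+j]^T$. The eigenvalues are distinct, so $\vec{X}_1$ and $\vec{X}_2$ are
independent and
$$P = \begin{bmatrix} 1 & 1 \\ 1-j & 1+j \end{bmatrix} \ \Rightarrow\ P^{-1}AP = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix}.$$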
Example
$$A = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
has eigenvalues λ1 = 1 (α1 = 2), and λ2 = 2 (α2 = 1). The
eigenvectors are
λ1 = 1 ⇒ $\vec{X}_{11} = [1\ 0\ 0]^T$ and $\vec{X}_{12} = [0\ 1\ 0]^T$
λ2 = 2 ⇒ $\vec{X}_2 = [-1\ 0\ 1]^T$
Note that λ1 has equal algebraic and geometric multiplicities of two. Hence, $\vec{X}_{11}$, $\vec{X}_{12}$ and $\vec{X}_2$
comprise a basis of eigenvectors, and we have
$$P = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \ \Rightarrow\ P^{-1}AP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
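The two diagonalization examples above can be checked without inverting P by testing the equivalent relation AP = PD. A minimal sketch follows; the helper name `matmul` and the variable names are made up for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# 3x3 example: eigenvalue 1 is repeated but has a full set of eigenvectors.
A = [[1, 0, -1], [0, 1, 0], [0, 0, 2]]
P = [[1, 0, -1], [0, 1, 0], [0, 0, 1]]   # columns are X11, X12, X2
D = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
assert matmul(A, P) == matmul(P, D)

# 2x2 example with complex eigenvalues +j and -j (1j is the imaginary unit).
A2 = [[1, -1], [2, -1]]
P2 = [[1, 1], [1 - 1j, 1 + 1j]]          # columns are X1, X2
D2 = [[1j, 0], [0, -1j]]
assert matmul(A2, P2) == matmul(P2, D2)
```

Checking AP = PD rather than computing the inverse of P keeps the arithmetic exact for these small integer and Gaussian-integer entries.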
Example
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}$$
has eigenvalues λ1 = 1 (α1 = 2), and λ2 = 2 (α2 = 1). Since nullity(A - λ1I) = 1, we know that λ1
= 1 has a geometric multiplicity of γ1 = 1 but an algebraic multiplicity of α1 = 2. Hence, there is no
basis of eigenvectors, and matrix A cannot be diagonalized under similarity.
When there is not a basis of eigenvectors, n×n matrix A cannot be diagonalized.
However, we show that a nonsingular n×n matrix P exists such that $P^{-1}AP$ is “almost” diagonal;
our $P^{-1}AP$ has eigenvalues on its diagonal and “1s” immediately above some of the diagonal
eigenvalues. This new “almost diagonal” matrix is called the Jordan Canonical Form for A, and
it has many applications in engineering and the applied sciences. First, we must introduce the
subject of generalized eigenvectors.
Generalized Eigenvectors
Let A be an n×n matrix. For an eigenvalue λ, vector $\vec{X}$ is said to be a generalized
eigenvector of rank k > 0 if
$$(A - \lambda I)^{k}\vec{X} = \vec{0}, \qquad (A - \lambda I)^{k-1}\vec{X} \neq \vec{0}. \tag{9-16}$$
An “ordinary” eigenvector $\vec{X}$ is a generalized eigenvector of rank k = 1 since $(A - \lambda I)\vec{X} = \vec{0}$ and
$(A - \lambda I)^{0}\vec{X} = \vec{X} \neq \vec{0}$.
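Definition (9-16) translates directly into a small test. The sketch below uses a made-up helper name, `gen_rank`, and returns the rank of a non-zero vector as a generalized eigenvector for a given λ, or None if the vector is not a generalized eigenvector for that λ. It is applied to the non-diagonalizable 3×3 example given earlier. The loop stops at k = n, relying on the standard fact that $(A - \lambda I)^k\vec{X} = \vec{0}$ for some k implies the same for some k ≤ n.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def gen_rank(A, lam, x):
    """Smallest k >= 1 with (A - lam*I)^k x = 0 for non-zero x, else None."""
    n = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    v = list(x)
    for k in range(1, n + 1):
        v = matvec(B, v)          # v now holds (A - lam*I)^k x
        if not any(v):
            return k
    return None

A = [[1, 1, 2],
     [0, 1, 3],
     [0, 0, 2]]

print(gen_rank(A, 1, [1, 0, 0]))  # 1 (an "ordinary" eigenvector for lambda = 1)
print(gen_rank(A, 1, [0, 1, 0]))  # 2
print(gen_rank(A, 1, [5, 3, 1]))  # None (this vector belongs to lambda = 2)
```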
We develop a chain of generalized eigenvectors. For a given eigenvalue λ, let $\vec{X}$ be a
generalized eigenvector of rank k. Define the chain of k generalized eigenvectors as
$$\begin{aligned}
\vec{X}^{k} &\equiv \vec{X} \\
\vec{X}^{k-1} &\equiv (A - \lambda I)\vec{X} = (A - \lambda I)\vec{X}^{k} \\
\vec{X}^{k-2} &\equiv (A - \lambda I)^{2}\vec{X} = (A - \lambda I)\vec{X}^{k-1} \\
&\ \ \vdots \\
\vec{X}^{1} &\equiv (A - \lambda I)^{k-1}\vec{X} = (A - \lambda I)\vec{X}^{2}.
\end{aligned} \tag{9-17}$$
A superscript on a vector is not a power; it is used to indicate rank, and it is used as an index! On
$\vec{X}^{k}$, the k is used as a rank indicator and index; k is not a power (raising a vector to a power is an
operation that has not been defined!). Now, settle down, get over it! On a vector, the only time
we will use a superscript is when we are working with vectors in a chain of generalized
eigenvectors (we have already described how we want to use the subscript position(s)). On
generalized eigenvectors, superscripts are standard in the literature.
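For each i, 1 ≤ i ≤ k, $\vec{X}^{i}$ is a generalized eigenvector of rank i since
$$(A - \lambda I)^{i}\vec{X}^{i} = (A - \lambda I)^{i}(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^{k}\vec{X} = \vec{0}, \tag{9-18}$$
$$(A - \lambda I)^{i-1}\vec{X}^{i} = (A - \lambda I)^{i-1}(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^{k-1}\vec{X} \neq \vec{0}. \tag{9-19}$$
Note that $\vec{X}^{1}$ is an "ordinary" eigenvector since
$$(A - \lambda I)\vec{X}^{1} = (A - \lambda I)(A - \lambda I)^{k-1}\vec{X} = (A - \lambda I)^{k}\vec{X} = \vec{0}. \tag{9-20}$$
As mentioned above, we call $\vec{X}^{1}, \vec{X}^{2}, \ldots, \vec{X}^{k}$ a chain of generalized eigenvectors. Now, we
examine some properties that chains have.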
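The chain construction (9-17) amounts to repeated multiplication by (A - λI). A minimal sketch follows, with made-up helper names (`matvec`, `chain`); it is applied to the non-diagonalizable 3×3 example given earlier, where [0 1 0]^T has rank 2 for λ = 1.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def chain(A, lam, x, k):
    """Return [X^1, ..., X^k], where X^k = x and X^i = (A - lam*I) X^(i+1).

    The caller is assumed to supply a generalized eigenvector x of rank k.
    """
    n = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    vectors = [list(x)]                  # start at the top of the chain, X^k
    for _ in range(k - 1):
        vectors.append(matvec(B, vectors[-1]))
    return vectors[::-1]                 # reorder as X^1, ..., X^k

A = [[1, 1, 2],
     [0, 1, 3],
     [0, 0, 2]]

print(chain(A, 1, [0, 1, 0], 2))  # [[1, 0, 0], [0, 1, 0]]
```

The first vector returned, X^1, is the "ordinary" eigenvector that anchors the chain.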
Theorem 9-5
A chain $\vec{X}^{1}, \vec{X}^{2}, \ldots, \vec{X}^{k}$ of generalized eigenvectors is linearly independent.
Proof (by contradiction)
For the moment, assume that the vectors in the chain are dependent. Then there exist constants
c1, c2, ... , ck, not all zero, such that
$$c_1\vec{X}^{1} + c_2\vec{X}^{2} + \cdots + c_k\vec{X}^{k} = \vec{0}. \tag{9-21}$$
First, note that for i = 1, 2, ... , k-1 we can write
$$(A - \lambda I)^{k-1}\vec{X}^{i} = (A - \lambda I)^{k-1}(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^{2k-(i+1)}\vec{X} = \vec{0}, \tag{9-22}$$
a result we will use very soon. Now, apply $(A - \lambda I)^{k-1}$ to both sides of (9-21) to obtain
$$(A - \lambda I)^{k-1}\{c_1\vec{X}^{1} + c_2\vec{X}^{2} + \cdots + c_k\vec{X}^{k}\} = \vec{0}. \tag{9-23}$$
Use (9-22) in (9-23) to obtain
$$c_k(A - \lambda I)^{k-1}\vec{X}^{k} = \vec{0}. \tag{9-24}$$
But, we know that $(A - \lambda I)^{k-1}\vec{X}^{k} \neq \vec{0}$. Hence, we must have ck = 0 so that (9-21) becomes
$$c_1\vec{X}^{1} + c_2\vec{X}^{2} + \cdots + c_{k-1}\vec{X}^{k-1} = \vec{0}. \tag{9-25}$$
On this equation, repeat the procedure that starts with (9-22). That is, multiply (9-25) by
$(A - \lambda I)^{k-2}$, and repeat the above argument (that produced ck = 0) to reach the conclusion that ck-1 = 0.
This same argument can be repeated a sufficient number of times to conclude that ci = 0,
1 ≤ i ≤ k. This contradiction leads to the conclusion that the chain $\vec{X}^{1}, \vec{X}^{2}, \ldots, \vec{X}^{k}$ consists of
linearly independent generalized eigenvectors.♥
Theorem 9-6
Let λ1 ≠ λ2 be two eigenvalues of n×n matrix A. Suppose $\vec{X}$ is a generalized eigenvector
of rank k associated with λ1 and $\vec{Y}$ is a generalized eigenvector of rank m associated with λ2.
Define the two chains
$$\vec{X}^{k} = \vec{X}, \ \text{and} \ \vec{X}^{i} = (A - \lambda_1 I)\vec{X}^{i+1} = (A - \lambda_1 I)^{k-i}\vec{X} \ \text{for } i = k-1, k-2, \ldots, 1 \tag{9-26}$$
$$\vec{Y}^{m} = \vec{Y}, \ \text{and} \ \vec{Y}^{j} = (A - \lambda_2 I)\vec{Y}^{j+1} = (A - \lambda_2 I)^{m-j}\vec{Y} \ \text{for } j = m-1, m-2, \ldots, 1 \tag{9-27}$$
The set of k+m vectors described by (9-26) and (9-27) is linearly independent. Equivalently, any
generalized eigenvector from one chain is independent of the vectors in the other chain.
Proof (by contradiction)
Suppose there is an i, 1 ≤ i ≤ k, for which $\vec{X}^{i}$ is linearly dependent on the chain $\vec{Y}^{1}, \vec{Y}^{2}, \ldots, \vec{Y}^{m}$.
Then, there exist constants c1, ... , cm, not all zero, such that
$$\vec{X}^{i} = \sum_{j=1}^{m} c_j\vec{Y}^{j}. \tag{9-28}$$
Multiply (9-28) by $(A - \lambda_1 I)^{i}$, and use the fact that
$$(A - \lambda_1 I)^{i}\vec{X}^{i} = (A - \lambda_1 I)^{i}(A - \lambda_1 I)^{k-i}\vec{X} = \vec{0} \tag{9-29}$$
to obtain
$$(A - \lambda_1 I)^{i}\sum_{j=1}^{m} c_j\vec{Y}^{j} = \vec{0}. \tag{9-30}$$
Now, multiply (9-30) by $(A - \lambda_2 I)^{m-1}$, and use the facts
i) $(A - \lambda_2 I)^{m-1}(A - \lambda_1 I)^{i} = (A - \lambda_1 I)^{i}(A - \lambda_2 I)^{m-1}$
ii) $(A - \lambda_2 I)^{m-1}\vec{Y}^{j} = \vec{0}$ for j = m-1, m-2, ... , 1
to obtain
$$(A - \lambda_1 I)^{i}(A - \lambda_2 I)^{m-1}c_m\vec{Y}^{m} = c_m(A - \lambda_1 I)^{i}\vec{Y}^{1} = \vec{0}. \tag{9-31}$$
Now, $\vec{Y}^{1}$ is an "ordinary" eigenvector: $A\vec{Y}^{1} = \lambda_2\vec{Y}^{1}$, so that $(A - \lambda_1 I)\vec{Y}^{1} = (\lambda_2 - \lambda_1)\vec{Y}^{1}$ and (9-31) becomes
$$c_m(\lambda_2 - \lambda_1)^{i}\vec{Y}^{1} = \vec{0}. \tag{9-32}$$
Since λ2 ≠ λ1 and $\vec{Y}^{1} \neq \vec{0}$, we must have cm = 0 so that (9-30) becomes
$$(A - \lambda_1 I)^{i}\sum_{j=1}^{m-1} c_j\vec{Y}^{j} = \vec{0}. \tag{9-33}$$
Now repeat the argument that started with (9-30) and produced cm = 0. That is, multiply (9-33)
by $(A - \lambda_2 I)^{m-2}$, follow the argument, and conclude that cm-1 = 0. Continue this process to the
conclusion that cj = 0 for j = m, m-1, m-2, ... , 1. This contradiction (the cj's are not all zero)
leads to the conclusion that $\vec{X}^{i}$ is independent of $\vec{Y}^{1}, \vec{Y}^{2}, \ldots, \vec{Y}^{m}$. Hence, the two chains
$\vec{Y}^{1}, \vec{Y}^{2}, \ldots, \vec{Y}^{m}$ and $\vec{X}^{1}, \vec{X}^{2}, \ldots, \vec{X}^{k}$ contain m+k linearly independent vectors.♥
Theorem 9-7
Let $\vec{Y}$ and $\vec{X}$ be generalized eigenvectors of rank m and k, respectively, associated with
the same eigenvalue λ. Define the two chains
$$\vec{X}^{k} = \vec{X}, \ \text{and} \ \vec{X}^{i} = (A - \lambda I)\vec{X}^{i+1} = (A - \lambda I)^{k-i}\vec{X} \ \text{for } i = k-1, k-2, \ldots, 1 \tag{9-34}$$
$$\vec{Y}^{m} = \vec{Y}, \ \text{and} \ \vec{Y}^{j} = (A - \lambda I)\vec{Y}^{j+1} = (A - \lambda I)^{m-j}\vec{Y} \ \text{for } j = m-1, m-2, \ldots, 1 \tag{9-35}$$
If the "ordinary" eigenvectors $\vec{Y}^{1}$ and $\vec{X}^{1}$ are independent, then so are the two chains (i.e., (9-34)
and (9-35) describe m+k independent vectors).
Proof
Similar to the proof of Theorem 9-6.♥
Theorems 9-5, 9-6 and 9-7 provide the basis of our generalized eigenvector theory. Note
that we have shown an important result:
Associated with eigenvalue λ are γ distinct chains of generalized eigenvectors (γ is the geometric multiplicity of λ).
Each chain is "anchored" by an "ordinary" eigenvector (of rank one). In these γ chains, the total number of generalized
eigenvectors is α, the algebraic multiplicity of λ. And, these α vectors are linearly independent.
Note that we have not discussed how many vectors are in each chain. We have argued only that
there are a total of α generalized eigenvectors divided into γ chains associated with λ. While n×n
matrix A may, or may not, have n independent eigenvectors, it always has n independent
generalized eigenvectors.
Eigenvector Indexing - Revisited
It's time once more to consider generalized eigenvector indexing. A generalized
eigenvector can have two subscripts and one superscript. The meanings of the two subscripts are
given above in the section on eigenvector indexing (which is worth reading again). The
superscript is used as both a rank indicator and an index into a chain. For example, consider the
generalized eigenvector $\vec{X}_{jk}^{\,l}$. The "j" subscript associates the generalized eigenvector with
eigenvalue λj (1 ≤ j ≤ d, where d is the number of numerically distinct eigenvalues). The "k"
subscript associates the generalized eigenvector with a particular chain of independent generalized
eigenvectors for λj (1 ≤ k ≤ γj, where γj is the geometric multiplicity of λj). As described above,
superscript l is a rank indicator, and it is an index into the kth chain of generalized eigenvectors
associated with λj. Finally, note that $\vec{X}_{jk}^{1}$ is the kth "ordinary" eigenvector associated with λj.
Listing of all Generalized Eigenvectors
Let λ1, λ2, ... , λd denote the numerically distinct eigenvalues of an n×n matrix A . For 1 ≤
k ≤ d, eigenvalue λk has an algebraic multiplicity of αk and a geometric multiplicity of γk.
Furthermore, for 1 ≤ k ≤ d, eigenvalue λk is associated with γk separate chains of generalized
eigenvectors containing a total (in all of the γk chains) of αk independent generalized eigenvectors.
Finally, taken all together, for the d numerically distinct eigenvalues, a total of n generalized
eigenvectors exist, considering all of the vectors in all of the chains.
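We can list these n generalized eigenvectors. Using the indexing scheme outlined above,
we write
$$\begin{aligned}
&\text{The } \alpha_1 \text{ generalized eigenvectors for } \lambda_1 \text{ are divided into } \gamma_1 \text{ chains:}
&&\begin{cases}
\vec{X}_{11}^{1},\ \vec{X}_{11}^{2},\ \ldots,\ \vec{X}_{11}^{h_{11}} \\
\vec{X}_{12}^{1},\ \vec{X}_{12}^{2},\ \ldots,\ \vec{X}_{12}^{h_{12}} \\
\quad\vdots \\
\vec{X}_{1\gamma_1}^{1},\ \vec{X}_{1\gamma_1}^{2},\ \ldots,\ \vec{X}_{1\gamma_1}^{h_{1\gamma_1}}
\end{cases} \\
&\text{The } \alpha_2 \text{ generalized eigenvectors for } \lambda_2 \text{ are divided into } \gamma_2 \text{ chains:}
&&\begin{cases}
\vec{X}_{21}^{1},\ \vec{X}_{21}^{2},\ \ldots,\ \vec{X}_{21}^{h_{21}} \\
\vec{X}_{22}^{1},\ \vec{X}_{22}^{2},\ \ldots,\ \vec{X}_{22}^{h_{22}} \\
\quad\vdots \\
\vec{X}_{2\gamma_2}^{1},\ \vec{X}_{2\gamma_2}^{2},\ \ldots,\ \vec{X}_{2\gamma_2}^{h_{2\gamma_2}}
\end{cases} \\
&\qquad\vdots \\
&\text{The } \alpha_d \text{ generalized eigenvectors for } \lambda_d \text{ are divided into } \gamma_d \text{ chains:}
&&\begin{cases}
\vec{X}_{d1}^{1},\ \vec{X}_{d1}^{2},\ \ldots,\ \vec{X}_{d1}^{h_{d1}} \\
\vec{X}_{d2}^{1},\ \vec{X}_{d2}^{2},\ \ldots,\ \vec{X}_{d2}^{h_{d2}} \\
\quad\vdots \\
\vec{X}_{d\gamma_d}^{1},\ \vec{X}_{d\gamma_d}^{2},\ \ldots,\ \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}
\end{cases}
\end{aligned} \tag{9-36}$$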
Here, $h_{kj}$, 1 ≤ k ≤ d, 1 ≤ j ≤ γk, denotes the number of generalized eigenvectors in the jth chain
associated with the numerically distinct eigenvalue λk. Integer $h_{kj}$ has to be computed as outlined
in the example given below. As stated in the list given above, we have
$$\alpha_k = \sum_{j=1}^{\gamma_k} h_{kj}. \tag{9-37}$$
Also, we denote the total number of chains as
$$\nu = \sum_{k=1}^{d} \gamma_k. \tag{9-38}$$
Finally, for an n×n matrix A, we have
$$n = \sum_{k=1}^{d} \alpha_k = \sum_{k=1}^{d}\sum_{j=1}^{\gamma_k} h_{kj}. \tag{9-39}$$
An n×n matrix A may, or may not, have n linearly independent eigenvectors. However, it always
has n linearly independent generalized eigenvectors.
Example Reconsider the previous example where
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}.$$
Eigenvalue λ1 = 1 has an algebraic multiplicity of α1 = 2 and a geometric multiplicity of γ1 = 1;
$\vec{X}_{11}^{1} = [1\ 0\ 0]^T$ is an "ordinary" eigenvector for λ1. Eigenvalue λ2 = 2 has geometric and
algebraic multiplicities of 1; $\vec{X}_{21}^{1} = [5\ 3\ 1]^T$ is an "ordinary" eigenvector for λ2. We are one
eigenvector short; the matrix A cannot be diagonalized by a similarity transformation. However,
we can find two generalized eigenvectors associated with λ1 = 1. Let's find a chain of length two
associated with λ1 = 1. These two generalized eigenvectors, when combined with $\vec{X}_{21}^{1}$, will
produce a basis of generalized eigenvectors. First, find a non-zero $\vec{X}$ such that
$$(A - \lambda_1 I)\vec{X} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} \neq \vec{0}$$
$$(A - \lambda_1 I)^{2}\vec{X} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} = \begin{bmatrix} 0 & 0 & 5 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} = \vec{0}.$$
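Clearly, $\vec{X} = [0\ 1\ 0]^T$ is a generalized eigenvector of rank 2, and we use this vector to write
$$\vec{X}_{11}^{2} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad \vec{X}_{11}^{1} = (A - \lambda_1 I)\vec{X}_{11}^{2} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.$$
$\{\vec{X}_{11}^{1}, \vec{X}_{11}^{2}\}$ is a chain of length two associated with λ1 = 1. The vectors $\vec{X}_{11}^{1} = [1\ 0\ 0]^T$, $\vec{X}_{11}^{2} = [0\ 1\ 0]^T$,
$\vec{X}_{21}^{1} = [5\ 3\ 1]^T$ form a basis of generalized eigenvectors. With respect to this basis,
let's find the matrix A′ that represents the underlying transformation. Define the 3×3 non-singular
matrix $P \equiv [\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]$, and compute $A' = P^{-1}AP$. We compute A′ by considering the
equivalent equation $PA' = AP \Leftrightarrow [\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]A' = A[\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]$, so that
$$A\vec{X}_{11}^{1} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1\cdot\vec{X}_{11}^{1} + 0\cdot\vec{X}_{11}^{2} + 0\cdot\vec{X}_{21}^{1}$$
$$A\vec{X}_{11}^{2} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 1\cdot\vec{X}_{11}^{1} + 1\cdot\vec{X}_{11}^{2} + 0\cdot\vec{X}_{21}^{1}$$
$$A\vec{X}_{21}^{1} = \begin{bmatrix} 10 \\ 6 \\ 2 \end{bmatrix} = 0\cdot\vec{X}_{11}^{1} + 0\cdot\vec{X}_{11}^{2} + 2\cdot\vec{X}_{21}^{1}$$
As a result, we see that
$$A' = P^{-1}AP = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
Note that A′ has two blocks on its diagonal; we write A′ as
$$A' = \begin{bmatrix} J_1 & \\ & J_2 \end{bmatrix}, \qquad J_1 \equiv \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad J_2 \equiv 2.$$
Matrix A′ is known as the Jordan Canonical Form for matrix A.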
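The result of this example can be confirmed by checking PA′ = AP directly, which avoids computing the inverse of P. A minimal sketch, with made-up helper and variable names:

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 1, 2],
     [0, 1, 3],
     [0, 0, 2]]

# Basis of generalized eigenvectors from the example: X11^1, X11^2, X21^1.
basis = [[1, 0, 0], [0, 1, 0], [5, 3, 1]]
P = [list(row) for row in zip(*basis)]   # the basis vectors become the columns of P

# Jordan form found in the example.
A_prime = [[1, 1, 0],
           [0, 1, 0],
           [0, 0, 2]]

assert matmul(P, A_prime) == matmul(A, P)
```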
Jordan Canonical Form
This procedure can be applied to transform any n×n matrix into its block-diagonal Jordan
canonical form. Let λ1, λ2, ... , λd be the numerically distinct eigenvalues of n×n matrix A. For 1
≤ k ≤ d, let λk have algebraic multiplicity αk and geometric multiplicity of γk. As outlined above,
eigenvalue λk is associated with γk chains containing a total of αk generalized eigenvectors, and
each chain is "anchored" by an "ordinary" eigenvector. As listed by (9-36), there are a total of n
linearly-independent generalized eigenvectors split up into ν chains. We use these n generalized
eigenvectors to define the n×n transformation matrix
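$$\begin{aligned}
P = \Big[\ &\underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{\text{chain \#1 for } \lambda_1}\ \ \underbrace{\vec{X}_{12}^{1} \cdots \vec{X}_{12}^{h_{12}}}_{\text{chain \#2 for } \lambda_1}\ \cdots\ \underbrace{\vec{X}_{1\gamma_1}^{1} \cdots \vec{X}_{1\gamma_1}^{h_{1\gamma_1}}}_{\text{chain \#}\gamma_1 \text{ for } \lambda_1} \\
&\underbrace{\vec{X}_{21}^{1} \cdots \vec{X}_{21}^{h_{21}}}_{\text{chain \#1 for } \lambda_2}\ \ \underbrace{\vec{X}_{22}^{1} \cdots \vec{X}_{22}^{h_{22}}}_{\text{chain \#2 for } \lambda_2}\ \cdots\ \underbrace{\vec{X}_{2\gamma_2}^{1} \cdots \vec{X}_{2\gamma_2}^{h_{2\gamma_2}}}_{\text{chain \#}\gamma_2 \text{ for } \lambda_2} \\
&\qquad\cdots \\
&\underbrace{\vec{X}_{d1}^{1} \cdots \vec{X}_{d1}^{h_{d1}}}_{\text{chain \#1 for } \lambda_d}\ \ \underbrace{\vec{X}_{d2}^{1} \cdots \vec{X}_{d2}^{h_{d2}}}_{\text{chain \#2 for } \lambda_d}\ \cdots\ \underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\text{chain \#}\gamma_d \text{ for } \lambda_d}\ \Big].
\end{aligned} \tag{9-40}$$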
By using the similarity transformation $A' = P^{-1}AP$, matrix P, given by (9-40), can be used
to transform n×n matrix A into its Jordan Canonical Form. This canonical form is a block
diagonal matrix
$$A' = P^{-1}AP = \begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_\nu \end{bmatrix} \tag{9-41}$$
made from ν blocks Jk, 1 ≤ k ≤ ν, one block for each chain of generalized eigenvectors. Note that
(9-41) is equivalent to AP = PA′, a matrix equation that can be written as
$$\begin{aligned}
AP &= A\Big[\ \underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{\text{1st chain}}\ \cdots\ \underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\nu\text{th chain}}\ \Big] \\
&= \Big[\ \underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{\text{1st chain}}\ \cdots\ \underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\nu\text{th chain}}\ \Big]\begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_\nu \end{bmatrix} = PA'.
\end{aligned} \tag{9-42}$$
Let's examine the structure of a typical block. Consider $\vec{X}_{jk}^{1}, \vec{X}_{jk}^{2}, \ldots, \vec{X}_{jk}^{h_{jk}}$, the kth chain
associated with λj, the jth distinct eigenvalue. The Jordan block for this chain is Jp, where
$$p = k + \sum_{i=1}^{j-1} \gamma_i. \tag{9-43}$$
From the basic definition of this chain, we have
$$\begin{aligned}
\vec{X}^{h_{jk}-1} &= (A - \lambda_j I)\vec{X}^{h_{jk}} &&\Rightarrow\ A\vec{X}^{h_{jk}} = \lambda_j\vec{X}^{h_{jk}} + \vec{X}^{h_{jk}-1} \\
\vec{X}^{h_{jk}-2} &= (A - \lambda_j I)\vec{X}^{h_{jk}-1} &&\Rightarrow\ A\vec{X}^{h_{jk}-1} = \lambda_j\vec{X}^{h_{jk}-1} + \vec{X}^{h_{jk}-2} \\
&\ \ \vdots \\
\vec{X}^{1} &= (A - \lambda_j I)\vec{X}^{2} &&\Rightarrow\ A\vec{X}^{2} = \lambda_j\vec{X}^{2} + \vec{X}^{1},
\end{aligned} \tag{9-44}$$
where we have omitted the common subscripts jk on all generalized eigenvectors. From PA′ =
AP (see (9-42)), we have the requirement
$$\begin{bmatrix} \vec{X}_{jk}^{1} & \vec{X}_{jk}^{2} & \cdots & \vec{X}_{jk}^{h_{jk}} \end{bmatrix} J_p = A\begin{bmatrix} \vec{X}_{jk}^{1} & \vec{X}_{jk}^{2} & \cdots & \vec{X}_{jk}^{h_{jk}} \end{bmatrix}, \tag{9-45}$$
where p is given by (9-43). However, from (9-44) and the requirement $A\vec{X}_{jk}^{1} = \lambda_j\vec{X}_{jk}^{1}$, it is easy
to see that
$$J_p = \begin{bmatrix} \lambda_j & 1 & 0 & \cdots & 0 \\ 0 & \lambda_j & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & & & \lambda_j & 1 \\ 0 & 0 & \cdots & 0 & \lambda_j \end{bmatrix} \quad (h_{jk} \text{ rows},\ h_{jk} \text{ columns}). \tag{9-46}$$
That is, Jp is an $h_{jk} \times h_{jk}$ matrix with λj on its diagonal, "1s" on its first "super diagonal", and zeros
everywhere else.
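The block structure just described is easy to generate programmatically. The following is a minimal sketch with made-up helper names (`jordan_block`, `block_diag`); it rebuilds the Jordan form of the 3×3 example above from one block of size 2 for λ = 1 and one block of size 1 for λ = 2.

```python
def jordan_block(lam, h):
    """h x h block: lam on the diagonal, 1s on the first superdiagonal, 0 elsewhere."""
    return [[lam if j == i else 1 if j == i + 1 else 0 for j in range(h)]
            for i in range(h)]

def block_diag(blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                M[offset + i][offset + j] = v
        offset += len(b)
    return M

J = block_diag([jordan_block(1, 2), jordan_block(2, 1)])
print(J)  # [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
```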
Computational Procedure for Jordan Form
For many low-dimensional problems of practical interest, the Jordan form can be
computed "by hand" without too much effort. A computational procedure for computing the
Jordan form is outlined below.
1. Compute the eigenvalues and "ordinary" eigenvectors of n×n matrix A; determine the algebraic
and geometric multiplicities of the eigenvalues. The distinct eigenvalues are λ1, λ2, ... , λd; for 1 ≤
k ≤ d, eigenvalue λk has algebraic multiplicity αk and geometric multiplicity γk.
2. In γ1 distinct chains, compute a total of α1 independent, generalized eigenvectors for λ1. To
accomplish this, compute $(A - \lambda_1 I)^{i}$ for i = 1, 2, ... until the rank of $(A - \lambda_1 I)^{k}$ is equal to the rank
of $(A - \lambda_1 I)^{k+1}$. Then, compute a rank k generalized eigenvector and its k-long chain. If k = α1,
go to step #3. Otherwise, look for a second rank-k vector and its chain. If a second rank-k
vector does not exist, look for one of rank k-1, and so on, until we have γ1 distinct chains containing a total of α1
generalized eigenvectors.
3. Repeat step #2 for the remaining eigenvalues λ2, ... , λd.
4. Write down the Jordan form. For eigenvalue λj, the kth chain is of length $h_{jk}$ (determined in
step #2), and there is an $h_{jk} \times h_{jk}$ Jordan block with λj on its diagonal.
In the Jordan form, the ordering of the blocks is not critical. However, it is common to
keep sequential all blocks associated with the same eigenvalue.
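The rank bookkeeping in step #2 can be automated. The sketch below is an illustration only, not the hand procedure itself: it uses the standard fact that the number of chains of length at least i equals rank((A - λI)^(i-1)) - rank((A - λI)^i), and the helper names (`rank`, `matmul`, `jordan_block_sizes`) are made up. It is applied to the 6×6 matrix of the example that follows.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def jordan_block_sizes(A, lam):
    """Chain lengths (Jordan block sizes) for eigenvalue lam, largest first."""
    n = len(A)
    B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    ranks = [n]                      # rank of B^0 = I
    power = B
    while True:
        ranks.append(rank(power))
        if ranks[-1] == ranks[-2]:   # the rank has stopped changing
            break
        power = matmul(power, B)
    # ranks[i-1] - ranks[i] is the number of chains of length >= i.
    at_least = [ranks[i - 1] - ranks[i] for i in range(1, len(ranks))]
    sizes = []
    for i, count in enumerate(at_least, start=1):
        longer = at_least[i] if i < len(at_least) else 0
        sizes.extend([i] * (count - longer))   # chains of length exactly i
    return sorted(sizes, reverse=True)

A = [[3, -1,  1,  1,  0,  0],
     [1,  1, -1, -1,  0,  0],
     [0,  0,  2,  0,  1,  1],
     [0,  0,  0,  2, -1, -1],
     [0,  0,  0,  0,  1,  1],
     [0,  0,  0,  0,  1,  1]]

print(jordan_block_sizes(A, 2))  # [3, 2]
print(jordan_block_sizes(A, 0))  # [1]
```

The block sizes sum to the algebraic multiplicity of each eigenvalue, and the number of blocks equals its geometric multiplicity.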
Example
$$A = \begin{bmatrix}
3 & -1 & 1 & 1 & 0 & 0 \\
1 & 1 & -1 & -1 & 0 & 0 \\
0 & 0 & 2 & 0 & 1 & 1 \\
0 & 0 & 0 & 2 & -1 & -1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 1
\end{bmatrix}$$
Compute the eigenvalues and algebraic multiplicities. Note that $\det(A - \lambda I) = \lambda(\lambda - 2)^5$, and this
implies that λ1 = 2 with α1 = 5 and λ2 = 0 with α2 = 1. Furthermore, eigenvalue λ1 = 2 has the
two independent eigenvectors
$$\vec{X}_{11}^{1} = [1\ 1\ 0\ 0\ 0\ 0]^T, \qquad \vec{X}_{12}^{1} = [0\ 0\ 1\ {-1}\ 0\ 0]^T,$$
so γ1 = 2. Also, λ2 = 0 has the single eigenvector
$$\vec{X}_{21}^{1} = [0\ 0\ 0\ 0\ 1\ {-1}]^T,$$
so γ2 = 1. Now, compute $(A - \lambda_1 I)^{i}$ for increasing i until the rank no longer changes.
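$$(A - 2I) = \begin{bmatrix}
1 & -1 & 1 & 1 & 0 & 0 \\
1 & -1 & -1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & -1 & -1 \\
0 & 0 & 0 & 0 & -1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix} \text{ has rank equal to 4.}$$
$$(A - 2I)^2 = \begin{bmatrix}
0 & 0 & 2 & 2 & 0 & 0 \\
0 & 0 & 2 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & -2 \\
0 & 0 & 0 & 0 & -2 & 2
\end{bmatrix} \text{ has rank equal to 2.}$$
$$(A - 2I)^3 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -4 & 4 \\
0 & 0 & 0 & 0 & 4 & -4
\end{bmatrix} \text{ has rank equal to 1.}$$
$$(A - 2I)^4 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 8 & -8 \\
0 & 0 & 0 & 0 & -8 & 8
\end{bmatrix} \text{ has rank equal to 1.}$$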
The rank of $(A - 2I)^3$ is equal to the rank of $(A - 2I)^4$; hence, there is a rank 3 generalized
eigenvector that is in $K((A - 2I)^3)$ but not in $K((A - 2I)^2)$. It is easily computed as
$\vec{X}_{11}^{3} = [0\ 0\ 1\ 0\ 0\ 0]^T$ since $(A - 2I)^3\vec{X}_{11}^{3} = \vec{0}$ but $(A - 2I)^2\vec{X}_{11}^{3} \neq \vec{0}$. Now, we compute the first 3-long chain
associated with λ1.
$$\vec{X}_{11}^{1} \equiv (A - 2I)^2\vec{X}_{11}^{3} = \begin{bmatrix} 2 \\ 2 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{11}^{2} \equiv (A - 2I)\vec{X}_{11}^{3} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{11}^{3} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}.$$
Since α1 = 5, there are two more generalized eigenvectors associated with λ1; inspection of
$(A - 2I)^2$ and $(A - 2I)$ reveals where they are. There is a generalized eigenvector of rank 2 that is in
$K((A - 2I)^2)$ but not in $K(A - 2I)$. This rank 2 vector is $\vec{X}_{12}^{2} = [0\ 0\ 0\ 0\ 1\ 1]^T$; note that
$(A - 2I)^2\vec{X}_{12}^{2} = \vec{0}$ but $(A - 2I)\vec{X}_{12}^{2} \neq \vec{0}$. Hence, our second chain associated with λ1 = 2 is
$$\vec{X}_{12}^{1} \equiv (A - 2I)\vec{X}_{12}^{2} = \begin{bmatrix} 0 \\ 0 \\ 2 \\ -2 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{12}^{2} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}.$$
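We have 5 generalized eigenvectors associated with λ1 = 2; there are no more. With the
eigenvector $\vec{X}_{21}^{1} = [0\ 0\ 0\ 0\ 1\ {-1}]^T$, we have a basis of 6 generalized eigenvectors that we
can use to write the transformation matrix
$$P = \begin{bmatrix} \vec{X}_{11}^{1} & \vec{X}_{11}^{2} & \vec{X}_{11}^{3} & \vec{X}_{12}^{1} & \vec{X}_{12}^{2} & \vec{X}_{21}^{1} \end{bmatrix} = \begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 \\
0 & 0 & 0 & -2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}.$$
We write down (no computation is necessary) the Jordan canonical form
$$A' = \begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.$$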
Note that A′ contains the three Jordan blocks
$$J_1 = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix}, \qquad J_2 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad J_3 = 0.$$
It is easy to see that A′ satisfies PA′ = AP (so that $A' = P^{-1}AP$). As a MatLab exercise, enter P
and A as described above, and type inv(P)*A*P at the command prompt. MatLab will return
the Jordan canonical form A′ given above.
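For readers without MatLab, the same check can be sketched in Python by testing the equivalent relation AP = PA′, which needs no matrix inverse. The helper name `matmul` and the variable names are made up for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[3, -1,  1,  1,  0,  0],
     [1,  1, -1, -1,  0,  0],
     [0,  0,  2,  0,  1,  1],
     [0,  0,  0,  2, -1, -1],
     [0,  0,  0,  0,  1,  1],
     [0,  0,  0,  0,  1,  1]]

# Generalized eigenvectors in the order used for the columns of P:
# X11^1, X11^2, X11^3 (first chain), X12^1, X12^2 (second chain), X21^1.
basis = [[2,  2, 0,  0, 0,  0],
         [1, -1, 0,  0, 0,  0],
         [0,  0, 1,  0, 0,  0],
         [0,  0, 2, -2, 0,  0],
         [0,  0, 0,  0, 1,  1],
         [0,  0, 0,  0, 1, -1]]
P = [list(row) for row in zip(*basis)]   # the basis vectors become columns

A_prime = [[2, 1, 0, 0, 0, 0],
           [0, 2, 1, 0, 0, 0],
           [0, 0, 2, 0, 0, 0],
           [0, 0, 0, 2, 1, 0],
           [0, 0, 0, 0, 2, 0],
           [0, 0, 0, 0, 0, 0]]

assert matmul(A, P) == matmul(P, A_prime)
```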
Jordan Form - Sensitivity Issues
Computation of the Jordan form is laborious and time consuming. Also, the Jordan form
is “computationally unstable”; in some cases, a very small perturbation of A can “put back” all of
the missing eigenvectors and remove the superdiagonal of ones. Because of the possible stability
problems, many numerical analysis computer programs do not include the Jordan form (the
Jordan form is not in MatLab proper; it is in MatLab’s symbolic algebra toolbox).
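A small illustration of this sensitivity, using a hypothetical 2×2 matrix $A_\varepsilon$ that does not appear elsewhere in these notes:

```latex
A_\varepsilon = \begin{bmatrix} 1 & 1 \\ \varepsilon & 1 \end{bmatrix},
\qquad
\det(A_\varepsilon - \lambda I) = (1 - \lambda)^2 - \varepsilon
\quad\Rightarrow\quad
\lambda = 1 \pm \sqrt{\varepsilon}.
```

For ε = 0, the matrix is a single 2×2 Jordan block with eigenvalue 1 and only one independent eigenvector. For any ε > 0, however small, the eigenvalues are distinct, so by Theorem 9-3 there are two independent eigenvectors, and the Jordan form becomes diagonal with no superdiagonal one.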
The Jordan form has several applications in state space control theory. Generally
speaking, control engineers will not design a system having a structure that is extremely sensitive
to small perturbations. In Chapter 10, we use the Jordan form to compute functions of matrices.