
EE448/528 Version 1.0 John Stensby

CH9.DOC Page 9-1

Chapter 9

Eigenvalues, Eigenvectors and Canonical Forms Under Similarity

Eigenvalues and eigenvectors play a prominent role in many applications of numerical linear algebra and matrix theory. In this chapter, we provide basic results on this subject. Then, we use these results to establish necessary and sufficient conditions for the diagonalization of a square matrix under a similarity transformation. Finally, we develop the Jordan canonical form of a matrix, a canonical form that has many applications.

Let T : U → U be a linear operator on a vector space U over the scalar field F. We are interested in non-zero vectors $\vec{X}$ which map under T into scalar multiples of themselves. That is, we are interested in non-zero vectors $\vec{X}$ ∈ U that satisfy

$$T[\vec{X}] = \lambda\vec{X} \tag{9-1}$$

for some scalar λ ∈ F. Such a vector $\vec{X}$ is said to be an eigenvector corresponding to the eigenvalue λ.

Example

Let I : U → U be the identity operator. For every $\vec{X}$ ∈ U, $I[\vec{X}] = \vec{X}$. Here, λ = 1 is an eigenvalue of I, and every non-zero vector in U is an eigenvector.

Example

Let T : R² → R² rotate every vector counterclockwise by π/2 radians. The scalar field is R, the set of real numbers. Note that no non-zero vector is mapped into a scalar multiple of itself. Hence, T : R² → R² has no eigenvalues or eigenvectors.

[Figure: a vector $\vec{X}$ and its image $T(\vec{X})$, rotated counterclockwise by π/2.]

This lack of eigenvalues and eigenvectors will not occur if we use F = C, the field of complex numbers. Hence, in applications where eigenvalues play a role, we use the complex number field.
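As a concrete check of this remark, the rotation can be worked over C. This is only a sketch: the matrix $R$ below is the standard-basis representation of the π/2 rotation, which the text does not write out, and $j$ denotes the imaginary unit.

```latex
R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \qquad
\det(R - \lambda I) = \lambda^2 + 1 = 0
\;\Rightarrow\; \lambda = \pm j,
\qquad
R \begin{bmatrix} 1 \\ -j \end{bmatrix} = j \begin{bmatrix} 1 \\ -j \end{bmatrix},
\quad
R \begin{bmatrix} 1 \\ j \end{bmatrix} = -j \begin{bmatrix} 1 \\ j \end{bmatrix}.
```

Both eigenvectors have a complex component, which is why no eigenvector exists when the scalars are restricted to R.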

Let $\vec{X}$ and λ be an eigenvector and eigenvalue, respectively, so that $T[\vec{X}] = \lambda\vec{X}$. Let c ∈ F = C be any non-zero scalar. Then, by linearity of T, we have

$$T[c\vec{X}] = cT[\vec{X}] = \lambda(c\vec{X}) \tag{9-2}$$

so that $c\vec{X}$ is an eigenvector. Hence, eigenvectors are defined up to an arbitrary, non-zero, scalar.

Two or more linearly independent eigenvectors can be associated with a given eigenvalue. In fact, for a given eigenvalue λ, the set

$$S_\lambda = \{\vec{X} \in U : T(\vec{X}) = \lambda\vec{X}\} \tag{9-3}$$

is a subspace known as the eigenspace associated with λ (note that $\vec{0}$ is in the eigenspace, but $\vec{0}$ is not an eigenvector). Finally, the dimension of eigenspace $S_\lambda$ is known as the geometric multiplicity of λ. In what follows, we use γ to denote the geometric multiplicity of an eigenvalue.

For a given basis, the transformation T : U → U can be represented by an n×n matrix A. In terms of this basis, a representation for the eigenvectors can be given. Also, the eigenvalues and eigenvectors satisfy

$$(A - \lambda I)\vec{X} = \vec{0}. \tag{9-4}$$

Hence, the eigenspace associated with eigenvalue λ is just the kernel of (A − λI).

While the matrix representing T is basis dependent, the eigenvalues and eigenvectors are not. The eigenvalues of T : U → U can be found by computing the eigenvalues of any matrix that represents T. Let n×n matrix A represent T : U → U with respect to some fixed basis. Then the eigenvalues are the roots of the nth-order characteristic polynomial, that is, the solutions of

$$|A - \lambda I| = \det(A - \lambda I) = 0 \tag{9-5}$$

(note the notation introduced here: |A| means the determinant of matrix A). The eigenvalues can be complex or real-valued. They can occur as simple roots or as multiple roots of the characteristic polynomial. The number of times (i.e., the multiplicity) that λ appears as a root of det(A − λI) is called the algebraic multiplicity of λ. We use $\alpha_k$ to denote the algebraic multiplicity of eigenvalue $\lambda_k$. A basis for the eigenspace associated with λ can be found by computing a basis for the kernel of (A − λI).

Example

$$A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}
\quad\text{so that}\quad
|\lambda I - A| = \begin{vmatrix} \lambda-1 & 3 & -3 \\ -3 & \lambda+5 & -3 \\ -6 & 6 & \lambda-4 \end{vmatrix} = (\lambda+2)^2(\lambda-4) = 0.$$

The distinct eigenvalues are $\lambda_1 = -2$ and $\lambda_2 = 4$. Eigenvalue $\lambda_1 = -2$ has algebraic multiplicity $\alpha_1 = 2$, and eigenvalue $\lambda_2 = 4$ has algebraic multiplicity $\alpha_2 = 1$. Now we find the eigenvectors. Consider first the eigenvalue $\lambda_1 = -2$. The matrix

$$[A - \lambda I]\Big|_{\lambda=-2} = \begin{bmatrix} 3 & -3 & 3 \\ 3 & -3 & 3 \\ 6 & -6 & 6 \end{bmatrix}$$

has a nullity of two, and $\vec{X}_{11} = [1\ 1\ 0]^T$ and $\vec{X}_{12} = [-1\ 0\ 1]^T$ are two linearly independent eigenvectors that span the two-dimensional eigenspace associated with $\lambda_1 = -2$. Hence $\lambda_1 = -2$ has geometric and algebraic multiplicities of $\gamma_1 = \alpha_1 = 2$. Now, consider $\lambda_2 = 4$. The matrix

$$[A - \lambda I]\Big|_{\lambda=4} = \begin{bmatrix} -3 & -3 & 3 \\ 3 & -9 & 3 \\ 6 & -6 & 0 \end{bmatrix}$$

has a nullity of 1, and $\vec{X}_2 = [1\ 1\ 2]^T$ spans the one-dimensional eigenspace associated with $\lambda_2 = 4$.
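The eigenpairs in this example can be checked mechanically. The following is a minimal sketch in plain Python (the helper names `matvec` and `is_eigenpair` are illustrative and not part of the notes); it simply tests $A\vec{X} = \lambda\vec{X}$ for each vector given above.

```python
# Matrix and eigenpairs copied from the example above.
A = [[1, -3, 3],
     [3, -5, 3],
     [6, -6, 4]]

eigenpairs = [
    (-2, [1, 1, 0]),    # X_11
    (-2, [-1, 0, 1]),   # X_12
    (4, [1, 1, 2]),     # X_2
]

def matvec(M, x):
    """Return the matrix-vector product M x for a list-of-rows matrix."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def is_eigenpair(M, lam, x):
    """True when x is non-zero and M x equals lam * x exactly."""
    return any(x) and matvec(M, x) == [lam * xi for xi in x]

checks = [is_eigenpair(A, lam, x) for lam, x in eigenpairs]
print(checks)
```

Integer arithmetic is used on purpose so that the equality test is exact.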

Eigenvector Indexing

From time to time, subscripts and superscripts need to be placed on eigenvectors (and the generalized eigenvectors that are introduced below). In the literature, there is not one indexing scheme that is predominant (there are faults with all eigenvector indexing schemes). Notice the indexing scheme that was introduced by the previous example. On some eigenvectors, we placed two subscripts; we wrote $\vec{X}_{jk}$. The first subscript (the "j" subscript) associates the eigenvector with one of the numerically distinct eigenvalues (each of which can have an algebraic multiplicity greater than one); we have 1 ≤ j ≤ d, where d is the number of distinct eigenvalues. The second subscript (the "k" subscript) orders the eigenvector in the set of independent eigenvectors associated with the "j" eigenvalue; we have 1 ≤ k ≤ $\gamma_j$, where $\gamma_j$ is the geometric multiplicity of the "j" eigenvalue. Sometimes, we place only one subscript on an eigenvector. This one subscript may associate the eigenvector with a distinct eigenvalue, or it may order the eigenvector in a set of independent eigenvectors (or it may do both). When one subscript appears, its meaning can be inferred from context (or its meaning will be stated explicitly). Finally, on an eigenvector, subscripts are used only when necessary; we will drop all subscripts when they are not needed to clarify notation.

Eigenvalues of Similar Matrices

Recall that n×n matrices A and B are said to be similar if there exists a nonsingular n×n matrix P such that $A = P^{-1}BP$. The matrix representing a linear transformation depends on the underlying basis; however, all matrices that represent a linear transformation are similar to one another. Furthermore, they have the same eigenvalues and eigenvectors (as vectors of the underlying space; the coordinate columns that represent an eigenvector change with the basis).

Theorem 9-1

Similar matrices have the same eigenvalues and eigenvectors.

Proof: This follows directly from the basic definitions since eigenvalues and eigenvectors are associated with an underlying linear transformation and not with any particular matrix or vector representation. Let $\vec{\beta}_1, \vec{\beta}_2, \ldots, \vec{\beta}_n$ and $\vec{\beta}_1', \vec{\beta}_2', \ldots, \vec{\beta}_n'$ denote the “old” and “new” bases, respectively, for the vector space; in terms of a non-singular transformation matrix P, the “old” and “new” bases are related as shown by (3-27). Let $\vec{X}$ and A denote “old” representations for the eigenvector and matrix, respectively. Then $\vec{X}' = P^{-1}\vec{X}$ and $A' = P^{-1}AP$ denote “new” representations for the eigenvector and matrix, respectively (see (3-29) and (3-41)). With respect to the “new” basis, the “old” eigen problem $A\vec{X} = \lambda\vec{X}$ becomes

$$(PA'P^{-1})P\vec{X}' = \lambda(P\vec{X}'). \tag{9-6}$$

After multiplication on the left by $P^{-1}$, (9-6) becomes the “new” eigen problem

$$A'\vec{X}' = \lambda\vec{X}'. \tag{9-7}$$

So, while a similarity transformation changes the matrix and vector representations, it does not change the underlying linear transformation or its eigenvalues/eigenvectors.♥
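Equations (9-6) and (9-7) can be illustrated on a small case. The sketch below uses a 2×2 matrix and a transformation matrix P chosen here purely for illustration (neither comes from the notes); P has determinant 1, so its inverse has integer entries and every comparison is exact.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hypothetical "old" representation and change-of-basis matrix.
A = [[2, 0],
     [0, 3]]
P = [[1, 1],
     [0, 1]]
P_inv = [[1, -1],
         [0, 1]]

# "New" representation A' = P^-1 A P.
A_new = matmul(P_inv, matmul(A, P))

# Eigenvector of A for eigenvalue 3, and its "new" representation X' = P^-1 X.
lam = 3
X_old = [[0], [1]]
X_new = matmul(P_inv, X_old)

# The "new" eigen problem A' X' = lam X' of (9-7).
lhs = matmul(A_new, X_new)
rhs = [[lam * v for v in row] for row in X_new]
print(A_new, X_new, lhs == rhs)
```

The eigenvalue is unchanged, while the coordinate column of the eigenvector changes from X to $P^{-1}X$, as the proof describes.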

Theorem 9-2

For eigenvalue λ, the geometric multiplicity γ does not exceed the algebraic multiplicity α.

Proof: The geometric multiplicity γ of eigenvalue λ is defined independently of any matrix representing linear transformation T : U → U. The characteristic equation, eigenvalues and eigenvectors are the same for all matrices that represent T. Hence, to represent transformation T, we can choose the matrix that makes obvious the proof of this theorem. Let γ be the dimension of eigenspace $S_\lambda$ (γ is the geometric multiplicity of λ). Let eigenvectors $\vec{X}_1, \ldots, \vec{X}_\gamma$ be a basis for eigenspace $S_\lambda$ (eigenvector subscripts are used here as an index into the set of basis vectors). This linearly independent set of eigenvectors can be extended to a basis

$$\underbrace{\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_\gamma}_{\text{eigenvectors that span eigenspace } S_\lambda},\ \underbrace{\vec{X}_{\gamma+1}, \vec{X}_{\gamma+2}, \ldots, \vec{X}_n}_{\text{any other independent vectors}} \tag{9-8}$$

of n-dimensional U. The vectors $\vec{X}_{\gamma+1}, \ldots, \vec{X}_n$ can be arbitrary as long as they are independent of each other and independent of the first γ eigenvectors. Now, $T(\vec{X}_i) = \lambda\vec{X}_i$ for 1 ≤ i ≤ γ. With respect to (9-8), the matrix A representing T has the form

$$A = \begin{bmatrix} \lambda I_\gamma & A_{\gamma,n-\gamma} \\ 0 & A_{n-\gamma,n-\gamma} \end{bmatrix} \tag{9-9}$$

where the two block rows contain γ and n−γ rows, the two block columns contain γ and n−γ columns, and $\lambda I_\gamma$ is the γ×γ diagonal matrix with λ on its diagonal. Sub-matrix $A_{\gamma,n-\gamma}$ is γ×(n−γ) and $A_{n-\gamma,n-\gamma}$ is (n−γ)×(n−γ). These sub-matrices are non-zero, in general; the values they contain are of no concern to us. From inspection of (9-9), it is evident that the algebraic multiplicity of λ is at least equal to γ. Hence, for any eigenvalue, the algebraic multiplicity ≥ geometric multiplicity (α ≥ γ).♥
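The "inspection" step can be spelled out with the determinant of a block-triangular matrix. In this sketch the symbol $s$ is introduced only to keep the polynomial variable distinct from the fixed eigenvalue λ; it does not appear in the notes.

```latex
\det(A - sI)
= \det\begin{bmatrix} (\lambda - s) I_\gamma & A_{\gamma,n-\gamma} \\ 0 & A_{n-\gamma,n-\gamma} - s I_{n-\gamma} \end{bmatrix}
= (\lambda - s)^{\gamma}\, \det\!\left(A_{n-\gamma,n-\gamma} - s I_{n-\gamma}\right).
```

The factor $(\lambda - s)^{\gamma}$ shows that λ is a root of multiplicity at least γ; the second factor may contribute further copies of the root, which is why α can exceed γ.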

Theorem 9-3

Let $\lambda_1, \lambda_2, \ldots, \lambda_s$ be any s distinct eigenvalues, and let $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_s$ (subscripts are used here to associate an eigenvector with a distinct eigenvalue) be the associated eigenvectors. These s eigenvectors are linearly independent.

Proof (by contradiction)

Suppose the set of s vectors is dependent. Re-order the eigenvectors so that the first k are linearly independent and the remaining s−k vectors are dependent on the first k vectors. Then, we can write the unique representation

$$\vec{X}_s = \sum_{i=1}^{k} c_i\vec{X}_i, \quad c_i \in F. \tag{9-10}$$

Since $\vec{X}_s \ne \vec{0}$, there are non-zero $c_i$ in (9-10). Apply the linear transformation T to (9-10) and obtain

$$\lambda_s\vec{X}_s = \sum_{i=1}^{k} c_i\lambda_i\vec{X}_i. \tag{9-11}$$

There are two possibilities. First, if $\lambda_s = 0$, then $\lambda_i \ne 0$, 1 ≤ i ≤ k, since $\lambda_1, \ldots, \lambda_s$ are distinct. In this case the left-hand side of (9-11) is $\vec{0}$ while at least one coefficient $c_i\lambda_i$ on the right is non-zero, so $\lambda_s = 0$ implies that $\vec{X}_1, \ldots, \vec{X}_k$ are dependent, a contradiction. The second possibility is that $\lambda_s \ne 0$, so that we can write

$$\vec{X}_s = \sum_{i=1}^{k} c_i\left(\frac{\lambda_i}{\lambda_s}\right)\vec{X}_i. \tag{9-12}$$

Since there are non-zero $c_i$, and $\lambda_i/\lambda_s \ne 1$ due to distinct eigenvalues, (9-12) is different from (9-10), a contradiction (since representation (9-10) is unique). Hence, for either possibility, we have a contradiction, and the s eigenvectors $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_s$ are independent.♥ Note that the converse of this theorem is not true (independent eigenvectors are not always associated with distinct eigenvalues).

Theorem 9-3 tells us a lot about matrices with distinct eigenvalues (distinct eigenvalues are a common occurrence in practical applications). An n×n matrix with n distinct eigenvalues has n linearly independent eigenvectors. When this occurs, it is possible to use the n independent eigenvectors to form a basis of n-dimensional U, a useful thing to do when proving theorems.

Let n×n matrix A represent T : U → U with respect to some fixed basis. Suppose T has n linearly independent eigenvectors $\vec{X}_1, \vec{X}_2, \ldots, \vec{X}_n$ (subscripts are used to index the eigenvectors in this set of n independent eigenvectors), and we use them as a basis of n-dimensional space U. We want to find the matrix D that represents linear transformation T with respect to this basis of eigenvectors. Use these independent eigenvectors to define the n×n transformation matrix

$$P \equiv \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}. \tag{9-13}$$

Then, with respect to the eigenvector basis, the matrix D that represents T is $D = P^{-1}AP$. But this implies that AP = PD, a result that can be written as

$$A\begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}
= \begin{bmatrix} \lambda_1\vec{X}_1 & \lambda_2\vec{X}_2 & \cdots & \lambda_n\vec{X}_n \end{bmatrix}
= \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}. \tag{9-14}$$

But, Equation (9-14) leads to the observation that

$$D = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}
= \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}^{-1} A \begin{bmatrix} \vec{X}_1 & \vec{X}_2 & \cdots & \vec{X}_n \end{bmatrix}. \tag{9-15}$$

Hence, when a basis of eigenvectors is used, the n×n matrix representing T : U → U is diagonal with the eigenvalues appearing on the diagonal.

If n×n matrix A has distinct eigenvalues, then there is a basis of eigenvectors that can be used as columns of n×n matrix P. And, with a similarity transformation, matrix P can be used to diagonalize matrix A. More generally, if each eigenvalue of A has equal geometric and algebraic multiplicities, then there are n linearly independent eigenvectors, and A can be diagonalized as described above.

The converse is true as well. That is, if an n×n nonsingular matrix P exists such that $P^{-1}AP$ is diagonal, then we can conclude

1. The eigenvalues of A appear on the diagonal of $P^{-1}AP$, and

2. The columns of P are n linearly independent eigenvectors of matrix A.

We have argued the following theorem.

Theorem 9-4

An n×n matrix A is similar to a diagonal matrix D if and only if there are n linearly independent eigenvectors of A. Furthermore, the eigenvalues of A must appear on the diagonal of D.

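Example

$$A = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$$

det(A − λI) = λ² + 1 so that λ = ±j are the eigenvalues. $\lambda_1 = +j$ has the eigenvector $\vec{X}_1 = [1\ \ 1-j]^T$. $\lambda_2 = -j$ has the eigenvector $\vec{X}_2 = [1\ \ 1+j]^T$. The eigenvalues are distinct, so $\vec{X}_1$ and $\vec{X}_2$ are independent and

$$P = \begin{bmatrix} 1 & 1 \\ 1-j & 1+j \end{bmatrix} \;\Rightarrow\; P^{-1}AP = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix}.$$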

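Example

$$A = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$

has eigenvalues $\lambda_1 = 1$ ($\alpha_1 = 2$), and $\lambda_2 = 2$ ($\alpha_2 = 1$). The eigenvectors are

$\lambda_1 = 1 \Rightarrow \vec{X}_{11} = [1\ 0\ 0]^T$ and $\vec{X}_{12} = [0\ 1\ 0]^T$

$\lambda_2 = 2 \Rightarrow \vec{X}_2 = [-1\ 0\ 1]^T$

Note that $\lambda_1$ has equal algebraic and geometric multiplicities of two. Hence, $\vec{X}_{11}$, $\vec{X}_{12}$ and $\vec{X}_2$ comprise a basis of eigenvectors, and we have

$$P = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \;\Rightarrow\; P^{-1}AP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$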
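The diagonalization in this example can be confirmed without inverting P by checking the equivalent relation AP = PD. The following is a small sketch in plain Python (the helper `matmul` is illustrative); A, P and D are the matrices of the example, with the eigenvectors as the columns of P.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 0, -1],
     [0, 1, 0],
     [0, 0, 2]]

# Columns of P are the eigenvectors X_11, X_12 and X_2.
P = [[1, 0, -1],
     [0, 1, 0],
     [0, 0, 1]]

# Eigenvalues on the diagonal, in the same order as the columns of P.
D = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]

AP = matmul(A, P)
PD = matmul(P, D)
print(AP == PD)
```

Because P is nonsingular (it is upper triangular with ones on its diagonal), AP = PD is the same statement as $P^{-1}AP = D$.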

Example

$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}$$

has eigenvalues $\lambda_1 = 1$ ($\alpha_1 = 2$), and $\lambda_2 = 2$ ($\alpha_2 = 1$). Since nullity$(A - \lambda_1 I) = 1$, we know that $\lambda_1 = 1$ has a geometric multiplicity of $\gamma_1 = 1$ but an algebraic multiplicity of $\alpha_1 = 2$. Hence, there is no basis of eigenvectors, and matrix A cannot be diagonalized under similarity.

When there is not a basis of eigenvectors, n×n matrix A cannot be diagonalized. However, we show that a nonsingular n×n matrix P exists such that $P^{-1}AP$ is “almost” diagonal; our $P^{-1}AP$ has eigenvalues on its diagonal and “1s” immediately above some of the diagonal eigenvalues. This new “almost diagonal” matrix is called the Jordan Canonical Form for A, and it has many applications in engineering and the applied sciences. First, we must introduce the subject of generalized eigenvectors.

Generalized Eigenvectors

Let A be an n×n matrix. For an eigenvalue λ, vector $\vec{X}$ is said to be a generalized eigenvector of rank k > 0 if

$$(A - \lambda I)^k\vec{X} = \vec{0}, \qquad (A - \lambda I)^{k-1}\vec{X} \ne \vec{0}. \tag{9-16}$$

An “ordinary” eigenvector $\vec{X}$ is a generalized eigenvector of rank k = 1 since $(A - \lambda I)\vec{X} = \vec{0}$ and $(A - \lambda I)^0\vec{X} = \vec{X} \ne \vec{0}$.

We develop a chain of generalized eigenvectors. For a given eigenvalue λ, let $\vec{X}$ be a generalized eigenvector of rank k. Define the chain of k generalized eigenvectors as

$$\begin{aligned}
\vec{X}^{k} &\equiv \vec{X} \\
\vec{X}^{k-1} &\equiv (A - \lambda I)\vec{X} = (A - \lambda I)\vec{X}^{k} \\
\vec{X}^{k-2} &\equiv (A - \lambda I)^2\vec{X} = (A - \lambda I)\vec{X}^{k-1} \\
&\ \ \vdots \\
\vec{X}^{1} &\equiv (A - \lambda I)^{k-1}\vec{X} = (A - \lambda I)\vec{X}^{2}.
\end{aligned} \tag{9-17}$$

A superscript on a vector is not a power; it is used to indicate rank, and it is used as an index! On $\vec{X}^k$, the k is used as a rank indicator and index; k is not a power (raising a vector to a power is an operation that has not been defined!). Now, settle down, get over it! On a vector, the only time we will use a superscript is when we are working with vectors in a chain of generalized eigenvectors (we have already described how we want to use the subscript position(s)). On generalized eigenvectors, superscripts are standard in the literature.

For each i, 1 ≤ i ≤ k, $\vec{X}^i$ is a generalized eigenvector of rank i since

$$(A - \lambda I)^i\vec{X}^i = (A - \lambda I)^i(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^k\vec{X} = \vec{0}, \tag{9-18}$$

$$(A - \lambda I)^{i-1}\vec{X}^i = (A - \lambda I)^{i-1}(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^{k-1}\vec{X} \ne \vec{0}. \tag{9-19}$$

Note that $\vec{X}^1$ is an "ordinary" eigenvector since

$$(A - \lambda I)\vec{X}^1 = (A - \lambda I)(A - \lambda I)^{k-1}\vec{X} = (A - \lambda I)^k\vec{X} = \vec{0}. \tag{9-20}$$

As mentioned above, we call $\vec{X}^1, \vec{X}^2, \ldots, \vec{X}^k$ a chain of generalized eigenvectors. Now, we examine some properties that chains have.


Theorem 9-5

A chain $\vec{X}^1, \vec{X}^2, \ldots, \vec{X}^k$ of generalized eigenvectors is linearly independent.

Proof

For the moment, assume that the vectors in the chain are dependent. Then there exist constants $c_1, c_2, \ldots, c_k$, not all zero, such that

$$c_1\vec{X}^1 + c_2\vec{X}^2 + \cdots + c_k\vec{X}^k = \vec{0}. \tag{9-21}$$

First, note that for i = 1, 2, ... , k−1 we can write

$$(A - \lambda I)^{k-1}\vec{X}^i = (A - \lambda I)^{k-1}(A - \lambda I)^{k-i}\vec{X} = (A - \lambda I)^{2k-(i+1)}\vec{X} = \vec{0}, \tag{9-22}$$

(the exponent 2k−(i+1) is at least k for these values of i), a result we will use very soon. Now, apply $(A - \lambda I)^{k-1}$ to both sides of (9-21) to obtain

$$(A - \lambda I)^{k-1}\{c_1\vec{X}^1 + c_2\vec{X}^2 + \cdots + c_k\vec{X}^k\} = \vec{0}. \tag{9-23}$$

Use (9-22) in (9-23) to obtain

$$c_k(A - \lambda I)^{k-1}\vec{X}^k = \vec{0}. \tag{9-24}$$

But, we know that $(A - \lambda I)^{k-1}\vec{X}^k \ne \vec{0}$. Hence, we must have $c_k = 0$ so that (9-21) becomes

$$c_1\vec{X}^1 + c_2\vec{X}^2 + \cdots + c_{k-1}\vec{X}^{k-1} = \vec{0}. \tag{9-25}$$

On this equation, repeat the procedure that starts with (9-22). That is, multiply (9-25) by $(A - \lambda I)^{k-2}$, and repeat the above argument (that produced $c_k = 0$) to reach the conclusion that $c_{k-1} = 0$. Obviously, this same argument can be repeated a sufficient number of times to conclude that $c_i = 0$, 1 ≤ i ≤ k. This contradiction leads to the conclusion that the chain $\vec{X}^1, \vec{X}^2, \ldots, \vec{X}^k$ is comprised of linearly independent generalized eigenvectors.♥

Theorem 9-6

Let $\lambda_1 \ne \lambda_2$ be two eigenvalues of n×n matrix A. Suppose $\vec{X}$ is a generalized eigenvector of rank k associated with $\lambda_1$ and $\vec{Y}$ is a generalized eigenvector of rank m associated with $\lambda_2$. Define the two chains

$$\vec{X}^k = \vec{X}, \text{ and } \vec{X}^i = (A - \lambda_1 I)\vec{X}^{i+1} = (A - \lambda_1 I)^{k-i}\vec{X} \text{ for } i = k-1, k-2, \ldots, 1 \tag{9-26}$$

$$\vec{Y}^m = \vec{Y}, \text{ and } \vec{Y}^j = (A - \lambda_2 I)\vec{Y}^{j+1} = (A - \lambda_2 I)^{m-j}\vec{Y} \text{ for } j = m-1, m-2, \ldots, 1 \tag{9-27}$$

The set of k+m vectors described by (9-26) and (9-27) is linearly independent. Equivalently, any generalized eigenvector from one chain is independent of the vectors in the other chain.

Proof

Suppose there is an i, 1 ≤ i ≤ k, for which $\vec{X}^i$ is linearly dependent on the chain $\vec{Y}^1, \vec{Y}^2, \ldots, \vec{Y}^m$. Then, there exist constants $c_1, \ldots, c_m$, not all zero, such that

$$\vec{X}^i = \sum_{j=1}^{m} c_j\vec{Y}^j. \tag{9-28}$$

Multiply (9-28) by $(A - \lambda_1 I)^i$, and use the fact that

$$(A - \lambda_1 I)^i\vec{X}^i = (A - \lambda_1 I)^i(A - \lambda_1 I)^{k-i}\vec{X} = \vec{0} \tag{9-29}$$

to obtain

$$(A - \lambda_1 I)^i\sum_{j=1}^{m} c_j\vec{Y}^j = \vec{0}. \tag{9-30}$$

Now, multiply (9-30) by $(A - \lambda_2 I)^{m-1}$, and use the facts

i) $(A - \lambda_2 I)^{m-1}(A - \lambda_1 I)^i = (A - \lambda_1 I)^i(A - \lambda_2 I)^{m-1}$

ii) $(A - \lambda_2 I)^{m-1}\vec{Y}^j = \vec{0}$ for j = m−1, m−2, ... , 1

to obtain

$$(A - \lambda_1 I)^i(A - \lambda_2 I)^{m-1}c_m\vec{Y}^m = c_m(A - \lambda_1 I)^i\vec{Y}^1 = \vec{0}. \tag{9-31}$$

Now, $\vec{Y}^1$ is an "ordinary" eigenvector: $A\vec{Y}^1 = \lambda_2\vec{Y}^1$. Each application of $(A - \lambda_1 I)$ therefore multiplies $\vec{Y}^1$ by $(\lambda_2 - \lambda_1)$, so (9-31) becomes

$$c_m(\lambda_2 - \lambda_1)^i\vec{Y}^1 = \vec{0}. \tag{9-32}$$

Since $\lambda_2 \ne \lambda_1$ and $\vec{Y}^1 \ne \vec{0}$, we must have $c_m = 0$ so that (9-30) becomes

$$(A - \lambda_1 I)^i\sum_{j=1}^{m-1} c_j\vec{Y}^j = \vec{0}. \tag{9-33}$$

Now repeat the argument that started with (9-30) and produced $c_m = 0$. That is, multiply (9-33) by $(A - \lambda_2 I)^{m-2}$, follow the argument, and conclude that $c_{m-1} = 0$. Continue this process to the conclusion that $c_j = 0$ for j = m, m−1, m−2, ... , 1. This contradiction (the $c_j$'s are not all zero) leads to the conclusion that $\vec{X}^i$ is independent of $\vec{Y}^1, \vec{Y}^2, \ldots, \vec{Y}^m$. Hence, the two chains $\vec{Y}^1, \vec{Y}^2, \ldots, \vec{Y}^m$ and $\vec{X}^1, \vec{X}^2, \ldots, \vec{X}^k$ contain m+k linearly independent vectors.♥

Theorem 9-7

Let $\vec{Y}$ and $\vec{X}$ be generalized eigenvectors of rank m and k, respectively, associated with the same eigenvalue λ. Define the two chains

$$\vec{X}^k = \vec{X}, \text{ and } \vec{X}^i = (A - \lambda I)\vec{X}^{i+1} = (A - \lambda I)^{k-i}\vec{X} \text{ for } i = k-1, k-2, \ldots, 1 \tag{9-34}$$

$$\vec{Y}^m = \vec{Y}, \text{ and } \vec{Y}^j = (A - \lambda I)\vec{Y}^{j+1} = (A - \lambda I)^{m-j}\vec{Y} \text{ for } j = m-1, m-2, \ldots, 1 \tag{9-35}$$

If the "ordinary" eigenvectors $\vec{Y}^1$ and $\vec{X}^1$ are independent, then so are the two chains (i.e., (9-34) and (9-35) describe m+k independent vectors).

Proof

Similar to the proof of Theorem 9-6.♥

Theorems 9-5, 9-6 and 9-7 provide the basis of our generalized eigenvector theory. Note that we have shown an important result.

Associated with eigenvalue λ are γ distinct chains of generalized eigenvectors (γ is the geometric multiplicity of λ). Each chain is "anchored" by an "ordinary" eigenvector (of rank one). In these γ chains, the total number of generalized eigenvectors is α, the algebraic multiplicity of λ. And, these α vectors are linearly independent.

Note that we have not discussed how many vectors are in each chain. We have argued only that there are a total of α generalized eigenvectors divided into γ chains associated with λ. While n×n matrix A may, or may not, have n independent eigenvectors, it always has n independent generalized eigenvectors.

Eigenvector Indexing - Revisited

It's time once more to consider generalized eigenvector indexing. A generalized eigenvector can have two subscripts and one superscript. The meanings of the two subscripts are given above in the section on eigenvector indexing (which is worth reading again). The superscript is used as both a rank indicator and index into a chain. For example, consider the generalized eigenvector $\vec{X}_{jk}^{\,l}$. The "j" subscript associates the generalized eigenvector with eigenvalue $\lambda_j$ (1 ≤ j ≤ d, where d is the number of numerically distinct eigenvalues). The "k" subscript associates the generalized eigenvector with a particular chain of independent generalized eigenvectors for $\lambda_j$ (1 ≤ k ≤ $\gamma_j$, where $\gamma_j$ is the geometric multiplicity of $\lambda_j$). As described above, superscript l is a rank indicator, and it is an index into the kth chain of generalized eigenvectors associated with $\lambda_j$. Finally, note that $\vec{X}_{jk}^{\,1}$ is the kth "ordinary" eigenvector associated with $\lambda_j$.

Listing of all Generalized Eigenvectors

Let λ1, λ2, ... , λd denote the numerically distinct eigenvalues of an n×n matrix A . For 1 ≤

k ≤ d, eigenvalue λk has an algebraic multiplicity of αk and a geometric multiplicity of γk.

Furthermore, for 1 ≤ k ≤ d, eigenvalue λk is associated with γk separate chains of generalized

eigenvectors containing a total (in all of the γk chains) of αk independent generalized eigenvectors.

Finally, taken all together, for the d numerically distinct eigenvalues, a total of n generalized

eigenvectors exist, considering all of the vectors in all of the chains.

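We can list these n generalized eigenvectors. Using the indexing scheme outlined above, we write

$$\begin{aligned}
&\text{The } \alpha_1 \text{ generalized eigenvectors for } \lambda_1 \text{ are divided into } \gamma_1 \text{ chains:}
&&\begin{cases}
\vec{X}_{11}^{1},\ \vec{X}_{11}^{2},\ \ldots,\ \vec{X}_{11}^{h_{11}} \\
\vec{X}_{12}^{1},\ \vec{X}_{12}^{2},\ \ldots,\ \vec{X}_{12}^{h_{12}} \\
\quad\vdots \\
\vec{X}_{1\gamma_1}^{1},\ \vec{X}_{1\gamma_1}^{2},\ \ldots,\ \vec{X}_{1\gamma_1}^{h_{1\gamma_1}}
\end{cases} \\[4pt]
&\text{The } \alpha_2 \text{ generalized eigenvectors for } \lambda_2 \text{ are divided into } \gamma_2 \text{ chains:}
&&\begin{cases}
\vec{X}_{21}^{1},\ \vec{X}_{21}^{2},\ \ldots,\ \vec{X}_{21}^{h_{21}} \\
\vec{X}_{22}^{1},\ \vec{X}_{22}^{2},\ \ldots,\ \vec{X}_{22}^{h_{22}} \\
\quad\vdots \\
\vec{X}_{2\gamma_2}^{1},\ \vec{X}_{2\gamma_2}^{2},\ \ldots,\ \vec{X}_{2\gamma_2}^{h_{2\gamma_2}}
\end{cases} \\
&\qquad\vdots \\
&\text{The } \alpha_d \text{ generalized eigenvectors for } \lambda_d \text{ are divided into } \gamma_d \text{ chains:}
&&\begin{cases}
\vec{X}_{d1}^{1},\ \vec{X}_{d1}^{2},\ \ldots,\ \vec{X}_{d1}^{h_{d1}} \\
\vec{X}_{d2}^{1},\ \vec{X}_{d2}^{2},\ \ldots,\ \vec{X}_{d2}^{h_{d2}} \\
\quad\vdots \\
\vec{X}_{d\gamma_d}^{1},\ \vec{X}_{d\gamma_d}^{2},\ \ldots,\ \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}
\end{cases}
\end{aligned} \tag{9-36}$$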


Here, $h_{kj}$, 1 ≤ k ≤ d, 1 ≤ j ≤ $\gamma_k$, denotes the number of generalized eigenvectors in the jth chain associated with the numerically distinct eigenvalue $\lambda_k$. Integer $h_{kj}$ has to be computed as outlined in the example given below. As stated in the list given above, we have

$$\alpha_k = \sum_{j=1}^{\gamma_k} h_{kj}. \tag{9-37}$$

Also, we denote the total number of chains as

$$\nu = \sum_{k=1}^{d} \gamma_k. \tag{9-38}$$

Finally, for an n×n matrix A, we have

$$n = \sum_{k=1}^{d} \alpha_k = \sum_{k=1}^{d}\sum_{j=1}^{\gamma_k} h_{kj}. \tag{9-39}$$

An n×n matrix A may, or may not, have n linearly independent eigenvectors. However, it always has n linearly independent generalized eigenvectors.

Example Reconsider the previous example where

$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{bmatrix}.$$

Eigenvalue $\lambda_1 = 1$ has an algebraic multiplicity of $\alpha_1 = 2$ and a geometric multiplicity of $\gamma_1 = 1$; $\vec{X}_{11}^{1} = [1\ 0\ 0]^T$ is an "ordinary" eigenvector for $\lambda_1$. Eigenvalue $\lambda_2 = 2$ has geometric and algebraic multiplicities of 1; $\vec{X}_{21}^{1} = [5\ 3\ 1]^T$ is an "ordinary" eigenvector for $\lambda_2$. We are one eigenvector short; the matrix A cannot be diagonalized by a similarity transformation. However, we can find two generalized eigenvectors associated with $\lambda_1 = 1$. Let's find a chain of length two associated with $\lambda_1 = 1$. These two generalized eigenvectors, when combined with $\vec{X}_{21}^{1}$, will produce a basis of generalized eigenvectors. First, find a non-zero $\vec{X}$ such that

$$(A - \lambda_1 I)\vec{X} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} \ne \vec{0}$$

$$(A - \lambda_1 I)^2\vec{X} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} = \begin{bmatrix} 0 & 0 & 5 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\vec{X} = \vec{0}.$$

Clearly, $\vec{X} = [0\ 1\ 0]^T$ is a generalized eigenvector of rank 2, and we use this vector to write

$$\vec{X}_{11}^{2} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{11}^{1} = (A - \lambda_1 I)\vec{X}_{11}^{2} = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.$$

$\{\vec{X}_{11}^{1}, \vec{X}_{11}^{2}\}$ is a chain of length two associated with $\lambda_1 = 1$. The vectors $\vec{X}_{11}^{1} = [1\ 0\ 0]^T$, $\vec{X}_{11}^{2} = [0\ 1\ 0]^T$, $\vec{X}_{21}^{1} = [5\ 3\ 1]^T$ form a basis of generalized eigenvectors. With respect to this basis, let's find the matrix A′ that represents the underlying transformation. Define the 3×3 non-singular matrix $P \equiv [\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]$, and compute $A' = P^{-1}AP$. We compute A′ by considering the equivalent equation PA′ = AP ⇔ $[\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]A' = A[\vec{X}_{11}^{1}\ \ \vec{X}_{11}^{2}\ \ \vec{X}_{21}^{1}]$ so that

$$A\vec{X}_{11}^{1} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = 1\cdot\vec{X}_{11}^{1} + 0\cdot\vec{X}_{11}^{2} + 0\cdot\vec{X}_{21}^{1}$$

$$A\vec{X}_{11}^{2} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 1\cdot\vec{X}_{11}^{1} + 1\cdot\vec{X}_{11}^{2} + 0\cdot\vec{X}_{21}^{1}$$

$$A\vec{X}_{21}^{1} = \begin{bmatrix} 10 \\ 6 \\ 2 \end{bmatrix} = 0\cdot\vec{X}_{11}^{1} + 0\cdot\vec{X}_{11}^{2} + 2\cdot\vec{X}_{21}^{1}$$

As a result, we see that

$$A' = P^{-1}AP = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$

Note that A′ has two blocks on its diagonal; we write A′ as

$$A' = \begin{bmatrix} J_1 & \\ & J_2 \end{bmatrix}, \qquad J_1 \equiv \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad J_2 \equiv [\,2\,].$$

Matrix A′ is known as the Jordan Canonical Form for matrix A.
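The chain and the resulting Jordan form in this example can be checked with exact integer arithmetic. The sketch below (helper names are illustrative, not from the notes) verifies the rank-2 conditions on the chosen vector and the relation AP = PA′, which is equivalent to $A' = P^{-1}AP$ because P is nonsingular.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 1, 2],
     [0, 1, 3],
     [0, 0, 2]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# B = A - lambda_1 I with lambda_1 = 1.
B = [[A[i][j] - I3[i][j] for j in range(3)] for i in range(3)]

# Candidate generalized eigenvector of rank 2, as a column.
X2 = [[0], [1], [0]]
X1 = matmul(B, X2)            # next vector down the chain
B2_X2 = matmul(B, X1)         # equals (A - I)^2 applied to X2

# Columns of P: X_11^1, X_11^2, X_21^1.
P = [[1, 0, 5],
     [0, 1, 3],
     [0, 0, 1]]
J = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]

chain_ok = X1 == [[1], [0], [0]] and B2_X2 == [[0], [0], [0]]
jordan_ok = matmul(A, P) == matmul(P, J)
print(chain_ok, jordan_ok)
```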

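Jordan Canonical Form

This procedure can be applied to transform any n×n matrix into its block-diagonal Jordan canonical form. Let $\lambda_1, \lambda_2, \ldots, \lambda_d$ be the numerically distinct eigenvalues of n×n matrix A. For 1 ≤ k ≤ d, let $\lambda_k$ have algebraic multiplicity $\alpha_k$ and geometric multiplicity $\gamma_k$. As outlined above, eigenvalue $\lambda_k$ is associated with $\gamma_k$ chains containing a total of $\alpha_k$ generalized eigenvectors, and each chain is "anchored" by an "ordinary" eigenvector. As listed by (9-36), there are a total of n linearly-independent generalized eigenvectors split up into ν chains. We use these n generalized eigenvectors to define the n×n transformation matrix

$$\begin{aligned}
P = \Big[\ &\underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{\text{chain \#1 for } \lambda_1}\
\underbrace{\vec{X}_{12}^{1} \cdots \vec{X}_{12}^{h_{12}}}_{\text{chain \#2 for } \lambda_1}\ \cdots\
\underbrace{\vec{X}_{1\gamma_1}^{1} \cdots \vec{X}_{1\gamma_1}^{h_{1\gamma_1}}}_{\text{chain \#}\gamma_1 \text{ for } \lambda_1} \\
&\underbrace{\vec{X}_{21}^{1} \cdots \vec{X}_{21}^{h_{21}}}_{\text{chain \#1 for } \lambda_2}\
\underbrace{\vec{X}_{22}^{1} \cdots \vec{X}_{22}^{h_{22}}}_{\text{chain \#2 for } \lambda_2}\ \cdots\
\underbrace{\vec{X}_{2\gamma_2}^{1} \cdots \vec{X}_{2\gamma_2}^{h_{2\gamma_2}}}_{\text{chain \#}\gamma_2 \text{ for } \lambda_2}\ \cdots \\
&\underbrace{\vec{X}_{d1}^{1} \cdots \vec{X}_{d1}^{h_{d1}}}_{\text{chain \#1 for } \lambda_d}\
\underbrace{\vec{X}_{d2}^{1} \cdots \vec{X}_{d2}^{h_{d2}}}_{\text{chain \#2 for } \lambda_d}\ \cdots\
\underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\text{chain \#}\gamma_d \text{ for } \lambda_d}\ \Big].
\end{aligned} \tag{9-40}$$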

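By using the similarity transformation $A' = P^{-1}AP$, matrix P, given by (9-40), can be used to transform n×n matrix A into its Jordan Canonical Form. This canonical form is a block diagonal matrix

$$A' = P^{-1}AP = \begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_\nu \end{bmatrix} \tag{9-41}$$

made from ν blocks $J_k$, 1 ≤ k ≤ ν, one block for each chain of generalized eigenvectors. Note that (9-41) is equivalent to AP = PA′, a matrix equation that can be written as

$$\begin{aligned}
AP &= A\Big[\ \underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{1^{\text{st}} \text{ chain}}\ \cdots\ \underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\nu^{\text{th}} \text{ chain}}\ \Big] \\
&= \Big[\ \underbrace{\vec{X}_{11}^{1} \cdots \vec{X}_{11}^{h_{11}}}_{1^{\text{st}} \text{ chain}}\ \cdots\ \underbrace{\vec{X}_{d\gamma_d}^{1} \cdots \vec{X}_{d\gamma_d}^{h_{d\gamma_d}}}_{\nu^{\text{th}} \text{ chain}}\ \Big]
\begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_\nu \end{bmatrix} = PA'.
\end{aligned} \tag{9-42}$$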

Let's examine the structure of a typical block. Consider $\vec{X}_{jk}^{1}, \vec{X}_{jk}^{2}, \ldots, \vec{X}_{jk}^{h_{jk}}$, the kth chain associated with $\lambda_j$, the jth distinct eigenvalue. The Jordan block for this chain is $J_p$, where

$$p = k + \sum_{i=1}^{j-1} \gamma_i. \tag{9-43}$$

From the basic definition of this chain, we have

$$\begin{aligned}
\vec{X}^{h_{jk}-1} &= (A - \lambda_j I)\vec{X}^{h_{jk}} &&\Rightarrow\ A\vec{X}^{h_{jk}} = \lambda_j\vec{X}^{h_{jk}} + \vec{X}^{h_{jk}-1} \\
\vec{X}^{h_{jk}-2} &= (A - \lambda_j I)\vec{X}^{h_{jk}-1} &&\Rightarrow\ A\vec{X}^{h_{jk}-1} = \lambda_j\vec{X}^{h_{jk}-1} + \vec{X}^{h_{jk}-2} \\
&\ \ \vdots \\
\vec{X}^{1} &= (A - \lambda_j I)\vec{X}^{2} &&\Rightarrow\ A\vec{X}^{2} = \lambda_j\vec{X}^{2} + \vec{X}^{1},
\end{aligned} \tag{9-44}$$

where we have omitted the common subscripts jk on all generalized eigenvectors. From PA′ = AP (see (9-42)), we have the requirement

$$\begin{bmatrix} \vec{X}_{jk}^{1} & \vec{X}_{jk}^{2} & \cdots & \vec{X}_{jk}^{h_{jk}} \end{bmatrix} J_p = A\begin{bmatrix} \vec{X}_{jk}^{1} & \vec{X}_{jk}^{2} & \cdots & \vec{X}_{jk}^{h_{jk}} \end{bmatrix}, \tag{9-45}$$

where p is given by (9-43). However, from (9-44) and the requirement $A\vec{X}_{jk}^{1} = \lambda_j\vec{X}_{jk}^{1}$, it is easy to see that


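$$J_p = \begin{bmatrix}
\lambda_j & 1 & 0 & \cdots & 0 \\
0 & \lambda_j & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \lambda_j & 1 \\
0 & \cdots & \cdots & 0 & \lambda_j
\end{bmatrix} \quad (h_{jk} \text{ rows},\ h_{jk} \text{ columns}). \tag{9-46}$$

That is, $J_p$ is an $h_{jk} \times h_{jk}$ matrix with $\lambda_j$ on its diagonal, "1s" on its first "super diagonal", and zeros everywhere else.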

Computational Procedure for Jordan Form

For many low-dimensional problems of practical interest, the Jordan form can be computed "by hand" without too much effort. A computational procedure for computing the Jordan form is outlined below.

1. Compute the eigenvalues and "ordinary" eigenvectors of n×n matrix A; determine the algebraic and geometric multiplicities of the eigenvalues. The distinct eigenvalues are $\lambda_1, \lambda_2, \ldots, \lambda_d$; for 1 ≤ k ≤ d, eigenvalue $\lambda_k$ has algebraic multiplicity $\alpha_k$ and geometric multiplicity $\gamma_k$.

2. In $\gamma_1$ distinct chains, compute a total of $\alpha_1$ independent, generalized eigenvectors for $\lambda_1$. To accomplish this, compute $(A - \lambda_1 I)^i$ for i = 1, 2, ... until the rank of $(A - \lambda_1 I)^k$ is equal to the rank of $(A - \lambda_1 I)^{k+1}$. Then, compute a rank k generalized eigenvector and its k-long chain. If k = $\alpha_1$, go to step #3. Otherwise, look for a second rank-k vector and its chain. If a second rank k vector does not exist, look for one of rank k−1, and so on, until we have $\gamma_1$ distinct chains of $\alpha_1$ generalized eigenvectors.

3. Repeat step #2 for the remaining eigenvalues $\lambda_2, \ldots, \lambda_d$.

4. Write down the Jordan form. For eigenvalue $\lambda_j$, the kth chain is of length $h_{jk}$ (determined in step #2), and there is an $h_{jk} \times h_{jk}$ Jordan block with $\lambda_j$ on its diagonal.

In the Jordan form, the ordering of the blocks is not critical. However, it is common to keep sequential all blocks associated with the same eigenvalue.

Example

$$A = \begin{bmatrix}
3 & -1 & 1 & 1 & 0 & 0 \\
1 & 1 & -1 & -1 & 0 & 0 \\
0 & 0 & 2 & 0 & 1 & 1 \\
0 & 0 & 0 & 2 & -1 & -1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 1
\end{bmatrix}$$

Compute the eigenvalues and algebraic multiplicities. Note that det(A − λI) = λ(λ − 2)⁵, and this implies that $\lambda_1 = 2$ with $\alpha_1 = 5$ and $\lambda_2 = 0$ with $\alpha_2 = 1$. Furthermore, eigenvalue $\lambda_1 = 2$ has the two independent eigenvectors

$$\vec{X}_{11}^{1} = [1\ 1\ 0\ 0\ 0\ 0]^T, \qquad \vec{X}_{12}^{1} = [0\ 0\ 1\ {-1}\ 0\ 0]^T,$$

so $\gamma_1 = 2$. Also, $\lambda_2 = 0$ has the single eigenvector

$$\vec{X}_{21}^{1} = [0\ 0\ 0\ 0\ 1\ {-1}]^T,$$

so $\gamma_2 = 1$. Now, compute $(A - \lambda_1 I)^i$, for increasing i until the rank no longer changes.

$$(A - 2I) = \begin{bmatrix}
1 & -1 & 1 & 1 & 0 & 0 \\
1 & -1 & -1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & -1 & -1 \\
0 & 0 & 0 & 0 & -1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}$$

has rank equal to 4.


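$$(A - 2I)^2 = \begin{bmatrix}
0 & 0 & 2 & 2 & 0 & 0 \\
0 & 0 & 2 & 2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & -2 \\
0 & 0 & 0 & 0 & -2 & 2
\end{bmatrix}$$

has rank equal to 2.

$$(A - 2I)^3 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -4 & 4 \\
0 & 0 & 0 & 0 & 4 & -4
\end{bmatrix}$$

has rank equal to 1.

$$(A - 2I)^4 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 8 & -8 \\
0 & 0 & 0 & 0 & -8 & 8
\end{bmatrix}$$

has rank equal to 1.

The rank of $(A - 2I)^3$ is equal to the rank of $(A - 2I)^4$; hence, there is a rank 3 generalized eigenvector that is in $K((A - 2I)^3)$ but not in $K((A - 2I)^2)$. It is easily computed as $\vec{X}_{11}^{3} = [0\ 0\ 1\ 0\ 0\ 0]^T$ since $(A - 2I)^3\vec{X}_{11}^{3} = \vec{0}$ but $(A - 2I)^2\vec{X}_{11}^{3} \ne \vec{0}$. Now, we compute the first 3-long chain associated with $\lambda_1$.

$$\vec{X}_{11}^{1} \equiv (A - 2I)^2\vec{X}_{11}^{3} = \begin{bmatrix} 2 \\ 2 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{11}^{2} \equiv (A - 2I)\vec{X}_{11}^{3} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{11}^{3} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}.$$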


Since $\alpha_1 = 5$, there are two more generalized eigenvectors associated with $\lambda_1$; inspection of $(A - 2I)^2$ and $(A - 2I)$ reveals where they are. There is a generalized eigenvector of rank 2 that is in $K((A - 2I)^2)$ but not in $K(A - 2I)$. This rank 2 vector is $\vec{X}_{12}^{2} = [0\ 0\ 0\ 0\ 1\ 1]^T$; note that $(A - 2I)^2\vec{X}_{12}^{2} = \vec{0}$ but $(A - 2I)\vec{X}_{12}^{2} \ne \vec{0}$. Hence, our second chain associated with $\lambda_1 = 2$ is

$$\vec{X}_{12}^{1} \equiv (A - 2I)\vec{X}_{12}^{2} = \begin{bmatrix} 0 \\ 0 \\ 2 \\ -2 \\ 0 \\ 0 \end{bmatrix}, \qquad
\vec{X}_{12}^{2} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}.$$

We have 5 generalized eigenvectors associated with $\lambda_1 = 2$; there are no more. With the eigenvector $\vec{X}_{21}^{1} = [0\ 0\ 0\ 0\ 1\ {-1}]^T$, we have a basis of 6 generalized eigenvectors that we can use to write the transformation matrix

$$P = \begin{bmatrix} \vec{X}_{11}^{1} & \vec{X}_{11}^{2} & \vec{X}_{11}^{3} & \vec{X}_{12}^{1} & \vec{X}_{12}^{2} & \vec{X}_{21}^{1} \end{bmatrix}
= \begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 \\
0 & 0 & 0 & -2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}.$$

We write down (no computation is necessary) the Jordan canonical form

$$A' = \begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.$$

Note that A′ contains the three Jordan blocks

$$J_1 = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix}, \qquad J_2 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad J_3 = [\,0\,].$$

It is easy to see that A′ satisfies PA′ = AP (so that $A' = P^{-1}AP$). As a MatLab exercise, enter P and A as described above, and type inv(P)*A*P at the command prompt. MatLab will return the Jordan canonical form A′ given above.
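The rank computations of this example can also be automated. The sketch below is plain Python with exact rational arithmetic; the helper names (`matmul`, `rank`) are illustrative. It recovers the block sizes for $\lambda_1 = 2$ from the ranks of the powers of (A − 2I), using a standard identity that the notes do not state: the number of Jordan blocks of size at least i equals rank$((A - \lambda I)^{i-1})$ − rank$((A - \lambda I)^{i})$. This is a shortcut for checking the answer, not the chain-building procedure described above. It also confirms AP = PA′.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

A = [[3, -1, 1, 1, 0, 0],
     [1, 1, -1, -1, 0, 0],
     [0, 0, 2, 0, 1, 1],
     [0, 0, 0, 2, -1, -1],
     [0, 0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1, 1]]
n, lam = len(A), 2
B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

# ranks[i] is the rank of B^i; stop once the rank no longer changes.
ranks = [n]
power = [[int(i == j) for j in range(n)] for i in range(n)]
while True:
    power = matmul(power, B)
    ranks.append(rank(power))
    if ranks[-1] == ranks[-2]:
        break

# at_least[i-1] is the number of Jordan blocks of size >= i for this eigenvalue.
at_least = [ranks[i - 1] - ranks[i] for i in range(1, len(ranks))]
block_sizes = []
for size in range(1, len(at_least) + 1):
    larger = at_least[size] if size < len(at_least) else 0
    block_sizes += [size] * (at_least[size - 1] - larger)
block_sizes.sort(reverse=True)

# Transformation matrix and Jordan form from the example.
P = [[2, 1, 0, 0, 0, 0],
     [2, -1, 0, 0, 0, 0],
     [0, 0, 1, 2, 0, 0],
     [0, 0, 0, -2, 0, 0],
     [0, 0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1, -1]]
J = [[2, 1, 0, 0, 0, 0],
     [0, 2, 1, 0, 0, 0],
     [0, 0, 2, 0, 0, 0],
     [0, 0, 0, 2, 1, 0],
     [0, 0, 0, 0, 2, 0],
     [0, 0, 0, 0, 0, 0]]
ap_equals_pj = matmul(A, P) == matmul(P, J)

print(ranks, block_sizes, ap_equals_pj)
```

The computed block sizes for $\lambda_1 = 2$ correspond to the chains of length 3 and 2 found by hand.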

Jordan Form - Sensitivity Issues

Computation of the Jordan form is laborious and time consuming. Also, the Jordan form is “computationally unstable”; in some cases, a very small perturbation of A can “put back” all of the missing eigenvectors and remove the superdiagonal of ones. Because of the possible stability problems, many numerical analysis computer programs do not include the Jordan form (the Jordan form is not in MatLab proper; it is in MatLab’s symbolic algebra toolbox).

The Jordan form has several applications in state space control theory. Generally speaking, control engineers will not design a system having a structure that is extremely sensitive to small perturbations. In Chapter 10, we use the Jordan form to compute functions of matrices.
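The sensitivity can be seen on a 2×2 case. The following sketch uses a perturbed matrix $A_\varepsilon$ chosen here for illustration (it is not taken from the notes): for ε = 0 it is a single Jordan block, while for any ε ≠ 0 it has two distinct eigenvalues and is diagonalizable.

```latex
A_\varepsilon = \begin{bmatrix} 1 & 1 \\ 0 & 1+\varepsilon \end{bmatrix}, \qquad
\lambda_1 = 1,\ \vec{X}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
\lambda_2 = 1+\varepsilon,\ \vec{X}_2 = \begin{bmatrix} 1 \\ \varepsilon \end{bmatrix},
\qquad
P_\varepsilon = \begin{bmatrix} 1 & 1 \\ 0 & \varepsilon \end{bmatrix}
\;\Rightarrow\;
P_\varepsilon^{-1} A_\varepsilon P_\varepsilon = \begin{bmatrix} 1 & 0 \\ 0 & 1+\varepsilon \end{bmatrix}.
```

For every ε ≠ 0 the canonical form is diagonal, yet at ε = 0 it jumps to a Jordan block with a one on the superdiagonal. Since $\det P_\varepsilon = \varepsilon$, the diagonalizing matrix becomes nearly singular as ε → 0, which is one way to see the numerical difficulty.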
