ECE557 Systems Control
Bruce Francis
Course notes, Version 2.0, September 2008
Preface
This is the second Engineering Science course on control. It assumes ECE356 as a prerequisite. If you didn't take ECE356, you must go through Chapters 2 and 3 of the ECE356 course notes.
This course is on the state-space approach to control system analysis and design. By contrast, ECE356 treated frequency domain methods. Generally speaking, the state-space methods scale better to higher order, multi-input/output systems. The frequency domain methods use complex function theory; the state-space approach uses linear algebra—eigenvalues, subspaces, and all that.
The emphasis in the lectures will be on concepts, examples, and use of the theory.

There are several computer applications for solving numerical problems in this course. The most widely used is MATLAB, but it's expensive. I like Scilab, which is free. Others are Mathematica (expensive) and Octave (free).
Contents
1 Introduction
   1.1 State Models
   1.2 Examples
       1.2.1 Magnetic levitation
       1.2.2 Vehicles
   1.3 Problems

2 The Equation ẋ = Ax
   2.1 Brief Review of Some Linear Algebra
   2.2 Eigenvalues and Eigenvectors
   2.3 The Jordan Form
   2.4 The Transition Matrix
   2.5 Stability
   2.6 Problems

3 More Linear Algebra
   3.1 Subspaces
   3.2 Linear Transformations
   3.3 Matrix Equations
   3.4 Invariant Subspaces
   3.5 Problems

4 Controllability
   4.1 Reachable States
   4.2 Properties of Controllability
   4.3 The PBH (Popov-Belevitch-Hautus) Test
   4.4 Controllability from a Single Input
   4.5 Pole Assignment
   4.6 Stabilizability
   4.7 Problems

5 Observability
   5.1 State Reconstruction
   5.2 The Kalman Decomposition
   5.3 Detectability
   5.4 Observers
   5.5 Problems

6 Feedback Loops
   6.1 BIBO Stability
   6.2 Feedback Stability
   6.3 Observer-Based Controllers
   6.4 Problems

7 Tracking and Regulation
   7.1 Review of Tracking Steps
   7.2 Distillation Columns
   7.3 Problem Setup
   7.4 Tools for the Solution
   7.5 Regulator Problem Solution
   7.6 Unobservability
   7.7 More Examples
   7.8 Problems

8 Optimal Control
   8.1 Minimizing Quadratic Functions with Equality Constraints
   8.2 The LQR Problem and Solution
   8.3 Hand Waving
   8.4 Sketch of Proof that F is Optimal
   8.5 Problems
Chapter 1
Introduction
Control is that beautiful part of system science/engineering where we get to design part of the system, the controller, so that the system performs as intended. Control is a very rich subject, ranging from pure theory (Can a robot with just vision sensors be programmed to ride a unicycle?) down to the writing of real-time code. This course is mathematical, but that doesn't imply it is only theoretical and isn't applicable to real problems.
You are assumed to know Chapters 2 and 3 of the ECE356 course notes. This chapter gives a brief review of only part of that material and is not sufficient on its own.
First, some notation. Usually a vector is written as a column vector, but sometimes, to save space, it is written as an n-tuple:

x = [ x1
      ⋮
      xn ]   or   x = (x1, . . . , xn).
1.1 State Models
Systems that are linear, time-invariant, causal, and finite-dimensional, with proper transfer functions, have state models

ẋ = Ax + Bu,   y = Cx + Du.

Here u, x, y are vector-valued functions of t and A, B, C, D are real constant matrices.
Deriving State Models
How to get a state model depends on what we have to start with.
Example (nth order ODE) Suppose we have the system

2ÿ − ẏ + 3y = u.
The natural state vector is

x = [ y
      ẏ ] =: [ x1
               x2 ].
Then

ẋ1 = x2
ẋ2 = (1/2)x2 − (3/2)x1 + (1/2)u,
so
A = [  0    1
      −3/2  1/2 ],   B = [ 0
                           1/2 ],   C = [ 1  0 ],   D = 0.
This technique extends to

y^(n) + an−1 y^(n−1) + · · · + a1 ẏ + a0 y = u.
What about derivatives on the right-hand side:

2ÿ − ẏ + 3y = u̇ − 2u?
The transfer function is

Y(s) = [(s − 2)/(2s² − s + 3)] U(s).
Introduce an intermediate signal v:

Y(s) = (s − 2) · [1/(2s² − s + 3)] U(s),   where V(s) := [1/(2s² − s + 3)] U(s).
Then

2v̈ − v̇ + 3v = u
y = v̇ − 2v.
Taking x = (v, v̇) we get
A = [  0    1
      −3/2  1/2 ],   B = [ 0
                           1/2 ],   C = [ −2  1 ],   D = 0.
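As a numerical sanity check (this snippet is mine, not part of the notes), the state model just derived can be verified in Python by evaluating C(sI − A)⁻¹B at a test point and comparing with the transfer function; for a 2 × 2 matrix the inverse is available in closed form.

```python
# Check that C(sI - A)^{-1}B equals Y(s)/U(s) = (s - 2)/(2s^2 - s + 3)
# at an arbitrary complex test point, using the closed-form 2x2 inverse.

def transfer_from_state(A, B, C, s):
    """Evaluate C (sI - A)^{-1} B for 2x2 A, 2-vector B, 2-row C."""
    m11, m12 = s - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], s - A[1][1]
    det = m11 * m22 - m12 * m21
    # (sI - A)^{-1} = (1/det) [[m22, -m12], [-m21, m11]]
    x0 = (m22 * B[0] - m12 * B[1]) / det
    x1 = (-m21 * B[0] + m11 * B[1]) / det
    return C[0] * x0 + C[1] * x1

A = [[0.0, 1.0], [-1.5, 0.5]]   # from 2v'' - v' + 3v = u, x = (v, v')
B = [0.0, 0.5]
C = [-2.0, 1.0]                 # y = -2v + v'

s = 1.0 + 2.0j                  # arbitrary test frequency
lhs = transfer_from_state(A, B, C, s)
rhs = (s - 2) / (2 * s**2 - s + 3)
print(abs(lhs - rhs) < 1e-12)   # True
```

The agreement is exact up to rounding, since C(sI − A)⁻¹B = (0.5s − 1)/(s² − 0.5s + 1.5) = (s − 2)/(2s² − s + 3) algebraically.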
This technique extends to

y^(n) + an−1 y^(n−1) + · · · + a1 ẏ + a0 y = bn−1 u^(n−1) + · · · + b0 u.
The transfer function is

G(s) = (bn−1 s^(n−1) + bn−2 s^(n−2) + · · · + b0) / (s^n + an−1 s^(n−1) + · · · + a0).
Then
G(s) = C(sI −A)−1B,
where

A = [  0    1   · · ·   0      0
       0    0   · · ·   0      0
       ⋮                       ⋮
       0    0   · · ·   0      1
      −a0  −a1  · · ·  −an−2  −an−1 ],   B = [ 0
                                               ⋮
                                               0
                                               1 ],   C = [ b0  · · ·  bn−1 ].
This state model is called the controllable (canonical) realization of G(s).

For the case where the numerator degree m equals n, you divide the denominator into the numerator and thereby factor G(s) into the sum of a constant and a strictly proper transfer function. This gives D ≠ 0, namely, the constant. If m > n, there is no state model.

What if we have two inputs u1, u2, two outputs y1, y2, and coupled equations such as
ÿ1 − ẏ1 + ẏ2 + 3y1 = u1 + u2

2 d³y2/dt³ − ÿ1 + ẏ2 + 4y1 = u2?
The natural state is

x = (y1, ẏ1, y2, ẏ2, ÿ2).

Please complete this example. □
Let's study the transfer matrix for the state model

ẋ = Ax + Bu,   y = Cx + Du.
Take Laplace transforms with zero initial conditions:
sX(s) = AX(s) + BU(s),   Y(s) = CX(s) + DU(s).
Eliminate X(s):

(sI − A)X(s) = BU(s)
⇒ X(s) = (sI − A)⁻¹BU(s)
⇒ Y(s) = [C(sI − A)⁻¹B + D] U(s),

where C(sI − A)⁻¹B + D is the transfer matrix.
This leads to the realization problem: Given G(s), find A, B, C, D such that

G(s) = C(sI − A)⁻¹B + D.

A solution exists iff G(s) is rational and proper (every element of G(s) has deg denom ≥ deg num). The solution is never unique.
There are general procedures for getting a state model, but we choose not to cover this topic in the interest of moving on to other things.
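For those who want to experiment, the controllable realization above is easy to generate programmatically. Here is a sketch in Python (rather than the Scilab/MATLAB suggested in the Preface); the helper name `controllable_realization` is mine, and the function assumes a strictly proper G(s) with monic denominator.

```python
def controllable_realization(num, den):
    """Controllable canonical realization of a strictly proper
    G(s) = num(s)/den(s), with den monic.

    num = [b_{m-1}, ..., b_0], len(num) < len(den)
    den = [1, a_{n-1}, ..., a_0]
    Returns (A, B, C) with B as a flat list and A as nested lists.
    """
    n = len(den) - 1
    a = den[1:]                                # [a_{n-1}, ..., a_0]
    b = [0.0] * (n - len(num)) + list(num)     # pad to [b_{n-1}, ..., b_0]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0                      # superdiagonal of 1s
    A[n - 1] = [-c for c in reversed(a)]       # last row: -a_0, ..., -a_{n-1}
    B = [0.0] * (n - 1) + [1.0]
    C = list(reversed(b))                      # [b_0, ..., b_{n-1}]
    return A, B, C

# G(s) = 1/(2s^2 - s + 3) = 0.5/(s^2 - 0.5s + 1.5) after making den monic
A, B, C = controllable_realization([0.5], [1.0, -0.5, 1.5])
print(A)   # [[0.0, 1.0], [-1.5, 0.5]]
print(B)   # [0.0, 1.0]
print(C)   # [0.5, 0.0]
```

Note this realization differs from the one worked by hand earlier (B = (0, 1) here rather than (0, 1/2)); realizations are never unique, and this one matches the canonical form displayed above with B = (0, . . . , 0, 1).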
1.2 Examples
Here we look at two examples that we’ll use repeatedly for illustration.
1.2.1 Magnetic levitation
[Figure: electric circuit of the electromagnet, with applied voltage u, resistance R, inductance L, and current i; the iron ball hangs a distance y below the magnet.]

This example was used frequently in ECE356. Imagine an electromagnet suspending an iron ball. Let the input be the voltage u and the output the position y of the ball below the magnet; let i denote the current in the circuit. Then
L di/dt + Ri = u.
Also, it can be derived that the magnetic force on the ball has the form Ki²/y², K a constant. Thus

Mÿ = Mg − Ki²/y².
Realistic numerical values are M = 0.1 kg, R = 15 ohms, L = 0.5 H, K = 0.0001 Nm²/A², g = 9.8 m/s². Substituting in these numbers gives the equations
0.5 di/dt + 15i = u

0.1 d²y/dt² = 0.98 − 0.0001 i²/y².
Define state variables x = (x1, x2, x3) = (i, y, ẏ). Then the nonlinear state model is ẋ = f(x, u), where

f(x, u) = (−30x1 + 2u, x3, 9.8 − 0.001x1²/x2²).
Suppose we want to stabilize the ball at y = 1 cm, or 0.01 m. We need a linear model valid in the neighbourhood of that value. Solve for the equilibrium point (x̄, ū) where x̄2 = 0.01:

−30x̄1 + 2ū = 0,   x̄3 = 0,   9.8 − 0.001x̄1²/0.01² = 0.
Thus

x̄ = (0.99, 0.01, 0),   ū = 14.85.
The linearized model is

δẋ = A δx + B δu,   δy = C δx,

where A equals the Jacobian of f with respect to x, evaluated at (x̄, ū), and B equals the same except with respect to u:
A = [     −30            0           0
           0             0           1
      −0.002x1/x2²  0.002x1²/x2³     0 ] evaluated at (x̄, ū)

  = [ −30    0    0
       0     0    1
      −19.8  1960  0 ]

B = [ 2
      0
      0 ],   C = [ 0  1  0 ].
The eigenvalues of A are −30 and ±√1960 ≈ ±44.3, the units being s⁻¹. The corresponding time constants are 1/30 = 0.033 s and 1/44.3 ≈ 0.023 s. The first is the time constant of the electric circuit; the second, the time constant of the magnetics.
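These Jacobian entries can be double-checked with a finite-difference approximation. The following Python sketch is a verification aid of mine, not part of the original notes:

```python
import math

# Nonlinear maglev model: xdot = f(x, u), x = (i, y, ydot)
def f(x, u):
    return [-30 * x[0] + 2 * u,
            x[2],
            9.8 - 0.001 * x[0] ** 2 / x[1] ** 2]

# Equilibrium holding the ball at y = 0.01 m
x_eq = [math.sqrt(0.98), 0.01, 0.0]   # i = 0.98995... ~ 0.99 A
u_eq = 15 * x_eq[0]                   # u = 14.849... ~ 14.85 V

# Finite-difference Jacobian A = df/dx at the equilibrium
eps = 1e-7
f0 = f(x_eq, u_eq)
A = [[0.0] * 3 for _ in range(3)]
for j in range(3):
    xp = list(x_eq)
    xp[j] += eps
    fp = f(xp, u_eq)
    for i in range(3):
        A[i][j] = (fp[i] - f0[i]) / eps

print(round(A[2][0], 1))              # about -19.8
print(round(A[2][1]))                 # about 1960
print(round(math.sqrt(A[2][1]), 1))   # unstable eigenvalue, about 44.3
```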
1.2.2 Vehicles
The second example is a vehicle control problem motivated by research on intelligent highway systems. We begin with the simplest vehicle, a cart with a motor driving one wheel:

[Figure: a motorized cart with input voltage u and position y.]

The input is the voltage u to the motor, the output the cart position y. We want the model from u to y.
Free body diagrams:

[Figure: free-body diagrams of the motor, wheel, and cart, showing the force f, the torque τ, the shaft angle θ, the input u, and the position y.]
The cart

The cart receives a force f via the wheel through the axle:

Mÿ = f.   (1.1)
The wheel

An equal and opposite force f acts at the axle, and a horizontal force acts where the wheel contacts the floor. If the inertia of the wheel is negligible, the two horizontal forces are equal. Finally, there is a torque τ from the motor. Equating moments about the axle gives τ = fr, where r is the radius of the wheel. Thus

f = τ/r.   (1.2)
The motor

The electric circuit equation is

L di/dt + Ri = u − vb,   (1.3)
where vb is the back emf. The torque produced by the motor:
τm = Ki. (1.4)
Newton's second law for the motor shaft:

J θ̈ = τm − τ.   (1.5)
The back emf is

vb = Kb θ̇.   (1.6)
Finally, the relationship between shaft angle and cart position:
y = rθ. (1.7)
Combining
The block diagram is then

[Figure: block diagram from u to y, with blocks 1/(Ls + R) producing i, gain K producing τm, 1/(Js) for the shaft, 1/(Ms) and 1/s for the cart, the gear ratio r between shaft and cart, and feedback through Kb (back emf) and r.]
The inner loop can be reduced, giving

[Figure: reduced block diagram, u minus back-emf feedback into 1/(Ls + R) giving i, gain K giving τm, then α/s giving ẏ and 1/s giving y, with feedback through Kb/r,]

where

α = r/(r²M + J).
Finally, we have the third-order system

[Figure: u → β/(s² + (R/L)s + γ) → ẏ → 1/s → y,]

where

β = αK/L,   γ = αKKb/(rL).
Although this vehicle is very easy to control, for more complex vehicles (Jeeps on open terrain) it's customary to design a loop to cancel the dynamics, leaving a simpler kinematic vehicle, like this:

[Figure: feedback loop with reference yref, the plant β/(s² + (R/L)s + γ) followed by 1/s, and unity feedback from y.]

If the loop is well designed, that is, yref ≈ y, we can regard the system as merely a kinematic point, with input a velocity, say v, and output a position, y.
Platoons

Now suppose there are several of these motorized carts. We want them to move in a straight line like this: a designated leader should go at constant speed; the second should follow at a fixed distance d; the third should follow the second at the distance d; and so on.

[Figure: a platoon of carts; the leader is under cruise control and each follower should stay a distance d behind the cart ahead.]

We'll return to this problem later.
1.3 Problems
The first few problems study the concept of linearity of a system. Recall that a system F with input u and output y is linear if it satisfies two conditions: superposition, i.e.,

F(u1 + u2) = F(u1) + F(u2),

and homogeneity,

F(cu) = cF(u),

c a real constant. To prove a system is not linear, you have to give a counterexample for one of these two conditions.
1. Consider a quantizer Q with input u(t), which can take on a continuum of values, and output y(t), which can take on only countably many values, say {bk}k∈Z. More specifically, suppose R is partitioned into intervals Ik, k ∈ Z, and if u(t) ∈ Ik, then y(t) = bk. Prove that Q is not linear.
2. Let S denote the ideal sampler of sampling period T; it maps a continuous-time signal u(t) into the discrete-time signal u[k] = u(kT). Let H denote the synchronized zero-order hold; it maps a discrete-time signal y[k] into y(t), where
y(t) = y[k], kT ≤ t < (k + 1)T.
Then HS maps u(t) to y(t) where
y(t) = u(kT ), kT ≤ t < (k + 1)T.
Is HS linear? If so, prove it; if not, give a counterexample.
3. Consider the amplitude modulation system with input u(t) and output y(t) = u(t) cos(t). Is it linear?
4. At time t = 0 a force v(t) is applied to a mass M whose position is y(t); the mass is initially at rest. Thus Mÿ = v, where y(0) = ẏ(0) = 0. The force is the output of a saturating actuator with input u(t) in this way:

v = {  u,   −1 ≤ u ≤ 1
       1,   u > 1
      −1,   u < −1.
Is the system from u to y linear?
5. Give an example of a system that is linear, infinite-dimensional, causal, and time-varying.
6. Express the superposition property of a system F in terms of a block diagram. Express the homogeneity property in like manner.
7. Both by hand and by Scilab/MATLAB find a state model for the system with transfer function

G(s) = (s³ − 1)/(2s³ + s² − 2s).
8. Consider the system model ẋ = Ax + Bu, y = Cx with

A = [  1   2.5  0  0
       0  −1    0  0
      −2  −3    1  0
       0  −0.5  0  1 ],   B = [ 1
                                0
                                4
                                1 ],   C = [ 1  2  0  1 ].
Both by hand and by Scilab/MATLAB find the transfer function from u to y.
9. Kirchhoff's laws for a circuit lead to algebraic constraints (e.g., currents into a node sum to zero). Consider a system with inputs u1, u2 and outputs y1, y2 governed by the equations

ÿ1 + 2ẏ1 + y2 = u1
y1 + y2 = u2.

Find the transfer matrix from u = (u1, u2) to y = (y1, y2). Does this system have a state model? If so, find one.
10. Consider the system with input u(t) and output y(t) where

4ÿ + ẏ² − y = (3t² + 8)u.

The nominal input and output are u0(t) = 1, y0(t) = t² (you can check that they satisfy the differential equation). Derive a nonlinear state model of the form
ẋ = f(x, u, t).
Linearize this about the nominal state and input, ending up with a linear state equation.
11. An unforced pendulum is modeled by the equation

L θ̈ + g sin θ = 0,

where L = length, g = gravity constant, θ = angle of pendulum.

(a) Put this model in the form of a state equation.
(b) Find all equilibrium points.
(c) Find the linearized model for each equilibrium point.
12. A system has three inputs u1, u2, u3 and three outputs y1, y2, y3. The equations are

d³y1/dt³ + a1ÿ1 + a2(ẏ1 + ẏ2) + a3(y1 − y3) = u1
ÿ2 + a4(ẏ2 − ẏ1 + 2ẏ3) + a5(y2 − y1) = u2
ẏ3 + a6(y3 − y1) = u3.
Find a state-space model for this system.
13. Find two different state models for the system

ÿ + aẏ + by = u̇ + cu.
Chapter 2
The Equation ẋ = Ax

The object of study in this chapter is the unforced state equation

ẋ = Ax.

Here A is an n × n real matrix and x(t) an n-dimensional vector-valued function of time.
2.1 Brief Review of Some Linear Algebra
In this brief section we review these concepts/results: Rⁿ, linear independence of a set of vectors, span of a set of vectors, subspace, basis for a subspace, rank of a matrix, existence and uniqueness of a solution to Ax = b where A is not necessarily square, inverse of a matrix, invertibility. If you remember them (and I hope you do), skip to the next section.
The symbol Rⁿ stands for the vector space of n-tuples, i.e., ordered lists of n real numbers.

A set of vectors {v1, . . . , vk} in Rⁿ is linearly independent if none is a linear combination of the others. One way to check this is to write the equation

c1v1 + · · · + ckvk = 0

and then try to solve for the ci's. The set is linearly independent iff the only solution is ci = 0 for every i.

The span of {v1, . . . , vk}, denoted Span{v1, . . . , vk}, is the set of all linear combinations of these vectors.

A subspace V of Rⁿ is a subset of Rⁿ that is also a vector space in its own right. This is true iff these two conditions hold: if x, y are in V, then so is x + y; if x is in V and c is a scalar, then cx is in V. Thus V is closed under the operations of addition and scalar multiplication. In R³ the subspaces are the lines through the origin, the planes through the origin, the whole of R³, and the set consisting of only the zero vector.
A basis for a subspace is a set of linearly independent vectors whose span equals the subspace. The number of elements in a basis is the dimension of the subspace.

The rank of a matrix is the dimension of the span of its columns. This can be proved to equal the dimension of the span of its rows.
The equation Ax = b has a solution iff b belongs to the span of the columns of A, equivalently

rank A = rank [ A  b ].
When a solution exists, it is unique iff the columns of A are linearly independent, that is, the rank of A equals its number of columns.
The inverse of a square matrix A is a matrix B such that BA = I. If this is true, then AB = I. The inverse is unique and we write A⁻¹. A square matrix A is invertible iff its rank equals its dimension (we say "A has full rank"); equivalently, its determinant is nonzero. The inverse equals the adjoint divided by the determinant.
2.2 Eigenvalues and Eigenvectors
Now we turn to ẋ = Ax. The time evolution of x(t) can be understood from the eigenvalues and eigenvectors of A—a beautiful connection between dynamics and algebra. Recall that the eigenvalue equation is

Av = λv.

Here λ is a real or complex number and v is a nonzero real or complex vector; λ is an eigenvalue and v a corresponding eigenvector. The eigenvalues of A are unique but the eigenvectors are not: if v is an eigenvector, so is cv for any real number c ≠ 0. The spectrum of A, denoted σ(A), is its set of eigenvalues. The spectrum consists of n numbers, in general complex, and they are equal to the zeros of the characteristic polynomial det(sI − A).
Example Consider two carts and a dashpot like this:

[Figure: two carts of masses M1 and M2 with positions x1 and x2, coupled by a dashpot with coefficient D.]

Take D = 1, M1 = 1, M2 = 1/2, x3 = ẋ1, x4 = ẋ2. You can derive that the model is ẋ = Ax, where

A = [ 0  0   1   0
      0  0   0   1
      0  0  −1   1
      0  0   2  −2 ].
The characteristic polynomial of A is s³(s + 3), and therefore

σ(A) = {0, 0, 0, −3}. □
The equation Av = λv says that the action of A on an eigenvector is very simple—just multiplication by the eigenvalue. Likewise, the motion of x(t) starting at an eigenvector is very simple.
Lemma 2.2.1 If x(0) is an eigenvector v of A and λ the corresponding eigenvalue, then x(t) = e^{λt}v.

Thus x(t) is an eigenvector too for every t.
Proof The initial-value problem

ẋ = Ax,   x(0) = v

has a unique solution—this is from differential equation theory. So all we have to do is show that e^{λt}v satisfies both the initial condition and the differential equation, for then e^{λt}v must be the solution x(t). The initial condition is easy:

e^{λt}v |_{t=0} = v.
And for the differential equation,

(d/dt)(e^{λt}v) = e^{λt}λv = e^{λt}Av = A(e^{λt}v). □
The result of the lemma extends to more than one eigenvalue. Let λ1, . . . , λn be the eigenvalues of A and let v1, . . . , vn be corresponding eigenvectors. Suppose the initial state x(0) can be written as a linear combination of the eigenvectors:

x(0) = c1v1 + · · · + cnvn.

This is certainly possible for every x(0) if the eigenvectors are linearly independent. Then the solution satisfies

x(t) = c1e^{λ1 t}v1 + · · · + cne^{λn t}vn.
This is called a modal expansion of x(t).
Example

A = [ −1   1
       2  −2 ],   λ1 = 0,  λ2 = −3,   v1 = [ 1
                                             1 ],   v2 = [ −1
                                                            2 ].

Let's say x(0) = (0, 1). The equation

x(0) = c1v1 + c2v2

is equivalent to

x(0) = V c,

where V is the 2 × 2 matrix with columns v1, v2 and c is the vector (c1, c2). Solving gives c1 = c2 = 1/3. So

x(t) = (1/3)v1 + (1/3)e^{−3t}v2. □
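This modal expansion is easy to reproduce numerically. Here is a Python sketch (mine, not from the notes) that solves V c = x(0) by Cramer's rule for the 2 × 2 case and evaluates x(t):

```python
import math

# A = [[-1, 1], [2, -2]], eigenvalues 0 and -3, eigenvectors (1,1) and (-1,2)
lam = [0.0, -3.0]
v1, v2 = (1.0, 1.0), (-1.0, 2.0)

# Solve x(0) = c1 v1 + c2 v2, i.e. V c = x0, by Cramer's rule
x0 = (0.0, 1.0)
det = v1[0] * v2[1] - v2[0] * v1[1]          # det V = 1*2 - (-1)*1 = 3
c1 = (x0[0] * v2[1] - v2[0] * x0[1]) / det   # = 1/3
c2 = (v1[0] * x0[1] - x0[0] * v1[1]) / det   # = 1/3

def x(t):
    """Modal expansion x(t) = c1 e^{l1 t} v1 + c2 e^{l2 t} v2."""
    e1, e2 = math.exp(lam[0] * t), math.exp(lam[1] * t)
    return (c1 * e1 * v1[0] + c2 * e2 * v2[0],
            c1 * e1 * v1[1] + c2 * e2 * v2[1])

print(x(0.0))   # should reproduce the initial state (0, 1)
# As t grows, x(t) approaches (1/3) v1 = (1/3, 1/3)
```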
The case of complex eigenvalues is only a little more complicated. If λ1 is a complex eigenvalue, some other, say λ2, is its complex conjugate: λ2 = λ̄1. The two eigenvectors, v1 and v2, can be taken to be complex conjugates too (easy proof). Then if x(0) is real and we solve

x(0) = c1v1 + c2v2,

we'll find that c1, c2 are complex conjugates as well. Thus the equation will look like

x(0) = c1v1 + c̄1v̄1 = 2ℜ(c1v1),

where ℜ denotes real part.
Example

A = [ 0  −1
      1   0 ],   λ1 = j,  λ2 = −j,   v1 = [  1
                                            −j ],   v2 = [ 1
                                                           j ].

Suppose x(0) = (0, 1). Then c1 = j/2, c2 = −j/2 and

x(t) = 2ℜ(c1e^{λ1 t}v1) = ℜ( je^{jt} [  1
                                       −j ] ) = [ −sin t
                                                   cos t ]. □
2.3 The Jordan Form
Now we turn to the structure theory of a matrix related to its eigenvalues. It’s convenient tointroduce a term, the kernel of a matrix A. Kernel is another name for nullspace. Thus Ker A isthe set of all vectors x such that Ax = 0; that is, Ker A is the solution space of the homogeneousequation Ax = 0. Notice that the zero vector is always in the kernel. If A is square, then Ker A isthe zero subspace, and we write Ker A = 0, iff 0 is not an eigenvalue of A. If 0 is an eigenvalue,then Ker A equals the span of all the eigenvectors corresponding to this eigenvalue; we say Ker A
is the eigenspace corresponding to the eigenvalue 0. More generally, if λ is an eigenvalue of A
the corresponding eigenspace is the solution space of Av = λv, that is, of (A − λI)v = 0, that is,Ker (A− λI).
Let's begin with the simplest case, where A is 2 × 2 and has 2 distinct eigenvalues, λ1, λ2. You can show (this is a good exercise) that there are then 2 linearly independent eigenvectors, say v1, v2 (maybe complex vectors). The equations

Av1 = λ1v1,   Av2 = λ2v2

are equivalent to the matrix equation

A [ v1  v2 ] = [ v1  v2 ] [ λ1   0
                             0  λ2 ],

that is, AV = V A_JF, where

V = [ v1  v2 ],   A_JF = diag(λ1, λ2).
The latter matrix is the Jordan form of A. It is unique up to reordering of the eigenvalues. The mapping A ↦ A_JF = V⁻¹AV is called a similarity transformation. Example:

A = [ −1   1
       2  −2 ],   V = [ 1  −1
                        1   2 ],   A_JF = [ 0   0
                                            0  −3 ].

Corresponding to the eigenvalue λ1 = 0 is the eigenvector v1 = (1, 1), the first column of V. All other eigenvectors corresponding to λ1 have the form cv1, c ≠ 0. We call the subspace spanned by v1 the eigenspace corresponding to λ1. Likewise, λ2 = −3 has a one-dimensional eigenspace.
These results extend from n = 2 to general n. Note that in the preceding result we didn't actually need distinctness of the eigenvalues — only linear independence of the eigenvectors.
Theorem 2.3.1 The Jordan form of A is diagonal, i.e., A is diagonalizable by similarity transformation, iff A has n linearly independent eigenvectors. A sufficient condition is n distinct eigenvalues.
The great thing about diagonalization is that the equation ẋ = Ax can be transformed via w = V⁻¹x into ẇ = A_JF w, that is, n decoupled equations:

ẇi = λiwi,   i = 1, . . . , n.

The latter equations are trivial to solve:

wi(t) = e^{λi t}wi(0),   i = 1, . . . , n.
Now we look at how to construct the Jordan form when there are not n linearly independent eigenvectors. We start with the case where A has only 0 as an eigenvalue.
Nilpotent matrices
Consider

[ 0  1  0          [ 0  1  0
  0  0  0            0  0  1
  0  0  0 ],         0  0  0 ].   (2.1)
For both of these matrices, σ(A) = {0, 0, 0}. For the first matrix, the eigenspace Ker A is two-dimensional and for the second matrix, one-dimensional. These are examples of nilpotent matrices: A is nilpotent if A^k = 0 for some k ≥ 1. The following statements are equivalent:

1. A is nilpotent.

2. All its eigenvalues are 0.

3. Its characteristic polynomial is sⁿ.

4. It is similar to a matrix of the form (2.1), where all elements are 0 except possibly some 1s on the first diagonal above the main one. This is called the Jordan form of the nilpotent matrix.
Example Suppose A is 3 × 3 and A = 0. Then of course it's already in Jordan form,

[ 0  0  0
  0  0  0
  0  0  0 ]. □
Example Here we do an example of transforming a nilpotent matrix to Jordan form. Take

A = [  1   1  0   0   0
      −1  −1  0   1   0
       0   0  0   0   0
       0   0  0   1   1
       0   0  0  −1  −1 ].
The rank of A is 3 and hence the kernel has dimension 2. We can compute that

A² = [ 0  0  0  1  0        A³ = [ 0  0  0   1   1
       0  0  0  0  1               0  0  0  −1  −1
       0  0  0  0  0               0  0  0   0   0
       0  0  0  0  0               0  0  0   0   0
       0  0  0  0  0 ],            0  0  0   0   0 ],   A⁴ = 0.
Take any vector v5 in Ker A⁴ = R⁵ that is not in Ker A³, for example,

v5 = (0, 0, 0, 0, 1).
Then take

v4 = Av5,   v3 = Av4,   v2 = Av3.

We get

v4 = (0, 0, 0, 1, −1) ∈ Ker A³, ∉ Ker A²
v3 = (0, 1, 0, 0, 0) ∈ Ker A², ∉ Ker A
v2 = (1, −1, 0, 0, 0) ∈ Ker A.
Finally, take v1 ∈ Ker A, linearly independent of v2, for example,

v1 = (0, 0, 1, 0, 0).

Assemble v1, . . . , v5 into the columns of V. Then

V⁻¹AV = A_JF = [ 0  0  0  0  0
                 0  0  1  0  0
                 0  0  0  1  0
                 0  0  0  0  1
                 0  0  0  0  0 ].
This is block diagonal: a 1 × 1 zero block (for v1) followed by a 4 × 4 Jordan block (for the chain v2, . . . , v5). □
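The chain construction can be checked without inverting V: V⁻¹AV = A_JF is equivalent to AV = V·A_JF, which needs only matrix multiplication. A Python sketch (mine, not in the notes):

```python
def matmul(X, Y):
    """Multiply matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[ 1,  1, 0,  0,  0],
     [-1, -1, 0,  1,  0],
     [ 0,  0, 0,  0,  0],
     [ 0,  0, 0,  1,  1],
     [ 0,  0, 0, -1, -1]]

# Chain vectors found above
v1 = [0, 0, 1, 0, 0]
v2 = [1, -1, 0, 0, 0]
v3 = [0, 1, 0, 0, 0]
v4 = [0, 0, 0, 1, -1]
v5 = [0, 0, 0, 0, 1]
V = [list(col) for col in zip(v1, v2, v3, v4, v5)]   # vectors as columns

AJF = [[0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0]]

print(matmul(A, V) == matmul(V, AJF))   # True
```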
In general, the Jordan form of a nilpotent matrix has 0 in each entry except possibly in the first diagonal above the main diagonal, which may have some 1s.
A nilpotent matrix has only the eigenvalue 0. Now consider a matrix A that has only one eigenvalue, λ, i.e.,

det(sI − A) = (s − λ)ⁿ.

To simplify notation, suppose n = 3. Letting r = s − λ, we have

det[rI − (A − λI)] = r³,
i.e., A − λI has only the zero eigenvalue, and hence A − λI =: N, a nilpotent matrix. So the Jordan form of N must look like

[ 0  ⋆  0
  0  0  ⋆
  0  0  0 ],

where each star can be 0 or 1, and hence the Jordan form of A is

[ λ  ⋆  0
  0  λ  ⋆
  0  0  λ ].   (2.2)
To recap, if A has just one eigenvalue, λ, then its Jordan form is λI + N, where N is a nilpotent matrix in Jordan form.
An extension of this analysis results in the Jordan form in general. Suppose A is n × n, λ1, . . . , λp are the distinct eigenvalues of A, and m1, . . . , mp are their multiplicities; that is, the characteristic polynomial is

det(sI − A) = (s − λ1)^{m1} · · · (s − λp)^{mp}.

Then A is similar to

A_JF = [ A1
            ⋱
               Ap ],
where Ai is mi × mi and has only the eigenvalue λi. Thus Ai has the form λiI + Ni, where Ni is a nilpotent matrix in Jordan form. Example:

A = [ 0  0   1   0
      0  0   0   1
      0  0  −1   1
      0  0   2  −2 ].

As we saw, the spectrum is σ(A) = {0, 0, 0, −3}. Thus the Jordan form must be of the form

A_JF = [ 0  ⋆  0   0
         0  0  ⋆   0
         0  0  0   0
         0  0  0  −3 ].
Since A has rank 2, so does A_JF. Thus only one of the stars is 1. Either is possible, for example,

A_JF = [ 0  0  0   0
         0  0  1   0
         0  0  0   0
         0  0  0  −3 ].

This has two "Jordan blocks":

A_JF = [ A1  0
          0  A2 ],   A1 = [ 0  0  0
                            0  0  1
                            0  0  0 ],   A2 = −3.
2.4 The Transition Matrix
Let us review from the ECE356 course notes. For a square matrix M, the exponential e^M is defined as

e^M := I + M + (1/2!)M² + (1/3!)M³ + · · · .
The matrix e^M is not the same as the component-wise exponential of M. Facts:

1. e^M is invertible for every M, and (e^M)⁻¹ = e^{−M}.

2. e^{M+N} = e^M e^N iff M and N commute, i.e., MN = NM.
The matrix function t ↦ e^{tA} : R → R^{n×n} is then defined and is called the transition matrix associated with A. It has the properties

1. e^{tA}|_{t=0} = I.

2. e^{tA} and A commute.

3. (d/dt)e^{tA} = Ae^{tA} = e^{tA}A.
Moreover, the solution of

ẋ = Ax,   x(0) = x0

is x(t) = e^{tA}x0. So e^{tA} maps the state at time 0 to the state at time t. In fact, it maps the state at any time t0 to the state at time t0 + t.
On computing the transition matrix

Via the Jordan form. If one can compute the Jordan form of A, then e^{tA} can be written in closed form, as follows. The equation

AV = V A_JF

implies

A²V = AV A_JF = V A_JF².

Continuing in this way gives

A^k V = V A_JF^k,

and then

e^{At}V = V e^{A_JF t},

so finally

e^{At} = V e^{A_JF t} V⁻¹.
The matrix exponential e^{A_JF t} is easy to write down. For example, suppose there's just one eigenvalue, so A_JF = λI + N, N nilpotent, n × n. Then

e^{A_JF t} = e^{λt}e^{Nt} = e^{λt} [ I + Nt + N²t²/2! + · · · + N^{n−1}t^{n−1}/(n − 1)! ].
Via Laplace transforms. Taking Laplace transforms of

ẋ = Ax,   x(0) = x0

gives

sX(s) − x0 = AX(s).

This yields

X(s) = (sI − A)⁻¹x0.

Comparing

x(t) = e^{tA}x0,   X(s) = (sI − A)⁻¹x0

shows that e^{tA} and (sI − A)⁻¹ are Laplace transform pairs. So one can get e^{tA} by finding the matrix (sI − A)⁻¹ and then taking the inverse Laplace transform of each element.
2.5 Stability
The concept of stability is fundamental in control engineering. Here we look at the scenario where the system has no input, but its state has been perturbed, and we want to know if the system will recover. This was introduced in the ECE356 course notes. Here we go a little farther now that we're armed with the Jordan form.
The maglev example is a good one to illustrate this point. Suppose a feedback controller has been designed to balance the ball's position at 1 cm below the magnet. Suppose that if the ball is placed at precisely 1 cm it will stay there; that is, the 1 cm location is a closed-loop equilibrium point. Finally, suppose there is a temporary wind gust that moves the ball away from the 1 cm position. The stability questions are: will the ball move back to the 1 cm location; if not, will it at least stay near that location?
So consider

ẋ = Ax.

Obviously if x(0) = 0, then x(t) = 0 for all t. We say the origin is an equilibrium point—if you start there, you stay there. Equilibrium points can be stable or not. While there are more elaborate and formal definitions of stability for the above homogeneous system, we choose the following two: The origin is asymptotically stable if x(t) −→ 0 as t −→ ∞ for all x(0). The origin is stable if x(t) remains bounded as t −→ ∞ for all x(0). Since x(t) = e^{At}x(0), the origin is asymptotically stable iff every element of the matrix e^{At} converges to zero, and is stable iff every element of the matrix e^{At} remains bounded as t −→ ∞. Of course, asymptotic stability implies stability.
Asymptotic stability is relatively easy to characterize. Using the Jordan form, one can prove this very important result, where ℜ denotes "real part":

Theorem 2.5.1 The origin is asymptotically stable iff the eigenvalues of A all satisfy ℜλ < 0.

Let's say the matrix A is stable if its eigenvalues satisfy ℜλ < 0. Then the origin is asymptotically stable iff A is stable.
Now we turn to the more subtle property of stability. We'll do some examples, and we may as well take A in Jordan form.
Consider the nilpotent matrix

A = N = [ 0  0
          0  0 ].

Obviously, x(t) = x(0) for all t and so the origin is stable. By contrast, consider

A = N = [ 0  1
          0  0 ].

Then

e^{Nt} = I + tN,

which is unbounded and so the origin is not stable. This example extends to the n × n case: if A is nilpotent, the origin is stable iff A = 0.
Here's the test for stability in general in terms of the Jordan form of A:

A_JF = [ A1
            ⋱
               Ap ].
Recall that each Ai has just one eigenvalue, λi, and that Ai = λiI + Ni, where Ni is a nilpotent matrix in Jordan form.

Theorem 2.5.2 The origin is stable iff the eigenvalues of A all satisfy ℜλ ≤ 0 and, for every eigenvalue with ℜλi = 0, the nilpotent matrix Ni is zero, i.e., Ai is diagonal.
Here's an example with complex eigenvalues:

A = [ 0  −1
      1   0 ],   A_JF = [ j   0
                          0  −j ].
The origin is stable since there are two 1 × 1 Jordan blocks. Now consider

A = [ 0  −1  1   0
      1   0  0   1
      0   0  0  −1
      0   0  1   0 ].
The eigenvalues are j, j, −j, −j, and so the Jordan form must look like

AJF = [ j * 0 0; 0 j 0 0; 0 0 -j *; 0 0 0 -j ],

where each * is either 0 or 1.
Since the rank of A − jI equals 3, the upper star is 1; since the rank of A + jI equals 3, the lower star is 1. Thus

AJF = [ j 1 0 0; 0 j 0 0; 0 0 -j 1; 0 0 0 -j ].
Since the Jordan blocks are not diagonal, the origin is not stable.
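The two rank computations above are easy to verify numerically with complex arithmetic (a Python/NumPy sketch):

```python
import numpy as np

A = np.array([[0, -1, 1, 0],
              [1,  0, 0, 1],
              [0,  0, 0, -1],
              [0,  0, 1, 0]], dtype=complex)
I = np.eye(4)

# rank(A - jI) = 3 forces the upper star to 1; rank(A + jI) = 3, the lower.
r_plus = np.linalg.matrix_rank(A - 1j * I)
r_minus = np.linalg.matrix_rank(A + 1j * I)
print(r_plus, r_minus)   # 3 3
```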
Example Consider the cart-spring-damper system: a cart of mass M with position y, attached to a wall through a spring with constant K and a damper with constant D.
The equation is

M ÿ + D ẏ + K y = 0.

Defining x = (y, ẏ), we have ẋ = Ax with

A = [ 0 1; -K/M -D/M ].
Assume M > 0 and K, D ≥ 0. If D = K = 0, the eigenvalues are {0, 0} and A is a nilpotent matrix in Jordan form: the origin is an unstable equilibrium. If exactly one of D, K is zero, the origin is stable but not asymptotically stable. And if both D and K are nonzero, the origin is asymptotically stable. □
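The three cases can be checked by computing eigenvalues, say with M = 1 (an illustrative value) in a Python/NumPy sketch:

```python
import numpy as np

def cart_matrix(M, D, K):
    """State matrix for M y'' + D y' + K y = 0 with state x = (y, y')."""
    return np.array([[0.0, 1.0], [-K / M, -D / M]])

# With M = 1:
e_unstable = np.linalg.eigvals(cart_matrix(1.0, 0.0, 0.0))  # {0, 0}, nondiagonal Jordan block
e_marginal = np.linalg.eigvals(cart_matrix(1.0, 0.0, 1.0))  # +-j: stable, not asymptotically
e_damped = np.linalg.eigvals(cart_matrix(1.0, 1.0, 1.0))    # real parts -1/2: asymptotically stable

print(e_unstable, e_marginal, e_damped)
```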
Example Two points move on the line R. The positions of the points are x1, x2. They move toward each other according to the control laws

ẋ1 = x2 − x1,   ẋ2 = x1 − x2.
Thus the state is x = (x1, x2) and the state equation is

ẋ = Ax,   A = [ -1 1; 1 -1 ].
The eigenvalues are λ1 = 0, λ2 = −2, so the origin is stable but not asymptotically stable. Obviously, the two points tend toward each other; that is, the state x(t) tends toward the subspace

V = {x : x1 = x2}.
This is the eigenspace for the zero eigenvalue. To see this convergence, write the initial condition as a linear combination of eigenvectors:

x(0) = c1 v1 + c2 v2,   v1 = (1, 1),   v2 = (−1, 1).
Then

x(t) = c1 e^{λ1 t} v1 + c2 e^{λ2 t} v2 = c1 v1 + c2 e^{−2t} v2 → c1 v1.
So x1(t) and x2(t) both converge to c1, the same point. □
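The modal expansion can be computed numerically: solve for c1, c2 from x(0) and evaluate x(t). A Python/NumPy sketch with an illustrative initial condition x(0) = (3, 1):

```python
import numpy as np

A = np.array([[-1.0, 1.0], [1.0, -1.0]])
v1 = np.array([1.0, 1.0])    # eigenvector for lambda1 = 0
v2 = np.array([-1.0, 1.0])   # eigenvector for lambda2 = -2

x0 = np.array([3.0, 1.0])
# Coordinates of x(0) in the eigenvector basis: solve [v1 v2] c = x0.
c1, c2 = np.linalg.solve(np.column_stack([v1, v2]), x0)

def x(t):
    """Modal expansion: x(t) = c1*exp(0*t)*v1 + c2*exp(-2t)*v2."""
    return c1 * v1 + c2 * np.exp(-2.0 * t) * v2

print(c1, c2)        # 2.0 -1.0
print(x(20.0))       # approximately (2, 2): both points meet at c1
```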
Phase portraits help us visualize state evolution and stability, but they're applicable only in the case n = 2. Below is shown a plot in R2 of the vector field for

A = [ 0 1; -1 -1 ],
that is, at a grid of points, the directions of the velocity vectors Ax are shown translated to the point x. By following the arrows, we get a trajectory; one is shown. The plot was done using www.math.psu.edu/melvin/phase/newphase.html
You can also use MATLAB, Scilab (free), Mathematica, or Octave (free).
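If you prefer to compute the vector field and a trajectory directly, here is a Python/NumPy sketch (the grid, the initial point, and the Runge-Kutta step size are illustrative choices; a plotting call is only indicated in a comment):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, -1.0]])

# Vector field on a grid: at each grid point x, the velocity is Ax.
xs = np.linspace(-2, 2, 9)
U = np.array([[A @ np.array([x1, x2]) for x1 in xs] for x2 in xs])
# e.g. matplotlib's quiver(xs, xs, U[:, :, 0], U[:, :, 1]) would draw the field.

def trajectory(x0, dt=0.01, steps=2000):
    """One trajectory of x' = Ax by classical fourth-order Runge-Kutta."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        k1 = A @ x
        k2 = A @ (x + 0.5 * dt * k1)
        k3 = A @ (x + 0.5 * dt * k2)
        k4 = A @ (x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        path.append(x.copy())
    return np.array(path)

path = trajectory([2.0, 0.0])
print(np.linalg.norm(path[-1]))  # small: the spiral decays toward the origin
```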
2.6 Problems
1. Are the following vectors linearly independent?
v1 = (1, 1, 2, 0), v2 = (1, 0, 2,−2), v3 = (−1, 2,−2, 6).
2. Continuing with the same vectors, find a basis for Span {v1, v2, v3}.
3. What kind of geometric object is {x : Ax = b} when A ∈ Rm×n? That is, is it a sphere, a point, or what?
4. (a) Let A be an 8× 8 real matrix with eigenvalues
2, 2,−3,−3,−3, 8, 4, 4.
Assume
rank(A− 2I) = 7, rank(A + 3I) = 6, rank(A− 4I) = 6.
Write down the Jordan form of A.
(b) The matrix

A = [ 1 0 0 1; 1 0 0 1; 1 0 0 1; -1 0 0 -1 ]

is nilpotent. Write down its Jordan form.
5. Take

A = [ 0 0 1 0; 0 0 0 1; 0 0 -1 1; 0 0 2 -2 ].
Show that the matrix V constructed as follows satisfies V^{-1} A V = AJF:
Select v3 in Ker A2 but not in Ker A.
Set v2 = Av3.
Select v1 in Ker A such that {v1, v2} is linearly independent.
Select an eigenvector v4 corresponding to the eigenvalue −3.
Set V = [v1 v2 v3 v4].
(The general construction of the basis for the Jordan form is along these lines.)
6. Let

A = [ 0 1 0 0; 0 0 1 0; 0 0 0 1; -2 1 0 2 ].
Write down the Jordan form of A.
7. Consider

A = [ σ ω; -ω σ ],

where σ and ω ≠ 0 are real. Find the Jordan form and the transition matrix.
8. In the previous problem, we saw that when

A = [ σ ω; -ω σ ]

its transition matrix is easy to write down. This problem demonstrates that a matrix with distinct complex eigenvalues can be transformed into the above form using a nonsingular transformation. Let

A = [ -1 -4; 1 -1 ].

Determine the eigenvalues and eigenvectors of A, noting that they form complex conjugate pairs. Let the first eigenvalue be written as a + jb with the corresponding eigenvector v1 + jv2. Take v1 and v2 as the columns of a matrix V. Find V^{-1} A V.
9. Consider the homogeneous state equation ẋ = Ax with

A = [ 3 1; 2 2 ]

and x0 = (3, 2). Find a modal expansion of x(t).
10. Show that the origin is asymptotically stable for ẋ = Ax iff all poles of every element of (sI − A)^{-1} are in the open left half-plane. Show that the origin is stable iff all poles of every element of (sI − A)^{-1} are in the closed left half-plane and those on the imaginary axis have multiplicity 1.
11. Consider the linear system

ẋ = [ 0 1; 1 0 ] x + [ -1; 1 ] u
y = [ 0 1 ] x

(a) If u(t) is the unit step and x(0) = 0, is y(t) bounded?
(b) If u(t) = 0 and x(0) is arbitrary, is y(t) bounded?
12. (a) Suppose that σ(A) = {−1, −3, −3, −1 + j2, −1 − j2} and the rank of (A − λI) at λ = −3 is 4. Determine AJF.

(b) Suppose that σ(A) = {−1, −2, −2, −2} and the rank of (A − λI) at λ = −2 is 3. Determine AJF.

(c) Suppose that σ(A) = {−1, −2, −2, −2, −3} and the rank of (A − λI) at λ = −2 is 3. Determine AJF.
13. Find AJF for

A = [ 0 1 0; 0 0 1; -2 -4 -3 ].
14. Summarize all the ways to find exp(At). Then find exp(At) for

A = [ 1 1 0; 0 1 1; 0 0 2 ].
15. Consider the set

{cv : c ≥ 0},

where v ≠ 0 is a given vector in R2. This set is called a ray from the origin in the direction of v. More generally,

{x0 + cv : c ≥ 0}

is a ray from x0 in the direction of v. Find a 2 × 2 matrix A and a vector x0 such that the solution x(t) of ẋ = Ax, x(0) = x0 is a ray.
16. Consider the following system:

ẋ1 = −x2
ẋ2 = x1 − 3x2

Do a phase portrait using Scilab or MATLAB. Interpret the phase portrait in terms of the modal decomposition of the system. Do lots more examples of this type.
Chapter 3
More Linear Algebra
This chapter extends our knowledge of linear algebra: subspaces, matrix representations, linear matrix equations, and invariant subspaces.
3.1 Subspaces
Let X = Rn and let V, W be subspaces of X. Then V + W denotes the set

{v + w : v ∈ V, w ∈ W},

and it is a subspace of X. The set union V ∪ W is not a subspace in general, unless one is contained in the other. The intersection V ∩ W is, however, a subspace. As an example:

X = R3, V a line, W a plane.

Then V + W = R3 if V does not lie in W. If V ⊂ W, then of course V + W = W.¹
It is a fact that
dim(V +W) = dim(V) + dim(W)− dim(V ∩W).
For example, think of V,W as two planes in R3 that intersect in a line. Then the dimension equationevaluates to
3 = 2 + 2− 1.
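The dimension equation can be checked numerically: represent each subspace by a matrix whose columns span it; then dim(V + W) = rank [V W], and dim(V ∩ W) follows from the formula. A Python/NumPy sketch with the two-planes example:

```python
import numpy as np

# Two planes in R^3 given by spanning columns; they intersect in a line.
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # the (x1, x2)-plane
W = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])  # the (x1, x3)-plane

dim_V = np.linalg.matrix_rank(V)
dim_W = np.linalg.matrix_rank(W)
dim_sum = np.linalg.matrix_rank(np.hstack([V, W]))   # dim(V + W)
dim_cap = dim_V + dim_W - dim_sum                    # dim(V cap W), by the formula
print(dim_sum, dim_V, dim_W, dim_cap)   # 3 2 2 1, i.e. 3 = 2 + 2 - 1
```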
Two subspaces V, W are independent if V ∩ W = 0. This is not the same as being orthogonal. For example, two lines in R2 are independent iff they are not collinear (i.e., the angle between them is not 0), while they are orthogonal iff the angle is 90°.
Every vector x in V +W can be written as
x = v + w, v ∈ V, w ∈W.
If V, W are independent, then v, w are unique. Think of v as the component of x in V and w as its component in W. Let's prove uniqueness. Suppose
x = v + w = v1 + w1.
¹ In this chapter when we speak of lines we mean lines through 0. Similarly for planes.
Then
v − v1 = w1 − w.
The left-hand side is in V and the right-hand side in W. Since the intersection of these two subspaces is zero, both sides equal 0; hence v = v1 and w = w1.
Clearly, V,W are independent iff
dim(V +W) = dim(V) + dim(W).
Three subspaces U, V, W are independent if U, V + W are independent, V, U + W are independent, and W, U + V are independent. This is not the same as being pairwise independent. As an example, let U, V, W be 1-dimensional subspaces of R3, i.e., three lines. When are they independent? Pairwise independent?
Every vector x in U + V +W can be written as
x = u + v + w, u ∈ U , v ∈ V, w ∈W.
If U ,V,W are independent, then u, v, w are unique. Also, U ,V,W are independent iff
dim(U + V +W) = dim(U) + dim(V) + dim(W).
If V,W are independent subspaces, we write their sum as V ⊕W. This is called a direct sum.Likewise for more than two.
Let’s finish this section with a handy fact: Every subspace has an independent complement, i.e.,
V ⊂ X =⇒ (∃W ⊂ X ) X = V ⊕W.
Think of X as R3 and V as a plane. Then W can be any line not in the plane.
3.2 Linear Transformations
We now introduce linear transformations. The important point is that a linear transformation is not the same as a matrix, but every linear transformation has a matrix representation once you choose a basis.
Let X = Rn and Y = Rp. A linear function A : X → Y defines a linear transformation (LT);X is called its domain and Y its co-domain. Thus
A(x1 + x2) = Ax1 + Ax2, x1, x2 ∈ X
A(ax) = aAx, a ∈ R, x ∈ X .
It is an important fact that an LT is uniquely determined by its action on a basis. That is, if A : X → Y is an LT and if {e1, . . . , en} is a basis for X, then if we know the vectors Aei, we can compute Ax for every x ∈ X, by linearity.
Example For us, the most important example is an LT generated by a matrix. Let A ∈ Rm×n. For each vector x in Rn, Ax is a vector in Rm. The mapping x ↦ Ax is an LT A : Rn → Rm. Linearity is easy to check. □
Example Take a vector in the plane and rotate it counterclockwise by 90°. This defines an LT A : R2 → R2. Note that A is not given as a matrix; it's given by its domain, its co-domain, and its action on vectors. If we take a vector to be represented by its Cartesian coordinates, x = (x1, x2), then we've chosen a basis for R2. In that case A maps x = (x1, x2) to Ax = (−x2, x1), and so there's an associated rotation matrix

[ 0 -1; 1 0 ].

We'll return to matrix representation later. □
Example Let X = Rn and let {e1, . . . , en} be a basis. Every vector x in X has a unique expansion
x = a1e1 + · · ·+ anen, ai ∈ R.
Let a denote the vector (a1, . . . , an), the n-tuple of coordinates of x with respect to the basis. The function x ↦ a defines an LT Q : X → Rn. The equation

x = a1e1 + · · · + anen

can be written compactly as x = Ea, where E is the matrix with columns e1, . . . , en and a is the vector with components a1, . . . , an. Therefore a = E^{-1} x and so Qx = E^{-1} x; that is, the action of Q is to multiply by the matrix E^{-1}.
For example, let X = R2. Take the natural basis

e1 = (1, 0),   e2 = (0, 1).

In this case E = I and Qx = x. If the basis instead is

e1 = (1, 1),   e2 = (−1, 1),

then

E = [ 1 -1; 1 1 ]

and Qx = E^{-1} x. □
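The coordinate map is just a linear solve. A Python/NumPy sketch of the second basis above (the vector x is illustrative):

```python
import numpy as np

E = np.array([[1.0, -1.0], [1.0, 1.0]])  # columns are e1 = (1,1), e2 = (-1,1)

def Q(x):
    """Coordinates of x in the basis {e1, e2}: solve E a = x, i.e. a = E^{-1} x."""
    return np.linalg.solve(E, x)

x = np.array([3.0, 1.0])
a = Q(x)
print(a)                                 # (2, -1): the coordinates of x
print(a[0] * E[:, 0] + a[1] * E[:, 1])   # reassembles x = (3, 1)
```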
Every LT on finite-dimensional vector spaces has a matrix representation. Let's do this very important construction carefully. Let A be an LT X → Y, with

X = Rn, basis {e1, . . . , en};   Y = Rp, basis {f1, . . . , fp}.

Bring in the coordinate LTs:

Q : X → Rn,   R : Y → Rp.

So now we have the setup
X ---A---> Y
|          |
Q          R
v          v
Rn         Rp
The left downward arrow gives us the n-tuple, say a, that represents a vector x in the basis {e1, . . . , en}. The right downward arrow gives us the p-tuple, say b, that represents a vector y in the basis {f1, . . . , fp}. It's possible to add a fourth LT to complete the square:
X ---A---> Y
|          |
Q          R
v          v
Rn --M---> Rp
This is called a commutative diagram. The object M in the diagram is the matrix representation of A with respect to these two bases. Notice that the bottom arrow represents the LT generated by the matrix M; we write M in the diagram for simplicity, but you should understand that really the object is an LT. The matrix M is the p × n matrix that makes the diagram commute, that is, for every x ∈ X,

Ma = b, where a = Qx, b = RAx.
In particular, take x = ei, the ith basis vector in X. Then a is the n-vector with 1 in the ith entry and 0 otherwise. So Ma equals the ith column of the matrix M. Thus, we have the following recipe for constructing the matrix M:
1. Take the 1st basis vector e1 of X .
2. Apply the LT A to get Ae1.
3. Find b, the coordinate vector of Ae1 in the basis for Y.
4. Enter this b as column 1 of M .
5. Repeat for the other columns.
Recall that Q is the LT generated by E^{-1}, where the columns of E are the basis in the domain of A. Likewise, R is the LT generated by F^{-1}, where the columns of F are the basis in the co-domain of A. Thus the equation Ma = b reads

M E^{-1} x = F^{-1} A x.   (3.1)
Example Let A : R2 → R2 be the LT that rotates a vector counterclockwise by 90°. Let's first take the standard bases: e1 = (1, 0), e2 = (0, 1) for the domain and f1 = (1, 0), f2 = (0, 1) for the co-domain. Following the steps, we first apply A to e1, that is, we rotate e1 counterclockwise by 90°; the result is Ae1 = (0, 1). Then we express this vector in the basis {f1, f2}:

Ae1 = 0 × f1 + 1 × f2.
Thus the first column of M is (0, 1), the vector of coefficients. Now for the second column: rotate e2 to get (−1, 0) and represent this in the basis {f1, f2}:

Ae2 = −1 × f1 + 0 × f2.

So the second column of M is (−1, 0). Thus

M = [ 0 -1; 1 0 ].
Suppose we had different bases:

e1 = (1, 1), e2 = (−1, 2), f1 = (1, 2), f2 = (1, 0).

Apply the recipe again. Get Ae1 = (−1, 1). Expand it in the basis {f1, f2}:

(−1, 1) = (1/2) f1 − (3/2) f2.

Get Ae2 = (−2, −1). Expand it in the basis {f1, f2}:

(−2, −1) = −(1/2) f1 − (3/2) f2.

Thus

M = [ 1/2 -1/2; -3/2 -3/2 ].

□
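Equation (3.1), holding for every x, says M = F^{-1} A E, which gives a one-line numerical check of this example (a Python/NumPy sketch):

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by 90 degrees
E = np.array([[1.0, -1.0], [1.0, 2.0]])   # columns e1 = (1,1), e2 = (-1,2)
F = np.array([[1.0, 1.0], [2.0, 0.0]])    # columns f1 = (1,2), f2 = (1,0)

M = np.linalg.solve(F, A @ E)             # M = F^{-1} A E
print(M)   # [[ 0.5 -0.5]
           #  [-1.5 -1.5]]
```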
Example Let A ∈ Rm×n and let A : Rn → Rm be the generated LT. It is easy to check that A itself is then the matrix representation of A with respect to the standard bases. Let's do it. Let {e1, . . . , en} be the standard basis on Rn and {f1, . . . , fm} the standard basis on Rm. Then Ae1 = Ae1 equals the first column, (a11, a21, . . . , am1), of A. This column can be written as

a11 f1 + · · · + am1 fm,

and hence (a11, a21, . . . , am1) is the first column of the matrix representation of A.

Suppose instead that we have general bases, {e1, . . . , en} on Rn and {f1, . . . , fm} on Rm. Form the matrices E and F from these basis vectors. From (3.1) we get that the matrix representation M with respect to these bases satisfies

M E^{-1} = F^{-1} A,
or equivalently
AE = FM.
A very interesting special case of this is where A is square and the same basis {e1, . . . , en} is taken for both the domain and co-domain. Then

AE = EM,
or M = E^{-1} A E; the matrix M is a similarity transformation of the given matrix A.

Finally, suppose we start with a square A and take the basis {v1, . . . , vn} of generalized eigenvectors. The new matrix representation is our familiar Jordan form AJF = V^{-1} A V. Thus the two matrices A and AJF represent the same LT: A in the given standard basis and AJF in the basis of generalized eigenvectors. □
An LT has two important associated subspaces. Let A : X → Y be an LT. The kernel (or nullspace) of A is the subspace of X on which A is zero:

Ker A := {x : Ax = 0}.

The LT A is said to be one-to-one if Ker A = 0; equivalently, the homogeneous equation Ax = 0 has only the trivial solution x = 0. The image (or range space) of A is the subspace of Y that A can reach:

Im A := {y : (∃x ∈ X) y = Ax}.
We say A is onto if Im A = Y; equivalently, the equation Ax = y has a solution x for every y. Whether A is one-to-one or onto (or both) can be easily checked by examining any matrix representation A:

A is one-to-one ⇐⇒ A has full column rank;
A is onto ⇐⇒ A has full row rank.

If A is a matrix, we will write Im A for the image of the generated LT (it's the column span of the matrix), and we'll write Ker A for the kernel of the LT.
Example Let A : R3 → R3 map a vector to its projection on the horizontal plane. Then the kernel equals the vertical axis, the image equals the horizontal plane, A is neither onto nor one-to-one, and its matrix with respect to the standard basis is

[ 1 0 0; 0 1 0; 0 0 0 ].

We could modify the co-domain to have A : R3 → R2, again mapping a vector to its projection on the horizontal plane. Then the kernel equals the vertical axis, the image equals all of R2, A is onto but not one-to-one, and its matrix with respect to the standard basis is

[ 1 0 0; 0 1 0 ].

□
Example Let V ⊂ X (think of V as a plane in 3-dimensional space X). Define the function V : V → X, Vx = x. This is an LT called the insertion LT. Clearly V is one-to-one and Im V = V. Suppose we have a basis for V,

{e1, . . . , ek},
and we extend it to get a basis for X,

{e1, . . . , ek, . . . , en}.

Then the matrix representation of V is the n × k matrix

V = [ Ik; 0 ].

Clearly, rank V = k. □
Example Let X be 3-dimensional space, V a plane (2-dimensional subspace), and W a line not in V. Then V, W are independent subspaces and
X = V ⊕W.
Every x in X can be written x = v + w for unique v in V and w in W. Define the function P : X → V mapping x to v. This is an LT called the natural projection onto V. Check that
Im P = V, Ker P = W.
Suppose {e1, e2} is a basis for V and {e3} a basis for W. The induced matrix representation is

P = [ 1 0 0; 0 1 0 ].

□
Example Let A : X → Y be an LT. Its kernel, Ker A, is a subspace of X; let {ek+1, . . . , en} be a basis for Ker A and extend it to get a basis for X:

{e1, . . . , ek, . . . , en}.

Then

{Ae1, . . . , Aek}

is a basis for Im A. Extend it to get a basis for Y:

{Ae1, . . . , Aek, fk+1, . . . , fp}.

Then the matrix representation of A is

A = [ Ik 0; 0 0 ].

□
3.3 Matrix Equations
We already reviewed the linear equation

Ax = b,   A ∈ Rn×m, x ∈ Rm, b ∈ Rn.

The equation is another way of saying b is a linear combination of the columns of A. Thus the equation has a solution iff b lies in the column span of A, i.e., b ∈ Im A. The solution is then unique iff rank A = m, i.e., Ker A = 0.

These results extend to the matrix equation

AX = B,   A ∈ Rn×m, X ∈ Rm×p, B ∈ Rn×p.

In this section we study this and similar equations. We could work with LTs, but we'll use matrices instead.
The first equation is AX = I. Such an X is called a right-inverse of A.
Lemma 3.3.1 A ∈ Rn×m has a right-inverse iff it's onto, i.e., the rank of A equals n.
Proof (=⇒) If AX = I, then, for every y ∈ Rn,

AXy = y.

Thus for every y ∈ Rn there exists x ∈ Rm (namely x = Xy) such that Ax = y. Thus A is onto.

(⇐=) Let {f1, . . . , fn} be the standard basis for Rn. Since A is onto,

(∀i)(∃xi ∈ Rm) fi = Axi.

Now define X to be the matrix whose ith column is xi, i.e., via Xfi = xi. Then AXfi = fi for every i. This implies AX = I. □
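For a full-row-rank A, one convenient right-inverse is X = A^T (A A^T)^{-1}; this particular formula is a standard choice, not part of the lemma, and the matrix below is illustrative. A Python/NumPy sketch:

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])    # rank 2 = number of rows, so A is onto

# One right-inverse among many: X = A^T (A A^T)^{-1}; any X with AX = I works.
X = A.T @ np.linalg.inv(A @ A.T)
print(A @ X)    # the 2x2 identity, up to rounding
```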
The second equation is the dual situation XA = I. Obviously, such an X is a left-inverse.
Lemma 3.3.2 A ∈ Rn×m has a left-inverse iff it’s one-to-one, i.e., A has rank m.
Lemma 3.3.3 1. There exists X such that AX = B iff Im B ⊂ Im A, that is,

rank A = rank [ A B ].

2. There exists X such that XA = B iff Ker A ⊂ Ker B, that is,

rank A = rank [ A; B ].
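Both rank tests are easy to run. A Python/NumPy sketch of part 1 (the matrices are illustrative):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 0.0]])   # Im A = span{(1,0)}
B1 = np.array([[2.0], [0.0]])            # Im B1 lies inside Im A
B2 = np.array([[0.0], [1.0]])            # Im B2 does not

def solvable_AX_eq_B(A, B):
    """AX = B has a solution iff rank A = rank [A B]."""
    return bool(np.linalg.matrix_rank(A)
                == np.linalg.matrix_rank(np.hstack([A, B])))

print(solvable_AX_eq_B(A, B1))   # True
print(solvable_AX_eq_B(A, B2))   # False
```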
3.4 Invariant Subspaces
Example Let

A = [ 1 1; 2 2 ]

and let A : R2 → R2 be the generated LT. Clearly, Ker A is the 1-dimensional subspace spanned by (1, −1). Also,

x ∈ Ker A ⇒ Ax = 0 ∈ Ker A,

or equivalently,

A Ker A ⊂ Ker A.

□
In general, if A : X → X is an LT, a subspace V ⊂ X is A-invariant if AV ⊂ V. The zero subspace, X itself, Ker A, and Im A are all A-invariant. Note that Ker A is the eigenspace for the zero eigenvalue, assuming λ = 0 is an eigenvalue (as in the example above).
More generally, suppose λ is an eigenvalue of A and assume λ ∈ R. Then Ax = λx for some x ≠ 0, and V = Span{x} is A-invariant. So is the eigenspace

{x : Ax = λx} = {x : (A − λI)x = 0} = Ker(A − λI).
Let V be an A-invariant subspace. Take a basis for V,

{e1, . . . , ek},

and extend it to a basis for X:

{e1, . . . , ek, . . . , en}.

Then the matrix representation of A has the form

A = [ A11 A12; 0 A22 ].

Notice that the lower-left block of A equals zero; this is because V is A-invariant.
Example Let X = R3, let V be the (x1, x2)-plane, and let A : X → X be the LT that rotates a vector 90° about the x3-axis using the right-hand rule. Thus V is A-invariant. Let us take the basis

e1 = (1, 0, 0), e2 = (0, 1, 0) for V,

extended by e3 = (1, 1, 1) to a basis {e1, e2, e3} for X.
The matrix representation of A with respect to the latter basis is

A = [ 0 -1 -2; 1 0 0; 0 0 1 ].

So, in particular, the restriction of A to V is represented by the rotation matrix

A11 = [ 0 -1; 1 0 ].

□
Finally, let A be an n × n matrix, and suppose V is an n × k matrix. Then Im V is a subspace of Rn. How can we know whether this subspace is invariant under A, or more precisely, under the LT generated by A? The answer is this:

Lemma 3.4.1 The subspace Im V is A-invariant iff the linear equation AV = V A1 has a solution A1.

Proof If AV = V A1, then Im AV ⊂ Im V, that is, A Im V ⊂ Im V, which says Im V is A-invariant. Conversely, if Im V is A-invariant, then Im AV ⊂ Im V, and so the equation AV = V A1 is solvable, by Lemma 3.3.3. □
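Combining Lemma 3.4.1 with the rank test of Lemma 3.3.3 gives a numerical invariance check: Im V is A-invariant iff rank V = rank [V AV]. A Python/NumPy sketch using the example at the start of this section:

```python
import numpy as np

def is_A_invariant(A, V):
    """Im V is A-invariant iff AV = V A1 is solvable for some A1,
    i.e. (Lemma 3.3.3) iff rank V = rank [V  AV]."""
    return bool(np.linalg.matrix_rank(V)
                == np.linalg.matrix_rank(np.hstack([V, A @ V])))

A = np.array([[1.0, 1.0], [2.0, 2.0]])
V_ker = np.array([[1.0], [-1.0]])   # spans Ker A, which is A-invariant
V_other = np.array([[1.0], [0.0]])  # a line that A does not map into itself

print(is_A_invariant(A, V_ker))    # True
print(is_A_invariant(A, V_other))  # False
```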
3.5 Problems
1. Prove the following facts about subspaces:
(a) V + V = V.
Hint: You have to show V + V ⊂ V and V ⊂ V + V. Similarly for other subspace equalities.
(b) If V ⊂ W, then V + W = W.
(c) If V ⊂ W, then W ∩ (V + T) = V + (W ∩ T).
2. Show that W ∩ (V + T) = (W ∩ V) + (W ∩ T) is false in general by giving an explicit counterexample.
3. Let A be the identity LT on R2. Take

{(1, 1), (1, −1)} as the basis for the domain,
{(2, 0), (−1, 3)} as the basis for the co-domain.

Find the matrix A.
4. Let A denote the LT R4 → R5 with the action

(x1, x2, x3, x4) ↦ (x4, 0, 2x4, x2 + x3 + 2x4, x2 + x3).

Find bases for R4 and R5 so that the matrix representation is

A = [ I 0; 0 0 ].
5. Let A be an LT. Show that if {Ae1, . . . , Aen} is linearly independent, so is {e1, . . . , en}. Give an example where the converse is false.
6. Find all right-inverses of the matrix

A = [ 1 -1 1; 1 1 2 ].
7. Let X denote the 4-dimensional vector space with basis

{sin t, cos t, sin 2t, cos 2t}.

Thus vectors in X are time-domain signals of frequency 1 rad/s, 2 rad/s, or a combination of both. Suppose an input x(t) from X is applied to a lowpass RC-filter, producing the output y(t). The equation for the circuit is

RC ẏ(t) + y(t) = x(t).

For simplicity, take RC = 1. From circuit theory, we know that y(t) belongs to X too. (This is steady-state analysis; transient response is neglected.) So the mapping from x(t) to y(t) defines a linear transformation A : X → X. Find the matrix representation of A with respect to the given basis.
8. Consider the vector space R3. Let x1, x2, and x3 denote the components of a vector x in R3. Now let V denote the subspace of R3 of all vectors x where
x1 + x2 − x3 = 0,
and let W denote the subspace of R3 of all vectors x where
2x1 − 3x3 = 0.
Find a basis for the intersection V ∩W.
9. Let A : R3 → R3 be the LT defined by

A : (x1, x2, x3) ↦ (8x1 − 2x3, x1 + 7x2 − 2x3, 4x1 − x3).
Find bases for Ker A and Im A.
10. Find all solutions of the matrix equation XA = I where

A = [ 1 2; 1 0; 2 -1 ].
11. For a square matrix X, let diag X denote the vector formed from the elements on the diagonal of X. Let A : Rn×n → Rn be the LT defined by

A : X ↦ diag X.
Does A have a left inverse? A right inverse?
12. Consider the two matrices:

[ 4 1 -1; 3 2 -3; 1 3 0 ],   [ 1 2 3 4 5; 2 3 4 1 2; 3 4 5 0 0 ].
For each matrix, find its rank, a basis for its image, and a basis for its kernel.
13. Let A, U ∈ Rn×n with U nonsingular. True or false:
(a) Ker(A) = Ker(UA).
(b) Ker(A) = Ker(AU).
(c) Ker(A^2) ⊆ Ker(A).
14. Is {(x1, x2, x3) : 2x1 + 3x2 + 6x3 − 5 = 0} a subspace of R3?
15. You are given the n eigenvalues of a matrix in Rn×n. Can you determine the rank of the matrix? If not, can you give bounds on the rank?
16. Suppose that A ∈ Rm×n and B ∈ Rn×m with m ≤ n and rank A = rank B = m. Find a necessary and sufficient condition for AB to be invertible.
17. Let A be an LT from X to X, a finite-dimensional vector space. Fix a basis for X and let A denote the matrix representation of A with respect to this basis. Show that A^2 is the matrix representation of A^2 (the composition of A with itself).
18. Consider the following "result":

Lemma If A is a matrix with full column rank, then the equation Ax = y is solvable for every vector y.

Proof Let y be arbitrary. Multiply the equation Ax = y by the transpose of A:

A^T A x = A^T y.

Since A has full column rank, A^T A is invertible. Thus

x = (A^T A)^{-1} A^T y.

□

(a) Give a counterexample to the lemma.
(b) What is the mistake in logic in the proof?
19. Let L denote the line in the plane that passes through the origin and makes an angle +π/6 radians with the positive x-axis. Let A : R2 → R2 be the LT that maps a vector to its reflection about L.

(a) Find the matrix representation of A with respect to the basis

e1 = (1, 1),   e2 = (−1, 1).
(b) Show that A is invertible and find its inverse.
20. Fix a vector v ≠ 0 in R3 and consider the LT A : R3 → R3 that maps x to the cross product v × x.
(a) Find Ker(A) and Im(A).
(b) Is A invertible?